
AwesomeOps Part 2: Demystifying Windows Automation with Ansible

Updated: Jun 21, 2023

Do not fear, friends, automating Windows Server is not that hard!! Well, it is not that hard once you get past the hard part. We know, we know, that is a typical Windows administrator response, but it is true. Let's just say Windows Server has made many an administrator cry late at night trying to automate super simple tasks across dozens or thousands of hosts. But all the pain and frustration of automating Windows Server will subside once you understand and implement a few foundational things that will set you and your organization up for long-term success. Let's get into it.


What In The WinRM?

WinRM is short for Windows Remote Management. Cool! The expanded acronym explains Microsoft's intent for the tool, which is to manage remote Windows hosts. WinRM originally debuted around the Windows XP and Server 2003 days, and was an implementation of what was/is referred to as Web Services Management (WS-Management). In short, WS-Management is a specification for the exchange of administrative information using SOAP (Simple Object Access Protocol). What all of this means is that WinRM was developed so that IT professionals can manage hosts from a remote workstation or server. This sounds awesome, and it is! However, there are some basics that you need to be aware of so that you can securely scale Ansible within your organization. If you are interested in the fun low-level tech, check out this quick read on the architecture of WinRM.

Below is a standard-looking WinRM config. Let's review it and go through what you want to change.

 PS C:\Users\****\****> winrm get winrm/config
Config
    MaxEnvelopeSizekb = 500
    MaxTimeoutms = 60000
    MaxBatchItems = 32000
    MaxProviderRequests = 4294967295
    Client
        NetworkDelayms = 5000
        URLPrefix = wsman
        AllowUnencrypted = true [Source="GPO"] --> Set this to False
        Auth
            Basic = true --> Change this to False
            Digest = true
            Kerberos = true
            Negotiate = true
            Certificate = true
            CredSSP = false
        DefaultPorts
            HTTP = 5985 --> Do your best to disable this and use 5986 exclusively
            HTTPS = 5986
        TrustedHosts = * [Source="GPO"] --> Change this to a small set of Ansible hosts. This can be Ansible Automation Platform (AAP) running on OpenShift, like we usually implement for clients, or Ansible Tower, or Azure Kubernetes Service (AKS) CIDR blocks. Basically, this list should contain only your most protected command and control systems.
    Service
        RootSDDL = O:NSG:BAD:P(A;;GA;;;BA)(A;;GR;;;IU)S:P(AU;FA;GA;;;WD)(AU;SA;GXGW;;;WD)
        MaxConcurrentOperations = 4294967295
        MaxConcurrentOperationsPerUser = 1500
        EnumerationTimeoutms = 240000
        MaxConnections = 300 --> Do your due diligence to determine whether you can lower this value or need to increase it. Usually you can lower this number.
        MaxPacketRetrievalTimeSeconds = 120
        AllowUnencrypted = true [Source="GPO"] --> Change to false
        Auth
            Basic = false [Source="GPO"]
            Kerberos = true
            Negotiate = true
            Certificate = false --> think about enabling this. We use a combination of certs with NTLM and encrypted port 5986. Security is about layers not individual settings and tools.
            CredSSP = false --> think about enabling this only if you need credential delegation (the "double hop" scenario). CredSSP passes the user's credentials to the remote host, so weigh that risk first.
            CbtHardeningLevel = Relaxed
        DefaultPorts
            HTTP = 5985 --> disable if possible
            HTTPS = 5986
        IPv4Filter = * [Source="GPO"]
        IPv6Filter = * [Source="GPO"]
        EnableCompatibilityHttpListener = false
        EnableCompatibilityHttpsListener = false
        CertificateThumbprint
        AllowRemoteAccess = true [Source="GPO"]
    Winrs
        AllowRemoteShellAccess = true
        IdleTimeout = 7200000
        MaxConcurrentUsers = 100 [Source="GPO"] --> you can probably lower this, but do your research. Most systems need fewer than a handful of concurrent users, and the ones that need more are usually core infrastructure nodes like ADDS/DNS/DHCP etc.
        MaxShellRunTime = 2147483647
        MaxProcessesPerShell = 150 [Source="GPO"]
        MaxMemoryPerShellMB = 2147483647
        MaxShellsPerUser = 50 [Source="GPO"] --> do some research on your organization, but realistically most systems do NOT need 50 shells per user.

There is a lot to process in the above example. So we will summarize here:

  1. Disable 5985. Just stop using 5985. Just because it is easy does NOT make it right.

  2. Only use 5986. Just do this. It is the encrypted port. Yes it is harder to get working, but that is why we are here. :)

  3. Disable Basic Auth. There are many tutorials online about getting started with Ansible and the vast majority use Basic Auth as an example. Just do NOT use it. Leaving this enabled and using 5985 is like permanently turning on an evil version of the Bat Signal. Hackers will see it. Your system will be compromised.

  4. Use Kerberos where possible. We say "where possible" because, let's face it, no enterprise has a uniform implementation of Active Directory with a single perfect forest and a single domain. There are always caveats: there are usually multiple forests and multiple domains, and black-box areas where systems are joined to a specialized domain or not connected to a domain at all. For those hosts, fall back to NTLM or certificate authentication over 5986, never Basic.

  5. Use GPOs to set your default config. This is generally a good recommendation for hosts that are domain joined.

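  6. TrustedHosts update. Get a list of only the hosts that your organization wants to connect to all other Windows systems and add this to your config in place of the wildcard. Keep in mind that TrustedHosts is a client-side setting: it controls which remote hosts the local WinRM client will trust when the connection cannot mutually authenticate. It does not by itself block inbound WinRM traffic, so pair it with host firewall rules scoped to your Ansible control nodes.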

  7. AllowUnencrypted. Set this to false. This should ONLY be set to true if you are a developer, and you are working in a feature branch, and you are in debug mode.
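
If you want to check a host against the summary above without eyeballing the whole config, here is a minimal sketch in Python. It assumes you have captured the text output of winrm get winrm/config (the SAMPLE string below is a trimmed stand-in, not real output from your hosts), and the function names and the RISKY table are our own illustration, not part of WinRM or Ansible.

```python
import re

# Settings called out in the summary above, keyed by (section path, setting name).
# The value is the one this post recommends against.
RISKY = {
    ("Client", "AllowUnencrypted"): "true",
    ("Client/Auth", "Basic"): "true",
    ("Client", "TrustedHosts"): "*",
    ("Service", "AllowUnencrypted"): "true",
    ("Service/Auth", "Basic"): "true",
}


def parse_winrm_config(text):
    """Turn indented `winrm get winrm/config` output into {(section, key): value}."""
    settings = {}
    stack = []  # (indent, section name) pairs for the sections we are currently inside
    for raw in text.splitlines():
        if not raw.strip():
            continue
        indent = len(raw) - len(raw.lstrip())
        # Drop the [Source="GPO"] marker that appears on GPO-managed values.
        line = re.sub(r'\s*\[Source="[^"]*"\]', "", raw.strip())
        # Leave any section that is at the same depth or deeper than this line.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        if " = " in line:
            key, value = line.split(" = ", 1)
            section = "/".join(name for _, name in stack if name != "Config")
            settings[(section, key.strip())] = value.strip()
        else:
            stack.append((indent, line))
    return settings


def find_risky(settings):
    """Return the settings whose current value matches the RISKY table."""
    return sorted(
        f"{section}/{key} = {value}"
        for (section, key), value in settings.items()
        if RISKY.get((section, key)) == value.lower()
    )


# Trimmed, hypothetical sample in the same shape as the config shown earlier.
SAMPLE = """\
Config
    MaxTimeoutms = 60000
    Client
        AllowUnencrypted = true [Source="GPO"]
        Auth
            Basic = true
            Kerberos = true
        TrustedHosts = * [Source="GPO"]
    Service
        AllowUnencrypted = false
        Auth
            Basic = false [Source="GPO"]
"""

if __name__ == "__main__":
    for finding in find_risky(parse_winrm_config(SAMPLE)):
        print("RISKY:", finding)
```

Treat this as a starting point. The table only covers the on/off settings from the list; limits like MaxConnections need a judgment call per organization, so they are left out on purpose.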

To start your WinRM journey you will need to deploy a few test machines and get connected via RDP. Yes, we said it! Manually connect with RDP. One of the biggest stumbling blocks we have seen with new DevOps recruits over the years is the false belief that all of automation is automated. Not only is manual effort required, manual work is the starting point for automation. The reason for this is simple. Most of the time when DevOps engineers are asked to automate a system or process, it is a whole new world - no, not that new world with Aladdin and Jasmine! So the first place you should start when entering a whole new world is manual trial and error combined with documentation. You need to manually log into your new hosts via RDP so that you can start manually setting and removing WinRM configurations. This will teach you more about how WinRM actually works than a blog post ever could.


Once you are connected to your new host, you will need the command below to display and review the WinRM configuration:

winrm get winrm/config

The command above will display a WinRM config that looks very similar to the config we reviewed earlier in this blog. Next you will need to get a cert set up on your host. Request a certificate from your Windows administrative team. Once you receive the cert, you will need to:

  1. Click Start and select Run

  2. Type MMC, which is short for Microsoft Management Console

  3. Click File from the menu options

  4. Select Add or Remove Snap-ins

  5. Select Certificates and click Add

  6. Click Computer Account

  7. Install the cert under Certificates (Local Computer) > Personal > Certificates

  8. Open PowerShell and type out --> winrm quickconfig -transport:https

Now you should have a good, simple, and more secure WinRM config rocking on your host. This is more secure than running the plain winrm quickconfig command, which gives you all of the defaults, including an HTTP listener on 5985, and you do NOT want that.


Next you will want to make sure your host is running a WinRM listener. Use the command below to check the status.

winrm enumerate winrm/config/listener
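
Once you have more than a couple of hosts, reading listener output by hand gets old. The sketch below is one way to check it in Python. It assumes the output is shaped like the SAMPLE string (one "Listener" header followed by indented key = value lines); the hostname, addresses, and thumbprint placeholder are made up for illustration, and both function names are ours.

```python
def parse_listeners(text):
    """Split `winrm enumerate winrm/config/listener` output into one dict per listener."""
    listeners = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "Listener":
            listeners.append({})
        elif listeners:
            # Keys with no value (for example an empty Hostname) get an empty string.
            key, _, value = line.partition(" = ")
            listeners[-1][key.strip()] = value.strip()
    return listeners


def https_only(listeners):
    """True when at least one listener is enabled and every enabled one is HTTPS."""
    enabled = [item for item in listeners if item.get("Enabled", "").lower() == "true"]
    return bool(enabled) and all(item.get("Transport") == "HTTPS" for item in enabled)


# Hypothetical sample: a default HTTP listener plus the HTTPS one we actually want.
SAMPLE = """\
Listener
    Address = *
    Transport = HTTP
    Port = 5985
    Hostname
    Enabled = true
    URLPrefix = wsman
    CertificateThumbprint
    ListeningOn = 127.0.0.1, 10.0.0.5

Listener
    Address = *
    Transport = HTTPS
    Port = 5986
    Hostname = win01.example.com
    Enabled = true
    URLPrefix = wsman
    CertificateThumbprint = <thumbprint>
    ListeningOn = 127.0.0.1, 10.0.0.5
"""

if __name__ == "__main__":
    found = parse_listeners(SAMPLE)
    for item in found:
        print(item["Transport"], item["Port"])
    print("HTTPS only:", https_only(found))
```

With the sample above, the check reports that the host is not HTTPS only, because the 5985 listener is still enabled.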

Check to see if your certs have been setup correctly:

winrm get http://schemas.microsoft.com/wbem/wsman/1/config

For more information on WinRM you can run the following command:

winrm help config

Lastly, here is a nice simple command you should test manually, and then incorporate into your automation framework either in Packer or as a bootstrap script at build time.

winrm s winrm/config/client '@{TrustedHosts="server1,server2,server3"}'

This was one of the recommendations previously mentioned. The single quotes keep PowerShell from parsing @{...} as a hashtable; leave them off if you run the command from cmd.exe.
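
If you generate that command from a host list in your build tooling, a small helper keeps the wildcard from sneaking back in. This is a minimal sketch in Python; the function name is hypothetical, and it emits winrm set (the long form of winrm s) quoted for a PowerShell session.

```python
def trusted_hosts_command(hosts):
    """Build the winrm command that sets the client TrustedHosts list."""
    cleaned = []
    for host in hosts:
        host = host.strip()
        if not host:
            continue
        if host == "*":
            raise ValueError("wildcard '*' defeats the purpose of a restricted list")
        # Reject characters that would break the comma-separated, quoted value.
        if any(ch in host for ch in " ,\"'"):
            raise ValueError(f"invalid host entry: {host!r}")
        if host not in cleaned:
            cleaned.append(host)
    if not cleaned:
        raise ValueError("at least one host is required")
    # Single quotes stop PowerShell from parsing @{...} as a hashtable literal.
    return "winrm set winrm/config/client '@{TrustedHosts=\"" + ",".join(cleaned) + "\"}'"


if __name__ == "__main__":
    print(trusted_hosts_command(["server1", "server2", "server3"]))
```

The helper deduplicates entries and refuses an empty list, so a bad variable in your pipeline fails the build instead of silently writing a wide-open config.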


Here is a good starter Ansible secure_vars.yml file that puts some of the recommendations above into practice.

ansible_connection: winrm
ansible_winrm_transport: ntlm
ansible_port: 5986
ansible_winrm_message_encryption: always
ansible_winrm_kerberos_delegation: True  # only takes effect when ansible_winrm_transport is kerberos
ansible_become_method: runas

If you are very interested in learning everything there is to know about WinRM Ansible settings, check out this page provided by Ansible.
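
Connection vars like these tend to drift between inventories, so a small lint step can help. The sketch below is a hedged Python example, not an Ansible feature: check_winrm_vars is a hypothetical helper, it assumes the vars have already been loaded into a plain dict, and it assumes ansible_winrm_transport holds a single transport name instead of a list.

```python
def check_winrm_vars(host_vars):
    """Return warnings for connection vars that contradict the recommendations above."""
    warnings = []
    if host_vars.get("ansible_connection") != "winrm":
        warnings.append("ansible_connection is not winrm")
    # Compare as strings so both 5986 and "5986" are accepted.
    if str(host_vars.get("ansible_port")) != "5986":
        warnings.append("ansible_port is not 5986 (the HTTPS listener)")
    transport = host_vars.get("ansible_winrm_transport")
    if transport == "basic":
        warnings.append("basic auth is enabled")
    if host_vars.get("ansible_winrm_kerberos_delegation") and transport != "kerberos":
        warnings.append(
            "ansible_winrm_kerberos_delegation has no effect unless the transport is kerberos"
        )
    return warnings


# The starter vars from this post, written as a dict.
STARTER_VARS = {
    "ansible_connection": "winrm",
    "ansible_winrm_transport": "ntlm",
    "ansible_port": 5986,
    "ansible_winrm_message_encryption": "always",
    "ansible_winrm_kerberos_delegation": True,
    "ansible_become_method": "runas",
}

if __name__ == "__main__":
    for warning in check_winrm_vars(STARTER_VARS):
        print("WARN:", warning)
```

Run against the starter vars, the only warning is about Kerberos delegation, which is harmless with NTLM but worth knowing about when you switch transports.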


We will give you some time to work on WinRM and the Ansible config above.



Excellent! At this point you should know a bit more about WinRM and be generally more comfortable with using and updating it. Congrats! Now you may be asking: "How do we make this work at scale and securely?" Great question, and a perfect transition into the next section. 🤓


Simple, Secure, and Scalable Ansible With Windows

The good and bad dichotomy of information technology is that there are usually many ways to get to the same desired outcome. This post will show one of the many ways you can create a simple, secure, and scalable Windows automation environment with Ansible.


Our approach is to shift as much of the hard stuff left into builds, baking as much as we can into the foundational starting point of all future builds. This has a number of benefits, but the biggest one is time: the harder automation bits usually take the longest to run, so while the initial time commitment to get things working is very high, you will save hundreds or thousands of hours once your system is functional with scheduled builds. Practically speaking, we set up our simple, secure, and scalable Packer platform.

The diagram above is a loose approximation of our Packer platform.

  • Microsoft Azure DevOps = Orchestration Platform

  • Microsoft Azure Key Vault = Secrets Management

  • Packer = Imaging

  • Ansible = Configuration Management

  • Azure/Azure-Gov/AWS/VMware = Compute Platforms

Unfortunately, we cannot go into all of the details about how to make this work. However, we will write about a few of the key components.


First, you will need to get set up with an autounattend.xml file. If you have never used this before, you should check it out. This is a simple-to-understand, easy-to-update XML file that is injected into the imaging process. One of the cool blocks of code you can modify is:


<settings pass="oobeSystem">
    ...
    ...
    <component xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" name="Microsoft-Windows-Shell-Setup" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS">
         <FirstLogonCommands>
            <SynchronousCommand wcm:action="add">
               <CommandLine>%SystemRoot%\system32\WindowsPowerShell\v1.0\powershell.exe -File a:\windows-cert-setup.ps1</CommandLine>
               <Description>Set Execution Policy 64-Bit</Description>
               <Order>2</Order>
               <RequiresUserInput>true</RequiresUserInput>
            </SynchronousCommand>
            ...
         </FirstLogonCommands>
    </component>
</settings>

So we see that under the oobeSystem block we can execute any number of PowerShell scripts at image configuration time. We have a few scripts here that configure WinRM to our specifications. Based on the WinRM commands outlined previously, you should be able to create a simple script that meets the demands of your organization. Just remember the points above.
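
As the number of first-logon scripts grows, it helps to see exactly what will run and in what order before you kick off a long build. Here is a minimal sketch in Python that lists the commands from an answer file. The sample XML is a cut-down, hypothetical file (the unattend root element and the first command are our additions so the snippet parses on its own), and first_logon_commands is our own helper name.

```python
import xml.etree.ElementTree as ET


def first_logon_commands(xml_text):
    """Return (order, command line) pairs for every SynchronousCommand, sorted by order."""
    root = ET.fromstring(xml_text)
    commands = []
    for element in root.iter():
        # Compare local names so this works with or without a default namespace.
        if element.tag.rsplit("}", 1)[-1] != "SynchronousCommand":
            continue
        fields = {child.tag.rsplit("}", 1)[-1]: (child.text or "").strip() for child in element}
        commands.append((int(fields.get("Order", 0)), fields.get("CommandLine", "")))
    return sorted(commands)


# Hypothetical, trimmed answer file. Only the second command comes from this post.
SAMPLE = r"""<unattend xmlns="urn:schemas-microsoft-com:unattend"
          xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State">
  <settings pass="oobeSystem">
    <component name="Microsoft-Windows-Shell-Setup">
      <FirstLogonCommands>
        <SynchronousCommand wcm:action="add">
          <CommandLine>powershell.exe -File a:\windows-cert-setup.ps1</CommandLine>
          <Order>2</Order>
        </SynchronousCommand>
        <SynchronousCommand wcm:action="add">
          <CommandLine>cmd.exe /c echo first</CommandLine>
          <Order>1</Order>
        </SynchronousCommand>
      </FirstLogonCommands>
    </component>
  </settings>
</unattend>"""

if __name__ == "__main__":
    for order, command in first_logon_commands(SAMPLE):
        print(order, command)
```

A check like this can run as an early pipeline step, so a typo in the answer file fails in seconds instead of partway through an image build.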


Here is a hint of what your finished Packer directory will look like:

To run this in Packer we first create a variable:

variable "vm_floppy_files_server_dc_dexp" {
  type        = list(string)
  description = "Used for Server with Desktop Experience. The list of files or directories to be added to the virtual floppy device. Used for unattended installation."
  default = [
    "../../../configs/windows/windows-server-2022/autounattend.xml",
    "../../../scripts/windows/",
    "../../../drivers/windows",
  ]
}

Then we call the var in our source like the below example:

source "vsphere-iso" "windows-server-2022-cis" {
  vcenter_server       = var.vcenter_server
  username             = var.vcenter_username
  password             = var.vcenter_password
  datacenter           = var.vcenter_datacenter
  cluster              = var.vcenter_cluster
  datastore            = var.vcenter_datastore
  folder               = var.vcenter_folder
  insecure_connection  = var.vcenter_insecure_connection
  tools_upgrade_policy = true
  tools_sync_time      = true
  remove_cdrom         = false
  convert_to_template  = false
  guest_os_type        = var.vm_guest_os_type
  vm_version           = var.vm_version
  notes                = "Built by Mentat-Packer on ${local.buildtime}."
  vm_name              = local.vm_name
  firmware             = var.vm_firmware
  CPUs                 = var.vm_cpu_sockets
  cpu_cores            = var.vm_cpu_cores
  CPU_hot_plug         = false
  RAM                  = var.vm_mem_size
  RAM_hot_plug         = false
  boot_wait            = var.vm_boot_wait
  boot_command         = var.vm_boot_command
  boot_order           = "disk,cdrom"
  cdrom_type           = var.vm_cdrom_type
  disk_controller_type = var.vm_disk_controller_type
  storage {
    disk_size             = var.vm_disk_size
    disk_controller_index = 0
    disk_thin_provisioned = true
  }
  network_adapters {
    network      = var.vcenter_network
    network_card = var.vm_network_card
  }
  floppy_files = var.vm_floppy_files_server_dc_dexp
  iso_paths = [
    "${var.iso_datastore}${var.iso_path}/${var.iso_file}.iso",
    "${var.iso_datastore}${var.iso_path}/vmware-tools.iso"
  ]
  iso_checksum     = "none"
  ip_wait_timeout  = "10m"
  communicator     = "winrm"
  winrm_username   = var.build_username
  winrm_password   = var.build_password
  winrm_port       = 5986
  winrm_timeout    = "15m"
  winrm_use_ssl    = true
  winrm_insecure   = true
  shutdown_command = var.vm_shutdown_command
  shutdown_timeout = "5m"
  content_library_destination {
    library = var.vcenter_content_library
    name    = "cis-windows-server-2022-${local.buildtime}"
    ovf     = false
    destroy = true
  }
}

Now there is a LOT more code that goes into this, but the above Packer source is enough to get you started. You will notice that our default port is WinRM 5986 in the source config. That means that even the initial connection to the ephemeral Packer instance is connecting on WinRM 5986! This is AwesomeOps at work.









The End

As always, we hope this blog helps a few people out! If you are interested in learning more about how to create a simple, secure, and scalable Packer platform within your organization, please reach out to us.
