Automating Virtual Machine Management in Azure with Ansible and Azure DevOps

Łukasz Kołodziej

DevOps and Cloud Architect

December 12, 2024

Managing a fleet of virtual machines (VMs) across multiple Azure regions can quickly become a logistical challenge. As the person responsible for streamlining this process, I developed an Azure DevOps pipeline that automates much of the heavy lifting using Ansible. I’d like to share my experience creating this pipeline, highlight its key features, and provide insights for anyone looking to build a similar solution.


The Problem to Solve

When managing hundreds of VMs across numerous regions, I faced several recurring challenges:

  • Keeping VM inventories updated dynamically.
  • Consistently applying configurations and updates.
  • Targeting specific hosts or regions for specialized tasks.
  • Avoiding manual errors while speeding up deployment.

This was the inspiration behind automating the entire process. My goal was to create a pipeline that required minimal manual intervention while maintaining security and flexibility.


The Solution: My Azure DevOps Pipeline

I built a YAML-based Azure DevOps pipeline that integrates with Ansible to:

  1. Automatically update inventories of VMs based on Azure tags.
  2. Run Ansible playbooks dynamically across selected regions or specific hosts.
  3. Leverage Git for inventory version control and collaboration.

The result was a highly flexible pipeline that addressed all my pain points.
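The tag-based discovery in the first item comes down to a JMESPath filter passed to `az vm list --query`. Here is a minimal sketch of building that filter. The function name `build_tag_query` is my own illustration, not part of the pipeline. Quoting the tag key is an added precaution so that keys containing characters such as `-` remain valid JMESPath.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original pipeline): build the JMESPath
# filter used with `az vm list --query`. The tag key is wrapped in double
# quotes so keys containing "-" or similar characters are still valid.
build_tag_query() {
    local key="$1" value="$2"
    printf "[?tags.\"%s\"=='%s'].{name:name, resource_group:resourceGroup}\n" "$key" "$value"
}
```

It could then be used as `az vm list --query "$(build_tag_query your_tag your_tag_value)" -o json`.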


Key Features of the Pipeline

1. Parameterized Flexibility

The pipeline is fully parameterized, allowing me to choose the playbooks to execute, the regions to target, and whether to apply changes to all hosts or a single host.

Here’s a snippet of the parameters:

parameters:
  - name: selected_playbooks
    displayName: 'Select Playbooks to Run'
    type: string
    default: 'playbooks/ping.yaml'
    values:
      - 'playbooks/install_basic.yaml'
      - 'playbooks/update_ubuntu.yaml'
      # Additional playbooks...

  - name: selected_regions
    displayName: 'Select Regions to Target'
    type: string
    default: '*'
    values:
      - 'ase'
      - 'aue'
      # Other regions... 

This flexibility allows me to quickly adapt the pipeline to different scenarios, such as deploying monitoring agents or running health checks.
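The single-host option depends on knowing whether the requested host is actually listed in a region's inventory. Below is a minimal sketch of an exact-match lookup, assuming the `name ansible_host=...` line format used in the inventory files. The function name `host_in_inventory` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the first field of some inventory line
# equals the host name exactly, so "web1" does not match "web10-region1".
host_in_inventory() {
    local host="$1" inventory="$2"
    [[ -f "$inventory" ]] || return 1
    awk -v h="$host" '$1 == h { found = 1 } END { exit found ? 0 : 1 }' "$inventory"
}
```

A playbook run can then be guarded with `host_in_inventory "$SINGLE_HOST" "inventories/$region/inventory.ini"` before passing `--limit "$SINGLE_HOST"` to `ansible-playbook`.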


2. Dynamic Inventory Updates

One of the most critical components is the update_inventory.sh script. It dynamically discovers VMs based on Azure tags and updates Ansible inventory files by region.

Here’s the core logic:

#!/bin/bash

# Capture the start time
start_time=$(date +%s)

# Define the inventory directory
INVENTORY_DIR="./inventories"

# Define the tag key and value to filter VMs
TAG_KEY="your_tag"
TAG_VALUE="your_tag_value"

# Temporary file to store results for found VMs
TEMP_FILE=$(mktemp)

# Get the list of VMs with the specified tag
vms=$(az vm list --query "[?tags.$TAG_KEY=='$TAG_VALUE'].{name:name, resource_group:resourceGroup}" -o json)

# Function to process each VM
process_vm() {
    vm=$1
    _jq() {
        echo "${vm}" | base64 --decode | jq -r "${1}"
    }

    vm_name=$(_jq '.name')
    resource_group=$(_jq '.resource_group')

    # Extract the region suffix from the VM name
    region_suffix=$(echo "$vm_name" | awk -F'-' '{print $NF}')

    # Define the path to the corresponding inventory.ini file
    inventory_file="$INVENTORY_DIR/$region_suffix/inventory.ini"

    # Quick check: If the VM is already in the inventory, skip further processing
    if grep -q "^$vm_name ansible_host=" "$inventory_file" 2>/dev/null; then
        echo "$vm_name already exists in $inventory_file. Skipping..."
        return
    fi

    # Get the private IP address for the VM
    private_ip=$(az vm list-ip-addresses -g "$resource_group" -n "$vm_name" --query "[].virtualMachine.network.privateIpAddresses[0]" -o tsv)

    # Check if the private IP was retrieved
    if [[ -z "$private_ip" ]]; then
        echo "Warning: No private IP found for VM $vm_name. Skipping..."
        return
    fi

    # Define the new entry for the VM
    new_entry="$vm_name ansible_host=$private_ip ansible_user=<user> ansible_become_pass=<password>!"

    # Create the region directory and the inventory file if they do not exist yet
    mkdir -p "$(dirname "$inventory_file")"
    if [[ ! -f "$inventory_file" ]]; then
        echo "[servers]" > "$inventory_file"  # Initialize with the group header
    fi

    # Add the VM entry to the inventory file
    echo "$new_entry" >> "$inventory_file"
    echo "Added $vm_name to $inventory_file"

    # Add the VM details to the temporary file
    echo "$vm_name (IP: $private_ip)" >> "$TEMP_FILE"
}

# Iterate over the VMs and process them in parallel
for vm in $(echo "${vms}" | jq -r '.[] | @base64'); do
    process_vm "$vm" &
done

# Wait for all background processes to finish
wait

# Capture the end time
end_time=$(date +%s)

# Calculate the duration in seconds
duration=$((end_time - start_time))

# Convert duration to minutes and seconds
minutes=$((duration / 60))
seconds=$((duration % 60))

# Display the duration of the script
echo -e "\nScript execution took $minutes minutes and $seconds seconds."

# Display the count and the list of newly added VMs
found_vms_count=$(wc -l < "$TEMP_FILE")
echo "Added $found_vms_count new VMs with tag '$TAG_VALUE':"
cat "$TEMP_FILE"
rm "$TEMP_FILE"  # Clean up the temporary file 

The script:

  • Filters VMs based on a specific tag (your_tag).
  • Retrieves private IPs and dynamically generates Ansible inventory files for each region.
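The two steps above can be reduced to deriving the region from the VM name and appending a host line exactly once. Below is a minimal sketch, assuming VM names end in `-<region>` as in update_inventory.sh. The function names `region_suffix` and `add_host_entry` are hypothetical, and the user and password fields are left out for brevity.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating the per-region inventory layout.

# Print everything after the last "-" in the VM name (the region suffix).
region_suffix() {
    printf '%s\n' "${1##*-}"
}

# Append "<name> ansible_host=<ip>" to <dir>/<region>/inventory.ini,
# creating the directory and the [servers] header on first use.
# Returns 1 without writing anything if the host is already listed.
add_host_entry() {
    local inventory_dir="$1" vm_name="$2" private_ip="$3"
    local region file
    region=$(region_suffix "$vm_name")
    file="$inventory_dir/$region/inventory.ini"

    mkdir -p "$inventory_dir/$region"
    [[ -f "$file" ]] || echo "[servers]" > "$file"

    if grep -q "^$vm_name ansible_host=" "$file"; then
        return 1
    fi
    echo "$vm_name ansible_host=$private_ip" >> "$file"
}
```

Calling `add_host_entry ./inventories sample-host-prod-region1 10.0.0.4` twice adds the host on the first call and leaves the file untouched on the second, which is what keeps repeated pipeline runs from duplicating entries.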

Tip: Make sure to enable the "Shell tasks arguments validation" option in the Azure DevOps pipeline settings to avoid validation errors during script execution. This is a small but critical detail that I learned the hard way!


3. Git Integration for Inventory Management

One aspect I’m particularly proud of is how I integrated Git into the pipeline. All inventory updates are pushed to a Git repository, ensuring version control and collaboration.

steps:
  - bash: |
      git config --global user.email "$GIT_EMAIL"
      git config --global user.name "$GIT_USERNAME"
      echo "https://$GIT_USERNAME:$(System.AccessToken)@your-org.visualstudio.com" > ~/.git-credentials
      git config --global credential.helper store

      git remote add origin $(GIT_REPO_URL) || echo "Remote 'origin' already exists."
      git fetch --all

      git checkout -b $(SOURCE_BRANCH) || git checkout $(SOURCE_BRANCH)
      git add -A :!*.pem :!update_inventory.sh
      git commit -m "Updated inventory on $(date '+%Y-%m-%d %H:%M:%S')" || echo "No changes to commit."
      git push origin $(SOURCE_BRANCH) || echo "Failed to push changes."
    displayName: "Push Inventory Updates to Git" 

This step ensures:

  • Every change to the inventory is tracked and versioned.
  • Team members can easily review and collaborate on updates.
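To keep the history free of empty commits, the staging and commit steps can be wrapped so that a commit is only created when something is actually staged. The sketch below shows one way to do this. The function name `commit_inventory_changes` is hypothetical, and only the `*.pem` exclusion from the step above is reproduced.

```shell
#!/usr/bin/env bash
# Hypothetical helper: stage everything except *.pem files and commit only
# when something is staged. Returns 0 if a commit was created, 1 if there
# was nothing to commit, 2 if staging failed.
commit_inventory_changes() {
    local repo_dir="$1" message="$2"
    git -C "$repo_dir" add -A -- . ':!*.pem' || return 2
    if git -C "$repo_dir" diff --cached --quiet; then
        echo "No changes to commit."
        return 1
    fi
    git -C "$repo_dir" commit --quiet -m "$message"
}
```

Checking `git diff --cached --quiet` before committing makes the "nothing changed" case explicit instead of relying on a failed `git commit`.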

4. Running Ansible Playbooks Dynamically

With the inventories updated, the pipeline executes selected playbooks across the targeted regions or specific hosts. Here’s how I handled this:

steps:
  - bash: |
      playbooks=(${SELECTED_PLAYBOOKS//,/ })
      regions=(${ALL_REGIONS//,/ })

      for playbook in "${playbooks[@]}"; do
        for region in "${regions[@]}"; do
          inventory_path="inventories/$region/inventory.ini"

          if [ -f "$inventory_path" ]; then
            ansible-playbook "$playbook" -i "$inventory_path" --private-key "$(ANSIBLE_PRIVATE_KEY_PATH)"
          else
            echo "Inventory file for $region not found. Skipping."
          fi
        done
      done
    displayName: "Run Selected Playbooks" 

This logic:

  • Iterates over selected playbooks and regions.
  • Dynamically adjusts the inventory path for each region.
  • Gracefully handles missing inventory files.

Lessons Learned

  1. Dynamic Updates Save Time: Automating the inventory process eliminated hours of manual effort and ensured accuracy.
  2. Parameterization is Key: Building flexibility into the pipeline allowed me to reuse it across different scenarios without additional modifications.
  3. Git is a Lifesaver: By pushing inventory updates to Git, I ensured traceability and easy collaboration with the team.
  4. Attention to Pipeline Settings: Enabling "Shell tasks arguments validation" was essential to ensure my scripts ran smoothly.

Final Thoughts

Building this pipeline was a rewarding experience. It’s not just about automation—it’s about creating a robust system that scales with your needs and reduces operational overhead. For anyone managing a large-scale Azure environment, I encourage you to explore similar solutions.

If you’re tackling similar challenges or have questions about this approach, I’d love to connect and discuss! 🚀


Disclaimer: Sensitive credentials and specifics have been anonymized for security.

Full pipeline:

parameters:
  - name: selected_playbooks
    displayName: 'Select Playbooks to Run'
    type: string
    default: 'playbooks/sample-playbook.yaml'
    values:
      - 'playbooks/install_agent.yml'
      - 'playbooks/update_service.yml'
      - 'playbooks/sample-playbook.yaml'

  - name: selected_regions
    displayName: 'Select Regions to Target'
    type: string
    default: '*'
    values:
      - 'region1'
      - 'region2'
      - 'region3'
      - '*'

  - name: target_scope
    displayName: 'Select Target Scope'
    type: string
    default: 'all_hosts'
    values:
      - 'all_hosts'
      - 'single_host'

  - name: single_host
    displayName: 'Enter Single Host to Target (Required if targeting a single host)'
    type: string
    default: 'sample-host-prod-region1'

variables:
  AZURE_CLIENT_ID: '********-****-****-****-************'
  AZURE_CLIENT_SECRET: '*********'
  AZURE_TENANT_ID: '********-****-****-****-************'
  AZURE_SUBSCRIPTION: 'Your Azure Subscription Name'
  GIT_EMAIL: 'devops@example.com'
  GIT_USERNAME: 'DevOps'
  GIT_REPO_URL: 'https://your-org.visualstudio.com/YourProject/_git/ansible-repo'
  SOURCE_BRANCH: 'feature-ansible-updates'
  ANSIBLE_PRIVATE_KEY_PATH: 'ssh/private-key.pem'
  ANSIBLE_HOST_KEY_CHECKING: 'False'
  SELECTED_PLAYBOOKS: "${{ parameters.selected_playbooks }}"
  SELECTED_REGIONS: "${{ parameters.selected_regions }}"
  SINGLE_HOST: "${{ parameters.single_host }}"
  TARGET_SCOPE: "${{ parameters.target_scope }}"
  ALL_REGIONS: 'region1,region2,region3'

stages:
  - stage: CheckCommitAuthor
    displayName: "Check Commit Author"
    jobs:
      - job: CheckAuthor
        displayName: "Check Commit Author"
        pool:
          vmImage: "ubuntu-latest"

        steps:
          - checkout: self
            displayName: "Checkout Repository"

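          - bash: |
              COMMIT_AUTHOR=$(git log -1 --pretty=format:'%an')
              echo "Commit Author: $COMMIT_AUTHOR"
              if [ "$COMMIT_AUTHOR" == "$GIT_USERNAME" ]; then
                echo "Commit made by $GIT_USERNAME, skipping pipeline."
                # Publish the decision so the next stage can test it in its condition
                echo "##vso[task.setvariable variable=skipPipeline;isOutput=true]true"
              else
                echo "Commit made by another user, proceeding with pipeline."
              fi
            name: checkAuthor
            displayName: "Check Commit Author"

  - stage: InventoryUpdate
    displayName: "Inventory Update Stage"
    dependsOn: CheckCommitAuthor
    # Run only when the previous stage did not flag the commit as the pipeline's own
    condition: and(succeeded(), ne(dependencies.CheckCommitAuthor.outputs['CheckAuthor.checkAuthor.skipPipeline'], 'true'))
    jobs: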
      - job: UpdateInventory
        displayName: "Update Inventory with New VM List"
        pool: ubuntu-k8s

        steps:
          - checkout: self
            displayName: "Checkout Repository"

          - bash: |
              sed -i -e 's|http://archive.ubuntu.com/ubuntu|http://old-releases.ubuntu.com/ubuntu|g' \
                     -e 's|http://security.ubuntu.com/ubuntu|http://old-releases.ubuntu.com/ubuntu|g' /etc/apt/sources.list

              apt-get update -y
              ACCEPT_EULA=Y apt-get install -y unixodbc jq msodbcsql18 mssql-tools18 unixodbc-dev
              apt-get install -y python3 python3-venv python3-pip jq

              python3 -m venv ansible_env
              source ansible_env/bin/activate
              pip install --upgrade pip --break-system-packages
              pip install ansible[azure]
              ansible-galaxy collection install azure.azcollection
            displayName: "Install Dependencies and Ansible"

          - task: AzureCLI@2
            inputs:
              azureSubscription: $(AZURE_SUBSCRIPTION)
              scriptType: bash
              scriptLocation: inlineScript
              inlineScript: |
                az login --service-principal -u $(AZURE_CLIENT_ID) -p $(AZURE_CLIENT_SECRET) --tenant $(AZURE_TENANT_ID)
                chmod +x update_inventory.sh
                ./update_inventory.sh   
              workingDirectory: $(System.DefaultWorkingDirectory)
            displayName: "Azure CLI Authentication"

          - bash: |
              git config --global user.email "$GIT_EMAIL"
              git config --global user.name "$GIT_USERNAME"
              echo "https://$GIT_USERNAME:$(System.AccessToken)@your-org.visualstudio.com" > ~/.git-credentials
              git config --global credential.helper store

              git remote add origin $(GIT_REPO_URL) || echo "Remote 'origin' already exists, skipping..."
              git fetch --all

              if git show-ref --verify --quiet refs/heads/$(SOURCE_BRANCH); then
                  git checkout $(SOURCE_BRANCH)
              else
                  git checkout -b $(SOURCE_BRANCH)
              fi
              git add -A :!ansible_env :!*.pem :!update_inventory.sh
              git commit -m "Updated inventory on $(date '+%Y-%m-%d %H:%M:%S')" || echo "No changes to commit."
              git push origin $(SOURCE_BRANCH) || echo "Failed to push changes."
            displayName: "Push Inventory Updates to Git"

          - bash: |
              echo "---------------------------------------" >> $(System.DefaultWorkingDirectory)/outputs.md
            displayName: "Initialize outputs.md file"

          - bash: |
              playbooks=(${SELECTED_PLAYBOOKS//,/ })
              chmod 600 $(ANSIBLE_PRIVATE_KEY_PATH)

              if [ "${SELECTED_REGIONS}" == "*" ]; then
                regions=(${ALL_REGIONS//,/ })
              else
                regions=(${SELECTED_REGIONS//,/ })
              fi

              for playbook in "${playbooks[@]}"; do
                for region in "${regions[@]}"; do
                  inventory_path="inventories/$region/inventory.ini"

                  if [ -f "$inventory_path" ] && [ -s "$inventory_path" ]; then
                    if [ "$TARGET_SCOPE" == "all_hosts" ]; then
                      echo "Running playbook $playbook for all hosts in region $region"
                      source ansible_env/bin/activate && ansible-playbook $playbook -i $inventory_path --private-key $(ANSIBLE_PRIVATE_KEY_PATH) || true
                    elif [ "$TARGET_SCOPE" == "single_host" ]; then
                      if [ -n "$SINGLE_HOST" ]; then
                        if grep -q "^$SINGLE_HOST " "$inventory_path"; then
                          echo "Running playbook $playbook for single host $SINGLE_HOST in region $region"
                          source ansible_env/bin/activate && ansible-playbook $playbook -i $inventory_path --limit "$SINGLE_HOST" --private-key $(ANSIBLE_PRIVATE_KEY_PATH) || true
                        else
                          echo "Host $SINGLE_HOST does not exist in region $region's inventory. Skipping."
                        fi
                      else
                        echo "No single host specified for region $region. Skipping."
                      fi
                    else
                      echo "Skipping playbook $playbook for region $region as no valid host selection is made."
                    fi
                  else
                    echo "Skipping region $region as inventory file is missing or empty"
                  fi
                done
              done
            displayName: "Run Selected Playbooks $(SELECTED_PLAYBOOKS) for Selected Regions $(SELECTED_REGIONS)"
            env:
              SELECTED_PLAYBOOKS: "$(SELECTED_PLAYBOOKS)"
              SELECTED_REGIONS: "$(SELECTED_REGIONS)"
              SINGLE_HOST: "$(SINGLE_HOST)"
              TARGET_SCOPE: "$(TARGET_SCOPE)"
              ANSIBLE_HOST_KEY_CHECKING: "$(ANSIBLE_HOST_KEY_CHECKING)"
              ANSIBLE_PRIVATE_KEY_PATH: "$(ANSIBLE_PRIVATE_KEY_PATH)"
              ALL_REGIONS: "$(ALL_REGIONS)"

          - bash: |
              echo "Displaying the contents of outputs.md:"
              ls -l $(System.DefaultWorkingDirectory)  
              cat $(System.DefaultWorkingDirectory)/outputs.md
            displayName: "Display outputs.md file" 
