vSphere – vB(art)log

Dynamicly executing powershell code from GIT – a FaaS way

In this post I mentioned that I was tipping my toes in vRA.
And…. yeah, it is not dipping anymore, it is a deep dive… 🙂 but that is for another blog

This blog is about a challenge I solved by using a FaaS approach. FaaS stands for Function as a Service.
I have a small UPS running for my home environment. And found out that it could support my home environment for about 30 min. And then poof… power off…. So I was searching for a solution that my vsphere environment would be shutdown, triggered from a home assistant /Node-Red automation flow, monitoring my UPS status.

My first way of thinking was to use the REST API of vSphere / VMware esxi. But not all the actions I need are published. (Like shutdown of an ESXi host.)
And I want to be dynamic as possible.
I want to shutdown / suspend all VMs expect VMs that are tagged as coreVM. So the code that I should write, doesn’t contain the VM names or IDs, but would filter out all VMs that have a vSphere tag with coreVM.

The Idea

As mentioned, is that the automation would find all VMs that need to be suspended or shutdown ( depending if VMtools is running). And VMs that have a vSphere tag UPS/coreVM, would be ignored.

These VMs are controlled with the startup/shutdown feature of my ESXi host.
The total automation flow would be a 3 fase flow. Fase 1 shutdown all but the core VMs. Fase 2 shutdown the ESXi host, which would automaticly shutdown VMs controlled via the startup/shutdown feature. Fase 3, shutdown of Synology NAS and Home Assistant
The core VMs are my vCenter, Log Insight and router VM.

The tools

The tools I use are:

Home Assistant with
- NUT intergration (monitoring my UPS)
- Node-Red flow automation (ussing HTTP REST API calls to control docker via Portainer)
Docker CE with Portainer CE running on a synology NAS
- Docker image vmware/powerCLI
- Portainer controlling the docker environment and leveraging control via its APIs
Gitea GIT server running locally (containing my script and parameter files)

the FaaS way

For fase 2 I try to use a FaaS approach. Meaning I have a function for shutting down / suspending VMs. This function is run in a temporary runtime environment (docker container) and only available at runtime. The function is not part of the runtime environment, but on runtime it is downloaded from a GIT repository and executed.
This gives the advantage of maintaining a runtime environment seperate from scripts (or the other way around). And on every execution, it starts with a clean environment.
For the function I use powerCLI, because I haven’t found an API call in vSphere 7.0 that will shutdown an ESXi host. And filtering VMs on vsphere tags was (for me) a bridge to far.

Using a container gives me also the possibility to seperate my vSphere credentials from my script, by saving it in a docker volume, which is only mounted at runtime. The function itselfs contains no credentials. It looks to a file in that docker volume.

The hurdles I encountered were:

how to run the script from a git repo
- by using a powershell one-line invoke-expression with invoke-webrequest using the url to the raw presentation of a file (script) in the git repo.
- using this one-liner as the entrypoint parameter when creating the container
how to bypass caching by the web browser
- using “Cache-Control”=”no-store” in the header
running a full powershell script via invoke-expression
- write the full code block as a powershell function (with begin, End and Process blocks) and call that function within the script.

The powershell one-line is

pwsh -Command invoke-expression '$(Invoke-WebRequest -SkipCertificateCheck -uri ' + <git URI> + ' -Headers @{"Cache-Control"="no-store"} )'

The <git URI> is the url to the raw representation of the script file in the GIT repo. When leveraging the portainer/docker api to create the container you need to use the JSON notation for the entrypoint. The one-line will look something like:

["pwsh",
"-Command",
"invoke-expression",
'$(Invoke-WebRequest -SkipCertificateCheck -uri ' + <git URI> + ' -Headers @{"Cache-Control"="no-store"} )']

Powershell script formant / the snag

The snag with using invoke-expression is that it can’t handle a powershell script that has a Begin, End and Process code block. While I this is how I write my code, and wouldn’t like to deviate from it, it meant I had a snag.
The solution was to write a powershel script that contains a private function and execution of that function, like this

function main {
    <#
    .SYNOPSIS
        Example code for running full script from an URI.
    .DESCRIPTION
        Example code to run a powershell script with dynamic blocks Begin,Process and End.
        Loading parameters from a JSON file into an P object.
        By wrapping the code into the function main, we can use begin, process and end scriptblocks when calling with Invoke-Expression
        The Process block contains the main code to execute.
        The Begin and en blocks are mainly used for setting up the environment and closing it.
    .EXAMPLE
    Run the following cmdline in a powershell session.
    Invoke-Expression (Invoke-webrequest <URL>).content
    .NOTES    
    #>
    Begin {
        #=== Script parameters

        #-- GIT repository parameters for loading the parameters.json
        $scriptName="startVMs"
        $scriptGitServer = "https://....."   # IPv4 or FQDN of GIT server
        $scriptGitRepository = "organisation/repo/" # uri part containing git repository
        $scriptBranch = "master/" # GIT branch
        $scriptrootURI = $scriptGitServer+$scriptGitRepository+"raw/branch/"+$scriptBranch

        #==== No editing beyond this point !! ====
        $ts_start=get-date #-- Save current time for performance measurement

        #--- write log header to console
        write-host
        write-host "================================================================="
        write-host ""
        write-host "script: $scriptName.ps1"
        write-host ""
        write-host "-----------------------------------------------------------------"

        #-- trying to load parameters into $P object, preferably json style
        try { $webResult= Invoke-WebRequest -SkipCertificateCheck  -Uri ($scriptrootURI+$scriptName+".json") -Headers @{"Cache-Control"="no-store"}  }
        catch  {
            write-host "uri : " + $scriptrootURI
            throw "Request failed for loading parameters.json with uri: " + $webResult 
        }
        # validate answer
        if ($webResult.StatusCode -match "^2\d{2}" ) {
            # statuscode is 2.. so convert content into object $P
            $P = $webResult.content | ConvertFrom-Json 
        } else {
            throw ("Failed to load parameter.json from repository. Got statuscode "+ $webRequest.statusCode)
        }
    }
    End {

    }
    Process {

    }
}
main

What it does, is the script will run the function main. That function is basicly the full powershell script. In the begin code block the invoke-Webrequest is used to load a .json file and convert it to a powershell object called P.
This object contains all parameters used in the rest of script (like vCenter FQDN).

Result

The result is that from a monitoring trigger Node-Red will do some REST API calls to portainer to create run and delete a docker container based on a vmware/powerCLI image. During the lifetime of this container a volume is mounted with authorisation information and a powershell one-liner is executed which runs a powershell code directly loaded from a GIT repository.
With this setup I can run on demand any powershell script which doesn’t need user interaction, maintained in a GIT repository.

I hope you enjoyed this post. do you have questions / comments, please leave them below or reach out to me on twitter

migrate to vcsa 6.5 U3 GUI xlarge issue

For a customer I’m migrating their legacy vSphere 5.5 environment to vSphere 6.5 U3.
The migration is from a windows vCenter to the VCSA.
First, we tried to use the GUI and stumbled over an issue.

GUI

The GUI, is of course a nice, friendly way to do this process. But when we got at the stage to select the size of the VCSA, we only had the ‘xlarge’ option.
And well… that was a bit too much. Because we were aiming at the ‘small’ size.

So we did some searching on the internet and found out that we were not alone. Although it was an encouraging thought, we still didn’t find what we were looking for.
Most of the threads and blog posts pointed to database size, log size etc..
And yes, they need to be checked. We had a vcenter database table of 7 million records, the size of the database was 80 GB. After some cleaning up and shrinking the database it was only a few GB.
But al this effort, didn’t persuade the GUI to give us the desired ‘small’ size option.

CLI

Hoping that the validation in the GUI was different than when using the CLI, we decided to migrate using the vcsa-deploy CLI method.

Yes, it involves creating a json file. But the .iso file for the vcsa contains several template files, for several migration scenarios.
We used on of the templates, customized it to our needs (setting the size to ‘small’. And after some trial and error we finally got it working.

TIP: validate your json file with

vcsa-deploy --precheck-only \some_path_to_json_file\migrate.json

If you want to read more about the CLI way of migration, check the vmware docs here.

vcsa-deploy creates for every run a new log folder . When we checked the logs we found out that vcsa-deploy was content with the ‘small’ size option we configured in the json file.

Thoughts

Why not 6.7 U3, well due to dependencies 6.5 U3 is at the moment the most current version we can run.
Although this post is about migrating to 6.5 U3, it could also work for 6.7 U3, but no guarantees.
The gist of this all is, if the GUI doesn’t work, try doing it the CLI way.

Passed

I already blogged about my VCAP deploy experience.( you can find it here ) And as stated…. I passed !!!… Sadly VMware doesn’t let you know the mistakes you made in the exam.
But in the end…. this is what counts !!!

Next step ??

Well… to do the VCAP design exam. To be honest, I don’t look forward to it, because theory questions are not my strong suit.
My strategy is going to study, visit VMware Empower europe 2019, do the VCAP design exam @ Empower (it is part of the conference) so I get a direction of what the exam looks like…. and maybe pass it on the first try.

patch vCenter HA 6.7 U1

Maybe you think, we’ll how hard can it be ??
Yes, that was the same question I had. And to be honest… it is not that hard.
But there are some quarks or gotchas.
In this post I’ll explain the route I took for patching a vCenter HA setup.

Why don’t you use the VAMI ?

VAMI stands for ‘ Virtual Appliance Management Infrastructure ‘. It can be accessed via port 5480 like https://<FQDN VMware appliance>:5480 .
The VAMI of a vCenter Appliance (VCSA) has an update section, which you can use to patch the VCSA. This is a nice and easy way for patching the VCSA, but when you have vCenter configured as vCenter HA then this option won’t work. (I know from experience….)
After trying (and failing) I thought, why not read the manual….
VMware has a nice article about patching a vCenter in HA and you can find it here.
I still use the VAMI, but not for patching but for making a backup.

My VCAP-DCV 2018 Deploy experience

Yesterday I did the VMware VCAP-DCV 2018 Deploy experience. I dreaded to do the exam because the VMware DCV landscape is a fast landscape. I do have enough experience and expertise, but to know all the tiny details….
To be honest , there was a little fear for failure….
Especially because I heard that you really need all the time (3,5 hours)…. and that the performance of the virtual environment is bad …. And no access to VMware pubs/Docs/KBs and blog sites….
Do I really have it in me to pass this exam ??

But, eventhough I don’t now the score yet ( that could take up to 6 weeks or so)…. I’m quite confident that I passed it.
Why ?
Well… the situations you need to solve are for me common troubleshoot scenarios in a vSphere environment. Yes some questions are tricky… but hey… of course they are, you want to prove that you have advanced knowledge in this area..
I even enjoyed taking the exam.
It just felt as a normal day at the office…

Critics

Yes I do have some critical comments.
It is true, the lab enviroment (same experience as a VMware HOL enviroment), performs poorly. Not sure if it is because the internet connection, the hardware, or just the exam enviroment. But…. it was a while back that I used a 17″ monitor…. (i mean years…. prehistoric period)… but yesterday… yes …. really…. my exam enviroment had a 17″monitor….
Which is challenging.
As you maybe know… the HOL enviroment has a book area where the questions and explanations are shown (right side on the screen)…..
I had to switch often between this area and the console area.
With a normal monitor, you can keep them side by side…
And the performance…. selecting text with the mouse, click delays up to 10 seconds …. which is really annoying, and not helpful for someone who is nervous.

Conclusion

So I have to wait a few weeks for the result.
And even if I failed, I still think that in hindsight I should have taken this exam earlier.
If you have day to day hands on experience with a vSphere datacenter virtualization environment for more then a year, then I my advise would be to consider this exam. Your experience should be like, implementations , troubleshooting, and day-2 operations of a vSphere datacenter environment.
Get some experience with the VMware HOL environment to get used to the lab environment. It is also a great place to practice with certain functionalities without destroying your own production … euh…. test environment.
You can find VMware HOL here.

So fingers crossed (f0r me) and good luck when you want to certify your advance experience with a vSphere environment.

VMware Enhanced Authentication Plugin Not Working

For several versions of vSphere vCenter it is possible to logon with your windows credentials. Making it easier to only tick a box and logon instead of typing your username and password.
This nice and neat trick is done via the VMware Enhanced Authentication Plugin.
The only issue… it does need to work every time, else it is going to be an annoyance.
And , yes, it became an annoyance for me… especially when using Firefox.
So let me give you to possible solutions

Issue 2 – bypassing the fingerprint cache message when using PLINK

This article is part of a series of articles about issues I encountered during implementation of a vSphere stretched cluster based on vSphere 6.7 U1.
You can find the introduction article here

Issue 2

For some configuration settings I need SSH access to the host. I use plink.exe to execute instructions through the SSH session. One issue, the first time when you connect with plink you get a message about storing the fingerprint ID in the cache. Normally you would accept this when using putty. But now this is going to be a challenge.
On some other blogs I found the solution. You echo the ‘Y’ which results in storing the ID in the cache.
In my code I now call plink two times. The first time to accept the fingerprint, the second time to execute the command.
Why two times ? Well, I can’t assume that the fingerprint ID is already known.
The first plink instruction is a simple exit, we only want to check if we can logon.

$credential=get-credential
$plink="d:\plink.exe $hostname -l "+ $credential.username + " -pw " + $credential.getnetworkcredential().password
$command="ls"
invoke-expression ("echo Y | " + $plink +  " -ssh exit")
invoke-expression ($plink + " "+ $command)

Issue 1 – changing root password

This article is part of a series of articles about issues I encountered during implementation of a vSphere stretched cluster based on vSphere 6.7 U1.
You can find the introduction article here

Issue

All the hosts are delivered with 6.5 U2 pre-installed, and they have their own root password. For the implementation we want to have just one general root account password. So after adding all the hosts to the cluster I want to change the root password with powercli. But I tripped over a bug in get-esxcli (thanks to this thread ). The ‘&’ character is not correctly being interpreted when using get-esxli.
The script I wrote checks if the new password contains that character and will kindly ask to change it. After succesfull validation of the password it will apply it to all selected esxi hosts.
I

#-- select one or more hosts
[array]$esxiHosts=get-vmhost | select name | sort | out-gridview -Title "Select one or more ESXi Hosts"-OutputMode Multiple
if ($esxiHosts.count -eq 0) {
write-host "No host(s) selected, will exit." -foregroundcolor yellow
exit
}
#-- ask for root password and validate it agains known bug
Do {
$newCredential = Get-Credential -Username root -Message "Enter the password for the ESXi root account."
$isValid=$true
if ($newCredential.getNetworkCredential().Password -imatch "[\&]") {
$isValid=$false
write-host"Password contains character & which get-esxcli can't handle (bug)..... please consider a different password." -foregroundcolor yellow
}
}
until ($isValid)

#-- change root password for all selected esxi hosts
foreach ($esxiHost in $esxiHosts) {
$esxiHost=get-vmhost -Name -$esxiHost.name
$esxiCli=get-esxcli -v2 -vmhost $esxiHost
$arguments=$esxcli.system.account.set.createArgs()
$arguments.id=$newCredential.UserName
$arguments.password=$newCredential.GetNetworkCredential().password
$arguments.passwordconfirmation=$arguments.password
try {$esxcli.system.account.set.Invoke($arguments)}
catch{write-host "Setting password failed for " $esxiHost.name -ForegroundColor Yellow}
}

Issues I encountered with a stretched cluster implementation on 6.7 U1

At the moment I’m busy with a stretched cluster implementation based on vSphere 6.7 U1. Most of the configuration is straight forward. But I encounter some snags.
So this post is about these snags, and how I solved them.

For configuring 16 hosts I use a lot of powerCLI. Why ? Well I have some issues with host profiles, and not the time (yet) to figure out what is going on.
Edit: I found out what the issue is, I’ll explain it in Issue 3.

I encountered the following issues

Issue 1 – changing root password via get-esxcli
Issue 2 – bypassing the fingerprint cache when using Plink
Issue 3 – kernel port vmk0 removed by host profile

Restore experience vCenter appliance

Until a few months ago I worked with vCenters that were running on a Windows OS. Because that was the common practise. The appliance was promising but not ready for production use. When vSphere 5.5 was introduced it became an interesting discussion,should you use the appliance or still go the windows way.
I think one of the points holding people of from going to the appliance was (and maybe still is) the unknown.
I was mostly familiair with a windows OS.Yes the Windows patching was irritating, and if you wanted to upgrade VMware vCenter ….. that also was annoying. But not anymore… (well I guess for over a year that is…)

The appliance in 6.0 is great. I wrote in another article about my experience of migrating to 6.0 appliance with an update manager deployed. (https://vblog.bartlievers.nl/2017/01/17/update-manager-6-0-with-vcenter-6-0-after-migration-wizard) And I’m loving it. I was impressed by the ease of migrating from vCenter 5.5 running on Windows to 6.0 VCSA.

But there was one, tiny little, issue, called backup and restore.
How should that work when you are running an appliance.
For testing purposes we did a restore of the appliance but ran into a problem. The network adapter changed. The appliance OS didn’t recognize it anymore as eth0 but as eth1.
And our first conclusion was… well…. that is a problem…restoring a VCSA.
Diving into some vmware documentation, I ran into info that restoring a VCSA should work… (see question 29 in https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2146439)…. What did we do wrong….

After examening the events of the restored VCSA we saw an error. vCenter was saying that there was a duplicated MAC address…. so it changed the MAC address of the restored VCSA… hence the eth1.

And yes… that was the issue…. in default vCenter will change a MAC address if it already exists. (that is the short version)…. so I tried it again…. But with a small change.

I created a virtual portgroup on a vswitch, not on the dvswitch. Taking care of the duplicated MAC address in a dvSwitch… I changed the MAC address of the restored VCSA back to the original setting. And I started the VM…. voila….
The VM booted…. and after waiting for the boot process to complete…. I had a restored VCSA, fully functional.

Summary:

restoring a VCSA does work… just keep in mind if you replace the existing VCSA or restoring it next to it (duplicated MAC address)
Updating the VCSA is a breeze . You need connection to the internet, and the appliance will take care of the update. No seperate OS updates and vCenter updates.
(side note) the webclient interface is faster than the one in 5.5

The Idea

The tools

the FaaS way

The powershell one-line is

Powershell script formant / the snag

Result

Share this:

GUI

CLI

Thoughts

Share this:

Share this:

Why don’t you use the VAMI ?

Share this:

Critics

Conclusion

Share this:

Share this:

Issue 2

Share this:

Issue

Share this:

Share this:

Share this: