Is @VMware Building a DevOps Framework for the Masses?

While Cloud Native Applications and DevOps are generating massive amounts of hype, the ability of IT organizations to execute on this vision outside of Silicon Valley is often being questioned. VMware’s Cloud-Native Apps group is putting together an infrastructure framework that might just be the right model to bring DevOps interactions to the masses.


You won’t want to miss our newly added DevOps…

DevOps @ VMworld keeps getting better and better…


For the first time ever, there will be a DevOps mini conference at VMworld! This 3-day event will include 25+ developer workshops, guest speakers, and a live hackathon! Newly added to our keynote lineup is Steve Herrod, managing director at General Catalyst Partners. Register today with promo code VMWDEVDAYOR for free entry to the event!


Automating F5 AFM Using vCO Dynamic Types and vCAC – Part 2

This post is a continuation of Automating F5 AFM Using vCO Dynamic Types and vCAC – Part 1. In this post I will cover how the F5 Dynamic Object types were created and how to leverage these objects with vCAC.
Again a big thanks to Christophe Decanini and Marc Chisinevski for this example.

The vCO F5 BIG-IP Dynamic Types demo package can be downloaded here: com.definedbysoftware.DT.F5.package


Using the Dynamic Types Plugin Generator with F5->

So first things first: we need to set up the Dynamic Types plugin for the F5 BIG-IP using the Dynamic Type Plugin Generator. Although this is not a completely straightforward task (the BIG-IP REST API requires custom serializedObject Actions), the reward of having an inventory of objects to use in lists and trees is worth it.

The best part about Christophe Decanini’s Dynamic Type Plugin Generator is that it basically turns any REST API into a vCO plugin 🙂 For the F5 this allows us to create a vCO inventory of anything supported by the REST API, not just the LTM module like the existing F5 vCO Plugin.
Dynamic Types are probably best described by Christophe’s post on XaaS as…
“Dynamic Types is a new vCO feature shipped starting with vCO 5.5.1 (experimental) allowing creating inventory types dynamically without doing any Java development. It brings together the quick implementation of the REST / SOAP plug-ins with the convenience of using inventory objects leveraged by vCAC XaaS.”

It is also important to note that Dynamic Types are now fully supported as of vCO 5.5.2.

To get started simply ensure you have an F5 BIG-IP you can connect to (preferably running at least version 11.5) and have downloaded and installed both Christophe’s Dynamic Type Plugin Generator and the demo package com.definedbysoftware.DT.F5.package.

The F5 Dynamic Types demo package essentially includes some example serializedObject Actions for parsing the BIG-IP JSON as well as this AFM Address List example workflow. In this early release of the package I have included serializedObject Actions for the following use cases:

  • AFM Address Lists – https://<BIG-IP.IP>/mgmt/tm/security/firewall/address-list
  • LTM Virtual Servers – https://<BIG-IP.IP>/mgmt/tm/ltm/virtual/?expandSubcollections=true
  • LTM Pools – https://<BIG-IP.IP>/mgmt/tm/ltm/pool/?expandSubcollections=true
  • LTM Nodes – https://<BIG-IP.IP>/mgmt/tm/ltm/node

You will notice above that some of the URLs include the ?expandSubcollections=true flag. This tells the REST call to expand subcollections and return the details of linked objects as nested JSON objects. This is very useful, as without it you would need to make a second REST call, for example to get the members of a pool.
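To make this concrete, below is a minimal vCO scriptable-task sketch (not part of the demo package) that queries the pools with subcollections expanded. It assumes a REST:RESTHost input named f5Host pointing at https://<BIG-IP.IP>/mgmt/tm, and the property names (items, membersReference) are taken from my reading of the BIG-IP 11.x JSON, so verify them against your own response.

// Minimal sketch: query BIG-IP pools with subcollections expanded.
// "f5Host" is assumed to be a REST:RESTHost input pointing at https://<BIG-IP.IP>/mgmt/tm
var request = f5Host.createRequest("GET", "/ltm/pool/?expandSubcollections=true", null);
var response = request.execute();
if (response.statusCode != 200) {
    throw "BIG-IP returned HTTP " + response.statusCode;
}
var json = JSON.parse(response.contentAsString);
// Each pool carries its members inline, so no second REST call is needed.
for each (var pool in json.items) {
    var members = (pool.membersReference && pool.membersReference.items) ? pool.membersReference.items : [];
    System.log(pool.name + " has " + members.length + " member(s)");
}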


Setting up the Dynamic Types Plugin Generator with F5->

So this process is based on Christophe’s Dynamic Type Plugin Generator Tutorial and as such I’m not going to repeat all the content in that post. Instead I’ll focus on what is different from the Twitter example and how you can create your own serializedObject Actions.

1. Set up your Dynamic Types namespace and a matching REST host using the workflow “-1- Create a plug-in”. The only real things to consider here are:

  • Use a high level URL such as “https://<BIG-IP.IP>/mgmt/tm”. This allows the one REST host (and namespace) to be used for multiple F5 modules.
  • Use Basic Authentication.

2. Now the fun begins with the plugin creation. Run Workflow “-2- Create a new plug-in type”.
Select your created Dynamic Type namespace and enter a name for the root folder and type name. I have also uploaded a neat little F5 icon which adds a bit of style.


3. Now we need to provide the URLs and serializedObject Actions, and validate the output, for the plugin’s findAll, findbyID and findRelations functions. For AFM all URLs start with “https://<BIG-IP.IP>/mgmt/tm/security/firewall” and in this case we are going down to the /address-list/ URL.


4. After submitting the findAll request for Address-Lists you should get valid response content which matches a string similar to the one below.


Before we move on to the next step I will explain why I have provided the custom serializedObject Actions. If you copy the provided JSON response into a JSON viewer you will see a structure similar to the one below.

You will notice that the JSON response starts off with a nested list containing each Address List. This format is used across the entire BIG-IP platform, and because the actual content we need is one level down, the provided GetPropertiesFromSerializedObject action will not return the information we need.

Fortunately, Christophe has thought of this and provides the option of using a custom action to parse the JSON response. The actions provided in my demo package above can be used for exactly this. I have included sample actions for AFM Address Lists, LTM Virtual Servers, LTM Pools and LTM Nodes. The reason the actions are specific to the various modules is that I am selecting which properties to push back to the plugin generator, as it is rare that you will want all of them.

If you wish to use these actions for other F5 modules, simply duplicate them and change the property matching accordingly. Decide which properties you wish to return by first looking at the response in a JSON viewer.
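As a rough sketch (not the packaged getF5AddressListAFM action itself), the core of such a parsing action looks something like the following. The input parameter name (serializedObject) and the exact return format expected by the Dynamic Types Plugin Generator should be taken from the demo package and Christophe’s tutorial; this only illustrates the JSON handling for the nested BIG-IP format.

// Rough sketch of a collection parsing action in the style of getF5AddressListAFM.
// Input: the serialized JSON string returned by the findAll GET.
var json = JSON.parse(serializedObject);
var results = [];
for each (var item in json.items) {            // BIG-IP wraps collections in an "items" array
    var props = new Properties();
    props.put("id", item.name);                // use the object name as the Dynamic Types ID
    props.put("name", item.name);
    props.put("description", item.description);
    props.put("selfLink", item.selfLink);
    props.put("kind", item.kind);
    var addresses = [];                        // flatten the nested address objects
    for each (var address in item.addresses) {
        addresses.push(address.name);
    }
    props.put("addresses", addresses.join(","));
    results.push(props);
}
return results;

The getF5AddressListObjAFM action used later for the findbyID step is essentially the same, except that a single-object response is not wrapped in the nested list, so it parses the object directly rather than looping over items.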


5. After selecting the getF5AddressListAFM action, the properties validation should show which properties have been found. In my case they show as below. If you’re happy, accept the result and we move on to finding an object by ID.

name : AppServers
id : AppServers
description : Application Servers
selfLink : https://localhost/mgmt/tm/security/firewall/address-list/~Common~AppServers?ver=11.5.1
kind : tm:security:firewall:address-list:address-liststate
generationID : 1255
addresses :,,


6. Under the findbyID method we need to provide a URL that retrieves just one item by its ID. For BIG-IP this is usually the exact same URL followed by /~<Partition>~<ObjectName>/. In our case the getF5AddressListAFM action sets the object’s name as the ID property, so the URL is simply https://<BIG-IP.IP>/mgmt/tm/security/firewall/address-list/~Common~{ID}


7. The GET should then return a valid JSON response, and you are then prompted to provide another deserialized definition. You will need to supply the getF5AddressListObjAFM action, which is different from the action provided for the findAll. This is because when selecting an object by its ID the nested list format is not used. You could use the default action, however using my provided action will save you having to adjust the property accessors to match the previous findAll query.


8. Once complete you are prompted with the final stage for findRelation. The URL and custom action simply match the findAll step.

9. In the final step you are asked whether you would only like to use properties found in all of findAll, findbyID and findRelation. You can simply answer Yes, as the JSON parsing actions collect the same properties for each step, so there should be no difference.

10. Bam! That should be all there is to it, and you will have an object in your vCO inventory for each AFM Address List, with a comma-separated property listing each IP in the list.



Setting up the vCO AFM Workflow and vCAC Custom Action->

So with the F5 Address List objects now set up, we can start to leverage them in vCO and vCAC workflows.
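Before walking through the configuration steps, here is a rough sketch (not the packaged workflow itself) of the kind of scriptable-task logic the demo workflow performs: read the current address list via a GET operation, append the VM’s IP obtained from VMware Tools, and write the list back via a PUT operation. The attribute and input names (restOperationGET, restOperationPUT, vm, addressListId) and the exact JSON body are assumptions to be checked against the actual workflow and your BIG-IP response.

// Rough sketch only - the demo workflow in the package is the reference implementation.
// Assumed inputs/attributes: restOperationGET, restOperationPUT (REST:RESTOperation),
// vm (VC:VirtualMachine), addressListId (string, e.g. "AppServers").
var ip = vm.guest.ipAddress;                    // requires VMware Tools and a powered-on VM
if (ip == null) {
    throw "No IP address reported by VMware Tools";
}
// GET the current address list (findbyID-style URL: .../address-list/~Common~{ID})
var getResponse = restOperationGET.createRequest([addressListId], null).execute();
var addressList = JSON.parse(getResponse.contentAsString);
// Append the VM IP to the existing addresses
var addresses = addressList.addresses ? addressList.addresses : [];
addresses.push({ name: ip });
// PUT the updated list back to the same URL
var putRequest = restOperationPUT.createRequest([addressListId], JSON.stringify({ addresses: addresses }));
putRequest.contentType = "application/json";
var putResponse = putRequest.execute();
System.log("BIG-IP responded with HTTP " + putResponse.statusCode);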

1. Before we configure the demo workflow F5 Add IP address to AFM Address List – Dynamic Type, we need to set up another REST operation for PUT. BIG-IP uses both PUT and POST depending on the location; see the BIG-IP REST User Guide for details. Simply use the Add REST operation workflow; the Template URL should match the findbyID URL used earlier, with either PUT or POST as the method.


2. Edit the F5 Add IP address to AFM Address List – Dynamic Type workflow and update the Attributes for restOperationGET and restOperationPUT pointing them at your own F5 REST operations for GET and PUT based on findbyID.

3. Edit the workflow Inputs and update the F5AddressList parameter to point it at the newly created Address List type. Note: remember to use the type itself, not the folder.


4. Give the workflow a whirl by right-clicking on an Address List object and selecting Run workflow… The Address List input should then be automatically populated and you only need to select a VM. Note that this workflow uses VMware Tools to get the IP address, so the VM must be powered on. Once the workflow is complete, reload the Dynamic Types inventory and the VM’s IP address should now be in the list 🙂


5. Now onto vCAC. The following is based on the latest and greatest vCAC 6.1 however it is basically the same process for 6.0.
Under Advanced Services Designer create a new Resource Action and select the F5 Add IP address to AFM Address List – Dynamic Type workflow. The resource type should automatically map to the input type of VM.


6. In the Details tab, if you’re using vCAC 6.1, you can leverage the new state functionality to have the action available only when the VM is powered on.


7. Edit the form field to set the type as a List.


You’re done 🙂


Happy Automating…. Chris Slater out.

Automating F5 AFM Using vCO Dynamic Types and vCAC – Part 1

So recently I have been working on automating F5 BIG-IP for PaaS using vCloud Automation Center and vCloud Application Director. I must admit it has been really interesting work, as automating the industry-leading load balancer with vCAC for PaaS deployment really shows how the SDDC can come together. So I thought I would post on some of the major options available for F5 automation with vCAC 6.x (because there are quite a few), as well as a demo of what I have done with F5 and vCO Dynamic Types.


F5 Automation Options for vCAC Integration->

So before I go into demonstrating the awesomeness of vCO Dynamic types and vCAC I thought I would go through the various options when integrating with F5. There are quite a few options and I have highlighted the major ones below.

vCO Plugin:

The F5 vCO Plugin is a free, full vCO plugin available from the VMware Solutions Exchange. This plugin uses either REST or SOAP when connecting to a BIG-IP device and exposes common operations around LTM and GTM.
The advantage of this plugin is that it is ready to use and comes with some great OOTB example workflows. It also populates an inventory of vCO objects based on LTM Virtual Servers, Pools and Nodes. For many standard LTM and GTM use cases this plugin is probably the quickest and easiest way to get automating.

The disadvantage of the plugin is that it is relatively limited in terms of functionality and is really focused on common LTM and GTM operations. There are at least another eight F5 modules that can be leveraged outside of LTM and GTM; in my case for this post I was required to automate the Advanced Firewall Manager (AFM). As a result the plugin was not suitable for my use case, and this would probably be the case for anyone who wishes to automate a large part of the F5 BIG-IP platform.


PowerShell Snapin:

F5 also provides a PowerShell snap-in for BIG-IP, which is great for PowerShell fans such as myself. This can be leveraged with the vCO PowerShell plugin, however its drawbacks are similar to those of the vCO Plugin: not all modules (AFM for example) are exposed via PowerShell cmdlets. Mainly for this reason it was not my method of choice for vCAC/vCO.


Traffic Management Shell (tmsh):

For those who don’t know, the Traffic Management Shell (tmsh) is the BIG-IP CLI. To say that it is a fully featured CLI would be an understatement, as its reference document is over 2,390 pages.
This solves the problem with the previous two options, where you can’t automate part of the solution because the method or cmdlet simply doesn’t exist. It is also not very hard to implement, as you can leverage the SSH plugin within vCO to run commands directly on the BIG-IP platform.

The disadvantage of both the tmsh and PowerShell methods is that unfortunately you have no vCO inventory to leverage. This results in users entering information as strings rather than picking from objects in a list, which in turn requires more error handling in the code and requires the requester to know exactly what they want/need to change.


SOAP API:

The SOAP API is the traditional API that has always been available for BIG-IP. It is available via HTTPS and returns content in XML.
I’m not going to go too much into REST vs SOAP, however it seems that although F5 still supports the SOAP API, the REST API is the future.

I have had issues adding an F5 BIG-IP as a SOAP host to vCO which is described in this communities post. If anyone knows if this has been solved I would be grateful.

UPDATE: I have been advised that the F5 BIG-IP WSDL uses the older RPC/encoded format, which is not compliant with what vCO expects (RPC/literal or document/literal format).
Thanks to Simon Lynch for this information.


REST API:

As stated above, the REST API, which is fully supported from BIG-IP version 11.5 onwards, seems to be the future of the F5 API. Although REST is not self-describing like SOAP, it does have some advantages:

  • Easy to set up and use in vCO
  • Easy to test and debug in a web browser with a REST client plugin
  • Responses are in JSON, which is easy to work with in the JavaScript-based vCO
  • Works with Christophe Decanini’s Dynamic Types plugin generator which provides a vCO Inventory 🙂

You can also save yourself a lot of effort in setting up all the various REST operations by leveraging Simon Sparks’ vCO workflow script to Add REST Operations to a REST Host for F5 BIG-IP LTM – Part 1 and Part 2.
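For the curious, the general pattern such a script follows is sketched below. This is a hedged illustration, not Simon’s script: the HTTP-REST plugin scripting names (RESTOperation, addOperation, RESTHostManager.updateHost) should be verified against the API explorer in your vCO version, restHost is assumed to be a REST:RESTHost input, and the operation names and URLs are examples only.

// Sketch: programmatically add REST operations to an existing REST host.
// "restHost" is assumed to be a REST:RESTHost input; operation names/URLs are examples only.
var operations = [
    { name: "F5 - Get AFM Address List",    method: "GET", template: "/security/firewall/address-list/~Common~{ID}" },
    { name: "F5 - Update AFM Address List", method: "PUT", template: "/security/firewall/address-list/~Common~{ID}" }
];
for each (var def in operations) {
    var op = new RESTOperation(def.name);
    op.method = def.method;
    op.urlTemplate = def.template;
    restHost.addOperation(op);
    System.log("Added operation '" + def.name + "' (" + def.method + " " + def.template + ")");
}
// Persist the updated host configuration back to the plugin inventory
RESTHostManager.updateHost(restHost);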



Using vCO Dynamic Types with the F5 REST API (AFM use case)->

So, time to show off vCAC, vCO Dynamic Types and the BIG-IP REST API working together. Although this is quite a simple use case, it shows off the concept quite well.

As mentioned earlier my use case was AFM, so what specifically in AFM did I need to automate? Simply put, I was required to add virtual machines to specific firewall Address Lists; these address lists were already mapped to approved firewall rules for multi-tiered applications. Now in my case the IP addresses were coming from vCloud Application Director as part of PaaS deployments, however this example is based on vCAC custom actions.

So before I get started I would like to give thanks to Christophe Decanini and Marc Chisinevski. I will be going through how to setup the Dynamic Type plugin generator for F5 in Part 2.


First things first, here is the before shot. As I stated above, I have an F5 AFM Address List I want to add a VM to.

The user finds the VM in the vCAC inventory and simply selects the Custom Action F5 – Add Virtual Machine to AFM Address List


The requester provides a description and reason for the request.


The requester then selects from a tree list which Address List to add the VM IP to. This is a dynamic list resolved via the Dynamic Types plugin; if I were to add or remove an Address List before running this workflow, the list would be updated appropriately. The magic of Dynamic Types 🙂


After submitting the request we can see the VM’s IP has now been added to the address list and associated Firewall rules.


This screenshot shows the underlying vCO workflow used to add the VM IP from vCAC to the BIG-IP using the REST API.


We can also see the Dynamic Types Plugin Inventory for the F5 Address Lists as used earlier in the vCAC Custom Action.

F5 Dynamic Types Examples


Continued in Part 2, covering how to set up Dynamic Types with AFM in vCO based on this example…. Chris Slater out.

Integrating vCloud Application Director (Linux VM) with vCenter Orchestrator

So I have been using vCloud Application Director (or vCloud Automation Center Application Services) a fair bit recently; it is VMware’s Platform as a Service (PaaS) life-cycle management add-on to vCloud Automation Center. Although it is a great product for orchestrating multi-tier application platform deployment, one feature that is currently lacking in my opinion is native vCenter Orchestrator (vCO) integration. vCO has become the cornerstone of advanced automation and integration in the VMware product suite, and it is nearly always required when trying to do end-to-end automation with vCAC for IaaS or PaaS.


The Script:

I have created a vCO integration bash script which can be found here. The script is designed to be run from Application Director using either a Linux guest-based service or an External Service where the service VM is provided through a Linux OS. It is worth noting that although this post is based on using the script for AppD to vCO integration, the bash script can be used on any Linux OS, assuming XMLStarlet (more on XMLStarlet below) can be installed.


Calling vCO via the REST API:

So if you have not guessed already, the script leverages the vCO REST API to submit a vCO workflow request. The vCO API is well documented and can be found here for 5.5 -> Using the vCenter Orchestrator REST API. However, if you want to get started writing your own scripts against the vCO API, I highly recommend the vCO Team post by Burke Azbill on How to use the REST API to Start a Workflow. This was basically the guide I followed to write the script, and it worked a treat.


Why Bash and XMLStarlet?

So in my case I needed the script to run from an External Service VM in Application Director. For those who don’t know, an External Service VM is a VM that is provisioned as part of an application blueprint deployment to integrate with an external system. The VM is quickly spun up, runs its required scripts and is then torn down again. The External Service VM needs to be quick to deploy, so a lightweight Linux VM is perfect. I also wanted to stick with OS-level shell scripting and avoid having to install Perl or Python. One exception I did have to make, however, was to install XMLStarlet. “XMLStarlet is a set of command line utilities (tools) which can be used to transform, query, validate, and edit XML documents and files using simple set of shell commands in similar way it is done for plain text files using UNIX grep, sed, awk, diff, patch, join, etc commands.” XMLStarlet is required as we need to query the vCO REST API, which responds in either XML or JSON. Unfortunately there is no nice way to handle XML in bash, so XMLStarlet is the solution to this problem.


Using the Script from a Linux OS:

To run the script from a Linux machine simply download the script, install XMLStarlet and update the input variables. The lines that need to be updated are shown below. Most are self-explanatory, with the exception of the PREFIX.
The PREFIX is used by Application Director to help distinguish which workflow and which workflow inputs are used for the different life-cycle stages such as Provision, Update, Teardown, etc…

You can change the PREFIX to anything you like; the key is that the script will use all variables that start with that prefix and an underscore as inputs to the vCO workflow.

PROV_WorkflowName="Workflow Name in vCO"
#PROV_WorkflowID="Insert Workflow ID"



For example, I have used two script inputs, resName and clientMAC. These match vCO workflow parameter variables. The resName parameter is required, however clientMAC is optional for this particular workflow. Based on the options above the following XML is created. Note that this is already shown in the log output.

INFO: Creating Template XML
INFO: The following XML will be submitted to run the vCO Workflow
<execution-context xmlns="">
                <parameter type="string" name="clientMAC" scope="local">
                <parameter type="string" name="resName" scope="local">


Using the Script to Integrate AppD with vCO through an External Service:

So as this post is actually on integration with vCAC Application Services I should probably mention how to do that as a standalone external service.

Simply create a new External Service and ensure it is backed by a logical template that is a Linux OS (with XMLStarlet installed). Add properties to the service matching the Input variables above and as required for the Workflow Input variables.

Copy and paste the script into the various life-cycle stages you require, ensuring you set the Prefix value at the start of the script in each stage. Also ensure that you do not include the input variables section of the script, as this is now handled by AppD.



Save the External Service and register it with a Deployment Environment. Create a blueprint using the service and deploy.

Afterwards the deployment should complete successfully, with the vCO workflow executed and its output returned to AppD.


Happy automating 🙂

Integrating VSAN Observer with vCOps

So I haven’t posted in a while <insert excuse here> however it’s time to get back into it. So I thought I would start with integrating VSAN Observer into a vCOps Dashboard.

The VSAN Observer is probably best described by Rawlinson as “The VSAN Observer is packaged with vSphere 5.5 vCenter Server. The VSAN observer is part of the Ruby vSphere Console (RVC), an interactive command line shell for vSphere management that is part of both Windows and Linux vCenter Server in vSphere 5.5. VMware Support exclusively used the VSAN Observer for early internal VSAN troubleshooting, but the utility is now available to all VMware customers using the new vSphere 5.5.”

The VSAN Observer is currently the best tool I’m aware of for monitoring VSAN performance.

Setting up the VSAN Observer is well described in Erik Bussink’s and Rawlinson‘s posts. However, because I wanted to integrate it with vCOps, we need it running all the time, not just when we want to perform some point-in-time troubleshooting. I have done such a setup and the result can be seen below.


VSAN vCOps Dashboard


Setting up the VSAN Observer Dashboard

Before we get straight into setting up the dashboard we need to get the VSAN Observer running. Because most of the blogs on this are based on the vCenter Virtual Appliance (plus my lab is using the Windows version of vCenter) I have decided to base these instructions on the Windows vCenter version of RVC.


Setting up VSAN Observer for Windows->

Instructions on setting up the VSAN Observer for Windows can be found on Erik Bussink’s blog however there are a few things to keep in mind.

  1. You don’t need to run RVC from the vCenter server. In fact I would recommend you don’t. It can be run from any Windows server by simply copying the C:\Program Files\VMware\Infrastructure\VirtualCenter Server\support\ folder to another server. I recommend running the VSAN Observer on another server, especially if you need to run multiple instances for multiple VSAN clusters.
  2. Make sure you open the ports needed in the Windows Firewall
    netsh advfirewall firewall add rule name = "VMware RVC VSAN Observer" dir = in protocol = tcp action = allow localport = 8010 remoteip = localsubnet profile = DOMAIN
  3. You can run more than one VSAN Observer script by ensuring they are listening on different ports. Use the --port XXXX argument to make sure multiple scripts are using different ports.
  4. From what I can tell you need to use a local user to authenticate the rvc client with vCenter. Eg. Local Administrator. If someone can get this to work with a Domain account please reach out. As such you will need to give that local account access to vSphere to read the inventory objects.

So to tie this together the first thing I did was use Duco Jaspar’s rvc batch script which can be found here.

Copy the script to the same directory and change the variables as per the instructions. There are however a few things to add..

  1. As suggested enter the password for the local account so the rvc VSAN batch file can run without prompting for a password.
  2. At the end of the script you will find the line "….2.0.2\lib bin\rvc -c ' %clusterarg% --run-webserver --force ' %admin%@%vCenter%"
    Make the following changes ->
    ………2.0.2\lib bin\rvc -c ' %clusterarg% --run-webserver --force -m 5 -p 801x' %admin%@%vCenter%
    The -p switch allows you to specify a different port from the default 8010. This is only required if you are going to run multiple batch files for multiple VSAN clusters.
    The -m <Number of Hours> switch allows you to specify how long the VSAN Observer runs before terminating. This number might need some playing around with; by default it will run for 2 hours. Increasing the number allows older stats to be shown, at the cost of memory. Keep in mind this is a 32-bit process, so I wouldn’t recommend making it too high.
  3. Test the script 🙂


Setting up the scheduled task(s)->

Now set up some Windows Scheduled Tasks to restart the scripts when the time window has elapsed. To keep things simple I created a scheduled task (make sure it can run without someone logged in) that runs every 5 minutes. If you do this, ENSURE you select the option Do not start a new instance if the task is already running. This ensures that only one copy of the script (per VSAN cluster) is running at a time, and that when the observer terminates after its time window it restarts again within 5 minutes.



Setting up the vCOps Dashboard and Widget->

  1. Log into the vCOps Custom UI
  2. Create a New Dashboard with a Single Column and the Text Widget
  3. Expand then Edit the Widget.
  4. Enter the Cluster name and the URL of the VSAN Observer Web host. Ensure you check HTML.
    You can enable the auto-refresh option if you wish. This option is useful if this is more of a monitoring dashboard. If you’re using it as a troubleshooting dashboard, leave it off.
  5. You’re done!


VSAN vCOps Dashboard


NOTE: Some browsers may block the content as vCOps is displaying HTTP and HTTPS traffic in the same session. Simply accept this warning to display the content.


Final word ->

This integration leverages the VSAN Observer web UI and the vCOps Text widget; as such, the raw VSAN metrics are not being imported into vCOps. I am aware of how much more useful that would be for historical data, as well as for comparing VSAN metrics with other vSphere metrics. This will come in time with the release of the VSAN vCOps Management Pack. Stay tuned….

Until then, however, enjoy this compromise: at least we can continue to use vCOps as a single pane of glass for all things virtual 🙂

To Zero or not to Zero, that is the question…


A question that has been coming up quite a bit lately is related to the relative security of the different VMDK formats available for use within VMware vSphere. As most of you know, vSphere offers 3 basic disk formats.

  1. Lazy Zeroed Thick
    This is the standard format used by thick disks. Space is allocated at the time the VMDK is created but the underlying physical blocks are not zeroed. Zeroing of each block occurs as the first write to that block takes place.
  2. Thin
    With a Thin disk physical blocks are not allocated at time of creation. Instead they are allocated as the virtual machine writes to the disk and consumes space. Again, the physical blocks are not pre-zeroed. They are allocated and zeroed on first write.
  3. Eager Zeroed Thick
    Eager Zeroed Thick disks have space allocated and all underlying blocks zeroed at the time the VMDK is created.

While Eager Zeroed Thick disks tend not to raise any questions, the fact that Thin and Lazy Zeroed Thick disks are not zeroed at the time of creation has caused some concern about a newly provisioned virtual machine being able to read ‘stale’ data that might have been left behind when an old VM was deleted or relocated to another datastore. If the blocks aren’t zeroed at creation time, how do we prevent that VM from reading the old data that might still reside on the physical blocks underlying the VMDK?

VMDK Metadata

A VMDK doesn’t just contain the data stored within the virtual disk. It also contains a metadata area that describes the attributes of the VMDK, including something called the Grain Table. The physical blocks/sectors backing the VMDK blocks are referred to as ‘grains’, and the details of each grain are recorded in the Grain Table within the VMDK metadata. Each record in the table holds the offset address of the physical location backing the grain, and is referred to as a Grain Table Entry (GTE).

What happens when a VM tries to access data stored within the VMDK?

When a VM tries to read a block of data within a VMDK, the following process takes place within the storage subsystem of the VMKernel:

We first read the GTE for that block

  • If the GTE has a value of 0 (zero) it indicates that no grain has been allocated. On reading a value of 0 the storage subsystem automatically returns zeros for the contents of the VMDK block being read.
  • If the GTE has a value of 1 it indicates that, while a grain has been allocated, nothing has ever been written to it so the value returned for the grain content is zeros again.
  • If the GTE is >1 it indicates that we have a physical block backing the VMDK grain and that it’s ok to read from and write to it. We use the GTE to determine the offset address of the physical blocks backing the grain.

You can see from the above that, in the cases where blocks haven’t been allocated in a thin disk or have never been written in a lazy zeroed thick disk, we don’t even return an offset, so it’s impossible for anything inside the guest to even determine the physical location of the block, let alone read any ‘stale’ data.
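Purely as an illustration of the decision logic described above (a sketch in JavaScript, not VMkernel code), the three GTE cases boil down to the following:

// Illustrative sketch only - simply restates the three GTE cases described above.
function readVmdkBlock(gte, readPhysical) {
    if (gte === 0) {
        return zeroBlock();       // no grain allocated (e.g. thin disk, never written): return zeros
    }
    if (gte === 1) {
        return zeroBlock();       // grain allocated but never written (lazy zeroed thick): return zeros
    }
    return readPhysical(gte);     // GTE > 1: a real physical offset exists, read the actual data
}

function zeroBlock() {
    var block = [];
    for (var i = 0; i < 4096; i++) {
        block.push(0);            // zeros are returned without ever touching physical storage
    }
    return block;
}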

The process described above applies to disks where there are no linked clones, snapshots etc. Where these exist the process gets a little more complicated but the same basic protections are in place.


Whether you deploy a VMDK in Thin, Lazy Zero Thick or Eager Zero Thick format, the VMkernel storage subsystem has been designed in such a way that accessing stale data from within a guest is not possible. Any command executed within the guest cannot even determine the physical location of unwritten blocks, let alone access any stale data left behind by a previous VMDK. So in summary, you can use any of the virtual disk formats with complete peace of mind that it will not be possible to access old data on the underlying physical storage.


For more information you can refer to the Virtual Disk Format 5.0 Technical Note at

VM Large Pages, TPS and ASLR

In an earlier vCOps post I discussed how Large Pages affect consumed memory and how analyzing VM memory usage by Demand (or Active Memory) can give potentially misleading right-sizing recommendations.
In this post I will go into detail on the common subject of the effect of Large Pages on Transparent Page Sharing (TPS) and consumed memory, including the important topic of how Address Space Layout Randomisation (ASLR) affects host memory usage.

The Setup

vSphere 5.5 U1

Linux Server (SLES 11) 6GB RAM

Windows Server 1 (W2K8R2) 4GB RAM

Windows Server 2 (W2K8R2) 10GB RAM

The Question?

Why for most of our Virtual Machines does a VM’s Consumed Host Memory basically equal the amount of Configured Memory for the VM? When I check inside the guest OS there seems to be a large amount of free memory.

The Answer

The reason for this is essentially a combination of factors, including TPS not being used on Large Page VMs, the guest OS itself, and Address Space Layout Randomisation (ASLR).
First things first: Large Pages and TPS.

Large Pages and TPS

It is a pretty well known fact that since the introduction of Large Pages with Intel EPT or AMD RVI, TPS is not used unless the host comes under memory contention. This is well explained in the Yellow Bricks posts as well as KB 1021095, and it is a well known and accepted trade-off of performance over consolidation. But what effect does this actually have on memory usage? TPS only saves identical blocks, right? It’s not like there would be gigabytes of these per ESXi host….

Well, that’s where you’re wrong. One of the most common blocks that TPS can deduplicate is an empty or unallocated 4KB memory page. This, in conjunction with ASLR on certain guest OSs (more on that later), can introduce a scenario where a guest’s consumed host memory usually matches its configured memory, whether the guest needs that amount of memory or not.

Before we get into controversial topics such as disabling Large Pages and ASLR, let’s look at some basic graphs.

The Testing Scenarios

Test 1 – Windows Server 1 4GB RAM with Large Pages Disabled

vCenter Stats - LP Disabled

From the graph above we can deduce a few key points:

  • Although 4GB of RAM is allocated only 3.2GB is being consumed on the host
  • The Active Memory is generally less than 1GB
  • The shared memory is equal to roughly the configured memory minus the consumed memory

What is also interesting about this VM is that the majority of the shared memory is made up of unused memory. But more on that in a sec. First let’s look at the exact same VM after a vMotion to a host with Large Pages enabled (the default).

Test 2 – Windows Server 1 – 4GB RAM with Large Pages Enabled

vCenter Stats - LP Enabled

What stands out straight away is that the VM is now consuming all 4GB on the ESX host. The shared memory, which was around 1GB, is now almost 0. This graph also shows the effect of host memory contention: at 7:25 my ESXi host hits 94% memory usage and begins breaking large pages down into small pages, which then allows TPS to begin finding duplicate pages again.

Just before, I mentioned that the majority of the shared memory was unused memory. This is shown in the graph below.

Test 3 – Windows Server 1 – 4GB RAM with Large Pages Disabled After vMotion

vCenter Stats - LP Disabled Single VM on Host

Now this is the same VM as in the last test, and as you can see, straight after the vMotion there is an amount of memory that is shared and the consumed memory is no longer at 4GB. This is an important point and is well explained in KB 1021896. It is essentially because the guest has allocated pages spread all over its address space. Because these are backed with 2MB Large Pages, in the worst case a single 4KB page may take up an entire 2MB backing page. This results in almost no free 2MB memory pages in the case of Test 2. In this test, however, free 4K pages are far more common and, as such, ESX does not need to back them with physical memory unless requested.

This test also highlights TPS in action with identical guest OS blocks. At around 8:25 I vMotioned three other Windows 2008 R2 VMs onto the same host. As a result you can see the amount of shared memory increase over time (and correspondingly the amount of consumed memory decrease). The magic of TPS 🙂

So you’re probably wondering where ASLR fits into all this. Well, if you haven’t guessed already, let’s discuss that.

Address Space Layout Randomisation (ASLR)

ASLR is a security feature of modern operating systems that helps prevent buffer overflow attacks. It is well explained in this Wikipedia article. For Windows, ASLR was introduced with Windows Vista (Server 2008) and has been around ever since. Other operating systems such as Linux-based OSs also use ASLR, with slightly different implementations.

So how does ASLR affect shared memory?

Because ASLR distributes memory pages all over the address space, the chance of finding an unused 2MB block is greatly reduced. This is more obvious, and a bigger issue, in VMs with larger amounts of memory.

Test 4 – Windows VM 2 – 10GB RAM with Large Pages Enabled – VM Boot

Windows VM 2 LP Enabled

From the graph above we can deduce a few key points:

  • Straight after boot the VM was already consuming 8GB of RAM, even though this is a vanilla W2K8R2 OS with no applications installed. According to Perfmon, Windows was only using 587MB of RAM just after boot.
  • At 9:35 I performed a 6GB Memory allocation using MemAlloc. Although the guest still had around 3.5GB of RAM free the consumed Memory was now 10GB.
  • At 9:41 I de-allocated the Memory. However the consumed memory never decreased back to the original 8GB.

An important point from the above is that after memory was deallocated inside the guest OS, the amount of host consumed memory did not decrease. This is well explained on page 5 of Understanding Memory Resource Management in VMware® ESX Server. It essentially boils down to the fact that ESX cannot tell which pages are free after the guest OS has accessed them for the first time. The balloon driver solves this issue, however it is only engaged during times of resource contention due to its overhead and interference with the guest OS.

Test 5 – Linux VM – 6GB RAM with Large Pages Enabled

Linux LP Enabled

This test shows a Linux VM on the same host. As you can see, at boot, although the VM has 6GB of RAM configured, it is only consuming 1.5GB even with large pages enabled. This shows the different implementations of ASLR among different operating systems.

Now onto the final test. In this test we have disabled ASLR inside Windows via this Registry value.

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management]

Test 6 – Windows VM 2 – 10GB RAM with Large Pages Enabled and then Disabled – ASLR Disabled

Windows ASLR Disabled

In this final test the graph above shows the following trends during first boot and test with Large Pages enabled:

  • After boot the VMs consumed memory is far less than in Test 4.
  • During the Memory Alloc consumed memory increased, however it did not reach the granted amount.
  • After the memory was deallocated the consumed memory still remained around 9GB.

During the second boot and test with Large Pages disabled:

  • The VM has very little memory consumed after boot (530MB), which is common for Windows when Large Pages are disabled.
  • During the Memory Alloc the consumed memory increased, however it did not reach the granted amount nor was it as high as the last test.
  • After the memory was deallocated TPS starts reducing the amount of consumed memory due to identical pages being found.

How do I disable Large Pages?

Large Pages can be disabled at the ESXi host level under Advanced Settings -> Mem -> Mem.AllocGuestLargePage by setting the value to 0. As this is an ESXi host-level setting, I would not recommend making this change unless you are happy to trade performance for consolidation. Also note that VMs will need to be vMotioned off and back onto the host (or power cycled) for the setting to take effect.

Disable Large Pages

Final Word

As you can see from the tests above, the use of Large Pages and ASLR has a significant effect on the amount of physical memory a guest consumes, as well as on its ability to return unused memory back to the ESX host. With these two factors combined (which would be the standard scenario) a guest’s consumed memory will often far exceed its required amount of memory. This leads to my next post on the importance of VM right-sizing with vCOps and active memory.

Now you’re probably thinking ‘thanks for the info, I will just go ahead and disable Large Pages and ASLR to save memory!’.
In fact, far from it: my general recommendation is ‘do not disable Large Pages and/or ASLR unless you are happy with the trade-offs’.
There are obviously performance and security trade-offs when disabling these features, which is why I would generally not recommend it in a production environment. However, there are certain use cases where this may be preferred. These include:

  • VDI environments where TCO is important
  • Development environments
  • Home lab environments
  • Environments where VMs are grossly oversized and you plan to right-size shortly.

In my next post I will discuss how to tune right-sizing with vCOps, as well as active vs consumed memory.

Till then Chris Slater out.


Large Pages – Yellow Bricks

TPS in Hardware MMU Systems KB 1021095

Use of Large Pages can cause Memory to be fully Allocated KB 1021896

ASLR Wikipedia

Understanding Memory Resource Management in ESX


vCAC – Adding VMs to Specific AD OUs

Continuing on from my last post on vCAC – Adding Domain Selection to IaaS Blueprints, I thought I would post on a common request: adding virtual machines to specific Active Directory OUs during the provisioning process.

The Problem?

Using a vCenter customization allows a VM to join the domain, but how can we influence which OU the computer object is placed into?

The Answer

Leverage the Runonce component of a vCenter Windows customization to move the computer object once the VM has joined the domain.
As with my previous post on vCAC, there are actually dozens of ways to achieve this. One of the most common is to leverage a vCO workflow as part of the vCAC provisioning process. This would be just as effective, however it requires vCO to have a domain connection for each domain you wish to add VMs to. For this reason, and because I believe Runonce is a little simpler to set up, I will blog on this method.

The Solution

Now the advantage of leveraging the vCenter Runonce is that the command is automatically cleaned up after the customization, and as the command is run from the VM itself, communicating with the DCs shouldn’t be an issue.
There are many scripts and commands that can be leveraged to perform the OU move. Around the web there are numerous VBScripts, PowerShell scripts and other programs that can easily perform this function. However, you are going to want a script/program that can accept the destination OU as a string, as well as credentials for the operation, because the Windows OOTB customization runs as SYSTEM and will not have privileges in AD to perform the move.
The steps below are credited to a VMware Communities blog post by jonathanvm.

1. As such, I would recommend dsmove.exe, which is part of the Windows 2008 and later Active Directory Domain Services tools. This program is provided by Microsoft and accepts a username and password as inputs. You can get dsmove from any server with the Active Directory Domain Services tools installed (such as a DC). Simply copy the file dsmove.exe from C:\Windows\System32 to the same location on your template VM.
NOTE: You also need dsmove.exe.mui from the en-US subfolder.

2. Simply edit the vCenter Customization we created in the previous post. Under the Administrator Password section ensure that Automatically log on as the Administrator is checked for 1 logon.

3. Add the commands below into the Runonce configuration, substituting the domain names and accounts etc. The AD account performing the OU move will need the Account Operators or an equivalent role in AD. I recommend a service account with just these permissions.

cmd.exe /c dsmove "CN=%computername%,CN=Computers,DC=domain,DC=com" -newparent "ou=servers,dc=domain,dc=com" -d domain.com -u domain\serviceaccount -p password

timeout 10

cmd.exe /c shutdown -r -t 00

The advantage of putting the target OU in the customization is that the template itself is not hard-coded to a specific OU. Simply create more customization specifications if a choice of OU is required.

dsMove vCenter Customisation

That’s all for now folks. Chris Slater out.

Oracle ‘Soft Partitioning’ and vSphere DRS Clusters

Those of you who have considered or are currently running Oracle workloads on VMware vSphere have probably come across the term ‘Soft Partitioning’ at some stage of your investigation. While the concept of Soft Partitioning has been around for a few years now, there still appears to be some confusion about what it actually means and how it applies to vSphere clusters. In this post I will attempt to explain in simple terms what Soft Partitioning is, what it isn’t and how it affects your vSphere Cluster design. Note that this is my current understanding of Oracle licensing on vSphere and I would more than welcome any official documentation from an Oracle representative if anything I state here is incorrect. Unfortunately, Oracle has so far been fairly silent on this topic, leading to a certain amount of confusion among their customers.

What is ‘Soft Partitioning’

In the Oracle Partitioning Document at, you will find ‘Soft Partitioning’ defined as follows:

Soft partitioning segments the operating system using OS resource managers. The operating system limits the number of CPUs where an Oracle database is running by creating areas where CPU resources are allocated to applications within the same operating system. This is a flexible way of managing data processing resources since the CPU capacity can be changed fairly easily, as additional resource is needed.
Examples of such partitioning type include: Solaris 9 Resource Containers, AIX Workload Manager, HP Process Resource Manager, Affinity Management, Oracle VM, and VMware.

Aside from the fact that ‘VMware’ is not a product, the interesting thing to note there is that Oracle VM is also listed as a Soft Partitioning technology. I’ll come back to this later.

So what does it mean? Basically what it is saying is that regardless of the number of vCPUs you configure for the VM, you have to license the entire physical host on which the Oracle workload is running. So, for example, if you create a 4 vCPU VM running Oracle and allow it to run on an ESXi host containing 2 CPU sockets and 6 cores per socket you need to purchase sufficient Oracle Standard licenses to cover 2 sockets or Oracle Enterprise licenses sufficient to cover 12 cores. (Oracle Standard edition is licensed per physical socket while Enterprise Edition is licensed per physical core) The size of the VM doesn’t play any part in the licensing calculations.

How does CPU Affinity affect my Oracle licensing obligations?

In short, it doesn’t. vSphere CPU affinity allows you to ‘pin’ a virtual machine to a specific physical core or set of cores within your ESXi host. Aside from the fact that this isn’t usually a good idea for performance and management reasons, to my knowledge it is not recognised by Oracle as a means of limiting your licensing obligations. Regardless of any CPU affinity rules you have in place, you still need to license the entire host.

Fig 1. vSphere Web Client showing a virtual machine with CPU affinity configured to only allow the VM to run on physical cores 1 and 2

The Double Standard

I mentioned earlier in this article that Oracle VM is also considered to be a Soft Partitioning technology. Where it gets a little blurred is that if you configure CPU affinity on a virtual machine running on Oracle VM it magically transforms itself into a Hard Partitioning technology, so the licensing restrictions no longer apply. (refer to ) Using Oracle VM with CPU Pinning configured you only need to license the physical cores on which the VM is configured to run, but you do lose the ability to do any form of live migration with that VM so it’s really a pointless exercise. If you are considering deploying OVM purely for the preferential Oracle licensing, keep in mind that you give up a lot of availability and performance features for that ‘benefit’.

So why the inconsistency in licensing terms? Why is CPU Affinity on vSphere treated differently to CPU Pinning on Oracle VM when they are basically the same thing? I have no idea.

If someone from Oracle would like to chime in on the comments and provide some explanation as to how this seemingly contradictory position was reached I’m sure it would be appreciated by a lot of people.

How does ‘Soft Partitioning’ apply to DRS Clusters?

So now we know that you have to license the entire ESX host regardless of how many physical cores/CPUs are executing the Oracle workloads. What this means in terms of vSphere is that you must license all hosts that an Oracle workload touches. If you VMotion a virtual machine running Oracle to a physical host, even if only for a second, that host must be fully licensed. On the flip side, you don’t need to license hosts that never have and never will run an Oracle workload. (That would be a bit silly – paying to license hosts that will never run Oracle workloads)

So how do you prevent Oracle workloads from touching unlicensed hosts? You have a number of options here. You could deploy separate vSphere clusters for the Oracle workloads, zone the storage so that the Oracle virtual machine files are only presented to a subset of hosts, or use DRS Affinity rules. There are probably other methods but all that’s important here is that you prevent workloads from migrating to unlicensed ESXi hosts. DRS Affinity Rules allow you to specify that a virtual machine will only ever run on a specified subset of ESXi hosts within the HA/DRS Cluster. By specifying a “Must run on hosts in group” rule you also ensure that HA won’t inadvertently start the VM on an unlicensed host in the event of a host failure.

Fig 2. DRS VM to Host affinity rule creation

I have heard a number of anecdotal claims in the past that Oracle does not officially recognise DRS Affinity as a means to control your licensing obligations and that it is considered a form of ‘Soft Partitioning’ so you need to license the entire cluster. Let’s address those two points.
In terms of Oracle not officially recognising DRS Affinity this is absolutely correct. Oracle has not to my knowledge ever published a document stating that DRS Affinity can be used to control your licensing obligations. They also haven’t published a document stating that storage zoning is a valid method to control your licensing obligations, nor physically separate clusters. Come to think of it, I don’t think I have seen an Oracle document stating that not installing Oracle prevents you from having to license a server. The fact is that Oracle does not publish anything that specifies what you don’t need to license, presumably because the list would be never ending. What they do publish is a Software Investment Guide (SIG) available at that is pretty clear-cut in terms of what does need to be licensed. You will note within this document that you are required to acquire licenses for the following types of environments:

  • Development
  • Test/Staging
  • Production
  • Failover (>10 days)
  • Standby
  • Remote Mirroring

ESXi hosts outside of the DRS Affinity rule do not fall into any of these categories and therefore require no licenses. The Oracle binaries are never installed on these hosts, no Oracle workloads ever run on them and the VMs running Oracle workloads are never even registered on the hosts.

On the topic of Soft Partitioning, you will notice that the Partitioning Guide mentioned earlier in this post contains the following sentence:

The operating system limits the number of CPUs where an Oracle database is running by creating areas where CPU resources are allocated to applications within the same operating system

Note the phrase “within the same operating system”. Assuming that Oracle also consider the VMKernel to be an “operating system”, it’s pretty clear from that phrase that they are referring to partitioning within a single physical host. The definition of Soft Partitioning has no relevance to vSphere clusters as each node within the cluster contains its own independent hypervisor/operating system. So in addition to the SIG, the Partitioning Guide also supports the case for not having to license hosts that will never run Oracle workloads.

In Conclusion…..

I hope that clarifies how the definition of Soft Partitioning affects Oracle licensing on vSphere. While Oracle definitely don’t provide the most virtualisation or cloud-friendly licensing in the world, it’s not as bad as some would have you believe. The basic three rules are:

  1. If an Oracle workload ever runs on an ESXi host you must license the entire host, regardless of the size of the VM
  2. If an Oracle workload never runs on an ESXi host you do not need to license that host
  3. How you prevent Oracle workloads running on unlicensed hosts is entirely up to you

Once again, this is my understanding of Oracle’s current licensing conditions based on the published information. If an Oracle representative has official documentation stating otherwise (e.g That you do in fact need to license hosts that never run Oracle workloads) I will be more than happy to post a correction.