People, you need to stop sizing your IaaS cloud workloads wrong!
In my 7 years at Zettagrid I’ve seen hundreds of customers and partners move workloads over to an IaaS provider. All too often it could be done in a much more efficient manner.
In today’s post I would like to go through some things I’ve experienced and learnt along the way. Hopefully this aids you when you start your journey to the cloud. It should make the transition faster and a much more seamless experience for your end users.
What’s the problem and how does it happen?
Typically speaking, once the choice is made to go to the cloud the customer needs to size it all. Depending on the cloud provider, that means either picking from fixed instance sizes or having the freedom to choose their own VM specifications.
The user will then choose the VM size closest to the VM they have deployed on site, including:
- RAM the same
- CPUs the same
- Provisioned storage amount the same
- Storage type chosen according to how important the VM is to the business
Very little thought goes into the specifications other than mirroring what the VM currently has on-premises.
So what, tell me why I should care?
Well, for 3 reasons: cost, performance and potential downtime.
When you initially chose how much RAM, CPU and storage to give your on-premises VMs, you sized them for their potential requirements, probably over a 3 or 5 year period, as the infrastructure you bought is all you will have for that time.
What this often translates to is Active Directory VMs having 8GB of RAM or SQL boxes having 32GB of RAM. You know SQL is important and don’t want users complaining of performance problems at all. You have a few Terabytes of RAM in your on-premises environment so why not…
The same is true for storage. You purchased a SAN that has 500TB of space available. Whilst your local file server only needs 2TB at the moment, the requirements will likely grow over time. Consequently you provision it with 10TB just to be safe.
What this means from a cost point of view is that you pay for resources you don’t need.
This will make the initial daily or monthly costs up to 5 times higher (in this example) than they need to be. That is simply wasting funds that could be used elsewhere in the business.
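To put a rough number on the file server example, here is a minimal sketch of the arithmetic. The price per TB is a made-up figure purely for illustration and is not any provider's actual rate; only the 10TB and 2TB figures come from the example above.

```python
# Hypothetical flat storage price; substitute your provider's real rate.
PRICE_PER_TB_PER_MONTH = 50.0


def monthly_storage_cost(tb, price_per_tb=PRICE_PER_TB_PER_MONTH):
    """Monthly cost of a given amount of provisioned storage."""
    return tb * price_per_tb


provisioned_tb = 10  # what was allocated on-premises "just to be safe"
used_tb = 2          # what the file server actually needs today

mirrored = monthly_storage_cost(provisioned_tb)   # cost if you copy the on-premises size
right_sized = monthly_storage_cost(used_tb)       # cost if you size to current need
overspend_ratio = mirrored / right_sized

print(f"Mirrored: ${mirrored:.2f}/month")
print(f"Right-sized: ${right_sized:.2f}/month")
print(f"Ratio: {overspend_ratio:.1f}x")
```

Whatever the real price is, the ratio stays the same as long as storage is billed per TB provisioned.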
Go delving into your on-premises infrastructure management console (vSphere, SCVMM, oVirt, Ansible etc.). Actually start looking at the current utilisation of RAM, storage and CPU for each VM.
Provision your IaaS VMs based on the resources they currently use, not on what has been provisioned on-premises.
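As a rough sketch of what that could look like in practice, the snippet below takes measured usage for a VM and suggests a cloud size. The `VmUsage` fields, the VM name, the RAM figures and the 20% headroom are all my own assumptions for illustration; pass `headroom=0` if you want to size to exactly what is used.

```python
import math
from dataclasses import dataclass


@dataclass
class VmUsage:
    """Measured figures for one VM (hypothetical structure)."""
    name: str
    provisioned_ram_gb: float
    peak_ram_used_gb: float
    provisioned_disk_gb: float
    disk_used_gb: float


def right_size(vm, headroom=0.2):
    """Suggest sizes from measured usage plus headroom (assumed 20% default)."""
    ram = math.ceil(vm.peak_ram_used_gb * (1 + headroom))
    disk = math.ceil(vm.disk_used_gb * (1 + headroom))
    return {
        "name": vm.name,
        # Never suggest more than was provisioned on-premises.
        "ram_gb": min(ram, math.ceil(vm.provisioned_ram_gb)),
        "disk_gb": min(disk, math.ceil(vm.provisioned_disk_gb)),
    }


# Illustrative numbers: a file server given 10TB of disk but using 2TB.
file_server = VmUsage(
    name="fileserver01",
    provisioned_ram_gb=16,
    peak_ram_used_gb=5.5,
    provisioned_disk_gb=10240,
    disk_used_gb=2048,
)

suggestion = right_size(file_server)
print(suggestion)
```

The point is not the exact headroom figure but that the starting point is measured usage rather than the on-premises allocation.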
As with cost, performance is often assigned with very little evidence or reasoning.
What I see is that customers will equate an application’s importance to the business one to one with how many resources it needs. If the application is crucial to the business then it must have lots of compute and should be on SSD.
The converse of this is applications that may not be visible to the business. Often these are actually crucial components and are in desperate need of resources.
Chances are the cloud provider will be using different types and speeds of RAM, CPU and storage. The result is that 1 vCPU on-premises may be vastly different to 1 vCPU in the cloud.
Again, enter your infrastructure management console and start to create and monitor performance counters for Memory Ballooning, CPU Ready and Wait times. Don’t forget to also get data from the SAN for IOPS used, throughput and current levels of latency.
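If you are on vSphere, CPU Ready is reported as a summation in milliseconds per sampling interval, which is hard to read at a glance. The sketch below converts it to a percentage using the commonly documented formula (ready time divided by the interval length). The 20 second default matches vSphere real-time charts; dividing by the vCPU count to get a per-vCPU average is a common convention that I am assuming here, and other platforms will expose this counter differently.

```python
def cpu_ready_percent(ready_ms, interval_s=20.0, vcpus=1):
    """Convert a CPU Ready summation (ms) into an average percentage per vCPU.

    ready_ms   -- CPU Ready summation for the VM over one sampling interval
    interval_s -- length of that interval in seconds (20 for real-time charts)
    vcpus      -- number of vCPUs on the VM (assumed convention: average them)
    """
    if interval_s <= 0 or vcpus < 1:
        raise ValueError("interval_s must be positive and vcpus at least 1")
    return ready_ms / (interval_s * 1000.0) / vcpus * 100.0


# Illustrative sample: 2000 ms of ready time in a 20 s interval on a 2 vCPU VM.
sample = cpu_ready_percent(ready_ms=2000, interval_s=20, vcpus=2)
print(f"CPU Ready: {sample:.1f}% per vCPU")
```

Track the percentage over time rather than judging a single sample, and compare it with the latency and IOPS figures from the SAN for the same period.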
When looking at these metrics, ensure that you capture an appropriate time period. Think about whether anything significant runs regularly within the business (month-end processing, for example) and make sure the time frame you are looking at captures these events.
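Here is a small sketch of why the window matters. It checks whether a monitoring window covers a list of known business events, and shows how one busy day changes the picture compared with the typical day. The sample values, the dates and the month-end event are invented for illustration.

```python
import math
from datetime import date


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def window_covers(start, end, events):
    """True if every known business event falls inside the monitoring window."""
    return all(start <= event <= end for event in events)


# Hypothetical daily peak CPU % for one VM; one day includes a heavy batch run.
daily_peak_cpu = [20, 22, 21, 19, 23, 20, 85, 24, 21, 20]

typical = percentile(daily_peak_cpu, 50)  # what a "normal" day looks like
busy = percentile(daily_peak_cpu, 95)     # what the VM must cope with

# Hypothetical event: month-end processing on 31 January.
month_end = [date(2024, 1, 31)]
short_window_ok = window_covers(date(2024, 1, 1), date(2024, 1, 10), month_end)
full_window_ok = window_covers(date(2024, 1, 1), date(2024, 1, 31), month_end)

print(typical, busy, short_window_ok, full_window_ok)
```

A ten day window that misses month-end would have you sizing for the typical figure, while the VM actually has to survive the busy one.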
Let’s say you didn’t listen to the above advice and simply deployed the VMs as is from on-premises. Depending on what cloud platform you have moved to, at some point you will need to reconfigure the VMs and scale resources up or down.
Each provider will allow a different level of configurability after the VMs have been provisioned. Some will allow changes on the fly with little or no impact to users. Others will force VMs to be rebuilt or migrated elsewhere, with outages or performance restrictions that users will notice.
Before deploying your production applications, see what happens with your new cloud provider when you need to change each aspect of a VM: NICs, network, RAM, CPU and storage. Make sure you know the implications beforehand if you need to scale things up or down and what impact (if any) this will have on users.
So as we have seen, simply taking your VMs, doing a lift-n-shift and assuming they should run as is with your new cloud provider can be fraught with danger.
Not taking the time to re-evaluate what you actually need for each VM can have a serious impact and flow-on effect on cost, performance and visible outages for users.
How have you guys found the shifting of VMs from on-premises to an IaaS provider? Was altering the mindset of the business to allow the changing of the VM resources an easy or hard task?
For more info check out whether you can trust the cloud www.whatwouldlukedo.com/cloud-burnt-past-can-trust-now-part-1/. Also have a look at whether data locality or data sovereignty matter www.whatwouldlukedo.com/data-sovereignty-data-locality/.