Going Hybrid

Most IT operations have worked with public cloud setups of one sort or another. Most often, those workloads have been archiving and backup, but the realization that the cloud model is a powerful way of organizing workloads efficiently and building an agile and scalable IT operation has brought the cloud into the mainstream. Even so, concerns about loss of control and data security have changed the view of cloud usage, and today most operations are looking to a more advanced model where a private, in-house cloud is joined with public clouds to create a hybrid cloud.

To understand the requirements of a hybrid cloud we need to look at the strengths of the public cloud approach. First, there are hardware needs to be met. A cloud is built on a Lego principle. Server “bricks” are essentially interchangeable COTS (commercial off-the-shelf) units, with no proprietary frills. This doesn’t mean they are all identical, since server designs keep evolving and insisting on a single model would be both uneconomic and technically limiting.

The key is that, like Lego, they all interconnect to the LAN and to storage in the same way, using Ethernet. This is crucial to a solid deployment. (There are some special-purpose clouds that use InfiniBand, though this adds a lot of complexity to deployment.)

COTS units are available from all the major vendors, and low-cost units are now appearing from the ODMs that service the large cloud providers, offering a way to buy high-quality gear at a substantial saving. As a result we can expect the cost of purchased hardware to drop considerably over the next two years.

Network switches are candidates for the new software-defined network approach, where most of the “smarts” of the switch software migrate into the virtual server pool, making the physical switches simpler and cheaper.

Storage, in the end, is a game of drives. The cloud providers all buy direct, but you’ll find that drives in distribution are much cheaper than proprietary drives from major server/storage vendors. The limitation is that some of these traditional vendors require the proprietary drives, which have an identification code built in, so be careful if you wish to use the distribution approach.

Hyper-converged systems are a variant of the COTS approach. The evolution of SSDs has shrunk the array from 60 drives down to an appliance with 12 drive bays, which in effect is the same configuration as a server. It was a no-brainer to take the step to unify storage and server boxes and that’s hyper-convergence in a nutshell.

Hardware doesn’t make a cloud, however. Without some very specific pieces of software, we’d just have business as usual. Together, these pieces are commonly called the cloud stack, though the typical stack isn’t monolithic or off-the-shelf yet. Almost everyone looking at a private cloud is using OpenStack, an open-source project that is maturing quickly, though Microsoft’s Azure is piloting a cut-down version of its cloud.

OpenStack provides orchestration software that controls the creation and tear-down of virtual machines, as well as the resilient operation that protects against server failures. It allows the creation of virtual servers and VLANs and supports connections to other tools in the OpenStack family. These include storage and authentication services, messaging and databases.

Setting up OpenStack can be a challenge. Many pilot projects stall at small cluster sizes because of difficulties with software configuration, especially in networking. With this in mind, several companies are delivering software tools that automate cloud-building and then help manage resources over time, so that adding storage or servers is nearly seamless. In a real sense, they provide the OpenStack experts you need to cover the period when in-house expertise is still building, but they also save time and avoid errors in later phases of the cloud.

One of the values of the hybrid approach is cloud-bursting, where a job stream overflows its private resources and public cloud instances are invoked to take up the excess load. Cloud-bursting needs careful data management. You don’t want to be loading terabyte-size files into the public cloud and delaying start-up. Solutions include sharding data sets so that part of the data, and the work on it, stays resident in the public cloud, or limiting the updates that need to be sent. While there aren’t yet good automation tools for this, manual sharding is often good enough.
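As a rough illustration of why shard placement matters for cloud-bursting, the following Python sketch plans which jobs to burst so that as little data as possible has to be uploaded first. Everything in it is a hypothetical assumption made for illustration: the `Job` record, the `plan_burst` function, the idea of counting free "slots" in the private cloud, and the sample sizes. It is not drawn from OpenStack or any particular bursting tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """Hypothetical job record: one job reads one data shard."""
    name: str
    shard: str       # identifier of the data shard the job reads
    shard_gb: float  # size of that shard in gigabytes


def plan_burst(jobs, private_slots, shards_in_public):
    """Split jobs between the private cloud and public burst instances.

    Jobs whose shard already resides in the public cloud are burst first,
    because they need no upload before start-up.
    Returns (private_jobs, burst_jobs, upload_gb).
    """
    overflow = max(0, len(jobs) - private_slots)

    def upload_cost(job):
        # Data that would have to be copied out if this job were burst.
        return 0.0 if job.shard in shards_in_public else job.shard_gb

    ranked = sorted(jobs, key=upload_cost)  # cheapest to burst first
    burst, private = ranked[:overflow], ranked[overflow:]

    # Count each missing shard once, even if several burst jobs share it.
    missing = {j.shard: j.shard_gb for j in burst
               if j.shard not in shards_in_public}
    return private, burst, sum(missing.values())


if __name__ == "__main__":
    jobs = [
        Job("a", "s1", 2000.0),
        Job("b", "s2", 50.0),
        Job("c", "s3", 500.0),
        Job("d", "s2", 50.0),
    ]
    private, burst, upload_gb = plan_burst(jobs, 2, {"s3"})
    print([j.name for j in burst], upload_gb)
```

In this sketch the job on the 2 TB shard stays in-house, while the burst goes to the job whose shard is already in the public cloud and to one that needs only a small upload. A real planner would also have to weigh compute cost, data egress and the updates that must be sent back.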

Among other critical issues to consider in setting up a hybrid, the need for version control discipline is high on the list. It’s easy to leave old images lying around, especially in public clouds. Joining these with newer code in a cloud-burst can cause errors or, even worse, open up an attack surface.
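One simple way to apply this discipline is a periodic audit that compares the images sitting in the public cloud against the current versions held in-house. The Python sketch below assumes a hypothetical inventory format, (name, version) pairs with versions as tuples of integers, and a hypothetical `find_stale_images` helper. How the inventory is actually gathered depends on the provider and is not shown.

```python
def find_stale_images(public_images, current_versions):
    """Return the (name, version) pairs that should not be used in a burst.

    public_images    -- iterable of (name, version) pairs found in the
                        public cloud; version is a tuple such as (1, 4, 2)
    current_versions -- dict mapping image name to the current in-house
                        version tuple

    An image is flagged if it is older than the in-house version, or if
    its name is not known in-house at all.
    """
    stale = []
    for name, version in public_images:
        current = current_versions.get(name)
        if current is None or version < current:
            stale.append((name, version))
    return stale


if __name__ == "__main__":
    public_images = [
        ("web", (1, 2, 0)),
        ("web", (1, 4, 2)),
        ("db", (2, 0, 0)),
        ("tmp", (0, 1, 0)),
    ]
    current_versions = {"web": (1, 4, 2), "db": (2, 1, 0)}
    for name, version in find_stale_images(public_images, current_versions):
        print("stale or unknown:", name, version)
```

Flagging unknown images as well as outdated ones is a deliberate choice in this sketch, since an image nobody can account for is at least as risky as an old one.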

With hybrid clouds having multiple departmental users setting up and running their own virtual machines and app mashups, consider restricting the repositories they use for code pulls and installing software that validates signatures on downloaded items. A certified in-house image library is a good way to do this.
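A minimal sketch of the certified-library idea follows. Note that it swaps in a simpler technique than true signature validation: it checks each downloaded item against a list of known SHA-256 digests rather than verifying a cryptographic signature. The `certified` dictionary, the `is_certified` function and the file names are hypothetical. A production setup would more likely rely on the signing support built into its package manager or image registry.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_certified(name: str, data: bytes, certified: dict) -> bool:
    """Check a downloaded item against the in-house certified library.

    certified maps an item name to the hex SHA-256 digest recorded when
    the item was approved. Unknown names are rejected outright.
    """
    expected = certified.get(name)
    if expected is None:
        return False
    # compare_digest performs a constant-time comparison of the digests.
    return hmac.compare_digest(sha256_hex(data), expected)


if __name__ == "__main__":
    approved = b"trusted image bytes"
    certified = {"base.img": sha256_hex(approved)}

    print(is_certified("base.img", approved, certified))
    print(is_certified("base.img", b"tampered", certified))
    print(is_certified("other.img", approved, certified))
```

Only the first check passes here. The second fails because the content no longer matches the recorded digest, and the third because the item was never certified.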

Finally, there’s a great deal of discussion and many useful blogs around the hybrid cloud. The concept is still new and the community is a good source for up-to-the-minute information on issues and how to avoid or recover from them.