Monday 16 February 2015

Industry voice: When the cloud bursts: how to survive the inevitable cloud disasters

It's a well-known fact that technical things go wrong. So what should businesses think about to ensure reliable and consistent operations when cloud adds a further layer of complexity?


The first step is recognising that things will go wrong. Whether operations are in an in-house data centre, an external commercial colocation data centre, or in a hybrid cloud arrangement, with workload split between in-house and cloud, the principles are the same.


Cloud isn't new


No matter what marketing would have us believe, cloud is not a new concept. It is simply remote hosting of some or all of the workload in a data centre, and is not dissimilar in principle to 1960s timesharing services. The difference between 1965 and 2015 is the speed and data capacity of fibre optic cables, which open up a whole host of new possibilities to business owners. But the principle remains the same, as do the principles of resilient design.


As some or all of the workload can be hosted remotely, the most critical new consideration is the communication between the user and data centres where cloud operations take place.



Securing the right data partner


It is important that businesses choose a high-quality data centre with strong data communications and cloud experience to help minimise risk. Any data centre which says it has never had an outage of any sort is either too new to have a track record or is not training its sales staff to be honest.


Even major players such as Google, Facebook and Amazon, with more money to spend than most businesses can dream of, have experienced very public data centre outages in the last five years.


Operations managers and architects need to carefully ask the right questions to find out the truth and work through the concepts of automatic failovers or manual switching in the event of something going wrong. Ultimately, it comes down to choosing a data centre that you trust.


Moving the right workload


Choosing the right workload to move to the cloud is also important, especially in the early days when in-house IT staff have less experience of cloud operations. In general, workloads made up of infrequent, small transactions that are not latency-critical work well in the cloud. A CRM system is a good example, where the submission of a visit report or the retrieval of a customer phone number is infrequent, small, and not time-critical.


On the other hand, voice telephony, which is a continuous stream of time-critical data, is not a good application to move to cloud, except for specialist suppliers who know how to do this and will be located in carrier-rich, carrier-neutral data centres to get the connectivity and diversity they need.


Automatic switching of IP address allocations is a particular problem which needs careful thought. The difficulty of automatically detecting a failure and instantly transferring all the IP addresses to another set of equipment in another location leads many smaller installations to accept a short outage and transfer the addresses manually.
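The detection half of that problem can be sketched in a few lines. This is only an illustration under stated assumptions, not a description of any particular product: the class name, the site labels and the threshold of three missed checks are all hypothetical. The idea is to declare a failure only after several consecutive missed health checks, so that a single dropped packet does not trigger an address transfer.

```python
from dataclasses import dataclass


@dataclass
class FailoverMonitor:
    """Declares the primary site failed only after several consecutive missed checks."""

    misses_before_failover: int = 3  # assumed threshold, not taken from the article
    consecutive_misses: int = 0
    active_site: str = "primary"

    def record_check(self, healthy: bool) -> str:
        """Record one health-check result; return the site that should hold the addresses."""
        if healthy:
            self.consecutive_misses = 0
        else:
            self.consecutive_misses += 1
            if (
                self.active_site == "primary"
                and self.consecutive_misses >= self.misses_before_failover
            ):
                # This is the point where the addresses would be moved,
                # either automatically or by alerting an operator to do it by hand.
                self.active_site = "standby"
        return self.active_site


monitor = FailoverMonitor()
results = [True, False, True, False, False, False]
history = [monitor.record_check(r) for r in results]
print(history)
```

A single missed check followed by a healthy one resets the counter, so only the run of three failures at the end moves the addresses to the standby site. Choosing the threshold is the trade-off the paragraph above describes: a lower value reacts faster but risks false alarms, a higher one lengthens the outage.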


In resilient or safety-critical design, every element must be considered, and there is one key question which must be asked of each: "what will happen if this element fails?" The design can then be changed so that operations continue without interruption. If that is not possible, then a plan has to be put in place to deal with the effects of a failure that cannot be mitigated.
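One way to ask that question systematically is to model the system as a set of linked elements, remove each one in turn, and check whether the user can still reach the service. The sketch below assumes a made-up topology (two carriers, one router, two servers); the element names and the function names are hypothetical and exist only to illustrate the exercise.

```python
from collections import deque

# Hypothetical topology: each element lists the elements it can pass traffic to.
LINKS = {
    "user": ["carrier_a", "carrier_b"],
    "carrier_a": ["router"],
    "carrier_b": ["router"],
    "router": ["server_1", "server_2"],
    "server_1": ["service"],
    "server_2": ["service"],
    "service": [],
}


def reachable(links, start, goal, failed=None):
    """Breadth-first search from start to goal that skips the failed element."""
    if failed in (start, goal):
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in links.get(node, []):
            if nxt != failed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def single_points_of_failure(links, start="user", goal="service"):
    """Return every intermediate element whose failure alone cuts the user off."""
    candidates = [n for n in links if n not in (start, goal)]
    return [n for n in candidates if not reachable(links, start, goal, failed=n)]


spofs = single_points_of_failure(LINKS)
print(spofs)
```

In this invented layout the carriers and servers are duplicated, so losing any one of them is survivable, but the lone router is not. That is the element where the design would need changing, or where a plan for an unmitigated failure would be required.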


Testing is key


Continuous testing is essential, as is reconsidering the effects of each potential failure anew each time the system design or architecture is changed. So are rehearsal and practice of both automatic failovers and manual procedures to deal with failures.


At least once a year, every likely failure should be forced to happen, so that its effect on the overall system operation can be checked. This is one of the main principles of ensuring reliable, continuous operations, and is the same whether a business is operating an in-house data centre or a remotely hosted cloud operation.
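Keeping to the once-a-year rule is easier if the date of each forced failure is recorded and checked. The following is a minimal sketch, assuming a simple log of scenario names and last test dates; the scenarios, the dates and the function name are invented for illustration.

```python
from datetime import date, timedelta

MAX_INTERVAL = timedelta(days=365)  # "at least once a year"


def overdue_drills(last_tested, today):
    """Return, sorted by name, scenarios never tested or last tested over a year ago."""
    overdue = []
    for scenario, tested_on in last_tested.items():
        if tested_on is None or today - tested_on > MAX_INTERVAL:
            overdue.append(scenario)
    return sorted(overdue)


# Hypothetical drill log: scenario name -> date it was last forced to happen.
drill_log = {
    "mains power loss": date(2014, 11, 3),
    "primary carrier link down": date(2013, 12, 1),
    "manual IP address transfer": None,  # never rehearsed
}

due = overdue_drills(drill_log, today=date(2015, 2, 16))
print(due)
```

Here the power test is recent enough, while the carrier failure and the never-rehearsed manual address transfer are both flagged as due.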



  • Roger Keenan is managing director of London data centre City Lifeline















