When data centres overheat they shut down, often without warning. This may seem an obvious point, but it has been happening more frequently than one might expect.
Over the past few weeks, the extreme hot weather in northern Europe has resulted in several outages in the data centres of Google and Oracle. Elsewhere in the world, data centres running Microsoft Azure services have also experienced unexpected outages.
Businesses that run their technology services in these Clouds do so with the expectation that performance and reliability will be far superior to what they could achieve internally. Indeed, even multinational corporations have given up the huge expense of running their own global data centres in favour of outsourcing to cloud service vendors. So to discover that in some circumstances, “the computer says no,” is a shock.
More to the point, Cloud infrastructure is designed to tolerate multiple server failures within a facility, but not the whole facility going offline at the same time. In one instance, the cooling systems failed and the Cloud provider had to shut down access to their cloud in order to preserve existing structures.
Unpredictable weather events are becoming more common (predictably unpredictable?) and climate change is forcing the world to confront what this might mean. As supply chains and logistics operations seek to adjust to the post-Covid ‘new normal’, whatever that might be, any recovery assumes that the supporting information flows are reliable and accurate. The sharing of information between the parties involved in extensive supply chain networks is difficult enough. The visibility that this data provides is what keeps these operations functioning. The possibility that large fragments of it may suddenly disappear is sobering.
No doubt many of the larger customers of Cloud services, and the Cloud service vendors themselves, are reconfiguring their infrastructures to account for the impact of extreme heat, but this will take time.
Extreme heat also impacts the physical infrastructure that enables global logistics. Facilities in the northern hemisphere are usually designed for use within a temperate to cold or freezing temperature range, so persistent higher temperatures will have an impact. Nonetheless, the breakdown of transportation equipment, or stifling heat and humidity in a warehouse, is problematic in isolation but manageable. When the information systems that drive the entire operation have unexpected outages, that’s a different scale of problem.
It is analogous in some ways to the baggage handling systems in a major airport. When they have a problem, they take time to recover and the impact can take days to resolve. Bringing systems back online after they have shut down is not uncommon, but if the shutdown was instant and the internal mechanisms lost transactions that had not been verified at the split second the power failed, it takes time to identify the corrupted data and recreate anything missing. In some high-volume operations, that could mean thousands of affected transactions.
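To make the recovery problem concrete, the reconciliation step described above might look something like the following. This is only a minimal sketch under stated assumptions: it supposes that an upstream log of submitted transactions survived the outage and that each record in the recovered system was stored with a checksum. The names `Transaction`, `checksum` and `reconcile`, and the sample records, are hypothetical and are not drawn from any particular system.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    # Hypothetical record shape: an identifier plus a free-text payload.
    tx_id: str
    payload: str


def checksum(payload: str) -> str:
    """Return a SHA-256 hex digest of the payload."""
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def reconcile(submitted, recovered):
    """Compare the upstream log with what the restarted system holds.

    submitted: list of Transaction taken from the upstream log (assumed intact).
    recovered: dict mapping tx_id to (payload, stored_checksum) as found
               in the system after the restart.
    Returns two lists of ids: (missing, corrupted).
    """
    missing, corrupted = [], []
    for tx in submitted:
        record = recovered.get(tx.tx_id)
        if record is None:
            # Never written before the power failed: must be recreated.
            missing.append(tx.tx_id)
            continue
        payload, stored = record
        if checksum(payload) != stored or payload != tx.payload:
            # Partially written or altered: must be repaired.
            corrupted.append(tx.tx_id)
    return missing, corrupted


# Illustrative data only.
submitted = [
    Transaction("T1", "pallet 17 to dock 3"),
    Transaction("T2", "pallet 18 to dock 4"),
    Transaction("T3", "pallet 19 to dock 1"),
]
recovered = {
    "T1": ("pallet 17 to dock 3", checksum("pallet 17 to dock 3")),
    # T2 was cut off mid-write, so its payload no longer matches its checksum.
    "T2": ("pallet 18 to do", checksum("pallet 18 to dock 4")),
}
missing, corrupted = reconcile(submitted, recovered)
```

Even in this toy form, the check has to touch every submitted transaction, which is one reason recovery in a high-volume operation takes time.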
Some very large organisations are very experienced in operating their information systems capabilities. They can afford to work with several Cloud services simultaneously as a means of adding resilience to their operations. These ‘multi-cloud’ environments bring their own challenges, but the companies that use them have successfully addressed them. However, this is not an option for every company.
Over the next decade, supply chain operations will come ‘alive’ as huge numbers of sensors go online. Every item moving through a supply chain will be tagged, and machine learning and artificial intelligence algorithms will augment and direct decision-making. Failures will occur and the systems service providers will seek to improve resilience across the board. But unless the principal components of any information systems infrastructure are adapted to deal with extreme natural events, the vulnerabilities will persist.
Source: Transport Intelligence, 4th August 2022
Author: Ken Lyon