
A Chilling Experience

by Philip Jan Rothstein, FBCI

ING Canada Property and Casualty, part of the International Netherlands Group, successfully thawed out from the ice storm which paralyzed dozens of other Canadian and New England organizations in January, 1998.

This article originally appeared in
INFORMATION SECURITY MAGAZINE,
March, 1998; Copyright 1998, Rothstein Associates Inc.

Frozen Assets
Snow storms and ice storms are not uncommon in New England or in much of Canada. The massive ice storm that blasted parts of New England, northern New York State, Quebec and Ontario in mid-January of 1998 might very well be one for the record books. Early estimates of overall business and residential losses are on the order of $1 billion in Canada and $200 million in the U.S., according to Computerworld Magazine. Extended power outages in these areas have been particularly painful for I.T. facilities coping with the frosting.

ING Canada Property and Casualty, a part of International Netherlands Group headquartered in Montreal, was especially hard-hit. Operation of their data center in Saint-Hyacinthe, Quebec has been outsourced to Computer Sciences Corporation (CSC); ING Canada I.T. staff also work out of the same building. A hot site recovery agreement is in place with Comdisco Disaster Recovery Services (CDRS). The Saint-Hyacinthe data center supports all of ING's Canadian operations. The ING Canada Property and Casualty Group is an industry leader. With more than half a million customers, customer service and satisfaction are top priorities.

Uh-Oh...
The data center lost utility power quickly when the ice storm hit, at around 10:00 A.M. on Wednesday, January 7, 1998. Fortunately, an uninterruptible power system (UPS) and backup generator effectively took over the short-term power load for the mainframe serving ING. Not so fortunately, this was not to be a short-term power outage: the ice storm impacted operations for two weeks.

“Three major risk factors were identified,” according to Robert Proulx, First Vice President of Technology and Systems for ING Canada Property & Casualty: “(1) power, (2) the telecommunications network serving all of Canada, and (3) people.” “An important characteristic,” Proulx noted, “was that we were facing a major issue, because of downed poles and wires in the road, getting people to the data center.” His first assessment was that the disruption would be a matter of weeks, not days, with the added realization that they could not run for weeks on one generator – even if they could get enough fuel. In short order, Proulx became aware that Bell Canada's serving central offices and other essential facilities were also running on generators, presenting another weak link. Although data center operations staffing is ordinarily lean – two operators were on duty at the time – “even if we could drive people in, how would we feed and house them? What about their homes, spouses, children, with no heat or power and with water in the basement?” fretted Proulx.

Phase I
Proulx advised CDRS Wednesday night they were formally declaring a disaster; CDRS immediately activated their Toronto/Mississauga, Ontario recovery center. Although the Saint-Hyacinthe data center was still up and running on backup power, the situation was tense and uncertain. Martin Goulbourn, CDRS' Vice President, Business Continuity, Western and Canadian Regions, worried, “their main concern and the reason they declared a disaster, was that they didn't know how long their generator would operate properly, and if there was an outage on the generator, they wanted to be prepared so that there would be no business time lost.” An added worry was that the generator fuel tank was only half full, with a run time of under 24 hours at the time of the power failure. Fuel delivery within 24 hours was by no means assured.

Phase II
In the midst of the internal I.S. disruption, business was by no means as usual. ING's property and casualty claims processing exceeded five times normal volume. ING elected to continue claims operations straight through the weekend, compounding the already high stress level on I.S. Even though the generator and UPS were working well and the fallback capability was in place, ING prudently elected to bring in a second generator to back up the first. The second generator was cut in over the weekend to ensure that it would work if the first generator failed.

Proulx and his staff faced a tough choice: should they continue running live at Saint-Hyacinthe, or bite the bullet and make an orderly move to Toronto? When the added risks associated with cutting over to the hot site and back again were weighed, Proulx determined it would be safer to continue production operations at Saint-Hyacinthe – as long as he had the hot site fallback primed and ready. During the twelve days at the recovery site, they were running in parallel at both Saint-Hyacinthe and Toronto. “On a nightly basis, they took backups and refreshed the system at CDRS so they were never more than about eight hours out of synch with their home systems,” noted CDRS' Goulbourn.

By the following weekend, utility power was returning to Saint-Hyacinthe. Nevertheless, the data center remained on generator power. Proulx worried that utility power would continue to be unstable for some time as continuing repairs added load to the power grid, and as weakened or damaged power supply components failed once power was reapplied. At his Montreal office, Proulx was unnerved by four power drops in a single afternoon.

The People Rest... And Eat... And Bathe
ING's Proulx never stopped worrying about the people who were making the recovery work. Long hours, tremendous workloads and unreasonable stress were only part of the problem. Housing, feeding and caring for hundreds of employees – many displaced from their homes and dealing with personal crises – was essential, and so was the involvement of ING's Human Resources. “We were serving over 800 lunches, 700 dinners and 700 breakfasts each day [at Saint-Hyacinthe]. We even had to install showers. Many of our people were working fourteen or fifteen hours a day at five degrees Celsius.” Employees were assured their wages were continuing. Psychologists were brought in to counsel and sustain employees. ING's proactive, supportive attitude paid off. “Up to this, our employees have been proud and happy. We were succeeding as a result of our people” – especially a handful of key people who really came through.

Plan Ahead
Fortunately for ING, they had recognized the potential impact of a data center disruption and had the foresight to develop a data center contingency plan. With the exception of network switching, a comprehensive exercise had most recently taken place in May, 1997. The exercised recovery plan was to play a crucial role in the continuing operation and recovery. Network switching was the only aspect which had not been exercised. ING Canada's systems are deployed in hundreds of insurance brokers' offices throughout Quebec. Acting quickly, ING deployed dial-up modems to all of these locations shortly after the storm hit. Thanks to a combination of advance planning, extensive testing and fast footwork in the clinch to deal with last-minute revelations, communications were successfully rerouted and, remarkably, ING Canada Property & Casualty never stopped doing business with their customers throughout the Ice Age of 1998.

What would Proulx do differently, having gone through his first trial by ice less than three months after joining ING Canada? “I would get every detail really well planned,” admits Proulx. The overall strategy was sound, but those niggling details certainly made the recovery more difficult. Rigorously maintained recovery data – equipment, network, contacts – can save a lot of grief. “I'm going to work on improving communications even more,” adds Proulx. No matter how timely and effective the communications channels, they could always be better. “Don't assume anything, don't take anything for granted,” urges CDRS' Goulbourn.

Exercise.... Exercise... Exercise
Even a well-exercised contingency plan can have glitches, CDRS' Goulbourn noted. “They had tested with us previously, and from our perspective testing is very important. We understood the requirements, we understood how they were going to execute the recovery and where we could provide assistance. They determined they needed a very new Cisco router for which they did not have a spare, they had not included in the recovery contract, it was not included in the hot site contract.” An up-to-the-minute inventory might have averted this oversight, but Goulbourn notes “...you can't do that every week or every month. You should have specific time points or change management checkpoints.” While Proulx certainly will remember that Cisco router next time, he has little doubt that there will be other details to handle on the fly. Any contingency plan should be flexible enough to accommodate these last-minute glitches or oversights.

Goulbourn admonishes, “Testing is absolutely critical. While everyone pays that lip service, where it becomes very critical is in situations like this where you build the relationship and rapport between the organizations so that when the disaster happens, the supplier can provide useful support. Exercising pays a lot of dividends.” Building strong relationships and rapport through mutual exercises is the best way to ensure suppliers can provide useful support in a pinch. “The senior executive focus also paid off. We were having regular conference calls twice a day with ING and CSC. ING's Proulx was the one driving those conference calls. In the whole recovery process he understood what was going on, he was the sponsoring executive for business continuity in the organization. The fact that it went up that high in the organization very much showed up in the execution of the plan.”
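
For readers who keep their recovery data in electronic form, the kind of change management checkpoint Goulbourn describes could be as simple as comparing what is running in production against what the hot site contract and the spares shelf actually cover. The Python sketch below is only an illustration of that idea: the function name `recovery_gaps`, the equipment names and the counts are all hypothetical, and nothing here reflects ING's or CDRS' actual records or tools.

```python
# Hypothetical inventories: every name and count below is illustrative,
# not taken from ING's or CDRS' actual records.
from collections import Counter


def recovery_gaps(production, contracted, spares=None):
    """Return production equipment not covered by the hot site contract
    or by on-hand spares, mapped to the number of uncovered units."""
    covered = Counter(contracted)
    covered.update(spares or {})  # spares add to contracted coverage
    gaps = {}
    for item, needed in production.items():
        shortfall = needed - covered.get(item, 0)
        if shortfall > 0:
            gaps[item] = shortfall
    return gaps


# Run at each checkpoint, e.g. whenever new equipment goes into production.
production = {"mainframe": 1, "tape drive": 4, "new router": 1}
contracted = {"mainframe": 1, "tape drive": 4}
spares = {"tape drive": 1}

print(recovery_gaps(production, contracted, spares))  # {'new router': 1}
```

Any item the check reports is a candidate for a contract amendment or a spare before the next disaster, rather than during it.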


Footnote: According to The New York Times (8/17/98), “Niagara Mohawk Power Corporation has spent the most by far [recovering from the January ice storm] - more than $125 million to repair lines to 120,000 customers who in some cases were left without power for weeks... The House of Representatives recently passed a $2.9 billion emergency spending bill intended to help upstate New York and New England recover from the storm.” (Disclosure: Niagara Mohawk is a consulting client of Rothstein Associates).

For you maple syrup fans, “... an estimated 380,000 taps were lost in northern New York as a result of the storm.”

Copyright (c)2003, Rothstein Associates Inc. All Rights Reserved.
