On Tue, Jul 7, 2020 at 8:19 AM Jackson, Rob <rwjack...@firsthorizon.com> wrote:
> Fun little note on RAID: it is fallible. The last Sunday of October 2016
> I got a call bright and early because our VTS (TS7740) had shut down.
> Turns out we had a "cache" HDD failure at around 4 AM, and then a second
> one failed at around 7 AM, before the first one had been rebuilt on a
> spare. RAID-5 could not accommodate it. Because of IBM politics, we had
> no tape until Monday at 16:00. I am ashamed to say that I sort of took
> tape for granted. It was astonishing how much of our processing depended
> on it.
>

We had a similar problem occur, long ago, with an actual SAN DASD array
(for Windows, not MVS). The weekend backup to physical tape aborted on a
Sunday. The Windows admin said "No problem, it's a RAID-5 array, I can fix
it Monday morning." A few hours later, a disk in the array failed. No
problem, right? Unfortunately, while the CE was on his way in to replace
it, a second disk failed. The array was destroyed. Management said to
repair it and reload from the Sunday backup and we'd be good. When the
admin admitted that the backup had failed and he hadn't gone in, he was
immediately terminated.

Now, what are the chances that 2 drives in an array will fail within
hours? I don't know, but one thing many don't think about with a "new
array" is that all the drives are likely the same age and will start to
fail (if they do) at about the same time. IMO, given my paranoia, I firmly
believe that the disks in an array should be replaced on a scheduled
basis. I also believe in dual tape copies of important tapes. And also
that tapes in "long term" retention (we have tapes which have been at Iron
Mountain for over 10 years!) should be brought in and the data copied to a
new (not reused) tape annually.

Of course, the bean counters will have an apoplectic fit and scream about
how much it costs to do this. They only understand cost, not value. I
consider them the bane of existence. Like auditors, they take on too much
authority.
Or as I have heard: Fire is a good servant but a terrible master.

>
> R.S. is spot on: make backups. Because of the trauma from this one
> event, we now have a three-way VTS grid, synchronous-mirrored SANs, and two
> mainframes on the floor.
>
> First Horizon Bank
> Mainframe Technical Support
>

--
People in sleeping bags are the soft tacos of the bear world.
Maranatha! <><
John McKown

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN