On Tue, Jul 7, 2020 at 8:19 AM Jackson, Rob <rwjack...@firsthorizon.com>
wrote:

> Fun little note on RAID:  it is fallible.  The last Sunday of October 2016
> I got a call bright and early because our VTS (TS7740) had shut down.
> Turns out we had a "cache" HDD failure at around 4 AM, and then a second
> one failed at around 7 AM, before the first one had been rebuilt on a
> spare.  RAID-5 could not accommodate it.  Because of IBM politics, we had
> no tape until Monday at 16:00.  I am ashamed to say that I sort of took
> tape for granted.  It was astonishing how much of our processing depended
> on it.
>

We had a similar problem occur, long ago, with an actual SAN DASD array
(for Windows, not MVS). Weekend backup to physical tape aborted on a
Sunday. The Windows admin said "No problem, it's a RAID-5 array, I can fix
it Monday morning." A few hours later, a disk in the array failed. No
problem, right? Unfortunately, while the CE was on his way in to replace
it, a second disk failed. The array was destroyed. Management said to
repair it and reload from the Sunday backup and we'd be good. When the
admin admitted that the backup failed and he didn't go in, he was
immediately terminated. Now, what are the chances that 2 drives in an array
will fail within hours? I don't know, but one thing many don't think about
with a "new array" is that all the drives are likely the same age and will
start to fail (if they are going to) at about the same time.
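
To put a rough number on that question, here is a back-of-the-envelope
sketch in Python. Everything in it is an assumption for illustration: the
function name, the 8-drive array, the 3-hour window, and both annual failure
rates are made up, not measured. It also assumes drives fail independently
at a constant rate, which is exactly the assumption that same-age drives
from one batch tend to break, so treat the result as a floor rather than an
estimate.

```python
import math

HOURS_PER_YEAR = 365.25 * 24  # 8766 hours


def p_second_failure(remaining_drives, annual_failure_rate, window_hours):
    """Chance that at least one surviving drive fails inside the window.

    Assumes independent drives with a constant (exponential) failure rate.
    Both are simplifying assumptions, not properties of any real array.
    """
    # Convert the annual failure probability to an hourly hazard rate.
    rate_per_hour = -math.log(1.0 - annual_failure_rate) / HOURS_PER_YEAR
    # Survival of all remaining drives over the window, then complement.
    return 1.0 - math.exp(-remaining_drives * rate_per_hour * window_hours)


# Hypothetical 8-drive RAID-5 array: one drive is already dead, 7 remain,
# and the rebuild onto a spare has not finished for 3 hours.
healthy = p_second_failure(7, 0.02, 3)  # assumed 2% AFR, mid-life drives
worn = p_second_failure(7, 0.30, 3)     # assumed 30% AFR, end-of-life batch

print(f"mid-life drives:   {healthy:.6%}")
print(f"end-of-life batch: {worn:.6%}")
```

Under these assumptions the chance is small either way, but it climbs
steeply as the whole batch ages together, and real failures within a batch
are correlated (same age, same workload, extra stress during a rebuild), so
the true figure is likely higher than an independence model suggests.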

IMO, given my paranoia, I firmly believe that the disks in an array should
be replaced on a scheduled basis. I also believe in dual tape copies of
important tapes. And also, that tapes in "long term" retention (we have
tapes which have been at Iron Mountain for over 10 years!) should be
brought in and the data copied to a new (not reused) tape annually. Of
course, the bean counters will have an apoplectic fit and scream about how
much it costs to do this. They only understand cost, not value. I consider
them the bane of existence. Like auditors, they take on too much
authority. Or as I have heard: Fire is a good servant but a terrible
master.



>
> R.S. is spot on:  make backups.  Because of the trauma from this one
> event, we now have a three-way VTS grid, synchronous-mirrored SANs, and two
> mainframes on the floor.
>
> First Horizon Bank
> Mainframe Technical Support
>
>
-- 
People in sleeping bags are the soft tacos of the bear world.
Maranatha! <><
John McKown

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
