Joel C. Ewing wrote:

R.S. wrote:
[...]
...
Since we will never completely eliminate human error (entering bad data, submitting bad jobs, software bugs, errors in application specification and design), there will always be a need for point-in-time backups to allow recovery of databases to multiple prior states. One of the problems with trying to do this entirely on DASD is the relative ease with which a file on DASD can be erased by a finger check, versus one on removable media, plus the high cost of bandwidth to a remote site for DR coverage (assuming you have the luxury of a hot site for DR). For most installations, I don't see anything other than removable media (tape) being practical for many point-in-time backups.


Bad assumption. Tape datasets, including those in an automated library and especially those on virtual tapes, are also subject to human error. I don't want to judge which is more likely: damage to backups on disk or on tape. Both are possible. Additionally, tape is not protected against media failures, not even virtual tape (once it is offloaded from cache to a real cartridge). Of course it is possible to make a dual copy or use VTS PtP; however, the first method requires human effort (*) and the second is quite expensive. (*) It can be done unattended, at almost no cost (excluding drives and media), when using HSM or FDR/ABR and their duplexing feature. But this is "INDIRECT" tape usage.


I totally disagree here. Unless you are running without a tape management system, manually managing scratch tapes, or using unlabeled tapes (all exceedingly bad practices on MVS), it is next to impossible to physically overwrite a tape that is not scratch. If the tape is made scratch by a deliberate manual act, then reasonable operational procedures, tape management daily processing, etc. should keep that physical tape out of the physical scratch pool for up to a day, and if you have a reasonably sized scratch pool, it may still be days before it is overwritten.

However, I have observed several problems caused by lost data on tape and only one where a backup dataset on DASD disappeared. The vast majority of the problems were related to human errors that left the TMS and/or other components misconfigured. The lost DASD backup was simply deleted by SMS when it expired (good), but the retention period was too short (BAD). Both media are prone to human error, and IMHO both are protected against a simple finger check (by RACF).

As long as the physical tape hasn't been overwritten, it is trivial to restore the tape dataset to service by re-cataloging the dataset and changing the tape status to non-scratch.

I did it for one customer. However, I also saw a "total disaster" when an untrained person tried to do it himself.


Try doing that with a DASD dataset.

Try to recover the files from a tape cartridge without information about its contents. IMHO it is better to prevent the problem, e.g. by using RACF. Admins should be allowed to READ backups, not delete them. Periodic backups should be automated so that (assuming the rules are OK) there is no room for human error. The media is quite irrelevant here.

Even if you were fortunate enough to have volume backups from which the dataset could be recovered, duplicate volser, SMS, and catalog constraints pretty well force recovery efforts to be done from an independent MVS system.

It wouldn't make much sense: why back up backups, especially as a volume dump? I would prefer to migrate them (dual copy, dual locations) and keep them for the planned time. Or back them up *logically* and keep both copies.
BTW: only HSM is slow when using logical operations. FDR is fast.

[...]
Just yesterday we had a 3590E drive go berserk, eat a cartridge, and damage the tape header beyond repair. It was an HSM-duplexed ML2 tape, potentially containing 1000s of datasets. We rebuilt the tape contents on a new volume from the duplexed copy and merrily went on our way. Similar incidents do happen several times a year.

I heard the following story:
Before some system changes they made an "ad hoc" backup to tape on stand-alone drives, with no robot/VTS/VTAPE etc. They made it *twice* to have two copies. They were safe. Felt safe. Because of a failure they decided to restore from tape. Unfortunately the tape turned out to be destroyed. "It happens, no problem, we have the alternate tape." However, during the restore from the alternate tape the error repeated. Both copies were unusable. Reason: the drive used for the restore was bad and destroyed the media. First and second copy.

--
Radoslaw Skorupka
Lodz, Poland

