Joel C. Ewing wrote:
R.S. wrote:
[...]
...
Since we will never completely eliminate human error (entering bad
data, submitting bad jobs, software bugs, errors in application
specification and design), there will always be a need for
point-in-time backups to allow recovery of databases to multiple
prior states. One of the problems with trying to do this entirely on
DASD is the relative ease with which a file on DASD can be erased by
a finger check, versus one on removable media, plus the high cost of
bandwidth to a remote site for DR coverage (assuming you have the
luxury of a hot site for DR). For most installations, I don't see
anything other than removable media (tape) being practical for many
point-in-time backups.
Bad assumption. Tape datasets, especially those in an automated library
or on virtual tapes, are also subject to human error. I don't want to
judge which is more likely - damaged backups on disk or on tape. Both
are possible. Additionally, tape is not protected against media
failures. Not even virtual tape (once offloaded from cache to a real
cartridge). Of course it is possible to make a dual copy or use VTS PtP;
however, the first method requires human effort (*) and the second is
quite expensive.
(*) It can be done unattended, at almost no cost (excluding drives &
media), when using HSM or FDR/ABR and their duplexing features.
But this is "INDIRECT" tape usage.
I totally disagree here. Unless you are running without a tape
management system, manually managing scratch tapes, or using unlabeled
tapes (all exceedingly bad practices on MVS), it is next to impossible to
physically overwrite a tape that is not scratch. If the tape is made
scratch by a deliberate manual act, then reasonable Operational
procedures, tape management daily processing, etc. should keep that
physical tape out of the physical scratch pool for up to a day, and if
you have a reasonably sized scratch pool, it may still be days before it
is overwritten.
However, I have seen several problems caused by lost data on tape and
only one where a backup dataset on DASD disappeared. The vast majority
of the problems were related to human errors that left the TMS and/or
other components misconfigured. The lost DASD backup was simply deleted
by SMS when it came due (good), but the retention period was too short
(BAD).
Both media are prone to human error, and IMHO both are protected
against a simple finger check (RACF).
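To illustrate the retention-period case: a minimal sketch in Python of
the kind of check that would have caught it. This is not anything SMS or
HSM runs; the dataset names, the 35-day recovery window and both function
names are hypothetical, chosen only for the example.

```python
from datetime import date, timedelta

# Hypothetical policy: how many days back we must be able to recover.
REQUIRED_RECOVERY_DAYS = 35

def expiry_date(created: date, retention_days: int) -> date:
    """Date on which automatic space management may delete the backup."""
    return created + timedelta(days=retention_days)

def too_short(backups, required_days=REQUIRED_RECOVERY_DAYS):
    """Return names of backups whose retention ends before the required window."""
    flagged = []
    for name, created, retention_days in backups:
        needed_until = created + timedelta(days=required_days)
        if expiry_date(created, retention_days) < needed_until:
            flagged.append(name)
    return flagged

# Hypothetical inventory: (dataset name, creation date, retention in days).
backups = [
    ("BACKUP.DB1.D070101", date(2007, 1, 1), 14),
    ("BACKUP.DB2.D070101", date(2007, 1, 1), 60),
]
print(too_short(backups))
```

The point is only that the rule (retention vs. required recovery window)
can be checked mechanically, whatever the medium.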
As long as the physical tape hasn't been overwritten,
it is trivial to restore the tape dataset to service by re-cataloging
the dataset and changing the tape status to non-scratch.
I did it for one customer. However, I also saw a "total disaster" when
an untrained person tried to do it himself.
Try doing that
with a DASD dataset.
Try to recover the files from a tape cartridge without the content
information.
IMHO it is better to prevent it, e.g. by using RACF. Admins should be
allowed to READ backups, not delete them. Periodic backups should be
automated so that (assuming the rules are OK) there is no room for human
error. The medium is quite irrelevant here.
Even if you were fortunate enough to have volume
backups from which the dataset could be recovered, duplicate volser,
SMS, and catalog constraints pretty well force recovery efforts to be
done from an independent MVS system.
It wouldn't make much sense - why back up backups, especially as a
volume dump?
I would prefer to migrate the backup (dual copy, dual locations) and
keep it for the planned time. Or back up *logically* and keep both
copies.
BTW: only HSM is slow when using logical operations. FDR is fast.
[...]
Just yesterday we had a 3590E drive go berserk, eat a cartridge, and
damage the tape header beyond repair. It was a HSM-duplexed ML2 tape,
potentially containing 1000's of datasets. We rebuilt the tape contents
on a new volume from the duplexed copy and merrily went on our way.
Similar incidents do happen several times a year.
I heard the following story:
Before some system changes they made an "ad hoc" backup on tape, on
stand-alone drives, no robot/VTS/VTAPE etc. They made it *twice* to have
two copies. They were safe. Felt safe.
Because of a failure they decided to restore from the tape.
Unfortunately the tape turned out to be destroyed. "It happens, no
problem, we have the alternate tape". However, during the restore from
the alternate tape the error repeated. Both copies were unusable.
Reason: the drive used for the restore was bad and destroyed the media.
First and second copy.
--
Radoslaw Skorupka
Lodz, Poland
----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [EMAIL PROTECTED] with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html