I concur - I've seen more frozen tapes due to drive/robot failures than due to any intrinsic fault in the tape itself.
Also, when we DO destroy a tape because we suspect or know it is bad, it is only AFTER all the images on it have expired. Remember that there might have been images on the tape before the time it got frozen. There are forensic ways of getting data off of old tapes if it becomes critical.

-----Original Message-----
From: veritas-bu-boun...@mailman.eng.auburn.edu [mailto:veritas-bu-boun...@mailman.eng.auburn.edu] On Behalf Of Martin, Jonathan
Sent: Thursday, September 03, 2009 10:02 AM
To: veritas-bu@mailman.eng.auburn.edu
Subject: Re: [Veritas-bu] Frozen Tapes

Not to be a prude, but I don't concur with simply destroying any tape that ever gets an error. 10 minutes of troubleshooting isn't going to kill you, and I would add to your troubleshooting list to check the Problems report for more information. I've had media frozen because of robotic / tape mount errors, SCSI HBA conflicts, lack of cleaning and occasionally a bad write. Back in the DLT tape days I saw write errors all the time, but since our switch to LTO3 media I've only had 5 or so go bad, and that's with me shipping media all over the world and supporting 15+ remote sites.

-Jonathan

-----Original Message-----
From: veritas-bu-boun...@mailman.eng.auburn.edu [mailto:veritas-bu-boun...@mailman.eng.auburn.edu] On Behalf Of bob944
Sent: Thursday, September 03, 2009 4:21 AM
To: veritas-bu@mailman.eng.auburn.edu
Subject: Re: [Veritas-bu] Frozen Tapes

> 1) NetBackup detects Non-NetBackup data format [...]
>
> 2) NetBackup detects that [it is a catalog tape...]
>
> 3) NetBackup tried to read/write to the tape and [got write or
> positioning errors...]

... if the barcode and recorded mediaID don't match... if the tape winds up in the wrong drive, ...

> Assuming I'm correct so far, then is the proper method of
> troubleshooting Frozen media to:
>
> 1) Ensure there isn't some catalog data on the tape.
>
> 2) Ensure that the tapes aren't from some other commercial backup
> product environment's tape pool (for those of you running multiple
> commercial backup applications at a single site).
>
> 3) Make sure your tape drives have been cleaned recently.

No matter what the reason, it should be in the logs; IMO, that should always be your first troubleshooting step: find out why it was frozen and go from there. Special mention to:

> 4) Use bpmedia -m <media id> -unfreeze to unfreeze the tape(s), make a
> note of the tape you're unfreezing, and leave it in the scratch pool
> to see if it gets used for tonight's backups.

No. Either toss it immediately, or, if you _must_ try to re-use it or do root-cause, put it in the None pool until you can thoroughly test it end-to-end error-free. But even if it passes, how much of your time does it take to exceed the cost of a replacement tape? How much time/money will you spend rerunning a backup that fails on that tape again? How much time/money/résumé will you spend if you cannot recover a backup from that tape when you need it? (I see Simon has commented on this and I concur.)

> Now for my question: Assuming I was correct on my selection criteria
> and my troubleshooting steps, am I correct in saying that if I came in
> tomorrow and that media from step 4 was frozen a second time, that it
> indicates that the media is more than likely defective? Is there any
> other troubleshooting steps anyone would care to add?

Kudos for doing the research you show above. But why did you list all those causes but not look in the logs to see which one caused the error and address it directly?

If it's a media-overwrite that you haven't allowed, there's no point in re-running; it'll still be ANSI or whatever. Then, "why do you have tapes in inventory that you must preserve but rely on a method that's only a mouse-click away from causing someone a disaster" becomes a critical question.

If it was media errors, NetBackup already made the educated guess of whether it was drive or media (see the manual), and that'll show up in the logs.

If it was a cold-catalog-backup tape, that's in the logs, but why/how did it get put into a scratch or data pool?

_______________________________________________
Veritas-bu maillist - Veritas-bu@mailman.eng.auburn.edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
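
Editor's sketch: the thread's advice amounts to "read the logs, identify the cause, then act on that cause". The Python below only restates that mapping as a lookup table. The category names and the `next_step` helper are hypothetical labels invented for illustration; they are not NetBackup identifiers, log strings, or status codes, and the code does not talk to NetBackup at all.

```python
from enum import Enum


class FreezeCause(Enum):
    # Hypothetical labels for causes named in the thread (not NetBackup terms).
    NON_NETBACKUP_FORMAT = "non-NetBackup data format detected"
    CATALOG_TAPE = "catalog backup tape"
    MEDIA_ERRORS = "write or positioning errors attributed to the media"
    DRIVE_OR_ROBOT = "drive, robot, HBA or cleaning problem"


# Suggested follow-up per cause, paraphrased from the posts in this thread.
NEXT_STEP = {
    FreezeCause.NON_NETBACKUP_FORMAT: (
        "Do not rerun; the tape will still hold foreign data. "
        "Ask why a tape that must be preserved sits in inventory."
    ),
    FreezeCause.CATALOG_TAPE: (
        "Find out how a catalog tape ended up in a scratch or data pool."
    ),
    FreezeCause.MEDIA_ERRORS: (
        "Retire the tape, or move it to the None pool and test it "
        "end to end before reuse; destroy it only after all images expire."
    ),
    FreezeCause.DRIVE_OR_ROBOT: (
        "Fix the hardware or cleaning issue first; the tape itself "
        "may be fine."
    ),
}


def next_step(cause: FreezeCause) -> str:
    """Return the thread's suggested follow-up for a given freeze cause."""
    return NEXT_STEP[cause]


if __name__ == "__main__":
    for cause in FreezeCause:
        print(f"{cause.value}: {next_step(cause)}")
```

The point of the table is the one bob944 makes: the action depends on the cause, so unfreezing a tape and leaving it in the scratch pool skips the step that tells you which row applies.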