On Fri, May 15, 2020 at 10:59:49 +0200, Stefan G. Weichinger wrote:
> I am gonna restore as well just to check if there are hidden write
> errors (doesn't look like that to me so far ...)

That reminded me that (at least on our Ubuntu Linux system) the
smartmontools package's "smartctl" let us read error statistics
information from the SCSI tape drive.

I put
    smartctl -l error -H $TAPEDEV
in the cron script which ran Amanda, and it would produce output like
this:

======
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

TapeAlert: OK

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0          0.000           0
write:      6304        0      6304      6304       6304          0.000           0
======

With that output at the end of the Amanda-running script I could keep an
eye on the numbers for each particular tape as it came through the
rotation cycle. (A few thousand seemed normal, at least by the time I
implemented this monitoring [since by then the tapes were several years
old]; when tapes were starting to really go bad I saw error counts in
the tens of thousands, etc.)

When the drive detected that the tape heads needed to be cleaned, the
output looked like

======
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

TapeAlert Errors (C=Critical, W=Warning, I=Informational):
[0x14] C: The tape drive needs cleaning:
  1. If the operation has stopped, eject the tape and clean the drive.
  2. If the operation has not stopped, wait for it to finish and then
     clean the drive.
  Check the tape drive users manual for device specific cleaning instructions.

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0          0.000           0
write:      3390        1      3392      3389       3390          0.000           1
======

(You should double-check the behavior if you are going to rely on it,
but as I recall the stats are cleared when you swap a tape and when you
run the above smartctl command.)

                                                        Nathan

----------------------------------------------------------------------------
Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -  http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239
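
P.S. A rough sketch of how the write-row numbers could be checked by the script instead of by eye. The shell functions below read "smartctl -l error" output on stdin and pull the "total errors corrected" and "total uncorrected errors" figures from the write row. The function names, the field positions (taken from the sample output shown earlier in this message), and the 20000 limit in the usage comment are assumptions for illustration; they are not part of the original cron script, and the column layout may differ between smartctl versions.

```shell
#!/bin/sh
# Hypothetical helpers; names and thresholds are illustrative assumptions.

# Print "<total errors corrected> <total uncorrected errors>" for the
# "write:" row of an error counter log read from stdin.
# Assumed field layout (from the sample output):
#   write: fast delayed rewrites corrected invocations gigabytes uncorrected
# Exits non-zero if no "write:" row is found.
tape_write_stats() {
    awk '$1 == "write:" { print $5, $8; found = 1 }
         END { if (!found) exit 1 }'
}

# Succeed only if the corrected-error count is at or below the limit
# given as $1 and there are no uncorrected errors.
# Returns 2 if the write row could not be found at all.
tape_stats_ok() {
    limit=$1
    stats=$(tape_write_stats) || return 2
    corrected=${stats% *}      # text before the space
    uncorrected=${stats#* }    # text after the space
    [ "$corrected" -le "$limit" ] && [ "$uncorrected" -eq 0 ]
}

# Possible use at the end of the backup script (20000 is an assumed limit):
#   smartctl -l error -H "$TAPEDEV" | tape_stats_ok 20000 \
#       || echo "tape error counts look high; check this tape"
```

Because the counters appear to be cleared when they are read, a script like this would want to capture the smartctl output once and both print it and check it, rather than running smartctl twice.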