On Tue, Feb 23, 2010 at 2:01 PM, Michael Bilow <mik...@colossus.bilow.com> wrote:
> During the md check operation, the array is "clean" (not degraded)
> and you can see that explicitly with the "[UU]" status report ...

Of course, mdstat still calls the array "clean" even after mismatches
are detected, which isn't what I'd usually call "clean"... :-)

> It is not a "scrub" because it does not attempt to repair anything.

The comments in the previously mentioned config file don't make it
sound like that: "A check operation will scan the drives looking for
bad sectors and automatically repairing only bad sectors." The file
doesn't explain how it would repair bad sectors. Perhaps it means the
bad sectors will be "repaired" by failing the entire member and having
the sysadmin insert a new disk. Perhaps the comments are just wrong.
Not arguing with you, just reporting what the file told me. Would the
file lie? ;-)

> Detecting and reporting "soft failure" incidents
> such as reallocations of spare sectors ...

The relocation algorithm in modern disks generally works like this
(or so I'm told):

R1. OS requests a read of a logical block from the HDD. The HDD tries
    to read the block from disk and can't, even with retries and ECC.
    The HDD returns a failure to the OS, and marks that physical block
    as "bad" and as a candidate for relocation.

R2. Repeated attempts by the OS to read the same block cause the HDD
    to retry. It won't throw away your data on its own.

R3. OS requests a write to the same logical block. The HDD relocates
    it to a different physical block and throws away the bad block.
    It can do that now, since by writing new data over it you've told
    it you don't want the data that was there.

It would be nice if hard disks were smart enough to detect a block
that was getting marginal and preemptively relocate it. Last I looked
into this (admittedly, several years ago), they didn't do that. Maybe
they've gotten smarter about that.

If they haven't gotten smarter, and the "check" operation reads all
the blocks on the disk but never writes, that alone won't trigger
relocation of a bad block. The "check" operation would have to read
the good block from the other disk, and attempt to rewrite it to the
bad disk.
*That* might trigger a useful relocation by the HDD with the bad block.

> smartmontools, which can and should be configured to look past the
> md device and monitor the physical drives that are its components.

While I run smartd in monitor mode, I've never had it give me a useful
pre-failure alert. Likewise, I've never had the SMART health check in
PC BIOSes give me a useful pre-failure alert. More than once I've seen
SMART report the overall health check as "PASS" when the whole damn
disk is unreadable. It makes me wonder just what the overall SMART
health is supposed to indicate -- "Yes, the HDD is physically
present"? :)

I did once have the BIOS check start reporting a SMART health warning,
but the OEM diagnostics, smartctl, "badblocks -w", etc., didn't
actually report anything wrong. The reseller replaced the drive at my
insistence. Maybe the SMART health check knew something that none of
the other SMART parameters were reporting.

-- Ben
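
P.S. To make the R1-R3 argument above concrete, here is a toy model in
Python. Everything in it (ToyDisk, check_read_only, check_with_rewrite)
is a name I made up for illustration. It is only a sketch of the
behaviour as described above, and it is not how md or any real drive
firmware is implemented.

```python
class ToyDisk:
    """Toy model of the R1-R3 relocation behaviour (hypothetical)."""

    def __init__(self, blocks, spares):
        self.blocks = blocks
        self.remap = {lba: lba for lba in range(blocks)}  # logical -> physical
        self.data = {}                 # contents, keyed by physical block
        self.unreadable = set()        # physical blocks that fail on read
        self.pending = set()           # logical blocks awaiting relocation
        self.spares = list(range(blocks, blocks + spares))
        self.relocated = 0

    def read(self, lba):
        phys = self.remap[lba]
        if phys in self.unreadable:
            # R1: report failure, remember the block as a candidate.
            # R2: later reads land here again; nothing is discarded.
            self.pending.add(lba)
            raise OSError(f"unrecoverable read error at LBA {lba}")
        return self.data.get(phys, b"")

    def write(self, lba, payload):
        if lba in self.pending and self.spares:
            # R3: the old contents are no longer wanted, so remap to a spare.
            old = self.remap[lba]
            self.remap[lba] = self.spares.pop(0)
            self.data.pop(old, None)
            self.pending.discard(lba)
            self.relocated += 1
        self.data[self.remap[lba]] = payload


def check_read_only(disk):
    """Read every block, never write; return the LBAs that failed."""
    failed = []
    for lba in range(disk.blocks):
        try:
            disk.read(lba)
        except OSError:
            failed.append(lba)
    return failed


def check_with_rewrite(disk, mirror):
    """On a read failure, rewrite the block from the good mirror."""
    rewritten = []
    for lba in range(disk.blocks):
        try:
            disk.read(lba)
        except OSError:
            disk.write(lba, mirror.read(lba))
            rewritten.append(lba)
    return rewritten
```

In this model, check_read_only leaves the bad block pending forever,
while check_with_rewrite causes exactly the write that R3 needs before
the disk will relocate anything.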