Patrick Hoover wrote:
> However, in this case, where SMART doesn't appear to work,
> what are the best options for monitoring disk integrity / degradation?

echo check > /sys/block/mdX/md/sync_action

AFAIK, with the above, MD reads through the whole array, tries to
"repair" any bad blocks it finds, and kicks the disk if the repair fails.
A failed repair most likely means the disk's spare-sector area is full
and the disk badly needs replacing.
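
To script the check above, something like the following might do. It is
only a sketch: the function names and the SYSFS_ROOT override are made up
for illustration, and it assumes your kernel exposes mismatch_cnt next to
sync_action in the array's md directory.

```shell
#!/bin/sh
# Sketch only. SYSFS_ROOT is a hypothetical override so the functions can
# be tried against a fake directory tree; normally it stays at /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Ask MD to start a consistency check on the given array (e.g. md0).
md_start_check() {
    md_dir="$SYSFS_ROOT/block/$1/md"
    if [ ! -d "$md_dir" ]; then
        echo "no such MD device: $1" >&2
        return 1
    fi
    echo check > "$md_dir/sync_action"
}

# Print the current sync action and mismatch count for the given array.
# mismatch_cnt is assumed to exist; older kernels may not provide it.
md_status() {
    md_dir="$SYSFS_ROOT/block/$1/md"
    printf '%s: sync_action=%s mismatch_cnt=%s\n' "$1" \
        "$(cat "$md_dir/sync_action")" "$(cat "$md_dir/mismatch_cnt")"
}
```

Run "md_start_check md0" from cron (as root), then "md_status md0" once
sync_action is back to "idle" to see whether anything was flagged.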

Having a spare disk in the array is probably a good idea too, to
minimize downtime, especially if you're not able to get to the machine
all the time.  I don't know if MD checks spares for read errors,
though that would definitely be useful.

Monitoring for kicked disks can be done by running mdadm in daemon
mode with --monitor, having it send you an email when such an event
occurs.
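
A rough sketch of both approaches: starting the mdadm monitor, and a
fallback that looks for members marked "(F)" in /proc/mdstat. The
function names and the default mail address are placeholders of mine,
and the parser assumes the usual mdstat layout where each array's
members are listed on its "mdN : ..." line.

```shell
#!/bin/sh
# Sketch only; function names and the default address are placeholders.

# Start mdadm's monitor as a daemon, mailing the given address on events.
start_md_monitor() {
    mdadm --monitor --scan --daemonise --mail="${1:-root@localhost}"
}

# Print "array member" for every member flagged (F) in an mdstat file.
# Assumes lines such as: md0 : active raid1 sdb1[1](F) sda1[0]
failed_md_members() {
    awk '/^md[^ ]* :/ {
        for (i = 3; i <= NF; i++)
            if ($i ~ /\(F\)/) {
                dev = $i
                sub(/\[.*$/, "", dev)   # strip the "[1](F)" suffix
                print $1, dev
            }
    }' "${1:-/proc/mdstat}"
}
```

With no argument, failed_md_members reads /proc/mdstat; empty output
means no member is currently marked as failed.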

Something that I've found useful is "dd if=/dev/hdX of=/dev/null
bs=1M count=512".  If there is a problem in the IDE driver, a loose
cable, or a controller problem, the disk might respond to simple
commands but fail as soon as you read a large amount of data.
I've also seen disks read the data successfully, but at a
significantly lower rate than they should.  Monitoring the read rate of
each disk can be helpful (at least it is to me) in diagnosing such
problems.
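
To turn the dd test into a number you can log, a sketch along these
lines could work. The function name is made up, and it assumes GNU dd
(bs=1M) and GNU date (%N for nanoseconds).

```shell
#!/bin/sh
# Sketch only: read N MiB from a device and print the rate in MiB/s.
# Assumes GNU dd and GNU date; read_rate_mib is a made-up name.
read_rate_mib() {
    dev=$1
    mib=${2:-512}
    start=$(date +%s%N)
    # Count the bytes actually delivered, so a short or failed read shows up.
    bytes=$(dd if="$dev" bs=1M count="$mib" 2>/dev/null | wc -c | tr -d ' ')
    end=$(date +%s%N)
    if [ "$bytes" -lt $((mib * 1048576)) ]; then
        echo "$dev: short read ($bytes bytes)" >&2
        return 1
    fi
    awk -v s="$start" -v e="$end" -v m="$mib" 'BEGIN {
        t = (e - s) / 1e9          # elapsed seconds
        if (t <= 0) t = 1e-9       # guard against a zero interval
        printf "%.1f\n", m / t
    }'
}
```

For example, "read_rate_mib /dev/hdX 512" prints a single figure that
can be compared between disks or over time. Data that was read recently
may come from the page cache, so a repeated run can look faster than the
disk really is.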