I've had a disk failure or two.  The only problem I had was SCSI
timeouts with an AIC7890.  The system would lock up waiting on the SCSI I/O,
but a reboot would always recover and come up with 2 of the 3 drives.

Here's a script I use for watching status (I forget where I got it).
Just run "checkmd -i" once to record the current configuration, then add a
crontab entry like:
15 * * * *      /usr/local/bin/checkmd -v

If there's any output (i.e., errors), root will get an email.  If you change
the array on purpose later, just re-run "checkmd -i" so the saved
configuration matches again.
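
If you'd rather the alert mail go to someone other than root, Vixie cron
(the stock cron on Red Hat) honors a MAILTO line at the top of the crontab;
the address here is just a placeholder:

MAILTO=admin@somewhere.com
15 * * * *      /usr/local/bin/checkmd -v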
#####################################################
#! /bin/bash
#
# This script checks that the md configuration is the same as the one
# saved at initialization time.  When called with the -i option, it
# reads /proc/mdstat and saves the configuration.  If called without
# args, it returns non-zero status if the configuration differs from
# the saved one, and prints a message if the -v flag is present.
#
# usage: checkmd.sh [-iv]

init=""
verbose=""
CONF=/etc/md.conf

while getopts "iv" opt; do
    case $opt in
        i)
            init=true
            ;;
        v)
            verbose=true
            ;;
        *)
            cat <<-EOF
usage: $0 [-iv]
    -i means init the file $CONF with the current md configuration
    -v means display a message in case of configuration mismatch
EOF
            exit 1
            ;;
    esac
done

if [ ! -r $CONF -o "$init" = true ]; then
    cat /proc/mdstat > $CONF
    chmod 444 $CONF
    echo "Current configuration saved in $CONF:" >&2
    cat $CONF >&2
else
    if ! cmp -s /proc/mdstat $CONF; then
        if [ -n "$verbose" ]; then
            echo >&2
            echo "ALARM! md configuration problem" >&2
            echo >&2
            echo "Current configuration is:" >&2
            cat /proc/mdstat >&2
            echo >&2
            echo "should be:" >&2
            cat $CONF >&2
        fi
        exit 1
    fi
fi

#####################################################
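
For the audible beep you asked about, a rough sketch (untested here) would be
to ring the console bell from the mismatch branch as well, something like:

# ring the bell a few times on the console when a mismatch is detected
for i in 1 2 3; do
    echo -en '\a' > /dev/console
    sleep 1
done

Adjust /dev/console (or use whatever beep utility you have) depending on
where you want the noise to end up.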

________________________________________
Michael D. Black   Principal Engineer
[EMAIL PROTECTED]  407-676-2923,x203
http://www.csi.cc  Computer Science Innovations
http://www.csi.cc/~mike  My home page
FAX 407-676-2355
----- Original Message -----
From: Wayne Buttles <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, October 07, 1999 7:12 PM
Subject: Notify scripts?


I have been playing with raid on a stock Redhat 6.0 install for a couple
days now.  I think I have finally figured everything out.  I have raid5
working with 3 drives automounting via the kernel with type fd partitions.
I powered off a drive and then added it back with raidhotadd on a
successive boot with no problem.  All seems swell.

I was wondering, are there scripts to add an audible beep and/or email the
admin on failure?  After all, if RAID is working properly you won't even
notice a failure unless you are logged in (right?).

Also, has anyone had a drive fail for real?  I'm curious about the real-life
condition of a SCSI driver dealing with a failed drive.

Thanks,
Wayne.
