> Well, we've been using assorted versions of the 0.90 raid code for over a
> year in a couple of servers.  We've had mostly good success with both the
> raid1 and raid5 code.  I don't have any raid5 disk failure stories (yet
> ;-), but we are using EIDE drives so I expect one before TOO long ;-)
> 
> Raid5 has given us good performance and reliability so far.  Now, we do
> have a raid1 array that did something interesting (and bad).  One of the
> drives was failing intermittently (and fairly silently) and had been
> removed from the array.  Unfortunately, backups were also failing (again,
> fairly silently). On a normal power down / reboot, it appears the wrong
> drive was marked as master, and the array re-synced to the drive that had
> been out of the array for a couple of months. (yeah, yeah, we need a
> sys-admin ;-) Anyway, 2 months of data went down the tubes.  No
> level of raid is a replacement for good backups.  

On the subject of semi-silent failures: has anyone written a script to
monitor the [UUUUU] status strings in /proc/mdstat? It would be fairly
trivial to have it start beeping the system speaker loudly and emailing
repeatedly when a drive drops out of an array. Has this already been done,
or should I work on it?
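Something like the rough sketch below is what I have in mind (Python; the
check interval, recipient address, and the plain "mail" command are just
placeholder assumptions, and the pattern match assumes the usual
/proc/mdstat status line where an underscore marks a missing member):

#!/usr/bin/env python
# Rough sketch of an mdstat watcher.  Parses /proc/mdstat, and if any
# array's status string (e.g. "[UU_]") shows a missing member ("_"),
# it rings the console bell and sends a mail.  The interval, recipient,
# and the plain "mail" command are placeholders -- adjust to taste.

import re
import subprocess
import time

MDSTAT = "/proc/mdstat"
CHECK_INTERVAL = 60     # seconds between checks (arbitrary choice)
ALERT_ADDRESS = "root"  # placeholder recipient

def degraded_arrays():
    """Return the md devices whose status string contains a '_'."""
    bad = []
    current = None
    with open(MDSTAT) as f:
        for line in f:
            m = re.match(r"^(md\d+)\s*:", line)
            if m:
                current = m.group(1)
            # The status line looks like "... [2/1] [U_]"; an underscore
            # in the bracketed U-string means a member is missing/failed.
            if current and re.search(r"\[[U_]*_[U_]*\]", line):
                bad.append(current)
                current = None
    return bad

def alert(devices):
    """Beep the console and mail a warning about the degraded arrays."""
    msg = "RAID degraded: " + ", ".join(devices)
    print("\a" + msg)  # "\a" rings the terminal bell on most consoles
    subprocess.call("echo '%s' | mail -s '%s' %s" % (msg, msg, ALERT_ADDRESS),
                    shell=True)

if __name__ == "__main__":
    while True:
        bad = degraded_arrays()
        if bad:
            alert(bad)
        time.sleep(CHECK_INTERVAL)

Running it from cron instead of the sleep loop would obviously be cleaner,
but it shows the idea.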

Thanks

-sv
