On Oct 8, 2010, at 8:25 AM, Bob Friesenhahn wrote:
> 
> It also does not include the "human factor" which is still the most 
> significant contributor to data loss.  This is the most difficult factor to 
> diminish.  If the humans have difficulty understanding the system or the 
> hardware, then they are more likely to do something wrong which damages the 
> data.

This is often overlooked during system design. It is very easy to lose your 
head in a high-stress moment and pull the wrong drive (I, of course, have 
never done that... <ahem>). Using raidz2 (or raidz3) or triple mirrors, having 
a graphical view of which disk has failed, working failure LEDs, and letting a 
hot spare finish resilvering before replacing a disk are all good 
countermeasures.

> It also does not account for an OS kernel which caches quite a lot of data in 
> memory (relying on ECC for reliability), and which may have bugs.

At some point you have to rely on your backups for the unexpected and 
unforeseen. Make sure they are good!

Michael, nice reliability write-up!

--

Scott Meilicke



_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
