John Kotches wrote:
> Oh, they should also fix Thumper and Thumper2 to have 2 slots for mirrored OS 
> away from the big honking storage.
>   

In general, I agree.  However, the data does not necessarily support
this as a solution, and there is a point of diminishing returns.

Years ago, disk MTBFs were on the order of 250,000 hours.  Today,
enterprise-class disks, like the ones Sun sells, are on the order of
1.2-1.6 million hours.  High-quality CompactFlash (CF) cards are on
the order of 3-4 million hours.  What difference does that make?  A
relatively easy way to get the intuition is to use the annual failure
rate (AFR) instead of MTBF.  I compute AFR assuming 24x7x365
operation, or 8,760 hours per year.

Single device:
  250,000 hours MTBF -> 3.5%   AFR
1,500,000 hours MTBF -> 0.584% AFR
3,000,000 hours MTBF -> 0.292% AFR
4,000,000 hours MTBF -> 0.219% AFR
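
If you want to check the arithmetic, the quick approximation is
AFR ~= 8760 / MTBF, i.e. hours in a year divided by MTBF (close
enough when failure rates are this small).  A throwaway shell loop:

    # AFR (%) assuming 24x7x365 operation: hours per year / MTBF
    for mtbf in 250000 1500000 3000000 4000000; do
        printf "%9d hours MTBF -> %.3f%% AFR\n" "$mtbf" \
            "$(echo "scale=6; 100 * 8760 / $mtbf" | bc)"
    done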

OK, so we can immediately see that the newer devices are 12x-15x
more reliable than the ones we got burned by a few years ago.
But suppose we mirror them.  For a redundant system we want
to worry about the unscheduled mean time between system
interruptions, because that is when we feel pain.  We call that the
annual system interruption rate (ASIR). This is not the AFR because
we can suffer a disk failure, and fix it, without causing an interruption.
Or, another way to look at it, for single disks, ASIR == AFR.
For these calculations, I'll use a 40-hour MTTR so that we won't get
interrupted during a hot date on Friday night :-)

Mirrored device:
  250,000 hours MTBF -> 0.0014%    ASIR
1,500,000 hours MTBF -> 0.000039%  ASIR
3,000,000 hours MTBF -> 0.0000097% ASIR [*]
4,000,000 hours MTBF -> 0.0000055% ASIR

[*] At this point, you might have more success buying California
SuperLOTTO lottery tickets once per week and winning the big
prize once per year.
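
If you want to play with the mirrored case yourself, the usual
back-of-the-envelope model is MTBF(pair) ~= MTBF^2 / (2 * MTTR), so
ASIR ~= 8760 * 2 * MTTR / MTBF^2.  Don't expect the digits from this
sketch to match my table exactly (it is only an approximation), but
it lands in the same ballpark:

    # classic mirrored-pair approximation: MTBF(pair) ~= MTBF^2 / (2 * MTTR)
    mttr=40    # hours to notice the failure, replace the device, and resilver
    for mtbf in 250000 1500000 3000000 4000000; do
        printf "%9d hours MTBF -> %.7f%% ASIR\n" "$mtbf" \
            "$(echo "scale=12; 100 * 8760 * 2 * $mttr / ($mtbf * $mtbf)" | bc)"
    done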

So the math shows that you can still get a much better ASIR in the
mirrored-disk case.  But at some point you won't worry about it
anymore.  I like to refer to this mental phenomenon with the
axiom "reliability trumps availability."  Basically, once a disk
becomes reliable enough, I won't worry about mirroring it.  If
you look at the AFR numbers you'll note that things have improved
significantly already, and you know that there is some point
between 3.5% AFR and 0.0014% ASIR that probably represents
your comfort level.  If that point is 0.3% AFR or higher, then a
single CF boot device is a reasonable design decision.

But there is more to the story.  The typical MTBF model, like the
one used above, assumes that the disk actually breaks such that it
ceases to function. What we find in practice is that you will
be much more likely to lose data than have the disk completely
fail.  This failure mode exists in CFs too, but is rarely characterized
in a useful manner. A good solution with ZFS is to set the copies
property, zfs set copies=2. For disks or CFs, this will decrease the
annualized data loss percentage by a few orders of magnitude.
It won't quite reach the comfort level of mirroring across devices,
but it will be much better than having only a single copy of the
data.  Though I may not always advocate redundant devices, I
almost always advocate data redundancy :-)
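
For example (the dataset name here is just a placeholder; point it at
whatever holds the data you care about):

    # store two copies of each data block on the one device
    zfs set copies=2 tank/export/home
    # verify
    zfs get copies tank/export/home
    # note: only blocks written after the property is set get the extra copy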

Besides, you can always mirror someplace else... any writable,
bootable device will do.
 -- richard

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
