Not upset as such :-)

 

What I'm worried about is the time period where the pool is resilvering to
the hot spare. For example: one half of a mirror has failed completely,
and the mirror is being rebuilt onto the spare - if I get a read error
from the remaining half of the mirror, then I've lost data. If the RE
drive returns an error for a request that a consumer drive would have
(eventually) returned, then in this specific case I would have been
better off with the consumer drive.
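
To put a rough number on that window, here is a back-of-envelope sketch
(not a measurement from these systems). It assumes read errors are
independent per bit and uses the unrecoverable-read-error rates commonly
quoted on drive datasheets (1 in 1e14 bits and 1 in 1e15 bits); both the
rates and the 1 TB of data read during the resilver are assumptions for
illustration only. Note that it models media errors only, not a drive
giving up early on a sector it might eventually have read.

```python
import math


def p_read_error(bytes_read: float, ure_per_bit: float) -> float:
    """Probability of at least one unrecoverable read error while
    reading `bytes_read` bytes, assuming independent per-bit errors
    at rate `ure_per_bit` (an assumed, datasheet-style figure)."""
    bits = bytes_read * 8
    # 1 - (1 - p)^bits, computed via log1p/expm1 to avoid rounding
    # problems with very small p and very large bit counts.
    return -math.expm1(bits * math.log1p(-ure_per_bit))


if __name__ == "__main__":
    data_read = 1e12  # assumed: 1 TB read from the surviving mirror half
    for label, rate in (("1 in 1e14 bits", 1e-14),
                        ("1 in 1e15 bits", 1e-15)):
        print(f"URE rate {label}: {p_read_error(data_read, rate):.2%}")
```

The point is only that the exposure scales with how much has to be read
from the surviving device, which is why the length of the resilver
matters.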

 

That said, my initial ZFS systems are built with consumer drives, not
Raid Edition drives, largely because we got burned by some early RE
drives in some of our existing raid boxes here, so I had a generally low
opinion of them. However, having done a little more reading about the
error recovery time stuff, I will also be putting in RE drives for the
production systems, and moving the consumer drives to the DR systems.

 

My logic is pretty straightforward:

 

Complete disk failures are comparatively rare, while media or transient
errors are far more common. As a media I/O or transient error on the
drive can affect the performance of the entire pool, I'm best off with
the RE drives to mitigate that. The risk of a double disk failure as
described above is partially mitigated by regular scrubs. The impact of
a double disk failure is mitigated by send/recv'ing to another box, and
catastrophic and human failures are partially mitigated by backing the
whole lot up to tape. :-)

 

Regards

            Tristan.

 

 

 

________________________________

From: Tim Cook [mailto:t...@cook.ms] 
Sent: Wednesday, 26 August 2009 2:08 PM
To: Tristan Ball
Cc: thomas; zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Using consumer drives in a zraid2

 

 

On Tue, Aug 25, 2009 at 10:56 PM, Tristan Ball
<tristan.b...@leica-microsystems.com> wrote:

I guess it depends on whether or not you class the various "Raid
Edition" drives as "consumer"? :-)

My one concern with these RE drives is that, because they will return
errors early rather than retry, they may fault when a "normal"
consumer drive would have returned the data eventually. If the pool is
already degraded due to a bad device, that could mean faulting the
entire pool.

Regards,
       Tristan



Having it return errors when they really exist isn't a bad thing.  What
it boils down to is: you need a hot spare.  If you're running raid-z and
a drive fails, the hot spare takes over.  Once it's done resilvering,
you send the *bad drive* back in for an RMA.  (added bonus is the RE's
have a 5 year instead of 3 year warranty).  

You seem to be upset that the drive is more conservative about failure
modes.  To me, that's a good thing.  I'll admit, I was cheap at first
and my fileserver right now is consumer drives.  You can bet all my
future purchases will be of the enterprise grade.  And guess what...
none of the drives in my array are more than 5 years old, so even if
they did die, and I had bought the enterprise versions, they'd be
covered.

--Tim



