On Apr 8, 2010, at 9:06 PM, Daniel Carosone wrote:

> On Thu, Apr 08, 2010 at 08:36:43PM -0700, Richard Elling wrote:
>> On Apr 8, 2010, at 6:19 PM, Daniel Carosone wrote:
>>> 
>>> As for error rates, this is something zfs should not be afraid
>>> of. Indeed, many of us would be happy to get drives with less internal
>>> ECC overhead and complexity for greater capacity, and tolerate the
>>> resultant higher error rates, specifically for use with zfs (sector
>>> errors, not overall drive failure, of course).  Even if it means I
>>> need raidz4, and wind up with the same overall usable space, I may
>>> prefer the redundancy across drives rather than within.
>> 
>> Disagree. Reliability trumps availability every time. 
> 
> Often, but not sure about every.

I am quite sure.

> The economics shift around too fast
> for such truisms to be reliable, and there's always room for an
> upstart (often in a niche) to make great economic advantages out of
> questioning this established wisdom.  The oft-touted example is
> Google's servers, but there are many others.

A small change in reliability for massively parallel systems has a
significant, multiplicative effect on the overall system.  Companies
like Google weigh many factors, including component reliability,
when designing their systems.
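
To put rough numbers on that (purely illustrative, not anyone's real
fleet figures): if each of N independent components works with
probability r, a chain that needs all of them works with probability
r^N, so even a small drop in r compounds quickly at scale.  A quick
Python sketch, with made-up values:

  # Illustrative only: series reliability of N independent components.
  def system_reliability(r, n):
      """Probability that all n independent components are working."""
      return r ** n

  for r in (0.999, 0.995):
      # e.g. a 10000-component assembly with no redundancy
      print("r=%.3f -> survival %.3e" % (r, system_reliability(r, 10000)))

The absolute numbers here are meaningless; the point is how far apart
the two results land for a 0.4 percentage-point change in
per-component reliability.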

> 
>> And the problem
>> with the availability provided by redundancy techniques is that the
>> amount of work needed to recover is increasing.  This work is limited
>> by latency and HDDs are not winning any latency competitions anymore.
> 
> We're talking about generalities; the niche can be very important to
> enable these kinds of tricks by holding some of the other troubling
> variables constant (e.g. application/programming platform).  It
> doesn't really matter whether you're talking about 1 dual-PSU server
> vs 2 single-PSU servers, or whole datacentres - except that solid
> large-scale diversity tends to lessen your concentration (and perhaps
> spend) on internal redundancy within a datacentre (or disk).
> 
> Put another way: some application niches are much more able to adopt
> redundancy techniques that don't require so much work. 

At the other extreme, if disks were truly reliable, the only RAID that
would matter is RAID-0.

> Again, for the google example: if you're big and diverse enough that
> shifting load between data centres on failure is no work, then
> moving the load for other reasons is viable too - such as moving
> to where it's night time and power and cooling are cheaper.  The work
> has been done once, up front, and the benefits are repeatable.

Most folks never even get to a decent disaster recovery design, let
alone a full datacenter mirror :-(

>> To combat this, some vendors are moving to an overprovision model.
>> Current products deliver multiple "disks" in a single FRU with builtin, 
>> fine-grained redundancy. Because the size and scope of the FRU is 
>> bounded, the recovery can be optimized and the reliability of the FRU 
>> is increased. 
> 
> That's not new.  Past examples in the direct experience of this
> community include the BladeStor and SSA-1000 storage units, which
> aggregated disks into failure domains (e.g. drawers) for a (big)
> density win.

Nope. The FRUs for BladeStor and SSA-1000 were traditional disks.
To see something different you need to rethink the "disk" -- something
like a Xiotech ISE.
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 




