Re: [zfs-discuss] Split responsibility for data with ZFS

Gary Mills Thu, 11 Dec 2008 19:27:44 -0800

On Wed, Dec 10, 2008 at 12:58:48PM -0800, Richard Elling wrote:
> Nicolas Williams wrote:
> >On Wed, Dec 10, 2008 at 01:30:30PM -0600, Nicolas Williams wrote:
> >  
> >>On Wed, Dec 10, 2008 at 12:46:40PM -0600, Gary Mills wrote:
> >>    
> >>>On the server, a variety of filesystems can be created on this virtual
> >>>disk.  UFS is most common, but ZFS has a number of advantages over
> >>>UFS.  Two of these are dynamic space management and snapshots.  There
> >>>are also a number of objections to employing ZFS in this manner.
> >>>      
> >>ZFS has very strong error detection built-in, and for mirrored and
> >>RAID-Zed pools can recover from errors automatically as long as there's
> >>a mirror left or enough disks in RAID-Z left to complete the recovery.
> >
> >Oh, but I get it: all the redundancy here would be in the SAN, and the
> >ZFS pools would have no mirrors, no RAID-Z.
> >  
> >>Note that you'll generally be better off using RAID-Z than HW RAID-5.
> >
> >Precisely because ZFS can reconstruct the correct data if it's
> >responsible for redundancy.
> >
> >But note that the setup you describe puts ZFS in no worse a situation
> >than any other filesystem.
> 
> Well, actually, it does.  ZFS is susceptible to a class of failure modes
> I classify as "kill the canary" types.  ZFS will detect errors and complain
> about them, which results in people blaming ZFS (the canary).  If you
> follow this forum, you'll see a "kill the canary" post about every month
> or so. 
> 
> By default, ZFS implements the policy that uncorrectable but important
> failures may cause it to do an armadillo impression (staying with the
> animal theme ;-), whereas some other file systems, like UFS, will
> blissfully ignore them -- putting data at risk.  Occasionally, arguments
> will arise over whether this is the best default policy, though most
> folks seem to agree that data corruption is a bad thing.  Later versions
> of ZFS, particularly those available in Solaris 10 10/08 and all
> OpenSolaris releases, allow system admins to have better control over
> these policies.


Yes, that's what I was getting at.  Without redundancy at the ZFS
level, ZFS can report errors but not correct them.  Of course, with a
reliable SAN and storage device, those errors should almost never
happen.  Certainly, vendors of these products will claim that they
have extremely high standards of data integrity.  Data corruption is
the worst nightmare of storage designers, after all.  It is rare,
although I have seen it on one occasion in a high-quality storage
device.
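
To make the detect-versus-correct distinction concrete, here is a
minimal Python sketch.  It is only an illustration of the principle,
not how ZFS is implemented: the names read_block and checksum are
hypothetical, SHA-256 is an arbitrary choice of checksum, and a "copy"
is just a byte string standing in for one side of a mirror.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    # Hypothetical stand-in for a block checksum (SHA-256 chosen arbitrarily).
    return hashlib.sha256(data).digest()


def read_block(copies, expected):
    """Return (data, status) for the first copy whose checksum matches.

    status is "ok" if the first copy was good, or "repaired" if an
    earlier copy was bad and a later one had to be used instead.
    Raises IOError when no copy matches: the error is detected, but
    nothing is left to correct it from.
    """
    saw_bad_copy = False
    for data in copies:
        if checksum(data) == expected:
            return data, ("repaired" if saw_bad_copy else "ok")
        saw_bad_copy = True
    raise IOError("checksum mismatch on every copy: detected, not correctable")
```

With a single copy (all redundancy hidden inside the SAN), a bad block
can only raise the error.  With two copies visible to the filesystem,
the same bad block is silently replaced by the good one.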

The split responsibility model is quite appealing.  I'd like to see
ZFS address this model.  Is there not a way that ZFS could delegate
responsibility for both error detection and correction to the storage
device, at least when that device is more sophisticated than a single
physical disk?

-- 
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
