2012-01-15 19:38, Edward Ned Harvey wrote:
>> 1) How does raidzN protect against bit-rot without known full
>>    death of a component disk, if it at all does?
> zfs can read disks 1,2,3,4...  Then read disks 1,2,3,5...
> Then read disks 1,2,4,5...  ZFS can figure out which disk
> returned the faulty data, UNLESS the disk actually returns
> correct data upon subsequent retries.

Makes sense, if ZFS does actually do that ;)
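
To make the idea concrete, here is a minimal sketch of that kind of
combinatorial reconstruction for a single-parity stripe. It is not
taken from the ZFS source: the function names, the plain XOR parity
and the SHA-256 block checksum are all assumptions chosen for
illustration. The point is only that a checksum kept outside the
stripe lets you test each "pretend disk i is missing" candidate.

```python
import hashlib
from functools import reduce

# Illustrative only: names and layout are hypothetical, not ZFS internals.

def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def checksum(data_cols):
    """Checksum of the whole logical block, assumed to be stored
    outside the stripe (e.g. in a parent block pointer)."""
    return hashlib.sha256(b"".join(data_cols)).digest()

def locate_bad_column(data_cols, parity, expected):
    """Try each data column as the suspect and rebuild it from the rest.

    Returns ("ok", None) if the data already matches the checksum,
    ("repaired", i) if rebuilding column i from the other columns plus
    parity yields a block matching the checksum, and
    ("uncorrectable", None) if no single-column rebuild matches.
    """
    if checksum(data_cols) == expected:
        return ("ok", None)
    for i in range(len(data_cols)):
        others = [c for j, c in enumerate(data_cols) if j != i]
        candidate = list(data_cols)
        # With XOR parity, a missing column is the XOR of all the others.
        candidate[i] = xor_blocks(others + [parity])
        if checksum(candidate) == expected:
            return ("repaired", i)
    return ("uncorrectable", None)

if __name__ == "__main__":
    cols = [b"AAAA", b"BBBB", b"CCCC"]
    parity = xor_blocks(cols)
    expected = checksum(cols)
    # One silently corrupted column: identified and rebuilt.
    print(locate_bad_column([cols[0], b"BxBB", cols[2]], parity, expected))
    # Two corrupted columns exceed what single parity can rebuild.
    print(locate_bad_column([b"AxAA", b"BxBB", cols[2]], parity, expected))
```

In this toy model a single corrupted column is pinned to a specific
disk, while two corrupted columns in the same stripe leave no
candidate that matches the checksum, so the block can only be
reported as bad without naming a culprit.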

Counter-examples:
1) For several scrubs in a row, my pool consistently reported two
   vdev errors and one pool error with zero per-disk errors
   (which further led to an error in some object <metadata>:<0x0>).
   If the disk-read errors were transient, sometimes returning
   correct data (i.e. bad sector relocation was successful in
   the background), ZFS would receive good blocks on further
   scrubs, wouldn't it?

2) Even with one bad sector consistently in place, if ZFS can
   deduce the correct original block data, why report the error
   as uncorrectable at all (and especially so many times)?

This leaves me thinking of two on-disk errors, and/or lack of
checksums for leaf blocks, as the possible reasons for such
detected raidz errors with undetected faulty individual disks.

Any other options I overlooked?

> You know the open-source question in regards to ZFS is pretty much
> concluded, right?  What oracle called zpool version 28 was the last open
> source version, currently in use on nexenta, freebsd, and some others.  The
> illumos project has continued development, minimally.  If you think the
> development effort is resource limited in oracle working on zfs, just try
> the open source illumos community...

I do try it. I also see that some companies, like Nexenta or
Joyent, have discussed the NetApp problem and moved on, betting
on their work with open-sourced ZFS.

Also, Oracle's closed ZFS is actually of little relevance to me
or other SOHO users (laptops, home NASes, etc.).
Oracle doesn't deal with small customers, and people still
have problems buying or getting support for small-volume stuff,
or find Oracle's offerings prohibitively expensive, so it is hard
to get Oracle to notice a bug/RFE report not backed by money.
There is nothing inherently wrong with the business model; Sun
also had it (while being more open to suggestions). It's just
that in this model SOHO users have no influence on ZFS, and it
becomes a closed proprietary gadget like any other FS, with no
engineering interest in enhancing it. This is coupled with a
limited understanding of whether you have the right to use it
at all without getting sued by Oracle (e.g. for trying to put
Solaris 11 into production without paying the tax).

Over the past year I have proposed or discussed a number of
features for ZFS, and while there is little chance that illumos
developers would implement any of them soon, there is near-zero
chance that Oracle ever will. And there is a greater chance that
I or some other developer would dig into such RFEs and
publish a solution, especially if such a developer gets help
with the theory.

//Jim
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
