On 6/30/07, Matthew Ahrens <[EMAIL PROTECTED]> wrote:
> Peter Bortas wrote:
> > According to the zdb dump, object 0 seems to be the DMU node on each
> > file system. My understanding of this part of ZFS is very shallow, but
> > why does it allow the filesystems to be mounted rw with damaged DMU
> > nodes? Doesn't that result in a risk of more permanent damage to the
> > structure of those filesystems? Or are there redundant DMU nodes it's
> > now using, and in that case, why doesn't it automatically fix the
> > damaged ones?
>
> Object 0 is basically the object that describes the other objects.  So the
> end result will be that some range of (up to 32) objects in each of those
> filesystems will be inaccessible.  There is no risk of additional damage by
> running in read/write mode, because ZFS is always able to detect what data is
> good and what is bad by using checksums.
>
> That said, blkid 0 of object 0 always happens to contain some critical
> objects (the ZPL "master node" and root directory).  So if you are able to
> mount these filesystems at all, then it probably means that ZFS was able to
> find another redundant copy, or the failure was actually transient.  (E.g.,
> because one disk was temporarily offline, and some pieces of another disk are
> damaged, so raidz1 couldn't reconstruct.)
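
To make the "range of (up to 32) objects" concrete, here is a rough sketch of
how a level-0 block of object 0 could map to object numbers. It assumes 32
object descriptors per block, the figure quoted above; the function names are
made up for illustration and are not ZFS or zdb APIs. Indirect blocks (lvl
greater than 0) cover more objects and are not modelled here.

```python
# Assumption: each level-0 block of object 0 describes 32 objects,
# matching the "up to 32" figure quoted in the thread.
OBJECTS_PER_BLOCK = 32


def objects_in_block(blkid: int, per_block: int = OBJECTS_PER_BLOCK) -> range:
    """Object numbers described by level-0 block `blkid` of object 0."""
    if blkid < 0:
        raise ValueError("blkid must be non-negative")
    first = blkid * per_block
    return range(first, first + per_block)


def block_for_object(obj: int, per_block: int = OBJECTS_PER_BLOCK) -> int:
    """Level-0 blkid of object 0 that would describe object number `obj`."""
    if obj < 0:
        raise ValueError("object number must be non-negative")
    return obj // per_block


if __name__ == "__main__":
    affected = objects_in_block(0)
    # A damaged blkid 0 would hide the lowest-numbered objects.
    print(f"blkid 0 covers objects {affected.start}..{affected[-1]}")
```

Under that assumption, losing blkid 0 would make the lowest-numbered objects
unreachable, which is consistent with the point that the filesystem could
probably not be mounted if that block were truly gone.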

The question is why it didn't clear those errors when resilvering if
it found redundant copies. Before the resilvering there were actually
four of those errors. This one:

           37       0       lvl=2 blkid=0

was removed by resilvering.
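
For readers following along, the checksum-based detection described above can
be pictured with a toy model like the one below. This is only a sketch, not
ZFS code: SHA-256 stands in for whatever checksum the pool uses, the copies
are plain byte strings in a list, and `read_verified` and `repair` are
hypothetical names. It says nothing about why a particular error would or
would not be cleared by a real resilver.

```python
import hashlib
from typing import List, Optional


def checksum(data: bytes) -> bytes:
    # Assumption: SHA-256 is a stand-in for the pool's actual checksum.
    return hashlib.sha256(data).digest()


def read_verified(copies: List[bytes], expected: bytes) -> Optional[bytes]:
    """Return the first copy whose checksum matches, or None if all are bad."""
    for data in copies:
        if checksum(data) == expected:
            return data
    return None


def repair(copies: List[bytes], expected: bytes) -> bool:
    """Overwrite bad copies with a good one. Return False if no copy is good."""
    good = read_verified(copies, expected)
    if good is None:
        return False  # nothing trustworthy to repair from; the error remains
    for i, data in enumerate(copies):
        if checksum(data) != expected:
            copies[i] = good
    return True


if __name__ == "__main__":
    block = b"contents of some block"
    stored = [b"damaged bytes", block]  # one bad copy, one good copy
    print(repair(stored, checksum(block)), stored[0] == block)
```

In this model a bad copy is never mistaken for a good one, which is the
property that makes read/write operation safe, and an error only persists
when no copy at all matches the expected checksum.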

> FYI, in a later build, 'zpool status -v' actually tells you the names of the
> damaged filesystem & files, so you don't have to muck around with zdb.

Yes, that is a feature that has been tempting me to upgrade for a
while. Unfortunately I won't have time to do it this weekend.

--
Peter Bortas
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss