Hello all,

After playing around a bit with the disks (powering down, pulling one
disk out, powering down, putting the disk back in and pulling out
another one, repeat), zpool status reports permanent data corruption:

# uname -a
SunOS bhelliom 5.11 snv_55b i86pc i386 i86pc
# zpool status -v
 pool: famine
state: ONLINE
status: One or more devices has experienced an error resulting in data
       corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
       entire pool from backup.
  see: http://www.sun.com/msg/ZFS-8000-8A
scrub: none requested
config:

       NAME        STATE     READ WRITE CKSUM
       famine      ONLINE       0     0     0
         raidz1    ONLINE       0     0     0
           c2d0    ONLINE       0     0     0
           c2d1    ONLINE       0     0     0
           c3d0    ONLINE       0     0     0
           c4d0    ONLINE       0     0     0
           c4d1    ONLINE       0     0     0
           c5d0    ONLINE       0     0     0

errors: The following persistent errors have been detected:

         DATASET  OBJECT  RANGE
         6d       0       lvl=4 blkid=0
         73       0       lvl=0 blkid=0
         10b1     0       lvl=6 blkid=0


The corruption is somewhat understandable. It's my home fileserver and
I do the most horrible things to it now and then just to find out what
happens. The point of this exercise was to go through the disks, label
them, and locate c2d1 since it had been experiencing lockups that
required a cold reset to get the disk online again, and I was too lazy
to do it without fully starting the OS and thus mounting the raidz
each time. During one of the restarts both the disk I pulled out and
c2d1 went missing while starting the filesystem.

According to the zdb dump, object 0 seems to be the DMU node on each
file system. My understanding of this part of ZFS is very shallow, but
why does ZFS allow the filesystems to be mounted rw with damaged DMU
nodes? Doesn't that risk further permanent damage to the structure of
those filesystems? Or are there redundant DMU nodes it's now using,
and in that case, why doesn't it automatically fix the damaged ones?
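
For reference, this is roughly what I plan to try next to map the
DATASET numbers above back to filesystem names and look at the
affected object with zdb. It's only a sketch and assumes the DATASET
column is an objset ID in hex (6d/73/10b1 would be 109/115/4273 in
decimal) and that zdb -d lists each dataset together with its ID;
famine/<dataset> is a placeholder for whatever name matches:

# zdb -d famine
# zdb -dddd famine/<dataset> 0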

I'm currently doing a complete scrub, but according to the latest
estimate from zpool status it will be 63h before I know how that went...
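
To keep an eye on it I'll just re-run zpool status now and then, and
if the scrub comes back clean the plan is to reset the error counters
afterwards; if I've read the man page right, that's what zpool clear
is for:

# zpool status -v famine
# zpool clear famine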

--
Peter Bortas