On 11/16/12 17:15, Peter Jeremy wrote:
I have been tracking down a problem with "zfs diff" that reveals
itself variously as a hang (unkillable process), panic or error,
depending on the ZFS kernel version but seems to be caused by
corruption within the pool.  I am using FreeBSD but the issue looks to
be generic ZFS, rather than FreeBSD-specific.

The hang and panic are related to the rw_enter() in
opensolaris/uts/common/fs/zfs/zap.c:zap_get_leaf_byblk()


There is probably nothing wrong with the snapshots. This is a bug in zfs diff. The ZPL parent pointer is only guaranteed to be correct for directory objects. What you probably have is a file that was hard-linked multiple times, and the object its parent pointer refers to (originally a directory) has since been recycled and is now a file.
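
To make the failure mode concrete, here is a minimal sketch in Python. It is a toy model, not the zfs diff code: the object numbers and types are copied from the zdb output quoted further down, while the OBJECTS table and the check_parent() helper are names made up for illustration. It only shows that a parent pointer can be classified as stale when the object it names is not a directory.

```python
# Toy model of ZPL objects: object number -> (type, parent object number).
# The numbers come from the zdb output in this thread; the table layout is
# an illustrative assumption, not how ZFS stores znodes.
OBJECTS = {
    2128453: ("file", 2242171),  # the object zfs diff complains about
    2242171: ("file", 7001490),  # stereo-pair-2.png, a plain file
    7001490: ("dir", 6370559),   # /jashank/Pictures/sch/pdm-a4-11
}


def check_parent(obj):
    """Classify the parent pointer of obj as 'ok', 'stale' or 'unknown'."""
    _, parent = OBJECTS[obj]
    if parent not in OBJECTS:
        return "unknown"  # parent was not dumped, so nothing can be said
    parent_type, _ = OBJECTS[parent]
    # A parent pointer is only meaningful if it names a directory.
    return "ok" if parent_type == "dir" else "stale"


if __name__ == "__main__":
    for obj in sorted(OBJECTS):
        print(obj, check_parent(obj))
```

In this model only object 2128453 is reported as stale, because its parent 2242171 is a plain file.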


The error is:
Unable to determine path or stats for object 2128453 in tank/beckett/home@20120518: Invalid argument

A scrub reports no issues:
root@FB10-64:~ # zpool status
   pool: tank
  state: ONLINE
status: The pool is formatted using a legacy on-disk format.  The pool can
         still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
         pool will no longer be accessible on software that does not support feature
         flags.
   scan: scrub repaired 0 in 3h24m with 0 errors on Wed Nov 14 01:58:36 2012
config:

         NAME        STATE     READ WRITE CKSUM
         tank        ONLINE       0     0     0
           ada2      ONLINE       0     0     0

errors: No known data errors

But zdb says that object is the child of a plain file - which isn't sane:

root@FB10-64:~ # zdb -vvv tank/beckett/home@20120518 2128453
Dataset tank/beckett/home@20120518 [ZPL], ID 605, cr_txg 8379, 143G, 2026419 objects, 
rootbp DVA[0]=<0:266a0efa00:200>  DVA[1]=<0:31b07fbc00:200>  [L0 DMU objset] 
fletcher4 lzjb LE contiguous unique double size=800L/200P birth=8375L/8375P fill=2026419 
cksum=1acdb1fbd9:93bf9c61e94:1b35c72eb8adb:389743898e4f79

     Object  lvl   iblk   dblk  dsize  lsize   %full  type
    2128453    1    16K  1.50K  1.50K  1.50K  100.00  ZFS plain file
                                         264   bonus  ZFS znode
         dnode flags: USED_BYTES USERUSED_ACCOUNTED
         dnode maxblkid: 0
         path    ???<object#2128453>
         uid     1000
         gid     1000
         atime   Fri Mar 23 16:34:52 2012
         mtime   Sat Oct 22 16:13:42 2011
         ctime   Sun Oct 23 21:09:02 2011
         crtime  Sat Oct 22 16:13:42 2011
         gen     2237174
         mode    100444
         size    1089
         parent  2242171
         links   1
         pflags  40800000004
         xattr   0
         rdev    0x0000000000000000

root@FB10-64:~ # zdb -vvv tank/beckett/home@20120518 2242171
Dataset tank/beckett/home@20120518 [ZPL], ID 605, cr_txg 8379, 143G, 2026419 objects, 
rootbp DVA[0]=<0:266a0efa00:200>  DVA[1]=<0:31b07fbc00:200>  [L0 DMU objset] 
fletcher4 lzjb LE contiguous unique double size=800L/200P birth=8375L/8375P fill=2026419 
cksum=1acdb1fbd9:93bf9c61e94:1b35c72eb8adb:389743898e4f79

     Object  lvl   iblk   dblk  dsize  lsize   %full  type
    2242171    3    16K   128K  25.4M  25.5M  100.00  ZFS plain file
                                         264   bonus  ZFS znode
         dnode flags: USED_BYTES USERUSED_ACCOUNTED
         dnode maxblkid: 203
         path    /jashank/Pictures/sch/pdm-a4-11/stereo-pair-2.png
         uid     1000
         gid     1000
         atime   Fri Mar 23 16:41:53 2012
         mtime   Mon Oct 24 21:15:56 2011
         ctime   Mon Oct 24 21:15:56 2011
         crtime  Mon Oct 24 21:15:37 2011
         gen     2286679
         mode    100644
         size    26625731
         parent  7001490
         links   1
         pflags  40800000004
         xattr   0
         rdev    0x0000000000000000

root@FB10-64:~ # zdb -vvv tank/beckett/home@20120518 7001490
Dataset tank/beckett/home@20120518 [ZPL], ID 605, cr_txg 8379, 143G, 2026419 objects, 
rootbp DVA[0]=<0:266a0efa00:200>  DVA[1]=<0:31b07fbc00:200>  [L0 DMU objset] 
fletcher4 lzjb LE contiguous unique double size=800L/200P birth=8375L/8375P fill=2026419 
cksum=1acdb1fbd9:93bf9c61e94:1b35c72eb8adb:389743898e4f79

     Object  lvl   iblk   dblk  dsize  lsize   %full  type
    7001490    1    16K    512     1K    512  100.00  ZFS directory
                                         264   bonus  ZFS znode
         dnode flags: USED_BYTES USERUSED_ACCOUNTED
         dnode maxblkid: 0
         path    /jashank/Pictures/sch/pdm-a4-11
         uid     1000
         gid     1000
         atime   Thu May 17 03:38:32 2012
         mtime   Mon Oct 24 21:15:37 2011
         ctime   Mon Oct 24 21:15:37 2011
         crtime  Fri Oct 14 22:17:44 2011
         gen     2088407
         mode    40755
         size    6
         parent  6370559
         links   2
         pflags  40800000144
         xattr   0
         rdev    0x0000000000000000
         microzap: 512 bytes, 4 entries

                 stereo-pair-2.png = 2242171 (type: Regular File)
                 stereo-pair-2.xcf = 7002074 (type: Regular File)
                 stereo-pair-1.xcf = 7001512 (type: Regular File)
                 stereo-pair-1.png = 2241802 (type: Regular File)

root@FB10-64:~ #
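
The manual walk above (dump an object, read its "parent" field, dump that object) could be scripted. The following is a rough Python sketch that only parses text in the format shown above; the parse_zdb_object() name and the SAMPLE excerpt are illustrative assumptions, and it does not invoke zdb itself.

```python
import re

# Abbreviated excerpt of the zdb output for object 2128453 shown above.
SAMPLE = """\
     Object  lvl   iblk   dblk  dsize  lsize   %full  type
    2128453    1    16K  1.50K  1.50K  1.50K  100.00  ZFS plain file
                                         264   bonus  ZFS znode
         path    ???<object#2128453>
         parent  2242171
         links   1
"""


def parse_zdb_object(text):
    """Extract the object number, type and parent from one zdb object dump."""
    lines = text.splitlines()
    info = {}
    for i, line in enumerate(lines):
        # The row after the "Object lvl ..." header carries number and type.
        if line.split()[:2] == ["Object", "lvl"] and i + 1 < len(lines):
            row = lines[i + 1].split(None, 7)
            if len(row) == 8:
                info["object"] = int(row[0])
                info["type"] = row[7].strip()
        match = re.match(r"\s*parent\s+(\d+)\s*$", line)
        if match:
            info["parent"] = int(match.group(1))
    return info


if __name__ == "__main__":
    print(parse_zdb_object(SAMPLE))
```

Feeding it the dump of each object in turn and comparing the "type" of the parent against "ZFS directory" would flag object 2128453 here, since its parent 2242171 is a "ZFS plain file".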

The above experiments were carried out on a partial copy of the pool.
The main pool started quite a long while ago and has been upgraded and
moved several times using send/recv (which happily and quietly
replicates the corruption).  Note that I have never (intentionally)
used extended attributes within the pool but it has been exported to
Windows XP via Samba and possibly to OS-X via NFSv3.

Does anyone have any suggestions for fixing the corruption?  One
suggestion was "tar c | tar x" but that is a last resort (since there
are 54 filesystems and ~1900 snapshots in the pool).




_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
