2011-11-08 22:30, Jim Klimov wrote:
Hello all,

I have an oi_148a PC with a single root disk, and it has
recently stopped booting: it hangs after the copyright
message whichever GRUB menu option I use.

Thanks to my wife's sister, who is my hands and eyes near
the problematic PC, here's some ZDB output from this rpool:

# zpool import
  pool: rpool
    id: 17995958177810353692
 state: ONLINE
status: The pool was last accessed by another system.
action: The pool can be imported using its name or numeric identifier and
        the '-f' flag.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

        rpool       ONLINE
          c4t1d0s0  ONLINE


So here it is - a single-device "rpool".
There are some on-disk errors, so some of the zdb walks fail:


root@openindiana:~# time zdb -bb -e 17995958177810353692

Traversing all blocks to verify nothing leaked ...
Assertion failed: ss->ss_start <= start (0x79e22600 <= 0x79e1dc00), file ../../../uts/common/fs/zfs/space_map.c, line 173
Abort (core dumped)

real    0m12.184s
user    0m0.367s
sys     0m0.474s

root@openindiana:~# time zdb -bsvc -e 17995958177810353692

Traversing all blocks to verify checksums and verify nothing leaked ...
Assertion failed: ss->ss_start <= start (0x79e22600 <= 0x79e1dc00), file ../../../uts/common/fs/zfs/space_map.c, line 173
Abort (core dumped)

real    0m12.019s
user    0m0.360s
sys     0m0.458s



However, "-bsvL" and "-bsvcL" (the latter with checksum checks) do
finish; results of the latter, more complete test are listed below:



root@openindiana:~# time zdb -bsvcL -e 17995958177810353692

Traversing all blocks to verify checksums ...

zdb_blkptr_cb: Got error 50 reading <182, 19177, 0, 1> DVA[0]=<0:a8c8e600:20000> [L0 ZFS plain file] fletcher4 uncompressed LE contiguous unique single size=20000L/20000P birth=82L/82P fill=1 cksum=3401f5fe522b:109ee10ba48ed38c:e7f49c220f7b8bc:ff405ef051b91e65 -- skipping
zdb_blkptr_cb: Got error 50 reading <182, 19202, 0, 1> DVA[0]=<0:a9030a00:20000> [L0 ZFS plain file] fletcher4 uncompressed LE contiguous unique single size=20000L/20000P birth=82L/82P fill=1 cksum=11c4c738b0ba:7bb81bce3313913:8f85a7abf1b9e34:58e8746d63119393 -- skipping
zdb_blkptr_cb: Got error 50 reading <182, 24924, 0, 0> DVA[0]=<0:b1aaec00:14a00> [L0 ZFS plain file] fletcher4 uncompressed LE contiguous unique single size=14a00L/14a00P birth=85L/85P fill=1 cksum=270679cd905d:6119a969a134566:6f0f7da64c4d2d90:3ab86aa985abef02 -- skipping
zdb_blkptr_cb: Got error 50 reading <182, 24944, 0, 0> DVA[0]=<0:b1cdf000:10800> [L0 ZFS plain file] fletcher4 uncompressed LE contiguous unique single size=10800L/10800P birth=85L/85P fill=1 cksum=1ebb4d1ae9f5:3cf5f42afa9a332:757613fc2d2de7b3:5f197017333a4f89 -- skipping

zdb_blkptr_cb: Got error 50 reading <493, 947, 0, 165> DVA[0]=<0:b3efc200:20000> [L0 ZFS plain file] fletcher4 uncompressed LE contiguous unique single size=20000L/20000P birth=26691L/26691P fill=1 cksum=2cdc2ae22d10:b33d31bcbc0d8da:f1571c9975e151b0:a037073594569635 -- skipping

Error counts:

        errno  count
           50  5
block traversal size 11986202624 != alloc 11986203136 (unreachable 512)

        bp count:          405927
        bp logical:    15030449664      avg:  37027
        bp physical:   12995855872      avg:  32015     compression:   1.16
        bp allocated:  13172434944      avg:  32450     compression:   1.14
        bp deduped:    1186232320    ref>1:  12767   deduplication:   1.09
        SPA allocated: 11986203136     used: 56.17%

Blocks  LSIZE   PSIZE   ASIZE     avg    comp   %Total  Type
     -      -       -       -       -       -        -  unallocated
     2    32K      4K   12.0K   6.00K    8.00     0.00  object directory
     3  1.50K   1.50K   4.50K   1.50K    1.00     0.00  object array
     1    16K   1.50K   4.50K   4.50K   10.67     0.00  packed nvlist
     -      -       -       -       -       -        -  packed nvlist size
   197  24.2M   1.87M   5.61M   29.2K   12.92     0.04  bpobj
     -      -       -       -       -       -        -  bpobj header
     -      -       -       -       -       -        -  SPA space map header
 1.27K  6.79M   3.25M    9.8M   7.70K    2.09     0.08  SPA space map
     8   144K    144K    144K   18.0K    1.00     0.00  ZIL intent log
 26.6K   426M   91.1M    182M   6.86K    4.67     1.45  DMU dnode
    75   150K   39.0K   80.0K   1.07K    3.85     0.00  DMU objset
     -      -       -       -       -       -        -  DSL directory
    23  12.0K   11.5K   34.5K   1.50K    1.04     0.00  DSL directory child map
    21  11.5K   10.5K   31.5K   1.50K    1.10     0.00  DSL dataset snap map
    49   707K   79.5K    239K   4.87K    8.89     0.00  DSL props
     -      -       -       -       -       -        -  DSL dataset
     -      -       -       -       -       -        -  ZFS znode
     -      -       -       -       -       -        -  ZFS V0 ACL
  321K  12.0G   10.5G   10.5G   33.4K    1.14    85.46  ZFS plain file
 26.8K  41.5M   19.1M   38.2M   1.42K    2.17     0.30  ZFS directory
    18  17.5K   9.00K   18.0K      1K    1.94     0.00  ZFS master node
    50  84.5K   25.0K   50.0K      1K    3.38     0.00  ZFS delete queue
 12.1K  1.50G   1.50G   1.50G    127K    1.00    12.22  zvol object
     1     1K     512      1K      1K    2.00     0.00  zvol prop
     -      -       -       -       -       -        -  other uint8[]
     -      -       -       -       -       -        -  other uint64[]
     -      -       -       -       -       -        -  other ZAP
     -      -       -       -       -       -        -  persistent error log
     2   256K   44.0K    132K   66.0K    5.82     0.00  SPA history
     -      -       -       -       -       -        -  SPA history offsets
     1    512     512   1.50K   1.50K    1.00     0.00  Pool properties
     -      -       -       -       -       -        -  DSL permissions
     -      -       -       -       -       -        -  ZFS ACL
     -      -       -       -       -       -        -  ZFS SYSACL
     -      -       -       -       -       -        -  FUID table
     -      -       -       -       -       -        -  FUID table size
     2     2K      1K   3.00K   1.50K    2.00     0.00  DSL dataset next clones
     -      -       -       -       -       -        -  scan work queue
   146   103K   73.0K    146K      1K    1.40     0.00  ZFS user/group used
     -      -       -       -       -       -        -  ZFS user/group quota
     1    512     512   1.50K   1.50K    1.00     0.00  snapshot refcount tags
 7.14K  28.6M   17.5M   52.6M   7.37K    1.63     0.42  DDT ZAP algorithm
     2    32K      4K   12.0K   6.00K    8.00     0.00  DDT statistics
     -      -       -       -       -       -        -  System attributes
    18  9.00K   9.00K   18.0K      1K    1.00     0.00  SA master node
    18  27.0K   9.00K   18.0K      1K    3.00     0.00  SA attr registration
    44   704K   77.0K    154K   3.50K    9.14     0.00  SA attr layouts
     -      -       -       -       -       -        -  scan translations
     -      -       -       -       -       -        -  deduplicated block
   133  71.0K   66.5K    200K   1.50K    1.07     0.00  DSL deadlist map
     -      -       -       -       -       -        -  DSL deadlist map hdr
     3  2.50K   1.50K   4.50K   1.50K    1.67     0.00  DSL dir clones
    27  3.38M    122K    365K   13.5K   28.44     0.00  bpobj subobj
   144  1.42M    228K    683K   4.74K    6.37     0.01  deferred free
     4   130K    130K    130K   32.5K    1.00     0.00  dedup ditto
  396K  14.0G   12.1G   12.3G   31.7K    1.16   100.00  Total

                       capacity   operations   bandwidth   ---- errors ----
description            used avail  read write  read write  read write cksum
rpool                 11.2G 8.71G   308     0 11.2M     0     0     0     5
  /dev/dsk/c4t1d0s0   11.2G 8.71G   308     0 11.2M     0     0     0    10

real    38m56.588s
user    4m15.708s
sys     0m56.255s



I see a non-empty deferred-free list and, apparently,
blocks with checksum errors. If I read this right, four
blocks are from old generations (TXGs 82 and 85?), and
one is more recent (26691). What else does a trained eye
see which I don't?
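
To double-check the reading of the birth TXGs, here is a minimal Python
sketch that pulls the bookmark, DVA and birth TXG out of the five error
lines above. The regular expression and the field names (objset, object,
level, blkid) are assumptions about the layout of the zdb message, not
something the output itself documents; the sample lines are abbreviated
copies of the lines above, keeping only the fields that are parsed.

```python
import re
from collections import Counter

# Abbreviated copies of the five "Got error 50" lines from the zdb run above;
# the checksum and size fields are dropped because they are not parsed here.
ERROR_LINES = [
    "zdb_blkptr_cb: Got error 50 reading <182, 19177, 0, 1> DVA[0]=<0:a8c8e600:20000> [L0 ZFS plain file] birth=82L/82P",
    "zdb_blkptr_cb: Got error 50 reading <182, 19202, 0, 1> DVA[0]=<0:a9030a00:20000> [L0 ZFS plain file] birth=82L/82P",
    "zdb_blkptr_cb: Got error 50 reading <182, 24924, 0, 0> DVA[0]=<0:b1aaec00:14a00> [L0 ZFS plain file] birth=85L/85P",
    "zdb_blkptr_cb: Got error 50 reading <182, 24944, 0, 0> DVA[0]=<0:b1cdf000:10800> [L0 ZFS plain file] birth=85L/85P",
    "zdb_blkptr_cb: Got error 50 reading <493, 947, 0, 165> DVA[0]=<0:b3efc200:20000> [L0 ZFS plain file] birth=26691L/26691P",
]

# Assumed layout: <objset, object, level, blkid>, then
# DVA[0]=<vdev:offset:asize> with offset and asize in hex, then
# birth=<logical>L/<physical>P in decimal.
PATTERN = re.compile(
    r"Got error (?P<errno>\d+) reading "
    r"<(?P<objset>\d+), (?P<object>\d+), (?P<level>-?\d+), (?P<blkid>[0-9a-f]+)>"
    r".*?DVA\[0\]=<(?P<vdev>\d+):(?P<offset>[0-9a-f]+):(?P<asize>[0-9a-f]+)>"
    r".*?birth=(?P<birth>\d+)L/\d+P"
)


def parse_error_line(line):
    """Return the parsed fields of one error line, or None if it does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return {
        "errno": int(m["errno"]),
        "objset": int(m["objset"]),
        "object": int(m["object"]),
        "blkid": m["blkid"],              # kept as text; its number base is not assumed
        "offset": int(m["offset"], 16),   # DVA offset, printed in hex
        "asize": int(m["asize"], 16),     # allocated size, printed in hex
        "birth_txg": int(m["birth"]),
    }


errors = [e for e in map(parse_error_line, ERROR_LINES) if e is not None]

# Count the damaged blocks per birth TXG.
births = Counter(e["birth_txg"] for e in errors)
for txg, count in sorted(births.items()):
    print(f"birth TXG {txg}: {count} block(s)")
```

For the five lines above this counts two blocks born in TXG 82, two in
TXG 85 and one in TXG 26691, spread over objsets 182 and 493.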

According to "zdb -l" below, current TXG numbers are in
the 560 million range...

root@openindiana:~# zdb -l /dev/dsk/c4t1d0s0
--------------------------------------------
LABEL 0
--------------------------------------------
    version: 28
    name: 'rpool'
    state: 0
    txg: 560647931
    pool_guid: 17995958177810353692
    hostid: 13583512
    hostname: ''
    top_guid: 3656218981390172871
    guid: 3656218981390172871
    vdev_children: 1
    vdev_tree:
        type: 'disk'
        id: 0
        guid: 3656218981390172871
        path: '/dev/dsk/c4t1d0s0'
        devid: 'id1,sd@SATA_____ST3808110AS_________________5LR557KB/a'
        phys_path: '/pci@0,0/pci8086,2847@1c,4/pci1043,81e4@0/disk@1,0:a'
        whole_disk: 0
        metaslab_array: 30
        metaslab_shift: 27
        ashift: 9
        asize: 21430272000
        is_log: 0
        DTL: 4098
        create_txg: 4
--------------------------------------------
LABEL 1
--------------------------------------------
    version: 28
    name: 'rpool'
    state: 0
    txg: 560647931
    pool_guid: 17995958177810353692
    hostid: 13583512
    hostname: ''
    top_guid: 3656218981390172871
    guid: 3656218981390172871
    vdev_children: 1
    vdev_tree:
        type: 'disk'
        id: 0
        guid: 3656218981390172871
        path: '/dev/dsk/c4t1d0s0'
        devid: 'id1,sd@SATA_____ST3808110AS_________________5LR557KB/a'
        phys_path: '/pci@0,0/pci8086,2847@1c,4/pci1043,81e4@0/disk@1,0:a'
        whole_disk: 0
        metaslab_array: 30
        metaslab_shift: 27
        ashift: 9
        asize: 21430272000
        is_log: 0
        DTL: 4098
        create_txg: 4
--------------------------------------------
LABEL 2
--------------------------------------------
    version: 28
    name: 'rpool'
    state: 0
    txg: 560647931
    pool_guid: 17995958177810353692
    hostid: 13583512
    hostname: ''
    top_guid: 3656218981390172871
    guid: 3656218981390172871
    vdev_children: 1
    vdev_tree:
        type: 'disk'
        id: 0
        guid: 3656218981390172871
        path: '/dev/dsk/c4t1d0s0'
        devid: 'id1,sd@SATA_____ST3808110AS_________________5LR557KB/a'
        phys_path: '/pci@0,0/pci8086,2847@1c,4/pci1043,81e4@0/disk@1,0:a'
        whole_disk: 0
        metaslab_array: 30
        metaslab_shift: 27
        ashift: 9
        asize: 21430272000
        is_log: 0
        DTL: 4098
        create_txg: 4
--------------------------------------------
LABEL 3
--------------------------------------------
    version: 28
    name: 'rpool'
    state: 0
    txg: 560647931
    pool_guid: 17995958177810353692
    hostid: 13583512
    hostname: ''
    top_guid: 3656218981390172871
    guid: 3656218981390172871
    vdev_children: 1
    vdev_tree:
        type: 'disk'
        id: 0
        guid: 3656218981390172871
        path: '/dev/dsk/c4t1d0s0'
        devid: 'id1,sd@SATA_____ST3808110AS_________________5LR557KB/a'
        phys_path: '/pci@0,0/pci8086,2847@1c,4/pci1043,81e4@0/disk@1,0:a'
        whole_disk: 0
        metaslab_array: 30
        metaslab_shift: 27
        ashift: 9
        asize: 21430272000
        is_log: 0
        DTL: 4098
        create_txg: 4
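
The label fields also allow a rough localisation of the failed space map
assertion. This is a minimal Python sketch, assuming that the two hex
values in the assertion message are byte offsets within the top-level
vdev and that metaslabs are laid out back to back from offset 0 in units
of 2^metaslab_shift bytes. Both points are assumptions, not something
the output states.

```python
# Values copied from the zdb -l labels above.
METASLAB_SHIFT = 27
ASIZE = 21430272000
ASHIFT = 9

# Values copied from the assertion message:
#   ss->ss_start <= start (0x79e22600 <= 0x79e1dc00)
SS_START = 0x79E22600
START = 0x79E1DC00

metaslab_size = 1 << METASLAB_SHIFT       # bytes per metaslab
metaslab_count = ASIZE >> METASLAB_SHIFT  # whole metaslabs that fit in asize
sector_size = 1 << ASHIFT                 # smallest allocation unit in bytes


def metaslab_index(offset):
    """Index of the metaslab containing a vdev byte offset (assumed layout)."""
    return offset >> METASLAB_SHIFT


# How far the segment that was found starts beyond the requested offset.
gap_bytes = SS_START - START

print(f"metaslab size:  {metaslab_size} bytes")
print(f"metaslab count: {metaslab_count}")
print(f"ss_start is in metaslab {metaslab_index(SS_START)}")
print(f"start    is in metaslab {metaslab_index(START)}")
print(f"gap: {gap_bytes:#x} bytes = {gap_bytes // sector_size} sectors")
```

Under those assumptions both offsets fall into metaslab 15 of 159, and
the segment zdb found starts 0x4a00 bytes (37 sectors of 512 bytes) past
the offset it was asked to remove.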


Any ideas as to whether this rpool can be recovered into a
mountable state, or is recreation my only option now? ;)

In particular, I'm currently testing with a LiveUSB of oi_148a,
as that is what they have at the broken PC. Should we
expect zpool import and fixup to work better with
oi_151a, oi_dev, or Solaris 11 (Express or Release)?
It might be problematic to record another bootable
device remotely, so if no related code has changed...

//Jim

