[zfs-discuss] Single-disk rpool with inconsistent checksums, import fails

Jim Klimov Tue, 08 Nov 2011 10:33:54 -0800

Hello all,

I have an oi_148a PC with a single root disk, and it has
recently started failing to boot: it hangs after the
copyright message whichever GRUB menu option I use.


Booting with an oi_148a LiveUSB I had around since
installation, I ran some zdb traversals over the rpool
and zpool import attempts. The imports fail by running
the kernel out of RAM (as recently discussed in the
list with Paul Kraus's problems).

However, in my current case, the rpool has just 11.2GB
allocated with 8.7GB "available". So almost all of it
could fit in the 8GB of RAM in this computer (the
motherboard cannot take any more). And I don't believe
there is so much metadata as to exhaust the RAM during
an import attempt.

I have also tried "rollback" imports with -F, but they
have also failed so far.

I am not ready to copy-paste the zdb/zpool outputs here
(I still have to get the text files off that box), but in short:

1) "zdb -bsvL -e <rpool-GUID>" showed that there are some
problems:
* "deferred free" block count is not zero, although small
  (144 blocks amounting to 1.4MB), and it remained at
  this value over several import attempts.
  I removed a swap volume some time before the failure,
  so this might be its leftovers.
* It had also output this line:
block traversal size 11986202624 != alloc 11986203136 (unreachable 512)
  I believe this refers to the allocated data size in bytes,
  and that one sector (512 bytes) is deemed unreachable. Is
  that really fatal?
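
  As a quick sanity check of that reading, the two figures from
  the zdb line can simply be subtracted. This is only a sketch:
  the variable names are mine, and the 512-byte sector size is my
  assumption about this disk, not something zdb reported.

```shell
# Figures copied from the zdb "block traversal size" line above.
traversed=11986202624   # bytes reached by the block traversal
allocated=11986203136   # bytes the pool reports as allocated
sector=512              # assumed sector size in bytes

# Bytes that are allocated but were not reached by the traversal.
unreachable=$((allocated - traversed))

echo "unreachable bytes:   $unreachable"
echo "unreachable sectors: $((unreachable / sector))"
```

  If the assumption holds, the difference is exactly one sector,
  which matches the "(unreachable 512)" note in the zdb output.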

2) "zdb -bsvc -e <rpool-GUID>" showed that there are some
consistency problems. Namely, five blocks had mismatching
checksums. They were reported as "plain file" blocks, with no
further details (such as which files they belong to).
I hope this means that no metadata has been damaged so far.

3) I've tried importing the pool in several ways (including
normal and rollback imports, readonly and "-n"), but so far
all attempts have led to the computer hanging within a minute
("vmstat 1" shows free RAM plummeting towards zero).

I've also tried setting some kernel tunables beforehand:

:; echo "aok/W 1" | mdb -kw
:; echo "zfs_recover/W 1" | mdb -kw

and sometimes adding:

:; echo zfs_vdev_max_pending/W0t5 | mdb -kw
:; echo zfs_resilver_delay/W0t0 | mdb -kw
:; echo zfs_resilver_min_time_ms/W0t20000 | mdb -kw
:; echo zfs_txg_synctime/W0t1 | mdb -kw


In this case I would not hesitate much to recreate the rpool
and reinstall the OS - it was mostly needed to serve the
separate data pool. However, this option is not always
acceptable, so I wonder whether anything can be done to
repair an inconsistent non-redundant pool - at least to
make it importable again in order to evacuate some of the
settings and tunings that I've made over time.

//Jim

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
