I have an X4540 running b134 where I'm replacing 500GB disks with 2TB disks
(Seagate Constellation) and the pool seems sick now. The pool has four
raidz2 vdevs (8+2) where the first set of 10 disks were replaced a few
months ago. I replaced two disks in the second set (c2t0d0, c3t0d0) a
couple of weeks ago, but have been unable to get the third disk to finish
replacing (c4t0d0).
I have now tried the resilver for c4t0d0 four times, and each time the pool
comes up with checksum errors and a permanent error (<metadata>:<0x0>). The
first resilver was triggered by the 'zpool replace' itself and ended with
checksum errors. Clearing those errors triggered the second resilver (same
result). I then ran a 'zpool scrub', which started the third resilver and
identified three permanent errors (the two additional ones were in files in
snapshots, which I then destroyed). After another 'zpool clear', a second
scrub started the fourth resilver attempt; it identified one more file with
errors in a snapshot, which I have also now destroyed.
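The sequence above, roughly as I ran it (pool and device names as in the
status output below; each attempt ended with checksum errors and the
<metadata>:<0x0> permanent error):

```shell
# Attempt 1: replace the old 500GB disk with the new 2TB disk in the
# same slot (single-argument form of replace)
zpool replace pool2 c4t0d0

# Attempt 2: clearing the error counters re-triggered the resilver
zpool clear pool2

# Attempts 3 and 4: scrubs, each of which restarted the resilver;
# between them I destroyed the snapshots holding the damaged files
# and ran another 'zpool clear'
zpool scrub pool2

# Check resilver progress and the permanent-error list
zpool status -v pool2
```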
Any ideas on how to finish replacing this disk without rebuilding the pool
and restoring from backup? The pool is working, but it reports as degraded
and with checksum errors.
Here is what the pool currently looks like:
# zpool status -v pool2
  pool: pool2
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver completed after 33h9m with 4 errors on Thu Sep 16 00:28:14
config:
        NAME              STATE     READ WRITE CKSUM
        pool2             DEGRADED     0     0     8
          raidz2-0        ONLINE       0     0     0
            c0t4d0        ONLINE       0     0     0
            c1t4d0        ONLINE       0     0     0
            c2t4d0        ONLINE       0     0     0
            c3t4d0        ONLINE       0     0     0
            c4t4d0        ONLINE       0     0     0
            c5t4d0        ONLINE       0     0     0
            c2t5d0        ONLINE       0     0     0
            c3t5d0        ONLINE       0     0     0
            c4t5d0        ONLINE       0     0     0
            c5t5d0        ONLINE       0     0     0
          raidz2-1        DEGRADED     0     0    14
            c0t5d0        ONLINE       0     0     0
            c1t5d0        ONLINE       0     0     0
            c2t1d0        ONLINE       0     0     0
            c3t1d0        ONLINE       0     0     0
            c4t1d0        ONLINE       0     0     0
            c5t1d0        ONLINE       0     0     0
            c2t0d0        ONLINE       0     0     0
            c3t0d0        ONLINE       0     0     0
            replacing-8   DEGRADED     0     0     0
              c4t0d0s0/o  OFFLINE      0     0     0
              c4t0d0      ONLINE       0     0     0  268G resilvered
            c5t0d0        ONLINE       0     0     0
          raidz2-2        ONLINE       0     0     0
            c0t6d0        ONLINE       0     0     0
            c1t6d0        ONLINE       0     0     0
            c2t6d0        ONLINE       0     0     0
            c3t6d0        ONLINE       0     0     0
            c4t6d0        ONLINE       0     0     0
            c5t6d0        ONLINE       0     0     0
            c2t7d0        ONLINE       0     0     0
            c3t7d0        ONLINE       0     0     0
            c4t7d0        ONLINE       0     0     0
            c5t7d0        ONLINE       0     0     0
          raidz2-3        ONLINE       0     0     0
            c0t7d0        ONLINE       0     0     0
            c1t7d0        ONLINE       0     0     0
            c2t3d0        ONLINE       0     0     0
            c3t3d0        ONLINE       0     0     0
            c4t3d0        ONLINE       0     0     0
            c5t3d0        ONLINE       0     0     0
            c2t2d0        ONLINE       0     0     0
            c3t2d0        ONLINE       0     0     0
            c4t2d0        ONLINE       0     0     0
            c5t2d0        ONLINE       0     0     0
        logs
          mirror-4        ONLINE       0     0     0
            c0t1d0s0      ONLINE       0     0     0
            c1t3d0s0      ONLINE       0     0     0
        cache
          c0t3d0s7        ONLINE       0     0     0
errors: Permanent errors have been detected in the following files:

        <metadata>:<0x0>
        <0x167a2>:<0x552ed>

(This second file was in a snapshot I destroyed after the resilver
completed.)
# zpool list pool2
NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH    ALTROOT
pool2  31.8T  13.8T  17.9T    43%  1.65x  DEGRADED  -
The slog is a mirror of two SLC SSDs and the L2ARC is an MLC SSD.
thanks,
Ben
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss