[zfs-discuss] Resilver endlessly restarting at completion

Tuomas Leikola Mon, 27 Sep 2010 03:14:51 -0700

Hi!

My home server had some disk outages due to flaky cabling and whatnot, and
started resilvering to a spare disk. During the resilver another disk or two
dropped out and were reinserted into the array. So no devices were actually
lost; each was just intermittently unavailable for a while.


The situation is currently as follows:
  pool: tank
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver in progress for 5h33m, 22.47% done, 19h10m to go
config:

        NAME                       STATE     READ WRITE CKSUM
        tank                       ONLINE       0     0     0
          raidz1-0                 ONLINE       0     0     0
            c11t1d0p0              ONLINE       0     0     0
            c11t2d0                ONLINE       0     0     5
            c11t6d0p0              ONLINE       0     0     0
            spare-3                ONLINE       0     0     0
              c11t3d0p0            ONLINE       0     0     0  106M resilvered
              c9d1                 ONLINE       0     0     0  104G resilvered
            c11t4d0p0              ONLINE       0     0     0
            c11t0d0p0              ONLINE       0     0     0
            c11t5d0p0              ONLINE       0     0     0
            c11t7d0p0              ONLINE       0     0     0  93.6G resilvered
          raidz1-2                 ONLINE       0     0     0
            c6t2d0                 ONLINE       0     0     0
            c6t3d0                 ONLINE       0     0     0
            c6t4d0                 ONLINE       0     0     0  2.50K resilvered
            c6t5d0                 ONLINE       0     0     0
            c6t6d0                 ONLINE       0     0     0
            c6t7d0                 ONLINE       0     0     0
            c6t1d0                 ONLINE       0     0     1
        logs
          /dev/zvol/dsk/rpool/log  ONLINE       0     0     0
        cache
          c6t0d0p0                 ONLINE       0     0     0
        spares
          c9d1                     INUSE     currently in use

errors: No known data errors

And this has been going on for a week now, always restarting when it should
complete.
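
To pin down exactly when each restart happens, one option would be to sample
the status output periodically and note when the percentage drops. A minimal
sketch, assuming the "scrub:" line keeps the format shown above; the function
names and the sampling approach are illustrative only, not an existing tool:

```python
import re

# Assumption: the progress line looks like the one in the status output above,
# e.g. "scrub: resilver in progress for 5h33m, 22.47% done, 19h10m to go".
PROGRESS_RE = re.compile(r"resilver in progress for \S+, ([0-9.]+)% done")


def parse_progress(status_text):
    """Return the resilver percentage as a float, or None if none is reported."""
    match = PROGRESS_RE.search(status_text)
    return float(match.group(1)) if match else None


def find_restarts(samples):
    """Given (timestamp, percent) samples in time order, return the timestamps
    at which the reported progress was lower than the last known value."""
    restarts = []
    previous = None
    for timestamp, percent in samples:
        if percent is None:
            continue  # no resilver reported in this sample; keep the last value
        if previous is not None and percent < previous:
            restarts.append(timestamp)
        previous = percent
    return restarts
```

The samples could come from saving the output of 'zpool status tank' at regular
intervals; the restart timestamps can then be compared against system logs from
the same moments.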

The questions in my mind atm:

1. How can I determine the cause of each resilver? Is there a log?

2. Why does it resilver the same data over and over, and not just the
changed bits?

3. Can I force-remove c9d1, since it is no longer needed, so that c11t3 is
resilvered instead?

I'm running OpenSolaris build 134, but the event originally happened on 111b. I
upgraded and tried quiescing snapshots and I/O, none of which helped.

I've already ordered some new hardware to recreate this entire array as
raidz2 among other things, but there's about a week of time when I can run
debuggers and traces if instructed to.

- Tuomas

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
