I had a disk malfunction in a raidz pool today. I had an extra one in the enclosure and ran "zpool replace pool old new". Several unexpected things have happened since:

The zpool replace command "hung" for 52 minutes, during which no other zpool commands could be executed (status, iostat, or list, for example).

When it finally returned, the drive was marked as "replacing", as I expected from reading the man page. However, its progress counter has not been monotonically increasing. It started at 1%, then went to 5%, then back to 2%, and so on.
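
One way to document the counter going backwards is to save the zpool status output at intervals and compare the reported percentages. Below is a minimal sketch in Python, assuming the "scrub:" line has the form "resilver in progress, N% done, ..." as it does on this system (other ZFS releases may print it differently). The names resilver_percent and find_regressions are hypothetical helpers, not part of any ZFS tooling.

```python
import re

# Matches the "scrub:" line of zpool status output, for example:
#   scrub: resilver in progress, 0.45% done, 27h27m to go
# ASSUMPTION: this line format is taken from one system's output and
# may differ on other ZFS releases.
SCRUB_RE = re.compile(r"scrub:\s*resilver in progress,\s*([0-9.]+)% done")


def resilver_percent(status_text):
    """Return the resilver percentage found in status_text, or None."""
    match = SCRUB_RE.search(status_text)
    return float(match.group(1)) if match else None


def find_regressions(samples):
    """Return (previous, current) pairs where the percentage decreased.

    Samples that are None (no resilver line found) are skipped.
    """
    known = [p for p in samples if p is not None]
    return [(a, b) for a, b in zip(known, known[1:]) if b < a]
```

Each saved copy of the status output goes through resilver_percent, and the resulting list goes to find_regressions; any pair it returns is a point where the reported progress dropped.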

I just logged in to see if it was "done" and ran zpool status and received:

  pool: xsr_slow_2
 state: ONLINE
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress, 100.00% done, 0h0m to go
config:

        NAME                           STATE     READ WRITE CKSUM
        xsr_slow_2                     ONLINE       0     0     0
          raidz                        ONLINE       0     0     0
            c4t6000393000016A1Fd0s2    ONLINE       0     0     0
            c4t6000393000016A1Fd1s2    ONLINE       0     0     0
            c4t6000393000016A1Fd2s2    ONLINE       0     0     0
            c4t6000393000016A1Fd3s2    ONLINE       0     0     0
            replacing                  ONLINE       0     0     0
              c4t6000393000016A1Fd4s2  ONLINE   2.87K   251     0
              c4t6000393000016A1Fd6    ONLINE       0     0     0
            c4t6000393000016A1Fd5s2    ONLINE       0     0     0


I thought to myself: if it is 100% done, why is it still replacing? I waited about 15 seconds and ran the command again, only to find something rather disconcerting:

  pool: xsr_slow_2
 state: ONLINE
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress, 0.45% done, 27h27m to go
config:

        NAME                           STATE     READ WRITE CKSUM
        xsr_slow_2                     ONLINE       0     0     0
          raidz                        ONLINE       0     0     0
            c4t6000393000016A1Fd0s2    ONLINE       0     0     0
            c4t6000393000016A1Fd1s2    ONLINE       0     0     0
            c4t6000393000016A1Fd2s2    ONLINE       0     0     0
            c4t6000393000016A1Fd3s2    ONLINE       0     0     0
            replacing                  ONLINE       0     0     0
              c4t6000393000016A1Fd4s2  ONLINE   2.87K   251     0
              c4t6000393000016A1Fd6    ONLINE       0     0     0
            c4t6000393000016A1Fd5s2    ONLINE       0     0     0

WTF?!

Best regards,

Theo

// Theo Schlossnagle
// CTO -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/

