On Dec 2, 2006, at 1:32 PM, Bill Sommerfeld wrote:

On Sat, 2006-12-02 at 00:08 -0500, Theo Schlossnagle wrote:
I had a disk malfunction in a raidz pool today.  I had an extra one in
the enclosure and performed a "zpool replace pool old new", and several
unexpected behaviors have transpired:

The zpool replace command "hung" for 52 minutes, during which no other
zpool commands (such as status, iostat, or list) could be executed.
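
For the record, the invocations were along these lines (the pool and
device names here are hypothetical, not the ones I actually used):

    # swap the failed disk for the spare already in the enclosure
    zpool replace pool c1t2d0 c1t3d0

    # the kind of status check that hung along with it
    zpool status pool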

So, I've observed that zfs will continue to attempt I/O to the
outgoing drive while a replacement is in progress.  (This seems
counterintuitive; I'd expect you'd want to touch the outgoing drive as
little as possible, perhaps only attempting to read from it when a
block isn't recoverable from the healthy drives.)

When it finally returned, the drive was marked as "replacing", as I
expected from reading the man page.  However, its progress counter has
not been monotonically increasing.  It started at 1%, then went to 5%,
then back to 2%, and so on.
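
I was watching it with something like this (the pool name is a
placeholder):

    # poll the resilver progress line once a minute
    while true; do
        zpool status pool | egrep 'resilver|done'
        sleep 60
    done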

Do you have any cron jobs set up to take periodic snapshots?
If so, I think you're seeing:

6343667 scrub/resilver has to start over when a snapshot is taken

I ran into this myself this week - replaced a drive, and the resilver
made it to 95% before a snapshot cron job fired and set things back to
0%.
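
If that's what's happening, a rough workaround (assuming the snapshots
come from an entry in root's crontab that runs "zfs snapshot") is to
comment the entry out until the resilver completes:

    # find cron entries that take snapshots
    crontab -l | grep 'zfs snapshot'

    # comment them out for the duration of the resilver
    crontab -l | sed '/zfs snapshot/s/^/# /' | crontab -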

Yesterday, a snapshot was taken to assist in backups -- that could be it.

// Theo Schlossnagle
// CTO -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/


_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
