Re: [zfs-discuss] Why did resilvering restart?

2007-11-21 Thread Albert Chin
On Tue, Nov 20, 2007 at 11:39:30AM -0600, Albert Chin wrote:
 On Tue, Nov 20, 2007 at 11:10:20AM -0600, [EMAIL PROTECTED] wrote:
  
  [EMAIL PROTECTED] wrote on 11/20/2007 10:11:50 AM:
  
   On Tue, Nov 20, 2007 at 10:01:49AM -0600, [EMAIL PROTECTED] wrote:
    Resilver and scrub are broken: both restart whenever a snapshot is
    created. The current workaround is to disable snapshots while
    resilvering; the ZFS team is working on a long-term fix.
  
   But, no snapshot was taken. If so, zpool history would have shown
   this. So, in short, _no_ ZFS operations are going on during the
   resilvering. Yet, it is restarting.
  
  
  Does 2007-11-20.02:37:13 actually match the expected timestamp of
  the original zpool replace command before the first zpool status
  output listed below?
 
 No. We ran some 'zpool status' commands after the last 'zpool
 replace'. The 'zpool status' output in the initial email is from this
 morning. The only ZFS command we've been running is 'zfs list', 'zpool
 list tww', 'zpool status', or 'zpool status -v' after the last 'zpool
 replace'.

I think the 'zpool status' command itself was resetting the resilver. We
upgraded to b77 this morning, which does not exhibit this problem, and
the resilver has now completed.
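
For anyone who wants to confirm a restart from saved 'zpool status' output rather than by eye, here is a minimal sketch. The helper names resilver_pct and resilver_restarted are made up for illustration, and the sample lines are the ones quoted in this thread. Since the status command itself seems to have been the trigger on b66, a check like this is probably only safe to run repeatedly on a build that does not show the problem.

```shell
#!/bin/sh
# Hypothetical helpers (not part of ZFS) for spotting a resilver restart.

# Print the percentage from a "resilver in progress, N% done" line read on
# stdin; print nothing if no such line is present.
resilver_pct() {
    sed -n 's/.*resilver in progress, \([0-9.]*\)% done.*/\1/p' | head -n 1
}

# Succeed (exit 0) when the current percentage ($2) is lower than the
# previous one ($1), which is what a restarted resilver looks like.
resilver_restarted() {
    awk -v prev="$1" -v cur="$2" 'BEGIN { exit !((cur + 0) < (prev + 0)) }'
}

# Two samples taken from the status output quoted in this thread.
prev=$(echo ' scrub: resilver in progress, 62.90% done, 4h26m to go' | resilver_pct)
cur=$(echo ' scrub: resilver in progress, 3.85% done, 18h49m to go' | resilver_pct)

if resilver_restarted "$prev" "$cur"; then
    echo "resilver restarted: $prev% -> $cur%"
fi
```

A falling percentage is only a heuristic: it shows that progress was reset, not what caused the reset.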

 Server is on GMT time.
 
  Is it possible that another zpool replace is further up on your
  pool history (ie it was rerun by an admin or automatically from some
  service)?
 
 Yes, but a zpool replace for the same bad disk:
   2007-11-20.00:57:40 zpool replace tww c0t600A0B8000299966059E4668CBD3d0
   c0t600A0B800029996606584741C7C3d0
   2007-11-20.02:35:22 zpool detach tww c0t600A0B800029996606584741C7C3d0
   2007-11-20.02:37:13 zpool replace tww c0t600A0B8000299966059E4668CBD3d0
   c0t600A0B8000299CCC06734741CD4Ed0
 
 We accidentally removed c0t600A0B800029996606584741C7C3d0 from the
 array, hence the 'zpool detach'.
 
 The last 'zpool replace' has been running for 15h now.
 
  -Wade
  
  
   
[EMAIL PROTECTED] wrote on 11/20/2007 09:58:19 AM:
   
 On b66:
   # zpool replace tww c0t600A0B8000299966059E4668CBD3d0 \
       c0t600A0B8000299CCC06734741CD4Ed0
   (some hours later)
   # zpool status tww
     pool: tww
    state: DEGRADED
   status: One or more devices is currently being resilvered.  The pool will
           continue to function, possibly in a degraded state.
   action: Wait for the resilver to complete.
    scrub: resilver in progress, 62.90% done, 4h26m to go
   (some hours later)
   # zpool status tww
     pool: tww
    state: DEGRADED
   status: One or more devices is currently being resilvered.  The pool will
           continue to function, possibly in a degraded state.
   action: Wait for the resilver to complete.
    scrub: resilver in progress, 3.85% done, 18h49m to go

   # zpool history tww | tail -1
   2007-11-20.02:37:13 zpool replace tww c0t600A0B8000299966059E4668CBD3d0
       c0t600A0B8000299CCC06734741CD4Ed0

 So, why did resilvering restart when no zfs operations occurred? I
 just ran zpool status again and now I get:
   # zpool status tww
     pool: tww
    state: DEGRADED
   status: One or more devices is currently being resilvered.  The pool will
           continue to function, possibly in a degraded state.
   action: Wait for the resilver to complete.
    scrub: resilver in progress, 0.00% done, 134h45m to go

 What's going on?

 --
 albert chin ([EMAIL PROTECTED])
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
   
-- 
albert chin ([EMAIL PROTECTED])


Re: [zfs-discuss] Why did resilvering restart?

2007-11-20 Thread Wade . Stuart
Resilver and scrub are broken: both restart whenever a snapshot is created.
The current workaround is to disable snapshots while resilvering; the ZFS
team is working on a long-term fix.

-Wade
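
As a rough illustration of that workaround, a snapshot job could refuse to run while the pool reports a resilver. This is only a sketch: resilver_in_progress and snapshot_unless_resilvering are hypothetical helper names, the dataset name in the comment is a placeholder, and the only thing assumed about 'zpool status' is the "resilver in progress" text shown in this thread. Note also that elsewhere in the thread the status command itself is suspected of resetting the resilver on b66, so a guard like this may only make sense on a build without that problem.

```shell
#!/bin/sh
# Succeed when the status text on stdin reports a resilver in progress.
resilver_in_progress() {
    grep -q 'resilver in progress'
}

# Hypothetical wrapper: $1 is previously captured status text, the remaining
# arguments are the snapshot command to run if no resilver is active.
snapshot_unless_resilvering() {
    status_text=$1
    shift
    if printf '%s\n' "$status_text" | resilver_in_progress; then
        echo "resilver in progress, skipping snapshot" >&2
        return 1
    fi
    "$@"
}

# Intended use in a snapshot job (the dataset name is a placeholder):
#   snapshot_unless_resilvering "$(zpool status tww)" \
#       zfs snapshot tww/data@nightly
```

Passing the status text in as an argument keeps the check separate from the command that produced it, so the same function can be tried against saved output.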



Re: [zfs-discuss] Why did resilvering restart?

2007-11-20 Thread Albert Chin
On Tue, Nov 20, 2007 at 10:01:49AM -0600, [EMAIL PROTECTED] wrote:
 Resilver and scrub are broken and restart when a snapshot is created
 -- the current workaround is to disable snaps while resilvering,
 the ZFS team is working on the issue for a long term fix.

But no snapshot was taken; if one had been, zpool history would have
shown it. In short, _no_ ZFS operations are going on during the
resilver, yet it keeps restarting.
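
One way to back that claim up is to filter the pool history for snapshot entries made since the last replace. The sketch below assumes that history lines start with a timestamp in the form shown in this thread (2007-11-20.02:37:13) and that snapshots appear as "zfs snapshot" entries; snapshots_since is a made-up helper name. The sample text is the history quoted in this thread, with each entry joined onto one line.

```shell
#!/bin/sh
# Print history entries at or after timestamp $1 that record a snapshot.
# Reads "zpool history" text on stdin. Timestamps of the form
# YYYY-MM-DD.HH:MM:SS sort correctly as plain strings, so the comparison
# is forced to be a string comparison by appending "".
snapshots_since() {
    awk -v since="$1" '($1 "") >= (since "") && /zfs snapshot/ { print }'
}

# History entries quoted in this thread (one entry per line).
history_text='2007-11-20.00:57:40 zpool replace tww c0t600A0B8000299966059E4668CBD3d0 c0t600A0B800029996606584741C7C3d0
2007-11-20.02:35:22 zpool detach tww c0t600A0B800029996606584741C7C3d0
2007-11-20.02:37:13 zpool replace tww c0t600A0B8000299966059E4668CBD3d0 c0t600A0B8000299CCC06734741CD4Ed0'

found=$(printf '%s\n' "$history_text" | snapshots_since 2007-11-20.02:37:13)
if [ -z "$found" ]; then
    echo "no snapshots recorded since the last replace"
fi

# Against a live pool this would be:
#   zpool history tww | snapshots_since 2007-11-20.02:37:13
```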

-- 
albert chin ([EMAIL PROTECTED])


Re: [zfs-discuss] Why did resilvering restart?

2007-11-20 Thread Wade . Stuart

[EMAIL PROTECTED] wrote on 11/20/2007 10:11:50 AM:

 On Tue, Nov 20, 2007 at 10:01:49AM -0600, [EMAIL PROTECTED] wrote:
  Resilver and scrub are broken and restart when a snapshot is created
  -- the current workaround is to disable snaps while resilvering,
  the ZFS team is working on the issue for a long term fix.

 But, no snapshot was taken. If so, zpool history would have shown
 this. So, in short, _no_ ZFS operations are going on during the
 resilvering. Yet, it is restarting.


Does 2007-11-20.02:37:13 actually match the time you expected for the
original zpool replace command, that is, before the first zpool status
output in your original post?  Is it possible that another zpool replace
appears further up in your pool history (i.e., it was rerun by an admin or
automatically by some service)?

-Wade


Re: [zfs-discuss] Why did resilvering restart?

2007-11-20 Thread Albert Chin
On Tue, Nov 20, 2007 at 11:10:20AM -0600, [EMAIL PROTECTED] wrote:
 
 [EMAIL PROTECTED] wrote on 11/20/2007 10:11:50 AM:
 
  On Tue, Nov 20, 2007 at 10:01:49AM -0600, [EMAIL PROTECTED] wrote:
   Resilver and scrub are broken and restart when a snapshot is created
   -- the current workaround is to disable snaps while resilvering,
   the ZFS team is working on the issue for a long term fix.
 
  But, no snapshot was taken. If so, zpool history would have shown
  this. So, in short, _no_ ZFS operations are going on during the
  resilvering. Yet, it is restarting.
 
 
 Does 2007-11-20.02:37:13 actually match the expected timestamp of
 the original zpool replace command before the first zpool status
 output listed below?

No. We ran some 'zpool status' commands after the last 'zpool
replace'. The 'zpool status' output in the initial email is from this
morning. The only ZFS commands we've run since the last 'zpool
replace' are 'zfs list', 'zpool list tww', 'zpool status', and
'zpool status -v'.

The server is on GMT.

 Is it possible that another zpool replace is further up on your
 pool history (ie it was rerun by an admin or automatically from some
 service)?

Yes, but it was a zpool replace for the same bad disk:
  2007-11-20.00:57:40 zpool replace tww c0t600A0B8000299966059E4668CBD3d0
  c0t600A0B800029996606584741C7C3d0
  2007-11-20.02:35:22 zpool detach tww c0t600A0B800029996606584741C7C3d0
  2007-11-20.02:37:13 zpool replace tww c0t600A0B8000299966059E4668CBD3d0
  c0t600A0B8000299CCC06734741CD4Ed0

We accidentally removed c0t600A0B800029996606584741C7C3d0 from the
array, hence the 'zpool detach'.

The last 'zpool replace' has been running for 15h now.
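
To pull just these device-level entries out of a longer history, something like the following could be used. It is a sketch, not a definitive list: device_events is a made-up name, and the set of subcommands it matches (replace, attach, detach, scrub) is my assumption about which entries are worth reviewing when a resilver behaves oddly.

```shell
#!/bin/sh
# Print history entries for device-level commands worth reviewing when a
# resilver restarts unexpectedly. Reads "zpool history" text on stdin.
# The subcommand list is an assumption, not something defined by ZFS.
device_events() {
    grep -E 'zpool (replace|attach|detach|scrub) '
}

# Against a live pool this would be:
#   zpool history tww | device_events
# and the most recent such entry:
#   zpool history tww | device_events | tail -n 1
```

The timestamp on the last matching entry is the one to compare against when the resilver percentage was last seen to drop.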

-- 
albert chin ([EMAIL PROTECTED])