Re: [zfs-discuss] ZFS resilvering loop from hell

2011-07-27 Thread Daniel Carosone
On Wed, Jul 27, 2011 at 08:00:43PM -0500, Bob Friesenhahn wrote:
> On Tue, 26 Jul 2011, Charles Stephens wrote:
>
>> I'm on S11E 150.0.1.9 and I replaced one of the drives and the pool  
>> seems to be stuck in a resilvering loop.  I performed a 'zpool clear' 
>> and 'zpool scrub' and just complains that the drives I didn't replace 
>> are degraded because of too many errors.  Oddly the replaced drive is 
>> reported as being fine.  The CKSUM counts get up to about 108 or so 
>> when the resilver is completed.
>
> This sort of problem (failing disks during a recovery) is a good reason 
> not to use raidz1 in modern systems.  Use raidz2 or raidz3.
>
> Assuming that the system is good and it is really a problem with the  
> disks experiencing bad reads, it seems that the only path forward is to 
> wait for the resilver to complete or see if creating a new pool from a 
> recent backup is better.

Indeed, but that assumption may be too strong.  If you're getting
errors across all the members, you are likely to have some other
systemic problem, such as: 
 * bad ram / cpu / motherboard
 * too-weak power supply
 * faulty disk controller / driver

Had you scrubbed the pool regularly before the replacement, and were
those scrubs clean?  If not, one possibility is that the scrubs are
telling you that bad data was written originally, especially if the
errors are repeatable on the same files.  If each scrub hits different
counts and files, you may be seeing corruption on reads, due to the
same causes.  Or you may have both.
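
For example, a rough sketch of where to look (the pool name is taken from
the status output elsewhere in this thread; adjust as needed):

  # fmdump -eV            (low-level driver/transport/memory errors logged by FMA)
  # fmadm faulty          (faults FMA has already diagnosed)
  # iostat -En            (per-device soft/hard/transport error counters)
  # zpool scrub dpool     (once the resilver settles, scrub again and compare)
  # zpool status -v dpool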

--
Dan.




Re: [zfs-discuss] ZFS resilvering loop from hell

2011-07-27 Thread Bob Friesenhahn

On Tue, 26 Jul 2011, Charles Stephens wrote:

I'm on S11E 150.0.1.9 and I replaced one of the drives and the pool 
seems to be stuck in a resilvering loop.  I performed a 'zpool 
clear' and 'zpool scrub' and just complains that the drives I didn't 
replace are degraded because of too many errors.  Oddly the replaced 
drive is reported as being fine.  The CKSUM counts get up to about 
108 or so when the resilver is completed.


This sort of problem (failing disks during a recovery) is a good 
reason not to use raidz1 in modern systems.  Use raidz2 or raidz3.


Assuming that the system is good and it is really a problem with the 
disks experiencing bad reads, it seems that the only path forward is 
to wait for the resilver to complete or see if creating a new pool 
from a recent backup is better.
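
For illustration only, with hypothetical device names: a six-disk raidz2
set gives double parity, so it survives any two simultaneous disk failures.

  # zpool create tank raidz2 c9t0d0 c9t1d0 c9t2d0 c9t3d0 c9t4d0 c9t5d0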


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,  http://www.GraphicsMagick.org/


[zfs-discuss] ZFS resilvering loop from hell

2011-07-26 Thread Charles Stephens
I'm on S11E 150.0.1.9. I replaced one of the drives, and the pool now seems to
be stuck in a resilvering loop.  I performed a 'zpool clear' and 'zpool scrub',
and it just complains that the drives I didn't replace are degraded because of
too many errors.  Oddly, the replaced drive is reported as being fine.  The
CKSUM counts get up to about 108 or so by the time the resilver completes.

I'm now trying to evacuate the pool onto another pool; however, the zfs
send/receive is dying about 380GB into sending the first dataset (a fallback
idea is sketched below, after the status output).

Here is some output.  Any help or insights would be appreciated.  Thanks

cfs

  pool: dpool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scan: resilver in progress since Tue Jul 26 15:03:32 2011
63.4G scanned out of 5.02T at 6.81M/s, 212h12m to go
15.1G resilvered, 1.23% done
config:

        NAME        STATE     READ WRITE CKSUM
        dpool       DEGRADED     0     0     6
          raidz1-0  DEGRADED     0     0    12
            c9t0d0  DEGRADED     0     0     0  too many errors
            c9t1d0  DEGRADED     0     0     0  too many errors
            c9t3d0  DEGRADED     0     0     0  too many errors
            c9t2d0  ONLINE       0     0     0  (resilvering)

errors: Permanent errors have been detected in the following files:

:<0x0>
[redacted list of 20 files, mostly in the same directory]
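
A fallback (sketch only, with 'newpool' as a placeholder name) would be to
snapshot everything once and then send each child dataset separately, so
that a read error aborts only that one stream:

  # zfs snapshot -r dpool@evac
  # zfs list -H -o name -r dpool | grep / | while read ds; do
        zfs send "$ds@evac" | zfs receive -u "newpool/${ds#dpool/}"
    done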




Re: [zfs-discuss] zfs resilvering

2008-09-29 Thread Mikael Kjerrman
Richard,

thanks a lot for that answer. It can be argued back and forth about what is
right, but it helps to know the reason behind the problem. Again, thanks a lot...

//Mike


Re: [zfs-discuss] zfs resilvering

2008-09-29 Thread Mikael Kjerrman
Hi,

it was actually shared both as a dataset and as an NFS share.

We had zonedata/prodlogs set up as a dataset, and then
we had zonedata/tmp mounted as an NFS filesystem within the zone.

//Mike


Re: [zfs-discuss] zfs resilvering

2008-09-28 Thread Richard Elling
Johan Hartzenberg wrote:
>
>
> On Fri, Sep 26, 2008 at 7:03 PM, Richard Elling 
> <[EMAIL PROTECTED] > wrote:
>
> Mikael Kjerrman wrote:
> > define a lot :-)
> >
> > We are doing about 7-8M per second which I don't think is a lot
> but perhaps it is enough to screw up the estimates? Anyhow the
> resilvering completed about 4386h earlier than expected so
> everything is ok now, but I still feel that the way it figures out
> the number is wrong.
> >
>
> Yes, the algorithm is conservative and very often wrong until you
> get close to the end.  In part this is because resilvering works
> in time
> order, not spatial distance. In ZFS, the oldest data is resilvered
> first.
> This is also why you will see a lot of "thinking" before you see a
> lot of I/O because ZFS is determining the order to resilver the data.
> Unfortunately, this makes time completion prediction somewhat
> difficult to get right.
>
>
> Hi Richard,
>
> Would it not make more sense then for the program to say something 
> like "No Estimate Yet" during the early part of the process, at least?

Yes.  That would be a good idea.  Sounds like a good, quick opportunity
for a community contributor :-)
 -- richard

>
> Cheers,
>   _hartz



Re: [zfs-discuss] zfs resilvering

2008-09-27 Thread Ian Collins
Mikael Kjerrman wrote:
>
> I also have a question about sharing a zfs from the global zone to a local 
> zone. Are there any issues with this? We had an unfortunate sysadmin who did 
> this and our systems hung. We have no logs that show anyhing at all, but I 
> thought I'd ask just be sure.
>
>   
How was it shared, as an fs or a dataset?  I'm using both and I haven't
seen any problems with either.
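
For reference, the two approaches look roughly like this (sketch only; zone
name, dataset, and paths are illustrative):

  # zonecfg -z myzone
  zonecfg:myzone> add dataset
  zonecfg:myzone:dataset> set name=zonedata/prodlogs
  zonecfg:myzone:dataset> end
  zonecfg:myzone> add fs
  zonecfg:myzone:fs> set dir=/data
  zonecfg:myzone:fs> set special=/zonedata/tmp
  zonecfg:myzone:fs> set type=lofs
  zonecfg:myzone:fs> end
  zonecfg:myzone> commit
  zonecfg:myzone> exit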

Ian



Re: [zfs-discuss] zfs resilvering

2008-09-27 Thread Johan Hartzenberg
On Fri, Sep 26, 2008 at 7:03 PM, Richard Elling <[EMAIL PROTECTED]>wrote:

> Mikael Kjerrman wrote:
> > define a lot :-)
> >
> > We are doing about 7-8M per second which I don't think is a lot but
> perhaps it is enough to screw up the estimates? Anyhow the resilvering
> completed about 4386h earlier than expected so everything is ok now, but I
> still feel that the way it figures out the number is wrong.
> >
>
> Yes, the algorithm is conservative and very often wrong until you
> get close to the end.  In part this is because resilvering works in time
> order, not spatial distance. In ZFS, the oldest data is resilvered first.
> This is also why you will see a lot of "thinking" before you see a
> lot of I/O because ZFS is determining the order to resilver the data.
> Unfortunately, this makes time completion prediction somewhat
> difficult to get right.
>

Hi Richard,

Would it not make more sense then for the program to say something like "No
Estimate Yet" during the early part of the process, at least?

Cheers,
  _hartz


Re: [zfs-discuss] zfs resilvering

2008-09-26 Thread Richard Elling
Mikael Kjerrman wrote:
> define a lot :-)
>
> We are doing about 7-8M per second which I don't think is a lot but perhaps 
> it is enough to screw up the estimates? Anyhow the resilvering completed 
> about 4386h earlier than expected so everything is ok now, but I still feel 
> that the way it figures out the number is wrong.
>   

Yes, the algorithm is conservative and very often wrong until you
get close to the end.  In part this is because resilvering works in time
order, not spatial distance. In ZFS, the oldest data is resilvered first.
This is also why you will see a lot of "thinking" before you see a
lot of I/O because ZFS is determining the order to resilver the data.
Unfortunately, this makes time completion prediction somewhat
difficult to get right.
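
As a rough illustration, assuming the estimate is just a linear
extrapolation of the rate observed so far: if 10G of a 5T pool has been
scanned at an apparent 1M/s during that initial metadata pass, the estimate
is (5T - 10G) / 1M/s, roughly 5,200,000 seconds or about 1450 hours, even
if the steady-state rate later settles at 50M/s or more (which would put
the real figure closer to 29 hours).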

> Any thoughts on my other issue?
>   

Try the zones-discuss forum
 -- richard



Re: [zfs-discuss] zfs resilvering

2008-09-26 Thread Johan Hartzenberg
On Fri, Sep 26, 2008 at 4:02 PM, <[EMAIL PROTECTED]> wrote:

>
> Note the progress so far "0.04%."  In my experience the time estimate has
> no basis in reality until it's about 1% do or so.  I think there is some
> bookkeeping or something ZFS does at the start of a scrub or resilver that
> throws off the time estimate for a while.  Thats just my experience with
> it but it's been like that pretty consistently for me.
>
> Jonathan Stewart


I agree here.

I've watched iostat -xnc 5 while I start scrubbing a few times, and the
first minute or so is spent doing very little I/O.  Thereafter the transfers
shoot up to near what I think is the maximum the drives can do and stay there
until the scrub is completed.
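
A quick way to watch per-device throughput during the scrub (sketch only,
using the pool name from the original post):

  # zpool iostat -v zonedata 5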


Re: [zfs-discuss] zfs resilvering

2008-09-26 Thread jonathan
> On Fri, Sep 26, 2008 at 1:27 AM, Mikael Kjerrman
> <[EMAIL PROTECTED]> wrote:
[snip]
>> this box was rebooted this morning and after the boot I noticed a
>> resilver was in progress. But the suggested time seemed a bit long, so
>> is this a problem which can be patched or remediated in another way?
>>
>> # zpool status -x
>>  pool: zonedata
>>  state: ONLINE
>> status: One or more devices is currently being resilvered.  The pool
>> will
>>continue to function, possibly in a degraded state.
>> action: Wait for the resilver to complete.
>>  scrub: resilver in progress, 0.04% done, 4398h43m to go
[snip]
> Do you have a lot of competing I/O's on the box which would slow down
> the resilver?

Note the progress so far: "0.04%."  In my experience the time estimate has
no basis in reality until it's about 1% done or so.  I think there is some
bookkeeping or something ZFS does at the start of a scrub or resilver that
throws off the time estimate for a while.  That's just my experience with
it, but it's been like that pretty consistently for me.

Jonathan Stewart




Re: [zfs-discuss] zfs resilvering

2008-09-26 Thread Mikael Kjerrman
define a lot :-)

We are doing about 7-8M per second, which I don't think is a lot, but perhaps
it is enough to screw up the estimates? Anyhow, the resilvering completed about
4386h earlier than expected, so everything is OK now, but I still feel that the
way it calculates the number is wrong.

Any thoughts on my other issue?

cheers,

//Mike


Re: [zfs-discuss] zfs resilvering

2008-09-26 Thread Brent Jones
On Fri, Sep 26, 2008 at 1:27 AM, Mikael Kjerrman
<[EMAIL PROTECTED]> wrote:
> Hi,
>
> I've searched without luck, so I'm asking instead.
>
> I have a Solaris 10 box,
>
> # cat /etc/release
>   Solaris 10 11/06 s10s_u3wos_10 SPARC
>   Copyright 2006 Sun Microsystems, Inc.  All Rights Reserved.
>Use is subject to license terms.
>   Assembled 14 November 2006
>
> this box was rebooted this morning and after the boot I noticed a resilver 
> was in progress. But the suggested time seemed a bit long, so is this a 
> problem which can be patched or remediated in another way?
>
> # zpool status -x
>  pool: zonedata
>  state: ONLINE
> status: One or more devices is currently being resilvered.  The pool will
>continue to function, possibly in a degraded state.
> action: Wait for the resilver to complete.
>  scrub: resilver in progress, 0.04% done, 4398h43m to go
> config:
>
>NAME   STATE READ WRITE CKSUM
>zonedata   ONLINE   0 0 0
>  mirror   ONLINE   0 0 0
>c6t60060E8004282B00282B10A0d0  ONLINE   0 0 0
>c6t60060E8004283300283310A0d0  ONLINE   0 0 0
>  mirror   ONLINE   0 0 0
>c6t60060E8004282B00282B10A1d0  ONLINE   0 0 0
>c6t60060E8004283300283310A1d0  ONLINE   0 0 0
>  mirror   ONLINE   0 0 0
>c6t60060E8004282B00282B10A2d0  ONLINE   0 0 0
>c6t60060E8004283300283310A2d0  ONLINE   0 0 0
>  mirror   ONLINE   0 0 0
>c6t60060E8004282B00282B10A4d0  ONLINE   0 0 0
>c6t60060E8004283300283310A4d0  ONLINE   0 0 0
>  mirror   ONLINE   0 0 0
>c6t60060E8004282B00282B10A5d0  ONLINE   0 0 0
>c6t60060E8004283300283310A5d0  ONLINE   0 0 0
>  mirror   ONLINE   0 0 0
>c6t60060E8004282B00282B10A6d0  ONLINE   0 0 0
>c6t60060E8004283300283310A6d0  ONLINE   0 0 0
>  mirror   ONLINE   0 0 0
>c6t60060E8004282B00282B2022d0  ONLINE   0 0 0
>c6t60060E800428330028332022d0  ONLINE   0 0 0
>  mirror   ONLINE   0 0 0
>c6t60060E8004282B00282B2023d0  ONLINE   0 0 0
>c6t60060E800428330028332024d0  ONLINE   0 0 0
>  mirror   ONLINE   0 0 0
>c6t60060E8004282B00282B2024d0  ONLINE   0 0 0
>c6t60060E800428330028332023d0  ONLINE   0 0 0
>
>
> I also have a question about sharing a zfs from the global zone to a local 
> zone. Are there any issues with this? We had an unfortunate sysadmin who did 
> this and our systems hung. We have no logs that show anyhing at all, but I 
> thought I'd ask just be sure.
>
> cheers,
>
> //Mike

Do you have a lot of competing I/O on the box that would slow down
the resilver?
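
One quick way to check (sketch only; the mountpoint is assumed to be the
default /zonedata):

  # iostat -xn 5          (per-device utilisation and service times)
  # fuser -c /zonedata    (which processes have files open on the pool)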


-- 
Brent Jones
[EMAIL PROTECTED]


[zfs-discuss] zfs resilvering

2008-09-26 Thread Mikael Kjerrman
Hi,

I've searched without luck, so I'm asking instead.

I have a Solaris 10 box,

# cat /etc/release
                        Solaris 10 11/06 s10s_u3wos_10 SPARC
           Copyright 2006 Sun Microsystems, Inc.  All Rights Reserved.
                        Use is subject to license terms.
                             Assembled 14 November 2006

This box was rebooted this morning, and after the boot I noticed a resilver
was in progress. The suggested time seemed a bit long, though, so is this a
problem that can be patched or remediated in another way?

# zpool status -x
  pool: zonedata
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress, 0.04% done, 4398h43m to go
config:

        NAME                               STATE     READ WRITE CKSUM
        zonedata                           ONLINE       0     0     0
          mirror                           ONLINE       0     0     0
            c6t60060E8004282B00282B10A0d0  ONLINE       0     0     0
            c6t60060E8004283300283310A0d0  ONLINE       0     0     0
          mirror                           ONLINE       0     0     0
            c6t60060E8004282B00282B10A1d0  ONLINE       0     0     0
            c6t60060E8004283300283310A1d0  ONLINE       0     0     0
          mirror                           ONLINE       0     0     0
            c6t60060E8004282B00282B10A2d0  ONLINE       0     0     0
            c6t60060E8004283300283310A2d0  ONLINE       0     0     0
          mirror                           ONLINE       0     0     0
            c6t60060E8004282B00282B10A4d0  ONLINE       0     0     0
            c6t60060E8004283300283310A4d0  ONLINE       0     0     0
          mirror                           ONLINE       0     0     0
            c6t60060E8004282B00282B10A5d0  ONLINE       0     0     0
            c6t60060E8004283300283310A5d0  ONLINE       0     0     0
          mirror                           ONLINE       0     0     0
            c6t60060E8004282B00282B10A6d0  ONLINE       0     0     0
            c6t60060E8004283300283310A6d0  ONLINE       0     0     0
          mirror                           ONLINE       0     0     0
            c6t60060E8004282B00282B2022d0  ONLINE       0     0     0
            c6t60060E800428330028332022d0  ONLINE       0     0     0
          mirror                           ONLINE       0     0     0
            c6t60060E8004282B00282B2023d0  ONLINE       0     0     0
            c6t60060E800428330028332024d0  ONLINE       0     0     0
          mirror                           ONLINE       0     0     0
            c6t60060E8004282B00282B2024d0  ONLINE       0     0     0
            c6t60060E800428330028332023d0  ONLINE       0     0     0
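
Sampling the progress line every few minutes gives a more realistic rate
than the one-shot estimate (sketch only):

  # while true; do date; zpool status zonedata | grep 'in progress'; sleep 600; done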


I also have a question about sharing a ZFS filesystem from the global zone to
a local zone. Are there any issues with this? We had an unfortunate sysadmin
who did this and our systems hung. We have no logs that show anything at all,
but I thought I'd ask just to be sure.

cheers,

//Mike