Re: [zfs-discuss] How does resilver/scrub work?

2012-05-22 Thread Jim Klimov

2012-05-22 7:30, Daniel Carosone wrote:

> On Mon, May 21, 2012 at 09:18:03PM -0500, Bob Friesenhahn wrote:
>
>> On Mon, 21 May 2012, Jim Klimov wrote:
>>
>>> This is so far a relatively raw idea and I've probably missed
>>> something. Do you think it is worth pursuing and asking some
>>> zfs developers to make a POC? ;)
>>
>> I did read all of your text. :-)
>>
>> This is an interesting idea and could be of some use but it would be
>> wise to test it first a few times before suggesting it as a general
>> course.
>
> I've done basically this kind of thing before: dd a disk and then
> scrub rather than replace, treating errors as expected.
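
(For anyone following along, a minimal sketch of that dd-then-scrub
approach; the device names and block size below are just placeholders,
not what either of us actually ran:)

  # Raw copy of the flaky disk onto its replacement (names are
  # placeholders).  conv=noerror,sync keeps dd going past read errors
  # and pads the unreadable blocks, so ZFS will later see them as
  # ordinary checksum errors to repair.
  dd if=/dev/rdsk/c1t1d0s0 of=/dev/rdsk/c2t1d0s0 bs=1024k conv=noerror,sync

  # With the copy sitting in the original's place, scrub instead of
  # 'zpool replace' and treat the resulting errors as expected:
  zpool scrub pond
  zpool status -v pond   # CKSUM errors on the copied vdev are expected
  zpool clear pond       # reset the counters once repairs finish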


I got into a similar situation last night on that Thumper -
it is now migrating a flaky source disk in the array from
the original old 250GB disk onto a same-sized partition on
the new 3TB drive (as I outlined as IDEA7 in another thread).
The source disk itself showed about 300 CKSUM errors during
the process, and for reasons beyond my current understanding,
the resilver never completed.
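
(For reference, the migration step itself boils down to something like
the following; which cXtYdZ name is the old disk is my guess here, and
the slice layout on the 3TB drive is assumed to have been prepared with
format(1M) beforehand:)

  # Replace the flaky 250GB disk with the same-sized slice s1 carved
  # out of the new 3TB drive (device names are illustrative):
  zpool replace pond c5t6d0 c1t2d0s1
  zpool status pond        # watch the resilver onto the new slice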

zpool status said the resilver had been done for several
hours by the time I looked at it, but the TLVDEV still
had a spare component device made up of the old disk
and the new partition, and the (same) hotspare device in
the pool was still INUSE.

After a while we just detached the old disk from the pool
and ran a scrub, which right away found some 178 CKSUM
errors on the new partition and degraded the TLVDEV and
the pool.
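
(The command sequence was essentially this; again, the old-disk name is
a stand-in:)

  zpool detach pond c5t6d0    # drop the old flaky disk from the spare pair
  zpool scrub pond            # verify the copy on the new slice
  zpool status -v pond        # CKSUM errors on c1t2d0s1 are expected here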

We cleared the errors and ran the script below to log
the detected errors and clear them again, so that the disk
gets fixed and is not kicked out of the pool due to the
mismatches. Overall 1277 errors were logged and apparently
fixed, and the pool is now on its second full scrub run -
no errors so far (knocking on wood; certainly none this
early in the scrub, unlike last time).

So in effect, this methodology works for two of us :)

Since you did similar stuff already, I have a few questions:
1) How/what did you DD? The whole slice with the zfs vdev?
   Did the system complain (much) about the renaming of the
   device compared to paths embedded in pool/vdev headers?
   Did you do anything manually to remedy that (forcing
   import, DDing some handcrafted uberblocks, anything)?
   (A sketch of the sort of procedure I mean follows the
   questions below.)

2) How did you treat errors as expected during scrub?
   As I've discovered, there were hoops to jump through.
   Is there a switch to disable the degrading of pools and
   TLVDEVs based only on the CKSUM counts?
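
(To make question 1 concrete, the sort of procedure I have in mind looks
roughly like this; the device names are hypothetical, and whether the
forced import is actually needed is part of what I'm asking:)

  # Copy just the slice holding the vdev onto the new disk/slice:
  dd if=/dev/rdsk/c1t1d0s0 of=/dev/rdsk/c2t1d0s0 bs=1024k conv=noerror,sync

  # The copy carries vdev labels that still name the old device path,
  # so the pool may have to be re-imported (perhaps forcibly) before
  # it accepts the renamed device:
  zpool export pond
  zpool import -f pond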


My raw hoop-jumping script:
-

#!/bin/bash

# /root/scrubwatch.sh
# Watches the 'pond' scrub and resets error counters to avoid the
# device being auto-degraded, while still logging the detected error
# counts.
# See also "fmstat | grep zfs-diag" for precise counts.
# See also https://blogs.oracle.com/bobn/entry/zfs_and_fma_two_great
#   for details on FMA and fmstat with zfs hotspares.

while true; do
    # Show the interesting part of the pool status (gegrep = GNU egrep)
    zpool status pond | gegrep -A4 -B3 'resilv|error|c1t2d|c5t6d|%'
    date
    echo ""

    # Grab the status line for the new slice; the unquoted echo below
    # collapses the column spacing to single spaces, so the pattern
    # matches only while the device is ONLINE with zero R/W/CKSUM counts.
    C1=`zpool status pond | grep c1t2d`
    C2=`echo $C1 | grep 'c1t2d0s1 ONLINE 0 0 0'`
    if [ "x$C2" = "x" ]; then
        # Errors (or a state change) appeared: log the line, then clear
        # the counters so the device is not kicked out of the pool.
        echo "`date`: $C1" >> /var/tmp/zpool-clear_pond.log
        zpool clear pond
        zpool status pond | gegrep -A4 -B3 'resilv|error|c1t2d|c5t6d|%'
        date
    fi
    echo ""

    sleep 60
done
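
(A minimal way to keep it running unattended for the length of a scrub -
this particular invocation is just a suggestion, and the output path is
arbitrary:)

  chmod +x /root/scrubwatch.sh
  nohup /root/scrubwatch.sh > /var/tmp/scrubwatch.out 2>&1 &
  tail -f /var/tmp/zpool-clear_pond.log    # follow the logged errors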




HTH,
//Jim Klimov
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How does resilver/scrub work?

2012-05-22 Thread Daniel Carosone
On Tue, May 22, 2012 at 12:42:02PM +0400, Jim Klimov wrote:
> 2012-05-22 7:30, Daniel Carosone wrote:
>> I've done basically this kind of thing before: dd a disk and then
>> scrub rather than replace, treating errors as expected.

> I got into a similar situation last night on that Thumper -
> it is now migrating a flaky source disk in the array from
> the original old 250GB disk onto a same-sized partition on
> the new 3TB drive (as I outlined as IDEA7 in another thread).
> The source disk itself showed about 300 CKSUM errors during
> the process, and for reasons beyond my current understanding,
> the resilver never completed.
>
> zpool status said the resilver had been done for several
> hours by the time I looked at it, but the TLVDEV still
> had a spare component device made up of the old disk
> and the new partition, and the (same) hotspare device in
> the pool was still INUSE.

I think this is at least in part an issue with older code.  There have
been various fixes for hangs, restarts, and incomplete replaces and
sparings in the time since.

> After a while we just detached the old disk from the pool
> and ran a scrub, which right away found some 178 CKSUM
> errors on the new partition and degraded the TLVDEV and
> the pool.
>
> We cleared the errors and ran the script below to log
> the detected errors and clear them again, so that the disk
> gets fixed and is not kicked out of the pool due to the
> mismatches.
>
> So in effect, this methodology works for two of us :)
>
> Since you did similar stuff already, I have a few questions:
> 1) How/what did you DD? The whole slice with the zfs vdev?
>    Did the system complain (much) about the renaming of the
>    device compared to paths embedded in pool/vdev headers?
>    Did you do anything manually to remedy that (forcing
>    import, DDing some handcrafted uberblocks, anything)?

I've done it a couple of times at least:

 * a failed disk in a raidz1, where I didn't trust that the other
   disks didn't also have errors.  Basically did a ddrescue from the
   old disk to the new (roughly as sketched after this list).  I think
   these days a 'replace' where the original disk is still online will
   use that content, like a hotspare replace, rather than assume it
   has gone away and must be recreated, but that wasn't the case at
   the time.

 * Where I had an iscsi mirror of a laptop hard disk, but it was out
   of date and had been detached when the laptop iscsi initiator
   refused to start.  Later, the disk developed a few bad sectors.  I
   made a new submirror, let it sync (with the error still there),
   then blatted bits of the old image over the new in the areas where
   the bad sectors were being reported.  Scrubbed again, and they were
   fixed (as well as some blocks on the new submirror being repaired,
   bringing it back up to date again).
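
(A minimal sketch of that ddrescue step, under the assumption of GNU
ddrescue and with made-up device and map-file names:)

  # Copy as much as possible from the failing raidz1 member onto its
  # replacement; the map file lets ddrescue resume and retry bad spots.
  ddrescue -f /dev/rdsk/c2t3d0s0 /dev/rdsk/c2t4d0s0 /var/tmp/c2t3d0.map

  # Then swap the new disk into the old one's place and let a scrub
  # repair whatever ddrescue could not read.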

> 2) How did you treat errors as expected during scrub?

Pretty much as you did: decline to panic and restart scrubs.

--
Dan.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss