Re: [zfs-discuss] How does resilver/scrub work?
2012-05-22 7:30, Daniel Carosone wrote:
> On Mon, May 21, 2012 at 09:18:03PM -0500, Bob Friesenhahn wrote:
>> On Mon, 21 May 2012, Jim Klimov wrote:
>>> This is so far a relatively raw idea and I've probably missed
>>> something. Do you think it is worth pursuing and asking some zfs
>>> developers to make a POC? ;)
>>
>> I did read all of your text. :-)  This is an interesting idea and
>> could be of some use, but it would be wise to test it first a few
>> times before suggesting it as a general course.
>
> I've done basically this kind of thing before: dd a disk and then
> scrub rather than replace, treating errors as expected.

I got into a similar situation last night on that Thumper - it is now
migrating a flaky source disk in the array from an original old 250Gb
disk onto a same-sized partition on the new 3Tb drive (as I outlined
as IDEA7 in another thread). The source disk itself had about 300
CKSUM errors during the process, and for reasons beyond my current
understanding, the resilver never completed. "zpool status" said the
process had finished several hours before the time I looked at it,
but the TLVDEV still had a spare component device comprised of the
old disk and the new partition, and the (same) hotspare device in the
pool was still INUSE.

After a while we just detached the old disk from the pool and ran a
scrub, which right away found some 178 CKSUM errors on the new
partition and degraded the TLVDEV and the pool. We cleared the errors
and ran the script below to log the detected error counts and clear
them, so the disk gets fixed instead of being kicked out of the pool
due to the mismatches. Overall 1277 errors were logged and apparently
fixed, and the pool is now on its second full scrub run - no problems
so far (knocking on wood; certainly none this early in the scrub, as
we had last time).

So in effect, this methodology works for two of us :)

Since you did similar stuff already, I have a few questions:

1) How/what did you DD? The whole slice with the zfs vdev? Did the
   system complain (much) about the renaming of the device compared
   to the paths embedded in the pool/vdev headers? Did you do
   anything manually to remedy that (forcing import, DDing some
   handcrafted uberblocks, anything)?

2) How did you treat errors as expected during the scrub? As I've
   discovered, there were hoops to jump through. Is there a switch to
   disable degrading of pools and TLVDEVs based on the CKSUM counts
   alone?

My raw hoop-jumping script:

#!/bin/bash
# /root/scrubwatch.sh
# Watches the 'pond' scrub and resets errors to avoid auto-degrading
# the device, but logs the detected error counts nevertheless.
# See also "fmstat | grep zfs-diag" for precise counts.
# See also https://blogs.oracle.com/bobn/entry/zfs_and_fma_two_great
# for details on FMA and fmstat with zfs hotspares.

while true; do
    zpool status pond | gegrep -A4 -B3 'resilv|error|c1t2d|c5t6d|%'
    date
    echo

    C1=`zpool status pond | grep c1t2d`
    C2=`echo $C1 | grep 'c1t2d0s1 ONLINE 0 0 0'`
    if [ x"$C2" = x ]; then
        # The watched device line is not ONLINE with zero error
        # counters: log the line and reset the counters.
        echo "`date`: $C1" >> /var/tmp/zpool-clear_pond.log
        zpool clear pond
        zpool status pond | gegrep -A4 -B3 'resilv|error|c1t2d|c5t6d|%'
        date
    fi
    echo
    sleep 60
done

HTH,
//Jim Klimov
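For reference, the replace-onto-a-partition flow described above comes
down to roughly the sketch below. This is an illustration only: the
pool name 'pond' and the device names are placeholders rather than the
actual layout on that Thumper, and it assumes the new large drive has
already been partitioned so that one slice matches the old disk's size.

#!/bin/bash
# Rough sketch: migrate a flaky disk onto a same-sized partition of a
# bigger drive, then scrub with CKSUM errors treated as expected.
# POOL/OLD/NEW are hypothetical names, not the real configuration.

POOL=pond
OLD=c0t0d0s0     # flaky source device
NEW=c0t1d0s1     # same-sized partition on the new large drive

# Resilver onto the new partition while the old device is still
# present, so its readable blocks can contribute to the copy.
zpool replace $POOL $OLD $NEW

# Wait for the resilver to finish.
while zpool status $POOL | grep 'resilver in progress' > /dev/null; do
    sleep 60
done

# If the replacing/spare vdev lingers after the resilver (as happened
# here), detach the old device manually.
zpool detach $POOL $OLD

# Scrub; checksum errors found on the new partition are repaired from
# the remaining redundancy.  Clear the counters so the accumulating
# errors do not fault the device (scrubwatch.sh above automates this).
zpool scrub $POOL
zpool clear $POOL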
Re: [zfs-discuss] How does resilver/scrub work?
On Tue, May 22, 2012 at 12:42:02PM +0400, Jim Klimov wrote:
> 2012-05-22 7:30, Daniel Carosone wrote:
>> I've done basically this kind of thing before: dd a disk and then
>> scrub rather than replace, treating errors as expected.
>
> I got into a similar situation last night on that Thumper - it is
> now migrating a flaky source disk in the array from an original old
> 250Gb disk onto a same-sized partition on the new 3Tb drive (as I
> outlined as IDEA7 in another thread). The source disk itself had
> about 300 CKSUM errors during the process, and for reasons beyond my
> current understanding, the resilver never completed. "zpool status"
> said the process had finished several hours before the time I looked
> at it, but the TLVDEV still had a spare component device comprised
> of the old disk and the new partition, and the (same) hotspare
> device in the pool was still INUSE.

I think this is at least in part an issue with older code. There have
been various fixes for hangs/restarts/incomplete replaces and sparings
over the time since.

> After a while we just detached the old disk from the pool and ran a
> scrub, which right away found some 178 CKSUM errors on the new
> partition and degraded the TLVDEV and the pool. We cleared the
> errors and ran the script below to log the detected error counts and
> clear them, so the disk gets fixed instead of being kicked out of
> the pool due to the mismatches.
>
> So in effect, this methodology works for two of us :)
>
> Since you did similar stuff already, I have a few questions:
>
> 1) How/what did you DD? The whole slice with the zfs vdev? Did the
>    system complain (much) about the renaming of the device compared
>    to the paths embedded in the pool/vdev headers? Did you do
>    anything manually to remedy that (forcing import, DDing some
>    handcrafted uberblocks, anything)?

I've done it a couple of times at least:

 * A failed disk in a raidz1, where I didn't trust that the other
   disks didn't also have errors. Basically did a ddrescue from the
   old disk to the new. I think these days a 'replace' where the
   original disk is still online will use that content, like a
   hotspare replace, rather than assume it has gone away and must be
   recreated, but that wasn't the case at the time.

 * Where I had an iscsi mirror of a laptop hard disk, but it was out
   of date and had been detached when the laptop's iscsi initiator
   refused to start. Later, the disk developed a few bad sectors. I
   made a new submirror, let it sync (with the errors still present),
   then blatted bits of the old image over the new one in the areas
   where the bad sectors were being reported. Scrub again, and they
   were fixed (as well as some blocks on the new submirror repaired,
   coming back up to date again).

> 2) How did you treat errors as expected during the scrub?

Pretty much as you did: decline to panic and restart scrubs.

--
Dan.
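For anyone wanting to try the first variant, the ddrescue-then-scrub
recovery Dan describes amounts to roughly the following sketch. The
pool name 'tank', the device paths and the mapfile location are
made-up examples, and GNU ddrescue has to be installed separately on
Solaris-derived systems; exactly how the copy is brought back in view
of the pool depends on the cabling and device paths involved.

#!/bin/bash
# Rough sketch: copy a failing raidz1 member with GNU ddrescue, then
# let a scrub repair whatever could not be copied.  All names below
# are hypothetical.

# Copy as much as possible from the failing disk, keeping a mapfile so
# interrupted or retried runs skip areas already read or known bad.
ddrescue -f /dev/dsk/c2t3d0s0 /dev/dsk/c2t4d0s0 /var/tmp/c2t3d0.map

# Make the copy visible where the pool expects a member (swap cabling,
# or re-import by the new path), then scrub.  Blocks that were
# unreadable on the source fail their checksums on the copy and are
# rebuilt from the other raidz1 members.
zpool import tank        # only if the pool is not already imported
zpool scrub tank

# CKSUM errors on the copied device are expected; review and then
# clear them once the scrub has repaired the affected blocks.
zpool status -v tank
zpool clear tank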