I'm removing the In-Reply-To mail headers for this thread, as you've now hijacked it for a different purpose. Please don't do this; start a new thread altogether. :-)
On Tue, Jan 26, 2010 at 02:57:20PM +0100, Gerrit Kühn wrote:
> I am still busy replacing RE2-disks with updated drives. I came across a
> very strange thing with zfs. Actually I had the following pool layout:
>
> mclane# zpool status
>   pool: tank
>  state: ONLINE
>  scrub: none requested
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         tank        ONLINE       0     0     0
>           raidz1    ONLINE       0     0     0
>             ad8     ONLINE       0     0     0
>             ad10    ONLINE       0     0     0
>             ad12    ONLINE       0     0     0
>         spares
>           ad14      AVAIL
>
> errors: No known data errors
>
> All disks still have the firmware bug, so I want to replace them with
> disks that I already fixed. I put in an updated drive as ad18 and
> wanted to replace ad12 to get the drive with the broken firmware out:
>
> mclane# zpool replace tank /dev/ad12 /dev/ad18
> mclane# zpool status
>   pool: tank
>  state: ONLINE
> status: One or more devices is currently being resilvered. The pool will
>         continue to function, possibly in a degraded state.
> action: Wait for the resilver to complete.
>  scrub: resilver in progress for 0h0m, 0.01% done, 52h51m to go
> config:
>
>         NAME           STATE     READ WRITE CKSUM
>         tank           ONLINE       0     0     0
>           raidz1       ONLINE       0     0     0
>             ad8        ONLINE       0     0     0  7.21M resilvered
>             ad10       ONLINE       0     0     0  7.22M resilvered
>             replacing  ONLINE       0     0     0
>               ad12     ONLINE       0     0     0
>               ad18     ONLINE       0     0     0  10.7M resilvered
>         spares
>           ad14         AVAIL
>
> errors: No known data errors
>
> However, something must have gone wrong during the resilvering process and
> it now looks like this:
>
> mclane# zpool status
>   pool: tank
>  state: DEGRADED
> status: One or more devices has experienced an unrecoverable error. An
>         attempt was made to correct the error. Applications are unaffected.
> action: Determine if the device needs to be replaced, and clear the errors
>         using 'zpool clear' or replace the device with 'zpool replace'.
>    see: http://www.sun.com/msg/ZFS-8000-9P
>  scrub: resilver completed after 2h39m with 0 errors on Tue Jan 26 14:00:00 2010
> config:
>
>         NAME           STATE     READ WRITE CKSUM
>         tank           DEGRADED     0     0     0
>           raidz1       DEGRADED     0     0     0
>             ad8        ONLINE       0     0     0  975M resilvered
>             ad10       ONLINE       0     0   142  974M resilvered
>             replacing  DEGRADED     0 7.25M     0
>               ad12     ONLINE       0     0     0
>               ad18     REMOVED      0     1     0  79.4M resilvered
>         spares
>           ad14         AVAIL
>
> errors: No known data errors
>
>
> What is going on here? ad18 obviously detached during the
> process. /var/log/messages just gives me
>
> Jan 26 11:23:33 mclane kernel: ad18: FAILURE - device detached
>
> Additionally ad10 obviously produced chksum errors. What do I do about the
> degraded replacing process? Can I terminate it somehow and maybe replace
> ad10 first? Any other hints?

I'm not sure how the above is supposed to work (I haven't personally
tried it), but:

1) Why didn't you offline the ad10 disk first?

   zpool offline tank ad10

2) How did you attach ad18? Did you tell the system about it using
   atacontrol? If so, what commands did you use?

3) Can you please provide uname -a output, as well as relevant dmesg
   output to show what kind of SATA controller you have, what's attached
   to what, etc.?

-- 
| Jeremy Chadwick                                   j...@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |