On Sat, Nov 27, 2010 at 03:22:49PM +0200, Gareth de Vaux wrote:
> Hi all, I'm trying to simulate a disk fail and replacement in
> a raidz array and failing myself. What'm I doing wrong? Here's
> a transcript with interspersed commentary:
> 
> r...@file:~# zpool status
>   pool: raid
>  state: ONLINE
>  scrub: scrub completed after 0h0m with 0 errors on Sat Nov 27 13:20:06 2010
> config:
> 
>       NAME        STATE     READ WRITE CKSUM
>       raid        ONLINE       0     0     0
>         raidz1    ONLINE       0     0     0
>           ad12    ONLINE       0     0     0
>           ad13    ONLINE       0     0     0
>           ad4     ONLINE       0     0     0
>           ad6     ONLINE       0     0     0
> 
> errors: No known data errors
> r...@file:~# zpool offline raid ad12
> 
> reboot
> dd if=/dev/zero of=/dev/ad12 ..
> 
> r...@file:~# zpool replace raid ad12
> cannot replace ad12 with ad12: ad12 is busy
> r...@file:~# zpool replace -f raid ad12
> cannot replace ad12 with ad12: ad12 is busy
> 
>       The handbook suggests 'replace' but I guess this is only
>       if the disk is physically replaced and gets a new identifier?
>       Trying with 'online':
> 
> r...@file:~# zpool online raid ad12
> r...@file:~# zpool status
>   pool: raid
>  state: ONLINE
>  scrub: resilver completed after 0h0m with 0 errors on Sat Nov 27 13:29:14 2010
> config:
> 
>       NAME        STATE     READ WRITE CKSUM
>       raid        ONLINE       0     0     0
>         raidz1    ONLINE       0     0     0
>           ad12    ONLINE       0     0     0  15.5K resilvered
>           ad13    ONLINE       0     0     0
>           ad4     ONLINE       0     0     0
>           ad6     ONLINE       0     0     0
> 
> errors: No known data errors
> 
>       Output remains as such, is this normal?
> 
> r...@file:~# zpool scrub raid
> r...@file:~# zpool status
>   pool: raid
>  state: ONLINE
> status: One or more devices has experienced an unrecoverable error.  An
>       attempt was made to correct the error.  Applications are unaffected.
> action: Determine if the device needs to be replaced, and clear the errors
>       using 'zpool clear' or replace the device with 'zpool replace'.
>    see: http://www.sun.com/msg/ZFS-8000-9P
>  scrub: scrub completed after 0h0m with 0 errors on Sat Nov 27 13:30:37 2010
> config:
> 
>       NAME        STATE     READ WRITE CKSUM
>       raid        ONLINE       0     0     0
>         raidz1    ONLINE       0     0     0
>           ad12    ONLINE       0     0 2.11K  87.7M repaired
>           ad13    ONLINE       0     0     0
>           ad4     ONLINE       0     0     0
>           ad6     ONLINE       0     0     0
> 
> errors: No known data errors
> r...@file:~# zpool scrub raid
> r...@file:~# zpool status
>   pool: raid
>  state: ONLINE
> status: One or more devices has experienced an unrecoverable error.  An
>       attempt was made to correct the error.  Applications are unaffected.
> action: Determine if the device needs to be replaced, and clear the errors
>       using 'zpool clear' or replace the device with 'zpool replace'.
>    see: http://www.sun.com/msg/ZFS-8000-9P
>  scrub: scrub completed after 0h0m with 0 errors on Sat Nov 27 13:30:55 2010
> config:
> 
>       NAME        STATE     READ WRITE CKSUM
>       raid        ONLINE       0     0     0
>         raidz1    ONLINE       0     0     0
>           ad12    ONLINE       0     0 2.11K
>           ad13    ONLINE       0     0     0
>           ad4     ONLINE       0     0     0
>           ad6     ONLINE       0     0     0
> 
> errors: No known data errors
> 
>       These are checksum errors? So the disk hasn't been integrated
>       properly?
> 
> r...@file:~# zpool clear raid ad12
> r...@file:~# zpool status
>   pool: raid
>  state: ONLINE
>  scrub: scrub completed after 0h0m with 0 errors on Sat Nov 27 13:39:09 2010
> config:
> 
>       NAME        STATE     READ WRITE CKSUM
>       raid        ONLINE       0     0     0
>         raidz1    ONLINE       0     0     0
>           ad12    ONLINE       0     0     0
>           ad13    ONLINE       0     0     0
>           ad4     ONLINE       0     0     0
>           ad6     ONLINE       0     0     0
> 
> errors: No known data errors
> r...@file:~# zpool status -x
> all pools are healthy
> 
>       To make sure this's the case I fail a different disk:
> 
> r...@file:~# zpool offline raid ad6
> r...@file:~# zpool status   
>   pool: raid
>  state: DEGRADED
> status: One or more devices has been taken offline by the administrator.
>       Sufficient replicas exist for the pool to continue functioning in a
>       degraded state.
> action: Online the device using 'zpool online' or replace the device with
>       'zpool replace'.
>  scrub: scrub completed after 0h0m with 0 errors on Sat Nov 27 13:40:52 2010
> config:
> 
>       NAME        STATE     READ WRITE CKSUM
>       raid        DEGRADED     0     0     0
>         raidz1    DEGRADED     0     0     0
>           ad12    ONLINE       0     0     0
>           ad13    ONLINE       0     0     0
>           ad4     ONLINE       0     0     0
>           ad6     OFFLINE      0     0     0
> 
> errors: No known data errors
> 
>       on reboot the status changes:
> 
> r...@file:~# zpool status
>   pool: raid
>  state: FAULTED
> status: The pool metadata is corrupted and the pool cannot be opened.
> action: Destroy and re-create the pool from a backup source.
>    see: http://www.sun.com/msg/ZFS-8000-72
>  scrub: none requested
> config:
> 
>       NAME        STATE     READ WRITE CKSUM
>       raid        FAULTED      0     0     1  corrupted data
>         raidz1    DEGRADED     0     0     6
>           ad12    OFFLINE      0     0     0
>           ad13    ONLINE       0     0     0
>           ad4     ONLINE       0     0     0
>           ad6     ONLINE       0     0     1
> 
> 
> The same happens if I recreate the array and try again.

uname -a please -- it matters greatly.

-- 
| Jeremy Chadwick                                   j...@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
