Re: [zfs-discuss] Can you manually trigger spares?

2010-03-09 Thread Mark J Musante

On Mon, 8 Mar 2010, Tim Cook wrote:


Is there a way to manually trigger a hot spare to kick in?


Yes - just use 'zpool replace fserv 12589257915302950264 c3t6d0'.  That's 
all the fma service does anyway.
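
For anyone scripting this step, here is a minimal sketch, assuming `zpool status` text is supplied on stdin. The function name `suggest_spare_replace` is hypothetical and not part of ZFS. It only prints a `zpool replace` command for the first UNAVAIL device and the first AVAIL spare it finds; it does not run anything, so the command can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper (not a ZFS tool): reads `zpool status` output on stdin
# and prints a suggested `zpool replace` command. It never runs zpool itself.
suggest_spare_replace() {
    awk '
        # Remember the pool name from the "pool: NAME" line.
        $1 == "pool:" { pool = $2 }
        # Everything after the "spares" heading is treated as a spare entry.
        $1 == "spares" { in_spares = 1; next }
        # First device outside the spares section whose state is UNAVAIL.
        !in_spares && $2 == "UNAVAIL" && failed == "" { failed = $1 }
        # First spare whose state is AVAIL.
        in_spares && $2 == "AVAIL" && spare == "" { spare = $1 }
        END {
            if (pool != "" && failed != "" && spare != "") {
                print "zpool replace " pool " " failed " " spare
            } else {
                exit 1   # nothing to suggest
            }
        }
    '
}

# Example use (assumes a single pool in the output):
#   zpool status fserv | suggest_spare_replace
```

The sketch assumes one pool per input and does not distinguish a failed cache device from a failed data device, so treat its output as a suggestion only.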


If you ever get your drive to come back online, the fma service should 
recognize that and resilver it, switching the spare back to AVAIL.



Regards,
markm
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Can you manually trigger spares?

2010-03-08 Thread Cindy Swearingen

Hi Tim,

I'm not sure why your spare isn't kicking in, but you could manually
replace the failed disk with the spare like this:

# zpool replace fserv c7t5d0 c3t6d0

If you want to run with the spare for a while, then you can also detach
the original failed disk like this:

# zpool detach fserv c7t5d0

I don't know why the device name changed either.

See a similar example below.

Thanks,

Cindy

# zpool create -f tank raidz2 c2t0d0 c2t1d0 c2t2d0 spare c2t3d0
# zpool status tank
  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            c2t0d0  ONLINE       0     0     0
            c2t1d0  ONLINE       0     0     0
            c2t2d0  ONLINE       0     0     0
        spares
          c2t3d0    AVAIL
# zpool replace tank c2t2d0 c2t3d0
# zpool status tank
  pool: tank
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Mon Mar  8 12:03:37 2010
config:

        NAME          STATE     READ WRITE CKSUM
        tank          ONLINE       0     0     0
          raidz2-0    ONLINE       0     0     0
            c2t0d0    ONLINE       0     0     0
            c2t1d0    ONLINE       0     0     0
            spare-2   ONLINE       0     0     0
              c2t2d0  ONLINE       0     0     0
              c2t3d0  ONLINE       0     0     0  91.5K resilvered
        spares
          c2t3d0      INUSE     currently in use

errors: No known data errors
# zpool detach tank c2t2d0
# zpool status tank
  pool: tank
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Mon Mar  8 12:03:37 2010
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            c2t0d0  ONLINE       0     0     0
            c2t1d0  ONLINE       0     0     0
            c2t3d0  ONLINE       0     0     0  91.5K resilvered

errors: No known data errors





On 03/08/10 11:33, Tim Cook wrote:
Is there a way to manually trigger a hot spare to kick in?  Mine doesn't 
appear to be doing so.  What happened is I exported a pool to reinstall 
solaris on this system.  When I went to re-import it, one of the drives 
refused to come back online.  So, the pool imported degraded, but it 
doesn't seem to want to use the hot spare... I've tried triggering a 
scrub to see if that would give it a kick, but no-go.


r...@fserv:~$ zpool status
  pool: fserv
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: scrub completed after 3h19m with 0 errors on Mon Mar  8 02:28:08 2010
config:

        NAME                      STATE     READ WRITE CKSUM
        fserv                     DEGRADED     0     0     0
          raidz2-0                DEGRADED     0     0     0
            c2t0d0                ONLINE       0     0     0
            c2t1d0                ONLINE       0     0     0
            c2t2d0                ONLINE       0     0     0
            c2t3d0                ONLINE       0     0     0
            c2t4d0                ONLINE       0     0     0
            c2t5d0                ONLINE       0     0     0
            c3t0d0                ONLINE       0     0     0
            c3t1d0                ONLINE       0     0     0
            c3t2d0                ONLINE       0     0     0
            c3t3d0                ONLINE       0     0     0
            c3t4d0                ONLINE       0     0     0
            12589257915302950264  UNAVAIL      0     0     0  was /dev/dsk/c7t5d0s0
        spares
          c3t6d0                  AVAIL

--Tim






Re: [zfs-discuss] Can you manually trigger spares?

2010-03-08 Thread Ian Collins

Tim Cook wrote:
Is there a way to manually trigger a hot spare to kick in?  Mine 
doesn't appear to be doing so.  What happened is I exported a pool to 
reinstall solaris on this system.  When I went to re-import it, one of 
the drives refused to come back online.  So, the pool imported 
degraded, but it doesn't seem to want to use the hot spare... I've 
tried triggering a scrub to see if that would give it a kick, but no-go.


Have you tried zpool replace (you might have to remove the spare from 
the pool first)?   Is the spare at least as big as the faulted drive?  
If not, replace would fail and you'd see why.


--
Ian.



Re: [zfs-discuss] Can you manually trigger spares?

2010-03-08 Thread Giovanni Tirloni
On Mon, Mar 8, 2010 at 3:33 PM, Tim Cook t...@cook.ms wrote:

 Is there a way to manually trigger a hot spare to kick in?  Mine doesn't
 appear to be doing so.  What happened is I exported a pool to reinstall
 solaris on this system.  When I went to re-import it, one of the drives
 refused to come back online.  So, the pool imported degraded, but it doesn't
 seem to want to use the hot spare... I've tried triggering a scrub to see if
 that would give it a kick, but no-go.


uts/common/fs/zfs/vdev.c says:

/*
 * If we fail to open a vdev during an import, we mark it as
 * not available, which signifies that it was never there to
 * begin with.  Failure to open such a device is not considered
 * an error.
 */

If there is no error then the fault management code probably doesn't kick in
and autoreplace isn't triggered.




 r...@fserv:~$ zpool status
   pool: fserv
  state: DEGRADED
 status: One or more devices could not be opened.  Sufficient replicas exist for
         the pool to continue functioning in a degraded state.
 action: Attach the missing device and online it using 'zpool online'.
    see: http://www.sun.com/msg/ZFS-8000-2Q
  scrub: scrub completed after 3h19m with 0 errors on Mon Mar  8 02:28:08 2010
 config:

         NAME                      STATE     READ WRITE CKSUM
         fserv                     DEGRADED     0     0     0
           raidz2-0                DEGRADED     0     0     0
             c2t0d0                ONLINE       0     0     0
             c2t1d0                ONLINE       0     0     0
             c2t2d0                ONLINE       0     0     0
             c2t3d0                ONLINE       0     0     0
             c2t4d0                ONLINE       0     0     0
             c2t5d0                ONLINE       0     0     0
             c3t0d0                ONLINE       0     0     0
             c3t1d0                ONLINE       0     0     0
             c3t2d0                ONLINE       0     0     0
             c3t3d0                ONLINE       0     0     0
             c3t4d0                ONLINE       0     0     0
             12589257915302950264  UNAVAIL      0     0     0  was /dev/dsk/c7t5d0s0
         spares
           c3t6d0                  AVAIL


That crazy device name is the vdev's GUID (you can see it with, e.g.,
`zdb -l /dev/rdsk/c3t1d0s0`).

I was able to replicate your situation here.

# uname -a
SunOS osol-dev 5.11 snv_133 i86pc i386 i86pc Solaris

# zpool status tank
  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c6t0d0  ONLINE       0     0     0
            c6t1d0  ONLINE       0     0     0
        cache
          c6t2d0    ONLINE       0     0     0
        spares
          c6t3d0    AVAIL

errors: No known data errors

# zpool export tank

removed c6t1d0

# zpool import tank

# zpool status tank
  pool: tank
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: none requested
config:

        NAME                     STATE     READ WRITE CKSUM
        tank                     DEGRADED     0     0     0
          mirror-0               DEGRADED     0     0     0
            6462738093222634405  UNAVAIL      0     0     0  was /dev/dsk/c6t0d0s0
            c6t1d0               ONLINE       0     0     0
        cache
          c6t2d0                 ONLINE       0     0     0
        spares
          c6t3d0                 AVAIL

errors: No known data errors

# zpool get autoreplace tank
NAME  PROPERTY     VALUE    SOURCE
tank  autoreplace  on       local

# fmdump -e -t 08Mar2010
TIME CLASS

As you can see, no error report was posted. You can try to import the pool
again and see if `fmdump -e` lists any errors afterwards.
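
If you want to script that check, here is a rough sketch, assuming the `fmdump -e` listing is piped in on stdin and that each error report line contains a class name beginning with `ereport.`. The function name `count_ereports` is hypothetical, not an FMA tool.

```shell
#!/bin/sh
# Hypothetical helper: counts error-report lines in `fmdump -e` output read
# from stdin. Assumes each report line carries a class containing "ereport.".
count_ereports() {
    awk '/ereport\./ { n++ } END { print n + 0 }'
}

# Example use, with the same date filter as above:
#   fmdump -e -t 08Mar2010 | count_ereports
```

A count of 0 after the import matches the situation described here: no error report was posted, so the fault management code had nothing to act on.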

You use the spare with `zpool replace`.

-- 
Giovanni Tirloni
sysdroid.com