Re: [zfs-discuss] replacing a device with itself doesn't work

2007-10-08 Thread Pawel Jakub Dawidek
On Wed, Oct 03, 2007 at 10:02:03PM +0200, Pawel Jakub Dawidek wrote:
 On Wed, Oct 03, 2007 at 12:10:19PM -0700, Richard Elling wrote:
   -
   
   # zpool scrub tank
   # zpool status -v tank
      pool: tank
     state: ONLINE
    status: One or more devices could not be used because the label is
            missing or invalid.  Sufficient replicas exist for the pool to
            continue functioning in a degraded state.
    action: Replace the device using 'zpool replace'.
       see: http://www.sun.com/msg/ZFS-8000-4J
     scrub: resilver completed with 0 errors on Wed Oct  3 18:45:06 2007
    config:

            NAME        STATE     READ WRITE CKSUM
            tank        ONLINE       0     0     0
              raidz1    ONLINE       0     0     0
                md0     UNAVAIL      0     0     0  corrupted data
                md1     ONLINE       0     0     0
                md2     ONLINE       0     0     0

    errors: No known data errors
   # zpool replace tank md0
   invalid vdev specification
   use '-f' to override the following errors:
   md0 is in use (r1w1e1)
   # zpool replace -f tank md0
   invalid vdev specification
   the following errors must be manually repaired:
   md0 is in use (r1w1e1)
   
   -
    Well, the advice of 'zpool replace' doesn't work. At this point the user
    is stuck; there seems to be no way to use the existing device md0.
  
  In Solaris NV b72, this works as you expect.
  # zpool replace zwimming /dev/ramdisk/rd1
  # zpool status -v zwimming
     pool: zwimming
    state: DEGRADED
    scrub: resilver completed with 0 errors on Wed Oct  3 11:55:36 2007
   config:

           NAME                        STATE     READ WRITE CKSUM
           zwimming                    DEGRADED     0     0     0
             raidz1                    DEGRADED     0     0     0
               replacing               DEGRADED     0     0     0
                 /dev/ramdisk/rd1/old  FAULTED      0     0     0  corrupted data
                 /dev/ramdisk/rd1      ONLINE       0     0     0
               /dev/ramdisk/rd2        ONLINE       0     0     0
               /dev/ramdisk/rd3        ONLINE       0     0     0

   errors: No known data errors
  # zpool status -v zwimming
     pool: zwimming
    state: ONLINE
    scrub: resilver completed with 0 errors on Wed Oct  3 11:55:36 2007
   config:

           NAME                  STATE     READ WRITE CKSUM
           zwimming              ONLINE       0     0     0
             raidz1              ONLINE       0     0     0
               /dev/ramdisk/rd1  ONLINE       0     0     0
               /dev/ramdisk/rd2  ONLINE       0     0     0
               /dev/ramdisk/rd3  ONLINE       0     0     0

   errors: No known data errors
 
 Good to know, but I think it's still partly ZFS's fault. The error
 message 'md0 is in use (r1w1e1)' means that something (I'm quite sure
 it's ZFS) keeps the device open. Why does it keep it open when it doesn't
 recognize it? Or maybe it tries to open it twice for write (exclusively)
 when replacing, which GEOM on FreeBSD does not allow.
 
 I can take a look at whether it's the former or the latter, but it should be
 fixed in ZFS itself, IMHO.

Ok, it seems that it was fixed in ZFS itself already:

/*
 * If we are setting the vdev state to anything but an open state, then
 * always close the underlying device.  Otherwise, we keep accessible
 * but invalid devices open forever.  We don't call vdev_close() itself,
 * because that implies some extra checks (offline, etc) that we don't
 * want here.  This is limited to leaf devices, because otherwise
 * closing the device will affect other children.
 */
if (vdev_is_dead(vd) && vd->vdev_ops->vdev_op_leaf)
	vd->vdev_ops->vdev_op_close(vd);

The ZFS version in FreeBSD-CURRENT doesn't have this code yet; it's only in
my Perforce branch for now. I'll verify later today whether it really fixes the
problem and will report back if not.
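
For reference, here is a condensed version of MP's reproduction from earlier in
the thread, which this change should (if my reading of it is right) allow to
complete without the 'in use' error. This is a sketch, not a captured session;
the sizes and names are the ones from the original example:

# mdconfig -a -tswap -s64m                (creates md0)
# mdconfig -a -tswap -s64m                (creates md1)
# mdconfig -a -tswap -s64m                (creates md2)
# zpool create tank raidz md0 md1 md2
# zpool offline tank md0
# dd if=/dev/zero of=/dev/md0 bs=1m
# zpool online tank md0                   (md0 now shows UNAVAIL, corrupted data)
# zpool scrub tank
# zpool replace tank md0                  (should no longer fail with
                                           "md0 is in use (r1w1e1)")
# zpool status -v tank

If the dead vdev really is closed as the code above intends, the plain
self-replace at the end should go through and resilver md0.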

-- 
Pawel Jakub Dawidek   http://www.wheel.pl
[EMAIL PROTECTED]   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!



Re: [zfs-discuss] replacing a device with itself doesn't work

2007-10-06 Thread MP
Pawel,
  Is this a problem with ZFS trying to open the device twice?

Richard,
  Yes, a scrub should fix the device. One of ZFS's features is ease of
administration. It seems to defy logic that a scrub does not fix every device
it can. Why make it any harder for the admin?

Cheers.
 
 


Re: [zfs-discuss] replacing a device with itself doesn't work

2007-10-03 Thread Richard Elling
MP wrote:
 Hi,
 I hope someone can help, because at the moment ZFS's logic seems a little askew.
 I just swapped a failing 200 GB drive that was one half of a 400 GB gstripe
 device which I was using as one of the devices in a 3-device raidz1. When the
 OS came back up after the drive had been changed, the necessary metadata was
 of course not on the new drive, so the stripe didn't exist. ZFS understandably
 complained that it couldn't open the stripe; however, it did not show the array
 as degraded. I didn't save the output, but it was just like described in this
 thread:
 
 http://www.nabble.com/Shooting-yourself-in-the-foot-with-ZFS:-is-quite-easy-t4512790.html
 
 I recreated the gstripe device under the same name stripe/str1 and assumed I 
 could just:
 
 # zpool replace pool stripe/str1
 invalid vdev specification
 stripe/str1 is in use (r1w1e1)
 
 It also told me to try -f, which I did, but I was greeted with the same error.
 Why can I not replace a device with itself?
 As the man page describes just this procedure, I'm a little confused.
 Try as I might (online, offline, scrub), I could not get the array to rebuild,
 just like the guy in that thread above. I eventually resorted to recreating
 the stripe with a different name, stripe/str2. I could then perform a:
 
 # zpool replace pool stripe/str1 stripe/str2
 
 Is there a reason I have to jump through these seemingly pointless hoops to 
 replace a device with itself?
 Many thanks.

Yes.  From the fine manual on zpool:
  zpool replace [-f] pool old_device [new_device]

  Replaces old_device with new_device. This is  equivalent
  to attaching new_device, waiting for it to resilver, and
  then detaching old_device.
...
  If  new_device  is  not  specified,   it   defaults   to
  old_device.  This form of replacement is useful after an
  existing  disk  has  failed  and  has  been   physically
  replaced.  In  this case, the new disk may have the same
  /dev/dsk path as the old device, even though it is actu-
  ally a different disk. ZFS recognizes this.

For a stripe, you don't have redundancy, so you cannot replace the
disk with itself.  You would have to specify the [new_device].
I've submitted CR6612596 for a better error message and CR6612605
to mention this in the man page.
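
To make the two forms concrete, using the device names from MP's report (a
sketch, not a transcript):

  # zpool replace pool stripe/str1                (self-replace; new_device
                                                   defaults to old_device,
                                                   the form that failed here)
  # zpool replace pool stripe/str1 stripe/str2    (explicit replacement with a
                                                   differently named device,
                                                   the form MP eventually used)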
  -- richard


Re: [zfs-discuss] replacing a device with itself doesn't work

2007-10-03 Thread Richard Elling
more below...

MP wrote:
 On 03/10/2007, Richard Elling [EMAIL PROTECTED] wrote:
 
 Yes.  From the fine manual on zpool:
   zpool replace [-f] pool old_device [new_device]
 
   Replaces old_device with new_device. This is  equivalent
   to attaching new_device, waiting for it to resilver, and
   then detaching old_device.
 ...
   If  new_device  is  not  specified,   it   defaults   to
   old_device.  This form of replacement is useful after an
   existing  disk  has  failed  and  has  been   physically
   replaced.  In  this case, the new disk may have the same
   /dev/dsk path as the old device, even though it is actu-
   ally a different disk. ZFS recognizes this.
 
 For a stripe, you don't have redundancy, so you cannot replace the
 disk with itself. 
 
 
 I don't see how a stripe makes a difference. It's just two drives joined
 together logically to make a new device. It can be used by the system just
 like a normal hard drive, and just like a normal hard drive it too has no
 redundancy, right?

Correct.  It would be redundant if it were a mirror, raidz, or raidz2.  In the
case of stripes of mirrors, raidz, or raidz2 vdevs, they are redundant.

 You would have to specify the [new_device]
 I've submitted CR6612596 for a better error message and CR6612605
 to mention this in the man page.
 
 
 Perhaps I was a little unclear. ZFS did a few things during this whole
 escapade that seemed wrong.
 
 # mdconfig -a -tswap -s64m
 md0
 # mdconfig -a -tswap -s64m
 md1
 # mdconfig -a -tswap -s64m
 md2

I presume you're not running Solaris, so please excuse me if I take a
Solaris view to this problem.

 # zpool create tank raidz md0 md1 md2
 # zpool status -v tank
   pool: tank
  state: ONLINE
  scrub: none requested
 config:
 
          NAME        STATE     READ WRITE CKSUM
          tank        ONLINE       0     0     0
            raidz1    ONLINE       0     0     0
              md0     ONLINE       0     0     0
              md1     ONLINE       0     0     0
              md2     ONLINE       0     0     0
 
 errors: No known data errors
 # zpool offline tank md0
 Bringing device md0 offline
 # dd if=/dev/zero of=/dev/md0 bs=1m
 dd: /dev/md0: end of device
 65+0 records in
 64+0 records out
 67108864 bytes transferred in 0.044925 secs (1493798602 bytes/sec)
 # zpool status -v tank
   pool: tank
  state: DEGRADED
 status: One or more devices has been taken offline by the administrator.
 Sufficient replicas exist for the pool to continue functioning in a
 degraded state.
 action: Online the device using 'zpool online' or replace the device with
 'zpool replace'.
  scrub: none requested
 config:
 
          NAME        STATE     READ WRITE CKSUM
          tank        DEGRADED     0     0     0
            raidz1    DEGRADED     0     0     0
              md0     OFFLINE      0     0     0
              md1     ONLINE       0     0     0
              md2     ONLINE       0     0     0
 
 errors: No known data errors
 
 
 At this point, while the drive is offline, a 'zpool replace tank md0' will
 fix the array.

Correct.  The pool is redundant.
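
For contrast with what follows, the path that works looks roughly like this
(same pool as in the transcript above; a sketch rather than captured output):

 # zpool offline tank md0
 # zpool replace tank md0      (no -f needed; the pool resilvers onto md0)
 # zpool status -v tank        (md0 returns to ONLINE once the resilver finishes)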

 However, if instead the other advice given, 'zpool online tank md0', is
 used, then problems start to occur:
 
 
 # zpool online tank md0
 # zpool status -v tank
   pool: tank
  state: ONLINE
 status: One or more devices could not be used because the label is missing or
         invalid.  Sufficient replicas exist for the pool to continue
         functioning in a degraded state.
 action: Replace the device using 'zpool replace'.
    see: http://www.sun.com/msg/ZFS-8000-4J
  scrub: resilver completed with 0 errors on Wed Oct  3 18:44:22 2007
 config:
 
         NAME        STATE     READ WRITE CKSUM
         tank        ONLINE       0     0     0
           raidz1    ONLINE       0     0     0
             md0     UNAVAIL      0     0     0  corrupted data
             md1     ONLINE       0     0     0
             md2     ONLINE       0     0     0
 
 errors: No known data errors
 
 -
 ^^^
 Surely this is wrong? zpool shows the pool as 'ONLINE' and not degraded,
 whereas the status explanation says that it is degraded and 'zpool replace'
 is required. That's just confusing.

I agree, I would expect the STATE to be DEGRADED.

 -
 
 # zpool scrub tank
 # zpool status -v tank
   pool: tank
  state: ONLINE
 status: One or more devices could not be used because the label is missing or
         invalid.  Sufficient replicas exist for the pool to continue
         functioning in a degraded state.
 action: Replace the device using 'zpool replace'.
    see: http://www.sun.com/msg/ZFS-8000-4J
  scrub: resilver completed with 0 errors on Wed Oct  3 18:45:06 2007
 config:
 
         NAME        STATE     READ WRITE CKSUM
   

Re: [zfs-discuss] replacing a device with itself doesn't work

2007-10-03 Thread Pawel Jakub Dawidek
On Wed, Oct 03, 2007 at 12:10:19PM -0700, Richard Elling wrote:
  -
  
  # zpool scrub tank
  # zpool status -v tank
     pool: tank
    state: ONLINE
   status: One or more devices could not be used because the label is
           missing or invalid.  Sufficient replicas exist for the pool to
           continue functioning in a degraded state.
   action: Replace the device using 'zpool replace'.
      see: http://www.sun.com/msg/ZFS-8000-4J
    scrub: resilver completed with 0 errors on Wed Oct  3 18:45:06 2007
   config:

           NAME        STATE     READ WRITE CKSUM
           tank        ONLINE       0     0     0
             raidz1    ONLINE       0     0     0
               md0     UNAVAIL      0     0     0  corrupted data
               md1     ONLINE       0     0     0
               md2     ONLINE       0     0     0

   errors: No known data errors
  # zpool replace tank md0
  invalid vdev specification
  use '-f' to override the following errors:
  md0 is in use (r1w1e1)
  # zpool replace -f tank md0
  invalid vdev specification
  the following errors must be manually repaired:
  md0 is in use (r1w1e1)
  
  -
  Well, the advice of 'zpool replace' doesn't work. At this point the user
  is stuck; there seems to be no way to use the existing device md0.
 
 In Solaris NV b72, this works as you expect.
 # zpool replace zwimming /dev/ramdisk/rd1
 # zpool status -v zwimming
    pool: zwimming
   state: DEGRADED
   scrub: resilver completed with 0 errors on Wed Oct  3 11:55:36 2007
  config:

          NAME                        STATE     READ WRITE CKSUM
          zwimming                    DEGRADED     0     0     0
            raidz1                    DEGRADED     0     0     0
              replacing               DEGRADED     0     0     0
                /dev/ramdisk/rd1/old  FAULTED      0     0     0  corrupted data
                /dev/ramdisk/rd1      ONLINE       0     0     0
              /dev/ramdisk/rd2        ONLINE       0     0     0
              /dev/ramdisk/rd3        ONLINE       0     0     0

  errors: No known data errors
 # zpool status -v zwimming
    pool: zwimming
   state: ONLINE
   scrub: resilver completed with 0 errors on Wed Oct  3 11:55:36 2007
  config:

          NAME                  STATE     READ WRITE CKSUM
          zwimming              ONLINE       0     0     0
            raidz1              ONLINE       0     0     0
              /dev/ramdisk/rd1  ONLINE       0     0     0
              /dev/ramdisk/rd2  ONLINE       0     0     0
              /dev/ramdisk/rd3  ONLINE       0     0     0

  errors: No known data errors

Good to know, but I think it's still partly ZFS's fault. The error
message 'md0 is in use (r1w1e1)' means that something (I'm quite sure
it's ZFS) keeps the device open. Why does it keep it open when it doesn't
recognize it? Or maybe it tries to open it twice for write (exclusively)
when replacing, which GEOM on FreeBSD does not allow.

I can take a look at whether it's the former or the latter, but it should be
fixed in ZFS itself, IMHO.
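
One way to see those reference counts on FreeBSD is to ask GEOM directly; for
example (exact output layout aside), the Mode field is the r/w/e count that the
error message quotes:

# geom md list md0 | grep -E 'Name|Mode'

A non-zero e (exclusive) count there means something already holds the provider
open exclusively, which would block a second exclusive open.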

-- 
Pawel Jakub Dawidek   http://www.wheel.pl
[EMAIL PROTECTED]   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!




Re: [zfs-discuss] replacing a device with itself doesn't work

2007-10-03 Thread MC
I think I might have run into the same problem.  At the time I assumed I was 
doing something wrong, but...

I made a b72 raidz out of three new 1 GB virtual disks in VMware.  I shut the VM
off and replaced one of the disks with a new 1.5 GB virtual disk.  No matter what
command I tried, I couldn't get the new disk into the array.  The docs said
that replacing the vdev with itself would work, but it didn't.  Nor did setting
the 'automatic replace' feature on the pool and plugging a new device in.  I
recall most of the errors being 'device in use'.

Maybe I wasn't the problem after all?  0_o
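
For reference, the 'automatic replace' feature mentioned above is the pool-level
autoreplace property; checking and enabling it looks roughly like this (pool
name is just an example):

# zpool get autoreplace tank
# zpool set autoreplace=on tank

Note that autoreplace only applies to a new device found in the same physical
location as the one that failed, so whether it helps depends on the replacement
disk showing up at the same device path.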
 
 