William D. Hathaway wrote:
I'm running Nevada build 60 inside VMware; it is a test rig with no data of value.
SunOS b60 5.11 snv_60 i86pc i386 i86pc
I wanted to check out the FMA handling of a serious zpool error, so I did the 
following:

2007-04-07.08:46:31 zpool create tank mirror c0d1 c1d1
2007-04-07.15:21:37 zpool scrub tank
(inserted some errors with dd on one device to see if it showed up, which it 
did, but healed fine)
2007-04-07.15:22:12 zpool scrub tank
2007-04-07.15:22:46 zpool clear tank c1d1
(added a single device without any redundancy)
2007-04-07.15:28:29 zpool add -f tank /var/500m_file
(then I copied data into /tank and removed /var/500m_file; a panic resulted, which was 
expected)
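(The file vdev itself had been created beforehand with something like:
   mkfile 500m /var/500m_file
exact size quoted from memory.)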

I created a new /var/500m_file and then decided to destroy the pool and start 
over again.  This caused a panic, which I wasn't expecting.  On reboot, I ran 
'zpool status -x', which showed:
  pool: tank
 state: ONLINE
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: none requested
config:

        NAME              STATE     READ WRITE CKSUM
        tank              ONLINE       0     0     0
          mirror          ONLINE       0     0     0
            c0d1          ONLINE       0     0     0
            c1d1          ONLINE       0     0     0
          /var/500m_file  UNAVAIL      0     0     0  corrupted data

errors: No known data errors

Since there was no redundancy for the /var/500m_file vdev, I don't see how a 
replace will help (unless I still had the original device/file with the data 
intact).
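If I did still have an intact copy, I assume the replace would look something like:
   zpool replace tank /var/500m_file /var/new_500m_file
(where /var/new_500m_file is a hypothetical freshly-made file of the same size), but with the
original gone there is no source to copy the data from.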

When I try to destroy the pool with "zpool destroy tank", I get a panic with:
Apr  7 16:00:17 b60 genunix: [ID 403854 kern.notice] assertion failed: vdev_config_sync(rvd, txg) == 0, file: ../../common/fs/zfs/spa.c, line: 2910
Apr  7 16:00:17 b60 unix: [ID 100000 kern.notice]
Apr  7 16:00:17 b60 genunix: [ID 353471 kern.notice] d893cd0c genunix:assfail+5a (f9e87e74, f9e87e58,)
Apr  7 16:00:17 b60 genunix: [ID 353471 kern.notice] d893cd6c zfs:spa_sync+6c3 (da89cac0, 1363, 0)
Apr  7 16:00:17 b60 genunix: [ID 353471 kern.notice] d893cdc8 zfs:txg_sync_thread+1df (d4678540, 0)
Apr  7 16:00:18 b60 genunix: [ID 353471 kern.notice] d893cdd8 unix:thread_start+8 ()
Apr  7 16:00:18 b60 unix: [ID 100000 kern.notice]
Apr  7 16:00:18 b60 genunix: [ID 672855 kern.notice] syncing file systems...

My questions/comments boil down to:
1) Should the pool state really be 'online' after losing a non-redundant vdev?

Yeah, this seems odd and is probably a bug.
2) It seems like a bug if I get a panic when trying to destroy a pool (although 
this clearly may be related to #1).

This is a known problem and one that we're working on right now:

6413847 vdev label write failure should be handled more gracefully.

In your case we are trying to update the label to indicate that the pool has been 
destroyed; this results in a label write failure and thus the panic.

Thanks,
George
Am I hitting a known bug (or do I have misconceptions about how the pool should function)?
I will happily provide any debugging info that I can.

I haven't tried a 'zpool destroy -f tank' yet since I didn't know if there was 
any debugging value in my current state.
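(If it would help, I can also dump whatever label state exists on the file vdev before touching
anything, presumably with something like:
   zdb -l /var/500m_file
and post the output.)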

Thanks,
William Hathaway
www.williamhathaway.com
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
