This didn't occur on a production server, but I thought I'd post this anyway because it might be interesting.
I'm currently testing a ZFS NAS machine consisting of a Dell R710 with two Dell SAS 5/E HBAs. Right now I'm in the middle of torture testing the system: simulating drive failures, exporting the storage pool, rearranging the disks in different slots, and what have you. Up until now, everything has been going swimmingly.

Here was my original zpool configuration:

        NAME         STATE     READ WRITE CKSUM
        tank         ONLINE       0     0     0
          raidz2     ONLINE       0     0     0
            c1t1d0   ONLINE       0     0     0
            c1t2d0   ONLINE       0     0     0
            c1t3d0   ONLINE       0     0     0
            c1t4d0   ONLINE       0     0     0
            c1t5d0   ONLINE       0     0     0
            c1t6d0   ONLINE       0     0     0
            c1t7d0   ONLINE       0     0     0
            c1t8d0   ONLINE       0     0     0
            c1t9d0   ONLINE       0     0     0
            c1t10d0  ONLINE       0     0     0
            c1t11d0  ONLINE       0     0     0
            c1t12d0  ONLINE       0     0     0
          raidz2     ONLINE       0     0     0
            c2t25d0  ONLINE       0     0     0
            c2t26d0  ONLINE       0     0     0
            c2t27d0  ONLINE       0     0     0
            c2t28d0  ONLINE       0     0     0
            c2t29d0  ONLINE       0     0     0
            c2t30d0  ONLINE       0     0     0
            c2t31d0  ONLINE       0     0     0
            c2t32d0  ONLINE       0     0     0
            c2t33d0  ONLINE       0     0     0
            c2t34d0  ONLINE       0     0     0
            c2t35d0  ONLINE       0     0     0
            c2t36d0  ONLINE       0     0     0

I exported the tank zpool, rearranged the drives in the chassis, and reimported it. I ended up with this:

        NAME         STATE     READ WRITE CKSUM
        tank         ONLINE       0     0     0
          raidz2     ONLINE       0     0     0
            c2t31d0  ONLINE       0     0     0
            c1t2d0   ONLINE       0     0     0
            c1t3d0   ONLINE       0     0     0
            c1t1d0   ONLINE       0     0     0
            c1t12d0  ONLINE       0     0     0
            c1t6d0   ONLINE       0     0     0
            c1t7d0   ONLINE       0     0     0
            c1t8d0   ONLINE       0     0     0
            c1t9d0   ONLINE       0     0     0
            c1t5d0   ONLINE       0     0     0
            c1t11d0  ONLINE       0     0     0
            c2t25d0  ONLINE       0     0     0
          raidz2     ONLINE       0     0     0
            c1t4d0   ONLINE       0     0     0
            c2t26d0  ONLINE       0     0     0
            c2t27d0  ONLINE       0     0     0
            c2t28d0  ONLINE       0     0     0
            c2t29d0  ONLINE       0     0     0
            c2t30d0  ONLINE       0     0     0
            c1t10d0  ONLINE       0     0     0
            c2t32d0  ONLINE       0     0     0
            c2t33d0  ONLINE       0     0     0
            c2t34d0  ONLINE       0     0     0
            c2t35d0  ONLINE       0     0     0
            c2t48d0  ONLINE       0     0     0

Great. No problems. Next, I took c2t48d0 offline and then unconfigured it with cfgadm:

# zpool offline tank c2t48d0
# cfgadm -c unconfigure c2::dsk/c2t48d0

I checked the status next:

# zpool status tank
  pool: tank
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        tank         DEGRADED     0     0     0
          raidz2     ONLINE       0     0     0
            c2t31d0  ONLINE       0     0     0
            c1t2d0   ONLINE       0     0     0
            c1t3d0   ONLINE       0     0     0
            c1t1d0   ONLINE       0     0     0
            c1t12d0  ONLINE       0     0     0
            c1t6d0   ONLINE       0     0     0
            c1t7d0   ONLINE       0     0     0
            c1t8d0   ONLINE       0     0     0
            c1t9d0   ONLINE       0     0     0
            c1t5d0   ONLINE       0     0     0
            c1t11d0  ONLINE       0     0     0
            c2t25d0  ONLINE       0     0     0
          raidz2     DEGRADED     0     0     0
            c1t4d0   ONLINE       0     0     0
            c2t26d0  ONLINE       0     0     0
            c2t27d0  ONLINE       0     0     0
            c2t28d0  ONLINE       0     0     0
            c2t29d0  ONLINE       0     0     0
            c2t30d0  ONLINE       0     0     0
            c1t10d0  ONLINE       0     0     0
            c2t32d0  ONLINE       0     0     0
            c2t33d0  ONLINE       0     0     0
            c2t34d0  ONLINE       0     0     0
            c2t35d0  ONLINE       0     0     0
            c2t48d0  OFFLINE      0     0     0

I went back and reconfigured the drive in cfgadm:

# cfgadm -c configure c2::dsk/c2t48d0

I was surprised at this point because I didn't have to run zpool replace. As soon as I reconfigured the drive in cfgadm, ZFS resilvered the zpool without any action from me.
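
(For comparison, the manual sequence I assumed I'd need after reseating the disk was something like the following. Take it as what I expected rather than what actually happened, since ZFS appears to have done the equivalent of the online step by itself:

# cfgadm -c configure c2::dsk/c2t48d0
# zpool online tank c2t48d0
# zpool status tank

with 'zpool replace tank c2t48d0' in place of the online only if the disk had actually been swapped for a different one.)
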
# zpool status tank
  pool: tank
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Tue Nov 10 15:33:08 2009
config:

        NAME         STATE     READ WRITE CKSUM
        tank         ONLINE       0     0     0
          raidz2     ONLINE       0     0     0
            c2t31d0  ONLINE       0     0     0
            c1t2d0   ONLINE       0     0     0
            c1t3d0   ONLINE       0     0     0
            c1t1d0   ONLINE       0     0     0
            c1t12d0  ONLINE       0     0     0
            c1t6d0   ONLINE       0     0     0
            c1t7d0   ONLINE       0     0     0
            c1t8d0   ONLINE       0     0     0
            c1t9d0   ONLINE       0     0     0
            c1t5d0   ONLINE       0     0     0
            c1t11d0  ONLINE       0     0     0
            c2t25d0  ONLINE       0     0     0
          raidz2     ONLINE       0     0     0
            c1t4d0   ONLINE       0     0     0
            c2t26d0  ONLINE       0     0     0
            c2t27d0  ONLINE       0     0     0
            c2t28d0  ONLINE       0     0     0
            c2t29d0  ONLINE       0     0     0
            c2t30d0  ONLINE       0     0     0
            c1t10d0  ONLINE       0     0     0
            c2t32d0  ONLINE       0     0     0
            c2t33d0  ONLINE       0     0     0
            c2t34d0  ONLINE       0     0     0
            c2t35d0  ONLINE       0     0     0
            c2t48d0  ONLINE       0     0     0  3K resilvered

I wanted to destroy this zpool and reconfigure it differently, but when I tried I got this error:

# zpool destroy tank
cannot unmount '/tank': Device busy
could not destroy 'tank': could not unmount datasets

Hmm. That's interesting. Let's reboot the system and see what happens. Upon reboot, this is what tank looks like:

# zpool status tank
  pool: tank
 state: UNAVAIL
status: One or more devices could not be used because the label is missing
        or invalid. There are insufficient replicas for the pool to continue
        functioning.
action: Destroy and re-create the pool from a backup source.
   see: http://www.sun.com/msg/ZFS-8000-5E
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        tank         UNAVAIL      0     0     0  insufficient replicas
          raidz2     UNAVAIL      0     0     0  insufficient replicas
            c2t31d0  FAULTED      0     0     0  corrupted data
            c1t2d0   ONLINE       0     0     0
            c1t3d0   ONLINE       0     0     0
            c1t1d0   FAULTED      0     0     0  corrupted data
            c1t12d0  FAULTED      0     0     0  corrupted data
            c1t6d0   ONLINE       0     0     0
            c1t7d0   ONLINE       0     0     0
            c1t8d0   ONLINE       0     0     0
            c1t9d0   ONLINE       0     0     0
            c1t5d0   FAULTED      0     0     0  corrupted data
            c1t11d0  ONLINE       0     0     0
            c2t25d0  FAULTED      0     0     0  corrupted data
          raidz2     DEGRADED     0     0     0
            c1t4d0   FAULTED      0     0     0  corrupted data
            c2t26d0  ONLINE       0     0     0
            c2t27d0  ONLINE       0     0     0
            c2t28d0  ONLINE       0     0     0
            c2t29d0  ONLINE       0     0     0
            c2t30d0  ONLINE       0     0     0
            c1t10d0  FAULTED      0     0     0  corrupted data
            c2t32d0  ONLINE       0     0     0
            c2t33d0  ONLINE       0     0     0
            c2t34d0  ONLINE       0     0     0
            c2t35d0  ONLINE       0     0     0
            c2t48d0  ONLINE       0     0     0

Now, my only question is: WTF? Can anyone shed some light on this? Obviously I won't be pulling any shenanigans once this system is in production, but if the system loses power and something crazy happens, I don't want to be seeing this when the system comes back up.
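
P.S. A couple of things I'm planning to try next, in case the output helps anyone shed light on this. I haven't run any of it yet, so treat this as a to-do list rather than results. The next time zpool destroy complains that /tank is busy, I'll check what is actually holding the mount open:

# fuser -c /tank

And for the current UNAVAIL state, I'll dump the ZFS labels from one of the disks now reported as FAULTED and see whether a fresh export/import scan puts the pool back together from the current device paths:

# zdb -l /dev/dsk/c2t31d0s0
# zpool export tank
# zpool import

My thinking is that zdb should show whether the labels on the 'corrupted data' disks are really damaged, or whether ZFS is just looking at stale device paths after all the shuffling. If anyone knows better, I'm all ears.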