[zfs-discuss] Re: zfs exported a live filesystem
For the record, this happened with a new filesystem. I didn't muck about with an old filesystem while it was still mounted; I created a new one, mounted it, and then accidentally exported it.

> Except that it doesn't:
>
>   # mount /dev/dsk/c1t1d0s0 /mnt
>   # share /mnt
>   # umount /mnt
>   umount: /mnt busy
>   # unshare /mnt
>   # umount /mnt
>
> If you umount -f it will, though!

Well, sure, but I was still surprised that it happened anyway.

> The system is working as designed; the NFS client did what it was supposed to do. If you brought the pool back in again with zpool import, things should have picked up where they left off.

Yep -- an import/shareall made the FS available again.

> What's more, you were probably running as root when you did that, so you got what you asked for - there is only so much protection we can give without being annoying!

Sure, but there are still safeguards in place even when running as root, such as requiring umount -f as above, or warning you when running format on a disk with mounted partitions. Since this appeared to be an operation that might warrant such a safeguard, I thought I'd check and see whether this was to be expected or whether a safeguard should be put in. Annoying isn't always bad :-)

> Now, having said that, I personally wouldn't have expected zpool export to work as easily as that while there were shared filesystems. I would have expected that exporting the pool should have attempted to unmount all the ZFS filesystems first - which would have failed without a -f flag because they were shared. So IMO it is a bug, or at least an RFE.

Ok, where should I file an RFE?

Jim
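[For anyone hitting the same thing: the "import/shareall" recovery mentioned above amounts to the following sketch. The pool name zmir is assumed from the later messages in this thread; substitute your own.]

  # zpool import zmir     (bring the exported pool back in; its filesystems remount)
  # shareall              (re-share everything so NFS clients can recover)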
[zfs-discuss] Kickstart hot spare attachment
For my latest test I set up a stripe of two mirrors with one hot spare, like so:

  zpool create -f -m /export/zmir zmir mirror c0t0d0 c3t2d0 mirror c3t3d0 c3t4d0 spare c3t1d0

I spun down c3t2d0 and c3t4d0 simultaneously, and while the system kept running (my tar over NFS barely hiccuped), the zpool command hung again. I rebooted the machine with -dnq, and although the system didn't come up the first time, it did after an fsck and a second reboot. However, once again the hot spare isn't getting used:

  # zpool status -v
    pool: zmir
   state: DEGRADED
  status: One or more devices could not be opened. Sufficient replicas exist for
          the pool to continue functioning in a degraded state.
  action: Attach the missing device and online it using 'zpool online'.
     see: http://www.sun.com/msg/ZFS-8000-D3
   scrub: resilver completed with 0 errors on Tue Dec 12 09:15:49 2006
  config:

          NAME        STATE     READ WRITE CKSUM
          zmir        DEGRADED     0     0     0
            mirror    DEGRADED     0     0     0
              c0t0d0  ONLINE       0     0     0
              c3t2d0  UNAVAIL      0     0     0  cannot open
            mirror    DEGRADED     0     0     0
              c3t3d0  ONLINE       0     0     0
              c3t4d0  UNAVAIL      0     0     0  cannot open
          spares
            c3t1d0    AVAIL

A few questions:

- I know I can attach it via the zpool commands, but is there a way to kickstart the attachment process if it fails to attach automatically upon disk failure?
- In this instance the spare is twice as big as the other drives -- does that make a difference?
- Is there something inherent to an old SCSI bus that causes spun-down drives to hang the system in some way, even if it's just hanging the zpool/zfs system calls? Would a thumper be more resilient to this?

Jim
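[For reference, the manual way to pull the spare in is zpool replace, which substitutes the spare for the failed device and starts a resilver. A minimal sketch against the pool above; the device names assume this particular config:]

  # zpool replace zmir c3t4d0 c3t1d0    (swap spare c3t1d0 in for failed c3t4d0)
  # zpool status zmir                   (watch the resilver progress)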
[zfs-discuss] Can't destroy corrupted pool
Ok, so I'm planning on wiping my test pool that seems to have problems with non-spare disks being marked as spares, but I can't destroy it:

  # zpool destroy -f zmir
  cannot iterate filesystems: I/O error

Anyone know how I can nuke this for good?

Jim
[zfs-discuss] Re: Can't destroy corrupted pool
BTW, I'm also unable to export the pool -- same error.

Jim
[zfs-discuss] Re: Can't destroy corrupted pool
Nevermind:

  # zfs destroy [EMAIL PROTECTED]:28
  cannot open '[EMAIL PROTECTED]:28': I/O error

Jim
[zfs-discuss] Re: Can't destroy corrupted pool
> You are likely hitting:
>
>   6397052 unmounting datasets should process /etc/mnttab instead of traverse DSL
>
> which was fixed in build 46 of Nevada. In the meantime, you can remove /etc/zfs/zpool.cache manually and reboot, which will remove all your pools (which you can then re-import on an individual basis).

I'm running b51, but I'll try deleting the cache.

Jim
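[The workaround described above boils down to the following sketch; the pool name zmir is assumed from earlier in the thread:]

  # rm /etc/zfs/zpool.cache    (forget all pools; the data on disk is untouched)
  # reboot
  ...
  # zpool import               (with no arguments, lists pools found on the devices)
  # zpool import zmir          (re-import the ones you actually want back)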
[zfs-discuss] Re: Can't destroy corrupted pool
This worked. I've restarted my testing, but I've been fdisking each drive before I add it to the pool, and so far the system is behaving as expected when I spin a drive down, i.e., the hot spare gets used automatically.

This makes me wonder whether it's possible to ensure that the forced addition of a drive to a pool wipes the drive of any previous data, especially any stale ZFS metadata.

I'll keep the list posted as I continue my tests.

Jim
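[One way people clear stale labels by hand - a sketch, not the poster's exact fdisk procedure, and the device name is just an example from this thread - is to zero the front of the disk before reusing it:]

  # dd if=/dev/zero of=/dev/rdsk/c3t1d0s0 bs=1024k count=4    (clobber the front labels)

[ZFS also keeps copies of its labels at the end of the device, so a complete wipe would need to clear those as well, or just relabel the whole disk with format/fdisk as described above.]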
[zfs-discuss] zfs exported a live filesystem
By mistake, I just exported my test filesystem while it was up and being served via NFS, causing my tar over NFS to start throwing stale file handle errors. Should I file this as a bug, or should I just not do that? :-)

Ko,
[zfs-discuss] Re: Managed to corrupt my pool
So the questions are:

- Is this fixable? I don't see an inum I could run find on to remove, and I can't even do a zfs volinit anyway:

    nextest-01# zfs volinit
    cannot iterate filesystems: I/O error

- Would not enabling zil_disable have prevented this?
- Should I have been doing a 3-way mirror?
- Is there a more optimal configuration to help prevent this kind of corruption?

Anyone have any thoughts on this? I'd really like to be able to build a nice ZFS box for file service, but if a hardware failure can corrupt a disk pool I'll have to try to find another solution, I'm afraid.
[zfs-discuss] Re: Managed to corrupt my pool
> Anyone have any thoughts on this? I'd really like to be able to build a nice ZFS box for file service, but if a hardware failure can corrupt a disk pool I'll have to try to find another solution, I'm afraid.

Sorry, I worded this poorly -- if the loss of a disk in a mirror can corrupt the pool, it's going to give me pause in implementing a ZFS solution.

Jim
[zfs-discuss] Managed to corrupt my pool
Platform:
- Old Dell workstation with an Andataco gigaraid enclosure plugged into an Adaptec 39160
- Nevada b51

Current zpool config:
- One two-disk mirror with two hot spares

In my ferocious pounding of ZFS I've managed to corrupt my data pool. This is what I've been doing to test it:

- Set zil_disable to 1 in /etc/system
- Continually untar a couple of files into the filesystem
- Manually spin down a drive in the mirror by holding down the button on the enclosure
- For any system hangs, reboot with a nasty reboot -dnq

I've gotten different results after the spindown:

- Works properly: short or no hang, hot spare successfully added to the mirror
- System hangs, and after a reboot the spare is not added
- tar hangs, but after running zpool status the hot spare is added properly and tar continues
- tar continues, but hangs on zpool status

The last is what happened just prior to the corruption. Here's the output of zpool status:

  nextest-01# zpool status -v
    pool: zmir
   state: DEGRADED
  status: One or more devices has experienced an error resulting in data
          corruption. Applications may be affected.
  action: Restore the file in question if possible. Otherwise restore the
          entire pool from backup.
     see: http://www.sun.com/msg/ZFS-8000-8A
   scrub: resilver completed with 1 errors on Thu Nov 30 11:37:21 2006
  config:

          NAME        STATE     READ WRITE CKSUM
          zmir        DEGRADED     8     0     4
            mirror    DEGRADED     8     0     4
              c3t3d0  ONLINE       0     0    24
              c3t4d0  UNAVAIL      0     0     0  cannot open
          spares
            c0t0d0    AVAIL
            c3t1d0    AVAIL

  errors: The following persistent errors have been detected:

            DATASET  OBJECT  RANGE
            15       0       lvl=4294967295 blkid=0

So the questions are:

- Is this fixable? I don't see an inum I could run find on to remove, and I can't even do a zfs volinit anyway:

    nextest-01# zfs volinit
    cannot iterate filesystems: I/O error

- Would not enabling zil_disable have prevented this?
- Should I have been doing a 3-way mirror?
- Is there a more optimal configuration to help prevent this kind of corruption?

Ultimately, I want to build a ZFS server with performance and reliability comparable to, say, a Netapp, but the fact that I appear to have been able to nuke my pool by simulating a hardware error gives me pause. I'd love to know if I'm off-base in my worries.

Jim
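[For anyone reproducing this test setup: the zil_disable tuning mentioned above is a one-line /etc/system setting on Nevada builds of that era, taking effect on the next boot:]

  set zfs:zil_disable = 1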
[zfs-discuss] Re: zfs hot spare not automatically getting used
I know this isn't necessarily ZFS specific, but after I reboot I spin the drives back up, and nothing I do (devfsadm, disks, etc.) can get them seen again until the next reboot.

I've got some older SCSI drives in an old Andataco gigaraid enclosure which I thought supported hot-swap, but I seem unable to hot swap them in. The PC has an Adaptec 39160 card in it and I'm running Nevada b51. Is this not a setup that can support hot swap? Or is there something I have to do other than devfsadm to get the SCSI bus rescanned?
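[For what it's worth, the usual sequence for poking a parallel-SCSI bus on Solaris is sketched below. The controller number c3 is assumed from the device names in this thread, and the enclosure/HBA may simply not support hot-swap at all:]

  # cfgadm -al                 (list attachment points and their current state)
  # cfgadm -c configure c3     (ask the framework to (re)configure devices on that controller)
  # devfsadm -Cv               (rebuild /dev links and prune stale ones)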
[zfs-discuss] Re: zfs hot spare not automatically getting used
So is there a command to make the spare get used, or do I have to remove it as a spare and re-add it if it doesn't get used automatically? Is this a bug to be fixed, or will this always be the case when the disks aren't exactly the same size?
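[Both routes exist, for what it's worth. zpool replace (shown later in the thread) pulls the spare in by hand; the remove-and-reattach route would look roughly like this sketch, with device names assumed from this thread's pool:]

  # zpool remove zmir c3t1d0             (drop the disk from the spares list)
  # zpool attach zmir c3t3d0 c3t1d0      (attach it directly alongside the surviving mirror half)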