Re: [zfs-discuss] Running on Dell hardware?
If you're still having issues, go into the BIOS and disable C-states, if you haven't already. They are responsible for most of the problems with 11th Gen PowerEdge.
-- This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Disk keeps resilvering, was: Replacing a disk never completes
On 09/22/10 04:27 PM, Ben Miller wrote:
> On 09/21/10 09:16 AM, Ben Miller wrote:
>> I had tried a clear a few times with no luck. I just did a detach, and that removed the old disk and has now triggered another resilver, which hopefully works. I had tried a remove rather than a detach before, but that doesn't work on raidz2...
>> thanks, Ben
> I made some progress. That resilver completed with 4 errors. I cleared those and still had the one error (metadata:0x0), so I started a scrub. The scrub restarted the resilver on c4t0d0 again, though! There currently are no errors anyway, but the resilver will be running for the next day+. Is this another bug, or will doing a scrub eventually lead to a scrub of the pool instead of the resilver?
> Ben

Well, not much progress. The one permanent error (metadata:0x0) came back, and the disk keeps wanting to resilver when trying to do a scrub. Now, after the last resilver, I have more checksum errors on the pool, but not on any disks:

        NAME        STATE     READ WRITE CKSUM
        pool2       ONLINE       0     0    37
        ...
          raidz2-1  ONLINE       0     0    74

All other checksum totals are 0. So, three problems:

1. How do I get the disk to stop resilvering?
2. How do you get checksum errors on the pool when no disk is identified? If I clear them and let the resilver go again, more checksum errors appear. How do I get rid of these errors?
3. How do I get rid of the metadata:0x0 error?

I'm currently destroying old snapshots (though that bug was fixed quite a while ago and I'm running b134). I can try unmounting filesystems and remounting next (all are currently mounted). I can also schedule a reboot for next week if anyone thinks that would help.

thanks, Ben
Re: [zfs-discuss] Replacing a disk never completes
On 09/20/10 10:45 AM, Giovanni Tirloni wrote:
> On Thu, Sep 16, 2010 at 9:36 AM, Ben Miller <bmil...@mail.eecis.udel.edu> wrote:
>> I have an X4540 running b134 where I'm replacing 500GB disks with 2TB disks (Seagate Constellation) and the pool seems sick now. The pool has four raidz2 vdevs (8+2) where the first set of 10 disks were replaced a few months ago. I replaced two disks in the second set (c2t0d0, c3t0d0) a couple of weeks ago, but have been unable to get the third disk to finish replacing (c4t0d0). I have tried the resilver for c4t0d0 four times now and the pool also comes up with checksum errors and a permanent error (metadata:0x0). [...] Any ideas how to get this disk finished being replaced without rebuilding the pool and restoring from backup? The pool is working, but is reporting as degraded and with checksum errors.
> Try running `zpool clear pool2` and see if it clears the errors. If not, you may have to detach `c4t0d0s0/o`. I believe it's a bug that was fixed in recent builds.

I had tried a clear a few times with no luck. I just did a detach, and that removed the old disk and has now triggered another resilver, which hopefully works. I had tried a remove rather than a detach before, but that doesn't work on raidz2...

thanks, Ben

--
Giovanni Tirloni
gtirl...@sysdroid.com
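For anyone hitting the same stuck replace, the sequence discussed above looks roughly like this (pool and device names are the ones from this thread; exact behavior varies by build, so treat it as a sketch rather than a recipe):

```shell
# Try clearing the pool-wide error counters first.
zpool clear pool2

# If the old half of the 'replacing' vdev is stuck, detach it by name.
# The trailing /o marks the old device left behind by 'zpool replace'.
# Note: 'zpool remove' does not work here -- raidz2 children can only
# be detached, not removed.
zpool detach pool2 c4t0d0s0/o

# The detach kicks off another resilver; watch it finish.
zpool status -v pool2
```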
[zfs-discuss] Replacing a disk never completes
I have an X4540 running b134 where I'm replacing 500GB disks with 2TB disks (Seagate Constellation), and the pool seems sick now. The pool has four raidz2 vdevs (8+2); the first set of 10 disks was replaced a few months ago. I replaced two disks in the second set (c2t0d0, c3t0d0) a couple of weeks ago, but have been unable to get the third disk (c4t0d0) to finish replacing. I have tried the resilver for c4t0d0 four times now, and the pool also comes up with checksum errors and a permanent error (metadata:0x0).

The first resilver was from 'zpool replace', which came up with checksum errors. I cleared the errors, which triggered the second resilver (same result). I then did a 'zpool scrub', which started the third resilver and also identified three permanent errors (the two additional ones were in files in snapshots, which I then destroyed). I then did a 'zpool clear' and another scrub, which started the fourth resilver attempt. This last attempt identified another file with errors in a snapshot that I have now destroyed.

Any ideas how to get this disk finished being replaced without rebuilding the pool and restoring from backup? The pool is working, but is reporting as degraded and with checksum errors. Here is what the pool currently looks like:

# zpool status -v pool2
  pool: pool2
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver completed after 33h9m with 4 errors on Thu Sep 16 00:28:14
config:

        NAME             STATE     READ WRITE CKSUM
        pool2            DEGRADED     0     0     8
          raidz2-0       ONLINE       0     0     0
            c0t4d0       ONLINE       0     0     0
            c1t4d0       ONLINE       0     0     0
            c2t4d0       ONLINE       0     0     0
            c3t4d0       ONLINE       0     0     0
            c4t4d0       ONLINE       0     0     0
            c5t4d0       ONLINE       0     0     0
            c2t5d0       ONLINE       0     0     0
            c3t5d0       ONLINE       0     0     0
            c4t5d0       ONLINE       0     0     0
            c5t5d0       ONLINE       0     0     0
          raidz2-1       DEGRADED     0     0    14
            c0t5d0       ONLINE       0     0     0
            c1t5d0       ONLINE       0     0     0
            c2t1d0       ONLINE       0     0     0
            c3t1d0       ONLINE       0     0     0
            c4t1d0       ONLINE       0     0     0
            c5t1d0       ONLINE       0     0     0
            c2t0d0       ONLINE       0     0     0
            c3t0d0       ONLINE       0     0     0
            replacing-8  DEGRADED     0     0     0
              c4t0d0s0/o OFFLINE      0     0     0
              c4t0d0     ONLINE       0     0     0  268G resilvered
            c5t0d0       ONLINE       0     0     0
          raidz2-2       ONLINE       0     0     0
            c0t6d0       ONLINE       0     0     0
            c1t6d0       ONLINE       0     0     0
            c2t6d0       ONLINE       0     0     0
            c3t6d0       ONLINE       0     0     0
            c4t6d0       ONLINE       0     0     0
            c5t6d0       ONLINE       0     0     0
            c2t7d0       ONLINE       0     0     0
            c3t7d0       ONLINE       0     0     0
            c4t7d0       ONLINE       0     0     0
            c5t7d0       ONLINE       0     0     0
          raidz2-3       ONLINE       0     0     0
            c0t7d0       ONLINE       0     0     0
            c1t7d0       ONLINE       0     0     0
            c2t3d0       ONLINE       0     0     0
            c3t3d0       ONLINE       0     0     0
            c4t3d0       ONLINE       0     0     0
            c5t3d0       ONLINE       0     0     0
            c2t2d0       ONLINE       0     0     0
            c3t2d0       ONLINE       0     0     0
            c4t2d0       ONLINE       0     0     0
            c5t2d0       ONLINE       0     0     0
        logs
          mirror-4       ONLINE       0     0     0
            c0t1d0s0     ONLINE       0     0     0
            c1t3d0s0     ONLINE       0     0     0
        cache
          c0t3d0s7       ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        metadata:0x0
        0x167a2:0x552ed

(The second file was in a snapshot I destroyed after the resilver completed.)

# zpool list pool2
NAME    SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH    ALTROOT
pool2  31.8T  13.8T  17.9T  43%  1.65x  DEGRADED  -

The slog is a mirror of two SLC SSDs and the L2ARC is an MLC SSD.

thanks, Ben
Re: [zfs-discuss] Opensolaris is apparently dead
On 8/14/10 1:12 PM, Frank Cusack wrote:
> Wow, what leads you guys to even imagine that S11 wouldn't contain COMSTAR, etc.? *Of course* it will contain most of the bits that are current today in OpenSolaris.

That's a very good question, actually. I would think that COMSTAR would stay because it's used by the Fishworks appliance... however, COMSTAR is a competitive advantage for DIY storage solutions. Maybe they will rip it out of S11 and make it an add-on or something. That would suck. I guess the only real reason you can't yank COMSTAR is that it's now the basis for iSCSI target support. But again, there is nothing saying that target support has to be part of the standard OS offering. Scary to think about. :)

benr.
Re: [zfs-discuss] Opensolaris is apparently dead
On 8/13/10 9:02 PM, C. Bergström wrote:
> Erast wrote:
>> On 08/13/2010 01:39 PM, Tim Cook wrote:
>>> http://www.theregister.co.uk/2010/08/13/opensolaris_is_dead/ I'm a bit surprised at this development... Oracle really just doesn't get it. The part that's most disturbing to me is the fact they won't be releasing nightly snapshots. It appears they've stopped Illumos in its tracks before it really even got started (perhaps that explains the timing of this press release).
>> Wrong. Be patient; with the pace of current Illumos development it will soon have all the closed binaries liberated and be ready to sync up with the promised ON code drops as dictated by the GPL and CDDL licenses.
> Illumos is just a source tree at this point. You're delusional, misinformed, or have some big wonderful secret if you believe you have all the bases covered for a pure open source distribution, though. What does "closed binaries liberated" really mean to you? Does it mean: a. You copy over the binary libCrun and continue to use some version of Sun Studio to build onnv-gate. b. You debug the problems with, and start to use, the ancient gcc-3 (at the probable expense of performance regressions, which most people would find unacceptable). c. Your definition is narrow and has missed some closed binaries. I think it's great people are still hopeful, working hard, and going to steward this forward, but I wonder... what pace are you referring to? The last commit to illumos-gate was 6 days ago and you're already not keeping it in sync. Can you even build it yet, and if so, where are the binaries?

Illumos is 2 weeks old. Let's cut it a little slack. :)

benr.
[zfs-discuss] ZFS compression
Hi all, I'm running out of space on my OpenSolaris file server and can't afford to buy any new storage for a short while. Seeing as the machine has a dual-core CPU at 2.2GHz and 4GB of RAM, I was thinking compression might be the way to go... I've read a small amount about compression, enough to find that it'll affect performance (not a problem for me) and that once you enable compression, it only affects new files written to the file system. Is this still true of b134? And if it is, how can I compress all of the current data on the file system? Do I have to move it off and then back on?

Thanks for any advice, Ben
Re: [zfs-discuss] ZFS compression
Thanks Alex, I've set compression on, transferred the data from the OpenSolaris machine to my Mac, deleted any snapshots, and am now transferring it back. It seems to be working, but there's lots to transfer! I didn't know that MacZFS was still going; it's great to hear that people are still working on it. I may have to pluck up the courage to put it on my Mac Pro if I do a rebuild anytime soon.

Thanks again, Ben
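For the archives: since compression only applies to blocks written after it is enabled, the copy-off-and-back step can also be done in place. A minimal sketch, assuming a dataset named tank/data (a placeholder) and that old snapshots have already been destroyed, since they would keep the uncompressed blocks referenced:

```shell
# Turn compression on; existing data stays uncompressed until rewritten.
zfs set compression=on tank/data

# Crude in-place rewrite: copy each file and move it back so its blocks
# land on disk compressed. Test on scratch data first -- this sketch
# does not preserve every attribute (e.g. ACLs) and assumes no
# filenames with embedded newlines.
cd /tank/data
find . -type f | while IFS= read -r f; do
  cp -p "$f" "$f.tmp" && mv "$f.tmp" "$f"
done

# Confirm the effect.
zfs get compression,compressratio tank/data
```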
[zfs-discuss] ZFS mirror to RAIDz?
Hi all, I currently have four drives in my OpenSolaris box. The drives are split into two mirrors: one containing my rpool (disks 1 and 2) and one containing other data (disks 3 and 4). I'm running out of space on my data mirror and am thinking of upgrading it to two 2TB disks. I then considered also replacing disk 2 with a 2TB disk and making a RAIDz from the three new drives. I know this would leave my rpool vulnerable to hard drive failure, but there's no data on it that can't be replaced with a reinstall. Can this be done easily? Or will I have to transfer all of my data to another machine, build the RAIDz from scratch, and then transfer the data back?

Thanks for any advice, Ben
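There is no in-place conversion from a mirror to raidz, so the migration has to go through a copy. A sketch of the send/receive route, with placeholder pool, host, and device names:

```shell
# Snapshot the data pool and copy it somewhere with enough room
# (another machine, or a temporary pool on one of the new 2TB disks).
zfs snapshot -r dpool@migrate
zfs send -R dpool@migrate | ssh otherhost zfs receive -F backup/dpool

# Destroy the old mirror and build the raidz from the three new disks.
zpool destroy dpool
zpool create dpool raidz c1t2d0 c1t3d0 c1t4d0

# Copy the data back.
ssh otherhost zfs send -R backup/dpool@migrate | zfs receive -F dpool
```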
Re: [zfs-discuss] Using a zvol from your rpool as zil for another zpool
> We have a server with a couple of X25-E's and a bunch of larger SATA disks. To save space, we want to install Solaris 10 (our install is only about 1.4GB) on the X25-E's and use the remaining space on the SSDs as ZIL for a zpool created from the SATA drives. Currently we do this by installing the OS using SVM+UFS (to mirror the OS between the two SSDs) and then using the remaining space on a slice as ZIL for the larger SATA-based zpool. However, SVM+UFS is more annoying to work with as far as LiveUpgrade is concerned. We'd love to use a ZFS root, but that requires that the entire SSD be dedicated to the rpool, leaving no space for ZIL. Or does it?

For every system I have ever done ZFS root on, it's always been a slice on a disk. As an example, we have an X4500 with 1TB disks. For that root config, we are planning on something like 150G on s0 and the rest on s3: s0 for the rpool, and s3 for the qpool. We didn't want to have to deal with issues around flashing a huge volume, as we found out with our other X4500 with 500GB disks. AFAIK, it's only non-rpool disks that use the whole disk, and I doubt there's some sort of SSD-specific restriction, but I could be wrong. I like your idea of a reasonably sized rpool and the rest used for the ZIL. But if you're going to do LU, you should probably take a good look at how much space you need for the clones and snapshots on the rpool.

Ben
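The slice-based layout described above can be sketched as follows; slice numbers and sizes are illustrative, and the SSDs would need to be labeled with format(1M) first:

```shell
# On each SSD: s0 (~15GB) for the root pool, s1 for the slog.
# Install with a mirrored ZFS root on the s0 slices, then give the
# data pool a mirrored log device made from the leftover s1 slices:
zpool add datapool log mirror c0t0d0s1 c0t1d0s1

# 'zpool status' should now show a 'logs' section with the mirror.
zpool status datapool
```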
Re: [zfs-discuss] ZFS on Ubuntu
What supporting applications are there on Ubuntu for RAIDZ?
Re: [zfs-discuss] ZFS on Ubuntu
I tried to post this question on the Ubuntu forum. Within 30 minutes my post was on the second page of new posts... Yeah, I'm really not down with using Ubuntu on my server here, but I may be forced to.
[zfs-discuss] ZFS on Ubuntu
How much of a difference is there in supporting applications between Ubuntu and OpenSolaris? I was not considering Ubuntu until OpenSolaris would not load onto my machine... Any info would be great; I have not been able to find any sort of comparison of ZFS on Ubuntu and OpenSolaris. Thanks.

(My current OpenSolaris install troubleshooting thread: http://opensolaris.org/jive/thread.jspa?messageID=488193#488193)
[zfs-discuss] Pool is wrong size in b134
I upgraded a server today that has been running SXCE b111 to the OpenSolaris preview b134. It has three pools; two are fine, but one comes up with no space available in the pool (a SCSI JBOD of 300GB disks). The zpool version is 14. I tried exporting the pool and re-importing, and I get several errors like this both exporting and importing:

# zpool export pool1
WARNING: metaslab_free_dva(): bad DVA 0:645838978048
WARNING: metaslab_free_dva(): bad DVA 0:645843271168
...

I tried removing the zpool.cache file, rebooting, and importing, and receive no warnings, but the pool still reports the wrong avail and size.

# zfs list pool1
NAME    USED  AVAIL  REFER  MOUNTPOINT
pool1   396G      0  3.22M  /export/home
# zpool list pool1
NAME   SIZE  ALLOC  FREE  CAP  DEDUP  HEALTH  ALTROOT
pool1  476G   341G  135G  71%  1.00x  ONLINE  -
# zpool status pool1
  pool: pool1
 state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
        pool will no longer be accessible on older software versions.
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        pool1        ONLINE       0     0     0
          raidz2-0   ONLINE       0     0     0
            c1t8d0   ONLINE       0     0     0
            c1t9d0   ONLINE       0     0     0
            c1t10d0  ONLINE       0     0     0
            c1t11d0  ONLINE       0     0     0
            c1t12d0  ONLINE       0     0     0
            c1t13d0  ONLINE       0     0     0
            c1t14d0  ONLINE       0     0     0

errors: No known data errors

I try exporting and again get the metaslab_free_dva() warnings. Imported again with no warnings, but the same numbers as above. If I try to remove or truncate files I receive "no free space" errors. I reverted back to b111, and here is what the pool really looks like:

# zfs list pool1
NAME    USED  AVAIL  REFER  MOUNTPOINT
pool1   396G   970G  3.22M  /export/home
# zpool list pool1
NAME    SIZE  USED  AVAIL  CAP  HEALTH  ALTROOT
pool1  1.91T  557G  1.36T  28%  ONLINE  -
# zpool status pool1
  pool: pool1
 state: ONLINE
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        pool1        ONLINE       0     0     0
          raidz2     ONLINE       0     0     0
            c1t8d0   ONLINE       0     0     0
            c1t9d0   ONLINE       0     0     0
            c1t10d0  ONLINE       0     0     0
            c1t11d0  ONLINE       0     0     0
            c1t12d0  ONLINE       0     0     0
            c1t13d0  ONLINE       0     0     0
            c1t14d0  ONLINE       0     0     0

errors: No known data errors

Also, the disks were replaced one at a time last year from 73GB to 300GB to increase the size of the pool. Any idea why the pool shows up as the wrong size in b134, and anything else to try? I don't want to upgrade the pool version yet and then not be able to revert back...

thanks, Ben
Re: [zfs-discuss] Pool is wrong size in b134
Cindy,

The other two pools are 2-disk mirrors (rpool and another).

Ben

Cindy Swearingen wrote:
> Hi Ben,
> Any other details about this pool, like how it might be different from the other two pools on this system, might be helpful... I'm going to try to reproduce this problem. We'll be in touch.
> Thanks, Cindy
> On 06/17/10 07:02, Ben Miller wrote:
>> [original message quoted in full]
Re: [zfs-discuss] ZFS Hard disk buffer at 100%
The drive (c7t2d0) is bad and should be replaced. The second drive (c7t5d0) is either bad or going bad. This is exactly the kind of problem that can force a Thumper to its knees: ZFS performance is horrific, and as soon as you drop the bad disks things magically return to normal.

My first recommendation is to pull the SMART data from the disks if you can. I wrote a blog entry about SMART back in 2008 to address exactly the behavior you're seeing: http://www.cuddletech.com/blog/pivot/entry.php?id=993 Yes, people will claim that SMART data is useless for predicting failures, but in a case like yours you are just looking for data to corroborate a hypothesis.

To test this condition, 'zpool offline' c7t2d0, which emulates removal, and see if performance improves. On Thumpers I'd build a list of suspect disks based on iostat, like you show, then correlate the SMART data, and then systematically offline disks to see if one really was the problem.

In my experience, the only other reason you'll legitimately see really weird bottoming-out of I/O like this is if you hit the max concurrent I/O limit in ZFS (until recently that limit was 35), so you'd see actv=35, and then when the device finally processed the I/Os the thing would snap back to life. But even in those cases you shouldn't see request times (asvc_t) rise above 200ms.

All that to say: replace those disks, or at least test it. SSDs won't help; one or more drives are toast.

benr.

On 5/8/10 9:30 PM, Emily Grettel wrote:
> Hi Giovani,
> Thanks for the reply. Here's a bit of iostat after uncompressing a 2.4GB RAR file that has 1 DWF file that we use.
>
>                     extended device statistics
>     r/s    w/s   kr/s   kw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
>     1.0   13.0   26.0   18.0   0.0   0.0     0.0     0.8   0   1  c7t1d0
>     2.0    5.0   77.0   12.0   2.4   1.0   343.8   142.8 100 100  c7t2d0
>     1.0   16.0   25.5   15.5   0.0   0.0     0.0     0.3   0   0  c7t3d0
>     0.0   10.0    0.0   17.0   0.0   0.0     3.2     1.2   1   1  c7t4d0
>     1.0   12.0   25.5   15.5   0.4   0.1    32.4    10.9  14  14  c7t5d0
>     1.0   15.0   25.5   18.0   0.0   0.0     0.1     0.1   0   0  c0t1d0
>                     extended device statistics
>     r/s    w/s   kr/s   kw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
>     0.0    0.0    0.0    0.0   2.0   1.0     0.0     0.0 100 100  c7t2d0
>     1.0    0.0    0.5    0.0   0.0   0.0     0.0     0.1   0   0  c7t0d0
>                     extended device statistics
>     r/s    w/s   kr/s   kw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
>     5.0   15.0  128.0   18.0   0.0   0.0     0.0     1.8   0   3  c7t1d0
>     1.0    9.0   25.5   18.0   2.0   1.8   199.7   179.4 100 100  c7t2d0
>     3.0   13.0  102.5   14.5   0.0   0.1     0.0     5.2   0   5  c7t3d0
>     3.0   11.0  102.0   16.5   0.0   0.1     2.3     4.2   1   6  c7t4d0
>     1.0    4.0   25.5    2.0   0.4   0.8    71.3   158.9  12  79  c7t5d0
>     5.0   16.0  128.5   19.0   0.0   0.1     0.1     2.6   0   5  c0t1d0
>                     extended device statistics
>     r/s    w/s   kr/s   kw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
>     0.0    4.0    0.0    2.0   2.0   2.0   496.1   498.0  99 100  c7t2d0
>     0.0    0.0    0.0    0.0   0.0   1.0     0.0     0.0   0 100  c7t5d0
>                     extended device statistics
>     r/s    w/s   kr/s   kw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
>     7.0    0.0  204.5    0.0   0.0   0.0     0.0     0.2   0   0  c7t1d0
>     1.0    0.0   25.5    0.0   3.0   1.0  2961.6  1000.0  99 100  c7t2d0
>     8.0    0.0  282.0    0.0   0.0   0.0     0.0     0.3   0   0  c7t3d0
>     6.0    0.0  282.5    0.0   0.0   0.0     6.1     2.3   1   1  c7t4d0
>     0.0    3.0    0.0    5.0   0.5   1.0   165.4   333.3  18 100  c7t5d0
>     7.0    0.0  204.5    0.0   0.0   0.0     0.0     1.6   0   1  c0t1d0
>     2.0    2.0   89.0   12.0   0.0   0.0     3.1     6.1   1   2  c3t0d0
>     0.0    2.0    0.0   12.0   0.0   0.0     0.0     0.2   0   0  c3t1d0
>
> Sometimes two or more disks are going at 100. How does one solve this issue if it's a firmware bug? I tried looking around for Western Digital firmware for the WD10EADS but couldn't find any available. Would adding an SSD or two help here?
>
> Thanks, Em
>
> Date: Fri, 7 May 2010 14:38:25 -0300
> Subject: Re: [zfs-discuss] ZFS Hard disk buffer at 100%
> From: gtirl...@sysdroid.com
> To: emilygrettelis...@hotmail.com
> CC: zfs-discuss@opensolaris.org
>
>> On Fri, May 7, 2010 at 8:07 AM, Emily Grettel <emilygrettelis...@hotmail.com> wrote:
>>> Hi, I've had my RAIDz volume working well on SNV_131, but it has come to my attention that there have been some read issues with the drives. Previously I thought this was a CIFS problem, but I'm noticing that when transferring files or uncompressing some fairly large 7z (1-2GB) files (or even smaller 200-300MB RAR files), occasionally running iostat will give the %b as 100 for a drive or two. That's
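The test benr describes (pull SMART data, then offline the suspect disk and watch whether I/O recovers) might look like this. The pool name is a placeholder, and smartctl comes from the smartmontools package, which may need a -d device-type option depending on the controller:

```shell
# Pull SMART data from the suspect drive and look at the error logs
# and reallocated-sector counts.
smartctl -a /dev/rdsk/c7t2d0s0

# Offline the disk (the raidz keeps running degraded) and watch
# whether service times (asvc_t) return to normal.
zpool offline tank c7t2d0
iostat -xn 5

# Bring it back afterwards, or replace it if the numbers improved.
zpool online tank c7t2d0
```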
Re: [zfs-discuss] Mirrored Servers
On 5/8/10 3:07 PM, Tony wrote:
> Let's say I have two servers, both running OpenSolaris with ZFS. I basically want to be able to create a filesystem where the two servers have a common volume that is mirrored between the two, meaning each server keeps an identical, real-time backup of the other's data directory. Set them both up as file servers, and load balance between the two for incoming requests. How would anyone suggest doing this?

I would carefully consider whether they _really_ need to be real time. Can you tolerate 5 minutes, or even just 60 seconds, of difference between them? If you can, then things are much easier and less complex. I'd personally use ZFS snapshots to keep the two servers in sync every 60 seconds.

As for load balancing, that depends on which protocol you're using. FTP is easy; NFS/CIFS is a little harder. I'd simply use a load balancer (Zeus, NetScaler, Balance, HA-Proxy, etc.), but that is a little scary and bizarre in the case of NFS/CIFS, where you should instead use a single-server failover solution, such as Sun Cluster.

benr.
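The 60-second snapshot sync suggested above can be sketched as a loop of incremental sends. Dataset and host names are placeholders, and a production version would need error handling, but it shows the shape of the approach:

```shell
#!/bin/sh
# One-way sync of tank/share to server2 every 60 seconds via
# incremental zfs send/receive (hypothetical names throughout).
DS=tank/share
HOST=server2

PREV=sync-$(date +%s)
zfs snapshot "$DS@$PREV"                        # seed with a full copy
zfs send "$DS@$PREV" | ssh "$HOST" zfs receive -F "$DS"

while sleep 60; do
  NOW=sync-$(date +%s)
  zfs snapshot "$DS@$NOW"
  # Send only the blocks changed since the previous snapshot.
  zfs send -i "@$PREV" "$DS@$NOW" | ssh "$HOST" zfs receive -F "$DS"
  zfs destroy "$DS@$PREV"                       # keep only the latest
  PREV=$NOW
done
```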
Re: [zfs-discuss] Plugging in a hard drive after Solaris has booted up?
On 5/7/10 9:38 PM, Giovanni wrote:
> Hi guys, I have a quick question. I am playing around with ZFS, and here's what I did. I created a storage pool with several drives, then unplugged 3 out of 5 drives from the array. Currently:
>
>         NAME        STATE     READ WRITE CKSUM
>         gpool       UNAVAIL      0     0     0  insufficient replicas
>           raidz1    UNAVAIL      0     0     0  insufficient replicas
>             c8t2d0  UNAVAIL      0     0     0  cannot open
>             c8t4d0  UNAVAIL      0     0     0  cannot open
>             c8t0d0  UNAVAIL      0     0     0  cannot open
>
> These drives had power all the time; only the SATA cables were disconnected. After I logged into Solaris and opened Firefox, I plugged them back in and watched to see whether the storage pool would suddenly become available. It did not, so my question is: do I need to make Solaris re-detect the hard drives, and if so, how? I tried 'format -e' but it did not seem to detect the 3 drives I had just plugged back in. Is this a BIOS issue? Do hot-swap hard drives only work when you replace current drives (previously detected by the BIOS), not when you have ZFS/Solaris running and want to add more storage without shutting down? It all boils down to this scenario: as my array grows I will need to purchase more hard drives, and I would like to be able to add them to the storage pool (zpool) without shutting down.

There are lots of different things you can look at and do, but it comes down to just one command: devfsadm -vC. This will clean up (-C for cleanup, -v for verbose) the device tree if it gets into a funky state. Then run 'format' or 'iostat -En' to verify that the device(s) are there. Then re-import the zpool, or add the device, or whatever you wish to do. Even if device locations change, ZFS will do the right thing on import.

If you wish to dig deeper... normally, when you attach a new device, hot-plug will do the right thing and you'll see the connection messages in dmesg. If you want to explicitly check the state of dynamic reconfiguration, check out the cfgadm command. Normally, however, on modern versions of Solaris there is no reason to resort to that; it's just something fun if you wish to dig.

benr.
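Putting the advice above together, the recovery sequence for this scenario would look something like the following (device names taken from the status output in the question; the exact pool-recovery step depends on how the pool ended up unavailable):

```shell
# Clean up and rescan the device tree after re-attaching the cables.
devfsadm -vC

# Confirm the drives are visible again.
iostat -En | grep c8t
# (or run 'format' and look for c8t0d0 / c8t2d0 / c8t4d0)

# With the devices back, bring the pool online.
zpool clear gpool         # clear the 'cannot open' errors
zpool status gpool        # if still unavailable, try export/import:
# zpool export gpool && zpool import gpool
```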
Re: [zfs-discuss] Benchmarking Methodologies
On 4/21/10 2:15 AM, Robert Milkowski wrote:
> I haven't heard from you in a while! Good to see you here again :) Sorry for stating the obvious, but at the end of the day it depends on what your goals are. Are you interested in micro-benchmarks and comparison to other file systems? I think the most relevant filesystem benchmarks for users are those that benchmark a specific application and present results from an application point of view. For example, given a workload for Oracle, MySQL, LDAP, ... how quickly does it complete? How much benefit is there from using SSDs? What about other filesystems? Micro-benchmarks are fine but very hard for most users to interpret properly. Additionally, most benchmarks are almost useless if they are not compared to some other configuration with only the benchmarked component changed. For example, knowing that some MySQL load completes in 1h on ZFS is basically useless, but knowing that on the same hardware with Linux/ext3 under the same load it completes in 2h would be interesting to users. Another interesting thing would be the impact of different ZFS settings on benchmark results (recordsize aligned for the database vs. default, atime off vs. on, lzjb, gzip, SSD), as well as a comparison of results with all-default ZFS settings against whatever settings gave you the best result.

Hey Robert... I'm always around. :) You've made an excellent case for benchmarking and where it's useful, but what I'm asking for on this thread is for folks to share the research they've done, with as much specificity as possible, for research purposes. :) Let me illustrate:

To Darren's point on FileBench and vdbench... to date I've found these two to be the most useful. IOzone, while very popular, has always given me strange results which are inconsistent regardless of how large the block and data are. Given that the most important aspects of any benchmark are repeatability and sanity of results, I've found no value in IOzone any longer.

vdbench has become my friend, particularly in the area of physical disk profiling. Before tuning ZFS (or any filesystem) it's important to establish a solid baseline of performance on the underlying disk structure. Using a variety of vdbench profiles such as the following helps you pinpoint exactly the edges of the performance envelope:

sd=sd1,lun=/dev/rdsk/c0t1d0s0,threads=1
wd=wd1,sd=sd1,readpct=100,rhpct=0,seekpct=0
rd=run1,wd=wd1,iorate=max,elapsed=10,interval=1,forxfersize=(4k-4096k,d)

With vdbench and the workload above I can get consistent, reliable results time after time, and the results on other systems match. This is particularly key if you're running a hardware RAID controller under ZFS. There isn't anything dd can do that vdbench can't do better. Using a workload like the above, both at differing transfer sizes and at differing thread counts, really helps give an accurate picture of the disk's capabilities.

Moving up into the filesystem: I've been looking intently at improving my FileBench profiles, based on the supplied ones with tweaking. I'm trying to get to a methodology that provides time-after-time repeatable results for real comparison between systems. I'm looking hard at vdbench file workloads, but they aren't yet nearly as sophisticated as FileBench's. I am also looking at FIO (http://freshmeat.net/projects/fio/), which is FileBench-esque.

At the end of the day, I agree entirely that application benchmarks are far more effective judges... but they are also more time-consuming and less flexible than dedicated tools. The key is honing generic benchmarks to provide data which can be relied upon for making accurate estimates of application performance. When you start judging filesystem performance based on something like MySQL, there are simply too many variables involved.

So, I appreciate the Benchmarking 101, but I'm looking for anyone interested in sharing meat. Most of the existing ZFS benchmarks folks have published are several years old now, and most used IOzone.

benr.
[zfs-discuss] Benchmarking Methodologies
I'm doing a little research study on ZFS benchmarking and performance profiling. Like most, I've had my favorite methods, but I'm re-evaluating my choices and trying to be a bit more scientific than I have been in the past. To that end, I'm curious if folks wouldn't mind sharing their work on the subject? What tool(s) do you prefer in what situations? Do you have a standard method of running them (tool args: block sizes, thread counts, ...) or procedures between runs (zpool import/export, new dataset creation, ...)? etc. Any feedback is appreciated. I want to get a good sampling of opinions. Thanks! benr. 
Re: [zfs-discuss] abusing zfs boot disk for fun and DR
Ben, I have found that booting from cdrom and importing the pool on the new host, then booting from the hard disk, will prevent these issues. That will reconfigure ZFS to use the new disk device. Once running, zpool detach the missing mirror device and attach a new one. Thanks. I'm well versed in dealing with ZFS issues. The reason I reported this boot/rpool issue was that it was similar in nature to issues that occurred trying to remediate an x4500 which had suffered many SATA disks going offline (due to the buggy Marvell driver), as well as corruption that occurred while trying to fix said issue. Backline spent a fair amount of time just trying to remediate the issue with hot spares that looked exactly like the faulted config in my rpool. -- This message posted from opensolaris.org
[zfs-discuss] abusing zfs boot disk for fun and DR
I'm in the process of standing up a couple of T5440s, of which the config will eventually end up in another data center 6k miles from the original config, and I'm supposed to send disks to the data center and we'll start from there (yes, I know how to flar and jumpstart; when the boss says do something, sometimes you *just* have to do it). As I've already run into the boot failsafe when moving a root disk from one SPARC host to another, I recently found out that a sys-unconfig'd disk does not suffer from the same problem. While I am probably going to be told I shouldn't be doing this, I ran into an interesting semantics issue that I think ZFS should at least be able to avoid (and which I have seen in other, non-abusive configurations ;-) Two ZFS disks, root mirrored: c2t0 and c2t1. Hot unplug c2t0 (and I should probably have removed the busted mirror from c2t1, but I didn't), sys-unconfig the disk in c2t1, move the disk to the new T5440, boot the disk, and it enumerates everything correctly... and then I notice zpool thinks it's degraded. (I had added the mirror back after I realized I wanted to run this by the list.)

  pool: rpool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid. Sufficient replicas exist for the pool to continue functioning
        in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: resilver completed after 0h7m with 0 errors on Thu Jan 7 12:10:03 2010
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         DEGRADED     0     0     0
          mirror      DEGRADED     0     0     0
            c2t0d0s0  ONLINE       0     0     0
            c2t0d0s0  FAULTED      0     0     0  corrupted data
            c2t3d0s0  ONLINE       0     0     0  13.8G resilvered

Anyway, should ZFS report a faulted drive with the same ctd# as one that is already active? I understand why this happened, but from a logistics perspective, shouldn't ZFS be smart enough to ignore a faulted disk like this? 
And this is not the first time I've had this scenario happen (I had an x4500 that had suffered through months of Marvell driver bugs and corruption, and we probably had 2 or 3 of these types of things happen while trying to soft-fix the problems). This also happened with hot spares, which caused support to spend some time with back-line to figure out a procedure to clear those faulted disks which had the same ctd# as a working hot spare... Ben 
Re: [zfs-discuss] Liveupgrade'd to U8 and now can't boot previous U6 BE :(
Hi, As a related issue to this (specifically CR 6884728) - any ideas how I should go about removing the old BE? When I attempt to run ludelete I get the following:

$ lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
10_05-09                   yes      no     no        yes    -
10_10-09                   yes      yes    yes       no     -

$ ludelete 10_05-09
System has findroot enabled GRUB
Checking if last BE on any disk...
ERROR: cannot mount '/.alt.10_05-09/var': directory is not empty
ERROR: cannot mount mount point /.alt.10_05-09/var device rpool/ROOT/s10x_u7wos_08/var
ERROR: failed to mount file system rpool/ROOT/s10x_u7wos_08/var on /.alt.10_05-09/var
ERROR: unmounting partially mounted boot environment file systems
ERROR: No such file or directory: error unmounting rpool/ROOT/s10x_u7wos_08
ERROR: cannot mount boot environment by name 10_05-09
ERROR: Failed to mount BE 10_05-09.
ERROR: Failed to mount BE 10_05-09.
cat: cannot open /tmp/.lulib.luclb.dsk.2797.10_05-09
ERROR: This boot environment 10_05-09 is the last BE on the above disk.
ERROR: Deleting this BE may make it impossible to boot from this disk.
ERROR: However you may still boot solaris if you have BE(s) on other disks.
ERROR: You *may* have to change boot-device order in the BIOS to accomplish this.
ERROR: If you still want to delete this BE 10_05-09, please use the force option (-f).
Unable to delete boot environment.
My zfs setup now shows this:

NAME                               USED  AVAIL  REFER  MOUNTPOINT
rpool                             11.4G  4.26G  39.5K  /rpool
rpool/ROOT                        9.15G  4.26G    18K  legacy
rpool/ROOT/10_10-09               9.14G  4.26G  4.04G  /
rpool/ROOT/10_10-09@10_10-09      2.39G      -  4.10G  -
rpool/ROOT/10_10-09/var           2.71G  4.26G  1.18G  /var
rpool/ROOT/10_10-09/var@10_10-09  1.53G      -  2.11G  -
rpool/ROOT/s10x_u7wos_08          17.4M  4.26G  4.10G  /.alt.10_05-09
rpool/ROOT/s10x_u7wos_08/var      9.05M  4.26G  2.11G  /.alt.10_05-09/var
rpool/dump                        1.00G  4.26G  1.00G  -
rpool/export                      74.6M  4.26G    19K  /export
rpool/export/home                 74.5M  4.26G    21K  /export/home
rpool/export/home/admin           65.5K  4.26G  65.5K  /export/home/admin
rpool/swap                           1G  4.71G   560M  -

It seems that the ludelete script reassigns the mountpoint for the BE to be deleted - but falls foul of the /var mount underneath the old BE. I tried lumounting the old BE and checked the /etc/vfstab - but there are no extra zfs entries in there. I'm just looking for a clean way to remove the old BE, and then remove the old snapshot, without interfering with Live Upgrade in the future. Many thanks, Ben 
Re: [zfs-discuss] Liveupgrade'd to U8 and now can't boot previous U6 BE :(
+ dev=`echo $dev | sed 's/mirror.*/mirror/'` Thanks for the suggestion, Kurt. However, I'm not running a mirror on that pool - so I'm guessing this won't help in my case. I'll try and pick my way through the lulib script if I get any time. Ben 
Re: [zfs-discuss] Increase size of ZFS mirror
Thanks very much everyone. Victor, I did think about using VirtualBox, but I have a real machine and a supply of hard drives for a short time, so I'll test it out using that if I can. Scott, of course; at work we use three mirrors and it works very well - it has saved us on occasion where we have detached the third mirror, upgraded, found the upgrade failed, and been able to revert from the third mirror instead of having to go through backups. George, it will be great to see 'autoexpand' in the next release. I'm keeping my home server on stable releases for the time being :) 
[zfs-discuss] Increase size of ZFS mirror
Hi all, I have a ZFS mirror of two 500GB disks and I'd like to up these to 1TB disks - how can I do this? I must break the mirror, as I don't have enough controller ports on my system board. My current mirror looks like this:

root@beleg-ia:/share/media# zpool status share
  pool: share
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        share       ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c5d0s0  ONLINE       0     0     0
            c5d1s0  ONLINE       0     0     0

errors: No known data errors

If I detach c5d1s0, add a 1TB drive, attach that, wait for it to resilver, then detach c5d0s0 and add another 1TB drive and attach that to the zpool, will that up the storage of the pool? Thanks very much, Ben 
Re: [zfs-discuss] Increase size of ZFS mirror
Thomas, Could you post an example of what you mean (i.e. the commands in the order to use them)? I've not played with ZFS that much and I don't want to muck my system up (I have data backed up, but am more concerned about getting myself in a mess and having to reinstall, thus losing my configurations). Many thanks for both of your replies, Ben 
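For the archive, the disk-at-a-time swap discussed in this thread looks roughly like the following sketch (the c6d* names are placeholders for wherever the new 1TB disks enumerate on your controller; on releases without the autoexpand pool property, the extra capacity typically only shows up after the final detach plus an export/import or reboot):

```
# Swap one side of the mirror for a 1TB disk and let it resilver
zpool detach share c5d1s0
zpool attach share c5d0s0 c6d0s0   # c6d0s0 = first new 1TB disk (placeholder name)
zpool status share                 # wait until the resilver completes

# Then swap the other side the same way
zpool detach share c5d0s0
zpool attach share c6d0s0 c6d1s0   # c6d1s0 = second new 1TB disk (placeholder name)
zpool status share                 # wait for the resilver to complete
```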
Re: [zfs-discuss] Increase size of ZFS mirror
Many thanks Thomas, I have a test machine so I shall try it on that before I try it on my main system. Thanks very much once again, Ben 
Re: [zfs-discuss] [Fwd: Re: [perf-discuss] ZFS performance issue - READ is slow as hell...]
Ya, I agree that we need some additional data and testing. The iostat data in itself doesn't suggest to me that the process (dd) is slow, but rather that most of the data is being retrieved elsewhere (ARC). An fsstat would be useful to correlate with the iostat data. One thing that also comes to mind with streaming write performance is the effect of the write throttle... curious if he'd have gotten more on the write side with that disabled. None of these things strike me particularly as bugs (although there is always room for improvement), but rather that ZFS is designed for real-world environments, not antiquated benchmarks. benr. Jim Mauro wrote: Posting this back to zfs-discuss. Roland's test case (below) is a single-threaded sequential write followed by a single-threaded sequential read. His bandwidth goes from horrible (~2MB/sec) to expected (~30MB/sec) when prefetch is disabled. This is with relatively recent nv bits (nv110). Roland - I'm wondering if you were tripping over CR 6732803 "ZFS prefetch creates performance issues for streaming workloads". It seems possible, but that CR is specifically about multiple, concurrent IO streams, and your test case was only one. I think it's more likely you were tripping over CR 6412053 "zfetch needs a whole lotta love". For both CRs the workaround is disabling prefetch (echo zfs_prefetch_disable/W 1 | mdb -kw). Any other theories on this test case? Thanks, /jim Original Message Subject: Re: [perf-discuss] ZFS performance issue - READ is slow as hell... Date: Tue, 31 Mar 2009 02:33:00 -0700 (PDT) From: roland devz...@web.de To: perf-disc...@opensolaris.org Hello Jim, I double-checked again - but it's as I said: echo zfs_prefetch_disable/W0t1 | mdb -kw fixes my problem. I did a reboot and set only this single param - which immediately makes the read throughput go up from ~2 MB/s to ~30 MB/s. I don't understand why disabling ZFS prefetch solved this problem. 
The test case was a single threaded sequential write, followed by a single threaded sequential read. i did not even do a single write - after reboot i just did dd if=/zfs/TESTFILE of=/dev/null Solaris Express Community Edition snv_110 X86 FSC RX300 S2 4GB RAM LSI Logic MegaRaid 320 Onboard SCSI Raid Controller 1x Raid1 LUN 1x Raid5 LUN (3 Disks) (both LUN`s show same behaviour) before: extended device statistics r/sw/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device 21.30.1 2717.60.1 0.7 0.0 31.81.7 2 4 c0t1d0 0.00.00.00.0 35.0 0.00.00.0 100 0 c0t1d0 0.00.00.00.0 35.0 0.00.00.0 100 0 c0t1d0 0.00.00.00.0 35.0 0.00.00.0 100 0 c0t1d0 16.00.0 2048.40.0 34.9 0.1 2181.84.8 100 3 c0t1d0 0.00.00.00.0 35.0 0.00.00.0 100 0 c0t1d0 0.00.00.00.0 35.0 0.00.00.0 100 0 c0t1d0 0.00.00.00.0 35.0 0.00.00.0 100 0 c0t1d0 0.00.00.00.0 35.0 0.00.00.0 100 0 c0t1d0 28.00.0 3579.20.0 34.8 0.1 1246.24.9 100 5 c0t1d0 0.00.00.00.0 35.0 0.00.00.0 100 0 c0t1d0 0.00.00.00.0 35.0 0.00.00.0 100 0 c0t1d0 0.00.00.00.0 35.0 0.00.00.0 100 0 c0t1d0 0.00.00.00.0 35.0 0.00.00.0 100 0 c0t1d0 45.00.0 5760.40.0 34.8 0.2 772.74.5 100 7 c0t1d0 0.00.00.00.0 35.0 0.00.00.0 100 0 c0t1d0 0.00.00.00.0 35.0 0.00.00.0 100 0 c0t1d0 0.00.00.00.0 35.0 0.00.00.0 100 0 c0t1d0 0.00.00.00.0 35.0 0.00.00.0 100 0 c0t1d0 19.00.0 2431.90.0 34.9 0.1 1837.34.4 100 3 c0t1d0 0.00.00.00.0 35.0 0.00.00.0 100 0 c0t1d0 0.00.00.00.0 35.0 0.00.00.0 100 0 c0t1d0 0.00.00.00.0 35.0 0.00.00.0 100 0 c0t1d0 0.00.00.00.0 35.0 0.00.00.0 100 0 c0t1d0 58.00.0 7421.10.0 34.6 0.3 597.45.8 100 12 c0t1d0 0.00.00.00.0 35.0 0.00.00.0 100 0 c0t1d0 0.00.00.00.0 35.0 0.00.00.0 100 0 c0t1d0 0.00.00.00.0 35.0 0.00.00.0 100 0 c0t1d0 after: extended device statistics r/sw/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device 218.00.0 27842.30.0 0.0 0.40.11.8 1 40 c0t1d0 241.00.0 30848.00.0 0.0 0.40.01.6 0 38 c0t1d0 237.00.0 30340.10.0 0.0 0.40.01.6 0 38 c0t1d0 230.00.0 29434.70.0 0.0 0.40.01.8 0 40 c0t1d0 238.10.0 30471.30.0 0.0 0.40.01.5 0 37 c0t1d0 234.90.0 
30001.90.0 0.0 0.40.01.6 1 37 c0t1d0 220.10.0 28171.40.0 0.0 0.4
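For reference, the mdb write quoted in this thread pokes the tunable in the live kernel and does not survive a reboot. Assuming the tunable keeps its name on your build, the usual way to make the workaround persistent is an /etc/system entry:

```
* Disable ZFS file-level prefetch (workaround discussed for CR 6412053 / CR 6732803)
set zfs:zfs_prefetch_disable = 1
```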
Re: [zfs-discuss] zpool status -x strangeness
# zpool status -xv all pools are healthy Ben What does 'zpool status -xv' show? On Tue, Jan 27, 2009 at 8:01 AM, Ben Miller mil...@eecis.udel.edu wrote: I forgot the pool that's having problems was recreated recently so it's already at zfs version 3. I just did a 'zfs upgrade -a' for another pool, but some of those filesystems failed since they are busy and couldn't be unmounted. # zfs upgrade -a cannot unmount '/var/mysql': Device busy cannot unmount '/var/postfix': Device busy 6 filesystems upgraded 821 filesystems already at this version Ben 
Re: [zfs-discuss] zpool status -x strangeness
I forgot the pool that's having problems was recreated recently so it's already at zfs version 3. I just did a 'zfs upgrade -a' for another pool, but some of those filesystems failed since they are busy and couldn't be unmounted. # zfs upgrade -a cannot unmount '/var/mysql': Device busy cannot unmount '/var/postfix': Device busy 6 filesystems upgraded 821 filesystems already at this version Ben You can upgrade live. 'zfs upgrade' with no arguments shows you the zfs version status of the filesystems present without upgrading anything. On Jan 24, 2009, at 10:19 AM, Ben Miller mil...@eecis.udel.edu wrote: We haven't done 'zfs upgrade ...' at all. I'll give that a try the next time the system can be taken down. Ben A little gotcha that I found in my 10u6 update process was that 'zpool upgrade [poolname]' is not the same as 'zfs upgrade [poolname]/[filesystem(s)]' What does 'zfs upgrade' say? I'm not saying this is the source of your problem, but it's a detail that seemed to affect stability for me. 
Re: [zfs-discuss] zpool status -x strangeness
We haven't done 'zfs upgrade ...' at all. I'll give that a try the next time the system can be taken down. Ben A little gotcha that I found in my 10u6 update process was that 'zpool upgrade [poolname]' is not the same as 'zfs upgrade [poolname]/[filesystem(s)]' What does 'zfs upgrade' say? I'm not saying this is the source of your problem, but it's a detail that seemed to affect stability for me. On Thu, Jan 22, 2009 at 7:25 AM, Ben Miller The pools are upgraded to version 10. Also, this is on Solaris 10u6. 
Re: [zfs-discuss] zpool status -x strangeness
The pools are upgraded to version 10. Also, this is on Solaris 10u6. # zpool upgrade This system is currently running ZFS pool version 10. All pools are formatted using this version. Ben What's the output of 'zfs upgrade' and 'zpool upgrade'? (I'm just curious - I had a similar situation which seems to be resolved now that I've gone to Solaris 10u6 or OpenSolaris 2008.11). On Wed, Jan 21, 2009 at 2:11 PM, Ben Miller mil...@eecis.udel.edu wrote: Bug ID is 6793967. This problem just happened again. % zpool status pool1 pool: pool1 state: DEGRADED scrub: resilver completed after 0h48m with 0 errors on Mon Jan 5 12:30:52 2009 config: NAME STATE READ WRITE CKSUM pool1 DEGRADED 0 0 0 raidz2 DEGRADED 0 0 0 c4t8d0s0 ONLINE 0 0 0 c4t9d0s0 ONLINE 0 0 0 c4t10d0s0 ONLINE 0 0 0 c4t11d0s0 ONLINE 0 0 0 c4t12d0s0 REMOVED 0 0 0 c4t13d0s0 ONLINE 0 0 0 errors: No known data errors % zpool status -x all pools are healthy % # zpool online pool1 c4t12d0s0 % zpool status -x pool: pool1 state: ONLINE status: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state. action: Wait for the resilver to complete. scrub: resilver in progress for 0h0m, 0.12% done, 2h38m to go config: NAME STATE READ WRITE CKSUM pool1 ONLINE 0 0 0 raidz2 ONLINE 0 0 0 c4t8d0s0 ONLINE 0 0 0 c4t9d0s0 ONLINE 0 0 0 c4t10d0s0 ONLINE 0 0 0 c4t11d0s0 ONLINE 0 0 0 c4t12d0s0 ONLINE 0 0 0 c4t13d0s0 ONLINE 0 0 0 errors: No known data errors % Ben -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zpool status -x strangeness
Bug ID is 6793967. This problem just happened again. % zpool status pool1 pool: pool1 state: DEGRADED scrub: resilver completed after 0h48m with 0 errors on Mon Jan 5 12:30:52 2009 config: NAME STATE READ WRITE CKSUM pool1 DEGRADED 0 0 0 raidz2 DEGRADED 0 0 0 c4t8d0s0 ONLINE 0 0 0 c4t9d0s0 ONLINE 0 0 0 c4t10d0s0 ONLINE 0 0 0 c4t11d0s0 ONLINE 0 0 0 c4t12d0s0 REMOVED 0 0 0 c4t13d0s0 ONLINE 0 0 0 errors: No known data errors % zpool status -x all pools are healthy % # zpool online pool1 c4t12d0s0 % zpool status -x pool: pool1 state: ONLINE status: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state. action: Wait for the resilver to complete. scrub: resilver in progress for 0h0m, 0.12% done, 2h38m to go config: NAME STATE READ WRITE CKSUM pool1 ONLINE 0 0 0 raidz2 ONLINE 0 0 0 c4t8d0s0 ONLINE 0 0 0 c4t9d0s0 ONLINE 0 0 0 c4t10d0s0 ONLINE 0 0 0 c4t11d0s0 ONLINE 0 0 0 c4t12d0s0 ONLINE 0 0 0 c4t13d0s0 ONLINE 0 0 0 errors: No known data errors % Ben I just put in a (low priority) bug report on this. Ben This post from close to a year ago never received a response. We just had this same thing happen to another server that is running Solaris 10 U6. One of the disks was marked as removed and the pool degraded, but 'zpool status -x' says all pools are healthy. After doing an 'zpool online' on the disk it resilvered in fine. Any ideas why 'zpool status -x' reports all healthy while 'zpool status' shows a pool in degraded mode? thanks, Ben We run a cron job that does a 'zpool status -x' to check for any degraded pools. We just happened to find a pool degraded this morning by running 'zpool status' by hand and were surprised that it was degraded as we didn't get a notice from the cron job. 
# uname -srvp
SunOS 5.11 snv_78 i386
# zpool status -x
all pools are healthy
# zpool status pool1
  pool: pool1
 state: DEGRADED
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        pool1        DEGRADED     0     0     0
          raidz1     DEGRADED     0     0     0
            c1t8d0   REMOVED      0     0     0
            c1t9d0   ONLINE       0     0     0
            c1t10d0  ONLINE       0     0     0
            c1t11d0  ONLINE       0     0     0

errors: No known data errors

I'm going to look into why the disk is listed as removed. Does this look like a bug with 'zpool status -x'? Ben 
Re: [zfs-discuss] zpool status -x strangeness
I just put in a (low priority) bug report on this. Ben This post from close to a year ago never received a response. We just had this same thing happen to another server that is running Solaris 10 U6. One of the disks was marked as removed and the pool degraded, but 'zpool status -x' says all pools are healthy. After doing a 'zpool online' on the disk it resilvered in fine. Any ideas why 'zpool status -x' reports all healthy while 'zpool status' shows a pool in degraded mode? thanks, Ben We run a cron job that does a 'zpool status -x' to check for any degraded pools. We just happened to find a pool degraded this morning by running 'zpool status' by hand and were surprised that it was degraded, as we didn't get a notice from the cron job.

# uname -srvp
SunOS 5.11 snv_78 i386
# zpool status -x
all pools are healthy
# zpool status pool1
  pool: pool1
 state: DEGRADED
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        pool1        DEGRADED     0     0     0
          raidz1     DEGRADED     0     0     0
            c1t8d0   REMOVED      0     0     0
            c1t9d0   ONLINE       0     0     0
            c1t10d0  ONLINE       0     0     0
            c1t11d0  ONLINE       0     0     0

errors: No known data errors

I'm going to look into why the disk is listed as removed. Does this look like a bug with 'zpool status -x'? Ben 
Re: [zfs-discuss] zpool status -x strangeness
This post from close to a year ago never received a response. We just had this same thing happen to another server that is running Solaris 10 U6. One of the disks was marked as removed and the pool degraded, but 'zpool status -x' says all pools are healthy. After doing a 'zpool online' on the disk it resilvered in fine. Any ideas why 'zpool status -x' reports all healthy while 'zpool status' shows a pool in degraded mode? thanks, Ben We run a cron job that does a 'zpool status -x' to check for any degraded pools. We just happened to find a pool degraded this morning by running 'zpool status' by hand and were surprised that it was degraded, as we didn't get a notice from the cron job.

# uname -srvp
SunOS 5.11 snv_78 i386
# zpool status -x
all pools are healthy
# zpool status pool1
  pool: pool1
 state: DEGRADED
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        pool1        DEGRADED     0     0     0
          raidz1     DEGRADED     0     0     0
            c1t8d0   REMOVED      0     0     0
            c1t9d0   ONLINE       0     0     0
            c1t10d0  ONLINE       0     0     0
            c1t11d0  ONLINE       0     0     0

errors: No known data errors

I'm going to look into why the disk is listed as removed. Does this look like a bug with 'zpool status -x'? Ben 
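Since 'zpool status -x' missed the REMOVED device here, one workaround for the cron check is to grep the full 'zpool status' output for bad states instead of trusting -x. A minimal sketch, shown against a canned copy of the status output from this thread so the logic is visible (in the real cron job you would set status from `zpool status 2>&1`; the state list is illustrative, adjust to taste):

```shell
#!/bin/sh
# Canned copy of the 'zpool status' output from this thread; in a real
# cron job use:  status=`zpool status 2>&1`
status='  pool: pool1
 state: DEGRADED
 scrub: none requested
config:
        pool1        DEGRADED     0     0     0
          raidz1     DEGRADED     0     0     0
            c1t8d0   REMOVED      0     0     0
            c1t9d0   ONLINE       0     0     0'

# Flag any pool or vdev in a bad state, rather than relying on -x
if echo "$status" | grep -E 'DEGRADED|FAULTED|REMOVED|UNAVAIL|OFFLINE' >/dev/null
then
    echo "pool problem detected"
else
    echo "pools look healthy"
fi
```

The check errs on the side of noise (e.g. an intentionally offlined disk also trips it), which for a monitoring cron job is usually the right trade-off.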
[zfs-discuss] zdb to dump data
Is there some hidden way to coax zdb into not just displaying data based on a given DVA, but rather dumping it in raw usable form? I've got a pool with large amounts of corruption. Several directories are toast and I get an I/O error when trying to enter or read the directory... however, I can read the directory and files using zdb; if I could just dump them in a raw format I could do recovery that way. To be clear, I've already recovered from the situation - this is purely an academic "can I do it" exercise for the sake of learning. If zdb can't do it, I'd assume I'd have to write some code to read based on DVA. Maybe I could write a little tool for it. benr. 
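One avenue that may be worth testing (hedged: -R is undocumented and its exact syntax has varied between builds): zdb has a -R option intended for roughly this, reading a raw block out of a pool by vdev:offset:size so it can be redirected to a file. The pool name, vdev id, and hex offset/size below are placeholders, not values from this pool:

```
# Dump a raw block by location; the trailing 'r' asks for a raw dump to stdout
zdb -R backup:0:411600:20000:r > block.raw
```

If -R on your build doesn't accept this form, the usage message (run zdb with bad args) or zdb.c in the source shows the variant it expects.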
Re: [zfs-discuss] Lost Disk Space
No takers? :) benr. 
[zfs-discuss] Lost Disk Space
I've been struggling to fully understand why disk space seems to vanish. I've dug through bits of code and reviewed all the mails on the subject that I can find, but I still don't have a proper understanding of whats going on. I did a test with a local zpool on snv_97... zfs list, zpool list, and zdb all seem to disagree on how much space is available. In this case its only a discrepancy of about 20G or so, but I've got Thumpers that have a discrepancy of over 6TB! Can someone give a really detailed explanation about whats going on? block traversal size 670225837056 != alloc 720394438144 (leaked 50168601088) bp count:15182232 bp logical:672332631040 avg: 44284 bp physical: 669020836352 avg: 44066compression: 1.00 bp allocated: 670225837056 avg: 44145compression: 1.00 SPA allocated: 720394438144 used: 96.40% Blocks LSIZE PSIZE ASIZE avgcomp %Total Type 12 120K 26.5K 79.5K 6.62K4.53 0.00 deferred free 1512 512 1.50K 1.50K1.00 0.00 object directory 3 1.50K 1.50K 4.50K 1.50K1.00 0.00 object array 116K 1.50K 4.50K 4.50K 10.67 0.00 packed nvlist - - - - - -- packed nvlist size 72 8.45M889K 2.60M 37.0K9.74 0.00 bplist - - - - - -- bplist header - - - - - -- SPA space map header 974 4.48M 2.65M 7.94M 8.34K1.70 0.00 SPA space map - - - - - -- ZIL intent log 96.7K 1.51G389M777M 8.04K3.98 0.12 DMU dnode 17 17.0K 8.50K 17.5K 1.03K2.00 0.00 DMU objset - - - - - -- DSL directory 13 6.50K 6.50K 19.5K 1.50K1.00 0.00 DSL directory child map 12 6.00K 6.00K 18.0K 1.50K1.00 0.00 DSL dataset snap map 14 38.0K 10.0K 30.0K 2.14K3.80 0.00 DSL props - - - - - -- DSL dataset - - - - - -- ZFS znode 2 1K 1K 2K 1K1.00 0.00 ZFS V0 ACL 5.81M 558G557G557G 95.8K1.0089.27 ZFS plain file 382K 301M200M401M 1.05K1.50 0.06 ZFS directory 9 4.50K 4.50K 9.00K 1K1.00 0.00 ZFS master node 12 482K 20.0K 40.0K 3.33K 24.10 0.00 ZFS delete queue 8.20M 66.1G 65.4G 65.8G 8.03K1.0110.54 zvol object 1512 512 1K 1K1.00 0.00 zvol prop - - - - - -- other uint8[] - - - - - -- other uint64[] - - - - - -- other ZAP - - - 
     -      -      -      -      -      -      -  persistent error log
     1   128K  10.5K  31.5K  31.5K  12.19   0.00  SPA history
     -      -      -      -      -      -      -  SPA history offsets
     -      -      -      -      -      -      -  Pool properties
     -      -      -      -      -      -      -  DSL permissions
     -      -      -      -      -      -      -  ZFS ACL
     -      -      -      -      -      -      -  ZFS SYSACL
     -      -      -      -      -      -      -  FUID table
     -      -      -      -      -      -      -  FUID table size
     5  3.00K  2.50K  7.50K  1.50K   1.20   0.00  DSL dataset next clones
     -      -      -      -      -      -      -  scrub work queue
 14.5M   626G   623G   624G  43.1K   1.00 100.00  Total

real    21m16.862s
user    0m36.984s
sys     0m5.757s

===

Looking at the data:

[EMAIL PROTECTED] ~$ zfs list backup ; zpool list backup
NAME     USED  AVAIL  REFER  MOUNTPOINT
backup   685G   237K    27K  /backup
NAME     SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
backup   696G   671G  25.1G   96%  ONLINE  -

So zdb says 626GB is used, zfs list says 685GB is used, and zpool list says 671GB is used. The pool was filled to 100% capacity via dd - this is confirmed, I can't write data - but yet zpool list says it's only 96%. benr. 
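To make the zdb "leaked" message concrete: the leaked figure is simply the difference between what the SPA thinks is allocated and what the block traversal actually accounted for. Recomputing from the two numbers in the dump above (shell + awk; the values are copied from this message, not live output):

```shell
# Numbers from the zdb output above
alloc=720394438144        # "SPA allocated" bytes
traversed=670225837056    # bytes accounted for by the block traversal

# leaked = alloc - traversed; also express it in GiB for scale
awk -v a="$alloc" -v t="$traversed" 'BEGIN {
    leaked = a - t
    printf "leaked bytes: %d (%.1f GiB)\n", leaked, leaked / (1024 ^ 3)
}'
# prints: leaked bytes: 50168601088 (46.7 GiB)
```

which matches the "leaked 50168601088" figure zdb itself reports, i.e. roughly 47GB of the ~50GB discrepancy on this pool is the traversal leak rather than accounting differences between the tools.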
[zfs-discuss] Couple of ZFS panics...
I've got a Intel DP35DP Motherboard, Q6600 proc (Intel 2.4G, 4 core), 4GB of ram and a copule of Sata disks, running ICH9. S10U5, patched about a week ago or so... I have a zpool on a single slice (haven't added a mirror yet, was getting to that) and have started to suffer regular hard resets and have gotten a few panics. The system is an nfs server for a couple of systems (not much write) and one writer (I do my svn updates over NFS cause my ath0 board refuses to work in 64-bit on S10U5) I also do local builds on the same server. Ideas? The first looks like: panic[cpu0]/thread=9bcf0460: BAD TRAP: type=e (#pf Page fault) rp=fe80001739a0 addr=c064dba0 cmake: #pf Page fault Bad kernel fault at addr=0xc064dba0 pid=6797, pc=0xf0a6350a, sp=0xfe8000173a90, eflags=0x10207 cr0: 80050033pg,wp,ne,et,mp,pe cr4: 6f8xmme,fxsr,pge,mce,pae,pse,de cr2: c064dba0 cr3: 12bf9b000 cr8: c rdi: 6c60 rsi:0 rdx:0 rcx:0 r8: 8b21017f r9: ae3a79c0 rax:0 rbx: c0611f40 rbp: fe8000173ac0 r10:0 r11:0 r12: ae4687d0 r13: d8c200 r14:2 r15: 826c0480 fsb: 8000 gsb: fbc24ec0 ds: 43 es: 43 fs:0 gs: 1c3 trp:e err:0 rip: f0a6350a cs: 28 rfl:10207 rsp: fe8000173a90 ss: 30 fe80001738b0 unix:real_mode_end+71e1 () fe8000173990 unix:trap+5e6 () fe80001739a0 unix:_cmntrap+140 () fe8000173ac0 zfs:zio_buf_alloc+a () fe8000173af0 zfs:arc_buf_alloc+9f () fe8000173b70 zfs:arc_read+ee () fe8000173bf0 zfs:dbuf_read_impl+1a0 () fe8000173c30 zfs:zfsctl_ops_root+304172dd () fe8000173c60 zfs:dmu_tx_check_ioerr+6e () fe8000173cc0 zfs:dmu_tx_count_write+73 () fe8000173cf0 zfs:dmu_tx_hold_write+4a () fe8000173db0 zfs:zfs_write+1bb () fe8000173e00 genunix:fop_write+31 () fe8000173eb0 genunix:write+287 () fe8000173ec0 genunix:write32+e () fe8000173f10 unix:brand_sys_sysenter+1f2 () syncing file systems... 
3130 15 done dumping to /dev/dsk/c0t0d0s1, offset 860356608, content: kernel NOTICE: ahci_tran_reset_dport: port 0 reset port The second liek this: panic[cpu2]/thread=9b425f20: BAD TRAP: type=e (#pf Page fault) rp=fe80018cdf40 addr=c064dba0 nfsd: #pf Page fault Bad kernel fault at addr=0xc064dba0 pid=665, pc=0xf0a6350a, sp=0xfe80018ce030, eflags=0x10207 cr0: 8005003bpg,wp,ne,et,ts,mp,pe cr4: 6f8xmme,fxsr,pge,mce,pae,pse,de cr2: c064dba0 cr3: 12a9df000 cr8: c rdi: 6c60 rsi:0 rdx:0 rcx:0 r8: 8b21017f r9:f rax:0 rbx: c0611f40 rbp: fe80018ce060 r10:0 r11:0 r12: fe81c20ecf00 r13: d8c200 r14:2 r15: 826c2240 fsb: 8000 gsb: 81a6c800 ds: 43 es: 43 fs:0 gs: 1c3 trp:e err:0 rip: f0a6350a cs: 28 rfl:10207 rsp: fe80018ce030 ss: 30 fe80018cde50 unix:real_mode_end+71e1 () fe80018cdf30 unix:trap+5e6 () fe80018cdf40 unix:_cmntrap+140 () fe80018ce060 zfs:zio_buf_alloc+a () fe80018ce090 zfs:arc_buf_alloc+9f () fe80018ce110 zfs:arc_read+ee () fe80018ce190 zfs:dbuf_read_impl+1a0 () fe80018ce1d0 zfs:zfsctl_ops_root+304172dd () fe80018ce200 zfs:dmu_tx_check_ioerr+6e () fe80018ce260 zfs:dmu_tx_count_write+73 () fe80018ce290 zfs:dmu_tx_hold_write+4a () fe80018ce350 zfs:zfs_write+1bb () fe80018ce3a0 genunix:fop_write+31 () fe80018ce410 nfssrv:do_io+b5 () fe80018ce610 nfssrv:rfs4_op_write+40e () fe80018ce770 nfssrv:rfs4_compound+1b3 () fe80018ce800 nfssrv:rfs4_dispatch+234 () fe80018ceb10 nfssrv:common_dispatch+88a () fe80018ceb20 nfssrv:nfs4_drc+3051ccc1 () fe80018cebf0 rpcmod:svc_getreq+209 () fe80018cec40 rpcmod:svc_run+124 () fe80018cec70 rpcmod:svc_do_run+88 () fe80018ceec0 nfs:nfssys+208 () fe80018cef10 unix:brand_sys_sysenter+1f2 () syncing file systems... done dumping to /dev/dsk/c0t0d0s1, offset 860356608, content: kernel NOTICE: ahci_tran_reset_dport: port 0 reset port This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ARCSTAT Kstat Definitions
Thanks - not as much as I was hoping for, but still extremely helpful. Can you, or others, have a look at this: http://cuddletech.com/arc_summary.html This is a Perl script that uses kstats to drum up a report such as the following:

System Memory:
        Physical RAM:  32759 MB
        Free Memory :  10230 MB
        LotsFree:      511 MB

ARC Size:
        Current Size:             7989 MB (arcsize)
        Target Size (Adaptive):   8192 MB (c)
        Min Size (Hard Limit):    1024 MB (zfs_arc_min)
        Max Size (Hard Limit):    8192 MB (zfs_arc_max)

ARC Size Breakdown:
        Most Recently Used Cache Size:   13%    1087 MB (p)
        Most Frequently Used Cache Size: 86%    7104 MB (c-p)

ARC Efficiency:
        Cache Access Total:     3947194710
        Cache Hit Ratio:   99%  3944674329
        Cache Miss Ratio:   0%  2520381
        Data Demand Efficiency:    99%
        Data Prefetch Efficiency:  69%

        CACHE HITS BY CACHE LIST:
          Anon:                        0%  16730069
          Most Frequently Used:       99%  3915830091 (mfu)
          Most Recently Used:          0%  10490502 (mru)
          Most Frequently Used Ghost:  0%  439554 (mfu_ghost)
          Most Recently Used Ghost:    0%  1184113 (mru_ghost)

        CACHE HITS BY DATA TYPE:
          Demand Data:       99%  3914527790
          Prefetch Data:      0%  2447831
          Demand Metadata:    0%  10709326
          Prefetch Metadata:  0%  16989382

        CACHE MISSES BY DATA TYPE:
          Demand Data:       45%  1144679
          Prefetch Data:     42%  1068975
          Demand Metadata:    5%  132649
          Prefetch Metadata:  6%  174078

Feedback and input is welcome, in particular if I'm mischaracterizing data. benr. 
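As a sanity check on the ratios the script reports: each percentage is just that counter over the access total. Recomputing from the raw hit/miss counters in the report above (shell + awk; numbers copied from the report, not pulled from live kstats):

```shell
# Raw counters from the ARC report above
hits=3944674329
misses=2520381

awk -v h="$hits" -v m="$misses" 'BEGIN {
    total = h + m
    printf "total accesses: %d\n", total
    printf "hit ratio:  %.2f%%\n", 100 * h / total
    printf "miss ratio: %.2f%%\n", 100 * m / total
}'
# prints:
#   total accesses: 3947194710
#   hit ratio:  99.94%
#   miss ratio: 0.06%
```

which reproduces the report's "Cache Access Total" exactly and shows why the hit ratio rounds down to 99% in the integer output - the extra decimal places are where the differences between systems actually live.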
Re: [zfs-discuss] ARCSTAT Kstat Definitions
It's a starting point anyway. The key is to try to draw useful conclusions from the info to answer the torrent of "why is my ARC 30GB???" questions. There are several things I'm unclear on whether or not I'm properly interpreting, such as:

* As you state, the anon pages. Even the comment in the code is, to me anyway, a little vague. I include them because otherwise you look at the hit counters and wonder where a large chunk of them went.

* Prefetch... I want to use the Prefetch Data hit ratio as a judgment call on the efficiency of prefetch. If the value is very low it might be best to turn it off, but I'd like to hear that from someone else before I go saying that. In high-latency environments, such as ZFS on iSCSI, prefetch can either significantly help or hurt; determining which is difficult without some type of metric as above.

* There are several instances (based on dtracing) in which the ARC is bypassed... for the ZIL I understand why; in some other cases I need to spend more time analyzing the DMU (dbuf_*) to find out.

* In answering the "Is having a 30GB ARC good?" question, I want to say that if MFU is 60% of the ARC, and if the hits are mostly MFU, then you are deriving significant benefit from your large ARC. But on a system with a 2GB ARC or a 30GB ARC the overall hit ratio tends to be 99%, which is nuts, and tends to reinforce a misinterpretation of anon hits. The only way I'm seeing to _really_ understand the ARC's efficiency is to look at the overall number of reads, then how many are intercepted by the ARC and how many actually made it to disk... and why (prefetch or demand). This is tricky to implement via kstats because you have to pick out and monitor the zpool disks themselves.

I've spent a lot of time in this code (arc.c) and still have a lot of questions. I really wish there was an Advanced ZFS Internals talk coming up; I simply can't keep spending so much time on this. Feedback from PAE or other tuning experts is welcome and appreciated. :)

benr.
This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ARCSTAT Kstat Definitions
A new version is available (v0.2):

* Fixes divide-by-zero errors.
* Includes tunings from /etc/system in the output.
* If prefetch is disabled, it explicitly says so.
* Accounts for the jacked anon count. Still needs improvement here.
* Added friendly explanations for the MRU/MFU ghost list counts.

Page and examples are updated: cuddletech.com/arc_summary.pl

Still needs work, but hopefully interest in this will stimulate some improved understanding of ARC internals.

benr.

This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ARCSTAT Kstat Definitions
Would someone in the know be willing to write up (preferably blog) definitive definitions/explanations of all the arcstats provided via kstat? I'm struggling with proper interpretation of certain values, namely p, memory_throttle_count, and the mru/mfu+ghost hit vs demand/prefetch hit counters. I think I've got it figured out, but I'd really like expert clarification before I start tweaking. Thanks. benr. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to delete hundreds of emtpy snapshots
zfs list is mighty slow on systems with a large number of objects, and there is no foreseeable plan that I'm aware of to solve that problem. Nevertheless, you need to do a zfs list; therefore, do it once and work from that:

zfs list > /tmp/zfs.out
for i in $(awk '/mydataset@/ {print $1}' /tmp/zfs.out); do zfs destroy $i; done

As for 5-minute snapshots, this is NOT a bad idea. It is, however, complex to manage, so you need to employ tactics to make it more digestible. Ask yourself first why you want 5-minute snaps. Is it replication? If so, create the snapshot, replicate it, then destroy all but the last one, or even rotate them. Or is it a fallback in case you make a mistake? Then just keep around the last 6 snapshots or so. zfs rename and zfs destroy are your friends; use them wisely. :)

If you want to discuss exactly what you're trying to facilitate, I'm sure we can come up with some more concrete ideas to help you.

benr.

This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
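[Editorial note: the "keep the last N snapshots" rotation suggested above can use the same list-once approach. A dry-run sketch; prune_old_snapshots is a made-up helper name, and it reads snapshot names from a file sorted oldest-first, as 'zfs list -H -t snapshot -o name -s creation' would emit them.]

```shell
# Dry-run sketch of "keep only the newest N snapshots".
# $1: file of snapshot names, oldest first (a stand-in for the output
#     of 'zfs list -H -t snapshot -o name -s creation')
# $2: how many of the newest snapshots to keep
prune_old_snapshots() {
    awk -v keep="$2" '{ name[NR] = $1 }
        END { for (i = 1; i <= NR - keep; i++) print name[i] }' "$1" |
    while read -r snap; do
        echo zfs destroy "$snap"    # drop the echo to actually destroy
    done
}
```

Printing the commands first (and only removing the echo once the list looks right) is cheap insurance when the pattern feeding the loop could match more than intended.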
[zfs-discuss] 40min ls in empty directory
I've run into an odd problem which I lovingly refer to as a "black hole directory". On a Thumper used for mail stores we've found finds take an exceptionally long time to run. There are directories that have as many as 400,000 files, which I immediately considered the culprit; however, under investigation, they aren't the problem at all. The problem is seen here in this truss output (first column is delta time):

 0.0001 lstat64(tmp, 0x08046A20) = 0
 0.     openat(AT_FDCWD, tmp, O_RDONLY|O_NDELAY|O_LARGEFILE) = 8
 0.0001 fcntl(8, F_SETFD, 0x0001) = 0
 0.     fstat64(8, 0x08046920) = 0
 0.     fstat64(8, 0x08046AB0) = 0
 0.     fchdir(8) = 0
 1321.3133 getdents64(8, 0xFEE48000, 8192) = 48
 1255.8416 getdents64(8, 0xFEE48000, 8192) = 0
 0.0001 fchdir(7) = 0
 0.0001 close(8) = 0

These two getdents64 syscalls take approx 20 minutes each. Notice that the directory structure is 48 bytes; the directory is empty:

drwx-- 2 102 1022 Feb 21 02:24 tmp

My assumption is that the directory is corrupt, but I'd like to prove that. I have a scrub running on the pool, but it's got about 16 hours to go before it completes; 20% complete thus far and nothing is reported. No errors are logged when I stimulate this problem.

Does anyone have suggestions on how to get additional data on this issue? I've used dtrace flows to examine it; however, what I really want to see is the zio's issued as a result of the getdents, but I can't see how to do so. Ideally I'd quiet the system and watch all zio's occurring while I stimulate it, but this is production and that's not possible. If anyone knows how to watch DMU/ZIO activity that _only_ pertains to a certain PID, please let me know. ;)

Suggestions on how to proactively catch these sorts of instances are welcome, as are alternative explanations.

benr.

This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
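[Editorial note: short of zio-level tracing, the truss delta-time column itself can localize a problem like this: totaling time per syscall makes the two getdents64 calls stand out immediately. A small awk sketch of that; the helper name is made up.]

```shell
# Sum truss delta times per syscall. Expects truss output on stdin
# with the delta time in column 1 and the call in column 2, e.g.
#   1321.3133 getdents64(8, 0xFEE48000, 8192) = 48
sum_truss_deltas() {
    awk '{
        split($2, call, "(")        # "getdents64(8," -> "getdents64"
        total[call[1]] += $1
    }
    END {
        for (c in total) printf "%s %.4f\n", c, total[c]
    }'
}
```

Running the whole find under 'truss -d' and piping through this quickly separates "many cheap syscalls" from "a few pathological ones", which is exactly the distinction at issue here.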
[zfs-discuss] J4200/J4400 Array
Hi, According to the Sun Handbook there is a new array: SAS interface, 12 disks, SAS or SATA. ZFS could be used nicely with this box. There is another version, called the J4400, with 24 disks. Docs are here: http://docs.sun.com/app/docs/coll/j4200 Does anyone know the price and availability of these products? Best Regards, Ben This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Cannot delete errored file
Hi, Quick update: I left memtest running overnight: 39 passes, no errors. I also attempted to force the BIOS to run the memory at 800MHz 5-5-5-15 as suggested, but the machine became very unstable (long boot times, PCI-Express failure of the Yukon network card on booting, etc.). I've switched it back to Auto speed/timings for now, and I'll just hope that it was a one-off glitch that corrupted the pool. I'm going to rebuild the pool this weekend. Thanks for all the suggestions. Ben This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Cannot delete errored file
Hi Marc, Thanks for all of your suggestions. I'll restart memtest when I'm next in the office and leave it running overnight. I can recreate the pool, but I guess the question is: am I safe to do this on the existing setup, or am I going to hit the same issue again sometime? Assuming I don't find any obvious hardware issues, wouldn't this be regarded as a flaw in ZFS (i.e. no way of clearing such an error without a rebuild)? Would I be safer rebuilding to a pair of mirrors rather than a 3-disk raidz + hot spare? Ben This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Cannot delete errored file
Sent response by private message. Today's finding is that the cksum errors appear on the new disk on the other controller too, so I've ruled out controllers and cables. It's probably as Jeff says; I've just got to figure out now how to prove the memory is duff. Ben This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Cannot delete errored file
Hi, Today's update:

- I ran memtest a few times: no errors.
- I reseated, re-routed and switched all connectors/cables.
- I'm currently running a scrub, but it's showing vast numbers of cksum errors now across all devices:

$ zpool status -v
  pool: rpool
 state: DEGRADED
status: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub in progress for 0h5m, 3.35% done, 2h26m to go
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       DEGRADED     0     0  211K
          raidz1    DEGRADED     0     0  211K
            c0t7d0  DEGRADED     0     0     0  too many errors
            c0t1d0  DEGRADED     0     0     0  too many errors
            c0t2d0  DEGRADED     0     0     0  too many errors

errors: Permanent errors have been detected in the following files:

        /export/duke/test/Acoustic/3466/88832/09 - Check.mp3

I'll start moving each device over to a different controller to see if that helps once the scrub completes. Still getting I/O errors trying to delete that file.

Ben

This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Cannot delete errored file
Hello again, I'm not making progress on this. Every time I run a zpool scrub rpool I see:

$ zpool status -vx
  pool: rpool
 state: DEGRADED
status: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub in progress for 0h0m, 0.01% done, 177h43m to go
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       DEGRADED     0     0     8
          raidz1    DEGRADED     0     0     8
            c0t0d0  DEGRADED     0     0     0  too many errors
            c0t1d0  ONLINE       0     0     0
            c0t2d0  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        /export/duke/test/Acoustic/3466/88832/09 - Check.mp3

I popped in a brand new disk of the same size and did a zpool replace on the persistently degraded drive and the new drive, i.e.:

$ zpool replace rpool c0t0d0 c0t7d0

But that simply had the effect of transferring the issue to the new drive:

$ zpool status -xv rpool
  pool: rpool
 state: DEGRADED
status: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver completed after 2h41m with 1 errors on Wed Jun 4 20:22:27 2008
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         DEGRADED     0     0     8
          raidz1      DEGRADED     0     0     8
            spare     DEGRADED     0     0     0
              c0t0d0  DEGRADED     0     0     0  too many errors
              c0t7d0  ONLINE       0     0     0
            c0t1d0    ONLINE       0     0     0
            c0t2d0    ONLINE       0     0     0
        spares
          c0t7d0      INUSE     currently in use

$ zpool detach rpool c0t0d0
$ zpool status -vx rpool
  pool: rpool
 state: ONLINE
status: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver completed after 2h41m with 1 errors on Wed Jun 4 20:22:27 2008
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     8
          raidz1    ONLINE       0     0     8
            c0t7d0  ONLINE       0     0     0
            c0t1d0  ONLINE       0     0     0
            c0t2d0  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        0xc3:0x1c0

$ zpool scrub rpool
...
$ zpool status -vx rpool
  pool: rpool
 state: DEGRADED
status: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub in progress for 0h0m, 0.00% done, 0h0m to go
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       DEGRADED     0     0     4
          raidz1    DEGRADED     0     0     4
            c0t7d0  DEGRADED     0     0     0  too many errors
            c0t1d0  ONLINE       0     0     0
            c0t2d0  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        /export/duke/test/Acoustic/3466/88832/09 - Check.mp3

$ rm -f /export/duke/test/Acoustic/3466/88832/09 - Check.mp3
rm: cannot remove `/export/duke/test/Acoustic/3466/88832/09 - Check.mp3': I/O error

I'm guessing this isn't a hardware fault but a glitch in ZFS, though I'm hoping to be proved wrong. Any ideas before I rebuild the pool from scratch? And if I do rebuild, is there anything I can do to prevent this problem in the future?

B

This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Cannot delete errored file
Hi Marc,

$ : > "09 - Check.mp3"
bash: 09 - Check.mp3: I/O error
$ cd ..
$ rm -rf BAD
rm: cannot remove `BAD/09 - Check.mp3': I/O error

I'll try shuffling the cables, but as you see above it occasionally reports on a different disk, so I imagine the cables are OK. Also, the new disk I added has a new cable and is on a different SATA port, and it is also showing up as degraded. Is there any lower-level debugging that I can enable to try and work out what is going on? This machine has been running fine since last August. I couldn't see anything in builds later than snv_86 that might help, but I could try upgrading to the latest? B This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ZFS and ACL's over NFSv3
Can someone please clarify the ability to utilize ACLs over NFSv3 from a ZFS share? I can getfacl but I can't setfacl, and I can't find any documentation in this regard. My suspicion is that ZFS shares must be NFSv4 in order to utilize ACLs, but I'm hoping this isn't the case. Can anyone definitively speak to this? The closest related bug I can find is 6340720, which simply says "See comments". benr. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Cannot delete errored file
Hi, I can't seem to delete a file in my zpool that has permanent errors:

zpool status -vx
  pool: rpool
 state: ONLINE
status: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub completed after 2h10m with 1 errors on Tue Jun 3 11:36:49 2008
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c0t0d0  ONLINE       0     0     0
            c0t1d0  ONLINE       0     0     0
            c0t2d0  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        /export/duke/test/Acoustic/3466/88832/09 - Check.mp3

rm /export/duke/test/Acoustic/3466/88832/09 - Check.mp3
rm: cannot remove `/export/duke/test/Acoustic/3466/88832/09 - Check.mp3': I/O error

Each time I try to do anything to the file, the checksum error count goes up on the pool. I also tried a mv and a cp over the top, but got the same I/O error. I performed a zpool scrub rpool followed by a zpool clear rpool, but still get the same error. Any ideas?

PS - I'm running snv_86, and use the sata driver on an intel x86 architecture.

B

This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ClearCase support for ZFS?
Hi, Does anybody know the latest status of ClearCase support for ZFS? I noticed this from IBM: http://www-1.ibm.com/support/docview.wss?rs=0&uid=swg21155708 I would like to make sure someone has installed and tested it before recommending it to a customer. Regards, Nissim Ben-Haim Solution Architect ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Clearing corrupted file errors
Hi, Sorry if this is an RTM issue, but I wanted to be sure before continuing. I received a corrupted file error on one of my pools. I removed the file, and the status command now shows the following:

zpool status -v rpool
  pool: rpool
 state: ONLINE
status: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c0t0d0  ONLINE       0     0     0
            c0t1d0  ONLINE       0     0     0
            c0t2d0  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        rpool/duke:0x8237

I tried running zpool clear rpool to clear the error, but it persists in the status output. Should a zpool scrub rpool get rid of this error?

Thanks, Ben

This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] zfs 32bits
Hi, I know that Sun does not recommend using ZFS on 32-bit machines, but what are the real consequences of doing so? I have an old dual-processor Xeon server (6 GB RAM, 6 disks), and I would like to build a raidz with 4 of the disks on Solaris 10 update 4. Thanks, Ben ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] zpool status -x strangeness on b78
We run a cron job that does a 'zpool status -x' to check for any degraded pools. We just happened to find a pool degraded this morning by running 'zpool status' by hand, and were surprised that it was degraded, as we didn't get a notice from the cron job.

# uname -srvp
SunOS 5.11 snv_78 i386
# zpool status -x
all pools are healthy
# zpool status pool1
  pool: pool1
 state: DEGRADED
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        pool1        DEGRADED     0     0     0
          raidz1     DEGRADED     0     0     0
            c1t8d0   REMOVED      0     0     0
            c1t9d0   ONLINE       0     0     0
            c1t10d0  ONLINE       0     0     0
            c1t11d0  ONLINE       0     0     0

errors: No known data errors

I'm now going to look into why the disk is listed as removed. Does this look like a bug in 'zpool status -x'?

Ben

This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
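[Editorial note: until the underlying '-x' behavior is understood, one workaround is to have the cron job parse the full 'zpool status' output instead of trusting '-x', flagging any line whose state column isn't ONLINE. A sketch only; check_pool_health is a made-up name.]

```shell
# Scan 'zpool status' output on stdin and print any pool/vdev line
# whose state column is not ONLINE; exit non-zero if anything was
# flagged, zero if everything looked healthy.
check_pool_health() {
    awk '$2 ~ /^(DEGRADED|FAULTED|REMOVED|OFFLINE|UNAVAIL)$/ {
            print $1, $2
            bad = 1
         }
         END { exit bad ? 1 : 0 }'
}
```

A cron job could then do something like: zpool status | check_pool_health || <notify>, where the notification mechanism is whatever the existing job already uses.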
Re: [zfs-discuss] Panic on Zpool Import (Urgent)
The solution here was to upgrade to snv_78. By "upgrade" I mean re-jumpstarting the system. I tested snv_67 via net-boot, but the pool paniced just as below; I also attempted using zfs_recover without success. I then tested snv_78 via net-boot, used both aok=1 and zfs:zfs_recover=1, and was able to (slowly) import the pool. Following that test I exported the pool and then did a full re-install of the box.

A very important note to anyone upgrading a Thumper! Don't forget about the NCQ bug. After upgrading to a release more recent than snv_60, add the following to /etc/system:

set sata:sata_max_queue_depth = 0x1

If you don't, life will be highly unpleasant and you'll believe that disks are failing everywhere when in fact they are not.

benr.

Ben Rockwood wrote: Today, suddenly, without any apparent reason that I can find, I'm getting panic's during zpool import. The system paniced earlier today and has been suffering since. This is snv_43 on a thumper. Here's the stack: panic[cpu0]/thread=99adbac0: assertion failed: ss != NULL, file: ../../common/fs/zfs/space_map.c, line: 145 fe8000a240a0 genunix:assfail+83 () fe8000a24130 zfs:space_map_remove+1d6 () fe8000a24180 zfs:space_map_claim+49 () fe8000a241e0 zfs:metaslab_claim_dva+130 () fe8000a24240 zfs:metaslab_claim+94 () fe8000a24270 zfs:zio_dva_claim+27 () fe8000a24290 zfs:zio_next_stage+6b () fe8000a242b0 zfs:zio_gang_pipeline+33 () fe8000a242d0 zfs:zio_next_stage+6b () fe8000a24320 zfs:zio_wait_for_children+67 () fe8000a24340 zfs:zio_wait_children_ready+22 () fe8000a24360 zfs:zio_next_stage_async+c9 () fe8000a243a0 zfs:zio_wait+33 () fe8000a243f0 zfs:zil_claim_log_block+69 () fe8000a24520 zfs:zil_parse+ec () fe8000a24570 zfs:zil_claim+9a () fe8000a24750 zfs:dmu_objset_find+2cc () fe8000a24930 zfs:dmu_objset_find+fc () fe8000a24b10 zfs:dmu_objset_find+fc () fe8000a24bb0 zfs:spa_load+67b () fe8000a24c20 zfs:spa_import+a0 () fe8000a24c60 zfs:zfs_ioc_pool_import+79 () fe8000a24ce0 zfs:zfsdev_ioctl+135 () fe8000a24d20 genunix:cdev_ioctl+55
() fe8000a24d60 specfs:spec_ioctl+99 () fe8000a24dc0 genunix:fop_ioctl+3b () fe8000a24ec0 genunix:ioctl+180 () fe8000a24f10 unix:sys_syscall32+101 () syncing file systems... done This is almost identical to a post to this list over a year ago titled ZFS Panic. There was follow up on it but the results didn't make it back to the list. I spent time doing a full sweep for any hardware failures, pulled 2 drives that I suspected as problematic but weren't flagged as such, etc, etc, etc. Nothing helps. Bill suggested a 'zpool import -o ro' on the other post, but thats not working either. I _can_ use 'zpool import' to see the pool, but I have to force the import. A simple 'zpool import' returns output in about a minute. 'zpool import -f poolname' takes almost exactly 10 minutes every single time, like it hits some timeout and then panics. I did notice that while the 'zpool import' is running 'iostat' is useless, just hangs. I still want to believe this is some device misbehaving but I have no evidence to support that theory. Any and all suggestions are greatly appreciated. I've put around 8 hours into this so far and I'm getting absolutely nowhere. Thanks benr. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Removing An Errant Drive From Zpool
Robert Milkowski wrote: If you can't re-create the pool (plus backup/restore your data), I would recommend waiting for device removal in zfs; in the meantime I would attach another drive to it so you've got a mirrored configuration, and remove them once there's device removal. Since you're already working on Nevada you could probably adopt new bits quickly. The only question is when device removal is going to be integrated - the last time someone mentioned it here it was supposed to be by the end of last year...

Ya, I'm afraid you're right.

benr.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Removing An Errant Drive From Zpool
I made a really stupid mistake... having trouble removing a hot spare marked as failed, I was trying several ways to put it back in a good state. One means I tried was 'zpool add pool c5t3d0'... but I forgot to use the proper syntax, zpool add pool spare c5t3d0. Now I'm in a bind. I've got 4 large raidz2's and now this puny 500GB drive in the config:

...
          raidz2    ONLINE       0     0     0
            c5t7d0  ONLINE       0     0     0
            c5t2d0  ONLINE       0     0     0
            c7t7d0  ONLINE       0     0     0
            c6t7d0  ONLINE       0     0     0
            c1t7d0  ONLINE       0     0     0
            c0t7d0  ONLINE       0     0     0
            c4t3d0  ONLINE       0     0     0
            c7t3d0  ONLINE       0     0     0
            c6t3d0  ONLINE       0     0     0
            c1t3d0  ONLINE       0     0     0
            c0t3d0  ONLINE       0     0     0
          c5t3d0    ONLINE       0     0     0
        spares
          c5t3d0    FAULTED   corrupted data
          c4t7d0    AVAIL
...

Detach and remove won't work. Does anyone know of a way to get that c5t3d0 out of the data configuration and back to hot spare where it belongs? If I understand the layout properly, this should not have an adverse impact on my existing configuration, I think. But if I can't dump it, what happens when that disk fills up?

I can't believe I made such a bone-headed mistake. This is one of those times when an "Are you sure you...?" would be helpful. :(

benr.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Removing An Errant Drive From Zpool
Eric Schrock wrote: There's really no way to recover from this, since we don't have device removal. However, I'm surprised that no warning was given. There are at least two things that should have happened: 1. zpool(1M) should have warned you that the redundancy level you were attempting did not match that of your existing pool. This doesn't apply if you already have a mixed level of redundancy. 2. zpool(1M) should have warned you that the device was in use as an active spare and not let you continue. What bits were you running?

snv_78; however, the pool was created on snv_43 and hasn't yet been upgraded. Though, programmatically, I can't see why there would be a difference in the way 'zpool' would handle the check.

The big question is: if I'm stuck like this permanently, what's the potential risk? Could I potentially just fail that drive and leave it in a failed state?

benr.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Panic on Zpool Import (Urgent)
Today, suddenly, without any apparent reason that I can find, I'm getting panics during zpool import. The system paniced earlier today and has been suffering since. This is snv_43 on a Thumper. Here's the stack:

panic[cpu0]/thread=99adbac0: assertion failed: ss != NULL, file: ../../common/fs/zfs/space_map.c, line: 145

fe8000a240a0 genunix:assfail+83 ()
fe8000a24130 zfs:space_map_remove+1d6 ()
fe8000a24180 zfs:space_map_claim+49 ()
fe8000a241e0 zfs:metaslab_claim_dva+130 ()
fe8000a24240 zfs:metaslab_claim+94 ()
fe8000a24270 zfs:zio_dva_claim+27 ()
fe8000a24290 zfs:zio_next_stage+6b ()
fe8000a242b0 zfs:zio_gang_pipeline+33 ()
fe8000a242d0 zfs:zio_next_stage+6b ()
fe8000a24320 zfs:zio_wait_for_children+67 ()
fe8000a24340 zfs:zio_wait_children_ready+22 ()
fe8000a24360 zfs:zio_next_stage_async+c9 ()
fe8000a243a0 zfs:zio_wait+33 ()
fe8000a243f0 zfs:zil_claim_log_block+69 ()
fe8000a24520 zfs:zil_parse+ec ()
fe8000a24570 zfs:zil_claim+9a ()
fe8000a24750 zfs:dmu_objset_find+2cc ()
fe8000a24930 zfs:dmu_objset_find+fc ()
fe8000a24b10 zfs:dmu_objset_find+fc ()
fe8000a24bb0 zfs:spa_load+67b ()
fe8000a24c20 zfs:spa_import+a0 ()
fe8000a24c60 zfs:zfs_ioc_pool_import+79 ()
fe8000a24ce0 zfs:zfsdev_ioctl+135 ()
fe8000a24d20 genunix:cdev_ioctl+55 ()
fe8000a24d60 specfs:spec_ioctl+99 ()
fe8000a24dc0 genunix:fop_ioctl+3b ()
fe8000a24ec0 genunix:ioctl+180 ()
fe8000a24f10 unix:sys_syscall32+101 ()

syncing file systems... done

This is almost identical to a post to this list over a year ago titled "ZFS Panic". There was follow-up on it, but the results didn't make it back to the list. I spent time doing a full sweep for any hardware failures, pulled 2 drives that I suspected as problematic but weren't flagged as such, etc, etc, etc. Nothing helps. Bill suggested a 'zpool import -o ro' on the other post, but that's not working either.

I _can_ use 'zpool import' to see the pool, but I have to force the import. A simple 'zpool import' returns output in about a minute. 'zpool import -f poolname' takes almost exactly 10 minutes every single time, like it hits some timeout and then panics. I did notice that while the 'zpool import' is running, 'iostat' is useless; it just hangs.

I still want to believe this is some device misbehaving, but I have no evidence to support that theory. Any and all suggestions are greatly appreciated. I've put around 8 hours into this so far and I'm getting absolutely nowhere.

Thanks, benr.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] offlining a storage pool
Hi, I would like to offline an entire storage pool (not just some devices); I want to stop all I/O activity to the pool. Maybe it could be implemented with a command like:

zpool offline -f tank

which should implicitly do a zfs unmount of tank. I use zfs with Solaris 10 update 4. Thanks, Ben ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ZFS mirror and sun STK 2540 FC array
Hi all, we have just bought a Sun X2200 M2 (4GB / 2 Opteron 2214 / 2 disks 250GB SATA2, Solaris 10 update 4) and a Sun STK 2540 FC array (8 disks SAS 146 GB, 1 RAID controller). The server is attached to the array with a single 4 Gb Fibre Channel link.

I want to make a mirror using ZFS with this array. I have created 2 volumes on the array in RAID0 (stripe of 128 KB), presented to the host as lun0 and lun1. So, on the host:

bash-3.00# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c1d0 DEFAULT cyl 30397 alt 2 hd 255 sec 63
          /[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0
       1. c2d0 DEFAULT cyl 30397 alt 2 hd 255 sec 63
          /[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0
       2. c6t600A0B800038AFBC02F7472155C0d0 DEFAULT cyl 35505 alt 2 hd 255 sec 126
          /scsi_vhci/[EMAIL PROTECTED]
       3. c6t600A0B800038AFBC02F347215518d0 DEFAULT cyl 35505 alt 2 hd 255 sec 126
          /scsi_vhci/[EMAIL PROTECTED]
Specify disk (enter its number):

bash-3.00# zpool create tank mirror c6t600A0B800038AFBC02F347215518d0 c6t600A0B800038AFBC02F7472155C0d0
bash-3.00# df -h /tank
Filesystem             size   used  avail capacity  Mounted on
tank                   532G    24K   532G     1%    /tank

I have tested the performance with a simple dd command [ time dd if=/dev/zero of=/tank/testfile bs=1024k count=1 ; time dd if=/tank/testfile of=/dev/null bs=1024k count=1 ] and it gives:

# local throughput, stk2540 mirror, zfs /tank
read  232 MB/s
write 175 MB/s

Just to test the max perf I did:

zpool destroy -f tank
zpool create -f pool c6t600A0B800038AFBC02F347215518d0

And the same basic dd gives me:

# single zfs /pool
read  320 MB/s
write 263 MB/s

Just to give an idea, an SVM mirror using the two local SATA2 disks gives:

read  58 MB/s
write 52 MB/s

So, in production the zfs /tank mirror will be used to hold our home directories (10 users using 10GB each), our project files (200 GB, mostly text files and a cvs database), and some vendor tools (100 GB).
People will access the data (/tank) using NFSv4 from their workstations (Sun Ultra 20 M2 with CentOS 4 update 5). On the Ultra 20 M2, the basic test via NFSv4 gives:

read  104 MB/s
write 63 MB/s

At this point, I have the following questions:

-- Does anyone have similar figures for the STK 2540 using zfs?

-- Instead of making only 2 volumes in the array, what do you think about making 8 volumes (one for each disk) and building a pool of 4 two-way mirrors:

zpool create tank mirror c6t6001.. c6t6002.. mirror c6t6003.. c6t6004.. {...} mirror c6t6007.. c6t6008..

-- I will add 4 disks to the array next summer. Do you think I should create 2 new luns in the array and do a:

zpool add tank mirror c6t6001..(lun3) c6t6001..(lun4)

or rebuild the 2 luns (6-disk raid0) and the pool tank from scratch (i.e.: backup /tank -- zpool destroy -- add disks -- reconfigure array -- zpool create tank ... -- restore the backed-up data)?

-- I am thinking about doing a disk scrub once a month. Is that sufficient?

-- Have you got any comments on the performance from the NFSv4 client?

If you have any advice / suggestions, feel free to share. Thanks, Benjamin ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
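[Editorial note: the MB/s figures quoted in this thread are just bytes moved divided by dd's elapsed time. A tiny helper for that arithmetic; the function name and its MB-based units are a convention made up here, not anything dd itself reports.]

```shell
# Compute throughput from a dd run: megabytes transferred and the
# elapsed seconds reported by time(1). Helper name/units are made up.
throughput_mb_s() {
    mb=$1; secs=$2
    awk -v mb="$mb" -v s="$secs" 'BEGIN { printf "%.0f MB/s\n", mb / s }'
}
```

For example, a run moving 1024 MB in 4 s works out to 256 MB/s; having the division in one place makes it easy to compare runs with different block sizes and counts.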
[zfs-discuss] ZFS Quota Oddness
I've run across an odd issue with ZFS quotas. This is an snv_43 system with several zones/ZFS datasets, but only one is affected. The dataset shows 10.8 GB used and 12 GB referenced, but counting the files finds only 6.7 GB of data:

zones/ABC                 10.8G  26.2G  12.0G  /zones/ABC
zones/[EMAIL PROTECTED]   14.7M      -  12.0G  -

[xxx:/zones/ABC/.zfs/snapshot/now] root# gdu --max-depth=1 -h .
43k   ./dev
6.7G  ./root
1.5k  ./lu
6.7G  .

I don't understand what might cause this disparity. This is an older box, snv_43. Any bugs that might apply, fixed or in progress? Thanks. benr.
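The usual suspect for a du-vs-zfs gap like this is snapshot space: du only walks the live file tree, while USED includes blocks still held by snapshots. A quick way to check (dataset name taken from the listing above):

```sh
# Space pinned by each snapshot of the dataset
zfs list -t snapshot | grep zones/ABC
zfs get used,referenced,quota zones/ABC
```

On builds newer than snv_43 the accounting is broken out directly, so if this recurs on a later box, `zfs get usedbysnapshots,usedbydataset zones/ABC` (these properties are a later-release feature, not available on snv_43) answers it in one line.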
Re: [zfs-discuss] When I stab myself with this knife, it hurts... But - should it kill me?
Dick Davies wrote:

On 04/10/2007, Nathan Kroenert [EMAIL PROTECTED] wrote:

Client A - import pool, make a couple of changes
Client B - import pool -f (heh)

Oct 4 15:03:12 fozzie ^Mpanic[cpu0]/thread=ff0002b51c80:
Oct 4 15:03:12 fozzie genunix: [ID 603766 kern.notice] assertion failed: dmu_read(os, smo->smo_object, offset, size, entry_map) == 0 (0x5 == 0x0), file: ../../common/fs/zfs/space_map.c, line: 339
Oct 4 15:03:12 fozzie unix: [ID 10 kern.notice]
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51160 genunix:assfail3+b9 ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51200 zfs:space_map_load+2ef ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51240 zfs:metaslab_activate+66 ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51300 zfs:metaslab_group_alloc+24e ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b513d0 zfs:metaslab_alloc_dva+192 ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51470 zfs:metaslab_alloc+82 ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b514c0 zfs:zio_dva_allocate+68 ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b514e0 zfs:zio_next_stage+b3 ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51510 zfs:zio_checksum_generate+6e ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51530 zfs:zio_next_stage+b3 ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b515a0 zfs:zio_write_compress+239 ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b515c0 zfs:zio_next_stage+b3 ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51610 zfs:zio_wait_for_children+5d ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51630 zfs:zio_wait_children_ready+20 ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51650 zfs:zio_next_stage_async+bb ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51670 zfs:zio_nowait+11 ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51960 zfs:dbuf_sync_leaf+1ac ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b519a0 zfs:dbuf_sync_list+51 ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51a10 zfs:dnode_sync+23b ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51a50 zfs:dmu_objset_sync_dnodes+55 ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51ad0 zfs:dmu_objset_sync+13d ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51b40 zfs:dsl_pool_sync+199 ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51bd0 zfs:spa_sync+1c5 ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51c60 zfs:txg_sync_thread+19a ()
Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51c70 unix:thread_start+8 ()
Oct 4 15:03:12 fozzie unix: [ID 10 kern.notice]

Is this a known issue, already fixed in a later build, or should I file a bug?

It shouldn't panic the machine, no. I'd raise a bug.

After spending a little time playing with iSCSI, I have to say it's almost inevitable that someone is going to do this by accident and panic a big box for what I see as no good reason. (Though I'm happy to be educated... ;)

You use ACLs and TPGT groups to ensure 2 hosts can't simultaneously access the same LUN by accident. You'd have the same problem with Fibre Channel SANs. I ran into similar problems when replicating via AVS.

benr.
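Until ZFS grows real multi-host protection, the only guard against the accidental double import is procedural: never reach for -f until a plain import has been allowed to complain. A minimal sketch:

```sh
# List pools visible on shared storage and their state
zpool import

# A plain import refuses if the pool looks active on another host:
zpool import tank

# Force only once you are certain the other host has exported it
# (or is down for good) -- this is the step that caused the panic above:
zpool import -f tank
```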
Re: [zfs-discuss] ZFS for OSX - it'll be in there.
Dale Ghent wrote: ...and eventually in a read-write capacity: http://www.macrumors.com/2007/10/04/apple-seeds-zfs-read-write- developer-preview-1-1-for-leopard/ Apple has seeded version 1.1 of ZFS (Zettabyte File System) for Mac OS X to Developers this week. The preview updates a previous build released on June 26, 2007. Y! Finally my USB Thumb Drives will work on my MacBook! :) I wonder if it'll automatically mount the Zpool on my iPod when I sync it. benr.
Re: [zfs-discuss] System hang caused by a bad snapshot
Hello Matthew,

Tuesday, September 12, 2006, 7:57:45 PM, you wrote:

MA> Ben Miller wrote: I had a strange ZFS problem this morning. The entire system would hang when mounting the ZFS filesystems. After trial and error I determined that the problem was with one of the 2500 ZFS filesystems. When mounting that user's home the system would hang and need to be rebooted. After I removed the snapshots (9 of them) for that filesystem everything was fine. I don't know how to reproduce this and didn't get a crash dump. I don't remember seeing anything about this before, so I wanted to report it and see if anyone has any ideas.

MA> Hmm, that sounds pretty bizarre, since mounting a filesystem doesn't really interact with snapshots at all.
MA> Unfortunately, I don't think we'll be able to diagnose this without a crash dump or reproducibility. If it happens again, force a crash dump while the system is hung and we can take a look at it.

Maybe it wasn't hung after all. I've seen similar behavior here sometimes. Were the disks used in the pool actually working?

There was lots of activity on the disks (iostat and status LEDs) until it got to this one filesystem and everything stopped. 'zpool iostat 5' stopped running, the shell wouldn't respond and activity on the disks stopped. This fs is relatively small (175M used of a 512M quota).

Sometimes it takes a lot of time (30-50 minutes) to mount a filesystem - it's rare, but it happens. And during this time ZFS reads from the disks in the pool. I did report it here some time ago.

In my case the system crashed during the evening and was left hung when I came in during the morning, so it was hung for a good 9-10 hours. The problem happened again last night, but for a different user's filesystem.
I took a crash dump while it was hung and the back trace looks like this:

::status
debugging crash dump vmcore.0 (64-bit) from hostname
operating system: 5.11 snv_40 (sun4u)
panic message: sync initiated
dump content: kernel pages only

::stack
0xf0046a3c(f005a4d8, 2a100047818, 181d010, 18378a8, 1849000, f005a4d8)
prom_enter_mon+0x24(2, 183c000, 18b7000, 2a100046c61, 1812158, 181b4c8)
debug_enter+0x110(0, a, a, 180fc00, 0, 183e000)
abort_seq_softintr+0x8c(180fc00, 18abc00, 180c000, 2a100047d98, 1, 1859800)
intr_thread+0x170(600019de0e0, 0, 6000d7bfc98, 600019de110, 600019de110, 600019de110)
zfs_delete_thread_target+8(600019de080, , 0, 600019de080, 6000d791ae8, 60001aed428)
zfs_delete_thread+0x164(600019de080, 6000d7bfc88, 1, 2a100c4faca, 2a100c4fac8, 600019de0e0)
thread_start+4(600019de080, 0, 0, 0, 0, 0)

In single user I set the mountpoint for that user's filesystem to none and then brought the system up fine. Then I destroyed the snapshots for that user and their filesystem mounted fine. In this case the quota was reached with the snapshots and 52% was used without them.

Ben

Hate to re-open something from a year ago, but we just had this problem happen again. We have been running Solaris 10u3 on this system for a while. I searched the bug reports but couldn't find anything on this. I also think I understand what happened a little better. We take snapshots at noon and the system hung up during that time. When trying to reboot, the system would hang on the ZFS mounts. After I booted into single user and removed the snapshot from the filesystem causing the problem, everything was fine. The filesystem in question was at 100% use with snapshots in place.
Here's the back trace from when the system was hung:

::stack
0xf0046a3c(f005a4d8, 2a10004f828, 0, 181c850, 1848400, f005a4d8)
prom_enter_mon+0x24(0, 0, 183b400, 1, 1812140, 181ae60)
debug_enter+0x118(0, a, a, 180fc00, 0, 183d400)
abort_seq_softintr+0x94(180fc00, 18a9800, 180c000, 2a10004fd98, 1, 1857c00)
intr_thread+0x170(2, 30007b64bc0, 0, c001ed9, 110, 6000240)
0x985c8(300adca4c40, 0, 0, 0, 0, 30007b64bc0)
dbuf_hold_impl+0x28(60008cd02e8, 0, 0, 0, 7b648d73, 2a105bb57c8)
dbuf_hold_level+0x18(60008cd02e8, 0, 0, 7b648d73, 0, 0)
dmu_tx_check_ioerr+0x20(0, 60008cd02e8, 0, 0, 0, 7b648c00)
dmu_tx_hold_zap+0x84(60011fb2c40, 0, 0, 0, 30049b58008, 400)
zfs_rmnode+0xc8(3002410d210, 2a105bb5cc0, 0, 60011fb2c40, 30007b3ff58, 30007b56ac0)
zfs_delete_thread+0x168(30007b56ac0, 3002410d210, 69a4778, 30007b56b28, 2a105bb5aca, 2a105bb5ac8)
thread_start+4(30007b56ac0, 0, 0, 489a48, d83a10bf28, 50386)

Has this been fixed in more recent code? I can make the crash dump available.

Ben
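The recovery path described in this thread, as a sketch (pool, filesystem, and snapshot names are placeholders for whatever your site uses):

```sh
# From single-user, keep the problem filesystem from mounting at boot
zfs set mountpoint=none pool/home/user

# Once the system is up, drop the snapshots pinning the space
zfs list -t snapshot | grep pool/home/user
zfs destroy pool/home/user@noon          # repeat per snapshot

# Restore the mountpoint and mount normally
zfs set mountpoint=/home/user pool/home/user
zfs mount pool/home/user
```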
Re: [zfs-discuss] Is there _any_ suitable motherboard?
I've just purchased an Asus P5K WS, which seems to work OK. I had to download the Marvell Yukon ethernet driver - but it's all working fine. It's also got a PCI-X slot - so I have one of those Super Micro 8-port SATA cards - providing a total of 16 SATA ports across the system. Other specs are one of those Intel E6750 1333 MHz FSB CPUs and 2 GB of matched memory. Ben.
Re: [zfs-discuss] ZFS, iSCSI + Mac OS X Tiger (globalSAN iSCSI)
George wrote:

I have set up an iSCSI ZFS target that seems to connect properly from the Microsoft Windows initiator, in that I can see the volume in MMC Disk Management. When I shift over to Mac OS X Tiger with globalSAN iSCSI, I am able to set up the Targets with the target name shown by `iscsitadm list target`, and when I actually connect or Log On I see that one connection exists on the Solaris server. I then go on to the Sessions tab in globalSAN and I see the session details, and it appears that data is being transferred (PDUs Sent, PDUs Received, Bytes, etc.). HOWEVER, the connection then appears to terminate on the Solaris side; if I check it a few minutes later it shows no connections, while the Mac OS X initiator still shows connected, although no more traffic appears to be flowing in the Session Statistics dialog area. Additionally, when I then disconnect the Mac OS X initiator it seems to drop fine on the Mac OS X side, even though the Solaris side has shown it gone for a while; however, when I reconnect or Log On again, it seems to spin infinitely on the Target Connect... dialog. Solaris is, interestingly, showing 1 connection while this apparent issue (spinning beachball of death) is going on with globalSAN. Even killing the Mac OS X process doesn't seem to get me full control again, as I have to restart the system to kill all processes (unless I can hunt them down and `kill -9` them, which I've not successfully done thus far). Has anyone dealt with this before and perhaps able to assist, or at least throw some further information my way to troubleshoot this?

When I learned of the globalSAN Initiator I was overcome with joy. After about 2 days of spending way too much time with it, I gave up. Have a look at their forum (http://www.snsforums.com/index.php?s=b0c9031ebe1a89a40cfe4c417e3443f1showforum=14). There are a wide range of problems.
In my case connections to the target (Solaris/ZFS/iscsitgt) look fine and dandy initially, but you can't use the connection, on reboot globalSAN goes psycho, etc. At this point I've given up on the product, at least for now. If I could actually get an accessible disk at least part of the time I'd dig my fingers into it, but it doesn't offer a usable remote disk to begin with, and in a variety of other environments it has identical problems. I consider debugging it to be purely academic at this point. It's a great way to gain insight into the inner workings of iSCSI, but without source code or DTrace on the Mac it's hard to expect any big gains. That's my personal take. If you really wanna go hacking on it regardless, bring it up on the storage list and we can collectively enjoy the academic challenge of finding the problems, but there is nothing to suggest it's an OpenSolaris issue. benr.
[zfs-discuss] ZFS Send/RECV
I'm trying to test an install of ZFS to see if I can back up data from one machine to another. I'm using Solaris 5.10 on two VMware installs. When I do the zfs send | ssh zfs recv part, the filesystem (folder) is created, but none of the data in my snapshot is sent. On the source machine I can browse the snapshot data (pool/.zfs/snapshot/snap-name) and I see the data. Am I missing something to make it copy all of the data?
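For reference, a working send/recv round trip looks roughly like this (hostnames and dataset names are placeholders). A common gotcha is that the stream must name a snapshot, and receiving into a dataset that already exists needs -F to roll it back first:

```sh
# On the source
zfs snapshot pool/data@snap1
zfs send pool/data@snap1 | ssh backuphost zfs recv backup/data

# Incrementals afterwards
zfs snapshot pool/data@snap2
zfs send -i pool/data@snap1 pool/data@snap2 | ssh backuphost zfs recv backup/data

# Verify on the target
ssh backuphost ls /backup/data
```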
[zfs-discuss] ZVol Panic on 62
May 25 23:32:59 summer unix: [ID 836849 kern.notice]
May 25 23:32:59 summer ^Mpanic[cpu1]/thread=1bf2e740:
May 25 23:32:59 summer genunix: [ID 335743 kern.notice] BAD TRAP: type=e (#pf Page fault) rp=ff00232c3a80 addr=490 occurred in module unix due to a NULL pointer dereference
May 25 23:32:59 summer unix: [ID 10 kern.notice]
May 25 23:32:59 summer unix: [ID 839527 kern.notice] grep:
May 25 23:32:59 summer unix: [ID 753105 kern.notice] #pf Page fault
May 25 23:32:59 summer unix: [ID 532287 kern.notice] Bad kernel fault at addr=0x490
May 25 23:32:59 summer unix: [ID 243837 kern.notice] pid=18425, pc=0xfb83b6bb, sp=0xff00232c3b78, eflags=0x10246
May 25 23:32:59 summer unix: [ID 211416 kern.notice] cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
May 25 23:32:59 summer unix: [ID 354241 kern.notice] cr2: 490 cr3: 1fce52000 cr8: c
May 25 23:32:59 summer unix: [ID 592667 kern.notice] rdi: 490  rsi: 0  rdx: 1bf2e740
May 25 23:32:59 summer unix: [ID 592667 kern.notice] rcx: 0  r8: d  r9: 62ccc700
May 25 23:32:59 summer unix: [ID 592667 kern.notice] rax: 0  rbx: 0  rbp: ff00232c3bd0
May 25 23:32:59 summer unix: [ID 592667 kern.notice] r10: fc18  r11: 0  r12: 490
May 25 23:32:59 summer unix: [ID 592667 kern.notice] r13: 450  r14: 52e3aac0  r15: 0
May 25 23:32:59 summer unix: [ID 592667 kern.notice] fsb: 0  gsb: fffec3731800  ds: 4b
May 25 23:32:59 summer unix: [ID 592667 kern.notice] es: 4b  fs: 0  gs: 1c3
May 25 23:33:00 summer unix: [ID 592667 kern.notice] trp: e  err: 2  rip: fb83b6bb
May 25 23:33:00 summer unix: [ID 592667 kern.notice] cs: 30  rfl: 10246  rsp: ff00232c3b78
May 25 23:33:00 summer unix: [ID 266532 kern.notice] ss: 38
May 25 23:33:00 summer unix: [ID 10 kern.notice]
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3960 unix:die+c8 ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3a70 unix:trap+135b ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3a80 unix:cmntrap+e9 ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3bd0 unix:mutex_enter+b ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3c20 zfs:zvol_read+51 ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3c50 genunix:cdev_read+3c ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3cd0 specfs:spec_read+276 ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3d40 genunix:fop_read+3f ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3e90 genunix:read+288 ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3ec0 genunix:read32+1e ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3f10 unix:brand_sys_syscall32+1a3 ()
May 25 23:33:00 summer unix: [ID 10 kern.notice]
May 25 23:33:00 summer genunix: [ID 672855 kern.notice] syncing file systems...

Does anyone have an idea of what bug this might be? It occurred on x86 b62. I'm not seeing any putbacks into b63 or bugs that seem to match. Any insight is appreciated. Cores are available. benr.
Re: [zfs-discuss] New zfs pr0n server :)))
Diego Righi wrote: Hi all, I just built a new ZFS server for home and, being a long-time and avid reader of this forum, I'm going to post my config specs and my benchmarks, hoping this could be of some help to others :) http://www.sickness.it/zfspr0nserver.jpg http://www.sickness.it/zfspr0nserver.txt http://www.sickness.it/zfspr0nserver.png http://www.sickness.it/zfspr0nserver.pdf Correct me if I'm wrong: from the benchmark results, I understand that this setup is slow at writing but fast at reading (and this is perfect for my usage: copying large files once and then accessing them only for reads). It also seems that 128 KB gives the best performance, IIRC due to the ZFS stripe size (again, correct me if I'm wrong :). I'd happily try any other test, but if you suggest bonnie++ please tell me which version is the right one to use; there are so many of them that I can't tell which to try! tnx :)

Classy. +1 for style. ;) benr.
[zfs-discuss] Re: Remove files when at quota limit
Has anyone else run into this situation? Does anyone have any solutions other than removing snapshots or increasing the quota? I'd like to put in an RFE to reserve some space so files can be removed when users are at their quota. Any thoughts from the ZFS team? Ben

We have around 1000 users, all with quotas set on their ZFS filesystems on Solaris 10 U3. We take snapshots daily and rotate out the week-old ones. The situation is that some users ignore the advice of keeping space used below 80% and keep creating large temporary files. They then try to remove files when the space used is 100% and get over-quota messages. We then need to remove some or all of their snapshots to free space. Is there anything being worked on to keep some space reserved so files can be removed when at the quota limit, or some other solution? What are other people doing in this situation? We have also set up alternate filesystems without snapshots for users with transient data, but we still have this problem on home directories. thanks, Ben
[zfs-discuss] Remove files when at quota limit
We have around 1000 users, all with quotas set on their ZFS filesystems on Solaris 10 U3. We take snapshots daily and rotate out the week-old ones. The situation is that some users ignore the advice of keeping space used below 80% and keep creating large temporary files. They then try to remove files when the space used is 100% and get over-quota messages. We then need to remove some or all of their snapshots to free space. Is there anything being worked on to keep some space reserved so files can be removed when at the quota limit, or some other solution? What are other people doing in this situation? We have also set up alternate filesystems without snapshots for users with transient data, but we still have this problem on home directories. thanks, Ben
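Later ZFS releases grew a property aimed at exactly this: refquota limits only the live data, so space held by snapshots no longer counts against the user's limit and deletes still work at "100%". Whether it is available depends on the release (it postdates Solaris 10 U3); a sketch, with dataset names as placeholders:

```sh
# Live data capped at 10G; snapshot space doesn't count against it
zfs set refquota=10G pool/home/user

# Optional overall cap including snapshots, set with headroom
zfs set quota=15G pool/home/user
```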
[zfs-discuss] Re: Re: ZFS disables nfs/server on a host
I just threw in a truss in the SMF script and rebooted the test system and it failed again. The truss output is at http://www.eecis.udel.edu/~bmiller/zfs.truss-Apr27-2007 thanks, Ben
[zfs-discuss] Re: ZFS disables nfs/server on a host
I was able to duplicate this problem on a test Ultra 10. I put in a workaround by adding a service that depends on /milestone/multi-user-server which does a 'zfs share -a'. It's strange this hasn't happened on other systems, but maybe it's related to slower systems... Ben
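The workaround boils down to re-running the share step after everything else is up. A sketch of the method script such a service would run (the service name and file layout here are made up for illustration; the manifest would declare a dependency on svc:/milestone/multi-user-server):

```sh
#!/sbin/sh
# Method script for a hypothetical site/zfs-share-late service:
# once multi-user-server is reached, just reshare everything.
. /lib/svc/share/smf_include.sh

/usr/sbin/zfs share -a || exit $SMF_EXIT_ERR_FATAL
exit $SMF_EXIT_OK
```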
[zfs-discuss] Re: ZFS disables nfs/server on a host
It does seem like an ordering problem, but nfs/server should be starting up late enough with SMF dependencies. I need to see if I can duplicate the problem on a test system...
[zfs-discuss] [OT] Multipathing on Mac OS X
This is pretty OT, but a bit ago there was some discussion of Mac OS X's multipathing support (or its lack thereof). According to this technote, multipathing support has been included in Mac OS X since 10.3.5, but there are some particular requirements on the target devices' HBAs. http://developer.apple.com/technotes/tn2007/tn2173.html -- Ben
Re: [zfs-discuss] snapdir visible recursively throughout a dataset
Darren J Moffat wrote: Ben Rockwood wrote: Robert Milkowski wrote: I haven't tried it, but what if you mounted it read-only via loopback into a zone: /zones/myzone01/root/.zfs is loop-mounted read-only to /zones/myzone01/.zfs

That is so wrong. ;) Besides just being evil, I doubt it'd work. And if it does, it probably shouldn't. I think I'm the only one that gets a rash when using LOFI.

lofi or lofs?

lofi - loopback file driver: makes a block device from a file
lofs - loopback virtual file system: makes a file system from a file system

Yes, I know. I was referring more to loopback-happy people in general. :) benr.
[zfs-discuss] Read Only Zpool: ZFS and Replication
I've been playing with replication of a ZFS zpool using the recently released AVS. I'm pleased with things, but just replicating the data is only part of the problem. The big question is: can I have a zpool open in 2 places? What I really want is a zpool on node1, open and writable (production storage), replicated to node2 where it's open for read-only access (standby storage). This is an old problem, and I'm not sure it's remotely possible. It's bad enough with UFS, but ZFS maintains a hell of a lot more metadata. How is node2 supposed to know that a snapshot has been created, for instance? With UFS you can at least get around some of these problems using directio, but that's not an option with a zpool. I know this is a fairly remedial issue to bring up... but if I think about what I want Thumper-to-Thumper replication to look like, I want 2 usable storage systems. As I see it now, the secondary storage (node2) is useless until you break replication and import the pool, do your thing, and then re-sync storage to re-enable replication. Am I missing something? I'm hoping there is an option I'm not aware of. benr.
[zfs-discuss] snapdir visible recursively throughout a dataset
Is there an existing RFE for, what I'll wrongly call, recursively visible snapshots? That is, .zfs in directories other than the dataset root. Frankly, I don't need it available in all directories, although it'd be nice, but I do have a need for making it visible 1 dir down from the dataset root. The problem is that while ZFS and Zones work smoothly together for moving, cloning, sizing, etc., you can't view .zfs/ from within the zone because the zone root is one dir down:

/zones                -- dataset
/zones/myzone01       -- dataset, .zfs is located here
/zones/myzone01/root  -- directory, want .zfs here!

The ultimate idea is to make ZFS snapdirs accessible from within the zone. benr.
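Within a single dataset the only related knob today is the snapdir property, which controls whether .zfs shows up in listings at the dataset root; it does not push .zfs down into subdirectories, which is what this RFE asks for:

```sh
# Make .zfs appear in directory listings (default is hidden),
# but still only at the dataset root:
zfs set snapdir=visible zones/myzone01
ls /zones/myzone01/.zfs/snapshot

# Inside the zone, /root is a plain directory -- no .zfs. That's the gap.
```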
Re: [zfs-discuss] zfs / nfs issue (not performance :-) with courier-imap
Robert Milkowski wrote:

CLSNL> but if I click, say E, it has F's contents, F has G's contents, and no
CLSNL> mail has D's contents that I can see. But the list in the mail
CLSNL> client list view is correct.

I don't believe it's a problem with the nfs/zfs server. Please try a simple DTrace script (or even truss) to see what files your imapd actually opens when you click E - I don't believe it opens E and you get F's contents; I would bet it opens F.

I completely agree with Robert. I'd personally suggest 'truss' to start, because it's trivial to use, then start using DTrace to further hone down the problem. In the case of Courier-IMAP the best way to go about it would be to truss the parent (courierlogger, which calls courierlogin and ultimately imapd) using 'truss -f -p PID'. Then open the mailbox and watch those stats and opens closely. I'll be very interested in your findings. We use Courier on NFS/ZFS heavily and I'm thankful to report having no such problems. benr.
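A concrete starting point for the tracing suggested above (the PID and process names are whatever your Courier install uses):

```sh
# Follow the whole Courier process tree, watching opens and stats
truss -f -t open,stat -p <pid-of-courierlogger>

# Or a DTrace one-liner scoped to imapd, printing each path opened
dtrace -n 'syscall::open*:entry /execname == "imapd"/ \
    { printf("%s", copyinstr(arg0)); }'
```

If imapd really opens mailbox E and gets F's contents, that's a server bug; far more likely the trace will show it opening F.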
Re: [zfs-discuss] ZFS over NFS extra slow?
Brad Plecs wrote: I had a user report extreme slowness on a ZFS filesystem mounted over NFS over the weekend. After some extensive testing, the extreme slowness appears to only occur when a ZFS filesystem is mounted over NFS. One example is doing a 'gtar xzvf php-5.2.0.tar.gz'... over NFS onto a ZFS filesystem. This takes:

real 5m12.423s
user 0m0.936s
sys  0m4.760s

Locally on the server (to the same ZFS filesystem):

real 0m4.415s
user 0m1.884s
sys  0m3.395s

The same job over NFS to a UFS filesystem:

real 1m22.725s
user 0m0.901s
sys  0m4.479s

Same job locally on the server to the same UFS filesystem:

real 0m10.150s
user 0m2.121s
sys  0m4.953s

This is easily reproducible even with single large files, but the multiple small files seem to illustrate some awful sync latency between each file. Any idea why ZFS over NFS is so bad? I saw the threads that talk about an fsync penalty, but they don't seem relevant since the local ZFS performance is quite good.

Known issue, discussed here: http://www.opensolaris.org/jive/thread.jspa?threadID=14696tstart=15 benr.
Re: [zfs-discuss] ZFS
Andrew Summers wrote: So, I've read the wikipedia article, and have done a lot of research on google about it, but it just doesn't make sense to me. Correct me if I'm wrong, but can you take a simple 5/10/20 GB drive or whatever size, and turn it into exabytes of storage space? If that is not true, please explain the importance of this other than the self-healing and those other features.

I'm probably to blame for the image of endless storage. With ZFS Sparse Volumes (aka thin provisioning) you can make a 1G drive _look_ like a 500TB drive, but of course it isn't. See my entry on the topic here: http://www.cuddletech.com/blog/pivot/entry.php?id=729 With ZFS compression you can, however, potentially store 10GB of data on a 5GB drive. It really depends on what type of data you're storing and how compressible it is, but I've seen almost 2:1 compression in some cases by simply turning compression on. benr.
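Both effects are easy to demonstrate (pool and dataset names are placeholders):

```sh
# Thin provisioning: -s makes a sparse volume that *claims* 500T
# but only allocates blocks as they are actually written
zfs create -s -V 500T pool/bigvol

# Compression: a real capacity stretch, workload permitting
zfs set compression=on pool/data
zfs get compressratio pool/data
```

The sparse volume will happily report a size far beyond the pool; writes simply start failing once the real pool fills, which is why thin provisioning needs monitoring.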
Re: [zfs-discuss] ZFS works in waves
Stuart Glenn wrote: A little back story: I have a Norco DS-1220, a 12-bay SATA box. It is connected via eSATA (SiI3124) on PCI-X; two drives are straight connections, then the other two ports go to 5x multipliers within the box. My needs/hopes for this were: using 12 500GB drives and ZFS, make a very large, simple data dump spot on my network for other servers to rsync to daily; use ZFS snapshots for some quick backup; and if things worked out, start saving up towards a Thumper someday. The trouble is it is too slow to really be usable. At times it is fast enough, ~13MB/s write; however, this lasts for only a few minutes. It then just stalls, doing nothing. iostat shows 100% blocking for one of the drives in the pool. I can, however, use dd to read or write directly to/from all the disks at the same time with good speed (~30MB/s according to dd). The test pools I have tried are either 2 raidz of 6 drives or 3 raidz of 4 drives. The system is an Athlon 64 3500+ with 1GB of RAM. Any suggestions on what I could do to make this usable? More RAM? Too many drives for ZFS? Any tests to find the real slowdown? I would really like to use ZFS/Solaris for this. Linux was able to use the same hardware at an acceptable speed, using some beta kernel modules for the SATA multipliers and its software RAID, but I would like to finally rid my network of Linux boxen.

I have similar issues on my home workstation. They started happening when I put Seagate SATA-II drives with NCQ on a SiI3124. I do not believe this to be an issue with ZFS. I've largely dismissed the issue as hardware-caused, although I may be wrong. This system has had several problems with SATA-II drives, which hardware forums suggest are issues with the nForce4 chipset and SATA-II. Anyway, you're not alone, but it's not a ZFS issue. It's possible a tunable parameter in the SATA drivers would help. If I find an answer I'll let you know. benr.
Re: [nfs-discuss] Re: [zfs-discuss] A Plea for Help: Thumper/ZFS/NFS/B43
Robert Milkowski wrote: Hello eric, Saturday, December 9, 2006, 7:07:49 PM, you wrote:

ek> Jim Mauro wrote: Could be NFS synchronous semantics on file create (followed by repeated flushing of the write cache). What kind of storage are you using (feel free to send privately if you need to) - is it a Thumper? It's not clear why NFS-enforced synchronous semantics would induce different behavior than the same load to a local ZFS.
ek> Actually I forgot he had 'zil_disable' turned on, so it won't matter in this case.

Ben, are you sure zil_disable was set to 1 BEFORE the pool was imported?

Yes, absolutely. Set the var in /etc/system, reboot, system comes up. That happened almost 2 months ago, long before this lock insanity problem popped up. To be clear, the ZIL issue was a problem for creation of a handful of files of any size; untar'ing a file was a massive performance drain. This issue, on the other hand, deals with thousands of little files being created all the time (IMAP locks). These are separate issues from my point of view. With ZIL slowness NFS performance was just slow, but we didn't see massive CPU usage; with this issue, on the other hand, we were seeing waves in 10-second-ish cycles where the run queue would go sky high with 0 idle. Please see the earlier mails for examples of the symptoms. benr.
Re: [zfs-discuss] A Plea for Help: Thumper/ZFS/NFS/B43
Spencer Shepler wrote: Good to hear that you have figured out what is happening, Ben. For future reference, there are two commands that you may want to make use of in observing the behavior of the NFS server and individual filesystems. There is the trusty nfsstat command. In this case, you would have been able to do something like: nfsstat -s -v3 60 This will provide all of the server-side NFSv3 statistics on 60-second intervals. Then there is a new command, fsstat, that will provide vnode-level activity on a per-filesystem basis. Therefore, if the NFS server has multiple filesystems active and you want to look at just one, something like this can be helpful: fsstat /export/foo 60 Fsstat has a 'full' option that will list all of the vnode operations, or just certain types. It also will watch a filesystem type (e.g. zfs, nfs). Very useful. NFSstat I've been using, but fsstat I was unaware of. Wish I'd used it rather than duplicating most of its functionality with a D script. :) Thanks for the tip. benr.
Re: [zfs-discuss] A Plea for Help: Thumper/ZFS/NFS/B43
Bill Moore wrote: On Fri, Dec 08, 2006 at 12:15:27AM -0800, Ben Rockwood wrote: Clearly ZFS file creation is just amazingly heavy even with the ZIL disabled. If creating 4,000 files in a minute squashes 4 2.6GHz Opteron cores, we're in big trouble in the longer term. In the meantime I'm going to find a new home for our IMAP mail so that at least the other things served from that NFS server aren't affected. For local tests, this is not true of ZFS. It seems that file creation only swamps us when coming over NFS. We can do thousands of files a second on a Thumper with room to spare if NFS isn't involved. Next step is to figure out why NFS kills us. Agreed. If mass file creation was a problem locally I'd think that we'd have people beating down the doors with complaints. One thought I had as a workaround was to move all my mail on NFS to an iSCSI LUN and then put a zpool on that. I'm willing to bet that'd work fine. Hopefully I can try it. To round out the discussion, the root cause of this whole mess was Courier IMAP locking. After isolating the problem last night and writing a little D script to find out what files were being created, it was obviously lock files; turning off locking dropped file creations to a reasonable level and our problem vanished. If I can help at all with testing or analysis please let me know. benr.
Re: [zfs-discuss] A Plea for Help: Thumper/ZFS/NFS/B43
eric kustarz wrote: So i'm guessing there's lots of files being created over NFS in one particular dataset? We should figure out how many creates/second you are doing over NFS (i should have put a timeout on the script). Here's a real simple one (from your snoop it looked like you're only doing NFSv3, so i'm not tracking NFSv4):

#!/usr/sbin/dtrace -s

rfs3_create:entry,
zfs_create:entry
{
        @creates[probefunc] = count();
}

tick-60s
{
        exit(0);
}

Eric, I love you. Running this bit of DTrace revealed more than 4,000 files being created in almost any given 60-second window. And I've only got one system that would fit that sort of mass file creation: our Joyent Connector product's Courier IMAP server, which uses Maildir. As a test I simply shut down Courier and unmounted the mail NFS share for good measure, and sure enough the problem vanished and could not be reproduced. 10 minutes later I re-enabled Courier and our problem came back. Clearly ZFS file creation is just amazingly heavy even with the ZIL disabled. If creating 4,000 files in a minute squashes 4 2.6GHz Opteron cores, we're in big trouble in the longer term. In the meantime I'm going to find a new home for our IMAP mail so that at least the other things served from that NFS server aren't affected. You asked for the zpool and zfs info, which I don't want to share because it's confidential (if you want it privately I'll do so, but not on a public list), but I will say that it's a single massive zpool in which we're using less than 2% of the capacity. But in thinking about this problem, even if we used 2 or more pools, the CPU consumption still would have choked the system, right? This leaves me really nervous about what we'll do when it's not an internal mail server that's creating all those files but a customer. Oddly enough, this might be a very good reason to use iSCSI instead of NFS on the Thumper. Eric, I owe you a couple cases of beer for sure. I can't tell you how much I appreciate your help.
Thanks to everyone else who chimed in with ideas and suggestions, all of you guys are the best! benr.
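A quick sanity check on the rate implied by the DTrace output quoted above -- roughly 4,000 creates per 60-second window:

```shell
# ~4,000 file creates observed per 60-second DTrace window (figure from the
# thread above); integer shell arithmetic, so this rounds down.
creates=4000
window=60
echo "~$((creates / window)) creates/sec"
```

Sixty-odd synchronous creates per second is modest for a local filesystem but, as the thread shows, each one over NFSv3 carries extra round-trip and commit overhead.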
[zfs-discuss] A Plea for Help: Thumper/ZFS/NFS/B43
I've got a Thumper doing nothing but serving NFS. It's using B43 with zil_disable. The system is being consumed in waves, but by what I don't know. Notice vmstat:

 3 0 0 25693580 2586268 0 0 0 0 0 0 0 0 0 0 0 926 91 703 0 25 75
 21 0 0 25693580 2586268 0 0 0 0 0 0 0 0 0 13 14 1720 21 1105 0 92 8
 20 0 0 25693580 2586268 0 0 0 0 0 0 0 0 0 17 18 2538 70 834 0 100 0
 25 0 0 25693580 2586268 0 0 0 0 0 0 0 0 0 0 0 745 18 179 0 100 0
 37 0 0 25693552 2586240 0 0 0 0 0 0 0 0 0 7 7 1152 52 313 0 100 0
 16 0 0 25693592 2586280 0 0 0 0 0 0 0 0 0 15 13 1543 52 767 0 100 0
 17 0 0 25693592 2586280 0 0 0 0 0 0 0 0 0 2 2 890 72 192 0 100 0
 27 0 0 25693572 2586260 0 0 0 0 0 0 0 0 0 15 15 3271 19 3103 0 98 2
 0 0 0 25693456 2586144 0 11 0 0 0 0 0 0 0 281 249 34335 242 37289 0 46 54
 0 0 0 25693448 2586136 0 2 0 0 0 0 0 0 0 0 0 2470 103 2900 0 27 73
 0 0 0 25693448 2586136 0 0 0 0 0 0 0 0 0 0 0 1062 105 822 0 26 74
 0 0 0 25693448 2586136 0 0 0 0 0 0 0 0 0 0 0 1076 91 857 0 25 75
 0 0 0 25693448 2586136 0 0 0 0 0 0 0 0 0 0 0 917 126 674 0 25 75

These spikes of sys load come in waves like this. While there are close to a hundred systems mounting NFS shares on the Thumper, the amount of traffic is really low - nothing to justify this. We're talking less than 10MB/s. NFS is pathetically slow. We're using NFSv3 over TCP, shared via ZFS sharenfs, on a 3Gbps aggregation (3*1Gbps). I've been slamming my head against this problem for days and can't make headway. I'll post some of my notes below. Any thoughts or ideas are welcome! benr.

=== Step 1 was to disable any ZFS features that might consume large amounts of CPU:

# zfs set compression=off joyous
# zfs set atime=off joyous
# zfs set checksum=off joyous

These changes had no effect. Next was to consider that perhaps NFS was doing name lookups when it shouldn't. Indeed, dns was specified in /etc/nsswitch.conf, which won't work given that no DNS servers are accessible from the storage or private networks, but again, no improvement.
In this process I removed dns from nsswitch.conf, deleted /etc/resolv.conf, and disabled the dns/client service in SMF. Turning back to CPU usage, we can see the activity is all SYStem time and comes in waves:

[private:/tmp] root# sar 1 100
SunOS private.thumper1 5.11 snv_43 i86pc    12/07/2006

10:38:05  %usr  %sys  %wio  %idle
10:38:06     0    27     0     73
10:38:07     0    27     0     73
10:38:09     0    27     0     73
10:38:10     1    26     0     73
10:38:11     0    26     0     74
10:38:12     0    26     0     74
10:38:13     0    24     0     76
10:38:14     0     6     0     94
10:38:15     0     7     0     93
10:38:22     0    99     0      1  --
10:38:23     0    94     0      6  --
10:38:24     0    28     0     72
10:38:25     0    27     0     73
10:38:26     0    27     0     73
10:38:27     0    27     0     73
10:38:28     0    27     0     73
10:38:29     1    30     0     69
10:38:30     0    27     0     73

And so we consider whether or not there is a pattern to the frequency. The following is sar output from any lines in which sys is above 90%:

10:40:04  %usr  %sys  %wio  %idle   Delta
10:40:11     0    97     0      3
10:40:45     0    98     0      2   34 seconds
10:41:02     0    94     0      6   17 seconds
10:41:26     0   100     0      0   24 seconds
10:42:00     0   100     0      0   34 seconds
10:42:25  (end of sample)           25 seconds

Looking at the congestion in the run queue:

[private:/tmp] root# sar -q 5 100
10:45:43  runq-sz  %runocc  swpq-sz  %swpocc
10:45:51     27.0       85      0.0        0
10:45:57      1.0       20      0.0        0
10:46:02      2.0       60      0.0        0
10:46:13     19.8       99      0.0        0
10:46:23     17.7       99      0.0        0
10:46:34     24.4       99      0.0        0
10:46:41     22.1       97      0.0        0
10:46:48     13.0       96      0.0        0
10:46:55     25.3      102      0.0        0

Looking at the per-CPU breakdown:

CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
00 00 324 224000 1540 00 100 0 0
10 00 1140 2260 10 130860 1 0 99
20 00 162 138 1490540 00 1 0 99
30 00556 460430 00 1 0 99
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
00 00 310 210 340 17 1717 50 100 0 0
10 00 1521 2000 17 265591 65 0 34
20 00 271 197 1751 13 202 00 66 0 34
30 00 120
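The "sys above 90%" table above was pulled out of the sar stream by hand; the same filter is easy to script. A sketch (sample lines inlined here; on the real box you would pipe `sar 1 100` output in and skip the header line):

```shell
# Filter sar samples where %sys (third whitespace field) is 90 or above.
sample='10:38:14 0 6 0 94
10:38:22 0 99 0 1
10:38:23 0 94 0 6
10:38:24 0 28 0 72'
printf '%s\n' "$sample" | awk '$3 >= 90'
```

The same one-liner, with `$3` swapped for the right field, works on the `sar -q` run-queue output too.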
Re: [zfs-discuss] (OT: SVN branches) A versioning FS
On Oct 6, 2006, at 12:18 PM, David Dyer-Bennet wrote: On 10/5/06, Wee Yeh Tan [EMAIL PROTECTED] wrote: On 10/6/06, David Dyer-Bennet [EMAIL PROTECTED] wrote: One of the big problems with CVS and SVN and Microsoft SourceSafe is that you don't have the benefits of version control most of the time, because all commits are *public*. David, That is exactly what branch is for in CVS and SVN. Dunno much about M$ SourceSafe. I've never encountered branch being used that way, anywhere. It's used for things like developing release 2.0 while still supporting 1.5 and 1.6. However, especially with merge in svn it might be feasible to use a branch that way. What's the operation to update the branch from the trunk in that scenario? We use personal branches all the time; in fact each developer has at least one, sometimes several if they are working on orthogonal issues or experimenting with a couple of different approaches to the same problem. Personal branches are for messy code, unfinished patches - basically anything that took longer than 15 minutes to write. Keeping that stuff on just one machine is unworkable as I code from many locations, not to mention the server is backed up more often. Note that when I say 'personal', I mean intended for the use of one particular person. Some people refer to these as 'private' branches, but we don't do access control in svn other than on a per-project level, so other users can take a look at what I'm up to. This allows me to ask for suggestions or advice without having to email diffs around. Updating from trunk is slightly irritating as svn doesn't do merge tracking ATM (it's in the works, though). Currently I just grep the commit log for the last merge from trunk (I use a consistent log message so this is easy). 
svn log https://svn.example.com/project/branches/ben | grep 'Merged from trunk'
(note the last merged revision)
svn merge -r$LAST_MERGED_REV:HEAD https://svn.example.com/project/trunk /path/to/wc
(fix any conflicts)
svn ci /path/to/wc -m "Merged from trunk r$LAST_MERGED_REV"

Of course, you can also cherry-pick changes from other branches or tags if you know the revision number(s). From what I've seen on the svn mailing lists, this is a pretty common pattern to use. I don't think it's very common in CVS though, simply because branching and merging are more difficult. -- Ben
Re: [zfs-discuss] A versioning FS
On Oct 6, 2006, at 6:15 PM, Nicolas Williams wrote: What I'm saying is that I'd like to be able to keep multiple versions of my files without echo * or ls showing them to me by default. Hmm, what about file.txt - ._file.txt.1, ._file.txt.2, etc.? If you don't like the _ you could use @ or some other character. I'd like an option for ls(1), find(1) and friends to show file versions, and a way to copy (or, rather, un-hide) selected version files so that I could then refer to them as usual -- when I do this I don't care to see version numbers in the file name, I just want to give them names.

ln -s ._file.txt.1 first_published_draft.txt
ln -s ._file.txt.5 second_published_draft.txt

And, maybe, I'd like a way to write globs that match file versions (think of extended globbing, as in KSH). Hmm, I'm not exactly sure what you mean by this, but using a dotfile scheme would allow you to easily glob for the file names. Similarly with applications that keep files open but keep writing transactions in ways that the OS can't isolate without input from the app. E.g., databases. fsync(2) helps here, but lots and lots of fsync(2)s would result in no useful versioning. Presumably you'd create a different fs for your database, turning the versioning property off. You'd be likely to want to adjust other fs parameters anyway, judging from some recent posts discussing how to get the best database performance. -- Ben
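The hiding behavior falls straight out of plain dotfile semantics; a runnable toy of the scheme sketched above (the ._file.txt.N names are illustrative -- no shipping versioning FS exposes exactly this interface):

```shell
# Simulate hidden per-file versions plus named symlinks to chosen versions.
dir=$(mktemp -d)
cd "$dir"
printf 'first draft\n'  > ._file.txt.1
printf 'second draft\n' > ._file.txt.2
cp ._file.txt.2 file.txt                      # the "current" contents
ln -s ._file.txt.1 first_published_draft.txt  # un-hide one version by name
ls                                            # dot-versions don't show by default
cat first_published_draft.txt
```

A bare `ls` shows only file.txt and the named symlink; the ._file.txt.N versions stay out of sight until you glob for them explicitly (`ls ._file.txt.*`).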
[zfs-discuss] Re: NFS Performance and Tar
I was really hoping for some option other than ZIL_DISABLE, but finally gave up the fight. Some people suggested NFSv4 helping over NFSv3, but it didn't... at least not enough to matter. ZIL_DISABLE was the solution, sadly. I'm running B43/X86 and hoping to get up to B48 or so soonish (I BFU'd it straight to B48 last night and brick'ed it). Here are the times. This is an untar (gtar xfj) of SIDEkick (http://www.cuddletech.com/blog/pivot/entry.php?id=491) on NFSv4 on a 20TB RAIDZ2 ZFS pool:

ZIL Enabled:  real 1m26.941s
ZIL Disabled: real 0m5.789s

I'll update this post again when I finally get B48 or newer on the system and try it. Thanks to everyone for their suggestions. benr. This message posted from opensolaris.org
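For the record, on builds of this vintage the setting in question was a global kernel tunable in /etc/system -- a sketch; note it disables the intent log for every pool on the box, with the usual risk of losing recent synchronous writes on power failure:

```
* /etc/system -- disable the ZFS intent log globally (pre-per-dataset days)
set zfs:zil_disable = 1
```

The tunable is read at module load, which is why the thread stresses setting it and rebooting before the pool is imported.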
[zfs-discuss] Re: Re[2]: System hang caused by a bad snapshot
Hello Matthew, Tuesday, September 12, 2006, 7:57:45 PM, you wrote: MA Ben Miller wrote: I had a strange ZFS problem this morning. The entire system would hang when mounting the ZFS filesystems. After trial and error I determined that the problem was with one of the 2500 ZFS filesystems. When mounting that user's home the system would hang and need to be rebooted. After I removed the snapshots (9 of them) for that filesystem everything was fine. I don't know how to reproduce this and didn't get a crash dump. I don't remember seeing anything about this before, so I wanted to report it and see if anyone has any ideas. MA Hmm, that sounds pretty bizarre, since I don't think that mounting a MA filesystem really interacts with snapshots at all. MA Unfortunately, I don't think we'll be able to diagnose this without a MA crash dump or reproducibility. If it happens again, force a crash dump MA while the system is hung and we can take a look at it. Maybe it wasn't hung after all. I've seen similar behavior here sometimes. Were the disks used in your pool actually working? There was lots of activity on the disks (iostat and status LEDs) until it got to this one filesystem, and then everything stopped. 'zpool iostat 5' stopped running, the shell wouldn't respond and activity on the disks stopped. This fs is relatively small (175M used of a 512M quota). Sometimes it takes a lot of time (30-50 minutes) to mount a filesystem - it's rare, but it happens. And during this, ZFS reads from those disks in the pool. I did report it here some time ago. In my case the system crashed during the evening and it was left hung up when I came in during the morning, so it was hung for a good 9-10 hours. Ben
[zfs-discuss] System hang caused by a bad snapshot
I had a strange ZFS problem this morning. The entire system would hang when mounting the ZFS filesystems. After trial and error I determined that the problem was with one of the 2500 ZFS filesystems. When mounting that user's home the system would hang and need to be rebooted. After I removed the snapshots (9 of them) for that filesystem everything was fine. I don't know how to reproduce this and didn't get a crash dump. I don't remember seeing anything about this before, so I wanted to report it and see if anyone has any ideas. The system is a Sun Fire 280R with 3GB of RAM running SXCR b40. The pool looks like this (I'm running a scrub currently):

# zpool status pool1
  pool: pool1
 state: ONLINE
 scrub: scrub in progress, 78.61% done, 0h18m to go
config:

        NAME         STATE     READ WRITE CKSUM
        pool1        ONLINE       0     0     0
          raidz      ONLINE       0     0     0
            c1t8d0   ONLINE       0     0     0
            c1t9d0   ONLINE       0     0     0
            c1t10d0  ONLINE       0     0     0
            c1t11d0  ONLINE       0     0     0

errors: No known data errors

Ben
[zfs-discuss] Home Server with ZFS
Hi, I'm planning to build a home server that will host my svn repository, fileserver, mailserver and webserver. This is my plan: I have an old Dell Precision 420 with dual 933MHz PIII CPUs. Inside this I have one 9.1G SCSI HDD and two 80G IDE HDDs. I am going to install Solaris 10 on the SCSI drive and have it as the boot disk. I will then create a ZFS mirror on the two IDE drives. Since I don't want to mix internet-facing services (mailserver, webservers) with my internal services (svn server, fileserver), I am going to use zones to isolate them. Not sure how many zones just yet. In this configuration I hope to have gained the protection of having the services mirrored (I will perform backups also). What I don't know is what happens if the boot disk dies - can I replace it, install Solaris again and get it to see the ZFS mirror? Also, what happens if one of the IDE drives fails? Can I plug another one in and run some zfs commands to make it part of the mirror? Ben
[zfs-discuss] Removing a device from a zfs pool
How can I remove a device or a partition from a pool? NOTE: The devices are not mirrored or raidz. Thanks