Re: [zfs-discuss] Planned ZFS-Features - Is there a List or something else
What you're talking about is a side-benefit of the BP rewrite section of the linked slides. I believe that once BP rewrite is fully baked, we'll soon afterwards see a device removal feature arrive.

/dale

On Dec 9, 2009, at 3:46 PM, R.G. Keen wrote:
> I didn't see "remove a simple device" anywhere in there. Is it: too hard to even contemplate doing, or too silly a thing to do to even consider letting happen, or too stupid a question to even consider, or is the recommended procedure (export the whole pool, destroy the pool, remove the device, remake the pool, then reimport the pool) too easy and straightforward to bother with?
> -- This message posted from opensolaris.org

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs/io performance on Netra X1
There is also a long-standing bug in the ALi chipset used on these servers which ZFS tickles. I don't think a work-around for this bug was ever implemented, and it's still present in Solaris 10.

On Nov 13, 2009, at 11:29 AM, Richard Elling wrote:
> The Netra X1 has one ATA bus for both internal drives. No way to get high perf out of a snail.
> -- richard
>
> On Nov 13, 2009, at 8:08 AM, Bob Friesenhahn wrote:
>> On Fri, 13 Nov 2009, Tim Cook wrote:
>>> If it is using parallel SCSI, perhaps there is a problem with the SCSI bus termination or a bad cable?
>>> SCSI? Try PATA ;)
>> Is that good? I don't recall ever selecting that option when purchasing a computer. It seemed safer to stick with SCSI than to try exotic technologies. Does PATA daisy-chain disks onto the same cable and controller? If this PATA bus and its drives are becoming overwhelmed, maybe it will help to tune zfs:zfs_vdev_max_pending down to a very small value in the kernel.
>> Bob
>> -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
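For reference, the tunable Bob mentions would be set in /etc/system like this (the value is illustrative, and a reboot is required for it to take effect):

```
* Limit the number of concurrent I/Os ZFS will queue per vdev; a low
* value can help a single shared (P)ATA bus that is being overwhelmed.
set zfs:zfs_vdev_max_pending = 2
```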
[zfs-discuss] Sniping a bad inode in zfs?
I have a single-fs, mirrored pool on my hands which recently went through a bout of corruption. I've managed to clean up a good bit of it, but it appears that I'm left with some directories which have bad refcounts. For example, I have what should be an empty directory, foo, which, when you cd into it and ls -al, shows an incorrect refcount for an empty directory:

total 444
drwxr-xr-x   2 daleg  users    3 Aug 17 13:20 ./
drwx--x--x  64 daleg  users  117 Aug 17 13:20 ../

Thus, attempts to remove this directory via rmdir fail with "directory not empty", and rm -rf gacks with "File exists". I can touch a new file in this dir and such, with the refcount incrementing to 4, and removing it poses no problem either, with the refcount decrementing back to 3. However, 3 is the wrong number. It should of course be only 2 (. and ..).

Normally on UFS I would just take the 'nuke it from orbit' route and use clri to wipe the directory's inode. However, clri doesn't appear to be zfs-aware (there's not even a zfs analog of clri in /usr/lib/fs/zfs), and I don't immediately see an option in zdb which would help cure this. Any suggestions would be appreciated.

/dale

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
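While zdb can't repair anything, it can at least confirm what ZFS thinks the directory object looks like on disk. A hypothetical sketch (the pool name and object number here are illustrative, not from the original post):

```shell
# Get the object number of the suspect directory:
ls -di /somepool/foo
# Dump that object's dnode/znode details from the dataset; for a truly
# empty directory one would expect a link count of 2 and no entries:
zdb -dddd somepool 12345
```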
Re: [zfs-discuss] zfs code and fishworks fork
On Oct 27, 2009, at 2:58 PM, Bryan Cantrill wrote:
>>>> I can agree that the software is the one that really has the added value, but in my opinion allowing a stack like Fishworks to run outside the Sun Unified Storage would lead to a lower price per unit (Fishworks license) but maybe increase revenue.
>>> I'm afraid I don't see that argument at all; I think that the economics that you're advocating would be more than undermined by the necessarily higher costs of validating and supporting a broader range of hardware and firmware...
>> (Just playing Devil's Advocate here) There could be no economics at all. A basic warranty would be provided, but running a standalone product is a wholly on-your-own proposition once one ventures outside a very small hardware support matrix. Perhaps Fishworks/AK would have an OpenSolaris edition - leave the bulk of the actual hardware support up to a support infrastructure that's already geared towards making wide ranges of hardware supportable - OpenSolaris/Solaris, after all, does allow that. Perhaps this could be a version of Fishworks that's not as integrated with what you get on a SUS platform; if some of the Fishworks functionality that depends on a precise hardware combo could be reduced or generalized, perhaps it's worth consideration. Knowing the little I do about what's going on under the hood of a SUS system, I wouldn't expect the version of Fishworks used on the SUS systems to have 100% parity with an unbundled Fishworks edition - but the core features, by and large, would convey.
> Why would we do this? I'm all for zero-cost endeavors, but this isn't zero-cost -- and I'm having a hard time seeing the business case here, especially when we have so many paying customers for whom the business case for our time and energy is crystal clear...

Hey, I was just offering food for thought from the technical end :) Of course the cost in man-hours to attain a reasonable, unbundled version would have to be justifiable. If that aspect isn't currently justifiable, then that's as far as the conversation needs to go. However, times change, and one day demand could very well justify the business costs.

/dale
[zfs-discuss] s10u8: lots of fixes, any commentary?
So, looking at the README for patch 14144[45]-09, there are a ton of ZFS fixes and feature adds. The big features are already described in the update 8 release docs, but would anyone in-the-know care to comment on or point out any interesting CR fixes that might be substantial in the areas of stability or performance? Thanks for what looks like a well-loaded KJP.

/dale
Re: [zfs-discuss] Why is Solaris 10 ZFS performance so terrible?
On Sep 15, 2009, at 5:21 PM, Richard Elling wrote:
> On Sep 15, 2009, at 1:03 PM, Dale Ghent wrote:
>> On Sep 10, 2009, at 3:12 PM, Rich Morris wrote:
>>> On 07/28/09 17:13, Rich Morris wrote:
>>>> On Mon, Jul 20, 2009 at 7:52 PM, Bob Friesenhahn wrote:
>>>>> Sun has opened internal CR 6859997. It is now in Dispatched state at High priority.
>>> CR 6859997 has recently been fixed in Nevada. This fix will also be in Solaris 10 Update 9. This fix speeds up the sequential prefetch pattern described in this CR without slowing down other prefetch patterns. Some kstats have also been added to help improve the observability of ZFS file prefetching.
>> Awesome that the fix exists. I've been having a hell of a time with device-level prefetch on my iscsi clients causing tons of ultimately useless IO and have resorted to setting zfs_vdev_cache_max=1.
> This only affects metadata. Wouldn't it be better to disable prefetching for data?

Well, that's a surprise to me, but the zfs_vdev_cache_max=1 did provide relief.

Just a general description of my environment: my setup consists of several s10uX iscsi clients which get LUNs from pairs of thumpers. Each thumper pair exports identical LUNs to each iscsi client, and the client in turn mirrors each LUN pair inside a local zpool. As more space is needed on a client, a new LUN is created on the pair of thumpers and exported to the iscsi client, which then picks it up, and we add a new mirrored vdev to the client's existing zpool. This gives us data redundancy across chassis, so if one thumper were to fail or need patching, etc., the iscsi clients just see one side of their mirrors drop out.

The problem that we observed on the iscsi clients was that, when viewing things through 'zpool iostat -v', far more IO was being requested from the LUs than was being registered for the vdev those LUs were a member of. Given that this was an iscsi setup with stock thumpers (no SSD ZIL, no L2ARC) serving the LUs, this apparent overhead caused far more unnecessary disk IO on the thumpers, thus starving out IO for data that was actually needed. The working set is lots of small-ish files, entirely random IO.

If zfs_vdev_cache_max only affects metadata prefetches, which parameter affects data prefetches? I have to admit that disabling device-level prefetching was a shot in the dark, but it did result in drastically reduced contention on the thumpers.

Question though... why is a bug fix that can be a watershed for performance held back for so long? s10u9 won't be available for at least 6 months from now, and with a huge environment, I try hard not to live off of IDRs. Am I the only one who thinks this is way too conservative? It's just maddening to know that a highly beneficial fix is out there, but its release is based on time rather than need. Sustaining really needs to be more proactive when it comes to this stuff.

/dale
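For reference, the two prefetch-related knobs being discussed live in /etc/system (values are illustrative; a reboot is required, and per the thread, the first affects only metadata):

```
* Reads smaller than zfs_vdev_cache_max are inflated and cached at the
* vdev layer; setting it to 1 effectively disables this device-level
* prefetch:
set zfs:zfs_vdev_cache_max = 1
* File-level (data) prefetching can be disabled outright:
set zfs:zfs_prefetch_disable = 1
```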
Re: [zfs-discuss] Why is Solaris 10 ZFS performance so terrible?
On Sep 15, 2009, at 6:28 PM, Bob Friesenhahn wrote:
> On Tue, 15 Sep 2009, Dale Ghent wrote:
>> Question though... why is a bug fix that can be a watershed for performance held back for so long? s10u9 won't be available for at least 6 months from now, and with a huge environment, I try hard not to live off of IDRs.
> As someone who currently faces kernel panics with recent U7+ kernel patches (on AMD64 and SPARC) related to PCI bus upset, I expect that Sun will take the time to make sure that the implementation is as good as it can be and is thoroughly tested before release.

Are you referring to the same testing that gained you this PCI panic feature in s10u7?

Testing is a no-brainer, and I would expect that there already exists some level of assurance that a CR fix is correct at the point of putback. But I've dealt with many bugs, both very recently and long in the past, where a fix has existed in Nevada for months, even a year, before I got bit by the same bug in s10 and then had to go through the support channels to A) convince whomever I'm talking to that, yes, I'm hitting this bug, B) establish that, yes, there is a fix, and then C) ask pretty please can I have an IDR.

Just this week I'm wrapping up testing of an IDR which addresses an e1000g hardware errata that was fixed in onnv earlier this year, in February. For something that addresses a hardware issue on an Intel chipset used on shipping Sun servers, one would think that Sustaining would be on the ball and get that integrated ASAP. But the current mode of operation appears to be "no CR, no backport", which leaves us customers needlessly running into bugs and then begging for their fixes... or hearing the dreaded "oh, that fix will be available two updates from now". Not cool.

/dale
Re: [zfs-discuss] ZFS + EMC Cx310 Array (JBOD ? Or Singe MetaLUN ?)
On May 1, 2009, at 2:09 AM, Wilkinson, Alex wrote:
> On Thu, Apr 30, 2009 at 11:11:55AM -0500, Bob Friesenhahn wrote:
>> On Thu, 30 Apr 2009, Wilkinson, Alex wrote:
>>> I currently have a single 17TB MetaLUN that I am about to present to an OpenSolaris initiator, and it will obviously be ZFS. However, I am constantly reading that presenting a JBOD and using ZFS to manage the RAID is best practice? I'm not really sure why? And isn't that a waste of a high-performing RAID array (EMC)?
>> The JBOD advantage is that then ZFS can schedule I/O for the disks and there is less chance of an unrecoverable pool, since ZFS is assured to lay out redundant data on redundant hardware, and ZFS uses more robust error detection than the firmware on any array. When using mirrors there is considerable advantage, since writes and reads can be concurrent. That said, your EMC hardware likely offers much nicer interfaces for indicating and replacing bad disk drives. With the ZFS JBOD approach you have to back-track from what ZFS tells you (a Solaris device ID) and figure out which physical drive is not behaving correctly. EMC tech support may not be very helpful if ZFS says there is something wrong but the raid array says there is not. Sometimes there is value in taking advantage of what you paid for.
> So, shall I forget ZFS and use UFS?

Not at all. Just export lots of LUNs from your EMC to get the IO scheduling win, not one giant one, and configure the zpool as a stripe.

/dale
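A minimal sketch of that layout (the pool name and device names are hypothetical placeholders for the EMC LUNs as Solaris would see them):

```shell
# Stripe several smaller EMC LUNs rather than one 17TB MetaLUN, so ZFS
# can schedule I/O across them; no "mirror"/"raidz" keyword means a
# plain stripe, with redundancy left to the array:
zpool create emcpool c4t0d0 c4t1d0 c4t2d0 c4t3d0
zpool status emcpool
```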
Re: [zfs-discuss] Looking for new SATA/SAS HBA; JBOD is not always JBOD
On Jan 9, 2009, at 9:28 AM, Erik Trimble wrote:
> I'm pretty darned sure that the LSI 1068-based HBAs will do true JBOD.

Indeed they do, and the mpt driver works fine with these cards.

/dale
Re: [zfs-discuss] Performance bake off vxfs/ufs/zfs need some help
Are you putting your archive and redo logs on a separate zpool (not just a different zfs fs within the same pool as your data files)? Are you using direct IO at all in any of the config scenarios you listed?

/dale

On Nov 22, 2008, at 12:41 PM, Chris Greer wrote:
> So, to give a little background on this, we have been benchmarking Oracle RAC on Linux vs. Oracle on Solaris. In the Solaris test, we are using vxvm and vxfs. We noticed that the same Oracle TPC benchmark at roughly the same transaction rate was causing twice as many disk I/Os to the backend DMX4-1500. So we concluded this is pretty much either Oracle is very different in RAC, or our filesystems may be the culprits. This testing is wrapping up (it all gets dismantled Monday), so we took the time to run a simulated disk I/O test with an 8K IO size.
>
> vxvm with vxfs we achieved 2387 IOPS
> vxvm with ufs we achieved 4447 IOPS
> ufs on disk devices we achieved 4540 IOPS
> zfs we achieved 1232 IOPS
>
> The only zfs tunings we have done are setting "set zfs:zfs_nocache=1" in /etc/system and changing the recordsize to be 8K to match the test. I think the files we are using in the test were created before we changed the recordsize, so I deleted them and recreated them and have started the other test... but does anyone have any other ideas? This is my first experience with ZFS with a commercial RAID array, and so far it's not that great.
>
> For those interested, we are using the iorate command from EMC for the benchmark. For the different tests, we have 13 luns presented. Each one is its own volume and filesystem with a single file on those filesystems. We are running 13 iorate processes in parallel (there is no cpu bottleneck in this either). For zfs, we put all those luns in a pool with no redundancy, created 13 filesystems, and are still running 13 iorate processes. We are running Solaris 10U6.
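The two questions above can be sketched as commands (pool, fs, and device names are hypothetical). Note that recordsize only applies to files written after it is set, which is why the poster had to recreate his test files:

```shell
# Set recordsize to match the 8K IO size *before* creating data files:
zfs create datapool/oradata
zfs set recordsize=8k datapool/oradata
# Redo/archive logs on their own pool, so their sequential writes don't
# compete with the 8K random IO against the data files:
zpool create logpool c5t0d0 c5t1d0
zfs create logpool/oralogs
```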
Re: [zfs-discuss] Do you grok it?
On Sep 12, 2008, at 1:35 PM, Richard Elling wrote:
> greenBytes has a very well produced teaser commercial on their site. http://www.green-bytes.com
> Actually, I think it is one of the better commercials done by tech companies in a long time. Do you grok it?

Did I detect a (well-done) metaphor for shared ZFS? I must say that the videography itself is very nice.

/dale
Re: [zfs-discuss] X4540
On Jul 11, 2008, at 5:32 PM, Richard Elling wrote:
> Yes, of course. But there is only one CF slot.

Cool coincidence that the following article on CF cards and DMA transfers was posted to /.: http://hardware.slashdot.org/article.pl?sid=08/07/12/1851251

I take it that Sun's going to ship/sell OEM'd CF cards of some sort for Loki. Hopefully they're ones that don't crap out on DMA transfers.

/dale
Re: [zfs-discuss] 2 items on the wish list
Re-reading your question, it occurs to me that you might be referring to the ability to mount a snapshot on *another server*? There's no built-in feature in zfs for that, but a workaround would be to do what I just detailed, with the additional step of exporting that cloned snapshot to the other server via NFS.

/dale

On Jun 27, 2008, at 7:02 PM, Dale Ghent wrote:
> On Jun 27, 2008, at 5:58 PM, Mertol Ozyoney wrote:
>> Hi all; there are two things that some customers are constantly asking for about ZFS. Ability to mount snapshots somewhere else. [this doesn't look easy, perhaps a proxy kind of setup?]
> This feature has been in ZFS since day 1. You would clone the snapshot into a file system, and mount that clone wherever you wish:
>
> 1) Create the snapshot:
>    zfs snapshot somepool/[EMAIL PROTECTED]
> 2) Clone the snapshot into a file system, which would be mounted at /somepool/snap:
>    zfs clone somepool/[EMAIL PROTECTED] somepool/snap
> 3) Optionally mount that cloned snapshot somewhere else:
>    zfs set mountpoint=/snapshot somepool/snap

/dale
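The NFS workaround described above can be sketched end to end (the dataset names, snapshot name, and client hostname here are hypothetical, chosen for illustration):

```shell
# On the server holding the pool:
zfs snapshot somepool/data@snap
zfs clone somepool/data@snap somepool/snap
zfs set sharenfs=ro=client.example.com somepool/snap
# On the other server:
mount -F nfs server:/somepool/snap /mnt
```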
Re: [zfs-discuss] mirroring zfs slice
On Jun 17, 2008, at 12:23 PM, Srinivas Chadalavada wrote:
> :root # zpool create export mirror c2t0d0s5 c2t0d0s5
> invalid vdev specification
> use '-f' to override the following errors:
> /dev/dsk/c2t0d0s5 is part of active ZFS pool export. Please see zpool(1M).

(I presume that you meant to use c2t2d0s5 as the second slice.)

You've already created your pool, so all you want to do is attach the new slice to be a mirror of one that is already in the pool:

zpool attach export c2t0d0s5 c2t2d0s5

This will create a mirror between c2t0d0s5 and c2t2d0s5. First be sure that slice 5 on c2t2d0 is at least the same size as c2t0d0s5. If c2t2d0 is unused, you can copy the vtoc from the first disk to the second one with a simple command:

prtvtoc /dev/rdsk/c2t0d0s2 | fmthard -s - /dev/rdsk/c2t2d0s2

Since you're on x86, you may need to run fdisk against c2t2d0 if it is a virgin drive.

/dale
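Put together, the steps above form a short sequence (same device names as in the reply; the comment about resilvering describes what one would expect to see, not captured output):

```shell
# Copy the label so slice 5 exists and matches on the second disk:
prtvtoc /dev/rdsk/c2t0d0s2 | fmthard -s - /dev/rdsk/c2t2d0s2
# Attach the new slice as a mirror of the existing one:
zpool attach export c2t0d0s5 c2t2d0s5
# Watch the resilver progress on the new mirror:
zpool status export
```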
Re: [zfs-discuss] mirroring zfs slice
On Jun 17, 2008, at 1:13 PM, dick hoogendijk wrote:
> This is about slices. Can this be done for a whole disk too? And if yes, do these disks have to be exactly the same size?

Indeed, it can be used on an entire disk. Examples:

zpool create mypool c1t0d0
zpool attach mypool c1t0d0 c2t0d0
zpool create mypool mirror c1t0d0 c2t0d0
...

Note the lack of a slice ID in the above commands' disk specifications. ZFS will interpret this as "use the entire disk". At that point it will apply an EFI label to the disk and bring it into the pool as specified. This method is preferred over specifying slice 2 (e.g., c1t0d0s2) when wanting to use the entire disk for ZFS.

/dale
Re: [zfs-discuss] nfs and smb performance
Have you turned on the "Ignore cache flush commands" option on the Xraids? You should ensure this is on when using ZFS on them.

/dale

On Mar 27, 2008, at 6:16 PM, abs wrote:
> hello all, i have two xraids connected via fibre to a poweredge2950. the 2 xraids are configured with 2 raid5 volumes each, giving me a total of 4 raid5 volumes. these are striped across in zfs. the read and write speeds local to the machine are as expected, but i have noticed some performance hits in the read and write speeds over nfs and samba. here is the observation:
>
> each filesystem is shared via nfs as well as samba. i am able to mount via nfs and samba on a Mac OS 10.5.2 client. i am able to only mount via nfs on a Mac OS 10.4.11 client. (there seems to be an authentication/encryption issue between the 10.4.11 client and the solaris box in this scenario. i know this is a bug on the client side)
>
> when writing a file via nfs from the 10.5.2 client the speeds are 60~70 MB/sec. when writing a file via samba from the 10.5.2 client the speeds are 30~50 MB/sec. when writing a file via nfs from the 10.4.11 client the speeds are 20~30 MB/sec. when writing a file via samba from a Windows XP client the speeds are 30~40 MB/sec.
>
> i know that there is an implementational difference in nfs and samba on both Mac OS 10.4.11 and 10.5.2 clients, but that still does not explain the Windows scenario. i was wondering if anyone else was experiencing similar issues and if there is some tuning i can do or am i just missing something. thanx in advance. cheers, abs
Re: [zfs-discuss] List of ZFS patches to be released with Solaris 10 U5
On Mar 4, 2008, at 5:13 PM, Ben Grele wrote:
> Experts, do you know where I could find the list of all the ZFS patches that will be released with Solaris 10 U5? My customer told me that they've seen such a list for prior update releases. I've not been able to find anything like it in the usual places.

Yes, something akin to George Wilson's post for s10u4 would be nice: http://mail.opensolaris.org/pipermail/zfs-discuss/2006-December/024516.html

/dale
Re: [zfs-discuss] ZFS replication strategies
On Feb 1, 2008, at 1:15 PM, Vincent Fox wrote:
> Ideally I'd love it if ZFS directly supported the idea of rolling snapshots out into slower secondary storage disks on the SAN, but in the meanwhile it looks like we have to roll our own solutions.

If you're running some recent SXCE build, you could use ZFS with AVS for remote replication over IP: http://blogs.sun.com/AVS/entry/avs_and_zfs_seamless

/dale
Re: [zfs-discuss] ZIL controls in Solaris 10 U4?
On Jan 30, 2008, at 3:44 PM, Vincent Fox wrote:
> What we ended up doing, for political reasons, was putting the squeeze on our Sun reps and getting a 10u4 kernel spin patch with... what did they call it? Oh yeah, a "big wad of ZFS fixes". So this ends up being a huge PITA, because for the next 6 months to a year we are tied to getting any kernel patches through this other channel rather than the usual way. But it does work for us, so there you are.

Speaking of "big wad of ZFS fixes", is it me, or is anyone else here getting kind of displeased over the glacial speed of the backporting of ZFS stability fixes to s10? It seems that we have to wait around 4-5 months for an oft-delayed s10 update for any fixes of substance to come out. Not only that: one day zfs is its own patch, then it is part of the current KU, and now it's part of the nfs patch, where zfs isn't mentioned anywhere in the patch's synopsis.

/dale
Re: [zfs-discuss] raidz and compression, difficulties
On Jan 26, 2008, at 3:24 AM, Joachim Pihl wrote:
> So far so good, zfs get all reports compression to be active. Now for the problem: after adding another 300GB of uncompressed .tif and .bin/.cue (audio CD) files, the compression ratio is still at 1.00, indicating that no compression has taken place.

TIFF files can have their own compression (compressed TIFF), and many image editors have this on by default, so you wouldn't know about it unless you specifically looked for it. I think the TIFF spec specifies LZW compression for this... but either way, if this is indeed the case, zfs compression won't help with those.

Now, the bin/cue file format specifies no optional compression, so those bin files should be nothing but a raw image of 16-bit PCM audio... which you should see some (but not great) compression with. The default lzjb compression scheme in zfs might not be terribly effective on this type of file data, being that it's optimized for speed rather than compression efficiency. Try turning on gzip compression in zfs instead and see if things improve. To make that simple, I'd just make a new fs (e.g., pool/data/audio), then 'zfs set compression=gzip-4 pool/data/audio', and then mv your bin/cue files there.

/dale
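The suggestion above as a short sequence (dataset names taken from the example in the reply). Since zfs compression only applies to newly written blocks, the files must be moved onto the new fs, not left in place:

```shell
zfs create pool/data/audio
zfs set compression=gzip-4 pool/data/audio
mv /pool/data/*.bin /pool/data/*.cue /pool/data/audio/
# Once the copy completes, check whether gzip did any better:
zfs get compressratio pool/data/audio
```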
[zfs-discuss] ZFS for OSX - it'll be in there.
...and eventually in a read-write capacity: http://www.macrumors.com/2007/10/04/apple-seeds-zfs-read-write-developer-preview-1-1-for-leopard/

> Apple has seeded version 1.1 of ZFS (Zettabyte File System) for Mac OS X to Developers this week. The preview updates a previous build released on June 26, 2007.

/dale
Re: [zfs-discuss] Direct I/O ability with zfs?
On Oct 3, 2007, at 10:31 AM, Roch - PAE wrote:
> If the DB cache is made large enough to consume most of memory, the ZFS copy will quickly be evicted to stage other I/Os on their way to the DB cache. What problem does that pose?

Personally, I'm still not completely sold on the performance (performance as in ability, not speed) of ARC eviction. Oftentimes, especially during a resilver, a server with ~2GB of RAM free under normal circumstances will dive down to the minfree floor, causing processes to be swapped out. We've had to take to manually constraining ARC max size so this situation is avoided. This is on s10u2/3. I haven't tried anything heavy-duty with Nevada, simply because I don't put Nevada in production situations.

Anyhow, in the case of DBs, the ARC indeed becomes a vestigial organ. I'm surprised that this is being met with skepticism, considering that Oracle highly recommends direct IO be used and that, IIRC, Oracle performance was the main motivation for adding DIO to UFS back in Solaris 2.6. This isn't a problem with ZFS or any specific fs per se; it's the buffer caching they all employ. So I'm a big fan of seeing 6429855 come to fruition.

/dale
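The manual ARC cap mentioned above is an /etc/system setting (the 2GB value is purely illustrative; pick something that leaves room for the DB's own cache, and reboot for it to take effect):

```
* Cap the ZFS ARC so it cannot grow into memory the database needs;
* 0x80000000 bytes = 2GB, an illustrative value only.
set zfs:zfs_arc_max = 0x80000000
```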
Re: [zfs-discuss] Direct I/O ability with zfs?
On Oct 3, 2007, at 5:21 PM, Richard Elling wrote:
> Slightly off-topic: in looking at some field data this morning (looking for something completely unrelated), I noticed that the use of directio on UFS is declining over time. I'm not sure what that means... hopefully not more performance escalations...

Sounds like someone from the ZFS team needs to get with someone from Oracle/MySQL/Postgres and get the skinny on how the IO rubber-meets-road boundary should look, because it doesn't sound like there's a definitive, or at least a sure, answer here. Oracle trumpets the use of DIO, and there are benchmarks and first-hand accounts out there from DBAs on its virtues - at least when running on UFS (and ext2/3 on Linux, etc.). As it relates to ZFS mechanics specifically, there doesn't appear to be any settled opinion.

/dale
Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS
On Sep 24, 2007, at 6:15 PM, Paul B. Henson wrote:
> Well, considering that some days we automatically create accounts for thousands of students, I wouldn't want to be the one stuck typing 'zfs create' a thousand times 8-/. And that still wouldn't resolve our requirement for our help desk staff to be able to manage quotas through our existing identity management system.

Not to sway you away from ZFS/NFS considerations, but I'd like to add that people who used DFS in the past typically went on to replace it with AFS. Have you considered it?

/dale
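For what it's worth, the thousand 'zfs create's can of course be scripted; a hypothetical sketch (pool name, quota, and input file are all placeholders), though it admittedly does nothing for the identity-management integration requirement:

```shell
# Create one fs per new account, driven by a file of usernames:
while read user; do
    zfs create "home/${user}"
    zfs set quota=1G "home/${user}"
done < new_accounts.txt
```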
[zfs-discuss] The Dangling DBuf Strikes Back
I saw a putback this past week from M. Maybee regarding this, but I thought I'd post here that I saw what is apparently an incarnation of 6569719 on a production box running s10u3 x86 with the latest (on sunsolve) patches. I have 3 other servers configured the same way WRT work load, zfs pools and hardware resources, so if this occurs again I'll see about logging a case and getting a relief patch. Anyhow, perhaps a backport to s10 may be in order.

This server is an x4100 hosting about 10k email accounts using Cyrus, and Cyrus's squatter mailbox indexer was running at the time (lots of small r/w IO), as well as Networker-based backups which suck data off a clone (yet tons more small ro IO). Unfortunately, due to a recent RAM upgrade of the server in question, the dump device was too small to hold a complete vmcore, but at least the stack trace was logged. Here it is, at least for posterity's sake:

Sep 3 03:27:43 xxx ^Mpanic[cpu0]/thread=fe80007b7c80:
Sep 3 03:27:43 xxx genunix: [ID 895785 kern.notice] dangling dbufs (dn=fe8432bad7d8, dbuf=fe81f93c5bd8)
Sep 3 03:27:43 xxx unix: [ID 10 kern.notice]
Sep 3 03:27:43 xxx genunix: [ID 655072 kern.notice] fe80007b7960 zfs:zfsctl_ops_root+2f168a42 ()
Sep 3 03:27:43 xxx genunix: [ID 655072 kern.notice] fe80007b79a0 zfs:zfsctl_ops_root+2f168af8 ()
Sep 3 03:27:44 xxx genunix: [ID 655072 kern.notice] fe80007b7a10 zfs:dnode_sync+334 ()
Sep 3 03:27:44 xxx genunix: [ID 655072 kern.notice] fe80007b7a60 zfs:dmu_objset_sync_dnodes+7b ()
Sep 3 03:27:44 xxx genunix: [ID 655072 kern.notice] fe80007b7af0 zfs:dmu_objset_sync+5c ()
Sep 3 03:27:44 xxx genunix: [ID 655072 kern.notice] fe80007b7b10 zfs:dsl_dataset_sync+23 ()
Sep 3 03:27:44 xxx genunix: [ID 655072 kern.notice] fe80007b7b60 zfs:dsl_pool_sync+7b ()
Sep 3 03:27:44 xxx genunix: [ID 655072 kern.notice] fe80007b7bd0 zfs:spa_sync+116 ()
Sep 3 03:27:44 xxx genunix: [ID 655072 kern.notice] fe80007b7c60 zfs:txg_sync_thread+115 ()
Sep 3 03:27:44 xxx genunix: [ID 655072 kern.notice] fe80007b7c70 unix:thread_start+8 ()

/dale
Re: [zfs-discuss] ZFS + ISCSI + LINUX QUESTIONS
On May 31, 2007, at 12:15 AM, Nathan Huisman wrote: = PROBLEM To create a disk storage system that will act as an archive point for user data (Non-recoverable data), and also act as a back end storage unit for virtual machines at a block level. snip

Here are some tips from me. I notice you mention iSCSI a lot, so I'll stick to that...

Q1: The best way to mirror in real time is to do it from the consumers of the storage, i.e., your iSCSI clients. Implement two storage servers (say, two x4100s with attached disk) and put their disk into zpools. The two servers do not have to know about each other. Configure ZFS file systems identically on both and export them to the client that'll use them. Use the software mirroring feature on the client to mirror these iSCSI shares (e.g. dynamic disks on Windows, LVM on Linux, SVM on Solaris). What this gives you are two storage servers (ZFS-backed, serving out iSCSI shares), and the client(s) take a share from each and mirror them... if one of the ZFS servers were to go kaput, the other is still there actively taking in and serving data. From the client's perspective, it'll just look like one side of the mirror went down, and after you get the downed ZFS server back up, you would initiate the normal mirror reattachment procedure on the client(s). This will also allow you to patch your ZFS servers without downtime incurred on your clients. The disk storage on your two ZFS+iSCSI servers could be anything. Given your budget and space needs, I would suggest looking at the Apple Xserve RAID with 750GB drives. You're a .edu, so the price of these things will likely please you (I just snapped up two of them at my .edu for a really insane price).

Q2: The client will just see the iSCSI share as a raw block device. Put your ext3/xfs/jfs on it as you please... to ZFS it is just data. That's the only way you can use iSCSI, really; it's block level, remember. On ZFS, the iSCSI backing store is one large sparse file.
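A rough sketch of the Q1 setup described above, assuming Solaris on both ends. All pool names, device names, addresses, and metadevice names here are invented for illustration; the shareiscsi property only exists in later builds (on earlier releases you'd configure a target with iscsitadm instead, backed by a file or zvol):

```sh
# On each of the two storage servers (device names invented):
zpool create tank mirror c2t0d0 c2t1d0
zfs create -V 100g tank/share1     # zvol to act as the iSCSI backing store
zfs set shareiscsi=on tank/share1  # export as an iSCSI target (later builds)

# On a Solaris client, discover one target from each server...
iscsiadm add discovery-address 192.168.1.10:3260
iscsiadm add discovery-address 192.168.1.11:3260
iscsiadm modify discovery --sendtargets enable

# ...then mirror the two resulting LUNs with SVM and put UFS on top.
metainit d11 1 1 c3t1d0s0   # LUN from server 1 (names will vary)
metainit d12 1 1 c3t2d0s0   # LUN from server 2
metainit d10 -m d11         # one-sided mirror first...
metattach d10 d12           # ...then attach the second half
newfs /dev/md/rdsk/d10
```

If one server dies, the client just sees a detached submirror; when the server returns, a metattach resyncs it with no client downtime.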
Q3: See the zpool man page, specifically the 'zpool replace ...' command.

Q4: Since (or if) you're doing iSCSI, ZFS snapshots will be of no value to you, since ZFS can't see into those iSCSI backing store files. I'll assume that you have a backup system in place for your existing infrastructure (Networker, NetBackup or what have you), so back up the stuff from the *clients* and not the ZFS servers. Just space the backup schedule out if you have multiple clients, so that the ZFS+iSCSI servers aren't overloaded with all their clients reading data at once when backup time rolls around.

Q5: Sure, nothing would stop you from doing that sort of config, but it's something that would make Rube Goldberg smile. Keep out any unneeded complexity and condense the solution. Excuse my ASCII art skills, but consider this:

[JBOD/ARRAY]---(fc)---[ZFS/iSCSI server 1]---(iscsi share)--- [Client]
                                                              [mirroring the]
[JBOD/ARRAY]---(fc)---[ZFS/iSCSI server 2]---(iscsi share)--- [ two shares ]

Kill one of the JBODs or arrays, OR one of the ZFS+iSCSI servers, and your clients are still in good shape as long as their software mirroring facility behaves. /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] RAIDZn+1 (related to the h/w raid ponderings)
Dropping in on this convo a little late, but here's something that has been nagging me: gaining the ability to mirror two (or more) RAIDZ sets. A little background on why I'd really like to see this. I have two data centers on my campus and my FC-based SAN stretches between them. When I buy RAID arrays, I do so in pairs so that one array ends up in each data center, and the LUN configs on these matched arrays are also the same. The servers which consume those LUNs use software mirroring (via ZFS, SVM, or whatever) to mirror data in real time between the two arrays... in effect gaining me chassis-level redundancy in two separate buildings set half a mile apart. If one building goes up in smoke, I know that my data is fine in the other. Anyway, let's now step down to the array level. Each array's disks are configured in RAID5 sets. This is a further level of redundancy, so that if one of the arrays in a pair is out of service for an extended time, the remaining array can still withstand a drive failure. Well, I'd like to get rid of that hardware RAID5 and use RAIDZ... but then that would preclude my mirroring setup from happening, since I can't set up two distinct RAIDZn sets within a pool and mirror them. For those familiar with ZFS internals, could a RAIDZ+1 configuration be a distinct possibility? /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] gzip compression throttles system?
On May 2, 2007, at 10:36 PM, Ian Collins wrote: The files are between 15 and 50MB. It's worth pointing out that .wav files only compress by a few percent. Not entirely related to your maxed CPU problem, but gzip on PCM audio isn't, as you point out, going to earn you much of a compression ratio. It's also very, very slow on this kind of data. If you're looking to compress wav or other PCM-based audio formats for storage space reasons, use FLAC. It's guaranteed that you'll get a far better compression ratio and a quicker result for your troubles than you will with gzip. There are many technical reasons for this, but generally, FLAC knows about audio down to the sample and is geared for the properties of PCM audio. gzip just sees it as a generic data blob like any other, which contributes to its inefficiency in this case. The downside is that, well, it's not an option at the ZFS level, but you don't necessarily have to pre-decompress your FLAC-compressed WAV files in order to listen to them :) Hmm... a FLAC-based compression mech in ZFS for efficient (and lossless) PCM audio storage... /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
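The gzip-on-PCM point above can be seen with any high-entropy input. The snippet below stands in for audio samples with random bytes (real PCM is not pure noise, but it behaves similarly for a generic byte-oriented compressor); the file paths are arbitrary:

```shell
# gzip gains almost nothing on high-entropy, PCM-like data.
head -c 1048576 /dev/urandom > /tmp/fake_pcm.raw    # 1 MiB of noise
gzip -9 -c /tmp/fake_pcm.raw > /tmp/fake_pcm.raw.gz
orig=$(wc -c < /tmp/fake_pcm.raw)
comp=$(wc -c < /tmp/fake_pcm.raw.gz)
echo "original: $orig bytes, gzipped: $comp bytes"
# On pure noise gzip actually grows the file slightly; on real .wav data
# it shaves only a few percent, where FLAC routinely saves far more.
```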
Re: [zfs-discuss] XServe Raid Complex Storage Considerations
On Apr 25, 2007, at 11:17 AM, cedric briner wrote: hello the list, After reading the _excellent_ ZFS Best Practices Guide, I've seen in the section ZFS and Complex Storage Consideration that we should configure the storage system to ignore commands which flush the memory onto the disk. So do some of you know how to tell the Xserve RAID to ignore ``fsync'' requests? After the announcement that zfs will be included in Tiger, I'll be surprised if the Xserve RAID does not include such a configuration. 

You can tell the Xserve RAID to ignore cache flush commands, but the option to make this so is controller-wide and not settable on a per-LUN basis. In the Xserve RAID management app, select a controller, click the Settings button and enter the admin password for the array. Then click the Performance tab, and make sure that the Allow Host Cache Flushing option is unchecked for the controllers you don't want that on. /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Exporting zvol properties to .zfs
Here at my university, I recently started selling disk space to users from a server with 4.5TB of space. They purchase space and I make them their own volume, typically with compression on, and it's then exported via NFS to their servers/workstations. So far this has gone quite well (with zil_disable and a tuned-up nfsd, of course). Anyhow, the frustration exhibited by a new customer of mine made me think of a new RFE possibility. This customer purchased some space and began moving his data (2TB's worth) over to it from his ailing RAID array. He became frantic at one point and said that the transfer was taking too long. He was judging the speed at which the move was going by doing a 'df' on his NFS client and comparing that to the existing partition which holds his data. What he didn't realize was that the transfer seemed slower because his data on the ZFS-backed NFS server was being compressed at a 2:1 ratio... so, for example, although the df on his NFS client reported 250G used, in reality approximately 500G had been transferred and then compressed on ZFS. This was explained to him and that averted his fury for the time being... but it got me thinking about how things such as the current compression ratio for a volume could be indicated over an otherwise ZFS-agnostic NFS export. The .zfs snapdir came to mind. Perhaps ZFS could maintain a special file under there, called compressratio for example, and a remote client could cat it or whatever to be aware of how volume compression factors into their space usage. Any thoughts? A quick b.o.o search did bring up an existing RFE along these lines, so I thought I'd mention that here. /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
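On the server side, the number the customer needed is already one command away, and reconciling it with the client's df is simple arithmetic. A sketch (the dataset name is invented; the numbers match the 250G / 2:1 example above):

```shell
# On the ZFS server the ratio is exposed as a dataset property, e.g.:
#   zfs get -H -o value compressratio tank/customer    # prints e.g. "2.00x"
# The client's df only sees post-compression usage, so the amount of data
# actually moved is roughly (used * ratio):
used_gb=250
ratio=2.00
logical_gb=$(echo "$used_gb $ratio" | awk '{ printf "%d\n", $1 * $2 }')
echo "df shows ${used_gb}G used; roughly ${logical_gb}G was actually transferred"
```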
Re: [zfs-discuss] snapdir visable recursively throughout a dataset
On Feb 5, 2007, at 7:57 AM, Robert Milkowski wrote: I haven't tried it but what if you mounted ro via loopback into a zone /zones/myzone01/root/.zfs is loop mounted in RO to /zones/myzone01/.zfs I've tried something similar but found out that vfstab is evaluated prior to zpool import, so any lofs directives in vfstab will fail if the source of the lofs mount is in ZFS :/ /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: Thumper Origins Q
On Jan 31, 2007, at 4:26 AM, Selim Daoud wrote: you can still do some lun masking at the HBA level (Solaris 10) this feature is called blacklist Oh, I'd do that, but Solaris isn't the only OS that uses arrays on my SAN; other hosts do too, even cross-departmental ones. Thus masking from the array is a must to keep the amount of host-based tomfoolery to a minimum. /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] System pause peculiarity with mysql on zfs
Hey all, I run a Netra X1 as the mysql db server for my small personal web site. This X1 has two drives in it with SVM-mirrored UFS slices for / and /var, a swap slice, and slice 7 is zfs. There is one zfs mirror pool called local on which there are a few file systems, one of which is for mysql. Slice 7 used to be UFS, and I had no performance problems when that was the case. There is 1152MB of RAM on this box, half of which is in use. Solaris 10 FCS + all the latest patches as of today. So anyway, after moving mysql to live on zfs (with compression turned on for the volume in question), I noticed that web pages on my site took a bit of time, sometimes up to 20 seconds, to load. I'd jump onto my X1 and notice that according to top, the kernel was hogging 80-100% of the 500MHz CPU, and mysqld was the top process in CPU use. The load average would shoot from a normal 0.something up to 6 or even 8. Command-line response was stop and go. Then I'd notice my page would finally load, and that corresponded with load and kernel CPU usage decreasing back to normal levels. I am able to reliably replicate this, and I ran lockstat while this was going on, the output of which is here: http://elektronkind.org/osol/lockstat-zfs-0.txt Part of me is kind of sure that this is 6421427, as there appear to be long and copious trips through ata_wait() as that bug illustrates, but I just want to be sure of it (and when is that bug seeing a Solaris 10 patch, btw?) TIA, /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] System pause peculiarity with mysql on zfs
On Dec 7, 2006, at 1:46 PM, Jason J. W. Williams wrote: Hi Dale, Are you using MyISAM or InnoDB? InnoDB. Also, what's your zpool configuration? A basic mirror:

[EMAIL PROTECTED] zpool status
  pool: local
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        local         ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t0d0s7  ONLINE       0     0     0
            c0t2d0s7  ONLINE       0     0     0

errors: No known data errors

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] System pause peculiarity with mysql on zfs
On Dec 7, 2006, at 5:22 PM, Nicholas Senedzuk wrote: You said you are running Solaris 10 FCS but zfs was not released until Solaris 10 6/06 which is Solaris 10U2. Look at a Solaris 10 6/06 CD/DVD. Check out the Solaris_10/UpgradePatches directory. ah! well whaddya know... Yes, apply those (you have to do them in the right order to do it in one run with 'patchadd -M') and you can bring your older box up to date with the update release. /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: System pause peculiarity with mysql on zfs
On Dec 7, 2006, at 6:14 PM, Anton B. Rang wrote: This does look like the ATA driver bug rather than a ZFS issue per se. Yes indeed. Well, that answers that. FWIW, I'm in hour 2 of a mysql configure script run. Yow! (For the curious, the reason ZFS triggers this when UFS doesn't is because ZFS sends a synchronize cache command to the disk, which is not handled in DMA mode by the controller; and for this particular controller, switching between DMA and PIO mode has some quirks which were worked around by adding delays. The fix involves a new quirk-work-around.) Ah, so I suppose this would affect the V100, too. The same ALi IDE controller is in that box. Thanks for the insight. Since the fix for this made it into snv_52, I suppose it's too recent for a backport and patch release for s10 :( /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS related kernel panic
Matthew Ahrens wrote: Jason J. W. Williams wrote: Hi all, Having experienced this, it would be nice if there was an option to offline the filesystem instead of kernel panicking on a per-zpool basis. If its a system-critical partition like a database I'd prefer it to kernel-panick and thereby trigger a fail-over of the application. However, if its a zpool hosting some fileshares I'd prefer it to stay online. Putting that level of control in would alleviate a lot of the complaints it seems to me...or at least give less of a leg to stand on. ;-) Agreed, and we are working on this. Similar to UFS's onerror mount option, I take it? /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS related kernel panic
Richard Elling wrote: Actually, it would be interesting to see how many customers change the onerror setting. We have some data, just need more days in the hour. I'm pretty sure you'd find that info in over 6 years of submitted Explorer output :) I imagine that stuff is sandboxed away in a far off department, though. /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Mirrored Raidz
On Oct 24, 2006, at 4:56 AM, Michel Kintz wrote: It is not always a matter of more redundancy. In my customer's case, they have storage in 2 different rooms of their datacenter and want to mirror from one storage unit in one room to the other. So having in this case a combination of RAID-Z + Mirror makes sense in my mind or ? It /does/ make sense. Having a geographically diverse storage scenario like this is good, but it changes the rules a bit, and in a way that you can't fully take advantage of by using only soft RAID such as ZFS or SVM. The missing link, as you point out, is the ability to mirror (within ZFS) a RAIDZ vdev. To get around this, I just use hardware RAID5 on my separate arrays and use either ZFS or SVM mirroring between the two on the hosts. I have thought about this over the past several months, and believe that it's probably better this way rather than doing it all in ZFS or SVM. /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: Mirrored Raidz
On Oct 24, 2006, at 12:33 PM, Frank Cusack wrote: On October 24, 2006 9:19:07 AM -0700 Anton B. Rang [EMAIL PROTECTED] wrote: Our thinking is that if you want more redundancy than RAID-Z, you should use RAID-Z with double parity, which provides more reliability and more usable storage than a mirror of RAID-Zs would. This is only true if the drives have either independent or identical failure modes, I think. Consider two boxes, each containing ten drives. Creating RAID-Z within each box protects against single-drive failures. Mirroring the boxes together protects against single-box failures. But mirroring also protects against single-drive failures. Right, but a mirrored raidz would in this case protect the admin from: 1) one entire JBOD chassis/comm failure, and 2) individual drive failure in the remaining chassis during an occurrence of (1). Since the person is dealing with JBODs and not hardware RAID arrays, my suggestion is to combine ZFS and SVM: 1) Use ZFS to make a raidz-based ZVOL of the disks on each of the two JBODs. 2) Use SVM to mirror the two ZVOLs, and newfs that with UFS. Not at all optimal, but it'll work. It would be nice if you could manage a mirror of existing vdevs within ZFS, where this mirroring would be a special case: it would be dumb, just presenting the volume and passing through most of the stuff to the raidz (or whatever) vdev below. It would be silly to double-cksum and compress everything, not to mention the possibility of differing record sizes. /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
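The ZFS+SVM workaround described above could look something like this. All pool, zvol, device, and metadevice names are invented, and whether a given SVM release accepts a zvol path directly as a metadevice component should be verified before relying on it:

```sh
# One raidz pool per JBOD (device names invented):
zpool create jbod1 raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0
zpool create jbod2 raidz c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0

# Carve an identically sized zvol out of each:
zfs create -V 500g jbod1/vol
zfs create -V 500g jbod2/vol

# Mirror the two zvols with SVM, then put UFS on the mirror:
metainit d21 1 1 /dev/zvol/dsk/jbod1/vol
metainit d22 1 1 /dev/zvol/dsk/jbod2/vol
metainit d20 -m d21
metattach d20 d22
newfs /dev/md/rdsk/d20
```

Losing a whole JBOD then detaches one submirror, while a single drive failure inside the surviving JBOD is absorbed by that pool's raidz.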
Re: [zfs-discuss] Re: Mirrored Raidz
On Oct 24, 2006, at 2:46 PM, Richard Elling - PAE wrote: Pedantic question, what would this gain us other than better data retention? Space and (especially?) performance would be worse with RAID-Z+1 than 2-way mirrors. You answered your own question: it would gain the user better data retention :) The space tradeoff is an obvious side effect and unavoidable. For situations where this is not an overriding issue, it just isn't an issue. I don't believe performance would be adversely impacted to a practical degree, though. A dumb ZFS mirror strategy in this case would just copy reads and writes to and from the vdevs below it, OR pre-package the writes itself with compression and checksums and send that data below to the raidzs to be stored (which would probably be more problematic to implement in the zfs code). With the latter, checksums and compression would be done only once (at the mirror level) and not by each of the n underlying vdevs. So, a little ascii art to summarize:

1) The probably-easiest-to-implement approach:

        [app]
          |
     [zfs volume]
          |
    [vdev mirror]    -- passes thru read/write ops, regulates recordsize. It's mainly dumb
          |
 [raidz vdev]--+--[raidz vdev]...   -- each vdev generates cksums, compression per normal
  | | | |          | | | |
 [phys devs]      [phys devs]

2) The less-CPU-but-more-convoluted approach:

        [app]
          |
     [zfs volume]
          |
    [vdev mirror]    -- generates cksums, compression, regulates recordsize
          |
 [raidz vdev]--+--[raidz vdev]...   -- each vdev just stores data as it is passed in from above
  | | | |          | | | |
 [phys devs]      [phys devs]

Either of those would be quite handy in an environment where you want to mirror data between, say, JBODs and retention is the primary goal. /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: Mirrored Raidz
On Oct 24, 2006, at 3:23 PM, Frank Cusack wrote: http://blogs.sun.com/roch/entry/when_to_and_not_to says a raid-z vdev has the read throughput of 1 drive for random reads. Compared to #drives for a stripe. That's pretty significant. Okay, then if the person can stand to lose even more space, do zfs mirroring on each JBOD. Then we'd have a mirror of mirrors instead of a mirror of raidz's. Remember, the OP wanted chassis-level redundancy as well as redundancy within the domain of each chassis. You can't do that now with ZFS unless you combine ZFS with SVM. /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS RAID-10
On Oct 22, 2006, at 9:57 PM, Al Hopper wrote: On Sun, 22 Oct 2006, Stephen Le wrote: Is it possible to construct a RAID-10 array with ZFS? I've read through the ZFS documentation, and it appears that the only way to create a RAID-10 array would be to create two mirrored (RAID-1) emulated volumes in ZFS and combine those to create the outer RAID-0 volume. Am I approaching this in the wrong way? Should I be using SVM to create my RAID-1 volumes and then create a ZFS filesystem from those volumes? No - don't do that. Here is a ZFS version of a RAID 10 config with 4 disks: snip To further agree with/illustrate Al's point, here's an example of 'zpool status' output which reflects this type of configuration. (Note that there is one mirror set for each pair of drives. In this case, drive 1 on controller 3 is mirrored to drive 1 on controller 4, and so on. This will ensure continuity should one controller/bus/cable fail.)

[EMAIL PROTECTED] zpool status
  pool: data
 state: ONLINE
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        data         ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c3t0d0   ONLINE       0     0     0
            c4t9d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c3t1d0   ONLINE       0     0     0
            c4t10d0  ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c3t2d0   ONLINE       0     0     0
            c4t11d0  ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c3t3d0   ONLINE       0     0     0
            c4t12d0  ONLINE       0     0     0

errors: No known data errors

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
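For reference, a single command of this shape would build the layout shown in that status output (a stripe of four two-way mirrors, each pair split across controllers 3 and 4):

```sh
# RAID-10 in ZFS: list several mirror groups and zpool stripes across them.
zpool create data \
  mirror c3t0d0 c4t9d0 \
  mirror c3t1d0 c4t10d0 \
  mirror c3t2d0 c4t11d0 \
  mirror c3t3d0 c4t12d0
```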
Re: [zfs-discuss] Re: ZFS Inexpensive SATA Whitebox
On Oct 17, 2006, at 1:59 PM, Richard Elling - PAE wrote: The realities of the hardware world strike again. Sun does use the Siig SATA chips in some products, Marvell in others, and NVidia MCPs in others. The difference is in who writes the drivers. NVidia, for example, has a history of developing their own drivers and keeping them closed-source. This is their decision and, I speculate, largely based on their desire to keep the hardware implementation details from their competitors. If you want to learn the source of mine, Frank's, and undoubtedly others' ire, please refer to: http://www.sun.com/products-n-solutions/hardware/docs/html/819-3722-15/index.html#21924 This is the release notes of the X2100. The indication that hot-swap works under Windows (but not Linux or Solaris) seems to be an obvious indicator that this is not a hardware shortcoming but a driver one (which would make sense; ata does not expect a device to go away). Further, if my memory isn't playing tricks on me, when I received my first X2100 (around a month or two after they were first released) I recall an additional small yellow paper tucked in the accessories box, separate from the standard documentation, saying that hot-swap under Solaris would be supported in a future Solaris version. There's also a bug open on this matter, and it has been open for a long time. If this wasn't feasible, I imagine the bug would already be closed with a WONTFIX. If you want NVidia drivers for Solaris, then please let NVidia know. As an outsider, I don't want to trivialize the happenings in the Sun-nVidia relationship, but look at nge(7d) as an example. Surely if that exists (closed source, and I assume it's provided by nVidia in part or whole and under NDA) then an NV SATA driver shouldn't be hard to obtain, even if it too ended up being closed-source (a la the Marvell driver). /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS patches for S10 6/06
On Oct 5, 2006, at 2:28 AM, George Wilson wrote: Andreas, The first ZFS patch will be released in the upcoming weeks. For now, the latest available bits are the ones from s10 6/06. George, will there at least be a T patch available? I'm anxious for these because my ZFS-backed NFS server just isn't having it in terms of client i/o rates. /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: ZFS Inexpensive SATA Whitebox
On Oct 11, 2006, at 10:10 AM, [EMAIL PROTECTED] wrote: So are there any pci-e SATA cards that are supported ? I was hoping to go with a sempron64. Using old-pci seems like a waste. Yes. I wrote up a little review of the SIIG SC-SAE412-S1 card which is a two port PCIe card based on the Silicon Image 3132 chip: http://elektronkind.org/2006/09/siig-esata-ii-pcie-card-and-opensolaris The card is a two port eSATA2 card, but SIIG also sells a two port internal SATA card based on the same chip as well. This card is running fine under SX:CR build 47 and would presumably also run fine under Solaris 10 Update 2 or later. /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: ZFS Inexpensive SATA Whitebox
On Oct 11, 2006, at 7:36 PM, David Dyer-Bennet wrote: I've been running Linux since kernel 0.99pl13, I think it was, and have had amazingly little trouble. Whereas I'm now sitting on $2k of hardware that won't do what I wanted it to do under Solaris, so it's a bit of a hot-button issue for me right now. Yes, but remember back in the days of Linux 0.99, the amount of PC hardware was nowhere near as varied as it is today. Integrated chipsets? A pipe dream! Aside from video card chips and proprietary pre-ATAPI CDROM interfaces, you didn't have to reach far to find a driver which covered a given piece of hardware because when you got down to it, most hardware was the same. NE2000, anyone? Today, in 2006 - much different story. I even had Linux AND Solaris problems with my machine's MCP51 chipset when it first came out. Both forcedeth and nge croaked on it. Welcome to the bleeding edge. You're unfortunately on the bleeding edge of hardware AND software. When in that situation, one can be patient, be helpful, or go back to where one came from. /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: ZFS Inexpensive SATA Whitebox
On Oct 12, 2006, at 12:23 AM, Frank Cusack wrote: On October 11, 2006 11:14:59 PM -0400 Dale Ghent [EMAIL PROTECTED] wrote: Today, in 2006 - much different story. I even had Linux AND Solaris problems with my machine's MCP51 chipset when it first came out. Both forcedeth and nge croaked on it. Welcome to the bleeding edge. You're unfortunately on the bleeding edge of hardware AND software. Yeah, Solaris x86 is so bleeding edge that it doesn't even support Sun's own hardware! (x2100 SATA, which is now already in its second generation) You know, I'm really perplexed over that, especially given that the silicon image chips (AFAIK) aren't in any Sun product and yet they have a SATA framework driver. /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: ZFS imported simultanously on 2 systems...
James C. McPherson wrote: As I understand things, SunCluster 3.2 is expected to have support for HA-ZFS and until that version is released you will not be running in a supported configuration and so any errors you encounter are *your fault alone*. Still, after reading Mathias's description, it seems that the former node is doing an implicit forced import when it boots back up. This seems wrong to me. zpools should be imported only if the zpool itself says it's not already taken, which of course would be overridden by a manual -f import.

zpool: "sorry, i already have a boyfriend, host b"
host a: "darn, ok, maybe next time"

rather than the current scenario:

zpool: "host a, I'm over you now. host b is now the man in my life!"
host a: "I don't care! you're coming with me anyways. you'll always be mine!"
* host a stuffs zpool into the car and drives off

...and we know those situations never turn out particularly well. /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: ZFS imported simultanously on 2 systems...
On Sep 13, 2006, at 12:32 PM, Eric Schrock wrote: Storing the hostid as a last-ditch check for administrative error is a reasonable RFE - just one that we haven't yet gotten around to. Claiming that it will solve the clustering problem oversimplifies the problem and will lead to people who think they have a 'safe' homegrown failover when in reality the right sequence of actions will irrevocably corrupt their data. A hostid is handy, but it'll only tell you who MIGHT or MIGHT NOT have control of the pool. Such an RFE would be even more worthwhile if it included something such as a time stamp. This time stamp (or similar time-oriented signature) would be updated regularly (based on some internal ZFS event). If this stamp goes for an arbitrary length of time without being updated, another host in the cluster could force-import the pool on the assumption that the original host is no longer able to communicate with the zpool. This is a simple idea description, but perhaps worthwhile if you're already going to change the label structure to add the hostid. /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
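The decision rule being proposed reduces to a simple staleness check. A minimal sketch, with the label's last-activity stamp faked by a stand-in function (read_pool_stamp is hypothetical; no such stamp exists in the on-disk label today, which is the whole RFE):

```shell
# Hypothetical takeover check: force-import only when the pool's
# last-activity stamp is older than a chosen threshold.
read_pool_stamp() { echo $(( $(date +%s) - 600 )); }  # stand-in: stamp 600s old

threshold=300                # seconds of silence before takeover is allowed
now=$(date +%s)
stamp=$(read_pool_stamp)
if [ $(( now - stamp )) -gt "$threshold" ]; then
    echo "pool looks abandoned; force-import is permitted"
    # zpool import -f tank   # the actual takeover step would go here
else
    echo "pool recently active; do NOT force-import"
fi
```

As Eric cautions above, this alone is not a safe cluster heartbeat; it only narrows the window for administrative error.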
Re: [zfs-discuss] Re: ZFS imported simultanously on 2 systems...
On Sep 13, 2006, at 1:37 PM, Darren J Moffat wrote: That might be acceptable in some environments but that is going to cause disks to spin up. That will be very unacceptable in a laptop and maybe even in some energy conscious data centres. Introduce an option to 'zpool create'? Come to think of it, describing attributes for a pool seems to be lacking (unlike zfs volumes). What you are proposing sounds a lot like a cluster heartbeat which IMO really should not be implemented by writing to disks. That would be an extreme example of the use for this. While it /could/ be used as a heartbeat mechanism, it would be useful administratively:

# zpool status foopool
Pool foopool is currently imported by host.blah.com
  Import time:   4 April 2007 16:20:00
  Last activity: 23 June 2007 18:42:53
  ...

/dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] For those looking for an SATA add-on card...
I have success with one made by SIIG: http://elektronkind.org/2006/09/siig-esata-ii-pcie-card-and-opensolaris /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SATA hot plug correction
On Sep 9, 2006, at 12:04 PM, David Dyer-Bennet wrote: Thanks, that seems fairly clear. So another approach I could take is to buy one of the supported controllers, if they're available on a card I could plug in. The Silicon Image chipset is pretty popular and can be found on many SATA and eSATA cards, such as this one: http://cooldrives.com/seata1ex1inp.html The porting of ZFS to Linux may also eventually solve my problem. I'm going to try installing the nv44 I've got, just in case that might have advanced the driver state of the art. Even in nv44, the SATA framework still contains only two drivers, one for the Silicon Image sil3124 and the other for the Marvell 88SX6xxx. So as of today, you're still limited to using SATA controllers based on those two chipsets (at least they're popular chipsets for SATA add-on cards). As Frank mentioned, the nVidia CK8-04 chipset which is present in the X2100 and (I think) the Ultra 20 is conspicuously absent, as are SATA framework drivers for the nVidia MCP55 chipset which is used in the new Sun AM2-based systems. I scour the weekly putback logs for any SATA-related changes and things seem kind of quiet on that front, and naturally the bugs logged for adding these features never reflect any updates. Grumble, groan. I really do wish Sun was more vocal regarding their progress in this important area. /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS vs. Apple XRaid
On Jul 31, 2006, at 7:30 PM, Rich Teer wrote:

> On Mon, 31 Jul 2006, Dale Ghent wrote:
>> So what does this exercise leave me thinking? Is Linux 2.4.x really screwed up in NFS-land? This Solaris NFS replaces a Linux-based NFS server...
>
> Linux has had, uhhmmm (struggling to be nice), iffy NFS for ages.

Right, but I never had this speed problem when the NFS server was running Linux on hardware that had a quarter of the CPU power and half the disk I/O capacity that the new Solaris-based one has. So either Linux's NFS client was more compatible with the bugs in Linux's NFS server and ran peachy that way, or something's truly messed up with how Solaris's NFS server handles Linux NFS clients.

Mind you, all the tests I did in my previous posts were on shares served out of ZFS. I just lopped a fresh LUN off another Xserve RAID on my SAN, gave it to the NFS server and put UFS on it. Let's see if there's a difference when mounting that on the clients.

Linux NFS client mounting UFS-backed share:

=
[EMAIL PROTECTED]/$ mount -o nfsvers=3,rsize=32768,wsize=32768 ds2-private:/ufsfoo /mnt
[EMAIL PROTECTED]/$ cd /mnt
[EMAIL PROTECTED]/mnt$ time dd if=/dev/zero of=blah bs=1024k count=128
128+0 records in
128+0 records out

real    0m9.267s
user    0m0.000s
sys     0m2.480s
=

Hey, look at that! 9.2 seconds in this test. The same test with the ZFS-backed share (see previous email in this thread) took 1m 21s to complete.

Remember this same test that I did but with an NFSv2 mount, which took 36 minutes to complete on the ZFS-backed share? Let's try that here with the UFS-backed share:

=
[EMAIL PROTECTED]/$ mount -o nfsvers=2,rsize=32768,wsize=32768 ds2-private:/ufsfoo /mnt
[EMAIL PROTECTED]/$ cd /mnt
[EMAIL PROTECTED]/mnt$ time dd if=/dev/zero of=blah2 bs=1024k count=128
128+0 records in
128+0 records out

real    0m3.103s
user    0m0.000s
sys     0m2.880s
=

Three seconds vs. 36 minutes. Methinks that there's something fishy here, regardless of Linux's reputation in the NFS world. Don't get me wrong.
I love Solaris like I love taffy (and BOY do I love taffy), but there seems to be some really wonky Linux-NFS-Solaris-ZFS interaction going on that's really killing performance, and my finger so far points at Solaris. :/

/dale
Re: [zfs-discuss] ZFS vs. Apple XRaid
On Jul 31, 2006, at 8:07 PM, eric kustarz wrote:

> The 2.6.x Linux client is much nicer... one thing fixed was the client doing too many commits (which translates to fsyncs on the server). I would still recommend the Solaris client but i'm sure that's no surprise. But if you're stuck on Linux, upgrade to the latest stable 2.6.x and i'd be curious if it was better.

I'd love to be on kernel 2.6, but due to the philosophical stance towards OpenAFS of some people on the lkml list[1], moving to 2.6 is a tough call for us. But that's another story for another list. The fact is that I'm stuck on 2.4 for the time being, and I'm having problems with a Solaris/ZFS NFS server that Jan and I are not having with Solaris/UFS and (in my case) Linux/XFS NFS servers.

[1] https://lists.openafs.org/pipermail/openafs-devel/2006-July/014041.html

/dale
Re: [zfs-discuss] ZFS needs a viable backup mechanism
On Jul 9, 2006, at 12:42 PM, Richard Elling wrote:

> Ok, so I only managed data centers for 10 years. I can count on 2 fingers the times this was useful to me. It is becoming less useful over time unless your recovery disk is exactly identical to the lost disk. This may sound easy, but it isn't. In the old days, Sun put specific disk geometry information for all FRU-compatible disks, no matter who the supplier was. Since we use multiple suppliers, that enabled us to support some products which were very sensitive to the disk geometry. Needless to say, this is difficult to manage over time (and expensive). A better approach is to eliminate the need to worry about the geometry. ZFS is an example of this trend. You can now forget about those things which were painful, such as the borkenness created when you fmthard to a different disk :-) Methinks you are working too hard :-)

Right. I am working too hard. It's been a pain, but it has shaved a lot of time and uncertainty off of recovering from big problems in the past. But up until 1.5 weeks ago, ZFS in a production environ wasn't a reality (no, as much as I like it, I'm not going to use Nevada in production). Now ZFS is here in Solaris 10.

But you hooked into my point too much. My point was that keeping backups of things that normal BR systems don't touch (such as VTOCs; such as ZFS volume structure and settings) is part of the "plan for the worst, maintain for the best" ethos that I've developed over /my/ 10 years in data centers. This includes getting everything from app software to the lowest, deepest, darkest configs of RAID arrays (and now, ZFS) and whatnot back in place in as little time and with as much ease as possible. I see dicking around with 'zfs create blah; zfs set foo=bar blah' and so on as a huge time waster when trying to resurrect a system from the depths of brokenness, no matter how often or not I'll find myself in that situation. It's slow and prone to error.
> I do agree that (evil) quotas and other attributes are useful to carry with the backup, but that is no panacea either and we'll need to be able to overrule them. For example, suppose I'm consolidating two servers onto one using backups. If I apply a new quota to an existing file system, then I may go over quota -- resulting in even more manual labor.

I'm talking about nothing beyond restoring a system to the state it was in prior to a catastrophic event. I'm just talking practicality here, not the idiosyncrasies of sysadmin'ing or what's evil and what's not.

/dale
Re: [zfs-discuss] ZFS needs a viable backup mechanism
On Jul 9, 2006, at 12:32 AM, Richard Elling wrote:

> I'll call your bluff. Is a zpool create any different for backup than the original creation? Neither ufsdump nor tar-like programs do a mkfs or tunefs. In those cases, the sys admin still has to create the file system using whatever volume manager they wish. Creating a zpool is trivial by comparison. If you don't like it, then modifying a zpool on the fly afterwards is also, for most operations, quite painless.

_Huh_? I was taking the stance that ZFS is a completely different paradigm than UFS (and whatever volume management might be present underneath that) and should be treated as such. I don't accept the argument that "it wasn't in UFS, so we don't see the need for it in ZFS."

What I was getting at was a way to dump, in a human-readable but machine-parsable form (e.g. XML), the configuration of not only a zpool itself, but also the volumes within it, as well as the settings for those volumes.

Hypothetical situation: I have all my ZFS eggs in one basket (i.e., a single JBOD or RAID array). Said array tanks in such a way that 100% data loss is suffered, and it and its disks must be completely replaced. The files in the zpool(s) present on this array have been backed up using, say, Legato, so I can at least get my data back with a simple restore when the replacement array comes online.

But Legato only saw things as files and directories. It never knew that a particular directory was actually the root of a volume nested amongst other volumes. So what of the tens or hundreds of ZFS volumes that I had that data sorted into, and the individual (and perhaps highly varied) configurations of those volumes? That stuff - the metadata - sure wasn't saved by Legato. If I hadn't manually kept notes or rolled my own script to save the volume configs in my own idiosyncratic format, I would be up the proverbial creek.
So I postulated that it would be nice if one could save a zpool's entire volume configuration in one easy way and restore it just as easily if needed. Instead of:

1) Bring new hardware online
2) Create zpool and try one's best to recreate the previous volume structure and its settings (quota, compression, sharenfs, etc.)
3) Restore data from a traditional BR system (Legato, NetBackup, etc.)
4) Pray I got (2) right.
5) Play config-cleanup whack-a-mole as time goes on, as mistakes or omissions are uncovered. In all likelihood it would be the users letting me know what I missed.

...I could instead do:

1) Bring new hardware online
2) Create zpool and then 'zfs config -f zpool-volume-config-backup.xml'
3) Restore data from wherever, as in (3) above
4) Be reasonably happy knowing that the volume config is pretty close to what it should be, depending on how old the config dump is, of course. Every volume has its quotas set correctly, compression is turned on in the right places, the right volumes are shared along with their particular NFS options, and so on.

Having this feature seems like a no-brainer to me. Who cares if SVM/UFS/whatever didn't have it. ZFS is different from those. This is another area where ZFS could thumb its nose at those relative dinosaurs, feature-wise, and I argue that this is an important feature to have.

See, you're talking with a person who saves prtvtoc output of all his disks so that if a disk dies, all I need to do to recreate the dead disk's exact slice layout on the replacement drive is to run that saved output through fmthard. One second on the command line rather than spending 10 minutes hmm'ing and haw'ing around in format. ZFS seems like it would be a prime candidate for this sort of thing.

/dale
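[Editorial note: the prtvtoc/fmthard habit described above would look something like this sketch. The device names and backup paths are illustrative, not from the original post, and prtvtoc/fmthard exist only on Solaris, so those invocations are shown as comments rather than executed.]

```shell
# Sketch of the save-a-label/restore-a-label habit described above.
# Device names and paths are illustrative. On a live Solaris system:
#
#   # While the disk is healthy, save its label (slice 2 = whole disk
#   # by convention):
#   prtvtoc /dev/rdsk/c0t0d0s2 > /export/vtoc/c0t0d0
#
#   # After swapping in the replacement disk, stamp the saved layout on:
#   fmthard -s /export/vtoc/c0t0d0 /dev/rdsk/c0t0d0s2
#
# Keeping one saved label per disk turns the restore into one command.
# Below we just stage such a backup directory to show the layout:
mkdir -p /tmp/vtoc
printf '* /dev/rdsk/c0t0d0s2 partition map\n' > /tmp/vtoc/c0t0d0
ls /tmp/vtoc
```

The same pattern (dump machine-readable state now, replay it mechanically later) is exactly what the proposed 'zfs config' option would give for pool metadata.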
Re: [zfs-discuss] ZFS needs a viable backup mechanism
On Jul 7, 2006, at 1:45 PM, Bill Moore wrote:

> That said, we actually did talk to a lot of customers during the development of ZFS. The overwhelming majority of them had a backup scheme that did not involve ufsdump. I know there are folks that live and die by ufsdump, but most customers have other solutions, which generate backups just fine.

Perhaps these dev customers needed to spend a little more time with ZFS, and do it in a production environ where backups and restores are arguably a more urgent matter than in a test environment.

Regarding making things ZFS-aware, I just had a thought off the top of my head, the feasibility of which I have no idea about and will leave up to those who are in the know to decide.

ZFS, we all know, is more than a dumb fs like UFS is. As mentioned, it has metadata in the form of volume options and whatnot. So, sure, I can still use my Legato/NetBackup/Amanda and friends to back that data up... but if the worst were to happen and I found myself having to restore not only data, but the volume structure of a pool as well, then there's a huge time sink - and an important one to avoid in a production environment.

Immediately, I see a quick way to relieve this (note I did not necessarily imply resolve this): add an option to zpool(1M) to dump the pool config, as well as the configuration of the volumes within it, to an XML file. This file could then be sucked into zpool at a later date to recreate/replicate the pool and its volume structure in one fell swoop. After that, Just Add Data(tm).

/dale
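[Editorial note: the XML dump option proposed above was never part of zpool(1M), but a rough approximation is possible with `zfs get`. A minimal sketch follows; the pool name "tank" and the sample properties are invented for illustration, and the canned tab-separated data stands in for what `zfs get -H -o name,property,value -s local all tank` would emit on a real system.]

```shell
# Approximating the proposed config dump/restore with existing tools.
# On a real system, the dump side would be something like:
#   zfs get -H -o name,property,value -s local all tank > /tmp/tank.props
# Here we fabricate that tab-separated output so the sketch is
# self-contained (pool/volume names and values are invented):
printf 'tank/home\tquota\t10737418240\ntank/home\tcompression\ton\ntank/www\tsharenfs\trw\n' \
    > /tmp/tank.props

# Turn each locally-set property back into a "zfs set" command, for
# replay after the pool and its volumes have been recreated:
awk -F'\t' '{ printf "zfs set %s=%s %s\n", $2, $3, $1 }' /tmp/tank.props
```

Piping the awk output through sh would replay the settings, but the volume hierarchy itself would still need its `zfs create` calls - which is why a first-class dump/restore in zpool(1M) would be the nicer answer.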
Re: [zfs-discuss] ZFS and Storage
Torrey McMahon wrote:

> ZFS is great for the systems that can run it. However, any enterprise datacenter is going to be made up of many many hosts running many many OS. In that world you're going to consolidate on large arrays and use the features of those arrays where they cover the most ground. For example, if I've 100 hosts all running different OS and apps and I can perform my data replication and redundancy algorithms, in most cases Raid, in one spot then it will be much more cost efficient to do it there.

Exactly what I'm pondering. In the near to mid term, Solaris with ZFS can be seen as a sort of storage virtualizer, where it takes disks into ZFS pools and volumes and then presents them to other hosts and OSes via iSCSI, NFS, SMB and so on. At that point, those other OSes can enjoy the benefits of ZFS.

In the long term, it would be nice to see ZFS (or its concepts) integrated as the LUN provisioning and backing store mechanism on hardware RAID arrays themselves, supplanting the traditional RAID paradigms that have been in use for years.

/dale
Re: [zfs-discuss] Priorities
On Jun 23, 2006, at 1:09 PM, eric kustarz wrote:

>> How about it folks - would it be a good idea for me to explore what it takes to get such a bug/RFE setup implemented for the ZFS community on OpenSolaris.org?
>
> what's wrong with http://bugs.opensolaris.org/bugdatabase/index.jsp for finding bugs?

There's a LOT wrong with how b.s.o is presented. For us non-Sun people, b.s.o is a one-way ticket, and only when we're lucky.

First, yes, we can search on bug keywords and categories. Great. We used to need a Sunsolve acct for this. But once we do that, we can only hope that the bugs we want to read about in detail aren't comprised solely of "See Notes" and nothing else. It's like seeing "To be continued..." right before the climax of a movie. Useless and frustrating.

Second, while there is a way for Joe Random to submit a bug, there is zero way for Joe Random to interact with a bug. No voting to bump or drop a priority, no easy way to find hot-topic bugs, no way to add one's own notes to the issue. I guess the desperate just have to clog the system with new bugs and have them marked as dups, or badger someone with a sun.com email address to do it for them.

Third, much of end-to-end bug servicing from a non-Sun perspective is still an uphill battle, from the acronyms and terms used to the policies and coordination of work - e.g. "Is someone in Sun or elsewhere already working on this particular bug I'm interested in?" and the questions which would stem from that basic one.

In summary, the bug/RFE process is still a mystery after a year, and who knows if it'll stay the ginormous tease that it currently is. Really, it's still no better than having a Sunsolve account in years past.

/dale
Re: [zfs-discuss] ZFS questions
On Jun 16, 2006, at 11:40 PM, Richard Elling wrote:

> Kimberly Chang wrote:
>> A couple of ZFS questions: 1. ZFS dynamic striping will automatically use newly added devices when there are write requests. Customer has a *mostly read-only* application with an I/O bottleneck; they wonder if there is a ZFS command or mechanism to enable the manual rebalancing of ZFS data when adding new drives to an existing pool?
>
> cp :-) If you copy the file then the new writes will be spread across the newly added drives. It doesn't really matter how you do the copy, though.

She raises an interesting point, though. The concept of shifting blocks in a zpool around in the background as part of a scrubbing process, and/or on the order of an explicit command to populate newly added devices, seems like it could be right up ZFS's alley. Perhaps it could also be done with volume-level granularity.

Off the top of my head, an area where this would be useful is performance management - e.g. relieving load on a particular FC interconnect or an overburdened RAID array controller/cache, thus allowing total no-downtime-to-cp-data-around flexibility when one is horizontally scaling storage performance.

/dale
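[Editorial note: until such a rebalance command exists, Richard's cp suggestion can be scripted: rewriting a file gives it freshly allocated blocks, which ZFS's dynamic striping then spreads across the enlarged pool. A minimal sketch follows; the scratch directory stands in for a real dataset, and on a real pool you would also want to check free space and quiesce writers first.]

```shell
# Manual "rebalance" by rewriting each file, per Richard's cp trick:
# a fresh copy gets newly allocated blocks, which land on the new vdevs.
# /tmp/rebalance-demo stands in for a real dataset directory here.
demo=/tmp/rebalance-demo
mkdir -p "$demo"
printf 'some data\n' > "$demo/file1"

for f in "$demo"/*; do
    # Copy (preserving mode/times), then atomically replace the original.
    cp -p "$f" "$f.rewrite"
    mv "$f.rewrite" "$f"
done
cat "$demo/file1"
```

Note that this needs transient space for each copy and unshares any blocks held by snapshots, which is part of why a native, background block-shuffling scrub would be the better long-term answer.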