Re: [zfs-discuss] ZFS ... open source moving forward?
On Sun, Dec 12, 2010 at 00:17:08 +0100, Joerg Schilling wrote:
: If you have substantial information on why NetApp may rightfully own a patent
: that is essential for ZFS, I would be interested to get this information.

Trivial: the US patent system is fundamentally broken, so owning patents on more or less anything is possible, whether enforceable or not.  The act of defending against an invalid patent costs a fortune, so most entities aren't willing to try.  Easier to avoid.

You know this, I'm sure.

-- Dickon Hood
Re: [zfs-discuss] ZFS ... open source moving forward?
On Sat, Dec 11, 2010 at 13:22:28 -0500, Miles Nordin wrote:
: The only thing missing is ZFS.  To me it looks like a good replacement
: for that is years away.  I'm not excited about ocfs, or about kernel
: module ZFS ports taking advantage of the Linus kmod ``interpretation''
: and the grub GPLv3 patent protection.

I'm of the opinion that that's a nice hack that Oracle won't object to, right up until some other project decides to try and use it.

IANAL, don't work for Oracle, never worked for Sun, and have no financial interest in the outcome, and that's nothing but a wild guess, but I'd love someone to take the codebase and produce something commercial with it.  I'll just stand back and watch, from a safe distance.  It'll be worth it.  I'm sure I'd learn a lot.

-- Dickon Hood
Re: [zfs-discuss] Odd prioritisation issues.
On Wed, Dec 12, 2007 at 10:27:56 +0100, Roch - PAE wrote:
: O_DSYNC was a good idea.  Then if you have recent Nevada you
: can use the separate intent log (log keyword in zpool
: create) to absorb those writes without having spindle
: competition with the reads.  Your write workload should then
: be well handled here (unless the incoming network processing
: is itself delayed).

Thanks for the suggestion -- I'll see if we can give that a go.

-- Dickon Hood
Re: [zfs-discuss] Odd prioritisation issues.
On Fri, Dec 07, 2007 at 13:14:56 +0000, I wrote:
: On Fri, Dec 07, 2007 at 12:58:17 +0000, Darren J Moffat wrote:
: : Dickon Hood wrote:
: : >On Fri, Dec 07, 2007 at 12:38:11 +0000, Darren J Moffat wrote:
: : >: Dickon Hood wrote:
: : >: >We're seeing the writes stall in favour of the reads.  For normal
: : >: >workloads I can understand the reasons, but I was under the impression
: : >: >that real-time processes essentially trump all others, and I'm surprised
: : >: >by this behaviour; I had a dozen or so RT-processes sat waiting for disc
: : >: >for about 20s.
: : >: Are the files opened with O_DSYNC or does the application call fsync ?
: : >No.  O_WRONLY|O_CREAT|O_LARGEFILE|O_APPEND.  Would that help?
: : Don't know if it will help, but it will be different :-).  I suspected
: : that since you put the processes in the RT class you would also be doing
: : synchronous writes.
: Right.  I'll let you know on Monday; I'll need to restart it in the
: morning.

I was a tad busy yesterday and didn't have the time, but I've switched one of our recorder processes (the one doing the HD stream; ~17Mb/s, broadcasting a preview we don't mind trashing) to a version of the code which opens its file O_DSYNC as suggested.

We've gone from ~130 write ops per second and 10MB/s to ~450 write ops per second and 27MB/s, with marginally higher CPU usage.  This is roughly what I'd expect.

We've artificially throttled the reads, which has helped (but not fixed; it isn't as determinative as we'd like) the starvation problem, at the expense of increasing a latency we'd rather have as close to zero as possible.

Any ideas?

Thanks.

-- Dickon Hood
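The change described here amounts to adding O_DSYNC to the open(2) flags quoted earlier in the thread.  A minimal sketch, assuming a simple stdin-to-file writer -- the path, buffer size, and loop are hypothetical illustrations, not the recorder's actual code:

    /*
     * Sketch only: original flags plus O_DSYNC, so each write(2) returns
     * once the data is on stable storage.  Path and buffer size invented.
     */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* Original flags were O_WRONLY|O_CREAT|O_LARGEFILE|O_APPEND.
         * O_LARGEFILE may need _GNU_SOURCE on Linux; it's standard on Solaris. */
        int fd = open("/content/hd-preview.ts",
                      O_WRONLY | O_CREAT | O_LARGEFILE | O_APPEND | O_DSYNC,
                      0644);
        if (fd == -1) {
            perror("open");
            return 1;
        }

        char buf[188 * 7];              /* a few DVB TS packets per write */
        ssize_t n;
        while ((n = read(STDIN_FILENO, buf, sizeof buf)) > 0) {
            /* Synchronous: data hits the ZIL (or slog) before write() returns. */
            if (write(fd, buf, (size_t)n) != n) {
                perror("write");
                break;
            }
        }
        close(fd);
        return 0;
    }

Because every write now waits for stable storage, the op rate goes up as seen above, and a separate intent log device (the suggestion later in the thread) helps by taking those synchronous writes off the data spindles.
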
Re: [zfs-discuss] Odd prioritisation issues.
On Fri, Dec 07, 2007 at 05:27:25 -0800, Anton B. Rang wrote:
: > I was under the impression that real-time processes essentially trump all
: > others, and I'm surprised by this behaviour; I had a dozen or so RT-processes
: > sat waiting for disc for about 20s.
: Process priorities on Solaris affect CPU scheduling, but not (currently)
: I/O scheduling nor memory usage.

Ah, hmm.  I hadn't appreciated that.  I'm surprised.

: > * Is this a ZFS issue?  Would we be better using another filesystem?
: It is a ZFS issue, though depending on your I/O patterns, you might be
: able to see similar starvation on other file systems.  In general, other
: file systems issue I/O independently, so on average each process will
: make roughly equal forward progress on a continuous basis.  You still
: don't have guaranteed I/O rates (in the sense that XFS on SGI, for
: instance, provides).

That would make sense.  I've not seen this before on any other filesystem.

: > * Is there any way to mitigate against it?  Reduce the number of iops
: >   available for reading, say?
: > * Is there any way to disable or invert this behaviour?
: I'll let the ZFS developers tackle this one.
: ---
: Have you considered using two systems (or two virtual systems) to ensure
: that the writer isn't affected by reads?  Some QFS customers use this
: configuration, with one system writing to disk and another system
: reading from the same disk.  This requires the use of a SAN file system,
: but it provides the potential for much greater (and controllable)
: throughput.  If your I/O needs are modest (less than a few GB/second),
: this is overkill.

We're writing (currently) about 10MB/s; this may rise to about double that if we add the other multiplexes.

We're taking the BBC's DVB content off-air, splitting it into programme chunks, and moving it from the machine that's doing the recording to a filestore.  As it's off-air streams, we have no control over the inbound data -- it just arrives whether we like it or not.  We do control the movement from the recorder to the filestore, but as this is largely achieved via a Perl module calling sendfile(), even that's mostly out of our hands.

Definitely a headscratcher.

-- Dickon Hood
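For illustration, the sendfile() call that the Perl module ultimately issues looks roughly like the wrapper below -- the function name, descriptors, and chunk length are hypothetical, and this is a sketch of the underlying syscall rather than the module's actual code:

    /*
     * Sketch of an in-kernel file-to-socket copy via sendfile().
     * On Solaris this is sendfile(3EXT) from libsendfile (link with
     * -lsendfile); Linux provides a syscall with the same prototype.
     * Because the kernel drives the reads, the application has little
     * say in the resulting read pattern on the pool.
     */
    #include <sys/sendfile.h>
    #include <stdio.h>

    ssize_t move_chunk(int sock_fd, int file_fd, off_t *offset, size_t len)
    {
        ssize_t sent = sendfile(sock_fd, file_fd, offset, len);
        if (sent == -1)
            perror("sendfile");
        return sent;
    }
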
Re: [zfs-discuss] Odd prioritisation issues.
On Fri, Dec 07, 2007 at 12:58:17 +0000, Darren J Moffat wrote:
: Dickon Hood wrote:
: >On Fri, Dec 07, 2007 at 12:38:11 +0000, Darren J Moffat wrote:
: >: Dickon Hood wrote:
: >: >We've got an interesting application which involves receiving lots of
: >: >multicast groups, and writing the data to disc as a cache.  We're
: >: >currently using ZFS for this cache, as we're potentially dealing with a
: >: >couple of TB at a time.
: >: >The threads writing to the filesystem have real-time SCHED_FIFO priorities
: >: >set to 25.  The processes recovering data from the cache and moving it
: >: >elsewhere are niced at +10.
: >: >We're seeing the writes stall in favour of the reads.  For normal
: >: >workloads I can understand the reasons, but I was under the impression
: >: >that real-time processes essentially trump all others, and I'm surprised
: >: >by this behaviour; I had a dozen or so RT-processes sat waiting for disc
: >: >for about 20s.
: >: Are the files opened with O_DSYNC or does the application call fsync ?
: >No.  O_WRONLY|O_CREAT|O_LARGEFILE|O_APPEND.  Would that help?
: Don't know if it will help, but it will be different :-).  I suspected
: that since you put the processes in the RT class you would also be doing
: synchronous writes.

Right.  I'll let you know on Monday; I'll need to restart it in the morning.

I put the processes in the RT class as without it they dropped packets once in a while, especially on lesser hardware (a Netra T1 can't cope without, a Niagara usually can...).  Very odd.

: If you can test this it may be worth doing so for the sake of gathering
: another data point.

Noted.  I suspect (from reading the man pages) it won't make much difference, as to my mind it looks like a scheduling issue.

Just for interest's sake: when everything is behaving normally and we're writing only, 'zpool iostat 10' looks like:

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
content     56.9G  2.66T      0    118      0  9.64M

whilst reading and writing it normally looks like:

content     69.8G  2.65T    435    103  54.3M  9.63M

and when everything breaks, it looks like:

content      119G  2.60T    564      0  66.3M      0

prstat usually shows processes idling, at priority 125 for a moment, and other behaviour that I'd expect.  When it all breaks, I get most of them sat at priority 125 thumb-twiddling.

Perplexing.

-- Dickon Hood
Re: [zfs-discuss] Odd prioritisation issues.
On Fri, Dec 07, 2007 at 12:38:11 +0000, Darren J Moffat wrote:
: Dickon Hood wrote:
: >We've got an interesting application which involves receiving lots of
: >multicast groups, and writing the data to disc as a cache.  We're
: >currently using ZFS for this cache, as we're potentially dealing with a
: >couple of TB at a time.
: >The threads writing to the filesystem have real-time SCHED_FIFO priorities
: >set to 25.  The processes recovering data from the cache and moving it
: >elsewhere are niced at +10.
: >We're seeing the writes stall in favour of the reads.  For normal
: >workloads I can understand the reasons, but I was under the impression
: >that real-time processes essentially trump all others, and I'm surprised
: >by this behaviour; I had a dozen or so RT-processes sat waiting for disc
: >for about 20s.
: Are the files opened with O_DSYNC or does the application call fsync ?

No.  O_WRONLY|O_CREAT|O_LARGEFILE|O_APPEND.  Would that help?

-- Dickon Hood
[zfs-discuss] Odd prioritisation issues.
We've got an interesting application which involves receiving lots of multicast groups, and writing the data to disc as a cache.  We're currently using ZFS for this cache, as we're potentially dealing with a couple of TB at a time.

The threads writing to the filesystem have real-time SCHED_FIFO priorities set to 25.  The processes recovering data from the cache and moving it elsewhere are niced at +10.

We're seeing the writes stall in favour of the reads.  For normal workloads I can understand the reasons, but I was under the impression that real-time processes essentially trump all others, and I'm surprised by this behaviour; I had a dozen or so RT-processes sat waiting for disc for about 20s.

My questions:

* Is this a ZFS issue?  Would we be better using another filesystem?
* Is there any way to mitigate against it?  Reduce the number of iops available for reading, say?
* Is there any way to disable or invert this behaviour?
* Is this a bug, or should it be considered one?

Thanks.

-- Dickon Hood
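For context, placing a writer thread in the real-time FIFO class at priority 25, as described above, would typically be done with something like the POSIX call below (a minimal sketch; the actual recorder code isn't shown in the thread, and on Solaris the same effect can also be had from the shell with priocntl(1)):

    /*
     * Sketch: put the calling thread into SCHED_FIFO at a given priority.
     * Requires sufficient privilege; error handling kept minimal.
     */
    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>
    #include <string.h>

    static int make_realtime(int priority)
    {
        struct sched_param param;
        memset(&param, 0, sizeof param);
        param.sched_priority = priority;        /* 25 in the scenario above */

        int err = pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
        if (err != 0) {
            fprintf(stderr, "pthread_setschedparam: %s\n", strerror(err));
            return -1;
        }
        return 0;
    }

As the replies elsewhere in the thread point out, this affects CPU scheduling only on Solaris; it does nothing to prioritise the thread's I/O, which is why the RT writers can still end up waiting on the disc.
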
Re: [zfs-discuss] pls discontinue troll bait was: Yager on ZFS and
On Sun, Nov 18, 2007 at 12:15:25 -0800, can you guess? wrote:
: > Big talk from someone who seems so intent on hiding
: > their credentials.
: Say, what?  Not that credentials mean much to me since I evaluate people
: on their actual merit, but I've not been shy about who I am (when I
: responded 'can you guess?' in registering after giving billtodd as my
: member name I was being facetious).

You're using a web-based interface to a mailing list, and the 'billtodd' bit doesn't appear to any users (such as me) subscribed via that mechanism.  So yes, 'can you guess?' is unhelpful and makes you look as if you're being deliberately unhelpful.  OK, it's in your email address, but various broken MUAs (Outlook and derivatives for one) fail to show that.

: If you're still referring to your incompetent alleged research, [...]
: [...] right out of the
: same orifice from which you've pulled the rest of your crap.

It's language like that that is causing the problem.  IMHO you're being a tad rude.  This doesn't help anybody.

-- Dickon Hood
Re: [zfs-discuss] Yager on ZFS
On Fri, Nov 09, 2007 at 21:34:35 +0100, Joerg Schilling wrote:
: Dickon Hood <[EMAIL PROTECTED]> wrote:
: > ZFS would be lovely.  Pity about the licence issues.
: There is no license issue: the CDDL allows a combination
: with any other license, and the GPL does not forbid a GPL
: project to use code under other licenses in case that the
: non-GPL code does not become a derived work of the GPL
: code.

I happen to agree with you, but unfortunately those in charge of the kernel don't.  Licence politics are annoying, but still fall under 'licence issues' in my book.

: In case of a filesystem, I do not see why the filesystem could
: be a derived work from e.g. Linux.

Indeed not; however, AIUI the FSF do.

: If people did like, they could use ZFS in Linux and nobody would
: complain.

I can't see why it isn't possible to maintain an out-of-tree implementation -- after all, the issues with mixing GPL and non-GPL code only come about on redistribution -- but I don't see anyone doing this.

I'd give it a bash myself, but I have time issues at the moment, and as my knowledge of kernel internals (of any Unixoid) is rather lacking, it would involve quite a learning curve.  Pity.

: The problem is in the first priority politics, then technical problems.

Agreed.

-- Dickon Hood
Re: [zfs-discuss] Yager on ZFS
On Fri, Nov 09, 2007 at 12:11:48 -0700, Jason J. W. Williams wrote:
: I'm somewhat surprised it's being used as
: a counterexample of journaling filesystems being no less reliable than
: ZFS.  XFS or ReiserFS are both better examples than ext3.

I tend to use XFS on my Linux boxes because of it.

ReiserFS I consider dangerous: if merely having an image of a ReiserFS filesystem on a ReiserFS filesystem is enough for fsck to screw everything up, it doesn't pass my 'good gods, just *what* were they *thinking*' test.  I'd use XFS or JFS over the others, any day.

ZFS would be lovely.  Pity about the licence issues.

-- Dickon Hood
Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS
On Thu, Sep 20, 2007 at 16:22:45 -0500, Gary Mills wrote:
: You should consider a Netapp filer.  It will do both NFS and CIFS,
: supports disk quotas, and is highly reliable.  We use one for 30,000
: students and 3000 employees.  Ours has never failed us.

And they might only lightly sue you for contemplating ZFS if you're really, really lucky...

-- Dickon Hood
Re: [zfs-discuss] Single SAN Lun presented to 4 Hosts
On Sat, Aug 25, 2007 at 12:36:34 -0700, Matt B wrote:
: I'm not sure what you mean

I think what he's trying to tell you is that you need to consult a storage expert.

-- Dickon Hood
[zfs-discuss] Odd behaviour with heavy workloads.
% CPU according to prstat.  This is an 8-core T1000.  The reading process (Perl) chews 3.5% or so, which I make to be one thread plus a bit that can be parallelised automatically.

The reader appears to be CPU-bound, which also concerns me; when we unbind it, I'm expecting this problem to get worse.

My questions are:

* Is this what I should expect?
* Why?  I'd've thought the extensive caching the filesystem does would sort this out for me.
* Is there any way around it that doesn't involve editing the code?

Thank you for your time.

-- Dickon Hood
Re: [zfs-discuss] Btrfs, COW for Linux [somewhat OT]
On Thu, Jun 14, 2007 at 17:19:18 -0700, Frank Cusack wrote:
: anyway, my point is that i didn't think COW was in and of itself a feature
: a home or SOHO user would really care about.  it's more an implementation
: detail of zfs than a feature.  i'm sure this is arguable.

I'm really not sure I agree with you, given the way you've put it.

Yes, the average user doesn't give a damn whether his filesystem is copy-on-write, journalled, or none of the above if you ask him.  Your average user *does* care, however, when faced with the consequences, and *will* complain when things break.

CoW does appear to have some distinct advantages over other methods when faced with the unreliable hardware we all have to deal with these days.  When software makes it better, everybody wins.

Now all I need is a T2000 that behaves correctly rather than a T2000 that behaves as if it's missing the magic in /etc/system that it has...

-- Dickon Hood
Re: [zfs-discuss] slow sync on zfs
On Mon, Apr 23, 2007 at 17:43:31 -0400, Torrey McMahon wrote:
: Dickon Hood wrote:
: >[snip]
: >I'm currently playing with ZFS on a T2000 with 24x500GB SATA discs in an
: >external array that presents as SCSI.  After having much 'fun' with the
: >Solaris SCSI driver not handling LUNs >2TB
: That should work if you have the latest KJP and friends.  (Actually, it
: should have been working for a while, so if not) What release are you on?

Google suggested it may or may not, depending on how lucky I was.  I assume I was just unlucky, or didn't find the correct set of patches.  Actually I thought I had at one point, but writes after the first 2TB returned IO errors.

I tried every recentish version on our Jumpstart server: 0305, 0606, and 1106, with the latest 10_Recommended patch cluster or not, and with various other sd patches I could find.  Which versions I couldn't honestly say; I gave up.

1106 out of the box won't even see the SCSI card with a 2TB LUN, which has some interesting side effects when installing: the expansion cards appear first, and if it can't see it, suddenly your boot devices change once patched.  I got one combination -- sorry, I don't recall which, but I think it was a 0606 with a patch -- to see the device, but as I say, writes to >2TB fail with an IO error.  This is unhelpful.

I gave up and restructured it to export all the discs, as I said.  AIUI, that's better for ZFS anyway.

-- Dickon Hood
Re: [zfs-discuss] slow sync on zfs
On Mon, Apr 23, 2007 at 20:27:56 +0100, Peter Tribble wrote:
: On 4/23/07, Robert Milkowski <[EMAIL PROTECTED]> wrote:
: >Relatively low traffic to the pool, but sync takes too long to complete
: >and other operations are also not that fast.
: >Disks are on a 3510 array.  zil_disable=1.
: >bash-3.00# ptime sync
: >real     1:21.569
: >user        0.001
: >sys         0.027
: Hey, that is *quick*!
: On Friday afternoon I typed sync mid-afternoon.  Nothing had happened a
: couple of hours later when I went home.  It looked as though it had
: finished by 11pm, when I checked in from home.
: This was on a thumper running S10U3.  As far as I could tell, all writes
: to the pool stopped completely.  There were applications trying to write,
: but they had just stopped (and picked up later in the evening).  A fairly
: consistent few hundred K per second of reads; no writes; and pretty low
: system load.

I'm glad I'm not the only one to have seen this.

I'm currently playing with ZFS on a T2000 with 24x500GB SATA discs in an external array that presents as SCSI.  After having much 'fun' with the Solaris SCSI driver not handling LUNs >2TB, I reconfigured the array to present as one target with 24 LUNs, one per disc, and threw ZFS at it in a raidz2 configuration.  I admit this isn't optimal, but it has the behaviour I wanted: namely lots of space with a little redundancy for safety.

Having had said 'fun' with the SD driver, I thought I'd thoroughly check large object handling, and started eight 'dd if=/dev/zero's before retiring to the pub and leaving it overnight.  The next morning, I discovered a bunch of rather large files, 340GB in size.

Everything seemed OK, so I issued an 'rm *', expecting it to return rather quickly.  How wrong I was.  It took a minute (61s from memory) to delete a single 320GB file, which flattened the SCSI bus issuing 4.5MB/s/disc reads (as reported by iostat -x), during which time all writes were suspended.  This is not good.  Once that had finished, a 'ptime sync' sat for 25 minutes running at about 1MB/s/disc.  Again, all reads.

Given what I intend to use this filesystem for -- dropping all the BBC's Freeview muxes to disc in 24-hour chunks -- performance on large objects is rather important to me.  I've reconfigured to 3x(7+1) raidz, and this has helped a lot (as I expected it would), but it's still not great having multi-second write stalls when deleting 16GB objects.  100MB/s write speed and 200MB/s read speed isn't bad, though.  Quite impressed with that.

: It did recover, but write latencies of a few hours is rather undesirable.

To put it mildly.

: What on earth was it doing?

I wish I knew.

Anyone any ideas on how to optimise it further?  I'm using the defaults (whatever's created by an 8GB RAM T2000 with 8 1GHz cores); no compression, no nothing.

-- Dickon Hood