Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
2010/12/24 Edward Ned Harvey opensolarisisdeadlongliveopensola...@nedharvey.com:
>> From: Frank Lahm [mailto:frankl...@googlemail.com]
>>
>> With Netatalk for AFP he _is_ running a database: any AFP server needs to
>> maintain a consistent mapping between _not reused_ catalog node ids (CNIDs)
>> and filesystem objects. Luckily for Apple, HFS[+] and their Cocoa/Carbon
>> APIs provide such a mapping, making direct use of HFS+ CNIDs.
>> Unfortunately, most UNIX filesystems reuse inodes and have no API for
>> mapping inodes to filesystem objects. Therefore all AFP servers running on
>> non-Apple OSen maintain a database providing this mapping; in the case of
>> Netatalk it's `cnid_dbd` using a BerkeleyDB database.
>
> Don't all of those concerns disappear in the event of a reboot? If you stop
> AFP, you could completely obliterate the BDB database, restart AFP, and
> functionally continue from where you left off. Right?

No. Apple's APIs provide semantics by which you can reference filesystem
objects by their parent directory CNID + object name. More important in this
context: these references can be stored, retrieved and reused, e.g. Finder
Aliases, Adobe InDesign and many other applications use these semantics to
store references to files. If you nuke the CNID database, upon re-enumeration
of the volumes all filesystem objects are likely to be assigned new and
different CNIDs, thus all stored references are broken.

-f
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
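The never-reuse requirement Frank describes can be sketched in a few lines. This is a toy in-memory model, not Netatalk's actual `cnid_dbd` (the `CnidDb` class and `lookup_or_assign` name are invented for illustration):

```python
# Toy sketch of why a persistent CNID database is needed on UNIX
# filesystems: inode numbers can be recycled, but AFP clients may hold
# (parent CNID, name) references indefinitely, so a CNID must be
# allocated once and never handed out again.

class CnidDb:
    def __init__(self):
        self.next_cnid = 17          # low CNIDs are reserved (illustrative)
        self.by_key = {}             # (device, inode) -> CNID
        self.by_cnid = {}            # CNID -> (parent_cnid, name)

    def lookup_or_assign(self, dev, inode, parent_cnid, name):
        key = (dev, inode)
        if key not in self.by_key:
            self.by_key[key] = self.next_cnid
            self.next_cnid += 1      # monotonically increasing, never reused
        cnid = self.by_key[key]
        self.by_cnid[cnid] = (parent_cnid, name)
        return cnid

db = CnidDb()
a = db.lookup_or_assign(1, 1001, 2, "report.indd")
# If inode 1001 is freed and recycled for a new file, the new file must
# get a fresh CNID -- the old CNID stays reserved forever:
db.by_key.pop((1, 1001))             # file deleted, inode recycled
b = db.lookup_or_assign(1, 1001, 2, "other.txt")
assert a != b
```

Because this mapping lives only in process memory, restarting the toy would hand out different CNIDs for the same files, which is exactly the breakage Frank describes and why `cnid_dbd` persists the mapping in BerkeleyDB.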
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
> From: Frank Lahm [mailto:frankl...@googlemail.com]
>
>> Don't all of those concerns disappear in the event of a reboot? If you
>> stop AFP, you could completely obliterate the BDB database, restart AFP,
>> and functionally continue from where you left off. Right?
>
> No. Apple's APIs provide semantics by which you can reference filesystem
> objects by their parent directory CNID + object name. [...] If you nuke
> the CNID database, upon re-enumeration of the volumes all filesystem
> objects are likely to be assigned new and different CNIDs, thus all
> references are broken.

Just like... if you shut down your Apple OS X AFP file server, moved all the
files to a new upgraded file server, reassigned the old IP address and DNS
name to the new server, and enabled AFP file services on the new file server.

How do people handle the broken-links issue when they upgrade their Apple
server? If they don't bother doing anything about it, I would conclude it's
no big deal. If there is instead some process you're supposed to follow when
you upgrade/replace your Apple AFP file server, I wonder if that process is
applicable to the present thread of discussion as well.
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
Am 02.01.11 16:52, schrieb Edward Ned Harvey:
> How do people handle the broken-links issue when they upgrade their Apple
> server? If they don't bother doing anything about it, I would conclude it's
> no big deal. If there is instead some process you're supposed to follow
> when you upgrade/replace your Apple AFP file server, I wonder if that
> process is applicable to the present thread of discussion as well.

Well… on the Apple platform HFS+ (the Mac's default filesystem) takes care of
that, so you'd never have to worry about this issue there. On the *nix side
of things, when running Netatalk, you have to store this information in some
kind of extra database, which is BDB in this case.

Initially, I only wanted to check what hardware to get for my ZIL, and by now
I have already decided - and ordered - two Vertex 2 EX 50GB SSDs to handle
the ZIL for my zpool, since I am already serving 50 AFP sharepoints which are
accessed by 120 clients. The number of sharepoints will eventually rise to
250 and the number of clients to 450, and that would cause some real random
workload on the zpool and the ZIL, I guess. The technical discussion about
short stroking is nevertheless very interesting. ;)
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
> From: Kevin Walker [mailto:indigoskywal...@gmail.com]
>
> You do seem to misunderstand ZIL.

Wrong.

> ZIL is quite simply write cache

The ZIL is not simply a write cache; rather, it enables certain types of
operations to use the write cache which otherwise would have been ineligible.
The intent log is where ZFS immediately writes sync-write requests, so it can
unblock the process which called write(). Once the data has been committed to
nonvolatile ZIL storage, the process can continue, and ZFS can treat the
write requests as async writes. That means, after ZFS has written the ZIL,
the data is able to stay a while in the RAM write buffer along with all the
async writes, so ZFS is able to aggregate and optimize all the writes for
best performance. This means the ZIL is highly sensitive to access times
(seek + rotational latency).

> using a short stroked rotating drive is never going to provide a
> performance increase that is worth talking about

If you don't add a dedicated log device, the ZIL uses blocks from the main
storage pool, and all sync writes suddenly get higher priority than all the
queued reads and async writes. If you have a busy storage pool, your sync
writes might see something like 20ms access times (seek + latency) before
they can hit nonvolatile storage, and every time this happens, some other
operation gets delayed.

If you add a spindle-drive dedicated log device, that drive is always idle
except when writing ZIL for sync writes, and the head barely moves over the
platter because all the ZIL blocks are clustered tightly together. So the ZIL
might typically see 2ms or 3ms access times (negligible seek, or 1ms seek +
2ms rotational latency), which is an order of magnitude better than before.
Plus, the sync writes in this case don't take performance away from the main
pool's reads and writes.

If you replace your spindle drive with an SSD, you get another order of
magnitude smaller access time. (Tens of thousands of IOPS, i.e. well under
1ms of access time per op.)

If you disable your ZIL completely, you get another order of magnitude
smaller access time. (A few ns to put the data directly into the RAM write
buffer and bypass the ZIL entirely.)

> and more importantly ZIL was designed to be used with a RAM/Solid State
> Disk.

I hope you mean NVRAM or battery-backed RAM of some kind. Because if you use
volatile RAM for the ZIL, then you have prevented the ZIL from being able to
function correctly. The ZFS Best Practices Guide specifically mentions
"Better performance might be possible by using [...], or even a dedicated
spindle disk."
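The latency ladder above translates into rough per-thread ceilings on sync writes, since one fsync() can complete per log-device access. A back-of-the-envelope sketch using the approximate access times quoted in this post, not measured figures:

```python
# Rough single-threaded sync-write ceiling: max ops/sec ~= 1 / access_time.
# The access times are the approximate figures from the discussion.
ladder = [
    ("ZIL on busy main pool",      20e-3),   # ~20 ms seek + queueing
    ("dedicated spindle log",       3e-3),   # ~1 ms seek + 2 ms rotation
    ("dedicated SSD log",         0.1e-3),   # ~0.1 ms per op
    ("ZIL disabled (RAM only)",   100e-9),   # ~100 ns, but unsafe on crash
]
for name, t in ladder:
    print(f"{name:26s} ~{1.0 / t:>12,.0f} sync writes/sec per thread")
```

Each rung is roughly an order of magnitude apart, which matches the "order of magnitude better" phrasing in the post.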
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
> From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us]
> Sent: Tuesday, December 28, 2010 9:23 PM
>
>> The question of IOPS here is relevant to the conversation because of the
>> ZIL dedicated log. If you have advanced short-stroking to get the write
>> latency of a log device down to zero, then it can compete against SSD for
>> purposes of a log device, but nobody seems to believe such technology
>> currently exists, and it certainly couldn't compete against SSD for
>> random reads. (The ZIL log is the only situation I know of where write
>> performance of a drive matters and read performance does not.)
>
> It seems that you may be confused. For the ZIL, the drive's rotational
> latency (based on RPM) is the dominating factor, and not the lateral head
> seek time on the media. In this case, the short-stroking you are talking
> about does not help any. The ZIL is already effectively short-stroking
> since it writes in order.

Nope. I'm not confused at all. I'm making a distinction between short
stroking and "advanced" short stroking. Simple short stroking does as you
said - it eliminates the head seek time but is still susceptible to
rotational latency. As you said, the ZIL already effectively accomplishes
that end result, provided a dedicated spindle disk for the log device, but it
does not do so if your ZIL is on the pool storage. What I'm calling
"advanced" short stroking is any technique that effectively eliminates, or
minimizes to near-zero, both seek and rotational latency. It doesn't exist as
far as I know, but is theoretically possible through either special disk
hardware or special drivers.
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
You do seem to misunderstand ZIL.

ZIL is quite simply write cache, and using a short-stroked rotating drive is
never going to provide a performance increase that is worth talking about;
more importantly, ZIL was designed to be used with a RAM/solid-state disk.

We use SATA2 *HyperDrive5* RAM disks in mirrors and they work well, are far
cheaper than STEC or other enterprise SSDs, and have none of the issues
related to TRIM... Highly recommended... ;-)

http://www.hyperossystems.co.uk/

Kevin

On 29 December 2010 13:40, Edward Ned Harvey
opensolarisisdeadlongliveopensola...@nedharvey.com wrote:
>> From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us]
>> [...]
>> It seems that you may be confused. For the ZIL the drive's rotational
>> latency (based on RPM) is the dominating factor and not the lateral head
>> seek time on the media. [...]
>
> Nope. I'm not confused at all. I'm making a distinction between short
> stroking and "advanced" short stroking. [...]
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
HyperDrive5 = ACard ANS9010

I have personally been wanting to try one of these for some time as a ZIL
device.

On 12/29/2010 06:35 PM, Kevin Walker wrote:
> You do seem to misunderstand ZIL. ZIL is quite simply write cache [...]
>
> We use SATA2 *HyperDrive5* RAM disks in mirrors and they work well and are
> far cheaper than STEC or other enterprise SSDs and have none of the issues
> related to TRIM... Highly recommended... ;-)
>
> http://www.hyperossystems.co.uk/
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
I do the same with ACARD… Works well enough.

Fred

From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Jason Warr
Sent: Thursday, December 30, 2010 8:56
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

> HyperDrive5 = ACard ANS9010
>
> I have personally been wanting to try one of these for some time as a ZIL
> device. [...]
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
On 12/29/2010 4:55 PM, Jason Warr wrote:
> HyperDrive5 = ACard ANS9010
>
> I have personally been wanting to try one of these for some time as a ZIL
> device.

Yes, but do remember these require a half-height 5.25" drive bay, and you
really, really should buy the extra CF card for backup.

Also, stay away from the ANS-9010S with the LVD SCSI interface. As (I think)
Bob pointed out a long time ago, parallel SCSI isn't good for high-IOPS
workloads. It (the LVD interface) will throttle long before the drive
does...

I've been waiting for them to come out with a 3.5" version, one which I can
plug directly into a standard 3.5" SAS/SATA hotswap bay...

And, of course, the ANS9010 is limited to the SATA2 interface speed, so it
is cheaper and lower-performing (but still better than an SSD) than the
DDRdrive.

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
Had not even noticed the LVD version.

The biggest issue for me is not the form factor but how hard it would be to
get the client I work for to accept them in the environment, given support
issues.

----- Reply message -----
From: Erik Trimble erik.trim...@oracle.com
Date: Wed, Dec 29, 2010 19:52
Subject: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
To: Jason Warr ja...@warr.net
Cc: zfs-discuss@opensolaris.org
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Edward Ned Harvey
>
> Ok, what we've hit here is two people using the same word to talk about
> different things. Apples to oranges, as it were. Both meanings of IOPS are
> ok, but context is everything. There are drive random IOPS, which depend
> on latency and seek time, and there is also measured random IOPS above the
> filesystem layer, which is not always related to latency or seek time, as
> described above.

In any event, the relevant points are:

The question of IOPS here is relevant to the conversation because of the ZIL
dedicated log. If you have "advanced" short-stroking to get the write latency
of a log device down to zero, then it can compete against SSD for purposes of
a log device, but nobody seems to believe such technology currently exists,
and it certainly couldn't compete against SSD for random reads. (The ZIL log
is the only situation I know of where write performance of a drive matters
and read performance does not.)

If using ZFS for AFP (and consequently BDB)... If you disable the ZIL you
will have maximum performance, but maybe you're not comfortable with that
because you're not convinced of stability with the ZIL disabled, or for
other reasons.

* If you put your BDB or ZIL on a dedicated spindle device, it will perform
better than having no dedicated device, but the difference might be anything
from 1x to 10x, depending on where your bottlenecks are. I.e. no improvement
is guaranteed, but you probably get at least a little.
* If you put your BDB or ZIL on a dedicated SSD log device, it will perform
still better, and again the difference could be anywhere from 1x to 10x,
depending on your bottlenecks.
* If you disable your ZIL, it will perform still better, and again the
difference could be anywhere from 1x to 10x. Realistically, at some point
you'll hit a network bottleneck and won't notice the improved performance.

If you're just doing small numbers of large files, none of the above will
probably be noticeable, because in that case latency is pretty much
irrelevant. But assuming you have at least a bunch of reasonably small
files, IMHO that threshold is at the SSD, because the latency of the SSD is
insignificant compared to the latency of the network. Even with
short-stroking getting the latency down to 2ms, that's still significant
compared to network latency, so there's probably still room for improvement
over the short-stroking techniques - at least until somebody creates a more
advanced short-stroking which gets latency down to near-zero, if that ever
happens.
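The "insignificant compared to the network" claim can be made concrete with a toy per-file cost model. The ~0.2 ms LAN round trip is an assumed figure for illustration; the log-device latencies are the ones quoted in the thread:

```python
# Back-of-the-envelope: small-file creation rate for a single client when
# each file costs one network round trip plus one sync write to the log
# device. All figures are illustrative assumptions, not measurements.
net_rtt = 0.2e-3                      # ~0.2 ms LAN round trip (assumed)
log_latency = {
    "short-stroked spindle": 2e-3,    # ~2 ms, per the thread
    "SSD log device":        0.1e-3,  # ~0.1 ms
}
for dev, lat in log_latency.items():
    per_file = net_rtt + lat
    print(f"{dev:22s}: ~{1 / per_file:5.0f} files/sec "
          f"(log device is {lat / per_file:.0%} of per-file time)")
```

With the spindle, the log device dominates the per-file time (~90%), so improving it pays off; with the SSD, the network already accounts for most of it, matching the point made above.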
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
On Tue, 28 Dec 2010, Edward Ned Harvey wrote:
> In any event, the relevant points are:
>
> The question of IOPS here is relevant to conversation because of ZIL
> dedicated log. If you have advanced short-stroking to get the write
> latency of a log device down to zero, then it can compete against SSD for
> purposes of a log device, but nobody seems to believe such technology
> currently exists, and it certainly couldn't compete against SSD for random
> reads. (ZIL log is the only situation I know of where write performance of
> a drive matters and read performance does not matter.)

It seems that you may be confused. For the ZIL, the drive's rotational
latency (based on RPM) is the dominating factor, and not the lateral head
seek time on the media. In this case, the short-stroking you are talking
about does not help any. The ZIL is already effectively short-stroking since
it writes in order.

The (possibly) worthy optimizations I have heard about involve writing the
log data in a different pattern on disk (via a special device driver), with
the goal that when a drive sync request comes in, the drive is quite likely
to be able to write immediately. Since such optimizations are quite device-
and write-load-dependent, it is not worthwhile for a large company to
develop the feature (but it would make for an interesting project).

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Nicolas Williams
>
>> Actually I'd say that latency has a direct relationship to IOPS because
>> it's the time it takes to perform an IO that determines how many IOs Per
>> Second that can be performed.
>
> Assuming you have enough synchronous writes and that you can organize them
> so as to keep the drive at max sustained sequential write bandwidth, then
> IOPS == bandwidth / logical I/O size. Latency doesn't enter into that
> formula.

Ok, what we've hit here is two people using the same word to talk about
different things. Apples to oranges, as it were. Both meanings of IOPS are
ok, but context is everything. There are drive random IOPS, which depend on
latency and seek time, and there is also measured random IOPS above the
filesystem layer, which is not always related to latency or seek time, as
described above.
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
On Dec 27, 2010, at 6:06 PM, Edward Ned Harvey wrote:
> Ok, what we've hit here is two people using the same word to talk about
> different things. Apples to oranges, as it were. Both meanings of IOPS are
> ok, but context is everything. There are drive random IOPS, which is
> dependent on latency and seek time, and there is also measured random IOPS
> above the filesystem layer, which is not always related to latency or seek
> time, as described above.

The small, random read model can assume no cache hits. Adding caches makes
the model too complicated for simple analysis, and arguably too complicated
for modeling at all. For such systems, empirical measurements are possible,
but can be overly optimistic. For example, it is relatively trivial to
demonstrate 500,000 small, random read IOPS at the application using a file
system that caches to RAM. Achieving that performance level for the general
case is much less common.

-- richard
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
On Mon, Dec 27, 2010 at 09:06:45PM -0500, Edward Ned Harvey wrote:
> Ok, what we've hit here is two people using the same word to talk about
> different things. Apples to oranges, as it were. Both meanings of IOPS are
> ok, but context is everything. There are drive random IOPS, which is
> dependent on latency and seek time, and there is also measured random IOPS
> above the filesystem layer, which is not always related to latency or seek
> time, as described above.

Clearly the application cares about _synchronous_ operations that are
meaningful to it. In the case of an NFS application that would be open()
with O_CREAT (and particularly O_EXCL), close(), fsync() and so on. For a
POSIX (but not NFS) application the number of synchronous operations is
smaller. The rate of asynchronous operations is less important to the
application because those are subject to caching, thus less predictable.

But to the filesystem the IOPS are not just about synchronous I/O but about
how many distinct I/O operations can be completed per unit of time. I tried
to keep this clear; sorry for any confusion.

Nico
--
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
On Sat, Dec 25, 2010 at 08:37:42PM -0500, Ross Walker wrote:
> On Dec 24, 2010, at 1:21 PM, Richard Elling richard.ell...@gmail.com
> wrote:
>> Latency is what matters most. While there is a loose relationship between
>> IOPS and latency, you really want low latency. For 15krpm drives, the
>> average latency is 2ms for zero seeks. A decent SSD will beat that by an
>> order of magnitude.
>
> Actually I'd say that latency has a direct relationship to IOPS because
> it's the time it takes to perform an IO that determines how many IOs Per
> Second that can be performed.

Assuming you have enough synchronous writes and that you can organize them
so as to keep the drive at max sustained sequential write bandwidth, then
IOPS == bandwidth / logical I/O size. Latency doesn't enter into that
formula.

Latency does remain though, and will be noticeable to apps doing synchronous
operations. Thus at, say, 100MB/s sustained sequential write bandwidth with,
say, 2KB average ZIL entries, you'd get 51,200 logical sync write operations
per second. The latency for each such operation would still be 2ms (or
whatever it is for the given disk). Since you'd likely have to batch many
ZIL writes, you'd end up making the latency for some ops longer than 2ms and
others shorter, but if you can keep the drive at max sustained sequential
write bandwidth then the average latency will be 2ms.

SSDs are clearly a better choice.

BTW, a parallelized tar would greatly help reduce the impact of high-latency
open()/close() (over NFS) operations...

Nico
--
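The 51,200 figure follows directly from Nico's formula, assuming binary megabytes:

```python
# Nico's formula: when a log device is kept at full sequential write
# bandwidth, throughput-limited IOPS = bandwidth / logical I/O size,
# independent of per-op latency (which batching amortizes).
bandwidth = 100 * 1024 * 1024      # 100 MB/s (binary MB, as in the post)
io_size   = 2 * 1024               # 2 KB average ZIL entry
iops = bandwidth // io_size
print(iops)                        # 51200, matching the figure in the post
```

Each individual sync op still waits the device's ~2 ms, as the post notes; batching trades per-op latency for aggregate throughput.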
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
On Dec 24, 2010, at 1:21 PM, Richard Elling richard.ell...@gmail.com wrote:
> Latency is what matters most. While there is a loose relationship between
> IOPS and latency, you really want low latency. For 15krpm drives, the
> average latency is 2ms for zero seeks. A decent SSD will beat that by an
> order of magnitude.

Actually I'd say that latency has a direct relationship to IOPS, because
it's the time it takes to perform an IO that determines how many IOs per
second can be performed.

Ever notice how storage vendors list their max IOPS for 512-byte sequential
IO workloads and sustained throughput for 1MB+ sequential IO workloads? Only
SSD makers list their random-IOPS numbers and their 4K IO workload numbers.

-Ross
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
On Dec 25, 2010, at 5:37 PM, Ross Walker wrote:
> Actually I'd say that latency has a direct relationship to IOPS because
> it's the time it takes to perform an IO that determines how many IOs Per
> Second that can be performed.

That is only true when there is one queue and one server (in the queueing
context). It is not the case where there are multiple concurrent I/Os that
can be completed out of order by multiple servers working in parallel (e.g.
disk subsystems). For an extreme example, the Sun Storage F5100 Array
specifications show 1.6 million random read IOPS @ 4KB. But instead of an
average latency of 625 nanoseconds (1 / 1,600,000 sec), it shows an average
latency of 0.378 milliseconds. The analogy we've used in parallel computing
for many years is "nine women cannot make a baby in one month."

> Ever notice how storage vendors list their max IOPS in 512 byte sequential
> IO workloads and sustained throughput in 1MB+ sequential IO workloads.
> Only SSD makers list their random IOPS workload numbers and their 4K IO
> workload numbers.

The vendor will present the number that makes them look best, often without
regard for practical application... the curse of marketing :-)

-- richard
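The F5100 numbers are self-consistent under Little's law (concurrency = throughput × latency), which quantifies the "multiple servers working in parallel" point. A quick check with the figures from the post:

```python
# Little's law reconciles high aggregate IOPS with non-trivial per-op
# latency: concurrency = throughput * latency.
iops = 1.6e6                  # F5100 random read IOPS @ 4KB (from the post)
latency = 0.378e-3            # average latency in seconds (from the post)
concurrency = iops * latency
print(round(concurrency))     # ~605 I/Os in flight across the array
# A single-queue, single-server device at the same latency could do only:
print(round(1 / latency))     # ~2646 IOPS
```

So the array sustains 1.6M IOPS not by making each op faster, but by keeping roughly 600 of them in flight at once.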
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
From: Frank Lahm [mailto:frankl...@googlemail.com] With Netatalk for AFP he _is_ running a database: any AFP server needs to maintain a consistent mapping between _not reused_ catalog node ids (CNIDs) and filesystem objects. Luckily for Apple, HFS[+] and their Cocoa/Carbon APIs provide such a mapping, making direct use of HFS+ CNIDs. Unfortunately, most UNIX filesystems reuse inodes and have no API for mapping inodes to filesystem objects. Therefore, all AFP servers running on non-Apple OSen maintain a database providing this mapping; in the case of Netatalk it's `cnid_dbd`, using a BerkeleyDB database. Don't all of those concerns disappear in the event of a reboot? If you stop AFP, you could completely obliterate the BDB database, and restart AFP, and functionally continue from where you left off. Right? ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
On Dec 23, 2010, at 2:25 AM, Stephan Budach wrote: as I have learned from the discussion about which SSD to use as ZIL drives, I stumbled across this article, that discusses short stroking for increasing IOPs on SAS and SATA drives: http://www.tomshardware.com/reviews/short-stroking-hdd,2157.html Now, I am wondering if using a mirror of such 15k SAS drives would be a good-enough fit for a ZIL on a zpool that is mainly used for file services via AFP and SMB. SMB does not create much of a synchronous load. I haven't explored AFP directly, but if they do use Berkeley DB, then we do have a lot of experience tuning ZFS for Berkeley DB performance. I'd particularly like to know if someone has already used such a solution and how it has worked out. Latency is what matters most. While there is a loose relationship between IOPS and latency, you really want low latency. For 15krpm drives, the average latency is 2ms for zero seeks. A decent SSD will beat that by an order of magnitude. -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
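The 2 ms zero-seek figure follows directly from rotation speed: on average the target sector is half a revolution away from the head. A minimal sketch of the arithmetic:

```python
# Average rotational latency = time for half a revolution.
# Reproduces the 2 ms zero-seek figure quoted for 15krpm drives.

def avg_rotational_latency_ms(rpm):
    rev_time_ms = 60_000 / rpm   # one full revolution, in milliseconds
    return rev_time_ms / 2       # on average, half a turn to the sector

print(avg_rotational_latency_ms(15_000))  # 2.0
print(avg_rotational_latency_ms(7_200))   # ~4.17 for a 7200rpm SATA drive
```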
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
On 24/12/2010 18:21, Richard Elling wrote: Latency is what matters most. While there is a loose relationship between IOPS and latency, you really want low latency. For 15krpm drives, the average latency is 2ms for zero seeks. A decent SSD will beat that by an order of magnitude. And the closer you get to the CPU, the lower the latency. For example, the DDRdrive X1 is yet another order of magnitude faster because it sits directly on the PCI bus, without the overhead of the SAS protocol. Yet the humble old 15K drive with 2ms sequential latency is still an order of magnitude faster than a busy drive delivering 20ms latencies under a random workload. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] SAS/short stroking vs. SSDs for ZIL
Hi, as I have learned from the discussion about which SSD to use as ZIL drives, I stumbled across this article, which discusses short stroking for increasing IOPs on SAS and SATA drives: http://www.tomshardware.com/reviews/short-stroking-hdd,2157.html Now, I am wondering if using a mirror of such 15k SAS drives would be a good-enough fit for a ZIL on a zpool that is mainly used for file services via AFP and SMB. I'd particularly like to know if someone has already used such a solution and how it has worked out. Cheers, budy -- Stephan Budach Jung von Matt/it-services GmbH Glashüttenstraße 79 20357 Hamburg Tel: +49 40-4321-1353 Fax: +49 40-4321-1114 E-Mail: stephan.bud...@jvm.de Internet: http://www.jvm.com Geschäftsführer: Ulrich Pallas, Frank Wilhelm AG HH HRB 98380 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
Great question. In good enough computing, beauty is in the eye of the beholder. My home NAS appliance uses IDE and SATA drives without a dedicated ZIL http://dtrace.org/blogs/ahl/2010/11/15/zil-analysis-from-chris-george/ if HDDs and commodity SSDs continue to be target ZIL devices, ZFS could and should do more to ensure that writes are sequential. On 23 Dec 2010, at 10:25, Stephan Budach stephan.bud...@jvm.de wrote: Hi, as I have learned from the discussion about which SSD to use as ZIL drives, I stumbled across this article, that discusses short stroking for increasing IOPs on SAS and SATA drives: http://www.tomshardware.com/reviews/short-stroking-hdd,2157.html Now, I am wondering if using a mirror of such 15k SAS drives would be a good-enough fit for a ZIL on a zpool that is mainly used for file services via AFP and SMB. I'd particularly like to know if someone has already used such a solution and how it has worked out. Cheers, budy -- Stephan Budach Jung von Matt/it-services GmbH Glashüttenstraße 79 20357 Hamburg Tel: +49 40-4321-1353 Fax: +49 40-4321-1114 E-Mail: stephan.bud...@jvm.de Internet: http://www.jvm.com Geschäftsführer: Ulrich Pallas, Frank Wilhelm AG HH HRB 98380 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
Sent from my iPhone (which has a lousy user interface that makes it all too easy for a clumsy oaf like me to touch Send before I'm done)... On 23 Dec 2010, at 11:07, Phil Harman phil.har...@gmail.com wrote: Great question. In good enough computing, beauty is in the eye of the beholder. My home NAS appliance uses mirrored IDE and SATA drives without a dedicated ZIL device. And for my home SMB and NFS, that's good enough. I'm sure that even a 7200rpm SATA ZIL would improve things in my case. The random I/O requirement for the ZIL is discussed by Adam (and Chris) here ... http://dtrace.org/blogs/ahl/2010/11/15/zil-analysis-from-chris-george/ What I find most encouraging is this statement: if HDDs and commodity SSDs continue to be target ZIL devices, ZFS could and should do more to ensure that writes are sequential. It's not broken, but it is suboptimal, and fixable (apparently) ;) On 23 Dec 2010, at 10:25, Stephan Budach stephan.bud...@jvm.de wrote: Hi, as I have learned from the discussion about which SSD to use as ZIL drives, I stumbled across this article, that discusses short stroking for increasing IOPs on SAS and SATA drives: http://www.tomshardware.com/reviews/short-stroking-hdd,2157.html Now, I am wondering if using a mirror of such 15k SAS drives would be a good-enough fit for a ZIL on a zpool that is mainly used for file services via AFP and SMB. I'd particularly like to know if someone has already used such a solution and how it has worked out. Cheers, budy -- Stephan Budach Jung von Matt/it-services GmbH Glashüttenstraße 79 20357 Hamburg Tel: +49 40-4321-1353 Fax: +49 40-4321-1114 E-Mail: stephan.bud...@jvm.de Internet: http://www.jvm.com Geschäftsführer: Ulrich Pallas, Frank Wilhelm AG HH HRB 98380 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
Am 23.12.10 12:18, schrieb Phil Harman: Sent from my iPhone (which has a lousy user interface that makes it all too easy for a clumsy oaf like me to touch Send before I'm done)... On 23 Dec 2010, at 11:07, Phil Harman phil.har...@gmail.com wrote: Great question. In good enough computing, beauty is in the eye of the beholder. My home NAS appliance uses mirrored IDE and SATA drives without a dedicated ZIL device. And for my home SMB and NFS, that's good enough. I'm sure that even a 7200rpm SATA ZIL would improve things in my case. The random I/O requirement for the ZIL is discussed by Adam (and Chris) here ... http://dtrace.org/blogs/ahl/2010/11/15/zil-analysis-from-chris-george/ What I find most encouraging is this statement: if HDDs and commodity SSDs continue to be target ZIL devices, ZFS could and should do more to ensure that writes are sequential. It's not broken, but it is suboptimal, and fixable (apparently) ;) Yeah - I read through Christopher's article already and it clearly shows the shortcomings of current flash SSDs as ZIL devices. On the other hand, if you'd be using a DDRdrive as a ZIL device, you'd pretty much lock this zpool to that particular host, since you can't easily move the zpool to another host without moving the DDRdrive as well, or without first detaching the ZIL device(s) from the zpool, which I find a little bit odd. I am not actually running in a SOHO scenario with my ZFS file server, since it has to serve up to 200 users on up to 200 zfs volumes in one zpool, but the actual data traffic is not that high either. The traffic consists more of small peaks when someone writes back to a file. Cheers, budy ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
On 23 Dec 2010, at 11:53, Stephan Budach stephan.bud...@jvm.de wrote: Am 23.12.10 12:18, schrieb Phil Harman: Sent from my iPhone (which has a lousy user interface that makes it all too easy for a clumsy oaf like me to touch Send before I'm done)... On 23 Dec 2010, at 11:07, Phil Harman phil.har...@gmail.com wrote: Great question. In good enough computing, beauty is in the eye of the beholder. My home NAS appliance uses mirrored IDE and SATA drives without a dedicated ZIL device. And for my home SMB and NFS, that's good enough. I'm sure that even a 7200rpm SATA ZIL would improve things in my case. The random I/O requirement for the ZIL is discussed by Adam (and Chris) here ... http://dtrace.org/blogs/ahl/2010/11/15/zil-analysis-from-chris-george/ What I find most encouraging is this statement: if HDDs and commodity SSDs continue to be target ZIL devices, ZFS could and should do more to ensure that writes are sequential. It's not broken, but it is suboptimal, and fixable (apparently) ;) Yeah - I read through Christopher's article already and it clearly shows the shortcomings of current flash SSDs as ZIL devices. On the other hand, if you'd be using a DDRdrive as a ZIL device, you'd pretty much lock this zpool to that particular host, since you can't easily move the zpool to another host without moving the DDRdrive as well, or without first detaching the ZIL device(s) from the zpool, which I find a little bit odd. I am not actually running in a SOHO scenario with my ZFS file server, since it has to serve up to 200 users on up to 200 zfs volumes in one zpool, but the actual data traffic is not that high either. The traffic consists more of small peaks when someone writes back to a file. Cheers, budy Well, your proposed config will improve what each user sees during their own private burst, and short stroking can only improve things in the worst case scenario (although it may not be measurable). So why not give it a spin and report back to the list in the new year? 
All the best, Phil___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
Am 23.12.10 13:09, schrieb Phil Harman: On 23 Dec 2010, at 11:53, Stephan Budach stephan.bud...@jvm.de wrote: Am 23.12.10 12:18, schrieb Phil Harman: Sent from my iPhone (which has a lousy user interface that makes it all too easy for a clumsy oaf like me to touch Send before I'm done)... On 23 Dec 2010, at 11:07, Phil Harman phil.har...@gmail.com wrote: Great question. In good enough computing, beauty is in the eye of the beholder. My home NAS appliance uses mirrored IDE and SATA drives without a dedicated ZIL device. And for my home SMB and NFS, that's good enough. I'm sure that even a 7200rpm SATA ZIL would improve things in my case. The random I/O requirement for the ZIL is discussed by Adam (and Chris) here ... http://dtrace.org/blogs/ahl/2010/11/15/zil-analysis-from-chris-george/ What I find most encouraging is this statement: if HDDs and commodity SSDs continue to be target ZIL devices, ZFS could and should do more to ensure that writes are sequential. It's not broken, but it is suboptimal, and fixable (apparently) ;) Yeah - I read through Christopher's article already and it clearly shows the shortcomings of current flash SSDs as ZIL devices. On the other hand, if you'd be using a DDRdrive as a ZIL device, you'd pretty much lock this zpool to that particular host, since you can't easily move the zpool to another host without moving the DDRdrive as well, or without first detaching the ZIL device(s) from the zpool, which I find a little bit odd. I am not actually running in a SOHO scenario with my ZFS file server, since it has to serve up to 200 users on up to 200 zfs volumes in one zpool, but the actual data traffic is not that high either. The traffic consists more of small peaks when someone writes back to a file. 
Cheers, budy Well, your proposed config will improve what each user sees during their own private burst, and short stroking can only improve things in the worst case scenario (although it may not be measurable). So why not give it a spin and report back to the list in the new year? Ha ha - if no one else has some more input on this, I will definitely give it a try in January. Cheers, budy ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
On Thu, Dec 23 at 11:25, Stephan Budach wrote: Hi, as I have learned from the discussion about which SSD to use as ZIL drives, I stumbled across this article, that discusses short stroking for increasing IOPs on SAS and SATA drives: [1]http://www.tomshardware.com/reviews/short-stroking-hdd,2157.html Now, I am wondering if using a mirror of such 15k SAS drives would be a good-enough fit for a ZIL on a zpool that is mainly used for file services via AFP and SMB. I'd particularly like to know if someone has already used such a solution and how it has worked out. Haven't personally used it, but the worst case steady-state IOPS of the Vertex2 EX, from the DDRDrive presentation, is 6k IOPS assuming a full-pack random workload. To achieve that through SAS disks in the same workload, you'll probably spend significantly more money, and it will consume a LOT more space and power. According to that Tom's article, a typical 15k SAS enterprise drive is in the 600 IOPS ballpark when short-stroked and consumes about 15W active. Thus you're going to need ten of these devices to equal the degraded steady-state IOPS of an SSD. I just don't think the math works out. At that point, you're probably better off not having a dedicated ZIL, instead of burning 10 slots and 150W. --eric -- Eric D. Mudama edmud...@mail.bounceswoosh.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
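Eric's back-of-envelope arithmetic, spelled out (the 6k IOPS, 600 IOPS, and 15 W figures are taken from the sources he cites, not measured here):

```python
# Back-of-envelope comparison from the figures quoted in the thread.
ssd_iops = 6_000      # Vertex2 EX, worst-case steady state (DDRDrive slides)
sas_iops = 600        # short-stroked 15k SAS drive (Tom's Hardware)
sas_watts = 15        # active power draw per SAS drive

drives_needed = ssd_iops // sas_iops      # drives required to match one SSD
total_watts = drives_needed * sas_watts   # power (and slot) cost of doing so

print(drives_needed, total_watts)  # 10 150
```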
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
Am 23.12.10 19:05, schrieb Eric D. Mudama: On Thu, Dec 23 at 11:25, Stephan Budach wrote: Hi, as I have learned from the discussion about which SSD to use as ZIL drives, I stumbled across this article, that discusses short stroking for increasing IOPs on SAS and SATA drives: [1]http://www.tomshardware.com/reviews/short-stroking-hdd,2157.html Now, I am wondering if using a mirror of such 15k SAS drives would be a good-enough fit for a ZIL on a zpool that is mainly used for file services via AFP and SMB. I'd particularly like to know if someone has already used such a solution and how it has worked out. Haven't personally used it, but the worst case steady-state IOPS of the Vertex2 EX, from the DDRDrive presentation, is 6k IOPS assuming a full-pack random workload. To achieve that through SAS disks in the same workload, you'll probably spend significantly more money, and it will consume a LOT more space and power. According to that Tom's article, a typical 15k SAS enterprise drive is in the 600 IOPS ballpark when short-stroked and consumes about 15W active. Thus you're going to need ten of these devices to equal the degraded steady-state IOPS of an SSD. I just don't think the math works out. At that point, you're probably better off not having a dedicated ZIL, instead of burning 10 slots and 150W. Good - that was actually the information I have been missing. So I will go with the Vertex2 EX then and save myself the hassle of short stroking entirely. Thanks, and Merry Christmas to all on this list. Cheers, budy ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
On Thu, Dec 23, 2010 at 11:25:43AM +0100, Stephan Budach wrote: as I have learned from the discussion about which SSD to use as ZIL drives, I stumbled across this article, that discusses short stroking for increasing IOPs on SAS and SATA drives: There was a thread on this a while back. I forget when or the subject. But yes, you could even use 7200 rpm drives to make a fast ZIL device. The trick is the on-disk format, and the pseudo-device driver that you would have to layer on top of the actual device(s) to get such performance. The key is that sustained sequential I/O rates for disks can be quite large, so if you organize the disk in a log form and use the outer tracks only, then you can pretend to have awesome write IOPS for a disk (but NOT read IOPS). But it's not necessarily as cheap as you might think. You'd be making very inefficient use of an expensive disk (in the case of a 15k rpm SAS disk), or disks, and if plural then you are also using more ports (oops). Disks used this way probably also consume more power than SSDs (OK, this part of my analysis is very iffy), and you still need to do something about ensuring syncs to disk on power failure (such as just disabling the cache on the disk, but this would lower performance, increasing the cost). When you factor all the costs in, I suspect you'll find that SSDs are priced reasonably well. That's not to say that one could not put together a disk-based log device that could eat SSDs' lunch, but SSD prices would then just come down to match that -- and you can expect SSD prices to come down anyway, as with any new technology. I don't mean to discourage you, just to point out that there's plenty of work to do to make short-stroked disks as ZILs a workable reality, while the economics of doing that work versus waiting for SSD prices to come down don't seem appealing. Caveat emptor: my analysis is off-the-cuff; I could be wrong. 
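Nico's point about sequential rates can be made concrete: if a pseudo-device appends every record to a log on the outer tracks, the streaming rate rather than seek time bounds write IOPS. The figures below are illustrative assumptions, not measurements:

```python
# Apparent write IOPS of a purely log-structured (append-only)
# disk layout. Assumed figures: ~150 MB/s outer-track streaming
# rate for a 15k SAS drive, 4 KB log records.

seq_kb_per_s = 150 * 1024   # assumed sustained sequential write rate, KB/s
record_kb = 4               # size of one small sync-write record

apparent_write_iops = seq_kb_per_s // record_kb
print(apparent_write_iops)  # 38400 appends/s -- writes only;
                            # random reads would still pay full seeks
```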
Nico -- ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Stephan Budach Now, I am wondering if using a mirror of such 15k SAS drives would be a good-enough fit for a ZIL on a zpool that is mainly used for file services via AFP and SMB. For supporting AFP and SMB, most likely, you would be perfectly happy simply disabling the ZIL. You will get maximum performance... Even higher than the world's fastest SSD or DDRDrive or any other type of storage device for dedicated log. To determine if this is ok for you, be aware of the argument *against* disabling the ZIL: In the event of an ungraceful crash, with ZIL enabled, you lose up to 30 sec of async data, but you do not lose any sync data. In the event of an ungraceful crash, with ZIL disabled, you lose up to 30 sec of async and sync data. In neither case do you have data corruption, or a corrupt filesystem. The only question is about 30 seconds of sync data. You must protect this type of data, if you're running a database, an iscsi target for virtual hosts, and for some other types of data services... But if you're doing just AFP and SMB, it's pretty likely you don't need to worry about it. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
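To make the sync/async distinction concrete, here is a tiny hypothetical demo (not ZFS-specific code): "sync data" is whatever the application has explicitly asked the OS to commit, e.g. via fsync(), and that promise is what the ZIL exists to honor.

```python
import os
import tempfile

def durable_write(data: bytes) -> int:
    """Write data and fsync() before returning, like a database
    commit or a sync NFS write. On a pool with the ZIL disabled,
    this call can be acknowledged before the data is actually on
    stable storage, which is the crash-loss window described above."""
    fd, path = tempfile.mkstemp()
    try:
        n = os.write(fd, data)   # async so far: may still sit in RAM
        os.fsync(fd)             # sync: caller requests durability now
        return n
    finally:
        os.close(fd)
        os.remove(path)

print(durable_write(b"commit me"))  # 9
```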
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
2010/12/24 Edward Ned Harvey opensolarisisdeadlongliveopensola...@nedharvey.com: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Stephan Budach Now, I am wondering if using a mirror of such 15k SAS drives would be a good-enough fit for a ZIL on a zpool that is mainly used for file services via AFP and SMB. For supporting AFP and SMB, most likely, you would be perfectly happy simply disabling the ZIL. You will get maximum performance... Even higher than the world's fastest SSD or DDRDrive or any other type of storage device for dedicated log. To determine if this is ok for you, be aware of the argument *against* disabling the ZIL: In the event of an ungraceful crash, with ZIL enabled, you lose up to 30 sec of async data, but you do not lose any sync data. In the event of an ungraceful crash, with ZIL disabled, you lose up to 30 sec of async and sync data. In neither case do you have data corruption, or a corrupt filesystem. The only question is about 30 seconds of sync data. You must protect this type of data, if you're running a database, ... With Netatalk for AFP he _is_ running a database: any AFP server needs to maintain a consistent mapping between _not reused_ catalog node ids (CNIDs) and filesystem objects. Luckily for Apple, HFS[+] and their Cocoa/Carbon APIs provide such a mapping, making direct use of HFS+ CNIDs. Unfortunately, most UNIX filesystems reuse inodes and have no API for mapping inodes to filesystem objects. Therefore, all AFP servers running on non-Apple OSen maintain a database providing this mapping; in the case of Netatalk it's `cnid_dbd`, using a BerkeleyDB database. -f ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
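A toy sketch of the mapping Frank describes (a hypothetical illustration, not Netatalk's actual cnid_dbd implementation): CNIDs are allocated monotonically and never reused, which is exactly the property UNIX inode numbers lack.

```python
class CnidDb:
    """Toy CNID allocator keyed by (device, inode) pairs."""
    def __init__(self):
        self._next = 17      # illustrative starting value only
        self._by_key = {}    # (dev, inode) -> CNID

    def cnid_for(self, dev, inode):
        key = (dev, inode)
        if key not in self._by_key:
            self._by_key[key] = self._next
            self._next += 1  # monotonic: retired CNIDs are never reissued
        return self._by_key[key]

    def forget(self, dev, inode):
        # The kernel may hand this inode number to a brand-new file,
        # but the CNID it mapped to stays retired forever.
        self._by_key.pop((dev, inode), None)

db = CnidDb()
first = db.cnid_for(1, 100)   # a new file gets CNID 17
db.forget(1, 100)             # file deleted; inode 100 becomes reusable
second = db.cnid_for(1, 100)  # a *different* file, so a fresh CNID: 18
print(first, second)          # 17 18
```

If this table is lost, re-enumeration hands out fresh CNIDs to every object, which is why stored client references break.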