Re: [zfs-discuss] Adding ZIL to pool questions
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Gregory Gee

> Prior to pool version 19, mirroring the log device is highly
> recommended.
>
> I have the following. This system is currently running ZFS pool
> version 14.
>
> This really worries me. For a home server, mirrored SSD ZIL is
> pricey. I assume this means upgrading to a dev version? What do
> people do now for upgrades?

Before anything else, a clarification of terminology: every pool has a ZIL. By default, the ZIL is stored on disk within the primary pool. You are not talking about adding a ZIL; you are talking about adding a dedicated log device.

If you look at opensolaris.org, under the download page, you'll see a section for developer builds. It tells you two things: there are instructions to upgrade your existing installation in place, and there is a link to http://genunix.org to download the latest developer build as an ISO for a fresh installation. (The latest is b134, which is about 4 months old.) My personal experience is that the upgrade process is a little shaky. Nothing fatal happened, but the system didn't seem quite right afterward, so it's a good idea to back up before doing an upgrade.

Another option, since this is your home server: I bet you're not processing credit card transactions. I bet you don't have a compute farm using this as a backend datastore. I bet it's not a mail server, and even more certainly not a mail server for critical information. You might just consider disabling the ZIL.

If you run with the ZIL disabled, your sync writes behave as async writes, which means it is faster than you could possibly hope for even with a dedicated log device. The only risk is possibly 30 seconds' worth of sync writes leading up to an ungraceful system crash; but even if you have a ZIL (dedicated log or built into the pool), 30 seconds of async writes would be at risk anyway.
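To make the suggestion concrete, here is a sketch of how disabling the ZIL is actually done. The dataset name is a placeholder, and this is an illustration rather than a recommendation: on builds at this pool version the switch is a system-wide /etc/system tunable, while recent dev builds (snv_140 and later) added a per-dataset sync property.

```shell
# Pre-snv_140: disable the ZIL globally via /etc/system, then reboot.
# This affects every pool on the system, not just one dataset.
echo "set zfs:zil_disable = 1" >> /etc/system

# snv_140 and later: the per-dataset property is the cleaner way.
zfs set sync=disabled tank/export
zfs get sync tank/export
```

Note that the /etc/system route requires a reboot to take effect, whereas the property takes effect immediately and can be reverted with `zfs set sync=standard`.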
If you're not carefully and obsessively controlling your applications to distinguish between sync and async writes, I bet you don't care about the distinction. So just run with your ZIL disabled. For all you know, your application might have been using async writes anyway, right?

> Then, when trying to figure out the size of the SSD, I got this from
> the guide as well.
>
> The maximum size of a log device should be approximately 1/2 the size
> of physical memory because that is the maximum amount of potential
> in-play data that can be stored.

I'll make this even more shocking: since the maximum you could possibly store in the ZIL is 30 seconds' worth of writes, just calculate the speed of your device and see how many GB that is. In an unrealistically fast supercomputer world, that would be 32G. Realistically, it's more like 4-6G under extremely heavy use on real devices. Your system is simply incapable of using more than that.

> I don't think I can get an SSD that small. It seems like a waste of a
> larger SSD now. Is there a way to make better use of the SSD by
> partitioning? I have two pools; could I partition the SSD to have
> each partition as a ZIL for the different pools? Any other creative
> uses?

Yes. In a high-performance environment, people generally acknowledge the waste of space and simply don't use most of their SSD. You can format, fdisk, and partition your SSD to create smaller slices. Then you can use a slice for log and a slice for cache.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
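The sizing argument above reduces to simple arithmetic: a dedicated log never needs to hold more than the sync data that can arrive between transaction group commits. A back-of-envelope sketch, with assumed (not measured) numbers:

```shell
# Worst-case slog sizing. WRITE_MBS is an assumed sustained sync-write
# rate in MB/s; 200 MB/s would roughly saturate two bonded GbE links.
WRITE_MBS=200
TXG_SECONDS=30   # maximum txg commit interval at this era's defaults
NEED_MB=$((WRITE_MBS * TXG_SECONDS))
echo "worst-case in-flight sync data: ${NEED_MB} MB"
```

200 MB/s for 30 seconds is 6000 MB, about 6 GB, which lands in the same 4-6G range quoted above for heavy real-world use.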
Re: [zfs-discuss] Adding ZIL to pool questions
You probably would not notice the performance effects of an SSD ZIL on a home network, so the price of the ticket may not be worth the ride for you. OTOH, you would notice a significant improvement by using that SSD as an L2ARC device. Because the head latency on consumer 1TB drives is so long, the L2ARC would definitely make access to the pool feel faster: the working footprint of files that your applications frequently reference will sit up front in the L2 cache, while archival and infrequently touched items park in the storage pool cylinders.

On a small office network, OTOH, a ZIL makes a big difference. For instance, if you had 10 software developers with their home directories all exported from a ZFS box, adding an NVRAM ZIL would significantly improve performance. That's because developers often compile hundreds of files at a time, several times per hour, plus updates to files' atime attribute, and that particular scale of operation is greatly improved by an NVRAM ZIL.

If I were to use a ZIL again, I'd use something like the ACARD DDR-2 SATA boxes, not an SSD or an iRAM.

-- Jim

-- This message posted from opensolaris.org
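Before deciding either way, it is worth measuring whether the workload issues sync writes at all. One way, assuming you have downloaded Richard Elling's DTrace-based zilstat script (the invocation below is a sketch of its typical interval/count usage):

```shell
# Sample ZIL traffic in ten-second intervals, six samples.
# Rows of zeroes mean the workload issues no sync writes,
# in which case neither a slog nor an NVRAM ZIL will help.
./zilstat.ksh 10 6
```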
Re: [zfs-discuss] Adding ZIL to pool questions
Jim, that ACARD looks really nice, but it is out of the price range for a home server.

Edward, disabling the ZIL might be OK, but let me characterize what my home server does, and tell me if disabling the ZIL is OK. My home OpenSolaris server is used only for storage. I have a separate Linux box that runs any software I need, such as media servers. I export all pools from the OpenSolaris box to the Linux box via NFS.

The OpenSolaris box has 2 pools. The first pool stores videos, pictures, various files, and mail, all exported via NFS to the Linux box. It is a mirrored zpool. The second mirrored zpool is the NFS store for VM images. The Linux boxes I mentioned are actually VMs running in XenServer. The VM vdisks are stored on, and run from, the OpenSolaris NFS server mounted in the XenServer box. Yes, I know this is not a typical home setup, but I'm sure most here don't have a 'typical home setup'.

So the question is: will disabling the ZIL have negative impacts on the VM vdisks stored on NFS? Or any other files on the NFS shares?

Thanks,
Greg
Re: [zfs-discuss] Adding ZIL to pool questions
On Sun, Aug 01, 2010 at 12:36:28PM -0700, Gregory Gee wrote:
> So the question is, will disabling ZIL have negative impacts on the
> VM vdisks stored in NFS? Or any other files on the NFS shares?

You would probably see better performance, at the expense of reliability in the case of an unplanned outage.

Ray
Re: [zfs-discuss] Adding ZIL to pool questions
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Gregory Gee

> Edward, disabling ZIL might be ok, but let me characterize what my
> home server does and tell me if disabling ZIL is ok.

You should understand what it all means, and make your own choice.

For sync writes, an application tells the OS to write something to disk, and the function call blocks (waits) until the data has been committed to nonvolatile storage. For async writes, an application tells the OS to write something to disk, and the OS is permitted to buffer the write in RAM. The application continues doing other things, even if the data is not yet committed to nonvolatile storage.

In ZFS, many async transactions are aggregated into a single transaction group (TXG). ZFS chooses when to flush the TXG to disk based on many factors, optimized for performance, but never waits longer than 30 seconds. Sync transactions are first written to the ZIL, so the OS can unblock the application; after that they become async transactions just like all the others. After an ungraceful crash, the OS checks the ZIL to see if anything was requested to be written but not actually written. If any unplayed entries exist, they are replayed before the filesystem is mounted.

In other filesystems and operating systems, it's critical to honor the sync mode behavior, because transactions such as file creation and removal are sync operations. In those systems, not honoring sync behavior could result in a corrupt filesystem, or corrupt data where a later write was committed to disk before an earlier write. ZFS is immune to those problems. ZFS keeps an in-memory view of the filesystem as a whole, commits only a newer snapshot of the filesystem to disk (rather than individual file-based operations, as other filesystems and OSes do), and commits each new TXG as an atomic operation. So it is impossible to boot up and discover ZFS in a corrupt or inconsistent state.

During a crash, up to 30 seconds of async writes are at risk: anything in a TXG not yet flushed to disk is lost. So, if you honor sync writes, and some NFS client issues a sync write, and the server reboots ungracefully, then after reboot the client will see things as they were expected to be. If you don't honor sync writes (ZIL disabled), it's possible for an ungraceful reboot to come up with the filesystem in a state older than what your NFS clients expect. So it's probably a good idea to reboot, or at least remount, your NFS clients along with the server reboot, just to get them all into a consistent state.

If you are using NFS to export some VMs, and other compute servers are acting as the heads for those VMs: VMs are naturally sync-mode machines, because whenever an application inside the guest OS requests a sync write, the guest OS issues a sync write to the host OS. I would not recommend NFS as the backend for hosting VM guest files; I would recommend iSCSI, which will perform more natively and with less overhead. In either case, NFS or iSCSI, if you disable the ZIL, just make sure to reboot your VM guests too if the server has an ungraceful reboot.
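The sync/async distinction described above can be seen from userland with dd (GNU dd assumed; the file names are scratch paths, nothing ZFS-specific):

```shell
# Async: dd returns as soon as the data is in the OS write cache;
# the kernel flushes it to disk at its leisure.
dd if=/dev/zero of=/tmp/demo.async bs=1024 count=100 2>/dev/null

# Sync: conv=fsync forces the data to stable storage before dd exits,
# the same guarantee an application gets from fsync(2) or O_SYNC.
dd if=/dev/zero of=/tmp/demo.sync bs=1024 count=100 conv=fsync 2>/dev/null

ls -l /tmp/demo.async /tmp/demo.sync
```

Both files are identical afterward; the only difference is when the call returns relative to the data reaching disk, and that is precisely the guarantee that disabling the ZIL drops for NFS clients.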
Re: [zfs-discuss] Adding ZIL to pool questions
On Sun, 1 Aug 2010, Jim Doyle wrote:
> You probably would not notice the performance effects of a SSD ZIL on
> a home network; so the price of the ticket may not be worth the ride
> for you.

There are cases where an SSD ZIL will offer tremendous improvement. One of the common cases is when the client uses NFS to access the files, and the files are bulk-copied or extracted from a tar file. In this case there will be a huge improvement in write performance.

> On a small office network, OTOH, ZIL makes a big difference. For
> instance, if you had 10 software developers, with their home
> directories all exported from a ZFS box, adding a NVRAM ZIL will
> significantly improve performance.

Smart software developers will access the source code via NFS but write the object files to local client disk.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] Adding ZIL to pool questions
On Jul 31, 2010, at 8:56 PM, Gregory Gee wrote:
> I was thinking of adding an SSD ZIL to my pool, but then read this in
> the 'Best Practice Guide':
>
>   Prior to pool version 19, if you have an unmirrored log device that
>   fails, your whole pool is permanently lost. Prior to pool version
>   19, mirroring the log device is highly recommended.

This depends on the failure mode, of course.

> I have the following.
>
>   ad...@nas:~$ zpool upgrade
>   This system is currently running ZFS pool version 14.
>   All pools are formatted using this version.
>
> This really worries me. For a home server, mirrored SSD ZIL is
> pricey. I assume this means upgrading to a dev version? What do
> people do now for upgrades?

Yes, until the OpenSolaris dev builds ran dry in February. There are other distributions with later releases. See http://www.genunix.org

> Then, when trying to figure out the size of the SSD, I got this from
> the guide as well.
>
>   The maximum size of a log device should be approximately 1/2 the
>   size of physical memory because that is the maximum amount of
>   potential in-play data that can be stored. For example, if a system
>   has 16 Gbytes of physical memory, consider a maximum log device
>   size of 8 Gbytes.
>
> I don't think I can get an SSD that small. It seems like a waste of a
> larger SSD now. Is there a way to make better use of the SSD by
> partitioning? I have two pools; could I partition the SSD to have
> each partition as a ZIL for the different pools? Any other creative
> uses?

I use something like an Intel X25-V (40GB) for the boot disk and a separate log. There are a number of similar-sized SSDs targeting boot environments in the 32-40 GB range, around $120 or so.
-- richard

--
Richard Elling
rich...@nexenta.com +1-760-896-4422
Enterprise class storage for everyone
www.nexenta.com
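A slice-based layout like the ones suggested in this thread would look roughly as follows. Pool, device, and slice names are placeholders, and the slicing itself is done interactively in format(1M) beforehand:

```shell
# After creating slices with format -e (e.g. s0 ~8 GB, s1 the rest):
zpool add tank log c8t1d0s0      # small slice as the dedicated log
zpool add tank cache c8t1d0s1    # remainder as L2ARC
zpool status tank                # log and cache show up as separate sections
```

Keep in mind the caveat quoted above: prior to pool version 19, an unmirrored log device that fails takes the pool with it, so this layout carries exactly the risk the original poster asked about.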
[zfs-discuss] Adding ZIL to pool questions
I was thinking of adding an SSD ZIL to my pool, but then read this in the 'Best Practice Guide':

  Prior to pool version 19, if you have an unmirrored log device that
  fails, your whole pool is permanently lost. Prior to pool version 19,
  mirroring the log device is highly recommended.

I have the following:

  ad...@nas:~$ zpool upgrade
  This system is currently running ZFS pool version 14.
  All pools are formatted using this version.

This really worries me. For a home server, a mirrored SSD ZIL is pricey. I assume this means upgrading to a dev version? What do people do now for upgrades?

Then, when trying to figure out the size of the SSD, I got this from the guide as well:

  The maximum size of a log device should be approximately 1/2 the size
  of physical memory because that is the maximum amount of potential
  in-play data that can be stored. For example, if a system has 16
  Gbytes of physical memory, consider a maximum log device size of 8
  Gbytes.

I don't think I can get an SSD that small. It seems like a waste of a larger SSD now. Is there a way to make better use of the SSD by partitioning? I have two pools; could I partition the SSD to have each partition as a ZIL for the different pools? Any other creative uses?

Thanks,
Greg