Re: [zfs-discuss] VM's on ZFS - 7210
Hi,

In a setup similar to yours I changed from a single 15-disk raidz2 to 7 mirrors of 2 disks each. The change in performance was stellar.

The key point in serving storage to VMware is that it always issues synchronous writes, whether over iSCSI or NFS. When you have tens of VMs the resulting traffic is always random for the backing store, and random synchronous writes are the Achilles' heel of ZFS.

Now, about your options:

> 1) Change to iSCSI mounts to ESX, and enable write-cache on the LUN's since the 7210 is on a UPS.

This won't save you from a crash.

> 2) get a Logzilla SSD mirror. (do ssd's fail, do I really need a mirror?)

Yes, you do need a mirror, although a recent thread here showed that it is not enough.

> 3) reconfigure the NAS to a RAID10 instead of RAIDz

This is the way I would go. To make up for the lost space you can enable lzjb compression (the default one), which should be more or less transparent and leads to very good savings (1.5x to 2x).

Another piece of advice, if your guests are Unix: unless you need access times, mount your guests' filesystems with noatime. In my experience this reduces the basic chatter by about 50%.

Another thing that helps is to have cache devices: even if they aren't faster than the pool's disks, they free up IOPS that can be used for writes.

To summarize: I'd go for the mirror setup; then, if that's not enough, a pair of SSDs for the SLOG would surely help.

On 27 Aug 2010, at 07:04, Mark wrote:

> We are using a 7210, 44 disks I believe, 11 stripes of RAIDz sets. When I installed I selected the best bang for the buck on the speed vs capacity chart. We run about 30 VM's on it, across 3 ESX 4 servers. Right now, it's all running NFS, and it sucks... sooo slow. iSCSI was no better. I am wondering how I can increase the performance, cause they want to add more VM's... the good news is most are idleish, but even idle VM's create a lot of random chatter to the disks! So a few options maybe...
>
> 1) Change to iSCSI mounts to ESX, and enable write-cache on the LUN's since the 7210 is on a UPS.
> 2) get a Logzilla SSD mirror. (do ssd's fail, do I really need a mirror?)
> 3) reconfigure the NAS to a RAID10 instead of RAIDz
>
> Obviously all 3 would be ideal, though with an SSD can I keep using NFS for the same performance, since the R_SYNC's would be satisfied with the SSD? I dread getting the OK to spend $$,$$$ on SSD's and then not getting the performance increase we want. How would you weight these? I noticed in testing on a 5 disk OpenSolaris box that changing from a single RAIDz pool to RAID10 netted a larger IOP increase than adding an Intel SSD as a Logzilla. That's not going to scale the same though with a 44 disk, 11 raidz striped RAID set. Some thoughts? Would simply moving to write-cache enabled iSCSI LUN's without an SSD speed things up a lot by itself?
> --
> This message posted from opensolaris.org

--
Simone Caldana
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
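A rough way to compare the two layouts mentioned in this message (a single 15-disk raidz2 versus 7 two-disk mirrors) is sketched below in Python. It assumes the common rule of thumb that each top-level vdev delivers roughly the random-write IOPS of one disk, and it applies the 1.5x to 2x compression range quoted in the message. The `Layout` class and its field names are illustrative only and are not part of any ZFS tool.

```python
from dataclasses import dataclass


@dataclass
class Layout:
    name: str
    vdevs: int            # number of top-level vdevs in the pool
    disks_per_vdev: int   # disks in each vdev
    redundant: int        # disks per vdev holding parity or mirror copies

    def data_disks(self) -> int:
        """Disks' worth of usable space, before compression."""
        return self.vdevs * (self.disks_per_vdev - self.redundant)

    def relative_random_iops(self) -> int:
        """Assumption (rule of thumb): one vdev ~ one disk of random IOPS."""
        return self.vdevs


raidz2 = Layout("1 x 15-disk raidz2", vdevs=1, disks_per_vdev=15, redundant=2)
mirrors = Layout("7 x 2-disk mirror", vdevs=7, disks_per_vdev=2, redundant=1)

for layout in (raidz2, mirrors):
    for ratio in (1.0, 1.5, 2.0):
        # Effective capacity if data compresses by the given ratio.
        effective = layout.data_disks() * ratio
        print(f"{layout.name}: compression {ratio}x -> "
              f"{effective:g} disks of data, "
              f"{layout.relative_random_iops()}x random IOPS")
```

Under these assumptions the mirror pool gives up roughly half of the usable space in exchange for about seven times the random IOPS, and compression near the upper end of the quoted range can recover much of that space.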
Re: [zfs-discuss] Can't detach spare device from pool
On 21 Aug 2010, at 10:10, Ian Collins wrote:

> On 08/21/10 07:03 PM, Martin Mundschenk wrote:
>> After about 62 hours and 90%, the resilvering process got stuck. Nothing has happened for the last 12 hours, so I cannot detach the spare device. Is there a way to get the resilvering process running again?
>
> Are you sure it's stuck? They can take a very long time and go really slow at the end.

Especially if you're writing to that pool.

--
Simone Caldana
Re: [zfs-discuss] Solaris startup script location
On 18 Aug 2010, at 10:20, Alxen4 wrote:

> My NFS client is ESXi, so the major question is: is there a risk of corruption for VMware images if I disable the ZIL?

I use ZFS the same way. I saw a huge improvement in performance by using mirrors instead of raidz. How is your zpool configured?

--
Simone Caldana
Senior Consultant
Critical Path
via Cuniberti 58, 10100 Torino, Italia
+39 011 4513811 (Direct)
+39 011 4513825 (Fax)
simone.cald...@criticalpath.net
http://www.cp.net/
Critical Path
A global leader in digital communications
Re: [zfs-discuss] ZFS Group Quotas
On 18 Aug 2010, at 21:24, David Magda wrote:

> On Wed, August 18, 2010 15:14, Linder, Doug wrote:
>> I've noticed that every time someone mentions using NFS with ZFS here, they always seem to be using NFSv3. Is there a reason for this that I just don't know about?
>
> At $WORK it's generally namespace issues:
> http://blogs.sun.com/tdh/entry/linux_nfsv4_namespace_implementation_fools

Also, the Linux NFSv4 client is buggy (as in hang-the-whole-machine buggy). I am deploying a new osol file server for home directories and I'm using NFSv3 plus the automounter, because I am also using one dataset per user and thus have to mount each home directory separately.

--
Simone Caldana
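With one dataset per user, each home directory is a separate filesystem, so an NFSv3 client has to mount each one individually, and an automounter map can do that on demand. The Python sketch below only builds the command lines and map entries one might use for such a setup. The pool name `tank`, the server name `fileserver` and the user names are hypothetical, the default mountpoints (`/tank/home/<user>`) are assumed, and nothing is executed.

```python
def home_dataset_commands(pool, users):
    """Return `zfs` command lines: one shared parent, one dataset per user."""
    # sharenfs set on the parent is inherited by the per-user datasets.
    cmds = [f"zfs create -o sharenfs=on {pool}/home"]
    cmds += [f"zfs create {pool}/home/{user}" for user in users]
    return cmds


def auto_home_entries(server, pool, users):
    """Return automounter map lines: key, then server:/path."""
    return [f"{user}\t{server}:/{pool}/home/{user}" for user in users]


users = ["alice", "bob"]  # hypothetical user names
for line in home_dataset_commands("tank", users):
    print(line)
for line in auto_home_entries("fileserver", "tank", users):
    print(line)
```

Generating one map entry per user keeps the example explicit; a wildcard map entry is another common way to cover all users at once.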
Re: [zfs-discuss] Backup zpool
On 13 Aug 2010, at 03:03, Marty Scholes wrote:

> Script attached.

Thanks :)

--
Simone Caldana
Re: [zfs-discuss] Backup zpool
On 12 Aug 2010, at 15:10, Marty Scholes wrote:

> Say the word and I'll send you a copy.

Pretty please :) Thanks.

(Meanwhile, I created the top dataset on the backup pool, set its compression to gzip-2, removed any local compression setting on the children of the source dataset, and I am sending one child at a time. This way what is sent has no property set and thus inherits the setting of the backup dataset.)

--
Simone Caldana
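The procedure described in this message can be sketched as a list of commands. The Python below only builds the command strings and does not run them. The dataset names (`tank/data`, `backup/top`), the child names and the snapshot name are hypothetical, and the sequence is one plausible reading of the description, not the script mentioned in the thread.

```python
def backup_commands(source, children, backup, snapshot):
    """Build the zfs command lines for a per-child send to a backup dataset."""
    # Top dataset on the backup pool carries the compression setting.
    cmds = [f"zfs create -o compression=gzip-2 {backup}"]
    for child in children:
        src = f"{source}/{child}"
        # Drop any local compression setting on the source child.
        cmds.append(f"zfs inherit compression {src}")
        cmds.append(f"zfs snapshot {src}@{snapshot}")
        # Send one child at a time; the received dataset inherits gzip-2.
        cmds.append(f"zfs send {src}@{snapshot} | zfs receive {backup}/{child}")
    return cmds


for cmd in backup_commands("tank/data", ["vm1", "vm2"], "backup/top", "bk1"):
    print(cmd)
```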
Re: [zfs-discuss] ZFS and VMware
Hi Paul,

I am using ESXi 4.0 with an NFS-on-ZFS datastore running on OSOL b134. It previously ran on Solaris 10u7 with VMware Server 2.x. The disks are SATA drives in a JBOD over FC. I'll try to summarize my experience here, although our system does not provide services to end users and thus is not very stressed (it supports our internal developers only).

There are some things to know before setting up ZFS storage, in my opinion.

1. You cannot shrink a zpool. You can only detach disks from mirrors, not from raidz. You can grow it, though, either by replacing disks with higher-capacity ones or by adding more disks to the same pool.

2. Because of ZFS's always-consistent on-disk structure, synchronous writes (especially random ones) are its worst enemy. The cure is to bring a pair of mirrored SSDs into the pool or to use a battery-backed write cache, especially if you want to use raidz.

3. VMware, regardless of NFS or iSCSI, will do synchronous writes. Because of point 2, if your workload and number of VMs are significant you will definitely need some kind of disk device based on memory rather than on platters. YMMV.

4. ZFS snapshotting is great, but it can burn a sizeable amount of disk space if you leave your VMs' local disks mounted without the noatime option (assuming they are Unix machines), because the vmdks get written to even if the processes inside the VM only read files. (In my case ~60 idle Linux machines burned up to 200MB/hr, generating an average of 2MB/sec of write traffic.)

5. Avoid putting disks of different size and performance in a zpool, unless of course they are doing a different job. ZFS doesn't take size or performance into account and spreads data out evenly. A zpool will perform like the slowest of its members (not all the time, but when the workload is high the slowest disk will be the limiting factor).

6. ZFS performs badly on disks that are more than 80% full. Keep that in mind when you size things.

7. ZFS compression works wonders, especially the default one: it costs little CPU and it doesn't increase latency (unless you have a very unbalanced CPU/disk system), and thus saves space and bandwidth.

8. By mirroring/raiding things at the OS level, ZFS effectively multiplies the bandwidth used on the bus. My SATA disks can sustain writes at 60MB/sec, but in a zpool made of 6 mirrors of 2 disks each behind a 2Gbit fibre link the maximum throughput is ~95MB/sec: the fibre maxes out at 190MB/sec, but Solaris needs to write to both disks of each mirror. You can partially solve this by putting each side of a mirror on a different storage unit and/or by increasing the number of paths to the disks.

9. Deduplication is great on paper and can be wonderful in virtualized environments, but it has BIG upfront costs: search around and do your math, but be aware that you'll need tons of RAM and SSDs to deduplicate multi-terabyte storage effectively. Also, the common opinion is that it's not ready for production.

If the analysis and tests of your use case tell you ZFS is a viable option, I think you should give it a try. Administration is wonderfully flexible and easy: once you have set up your zpools (which is the really critical phase) you can do practically anything you want in the most efficient way. So far I'm very pleased with it.

--
Simone Caldana
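The arithmetic behind points 4, 6 and 8 above can be checked with a few lines of Python. This is only a sketch using the figures quoted in the message (190MB/sec link, 2-way mirrors, 6 vdevs, 60MB/sec per disk, 200MB/hr of snapshot growth, 80% fill). The function names, the 1 GB = 1000 MB conversion and the 10 TB pool size are assumptions made for illustration.

```python
def mirrored_write_ceiling(link_mb_s, copies, vdevs, disk_mb_s):
    """Upper bound on application write throughput in MB/s."""
    link_limit = link_mb_s / copies   # every block crosses the link once per copy
    disk_limit = vdevs * disk_mb_s    # each mirror vdev absorbs one disk's worth
    return min(link_limit, disk_limit)


def snapshot_growth_gb(mb_per_hour, hours):
    """Space held by snapshots after `hours`, assuming 1 GB = 1000 MB."""
    return mb_per_hour * hours / 1000


def max_fill_tb(pool_tb, threshold=0.8):
    """Capacity to plan for if the pool should stay under `threshold` full."""
    return pool_tb * threshold


print(mirrored_write_ceiling(190, 2, 6, 60))   # 95.0, the link is the bottleneck
print(snapshot_growth_gb(200, 24))             # growth over one day at 200MB/hr
print(max_fill_tb(10))                         # hypothetical 10 TB pool
```

With these figures the six mirrors could absorb far more than the link delivers, which is why adding paths or splitting mirror sides across storage units helps more than adding disks.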