Re: [zfs-discuss] VM's on ZFS - 7210
Hi,

I don't think the local ZFS filesystem with raidz on the 7210 is the problem (assuming reasonably fast disks), but you can test it with e.g. bonnie++, available from sunfreeware.com (see the sketch at the end of this message). NFS itself is probably not the problem either, since iSCSI is just as slow, isn't it?

Some other ideas: check the network connection (did you test the network speed to the NAS?), and consider an upgrade to 10 GbE if that turns out to be the bottleneck. You can test the bandwidth by logging in to an ESX host via SSH and creating a larger (10 GB) virtual disk (vmdk) on an NFS-mounted share: time /usr/sbin/vmkfstools -c 10G -d eagerzeroedthick /nfspath/test.vmdk

It is also possible that the VMs are the bottleneck. Guests with heavy, small virtual-disk access, such as databases, can hammer a NAS and the network connection with many small IP packets, so a 1 GbE connection could be too slow (though virtualizing larger, busy databases is not a good idea in the first place). If you have a test NAS you can try various things, such as disabling the ZIL, and run a VM on it.

I hope I could help you a little. We also run vSphere 4 against a Solaris 10 NAS (NFS) and it runs very well, but only with VMs that have no or only small databases, and with a RAID controller with BBU write cache and RAID 5.

Regards (sorry for my English ;-)
Axel Denfeld

Mark wrote:
> We are using a 7210, 44 disks I believe, 11 stripes of RAIDz sets. When I installed I selected the best bang for the buck on the speed vs. capacity chart.
>
> We run about 30 VMs on it, across 3 ESX 4 servers. Right now it's all running NFS, and it sucks... sooo slow. iSCSI was no better.
>
> I am wondering how I can increase the performance, because they want to add more VMs... The good news is most are idle-ish, but even idle VMs create a lot of random chatter to the disks!
>
> So, a few options maybe:
>
> 1) Change to iSCSI mounts to ESX, and enable write cache on the LUNs since the 7210 is on a UPS.
> 2) Get a Logzilla SSD mirror. (Do SSDs fail? Do I really need a mirror?)
> 3) Reconfigure the NAS to RAID10 instead of RAIDz.
>
> Obviously all 3 would be ideal, though with an SSD can I keep using NFS and get the same performance, since the sync writes would be satisfied by the SSD?
>
> I am dreadful of getting the OK to spend the $$,$$$ on SSDs and then not get the performance increase we want.
>
> How would you weight these? I noticed in testing on a 5-disk OpenSolaris box that changing from a single RAIDz pool to RAID10 netted a larger IOPS increase than adding an Intel SSD as a Logzilla. That's not going to scale the same, though, with a 44-disk, 11-raidz striped set.
>
> Some thoughts? Would simply moving to write-cache-enabled iSCSI LUNs without an SSD speed things up a lot by itself?
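A hedged sketch of the bonnie++ test Axel suggests; the mount point and user are hypothetical, and -s should be at least twice the filer's RAM so the ARC can't serve the whole working set from cache:

    # sequential/random throughput test against the pool under test
    bonnie++ -d /pool0/benchtest -s 32g -u nobody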
Re: [zfs-discuss] VM's on ZFS - 7210
If I remember correctly, ESX always uses synchronous writes over NFS. If so, adding a dedicated log device (such as a DDRdrive) might help you out here. You should be able to test it by disabling the ZIL for a short while (sketched below) and seeing if performance improves (http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Disabling_the_ZIL_.28Don.27t.29).

I'm not sure how reliable the DDRdrive is in practice, but in theory it should be much better than an SSD, since DRAM doesn't wear.

--
Saso

On 08/27/2010 07:04 AM, Mark wrote:
> We are using a 7210, 44 disks I believe, 11 stripes of RAIDz sets. [...]
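For the ZIL-disable experiment Saso describes, the Evil Tuning Guide's method on a plain OpenSolaris box (not available on the closed 7210 appliance) looks roughly like this; test-only, since it trades data safety for speed:

    # disable the ZIL live via mdb; only affects datasets (re)mounted
    # afterwards, so remount or re-share the test dataset after setting it
    echo zil_disable/W0t1 | mdb -kw
    # ... run the workload, then re-enable:
    echo zil_disable/W0t0 | mdb -kw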
Re: [zfs-discuss] VM's on ZFS - 7210
Hi,

In a setup similar to yours I changed from a single 15-disk raidz2 to 7 mirrors of 2 disks each. The change in performance was stellar. The key point in serving storage to VMware is that it always issues synchronous writes, whether over iSCSI or NFS. When you have tens of VMs the resulting traffic is effectively random from the backing store's point of view, and random sync writes are the Achilles' heel of ZFS.

Now, about your options:

> 1) Change to iSCSI mounts to ESX, and enable write cache on the LUNs since the 7210 is on a UPS.

This won't save you from a crash.

> 2) Get a Logzilla SSD mirror. (Do SSDs fail? Do I really need a mirror?)

Yes, you do need a mirror, although a recent thread here suggests even that is not always enough.

> 3) Reconfigure the NAS to RAID10 instead of RAIDz.

This is the way I would go (see the sketch below). To make up for the lost space you can enable compression (the default lzjb), which should be more or less transparent and leads to very good savings (1.5x-2x).

Another piece of advice if your guests are Unix: unless you need it, mount your guest OS filesystems with noatime; in my experience this reduces the baseline chatter by about 50%. Another thing that helps is cache devices: even if they are no faster than the pool's disks, they free up IOPS that can be used for writes.

To summarize, I'd go for the mirror setup; then, if that's not enough, a pair of SSDs for the SLOG would surely help.

On 27 Aug 2010, at 07:04, Mark wrote:
> We are using a 7210, 44 disks I believe, 11 stripes of RAIDz sets. [...]

--
Simone Caldana
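A minimal sketch of the layout change Simone describes, assuming a fresh pool named tank and hypothetical disk names (an existing raidz pool cannot be restriped in place; the data has to be recreated or restored from backup):

    # striped mirrors ("RAID10") instead of raidz
    zpool create tank mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0 \
                      mirror c1t4d0 c1t5d0
    # default (lzjb) compression to win back some of the lost space
    zfs set compression=on tank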
Re: [zfs-discuss] VM's on ZFS - 7210
Saso is correct - ESX/i always uses F_SYNC for all writes, and that is for sure your performance killer. Do a snoop | grep sync (spelled out below) and you'll see the sync write calls from VMware. We use DDRdrives in our production VMware storage and they are excellent for solving this problem. Our cluster supports 50,000 users and we've had no issues at all. Do not use an SSD for the ZIL - as soon as it fills up you will be very unhappy.
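Spelling out the snoop suggestion as a sketch (run on the filer; the interface name is hypothetical):

    # watch NFS traffic from the ESX hosts; snoop's summary lines for
    # NFSv3 WRITEs show the stable-storage flag, e.g. "Stable=FSYNC"
    snoop -d e1000g0 port 2049 | grep -i sync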
Re: [zfs-discuss] VM's on ZFS - 7210
On Fri, August 27, 2010 08:46, Eff Norwood wrote:
> Saso is correct - ESX/i always uses F_SYNC for all writes and that is for sure your performance killer. [...] Do not use an SSD for the ZIL - as soon as it fills up you will be very unhappy.

What do you mean by "fills up"? There is a very limited amount of data that is ever written to a slog device: between 5 and 30 seconds' worth (rough numbers below). Furthermore, a log device will at most be <= 50% of the size of physical memory.
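As a back-of-the-envelope illustration of why (my own assumed numbers, not from the thread): take a single saturated 1 GbE link at ~120 MB/s and a 10-second window, within David's 5-30 s range:

    # worst case: link throughput (MB/s) x commit window (s)
    echo $((120 * 10))   # => 1200, i.e. ~1.2 GB outstanding

so even a 32GB slog device is mostly empty space at any given moment.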
Re: [zfs-discuss] VM's on ZFS - 7210
David asked me what I meant by "filled up". If you make the unwise decision to use an SSD as your ZIL, at some point days to weeks after you install it, all of the pages will have been allocated, and you will suddenly find the device to be slower than a conventional disk drive. This is due to the way SSDs work. A great write-up of how this works is here: http://www.anandtech.com/show/2738/8

The industry workaround for this issue is called TRIM, and AFAIK the current implementation of TRIM in Solaris does not work for ZIL devices, only for pool devices. If it did, then SSDs would not be a bad option - but the DDRdrive is so much better I wouldn't waste the time. If you don't believe me, try it and post your benchmarks for hour one, day one and week one. ;)
Re: [zfs-discuss] VM's on ZFS - 7210
On Aug 27, 2010, at 1:04 AM, Mark wrote:
> We are using a 7210, 44 disks I believe, 11 stripes of RAIDz sets. [...] Right now it's all running NFS, and it sucks... sooo slow.

I have a Dell 2950 server with a PERC6 controller with 512MB of write-back cache and a pool of mirrors made out of 14 15K SAS drives. The ZIL is integrated. This is serving 30 VMs on 3 ESXi hosts, and performance is good.

I find the #1 operation is random reads, so I doubt the ZIL will make as much difference as a very large L2ARC will. I'd hit that first; it's a cheaper buy (example below). Random reads across a device that is effectively infinitely large compared to system RAM, spinning at 7200 RPM, are a killer. Cache as much as possible in the hope of hitting cache rather than disk.

Breaking your pool into two or three, setting different vdev types on different types of disks, and tiering your VMs based on their performance profiles would also help.

-Ross
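On a plain OpenSolaris box (the closed 7210 only takes Sun's own Readzilla option), Ross's "hit the L2ARC first" advice would look something like this; pool and device names are hypothetical:

    # add an SSD as an L2ARC cache device -- it warms up over time and
    # can be removed again later without risk to the pool
    zpool add tank cache c3t0d0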
Re: [zfs-discuss] VM's on ZFS - 7210
On Fri, Aug 27, 2010 at 05:51:38AM -0700, David Magda wrote:
> What do you mean by "fills up"? There is a very limited amount of data that is ever written to a slog device: between 5 and 30 seconds' worth. Furthermore, a log device will at most be <= 50% of the size of physical memory.

I would second this. Excellent results here with "small" 32GB Intel X25-E's. Even 32GB is overkill for a ZIL.

Ray
Re: [zfs-discuss] VM's on ZFS - 7210
On Fri, Aug 27 at 6:16, Eff Norwood wrote:
> David asked me what I meant by "filled up". If you make the unwise decision to use an SSD as your ZIL, at some point days to weeks after you install it, all of the pages will have been allocated, and you will suddenly find the device to be slower than a conventional disk drive. [...]

While it's an interesting write-up, I think some assumptions are being made that may not be quite correct. In the case of a ZIL, with a relatively small data set (< 1GB typically) on your SSD, a correctly designed drive will always be running with many gigabytes of "scratch" area available. Fully written SSDs may write more slowly than partially written SSDs in some workloads, but I wouldn't expect a ZIL usage model to create the scenario you linked to, given the limited size of the data set.

--
Eric D. Mudama
edmud...@mail.bounceswoosh.org
Re: [zfs-discuss] VM's on ZFS - 7210
Hey, thanks for the replies everyone. Sadly most of those options will not work: since we are using a Sun Unified Storage 7210, the only option is to buy the Sun SSDs for it, which run about $15k USD for a pair. We also don't have the ability to shut off the ZIL, or any of the other options one might have under OpenSolaris itself :(

It sounds like I do want to change to a RAID10 mirror instead of RAIDz. It sounds like enabling the write cache without a ZIL device in place might work, but would lead to corruption should something crash.

So the question is: with a proper ZIL SSD from Sun, and RAID10... would I be able to support all the VMs, or would it still be pushing the limits of a 44-disk pool? Today there are 30 VMs; 25 are Windows 2008 and 5 are CentOS 5. A couple are DB servers that see very light load. The only thing that sees any real load is a build server, which we get a lot of complaints about.

I did some testing and posted my results a month ago, using OpenSolaris and 5 disks with my personal Intel SSD, and saw good results, but I don't know how it will scale :(
Re: [zfs-discuss] VM's on ZFS - 7210
markwo...@yahoo.com said:
> So the question is: with a proper ZIL SSD from Sun, and RAID10... would I be able to support all the VMs, or would it still be pushing the limits of a 44-disk pool?

If it weren't a closed 7000-series appliance, I'd suggest running the "zilstat" script (example below). It should make it clear whether (and by how much) you would benefit from the Logzilla addition in your current raidz configuration. Maybe there's some equivalent in the built-in FishWorks analytics which can give you the same information.

Marion
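For anyone on an open platform, a sketch of running it (assuming Richard Elling's zilstat.ksh, which needs DTrace and root; the invocation is illustrative):

    # sample ZIL activity once a second for a minute; sustained non-zero
    # columns mean sync writes are hitting the ZIL and a slog could help
    ./zilstat.ksh 1 60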
Re: [zfs-discuss] VM's on ZFS - 7210
On Fri, Aug 27, 2010 at 11:57:17AM -0700, Marion Hakanson wrote:
> If it weren't a closed 7000-series appliance, I'd suggest running the "zilstat" script. [...]

To the OP... I'd think turning the write cache on would help, if that's an option. Does the box have reliable power (UPS, etc.)?

Ray
Re: [zfs-discuss] VM's on ZFS - 7210
It does, it's on a pair of large APCs.

Right now we're using NFS for our ESX servers. The only iSCSI LUNs I have are mounted inside a couple of Windows VMs. I'd have to migrate all our VMs to iSCSI, which I'm willing to do if it would help and not cause other issues. So far the 7210 appliance has been very stable.

I like the zilstat script. I emailed a support tech I am working with on another issue to ask if one of the built-in analytics DTrace scripts can get that data.

I found one called L2ARC Eligibility: 3235 true, 66 false. This makes it sound like we would benefit from a Readzilla, not quite what I had expected... I'm sure I don't know what I'm looking at anyway :)
Re: [zfs-discuss] VM's on ZFS - 7210
On Fri, Aug 27, 2010 at 12:46:42PM -0700, Mark wrote:
> It does, it's on a pair of large APCs.
>
> Right now we're using NFS for our ESX servers. [...] I found one called L2ARC Eligibility: 3235 true, 66 false. This makes it sound like we would benefit from a Readzilla, not quite what I had expected...

Obviously it depends on your workload, and YMMV, but for us (we're also using NFS and love the flexibility it provides with ESX), without a ZIL accelerator things are pretty dog slow.

My impression is that synchronous writes are used with iSCSI too, so if your problems stem from not having a ZIL device with NFS, they could very easily reappear even with iSCSI. Someone else may correct me on that...

Ray
Re: [zfs-discuss] VM's on ZFS - 7210
Wouldn't it be possible to saturate the SSD ZIL with enough backlogged sync writes? What I mean is: doesn't the ZIL eventually need to make it to the pool, and if the pool as a whole (spinning disks) can't keep up with 30+ VMs' worth of write requests, couldn't you fill up the ZIL that way?
Re: [zfs-discuss] VM's on ZFS - 7210
On Aug 27, 2010, at 2:32 PM, Mark wrote:
> Sadly most of those options will not work: since we are using a Sun Unified Storage 7210, the only option is to buy the Sun SSDs for it, which run about $15k USD for a pair. [...]
>
> So the question is: with a proper ZIL SSD from Sun, and RAID10... would I be able to support all the VMs, or would it still be pushing the limits of a 44-disk pool?

We run roughly that number of VMs on ESXi 4 using a 7410 and a 7310 via NFS. The 7410 and 7310 have fewer disks (24), but they are arranged in a mirror configuration. Each has both Readzilla and Logzilla SSDs. Our VMs are similarly lightly loaded (much like yours: a mix of Windows and Ubuntu, about 25% running a DB server with very little load). We use compression but not deduplication. It has worked extremely well for us. No complaints on speed, very stable.

From what I have read on this list, iSCSI will not be a huge speed improvement for you (though we haven't tried it), and you give up a lot of management flexibility vs. NFS. Based on our experience I would say the 7210 should be able to support your needs if you put SSDs in (and the 7210 has more disks than our 7310 or 7410). Of course switching to a mirror pool requires destroying your current configuration, so it isn't easy. You might also need to remove some HDDs to make room for the SSDs.

As for analytics, the ARC stats (hit/miss) are available, which will give you some indication of whether an L2ARC will help. On the SLOG side, look at latency by file and operation for a VM that is having performance issues: is it showing high latency on NFS writes?

Good luck,
Ware
Re: [zfs-discuss] VM's on ZFS - 7210
On Fri, Aug 27, 2010 at 01:22:15PM -0700, John wrote:
> Wouldn't it be possible to saturate the SSD ZIL with enough backlogged sync writes?

It depends on the workload, of course, but we have 50+ VM server environments running off of 22x 1TB SATA + 32GB Intel X25-E SSDs with no problems whatsoever. I don't have the zilstat numbers handy, but we're not pushing enough I/O for the slog device to even come close to sweating.

Note that our VMs are in a Lab Manager environment and are spun up and down mostly to do compiles; we're not pushing huge amounts of non-random I/O.

Ray
Re: [zfs-discuss] VM's on ZFS - 7210
No. From what I've seen, ZFS will periodically flush writes from the ZIL to disk. You may instead run into a "read starvation" situation, where ZFS is so busy flushing to disk that you won't get reads. If you have VMs whose developers expect low-latency interactivity, they get unhappy. Trust me. :)

One way to address this is to have an ARC that's large enough, or to add a cache device to the zpool. I have a config where ~20 ESX VMs share a single OpenSolaris NFS server. It has an Intel X25-E for the ZIL and an X25-M for cache. It seems to be doing OK. There are actually two of these setups. On one of them the cache SSD died recently, and you can feel it when ZFS goes to disk for some uncached piece of data. I'll be replacing the cache SSD next week (a one-liner; see below).

-Paul

On 8/27/10 1:22 PM, John wrote:
> Wouldn't it be possible to saturate the SSD ZIL with enough backlogged sync writes? [...]
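Since L2ARC cache devices hold no irreplaceable state, swapping a dead one on a plain OpenSolaris system is straightforward; pool and device names are hypothetical:

    # drop the failed cache device and add its replacement
    zpool remove tank c4t1d0
    zpool add tank cache c4t2d0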
Re: [zfs-discuss] VM's on ZFS - 7210
Hi Mark,

I have installed several 7000-series systems, some running hundreds of VMs. I can try to help you, but to find where exactly the problem is I may need more information. I understand that you have no Logzillas, so most probably you are using the 7110 with 250 GB drives.

All 7000-series systems have a module called Analytics where you can monitor the performance of many components. Please start by selecting "enable advanced analytics" in the preferences tab of the configuration menu.

Please make sure that you are running the latest firmware release, 2010.Q1.2.1: http://wikis.sun.com/display/FishWorks/Software+Updates - and please read all the release notes from the firmware you are running up to the firmware level you will upgrade to.

I understand that you are using iSCSI. On earlier firmware releases, NFS can increase performance significantly; on recent firmware, iSCSI and NFS performance is very close, but I'd still choose NFS over iSCSI for most installations. Do so if you can.

Please start monitoring the following datasets using Analytics:
- Network transfer broken down by interface or device (check whether you are stuck at gigabit Ethernet, etc.)
- iSCSI IOPS
- iSCSI IOPS broken down by LUN (to understand which LUN demands more performance; with newer firmware you may find it useful to isolate some LUNs by defining different pools - beware that this may not offer much help if you use RAID10)
- iSCSI IOPS broken down by type
- iSCSI write IOPS latency
- iSCSI latency
- ARC hit/miss ratio
- ARC size

Here are my recommendations (if you can share some screenshots from Analytics I may be able to help more):
1) Convert to RAID10 - this will give you 4-5x more IOPS on both reads and writes.
2) Using Analytics, decide whether increasing the L1 cache may help you. If it can, increase the L1 cache.
3) Check the I/O size in Analytics against your LUN definitions. I suggest that the LUN block size be lower than the I/O size.
4) Enable write caching for a short time and watch the Analytics reports; if you see much improvement, you can invest in SSDs.
5) Enable jumbo frames, throughout the path (see the example after this message).
6) Use multiple interfaces to access the data.

PS: I think you asked whether you can disable the ZIL on the 7000 series. The answer is yes, and you can decide it at share/LUN granularity.

PS: We usually recommend a Logzilla for VMware users, but I have seen 7210s running 30-40 VMs without much problem with no Logzilla; for sure this depends on the load pattern.

Very best regards,
Mertol

Mertol Ozyoney
Storage Practice - Sales Manager
Sun Microsystems, TR
Istanbul TR
Phone +902123352200
Mobile +905339310752
Fax +90212335
Email mertol.ozyo...@sun.com

-Original Message-
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Mark
Sent: Friday, August 27, 2010 10:47 PM
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] VM's on ZFS - 7210

It does, it's on a pair of large APCs. [...]
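For recommendation 5, on a plain (Open)Solaris server jumbo frames look roughly like the below; on the 7000 appliance the MTU is instead set per interface in the BUI's network screen, and the switches and the ESX vSwitch/vmkernel ports must be configured to match. The interface name is hypothetical:

    # raise the MTU to 9000 and verify it took
    dladm set-linkprop -p mtu=9000 e1000g0
    dladm show-linkprop -p mtu e1000g0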
Re: [zfs-discuss] VM's on ZFS - 7210
By all means, please try it, validate it yourself, and post your results from hour one, day one and week one. In a ZIL use case, although the data set is small, the device is always writing a small, ever-changing (from the SSD's perspective) data set. The SSD does not know it can release previously written pages, and without TRIM there is no way to tell it to. That means every ZIL write consumes new SSD pages. After some amount of time, all of the empty pages will have been consumed, and the SSD will have to go into the read-erase-write cycle, which is incredibly slow and is the whole point of TRIM.

I can assure you from my extensive benchmarking with all major SSDs in the role of a ZIL that you will eventually not be happy. Depending on your use case it might take months, but eventually all those free pages will be consumed, and read-erase-write is how the SSD world works after that - unless you have TRIM, which we don't yet.
Re: [zfs-discuss] VM's on ZFS - 7210
On Fri, Aug 27, 2010 at 03:51:39PM -0700, Eff Norwood wrote:
> By all means, please try it, validate it yourself, and post your results from hour one, day one and week one. [...] Depending on your use case it might take months, but eventually all those free pages will be consumed, and read-erase-write is how the SSD world works after that - unless you have TRIM, which we don't yet.

Is there a way to measure how many SSD pages are taken up? We've had a box running for nearly 8 months now - it's performing well, but I'd be interested to see whether we'll be close to (theoretically) hitting this problem or not.

Ray
Re: [zfs-discuss] VM's on ZFS - 7210
On 27 Aug 2010, at 21:46, Mark wrote:
> Right now we're using NFS for our ESX servers. The only iSCSI LUNs I have are mounted inside a couple of Windows VMs. I'd have to migrate all our VMs to iSCSI, which I'm willing to do if it would help and not cause other issues.

It won't change anything. VMware issues sync writes to both NFS and iSCSI, so you'll gain nothing and you'll lose the flexibility of NFS. But since the "problem" lies in an "abuse of sync writes" (async writes from the guest OS become sync writes to the VMware datastore), I suggest you do one of two things:

1. Move your data to NFS/CIFS. Mount the very same NFS server (a different dataset, of course) inside the guest OS and put your data there. PRO: optimized network traffic (operations will be "to the byte" and not "to the block") and reduced (albeit not zero, since NFS still syncs on file close etc.; I don't know about CIFS) "sync abuse"; you can also access your data from outside the VM easily. CON: may not be possible depending on the kind of application, and you are bound to the network file protocol's semantics, which may not be enough.

2. Move your data to an iSCSI disk. Export a zvol to the guest OS, format it, and move your data there, minding the volblocksize (see the sketch below). PRO: no "abuse of sync writes": async writes stay async, since the guest OS has complete control over the device. CON: less optimized network traffic (should be more or less what you see with vmdks), and possibly performance/stability issues, since iSCSI support is not yet mature in certain OSes (e.g. https://bugzilla.redhat.com/show_bug.cgi?id=583218).

The "abuse of sync writes" has a PRO, though: it makes your data safer in case of a crash or power loss. Back in the days before caches, all writes were inherently synchronous, and a programmer knew that after a write returned, the data was safe. Nowadays that's no longer the case, and the coder has to use sync writes (or other mechanisms) wisely to ensure the safety of the data. Battery-backed, RAM-based SLOGs give you the best of both worlds, but they cost money :)

--
Simone Caldana
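A minimal sketch of option 2 on OpenSolaris with COMSTAR; the pool, volume name, size and block size are hypothetical, and the target/view setup is abbreviated to the essentials:

    # create a zvol with a block size matched to the guest's workload
    zfs create -V 100g -o volblocksize=8k tank/guestdata
    # register it as a SCSI logical unit, then expose it over iSCSI
    sbdadm create-lu /dev/zvol/rdsk/tank/guestdata
    stmfadm add-view <GUID printed by create-lu>
    itadm create-target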
Re: [zfs-discuss] VM's on ZFS - 7210
I can't think of an easy way to measure the pages that have not yet been consumed, since it's really an SSD controller function that is hidden from the OS - and there's the variable of over-provisioning on top of that. If anyone would like to really get into what goes on inside an SSD that makes it a bad choice for a ZIL, you can start here:

http://en.wikipedia.org/wiki/TRIM_%28SSD_command%29

and

http://en.wikipedia.org/wiki/Write_amplification

which will be more than you might ever have wanted to know. :)
Re: [zfs-discuss] VM's on ZFS - 7210
On Sat, Aug 28, 2010 at 05:50:38AM -0700, Eff Norwood wrote:
> I can't think of an easy way to measure the pages that have not yet been consumed, since it's really an SSD controller function that is hidden from the OS [...]

So has anyone on this list actually run into this issue? Tons of people use SSD-backed slog devices...

The theory sounds "sound", but if it's not really happening much in practice then I'm not too worried - especially when I can replace a drive in my slog mirror for $400 or so if problems do arise (the alternative being much more expensive DRAM-backed devices).

Ray
Re: [zfs-discuss] VM's on ZFS - 7210
On Sat, Aug 28, 2010 at 8:19 AM, Ray Van Dolson wrote:
> So has anyone on this list actually run into this issue? Tons of people use SSD-backed slog devices...
>
> The theory sounds "sound", but if it's not really happening much in practice then I'm not too worried. [...]

Presumably this problem is being worked on...

http://hg.genunix.org/onnv-gate.hg/rev/d560524b6bb6

Notice that it implements:

866610 Add SATA TRIM support

With this in place, I would imagine a next step is for ZFS to issue TRIM commands as ZIL entries are committed to the data disks.

--
Mike Gerdts
http://mgerdts.blogspot.com/
Re: [zfs-discuss] VM's on ZFS - 7210
>>>>> "en" == Eff Norwood writes:

    en> http://www.anandtech.com/show/2738/8

but a few pages later: http://www.anandtech.com/show/2738/25

So, as you say, "with all major SSDs in the role of a ZIL you will eventually not be happy" is true, but you seem to have accidentally left out the "EXCEPT INTEL!" Oops! Funnier still, the EXCEPT INTEL is right there in exactly the article YOU cited.

However, that's not the end of it. Searching this very mailing list for 'anandtech' I found this cited about ten times:

http://www.anandtech.com/show/2899/8

anandtech does not think TRIM / dirty drives are a problem any longer. You might want to redo whatever tests you did (or else read the newer anandtech articles). I've made the same mistake of passing around anandtech links without keeping up with their latest posts, but the thing is, the link debunking your ideas was posted on this list *so* *many* *times* and over such a long interval!

You can also use the anandtech articles as a point of reference for how you might write up your "extensive testing" of "all major" SSDs in a way that will "assure" people your conclusions are correct. (HINT: list the SSDs you tested. Describe the testing method. Results would be nice, too, but the first two were missing from your post. They help a lot, and do not take much time to include - though leaving them out does help the FUD spread further, if you are trying to promote this "DDRDrive" with the silly external power brick.)

    en> I can't think of an easy way to measure pages that have not
    en> been consumed since it's really an SSD controller function
    en> which is obfuscated from the OS,

Yeah, SSDs are largely just a different way of selling proprietary software, but I guess a lot of "hardware" is.
Re: [zfs-discuss] VM's on ZFS - 7210
As I said, by all means please try it, and post your benchmarks for the first hour, first day, first week and then first month. The data will be of interest to you. On a subjective basis, if you feel that an SSD is working just fine as your ZIL, run with it. Good luck!