Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-30 Thread Eff Norwood
As I said, by all means try it and post your benchmarks for the first hour, first day, first week and then first month. The data will be of interest to you. If, on a subjective basis, you feel that an SSD is working just fine as your ZIL, run with it. Good luck!


Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-29 Thread Miles Nordin
 en == Eff Norwood sm...@jsvp.com writes:

en http://www.anandtech.com/show/2738/8

but a few pages later:

  http://www.anandtech.com/show/2738/25

so, as you say, ``with all major SSDs in the role of a ZIL you will
eventually not be happy'' is true, but you seem to have accidentally
left out the ``EXCEPT INTEL!''  Oops!  Funnier still, the EXCEPT INTEL
is right there in exactly the article YOU cited.

however, that's not the end of it.  Searching this very mailing list
for 'anandtech' I found this cited about ten times:

 http://www.anandtech.com/show/2899/8

anandtech does not think TRIM / dirty drives are a problem any longer.
You might want to redo whatever tests you did (or else read newer
anandtech articles).

I've made the same mistake of passing around anandtech links without
keeping up with their latest posts, but the thing is, that link
debunking your ideas was posted on this list *so* *many* *times* and
over such a long interval!

You can also use the anandtech articles as a point of reference for
how you might write up your ``extensive testing'' of ``all major''
SSD's in a way that will ``assure'' people your conclusions are
correct.  (HINT: list the SSD's you tested.  describe the testing
method.  Results would be nice, too, but the first two were missing
from your post.  They help a lot, and do not take much time to include,
though leaving them out does help FUD spread further if you are trying
to promote this ``DDRDrive'' with the silly external power brick.)

en I can't think of an easy way to measure pages that have not
en been consumed since it's really an SSD controller function
en which is obfuscated from the OS,

yeah, SSD's are largely just a different way of selling proprietary
software, but I guess a lot of ``hardware'' is.




Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-28 Thread Eff Norwood
I can't think of an easy way to measure pages that have not yet been consumed, since that is really an SSD controller function hidden from the OS - and over-provisioning adds another variable on top of that. If anyone would like to really get into what's going on inside an SSD that makes it a bad choice for a ZIL, you can start here:

http://en.wikipedia.org/wiki/TRIM_%28SSD_command%29

and

http://en.wikipedia.org/wiki/Write_amplification

Which will be more than you might have ever wanted to know. :)


Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-28 Thread Ray Van Dolson
On Sat, Aug 28, 2010 at 05:50:38AM -0700, Eff Norwood wrote:
 I can't think of an easy way to measure pages that have not been consumed 
 since it's really an SSD controller function which is obfuscated from the OS, 
 and add the variable of over provisioning on top of that. If anyone would 
 like to really get into what's going on inside of an SSD that makes it a bad 
 choice for a ZIL, you can start here:
 
 http://en.wikipedia.org/wiki/TRIM_%28SSD_command%29
 
 and
 
 http://en.wikipedia.org/wiki/Write_amplification
 
 Which will be more than you might have ever wanted to know. :)

So has anyone on this list actually run into this issue?  Tons of
people use SSD-backed slog devices...

The theory sounds sound, but if it's not really happening much in
practice then I'm not too worried.  Especially when I can replace a
drive in my slog mirror for $400 or so if problems do arise... (the
alternative being much more expensive DRAM-backed devices)

Ray


Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-28 Thread Mike Gerdts
On Sat, Aug 28, 2010 at 8:19 AM, Ray Van Dolson rvandol...@esri.com wrote:
 On Sat, Aug 28, 2010 at 05:50:38AM -0700, Eff Norwood wrote:
 I can't think of an easy way to measure pages that have not been consumed 
 since it's really an SSD controller function which is obfuscated from the 
 OS, and add the variable of over provisioning on top of that. If anyone 
 would like to really get into what's going on inside of an SSD that makes it 
 a bad choice for a ZIL, you can start here:

 http://en.wikipedia.org/wiki/TRIM_%28SSD_command%29

 and

 http://en.wikipedia.org/wiki/Write_amplification

 Which will be more than you might have ever wanted to know. :)

 So has anyone on this list actually run into this issue?  Tons of
 people use SSD-backed slog devices...

 The theory sounds sound, but if it's not really happening much in
 practice then I'm not too worried.  Especially when I can replace a
 drive from my slog mirror for a $400 or so if problems do arise... (the
 alternative being much more expensive DRAM backed devices)

Presumably this problem is being worked...

http://hg.genunix.org/onnv-gate.hg/rev/d560524b6bb6

Notice that it implements:

866610  Add SATA TRIM support

With this in place, I would imagine the next step is for ZFS to issue
TRIM commands once ZIL entries have been committed to the data disks.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/


Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread Axel Denfeld

Hi,

I think the local ZFS filesystem with raidz on the 7210 is not the problem (assuming the drives are fast), but you can test it with e.g. bonnie++ (downloadable at sunfreeware.com). NFS should not be the problem either, because iSCSI is also very slow (isn't it?).
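As a rough illustration only (the path, size and user below are placeholders, and the exact flags depend on your bonnie++ version), a local test could look like:

  # run as a non-root user against a directory on the pool,
  # with a file size well above the amount of RAM (size is in MB)
  bonnie++ -d /pool/bonnie-test -s 16384 -u nobody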


Some other ideas:

Network connection: did you test the network speed to the NAS? Maybe upgrade to 10 GBit if that is the bottleneck. You can test the speed/bandwidth by logging on to an ESX host via ssh and creating a larger (10 GByte) virtual disk (vmdk) on an NFS-mounted share:

  time /usr/sbin/vmkfstools -c 10G -d eagerzeroedthick /nfspath/test.vmdk


It is also possible that the VMs themselves are the bottleneck. Guests with heavy small (virtual) disk access, such as databases, can hammer a NAS and the network connection with many small IP packets, so a 1 GBit connection could be too slow (though virtualizing larger, busy databases is not a good idea anyway).


If you have a test NAS you can try various things, such as disabling the ZIL, and run a VM on it.


I hope this helps a little. We also run vSphere 4 against a Solaris 10 NAS (NFS) and it works very well, but only with VMs that have no or only small databases, and with a RAID controller with BBU-backed write cache and RAID 5.


regards (sorry for my english ;-)
Axel Denfeld



Mark wrote:

We are using a 7210, 44 disks I believe, 11 stripes of RAIDz sets.  When I 
installed I selected the best bang for the buck on the speed vs capacity chart.

We run about 30 VM's on it, across 3 ESX 4 servers.  Right now, its all running 
NFS, and it sucks... sooo slow.

iSCSI was no better.   


I am wondering how I can increase the performance, cause they want to add more 
vm's... the good news is most are idleish, but even idle vm's create a lot of 
random chatter to the disks!

So a few options maybe... 


1) Change to iSCSI mounts to ESX, and enable write-cache on the LUN's since the 
7210 is on a UPS.
2) get a Logzilla SSD mirror.  (do ssd's fail, do I really need a mirror?)
3) reconfigure the NAS to a RAID10 instead of RAIDz

Obviously all 3 would be ideal , though with a SSD can I keep using NFS for the 
same performance since the R_SYNC's would be satisfied with the SSD?

I am dreadful of getting the OK to spend the $$,$$$ SSD's and then not get the 
performance increase we want.

How would you weight these?  I noticed in testing on a 5 disk OpenSolaris, that 
changing from a single RAIDz pool to RAID10 netted a larger IOP increase then 
adding an Intel SSD as a Logzilla.  That's not going to scale the same though 
with a 44 disk, 11 raidz striped RAID set.

Some thoughts?  Would simply moving to write-cache enabled iSCSI LUN's without 
a SSD speed things up a lot by itself?
  




Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread Saso Kiselkov

If I remember correctly, ESX always uses synchronous writes over NFS. If
so, adding a dedicated log device (such as a DDRdrive) might help you
out here. You should be able to test it by disabling the ZIL for a short
while and see if performance improves
(http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Disabling_the_ZIL_.28Don.27t.29).
I'm not sure how reliable the DDRdrive is in practice, but in theory it
should be much better than an SSD, since DRAM doesn't wear.
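For a quick test only (never leave this in production), the Evil Tuning Guide's approach on OpenSolaris builds of that era was roughly the following; treat it as a sketch and re-check the guide for your build:

  # disable the ZIL globally (takes effect when a dataset is next mounted/shared)
  echo zil_disable/W0t1 | mdb -kw
  # ... remount or re-share the filesystem, run the benchmark ...
  # then turn the ZIL back on
  echo zil_disable/W0t0 | mdb -kw

Later builds add a per-dataset 'sync' property, which is a safer way to experiment.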

--
Saso

On 08/27/2010 07:04 AM, Mark wrote:
 We are using a 7210, 44 disks I believe, 11 stripes of RAIDz sets.  When I 
 installed I selected the best bang for the buck on the speed vs capacity 
 chart.
 
 We run about 30 VM's on it, across 3 ESX 4 servers.  Right now, its all 
 running NFS, and it sucks... sooo slow.
 
 iSCSI was no better.   
 
 I am wondering how I can increase the performance, cause they want to add 
 more vm's... the good news is most are idleish, but even idle vm's create a 
 lot of random chatter to the disks!
 
 So a few options maybe... 
 
 1) Change to iSCSI mounts to ESX, and enable write-cache on the LUN's since 
 the 7210 is on a UPS.
 2) get a Logzilla SSD mirror.  (do ssd's fail, do I really need a mirror?)
 3) reconfigure the NAS to a RAID10 instead of RAIDz
 
 Obviously all 3 would be ideal , though with a SSD can I keep using NFS for 
 the same performance since the R_SYNC's would be satisfied with the SSD?
 
 I am dreadful of getting the OK to spend the $$,$$$ SSD's and then not get 
 the performance increase we want.
 
 How would you weight these?  I noticed in testing on a 5 disk OpenSolaris, 
 that changing from a single RAIDz pool to RAID10 netted a larger IOP increase 
 then adding an Intel SSD as a Logzilla.  That's not going to scale the same 
 though with a 44 disk, 11 raidz striped RAID set.
 
 Some thoughts?  Would simply moving to write-cache enabled iSCSI LUN's 
 without a SSD speed things up a lot by itself?



Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread Simone Caldana
Hi,

In a setup similar to yours I changed from a single 15-disk raidz2 to 7 mirrors of 2 disks each. The change in performance was stellar. The key point in serving storage to VMware is that it always issues synchronous writes, whether over iSCSI or NFS. When you have tens of VMs the resulting traffic is always random from the backing store's point of view, and random sync writes are the Achilles' heel of ZFS.

now about your options

 1) Change to iSCSI mounts to ESX, and enable write-cache on the LUN's since 
 the 7210 is on a UPS.
this won't save you from a crash

 2) get a Logzilla SSD mirror.  (do ssd's fail, do I really need a mirror?)
yes, you do need a mirror, although a recent thread here suggests that even a mirror may not be enough.

 3) reconfigure the NAS to a RAID10 instead of RAIDz

this is the way I would go. To make up for the lost space you can enable lzjb compression (the default algorithm), which should be more or less transparent and gives very good savings (1.5x - 2x).
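For reference, on a plain OpenSolaris box (the 7210 exposes the same setting through its own UI) this is roughly a one-liner; the dataset name is only an example:

  zfs set compression=lzjb tank/vmware
  zfs get compressratio tank/vmware    # check the savings actually achieved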

One more piece of advice if your guests are Unix: unless you need it, mount your guest OS filesystems with noatime; this reduces basic chatter by about 50% in my experience.
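A minimal example for a Linux guest's /etc/fstab (device and filesystem type are purely illustrative):

  /dev/sda1   /   ext3   defaults,noatime   0 1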
Another thing that helps is cache devices: even if they aren't faster than the pool's disks, they free up IOPS that can then be used for writes.

To summarize: I'd go for the mirror setup first; if that's not enough, a pair of SSDs for the SLOG would surely help.


On 27 Aug 2010, at 07:04, Mark wrote:

 We are using a 7210, 44 disks I believe, 11 stripes of RAIDz sets.  When I 
 installed I selected the best bang for the buck on the speed vs capacity 
 chart.
 
 We run about 30 VM's on it, across 3 ESX 4 servers.  Right now, its all 
 running NFS, and it sucks... sooo slow.
 
 iSCSI was no better.   
 
 I am wondering how I can increase the performance, cause they want to add 
 more vm's... the good news is most are idleish, but even idle vm's create a 
 lot of random chatter to the disks!
 
 So a few options maybe... 
 
 1) Change to iSCSI mounts to ESX, and enable write-cache on the LUN's since 
 the 7210 is on a UPS.
 2) get a Logzilla SSD mirror.  (do ssd's fail, do I really need a mirror?)
 3) reconfigure the NAS to a RAID10 instead of RAIDz
 
 Obviously all 3 would be ideal , though with a SSD can I keep using NFS for 
 the same performance since the R_SYNC's would be satisfied with the SSD?
 
 I am dreadful of getting the OK to spend the $$,$$$ SSD's and then not get 
 the performance increase we want.
 
 How would you weight these?  I noticed in testing on a 5 disk OpenSolaris, 
 that changing from a single RAIDz pool to RAID10 netted a larger IOP increase 
 then adding an Intel SSD as a Logzilla.  That's not going to scale the same 
 though with a 44 disk, 11 raidz striped RAID set.
 
 Some thoughts?  Would simply moving to write-cache enabled iSCSI LUN's 
 without a SSD speed things up a lot by itself?

-- 
Simone Caldana



Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread Eff Norwood
Saso is correct - ESX/i always uses F_SYNC for all writes, and that is for sure your performance killer. Do a snoop | grep sync and you'll see the sync write calls from VMware. We use DDRdrives in our production VMware storage and they are excellent for solving this problem. Our cluster supports 50,000 users and we've had no issues at all. Do not use an SSD for the ZIL - as soon as it fills up you will be very unhappy.
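Something along these lines should show it (the interface name is a placeholder, and the exact snoop filter and output format vary by build):

  snoop -d e1000g0 -v port 2049 | grep -i sync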


Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread David Magda
On Fri, August 27, 2010 08:46, Eff Norwood wrote:
 Saso is correct - ESX/i always uses F_SYNC for all writes and that is for
 sure your performance killer. Do a snoop | grep sync and you'll see the
 sync write calls from VMWare. We use DDRdrives in our production VMWare
 storage and they are excellent for solving this problem. Our cluster
 supports 50,000 users and we've had no issues at all. Do not use an SSD
 for the ZIL - as soon as it fills up you will be very unhappy.

What do you mean by fills up? There is a very limited amount of data that is written to a slog device: between 5 and 30 seconds' worth. Furthermore, a log device will at most need to be <= 50% of the size of physical memory.
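As a rough worked example: even a fully saturated gigabit link delivers about 110 MB/s, so 30 seconds' worth of sync writes is only around 3.3 GB - far below the capacity of even a small 32 GB SSD.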



Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread Eff Norwood
David asked me what I meant by filled up. If you make the unwise decision to use an SSD as your ZIL, at some point days to weeks after you install it all of the pages will be allocated, and you will suddenly find the device to be slower than a conventional disk drive. This is due to the way SSDs work. A great write-up of how this works is here:

http://www.anandtech.com/show/2738/8

The industry workaround for this issue is called TRIM, and AFAIK the current implementation of TRIM in Solaris does not work for ZIL devices, only for pool devices. If it did, then SSDs would not be a bad option, but the DDRdrive is so much better I wouldn't waste the time. If you don't believe me, try it and post your benchmarks for hour one, day one and week one. ;)


Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread Ross Walker
On Aug 27, 2010, at 1:04 AM, Mark markwo...@yahoo.com wrote:

 We are using a 7210, 44 disks I believe, 11 stripes of RAIDz sets.  When I 
 installed I selected the best bang for the buck on the speed vs capacity 
 chart.
 
 We run about 30 VM's on it, across 3 ESX 4 servers.  Right now, its all 
 running NFS, and it sucks... sooo slow.

I have a Dell 2950 server with a PERC6 controller with 512MB of write back 
cache and a pool of mirrors made out of 14 15K SAS drives. ZIL is integrated.

This is serving 30 VMs on 3 ESXi hosts and performance is good.

I find the #1 operation is random reads, so I doubt the ZIL will make as much difference as a very large L2ARC will. I'd hit that first; it's a cheaper buy. Random reads across a theoretically infinitely sized (in comparison to system RAM) 7200 RPM device are a killer. Cache as much as possible in the hope of hitting cache rather than disk.
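A quick way to see whether more cache is even likely to help (on a plain OpenSolaris box rather than the closed appliance UI) is to look at the ARC kstats; the statistic names below are the usual ones, but verify them on your build:

  kstat -p zfs:0:arcstats:hits zfs:0:arcstats:misses zfs:0:arcstats:size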

Breaking your pool into two or three pools with different vdev types on different types of disks, and tiering your VMs based on their performance profiles, would also help.

-Ross



Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread Ray Van Dolson
On Fri, Aug 27, 2010 at 05:51:38AM -0700, David Magda wrote:
 On Fri, August 27, 2010 08:46, Eff Norwood wrote:
  Saso is correct - ESX/i always uses F_SYNC for all writes and that is for
  sure your performance killer. Do a snoop | grep sync and you'll see the
  sync write calls from VMWare. We use DDRdrives in our production VMWare
  storage and they are excellent for solving this problem. Our cluster
  supports 50,000 users and we've had no issues at all. Do not use an SSD
  for the ZIL - as soon as it fills up you will be very unhappy.
 
 What do you mean by fills up? There is very a very limited amount of
 data that is written to a slog device: between 5-30s second's worth.
 Furthermore a log device will at maximum be = 50% the size of physical
 memory.

I would second this.  Excellent results here with small 32GB Intel X25-E's.

Even 32GB is overkill for ZIL 

Ray


Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread Eric D. Mudama

On Fri, Aug 27 at  6:16, Eff Norwood wrote:


David asked me what I meant by filled up. If you make the unwise
decision to use an SSD as your ZIL, at some point days to weeks
after you install it, all of the pages will be allocated and you
will suddenly find the device to be slower than a conventional disk
drive. This is due to the way SSDs work. A great write up about how
this works is here:

http://www.anandtech.com/show/2738/8


While it's an interesting writeup, I think some assumptions are being
made that may not be quite correct.  In the case of a ZIL, with a
relatively small data set (< 1GB typically) on your SSD, a correctly
designed drive will always be running with many gigabytes of
scratch area available.

Fully written SSDs may write more slowly than partially written SSDs
in some workloads, but I wouldn't expect a ZIL usage model to create
the scenario you linked due to the limited data set size.

--
Eric D. Mudama
edmud...@mail.bounceswoosh.org



Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread Mark
Hey thanks for the replies everyone.

Sadly most of those options will not work. Since we are using a Sun Unified Storage 7210, the only option is to buy the Sun SSDs for it, which is about $15k USD for a pair.  We also don't have the ability to shut off the ZIL or use any of the other options one might have under OpenSolaris itself :(

It sounds like I do want to change to RAID10 mirrors instead of RAIDz.  It also sounds like enabling the write cache without a ZIL device in place might work, but could lead to corruption should something crash.

So the question is: with a proper ZIL SSD from Sun, and RAID10... would I be able to support all the VMs, or would it still be pushing the limits of a 44-disk pool?

Today there are 30 VMs: 25 are Windows 2008 and 5 are CentOS 5.  A couple are DB servers that see very light load.  The only thing that sees any real load is a build server, which we get a lot of complaints about.

I did some testing and posted my results a month ago, using OpenSolaris and 5 
disks with my personal Intel SSD and saw good results, but I don't know how it 
will scale :(


Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread Marion Hakanson
markwo...@yahoo.com said:
 So the question is with a proper ZIL SSD from SUN, and a RAID10... would I be
 able to support all the VM's or would it still be pushing the limits a 44
 disk pool? 

If it weren't a closed 7000-series appliance, I'd suggest running the
zilstat script.  It should make it clear whether (and by how much)
you would benefit from the Logzilla addition in your current raidz
configuration.  Maybe there's some equivalent in the builtin FishWorks
analytics which can give you the same information.
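For reference, the usual zilstat invocation is something like the following (arguments from memory - check the comments at the top of the script):

  ./zilstat.ksh 10 6    # six 10-second samples of ZIL write activity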

Marion




Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread Ray Van Dolson
On Fri, Aug 27, 2010 at 11:57:17AM -0700, Marion Hakanson wrote:
 markwo...@yahoo.com said:
  So the question is with a proper ZIL SSD from SUN, and a RAID10... would I 
  be
  able to support all the VM's or would it still be pushing the limits a 44
  disk pool? 
 
 If it weren't a closed 7000-series appliance, I'd suggest running the
 zilstat script.  It should make it clear whether (and by how much)
 you would benefit from the Logzilla addition in your current raidz
 configuration.  Maybe there's some equivalent in the builtin FishWorks
 analytics which can give you the same information.
 

To the OP...

I'd think turning the write cache on would help if that's an option.
Does the box have reliable power (UPS, etc)?

Ray


Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread Mark
It does, it's on a pair of large APC UPSes.

Right now we're using NFS for our ESX Servers.  The only iSCSI LUN's I have are 
mounted inside a couple Windows VM's.   I'd have to migrate all our VM's to 
iSCSI, which I'm willing to do if it would help and not cause other issues.   
So far the 7210 Appliance has been very stable.

I like the zilstat script.  I emailed a support tech I am working with on 
another issue to ask if one of the built in Analytics DTrace scripts will get 
that data.   

I found one called L2ARC Eligibility:  3235 true, 66 false.  This makes it 
sound like we would benefit from a READZilla, not quite what I had expected...  
I'm sure I don't know what I'm looking at anyways :)


Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread Ray Van Dolson
On Fri, Aug 27, 2010 at 12:46:42PM -0700, Mark wrote:
 It does, its on a pair of large APC's.
 
 Right now we're using NFS for our ESX Servers.  The only iSCSI LUN's
 I have are mounted inside a couple Windows VM's.   I'd have to
 migrate all our VM's to iSCSI, which I'm willing to do if it would
 help and not cause other issues.   So far the 7210 Appliance has been
 very stable.
 
 I like the zilstat script.  I emailed a support tech I am working
 with on another issue to ask if one of the built in Analytics DTrace
 scripts will get that data.   
 
 I found one called L2ARC Eligibility:  3235 true, 66 false.  This
 makes it sound like we would benefit from a READZilla, not quite what
 I had expected...  I'm sure I don't know what I'm looking at anyways
 :)

It obviously depends on your workload, and YMMV, but for us (we're also using NFS and love the flexibility it provides with ESX), without a dedicated ZIL device things are pretty dog slow.

My impression is that synchronous writes are used too with iSCSI, so if
your problems stem from not having a ZIL w/ NFS they could very easily
reappear even with iSCSI.

Someone else may correct me on that...

Ray


Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread John
Wouldn't it be possible to saturate the SSD ZIL with enough backlogged sync 
writes? 

What I mean is, doesn't the ZIL eventually need to make it to the pool, and if 
the pool as a whole (spinning disks) can't keep up with 30+ vm's of write 
requests, couldn't you fill up the ZIL that way?


Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread Ware Adams
On Aug 27, 2010, at 2:32 PM, Mark wrote:
 Saddly most of those options will not work, since we are using a SUN Unified 
 Storage 7210, the only option is to buy the SUN SSD's for it, which is about 
 $15k USD for a pair.   We also don't have the ability to shut off ZIL or any 
 of the other options that one might have under OpenSolaris itself :(
 
 It sounds like I do want to change to a RAID10 mirror instead of RAIDz.   It 
 sounds like enabling write-cash without the ZIL in place might work but would 
 lead to corruption should something crash.
 
 So the question is with a proper ZIL SSD from SUN, and a RAID10... would I be 
 able to support all the VM's or would it still be pushing the limits a 44 
 disk pool?

We run roughly that number of VMs on ESXi 4 using a 7410 and a 7310 via NFS.  
The 7410 and 7310 have fewer disks (24), but they are arranged in a mirror 
configuration.  Each has both readzilla and logzilla SSDs.  Our VMs are 
similarly lightly loaded (much like yours...mix of Windows and Ubuntu, about 
25% run a DB server with very little load).  We use compression but not 
deduplication.

It has worked extremely well for us.  No complaints on speed, very stable.  
From what I have read on this list iscsi will not be a huge speed improvement 
for you (though we haven't tried it), and you give up a lot of management 
flexibility vs. NFS.

I would say that the 7210 should be able to support your needs if you put SSDs 
in based on our experience (and the 7210 has more disks than our 7310 or 7410). 
 Of course switching to a mirror pool requires destroying your current 
configuration, so it isn't easy.  You might also need to remove some HDDs to 
make room for the SSDs.

As far as analytics, the ARC stats (hit/miss) are available which will give you 
some indication of whether an L2ARC will help.  On the SLOG, look at latency by 
file and operation for a VM that is having performance issues...is it showing 
high latency on NFS writes?

Good luck,
Ware


Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread Ray Van Dolson
On Fri, Aug 27, 2010 at 01:22:15PM -0700, John wrote:
 Wouldn't it be possible to saturate the SSD ZIL with enough
 backlogged sync writes? 
 
 What I mean is, doesn't the ZIL eventually need to make it to the
 pool, and if the pool as a whole (spinning disks) can't keep up with
 30+ vm's of write requests, couldn't you fill up the ZIL that way?

Depends on the workload of course, but we have 50+ VM server
environments running off of 22x1TB SATA + 32GB Intel X25-E SSD's with
no problems whatsoever.  I don't have the zilstat numbers handy, but
we're not pushing enough I/O for the slog device to even come close to
sweating.

Note that our VMs are in a LabManager environment and are spun up and down mostly to do compiles, not pushing huge amounts of non-random I/O.

Ray


Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread Paul Choi


 No. From what I've seen, ZFS will periodically flush writes from the 
ZIL to disk. You may run into a read starvation situation where ZFS is 
so busy flushing to disk that you won't get reads. If you have VMs where 
developers expect low latency interactivity, they get unhappy. Trust me. :)


One way to address this is either to have an ARC that's large enough, or to add a cache device to the zpool.
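On a plain OpenSolaris box that's a one-liner (pool and device names are just examples):

  zpool add tank cache c1t5d0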


I have a config where ~20 ESX VMs share a single OpenSolaris NFS server. 
It has an Intel X25E for ZIL and X25M for cache. It seems to be doing 
ok. There are actually two of these setups. For one of them, the cache 
SSD died recently, and you can feel it when ZFS goes to disk for some 
uncached piece of data. I'll be replacing the cache SSD next week.


-Paul


On 8/27/10 1:22 PM, John wrote:

Wouldn't it be possible to saturate the SSD ZIL with enough backlogged sync 
writes?

What I mean is, doesn't the ZIL eventually need to make it to the pool, and if 
the pool as a whole (spinning disks) can't keep up with 30+ vm's of write 
requests, couldn't you fill up the ZIL that way?




Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread Mertol Özyöney
Hi Mark;

I have installed several 7000 series systems, some running hundreds of VMs.
I can try to help you, but to find out exactly where the problem is I may
need more information.

I understand that you have no Logzillas. So most probably you are using the
7110 with 250 GB drives.

All 7000 series systems have a module called Analytics where you can monitor
the performance of many components.
Please start by selecting 'enable advanced analytics' in the Preferences tab
of the Configuration menu.

Please make sure that you are running the latest FW release (2010 Q1.2.1), available at
http://wikis.sun.com/display/FishWorks/Software+Updates
Please read all the release notes from the FW you are running up to the FW
level you will upgrade to.

I understand that you are using iSCSI. If you are running earlier FW releases, NFS
can increase performance significantly; on recent FW releases iSCSI and NFS
performance is very close, but I'd still choose NFS over iSCSI for most
installations. Do so if you can.

Please start monitoring the following datasets using Analytics:

Network transfer broken down by interface or device [check whether you are
limited by gigabit Ethernet, etc.]
iSCSI IOPS
iSCSI IOPS broken down by LUN (to understand which LUN demands more
performance; with newer FW releases you may find it useful to isolate some LUNs
by defining different pools - beware that this may not offer much help if
you use RAID 10)
iSCSI IOPS broken down by type
iSCSI write IOPS latency
iSCSI latency
ARC hit/miss ratio
ARC size

Here are my recommendations (if you can share some screenshots from
Analytics I may be able to help more):

1) Convert to RAID 10 - this will give you 4-5x more IOPS on both reads and
writes.

2) Using Analytics, decide whether increasing the L1 cache (ARC) may help you.
If it can, increase the L1 cache.

3) Check the IO size using Analytics and compare it against your LUN
definitions. I suggest that the LUN block size should be lower than the IO size.

4) Enable write caching for a short time and monitor the Analytics reports;
if you see much improvement you can invest in SSDs.

5) Enable jumbo frames (throughout the path).

6) Use multiple interfaces to access the data.

PS: I think you asked whether you can disable the ZIL on the 7000 series. The
answer is yes, and you can set it at share/LUN granularity.

PS: We usually recommend using a Writezilla for VMware users, but I have seen
7210s running 30-40 VMs without much problem when there was no
Writezilla; for sure this depends on the load pattern.

Very best regards
Mertol 

Mertol Ozyoney 
Storage Practice - Sales Manager

Sun Microsystems, TR
Istanbul TR
Phone +902123352200
Mobile +905339310752
Fax +90212335
Email mertol.ozyo...@sun.com



-Original Message-
From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Mark
Sent: Friday, August 27, 2010 10:47 PM
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] VM's on ZFS - 7210

It does, its on a pair of large APC's.

Right now we're using NFS for our ESX Servers.  The only iSCSI LUN's I have
are mounted inside a couple Windows VM's.   I'd have to migrate all our VM's
to iSCSI, which I'm willing to do if it would help and not cause other
issues.   So far the 7210 Appliance has been very stable.

I like the zilstat script.  I emailed a support tech I am working with on
another issue to ask if one of the built in Analytics DTrace scripts will
get that data.   

I found one called L2ARC Eligibility:  3235 true, 66 false.  This makes it
sound like we would benefit from a READZilla, not quite what I had
expected...  I'm sure I don't know what I'm looking at anyways :)


Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread Eff Norwood
By all means please try it to validate this yourself, and post your results from hour one, day one and week one. In a ZIL use case, although the data set is small, the device is always writing a small, ever-changing (from the SSD's perspective) data set. The SSD does not know it can release previously written pages, and without TRIM there is no way to tell it to. That means every time a ZIL write happens, new SSD pages are consumed. After some amount of time, all of those empty pages will have been consumed and the SSD will have to go into the read-erase-write cycle, which is incredibly slow - avoiding it is the whole point of TRIM.

I can assure you from my extensive benchmarking that with all major SSDs in the role of a ZIL you will eventually not be happy. Depending on your use case it might take months, but eventually all those free pages will be consumed, and read-erase-write is how the SSD world works after that - unless you have TRIM, which we don't have yet.


Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread Ray Van Dolson
On Fri, Aug 27, 2010 at 03:51:39PM -0700, Eff Norwood wrote:
 By all means please try it to validate it yourself and post your
 results from hour one, day one and week one. In a ZIL use case,
 although the data set is small it is always writing a small ever
 changing (from the SSDs perspective) data set. The SSD does not know
 to release previously written pages and without TRIM there is no way
 to tell it to. That means every time a ZIL write happens, new SSD
 pages are consumed. After some amount of time, all of those empty
 pages will become consumed and the SSD will now have to go into the
 read-erase-write cycle which is incredibly slow and the whole point
 of TRIM.
 
 I can assure you from my extensive benchmarking with all major SSDs
 in the role of a ZIL you will eventually not be happy. Depending on
 your use case it might take months, but eventually all those free
 pages will be consumed and read-erase-write is how the SSD world
 works after that - unless you have TRIM, which we don't yet.

Is there a way to measure how many SSD pages are taken up?

We've had a box running for nearly 8 months now -- it's performing
well, but I'd be interested to see if we'll be close to (theoretically)
hitting this problem or not.

Ray


[zfs-discuss] VM's on ZFS - 7210

2010-08-26 Thread Mark
We are using a 7210, 44 disks I believe, 11 stripes of RAIDz sets.  When I 
installed I selected the best bang for the buck on the speed vs capacity chart.

We run about 30 VMs on it, across 3 ESX 4 servers.  Right now, it's all running NFS, and it sucks... sooo slow.

iSCSI was no better.   

I am wondering how I can increase the performance, because they want to add more VMs... the good news is most are idle-ish, but even idle VMs create a lot of random chatter to the disks!

So a few options maybe... 

1) Change to iSCSI mounts to ESX, and enable write-cache on the LUN's since the 
7210 is on a UPS.
2) get a Logzilla SSD mirror.  (do ssd's fail, do I really need a mirror?)
3) reconfigure the NAS to a RAID10 instead of RAIDz

Obviously all 3 would be ideal, though with an SSD can I keep using NFS and get the same performance, since the R_SYNCs would be satisfied by the SSD?

I am dreading getting the OK to spend $$,$$$ on SSDs and then not getting the performance increase we want.

How would you weigh these?  I noticed in testing on a 5-disk OpenSolaris box that changing from a single RAIDz pool to RAID10 netted a larger IOPS increase than adding an Intel SSD as a Logzilla.  That's not going to scale the same way, though, with a 44-disk, 11-stripe raidz set.

Some thoughts?  Would simply moving to write-cache-enabled iSCSI LUNs without an SSD speed things up a lot by itself?