Re: [zfs-discuss] Entire client hangs every few seconds

2011-07-26 Thread Ian D
Hi Garrett- Is it something that could happen at any time on a system that has been working fine for a while? That system has 256G of RAM, so I think "adequate" is not a concern here :) We'll try 3.1 as soon as we can download it. Ian

Re: [zfs-discuss] Entire client hangs every few seconds

2011-07-26 Thread Ian D
No dedup. The hiccups started around 2am on Sunday while (obviously) nobody was interacting with either the clients or the server. It's been running for months (as is) without any problem. My guess is that it's a defective hard drive that, instead of totally failing, just stutters. Or mayb

Re: [zfs-discuss] Entire client hangs every few seconds

2011-07-26 Thread Ian D
To add to that... iostat on the client boxes shows the connection to always be around 98% util, topping out at 100% whenever it hangs. The same clients are connected to another ZFS server with much lower specs and a smaller number of slower disks; it performs much better and rarely gets past 5% util
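For reference, this is roughly how we watch it on the Linux clients (a minimal sketch; "sdb" is just an example device name, not necessarily the actual iSCSI LUN):

  # extended per-device stats every 2 seconds; %util is the last column
  iostat -x sdb 2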

[zfs-discuss] Entire client hangs every few seconds

2011-07-26 Thread Ian D
Hi all- We've been experiencing a very strange problem for two days now. We have three clients (Linux boxes) connected to a ZFS box (Nexenta) via iSCSI. Every few seconds (it seems random), iostat shows the clients go from a normal 80K+ IOPS to zero. It lasts up to a few seconds and things are

Re: [zfs-discuss] Mixing different disk sizes in a pool?

2010-12-18 Thread Ian D
> The answer really depends on what you want to do with the pool(s). You'll have to provide more information.
Get the maximum number of very random IOPS I can get out of those drives, for database usage.

Re: [zfs-discuss] Mixing different disk sizes in a pool?

2010-12-18 Thread Ian D
Another question: all those disks are on Dell MD1000 JBODs (11 of them) and we have 12 SAS ports on three LSI 9200-16e HBAs. Is there any point connecting each JBOD to a separate port, or is it OK to cascade them in groups of three? Is there a bandwidth limit we'll be hitting doing that? Thanks
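Back-of-envelope, assuming the MD1000s negotiate 3Gb/s SAS and each HBA port is a 4-lane wide port (rough numbers, not measured):

  4 lanes x 3 Gb/s = 12 Gb/s, roughly 1.2 GB/s usable per wide port
  3 cascaded MD1000s = ~45 drives x ~120 MB/s streaming = ~5.4 GB/s of raw disk bandwidth

So a purely sequential workload could saturate a cascaded port, but for small random database I/O the per-port limit is unlikely to be the first bottleneck.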

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-12-17 Thread Ian D
Here's a long-overdue update for you all... After updating countless drivers, BIOSes and Nexenta, it seems that our issue has disappeared. We're slowly moving our production to our three appliances and things are going well so far. Sadly we don't know exactly which update fixed our issue. I wish I

[zfs-discuss] Mixing different disk sizes in a pool?

2010-12-17 Thread Ian D
I have 159x 15K RPM SAS drives I want to build a ZFS appliance with: 75x 145G, 60x 300G, 24x 600G. The box has 4 CPUs, 256G of RAM, 14x 100G SLC SSDs for the cache and a mirrored pair of 4G DDRDrive X1s for the SLOG. My plan is to mirror all these drives and keep some hot spares. My question is
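To make the plan concrete, here's a cut-down sketch of the layout I have in mind (disk names are placeholders; the assumption is that each half of a mirror sits on a different JBOD):

  # mirrored data vdevs, one pair shown per size class; repeat for all 159 drives
  zpool create tank \
    mirror c1t0d0 c2t0d0 \
    mirror c1t1d0 c2t1d0
  # L2ARC on the SLC SSDs, mirrored SLOG on the DDRDrive X1s
  zpool add tank cache c3t0d0 c3t1d0
  zpool add tank log mirror c4t0d0 c4t1d0
  # and a few hot spares
  zpool add tank spare c1t14d0 c2t14d0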

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-11-02 Thread Ian D
> Then set the zfs_write_limit_override to a reasonable value.
Our first experiments are showing progress. We'll play with it some more and let you know. Thanks! Ian
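PS: for anyone else trying the same thing, this is roughly how we're setting it; the value is just an example, and I'd double-check the tunable on your particular build before poking it with mdb:

  # change it on the fly (value in bytes, here ~384MB)
  echo zfs_write_limit_override/Z0x18000000 | mdb -kw
  # or make it persistent across reboots
  echo "set zfs:zfs_write_limit_override=0x18000000" >> /etc/system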

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-11-01 Thread Ian D
> Maybe you are experiencing this: http://opensolaris.org/jive/thread.jspa?threadID=11942
It does look like this... Is this really the expected behaviour? That's just unacceptable. It is so bad it sometimes drops connections and fails copies and SQL queries... Ian

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-11-01 Thread Ian D
> You "doubt" AMD or Intel cpu's suffer from bad cache > mgmt? In order to clear that out, we've tried using an older server (about 4 years old) as the head and we see the same pattern. It's actually more obvious that it consumes a whole lot of CPU cycles. Using the same box as a Linux-based N

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-11-01 Thread Ian D
> If you do a dd to the storage from the heads do you still get the same issues?
No, local reads/writes are great; they never choke. It's whenever NFS or iSCSI is involved and the reads/writes are done from a remote box that we experience the problem. Local operations barely affect the
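For what it's worth, the local test was nothing fancier than this (path and sizes are illustrative):

  # local sequential write, then read it back, with no NFS/iSCSI in the path
  dd if=/dev/zero of=/Pool_sas/testfile bs=1024k count=10000
  dd if=/Pool_sas/testfile of=/dev/null bs=1024k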

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-31 Thread Ian D
I get that more cores doesn't necessarily mean better performance, but I doubt that both the latest AMD CPUs (the Magny-Cours) and the latest Intel CPUs (the Beckton) suffer from incredibly bad cache management. Our two test systems have 2 and 4 of them respectively. The thing is that the perform

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-30 Thread Ian D
I owe you all an update... We found a clear pattern we can now recreate at will. Whenever we read/write to the pool, it gives the expected throughput and IOPS for a while, but at some point it slows down to a crawl, nothing responds and it pretty much "hangs" for a few seconds, and then things go

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-23 Thread Ian D
> I don't think the switch model was ever identified... perhaps it is a 1 GbE switch with a few 10 GbE ports? (Drawing at straws.)
It is a Dell 8024F. It has 24 SFP+ 10GbE ports and every NIC we connect to it is an Intel X520. One issue we do have with it is that when we turn jumbo frames on,
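In case it helps anyone reproduce the jumbo-frame problem, this is roughly what we turn on (interface names are just examples):

  # on the Linux clients (Intel X520)
  ifconfig eth2 mtu 9000
  # on the Nexenta head
  dladm set-linkprop -p mtu=9000 ixgbe0
  # plus the matching MTU on every switch port involved, or large frames get dropped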

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-23 Thread Ian D
> Likely you don't have enough ram or CPU in the box.
The Nexenta box has 256G of RAM and the latest X7500 series CPUs. That said, the load does get crazy high (like 35+) very quickly. We can't figure out what's taking so much CPU. It happens even when checksum/compression/dedup are off.

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-23 Thread Ian D
> A network switch that is being maxed out? Some switches cannot switch at rated line speed on all their ports all at the same time. Their internal buses simply don't have the bandwidth needed for that. Maybe you are running into that limit? (I know you mentioned bypassing the

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-22 Thread Ian D
Some numbers...
zpool status
  pool: Pool_sas
 state: ONLINE
  scan: none requested
config:
        NAME                 STATE  READ WRITE CKSUM
        Pool_sas             ONLINE    0     0     0
          c4t5000C506A6D3d0  ONLINE    0     0     0
          c4t5000C506

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-15 Thread Ian D
> Has anyone suggested either removing L2ARC/SLOG entirely or relocating them so that all devices are coming off the same controller? You've swapped the external controller but the H700 with the internal drives could be the real culprit. Could there be issues with cross-controller IO in t

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-15 Thread Ian D
A little setback... We found out that we also have the issue with the Dell H800 controllers, not just the LSI 9200-16e. With the Dell it's initially faster as we benefit from the cache, but after a little while it goes sour: from 350MB/sec down to less than 40MB/sec. We've also tried with a

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-15 Thread Ian D
After contacting LSI, they say that the 9200-16e HBA is not supported in OpenSolaris, just Solaris. Aren't the Solaris drivers the same as the OpenSolaris ones? Is there anyone here using 9200-16e HBAs? What about the 9200-8e? We have a couple lying around and we'll test one shortly. Ian

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-15 Thread Ian D
> Does the Linux box have the same issue with any other server? What if the client box isn't Linux but Solaris or Windows or MacOS X?
That would be a good test. We'll try that.

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-15 Thread Ian D
> As I have mentioned already, it would be useful to know more about the config, how the tests are being done, and to see some basic system performance stats.
I will shortly. Thanks!

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-15 Thread Ian D
> You mentioned a second Nexenta box earlier. To rule out client-side issues, have you considered testing with Nexenta as the iSCSI/NFS client?
If you mean running the NFS client AND server on the same box then yes, and it doesn't show the same performance issues. It's only when a Linux box

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-15 Thread Ian D
> He already said he has SSD's for dedicated log. This means the best solution is to disable WriteBack and just use WriteThrough. Not only is it more reliable than WriteBack, it's faster. And I know I've said this many times before, but I don't mind repeating: If you have slog d

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-15 Thread Ian D
As I have mentioned already, we have the same performance issues whether we READ or we WRITE to the array; shouldn't that rule out caching issues? Also, we get great performance with the LSI HBA if we use the JBODs as a local file system. The issues only arise when it is done through iSCSI

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-14 Thread Ian D
> Earlier you said you had eliminated the ZIL as an issue, but one difference between the Dell H800 and the LSI HBA is that the H800 has an NV cache (if you have the battery backup present). A very simple test would be when things are running slow, try disabling the ZIL temporarily
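For the record, the quick ways to flip the ZIL off and back on for a test (the sync property only exists on newer builds; the older global tunable affects every pool and reverts on reboot):

  # newer builds: per-dataset, takes effect immediately
  zfs set sync=disabled Pool_sas
  zfs set sync=standard Pool_sas    # put it back afterwards
  # older builds: global tunable via mdb
  echo zil_disable/W1 | mdb -kw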

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-14 Thread Ian D
> Our next test is to try with a different kind of HBA, we have a Dell H800 lying around.
OK... we're making progress. After swapping the LSI HBA for a Dell H800, the issue disappeared. Now, I'd rather not use those controllers because they don't have a JBOD mode. We have no choice but to mak

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-14 Thread Ian D
I've had a few people email me directly suggesting it might have something to do with the ZIL/SLOG. I guess I should have said that the issue happens both ways, whether we copy TO or FROM the Nexenta box.

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-14 Thread Ian D
> Sounding more and more like a networking issue - are the network cards set up in an aggregate? I had some similar issues on GbE where there was a mismatch between the aggregate settings on the switches and the LACP settings on the server. Basically the network was wasting a ton of time
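Good thought. To check that on the Nexenta side, I believe this shows whether LACP is actually negotiated on the aggregation and on each member port:

  # list aggregations with their LACP mode/state
  dladm show-aggr -L
  # extended per-port view, to spot a member that never comes up
  dladm show-aggr -x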

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-13 Thread Ian D
More stuff... We ran the same tests on another Nexenta box with fairly similar hardware and had the exact same issues. The two boxes have the same models of HBAs, NICs and JBODs but different CPUs and motherboards. Our next test is to try a different kind of HBA; we have a Dell H800 lying

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-13 Thread Ian D
> Would it be possible to install OpenSolaris to a USB disk and boot from it and try? That would take 1-2h and could maybe help you narrow things down further?
I'm a little afraid of losing my data. It wouldn't be the end of the world, but I'd rather avoid that. I'll do it as a last resort. Ian

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-13 Thread Ian D
> The only thing that still stands out is that network operations (iSCSI and NFS) to external drives are slow, correct?
Yes, that pretty much sums it up.
> Just for completeness, what happens if you scp a file to the three different pools? If the results are the same as NFS and iSCSI, th

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-13 Thread Ian D
Here are some more findings... The Nexenta box has 3 pools:
- syspool: made of 2 mirrored (hardware RAID) local SAS disks
- pool_sas: made of 22 15K SAS disks in ZFS mirrors, on 2 JBODs on 2 controllers
- pool_sata: made of 42 SATA disks in 6 RAIDZ2 vdevs on a single controller
When we copy data from an

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-10 Thread Ian D
> From the Linux side, it appears the drive in question is either sdb or dm-3, and both appear to be the same drive. Since switching to zfs, my Linux-disk-fu has become a bit rusty. Is one an alias for the other?
Yes, dm-3 is the alias created by LVM while sdb is the "physical" (or raw) device

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-09 Thread Ian D
> We're aware of that. The original plan was to use mirrored DDRDrive X1s but we're experiencing stability issues. Chris George is being very responsive and will help us investigate that once we figure out our most pressing performance problems.
I feel I need to add to my commen

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-09 Thread Ian D
> If you have a single SSD for dedicated log, that will surely be a bottleneck for you.
We're aware of that. The original plan was to use mirrored DDRDrive X1s but we're experiencing stability issues. Chris George is being very responsive and will help us investigate that once we f

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-09 Thread Ian D
> A couple of notes: we know "Pool_sata" is resilvering, but we're concerned about the performance of the other pool ("Pool_sas"). We also know that we're not using jumbo frames, as for some reason it makes the Linux box crash. Could that explain it all?
> What sort of drives are thes

Re: [zfs-discuss] Performance issues with iSCSI under Linux

2010-10-09 Thread Ian D
> I'll suggest trying something completely different, like: dd if=/dev/zero bs=1024k | pv | ssh othermachine 'cat > /dev/null' ... Just to verify there isn't something horribly wrong with your hardware (network). In Linux, run "ifconfig" ... You should see "errors:0". Make sure each
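Spelled out (the hostname is a placeholder), the raw-network test looks like this; note that ssh encryption can itself become the ceiling on 10GbE, so treat it as a lower bound:

  # push zeros over the wire only, no disks involved on either end
  dd if=/dev/zero bs=1024k count=10000 | pv | ssh nexenta-head 'cat > /dev/null'
  # then look for errors/drops on every interface on the Linux side
  ifconfig -a | grep -i errors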

[zfs-discuss] Performance issues with iSCSI under Linux

2010-10-08 Thread Ian D
Hi! We're trying to pinpoint our performance issues and we could use all the help the community can provide. We're running the latest version of Nexenta on a pretty powerful machine (4x Xeon 7550, 256GB RAM, 12x 100GB Samsung SSDs for the cache, 50GB Samsung SSD for the ZIL, 10GbE on a dedicated

Re: [zfs-discuss] Expected throughput

2010-07-05 Thread Ian D
> Just a short question - wouldn't it be easier, and perhaps faster, to just have the MySQL DB on an NFS share? iSCSI adds complexity, both on the target and the initiator.
Yes, we did try both and we didn't notice any difference in terms of performance. I've read conflicting opinions on
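For completeness, the NFS mount we tested the database on was along these lines (server name and paths are placeholders, and the options are just the commonly suggested ones for this kind of workload, not a tuned recommendation):

  mount -t nfs -o rw,hard,intr,noatime,rsize=32768,wsize=32768 \
    nexenta-head:/volumes/pool_sas/mysql /var/lib/mysql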

Re: [zfs-discuss] Expected throughput

2010-07-04 Thread Ian D
> Is that 38% of one CPU or 38% of all CPUs? How many CPUs does the Linux box have? I don't mean the number of sockets, I mean number of sockets * number of cores * number of threads per core. My
The server has two Intel X5570s; they are quad-core and have hyperthreading. It would say

Re: [zfs-discuss] Expected throughput

2010-07-04 Thread Ian D
> In what way is CPU contention being monitored? "prstat" without options is nearly useless for a multithreaded app on a multi-CPU (or multi-core/multi-thread) system. mpstat is only useful if threads never migrate between CPU's. "prstat -mL" gives a nice picture of how busy each LWP (t
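Noted, we'll collect the stats that way. For anyone else following the thread, the suggested commands are simply:

  # per-LWP microstate accounting, 5-second samples
  prstat -mL 5
  # per-CPU utilisation alongside it
  mpstat 5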

Re: [zfs-discuss] Expected throughput

2010-07-04 Thread Ian D
> OK... so we've rebuilt the pool as 14 pairs of mirrors, each pair having one disk in each of the two JBODs. Now we're getting about 500-1000 IOPS (according to zpool iostat) and 20-30MB/sec in random read on a big database. Does that sound right?
> Seems right, as Erik said. Btw, do you

Re: [zfs-discuss] Expected throughput

2010-07-03 Thread Ian D
> To summarise, putting 28 disks in a single vdev is nothing you would do if you want performance. You'll end up with as many IOPS as a single drive can do. Split it up into smaller (<10 disk) vdevs and try again. If you need high performance, put them in a striped mirror (aka RAID1+0).
> A litt

Re: [zfs-discuss] Mix SAS and SATA drives?

2010-07-01 Thread Ian D
> As the 15k drives are faster seek-wise (and possibly faster for linear I/O), you may want to separate them into different VDEVs or even pools, but then, it's quite impossible to give a "correct" answer unless knowing what it's going to be used for.
Mostly database duty.
> Also, using 10

Re: [zfs-discuss] Mix SAS and SATA drives?

2010-07-01 Thread Ian D
Sorry for the formatting, that's:
2x 15x 1000GB SATA
3x 15x 750GB SATA
2x 12x 600GB SAS 15K
4x 15x 300GB SAS 15K

[zfs-discuss] Mix SAS and SATA drives?

2010-07-01 Thread Ian D
Another question... We're building a ZFS NAS/SAN out of the following JBODs we already own:
2x 15x 1000GB SATA
3x 15x 750GB SATA
2x 12x 600GB SAS 15K
4x 15x 300GB SAS 15K
That's a lot of spindles we'd like to benefit from, but our assumption is that we should split these in two separate pools, on

[zfs-discuss] Expected throughput

2010-07-01 Thread Ian D
Hi! We've put 28x 750GB SATA drives in a RAIDZ2 pool (a single vdev) and we get about 80MB/s in sequential read or write. We're running local tests on the server itself (no network involved). Is that what we should be expecting? It seems slow to me. Thanks

[zfs-discuss] First Setup

2010-05-02 Thread Ian D
Hi! We're building our first dedicated ZFS-based NAS/SAN (probably using Nexenta) and I'd like to run the specs by you all to see if you have any recommendations. All of it is already bought, but it's not too late to add to it.
Dell PowerEdge R910
2x Intel X7550 2GHz, 8 cores each plus Hyper