Re: [zfs-discuss] Terrible zfs performance under NFS load
Stephen Stogner wrote:
Hello, we have an S10U5 server sharing up NFS shares backed by zfs. While using an nfs mount as the syslog destination for 20 or so busy mail servers, we have noticed that throughput becomes severely degraded after a short time. I have tried disabling the zil and turning off cache flushing, and I have not seen any change in performance. The servers are only pushing about 1MB/s of constant log traffic to the server over nfs. I think this is due to the cache being flushed with every nfs commit. I was wondering if anyone had any other suggestions as to what it could be? Thank you.

Not that this deals with the nfs/zfs performance you are experiencing, but why not forward the syslog directly to the target machine and allow it to write the syslog files locally to its filesystem?

-- paul
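For what it's worth, a minimal sketch of the forwarding Paul suggests, assuming a hypothetical host name "loghost" for the central machine: on each mail server, /etc/syslog.conf points the mail facility at the loghost instead of an nfs-mounted file (fields must be tab-separated on classic syslogds).

    # /etc/syslog.conf on each mail server -- "loghost" is a
    # hypothetical alias for the central log server
    mail.debug	@loghost

The loghost then writes the files to its local zfs filesystem, so no per-message NFS COMMIT is involved.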
Re: [zfs-discuss] Terrible zfs performance under NFS load
Stephen Stogner wrote:
True, we could have all the syslog data directed towards the host, but the underlying issue remains the same: the performance hit. We have used nfs shares for log hosts and mail hosts, and we are looking towards using a zfs-based mail store with nfs mounts from x mail servers, but if the nfs/zfs combination takes such a performance hit I would need to investigate another solution.

Syslog is funny in that it does a lot of open/write/close cycles so that log rotation can work trivially. Those are meta-data updates, and on NFS each one implies a COMMIT. This leads us back to the old "solaris nfs over zfs is slow" discussion, where we talk about the fact that other nfs servers do not honor the COMMIT semantics and can lose data. I for one do *not* want solaris nfs/zfs to behave in any way other than it does. The bottom line is that for high-COMMIT-rate nfs workloads you need to do as Richard suggests and look into setting up a slog (separate intent log) on fast disks (or SSD) away from the rest of the storage pool. In spite of this, I would still recommend that you forward syslog traffic, just for the sake of marshalling your resources for important work rather than wasting them.

-- paul
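For readers who have not set one up, a minimal sketch of adding a slog, assuming your build supports separate log devices and with hypothetical device names standing in for fast disks or an SSD:

    # attach a separate intent log to an existing pool
    zpool add mypool log c2t0d0

    # or mirror it, so losing a single log device is survivable
    zpool add mypool log mirror c2t0d0 c2t1d0

The ZIL traffic generated by all those COMMITs then lands on the fast devices instead of competing with the main pool disks.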
Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
Richard Elling wrote:
I was able to reproduce this in b93, but might have a different interpretation of the conditions. More below...

Ross Smith wrote:
A little more information today. I had a feeling that ZFS would continue quite some time before giving an error, and today I've shown that you can carry on working with the filesystem for at least half an hour with the disk removed. I suspect on a system with little load you could carry on working for several hours without any indication that there is a problem. It looks to me like ZFS is caching reads and writes, and that provided requests can be fulfilled from the cache, it doesn't care whether the disk is present or not.

In my USB-flash-disk-sudden-removal-while-writing-big-file test:
1. I/O to the missing device stopped (as I expected).
2. FMA kicked in, as expected.
3. /var/adm/messages recorded "Command failed to complete... device gone."
4. After exactly 9 minutes, 17,951 e-reports had been processed and the diagnosis was complete. FMA logged the following to /var/adm/messages.

Wow! Who knew that 17,951 was the magic number... Seriously, this does seem like an excessive amount of certainty.

-- paul
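For anyone who wants to look at the same numbers on their own box, the standard FMA inspection commands apply (a sketch; your counts and fault classes will differ):

    # list the ereports FMA received around the removal, then in detail
    fmdump -e
    fmdump -eV | less

    # show the resulting diagnosis and any faulted resources
    fmdump
    fmadm faulty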
Re: [zfs-discuss] Yager on ZFS
Let's stop feeding the troll...

-----Original Message-----
From: [EMAIL PROTECTED] on behalf of Richard Elling
Sent: Thu 11/8/2007 11:45 PM
To: can you guess?
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Yager on ZFS

can you guess? wrote:
CERN was using relatively cheap disks and found that they were more than adequate (at least for any normal consumer use) without that additional level of protection: the incidence of errors, even including the firmware errors which presumably would not have occurred in a normal consumer installation lacking hardware RAID, was on the order of 1 per TB. Given that it's really, really difficult for a consumer to come anywhere near that much data without most of it being video files (which just laugh and keep playing when they discover small errors), that's pretty much tantamount to saying that consumers would encounter no *noticeable* errors at all.

bull*
-- richard
Re: [zfs-discuss] ZFS scaling
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of William Loewe
I'm using a Sun Fire X4500 Thumper and trying to get some sense of the best performance I can get from it with zfs. I'm running without mirroring or raid, and have checksumming turned off. I built the zfs with these commands:

# zpool create mypool disk0 disk1 ... diskN
# zfs set checksum=off mypool
# zfs create mypool/testing

When I run an application with 8 threads performing writes, I see this performance:

 1 disk   --  42 MB/s
 2 disks  --  81 MB/s
 4 disks  -- 147 MB/s
 8 disks  -- 261 MB/s
12 disks  -- 347 MB/s
16 disks  -- 433 MB/s
32 disks  -- 687 MB/s
45 disks  -- 621 MB/s

This is more a matter of the number of vdevs at the top level of the pool, coupled with the fact that there are six (6) controllers to which the disks are attached. The good news is that adding redundancy does not slow it down. The following two example configurations demonstrate the two ends of the performance expectations you can have for the thumper.

For example, a pool constructed of three (3) raidz2 vdevs, each with twelve (12) disks, like so:

bash-3.00# zpool status conf6z2pool
  pool: conf6z2pool
 state: ONLINE
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        conf6z2pool  ONLINE       0     0     0
          raidz2     ONLINE       0     0     0
            c0t7d0   ONLINE       0     0     0
            c1t7d0   ONLINE       0     0     0
            c5t7d0   ONLINE       0     0     0
            c6t7d0   ONLINE       0     0     0
            c7t7d0   ONLINE       0     0     0
            c8t7d0   ONLINE       0     0     0
            c0t6d0   ONLINE       0     0     0
            c1t6d0   ONLINE       0     0     0
            c5t6d0   ONLINE       0     0     0
            c6t6d0   ONLINE       0     0     0
            c7t6d0   ONLINE       0     0     0
            c8t6d0   ONLINE       0     0     0
          raidz2     ONLINE       0     0     0
            c0t5d0   ONLINE       0     0     0
            c1t5d0   ONLINE       0     0     0
            c5t5d0   ONLINE       0     0     0
            c6t5d0   ONLINE       0     0     0
            c7t5d0   ONLINE       0     0     0
            c8t5d0   ONLINE       0     0     0
            c0t3d0   ONLINE       0     0     0
            c1t3d0   ONLINE       0     0     0
            c5t3d0   ONLINE       0     0     0
            c6t3d0   ONLINE       0     0     0
            c7t3d0   ONLINE       0     0     0
            c8t3d0   ONLINE       0     0     0
          raidz2     ONLINE       0     0     0
            c0t2d0   ONLINE       0     0     0
            c1t2d0   ONLINE       0     0     0
            c5t2d0   ONLINE       0     0     0
            c6t2d0   ONLINE       0     0     0
            c7t2d0   ONLINE       0     0     0
            c8t2d0   ONLINE       0     0     0
            c0t1d0   ONLINE       0     0     0
            c1t1d0   ONLINE       0     0     0
            c5t1d0   ONLINE       0     0     0
            c6t1d0   ONLINE       0     0     0
            c7t1d0   ONLINE       0     0     0
            c8t1d0   ONLINE       0     0     0
        spares
          c8t0d0     AVAIL

yields the following performance for several sustained 128k write streams (9 x dd if=/dev/zero bs=128k):

bash-3.00# zpool iostat conf6z2pool 1
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
-----------  -----  -----  -----  -----  -----  -----
conf6z2pool  9.56G  16.3T      0  3.91K      0   492M
conf6z2pool  9.56G  16.3T      0  4.04K      0   509M
conf6z2pool  9.56G  16.3T      0  4.05K      0   510M
conf6z2pool  9.56G  16.3T      0  4.11K      0   517M

and sustained read performance of several 128k streams (9 x dd of=/dev/null bs=128k):

bash-3.00# zpool iostat conf6z2pool 1
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
-----------  -----  -----  -----  -----  -----  -----
conf6z2pool  1.30T  15.0T  5.97K      0   759M      0
conf6z2pool  1.30T  15.0T  5.97K      0   759M      0
conf6z2pool  1.30T  15.0T  5.96K      0   756M      0

and sustained read/write performance of several 128k streams (9 x dd if=/dev/zero bs=128k + 9 x dd of=/dev/null bs=128k):

bash-3.00# zpool iostat conf6z2pool 1
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
-----------  -----  -----  -----  -----  -----  -----
conf6z2pool  1.30T  15.0T  3.34K  2.54K   424M   320M
conf6z2pool  1.30T  15.0T  2.89K  2.83K   367M   356M
conf6z2pool  1.30T  15.0T  2.96K  2.80K   375M   353M
conf6z2pool  1.30T  15.0T  3.50K  2.58K   445M   325M

The complete opposite end of the performance spectrum will come from a pool of mirrored vdevs with the following configuration:

bash-3.00# zpool status conf7mpool
  pool: conf7mpool
 state: ONLINE
 scrub: none
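The quoted status output is cut off above. As a purely hypothetical sketch of how such a pool of mirrored vdevs would be created on a Thumper (device names are placeholders, pairing disks across controllers as in the raidz2 example):

    # hypothetical: one mirrored vdev per cross-controller disk pair
    zpool create conf7mpool \
        mirror c0t7d0 c1t7d0 \
        mirror c5t7d0 c6t7d0 \
        mirror c7t7d0 c8t7d0
    # ...and so on for the remaining disks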
RE: [zfs-discuss] Re: Slow write speed to ZFS pool (via NFS)
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Thomas Garner
So it is expected behavior on my Nexenta alpha 7 server for Sun's nfsd to stop responding after 2 hours of running a bittorrent client over nfs4 from a linux client, causing zfs snapshots to hang and requiring a hard reboot to get the world back in order?

We have seen this behavior, but it appears to be entirely related to the Intel IPMI hardware swallowing the NFS traffic: anything arriving on port 623 is consumed directly by the network hardware and never gets to the host. http://blogs.sun.com/shepler/entry/port_623_or_the_mount

-- paul
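A quick way to check whether this is what is eating your traffic (a sketch; interface names are hypothetical): watch for NFS packets using port 623, which the IPMI firmware claims for itself.

    # on the linux client
    tcpdump -n -i eth0 port 623

    # on the solaris server
    snoop -d e1000g0 port 623

If the client's mount ever lands on source port 623, the packets never reach nfsd.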
RE: [zfs-discuss] ZFS Scalability/performance
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of mike
Sent: Wednesday, June 20, 2007 9:30 AM
I would prefer something like 15+1 :) I want ZFS to be able to detect and correct errors, but I do not need to squeeze all the performance out of it. (I'll be using it as a home storage server for my DVDs and other audio/video stuff, so only a few clients at most streaming off of it.)

I would not risk raidz on that many disks. A nice compromise may be 14+2 raidz2, which should perform nicely for your workload and be pretty reliable when the disks start to fail.

-- paul
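For concreteness, 14+2 raidz2 means a single sixteen-disk raidz2 vdev: fourteen disks' worth of data capacity plus two of parity (device names below are hypothetical placeholders):

    # one 16-disk raidz2 vdev: survives any two disk failures
    zpool create homepool raidz2 \
        c0t0d0 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0 c0t6d0 c0t7d0 \
        c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0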
[zfs-discuss] Experiences with zfs/iscsi on T2000s and X4500s?
I would very much appreciate hearing from anyone who has experience running large zfs pools on T2000s, created out of vdevs provided as iscsi targets from X4500s (Thumpers). Please respond with positive or negative experiences, either on or off list.

thanks in advance,
paul
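For reference, a sketch of the kind of setup being asked about, with hypothetical pool and volume names (shareiscsi and iscsiadm as found in builds of that vintage):

    # on each X4500: export a zvol as an iscsi target
    zfs create -V 500g tank/tgt0
    zfs set shareiscsi=on tank/tgt0

    # on the T2000: discover the targets...
    iscsiadm add discovery-address 192.168.1.10:3260
    iscsiadm modify discovery --sendtargets enable

    # ...and build a pool from the resulting LUNs (real device names
    # will look like c2t<long-guid>d0; placeholders here)
    zpool create bigpool raidz2 c2t0d0 c2t1d0 c2t2d0 ...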
[zfs-discuss] Performance of zpool import?
Has anyone done benchmarking on the scalability and performance of zpool import, in terms of the number of devices in the pool, on recent opensolaris builds? In other words, what would the relative performance of zpool import be for the following three pool configurations on multi-pathed 4G FC connected JBODs:

1) 1 x 12-disk raidz2 vdev in the pool
2) 10 x 12-disk raidz2 vdevs in the pool
3) 100 x 12-disk raidz2 vdevs in the pool

Any feedback on your experiences would be greatly appreciated.

-- paul
RE: [zfs-discuss] Performance of zpool import?
From: Eric Schrock [mailto:[EMAIL PROTECTED]
Sent: Monday, February 26, 2007 12:05 PM
The slow part of zpool import is actually discovering the pool configuration. This involves examining every device on the system (or every device within an 'import -d' directory) and seeing if it has any labels. Internally, the import action itself should be quite fast...

Thanks for the answer. Let me ask a follow-up question related to zpool import and the sun cluster+zfs integration -- is the slow part done early on the backup node, so that at the time of failover the actual import is fast, as you describe above?

-- paul
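As an aside, Eric's description suggests a workaround when a system has many devices: constrain the discovery scan with -d. A sketch, assuming the pool's device links have been gathered into one directory:

    # scan only one directory rather than every device on the system
    mkdir /mypool-devs
    ln -s /dev/dsk/c2t0d0s0 /mypool-devs/   # repeat for each pool device
    zpool import -d /mypool-devs mypool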
RE: [zfs-discuss] Re: ZFS or UFS - what to do?
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ed Gould
Sent: Friday, January 26, 2007 3:38 PM
Yes, I agree. I'm sorry I don't have the data that Jim presented at FAST, but he did present actual data. Richard Elling (I believe it was Richard) has also posted some related data from ZFS experience to this list.

This seems to be from Jim, and on point: http://www.usenix.org/event/fast05/tech/gray.pdf

paul