Re: [zfs-discuss] Terrible zfs performance under NFS load

2008-07-31 Thread Paul Fisher
Stephen Stogner wrote:
 Hello,
 We have a S10U5 server with zfs sharing up NFS shares.  While using 
 the nfs mount as the log destination for syslog from 20 or so busy mail servers, 
 we have noticed that throughput becomes severely degraded after a short time.  I have 
 tried disabling the zil and turning off cache flushing, and I have not seen any 
 change in performance.  The servers are only pushing about 1MB/s of constant 
 log traffic to the server over nfs.  I think this is due to the cache 
 being flushed with every nfs commit.  I was wondering if anyone had any other 
 suggestions as to what it could be?  Thank you.
   
Not that this deals with the nfs/zfs performance you are 
experiencing, but why not forward the syslog directly to the target 
machine and allow it to write the syslog files locally to the filesystem?
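
(For reference, not from the original post: a minimal sketch of that forwarding in /etc/syslog.conf on each mail server, assuming a "loghost" alias resolvable on the network.)

# /etc/syslog.conf -- send mail-facility messages to the central log
# host instead of writing them over NFS (selector and action must be
# separated by tabs)
mail.debug	@loghost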


--
paul
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Terrible zfs performance under NFS load

2008-07-31 Thread Paul Fisher
Stephen Stogner wrote:
 True, we could have all the syslog data directed towards the host, but the 
 underlying performance issue remains the same.  We have used nfs 
 shares for log hosts and mail hosts, and we are looking towards using a zfs 
 based mail store with nfs mounts from x mail servers, but if the nfs/zfs combo 
 takes such a performance hit I would need to investigate another solution.
   
Syslog is funny in that it does a lot of open/write/close cycles so that 
rotation can work trivially.  Those are meta-data updates, and on NFS each 
one implies a COMMIT.  This leads us back to the old "solaris nfs over zfs 
is slow" discussion, where we talk about the fact that other nfs servers 
do not honor the COMMIT semantics and can lose data.  I for one do 
*not* want solaris nfs/zfs to behave in any way other than it does.

The bottom line is that for high COMMIT rate nfs workloads you need to 
do as Richard suggests and look into setting up a slog (separate intent 
log) on fast disks (or SSD) away from the rest of the storage pool.
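
(As a hedged illustration only: the pool and device names below are hypothetical, and the pool must be at a version that supports log vdevs.)

# attach a dedicated intent-log device (e.g. a fast SSD) to the pool
zpool add mypool log c9t0d0

# or mirror the slog so the intent log itself is protected
zpool add mypool log mirror c9t0d0 c9t1d0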

In spite of this, I would still recommend that you forward the syslog 
traffic anyway, just for the sake of marshalling your resources for important 
work rather than wasting them.


--
paul
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-30 Thread Paul Fisher
Richard Elling wrote:
 I was able to reproduce this in b93, but might have a different
 interpretation of the conditions.  More below...

 Ross Smith wrote:
   
 A little more information today.  I had a feeling that ZFS would
 continue quite some time before giving an error, and today I've shown
 that you can carry on working with the filesystem for at least half an
 hour with the disk removed.

 I suspect on a system with little load you could carry on working for
 several hours without any indication that there is a problem.  It
 looks to me like ZFS is caching reads and writes, and that provided
 requests can be fulfilled from the cache, it doesn't care whether the
 disk is present or not.
 

 In my USB-flash-disk-sudden-removal-while-writing-big-file-test,
 1. I/O to the missing device stopped (as I expected)
 2. FMA kicked in, as expected.
 3. /var/adm/messages recorded Command failed to complete... device gone.
 4. After exactly 9 minutes, 17,951 e-reports had been processed and the
 diagnosis was complete.  FMA logged the following to /var/adm/messages
   
Wow! Who knew that 17,951 was the magic number...  Seriously, this does 
seem like an excessive amount of certainty.
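
(Not part of the original report, but for anyone who wants to see those numbers on their own box, the stock FMA tools will show them; a hedged example:)

# list the e-reports fmd has received (and count the lines)
fmdump -e | wc -l
# show the completed diagnosis, if any
fmadm faulty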


--
paul
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-11-09 Thread Paul Fisher
Let's stop feeding the troll...


-Original Message-
From: [EMAIL PROTECTED] on behalf of Richard Elling
Sent: Thu 11/8/2007 11:45 PM
To: can you guess?
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Yager on ZFS
 
can you guess? wrote:
 CERN was using relatively cheap disks and found that they were more 
 than adequate (at least for any normal consumer use) without that 
 additional level of protection: the incidence of errors, even 
 including the firmware errors which presumably would not have occurred 
 in a normal consumer installation lacking hardware RAID, was on the 
 order of 1 per TB - and given that it's really, really difficult for a 
 consumer to come anywhere near that much data without most of it being 
 video files (which just laugh and keep playing when they discover 
 small errors) that's pretty much tantamount to saying that consumers 
 would encounter no *noticeable* errors at all.

bull*
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS scaling

2007-07-03 Thread Paul Fisher
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 William Loewe

 I'm using a Sun Fire X4500 Thumper and trying to get some 
 sense of the best performance I can get from it with zfs.
 
 I'm running without mirroring or raid, and have checksumming 
 turned off.  I built the zfs with these commands:
 
 # zpool create mypool disk0 disk1 ... diskN
 # zfs set checksum=off mypool
 # zfs create mypool/testing
 
 When I run an application with 8 threads performing writes, I 
 see this performance:
 
1 disk  --  42 MB/s
2 disks --  81 MB/s
4 disks -- 147 MB/s
8 disks -- 261 MB/s
   12 disks -- 347 MB/s
   16 disks -- 433 MB/s
   32 disks -- 687 MB/s
   45 disks -- 621 MB/s

This is more a matter of the number of vdevs at the top level of the pool, 
coupled with the fact that there are six (6) controllers to which the disks 
are attached.  The good news is that adding redundancy does not slow it down.  The 
following two example configurations demonstrate the two ends of the 
performance expectations you can have for the thumper.
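
(Not part of the original message: as a hedged sketch, using the device names from the zpool status below, such a pool can be built with a single zpool create, one raidz2 keyword per top-level vdev, the top-level vdevs being what ZFS stripes writes across.)

# three 12-disk raidz2 top-level vdevs plus a hot spare
zpool create conf6z2pool \
  raidz2 c0t7d0 c1t7d0 c5t7d0 c6t7d0 c7t7d0 c8t7d0 \
         c0t6d0 c1t6d0 c5t6d0 c6t6d0 c7t6d0 c8t6d0 \
  raidz2 c0t5d0 c1t5d0 c5t5d0 c6t5d0 c7t5d0 c8t5d0 \
         c0t3d0 c1t3d0 c5t3d0 c6t3d0 c7t3d0 c8t3d0 \
  raidz2 c0t2d0 c1t2d0 c5t2d0 c6t2d0 c7t2d0 c8t2d0 \
         c0t1d0 c1t1d0 c5t1d0 c6t1d0 c7t1d0 c8t1d0 \
  spare c8t0d0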

For example, a pool constructed of three (3) raidz2 vdevs, each with twelve 
(12) disks, like so:

bash-3.00# zpool status conf6z2pool
  pool: conf6z2pool
 state: ONLINE
 scrub: none requested
config:

NAME         STATE READ WRITE CKSUM
conf6z2pool  ONLINE   0 0 0
  raidz2     ONLINE   0 0 0
c0t7d0  ONLINE   0 0 0
c1t7d0  ONLINE   0 0 0
c5t7d0  ONLINE   0 0 0
c6t7d0  ONLINE   0 0 0
c7t7d0  ONLINE   0 0 0
c8t7d0  ONLINE   0 0 0
c0t6d0  ONLINE   0 0 0
c1t6d0  ONLINE   0 0 0
c5t6d0  ONLINE   0 0 0
c6t6d0  ONLINE   0 0 0
c7t6d0  ONLINE   0 0 0
c8t6d0  ONLINE   0 0 0
  raidz2     ONLINE   0 0 0
c0t5d0  ONLINE   0 0 0
c1t5d0  ONLINE   0 0 0
c5t5d0  ONLINE   0 0 0
c6t5d0  ONLINE   0 0 0
c7t5d0  ONLINE   0 0 0
c8t5d0  ONLINE   0 0 0
c0t3d0  ONLINE   0 0 0
c1t3d0  ONLINE   0 0 0
c5t3d0  ONLINE   0 0 0
c6t3d0  ONLINE   0 0 0
c7t3d0  ONLINE   0 0 0
c8t3d0  ONLINE   0 0 0
  raidz2     ONLINE   0 0 0
c0t2d0  ONLINE   0 0 0
c1t2d0  ONLINE   0 0 0
c5t2d0  ONLINE   0 0 0
c6t2d0  ONLINE   0 0 0
c7t2d0  ONLINE   0 0 0
c8t2d0  ONLINE   0 0 0
c0t1d0  ONLINE   0 0 0
c1t1d0  ONLINE   0 0 0
c5t1d0  ONLINE   0 0 0
c6t1d0  ONLINE   0 0 0
c7t1d0  ONLINE   0 0 0
c8t1d0  ONLINE   0 0 0
spares
  c8t0d0     AVAIL   


will yield the following performance for several sustained write streams with 
a block size of 128k:

 (9 x dd if=/dev/zero bs=128k)

bash-3.00# zpool iostat conf6z2pool 1 
             capacity     operations    bandwidth
pool  used  avail   read  write   read  write
---  -  -  -  -  -  -
conf6z2pool  9.56G  16.3T  0  3.91K  0   492M
conf6z2pool  9.56G  16.3T  0  4.04K  0   509M
conf6z2pool  9.56G  16.3T  0  4.05K  0   510M
conf6z2pool  9.56G  16.3T  0  4.11K  0   517M


and sustained read performance of several 128k streams yields:

 (9 x dd of=/dev/zero bs=128k)

bash-3.00# zpool iostat conf6z2pool 1
             capacity     operations    bandwidth
pool  used  avail   read  write   read  write
---  -  -  -  -  -  -
conf6z2pool  1.30T  15.0T  5.97K  0   759M  0
conf6z2pool  1.30T  15.0T  5.97K  0   759M  0
conf6z2pool  1.30T  15.0T  5.96K  0   756M  0

and sustained read/write performance of several 128k streams yields:

 (9 x dd if=/dev/zero bs=128k & 9 x dd of=/dev/null bs=128k)

bash-3.00# zpool iostat conf6z2pool 1
             capacity     operations    bandwidth
pool  used  avail   read  write   read  write
---  -  -  -  -  -  -
conf6z2pool  1.30T  15.0T  3.34K  2.54K   424M   320M
conf6z2pool  1.30T  15.0T  2.89K  2.83K   367M   356M
conf6z2pool  1.30T  15.0T  2.96K  2.80K   375M   353M
conf6z2pool  1.30T  15.0T  3.50K  2.58K   445M   325M

The complete opposite end of the performance spectrum will come from a
pool of mirrored vdevs with the following configuration:

bash-3.00# zpool status conf7mpool
  pool: conf7mpool
 state: ONLINE
 scrub: none 
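
(Purely as a hypothetical illustration, with invented device names: a pool of mirrored vdevs of this kind is built from many two-way mirrors, one mirror keyword per top-level vdev.)

# one two-way mirror per top-level vdev; repeat for as many pairs as desired
zpool create conf7mpool \
  mirror c0t7d0 c1t7d0 \
  mirror c5t7d0 c6t7d0 \
  mirror c7t7d0 c8t7d0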

RE: [zfs-discuss] Re: Slow write speed to ZFS pool (via NFS)

2007-06-23 Thread Paul Fisher
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 Thomas Garner
 
 So it is expected behavior on my Nexenta alpha 7 server for Sun's nfsd
 to stop responding after 2 hours of running a bittorrent client over
 nfs4 from a linux client, causing zfs snapshots to hang and requiring
 a hard reboot to get the world back in order?

We have seen this behavior, but it appears to be entirely related to the 
Intel IPMI firmware on the network hardware swallowing up the NFS traffic on 
port 623, so that it never gets delivered to the host:

http://blogs.sun.com/shepler/entry/port_623_or_the_mount
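
(A hedged aside, not from the original thread: one quick way to check from the Linux client whether its NFS TCP connection happened to land on source port 623, which the IPMI/BMC firmware claims for itself:)

# show established connections to the NFS server port (2049)
# and check whether the local port is 623
netstat -tn | grep ':2049'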


--

paul
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


RE: [zfs-discuss] ZFS Scalability/performance

2007-06-20 Thread Paul Fisher
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of mike
 Sent: Wednesday, June 20, 2007 9:30 AM
 
 I would prefer something like 15+1 :) I want ZFS to be able to detect
 and correct errors, but I do not need to squeeze all the performance
 out of it (I'll be using it as a home storage server for my DVDs and
 other audio/video stuff. So only a few clients at the most streaming
 off of it)

I would not risk raidz on that many disks.  A nice compromise may be 14+2 
raidz2, which should perform nicely for your workload and be pretty reliable 
when the disks start to fail.
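
(A hedged sketch of what that 14+2 layout looks like at pool-creation time; the pool and device names are hypothetical:)

# sixteen disks in a single raidz2 top-level vdev: 14 data + 2 parity
zpool create tank raidz2 \
  c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0 \
  c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0 c2t7d0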


--

paul
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Experiences with zfs/iscsi on T2000s and X4500s?

2007-04-28 Thread Paul Fisher
I would very much appreciate hearing from anyone that has experience running 
large zfs pools on T2000s created out of vdevs provided as iscsi targets from 
X4500s (Thumpers).  Please respond with positive or negative experiences, 
either on or off list.
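
(For context, a hedged sketch of the kind of setup being asked about, with names and addresses invented; it assumes the shareiscsi property available in builds of that era:)

# on the X4500 (target side): carve a zvol and export it over iSCSI
zfs create -V 500G tank/t2000vol
zfs set shareiscsi=on tank/t2000vol

# on the T2000 (initiator side): discover the target
iscsiadm add discovery-address 192.168.1.10
iscsiadm modify discovery --sendtargets enable
devfsadm -i iscsi
# then zpool create using the new device(s) reported by format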

thanks in advance,
paul
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Performance of zpool import?

2007-02-26 Thread Paul Fisher
Has anyone done benchmarking on the scalability and performance of zpool import 
in terms of the number of devices in the pool on recent opensolaris builds?

In other words, what would the relative performance of zpool import be for 
the following three pool configurations on multi-pathed 4G FC connected JBODs:
1) 1 x 12-disk raidz2 vdev in the pool
2) 10 x 12-disk raidz2 vdevs in the pool
3) 100 x 12-disk raidz2 vdevs in the pool
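
(In the absence of published numbers, a hedged note: the simplest measurement is to time the two phases directly on the configuration in question, e.g. with a hypothetical pool name:)

ptime zpool import        # device discovery/scan only (lists importable pools)
ptime zpool import tank   # the actual import of the named pool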

Any feedback on your experiences would be greatly appreciated.


--

paul
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


RE: [zfs-discuss] Performance of zpool import?

2007-02-26 Thread Paul Fisher
 From: Eric Schrock [mailto:[EMAIL PROTECTED] 
 Sent: Monday, February 26, 2007 12:05 PM
 
 The slow part of zpool import is actually discovering the 
 pool configuration.  This involves examining every device on 
 the system (or every device within an 'import -d' directory) 
 and seeing if it has any labels.  Internally, the import 
 action itself should be quite fast...

Thanks for the answer.  Let me ask a follow-up question related to zpool import 
and the sun cluster+zfs integration -- is the slow discovery part done early on 
the backup node, so that at the time of failover the actual import is fast, as 
you describe above?
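
(Related to the quoted explanation, a hedged example of bounding the discovery cost by pointing the scan at a directory containing only links to the pool's devices; the path is hypothetical:)

# scan only a prepared directory of device links instead of all of /dev/dsk
zpool import -d /mypool-devs tank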

--

paul
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


RE: [zfs-discuss] Re: ZFS or UFS - what to do?

2007-01-26 Thread Paul Fisher
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Ed Gould
 Sent: Friday, January 26, 2007 3:38 PM
 
 Yes, I agree.  I'm sorry I don't have the data that Jim presented at 
 FAST, but he did present actual data.  Richard Elling (I believe it 
 was
 Richard) has also posted some related data from ZFS experience to this 
 list.

This seems to be from Jim and on point:

http://www.usenix.org/event/fast05/tech/gray.pdf


paul
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss