Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-20 Thread Edward Ned Harvey
 ZFS has intelligent prefetching.  AFAIK, Solaris disk drivers do not
 prefetch.

Can you point me to any reference?  I didn't find anything stating yea or
nay for either of these.



Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-20 Thread Edward Ned Harvey
 Doesn't this mean that if you enable write back, and you have
 a single, non-mirrored raid-controller, and your raid controller
 dies on you so that you lose the contents of the nvram, you have
 a potentially corrupt file system?

It is understood that any single point of failure could result in corruption,
yes.  If you have a CPU that makes miscalculations, it can instruct bad
things to be written to disk (I've had something like that happen before).
If you have RAM with bit errors that go undetected, you can have corrupted
memory, and if that memory is destined for disk, you'll have bad data
written to disk.  If you have a non-redundant raid controller which buffers
writes, and the buffer gets destroyed or corrupted before the writes are put
to disk, then the data has become corrupt.  Heck, the same is true even with
redundant raid controllers, if there are memory errors in one that go
undetected.

So you'll have to do your own calculation.  Which is worse?
- Forgoing the benefit of accelerated hardware for all the time that the
hardware is functioning correctly,
Or
- Taking the risk of acceleration, with the possibility that the accelerator
could fail and cause harm to the data it was working on.

I know I always opt for using the raid write-back.  If I ever have a
situation where I'm so scared of the raid card corrupting data, I would be
equally scared of the CPU or SAS bus or system ram or whatever.  In that
case, I'd find a solution that makes entire machines redundant, rather than
worrying about one little perc card.

Yes it can happen.  I've seen it happen.  But not just to raid cards;
everything else is vulnerable too.

I'll take a 4x performance improvement for 99.999% of the time, and risk the
corruption the rest of the time.



Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-19 Thread Edward Ned Harvey
One more thing I'd like to add here:

The PERC cache measurably and significantly accelerates small disk writes.
However, for read operations, it is insignificant compared to system ram,
both in terms of size and speed.  There is no significant performance
improvement by enabling adaptive readahead in the PERC.  I will recommend
instead, the PERC should be enabled for Write Back, and have the readahead
disabled.  Fortunately this is the default configuration on a new perc
volume, so unless you changed it, you should be fine.

It may be smart to double check, and ensure your OS does adaptive readahead.

In Linux (rhel/centos) you can check that the "readahead" service is
loading.  I noticed this is enabled by default in runlevel 5, but disabled
by default in runlevel 3.  Interesting.

I don't know how to check solaris or opensolaris, to ensure adaptive
readahead is enabled.
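
For the Linux side at least, a rough sketch of the checks (the device name
is just an example, and the service name varies a bit between releases):

   # is a readahead service registered for the current runlevels?
   chkconfig --list | grep -i readahead
   # per-device readahead setting, in 512-byte sectors
   blockdev --getra /dev/sda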




On 2/18/10 8:08 AM, Edward Ned Harvey sola...@nedharvey.com wrote:

 Ok, I've done all the tests I plan to complete.  For highest performance, it
 seems:
 · The measure I think is the most relevant for typical operation is the
 fastest random read / write / mix.  (Thanks Bob, for suggesting I do this
 test.)
 The winner is clearly striped mirrors in ZFS

 · The fastest sustained sequential write is striped mirrors via ZFS, or
 maybe raidz

 · The fastest sustained sequential read is striped mirrors via ZFS, or
 maybe raidz
 
  
 Here are the results:
 ·Results summary of Bob's method
 
http://nedharvey.com/iozone_weezer/bobs%20method/iozone%20results%20summary.pdf
 
 ·Raw results of Bob's method
 http://nedharvey.com/iozone_weezer/bobs%20method/raw_results.zip
 
 ·Results summary of Ned's method
 
http://nedharvey.com/iozone_weezer/neds%20method/iozone%20results%20summary.pdf
 
 ·Raw results of Ned's method
 http://nedharvey.com/iozone_weezer/neds%20method/raw_results.zip
 
  
  
  
  
  
 
 From: Edward Ned Harvey [mailto:sola...@nedharvey.com]
 Sent: Saturday, February 13, 2010 9:07 AM
 To: opensolaris-disc...@opensolaris.org; zfs-discuss@opensolaris.org
 Subject: ZFS performance benchmarks in various configurations
  
 I have a new server, with 7 disks in it.  I am performing benchmarks on it
 before putting it into production, to substantiate claims I make, like
 "striping mirrors is faster than raidz" and so on.  Would anybody like me to
 test any particular configuration?  Unfortunately I don't have any SSD, so I
 can't do any meaningful test on the ZIL etc.  Unless someone in the Boston
 area has a 2.5" SAS SSD they wouldn't mind lending for a few hours.  ;-)
  
 My hardware configuration:  Dell PE 2970 with 8 cores.  Normally 32G, but I
 pulled it all out to get it down to 4G of ram.  (Easier to benchmark disks
 when the file operations aren't all cached.)  ;-)  Solaris 10 10/09.  PERC 6/i
 controller.  All disks are configured in PERC for Adaptive ReadAhead, and
 Write Back, JBOD.  7 disks present, each SAS 15krpm 160G.  OS is occupying 1
 disk, so I have 6 disks to play with.
  
 I am currently running the following tests:
  
 Will test, including the time to flush(), various record sizes inside file
 sizes up to 16G, sequential write and sequential read. Not doing any mixed
 read/write requests.  Not doing any random read/write.
 iozone -Reab somefile.wks -g 17G -i 1 -i 0
  
 Configurations being tested:
 · Single disk
 · 2-way mirror
 · 3-way mirror
 · 4-way mirror
 · 5-way mirror
 · 6-way mirror
 · Two mirrors striped (or concatenated)
 · Three mirrors striped (or concatenated)
 · 5-disk raidz
 · 6-disk raidz
 · 6-disk raidz2
 
  
 Hypothesized results:
 · N-way mirrors write at the same speed of a single disk
 · N-way mirrors read n-times faster than a single disk
 · Two mirrors striped read and write 2x faster than a single mirror
 · Three mirrors striped read and write 3x faster than a single mirror
 · Raidz and raidz2:  No hypothesis.  Some people say they perform
 comparable to many disks working together.  Some people say it's slower than a
 single disk.  Waiting to see the results.
 



Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-19 Thread Günther
hello
i have made some benchmarks with my napp-it zfs-server:
http://www.napp-it.org/bench.pdf

- 2gb vs 4gb vs 8gb ram
- mirror vs raidz vs raidz2 vs raidz3
- dedup and compress enabled vs disabled

result in short:
8gb ram vs 2gb: +10% .. +500% more power (green drives)
compress and dedup enabled: +50% .. +300%
mirror vs raidz: fastest is raidz, slowest mirror, raidz level +/-20%

gea


Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-19 Thread Richard Elling
On Feb 19, 2010, at 8:35 AM, Edward Ned Harvey wrote:
 One more thing I’d like to add here:
 
 The PERC cache measurably and significantly accelerates small disk writes.  
 However, for read operations, it is insignificant compared to system ram, 
 both in terms of size and speed.  There is no significant performance 
 improvement by enabling adaptive readahead in the PERC.  I will recommend 
 instead, the PERC should be enabled for Write Back, and have the readahead 
 disabled.  Fortunately this is the default configuration on a new perc 
 volume, so unless you changed it, you should be fine.
 
 It may be smart to double check, and ensure your OS does adaptive readahead.  
 In Linux (rhel/centos) you can check that the “readahead” service is loading. 
  I noticed this is enabled by default in runlevel 5, but disabled by default 
 in runlevel 3.  Interesting.
 
 I don’t know how to check solaris or opensolaris, to ensure adaptive 
 readahead is enabled.

ZFS has intelligent prefetching.  AFAIK, Solaris disk drivers do not prefetch.
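
(If anyone wants to see it in action, a rough sketch -- these are the
standard ZFS kstats and kernel variable, as I remember them:

   # file-level prefetch ("zfetch") hit and miss counters
   kstat -m zfs -n zfetchstats
   # 0 means file-level prefetch is enabled, which is the default
   echo "zfs_prefetch_disable/D" | mdb -k
)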

 -- richard



Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-19 Thread Ragnar Sundblad

On 19 feb 2010, at 17.35, Edward Ned Harvey wrote:

 The PERC cache measurably and significantly accelerates small disk writes.  
 However, for read operations, it is insignificant compared to system ram, 
 both in terms of size and speed.  There is no significant performance 
 improvement by enabling adaptive readahead in the PERC.  I will recommend 
 instead, the PERC should be enabled for Write Back, and have the readahead 
 disabled.  Fortunately this is the default configuration on a new perc 
 volume, so unless you changed it, you should be fine.

If I understand correctly, ZFS nowadays will only flush data to
non-volatile storage (such as a RAID controller's NVRAM), and not
all the way out to disks.  (This solves performance problems with some
storage systems, and I believe it also is the right thing
to do under normal circumstances.)

Doesn't this mean that if you enable write back, and you have
a single, non-mirrored raid-controller, and your raid controller
dies on you so that you lose the contents of the nvram, you have
a potentially corrupt file system?

/ragge



Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-19 Thread Neil Perrin



If I understand correctly, ZFS nowadays will only flush data to
non-volatile storage (such as a RAID controller's NVRAM), and not
all the way out to disks.  (This solves performance problems with some
storage systems, and I believe it also is the right thing
to do under normal circumstances.)

Doesn't this mean that if you enable write back, and you have
a single, non-mirrored raid-controller, and your raid controller
dies on you so that you lose the contents of the nvram, you have
a potentially corrupt file system?


ZFS requires that all writes be flushed to non-volatile storage.
This is needed for both transaction group (txg) commits to ensure pool integrity
and for the ZIL to satisfy the synchronous requirement of fsync/O_DSYNC etc.
If the caches weren't flushed then it would indeed be quicker but the pool
would be susceptible to corruption. Sadly some hardware doesn't honour
cache flushes and this can cause corruption.
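
(Related tunable, for completeness: on storage whose write cache really is
non-volatile, some sites tell ZFS not to issue the flush at all.  A rough
sketch of that /etc/system setting -- only sane when the cache is genuinely
battery- or NVRAM-backed, and it needs a reboot to take effect:

   * stop ZFS from sending cache-flush commands to the devices
   set zfs:zfs_nocacheflush = 1
)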

Neil.


Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-18 Thread Edward Ned Harvey
Ok, I've done all the tests I plan to complete.  For highest performance, it
seems:

· The measure I think is the most relevant for typical operation is
the fastest random read / write / mix.  (Thanks Bob, for suggesting I do this
test.)
The winner is clearly striped mirrors in ZFS

· The fastest sustained sequential write is striped mirrors via ZFS,
or maybe raidz

· The fastest sustained sequential read is striped mirrors via ZFS,
or maybe raidz

 

Here are the results:

· Results summary of Bob's method
http://nedharvey.com/iozone_weezer/bobs%20method/iozone%20results%20summary.pdf

· Raw results of Bob's method
http://nedharvey.com/iozone_weezer/bobs%20method/raw_results.zip

· Results summary of Ned's method
http://nedharvey.com/iozone_weezer/neds%20method/iozone%20results%20summary.pdf

· Raw results of Ned's method
http://nedharvey.com/iozone_weezer/neds%20method/raw_results.zip

 

 

 

 

 

From: Edward Ned Harvey [mailto:sola...@nedharvey.com] 
Sent: Saturday, February 13, 2010 9:07 AM
To: opensolaris-disc...@opensolaris.org; zfs-discuss@opensolaris.org
Subject: ZFS performance benchmarks in various configurations

 

I have a new server, with 7 disks in it.  I am performing benchmarks on it
before putting it into production, to substantiate claims I make, like
"striping mirrors is faster than raidz" and so on.  Would anybody like me to
test any particular configuration?  Unfortunately I don't have any SSD, so I
can't do any meaningful test on the ZIL etc.  Unless someone in the Boston
area has a 2.5" SAS SSD they wouldn't mind lending for a few hours.  ;-)

 

My hardware configuration:  Dell PE 2970 with 8 cores.  Normally 32G, but I
pulled it all out to get it down to 4G of ram.  (Easier to benchmark disks
when the file operations aren't all cached.)  ;-)  Solaris 10 10/09.  PERC
6/i controller.  All disks are configured in PERC for Adaptive ReadAhead,
and Write Back, JBOD.  7 disks present, each SAS 15krpm 160G.  OS is
occupying 1 disk, so I have 6 disks to play with.

 

I am currently running the following tests:

 

Will test, including the time to flush(), various record sizes inside file
sizes up to 16G, sequential write and sequential read.  Not doing any mixed
read/write requests.  Not doing any random read/write.

iozone -Reab somefile.wks -g 17G -i 1 -i 0

 

Configurations being tested:

· Single disk
· 2-way mirror
· 3-way mirror
· 4-way mirror
· 5-way mirror
· 6-way mirror
· Two mirrors striped (or concatenated)
· Three mirrors striped (or concatenated)
· 5-disk raidz
· 6-disk raidz
· 6-disk raidz2

 

Hypothesized results:

· N-way mirrors write at the same speed of a single disk
· N-way mirrors read n-times faster than a single disk
· Two mirrors striped read and write 2x faster than a single mirror
· Three mirrors striped read and write 3x faster than a single mirror
· Raidz and raidz2:  No hypothesis.  Some people say they perform
comparable to many disks working together.  Some people say it's slower than
a single disk.  Waiting to see the results.
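
For anyone who wants to reproduce these layouts, the pools are created with
commands roughly like the following (device names are made up for
illustration; each configuration gets destroyed and recreated between runs):

   # two mirrors, dynamically striped
   zpool create tank mirror c1t1d0 c1t2d0 mirror c1t3d0 c1t4d0
   # add a third mirror to stripe across
   zpool add tank mirror c1t5d0 c1t6d0
   # 3-way mirror
   zpool create tank mirror c1t1d0 c1t2d0 c1t3d0
   # 5-disk raidz, 6-disk raidz2
   zpool create tank raidz  c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0
   zpool create tank raidz2 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0
   # tear down before the next configuration
   zpool destroy tank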



Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-18 Thread Bob Friesenhahn

On Thu, 18 Feb 2010, Edward Ned Harvey wrote:



Ok, I’ve done all the tests I plan to complete.  For highest performance, it 
seems:

· The measure I think is the most relevant for typical operation is the 
fastest random read
/write / mix.  (Thanks Bob, for suggesting I do this test.)
The winner is clearly striped mirrors in ZFS


A most excellent set of tests.  We could use some units in the PDF 
file though.


While it would take quite some time and effort to accomplish, we could 
use a similar summary for full disk resilver times in each 
configuration.



· The fastest sustained sequential write is striped mirrors via ZFS, or 
maybe raidz


Note that while these tests may be file-sequential, with 8 threads 
working at once, what the disks see is not necessarily sequential. 
However, for initial sequential write, it may be that zfs aggregates 
the write requests and orders them on disk in such a way that 
subsequent sequential reads by the same number of threads in a 
roughly similar order would see a performance benefit.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/


Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-18 Thread Edward Ned Harvey
 A most excellent set of tests.  We could use some units in the PDF
 file though.

Oh, hehehe.  ;-)  The units are written in the raw txt files.  On your
tests, the units were ops/sec, and in mine, they were Kbytes/sec.  If you
like, you can always grab the xlsx and modify it to your tastes, and create
an updated pdf.  Just substitute .xlsx instead of .pdf in the previous
URLs.  Or just drop the filename off the URL.  My web server allows
indexing on that directory.

Personally, I only look at the chart which is normalized against a single
disk, so units are intentionally not present.


 While it would take quite some time and effort to accomplish, we could
 use a similar summary for full disk resilver times in each
 configuration.

Actually, that's easy.  Although the zpool create happens instantly, all
the hardware raid configurations required an initial resilver.  And they
were exactly what you expect.  Write 1 Gbit/s until you reach the size of
the drive.  I watched the progress while I did other things, and it was
incredibly consistent.

I am assuming, with very high confidence, that ZFS would match that
performance.
 



Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-18 Thread Edward Ned Harvey
 A most excellent set of tests.  We could use some units in the PDF
 file though.

Oh, by the way, you originally requested that a 12G file be used in the
benchmark, and later changed to 4G.  But by that time, two of the tests had
already completed on the 12G, and I didn't throw away those results, but I
didn't include them in the summary either.

If you look in the raw results, you'll see a directory called 12G, and if
you compare those results against the equivalent 4G counterpart, you'll see
the 12G in fact performed somewhat lower.

The reason is that there are sometimes cache hits during read operations,
and the write back buffer is enabled in the PERC.  So the smaller the data
set, the more frequently these things will accelerate you.  And
consequently, the 4G performance was measured higher.

This doesn't affect me at all.  I wanted to know qualitative results, not
quantitative.



Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-18 Thread Bob Friesenhahn

On Thu, 18 Feb 2010, Edward Ned Harvey wrote:

Actually, that's easy.  Although the zpool create happens instantly, all
the hardware raid configurations required an initial resilver.  And they
were exactly what you expect.  Write 1 Gbit/s until you reach the size of
the drive.  I watched the progress while I did other things, and it was
incredibly consistent.


This sounds like an initial 'silver' rather than a 'resilver'.  In a 
'resilver' process it is necessary to read other disks in the vdev in 
order to reconstruct the disk content.  As a result, we now have 
additional seeks and reads going on, which seems considerably 
different than pure writes.


What I am interested in is the answer to these sort of questions:

 o Does a mirror device resilver faster than raidz?

 o Does a mirror device in a triple mirror resilver faster than a
   two-device mirror?

 o Does a raidz2 with 9 disks resilver faster or slower than one with
   6 disks?

The answer to these questions could vary depending on how well the 
pool has been aged and if it has been used for a while close to 100% 
full.


Before someone pipes up and says that measuring this is useless since 
results like this are posted all over the internet, I challenge that 
someone to find this data already published somewhere.
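
If someone does measure it, a rough sketch of how the numbers could be
collected (device names hypothetical):

   # swap one disk of the vdev for a spare and time the resilver
   zpool replace tank c1t3d0 c1t6d0
   # 'zpool status' shows the percent done while it runs, and on
   # completion a line like 'resilver completed after 0h42m with 0 errors'
   zpool status tank

Aging the pool first (filling it, then deleting and rewriting files) would
be needed to answer the near-100%-full variant of the question.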


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/


Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-18 Thread Daniel Carosone
On Thu, Feb 18, 2010 at 10:39:48PM -0600, Bob Friesenhahn wrote:
 This sounds like an initial 'silver' rather than a 'resilver'. 

Yes, in particular it will be entirely sequential.

ZFS resilver is in txg order and involves seeking.

 What I am interested in is the answer to these sort of questions:

  o Does a mirror device resilver faster than raidz?

  o Does a mirror device in a triple mirror resilver faster than a
two-device mirror?

  o Does a raidz2 with 9 disks resilver faster or slower than one with
6 disks?

and, if we're wishing for comprehensive analysis:

  o What is the impact on concurrent IO benchmark loads, for each of the above. 

 The answer to these questions could vary depending on how well the pool 
 has been aged and if it has been used for a while close to 100% full.

Indeed, which makes it even harder to compare results from different
cases and test sources.  To get usable relative-to-each-other results,
one needs to compare idealised test cases with repeatable loads.
This is weeks of work, at least, and can be fun to speculate about up
front but rapidly gets very tiresome.

--
Dan.



Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-15 Thread Carson Gaspar

Richard Elling wrote:
...

As you can see, so much has changed, hopefully for the better, that running
performance benchmarks on old software just isn't very interesting.

NB. Oracle's Sun OpenStorage systems do not use Solaris 10 and if they did, they
would not be competitive in the market. The notion that OpenSolaris is worthless
and Solaris 10 rules is simply bull*


OpenSolaris isn't worthless, but no way in hell would I run it in 
production, based on my experiences running it at home from b111 to now. 
The mpt driver problems are just one of many show stoppers (is that 
resolved yet, or do we still need magic /etc/system voodoo?).


Of course, Solaris 10 couldn't properly drive the Marvell attached disks 
in an X4500 prior to U6 either, unless you ran an IDR (pretty 
inexcusable in a storage-centric server release).


--
Carson



Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-14 Thread Edward Ned Harvey
 Never mind. I have no interest in performance tests for Solaris 10.

 The code is so old, that it does not represent current ZFS at all.

 

Whatever.  Regardless of what you say, it does show:

· Which is faster, raidz, or a stripe of mirrors?
· How much does raidz2 hurt performance compared to raidz?
· Which is faster, raidz, or hardware raid 5?
· Is a mirror twice as fast as a single disk for reading?  Is a
3-way mirror 3x faster?  And so on?

 

I've seen and heard many people stating answers to these questions, and my
results (not yet complete) already answer these questions, and demonstrate
that all the previous assertions were partial truths.

 

It's true, I have no interest in comparing the performance of ZFS version 3
versus ZFS version 4.  If you want that, test it yourself and don't complain about
my tests.



Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-14 Thread Edward Ned Harvey
   iozone -m -t 8 -T -O -r 128k -o -s 12G
 
 Actually, it seems that this is more than sufficient:
 
iozone -m -t 8 -T -r 128k -o -s 4G

Good news, cuz I kicked off the first test earlier today, and it seems like
it will run till Wednesday.  ;-)  The first run, on a single disk, took 6.5
hrs, and I have it configured to repeat ... 2-way mirror, 3-way mirror,
4-way mirror, 5-way mirror, raidz 5 disks, raidz 6 disks, raidz2 6 disks,
stripe of 2 mirrors, stripe of 3 mirrors ...

I'll go stop it, and change to 4G.  Maybe it'll be done tomorrow.  ;-)



Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-14 Thread Thomas Burgess
 Whatever.  Regardless of what you say, it does show:

 · Which is faster, raidz, or a stripe of mirrors?

 · How much does raidz2 hurt performance compared to raidz?

 · Which is faster, raidz, or hardware raid 5?

 · Is a mirror twice as fast as a single disk for reading?  Is a
 3-way mirror 3x faster?  And so on?



 I’ve seen and heard many people stating answers to these questions, and my
 results (not yet complete) already answer these questions, and demonstrate
 that all the previous assertions were partial truths.




I don't think he was complaining, I think he was saying he didn't need you
to run iosnoop on the old version of ZFS.

Solaris 10 has a really old version of ZFS.  I know there are some pretty
big differences in zfs versions from my own non-scientific benchmarks.  It
would make sense that people wouldn't be as interested in benchmarks of
Solaris 10 ZFS seeing as there are literally hundreds scattered around the
internet.

I don't think he was telling you not to bother testing for your own purposes
though.


Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-14 Thread Bob Friesenhahn

On Sun, 14 Feb 2010, Edward Ned Harvey wrote:


 Never mind. I have no interest in performance tests for Solaris 10.

 The code is so old, that it does not represent current ZFS at all.

Whatever.  Regardless of what you say, it does show:


Since Richard abandoned Sun (in favor of gmail), he has no qualms about
suggesting that we test the unstable version. ;-)


Regardless of denials to the contrary, Solaris 10 is still the stable 
enterprise version of Solaris, and will be for quite some time.  It 
has not yet achieved the status of Solaris 8.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/


Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-14 Thread Bob Friesenhahn

On Sun, 14 Feb 2010, Edward Ned Harvey wrote:


 iozone -m -t 8 -T -O -r 128k -o -s 12G


Actually, it seems that this is more than sufficient:

   iozone -m -t 8 -T -r 128k -o -s 4G


Good news, cuz I kicked off the first test earlier today, and it seems like
it will run till Wednesday.  ;-)  The first run, on a single disk, took 6.5
hrs, and I have it configured to repeat ... 2-way mirror, 3-way mirror,
4-way mirror, 5-way mirror, raidz 5 disks, raidz 6 disks, raidz2 6 disks,
stripe of 2 mirrors, stripe of 3 mirrors ...

I'll go stop it, and change to 4G.  Maybe it'll be done tomorrow.  ;-)


Probably even 2G is plenty since that gives 16GB of total file data.

Keep in mind that with file data much larger than memory, these 
benchmarks are testing the hardware more than they are testing 
Solaris.  If you wanted to test Solaris, then you would intentionally 
give it enough memory to work with since that is how it is expected to 
be used.


The performance of Solaris when it is given enough memory to do 
reasonable caching is astounding.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/


Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-14 Thread Bob Friesenhahn

On Sun, 14 Feb 2010, Thomas Burgess wrote:


Solaris 10 has a really old version of ZFS.  I know there are some
pretty big differences in zfs versions from my own non-scientific
benchmarks.  It would make sense that people wouldn't be as
interested in benchmarks of Solaris 10 ZFS seeing as there are
literally hundreds scattered around the internet.


Can you provide URLs for these useful benchmarks?  I am certainly 
interested in seeing them.


Even my own benchmarks that I posted almost two years ago are quite 
useless now.  Solaris 10 ZFS is a continually moving target.


OpenSolaris performance postings I have seen are not terribly far from 
Solaris 10.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/


Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-14 Thread Richard Elling
On Feb 14, 2010, at 6:45 PM, Thomas Burgess wrote:
 
 Whatever.  Regardless of what you say, it does show:
 
 · Which is faster, raidz, or a stripe of mirrors?
 
 · How much does raidz2 hurt performance compared to raidz?
 
 · Which is faster, raidz, or hardware raid 5?
 
 · Is a mirror twice as fast as a single disk for reading?  Is a 3-way 
 mirror 3x faster?  And so on?
 
  
 I’ve seen and heard many people stating answers to these questions, and my 
 results (not yet complete) already answer these questions, and demonstrate 
 that all the previous assertions were partial truths.
 
  
 
 I don't think he was complaining, i think he was sayign he dind't need you to 
 run iosnoop on the old version of ZFS.

iosnoop runs fine on Solaris 10.

I am sorta complaining, though. If you wish to advance ZFS, then use the 
latest bits. If you wish to discover the performance bugs in Solaris 10 that are
already fixed in OpenSolaris, then go ahead, be my guest.  Examples of 
improvements are:
+ intelligent prefetch algorithm is smarter
+ txg commit interval logic is improved
+ ZIL logic improved and added logbias property
+ stat() performance is improved
+ raidz write performance improved and raidz3 added
+ zfs caching improved
+ dedup changes touched many parts of ZFS
+ zfs_vdev_max_pending reduced and smarter
+ metaslab allocation improved
+ zfs write activity doesn't hog resource quite so much
+ a new scheduling class, SDC, added to better observe and manage
   ZFS thread scheduling
+ buffers can be shared between file system modules (fewer copies)

As you can see, so much has changed, hopefully for the better, that running
performance benchmarks on old software just isn't very interesting.

NB. Oracle's Sun OpenStorage systems do not use Solaris 10 and if they did, they
would not be competitive in the market. The notion that OpenSolaris is worthless
and Solaris 10 rules is simply bull*

 Solaris 10 has a really old version of ZFS.  I know there are some pretty big
 differences in zfs versions from my own non-scientific benchmarks.  It would
 make sense that people wouldn't be as interested in benchmarks of Solaris 10
 ZFS seeing as there are literally hundreds scattered around the internet.
 
 I don't think he was telling you not to bother testing for your own purposes 
 though.

Correct.
 -- richard



Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-13 Thread Richard Elling
Some thoughts below...

On Feb 13, 2010, at 6:06 AM, Edward Ned Harvey wrote:

 I have a new server, with 7 disks in it.  I am performing benchmarks on it 
 before putting it into production, to substantiate claims I make, like 
 “striping mirrors is faster than raidz” and so on.  Would anybody like me to 
 test any particular configuration?  Unfortunately I don’t have any SSD, so I 
 can’t do any meaningful test on the ZIL etc.  Unless someone in the Boston 
 area has a 2.5” SAS SSD they wouldn’t mind lending for a few hours.  ;-)
  
 My hardware configuration:  Dell PE 2970 with 8 cores.  Normally 32G, but I 
 pulled it all out to get it down to 4G of ram.  (Easier to benchmark disks 
 when the file operations aren’t all cached.)  ;-)  Solaris 10 10/09.  PERC 
 6/i controller.  All disks are configured in PERC for Adaptive ReadAhead, and 
 Write Back, JBOD.  7 disks present, each SAS 15krpm 160G.  OS is occupying 1 
 disk, so I have 6 disks to play with.

Put the memory back in and limit the ARC cache size instead. x86 boxes
have a tendency to change the memory bus speed depending on how much
memory is in the box.

Similarly, you can test primarycache settings rather than just limiting ARC 
size.
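
A rough sketch of both, with a made-up 4 GB cap and pool/filesystem name
(the /etc/system change needs a reboot to take effect):

   * /etc/system: limit the ARC to 4 GB
   set zfs:zfs_arc_max = 0x100000000

   # or control what the ARC caches for the filesystem under test
   zfs set primarycache=metadata tank/bench
   zfs set primarycache=all tank/bench       # back to the default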

 I am currently running the following tests:
  
 Will test, including the time to flush(), various record sizes inside file 
 sizes up to 16G, sequential write and sequential read.  Not doing any mixed 
 read/write requests.  Not doing any random read/write.
 iozone -Reab somefile.wks -g 17G -i 1 -i 0

IMHO, sequential tests are a waste of time.  With default configs, it will be 
difficult to separate the raw performance from prefetched performance.
You might try disabling prefetch as an option.

With sync writes, you will run into the zfs_immediate_write_sz boundary.
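
Both knobs live in /etc/system on that release, roughly like this (the
values shown are only examples, and a reboot is required):

   * turn off ZFS file-level prefetch for the duration of the tests
   set zfs:zfs_prefetch_disable = 1
   * sync writes at or above this size are written straight to the pool
   * and only referenced from the log (default is 0x8000 = 32 KB)
   set zfs:zfs_immediate_write_sz = 0x20000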

Perhaps someone else can comment on how often they find interesting 
sequential workloads which aren't backup-related.

 Configurations being tested:
 · Single disk
 · 2-way mirror
 · 3-way mirror
 · 4-way mirror
 · 5-way mirror
 · 6-way mirror
 · Two mirrors striped (or concatenated)
 · Three mirrors striped (or concatenated)
 · 5-disk raidz
 · 6-disk raidz
 · 6-disk raidz2

Please add some raidz3 tests :-)  We have little data on how raidz3 performs.

  
 Hypothesized results:
 · N-way mirrors write at the same speed of a single disk
 · N-way mirrors read n-times faster than a single disk
 · Two mirrors striped read and write 2x faster than a single mirror
 · Three mirrors striped read and write 3x faster than a single mirror
 · Raidz and raidz2:  No hypothesis.  Some people say they perform 
 comparable to many disks working together.  Some people say it’s slower than 
 a single disk.  Waiting to see the results.

Please post results (with raw data would be nice ;-).  If you would be so
kind as to collect samples of iosnoop -Da I would be eternally grateful :-)
 -- richard



Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-13 Thread Bob Friesenhahn

On Sat, 13 Feb 2010, Edward Ned Harvey wrote:


Will test, including the time to flush(), various record sizes inside file
sizes up to 16G, sequential write and sequential read.  Not doing any mixed
read/write requests.  Not doing any random read/write.

iozone -Reab somefile.wks -g 17G -i 1 -i 0


Make sure to also test with a command like

  iozone -m -t 8 -T -O -r 128k -o -s 12G

I am eager to read your test report.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/


Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-13 Thread Bob Friesenhahn

On Sat, 13 Feb 2010, Bob Friesenhahn wrote:


Make sure to also test with a command like

 iozone -m -t 8 -T -O -r 128k -o -s 12G


Actually, it seems that this is more than sufficient:

  iozone -m -t 8 -T -r 128k -o -s 4G

since it creates a 4GB test file for each thread, with 8 threads.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/


Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-13 Thread Edward Ned Harvey
 IMHO, sequential tests are a waste of time.  With default configs, it
 will be difficult to separate the raw performance from prefetched
 performance.
 You might try disabling prefetch as an option.

 

Let me clarify:

 

Iozone does a nonsequential series of sequential tests, specifically for the
purpose of identifying the performance tiers, separating the various levels
of hardware accelerated performance from the raw disk performance.

 

This is the reason why I took out all but 4G of the system RAM.  In the
(incomplete) results I have so far, it's easy to see these tiers for a
single disk:  

· For file sizes 0 to 4M, a single disk writes 2.8 Gbit/sec and reads
~40-60 Gbit/sec.  This boost comes from writing to PERC cache, and reading
from CPU L2 cache.

· For file sizes 4M to 128M, a single disk writes 2.8 Gbit/sec and reads
24 Gbit/sec.  This boost comes from writing to PERC cache, and reading from
system memory.

· For file sizes 128M to 4G, a single disk writes 1.2 Gbit/sec and reads
24 Gbit/sec.  This boost comes from reading system memory.

· For file sizes 4G to 16G, a single disk writes 1.2 Gbit/sec and reads
1.2 Gbit/sec.  This is the raw disk performance.  (SAS, 15krpm, 146G disks)

 

 

 Please add some raidz3 tests :-)  We have little data on how raidz3
 performs.

 

Does this require a specific version of OS?  I'm on Solaris 10 10/09, and
man zpool doesn't seem to say anything about raidz3 ... I haven't tried
using it ... does it exist?
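
(My understanding is that raidz3 arrived in a later pool version -- I
believe version 17 -- so it isn't in Solaris 10 10/09.  A quick way to see
what any given box supports:

   # lists every pool version this release understands; triple-parity
   # RAID-Z shows up as one of the entries if it's available
   zpool upgrade -v
)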

 

 

 Please post results (with raw data would be nice ;-).  If you would be
 so kind as to collect samples of iosnoop -Da I would be eternally
 grateful :-)

 

I'm guessing iosnoop is an OpenSolaris thing?  Is there an equivalent for
Solaris?

 

I'll post both the raw results, and my simplified conclusions.  Most people
would not want the raw data.  Most people just want to know "What's the
performance hit I take by using raidz2 instead of raidz?" and so on.

Or ... "What's faster, raidz, or hardware raid-5?"

 



Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-13 Thread Bob Friesenhahn

On Sat, 13 Feb 2010, Edward Ned Harvey wrote:


 kind as to collect samples of iosnoop -Da I would be eternally 
 grateful :-)


I'm guessing iosnoop is an opensolaris thing?  Is there an equivalent for 
solaris?


Iosnoop is part of the DTrace Toolkit by Brendan Gregg, which does 
work on Solaris 10.  See http://www.brendangregg.com/dtrace.html.
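
A rough sketch of collecting what Richard asked for (the output file name is
just an example; run it as root while a benchmark run is in progress):

   # from the unpacked DTraceToolkit directory
   ./iosnoop -Da > iosnoop-raidz2-run1.txt
   # -D prints the time delta per I/O, -a prints all the available fields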


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/


Re: [zfs-discuss] ZFS performance benchmarks in various configurations

2010-02-13 Thread Richard Elling
On Feb 13, 2010, at 10:54 AM, Edward Ned Harvey wrote:
  Please add some raidz3 tests :-)  We have little data on how raidz3
  performs.
  
 Does this require a specific version of OS?  I'm on Solaris 10 10/09, and 
 man zpool doesn't seem to say anything about raidz3 ... I haven't tried 
 using it ... does it exist?

Never mind. I have no interest in performance tests for Solaris 10.
The code is so old, that it does not represent current ZFS at all.
IMHO, if you want to do performance tests, then you need to be
on the very latest dev release.  Otherwise, the results can't be
carried forward to make a difference -- finding performance issues
that are already fixed isn't a good use of your time.
 -- richard
