Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08

2009-08-04 Thread Roch

Bob Friesenhahn writes:
  On Wed, 29 Jul 2009, Jorgen Lundman wrote:
  
   For example, I know rsync and tar do not use fdsync (but dovecot does) on
   their close(), but does NFS make it fdsync anyway?
  
  NFS is required to do synchronous writes.  This is what allows NFS 
  clients to recover seamlessly if the server spontaneously reboots. 
  If the NFS client supports it, it can send substantial data (multiple 
  writes) to the server, and then commit it all via an NFS commit. 

In theory; but for lots of single-threaded file creation
(the tar process) the NFS server is fairly constrained in
what it can do. We need something like directory delegation
to allow the client to interact with local caches the way a DAS
filesystem can.

A slog on SSD can help, but that SSD needs to have low-latency
writes, which typically implies DRAM buffers, and a capacitor
so that it can safely be made to ignore cache flushes.

-r


  Note that this requires more work by the client since the NFS client 
  is required to replay the uncommitted writes if the server goes away.
  

   Sorry for the giant email.
  
  No, thank you very much for the interesting measurements and data.
  
  Bob
  --
  Bob Friesenhahn
  bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
  GraphicsMagick Maintainer, http://www.GraphicsMagick.org/



Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08

2009-08-01 Thread Joerg Moellenkamp

Hi Jorgen,

warning ... weird idea inside ...
Ah, it just occurred to me that perhaps for our specific problem, we 
will buy two X25-Es and replace the root mirror. The OS and ZIL log 
can live together, and we can put /var in the data pool. That way we would 
not need to rebuild the data pool and do all the work that comes with that.


Shame I can't zpool replace to a smaller disk (500GB HDD to 32GB SSD) 
though; I will have to lucreate and reboot one time.


Oh, you have a solution ... I just had a weird idea and thought about 
suggesting something of a hack: put the SSDs in a central server, build a 
pool out of them, perhaps activate compression (after all, even small 
machines are 4-core systems these days; they shouldn't idle for their 
money), create some zvols on that pool, share them via iSCSI, and assign 
them as slog devices on the other servers. For high-speed usage: create a 
ramdisk, use it as the slog on the SSD server, and put a UPS under the SSD 
server. In the end an SSD drive is nothing else: a flash memory controller 
with DRAM, some storage, and some caps to keep the DRAM powered until it 
is flushed.
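
A rough sketch of that hack with the commands of the era (pool, zvol and 
device names below are made up; shareiscsi is the old pre-COMSTAR way of 
exporting a zvol):

  # On the central SSD box: pool on the SSDs, compressed, one zvol per client
  ssdbox# zpool create ssdpool mirror c1t0d0 c1t1d0
  ssdbox# zfs set compression=on ssdpool
  ssdbox# zfs create -V 4g ssdpool/slog-web1
  ssdbox# zfs set shareiscsi=on ssdpool/slog-web1

  # On the file server: discover the target and add the LUN as a slog
  web1# iscsiadm add discovery-address 192.168.1.10
  web1# iscsiadm modify discovery --sendtargets enable
  web1# zpool add datapool log c2t...d0   # whatever device name format(1M) reports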


Regards
Joerg




Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08

2009-07-31 Thread Ian Collins

Ross wrote:

Great idea, much neater than most of my suggestions too :-)
  

What is?  Please keep some context for those of us on email!

--
Ian.



Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08

2009-07-31 Thread Ross
 Ross wrote:
  Great idea, much neater than most of my suggestions too :-)

 What is?  Please keep some context for those of us on email!

X25-E drives as a mirrored boot volume on an x4500, partitioning off some of 
the space for the slog.
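
For example, with the OS on one slice of each X25-E and a couple of GB left in 
another slice, the slog itself can be added as a mirror (slice names here are 
hypothetical):

  # zpool add datapool log mirror c5t0d0s4 c5t4d0s4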


Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08

2009-07-30 Thread Jorgen Lundman



Bob Friesenhahn wrote:
Something to be aware of is that not all SSDs are the same.  In fact, 
some faster SSDs may use a RAM write cache (they all do) and then 
ignore a cache sync request while not including hardware/firmware 
support to ensure that the data is persisted if there is power loss. 
Perhaps your fast CF device does that.  If so, that would be really 
bad for zfs if your server was to spontaneously reboot or lose power. 
This is why you really want a true enterprise-capable SSD device for 
your slog.


Naturally, we just wanted to try the various technologies to see how 
they compared. A store-bought CF card took 26s, a store-bought SSD 48s. We 
have not found a PCI NVRAM card yet.


When talking to our Sun vendor, they have no solutions, which is annoying.

X25-E would be good, but some pools have no spares, and since you can't 
remove vdevs, we'd have to move all customers off the x4500 before we 
can use it.


CF cards need a reboot to be seen, but 6 of the servers are x4500, not 
x4540, so it is not really a global solution.


PCI NVRAM cards also need a reboot, but should work in both x4500 and x4540 
without rebuilding the zpool. We can't actually find any with Solaris 
drivers, though.


Peculiar.

Lund


--
Jorgen Lundman       | lund...@lundman.net
Unix Administrator   | +81 (0)3-5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500         (cell)
Japan                | +81 (0)3-3375-1767          (home)


Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08

2009-07-30 Thread Markus Kovero
By the way, the new Intel X25-M (G2) is coming out next month; it will offer 
better random reads/writes than the E-series at a seriously cheap price tag. 
Worth a try, I'd say.

Yours
Markus Kovero




Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08

2009-07-30 Thread Ross
Without spare drive bays I don't think you're going to find one solution that 
works for x4500 and x4540 servers.  However, are these servers physically close 
together?  Have you considered running the slog devices externally?

One possible choice may be to run something like the Supermicro SC216 chassis 
(2U with 24x 2.5" drive bays):
http://www.supermicro.com/products/chassis/2U/216/SC216E2-R900U.cfm

Buy the chassis with redundant power (SC216E2-R900UB), and the JBOD power 
module (CSE-PTJBOD-CB1) to convert it to a dumb JBOD unit.  The standard 
backplane has six SAS connectors, each of which connects to four drives.  You 
might struggle if you need to connect more than six servers, although it may be 
possible to run it in a rather non-standard configuration, removing the 
backplane and powering and connecting drives individually.

However, for up to six servers, you can just fit Adaptec raid cards with 
external ports to each (PCI-e or PCI-x as needed), and use external cables to 
connect those to the SSD drives in the external chassis.

If you felt like splashing out on the raid cards, that would let you run the 
ZIL on up to four Intel X25-E drives per server, backed up by 512MB of battery 
backed cache.

I think that would have a dramatic effect on NFS speed to say the least :-)


Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08

2009-07-30 Thread Mike Gerdts
On Thu, Jul 30, 2009 at 5:27 AM, Rossno-re...@opensolaris.org wrote:
 Without spare drive bays I don't think you're going to find one solution that 
 works for x4500 and x4540 servers.  However, are these servers physically 
 close together?  Have you considered running the slog devices externally?

It appears as though there is an upgrade path.

http://www.c0t0d0s0.org/archives/5750-Upgrade-of-a-X4500-to-a-X4540.html

However, the troll that you have to pay to follow that path demands a
hefty sum ($7995 list).  Oh, and a reboot is required.  :)

-- 
Mike Gerdts
http://mgerdts.blogspot.com/


Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08

2009-07-30 Thread Ross
That should work just as well Bob, although rather than velcro I'd be tempted 
to drill some holes into the server chassis somewhere and screw the drives on.  
These things do use a bit of power, but with the airflow in a thumper I don't 
think I'd be worried.

If they were my own servers I'd be very tempted, but it really depends on how 
happy you would be voiding the warranty on a rather expensive piece of kit :-)


Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08

2009-07-30 Thread Andrew Gabriel

Richard Elling wrote:

On Jul 30, 2009, at 9:26 AM, Bob Friesenhahn wrote:

Do these SSDs require a lot of cooling?  


No. During the Turbo Charge your Apps presentations I was doing around 
the UK, I often pulled one out of a server to hand around the audience 
when I'd finished the demos on it. The first thing I noticed when doing 
this is that the disk is stone cold, which is not what you expect when 
you pull an operating disk out of a system.


Note that they draw all their power from the 5V rail, and can draw more 
current on the 5V rail than some HDDs, which is something to check if 
you're putting lots in a disk rack.


Traditional drive slots are designed for hard drives which need to 
avoid vibration and have specific cooling requirements.  What are the 
environmental requirements for the Intel X25-E?


Operating and non-operating shock: 1,000 G/0.5 msec (vs operating shock
for Barracuda ES.2 of 63G/2ms)
Power spec: 2.4 W @ 32 GB, 2.6W @ 64 GB. (less than HDDs @ ~8-15W)
MTBF: 2M hours (vs 1.2M hours for Barracuda ES.2)
Vibration specs are not consistent for comparison.
Compare:
http://download.intel.com/design/flash/nand/extreme/319984.pdf
vs
http://www.seagate.com/docs/pdf/datasheet/disc/ds_barracuda_es_2.pdf

Interesting that they are now specifying write endurance as:
1 PB of random writes for 32GB, 2 PB of random writes for 64GB.

Except for price/GB, it is game over for HDDs.  Since price/GB is based on
Moore's Law, it is just a matter of time.


SSDs are a sufficiently new technology that I suspect there's 
significant probability of discovering new techniques which give larger 
step improvements than Moore's Law for some years yet. However, HDDs 
aren't standing still either when it comes to capacity, although 
improvements in other HDD performance characteristics have been very 
disappointing this decade (e.g. IOPS haven't improved much at all; 
indeed they've only seen a 10-fold improvement over the last 25 years).


--
Andrew


Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08

2009-07-30 Thread Bob Friesenhahn

On Thu, 30 Jul 2009, Andrew Gabriel wrote:


Except for price/GB, it is game over for HDDs.  Since price/GB is based on
Moore's Law, it is just a matter of time.


SSDs are a sufficiently new technology that I suspect there's significant 
probability of discovering new techniques which give larger step improvements 
than Moore's Law for some years yet. However, HDDs aren't standing still


FLASH technology is highly mature and has been around since the '80s. 
Given this, it is perhaps the case that (through continual refinement) 
FLASH has finally made it to the point of usability for bulk mass 
storage.  It is not clear if FLASH will obey Moore's Law or if it has 
already started its trailing off stage (similar to what happened with 
single-core CPU performance).


Only time will tell.  Currently (after rounding) SSDs occupy 0% of the 
enterprise storage market even though they dominate in some other 
markets.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/


Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10

2009-07-30 Thread Will Murnane
On Thu, Jul 30, 2009 at 14:50, Kurt Olsen no-re...@opensolaris.org wrote:
 I'm using an Acard ANS-9010B (configured with 12 GB of battery-backed ECC RAM 
 w/ a 16 GB CF card for longer-term power losses; the device cost $250, the RAM 
 about $120, and the CF around $100). It just shows up as a SATA drive. Works 
 fine attached to an LSI 1068E. Since -- as I understand it -- one's ZIL doesn't 
 need to be particularly large, I've split that into 2 GB of ZIL and 10 GB of 
 L2ARC. Simple tests show it can do around 3200 sync 4k writes/sec over NFS 
 into a RAID-Z pool of five Western Digital 1 TB Caviar Green drives.

I, too, have one of these, and am mostly happy with it.  The biggest
inconvenience about it is the form factor: it occupies a 5.25" bay.
Since my case has no 5.25" bays (Norco RPC-4220) I improvised by
drilling a pair of correctly spaced holes into the lid of the case and
screwing it in there.  This isn't really recommended for enterprise
use, where drilling holes in the equipment is discouraged.

I don't have benchmarks for my setup, but anecdotally I no longer see
the stalls accessing files over NFS that I had before adding the Acard
to my pool as a log device.  I only have 1GB in it, and that seems
plenty for the purpose: it only ever seems to show up as 8k used, even
with 100 MB/s or more of writes to it.
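
For reference, a log/cache split like Kurt describes is just two zpool add 
operations, and the in-use figure mentioned above is visible with zpool iostat 
(pool and slice names below are made up):

  # zpool add tank log c3t2d0s0
  # zpool add tank cache c3t2d0s1
  # zpool iostat -v tank 5     # per-vdev stats, including the log device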

Also, I should point out that the device doesn't support SMART.  Some
raid controllers may be unhappy about this.

Will


Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08

2009-07-30 Thread Richard Elling


On Jul 30, 2009, at 12:07 PM, Bob Friesenhahn wrote:


On Thu, 30 Jul 2009, Andrew Gabriel wrote:
Except for price/GB, it is game over for HDDs.  Since price/GB is based on
Moore's Law, it is just a matter of time.


SSDs are a sufficiently new technology that I suspect there's
significant probability of discovering new techniques which give
larger step improvements than Moore's Law for some years yet.
However, HDDs aren't standing still


FLASH technology is highly mature and has been around since the  
'80s. Given this, it is perhaps the case that (through continual  
refinement) FLASH has finally made it to the point of usability for  
bulk mass storage.  It is not clear if FLASH will obey Moore's Law  
or if it has already started its trailing off stage (similar to what  
happened with single-core CPU performance).


Only time will tell.  Currently (after rounding) SSDs occupy 0% of  
the enterprise storage market even though they dominate in some  
other markets.


According to Gartner, enterprise SSDs accounted for $92.6M of a
$585.5M SSD market in June 2009, representing 15.8% of the SSD
market. STEC recently announced an order for $120M of ZeusIOPS
drives from a single enterprise storage customer.  From 2007 to
2008, SSD market grew by 100%. IDC reports Q1CY09 had
$4,203M for the external disk storage factory revenue, down 16%
from Q1CY08 while total disk storage systems were down 25.8%
YOY to $5,616M[*]. So while it looks like enterprise SSDs represented
less than 1% of total storage revenue in 2008, it is the part that is
growing rapidly. I would not be surprised to see enterprise SSDs at 5-10%
of the total disk storage systems market in 2010. I would also expect to
see  total disk storage systems revenue continue to decline as fewer
customers buy expensive RAID controllers.  IMHO, the total disk storage
systems market has already peaked, so the enterprise SSD gains at
the expense of overall market size.  Needless to say, whether or not
Sun can capitalize on its OpenStorage strategy, the market is moving
in the same direction, perhaps at a more rapid pace due to current
economic conditions.

[*] IDC defines a Disk Storage System as a set of storage elements,
including controllers, cables, and (in some instances) host bus adapters,
associated with three or more disks. A system may be located outside of
or within a server cabinet and the average cost of the disk storage systems
does not include infrastructure storage hardware (i.e. switches) and
non-bundled storage software.
 -- richard



Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08

2009-07-30 Thread Bob Friesenhahn

On Thu, 30 Jul 2009, Richard Elling wrote:


According to Gartner, enterprise SSDs accounted for $92.6M of a 
$585.5M SSD market in June 2009, representing 15.8% of the SSD 
market. STEC recently announced an order for $120M of ZeusIOPS 
drives from a single enterprise storage customer.  From 2007 to 
2008, SSD market grew by 100%. IDC reports Q1CY09 had $4,203M for 
the external disk storage factory revenue, down 16% from Q1CY08 
while total disk storage systems were down 25.8% YOY to $5,616M[*]. 
So while it looks like enterprise SSDs represented less than 1% of 
total storage revenue in 2008, it is the part that is growing 
rapidly. I would not be surprised to see enterprise SSDs at 5-10%


While $$$ are important for corporate bottom lines, when it comes to 
the number of units deployed, $$$ are a useless measure when comparing 
disk drives to SSDs since SSDs are much more expensive and offer much 
less storage space.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/


Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08

2009-07-30 Thread Jorgen Lundman


X25-E would be good, but some pools have no spares, and since you can't 
remove vdevs, we'd have to move all customers off the x4500 before we 
can use it.


Ah, it just occurred to me that perhaps for our specific problem, we will 
buy two X25-Es and replace the root mirror. The OS and ZIL log can live 
together, and we can put /var in the data pool. That way we would not need to 
rebuild the data pool and do all the work that comes with that.


Shame I can't zpool replace to a smaller disk (500GB HDD to 32GB SSD) 
though; I will have to lucreate and reboot one time.
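
Roughly, that migration would look something like the following (BE, pool and 
slice names are made up; the slog slices should be mirrored as well):

  # lucreate -n be-ssd -p rpool-ssd     # new boot environment on the SSD root pool
  # luactivate be-ssd && init 6         # the one reboot
  # zpool add datapool log mirror c4t0d0s3 c4t1d0s3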


Lund

--
Jorgen Lundman       | lund...@lundman.net
Unix Administrator   | +81 (0)3-5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500         (cell)
Japan                | +81 (0)3-3375-1767          (home)


Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08

2009-07-30 Thread Ross
Great idea, much neater than most of my suggestions too :-)


Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08

2009-07-29 Thread Jorgen Lundman


We just picked up the fastest SSD we could at the local Bic Camera, which 
turned out to be a CSSD-SM32NI, with a supposed 95MB/s write speed.


I put it in place and moved the slog over to it:

  0m49.173s
  0m48.809s

So, it is slower than the CF test. This is disappointing. Everyone else 
seems to use the Intel X25-M, which has a write speed of 170MB/s (2nd 
generation), so perhaps that is why it works better for them. It is 
curious that it is slower than the CF card. Perhaps because it shares 
with so many other SATA devices?


Oh, and we'll probably have to get a 3.5" frame for it, as I doubt it'll 
stay standing after the next earthquake. :)


Lund


Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08

2009-07-29 Thread Ross
Everyone else should be using the Intel X25-E.  There's a massive difference 
between the M and E models, and for a slog it's IOPS and low latency that you 
need.  

I've heard that Sun uses X25-Es, but I'm sure the original reports had them 
using STEC.  I have a feeling the 2nd-generation X25-Es are going to give STEC 
a run for their money though.  If I were you, I'd see if you can get your hands 
on an X25-E for evaluation purposes.

Also, if you're just running NFS over gigabit ethernet, a single X25-E may be 
enough, but at around 90MB/s sustained performance for each, you might need to 
stripe a few of them to match the speeds your Thumper is capable of.
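
Striping the slog is just a matter of listing several devices after the log 
keyword; ZFS spreads log writes across them (device names here are 
hypothetical):

  # zpool add tank log c1t1d0 c1t2d0 c1t3d0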

We're not running an x4500, but we were lucky enough to get our hands on some 
PCI 512MB nvram cards a while back, and I can confirm they make a huge 
difference to NFS speeds - for our purposes they're identical to ramdisk slog 
performance.


Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08

2009-07-29 Thread Ross
Hi James, I'll not reply in line since the forum software is completely munging 
your post.

On the X25-E I believe there is cache, and it's not backed up.  While I haven't 
tested it, I would expect the X25-E to have the cache turned off while used as 
a ZIL.

The 2nd generation X25-E announced by Intel does have 'safe storage' as they 
term it.  I believe it has more cache, a faster write speed, and is able to 
guarantee that the contents of the cache will always make it to stable storage.

My guess would be that since it's designed for the server market, the cache on 
the X25-E would be irrelevant - the device is going to honor flush requests and 
the ZIL will be stable.  I suspect that the X25-E G2 will ignore flush 
requests, with Intel's engineers confident that the data in the cache is safe.

The NVRAM card we're using is an MM-5425, identical to the one used in the 
famous 'blog on slogs'; I was lucky to get my hands on a pair and some drivers 
:-)

I think the raid controller approach is a nice idea though, and should work 
just as well.

I'd love an 80GB ioDrive to use as our ZIL, I think that's the best hardware 
solution out there right now, but until Fusion-IO release Solaris drivers I'm 
going to have to stick with my 512MB...


Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08

2009-07-29 Thread Bob Friesenhahn

On Wed, 29 Jul 2009, Jorgen Lundman wrote:


So, it is slower than the CF test. This is disappointing. Everyone else seems 
to use the Intel X25-M, which has a write speed of 170MB/s (2nd generation), so 
perhaps that is why it works better for them. It is curious that it is slower 
than the CF card. Perhaps because it shares with so many other SATA devices?


Something to be aware of is that not all SSDs are the same.  In fact, 
some faster SSDs may use a RAM write cache (they all do) and then 
ignore a cache sync request while not including hardware/firmware 
support to ensure that the data is persisted if there is power loss. 
Perhaps your fast CF device does that.  If so, that would be really 
bad for zfs if your server was to spontaneously reboot or lose power. 
This is why you really want a true enterprise-capable SSD device for 
your slog.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/


Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08

2009-07-28 Thread Jorgen Lundman


This thread started over in nfs-discuss, as it appeared to be an NFS 
problem initially, or at the very least an interaction between NFS and the ZIL.


Just summarising the speeds we have found when untarring something. Always 
in a new/empty directory. Only looking at write speed; read is always 
very fast.


The reason we started to look at this was because the 7-year-old NetApp 
being phased out could untar the test file in 11 seconds. The 
x4500/x4540 Suns took 5 minutes.


For all our tests, we used MTOS-4.261-ja.tar.gz, just a random tarball I 
had lying around, but it can be downloaded here if you want the same 
test. (http://www.movabletype.org/downloads/stable/MTOS-4.261-ja.tar.gz)


The command executed, generally, is:

# mkdir .test34 && time gtar --directory=.test34 -zxf /tmp/MTOS-4.261-ja.tar.gz




Solaris 10 1/06 intel client: netapp 6.5.1 FAS960 server: NFSv3
  0m11.114s

Solaris 10 6/06 intel client: x4500 OpenSolaris svn117 server: nfsv4
  5m11.654s

Solaris 10 6/06 intel client: x4500 Solaris 10 10/08 server: nfsv3
  8m55.911s

Solaris 10 6/06 intel client: x4500 Solaris 10 10/08 server: nfsv4
  10m32.629s


Just untarring the tarball on the x4500 itself:

: x4500 OpenSolaris svn117 server
  0m0.478s

: x4500 Solaris 10 10/08 server
  0m1.361s



So ZFS itself is very fast. Next, replace NFS with a different protocol: 
identical setup, just swapping tar for rsync, and nfsd for sshd.


The baseline test, using:
rsync -are ssh /tmp/MTOS-4.261-ja /export/x4500/testXX


Solaris 10 6/06 intel client: x4500 OpenSolaris svn117 : rsync on nfsv4
  3m44.857s

Solaris 10 6/06 intel client: x4500 OpenSolaris svn117 : rsync+ssh
  0m1.387s

So, get rid of nfsd and it goes from 3 minutes to 1 second!

Let's share it with SMB, and mount it:


OS X 10.5.6 intel client: x4500 OpenSolaris svn117 : smb+untar
  0m24.480s


Neat, even SMB can beat NFS with default settings.

This would then indicate to me that nfsd is broken somehow, but then we 
try again after only disabling the ZIL.



Solaris 10 6/06 : x4500 OpenSolaris svn117 DISABLE ZIL: nfsv4
  0m8.453s
  0m8.284s
  0m8.264s

Nice, so this is theoretically the fastest NFS speed we can reach? We 
run postfix+dovecot for mail, which would probably be safe without the 
ZIL. The other type of load is FTP/WWW/CGI, which has more active 
writes/updates; probably not as good. Comments?
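
For the record, the ZIL-disable test above uses the zil_disable tunable on 
builds of this vintage; strictly a test knob, not something to leave on:

  # echo "set zfs:zil_disable = 1" >> /etc/system   # takes effect after a reboot
  # echo zil_disable/W0t1 | mdb -kw                 # or live; remount the dataset afterwards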



Enable the ZIL, but disable the ZFS cache flush (just as a test; I have been 
told disabling the cache flush is far more dangerous).



Solaris 10 6/06 : x4500 OpenSolaris svn117 DISABLE zfscacheflush: nfsv4
  0m45.139s

Interesting. Anyway, enable ZIL and zfscacheflush again, and learn a 
whole lot about slog.
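
The cache-flush test corresponds to the zfs_nocacheflush tunable (again, a 
test-only knob unless every device in the pool has a protected write cache):

  # echo "set zfs:zfs_nocacheflush = 1" >> /etc/system   # ZFS stops issuing cache flushes after reboot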


First I tried creating a 2G slog on the boot mirror:


Solaris 10 6/06 : x4500 OpenSolaris svn117 slog boot pool: nfsv4

  1m59.970s


Some improvement. For a lark, I created a 2GB file in /tmp/ and changed 
the slog to that. (I know, having the slog in volatile RAM is pretty 
much the same as disabling the ZIL, but it should give me the theoretical 
maximum speed with the ZIL enabled, right?)
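
For reference, those slog swaps amount to something like the following (slice 
and file names are made up, and a /tmp-backed slog is for testing only):

  # zpool add datapool log c0t0d0s4             # 2G slice on the boot mirror
  # mkfile 2g /tmp/junk
  # zpool replace datapool c0t0d0s4 /tmp/junk   # move the slog onto the RAM-backed file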



Solaris 10 6/06 : x4500 OpenSolaris svn117 slog /tmp/junk: nfsv4
  0m8.916s


Nice! Same speed as with the ZIL disabled. Since this is an X4540, we thought 
we would test with a CF card attached. Alas, the 600X (92MB/s) cards are not 
out until next month, rats! So we bought a 300X (40MB/s) card.



Solaris 10 6/06 : x4500 OpenSolaris svn117 slog 300X CFFlash: nfsv4
  0m26.566s


Not too bad really. But you have to reboot to see a CF card, fiddle with 
the BIOS for the boot order, etc. Just not an easy addition on a live system. 
An SSD that presents as a SATA disk can be hot-swapped.



Also, I learned an interesting lesson about rebooting with slog at 
/tmp/junk.



I am hoping to pick up an SSD SATA device today and see what speeds we 
get out of that.


The rsync (1s) vs NFS (8s) gap I can accept as overhead of a much more 
complicated protocol, but why would it take 3 minutes to write the same 
data on the same pool with rsync (1s) vs NFS (3m)? The ZIL was on, the slog 
at its default, but both were writing the same way. Does nfsd add FD_SYNC to 
every close regardless of whether the application did or not?

This I have not yet wrapped my head around.

For example, I know rsync and tar do not use fdsync (but dovecot does) 
on their close(), but does NFS make it fdsync anyway?
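
One way to look at this while the untar runs is to count sync activity on the 
server; a DTrace sketch (NFS commits happen inside the kernel, so the 
zil_commit count is the interesting one):

  # dtrace -n 'syscall::fdsync:entry { @[execname] = count(); }' \
           -n 'fbt:zfs:zil_commit:entry { @["zil_commit"] = count(); }'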



Sorry for the giant email.


--
Jorgen Lundman       | lund...@lundman.net
Unix Administrator   | +81 (0)3-5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500         (cell)
Japan                | +81 (0)3-3375-1767          (home)


Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08

2009-07-28 Thread Bob Friesenhahn

On Wed, 29 Jul 2009, Jorgen Lundman wrote:


For example, I know rsync and tar do not use fdsync (but dovecot does) on 
their close(), but does NFS make it fdsync anyway?


NFS is required to do synchronous writes.  This is what allows NFS 
clients to recover seamlessly if the server spontaneously reboots. 
If the NFS client supports it, it can send substantial data (multiple 
writes) to the server, and then commit it all via an NFS commit. 
Note that this requires more work by the client since the NFS client 
is required to replay the uncommitted writes if the server goes away.
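
The unstable-write-plus-commit pattern is visible from the server side with 
nfsstat, for example:

  # nfsstat -s    # the Version 3 "write" and "commit" counters show the ratio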



Sorry for the giant email.


No, thank you very much for the interesting measurements and data.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/