Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2011-01-02 Thread Frank Lahm
2010/12/24 Edward Ned Harvey
opensolarisisdeadlongliveopensola...@nedharvey.com:
 From: Frank Lahm [mailto:frankl...@googlemail.com]

 With Netatalk for AFP he _is_ running a database: any AFP server needs
 to maintain a consistent mapping between _not reused_ catalog node ids
 (CNIDs) and filesystem objects. Luckily for Apple, HFS[+] and their
 Cocoa/Carbon APIs provide such a mapping, making direct use of HFS+
 CNIDs. Unfortunately, most UNIX filesystems reuse inodes and have no API
 for mapping inodes to filesystem objects. Therefore all AFP servers
 running on non-Apple OSen maintain a database providing this mapping;
 in the case of Netatalk it's `cnid_dbd` using a BerkeleyDB database.

 Don't all of those concerns disappear in the event of a reboot?

 If you stop AFP, you could completely obliterate the BDB database, and 
 restart AFP, and functionally continue from where you left off.  Right?

No. Apple's APIs provide semantics by which you can reference
filesystem objects by their parent directory CNID + object name. More
important in this context: these references can be stored, retrieved
and reused; e.g. Finder Aliases, Adobe InDesign and many more
applications use these semantics to store references to files.
If you nuke the CNID database, upon re-enumeration of the volumes all
filesystem objects are likely to be assigned new and different CNIDs,
thus all references are broken.
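
To illustrate why the mapping itself is the persistent state, here is a
minimal sketch in Python (illustrative only, not Netatalk's actual cnid_dbd
schema): the server hands out CNIDs that are never reused, clients store
them, and rebuilding the table silently reassigns them.

    # Minimal sketch of a CNID-style mapping (not the real BerkeleyDB schema).
    import itertools

    class CnidDb:
        def __init__(self):
            self._next = itertools.count(17)   # CNIDs are handed out once, never reused
            self._by_key = {}                  # (parent_cnid, name) -> cnid
            self._by_cnid = {}                 # cnid -> (parent_cnid, name)

        def lookup_or_assign(self, parent_cnid, name):
            key = (parent_cnid, name)
            if key not in self._by_key:
                cnid = next(self._next)
                self._by_key[key] = cnid
                self._by_cnid[cnid] = key
            return self._by_key[key]

        def resolve(self, cnid):
            return self._by_cnid.get(cnid)

    db = CnidDb()
    stored_alias = db.lookup_or_assign(2, "Report.indd")   # a client stores this CNID

    db = CnidDb()                          # "nuke" the database and re-enumerate
    db.lookup_or_assign(2, "Other.txt")    # enumeration order differs after the rebuild
    print(db.resolve(stored_alias))        # now points at the wrong object (or nothing)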

-f
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2011-01-02 Thread Edward Ned Harvey
 From: Frank Lahm [mailto:frankl...@googlemail.com]
 
  Don't all of those concerns disappear in the event of a reboot?
 
  If you stop AFP, you could completely obliterate the BDB database, and
 restart AFP, and functionally continue from where you left off.  Right?
 
 No. Apple's APIs provide semantics by which you can reference
 filesystem objects by their parent directory CNID + object name. More
 important in this context: these references can be stored, retrieved
 and reused; e.g. Finder Aliases, Adobe InDesign and many more
 applications use these semantics to store references to files.
 If you nuke the CNID database, upon re-enumeration of the volumes all
 filesystem objects are likely to be assigned new and different CNIDs,
 thus all references are broken.

Just like...  If you shut down your Apple OSX AFP file server, moved all the 
files to a new, upgraded file server, reassigned the old IP address and DNS name 
to the new server, and enabled AFP file services on the new file server.

How do people handle the broken-links issue when they upgrade their Apple 
server?  If they don't bother doing anything about it, I would conclude it's no 
big deal.  If, instead, there is some process you're supposed to follow when you 
upgrade/replace your Apple AFP fileserver, I wonder whether that process is 
applicable to the present thread of discussion as well.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2011-01-02 Thread Stephan Budach

On 02.01.11 16:52, Edward Ned Harvey wrote:

From: Frank Lahm [mailto:frankl...@googlemail.com]


Don't all of those concerns disappear in the event of a reboot?

If you stop AFP, you could completely obliterate the BDB database, and

restart AFP, and functionally continue from where you left off.  Right?

No. Apple's APIs provide semantics by which you can reference
filesystem objects by their parent directory CNID + object name. More
important in this context: these references can be stored, retrieved
and reused; e.g. Finder Aliases, Adobe InDesign and many more
applications use these semantics to store references to files.
If you nuke the CNID database, upon re-enumeration of the volumes all
filesystem objects are likely to be assigned new and different CNIDs,
thus all references are broken.

Just like...  If you shut down your Apple OSX AFP file server, moved all the 
files to a new, upgraded file server, reassigned the old IP address and DNS name 
to the new server, and enabled AFP file services on the new file server.

How do people handle the broken-links issue when they upgrade their Apple 
server?  If they don't bother doing anything about it, I would conclude it's no 
big deal.  If, instead, there is some process you're supposed to follow when you 
upgrade/replace your Apple AFP fileserver, I wonder whether that process is 
applicable to the present thread of discussion as well.

Well… on the Apple platform HFS+ (the Mac's default fs) takes care of 
that, so you'd never have to worry about this issue there. On the 
*nix side of things, when running Netatalk, you'll have to store this 
information in some kind of extra database, which is BDB in this case.


Initially, I only wanted to check what hardware to get for my ZIL, and I 
agree - by now, I have already decided on - and ordered - two Vertex 2 EX 50GB 
SSDs to handle the ZIL for my zpool, since I am already serving 50 AFP 
sharepoints which are accessed by 120 clients. The number of sharepoints 
will eventually rise to 250 and the number of clients to 450, and that 
would cause some real random workload on the zpool and the ZIL, I guess.


The technical discussion about short stroking is nevertheless very 
interesting. ;)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-30 Thread Edward Ned Harvey
 From: Kevin Walker [mailto:indigoskywal...@gmail.com]
 
 You do seem to misunderstand ZIL.

Wrong.


 ZIL is quite simply write cache 

ZIL is not simply a write cache, but it enables certain types of operations to
use the write cache which otherwise would have been ineligible.

The Intent Log is where ZFS immediately writes sync-write requests, so it
can unblock the process which called write().  Once the data has been
committed to nonvolatile ZIL storage, the process can continue processing,
and ZFS can treat the write requests as async writes.  Which means, after
ZFS has written the ZIL, then the data is able to stay a while in the RAM
write buffer along with all the async writes.  Which means ZFS is able to
aggregate and optimize all the writes for best performance.

This means ZIL is highly sensitive to access times.  (seek + latency)
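
In application terms, the wait looks roughly like this (a minimal Python
sketch; O_DSYNC makes each write() block until the data is on stable storage,
which on ZFS is exactly the ZIL commit whose access time is at issue; the path
is just a scratch file):

    # Minimal sketch: a synchronous write blocks until the data is on stable
    # storage, so the caller directly pays the log device's access time.
    import os, time

    fd = os.open("/tmp/zil-demo", os.O_CREAT | os.O_WRONLY | os.O_DSYNC, 0o600)
    t0 = time.time()
    os.write(fd, b"payload")              # returns only after the commit
    print("sync write took %.2f ms" % ((time.time() - t0) * 1e3))
    os.close(fd)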


 using a short stroked rotating drive is
 never going to provide a performance increase that is worth talking about

If you don't add a dedicated log device, then the ZIL utilizes blocks from
the main storage pool, and all sync writes suddenly get higher priority than
all the queued reads and async writes.  If you have a busy storage pool,
your sync writes might see something like 20ms access times (seek + latency)
before they can hit nonvolatile storage, and every time this happens, some
other operation gets delayed.

If you add a spindle drive dedicated log device, then that drive is always
idle except when writing ZIL for sync writes, and also, the head will barely
move over the platter because all the ZIL blocks will be clustered tightly
together.  So the ZIL might require typically 2ms or 3ms access times
(negligible seek or 1ms seek + 2ms latency), which is an order of magnitude
better than before.  Plus the sync writes in this case don't take away
performance from the main pool reads & writes.

If you replace your spindle drive with a SSD, then you get another order of
magnitude smaller access time.  (Tens of thousands of IOPS effectively
compares to well under 1ms access time per OP.)

If you disable your ZIL completely, then you get another order of magnitude
smaller access time.  (Some ns to think about putting the data directly into
RAM write buffer and entirely bypass the ZIL).
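
Putting back-of-the-envelope numbers on that chain (a sketch using the
illustrative latencies above, not measurements; one thread, one outstanding
sync write at a time):

    # Rough single-threaded sync-write rate implied by the commit latency of
    # each option discussed above (illustrative figures, not benchmarks).
    cases = {
        "ZIL on busy pool disks":  20e-3,   # ~20ms seek + latency
        "dedicated spindle log":    2.5e-3, # ~2-3ms, near-zero seek
        "SSD log device":           0.2e-3, # another order of magnitude better
        "ZIL disabled (RAM only)":  1e-6,   # effectively memory speed
    }
    for name, commit in cases.items():
        print("%-25s ~%10.0f sync writes/s per thread" % (name, 1.0 / commit))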


 and more importantly ZIL was designed to be used with a RAM/Solid State
 Disk.

I hope you mean NVRAM or battery-backed RAM of some kind.  Because if you
use volatile RAM for ZIL, then you have disabled ZIL from being able to
function correctly.

The ZFS Best Practices Guide specifically mentions "Better performance might
be possible by using [...], or even a dedicated spindle disk."


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-29 Thread Edward Ned Harvey
 From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us]
 Sent: Tuesday, December 28, 2010 9:23 PM
 
  The question of IOPS here is relevant to conversation because of ZIL
  dedicated log.  If you have advanced short-stroking to get the write latency
  of a log device down to zero, then it can compete against SSD for purposes
  of a log device, but nobody seems to believe such technology currently
  exists, and it certainly couldn't compete against SSD for random reads.
  (ZIL log is the only situation I know of, where write performance of a drive
  matters and read performance does not matter.)
 
 It seems that you may be confused.  For the ZIL the drive's rotational
 latency (based on RPM) is the dominating factor and not the lateral
 head seek time on the media.  In this case, the short-stroking you
 are talking about does not help any.  The ZIL is already effectively
 short-stroking since it writes in order.

Nope.  I'm not confused at all.  I'm making a distinction between short
stroking and advanced short stroking.  Where simple short stroking does
as you said - eliminates the head seek time but is still susceptible to
rotational latency.  As you said, the ZIL already effectively accomplishes
that end result, provided a dedicated spindle disk for log device, but does
not do that if your ZIL is on the pool storage.  And what I'm calling
advanced short stroking are techniques that effectively eliminate, or
minimize both seek & latency, to zero or near-zero.  What I'm calling
advanced short stroking doesn't exist as far as I know, but is
theoretically possible through either special disk hardware or special
drivers.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-29 Thread Kevin Walker
You do seem to misunderstand ZIL.

ZIL is quite simply write cache and using a short stroked rotating drive is
never going to provide a performance increase that is worth talking about
and more importantly ZIL was designed to be used with a RAM/Solid State
Disk.

We use sata2 *HyperDrive5* RAM disks in mirrors and they work well and are
far cheaper than STEC or other enterprise SSDs, and have none of the issues
related to trim...

Highly recommended... ;-)

http://www.hyperossystems.co.uk/

Kevin


On 29 December 2010 13:40, Edward Ned Harvey 
opensolarisisdeadlongliveopensola...@nedharvey.com wrote:

  From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us]
  Sent: Tuesday, December 28, 2010 9:23 PM
 
   The question of IOPS here is relevant to conversation because of ZIL
   dedicated log.  If you have advanced short-stroking to get the write latency
   of a log device down to zero, then it can compete against SSD for purposes
   of a log device, but nobody seems to believe such technology currently
   exists, and it certainly couldn't compete against SSD for random reads.
   (ZIL log is the only situation I know of, where write performance of a drive
   matters and read performance does not matter.)
 
  It seems that you may be confused.  For the ZIL the drive's rotational
  latency (based on RPM) is the dominating factor and not the lateral
  head seek time on the media.  In this case, the short-stroking you
  are talking about does not help any.  The ZIL is already effectively
  short-stroking since it writes in order.

 Nope.  I'm not confused at all.  I'm making a distinction between short
 stroking and advanced short stroking.  Where simple short stroking
 does
 as you said - eliminates the head seek time but is still susceptible to
 rotational latency.  As you said, the ZIL already effectively accomplishes
 that end result, provided a dedicated spindle disk for log device, but does
 not do that if your ZIL is on the pool storage.  And what I'm calling
 advanced short stroking are techniques that effectively eliminate, or
 minimize both seek & latency, to zero or near-zero.  What I'm calling
 advanced short stroking doesn't exist as far as I know, but is
 theoretically possible through either special disk hardware or special
 drivers.



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-29 Thread Jason Warr

HyperDrive5 = ACard ANS9010

I have personally been wanting to try one of these for some time as a 
ZIL device.


On 12/29/2010 06:35 PM, Kevin Walker wrote:

You do seem to misunderstand ZIL.

ZIL is quite simply write cache and using a short stroked rotating 
drive is never going to provide a performance increase that is worth 
talking about and more importantly ZIL was designed to be used with a 
RAM/Solid State Disk.


We use sata2 *HyperDrive5* RAM disks in mirrors and they work well 
and are far cheaper than STEC or other enterprise SSDs, and have none 
of the issues related to trim...


Highly recommended... ;-)

http://www.hyperossystems.co.uk/

Kevin


On 29 December 2010 13:40, Edward Ned Harvey 
opensolarisisdeadlongliveopensola...@nedharvey.com wrote:


 From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us]
 Sent: Tuesday, December 28, 2010 9:23 PM

  The question of IOPS here is relevant to conversation because of ZIL
  dedicated log.  If you have advanced short-stroking to get the write latency
  of a log device down to zero, then it can compete against SSD for purposes
  of a log device, but nobody seems to believe such technology currently
  exists, and it certainly couldn't compete against SSD for random reads.
  (ZIL log is the only situation I know of, where write performance of a drive
  matters and read performance does not matter.)

 It seems that you may be confused.  For the ZIL the drive's rotational
 latency (based on RPM) is the dominating factor and not the lateral
 head seek time on the media.  In this case, the short-stroking you
 are talking about does not help any.  The ZIL is already effectively
 short-stroking since it writes in order.

Nope.  I'm not confused at all.  I'm making a distinction between short
stroking and advanced short stroking.  Where simple short stroking does
as you said - eliminates the head seek time but is still susceptible to
rotational latency.  As you said, the ZIL already effectively accomplishes
that end result, provided a dedicated spindle disk for log device, but does
not do that if your ZIL is on the pool storage.  And what I'm calling
advanced short stroking are techniques that effectively eliminate, or
minimize both seek & latency, to zero or near-zero.  What I'm calling
advanced short stroking doesn't exist as far as I know, but is
theoretically possible through either special disk hardware or special
drivers.





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-29 Thread Fred Liu
I do the same with ACARD…
Works well enough.

Fred

From: zfs-discuss-boun...@opensolaris.org 
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Jason Warr
Sent: Thursday, December 30, 2010 8:56
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

HyperDrive5 = ACard ANS9010

I have personally been wanting to try one of these for some time as a ZIL 
device.

On 12/29/2010 06:35 PM, Kevin Walker wrote:
You do seem to misunderstand ZIL.

ZIL is quite simply write cache and using a short stroked rotating drive is 
never going to provide a performance increase that is worth talking about and 
more importantly ZIL was designed to be used with a RAM/Solid State Disk.

We use sata2 HyperDrive5 RAM disks in mirrors and they work well and are far 
cheaper than STEC or other enterprise SSDs, and have none of the issues related 
to trim...

Highly recommended... ;-)

http://www.hyperossystems.co.uk/

Kevin

On 29 December 2010 13:40, Edward Ned Harvey 
opensolarisisdeadlongliveopensola...@nedharvey.com wrote:
 From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us]
 Sent: Tuesday, December 28, 2010 9:23 PM

  The question of IOPS here is relevant to conversation because of ZIL
  dedicated log.  If you have advanced short-stroking to get the write latency
  of a log device down to zero, then it can compete against SSD for purposes
  of a log device, but nobody seems to believe such technology currently
  exists, and it certainly couldn't compete against SSD for random reads.
  (ZIL log is the only situation I know of, where write performance of a drive
  matters and read performance does not matter.)

 It seems that you may be confused.  For the ZIL the drive's rotational
 latency (based on RPM) is the dominating factor and not the lateral
 head seek time on the media.  In this case, the short-stroking you
 are talking about does not help any.  The ZIL is already effectively
 short-stroking since it writes in order.
Nope.  I'm not confused at all.  I'm making a distinction between short
stroking and advanced short stroking.  Where simple short stroking does
as you said - eliminates the head seek time but is still susceptible to
rotational latency.  As you said, the ZIL already effectively accomplishes
that end result, provided a dedicated spindle disk for log device, but does
not do that if your ZIL is on the pool storage.  And what I'm calling
advanced short stroking are techniques that effectively eliminate, or
minimize both seek & latency, to zero or near-zero.  What I'm calling
advanced short stroking doesn't exist as far as I know, but is
theoretically possible through either special disk hardware or special
drivers.



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-29 Thread Erik Trimble

On 12/29/2010 4:55 PM, Jason Warr wrote:

HyperDrive5 = ACard ANS9010

I have personally been wanting to try one of these for some time as a 
ZIL device.




Yes, but do remember these require a half-height 5.25" drive bay, and 
you really, really should buy the extra CF card for backup.


Also, stay away from the ANS-9010S with LVD SCSI interface. As (I think) 
Bob pointed out a long time ago, parallel SCSI isn't good for a 
high-IOPS interface. It (the LVD interface) will throttle long before 
the drive does...


I've been waiting for them to come out with a 3.5" version, one which I 
can plug directly into a standard 3.5" SAS/SATA hotswap bay...


And, of course, the ANS9010 is limited to the SATA2 interface speed, so 
it is cheaper and lower-performing (but still better than an SSD) than 
the DDRdrive.


--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-29 Thread ja...@warr.net
Had not even noticed the LVD version.

The biggest issue for me is not the form factor but how hard it would be to 
get the client I work for to accept them in the environment, given support issues.

- Reply message -
From: Erik Trimble erik.trim...@oracle.com
Date: Wed, Dec 29, 2010 19:52
Subject: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
To: Jason Warr ja...@warr.net
Cc: zfs-discuss@opensolaris.org


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-28 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Edward Ned Harvey
 
 Ok, what we've hit here is two people using the same word to talk about
 different things.  Apples to oranges, as it were.  Both meanings of IOPS
 are ok, but context is everything.
 
 There are drive random IOPS, which is dependent on latency and seek time,
 and there is also measured random IOPS above the filesystem layer, which
is
 not always related to latency or seek time, as described above.

In any event, the relevant points are:

The question of IOPS here is relevant to conversation because of ZIL
dedicated log.  If you have advanced short-stroking to get the write latency
of a log device down to zero, then it can compete against SSD for purposes
of a log device, but nobody seems to believe such technology currently
exists, and it certainly couldn't compete against SSD for random reads.
(ZIL log is the only situation I know of, where write performance of a drive
matters and read performance does not matter.)

If using ZFS for AFP (and consequently BDB)...  If you disable the ZIL you
will have maximum performance, but maybe you're not comfortable with that
because you're not convinced of stability with ZIL disabled, or for other
reasons.

* If you put your BDB or ZIL on a spindle dedicated device, it will perform
better than having no dedicated device, but the difference might be anything
from 1x to 10x, depending on where your bottlenecks are.  AKA no improvement
is guaranteed, but probably you get at least a little bit.
* If you put your BDB or ZIL on a SSD dedicated log device, it will perform
still better, and again, the difference could be anywhere from 1x to 10x
depending on your bottlenecks.  
* If you disable your ZIL, it will perform still better, and again, the
difference could be anywhere from 1x to 10x.

Realistically, at some point you'll hit a network bottleneck, and you won't
notice the improved performance.  If you're just doing small numbers of
large files, none of the above will probably be noticeable, because in that
case latency is pretty much irrelevant.  But assuming you have at least a
bunch of reasonably small files, IMHO that threshold is at the SSD, because
the latency of the SSD is insignificant compared to the latency of the
network.  But even with short-stroking getting the latency down to 2ms,
that's still significant compared to network latency, so there's probably
still room for improvement over the short-stroking techniques.  At least,
until somebody creates a more advanced short-stroking which gets latency
down to near-zero, if that will ever happen.
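
As a rough sanity check of where that threshold sits (a sketch with an assumed
LAN round trip; both figures are assumptions, not measurements):

    # Per-operation latency a sync-writing client sees is roughly the network
    # round trip plus the log-device commit time (assumed figures).
    network_rtt = 0.3e-3                     # ~0.3ms LAN round trip (assumption)
    devices = {"short-stroked disk": 2e-3, "SSD": 0.1e-3}
    for name, commit in devices.items():
        total = network_rtt + commit
        print("%-20s commit is %3.0f%% of a %.2f ms operation"
              % (name, 100.0 * commit / total, total * 1e3))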

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-28 Thread Bob Friesenhahn

On Tue, 28 Dec 2010, Edward Ned Harvey wrote:


In any event, the relevant points are:

The question of IOPS here is relevant to conversation because of ZIL
dedicated log.  If you have advanced short-stroking to get the write latency
of a log device down to zero, then it can compete against SSD for purposes
of a log device, but nobody seems to believe such technology currently
exists, and it certainly couldn't compete against SSD for random reads.
(ZIL log is the only situation I know of, where write performance of a drive
matters and read performance does not matter.)


It seems that you may be confused.  For the ZIL the drive's rotational 
latency (based on RPM) is the dominating factor and not the lateral 
head seek time on the media.  In this case, the short-stroking you 
are talking about does not help any.  The ZIL is already effectively 
short-stroking since it writes in order.


The (possibly) worthy optimizations I have heard about are writing the 
log data in a different pattern on disk (via a special device driver) 
with the goal that when a drive sync request comes in, the drive is 
quite likely to be able to write immediately.  Since such 
optimizations are quite device- and write-load-dependent, it is not 
worthwhile for a large company to develop the feature (but it would make 
for an interesting project).


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-27 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Nicolas Williams
 
  Actually I'd say that latency has a direct relationship to IOPS because
it's the
 time it takes to perform an IO that determines how many IOs Per Second
 that can be performed.
 
 Assuming you have enough synchronous writes and that you can organize
 them so as to keep the drive at max sustained sequential write
 bandwidth, then IOPS == bandwidth / logical I/O size.  Latency doesn't

Ok, what we've hit here is two people using the same word to talk about
different things.  Apples to oranges, as it were.  Both meanings of IOPS
are ok, but context is everything.  

There are drive random IOPS, which is dependent on latency and seek time,
and there is also measured random IOPS above the filesystem layer, which is
not always related to latency or seek time, as described above.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-27 Thread Richard Elling
On Dec 27, 2010, at 6:06 PM, Edward Ned Harvey wrote:

 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Nicolas Williams
 
 Actually I'd say that latency has a direct relationship to IOPS because
 it's the
 time it takes to perform an IO that determines how many IOs Per Second
 that can be performed.
 
 Assuming you have enough synchronous writes and that you can organize
 them so as to keep the drive at max sustained sequential write
 bandwidth, then IOPS == bandwidth / logical I/O size.  Latency doesn't
 
 Ok, what we've hit here is two people using the same word to talk about
 different things.  Apples to oranges, as it were.  Both meanings of IOPS
 are ok, but context is everything.  
 
 There are drive random IOPS, which is dependent on latency and seek time,
 and there is also measured random IOPS above the filesystem layer, which is
 not always related to latency or seek time, as described above.

The small, random read model can assume no cache hits. Adding caches makes
the model too complicated for simple analysis, and arguably too complicated for
modeling at all. For such systems, empirical measurements are possible, but can
be overly optimistic.  For example, it is relatively trivial to demonstrate 
500,000 
small, random read IOPS at the application using a file system that caches to 
RAM.
Achieving that performance level for the general case is much less common.
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-27 Thread Nicolas Williams
On Mon, Dec 27, 2010 at 09:06:45PM -0500, Edward Ned Harvey wrote:
  From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
  boun...@opensolaris.org] On Behalf Of Nicolas Williams
  
   Actually I'd say that latency has a direct relationship to IOPS because
 it's the
  time it takes to perform an IO that determines how many IOs Per Second
  that can be performed.
  
  Assuming you have enough synchronous writes and that you can organize
  them so as to keep the drive at max sustained sequential write
  bandwidth, then IOPS == bandwidth / logical I/O size.  Latency doesn't
 
 Ok, what we've hit here is two people using the same word to talk about
 different things.  Apples to oranges, as it were.  Both meanings of IOPS
 are ok, but context is everything.  
 
 There are drive random IOPS, which is dependent on latency and seek time,
 and there is also measured random IOPS above the filesystem layer, which is
 not always related to latency or seek time, as described above.

Clearly the application cares about _synchronous_ operations that are
meaningful to it.  In the case of an NFS application that would be
open() with O_CREAT (and particularly O_EXCL), close(), fsync() and so
on.  For a POSIX (but not NFS) application the number of synchronous
operations is smaller.  The rate of asynchronous operations is less
important to the application because those are subject to caching, thus
less predictable.  But to the filesystem the IOPS are not just about
synchronous I/O but about how many distinct I/O operations can be
completed per unit of time.  I tried to keep this clear; sorry for any
confusion.
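
For concreteness, here is a minimal sketch of counting the kind of
synchronous, application-visible operations described above (create + fsync +
close, roughly what an NFS server must commit before replying); the directory
is just a hypothetical scratch path:

    # Minimal sketch: count synchronous create/fsync/close operations per
    # second against a local directory (illustrative, not a real benchmark).
    import os, time

    path = "/tmp/syncop-test"                # hypothetical scratch directory
    os.makedirs(path, exist_ok=True)

    deadline = time.time() + 2.0
    ops = 0
    while time.time() < deadline:
        fd = os.open(os.path.join(path, "f%d" % ops), os.O_CREAT | os.O_WRONLY, 0o600)
        os.write(fd, b"x" * 2048)
        os.fsync(fd)                         # blocks until the data is stable
        os.close(fd)
        ops += 1
    print("%.0f synchronous ops/s" % (ops / 2.0))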

Nico
-- 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-26 Thread Nicolas Williams
On Sat, Dec 25, 2010 at 08:37:42PM -0500, Ross Walker wrote:
 On Dec 24, 2010, at 1:21 PM, Richard Elling richard.ell...@gmail.com wrote:
 
  Latency is what matters most.  While there is a loose relationship between 
  IOPS
  and latency, you really want low latency.  For 15krpm drives, the average 
  latency
  is 2ms for zero seeks.  A decent SSD will beat that by an order of 
  magnitude.
 
 Actually I'd say that latency has a direct relationship to IOPS because it's 
 the time it takes to perform an IO that determines how many IOs Per Second 
 that can be performed.

Assuming you have enough synchronous writes and that you can organize
them so as to keep the drive at max sustained sequential write
bandwidth, then IOPS == bandwidth / logical I/O size.  Latency doesn't
enter into that formula.  Latency does remain though, and will be
noticeable to apps doing synchronous operations.

Thus with, say, 100MB/s sustained sequential write bandwidth and, say, 2KB
avg ZIL entries, you'd get 51200/s logical sync write operations.  The
latency for each such operation would still be 2ms (or whatever it is
for the given disk).  Since you'd likely have to batch many ZIL writes
you'd end up making the latency for some ops longer than 2ms and others
shorter, but if you can keep the drive at max sustained seq write
bandwidth then the average latency will be 2ms.
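
Spelled out with the same assumed figures:

    # IOPS == bandwidth / logical I/O size, with the figures used above.
    bandwidth = 100 * 1024 * 1024    # 100MB/s sustained sequential write
    entry = 2 * 1024                 # 2KB average ZIL entry
    print("%d logical sync writes/s" % (bandwidth // entry))   # 51200/s
    # ...while the per-operation latency stays ~2ms for the disk in question.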

SSDs are clearly a better choice.

BTW, a parallelized tar would greatly help reduce the impact of high
latency open()/close() (over NFS) operations...

Nico
-- 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-25 Thread Ross Walker
On Dec 24, 2010, at 1:21 PM, Richard Elling richard.ell...@gmail.com wrote:

 Latency is what matters most.  While there is a loose relationship between IOPS
 and latency, you really want low latency.  For 15krpm drives, the average latency
 is 2ms for zero seeks.  A decent SSD will beat that by an order of magnitude.

Actually I'd say that latency has a direct relationship to IOPS because it's 
the time it takes to perform an IO that determines how many IOs Per Second that 
can be performed.

Ever notice how storage vendors list their max IOPS for 512-byte sequential IO 
workloads and sustained throughput for 1MB+ sequential IO workloads?  Only SSD 
makers list their random IOPS workload numbers and their 4K IO workload numbers.

-Ross

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-25 Thread Richard Elling
On Dec 25, 2010, at 5:37 PM, Ross Walker wrote:
 On Dec 24, 2010, at 1:21 PM, Richard Elling richard.ell...@gmail.com wrote:
 
 Latency is what matters most.  While there is a loose relationship between IOPS
 and latency, you really want low latency.  For 15krpm drives, the average latency
 is 2ms for zero seeks.  A decent SSD will beat that by an order of magnitude.
 
 Actually I'd say that latency has a direct relationship to IOPS because it's 
 the time it takes to perform an IO that determines how many IOs Per Second 
 that can be performed.

That is only true when there is one queue and one server (in the queueing
context).  This is not the case where there are multiple concurrent I/Os that
can be completed out of order by multiple servers working in parallel (e.g.
disk subsystems).  For an extreme example, the Sun Storage F5100 Array
specifications show 1.6 million random read IOPS @ 4KB.  But instead of an
average latency of 625 nanoseconds, it shows an average latency of 0.378
milliseconds.  The analogy we've used in parallel computing for many years is
"nine women cannot make a baby in one month".
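
The F5100 figures are just Little's law at work (outstanding I/Os = throughput
x latency); this is only the arithmetic on the numbers quoted above, not new
data:

    # Little's law: outstanding I/Os = IOPS * average latency.
    iops = 1.6e6          # random read IOPS @ 4KB, from the spec quoted above
    latency = 0.378e-3    # average latency, in seconds
    print("~%.0f I/Os in flight at once" % (iops * latency))                  # ~605
    print("a single-queue device would need %.0f ns per I/O" % (1e9 / iops))  # 625 ns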

 Ever notice how storage vendors list their max IOPS for 512-byte sequential IO 
 workloads and sustained throughput for 1MB+ sequential IO workloads?  Only SSD 
 makers list their random IOPS workload numbers and their 4K IO workload 
 numbers.

Vendors will present the number that makes them look best, often without
regard for practical application... the curse of marketing :-)
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-24 Thread Edward Ned Harvey
 From: Frank Lahm [mailto:frankl...@googlemail.com]
 
 With Netatalk for AFP he _is_ running a database: any AFP server needs
 to maintain a consistent mapping between _not reused_ catalog node ids
 (CNIDs) and filesystem objects. Luckily for Apple, HFS[+] and their
 Cocoa/Carbon APIs provide such a mapping, making direct use of HFS+
 CNIDs. Unfortunately, most UNIX filesystems reuse inodes and have no API
 for mapping inodes to filesystem objects. Therefore all AFP servers
 running on non-Apple OSen maintain a database providing this mapping;
 in the case of Netatalk it's `cnid_dbd` using a BerkeleyDB database.

Don't all of those concerns disappear in the event of a reboot?

If you stop AFP, you could completely obliterate the BDB database, and restart 
AFP, and functionally continue from where you left off.  Right?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-24 Thread Richard Elling
On Dec 23, 2010, at 2:25 AM, Stephan Budach wrote:
 as I have learned from the discussion about which SSD to use as ZIL drives, I 
 stumbled across this article, which discusses short stroking for increasing 
 IOPS on SAS and SATA drives:
 
 http://www.tomshardware.com/reviews/short-stroking-hdd,2157.html
 
 Now, I am wondering if using a mirror of such 15k SAS drives would be a 
 good-enough fit for a ZIL on a zpool that is mainly used for file services 
 via AFP and SMB.

SMB does not create much of a synchronous load.  I haven't explored AFP 
directly,
but if they do use Berkeley DB, then we do have a lot of experience tuning ZFS 
for
Berkeley DB performance.

 I'd particularly like to know if someone has already used such a solution and 
 how it has worked out.

Latency is what matters most.  While there is a loose relationship between IOPS
and latency, you really want low latency.  For 15krpm drives, the average latency
is 2ms for zero seeks.  A decent SSD will beat that by an order of magnitude.
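
That 2ms figure is simply half a revolution at 15,000 rpm; average rotational
latency = 60 / (2 x RPM):

    # Average rotational latency: on average, half a revolution passes before
    # the target sector comes back under the head.
    for rpm in (7200, 10000, 15000):
        print("%5d rpm: %.2f ms average rotational latency"
              % (rpm, 60.0 / (2 * rpm) * 1e3))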
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-24 Thread Phil Harman

On 24/12/2010 18:21, Richard Elling wrote:
Latency is what matters most.  While there is a loose relationship between IOPS
and latency, you really want low latency.  For 15krpm drives, the average latency
is 2ms for zero seeks.  A decent SSD will beat that by an order of magnitude.


And the closer you get to the CPU, the lower the latency. For example, 
the DDRdrive X1 is yet another order of magnitude faster because it sits 
directly on the PCI bus, without the overhead of SAS protocol.


Yet the humble old 15K drive with 2ms sequential latency is still an 
order of magnitude faster than a busy drive delivering 20ms latencies 
under a random workload.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-23 Thread Stephan Budach

Hi,

as I have learned from the discussion about which SSD to use as ZIL 
drives, I stumbled across this article, which discusses short stroking 
for increasing IOPS on SAS and SATA drives:


http://www.tomshardware.com/reviews/short-stroking-hdd,2157.html

Now, I am wondering if using a mirror of such 15k SAS drives would be a 
good-enough fit for a ZIL on a zpool that is mainly used for file 
services via AFP and SMB.
I'd particularly like to know if someone has already used such a 
solution and how it has worked out.


Cheers,
budy


--
Stephan Budach
Jung von Matt/it-services GmbH
Glashüttenstraße 79
20357 Hamburg

Tel: +49 40-4321-1353
Fax: +49 40-4321-1114
E-Mail: stephan.bud...@jvm.de
Internet: http://www.jvm.com

Geschäftsführer: Ulrich Pallas, Frank Wilhelm
AG HH HRB 98380

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-23 Thread Phil Harman
Great question. In good enough computing, beauty is in the eye of the 
beholder. My home NAS appliance uses IDE and SATA drives without a dedicated ZIL

http://dtrace.org/blogs/ahl/2010/11/15/zil-analysis-from-chris-george/


"if HDDs and commodity SSDs continue to be target ZIL devices, ZFS could and 
should do more to ensure that writes are sequential."


On 23 Dec 2010, at 10:25, Stephan Budach stephan.bud...@jvm.de wrote:

 Hi,
 
 as I have learned from the discussion about which SSD to use as ZIL drives, I 
 stumbled across this article, which discusses short stroking for increasing 
 IOPS on SAS and SATA drives:
 
 http://www.tomshardware.com/reviews/short-stroking-hdd,2157.html
 
 Now, I am wondering if using a mirror of such 15k SAS drives would be a 
 good-enough fit for a ZIL on a zpool that is mainly used for file services 
 via AFP and SMB.
 I'd particularly like to know if someone has already used such a solution and 
 how it has worked out.
 
 Cheers,
 budy
 
 
  -- 
 Stephan Budach
 Jung von Matt/it-services GmbH
 Glashüttenstraße 79
 20357 Hamburg
 
 Tel: +49 40-4321-1353
 Fax: +49 40-4321-1114
 E-Mail: stephan.bud...@jvm.de
 Internet: http://www.jvm.com
 
 Geschäftsführer: Ulrich Pallas, Frank Wilhelm
 AG HH HRB 98380
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-23 Thread Phil Harman
Sent from my iPhone (which has a lousy user interface which makes it all too 
easy for a clumsy oaf like me to touch Send before I'm done)...

On 23 Dec 2010, at 11:07, Phil Harman phil.har...@gmail.com wrote:

 Great question. In good enough computing, beauty is in the eye of the 
 beholder. My home NAS appliance uses mirrored IDE and SATA drives without a 
 dedicated ZIL

device. And for my home SMB and NFS, that's good enough.

I'm sure that even a 7200rpm SATA ZIL would improve things in my case.

The random I/O requirement for the ZIL is discussed by Adam (and Chris) here ...

 http://dtrace.org/blogs/ahl/2010/11/15/zil-analysis-from-chris-george/

What I find most encouraging is this statement:

 "if HDDs and commodity SSDs continue to be target ZIL devices, ZFS could and 
 should do more to ensure that writes are sequential."

It's not broken, but it is suboptimal, and fixable (apparently) ;)

 On 23 Dec 2010, at 10:25, Stephan Budach stephan.bud...@jvm.de wrote:
 
 Hi,
 
 as I have learned from the discussion about which SSD to use as ZIL drives, 
 I stumbled across this article, which discusses short stroking for increasing 
 IOPS on SAS and SATA drives:
 
 http://www.tomshardware.com/reviews/short-stroking-hdd,2157.html
 
 Now, I am wondering if using a mirror of such 15k SAS drives would be a 
 good-enough fit for a ZIL on a zpool that is mainly used for file services 
 via AFP and SMB.
  I'd particularly like to know if someone has already used such a solution 
  and how it has worked out.
 
 Cheers,
 budy
 
 
  -- 
 Stephan Budach
 Jung von Matt/it-services GmbH
 Glashüttenstraße 79
 20357 Hamburg
 
 Tel: +49 40-4321-1353
 Fax: +49 40-4321-1114
 E-Mail: stephan.bud...@jvm.de
 Internet: http://www.jvm.com
 
 Geschäftsführer: Ulrich Pallas, Frank Wilhelm
 AG HH HRB 98380
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-23 Thread Stephan Budach

On 23.12.10 12:18, Phil Harman wrote:
Sent from my iPhone (which had a lousy user interface which makes it 
all too easy for a clumsy oaf like me to touch Send before I'm done)...


On 23 Dec 2010, at 11:07, Phil Harman phil.har...@gmail.com wrote:


Great question. In good enough computing, beauty is in the eye of 
the beholder. My home NAS appliance uses mirrored IDE and SATA drives 
without a dedicated ZIL


device. And for my home SMB and NFS, that's good enough.

I'm sure that even a 7200rpm SATA ZIL would improve things in my case.

The random I/O requirement for the ZIL is discussed by Adam (and 
Chris) here ...



http://dtrace.org/blogs/ahl/2010/11/15/zil-analysis-from-chris-george/


What I find most encouraging is this statement:

"if HDDs and commodity SSDs continue to be target ZIL devices, ZFS 
could and should do more to ensure that writes are sequential."


It's not broken, but it is suboptimal, and fixable (apparently) ;)
Yeah - I read through Christopher's article already and it clearly shows 
the shortcomings of current flash SSDs as ZIL devices. On the other 
hand, if you'd be using a DDRdrive as a ZIL device, you'd pretty much lock 
this zpool to that particular host, since you can't easily move the 
zpool onto another host without moving the DDRdrive as well or without 
detaching the ZIL device(s) from the zpool, which I find a little bit odd.


I am not actually running in a SOHO scenario with my ZFS file server, 
since it has to serve up to 200 users on up to 200 zfs volumes in one 
zpool, but the actual data traffic is not that high either. The 
traffic consists more of small peaks when someone writes back to a file.


Cheers,
budy
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-23 Thread Phil Harman
On 23 Dec 2010, at 11:53, Stephan Budach stephan.bud...@jvm.de wrote:

 Am 23.12.10 12:18, schrieb Phil Harman:
 
 Sent from my iPhone (which had a lousy user interface which makes it all too 
 easy for a clumsy oaf like me to touch Send before I'm done)...
 
 On 23 Dec 2010, at 11:07, Phil Harman phil.har...@gmail.com wrote:
 
 Great question. In good enough computing, beauty is in the eye of the 
 beholder. My home NAS appliance uses mirrored IDE and SATA drives without a 
 dedicated ZIL
 
 device. And for my home SMB and NFS, that's good enough.
 
 I'm sure that even a 7200rpm SATA ZIL would improve things in my case.
 
 The random I/O requirement for the ZIL is discussed by Adam (and Chris) here 
 ...
 
 http://dtrace.org/blogs/ahl/2010/11/15/zil-analysis-from-chris-george/
 
 What I find most encouraging is this statement:
 
 if HDDs and commodity SSDs continue to be target ZIL devices, ZFS could 
 and should do more to ensure that writes are sequential.
 
 It's not broken, but it is suboptimal, and fixable (apparently) ;)
 Yeah - I read through Christopher's article already and it clearly shows the 
 shortcomings of current flash SSDs as ZIL devices. On the other hand, if 
 you'd be using a DDRdrive as a ZIL device, you'd pretty much lock this zpool to 
 that particular host, since you can't easily move the zpool onto another 
 host without moving the DDRdrive as well or without detaching the ZIL 
 device(s) from the zpool, which I find a little bit odd.
 
 I am not actually running in a SOHO scenario with my ZFS file server, since 
 it has to serve up to 200 users on up to 200 zfs volumes in one zpool, but 
 the actual data traffic is also not that high either. The traffic is more of 
 small peaks when someone writes back to a file.
 
 Cheers,
 budy

Well, your proposed config will improve what each user sees during their own 
private burst, and short stroking can only improve things in the worst case 
scenario (although it may not be measurable). So why not give it a spin and 
report back to the list in the new year?

All the best,
Phil___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-23 Thread Stephan Budach

On 23.12.10 13:09, Phil Harman wrote:
On 23 Dec 2010, at 11:53, Stephan Budach stephan.bud...@jvm.de wrote:



Am 23.12.10 12:18, schrieb Phil Harman:
Sent from my iPhone (which had a lousy user interface which makes it 
all too easy for a clumsy oaf like me to touch Send before I'm 
done)...


On 23 Dec 2010, at 11:07, Phil Harman phil.har...@gmail.com wrote:


Great question. In good enough computing, beauty is in the eye of 
the beholder. My home NAS appliance uses mirrored IDE and SATA 
drives without a dedicated ZIL


device. And for my home SMB and NFS, that's good enough.

I'm sure that even a 7200rpm SATA ZIL would improve things in my case.

The random I/O requirement for the ZIL is discussed by Adam (and 
Chris) here ...



http://dtrace.org/blogs/ahl/2010/11/15/zil-analysis-from-chris-george/


What I find most encouraging is this statement:

if HDDs and commodity SSDs continue to be target ZIL devices, ZFS 
could and should do more to ensure that writes are sequential.


It's not broken, but it is suboptimal, and fixable (apparently) ;)
Yeah - I read through Christopher's article already and it clearly 
shows the shortcomings of current flash SSDs as ZIL devices. On the 
other hand, if you'd be using a DDRdrive as a ZIL device, you'd 
pretty much lock this zpool to that particular host, since you can't 
easily move the zpool onto another host without moving the DDRdrive 
as well or without detaching the ZIL device(s) from the zpool, which 
I find a little bit odd.


I am not actually running in a SOHO scenario with my ZFS file server, 
since it has to serve up to 200 users on up to 200 zfs volumes in one 
zpool, but the actual data traffic is also not that high either. The 
traffic is more of small peaks when someone writes back to a file.


Cheers,
budy


Well, your proposed config will improve what each user sees during 
their own private burst, and short stroking can only improve things in 
the worst case scenario (although it may not be measurable). So why 
not give it a spin and report back to the list in the new year?




Ha ha - if no one else has some more input on this, I will definitely 
give it a try in January.


Cheers,
budy
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-23 Thread Eric D. Mudama

On Thu, Dec 23 at 11:25, Stephan Budach wrote:

  Hi,

  as I have learned from the discussion about which SSD to use as ZIL
  drives, I stumbled across this article, which discusses short stroking for
  increasing IOPS on SAS and SATA drives:

  [1]http://www.tomshardware.com/reviews/short-stroking-hdd,2157.html

  Now, I am wondering if using a mirror of such 15k SAS drives would be a
  good-enough fit for a ZIL on a zpool that is mainly used for file services
  via AFP and SMB.
  I'd particularly like to know if someone has already used such a solution
   and how it has worked out.


Haven't personally used it, but the worst case steady-state IOPS of
the Vertex2 EX, from the DDRDrive presentation, is 6k IOPS assuming a
full-pack random workload.

To achieve that through SAS disks in the same workload, you'll
probably spend significantly more money and it will consume a LOT more
space and power.

According to that Tom's article, a typical 15k SAS enterprise drive is
in the 600 IOPS ballpark when short-stroked and consumes about 15W
active.  Thus you're going to need ten of these devices to equal the
degraded steady-state IOPS of an SSD.  I just don't think the math
works out.  At that point, you're probably better off not having a
dedicated ZIL, instead of burning 10 slots and 150W.
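
The arithmetic behind that conclusion, using the same ballpark figures:

    # How many short-stroked 15k SAS drives match one SSD's worst-case IOPS?
    ssd_iops = 6000     # Vertex2 EX worst-case steady state (figure above)
    sas_iops = 600      # short-stroked 15k SAS ballpark (Tom's Hardware figure)
    sas_watts = 15      # active power per drive
    drives = ssd_iops / sas_iops
    print("%.0f drives, ~%.0f W, %.0f slots" % (drives, drives * sas_watts, drives))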

--eric

--
Eric D. Mudama
edmud...@mail.bounceswoosh.org

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-23 Thread Stephan Budach

Am 23.12.10 19:05, schrieb Eric D. Mudama:

On Thu, Dec 23 at 11:25, Stephan Budach wrote:

  Hi,

  as I have learned from the discussion about which SSD to use as ZIL
  drives, I stumbled across this article, which discusses short stroking for
  increasing IOPS on SAS and SATA drives:

  [1]http://www.tomshardware.com/reviews/short-stroking-hdd,2157.html

  Now, I am wondering if using a mirror of such 15k SAS drives would be a
  good-enough fit for a ZIL on a zpool that is mainly used for file services
  via AFP and SMB.
  I'd particularly like to know if someone has already used such a solution
  and how it has worked out.


Haven't personally used it, but the worst case steady-state IOPS of
the Vertex2 EX, from the DDRDrive presentation, is 6k IOPS assuming a
full-pack random workload.

To achieve that through SAS disks in the same workload, you'll
probably spend significantly more money and it will consume a LOT more
space and power.

According to that Tom's article, a typical 15k SAS enterprise drive is
in the 600 IOPS ballpark when short-stroked and consumes about 15W
active.  Thus you're going to need ten of these devices, to equal the
degraded steady-state IOPS of an SSD.  I just don't think the math
works out.  At that point, you're probably better-off not having a
dedicated ZIL, instead of burning 10 slots and 150W.
Good - that was actually the information I have been missing. So, I will 
go with the Vertex2 EX then and save myself the hassle of short 
stroking entirely.


Thanks, and merry Christmas to all on this list.

Cheers,
budy

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-23 Thread Nicolas Williams
On Thu, Dec 23, 2010 at 11:25:43AM +0100, Stephan Budach wrote:
 as I have learned from the discussion about which SSD to use as ZIL
 drives, I stumbled across this article, that discusses short
 stroking for increasing IOPs on SAS and SATA drives:

There was a thread on this a while back.  I forget when or the subject.
But yes, you could even use 7200 rpm drives to make a fast ZIL device.
The trick is the on-disk format, and the pseudo-device driver that you
would have to layer on top of the actual device(s) to get such
performance.  The key is that sustained sequential I/O rates for disks
can be quite large, so if you organize the disk in a log form and use
the outer tracks only, then you can pretend to have awesome write
IOPS for a disk (but NOT read IOPS).

But it's not necessarily as cheap as you might think.  You'd be making
very inefficient use of an expensive disk (in the case of a SAS 15k rpm
disk), or disks, and if plural then you are also using more ports
(oops).  Disks used this way probably also consume more power than SSDs
(OK, this part of my analysis is very iffy), and you still need to do
something about ensuring syncs to disk on power failure (such as just
disabling the cache on the disk, but this would lower performance,
increasing the cost).  When you factor all the costs in I suspect you'll
find that SSDs are priced reasonably well.  That's not to say that one
could not put together a disk-based log device that could eat SSDs'
lunch, but SSD prices would then just come down to match that -- and you
can expect SSD prices to come down anyways, as with any new
technologies.

I don't mean to discourage you, just to point out that there's plenty of
work to do to make short-stroked disks as ZILs a workable reality,
while the economics of doing that work versus waiting for SSD prices to
come down don't seem appealing.  Caveat emptor: my analysis is
off-the-cuff; I could be wrong.

Nico
-- 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-23 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Stephan Budach
 
 Now, I am wondering if using a mirror of such 15k SAS drives would be a
 good-enough fit for a ZIL on a zpool that is mainly used for file services
via
 AFP and SMB.

For supporting AFP and SMB, most likely, you would be perfectly happy simply
disabling the ZIL.  You will get maximum performance... Even higher than the
world's fastest SSD or DDRDrive or any other type of storage device for
dedicated log.  To determine if this is ok for you, be aware of the argument
*against* disabling the ZIL:

In the event of an ungraceful crash, with ZIL enabled, you lose up to 30 sec
of async data, but you do not lose any sync data.

In the event of an ungraceful crash, with ZIL disabled, you lose up to 30
sec of async and sync data.

In neither case do you have data corruption, or a corrupt filesystem.  The
only question is about 30 seconds of sync data.  You must protect this type
of data, if you're running a database, an iscsi target for virtual hosts,
and for some other types of data services...  But if you're doing just AFP
and SMB, it's pretty likely you don't need to worry about it.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL

2010-12-23 Thread Frank Lahm
2010/12/24 Edward Ned Harvey
opensolarisisdeadlongliveopensola...@nedharvey.com:
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Stephan Budach

 Now, I am wondering if using a mirror of such 15k SAS drives would be a
 good-enough fit for a ZIL on a zpool that is mainly used for file services
 via
 AFP and SMB.

 For supporting AFP and SMB, most likely, you would be perfectly happy simply
 disabling the ZIL.  You will get maximum performance... Even higher than the
 world's fastest SSD or DDRDrive or any other type of storage device for
 dedicated log.  To determine if this is ok for you, be aware of the argument
 *against* disabling the ZIL:

 In the event of an ungraceful crash, with ZIL enabled, you lose up to 30 sec
 of async data, but you do not lose any sync data.

 In the event of an ungraceful crash, with ZIL disabled, you lose up to 30
 sec of async and sync data.

 In neither case do you have data corruption, or a corrupt filesystem.  The
 only question is about 30 seconds of sync data.  You must protect this type
 of data, if you're running a database, ...

With Netatalk for AFP he _is_ running a database: any AFP server needs
to maintain a consistent mapping between _not reused_ catalog node ids
(CNIDs) and filesystem objects. Luckily for Apple, HFS[+] and their
Cocoa/Carbon APIs provide such a mapping, making direct use of HFS+
CNIDs. Unfortunately, most UNIX filesystems reuse inodes and have no API
for mapping inodes to filesystem objects. Therefore all AFP servers
running on non-Apple OSen maintain a database providing this mapping;
in the case of Netatalk it's `cnid_dbd` using a BerkeleyDB database.

-f
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss