Re: [zfs-discuss] surprisingly poor performance

Richard Elling Sun, 05 Jul 2009 18:21:50 -0700

Ross Walker wrote:

On Jul 5, 2009, at 7:47 PM, Richard Elling <richard.ell...@gmail.com> wrote:
Ross Walker wrote:
On Jul 5, 2009, at 6:06 AM, James Lever <j...@jamver.id.au> wrote:
On 04/07/2009, at 3:08 AM, Bob Friesenhahn wrote:
It seems like you may have selected the wrong SSD product to use. There seems to be a huge variation in performance (and cost) with so-called "enterprise" SSDs. SSDs with capacitor-backed write caches seem to be fastest.
Do you have any methods to "correctly" measure the performance of an SSD for the purpose of a slog and any information on others (other than anecdotal evidence)?
There are two types of SSD drives on the market, the fast write SLC (single level cell) and the slow write MLC (multi level cell). MLC is usually used in laptops as SLC drives over 16GB usually go for $1000+ which isn't cost effective in a laptop. MLC is good for read caching though and most use it for L2ARC.
Please don't classify them as MLC vs SLC or you'll find yourself totally
confused by the modern MLC designs which use SLC as a cache.  Be
happy with specs: random write iops: slow or fast.
Thanks for the info. SSD is still very much a moving target.
I worry about SSD drives' long-term reliability. If I mirror two of the same drives, what do you think the probability of a double failure will be in 3, 4, 5 years?


Assuming there are no common cause faults (e.g., firmware), you should
expect an MTBF of 2-4M hours.  But I can't answer the question without
knowing more info.  It seems to me that you are really asking for the
MTTDL (mean time to data loss), which is a representation of the
probability of data loss.  I describe these models here:
http://blogs.sun.com/relling/entry/a_story_of_two_mttdl

Since the vendors do not report unrecoverable error rates (UER), which makes sense for
flash devices, the MTTDL[1] model applies.  You can do the math
yourself, once you figure out what your MTTR might be.  For enterprise
systems, we usually default to 8 hour response, but for home users you
might plan on a few days, so you can take a vacation every once in a
while.  For 48 hours MTTR:
2M hours MTBF -> MTTDL[1] = 4,756,469 years
4M hours MTBF -> MTTDL[1] = 19,025,875 years

Most folks find it more intuitive to look at probability per year in the
form of a percent, so
2M hours MTBF -> Annual DL rate = 0.000021%
4M hours MTBF -> Annual DL rate = 0.000005%
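
For anyone who wants to redo the arithmetic with their own MTTR, here is a
minimal sketch. It assumes the simple two-way mirror form of the MTTDL[1]
model, MTBF^2 / (N * (N-1) * MTTR), and 8760 hours per year; that formula is
my reading of the model, and the function names are made up for illustration.
Under those assumptions it reproduces the figures above.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, assumed for the hours-to-years conversion


def mttdl1_years(mtbf_hours, mttr_hours, n_disks=2):
    """MTTDL[1] in years for an n-way mirror (assumed simple model).

    Data loss requires a second failure while the first is being
    repaired: MTBF^2 / (N * (N-1) * MTTR).
    """
    hours = mtbf_hours ** 2 / (n_disks * (n_disks - 1) * mttr_hours)
    return hours / HOURS_PER_YEAR


def annual_dl_percent(mttdl_years):
    """Approximate annual data-loss probability as a percent (1/MTTDL)."""
    return 100.0 / mttdl_years


for mtbf in (2_000_000, 4_000_000):
    years = mttdl1_years(mtbf, mttr_hours=48)
    print(f"{mtbf / 1e6:.0f}M hours MTBF -> MTTDL[1] = {years:,.0f} years, "
          f"annual DL rate = {annual_dl_percent(years):.6f}%")
```

Changing mttr_hours to 8 shows how much the enterprise response time
assumption moves the result.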

If you want to more accurately model based on endurance, then you'll
need to know the expected write rate and the nature of the wear leveling
mechanism. It can be done, but the probability is really, really small.

What I would really like to see is zpool's ability to fail-back to an inline zil in the event an external one fails or is missing. Then one can remove an slog from a pool and add a different one if necessary or just remove it altogether.


It already does this, with caveats.  What you might also want is
a fix for CR 6574286, "removing a slog doesn't work":
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6574286
-- richard

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
