Re: [PERFORM] Intel SSDs that may not suck

2011-04-07 Thread Scott Carey


On 4/6/11 10:48 PM, Greg Smith g...@2ndquadrant.com wrote:
Since they're bragging about it there, the safe bet is that the older R2
unit had no such facility.

I note that the Z-Drive R2 is basically some flash packed on top of an
LSI 1068e controller, mapped as a RAID0 volume.

In Linux, you can expose it as a set of 4 JBOD drives, use software RAID
of any kind on that,
and have access to TRIM.  Still useless for (most) databases but may be
useful for other applications, if the reliability level is OK otherwise.

I wonder if the R3 will also be configurable as direct JBOD.


It's possible they left
the battery-backup unit on that card exposed, so it may be possible to
do better with it.  The way they just stack those card layers together,
the thing is practically held together with duct tape though.  That's
not a confidence inspiring design to me.  The R3 drives are much more
cleanly integrated.

-- 
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance




Re: [PERFORM] Intel SSDs that may not suck

2011-04-06 Thread Scott Carey
I have generation 1 and 2 Intel MLC drives in production (~150+).  Some
have been around for 2 years.

None have died.  None have hit the write cycle limit.  We do ~ 75GB of
writes a day.

The data and writes on these are not transactional (if one dies, we have
copies).  But the reliability has been excellent.  We had the performance
degradation issues in the G1's that required a firmware update, and have
had to do a secure erase on a few to get write performance back to
acceptable levels.

I could care less about the 'fast' sandforce drives.  They fail at a high
rate and the performance improvement is BECAUSE they are using a large,
volatile write cache.  If I need higher sequential transfer rate, I'll
RAID some of these together.  A RAID-10 of 6 of these will make a simple
select count(1) query be CPU bound anyway.

I have some G3 SSD's I'll be doing power-fail testing on soon for database
use (currently, we only use the old ones for indexes in databases or
unimportant clone db's).

I have had more raid cards fail in the last 3 years (out of a couple
dozen) than Intel SSD's fail (out of ~150).  I do not trust the Intel 510
series yet -- it's based on a non-Intel controller and has worse
random-write performance anyway.



On 3/28/11 9:13 PM, Merlin Moncure mmonc...@gmail.com wrote:

On Mon, Mar 28, 2011 at 7:54 PM, Andy angelf...@yahoo.com wrote:
 This might be a bit too little too late though. As you mentioned there
really isn't any real performance improvement for the Intel SSD.
Meanwhile, SandForce (the controller that OCZ Vertex is based on) is
releasing its next generation controller at a reportedly huge
performance increase.

 Is there any benchmark measuring the performance of these SSD's (the
new Intel vs. the new SandForce) running database workloads? The
benchmarks I've seen so far are for desktop applications.

The random performance data is usually a rough benchmark.  The
sequential numbers are mostly useless and always have been.  The
performance of either the ocz or intel drive is so disgustingly fast
compared to hard drives that the main stumbling block is life span
and write endurance, now that they are starting to get capacitors.

My own experience with MLC drives is that write cycle expectations are
more or less as advertised. They do go down (hard), and have to be
monitored. If you are writing a lot of data this can get pretty
expensive although the cost dynamics are getting better and better for
flash. I have no idea what would be precisely prudent, but maybe some
good monitoring tools and phased obsolescence at around 80% duty cycle
might not be a bad starting point.  With hard drives, you can kinda
wait for em to pop and swap em in -- this is NOT a good idea for flash
raid volumes.
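
The phased-replacement idea above can be sketched as follows.  This is
only an illustration: it assumes the monitoring tool can report what
fraction of each drive's rated write endurance has been consumed, and the
function name, drive labels and the 0.80 threshold are hypothetical.

```python
def drives_to_retire(wear_used, threshold=0.80):
    """Return drive names whose consumed endurance is at or above threshold.

    wear_used maps a drive name to the fraction (0.0-1.0) of its rated
    write endurance already consumed.  How that fraction is obtained
    (SMART attribute, vendor tool) is an assumption and is not shown.
    """
    return sorted(name for name, used in wear_used.items() if used >= threshold)


# Hypothetical readings for a four-drive array.
readings = {"sda": 0.35, "sdb": 0.82, "sdc": 0.80, "sdd": 0.79}
print(drives_to_retire(readings))
```

Retiring drives on a schedule like this, rather than waiting for them to
fail, also keeps members of one array from wearing out at the same time.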

merlin



Re: [PERFORM] Intel SSDs that may not suck

2011-04-06 Thread Andy

--- On Wed, 4/6/11, Scott Carey sc...@richrelevance.com wrote:


 I could care less about the 'fast' sandforce drives. 
 They fail at a high
 rate and the performance improvement is BECAUSE they are
 using a large,
 volatile write cache.  

The G1 and G2 Intel MLC also use volatile write cache, just like most SandForce 
drives do.



Re: [PERFORM] Intel SSDs that may not suck

2011-04-06 Thread gnuoytr
Not for user data, only controller data.





Re: [PERFORM] Intel SSDs that may not suck

2011-04-06 Thread Scott Carey


On 3/29/11 7:16 AM, Jeff thres...@torgo.978.org wrote:


The write degradation could probably be monitored looking at svctime
from sar. We may be implementing that in the near future to detect
when this creeps up again.
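
One way such a check might look, as a rough sketch.  It assumes the
service-time samples (in milliseconds) have already been extracted from
the sar output; the function name, window size and factor are made up
for illustration.

```python
from statistics import mean


def svctime_degraded(samples_ms, baseline_ms, window=5, factor=2.0):
    """True when the mean of the last `window` samples exceeds factor * baseline.

    samples_ms is a list of service times in milliseconds, oldest first.
    Returns False until at least `window` samples are available.
    """
    if len(samples_ms) < window:
        return False
    return mean(samples_ms[-window:]) > factor * baseline_ms


# Hypothetical samples: a healthy period followed by a slowdown.
history = [0.2] * 5 + [0.9] * 5
print(svctime_degraded(history, baseline_ms=0.2))
```

Averaging over a window avoids alerting on a single slow sample.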


For the X25-M's, overcommit.  Do a secure erase, then only partition and
use 85% or so of the drive (~7% is already hidden).  This helps a lot with
the write performance over time.  The Intel rep claimed that the new G3's
are much better at limiting the occasional write latency, by splitting
longer delays into slightly more frequent smaller delays.
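
The arithmetic behind that advice, as a small sketch.  The 160GB capacity
is only an example, and treating the "~7% already hidden" as a fraction
of the raw flash is an assumption made here for illustration.

```python
def partition_size_gb(advertised_gb, use_fraction=0.85):
    """Size of the partition to create after the secure erase."""
    return advertised_gb * use_fraction


def effective_spare_fraction(use_fraction=0.85, hidden_fraction=0.07):
    """Fraction of raw flash left unused by the host.

    Assumes advertised capacity = raw capacity * (1 - hidden_fraction).
    """
    return 1.0 - use_fraction * (1.0 - hidden_fraction)


# Example with a hypothetical 160GB drive.
print(partition_size_gb(160))
print(round(effective_spare_fraction(), 4))
```

Under these assumptions roughly a fifth of the flash stays free for the
controller to use for wear leveling and garbage collection.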

Some of the benchmark reviews have histograms that demonstrate this
(although the authors of the review only note average latency or
throughput, the deviations have clearly gone down in this generation).

I'll know more for sure after some benchmarking myself.




--
Jeff Trout j...@jefftrout.com
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/






Re: [PERFORM] Intel SSDs that may not suck

2011-04-06 Thread Scott Carey


On 3/29/11 7:32 AM, Jeff thres...@torgo.978.org wrote:


On Mar 29, 2011, at 10:16 AM, Jeff wrote:

 Now that all sounds awful and horrible until you get to overall
 performance, especially with reads - you are looking at 20k random
 reads per second with a few disks.  Adding in writes does kick it
 down a notch, but you're still looking at 10k+ iops. That is the
 current trade off.


We've been doing a burn in for about 4 days now on an array of 8
x25m's behind a p812 controller: here's a sample of what it is
currently doing (I have 10 threads randomly seeking, reading, and 10%
of the time writing (then fsync'ing) out, using my pgiosim tool which
I need to update on pgfoundry)

Your RAID card is probably disabling the write cache on those.  If not, it
isn't power failure safe.

When the write cache is disabled, the negative effects of random writes on
longevity and performance are significantly amplified.

For the G3 drives, you can force the write caches on and remain power
failure safe.  This will significantly decrease the effects of the below.
You can also use a newer linux version with a file system that supports
TRIM/DISCARD which will help as long as your raid controller passes that
through.  It might end up that for many workloads with these drives, it is
faster to use software raid than hardware raid + raid controller.



that was from a simple dd, not random writes. (since it is in
production, I can't really do the random write test as easily)

theoretically, a nice rotation of disks would remove that problem.
annoying, but it is the price you need to pay

--
Jeff Trout j...@jefftrout.com
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/






Re: [PERFORM] Intel SSDs that may not suck

2011-04-06 Thread Scott Carey


On 4/6/11 2:11 PM, Andy angelf...@yahoo.com wrote:


--- On Wed, 4/6/11, Scott Carey sc...@richrelevance.com wrote:


 I could care less about the 'fast' sandforce drives.
 They fail at a high
 rate and the performance improvement is BECAUSE they are
 using a large,
 volatile write cache.

The G1 and G2 Intel MLC also use volatile write cache, just like most
SandForce drives do.

1. People are complaining that the Intel G3's aren't as fast as the
SandForce drives (they are faster than the 1st gen SandForce, but not the
yet-to-be-released ones like Vertex 3).  From a database perspective, this
is complete BS.

2. 256K versus 64MB write cache.   Power + time to flush a cache matters.

3. None of the performance benchmarks of drives are comparing the
performance with the cache _disabled_ which is required when not power
safe.  If the SandForce drives are still that much faster with it
disabled, I'd be shocked.  Disabling a 256K write cache will affect
performance less than disabling a 64MB one.
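
A back-of-the-envelope sketch of point 2.  The 100MB/s flush rate is a
made-up figure, not a measured one; only the two cache sizes come from
the discussion above.  Whatever the real rate is, the larger cache needs
256 times as long (and correspondingly more stored energy) to drain.

```python
KIB = 1024
MIB = 1024 * KIB


def flush_time_ms(cache_bytes, write_bytes_per_s):
    """Milliseconds needed to drain a full cache at a steady write rate."""
    return 1000.0 * cache_bytes / write_bytes_per_s


ASSUMED_WRITE_RATE = 100 * MIB  # hypothetical sustained flush rate

small = flush_time_ms(256 * KIB, ASSUMED_WRITE_RATE)
large = flush_time_ms(64 * MIB, ASSUMED_WRITE_RATE)
print(f"256K cache: {small:.2f} ms, 64MB cache: {large:.0f} ms, "
      f"ratio {large / small:.0f}x")
```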




Re: [PERFORM] Intel SSDs that may not suck

2011-04-06 Thread Scott Carey


On 4/6/11 4:03 PM, gnuo...@rcn.com gnuo...@rcn.com wrote:

Not for user data, only controller data.


False.  I used to think so, but there is volatile write cache for user
data -- it's on the 256K chip SRAM, not the DRAM, though.

Simple power failure tests demonstrate that you lose data with these
drives unless you disable the cache.  Disabling the cache roughly drops
write performance by a factor of 3 to 4 on G1 drives and significantly
hurts wear-leveling and longevity (I haven't tried G2's).





Re: [PERFORM] Intel SSDs that may not suck

2011-04-06 Thread Scott Carey

On 4/5/11 7:07 AM, Merlin Moncure mmonc...@gmail.com wrote:

On Mon, Apr 4, 2011 at 8:26 PM, Greg Smith g...@2ndquadrant.com wrote:

 If you really don't need more than 120GB of storage, but do care about
 random I/O speed, this is a pretty easy decision now--presuming the
drive
 holds up to claims.  As the claims are reasonable relative to the
 engineering that went into the drive now, that may actually be the case.

One thing about MLC flash drives (which the industry seems to be
moving towards) is that you have to factor drive lifespan into the
total system balance of costs. Data point: had an ocz vertex 2 that
burned out in ~ 18 months.  In the post mortem, it was determined that
the drive met and exceeded its 10k write limit -- this was a busy
production box.

What OCZ Drive?  What controller?  Indilinx? SandForce?  Wear-leveling on
these varies quite a bit.

Intel claims write lifetimes in the single digit PB sizes for these 310's.
 They are due to have an update to the X25-E line too at some point.
Public roadmaps say this will be using enterprise MLC.  This stuff
trades off write endurance for data longevity -- if left without power for
too long the data will be lost.  This is a tradeoff for all flash -- but
the stuff that is optimized for USB sticks is quite different than the
stuff optimized for servers.
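
To put a petabyte-scale endurance claim in perspective, here is a rough
sketch.  It takes 1PB as the low end of "single digit PB", uses the
~75GB/day write volume mentioned elsewhere in this thread, counts 1PB as
1,000,000GB, and ignores write amplification, so it is an optimistic
upper bound rather than a prediction.

```python
def years_until_worn(endurance_gb, host_writes_gb_per_day):
    """Years until the rated endurance is consumed by host writes alone."""
    return endurance_gb / host_writes_gb_per_day / 365.0


ENDURANCE_GB = 1_000_000  # assumed: 1PB, low end of the claimed range
DAILY_WRITES_GB = 75      # write volume quoted elsewhere in the thread

print(f"{years_until_worn(ENDURANCE_GB, DAILY_WRITES_GB):.1f} years")
```

Random small writes with the cache disabled would shrink this figure,
since the flash then absorbs more writes than the host issues.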


merlin



Re: [PERFORM] Intel SSDs that may not suck

2011-04-06 Thread David Rees
On Wed, Apr 6, 2011 at 5:42 PM, Scott Carey sc...@richrelevance.com wrote:
 On 4/5/11 7:07 AM, Merlin Moncure mmonc...@gmail.com wrote:
One thing about MLC flash drives (which the industry seems to be
moving towards) is that you have to factor drive lifespan into the
total system balance of costs. Data point: had an ocz vertex 2 that
burned out in ~ 18 months.  In the post mortem, it was determined that
the drive met and exceeded its 10k write limit -- this was a busy
production box.

 What OCZ Drive?  What controller?  Indilinx? SandForce?  Wear-leveling on
 these varies quite a bit.

SandForce SF-1200

-Dave



Re: [PERFORM] Intel SSDs that may not suck

2011-04-06 Thread Greg Smith

On 04/06/2011 08:22 PM, Scott Carey wrote:

Simple power failure tests demonstrate that you lose data with these
drives unless you disable the cache.  Disabling the cache roughly drops
write performance by a factor of 3 to 4 on G1 drives and significantly
hurts wear-leveling and longevity (I haven't tried G2's).
   


Yup.  I have a customer running a busy system with Intel X25-Es, and 
another with X25-Ms, and every time there is a power failure at either 
place their database gets corrupted.  That those drives are worthless 
for a reliable database setup has been clear for two years now:  
http://www.mysqlperformanceblog.com/2009/03/02/ssd-xfs-lvm-fsync-write-cache-barrier-and-lost-transactions/ 
Sometimes I even hear reports about those drives getting corrupted 
when the write cache is turned off.  If you aggressively replicate 
the data to another location on a different power grid, you can survive 
with Intel's older drives.  But odds are you're going to lose at least 
some transactions no matter what you do, and the risk of "database won't 
start" levels of corruption is always lingering.


The fact that Intel is making so much noise over the improved write 
integrity features on the new drives gives you an idea how much these 
problems have hurt their reputation in the enterprise storage space.


--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books




Re: [PERFORM] Intel SSDs that may not suck

2011-04-06 Thread David Boreham
Had to say a quick thanks to Greg and the others who have posted 
detailed test results on SSDs here.
For those of us watching for the inflection point where we can begin the 
transition from mechanical to solid state storage, this data and 
experience is invaluable. Thanks for sharing it.


A short story while I'm posting : my Dad taught electronics engineering 
and would often visit the local factories with groups of students. I 
remember in particular after a visit to a disk drive manufacturer 
(Burroughs), in 1977 he came home telling me that he'd asked the plant 
manager what their plan was once solid state storage made their products 
obsolete. The manager looked at him like he was from another planet...


So I've been waiting patiently 34 years for this hopefully 
soon-to-arrive moment ;)






Re: [PERFORM] Intel SSDs that may not suck

2011-04-06 Thread gnuoytr
SSDs have been around for quite some time.  The first that I've found is Texas 
Memory.  Not quite 1977, but not flash either, although they've been doing 
flash for a couple of years.  

http://www.ramsan.com/company/history



Re: [PERFORM] Intel SSDs that may not suck

2011-04-06 Thread David Boreham

On 4/6/2011 9:19 PM, gnuo...@rcn.com wrote:

SSDs have been around for quite some time.  The first that I've found is Texas 
Memory.  Not quite 1977, but not flash either, although they've been doing so 
for a couple of years.
Well, I built my first ram disk (which of course I thought I had 
invented, at the time) in 1982.
But today we're seeing solid state storage seriously challenging 
rotating media across all applications, except at the TB and beyond 
scale. That's what's new.






Re: [PERFORM] Intel SSDs that may not suck

2011-04-06 Thread Jesper Krogh

On 2011-03-28 22:21, Greg Smith wrote:
Some may still find these too cheap for enterprise use, given the use 
of MLC limits how much activity these drives can handle.  But it's 
great to have a new option for lower budget system that can tolerate 
some risk there.



Drifting off the topic slightly..  Has anyone opinions/experience with:
http://www.ocztechnology.com/ocz-z-drive-r2-p88-pci-express-ssd.html

They seem to be like the FusionIO drives just quite a lot cheaper,
wonder what the state of those 512MB is in case of a power-loss.


--
Jesper



Re: [PERFORM] Intel SSDs that may not suck

2011-04-06 Thread Greg Smith

On 04/07/2011 12:27 AM, Jesper Krogh wrote:

On 2011-03-28 22:21, Greg Smith wrote:
Some may still find these too cheap for enterprise use, given the use 
of MLC limits how much activity these drives can handle.  But it's 
great to have a new option for lower budget system that can tolerate 
some risk there.



Drifting off the topic slightly..  Has anyone opinions/experience with:
http://www.ocztechnology.com/ocz-z-drive-r2-p88-pci-express-ssd.html

They seem to be like the FusionIO drives just quite a lot cheaper,
wonder what the state of those 512MB is in case of a power-loss.


What I do is assume that if the vendor doesn't say outright how the 
cache is preserved, that means it isn't, and the card is garbage for 
database use.  That rule is rarely wrong.  The available soon Z-Drive R3 
includes a Sandforce controller and supercap for preserving writes:  
http://hothardware.com/News/OCZ-Unveils-RevoDrive-X3-Vertex-3-and-Other-SSD-Goodness/


Since they're bragging about it there, the safe bet is that the older R2 
unit had no such facility.


I note that the Z-Drive R2 is basically some flash packed on top of an 
LSI 1068e controller, mapped as a RAID0 volume.  It's possible they left 
the battery-backup unit on that card exposed, so it may be possible to 
do better with it.  The way they just stack those card layers together, 
the thing is practically held together with duct tape though.  That's 
not a confidence inspiring design to me.  The R3 drives are much more 
cleanly integrated.


--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books




Re: [PERFORM] Intel SSDs that may not suck

2011-04-05 Thread Merlin Moncure
On Mon, Apr 4, 2011 at 8:26 PM, Greg Smith g...@2ndquadrant.com wrote:
 On 03/28/2011 04:21 PM, Greg Smith wrote:

 Today is the launch of Intel's 3rd generation SSD line, the 320 series.
  And they've finally produced a cheap consumer product that may be useful
 for databases, too!  They've put 6 small capacitors onto the board and added
 logic to flush the write cache if the power drops.

 I decided a while ago that I wasn't going to buy a personal SSD until I
 could get one without a volatile write cache for less than what a
 battery-backed caching controller costs.  That seemed the really disruptive
 technology point for the sort of database use I worry about.  According to
 http://www.newegg.com/Product/Product.aspx?Item=N82E16820167050 that point
 was today, with the new 120GB drives now selling for $240.  UPS willing,
 later this week I should have one of those here for testing.

 A pair of those mirrored with software RAID-1 runs $480 for 120GB.  LSI
 MegaRAID 9260-4i with 512MB cache is $330, ditto 3ware 9750-4i.  Battery
 backup runs $135 to $180 depending on model; let's call it $150.  Decent
 enterprise hard drive without RAID-incompatible firmware, $90 for 500GB,
 need two of them.  That's $660 total for 500GB of storage.

 If you really don't need more than 120GB of storage, but do care about
 random I/O speed, this is a pretty easy decision now--presuming the drive
 holds up to claims.  As the claims are reasonable relative to the
 engineering that went into the drive now, that may actually be the case.

One thing about MLC flash drives (which the industry seems to be
moving towards) is that you have to factor drive lifespan into the
total system balance of costs. Data point: had an ocz vertex 2 that
burned out in ~ 18 months.  In the post mortem, it was determined that
the drive met and exceeded its 10k write limit -- this was a busy
production box.

merlin



Re: [PERFORM] Intel SSDs that may not suck

2011-04-04 Thread Greg Smith

On 03/28/2011 04:21 PM, Greg Smith wrote:
Today is the launch of Intel's 3rd generation SSD line, the 320 
series.  And they've finally produced a cheap consumer product that 
may be useful for databases, too!  They've put 6 small capacitors onto 
the board and added logic to flush the write cache if the power drops.


I decided a while ago that I wasn't going to buy a personal SSD until I 
could get one without a volatile write cache for less than what a 
battery-backed caching controller costs.  That seemed the really 
disruptive technology point for the sort of database use I worry about.  
According to 
http://www.newegg.com/Product/Product.aspx?Item=N82E16820167050 that 
point was today, with the new 120GB drives now selling for $240.  UPS 
willing, later this week I should have one of those here for testing.


A pair of those mirrored with software RAID-1 runs $480 for 120GB.  LSI 
MegaRAID 9260-4i with 512MB cache is $330, ditto 3ware 9750-4i.  Battery 
backup runs $135 to $180 depending on model; let's call it $150.  Decent 
enterprise hard drive without RAID-incompatible firmware, $90 for 
500GB, need two of them.  That's $660 total for 500GB of storage.
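
The comparison above, worked through as a short sketch.  All prices and
capacities are the ones quoted in this message; only the variable names
are invented.

```python
# Option 1: two 120GB SSDs mirrored with software RAID-1.
ssd_total = 2 * 240
ssd_usable_gb = 120

# Option 2: RAID card + battery backup + two mirrored 500GB hard drives.
hdd_total = 330 + 150 + 2 * 90
hdd_usable_gb = 500

print(f"SSD mirror: ${ssd_total} total, ${ssd_total / ssd_usable_gb:.2f}/GB")
print(f"HDD + BBU:  ${hdd_total} total, ${hdd_total / hdd_usable_gb:.2f}/GB")
```

The SSD pair costs less in total but about three times as much per
gigabyte, which is why the choice hinges on whether 120GB is enough.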


If you really don't need more than 120GB of storage, but do care about 
random I/O speed, this is a pretty easy decision now--presuming the 
drive holds up to claims.  As the claims are reasonable relative to the 
engineering that went into the drive now, that may actually be the case.


--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books




Re: [PERFORM] Intel SSDs that may not suck

2011-03-29 Thread Justin Pitts
The potential breakthrough here with the 320 is consumer grade SSD
performance and price paired with high reliability.

On Mon, Mar 28, 2011 at 7:54 PM, Andy angelf...@yahoo.com wrote:
 This might be a bit too little too late though. As you mentioned there really 
 isn't any real performance improvement for the Intel SSD. Meanwhile, 
 SandForce (the controller that OCZ Vertex is based on) is releasing its next 
 generation controller at a reportedly huge performance increase.

 Is there any benchmark measuring the performance of these SSD's (the new 
 Intel vs. the new SandForce) running database workloads? The benchmarks I've 
 seen so far are for desktop applications.

 Andy

 --- On Mon, 3/28/11, Greg Smith g...@2ndquadrant.com wrote:

 From: Greg Smith g...@2ndquadrant.com
 Subject: [PERFORM] Intel SSDs that may not suck
 To: pgsql-performance@postgresql.org pgsql-performance@postgresql.org
 Date: Monday, March 28, 2011, 4:21 PM
 Today is the launch of Intel's 3rd
 generation SSD line, the 320 series.  And they've
 finally produced a cheap consumer product that may be useful
 for databases, too!  They've put 6 small capacitors
 onto the board and added logic to flush the write cache if
 the power drops.  The cache on these was never very
 big, so they were able to avoid needing one of the big
 super-capacitors instead.  Having 6 little ones is
 probably a net reliability win over the single point of
 failure, too.

 Performance is only a little better than earlier generation
 designs, which means they're still behind the OCZ Vertex
 controllers that have been recommended on this list.  I
 haven't really been hearing good things about long-term
 reliability of OCZ's designs anyway, so glad to have an
 alternative.  *Important*:  don't buy SSD for
 important data without also having a good redundancy/backup
 plan.  As relatively new technology they do still have
 a pretty high failure rate.  Make sure you budget for
 two drives and make multiple copies of your data.

 Anyway, the new Intel drives are fast enough for most things,
 though, and are going to be very inexpensive.  See 
 http://www.storagereview.com/intel_ssd_320_review_300gb
 for some simulated database tests.  There's more about
 the internals at http://www.anandtech.com/show/4244/intel-ssd-320-review
 and the white paper about the capacitors is at 
 http://newsroom.intel.com/servlet/JiveServlet/download/38-4324/Intel_SSD_320_Series_Enhance_Power_Loss_Technology_Brief.pdf

 Some may still find these too cheap for enterprise use,
 given the use of MLC limits how much activity these drives
 can handle.  But it's great to have a new option for
 lower budget system that can tolerate some risk there.

 -- Greg Smith   2ndQuadrant US
 g...@2ndquadrant.com   Baltimore,
 MD
 PostgreSQL Training, Services, and 24x7 Support
 www.2ndQuadrant.us
 PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books




Re: [PERFORM] Intel SSDs that may not suck

2011-03-29 Thread Yeb Havinga

Hello Greg, list,

On 2011-03-28 22:21, Greg Smith wrote:
Today is the launch of Intel's 3rd generation SSD line, the 320 
series.  And they've finally produced a cheap consumer product that 
may be useful for databases, too!  They've put 6 small capacitors onto 
the board and added logic to flush the write cache if the power 
drops.  The cache on these was never very big, so they were able to 
avoid needing one of the big super-capacitors instead.  Having 6 
little ones is probably a net reliability win over the single point of 
failure, too.


Performance is only a little better than earlier generation designs, 
which means they're still behind the OCZ Vertex controllers that have 
been recommended on this list.  I haven't really been hearing good 
things about long-term reliability of OCZ's designs anyway, so glad to 
have an alternative.  *Important*:  don't buy SSD for important data 
without also having a good redundancy/backup plan.  As relatively new 
technology they do still have a pretty high failure rate.  Make sure 
you budget for two drives and make multiple copies of your data.


Anyway, the new Intel drives are fast enough for most things, and 
are going to be very inexpensive.  See 
http://www.storagereview.com/intel_ssd_320_review_300gb for some 
simulated database tests.  There's more about the internals at 
http://www.anandtech.com/show/4244/intel-ssd-320-review and the white 
paper about the capacitors is at 
http://newsroom.intel.com/servlet/JiveServlet/download/38-4324/Intel_SSD_320_Series_Enhance_Power_Loss_Technology_Brief.pdf


Some may still find these too cheap for enterprise use, given that the use 
of MLC limits how much activity these drives can handle.  But it's 
great to have a new option for lower-budget systems that can tolerate 
some risk there.


While I appreciate the heads up about these new drives, your posting 
suggests (though you phrased it so that you do not actually say 
it) that OCZ products lack long-term reliability, without offering any factual 
data. If you know of SandForce-based OCZ drives failing, that would 
be interesting, because that's the product line the new Intel SSD 
ought to be compared with. From my POV, I've verified that the SandForce-based 
OCZ drives operate as they should (w.r.t. barriers/write-through), 
and I've reported what testing was done and how (I really 
appreciated your help with that): 
http://archives.postgresql.org/pgsql-performance/2010-07/msg00449.php.


The three drives we're using in a development environment right now 
report (with recent SSD firmware and smartmontools) their health status, 
including the supercap status, reserved blocks, and a lot more 
info that can be used to tell when a drive is about to die. Since none 
of the drives have failed yet, or are anywhere near their predicted end of 
life, it is currently unknown whether this health status is 
reliable. It may be, but it may just as well not be. Therefore I'm very 
interested in hearing hard facts about failures and the SMART readings 
right before them.


Below are SMART readings from two Vertex 2 Pros; the first is the same drive 
I did the earlier testing with. You can see that its lifetime reads/writes 
and unexpected power loss count are larger than those of the other, newer 
one. The FAILING_NOW on available reserved space is an artefact of the 
smartmontools database having the wrong threshold: the value should be read 
as GB of reserved space, and I suspect that for a new drive it might be on 
the order of 18 or 20.


It's hard to compare with spindles: I've seen them fail in all sorts of 
ways, but so far I've seen no SSD failure. I'm inclined to start 
a perpetual pgbench on one SSD while monitoring its SMART stats, to see whether 
what they report is really a good indicator of remaining lifetime. If it 
is, I'd start to believe this technology is better at failure 
predictability than spindles, whose failures seem pretty much random when you 
have large arrays.
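
For anyone wanting to log SMART stats alongside a long-running test like that, here is a minimal sketch of pulling the attribute table out of `smartctl -A` text. It only assumes the usual ten-column layout (ID#, name, flag, value, worst, thresh, type, updated, when_failed, raw). The sample rows and their numbers are made up for illustration and are not readings from the drives in this thread.

```python
from typing import Dict, List, NamedTuple


class SmartAttribute(NamedTuple):
    attr_id: int
    name: str
    value: int   # normalized value
    worst: int
    thresh: int
    raw: str     # raw value kept as text; its meaning is vendor specific


# Made-up sample in the usual `smartctl -A` layout (illustrative values only).
SAMPLE = """
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Retired_Block_Count     0x0033   100   100   003    Pre-fail  Always       -       0
174 Unexpect_Power_Loss_Ct  0x0030   000   000   000    Old_age   Offline      -       13
232 Available_Reservd_Space 0x0013   000   000   010    Pre-fail  Always   FAILING_NOW 17
"""


def parse_smart_attributes(text: str) -> Dict[str, SmartAttribute]:
    """Parse attribute rows; header and non-attribute lines are skipped."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attrs[fields[1]] = SmartAttribute(
            attr_id=int(fields[0]),
            name=fields[1],
            value=int(fields[3]),
            worst=int(fields[4]),
            thresh=int(fields[5]),
            raw=" ".join(fields[9:]),
        )
    return attrs


def below_threshold(attrs: Dict[str, SmartAttribute]) -> List[str]:
    """Names of attributes whose normalized value is at or below a nonzero threshold."""
    return sorted(a.name for a in attrs.values()
                  if a.thresh > 0 and a.value <= a.thresh)


attrs = parse_smart_attributes(SAMPLE)
print(below_threshold(attrs))
```

Logging the raw values over time, rather than only the pass/fail flag, is what would let you compare the readings against an eventual failure. A wrong threshold in the drive database makes the flag alone misleading.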


Model I tested with earlier:

=== START OF INFORMATION SECTION ===
Model Family:     SandForce Driven SSDs
Device Model:     OCZ VERTEX2-PRO
Serial Number:    OCZ-BVW101PBN8Q8H8M5
LU WWN Device Id: 5 e83a97 f88e46007
Firmware Version: 1.32
User Capacity:    50,020,540,416 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 6
Local Time is:    Tue Mar 29 11:25:04 2011 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed

Re: [PERFORM] Intel SSDs that may not suck

2011-03-29 Thread Jeff


On Mar 29, 2011, at 12:13 AM, Merlin Moncure wrote:



My own experience with MLC drives is that write cycle expectations are
more or less as advertised. They do go down (hard), and have to be
monitored. If you are writing a lot of data this can get pretty
expensive although the cost dynamics are getting better and better for
flash. I have no idea what would be precisely prudent, but maybe some
good monitoring tools and phased obsolescence at around 80% duty cycle
might not be a bad starting point.  With hard drives, you can kinda
wait for em to pop and swap em in -- this is NOT a good idea for flash
raid volumes.




we've been running some of our DB's on SSD's (x25m's, we also have a  
pair of x25e's in another box we use for some super hot tables).  They  
have been in production for well over a year (in some cases, nearly a  
couple years) under heavy load.


We're currently being bit in the ass by performance degradation and
we're working out plans to remedy the situation.  One box has 8 x25m's
in a R10 behind a P400 controller.  First, the P400 is not that
powerful, and we've run experiments with newer (P812) controllers that
have been generally positive.   The main symptom we've been seeing is
write stalls.  Writing will go, then come to a complete halt for 0.5-2
seconds, then resume.   The fix we're going to do is replace each
drive in order, with the rebuild occurring between each.  Then we do a
security erase to reset the drive back to completely empty (including
the spare blocks kept around for writes).


Now that all sounds awful and horrible until you get to overall
performance, especially with reads - you are looking at 20k random
reads per second with a few disks.  Adding in writes does kick it down
a notch, but you're still looking at 10k+ iops. That is the current
trade off.


In general, I wouldn't recommend the cciss stuff with SSD's at this
time because it makes some things such as security erase, SMART and
other things near impossible (performance seems OK though). We've got
some tests planned to see what we can do with an Areca controller and
some SSDs.


Also note that there is a funky interaction with an MSA70 and SSDs:
they do not work together. (I'm not sure if HP's officially branded
SSDs have the same issue.)


The write degradation could probably be monitored looking at svctime  
from sar. We may be implementing that in the near future to detect  
when this creeps up again.



--
Jeff Trout j...@jefftrout.com
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/






Re: [PERFORM] Intel SSDs that may not suck

2011-03-29 Thread Cédric Villemain
2011/3/29 Jeff thres...@torgo.978.org:

 On Mar 29, 2011, at 12:13 AM, Merlin Moncure wrote:


 My own experience with MLC drives is that write cycle expectations are
 more or less as advertised. They do go down (hard), and have to be
 monitored. If you are writing a lot of data this can get pretty
 expensive although the cost dynamics are getting better and better for
 flash. I have no idea what would be precisely prudent, but maybe some
 good monitoring tools and phased obsolescence at around 80% duty cycle
 might not be a bad starting point.  With hard drives, you can kinda
 wait for em to pop and swap em in -- this is NOT a good idea for flash
 raid volumes.



 we've been running some of our DB's on SSD's (x25m's, we also have a pair of
 x25e's in another box we use for some super hot tables).  They have been in
 production for well over a year (in some cases, nearly a couple years) under
 heavy load.

 We're currently being bit in the ass by performance degradation and we're
 working out plans to remedy the situation.  One box has 8 x25m's in a R10
 behind a P400 controller.  First, the p400 is not that powerful and we've
 run experiments with newer (p812) controllers that have been generally
 positive.   The main symptom we've been seeing is write stalls.  Writing
 will go, then come to a complete halt for 0.5-2 seconds, then resume.   The
 fix we're going to do is replace each drive in order with the rebuild
 occurring between each.  Then we do a security erase to reset the drive back
 to completely empty (including the spare blocks kept around for writes).

 Now that all sounds awful and horrible until you get to overall performance,
 especially with reads - you are looking at 20k random reads per second with
 a few disks.  Adding in writes does kick it down a notch, but you're still
 looking at 10k+ iops. That is the current trade off.

 In general, I wouldn't recommend the cciss stuff with SSD's at this time
 because it makes some things such as security erase, smart and other things
 near impossible. (performance seems ok though) We've got some tests planned
 seeing what we can do with an Areca controller and some ssds to see how it
 goes.

 Also note that there is a funky interaction with an MSA70 and SSDs. they do
 not work together. (I'm not sure if HP's official branded ssd's have the
 same issue).

 The write degradation could probably be monitored looking at svctime from
 sar. We may be implementing that in the near future to detect when this
 creeps up again.

svctime (the svctm column) is not trustworthy. According to the sysstat
author, this field will be removed in a future version.


-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support



Re: [PERFORM] Intel SSDs that may not suck

2011-03-29 Thread Jeff


On Mar 29, 2011, at 10:16 AM, Jeff wrote:

Now that all sounds awful and horrible until you get to overall  
performance, especially with reads - you are looking at 20k random  
reads per second with a few disks.  Adding in writes does kick it  
down a notch, but you're still looking at 10k+ iops. That is the  
current trade off.




We've been doing a burn in for about 4 days now on an array of 8  
x25m's behind a p812 controller: here's a sample of what it is  
currently doing (I have 10 threads randomly seeking, reading, and 10%  
of the time writing (then fsync'ing) out, using my pgiosim tool which  
I need to update on pgfoundry)


10:25:24 AM  dev104-2   7652.21 109734.51  12375.22  15.96   8.22    1.07    0.12   88.32
10:25:25 AM  dev104-2   7318.52 104948.15  11696.30  15.94   8.62    1.17    0.13   92.50
10:25:26 AM  dev104-2   7871.56 112572.48  13034.86  15.96   8.60    1.09    0.12   91.38
10:25:27 AM  dev104-2   7869.72 111955.96  13592.66  15.95   8.65    1.10    0.12   91.65
10:25:28 AM  dev104-2   7859.41 111920.79  13560.40  15.97   9.32    1.19    0.13   98.91
10:25:29 AM  dev104-2   7285.19 104133.33  12000.00  15.94   8.08    1.11    0.13   92.59
10:25:30 AM  dev104-2   8017.27 114581.82  13250.91  15.94   8.48    1.06    0.11   90.36
10:25:31 AM  dev104-2   8392.45 120030.19  13924.53  15.96   8.90    1.06    0.11   94.34
10:25:32 AM  dev104-2  10173.86 145836.36  16409.09  15.95  10.72    1.05    0.11  113.52
10:25:33 AM  dev104-2   7007.14 100107.94  11688.89  15.95   7.39    1.06    0.11   79.29
10:25:34 AM  dev104-2   8043.27 115076.92  13192.31  15.95   9.09    1.13    0.12   96.15
10:25:35 AM  dev104-2   7409.09 104290.91  13774.55  15.94   8.62    1.16    0.12   90.55


The 2nd to last column is svctime; the first column after dev104-2 is
TPS.  If I kill the writes off, TPS rises quite a bit:
10:26:34 AM  dev104-2  22659.41 361528.71      0.00  15.95  10.57    0.42    0.04   99.01
10:26:35 AM  dev104-2  22479.41 359184.31      7.84  15.98   9.61    0.52    0.04   98.04
10:26:36 AM  dev104-2  21734.29 347230.48      0.00  15.98   9.30    0.43    0.04   95.33
10:26:37 AM  dev104-2  21551.46 344023.30    116.50  15.97   9.56    0.44    0.05   97.09
10:26:38 AM  dev104-2  21964.42 350592.31      0.00  15.96  10.25    0.42    0.04   96.15
10:26:39 AM  dev104-2  22512.75 359294.12      7.84  15.96  10.23    0.50    0.04   98.04
10:26:40 AM  dev104-2  22373.53 357725.49      0.00  15.99   9.52    0.43    0.04   98.04
10:26:41 AM  dev104-2  21436.79 342596.23      0.00  15.98   9.17    0.43    0.04   94.34
10:26:42 AM  dev104-2  22525.49 359749.02     39.22  15.97  10.18    0.45    0.04   98.04



Now to demonstrate write stalls on the problematic box:
10:30:49 AM  dev104-3      0.00      0.00      0.00   0.00   0.38    0.00    0.00   35.85
10:30:50 AM  dev104-3      3.03      8.08    258.59  88.00   2.43  635.00  333.33  101.01
10:30:51 AM  dev104-3      4.00      0.00    128.00  32.00   0.67  391.75   92.75   37.10
10:30:52 AM  dev104-3     10.89      0.00     95.05   8.73   1.45  133.55   12.27   13.37
10:30:53 AM  dev104-3      0.00      0.00      0.00   0.00   0.00    0.00    0.00    0.00
10:30:54 AM  dev104-3    155.00      0.00   1488.00   9.60  10.88   70.23    2.92   45.20
10:30:55 AM  dev104-3     10.00      0.00    536.00  53.60   1.66  100.20   45.80   45.80
10:30:56 AM  dev104-3     46.53      0.00    411.88   8.85   3.01   78.51    4.30   20.00
10:30:57 AM  dev104-3     11.00      0.00     96.00   8.73   0.79   72.91   27.00   29.70
10:30:58 AM  dev104-3     12.00      0.00     96.00   8.00   0.79   65.42   11.17   13.40
10:30:59 AM  dev104-3      7.84      7.84     62.75   9.00   0.67   85.38   32.00   25.10
10:31:00 AM  dev104-3      8.00      0.00    224.00  28.00   0.82  102.00   47.12   37.70
10:31:01 AM  dev104-3     20.00      0.00    184.00   9.20   0.24   11.80    1.10    2.20
10:31:02 AM  dev104-3      4.95      0.00     39.60   8.00   0.23   46.00   13.00    6.44
10:31:03 AM  dev104-3      0.00      0.00      0.00   0.00   0.00    0.00    0.00    0.00


that was from a simple dd, not random writes. (since it is in  
production, I can't really do the random write test as easily)
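
A rough sketch of how stalls like these could be picked out of sar output automatically. It assumes the eight numeric columns are the usual `sar -d` set (tps, rd_sec/s, wr_sec/s, avgrq-sz, avgqu-sz, await, svctm, %util), which matches where tps and svctime sit in the samples in this thread. It keys on await and on "queue not empty but nothing completed" rather than on svctm. The 100 ms limit and the function names are arbitrary choices for illustration; the sample rows are taken from the output in this thread.

```python
from typing import List, NamedTuple


class SarRow(NamedTuple):
    time: str
    dev: str
    tps: float
    rd_sec: float
    wr_sec: float
    avgrq_sz: float
    avgqu_sz: float
    await_ms: float
    svctm: float
    util: float


# A few rows from the thread, one healthy and the rest from the stalling box.
SAMPLE = """
10:25:24 AM dev104-2 7652.21 109734.51 12375.22 15.96 8.22 1.07 0.12 88.32
10:30:49 AM dev104-3 0.00 0.00 0.00 0.00 0.38 0.00 0.00 35.85
10:30:50 AM dev104-3 3.03 8.08 258.59 88.00 2.43 635.00 333.33 101.01
10:30:51 AM dev104-3 4.00 0.00 128.00 32.00 0.67 391.75 92.75 37.10
10:30:53 AM dev104-3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
10:30:54 AM dev104-3 155.00 0.00 1488.00 9.60 10.88 70.23 2.92 45.20
"""


def parse_sar_rows(text: str) -> List[SarRow]:
    """Parse 'HH:MM:SS AM|PM DEV' followed by 8 numeric columns; skip anything else."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 11:
            continue
        try:
            nums = [float(x) for x in fields[3:]]
        except ValueError:
            continue  # header line or garbage
        rows.append(SarRow(f"{fields[0]} {fields[1]}", fields[2], *nums))
    return rows


def find_stalls(rows: List[SarRow], await_limit_ms: float = 100.0) -> List[SarRow]:
    """Flag samples with very high await, or with queued requests and zero completions."""
    return [r for r in rows
            if r.await_ms > await_limit_ms
            or (r.tps == 0.0 and r.avgqu_sz > 0.0)]


for row in find_stalls(parse_sar_rows(SAMPLE)):
    print(row.time, row.dev, row.tps, row.await_ms)
```

A fully idle second (everything zero) is deliberately not flagged, since that looks the same as nobody writing.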


theoretically, a nice rotation of disks would remove that problem.  
annoying, but it is the price you need to pay


--
Jeff Trout j...@jefftrout.com
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/





Re: [PERFORM] Intel SSDs that may not suck

2011-03-29 Thread Strange, John W
This can be resolved by partitioning the disk with a larger write spare area so 
that the cells don't have to be recycled so often. There is a lot of 
misinformation about SSDs; there are some great articles on AnandTech that 
really explain how the technology works and some of the differences between the 
controllers as well.  If you do the reading you can find a solution that will 
work for you. SSDs are probably one of the best technologies to come along for 
us in a long time, giving us a huge performance jump in the IO world.  We 
have gone from completely IO bound to CPU bound; it's really worth spending the 
time to investigate and understand how this can impact your system.

http://www.anandtech.com/show/2614
http://www.anandtech.com/show/2738
http://www.anandtech.com/show/4244/intel-ssd-320-review
http://www.anandtech.com/tag/storage
http://www.anandtech.com/show/3849/micron-announces-realssd-p300-slc-ssd-for-enterprise
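
As a back-of-the-envelope sketch of the "larger spare area" idea: pick how much of the drive to leave unpartitioned and size the partition accordingly. The 160 GB capacity, the 20% figure and the 1 MiB alignment below are illustrative assumptions, not recommendations from the articles above. It also assumes the unpartitioned area has never been written (or the drive was secure-erased first), since otherwise the controller has no way to know those blocks are free.

```python
def partition_size_for_spare(user_capacity_bytes: int,
                             extra_spare_percent: int,
                             align_bytes: int = 1024 * 1024) -> int:
    """Largest aligned partition size that leaves the given percentage unpartitioned."""
    if not 0 <= extra_spare_percent < 100:
        raise ValueError("extra_spare_percent must be in [0, 100)")
    # Integer arithmetic avoids floating point rounding on large byte counts.
    usable = user_capacity_bytes * (100 - extra_spare_percent) // 100
    # Round down to the alignment boundary.
    return usable - (usable % align_bytes)


# Hypothetical 160 GB drive, leaving 20% of the user capacity as extra spare area.
size = partition_size_for_spare(160_000_000_000, 20)
print(size, "bytes =", size // (1024 * 1024), "MiB")
```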



Re: [PERFORM] Intel SSDs that may not suck

2011-03-29 Thread Jesper Krogh

On 2011-03-29 16:16, Jeff wrote:

 halt for 0.5-2 seconds, then resume. The fix we're going to do is
 replace each drive in order with the rebuild occurring between each.
 Then we do a security erase to reset the drive back to completely
 empty (including the spare blocks kept around for writes).


Are you replacing the drives with new ones, or just secure-erasing them and 
putting them back in?

What kind of usage figures are you getting out of smartmontools?
(I'm also seeing some write stalls here, on 24 Raid50 volumes of x25m's, and
have been planning to cycle drives for quite some time, without actually
getting to it.)


 Now that all sounds awful and horrible until you get to overall
 performance, especially with reads - you are looking at 20k random
 reads per second with a few disks. Adding in writes does kick it
 down a notch, but you're still looking at 10k+ iops. That is the
 current trade off.


That's also my experience.
--
Jesper


Re: [PERFORM] Intel SSDs that may not suck

2011-03-29 Thread Jeff


On Mar 29, 2011, at 12:12 PM, Jesper Krogh wrote:



Are you replacing the drives with new ones, or just secure-erasing them and
putting them back in?
What kind of usage figures are you getting out of smartmontools?
(I'm also seeing some write stalls here, on 24 Raid50 volumes of x25m's, and
have been planning to cycle drives for quite some time, without actually
getting to it.)



We have some new drives that we are going to use initially, but
eventually it'll be a secure-erased one we replace each with (which
should perform identically to a new one).


What enclosure and controller are you using on the 24-disk beast?

--
Jeff Trout j...@jefftrout.com
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/






Re: [PERFORM] Intel SSDs that may not suck

2011-03-29 Thread gnuoytr
Both the X25-M and the parts that AnandTech reviews (and a pretty thorough job 
they do) are, on a good day, prosumer.  Review material for truly 
enterprise parts, the kind that STEC, Violin, and Texas Memory will spend a 
year getting qualified at HP or IBM or Oracle, is really hard to come by.

Zsolt does keep track of what's going on in the space, although, as far as 
I've seen, he doesn't do his own testing.  Still, a useful site to visit on occasion:

http://www.storagesearch.com/

regards

 Original message 
Date: Tue, 29 Mar 2011 11:32:16 -0400
From: pgsql-performance-ow...@postgresql.org (on behalf of Strange, John W 
john.w.stra...@jpmchase.com)
Subject: Re: [PERFORM] Intel SSDs that may not suck  
To: Jeff thres...@torgo.dyndns-server.com
Cc: Merlin Moncure mmonc...@gmail.com,Andy 
angelf...@yahoo.com,pgsql-performance@postgresql.org 
pgsql-performance@postgresql.org,Greg Smith g...@2ndquadrant.com,Brian 
Ristuccia br...@ristuccia.com

This can be resolved by partitioning the disk with a larger write spare area 
so that the cells don't have to be recycled so often. There is a lot of 
misinformation about SSD's, there are some great articles on anandtech that 
really explain how the technology works and some of the differences between 
the controllers as well.  If you do the reading you can find a solution that 
will work for you, SSD's are probably one of the best technologies to come 
along for us in a long time that gives us such a performance jump in the IO 
world.  We have gone from completely IO bound to CPU bound, it's really worth 
spending the time to investigate and understand how this can impact your 
system.



http://www.anandtech.com/show/2614
http://www.anandtech.com/show/2738
http://www.anandtech.com/show/4244/intel-ssd-320-review
http://www.anandtech.com/tag/storage
http://www.anandtech.com/show/3849/micron-announces-realssd-p300-slc-ssd-for-enterprise






Re: [PERFORM] Intel SSDs that may not suck

2011-03-29 Thread Greg Smith

On 03/29/2011 06:34 AM, Yeb Havinga wrote:
While I appreciate the heads up about these new drives, your posting 
suggests (though you phrased it so that you do not actually say 
it) that OCZ products lack long-term reliability, without offering any factual 
data. If you know of SandForce-based OCZ drives failing, that would 
be interesting, because that's the product line the new Intel SSD 
ought to be compared with.


I didn't want to say anything too strong until I got to the bottom of 
the reports I'd been sorting through.  It turns out that there is a 
widespread incompatibility between OCZ drives and some popular Gigabyte 
motherboards:  
http://www.ocztechnologyforum.com/forum/showthread.php?76177-do-you-own-a-Gigabyte-motherboard-and-have-the-SMART-error-with-FW1.11...look-inside


(I'm typing this message on a system with one of the impacted 
combinations, one reason why I don't own a Vertex 2 Pro yet.  That I 
would have to run a Beta BIOS does not inspire confidence.)


What happens on the impacted models is that you can't get SMART data 
from the drive.  That means no monitoring for the sort of expected 
failures we all know can happen with any drive.  So far that looks to be 
at the bottom of all the anecdotal failure reports I'd found:  the 
drives may have been throwing bad sectors or some other early failure, 
and the owners had no idea because they thought SMART would warn 
them, but it wasn't working at all.  Thus, they don't find out there's a 
problem until the drive just dies altogether one day.


More popular doesn't always mean more reliable, but for stuff like this 
it helps.  Intel ships so many more drives than OCZ that I'd be shocked 
if Gigabyte themselves didn't have reference samples of them for 
testing.  This really looks like more of a warning about why you should 
be particularly aggressive with checking SMART when running recently 
introduced drives, which it sounds like you are already doing.


Reliability in this area is so strange...a diversion to older drives 
gives an idea how annoyed I am about all this.  Last year, I gave up on 
Western Digital's consumer drives (again).  Not because the failure 
rates were bad, but because the one failure I did run into was so 
terrible from a SMART perspective.  The drive just lied about the whole 
problem so aggressively I couldn't manage the process.  I couldn't get 
the drive to admit it had a problem such that it could turn into an RMA 
candidate, despite failing every time I ran an aggressive SMART error 
check.  It would reallocate a few sectors, say "good as new!", and then 
fail at the next block when I re-tested.  Did that at least a dozen 
times before throwing it in the pathological drives pile I keep around 
for torture testing.


Meanwhile, the Seagate drives I switched back to are terrible, from a 
failure percentage perspective.  I just had two start to go bad last 
week, both halves of an array, which is always fun.  But, the failure 
started with very clearly labeled increases in reallocated sectors, and 
the drive that eventually went really bad (making the bad noises) was 
kicked back for RMA.  If you've got redundancy, I'll take components 
that fail cleanly over ones that hide what's going on, even if the one 
that fails cleanly is actually more likely to fail.  With a rebuild 
always a drive swap away, having accurate data makes even a higher 
failure rate manageable.
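
To make the "fails cleanly vs. hides what's going on" distinction concrete, here is a small sketch of how a monitoring script might classify a drive from a history of reallocated-sector counts. The key point is that an unreadable SMART value is treated as its own alarm state instead of being silently counted as healthy. The function name and the three states are made up for illustration; how the counts are collected is left out.

```python
from typing import Optional, Sequence


def classify_drive(history: Sequence[Optional[int]]) -> str:
    """Classify a drive from successive reallocated-sector counts.

    Each entry is the count from one polling run, or None when SMART data
    could not be read at all.
    """
    # No data, or the most recent poll failed: we are flying blind.
    if not history or history[-1] is None:
        return "unknown"
    known = [count for count in history if count is not None]
    # A rising count between the last two successful reads is the clean
    # early warning described above.
    if len(known) >= 2 and known[-1] > known[-2]:
        return "degrading"
    return "ok"


print(classify_drive([0, 0, 0]))      # steady count
print(classify_drive([0, 2, 5]))      # count is climbing
print(classify_drive([0, 0, None]))   # SMART stopped answering
```

Paging on "unknown" as loudly as on "degrading" is the design choice here: a drive whose SMART data cannot be read is exactly the case where the failure shows up with no warning.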


--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books




Re: [PERFORM] Intel SSDs that may not suck

2011-03-29 Thread Jesper Krogh

On 2011-03-29 18:50, Jeff wrote:


We have some new drives that we are going to use initially, but 
eventually it'll be a secure-erased one we replace each with (which 
should perform identically to a new one).


What enclosure and controller are you using on the 24-disk beast?


LSI ELP and a HP D2700 enclosure.

Works flawlessly, the only bad thing (which actually is pretty grave)
is that the controller mis-numbers the slots in the enclosure, so
you'll have to have the mapping drawn on paper next to the
enclosure to replace the correct disk.

--
Jesper



Re: [PERFORM] Intel SSDs that may not suck

2011-03-28 Thread Merlin Moncure
On Mon, Mar 28, 2011 at 7:54 PM, Andy angelf...@yahoo.com wrote:
 This might be a bit too little too late though. As you mentioned there really 
 isn't any real performance improvement for the Intel SSD. Meanwhile, 
 SandForce (the controller that OCZ Vertex is based on) is releasing its next 
 generation controller at a reportedly huge performance increase.

 Is there any benchmark measuring the performance of these SSD's (the new 
 Intel vs. the new SandForce) running database workloads? The benchmarks I've 
 seen so far are for desktop applications.

The random performance data is usually a rough benchmark.  The
sequential numbers are mostly useless and always have been.  The
performance of either the OCZ or Intel drive is so disgustingly fast
compared to hard drives that the main stumbling block is life span
and write endurance, now that they are starting to get capacitors.

My own experience with MLC drives is that write cycle expectations are
more or less as advertised. They do go down (hard), and have to be
monitored. If you are writing a lot of data this can get pretty
expensive although the cost dynamics are getting better and better for
flash. I have no idea what would be precisely prudent, but maybe some
good monitoring tools and phased obsolescence at around 80% duty cycle
might not be a bad starting point.  With hard drives, you can kinda
wait for em to pop and swap em in -- this is NOT a good idea for flash
raid volumes.
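
To put rough numbers on the phased-obsolescence idea, here is a small sketch that estimates how many days remain before a drive reaches a retirement point. It reads "around 80% duty cycle" as 80% of the drive's rated write endurance, which is one interpretation. The endurance figure, the amount already written and the daily write volume are all hypothetical inputs you would take from the vendor spec and your own monitoring.

```python
def days_until_retirement(rated_writes_gb: float,
                          written_so_far_gb: float,
                          daily_writes_gb: float,
                          retire_fraction: float = 0.8) -> float:
    """Days left until total writes reach retire_fraction of the rated endurance."""
    if daily_writes_gb <= 0:
        raise ValueError("daily_writes_gb must be positive")
    if not 0 < retire_fraction <= 1:
        raise ValueError("retire_fraction must be in (0, 1]")
    budget_gb = rated_writes_gb * retire_fraction
    remaining_gb = budget_gb - written_so_far_gb
    # Already past the retirement point: zero days left.
    return max(0.0, remaining_gb / daily_writes_gb)


# Hypothetical drive rated for 100 TB of writes, 20 TB already written,
# taking 75 GB of writes per day.
print(days_until_retirement(100_000, 20_000, 75))
```

Staggering the retirement points of the members of a RAID volume follows from the same arithmetic: drives that received identical writes will reach the limit at about the same time.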

merlin



Re: [PERFORM] Intel SSDs that may not suck

2011-03-28 Thread Jesper Krogh

On 2011-03-29 06:13, Merlin Moncure wrote:

My own experience with MLC drives is that write cycle expectations are
more or less as advertised. They do go down (hard), and have to be
monitored. If you are writing a lot of data this can get pretty
expensive although the cost dynamics are getting better and better for
flash. I have no idea what would be precisely prudent, but maybe some
good monitoring tools and phased obsolescence at around 80% duty cycle
might not be a bad starting point.  With hard drives, you can kinda
wait for em to pop and swap em in -- this is NOT a good idea for flash
raid volumes.

What do you mean by "hard"? I have some in our setup, but
haven't seen anything hard just yet. Based on reports on the net,
they seem to slow writes down to next to nothing as they
get used, but that seems more graceful than old
rotating drives.  Can you elaborate a bit more?

--
Jesper
