Re: [GENERAL] SSDD reliability

2011-05-18 Thread Craig Ringer
On 05/19/2011 08:57 AM, Martin Gainty wrote: what is this talk about replicating your primary database to secondary nodes in the cloud... slow. You'd have to do async replication with unbounded slave lag. It'd also be very easy to get to the point where the load on the master meant that the

Re: Fwd: Re: [GENERAL] SSDD reliability

2011-05-18 Thread Toby Corkindale
On 19/05/11 10:50, mark wrote: Note 1: I have seen an array that was powered on continuously for about six years, which killed half the disks when it was finally powered down, left to cool for a few hours, then started up again. Recently we rebooted about 6 machines that had uptimes of 950+ d

Re: [GENERAL] SSDD reliability

2011-05-18 Thread Martin Gainty
> From: dvlh...@gmail.com > To: toby.corkind...@strategicdata.com.au; pgsql-general@postgresql.org > Subject: Re: Fwd: Re: [GENERAL] SSDD reliability > Date: Wed, 18 May 2011 18:50:28 -0600 > > > Note 1: > > I have seen an array that was powe

Re: Fwd: Re: [GENERAL] SSDD reliability

2011-05-18 Thread mark
> Note 1: > I have seen an array that was powered on continuously for about six > years, which killed half the disks when it was finally powered down, > left to cool for a few hours, then started up again. > Recently we rebooted about 6 machines that had uptimes of 950+ days. Last time fsck had

Re: Fwd: Re: [GENERAL] SSDD reliability

2011-05-11 Thread Toby Corkindale
BTW, I saw a news article today about a brand of SSD that was claiming to have the price effectiveness of MLC-type chips, but with lifetime of 4TB/day over 5 years. http://www.storagereview.com/anobit_unveils_genesis_mlc_enterprise_ssds which also links to: http://www.storagereview.com/sandfor

Re: Fwd: Re: [GENERAL] SSDD reliability

2011-05-05 Thread Toby Corkindale
On 05/05/11 18:36, Florian Weimer wrote: * Greg Smith: Intel claims their Annual Failure Rate (AFR) on their SSDs in IT deployments (not OEM ones) is 0.6%. Typical measured AFR for mechanical drives is around 2% during their first year, spiking to 5% afterwards. I suspect that Intel's n

Re: Fwd: Re: [GENERAL] SSDD reliability

2011-05-05 Thread Greg Smith
On 05/04/2011 08:31 PM, David Boreham wrote: Here's my best theory at present : the failures ARE caused by cell wear-out, but the SSD firmware is buggy in so far as it fails to boot up and respond to host commands due to the wear-out state. So rather than the expected outcome (SSD responds but

Re: [GENERAL] SSDD reliability

2011-05-05 Thread Scott Marlowe
On Thu, May 5, 2011 at 1:54 PM, Greg Smith wrote: > I think your faith in PC component manufacturing is out of touch with the > actual field failure rates for this stuff, which is produced with enormous > cost cutting pressure driving tolerances to the bleeding edge in many cases. >  The equipmen

Re: [GENERAL] SSDD reliability

2011-05-05 Thread Greg Smith
On 05/05/2011 10:35 AM, David Boreham wrote: On 5/5/2011 8:04 AM, Scott Ribe wrote: Actually, any of us who really tried could probably come up with a dozen examples--more if we've been around for a while. Original design cutting corners on power regulation; final manufacturers cutting corne

Re: [GENERAL] SSDD reliability

2011-05-05 Thread David Boreham
On 5/5/2011 8:04 AM, Scott Ribe wrote: Actually, any of us who really tried could probably come up with a dozen examples--more if we've been around for a while. Original design cutting corners on power regulation; final manufacturers cutting corners on specs; component manufacturers cutting c

Re: [GENERAL] SSDD reliability

2011-05-05 Thread Scott Ribe
On May 4, 2011, at 9:34 PM, David Boreham wrote: > So ok, yeah...I said that chips don't just keel over and die mid-life > and you came up with the one counterexample in the history of > the industry Actually, any of us who really tried could probably come up with a dozen examples--more if we've

Re: [GENERAL] SSDD reliability

2011-05-05 Thread David Boreham
On 5/4/2011 11:50 PM, Toby Corkindale wrote: In what way has the SMART read failed? (I get the relevant values out successfully myself, and have Munin graph them.) Mis-parse :) It was my _attempts_ to read SMART that failed. Specifically, I was able to read a table of numbers from the drive, b

Re: Fwd: Re: [GENERAL] SSDD reliability

2011-05-05 Thread David Boreham
On 5/5/2011 2:36 AM, Florian Weimer wrote: I'm a bit concerned with usage-dependent failures. Presumably, two SSDs in a RAID-1 configuration are worn down in the same way, and it would be rather inconvenient if they failed at the same point. With hard disks, this doesn't seem to happen; even

Re: Fwd: Re: [GENERAL] SSDD reliability

2011-05-05 Thread Florian Weimer
* Greg Smith: > Intel claims their Annual Failure Rate (AFR) on their SSDs in IT > deployments (not OEM ones) is 0.6%. Typical measured AFR for > mechanical drives is around 2% during their first year, spiking to 5% > afterwards. I suspect that Intel's numbers are actually much better > th

Re: [GENERAL] SSDD reliability

2011-05-04 Thread Toby Corkindale
On 05/05/11 03:31, David Boreham wrote: On 5/4/2011 11:15 AM, Scott Ribe wrote: Sigh... Step 2: paste link in ;-) To be honest, like the article author, I'd be happy with 300+ days to failure, IF the drives

Re: Fwd: Re: [GENERAL] SSDD reliability

2011-05-04 Thread Scott Marlowe
On Wed, May 4, 2011 at 9:34 PM, David Boreham wrote: > On 5/4/2011 9:06 PM, Scott Marlowe wrote: >> >> Most of it is.  But certain parts are fairly new, i.e. the >> controllers.  It is quite possible that all these various failing >> drives share some long term ~ 1 year degradation issue like the

Re: Fwd: Re: [GENERAL] SSDD reliability

2011-05-04 Thread David Boreham
On 5/4/2011 9:06 PM, Scott Marlowe wrote: Most of it is. But certain parts are fairly new, i.e. the controllers. It is quite possible that all these various failing drives share some long term ~ 1 year degradation issue like the 6Gb/s SAS ports on the early sandybridge Intel CPUs. If that's th

Re: Fwd: Re: [GENERAL] SSDD reliability

2011-05-04 Thread Scott Marlowe
On Wed, May 4, 2011 at 6:31 PM, David Boreham wrote: > > this). The technology and manufacturing processes are common across many > different types of product. They either all work, or they all fail. Most of it is. But certain parts are fairly new, i.e. the controllers. It is quite possible th

Re: Fwd: Re: [GENERAL] SSDD reliability

2011-05-04 Thread David Boreham
On 5/4/2011 6:02 PM, Greg Smith wrote: On 05/04/2011 03:24 PM, David Boreham wrote: So if someone says that SSDs have "failed", I'll assume that they suffered from Flash cell wear-out unless there is compelling proof to the contrary. I've been involved in four recovery situations similar to t

Re: Fwd: Re: [GENERAL] SSDD reliability

2011-05-04 Thread Greg Smith
On 05/04/2011 03:24 PM, David Boreham wrote: So if someone says that SSDs have "failed", I'll assume that they suffered from Flash cell wear-out unless there is compelling proof to the contrary. I've been involved in four recovery situations similar to the one described in that coding horror

Fwd: Re: [GENERAL] SSDD reliability

2011-05-04 Thread David Boreham
No problem with that, for a first step. ***BUT*** the failures in this article and many others I've read about are not in high-write db workloads, so they're not write wear, they're just crappy electronics failing. As a (lapsed) electronics design engineer, I'm suspicious of the notion that

Re: [GENERAL] SSDD reliability

2011-05-04 Thread Scott Ribe
On May 4, 2011, at 11:31 AM, David Boreham wrote: > To be honest, like the article author, I'd be happy with 300+ days to > failure, IF the drives provide an accurate predictor of impending doom. No problem with that, for a first step. ***BUT*** the failures in this article and many others I've

Re: [GENERAL] SSDD reliability

2011-05-04 Thread David Boreham
On 5/4/2011 11:15 AM, Scott Ribe wrote: Sigh... Step 2: paste link in ;-) To be honest, like the article author, I'd be happy with 300+ days to failure, IF the drives provide an accurate predictor of impend

Re: [GENERAL] SSDD reliability

2011-05-04 Thread Scott Ribe
On May 4, 2011, at 10:50 AM, Greg Smith wrote: > Your link didn't show up on this. Sigh... Step 2: paste link in ;-) -- Scott Ribe scott_r...@elevated-dev.com http://www.elevated-dev.com/ (303) 722-0567 voic

[GENERAL] SSDD reliability

2011-05-04 Thread Scott Ribe
Yeah, on that subject, anybody else see this: <> Absolutely pathetic. -- Scott Ribe scott_r...@elevated-dev.com http://www.elevated-dev.com/ (303) 722-0567 voice -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgres