Re: [ceph-users] SSD MTBF

2014-11-03 Thread Emmanuel Lacour
On Mon, Sep 29, 2014 at 10:31:03AM +0200, Emmanuel Lacour wrote:
> 
> Dear ceph users,
> 
> 
> we are managing ceph clusters since 1 year now. Our setup is typically
> made of Supermicro servers with OSD sata drives and journal on SSD.
> 
> Those SSD are all failing one after the other after one year :(
> 
> We used Samsung 850 pro (120Go) with two setup (small nodes with 2 ssd,
> 2 HD in 1U):
> 

s/850/840

A quick update on this: those SSDs continue to fail. We are replacing each
of them with an Intel S3700 and rebuilding the nodes with a different
partition layout (RAID only for the OS, one journal on each SSD,
over-provisioning).

We sent the Samsung SSDs back under warranty. It is very easy, and one week
later we received SSDs with the same S/N and SMART OK, but ... we tried to
put two of them back into service and they failed one day later. So sorry
for Samsung, but I definitely do not recommend using the 840 Pro on Ceph
clusters!


-- 
Easter-eggs  Spécialiste GNU/Linux
44-46 rue de l'Ouest  -  75014 Paris  -  France -  Métro Gaité
Phone: +33 (0) 1 43 35 00 37-   Fax: +33 (0) 1 43 35 00 76
mailto:elac...@easter-eggs.com  -   http://www.easter-eggs.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] SSD MTBF

2014-10-07 Thread Emmanuel Lacour
On Tue, Oct 07, 2014 at 05:24:40PM +0200, Martin B Nielsen wrote:
> 
>I don't disagree with the above - but the table assumes you'll wear out
>your SSD. Adjust the wear level and the price will change proportionally -
>if you're only writing 50-100TB/year per SSD then the value will swing
>heavily in favor of the cheaper consumer-grade SSDs. It is all about your
>estimated usage pattern and whether they're 'good enough' for your
>scenario or not (and/or whether you trust that vendor).
>In my experience Ceph seldom (if ever) maxes out the I/O of an SSD - CPU
>or network is much more likely to become the bottleneck first.
> 

I agree with this. In our case, the answer is the Intel S3700 100GB,
without any doubt :)




Re: [ceph-users] SSD MTBF

2014-10-07 Thread Martin B Nielsen
A bit late getting back on this one.

On Wed, Oct 1, 2014 at 5:05 PM, Christian Balzer  wrote:

> > smartctl states something like
> > Wear = 092%, Hours = 12883, Datawritten = 15321.83 TB avg on those. I
> > think that is ~30TB/day if I'm doing the calc right.
> >
> Something very much does not add up there.
> Either you've written 15321.83 GB on those drives, making it about
> 30GB/day and well within the Samsung specs, or you've written 10-20 times
> the expected TBW level of those drives...
>

My bad, I forgot to say that the Wear indicator here (92%) is sorta
backwards - it means the drive still has 92% to go before reaching its
expected TBW limit.

I agree with what Massimiliano Cuttini wrote later as well - if your
expected writes stay well within the drive's rated TBW over its lifetime, I
see no reason to go for more expensive disks. Just monitor for wear and
have a few in stock ready for replacement.
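
One rough way to act on "monitor for wear" is to project the remaining
rated life from the two SMART figures quoted above (92% left after 12883
power-on hours). This is only a sketch: it assumes wear grows linearly with
power-on hours, and the function names and the 20% replacement threshold
are made up for illustration.

```python
def projected_remaining_hours(remaining_pct: float, power_on_hours: float) -> float:
    """Linear projection of the hours left before the wear indicator hits 0%.

    remaining_pct is the "backwards" indicator described above:
    100 means new, 0 means the rated TBW has been consumed.
    """
    used_pct = 100.0 - remaining_pct
    if used_pct <= 0:
        raise ValueError("no measurable wear yet, cannot project")
    hours_per_pct = power_on_hours / used_pct
    return remaining_pct * hours_per_pct


def needs_replacement(remaining_pct: float, threshold_pct: float = 20.0) -> bool:
    """Hypothetical alert rule: flag the drive once little rated life is left."""
    return remaining_pct <= threshold_pct


hours_left = projected_remaining_hours(92, 12883)
print(f"about {hours_left / 24 / 365:.1f} years of rated life left")
print("replace now:", needs_replacement(92))
```

A projection like this only covers rated wear; it says nothing about drives
that die for other reasons.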

Regarding the table of ssd and vendors:
Brand    Model           TBW    €    €/TBW
Intel    S3500 120GB      70  122     1.74
Intel    S3500 240GB     140  225     1.60
Intel    S3700 100GB    1873  220     0.11
Intel    S3700 200GB    3737  400     0.10
Samsung  840 Pro 120GB    70  120     1.71

I don't disagree with the above - but the table assumes you'll wear out
your SSD. Adjust the wear level and the price will change proportionally -
if you're only writing 50-100TB/year per SSD then the value will swing
heavily in favor of the cheaper consumer-grade SSDs. It is all about your
estimated usage pattern and whether they're 'good enough' for your scenario
or not (and/or whether you trust that vendor).
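
To make the "depends on your usage pattern" point concrete, here is a small
sketch that divides the purchase price by the years a drive is actually
useful. The prices and TBW values are the ones from the table above; the
5-year deployment cap is the planning horizon Christian mentioned, and the
function names and the sample write rates are my own assumptions.

```python
def years_until_worn(tbw: float, tb_written_per_year: float) -> float:
    """Years until the rated TBW is consumed at a constant write rate."""
    return tbw / tb_written_per_year


def cost_per_service_year(price_eur: float, tbw: float,
                          tb_written_per_year: float,
                          deployment_years: float = 5.0) -> float:
    """Price divided by the years the drive is actually useful.

    Rated life beyond the planned deployment is not counted as extra value.
    """
    useful_years = min(years_until_worn(tbw, tb_written_per_year), deployment_years)
    return price_eur / useful_years


for tb_per_year in (10, 50, 100):
    consumer = cost_per_service_year(120, 70, tb_per_year)      # Samsung 840 Pro 120GB
    datacenter = cost_per_service_year(220, 1873, tb_per_year)  # Intel S3700 100GB
    print(f"{tb_per_year:>3} TB/year: consumer {consumer:6.2f} EUR/year, "
          f"datacenter {datacenter:6.2f} EUR/year")
```

With the table's 70 TBW figure for the 120GB model, the consumer drive is
cheaper per year at 10 TB/year but not at 50 TB/year; a larger or
higher-rated consumer drive moves that crossover.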

In my experience Ceph seldom (if ever) maxes out the I/O of an SSD - CPU or
network is much more likely to become the bottleneck first.

Cheers,
Martin


>
> In the article I mentioned previously:
>
> http://www.anandtech.com/show/8239/update-on-samsung-850-pro-endurance-vnand-die-size
>
> The author clearly comes with a relationship of durability versus SSD
> size, as one would expect. But the Samsung homepage just stated 150TBW,
> for all those models...
>
> Christian
>
> > Not to advertise or say every samsung 840 ssd is like this:
> > http://www.vojcik.net/samsung-ssd-840-endurance-destruct-test/
> >
> Seen it before, but I have a feeling that this test doesn't quite put the
> same strain on the poor NANDs as Emmanuel's environment.
>
> Christian
>
> > Cheers,
> > Martin
> >
> >
> > On Wed, Oct 1, 2014 at 10:18 AM, Christian Balzer  wrote:
> >
> > > On Wed, 1 Oct 2014 09:28:12 +0200 Kasper Dieter wrote:
> > >
> > > > On Tue, Sep 30, 2014 at 04:38:41PM +0200, Mark Nelson wrote:
> > > > > On 09/29/2014 03:58 AM, Dan Van Der Ster wrote:
> > > > > > Hi Emmanuel,
> > > > > > > This is interesting, because we've had sales guys telling us that
> > > > > > those Samsung drives are definitely the best for a Ceph journal
> > > > > > O_o !
> > > > >
> > > > > Our sales guys or Samsung sales guys?  :)  If it was ours, let me
> > > > > know.
> > > > >
> > > > > > The conventional wisdom has been to use the Intel DC S3700
> > > > > > because of its massive durability.
> > > > >
> > > > > The S3700 is definitely one of the better drives on the market for
> > > > > Ceph journals.  Some of the higher end PCIE SSDs have pretty high
> > > > > durability (and performance) as well, but cost more (though you can
> > > > > save SAS bay space, so it's a trade-off).
> > > > Intel P3700 could be an alternative with 10 Drive-Writes/Day for 5
> > > > years (see attachment)
> > > >
> > > They're certainly nice and competitively priced (TBW/$ wise at least).
> > > However as I said in another thread, once your SSDs start to outlive
> > > your planned server deployment time (in our case 5 years) that's
> > > probably good enough.
> > >
> > > It's all about finding the balance between cost, speed (BW and IOPS),
> > > durability and space.
> > >
> > > For example I'm currently building a cluster based on 2U, 12 hotswap
> > > bays servers (because I already had 2 floating around) and am using 4
> > > > 100GB DC S3700 (at US$200 each) and 8 HDDs in them.
> > > > Putting in a 400GB DC P3700 (US$1200) instead and 4 more HDDs would
> > > have pushed me over the budget and left me with a less than 30% "used"
> > > SSD 5 years later, at a time when we clearly can expect these things
> > > to be massively faster and cheaper.
> > >
> > > Now if you're actually having a cluster that would wear out a P3700 in
> > > 5 years (or you're planning to run your machines until they burst into
> > > flames), then that's another story. ^.^
> > >
> > > Christian
> > >
> > > > -Dieter
> > > >
> > > > >
> > > > > >
> > > > > > > Anyway, I'm curious what do the SMART counters say on your SSDs:
> > > > > > are they really failing due to worn out P/E cycles or is it
> > > > > > something else?
> > > > > >
> > > > > > Cheers, Dan
> > > > > >
> > > > > >
> > > > > >> On 29 Sep 2014, at 10:31, Emmanuel Lacour
> > > > > >>  wrote:
> > > > > >>
> > > > > >>
> > > > > >> Dear ceph users,
> > > > > >>
> > > > > >>
> > > > > >> we

Re: [ceph-users] SSD MTBF

2014-10-02 Thread Emmanuel Lacour
On 02/10/2014 17:50, Massimiliano Cuttini wrote:
> I don't think this is true.
> 
> If you have a SSD disk of 60Gb or 100GB then your TBW/day is really
> limited (the disk is small then will wrote always on same sectors).
> The bigger is the SSD the longer will be alive, you have limited write
> per day then if your disk is bigger you have more sectors to use.
> 
> Expecting more than 1year from a SSD as small as 100Gb is really much.
> Just think that the same SSD of 1TB then will be 10 years longer with
> the same usage.
> If what Emmanuel said is real then consumer SSD are the way (10years on
> a 1Tb disk).
> Then your score it's good (almost amazing) and not bad.
> 
> Just switch to bigger SSD and you'll solve all your problems.
> 
> 

Look at this table, built from the vendors' specs:

Brand    Model           TBW    €    €/TBW
Intel    S3500 120GB      70  122     1.74
Intel    S3500 240GB     140  225     1.60
Intel    S3700 100GB    1873  220     0.11
Intel    S3700 200GB    3737  400     0.10
Samsung  840 Pro 120GB    70  120     1.71


Of course a bigger SSD allows more over-provisioning, but in terms of
€/TBW ...
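
For anyone who wants to redo the table with their own prices, the last
column is simply price divided by TBW. The sketch below reproduces it; the
helper names are mine, and truncating (rather than rounding) to two
decimals is an assumption that happens to match the figures posted here.

```python
import math

# (brand, model, rated TBW in TB, price in EUR), as posted in this thread
DRIVES = [
    ("Intel", "S3500 120GB", 70, 122),
    ("Intel", "S3500 240GB", 140, 225),
    ("Intel", "S3700 100GB", 1873, 220),
    ("Intel", "S3700 200GB", 3737, 400),
    ("Samsung", "840 Pro 120GB", 70, 120),
]


def cost_per_tbw(price_eur: float, tbw: float) -> float:
    """Euros paid per terabyte of rated write endurance."""
    return price_eur / tbw


def truncate2(value: float) -> float:
    """Cut to two decimals without rounding up."""
    return math.floor(value * 100) / 100


for brand, model, tbw, price in DRIVES:
    ratio = truncate2(cost_per_tbw(price, tbw))
    print(f"{brand:<9}{model:<15}{tbw:>6}{price:>6}{ratio:>8.2f}")
```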




Re: [ceph-users] SSD MTBF

2014-10-02 Thread Emmanuel Lacour
On 02/10/2014 17:58, Adam Boyhan wrote:
> What about the Intel DC S3500 instead of the DC S3700?  
> 

A factor of 10 or more in supported TBW between the two. See the small
analysis I made previously in this thread regarding the cost per TBW.





Re: [ceph-users] SSD MTBF

2014-10-02 Thread Adam Boyhan
What about the Intel DC S3500 instead of the DC S3700? 



- Original Message -

From: "Emmanuel Lacour"  
To: ceph-users@lists.ceph.com 
Sent: Thursday, October 2, 2014 11:48:26 AM 
Subject: Re: [ceph-users] SSD MTBF 

On 02/10/2014 17:14, Ron Allred wrote:
> One thing being missed, 
> 
> Samsung 850 Pro has only been available for about 1-2 months. 
> 
> The OP, noted that drives are failing after approx 1 year. This would 
> probably mean the SSDs are actually Samsung 840 Pro. The 
> write-durabilities of 850 and 840 are quite different. 
> 

Yes, mistake on my side, it was of course the 840 Pro.


> 
> You should be looking at Intel DC37xx, OCZ Intrepid 3800, HGST, etc. 
> Samsung recently released the 845DC (PRO/EVO) aimed at datacenters. 
> These have decent TBW specs, but not very much is known about them in 
> real-use yet. 
> 
> Spend a full day reading storagesearch.com, it can save you THOUSANDS of 
> dollars, when selecting an SSD for Datacenter use. 
> 


Thanks for the advice :)




Re: [ceph-users] SSD MTBF

2014-10-02 Thread Massimiliano Cuttini

I don't think this is true.

If you have an SSD of 60GB or 100GB, then the TBW/day it can sustain is
really limited (the disk is small, so it will always write to the same
sectors).
The bigger the SSD, the longer it will stay alive: for a given amount
written per day, a bigger disk has more sectors to spread the writes over.


Expecting more than 1 year from an SSD as small as 100GB is asking a lot.
Just think that the same SSD at 1TB would last 10 times longer with
the same usage.
If what Emmanuel said is accurate, then consumer SSDs are the way to go
(10 years on a 1TB disk).

So your result is good (almost amazing), not bad.

Just switch to bigger SSDs and you'll solve all your problems.
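
The proportionality claimed here can be checked against the vendor figures
posted elsewhere in this thread: within one drive family, rated TBW per GB
of capacity is close to constant. The sketch below is only illustrative;
the helper names are mine, and 100 GB/day is the higher of the two write
rates from the original post.

```python
# (capacity in GB, rated TBW in TB), from the vendor figures posted in this thread
SPECS = {
    "Intel S3500": [(120, 70), (240, 140)],
    "Intel S3700": [(100, 1873), (200, 3737)],
}


def tbw_per_gb(capacity_gb: float, tbw: float) -> float:
    """Rated endurance per GB of capacity."""
    return tbw / capacity_gb


def years_of_life(tbw: float, gb_written_per_day: float) -> float:
    """Years until the rated TBW is consumed at a constant daily write volume."""
    return tbw * 1000 / (gb_written_per_day * 365)


for family, models in SPECS.items():
    for capacity_gb, tbw in models:
        print(f"{family} {capacity_gb}GB: "
              f"{tbw_per_gb(capacity_gb, tbw):.2f} TBW per GB, "
              f"{years_of_life(tbw, 100):.1f} years at 100 GB/day")
```

So doubling the capacity does roughly double the rated life within a
family, but the gap between families is far larger than the gap between
sizes.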



On 02/10/2014 17:14, Ron Allred wrote:

One thing being missed,

Samsung 850 Pro has only been available for about 1-2 months.

The OP, noted that drives are failing after approx 1 year.  This would 
probably mean the SSDs are actually Samsung 840 Pro.  The 
write-durabilities of 850 and 840 are quite different.



That being said, Samsung 8X0 Pros are desktop drives.  Only "Data 
Center" grade SSDs should be used with Ceph, with decent TBW/day >= 5 
years.


You should be looking at Intel DC37xx, OCZ Intrepid 3800, HGST, etc.  
Samsung recently released the 845DC (PRO/EVO) aimed at datacenters.  
These have decent TBW specs, but not very much is known about them in 
real-use yet.


Spend a full day reading storagesearch.com, it can save you THOUSANDS 
of dollars, when selecting an SSD for Datacenter use.


Regards,
Ron

On 09/29/2014 02:31 AM, Emmanuel Lacour wrote:

Dear ceph users,


we are managing ceph clusters since 1 year now. Our setup is typically
made of Supermicro servers with OSD sata drives and journal on SSD.

Those SSD are all failing one after the other after one year :(

We used Samsung 850 pro (120Go) with two setup (small nodes with 2 ssd,
2 HD in 1U):

1) raid 1 :( (bad idea, each SSD support all the OSDs journals writes :()

2) raid 1 for OS (nearly no writes) and dedicated partition for journals
   (one per OSD)


I'm convinced that the second setup is better and we migrate old setup
to this one.

Though, statistics give 60GB (option 2) to 100GB (option 1) of writes per
day on the SSDs, on a cluster that is not really overloaded. Samsung claims
to give a 5-year warranty if under 40GB/day. Those numbers seem very low to
me.

What are your experiences on this? What write volumes do you encounter,
on wich SSD models, which setup and what MTBF?






Re: [ceph-users] SSD MTBF

2014-10-02 Thread Emmanuel Lacour
On 02/10/2014 17:14, Ron Allred wrote:
> One thing being missed,
> 
> Samsung 850 Pro has only been available for about 1-2 months.
> 
> The OP, noted that drives are failing after approx 1 year.  This would
> probably mean the SSDs are actually Samsung 840 Pro.  The
> write-durabilities of 850 and 840 are quite different.
>

Yes, mistake on my side, it was of course the 840 Pro.


> 
> You should be looking at Intel DC37xx, OCZ Intrepid 3800, HGST, etc. 
> Samsung recently released the 845DC (PRO/EVO) aimed at datacenters. 
> These have decent TBW specs, but not very much is known about them in
> real-use yet.
> 
> Spend a full day reading storagesearch.com, it can save you THOUSANDS of
> dollars, when selecting an SSD for Datacenter use.
> 


Thanks for the advice :)




Re: [ceph-users] SSD MTBF

2014-10-02 Thread Ron Allred

One thing being missed,

Samsung 850 Pro has only been available for about 1-2 months.

The OP noted that the drives are failing after approx. 1 year.  This would
probably mean the SSDs are actually Samsung 840 Pro.  The
write durabilities of the 850 and 840 are quite different.



That being said, Samsung 8X0 Pros are desktop drives.  Only "Data
Center" grade SSDs should be used with Ceph, with a decent TBW/day rating
sustained for >= 5 years.


You should be looking at Intel DC37xx, OCZ Intrepid 3800, HGST, etc.  
Samsung recently released the 845DC (PRO/EVO) aimed at datacenters.  
These have decent TBW specs, but not very much is known about them in 
real-use yet.


Spend a full day reading storagesearch.com, it can save you THOUSANDS of 
dollars, when selecting an SSD for Datacenter use.


Regards,
Ron

On 09/29/2014 02:31 AM, Emmanuel Lacour wrote:

Dear ceph users,


we are managing ceph clusters since 1 year now. Our setup is typically
made of Supermicro servers with OSD sata drives and journal on SSD.

Those SSD are all failing one after the other after one year :(

We used Samsung 850 pro (120Go) with two setup (small nodes with 2 ssd,
2 HD in 1U):

1) raid 1 :( (bad idea, each SSD support all the OSDs journals writes :()
2) raid 1 for OS (nearly no writes) and dedicated partition for journals
   (one per OSD)


I'm convinced that the second setup is better and we migrate old setup
to this one.

Though, statistics give 60GB (option 2) to 100GB (option 1) of writes per
day on the SSDs, on a cluster that is not really overloaded. Samsung claims
to give a 5-year warranty if under 40GB/day. Those numbers seem very low to
me.
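
As a rough check of those figures, the warranty condition can be turned
into a total write budget and compared with the observed daily volumes.
This is a sketch under the assumption that the budget is simply 40GB/day
over 5 years with decimal units; the function names are made up.

```python
def warranty_write_budget_tb(gb_per_day: float, years: float) -> float:
    """Total writes (TB) implied by a daily allowance over the warranty period."""
    return gb_per_day * 365 * years / 1000


def years_to_exhaust(budget_tb: float, gb_per_day: float) -> float:
    """Years until the budget is used up at a constant daily write volume."""
    return budget_tb * 1000 / (gb_per_day * 365)


budget = warranty_write_budget_tb(40, 5)
print(f"implied budget: {budget:.0f} TB")
for daily_gb in (60, 100):
    print(f"{daily_gb} GB/day uses it up in {years_to_exhaust(budget, daily_gb):.2f} years")
```

On these numbers the budget would last roughly 2 to 3.3 years, so the
observed write volumes alone do not obviously account for failures after
about one year.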

What are your experiences on this? What write volumes do you encounter,
on wich SSD models, which setup and what MTBF?






Re: [ceph-users] SSD MTBF

2014-10-01 Thread Christian Balzer

Hello,

On Wed, 1 Oct 2014 13:31:38 +0200 Martin B Nielsen wrote:

> Hi,
> 
> We settled on Samsung pro 840 240GB drives 1½ year ago and we've been
> happy so far. We've over-provisioned them a lot (left 120GB
> unpartitioned).
> 
> We have 16x 240GB and 32x 500GB - we've lost 1x 500GB so far.
> 
> smartctl states something like
> Wear = 092%, Hours = 12883, Datawritten = 15321.83 TB avg on those. I
> think that is ~30TB/day if I'm doing the calc right.
>
Something very much does not add up there.
Either you've written 15321.83 GB on those drives, making it about
30GB/day and well within the Samsung specs, or you've written 10-20 times
the expected TBW level of those drives...
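
For reference, the arithmetic behind the two readings is just the total
written divided by the days of operation. A minimal sketch, with a function
name of my own choosing; the result carries whatever unit the
"Datawritten" figure really is.

```python
def average_daily_writes(total_written: float, power_on_hours: float) -> float:
    """Average amount written per day, in the same unit as total_written."""
    days = power_on_hours / 24.0
    return total_written / days


rate = average_daily_writes(15321.83, 12883)
print(f"{rate:.1f} per day (GB or TB, depending on the unit of the counter)")
```

That comes out at roughly 28.5 per day, i.e. close to 30GB/day if the
counter is in GB and close to 30TB/day if it really were in TB.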

In the article I mentioned previously:
http://www.anandtech.com/show/8239/update-on-samsung-850-pro-endurance-vnand-die-size

The author clearly derives a relationship between durability and SSD
size, as one would expect. But the Samsung homepage just stated 150TBW
for all those models...

Christian

> Not to advertise or say every samsung 840 ssd is like this:
> http://www.vojcik.net/samsung-ssd-840-endurance-destruct-test/
>
Seen it before, but I have a feeling that this test doesn't quite put the
same strain on the poor NANDs as Emmanuel's environment. 
 
Christian

> Cheers,
> Martin
> 
> 
> On Wed, Oct 1, 2014 at 10:18 AM, Christian Balzer  wrote:
> 
> > On Wed, 1 Oct 2014 09:28:12 +0200 Kasper Dieter wrote:
> >
> > > On Tue, Sep 30, 2014 at 04:38:41PM +0200, Mark Nelson wrote:
> > > > On 09/29/2014 03:58 AM, Dan Van Der Ster wrote:
> > > > > Hi Emmanuel,
> > > > > This is interesting, because we've had sales guys telling us that
> > > > > those Samsung drives are definitely the best for a Ceph journal
> > > > > O_o !
> > > >
> > > > Our sales guys or Samsung sales guys?  :)  If it was ours, let me
> > > > know.
> > > >
> > > > > The conventional wisdom has been to use the Intel DC S3700
> > > > > because of its massive durability.
> > > >
> > > > The S3700 is definitely one of the better drives on the market for
> > > > Ceph journals.  Some of the higher end PCIE SSDs have pretty high
> > > > durability (and performance) as well, but cost more (though you can
> > > > save SAS bay space, so it's a trade-off).
> > > Intel P3700 could be an alternative with 10 Drive-Writes/Day for 5
> > > years (see attachment)
> > >
> > They're certainly nice and competitively priced (TBW/$ wise at least).
> > However as I said in another thread, once your SSDs start to outlive
> > your planned server deployment time (in our case 5 years) that's
> > probably good enough.
> >
> > It's all about finding the balance between cost, speed (BW and IOPS),
> > durability and space.
> >
> > For example I'm currently building a cluster based on 2U, 12 hotswap
> > bays servers (because I already had 2 floating around) and am using 4
> > > 100GB DC S3700 (at US$200 each) and 8 HDDs in them.
> > > Putting in a 400GB DC P3700 (US$1200) instead and 4 more HDDs would
> > have pushed me over the budget and left me with a less than 30% "used"
> > SSD 5 years later, at a time when we clearly can expect these things
> > to be massively faster and cheaper.
> >
> > Now if you're actually having a cluster that would wear out a P3700 in
> > 5 years (or you're planning to run your machines until they burst into
> > flames), then that's another story. ^.^
> >
> > Christian
> >
> > > -Dieter
> > >
> > > >
> > > > >
> > > > > Anyway, I'm curious what do the SMART counters say on your SSDs:
> > > > > are they really failing due to worn out P/E cycles or is it
> > > > > something else?
> > > > >
> > > > > Cheers, Dan
> > > > >
> > > > >
> > > > >> On 29 Sep 2014, at 10:31, Emmanuel Lacour
> > > > >>  wrote:
> > > > >>
> > > > >>
> > > > >> Dear ceph users,
> > > > >>
> > > > >>
> > > > >> we are managing ceph clusters since 1 year now. Our setup is
> > > > >> typically made of Supermicro servers with OSD sata drives and
> > > > >> journal on SSD.
> > > > >>
> > > > >> Those SSD are all failing one after the other after one year :(
> > > > >>
> > > > >> We used Samsung 850 pro (120Go) with two setup (small nodes
> > > > >> with 2 ssd, 2 HD in 1U):
> > > > >>
> > > > >> 1) raid 1 :( (bad idea, each SSD support all the OSDs journals
> > > > >> writes :() 2) raid 1 for OS (nearly no writes) and dedicated
> > > > >> partition for journals (one per OSD)
> > > > >>
> > > > >>
> > > > >> I'm convinced that the second setup is better and we migrate old
> > > > >> setup to this one.
> > > > >>
> > > > >> Thought, statistics gives 60GB (option 2) to 100 GB (option 1)
> > > > >> writes per day on SSD on a not really over loaded cluster.
> > > > >> Samsung claims to give 5 years warranty if under 40GB/day.
> > > > >> Those numbers seems very low to me.
> > > > >>
> > > > >> What are your experiences on this? What write volumes do you
> > > > >> encounter, on wich SSD models, which setup and what MTBF?
> > > > >>
> > > > >>
> > > > >> --
> > > > >> Easter-eggs 

Re: [ceph-users] SSD MTBF

2014-10-01 Thread Emmanuel Lacour
On Wed, Oct 01, 2014 at 01:31:38PM +0200, Martin B Nielsen wrote:
>Hi,
> 
>We settled on Samsung pro 840 240GB drives 1½ year ago and we've been
>happy so far. We've over-provisioned them a lot (left 120GB
>unpartitioned).
> 
>We have 16x 240GB and 32x 500GB - we've lost 1x 500GB so far.
>smartctl states something like
>Wear = 092%, Hours = 12883, Datawritten = 15321.83 TB avg on those. I
>think that is ~30TB/day if I'm doing the calc right.
> 
>Not to advertise or say every samsung 840 ssd is like this:
>[1]http://www.vojcik.net/samsung-ssd-840-endurance-destruct-test/
> 

I just returned 3 dead SSDs and am waiting for Samsung's feedback ;)

(another one died yesterday)




Re: [ceph-users] SSD MTBF

2014-10-01 Thread Martin B Nielsen
Hi,

We settled on Samsung 840 Pro 240GB drives 1½ years ago and we've been happy
so far. We've over-provisioned them a lot (left 120GB unpartitioned).

We have 16x 240GB and 32x 500GB - we've lost 1x 500GB so far.

smartctl states something like
Wear = 092%, Hours = 12883, Datawritten = 15321.83 TB avg on those. I think
that is ~30TB/day if I'm doing the calc right.

Not to advertise or say every samsung 840 ssd is like this:
http://www.vojcik.net/samsung-ssd-840-endurance-destruct-test/

Cheers,
Martin


On Wed, Oct 1, 2014 at 10:18 AM, Christian Balzer  wrote:

> On Wed, 1 Oct 2014 09:28:12 +0200 Kasper Dieter wrote:
>
> > On Tue, Sep 30, 2014 at 04:38:41PM +0200, Mark Nelson wrote:
> > > On 09/29/2014 03:58 AM, Dan Van Der Ster wrote:
> > > > Hi Emmanuel,
> > > > This is interesting, because we've had sales guys telling us that
> > > > those Samsung drives are definitely the best for a Ceph journal O_o !
> > >
> > > Our sales guys or Samsung sales guys?  :)  If it was ours, let me know.
> > >
> > > > The conventional wisdom has been to use the Intel DC S3700 because
> > > > of its massive durability.
> > >
> > > The S3700 is definitely one of the better drives on the market for
> > > Ceph journals.  Some of the higher end PCIE SSDs have pretty high
> > > durability (and performance) as well, but cost more (though you can
> > > save SAS bay space, so it's a trade-off).
> > Intel P3700 could be an alternative with 10 Drive-Writes/Day for 5 years
> > (see attachment)
> >
> They're certainly nice and competitively priced (TBW/$ wise at least).
> However as I said in another thread, once your SSDs start to outlive your
> planned server deployment time (in our case 5 years) that's probably good
> enough.
>
> It's all about finding the balance between cost, speed (BW and IOPS),
> durability and space.
>
> For example I'm currently building a cluster based on 2U, 12 hotswap bays
> servers (because I already had 2 floating around) and am using 4 100GB DC
> S3700 (at US$200 each) and 8 HDDs in them.
> Putting in a 400GB DC P3700 (US$1200) instead and 4 more HDDs would have
> pushed me over the budget and left me with a less than 30% "used" SSD 5
> years later, at a time when we clearly can expect these things to be
> massively faster and cheaper.
>
> Now if you're actually having a cluster that would wear out a P3700 in 5
> years (or you're planning to run your machines until they burst into
> flames), then that's another story. ^.^
>
> Christian
>
> > -Dieter
> >
> > >
> > > >
> > > > Anyway, I'm curious what do the SMART counters say on your SSDs:
> > > > are they really failing due to worn out P/E cycles or is it
> > > > something else?
> > > >
> > > > Cheers, Dan
> > > >
> > > >
> > > >> On 29 Sep 2014, at 10:31, Emmanuel Lacour 
> > > >> wrote:
> > > >>
> > > >>
> > > >> Dear ceph users,
> > > >>
> > > >>
> > > >> we are managing ceph clusters since 1 year now. Our setup is
> > > >> typically made of Supermicro servers with OSD sata drives and
> > > >> journal on SSD.
> > > >>
> > > >> Those SSD are all failing one after the other after one year :(
> > > >>
> > > >> We used Samsung 850 pro (120Go) with two setup (small nodes with 2
> > > >> ssd, 2 HD in 1U):
> > > >>
> > > >> 1) raid 1 :( (bad idea, each SSD support all the OSDs journals
> > > >> writes :() 2) raid 1 for OS (nearly no writes) and dedicated
> > > >> partition for journals (one per OSD)
> > > >>
> > > >>
> > > >> I'm convinced that the second setup is better and we migrate old
> > > >> setup to this one.
> > > >>
> > > >> Thought, statistics gives 60GB (option 2) to 100 GB (option 1)
> > > >> writes per day on SSD on a not really over loaded cluster. Samsung
> > > >> claims to give 5 years warranty if under 40GB/day. Those numbers
> > > >> seems very low to me.
> > > >>
> > > >> What are your experiences on this? What write volumes do you
> > > >> encounter, on wich SSD models, which setup and what MTBF?
> > > >>
> > > >>
> > > >> --
> > > >> Easter-eggs  Spécialiste GNU/Linux
> > > >> 44-46 rue de l'Ouest  -  75014 Paris  -  France -  Métro Gaité
> > > >> Phone: +33 (0) 1 43 35 00 37-   Fax: +33 (0) 1 43 35 00 76
> > > >> mailto:elac...@easter-eggs.com  -   http://www.easter-eggs.com
> > > >> ___
> > > >> ceph-users mailing list
> > > >> ceph-users@lists.ceph.com
> > > >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > > >
> > > > ___
> > > > ceph-users mailing list
> > > > ceph-users@lists.ceph.com
> > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > > >
> > >
> > > ___
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
> --
> Christian Balzer        Network/Systems Engineer
> ch...@gol.com   Global OnLine Japan/Fusion Communications
> http://www.gol.com/
> ___

Re: [ceph-users] SSD MTBF

2014-10-01 Thread Christian Balzer
On Wed, 1 Oct 2014 09:28:12 +0200 Kasper Dieter wrote:

> On Tue, Sep 30, 2014 at 04:38:41PM +0200, Mark Nelson wrote:
> > On 09/29/2014 03:58 AM, Dan Van Der Ster wrote:
> > > Hi Emmanuel,
> > > This is interesting, because we've had sales guys telling us that
> > > those Samsung drives are definitely the best for a Ceph journal O_o !
> > 
> > Our sales guys or Samsung sales guys?  :)  If it was ours, let me know.
> > 
> > > The conventional wisdom has been to use the Intel DC S3700 because
> > > of its massive durability.
> > 
> > The S3700 is definitely one of the better drives on the market for
> > Ceph journals.  Some of the higher end PCIE SSDs have pretty high
> > durability (and performance) as well, but cost more (though you can
> > save SAS bay space, so it's a trade-off).
> Intel P3700 could be an alternative with 10 Drive-Writes/Day for 5 years
> (see attachment)
> 
They're certainly nice and competitively priced (TBW/$ wise at least). 
However as I said in another thread, once your SSDs start to outlive your
planned server deployment time (in our case 5 years) that's probably good
enough.

It's all about finding the balance between cost, speed (BW and IOPS),
durability and space.

For example I'm currently building a cluster based on 2U, 12 hotswap bays
servers (because I already had 2 floating around) and am using 4 100GB DC
S3700 (at US$200 each) and 8 HDDs in them.
Putting in a 400GB DC P3700 (US$1200) instead and 4 more HDDs would have
pushed me over the budget and left me with a less than 30% "used" SSD 5
years later, at a time when we clearly can expect these things to be
massively faster and cheaper.
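
To put a number on "less than 30% used": taking the 10 drive writes per
day for 5 years that Dieter quoted for the P3700, and applying it to the
400GB model, gives the rated endurance and the daily write volume that
would correspond to 30% wear. This is a back-of-the-envelope sketch; the
names are made up and decimal units are assumed.

```python
DAYS_IN_5_YEARS = 365 * 5

# 10 drive writes per day on a 400GB drive for 5 years, expressed in TB.
P3700_400GB_TBW = 400 * 10 * DAYS_IN_5_YEARS / 1000


def wear_fraction(tb_written_per_day: float, tbw: float,
                  days: int = DAYS_IN_5_YEARS) -> float:
    """Share of the rated endurance consumed after the given number of days."""
    return tb_written_per_day * days / tbw


def max_daily_writes_tb(used_fraction: float, tbw: float,
                        days: int = DAYS_IN_5_YEARS) -> float:
    """Daily write volume (TB) that consumes used_fraction of the TBW in that time."""
    return used_fraction * tbw / days


print(f"rated endurance: {P3700_400GB_TBW:.0f} TB")
print(f"30% wear after 5 years means about "
      f"{max_daily_writes_tb(0.30, P3700_400GB_TBW):.1f} TB written per day")
```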

Now if you actually have a cluster that would wear out a P3700 in 5
years (or you're planning to run your machines until they burst into
flames), then that's another story. ^.^

Christian

> -Dieter
> 
> > 
> > >
> > > Anyway, I'm curious what do the SMART counters say on your SSDs:
> > > are they really failing due to worn out P/E cycles or is it
> > > something else?
> > >
> > > Cheers, Dan
> > >
> > >
> > >> On 29 Sep 2014, at 10:31, Emmanuel Lacour 
> > >> wrote:
> > >>
> > >>
> > >> Dear ceph users,
> > >>
> > >>
> > >> we are managing ceph clusters since 1 year now. Our setup is
> > >> typically made of Supermicro servers with OSD sata drives and
> > >> journal on SSD.
> > >>
> > >> Those SSD are all failing one after the other after one year :(
> > >>
> > >> We used Samsung 850 pro (120Go) with two setup (small nodes with 2
> > >> ssd, 2 HD in 1U):
> > >>
> > >> 1) raid 1 :( (bad idea, each SSD support all the OSDs journals
> > >> writes :() 2) raid 1 for OS (nearly no writes) and dedicated
> > >> partition for journals (one per OSD)
> > >>
> > >>
> > >> I'm convinced that the second setup is better and we migrate old
> > >> setup to this one.
> > >>
> > >> Thought, statistics gives 60GB (option 2) to 100 GB (option 1)
> > >> writes per day on SSD on a not really over loaded cluster. Samsung
> > >> claims to give 5 years warranty if under 40GB/day. Those numbers
> > >> seems very low to me.
> > >>
> > >> What are your experiences on this? What write volumes do you
> > >> encounter, on wich SSD models, which setup and what MTBF?
> > >>
> > >>


-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Global OnLine Japan/Fusion Communications
http://www.gol.com/


Re: [ceph-users] SSD MTBF

2014-10-01 Thread Kasper Dieter
On Tue, Sep 30, 2014 at 04:38:41PM +0200, Mark Nelson wrote:
> On 09/29/2014 03:58 AM, Dan Van Der Ster wrote:
> > Hi Emmanuel,
> > This is interesting, because we've had sales guys telling us that those 
> > Samsung drives are definitely the best for a Ceph journal O_o !
> 
> Our sales guys or Samsung sales guys?  :)  If it was ours, let me know.
> 
> > The conventional wisdom has been to use the Intel DC S3700 because of its 
> > massive durability.
> 
> The S3700 is definitely one of the better drives on the market for Ceph 
> journals.  Some of the higher end PCIE SSDs have pretty high durability 
> (and performance) as well, but cost more (though you can save SAS bay 
> space, so it's a trade-off).
Intel P3700 could be an alternative with 10 Drive-Writes/Day for 5 years (see 
attachment)

-Dieter

> 
> >
> > Anyway, I'm curious what do the SMART counters say on your SSDs?… are they 
> > really failing due to worn out P/E cycles or is it something else?
> >
> > Cheers, Dan


FJ-20140915-Best-Practice_Distributed-Intelligent-Storage_NVMe-SSD_fast-IC_v8_P3700,ksp.pdf
Description: Adobe PDF document


Re: [ceph-users] SSD MTBF

2014-10-01 Thread Dan Van Der Ster
> On 30 Sep 2014, at 16:38, Mark Nelson  wrote:
> 
> On 09/29/2014 03:58 AM, Dan Van Der Ster wrote:
>> Hi Emmanuel,
>> This is interesting, because we’ve had sales guys telling us that those 
>> Samsung drives are definitely the best for a Ceph journal O_o !
> 
> Our sales guys or Samsung sales guys?  :)  If it was ours, let me know.

Haha, neither.

Cheers, Dan


> 
>> The conventional wisdom has been to use the Intel DC S3700 because of its 
>> massive durability.
> 
> The S3700 is definitely one of the better drives on the market for Ceph 
> journals.  Some of the higher end PCIE SSDs have pretty high durability (and 
> performance) as well, but cost more (though you can save SAS bay space, so 
> it's a trade-off).
> 
>> 
>> Anyway, I’m curious what do the SMART counters say on your SSDs?… are they 
>> really failing due to worn out P/E cycles or is it something else?
>> 
>> Cheers, Dan


Re: [ceph-users] SSD MTBF

2014-09-30 Thread Mark Nelson

On 09/29/2014 03:58 AM, Dan Van Der Ster wrote:
> Hi Emmanuel,
> This is interesting, because we’ve had sales guys telling us that those Samsung 
> drives are definitely the best for a Ceph journal O_o !

Our sales guys or Samsung sales guys?  :)  If it was ours, let me know.

> The conventional wisdom has been to use the Intel DC S3700 because of its 
> massive durability.

The S3700 is definitely one of the better drives on the market for Ceph 
journals.  Some of the higher end PCIE SSDs have pretty high durability 
(and performance) as well, but cost more (though you can save SAS bay 
space, so it's a trade-off).

> Anyway, I’m curious what do the SMART counters say on your SSDs?… are they 
> really failing due to worn out P/E cycles or is it something else?
>
> Cheers, Dan





Re: [ceph-users] SSD MTBF

2014-09-30 Thread Christian Balzer
On Tue, 30 Sep 2014 15:26:31 +0100 Kingsley Tart wrote:

> On Tue, 2014-09-30 at 00:30 +0900, Christian Balzer wrote:
> > On Mon, 29 Sep 2014 11:15:21 +0200 Emmanuel Lacour wrote:
> > 
> > > On Mon, Sep 29, 2014 at 05:57:12PM +0900, Christian Balzer wrote:
> > > > 
> > > > Given your SSDs, are they failing after more than 150TB have been
> > > > written?
> > > 
> > > between 30 and 40 TB ...
> > > 
> > That's low. One wonders what is going on here, Samsung being overly
> > optimistic or something else...
> 
> This isn't something I know much about so please do correct me if I'm
> wrong, but might this be something to do with actual data size vs
> written block size on the SSD?
> 

You're quite correct and astute, but according to this article:

http://www.anandtech.com/show/8239/update-on-samsung-850-pro-endurance-vnand-die-size

it should still be 70TB in the worst case (see the very end of the article).

That also doesn't mesh with the wear-out indicator levels Emmanuel is
seeing. The drive should know best about its own state of health, and when
it dies at about 40%, something is very much off.

Christian
-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Global OnLine Japan/Fusion Communications
http://www.gol.com/


Re: [ceph-users] SSD MTBF

2014-09-30 Thread Kingsley Tart
On Tue, 2014-09-30 at 00:30 +0900, Christian Balzer wrote:
> On Mon, 29 Sep 2014 11:15:21 +0200 Emmanuel Lacour wrote:
> 
> > On Mon, Sep 29, 2014 at 05:57:12PM +0900, Christian Balzer wrote:
> > > 
> > > Given your SSDs, are they failing after more than 150TB have been
> > > written?
> > 
> > between 30 and 40 TB ...
> > 
> That's low. One wonders what is going on here, Samsung being overly
> optimistic or something else...

This isn't something I know much about so please do correct me if I'm
wrong, but might this be something to do with actual data size vs
written block size on the SSD?

-- 
Cheers,
Kingsley.



Re: [ceph-users] SSD MTBF

2014-09-29 Thread Christian Balzer
On Mon, 29 Sep 2014 11:15:21 +0200 Emmanuel Lacour wrote:

> On Mon, Sep 29, 2014 at 05:57:12PM +0900, Christian Balzer wrote:
> > 
> > Given your SSDs, are they failing after more than 150TB have been
> > written?
> 
> between 30 and 40 TB ...
> 
That's low. One wonders what is going on here, Samsung being overly
optimistic or something else...

OTOH, that is about the right amount for 40GB/day over THREE years, the
warranty period.
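
As a quick check of that arithmetic, here is a minimal sketch. It assumes
decimal units (1 TB = 1000 GB) and 365-day years; the function name is
hypothetical and only for illustration.

```python
def total_written_tb(gb_per_day, years):
    """Total data written, in TB, at a constant daily write rate.

    Assumes 1 TB = 1000 GB and 365 days per year.
    """
    return gb_per_day * 365 * years / 1000


# 40GB/day, the warranty threshold quoted in the thread
three_years = total_written_tb(40, 3)
five_years = total_written_tb(40, 5)
print(f"40GB/day for 3 years: {three_years:.1f} TB")
print(f"40GB/day for 5 years: {five_years:.1f} TB")
```

The three-year figure lands just above the 30 to 40 TB at which the drives
were reported to fail, and the five-year figure is close to the 70TBW rating
mentioned for the 120GB models.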


> > 
> > > However, statistics show 60GB (option 2) to 100GB (option 1) of writes
> > > per day on each SSD on a cluster that is not really overloaded. Samsung
> > > claims to give a 5-year warranty if under 40GB/day. Those numbers seem
> > > very low to me.
> > > 
> > This is confusing, as the Samsung homepage gives a 150TBW lifetime, and
> > this would be about half of it. 
> > 
> 
> I hadn't seen this spec. I took a quick look at Samsung vs Intel: the Intel
> S3500 120GB is rated at 70TBW, which is the same as 40GB/day for 5 years.
>
The 3500s are the "low end" models. ^o^
And you better believe that they will last that time. ^^
 
> > > 
> > If you read/search this ML it should be clear to you that the only SSDs
> > that have the durability (and a good TBW/$ ratio when looking at it
> > long term) are Intel DC S3700s. 
> > Monitor their wearout ratio and you're likely to never have one fail on
> > you unexpectedly.
> > A 200GB DC S3700 has a TBW of 1825, more than 10 times that of your
> > Samsungs, and would allow you to write 1TB each day for 5 years.
> > 
> 
> 
> Yes, I did a quick compare just now:
> 
> Brand    Model          TBW    €     €/TB
> Intel    S3500 120GB    70     122   1.74
> Intel    S3500 240GB    140    225   1.60
> Intel    S3700 100GB    1873   220   0.11
> Intel    S3700 200GB    3737   400   0.10
> Samsung  840 pro 120GB  70     120   1.71
> 
> amazing!
> 
> 
> Considering that I only need 80GB, I can keep free space for
> over-provisioning, and thus the S3700 200GB may be the better choice in
> €/TB.
> 
Over-provisioning helps as well, but it is not a necessity with those Intel
SSDs. Note that in your particular configuration of having basically one
SSD per HDD the 100GB DC S3700s are more than fast enough. 

Aside from the TBW/$ cost (which is of course based on current prices) one
also has to consider the expected deployment duration. 
We do retire/recycle machines after 4-5 years, so for us anything that
survives this time for sure is good enough. ^^

Christian
-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Global OnLine Japan/Fusion Communications
http://www.gol.com/


Re: [ceph-users] SSD MTBF

2014-09-29 Thread Emmanuel Lacour
On Mon, Sep 29, 2014 at 08:58:38AM +, Dan Van Der Ster wrote:
> Hi Emmanuel,
> This is interesting, because we’ve had sales guys telling us that those 
> Samsung drives are definitely the best for a Ceph journal O_o !
> The conventional wisdom has been to use the Intel DC S3700 because of its 
> massive durability. 
> 
> Anyway, I’m curious what do the SMART counters say on your SSDs?… are they 
> really failing due to worn out P/E cycles or is it something else?
> 


Here are our current stats (health is the Wear_leveling_count):

hyp-prs-01
 SSD Status:   sda / 3622 hours / 8800.107 GB written / 58.311 GB/day / Health: 82 %
 SSD Status:   sdb / 3622 hours / 9949.785 GB written / 65.929 GB/day / Health: 80 %
hyp-prs-02
 SSD Status:   sda / 3620 hours / 9516.849 GB written / 63.095 GB/day / Health: 81 %
 SSD Status:   sdb / 3620 hours / 9716.926 GB written / 64.421 GB/day / Health: 80 %
hyp-prs-03
 SSD Status:   sda / 3530 hours / 9501.308 GB written / 64.598 GB/day / Health: 82 %
 SSD Status:   sdb / 3530 hours / 9494.685 GB written / 64.553 GB/day / Health: 80 %
hyp-pa2-02
 SSD Status:   sdc / 5692 hours / 11585.309 GB written / 48.848 GB/day / Health: 80 %
 SSD Status:   sdd / 5692 hours / 12771.698 GB written / 53.851 GB/day / Health: 77 %
hyp-pa2-03
 SSD Status:   sdc / 5691 hours / 12571.167 GB written / 53.014 GB/day / Health: 78 %
 SSD Status:   sdd / 5691 hours / 12882.846 GB written / 54.329 GB/day / Health: 76 %
hyp-pa2-04
 SSD Status:   sdc / 5691 hours / 12542.344 GB written / 52.893 GB/day / Health: 76 %
 SSD Status:   sdd / 5691 hours / 13534.304 GB written / 57.076 GB/day / Health: 77 %
hyp-pa3-02
 SSD Status:   sdc / 8747 hours / 30142.858 GB written / 82.705 GB/day / Health: 48 %
 SSD Status:   sdd / 8747 hours / 30737.615 GB written / 84.337 GB/day / Health: 40 %
hyp-pa3-03
 SSD Status:   sda / 8769 hours / 32669.734 GB written / 89.414 GB/day / Health: 43 %
 SSD Status:   sdb / 965 hours / 4006.301 GB written / 99.639 GB/day / Health: 92 %
hyp-pa3-04
 SSD Status:   sda / 1033 hours / 4078.292 GB written / 94.753 GB/day / Health: 91 %
 SSD Status:   sde / 49 hours / 299.994 GB written / 146.983 GB/day / Health: 99 %
quadrille
 SSD Status:   sdc / 7732 hours / 10775.406 GB written / 33.446 GB/day / Health: 80 %
 SSD Status:   sdd / 7732 hours / 10656.070 GB written / 33.076 GB/day / Health: 81 %
hora
 SSD Status:   sdc / 7734 hours / 10978.489 GB written / 34.068 GB/day / Health: 81 %
 SSD Status:   sdd / 7734 hours / 10978.754 GB written / 34.069 GB/day / Health: 81 %
mazurka
 SSD Status:   sdc / 7732 hours / 11983.782 GB written / 37.197 GB/day / Health: 80 %
 SSD Status:   sdd / 7732 hours / 11803.509 GB written / 36.637 GB/day / Health: 81 %



Those were the stats as of last Friday. This morning hyp-pa3-02:sdd died, so
at a Wear_leveling_count a bit under 40. And this summer we lost 3 SSDs with
nearly the same numbers (hyp-pa3-04:* and hyp-pa3-03:sdb) :(

hyp-pa3-* is, of course, the cluster with journals on RAID 1 SSDs.
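
For anyone wanting to reproduce the GB/day column from the raw counters, a
rough sketch follows. The script that produced the status lines is not shown
in this thread, so the regular expression, the function names and the linear
wear projection are all assumptions made for illustration.

```python
import re

# Assumed layout of one "SSD Status" record, as printed above. Separators
# are matched loosely so that records wrapped across two lines still parse.
STATUS_RE = re.compile(
    r"SSD Status:\s+(?P<dev>\w+)\s*/\s*(?P<hours>\d+) hours\s*/\s*"
    r"(?P<gb>[\d.]+) GB written\s*/\s*(?P<rate>[\d.]+) GB/day\s*/\s*"
    r"Health:\s*(?P<health>\d+)\s*%"
)


def parse_status(text):
    """Return one dict per SSD Status record found in text."""
    records = []
    for m in STATUS_RE.finditer(text):
        records.append({
            "dev": m.group("dev"),
            "hours": int(m.group("hours")),
            "gb_written": float(m.group("gb")),
            "gb_per_day": float(m.group("rate")),
            "health": int(m.group("health")),
        })
    return records


def gb_per_day(hours, gb_written):
    """Average daily write volume from power-on hours and total GB written."""
    return gb_written / hours * 24


def projected_gb_at_zero_health(gb_written, health):
    """Linear extrapolation (an assumption) of total GB written at 0 % health."""
    consumed = 100 - health
    if consumed <= 0:
        raise ValueError("no wear recorded yet, cannot extrapolate")
    return gb_written / consumed * 100


sample = " SSD Status:   sda / 3622 hours / 8800.107 GB written / 58.311 GB/day / Health: 82 %"
for rec in parse_status(sample):
    rate = gb_per_day(rec["hours"], rec["gb_written"])
    total = projected_gb_at_zero_health(rec["gb_written"], rec["health"])
    print(rec["dev"], f"{rate:.3f} GB/day", f"projected {total / 1000:.1f} TB at 0 % health")
```

The projection only says what the wear counter implies if it declines
linearly; it says nothing about drives that die well before reaching 0 %.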



-- 
Easter-eggs  Spécialiste GNU/Linux
44-46 rue de l'Ouest  -  75014 Paris  -  France -  Métro Gaité
Phone: +33 (0) 1 43 35 00 37-   Fax: +33 (0) 1 43 35 00 76
mailto:elac...@easter-eggs.com  -   http://www.easter-eggs.com


Re: [ceph-users] SSD MTBF

2014-09-29 Thread Emmanuel Lacour
On Mon, Sep 29, 2014 at 05:57:12PM +0900, Christian Balzer wrote:
> 
> Given your SSDs, are they failing after more than 150TB have been written?

between 30 and 40 TB ...

> 
> > However, statistics show 60GB (option 2) to 100GB (option 1) of writes
> > per day on each SSD on a cluster that is not really overloaded. Samsung
> > claims to give a 5-year warranty if under 40GB/day. Those numbers seem
> > very low to me.
> > 
> This is confusing, as the Samsung homepage gives a 150TBW lifetime, and
> this would be about half of it. 
> 

I hadn't seen this spec. I took a quick look at Samsung vs Intel: the Intel
S3500 120GB is rated at 70TBW, which is the same as 40GB/day for 5 years.

> > 
> If you read/search this ML it should be clear to you that the only SSDs
> that have the durability (and a good TBW/$ ratio when looking at it long
> term) are Intel DC S3700s. 
> Monitor their wearout ratio and you're likely to never have one fail on you
> unexpectedly.
> A 200GB DC S3700 has a TBW of 1825, more than 10 times that of your
> Samsungs, and would allow you to write 1TB each day for 5 years.
> 


Yes, I did a quick compare just now:

Brand    Model          TBW    €     €/TB
Intel    S3500 120GB    70     122   1.74
Intel    S3500 240GB    140    225   1.60
Intel    S3700 100GB    1873   220   0.11
Intel    S3700 200GB    3737   400   0.10
Samsung  840 pro 120GB  70     120   1.71

amazing!


Considering that I only need 80GB, I can keep free space for over-provisioning,
and thus the S3700 200GB may be the better choice in €/TB.
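
The €/TB column can be recomputed from the TBW and price columns. The sketch
below is only illustrative: the tuple layout and function name are made up
here, and truncating to two decimals is an assumption that happens to
reproduce the figures in the table.

```python
import math

# (brand, model, rated TBW, price in euros), copied from the comparison above
DRIVES = [
    ("Intel", "S3500 120GB", 70, 122),
    ("Intel", "S3500 240GB", 140, 225),
    ("Intel", "S3700 100GB", 1873, 220),
    ("Intel", "S3700 200GB", 3737, 400),
    ("Samsung", "840 pro 120GB", 70, 120),
]


def euro_per_tbw(price_eur, tbw):
    """Price per TB of rated write endurance, truncated to two decimals."""
    return math.floor(price_eur / tbw * 100) / 100


# Print the drives ordered from cheapest to most expensive per TB written.
for brand, model, tbw, price in sorted(DRIVES, key=lambda d: d[3] / d[2]):
    print(f"{brand:8s}{model:15s}{euro_per_tbw(price, tbw):.2f} EUR/TB")
```

Sorting by the ratio makes the gap obvious: both S3700 models come out more
than ten times cheaper per TB written than the other three drives.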



-- 
Easter-eggs  Spécialiste GNU/Linux
44-46 rue de l'Ouest  -  75014 Paris  -  France -  Métro Gaité
Phone: +33 (0) 1 43 35 00 37-   Fax: +33 (0) 1 43 35 00 76
mailto:elac...@easter-eggs.com  -   http://www.easter-eggs.com


Re: [ceph-users] SSD MTBF

2014-09-29 Thread Dan Van Der Ster
Hi Emmanuel,
This is interesting, because we’ve had sales guys telling us that those Samsung 
drives are definitely the best for a Ceph journal O_o !
The conventional wisdom has been to use the Intel DC S3700 because of its 
massive durability. 

Anyway, I’m curious what do the SMART counters say on your SSDs?… are they 
really failing due to worn out P/E cycles or is it something else?

Cheers, Dan




Re: [ceph-users] SSD MTBF

2014-09-29 Thread Christian Balzer

Hello,

On Mon, 29 Sep 2014 10:31:03 +0200 Emmanuel Lacour wrote:

> 
> Dear ceph users,
> 
> 
> we have been managing ceph clusters for 1 year now. Our setup is typically
> made of Supermicro servers with OSD SATA drives and journals on SSD.
> 
> Those SSDs are all failing one after the other after one year :(
> 
Given your SSDs, are they failing after more than 150TB have been written?

> We used Samsung 850 pro (120GB) with two setups (small nodes with 2 SSDs,
> 2 HDs in 1U):
> 
> 1) raid 1 :( (bad idea, each SSD takes all the OSD journal writes :()
> 2) raid 1 for OS (nearly no writes) and a dedicated partition for journals
>   (one per OSD)
> 
> 
> I'm convinced that the second setup is better, and we are migrating the old
> setup to this one.
> 
Yes, the 2nd option is the better one for many reasons and I'm using that
myself. 

> However, statistics show 60GB (option 2) to 100GB (option 1) of writes
> per day on each SSD on a cluster that is not really overloaded. Samsung
> claims to give a 5-year warranty if under 40GB/day. Those numbers seem
> very low to me.
> 
This is confusing, as the Samsung homepage gives a 150TBW lifetime, and
this would be about half of it. 

> What are your experiences with this? What write volumes do you encounter,
> on which SSD models, with which setup, and what MTBF?
> 
If you read/search this ML it should be clear to you that the only SSDs
that have the durability (and a good TBW/$ ratio when looking at it long
term) are Intel DC S3700s. 
Monitor their wearout ratio and you're likely to never have one fail on you
unexpectedly.
A 200GB DC S3700 has a TBW of 1825, more than 10 times that of your
Samsungs, and would allow you to write 1TB each day for 5 years.
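
To put the endurance figures from this message next to the write rates
reported in the original post, here is a small sketch. It assumes decimal
units (1 TB = 1000 GB), 365-day years and a constant write rate; the
function name is hypothetical.

```python
def years_until_tbw(tbw, gb_per_day):
    """Years needed to write the rated TBW at a constant daily write rate.

    Assumes 1 TB = 1000 GB and 365 days per year.
    """
    return tbw * 1000 / (gb_per_day * 365)


# Ratings quoted in this message, against write rates mentioned in the thread
for label, tbw, rate in [
    ("1825 TBW at 1TB/day", 1825, 1000),
    ("150 TBW at 60GB/day", 150, 60),
    ("150 TBW at 100GB/day", 150, 100),
]:
    print(f"{label}: {years_until_tbw(tbw, rate):.1f} years")
```

This only restates the rated endurance as a time span; it does not model
write amplification or early failures.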

Christian
-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Global OnLine Japan/Fusion Communications
http://www.gol.com/


[ceph-users] SSD MTBF

2014-09-29 Thread Emmanuel Lacour

Dear ceph users,


we have been managing ceph clusters for 1 year now. Our setup is typically
made of Supermicro servers with OSD SATA drives and journals on SSD.

Those SSDs are all failing one after the other after one year :(

We used Samsung 850 pro (120GB) with two setups (small nodes with 2 SSDs,
2 HDs in 1U):

1) raid 1 :( (bad idea, each SSD takes all the OSD journal writes :()
2) raid 1 for OS (nearly no writes) and a dedicated partition for journals
  (one per OSD)


I'm convinced that the second setup is better, and we are migrating the old
setup to this one.

However, statistics show 60GB (option 2) to 100GB (option 1) of writes per
day on each SSD on a cluster that is not really overloaded. Samsung claims to
give a 5-year warranty if under 40GB/day. Those numbers seem very low to me.

What are your experiences with this? What write volumes do you encounter,
on which SSD models, with which setup, and what MTBF?


-- 
Easter-eggs  Spécialiste GNU/Linux
44-46 rue de l'Ouest  -  75014 Paris  -  France -  Métro Gaité
Phone: +33 (0) 1 43 35 00 37-   Fax: +33 (0) 1 43 35 00 76
mailto:elac...@easter-eggs.com  -   http://www.easter-eggs.com