Re: [ceph-users] SSD MTBF
On Mon, Sep 29, 2014 at 10:31:03AM +0200, Emmanuel Lacour wrote:
> Dear ceph users,
>
> we are managing ceph clusters since 1 year now. Our setup is typically
> made of Supermicro servers with OSD sata drives and journal on SSD.
>
> Those SSD are all failing one after the other after one year :(
>
> We used Samsung 850 pro (120Go) with two setup (small nodes with 2 ssd,
> 2 HD in 1U):

s/850/840

A quick update on this: those SSDs continue to fail. We replaced each of
them with an Intel S3700 and are rebuilding the nodes with a different
partition layout (RAID only for the OS, one journal partition per OSD on
each SSD, plus over-provisioning).

We sent the Samsung SSDs back under warranty. It's very easy, and one week
later we received SSDs with the same S/N and clean SMART data, but... we
put two of them back in service and they failed one day later.

So, sorry Samsung, but I definitely do not recommend using the 840 Pro in
Ceph clusters!

--
Easter-eggs                 Spécialiste GNU/Linux
44-46 rue de l'Ouest - 75014 Paris - France - Métro Gaité
Phone: +33 (0) 1 43 35 00 37 - Fax: +33 (0) 1 43 35 00 76
mailto:elac...@easter-eggs.com - http://www.easter-eggs.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] SSD MTBF
On Tue, Oct 07, 2014 at 05:24:40PM +0200, Martin B Nielsen wrote:
> I don't disagree with the above - but the table assumes you'll wear out
> your SSD. Adjust the wear level and the price will change proportionally -
> if you're only writing 50-100 TB/year per SSD then the value will heavily
> swing in the cheaper consumer-grade SSD's favor. It is all about your
> estimated usage pattern and whether they're 'good enough' for your
> scenario or not (and/or whether you trust that vendor).
>
> In my experience Ceph seldom (if ever) maxes out the IO of an SSD - it is
> much more likely to be CPU or network before coming to that.

I agree with this. In our case, the answer is the Intel S3700 100 GB,
without any doubt :)
Re: [ceph-users] SSD MTBF
A bit late getting back on this one.

On Wed, Oct 1, 2014 at 5:05 PM, Christian Balzer wrote:
> > smartctl states something like
> > Wear = 092%, Hours = 12883, Datawritten = 15321.83 TB avg on those. I
> > think that is ~30TB/day if I'm doing the calc right.
>
> Something very much does not add up there.
> Either you've written 15321.83 GB on those drives, making it about
> 30GB/day and well within the Samsung specs, or you've written 10-20 times
> the expected TBW level of those drives...

My bad, I forgot to say the Wear indicator here (92%) is sorta backwards -
it means the drive still has 92% to go before reaching its expected TBW
limit.

I agree with what Massimiliano Cuttini wrote later as well - if your IO
load is well within the expected lifetime TBW, I see no reason to go for
more expensive disks. Just monitor for wear and keep a few in stock, ready
for replacement.

Regarding the table of SSDs and vendors:

Brand    Model             TBW    €     €/TBW
Intel    S3500 120GB         70   122   1.74
Intel    S3500 240GB        140   225   1.60
Intel    S3700 100GB       1873   220   0.11
Intel    S3700 200GB       3737   400   0.10
Samsung  840 Pro 120GB       70   120   1.71

I don't disagree with the above - but the table assumes you'll wear out
your SSD. Adjust the wear level and the price will change proportionally -
if you're only writing 50-100 TB/year per SSD then the value will heavily
swing in the cheaper consumer-grade SSD's favor. It is all about your
estimated usage pattern and whether they're 'good enough' for your scenario
or not (and/or whether you trust that vendor).

In my experience Ceph seldom (if ever) maxes out the IO of an SSD - it is
much more likely to be CPU or network before coming to that.

Cheers,
Martin

> In the article I mentioned previously:
> http://www.anandtech.com/show/8239/update-on-samsung-850-pro-endurance-vnand-die-size
> the author clearly comes up with a relationship between durability and
> SSD size, as one would expect. But the Samsung homepage just stated
> 150TBW for all those models...
>
> [rest of quoted text trimmed]
Re: [ceph-users] SSD MTBF
On 02/10/2014 17:50, Massimiliano Cuttini wrote:
> I don't think this is true.
>
> If you have an SSD of 60 GB or 100 GB then your TBW/day is really
> limited (the disk is small, so it will always write to the same cells).
> The bigger the SSD, the longer it will live: you have a limited number
> of writes per cell, so a bigger disk gives you more cells to spread
> them over.
>
> Expecting more than 1 year from an SSD as small as 100 GB is asking a
> lot. The same usage on a 1 TB SSD would last roughly 10 times longer.
> If what Emmanuel reports is real, then consumer SSDs are the way to go
> (10 years on a 1 TB disk). Then your score is good (almost amazing),
> not bad.
>
> Just switch to bigger SSDs and you'll solve all your problems.

Look at this table made with vendor specs:

Brand    Model             TBW    €     €/TBW
Intel    S3500 120GB         70   122   1.74
Intel    S3500 240GB        140   225   1.60
Intel    S3700 100GB       1873   220   0.11
Intel    S3700 200GB       3737   400   0.10
Samsung  840 Pro 120GB       70   120   1.71

Of course a bigger SSD allows more over-provisioning, but in terms of
€/TBW...
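The €/TBW column above is simply price divided by rated endurance; a quick sketch that recomputes it from the table's own numbers (the TBW ratings and prices are the 2014 figures quoted in this thread, not current ones):

```python
# Recompute cost per terabyte written (EUR/TBW) from rated endurance
# and purchase price, using the figures quoted in the table above.
drives = [
    # (model,                  rated TBW, price in EUR)
    ("Intel S3500 120GB",        70,  122),
    ("Intel S3500 240GB",       140,  225),
    ("Intel S3700 100GB",      1873,  220),
    ("Intel S3700 200GB",      3737,  400),
    ("Samsung 840 Pro 120GB",    70,  120),
]

for model, tbw, eur in drives:
    print(f"{model:<24} {tbw:>5} TBW  {eur:>4} EUR  {eur / tbw:.2f} EUR/TBW")
```

Despite the much higher sticker price, the S3700s come out more than an order of magnitude cheaper per terabyte written, which is exactly the point being made here.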
Re: [ceph-users] SSD MTBF
On 02/10/2014 17:58, Adam Boyhan wrote:
> What about the Intel DC S3500 instead of the DC S3700?

A matter of roughly 10x in supported TBW between the two. See the small
analysis I made previously in this thread regarding the cost per TBW.
Re: [ceph-users] SSD MTBF
What about the Intel DC S3500 instead of the DC S3700?

- Original Message -
From: "Emmanuel Lacour"
To: ceph-users@lists.ceph.com
Sent: Thursday, October 2, 2014 11:48:26 AM
Subject: Re: [ceph-users] SSD MTBF

On 02/10/2014 17:14, Ron Allred wrote:
> One thing being missed:
>
> The Samsung 850 Pro has only been available for about 1-2 months. The OP
> noted that drives are failing after approx 1 year, so the SSDs are
> probably actually Samsung 840 Pros. The write durabilities of the 850
> and 840 are quite different.

Yes, mistake on my side, it was of course the 840 Pro.

> [rest of quoted text trimmed]

thanks for the advice :)
Re: [ceph-users] SSD MTBF
I don't think this is true.

If you have an SSD of 60 GB or 100 GB then your TBW/day is really limited
(the disk is small, so it will always write to the same cells). The bigger
the SSD, the longer it will live: you have a limited number of writes per
cell, so a bigger disk gives you more cells to spread them over.

Expecting more than 1 year from an SSD as small as 100 GB is asking a lot.
The same usage on a 1 TB SSD would last roughly 10 times longer. If what
Emmanuel reports is real, then consumer SSDs are the way to go (10 years on
a 1 TB disk). Then your score is good (almost amazing), not bad.

Just switch to bigger SSDs and you'll solve all your problems.

On 02/10/2014 17:14, Ron Allred wrote:
> One thing being missed:
>
> The Samsung 850 Pro has only been available for about 1-2 months. The OP
> noted that drives are failing after approx 1 year, so the SSDs are
> probably actually Samsung 840 Pros. The write durabilities of the 850 and
> 840 are quite different.
>
> That being said, Samsung 8x0 Pros are desktop drives. Only "data center"
> grade SSDs should be used with Ceph, with decent TBW/day ratings over
> >= 5 years.
>
> [rest of quoted text trimmed]
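Massimiliano's size argument and the OP's write figures can be put together in a back-of-the-envelope lifetime estimate; a minimal sketch, assuming the 150 TBW endurance figure Christian cites elsewhere in the thread for these Samsung models (real endurance varies with workload and over-provisioning):

```python
# Rough SSD lifetime: rated endurance (TBW) divided by daily write volume.
def lifetime_years(rated_tbw: float, writes_gb_per_day: float) -> float:
    days = rated_tbw * 1000.0 / writes_gb_per_day  # using 1 TB = 1000 GB
    return days / 365.0

# 150 TBW rating vs the write rates mentioned in this thread:
for gb_per_day in (40, 60, 100):  # warranty limit, option 2, option 1
    print(f"{gb_per_day:>3} GB/day -> {lifetime_years(150, gb_per_day):.1f} years")
```

Even at 100 GB/day this predicts roughly four years, so drives dying after one year are failing well short of their rating - which is why Dan asked whether the SMART counters actually show worn-out P/E cycles or something else.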
Re: [ceph-users] SSD MTBF
On 02/10/2014 17:14, Ron Allred wrote:
> One thing being missed:
>
> The Samsung 850 Pro has only been available for about 1-2 months. The OP
> noted that drives are failing after approx 1 year, so the SSDs are
> probably actually Samsung 840 Pros. The write durabilities of the 850 and
> 840 are quite different.

Yes, mistake on my side, it was of course the 840 Pro.

> You should be looking at the Intel DC S37xx, OCZ Intrepid 3800, HGST,
> etc. Samsung recently released the 845DC (PRO/EVO) aimed at datacenters.
> These have decent TBW specs, but not very much is known about them in
> real use yet.
>
> Spend a full day reading storagesearch.com; it can save you THOUSANDS of
> dollars when selecting an SSD for datacenter use.

thanks for the advice :)
Re: [ceph-users] SSD MTBF
One thing being missed:

The Samsung 850 Pro has only been available for about 1-2 months. The OP
noted that drives are failing after approx 1 year, so the SSDs are probably
actually Samsung 840 Pros. The write durabilities of the 850 and 840 are
quite different.

That being said, Samsung 8x0 Pros are desktop drives. Only "data center"
grade SSDs should be used with Ceph, with decent TBW/day ratings over >= 5
years.

You should be looking at the Intel DC S37xx, OCZ Intrepid 3800, HGST, etc.
Samsung recently released the 845DC (PRO/EVO) aimed at datacenters. These
have decent TBW specs, but not very much is known about them in real use
yet.

Spend a full day reading storagesearch.com; it can save you THOUSANDS of
dollars when selecting an SSD for datacenter use.

Regards,
Ron

On 09/29/2014 02:31 AM, Emmanuel Lacour wrote:
> Dear ceph users,
>
> we are managing ceph clusters since 1 year now. Our setup is typically
> made of Supermicro servers with OSD sata drives and journal on SSD.
>
> Those SSD are all failing one after the other after one year :(
>
> We used Samsung 850 pro (120Go) with two setup (small nodes with 2 ssd,
> 2 HD in 1U):
>
> 1) raid 1 :( (bad idea, each SSD supports all the OSD journal writes :()
> 2) raid 1 for OS (nearly no writes) and a dedicated partition for
>    journals (one per OSD)
>
> I'm convinced that the second setup is better and we are migrating the
> old setup to it.
>
> Though, statistics give 60 GB (option 2) to 100 GB (option 1) of writes
> per day on each SSD on a not really overloaded cluster. Samsung claims
> to give a 5-year warranty if under 40 GB/day. Those numbers seem very
> low to me.
>
> What are your experiences on this? What write volumes do you encounter,
> on which SSD models, with which setup, and what MTBF?
Re: [ceph-users] SSD MTBF
Hello,

On Wed, 1 Oct 2014 13:31:38 +0200 Martin B Nielsen wrote:
> Hi,
>
> We settled on Samsung 840 Pro 240GB drives 1½ years ago and we've been
> happy so far. We've over-provisioned them a lot (left 120GB
> unpartitioned).
>
> We have 16x 240GB and 32x 500GB - we've lost 1x 500GB so far.
>
> smartctl states something like
> Wear = 092%, Hours = 12883, Datawritten = 15321.83 TB avg on those. I
> think that is ~30TB/day if I'm doing the calc right.

Something very much does not add up there.
Either you've written 15321.83 GB on those drives, making it about
30GB/day and well within the Samsung specs, or you've written 10-20 times
the expected TBW level of those drives...

In the article I mentioned previously:
http://www.anandtech.com/show/8239/update-on-samsung-850-pro-endurance-vnand-die-size
the author clearly comes up with a relationship between durability and SSD
size, as one would expect. But the Samsung homepage just stated 150TBW for
all those models...

Christian

> Not to advertise or say every Samsung 840 SSD is like this:
> http://www.vojcik.net/samsung-ssd-840-endurance-destruct-test/

Seen it before, but I have a feeling that this test doesn't quite put the
same strain on the poor NANDs as Emmanuel's environment.

Christian

> [rest of quoted text trimmed]
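Christian's sanity check here is just units arithmetic; a minimal sketch using Martin's reported counters (the smartctl parsing itself is omitted, since the relevant attribute names vary by vendor and firmware):

```python
# Cross-check SMART counters: average write rate over the drive's life.
def avg_writes_per_day(total_written: float, power_on_hours: float) -> float:
    """Average data written per day, in the same unit as total_written."""
    return total_written / (power_on_hours / 24.0)

# Martin's figures: 12883 power-on hours, "15321.83" written.
rate = avg_writes_per_day(15321.83, 12883)
print(f"{rate:.1f} per day")
# ~28.5: implausible if the unit is TB (that would need ~330 MB/s
# sustained around the clock on a SATA SSD), entirely plausible as
# GB/day - hence Christian's conclusion that the counter is really GB.
```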
Re: [ceph-users] SSD MTBF
On Wed, Oct 01, 2014 at 01:31:38PM +0200, Martin B Nielsen wrote:
> Hi,
>
> We settled on Samsung 840 Pro 240GB drives 1½ years ago and we've been
> happy so far. We've over-provisioned them a lot (left 120GB
> unpartitioned).
>
> We have 16x 240GB and 32x 500GB - we've lost 1x 500GB so far.
>
> Not to advertise or say every Samsung 840 SSD is like this:
> http://www.vojcik.net/samsung-ssd-840-endurance-destruct-test/

I just returned 3 dead SSDs and am waiting for Samsung's feedback ;)
(another one died yesterday)
Re: [ceph-users] SSD MTBF
Hi,

We settled on Samsung 840 Pro 240GB drives 1½ years ago and we've been
happy so far. We've over-provisioned them a lot (left 120GB unpartitioned).

We have 16x 240GB and 32x 500GB - we've lost 1x 500GB so far.

smartctl states something like
Wear = 092%, Hours = 12883, Datawritten = 15321.83 TB avg on those. I
think that is ~30TB/day if I'm doing the calc right.

Not to advertise or say every Samsung 840 SSD is like this:
http://www.vojcik.net/samsung-ssd-840-endurance-destruct-test/

Cheers,
Martin

On Wed, Oct 1, 2014 at 10:18 AM, Christian Balzer wrote:
> On Wed, 1 Oct 2014 09:28:12 +0200 Kasper Dieter wrote:
> > Intel P3700 could be an alternative with 10 Drive-Writes/Day for 5
> > years (see attachment)
>
> They're certainly nice and competitively priced (TBW/$ wise at least).
> However as I said in another thread, once your SSDs start to outlive
> your planned server deployment time (in our case 5 years) that's
> probably good enough.
>
> It's all about finding the balance between cost, speed (BW and IOPS),
> durability and space.
>
> [rest of quoted text trimmed]
Re: [ceph-users] SSD MTBF
On Wed, 1 Oct 2014 09:28:12 +0200 Kasper Dieter wrote:
> On Tue, Sep 30, 2014 at 04:38:41PM +0200, Mark Nelson wrote:
> > On 09/29/2014 03:58 AM, Dan Van Der Ster wrote:
> > > Hi Emmanuel,
> > > This is interesting, because we've had sales guys telling us that
> > > those Samsung drives are definitely the best for a Ceph journal O_o !
> >
> > Our sales guys or Samsung sales guys? :) If it was ours, let me know.
> >
> > > The conventional wisdom has been to use the Intel DC S3700 because
> > > of its massive durability.
> >
> > The S3700 is definitely one of the better drives on the market for
> > Ceph journals. Some of the higher-end PCIe SSDs have pretty high
> > durability (and performance) as well, but cost more (though you can
> > save SAS bay space, so it's a trade-off).
>
> Intel P3700 could be an alternative with 10 Drive-Writes/Day for 5 years
> (see attachment)

They're certainly nice and competitively priced (TBW/$ wise at least).
However as I said in another thread, once your SSDs start to outlive your
planned server deployment time (in our case 5 years) that's probably good
enough.

It's all about finding the balance between cost, speed (BW and IOPS),
durability and space.

For example I'm currently building a cluster based on 2U, 12-hotswap-bay
servers (because I already had 2 floating around) and am using 4x 100GB DC
S3700 (at US$200 each) and 8 HDDs in them.
Putting in a 400GB DC P3700 (US$1200) instead, plus 4 more HDDs, would have
pushed me over the budget and left me with a less than 30% "used" SSD 5
years later, at a time when we can clearly expect these things to be
massively faster and cheaper.

Now if you're actually running a cluster that would wear out a P3700 in 5
years (or you're planning to run your machines until they burst into
flames), then that's another story. ^.^

Christian

> [rest of quoted text trimmed]

--
Christian Balzer            Network/Systems Engineer
ch...@gol.com               Global OnLine Japan/Fusion Communications
http://www.gol.com/
Re: [ceph-users] SSD MTBF
On Tue, Sep 30, 2014 at 04:38:41PM +0200, Mark Nelson wrote:
> On 09/29/2014 03:58 AM, Dan Van Der Ster wrote:
> > Hi Emmanuel,
> > This is interesting, because we've had sales guys telling us that those
> > Samsung drives are definitely the best for a Ceph journal O_o !
>
> Our sales guys or Samsung sales guys? :) If it was ours, let me know.
>
> > The conventional wisdom has been to use the Intel DC S3700 because of
> > its massive durability.
>
> The S3700 is definitely one of the better drives on the market for Ceph
> journals. Some of the higher-end PCIe SSDs have pretty high durability
> (and performance) as well, but cost more (though you can save SAS bay
> space, so it's a trade-off).

Intel P3700 could be an alternative with 10 Drive-Writes/Day for 5 years
(see attachment)

-Dieter

> > Anyway, I'm curious what the SMART counters say on your SSDs... are
> > they really failing due to worn-out P/E cycles or is it something else?
> >
> > Cheers, Dan
>
> [rest of quoted text trimmed]

FJ-20140915-Best-Practice_Distributed-Intelligent-Storage_NVMe-SSD_fast-IC_v8_P3700,ksp.pdf
Description: Adobe PDF document
Re: [ceph-users] SSD MTBF
> On 30 Sep 2014, at 16:38, Mark Nelson wrote:
>
> On 09/29/2014 03:58 AM, Dan Van Der Ster wrote:
>> Hi Emmanuel,
>> This is interesting, because we’ve had sales guys telling us that those
>> Samsung drives are definitely the best for a Ceph journal O_o !
>
> Our sales guys or Samsung sales guys? :) If it was ours, let me know.

Haha, neither.

Cheers, Dan

[snip]
Re: [ceph-users] SSD MTBF
On 09/29/2014 03:58 AM, Dan Van Der Ster wrote:
> Hi Emmanuel,
> This is interesting, because we’ve had sales guys telling us that those
> Samsung drives are definitely the best for a Ceph journal O_o !

Our sales guys or Samsung sales guys? :) If it was ours, let me know.

> The conventional wisdom has been to use the Intel DC S3700 because of its
> massive durability.

The S3700 is definitely one of the better drives on the market for Ceph
journals. Some of the higher-end PCIe SSDs have pretty high durability
(and performance) as well, but cost more (though you can save SAS bay
space, so it's a trade-off).

> Anyway, I’m curious what the SMART counters say on your SSDs… are they
> really failing due to worn-out P/E cycles or is it something else?
>
> Cheers, Dan

[snip]
Re: [ceph-users] SSD MTBF
On Tue, 30 Sep 2014 15:26:31 +0100 Kingsley Tart wrote:
> On Tue, 2014-09-30 at 00:30 +0900, Christian Balzer wrote:
> > On Mon, 29 Sep 2014 11:15:21 +0200 Emmanuel Lacour wrote:
> > > On Mon, Sep 29, 2014 at 05:57:12PM +0900, Christian Balzer wrote:
> > > >
> > > > Given your SSDs, are they failing after more than 150TB have been
> > > > written?
> > >
> > > between 30 and 40 TB ...
> > >
> > That's low. One wonders what is going on here, Samsung being overly
> > optimistic or something else...
>
> This isn't something I know much about so please do correct me if I'm
> wrong, but might this be something to do with actual data size vs
> written block size on the SSD?
>
You're quite correct and astute, but according to this article:
http://www.anandtech.com/show/8239/update-on-samsung-850-pro-endurance-vnand-die-size
it should still be 70TB in the worst case (see the very end of the
article).

It also doesn't mesh with the wear-out indicator levels Emmanuel is
seeing; the drive should know best about its own state of health, and
when it dies at about 40% something is very much off.

Christian
--
Christian Balzer        Network/Systems Engineer
ch...@gol.com           Global OnLine Japan/Fusion Communications
http://www.gol.com/
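Kingsley's "actual data size vs written block size" point is essentially
write amplification. A minimal sketch of the worst-case effect (the page
size and write sizes below are illustrative assumptions, not measurements
from these drives):

```python
# Write amplification: a small host write still consumes whole NAND pages,
# so the flash wears faster than the host-visible byte count suggests.
# PAGE_SIZE and the 4 KiB write size are hypothetical, for illustration.
PAGE_SIZE = 8192  # bytes per NAND page (assumed)

def flash_bytes_written(host_write_sizes):
    """Flash bytes consumed if each host write lands on fresh pages
    (worst case, no coalescing by the controller)."""
    pages = sum(-(-size // PAGE_SIZE) for size in host_write_sizes)  # ceil div
    return pages * PAGE_SIZE

# A thousand 4 KiB journal writes: ~4 MB visible to the host...
host_writes = [4096] * 1000
flash = flash_bytes_written(host_writes)
print(flash / sum(host_writes))  # write amplification factor → 2.0
```

Under these assumptions every 4 KiB journal write burns a full 8 KiB page,
which is one way 30-40 TB of host writes could exhaust a drive rated for
more.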
Re: [ceph-users] SSD MTBF
On Tue, 2014-09-30 at 00:30 +0900, Christian Balzer wrote:
> On Mon, 29 Sep 2014 11:15:21 +0200 Emmanuel Lacour wrote:
> > On Mon, Sep 29, 2014 at 05:57:12PM +0900, Christian Balzer wrote:
> > >
> > > Given your SSDs, are they failing after more than 150TB have been
> > > written?
> >
> > between 30 and 40 TB ...
> >
> That's low. One wonders what is going on here, Samsung being overly
> optimistic or something else...

This isn't something I know much about so please do correct me if I'm
wrong, but might this be something to do with actual data size vs
written block size on the SSD?

--
Cheers,
Kingsley.
Re: [ceph-users] SSD MTBF
On Mon, 29 Sep 2014 11:15:21 +0200 Emmanuel Lacour wrote:
> On Mon, Sep 29, 2014 at 05:57:12PM +0900, Christian Balzer wrote:
> >
> > Given your SSDs, are they failing after more than 150TB have been
> > written?
>
> between 30 and 40 TB ...
>
That's low. One wonders what is going on here, Samsung being overly
optimistic or something else...
OTOH, that is about right for 40GB/day over THREE years, the warranty
period.

> > > Though, statistics give 60GB (option 2) to 100 GB (option 1) writes
> > > per day on SSD on a not really overloaded cluster. Samsung claims to
> > > give a 5-year warranty if under 40GB/day. Those numbers seem very
> > > low to me.
> >
> > This is confusing, as the Samsung homepage gives a 150TBW lifetime, and
> > this would be about half of it.
>
> I didn't see this spec. And did a quick look at Samsung vs Intel. Intel
> S3500 120Go is 70TBW, which is the same as 40GB/day for 5 years.
>
The 3500s are the "low end" models. ^o^
And you better believe that they will last that time. ^^

> > If you read/search this ML it should be clear to you that the only SSDs
> > that have the durability (and a good TBW/$ ratio when looking at it
> > long term) are Intel DC S3700s.
> > Monitor their wearout ratio and you're likely to never have one fail on
> > you unexpectedly.
> > A 200 GB DC S3700 has a TBW of 1825, more than 10 times that of your
> > Samsungs, and would allow you to write 1TB each day for 5 years.
>
> Yes, I did a quick compare just now:
>
> Brand    Model           TBW    €     €/TB
> Intel    S3500 120Go       70   122   1,74
> Intel    S3500 240Go      140   225   1,60
> Intel    S3700 100Go     1873   220   0,11
> Intel    S3700 200Go     3737   400   0,10
> Samsung  840 Pro 120Go     70   120   1,71
>
> amazing!
>
> Considering that I need only 80Go, I can keep free space for
> over-provisioning and thus the S3700 200Go may be the better choice in
> €/TB.
>
Over-provisioning helps as well, but it is not a necessity with those
Intel SSDs.
Note that in your particular configuration of having basically one SSD
per HDD, the 100GB DC S3700s are more than fast enough.

Aside from the TBW/$ cost (which is of course based on current prices),
one also has to consider the expected deployment duration.
We do retire/recycle machines after 4-5 years, so for us anything that
survives that time for sure is good enough. ^^

Christian
--
Christian Balzer        Network/Systems Engineer
ch...@gol.com           Global OnLine Japan/Fusion Communications
http://www.gol.com/
Re: [ceph-users] SSD MTBF
On Mon, Sep 29, 2014 at 08:58:38AM +, Dan Van Der Ster wrote:
> Hi Emmanuel,
> This is interesting, because we’ve had sales guys telling us that those
> Samsung drives are definitely the best for a Ceph journal O_o !
> The conventional wisdom has been to use the Intel DC S3700 because of
> its massive durability.
>
> Anyway, I’m curious what the SMART counters say on your SSDs… are they
> really failing due to worn-out P/E cycles or is it something else?
>
Here are our current stats (Health is the Wear_leveling_count):

hyp-prs-01
SSD Status: sda / 3622 hours / 8800.107 GB written / 58.311 GB/day / Health: 82 %
SSD Status: sdb / 3622 hours / 9949.785 GB written / 65.929 GB/day / Health: 80 %
hyp-prs-02
SSD Status: sda / 3620 hours / 9516.849 GB written / 63.095 GB/day / Health: 81 %
SSD Status: sdb / 3620 hours / 9716.926 GB written / 64.421 GB/day / Health: 80 %
hyp-prs-03
SSD Status: sda / 3530 hours / 9501.308 GB written / 64.598 GB/day / Health: 82 %
SSD Status: sdb / 3530 hours / 9494.685 GB written / 64.553 GB/day / Health: 80 %
hyp-pa2-02
SSD Status: sdc / 5692 hours / 11585.309 GB written / 48.848 GB/day / Health: 80 %
SSD Status: sdd / 5692 hours / 12771.698 GB written / 53.851 GB/day / Health: 77 %
hyp-pa2-03
SSD Status: sdc / 5691 hours / 12571.167 GB written / 53.014 GB/day / Health: 78 %
SSD Status: sdd / 5691 hours / 12882.846 GB written / 54.329 GB/day / Health: 76 %
hyp-pa2-04
SSD Status: sdc / 5691 hours / 12542.344 GB written / 52.893 GB/day / Health: 76 %
SSD Status: sdd / 5691 hours / 13534.304 GB written / 57.076 GB/day / Health: 77 %
hyp-pa3-02
SSD Status: sdc / 8747 hours / 30142.858 GB written / 82.705 GB/day / Health: 48 %
SSD Status: sdd / 8747 hours / 30737.615 GB written / 84.337 GB/day / Health: 40 %
hyp-pa3-03
SSD Status: sda / 8769 hours / 32669.734 GB written / 89.414 GB/day / Health: 43 %
SSD Status: sdb / 965 hours / 4006.301 GB written / 99.639 GB/day / Health: 92 %
hyp-pa3-04
SSD Status: sda / 1033 hours / 4078.292 GB written / 94.753 GB/day / Health: 91 %
SSD Status: sde / 49 hours / 299.994 GB written / 146.983 GB/day / Health: 99 %
quadrille
SSD Status: sdc / 7732 hours / 10775.406 GB written / 33.446 GB/day / Health: 80 %
SSD Status: sdd / 7732 hours / 10656.070 GB written / 33.076 GB/day / Health: 81 %
hora
SSD Status: sdc / 7734 hours / 10978.489 GB written / 34.068 GB/day / Health: 81 %
SSD Status: sdd / 7734 hours / 10978.754 GB written / 34.069 GB/day / Health: 81 %
mazurka
SSD Status: sdc / 7732 hours / 11983.782 GB written / 37.197 GB/day / Health: 80 %
SSD Status: sdd / 7732 hours / 11803.509 GB written / 36.637 GB/day / Health: 81 %

Those were the stats last Friday. This morning hyp-pa3-02:sdd died, so a
bit under 40 for Wear_leveling_count. And this summer we lost 3 SSDs with
nearly the same numbers (hyp-pa3-04:* and hyp-pa3-03:sdb) :(

hyp-pa3-* is the cluster with journals on RAID 1 SSDs, of course.

--
Easter-eggs                  Spécialiste GNU/Linux
44-46 rue de l'Ouest - 75014 Paris - France - Métro Gaité
Phone: +33 (0) 1 43 35 00 37 - Fax: +33 (0) 1 43 35 00 76
mailto:elac...@easter-eggs.com - http://www.easter-eggs.com
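Reports like the one above lend themselves to a rough projection of when a
drive will reach the danger zone. A sketch under stated assumptions: the
line format is the one posted above, the ~40 % floor is where the drives
in this thread actually died, and wear is assumed linear in GB written:

```python
import re

# One line from the wear report above (hyp-pa2-02, still healthy).
line = ("SSD Status: sdc / 5692 hours / 11585.309 GB written / "
        "48.848 GB/day / Health: 80 %")

# Parse the fields out of the posted format (hypothetical parser).
m = re.search(r"SSD Status: (\w+) / (\d+) hours / ([\d.]+) GB written / "
              r"([\d.]+) GB/day / Health: (\d+) %", line)
dev = m.group(1)
written_gb = float(m.group(3))
gb_per_day = float(m.group(4))
health = int(m.group(5))

# Linear projection: assume each health point costs a fixed amount of
# writes, and estimate days until this drive reaches the ~40 % floor.
gb_per_health_point = written_gb / (100 - health)
days_left = (health - 40) * gb_per_health_point / gb_per_day
print(f"{dev}: ~{days_left:.0f} days until ~40 % health")
```

This is only a trend line, not a vendor-endorsed model, but it would have
flagged the hyp-pa3-* drives well before they failed.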
Re: [ceph-users] SSD MTBF
On Mon, Sep 29, 2014 at 05:57:12PM +0900, Christian Balzer wrote:
>
> Given your SSDs, are they failing after more than 150TB have been
> written?

between 30 and 40 TB ...

> > Though, statistics give 60GB (option 2) to 100 GB (option 1) writes
> > per day on SSD on a not really overloaded cluster. Samsung claims to
> > give a 5-year warranty if under 40GB/day. Those numbers seem very low
> > to me.
>
> This is confusing, as the Samsung homepage gives a 150TBW lifetime, and
> this would be about half of it.

I didn't see this spec. And did a quick look at Samsung vs Intel. Intel
S3500 120Go is 70TBW, which is the same as 40GB/day for 5 years.

> If you read/search this ML it should be clear to you that the only SSDs
> that have the durability (and a good TBW/$ ratio when looking at it long
> term) are Intel DC S3700s.
> Monitor their wearout ratio and you're likely to never have one fail on
> you unexpectedly.
> A 200 GB DC S3700 has a TBW of 1825, more than 10 times that of your
> Samsungs, and would allow you to write 1TB each day for 5 years.

Yes, I did a quick compare just now:

Brand    Model           TBW    €     €/TB
Intel    S3500 120Go       70   122   1,74
Intel    S3500 240Go      140   225   1,60
Intel    S3700 100Go     1873   220   0,11
Intel    S3700 200Go     3737   400   0,10
Samsung  840 Pro 120Go     70   120   1,71

amazing!

Considering that I need only 80Go, I can keep free space for
over-provisioning and thus the S3700 200Go may be the better choice in
€/TB.

--
Easter-eggs                  Spécialiste GNU/Linux
44-46 rue de l'Ouest - 75014 Paris - France - Métro Gaité
Phone: +33 (0) 1 43 35 00 37 - Fax: +33 (0) 1 43 35 00 76
mailto:elac...@easter-eggs.com - http://www.easter-eggs.com
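Emmanuel's €/TB comparison above can be reproduced mechanically. A small
sketch, using the TBW and price figures quoted in the thread (the prices
are his 2014 spot prices, not current ones):

```python
# Reproduce the cost-per-TB-written comparison from the message above.
drives = [
    # (brand, model, TBW in TB, price in EUR) -- figures from the thread
    ("Intel",   "S3500 120Go",   70,   122),
    ("Intel",   "S3500 240Go",   140,  225),
    ("Intel",   "S3700 100Go",   1873, 220),
    ("Intel",   "S3700 200Go",   3737, 400),
    ("Samsung", "840 Pro 120Go", 70,   120),
]

# Sort by euros per TB of rated endurance, cheapest wear first.
ranked = sorted(drives, key=lambda d: d[3] / d[2])
for brand, model, tbw, eur in ranked:
    print(f"{brand:8s} {model:14s} {tbw:5d} TBW  {eur:4d} EUR  "
          f"{eur / tbw:5.2f} EUR/TB")
```

The ranking confirms the conclusion drawn in the thread: per TB of rated
writes, the DC S3700 models are more than an order of magnitude cheaper
than either consumer drive.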
Re: [ceph-users] SSD MTBF
Hi Emmanuel,

This is interesting, because we’ve had sales guys telling us that those
Samsung drives are definitely the best for a Ceph journal O_o ! The
conventional wisdom has been to use the Intel DC S3700 because of its
massive durability.

Anyway, I’m curious what the SMART counters say on your SSDs… are they
really failing due to worn-out P/E cycles or is it something else?

Cheers, Dan

> On 29 Sep 2014, at 10:31, Emmanuel Lacour wrote:
>
> Dear ceph users,
>
> we have been managing ceph clusters for 1 year now. Our setup is
> typically made of Supermicro servers with OSD SATA drives and journal
> on SSD.
>
> Those SSDs are all failing one after the other after one year :(

[snip]
Re: [ceph-users] SSD MTBF
Hello,

On Mon, 29 Sep 2014 10:31:03 +0200 Emmanuel Lacour wrote:
> Dear ceph users,
>
> we have been managing ceph clusters for 1 year now. Our setup is
> typically made of Supermicro servers with OSD SATA drives and journal
> on SSD.
>
> Those SSDs are all failing one after the other after one year :(
>
Given your SSDs, are they failing after more than 150TB have been
written?

> We used Samsung 850 pro (120Go) with two setups (small nodes with
> 2 SSD, 2 HD in 1U):
>
> 1) raid 1 :( (bad idea, each SSD supports all the OSDs' journal
>    writes :()
> 2) raid 1 for OS (nearly no writes) and a dedicated partition for
>    journals (one per OSD)
>
> I'm convinced that the second setup is better and we are migrating old
> setups to this one.
>
Yes, the 2nd option is the better one for many reasons and I'm using
that myself.

> Though, statistics give 60GB (option 2) to 100 GB (option 1) writes
> per day on SSD on a not really overloaded cluster. Samsung claims to
> give a 5-year warranty if under 40GB/day. Those numbers seem very low
> to me.
>
This is confusing, as the Samsung homepage gives a 150TBW lifetime, and
this would be about half of it.

> What are your experiences on this? What write volumes do you encounter,
> on which SSD models, which setup and what MTBF?
>
If you read/search this ML it should be clear to you that the only SSDs
that have the durability (and a good TBW/$ ratio when looking at it long
term) are Intel DC S3700s.
Monitor their wearout ratio and you're likely to never have one fail on
you unexpectedly.
A 200 GB DC S3700 has a TBW of 1825, more than 10 times that of your
Samsungs, and would allow you to write 1TB each day for 5 years.

Christian
--
Christian Balzer        Network/Systems Engineer
ch...@gol.com           Global OnLine Japan/Fusion Communications
http://www.gol.com/
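Christian's endurance arithmetic is easy to sanity-check. A hypothetical
helper, using the TBW ratings quoted in this thread (vendor datasheet
figures, not measurements):

```python
# Back-of-the-envelope SSD endurance check from a TBW rating.
def years_until_worn_out(tbw_tb: float, writes_gb_per_day: float) -> float:
    """Years until a drive's rated total-bytes-written is exhausted
    at a constant daily write rate (decimal TB = 1000 GB)."""
    return tbw_tb * 1000 / writes_gb_per_day / 365

# Intel DC S3700 200GB: 1825 TBW at 1 TB/day, as stated above.
print(years_until_worn_out(1825, 1000))  # → 5.0

# Samsung's claimed 150 TBW at this cluster's observed 60-100 GB/day:
print(round(years_until_worn_out(150, 100), 1))
print(round(years_until_worn_out(150, 60), 1))
```

On paper even the Samsung rating should have lasted 4+ years at the
observed write rates, which is why drives dying after 30-40 TB points to
something other than rated P/E exhaustion.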
[ceph-users] SSD MTBF
Dear ceph users,

we have been managing ceph clusters for 1 year now. Our setup is
typically made of Supermicro servers with OSD SATA drives and journal on
SSD.

Those SSDs are all failing one after the other after one year :(

We used Samsung 850 pro (120Go) with two setups (small nodes with 2 SSD,
2 HD in 1U):

1) raid 1 :( (bad idea, each SSD supports all the OSDs' journal writes :()
2) raid 1 for OS (nearly no writes) and a dedicated partition for
   journals (one per OSD)

I'm convinced that the second setup is better and we are migrating old
setups to this one.

Though, statistics give 60GB (option 2) to 100 GB (option 1) writes per
day on SSD on a not really overloaded cluster. Samsung claims to give a
5-year warranty if under 40GB/day. Those numbers seem very low to me.

What are your experiences on this? What write volumes do you encounter,
on which SSD models, which setup and what MTBF?

--
Easter-eggs                  Spécialiste GNU/Linux
44-46 rue de l'Ouest - 75014 Paris - France - Métro Gaité
Phone: +33 (0) 1 43 35 00 37 - Fax: +33 (0) 1 43 35 00 76
mailto:elac...@easter-eggs.com - http://www.easter-eggs.com