Hi Quentin,
Samsung offers different types of SSDs for different workloads, built on 
different media such as SLC, MLC, TLC, and 3D NAND. Each line is designed for a 
particular workload and purpose. Thanks for your understanding and support.

Regards,
James

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
Quentin Hartman
Sent: Thursday, September 17, 2015 4:05 PM
To: Andrija Panic
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] which SSD / experiences with Samsung 843T vs. Intel 
s3700

I ended up having seven die in total: five while in service, and two more when 
I hooked them up to a test machine to collect information from them. To 
Samsung's credit, they've been great to deal with and are replacing the failed 
drives, on the condition that I don't use them for Ceph again. Apparently they 
sent some of my failed drives to an engineer in Korea for failure analysis, and 
the conclusion was that the drives were put to an "unintended use". I have 
seven left that I'm not sure what to do with.

I've honestly always really liked Samsung, and I'm disappointed that I wasn't 
able to find anyone with their DC-class drives actually in stock, so I ended up 
switching to the Intel S3700s. My users will be happy to have some SSDs to put 
in their workstations, though!

QH

On Thu, Sep 17, 2015 at 4:49 PM, Andrija Panic 
<andrija.pa...@gmail.com> wrote:
Another one bites the dust...

This is a Samsung 850 PRO 256GB... (6 journals on this SSD just died...)

[root@cs23 ~]# smartctl -a /dev/sda
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.10.66-1.el6.elrepo.x86_64] 
(local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

Vendor:               /1:0:0:0
Product:
User Capacity:        600,332,565,813,390,450 bytes [600 PB]
Logical block size:   774843950 bytes
>> Terminate command early due to bad response to IEC mode page
A mandatory SMART command failed: exiting. To continue, add one or more '-T 
permissive' options
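
For drives that are still responding, the wear attributes are worth watching 
before they get to this state. On Samsung consumer SSDs the relevant ones are 
usually Wear_Leveling_Count and Total_LBAs_Written, though attribute names and 
numbering vary by vendor, so treat this as a sketch:

    # Quick wear check on a still-healthy drive (Samsung attribute names)
    smartctl -A /dev/sda | egrep 'Wear_Leveling_Count|Total_LBAs_Written'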

On 8 September 2015 at 18:01, Quentin Hartman 
<qhart...@direwolfdigital.com> wrote:
On Tue, Sep 8, 2015 at 9:05 AM, Mark Nelson <mnel...@redhat.com> wrote:
A list of hardware that is known to work well would be incredibly
valuable to people getting started. It doesn't have to be exhaustive,
nor does it have to provide all the guidance someone could want. A
simple "these things have worked for others" would be sufficient. If
nothing else, it will help people justify more expensive gear when their
approval people say "X seems just as good and is cheaper, why can't we
get that?".

So I have my opinions on different drives, but I think we need to be really 
careful not to appear to endorse or pick on specific vendors. The more we can 
stick to high-level statements like the following, the better:

- Drives should have high write endurance
- Drives should perform well with O_DSYNC writes (a quick fio test for this is 
sketched below)
- Drives should support power loss protection for data in motion

Once those are established, I think it's reasonable to 
point out that certain drives meet (or do not meet) those criteria and get 
feedback from the community as to whether or not vendor's marketing actually 
reflects reality.  It'd also be really nice to see more information available 
like the actual hardware (capacitors, flash cells, etc) used in the drives.  
I've had to show photos of the innards of specific drives to vendors to get 
them to give me accurate information regarding certain drive capabilities.  
Having a database of such things available to the community would be really 
helpful.
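
As a reference point for the O_DSYNC criterion above, a common way to check it 
is a direct, synchronous 4k fio write test against the raw device. This is a 
sketch rather than a definitive benchmark recipe, it is destructive, so only 
run it against a drive holding no data, and /dev/sdX is a placeholder for the 
device under test:

    # Single-threaded 4k O_DSYNC writes for 60 seconds; good journal
    # SSDs sustain thousands of IOPS here, consumer drives often don't.
    fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k \
        --numjobs=1 --iodepth=1 --runtime=60 --time_based \
        --group_reporting --name=journal-test

Drives that look fast on a spec sheet can collapse to a few hundred IOPS under 
this pattern, which is close to what a ceph journal generates.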

That's probably a very good approach. I think it would be pretty simple to 
avoid the appearance of endorsement if the data is presented correctly.


To that point, though, I think something more important than a list of known 
"good" hardware would be a list of known "bad" hardware,

I'm rather hesitant to do this unless it's been specifically confirmed by the 
vendor.  It's too easy to point fingers (see the recent kernel trim bug 
situation).

I disagree. I think that only comes into play if you claim to know why the 
hardware has problems. If you simply state "people who have used this drive 
have experienced a large number of seemingly premature failures when using 
them as journals", that provides sufficient warning to users. If the vendor 
then wants to engage the community, pin down why, and help us either make the 
device work or confirm that it's just not suited, that's on them. Samsung 
seems to be doing exactly that. It would be great to have vendors provide that 
level of detail, but again, I don't think it's necessary. We're not saying 
"ceph/redhat/$whatever says this hardware sucks"; we're saying "the community 
has found that using this hardware with ceph has exhibited these negative 
behaviors...". At that point you're just relaying experiences and collecting 
them in a central location. It's up to the reader to draw conclusions from 
them.

But again, I think more important than either of these would be a collection 
of use cases with the actual journal write volumes observed in them, so that 
people can make more informed purchasing decisions. The fact that my small 
OpenStack cluster generated 3.6 TB of writes per month on my journal drives 
(3 OSDs each) is somewhat mind-blowing; that's almost four times what my 
best-guess estimates indicated we'd be doing. Clearly there's more going on 
than we're used to paying attention to. Someone coming to Ceph and seeing the 
cost of DC-class SSDs versus consumer-class SSDs will almost certainly suffer 
some sticker shock, and even if they don't, their purchasing-approval people 
almost certainly will. This is especially true in smaller organizations where 
SSDs are still somewhat exotic. When those people come back with "why won't 
cheaper thing X be OK?", there needs to be sufficient information to answer 
them. Without a test environment to generate their own data, people have to 
rely on the experiences of others, and right now those experiences don't seem 
to be documented anywhere; where they are, they're not very discoverable.
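
For anyone who wants to capture that kind of number on their own hardware, 
here's a rough sketch. It assumes the drive exposes Total_LBAs_Written as a 
raw SMART attribute counting 512-byte sectors, which is common but 
vendor-specific, so check your model's documentation:

    # Approximate total bytes written, from SMART Total_LBAs_Written
    # (field 10 of smartctl's attribute table is the raw value).
    LBAS=$(smartctl -A /dev/sda | awk '/Total_LBAs_Written/ {print $10}')
    echo "scale=2; $LBAS * 512 / 1024^4" | bc -l    # TiB written so far

Record the value at the start and end of a month and subtract to get writes 
per month. To put the result in perspective: at 3.6 TB/month, a consumer 
drive rated for roughly 150 TB of total writes (a typical endurance figure 
for 256 GB consumer drives of that era) would burn through its rating in 
about 150 / 3.6, or around 42 months, and sustained small synchronous writes 
tend to age flash faster than the rating's assumptions.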

QH




--

Andrija Panić

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
