The benefit of disabling on-drive cache may be at least partly dependent on the 
HBA; I’ve tested one specific drive model and found no difference, whereas 
someone else reported a measurable difference for the same model.
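
For reference, on Linux the drive’s volatile write cache can usually be 
toggled with hdparm for SATA and sdparm for SAS (device names below are 
placeholders):

    # SATA: disable (0) or re-enable (1) the volatile write cache
    hdparm -W 0 /dev/sdX

    # SAS/SCSI: clear the WCE mode page bit; --save persists it
    # across power cycles
    sdparm --clear WCE --save /dev/sdX

Whether that helps or hurts is worth testing both ways on your exact 
HBA/drive combination.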

> Good to know that we're not alone :) I also looked for a newer firmware, to 
> no avail.

Dell sometimes publishes firmware blobs for drives that they resell, though 
those seem to have customized inquiry strings baked in, and the firmware 
won’t apply to “generic” drives without questionable hackery with a hex editor.

My experience with Toshiba has been that the only way to get firmware blobs for 
generic drives is to persuade Toshiba themselves to give them to you, be it 
through a rep or the CSO.

> 
> Mark Nelson wrote:
>> This isn't the first time I've seen drive cache cause problematic
>> latency issues, and not always from the same manufacturer.
>> Unfortunately it seems like you really have to test the drives you
>> want to use before deploying them to make sure you don't run into
>> issues.
> 
> That's very true! Data sheets and even public benchmarks can be quite
> deceiving, and two hard drives that seem to have similar performance profiles
> can perform very differently within a Ceph cluster. Lesson learned.

Benchmarks are often run in a context rather removed from what anyone would 
deploy in production.

Notably, I’ve had at least two experiences with drives that passed both 
chassis-vendor and in-house initial qualification, yet still misbehaved in 
production.

The first was an HDD.  We had a mix of drives from Vendor A and Vendor B, and 
found that Vendor B’s drives were throwing read errors at 30x the rate of 
Vendor A’s.  After persisting for months through the layers of support, I was 
finally able to send drives to the vendor’s engineers, who found at least one 
design flaw that was tickled by the op pattern of a Filestore (XFS) OSD with a 
colocated journal.  Firmware could not substantially fix the problem, so the 
drives all had to be replaced with Vendor A’s.  Today BlueStore probably would 
not trigger the same design flaw.
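
For anyone wanting to spot that kind of thing early, SMART error counters are 
one way to compare read-error rates across a fleet (the device path is a 
placeholder):

    # SAS drives expose per-category error counter logs; SATA drives
    # record failures in the SMART error log instead
    smartctl -a /dev/sdX | grep -iE 'error|defect'

If Ceph’s device health monitoring is enabled, the same SMART data can also 
be pulled via ceph device get-health-metrics.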


The second was an SSD that was marketed as “enterprise” but whose internal 
housekeeping would only run properly if the drive was allowed long idle times.  
In that case I was eventually able to work with the vendor on a firmware fix.  
The behavior seemed to correlate with BlueStore as well as with a serial 
number range, and it didn’t manifest until drives had been in production for 
at least 90 days and the workload had increased.
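
As an aside, per-OSD commit latency is handy for catching that kind of 
slow-burn degradation in production:

    # Dump recent per-OSD commit/apply latency (in milliseconds)
    ceph osd perf

A drive whose housekeeping has fallen behind tends to stand out from its 
peers there well before it fails outright.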


The moral of the story is to stress-test every model of drive you deploy if 
you care about data durability, availability, and performance.  Throw 
increasingly busy workloads and queue depths at the drives; the performance of 
some will hit an abrupt cliff at a certain point.
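
A minimal sketch of such a sweep with fio (destructive to anything on the 
target device, which is a placeholder here):

    # Sweep queue depths with 4k random writes; watch for the depth at
    # which latency falls off a cliff rather than degrading gracefully
    for qd in 1 4 16 32 64 128; do
        fio --name=qd$qd --filename=/dev/sdX --direct=1 --ioengine=libaio \
            --rw=randwrite --bs=4k --iodepth=$qd --runtime=300 --time_based \
            --group_reporting
    done

Running each step for hours rather than minutes matters for drives, like the 
SSD above, whose problems only surface once internal housekeeping falls behind.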


