> Oh, well what I was going to do was just use SATA HBAs on PowerEdge R740s 
> because we don't really care about performance

That is important context.

> as this is just used as a copy point for backups/archival but the current 
> Ceph cluster we have [Which is based on HDDs attached to Dell RAID 
> controllers with each disk in RAID-0 and works just fine for us]

The H330?  You can set passthrough / JBOD / HBA personality and avoid the RAID0 

> is on EL7 and that is going to be EOL soon. So I thought it might be better 
> on the new cluster to use HBAs instead of having the OSDs just be single disk 
> RAID-0 volumes because I am pretty sure that's the least good scenario 
> whether or not it has been working for us for like 8 years now.

See above.

> So I asked on the list for recommendations and also read on the website and 
> it really sounds like the only "right way" to run Ceph is by directly 
> attaching disks to a motherboard

That isn’t quite what I meant.

If one is speccing out *new* hardware:

* HDDs are a false economy
* SATA / SAS SSDs hobble performance for little or no cost savings over NVMe
* RAID HBAs are fussy and a waste of money in 2023

>  I had thought that HBAs were okay before

By HBA I suspect you mean a non-RAID HBA?

> but I am probably confusing that with ZFS/BSD or some other equally 
> hyperspecific requirement.

ZFS indeed prefers as little as possible between it and the drives.  The 
benefits for Ceph are not identical but very congruent.

> The other note was about how using NVMe seems to be the only right way now 
> too.

If we posit that HDDs are a dead end, then that leaves us with SAS/SATA SSD 
vs NVMe SSD.

SAS is all but dead, and carries a price penalty.
SATA SSDs are steadily declining in the market.  5-10 years from now I suspect 
that no more than one manufacturer of enterprise-class SATA SSDs will remain.  
The future is PCIe. SATA SSDs don’t save any money over NVMe SSDs, and 
additionally require some sort of HBA, be it an add-in card or on the 
motherboard.  SATA and NVMe SSDs use the same NAND, just with a different 
interface.

> I would've rather just stuck to SATA but I figured if I was going to have to 
> buy all new servers that direct attach the SATA ports right off the 
> motherboards to a backplane

On-board SATA chips may be relatively weak but I don’t know much about current 

> I may as well do it with NVMe (even though the price of the media will be a 
> lot higher).

NVMe SSDs shouldn’t cost significantly more than SATA SSDs.  Hint:  certain 
tier-one chassis manufacturers mark both the fsck up.  You can get a better 
warranty and pricing by buying drives from a VAR.

> It would be cool if someone made NVMe drives that were cost competitive and 
> had similar performance to hard drives (meaning, not super expensive but not 
> lightning fast either) because the $/GB on datacenter NVMe drives like 
> Kioxia, etc is still pretty far away from what it is for HDDs (obviously).

It’s a trap!  Which is to say that the $/GB really isn’t far away, and in fact 
once you step back to TCO from the unit economics of the drive in isolation, 
the HDDs often turn out to be *more* expensive.

Pore through this:  https://www.snia.org/forums/cmsi/programs/TCOcalc

* $/IOPS are higher for any HDD compared to NAND
* HDDs are available up to what, 22TB these days?  With the same tired SATA 
interface as when they were 2TB.  That’s rather a bottleneck.  We see HDD 
clusters limiting themselves to 8-10TB HDDs all the time; in fact AIUI RHCS 
stipulates no larger than 10TB.  Feed that into the equation and the TCO 
changes a bunch
* HDDs not only hobble steady-state performance, but under duress (expansion, 
component failure, etc.) the impact to client operations will be higher and 
recovery to desired redundancy will take much longer.  I’ve seen a cluster, 
especially when using EC, take *4 weeks* to weight an 8TB HDD OSD up or down.  
Consider the operational cost and risk of that.  The SNIA calc has a 
performance multiplier that accounts for this.
* A SATA chassis is stuck with SATA; 5-10 years from now that will be 
increasingly limiting, especially if you go with LFF drives
* RUs cost money.  A 1U LFF server can hold what, at most 88TB raw when using 
HDDs?  With 60TB SSDs (*) one can fit 600TB of raw space into the same RU.

(*) If they meet your needs
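
To put rough numbers on the density and recovery bullets above, here is a 
back-of-the-envelope sketch in Python.  The bay counts (4 LFF bays and 10 SSD 
bays per 1U) are my assumption, inferred from the 88TB and 600TB figures, and 
the 2PB target is purely hypothetical.  It is not a substitute for the SNIA 
calculator, which accounts for far more than raw capacity.

```python
import math


def raw_tb_per_ru(drive_tb, bays):
    """Raw capacity of a 1U server: drive size times number of bays."""
    return drive_tb * bays


def rack_units_needed(target_tb, tb_per_ru):
    """Whole rack units required to reach a raw capacity target."""
    return math.ceil(target_tb / tb_per_ru)


def effective_rate_mb_s(drive_tb, days):
    """Average MB/s implied by moving drive_tb of data in the given days."""
    return drive_tb * 1e12 / (days * 86400) / 1e6


# Assumed bay counts: 4 x 22TB LFF HDDs vs 10 x 60TB SSDs per 1U.
hdd_1u = raw_tb_per_ru(22, 4)    # 88 TB raw
ssd_1u = raw_tb_per_ru(60, 10)   # 600 TB raw

target_tb = 2000  # hypothetical 2PB raw target

print("HDD rack units:", rack_units_needed(target_tb, hdd_1u))
print("SSD rack units:", rack_units_needed(target_tb, ssd_1u))

# 4 weeks to weight an 8TB OSD works out to only a few MB/s on average.
print("Implied rate, MB/s:", round(effective_rate_mb_s(8, 28), 1))
```

Swap in your own drive sizes, bay counts, and per-RU cost to see how quickly 
the rack space alone shifts the comparison.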

> Anyway thanks.
> -Drew
> -----Original Message-----
> From: Robin H. Johnson <robb...@gentoo.org> 
> Sent: Sunday, January 14, 2024 5:00 PM
> To: ceph-users@ceph.io
> Subject: [ceph-users] Re: recommendation for barebones server with 8-12 
> direct attach NVMe?
> On Fri, Jan 12, 2024 at 02:32:12PM +0000, Drew Weaver wrote:
>> Hello,
>> So we were going to replace a Ceph cluster with some hardware we had 
>> laying around using SATA HBAs but I was told that the only right way 
>> to build Ceph in 2023 is with direct attach NVMe.
>> Does anyone have any recommendation for a 1U barebones server (we just 
>> drop in ram disks and cpus) with 8-10 2.5" NVMe bays that are direct 
>> attached to the motherboard without a bridge or HBA for Ceph 
>> specifically?
> If you're buying new, Supermicro would be my first choice for vendor based on 
> experience.
> https://www.supermicro.com/en/products/nvme
> You said 2.5" bays, which makes me think you have existing drives.
> There are models to fit that, but if you're also considering new drives, you 
> can get further density in E1/E3
> The only caveat is that you will absolutely want to put a better NIC in these 
> systems, because 2x10G is easy to saturate with a pile of NVME.
> --
> Robin Hugh Johnson
> Gentoo Linux: Dev, Infra Lead, Foundation President & Treasurer
> E-Mail   : robb...@gentoo.org
> GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85 GnuPG FP : 7D0B3CEB 
> E9B85B1F 825BCECF EE05E6F6 A48F6136
> _______________________________________________
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
