Apologies for not consolidating these replies.  My MUA is not my friend today.

> With 10 NVMe drives per node, I'm guessing that a single EPYC 7451 is
> going to be CPU bound for small IO workloads (2.4c/4.8t per OSD), but
> will be network bound for large IO workloads unless you are sticking
> 2x100GbE in.  You might want to consider jumping up to the 7601.  That
> would get you closer to where you want to be for 10 NVMe drives
> (3.2c/6.4t per OSD).

A note about this worthy excerpt: I believe it dates to 2019/03/09.  The models
described are first-gen (Naples) procs, and these numbers, I think, assume one
OSD per drive.

Today we have the second-gen (Rome) procs, which deliver more bang per core,
and there are more core-count options, up to 64c/128t.

Tying into the discussion favoring smaller, easier-to-manage nodes:  with both
generations there are XXXXP models that are only single-socket capable
(possibly binned dies, but that's only speculation on my part).  By going with
a single-socket architecture, one can choose one of these models and save
considerably compared to a dual-socket-capable model with comparable
performance.

So today one might consider, say, the 7502P, stepping up to the 7702P if
needed.
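
To make the per-OSD CPU budget concrete, here's a minimal sketch (Python).  It
assumes one OSD per NVMe drive and ten drives per node, as in the excerpt
above, and the core/thread counts are the published figures for these SKUs as
I recall them, so double-check before ordering anything:

    # Rough per-OSD CPU budget, assuming one OSD per NVMe drive and
    # ten NVMe drives per node.
    EPYC_SKUS = {
        "7451 (Naples)": (24, 48),    # (cores, threads)
        "7601 (Naples)": (32, 64),
        "7502P (Rome)":  (32, 64),
        "7702P (Rome)":  (64, 128),
    }

    OSDS_PER_NODE = 10

    for sku, (cores, threads) in EPYC_SKUS.items():
        print(f"{sku:15s} {cores / OSDS_PER_NODE:.1f}c / "
              f"{threads / OSDS_PER_NODE:.1f}t per OSD")

That works out to the 2.4c/4.8t and 3.2c/6.4t per OSD quoted above, and
roughly 6.4c/12.8t per OSD for a 7702P.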

When comparing benchmarks, let's remember that "performance" exists along
multiple axes, and one needs to be specific about the use case.  E.g., for RBD
volumes attached to VMs for block storage, I've found that users tend to be
more concerned with latency than with bandwidth: sure, you can pump mad GB/s
out of a system, but if the average write latency climbs to 100 ms, that may
not be compatible with the use case.  Similarly, the high queue depths we see
in benchmarks aren't representative of some common RBD workloads, though they
might make a killer RGW OSD box.
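
As a rough illustration of that distinction, something along these lines
(a sketch only: the cephx user, pool, and image names are made up, and the
JSON field names may differ between fio versions) compares average write
latency at queue depth 1 against bandwidth at queue depth 32 on an RBD image
using fio's rbd engine:

    # Sketch: QD1 write latency vs QD32 bandwidth on an RBD image via fio.
    # Pool/image/client names are hypothetical; adjust for your cluster.
    import json
    import subprocess

    def run_fio(iodepth):
        cmd = [
            "fio",
            "--name=rbd-bench",
            "--ioengine=rbd",
            "--clientname=admin",      # cephx user (assumption)
            "--pool=rbd",              # hypothetical pool
            "--rbdname=bench-image",   # hypothetical image
            "--rw=randwrite",
            "--bs=4k",
            "--direct=1",
            f"--iodepth={iodepth}",
            "--runtime=60",
            "--time_based",
            "--output-format=json",
        ]
        out = subprocess.run(cmd, capture_output=True, text=True, check=True)
        return json.loads(out.stdout)["jobs"][0]["write"]

    qd1 = run_fio(1)
    qd32 = run_fio(32)

    # clat_ns is completion latency in ns, bw is KiB/s in fio 3.x JSON output.
    print(f"QD1  avg write latency: {qd1['clat_ns']['mean'] / 1e6:.2f} ms")
    print(f"QD32 write bandwidth:   {qd32['bw'] / 1024:.1f} MiB/s")

The QD1 latency number is usually what the people attached to those RBD
volumes actually feel; the QD32 bandwidth number is what ends up on slides.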


— aad
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
