Thanks a lot for the insightful comments.

My replies are inline below.
 
From: Christian Wuerdig
Date: 2021-10-22 02:13
To: huxia...@horebdata.cn
CC: ceph-users
Subject: Re: [ceph-users] Open discussing: Designing 50GB/s CephFS or S3 ceph 
cluster
What is the expected file/object size distribution and count?

Let us assume a file/object size of 1MB or higher.

Is it write-once or modify-often data?

Write once and read many, with very few modifications.

What's your overall required storage capacity?

Exabyte-level storage capacity.

18 OSDs per WAL/DB drive seems a lot - the recommendation is ~6-8.
With 12TB OSDs the recommended WAL/DB size is 120-480GB (1-4%) per OSD to avoid
spillover - if you go RGW then you may want to aim more towards 4%, since RGW
can use quite a bit of OMAP data (especially when you store many small
objects). Not sure about CephFS.
So you may want to look at 4x NVMe, and probably 3.2TB instead of 1.6TB.
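
For a quick sanity check, here is the arithmetic behind that suggestion as a
small Python sketch (the drive sizes and OSD counts come from this thread; the
1-4% rule is the heuristic quoted above, not a hard limit):

    # Back-of-the-envelope WAL/DB sizing for a 36x 12TB HDD node.
    osd_size_gb = 12 * 1000
    osds_per_node = 36

    for pct in (0.01, 0.04):
        per_osd_gb = osd_size_gb * pct               # 120-480 GB per OSD
        total_tb = per_osd_gb * osds_per_node / 1000
        print(f"{pct:.0%}: {per_osd_gb:.0f} GB/OSD, {total_tb:.1f} TB per node")

    # 4x 3.2TB NVMe = 12.8TB shared by 36 OSDs -> ~355 GB per OSD,
    # which sits inside the 120-480 GB band; 2x 1.6TB gives only ~89 GB.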

Does Nautilus 14.2.22 support flexible WAL/DB sizes? I remember that
previously the only effective sizes were 3GB, 30GB, and 300GB.

The rule of thumb is 1 thread per HDD OSD - so if you want to give yourself
some extra wiggle room, a 7402 might be better, especially since EC is a bit
heavier on CPU.
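
As a rough check on that rule of thumb (the thread counts below are the
published specs of those EPYC SKUs; 1 thread per HDD OSD is the heuristic
quoted above):

    # 1 hardware thread per HDD OSD vs. the two single-socket options.
    osds_per_node = 36
    cpus = {"EPYC 7302": 32, "EPYC 7402": 48}  # threads with SMT enabled

    for name, threads in cpus.items():
        print(f"{name}: {threads} threads, {threads - osds_per_node:+d} spare")
    # EPYC 7302: 32 threads,  -4 spare -> short even before EC/network overhead
    # EPYC 7402: 48 threads, +12 spare -> some headroom for EC 8+3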

Agreed, the 7402 is a better choice.

Running EC 8+3 with failure domain host means you should have at least 12
nodes (8+3 = 11 chunks, plus one host of headroom), which means you'd need to
push over 4GB/sec into each node. That seems theoretically possible but is
quite close to the network interface capacity, and whether you could actually
push 4GB/sec into a node in this configuration I don't know. Overall, though,
12 nodes seems like the minimum.
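
In numbers (a minimal sketch; the figures are the ones discussed in this
thread, and the line rate ignores bonding and protocol overhead):

    # Per-node write bandwidth required vs. the bonded NIC ceiling.
    target_gbs = 50.0             # cluster-wide write target, GB/s
    nodes = 12                    # minimum for EC 8+3, failure domain host
    nic_gbs = 2 * 25 / 8          # 2x 25Gb LACP bond ~= 6.25 GB/s line rate

    per_node = target_gbs / nodes # ~4.17 GB/s
    print(f"{per_node:.2f} GB/s per node = {per_node / nic_gbs:.0%} of the bond")
    # ~67% of line rate, before counting EC shard traffic between nodes.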
With 12 nodes you have a raw storage capacity of around 5PB. Assuming you
don't run your cluster more than 80% full, EC 8+3 gives a maximum of about 3PB
of usable capacity (again assuming your objects are large enough not to cause
significant space amplification w.r.t. the bluestore minimum block size).
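
The capacity arithmetic, for reference:

    # Usable capacity: 12 nodes x 36x 12TB HDDs, EC 8+3, 80% max fill.
    raw_pb = 12 * 36 * 12 / 1000          # ~5.2 PB raw
    usable_pb = raw_pb * 0.80 * 8 / 11    # EC efficiency = 8/(8+3)
    print(f"raw {raw_pb:.1f} PB -> usable {usable_pb:.2f} PB")  # ~3.0 PB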
You will probably run more nodes than that, so if you don't need the full
capacity, consider going replicated instead, which generally performs better
than EC.

Agreed, I will need more nodes.

On Fri, 22 Oct 2021 at 05:24, huxia...@horebdata.cn <huxia...@horebdata.cn> 
wrote:
Dear Cephers,

I am thinking of designing a CephFS or S3 cluster, with a target of at least
50GB/s write bandwidth. For each node, I prefer a 4U 36x 3.5" Supermicro
server with 36x 12TB 7200 RPM HDDs, 2x Intel P4610 1.6TB NVMe SSDs as DB/WAL,
a single-socket AMD 7302 CPU, and 256GB of DDR4 memory. Each node comes with
2x 25Gb networking, bonded in mode 4 (LACP). 8+3 EC will be used.

My questions are the following: 

1   How many nodes should be deployed in order to achieve a minimum of 50GB/s,
if possible, with the above hardware configuration? (A rough sizing sketch
follows these questions.)

2   How many CephFS MDS daemons are required (assuming a 1MB request size),
and how many clients are needed to reach a total of 50GB/s?

3   From the perspective of getting the maximum bandwidth, which one should I
choose, CephFS or Ceph S3?
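
As a first-order estimate for question 1, here is a sketch based on raw disk
throughput (the per-disk figure is an assumption of mine, not from the thread;
real clusters lose further bandwidth to WAL/DB traffic, compaction, and seeks,
so treat it as optimistic):

    import math

    # Node count needed for 50GB/s, from aggregate HDD throughput.
    hdd_mbs = 150                  # assumed sustained MB/s per 7200 RPM HDD
    osds_per_node = 36
    ec_amp = 11 / 8                # EC 8+3 writes 11 chunks per 8 of data

    raw_gbs = hdd_mbs * osds_per_node / 1000   # ~5.4 GB/s raw per node
    client_gbs = raw_gbs / ec_amp              # ~3.9 GB/s of client writes
    print(math.ceil(50 / client_gbs), "nodes, by disk throughput alone")  # 13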

Any comments, suggestions, or improvement tips are warmly welcome.

best regards,

Samuel



huxia...@horebdata.cn
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io