Hi Anthony and Mark,

Thanks for your answers.

I have seen recommendations, derived from test clusters with bluestore OSDs, of
16GB baseline + 1GB per HDD OSD + 4GB per SSD OSD, probably from the times when
bluestore's memory consumption was modelled as a baseline plus a stress-dependent
component. I would already consider this quite a lot. I understand that for
high-performance requirements one adds RAM etc. to speed things up.
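
To put a number on that rule (a made-up example box with 16 HDD OSDs and 2 SSD
OSDs, not any particular hardware):

    16 GB baseline + 16 x 1 GB (HDD OSDs) + 2 x 4 GB (SSD OSDs) = 40 GB RAM

just for the OSDs, before the OS and anything else running on the host.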

For a mostly cold data store with a thin layer of warm/hot data, however, this
is quite a lot compared with what standard disk controllers achieve with a cheap
CPU, 4GB of RAM and 16 drives attached. Essentially, Ceph turns a server into a
disk controller, and it should be possible to run a configuration that, per
disk, does not require much more than an ordinary hardware controller does while
still delivering reasonable performance. I'm thinking along the lines of 25 MB/s
throughput and maybe 10 IOPS per NL-SAS HDD OSD on the user side (simple
collocated deployment, EC pool). This ought to be achievable with hardware
requirements as moderate as those of a RAID controller.
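
To be clear about the kind of setup I mean, something as plain as this (profile
name, pool name and k/m values are placeholders, not a recommendation):

    ceph osd erasure-code-profile set cold-ec k=8 m=2 crush-failure-domain=host
    ceph osd pool create cold-store 2048 2048 erasure cold-ec

i.e. no cache tier, no separate WAL/DB devices, everything collocated on the
HDDs.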

Good aggregated performance then comes from scale, and because the layer of hot
data is only a few GB per drive, a full re-write of just the hot data takes only
a few minutes. I thought this was the idea behind Ceph. Instead of trying to
accommodate high-performance wishes for ridiculously small Ceph clusters (I do
see these "I have 3 servers with 3 disks each, why is it so slow" kinds of
complaints, which I would simply ignore), one talks about scale-out systems
with thousands of OSDs. Something like 20 hosts serving 200 disks each would
count as a small cluster. If the warm/hot data is only 1% or even less, such a
system will be quite satisfying.
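
Back-of-envelope, ignoring EC overhead and the network, such a "small" cluster
at the 25 MB/s per drive mentioned above would already give

    20 hosts x 200 HDDs x 25 MB/s = ~100 GB/s aggregate throughput

which is more than enough for a cold store.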

For low-cost scale-out we have Ceph. For performance, we have technologies like
Lustre (which, by the way, has much more moderate minimum hardware
requirements).

For anything that requires higher performance one can then start using tiering,
WAL/DB devices, SSD-only pools, lots of RAM, whatever. However, there should be
a stable, well-tested and low-demanding baseline config for a cold-store use
case with hardware requirements similar to a NAS box per storage unit (one
server + JBODs). I am starting to miss support for the latter. 2 or even 4GB of
RAM and 1 core-GHz per HDD is really a lot compared with such systems.
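
A sketch of the direction I have in mind (the numbers are only an illustration,
not tested or recommended values):

    # ceph.conf fragment for a hypothetical low-memory cold-store baseline
    [osd]
        osd_memory_target = 2147483648     # aim for ~2 GB per OSD instead of the 4 GB default
        bluestore_cache_autotune = true    # let bluestore fit its caches into that target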

Please don't take this as the start of a long discussion. It's just a wish from
my side to have low-demanding configs available that scale easily and are easy
to administer at an overall low cost.

I will look into memory profiling of some OSDs. It doesn't look like a 
performance killer.
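
In case it is useful to others, what I plan to try is roughly the following
(osd.0 is just a placeholder id; the heap commands require the OSDs to be built
with tcmalloc):

    ceph tell osd.0 heap stats            # current tcmalloc heap statistics
    ceph tell osd.0 heap start_profiler   # start the heap profiler
    ceph tell osd.0 heap dump             # write a profile dump for pprof
    ceph tell osd.0 heap stop_profiler
    ceph daemon osd.0 dump_mempools       # mempool breakdown via the admin socket (run on the OSD host)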

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Anthony D'Atri <anthony.da...@gmail.com>
Sent: 14 July 2020 17:29
To: ceph-users@ceph.io
Subject: [ceph-users] Re: OSD memory leak?

>>  In the past, the minimum recommendation was 1GB RAM per HDD blue store OSD.

There was a rule of thumb of 1GB RAM *per TB* of HDD Filestore OSD, perhaps you 
were influenced by that?
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io