Hi Jaime,

We only use the cache size parameters now if you've disabled autotuning.  With autotuning, we adjust the cache size on the fly to try to keep the mapped process memory under osd_memory_target.  You can set a lower memory target than the default, though you will have far less cache for bluestore onodes and rocksdb.  You may notice that it's slower, especially if you have a big active data set you are processing.  I don't usually recommend setting osd_memory_target below 2GB.  At some point the autotuner will have shrunk the caches as far as it can and the process memory may start exceeding the target (with our default rocksdb and pglog settings this usually happens somewhere between 1.3-1.7GB once the OSD has been sufficiently saturated with IO).  Given memory prices right now, I'd still recommend upgrading RAM if you have the ability though.  You might be able to get away with setting each OSD to 2-2.5GB in your scenario, but you'll be pushing it.
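As a rough sketch (not something I've tested against your hardware): with 32GB of RAM and 12 OSDs per host you have roughly 2.6GB per OSD before accounting for the OS and anything else running on the box, so a ceph.conf override along these lines is about as far as I'd go:

[osd]
# ~2.5 GiB per OSD daemon; leaves only a little headroom for the OS on a 32GB host
osd_memory_target = 2684354560

You'd need to restart the OSDs after changing it (or set it at runtime if your build supports that), and then keep an eye on the actual resident memory of the ceph-osd processes afterwards.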


I would not recommend lowering osd_memory_cache_min.  You really want the rocksdb indexes/filters fitting in cache, along with as many bluestore onodes as you can get.  In any event, you'll still be bound by the (currently hardcoded) 64MB cache chunk allocation size in the autotuner, which osd_memory_cache_min can't reduce (and that chunk is per cache, while osd_memory_cache_min is global across the kv, buffer, and rocksdb block caches).  I.e., each cache is going to get 64MB plus growth room regardless of how low you set osd_memory_cache_min.  That's intentional: we don't want a single SST file in rocksdb to be able to completely blow everything else out of the block cache during compaction, only to quickly become invalid, get removed from the cache, and make it look to the priority cache system like rocksdb doesn't actually need any more memory for cache.
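To put back-of-the-envelope numbers on it (based on the hardcoded chunk size mentioned above), with three caches each getting at least one 64MB chunk plus growth room you end up with something like:

3 caches x 64MB chunk = ~192MB of cache allocation per OSD as a floor

so lowering osd_memory_cache_min below its 128MB default wouldn't buy you much even if it did take effect.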


Mark


On 8/7/19 7:44 AM, Jaime Ibar wrote:
Hi all,

We run a Ceph Luminous 12.2.12 cluster with 7 OSD servers, 12x4TB disks each.
Recently we redeployed the OSDs of one of them using the bluestore backend;
however, after this we're facing out-of-memory errors (invoked oom-killer)
and the OS kills one of the ceph-osd processes.
The OSD is restarted automatically and is back online after one minute.
We're running Ubuntu 16.04, kernel 4.15.0-55-generic.
The server has 32GB of RAM and 4GB of swap partition.
All the disks are HDDs, no SSDs.
Bluestore settings are the default ones:

"osd_memory_target": "4294967296"
"osd_memory_cache_min": "134217728"
"bluestore_cache_size": "0"
"bluestore_cache_size_hdd": "1073741824"
"bluestore_cache_autotune": "true"

As stated in the documentation, bluestore assigns by default 4GB of
RAM per OSD (1GB of RAM per 1TB).
So in this case 48GB of RAM would be needed. Am I right?

Are these the minimum requirements for bluestore?
In case adding more RAM is not an option, can any of
osd_memory_target, osd_memory_cache_min, bluestore_cache_size_hdd
be decreased to fit our server specs?
Would this have any impact on performance?

Thanks
Jaime
