I'm running Kraken built from Git right now, and I've found that my OSDs eat as much memory as they can until they're killed by OOM. I understand that BlueStore is experimental, but this behavior seemed worth reporting.
My setup:

- Xeon D-1540, 32GB DDR4 ECC RAM
- Arch Linux
- Single node, 4x 8TB OSDs, each prepared with "ceph-disk prepare --bluestore /dev/sdX"
- Built from Git commit fac6335a1eea12270f76cf2c7814648669e6515a

Steps to reproduce:

- Start mon
- Start OSDs
- ceph osd pool create pool 256 256 erasure myprofile storage
- rados bench -p pool <time> write -t 32
- ceph osd pool delete pool
- ceph osd pool create pool 256 256 replicated
- rados bench -p pool <time> write -t 32
- ceph osd pool delete pool

The OSDs start at ~500M used each (according to "ceph tell osd.0 heap stats"), before any PGs are allocated to them. After creating and peering PGs, they're at ~514M each. After running rados bench for 10s, memory is at ~727M each. Running pprof on a heap dump shows the top entry as:

  218.9  96.1%  96.1%  218.9  96.1%  ceph::buffer::create_aligned

Running rados bench for another 10s pushes memory to ~836M each, and pprof again shows similar results:

  305.2  96.8%  96.8%  305.2  96.8%  ceph::buffer::create_aligned

I can continue this process until the OSDs are killed by OOM. This only happens with BlueStore; other backends (like FileStore) work fine. When I delete the pool, the OSDs release the memory and return to their ~500M resting point. Repeating the test with a replicated pool results in elevated memory use (~610M peak) while writing, but the OSDs return to resting levels when writing ends.

It'd be great if I could dig into this myself, but I don't understand the code very well, and I can't figure out whether there's a way to trace the call path by which the memory is allocated, the way there is for CPU usage. Any advice or a solution would be much appreciated.

Thanks!
Lucas
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com