Sure, the db device has a size of 22.5G and the primary device has 100G. Here’s the complete “ceph osd df” output for one of the OSDs experiencing this issue:

ID CLASS WEIGHT  REWEIGHT SIZE    USE     DATA    OMAP META    AVAIL   %USE  VAR  PGS
14 hdd   0.11960 1.00000  122 GiB 118 GiB 2.4 GiB 0 B  116 GiB 4.2 GiB 96.55 3.06 195

I pasted the full output here, since this might not be so readable in the e-mail:
https://pastebin.com/NWYTHwxh

The OSD in question has the ID 14.

Let me know if there’s anything else I can provide you with.

Cheers,
Denis

> On 13 May 2020, at 11:49, Eugen Block <ebl...@nde.ag> wrote:
>
> Hi Daniel,
>
> I had the exact same issue in a (virtual) Luminous cluster without much data
> in it. The root cause was that my OSDs were too small (10 GB only) and the
> rocksDB also grew until manual compaction. I had configured the small OSDs
> intentionally because it was never supposed to store lots of data. Can you
> provide some more details like 'ceph osd df'?
>
> Manual compaction did help, but then I recreated the OSDs with 20 GB each and
> the issue didn't occur after that.
>
> Regards,
> Eugen
>
>
> Quoting Denis Krienbühl <de...@href.ch>:
>
>> Hi
>>
>> On one of our Ceph clusters, some OSDs have been marked as full. Since this
>> is a staging cluster that does not have much data on it, this is strange.
>>
>> Looking at the full OSDs through “ceph osd df” I figured out that the space
>> is mostly used by metadata:
>>
>> SIZE: 122 GiB
>> USE: 118 GiB
>> DATA: 2.4 GiB
>> META: 116 GiB
>>
>> We run mimic, and for the affected OSDs we use a db device (nvme) in
>> addition to the primary device (hdd).
>>
>> In the logs we see the following errors:
>>
>> 2020-05-12 17:10:26.089 7f183f604700 1 bluefs _allocate failed to
>> allocate 0x400000 on bdev 1, free 0x0; fallback to bdev 2
>> 2020-05-12 17:10:27.113 7f183f604700 1
>> bluestore(/var/lib/ceph/osd/ceph-8) _balance_bluefs_freespace gifting
>> 0x180a000000~400000 to bluefs
>> 2020-05-12 17:10:27.153 7f183f604700 1 bluefs add_block_extent bdev 2
>> 0x180a000000~400000
>>
>> We assume it is an issue with RocksDB, as the following call will quickly
>> fix the problem:
>>
>> ceph daemon osd.8 compact
>>
>> The question is, why is this happening? I would think that “compact” is
>> something that runs automatically from time to time, but I’m not sure.
>>
>> Is it on us to run this regularly?
>>
>> Any pointers are welcome. I’m quite new to Ceph :)
>>
>> Cheers,
>>
>> Denis
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
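
For anyone reading along, here is a rough Python sketch for reading the numbers quoted in this thread. It is not a Ceph tool. The helper names (meta_share, parse_extent) are made up for illustration, and it assumes that the length after the "~" in the log extents is hexadecimal like the offset, which at least matches the 0x400000 in the preceding "failed to allocate" line.

```python
from typing import Tuple

GIB = 1024 ** 3
MIB = 1024 ** 2


def meta_share(use_gib: float, meta_gib: float) -> float:
    """Return the fraction of the used space that is reported as META."""
    if use_gib <= 0:
        raise ValueError("use_gib must be positive")
    return meta_gib / use_gib


def parse_extent(extent: str) -> Tuple[int, int]:
    """Split an 'offset~length' extent from the log into two byte counts.

    Assumption: both parts are hexadecimal, with or without a 0x prefix.
    """
    offset_hex, length_hex = extent.split("~")
    return int(offset_hex, 16), int(length_hex, 16)


# Values copied from the "ceph osd df" row for OSD 14 (USE and META, in GiB).
share = meta_share(use_gib=118, meta_gib=116)
print(f"META share of used space: {share:.1%}")

# Extent copied from the "_balance_bluefs_freespace gifting" log line.
offset, length = parse_extent("0x180a000000~400000")
print(f"gifted extent: offset {offset / GIB:.2f} GiB, length {length / MIB:.0f} MiB")
```

With the values above, nearly all of the used space is accounted for as META, and each "gifting" line hands a 4 MiB extent from the primary device to BlueFS.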