On Wed, Sep 23, 2020 at 3:03 AM Ivan Kurnosov <zer...@zerkms.com> wrote:
>
> Hi,
>
> this morning I woke up to a degraded test ceph cluster (managed by rook,
> but it does not really change anything for the question I'm about to ask).
>
> After checking logs I have found that bluestore on one of the OSDs ran out
> of space.
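For reference, the actual bluestore/BlueFS usage asked about below can be inspected directly on the cluster. This is only a sketch, assuming a running cluster and using osd.0 as an illustrative OSD id (not taken from the thread):

```shell
# Per-OSD usage as Ceph itself accounts it; the RAW USE column includes
# bluestore/BlueFS metadata, not just object data:
ceph osd df tree

# BlueFS-internal counters for one OSD, via its admin socket; run this
# on the host where that OSD daemon lives:
ceph daemon osd.0 perf dump bluefs
```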
I think this is a consequence, and the real error is something else that
happened before. The problem is that, while the cluster is unhealthy, the MON
store accumulates a lot of old osdmaps and pgmaps and does not clean them up
automatically, because the MONs think these old versions might still be
needed. If I understand correctly, the OSDs also keep a copy of these
osdmaps, which is why small OSDs fill up quickly if the cluster stays
unhealthy for a few hours.

> So, my question would be: how could I have prevented that? From monitoring
> I have (prometheus) - OSDs are healthy, have plenty of space, yet they are
> not.
>
> What command (and prometheus metric) would help me understand the actual
> real bluestore use? Or am I missing something?

You can fix the monitoring side by setting "mon data size warn" to something
like 1 GB or even less.

> Oh, and I "fixed" the cluster by expanding the broken osd.0 with a larger
> 15GB volume. And 2 other OSDs still run on 10GB volumes.

Sometimes this doesn't help. For data recovery purposes, the most helpful
step if you get the "bluefs enospc" error is to add a separate DB device,
like this:

systemctl disable --now ceph-osd@${OSDID}
truncate -s 32G /junk/osd.${OSDID}-recover/block.db
sgdisk -n 0:0:0 /junk/osd.${OSDID}-recover/block.db
ceph-bluestore-tool \
    bluefs-bdev-new-db --path /var/lib/ceph/osd/ceph-${OSDID} \
    --dev-target /junk/osd.${OSDID}-recover/block.db \
    --bluestore-block-db-size=31G --bluefs-log-compact-min-size=31G

Of course you can use a real block device instead of just a file. After
that, export all PGs using ceph-objectstore-tool and re-import them into a
fresh OSD, then destroy or purge the full one.

Here is why the options are needed:

--bluestore-block-db-size=31G: ceph-bluestore-tool refuses to do anything if
this option is not set to some value.

--bluefs-log-compact-min-size=31G: makes absolutely sure that log compaction
doesn't happen, because it would hit "bluefs enospc" again.

--
Alexander E.
Patrakov
CV: http://pc.cd/PLz7

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io