Hi all,

Thanks for the responses.
I stopped the monitor that wasn't syncing and dumped its keys with ceph-monstore-tool. The keys were mostly of type 'logm', which matches up with the huge volume of log messages I was getting about slow ops. I tried injecting clog_to_monitor=false along the way, but it did not help in my case. I ended up doing a rolling restart of the whole cluster, which must have cleared whatever was blocking things: the monitors compacted automatically and 'b' rejoined the quorum about 75% of the way through.

Thanks,
Lincoln

________________________________
From: Wido den Hollander <w...@42on.com>
Sent: Wednesday, March 3, 2021 2:03 AM
To: Lincoln Bryant <linco...@uchicago.edu>; ceph-users <ceph-users@ceph.io>
Subject: Re: [ceph-users] Monitor leveldb growing without bound v14.2.16

On 03/03/2021 00:55, Lincoln Bryant wrote:
> Hi list,
>
> We recently had a cluster outage over the weekend where several OSDs were
> inaccessible overnight for several hours. When I found the cluster in the
> morning, the monitors' root disks (which contained both the monitors' leveldb
> and the Ceph logs) had completely filled.
>
> After restarting OSDs, cleaning out the monitors' logs, moving /var/lib/ceph
> to dedicated disks on the mons, and starting recovery (in which there was 1
> unfound object that I marked lost, if that has any relevance), the leveldb
> continued/continues to grow without bound. The cluster has all PGs in
> active+clean at this point, yet I'm accumulating what seems like
> approximately ~1GB/hr of new leveldb data.
>
> Two of the monitors (a, c) are in quorum, while the third (b) has been
> synchronizing for the last several hours but doesn't seem to be able to
> catch up. Mon 'b' has now been in the 'synchronizing' state for 4 hours.
> The mon's log has many messages about compacting and deleting files,
> yet it never exits the synchronization state.
>
> The ceph.log is also rapidly accumulating complaints that the mons are slow
> (not surprising, I suppose, since the levelDBs are ~100GB at this point).
>
> I've found that using monstore tool to do compaction on mons 'a' and 'c'
> helps, but it is only a temporary fix. Soon the database inflates again and
> I'm back to where I started.

Are all the PGs in the active+clean state? I assume not? This will cause
the MONs to keep a large history of OSDMaps in their DB, and thus it will
keep growing.

> Thoughts on how to proceed here? Some ideas I had:
> - Would it help to add some new monitors that use RocksDB?

They would need to sync, which can take a lot of time. Moving to RocksDB
is a good idea once this is all fixed.

> - Stop a monitor and dump the keys via monstoretool, just to get an idea
>   of what's going on?
> - Increase mon_sync_max_payload_size to try to move data in larger chunks?

I would just try it.

> - Drop down to a single monitor, and see if normal compaction triggers
>   and stops growing unbounded?

It will keep growing; the compaction only works for a limited time. Make
sure the PGs become clean again. In the meantime, make sure you have
enough disk space.

Wido

> - Stop both 'a' and 'c', compact them, start them, and immediately start
>   'b'?
>
> Appreciate any advice.
>
> Regards,
> Lincoln
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
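[For the archives, the operations discussed in this thread can be sketched roughly as below. This is a hedged sketch, not a verified runbook: mon IDs (a, b), store paths under /var/lib/ceph/mon/, and the payload-size value are examples and depend on your deployment. Note the config option is spelled clog_to_monitors (plural) in Ceph.]

```shell
# 1. Stop the out-of-quorum mon and summarize its store keys by prefix,
#    to see what dominates the DB (here: the 'logm' entries).
systemctl stop ceph-mon@b
ceph-monstore-tool /var/lib/ceph/mon/ceph-b dump-keys | awk '{print $1}' | sort | uniq -c

# 2. Stop cluster log messages from being written into the mon store
#    (did not help in this case, but cheap to try).
ceph tell mon.* injectargs '--clog_to_monitors=false'

# 3. Compact a stopped monitor's leveldb store offline ...
ceph-kvstore-tool leveldb /var/lib/ceph/mon/ceph-a/store.db compact

# ... or ask a running monitor to compact online.
ceph tell mon.a compact

# 4. Optionally raise the sync payload size (bytes; example value) so a
#    lagging mon catches up in larger chunks.
ceph tell mon.* injectargs '--mon_sync_max_payload_size=4194304'
```

These commands assume direct access to the mon hosts and an admin keyring; run the offline compaction only while the target mon daemon is stopped.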