Hi all,

Thanks for the responses.

I stopped the monitor that wasn't syncing and dumped its keys with
ceph-monstore-tool. The keys were mostly of type 'logm', which matches up
with the huge volume of log messages I was getting about slow ops. I tried
injecting clog_to_monitors=false along the way, but it did not help in my case.
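
For the archives, roughly what I ran (mon ID and store path are
illustrative, adjust for your deployment):

    systemctl stop ceph-mon@b
    ceph-monstore-tool /var/lib/ceph/mon/ceph-b dump-keys \
        | awk '{print $1}' | sort | uniq -c    # count keys per prefix
    ceph tell mon.* injectargs '--clog_to_monitors=false'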

I ended up doing a rolling restart of the whole cluster, which must have
cleared whatever was blocking things, since the monitors automatically
compacted and 'b' rejoined the quorum about 75% of the way through.
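
The restart itself was nothing fancy, just restarting the systemd units
host by host and letting things settle in between, roughly (hostnames
illustrative):

    for host in ceph-osd01 ceph-osd02 ceph-osd03; do
        ssh $host systemctl restart ceph-osd.target
        sleep 300   # give PGs time to peer before the next host
    done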

Thanks,
Lincoln
________________________________
From: Wido den Hollander <w...@42on.com>
Sent: Wednesday, March 3, 2021 2:03 AM
To: Lincoln Bryant <linco...@uchicago.edu>; ceph-users <ceph-users@ceph.io>
Subject: Re: [ceph-users] Monitor leveldb growing without bound v14.2.16



On 03/03/2021 00:55, Lincoln Bryant wrote:
> Hi list,
>
> We recently had a cluster outage over the weekend where several OSDs were
> inaccessible overnight for several hours. When I found the cluster in the
> morning, the monitors' root disks (which contained both the monitors' leveldb
> and the Ceph logs) had completely filled.
>
> After restarting OSDs, cleaning out the monitors' logs, moving /var/lib/ceph
> to dedicated disks on the mons, and starting recovery (in which there was 1
> unfound object that I marked lost, if that has any relevance), the leveldb
> has continued to grow without bound. The cluster has all PGs in
> active+clean at this point, yet I'm accumulating roughly 1 GB/hr of new
> leveldb data.
>
> Two of the monitors (a, c) are in quorum, while the third (b) has been
> stuck in the 'synchronizing' state for the last four hours and doesn't
> seem to be able to catch up. Its log has many messages about compacting
> and deleting files, yet it never exits the synchronization state.
>
> The ceph.log is also rapidly accumulating complaints that the mons are slow
> (not surprising, I suppose, since the leveldb stores are ~100GB at this point).
>
> I've found that using ceph-monstore-tool to compact mons 'a' and 'c'
> helps, but it is only a temporary fix. Soon the database inflates again
> and I'm back to where I started.

Are all the PGs in the active+clean state? I assume not. Unclean PGs
cause the MONs to keep a large history of OSDMaps in their DB, and thus
it will keep growing.
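
You can check that, and also see how far back the OSDMap history goes,
with something like this (the jq filter is from memory):

    ceph pg stat
    ceph report | jq '.osdmap_first_committed, .osdmap_last_committed'

A large gap between those two numbers means the MONs are holding on to a
lot of old OSDMaps.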

>
> Thoughts on how to proceed here? Some ideas I had:
>     - Would it help to add some new monitors that use RocksDB?

They would need to sync, which can take a lot of time. Moving to RocksDB
is a good idea once this is all fixed.

>     - Stop a monitor and dump the keys via monstoretool, just to get an idea 
> of what's going on?
>     - Increase mon_sync_max_payload_size to try to move data in larger chunks?

I would just try it.
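
Something like this, the value is just an example:

    ceph tell mon.* injectargs '--mon_sync_max_payload_size=8388608'

or put it under [mon] in ceph.conf and restart the mons.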

>     - Drop down to a single monitor, and see if normal compaction triggers 
> and stops growing unbounded?

It will keep growing; compaction only helps for a limited time. Make
sure the PGs become clean again.

In the meantime make sure you have enough disk space.
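
You can also trigger a compaction on a running mon and keep an eye on
the store size, something like:

    ceph tell mon.a compact
    du -sh /var/lib/ceph/mon/ceph-a/store.db

and there is mon_compact_on_start=true if you want a compaction at every
mon start.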

Wido

>     - Stop both 'a' and 'c', compact them, start them, and immediately start 
> 'b' ?
>
> Appreciate any advice.
>
> Regards,
> Lincoln
>
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
