[ceph-users] Re: MON sync time depends on outage duration

2023-07-06 Thread Dan van der Ster
Hi Eugen! Yes that sounds familiar from the luminous and mimic days. Check this old thread: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/F3W2HXMYNF52E7LPIQEJFUTAD3I7QE25/ (that thread is truncated but I can tell you that it worked for Frank). Also the even older referenced thre

[ceph-users] Re: MON sync time depends on outage duration

2023-07-06 Thread Konstantin Shalygin
Hi, and in addition to Dan's suggestion: an HDD is not a good choice for RocksDB, which is most likely the reason for this thread. I think that from the 3rd time the database just goes into compaction maintenance. k Sent from my iPhone > On 6 Jul 2023, at 23:48, Eugen Block wrote: > The MON st

[ceph-users] Re: MON sync time depends on outage duration

2023-07-07 Thread Eugen Block
Thanks, Dan! Yes, that sounds familiar from the luminous and mimic days. The workaround for zillions of snapshot keys at that time was to use: ceph config set mon mon_sync_max_payload_size 4096. I actually searched for mon_sync_max_payload_keys, not bytes, so I missed your thread, it seems
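For reference, the two sync knobs discussed in this thread can be inspected and changed with `ceph config`. A minimal sketch, assuming a cephadm-style cluster; the 4096 value is the workaround quoted above, not a recommendation (later messages report it made things worse):

```shell
# Inspect the current mon sync payload limits.
ceph config get mon mon_sync_max_payload_size
ceph config get mon mon_sync_max_payload_keys

# The luminous/mimic-era workaround: cap each sync payload at 4 KiB.
ceph config set mon mon_sync_max_payload_size 4096

# Revert to the default if sync gets slower instead of faster.
ceph config rm mon mon_sync_max_payload_size
```

Note that the option ending in `_size` is a byte limit and the one ending in `_keys` is a key-count limit; a sync message stops at whichever limit it hits first.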

[ceph-users] Re: MON sync time depends on outage duration

2023-07-07 Thread Eugen Block
I forgot to add one question. @Konstantin, you wrote: "I think that from the 3rd time the database just goes into compaction maintenance." Can you share some more details about what exactly you mean? Do you mean that if I restart a MON three times, it goes into compaction maintenance, and that it's

[ceph-users] Re: MON sync time depends on outage duration

2023-07-07 Thread Konstantin Shalygin
This is a guess: the database tends to swell. LevelDB-style stores especially can grow 2x and then shrink by tens of percent of the total size. This may be just another SST file creation, 1 GB by default, if I remember right. Did you look at Grafana for this HDD's utilization and IOPS? k Sent from my iP

[ceph-users] Re: MON sync time depends on outage duration

2023-07-07 Thread Eugen Block
We did look at the iostats of the disk, it was not saturated, but I don't have any specific numbers right now as I don't have direct access. But I'm open to more theories about why waiting for 5 minutes lets the MON sync immediately, while waiting longer takes so much more time. If necessary we'l

[ceph-users] Re: MON sync time depends on outage duration

2023-07-10 Thread Eugen Block
Hi, I got a customer response with payload size 4096; that made things even worse. The mon startup time was now around 40 minutes. My doubts about decreasing the payload size seem confirmed. Then I read Dan's response again, which also mentions that the default payload size could be too small
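A back-of-envelope calculation suggests why shrinking mon_sync_max_payload_size can make sync slower: each sync message stops at whichever limit (bytes or keys) is hit first, so a 4096-byte ceiling means only a handful of keys per message and far more round trips. A sketch under assumed values (defaults of 1 MiB / 2000 keys and an assumed ~100-byte average entry; all numbers are illustrative, not measured):

```python
def sync_messages(total_keys, payload_bytes, payload_keys, avg_entry_bytes):
    """Rough number of mon sync messages: each message is capped by
    both a byte limit and a key-count limit; the stricter one wins."""
    keys_per_msg = min(payload_keys, max(1, payload_bytes // avg_entry_bytes))
    # Ceiling division: the last message may be partially filled.
    return -(-total_keys // keys_per_msg)

# 42 million osd_snap keys, the count reported later in this thread.
TOTAL = 42_000_000

default = sync_messages(TOTAL, 1_048_576, 2000, 100)  # key limit dominates
tiny    = sync_messages(TOTAL, 4_096, 2000, 100)      # 4 KiB cap dominates

print(default, tiny)  # 21000 vs. 1050000 round trips
```

Under these assumptions the 4096-byte cap multiplies the message count by roughly 50x, which is consistent with the much longer startup time reported above.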

[ceph-users] Re: MON sync time depends on outage duration

2023-07-10 Thread Dan van der Ster
Oh yes, sounds like purging the rbd trash will be the real fix here! Good luck! __ Clyso GmbH | Ceph Support and Consulting | https://www.clyso.com On Mon, Jul 10, 2023 at 6:10 AM Eugen Block wrote: > Hi, > I got a customer response with pa

[ceph-users] Re: MON sync time depends on outage duration

2023-07-11 Thread Eugen Block
I'm not so sure anymore if that could really help here. The dump-keys output from the mon contains 42 million osd_snap prefix entries, 39 million of which are "purged_snap" keys. I also compared with other clusters; those aren't tombstones but the expected "history" of purged snapshots. So
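Per-prefix counts like the ones above can be derived from a key listing of the mon store. A minimal sketch on a toy stand-in file; the `prefix / key` line format is an assumption modeled on the dump-keys output described in this thread, so adjust the patterns to whatever your dump actually looks like:

```shell
# Toy stand-in for a mon store key dump, e.g. from:
#   ceph-monstore-tool /var/lib/ceph/mon/ceph-NAME dump-keys
cat > /tmp/mon_keys.txt <<'EOF'
osd_snap / purged_snap_1_0000000000000001
osd_snap / purged_snap_1_0000000000000002
osd_snap / removed_snap_1_0000000000000003
auth / ceph.client.admin
EOF

# Count all osd_snap entries and the purged_snap subset.
grep -c '^osd_snap' /tmp/mon_keys.txt
grep -c 'purged_snap' /tmp/mon_keys.txt
```

On the real dump this distinguishes the total osd_snap count (42 million here) from the purged_snap subset (39 million).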

[ceph-users] Re: MON sync time depends on outage duration

2023-07-11 Thread Josh Baergen
Out of curiosity, what is your require_osd_release set to? (ceph osd dump | grep require_osd_release) Josh On Tue, Jul 11, 2023 at 5:11 AM Eugen Block wrote: > > I'm not so sure anymore if that could really help here. The dump-keys > output from the mon contains 42 million osd_snap prefix entrie

[ceph-users] Re: MON sync time depends on outage duration

2023-07-11 Thread Eugen Block
It was installed with Octopus and hasn't been upgraded yet: "require_osd_release": "octopus", Zitat von Josh Baergen : Out of curiosity, what is your require_osd_release set to? (ceph osd dump | grep require_osd_release) Josh On Tue, Jul 11, 2023 at 5:11 AM Eugen Block wrote: I'm

[ceph-users] Re: MON sync time depends on outage duration

2023-07-12 Thread Eugen Block
My test with a single-host cluster (virtual machine) finished after around 20 hours. I removed all purged_snap keys from the mon and it actually started again (I wasn't sure if I could have expected that). Is that a valid approach to reduce the mon store size? Or can it be dangerous?
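For posterity, offline key removal of this kind is typically done with ceph-kvstore-tool against a stopped mon. This is a hedged sketch only: the store path, the unit name, and the whitespace-separated `list` output parsing are assumptions, and this is exactly the kind of surgery that should first be validated on a throwaway cluster, as described above.

```shell
# DANGER: offline surgery on the mon store. Stop the mon and back up first.
MON_ID=$(hostname -s)
MON_STORE=/var/lib/ceph/mon/ceph-$MON_ID/store.db

systemctl stop ceph-mon@"$MON_ID"
cp -a "$MON_STORE" "$MON_STORE.backup"

# List osd_snap keys, keep only the purged_snap ones, remove them one by one.
# (Assumes `list` prints "<prefix> <key>" per line.)
ceph-kvstore-tool rocksdb "$MON_STORE" list osd_snap \
  | awk '$2 ~ /^purged_snap/ { print $2 }' \
  | while read -r key; do
      ceph-kvstore-tool rocksdb "$MON_STORE" rm osd_snap "$key"
    done

# Compact afterwards so RocksDB actually reclaims the space.
ceph-kvstore-tool rocksdb "$MON_STORE" compact
```

Whether removing purged_snap history is safe at all is precisely the open question in this message; the sketch only shows the mechanics, not an endorsement.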

[ceph-users] Re: MON sync time depends on outage duration

2023-07-28 Thread Eugen Block
Hi, I think we found an explanation for the behaviour, we still need to verify it though. Just wanted to write it up for posterity. We already knew that the large number of "purged_snap" keys in the mon store is responsible for the long synchronization. Removing them didn't seem to have a n