[ceph-users] Re: OSD bootstrap time

2021-06-22 Thread Jan-Philipp Litza
Hi again,

it turns out the long bootstrap time was my own fault. I had some down
OSDs for quite a long time, which prevented the monitors from pruning
old OSD maps. That makes sense in hindsight, but I hadn't thought of it
before. Rich's hint to get the cluster back to HEALTH_OK first pointed
me in the right direction, as did the docs on full OSDMap version
pruning [1], which mention the constraints in OSDMonitor::get_trim_to().
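
For anyone hitting the same issue: the range of OSDMap epochs the mons
still hold should be visible with something like

  ceph report | grep -E 'osdmap_(first|last)_committed'

A large gap between the two values means the maps aren't being trimmed.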

So I destroyed the OSDs (they didn't hold any data anyway), and the
mons' DBs shrank by almost 8 GB, down to only ~160 MB.
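
For the record, the cleanup was essentially (the osd id here is just an
example):

  ceph osd destroy 42 --yes-i-really-mean-it

with a du -sh on /var/lib/ceph/mon/ before and after to watch the store
shrink.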

Thanks for helping figure this out! I promise not to leave down OSDs
lingering anymore. ;-)

Best regards,
Jan-Philipp

[1]: https://docs.ceph.com/en/latest/dev/mon-osdmap-prune/


[ceph-users] Re: OSD bootstrap time

2021-06-09 Thread Konstantin Shalygin
This is due to the new min_alloc_size for BlueStore: an mkfs with 4K
allocation units takes more time, and the process is single-threaded,
I think. It's normal.
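
If you want to verify what a given OSD is using, querying its admin
socket should work, e.g. (substitute your OSD id for osd.X):

  ceph daemon osd.X config get bluestore_min_alloc_size_hdd
  ceph daemon osd.X config get bluestore_min_alloc_size_ssd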


k

> On 9 Jun 2021, at 14:21, Jan-Philipp Litza  wrote:
> 
> I mean freshly deployed OSDs. Restarted OSDs don't exhibit that behavior.



[ceph-users] Re: OSD bootstrap time

2021-06-09 Thread Jan-Philipp Litza
Hi Konstantin,

I mean freshly deployed OSDs. Restarted OSDs don't exhibit that behavior.

Best regards,
Jan-Philipp


[ceph-users] Re: OSD bootstrap time

2021-06-09 Thread Jan-Philipp Litza
Hi Rich,

> I've noticed this a couple of times on Nautilus after doing some large
> backfill operations. It seems the OSD maps don't get trimmed properly
> after the cluster returns to HEALTH_OK, and they build up on the mons.
> A "du" on the mon folder, e.g. du -shx /var/lib/ceph/mon/, shows
> several GB of data.

It does, almost 8 GB for <300 OSDs, and it has grown several-fold over
the last weeks (since we started upgrading Nautilus -> Pacific). However,
I didn't think much of it after reading the hardware recommendations in
the docs, which call for at least 60 GB per ceph-mon [1].

> I give all my mgrs and mons a restart, and after a few minutes I can
> see the OSD map data getting purged from the mons. After a while it
> should be back down to a few hundred MB (depending on cluster size).
> This may not be the problem in your case, but it's an easy thing to
> try. Note that if your cluster is being held in WARN or ERR by
> something, that can also explain the OSD maps not being trimmed. Make
> sure you get the cluster back to HEALTH_OK first.

Thanks for the suggestion; I'll try that once we reach HEALTH_OK.

Best regards,
Jan-Philipp

[1]:
https://docs.ceph.com/en/latest/start/hardware-recommendations/#minimum-hardware-recommendations


[ceph-users] Re: OSD bootstrap time

2021-06-09 Thread Konstantin Shalygin
Hi,

Do you mean freshly deployed OSDs, or existing OSDs that were just
restarted?


Thanks,
k

Sent from my iPhone

> On 8 Jun 2021, at 23:30, Jan-Philipp Litza  wrote:
> 
> recently I've been noticing that starting OSDs for the first time takes
> ages (like, more than an hour) before they are even picked up by the
> monitors as "up" and start backfilling. I'm not entirely sure whether
> this is a new phenomenon or whether it has always been that way. Either
> way, I'd like to understand why.


[ceph-users] Re: OSD bootstrap time

2021-06-08 Thread Richard Bade
Hi Jan-Philipp,
I've noticed this a couple of times on Nautilus after doing some large
backfill operations. It seems the OSD maps don't get trimmed properly
after the cluster returns to HEALTH_OK, and they build up on the mons.
A "du" on the mon folder, e.g. du -shx /var/lib/ceph/mon/, shows
several GB of data.
I give all my mgrs and mons a restart, and after a few minutes I can
see the OSD map data getting purged from the mons. After a while it
should be back down to a few hundred MB (depending on cluster size).
This may not be the problem in your case, but it's an easy thing to
try. Note that if your cluster is being held in WARN or ERR by
something, that can also explain the OSD maps not being trimmed. Make
sure you get the cluster back to HEALTH_OK first.
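
In shell terms, the whole check-and-restart is roughly (systemd unit
names may differ depending on how your daemons were deployed):

  du -shx /var/lib/ceph/mon/
  systemctl restart ceph-mon@$(hostname -s)
  systemctl restart ceph-mgr@$(hostname -s)
  ceph health detail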

Rich

On Wed, 9 Jun 2021 at 08:29, Jan-Philipp Litza  wrote:
>
> Hi everyone,
>
> recently I've been noticing that starting OSDs for the first time takes
> ages (like, more than an hour) before they are even picked up by the
> monitors as "up" and start backfilling. I'm not entirely sure whether
> this is a new phenomenon or whether it has always been that way. Either
> way, I'd like to understand why.
>
> When I execute `ceph daemon osd.X status`, it says "state: preboot",
> and I can see "newest_map" increasing slowly. Apparently, a new OSD
> doesn't just fetch the latest OSD map and get to work, but instead
> fetches hundreds of thousands of OSD maps from the mon, burning CPU
> while parsing them.
>
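
That newest_map crawl is the thing to watch while the OSD catches up,
e.g. (reusing your osd.X placeholder):

  watch -n 10 'ceph daemon osd.X status'

and comparing newest_map against the current cluster epoch from
`ceph osd stat`.
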
> I wasn't able to find any good documentation on the OSDMap, in
> particular why its historical versions need to be kept and why the OSD
> seemingly needs so many of them. Can anybody point me in the right
> direction? Or is something wrong with my cluster?
>
> Best regards,
> Jan-Philipp Litza