[ceph-users] Re: Slow ops during recovery for RGW index pool only when degraded OSD is primary

2024-04-04 Thread Wesley Dillingham
Initial indications show "osd_async_recovery_min_cost = 0" to be a huge win here. Some initial thoughts. Were it not for the fact that the index (and other OMAP pools) were isolated to their own OSDs in this cluster, this tunable would seemingly cause data/blob objects from data pools to async

[ceph-users] Re: Slow ops during recovery for RGW index pool only when degraded OSD is primary

2024-04-03 Thread Anthony D'Atri
Thanks. I'll PR up some doc updates reflecting this and run them by the RGW / RADOS folks.
> On Apr 3, 2024, at 16:34, Joshua Baergen wrote:
>
> Hey Anthony,
>
> Like with many other options in Ceph, I think what's missing is the
> user-visible effect of what's being altered. I believe the

[ceph-users] Re: Slow ops during recovery for RGW index pool only when degraded OSD is primary

2024-04-03 Thread Joshua Baergen
Hey Anthony,
Like with many other options in Ceph, I think what's missing is the user-visible effect of what's being altered. I believe the reason why synchronous recovery is still used is that, assuming that per-object recovery is quick, it's faster to complete than asynchronous recovery, which

[ceph-users] Re: Slow ops during recovery for RGW index pool only when degraded OSD is primary

2024-04-03 Thread Anthony D'Atri
We currently have in src/common/options/global.yaml.in:
    - name: osd_async_recovery_min_cost
      type: uint
      level: advanced
      desc: A mixture measure of number of current log entries difference and historical missing objects, above which we switch to use asynchronous recovery when

[ceph-users] Re: Slow ops during recovery for RGW index pool only when degraded OSD is primary

2024-04-03 Thread Joshua Baergen
We've had success using osd_async_recovery_min_cost=0 to drastically reduce slow ops during index recovery.
Josh
On Wed, Apr 3, 2024 at 11:29 AM Wesley Dillingham wrote:
>
> I am fighting an issue on an 18.2.0 cluster where a restart of an OSD which
> supports the RGW index pool causes
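
For reference, a minimal sketch of how the setting discussed in this thread could be applied through Ceph's centralized config database. The thread does not say how the posters applied it, so the `osd` target (every OSD in the cluster) is an assumption, and `osd.12` in the comment is a hypothetical OSD ID used only for illustration.

```shell
# Assumption: apply the override to all OSDs. To limit it to specific
# daemons, replace "osd" with a daemon name such as osd.12 (hypothetical ID).
ceph config set osd osd_async_recovery_min_cost 0

# Read back the value stored in the config database.
ceph config get osd osd_async_recovery_min_cost

# Remove the override again, which returns the option to its built-in default.
ceph config rm osd osd_async_recovery_min_cost
```

Whether a cluster-wide override is appropriate depends on pool layout; the 2024-04-04 message above notes that the index pools in that cluster were isolated on their own OSDs.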