Initial indication shows "osd_async_recovery_min_cost = 0" to be a huge
win here. Some initial thoughts: were it not for the fact that the index
(and other OMAP pools) are isolated to their own OSDs in this cluster,
this tunable would seemingly cause data/blob objects from data pools to
recover asynchronously as well.
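For example (a sketch only, not something we've tried here): if the index
OSDs carry a distinct CRUSH device class, say "nvme", the tunable could be
scoped to just those OSDs with an ordinary config mask rather than set
cluster-wide:

    # "class:nvme" is hypothetical; substitute whatever class the index OSDs use
    ceph config set osd/class:nvme osd_async_recovery_min_cost 0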
Thanks. I'll PR up some doc updates reflecting this and run them by the RGW /
RADOS folks.
> On Apr 3, 2024, at 16:34, Joshua Baergen wrote:
>
> [...]
Hey Anthony,

Like with many other options in Ceph, I think what's missing is the
user-visible effect of what's being altered. I believe the reason why
synchronous recovery is still used is that, assuming per-object
recovery is quick, it's faster to complete than asynchronous recovery,
which carries extra overhead.
We currently have this in src/common/options/global.yaml.in:

- name: osd_async_recovery_min_cost
  type: uint
  level: advanced
  desc: A mixture measure of number of current log entries difference and historical
    missing objects, above which we switch to use asynchronous recovery when appropriate
  default: 100
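The same metadata (including the default and flags) can also be confirmed
against a running cluster with the stock CLI, e.g.:

    ceph config help osd_async_recovery_min_cost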
We've had success using osd_async_recovery_min_cost=0 to drastically
reduce slow ops during index recovery.
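Concretely, one standard way to apply and then verify that setting is:

    ceph config set osd osd_async_recovery_min_cost 0
    ceph config get osd osd_async_recovery_min_cost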
Josh
On Wed, Apr 3, 2024 at 11:29 AM Wesley Dillingham wrote:
>
> I am fighting an issue on an 18.2.0 cluster where a restart of an OSD which
> supports the RGW index pool causes significant slow ops during recovery.