Hi,

On 9/19/25 13:23, Niklas Hambüchen wrote:
I noticed that for my clusters, even a short 5-minute network outage or 
single-host reboot can cause

     pgs:     5586988/366684639 objects misplaced (1.524%)

which at the speed of

     recovery: 2.2 GiB/s, 676 objects/s

can take hours to recover.

I don't understand how this can be. If it's down for so short a time, how can rebalancing take this long?
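For reference, the quoted figures alone already imply a multi-hour backfill. A quick back-of-the-envelope check, using only the numbers above:

```shell
# Recovery-time estimate from the figures quoted above:
# 5586988 misplaced objects at 676 objects/s.
objects=5586988
rate=676
secs=$(( objects / rate ))
echo "$secs seconds (~$(( secs / 3600 )) hours)"   # prints "8264 seconds (~2 hours)"
```

So even before any prioritization effects, the object rate alone puts the backfill in the hours range.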

If the objects are only misplaced, not degraded, then full data availability and consistency according to the chosen CRUSH rules is still guaranteed. The objects are "only" on the wrong OSDs.
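You can check which case you are in from the cluster status. A sketch of the usual commands (output wording varies by release):

```shell
# Overall state: look for "misplaced" vs "degraded" in the pgs/health lines.
ceph -s

# More detail on which PGs are affected and why:
ceph health detail
```

If the health output only mentions misplaced objects and remapped PGs, no redundancy has been lost.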

In my experience, backfill operations for misplaced objects have a lower priority than other operations. Things are different if objects are _degraded_: in that case Ceph pushes the backfill operations harder.


You also have to consider how many backfill operations can run in parallel. Each OSD has a limited number of backfill slots, and all OSDs handling a PG have to provide a free slot for the operation. For replicated pools this means three OSDs are involved, probably more for EC pools. And finally there's Murphy: the last objects often have to be handled by the same small set of OSDs, so backfill operations queue up and progress slows.
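The slot count is controlled by osd_max_backfills. A sketch of inspecting and cautiously raising it (defaults and interactions differ by release; under the mClock scheduler this value may be ignored unless overrides are enabled):

```shell
# Current per-OSD backfill slot count:
ceph config get osd osd_max_backfills

# Temporarily allow more concurrent backfills per OSD:
ceph config set osd osd_max_backfills 2

# Revert to the default once the cluster has caught up:
ceph config rm osd osd_max_backfills
```

Raising this speeds up backfill at the cost of more load on client I/O, so change it in small steps and watch the cluster.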

There are a number of settings for backfilling, but which ones apply differs, e.g. depending on the selected osd_op_queue.
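A sketch of checking which scheduler is active and, for mClock, shifting capacity toward recovery (profile names are from recent releases; on wpq the classic knobs apply instead):

```shell
# Which op scheduler the OSDs use determines which knobs matter:
ceph config get osd osd_op_queue        # e.g. "wpq" or "mclock_scheduler"

# With mclock_scheduler, a profile can favor recovery/backfill traffic:
ceph config set osd osd_mclock_profile high_recovery_ops

# With wpq, tune the classic options instead, e.g.:
#   osd_max_backfills, osd_recovery_max_active, osd_recovery_sleep
```

Remember to switch the profile back (e.g. to balanced) once the misplaced objects are gone, or client I/O stays deprioritized.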

Best regards,

Burkhard

_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]