Just a shot in the dark,

Perhaps you have many backfill operations running in parallel.
You can throttle recovery by limiting osd_max_backfills.
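
For example, something like this (a minimal sketch, assuming a release with
the central config store, i.e. Nautilus or later; on older clusters
"ceph tell osd.* injectargs '--osd-max-backfills 1'" does the same at runtime):

    ceph config set osd osd_max_backfills 1
    ceph config set osd osd_recovery_max_active 1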

Hth
Mehmet 

On 2 December 2022 21:09:04 CET, Wyll Ingersoll 
<wyllys.ingers...@keepertech.com> wrote:
>
>We have a large cluster (10PB) which is about 30% full at this point.  We 
>recently fixed a configuration issue that then triggered the pg autoscaler to 
>start moving around massive amounts of data (85% misplaced objects - about 
>7.5B objects).  The misplaced % is dropping slowly (about 10% each day), but 
>the overall data usage is growing by about 300T/day even though the data being 
>written by clients is well under 30T/day.
>
>The issue was that we have both 3x replicated pools and a very large 
>erasure-coded (8+4) data pool for RGW.  The autoscaler doesn't work if it sees 
>what it thinks are overlapping roots ("default" vs "default~hdd" in the crush 
>tree; even though both refer to the same OSDs, they have different ids: -1 vs 
>-2).  We cleared that by setting the same root for both crush rules, and then 
>the PG autoscaler kicked in and started doing its thing.
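>
>(For reference, a minimal sketch of how to inspect this, assuming a 
>Nautilus-or-later release where the autoscaler exists: the shadow roots and 
>their separate ids show up in the crush tree, and autoscale-status shows what 
>the autoscaler sees for each pool.)
>
>    ceph osd crush tree --show-shadow
>    ceph osd pool autoscale-status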
>
>The "ceph osd df" output shows the OMAP jumping significantly and our data 
>availability is shrinking MUCH faster than we would expect based on the client 
>usage.
>
>Questions:
>
>  *   What is causing the OMAP data consumption to grow so fast and can it be 
> trimmed/throttled?
>  *   Will the overhead data be cleaned up once the misplaced object counts 
> drop to a much lower value?
>  *   Would it do any good to disable the autoscaler at this point since the 
> PGs have already started being moved?
>  *   Any other recommendations to make this go smoother?
>
>thanks!
>
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
