I halfway suspect that something akin to the speculation in
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/7MWAHAY7NCJK2DHEGO6MO4SWTLPTXQMD/
is going on.
Below are reservations reported by a random OSD that serves (mostly) an EC RGW
bucket pool. This is with the mclock
We have had PGs get stuck on Quincy (17.2.7). After changing to wpq,
we observed no such problems. We're using a replicated (x3) pool.
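For reference, a minimal sketch of the scheduler switch mentioned here, assuming the cluster uses the central config database (`ceph config`). As far as I know `osd_op_queue` is only read when an OSD starts, so the OSDs need a restart afterwards; how to restart them depends on the deployment and is left out. The `run` wrapper, the `switch_to_wpq` function and the `DRY_RUN` variable are illustrative helpers, not part of Ceph.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to run them against a real cluster.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

switch_to_wpq() {
    # Show the op queue scheduler currently configured for OSDs.
    run ceph config get osd osd_op_queue
    # Select wpq instead of mclock_scheduler (needs an OSD restart to apply).
    run ceph config set osd osd_op_queue wpq
}

switch_to_wpq
```

With the default `DRY_RUN=1` this only prints the two commands, which makes it easy to review them before touching a production cluster.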
On 2024-05-02 10:02, Wesley Dillingham wrote:
In our case it was with an EC pool as well. I believe the PG state was
degraded+recovering / recovery_wait
In our case it was with an EC pool as well. I believe the PG state was
degraded+recovering / recovery_wait and, IIRC, the PGs simply sat in the
recovering state without any progress (the degraded PG object count did not
decline). A repeer of the PG was attempted, without success. A restart
of
>
> Multiple people -- including me -- have also observed backfill/recovery
> stop completely for no apparent reason.
>
> In some cases poking the lead OSD for a PG with `ceph osd down` restores,
> in other cases it doesn't.
>
> Anecdotally this *may* only happen for EC pools on HDDs but that
>> For our customers we are still disabling mclock and using wpq. Might be
>> worth trying.
>>
>>
> Could you please elaborate a bit on the issue(s) preventing the
> use of mClock. Is this specific to only the slow backfill rate and/or other
> issue?
>
> This feedback would help prioritize
Hi Sridhar,
(Very!) Slow backfill was one issue, but if I recall we hit a case where
backfill wasn't completing at all until we reverted to WPQ. I was
getting hammered with other stuff at the time so I don't quite remember
the details, but Dan might. I think this was in Quincy after the
Hi Mark,
On Thu, May 2, 2024 at 3:18 AM Mark Nelson wrote:
> For our customers we are still disabling mclock and using wpq. Might be
> worth trying.
>
>
Could you please elaborate a bit on the issue(s) preventing the
use of mClock? Is this specific to only the slow backfill rate and/or other
Hi Götz,
Please see my response below.
On Tue, Apr 30, 2024 at 7:39 PM Pierre Riteau wrote:
> Hi Götz,
>
> You can change the value of osd_max_backfills (for all OSDs or specific
> ones) using `ceph config`, but you need to
> enable osd_mclock_override_recovery_settings. See
>
>
For our customers we are still disabling mclock and using wpq. Might be
worth trying.
Mark
On 4/30/24 09:08, Pierre Riteau wrote:
Hi Götz,
You can change the value of osd_max_backfills (for all OSDs or specific
ones) using `ceph config`, but you need to
enable
Hi Götz,
You can change the value of osd_max_backfills (for all OSDs or specific
ones) using `ceph config`, but you need to
enable osd_mclock_override_recovery_settings. See
https://docs.ceph.com/en/quincy/rados/configuration/mclock-config-ref/#steps-to-modify-mclock-max-backfills-recovery-limits
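A rough sketch of the two steps described above, following the linked documentation: first enable `osd_mclock_override_recovery_settings`, then set `osd_max_backfills`. The value `3`, the `run` wrapper, the `set_max_backfills` function and the `DRY_RUN` variable are assumptions made for illustration only; pick a backfill limit that suits your hardware.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to run them against a real cluster.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

set_max_backfills() {
    target="$1"   # "osd" for all OSDs, or e.g. "osd.0" for a specific one
    value="$2"    # desired osd_max_backfills (illustrative, not a recommendation)
    # Without this override, mClock ignores changes to the backfill limit.
    run ceph config set "$target" osd_mclock_override_recovery_settings true
    run ceph config set "$target" osd_max_backfills "$value"
}

set_max_backfills osd 3
```

To check what a given OSD actually picked up, something like `ceph config show osd.0 | grep osd_max_backfills` should work.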