[ceph-users] Re: Ceph reef and (slow) backfilling - how to speed it up

2024-05-12 Thread Anthony D'Atri
I halfway suspect that something akin to the speculation in 
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/7MWAHAY7NCJK2DHEGO6MO4SWTLPTXQMD/
 is going on.

Below are reservations reported by a random OSD that serves (mostly) an EC RGW
bucket pool.  This is with the mClock override on and the usual three
backfill/recovery tunables set to 7 (bumped to get more OSDs backfilling after
I changed to a rack failure domain; having 50+ % of objects remapped makes me
nervous and I want convergence).
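
For reference, the override and the tunables above map to commands along these
lines (the exact set of three options isn't spelled out here, so treat this as
a sketch rather than the literal commands used):

  ceph config set osd osd_mclock_override_recovery_settings true
  ceph config set osd osd_max_backfills 7
  ceph config set osd osd_recovery_max_active 7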

3 happens to be the value of osd_recovery_max_active_hdd, so maybe there is
some interaction between EC and how osd_recovery_max_active is derived and used?

Complete wild-ass speculation.
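
To check what an OSD actually derives for these options, the admin socket
works, e.g. for the OSD dumped below:

  ceph daemon osd.313 config get osd_recovery_max_active
  ceph daemon osd.313 config get osd_recovery_max_active_hdd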

Just for grins, after `ceph osd down 313`

* local_reservations incremented
* remote_reservations decreased somewhat
* cluster aggregate recovery speed increased for at least the short term


[root@rook-ceph-osd-313-6f84bc5bd5-hr825 ceph]# ceph daemon osd.313 dump_recovery_reservations
{
    "local_reservations": {
        "max_allowed": 7,
        "min_priority": 0,
        "queues": [],
        "in_progress": [
            {
                "item": "21.161es0",
                "prio": 110,
                "can_preempt": true
            },
            {
                "item": "21.180bs0",
                "prio": 110,
                "can_preempt": true
            },
            {
                "item": "21.1e0as0",
                "prio": 110,
                "can_preempt": true
            }
        ]
    },
    "remote_reservations": {
        "max_allowed": 7,
        "min_priority": 0,
        "queues": [
            {
                "priority": 110,
                "items": [
                    {
                        "item": "21.1d18s5",
                        "prio": 110,
                        "can_preempt": true
                    },
                    {
                        "item": "21.7d0s2",
                        "prio": 110,
                        "can_preempt": true
                    },
                    {
                        "item": "21.766s5",
                        "prio": 110,
                        "can_preempt": true
                    },
                    {
                        "item": "21.373s1",
                        "prio": 110,
                        "can_preempt": true
                    },
                    {
                        "item": "21.1a8es1",
                        "prio": 110,
                        "can_preempt": true
                    },
                    {
                        "item": "21.2das2",
                        "prio": 110,
                        "can_preempt": true
                    },
                    {
                        "item": "21.14a0s2",
                        "prio": 110,
                        "can_preempt": true
                    },
                    {
                        "item": "21.c7fs5",
                        "prio": 110,
                        "can_preempt": true
                    },
                    {
                        "item": "21.18e5s5",
                        "prio": 110,
                        "can_preempt": true
                    },
                    {
                        "item": "21.54ds2",
                        "prio": 110,
                        "can_preempt": true
                    },
                    {
                        "item": "21.79bs4",
                        "prio": 110,
                        "can_preempt": true
                    },
                    {
                        "item": "21.15c3s2",
                        "prio": 110,
                        "can_preempt": true
                    },
                    {
                        "item": "21.e15s4",
                        "prio": 110,
                        "can_preempt": true
                    },
                    {
                        "item": "21.226s3",
                        "prio": 110,
                        "can_preempt": true
                    },
                    {
                        "item": "21.adfs2",
                        "prio": 110,
                        "can_preempt": true
                    },
                    {
                        "item": "21.184bs4",
                        "prio": 110,
                        "can_preempt": true
                    },
                    {
                        "item": "21.f43s3",
                        "prio": 110,
                        "can_preempt": true
                    },
                    {
                        "item": "21.f5cs4",
                        "prio": 110,
                        "can_preempt": true
                    },
                    {
                        "item": "21.1300s3",
                        "prio": 110,
                        "can_preempt": true
                    },
                    {

[ceph-users] Re: Ceph reef and (slow) backfilling - how to speed it up

2024-05-10 Thread Matthew Darwin
We have had PGs get stuck in Quincy (17.2.7). After changing to wpq,
no such problems were observed. We're using a replicated (x3) pool.


On 2024-05-02 10:02, Wesley Dillingham wrote:

In our case it was with an EC pool as well. I believe the PG state was
degraded+recovering / recovery_wait and, IIRC, the PGs simply sat in the
recovering state without any progress (the degraded PG object count did not
decline). A repeer of the PG was attempted, but with no success. A restart
of all the OSDs for the given PGs was attempted under mclock. That didn't
work. Switching to wpq for all OSDs in the given PG did resolve the issue.
This was on a 17.2.7 cluster.

Respectfully,

*Wes Dillingham*
LinkedIn
w...@wesdillingham.com




On Thu, May 2, 2024 at 9:54 AM Sridhar Seshasayee
wrote:


Multiple people -- including me -- have also observed backfill/recovery
stop completely for no apparent reason.

In some cases poking the lead OSD for a PG with `ceph osd down` restores
progress; in other cases it doesn't.

Anecdotally this *may* only happen for EC pools on HDDs but that sample
size is small.



Thanks for the information. We will try and reproduce this locally with EC
pools and investigate this further.
I will follow up with a tracker issue for this.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph reef and (slow) backfilling - how to speed it up

2024-05-02 Thread Wesley Dillingham
In our case it was with an EC pool as well. I believe the PG state was
degraded+recovering / recovery_wait and, IIRC, the PGs simply sat in the
recovering state without any progress (the degraded PG object count did not
decline). A repeer of the PG was attempted, but with no success. A restart
of all the OSDs for the given PGs was attempted under mclock. That didn't
work. Switching to wpq for all OSDs in the given PG did resolve the issue.
This was on a 17.2.7 cluster.
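
For anyone wanting to retrace those steps, they look roughly like this (the PG
id is a placeholder, and the op queue change only takes effect after the OSDs
are restarted):

  ceph pg repeer 1.2f3
  ceph config set osd osd_op_queue wpq
  # then restart the affected OSDs so the new scheduler is picked up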

Respectfully,

*Wes Dillingham*
LinkedIn 
w...@wesdillingham.com




On Thu, May 2, 2024 at 9:54 AM Sridhar Seshasayee 
wrote:

> >
> > Multiple people -- including me -- have also observed backfill/recovery
> > stop completely for no apparent reason.
> >
> > In some cases poking the lead OSD for a PG with `ceph osd down` restores
> > progress; in other cases it doesn't.
> >
> > Anecdotally this *may* only happen for EC pools on HDDs but that sample
> > size is small.
> >
> >
> Thanks for the information. We will try and reproduce this locally with EC
> pools and investigate this further.
> I will follow up with a tracker issue for this.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph reef and (slow) backfilling - how to speed it up

2024-05-02 Thread Sridhar Seshasayee
>
> Multiple people -- including me -- have also observed backfill/recovery
> stop completely for no apparent reason.
>
> In some cases poking the lead OSD for a PG with `ceph osd down` restores
> progress; in other cases it doesn't.
>
> Anecdotally this *may* only happen for EC pools on HDDs but that sample
> size is small.
>
>
Thanks for the information. We will try and reproduce this locally with EC
pools and investigate this further.
I will follow up with a tracker issue for this.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph reef and (slow) backfilling - how to speed it up

2024-05-02 Thread Anthony D'Atri



>> For our customers we are still disabling mclock and using wpq. Might be
>> worth trying.
>> 
>> 
> Could you please elaborate a bit on the issue(s) preventing the
> use of mClock. Is this specific to only the slow backfill rate and/or other
> issue?
> 
> This feedback would help prioritize the improvements in those areas.

Multiple people -- including me -- have also observed backfill/recovery stop 
completely for no apparent reason.

In some cases poking the lead OSD for a PG with `ceph osd down` restores
progress; in other cases it doesn't.
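
To spell out the "poking" (the PG id below is just an example; substitute a
stuck one and its acting primary):

  ceph pg map 21.1d18              # shows the up/acting sets and the primary
  ceph osd down <primary-osd-id>   # mark the primary down; it re-peers and comes back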

Anecdotally this *may* only happen for EC pools on HDDs but that sample size is 
small.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph reef and (slow) backfilling - how to speed it up

2024-05-02 Thread Mark Nelson

Hi Sridhar,

(Very!) Slow backfill was one issue, but if I recall we hit a case where 
backfill wasn't completing at all until we reverted to WPQ.  I was 
getting hammered with other stuff at the time so I don't quite remember 
the details, but Dan might.  I think this was in Quincy after the more 
recent updates landed in 17.2.6+ though.


Mark

On 5/2/24 00:05, Sridhar Seshasayee wrote:

Hi Mark,

On Thu, May 2, 2024 at 3:18 AM Mark Nelson  wrote:


For our customers we are still disabling mclock and using wpq. Might be
worth trying.



Could you please elaborate a bit on the issue(s) preventing the
use of mClock. Is this specific to only the slow backfill rate and/or other
issue?

This feedback would help prioritize the improvements in those areas.

Thanks,
-Sridhar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


--
Best Regards,
Mark Nelson
Head of Research and Development

Clyso GmbH
p: +49 89 21552391 12 | a: Minnesota, USA
w: https://clyso.com | e: mark.nel...@clyso.com

We are hiring: https://www.clyso.com/jobs/
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph reef and (slow) backfilling - how to speed it up

2024-05-01 Thread Sridhar Seshasayee
Hi Mark,

On Thu, May 2, 2024 at 3:18 AM Mark Nelson  wrote:

> For our customers we are still disabling mclock and using wpq. Might be
> worth trying.
>
>
Could you please elaborate a bit on the issue(s) preventing the
use of mClock. Is this specific to only the slow backfill rate and/or other
issue?

This feedback would help prioritize the improvements in those areas.

Thanks,
-Sridhar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph reef and (slow) backfilling - how to speed it up

2024-05-01 Thread Sridhar Seshasayee
Hi Götz,

Please see my response below.

On Tue, Apr 30, 2024 at 7:39 PM Pierre Riteau  wrote:

> Hi Götz,
>
> You can change the value of osd_max_backfills (for all OSDs or specific
> ones) using `ceph config`, but you need to
> enable osd_mclock_override_recovery_settings. See
>
> https://docs.ceph.com/en/quincy/rados/configuration/mclock-config-ref/#steps-to-modify-mclock-max-backfills-recovery-limits
> for more information.
>
>
Did the suggestion from Pierre help improve the backfill rate? With the
mClock scheduler, this is the correct way of modifying the values of
osd_max_backfills and osd_recovery_max_active.

As to the observation of slower backfills, this is expected with the
'balanced' and 'high_client_ops' mClock profiles (see the profile
allocations in the mClock configuration reference). This is because
backfill is classified as a background best-effort service and is given a
lower priority than degraded recovery. Degraded recovery (the background
recovery service) is given higher priority because there is a higher risk
of data unavailability if other OSDs in the cluster go down, whereas
backfill only involves data movement.

If the 'high_recovery_ops' profile coupled with increasing the above config
parameters is still not enough to improve the backfill rate, then the
cluster should be examined to see whether other competing services
(degraded recoveries, client ops, etc.) could be affecting the backfill
rate. The ceph status output should give an idea about this.
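
For completeness, the profile switch and the checks described above would look
something like this (osd.0 is only an example target):

  ceph config set osd osd_mclock_profile high_recovery_ops
  ceph config show osd.0 osd_mclock_profile    # confirm what the OSD is running with
  ceph status                                  # look for competing recovery / client load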

-Sridhar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph reef and (slow) backfilling - how to speed it up

2024-05-01 Thread Mark Nelson
For our customers we are still disabling mclock and using wpq. Might be 
worth trying.


Mark


On 4/30/24 09:08, Pierre Riteau wrote:

Hi Götz,

You can change the value of osd_max_backfills (for all OSDs or specific
ones) using `ceph config`, but you need to
enable osd_mclock_override_recovery_settings. See
https://docs.ceph.com/en/quincy/rados/configuration/mclock-config-ref/#steps-to-modify-mclock-max-backfills-recovery-limits
for more information.

Best regards,
Pierre Riteau

On Sat, 27 Apr 2024 at 08:32, Götz Reinicke 
wrote:


Dear ceph community,

I have a Ceph cluster which was upgraded from Nautilus/Pacific/… to Reef over
time. Now I have added two new nodes to an existing EC pool, as I did with the
previous versions of Ceph.

Now I face the fact that the previous „backfilling tuning“ I used, increasing
injectargs --osd-max-backfills=XX --osd-recovery-max-active=YY, does not work
anymore.

By adjusting those parameters, the backfill was running at up to ~2k
objects/s.

As I’m not (yet) familiar with the Reef options, the only speed-up I have
found so far is „ceph config set osd osd_mclock_profile high_recovery_ops“,
which currently runs the backfill at up to 600 objects/s.

My question: what is the best (simple) way to speed that backfill up?

I have tried to understand the custom profiles (?) but without success, and
have not applied anything else yet.

Thanks for feedback and suggestions! Best regards, Götz



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


--
Best Regards,
Mark Nelson
Head of Research and Development

Clyso GmbH
p: +49 89 21552391 12 | a: Minnesota, USA
w: https://clyso.com | e: mark.nel...@clyso.com

We are hiring: https://www.clyso.com/jobs/
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph reef and (slow) backfilling - how to speed it up

2024-04-30 Thread Pierre Riteau
Hi Götz,

You can change the value of osd_max_backfills (for all OSDs or specific
ones) using `ceph config`, but you need to
enable osd_mclock_override_recovery_settings. See
https://docs.ceph.com/en/quincy/rados/configuration/mclock-config-ref/#steps-to-modify-mclock-max-backfills-recovery-limits
for more information.
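
Concretely, following that documentation page, it would look something like
this (osd.5 is just an example of targeting a single OSD, and 4 an example
value):

  ceph config set osd osd_mclock_override_recovery_settings true
  ceph config set osd osd_max_backfills 4      # cluster-wide
  ceph config set osd.5 osd_max_backfills 4    # or for a specific OSD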

Best regards,
Pierre Riteau

On Sat, 27 Apr 2024 at 08:32, Götz Reinicke 
wrote:

> Dear ceph community,
>
> I have a Ceph cluster which was upgraded from Nautilus/Pacific/… to Reef over
> time. Now I have added two new nodes to an existing EC pool, as I did with the
> previous versions of Ceph.
>
> Now I face the fact that the previous „backfilling tuning“ I used, increasing
> injectargs --osd-max-backfills=XX --osd-recovery-max-active=YY, does not work
> anymore.
>
> By adjusting those parameters, the backfill was running at up to ~2k
> objects/s.
>
> As I’m not (yet) familiar with the Reef options, the only speed-up I have
> found so far is „ceph config set osd osd_mclock_profile high_recovery_ops“,
> which currently runs the backfill at up to 600 objects/s.
>
> My question: what is the best (simple) way to speed that backfill up?
>
> I have tried to understand the custom profiles (?) but without success, and
> have not applied anything else yet.
>
> Thanks for feedback and suggestions! Best regards, Götz
>
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io