[ceph-users] tuning for backup target cluster

2024-05-24 Thread Lukasz Borek
Hi Everyone, I'm putting together an HDD cluster with an EC pool dedicated to the backup environment. Traffic via S3. Version 18.2, 7 OSD nodes, 12 * 12TB HDD + 1 NVMe each, 4+2 EC pool. Wondering if there is some general guidance for initial setup/tuning with regard to S3 object size. Files are
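
For reference, a 4+2 erasure-coded RGW data pool on HDDs is typically set up along these lines; the profile name and PG count here are placeholders, not from the thread:

    ceph osd erasure-code-profile set backup-4-2 k=4 m=2 crush-failure-domain=host crush-device-class=hdd
    ceph osd pool create default.rgw.buckets.data 128 128 erasure backup-4-2
    ceph osd pool application enable default.rgw.buckets.data rgw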

[ceph-users] MDS Abort during FS scrub

2024-05-24 Thread Malcolm Haak
When running a cephfs scrub the MDS will crash with the following backtrace -1> 2024-05-25T09:00:23.028+1000 7ef2958006c0 -1 /usr/src/debug/ceph/ceph-18.2.2/src/mds/MDSRank.cc: In function 'void MDSRank::abort(std::string_view)' thread 7ef2958006c0 time 2024-05-25T09:00:23.031373+1000
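
For context, a CephFS scrub of this kind is usually kicked off with something like the following (filesystem name and path are placeholders); the abort above is hit while such a scrub is in progress:

    ceph tell mds.cephfs:0 scrub start / recursive
    ceph tell mds.cephfs:0 scrub status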

[ceph-users] Re: Lousy recovery for mclock and reef

2024-05-24 Thread Kai Stian Olstad
On 24.05.2024 21:07, Mazzystr wrote: I did the obnoxious task of updating ceph.conf and restarting all my osds. ceph --admin-daemon /var/run/ceph/ceph-osd.*.asok config get osd_op_queue { "osd_op_queue": "wpq" } I have some spare memory on my target host/osd and increased the target

[ceph-users] Re: Lousy recovery for mclock and reef

2024-05-24 Thread Joshua Baergen
Now that you're on wpq, you can try tweaking osd_max_backfills (up) and osd_recovery_sleep (down). Josh On Fri, May 24, 2024 at 1:07 PM Mazzystr wrote: > > I did the obnoxious task of updating ceph.conf and restarting all my osds. > > ceph --admin-daemon /var/run/ceph/ceph-osd.*.asok config get
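
A sketch of that tuning with illustrative values only; on HDD OSDs the relevant sleep option may be osd_recovery_sleep_hdd instead:

    ceph config set osd osd_max_backfills 4
    ceph config set osd osd_recovery_sleep 0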

[ceph-users] Re: Lousy recovery for mclock and reef

2024-05-24 Thread Mazzystr
I did the obnoxious task of updating ceph.conf and restarting all my osds. ceph --admin-daemon /var/run/ceph/ceph-osd.*.asok config get osd_op_queue { "osd_op_queue": "wpq" } I have some spare memory on my target host/osd and increased the target memory of that OSD to 10 Gb and restarted.
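
The memory bump can also be made persistent per OSD via the config database; the OSD id and the 10 GB value (in bytes) below are illustrative:

    ceph config set osd.3 osd_memory_target 10737418240
    ceph config get osd.3 osd_memory_target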

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-24 Thread Eugen Block
Hi, I guess you mean use something like "step take DCA class hdd" instead of "step take default class hdd" as in: rule rule-ec-k7m11 { id 1 type erasure min_size 3 max_size 18 step set_chooseleaf_tries 5 step set_choose_tries 100 step
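
Roughly what the amended rule could look like with a datacenter bucket as the entry point; this is only a sketch extrapolated from the truncated snippet above, and the choose steps are assumptions:

    rule rule-ec-k7m11 {
        id 1
        type erasure
        min_size 3
        max_size 18
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take DCA class hdd
        step chooseleaf indep 0 type host
        step emit
    }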

[ceph-users] Re: Lousy recovery for mclock and reef

2024-05-24 Thread Joshua Baergen
It requires an OSD restart, unfortunately. Josh On Fri, May 24, 2024 at 11:03 AM Mazzystr wrote: > > Is that a setting that can be applied runtime or does it req osd restart? > > On Fri, May 24, 2024 at 9:59 AM Joshua Baergen > wrote: > > > Hey Chris, > > > > A number of users have been

[ceph-users] Re: Lousy recovery for mclock and reef

2024-05-24 Thread Mazzystr
Is that a setting that can be applied runtime or does it req osd restart? On Fri, May 24, 2024 at 9:59 AM Joshua Baergen wrote: > Hey Chris, > > A number of users have been reporting issues with recovery on Reef > with mClock. Most folks have had success reverting to > osd_op_queue=wpq. AIUI

[ceph-users] Re: Lousy recovery for mclock and reef

2024-05-24 Thread Joshua Baergen
Hey Chris, A number of users have been reporting issues with recovery on Reef with mClock. Most folks have had success reverting to osd_op_queue=wpq. AIUI 18.2.3 should have some mClock improvements but I haven't looked at the list myself yet. Josh On Fri, May 24, 2024 at 10:55 AM Mazzystr
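
The revert itself is a one-liner plus an OSD restart for it to take effect; the cephadm restart shown below is just one way to do it:

    ceph config set osd osd_op_queue wpq
    # then restart each OSD daemon, e.g. under cephadm:
    ceph orch daemon restart osd.0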

[ceph-users] Lousy recovery for mclock and reef

2024-05-24 Thread Mazzystr
Hi all, Goodness I'd say it's been at least 3 major releases since I had to do a recovery. I have disks with 60-75,000 power_on_hours. I just updated from Octopus to Reef last month and I'm hit with 3 disk failures and the mclock ugliness. My recovery is moving at a wondrous 21 mb/sec after
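
For completeness, on Reef the mClock behaviour can also be biased toward recovery instead of reverting to wpq; a minimal, hedged example:

    ceph config set osd osd_mclock_profile high_recovery_ops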

[ceph-users] Re: Reef RGWs stop processing requests

2024-05-24 Thread Iain Stott
Thanks Enrico, We are only syncing metadata between sites, so I don't think that bug will be the cause of our issues. I have been able to delete ~30k objects without causing the RGW to stop processing. Thanks Iain From: Enrico Bocchi Sent: 22 May 2024 13:48

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-24 Thread Frank Schilder
Hi Eugen, so it is partly "unexpectedly expected" and partly buggy. I really wish the crush implementation honoured a few obvious invariants. It is extremely counter-intuitive that mappings taken from a sub-set change even if both the sub-set and the mapping instructions themselves

[ceph-users] Re: User + Dev Meetup Tomorrow!

2024-05-24 Thread Frédéric Nass
Hello Sebastian, I just checked the survey and you're right, the issue was within the question. Got me a bit confused when I read it but I clicked anyway. Who doesn't like clicking? :-D What best describes your deployment target? * 1/ Bare metal (RPMs/Binary) 2/ Containers (cephadm/Rook) 3/

[ceph-users] Re: User + Dev Meetup Tomorrow!

2024-05-24 Thread Sebastian Wagner
Hi Frédéric, I agree. Maybe we should re-frame things? Containers can run on bare-metal and containers can run virtualized. And distribution packages can run bare-metal and virtualized as well. What about asking independently about: * Do you run containers or distribution packages? * Do

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-24 Thread Eugen Block
I start to think that the root cause of the remapping is just the fact that the crush rule(s) contain(s) the "step take default" line: step take default class hdd My interpretation is that crush simply tries to honor the rule: consider everything underneath the "default" root, so

[ceph-users] Re: quincy rgw with ceph orch and two realms only get answers from first realm

2024-05-24 Thread Boris
Thanks for being my rubber ducky. Turns out I didn't have the rgw_zonegroup configured in the first apply. Adding it to the config and applying it again does not restart or reconfigure the containers, but after doing a ceph orch restart rgw.customer it seems to work now. Happy weekend everybody. Am
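
For reference, a sketch of what the second realm's RGW service spec might look like once the zonegroup is included; names, placement and port are placeholders, not taken from the thread:

    service_type: rgw
    service_id: customer
    placement:
      label: rgw
      count_per_host: 1
    spec:
      rgw_realm: customer
      rgw_zonegroup: customer
      rgw_zone: customer
      rgw_frontend_port: 8081

Applied with ceph orch apply -i rgw-customer.yaml and, as noted above, followed by ceph orch restart rgw.customer so the running daemons pick up the change.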

[ceph-users] quincy rgw with ceph orch and two realms only get answers from first realm

2024-05-24 Thread Boris
Hi, we are currently in the process of adopting the main S3 cluster into the orchestrator. We have two realms (one for us and one for the customer). The old config worked fine: depending on the port I requested, I got a different x-amz-request-id header back: x-amz-request-id:

[ceph-users] Re: cephadm bootstraps cluster with bad CRUSH map(?)

2024-05-24 Thread Eugen Block
Hi, thanks for picking that up so quickly! I haven't used a host spec file yet to add new hosts, but if you read my thread about the unknown PGs, this might be my first choice to do that in the future. So thanks again for bringing it to my attention. ;-) Regards, Eugen Zitat von Matthew
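
A sketch of such a host spec, including the location field that sets the host's initial CRUSH position when it is first added; hostname, address and buckets are made up:

    service_type: host
    hostname: node-07
    addr: 192.168.10.17
    labels:
      - osd
    location:
      root: custom
      datacenter: DCA

It would be applied with ceph orch apply -i host-spec.yaml; note the location only takes effect when the host is first added, not on later re-applies.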

[ceph-users] Re: User + Dev Meetup Tomorrow!

2024-05-24 Thread Frédéric Nass
Hello everyone, Nice talk yesterday. :-) Regarding containers vs RPMs and orchestration, and the related discussion from yesterday, I wanted to share a few things (which I wasn't able to share yesterday on the call due to a headset/bluetooth stack issue) to explain why we use cephadm and ceph

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-24 Thread Eugen Block
Hi Frank, thanks for looking up those trackers. I haven't looked into them yet, I'll read your response in detail later, but I wanted to add some new observation: I added another root bucket (custom) to the osd tree: # ceph osd tree ID CLASS WEIGHT TYPE NAME STATUS
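
For the archives, a root bucket like that is added and populated along these lines (bucket and host names are illustrative):

    ceph osd crush add-bucket custom root
    ceph osd crush move node-01 root=custom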

[ceph-users] Re: Help with deep scrub warnings

2024-05-24 Thread Sascha Lucas
Hi, just for the archives: On Tue, 5 Mar 2024, Anthony D'Atri wrote: * Try applying the settings to global so that mons/mgrs get them. Setting osd_deep_scrub_interval at global instead of at osd immediately turns health to OK and removes the false "PGs not scrubbed in time" warning. HTH,
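
In config terms the fix amounts to something like the following; the interval (two weeks in seconds) is just an example, and any previously set osd-level value would be removed:

    ceph config set global osd_deep_scrub_interval 1209600
    ceph config rm osd osd_deep_scrub_interval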