[ceph-users] Re: rbd unmap fails with "Device or resource busy"

2022-09-15 Thread Chris Dunlop
On Fri, Sep 09, 2022 at 11:14:41AM +1000, Chris Dunlop wrote: What can make a "rbd unmap" fail, assuming the device is not mounted and not (obviously) open by any other processes? I have multiple XFS on rbd filesystems, and often create rbd snapshots, map and read-only mount the snapshot,
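[Illustrative sketch of the snapshot/map/mount cycle described above; pool, image and mountpoint names are placeholders, not taken from the mail.]
  rbd snap create rbd/myimage@backup
  rbd map rbd/myimage@backup --read-only      # e.g. returns /dev/rbd1
  mount -o ro,norecovery,nouuid /dev/rbd1 /mnt/backup
  # ... read the data ...
  umount /mnt/backup
  rbd unmap /dev/rbd1                         # the step that intermittently fails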

[ceph-users] ms_dispatcher of ceph-mgr 100% cpu on pacific 16.2.7

2022-09-15 Thread Wout van Heeswijk
Hi Everyone, We have a cluster of which the manager is not working nicely. The mgrs are all very slow to respond. This initially caused them to continuously fail over. We've disabled most of the modules. We've set the following, which seemed to improve the situation a little bit, but the
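[For reference, mgr modules are listed and disabled like this; the module name is a placeholder.]
  ceph mgr module ls
  ceph mgr module disable dashboard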

[ceph-users] Re: adding mds service , unable to create keyring for mds

2022-09-15 Thread Xiubo Li
On 15/09/2022 21:56, Jerry Buburuz wrote: ceph auth list: mds.mynode key: mykeyxx caps: [mgr] profile mds caps: [mon] profile mds Yeah, it already exists. ceph auth get-or-create mds.mynode mon 'profile mds' mgr 'profile mds' mds 'allow *' osd 'allow

[ceph-users] Re: Power outage recovery

2022-09-15 Thread Gregory Farnum
Recovery from OSDs loses the mds and rgw keys they use to authenticate with cephx. You need to get those set up again by using the auth commands. I don’t have them handy but it is discussed in the mailing list archives. -Greg On Thu, Sep 15, 2022 at 3:28 PM Jorge Garcia wrote: > Yes, I tried
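[The auth commands referred to are along these lines; daemon IDs are placeholders, and the exact caps should be checked against the documentation for your release.]
  ceph auth get-or-create mds.<id> mon 'profile mds' mgr 'profile mds' mds 'allow *' osd 'allow *'
  ceph auth get-or-create client.rgw.<id> mon 'allow rw' osd 'allow rwx'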

[ceph-users] Re: Power outage recovery

2022-09-15 Thread Jorge Garcia
Yes, I tried restarting them and even rebooting the mds machine. No joy. If I try to start ceph-mds by hand, it returns: 2022-09-15 15:21:39.848 7fc43dbd2700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2] failed to fetch mon config
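[One way to check whether the daemon's local key still matches what the monitors hold; the ID and path are placeholders for a non-containerized deployment.]
  ceph auth get mds.<id>
  cat /var/lib/ceph/mds/ceph-<id>/keyring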

[ceph-users] Re: Power outage recovery

2022-09-15 Thread Wesley Dillingham
Having the quorum / monitors back up may change the MDS and RGW's ability to start and stay running. Have you tried just restarting the MDS / RGW daemons again? Respectfully, Wes Dillingham w...@wesdillingham.com LinkedIn On Thu, Sep 15, 2022 at
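[With systemd-managed daemons that would simply be, IDs being placeholders:]
  systemctl restart ceph-mds@<id>
  systemctl restart ceph-radosgw@rgw.<id>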

[ceph-users] Re: Power outage recovery

2022-09-15 Thread Jorge Garcia
OK, I'll try to give more details as I remember them. 1. There was a power outage and then power came back up. 2. When the systems came back up, I did a "ceph -s" and it never returned. Further investigation revealed that the ceph-mon processes had not started in any of the 3 monitors. I
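[Illustrative commands for checking why a monitor did not come back after the reboot; the hostname is a placeholder.]
  systemctl status ceph-mon@<hostname>
  journalctl -b -u ceph-mon@<hostname>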

[ceph-users] Re: Nautilus: PGs stuck "activating" after adding OSDs. Please help!

2022-09-15 Thread Fulvio Galeazzi
Correct, Dan, indeed at some point I had also raised the "hard-ratio" to 3, with no success as I guess I was missing the "repeer"... I assumed that restarting OSDs would 'do the whole work'. Ok, I learned something today, thanks! Fulvio On 15/09/2022 21:28, Dan van der Ster

[ceph-users] Re: Nautilus: PGs stuck "activating" after adding OSDs. Please help!

2022-09-15 Thread Fulvio Galeazzi
Wow Josh, thanks a lot for prompt help! Indeed, I thought mon_max_pg_per_osd (which was 500 in my case) would work in combination with the multiplier max_pg_per_osd_hard_ratio which if I am not mistaken is 2 by default: I had ~700 PGs/OSD so I was feeling rather safe. However, I temporarily
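[For reference, the effective values of the two options mentioned can be inspected with:]
  ceph config get mon mon_max_pg_per_osd
  ceph config get osd osd_max_pg_per_osd_hard_ratio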

[ceph-users] Re: cephadm automatic sizing of WAL/DB on SSD

2022-09-15 Thread Christophe BAILLON
Hi The problem is still present in version 17.2.3, thanks for the trick to work around it... Regards - Original message - > From: "Anh Phan Tuan" > To: "Calhoun, Patrick" > Cc: "Arthur Outhenin-Chalandre" , > "ceph-users" > Sent: Thursday, August 11, 2022 10:14:17 > Subject: [ceph-users] Re:

[ceph-users] Re: Nautilus: PGs stuck "activating" after adding OSDs. Please help!

2022-09-15 Thread Dan van der Ster
Another common config to workaround this pg num limit is: ceph config set osd osd_max_pg_per_osd_hard_ratio 10 (Then possibly the repeer step on each activating pg) .. Dan On Thu, Sept 15, 2022, 17:47 Josh Baergen wrote: > Hi Fulvio, > > I've seen this in the past when a CRUSH change
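[A rough sketch of the repeer step over all activating PGs; the output parsing of "ceph pg ls" may need adjusting per release.]
  ceph pg ls activating | awk '$1 ~ /^[0-9]+\./ {print $1}' | while read pg; do
      ceph pg repeer "$pg"
  done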

[ceph-users] Re: Power outage recovery

2022-09-15 Thread Wesley Dillingham
What does "ceph status" "ceph health detail" etc show, currently? Based on what you have said here my thought is you have created a new monitor quorum and as such all auth details from the old cluster are lost including any and all mgr cephx auth keys, so what does the log for the mgr say? How

[ceph-users] Re: Power outage recovery

2022-09-15 Thread Eugen Block
The data only seems to be gone (if you mean what I think you mean) because the MGRs are not running and the OSDs can’t report their status. But are all MONs and OSDs up? What is the ceph status? What do the MGRs log when trying to start them? Quoting Jorge Garcia: We have a Nautilus

[ceph-users] Re: Power outage recovery

2022-09-15 Thread Marc
> (particularly the "Recovery using OSDs" section). I got it so the mon > processes would start, but then the ceph-mgr process died, and would not > restart. Not sure how to recover so both ceph-mgr and ceph-mon processes > run. In the meantime, all the data is gone. Any suggestions? All the data

[ceph-users] Power outage recovery

2022-09-15 Thread Jorge Garcia
We have a Nautilus cluster that just got hit by a bad power outage. When the admin systems came back up, only the ceph-mgr process was running (all the ceph-mon processes would not start). I tried following the instructions in
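[The "recovery using OSDs" procedure referenced here boils down to rebuilding the mon store from the OSDs, roughly as below for a single host; paths are placeholders and the full multi-host procedure in the documentation should be followed.]
  ms=/tmp/mon-store; mkdir -p "$ms"
  for osd in /var/lib/ceph/osd/ceph-*; do
      ceph-objectstore-tool --data-path "$osd" --no-mon-config --op update-mon-db --mon-store-path "$ms"
  done
  ceph-monstore-tool "$ms" rebuild -- --keyring /etc/ceph/ceph.client.admin.keyring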

[ceph-users] multisite replication issue with Quincy

2022-09-15 Thread Jane Zhu
We have encountered replication issues in our multisite settings with Quincy v17.2.3. Our Ceph clusters are brand new. We tore down our clusters and re-deployed fresh Quincy ones before we did our test. In our environment, we have 3 RGW nodes per site, each node has 2 instances for client traffic
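[For context, replication health between zones is usually checked with:]
  radosgw-admin sync status
  radosgw-admin sync error list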

[ceph-users] Slides from today's Ceph User + Dev Monthly Meeting

2022-09-15 Thread Kamoltat Sirivadhna
Hi guys, thank you all for attending today's meeting, apologies for the restricted access. Attached here are the slides in pdf format. Let me know if you have any questions, -- Kamoltat Sirivadhna (HE/HIM) Software Engineer - Ceph Storage ksiri...@redhat.com T: (857)

[ceph-users] Nautilus: PGs stuck "activating" after adding OSDs. Please help!

2022-09-15 Thread Fulvio Galeazzi
Hello, I am on Nautilus and today, after upgrading the operating system (from CentOS 7 to CentOS 8 Stream) on a couple of OSD servers and adding them back to the cluster, I noticed some PGs are still "activating". The upgraded servers are from the same "rack", and I have replica-3 pools with
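[Illustrative commands for seeing which PGs are stuck and why:]
  ceph health detail
  ceph pg dump_stuck inactive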

[ceph-users] Re: adding mds service , unable to create keyring for mds

2022-09-15 Thread Jerry Buburuz
FIXED! Interesting: ceph auth caps mds.mynode mon 'profile mds' mgr 'profile mds' mds 'allow *' osd 'allow *' output: updated caps for mds.admin-node-02 Worked! ceph auth list mds.mynode: key: caps: [mds] allow caps: [mgr] profile mds caps: [mon] profile mds caps: [osd]

[ceph-users] Re: adding mds service , unable to create keyring for mds

2022-09-15 Thread Eugen Block
Have you tried to modify by using 'ceph auth caps ...' instead of get-or-create? Quoting Jerry Buburuz: Can I just: ceph auth export mds.mynode -o mds.export Add(editor) "caps: [mds] profile mds" ceph auth import -i mds.export Thanks jerry Jerry Buburuz ceph auth list: mds.mynode

[ceph-users] Re: adding mds service , unable to create keyring for mds

2022-09-15 Thread Jerry Buburuz
Can I just: ceph auth export mds.mynode -o mds.export Add(editor) "caps: [mds] profile mds" ceph auth import -i mds.export Thanks jerry Jerry Buburuz > > ceph auth list: > > mds.mynode > key: mykeyxx > caps: [mgr] profile mds > caps: [mon] profile mds > > >

[ceph-users] Re: adding mds service , unable to create keyring for mds

2022-09-15 Thread Jerry Buburuz
ceph auth list: mds.mynode key: mykeyxx caps: [mgr] profile mds caps: [mon] profile mds ceph auth get-or-create mds.mynode mon 'profile mds' mgr 'profile mds' mds 'allow *' osd 'allow *' error: Error EINVAL: key for mds.mynode exists but cap mds does not

[ceph-users] Re: rbd mirroring - journal growing and snapshot high io load

2022-09-15 Thread Arthur Outhenin-Chalandre
Hi Ronny, > On 15/09/2022 14:32 ronny.lippold wrote: > hi arthur, some time went ... > > i would like to know, if there are some news of your setup. > do you have replication active running? No, there was no change at CERN. I am switching jobs as well actually so I won't have much news for

[ceph-users] Re: rbd mirroring - journal growing and snapshot high io load

2022-09-15 Thread ronny.lippold
hi arthur, some time has passed ... i would like to know if there is any news about your setup. do you have replication actively running? we are currently using snapshot based mirroring and recently had a move of both clusters. after that, we had some damaged filesystems in the kvm vms. did you ever have such
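[For reference, mirroring status for snapshot-based replication is usually checked with; pool and image names are placeholders.]
  rbd mirror pool status --verbose <pool>
  rbd mirror image status <pool>/<image>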

[ceph-users] Re: Manual deployment, documentation error?

2022-09-15 Thread Dominique Ramaekers
Hi Marc, > -Original message- > From: Marc > Sent: Thursday, September 15, 2022 11:14 > To: Dominique Ramaekers ; Ranjan > Ghosh > Cc: ceph-users@ceph.io > Subject: RE: [ceph-users] Re: Manual deployment, documentation error? > > > > Cons of using cephadm (and thus

[ceph-users] Re: Manual deployment, documentation error?

2022-09-15 Thread Marc
> Cons of using cephadm (and thus docker): > - You need to learn the basics of docker If you learn only the basics, you will probably fuck up when you have some sort of real issue with ceph. I would not recommend sticking to basics of anything with this install, it is not like if something

[ceph-users] Re: Manual deployment, documentation error?

2022-09-15 Thread Rok Jaklič
Every now and then someone comes up with a subject like this. There is quite a long thread about pros and cons using docker and all tools around ceph on https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/TTTYKRVWJOR7LOQ3UCQAZQR32R7YADVY/#AT7YQV6RE5SMKDZHXL3ZI2G5BWFUUUXE Long story

[ceph-users] Re: Manual deployment, documentation error?

2022-09-15 Thread Dominique Ramaekers
Hi Ranjan, I don't want to intervene, but I can testify that docker doesn't make the installation for a 3-node cluster overkill. I too have a 3-node cluster (to be expanded soon to 4 nodes). Cons of using cephadm (and thus docker): - You need to learn the basics of docker Pros: + cephadm works
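[For comparison, a minimal cephadm start for a small cluster looks roughly like this; IPs and hostnames are placeholders.]
  cephadm bootstrap --mon-ip <mon-ip>
  ceph orch host add <node2> <ip2>
  ceph orch apply osd --all-available-devices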