[ceph-users] Re: rbd unmap fails with "Device or resource busy"

2022-09-15 Thread Chris Dunlop

On Fri, Sep 09, 2022 at 11:14:41AM +1000, Chris Dunlop wrote:
What can make a "rbd unmap" fail, assuming the device is not mounted 
and not (obviously) open by any other processes?


I have multiple XFS on rbd filesystems, and often create rbd 
snapshots, map and read-only mount the snapshot, perform some work on 
the fs, then unmount and unmap. The unmap regularly (about 1 in 10 
times) fails like:


$ sudo rbd unmap /dev/rbd29
rbd: sysfs write failed
rbd: unmap failed: (16) Device or resource busy


FYI, this was resolved:

https://www.spinics.net/lists/ceph-devel/msg55943.html
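For anyone hitting the same EBUSY later, a rough sketch of checks that can help show what still holds the device open before retrying the unmap (/dev/rbd29 is just the example device from above; the pool/image names are illustrative):

# does any process still have the block device open?
sudo fuser -vam /dev/rbd29
# any kernel holders (device-mapper, partitions, ...) of the rbd device?
ls /sys/block/rbd29/holders
# watchers on the image header itself
rbd status mypool/myimage
# retry the unmap, or force it as a last resort (may interrupt in-flight I/O)
sudo rbd unmap /dev/rbd29
sudo rbd unmap -o force /dev/rbd29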

Chris
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] ms_dispatcher of ceph-mgr 100% cpu on pacific 16.2.7

2022-09-15 Thread Wout van Heeswijk
Hi Everyone,

We have a cluster whose manager is not working nicely. The mgrs are all very
slow to respond, which initially caused them to continuously fail over.

We've disabled most of the modules. 

We’ve set the following, which seemed to improve the situation a little, but
the problem came back.

ms_async_op_threads = 10
ms_async_max_op_threads = 16
mgr_stats_period = 10

However, the ms_dispatch thread is at 99.9% cpu all the time. If we fail the
manager, it will be at 99.9% on the new mgr. We have restarted all mon and mgr
daemons.

The perf dump shows an extremely high get_or_fail_fail count:

"throttle-mgr_mon_messsages": {
    "val": 128,
    "max": 128,
    "get_started": 0,
    "get": 1191,
    "get_sum": 1191,
    "get_or_fail_fail": 188691955,
    "get_or_fail_success": 1191,
    "take": 0,
    "take_sum": 0,
    "put": 1191,
    "put_sum": 1191,
    "wait": {
        "avgcount": 0,
        "sum": 0.0,
        "avgtime": 0.0
    }
}
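For reference, this is roughly how the counter can be watched on the active mgr (mgr.<id> is a placeholder for the daemon name; jq is only used here for readability):

# run on the host with the active mgr
ceph daemon mgr.<id> perf dump | jq '."throttle-mgr_mon_messsages"'
# check whether get_or_fail_fail keeps climbing
watch -n 5 "ceph daemon mgr.<id> perf dump | jq '.\"throttle-mgr_mon_messsages\".get_or_fail_fail'"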

Thanks,
Wout
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: adding mds service , unable to create keyring for mds

2022-09-15 Thread Xiubo Li



On 15/09/2022 21:56, Jerry Buburuz wrote:

ceph auth list:

mds.mynode
 key: mykeyxx
 caps: [mgr] profile mds
 caps: [mon] profile mds


Yeah, it already exists.



ceph auth  get-or-create mds.mynode mon 'profile mds' mgr 'profile mds'
mds 'allow *' osd 'allow *'

error:
Error EINVAL: key for mds.mynode exists but cap mds does not match


I think the 'get-or-create' won't allow you to change the existing 
user's caps.


Please see 
https://docs.ceph.com/en/pacific/rados/operations/user-management/


If you want to modify it, please see the "MODIFY USER CAPABILITIES" 
section in the doc above for how to do it, as Eugen suggested.
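Concretely, something like this should adjust the existing key in place (same caps as in your get-or-create attempt; please double-check against the doc):

ceph auth caps mds.mynode mon 'profile mds' mgr 'profile mds' mds 'allow *' osd 'allow *'
# verify the result
ceph auth get mds.mynode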


Thanks

Xiubo



I am following instructions on
https://docs.ceph.com/latest/cephfs/add-remove-mds

thanks
jerry

Xiubo Li

On 15/09/2022 03:09, Jerry Buburuz wrote:

Hello,

I am trying to add my first mds service on any node. I am unable to add
keyring to start mds service.

# $ sudo ceph auth get-or-create mds.mynode mon 'profile mds' mgr
'profile
mds' mds 'allow *' osd 'allow *'

Error ENINVAL: key for mds.mynode exists but cap mds does not match

It says the key mds.mynode already exists. What's the output of the
`ceph auth ls` ?

Thanks!



I tried this command on storage nodes, admin nodes(monitor) , same
error.

ceph mds stat
cephfs:0

This makes sense I don't have any mds services running yet.

I had no problem creating keyrings for other services like monitors and
mgr.

Thanks
jerry





___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Power outage recovery

2022-09-15 Thread Gregory Farnum
Recovery from OSDs loses the mds and rgw keys they use to authenticate with
cephx. You need to get those set up again by using the auth commands. I
don’t have them handy but it is discussed in the mailing list archives.
-Greg
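(For reference, recreating them looks roughly like the following; daemon ids, caps and keyring paths are illustrative and should be checked against the docs for your release:)

# MDS key (adjust <id> and the keyring path to your deployment)
ceph auth get-or-create mds.<id> mon 'profile mds' mgr 'profile mds' mds 'allow *' osd 'allow *' \
    -o /var/lib/ceph/mds/ceph-<id>/keyring
# RGW key
ceph auth get-or-create client.rgw.<id> mon 'allow rw' osd 'allow rwx' \
    -o /var/lib/ceph/radosgw/ceph-rgw.<id>/keyring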

On Thu, Sep 15, 2022 at 3:28 PM Jorge Garcia  wrote:

> Yes, I tried restarting them and even rebooting the mds machine. No joy.
> If I try to start ceph-mds by hand, it returns:
>
> 2022-09-15 15:21:39.848 7fc43dbd2700 -1 monclient(hunting):
> handle_auth_bad_method server allowed_methods [2] but i only support [2]
> failed to fetch mon config (--no-mon-config to skip)
>
> I found this information online, maybe something to try next:
>
> https://docs.ceph.com/en/quincy/cephfs/recover-fs-after-mon-store-loss/
>
> But I think maybe the mds needs to be running before that?
>
> On 9/15/22 15:19, Wesley Dillingham wrote:
> > Having the quorum / monitors back up may change the MDS and RGW's
> > ability to start and stay running. Have you tried just restarting the
> > MDS / RGW daemons again?
> >
> > Respectfully,
> >
> > *Wes Dillingham*
> > w...@wesdillingham.com
> > LinkedIn 
> >
> >
> > On Thu, Sep 15, 2022 at 5:54 PM Jorge Garcia 
> wrote:
> >
> > OK, I'll try to give more details as I remember them.
> >
> > 1. There was a power outage and then power came back up.
> >
> > 2. When the systems came back up, I did a "ceph -s" and it never
> > returned. Further investigation revealed that the ceph-mon
> > processes had
> > not started in any of the 3 monitors. I looked at the log files
> > and it
> > said something about:
> >
> > ceph_abort_msg("Bad table magic number: expected 9863518390377041911,
> > found 30790637387776 in
> > /var/lib/ceph/mon/ceph-gi-cprv-adm-01/store.db/2886524.sst")
> >
> > Looking at the internet, I found some suggestions about
> > troubleshooting
> > monitors in:
> >
> >
> https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/
> >
> > I quickly determined that the monitors weren't running, so I found
> > the
> > section where it said "RECOVERY USING OSDS". The description made
> > sense:
> >
> > "But what if all monitors fail at the same time? Since users are
> > encouraged to deploy at least three (and preferably five) monitors
> > in a
> > Ceph cluster, the chance of simultaneous failure is rare. But
> > unplanned
> > power-downs in a data center with improperly configured disk/fs
> > settings
> > could fail the underlying file system, and hence kill all the
> > monitors.
> > In this case, we can recover the monitor store with the information
> > stored in OSDs."
> >
> > So, I did the procedure described in that section, and then made sure
> > the correct keys were in the keyring and restarted the processes.
> >
> > WELL, I WAS REDOING ALL THESE STEPS WHILE WRITING THIS MAIL
> > MESSAGE, AND
> > NOW THE MONITORS ARE BACK! I must have missed some step in the
> > middle of
> > my panic.
> >
> > # ceph -s
> >
> >cluster:
> >  id: ----
> >  health: HEALTH_WARN
> >  mons are allowing insecure global_id reclaim
> >
> >services:
> >  mon: 3 daemons, quorum host-a, host-b, host-c (age 19m)
> >  mgr: host-b(active, since 19m), standbys: host-a, host-c
> >  osd: 164 osds: 164 up (since 16m), 164 in (since 8h)
> >
> >data:
> >  pools:   14 pools, 2992 pgs
> >  objects: 91.58M objects, 290 TiB
> >  usage:   437 TiB used, 1.2 PiB / 1.7 PiB avail
> >  pgs: 2985 active+clean
> >   7active+clean+scrubbing+deep
> >
> > Couple of missing or strange things:
> >
> > 1. Missing mds
> > 2. Missing rgw
> > 3. New warning showing up
> >
> > But overall, better than a couple hours ago. If anybody is still
> > reading
> > and has any suggestions about how to solve the 3 items above, that
> > would
> > be great! Otherwise, back to scanning the internet for ideas...
> >
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Power outage recovery

2022-09-15 Thread Jorge Garcia
Yes, I tried restarting them and even rebooting the mds machine. No joy. 
If I try to start ceph-mds by hand, it returns:


2022-09-15 15:21:39.848 7fc43dbd2700 -1 monclient(hunting): 
handle_auth_bad_method server allowed_methods [2] but i only support [2]

failed to fetch mon config (--no-mon-config to skip)

I found this information online, maybe something to try next:

https://docs.ceph.com/en/quincy/cephfs/recover-fs-after-mon-store-loss/

But I think maybe the mds needs to be running before that?

On 9/15/22 15:19, Wesley Dillingham wrote:
Having the quorum / monitors back up may change the MDS and RGW's 
ability to start and stay running. Have you tried just restarting the 
MDS / RGW daemons again?


Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 


On Thu, Sep 15, 2022 at 5:54 PM Jorge Garcia  wrote:

OK, I'll try to give more details as I remember them.

1. There was a power outage and then power came back up.

2. When the systems came back up, I did a "ceph -s" and it never
returned. Further investigation revealed that the ceph-mon
processes had
not started in any of the 3 monitors. I looked at the log files
and it
said something about:

ceph_abort_msg("Bad table magic number: expected 9863518390377041911,
found 30790637387776 in
/var/lib/ceph/mon/ceph-gi-cprv-adm-01/store.db/2886524.sst")

Looking at the internet, I found some suggestions about
troubleshooting
monitors in:

https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/

I quickly determined that the monitors weren't running, so I found
the
section where it said "RECOVERY USING OSDS". The description made
sense:

"But what if all monitors fail at the same time? Since users are
encouraged to deploy at least three (and preferably five) monitors
in a
Ceph cluster, the chance of simultaneous failure is rare. But
unplanned
power-downs in a data center with improperly configured disk/fs
settings
could fail the underlying file system, and hence kill all the
monitors.
In this case, we can recover the monitor store with the information
stored in OSDs."

So, I did the procedure described in that section, and then made sure
the correct keys were in the keyring and restarted the processes.

WELL, I WAS REDOING ALL THESE STEPS WHILE WRITING THIS MAIL
MESSAGE, AND
NOW THE MONITORS ARE BACK! I must have missed some step in the
middle of
my panic.

# ceph -s

   cluster:
 id: ----
 health: HEALTH_WARN
 mons are allowing insecure global_id reclaim

   services:
 mon: 3 daemons, quorum host-a, host-b, host-c (age 19m)
 mgr: host-b(active, since 19m), standbys: host-a, host-c
 osd: 164 osds: 164 up (since 16m), 164 in (since 8h)

   data:
 pools:   14 pools, 2992 pgs
 objects: 91.58M objects, 290 TiB
 usage:   437 TiB used, 1.2 PiB / 1.7 PiB avail
 pgs: 2985 active+clean
  7    active+clean+scrubbing+deep

Couple of missing or strange things:

1. Missing mds
2. Missing rgw
3. New warning showing up

But overall, better than a couple hours ago. If anybody is still
reading
and has any suggestions about how to solve the 3 items above, that
would
be great! Otherwise, back to scanning the internet for ideas...

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Power outage recovery

2022-09-15 Thread Wesley Dillingham
Having the quorum / monitors back up may change the MDS and RGW's ability
to start and stay running. Have you tried just restarting the MDS / RGW
daemons again?

Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 


On Thu, Sep 15, 2022 at 5:54 PM Jorge Garcia  wrote:

> OK, I'll try to give more details as I remember them.
>
> 1. There was a power outage and then power came back up.
>
> 2. When the systems came back up, I did a "ceph -s" and it never
> returned. Further investigation revealed that the ceph-mon processes had
> not started in any of the 3 monitors. I looked at the log files and it
> said something about:
>
> ceph_abort_msg("Bad table magic number: expected 9863518390377041911,
> found 30790637387776 in
> /var/lib/ceph/mon/ceph-gi-cprv-adm-01/store.db/2886524.sst")
>
> Looking at the internet, I found some suggestions about troubleshooting
> monitors in:
>
> https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/
>
> I quickly determined that the monitors weren't running, so I found the
> section where it said "RECOVERY USING OSDS". The description made sense:
>
> "But what if all monitors fail at the same time? Since users are
> encouraged to deploy at least three (and preferably five) monitors in a
> Ceph cluster, the chance of simultaneous failure is rare. But unplanned
> power-downs in a data center with improperly configured disk/fs settings
> could fail the underlying file system, and hence kill all the monitors.
> In this case, we can recover the monitor store with the information
> stored in OSDs."
>
> So, I did the procedure described in that section, and then made sure
> the correct keys were in the keyring and restarted the processes.
>
> WELL, I WAS REDOING ALL THESE STEPS WHILE WRITING THIS MAIL MESSAGE, AND
> NOW THE MONITORS ARE BACK! I must have missed some step in the middle of
> my panic.
>
> # ceph -s
>
>cluster:
>  id: ----
>  health: HEALTH_WARN
>  mons are allowing insecure global_id reclaim
>
>services:
>  mon: 3 daemons, quorum host-a, host-b, host-c (age 19m)
>  mgr: host-b(active, since 19m), standbys: host-a, host-c
>  osd: 164 osds: 164 up (since 16m), 164 in (since 8h)
>
>data:
>  pools:   14 pools, 2992 pgs
>  objects: 91.58M objects, 290 TiB
>  usage:   437 TiB used, 1.2 PiB / 1.7 PiB avail
>  pgs: 2985 active+clean
>   7active+clean+scrubbing+deep
>
> Couple of missing or strange things:
>
> 1. Missing mds
> 2. Missing rgw
> 3. New warning showing up
>
> But overall, better than a couple hours ago. If anybody is still reading
> and has any suggestions about how to solve the 3 items above, that would
> be great! Otherwise, back to scanning the internet for ideas...
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Power outage recovery

2022-09-15 Thread Jorge Garcia

OK, I'll try to give more details as I remember them.

1. There was a power outage and then power came back up.

2. When the systems came back up, I did a "ceph -s" and it never 
returned. Further investigation revealed that the ceph-mon processes had 
not started in any of the 3 monitors. I looked at the log files and it 
said something about:


ceph_abort_msg("Bad table magic number: expected 9863518390377041911, 
found 30790637387776 in 
/var/lib/ceph/mon/ceph-gi-cprv-adm-01/store.db/2886524.sst")


Looking at the internet, I found some suggestions about troubleshooting 
monitors in:


https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/

I quickly determined that the monitors weren't running, so I found the 
section where it said "RECOVERY USING OSDS". The description made sense:


"But what if all monitors fail at the same time? Since users are 
encouraged to deploy at least three (and preferably five) monitors in a 
Ceph cluster, the chance of simultaneous failure is rare. But unplanned 
power-downs in a data center with improperly configured disk/fs settings 
could fail the underlying file system, and hence kill all the monitors. 
In this case, we can recover the monitor store with the information 
stored in OSDs."


So, I did the procedure described in that section, and then made sure 
the correct keys were in the keyring and restarted the processes.
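(For the record, the core of that procedure is roughly the loop below, run with the OSDs stopped; the paths are illustrative and the authoritative steps are in the linked doc:)

ms=/tmp/mon-store
mkdir -p "$ms"
# gather cluster maps from every OSD on each OSD host
for osd in /var/lib/ceph/osd/ceph-*; do
    ceph-objectstore-tool --data-path "$osd" --op update-mon-db --mon-store-path "$ms"
done
# rebuild the mon store, supplying a keyring with the admin and mon keys
ceph-monstore-tool "$ms" rebuild -- --keyring /path/to/admin.keyring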


WELL, I WAS REDOING ALL THESE STEPS WHILE WRITING THIS MAIL MESSAGE, AND 
NOW THE MONITORS ARE BACK! I must have missed some step in the middle of 
my panic.


# ceph -s

  cluster:
    id: ----
    health: HEALTH_WARN
    mons are allowing insecure global_id reclaim

  services:
    mon: 3 daemons, quorum host-a, host-b, host-c (age 19m)
    mgr: host-b(active, since 19m), standbys: host-a, host-c
    osd: 164 osds: 164 up (since 16m), 164 in (since 8h)

  data:
    pools:   14 pools, 2992 pgs
    objects: 91.58M objects, 290 TiB
    usage:   437 TiB used, 1.2 PiB / 1.7 PiB avail
    pgs: 2985 active+clean
 7    active+clean+scrubbing+deep

Couple of missing or strange things:

1. Missing mds
2. Missing rgw
3. New warning showing up

But overall, better than a couple hours ago. If anybody is still reading 
and has any suggestions about how to solve the 3 items above, that would 
be great! Otherwise, back to scanning the internet for ideas...


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Nautilus: PGs stuck "activating" after adding OSDs. Please help!

2022-09-15 Thread Fulvio Galeazzi
Correct, Dan, indeed at some point I had also raised the "hard-ratio" to 
3, with no success, as I guess I was missing the "repeer"... I assumed 
that restarting OSDs would 'do the whole work'.

Ok, I learned something today, thanks!

Fulvio

On 15/09/2022 21:28, Dan van der Ster wrote:

Another common config to workaround this pg num limit is:

ceph config set osd osd_max_pg_per_osd_hard_ratio 10

(Then possibly the repeer step on each activating pg)

.. Dan



On Thu, Sept 15, 2022, 17:47 Josh Baergen > wrote:


Hi Fulvio,

I've seen this in the past when a CRUSH change temporarily resulted in
too many PGs being mapped to an OSD, exceeding mon_max_pg_per_osd. You
can try increasing that setting to see if it helps, then setting it
back to default once backfill completes. You may also need to "ceph pg
repeer $pgid" for each of the PGs stuck activating.

Josh

On Thu, Sep 15, 2022 at 8:42 AM Fulvio Galeazzi
mailto:fulvio.galea...@garr.it>> wrote:
 >
 >
 > Hallo,
 >         I am on Nautilus and today, after upgrading the operating
system (from
 > CentOS 7 to CentOS 8 Stream) on a couple OSD servers and adding them
 > back to the cluster, I noticed some PGs are still "activating".
 >     The upgraded server are from the same "rack", and I have
replica-3
 > pools with 1-per-rack rule, and 6+4 EC pools (in some cases, with SSD
 > pool for metadata).
 >
 > More details:
 > - on the two OSD servers I upgrade, I ran "systemctl stop
ceph.target"
 >     and waited a while, to verify all PGs would remain "active"
 > - went on with the upgrade and ceph-ansible reconfig
 > - as soon as I started adding OSDs I saw "slow ops"
 > - to exclude possible effect of updated packages, I ran "yum
update" on
 >     all OSD servers, and rebooted them one by one
 > - after 2-3 hours, the last OSD disks finally came up
 > - I am left with:
 >         about 1k "slow ops" (if I pause recovery, number ~stable
but max
 >                 age increasing)
 >         ~200 inactive PGs
 >
 >     Most of the inactive PGs are from the object store pool:
 >
 > [cephmgr@cephAdmCT1.cephAdmCT1 ~]$ ceph osd pool get
 > default.rgw.buckets.data crush_rule
 > crush_rule: default.rgw.buckets.data
 >
 > rule default.rgw.buckets.data {
 >           id 6
 >           type erasure
 >           min_size 3
 >           max_size 10
 >           step set_chooseleaf_tries 5
 >           step set_choose_tries 100
 >           step take default class big
 >           step chooseleaf indep 0 type host
 >           step emit
 > }
 >
 >     But "ceph pg dump_stuck inactive" also shows 4 lines for the
glance
 > replicated pool, like:
 >
 > 82.34                       activating+remapped  [139,50,207]  139
 > [139,50,284]  139
 > 82.54   activating+undersized+degraded+remapped    [139,86,5]  139
 > [139,74]      139
 >
 >
 > Need your help please:
 >
 > - any idea what was the root cause for all this?
 >
 > - and now, how can I help OSDs complete their activation?
 >     + does the procedure differ for EC or replicated pools, by
the way?
 >     + or may be I should first get rid of the "slow ops" issue?
 >
 > I am pasting:
 > ceph osd df tree
 > https://pastebin.ubuntu.com/p/VWhT7FWf6m/

 >
 > ceph osd lspools ; ceph pg dump_stuck inactive
 > https://pastebin.ubuntu.com/p/9f6rXRYMh4/

 >
 >     Thanks a lot!
 >
 >                         Fulvio
 >
 > --
 > Fulvio Galeazzi
 > GARR-CSD Department
 > tel.: +39-334-6533-250
 > skype: fgaleazzi70
 > ___
 > ceph-users mailing list -- ceph-users@ceph.io

 > To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io

To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Nautilus: PGs stuck "activating" after adding OSDs. Please help!

2022-09-15 Thread Fulvio Galeazzi

Wow Josh, thanks a lot for prompt help!

Indeed, I thought mon_max_pg_per_osd (which was 500 in my case) would 
work in combination with the multiplier osd_max_pg_per_osd_hard_ratio, 
which, if I am not mistaken, is 2 by default: I had ~700 PGs/OSD so I was 
feeling rather safe.


However, I temporarily doubled the max_pg_per_osd value and "repeer"ed: 
an ansible round of "systemctl restart ceph-osd.target" on all OSDs also 
helped clear "slow ops".


  Thanks a lot, again!

Fulvio


On 15/09/2022 17:47, Josh Baergen wrote:

Hi Fulvio,

I've seen this in the past when a CRUSH change temporarily resulted in
too many PGs being mapped to an OSD, exceeding mon_max_pg_per_osd. You
can try increasing that setting to see if it helps, then setting it
back to default once backfill completes. You may also need to "ceph pg
repeer $pgid" for each of the PGs stuck activating.

Josh

On Thu, Sep 15, 2022 at 8:42 AM Fulvio Galeazzi  wrote:



Hallo,
 I am on Nautilus and today, after upgrading the operating system (from
CentOS 7 to CentOS 8 Stream) on a couple OSD servers and adding them
back to the cluster, I noticed some PGs are still "activating".
 The upgraded server are from the same "rack", and I have replica-3
pools with 1-per-rack rule, and 6+4 EC pools (in some cases, with SSD
pool for metadata).

More details:
- on the two OSD servers I upgrade, I ran "systemctl stop ceph.target"
 and waited a while, to verify all PGs would remain "active"
- went on with the upgrade and ceph-ansible reconfig
- as soon as I started adding OSDs I saw "slow ops"
- to exclude possible effect of updated packages, I ran "yum update" on
 all OSD servers, and rebooted them one by one
- after 2-3 hours, the last OSD disks finally came up
- I am left with:
 about 1k "slow ops" (if I pause recovery, number ~stable but max
 age increasing)
 ~200 inactive PGs

 Most of the inactive PGs are from the object store pool:

[cephmgr@cephAdmCT1.cephAdmCT1 ~]$ ceph osd pool get
default.rgw.buckets.data crush_rule
crush_rule: default.rgw.buckets.data

rule default.rgw.buckets.data {
   id 6
   type erasure
   min_size 3
   max_size 10
   step set_chooseleaf_tries 5
   step set_choose_tries 100
   step take default class big
   step chooseleaf indep 0 type host
   step emit
}

 But "ceph pg dump_stuck inactive" also shows 4 lines for the glance
replicated pool, like:

82.34   activating+remapped  [139,50,207]  139
[139,50,284]  139
82.54   activating+undersized+degraded+remapped[139,86,5]  139
[139,74]  139


Need your help please:

- any idea what was the root cause for all this?

- and now, how can I help OSDs complete their activation?
 + does the procedure differ for EC or replicated pools, by the way?
 + or may be I should first get rid of the "slow ops" issue?

I am pasting:
ceph osd df tree
 https://pastebin.ubuntu.com/p/VWhT7FWf6m/

ceph osd lspools ; ceph pg dump_stuck inactive
 https://pastebin.ubuntu.com/p/9f6rXRYMh4/

 Thanks a lot!

 Fulvio

--
Fulvio Galeazzi
GARR-CSD Department
tel.: +39-334-6533-250
skype: fgaleazzi70
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm automatic sizing of WAL/DB on SSD

2022-09-15 Thread Christophe BAILLON
Hi

The problem is still present in version 17.2.3; 
thanks for the trick to work around it...
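For anyone else hitting this, the effect is easy to see with the numbers from Patrick's report quoted below (rounded, ignoring LVM overhead):

intended: 1.44 TB per db SSD / 12 slots per SSD ≈ 120 GB per WAL/DB volume
observed: 1.44 TB (one SSD) / 24 slots (all OSDs prepared at once) ≈ 60 GB, i.e. half of each SSD left unused

With 4 db SSDs per host (Anh Phan's case) the same mistake leaves roughly 75% of each SSD unused.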

Regards

- Mail original -
> De: "Anh Phan Tuan" 
> À: "Calhoun, Patrick" 
> Cc: "Arthur Outhenin-Chalandre" , 
> "ceph-users" 
> Envoyé: Jeudi 11 Août 2022 10:14:17
> Objet: [ceph-users] Re: cephadm automatic sizing of WAL/DB on SSD

> Hi Patrick,
> 
> I am also facing this bug when deploying a new cluster at the time 16.2.7
> release.
> 
> The bug relates to the way Ceph calculates the db size from the given db disks.
> 
> Instead of: slot db size = size of db disk / number of slots per disk,
> Ceph calculated: slot db size = size of one db disk / total number of slots
> needed (the number of OSDs prepared at that time).
> 
> In your case, you have 2 db disks, so it makes the db size only 50% of the
> correct value.
> In my case, I have 4 db disks per host, so it makes the db size only 25% of
> the correct value.
> 
> This bug happens even when you deploy with the batch command.
> At the time, I worked around it by using the batch command but only
> deploying the OSDs belonging to one db disk at a time; in that case Ceph
> calculated the correct value.
> 
> Cheers,
> Anh Phan
> 
> 
> 
> On Sat, Jul 30, 2022 at 12:31 AM Calhoun, Patrick  wrote:
> 
>> Thanks, Arthur,
>>
>> I think you are right about that bug looking very similar to what I've
>> observed. I'll try to remember to update the list once the fix is merged
>> and released and I get a chance to test it.
>>
>> I'm hoping somebody can comment on what are ceph's current best practices
>> for sizing WAL/DB volumes, considering rocksdb levels and compaction.
>>
>> -Patrick
>>
>> 
>> From: Arthur Outhenin-Chalandre 
>> Sent: Friday, July 29, 2022 2:11 AM
>> To: ceph-users@ceph.io 
>> Subject: [ceph-users] Re: cephadm automatic sizing of WAL/DB on SSD
>>
>> Hi Patrick,
>>
>> On 7/28/22 16:22, Calhoun, Patrick wrote:
>> > In a new OSD node with 24 hdd (16 TB each) and 2 ssd (1.44 TB each), I'd
>> like to have "ceph orch" allocate WAL and DB on the ssd devices.
>> >
>> > I use the following service spec:
>> > spec:
>> >   data_devices:
>> > rotational: 1
>> > size: '14T:'
>> >   db_devices:
>> > rotational: 0
>> > size: '1T:'
>> >   db_slots: 12
>> >
>> > This results in each OSD having a 60GB volume for WAL/DB, which equates
>> to 50% total usage in the VG on each ssd, and 50% free.
>> > I honestly don't know what size to expect, but exactly 50% of capacity
>> makes me suspect this is due to a bug:
>> > https://tracker.ceph.com/issues/54541
>> > (In fact, I had run into this bug when specifying block_db_size rather
>> than db_slots)
>> >
>> > Questions:
>> >   Am I being bit by that bug?
>> >   Is there a better approach, in general, to my situation?
>> >   Are DB sizes still governed by the rocksdb tiering? (I thought that
>> this was mostly resolved by https://github.com/ceph/ceph/pull/29687 )
>> >   If I provision a DB/WAL logical volume size to 61GB, is that
>> effectively a 30GB database, and 30GB of extra room for compaction?
>>
>> I don't use cephadm, but it's maybe related to this regression:
>> https://tracker.ceph.com/issues/56031. At list the symptoms looks very
>> similar...
>>
>> Cheers,
>>
>> --
>> Arthur Outhenin-Chalandre
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

-- 
Christophe BAILLON
Mobile :: +336 16 400 522
Work :: https://eyona.com
Twitter :: https://twitter.com/ctof
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Nautilus: PGs stuck "activating" after adding OSDs. Please help!

2022-09-15 Thread Dan van der Ster
Another common config to workaround this pg num limit is:

ceph config set osd osd_max_pg_per_osd_hard_ratio 10

(Then possibly the repeer step on each activating pg)

.. Dan
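Putting Josh's and Dan's suggestions together, the workaround looks roughly like this (the values are examples, and the parsing assumes the plain-text dump_stuck format; revert the settings once backfill has finished):

# temporarily raise the per-OSD PG limits
ceph config set global mon_max_pg_per_osd 800
ceph config set osd osd_max_pg_per_osd_hard_ratio 10
# re-peer every PG that is stuck activating
ceph pg dump_stuck inactive 2>/dev/null | awk '/activating/ {print $1}' | while read pg; do
    ceph pg repeer "$pg"
done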



On Thu, Sept 15, 2022, 17:47 Josh Baergen  wrote:

> Hi Fulvio,
>
> I've seen this in the past when a CRUSH change temporarily resulted in
> too many PGs being mapped to an OSD, exceeding mon_max_pg_per_osd. You
> can try increasing that setting to see if it helps, then setting it
> back to default once backfill completes. You may also need to "ceph pg
> repeer $pgid" for each of the PGs stuck activating.
>
> Josh
>
> On Thu, Sep 15, 2022 at 8:42 AM Fulvio Galeazzi 
> wrote:
> >
> >
> > Hallo,
> > I am on Nautilus and today, after upgrading the operating system
> (from
> > CentOS 7 to CentOS 8 Stream) on a couple OSD servers and adding them
> > back to the cluster, I noticed some PGs are still "activating".
> > The upgraded server are from the same "rack", and I have replica-3
> > pools with 1-per-rack rule, and 6+4 EC pools (in some cases, with SSD
> > pool for metadata).
> >
> > More details:
> > - on the two OSD servers I upgrade, I ran "systemctl stop ceph.target"
> > and waited a while, to verify all PGs would remain "active"
> > - went on with the upgrade and ceph-ansible reconfig
> > - as soon as I started adding OSDs I saw "slow ops"
> > - to exclude possible effect of updated packages, I ran "yum update" on
> > all OSD servers, and rebooted them one by one
> > - after 2-3 hours, the last OSD disks finally came up
> > - I am left with:
> > about 1k "slow ops" (if I pause recovery, number ~stable but max
> > age increasing)
> > ~200 inactive PGs
> >
> > Most of the inactive PGs are from the object store pool:
> >
> > [cephmgr@cephAdmCT1.cephAdmCT1 ~]$ ceph osd pool get
> > default.rgw.buckets.data crush_rule
> > crush_rule: default.rgw.buckets.data
> >
> > rule default.rgw.buckets.data {
> >   id 6
> >   type erasure
> >   min_size 3
> >   max_size 10
> >   step set_chooseleaf_tries 5
> >   step set_choose_tries 100
> >   step take default class big
> >   step chooseleaf indep 0 type host
> >   step emit
> > }
> >
> > But "ceph pg dump_stuck inactive" also shows 4 lines for the glance
> > replicated pool, like:
> >
> > 82.34   activating+remapped  [139,50,207]  139
> > [139,50,284]  139
> > 82.54   activating+undersized+degraded+remapped[139,86,5]  139
> > [139,74]  139
> >
> >
> > Need your help please:
> >
> > - any idea what was the root cause for all this?
> >
> > - and now, how can I help OSDs complete their activation?
> > + does the procedure differ for EC or replicated pools, by the way?
> > + or may be I should first get rid of the "slow ops" issue?
> >
> > I am pasting:
> > ceph osd df tree
> > https://pastebin.ubuntu.com/p/VWhT7FWf6m/
> >
> > ceph osd lspools ; ceph pg dump_stuck inactive
> > https://pastebin.ubuntu.com/p/9f6rXRYMh4/
> >
> > Thanks a lot!
> >
> > Fulvio
> >
> > --
> > Fulvio Galeazzi
> > GARR-CSD Department
> > tel.: +39-334-6533-250
> > skype: fgaleazzi70
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Power outage recovery

2022-09-15 Thread Wesley Dillingham
What does "ceph status" "ceph health detail" etc show, currently?

Based on what you have said here my thought is you have created a new
monitor quorum and as such all auth details from the old cluster are lost
including any and all mgr cephx auth keys, so what does the log for the mgr
say? How many monitors did you have before? Do you have a backup of the old
monitor store?

Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 


On Thu, Sep 15, 2022 at 2:18 PM Marc  wrote:

> > (particularly the "Recovery using OSDs" section). I got it so the mon
> > processes would start, but then the ceph-mgr process died, and would not
> > restart. Not sure how to recover so both ceph-mgr and ceph-mon processes
> > run. In the meantime, all the data is gone. Any suggestions?
>
> All the data is gone? the osd's are running all. Your networking is fine?
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Power outage recovery

2022-09-15 Thread Eugen Block
The data only seems to be gone (if you mean what I think you mean)  
because the MGRs are not running and the OSDs can’t report their  
status. But are all MONs and OSDs up? What is the ceph status? What do  
the MGRs log when trying to start them?
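If the mgr log shows an authentication failure, the mgr key may have been lost during the store rebuild; recreating it looks roughly like this (hostname and keyring path are placeholders for a package-based install):

ceph auth get-or-create mgr.<host> mon 'allow profile mgr' osd 'allow *' mds 'allow *' \
    -o /var/lib/ceph/mgr/ceph-<host>/keyring
systemctl restart ceph-mgr@<host>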


Zitat von Jorge Garcia :


We have a Nautilus cluster that just got hit by a bad power outage. When
the admin systems came back up, only the ceph-mgr process was running (all
the ceph-mon processes would not start). I tried following the instructions
in
https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/
(particularly the "Recovery using OSDs" section). I got it so the mon
processes would start, but then the ceph-mgr process died, and would not
restart. Not sure how to recover so both ceph-mgr and ceph-mon processes
run. In the meantime, all the data is gone. Any suggestions?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Power outage recovery

2022-09-15 Thread Marc
> (particularly the "Recovery using OSDs" section). I got it so the mon
> processes would start, but then the ceph-mgr process died, and would not
> restart. Not sure how to recover so both ceph-mgr and ceph-mon processes
> run. In the meantime, all the data is gone. Any suggestions?

All the data is gone? the osd's are running all. Your networking is fine?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Power outage recovery

2022-09-15 Thread Jorge Garcia
We have a Nautilus cluster that just got hit by a bad power outage. When
the admin systems came back up, only the ceph-mgr process was running (all
the ceph-mon processes would not start). I tried following the instructions
in
https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/
(particularly the "Recovery using OSDs" section). I got it so the mon
processes would start, but then the ceph-mgr process died, and would not
restart. Not sure how to recover so both ceph-mgr and ceph-mon processes
run. In the meantime, all the data is gone. Any suggestions?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] multisite replication issue with Quincy

2022-09-15 Thread Jane Zhu
We have encountered replication issues in our multisite settings with
Quincy v17.2.3.

Our Ceph clusters are brand new. We tore down our clusters and re-deployed
fresh Quincy ones before we did our test.
In our environment, we have 3 RGW nodes per site, each node has 2 instances
for client traffic and 1 instance dedicated for replication.

Our test was done using cosbench with the following settings:
- 10 rgw users
- 3000 buckets per user
- write only
- 6 different object sizes with the following distribution:
1k: 17%
2k: 48%
3k: 14%
4k: 5%
1M: 13%
8M: 3%
- trying to write 10 million objects per object size bucket per user to
avoid writing to the same objects
- no multipart uploads involved
The test ran for about 2 hours roughly from 22:50pm 9/14 to 1:00am 9/15.
And after that, the replication tail continued for another roughly 4 hours
till 4:50am 9/15 with gradually decreasing replication traffic. Then the
replication stopped and nothing has been going on in the clusters since.

While we were verifying the replication status, we found many issues.
1. The sync status shows the clusters are not fully synced. However all the
replication traffic has stopped and nothing is going on in the clusters.
Secondary zone:

  realm 8a98f19f-db58-4c09-bde6-ac89560d79b0 (prod-realm)
  zonegroup e041ea69-1e0b-4ad7-92f2-74b20aa3edf3 (prod-zonegroup)
   zone 1dadcf12-f44c-4940-8acc-9623a48b829e (prod-zone-tt)
  metadata sync syncing
full sync: 0/64 shards
incremental sync: 64/64 shards
metadata is caught up with master
  data sync source: b68a526a-ffaa-4058-9903-6e7c6eac35bb (prod-zone-pw)
syncing
full sync: 0/128 shards
incremental sync: 128/128 shards
data is behind on 2 shards
behind shards: [40,74]


Why has the replication stopped even though the clusters are still not in sync?

2. We can see some buckets are not fully synced and we are able to
identified some missing objects in our secondary zone.
Here is an example bucket. This is its sync status in the secondary zone.

  realm 8a98f19f-db58-4c09-bde6-ac89560d79b0 (prod-realm)
  zonegroup e041ea69-1e0b-4ad7-92f2-74b20aa3edf3 (prod-zonegroup)
   zone 1dadcf12-f44c-4940-8acc-9623a48b829e (prod-zone-tt)
 bucket
:mixed-5wrks-dev-4k-thisisbcstestload004178[b68a526a-ffaa-4058-9903-6e7c6eac35bb.89152.78])

source zone b68a526a-ffaa-4058-9903-6e7c6eac35bb (prod-zone-pw)
  source bucket
:mixed-5wrks-dev-4k-thisisbcstestload004178[b68a526a-ffaa-4058-9903-6e7c6eac35bb.89152.78])
full sync: 0/101 shards
incremental sync: 100/101 shards
bucket is behind on 1 shards
behind shards: [78]

3. We can see from the above sync status that the behind shard for the example
bucket is not in the list of behind shards in the overall sync status.
Why is that?

4. Data sync status for these behind shards doesn't list any
"pending_buckets" or "recovering_buckets".
An example:

{
"shard_id": 74,
"marker": {
"status": "incremental-sync",
"marker": "0003:03381964",
"next_step_marker": "",
"total_entries": 0,
"pos": 0,
"timestamp": "2022-09-15T00:00:08.718840Z"
},
"pending_buckets": [],
"recovering_buckets": []
}


Shouldn't the not-yet-in-sync buckets be listed here?

5. The sync status of the primary zone is different from that of the secondary
zone, with different groups of behind shards. The same goes for the sync status
of the same bucket. Is this legitimate? Please see item 1 for the sync status
of the secondary zone, and item 6 for the primary zone.

6. Why does the primary zone have behind shards anyway, since the replication is
from primary to secondary?
Primary Zone:

  realm 8a98f19f-db58-4c09-bde6-ac89560d79b0 (prod-realm)
  zonegroup e041ea69-1e0b-4ad7-92f2-74b20aa3edf3 (prod-zonegroup)
   zone b68a526a-ffaa-4058-9903-6e7c6eac35bb (prod-zone-pw)
  metadata sync no sync (zone is master)
  data sync source: 1dadcf12-f44c-4940-8acc-9623a48b829e (prod-zone-tt)
syncing
full sync: 0/128 shards
incremental sync: 128/128 shards
data is behind on 30 shards
behind shards:
[6,7,26,28,29,37,47,52,55,56,61,67,68,69,74,79,82,91,95,99,101,104,106,111,112,121,122,123,126,127]

7. We have buckets in-sync that show correct sync status in secondary zone
but still show behind shards in primary. Why is that?
Secondary Zone:

  realm 8a98f19f-db58-4c09-bde6-ac89560d79b0 (prod-realm)
  zonegroup e041ea69-1e0b-4ad7-92f2-74b20aa3edf3 (prod-zonegroup)
   zone 1dadcf12-f44c-4940-8acc-9623a48b829e (prod-zone-tt)
 bucket

[ceph-users] Slides from today's Ceph User + Dev Monthly Meeting

2022-09-15 Thread Kamoltat Sirivadhna
Hi guys,

thank you all for attending today's meeting,
apologies for the restricted access.

Attached here is the slide in pdf format.

Let me know if you have any questions,

-- 

Kamoltat Sirivadhna (HE/HIM)

SoftWare Engineer - Ceph Storage

ksiri...@redhat.com
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Nautilus: PGs stuck "activating" after adding OSDs. Please help!

2022-09-15 Thread Fulvio Galeazzi


Hallo,
	I am on Nautilus and today, after upgrading the operating system (from 
CentOS 7 to CentOS 8 Stream) on a couple OSD servers and adding them 
back to the cluster, I noticed some PGs are still "activating".
   The upgraded server are from the same "rack", and I have replica-3 
pools with 1-per-rack rule, and 6+4 EC pools (in some cases, with SSD 
pool for metadata).


More details:
- on the two OSD servers I upgrade, I ran "systemctl stop ceph.target"
   and waited a while, to verify all PGs would remain "active"
- went on with the upgrade and ceph-ansible reconfig
- as soon as I started adding OSDs I saw "slow ops"
- to exclude possible effect of updated packages, I ran "yum update" on
   all OSD servers, and rebooted them one by one
- after 2-3 hours, the last OSD disks finally came up
- I am left with:
about 1k "slow ops" (if I pause recovery, number ~stable but max
age increasing)
~200 inactive PGs

   Most of the inactive PGs are from the object store pool:

[cephmgr@cephAdmCT1.cephAdmCT1 ~]$ ceph osd pool get 
default.rgw.buckets.data crush_rule

crush_rule: default.rgw.buckets.data

rule default.rgw.buckets.data {
 id 6
 type erasure
 min_size 3
 max_size 10
 step set_chooseleaf_tries 5
 step set_choose_tries 100
 step take default class big
 step chooseleaf indep 0 type host
 step emit
}

   But "ceph pg dump_stuck inactive" also shows 4 lines for the glance 
replicated pool, like:


82.34   activating+remapped  [139,50,207]  139 
[139,50,284]  139
82.54   activating+undersized+degraded+remapped[139,86,5]  139 
[139,74]  139



Need your help please:

- any idea what was the root cause for all this?

- and now, how can I help OSDs complete their activation?
   + does the procedure differ for EC or replicated pools, by the way?
   + or may be I should first get rid of the "slow ops" issue?

I am pasting:
ceph osd df tree
   https://pastebin.ubuntu.com/p/VWhT7FWf6m/

ceph osd lspools ; ceph pg dump_stuck inactive
   https://pastebin.ubuntu.com/p/9f6rXRYMh4/

   Thanks a lot!

Fulvio

--
Fulvio Galeazzi
GARR-CSD Department
tel.: +39-334-6533-250
skype: fgaleazzi70
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: adding mds service , unable to create keyring for mds

2022-09-15 Thread Jerry Buburuz



FIXED!

Interesting:

ceph auth caps mds.mynode mon 'profile mds' mgr 'profile mds'
mds 'allow *' osd 'allow *'

output:
updated caps for mds.admin-node-02

Worked!

ceph auth list

mds.mynode:
   key:
   caps: [mds] allow
   caps: [mgr] profile mds
   caps: [mon] profile mds
   caps: [osd] allow *


Thanks Eugen!

Thank you Xiubo Li! you got me going down the right road.

I am not sure why "get-or-create" did not work. I am not sure if this is
part of the cause: I created a cephfs manually before starting the mds.

Currently:

ceph mds stat
cephfs:0/0 1 up:standby, 1 damaged

But I have a mds UP!
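Regarding the "1 damaged" rank, a few commands commonly used to investigate before doing anything drastic (the filesystem name/rank below are assumptions based on the output above):

ceph health detail
ceph fs status
# inspect the metadata journal of rank 0 of the "cephfs" filesystem
cephfs-journal-tool --rank cephfs:0 journal inspect
# only after understanding why the rank was marked damaged:
# ceph mds repaired cephfs:0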

thanks
jerry

Eugen Block
> Have you tried to modify by using ‚ceph auth caps…‘ instead of
get-or-create?
> Zitat von Jerry Buburuz :
>> Can I just:
>> ceph auth export mds.mynode -o mds.export
>> Add(editor) "caps: [mds] profile mds"
>> ceph auth import -i mds.export
>> THanks
>> jerry
>> Jerry Buburuz
>>> ceph auth list:
>>> mds.mynode
>>> key: mykeyxx
>>> caps: [mgr] profile mds
>>> caps: [mon] profile mds
>>> ceph auth  get-or-create mds.mynode mon 'profile mds' mgr 'profile
mds'
>>> mds 'allow *' osd 'allow *'
>>> error:
>>> Error EINVAL: key for mds.mynode exists but cap mds does not match I
am following instructions on
>>> https://docs.ceph.com/latest/cephfs/add-remove-mds
>>> thanks
>>> jerry
>>> Xiubo Li
 On 15/09/2022 03:09, Jerry Buburuz wrote:
> Hello,
> I am trying to add my first mds service on any node. I am unable to add
> keyring to start mds service.
> # $ sudo ceph auth get-or-create mds.mynode mon 'profile mds' mgr
'profile
> mds' mds 'allow *' osd 'allow *'
> Error ENINVAL: key for mds.mynode exists but cap mds does not match
 It says the key mds.mynode already exists. What's the output of the
`ceph auth ls` ?
 Thanks!
> I tried this command on storage nodes, admin nodes(monitor) , same
error.
> ceph mds stat
> cephfs:0
> This makes sense I don't have any mds services running yet.
> I had no problem creating keyrings for other services like monitors and
> mgr.
> Thanks
> jerry
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
 ___
 ceph-users mailing list -- ceph-users@ceph.io
 To unsubscribe send an email to ceph-users-le...@ceph.io
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: adding mds service , unable to create keyring for mds

2022-09-15 Thread Eugen Block

Have you tried to modify by using ‚ceph auth caps…‘ instead of get-or-create?

Zitat von Jerry Buburuz :


Can I just:

ceph auth export mds.mynode -o mds.export

Add(editor) "caps: [mds] profile mds"

ceph auth import -i mds.export

THanks
jerry

Jerry Buburuz


ceph auth list:

mds.mynode
key: mykeyxx
caps: [mgr] profile mds
caps: [mon] profile mds


ceph auth  get-or-create mds.mynode mon 'profile mds' mgr 'profile mds'
mds 'allow *' osd 'allow *'

error:
Error EINVAL: key for mds.mynode exists but cap mds does not match

I am following instructions on
https://docs.ceph.com/latest/cephfs/add-remove-mds

thanks
jerry

Xiubo Li


On 15/09/2022 03:09, Jerry Buburuz wrote:

Hello,

I am trying to add my first mds service on any node. I am unable to add
keyring to start mds service.

# $ sudo ceph auth get-or-create mds.mynode mon 'profile mds' mgr
'profile
mds' mds 'allow *' osd 'allow *'

Error ENINVAL: key for mds.mynode exists but cap mds does not match


It says the key mds.mynode already exists. What's the output of the
`ceph auth ls` ?

Thanks!



I tried this command on storage nodes, admin nodes(monitor) , same
error.

ceph mds stat
cephfs:0

This makes sense I don't have any mds services running yet.

I had no problem creating keyrings for other services like monitors and
mgr.

Thanks
jerry





___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: adding mds service , unable to create keyring for mds

2022-09-15 Thread Jerry Buburuz


Can I just:

ceph auth export mds.mynode -o mds.export

Add(editor) "caps: [mds] profile mds"

ceph auth import -i mds.export

THanks
jerry

Jerry Buburuz
>
> ceph auth list:
>
> mds.mynode
> key: mykeyxx
> caps: [mgr] profile mds
> caps: [mon] profile mds
>
>
> ceph auth  get-or-create mds.mynode mon 'profile mds' mgr 'profile mds'
> mds 'allow *' osd 'allow *'
>
> error:
> Error EINVAL: key for mds.mynode exists but cap mds does not match
>
> I am following instructions on
> https://docs.ceph.com/latest/cephfs/add-remove-mds
>
> thanks
> jerry
>
> Xiubo Li
>>
>> On 15/09/2022 03:09, Jerry Buburuz wrote:
>>> Hello,
>>>
>>> I am trying to add my first mds service on any node. I am unable to add
>>> keyring to start mds service.
>>>
>>> # $ sudo ceph auth get-or-create mds.mynode mon 'profile mds' mgr
>>> 'profile
>>> mds' mds 'allow *' osd 'allow *'
>>>
>>> Error ENINVAL: key for mds.mynode exists but cap mds does not match
>>
>> It says the key mds.mynode already exists. What's the output of the
>> `ceph auth ls` ?
>>
>> Thanks!
>>
>>
>>> I tried this command on storage nodes, admin nodes(monitor) , same
>>> error.
>>>
>>> ceph mds stat
>>> cephfs:0
>>>
>>> This makes sense I don't have any mds services running yet.
>>>
>>> I had no problem creating keyrings for other services like monitors and
>>> mgr.
>>>
>>> Thanks
>>> jerry
>>>
>>>
>>>
>>>
>>>
>>> ___
>>> ceph-users mailing list -- ceph-users@ceph.io
>>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>>
>>
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>>
>
>


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: adding mds service , unable to create keyring for mds

2022-09-15 Thread Jerry Buburuz


ceph auth list:

mds.mynode
key: mykeyxx
caps: [mgr] profile mds
caps: [mon] profile mds


ceph auth  get-or-create mds.mynode mon 'profile mds' mgr 'profile mds'
mds 'allow *' osd 'allow *'

error:
Error EINVAL: key for mds.mynode exists but cap mds does not match

I am following instructions on
https://docs.ceph.com/latest/cephfs/add-remove-mds

thanks
jerry

Xiubo Li
>
> On 15/09/2022 03:09, Jerry Buburuz wrote:
>> Hello,
>>
>> I am trying to add my first mds service on any node. I am unable to add
>> keyring to start mds service.
>>
>> # $ sudo ceph auth get-or-create mds.mynode mon 'profile mds' mgr
>> 'profile
>> mds' mds 'allow *' osd 'allow *'
>>
>> Error ENINVAL: key for mds.mynode exists but cap mds does not match
>
> It says the key mds.mynode already exists. What's the output of the
> `ceph auth ls` ?
>
> Thanks!
>
>
>> I tried this command on storage nodes, admin nodes(monitor) , same
>> error.
>>
>> ceph mds stat
>> cephfs:0
>>
>> This makes sense I don't have any mds services running yet.
>>
>> I had no problem creating keyrings for other services like monitors and
>> mgr.
>>
>> Thanks
>> jerry
>>
>>
>>
>>
>>
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd mirroring - journal growing and snapshot high io load

2022-09-15 Thread Arthur Outhenin-Chalandre
Hi Ronny,

> On 15/09/2022 14:32 ronny.lippold  wrote:
> hi arthur, some time went ...
> 
> i would like to know, if there are some news of your setup.
> do you have replication active running?

No, there was no change at CERN. I am switching jobs as well actually so I 
won't have much news for you on CERN infra in the future. I know other people 
from the Ceph team at CERN watch this ml so you might hear from them as well I 
guess.

> we are using actually snapshot based and had last time a move of both 
> clusters.
> after that, we had some damaged filesystems ind the kvm vms.
> did you ever had such a problems in your tests.
> 
> i think, there are not so many people, how are using ceph replication.
> for me its hard to find the right way.
> can a snapshot based ceph replication be crash consisten? i think no.

I never noticed it myself, but yes, it's actually written in the docs: 
https://docs.ceph.com/en/quincy/rbd/rbd-snapshot/ (though it is not explained 
in the mirroring docs). I never tested that super carefully though and 
thought this was more a rare occurrence than anything else.

I heard a while back (maybe a year-ish ago) that there was some long term plan 
to automatically trigger an fsfreeze for librbd/qemu on a snapshot, which would 
probably solve your issue (and also allow application level consistency via 
fsfreeze custom hooks). But this was apparently a tricky feature to add. I 
cc'ed Illya; maybe he would know more about that, or whether something else could have 
caused your issue.
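In the meantime, something along these lines can be done by hand to get a consistent mirror snapshot of a running VM (image name and mountpoint are placeholders; the libvirt commands assume the qemu guest agent is running):

# quiesce the guest filesystems, either inside the guest or via libvirt
virsh domfsfreeze myvm            # or, inside the guest: fsfreeze -f /mountpoint
# take the mirror snapshot while the filesystems are frozen
rbd mirror image snapshot mypool/vm-disk-1
# thaw again
virsh domfsthaw myvm              # or: fsfreeze -u /mountpoint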

Cheers,

-- 
Arthur Outhenin-Chalandre
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd mirroring - journal growing and snapshot high io load

2022-09-15 Thread ronny.lippold

hi arthur, some time has passed ...

i would like to know if there is any news about your setup.
do you have replication actively running?

we are actually using snapshot based replication and recently had a move of both 
clusters.

after that, we had some damaged filesystems in the kvm vms.
did you ever have such problems in your tests?

i think there are not so many people who are using ceph replication.
for me it's hard to find the right way.
can a snapshot based ceph replication be crash consistent? i think no.

thanks for any help there ...

ronny

Am 2022-05-23 20:09, schrieb ronny.lippold:

hi arthur,

just for information. we had some horrible days ...

last week, we shut some virtual machines down.
most of them did not came back. timeout qmp socket ... and no kvm 
console.


so, we switched to our rbd-mirror cluster and ... yes, was working, 
puh.


some days later, we tried to install a devel proxmox package, which 
should help.

did not ... helpfull was, to rbd move the image and than move back
(like rename).

today, i found the answer.

i cleaned up the pool config and we removed the journaling feature
from the images.
after that, everything was booting fine.

maybe the performance issue with snapshots came from an proxmox bug
... we will see
(https://forum.proxmox.com/threads/possible-bug-after-upgrading-to-7-2-vm-freeze-if-backing-up-large-disks.109272/)

have a great time ...

ronny

Am 2022-05-12 15:29, schrieb Arthur Outhenin-Chalandre:

On 5/12/22 14:31, ronny.lippold wrote:

many thanks, we will check the slides ... are looking great




ok, you mean, that the growing came, cause of replication is to 
slow?

strange ... i thought our cluster is not so big ... but ok.
so, we cannot use journal ...
maybe some else have same result?


If you want a bit more details on this you can check my slides here:
https://codimd.web.cern.ch/p/-qWD2Y0S9#/.


Hmmm I think there are some plan to have a way to spread the 
snapshots
in the provided interval in Reef (and not take every snapshots at 
once)

but that's unfortunately not here today... The timing thing is a bit
weird but I am not an expert on RBD snapshots implication in 
general...

Maybe you can try to reproduce by taking snapshot by hand with `rbd
mirror image snapshot` on some of your images, maybe that's 
something

related to really big images? Or that there was a lot of write since
the
last snapshot?



yes right, i was alos thinking of this ...
i would like to find something, to debug the problem.
problems after 50days ... i do not understand this

which way are you actually going? do you have a replication?


We are going towards mirror snapshots, but we didn't advertise
internally so far and we won't enable it on every images; it would 
only

be for new volumes if people want explicitly that feature. So we are
probably not going to hit these performance issues that you suffer for
quite some time and the scope of it should be limited...

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Manual deployment, documentation error?

2022-09-15 Thread Dominique Ramaekers
Hi Marc,


> -Oorspronkelijk bericht-
> Van: Marc 
> Verzonden: donderdag 15 september 2022 11:14
> Aan: Dominique Ramaekers ; Ranjan
> Ghosh 
> CC: ceph-users@ceph.io
> Onderwerp: RE: [ceph-users] Re: Manual deployment, documentation error?
> 
> 
> > Cons of using cephadm (and thus docker):
> > - You need to learn the basics of docker
> 
> If you learn only the basics, you will probably fuck up when you have some
> sort of real issue with ceph. I would not recommend sticking to basics of
> anything with this install, it is not like if something goes wrong you 
> restore a
> vm from a snapshot or so.

My opinion...
If some real issue occurs with ceph, the fix would probably not be found inside 
the docker configuration. All my initial problems were caused by my own 
stupidity and not following the manuals correctly.
If you know the difference between an image and a container, you know how to 
pull an updated image and you know how to get information about images and 
containers, you're good to go.
It won't hurt to know more. Troubleshooting always becomes easier as knowledge 
and experience grow. But remember: you can't know it all! That's the reason 
this mailing list exists, right?
Your argument referencing the managing of vm's and snapshots isn't valid 
=> you assume we readers know about managing vm's and snapshots... I remember 
at least one time I lost data by making an error while managing my vm's.

> 
> > Pros:
> > + cephadm works very easily. The time you spend on learning docker will be
> > easily compensated by the small time you need to learn cephadm
> 
> Incorrect, you need to learn ceph as much as in any install. If you proceed to
> use ceph in a manner where you do not know what you are doing, you ‘risk’
> losing your clients' data.

I never said you won't have to learn ceph. Where do you get this from?
I compared 'manually' installing ceph with installing ceph with cephadm.
You will need to learn ceph to operate your cluster.

> 
> > + Upgrading a manual installation is very tricky! Cephadm manages
> > upgrades of ceph automatically. You only need to give the command
> > (done it already two times).
> 
> Why is this tricky? Just read the guidelines and understand them. I would
> even argue that you need to know how to update ceph manually. So you are
> better prepared for what happens at stages of the update.

I don't know how to compile and install an application. Yet I can still
install all kinds of applications and troubleshoot them, even if the
installation fails.

> 
> > + If you need to upgrade your OS, will the manual installation still
> > function? With cephadm the ceph processes inside the docker containers
> > experience minimal impact with the upgrade of the OS (didn't do an OS
> > upgrade yet, but had this issue with other applications).
> >
> 
> If you need to use cephadm because you can't work with ceph manually, just
> forget about using ceph.
> (and forget about docker; use podman if you have to)

See my last statement...

> 

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Manual deployment, documentation error?

2022-09-15 Thread Marc

> Cons of using cephadm (and thus docker):
> - You need to learn the basics of docker

If you learn only the basics, you will probably fuck up when you have some sort
of real issue with ceph. I would not recommend sticking to the basics of anything
with this install; it is not as if, when something goes wrong, you can just
restore a VM from a snapshot or so.

> Pros:
> + cephadm works very easily. The time you spend on learning docker will be
> easily compensated by the small time you need to learn cephadm

Incorrect, you need to learn ceph as much as in any install. If you proceed to 
use ceph in a manner where you do not know what you are doing, you ‘risk’ 
losing your clients' data.

> + Upgrading a manual installation is very tricky! Cephadm manages
> upgrades of ceph automatically. You only need to give the command (done
> it already two times).

Why is this tricky? Just read the guidelines and understand them. I would even
argue that you need to know how to update ceph manually, so that you are better
prepared for what happens at each stage of the update.
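For what it's worth, a manual (package-based) update is roughly the following;
the Debian/Ubuntu package names are just an example, and the daemons are
restarted in the usual mon -> mgr -> osd -> mds/rgw order:

$ sudo apt update && sudo apt install --only-upgrade ceph-mon ceph-mgr ceph-osd ceph-mds radosgw
$ sudo systemctl restart ceph-mon.target     # then ceph-mgr.target, ceph-osd.target, ...
$ ceph versions                              # verify all daemons run the expected version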

> + If you need to upgrade your OS, will the manual installation still
> function? With cephadm the ceph processes inside the docker containers
> experience minimal impact with the upgrade of the OS (didn't do an OS
> upgrade yet, but had this issue with other applications).
> 

If you need to use cephadm because you can't work with ceph manually, just 
forget about using ceph.
(and forget about docker; use podman if you have to)


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Manual deployment, documentation error?

2022-09-15 Thread Rok Jaklič
Every now and then someone comes up with a subject like this.

There is quite a long thread about the pros and cons of using docker and all the
tools around ceph at
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/TTTYKRVWJOR7LOQ3UCQAZQR32R7YADVY/#AT7YQV6RE5SMKDZHXL3ZI2G5BWFUUUXE

Long story short... additional layers of complexity underneath, with simplicity
on top because of some hype around docker, do not solve the problems users are
facing right now. So the argument for ceph not using docker in the installation
is actually quite good.

On Thu, Sep 15, 2022 at 10:18 AM Dominique Ramaekers <
dominique.ramaek...@cometal.be> wrote:

> Hi Ranjan,
>
> I don't want to intervene but I can testify that docker doesn't make the
> installation for a 3-node cluster overkill.
>
> I too have a 3-node cluster (to be expanded soon to 4 nodes).
>
> Cons of using cephadm (and thus docker):
> - You need to learn the basics of docker
>
> Pros:
> + cephadm works very easily. The time you spend on learning docker will be
> easily compensated by the small time you need to learn cephadm
> + Upgrading a manual installation is very tricky! Cephadm manages upgrades
> of ceph automatically. You only need to give the command (done it already
> two times).
> + If you need to upgrade your OS, will the manual installation still
> function? With cephadm the ceph processes inside the docker containers
> experience minimal impact with the upgrade of the OS (didn't do an OS
> upgrade yet, but had this issue with other applications).
>
> Greetings,
>
> Dominique.
>
> > -Oorspronkelijk bericht-
> > Van: Ranjan Ghosh 
> > Verzonden: woensdag 14 september 2022 15:58
> > Aan: Eugen Block 
> > CC: ceph-users@ceph.io
> > Onderwerp: [ceph-users] Re: Manual deployment, documentation error?
> >
> > Hi Eugen,
> >
> > thanks for your answer. I don't want to use the cephadm tool because it
> > needs docker. I don't like it because it's total overkill for our small 3-node
> > cluster. I'd like to avoid the added complexity, added packages, everything.
> > Just another thing I have to learn in detail about in case things go
> > wrong.
> >
> > The monitor service is running but the logs don't say anything :-( Okay, but at
> > least I know now that it should work in principle without the mgr.
> >
> > Thank you
> > Ranjan
> >
> > Eugen Block schrieb am 14.09.2022 um 15:26:
> > > Hi,
> > >
> > >> Im currently trying the manual deployment because ceph-deploy
> > >> unfortunately doesn't seem to exist anymore and under step 19 it says
> > >> you should run "sudo ceph -s". That doesn't seem to work. I guess
> > >> this is because the manager service isn't yet running, right?
> > >
> > > ceph-deploy was deprecated quite some time ago, if you want to use a
> > > deployment tool try cephadm [1]. The command 'ceph -s' is not
> > > depending on the mgr but the mon service. So if that doesn't work you
> > > need to check the mon logs and see if the mon service is up and running.
> > >
> > >> Interestingly the screenshot under step 19 says "mgr: mon-
> > >> node1(active)". If you follow the documentation step by step, there's
> > >> no mention of the manager node up until that point.
> > >
> > > Right after your mentioned screenshot there's a section for the mgr
> > > service [2].
> > >
> > > Regards,
> > > Eugen
> > >
> > > [1] https://docs.ceph.com/en/quincy/cephadm/install/
> > > [2]
> > > https://docs.ceph.com/en/quincy/install/manual-deployment/#manager-daemon-configuration
> > >
> > >
> > > Zitat von Ranjan Ghosh :
> > >
> > >> Hi all,
> > >>
> > >> I think there's an error in the documentation:
> > >>
> > >> https://docs.ceph.com/en/quincy/install/manual-deployment/
> > >>
> > >> Im currently trying the manual deployment because ceph-deploy
> > >> unfortunately doesn't seem to exist anymore and under step 19 it says
> > >> you should run "sudo ceph -s". That doesn't seem to work. I guess
> > >> this is because the manager service isn't yet running, right?
> > >>
> > >> Interestingly the screenshot under step 19 says "mgr: mon-
> > >> node1(active)". If you follow the documentation step by step, there's
> > >> no mention of the manager node up until that point.
> > >>
> > >> Thank you / BR
> > >> Ranjan
> > >>
> > >> ___
> > >> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> > >> email to ceph-users-le...@ceph.io
> > >
> > >
> > >
> > > ___
> > > ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> > > email to ceph-users-le...@ceph.io
> >
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> email
> > to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>

[ceph-users] Re: Manual deployment, documentation error?

2022-09-15 Thread Dominique Ramaekers
Hi Ranjan,

I don't want to intervene but I can testify that docker doesn't make the 
installation for a 3-node cluster overkill.

I too have a 3-node cluster (to be expanded soon to 4 nodes).

Cons of using cephadm (and thus docker):
- You need to learn the basics of docker

Pros:
+ cephadm works very easily. The time you spend on learning docker will be easily
compensated by the small time you need to learn cephadm
+ Upgrading a manual installation is very tricky! Cephadm manages upgrades of
ceph automatically. You only need to give the command (done it already two
times; see the example below this list).
+ If you need to upgrade your OS, will the manual installation still function?
With cephadm the ceph processes inside the docker containers experience minimal
impact with the upgrade of the OS (didn't do an OS upgrade yet, but had this
issue with other applications).
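As an illustration of the upgrade point above, the cephadm upgrade boils down to
something like this (the version number is just an example):

$ ceph orch upgrade start --ceph-version 16.2.10
$ ceph orch upgrade status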

Greetings,

Dominique.

> -Oorspronkelijk bericht-
> Van: Ranjan Ghosh 
> Verzonden: woensdag 14 september 2022 15:58
> Aan: Eugen Block 
> CC: ceph-users@ceph.io
> Onderwerp: [ceph-users] Re: Manual deployment, documentation error?
> 
> Hi Eugen,
> 
> thanks for your answer. I don't want to use the cephadm tool because it
> needs docker. I don't like it because it's total overkill for our small 3-node
> cluster.  I'd like to avoid the added complexity, added packages, everything.
> Just another thing I have to learn in detail about in case things go wrong.
> 
> The monitor service is running but the logs don't say anything :-( Okay, but at
> least I know now that it should work in principle without the mgr.
> 
> Thank you
> Ranjan
> 
> Eugen Block schrieb am 14.09.2022 um 15:26:
> > Hi,
> >
> >> Im currently trying the manual deployment because ceph-deploy
> >> unfortunately doesn't seem to exist anymore and under step 19 it says
> >> you should run "sudo ceph -s". That doesn't seem to work. I guess
> >> this is because the manager service isn't yet running, right?
> >
> > ceph-deploy was deprecated quite some time ago, if you want to use a
> > deployment tool try cephadm [1]. The command 'ceph -s' is not
> > depending on the mgr but the mon service. So if that doesn't work you
> > need to check the mon logs and see if the mon service is up and running.
> >
> >> Interestingly the screenshot under step 19 says "mgr: mon-
> >> node1(active)". If you follow the documentation step by step, there's
> >> no mention of the manager node up until that point.
> >
> > Right after your mentioned screenshot there's a section for the mgr
> > service [2].
> >
> > Regards,
> > Eugen
> >
> > [1] https://docs.ceph.com/en/quincy/cephadm/install/
> > [2]
> > https://docs.ceph.com/en/quincy/install/manual-deployment/#manager-daemon-configuration
> >
> >
> > Zitat von Ranjan Ghosh :
> >
> >> Hi all,
> >>
> >> I think there's an error in the documentation:
> >>
> >> https://docs.ceph.com/en/quincy/install/manual-deployment/
> >>
> >> Im currently trying the manual deployment because ceph-deploy
> >> unfortunately doesn't seem to exist anymore and under step 19 it says
> >> you should run "sudo ceph -s". That doesn't seem to work. I guess
> >> this is because the manager service isn't yet running, right?
> >>
> >> Interestingly the screenshot under step 19 says "mgr: mon-
> >> node1(active)". If you follow the documentation step by step, there's
> >> no mention of the manager node up until that point.
> >>
> >> Thank you / BR
> >> Ranjan
> >>
> >> ___
> >> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> >> email to ceph-users-le...@ceph.io
> >
> >
> >
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> > email to ceph-users-le...@ceph.io
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email
> to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io