[ceph-users] Re: Latest Doco Out Of Date?

2024-04-24 Thread Eugen Block

Hi,

I believe the docs [2] are okay: running 'ceph fs authorize' will
overwrite the existing caps; it will not add more caps to the client:


> Capabilities can be modified by running fs authorize only in the
> case when read/write permissions must be changed.

> If a client already has a capability for file-system name a and path
> dir1, running fs authorize again for FS name a but path dir2,
> instead of modifying the capabilities client already holds, a new
> cap for dir2 will be granted


To add more caps you'll need to use the 'ceph auth caps' command, for example:

quincy-1:~ # ceph fs authorize cephfs client.usera /dir1 rw
[client.usera]
key = AQDOrShmk6XhGxAAwz07ngr0JtPSID06RH8lAw==

quincy-1:~ # ceph auth get client.usera
[client.usera]
key = AQDOrShmk6XhGxAAwz07ngr0JtPSID06RH8lAw==
caps mds = "allow rw fsname=cephfs path=/dir1"
caps mon = "allow r fsname=cephfs"
caps osd = "allow rw tag cephfs data=cephfs"

quincy-1:~ # ceph auth caps client.usera mds 'allow rw fsname=cephfs  
path=/dir1, allow rw fsname=cephfs path=/dir2' mon 'allow r  
fsname=cephfs' osd 'allow rw tag cephfs data=cephfs'

updated caps for client.usera

quincy-1:~ # ceph auth get client.usera
[client.usera]
key = AQDOrShmk6XhGxAAwz07ngr0JtPSID06RH8lAw==
caps mds = "allow rw fsname=cephfs path=/dir1, allow rw  
fsname=cephfs path=/dir2"

caps mon = "allow r fsname=cephfs"
caps osd = "allow rw tag cephfs data=cephfs"

Note that I don't actually have these directories in that cephfs, it's  
just to demonstrate, so you'll need to make sure your caps actually  
work.
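To avoid retyping (and possibly losing) existing caps, the new mds cap string can be built by appending to the current one. A minimal sketch using the cap strings from the example above; the echo stands in for actually running the command on the cluster:

```shell
# The mds cap string exactly as 'ceph auth get client.usera' printed it above:
current_mds='allow rw fsname=cephfs path=/dir1'
# Append the new path instead of retyping everything, so /dir1 is not dropped:
new_mds="$current_mds, allow rw fsname=cephfs path=/dir2"
# The resulting command to run on the cluster (echoed so the sketch runs anywhere):
echo "ceph auth caps client.usera mds '$new_mds' mon 'allow r fsname=cephfs' osd 'allow rw tag cephfs data=cephfs'"
```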


Thanks,
Eugen

[2]  
https://docs.ceph.com/en/latest/cephfs/client-auth/#changing-rw-permissions-in-caps



Zitat von Zac Dover :

It's in my list of ongoing initiatives. I'll stay up late tonight  
and ask Venky directly what's going on in this instance.


Sometime later today, I'll create an issue tracking bug and I'll  
send it to you for review. Make sure that I haven't misrepresented  
this issue.


Zac

On Wednesday, April 24th, 2024 at 2:10 PM, duluxoz  wrote:


Hi Zac,

Any movement on this? We really need to come up with an  
answer/solution - thanks


Dulux-Oz

On 19/04/2024 18:03, duluxoz wrote:


Cool!

Thanks for that :-)

On 19/04/2024 18:01, Zac Dover wrote:

I think I understand, after more thought. The second command is  
expected to work after the first.


I will ask the cephfs team when they wake up.

Zac Dover
Upstream Docs
Ceph Foundation

On Fri, Apr 19, 2024 at 17:51, duluxoz <dulu...@gmail.com> wrote:



Hi All,

In reference to this page from the Ceph documentation:
https://docs.ceph.com/en/latest/cephfs/client-auth/, down the bottom of
that page it says that you can run the following commands:

~~~
ceph fs authorize a client.x /dir1 rw
ceph fs authorize a client.x /dir2 rw
~~~

This will allow `client.x` to access both `dir1` and `dir2`.

So, having a use case where we need to do this, we are, HOWEVER, getting
the following error on running the 2nd command on a Reef 18.2.2 cluster:

`Error EINVAL: client.x already has fs capabilities that differ from
those supplied. To generate a new auth key for client.x, first remove
client.x from configuration files, execute 'ceph auth rm client.x', then
execute this command again.`

Something we're doing wrong, or is the doco "out of date" (mind you,
that's from the "latest" version of the doco, and the "reef" version),
or is something else going on?

Thanks in advance for the help

Cheers

Dulux-Oz

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



[ceph-users] Re: stretched cluster new pool and second pool with nvme

2024-04-24 Thread Eugen Block
Oh, I see. Unfortunately, I don't have a cluster in stretch mode so I  
can't really test that. Thanks for pointing to the tracker.


Zitat von Stefan Kooman :


On 23-04-2024 14:40, Eugen Block wrote:

Hi,


what's the right way to add another pool?
create pool with 4/2 and use the rule for the stretched mode, finished?
the existing pools were automatically set to 4/2 after "ceph mon  
enable_stretch_mode".


It should be that simple. However, it does not seem to work. I tried  
to do just that, use two separate pools, hdd and ssd in that case,  
but it would not work, see this tracker:  
https://tracker.ceph.com/issues/64817


If your experience is different please update the tracker ticket. If  
it indeed does not work, please also update the tracker ticket with  
a "+1".


Thanks,

Gr. Stefan



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Latest Doco Out Of Date?

2024-04-24 Thread Frank Schilder
Hi Eugen,

I would ask for a slight change here:

> If a client already has a capability for file-system name a and path
> dir1, running fs authorize again for FS name a but path dir2,
> instead of modifying the capabilities client already holds, a new
> cap for dir2 will be granted

The formulation "a new cap for dir2 will be granted" is very misleading. I 
would also read it as meaning that the new cap is granted in addition to the 
already existing cap. I tried to modify caps with fs authorize in the past as 
well, because it sets caps using pool tags and the docu sounded like it would 
allow modifying caps. In my case, I got the same error, thought that the 
implementation was buggy, and did it with the authtool.

To be honest, when I look at the command sequence

ceph fs authorize a client.x /dir1 rw
ceph fs authorize a client.x /dir2 rw

and it goes through without error, I would expect the client to have both 
permissions as a result - no matter what the documentation says. There is no 
"revoke caps" instruction anywhere. Revoking caps in this way is a really 
dangerous side effect and telling people to read the documentation about a 
command that should follow how other linux tools manage permissions is not the 
best answer. There is something called parallelism in software engineering, and 
this command-line syntax violates it in a highly unintuitive way. The 
intuition of the syntax clearly is that it *adds* capabilities; it's incremental.

A command like this should follow how existing linux tools work so that context 
switching will be easier for admins. Here, the choice of the term "authorize" 
seems to be unlucky. A more explicit command that follows setfacl a bit could be

ceph fs caps set a client.x /dir1 rw
ceph fs caps modify a client.x /dir2 rw

or even more parallel

ceph fs setcaps a client.x /dir1 rw
ceph fs setcaps -m a client.x /dir2 rw

Such parallel syntax will not only avoid the reported confusion but also make 
it possible to implement a modify operation in the future without breaking 
stuff. And you can save time on the documentation, because it works like other 
stuff.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Eugen Block 
Sent: Wednesday, April 24, 2024 9:02 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: Latest Doco Out Of Date?


[ceph-users] ceph recipe for nfs exports

2024-04-24 Thread Roberto Maggi @ Debian

Hi you all,

I'm almost new to ceph and I'm understanding, day by day, why the 
official support is so expensive :)



I'm setting up a ceph nfs network cluster whose recipe can be found here 
below.


###

--> cluster creation
cephadm bootstrap --mon-ip 10.20.20.81 --cluster-network 10.20.20.0/24 \
  --fsid $FSID --initial-dashboard-user adm \
  --initial-dashboard-password 'Hi_guys' --dashboard-password-noupdate \
  --allow-fqdn-hostname --ssl-dashboard-port 443 \
  --dashboard-crt /etc/ssl/wildcard.it/wildcard.it.crt \
  --dashboard-key /etc/ssl/wildcard.it/wildcard.it.key \
  --allow-overwrite --cleanup-on-failure
cephadm shell --fsid $FSID -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring

cephadm add-repo --release reef && cephadm install ceph-common
--> adding hosts and set labels
for IP in $(grep ceph /etc/hosts | awk '{print $1}') ; do ssh-copy-id -f -i /etc/ceph/ceph.pub root@$IP ; done
ceph orch host add cephstage01 10.20.20.81 --labels _admin,mon,mgr,prometheus,grafana
ceph orch host add cephstage02 10.20.20.82 --labels _admin,mon,mgr,prometheus,grafana
ceph orch host add cephstage03 10.20.20.83 --labels _admin,mon,mgr,prometheus,grafana
ceph orch host add cephstagedatanode01 10.20.20.84 --labels osd,nfs,prometheus
ceph orch host add cephstagedatanode02 10.20.20.85 --labels osd,nfs,prometheus
ceph orch host add cephstagedatanode03 10.20.20.86 --labels osd,nfs,prometheus

--> network setup and daemons deploy
ceph config set mon public_network 10.20.20.0/24,192.168.7.0/24
ceph orch apply mon \
  --placement="cephstage01:10.20.20.81,cephstage02:10.20.20.82,cephstage03:10.20.20.83"
ceph orch apply mgr \
  --placement="cephstage01:10.20.20.81,cephstage02:10.20.20.82,cephstage03:10.20.20.83"
ceph orch apply prometheus \
  --placement="cephstage01:10.20.20.81,cephstage02:10.20.20.82,cephstage03:10.20.20.83,cephstagedatanode01:10.20.20.84,cephstagedatanode02:10.20.20.85,cephstagedatanode03:10.20.20.86"
ceph orch apply grafana \
  --placement="cephstage01:10.20.20.81,cephstage02:10.20.20.82,cephstage03:10.20.20.83,cephstagedatanode01:10.20.20.84,cephstagedatanode02:10.20.20.85,cephstagedatanode03:10.20.20.86"

ceph orch apply node-exporter
ceph orch apply alertmanager
ceph config set mgr mgr/cephadm/secure_monitoring_stack true
--> disks and osd setup
for IP in $(grep cephstagedatanode /etc/hosts | awk '{print $1}') ; do ssh root@$IP "hostname && wipefs -a -f /dev/sdb && wipefs -a -f /dev/sdc" ; done

ceph config set mgr mgr/cephadm/device_enhanced_scan true
for IP in $(grep cephstagedatanode /etc/hosts | awk '{print $1}') ; do ceph orch device ls --hostname=$IP --wide --refresh ; done
for IP in $(grep cephstagedatanode /etc/hosts | awk '{print $1}') ; do ceph orch device zap $IP /dev/sdb ; done
for IP in $(grep cephstagedatanode /etc/hosts | awk '{print $1}') ; do ceph orch device zap $IP /dev/sdc ; done
for IP in $(grep cephstagedatanode /etc/hosts | awk '{print $1}') ; do ceph orch daemon add osd $IP:/dev/sdb ; done
for IP in $(grep cephstagedatanode /etc/hosts | awk '{print $1}') ; do ceph orch daemon add osd $IP:/dev/sdc ; done
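The host-selection loops above can be dry-run without a cluster. A small self-contained sketch, in which a temporary file stands in for /etc/hosts and echo stands in for the real orchestrator call:

```shell
# Demo of the grep-over-/etc/hosts loop pattern used in the recipe.
hosts_file=$(mktemp)
printf '10.20.20.84 cephstagedatanode01\n10.20.20.85 cephstagedatanode02\n10.20.20.86 cephstagedatanode03\n' > "$hosts_file"
cmds=""
for IP in $(grep cephstagedatanode "$hosts_file" | awk '{print $1}') ; do
    cmds="$cmds $IP:/dev/sdb"
    echo "would run: ceph orch daemon add osd $IP:/dev/sdb"
done
rm -f "$hosts_file"
```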

--> ganesha nfs cluster
ceph mgr module enable nfs
ceph fs volume create vol1
ceph nfs cluster create nfs-cephfs "cephstagedatanode01,cephstagedatanode02,cephstagedatanode03" --ingress --virtual-ip 192.168.7.80 --ingress-mode default
ceph nfs export create cephfs --cluster-id nfs-cephfs --pseudo-path /mnt --fsname vol1

--> nfs mount
mount -t nfs -o nfsvers=4.1,proto=tcp 192.168.7.80:/mnt /mnt/ceph


is my recipe correct?


The cluster is set up with 3 mon/mgr nodes and 3 osd/nfs nodes; in each of 
the latter I installed one 3 TB SSD for the data and one 300 GB SSD for the 
journaling.


My problems are:

- Although I can mount the export I can't write on it

- I can't understand how to use the sdc disks for journaling

- I can't understand the concept of "pseudo path"


here below you can find the json output of the exports

--> check
ceph nfs export ls nfs-cephfs
ceph nfs export info nfs-cephfs /mnt

json file
-
{
  "export_id": 1,
  "path": "/",
  "cluster_id": "nfs-cephfs",
  "pseudo": "/mnt",
  "access_type": "RW",
  "squash": "none",
  "security_label": true,
  "protocols": [
    4
  ],
  "transports": [
    "TCP"
  ],
  "fsal": {
    "name": "CEPH",
    "user_id": "nfs.nfs-cephfs.1",
    "fs_name": "vol1"
  },
  "clients": []
}



Thanks in advance

Rob



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Latest Doco Out Of Date?

2024-04-24 Thread Eugen Block

Hi,

I fully agree that there should be a smoother way to update client  
caps. But regarding the misleading terms, the docs do mention:



> This is because the command fs authorize becomes ambiguous


So they are aware of the current state, but I don't know if there's  
any work in progress to improve the client authorization.


Thanks,
Eugen

Zitat von Frank Schilder :



[ceph-users] Re: ceph-users Digest, Vol 118, Issue 85

2024-04-24 Thread duluxoz

Hi Eugen,

Thank you for a viable solution to our underlying issue - I'll attempt 
to implement it shortly.  :-)


However, with all the respect in the world, I believe you are incorrect when 
you say the doco is correct (but I will be more than happy to be proven 
wrong).  :-)


The relevant text (extracted from the document page (the last couple of 
paragraphs)) says:


~~~

If a client already has a capability for file-system name a and path 
dir1, running fs authorize again for FS name a but path dir2, 
instead of modifying the capabilities client already holds, a new cap 
for dir2 will be granted:


ceph fs authorize a client.x /dir1 rw
ceph auth get client.x

[client.x]
key = AQC1tyVknMt+JxAAp0pVnbZGbSr/nJrmkMNKqA==
caps mds = "allow rw fsname=a path=/dir1"
caps mon = "allow r fsname=a"
caps osd = "allow rw tag cephfs data=a"

ceph fs authorize a client.x /dir2 rw

updated caps for client.x

ceph auth get client.x

[client.x]
key = AQC1tyVknMt+JxAAp0pVnbZGbSr/nJrmkMNKqA==
caps mds = "allow rw fsname=a path=dir1, allow rw fsname=a path=dir2"
caps mon = "allow r fsname=a"
caps osd = "allow rw tag cephfs data=a"

~~~

The above *seems* to me to say (as per the 2nd `ceph auth get client.x` 
example) that a 2nd directory (dir2) *will* be added to the `client.x` 
authorisation.


HOWEVER, this does not work in practice - hence my original query.

This is what we originally attempted to do (word for word, only 
substituting our CephFS name for "a") and we got the error in the 
original post.


So if the doco says that something can be done *and* gives a working 
example, but an end-user (admin) cannot achieve the same results but 
gets an error instead when following the exact same commands, then 
either the doco is incorrect *or* there is something else wrong.


BUT your statement ("running 'ceph fs authorize' will overwrite the 
existing caps, it will not add more caps to the client") is in direct 
contradiction to the documentation ("If a client already has a 
capability for file-system name a and path dir1, running fs 
authorize again for FS name a but path dir2, instead of modifying 
the capabilities client already holds, a new cap for dir2 will be 
granted").


So there's some sort of "disconnect" there.  :-)

Cheers


On 24/04/2024 17:33, ceph-users-requ...@ceph.io wrote:

Send ceph-users mailing list submissions to
ceph-users@ceph.io

To subscribe or unsubscribe via email, send a message with subject or
body 'help' to
ceph-users-requ...@ceph.io

You can reach the person managing the list at
ceph-users-ow...@ceph.io

When replying, please edit your Subject line so it is more specific
than "Re: Contents of ceph-users digest..."

Today's Topics:

1. Re: Latest Doco Out Of Date? (Eugen Block)
2. Re: stretched cluster new pool and second pool with nvme
   (Eugen Block)
3. Re: Latest Doco Out Of Date? (Frank Schilder)


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Reconstructing an OSD server when the boot OS is corrupted

2024-04-24 Thread Peter van Heusden
Dear Ceph Community

We have 5 OSD servers running Ceph v15.2.17. The host operating system is
Ubuntu 20.04.

One of the servers has suffered corruption to its boot operating system.
Using a system rescue disk it is possible to mount the root filesystem but
it is not possible to boot the operating system at the moment.

The OSDs are configured with (spinning disk) data drives, WALs and DBs on
partitions of SSDs, but from my examination of the filesystem the
configuration in /var/lib/ceph appears to be corrupted.

So my question is: what is the best option for repair going forward? Is it
possible to do a clean install of the operating system and scan the
existing drives in order to reconstruct the OSD configuration?

Thank you,
Peter
P.S. the cause of the original corruption is likely due to an unplanned
power outage, an event that hopefully will not recur.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Reconstructing an OSD server when the boot OS is corrupted

2024-04-24 Thread Nico Schottelius


Hey Peter,

the /var/lib/ceph directories mainly contain "meta data" that, depending
on the ceph version and osd setup, can even reside on tmpfs by
default.

Even if the data was on-disk, they are easy to recreate:


[root@rook-ceph-osd-36-6876cdb479-4764r ceph-36]# ls -l
total 28
lrwxrwxrwx 1 ceph ceph  8 Feb  7 12:12 block -> /dev/sde
-rw--- 1 ceph ceph 37 Feb  7 12:12 ceph_fsid
-rw--- 1 ceph ceph 37 Feb  7 12:12 fsid
-rw--- 1 ceph ceph 56 Feb  7 12:12 keyring
-rw--- 1 ceph ceph  6 Feb  7 12:12 ready
-rw--- 1 ceph ceph  3 Feb  7 12:12 require_osd_release
-rw--- 1 ceph ceph 10 Feb  7 12:12 type
-rw--- 1 ceph ceph  3 Feb  7 12:12 whoami
[root@rook-ceph-osd-36-6876cdb479-4764r ceph-36]# 


We used to create OSDs manually on alpine linux some years ago using
[0], you can check it out as an inspiration for what should be in which
file.

BR,

Nico


[0] 
https://code.ungleich.ch/ungleich-public/ungleich-tools/src/branch/master/ceph/ceph-osd-create-start-alpine

Peter van Heusden  writes:

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Reconstructing an OSD server when the boot OS is corrupted

2024-04-24 Thread Bailey Allison
Hey Peter,

A simple 'ceph-volume lvm activate' should get all of the OSDs back up and
running once you install the proper packages, restore the ceph config
file, etc.

If the node was also a mon/mgr you can simply re-add those services.

Regards,

Bailey

> -Original Message-
> From: Peter van Heusden 
> Sent: April 24, 2024 8:24 AM
> To: ceph-users@ceph.io
> Subject: [ceph-users] Reconstructing an OSD server when the boot OS is
> corrupted
> 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Reconstructing an OSD server when the boot OS is corrupted

2024-04-24 Thread Eugen Block
In addition to Nico's response: three years ago I wrote a blog post  
[1] about that topic, maybe that can help as well. It might be a bit  
outdated; what it definitely doesn't contain is this command from the  
docs [2], to be run once the server has been re-added to the host list:


ceph cephadm osd activate 

Regards,
Eugen

[1]  
https://heiterbiswolkig.blogs.nde.ag/2021/02/08/cephadm-reusing-osds-on-reinstalled-server/
[2]  
https://docs.ceph.com/en/latest/cephadm/services/osd/#activate-existing-osds
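For a cephadm-managed host, the sequence from [1] and [2] amounts to roughly the following sketch. 'osd-host-1' is a placeholder hostname, and the commands need a live cluster, so adjust and verify before running:

```shell
# 1. Reinstall the OS, then re-establish cephadm's SSH access to the host
#    ('osd-host-1' is a placeholder):
ssh-copy-id -f -i /etc/ceph/ceph.pub root@osd-host-1
# 2. Re-add the host to the orchestrator inventory:
ceph orch host add osd-host-1
# 3. Let cephadm detect the existing OSD LVs and redeploy the daemons:
ceph cephadm osd activate osd-host-1
```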


Zitat von Nico Schottelius :





___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph recipe for nfs exports

2024-04-24 Thread Adam King
> - Although I can mount the export I can't write on it

What error are you getting when trying to do the write? The way you set things
up doesn't look too different from one of our integration tests for ingress
over nfs
(https://github.com/ceph/ceph/blob/main/qa/suites/orch/cephadm/smoke-roleless/2-services/nfs-ingress.yaml),
and that test does a simple read/write to the export after
creating/mounting it.


> - I can't understand how to use the sdc disks for journaling

You should be able to specify a `journal_devices` section in an OSD spec.
For example:

service_type: osd
service_id: foo
placement:
  hosts:
  - vm-00
spec:
  data_devices:
    paths:
    - /dev/vdb
  journal_devices:
    paths:
    - /dev/vdc

that will make non-colocated OSDs where the devices from the
journal_devices section are used as journal devices for the OSDs on the
devices in the data_devices section. I'd recommend looking through
https://docs.ceph.com/en/latest/cephadm/services/osd/#advanced-osd-service-specifications
to see if there are any filtering options other than the path that can be
used first. The path a device gets can change on reboot, and you could end
up with cephadm using a device you don't want it to, because that device
now holds the path another device held previously.
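Following that advice about not filtering by path, here is a sketch of a spec that filters by device size instead. This is an assumption based on Roberto's 3 TB data / 300 GB journal disks; the service_id and the size bounds are made up and should be adjusted to the actual hardware:

```yaml
service_type: osd
service_id: osd_size_filtered
placement:
  label: osd
spec:
  data_devices:
    size: '1TB:'     # devices of 1 TB and larger become data devices
  journal_devices:
    size: ':500GB'   # devices up to 500 GB become journal devices
```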

> - I can't understand the concept of "pseudo path"

I don't know the low-level details either, but it seems to just be the path
nfs-ganesha will present to the client. There is another argument to `ceph
nfs export create`, just "path" rather than "pseudo-path", that marks
the actual path within the cephfs that the export serves. It's optional
and defaults to "/" (so the export you made serves the root of the
fs). I think that's the one that really matters; the pseudo-path seems to
just act as a client-facing name for the export.
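To illustrate the distinction, a hedged sketch using the cluster and volume names from the recipe quoted below (the directory names are made up):

```shell
# Serve the cephfs subdirectory /data/projects (the "path"), but present it
# to NFS clients under /projects (the "pseudo-path"):
ceph nfs export create cephfs --cluster-id nfs-cephfs \
    --pseudo-path /projects --path /data/projects --fsname vol1

# Clients then mount the pseudo-path, not the backing cephfs path:
mount -t nfs -o nfsvers=4.1,proto=tcp 192.168.7.80:/projects /mnt/projects
```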

On Wed, Apr 24, 2024 at 3:40 AM Roberto Maggi @ Debian 
wrote:

> Hi you all,
>
> I'm almost new to ceph and I'm understanding, day by day, why the
> official support is so expensive :)
>
>
> I'm setting up a ceph nfs network cluster whose recipe can be found here
> below.
>
> ###
>
> --> cluster creation cephadm bootstrap --mon-ip 10.20.20.81
> --cluster-network 10.20.20.0/24 --fsid $FSID --initial-dashboard-user adm
> \
> --initial-dashboard-password 'Hi_guys' --dashboard-password-noupdate
> --allow-fqdn-hostname --ssl-dashboard-port 443 \
> --dashboard-crt /etc/ssl/wildcard.it/wildcard.it.crt --dashboard-key
> /etc/ssl/wildcard.it/wildcard.it.key \
> --allow-overwrite --cleanup-on-failure
> cephadm shell --fsid $FSID -c /etc/ceph/ceph.conf -k
> /etc/ceph/ceph.client.admin.keyring
> cephadm add-repo --release reef && cephadm install ceph-common
> --> adding hosts and set labels
> for IP in $(grep ceph /etc/hosts | awk '{print $1}') ; do ssh-copy-id -f
> -i /etc/ceph/ceph.pub root@$IP ; done
> ceph orch host add cephstage01 10.20.20.81 --labels
> _admin,mon,mgr,prometheus,grafana
> ceph orch host add cephstage02 10.20.20.82 --labels
> _admin,mon,mgr,prometheus,grafana
> ceph orch host add cephstage03 10.20.20.83 --labels
> _admin,mon,mgr,prometheus,grafana
> ceph orch host add cephstagedatanode01 10.20.20.84 --labels
> osd,nfs,prometheus
> ceph orch host add cephstagedatanode02 10.20.20.85 --labels
> osd,nfs,prometheus
> ceph orch host add cephstagedatanode03 10.20.20.86 --labels
> osd,nfs,prometheus
> --> network setup and daemons deploy
> ceph config set mon public_network 10.20.20.0/24,192.168.7.0/24
> ceph orch apply mon
>
> --placement="cephstage01:10.20.20.81,cephstage02:10.20.20.82,cephstage03:10.20.20.83"
> ceph orch apply mgr
>
> --placement="cephstage01:10.20.20.81,cephstage02:10.20.20.82,cephstage03:10.20.20.83"
> ceph orch apply prometheus
>
> --placement="cephstage01:10.20.20.81,cephstage02:10.20.20.82,cephstage03:10.20.20.83,cephstagedatanode01:10.20.20.84,cephstagedatanode02:10.20.20.85,cephstagedatanode03:10.20.20.86"
> ceph orch apply grafana
>
> --placement="cephstage01:10.20.20.81,cephstage02:10.20.20.82,cephstage03:10.20.20.83,cephstagedatanode01:10.20.20.84,cephstagedatanode02:10.20.20.85,cephstagedatanode03:10.20.20.86"
> ceph orch apply node-exporter
> ceph orch apply alertmanager
> ceph config set mgr mgr/cephadm/secure_monitoring_stack true
> --> disks and osd setup
> for IP in $(grep cephstagedatanode /etc/hosts | awk '{print $1}') ; do
> ssh root@$IP "hostname && wipefs -a -f /dev/sdb && wipefs -a -f
> /dev/sdc"; done
> ceph config set mgr mgr/cephadm/device_enhanced_scan true
> for IP in $(grep cephstagedatanode /etc/hosts | awk '{print $1}') ;
> do ceph orch device ls --hostname=$IP --wide --refresh ; done
> for IP in $(grep cephstagedatanode /etc/hosts | awk '{print $1}') ;
> do ceph orch device zap $IP /dev/sdb ; done
> for IP in $(grep cephstagedatanode /etc/hosts | awk '{print $1}') ;
> do ceph orch device zap $IP /dev/sdc ; done
> for IP in $(grep cep

[ceph-users] Re: Orchestrator not automating services / OSD issue

2024-04-24 Thread Michael Baer

Thanks Frédéric,

Going through your steps helped me narrow down the issue. Oddly, it
looks to be a network issue with the new host. Most things connect okay
(ssh, ping), but when the data stream gets too big, the connections just
hang. And it seems to be host-specific, as the other storage hosts are
all functioning fine.

Removing that machine from the cluster also seems to solve the
orchestrator services problems. Not sure why that was jamming it up, but
Ceph is working normally and I can focus on tracking down network errors
instead.

Thanks again!
-Mike

> On Wed, 24 Apr 2024 08:46:14 +0200 (CEST), Frédéric Nass 
>  said:

FN> Hello Michael,
FN> You can try this:

FN> 1/ check that the host shows up in ceph orch host ls with the right label 
'osds'
FN> 2/ check that the host is OK with ceph cephadm check-host <host>. 
It should look like:
FN> <host> (None) ok
FN> podman (/usr/bin/podman) version 4.6.1 is present
FN> systemctl is present
FN> lvcreate is present
FN> Unit chronyd.service is enabled and running
FN> Hostname "<host>" matches what is expected.
FN> Host looks OK
FN> 3/ double check you service_type 'osd' with ceph orch ls --service-type 
osd --export
FN> It should show the correct placement and spec (drives size, etc.)
FN> 4/ enable debugging with ceph config set mgr 
mgr/cephadm/log_to_cluster_level debug
FN> 5/ open a terminal and observe ceph -W cephadm --watch-debug
FN> 6/ ceph mgr fail
FN> 7/ ceph orch device ls --hostname=<host> --wide --refresh (should
FN> show local block devices as Available and trigger the creation of the
FN> OSDs)

FN> If your service_type 'osd' is correct, the orchestrator should deploy 
OSDs on the node.
FN> If it does not then look for the reason why in ceph -W cephadm 
--watch-debug output.

FN> Regards,
FN> Frédéric.

FN> - On 24 Apr 24, at 3:22, Michael Baer c...@mikesoffice.com wrote:

>> Hi,
>> 
>> This problem started with trying to add a new storage server into a
>> quincy v17.2.6 ceph cluster. Whatever I did, I could not add the drives
>> on the new host as OSDs: via dashboard, via cephadm shell, by setting
>> osd unmanaged to false.
>> 
>> But what I started realizing is that the orchestrator will also no longer
>> automatically manage services. I.e. if a service is set to be managed by
>> labels, removing and adding labels to different hosts for that service
>> has no effect. Same if I set a service to be managed via hostnames. Same
>> if I try to drain a host (the services/podman containers just keep
>> running). I am able to add/rm services via 'cephadm shell ceph
>> orch daemon add/rm', but Ceph will not manage them automatically using
>> labels/hostnames.
>> 
>> This apparently includes OSD daemons. I cannot create one on the new
>> host either automatically or manually, but I'm hoping the services/OSD
>> issues are related and not two separate issues.
>> 
>> I haven't been able to find any obvious errors in /var/log/ceph,
>> /var/log/syslog, logs, etc. I have been able to get 'slow
>> ops' errors on monitors by trying to add OSDs manually (and having to
>> restart the monitor). I've also gotten cephadm shell to hang. And had to
>> restart managers. I'm not an expert and it could be something obvious,
>> but I haven't been able to figure out a solution. If anyone has any
>> suggestions, I would greatly appreciate them.
>> 
>> Thanks,
>> Mike
>> 
>> --
>> Michael Baer
>> c...@mikesoffice.com
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io


-- 
Michael Baer
c...@mikesoffice.com
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Slow/blocked reads and writes

2024-04-24 Thread Fábio Sato
Hello all,

I am trying to troubleshoot a Ceph cluster (version 18.2.2) where users are
reporting slow and blocked reads and writes.

When running "ceph status" I am seeing many warnings about its health state:

cluster:
id: cc881230-e0dd-11ee-aa9e-37c4e4e5e14b
health: HEALTH_WARN
6 clients failing to respond to capability release
2 clients failing to advance oldest client/flush tid
1 MDSs report slow requests
1 MDSs behind on trimming
Too many repaired reads on 11 OSDs
Degraded data redundancy: 2 pgs degraded
105 pgs not deep-scrubbed in time
109 pgs not scrubbed in time
1 mgr modules have recently crashed
12 slow ops, oldest one blocked for 97678 sec, daemons
[osd.11,osd.12,osd.15,osd.16,osd.19,osd.20,osd.28,osd.3,osd.32,osd.34]...
have slow ops.

  services:
mon: 3 daemons, quorum file03-xx,file04-xx,file05-xx (age 17h)
mgr: file03-xx.xx(active, since 2w), standbys: file04-xx.xx
mds: 1/1 daemons up, 1 standby
osd: 44 osds: 44 up (since 17h), 44 in (since 39h); 492 remapped pgs

  data:
volumes: 1/1 healthy
pools:   3 pools, 2065 pgs
objects: 66.44M objects, 140 TiB
usage:   281 TiB used, 304 TiB / 586 TiB avail
pgs: 16511162/134215883 objects misplaced (12.302%)
 1508 active+clean
 487  active+remapped+backfill_wait
 53   active+clean+scrubbing+deep
 8active+clean+scrubbing
 5active+remapped+backfilling
 2active+recovering+degraded+repair
 2active+recovering+repair

  io:
recovery: 47 MiB/s, 37 objects/s

When checking the output of `ceph -w` I am flooded with crc error messages
like the examples below:

2024-04-24T19:15:40.430334+ osd.32 [ERR] 3.566 full-object read crc
0xa5da7fa != expected 0x on 3:66a8d8f5:::10001c72400.0007:head
2024-04-24T19:15:40.430507+ osd.39 [ERR] 3.270 full-object read crc
0xa1bc3a1e != expected 0x on 3:0e44aa2f:::1000265a625.0003:head
2024-04-24T19:15:40.494249+ osd.28 [ERR] 3.469 full-object read crc
0x8e757c06 != expected 0x on 3:962852f4:::10001c72c2d.0001:head
2024-04-24T19:15:40.529771+ osd.32 [ERR] 3.566 full-object read crc
0xa5da7fa != expected 0x on 3:66a8d8f5:::10001c72400.0007:head
2024-04-24T19:15:40.582128+ osd.19 [ERR] 3.4b full-object read crc
0x9222aec != expected 0x on 3:d20bddca:::100026fdb01.0006:head
2024-04-24T19:15:40.583350+ osd.19 [ERR] 3.4b full-object read crc
0x9222aec != expected 0x on 3:d20bddca:::100026fdb01.0006:head
2024-04-24T19:15:40.662945+ osd.28 [ERR] 3.469 full-object read crc
0x8e757c06 != expected 0x on 3:962852f4:::10001c72c2d.0001:head
2024-04-24T19:15:40.698197+ osd.19 [ERR] 3.4b full-object read crc
0x9222aec != expected 0x on 3:d20bddca:::100026fdb01.0006:head
2024-04-24T19:15:40.699389+ osd.19 [ERR] 3.4b full-object read crc
0x9222aec != expected 0x on 3:d20bddca:::100026fdb01.0006:head
2024-04-24T19:15:40.769191+ osd.28 [ERR] 3.469 full-object read crc
0x8e757c06 != expected 0x on 3:962852f4:::10001c72c2d.0001:head
2024-04-24T19:15:40.834344+ osd.19 [ERR] 3.4b full-object read crc
0x9222aec != expected 0x on 3:d20bddca:::100026fdb01.0006:head
2024-04-24T19:15:40.835513+ osd.19 [ERR] 3.4b full-object read crc
0x9222aec != expected 0x on 3:d20bddca:::100026fdb01.0006:head

I suspect this is the main issue affecting the cluster's health state and
performance, so I am trying to address it first.

The "expected 0x" crc seems like a bug to me, and I found an open
ticket (https://tracker.ceph.com/issues/53240) with similar error messages,
but I am not sure it is related to my case.

Could someone point me to the steps to solve these errors?

Cheers,
-- 
Fabio
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Recoveries without any misplaced objects?

2024-04-24 Thread David Orman
Did you ever figure out what was happening here?

David

On Mon, May 29, 2023, at 07:16, Hector Martin wrote:
> On 29/05/2023 20.55, Anthony D'Atri wrote:
>> Check the uptime for the OSDs in question
>
> I restarted all my OSDs within the past 10 days or so. Maybe OSD
> restarts are somehow breaking these stats?
>
>> 
>>> On May 29, 2023, at 6:44 AM, Hector Martin  wrote:
>>>
>>> Hi,
>>>
>>> I'm watching a cluster finish a bunch of backfilling, and I noticed that
>>> quite often PGs end up with zero misplaced objects, even though they are
>>> still backfilling.
>>>
>>> Right now the cluster is down to 6 backfilling PGs:
>>>
>>>  data:
>>>volumes: 1/1 healthy
>>>pools:   6 pools, 268 pgs
>>>objects: 18.79M objects, 29 TiB
>>>usage:   49 TiB used, 25 TiB / 75 TiB avail
>>>pgs: 262 active+clean
>>> 6   active+remapped+backfilling
>>>
>>> But there are no misplaced objects, and the misplaced column in `ceph pg
>>> dump` is zero for all PGs.
>>>
>>> If I do a `ceph pg dump_json`, I can see `num_objects_recovered`
>>> increasing for these PGs... but the misplaced count is still 0.
>>>
>>> Is there something else that would cause recoveries/backfills other than
>>> misplaced objects? Or perhaps there is a bug somewhere causing the
>>> misplaced object count to be misreported as 0 sometimes?
>>>
>>> # ceph -v
>>> ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy
>>> (stable)
>>>
>>> - Hector
>>> ___
>>> ceph-users mailing list -- ceph-users@ceph.io
>>> To unsubscribe send an email to ceph-users-le...@ceph.io
>> 
>> 
>
> - Hector
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [EXTERN] cache pressure?

2024-04-24 Thread Dietmar Rieder

Hi Erich,

in our case the "client failing to respond to cache pressure" situation 
is/was often caused by users who have vscode connecting via ssh to our 
HPC head node. vscode makes heavy use of file watchers and we have seen 
users with > 400k watchers. All these watched files must be held in the 
MDS cache, and if you have multiple users running vscode at the same time 
it gets problematic.


Unfortunately there is no global setting - at least none that we are 
aware of - for vscode to exclude certain files or directories from being 
watched. We asked the users to configure their vscode (Remote Settings 
-> Watcher Exclude) as follows:


{
  "files.watcherExclude": {
    "**/.git/objects/**": true,
    "**/.git/subtree-cache/**": true,
    "**/node_modules/*/**": true,
    "**/.cache/**": true,
    "**/.conda/**": true,
    "**/.local/**": true,
    "**/.nextflow/**": true,
    "**/work/**": true
  }
}

~/.vscode-server/data/Machine/settings.json

To monitor and find processes with watchers you may use inotify-info.
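If installing inotify-info is not an option, a rough equivalent can be sketched by scanning /proc directly (Linux only; assumes sufficient privileges to read other users' fdinfo files):

```shell
# Sum inotify watches per PID: each "inotify wd:" line in an fdinfo file
# corresponds to one watch held by that file descriptor.
for f in /proc/[0-9]*/fdinfo/*; do
  n=$(grep -c '^inotify wd:' "$f" 2>/dev/null) || continue
  pid=${f#/proc/}; pid=${pid%%/*}
  echo "$n $pid"
done | awk '{sum[$2] += $1} END {for (p in sum) print sum[p], p}' | sort -rn | head
```

Processes near the top with six-figure watch counts are the likely cache-pressure culprits. Note the `|| continue` also skips descriptors with zero watches, since grep -c exits non-zero when nothing matches.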


HTH
  Dietmar

On 4/23/24 15:47, Erich Weiler wrote:
So I'm trying to figure out ways to reduce the number of warnings I'm 
getting and I'm thinking about the one "client failing to respond to 
cache pressure".


Is there maybe a way to tell a client (or all clients) to reduce the 
amount of cache it uses or to release caches quickly?  Like, all the time?


I know the linux kernel (and maybe ceph) likes to cache everything for a 
while, and rightfully so, but I suspect in my use case it may be more 
efficient to more quickly purge the cache or to in general just cache 
way less overall...?


We have many thousands of threads all doing different things that are 
hitting our filesystem, so I suspect the caching isn't really doing me 
much good anyway due to the churn, and probably is causing more problems 
than it helping...


-erich
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io





___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Recoveries without any misplaced objects?

2024-04-24 Thread Anthony D'Atri
Do you see *keys* aka omap traffic?  Especially if you have RGW set up?

> On Apr 24, 2024, at 15:37, David Orman  wrote:
> 
> Did you ever figure out what was happening here?
> 
> David
> 
> On Mon, May 29, 2023, at 07:16, Hector Martin wrote:
>> On 29/05/2023 20.55, Anthony D'Atri wrote:
>>> Check the uptime for the OSDs in question
>> 
>> I restarted all my OSDs within the past 10 days or so. Maybe OSD
>> restarts are somehow breaking these stats?
>> 
>>> 
 On May 29, 2023, at 6:44 AM, Hector Martin  wrote:
 
 Hi,
 
 I'm watching a cluster finish a bunch of backfilling, and I noticed that
 quite often PGs end up with zero misplaced objects, even though they are
 still backfilling.
 
 Right now the cluster is down to 6 backfilling PGs:
 
 data:
   volumes: 1/1 healthy
   pools:   6 pools, 268 pgs
   objects: 18.79M objects, 29 TiB
   usage:   49 TiB used, 25 TiB / 75 TiB avail
   pgs: 262 active+clean
6   active+remapped+backfilling
 
 But there are no misplaced objects, and the misplaced column in `ceph pg
 dump` is zero for all PGs.
 
 If I do a `ceph pg dump_json`, I can see `num_objects_recovered`
 increasing for these PGs... but the misplaced count is still 0.
 
 Is there something else that would cause recoveries/backfills other than
 misplaced objects? Or perhaps there is a bug somewhere causing the
 misplaced object count to be misreported as 0 sometimes?
 
 # ceph -v
 ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy
 (stable)
 
 - Hector
 ___
 ceph-users mailing list -- ceph-users@ceph.io
 To unsubscribe send an email to ceph-users-le...@ceph.io
>>> 
>>> 
>> 
>> - Hector
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Recoveries without any misplaced objects?

2024-04-24 Thread David Orman
It is RGW, but the index is on a different pool. Not seeing any key/s being 
reported in recovery. We've definitely had OSDs flap multiple times.

David

On Wed, Apr 24, 2024, at 16:48, Anthony D'Atri wrote:
> Do you see *keys* aka omap traffic?  Especially if you have RGW set up?
>
>> On Apr 24, 2024, at 15:37, David Orman  wrote:
>> 
>> Did you ever figure out what was happening here?
>> 
>> David
>> 
>> On Mon, May 29, 2023, at 07:16, Hector Martin wrote:
>>> On 29/05/2023 20.55, Anthony D'Atri wrote:
 Check the uptime for the OSDs in question
>>> 
>>> I restarted all my OSDs within the past 10 days or so. Maybe OSD
>>> restarts are somehow breaking these stats?
>>> 
 
> On May 29, 2023, at 6:44 AM, Hector Martin  wrote:
> 
> Hi,
> 
> I'm watching a cluster finish a bunch of backfilling, and I noticed that
> quite often PGs end up with zero misplaced objects, even though they are
> still backfilling.
> 
> Right now the cluster is down to 6 backfilling PGs:
> 
> data:
>   volumes: 1/1 healthy
>   pools:   6 pools, 268 pgs
>   objects: 18.79M objects, 29 TiB
>   usage:   49 TiB used, 25 TiB / 75 TiB avail
>   pgs: 262 active+clean
>6   active+remapped+backfilling
> 
> But there are no misplaced objects, and the misplaced column in `ceph pg
> dump` is zero for all PGs.
> 
> If I do a `ceph pg dump_json`, I can see `num_objects_recovered`
> increasing for these PGs... but the misplaced count is still 0.
> 
> Is there something else that would cause recoveries/backfills other than
> misplaced objects? Or perhaps there is a bug somewhere causing the
> misplaced object count to be misreported as 0 sometimes?
> 
> # ceph -v
> ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy
> (stable)
> 
> - Hector
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
 
 
>>> 
>>> - Hector
>>> ___
>>> ceph-users mailing list -- ceph-users@ceph.io
>>> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Remove an OSD with hardware issue caused rgw 503

2024-04-24 Thread Mary Zhang
Hi,

We recently removed an OSD from our Ceph cluster. Its underlying disk has
a hardware issue.

We use command: ceph orch osd rm osd_id --zap

During the process, the ceph cluster sometimes enters a warning state with
slow ops on this OSD. Our rgw also failed to respond to requests and
returned 503.

We restarted the rgw daemon to make it work again, but the same failure
occurred from time to time. Eventually we noticed that the rgw 503 errors
are a result of the OSD's slow ops.

Our cluster has 18 hosts and 210 OSDs. We expected that removing an OSD
with a hardware issue wouldn't impact cluster performance and rgw
availability. Is that expectation reasonable? What's the best way to handle
OSDs with hardware failures?

Thank you in advance for any comments or suggestions.

Best Regards,
Mary Zhang
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph recipe for nfs exports

2024-04-24 Thread Ceph . io
Wow, you made it farther than I did.  I got it installed, added hosts, then 
NOTHING.  It showed there were physical disks on the hosts but wouldn't create 
the OSDs.  Command was accepted, but NOTHING happened.  No output, no error, no 
NOTHING.  I fought with it for over a week and finally gave up, as with no 
feedback as to what the issue is it's impossible to troubleshoot.  A product 
that does NOTHING isn't a product at all.  I posted a detailed message here 
with screenshots, steps, everything somebody would need to reproduce my 
situation.  The post got blocked because it was too big and sent for 
moderation.  It never got approved or rejected.  So I moved on, can't be using 
something that does NOTHING with no way to proceed past that point.

From: Roberto Maggi @ Debian 
Sent: April 24, 2024 01:39
To: Ceph Users 
Subject: [ceph-users] ceph recipe for nfs exports


Hi you all,

I'm almost new to ceph and I'm understanding, day by day, why the
official support is so expensive :)


I'm setting up a ceph nfs network cluster whose recipe can be found here
below.

###

--> cluster creation cephadm bootstrap --mon-ip 10.20.20.81
--cluster-network 10.20.20.0/24 --fsid $FSID --initial-dashboard-user adm \
--initial-dashboard-password 'Hi_guys' --dashboard-password-noupdate
--allow-fqdn-hostname --ssl-dashboard-port 443 \
--dashboard-crt /etc/ssl/wildcard.it/wildcard.it.crt --dashboard-key
/etc/ssl/wildcard.it/wildcard.it.key \
--allow-overwrite --cleanup-on-failure
cephadm shell --fsid $FSID -c /etc/ceph/ceph.conf -k
/etc/ceph/ceph.client.admin.keyring
cephadm add-repo --release reef && cephadm install ceph-common
--> adding hosts and set labels
for IP in $(grep ceph /etc/hosts | awk '{print $1}') ; do ssh-copy-id -f
-i /etc/ceph/ceph.pub root@$IP ; done
ceph orch host add cephstage01 10.20.20.81 --labels
_admin,mon,mgr,prometheus,grafana
ceph orch host add cephstage02 10.20.20.82 --labels
_admin,mon,mgr,prometheus,grafana
ceph orch host add cephstage03 10.20.20.83 --labels
_admin,mon,mgr,prometheus,grafana
ceph orch host add cephstagedatanode01 10.20.20.84 --labels
osd,nfs,prometheus
ceph orch host add cephstagedatanode02 10.20.20.85 --labels
osd,nfs,prometheus
ceph orch host add cephstagedatanode03 10.20.20.86 --labels
osd,nfs,prometheus
--> network setup and daemons deploy
ceph config set mon public_network 10.20.20.0/24,192.168.7.0/24
ceph orch apply mon
--placement="cephstage01:10.20.20.81,cephstage02:10.20.20.82,cephstage03:10.20.20.83"
ceph orch apply mgr
--placement="cephstage01:10.20.20.81,cephstage02:10.20.20.82,cephstage03:10.20.20.83"
ceph orch apply prometheus
--placement="cephstage01:10.20.20.81,cephstage02:10.20.20.82,cephstage03:10.20.20.83,cephstagedatanode01:10.20.20.84,cephstagedatanode02:10.20.20.85,cephstagedatanode03:10.20.20.86"
ceph orch apply grafana
--placement="cephstage01:10.20.20.81,cephstage02:10.20.20.82,cephstage03:10.20.20.83,cephstagedatanode01:10.20.20.84,cephstagedatanode02:10.20.20.85,cephstagedatanode03:10.20.20.86"
ceph orch apply node-exporter
ceph orch apply alertmanager
ceph config set mgr mgr/cephadm/secure_monitoring_stack true
--> disks and osd setup
for IP in $(grep cephstagedatanode /etc/hosts | awk '{print $1}') ; do
ssh root@$IP "hostname && wipefs -a -f /dev/sdb && wipefs -a -f
/dev/sdc"; done
ceph config set mgr mgr/cephadm/device_enhanced_scan true
for IP in $(grep cephstagedatanode /etc/hosts | awk '{print $1}') ;
do ceph orch device ls --hostname=$IP --wide --refresh ; done
for IP in $(grep cephstagedatanode /etc/hosts | awk '{print $1}') ;
do ceph orch device zap $IP /dev/sdb ; done
for IP in $(grep cephstagedatanode /etc/hosts | awk '{print $1}') ;
do ceph orch device zap $IP /dev/sdc ; done
for IP in $(grep cephstagedatanode /etc/hosts | awk '{print $1}') ;
do ceph orch daemon add osd $IP:/dev/sdb ; done
for IP in $(grep cephstagedatanode /etc/hosts | awk '{print $1}') ;
do ceph orch daemon add osd $IP:/dev/sdc ; done
--> ganesha nfs cluster
ceph mgr module enable nfs
ceph fs volume create vol1
ceph nfs cluster create nfs-cephfs
"cephstagedatanode01,cephstagedatanode02,cephstagedatanode03" --ingress
--virtual-ip 192.168.7.80 --ingress-mode default
ceph nfs export create cephfs --cluster-id nfs-cephfs --pseudo-path /mnt
--fsname vol1
--> nfs mount
mount -t nfs -o nfsvers=4.1,proto=tcp 192.168.7.80:/mnt /mnt/ceph


is my recipe correct?


the cluster is set up by 3 mon/mgr nodes and 3 osd/nfs nodes, on the
latters I installed one 3tb ssd, for the data, and one 300gb ssd for the
journaling but

my problems are :

- Although I can mount the export I can't write on it

- I can't understand how to use the sdc disks for journaling

- I can't understand the concept of "pseudo path"


here below you can find the json output of the exports

--> check
ceph nfs export ls nfs-cephfs
ceph nfs expo