[ceph-users] Re: Reef: RGW Multisite object fetch limits

2024-05-15 Thread Jayanth Reddy
Hello Community,
In addition, we have 3+ Gbps links and the average object size is about 200
kilobytes, so the utilization is roughly 300 Mbps to ~1.8 Gbps and no more
than that.
We sometimes saturate the link when the secondary zone fetches larger objects,
but the rate always seems to stay at 1k to 1.5k objects per second.

Regards,
Jayanth

On Thu, May 16, 2024 at 11:05 AM Jayanth Reddy 
wrote:

> Hello Community,
> We have two zones on Reef (v18.2.1) and are trying to sync over 2 billion RGW
> objects to a freshly added secondary zone. Each zone has 2 RGW daemons
> (behind a load balancer) dedicated to multisite sync, while the other RGW
> daemons don't run sync threads. The strange thing is that during the sync,
> the master zone LB records only 1k to 1.5k HTTP requests per second from the
> secondary zone RGWs and never more than that. This behaviour leaves us with
> low link utilization, and I don't see any noticeable issues with the link,
> the clusters, or the RGW daemons on either end.
> We've already increased rgw_data_sync_spawn_window,
> rgw_bucket_sync_spawn_window, and rgw_meta_sync_spawn_window beyond their
> defaults, but that doesn't seem to improve things.
>
> Has anyone noticed this behaviour of secondary zone RGW daemons fetching
> only around 1k to 1.5k objects per second?
>
> Regards,
> Jayanth
>
>
>


[ceph-users] Please discuss about Slow Peering

2024-05-15 Thread 서민우
Env:
- OS: Ubuntu 20.04
- Ceph Version: Octopus 15.0.0.1
- OSD Disk: 2.9TB NVMe
- BlockStorage (Replication 3)

Symptom:
- Peering when an OSD node comes back up is very slow. Peering speed varies
from PG to PG, and some PGs can take as long as 10 seconds, yet there is no
log output during those 10 seconds.
- I checked the effect on client VMs: slow MySQL queries occur at the same
time.

Below are the Ceph OSD logs for both the best and the worst case.

Best Peering Case (0.5 Seconds)
2024-04-11T15:32:44.693+0900 7f108b522700  1 osd.7 pg_epoch: 27368 pg[6.8]
state: transitioning to Primary
2024-04-11T15:32:45.165+0900 7f108f52a700  1 osd.7 pg_epoch: 27371 pg[6.8]
state: Peering, affected_by_map, going to Reset
2024-04-11T15:32:45.165+0900 7f108f52a700  1 osd.7 pg_epoch: 27371 pg[6.8]
start_peering_interval up [7,6,11] -> [6,11], acting [7,6,11] -> [6,11],
acting_primary 7 -> 6, up_primary 7 -> 6, role 0 -> -1, features acting
2024-04-11T15:32:45.165+0900 7f108f52a700  1 osd.7 pg_epoch: 27377 pg[6.8]
state: transitioning to Primary
2024-04-11T15:32:45.165+0900 7f108f52a700  1 osd.7 pg_epoch: 27377 pg[6.8]
start_peering_interval up [6,11] -> [7,6,11], acting [6,11] -> [7,6,11],
acting_primary 6 -> 7, up_primary 6 -> 7, role -1 -> 0, features acting

Worst Peering Case (11.6 Seconds)
2024-04-11T15:32:45.169+0900 7f108b522700  1 osd.7 pg_epoch: 27377 pg[30.20]
state: transitioning to Stray
2024-04-11T15:32:45.169+0900 7f108b522700  1 osd.7 pg_epoch: 27377 pg[30.20]
start_peering_interval up [0,1] -> [0,7,1], acting [0,1] -> [0,7,1],
acting_primary 0 -> 0, up_primary 0 -> 0, role -1 -> 1, features acting
2024-04-11T15:32:46.173+0900 7f108b522700  1 osd.7 pg_epoch: 27378 pg[30.20]
state: transitioning to Stray
2024-04-11T15:32:46.173+0900 7f108b522700  1 osd.7 pg_epoch: 27378 pg[30.20]
start_peering_interval up [0,7,1] -> [0,7,1], acting [0,7,1] -> [0,1],
acting_primary 0 -> 0, up_primary 0 -> 0, role 1 -> -1, features acting
2024-04-11T15:32:57.794+0900 7f108b522700  1 osd.7 pg_epoch: 27390 pg[30.20]
state: transitioning to Stray
2024-04-11T15:32:57.794+0900 7f108b522700  1 osd.7 pg_epoch: 27390 pg[30.20]
start_peering_interval up [0,7,1] -> [0,7,1], acting [0,1] -> [0,7,1],
acting_primary 0 -> 0, up_primary 0 -> 0, role -1 -> 1, features acting

*I wish to know:*
- Why some PGs take 10 seconds until peering finishes.
- Why the Ceph log is quiet during peering.
- Whether this behaviour is expected in Ceph.

*And please give some advice:*
- Is there any way to improve peering speed?
- Or is there a way to keep peering from affecting clients?

P.S.
- I have observed the same symptom in the following environments:
-> Octopus and Reef, deployed with both cephadm and ceph-ansible
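
For anyone wanting to dig into where the time goes, commands along these lines
may help (a sketch only; the PG and OSD ids are taken from the logs above):

ceph pg 30.20 query | jq '.recovery_state'   # per-state enter_time shows where peering stalls
ceph daemon osd.7 dump_historic_slow_ops     # slow ops the OSD recorded around that window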


[ceph-users] Reef: RGW Multisite object fetch limits

2024-05-15 Thread Jayanth Reddy
Hello Community,
We have two zones on Reef (v18.2.1) and are trying to sync over 2 billion RGW
objects to a freshly added secondary zone. Each zone has 2 RGW daemons
(behind a load balancer) dedicated to multisite sync, while the other RGW
daemons don't run sync threads. The strange thing is that during the sync,
the master zone LB records only 1k to 1.5k HTTP requests per second from the
secondary zone RGWs and never more than that. This behaviour leaves us with
low link utilization, and I don't see any noticeable issues with the link,
the clusters, or the RGW daemons on either end.
We've already increased rgw_data_sync_spawn_window,
rgw_bucket_sync_spawn_window, and rgw_meta_sync_spawn_window beyond their
defaults, but that doesn't seem to improve things.
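
For concreteness, the kind of change involved looks like this (values are
illustrative, not the ones we used; the client.rgw config target is an
assumption and depends on how your RGW daemons are named):

ceph config set client.rgw rgw_data_sync_spawn_window 200
ceph config set client.rgw rgw_bucket_sync_spawn_window 200
ceph config set client.rgw rgw_meta_sync_spawn_window 200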

Has anyone noticed this behaviour of secondary zone RGW daemons fetching
only around 1k to 1.5k objects per second?

Regards,
Jayanth


[ceph-users] Reminder: User + Dev Monthly Meetup rescheduled to May 23rd

2024-05-15 Thread Laura Flores
Hi all,

Those of you subscribed to the Ceph Community Calendar may have already
gotten an update that the meeting was rescheduled, but I wanted to send a
reminder here as well. The meeting will be held at the usual time a week
from now, on May 23rd.

If you haven't already, please take the new survey we put out! Results will
be discussed at the meeting next week:
https://www.meetup.com/ceph-user-group/events/300883526/

Thanks,
Laura

Meeting Details + Survey link here:
https://www.meetup.com/ceph-user-group/events/300883526/

-- 

Laura Flores

She/Her/Hers

Software Engineer, Ceph Storage 

Chicago, IL

lflo...@ibm.com | lflo...@redhat.com 
M: +17087388804


[ceph-users] Re: ceph dashboard reef 18.2.2 radosgw

2024-05-15 Thread Christopher Durham
Pierre,
This is indeed the problem. I modified the line in

/usr/share/ceph/mgr/dashboard/controllers/rgw.py

'port': int(re.findall(r'port=(\d+)', metadata['frontend_config#0'])[0])

to just be:

'port': 443

and everything works.
I see in the pull request that if it cannot find port=, or a port in one of
endpoint or ssl_endpoint, the port gets set to None. So even though RGW itself
will listen on port 443 if I use ssl_endpoint without a port, I will probably
have to set it explicitly for the dashboard to work.
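
For anyone hitting this before the fix is released, a more defensive version of
that parse might look like the sketch below. This is only an illustration, not
the code from the pull request; the helper name and the fallback to port 443
are assumptions.

import re

def rgw_frontend_port(metadata: dict, default_ssl_port: int = 443) -> int:
    # Sketch only (not the upstream fix): pull the listening port out of an
    # RGW daemon's beast frontend string instead of assuming 'port=' exists.
    frontend = metadata.get('frontend_config#0', '')
    # Explicit port= (this also matches ssl_port=443).
    m = re.search(r'port=(\d+)', frontend)
    if m:
        return int(m.group(1))
    # Port embedded in endpoint=host:port or ssl_endpoint=host:port.
    m = re.search(r'(?:ssl_endpoint|endpoint)=\S*?:(\d+)', frontend)
    if m:
        return int(m.group(1))
    # No port anywhere in the frontend string: assume the TLS default.
    return default_ssl_port

# e.g. 'beast ssl_endpoint=0.0.0.0 ssl_certificate=/path/to/pem' -> 443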


-Chris


On Sunday, May 12, 2024 at 10:03:59 PM MDT, Pierre Riteau 
 wrote:   

 Hi Christopher,

I think your issue may be fixed by https://github.com/ceph/ceph/pull/54764,
which should be included in the next Reef release.
In the meantime, you should be able to update your RGW configuration to
include port=80. You will need to restart every RGW daemon so that all the
metadata is updated.
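
As an illustration only (the exact frontend string depends on your setup and is
not verified against every release), an explicit port in the beast frontend
gives the dashboard's 'port=' regex something to find while keeping TLS on 443:

rgw frontends = beast ssl_port=443 ssl_certificate=/path/to/pem_with_cert_and_key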

Best wishes,
Pierre Riteau

On Wed, 8 May 2024 at 19:51, Christopher Durham  wrote:

> Hello,
> I am using 18.2.2 on Rocky Linux 8.
>
> I am getting HTTP error 500 when trying to open any of the radosgw pages in
> the Ceph dashboard on Reef 18.2.2.
> I tracked this down to /usr/share/ceph/mgr/dashboard/controllers/rgw.py.
> It appears to parse the metadata for a given radosgw server improperly. In
> my various rgw ceph.conf entries, I have:
> rgw frontends = beast ssl_endpoint=0.0.0.0
> ssl_certificate=/path/to/pem_with_cert_and_key
> but rgw.py pulls the metadata for each server and looks for 'port=' in it.
> When it doesn't find it, based on line 147 in rgw.py, ceph-mgr logs an
> exception, which the manager proper catches and returns as a 500.
> Would changing my frontends definition work? Is this known? I have had this
> frontends definition for a while, prior to my Reef upgrade. Thanks
> -Chris


[ceph-users] Re: Write issues on CephFS mounted with root_squash

2024-05-15 Thread Fabien Sirjean

Hi,

We have the same issue. It seems to come from this bug:
https://access.redhat.com/solutions/6982902


We had to disable root_squash, which of course is a huge issue...

Cheers,

Fabien


On 5/15/24 12:54, Nicola Mori wrote:

Dear Ceph users,

I'm trying to export a CephFS with the root_squash option. This is the 
client configuration:


client.wizardfs_rootsquash
    key: 
    caps: [mds] allow rw fsname=wizardfs root_squash
    caps: [mon] allow r fsname=wizardfs
    caps: [osd] allow rw tag cephfs data=wizardfs

I can mount it flawlessly on several machines using the kernel driver,
but when a machine writes to it, the content looks fine from the writing
machine yet is not actually written to disk, since other machines just
see an empty file:


[12:43 mori@stryke ~]$ echo test > /wizard/ceph/software/el9/test
[12:43 mori@stryke ~]$ ll /wizard/ceph/software/el9/test
-rw-r--r-- 1 mori wizard 5 mag 15 12:43 /wizard/ceph/software/el9/test
[12:43 mori@stryke ~]$ cat /wizard/ceph/software/el9/test
test
[12:43 mori@stryke ~]$

[mori@fili ~]$ ll /wizard/ceph/software/el9/test
-rw-r--r--. 1 mori 1014 0 May 15 06:43 /wizard/ceph/software/el9/test
[mori@fili ~]$ cat /wizard/ceph/software/el9/test
[mori@fili ~]$

After unmounting and remounting on "stryke", the file is seen as empty,
so I guess the content shown just after the write is only a cache effect
and nothing is actually written to disk. I checked the POSIX permissions
on the folder and I have rw rights from both machines.


All of the above is with Ceph 18.2.2 on the cluster (deployed with
cephadm) and on both machines. Machine "fili" has kernel 5.14.0 while
"stryke" has 6.8.9. The same issue happens consistently in the reverse
direction (writing from "fili" and reading from "stryke") and with other
machines as well.

Removing the root_squash option, the problem vanishes.

I don't know what might




[ceph-users] Re: Write issues on CephFS mounted with root_squash

2024-05-15 Thread Nicola Mori
Thank you Bailey, I'll give it a try ASAP. By the way, is this issue
with the kernel driver something that will be fixed at some point? If
I'm correct, the kernel driver has better performance than FUSE, so I'd
like to use it.

Cheers,

Nicola




[ceph-users] Re: Write issues on CephFS mounted with root_squash

2024-05-15 Thread Bailey Allison
Hey Nicola,

Try mounting CephFS with FUSE instead of the kernel client; we have seen before
that the kernel mount sometimes does not properly support that option while the
FUSE mount does.
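
For example, a FUSE mount for this setup might look roughly like the following
(a sketch; the client name, filesystem name, and mountpoint are taken from your
message and may need adjusting):

ceph-fuse -n client.wizardfs_rootsquash --client_fs wizardfs /wizard/ceph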

Regards,

Bailey

> -Original Message-
> From: Nicola Mori 
> Sent: May 15, 2024 7:55 AM
> To: ceph-users 
> Subject: [ceph-users] Write issues on CephFS mounted with root_squash
> 
> Dear Ceph users,
> 
> I'm trying to export a CephFS with the root_squash option. This is the client
> configuration:
> 
> client.wizardfs_rootsquash
>  key: 
>  caps: [mds] allow rw fsname=wizardfs root_squash
>  caps: [mon] allow r fsname=wizardfs
>  caps: [osd] allow rw tag cephfs data=wizardfs
> 
> I can mount it flawlessly on several machines using the kernel driver, but
> when a machine writes to it, the content looks fine from the writing machine
> yet is not actually written to disk, since other machines just see an empty
> file:
> 
> [12:43 mori@stryke ~]$ echo test > /wizard/ceph/software/el9/test
> [12:43 mori@stryke ~]$ ll /wizard/ceph/software/el9/test
> -rw-r--r-- 1 mori wizard 5 mag 15 12:43 /wizard/ceph/software/el9/test
> [12:43 mori@stryke ~]$ cat /wizard/ceph/software/el9/test
> test
> [12:43 mori@stryke ~]$
>
> [mori@fili ~]$ ll /wizard/ceph/software/el9/test
> -rw-r--r--. 1 mori 1014 0 May 15 06:43 /wizard/ceph/software/el9/test
> [mori@fili ~]$ cat /wizard/ceph/software/el9/test
> [mori@fili ~]$
> 
> After unmounting and remounting on "stryke", the file is seen as empty, so I
> guess the content shown just after the write is only a cache effect and
> nothing is actually written to disk. I checked the POSIX permissions on the
> folder and I have rw rights from both machines.
> 
> All of the above is with Ceph 18.2.2 on the cluster (deployed with
> cephadm) and on both machines. Machine "fili" has kernel 5.14.0 while
> "stryke" has 6.8.9. The same issue happens consistently in the reverse
> direction (writing from "fili" and reading from "stryke") and with other
> machines as well.
>
> Removing the root_squash option, the problem vanishes.
> 
> I don't know what might



[ceph-users] Write issues on CephFS mounted with root_squash

2024-05-15 Thread Nicola Mori

Dear Ceph users,

I'm trying to export a CephFS with the root_squash option. This is the 
client configuration:


client.wizardfs_rootsquash
key: 
caps: [mds] allow rw fsname=wizardfs root_squash
caps: [mon] allow r fsname=wizardfs
caps: [osd] allow rw tag cephfs data=wizardfs
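
For reference, caps of this shape can be generated with fs authorize; a sketch
using the names above (how the client was actually created may differ):

ceph fs authorize wizardfs client.wizardfs_rootsquash / rw root_squash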

I can mount it flawlessly on several machines using the kernel driver,
but when a machine writes to it, the content looks fine from the writing
machine yet is not actually written to disk, since other machines just
see an empty file:


[12:43 mori@stryke ~]$ echo test > /wizard/ceph/software/el9/test
[12:43 mori@stryke ~]$ ll /wizard/ceph/software/el9/test
-rw-r--r-- 1 mori wizard 5 mag 15 12:43 /wizard/ceph/software/el9/test
[12:43 mori@stryke ~]$ cat /wizard/ceph/software/el9/test
test
[12:43 mori@stryke ~]$

[mori@fili ~]$ ll /wizard/ceph/software/el9/test
-rw-r--r--. 1 mori 1014 0 May 15 06:43 /wizard/ceph/software/el9/test
[mori@fili ~]$ cat /wizard/ceph/software/el9/test
[mori@fili ~]$

After unmounting and remounting on "stryke", the file is seen as empty,
so I guess the content shown just after the write is only a cache effect
and nothing is actually written to disk. I checked the POSIX permissions
on the folder and I have rw rights from both machines.


All of the above is with Ceph 18.2.2 on the cluster (deployed with
cephadm) and on both machines. Machine "fili" has kernel 5.14.0 while
"stryke" has 6.8.9. The same issue happens consistently in the reverse
direction (writing from "fili" and reading from "stryke") and with other
machines as well.

Removing the root_squash option, the problem vanishes.

I don't know what might





[ceph-users] ceph tell mds.0 dirfrag split - syntax of the "frag" argument

2024-05-15 Thread Alexander E. Patrakov
Hello,

In the context of https://tracker.ceph.com/issues/64298, I decided to
do something manually. In the help output of "ceph tell" for an MDS, I
found these possibly useful commands:

dirfrag ls : List fragments in directory
dirfrag merge  : De-fragment directory by path
dirfrag split   : Fragment directory by path

They accept the "frag" argument that is underdocumented. In the
testsuite, they are used, and it seems like this argument accepts some
notation containing a slash, which is also produced as "str" by
"dirfrag ls".

Can anyone explain the meaning of the parts before and after the
slash? What is the relation between the accepted values for "dirfrag
split" and the output of "dirfrag ls" - do I just feed the fragment
from "dirfrag ls" to "dirfrag split" as-is? Is running "dirfrag split"
manually safe on a production cluster?
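
For concreteness, the shape I have seen in the QA suite appears to be roughly
the following, though I have not verified it (path, fragment, and bit count are
placeholders):

ceph tell mds.0 dirfrag ls /some/dir
ceph tell mds.0 dirfrag split /some/dir 0/0 1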

Thanks in advance.

-- 
Alexander Patrakov