[ceph-users] Re: Unexpected behavior of directory mtime after being set explicitly

2023-06-05 Thread Xiubo Li
I've raised a PR to fix this; please see 
https://github.com/ceph/ceph/pull/51931.


Thanks

- Xiubo

On 5/24/23 23:52, Sandip Divekar wrote:

Hi Team,

I'm writing to bring to your attention an issue we have encountered with the 
"mtime" (modification time) behavior for directories in the Ceph filesystem.

We have noticed that when the mtime of a directory (say, dir1) is explicitly 
changed in CephFS, subsequently adding files or directories within 'dir1' 
fails to update the directory's mtime as expected.

This behavior appears to be specific to CephFS - we have reproduced this issue 
on both Quincy and Pacific.  Similar steps work as expected in the ext4 
filesystem amongst others.

Reproduction steps:
1. Create a directory - mkdir dir1
2. Modify mtime using the touch command - touch dir1
3. Create a file or directory inside of 'dir1' - mkdir dir1/dir2
Expected result:
mtime for dir1 should change to the time the file or directory was created in 
step 3
Actual result:
there was no change to the mtime for 'dir1'
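
For reference, here is roughly how we reproduce it on a CephFS mount (the mount 
point below is just an example):

cd /mnt/cephfs             # any CephFS mount point
mkdir dir1
stat -c '%y  %n' dir1      # record the initial mtime
touch dir1                 # explicitly set the directory mtime
mkdir dir1/dir2            # on ext4 this bumps dir1's mtime; on CephFS it does not
stat -c '%y  %n' dir1      # compare with the value recorded above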

Note: For more detail, kindly see the attached logs.

Our queries are:
1. Is this expected behavior for CephFS?
2. If so, can you explain why the directory's behavior is inconsistent depending 
on whether its mtime has previously been updated manually?


Best Regards,
   Sandip Divekar
Component QA Lead SDET.


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



[ceph-users] How to show used size of specific storage class in Radosgw?

2023-06-05 Thread Huy Nguyen
Hi,
I'm not able to find information about the used size of a storage class in any of:
- bucket stats
- usage show
- user stats ...
Does Radosgw support it? Thanks
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] RGW: bucket notification issue with Kafka

2023-06-05 Thread Huy Nguyen
Hi,
In Ceph Radosgw 15.2.17, I get this issue when trying to create a push endpoint 
to Kafka

Here is push endpoint configuration:
endpoint_args = 'push-endpoint=kafka://abcef:123456@kafka.endpoint:9093&use-ssl=true&ca-location=/etc/ssl/certs/ca.crt'
attributes = {nvp[0]: nvp[1] for nvp in urllib.parse.parse_qsl(endpoint_args, keep_blank_values=True)}
response = snsclient.create_topic(Name=topic_name, Attributes=attributes)

When I put an object, the radosgw log shows this:
Kafka connect: failed to create producer: ssl.ca.location failed: 
crypto/x509/by_file.c:199: error:0B084002:x509 certificate 
routines:X509_load_cert_crl_file:system lib:

I have checked my ca.crt file and it is definitely in X.509 format. If I use RGW 
v16.2.13, the producer is created successfully.
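
One quick way to double-check that the file is what librdkafka's ca-location 
option expects (a PEM-encoded X.509 CA certificate):

openssl x509 -in /etc/ssl/certs/ca.crt -noout -subject -dates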

Anyone have any idea? Thanks
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Updating the Grafana SSL certificate in Quincy

2023-06-05 Thread Thorne Lawler

Hi everyone!

I have a containerised (cephadm-built) 17.2.6 cluster where I have 
installed a custom commercial SSL certificate under the dashboard.


Before I upgraded from 17.2.0 to 17.2.6, I successfully installed the 
custom SSL cert everywhere, including Grafana, but since the upgrade I 
find that I can't update the certificate for Grafana. I have tried 
many commands like the following:


ceph config-key set mgr/cephadm/grafana_crt -i /etc/pki/tls/certs/_.quick.net.au_2024.pem
ceph orch reconfig grafana
ceph dashboard set-grafana-frontend-api-url https://san.quick.net.au:3000
restorecon /etc/pki/tls/certs/_.quick.net.au_2024.pem
ceph orch reconfig grafana
ceph dashboard set-grafana-frontend-api-url https://san.quick.net.au:3000
ceph dashboard set-grafana-frontend-url https://san.quick.net.au:3000
ceph dashboard grafana
ceph dashboard grafana dashboards update
ceph orch reconfig grafana
ceph config-key set mgr/cephadm/grafana_crt -i /etc/pki/tls/certs/_.quick.net.au_2024.pem
ceph orch redeploy grafana
ceph config set mgr mgr/dashboard/GRAFANA_API_URL https://san.quick.net.au:3000


...but to no avail. The Grafana frames within the dashboard continue to use 
the self-signed certificate.
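
For reference, the sequence from the cephadm documentation that I believe should 
apply is roughly the following (the key path is a placeholder here; the point 
being that the key is stored alongside the certificate):

ceph config-key set mgr/cephadm/grafana_crt -i /etc/pki/tls/certs/_.quick.net.au_2024.pem
ceph config-key set mgr/cephadm/grafana_key -i /etc/pki/tls/private/_.quick.net.au_2024.key
ceph orch reconfig grafana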


Have the commands for updating this changed between 17.2.0 and 17.2.6?

Thank you.

--

Regards,

Thorne Lawler - Senior System Administrator
*DDNS* | ABN 76 088 607 265
First registrar certified ISO 27001-2013 Data Security Standard ITGOV40172
P +61 499 449 170


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: The pg_num from 1024 reduce to 32 spend much time, is there way to shorten the time?

2023-06-05 Thread Janne Johansson
If you can stop the RGWs, you can make a new pool with 32 PGs, rados cppool
this one over to the new one, then rename them so the new pool ends up with
the right name (and application), and start the RGWs again.
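
Roughly, and untested (make sure the new pool's replication settings match the
original):

# stop all RGW daemons first
ceph osd pool create .rgw.buckets.index.new 32 32
rados cppool .rgw.buckets.index .rgw.buckets.index.new
ceph osd pool rename .rgw.buckets.index .rgw.buckets.index.old
ceph osd pool rename .rgw.buckets.index.new .rgw.buckets.index
ceph osd pool application enable .rgw.buckets.index rgw
# start the RGWs again, and delete the .old pool once everything checks out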

Den mån 5 juni 2023 kl 16:43 skrev Louis Koo :
>
> ceph version is 16.2.13;
>
> The pg_num is 1024, and the target_pg_num is 32; there is no data in the 
> ".rgw.buckets.index" pool, but it spends much time reducing the pg num.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io



-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: PGs stuck undersized and not scrubbed

2023-06-05 Thread Nicola Mori

Dear Wes,

thank you for your suggestion! I restarted OSDs 57 and 79 and the 
recovery operations restarted as well. In the logs I found that a kernel 
issue was raised for both of them, but they were not in an error state. 
They probably got stuck because of this.

Thanks again for your help,

Nicola


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: PGs stuck undersized and not scrubbed

2023-06-05 Thread Wesley Dillingham
When PGs are degraded they won't scrub; further, if an OSD is involved with
recovery of another PG it won't accept scrubs either, so that is the likely
explanation of your not-scrubbed-in-time issue. It's of low concern.

Are you sure that recovery is not progressing? I see "7349/147534197
objects degraded"; can you check that again (maybe wait an hour) and see
whether 7,349 has been reduced.

Another thing I'm noticing is that OSD 57 and 79 are the primary for many
of the PGs which are degraded. They could use a service restart, for example
as sketched below.
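
A sketch, assuming a cephadm-managed cluster (on package-based deployments,
use systemctl restart ceph-osd@57 / ceph-osd@79 on the respective hosts instead):

ceph orch daemon restart osd.57
ceph orch daemon restart osd.79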

Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 


On Mon, Jun 5, 2023 at 12:01 PM Nicola Mori  wrote:

> Dear Ceph users,
>
> after an outage and recovery of one machine I have several PGs stuck in
> active+recovering+undersized+degraded+remapped. Furthermore, many PGs
> have not been (deep-)scrubbed in time. See below for status and health
> details.
> It's been like this for two days, with no recovery I/O being reported,
> so I guess something is stuck in a bad state. I'd need some help in
> understanding what's going on here and how to fix it.
> Thanks,
>
> Nicola
>
> -
>
> # ceph -s
>cluster:
>  id: b1029256-7bb3-11ec-a8ce-ac1f6b627b45
>  health: HEALTH_WARN
>  2 OSD(s) have spurious read errors
>  Degraded data redundancy: 7349/147534197 objects degraded
> (0.005%), 22 pgs degraded, 22 pgs undersized
>  332 pgs not deep-scrubbed in time
>  503 pgs not scrubbed in time
>  (muted: OSD_SLOW_PING_TIME_BACK OSD_SLOW_PING_TIME_FRONT)
>
>services:
>  mon: 5 daemons, quorum bofur,balin,aka,romolo,dwalin (age 2d)
>  mgr: bofur.tklnrn(active, since 32h), standbys: balin.hvunfe,
> aka.wzystq
>  mds: 2/2 daemons up, 1 standby
>  osd: 104 osds: 104 up (since 37h), 104 in (since 37h); 22 remapped pgs
>
>data:
>  volumes: 1/1 healthy
>  pools:   3 pools, 529 pgs
>  objects: 18.53M objects, 40 TiB
>  usage:   54 TiB used, 142 TiB / 196 TiB avail
>  pgs: 7349/147534197 objects degraded (0.005%)
>   2715/147534197 objects misplaced (0.002%)
>   507 active+clean
>   20  active+recovering+undersized+degraded+remapped
>   2   active+recovery_wait+undersized+degraded+remapped
>
> # ceph health detail
> [WRN] PG_DEGRADED: Degraded data redundancy: 7349/147534197 objects
> degraded (0.005%), 22 pgs degraded, 22 pgs undersized
>  pg 3.2c is stuck undersized for 37h, current state
> active+recovery_wait+undersized+degraded+remapped, last acting
> [79,83,34,37,65,NONE,18,95]
>  pg 3.57 is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [57,99,37,NONE,15,104,55,40]
>  pg 3.76 is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [57,5,37,15,100,33,85,NONE]
>  pg 3.9c is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [57,86,88,NONE,11,69,20,10]
>  pg 3.106 is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [79,15,89,NONE,36,32,23,64]
>  pg 3.107 is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [79,NONE,64,20,61,92,104,43]
>  pg 3.10c is stuck undersized for 37h, current state
> active+recovery_wait+undersized+degraded+remapped, last acting
> [79,34,NONE,95,104,16,69,18]
>  pg 3.11e is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [79,89,64,46,32,NONE,40,15]
>  pg 3.14e is stuck undersized for 37h, current state
> active+recovering+undersized+degraded+remapped, last acting
> [57,34,69,97,85,NONE,46,62]
>  pg 3.160 is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [57,1,101,84,18,33,NONE,69]
>  pg 3.16a is stuck undersized for 37h, current state
> active+recovering+undersized+degraded+remapped, last acting
> [57,16,59,103,13,38,49,NONE]
>  pg 3.16e is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [57,0,27,96,55,10,81,NONE]
>  pg 3.170 is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [NONE,57,14,46,55,99,15,40]
>  pg 3.19b is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [NONE,79,59,8,32,17,7,90]
>  pg 3.1a0 is stuck undersized for 2d, current state
> active+recovering+undersized+degraded+remapped, last acting
> [NONE,79,26,50,104,24,97,40]
>  pg 3.1a5 is stuck undersized for 37h, current state
> active+recovering+undersized+degraded+remapped, last acting
> [57,100,61,27,20,NONE,24,85]
>  pg 3.1a8 is stuck undersized for 2d, current state
> act

[ceph-users] Re: CephFS metadata pool grows by two orders of magnitude while trimming (?) snapshots

2023-06-05 Thread Janek Bevendorff
That said, our MON store size has also been growing slowly from 900MB to 
5.4GB. But we also have a few remapped PGs right now. Not sure if that 
would have an influence.



On 05/06/2023 17:48, Janek Bevendorff wrote:

Hi Patrick, hi Dan!

I got the MDS back and I think the issue is connected to the "newly 
corrupt dentry" bug [1]. Even though I couldn't see any particular 
reason for the SIGABRT at first, I then noticed one of these awfully 
familiar stack traces.


I rescheduled the two broken MDS ranks on two machines with 1.5TB RAM 
each (just to make sure it's not that) and then let them do their 
thing. The routine goes as follows: both replay the journal, then rank 
4 goes into the "resolve" state, but as soon as rank 3 also starts 
resolving, they both crash.


Then I set

ceph config set mds mds_abort_on_newly_corrupt_dentry false
ceph config set mds mds_go_bad_corrupt_dentry false

and this time I was able to recover the ranks, even though "resolve" 
and "clientreplay" took forever. I uploaded a compressed log of rank 3 
using ceph-post-file [2]. It's a log of several crash cycles, 
including the final successful attempt after changing the settings. 
The log decompresses to 815MB. I didn't censor any paths and they are 
not super-secret, but please don't share.


While writing this, the metadata pool size has reduced from 6TiB back 
to 440GiB. I am starting to think that the fill-ups may also be 
connected to the corruption issue. I also noticed that the ranks 3 and 
4 always have huge journals. An inspection using cephfs-journal-tool 
takes forever and consumes 50GB of memory in the process. Listing the 
events in the journal is impossible without running out of RAM. Ranks 
0, 1, and 2 don't have this problem and this wasn't a problem for 
ranks 3 and 4 either before the fill-ups started happening.
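
For completeness, this is roughly how I inspect a single rank's journal (the 
file system name here is a placeholder):

cephfs-journal-tool --rank=cephfs:3 journal inspect   # integrity check
cephfs-journal-tool --rank=cephfs:3 header get        # write_pos/expire_pos, i.e. how far behind trimming is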


Hope that helps getting to the bottom of this. I reset the guardrail 
settings in the meantime.


Cheers
Janek


[1] "Newly corrupt dentry" ML link: 
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/JNZ6V5WSYKQTNQPQPLWRBM2GEP2YSCRV/#PKQVZYWZCH7P76Q75D5WD5JEAVWOKJE3


[2] ceph-post-file ID: 7c039483-49fd-468c-ba40-fb10337aa7d6



On 05/06/2023 16:08, Janek Bevendorff wrote:
I just had the problem again that MDS were constantly reporting slow 
metadata IO and the pool was slowly growing. Hence I restarted the 
MDS and now ranks 4 and 5 don't come up again.


Every time, they get to the resolve stage, then crash with a SIGABRT 
without an error message (not even at debug_mds = 20). Any idea what 
the reason could be? I checked whether they have enough RAM, which 
seems to be the case (unless they try to allocate tens of GB in one 
allocation).


Janek


On 31/05/2023 21:57, Janek Bevendorff wrote:

Hi Dan,

Sorry, I meant Pacific. The version number was correct, the name 
wasn’t. ;-)


Yes, I have five active MDS and five hot standbys. Static pinning 
isn’t really an option for our directory structure, so we’re using 
ephemeral pins.


Janek


On 31. May 2023, at 18:44, Dan van der Ster 
 wrote:


Hi Janek,

A few questions and suggestions:
- Do you have multi-active MDS? In my experience back in nautilus if
something went wrong with mds export between mds's, the mds
log/journal could grow unbounded like you observed until that export
work was done. Static pinning could help if you are not using it
already.
- You definitely should disable the pg autoscaling on the mds metadata
pool (and other pools imho) -- decide the correct number of PGs for
your pools and leave it.
- Which version are you running? You said nautilus but wrote 16.2.12
which is pacific... If you're running nautilus v14 then I recommend
disabling pg autoscaling completely -- IIRC it does not have a fix for
the OSD memory growth "pg dup" issue which can occur during PG
splitting/merging.

Cheers, Dan

__
Clyso GmbH | https://www.clyso.com


On Wed, May 31, 2023 at 4:03 AM Janek Bevendorff
 wrote:

I checked our logs from yesterday, the PG scaling only started today,
perhaps triggered by the snapshot trimming. I disabled it, but it 
didn't

change anything.

What did change something was restarting the MDS one by one, which 
had
got far behind with trimming their caches and with a bunch of 
stuck ops.

After restarting them, the pool size decreased quickly to 600GiB. I
noticed the same behaviour yesterday, though yesterday it was more
extreme and restarting the MDS took about an hour and I had to 
increase

the heartbeat timeout. This time, it took only half a minute per MDS,
probably because it wasn't that extreme yet and I had reduced the
maximum cache size. Still looks like a bug to me.


On 31/05/2023 11:18, Janek Bevendorff wrote:

Another thing I just noticed is that the auto-scaler is trying to
scale the pool down to 128 PGs. That could also result in large
fluctuations, but this big?? In any case, it looks like a bug to me.
Whatever is happening here, there should be safeguards with 
regard to

th

[ceph-users] PGs stuck undersized and not scrubbed

2023-06-05 Thread Nicola Mori

Dear Ceph users,

after an outage and recovery of one machine I have several PGs stuck in 
active+recovering+undersized+degraded+remapped. Furthermore, many PGs 
have not been (deep-)scrubbed in time. See below for status and health 
details.
It's been like this for two days, with no recovery I/O being reported, 
so I guess something is stuck in a bad state. I'd need some help in 
understanding what's going on here and how to fix it.

Thanks,

Nicola

-

# ceph -s
  cluster:
id: b1029256-7bb3-11ec-a8ce-ac1f6b627b45
health: HEALTH_WARN
2 OSD(s) have spurious read errors
Degraded data redundancy: 7349/147534197 objects degraded 
(0.005%), 22 pgs degraded, 22 pgs undersized

332 pgs not deep-scrubbed in time
503 pgs not scrubbed in time
(muted: OSD_SLOW_PING_TIME_BACK OSD_SLOW_PING_TIME_FRONT)

  services:
mon: 5 daemons, quorum bofur,balin,aka,romolo,dwalin (age 2d)
mgr: bofur.tklnrn(active, since 32h), standbys: balin.hvunfe, 
aka.wzystq

mds: 2/2 daemons up, 1 standby
osd: 104 osds: 104 up (since 37h), 104 in (since 37h); 22 remapped pgs

  data:
volumes: 1/1 healthy
pools:   3 pools, 529 pgs
objects: 18.53M objects, 40 TiB
usage:   54 TiB used, 142 TiB / 196 TiB avail
pgs: 7349/147534197 objects degraded (0.005%)
 2715/147534197 objects misplaced (0.002%)
 507 active+clean
 20  active+recovering+undersized+degraded+remapped
 2   active+recovery_wait+undersized+degraded+remapped

# ceph health detail
[WRN] PG_DEGRADED: Degraded data redundancy: 7349/147534197 objects 
degraded (0.005%), 22 pgs degraded, 22 pgs undersized
pg 3.2c is stuck undersized for 37h, current state 
active+recovery_wait+undersized+degraded+remapped, last acting 
[79,83,34,37,65,NONE,18,95]
pg 3.57 is stuck undersized for 2d, current state 
active+recovering+undersized+degraded+remapped, last acting 
[57,99,37,NONE,15,104,55,40]
pg 3.76 is stuck undersized for 2d, current state 
active+recovering+undersized+degraded+remapped, last acting 
[57,5,37,15,100,33,85,NONE]
pg 3.9c is stuck undersized for 2d, current state 
active+recovering+undersized+degraded+remapped, last acting 
[57,86,88,NONE,11,69,20,10]
pg 3.106 is stuck undersized for 2d, current state 
active+recovering+undersized+degraded+remapped, last acting 
[79,15,89,NONE,36,32,23,64]
pg 3.107 is stuck undersized for 2d, current state 
active+recovering+undersized+degraded+remapped, last acting 
[79,NONE,64,20,61,92,104,43]
pg 3.10c is stuck undersized for 37h, current state 
active+recovery_wait+undersized+degraded+remapped, last acting 
[79,34,NONE,95,104,16,69,18]
pg 3.11e is stuck undersized for 2d, current state 
active+recovering+undersized+degraded+remapped, last acting 
[79,89,64,46,32,NONE,40,15]
pg 3.14e is stuck undersized for 37h, current state 
active+recovering+undersized+degraded+remapped, last acting 
[57,34,69,97,85,NONE,46,62]
pg 3.160 is stuck undersized for 2d, current state 
active+recovering+undersized+degraded+remapped, last acting 
[57,1,101,84,18,33,NONE,69]
pg 3.16a is stuck undersized for 37h, current state 
active+recovering+undersized+degraded+remapped, last acting 
[57,16,59,103,13,38,49,NONE]
pg 3.16e is stuck undersized for 2d, current state 
active+recovering+undersized+degraded+remapped, last acting 
[57,0,27,96,55,10,81,NONE]
pg 3.170 is stuck undersized for 2d, current state 
active+recovering+undersized+degraded+remapped, last acting 
[NONE,57,14,46,55,99,15,40]
pg 3.19b is stuck undersized for 2d, current state 
active+recovering+undersized+degraded+remapped, last acting 
[NONE,79,59,8,32,17,7,90]
pg 3.1a0 is stuck undersized for 2d, current state 
active+recovering+undersized+degraded+remapped, last acting 
[NONE,79,26,50,104,24,97,40]
pg 3.1a5 is stuck undersized for 37h, current state 
active+recovering+undersized+degraded+remapped, last acting 
[57,100,61,27,20,NONE,24,85]
pg 3.1a8 is stuck undersized for 2d, current state 
active+recovering+undersized+degraded+remapped, last acting 
[79,24,NONE,3,55,40,98,45]
pg 3.1aa is stuck undersized for 2d, current state 
active+recovering+undersized+degraded+remapped, last acting 
[79,91,48,NONE,24,3,8,85]
pg 3.1af is stuck undersized for 2d, current state 
active+recovering+undersized+degraded+remapped, last acting 
[79,NONE,90,33,104,69,26,8]
pg 3.1c1 is stuck undersized for 2d, current state 
active+recovering+undersized+degraded+remapped, last acting 
[79,95,NONE,53,54,27,18,85]
pg 3.1c4 is stuck undersized for 2d, current state 
active+recovering+undersized+degraded+remapped, last acting 
[79,69,56,84,95,8,NONE,4]
pg 3.1d5 is stuck undersized for 37h, current state 
active+recovering+undersized+degraded+remapped, last acting 
[57,48,NONE,104,34,16,37,89]

[WRN] PG_NOT_DEEP_SCRUBBED: 332 pgs not deep-scrubbed in time
pg 3.1f

[ceph-users] Re: CephFS metadata pool grows by two orders of magnitude while trimming (?) snapshots

2023-06-05 Thread Janek Bevendorff

Hi Patrick, hi Dan!

I got the MDS back and I think the issue is connected to the "newly 
corrupt dentry" bug [1]. Even though I couldn't see any particular 
reason for the SIGABRT at first, I then noticed one of these awfully 
familiar stack traces.


I rescheduled the two broken MDS ranks on two machines with 1.5TB RAM 
each (just to make sure it's not that) and then let them do their thing. 
The routine goes as follows: both replay the journal, then rank 4 goes 
into the "resolve" state, but as soon as rank 3 also starts resolving, 
they both crash.


Then I set

ceph config set mds mds_abort_on_newly_corrupt_dentry false
ceph config set mds mds_go_bad_corrupt_dentry false

and this time I was able to recover the ranks, even though "resolve" and 
"clientreplay" took forever. I uploaded a compressed log of rank 3 using 
ceph-post-file [2]. It's a log of several crash cycles, including the 
final successful attempt after changing the settings. The log 
decompresses to 815MB. I didn't censor any paths and they are not 
super-secret, but please don't share.


While writing this, the metadata pool size has reduced from 6TiB back to 
440GiB. I am starting to think that the fill-ups may also be connected 
to the corruption issue. I also noticed that the ranks 3 and 4 always 
have huge journals. An inspection using cephfs-journal-tool takes forever 
and consumes 50GB of memory in the process. Listing the events in the 
journal is impossible without running out of RAM. Ranks 0, 1, and 2 
don't have this problem and this wasn't a problem for ranks 3 and 4 
either before the fill-ups started happening.


Hope that helps getting to the bottom of this. I reset the guardrail 
settings in the meantime.


Cheers
Janek


[1] "Newly corrupt dentry" ML link: 
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/JNZ6V5WSYKQTNQPQPLWRBM2GEP2YSCRV/#PKQVZYWZCH7P76Q75D5WD5JEAVWOKJE3


[2] ceph-post-file ID: 7c039483-49fd-468c-ba40-fb10337aa7d6



On 05/06/2023 16:08, Janek Bevendorff wrote:
I just had the problem again that MDS were constantly reporting slow 
metadata IO and the pool was slowly growing. Hence I restarted the MDS 
and now ranks 4 and 5 don't come up again.


Every time, they get to the resolve stage, then crash with a SIGABRT 
without an error message (not even at debug_mds = 20). Any idea what 
the reason could be? I checked whether they have enough RAM, which 
seems to be the case (unless they try to allocate tens of GB in one 
allocation).


Janek


On 31/05/2023 21:57, Janek Bevendorff wrote:

Hi Dan,

Sorry, I meant Pacific. The version number was correct, the name 
wasn’t. ;-)


Yes, I have five active MDS and five hot standbys. Static pinning 
isn’t really an option for our directory structure, so we’re using 
ephemeral pins.


Janek


On 31. May 2023, at 18:44, Dan van der Ster 
 wrote:


Hi Janek,

A few questions and suggestions:
- Do you have multi-active MDS? In my experience back in nautilus if
something went wrong with mds export between mds's, the mds
log/journal could grow unbounded like you observed until that export
work was done. Static pinning could help if you are not using it
already.
- You definitely should disable the pg autoscaling on the mds metadata
pool (and other pools imho) -- decide the correct number of PGs for
your pools and leave it.
- Which version are you running? You said nautilus but wrote 16.2.12
which is pacific... If you're running nautilus v14 then I recommend
disabling pg autoscaling completely -- IIRC it does not have a fix for
the OSD memory growth "pg dup" issue which can occur during PG
splitting/merging.

Cheers, Dan

__
Clyso GmbH | https://www.clyso.com


On Wed, May 31, 2023 at 4:03 AM Janek Bevendorff
 wrote:

I checked our logs from yesterday, the PG scaling only started today,
perhaps triggered by the snapshot trimming. I disabled it, but it 
didn't

change anything.

What did change something was restarting the MDS one by one, which had
got far behind with trimming their caches and with a bunch of stuck 
ops.

After restarting them, the pool size decreased quickly to 600GiB. I
noticed the same behaviour yesterday, though yesterday it was more
extreme and restarting the MDS took about an hour and I had to 
increase

the heartbeat timeout. This time, it took only half a minute per MDS,
probably because it wasn't that extreme yet and I had reduced the
maximum cache size. Still looks like a bug to me.


On 31/05/2023 11:18, Janek Bevendorff wrote:

Another thing I just noticed is that the auto-scaler is trying to
scale the pool down to 128 PGs. That could also result in large
fluctuations, but this big?? In any case, it looks like a bug to me.
Whatever is happening here, there should be safeguards with regard to
the pool's capacity.

Here's the current state of the pool in ceph osd pool ls detail:

pool 110 'cephfs.storage.meta' replicated size 4 min_size 3 
crush_rule

5 object_hash rjenkins pg_num 495 pgp_num 471 pg_num_target 128

[ceph-users] The pg_num from 1024 reduce to 32 spend much time, is there way to shorten the time?

2023-06-05 Thread Louis Koo
ceph version is 16.2.13;

The pg_num is 1024, and the target_pg_num is 32; there is no data in the 
".rgw.buckets.index" pool, but it spends much time reducing the pg num.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Quincy release -Swift integration with Keystone

2023-06-05 Thread fsbiz
Hi folks,

My Ceph cluster with Quincy and Rocky 9 is up and running,
but I'm having issues with Swift authenticating with Keystone.
I was wondering if I've missed anything in the configuration.
From the debug logs below, it appears that radosgw is still trying to
authenticate with Swift instead of Keystone.
Any pointers will be appreciated.

thanks,

Here is my configuration.

# ceph config dump | grep rgw
client                                          advanced   debug_rgw                      20/20
client                                          advanced   rgw_keystone_accepted_roles    admin,user        *
client                                          advanced   rgw_keystone_admin_domain      Default           *
client                                          advanced   rgw_keystone_admin_password                      *
client                                          advanced   rgw_keystone_admin_project     service           *
client                                          advanced   rgw_keystone_admin_user        ceph-ks-svc       *
client                                          advanced   rgw_keystone_api_version       3
client                                          advanced   rgw_keystone_implicit_tenants  false             *
client                                          advanced   rgw_keystone_token_cache_size  0
client                                          basic      rgw_keystone_url                                 *
client                                          advanced   rgw_s3_auth_use_keystone       true
client                                          advanced   rgw_swift_account_in_url       true
client                                          basic      rgw_thread_pool_size           512
client.rgw.s_rgw.dev-ipp1-u1-control01.ojmddc   basic      rgw_frontends                  beast port=7480   *
client.rgw.s_rgw.dev-ipp1-u1-control02.adnjrx   basic      rgw_frontends                  beast port=7480


Here's the debug log. If I interpret it correctly, it is trying to do a Swift 
authentication and failing.
Am I missing any configuration for Keystone-based authentication?

Jun 03 11:47:03 dev-ipp1-u1-control02 radosgw[2802861]: beast: 0x7fddeb8e7710: 
10.117.53.10 - - [03/Jun/2023:18:47:03.060 +] "GET 
/swift/v1/AUTH_c668ed224e434c88a9e0fce125056112?format=json HTTP/1.1" 401 119 - 
"openstacksdk/0.52.0 keystoneauth1/4.0.0 python-requests/2.22.0 CPython/3.8.10" 
- latency=0.0s
Jun 03 11:47:03 dev-ipp1-u1-control02 radosgw[2802861]: HTTP_ACCEPT=*/*
Jun 03 11:47:03 dev-ipp1-u1-control02 radosgw[2802861]: 
HTTP_ACCEPT_ENCODING=gzip, deflate
Jun 03 11:47:03 dev-ipp1-u1-control02 radosgw[2802861]: HTTP_CONNECTION=close
Jun 03 11:47:03 dev-ipp1-u1-control02.radosgw[2802861]: 
HTTP_HOST=dev-ipp1-u1-object-store
Jun 03 11:47:03 dev-ipp1-u1-control02radosgw[2802861]: 
HTTP_USER_AGENT=openstacksdk/0.52.0 keystoneauth1/4.0.0 python-requests/2.22.0 
CPython/3.8.10
Jun 03 11:47:03 dev-ipp1-u1-control02 radosgw[2802861]: HTTP_VERSION=1.1
Jun 03 11:47:03 dev-ipp1-u1-control02 radosgw[2802861]: 
HTTP_X_AUTH_TOKEN=gABke4qn779UQ_XMz0EDL3P3TgjBQsGG6p-MNhviJxLZTuMTnTDmpT5Yfi9UpgO_T3LOOsPjQAw6zoMUIaC22wPeryp5x-UumB3XwXOWp-qSXLbuN3b9oj_Qg5kCZWA0waWNRHzQ1mwtlEmmpTgvTXbU5V1ym6hEBOn6Q3RWhn34Hj3cF9o
Jun 03 11:47:03 dev-ipp1-u1-control02 radosgw[2802861]: 
HTTP_X_FORWARDED_FOR=10.117.148.3
Jun 03 11:47:03 dev-ipp1-u1-control02 radosgw[2802861]: QUERY_STRING=format=json
Jun 03 11:47:03 dev-ipp1-u1-control02.radosgw[2802861]: REMOTE_ADDR=10.117.53.10
Jun 03 11:47:03 dev-ipp1-u1-control02.radosgw[2802861]: REQUEST_METHOD=GET
Jun 03 11:47:03 dev-ipp1-u

[ceph-users] How to disable S3 ACL in radosgw

2023-06-05 Thread Rasool Almasi
Hi,
Is it possible to disable ACLs in favor of bucket policy (on a bucket or
globally)?
The goal is to forbid users from using any bucket/object ACLs and only allow
bucket policies.

There seems to be no documentation on this that applies to Ceph RGW.
Apologies if I am sending this to the wrong mailing list.

Regards,
Rasool
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: CephFS metadata pool grows by two orders of magnitude while trimming (?) snapshots

2023-06-05 Thread Janek Bevendorff
I just had the problem again that MDS were constantly reporting slow 
metadata IO and the pool was slowly growing. Hence I restarted the MDS 
and now ranks 4 and 5 don't come up again.


Every time, they get to the resolve stage, then crash with a SIGABRT 
without an error message (not even at debug_mds = 20). Any idea what the 
reason could be? I checked whether they have enough RAM, which seems to 
be the case (unless they try to allocate tens of GB in one allocation).


Janek


On 31/05/2023 21:57, Janek Bevendorff wrote:

Hi Dan,

Sorry, I meant Pacific. The version number was correct, the name wasn’t. ;-)

Yes, I have five active MDS and five hot standbys. Static pinning isn’t really 
an option for our directory structure, so we’re using ephemeral pins.

Janek



On 31. May 2023, at 18:44, Dan van der Ster  wrote:

Hi Janek,

A few questions and suggestions:
- Do you have multi-active MDS? In my experience back in nautilus if
something went wrong with mds export between mds's, the mds
log/journal could grow unbounded like you observed until that export
work was done. Static pinning could help if you are not using it
already.
- You definitely should disable the pg autoscaling on the mds metadata
pool (and other pools imho) -- decide the correct number of PGs for
your pools and leave it.
- Which version are you running? You said nautilus but wrote 16.2.12
which is pacific... If you're running nautilus v14 then I recommend
disabling pg autoscaling completely -- IIRC it does not have a fix for
the OSD memory growth "pg dup" issue which can occur during PG
splitting/merging.

Cheers, Dan

__
Clyso GmbH | https://www.clyso.com


On Wed, May 31, 2023 at 4:03 AM Janek Bevendorff
 wrote:

I checked our logs from yesterday, the PG scaling only started today,
perhaps triggered by the snapshot trimming. I disabled it, but it didn't
change anything.

What did change something was restarting the MDS one by one, which had
got far behind with trimming their caches and with a bunch of stuck ops.
After restarting them, the pool size decreased quickly to 600GiB. I
noticed the same behaviour yesterday, though yesterday it was more
extreme and restarting the MDS took about an hour and I had to increase
the heartbeat timeout. This time, it took only half a minute per MDS,
probably because it wasn't that extreme yet and I had reduced the
maximum cache size. Still looks like a bug to me.


On 31/05/2023 11:18, Janek Bevendorff wrote:

Another thing I just noticed is that the auto-scaler is trying to
scale the pool down to 128 PGs. That could also result in large
fluctuations, but this big?? In any case, it looks like a bug to me.
Whatever is happening here, there should be safeguards with regard to
the pool's capacity.

Here's the current state of the pool in ceph osd pool ls detail:

pool 110 'cephfs.storage.meta' replicated size 4 min_size 3 crush_rule
5 object_hash rjenkins pg_num 495 pgp_num 471 pg_num_target 128
pgp_num_target 128 autoscale_mode on last_change 1359013 lfor
0/1358620/1358618 flags hashpspool,nodelete stripe_width 0
expected_num_objects 300 recovery_op_priority 5 recovery_priority
2 application cephfs

Janek


On 31/05/2023 10:10, Janek Bevendorff wrote:

Forgot to add: We are still on Nautilus (16.2.12).


On 31/05/2023 09:53, Janek Bevendorff wrote:

Hi,

Perhaps this is a known issue and I was simply too dumb to find it,
but we are having problems with our CephFS metadata pool filling up
over night.

Our cluster has a small SSD pool of around 15TB which hosts our
CephFS metadata pool. Usually, that's more than enough. The normal
size of the pool ranges between 200 and 800GiB (which is quite a lot
of fluctuation already). Yesterday, we suddenly had the pool
fill up entirely and the only way to fix it was to add more
capacity. I increased the pool size to 18TB by adding more SSDs and
could resolve the problem. After a couple of hours of reshuffling,
the pool size finally went back to 230GiB.

But then we had another fill-up tonight to 7.6TiB. Luckily, I had
adjusted the weights so that not all disks could fill up entirely
like last time, so it ended there.

I wasn't really able to identify the problem yesterday, but under
the more controllable scenario today, I could check the MDS logs at
debug_mds=10 and to me it seems like the problem is caused by
snapshot trimming. The logs contain a lot of snapshot-related
messages for paths that haven't been touched in a long time. The
messages all look something like this:

May 31 09:16:48 XXX ceph-mds[2947525]: 2023-05-31T09:16:48.292+0200
7f7ce1bd9700 10 mds.1.cache.ino(0x1000b3c3670) add_client_cap first
cap, joining realm snaprealm(0x100 seq 1b1c lc 1b1b cr 1
b1b cps 2 snaps={185f=snap(185f 0x100 'monthly_20221201'
2022-12-01T00:00:01.530830+0100),18de=snap(18de 0x100
'monthly_20230101' 2023-01-01T00:00:04.657252+0100),1941=snap(1941
0x100 ...

May 31 09:25:03 XXX ceph-mds[3268481]: 2023-05-31

[ceph-users] Re: Duplicate help statements in Prometheus metrics in 16.2.13

2023-06-05 Thread Konstantin Shalygin
Hi Andreas,

> On 5 Jun 2023, at 14:57, Andreas Haupt  wrote:
> 
> after the update to CEPH 16.2.13 the Prometheus exporter is wrongly
> exporting multiple metric help & type lines for ceph_pg_objects_repaired:
> 
> [mon1] /root #curl -sS http://localhost:9283/metrics
> # HELP ceph_pg_objects_repaired Number of objects repaired in a pool Count
> # TYPE ceph_pg_objects_repaired counter
> ceph_pg_objects_repaired{poolid="34"} 0.0
> # HELP ceph_pg_objects_repaired Number of objects repaired in a pool Count
> # TYPE ceph_pg_objects_repaired counter
> ceph_pg_objects_repaired{poolid="33"} 0.0
> # HELP ceph_pg_objects_repaired Number of objects repaired in a pool Count
> # TYPE ceph_pg_objects_repaired counter
> ceph_pg_objects_repaired{poolid="32"} 0.0
> [...]
> 
> This annoys our exporter_exporter service so it rejects the export of ceph
> metrics. Is this a known issue? Will this be fixed in the next update?

We have backport for this issue [1]

[1] https://github.com/ceph/ceph/pull/51692
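
Until that backport lands, the duplicate HELP/TYPE lines can be confirmed with
promtool's metrics linter, if you have it available (it reads the exposition
format from stdin):

curl -sS http://localhost:9283/metrics | promtool check metrics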

k
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Duplicate help statements in Prometheus metrics in 16.2.13

2023-06-05 Thread Andreas Haupt
Dear all,

after the update to CEPH 16.2.13 the Prometheus exporter is wrongly
exporting multiple metric help & type lines for ceph_pg_objects_repaired:

[mon1] /root #curl -sS http://localhost:9283/metrics
# HELP ceph_pg_objects_repaired Number of objects repaired in a pool Count
# TYPE ceph_pg_objects_repaired counter
ceph_pg_objects_repaired{poolid="34"} 0.0
# HELP ceph_pg_objects_repaired Number of objects repaired in a pool Count
# TYPE ceph_pg_objects_repaired counter
ceph_pg_objects_repaired{poolid="33"} 0.0
# HELP ceph_pg_objects_repaired Number of objects repaired in a pool Count
# TYPE ceph_pg_objects_repaired counter
ceph_pg_objects_repaired{poolid="32"} 0.0
[...]

This annoys our exporter_exporter service so it rejects the export of ceph
metrics. Is this a known issue? Will this be fixed in the next update?

Cheers,
Andreas
-- 
| Andreas Haupt| E-Mail: andreas.ha...@desy.de
|  DESY Zeuthen| WWW:http://www.zeuthen.desy.de/~ahaupt
|  Platanenallee 6 | Phone:  +49/33762/7-7359
|  D-15738 Zeuthen | Fax:+49/33762/7-7216




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] [RGW] what is log_meta and log_data config in a multisite config?

2023-06-05 Thread Gilles Mocellin

Hi Cephers,

In a multisite config with one zonegroup and 2 zones, when I look at 
`radosgw-admin zonegroup get`,

I see these two parameters set by default:
"log_meta": "false",
"log_data": "true",

Where can I find documentation on these? I can't find any.

I set log_meta to true, because why not?
Is it a bad thing?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: 16.2.13: ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist; please create

2023-06-05 Thread Zakhar Kirpichenko
Any other thoughts on this, please? Should I file a bug report?
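
In case it is the ownership change from the PR quoted below, an untested
workaround sketch would be to re-own the crash directory to the in-container
"ceph" user (uid/gid 167 in the official images) on each host, e.g.:

chown -R 167:167 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86/crash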

/Z

On Fri, 2 Jun 2023 at 06:11, Zakhar Kirpichenko  wrote:

> Thanks, Josh. The cluster is managed by cephadm.
>
> On Thu, 1 Jun 2023, 23:07 Josh Baergen,  wrote:
>
>> Hi Zakhar,
>>
>> I'm going to guess that it's a permissions issue arising from
>> https://github.com/ceph/ceph/pull/48804, which was included in 16.2.13.
>> You may need to change the directory permissions, assuming that you manage
>> the directories yourself. If this is managed by cephadm or something like
>> that, then that seems like some sort of missing migration in the upgrade.
>>
>> Josh
>>
>> On Thu, Jun 1, 2023 at 12:34 PM Zakhar Kirpichenko 
>> wrote:
>>
>>> Hi,
>>>
>>> I'm having an issue with crash daemons on Pacific 16.2.13 hosts.
>>> ceph-crash
>>> throws the following error on all hosts:
>>>
>>> ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist;
>>> please create
>>> ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist;
>>> please create
>>> ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist;
>>> please create
>>>
>>> ceph-crash runs in docker, the container has the directory mounted: -v
>>>
>>> /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86/crash:/var/lib/ceph/crash:z
>>>
>>> The mount works correctly:
>>>
>>> 18:26 [root@ceph02 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86]#
>>> ls
>>> -al crash/posted/
>>> total 8
>>> drwx-- 2 nobody nogroup 4096 May  6  2021 .
>>> drwx-- 3 nobody nogroup 4096 May  6  2021 ..
>>>
>>> 18:26 [root@ceph02 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86]#
>>> touch crash/posted/a
>>>
>>> 18:26 [root@ceph02 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86]#
>>> docker exec -it c0cd2b8022d8 bash
>>>
>>> [root@ceph02 /]# ls -al /var/lib/ceph/crash/posted/
>>> total 8
>>> drwx-- 2 nobody nobody 4096 Jun  1 18:26 .
>>> drwx-- 3 nobody nobody 4096 May  6  2021 ..
>>> -rw-r--r-- 1 root   root  0 Jun  1 18:26 a
>>>
>>> I.e. the directory actually exists and is correctly mounted in the crash
>>> container, yet ceph-crash says it doesn't exist. How can I convince it
>>> that the directory is there?
>>>
>>> Best regards,
>>> Zakhar
>>> ___
>>> ceph-users mailing list -- ceph-users@ceph.io
>>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>>
>>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io