[ceph-users] Re: Pacific 16.2.15 `osd noin`

2024-04-03 Thread Zakhar Kirpichenko
Any comments regarding `osd noin`, please? /Z On Tue, 2 Apr 2024 at 16:09, Zakhar Kirpichenko wrote: > Hi, > > I'm adding a few OSDs to an existing cluster, the cluster is running with > `osd noout,noin`: > > cluster: > id: 3f50555a-ae2a-11eb-a2fc-ffde44714d86 > health:
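A minimal sketch of the flags being discussed, assuming the usual `ceph osd` CLI (the OSD id below is illustrative):

    # cluster-wide flags set before adding the OSDs
    ceph osd set noout
    ceph osd set noin

    # with noin set, freshly created OSDs stay "out" until marked in explicitly
    ceph osd in 42

    # clear the flags once the new OSDs are in place
    ceph osd unset noin
    ceph osd unset noout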

[ceph-users] RGW services crashing randomly with same message

2024-04-03 Thread Reid Guyett
Hello, We are currently experiencing a lot of rgw service crashes that all seem to terminate with the same message. We have kept our RGW services at 17.2.5 but the rest of the cluster is 17.2.7 due to a bug introduced in 17.2.7. terminate called after throwing an instance of >

[ceph-users] Re: Slow ops during recovery for RGW index pool only when degraded OSD is primary

2024-04-03 Thread Anthony D'Atri
Thanks. I'll PR up some doc updates reflecting this and run them by the RGW / RADOS folks. > On Apr 3, 2024, at 16:34, Joshua Baergen wrote: > > Hey Anthony, > > Like with many other options in Ceph, I think what's missing is the > user-visible effect of what's being altered. I believe the

[ceph-users] Re: Slow ops during recovery for RGW index pool only when degraded OSD is primary

2024-04-03 Thread Joshua Baergen
Hey Anthony, Like with many other options in Ceph, I think what's missing is the user-visible effect of what's being altered. I believe the reason why synchronous recovery is still used is that, assuming that per-object recovery is quick, it's faster to complete than asynchronous recovery, which

[ceph-users] Re: RBD image metric

2024-04-03 Thread Anthony D'Atri
Depending on your Ceph release you might need to enable rbdstats. Are you after provisioned, allocated, or both sizes? Do you have object-map and fast-diff enabled? They speed up `rbd du` massively. > On Apr 3, 2024, at 00:26, Szabo, Istvan (Agoda) > wrote: > > Hi, > > Trying to pull out
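A hedged sketch of the commands touched on above; the pool and image names are illustrative, and the mgr option applies to the prometheus module:

    # check whether object-map and fast-diff are enabled
    rbd info rbd/myimage | grep features

    # enable them if missing, then rebuild the object map
    rbd feature enable rbd/myimage object-map fast-diff
    rbd object-map rebuild rbd/myimage

    # per-image usage (fast when object-map/fast-diff are present)
    rbd du rbd/myimage

    # export per-image metrics via the prometheus mgr module
    ceph config set mgr mgr/prometheus/rbd_stats_pools "rbd"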

[ceph-users] Re: Slow ops during recovery for RGW index pool only when degraded OSD is primary

2024-04-03 Thread Anthony D'Atri
We currently have in src/common/options/global.yaml.in - name: osd_async_recovery_min_cost type: uint level: advanced desc: A mixture measure of number of current log entries difference and historical missing objects, above which we switch to use asynchronous recovery when
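For reference, a sketch of how that option can be inspected or overridden at runtime (the value 0 is only an illustration of forcing asynchronous recovery, as discussed later in the thread):

    ceph config get osd osd_async_recovery_min_cost
    ceph config set osd osd_async_recovery_min_cost 0    # lower cost threshold favours async recovery
    ceph config rm osd osd_async_recovery_min_cost       # revert to the default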

[ceph-users] Re: Upgraded to Quincy 17.2.7: some S3 buckets inaccessible

2024-04-03 Thread Casey Bodley
On Wed, Apr 3, 2024 at 3:09 PM Lorenz Bausch wrote: > > Hi Casey, > > thank you so much for the analysis! We tested the upgrade intensively, but > the buckets in our test environment were probably too small to get > dynamically resharded. > > > after upgrading to the Quincy release, rgw would > >

[ceph-users] Re: [ext] Re: cephadm auto disk preparation and OSD installation incomplete

2024-04-03 Thread Eugen Block
Hi and sorry for the delay, I was on vacation last week. :-) I just read your responses. I have no idea how to modify the default timeout for cephadm; maybe Adam or someone else can comment on that. But every time I've been watching cephadm (ceph-volume) create new OSDs they are not created

[ceph-users] Re: quincy-> reef upgrade non-cephadm

2024-04-03 Thread Eugen Block
Hi, 1. I see no systemd units with the fsid in them, as described in the document above. Both before and after the upgrade, my mon and other units are: ceph-mon@.service, ceph-osd@[N].service, etc. Should I be concerned? I think this is expected because it's not containerized, no reason to
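A quick, hedged way to compare the two naming schemes (the fsid placeholder is illustrative; fsid-based unit templates only appear on cephadm/containerized hosts):

    # package-based (non-cephadm) deployments
    systemctl list-units 'ceph-mon@*' 'ceph-osd@*'

    # cephadm-managed hosts embed the cluster fsid in the unit names
    systemctl list-units 'ceph-<fsid>@*'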

[ceph-users] Re: ceph orchestrator for osds

2024-04-03 Thread Eugen Block
Hi, how many OSDs do you have in total? Can you share your osd tree, please? You could check the unit.meta file on each OSD host to see which service it refers to and simply change it according to the service you intend to keep: host1:~ # grep -r service_name
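As a hedged illustration of what that check can look like on a cephadm OSD host (fsid, OSD id and service names are made up):

    host1:~ # grep -r service_name /var/lib/ceph/<fsid>/osd.*/unit.meta
    /var/lib/ceph/<fsid>/osd.12/unit.meta:    "service_name": "osd.all-available-devices",

    # compare against the OSD specs the orchestrator actually manages
    ceph orch ls osd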

[ceph-users] Re: Upgraded to Quincy 17.2.7: some S3 buckets inaccessible

2024-04-03 Thread Lorenz Bausch
Hi Casey, thank you so much for the analysis! We tested the upgrade intensively, but the buckets in our test environment were probably too small to get dynamically resharded. after upgrading to the Quincy release, rgw would look at the wrong object names when trying to list those buckets. As

[ceph-users] Re: Issue about execute "ceph fs new"

2024-04-03 Thread Eugen Block
Hi, you need to deploy more daemons because your current active MDS is responsible for the already existing CephFS. There are several ways to do this, I like the yaml file approach and increase the number of MDS daemons, just as an example from a test cluster with one CephFS I added the
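A minimal sketch of the yaml-file approach mentioned above (service id and placement count are illustrative):

    cat > mds.yaml <<'EOF'
    service_type: mds
    service_id: kingcephfs
    placement:
      count: 2
    EOF
    ceph orch apply -i mds.yaml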

[ceph-users] Re: Upgraded to Quincy 17.2.7: some S3 buckets inaccessible

2024-04-03 Thread Casey Bodley
To expand on this diagnosis: with multisite resharding, we changed how buckets name/locate their bucket index shard objects. Any buckets that were resharded under this Red Hat Pacific release would be using the new object names. After upgrading to the Quincy release, rgw would look at the wrong
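For reference, a hedged sketch of the commands typically used to inspect and reshard a bucket index (bucket name and shard count are illustrative):

    # current shard count and bucket instance id
    radosgw-admin bucket stats --bucket=mybucket

    # queue and run a manual reshard
    radosgw-admin reshard add --bucket=mybucket --num-shards=31
    radosgw-admin reshard process

    # raw bucket instance metadata, including the index layout
    radosgw-admin metadata get bucket.instance:<bucket>:<instance-id>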

[ceph-users] Re: Slow ops during recovery for RGW index pool only when degraded OSD is primary

2024-04-03 Thread Joshua Baergen
We've had success using osd_async_recovery_min_cost=0 to drastically reduce slow ops during index recovery. Josh On Wed, Apr 3, 2024 at 11:29 AM Wesley Dillingham wrote: > > I am fighting an issue on an 18.2.0 cluster where a restart of an OSD which > supports the RGW index pool causes

[ceph-users] Slow ops during recovery for RGW index pool only when degraded OSD is primary

2024-04-03 Thread Wesley Dillingham
I am fighting an issue on an 18.2.0 cluster where a restart of an OSD which supports the RGW index pool causes crippling slow ops. If the OSD is marked with a primary-affinity of 0 prior to the OSD restart, no slow ops are observed. If the OSD has a primary-affinity of 1, slow ops occur. The slow ops
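A hedged sketch of the primary-affinity workaround described above (the OSD id is illustrative, and the restart could equally be done via the orchestrator):

    # keep the OSD from acting as primary before restarting it
    ceph osd primary-affinity osd.123 0
    systemctl restart ceph-osd@123

    # restore it once recovery has finished
    ceph osd primary-affinity osd.123 1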

[ceph-users] Re: Multi-MDS

2024-04-03 Thread quag...@bol.com.br

[ceph-users] Issue about execute "ceph fs new"

2024-04-03 Thread elite_stu
Everything goes fine except executing "ceph fs new kingcephfs cephfs-king-metadata cephfs-king-data"; it shows "1 filesystem is offline" and "1 filesystem is online with fewer MDS than max_mds". But I see there is one MDS service running; please help me to fix the issue, thanks a lot. bash-4.4$
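A few commands that usually help narrow this down, sketched under the assumption of a cephadm-managed cluster, with the filesystem name from the message and an illustrative placement count:

    ceph fs status kingcephfs
    ceph orch ls mds

    # deploy MDS daemons for the new filesystem
    ceph orch apply mds kingcephfs --placement=2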

[ceph-users] OSD: failed decoding part header ERRORS

2024-04-03 Thread Mark Selby
We have a ceph cluster of only NVMe drives. Very recently our overall OSD write latency increased pretty dramatically and our overall throughput has really decreased. One thing that seems to correlate with the start of this problem is the ERROR lines below from the logs. All our OSD nodes

[ceph-users] Re: cephfs creation error

2024-04-03 Thread elite_stu
I have the same issue; can someone help me? Thanks in advance! bash-4.4$ ceph fs new kingcephfs cephfs-king-metadata cephfs-king-data new fs with metadata pool 7 and data pool 8 bash-4.4$ bash-4.4$ ceph -s cluster: id: de9af3fe-d3b1-4a4b-bf61-929a990295f6 health: HEALTH_ERR

[ceph-users] Re: Upgraded to Quincy 17.2.7: some S3 buckets inaccessible

2024-04-03 Thread Casey Bodley
On Wed, Apr 3, 2024 at 11:58 AM Lorenz Bausch wrote: > > Hi everybody, > > we upgraded our containerized Red Hat Pacific cluster to the latest > Quincy release (Community Edition). I'm afraid this is not an upgrade path that we try to test or support. Red Hat makes its own decisions about what

[ceph-users] Upgraded to Quincy 17.2.7: some S3 buckets inaccessible

2024-04-03 Thread Lorenz Bausch
Hi everybody, we upgraded our containerized Red Hat Pacific cluster to the latest Quincy release (Community Edition). The upgrade itself went fine, the cluster is HEALTH_OK, all daemons run the upgraded version: %< $ ceph -s cluster: id:

[ceph-users] IO500 CFS ISC 2024

2024-04-03 Thread IO500 Committee
Call for Submission Stabilization Period: Monday, April 1st - Friday, April 15th, 2024 Submission Deadline: Tuesday, May 3rd, 2024 AoE The IO500 is now accepting and encouraging submissions for the upcoming 14th semi-annual IO500 Production and Research lists, in conjunction with ISC24. Once

[ceph-users] Re: ceph status not showing correct monitor services

2024-04-03 Thread Adiga, Anantha
Removed the config setting for mon.a001s016. Here it is: # ceph config get mon container_image docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586 # ceph config get osd container_image

[ceph-users] Re: ceph and raid 1 replication

2024-04-03 Thread Roberto Maggi @ Debian
Thanks for the considerations. On 4/3/24 13:08, Janne Johansson wrote: Hi everyone, I'm new to ceph and I'm still studying it. In my company we decided to test ceph for possible further implementations. Although I understood its capabilities I'm still doubtful about how to set up replication.

[ceph-users] Re: ceph and raid 1 replication

2024-04-03 Thread Janne Johansson
> Hi everyone, > I'm new to ceph and I'm still studying it. > In my company we decided to test ceph for possible further implementations. > > Although I understood its capabilities I'm still doubtful about how to > set up replication. Default settings in ceph will give you replication = 3, which
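For illustration, the defaults and per-pool replication settings referred to can be inspected like this (pool name is illustrative):

    # cluster-wide defaults applied to new pools
    ceph config get osd osd_pool_default_size
    ceph config get osd osd_pool_default_min_size

    # per-pool settings
    ceph osd pool get mypool size
    ceph osd pool get mypool min_size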

[ceph-users] ceph and raid 1 replication

2024-04-03 Thread Roberto Maggi @ Debian
Hi everyone, I'm new to ceph and I'm still studying it. In my company we decided to test ceph for possible further implementations. Although I understood its capabilities I'm still doubtful about how to set up replication. Once implemented in production I can accept a little lack of

[ceph-users] Re: put bucket notification configuration - access denied

2024-04-03 Thread Yuval Lifshitz
Hi GM, sorry for the late reply. Anyway, you are right. In "quincy" (v17) only the owner of the bucket was allowed to set a notification on the bucket. In "reef" (v18) we fixed that, so that we follow the permissions set on the bucket. You can use the "s3PutBucketNotification" policy on the
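A hedged sketch of such a bucket policy, assuming the standard s3:PutBucketNotification action spelling; user, bucket and endpoint names are illustrative:

    cat > notif-policy.json <<'EOF'
    {
      "Version": "2012-10-17",
      "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": ["arn:aws:iam:::user/notif-user"]},
        "Action": ["s3:PutBucketNotification", "s3:GetBucketNotification"],
        "Resource": ["arn:aws:s3:::mybucket"]
      }]
    }
    EOF
    aws --endpoint-url http://rgw.example.com:8080 \
        s3api put-bucket-policy --bucket mybucket --policy file://notif-policy.json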

[ceph-users] Re: ceph status not showing correct monitor services

2024-04-03 Thread Eugen Block
I have no idea what you did there ;-) I would remove that config though and rather configure the ceph image globally; there have been several issues when cephadm tries to launch daemons with different ceph versions. Although in your case it looks like they are actually the same images
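A hedged sketch of that cleanup (the daemon name is taken from the thread; the global image shown is only an example):

    # drop the per-daemon override
    ceph config rm mon.a001s016 container_image

    # configure a single image for the whole cluster instead
    ceph config set global container_image quay.io/ceph/ceph:v16.2.15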

[ceph-users] Re: rgw s3 bucket policies limitations (on users)

2024-04-03 Thread Christian Rohmann
Hey Garcetto, On 29.03.24 4:13 PM, garcetto wrote: I am trying to set bucket policies to allow different users to access the same bucket with different permissions, BUT it seems that is not yet supported, am I wrong? https://docs.ceph.com/en/reef/radosgw/bucketpolicy/#limitations "We do not
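For what the documentation does support, a hedged example of granting a second user read-only access to a bucket owned by someone else (user and bucket names are illustrative):

    cat > ro-policy.json <<'EOF'
    {
      "Version": "2012-10-17",
      "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": ["arn:aws:iam:::user/readonly-user"]},
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": ["arn:aws:s3:::shared-bucket", "arn:aws:s3:::shared-bucket/*"]
      }]
    }
    EOF
    s3cmd setpolicy ro-policy.json s3://shared-bucket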