[ceph-users] Traefik front end with RGW

2024-05-23 Thread Reid Guyett
Hello, We are considering moving from Nginx to Traefik as the frontend for our RGW services. Prior to putting it into production, I ran it through s3-tests and noticed that all of the tests involving metadata (x-amz-meta-*) are failing because they are expected to be lowercase
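
(Not part of the original mail, just an illustration of the check: put an object with a mixed-case metadata key and look at what comes back on a HEAD; the endpoint and bucket names are placeholders.)

    # hypothetical endpoint/bucket, for illustration only
    aws --endpoint-url http://rgw.example.com s3api put-object \
        --bucket testbucket --key obj1 --body ./somefile \
        --metadata MyKey=SomeValue
    # the Metadata map in the response shows how the key is returned;
    # metadata travels as x-amz-meta-<key> headers and keys are treated as
    # case-insensitive, so proxies and servers commonly hand them back lowercased
    aws --endpoint-url http://rgw.example.com s3api head-object \
        --bucket testbucket --key obj1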

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Frank Schilder
Hi Eugen, just to add another strangeness observation from long ago: https://www.spinics.net/lists/ceph-users/msg74655.html. I didn't see any reweights in your trees, so it's something else. However, there seem to be multiple issues with EC pools and peering. I also want to clarify: > If this

[ceph-users] Re: Best practice regarding rgw scaling

2024-05-23 Thread Casey Bodley
On Thu, May 23, 2024 at 11:50 AM Szabo, Istvan (Agoda) wrote:
> Hi,
> Wonder what is the best practice to scale RGW, increase the thread numbers or spin up more gateways?
> * Let's say I have 21000 connections on my haproxy
> * I have 3 physical gateway servers so let's say
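
(Not an answer from the thread, just a sketch of the thread-pool knob being discussed; the values are examples only.)

    # raise the per-gateway beast thread pool (default 512); the RGW daemons
    # typically need a restart to pick this up
    ceph config set client.rgw rgw_thread_pool_size 1024
    # check what is currently configured
    ceph config get client.rgw rgw_thread_pool_size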

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Frank Schilder
Hi Eugen, I'm at home now. Could you please check that all the remapped PGs have no shards on the new OSDs, i.e. it's just shuffling around mappings within the same set of OSDs under rooms? If this is the case, it is possible that this is partly intentional and partly buggy. The remapping
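
(For anyone wanting to run that check, a rough sketch; the pgid and OSD ids below are placeholders, not from Eugen's cluster.)

    # list remapped PGs with their up and acting sets
    ceph pg dump pgs_brief 2>/dev/null | grep remapped
    # show the mapping of a single PG
    ceph pg map 1.2f
    # look for the newly added OSD ids (e.g. 14 and 15) in the up/acting sets
    ceph pg dump pgs_brief 2>/dev/null | grep remapped | grep -E '\[[^]]*\b(14|15)\b[^]]*\]'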

[ceph-users] Re: Best practice regarding rgw scaling

2024-05-23 Thread Anthony D'Atri
I'm interested in these responses. Early this year a certain someone related having good results by deploying an RGW on every cluster node. This was when we were experiencing ballooning memory usage conflicting with K8s limits when running 3. So on the cluster in question we now run 25.
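
(Not from the original posts, just a sketch of how that one-RGW-per-node layout can be declared on a cephadm-managed cluster; the service id "myrgw" and the "rgw" label are made up for illustration.)

    # label the hosts that should run a gateway (repeat per host)
    ceph orch host label add host1 rgw
    # one RGW daemon on every labelled host
    ceph orch apply rgw myrgw --placement="label:rgw count-per-host:1" --port=8080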

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Eugen Block
So this is the current status after adding two hosts outside of their rooms:

ceph osd tree
ID   CLASS  WEIGHT   TYPE NAME        STATUS  REWEIGHT  PRI-AFF
 -1         0.37054  root default
-23         0.04678      host host5
 14    hdd  0.02339          osd.14       up       1.0      1.0

[ceph-users] Best practice regarding rgw scaling

2024-05-23 Thread Szabo, Istvan (Agoda)
Hi, Wonder what is the best practice to scale RGW: increase the thread numbers or spin up more gateways?
* Let's say I have 21000 connections on my haproxy
* I have 3 physical gateway servers, so let's say each of them needs to serve 7000 connections
This means with 512 thread pool size

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Eugen Block
In my small lab cluster I can at least reproduce that a bunch of PGs are remapped after adding hosts to the default root, but they are not in their designated location yet. I have 3 „rooms“ underneath the default root. Although I can’t reproduce the unknown PGs, maybe this is enough to
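
(Side note, not from the original mail: moving a freshly added host from the default root into its room is a plain crush move; the bucket and host names are placeholders.)

    # create the room bucket if needed and attach it to the root
    ceph osd crush add-bucket room1 room
    ceph osd crush move room1 root=default
    # move the new host under the room
    ceph osd crush move host5 room=room1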

[ceph-users] Re: Status of 18.2.3

2024-05-23 Thread Sake Ceph
I don't have access to Slack, but thank you for all your work! Fingers crossed for a quick release. Kind regards, Sake > On 23-05-2024 16:20 CEST, Yuri Weinstein wrote: > > > We are still working on the last-minute fixes, see this for details >

[ceph-users] Re: Status of 18.2.3

2024-05-23 Thread Yuri Weinstein
We are still working on the last-minute fixes, see this for details https://ceph-storage.slack.com/archives/C054Q1NUBQT/p1711041666180929 Regards YuriW On Thu, May 23, 2024 at 6:22 AM Sake Ceph wrote: > > I was wondering what happened to the release of 18.2.3? Validation started on > April

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Eugen Block
Thanks, Frank, I appreciate your help. I already asked for the osdmap, but I’ll also try to find a reproducer. Quoting Frank Schilder: Hi Eugen, thanks for this clarification. Yes, with the observations you describe for transition 1->2, something is very wrong. Nothing should happen.

[ceph-users] Re: User + Dev Meetup Tomorrow!

2024-05-23 Thread Laura Flores
Hi all, The meeting will be starting shortly! Join us at this link: https://meet.jit.si/ceph-user-dev-monthly - Laura On Wed, May 22, 2024 at 2:55 PM Laura Flores wrote: > Hi all, > > The User + Dev Meetup will be held tomorrow at 10:00 AM EDT. We will be > discussing the results of the

[ceph-users] Status of 18.2.3

2024-05-23 Thread Sake Ceph
I was wondering what happened to the release of 18.2.3? Validation started on April 13th and as far as I know there have been a couple of builds and some extra bug fixes. Is there a way to follow a release or what is holding it back? Normally I wouldn't ask about a release and just wait, but I

[ceph-users] Re: does the RBD client block write when the Watcher times out?

2024-05-23 Thread Frank Schilder
Hi, we run into the same issue and there is actually another use case: live-migration of VMs. This requires an RBD image being mapped to two clients simultaneously, so this is intentional. If multiple clients map an image in RW mode, the ceph back-end will cycle the write lock between the
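
(For reference, a quick way to see whether exclusive-lock is involved and who currently owns the lock; the pool/image spec is a placeholder.)

    # check whether the exclusive-lock feature is enabled on the image
    rbd info rbd/vm-disk-1 | grep features
    # list the current lock owner; with exclusive-lock the lock migrates between clients
    rbd lock ls rbd/vm-disk-1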

[ceph-users] Re: does the RBD client block write when the Watcher times out?

2024-05-23 Thread Ilya Dryomov
On Thu, May 23, 2024 at 4:48 AM Yuma Ogami wrote: > > Hello. > > I'm currently verifying the behavior of RBD on failure. I'm wondering > about the consistency of RBD images after network failures. As a > result of my investigation, I found that RBD sets a Watcher to RBD > image if a client mounts
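
(Not from the original mail: watchers and blocklist entries can be inspected like this while testing the failure behaviour; the image spec is a placeholder.)

    # show watchers currently registered on the image
    rbd status rbd/testimg
    # check whether a client got blocklisted after the network cut
    ceph osd blocklist ls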

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Frank Schilder
Hi Eugen, thanks for this clarification. Yes, with the observations you describe for transition 1->2, something is very wrong. Nothing should happen. Unfortunately, I'm going to be on holidays and, generally, don't have too much time. If they can afford to share the osdmap (ceph osd getmap -o
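
(If the osdmap does get shared, it can be replayed offline with osdmaptool without touching the cluster; the pool id below is a placeholder.)

    ceph osd getmap -o osdmap.bin
    # summary of how PGs of pool 5 map with the current crush map and tunables
    osdmaptool osdmap.bin --test-map-pgs --pool 5
    # per-PG mappings, useful for spotting incomplete or odd up sets
    osdmaptool osdmap.bin --test-map-pgs-dump --pool 5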

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Eugen Block
Hi Frank, thanks for chiming in here. Please correct if this is wrong. Assuming it's correct, I conclude the following. You assume correctly. Now, from your description it is not clear to me on which of the transitions 1->2 or 2->3 you observe - peering and/or - unknown PGs. The

[ceph-users] Re: Pacific 16.2.15 and ceph-volume no-longer creating LVM on block.db partition

2024-05-23 Thread Bruno Canning
I should add, we are not using cephadm. From: Bruno Canning Sent: 23 May 2024 11:36 To: ceph-users@ceph.io Subject: Pacific 16.2.15 and ceph-volume no-longer creating LVM on block.db partition Hi Folks, After recently upgrading from 16.2.13 to 16.2.15, when I run: ceph-volume lvm create

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Frank Schilder
Hi Eugen, I'm afraid the description of your observation breaks a bit with causality and this might be the reason for the few replies. To bring a bit more structure into what happened when, let's look at what I did and didn't get: Before adding the hosts you have situation 1) default

[ceph-users] Pacific 16.2.15 and ceph-volume no-longer creating LVM on block.db partition

2024-05-23 Thread Bruno Canning
Hi Folks, After recently upgrading from 16.2.13 to 16.2.15, when I run: ceph-volume lvm create --data /dev/sda --block.db /dev/nvme0n1p1 to create a new OSD after replacement of a failed disk, ceph-volume no longer creates a volume group/logical volume on the block.db partition*. This is the
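
(A possible workaround, untested here and only the generic approach: create the VG/LV on the db partition by hand and pass it as vg/lv instead of the raw partition; the names ceph-db-0/db-0 are made up.)

    # create the db volume group/logical volume manually on the partition
    vgcreate ceph-db-0 /dev/nvme0n1p1
    lvcreate -l 100%FREE -n db-0 ceph-db-0
    # then reference it as vg/lv when creating the OSD
    ceph-volume lvm create --data /dev/sda --block.db ceph-db-0/db-0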

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Eugen Block
Hi again, I'm still wondering if I misunderstand some of the ceph concepts. Let's assume the choose_tries value is too low and ceph can't find enough OSDs for the remapping. I would expect that there are some PG chunks in remapping state or unknown or whatever, but why would it affect
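
(For completeness, the usual offline way to test the choose_tries theory: pull the crush map, raise set_choose_tries in the EC rule, and look for bad mappings; the rule id and num-rep below are placeholders.)

    ceph osd getcrushmap -o crush.bin
    crushtool -d crush.bin -o crush.txt
    # edit crush.txt: add e.g. "step set_choose_tries 100" as the first step of the EC rule
    crushtool -c crush.txt -o crush.new
    # rule 1 and --num-rep 6 (k+m) must match the actual EC profile
    crushtool -i crush.new --test --rule 1 --num-rep 6 --show-bad-mappings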