[ceph-users] Traefik front end with RGW

2024-05-23 Thread Reid Guyett
Hello,

We are considering moving from Nginx to Traefik as the frontend for our RGW
services. Prior to putting into production I ran it through s3-tests and
noticed that all of the tests involving metadata (x-amz-meta-*) are failing
because the keys are expected to be lowercase
(test_s3.py::test_object_set_get_metadata_none_to_good - KeyError: 'meta1').
Traefik / Go canonicalizes these headers so each word starts with an uppercase
letter (X-Amz-Meta-Key). Poking around I found
https://github.com/traefik/traefik/issues/466 and it points out RFC 2616
which says these should be case insensitive. AWS S3 says "User-defined
metadata is a set of key-value pairs. Amazon S3 stores user-defined
metadata keys in lowercase." over at
https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingMetadata.html.
Is anybody using Traefik as the frontend for their RGW? If so, how do you
mitigate this?
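
For anyone who wants to reproduce this outside of s3-tests, a rough check with
the AWS CLI could look like the following (endpoint, bucket and key names are
placeholders, not our actual setup):

aws --endpoint-url https://rgw-via-traefik.example.com s3api put-object \
    --bucket testbucket --key obj1 --body ./somefile --metadata meta1=value1
aws --endpoint-url https://rgw-via-traefik.example.com s3api head-object \
    --bucket testbucket --key obj1 --query Metadata

If the proxy canonicalizes the response headers, the key may come back as
"Meta1" instead of "meta1", which is exactly what trips up the s3-tests.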

Thanks for any input.

Reid
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Frank Schilder
Hi Eugen,

just to add another strange observation from long ago:
https://www.spinics.net/lists/ceph-users/msg74655.html. I didn't see any
reweights in your trees, so it's something else. However, there seem to be
multiple issues with EC pools and peering.

I also want to clarify:

> If this is the case, it is possible that this is partly intentional and 
> partly buggy.

"Partly intentional" here means the code behaviour changes when you add OSDs to 
the root outside the rooms, and this change is not considered a bug. It is
clearly *not* expected, though, as it means you cannot do maintenance on a pool
living on tree A without affecting pools on the same device class living on an
unmodified subtree of A.

From a ceph user's point of view everything you observe looks buggy. I would
really like to see a good explanation why the mappings in the subtree *should*
change when adding OSDs above that subtree, as in your case, when the
expectation for good reasons is that they don't. This would help devising
clean procedures for adding hosts when you (and I) want to add OSDs first
without any peering and then move the OSDs into place, so that the data movement
happens separately from the adding rather than becoming a total mess with
everything in parallel.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Frank Schilder 
Sent: Thursday, May 23, 2024 6:32 PM
To: Eugen Block
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: unknown PGs after adding hosts in different subtree

Hi Eugen,

I'm at home now. Could you please check that none of the remapped PGs have
shards on the new OSDs, i.e. that it's just shuffling mappings around within the
same set of OSDs under the rooms?

If this is the case, it is possible that this is partly intentional and partly 
buggy. The remapping is then probably intentional and the method I use with a 
disjoint tree for new hosts prevents such remappings initially (the crush code 
sees the new OSDs in the root, doesn't use them but their presence does change 
choice orders resulting in remapped PGs). However, the unknown PGs should 
clearly not occur.

I'm afraid that the peering code has quite a few bugs; I reported something at
least similarly weird a long time ago: https://tracker.ceph.com/issues/56995
and https://tracker.ceph.com/issues/46847. They might even be related. It looks like
peering can lose track of PG members in certain situations (specifically after
adding OSDs, until rebalancing has completed). In my cases, I get degraded objects
even though everything is obviously still around. Flipping between the
crush maps before/after the change re-discovers everything again.

Issue 46847 is long-standing and still unresolved. In case you need to file a
tracker, please consider referring to the two above as "might be related" if you
deem them related.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Best practice regarding rgw scaling

2024-05-23 Thread Casey Bodley
On Thu, May 23, 2024 at 11:50 AM Szabo, Istvan (Agoda)
 wrote:
>
> Hi,
>
> Wonder what is the best practice to scale RGW, increase the thread numbers or 
> spin up more gateways?
>
>
>   * Let's say I have 21000 connections on my haproxy
>   * I have 3 physical gateway servers, so let's say each of them needs to serve
>     7000 connections
>
> This means that with a 512 thread pool size, each of them would need 13 gateways,
> 39 altogether in the cluster.
> Or
> 3 gateways, each with 8192 RGW threads?

with the beast frontend, rgw_max_concurrent_requests is the most
relevant config option here. while you might benefit from more than
512 threads at scale, you won't need a thread per connection

i'd also point out the relationship between concurrent requests and
memory usage: with default tunings, each PutObject
(rgw_put_obj_min_window_size) and GetObject (rgw_get_obj_window_size)
request may buffer up to 16MB of object data
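
purely as an illustration of the sizing math (the numbers below are
placeholders, not a recommendation), the limit is adjusted with the usual
config command:

ceph config set client.rgw rgw_max_concurrent_requests 2048   # default is 1024

with 2048 in-flight requests and up to 16MB buffered per GET/PUT window, the
object-data buffers alone could approach 2048 * 16MB = 32GB per gateway in the
worst case, so size the limit against available RAM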

>
> Thank you
>
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Frank Schilder
Hi Eugen,

I'm at home now. Could you please check that none of the remapped PGs have
shards on the new OSDs, i.e. that it's just shuffling mappings around within the
same set of OSDs under the rooms?

If this is the case, it is possible that this is partly intentional and partly 
buggy. The remapping is then probably intentional and the method I use with a 
disjoint tree for new hosts prevents such remappings initially (the crush code 
sees the new OSDs in the root, doesn't use them but their presence does change 
choice orders resulting in remapped PGs). However, the unknown PGs should 
clearly not occur.

I'm afraid that the peering code has quite a few bugs; I reported something at
least similarly weird a long time ago: https://tracker.ceph.com/issues/56995
and https://tracker.ceph.com/issues/46847. They might even be related. It looks like
peering can lose track of PG members in certain situations (specifically after
adding OSDs, until rebalancing has completed). In my cases, I get degraded objects
even though everything is obviously still around. Flipping between the
crush maps before/after the change re-discovers everything again.

Issue 46847 is long-standing and still unresolved. In case you need to file a
tracker, please consider referring to the two above as "might be related" if you
deem them related.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Best practice regarding rgw scaling

2024-05-23 Thread Anthony D'Atri
I'm interested in these responses.  Early this year a certain someone related 
having good results by deploying an RGW on every cluster node.  This was when 
we were experiencing ballooning memory usage conflicting with K8s limits while 
running 3 RGWs.  So on the cluster in question we now run 25.

I've read in the past of people running multiple RGW instances on a given host 
vs trying to get a single instance to scale past a certain point.  I'm told 
that in our Rook/K8s case we couldn't do that.
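
For what it's worth, with cephadm (as opposed to Rook) multiple instances per
host can be declared in the service spec via count_per_host. A sketch, with the
service id, label and port made up:

service_type: rgw
service_id: main
placement:
  label: rgw
  count_per_host: 2
spec:
  rgw_frontend_port: 8000

Saved to e.g. rgw-spec.yaml and applied with "ceph orch apply -i rgw-spec.yaml";
cephadm should assign sequential frontend ports to the extra instances.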

> On May 23, 2024, at 11:49, Szabo, Istvan (Agoda)  
> wrote:
> 
> Hi,
> 
> Wonder what is the best practice to scale RGW, increase the thread numbers or 
> spin up more gateways?
> 
> 
>  * Let's say I have 21000 connections on my haproxy
>  * I have 3 physical gateway servers, so let's say each of them needs to serve
>    7000 connections
> 
> This means that with a 512 thread pool size, each of them would need 13 gateways,
> 39 altogether in the cluster.
> Or
> 3 gateways, each with 8192 RGW threads?
> 
> Thank you
> 
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Eugen Block

So this is the current status after adding two hosts outside of their rooms:

ceph osd tree
ID   CLASS  WEIGHT   TYPE NAME           STATUS  REWEIGHT  PRI-AFF
 -1         0.37054  root default
-23         0.04678      host host5
 14    hdd  0.02339          osd.14          up   1.0      1.0
 15    hdd  0.02339          osd.15          up   1.0      1.0
-12         0.04678      host host6
  1    hdd  0.02339          osd.1           up   1.0      1.0
 13    hdd  0.02339          osd.13          up   1.0      1.0
 -8         0.09399      room room1
 -3         0.04700          host host1
  7    hdd  0.02299              osd.7       up   1.0      1.0
 10    hdd  0.02299              osd.10      up   1.0      1.0
 -5         0.04700          host host2
  4    hdd  0.02299              osd.4       up   1.0      1.0
 11    hdd  0.02299              osd.11      up   1.0      1.0
 -9         0.09299      room room2
-17         0.04599          host host7
  0    hdd  0.02299              osd.0       up   1.0      1.0
  2    hdd  0.02299              osd.2       up   1.0      1.0
 -7         0.04700          host host8
  5    hdd  0.02299              osd.5       up   1.0      1.0
  6    hdd  0.02299              osd.6       up   1.0      1.0
-21         0.09000      room room3
-11         0.04300          host host3
  8    hdd  0.01900              osd.8       up   1.0      1.0
  9    hdd  0.02299              osd.9       up   1.0      1.0
-15         0.04700          host host4
  3    hdd  0.02299              osd.3       up   1.0      1.0
 12    hdd  0.02299              osd.12      up   1.0      1.0

And the current ceph status:

# ceph -s
  cluster:
id: 543967bc-e586-32b8-bd2c-2d8b8b168f02
health: HEALTH_OK

  services:
mon: 3 daemons, quorum host1,host2,host3 (age 5d)
mgr: host8.psefrq(active, since 76m), standbys: host4.frkktj, host1.vhylmr
mds: 2/2 daemons up, 1 standby, 1 hot standby
osd: 16 osds: 16 up (since 69m), 16 in (since 70m); 89 remapped pgs
rgw: 2 daemons active (2 hosts, 1 zones)

  data:
volumes: 2/2 healthy
pools:   15 pools, 350 pgs
objects: 576 objects, 341 MiB
usage:   61 GiB used, 319 GiB / 380 GiB avail
pgs: 256/2013 objects misplaced (12.717%)
 262 active+clean
 88  active+clean+remapped

I attached my osdmap, not sure if it will go through, though. Let me  
know if you need anything else.


Thanks!
Eugen


Zitat von Eugen Block :

In my small lab cluster I can at least reproduce that a bunch of PGs  
are remapped after adding hosts to the default root, but they are  
not in their designated location yet. I have 3 „rooms“ underneath  
the default root. Although I can’t reproduce the unknown PGs, maybe  
this is enough to investigate? I’m on my mobile right now, I’ll add  
my own osdmap to the thread soon.


Zitat von Eugen Block :


Thanks, Frank, I appreciate your help.
I already asked for the osdmap, but I’ll also try to find a reproducer.

Zitat von Frank Schilder :


Hi Eugen,

thanks for this clarification. Yes, with the observations you  
describe for transition 1->2, something is very wrong. Nothing  
should happen. Unfortunately, I'm going to be on holidays and,  
generally, don't have too much time. If they can afford to share  
the osdmap (ceph osd getmap -o file), I could also take a look at  
some point.


I don't think it has to do with set_choose_tries, there is likely  
something else screwed up badly. There should simply not be any  
remapping going on at this stage. Just for fun, you should be able  
to produce a clean crushmap from scratch with a similar or the  
same tree and check if you see the same problems.


Using the full osdmap with osdmaptool allows to reproduce the  
exact mappings as used in the cluster and it encodes other  
important information as well. That's why I'm asking for this  
instead of just the crush map.


Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Eugen Block 
Sent: Thursday, May 23, 2024 1:26 PM
To: Frank Schilder
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: unknown PGs after adding hosts in  
different subtree


Hi Frank,

thanks for chiming in here.


Please correct if this is wrong. Assuming its correct, I conclude
the following.


You assume correctly.


Now, from your description it is not clear to me on which of the
transitions 1->2 or 2->3 you observe
- peering and/or
- unknown PGs.


The unknown PGs were observed during/after 1 -> 2. All or almost all
PGs were reported as "remapped", I don't remember the exact number,
but it was more than 4k, and the largest pool has 4096 PGs. We didn't
see down OSDs at all.
Only after moving the hosts into their designated location (the DCs)
the unknown PGs cleared and the application resumed its operation.

I don't want to overload this thread but I asked for a copy of their crushmap
to play around a bit.

[ceph-users] Best practice regarding rgw scaling

2024-05-23 Thread Szabo, Istvan (Agoda)
Hi,

I wonder what the best practice is to scale RGW: increase the thread numbers or
spin up more gateways?


  * Let's say I have 21000 connections on my haproxy
  * I have 3 physical gateway servers, so let's say each of them needs to serve
    7000 connections

This means that with a 512 thread pool size, each of them would need 13 gateways,
39 altogether in the cluster.
Or
3 gateways, each with 8192 RGW threads?

Thank you


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Eugen Block
In my small lab cluster I can at least reproduce that a bunch of PGs  
are remapped after adding hosts to the default root, but they are not  
in their designated location yet. I have 3 „rooms“ underneath the  
default root. Although I can’t reproduce the unknown PGs, maybe this  
is enough to investigate? I’m on my mobile right now, I’ll add my own  
osdmap to the thread soon.


Zitat von Eugen Block :


Thanks, Frank, I appreciate your help.
I already asked for the osdmap, but I’ll also try to find a reproducer.

Zitat von Frank Schilder :


Hi Eugen,

thanks for this clarification. Yes, with the observations you  
describe for transition 1->2, something is very wrong. Nothing  
should happen. Unfortunately, I'm going to be on holidays and,  
generally, don't have too much time. If they can afford to share  
the osdmap (ceph osd getmap -o file), I could also take a look at  
some point.


I don't think it has to do with set_choose_tries, there is likely  
something else screwed up badly. There should simply not be any  
remapping going on at this stage. Just for fun, you should be able  
to produce a clean crushmap from scratch with a similar or the same  
tree and check if you see the same problems.


Using the full osdmap with osdmaptool allows to reproduce the exact  
mappings as used in the cluster and it encodes other important  
information as well. That's why I'm asking for this instead of just  
the crush map.


Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Eugen Block 
Sent: Thursday, May 23, 2024 1:26 PM
To: Frank Schilder
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: unknown PGs after adding hosts in  
different subtree


Hi Frank,

thanks for chiming in here.


Please correct if this is wrong. Assuming its correct, I conclude
the following.


You assume correctly.


Now, from your description it is not clear to me on which of the
transitions 1->2 or 2->3 you observe
- peering and/or
- unknown PGs.


The unknown PGs were observed during/after 1 -> 2. All or almost all
PGs were reported as "remapped", I don't remember the exact number,
but it was more than 4k, and the largest pool has 4096 PGs. We didn't
see down OSDs at all.
Only after moving the hosts into their designated location (the DCs)
the unknown PGs cleared and the application resumed its operation.

I don't want to overload this thread but I asked for a copy of their
crushmap to play around a bit. I moved the new hosts out of the DCs
into the default root via 'crushtool --move ...', then running the
crushtool --test command

# crushtool -i crushmap --test --rule 1 --num-rep 18
--show-choose-tries [--show-bad-mappings] --show-utilization

results in a couple of issues:

- there are lots of bad mappings no matter how high the number for
set_choose_tries is set
- the show-utilization output shows 240 OSDs in usage (there were 240
OSDs before the expansion), but plenty of them have only 9 chunks
assigned:

rule 1 (rule-ec-k7m11), x = 0..1023, numrep = 18..18
rule 1 (rule-ec-k7m11) num_rep 18 result size == 0: 55/1024
rule 1 (rule-ec-k7m11) num_rep 18 result size == 9: 488/1024
rule 1 (rule-ec-k7m11) num_rep 18 result size == 18:481/1024

And this reminds me of the inactive PGs we saw before I failed the
mgr, those inactive PGs showed only 9 chunks in the acting set. With
k=7 (and min_size=8) that should still be enough, we have successfully
tested disaster recovery with one entire DC down multiple times.

- with --show-mappings some lines contain an empty set like this:

CRUSH rule 1 x 22 []

And one more observation: with the currently active crushmap there are
no bad mappings at all when the hosts are in their designated location.
So there's definitely something wrong here, I just can't tell what it
is yet. I'll play a bit more with that crushmap...

Thanks!
Eugen


Zitat von Frank Schilder :


Hi Eugen,

I'm afraid the description of your observation breaks a bit with
causality and this might be the reason for the few replies. To
produce a bit more structure for when exactly what happened, let's
look at what I did and didn't get:

Before adding the hosts you have situation

1)
default
 DCA
   host A1 ... AN
 DCB
   host B1 ... BM

Now you add K+L hosts, they go into the default root and we have situation

2)
default
 host C1 ... CK, D1 ... DL
 DCA
   host A1 ... AN
 DCB
   host B1 ... BM

As a last step, you move the hosts to their final locations and we
arrive at situation

3)
default
 DCA
   host A1 ... AN, C1 ... CK
 DCB
   host B1 ... BM, D1 ... DL

Please correct if this is wrong. Assuming its correct, I conclude
the following.

Now, from your description it is not clear to me on which of the
transitions 1->2 or 2->3 you observe
- peering and/or
- unknown PGs.

We use a somewhat similar procedure except that we have a second
root (separate disjoint tree) for new hosts/OSDs. However, in terms
of peering it is the same, and if everything is configured correctly I would
expect no peering or remapping when the hosts are added (1->2), and peering
with remapped objects only when they are moved into place (2->3).

[ceph-users] Re: Status of 18.2.3

2024-05-23 Thread Sake Ceph
I don't have access to Slack, but thank you for all your work! Fingers crossed 
for a quick release. 

Kind regards, 
Sake

> Op 23-05-2024 16:20 CEST schreef Yuri Weinstein :
> 
>  
> We are still working on the last-minute fixes, see this for details
> https://ceph-storage.slack.com/archives/C054Q1NUBQT/p1711041666180929
> 
> Regards
> YuriW
> 
> On Thu, May 23, 2024 at 6:22 AM Sake Ceph  wrote:
> >
> > I was wondering what happened to the release of 18.2.3? Validation started 
> > on April 13th and as far as I know there have been a couple of builds and 
> > some extra bug fixes. Is there a way to follow a release or what is holding 
> > it back?
> >
> > Normally I wouldn't ask about a release and just wait, but I really need 
> > some fixes of this release.
> >
> > Kind regards,
> > Sake
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Status of 18.2.3

2024-05-23 Thread Yuri Weinstein
We are still working on the last-minute fixes, see this for details
https://ceph-storage.slack.com/archives/C054Q1NUBQT/p1711041666180929

Regards
YuriW

On Thu, May 23, 2024 at 6:22 AM Sake Ceph  wrote:
>
> I was wondering what happened to the release of 18.2.3? Validation started on 
> April 13th and as far as I know there have been a couple of builds and some 
> extra bug fixes. Is there a way to follow a release or what is holding it 
> back?
>
> Normally I wouldn't ask about a release and just wait, but I really need some 
> fixes of this release.
>
> Kind regards,
> Sake
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Eugen Block

Thanks, Frank, I appreciate your help.
I already asked for the osdmap, but I’ll also try to find a reproducer.

Zitat von Frank Schilder :


Hi Eugen,

thanks for this clarification. Yes, with the observations you  
describe for transition 1->2, something is very wrong. Nothing  
should happen. Unfortunately, I'm going to be on holidays and,  
generally, don't have too much time. If they can afford to share the  
osdmap (ceph osd getmap -o file), I could also take a look at some  
point.


I don't think it has to do with set_choose_tries, there is likely  
something else screwed up badly. There should simply not be any  
remapping going on at this stage. Just for fun, you should be able  
to produce a clean crushmap from scratch with a similar or the same  
tree and check if you see the same problems.


Using the full osdmap with osdmaptool allows to reproduce the exact  
mappings as used in the cluster and it encodes other important  
information as well. That's why I'm asking for this instead of just  
the crush map.


Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Eugen Block 
Sent: Thursday, May 23, 2024 1:26 PM
To: Frank Schilder
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: unknown PGs after adding hosts in  
different subtree


Hi Frank,

thanks for chiming in here.


Please correct if this is wrong. Assuming its correct, I conclude
the following.


You assume correctly.


Now, from your description it is not clear to me on which of the
transitions 1->2 or 2->3 you observe
- peering and/or
- unknown PGs.


The unknown PGs were observed during/after 1 -> 2. All or almost all
PGs were reported as "remapped", I don't remember the exact number,
but it was more than 4k, and the largest pool has 4096 PGs. We didn't
see down OSDs at all.
Only after moving the hosts into their designated location (the DCs)
the unknown PGs cleared and the application resumed its operation.

I don't want to overload this thread but I asked for a copy of their
crushmap to play around a bit. I moved the new hosts out of the DCs
into the default root via 'crushtool --move ...', then running the
crushtool --test command

# crushtool -i crushmap --test --rule 1 --num-rep 18
--show-choose-tries [--show-bad-mappings] --show-utilization

results in a couple of issues:

- there are lots of bad mappings no matter how high the number for
set_choose_tries is set
- the show-utilization output shows 240 OSDs in usage (there were 240
OSDs before the expansion), but plenty of them have only 9 chunks
assigned:

rule 1 (rule-ec-k7m11), x = 0..1023, numrep = 18..18
rule 1 (rule-ec-k7m11) num_rep 18 result size == 0: 55/1024
rule 1 (rule-ec-k7m11) num_rep 18 result size == 9: 488/1024
rule 1 (rule-ec-k7m11) num_rep 18 result size == 18:481/1024

And this reminds me of the inactive PGs we saw before I failed the
mgr, those inactive PGs showed only 9 chunks in the acting set. With
k=7 (and min_size=8) that should still be enough, we have successfully
tested disaster recovery with one entire DC down multiple times.

- with --show-mappings some lines contain an empty set like this:

CRUSH rule 1 x 22 []

And one more observation: with the currently active crushmap there are
no bad mappings at all when the hosts are in their designated location.
So there's definitely something wrong here, I just can't tell what it
is yet. I'll play a bit more with that crushmap...

Thanks!
Eugen


Zitat von Frank Schilder :


Hi Eugen,

I'm afraid the description of your observation breaks a bit with
causality and this might be the reason for the few replies. To
produce a bit more structure for when exactly what happened, let's
look at what I did and didn't get:

Before adding the hosts you have situation

1)
default
  DCA
host A1 ... AN
  DCB
host B1 ... BM

Now you add K+L hosts, they go into the default root and we have situation

2)
default
  host C1 ... CK, D1 ... DL
  DCA
host A1 ... AN
  DCB
host B1 ... BM

As a last step, you move the hosts to their final locations and we
arrive at situation

3)
default
  DCA
host A1 ... AN, C1 ... CK
  DCB
host B1 ... BM, D1 ... DL

Please correct if this is wrong. Assuming its correct, I conclude
the following.

Now, from your description it is not clear to me on which of the
transitions 1->2 or 2->3 you observe
- peering and/or
- unknown PGs.

We use a somewhat similar procedure except that we have a second
root (separate disjoint tree) for new hosts/OSDs. However, in terms
of peering it is the same and if everything is configured correctly
I would expect this to happen (this is what happens when we add
OSDs/hosts):

transition 1->2: hosts get added: no peering, no remapped objects,
nothing, just new OSDs doing nothing
transition 2->3: hosts get moved: peering starts and remapped
objects appear, all PGs active+clean

Unknown PGs should not occur (maybe only temporarily when the
primary changes or the PG is slow to respond/report status??).

[ceph-users] Re: User + Dev Meetup Tomorrow!

2024-05-23 Thread Laura Flores
Hi all,

The meeting will be starting shortly! Join us at this link:
https://meet.jit.si/ceph-user-dev-monthly

- Laura

On Wed, May 22, 2024 at 2:55 PM Laura Flores  wrote:

> Hi all,
>
> The User + Dev Meetup will be held tomorrow at 10:00 AM EDT. We will be
> discussing the results of the latest survey, and users who attend will have
> the opportunity to provide additional feedback in real time.
>
> See you there!
> Laura Flores
>
> Meeting Details:
> https://www.meetup.com/ceph-user-group/events/300883526/
>
> --
>
> Laura Flores
>
> She/Her/Hers
>
> Software Engineer, Ceph Storage 
>
> Chicago, IL
>
> lflo...@ibm.com | lflo...@redhat.com 
> M: +17087388804
>
>
>

-- 

Laura Flores

She/Her/Hers

Software Engineer, Ceph Storage 

Chicago, IL

lflo...@ibm.com | lflo...@redhat.com 
M: +17087388804
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Status of 18.2.3

2024-05-23 Thread Sake Ceph
I was wondering what happened to the release of 18.2.3? Validation started on 
April 13th and as far as I know there have been a couple of builds and some 
extra bug fixes. Is there a way to follow a release or what is holding it back?

Normally I wouldn't ask about a release and just wait, but I really need some 
fixes of this release.

Kind regards,
Sake
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: does the RBD client block write when the Watcher times out?

2024-05-23 Thread Frank Schilder
Hi, we ran into the same issue and there is actually another use case:
live-migration of VMs. This requires an RBD image to be mapped by two clients
simultaneously, so this is intentional. If multiple clients map an image in
RW mode, the ceph back-end will cycle the write lock between the clients to
allow each of them to flush writes; this is intentional as well. Coordinating
this is the job of the orchestrator. In this case specifically, it explicitly
manages a write lock during live-migration such that writes occur in the
correct order.

It's not a ceph job, it's an orchestration job. The rbd interface just provides
the tools to do it; for example, you can attach information that helps you
hunt down dead-looking clients and kill them properly before mapping an image
somewhere else.
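
For completeness, the kind of tooling I mean looks like this (pool/image names
are made up):

rbd status rbd/vm-disk-1                       # list current watchers (client address, cookie)
rbd lock ls rbd/vm-disk-1                      # list advisory locks, if your orchestrator uses them
rbd lock rm rbd/vm-disk-1 <lock-id> <locker>   # break a stale lock once the old client is confirmed dead
ceph osd blocklist add <client-addr>           # fence the old client ("blacklist" on older releases)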

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Ilya Dryomov 
Sent: Thursday, May 23, 2024 2:05 PM
To: Yuma Ogami
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: does the RBD client block write when the Watcher 
times out?

On Thu, May 23, 2024 at 4:48 AM Yuma Ogami  wrote:
>
> Hello.
>
> I'm currently verifying the behavior of RBD on failure. I'm wondering
> about the consistency of RBD images after network failures. As a
> result of my investigation, I found that RBD sets a Watcher to RBD
> image if a client mounts this volume to prevent multiple mounts. In

Hi Yuma,

The watcher is created to watch for updates (technically, to listen to
notifications) on the RBD image, not to prevent multiple mounts.  RBD
allows the same image to be mapped multiple times on the same node or
on different nodes.

> addition, I found that if the client is isolated from the network for
> a long time, the Watcher is released. However, the client still mounts
> this image. In this situation, if another client can also mount this
> image and the image is writable from both clients, data corruption
> occurs. Could you tell me whether this is a realistic scenario?

Yes, this is a realistic scenario which can occur even if the client
isn't isolated from the network.  If the user does this, it's up to the
user to ensure that everything remains consistent.  One use case for
mapping the same image on multiple nodes is a clustered (also referred
to as a shared disk) filesystem, such as OCFS2.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: does the RBD client block write when the Watcher times out?

2024-05-23 Thread Ilya Dryomov
On Thu, May 23, 2024 at 4:48 AM Yuma Ogami  wrote:
>
> Hello.
>
> I'm currently verifying the behavior of RBD on failure. I'm wondering
> about the consistency of RBD images after network failures. As a
> result of my investigation, I found that RBD sets a Watcher to RBD
> image if a client mounts this volume to prevent multiple mounts. In

Hi Yuma,

The watcher is created to watch for updates (technically, to listen to
notifications) on the RBD image, not to prevent multiple mounts.  RBD
allows the same image to be mapped multiple times on the same node or
on different nodes.

> addition, I found that if the client is isolated from the network for
> a long time, the Watcher is released. However, the client still mounts
> this image. In this situation, if another client can also mount this
> image and the image is writable from both clients, data corruption
> occurs. Could you tell me whether this is a realistic scenario?

Yes, this is a realistic scenario which can occur even if the client
isn't isolated from the network.  If the user does this, it's up to the
user to ensure that everything remains consistent.  One use case for
mapping the same image on multiple nodes is a clustered (also referred
to as a shared disk) filesystem, such as OCFS2.
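
As a practical aside (the image name below is a placeholder): the watchers can
be inspected with "rbd status", and if you want single-writer semantics you can
map with the exclusive option so that automatic lock transitions are disabled:

rbd status rbd/testimg                        # show current watchers on the image
rbd feature enable rbd/testimg exclusive-lock # if the feature is not already enabled
rbd map --exclusive rbd/testimg               # hold the exclusive lock until unmap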

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Frank Schilder
Hi Eugen,

thanks for this clarification. Yes, with the observations you describe for 
transition 1->2, something is very wrong. Nothing should happen. Unfortunately, 
I'm going to be on holidays and, generally, don't have too much time. If they 
can afford to share the osdmap (ceph osd getmap -o file), I could also take a 
look at some point.

I don't think it has to do with set_choose_tries, there is likely something 
else screwed up badly. There should simply not be any remapping going on at 
this stage. Just for fun, you should be able to produce a clean crushmap from 
scratch with a similar or the same tree and check if you see the same problems.

Using the full osdmap with osdmaptool allows to reproduce the exact mappings as 
used in the cluster and it encodes other important information as well. That's 
why I'm asking for this instead of just the crush map.
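
For reference, the workflow I have in mind is roughly this (file names and the
pool id are placeholders):

ceph osd getmap -o osdmap.bin
osdmaptool osdmap.bin --test-map-pgs-dump --pool 20   # dump the exact PG->OSD mappings of one pool
osdmaptool osdmap.bin --export-crush crushmap.bin     # extract the crush map for separate inspection
crushtool -d crushmap.bin -o crushmap.txt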

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Eugen Block 
Sent: Thursday, May 23, 2024 1:26 PM
To: Frank Schilder
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: unknown PGs after adding hosts in different 
subtree

Hi Frank,

thanks for chiming in here.

> Please correct if this is wrong. Assuming its correct, I conclude
> the following.

You assume correctly.

> Now, from your description it is not clear to me on which of the
> transitions 1->2 or 2->3 you observe
> - peering and/or
> - unknown PGs.

The unknown PGs were observed during/after 1 -> 2. All or almost all
PGs were reported as "remapped", I don't remember the exact number,
but it was more than 4k, and the largest pool has 4096 PGs. We didn't
see down OSDs at all.
Only after moving the hosts into their designated location (the DCs)
the unknown PGs cleared and the application resumed its operation.

I don't want to overload this thread but I asked for a copy of their
crushmap to play around a bit. I moved the new hosts out of the DCs
into the default root via 'crushtool --move ...', then running the
crushtool --test command

# crushtool -i crushmap --test --rule 1 --num-rep 18
--show-choose-tries [--show-bad-mappings] --show-utilization

results in a couple of issues:

- there are lots of bad mappings no matter how high the number for
set_choose_tries is set
- the show-utilization output shows 240 OSDs in usage (there were 240
OSDs before the expansion), but plenty of them have only 9 chunks
assigned:

rule 1 (rule-ec-k7m11), x = 0..1023, numrep = 18..18
rule 1 (rule-ec-k7m11) num_rep 18 result size == 0: 55/1024
rule 1 (rule-ec-k7m11) num_rep 18 result size == 9: 488/1024
rule 1 (rule-ec-k7m11) num_rep 18 result size == 18:481/1024

And this reminds me of the inactive PGs we saw before I failed the
mgr, those inactive PGs showed only 9 chunks in the acting set. With
k=7 (and min_size=8) that should still be enough, we have successfully
tested disaster recovery with one entire DC down multiple times.

- with --show-mappings some lines contain an empty set like this:

CRUSH rule 1 x 22 []

And one more observation: with the currently active crushmap there are
no bad mappings at all when the hosts are in their designated location.
So there's definitely something wrong here, I just can't tell what it
is yet. I'll play a bit more with that crushmap...

Thanks!
Eugen


Zitat von Frank Schilder :

> Hi Eugen,
>
> I'm afraid the description of your observation breaks a bit with
> causality and this might be the reason for the few replies. To
> produce a bit more structure for when exactly what happened, let's
> look at what I did and didn't get:
>
> Before adding the hosts you have situation
>
> 1)
> default
>   DCA
> host A1 ... AN
>   DCB
> host B1 ... BM
>
> Now you add K+L hosts, they go into the default root and we have situation
>
> 2)
> default
>   host C1 ... CK, D1 ... DL
>   DCA
> host A1 ... AN
>   DCB
> host B1 ... BM
>
> As a last step, you move the hosts to their final locations and we
> arrive at situation
>
> 3)
> default
>   DCA
> host A1 ... AN, C1 ... CK
>   DCB
> host B1 ... BM, D1 ... DL
>
> Please correct if this is wrong. Assuming its correct, I conclude
> the following.
>
> Now, from your description it is not clear to me on which of the
> transitions 1->2 or 2->3 you observe
> - peering and/or
> - unknown PGs.
>
> We use a somewhat similar procedure except that we have a second
> root (separate disjoint tree) for new hosts/OSDs. However, in terms
> of peering it is the same and if everything is configured correctly
> I would expect this to happen (this is what happens when we add
> OSDs/hosts):
>
> transition 1->2: hosts get added: no peering, no remapped objects,
> nothing, just new OSDs doing nothing
> transition 2->3: hosts get moved: peering starts and remapped
> objects appear, all PGs active+clean
>
> Unknown PGs should not occur (maybe only temporarily when the
> primary changes or the PG is slow to respond/report status??).
>

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Eugen Block

Hi Frank,

thanks for chiming in here.

Please correct if this is wrong. Assuming its correct, I conclude  
the following.


You assume correctly.

Now, from your description it is not clear to me on which of the  
transitions 1->2 or 2->3 you observe

- peering and/or
- unknown PGs.


The unknown PGs were observed during/after 1 -> 2. All or almost all  
PGs were reported as "remapped", I don't remember the exact number,  
but it was more than 4k, and the largest pool has 4096 PGs. We didn't  
see down OSDs at all.
Only after moving the hosts into their designated location (the DCs)  
the unknown PGs cleared and the application resumed its operation.


I don't want to overload this thread but I asked for a copy of their  
crushmap to play around a bit. I moved the new hosts out of the DCs  
into the default root via 'crushtool --move ...', then running the  
crushtool --test command


# crushtool -i crushmap --test --rule 1 --num-rep 18  
--show-choose-tries [--show-bad-mappings] --show-utilization


results in a couple of issues:

- there are lots of bad mappings no matter how high the number for  
set_choose_tries is set
- the show-utilization output shows 240 OSDs in usage (there were 240  
OSDs before the expansion), but plenty of them have only 9 chunks  
assigned:


rule 1 (rule-ec-k7m11), x = 0..1023, numrep = 18..18
rule 1 (rule-ec-k7m11) num_rep 18 result size == 0: 55/1024
rule 1 (rule-ec-k7m11) num_rep 18 result size == 9: 488/1024
rule 1 (rule-ec-k7m11) num_rep 18 result size == 18:481/1024

And this reminds me of the inactive PGs we saw before I failed the  
mgr, those inactive PGs showed only 9 chunks in the acting set. With  
k=7 (and min_size=8) that should still be enough, we have successfully  
tested disaster recovery with one entire DC down multiple times.


- with --show-mappings some lines contain an empty set like this:

CRUSH rule 1 x 22 []

And one more observation: with the currently active crushmap there are  
no bad mappings at all when the hosts are in their designated location.
So there's definitely something wrong here, I just can't tell what it  
is yet. I'll play a bit more with that crushmap...
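
(For anyone who wants to experiment along, a decompile/edit/re-test cycle could
look roughly like this; the rule id and num-rep match the command above, and
the set_choose_tries value is only an example:)

crushtool -d crushmap -o crushmap.txt
# edit the EC rule, e.g. raise "step set_choose_tries 100" to 200 or higher
crushtool -c crushmap.txt -o crushmap.new
crushtool -i crushmap.new --test --rule 1 --num-rep 18 --show-bad-mappings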


Thanks!
Eugen


Zitat von Frank Schilder :


Hi Eugen,

I'm afraid the description of your observation breaks a bit with  
causality and this might be the reason for the few replies. To  
produce a bit more structure for when exactly what happened, let's  
look at what I did and didn't get:


Before adding the hosts you have situation

1)
default
  DCA
host A1 ... AN
  DCB
host B1 ... BM

Now you add K+L hosts, they go into the default root and we have situation

2)
default
  host C1 ... CK, D1 ... DL
  DCA
host A1 ... AN
  DCB
host B1 ... BM

As a last step, you move the hosts to their final locations and we  
arrive at situation


3)
default
  DCA
host A1 ... AN, C1 ... CK
  DCB
host B1 ... BM, D1 ... DL

Please correct if this is wrong. Assuming its correct, I conclude  
the following.


Now, from your description it is not clear to me on which of the  
transitions 1->2 or 2->3 you observe

- peering and/or
- unknown PGs.

We use a somewhat similar procedure except that we have a second  
root (separate disjoint tree) for new hosts/OSDs. However, in terms  
of peering it is the same and if everything is configured correctly  
I would expect this to happen (this is what happens when we add  
OSDs/hosts):


transition 1->2: hosts get added: no peering, no remapped objects,  
nothing, just new OSDs doing nothing
transition 2->3: hosts get moved: peering starts and remapped  
objects appear, all PGs active+clean


Unknown PGs should not occur (maybe only temporarily when the  
primary changes or the PG is slow to respond/report status??). The  
crush bug with too few set_choose_tries is observed if one has *just  
enough hosts* for the EC profile and should not be observed if all  
PGs are active+clean and one *adds hosts*. Persistent unknown PGs  
can (to my understanding, does unknown mean "has no primary"?) only  
occur if the number of PGs changes (autoscaler messing around??)  
because all PGs were active+clean before. The crush bug leads to  
incomplete PGs, so PGs can go incomplete but they should always have  
an acting primary.


This is assuming no OSDs went down/out during the process.

Can you please check if my interpretation is correct and describe at  
which step exactly things start diverging from my expectations.


Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Eugen Block 
Sent: Thursday, May 23, 2024 12:05 PM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: unknown PGs after adding hosts in different subtree

Hi again,

I'm still wondering if I misunderstand some of the ceph concepts.
Let's assume the choose_tries value is too low and ceph can't find
enough OSDs for the remapping. I would expect that there are some PG
chunks in remapping state or unknown or whatever, but why would it affect the
otherwise healthy cluster in such a way?

[ceph-users] Re: Pacific 16.2.15 and ceph-volume no-longer creating LVM on block.db partition

2024-05-23 Thread Bruno Canning
I should add, we are not using cephadm.

From: Bruno Canning
Sent: 23 May 2024 11:36
To: ceph-users@ceph.io
Subject: Pacific 16.2.15 and ceph-volume no-longer creating LVM on block.db 
partition

Hi Folks,

After recently upgrading from 16.2.13 to 16.2.15, when I run:

ceph-volume lvm create --data /dev/sda --block.db /dev/nvme0n1p1

to create a new OSD after replacement of a failed disk, ceph-volume no longer 
creates a volume group/logical volume on the block.db partition*. This is the 
old behaviour we saw whilst running Octopus, but we had not seen it since our 
upgrade to Pacific until now. Any ideas why this is happening? Nothing is 
jumping out of the release notes, but they do mention that a lot of work was 
done on the LVM code in ceph-volume since 16.2.13. Is this now a configuration 
setting somewhere?

Background:
Our storage nodes are 60 HDDs per chassis with two 2TB NVMe drives, each 
containing 30 partitions used as block.db storage, one partition and one HDD 
per OSD. We are running Ubuntu 20.04 and Ceph Pacific 16.2.15, just upgraded 
from 16.2.13. We are using the official Ubuntu packages.

*so /var/lib/ceph/osd/ceph-1090/block.db points to /dev/nvme0n1p1
and not 
/dev/ceph-11992480-4ea6-4c1a-80b0-e025a66539b2/osd-db-52655aa3-9102-4d92-8a63-f2c93eed984f
 which would be on /dev/nvme0n1p1

--
The Wellcome Sanger Institute is operated by Genome Research Limited, a charity 
registered in England with number 1021457 and a company registered in England 
with number 2742969, whose registered office is Wellcome Sanger Institute, 
Wellcome Genome Campus, Hinxton, CB10 1SA.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Frank Schilder
Hi Eugen,

I'm afraid the description of your observation breaks a bit with causality and 
this might be the reason for the few replies. To produce a bit more structure 
for when exactly what happened, let's look at what I did and didn't get:

Before adding the hosts you have situation

1)
default
  DCA
host A1 ... AN
  DCB
host B1 ... BM

Now you add K+L hosts, they go into the default root and we have situation

2)
default
  host C1 ... CK, D1 ... DL
  DCA
host A1 ... AN
  DCB
host B1 ... BM

As a last step, you move the hosts to their final locations and we arrive at 
situation

3)
default
  DCA
host A1 ... AN, C1 ... CK
  DCB
host B1 ... BM, D1 ... DL

Please correct me if this is wrong. Assuming it's correct, I conclude the following.

Now, from your description it is not clear to me on which of the transitions 
1->2 or 2->3 you observe
- peering and/or
- unknown PGs.

We use a somewhat similar procedure except that we have a second root (separate 
disjoint tree) for new hosts/OSDs. However, in terms of peering it is the same 
and if everything is configured correctly I would expect this to happen (this 
is what happens when we add OSDs/hosts):

transition 1->2: hosts get added: no peering, no remapped objects, nothing, 
just new OSDs doing nothing
transition 2->3: hosts get moved: peering starts and remapped objects appear, 
all PGs active+clean
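
(As a sketch of step 2->3, assuming a "datacenter" bucket type and the host
names from the example above:)

ceph osd crush move C1 datacenter=DCA
ceph osd crush move D1 datacenter=DCB
# repeat per host; peering and remapping should only start at this point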

Unknown PGs should not occur (maybe only temporarily when the primary changes 
or the PG is slow to respond/report status??). The crush bug with too few 
set_choose_tries is observed if one has *just enough hosts* for the EC profile 
and should not be observed if all PGs are active+clean and one *adds hosts*. 
Persistent unknown PGs can (to my understanding, does unknown mean "has no 
primary"?) only occur if the number of PGs changes (autoscaler messing 
around??) because all PGs were active+clean before. The crush bug leads to 
incomplete PGs, so PGs can go incomplete but they should always have an acting 
primary.

This is assuming no OSDs went down/out during the process.

Can you please check if my interpretation is correct and describe at which step 
exactly things start diverging from my expectations.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Eugen Block 
Sent: Thursday, May 23, 2024 12:05 PM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: unknown PGs after adding hosts in different subtree

Hi again,

I'm still wondering if I misunderstand some of the ceph concepts.
Let's assume the choose_tries value is too low and ceph can't find
enough OSDs for the remapping. I would expect that there are some PG
chunks in remapping state or unknown or whatever, but why would it
affect the otherwise healthy cluster in such a way?
Even if ceph doesn't know where to put some of the chunks, I wouldn't
expect inactive PGs and a service interruption.
What am I missing here?

Thanks,
Eugen

Zitat von Eugen Block :

> Thanks, Konstantin.
> It's been a while since I was last bitten by the choose_tries being
> too low... Unfortunately, I won't be able to verify that... But I'll
> definitely keep that in mind, or at least I'll try to. :-D
>
> Thanks!
>
> Zitat von Konstantin Shalygin :
>
>> Hi Eugen
>>
>>> On 21 May 2024, at 15:26, Eugen Block  wrote:
>>>
>>> step set_choose_tries 100
>>
>> I think you should try to increase set_choose_tries to 200
>> Last year we had a Pacific EC 8+2 deployment of 10 racks. And even
>> with 50 hosts, a value of 100 did not work for us
>>
>>
>> k


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Pacific 16.2.15 and ceph-volume no-longer creating LVM on block.db partition

2024-05-23 Thread Bruno Canning
Hi Folks,

After recently upgrading from 16.2.13 to 16.2.15, when I run:

ceph-volume lvm create --data /dev/sda --block.db /dev/nvme0n1p1

to create a new OSD after replacement of a failed disk, ceph-volume no longer 
creates a volume group/logical volume on the block.db partition*. This is the 
old behaviour we saw whilst running Octopus, but we had not seen it since our 
upgrade to Pacific until now. Any ideas why this is happening? Nothing is 
jumping out of the release notes, but they do mention that a lot of work was 
done on the LVM code in ceph-volume since 16.2.13. Is this now a configuration 
setting somewhere?

Background:
Our storage nodes are 60 HDDs per chassis with two 2TB NVMe drives, each 
containing 30 partitions used as block.db storage, one partition and one HDD 
per OSD. We are running Ubuntu 20.04 and Ceph Pacific 16.2.15, just upgraded 
from 16.2.13. We are using the official Ubuntu packages.

*so /var/lib/ceph/osd/ceph-1090/block.db points to /dev/nvme0n1p1
and not 
/dev/ceph-11992480-4ea6-4c1a-80b0-e025a66539b2/osd-db-52655aa3-9102-4d92-8a63-f2c93eed984f
 which would be on /dev/nvme0n1p1
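
(For anyone hitting the same thing, one possible interim workaround is to
create the VG/LV on the partition by hand and pass the LV to ceph-volume; the
names below are made up:)

vgcreate ceph-db-0 /dev/nvme0n1p1
lvcreate -l 100%FREE -n db-1090 ceph-db-0
ceph-volume lvm create --data /dev/sda --block.db ceph-db-0/db-1090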

--
The Wellcome Sanger Institute is operated by Genome Research Limited, a charity 
registered in England with number 1021457 and a company registered in England 
with number 2742969, whose registered office is Wellcome Sanger Institute, 
Wellcome Genome Campus, Hinxton, CB10 1SA.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Eugen Block

Hi again,

I'm still wondering if I misunderstand some of the ceph concepts.  
Let's assume the choose_tries value is too low and ceph can't find  
enough OSDs for the remapping. I would expect that there are some PG  
chunks in remapping state or unknown or whatever, but why would it  
affect the otherwise healthy cluster in such a way?
Even if ceph doesn't know where to put some of the chunks, I wouldn't  
expect inactive PGs and a service interruption.

What am I missing here?

Thanks,
Eugen

Zitat von Eugen Block :


Thanks, Konstantin.
It's been a while since I was last bitten by the choose_tries being  
too low... Unfortunately, I won't be able to verify that... But I'll  
definitely keep that in mind, or at least I'll try to. :-D


Thanks!

Zitat von Konstantin Shalygin :


Hi Eugen


On 21 May 2024, at 15:26, Eugen Block  wrote:

step set_choose_tries 100


I think you should try to increase set_choose_tries to 200
Last year we had a Pacific EC 8+2 deployment of 10 racks. And even  
with 50 hosts, a value of 100 did not work for us



k



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io