[ceph-users] Re: Erasure coding scheme 2+4 = good idea?

2024-10-10 Thread Frank Schilder
be up with a DC down. For example, it will make sure that with min_size=2 an ACK is only sent to a client if each DC has a shard. An ordinary crush rule will not do that. Stretch mode only works for replicated pools. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, ru
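For context, a minimal crush-rule sketch (not from the thread; rule name and id are made up) that places 3 of the 6 shards of a 2+4 profile in each of two datacenters. As the post notes, a rule like this alone does not give the per-DC ACK guarantee that stretch mode adds, and stretch mode itself only covers replicated pools:

    rule ec-2dc-2p4 {
        id 77                                  # hypothetical rule id
        type erasure
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default
        step choose indep 2 type datacenter    # pick both datacenters
        step chooseleaf indep 3 type host      # 3 of the 6 shards per DC
        step emit
    }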

[ceph-users] Re: What is the problem with many PGs per OSD

2024-10-10 Thread Frank Schilder
of the cluster. That's how you get scale-out capability. A fixed PG count counteracts that, given the insane increase of capacity per disk we have had lately. That's why I actually lean towards thinking that the recommendation was intended to keep PGs below 5-10G each (and or Sent: Thursday, October
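A rough back-of-the-envelope illustration of that point (drive size is hypothetical, the 5-10G figure is the one from the post):

    # the often-quoted ~100 PGs/OSD on a 16 TB drive  ->  ~160 GB per PG
    # keeping PGs at the suggested 5-10 GB instead    ->  16 TB / 10 GB ≈ 1600 PGs/OSD
    # i.e. a fixed PGs-per-OSD cap and a fixed PG size cannot both hold as drives grow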

[ceph-users] Procedure for temporary evacuation and replacement

2024-10-10 Thread Frank Schilder
after some time. I'm also wondering if UP+OUT OSDs participate in peering in case there is an OSD restart somewhere in the pool. Thanks for your input and best regards! ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 ___ ceph-u

[ceph-users] Re: What is the problem with many PGs per OSD

2024-10-10 Thread Frank Schilder
any more. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Janne Johansson Sent: Thursday, October 10, 2024 8:51 AM To: Frank Schilder Cc: Anthony D'Atri; ceph-users@ceph.io Subject: Re: [ceph-users] Re: What is the probl

[ceph-users] Re: What is the problem with many PGs per OSD

2024-10-09 Thread Frank Schilder
That I would vaguely understand: to keep the average PG size constant at a max of about 10G. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 ____ From: Anthony D'Atri Sent: Wednesday, October 9, 2024 3:52 PM To: Frank Sc

[ceph-users] Re: What is the problem with many PGs per OSD

2024-10-09 Thread Frank Schilder
hould expect. None of the discussions I have seen so far address this extreme weirdness of the recommendation. If there is an unsolved scaling problem, please, anyone, state what it is, why it's there and what the critical threshold is. What part of the code will explode? Thanks and best regards, ===

[ceph-users] Re: Forced upgrade OSD from Luminous to Pacific

2024-10-09 Thread Frank Schilder
ng and OSD logs. Maybe they are corrupted? Do they manage to read the rocksdb and get to the state where they try to join the cluster? Do they crash? You can start an OSD daemon manually to see the complete startup log live in a terminal. Best regards, ===== Frank Schilder AIT Risø Cam

[ceph-users] Re: What is the problem with many PGs per OSD

2024-10-09 Thread Frank Schilder
that a dev drops by and can comment on that with background from the implementation. I just won't be satisfied with speculation this time around and will keep bugging. Thanks and best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 _

[ceph-users] Re: What is the problem with many PGs per OSD

2024-10-09 Thread Frank Schilder
nd best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Anthony D'Atri Sent: Wednesday, October 9, 2024 2:40 AM To: Frank Schilder Cc: ceph-users@ceph.io Subject: Re: [ceph-users] What is the problem with many PGs pe

[ceph-users] Re: What is the problem with many PGs per OSD

2024-10-09 Thread Frank Schilder
ou have performance metrics before/after? Did you actually observe any performance degradation? Was there an increased memory consumption? Anything that justifies making a statement alluding to (potential) negative performance impact? Thanks and best regards, = Frank Schilder AI

[ceph-users] What is the problem with many PGs per OSD

2024-10-08 Thread Frank Schilder
SD to large values>500? Thanks a lot for any clarifications in this matter! = Frank Schilder AIT Risø Campus Bygning 109, rum S14 ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: [Ceph incident] PG stuck in peering.

2024-09-26 Thread Frank Schilder
ry for the confusion and hopefully our experience reports here help other users. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to cep

[ceph-users] Re: [Ceph incident] PG stuck in peering.

2024-09-23 Thread Frank Schilder
g to avoid data loss on other PGs)." I hope you mean "waited for recovery", or what does a wipe mean here? Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: HARROUIN Loan (PRESTATAIRE CA-GIP) Sent:

[ceph-users] Re: Ceph octopus version cluster not starting

2024-09-17 Thread Frank Schilder
r 2-5 minutes. NTP shouldn't take much time to come up under normal circumstances. I'm not a systemd wizard. If you do something like this, please post it here as a reply for others to find it. Best regards, ===== Frank Schilder AIT Risø Campus Bygni

[ceph-users] Re: Ceph octopus version cluster not starting

2024-09-16 Thread Frank Schilder
case if this happens again. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Amudhan P Sent: Monday, September 16, 2024 6:19 PM To: Frank Schilder Cc: Eugen Block; ceph-users@ceph.io Subject: Re: [ceph-users] Re: Ceph

[ceph-users] Re: Ceph octopus version cluster not starting

2024-09-16 Thread Frank Schilder
l log files to be written. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Amudhan P Sent: Monday, September 16, 2024 12:18 PM To: Frank Schilder Cc: Eugen Block; ceph-users@ceph.io Subject: Re: [ceph-users] Re: C

[ceph-users] Re: Ceph octopus version cluster not starting

2024-09-16 Thread Frank Schilder
lpful. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Amudhan P Sent: Monday, September 16, 2024 10:36 AM To: Eugen Block Cc: ceph-users@ceph.io Subject: [ceph-users] Re: Ceph octopus version cluster not starting No, I don'

[ceph-users] Re: Successfully using dm-cache

2024-09-12 Thread Frank Schilder
gards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Michael Lipp Sent: Wednesday, January 31, 2024 6:23 PM To: ceph-users@ceph.io Subject: [ceph-users] Successfully using dm-cache Just in case anybody is interested: Using dm-

[ceph-users] Re: Identify laggy PGs

2024-08-15 Thread Frank Schilder
ing this to 300 PGs/OSD due to excessively long deep-scrub times per PG. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Szabo, Istvan (Agoda) Sent: Wednesday, August 14, 2024 12:00 PM To: Eugen Block; ceph-us

[ceph-users] Re: Bluestore issue using 18.2.2

2024-08-14 Thread Frank Schilder
have damage). Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Eugen Block Sent: Wednesday, August 14, 2024 9:05 AM To: ceph-users@ceph.io Subject: [ceph-users] Re: Bluestore issue using 18.2.2 Hi, it looks like y

[ceph-users] Re: 0 slow ops message stuck for down+out OSD

2024-07-29 Thread Frank Schilder
> Hi, would a mgr restart fix that? It did! The one thing we didn't try last time. We thought the message was stuck in the MONs. Thanks! ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Eugen Block Sent: Monday, July

[ceph-users] Re: snaptrim not making progress

2024-07-29 Thread Frank Schilder
bout it as well. However, this was at 9:30am but the snaptrim had been hanging since 3am. Is there any event with an OSD/disk that can cause snaptrim to stall without any health issue being detected/reported? Thanks for any pointers! ===== Frank Schilder AIT Risø Campus Bygning 10

[ceph-users] Re: snaptrim not making progress

2024-07-29 Thread Frank Schilder
48.11:0/2422413806 client.370420944 cookie=140578306156832 Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 ____ From: Frank Schilder Sent: Monday, July 29, 2024 10:24 AM To: ceph-users@ceph.io Subject: [ceph-users] snaptrim not maki

[ceph-users] Re: 0 slow ops message stuck for down+out OSD

2024-07-29 Thread Frank Schilder
Very funny, it was actually me who made this case some time ago: https://www.mail-archive.com/ceph-users@ceph.io/msg10095.html I will look into what we did last time. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14

[ceph-users] 0 slow ops message stuck for down+out OSD

2024-07-29 Thread Frank Schilder
time ago, but I can't find it. How can I get rid of this stuck warning? Our cluster is octopus latest. Thanks and best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 ___ ceph-users mailing list -- ceph-users@ceph.

[ceph-users] snaptrim not making progress

2024-07-29 Thread Frank Schilder
tarting or progressing? Thanks for any pointers! ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: How to specify id on newly created OSD with Ceph Orchestrator

2024-07-23 Thread Frank Schilder
This is fixed, but better safe than sorry. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Iztok Gregori Sent: Tuesday, July 23, 2024 9:10 AM To: ceph-users@ceph.io Subject: [ceph-users] Re: How to specify id on newl

[ceph-users] Re: Repurposing some Dell R750s for Ceph

2024-07-11 Thread Frank Schilder
go from 16x2.5" HDD to something like 24xNVMe? Maybe you could provide a bit more information here, like (links to) the wiring diagrams you mentioned? From the description I cannot entirely deduce what exactly you have and where you want to go to. Best regards, ===== Frank

[ceph-users] Re: Ceph tracker broken?

2024-07-01 Thread Frank Schilder
ow hanging fruits" and that's when it started. I got added to some related PRs and maybe on this occasion to a lot more by accident. Thanks for your help! ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Gregory Farnum

[ceph-users] Ceph tracker broken?

2024-07-01 Thread Frank Schilder
that and I have not subscribed to this tracker item (https://tracker.ceph.com/issues/66763) either. Yet, I receive unrequested updates. Could someone please take a look and try to find out what the problem is? Thanks a lot! = Frank Schilder AIT Risø Campus Bygning 109, rum S14

[ceph-users] Re: pg deep-scrub control scheme

2024-06-27 Thread Frank Schilder
Sorry, the entry point is actually https://github.com/frans42/ceph-goodies/blob/main/doc/TuningScrub.md = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Frank Schilder Sent: Thursday, June 27, 2024 9:02 AM To: David Yang; Ceph

[ceph-users] Re: pg deep-scrub control scheme

2024-06-27 Thread Frank Schilder
egards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: David Yang Sent: Thursday, June 27, 2024 3:50 AM To: Ceph Users Subject: [ceph-users] pg deep-scrub control scheme Hello everyone. I have a cluster with 8321 pgs and recently I started to

[ceph-users] Re: why not block gmail?

2024-06-17 Thread Frank Schilder
Could we at least stop approving requests from obvious spammers? Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Eneko Lacunza Sent: Monday, June 17, 2024 9:18 AM To: ceph-users@ceph.io Subject: [ceph-users] Re

[ceph-users] Re: deep scrubb and scrubb does get the job done

2024-06-13 Thread Frank Schilder
Yes, there is: https://github.com/frans42/ceph-goodies/blob/main/doc/TuningScrub.md This is work in progress and a few details are missing, but it should help you find the right parameters. Note that this is tested on octopus with WPQ. Best regards, = Frank Schilder AIT Risø

[ceph-users] Re: Can't comment on my own tracker item any more

2024-06-13 Thread Frank Schilder
uld be. I had to copy the code example back by hand. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 ____ From: Frank Schilder Sent: Thursday, June 13, 2024 11:40 PM To: ceph-users@ceph.io Subject: [ceph-users] Can't

[ceph-users] Can't comment on my own tracker item any more

2024-06-13 Thread Frank Schilder
and I'm reported as the author. I can still edit the item itself, but I'm not able to leave comments. Can someone please look into that? Thanks! ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 ___ ceph-users mailing list --

[ceph-users] Re: CephFS metadata pool size

2024-06-12 Thread Frank Schilder
our consideration. ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Anthony D'Atri Sent: Wednesday, June 12, 2024 2:53 PM To: Eugen Block Cc: Lars Köppel; ceph-users@ceph.io Subject: [ceph-users] Re: CephFS metadata pool size If you h

[ceph-users] Re: Documentation for meaning of "tag cephfs" in OSD caps

2024-06-11 Thread Frank Schilder
quot;con-fs2-data2" { "cephfs": { "data": "con-fs2" } } As of today, it seems indeed undocumented black magic and you need to search very carefully to find ceph-user cases that discuss (issues with) these tags, thereby explaining it as a side effect. Bes

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-24 Thread Frank Schilder
n case you have time, it would be great if you could collect information on (reproducing) the fatal peering problem. While remappings might be "unexpectedly expected" it is clearly a serious bug that incomplete and unknown PGs show up in the process of adding hosts at the root. Best r

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Frank Schilder
to have it happen separately from adding and not a total mess with everything in parallel. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 ____ From: Frank Schilder Sent: Thursday, May 23, 2024 6:32 PM To: Eugen B

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Frank Schilder
nding and still unresolved. In case you need to file a tracker, please consider referring to the two above as "might be related" if you deem that they are. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 _

[ceph-users] Re: does the RBD client block write when the Watcher times out?

2024-05-23 Thread Frank Schilder
job. The rbd interface just provides the tools to do it; for example, you can attach information that helps you hunt down dead-looking clients and kill them properly before mapping an image somewhere else. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Frank Schilder
the crush map. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Eugen Block Sent: Thursday, May 23, 2024 1:26 PM To: Frank Schilder Cc: ceph-users@ceph.io Subject: Re: [ceph-users] Re: unknown PGs after adding hosts in

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Frank Schilder
t during the process. Can you please check if my interpretation is correct and describe at which step exactly things start diverging from my expectations. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Eugen Block

[ceph-users] Re: How network latency affects ceph performance really with NVME only storage?

2024-05-22 Thread Frank Schilder
Hi Stefan, ahh OK, misunderstood your e-mail. It sounded like it was a custom profile, not a standard one shipped with tuned. Thanks for the clarification! = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Stefan Bauer Sent

[ceph-users] Re: How network latency affects ceph performance really with NVME only storage?

2024-05-22 Thread Frank Schilder
Hi Stefan, can you provide a link to or copy of the contents of the tuned-profile so others can also profit from it? Thanks! = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Stefan Bauer Sent: Wednesday, May 22, 2024 10:51 AM

[ceph-users] Re: dkim on this mailing list

2024-05-21 Thread Frank Schilder
Hi Marc, in case you are working on the list server, at least for me the situation seems to have improved no more than 2-3 hours ago. My own e-mails to the list now pass. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14

[ceph-users] Re: Please discuss about Slow Peering

2024-05-21 Thread Frank Schilder
a lot with IO, recovery, everything. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Anthony D'Atri Sent: Tuesday, May 21, 2024 3:06 PM To: 서민우 Cc: Frank Schilder; ceph-users@ceph.io Subject: Re: [ceph-us

[ceph-users] Re: Please discuss about Slow Peering

2024-05-21 Thread Frank Schilder
mance. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: 서민우 Sent: Tuesday, May 21, 2024 11:25 AM To: Anthony D'Atri Cc: Frank Schilder; ceph-users@ceph.io Subject: Re: [ceph-users] Please discuss about Slow Peering We used the &q

[ceph-users] Re: Please discuss about Slow Peering

2024-05-16 Thread Frank Schilder
onitor OPS latencies for your drives when peering and look for something that sticks out. People on this list were reporting quite bad results for certain infamous NVMe brands. If you state your model numbers, someone else might recognize it. Best regards, ===== Frank Schilder AIT R

[ceph-users] Re: Remove an OSD with hardware issue caused rgw 503

2024-04-30 Thread Frank Schilder
you still have the disk of the down OSD. Someone will send you the export/import commands within a short time. So stop worrying and just administer your cluster with common storage admin sense. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, ru

[ceph-users] Re: Remove an OSD with hardware issue caused rgw 503

2024-04-30 Thread Frank Schilder
to have a chance to recover data. Look at the manual of ddrescue why it is important to stop IO from a failing disk as soon as possible. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Eugen Block Sent: Saturday, April

[ceph-users] Re: Latest Doco Out Of Date?

2024-04-24 Thread Frank Schilder
an save time on the documentation, because it works like other stuff. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Eugen Block Sent: Wednesday, April 24, 2024 9:02 AM To: ceph-users@ceph.io Subject: [ceph-u

[ceph-users] (deep-)scrubs blocked by backfill

2024-04-17 Thread Frank Schilder
when things will go back to normal. Thanks a lot and best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: Have a problem with haproxy/keepalived/ganesha/docker

2024-04-16 Thread Frank Schilder
happen with this specific HA set-up in the original request, but a fail-over of the NFS server ought to be handled gracefully by starting a new one up with the IP of the down one. Or not? Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14

[ceph-users] Re: Performance improvement suggestion

2024-03-04 Thread Frank Schilder
>>> Fast write enabled would mean that the primary OSD sends #size copies to the >>> entire active set (including itself) in parallel and sends an ACK to the >>> client as soon as min_size ACKs have been received from the peers (including >>> itself). In this way, one can tolerate (size-min_size) s

[ceph-users] Re: Performance improvement suggestion

2024-03-04 Thread Frank Schilder
d external connections for the remote parts. It would be great to have similar ways of mitigating some penalties of the slow write paths to remote sites. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Peter Grandi

[ceph-users] Re: 6 pgs not deep-scrubbed in time

2024-01-29 Thread Frank Schilder
result will be. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Michel Niyoyita Sent: Monday, January 29, 2024 2:04 PM To: Janne Johansson Cc: Frank Schilder; E Taka; ceph-users Subject: Re: [ceph-users] Re: 6 pgs not deep-scrub

[ceph-users] Re: 6 pgs not deep-scrubbed in time

2024-01-29 Thread Frank Schilder
> 1. For spinners a consideration looking at the actually available drive performance is required, plus a few things more, like PG count, distribution etc. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Wesley Dil

[ceph-users] Re: 6 pgs not deep-scrubbed in time

2024-01-29 Thread Frank Schilder
sider increasing the PG count for pools with lots of data. This should already relax the situation somewhat. Then do the calc above and tune deep-scrub times per pool such that they match with disk performance. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, ru
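The kind of calculation meant here, with purely hypothetical numbers:

    # per-OSD deep-scrub load, assuming a 12 TB HDD at ~120 MB/s sequential read:
    #   time to deep-scrub all data on one OSD once: 12e12 B / 120e6 B/s ≈ 28 h
    # with osd_max_scrubs=1 the OSD therefore needs more than ~28 h of pure scrubbing per cycle,
    # so osd_deep_scrub_interval (default 7 days) must leave enough headroom for client IO,
    # or the "not deep-scrubbed in time" warnings become unavoidable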

[ceph-users] Re: 1 clients failing to respond to cache pressure (quincy:17.2.6)

2024-01-27 Thread Frank Schilder
hen until its fixed. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: 1 clients failing to respond to cache pressure (quincy:17.2.6)

2024-01-26 Thread Frank Schilder
shboard, it has no performance or otherwise negative impact. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Eugen Block Sent: Friday, January 26, 2024 10:05 AM To: Özkan Göksu Cc: ceph-users@ceph.io Subject: [cep

[ceph-users] Re: Degraded PGs on EC pool when marking an OSD out

2024-01-24 Thread Frank Schilder
this resolves the PG. If so, there is a temporary condition that prevents the PGs from becoming clean when going through the standard peering procedure. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Eug

[ceph-users] List contents of stray buckets with octopus

2024-01-24 Thread Frank Schilder
://tracker.ceph.com/issues/57059 so a "dump tree" will not work. In addition, I clearly don't just need the entries in cache, I need a listing of everything. How can I get that? I'm willing to run rados commands and pipe through ceph-dencoder if necessary. Thanks and best regards, ===
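A sketch of one way to get such a listing, based on the assumption (my recollection, not something stated in this mail) that for rank 0 the MDS stray directories are the dirfrag objects 600.00000000 through 609.00000000 in the CephFS metadata pool, with one omap key per stray dentry; the pool name is a placeholder:

    for i in 0 1 2 3 4 5 6 7 8 9; do
        rados -p <cephfs-metadata-pool> listomapkeys 60${i}.00000000
    done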

[ceph-users] Re: Degraded PGs on EC pool when marking an OSD out

2024-01-22 Thread Frank Schilder
erate valid mappings, you can pull the osdmap of your cluster and use osdmaptool to experiment with it without risk of destroying anything. It allows you to try different crush rules and failure scenarios on off-line but real cluster meta-data. Best regards, ===== Frank Schilder AIT Ris
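The workflow alluded to, roughly sketched (flags as documented for osdmaptool/crushtool; the pool id is a placeholder):

    ceph osd getmap -o osdmap.bin                       # grab the current osdmap
    osdmaptool osdmap.bin --test-map-pgs --pool 5       # see how PGs of pool 5 map today
    osdmaptool osdmap.bin --export-crush crush.bin
    crushtool -d crush.bin -o crush.txt                 # edit rules/failure domains offline
    crushtool -c crush.txt -o crush.new
    osdmaptool osdmap.bin --import-crush crush.new      # writes the modified map back into osdmap.bin
    osdmaptool osdmap.bin --test-map-pgs --pool 5       # re-check mappings with the new rules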

[ceph-users] Re: Adding OSD's results in slow ops, inactive PG's

2024-01-18 Thread Frank Schilder
, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Eugen Block Sent: Thursday, January 18, 2024 9:46 AM To: ceph-users@ceph.io Subject: [ceph-users] Re: Adding OSD's results in slow ops, inactive PG's I'm glad to hear (or read) tha

[ceph-users] Re: Performance impact of Heterogeneous environment

2024-01-18 Thread Frank Schilder
st regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Bailey Allison Sent: Thursday, January 18, 2024 12:36 AM To: ceph-users@ceph.io Subject: [ceph-users] Re: Performance impact of Heterogeneous environment +1 to this, gre

[ceph-users] Re: Recomand number of k and m erasure code

2024-01-15 Thread Frank Schilder
cy bits in it. We had no service outages during such operations. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Anthony D'Atri Sent: Saturday, January 13, 2024 5:36 PM To: Phong Tran Thanh Cc: ceph-users@

[ceph-users] Re: 3 DC with 4+5 EC not quite working

2024-01-12 Thread Frank Schilder
Is it maybe this here: https://docs.ceph.com/en/reef/rados/troubleshooting/troubleshooting-pg/#crush-gives-up-too-soon I always have to tweak the num-tries parameters. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14
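The tweak described on that doc page, for reference (the rule id, num-rep and the value 100 are just the usual starting points):

    ceph osd getcrushmap -o crush.bin
    crushtool -d crush.bin -o crush.txt
    # in the EC rule, raise the retry budget, e.g.:
    #   step set_choose_tries 100
    crushtool -c crush.txt -o crush.new.bin
    crushtool -i crush.new.bin --test --show-bad-mappings \
        --rule 1 --num-rep 9 --min-x 1 --max-x 10000    # should report no bad mappings
    ceph osd setcrushmap -i crush.new.bin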

[ceph-users] Re: Rack outage test failing when nodes get integrated again

2024-01-11 Thread Frank Schilder
he time to file a tracker issue. I observed this with mimic, but since you report it for Pacific I'm pretty sure it's affecting all versions. My guess is that this is not part of the CI testing, at least not in a way that covers network cut-off. Best regards, ===== Frank Schilder

[ceph-users] Re: How to configure something like osd_deep_scrub_min_interval?

2024-01-09 Thread Frank Schilder
nd am waiting for some deep-scrub histograms to converge to equilibrium. This takes months for our large pools, but I would like to have the numbers for an example of what it should look like. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, ru

[ceph-users] Re: How to configure something like osd_deep_scrub_min_interval?

2023-12-15 Thread Frank Schilder
Hi all, another quick update: please use this link to download the script: https://github.com/frans42/ceph-goodies/blob/main/scripts/pool-scrub-report The one I sent originally does not follow latest. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14

[ceph-users] Re: How to configure something like osd_deep_scrub_min_interval?

2023-12-13 Thread Frank Schilder
in general. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-us

[ceph-users] Re: increasing number of (deep) scrubs

2023-12-13 Thread Frank Schilder
Yes, octopus. -- Frank = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Szabo, Istvan (Agoda) Sent: Wednesday, December 13, 2023 6:13 AM To: Frank Schilder; ceph-users@ceph.io Subject: Re: [ceph-users] Re: increasing number of

[ceph-users] Re: increasing number of (deep) scrubs

2023-12-12 Thread Frank Schilder
ick update in the other thread, because the solution was not to increase the number of scrubs, but to tune parameters. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 ____ From: Frank Schilder Sent: Monday, January 9, 2023

[ceph-users] Re: How to configure something like osd_deep_scrub_min_interval?

2023-12-12 Thread Frank Schilder
(42.0i) mon.ceph-01 mon_warn_pg_not_deep_scrubbed_ratio=0.75 warn: 24.5d Best regards, merry Christmas and a happy new year to everyone! ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscr

[ceph-users] Re: ceph fs (meta) data inconsistent

2023-12-08 Thread Frank Schilder
Hi Xiubo, I will update the case. I'm afraid this will have to wait a little bit though. I'm too occupied for a while and also don't have a test cluster that would help speed things up. I will update you, please keep the tracker open. Best regards, ===== Frank Sc

[ceph-users] Re: EC Profiles & DR

2023-12-06 Thread Frank Schilder
is no way around it. I was happy when I got the extra hosts. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Curt Sent: Wednesday, December 6, 2023 3:56 PM To: Patrick Begou Cc: ceph-users@ceph.io Subject: [ceph

[ceph-users] ceph df reports incorrect stats

2023-12-06 Thread Frank Schilder
97.20181 host ceph-20 -64 99.77657 host ceph-21 -66 103.56137 host ceph-22 -1 0 root default Best regards, ===

[ceph-users] Re: [ext] CephFS pool not releasing space after data deletion

2023-12-02 Thread Frank Schilder
Hi Mathias, have you made any progress on this? Did the capacity become available eventually? Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Kuhring, Mathias Sent: Friday, October 27, 2023 3:52 PM To: ceph

[ceph-users] Re: ceph fs (meta) data inconsistent

2023-12-01 Thread Frank Schilder
on, please include the part executed on the second host explicitly in an ssh-command. Running your scripts alone in their current form will not reproduce the issue. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From

[ceph-users] Re: ceph fs (meta) data inconsistent

2023-11-24 Thread Frank Schilder
nt to know the python and libc versions. We observe this only for newer versions of both. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Xiubo Li Sent: Thursday, November 23, 2023 3:47 AM To: Frank Schilder; Gr

[ceph-users] Re: Full cluster outage when ECONNREFUSED is triggered

2023-11-24 Thread Frank Schilder
, then there is something wrong with the down reporting that should be looked at. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Frank Schilder Sent: Friday, November 24, 2023 1:20 PM To: Denis Krienbühl; Burkhard

[ceph-users] Re: Full cluster outage when ECONNREFUSED is triggered

2023-11-24 Thread Frank Schilder
ed the relevant code lines, please update/create the tracker with your findings. Hope a dev looks at this. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Denis Krienbühl Sent: Friday, November 24, 2023 12:04 PM To: Bu

[ceph-users] Re: Full cluster outage when ECONNREFUSED is triggered

2023-11-24 Thread Frank Schilder
g the connection error. I think the intention is to quickly shut down the OSDs with connection refused (where timeouts are not required) and not other OSDs. A bug report with tracker seems warranted. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, ru

[ceph-users] Re: mds slow request with “failed to authpin, subtree is being exported"

2023-11-22 Thread Frank Schilder
directories to ranks, all our problems disappeared and performance improved a lot. MDS load dropped from 130% average to 10-20%. So did memory consumption and cache recycling. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14

[ceph-users] Re: How to use hardware

2023-11-20 Thread Frank Schilder
with large min_alloc_sizes has to be S3-like, only upload, download and delete are allowed. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Anthony D'Atri Sent: Saturday, November 18, 2023 3:24 PM To: Simon Kepp Cc: Al

[ceph-users] Re: How to configure something like osd_deep_scrub_min_interval?

2023-11-16 Thread Frank Schilder
DD pool converges to. This will need 1-2 months of observations and I will report back when significant changes show up. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 ________ From: Frank Schilder Sent: Wednesday, November 15, 20

[ceph-users] How to configure something like osd_deep_scrub_min_interval?

2023-11-15 Thread Frank Schilder
sy=0 for(pg in pgs) { split(pg_osds[pgs[pg]], osds) for(o in osds) if(osd[osds[o]]=="busy") osds_busy=1 if(osds_busy) printf(" %s*", pgs[pg])

[ceph-users] Re: ceph fs (meta) data inconsistent

2023-11-10 Thread Frank Schilder
the python >> findings above, is this something that should work on ceph or is it a python >> issue? > > Not sure yet. I need to understand what exactly shutil.copy does in kclient. Thanks! Will wait for further instructions. = Frank Schilder AIT Risø Campus B

[ceph-users] Re: ceph fs (meta) data inconsistent

2023-11-09 Thread Frank Schilder
ase provide the mds logs by setting: > [...] I can do a test with MDS logs on high level. Before I do that, looking at the python findings above, is this something that should work on ceph or is it a python issue? Thanks for your help! = Frank Schilder AIT Risø Campus Bygning 10

[ceph-users] Re: MDS stuck in rejoin

2023-11-09 Thread Frank Schilder
Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Xiubo Li Sent: Wednesday, November 8, 2023 1:38 AM To: Frank Schilder; ceph-users@ceph.io Subject: Re: [ceph-users] Re: MDS stuck in rejoin Hi Frank, Recently I fo

[ceph-users] Re: ceph fs (meta) data inconsistent

2023-11-03 Thread Frank Schilder
> Python/3.10.8-GCCcore-12.2.0-bare These are easybuild python modules using different gcc versions to build. The default version of python referred to is Python 2.7.5. Is this a known problem with python3 and is there a patch we can apply? I wonder how python manages to break the fi

[ceph-users] Re: ceph fs (meta) data inconsistent

2023-11-02 Thread Frank Schilder
reboot the server where the file was written later today. Until then we can do diagnostics while the issue is visible. Please let us know what information we can provide. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From

[ceph-users] ceph fs (meta) data inconsistent

2023-11-01 Thread Frank Schilder
re showing different numbers, I see a 0 length now everywhere for the moved folder. I'm pretty sure though that the file still is non-zero length. Thanks for any pointers. ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 ___ cep

[ceph-users] Re: find PG with large omap object

2023-10-31 Thread Frank Schilder
roperty of creating pools without asking. It would be great if you could add a sanity check that confirms that RGW services are actually present *before* executing any radosgw-admin command and exiting if none are present. Best regards, ===== Frank Schilder AIT Risø Ca

[ceph-users] Combining masks in ceph config

2023-10-25 Thread Frank Schilder
syntax error, but I'm also not sure it does the right thing. Does the above mean "class:hdd and datacenter:A" or does it mean "for OSDs with device class 'hdd,datacenter:A'"? Thanks and best regards, = Frank Schilder A
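For reference, the two documented single-mask forms; whether a comma combines them, and how it parses, is exactly the open question of this post:

    ceph config set osd/class:hdd      osd_memory_target 2147483648
    ceph config set osd/datacenter:A   osd_memory_target 2147483648
    # the question: is osd/class:hdd,datacenter:A "hdd AND in datacenter A",
    # or a device class literally named "hdd,datacenter:A"?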

[ceph-users] Re: stuck MDS warning: Client HOST failing to respond to cache pressure

2023-10-19 Thread Frank Schilder
show 2% CPU usage even though there was no file IO going on. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Frank Schilder Sent: Thursday, October 19, 2023 10:02 AM To: Stefan Kooman; ceph-users@ceph.io Subject

[ceph-users] Re: stuck MDS warning: Client HOST failing to respond to cache pressure

2023-10-19 Thread Frank Schilder
ent caps. Questions: - Why does the client have so many caps allocated? Is there another way than open files that requires allocations? - Is there a way to find out what these caps are for? - We will look at the code (its python+miniconda), any pointers what to look for? Thanks and best regards, ===

[ceph-users] Re: Ceph 16.2.x mon compactions, disk writes

2023-10-18 Thread Frank Schilder
Hi Zakhar, since it's a bit beyond the scope of the basics, could you please post the complete ceph.conf config section for these changes for reference? Thanks! = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Zakhar Kirpichenko
