[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-24 Thread Eugen Block
ou could collect information on (reproducing) the fatal peering problem. While remappings might be "unexpectedly expected" it is clearly a serious bug that incomplete and unknown PGs show up in the process of adding hosts at the root. Best regards, = Frank Schilder AIT Ri

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-24 Thread Eugen Block
hosts directly where they belong. - Set osd_crush_initial_weight = 0 to avoid remapping until everything is where it's supposed to be, then reweight the OSDs. Zitat von Eugen Block : Hi Frank, thanks for looking up those trackers. I haven't looked into them yet, I'll read your response in
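A minimal sketch of that workaround, with an illustrative OSD id and target weight:

ceph config set osd osd_crush_initial_weight 0   # new OSDs come up with CRUSH weight 0, so no remapping starts
# ... move the host/OSDs into their designated datacenter bucket ...
ceph osd crush reweight osd.42 7.27739           # then assign the real weight per OSD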

[ceph-users] Re: cephadm bootstraps cluster with bad CRUSH map(?)

2024-05-24 Thread Eugen Block
Vernon : Hi, On 22/05/2024 12:44, Eugen Block wrote: you can specify the entire tree in the location statement, if you need to: [snip] Brilliant, that's just the ticket, thank you :) This should be made a bit clearer in the docs [0], I added Zac. I've opened a MR to update the docs, I

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-24 Thread Eugen Block
pus Bygning 109, rum S14 From: Frank Schilder Sent: Thursday, May 23, 2024 6:32 PM To: Eugen Block Cc: ceph-users@ceph.io Subject: [ceph-users] Re: unknown PGs after adding hosts in different subtree Hi Eugen, I'm at home now. Could you please check

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Eugen Block
attached my osdmap, not sure if it will go through, though. Let me know if you need anything else. Thanks! Eugen Zitat von Eugen Block : In my small lab cluster I can at least reproduce that a bunch of PGs are remapped after adding hosts to the default root

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Eugen Block
to investigate? I’m on my mobile right now, I’ll add my own osdmap to the thread soon. Zitat von Eugen Block : Thanks, Frank, I appreciate your help. I already asked for the osdmap, but I’ll also try to find a reproducer. Zitat von Frank Schilder : Hi Eugen, thanks for this clarification. Yes

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Eugen Block
ngs as used in the cluster and it encodes other important information as well. That's why I'm asking for this instead of just the crush map. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Eugen Block Sent: Thurs

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Eugen Block
ging from my expectations. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 ________ From: Eugen Block Sent: Thursday, May 23, 2024 12:05 PM To: ceph-users@ceph.io Subject: [ceph-users] Re: unknown PGs after adding hosts in different sub

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-23 Thread Eugen Block
the otherwise healthy cluster in such a way? Even if ceph doesn't know where to put some of the chunks, I wouldn't expect inactive PGs and have a service interruption. What am I missing here? Thanks, Eugen Zitat von Eugen Block : Thanks, Konstantin. It's been a while since I was last bitten

[ceph-users] Re: cephadm bootstraps cluster with bad CRUSH map(?)

2024-05-22 Thread Eugen Block
Hi, you can specify the entire tree in the location statement, if you need to: ceph:~ # cat host-spec.yaml service_type: host hostname: ceph addr: location: root: default rack: rack2 and after the bootstrap it looks as expected: ceph:~ # ceph osd tree ID CLASS WEIGHT TYPE NAME
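A sketch of such a host spec and how it can be applied; hostname and address here are illustrative:

cat > host-spec.yaml <<EOF
service_type: host
hostname: ceph01
addr: 192.168.100.10
location:
  root: default
  rack: rack2
EOF
ceph orch apply -i host-spec.yaml   # or pass it to 'cephadm bootstrap --apply-spec host-spec.yaml'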

[ceph-users] Re: CephFS as Offline Storage

2024-05-21 Thread Eugen Block
It’s usually no problem to shut down a cluster. Set at least the noout flag, the other flags like norebalance, nobackfill etc won’t hurt either. Then shut down the servers. I do that all the time with test clusters (they do have data, just not important at all), and I’ve never had data
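For reference, the flags mentioned here are set and cleared like this (a sketch; which flags to include beyond noout is a matter of preference):

ceph osd set noout
ceph osd set norebalance
ceph osd set nobackfill
# ... shut the servers down, later power them back up ...
ceph osd unset nobackfill
ceph osd unset norebalance
ceph osd unset noout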

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-21 Thread Eugen Block
, Eugen Block wrote: step set_choose_tries 100 I think you should try to increase set_choose_tries to 200 Last year we had a Pacific EC 8+2 deployment of 10 racks. And even with 50 hosts, the value of 100 did not work for us k ___ ceph-users

[ceph-users] Re: Ceph osd df tree takes a long time to respond

2024-05-21 Thread Eugen Block
First thing to try would be to fail the mgr. Although the daemons might be active from a systemd perspective, they sometimes get unresponsive. I saw that in Nautilus clusters as well, so that might be worth a try. Zitat von Huy Nguyen : Ceph version 14.2.7 Ceph osd df tree command take

[ceph-users] unknown PGs after adding hosts in different subtree

2024-05-21 Thread Eugen Block
Hi, I got into a weird and unexpected situation today. I added 6 hosts to an existing Pacific cluster (16.2.13, 20 existing OSD hosts across 2 DCs). The hosts were added to the root=default subtree, their designated location is one of two datacenters underneath the default root. Nothing

[ceph-users] Re: Determine client/inode/dnode source of massive explosion in CephFS metadata pool usage (Red Hat Nautilus CephFS)

2024-05-13 Thread Eugen Block
I just read your message again, you only mention newly created files, not new clients. So my suggestion probably won't help you in this case, but it might help others. :-) Zitat von Eugen Block : Hi Paul, I don't really have a good answer to your question, but maybe this approach can

[ceph-users] Re: Determine client/inode/dnode source of massive explosion in CephFS metadata pool usage (Red Hat Nautilus CephFS)

2024-05-13 Thread Eugen Block
Hi Paul, I don't really have a good answer to your question, but maybe this approach can help track down the clients. Each MDS client has an average "uptime" metric stored in the MDS: storage01:~ # ceph tell mds.cephfs.storage04.uxkclk session ls ... "id": 409348719, ...

[ceph-users] Re: Reef: Dashboard: Object Gateway Graphs have no Data

2024-05-10 Thread Eugen Block
Hi, I don't have a Reef production cluster available yet, only a small test cluster (upgraded from 18.2.1 to 18.2.2 this week). Although I don't use the RGWs constantly there, there are graphs in the ceph dashboard. Maybe it's related to the grafana (and/or prometheus) versions? My

[ceph-users] Re: Problem with take-over-existing-cluster.yml playbook

2024-05-08 Thread Eugen Block
Hi, I'm not familiar with ceph-ansible. I'm not sure if I understand it correctly, according to [1] it tries to get the public IP range to define monitors (?). Can you verify if your mon sections in /etc/ansible/hosts are correct? ansible.builtin.set_fact: _monitor_addresses: "{{

[ceph-users] Re: Removed host in maintenance mode

2024-05-07 Thread Eugen Block
the keys for the osd:s that remained in the host (after the pools recovered/rebalanced). /Johan Den 2024-05-07 kl. 12:09, skrev Eugen Block: Hi, did you remove the host from the host list [0]? ceph orch host rm [--force] [--offline] [0] https://docs.ceph.com/en/latest/cephadm/host

[ceph-users] cephadm upgrade: heartbeat failures not considered

2024-05-07 Thread Eugen Block
Hi, we're facing an issue during upgrades (and sometimes server reboots); it appears to occur when (at least) one of the MONs has to do a full sync. And I'm wondering if the upgrade procedure could be improved in that regard, I'll come back to that later. First, I'll try to summarize the

[ceph-users] Re: RBD Mirroring with Journaling and Snapshot mechanism

2024-05-07 Thread Eugen Block
Hi, I'm not the biggest rbd-mirror expert. As understand it, if you use one-way mirroring you can failover to the remote site, continue to work there but there's no failover back to primary site. You would need to stop client IO on DR, demote the image and then import the remote images

[ceph-users] Re: Removed host in maintenance mode

2024-05-07 Thread Eugen Block
Hi, did you remove the host from the host list [0]? ceph orch host rm [--force] [--offline] [0] https://docs.ceph.com/en/latest/cephadm/host-management/#offline-host-removal Zitat von Johan : Hi all, In my small cluster of 6 hosts I had troubles with a host (osd:s) and was planning to

[ceph-users] Re: Dashboard issue slowing to a crawl - active ceph mgr process spiking to 600%+

2024-05-07 Thread Eugen Block
Hi, it's a bit much output to scan through, I'd recommend to omit all unnecessary information before pasting. Anyway, this sticks out: 2024-05-01T15:49:26.977+ 7f85688e8700 0 [dashboard ERROR frontend.error] (https://172.20.2.30:8443/#/login): Http failure response for

[ceph-users] Re: cephadm custom crush location hooks

2024-05-03 Thread Eugen Block
And in the tracker you never mentioned to add a symlink, only to add the prefix "/rootfs" to the ceph config. I could have tried that approach first. ;-) Zitat von Eugen Block : Alright, I updated the configs in our production cluster and restarted the OSDs (after removing

[ceph-users] Re: cephadm custom crush location hooks

2024-05-03 Thread Eugen Block
. Thanks! Eugen Zitat von Wyll Ingersoll : Yeah, now that you mention it, I recall figuring that out also at some point. I think I did it originally when I was debugging the problem without the container. From: Eugen Block Sent: Friday, May 3, 2024 8:37 AM

[ceph-users] Re: cephadm custom crush location hooks

2024-05-03 Thread Eugen Block
restart the OSD finds its correct location. So I actually only need to update the location path, nothing else, it seems. Zitat von Eugen Block : I found your (open) tracker issue: https://tracker.ceph.com/issues/53562 Your workaround works great, I tried it in a test cluster successfully

[ceph-users] Re: cephadm custom crush location hooks

2024-05-03 Thread Eugen Block
I found your (open) tracker issue: https://tracker.ceph.com/issues/53562 Your workaround works great, I tried it in a test cluster successfully. I will adopt it to our production cluster as well. Thanks! Eugen Zitat von Eugen Block : Thank you very much for the quick response! I will take

[ceph-users] Re: 'ceph fs status' no longer works?

2024-05-02 Thread Eugen Block
ler" mailto:wei...@soe.ucsc.edu>> *An: *"Eugen Block" mailto:ebl...@nde.ag>>, ceph-users@ceph.io <mailto:ceph-users@ceph.io> *Datum: *02-05-2024 21:05 Hi Eugen, Thanks for the tip!  I just ran: ceph orch daemon restart mgr.pr-md-01.jemmdf (my specific m

[ceph-users] Re: 'ceph fs status' no longer works?

2024-05-02 Thread Eugen Block
Yep, seen this a couple of times during upgrades. I’ll have to check my notes if I wrote anything down for that. But try a mgr failover first, that could help. Zitat von Erich Weiler : Hi All, For a while now I've been using 'ceph fs status' to show current MDS active servers,

[ceph-users] Re: service:mgr [ERROR] "Failed to apply:

2024-05-02 Thread Eugen Block
Can you please paste the output of the following command? ceph orch host ls Zitat von "Roberto Maggi @ Debian" : Hi you all, it is a couple of days I'm facing this problem. Although I already destroyed the cluster a couple of times I continuously get these error I instruct ceph to place

[ceph-users] Re: cephadm custom crush location hooks

2024-05-02 Thread Eugen Block
ituation is not ideal. ____ From: Eugen Block Sent: Thursday, May 2, 2024 10:23 AM To: ceph-users@ceph.io Subject: [ceph-users] cephadm custom crush location hooks Hi, we've been using custom crush location hooks for some OSDs [1] for years. Since we moved to cephadm, we always have to manually

[ceph-users] cephadm custom crush location hooks

2024-05-02 Thread Eugen Block
Hi, we've been using custom crush location hooks for some OSDs [1] for years. Since we moved to cephadm, we always have to manually edit the unit.run file of those OSDs because the path to the script is not mapped into the containers. I don't want to define custom location hooks for all

[ceph-users] Re: After dockerized ceph cluster to Pacific, the fsid changed in the output of 'ceph -s'

2024-05-02 Thread Eugen Block
Hi, did you maybe have some test clusters leftovers on the hosts so cephadm might have picked up the wrong FSID? Does that mean that you adopted all daemons and only afterwards looked into ceph -s? I would have adopted the first daemon and checked immediately if everything still was as

[ceph-users] Re: Unable to add new OSDs

2024-05-02 Thread Eugen Block
Hi, is the cluster healthy? Sometimes a degraded state prevents the orchestrator from doing its work. Then I would fail the mgr (ceph mgr fail), this seems to be necessary lots of times. Then keep an eye on the active mgr log as well as the cephadm.log locally on the host where the OSDs

[ceph-users] Re: Reconstructing an OSD server when the boot OS is corrupted

2024-04-30 Thread Eugen Block
Oh I'm sorry, Peter, I don't know why I wrote Karl. I apologize. Zitat von Eugen Block : Hi Karl, I must admit that I haven't dealt with raw OSDs yet. We've been usually working with LVM based clusters (some of the customers used SUSE's product SES) and in SES there was a recommendation

[ceph-users] Re: Reconstructing an OSD server when the boot OS is corrupted

2024-04-30 Thread Eugen Block
-bf3474f90508:/var/log/ceph:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /:/rootfs -v /tmp/ceph-tmpjox0_hj0:/etc/ceph/ceph.conf:z quay.io/ceph/ceph@sha256:96f2a53bc3028eec16e790c6225e7d7acad8a48737a57ec14eea7ce036733233 raw activate --osd-id 20 --dev

[ceph-users] Re: which grafana version to use with 17.2.x ceph version

2024-04-29 Thread Eugen Block
Hi, cephadm stores a local copy of the cephadm binary in /var/lib/ceph/{FSID}/cephadm.{DIGEST}: quincy-1:~ # ls -lrt /var/lib/ceph/{FSID}/cephadm.* -rw-r--r-- 1 root root 350889 26. Okt 2023 /var/lib/ceph/{FSID}/cephadm.f6868821c084cd9740b59c7c5eb59f0dd47f6e3b1e6fecb542cb44134ace8d78

[ceph-users] Re: Impact of large PG splits

2024-04-29 Thread Eugen Block
there will be soon some more remapping. :-) So I would consider this thread as closed, all good. Zitat von Eugen Block : No, we didn’t change much, just increased the max pg per osd to avoid warnings and inactive PGs in case a node would fail during this process. And the max backfills

[ceph-users] Re: MDS crash

2024-04-28 Thread Eugen Block
Hi, can you share the current 'ceph status'? Do you have any inconsistent PGs or something? What are the cephfs data pool's min_size and size? Zitat von Alexey GERASIMOV : Colleagues, thank you for the advice to check the operability of MGRs. In fact, it is strange also: we checked our

[ceph-users] Re: Remove an OSD with hardware issue caused rgw 503

2024-04-27 Thread Eugen Block
" in method 1 and "migrating PGs" in method 2? I think method 1 must read the OSD to be removed. Otherwise, we would not see slow ops warning. Does method 2 not involve reading this OSD? Thanks, Mary On Fri, Apr 26, 2024 at 5:15 AM Eugen Block wrote: > Hi, > > if you rem

[ceph-users] Re: rbd-mirror get status updates quicker

2024-04-27 Thread Eugen Block
Hi, I didn’t find any other config options other than you already did. Just wanted to note that I did read your message. :-) Maybe one of the Devs can comment. Zitat von Stefan Kooman : Hi, We're testing with rbd-mirror (mode snapshot) and try to get status updates about snapshots as fast

[ceph-users] Re: Remove an OSD with hardware issue caused rgw 503

2024-04-26 Thread Eugen Block
Hi, if you remove the OSD this way, it will be drained. Which means that it will try to recover PGs from this OSD, and in case of hardware failure it might lead to slow requests. It might make sense to forcefully remove the OSD without draining: - stop the osd daemon - mark it as out -
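A sketch of that forced removal, with an illustrative OSD id (the exact steps depend on the deployment type):

systemctl stop ceph-osd@12                  # on cephadm clusters: ceph orch daemon stop osd.12
ceph osd out 12
ceph osd purge 12 --yes-i-really-mean-it    # removes the OSD from CRUSH, auth and the OSD map in one step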

[ceph-users] Re: MDS crash

2024-04-26 Thread Eugen Block
Hi, it's unlikely that all OSDs fail at the same time, it seems like a network issue. Do you have an active MGR? Just a couple of days ago someone reported incorrect OSD stats because no MGR was up. Although your 'ceph health detail' output doesn't mention that, there are still issues when

[ceph-users] Re: Impact of large PG splits

2024-04-25 Thread Eugen Block
mon_osd_nearfull_ratio temporarily? Frédéric. - Le 25 Avr 24, à 12:35, Eugen Block ebl...@nde.ag a écrit : For those interested, just a short update: the split process is approaching its end, two days ago there were around 230 PGs left (target are 4096 PGs). So far there were no complaints, no cluster

[ceph-users] Re: Impact of large PG splits

2024-04-25 Thread Eugen Block
increasing osd_max_backfills to any values higher than 2-3 will not help much with the recovery/backfilling speed. All the way, you'll have to be patient. :-) Cheers, Frédéric. - Le 10 Avr 24, à 12:54, Eugen Block ebl...@nde.ag a écrit : Thank you for input! We started the split with max

[ceph-users] Re: Cephadm stacktrace on copying ceph.conf

2024-04-25 Thread Eugen Block
Hi, I saw something like this a couple of weeks ago on a customer cluster. I'm not entirely sure, but this was either due to (yet) missing or wrong cephadm ssh config or a label/client-keyring management issue. If this is still an issue I would recommend to check the configured keys to be

[ceph-users] Re: Reconstructing an OSD server when the boot OS is corrupted

2024-04-24 Thread Eugen Block
In addition to Nico's response, three years ago I wrote a blog post [1] about that topic, maybe that can help as well. It might be a bit outdated, what it definitely doesn't contain is this command from the docs [2] once the server has been re-added to the host list: ceph cephadm osd
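The docs command referred to here is presumably 'ceph cephadm osd activate'; a sketch with an illustrative hostname and address:

ceph orch host add osd-host-3 192.168.100.13   # re-add the reinstalled host to the inventory
ceph cephadm osd activate osd-host-3           # scan for existing OSD volumes on that host and restart the daemons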

[ceph-users] Re: Latest Doco Out Of Date?

2024-04-24 Thread Eugen Block
possible to implement a modify operation in the future without breaking stuff. And you can save time on the documentation, because it works like other stuff. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 ____ From: Eugen Bl

[ceph-users] Re: stretched cluster new pool and second pool with nvme

2024-04-24 Thread Eugen Block
Oh, I see. Unfortunately, I don't have a cluster in stretch mode so I can't really test that. Thanks for pointing to the tracker. Zitat von Stefan Kooman : On 23-04-2024 14:40, Eugen Block wrote: Hi, whats the right way to add another pool? create pool with 4/2 and use the rule

[ceph-users] Re: Latest Doco Out Of Date?

2024-04-24 Thread Eugen Block
Hi, I believe the docs [2] are okay, running 'ceph fs authorize' will overwrite the existing caps, it will not add more caps to the client: Capabilities can be modified by running fs authorize only in the case when read/write permissions must be changed. If a client already has a

[ceph-users] Re: stretched cluster new pool and second pool with nvme

2024-04-23 Thread Eugen Block
Hi, whats the right way to add another pool? create pool with 4/2 and use the rule for the stretched mode, finished? the exsisting pools were automaticly set to 4/2 after "ceph mon enable_stretch_mode". if that is what you require, then yes, it's as easy as that. Although I haven't played

[ceph-users] Re: rbd-mirror failed to query services: (13) Permission denied

2024-04-23 Thread Eugen Block
I'm not entirely sure if I ever tried it with the rbd-mirror user instead of admin user, but I see the same error message on 17.2.7. I assume that it's not expected, I think a tracker issue makes sense. Thanks, Eugen Zitat von Stefan Kooman : Hi, We are testing rbd-mirroring. There seems

[ceph-users] Re: Stuck in replay?

2024-04-22 Thread Eugen Block
IIRC, you have 8 GB configured for the mds cache memory limit, and it doesn’t seem to be enough. Does the host run into oom killer as well? But it’s definitely a good approach to increase the cache limit (try 24 GB if possible since it’s trying to use at least 19 GB) on a host with enough
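A sketch of raising that limit, with the value expressed in bytes:

ceph config set mds mds_cache_memory_limit 25769803776   # 24 GiB
ceph config get mds mds_cache_memory_limit               # verify the new value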

[ceph-users] Re: RGWs stop processing requests after upgrading to Reef

2024-04-22 Thread Eugen Block
t have a any clients connected). Zitat von Eugen Block : Hi, I don't see a reason why Quincy rgw daemons shouldn't work with a Reef cluster. It would basically mean that you have a staggered upgrade [1] running and didn't upgrade RGWs yet. It should also work to just downgrade them, e

[ceph-users] Re: RGWs stop processing requests after upgrading to Reef

2024-04-22 Thread Eugen Block
Hi, I don't see a reason why Quincy rgw daemons shouldn't work with a Reef cluster. It would basically mean that you have a staggered upgrade [1] running and didn't upgrade RGWs yet. It should also work to just downgrade them, either by providing a different default image, then redeploy

[ceph-users] Re: MDS crash

2024-04-22 Thread Eugen Block
Right, I just figured from the health output you would have a couple of seconds or so to query the daemon: mds: 1/1 daemons up Zitat von Alexey GERASIMOV : Ok, we will create the ticket. Eugen Block - ceph tell command needs to communicate with the MDS daemon running

[ceph-users] Re: Multiple MDS Daemon needed?

2024-04-22 Thread Eugen Block
Hi Erich, there's no simple answer to your question, as always it depends. Every now and then there are threads about clients misbehaving, especially with the "flush tid" messages. For example, the docs [1] state: The CephFS client-MDS protocol uses a field called the oldest tid to

[ceph-users] Re: MDS crash

2024-04-21 Thread Eugen Block
What’s the output of: ceph tell mds.0 damage ls Zitat von alexey.gerasi...@opencascade.com: Dear colleagues, hope that anybody can help us. The initial point: Ceph cluster v15.2 (installed and controlled by the Proxmox) with 3 nodes based on physical servers rented from a cloud

[ceph-users] Re: Working ceph cluster reports large amount of pgs in state unknown/undersized and objects degraded

2024-04-20 Thread Eugen Block
Hi, there are lots of metrics that are collected by the MGR. So if there is none, the cluster health details can be wrong or outdated. Zitat von Tobias Langner : Hey Alwin, Thanks for your reply, answers inline. I'd assume (w/o pool config) that the EC 2+1 is putting PG as inactive.

[ceph-users] Re: feature_map differs across mon_status

2024-04-17 Thread Eugen Block
Hi, without looking too deep into it, I would just assume that the daemons and clients are connected to different MONs. Or am I misunderstanding your question? Zitat von Joel Davidow : Just curious why the feature_map portions differ in the return of mon_status across a cluster. Below

[ceph-users] Re: crushmap history

2024-04-17 Thread Eugen Block
Hi, I'm not sure if and how that could help, there's a get-crushmap command for the ceph-monstore-tool: [ceph: root@host1 /]# ceph-monstore-tool /var/lib/ceph/mon/ceph-host1/ show-versions -- --map-type crushmap > show-versions [ceph: root@host1 /]# cat show-versions first committed:

[ceph-users] Re: [EXTERN] Re: Ceph 16.2.x mon compactions, disk writes

2024-04-16 Thread Eugen Block
"if something goes wrong, monitors will fail" rather discouraging :-) /Z On Tue, 16 Apr 2024 at 18:59, Eugen Block wrote: Sorry, I meant extra-entrypoint-arguments: https://www.spinics.net/lists/ceph-users/msg79251.html Zitat von Eugen Block : > You can use the extra containe

[ceph-users] Re: [EXTERN] Re: Ceph 16.2.x mon compactions, disk writes

2024-04-16 Thread Eugen Block
Sorry, I meant extra-entrypoint-arguments: https://www.spinics.net/lists/ceph-users/msg79251.html Zitat von Eugen Block : You can use the extra container arguments I pointed out a few months ago. Those work in my test clusters, although I haven’t enabled that in production yet

[ceph-users] Re: [EXTERN] Re: Ceph 16.2.x mon compactions, disk writes

2024-04-16 Thread Eugen Block
in theory this > should result in lower but much faster compression. > > I hope this helps. My plan is to keep the monitors with the current > settings, i.e. 3 with compression + 2 without compression, until the next > minor release of Pacific to see whether the monitors with compressed

[ceph-users] Re: Have a problem with haproxy/keepalived/ganesha/docker

2024-04-16 Thread Eugen Block
Ah, okay, thanks for the hint. In that case what I see is expected. Zitat von Robert Sander : Hi, On 16.04.24 10:49, Eugen Block wrote: I believe I can confirm your suspicion, I have a test cluster on Reef 18.2.1 and deployed nfs without HAProxy but with keepalived [1]. Stopping

[ceph-users] Re: Have a problem with haproxy/keepalived/ganesha/docker

2024-04-16 Thread Eugen Block
Hm, no, I can't confirm it yet. I missed something in the config, the failover happens and a new nfs daemon is deployed on a different node. But I still see client interruptions so I'm gonna look into that first. Zitat von Eugen Block : Hi, I believe I can confirm your suspicion, I have

[ceph-users] Re: Have a problem with haproxy/keepalived/ganesha/docker

2024-04-16 Thread Eugen Block
Hi, I believe I can confirm your suspicion, I have a test cluster on Reef 18.2.1 and deployed nfs without HAProxy but with keepalived [1]. Stopping the active NFS daemon doesn't trigger anything, the MGR notices that it's stopped at some point, but nothing else seems to happen. I didn't

[ceph-users] Re: Impact of large PG splits

2024-04-12 Thread Eugen Block
ou'll have to be patient. :-) Cheers, Frédéric. - Le 10 Avr 24, à 12:54, Eugen Block ebl...@nde.ag a écrit : Thank you for input! We started the split with max_backfills = 1 and watched for a few minutes, then gradually increased it to 8. Now it's backfilling with around 180 MB/s, not really much

[ceph-users] Re: Impact of large PG splits

2024-04-10 Thread Eugen Block
, but we haven't noticed it before. HTH, Greg. On 10/4/24 14:42, Eugen Block wrote: Thank you, Janne. I believe the default 5% target_max_misplaced_ratio would work as well, we've had good experience with that in the past, without the autoscaler. I just haven't dealt with such large PGs, I've

[ceph-users] Re: Impact of large PG splits

2024-04-10 Thread Eugen Block
) and now they finally started to listen. Well, they would still ignore it if it wouldn't impact all kinds of things now. ;-) Thanks, Eugen Zitat von Janne Johansson : Den tis 9 apr. 2024 kl 10:39 skrev Eugen Block : I'm trying to estimate the possible impact when large PGs are splitted

[ceph-users] Re: Impact of large PG splits

2024-04-09 Thread Eugen Block
is a simpler In any case, it’s worth trying and using the maximum capabilities of the upmap Good luck, k [1] https://github.com/digitalocean/pgremapper On 9 Apr 2024, at 11:39, Eugen Block wrote: I'm trying to estimate the possible impact when large PGs are splitted. Here's one example

[ceph-users] Impact of large PG splits

2024-04-09 Thread Eugen Block
Hi, I'm trying to estimate the possible impact when large PGs are split. Here's one example of such a PG: PG_STAT OBJECTS BYTES OMAP_BYTES* OMAP_KEYS* LOG DISK_LOG UP 86.3ff 277708 4144030984090 0 3092 3092
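For context, a sketch of how such a split is typically started while keeping the misplaced ratio bounded; pool name and target pg_num are illustrative:

ceph config set mgr target_max_misplaced_ratio 0.05   # default 5%, caps how much data is remapped per step
ceph osd pool set cephfs_data pg_num 4096             # the mgr then raises pg_num/pgp_num gradually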

[ceph-users] Re: NFS never recovers after slow ops

2024-04-06 Thread Eugen Block
Hi Torkil, I assume the affected OSDs were the ones with slow requests, no? You should still see them in some of the logs (mon, mgr). Zitat von Torkil Svensgaard : On 06-04-2024 18:10, Torkil Svensgaard wrote: Hi Cephadm Reef 18.2.1 Started draining 5 18-20 TB HDD OSDs (DB/WAL om NVMe)

[ceph-users] Re: Issue about execute "ceph fs new"

2024-04-06 Thread Eugen Block
Sorry, I hit send too early; to enable multiple filesystems the full command is: ceph fs flag set enable_multiple true Zitat von Eugen Block : Did you enable multiple filesystems? Can you please share 'ceph fs dump'? Port 6789 is the MON port (v1, v2 is 3300). If you haven't enabled multiple filesystems

[ceph-users] Re: Issue about execute "ceph fs new"

2024-04-06 Thread Eugen Block
Did you enable multiple filesystems? Can you please share 'ceph fs dump'? Port 6789 is the MON port (v1, v2 is 3300). If you haven't enabled multiple filesystems, run: ceph fs flag set enable_multiple Zitat von elite_...@163.com: I tried to remove the default fs then it works, but port 6789 still
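A sketch of the full sequence for adding a second filesystem (the filesystem name is illustrative):

ceph fs flag set enable_multiple true
ceph fs volume create newfs          # or: ceph fs new newfs <metadata_pool> <data_pool> with pre-created pools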

[ceph-users] Re: Pacific 16.2.15 `osd noin`

2024-04-04 Thread Eugen Block
Hi, the noin flag seems to be only applicable to existing OSDs which are already in the crushmap. It doesn't apply to newly created OSDs, I could confirm that in a small test cluster with Pacific and Reef. I don't have any insights if that is by design or not, I assume it's supposed to

[ceph-users] Re: [ext] Re: cephadm auto disk preparation and OSD installation incomplete

2024-04-03 Thread Eugen Block
parameter? Or maybe look into speeding up LV creation (if this is the bottleneck)? Thanks a lot, Mathias -Original Message- From: Kuhring, Mathias Sent: Friday, March 22, 2024 5:38 PM To: Eugen Block ; ceph-users@ceph.io Subject: [ceph-users] Re: [ext] Re: cephadm auto disk preparation

[ceph-users] Re: quincy-> reef upgrade non-cephadm

2024-04-03 Thread Eugen Block
Hi, 1. I see no systemd units with the fsid in them, as described in the document above. Both before and after the upgrade, my mon and other units are: ceph-mon@.serviceceph-osd@[N].service etc Should I be concerned? I think this is expected because it's not containerized, no reason to

[ceph-users] Re: ceph orchestrator for osds

2024-04-03 Thread Eugen Block
Hi, how many OSDs do you have in total? Can you share your osd tree, please? You could check the unit.meta file on each OSD host to see which service it refers to and simply change it according to the service you intend to keep: host1:~ # grep -r service_name
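A sketch of that check on a cephadm host (the FSID path component is left as a glob):

grep -r service_name /var/lib/ceph/*/osd.*/unit.meta   # shows which OSD spec each daemon belongs to
ceph orch ls osd --export                              # compare with the specs you actually want to keep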

[ceph-users] Re: Issue about execute "ceph fs new"

2024-04-03 Thread Eugen Block
Hi, you need to deploy more daemons because your current active MDS is responsible for the already existing CephFS. There are several ways to do this, I like the yaml file approach and increase the number of MDS daemons, just as an example from a test cluster with one CephFS I added the

[ceph-users] Re: ceph status not showing correct monitor services

2024-04-03 Thread Eugen Block
9945d0514222bd7a83e28b96e8440c630ba6891f", "RepoTags": [ "ceph/daemon:latest-pacific" "RepoDigests": [ "ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586" -Original Message- From: Adiga, Anantha Sent:

[ceph-users] Re: Replace block drives of combined NVME+HDD OSDs

2024-04-02 Thread Eugen Block
, but that was it. /Z On Tue, 2 Apr 2024 at 11:00, Eugen Block wrote: Hi, here's the link to the docs [1] how to replace OSDs. ceph orch osd rm --replace --zap [--force] This should zap both the data drive and db LV (yes, its data is useless without the data drive), not sure how it will handle if the data

[ceph-users] Re: Replace block drives of combined NVME+HDD OSDs

2024-04-02 Thread Eugen Block
Hi, here's the link to the docs [1] how to replace OSDs. ceph orch osd rm --replace --zap [--force] This should zap both the data drive and db LV (yes, its data is useless without the data drive), not sure how it will handle if the data drive isn't accessible though. One thing I'm not
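The replacement call referenced above, sketched with an illustrative OSD id:

ceph orch osd rm 17 --replace --zap   # drains the OSD, zaps data drive and DB LV, keeps the CRUSH entry as 'destroyed'
ceph orch osd rm status               # follow the progress of the removal/replacement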

[ceph-users] Re: Drained A Single Node Host On Accident

2024-04-02 Thread Eugen Block
Hi, without knowing the whole story, to cancel OSD removal you can run this command: ceph orch osd rm stop Regards, Eugen Zitat von "adam.ther" : Hello, I have a single node host with a VM as a backup MON,MGR,ect. This has caused all OSD's to be pending as 'deleting', can i safely

[ceph-users] Re: ceph status not showing correct monitor services

2024-04-02 Thread Eugen Block
- a001s017 - a001s018 # ceph orch ls --service_name=mon --export service_type: mon service_name: mon placement: count: 3 hosts: - a001s016 - a001s017 - a001s018 -Original Message- From: Adiga, Anantha Sent: Monday, April 1, 2024 6:06 PM To: Eugen Block Cc: ceph-users@c

[ceph-users] Re: ceph status not showing correct monitor services

2024-04-01 Thread Eugen Block
n_mon_release 16 (pacific) election_strategy: 1 0: [v2:10.45.128.28:3300/0,v1:10.45.128.28:6789/0] mon.a001s018 1: [v2:10.45.128.27:3300/0,v1:10.45.128.27:6789/0] mon.a001s017 Thank you, Anantha -Original Message- From: Eugen Block Sent: Monday, April 1, 2024 1:10 PM To: ceph-users@ce

[ceph-users] Re: ceph status not showing correct monitor services

2024-04-01 Thread Eugen Block
Maybe it’s just not in the monmap? Can you show the output of: ceph mon dump Did you do any maintenance (apparently OSDs restarted recently) and maybe accidentally removed a MON from the monmap? Zitat von "Adiga, Anantha" : Hi Anthony, Seeing it since last after noon. It is same with

[ceph-users] Re: node-exporter error

2024-03-22 Thread Eugen Block
Hi, what does your node-exporter spec look like? ceph orch ls node-exporter --export If other node-exporter daemons are running in the cluster, what's the difference between them? Do they all have the same container image? ceph config get mgr mgr/cephadm/container_image_node_exporter and

[ceph-users] Re: mon stuck in probing

2024-03-21 Thread Eugen Block
omp rx=0 tx=0)._fault waiting 15.00 2024-03-13T11:14:29.795+0800 7f6980206640 10 RDMAStack polling finally delete qp = 0x5650c54164b0 Eugen Block 于2024年3月19日周二 14:50写道: Hi, there are several existing threads on this list, have you tried to apply those suggestions? A couple of them were: - ceph mgr

[ceph-users] Re: cephadm auto disk preparation and OSD installation incomplete

2024-03-21 Thread Eugen Block
Hi, before getting into that the first thing I would do is to fail the mgr. There have been too many issues where failing over the mgr resolved many of them. If that doesn't help, the cephadm.log should show something useful (/var/log/ceph/cephadm.log on the OSD hosts, I'm still not too

[ceph-users] Re: Adding new OSD's - slow_ops and other issues.

2024-03-19 Thread Eugen Block
Hi Jesper, could you please provide more details about the cluster (the usual like 'ceph osd tree', 'ceph osd df', 'ceph versions')? I find it unusual to enable maintenance mode to add OSDs, is there a specific reason? And why adding OSDs manually with 'ceph orch osd add', why not have a

[ceph-users] Re: mon stuck in probing

2024-03-19 Thread Eugen Block
Hi, there are several existing threads on this list, have you tried to apply those suggestions? A couple of them were: - ceph mgr fail - check time sync (NTP, chrony) - different weights for MONs - Check debug logs Regards, Eugen Zitat von faicker mo : some logs here,

[ceph-users] Re: CephFS space usage

2024-03-19 Thread Eugen Block
It's your pool replication (size = 3): 3886733 (number of objects) * 3 = 11660199 Zitat von Thorne Lawler : Can anyone please tell me what "COPIES" means in this context? [ceph: root@san2 /]# rados df -p cephfs.shared.data POOL_NAME USED  OBJECTS  CLONES    COPIES
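To confirm the replication factor behind that calculation (pool name taken from the thread):

ceph osd pool get cephfs.shared.data size   # replicated size, i.e. the factor applied to OBJECTS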

[ceph-users] Re: Num values for 3 DC 4+2 crush rule

2024-03-16 Thread Eugen Block
Hi Torkil, Num is 0 but it's not replicated so how does this translate to picking 3 of 3 datacenters? it doesn't really make a difference if replicated or not, it just defines how many crush buckets to choose, so it applies in the same way as for your replicated pool. I am thinking we

[ceph-users] Re: activating+undersized+degraded+remapped

2024-03-16 Thread Eugen Block
Yeah, the whole story would help to give better advice. With EC the default min_size is k+1, you could reduce the min_size to 5 temporarily, this might bring the PGs back online. But the long term fix is to have all required OSDs up and have enough OSDs to sustain an outage. Zitat von

[ceph-users] Re: MANY_OBJECT_PER_PG on 1 pool which is cephfs_metadata

2024-03-11 Thread Eugen Block
Hi, I assume you're still on a "low" pacific release? This was fixed by PR [1][2] and the warning is supressed when autoscaler is on, it was merged into Pacific 16.2.8 [3]. I can't answer why autoscaler doesn't increase the pg_num, but yes, you can increase it by yourself. The pool for

[ceph-users] Re: PG damaged "failed_repair"

2024-03-11 Thread Eugen Block
Hi, your ceph version seems to be 17.2.4, not 17.2.6 (which is the locally installed ceph version on the system where you ran the command) Could you add the 'ceph versions' output as well? How is the load on the systems when the recovery starts? The OSDs crash after around 20 minutes,

[ceph-users] Re: PG damaged "failed_repair"

2024-03-10 Thread Eugen Block
sd.3, it crashes in less than a minute 23:49 : After I mark osd.3 "in" and start it again, it comes back online with osd.0 and osd.11 soon after Best regards, Romain Lebbadi-Breteau On 2024-03-08 3:17 a.m., Eugen Block wrote: Hi, can you share more details? Which OSD are you trying

[ceph-users] Re: PG damaged "failed_repair"

2024-03-08 Thread Eugen Block
Hi, can you share more details? Which OSD are you trying to get out, the primary osd.3? Can you also share 'ceph osd df'? It looks like a replicated pool with size 3, can you confirm with 'ceph osd pool ls detail'? Do you have logs from the crashing OSDs when you take out osd.3? Which ceph

[ceph-users] Re: All MGR loop crash

2024-03-07 Thread Eugen Block
Thanks! That's very interesting to know! Zitat von "David C." : some monitors have existed for many years (weight 10) others have been added (weight 0) => https://github.com/ceph/ceph/commit/2d113dedf851995e000d3cce136b69bfa94b6fe0 Le jeudi 7 mars 2024, Eugen Block a écrit :
