You could collect
information on (reproducing) the fatal peering problem. While
remappings might be "unexpectedly expected", it is clearly a serious
bug that incomplete and unknown PGs show up in the process of adding
hosts at the root.
Best regards,
=
Frank Schilder
AIT Risø Campus
hosts directly where they belong.
- Set osd_crush_initial_weight = 0 to avoid remapping until everything
is where it's supposed to be, then reweight the OSDs.
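A minimal sketch of that approach, assuming a cephadm-managed cluster; the OSD id and weight below are placeholders:
# new OSDs come up with crush weight 0, so nothing is remapped yet
ceph config set osd osd_crush_initial_weight 0
# ... add the hosts/OSDs in their intended location ...
# once everything sits where it belongs, give each OSD its real weight
ceph osd crush reweight osd.42 9.09560   # example id and weight (TiB)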
Quoting Eugen Block:
Hi Frank,
thanks for looking up those trackers. I haven't looked into them
yet, I'll read your response in
Vernon :
Hi,
On 22/05/2024 12:44, Eugen Block wrote:
you can specify the entire tree in the location statement, if you need to:
[snip]
Brilliant, that's just the ticket, thank you :)
This should be made a bit clearer in the docs [0], I added Zac.
I've opened a MR to update the docs, I
pus
Bygning 109, rum S14
From: Frank Schilder
Sent: Thursday, May 23, 2024 6:32 PM
To: Eugen Block
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: unknown PGs after adding hosts in different subtree
Hi Eugen,
I'm at home now. Could you please check
attached my osdmap, not sure if it will go through, though. Let me
know if you need anything else.
Thanks!
Eugen
Quoting Eugen Block:
In my small lab cluster I can at least reproduce that a bunch of PGs
are remapped after adding hosts to the default root
to investigate? I’m on my mobile right now, I’ll add my own
osdmap to the thread soon.
Quoting Eugen Block:
Thanks, Frank, I appreciate your help.
I already asked for the osdmap, but I’ll also try to find a reproducer.
Quoting Frank Schilder:
Hi Eugen,
thanks for this clarification. Yes
ngs as used in the cluster and it encodes other important
information as well. That's why I'm asking for this instead of just
the crush map.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Eugen Block
Sent: Thurs
ging from my expectations.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________
From: Eugen Block
Sent: Thursday, May 23, 2024 12:05 PM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: unknown PGs after adding hosts in different sub
the otherwise healthy cluster in such a way?
Even if ceph doesn't know where to put some of the chunks, I wouldn't
expect inactive PGs and have a service interruption.
What am I missing here?
Thanks,
Eugen
Quoting Eugen Block:
Thanks, Konstantin.
It's been a while since I was last bitten
Hi,
you can specify the entire tree in the location statement, if you need to:
ceph:~ # cat host-spec.yaml
service_type: host
hostname: ceph
addr:
location:
  root: default
  rack: rack2
and after the bootstrap it looks as expected:
ceph:~ # ceph osd tree
ID  CLASS  WEIGHT  TYPE NAME
It’s usually no problem to shut down a cluster. Set at least the noout
flag, the other flags like norebalance, nobackfill etc won’t hurt
either. Then shut down the servers. I do that all the time with test
clusters (they do have data, just not important at all), and I’ve
never had data
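For completeness, a hedged sketch of the flag handling around such a maintenance shutdown (standard commands; adjust to taste):
ceph osd set noout
ceph osd set norebalance
ceph osd set nobackfill
# ... shut the servers down, do the maintenance, power back on ...
ceph osd unset nobackfill
ceph osd unset norebalance
ceph osd unset noout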
, Eugen Block wrote:
step set_choose_tries 100
I think you should try to increase set_choose_tries to 200
Last year we had a Pacific EC 8+2 deployment across 10 racks. And even
with 50 hosts, the value of 100 did not work for us
k
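For reference, a hedged sketch of how such a crush rule change is usually applied (the rule edit shown is only illustrative):
ceph osd getcrushmap -o crush.bin
crushtool -d crush.bin -o crush.txt
# edit crush.txt and raise the retry budget inside the EC rule, e.g.
#   step set_choose_tries 200
crushtool -c crush.txt -o crush.new
ceph osd setcrushmap -i crush.new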
First thing to try would be to fail the mgr. Although the daemons
might be active from a systemd perspective, they sometimes get
unresponsive. I saw that in Nautilus clusters as well, so that might
be worth a try.
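As a minimal, hedged example of that first step (naming the active mgr is optional on recent releases):
ceph mgr fail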
Quoting Huy Nguyen:
Ceph version 14.2.7
Ceph osd df tree command take
Hi,
I got into a weird and unexpected situation today. I added 6 hosts to
an existing Pacific cluster (16.2.13, 20 existing OSD hosts across 2
DCs). The hosts were added to the root=default subtree, their
designated location is one of two datacenters underneath the default
root. Nothing
I just read your message again, you only mention newly created files,
not new clients. So my suggestion probably won't help you in this
case, but it might help others. :-)
Quoting Eugen Block:
Hi Paul,
I don't really have a good answer to your question, but maybe this
approach can
Hi Paul,
I don't really have a good answer to your question, but maybe this
approach can help track down the clients.
Each MDS client has an average "uptime" metric stored in the MDS:
storage01:~ # ceph tell mds.cephfs.storage04.uxkclk session ls
...
"id": 409348719,
...
Hi,
I don't have a Reef production cluster available yet, only a small
test cluster (upgraded from 18.2.1 to 18.2.2 this week). Although I
don't use the RGWs constantly there, there are graphs in the ceph
dashboard. Maybe it's related to the grafana (and/or prometheus)
versions?
My
Hi,
I'm not familiar with ceph-ansible. I'm not sure if I understand it
correctly, according to [1] it tries to get the public IP range to
define monitors (?). Can you verify if your mon sections in
/etc/ansible/hosts are correct?
ansible.builtin.set_fact:
_monitor_addresses: "{{
the keys for the
osd:s that remained in the host (after the pools
recovered/rebalanced).
/Johan
On 2024-05-07 at 12:09, Eugen Block wrote:
Hi, did you remove the host from the host list [0]?
ceph orch host rm <hostname> [--force] [--offline]
[0]
https://docs.ceph.com/en/latest/cephadm/host-management/#offline-host-removal
Hi,
we're facing an issue during upgrades (and sometimes server reboots),
it appears to occur when (at leat) one of the MONs has to do a full
sync. And I'm wondering if the upgrade procedure could be improved in
that regard, I'll come back to that later. First, I'll try to
summarize the
Hi,
I'm not the biggest rbd-mirror expert.
As I understand it, if you use one-way mirroring you can fail over to the
remote site and continue to work there, but there's no failover back to the
primary site. You would need to stop client IO on DR, demote the image
and then import the remote images
Hi, did you remove the host from the host list [0]?
ceph orch host rm <hostname> [--force] [--offline]
[0]
https://docs.ceph.com/en/latest/cephadm/host-management/#offline-host-removal
Quoting Johan:
Hi all,
In my small cluster of 6 hosts I had troubles with a host (osd:s)
and was planning to
Hi,
it's a bit much output to scan through, I'd recommend omitting all
unnecessary information before pasting. Anyway, this sticks out:
2024-05-01T15:49:26.977+0000 7f85688e8700 0 [dashboard ERROR
frontend.error] (https://172.20.2.30:8443/#/login): Http failure
response for
And in the tracker you never mentioned to add a symlink, only to add
the prefix "/rootfs" to the ceph config. I could have tried that
approach first. ;-)
Quoting Eugen Block:
Alright, I updated the configs in our production cluster and
restarted the OSDs (after removing
.
Thanks!
Eugen
Quoting Wyll Ingersoll:
Yeah, now that you mention it, I recall figuring that out also at
some point. I think I did it originally when I was debugging the
problem without the container.
From: Eugen Block
Sent: Friday, May 3, 2024 8:37 AM
restart the OSD finds its correct location. So I actually
only need to update the location path, nothing else, it seems.
Quoting Eugen Block:
I found your (open) tracker issue:
https://tracker.ceph.com/issues/53562
Your workaround works great, I tried it in a test cluster
successfully
I found your (open) tracker issue:
https://tracker.ceph.com/issues/53562
Your workaround works great, I tried it in a test cluster
successfully. I will adopt it to our production cluster as well.
Thanks!
Eugen
Quoting Eugen Block:
Thank you very much for the quick response! I will take
ler" mailto:wei...@soe.ucsc.edu>>
*An: *"Eugen Block" mailto:ebl...@nde.ag>>,
ceph-users@ceph.io <mailto:ceph-users@ceph.io>
*Date: *02-05-2024 21:05
Hi Eugen,
Thanks for the tip! I just ran:
ceph orch daemon restart mgr.pr-md-01.jemmdf
(my specific m
Yep, seen this a couple of times during upgrades. I’ll have to check
my notes if I wrote anything down for that. But try a mgr failover
first, that could help.
Quoting Erich Weiler:
Hi All,
For a while now I've been using 'ceph fs status' to show current MDS
active servers,
Can you please paste the output of the following command?
ceph orch host ls
Zitat von "Roberto Maggi @ Debian" :
Hi you all,
I have been facing this problem for a couple of days.
Although I already destroyed the cluster a couple of times, I
continuously get these error
I instruct ceph to place
ituation is not
ideal.
____
From: Eugen Block
Sent: Thursday, May 2, 2024 10:23 AM
To: ceph-users@ceph.io
Subject: [ceph-users] cephadm custom crush location hooks
Hi,
we've been using custom crush location hooks for some OSDs [1] for
years. Since we moved to cephadm, we always have to manually
Hi,
we've been using custom crush location hooks for some OSDs [1] for
years. Since we moved to cephadm, we always have to manually edit the
unit.run file of those OSDs because the path to the script is not
mapped into the containers. I don't want to define custom location
hooks for all
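The workaround mentioned elsewhere in this digest (prefixing the host path with /rootfs, which cephadm bind-mounts into the OSD containers) could look roughly like this; the script path and OSD id are placeholders:
# the hook script lives on the host; the container sees the host's / under /rootfs
ceph config set osd.42 crush_location_hook /rootfs/usr/local/bin/custom-crush-hook.sh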
Hi,
did you maybe have some test clusters leftovers on the hosts so
cephadm might have picked up the wrong FSID?
Does that mean that you adopted all daemons and only afterwards looked
into ceph -s? I would have adopted the first daemon and checked
immediately if everything still was as
Hi,
is the cluster healthy? Sometimes a degraded state prevents the
orchestrator from doing its work. Then I would fail the mgr (ceph mgr
fail), this seems to be necessary lots of times. Then keep an eye on
the active mgr log as well as the cephadm.log locally on the host
where the OSDs
Oh I'm sorry, Peter, I don't know why I wrote Karl. I apologize.
Quoting Eugen Block:
Hi Karl,
I must admit that I haven't dealt with raw OSDs yet. We've been
usually working with LVM based clusters (some of the customers used
SUSE's product SES) and in SES there was a recommendation
-bf3474f90508:/var/log/ceph:z -v
/dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v
/run/lock/lvm:/run/lock/lvm -v /:/rootfs -v
/tmp/ceph-tmpjox0_hj0:/etc/ceph/ceph.conf:z
quay.io/ceph/ceph@sha256:96f2a53bc3028eec16e790c6225e7d7acad8a48737a57ec14eea7ce036733233
raw activate --osd-id 20 --dev
Hi,
cephadm stores a local copy of the cephadm binary in
/var/lib/ceph/{FSID}/cephadm.{DIGEST}:
quincy-1:~ # ls -lrt /var/lib/ceph/{FSID}/cephadm.*
-rw-r--r-- 1 root root 350889 26. Okt 2023
/var/lib/ceph/{FSID}/cephadm.f6868821c084cd9740b59c7c5eb59f0dd47f6e3b1e6fecb542cb44134ace8d78
there will be soon some more remapping. :-)
So I would consider this thread as closed, all good.
Quoting Eugen Block:
No, we didn’t change much, just increased the max pg per osd to
avoid warnings and inactive PGs in case a node would fail during
this process. And the max backfills
Hi,
can you share the current 'ceph status'? Do you have any inconsistent
PGs or something? What are the cephfs data pool's min_size and size?
Quoting Alexey GERASIMOV:
Colleagues, thank you for the advice to check the operability of
MGRs. In fact, it is strange also: we checked our
" in method 1 and "migrating
PGs" in method 2? I think method 1 must read the OSD to be removed.
Otherwise, we would not see slow ops warning. Does method 2 not involve
reading this OSD?
Thanks,
Mary
On Fri, Apr 26, 2024 at 5:15 AM Eugen Block wrote:
> Hi,
>
> if you rem
Hi, I didn't find any config options other than the ones you already found.
Just wanted to note that I did read your message. :-)
Maybe one of the Devs can comment.
Quoting Stefan Kooman:
Hi,
We're testing with rbd-mirror (mode snapshot) and try to get status
updates about snapshots as fast
Hi,
if you remove the OSD this way, it will be drained. Which means that
it will try to recover PGs from this OSD, and in case of hardware
failure it might lead to slow requests. It might make sense to
forcefully remove the OSD without draining:
- stop the osd daemon
- mark it as out
-
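A hedged sketch of such a non-draining removal with cephadm (the OSD id is a placeholder):
ceph orch daemon stop osd.17
ceph osd out 17
ceph orch osd rm 17 --force --zap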
Hi, it's unlikely that all OSDs fail at the same time, it seems like a
network issue. Do you have an active MGR? Just a couple of days ago
someone reported incorrect OSD stats because no MGR was up. Although
your 'ceph health detail' output doesn't mention that, there are still
issues when
mon_osd_nearfull_ratio temporarily?
Frédéric.
- On 25 Apr 24, at 12:35, Eugen Block ebl...@nde.ag wrote:
For those interested, just a short update: the split process is
approaching its end, two days ago there were around 230 PGs left
(target are 4096 PGs). So far there were no complaints, no cluster
increasing osd_max_backfills to any values
higher than
2-3 will not help much with the recovery/backfilling speed.
All the way, you'll have to be patient. :-)
Cheers,
Frédéric.
- On 10 Apr 24, at 12:54, Eugen Block ebl...@nde.ag wrote:
Thank you for input!
We started the split with max
Hi,
I saw something like this a couple of weeks ago on a customer cluster.
I'm not entirely sure, but this was either due to (yet) missing or
wrong cephadm ssh config or a label/client-keyring management issue.
If this is still an issue I would recommend to check the configured
keys to be
In addition to Nico's response, three years ago I wrote a blog post
[1] about that topic, maybe that can help as well. It might be a bit
outdated, what it definitely doesn't contain is this command from the
docs [2] once the server has been re-added to the host list:
ceph cephadm osd
possible to implement a modify operation in the future
without breaking stuff. And you can save time on the documentation,
because it works like other stuff.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
____
From: Eugen Bl
Oh, I see. Unfortunately, I don't have a cluster in stretch mode so I
can't really test that. Thanks for pointing to the tracker.
Quoting Stefan Kooman:
On 23-04-2024 14:40, Eugen Block wrote:
Hi,
what's the right way to add another pool?
create pool with 4/2 and use the rule
Hi,
I believe the docs [2] are okay, running 'ceph fs authorize' will
overwrite the existing caps, it will not add more caps to the client:
Capabilities can be modified by running fs authorize only in the
case when read/write permissions must be changed.
If a client already has a
Hi,
what's the right way to add another pool?
create pool with 4/2 and use the rule for the stretched mode, finished?
the existing pools were automatically set to 4/2 after "ceph mon
enable_stretch_mode".
if that is what you require, then yes, it's as easy as that. Although
I haven't played
I'm not entirely sure if I ever tried it with the rbd-mirror user
instead of admin user, but I see the same error message on 17.2.7. I
assume that it's not expected, I think a tracker issue makes sense.
Thanks,
Eugen
Quoting Stefan Kooman:
Hi,
We are testing rbd-mirroring. There seems
IIRC, you have 8 GB configured for the mds cache memory limit, and it
doesn’t seem to be enough. Does the host run into oom killer as well?
But it’s definitely a good approach to increase the cache limit (try
24 GB if possible since it’s trying to use at least 19 GB) on a host
with enough
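A hedged example of raising that limit to 24 GiB (value in bytes; apply it per MDS or cluster-wide as appropriate):
ceph config set mds mds_cache_memory_limit 25769803776   # 24 * 1024^3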
t have any clients connected).
Quoting Eugen Block:
Hi,
I don't see a reason why Quincy rgw daemons shouldn't work with a
Reef cluster. It would basically mean that you have a staggered
upgrade [1] running and didn't upgrade RGWs yet. It should also work
to just downgrade them, e
Hi,
I don't see a reason why Quincy rgw daemons shouldn't work with a Reef
cluster. It would basically mean that you have a staggered upgrade [1]
running and didn't upgrade RGWs yet. It should also work to just
downgrade them, either by providing a different default image, then
redeploy
Right, I just figured from the health output you would have a couple
of seconds or so to query the daemon:
mds: 1/1 daemons up
Quoting Alexey GERASIMOV:
Ok, we will create the ticket.
Eugen Block - ceph tell command needs to communicate with the MDS
daemon running
Hi Erich,
there's no simple answer to your question, as always it depends.
Every now and then there are threads about clients misbehaving,
especially with the "flush tid" messages. For example, the docs [1]
state:
The CephFS client-MDS protocol uses a field called the oldest tid to
What’s the output of:
ceph tell mds.0 damage ls
Quoting alexey.gerasi...@opencascade.com:
Dear colleagues, hope that anybody can help us.
The initial point: Ceph cluster v15.2 (installed and controlled by
the Proxmox) with 3 nodes based on physical servers rented from a
cloud
Hi, there are lots of metrics that are collected by the MGR. So if
there is none, the cluster health details can be wrong or outdated.
Quoting Tobias Langner:
Hey Alwin,
Thanks for your reply, answers inline.
I'd assume (w/o pool config) that the EC 2+1 is putting PG as
inactive.
Hi,
without looking too deep into it, I would just assume that the daemons
and clients are connected to different MONs. Or am I misunderstanding
your question?
Quoting Joel Davidow:
Just curious why the feature_map portions differ in the return of
mon_status across a cluster. Below
Hi,
I'm not sure if and how that could help, there's a get-crushmap
command for the ceph-monstore-tool:
[ceph: root@host1 /]# ceph-monstore-tool /var/lib/ceph/mon/ceph-host1/
show-versions -- --map-type crushmap > show-versions
[ceph: root@host1 /]# cat show-versions
first committed:
"if something goes wrong,
monitors will fail" rather discouraging :-)
/Z
On Tue, 16 Apr 2024 at 18:59, Eugen Block wrote:
Sorry, I meant extra-entrypoint-arguments:
https://www.spinics.net/lists/ceph-users/msg79251.html
Quoting Eugen Block:
> You can use the extra containe
Sorry, I meant extra-entrypoint-arguments:
https://www.spinics.net/lists/ceph-users/msg79251.html
Quoting Eugen Block:
You can use the extra container arguments I pointed out a few months
ago. Those work in my test clusters, although I haven’t enabled that
in production yet
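For reference, a hedged sketch of what such a spec could look like; the argument shown is purely illustrative and not the setting discussed in this thread:
service_type: mon
service_name: mon
placement:
  count: 3
extra_entrypoint_args:
- "--debug_ms=1"   # hypothetical extra daemon argument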
in theory this
> should result in lower but much faster compression.
>
> I hope this helps. My plan is to keep the monitors with the current
> settings, i.e. 3 with compression + 2 without compression, until the next
> minor release of Pacific to see whether the monitors with compressed
&g
Ah, okay, thanks for the hint. In that case what I see is expected.
Quoting Robert Sander:
Hi,
On 16.04.24 10:49, Eugen Block wrote:
I believe I can confirm your suspicion, I have a test cluster on
Reef 18.2.1 and deployed nfs without HAProxy but with keepalived [1].
Stopping
Hm, no, I can't confirm it yet. I missed something in the config, the
failover happens and a new nfs daemon is deployed on a different node.
But I still see client interruptions so I'm gonna look into that first.
Quoting Eugen Block:
Hi,
I believe I can confirm your suspicion, I have
Hi,
I believe I can confirm your suspicion, I have a test cluster on Reef
18.2.1 and deployed nfs without HAProxy but with keepalived [1].
Stopping the active NFS daemon doesn't trigger anything, the MGR
notices that it's stopped at some point, but nothing else seems to
happen. I didn't
you'll have to be patient. :-)
Cheers,
Frédéric.
- On 10 Apr 24, at 12:54, Eugen Block ebl...@nde.ag wrote:
Thank you for input!
We started the split with max_backfills = 1 and watched for a few
minutes, then gradually increased it to 8. Now it's backfilling with
around 180 MB/s, not really much
, but we
haven't noticed it before.
HTH,
Greg.
On 10/4/24 14:42, Eugen Block wrote:
Thank you, Janne.
I believe the default 5% target_max_misplaced_ratio would work as
well, we've had good experience with that in the past, without the
autoscaler. I just haven't dealt with such large PGs, I've
) and now they finally started to listen. Well, they would still
ignore it if it wouldn't impact all kinds of things now. ;-)
Thanks,
Eugen
Quoting Janne Johansson:
On Tue, 9 Apr 2024 at 10:39, Eugen Block wrote:
I'm trying to estimate the possible impact when large PGs are
split
is a simpler
In any case, it’s worth trying and using the maximum capabilities of
the upmap
Good luck,
k
[1] https://github.com/digitalocean/pgremapper
On 9 Apr 2024, at 11:39, Eugen Block wrote:
I'm trying to estimate the possible impact when large PGs are
split. Here's one example
Hi,
I'm trying to estimate the possible impact when large PGs are
split. Here's one example of such a PG:
PG_STAT OBJECTS BYTES OMAP_BYTES* OMAP_KEYS* LOG DISK_LOG UP
86.3ff 277708 4144030984090 0 3092 3092
Hi Torkil,
I assume the affected OSDs were the ones with slow requests, no? You
should still see them in some of the logs (mon, mgr).
Quoting Torkil Svensgaard:
On 06-04-2024 18:10, Torkil Svensgaard wrote:
Hi
Cephadm Reef 18.2.1
Started draining 5 18-20 TB HDD OSDs (DB/WAL on NVMe)
Sorry, I hit send too early, to enable multi-active MDS the full command is:
ceph fs flag set enable_multiple true
Quoting Eugen Block:
Did you enable multi-active MDS? Can you please share 'ceph fs
dump'? Port 6789 is the MON port (v1, v2 is 3300). If you haven't
enabled multi-active
Did you enable multi-active MDS? Can you please share 'ceph fs dump'?
Port 6789 is the MON port (v1, v2 is 3300). If you haven't enabled
multi-active, run:
ceph fs flag set enable_multiple true
Quoting elite_...@163.com:
I tried to remove the default fs then it works, but port 6789 still
Hi,
the noin flag seems to be only applicable to existing OSDs which are
already in the crushmap. It doesn't apply to newly created OSDs, I
could confirm that in a small test cluster with Pacific and Reef. I
don't have any insights if that is by design or not, I assume it's
supposed to
parameter?
Or maybe look into speeding up LV creation (if this is the bootleneck)?
Thanks a lot,
Mathias
-Original Message-
From: Kuhring, Mathias
Sent: Friday, March 22, 2024 5:38 PM
To: Eugen Block ; ceph-users@ceph.io
Subject: [ceph-users] Re: [ext] Re: cephadm auto disk preparation
Hi,
1. I see no systemd units with the fsid in them, as described in the
document above. Both before and after the upgrade, my mon and other
units are:
ceph-mon@.service
ceph-osd@[N].service
etc
Should I be concerned?
I think this is expected because it's not containerized, no reason to
Hi,
how many OSDs do you have in total? Can you share your osd tree, please?
You could check the unit.meta file on each OSD host to see which
service it refers to and simply change it according to the service you
intend to keep:
host1:~ # grep -r service_name
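The truncated command above would, as a hedged sketch, look something like this (fsid path and spec name are placeholders):
grep -r service_name /var/lib/ceph/*/osd.*/unit.meta
# edit the unit.meta of a misassigned OSD, e.g. "service_name": "osd.keep-this-spec",
# then restart the daemon so the change takes effect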
Hi,
you need to deploy more daemons because your current active MDS is
responsible for the already existing CephFS. There are several ways to
do this, I like the yaml file approach and increase the number of MDS
daemons, just as an example from a test cluster with one CephFS I
added the
9945d0514222bd7a83e28b96e8440c630ba6891f",
"RepoTags": [
"ceph/daemon:latest-pacific"
"RepoDigests": [
"ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586"
-Original Message-
From: Adiga, Anantha
Sent:
, but that was it.
/Z
On Tue, 2 Apr 2024 at 11:00, Eugen Block wrote:
Hi,
here's the link to the docs [1] how to replace OSDs.
ceph orch osd rm --replace --zap [--force]
This should zap both the data drive and db LV (yes, its data is
useless without the data drive), not sure how it will handle if the
data
Hi,
here's the link to the docs [1] how to replace OSDs.
ceph orch osd rm --replace --zap [--force]
This should zap both the data drive and db LV (yes, its data is
useless without the data drive), not sure how it will handle if the
data drive isn't accessible though.
One thing I'm not
Hi,
without knowing the whole story, to cancel OSD removal you can run
this command:
ceph orch osd rm stop <osd_id>
Regards,
Eugen
Zitat von "adam.ther" :
Hello,
I have a single node host with a VM as a backup MON, MGR, etc.
This has caused all OSDs to be pending as 'deleting', can I safely
- a001s017
- a001s018
# ceph orch ls --service_name=mon --export
service_type: mon
service_name: mon
placement:
  count: 3
  hosts:
  - a001s016
  - a001s017
  - a001s018
-Original Message-
From: Adiga, Anantha
Sent: Monday, April 1, 2024 6:06 PM
To: Eugen Block
Cc: ceph-users@c
min_mon_release 16 (pacific)
election_strategy: 1
0: [v2:10.45.128.28:3300/0,v1:10.45.128.28:6789/0] mon.a001s018
1: [v2:10.45.128.27:3300/0,v1:10.45.128.27:6789/0] mon.a001s017
Thank you,
Anantha
-Original Message-
From: Eugen Block
Sent: Monday, April 1, 2024 1:10 PM
To: ceph-users@ce
Maybe it’s just not in the monmap? Can you show the output of:
ceph mon dump
Did you do any maintenance (apparently OSDs restarted recently) and
maybe accidentally removed a MON from the monmap?
Zitat von "Adiga, Anantha" :
Hi Anthony,
Seeing it since last after noon. It is same with
Hi,
what does your node-exporter spec look like?
ceph orch ls node-exporter --export
If other node-exporter daemons are running in the cluster, what's the
difference between them? Do they all have the same container image?
ceph config get mgr mgr/cephadm/container_image_node_exporter
and
omp rx=0 tx=0)._fault waiting 15.00
2024-03-13T11:14:29.795+0800 7f6980206640 10 RDMAStack polling finally
delete qp = 0x5650c54164b0
Eugen Block wrote on Tue, 19 Mar 2024 at 14:50:
Hi,
there are several existing threads on this list, have you tried to
apply those suggestions? A couple of them were:
- ceph mgr
Hi,
before getting into that the first thing I would do is to fail the
mgr. There have been too many issues where failing over the mgr
resolved many of them.
If that doesn't help, the cephadm.log should show something useful
(/var/log/ceph/cephadm.log on the OSD hosts, I'm still not too
Hi Jesper,
could you please provide more details about the cluster (the usual
like 'ceph osd tree', 'ceph osd df', 'ceph versions')?
I find it unusual to enable maintenance mode to add OSDs, is there a
specific reason?
And why adding OSDs manually with 'ceph orch osd add', why not have a
Hi,
there are several existing threads on this list, have you tried to
apply those suggestions? A couple of them were:
- ceph mgr fail
- check time sync (NTP, chrony)
- different weights for MONs
- Check debug logs
Regards,
Eugen
Quoting faicker mo:
some logs here,
It's your pool replication (size = 3):
3886733 (number of objects) * 3 = 11660199
Quoting Thorne Lawler:
Can anyone please tell me what "COPIES" means in this context?
[ceph: root@san2 /]# rados df -p cephfs.shared.data
POOL_NAME USED OBJECTS CLONES COPIES
Hi Torkil,
Num is 0 but it's not replicated so how does this translate to
picking 3 of 3 datacenters?
it doesn't really make a difference if replicated or not, it just
defines how many crush buckets to choose, so it applies in the same
way as for your replicated pool.
I am thinking we
Yeah, the whole story would help to give better advice. With EC the
default min_size is k+1, you could reduce the min_size to 5
temporarily, this might bring the PGs back online. But the long term
fix is to have all required OSDs up and have enough OSDs to sustain an
outage.
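A hedged example of that temporary change (pool name is a placeholder; raise min_size back to k+1 once enough OSDs are available again):
ceph osd pool set my-ec-pool min_size 5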
Quoting
Hi,
I assume you're still on a "low" pacific release? This was fixed by PR
[1][2] and the warning is suppressed when autoscaler is on, it was
merged into Pacific 16.2.8 [3].
I can't answer why autoscaler doesn't increase the pg_num, but yes,
you can increase it by yourself. The pool for
Hi,
your ceph version seems to be 17.2.4, not 17.2.6 (which is the locally
installed ceph version on the system where you ran the command) Could
you add the 'ceph versions' output as well?
How is the load on the systems when the recovery starts? The OSDs
crash after around 20 minutes,
sd.3, it crashes in less than a minute
23:49 : After I mark osd.3 "in" and start it again, it comes back
online with osd.0 and osd.11 soon after
Best regards,
Romain Lebbadi-Breteau
On 2024-03-08 3:17 a.m., Eugen Block wrote:
Hi,
can you share more details? Which OSD are you trying
Hi,
can you share more details? Which OSD are you trying to get out, the
primary osd.3?
Can you also share 'ceph osd df'?
It looks like a replicated pool with size 3, can you confirm with
'ceph osd pool ls detail'?
Do you have logs from the crashing OSDs when you take out osd.3?
Which ceph
Thanks! That's very interesting to know!
Zitat von "David C." :
some monitors have existed for many years (weight 10) others have been
added (weight 0)
=> https://github.com/ceph/ceph/commit/2d113dedf851995e000d3cce136b69
bfa94b6fe0
On Thursday, 7 March 2024, Eugen Block wrote: