To make it possible to implement a modify operation in the future
without breaking stuff. And you can save time on the documentation,
because it works like other stuff.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________
From
In addition to Nico's response, three years ago I wrote a blog post
[1] about that topic, maybe that can help as well. It might be a bit
outdated; what it definitely doesn't contain is this command from the
docs [2], to be run once the server has been re-added to the host list:
ceph cephadm osd activate
Hi,
I saw something like this a couple of weeks ago on a customer cluster.
I'm not entirely sure, but this was either due to (yet) missing or
wrong cephadm ssh config or a label/client-keyring management issue.
If this is still an issue I would recommend checking the configured
keys to be m
is
cluster only has 240, increasing osd_max_backfills to any value
higher than 2-3 will not help much with the recovery/backfilling speed.
Either way, you'll have to be patient. :-)
Cheers,
Frédéric.
- On 10 Apr 24, at 12:54, Eugen Block ebl...@nde.ag wrote:
Thank you for input!
mon_osd_nearfull_ratio temporarily?
Frédéric.
- On 25 Apr 24, at 12:35, Eugen Block ebl...@nde.ag wrote:
For those interested, just a short update: the split process is
approaching its end, two days ago there were around 230 PGs left
(target are 4096 PGs). So far there were no complaints, no cluster
Hi, it's unlikely that all OSDs failed at the same time; it seems more like a
network issue. Do you have an active MGR? Just a couple of days ago
someone reported incorrect OSD stats because no MGR was up. Although
your 'ceph health detail' output doesn't mention that, there are still
issues when
Hi,
if you remove the OSD this way, it will be drained, which means that
it will try to recover PGs from this OSD, and in case of hardware
failure that might lead to slow requests. It might make sense to
forcefully remove the OSD without draining:
- stop the osd daemon
- mark it as out
- os
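A rough sketch of that sequence, assuming a cephadm-managed cluster and a hypothetical osd.12:
# stop the daemon without draining it first
ceph orch daemon stop osd.12
# mark it out so the cluster stops expecting data from it
ceph osd out 12
# remove it from the crush map, the osd map and the auth database
ceph osd purge 12 --yes-i-really-mean-it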
Hi, I didn’t find any other config options other than you already did.
Just wanted to note that I did read your message. :-)
Maybe one of the Devs can comment.
Zitat von Stefan Kooman :
Hi,
We're testing with rbd-mirror (mode snapshot) and try to get status
updates about snapshots as fast
uating PGs" in method 1 and "migrating
PGs" in method 2? I think method 1 must read the OSD to be removed.
Otherwise, we would not see slow ops warning. Does method 2 not involve
reading this OSD?
Thanks,
Mary
On Fri, Apr 26, 2024 at 5:15 AM Eugen Block wrote:
> Hi,
>
>
Hi,
can you share the current 'ceph status'? Do you have any inconsistent
PGs or something? What are the cephfs data pool's min_size and size?
Zitat von Alexey GERASIMOV :
Colleagues, thank you for the advice to check the operability of
MGRs. In fact, it is strange also: we checked our no
there will be soon some more remapping. :-)
So I would consider this thread as closed, all good.
Zitat von Eugen Block :
No, we didn’t change much, just increased the max pg per osd to
avoid warnings and inactive PGs in case a node would fail during
this process. And the max backfills, of
Hi,
cephadm stores a local copy of the cephadm binary in
/var/lib/ceph/{FSID}/cephadm.{DIGEST}:
quincy-1:~ # ls -lrt /var/lib/ceph/{FSID}/cephadm.*
-rw-r--r-- 1 root root 350889 26. Okt 2023
/var/lib/ceph/{FSID}/cephadm.f6868821c084cd9740b59c7c5eb59f0dd47f6e3b1e6fecb542cb44134ace8d78
-rw-r-
osd3 -e CEPH_USE_RANDOM_NONCE=1 -e
CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v
/var/log/ceph/ed7b2c16-b053-45e2-a1fe-bf3474f90508:/var/log/ceph:z -v
/dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v
/run/lock/lvm:/run/lock/lvm -v /:/rootfs -v
/tmp/ceph-tmpjox0_hj0:/etc/ceph/ce
Oh I'm sorry, Peter, I don't know why I wrote Karl. I apologize.
Zitat von Eugen Block :
Hi Karl,
I must admit that I haven't dealt with raw OSDs yet. We've been
usually working with LVM based clusters (some of the customers used
SUSE's product SES) and in SES t
Hi,
is the cluster healthy? Sometimes a degraded state prevents the
orchestrator from doing its work. Then I would fail the mgr (ceph mgr
fail), this seems to be necessary lots of times. Then keep an eye on
the active mgr log as well as the cephadm.log locally on the host
where the OSDs n
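A minimal sketch of those steps (the log path assumes the default cephadm location):
ceph mgr fail
# follow the orchestrator's activity from the active mgr
ceph -W cephadm
# on the host where the OSDs should be deployed
tail -f /var/log/ceph/cephadm.log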
Hi,
did you maybe have some test cluster leftovers on the hosts so
cephadm might have picked up the wrong FSID?
Does that mean that you adopted all daemons and only afterwards looked
into ceph -s? I would have adopted the first daemon and checked
immediately if everything still was as expe
Hi,
we've been using custom crush location hooks for some OSDs [1] for
years. Since we moved to cephadm, we always have to manually edit the
unit.run file of those OSDs because the path to the script is not
mapped into the containers. I don't want to define custom location
hooks for all O
..). The current situation is not
ideal.
____
From: Eugen Block
Sent: Thursday, May 2, 2024 10:23 AM
To: ceph-users@ceph.io
Subject: [ceph-users] cephadm custom crush location hooks
Hi,
we've been using custom crush location hooks for some OSDs [1] for
years. Since we moved to cephadm, w
Can you please paste the output of the following command?
ceph orch host ls
Zitat von "Roberto Maggi @ Debian" :
Hi you all,
it is a couple of days I'm facing this problem.
Although I already destroyed the cluster a couple of times I
continuously get these error
I instruct ceph to place
Yep, seen this a couple of times during upgrades. I’ll have to check
my notes if I wrote anything down for that. But try a mgr failover
first, that could help.
Zitat von Erich Weiler :
Hi All,
For a while now I've been using 'ceph fs status' to show current MDS
active servers, filesystem
s?
*From:* "Erich Weiler" <wei...@soe.ucsc.edu>
*To:* "Eugen Block" <ebl...@nde.ag>, ceph-users@ceph.io
*Date:* 02-05-2024 21:05
Hi Eugen,
Thanks for the tip! I just ran:
ceph orch daemon restart mgr.pr-md-01.
I found your (open) tracker issue:
https://tracker.ceph.com/issues/53562
Your workaround works great, I tried it in a test cluster
successfully. I will adopt it to our production cluster as well.
Thanks!
Eugen
Zitat von Eugen Block :
Thank you very much for the quick response! I will take
restart the OSD finds its correct location. So I actually
only need to update the location path, nothing else, it seems.
Zitat von Eugen Block :
I found your (open) tracker issue:
https://tracker.ceph.com/issues/53562
Your workaround works great, I tried it in a test cluster
successfully
.
Thanks!
Eugen
Zitat von Wyll Ingersoll :
Yeah, now that you mention it, I recall figuring that out also at
some point. I think I did it originally when I was debugging the
problem without the container.
From: Eugen Block
Sent: Friday, May 3, 2024 8:37 AM
To
And in the tracker you never mentioned to add a symlink, only to add
the prefix "/rootfs" to the ceph config. I could have tried that
approach first. ;-)
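If I understand the workaround correctly, it boils down to something like this (the hook path is purely a hypothetical example):
ceph config set osd crush_location_hook /rootfs/usr/local/bin/custom-crush-location.sh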
Zitat von Eugen Block :
Alright, I updated the configs in our production cluster and
restarted the OSDs (after removing
Hi,
it's a bit much output to scan through, I'd recommend omitting all
unnecessary information before pasting. Anyway, this sticks out:
2024-05-01T15:49:26.977+ 7f85688e8700 0 [dashboard ERROR
frontend.error] (https://172.20.2.30:8443/#/login): Http failure
response for https://172.20
Hi, did you remove the host from the host list [0]?
ceph orch host rm [--force] [--offline]
[0]
https://docs.ceph.com/en/latest/cephadm/host-management/#offline-host-removal
Zitat von Johan :
Hi all,
In my small cluster of 6 hosts I had troubles with a host (osd:s)
and was planning to
Hi,
I'm not the biggest rbd-mirror expert.
As I understand it, if you use one-way mirroring you can fail over to the
remote site and continue to work there, but there's no failover back to the
primary site. You would need to stop client IO on DR, demote the image
and then import the remote images back
Hi,
we're facing an issue during upgrades (and sometimes server reboots),
it appears to occur when (at least) one of the MONs has to do a full
sync. And I'm wondering if the upgrade procedure could be improved in
that regard, I'll come back to that later. First, I'll try to
summarize the e
he host. I also manually removed the keys for the
osd:s that remained in the host (after the pools
recovered/rebalanced).
/Johan
Den 2024-05-07 kl. 12:09, skrev Eugen Block:
Hi, did you remove the host from the host list [0]?
ceph orch host rm [--force] [--offline]
[0]
https://docs.ceph.co
Hi,
I'm not familiar with ceph-ansible. I'm not sure if I understand it
correctly, according to [1] it tries to get the public IP range to
define monitors (?). Can you verify if your mon sections in
/etc/ansible/hosts are correct?
ansible.builtin.set_fact:
_monitor_addresses: "{{ _mon
Hi,
I don't have a Reef production cluster available yet, only a small
test cluster (upgraded from 18.2.1 to 18.2.2 this week). Although I
don't use the RGWs constantly there, there are graphs in the ceph
dashboard. Maybe it's related to the grafana (and/or prometheus)
versions?
My clus
Hi Paul,
I don't really have a good answer to your question, but maybe this
approach can help track down the clients.
Each MDS client has an average "uptime" metric stored in the MDS:
storage01:~ # ceph tell mds.cephfs.storage04.uxkclk session ls
...
"id": 409348719,
...
"upt
I just read your message again, you only mention newly created files,
not new clients. So my suggestion probably won't help you in this
case, but it might help others. :-)
Zitat von Eugen Block :
Hi Paul,
I don't really have a good answer to your question, but maybe this
ap
Hi,
I got into a weird and unexpected situation today. I added 6 hosts to
an existing Pacific cluster (16.2.13, 20 existing OSD hosts across 2
DCs). The hosts were added to the root=default subtree, their
designated location is one of two datacenters underneath the default
root. Nothing u
First thing to try would be to fail the mgr. Although the daemons
might be active from a systemd perspective, they sometimes get
unresponsive. I saw that in Nautilus clusters as well, so that might
be worth a try.
Zitat von Huy Nguyen :
Ceph version 14.2.7
Ceph osd df tree command take lo
1 May 2024, at 15:26, Eugen Block wrote:
step set_choose_tries 100
I think you should try to increase set_choose_tries to 200
Last year we had a Pacific EC 8+2 deployment of 10 racks. And even
with 50 hosts, the value of 100 did not work for us
k
__
It’s usually no problem to shut down a cluster. Set at least the noout
flag, the other flags like norebalance, nobackfill etc won’t hurt
either. Then shut down the servers. I do that all the time with test
clusters (they do have data, just not important at all), and I’ve
never had data loss
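As a short sketch, the flags can be set and cleared like this:
ceph osd set noout
ceph osd set norebalance
ceph osd set nobackfill
# ... shut down the servers, do the maintenance, power everything back on ...
ceph osd unset nobackfill
ceph osd unset norebalance
ceph osd unset noout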
Hi,
you can specify the entire tree in the location statement, if you need to:
ceph:~ # cat host-spec.yaml
service_type: host
hostname: ceph
addr:
location:
root: default
rack: rack2
and after the bootstrap it looks like expected:
ceph:~ # ceph osd tree
ID CLASS WEIGHT TYPE NAME
uld it
affect the otherwise healthy cluster in such a way?
Even if ceph doesn't know where to put some of the chunks, I wouldn't
expect inactive PGs and a service interruption.
What am I missing here?
Thanks,
Eugen
Zitat von Eugen Block :
Thanks, Konstantin.
It's been a wh
and describe at
which step exactly things start diverging from my expectations.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Eugen Block
Sent: Thursday, May 23, 2024 12:05 PM
To: ceph-users@ceph.io
Subject: [ceph-use
xact
mappings as used in the cluster and it encodes other important
information as well. That's why I'm asking for this instead of just
the crush map.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________
From:
investigate? I’m on my mobile right now, I’ll add my own
osdmap to the thread soon.
Zitat von Eugen Block :
Thanks, Frank, I appreciate your help.
I already asked for the osdmap, but I’ll also try to find a reproducer.
Zitat von Frank Schilder :
Hi Eugen,
thanks for this clarification. Yes
attached my osdmap, not sure if it will go through, though. Let me
know if you need anything else.
Thanks!
Eugen
Zitat von Eugen Block :
In my small lab cluster I can at least reproduce that a bunch of PGs
are remapped after adding hosts to the default root, but they are
not in their
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
____
From: Frank Schilder
Sent: Thursday, May 23, 2024 6:32 PM
To: Eugen Block
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: unknown PGs after adding hosts in different subtree
Hi Eugen,
Zitat von Matthew Vernon :
Hi,
On 22/05/2024 12:44, Eugen Block wrote:
you can specify the entire tree in the location statement, if you need to:
[snip]
Brilliant, that's just the ticket, thank you :)
This should be made a bit clearer in the docs [0], I added Zac.
I've opened a MR to upd
hosts directly where they belong.
- Set osd_crush_initial_weight = 0 to avoid remapping until everything
is where it's supposed to be, then reweight the OSDs.
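A rough sketch of that second point (OSD id and target weight are placeholders):
ceph config set osd osd_crush_initial_weight 0
# add the new hosts/OSDs, then once everything sits in the right place:
ceph osd crush reweight osd.42 3.63869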
Zitat von Eugen Block :
Hi Frank,
thanks for looking up those trackers. I haven't looked into them
yet, I'll read y
In case you have time, it would be great if you could collect
information on (reproducing) the fatal peering problem. While
remappings might be "unexpectedly expected" it is clearly a serious
bug that incomplete and unknown PGs show up in the process of adding
hosts at the root.
Best reg
Can you try the vg/lv syntax instead?
ceph orch daemon add osd ceph1:vg_osd/lvm_osd
Although both ways work in my small test cluster with 18.2.2 (as far
as I know 18.2.3 hasn't been released yet):
# ceph orch daemon add osd soc9-ceph:/dev/test-ceph/lv_osd
Created osd(s) 0 on host 'soc9-ceph'
Hi,
I think there might be a misunderstanding about one-way-mirroring. It
really only mirrors one way, from A to B. In case site A fails, you
can promote the images in B and continue using those images. But
there's no automated way back, because it's only one way. When site A
comes back,
I'm not really sure either, what about this?
ceph mds repaired
The docs state:
Mark the file system rank as repaired. Unlike the name suggests,
this command does not change a MDS; it manipulates the file system
rank which has been marked damaged.
Maybe that could bring it back up? Did yo
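For reference, a hypothetical invocation for rank 0 of a filesystem named cephfs (adjust both to your setup):
ceph mds repaired cephfs:0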
Hi,
I've never heard of automatic data deletion. Maybe just some snapshots
were removed? Or someone deleted data on purpose because of the
nearfull state of some OSDs? And there's no trash function for cephfs
(for rbd there is). Do you use cephfs snapshots?
Zitat von Prabu GJ :
Hi Team
els
like it has become the first suggestion to almost every mgr related
issue reported in this list.
I don't know what values would make sense, or if mgr_mon_bytes should
be increased as well.
Regards,
Eugen
Zitat von Eugen Block :
Hi,
I guess you mean use something like "ste
gar selvam :
Hi,
We are not using cephfs snapshots. Is there any other way to find this out?
On Thu, May 30, 2024 at 5:20 PM Eugen Block wrote:
Hi,
I've never heard of automatic data deletion. Maybe just some snapshots
were removed? Or someone deleted data on purpose because of the
nearful
How exactly does your crush rule look right now? I assume it's
supposed to distribute data across two sites, and since one site is
missing, the PGs stay in degraded state until the site comes back up.
You would need to either change the crush rule or assign a different
one to that pool whic
Hi,
I think there's something else wrong with your setup; I could
bootstrap a cluster without an issue with ed25519 keys:
ceph:~ # ssh-keygen -t ed25519
Generating public/private ed25519 key pair.
ceph:~ # cephadm --image quay.io/ceph/ceph:v18.2.2 bootstrap --mon-ip
[IP] [some more options] --s
Hi,
I don't have much to contribute, but according to the source code [1]
this seems to be a non-fatal message:
void CreatePrimaryRequest::handle_unlink_peer(int r) {
CephContext *cct = m_image_ctx->cct;
ldout(cct, 15) << "r=" << r << dendl;
if (r < 0) {
lderr(cct) << "failed to un
the check-host command manually, with --verbose flag:
cephadm --verbose check-host --expect-hostname
cephadm --verbose prepare-host
Does that show anything useful?
Zitat von isnraj...@yahoo.com:
Thanks for the reply @Eugen Block
Yes, there is some thing else is wrong in my server, but no
Do you have osd_scrub_auto_repair set to true?
Zitat von Petr Bena :
Hello,
I wanted to try out (lab ceph setup) what exactly is going to happen
when parts of data on OSD disk gets corrupted. I created a simple
test where I was going through the block device data until I found
something
Can you paste the output of:
ls -l /var/lib/ceph/
on cephhost01? It says it can't write to that directory:
Unable to write
cephhost01:/var/lib/ceph/d5d1b7c6-232f-11ef-9ea1-a73759ab75e5/cephadm.2b9d7d139a9cb40289f2358faf49a109fc297c0a25
Which distro are you using?
Zitat von isnraj...@yahoo.
That would have been my next question: did you verify that the
corrupted OSD was a primary? The default deep-scrub config scrubs all
PGs within a week, so yeah, it can take a week until it's detected. It
could have been detected sooner if those objects had been in
use by clients and
Hi,
can you check if this thread [1] applies to your situation? You don't
have multi-active MDS enabled, but maybe it's still some journal
trimming, or maybe misbehaving clients? In your first post there were
health warnings regarding cache pressure and cache size. Are those
resolved?
[
I assume it means that pools with an enabled application "cephfs" can
be targeted by specifying this tag instead of listing each pool
separately. Browsing through the code [1] seems to confirm that
(somehow, I'm not a dev):
if (g.match.pool_tag.application == ng.match.pool_tag.application
Hi Stefan,
I assume the number of dropped replicas is related to the pool's
min_size. If you increase min_size to 3 you should see only one
replica dropped from the acting set. I didn't run too detailed tests,
but a first quick one seems to confirm that:
# Test with min_size 2, size 4
48.
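For such a quick test, the relevant pool settings can be checked and adjusted like this (the pool name is just an example):
ceph osd pool get test-pool size
ceph osd pool get test-pool min_size
ceph osd pool set test-pool min_size 3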
Dr. Fabian Svara
https://ariadne.ai
On Tue, Jun 11, 2024 at 2:05 PM Eugen Block wrote:
Hi,
can you check if this thread [1] applies to your situation? You don't
have multi-active MDS enabled, but maybe it's still some journal
trimming, or maybe misbehaving clients? In your fi
: lars.koep...@ariadne.ai
Phone: +49 6221 5993580 <+4962215993580>
ariadne.ai (Germany) GmbH
Häusserstraße 3, 69115 Heidelberg
Amtsgericht Mannheim, HRB 744040
Geschäftsführer: Dr. Fabian Svara
https://ariadne.ai
On Tue, Jun 11, 2024 at 3:32 PM Eugen Block wrote:
I don't think scrubs can c
er status
{
"active": true,
"last_optimize_duration": "0:00:00.000170",
"last_optimize_started": "Wed Jun 12 13:14:49 2024",
"mode": "upmap",
"no_optimization_needed": true,
"optimize_result&
There’s also a maintenance mode available for the orchestrator:
https://docs.ceph.com/en/reef/cephadm/host-management/#maintenance-mode
There’s some more information about that in the dev section:
https://docs.ceph.com/en/reef/dev/cephadm/host-maintenance/
Zitat von Anthony D'Atri :
That's ju
ne: +49 6221 5993580 <+4962215993580>
ariadne.ai (Germany) GmbH
Häusserstraße 3, 69115 Heidelberg
Amtsgericht Mannheim, HRB 744040
Geschäftsführer: Dr. Fabian Svara
https://ariadne.ai
On Wed, Jun 12, 2024 at 5:30 PM Eugen Block wrote:
Which version did you upgrade from to 18.2.2?
I can’t
heim, HRB 744040
Geschäftsführer: Dr. Fabian Svara
https://ariadne.ai
On Thu, Jun 13, 2024 at 12:55 PM Eugen Block wrote:
Downgrading isn't supported, I don't think that would be a good idea.
I also don't see anything obvious standing out in the pg output. Any
chance you can add more OSDs to
Hi,
- Is it common practice to configure SSDs with block.db and
associate them with five HDD disks to store OSD blocks when using
eight SSDs and forty HDDs?
yes, it is common practice.
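Such a layout is usually expressed as a cephadm OSD service spec; a rough sketch (service_id and host_pattern are placeholders, ceph-volume then distributes the DB volumes across the SSDs):
service_type: osd
service_id: hdd-osds-with-ssd-db
placement:
  host_pattern: 'osd-*'
spec:
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0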
Or would it be better to only store the rgw index on SSDs? I am
also curious about the difference in
Hi,
around the same time
sounds like deep-scrubbing. Did you verify if those OSDs from the
mentioned pool were scrubbed around that time? The primary OSDs for
the PGs would log that. Is that pool that heavily used? Or could there
be one failing disk?
Zitat von "Szabo, Istvan (Agoda)" :
Hi,
this is only a theory, not a proven answer or something. But the
orchestrator does automatically reconfigure daemons depending on the
circumstances. So my theory is, some of the OSD nodes didn't respond
via public network anymore, so ceph tried to use the cluster network
as a fallback. T
Hi,
your crush rule distributes each chunk on a different host, so your
failure domain is host. The crush-failure-domain=osd from the EC
profile most likely is from the initial creation, maybe it was
supposed to be OSD during initial tests or whatever, but the crush
rule is key here.
We
Hi,
sorry for the delayed response, I was on vacation.
I would set the "debug_rbd_mirror" config to 15 (or higher) and then
watch the logs:
# ceph config set client.rbd-mirror. debug_rbd_mirror 15
Maybe that reveals anything.
Regards,
Eugen
Zitat von scott.cai...@tecnica-ltd.co.uk:
Thank
Hi,
it depends a bit on the actual OSD layout on the node and your
procedure, but there's a chance you might have hit the PG overdose
protection limit. But I would expect it to be logged in the OSD logs;
two years ago in a Nautilus cluster the message looked like this:
maybe_wait_for_max_pg withhold creatio
Hi Tim,
is this still an issue? If it is, I recommend adding some more details
so it's easier to follow your train of thought.
ceph osd tree
ceph -s
ceph health detail
ceph orch host ls
And then please point out which host you're trying to get rid of. I
would deal with the rgw thing later.
Hi,
the number of shards looks fine, maybe this was just a temporary
burst? Did you check if the rados objects in the index pool still have
more than 200k omap keys? I would try something like
rados -p listomapkeys
.dir.9213182a-14ba-48ad-bde9-289a1c0c0de8.2479481907.1.151 | wc -l
Z
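To check all index objects in the pool at once, a small loop like this could help (the index pool name is an assumption, adjust it to your zone):
for obj in $(rados -p default.rgw.buckets.index ls); do
  echo "$(rados -p default.rgw.buckets.index listomapkeys "$obj" | wc -l) $obj"
done | sort -rn | head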
the 4x nvme osd nodes 😕
Ty
____
From: Eugen Block
Sent: Tuesday, July 9, 2024 6:02 PM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: Large omap in index pool even if properly
sharded and not "OVER"
Hi,
apparently, db_slots is still not implemented. I just tried it on a
test cluster with 18.2.2:
# ceph orch apply -i osd-slots.yaml --dry-run
Error EINVAL: Failed to validate OSD spec "osd-hdd-ssd.db_devices":
Filtering for `db_slots` is not supported
If it was, I would be interested as
Do you see it in 'ceph mgr services'? You might need to change the
prometheus config as well and redeploy.
Zitat von Albert Shih :
Hi everyone
I just change the subnet of my cluster.
The cephfs part seem to working well.
But I got many error with
Jul 11 10:08:35 hostname ceph-***
the line, maybe you need to update alertmanager instead of
prometheus.
Zitat von Albert Shih :
On 11/07/2024 at 08:34:21+, Eugen Block wrote
Hi,
Sorry I miss the answer to the list.
Do you see it in 'ceph mgr services'? You might need to change the
Yes I did
root@cth
Hi,
just one question coming to mind, if you intend to migrate the images
separately, is it really necessary to set up mirroring? You could just
'rbd export' on the source cluster and 'rbd import' on the destination
cluster.
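A minimal sketch of that approach (pool/image names and the destination host are made up, and it assumes suitable keyrings on both ends):
rbd export rbd/vm-disk-1 - | ssh dest-admin-host rbd import - rbd/vm-disk-1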
Zitat von Anthony D'Atri :
I would like to use mirroring to
Hi,
containerized daemons usually have the fsid in the systemd unit, like
ceph-{fsid}@osd.5
Is it possible that you have those confused? Check the
/var/lib/ceph/osd/ directory to find possible orphaned daemons and
clean them up.
And as previously stated, it would help to see your osd tree
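To spot such leftovers, something like this could serve as a starting point (paths reflect the usual legacy vs. cephadm layouts):
# legacy (non-containerized) OSD data directories
ls /var/lib/ceph/osd/
# cephadm-managed OSDs live under the fsid instead
ls -d /var/lib/ceph/*/osd.*
# compare with what systemd actually runs
systemctl list-units 'ceph*osd*'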
ousetech.com 10.0.1.54
ceph06.internal.mousetech.com 10.0.1.56
dell02.mousetech.com 10.0.1.52 _admin rgw
www7.mousetech.com 10.0.1.7 rgw
7 hosts in cluster
On Fri, 2024-07-12 at 22:15 +, Eugen Block wrote:
Hi,
containerized daemons usually have the fsid in the systemd unit,
not sure what ill effects might
ensue. I discovered that my other offending machine actually has TWO
legacy OSD directories, but only one of them is being used. The
other OSD is the remnant of a deletion and it's just dead files now.
On 7/13/24 02:39, Eugen Block wrote:
Okay, it lo
delberg
Amtsgericht Mannheim, HRB 744040
Geschäftsführer: Dr. Fabian Svara
https://ariadne.ai
On Thu, Jun 13, 2024 at 3:01 PM Eugen Block wrote:
I'm quite sure that this could result in the impact you're seeing. To
confirm that suspicion you could stop deleting and wait a couple of
da
Hi,
I'm not sure if it's a sufficient answer, but we use our own CA and
have it integrated with a ca-bundle mapping as extra_container_args
for the rgw daemons:
extra_container_args:
- -v=/var/lib/ca-certificates/:/var/lib/ca-certificates/:ro
- -v=/var/lib/ca-certificates/ca-bundle.pem:/e
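For context, a complete (hypothetical) rgw spec with such a mapping could look roughly like this; service_id and placement are placeholders and the bundle file can be mapped the same way as the directory:
service_type: rgw
service_id: myrealm.myzone
placement:
  hosts:
    - rgw-host1
extra_container_args:
  - -v=/var/lib/ca-certificates/:/var/lib/ca-certificates/:ro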
was just because of a
typing error when adding a new ceph host. That problem has now been
resolved.
Tim
On Mon, 2024-07-15 at 05:49 +, Eugen Block wrote:
If the OSD is already running in a container, adopting it won't
work,
as you already noticed. I don't have an explanation how t
Are all clients trying to connect to the same ceph cluster? Have you
compared their ceph.conf files? Maybe during the upgrade something
went wrong and an old file was applied or something?
Zitat von Albert Shih :
Hi everyone
My cluster ceph run currently 18.2.2 and ceph -s say everything a
looking for the container OSD and doesn't notice the legacy OSD.
Thanks again for all the help!
Tim
On Tue, 2024-07-16 at 06:38 +, Eugen Block wrote:
Do you have more ceph packages installed than just cephadm? If you
have ceph-osd packages (or ceph-mon, ceph-mds etc.), I would r
container if I desired
also.
Best wishes,
Alex
From: Alex Hussein-Kershaw (HE/HIM)
Sent: Tuesday, July 16, 2024 12:48 PM
To: Eugen Block ; ceph-users@ceph.io
Subject: Re: [EXTERNAL] [ceph-users] Re: RGW Multisite with a Self-Signed CA
Hi Eugen,
Thanks for the
Hi,
"name" is the client name as who you're trying to mount the
filesystem, "fs_name" is the name of your CephFS. You can run 'ceph fs
ls' to see which filesystems are present. And then you need the path
you want to mount, in this example it's the root directory "/".
Regards,
Eugen
Zitat
Thanks a lot for the heads up, Dan!
Zitat von Dan van der Ster :
Hey all,
The upcoming community Ceph container images will be based on CentOS 9.
In our Clyso CI testing lab we learned that el9-based images won't run
on some (default) qemu VMs. Where our el8-based images run well, our
new el9
Hi,
I came across [1] and wanted to try to have all certificates/keys in
one file. But it appears that the validation happens only against the
first cert. So what I did was to concatenate all certs/keys into one
file, then added that to ceph:
ceph config-key set rgw/cert/rgw.realm.zone -i
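A sketch of those steps (file names are placeholders; restarting the daemons is one way to apply the change):
cat server.crt intermediate.crt server.key > rgw-bundle.pem
ceph config-key set rgw/cert/rgw.realm.zone -i rgw-bundle.pem
ceph orch restart rgw.realm.zone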
Thanks, that's what I proposed to the customer as well. They also have
their own CA, so it probably shouldn't be a problem to have such a
certificate as well.
Thanks!
Zitat von Kai Stian Olstad :
On Thu, Jul 18, 2024 at 10:49:02AM +0000, Eugen Block wrote:
And after restarting
Hi,
can you please provide more information?
Which other flags did you set (noout should be sufficient, or just use
the maintenance mode)?
Please share the output from:
ceph osd tree
ceph osd df
ceph osd pool ls detail
Add the corresponding crush rule which applies to the affected pool.
Zit
Hi,
instead of exporting/importing single objects via rados export/import
I would use 'rados cppool ' although it does a
linear copy of each object, so I'm not sure if that's so much better...
So first create a new replicated pool, 'rados cppool old new', then
rename the original pool, and
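A rough sketch of that sequence (pool names, pg numbers and the application tag are placeholders):
ceph osd pool create newpool 32 32 replicated
rados cppool oldpool newpool
ceph osd pool rename oldpool oldpool.bak
ceph osd pool rename newpool oldpool
# re-apply the application tag on the new pool if needed
ceph osd pool application enable oldpool rgw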
Hi,
according to [1] those are non-fatal errors:
These are the same non-fatal errors from above, where 404 errors
from other zones are converted to -ENOENT. The RGWBackoffControlCR
will continue to poll these objects for changes. These ERROR
messages are unnecessarily spammy though, so I'd
The customer got a new certificate with all the DNS names and IPs, we
tested it on one host only which was promising. They will restart all
the remaining RGWs during their next maintenance window on Wednesday.
Thanks!
Eugen
Zitat von Eugen Block :
Thanks, that's what I proposed t