I remember someone reporting the same thing but I can’t find the
thread right now. I’ll try again tomorrow.
Quote from Wesley Dillingham:
I have a brand new 16.2.9 cluster running BlueStore with 0 client activity.
I am modifying some crush weights to move PGs off a host for testing
Hi,
is your new pool configured as a cache-tier? The option you're trying
to set is a cache-tier option [1]. Could the old pool have been a
cache pool in the past so it still has this option set?
[1]
https://docs.ceph.com/en/latest/rados/operations/cache-tiering/#configuring-a-cache-tier
gets
OOM-killed.
So.. seems I can get my cluster running again, only limited by my
internet upload now. Any hints why it eats a lot of memory in normal
operation would still be appreciated.
Best, Mara
On Wed, Jun 08, 2022 at 09:05:52AM +0000, Eugen Block wrote:
It's even worse, you only give
Hi,
you can either use the 'rbd du' command:
control01:~ # rbd --id cinder du images/01b01349-a11c-489c-8349-4c5be9523c58
NAME PROVISIONED USED
01b01349-a11c-489c-8349-4c5be9523c58@snap 2 GiB 2 GiB
01b01349-a11c-489c-8349-4c5be9523c58 2 GiB
1 7f7c1ef9fb80 DEBUG sestatus: Memory
protection checking: actual (secure)
2022-06-08 16:36:42,521 7f7c1ef9fb80 DEBUG sestatus: Max kernel
policy version: 31
On 2022-06-08 4:30 PM, Eugen Block wrote:
Have you checked /var/log/ceph/cephadm.log on the target nodes?
Quote from "Z
Have you checked /var/log/ceph/cephadm.log on the target nodes?
Quote from "Zach Heise (SSCC)":
Yes, sorry - I tried both 'ceph orch apply mgr "ceph01,ceph03"' and
'ceph orch apply mds "ceph04,ceph05"' before writing this initial
email - once again, the same logged message: "6/8/22 2:25:12
It's even worse, you only give them 1 MB, not 1 GB.
Quote from Eugen Block:
Hi,
is there any reason you use custom configs? Most of the defaults
work well. But you only give your OSDs 1 GB of memory, which is way
too low except for an idle cluster without much data. I recommend
removing
Hi,
is there any reason you use custom configs? Most of the defaults work
well. But you only give your OSDs 1 GB of memory, which is way too low
except for an idle cluster without much data. I recommend removing
the line
osd_memory_target = 1048576
and let ceph handle it. I didn't
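A sketch of removing the override, assuming it was set in the central
config store (if it only lives in ceph.conf, delete the line there and
restart the OSDs):
ceph config rm osd osd_memory_target
ceph config get osd osd_memory_target   # default is 4294967296 (4 GiB)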
destroy the cluster and rebuild it
----- Original Message -----
From: "Eugen Block"
To: "ceph-users"
Sent: Tuesday, June 7, 2022 15:00:39
Subject: [ceph-users] Re: Many errors about PG deviate more than 30%
on a new cluster deployed by cephadm
Hi,
please share the output of 'ceph o
Hi,
the deep copy feature was introduced in Mimic [1] and I doubt that
there will be backports since Luminous has been EOL for quite some time now
(as are Mimic and Nautilus btw).
Eugen
[1] https://ceph.io/en/news/blog/2018/v13-2-0-mimic-released/
Quote from Pardhiv Karri:
Hi,
We are
Hi,
please share the output of 'ceph osd pool autoscale-status'. You have
very low (too low) PG numbers per OSD (between 0 and 6). Did you stop
the autoscaler at an early stage? If you don't want to use the
autoscaler you should increase the pg_num, but you could set the
autoscaler to warn
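A sketch of both options for a hypothetical pool name:
ceph osd pool set <pool> pg_num 128
ceph osd pool set <pool> pg_autoscale_mode warn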
Hi,
I'm currently debugging a recurring issue with multi-active MDS. The
cluster is still on Nautilus and can't be upgraded at this time. There
have been many discussions about "cache pressure" and I was able to
find the right settings a couple of times, but before I change too
much in
Hi,
how did you end up with that many PGs per OSD? According to your
output the pg_autoscaler is enabled, if that was done by the
autoscaler I would create a tracker issue for that. Then I would
either disable it or set the mode to "warn" and then reduce the pg_num
for some of the pools.
First thing I would try is a mgr failover.
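For reference, a mgr failover is a one-liner (a sketch; older releases
require the active mgr's daemon name):
ceph mgr fail
ceph mgr fail <active-mgr-name>   # older releases need the name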
Quote from Eneko Lacunza:
Hi all,
I'm trying to diagnose an issue in a tiny cluster that is showing the
following status:
root@proxmox3:~# ceph -s
cluster:
id: 80d78bb2-6be6-4dff-b41d-60d52e650016
health: HEALTH_WARN
1/3
Hi,
first, you can bootstrap a cluster by providing the container image
path in the bootstrap command like this:
cephadm --image **:5000/ceph/ceph bootstrap --mon-ip **
Check out the docs for an isolated environment [1], I don't think it's
a good idea to change the runtime the way you
Hi,
in earlier versions (e.g. Nautilus) there was a dashboard command to
set the RGW hostname, that is not available in Octopus (I didn’t check
Pacific, probably when cephadm took over), so I would assume that it
comes from the ‘ceph orch host add’ command and you probably used the
host’s
Hi,
I found this request [1] for version 18, it seems as if that’s not
easily possible at the moment.
[1] https://tracker.ceph.com/issues/54308
Quote from Vladimir Brik:
Hello
Is it possible to increase the retention period of the
prometheus service deployed with cephadm?
Hi,
I haven’t dealt with this for some time, it used to be a problem in
earlier releases. But can’t you just change the ruleset of the glance
pool to use the "better" OSDs?
Quote from Pardhiv Karri:
Hi,
We have a ceph cluster with integration to Openstack. We are thinking about
Hi,
I don’t know what could cause that error, but could you share more
details? You seem to have multiple active MDSs, is that correct? Could
they be overloaded? What happened exactly, did one MDS fail or all of
them? Do the standby MDS report anything different?
Quote from Kuko Armas:
Do you see anything suspicious in /var/log/ceph/cephadm.log? Also
check the mgr logs for any hints.
Quote from Lo Re Giuseppe:
Hi,
We have happily tested the upgrade from v15.2.16 to v16.2.7 with
cephadm on a test cluster made of 3 nodes and everything went
smoothly.
Today we started
Quote from zhengyi deng:
Hi Eugen Block
A new node was added with "ceph osd crush add-bucket 192.168.1.47 host". Executing
"ceph osd crush move 192.168.1.47 root=default" caused ceph-mon to reboot.
I solved the problem; there was a choose_args configuration in the
crushmap
Hi,
there's a profile "crash" for that. In a lab setup with Nautilus
there's one crash client with these caps:
admin:~ # ceph auth get client.crash
[client.crash]
key =
caps mgr = "allow profile crash"
caps mon = "allow profile crash"
On an Octopus cluster
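For reference, creating such a client from scratch could look like
this (a sketch matching the caps above):
ceph auth get-or-create client.crash mon 'allow profile crash' mgr 'allow profile crash'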
Hi,
can you share your 'ceph osd tree' so it's easier to understand what
might be going wrong? I didn't check the script in detail; what
exactly do you mean by extending? Do you create new hosts in a
different root of the osd tree? Do those new hosts get PGs assigned
although they're in a
Hi,
the OSDs log into the journal, so you should be able to capture the
logs during startup with 'journalctl -fu
ceph-<fsid>@osd.<id>.service' or check after the failure with
'journalctl -u ceph-<fsid>@osd.<id>.service'.
Quote from 7ba335c6-fb20-4041-8c18-1b00efb78...@anonaddy.me:
Hello,
I've
Hi,
- Can we have multiple pools in a stretch cluster?
yes, you can have multiple pools, but apparently they have to be all
configured with the stretch rule as you already noted.
- Can we have multiple different crush rules in a stretch cluster?
It's still a regular ceph cluster, so
just speculation at the moment.
As a workaround we'll increase osd_max_pg_per_osd_hard_ratio to 5 and
see how the next attempt will go.
Thanks,
Eugen
Quote from Josh Baergen:
On Wed, Apr 6, 2022 at 11:20 AM Eugen Block wrote:
I'm pretty sure that their cluster isn't anywhere near the li
Hi,
you could try to set config keys, but I'm not sure if this will work.
What do you get if you run this:
ceph config-key get mgr/dashboard/_iscsi_config
obtained 'mgr/dashboard/_iscsi_config'
{"gateways": {"ses7-host1.fqdn": {"service_url":
"http://<user>:<password>@<host>:5000"}, "ses7-host2.fqdn":
Hi,
ceph moved away from file-based config to a config store within ceph.
You only need a minimal ceph.conf to bootstrap a cluster or for
clients which you can generate with:
ceph config generate-minimal-conf
You can dump the current ceph config and integrate it into your ceph.conf:
ceph
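A sketch of that workflow (assuming 'ceph config dump' as the dump
command):
ceph config generate-minimal-conf > /etc/ceph/ceph.conf
ceph config dump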
Quote from grin:
On Fri, 22 Apr 2022 06:54:33 +0000
Eugen Block wrote:
Hi,
> They are either static (so when the manager moves they become dead)
> or dynamic (so they will be overwritten the moment the mgr moves),
> aren't they?
there might be a misunderstanding but the MGR failov
Grafana server for your cluster so a
static URL works fine.
I'm really not sure if we're on the same page here, if not please clarify.
Quote from grin:
On Thu, 21 Apr 2022 08:52:48 +0000 Eugen Block wrote:
there are a bunch of dashboard settings, for example
pacific:~ # ceph dashboard set-g
Hi,
there are a bunch of dashboard settings, for example
pacific:~ # ceph dashboard set-grafana-api-url
pacific:~ # ceph dashboard set-prometheus-api-host
pacific:~ # ceph dashboard set-alertmanager-api-host
and many more.
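With (hypothetical) endpoints filled in, that could look like:
ceph dashboard set-grafana-api-url https://grafana.example.com:3000
ceph dashboard set-prometheus-api-host http://prometheus.example.com:9095
ceph dashboard set-alertmanager-api-host http://alertmanager.example.com:9093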
Quote from cephl...@drop.grin.hu:
Hello,
I have tried to find the
These are probably remnants of previous OSDs; I remember having to
clean up orphaned units from time to time. Compare the UUIDs to your
actual OSDs and disable the units of the non-existing OSDs.
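A sketch of the cleanup for a ceph-volume based deployment (OSD id and
uuid hypothetical):
systemctl list-units 'ceph-volume@*' --all
systemctl disable --now ceph-volume@lvm-16-<uuid>.service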
Quote from Marc:
I added some osd's which are up and running with:
ceph-volume lvm create
d: /usr/bin/chown -R ceph:ceph /dev/dm-4
/bin/docker: Running command: /usr/bin/chown -R ceph:ceph
/var/lib/ceph/osd/ceph-16
/bin/docker: --> ceph-volume lvm activate successful for osd ID: 16
On Wed, Apr 20, 2022 at 2:28 PM Eugen Block wrote:
IIUC it's just the arrow that can't be displayed w
trypoint /usr/sbin/ceph-volume
--privileged --group-add=disk --init -e CONTAINER_IMAGE=
quay.io/ceph/ceph@sha256:0d927ccbd8892180ee09894c2b2c26d07c938bf96a56eaee9b80fc9f26083ddb
-e NODE_NAME=dmz-host-4 -e CEPH_USE_RANDOM_NONCE=1 -v
/var/run/ceph/d221bc3c-8ff4-11ec-b4ba-b02628267680:/var/run/ceph:z
] Running command:
/usr/sbin/pvs --noheadings --readonly --separator=";" -S
lv_uuid=pfWtmF-6Xlc-R2LO-kzeV-2jIw-3Ki8-gCOMwZ -o
pv_name,pv_tags,pv_uuid,vg_name,lv_uuid
[2022-04-20 10:38:02,301][ceph_volume.process][INFO ] stdout
/dev/sdf";"";"j2Ilk4-12ZW-qR9u-3n5Y-gn6B
Hi,
have you checked /var/log/ceph/cephadm.log for any hints?
ceph-volume.log may also provide some information
(/var/log/ceph/<fsid>/ceph-volume.log) about what might be going on.
Quote from Manuel Holtgrewe:
Dear all,
I now attempted this and my host is back in the cluster but the `ceph
cephadm
Hi,
Is it advisable to limit the sizes of data pools or metadata pools
of a cephfs filesystem for performance or other reasons?
I assume you don't mean quotas for pools, right? The pool size is
limited by the number and size of the OSDs, of course. I can't really
say what's advisable or
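If quotas are what you're after, a sketch (pool name and limits
hypothetical):
ceph osd pool set-quota mypool max_bytes 107374182400   # 100 GiB
ceph osd pool set-quota mypool max_objects 1000000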
Hi,
unless there are copy/paste mistakes involved I believe you shouldn't
specify '--master' for the secondary zone because you did that already
for the first zone which is supposed to be the master zone. You
specified '--rgw-zone=us-west-1' as the master zone within your realm,
but then
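A hedged sketch of creating the secondary zone without that flag
(names and endpoint hypothetical):
radosgw-admin zone create --rgw-zonegroup=us --rgw-zone=us-west-2 \
  --endpoints=http://rgw2.example.com:80 --access-key=<key> \
  --secret=<secret> --default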
Thanks, I’ll take a closer look at that.
Quote from Josh Baergen:
Hi Eugen,
how did you determine how many PGs were assigned to the OSDs? I looked
at one of the OSD's logs and checked how many times each PG chunk of
the affected pool was logged during startup. I got around 580 unique
Hi,
It could be, yes. I've seen a case on a test cluster where thousands
of PGs were assigned to a single OSD even when the steady state was
far fewer than that.
how did you determine how many PGs were assigned to the OSDs? I looked
at one of the OSD's logs and checked how many times each
"op": "chooseleaf_indep",
"num": 9,
"type": "host"
},
{
"op": "emit"
}
]
},
"Prometheus/2.18.1"
Apr 7 11:21:16 hvs001 bash[2670]: debug
2022-04-07T11:21:16.267+0000 7f514f9b2700 0 [prometheus INFO
cherrypy.access.139987709758544] :::10.3.1.23 - -
[07/Apr/2022:11:21:16] "GET /metrics HTTP/1.1" 200 166748 ""
"Prometheus/2.18.1"
Hi,
please add some more output, e.g.
ceph -s
ceph osd tree
ceph osd pool ls detail
ceph osd crush rule dump (of the used rulesets)
You have the pg_autoscaler enabled, you don't need to deal with pg_num
manually.
Quote from Dominique Ramaekers:
Hi,
My cluster is up and running. I saw a
Quote from Josh Baergen:
On Wed, Apr 6, 2022 at 11:20 AM Eugen Block wrote:
I'm pretty sure that their cluster isn't anywhere near the limit for
mon_max_pg_per_osd, they currently have around 100 PGs per OSD and the
configs have not been touched, it's pretty basic.
How is the host being
Thanks for the comments, I'll get the log files to see if there's any
hint. Getting the PGs in an active state is one thing, I'm sure
multiple approaches would have worked. The main question is why this
happens, we have 19 hosts to rebuild and can't risk the application
outage every time.
ine.
Quote from Zakhar Kirpichenko:
Hi Eugen,
Can you please elaborate on what you mean by "restarting the primary PG"?
Best regards,
Zakhar
On Wed, Apr 6, 2022 at 5:15 PM Eugen Block wrote:
Update: Restarting the primary PG helped to bring the PGs back to
active state. Consider this
Update: Restarting the primary PG helped to bring the PGs back to
active state. Consider this thread closed.
Quote from Eugen Block:
Hi all,
I have a strange situation here, a Nautilus cluster with two DCs,
the main pool is an EC pool with k=7 m=11, min_size = 8 (failure
domain host). We
Hi all,
I have a strange situation here, a Nautilus cluster with two DCs, the
main pool is an EC pool with k=7 m=11, min_size = 8 (failure domain
host). We confirmed failure resiliency multiple times for this
cluster, today we rebuilt one node resulting in currently 34 inactive
PGs. I'm
Hi Ali,
it's very common to have MONs and OSDs colocated on the same host.
Quote from Ali Akil:
Hello everyone,
I am planning a Ceph cluster on 3 storage nodes (12 OSDs per cluster
with BlueStore). Each node has 192 GB of memory and 24 CPU cores.
I know it's recommended to have
Thanks for the clarification, I get it now. This would be quite
helpful to have in the docs, I believe. ;-)
Quote from Arthur Outhenin-Chalandre:
Hi Eugen,
On 4/6/22 09:47, Eugen Block wrote:
I don't mean to hijack this thread, I'm just curious about the
multiple mirror daemons statement
Hi,
is there any specific reason why you do it manually instead of letting
cephadm handle it? I might misremember but I believe for the manual
lvm activation to work you need to pass the '--no-systemd' flag.
Regards,
Eugen
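A sketch of a manual activation with that flag (OSD id and fsid
hypothetical):
ceph-volume lvm activate 2 <osd-fsid> --no-systemd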
Quote from Dominique Ramaekers:
Hi,
I've setup a ceph cluster
Hi,
I don't mean to hijack this thread, I'm just curious about the
multiple mirror daemons statement. Last year you mentioned that
multiple daemons only make sense if you have different pools to mirror
[1], at least that's how I read it, you wrote:
[...] but actually you can have multiple
ssh the host and execute the command for each OSD. If we have to
add many OSDs, it will take a lot of time.
On Mon, Apr 4, 2022 at 3:42 PM Eugen Block wrote:
Hi,
this is handled by ceph-volume, do you find anything helpful in
/var/log/ceph/<fsid>/ceph-volume.log? Also check the cephadm.log
for any
Hi,
this is handled by ceph-volume, do you find anything helpful in
/var/log/ceph/<fsid>/ceph-volume.log? Also check the cephadm.log
for any hints.
Quote from 彭勇:
we have a running ceph, 16.2.7, with SATA OSDs and DB on NVMe,
and we inserted some SATA disks into a host, and the status of the new host is
Hi samuel,
I haven't used dedicated rbd journal pools so I don't have any comment
on that. But there's an alternative to journal-based mirroring, you
can also mirror based on snapshots [1]. Would this be an alternative
for you to look deeper into?
Regards,
Eugen
[1]
Quote from Alfredo Rezinovsky:
Yes.
osd.all-available-devices 0 - 3h
osd.dashboard-admin-1635797884745 7 4m ago 4M *
How should I disable the creation?
On Wed, 30 Mar 2022 at 17:24, Eugen Block () wrote:
Do you have other
Do you have other osd services defined which would apply to the
affected host? Check 'ceph orch ls' for other osd services.
Quote from Alfredo Rezinovsky:
I want to create osds manually
If I zap the osd 0 with:
ceph orch osd rm 0 --zap
as soon as the dev is available the orchestrator
Hi,
I would recommend focusing on one issue at a time and trying to resolve
it first. It is indeed very much to read and not really clear which
issues could be connected. Can you start with your current cluster
status (ceph -s) and some basic outputs like 'ceph orch ls', 'ceph
orch ps'
Hi,
does the failed MON's keyring file contain the correct auth caps? Then
I would also remove the local (failed) MON's store.db before rejoining.
Quote from Tomáš Hodek:
Hi, I have a 3-node ceph cluster (managed via Proxmox). Got a single
node fatal failure and replaced it. OS boots
not recommended.
Quote from Daniel Persson:
Hi Eugen.
I've tried. The system says it's not recommended but I may force it.
Forcing something with the risk of losing data is not something I'm going
to do.
Best regards
Daniel
On Sat, Mar 26, 2022 at 8:55 PM Eugen Block wrote:
Hi,
just because
Hi,
just because the autoscaler doesn’t increase the pg_num doesn’t mean
you can’t increase it manually. Have you tried that?
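A sketch for a hypothetical cache pool name; on Nautilus and later the
increase should be applied gradually and pgp_num follows automatically:
ceph osd pool set <cache-pool> pg_num 64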
Quote from Daniel Persson:
Hi Team.
We are currently in the process of changing the size of our cache pool.
Currently it's set to 32 PGs and distributed weirdly on
ter option.
Thanks!
Quote from Ilya Dryomov:
On Fri, Mar 25, 2022 at 10:11 AM Eugen Block wrote:
Hi,
I was curious and tried the same with debug logs. One thing I noticed
was that if I use the '-k <keyring>' option I get a different error
message than with '--id user3'. So with '-k' the result is the same
ened again. It is like ceph thinks the osd
is still full, as it was before...
On Wed, Mar 23, 2022 at 14:38, Eugen Block wrote:
Without having an answer to the question why the OSD is full I'm
wondering why the OSD has a crush weight of 1.2 while its size is
only 1 TB. Was th
Hi,
I was curious and tried the same with debug logs. One thing I noticed
was that if I use the '-k <keyring>' option I get a different error
message than with '--id user3'. So with '-k' the result is the same:
---snip---
pacific:~ # rbd -k /etc/ceph/ceph.client.user3.keyring -p test2
--namespace
Without having an answer to the question why the OSD is full I'm
wondering why the OSD has a crush weight of 1.2 while its size is
only 1 TB. Was that changed on purpose? I'm not sure if that would
explain the OSD full message, though.
Quote from Rodrigo Werle:
Hi everyone!
I'm
How about this one?
https://docs.ceph.com/en/latest/cephfs/fs-volumes/
Quote from Rafael Diaz Maurin:
Hi cephers,
Under Pacific, I just noticed some new info when running a 'ceph -s':
[...]
date:
volume: 1/1 healthy
[...]
I can't find the info in the Ceph docs. Does anyone know what
Hi,
Setting mgr/cephadm/registry_insecure to false doesn't help.
if you want to use an insecure registry you would need to set this
option to true, not false.
I am using podman and /etc/containers/registries.conf is set with
that insecure private registry.
Can you paste the whole
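For completeness, a sketch of setting the option named above:
ceph config set mgr mgr/cephadm/registry_insecure true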
clear.
Thanks again,
Zach
On 3/12/22 00:07, Eugen Block wrote:
Hi,
are you also planning to switch to cephadm? In that case you could
just adopt all the daemons [1], I believe docker would also work (I
use it with podman).
[1] https://docs.ceph.com/en/pacific/cephadm/adoption.html
Hi,
IIUC the OSDs 3,4,5 have been removed while some PGs still refer to
them, correct? Have the OSDs been replaced with the same IDs? If not
(so there are currently no OSDs with IDs 3,4,5 in your osd tree) maybe
marking them as lost [1] would resolve the stuck PG creation, although
I
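For reference, a sketch of marking a removed OSD as lost (destructive,
OSD id hypothetical):
ceph osd lost 3 --yes-i-really-mean-it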
) _read_bdev_label
failed to open /var/lib/ceph/osd/ceph-1/block: (2) No such file or
directory
Can you clarify?
Quote from huxia...@horebdata.cn:
thanks, Eugen.
I suspect this could be related to
https://tracker.ceph.com/issues/42223
huxia...@horebdata.cn
From: Eugen
Hi, there must be some mixup of the OSD IDs. Your command seems to use
ID 3 but the log complains about ID 1. You should check your script
and workflow.
Quote from huxia...@horebdata.cn:
Dear Ceph folks,
I encountered a strange behavior with Luminous 12.2.13, when running
the following
Hi,
can you be more specific about what exactly you are looking for? Are you
talking about the rocksDB size? And what is the unit for 5012? It’s
really not clear to me what you’re asking. And since the
recommendations vary between different use cases you might want to
share more details about
ts. So far no loadbalancer
has been put into place there.
Best, Julian
-----Original Message-----
From: Eugen Block
Sent: Friday, February 25, 2022 10:52
To: ceph-users@ceph.io
Subject: [ceph-users] Re: WG: Multisite sync issue
Hi,
I would stop all RGWs except one in each cluster to limit the places
and logs to look at. Do you have a load balancer as endpoint or do you
have a list of all RGWs as endpoints?
Quote from "Poß, Julian":
Hi,
I set up multisite with 2 ceph clusters and multiple RGWs and
Hi,
these are the defaults set by cephadm in Octopus and Pacific:
---snip---
[Service]
LimitNOFILE=1048576
LimitNPROC=1048576
EnvironmentFile=-/etc/environment
ExecStart=/bin/bash {data_dir}/{fsid}/%i/unit.run
ExecStop=-{container_path} stop ceph-{fsid}-%i
ExecStopPost=-/bin/bash
Hi,
1. How long will ceph continue to run before it starts complaining
about this?
Looks like it is fine for a few hours; ceph osd tree and ceph -s
seem not to notice anything.
if the OSDs don't have to log anything to disk (which can take quite
some time depending on the log settings)
On Wed, Feb 23, 2022 at 11:41, Eugen Block wrote:
Hi,
> How can I identify which operation this OSD is trying to achieve as
> osd_op() is a bit large ^^ ?
I would start by querying the OSD for historic_slow_ops:
ceph daemon osd.<id> dump_historic_slow_ops to see which operation it is.
>
Hi,
if you want to have DB and WAL on the same device, just don't specify
WAL in your drivegroup. It will be automatically created on the DB
device, too. In your case the rotational flag should be enough to
distinguish between data and DB.
based on the suggestion in the docs that this
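A minimal drivegroup sketch along those lines (service id and
placement hypothetical), applied with 'ceph orch apply -i osd_spec.yml':
service_type: osd
service_id: hdd_osds_with_fast_db
placement:
  host_pattern: '*'
spec:
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0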
Hi,
How can I identify which operation this OSD is trying to achieve as
osd_op() is a bit large ^^ ?
I would start by querying the OSD for historic_slow_ops:
ceph daemon osd.<id> dump_historic_slow_ops to see which operation it is.
How can I identify the related images to this data chunk?
Hi,
it really depends on the resiliency requirements and the use case. We
have a couple of customers with EC profiles like k=7 m=11. The
potential waste of space as Anthony already mentions has to be
considered, of course. But with regards to performance we haven't
heard any complaints
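For illustration, creating such a profile and a pool with it could
look like this (names and pg_num hypothetical):
ceph osd erasure-code-profile set k7m11 k=7 m=11 crush-failure-domain=host
ceph osd pool create ecpool 256 256 erasure k7m11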
Can you retry after resetting the systemd unit? The message "Start
request repeated too quickly." should be cleared first, then start it
again:
systemctl reset-failed ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service
systemctl start ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service
level caching.
Mark
On 2/16/22 10:18, Eugen Block wrote:
Hi,
we've noticed the warnings for quite some time now, but we're big
fans of the cache tier. :-)
IIRC we set it up some time around 2015 or 2016 for our production
openstack environment and it works nicely for us. We tried
Hi,
we've noticed the warnings for quite some time now, but we're big fans
of the cache tier. :-)
IIRC we set it up some time around 2015 or 2016 for our production
openstack environment and it works nicely for us. We tried it without
the cache some time after we switched to Nautilus but
, 2022 at 11:21 AM Eugen Block wrote:
It does update only one OSD at a time, I did that in my little test
cluster on Octopus today. I haven’t played too much with Pacific yet,
maybe some things have changed there?
Quote from Zakhar Kirpichenko:
> Hi Eugen,
>
> Thanks for this. All of
of
1 host at a time we could resolve this issue.
/Z
On Mon, Feb 14, 2022 at 4:26 PM Eugen Block wrote:
Hi,
what are your rulesets for the affected pools? As far as I remember
the orchestrator updates one OSD node at a time, but not multiple OSDs
at once, only one by one. It checks with the "ok-to-stop"
Hi,
what are your rulesets for the affected pools? As far as I remember
the orchestrator updates one OSD node at a time, but not multiple OSDs
at once, only one by one. It checks with the "ok-to-stop" command if
an upgrade of that daemon can proceed, so as long as you have host as
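The same check can be run manually, e.g. for a hypothetical OSD id:
ceph osd ok-to-stop 12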
5bbddf414a
format: 2
features: layering, exclusive-lock, data-pool
op_features:
flags:
create_timestamp: Thu Feb 10 18:17:42 2022
access_timestamp: Thu Feb 10 18:17:42 2022
modify_timestamp: Thu Feb 10 18:17:42 2022
Giuseppe
On 11.02.22, 14:52,
",
"bfm_blocks_per_key": "128",
"bfm_bytes_per_block": "4096",
"bfm_size": "6001171365888",
"bluefs": "1",
"ceph_fsid": "1234abcd-1234-abcd-1234-1234 abcd1234",
-dcache-data, profile rbd pool=fulen-dcache-meta, profile
rbd pool=fulen-hdd-data, profile rbd pool=fulen-nvme-meta"
On 11.02.22, 13:22, "Eugen Block" wrote:
Hi,
the first thing coming to mind are the user's caps. Which permissions
do they have? Have you compa
Hi,
the first thing coming to mind are the user's caps. Which permissions
do they have? Have you compared 'ceph auth get client.fulen' on both
clusters? Please paste the output from both clusters and redact
sensitive information.
Quote from Lo Re Giuseppe:
Hi all,
This is my first
Hi,
is there a difference in PG size on new and old OSDs or are they all
similar in size? Is there some fsck enabled during OSD startup?
Quote from Andrej Filipcic:
Hi,
with 16.2.7, some OSDs are very slow to start, e.g. it takes ~30 min
for an HDD (12 TB, 5 TB used) to become active. After
Can you share some more information about how exactly you upgraded? It looks
like a cephadm-managed cluster. Did you install OS updates on all nodes
without waiting for the first one to recover? Maybe I'm misreading, so
please clarify what your update process looked like.
Quote from Mazzystr:
I
Hi,
you should be able to change in the config file:
/var/lib/ceph/<fsid>/prometheus.ses7-host1/etc/prometheus/alerting/ceph_alerts.yml
and restart the containers.
Regards,
Eugen
Quote from Manuel Holtgrewe:
Dear all,
I wonder how I can adjust the default alerts generated by prometheus
when
Hi,
have you tried failing over the mgr service? I noticed similar
behaviour in Octopus.
Quote from Fyodor Ustinov:
Hi!
No one knows how to fix it?
- Original Message -
From: "Fyodor Ustinov"
To: "ceph-users"
Sent: Tuesday, 25 January, 2022 11:29:53
Subject: [ceph-users] How
Hi,
this is a disk space warning. If the MONs get below 30% free disk
space you'll get a warning, since a MON store can grow during
prolonged recovery. Use 'df -h' and you'll probably
see /var/lib/containers/ with less than 30% free space. You can either
decrease
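If freeing up space isn't an option, the warning threshold (default
30) can be lowered, e.g.:
ceph config set mon mon_data_avail_warn 20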
Have you also tried this?
# ceph orch daemon restart osd.12
Without the "daemon" you would try to restart an entire service called
"osd.12" which obviously doesn't exist. With "daemon" you can restart
specific daemons.
Quote from Manuel Holtgrewe:
Dear all,
I'm running Pacific 16.2.7
Hi,
you wrote that this cluster was initially installed with Octopus, so
no upgrade ceph-wise? Are all RGW daemons on the exact same ceph
(minor) versions?
I remember one of our customers reporting inconsistent objects on a
regular basis although no hardware issues were detectable. They
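A quick way to check for version skew across all daemons:
ceph versions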
801 - 900 of 1387 matches