Hi all,
I want to benchmark my production cluster with cbt. I read a bit of the
code and I see something strange in it: for example, it's going to create
ceph-osd by itself (
https://github.com/ceph/cbt/blob/master/cluster/ceph.py#L373) and also
shut down the whole cluster!! (
https://github.com
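From my reading so far, it looks like cbt may skip all of that when
use_existing is set in the cluster section of the job YAML, i.e. something
like (just a sketch from memory, please verify against your cbt version
before trusting it with a production cluster):

cluster:
  use_existing: true

but I'd like to confirm that this really disables the OSD creation and
cluster restart steps before pointing it at production.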
A few more things of note after more poking with the help of Dan vdS.
1) The object that the backfill is crashing on has an mtime of a few minutes
before the original primary died this morning, and a 'rados get' gives an
input/output error. So it looks like a new object that was possibly corrupt
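For reference, the mtime came from a 'rados stat' and the failing read from
a 'rados get', roughly like this (pool and object names are placeholders):

# rados -p <ec-pool> stat <object-name>
# rados -p <ec-pool> get <object-name> /tmp/obj.out

The stat returns the mtime fine; the get is the one that gives the
input/output error.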
Hi Janek,
We realize this; we referenced that issue in our initial email. We do want
the metrics exposed by Ceph internally, and would prefer to work towards a
fix upstream. We appreciate the suggestion for a workaround, however!
Again, we're happy to provide whatever information we can that would help.
Hi,
I am using 15.2.7 on CentOS 8.1. I have a number of old buckets that are listed
with
# radosgw-admin metadata list bucket.instance
but are not listed with:
# radosgw-admin bucket list
Let's say that one of them is:
'old-bucket' and its instance is 'c100feda-5e16-48a4-b908-7be61aa877ef.123.1'
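For reference, the leftover instance's metadata can be dumped with something
like:

# radosgw-admin metadata get bucket.instance:old-bucket:c100feda-5e16-48a4-b908-7be61aa877ef.123.1

I gather there is also 'radosgw-admin reshard stale-instances list' (and a
matching 'rm') for instances left behind by resharding, but I haven't tried
that here and I'm not sure it is the right tool for this case.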
Hi all,
Got an odd issue that I'm not sure how to solve on our Nautilus 14.2.9 EC
cluster.
The primary OSD of an EC 8+3 PG died this morning with a very sad disk
(thousands of pending sectors). After the down out interval a new 'up' primary
was assigned and the backfill started. Twenty minutes
FYI, this is the ceph-exporter we're using at the moment:
https://github.com/digitalocean/ceph_exporter
It's not as good, but it mostly does the job. Some of the more specific
metrics are missing, but the majority are there.
On 10/12/2020 19:01, Janek Bevendorff wrote:
Do you have the prometheus module enabled? Turn that off; it's causing
issues. I replaced it with another Ceph exporter from GitHub and almost
forgot about it.
Here's the relevant issue report:
https://tracker.ceph.com/issues/39264#change-179946
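If you want to check and disable it, something along these lines should do
(from memory, so double-check the exact syntax):

# ceph mgr module ls | grep prometheus
# ceph mgr module disable prometheus

and then scrape the external exporter instead.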
On 10/12/2020 16:43, Welby McRoberts wrote:
Hi Folks
We've noticed that in a cluster of 21 nodes (5 mgr & mon nodes and 504 OSDs
with 24 per node) the mgrs are, after a non-specific period of time, dropping
out of the cluster. The logs only show the following:
debug 2020-12-10T02:02:50.409+ 7f1005840700 0 log_channel(cluster) log
[DBG]