I recently upgraded a cluster from Octopus to Pacific using cephadm. The
cluster has 3 internally facing RADOS gateways with rgw_enable_apis at the
default value and 2 externally facing RADOS gateways with rgw_enable_apis
set to s3website. After the upgrade, the dashboard Object Gateway page was
sho
Boris, I have seen a single problematic OSD cause this issue on every OSD
with which its PGs peered. The solution was to take the slow OSD out;
all slow ops stopped immediately. I found it by looking for OSDs that
were common across the reported slow ops. Not saying this is your issue,
but it may be a
possibility. Good l
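
The approach described above, looking for OSDs that keep showing up in the
slow ops reports, could be sketched roughly like this. It is only an
illustration: the sample lines are made up, and the assumption that each
report line mentions OSDs as "osd.<id>" is mine, not a documented Ceph log
format. The helper name rank_osds is hypothetical.

```python
import re
from collections import Counter

# Assumption: OSDs appear in the report text as "osd.<number>".
OSD_ID = re.compile(r"\bosd\.(\d+)\b")


def rank_osds(report_lines):
    """Return (osd_id, line_count) pairs, most frequently mentioned first."""
    counts = Counter()
    for line in report_lines:
        # A set ensures an OSD named twice in one line is counted once.
        counts.update({int(osd) for osd in OSD_ID.findall(line)})
    return counts.most_common()


if __name__ == "__main__":
    # Made-up sample lines, not real cluster output.
    sample = [
        "slow ops on osd.3, waiting for osd.7",
        "slow ops on osd.12, waiting for osd.7",
        "slow ops on osd.7, waiting for osd.3",
    ]
    for osd, seen in rank_osds(sample):
        print(f"osd.{osd}: mentioned in {seen} report line(s)")
```

An OSD that tops this list across many reports is a candidate for a closer
look, nothing more; it still needs to be confirmed on the cluster itself.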
Thank you Frank and Eneko,
Without help and support from ceph admins like you, I would be adrift. I
really appreciate this.
I rejoined the host one week ago, and the cluster has been dealing
with the misplaced objects and recovering well.
I will use this strategy in the future:
"If you c
One thing to add to this discussion.
I had a lot of problems with my clusters and spent some time debugging.
What I found, and confirmed on AMD nodes, is that everything started
working like a charm once I added the kernel parameter iommu=pt.
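
To check whether a node was actually booted with that parameter, something
like the following sketch may help. It assumes a Linux host where the
running kernel's command line is readable at /proc/cmdline; the helper
names are hypothetical.

```python
from pathlib import Path


def has_kernel_param(cmdline, param):
    """True if `param` appears as a whole token on the kernel command line."""
    # Splitting on whitespace avoids false hits such as "amd_iommu=pt".
    return param in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    return Path(path).read_text()
```

On a node, has_kernel_param(read_cmdline(), "iommu=pt") would then report
whether the setting is active for the current boot.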
Plus some other tunings. I can’t share all the information now, but this i
Just a shot in the dark:
Perhaps you have many backfill tasks running...
You can throttle recovery by limiting max backfills (osd_max_backfills).
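
As a rough sketch of what that could look like, the snippet below builds
the "ceph config set" command for the osd_max_backfills option and only
prints it unless dry_run is turned off. Reading "max-backfill" as
osd_max_backfills is my assumption, the value 1 is just an example, and the
helper names are hypothetical.

```python
import subprocess


def backfill_throttle_cmd(max_backfills=1):
    """Build the command that sets osd_max_backfills for all OSDs."""
    if max_backfills < 1:
        raise ValueError("max_backfills must be at least 1")
    return ["ceph", "config", "set", "osd",
            "osd_max_backfills", str(max_backfills)]


def apply(cmd, dry_run=True):
    """Return the command as text; run it only when dry_run is False."""
    if not dry_run:
        # Needs a host with the ceph CLI and an admin keyring available.
        subprocess.run(cmd, check=True)
    return " ".join(cmd)


if __name__ == "__main__":
    print(apply(backfill_throttle_cmd(1)))
```

Keeping the dry run as the default makes it easy to review the exact
command before anything touches the cluster.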
Hth
Mehmet
On 2 December 2022 at 21:09:04 CET, Wyll Ingersoll wrote:
>
>We have a large cluster (10PB) which is about 30% full at this point. We
>recently fixed a configuration