Hi Wyll,
Any chance you're using CephFS and have some really large files in the
CephFS filesystem? Erasure coding? I recently encountered a similar
problem, and as soon as the end user deleted the really large files our
problem became much more manageable.
I had issues reweighting OSDs too. In the end I changed the CRUSH
weights and had to chase them around every couple of days, reweighting
the OSDs that were over 70% full down to zero and then setting them back
to 12 when they were mostly empty (12TB spinning rust buckets). Note
that I'm really not recommending this course of action; it's just the
only option that seemed to have any effect.
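For reference, the "chase them around" workaround above boils down to something like the following sketch. The OSD id and weight values here are illustrative assumptions, not taken from either cluster; as noted, this is a stopgap, not a recommendation.

```shell
# Zero the CRUSH weight of a nearly-full OSD so CRUSH stops mapping
# new PGs to it (unlike "ceph osd reweight", this changes the actual
# CRUSH map weight, so backfill moves data off it too).
ceph osd crush reweight osd.42 0

# ...wait for the OSD to drain via backfill...

# Restore the weight once it is mostly empty (roughly the drive's
# capacity in TiB for a default crush map, e.g. 12 for a 12TB disk).
ceph osd crush reweight osd.42 12.0
```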
-Dave
On 2022-08-29 3:00 p.m., Wyll Ingersoll wrote:
Can anyone explain why OSDs (ceph pacific, bluestore osds) continue to grow well after
they have exceeded the "full" level (95%), and is there any way to stop this?
"The full_ratio is 0.95, but we have several osds that continue to grow and are
approaching 100% utilization. They are reweighted to almost 0, yet continue to
grow.
Why is this happening? I thought the cluster would stop writing to the osd when it
was above the full ratio."
thanks...
________________________________
From: Wyll Ingersoll <wyllys.ingers...@keepertech.com>
Sent: Monday, August 29, 2022 9:24 AM
To: Jarett <starkr...@gmail.com>; ceph-users@ceph.io <ceph-users@ceph.io>
Subject: [ceph-users] Re: OSDs growing beyond full ratio
I would think so, but it isn't happening nearly fast enough.
It's literally been over 10 days with 40 new drives across 2 new servers, and
they barely have any PGs yet. A few, but not nearly enough to help with the
imbalance.
________________________________
From: Jarett <starkr...@gmail.com>
Sent: Sunday, August 28, 2022 8:19 PM
To: Wyll Ingersoll <wyllys.ingers...@keepertech.com>; ceph-users@ceph.io
<ceph-users@ceph.io>
Subject: RE: [ceph-users] OSDs growing beyond full ratio
Isn’t rebalancing onto the empty OSDs default behavior?
From: Wyll Ingersoll<mailto:wyllys.ingers...@keepertech.com>
Sent: Sunday, August 28, 2022 10:31 AM
To: ceph-users@ceph.io<mailto:ceph-users@ceph.io>
Subject: [ceph-users] OSDs growing beyond full ratio
We have a pacific cluster that is overly full and is having major trouble
recovering. We are desperate for help in improving recovery speed. We have
modified all of the various recovery throttling parameters.
The full_ratio is 0.95, but we have several osds that continue to grow and are
approaching 100% utilization. They are reweighted to almost 0, yet
continue to grow.
Why is this happening? I thought the cluster would stop writing to the osd
when it was above the full ratio.
We have added additional capacity to the cluster, but the new OSDs are being used very,
very slowly. The primary pool in the cluster is the RGW data pool, which is a 12+4 EC
pool using "host" placement rules across 18 hosts. Two new hosts with 20x10TB
osds each were recently added, but they are only very slowly being filled up. I
don't see how to force recovery on that particular pool. From what I understand, we
cannot modify the EC parameters without destroying the pool, and we cannot offload that
pool to any others because there is no other place to store that amount of data.
We have been running "ceph osd reweight-by-utilization" periodically, and it
works for a while (a few hours), but then recovery and backfill IO numbers drop to
negligible values.
The balancer module will not run because the current misplaced % is about 97%.
Would it be more effective to use osdmaptool to generate a bunch of upmap
commands to manually move data around, or keep trying to get
reweight-by-utilization to work?
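For what it's worth, the osdmaptool approach mentioned above would look roughly like this sketch. The pool name and deviation value are assumptions for illustration; the generated script should be reviewed before applying it.

```shell
# Dump the cluster's current osdmap to a file.
ceph osd getmap -o /tmp/osdmap

# Ask osdmaptool to compute pg-upmap-items entries that even out PG
# placement for the EC data pool (pool name is an assumed example).
osdmaptool /tmp/osdmap --upmap /tmp/upmap.sh \
    --upmap-pool default.rgw.buckets.data \
    --upmap-deviation 1

# /tmp/upmap.sh now contains "ceph osd pg-upmap-items ..." commands.
# Review it, then apply:
bash /tmp/upmap.sh
```

Note that pg-upmap requires clients to be Luminous or newer (`ceph osd set-require-min-compat-client luminous`), which should already hold on a Pacific cluster.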
Any suggestions? Other than deleting data, which we cannot do at this point (the
pools are not accessible), or adding more storage (we already did, and for some
reason it is not being utilized very heavily yet).
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io