On Fri, 13 May 2022 at 08:56, Stefan Kooman wrote:
> > Thanks Janne and all for the insights! The reason why I half-jokingly
> > suggested the cluster 'lost interest' in those last few fixes is that
> > the recovery statistics included in ceph -s reported near-zero
> > activity for so
On 5/12/22 02:05, Janne Johansson wrote:
On Thu, 12 May 2022 at 00:03, Harry G. Coin wrote:
> Might someone explain why the count of degraded items can drop
> thousands, sometimes tens of thousands in the same number of hours it
> takes to go from 10 to 0? For example, when an OSD or a host with a few
> OSDs goes offline for a while,
It's a little four-host cluster, 4 HDD OSDs per host, with a 5th host doing
the non-OSD work. Nearly entirely CephFS load.
On 5/11/22 17:47, Josh Baergen wrote:
Is this on SSD or HDD? RGW index, RBD, or ...? Those all change the
math on single-object recovery time.
Having said that, small objects recover faster than large ones.
But especially, early in the process many OSDs/PGs are recovering in
parallel. Toward the end there's a long tail where parallelism is limited by
osd_max_backfills: if, say, the remaining PGs to recover all live on a single
OSD, they will execute only osd_max_backfills at a time.