[ceph-users] Re: The last 15 'degraded' items take as many hours as the first 15K?

2022-05-13 Thread Janne Johansson
On Fri, 13 May 2022 at 08:56, Stefan Kooman wrote: > > Thanks Janne and all for the insights! The reason why I half-jokingly suggested the cluster 'lost interest' in those last few fixes is that the recovery statistics included in ceph -s reported near to zero activity for so

[ceph-users] Re: The last 15 'degraded' items take as many hours as the first 15K?

2022-05-12 Thread Harry G. Coin
On 5/12/22 02:05, Janne Johansson wrote: On Thu, 12 May 2022 at 00:03, Harry G. Coin wrote: Might someone explain why the count of degraded items can drop by thousands, sometimes tens of thousands, in the same number of hours it takes to go from 10 to 0? For example, when an OSD or a host with a

[ceph-users] Re: The last 15 'degraded' items take as many hours as the first 15K?

2022-05-12 Thread Janne Johansson
On Thu, 12 May 2022 at 00:03, Harry G. Coin wrote: > Might someone explain why the count of degraded items can drop by thousands, sometimes tens of thousands, in the same number of hours it takes to go from 10 to 0? For example, when an OSD or a host with a few OSDs goes offline for a while,

[ceph-users] Re: The last 15 'degraded' items take as many hours as the first 15K?

2022-05-11 Thread Harry G. Coin
It's a little four-host, 4 OSD/host HDD cluster with a 5th doing the non-OSD work. Nearly entirely CephFS load. On 5/11/22 17:47, Josh Baergen wrote: Is this on SSD or HDD? RGW index, RBD, or ...? Those all change the math on single-object recovery time. Having said that...if the object is

[ceph-users] Re: The last 15 'degraded' items take as many hours as the first 15K?

2022-05-11 Thread Anthony D'Atri
Small objects recover faster than large ones. But especially, early in the process many OSDs / PGs are recovering in parallel. Toward the end there’s a long tail where parallelism is limited by osd_max_backfills: say the remaining PGs to recover are all on a single OSD, they will execute
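
The long-tail effect described in this snippet can be illustrated with a toy model. The sketch below is not Ceph's scheduler: it assumes every PG takes one time step to recover, that each PG is tied to exactly one OSD, and that a hypothetical `max_backfills` parameter caps how many PGs a single OSD works on per step (standing in for osd_max_backfills). The 16-OSD layout and the PG counts are made-up illustration values.

```python
from collections import Counter


def simulate_recovery(pg_to_osd, max_backfills):
    """Toy model: return the number of PGs still degraded after each step.

    pg_to_osd     -- list with one OSD id per PG that needs recovery
    max_backfills -- hypothetical per-OSD cap on concurrent recoveries
    """
    remaining = Counter(pg_to_osd)  # OSD id -> PGs still waiting on it
    history = [sum(remaining.values())]
    while history[-1] > 0:
        for osd in list(remaining):
            # Each OSD finishes at most max_backfills PGs per step.
            remaining[osd] -= min(max_backfills, remaining[osd])
        history.append(sum(remaining.values()))
    return history


# Assumed layout: 15 OSDs with 2 PGs each, plus one OSD holding 10 PGs.
pgs = [osd for osd in range(15) for _ in range(2)] + [15] * 10
history = simulate_recovery(pgs, max_backfills=1)
print(history)  # [40, 24, 8, 7, 6, 5, 4, 3, 2, 1, 0]
```

In this model the first two steps clear 32 of the 40 PGs because 16 OSDs work in parallel, while the last 8 PGs take eight more steps because they all queue behind a single OSD.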