Small objects recover faster than large ones. More importantly, early in the process many OSDs and PGs are recovering in parallel. Toward the end there’s a long tail where parallelism is limited by osd_max_backfills: if, say, the remaining PGs to recover are all on a single OSD, they will execute serially.
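To make the long tail concrete, here is a toy model in Python. It is a rough sketch, not how Ceph actually schedules recovery. It assumes every PG takes one time step to recover, each PG is throttled by exactly one OSD, and each OSD works on at most max_backfills PGs per step. The function name and the example cluster layout are made up for illustration.

```python
def remaining_per_tick(pgs_per_osd, max_backfills=1):
    """Return the number of PGs still degraded after each time step.

    Toy model (assumption): each OSD recovers up to max_backfills of its
    own PGs per step, and all OSDs work in parallel.
    """
    pending = dict(pgs_per_osd)  # copy so the caller's dict is untouched
    history = []
    while any(pending.values()):
        for osd, count in pending.items():
            pending[osd] = max(0, count - max_backfills)
        history.append(sum(pending.values()))
    return history


# Hypothetical cluster: nine OSDs with 2 degraded PGs each, one OSD with 10.
cluster = {f"osd.{i}": 2 for i in range(9)}
cluster["osd.9"] = 10

print(remaining_per_tick(cluster, max_backfills=1))
# → [18, 8, 7, 6, 5, 4, 3, 2, 1, 0]
```

In this made-up layout the degraded count drops by ten per step while all ten OSDs have work, then by one per step once only osd.9 has PGs left, so the last 8 PGs take four times as long as the first 20.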

> Might someone explain why the count of degraded items can drop thousands,
> sometimes tens of thousands in the same number of hours it takes to go from
> 10 to 0? For example, when an OSD or a host with a few OSD's goes offline
> for a while, reboots.
>
> Sitting at one complete and entire degraded object out of millions for longer
> than it took to write this post.
>
> Seems the fewer the number of degraded objects, the less interested the
> cluster is in fixing it!
>
> HC
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io