Hi David,

On Tuesday, March 30th, 2021 at 00:50, David Orman <orma...@corenode.com> wrote:
> Sure enough, it is more than 200,000, just as the alert indicates.
> However, why did it not reshard further? Here's the kicker - we only
> see this with versioned buckets/objects. I don't see anything in the
> documentation that indicates this is a known issue with sharding, but
> perhaps there is something going on with versioned buckets/objects. Is
> there any clarity here/suggestions on how to deal with this? It sounds
> like you expect this behavior with versioned buckets, so we must be
> missing something.

The issue with versioned buckets is that each object is associated with at
least 4 index entries: a single-version object has 4, and each additional
version of the object adds 2 more. Dynamic resharding is based on the number
of objects, not the number of index entries, and it counts each version of an
object as a separate object, so the biggest discrepancy between the object
count and the index entry count occurs when there is only one version of each
object (a factor of 4), and it tends towards a factor of 2 as the number of
versions per object grows. But there is one more special case: deleting a
versioned object also creates two more index entries, and those are not taken
into account by dynamic resharding at all. The absolute worst case is
therefore a bucket where every object had a single version and all of the
objects have since been deleted; in that case there are 6 index entries for
every object counted by dynamic resharding, i.e. a factor of 6.
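
Spelling out that arithmetic, with N versions per object, the ratio of index
entries to objects counted by dynamic resharding works out to:

    one version, not deleted : 4 / 1 = 4
    N versions, not deleted  : (2 + 2*N) / N, approaching 2 for large N
    one version, then deleted: (4 + 2) / 1 = 6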

So one way to "solve" this issue is to set 
`osd_deep_scrub_large_omap_object_key_threshold=600000`, which (with the 
default `rgw_max_objs_per_shard=100000`) will guarantee that dynamic resharding 
will kick in before you get a large omap object warning even in the worst case 
scenario for versioned buckets. If you're not comfortable having that many keys 
per omap object, you could instead decrease `rgw_max_objs_per_shard`.
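
For reference, if you go the threshold route and manage options through the
config database, that would be something along the lines of (ceph.conf works
just as well, adjust to however you normally manage config):

    ceph config set osd osd_deep_scrub_large_omap_object_key_threshold 600000

If you lower `rgw_max_objs_per_shard` instead, keeping 6 x
`rgw_max_objs_per_shard` below the default 200,000 key threshold (so somewhere
around 33,000) covers the same worst case.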

Cheers,

--
Ben