Thanks for your response Stefan,

On 21/12/2021 10:07, Stefan Schueffler wrote:
Even without adding a lot of rgw objects (only a few PUTs per minute), we have 
thousands and thousands of rgw bucket.sync log entries in the rgw log pool 
(this seems to be a separate problem), and as such we accumulate „large omap 
objects“ over time.

Since you are running RADOSGW as well, those large OMAP objects are usually bucket index objects (https://docs.ceph.com/en/latest/rados/operations/health-checks/#large-omap-objects). Since there is no dynamic bucket index resharding for multisite setups (https://docs.ceph.com/en/latest/radosgw/dynamicresharding/#rgw-dynamic-bucket-index-resharding) until Quincy (https://tracker.ceph.com/projects/rgw/issues?utf8=%E2%9C%93&set_filter=1&f%5B%5D=cf_3&op%5Bcf_3%5D=%3D&v%5Bcf_3%5D%5B%5D=multisite-reshard&f%5B%5D=&c%5B%5D=project&c%5B%5D=tracker&c%5B%5D=status&c%5B%5D=priority&c%5B%5D=subject&c%5B%5D=assigned_to&c%5B%5D=updated_on&c%5B%5D=category&c%5B%5D=fixed_version&c%5B%5D=cf_3&group_by=&t%5B%5D=), you need to make sure enough shards are created for each bucket up front.
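For example, you can check how full the index shards are and reshard a bucket by hand; the bucket name and shard count below are placeholders, and on a pre-Quincy multisite a manual reshard needs extra care since it is not propagated to the other zone:

```shell
# Show objects per shard and flag buckets over the fill threshold
radosgw-admin bucket limit check

# Inspect a single bucket (newer versions print num_shards here)
radosgw-admin bucket stats --bucket=my-bucket

# Reshard manually, e.g. to 101 shards (a prime number is commonly recommended)
radosgw-admin bucket reshard --bucket=my-bucket --num-shards=101
```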

Otherwise, at about 200k objects (~ keys) per shard you will receive this warning (the threshold used to be 2 million, see https://github.com/ceph/ceph/pull/29175/files).
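You can inspect the current threshold on a running cluster, and raise it temporarily if you just need to silence the warning until you can reshard (it counts omap keys, not RADOS objects):

```shell
# Default has been 200000 keys since the PR above
ceph config get osd osd_deep_scrub_large_omap_object_key_threshold

# Temporarily raise it, e.g. while waiting for a reshard window
ceph config set osd osd_deep_scrub_large_omap_object_key_threshold 500000
```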


we also face the same or at least a very similar problem. We are running
pacific (16.2.6 and 16.2.7, upgraded from 16.2.x to y to z) on both sides of 
the rgw multisite. In our case, the scrub errors occur on the secondary side 
only
Regarding your scrub errors: do those still come up at random?
Could you check with "rados list-inconsistent-obj" whether yours are within the OMAP data and in the metadata pools only?
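Something like the following, per affected PG — the pool name and PG ID are placeholders (take the PG IDs from "ceph health detail"), and jq is only there for readability:

```shell
# List PGs with scrub inconsistencies in a pool
rados list-inconsistent-pg default.rgw.log

# Dump the inconsistent objects of one PG; omap problems typically
# show up as errors like "omap_digest_mismatch"
rados list-inconsistent-obj 5.12 --format=json-pretty | jq '.inconsistents[].errors'
```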

Regards


Christian


_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io