Hmm,
looks like intended behaviour:
SNIP
CommitDate: Mon Mar 3 06:08:42 2014 -0800
worker: process all bucket instance log entries at once
Currently if there are more than max_entries in a single bucket
instance's log, only max_entries of those will be processed, and the
bucket instance will not be examined again until it is modified again.
To keep it simple, get the entire log of entries to be updated and
process them all at once. This means one busy shard may block others
from syncing, but multiple instances of radosgw-agent can be run to
circumvent that issue. With only one instance, users can be sure
everything is synced when an incremental sync completes with no
errors.
/SNIP
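The behaviour that commit message describes could be sketched roughly like
this (hypothetical names; fetch_log is a stand-in for the agent's paginated
bucket-log fetch, not the actual radosgw-agent code):

```python
def drain_bucket_log(fetch_log, marker=None, max_entries=1000):
    """Fetch a bucket instance's log in pages of max_entries until it
    is exhausted, returning all entries for processing in one go.

    fetch_log(marker, max_entries) is a hypothetical pager returning a
    list of (marker, entry) pairs; a short page signals the end.
    """
    entries = []
    while True:
        page = fetch_log(marker, max_entries)
        entries.extend(page)
        if len(page) < max_entries:
            break  # log exhausted
        # a busy bucket keeps producing full pages, so the worker
        # stays in this loop and other shards have to wait
        marker = page[-1][0]  # advance past the last entry seen
    return entries
```

This is what makes one busy shard block the others: the worker does not
move on until the whole log is drained.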
However, this brings us to a new issue. After starting a second agent,
one of the agents kept busy syncing the busy shard while the other
correctly synced all of the other buckets. So far, so good. But since a
few of those buckets are almost static, it looks like the agent started
syncing them again from the beginning in a second pass.
As versioning was enabled on those buckets after they were created, with
objects already present and removed in there, it seems the agent is
copying those unversioned objects as versioned ones, creating a lot of
delete markers and multiple versions in the secondary zone.
Does anyone have an idea how to handle this correctly? I already did a
cleanup some weeks ago, but if the agent keeps restarting the sync from
the beginning, I'll have to clean up every time.
regards,
Sam
On 18-08-15 09:36, Sam Wouters wrote:
Hi,
from the radosgw-agent documentation and some items on this list, I
understood that the max-entries argument was there to prevent a very
active bucket from keeping the other buckets from syncing. In our agent
logs, however, we saw a lot of "bucket instance bla has 1000 entries
after bla" messages, and the agent kept on syncing that active bucket.
Looking at the code, in class DataWorkerIncremental, it looks like the
agent loops fetching log entries for the bucket until it receives fewer
entries than max_entries. Is this intended behaviour? I would expect it
to just pass max_entries log entries for processing and advance the
marker.
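What I would have expected looks more like this sketch (hypothetical
names again; fetch_log and process are stand-ins for the agent's log
fetch and sync steps, not the actual code): process one page of at most
max_entries, persist the marker, and move on to the next bucket.

```python
def sync_one_page(fetch_log, process, marker=None, max_entries=1000):
    """Process at most max_entries log entries for one bucket
    instance, then return the new marker so other buckets get a turn.

    fetch_log(marker, max_entries) returns a list of (marker, entry)
    pairs; process(entry) syncs a single entry.
    """
    page = fetch_log(marker, max_entries)
    for _, entry in page:
        process(entry)
    if page:
        marker = page[-1][0]  # resume after the last processed entry
    # caller stores the marker and picks it up on the next pass,
    # instead of looping here until the busy bucket's log is drained
    return marker
```

That way a busy bucket only costs one page per pass and quieter buckets
still get serviced.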
Is there any other way to make sure less active buckets are synced
frequently? We've tried increasing num-workers, but this only affects
the first pass.
Thanks,
Sam
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com