[ceph-users] radosgw-agent keeps syncing most active bucket - ignoring others

2015-08-18 Thread Sam Wouters
Hi,

From the radosgw-agent docs and some threads on this list, I understood
that the max-entries argument was there to prevent a very active bucket
from keeping the other buckets from being synced. In our agent logs,
however, we saw a lot of "bucket instance bla has 1000 entries after
bla" messages, and the agent kept on syncing that active bucket.

Looking at the code, in the DataWorkerIncremental class, it looks like
the agent keeps fetching log entries from the bucket until it receives
fewer entries than max_entries. Is this intended behaviour? I would have
expected it to just pass max_entries log entries on for processing and
advance the marker.
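
To illustrate, here is a rough sketch of the loop as I read it (this is
not the actual radosgw-agent code; fetch_log_page stands in for the call
that returns up to max_entries log entries for a bucket instance, and the
marker handling is simplified):

    def drain_bucket_log(fetch_log_page, marker, max_entries=1000):
        # Keep fetching pages until a short page signals the log is drained.
        entries = []
        while True:
            page = fetch_log_page(marker=marker, max_entries=max_entries)
            entries.extend(page)
            if page:
                marker = page[-1]['id']
            if len(page) < max_entries:
                # Only returns once this bucket's log is exhausted, so a
                # busy bucket keeps the worker occupied.
                return entries, marker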

Is there any other way to make sure the less active buckets get synced
frequently? We've tried increasing num-workers, but that only affects the
first pass.

Thanks,
Sam
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw-agent keeps syncing most active bucket - ignoring others

2015-08-18 Thread Sam Wouters
Hmm,

looks like this is intended behaviour:

SNIP
CommitDate: Mon Mar 3 06:08:42 2014 -0800

   worker: process all bucket instance log entries at once

   Currently if there are more than max_entries in a single bucket
   instance's log, only max_entries of those will be processed, and the
   bucket instance will not be examined again until it is modified again.

   To keep it simple, get the entire log of entries to be updated and
   process them all at once. This means one busy shard may block others
   from syncing, but multiple instances of radosgw-agent can be run to
   circumvent that issue. With only one instance, users can be sure
   everything is synced when an incremental sync completes with no
   errors.
/SNIP
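
What I had expected is closer to this, again just a sketch with the same
stand-ins: take at most one page of max_entries per pass, remember the
marker, and let the worker move on to the next bucket:

    def sync_one_page(fetch_log_page, process_entries, marker, max_entries=1000):
        # Process at most one page, then yield so other buckets get a turn.
        page = fetch_log_page(marker=marker, max_entries=max_entries)
        if page:
            process_entries(page)
            marker = page[-1]['id']  # persist so the next pass resumes here
        caught_up = len(page) < max_entries
        return marker, caught_up

The commit message explains why it doesn't work that way: with only one
agent you could then no longer be sure everything was synced when an
incremental pass completed without errors.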

However, this brings us to a new issue. After starting a second agent,
one of the agents stayed busy syncing the busy shard while the other
agent correctly synced all of the other buckets. So far, so good. But
since a few of those buckets are almost static, it looks like the agent
started syncing them all over again from the beginning on a second pass.
As versioning was enabled on those buckets only after they were created,
when they already contained objects and had objects removed from them,
it seems like the agent is copying those unversioned objects as
versioned ones, creating a lot of delete markers and multiple versions
in the secondary zone.

Does anyone have an idea how to handle this correctly? I already did a
cleanup some weeks ago, but if the agent keeps restarting the sync from
the beginning, I'll have to clean up every time.
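
For context, the kind of cleanup I mean looks roughly like this (just a
sketch using boto3; the endpoint, credentials and bucket name are
placeholders for our setup, and it assumes those buckets should not hold
any legitimate delete markers):

    import boto3

    def remove_delete_markers(bucket, endpoint, access_key, secret_key,
                              dry_run=True):
        # List every object version and drop the delete markers the sync
        # created; run with dry_run=True first to see what would go.
        s3 = boto3.client('s3', endpoint_url=endpoint,
                          aws_access_key_id=access_key,
                          aws_secret_access_key=secret_key)
        paginator = s3.get_paginator('list_object_versions')
        for page in paginator.paginate(Bucket=bucket):
            for dm in page.get('DeleteMarkers', []):
                print('delete marker: %s (version %s)'
                      % (dm['Key'], dm['VersionId']))
                if not dry_run:
                    s3.delete_object(Bucket=bucket, Key=dm['Key'],
                                     VersionId=dm['VersionId'])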

regards,
Sam

On 18-08-15 09:36, Sam Wouters wrote:
 Hi,

 From the radosgw-agent docs and some threads on this list, I understood
 that the max-entries argument was there to prevent a very active bucket
 from keeping the other buckets from being synced. In our agent logs,
 however, we saw a lot of "bucket instance bla has 1000 entries after
 bla" messages, and the agent kept on syncing that active bucket.

 Looking at the code, in the DataWorkerIncremental class, it looks like
 the agent keeps fetching log entries from the bucket until it receives
 fewer entries than max_entries. Is this intended behaviour? I would have
 expected it to just pass max_entries log entries on for processing and
 advance the marker.

 Is there any other way to make sure the less active buckets get synced
 frequently? We've tried increasing num-workers, but that only affects the
 first pass.

 Thanks,
 Sam

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com