Re: [ceph-users] rgw multisite resync only one bucket

Marius Vaitiekunas Tue, 28 Feb 2017 23:13:25 -0800

On Mon, Feb 27, 2017 at 11:40 AM, Marius Vaitiekunas <
mariusvaitieku...@gmail.com> wrote:


>
>
> On Mon, Feb 27, 2017 at 9:59 AM, Marius Vaitiekunas <
> mariusvaitieku...@gmail.com> wrote:
>
>>
>>
>> On Fri, Feb 24, 2017 at 6:35 PM, Yehuda Sadeh-Weinraub
>> <yeh...@redhat.com> wrote:
>>
>>> On Fri, Feb 24, 2017 at 3:59 AM, Marius Vaitiekunas
>>> <mariusvaitieku...@gmail.com> wrote:
>>> >
>>> >
>>> > On Wed, Feb 22, 2017 at 8:33 PM, Yehuda Sadeh-Weinraub
>>> > <yeh...@redhat.com> wrote:
>>> >>
>>> >> On Wed, Feb 22, 2017 at 6:19 AM, Marius Vaitiekunas
>>> >> <mariusvaitieku...@gmail.com> wrote:
>>> >> > Hi Cephers,
>>> >> >
>>> >> > We are testing the rgw multisite solution between two DCs. We have
>>> >> > one zonegroup and two zones. At the moment all writes/deletes are
>>> >> > done only to the primary zone.
>>> >> >
>>> >> > Sometimes not all the objects are replicated. We've written a
>>> >> > Prometheus exporter to check replication status. It gives us each
>>> >> > bucket's object count from the user perspective, because we have
>>> >> > millions of objects and hundreds of buckets. We just want to be sure
>>> >> > that everything is replicated, without using ceph internals like the
>>> >> > rgw admin api for now.
>>> >> >
>>> >> > Is it possible to initiate a full resync of only one rgw bucket from
>>> >> > the master zone? What are the options for resync when things go wrong
>>> >> > and replication misses some objects?
>>> >> >
>>> >> > We run latest jewel 10.2.5.
>>> >>
>>> >>
>>> >> There's the 'radosgw-admin bucket sync init' command that you can run
>>> >> on the specific bucket on the target zone. This will reinitialize the
>>> >> sync state, so that when it starts syncing it will go through the
>>> >> whole full sync process. Note that it shouldn't actually copy data
>>> >> that already exists on the target. Also, in order to actually start
>>> >> the sync, you'll need to have some change that would trigger the sync
>>> >> on that bucket, e.g., create a new object there.
>>> >>
>>> >> Yehuda
>>> >>
>>> >
>>> > Hi,
>>> >
>>> > I've tried to resync a bucket, but it didn't manage to resync a missing
>>> > object. If I try to copy the missing object by hand into the secondary
>>> > zone, I get asked to overwrite an existing object. It looks like the
>>> > object is replicated, but is not in the bucket index. I've tried to check
>>> > the bucket index with the --fix and --check-objects flags, but nothing
>>> > changes. What else should I try?
>>> >
>>>
>>> That's weird. Do you see anything when you run 'radosgw-admin bi list
>>> --bucket=<bucket>'?
>>>
>>> Yehuda
>>>
>>
>> 'radosgw-admin bi list --bucket=<bucket>' gives me an error:
>> 2017-02-27 08:55:30.861659 7f20c15779c0  0 error in read_id for id  : (2) No such file or directory
>> 2017-02-27 08:55:30.861991 7f20c15779c0  0 error in read_id for id  : (2) No such file or directory
>> ERROR: bi_list(): (5) Input/output error
>>
>> 'radosgw-admin bucket list --bucket=<bucket>' successfully lists all the
>> files except the missing ones.
>>
>>
>>
>>
>
> I've done some more investigation. These missing objects can be found in
> the "rgw.buckets.data" pool, but the bucket index is not aware of them.
> How does 'radosgw-admin bucket check -b <bucket> --fix --check-objects'
> work?
> I guess that it's not scanning the "rgw.buckets.data" pool for "leaked"
> objects? These unreplicated objects look the same to me as leaked ones
> :)
>
>
>
By the way, in the rgw logs I can find all the missing files with an HTTP 304
return code. For example:
"GET /go84/WRWRDGROWKFKROTWKHXXIBHERRLHBK HTTP/1.1" 304 0 - -
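
For anyone who wants to pull such entries out of their own logs, here is a
rough Python sketch. It only assumes that the request part of each log line
looks like the example above; the function name and the regular expression
are my own illustration, not anything shipped with rgw, and other log
formats would need a different pattern.

```python
import re

# Matches the request/status part of a log line shaped like the example:
#   "GET /<bucket>/<key> HTTP/1.1" 304 0 - -
# Assumption: path-style requests, one bucket segment followed by the key.
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) /(?P<bucket>[^/\s]+)/(?P<key>\S+) HTTP/[\d.]+" '
    r'(?P<status>\d{3}) '
)


def not_modified_objects(lines):
    """Return (bucket, key) pairs for GET requests answered with HTTP 304."""
    hits = []
    for line in lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # line does not contain a request in the expected shape
        if match.group("method") == "GET" and match.group("status") == "304":
            hits.append((match.group("bucket"), match.group("key")))
    return hits
```

Feeding it the lines of a gateway log gives a list of (bucket, key) pairs
that can be compared against the list of objects missing on the secondary
zone. Any query string in the request is kept as part of the key.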

All the gateways in both sites are behind haproxies. Any ideas?
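
In case it helps to reproduce the check, this is roughly the comparison we
do from the user perspective, reduced to a minimal sketch. It assumes the
key listings for each bucket have already been fetched from both zones with
an ordinary S3 client; the function name and the dict layout are
hypothetical, and this is not the actual exporter code.

```python
def replication_gaps(primary, secondary):
    """Return {bucket: sorted keys present on primary but not on secondary}.

    Both arguments map a bucket name to an iterable of object keys, as
    listed through the S3 API on each zone. Buckets with no gaps are
    left out of the result.
    """
    gaps = {}
    for bucket, keys in primary.items():
        # A bucket that is absent on the secondary counts as fully missing.
        missing = set(keys) - set(secondary.get(bucket, ()))
        if missing:
            gaps[bucket] = sorted(missing)
    return gaps
```

For example, with primary = {"go84": ["a", "b"]} and secondary =
{"go84": ["a"]}, the result names "b" as the only missing key in go84.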


-- 
Marius Vaitiekūnas

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
