Running "object rewrite" on a couple of the objects in the bucket seems to have 
triggered the sync and now things appear ok.
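In case it helps anyone else hitting this, the rewrite was done per object with 
something along these lines (the object name here is just a placeholder; bucket-1 
is the test bucket mentioned further down the thread):

    radosgw-admin object rewrite --bucket=bucket-1 --object=<object-name>

Rewriting the object appears to be enough to generate a fresh log entry for the 
sync machinery to pick up.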

________________________________
From: Szabo, Istvan (Agoda) <istvan.sz...@agoda.com>
Sent: Thursday, June 9, 2022 3:24 PM
To: Wyll Ingersoll <wyllys.ingers...@keepertech.com>
Cc: ceph-users@ceph.io <ceph-users@ceph.io>; d...@ceph.io <d...@ceph.io>
Subject: Re: [ceph-users] Re: radosgw multisite sync - how to fix data behind 
shards?

Try data sync init and restart the gateways; sometimes this has helped me.

If this doesn't help, turn the sync policy on the bucket off and on again.
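
Roughly like this (the zone, bucket, and group names are only placeholders based 
on the earlier mails):

    # full data resync from the source zone, then restart the gateways
    radosgw-admin data sync init --source-zone=zone-1
    systemctl restart ceph-radosgw@<instance>

    # or toggle the bucket's sync policy group off and back on
    radosgw-admin sync group modify --bucket=bucket-1 --group-id=<group-id> --status=forbidden
    radosgw-admin sync group modify --bucket=bucket-1 --group-id=<group-id> --status=enabled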

Istvan Szabo
Senior Infrastructure Engineer
---------------------------------------------------
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---------------------------------------------------

On 2022. Jun 9., at 20:48, Wyll Ingersoll <wyllys.ingers...@keepertech.com> 
wrote:

________________________________

I ended up giving up after trying everything I could find in the forums and 
docs, deleted the problematic zone, re-added it to the zonegroup, and 
re-established the group sync policy for the bucket in question.  The 
sync status is OK now, though the error list still shows a bunch of errors from 
yesterday that I cannot figure out how to clear ("sync error trim" doesn't do 
anything as far as I can tell).
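
For the record, the recovery sequence was roughly along these lines (glossing 
over details; the zonegroup/zone names and the group/pipe IDs below are 
placeholders):

    # on the metadata master: drop the problem zone from the zonegroup and commit
    radosgw-admin zonegroup remove --rgw-zonegroup=<zonegroup> --rgw-zone=<bad-zone>
    radosgw-admin period update --commit

    # add it back and commit again
    radosgw-admin zonegroup add --rgw-zonegroup=<zonegroup> --rgw-zone=<bad-zone>
    radosgw-admin period update --commit

    # re-create the sync policy group and pipe for the bucket
    radosgw-admin sync group create --bucket=bucket-1 --group-id=<group-id> --status=enabled
    radosgw-admin sync group pipe create --bucket=bucket-1 --group-id=<group-id> --pipe-id=<pipe-id> --source-zones='*' --dest-zones='*'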

My opinion is that multisite sync policy in the current Pacific release 
(16.2.9) is still very fragile and poorly documented as far as troubleshooting 
goes.  I'd love to see clear explanations of the various data and metadata 
operations - metadata, data, bucket, bilog, datalog.  It's hard to know where 
to start when things get into a bad state and the online resources are not 
helpful enough.

Another question: if a sync policy is defined on a bucket that already has some 
objects in it, what command should be used to force a sync operation based on 
the new policy? It seems that only objects added AFTER the policy is applied 
get replicated; pre-existing ones are not.
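
(For context, the obvious candidates would seem to be something like the 
following, with the bucket/zone names from my test setup, but I can't confirm 
that either one back-fills pre-existing objects:)

    radosgw-admin bucket sync run --bucket=bucket-1 --source-zone=zone-1
    radosgw-admin data sync run --source-zone=zone-1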


________________________________
From: Wyll Ingersoll <wyllys.ingers...@keepertech.com>
Sent: Thursday, June 9, 2022 9:35 AM
To: Amit Ghadge <amitg....@gmail.com>; ceph-users@ceph.io <ceph-users@ceph.io>; 
d...@ceph.io <d...@ceph.io>
Subject: [ceph-users] Re: radosgw multisite sync - how to fix data behind 
shards?

I think you mean "radosgw-admin sync error list", in which case there are 32 
shards, each with the same error.  I dont see errors on the master zone logs so 
I'm not sure how to correct the situation.


       "shard_id": 31,
       "entries": [
           {
               "id": "1_1654722349.230688_62850.1",
               "section": "data",
               "name": 
"zone-1:a6ed5947-0ceb-407b-812f-347fab2ef62d.677322760.1:6",
               "timestamp": "2022-06-08T21:05:49.230688Z",
               "info": {
                   "source_zone": "a6ed5947-0ceb-407b-812f-347fab2ef62d",
                   "error_code": 125,
                   "message": "failed to sync bucket instance: (125) Operation 
canceled"
               }
           }
       ]
   }




________________________________
From: Amit Ghadge <amitg....@gmail.com>
Sent: Wednesday, June 8, 2022 9:16 PM
To: Wyll Ingersoll <wyllys.ingers...@keepertech.com>
Subject: Re: radosgw multisite sync - how to fix data behind shards?

Check for errors by running the command: radosgw-admin data sync error list


-AmitG


On Wed, Jun 8, 2022 at 2:44 PM Wyll Ingersoll 
<wyllys.ingers...@keepertech.com> wrote:

Seeking help from a radosgw expert...

I have a 3-zone multisite configuration (all running pacific 16.2.9) with 1 
bucket per zone and a couple of small objects in each bucket for testing 
purposes.
One of the secondary zones cannot seem to get into sync with the master; 
"radosgw-admin sync status" on that zone reports:


 metadata sync syncing
               full sync: 0/64 shards
               incremental sync: 64/64 shards
               metadata is caught up with master
     data sync source: a6ed5947-0ceb-407b-812f-347fab2ef62d (zone-1)
                       syncing
                       full sync: 128/128 shards
                       full sync: 66 buckets to sync
                       incremental sync: 0/128 shards
                       data is behind on 128 shards
                       behind shards: 
[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127]


I have tried using "data sync init" and restarting the radosgw multiple times, 
but that does not seem to be helping in any way.

If I manually do "radosgw-admin data sync run --bucket bucket-1" - it just 
hangs forever and doesn't appear to do anything.  Checking the sync status 
never shows any improvement in the shards.

It is very hard to figure out what to do, as there are several sync commands - 
bucket sync, data sync, metadata sync - and it is not clear what effect they 
have or how to properly run them when syncing gets confused.
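
For reference, the per-layer status commands I'm aware of are listed below 
(zone/bucket names as in my test setup), and each layer also has matching 
"init" and "run" subcommands, which is part of the confusion:

    radosgw-admin metadata sync status
    radosgw-admin data sync status --source-zone=zone-1
    radosgw-admin bucket sync status --bucket=bucket-1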

Any guidance on how to get out of this situation would be greatly appreciated.  
I've read lots of threads in various mailing list archives (via Google search) 
and very few of them have any resolution or recommendation that is confirmed to 
have fixed this sort of problem.


