Hi,

My first question on this list; this is my second attempt, since the first one
didn't seem to make it through.

I'm trying out RGW multi-site sync policies.

I have a test/PoC setup with 16.2.6, deployed using cephadm (which, by the way,
I DO like). I only use RGW/S3.
There is one realm (myrealm), one zonegroup (myzg) and three zones: zone A,
zone B and zone DR.

Zones A and B are on the same Ceph cluster with different RGW processes and
pools; zone DR is on a second cluster, in case that matters.
Without any sync policies, all data is synced perfectly fine between all zones.

What I want to see is:
- Data from zone A is synced directionally to zone DR, but not to zone B
- Data from zone B is not synced anywhere
- Data from zone DR is not synced back/anywhere
i.e. clients writing to zone A get their data "backed up" to zone DR, while
clients writing to zone B don't get their data backed up to a second zone.
Clients don't have access to zone DR.

I used sync policies this way:
radosgw-admin sync group create --group-id=drsync --status=allowed
radosgw-admin sync group flow create --group-id=drsync --flow-id=a2dr --flow-type=directional --source-zone=a --dest-zone=dr
radosgw-admin sync group pipe create --group-id=drsync --pipe-id=allbuck --source-zones='*' --source-bucket='*' --dest-zones='*' --dest-bucket='*'
radosgw-admin sync group modify --group-id=drsync --status=enabled
radosgw-admin period update --commit
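
To double-check what the policy actually resolves to per zone, I assume the
right commands are something like this (please correct me if there is a better
way):
radosgw-admin sync policy get          # dump the zonegroup-level sync policy
radosgw-admin sync info --rgw-zone=a   # expected sync sources/dests as seen from zone a
radosgw-admin sync info --rgw-zone=dr  # same, as seen from zone dr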

Now, from a data-in-buckets perspective, all looks fine:
Add data to bucket buck in zone A -> it appears in buck in zone DR, but not in B
Add data to bucket buck in zone B -> it doesn't appear in A or DR
Same for all other combinations, just as I want it.

BUT:
After adding or removing data, "radosgw-admin sync status" (with --rgw-zone set
to a, b or dr) always shows some "data is behind on X shards", apparently for
all the shards that are, intentionally, not synced. Those "behind" shards
accumulate over time and never go away again.
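
For completeness, these are the commands I'm looking at (and what I assume are
the ones to dig into those "behind" shards, though I might be holding them
wrong):
radosgw-admin sync status --rgw-zone=a      # this is where "data is behind on X shards" shows up
radosgw-admin sync error list --rgw-zone=a  # I assume real sync failures would be listed here
radosgw-admin data sync status --source-zone=b --rgw-zone=a  # per-source-zone data sync state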

Is that just annoying but normal? Or is it a bug?
Or is my configuration just "bad", and could it be changed so that I don't get
these sort-of errors in the sync status?

When I used bucket-level sync policies (only sync certain buckets from zone A
to DR, zone B not involved), I had, IIRC, the same effect.
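
(By bucket-level policies I mean a bucket-scoped group roughly like the
following, with "buck" and the group id picked just for illustration; flows
only exist at the zonegroup level, and if I read the docs correctly,
bucket-level policy changes don't even need a period commit:)
radosgw-admin sync group create --bucket=buck --group-id=buck-dr --status=enabled
radosgw-admin sync group pipe create --bucket=buck --group-id=buck-dr --pipe-id=pipe1 --source-zones='*' --dest-zones='*'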

What I'm really trying to achieve is something like "user sync policies", i.e.
- User X's data should be synced from A to DR
- User Y's data should only stay in A
I'm trying to emulate that with the existing/documented sync policies: user X
gets the URL for zone A, user Y gets the URL for zone B
(or, with bucket-level policies, both get the URL for zone A and syncing is
turned on "on demand" for certain of user X's existing buckets, which is
inconvenient; roughly what I have in mind is sketched below).
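
The "on demand" part would then be something like this per bucket (bucket name
hypothetical, and I'm assuming "sync group modify" accepts --bucket the same
way "sync group create" does):
radosgw-admin sync group create --bucket=userx-bucket --group-id=x2dr --status=enabled
radosgw-admin sync group pipe create --bucket=userx-bucket --group-id=x2dr --pipe-id=pipe1 --source-zones='*' --dest-zones='*'
# and later, to turn it off again:
radosgw-admin sync group modify --bucket=userx-bucket --group-id=x2dr --status=forbidden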

Best would be if I could just flip a flag and change the behaviour per user :)
And even better if it was easy to understand, i.e. if a user's sync flag is
turned on, all of their data is synced, not just new data; if the flag is
turned off, all of their data is removed from the DR zone.


Thanks for any input :)

Ciao, Uli