Hi,

We've been trying to set up multi-site sync on two test VMs before rolling 
things out on actual production hardware. Both are running Ceph 18.2.4 deployed 
via cephadm. The host OS is Debian 12 and the container runtime is podman (we 
previously tried Debian 11 with docker.io and hit the same error there). There 
is only one RGW daemon on each site. The Ceph config is mostly defaults; the 
one thing I did change was setting rgw_relaxed_region_enforcement to true, 
because the zonegroup got renamed from "default" during the switch to 
multi-site via the dashboard's assistant. There's nothing special like 
server-side encryption in use either. Our end goal is to replicate all RGW data 
from our current cluster to a new one.
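For reference, that change amounts to something like the following (a rough 
sketch; the config target client.rgw is just the generic RGW section, a 
per-daemon section would work as well):

    # Relax zonegroup/region name enforcement for all RGW daemons
    ceph config set client.rgw rgw_relaxed_region_enforcement true
    # Verify the value is actually picked up
    ceph config get client.rgw rgw_relaxed_region_enforcement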

The multi-site configuration itself went pretty smoothly through the dashboard, 
and pre-existing data started syncing right away. Unfortunately, not all 
objects made it: none of the larger objects above the multipart threshold got 
synced. The same failure occurs consistently for newly uploaded multipart 
objects. Curiously, it works fine in the other direction, i.e. multipart 
uploads from the secondary zone do get synced to the master.
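In case anyone wants to check the same things, these are the standard commands 
I've been looking at (the bucket name foobar/new is taken from the failing 
entries below):

    # Overall multi-site sync state as seen from this zone
    radosgw-admin sync status
    # Per-bucket incremental sync state for the affected bucket
    radosgw-admin bucket sync status --bucket=foobar/new
    # Accumulated sync errors (the output below comes from this)
    radosgw-admin sync error list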

Here are some relevant logs:

From `radosgw-admin sync error list`:

    {
        "shard_id": 26,
        "entries": [
            {
                "id": "1_1722598249.479766_23730.1",
                "section": "data",
                "name": 
"foobar/new:5160b406-4428-4fdc-9c5d-5ec9fe9404c0.12564119.3:7/logstash_1%3a8.12.2-1_amd64.deb",
                "timestamp": "2024-08-02T11:30:49.479766Z",
                "info": {
                    "source_zone": "5160b406-4428-4fdc-9c5d-5ec9fe9404c0",
                    "error_code": 35,
                    "message": "failed to sync object(35) Resource deadlock 
avoided"
                }
            }
        ]
    },



From RGW on the receiving end:

    Aug 02 13:30:49 dev-ceph-single bash[754387]: debug 2024-08-02T11:30:49.474+0000 7f3a6243e640  0 rgw async rados processor: store->fetch_remote_obj() returned r=-35
    Aug 02 13:30:49 dev-ceph-single bash[754387]: debug 2024-08-02T11:30:49.474+0000 7f3a36b7b640  2 req 7168648379339657593 0.000000000s :list_data_changes_log normalizing buckets and tenants
    Aug 02 13:30:49 dev-ceph-single bash[754387]: debug 2024-08-02T11:30:49.474+0000 7f3a36b7b640  2 req 7168648379339657593 0.003999872s :list_data_changes_log init permissions
    Aug 02 13:30:49 dev-ceph-single bash[754387]: debug 2024-08-02T11:30:49.478+0000 7f3a36b7b640  2 req 7168648379339657593 0.003999872s :list_data_changes_log recalculating target
    Aug 02 13:30:49 dev-ceph-single bash[754387]: debug 2024-08-02T11:30:49.478+0000 7f3a36b7b640  2 req 7168648379339657593 0.003999872s :list_data_changes_log reading permissions
    Aug 02 13:30:49 dev-ceph-single bash[754387]: debug 2024-08-02T11:30:49.478+0000 7f3a36b7b640  2 req 7168648379339657593 0.003999872s :list_data_changes_log init op
    Aug 02 13:30:49 dev-ceph-single bash[754387]: debug 2024-08-02T11:30:49.478+0000 7f3a36b7b640  2 req 7168648379339657593 0.003999872s :list_data_changes_log verifying op mask
    Aug 02 13:30:49 dev-ceph-single bash[754387]: debug 2024-08-02T11:30:49.478+0000 7f3a36b7b640  2 req 7168648379339657593 0.003999872s :list_data_changes_log verifying op permissions
    Aug 02 13:30:49 dev-ceph-single bash[754387]: debug 2024-08-02T11:30:49.478+0000 7f3a36b7b640  2 overriding permissions due to system operation
    Aug 02 13:30:49 dev-ceph-single bash[754387]: debug 2024-08-02T11:30:49.478+0000 7f3a36b7b640  2 req 7168648379339657593 0.003999872s :list_data_changes_log verifying op params
    Aug 02 13:30:49 dev-ceph-single bash[754387]: debug 2024-08-02T11:30:49.478+0000 7f3a5241e640  0 RGW-SYNC:data:sync:shard[28]:entry[foobar/new:5160b406-4428-4fdc-9c5d-5ec9fe9404c0.12564119.3:7[0]]:bucket_sync_sources[source=foobar:new[5160b406-4428-4fdc-9c5d-5ec9fe9404c0.12564119.3]):7:source_zone=5160b406-4428-4fdc-9c5d-5ec9fe9404c0]:bucket[foobar/new:5160b406-4428-4fdc-9c5d-5ec9fe9404c0.12564119.3<-foobar/new:5160b406-4428-4fdc-9c5d-5ec9fe9404c0.12564119.3:7]:inc_sync[foobar/new:5160b406-4428-4fdc-9c5d-5ec9fe9404c0.12564119.3:7]:entry[logstash_1%3a8.12.2-1_amd64.deb]: ERROR: failed to sync object: foobar/new:5160b406-4428-4fdc-9c5d-5ec9fe9404c0.12564119.3:7/logstash_1%3a8.12.2-1_amd64.deb



And from the sender:

    Aug 02 13:30:49 test-ceph-single bash[885118]: debug 2024-08-02T11:30:49.476+0000 7f0acfdb2640  1 ====== req done req=0x7f0ab50e4710 op status=-104 http_status=200 latency=0.419986606s ======
    Aug 02 13:30:49 test-ceph-single bash[885118]: debug 2024-08-02T11:30:49.476+0000 7f0ba9f66640  2 req 5943847843579143466 0.000000000s initializing for trans_id = tx00000527cca1f3381a52a-0066acc369-c052e6-eu2
    Aug 02 13:30:49 test-ceph-single bash[885118]: debug 2024-08-02T11:30:49.476+0000 7f0acfdb2640  1 beast: 0x7f0ab50e4710: 10.139.0.151 - synchronization-user [02/Aug/2024:11:30:49.056 +0000] "GET /foobar%3Anew/logstash_1%253a8.12.2-1_amd64.deb?rgwx-zonegroup=9c1ee979-4362-45a1-ae70-2a83a30ea9fc&rgwx-prepend-metadata=true&rgwx-sync-manifest&rgwx-sync-cloudtiered&rgwx-skip-decrypt&rgwx-if-not-replicated-to=a0fab4b8-ec26-4a11-85dd-abab2e3205fa%3Afoobar%2Fnew%3A5160b406-4428-4fdc-9c5d-5ec9fe9404c0.12564119.3 HTTP/1.1" 200 138413732 - - - latency=0.419986606s
    Aug 02 13:30:49 test-ceph-single bash[885118]: debug 2024-08-02T11:30:49.476+0000 7f0ba9f66640  2 req 5943847843579143466 0.000000000s getting op 0
    Aug 02 13:30:49 test-ceph-single bash[885118]: debug 2024-08-02T11:30:49.476+0000 7f0ba9f66640  2 req 5943847843579143466 0.000000000s :list_metadata_log verifying requester



All of the failing objects keep running into the same error: "failed to sync 
object(35) Resource deadlock avoided".
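In case it helps with interpretation: on Linux, error code 35 is the plain 
errno EDEADLK, and the "op status=-104" on the sender side presumably maps to 
ECONNRESET, i.e. the connection being reset mid-transfer. A quick way to decode 
the numbers:

    # Decode the errno values showing up in the logs (Linux errno numbers)
    python3 -c 'import os; print(35, os.strerror(35)); print(104, os.strerror(104))'
    # -> 35 Resource deadlock avoided
    # -> 104 Connection reset by peer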

Any ideas? Thanks!

--
Mit freundlichem Gruß // Best regards,

Tino Lehnig
Cloud Architect

Contabo GmbH
Aschauer Straße 32a
81549 München
https://contabo.com

E-Mail: tino.leh...@contabo.de

Amtsgericht München
HRB 180722
Vertretungsberechtigte Geschäftsführer:
Dr. Christian Böing & Thomas Schimmel