Hi Casey,

I set up a completely fresh cluster on a new VM host; everything is new. It 
appears to have installed cleanly, and because the clusters are peer VMs with 
practically zero latency and effectively unlimited bandwidth between them, this 
is a better place to experiment. The behavior is the same as on the other 
cluster.

The realm is “example-test”, it has a single zone group named “us”, and the 
zones are “left” and “right”. The master zone is “left” and I am trying to 
replicate unidirectionally to “right”. “left” is a two-node cluster and “right” 
is a single-node cluster. Both show “too few PGs per OSD” but are otherwise 
100% active+clean. Both clusters have been completely restarted to make sure 
there are no latent config issues, although only the RGW nodes should require 
that.

The thread at [1] is the most involved engagement I’ve found with a staff 
member on the subject, so I checked it and believe I have included all the logs 
that were requested there. They all appear to be consistent and are included 
below.

To start:
> [root@right01 ~]# radosgw-admin sync status
>           realm d5078dd2-6a6e-49f8-941e-55c02ad58af7 (example-test)
>       zonegroup de533461-2593-45d2-8975-99072d860bb2 (us)
>            zone 5dc80bbc-3d9d-46d5-8f3e-4611fbc17fbe (right)
>   metadata sync syncing
>                 full sync: 0/64 shards
>                 incremental sync: 64/64 shards
>                 metadata is caught up with master
>       data sync source: 479d3f20-d57d-4b37-995b-510ba10756bf (left)
>                         syncing
>                         full sync: 0/128 shards
>                         incremental sync: 128/128 shards
>                         data is caught up with source


I tried the information at [2] and do not see any ops in progress, just 
“linger_ops”. I don’t know what those are, but they probably explain the slow 
stream of requests back and forth between the two RGW endpoints:
> [root@right01 ~]# ceph daemon client.rgw.right01.54395.94074682941968 objecter_requests
> {
>     "ops": [],
>     "linger_ops": [
>         {
>             "linger_id": 2,
>             "pg": "2.16dafda0",
>             "osd": 0,
>             "object_id": "notify.1",
>             "object_locator": "@2",
>             "target_object_id": "notify.1",
>             "target_object_locator": "@2",
>             "paused": 0,
>             "used_replica": 0,
>             "precalc_pgid": 0,
>             "snapid": "head",
>             "registered": "1"
>         },
>         ...
>     ],
>     "pool_ops": [],
>     "pool_stat_ops": [],
>     "statfs_ops": [],
>     "command_ops": []
> }
> 
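
To compare this output across runs without reading the JSON by eye, the number 
of entries in each section can be tallied. This is only a minimal sketch: it 
assumes the `objecter_requests` output has been captured as a string, and the 
function name `summarize_objecter` and the trimmed sample are my own 
illustration, not part of any Ceph tool.

```python
import json

def summarize_objecter(raw: str) -> dict:
    """Count the entries in each section of an objecter_requests dump."""
    dump = json.loads(raw)
    # Every top-level value in the dump shown above is a list of requests.
    return {section: len(entries) for section, entries in dump.items()
            if isinstance(entries, list)}

# Sample trimmed from the output above: one linger op kept, the elided ones dropped.
sample = """
{
    "ops": [],
    "linger_ops": [
        {"linger_id": 2, "pg": "2.16dafda0", "osd": 0, "object_id": "notify.1"}
    ],
    "pool_ops": [],
    "pool_stat_ops": [],
    "statfs_ops": [],
    "command_ops": []
}
"""

counts = summarize_objecter(sample)
for section, n in counts.items():
    print(f"{section}: {n}")
```

An empty “ops” count together with a non-zero “linger_ops” count is the 
situation described above.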


The next thing I tried is `radosgw-admin data sync run --source-zone=left` from 
the right side. I get bursts of messages of the following form:
> 2019-04-19 21:46:34.281 7f1c006ad580  0 RGW-SYNC:data:sync:shard[1]: ERROR: failed to read remote data log info: ret=-2
> 2019-04-19 21:46:34.281 7f1c006ad580  0 meta sync: ERROR: RGWBackoffControlCR called coroutine returned -2


When I sorted and filtered the messages, each burst had one RGW-SYNC message 
for each shard number in “[]”, which I take to correspond to the PGs on the 
left side. Since left has 128 PGs, matching the 128 data sync shards in the 
status output above, the numbers run from 0 to 127. The bursts happen about 
once every five seconds.
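
The sorting and filtering can be sketched roughly as follows. This assumes 
each log message has been rejoined onto a single line (the mail client wraps 
them), and the names `shards_with_errors` and `missing_shards` are hypothetical 
helpers of mine. It also prints the symbolic name of errno 2, since the 
`ret=-2` in these messages looks like a negated errno value.

```python
import errno
import re

# Matches the shard number and return code in RGW-SYNC error lines.
SHARD_ERR = re.compile(r"RGW-SYNC:data:sync:shard\[(\d+)\]: ERROR: .*ret=(-?\d+)")

def shards_with_errors(lines):
    """Map each shard number to the set of return codes logged for it."""
    seen = {}
    for line in lines:
        m = SHARD_ERR.search(line)
        if m:
            seen.setdefault(int(m.group(1)), set()).add(int(m.group(2)))
    return seen

def missing_shards(seen, total=128):
    """Shard numbers in 0..total-1 that never appeared in the log."""
    return sorted(set(range(total)) - set(seen))

# Two sample lines taken from the burst shown above.
log = [
    "2019-04-19 21:46:34.281 7f1c006ad580  0 RGW-SYNC:data:sync:shard[1]: "
    "ERROR: failed to read remote data log info: ret=-2",
    "2019-04-19 21:46:34.281 7f1c006ad580  0 meta sync: ERROR: "
    "RGWBackoffControlCR called coroutine returned -2",
]

seen = shards_with_errors(log)
print(seen)                        # {1: {-2}}
print(len(missing_shards(seen)))   # 127
print(errno.errorcode[2])          # ENOENT
```

Run against a full burst, `missing_shards` should return an empty list if 
every shard from 0 to 127 really is reported.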

The packet traces between the nodes during the `data sync run` are mostly 
requests and responses of the following form:
> HTTP GET: http://right01.example.com:7480/admin/log/?type=data&id=7&marker&extra-info=true&rgwx-zonegroup=de533461-2593-45d2-8975-99072d860bb2
> HTTP 404 RESPONSE: {"Code":"NoSuchKey","RequestId":"tx000000000000000002a01-005cba9593-371d-right","HostId":"371d-right-us"}
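
To make the request easier to read, the query string can be split into its 
parameters. This is just a sketch using Python’s standard `urllib.parse` on the 
URL from the trace above; it does not contact either endpoint and says nothing 
about how RGW interprets each parameter.

```python
from urllib.parse import urlsplit, parse_qs

# URL copied from the packet trace above.
url = ("http://right01.example.com:7480/admin/log/"
       "?type=data&id=7&marker&extra-info=true"
       "&rgwx-zonegroup=de533461-2593-45d2-8975-99072d860bb2")

parts = urlsplit(url)
# keep_blank_values=True preserves the bare "marker" parameter, which has no value.
params = {name: values[0]
          for name, values in parse_qs(parts.query, keep_blank_values=True).items()}

print(parts.netloc)   # right01.example.com:7480
print(parts.path)     # /admin/log/
for name, value in params.items():
    print(f"{name} = {value!r}")
```

Here `id=7` is the shard being queried and `marker` is sent empty.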

When I stop the `data sync run`, these 404s stop, so clearly `data sync run` 
isn’t changing any state in the RGW but is doing the work synchronously itself. 
In the past I have run `data sync init`, but it doesn’t seem that repeating it 
would make a difference, so I have not done it again.

NEXT STEPS:

I am working on getting better logging output from the daemons and hope it 
will turn up something useful. If it does, I will report back so this thread 
is useful for others. If I have not written back, I probably haven’t found 
anything, so I would be grateful for any leads.

Kind regards and thank you!

Brian

[1] http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-September/013188.html
[2] http://docs.ceph.com/docs/master/radosgw/troubleshooting/?highlight=linger_ops#blocked-radosgw-requests

CONFIG DUMPS:

> [root@left01 ~]# radosgw-admin period get-current
> {
>     "current_period": "cdc3d603-2bc8-493b-ba6a-c6a51c49cc0c"
> }
> [root@left01 ~]# radosgw-admin period get cdc3d603-2bc8-493b-ba6a-c6a51c49cc0c
> {
>     "id": "cdc3d603-2bc8-493b-ba6a-c6a51c49cc0c",
>     "epoch": 6,
>     "predecessor_uuid": "1f87151a-a1e4-469b-9f90-c309d7b64d80",
>     "sync_status": [],
>     "period_map": {
>         "id": "cdc3d603-2bc8-493b-ba6a-c6a51c49cc0c",
>         "zonegroups": [
>             {
>                 "id": "de533461-2593-45d2-8975-99072d860bb2",
>                 "name": "us",
>                 "api_name": "us",
>                 "is_master": "true",
>                 "endpoints": [
>                     "http://left01.example.com:7480"
>                 ],
>                 "hostnames": [],
>                 "hostnames_s3website": [],
>                 "master_zone": "479d3f20-d57d-4b37-995b-510ba10756bf",
>                 "zones": [
>                     {
>                         "id": "479d3f20-d57d-4b37-995b-510ba10756bf",
>                         "name": "left",
>                         "endpoints": [
>                             "http://left01.example.com:7480"
>                         ],
>                         "log_meta": "false",
>                         "log_data": "true",
>                         "bucket_index_max_shards": 0,
>                         "read_only": "false",
>                         "tier_type": "",
>                         "sync_from_all": "true",
>                         "sync_from": [],
>                         "redirect_zone": ""
>                     },
>                     {
>                         "id": "5dc80bbc-3d9d-46d5-8f3e-4611fbc17fbe",
>                         "name": "right",
>                         "endpoints": [
>                             "http://right01.example.com:7480"
>                         ],
>                         "log_meta": "false",
>                         "log_data": "true",
>                         "bucket_index_max_shards": 0,
>                         "read_only": "false",
>                         "tier_type": "",
>                         "sync_from_all": "true",
>                         "sync_from": [],
>                         "redirect_zone": ""
>                     }
>                 ],
>                 "placement_targets": [
>                     {
>                         "name": "default-placement",
>                         "tags": [],
>                         "storage_classes": [
>                             "STANDARD"
>                         ]
>                     }
>                 ],
>                 "default_placement": "default-placement",
>                 "realm_id": "d5078dd2-6a6e-49f8-941e-55c02ad58af7"
>             }
>         ],
>         "short_zone_ids": [
>             {
>                 "key": "479d3f20-d57d-4b37-995b-510ba10756bf",
>                 "val": 1817029288
>             },
>             {
>                 "key": "5dc80bbc-3d9d-46d5-8f3e-4611fbc17fbe",
>                 "val": 1573215025
>             }
>         ]
>     },
>     "master_zonegroup": "de533461-2593-45d2-8975-99072d860bb2",
>     "master_zone": "479d3f20-d57d-4b37-995b-510ba10756bf",
>     "period_config": {
>         "bucket_quota": {
>             "enabled": false,
>             "check_on_raw": false,
>             "max_size": -1,
>             "max_size_kb": 0,
>             "max_objects": -1
>         },
>         "user_quota": {
>             "enabled": false,
>             "check_on_raw": false,
>             "max_size": -1,
>             "max_size_kb": 0,
>             "max_objects": -1
>         }
>     },
>     "realm_id": "d5078dd2-6a6e-49f8-941e-55c02ad58af7",
>     "realm_name": "example-test",
>     "realm_epoch": 2
> }
> [root@left01 ~]# radosgw-admin zonegroup get
> {
>     "id": "de533461-2593-45d2-8975-99072d860bb2",
>     "name": "us",
>     "api_name": "us",
>     "is_master": "true",
>     "endpoints": [
>         "http://left01.example.com:7480"
>     ],
>     "hostnames": [],
>     "hostnames_s3website": [],
>     "master_zone": "479d3f20-d57d-4b37-995b-510ba10756bf",
>     "zones": [
>         {
>             "id": "479d3f20-d57d-4b37-995b-510ba10756bf",
>             "name": "left",
>             "endpoints": [
>                 "http://left01.example.com:7480"
>             ],
>             "log_meta": "false",
>             "log_data": "true",
>             "bucket_index_max_shards": 0,
>             "read_only": "false",
>             "tier_type": "",
>             "sync_from_all": "true",
>             "sync_from": [],
>             "redirect_zone": ""
>         },
>         {
>             "id": "5dc80bbc-3d9d-46d5-8f3e-4611fbc17fbe",
>             "name": "right",
>             "endpoints": [
>                 "http://right01.example.com:7480"
>             ],
>             "log_meta": "false",
>             "log_data": "true",
>             "bucket_index_max_shards": 0,
>             "read_only": "false",
>             "tier_type": "",
>             "sync_from_all": "true",
>             "sync_from": [],
>             "redirect_zone": ""
>         }
>     ],
>     "placement_targets": [
>         {
>             "name": "default-placement",
>             "tags": [],
>             "storage_classes": [
>                 "STANDARD"
>             ]
>         }
>     ],
>     "default_placement": "default-placement",
>     "realm_id": "d5078dd2-6a6e-49f8-941e-55c02ad58af7"
> }
> [root@left01 ~]# radosgw-admin period get
> {
>     "id": "cdc3d603-2bc8-493b-ba6a-c6a51c49cc0c",
>     "epoch": 6,
>     "predecessor_uuid": "1f87151a-a1e4-469b-9f90-c309d7b64d80",
>     "sync_status": [],
>     "period_map": {
>         "id": "cdc3d603-2bc8-493b-ba6a-c6a51c49cc0c",
>         "zonegroups": [
>             {
>                 "id": "de533461-2593-45d2-8975-99072d860bb2",
>                 "name": "us",
>                 "api_name": "us",
>                 "is_master": "true",
>                 "endpoints": [
>                     "http://left01.example.com:7480"
>                 ],
>                 "hostnames": [],
>                 "hostnames_s3website": [],
>                 "master_zone": "479d3f20-d57d-4b37-995b-510ba10756bf",
>                 "zones": [
>                     {
>                         "id": "479d3f20-d57d-4b37-995b-510ba10756bf",
>                         "name": "left",
>                         "endpoints": [
>                             "http://left01.example.com:7480"
>                         ],
>                         "log_meta": "false",
>                         "log_data": "true",
>                         "bucket_index_max_shards": 0,
>                         "read_only": "false",
>                         "tier_type": "",
>                         "sync_from_all": "true",
>                         "sync_from": [],
>                         "redirect_zone": ""
>                     },
>                     {
>                         "id": "5dc80bbc-3d9d-46d5-8f3e-4611fbc17fbe",
>                         "name": "right",
>                         "endpoints": [
>                             "http://right01.example.com:7480"
>                         ],
>                         "log_meta": "false",
>                         "log_data": "true",
>                         "bucket_index_max_shards": 0,
>                         "read_only": "false",
>                         "tier_type": "",
>                         "sync_from_all": "true",
>                         "sync_from": [],
>                         "redirect_zone": ""
>                     }
>                 ],
>                 "placement_targets": [
>                     {
>                         "name": "default-placement",
>                         "tags": [],
>                         "storage_classes": [
>                             "STANDARD"
>                         ]
>                     }
>                 ],
>                 "default_placement": "default-placement",
>                 "realm_id": "d5078dd2-6a6e-49f8-941e-55c02ad58af7"
>             }
>         ],
>         "short_zone_ids": [
>             {
>                 "key": "479d3f20-d57d-4b37-995b-510ba10756bf",
>                 "val": 1817029288
>             },
>             {
>                 "key": "5dc80bbc-3d9d-46d5-8f3e-4611fbc17fbe",
>                 "val": 1573215025
>             }
>         ]
>     },
>     "master_zonegroup": "de533461-2593-45d2-8975-99072d860bb2",
>     "master_zone": "479d3f20-d57d-4b37-995b-510ba10756bf",
>     "period_config": {
>         "bucket_quota": {
>             "enabled": false,
>             "check_on_raw": false,
>             "max_size": -1,
>             "max_size_kb": 0,
>             "max_objects": -1
>         },
>         "user_quota": {
>             "enabled": false,
>             "check_on_raw": false,
>             "max_size": -1,
>             "max_size_kb": 0,
>             "max_objects": -1
>         }
>     },
>     "realm_id": "d5078dd2-6a6e-49f8-941e-55c02ad58af7",
>     "realm_name": "example-test",
>     "realm_epoch": 2
> }
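
In case it helps when comparing the dumps from the two sides, the per-zone 
settings can be pulled out of the `zonegroup get` JSON. This is a rough sketch 
only: `zone_summary` is a name I made up, the sample is trimmed from the 
output above, and it assumes the boolean settings appear as the strings 
"true" and "false" as they do in these dumps.

```python
import json

def zone_summary(zonegroup_json: str) -> list:
    """List each zone's name, master flag, endpoints, and log/sync settings."""
    zg = json.loads(zonegroup_json)
    master_id = zg["master_zone"]
    rows = []
    for zone in zg["zones"]:
        rows.append({
            "name": zone["name"],
            "is_master": zone["id"] == master_id,
            "endpoints": zone["endpoints"],
            # The dumps above encode these booleans as strings.
            "log_meta": zone["log_meta"] == "true",
            "log_data": zone["log_data"] == "true",
            "sync_from_all": zone["sync_from_all"] == "true",
        })
    return rows

# Trimmed from the `radosgw-admin zonegroup get` output above.
sample = """
{
    "master_zone": "479d3f20-d57d-4b37-995b-510ba10756bf",
    "zones": [
        {"id": "479d3f20-d57d-4b37-995b-510ba10756bf", "name": "left",
         "endpoints": ["http://left01.example.com:7480"],
         "log_meta": "false", "log_data": "true", "sync_from_all": "true"},
        {"id": "5dc80bbc-3d9d-46d5-8f3e-4611fbc17fbe", "name": "right",
         "endpoints": ["http://right01.example.com:7480"],
         "log_meta": "false", "log_data": "true", "sync_from_all": "true"}
    ]
}
"""

rows = zone_summary(sample)
for row in rows:
    print(row)
```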


