Debian 9; Ceph 12.2.8-bpo90+1 (luminous); no RBD or CephFS, just radosgw; three clusters
in one zonegroup.

Often we find either metadata or data sync falling behind, and it never seems to
recover until we restart the radosgw service on the affected endpoint.

For example, at 15:45:40:

dc11-ceph-rgw1:/var/log/ceph# radosgw-admin sync status
          realm b3e2afe7-2254-494a-9a34-ce50358779fd (savagebucket)
      zonegroup de6af748-1a2f-44a1-9d44-30799cf1313e (us)
           zone 107d29a0-b732-4bf1-a26e-1f64f820e839 (dc11-prod)
  metadata sync syncing
                full sync: 0/64 shards
                incremental sync: 64/64 shards
                metadata is behind on 2 shards
                behind shards: [19,41]
                oldest incremental change not applied: 2019-02-27 14:42:24.0.408263s
      data sync source: 1e27bf9c-3a2f-4845-85b6-33a24bbe1c04 (sv5-corp)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source
                source: 331d3f1e-1b72-4c56-bb5a-d1d0fcf6d0b8 (sv3-prod)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source
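
For reference, this is the sort of thing one can check before restarting (a minimal
sketch using stock radosgw-admin subcommands; the source zone name here is taken from
the status output above):

radosgw-admin metadata sync status
radosgw-admin sync error list
radosgw-admin data sync status --source-zone=sv5-corp

The first shows the per-shard metadata sync markers, the second lists any recorded
sync errors, and the third shows the data sync state against a single source zone.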


So at 15:46:07, I restarted the gateway:

dc11-ceph-rgw1:/var/log/ceph# sudo systemctl restart ceph-radosgw@rgw.dc11-ceph-rgw1.service
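
For anyone following along, an easy way to watch for recovery after the restart
(assuming watch(1) is available on the gateway host):

watch -n 30 radosgw-admin sync status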

By the time I checked again at 15:48:08:

dc11-ceph-rgw1:/var/log/ceph# radosgw-admin sync status
          realm b3e2afe7-2254-494a-9a34-ce50358779fd (savagebucket)
      zonegroup de6af748-1a2f-44a1-9d44-30799cf1313e (us)
           zone 107d29a0-b732-4bf1-a26e-1f64f820e839 (dc11-prod)
  metadata sync syncing
                full sync: 0/64 shards
                incremental sync: 64/64 shards
                metadata is caught up with master
      data sync source: 1e27bf9c-3a2f-4845-85b6-33a24bbe1c04 (sv5-corp)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source
                source: 331d3f1e-1b72-4c56-bb5a-d1d0fcf6d0b8 (sv3-prod)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source


There’s no way this is just “lag.”  Sync is stuck, and this happens frequently, though
perhaps not daily.  Any suggestions?  Our cluster isn’t heavily used yet, but it is
production.