I originally reported the linked issue. I've seen this problem with negative stats on several S3 setups, but I could never figure out how to reproduce it.
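As a side note on why the count looks the way it does: 18446744073709551603, the rgw.none num_objects value quoted below, is exactly 2^64 - 13, i.e. a counter that was decremented 13 times past zero and wrapped around in an unsigned 64-bit field. A minimal check (plain integer arithmetic, nothing Ceph-specific):

```python
# The rgw.none "num_objects" value from the bucket stats quoted below.
reported = 18446744073709551603

# Reinterpret the unsigned 64-bit value as a signed (two's-complement) integer.
signed = reported - 2**64 if reported >= 2**63 else reported

print(signed)  # -13: the object counter underflowed past zero
```

So the "impossible" count is really just a slightly negative stat stored in an unsigned field, which then looks enormous to anything (like the resharder) that reads it at face value.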
But I haven't seen the resharder act on these stats; that seems like a particularly bad case :(

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


On Fri, Nov 22, 2019 at 5:51 PM David Monschein <monsch...@gmail.com> wrote:
>
> Hi all. Running an Object Storage cluster with Ceph Nautilus 14.2.4.
>
> We are running into what appears to be a serious bug affecting our fairly
> new object storage cluster. While investigating some performance issues --
> abnormally high IOPS and extremely slow bucket stat listings (over 3
> minutes) -- we noticed some dynamic bucket resharding jobs running.
> Strangely enough, they were resharding buckets that had very few objects.
> Even more worrying was the number of new shards Ceph was planning: 65521.
>
> [root@os1 ~]# radosgw-admin reshard list
> [
>     {
>         "time": "2019-11-22 00:12:40.192886Z",
>         "tenant": "",
>         "bucket_name": "redacted",
>         "bucket_id": "c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7000639.20",
>         "new_instance_id": "redacted:c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7552496.28",
>         "old_num_shards": 1,
>         "new_num_shards": 65521
>     }
> ]
>
> Upon further inspection we noticed a seemingly impossible number of
> objects (18446744073709551603) in rgw.none for the same bucket:
>
> [root@os1 ~]# radosgw-admin bucket stats --bucket=redacted
> {
>     "bucket": "redacted",
>     "tenant": "",
>     "zonegroup": "dbb69c5b-b33f-4af2-950c-173d695a4d2c",
>     "placement_rule": "default-placement",
>     "explicit_placement": {
>         "data_pool": "",
>         "data_extra_pool": "",
>         "index_pool": ""
>     },
>     "id": "c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7000639.20",
>     "marker": "c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7000639.20",
>     "index_type": "Normal",
>     "owner": "d52cb8cc-1f92-47f5-86bf-fb28bc6b592c",
>     "ver": "0#12623",
>     "master_ver": "0#0",
>     "mtime": "2019-11-22 00:18:41.753188Z",
>     "max_marker": "0#",
>     "usage": {
>         "rgw.none": {
>             "size": 0,
>             "size_actual": 0,
>             "size_utilized": 0,
>             "size_kb": 0,
>             "size_kb_actual": 0,
>             "size_kb_utilized": 0,
>             "num_objects": 18446744073709551603
>         },
>         "rgw.main": {
>             "size": 63410030,
>             "size_actual": 63516672,
>             "size_utilized": 63410030,
>             "size_kb": 61924,
>             "size_kb_actual": 62028,
>             "size_kb_utilized": 61924,
>             "num_objects": 27
>         },
>         "rgw.multimeta": {
>             "size": 0,
>             "size_actual": 0,
>             "size_utilized": 0,
>             "size_kb": 0,
>             "size_kb_actual": 0,
>             "size_kb_utilized": 0,
>             "num_objects": 0
>         }
>     },
>     "bucket_quota": {
>         "enabled": false,
>         "check_on_raw": false,
>         "max_size": -1,
>         "max_size_kb": 0,
>         "max_objects": -1
>     }
> }
>
> It would seem that this unreal object count in rgw.none is driving the
> resharding process, making Ceph want to reshard the bucket into 65521
> shards. I am assuming 65521 is the limit.
>
> I have found only a couple of references to this issue, none of which had
> a resolution or much of a conversation around them:
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-October/030791.html
> https://tracker.ceph.com/issues/37942
>
> For now we are cancelling these resharding jobs, since they seem to be
> causing performance issues with the cluster, but that is an untenable
> workaround. Does anyone know what is causing this, or how to prevent or
> fix it?
>
> Thanks,
> Dave Monschein
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com