On Tue, Dec 3, 2019 at 6:43 PM Robert LeBlanc <rob...@leblancnet.us> wrote:
>
> On Tue, Dec 3, 2019 at 9:11 AM Ed Fisher <e...@debacle.org> wrote:
>>
>>
>>
>> On Dec 3, 2019, at 10:28 AM, Robert LeBlanc <rob...@leblancnet.us> wrote:
>>
>> Did you make progress on this? We have a ton of < 64K objects as well and 
>> are struggling to get good performance out of our RGW. Sometimes we have RGW 
>> instances that are just gobbling up CPU even when there are no requests to 
>> them, so it seems like things are getting hung up somewhere. There is 
>> nothing in the logs and I haven't had time to do more troubleshooting.
>>
>>
>> There's a bug in the current stable Nautilus release that causes a loop 
>> and/or crash in get_obj_data::flush (you should be able to see it gobbling 
>> up CPU in perf top). This is the related issue: 
>> https://tracker.ceph.com/issues/39660 -- it should be fixed as soon as 
>> 14.2.5 is released (any day now, supposedly).
>
>
> We will try out the new version when it's released and see if it improves 
> things for us.

I can confirm that what you are describing sounds like the issue
linked above. The issue mainly talks about crashes, but that's the
"good" version of this bug. The bad version just hangs the thread in
an infinite loop. I once debugged this in more detail; the locks added
in the linked pull request fixed it.


Paul

>
> Thanks,
> Robert LeBlanc
>
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
