Re: [ceph-users] radosgw pegging down 5 CPU cores when no data is being transferred

Vladimir Brik Wed, 21 Aug 2019 13:47:43 -0700

> Are you running multisite?
No

> Do you have dynamic bucket resharding turned on?
Yes. "radosgw-admin reshard list" prints "[]"


> Are you using lifecycle?
I am not sure. How can I check? "radosgw-admin lc list" says "[]"

> And just to be clear -- sometimes all 3 of your rados gateways are
> simultaneously in this state?

Multiple, but I have not seen all 3 being in this state simultaneously. Currently one gateway has 1 thread using 100% of CPU, and another has 5 threads each using 100% CPU.
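
For reference, one way to see which threads inside a gateway process are the busy ones is to list per-thread CPU usage. This is only a sketch, not something taken from this thread: the function name hot_threads is hypothetical, and it assumes a procps-style ps. Note that the pcpu column from ps is averaged over the thread's lifetime, so a thread that only recently started spinning may rank lower than expected; "top -H -p <pid>" shows current usage interactively.

```shell
# Hypothetical helper (not from the thread): print the busiest threads of a
# process, one row per thread, sorted by CPU usage in descending order.
hot_threads() {
    pid="$1"
    limit="${2:-10}"   # number of rows to print, default 10
    # -L lists threads; the empty headers (tid=, pcpu=, comm=) suppress the header row.
    # pcpu here is CPU time divided by elapsed time over the thread's lifetime.
    ps -L -o tid=,pcpu=,comm= -p "$pid" | sort -k2,2 -nr | head -n "$limit"
}
```

For example, "hot_threads 73688 5" would print up to five rows for the radosgw process profiled below. The thread IDs can then be matched against the samples in a perf or gdbpmp capture.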

Here are the fruits of my attempts to capture the call graph using perf and gdbpmp:

https://icecube.wisc.edu/~vbrik/perf.data
https://icecube.wisc.edu/~vbrik/gdbpmp.data

These are the commands that I ran and their outputs (note I couldn't get perf not to generate the warning):

rgw-3 gdbpmp # ./gdbpmp.py -n 100 -p 73688 -o gdbpmp.data
Attaching to process 73688...Done.

GatheringSamples....................................................................................................

Profiling complete with 100 samples.

rgw-3 ~ # perf record --call-graph fp -p 73688 -- sleep 10
[ perf record: Woken up 54 times to write data ]
Warning:
Processed 574207 events and lost 4 chunks!
Check IO/CPU overload!
[ perf record: Captured and wrote 58.866 MB perf.data (233750 samples) ]
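
Regarding the "lost chunks" warning above: it usually means perf's ring buffer filled faster than perf could write it out. Two knobs that may help are a lower sampling frequency (-F) and a larger buffer (-m, counted in pages). The sketch below only builds the command line so it can be reviewed before running. The function name perf_record_cmd and the default values 99 and 1024 are illustrative assumptions, not settings anyone in this thread tested.

```shell
# Hypothetical wrapper (not from the thread): print a perf record command that
# uses a lower sampling rate and a larger ring buffer than the run above.
perf_record_cmd() {
    pid="$1"
    freq="${2:-99}"          # samples per second (assumed value)
    mmap_pages="${3:-1024}"  # ring buffer size in pages, a power of two (assumed value)
    echo "perf record -F $freq -m $mmap_pages --call-graph fp -p $pid -- sleep 10"
}

# Print the command for the radosgw PID used in this message.
perf_record_cmd 73688
```

Whether this silences the warning depends on how busy the host is. A few lost chunks out of several hundred thousand events are unlikely to change which functions dominate the profile.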





Vlad



On 8/21/19 11:16 AM, J. Eric Ivancich wrote:

On 8/21/19 10:22 AM, Mark Nelson wrote:

Hi Vladimir,


On 8/21/19 8:54 AM, Vladimir Brik wrote:

Hello


[much elided]

You might want to try grabbing a call graph from perf instead of just
running perf top or using my wallclock profiler to see if you can drill
down and find out where in that method it's spending the most time.


I agree with Mark -- a call graph would be very helpful in tracking down
what's happening.

There are background tasks that run. Are you running multisite? Do you
have dynamic bucket resharding turned on? Are you using lifecycle? And
garbage collection is another background task.

And just to be clear -- sometimes all 3 of your rados gateways are
simultaneously in this state?

But the call graph would be incredibly helpful.

Thank you,

Eric
