On 14/09/2017 11:19 PM, Andi Kleen wrote:
Tariq Toukan <tar...@mellanox.com> writes:
Congestion in this case is very clear.
When monitored in perf top:
85.58% [kernel] [k] queued_spin_lock_slowpath
Please look at the callers. Spinlock profiles without callers
are usually useless because it's just blaming the messenger.
Most likely the PCP lists are too small for your extreme allocation
rate, so it goes back too often to the shared pool.
You can play with the vm.percpu_pagelist_fraction setting.
Thanks Andi.
That was my initial guess, but I wasn't familiar enough with these VM
tunables to verify it.
Indeed, the bottleneck is relieved when the PCP size is increased, and BW
becomes significantly better.
-Andi
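
For anyone who wants to try the vm.percpu_pagelist_fraction setting mentioned above, here is a minimal Python sketch, not taken from the thread. The helper names (read_fraction, write_fraction, pcp_high_pages) are hypothetical. It assumes the sysctl is exposed at /proc/sys/vm/percpu_pagelist_fraction, that 0 means the kernel default and 8 is the smallest accepted non-zero value (as the kernel sysctl documentation describes), and that the per-CPU list limit is roughly the zone's page count divided by the fraction. Newer kernels may no longer provide this knob, so check that the file exists first.

```python
from pathlib import Path

# Assumed location of the sysctl; it may be absent on newer kernels.
SYSCTL = Path("/proc/sys/vm/percpu_pagelist_fraction")

# Assumption from the kernel sysctl docs: smallest accepted non-zero value.
MIN_FRACTION = 8


def read_fraction(path=SYSCTL):
    """Return the current setting; 0 means the kernel's default PCP sizing."""
    return int(Path(path).read_text().strip())


def write_fraction(value, path=SYSCTL):
    """Write a new setting (needs root when pointed at the real sysctl)."""
    if value != 0 and value < MIN_FRACTION:
        raise ValueError(f"fraction must be 0 or >= {MIN_FRACTION}")
    Path(path).write_text(f"{value}\n")


def pcp_high_pages(zone_pages, fraction):
    """Rough upper bound on pages held by one per-CPU list in one zone.

    Simplified model: the limit is the zone's pages divided by the
    fraction, so a smaller fraction means a larger PCP list.
    """
    if fraction <= 0:
        raise ValueError("fraction must be positive for this estimate")
    return zone_pages // fraction


if __name__ == "__main__":
    if SYSCTL.exists():
        print("current fraction:", read_fraction())
    else:
        print("sysctl not present on this kernel")
```

Lowering the fraction (for example write_fraction(8) as root) makes each per-CPU list larger, so fewer allocations should fall back to the shared, lock-protected zone free lists. Writing 0 should restore the default sizing.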