On 06/16/2016 08:14 AM, Lukasz Anaczkowski wrote:
> For reclaim this brings the performance back to before Mel's
> flushing changes, but for unmap it disables batching.

This turns out to be pretty catastrophic for unmap.  In a workload that
uses, say, 200 hardware threads and allocates/frees a few MB/sec, this
ends up costing hundreds of thousands of extra received IPIs.  10MB=~2500
ptes, and with 200 threads, that's 250,000 IPIs received just to free
10MB of memory.
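
For anyone who wants to play with the numbers, here is a rough
back-of-envelope model.  It is only a sketch under stated assumptions:
4KB pages, one TLB flush per batch of ptes, and every flush delivering
one IPI to every other hardware thread.  The function name and its
parameters are made up for illustration and do not correspond to
anything in the kernel.  Treat the output as an order-of-magnitude
upper bound, not a reproduction of the figure above.

```python
def estimate_received_ipis(freed_bytes, page_size=4096,
                           hw_threads=200, batch_size=1):
    """Estimate IPIs received across the machine while unmapping.

    Assumptions (hypothetical model, not kernel behaviour):
      - one pte per page of freed memory
      - one flush per `batch_size` ptes (batch_size=1 means no batching)
      - each flush sends one IPI to every *other* hardware thread
    """
    ptes = freed_bytes // page_size
    flushes = -(-ptes // batch_size)      # ceiling division
    remote_threads = hw_threads - 1       # the unmapping thread flushes locally
    return flushes * remote_threads


if __name__ == "__main__":
    ten_mb = 10 * 1024 * 1024
    # No batching: one flush per pte.
    print(estimate_received_ipis(ten_mb))
    # Fully batched: one flush for the whole range.
    print(estimate_received_ipis(ten_mb, batch_size=ten_mb // 4096))
```

The gap between the two calls is the point: under this model the cost
without batching grows with the number of ptes times the number of
threads, while with batching it grows only with the number of threads.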

The initial testing we did on this was on a *bunch* of threads all doing
alloc/free.  But that workload is bottlenecked on other things, like
mmap_sem being held for write, so it hides the extra IPI cost.

The scenario that we really needed to test here was on lots of threads
doing processing and 1 thread doing alloc/free.
