On Thu, Mar 23, 2017 at 02:43:47PM +0100, Jesper Dangaard Brouer wrote:
> On Wed, 22 Mar 2017 23:40:04 +0000
> Mel Gorman <mgor...@techsingularity.net> wrote:
>
> > On Wed, Mar 22, 2017 at 07:39:17PM +0200, Tariq Toukan wrote:
> > > > > > This modification may slow allocations from IRQ context slightly
> > > > > > but the main gain from the per-cpu allocator is that it scales
> > > > > > better for allocations from multiple contexts. There is an
> > > > > > implicit assumption that intensive allocations from IRQ contexts
> > > > > > on multiple CPUs from a single NUMA node are rare
> > > Hi Mel, Jesper, and all.
> > >
> > > This assumption contradicts regular multi-stream traffic that is
> > > naturally handled over close numa cores. I compared iperf TCP
> > > multistream (8 streams) over CX4 (mlx5 driver) with kernels v4.10
> > > (before this series) vs kernel v4.11-rc1 (with this series).
> > > I disabled the page-cache (recycle) mechanism to stress the page
> > > allocator, and see a drastic degradation in BW, from 47.5 G in v4.10
> > > to 31.4 G in v4.11-rc1 (34% drop).
> > > I noticed queued_spin_lock_slowpath occupies 62.87% of CPU time.
> >
> > Can you get the stack trace for the spin lock slowpath to confirm it's
> > from IRQ context?
>
> AFAIK allocations happen in softirq. Argh and during review I missed
> that in_interrupt() also covers softirq. To Mel, can we use a in_irq()
> check instead?
>
> (p.s. just landed and got home)

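For anyone following along, here is a rough userspace sketch of why the
two checks differ. It assumes the preempt_count layout as I understand it
for kernels of this era (softirq count in bits 8-15, hardirq count in
bits 16-19, NMI in bit 20). The model_* and MODEL_* names are made up for
illustration and are not kernel API; the real macros live in
include/linux/preempt.h and may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simplified model of the preempt_count bit fields. The exact layout is
 * an assumption for illustration, not copied from the kernel headers.
 */
#define MODEL_SOFTIRQ_MASK	0x0000ff00u	/* softirq nesting / BH disabled */
#define MODEL_HARDIRQ_MASK	0x000f0000u	/* hard interrupt nesting */
#define MODEL_NMI_MASK		0x00100000u	/* inside an NMI handler */

/* One level of nesting in each field. */
#define MODEL_SOFTIRQ_OFFSET	0x00000100u
#define MODEL_HARDIRQ_OFFSET	0x00010000u

/* Models in_irq(): true only while a hard interrupt is being handled. */
static inline bool model_in_irq(unsigned int count)
{
	return (count & MODEL_HARDIRQ_MASK) != 0;
}

/*
 * Models in_interrupt(): true for hardirq, softirq (including sections
 * with bottom halves disabled) and NMI.
 */
static inline bool model_in_interrupt(unsigned int count)
{
	return (count & (MODEL_HARDIRQ_MASK | MODEL_SOFTIRQ_MASK |
			 MODEL_NMI_MASK)) != 0;
}
```

With a count of MODEL_SOFTIRQ_OFFSET (a softirq handler such as network
RX processing), model_in_interrupt() is true while model_in_irq() is
false. Under that model, an in_interrupt() test sends softirq allocations
around the per-cpu lists, whereas an in_irq() test would not.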
The patch below is not built or even boot tested. I'm unable to run tests
at the moment.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6cbde310abed..f82225725bc1 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2481,7 +2481,7 @@ void free_hot_cold_page(struct page *page, bool cold)
 	unsigned long pfn = page_to_pfn(page);
 	int migratetype;
 
-	if (in_interrupt()) {
+	if (in_irq()) {
 		__free_pages_ok(page, 0);
 		return;
 	}
@@ -2647,7 +2647,7 @@ static struct page *__rmqueue_pcplist(struct zone *zone, int migratetype,
 {
 	struct page *page;
 
-	VM_BUG_ON(in_interrupt());
+	VM_BUG_ON(in_irq());
 
 	do {
 		if (list_empty(list)) {
@@ -2704,7 +2704,7 @@ struct page *rmqueue(struct zone *preferred_zone,
 	unsigned long flags;
 	struct page *page;
 
-	if (likely(order == 0) && !in_interrupt()) {
+	if (likely(order == 0) && !in_irq()) {
 		page = rmqueue_pcplist(preferred_zone, zone, order,
 				gfp_flags, migratetype);
 		goto out;

-- 
Mel Gorman
SUSE Labs