When a page is freed back to the global pool, its buddy is checked to
see whether the two can be merged. This requires accessing the buddy's
page structure, and that access can take a long time if the structure
is cache cold.

This patch adds a prefetch of the to-be-freed page's buddy outside of
zone->lock, in the hope that accessing the buddy's page structure later
under zone->lock will be faster.
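
For illustration only, the buddy lookup and prefetch can be sketched in
userspace C as below. This is not the kernel code: struct page_sketch,
find_buddy_pfn_sketch() and prefetch_buddy_sketch() are hypothetical
names, __builtin_prefetch() (a GCC/Clang builtin) stands in for the
kernel's prefetch(), and the page descriptors are assumed to sit in one
flat array indexed by pfn.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; the real layout differs. */
struct page_sketch {
	unsigned long flags;
	unsigned long private_data;
};

/*
 * Flipping bit 'order' of the pfn yields the other half of the buddy
 * pair, which is the XOR trick __find_buddy_pfn() relies on.
 */
static unsigned long find_buddy_pfn_sketch(unsigned long pfn,
					   unsigned int order)
{
	return pfn ^ (1UL << order);
}

/*
 * Find the order-0 buddy's descriptor by offsetting from the page's
 * own descriptor, then hint the CPU to start loading it. Assumes both
 * descriptors live in the same contiguous array.
 */
static struct page_sketch *prefetch_buddy_sketch(struct page_sketch *page,
						 unsigned long pfn)
{
	unsigned long buddy_pfn = find_buddy_pfn_sketch(pfn, 0);
	struct page_sketch *buddy;

	assert(page != NULL);
	/* The signed difference is +1 or -1 for an order-0 buddy. */
	buddy = page + ((long)buddy_pfn - (long)pfn);

	__builtin_prefetch(buddy);	/* a hint only; never faults */
	return buddy;
}
```

The prefetch is only a hint, so the sketch changes no state; any gain
depends on the cache line still being resident when it is read later.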

Tested with will-it-scale/page_fault1 at full load:

kernel      Broadwell(2S)    Skylake(2S)      Broadwell(4S)    Skylake(4S)
v4.15-rc4    9037332          8000124         13642741         15728686
patch1/2     9608786 +6.3%    8368915 +4.6%   14042169 +2.9%   17433559 +10.8%
this patch  10462292 +8.9%    8602889 +2.8%   14802073 +5.4%   17624575 +1.1%

Note: the percentages for patch1/2 are relative to v4.15-rc4; those for
this patch are relative to patch1/2.

Suggested-by: Ying Huang <ying.hu...@intel.com>
Signed-off-by: Aaron Lu <aaron...@intel.com>
---
 mm/page_alloc.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a076f754dac1..9ef084d41708 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1140,6 +1140,9 @@ static void free_pcppages_bulk(struct zone *zone, int count,
                        batch_free = count;
 
                do {
+                       unsigned long pfn, buddy_pfn;
+                       struct page *buddy;
+
                        page = list_last_entry(list, struct page, lru);
                        /* must delete as __free_one_page list manipulates */
                        list_del(&page->lru);
@@ -1148,6 +1151,16 @@ static void free_pcppages_bulk(struct zone *zone, int count,
                                continue;
 
                        list_add_tail(&page->lru, &head);
+
+                       /*
+                        * We are going to put the page back to
+                        * the global pool, prefetch its buddy to
+                        * speed up later access under zone->lock.
+                        */
+                       pfn = page_to_pfn(page);
+                       buddy_pfn = __find_buddy_pfn(pfn, 0);
+                       buddy = page + (buddy_pfn - pfn);
+                       prefetch(buddy);
                } while (--count && --batch_free && !list_empty(list));
        }
 
-- 
2.14.3
