There is a race between dissolve_free_huge_page() and put_page(), and the race window is quite small. Theoretically, we should return -EBUSY when we encounter this race, but because the window is so small, a retry gives us a good chance of dissolving the page successfully. Seizing this opportunity is an optimization that increases the success rate of dissolving the page.
If we free a HugeTLB page from a non-task context, the free is deferred
through a workqueue. In that case, we need to flush the work before
retrying. dissolve_free_huge_page() can be called from memory hotplug;
the caller aims to free the HugeTLB page to the buddy allocator so that
it can unplug the page successfully.

Signed-off-by: Muchun Song <songmuc...@bytedance.com>
---
 mm/hugetlb.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 4a9011e12175..a176ceed55f1 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1763,10 +1763,11 @@ static int free_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed,
  * nothing for in-use hugepages and non-hugepages.
  * This function returns values like below:
  *
- * -EBUSY: failed to dissolved free hugepages or the hugepage is in-use
- *	   (allocated or reserved.)
- * 0:	   successfully dissolved free hugepages or the page is not a
- *	   hugepage (considered as already dissolved)
+ * -EAGAIN: race with __free_huge_page() and can do a retry
+ * -EBUSY:  failed to dissolved free hugepages or the hugepage is in-use
+ *	    (allocated or reserved.)
+ * 0:	    successfully dissolved free hugepages or the page is not a
+ *	    hugepage (considered as already dissolved)
  */
 int dissolve_free_huge_page(struct page *page)
 {
@@ -1793,8 +1794,10 @@ int dissolve_free_huge_page(struct page *page)
 	 * We should make sure that the page is already on the free list
 	 * when it is dissolved.
 	 */
-	if (unlikely(!PageHugeFreed(head)))
+	if (unlikely(!PageHugeFreed(head))) {
+		rc = -EAGAIN;
 		goto out;
+	}
 
 	/*
 	 * Move PageHWPoison flag from head page to the raw error page,
@@ -1813,6 +1816,14 @@ int dissolve_free_huge_page(struct page *page)
 	}
 out:
 	spin_unlock(&hugetlb_lock);
+
+	/*
+	 * If the freeing of the HugeTLB page is put on a work queue, we should
+	 * flush the work before retrying.
+	 */
+	if (unlikely(rc == -EAGAIN))
+		flush_work(&free_hpage_work);
+
 	return rc;
 }
 
@@ -1835,7 +1846,12 @@ int dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn)
 
 	for (pfn = start_pfn; pfn < end_pfn; pfn += 1 << minimum_order) {
 		page = pfn_to_page(pfn);
+retry:
 		rc = dissolve_free_huge_page(page);
+		if (rc == -EAGAIN) {
+			cpu_relax();
+			goto retry;
+		}
 		if (rc)
 			break;
 	}
-- 
2.11.0