On 4/2/21 5:47 AM, Muchun Song wrote:
> On Wed, Mar 31, 2021 at 11:42 AM Mike Kravetz <mike.krav...@oracle.com> wrote:
>>
>> Commit c77c0a8ac4c5 ("mm/hugetlb: defer freeing of huge pages if in
>> non-task context") was added to address the issue of free_huge_page
>> being called from irq context. That commit hands off free_huge_page
>> processing to a workqueue if !in_task. However, this doesn't cover
>> all the cases, as pointed out by the 0day bot lockdep report [1].
>>
>> :  Possible interrupt unsafe locking scenario:
>> :
>> :        CPU0                    CPU1
>> :        ----                    ----
>> :   lock(hugetlb_lock);
>> :                                local_irq_disable();
>> :                                lock(slock-AF_INET);
>> :                                lock(hugetlb_lock);
>> :   <Interrupt>
>> :     lock(slock-AF_INET);
>>
>> Shakeel later explained that this is very likely the TCP TX zerocopy
>> from hugetlb pages scenario, where the networking code drops the last
>> reference to a hugetlb page while IRQs are disabled. The hugetlb
>> freeing path doesn't disable IRQs while holding hugetlb_lock, so a
>> lock dependency chain can lead to a deadlock.
>>
>> This commit addresses the issue by doing the following:
>> - Make hugetlb_lock irq safe. This is mostly a simple process of
>>   changing spin_*lock calls to spin_*lock_irq* calls.
>> - Make the subpool lock irq safe in a similar manner.
>> - Revert the !in_task check and workqueue handoff.
>>
>> [1] https://lore.kernel.org/linux-mm/000000000000f1c03b05bc43a...@google.com/
>>
>> Signed-off-by: Mike Kravetz <mike.krav...@oracle.com>
>> Acked-by: Michal Hocko <mho...@suse.com>
>> Reviewed-by: Muchun Song <songmuc...@bytedance.com>
>
> Hi Mike,
>
> Today I pulled the newest code (next-20210401) and found that
> alloc_and_dissolve_huge_page is not updated: in this function,
> hugetlb_lock is still not irq safe. Maybe you or Oscar need to
> fix it.
>
> Thanks.
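For readers less familiar with the locking interfaces involved, the
"make hugetlb_lock irq safe" conversion described in the quoted commit
message boils down to the pattern below. This is a minimal illustrative
sketch with placeholder critical sections, not the actual hugetlb.c
hunks:

	/*
	 * Unsafe pattern: hugetlb_lock is held with IRQs enabled, so an
	 * interrupt taken here can join a lock dependency chain (via
	 * slock-AF_INET in the report above) that deadlocks against a
	 * CPU taking hugetlb_lock with IRQs already disabled.
	 */
	spin_lock(&hugetlb_lock);
	/* ... hstate accounting ... */
	spin_unlock(&hugetlb_lock);

	/*
	 * Safe pattern: local interrupts are disabled for the whole
	 * critical section, so no interrupt can arrive while the lock
	 * is held.
	 */
	spin_lock_irq(&hugetlb_lock);
	/* ... hstate accounting ... */
	spin_unlock_irq(&hugetlb_lock);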
Thank you Muchun,

Oscar's changes were not in Andrew's tree when I started on this
series, and I failed to notice their inclusion. In addition,
isolate_or_dissolve_huge_page also needs updating, as does a change in
set_max_huge_pages that was omitted while rebasing.

Andrew, the following patch addresses those missing changes. Ideally,
the changes should be combined/included in this patch. If you want me
to send another version of this patch or another series, let me know.

From 450593eb3cea895f499ddc343c22424c552ea502 Mon Sep 17 00:00:00 2001
From: Mike Kravetz <mike.krav...@oracle.com>
Date: Fri, 2 Apr 2021 13:18:13 -0700
Subject: [PATCH] hugetlb: fix irq locking omissions

The patch "hugetlb: make free_huge_page irq safe" changed spin_*lock
calls to spin_*lock_irq* calls. However, it missed several places in
the file hugetlb.c. Add the overlooked changes.

Signed-off-by: Mike Kravetz <mike.krav...@oracle.com>
---
 mm/hugetlb.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c22111f3da20..a6bfc6bcbc81 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2284,7 +2284,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
 	 */
 	page_ref_dec(new_page);
 retry:
-	spin_lock(&hugetlb_lock);
+	spin_lock_irq(&hugetlb_lock);
 	if (!PageHuge(old_page)) {
 		/*
 		 * Freed from under us. Drop new_page too.
@@ -2297,7 +2297,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
 		 * Fail with -EBUSY if not possible.
 		 */
 		update_and_free_page(h, new_page);
-		spin_unlock(&hugetlb_lock);
+		spin_unlock_irq(&hugetlb_lock);
 		if (!isolate_huge_page(old_page, list))
 			ret = -EBUSY;
 		return ret;
@@ -2307,7 +2307,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
 		 * freelist yet. Race window is small, so we can succed here if
 		 * we retry.
 		 */
-		spin_unlock(&hugetlb_lock);
+		spin_unlock_irq(&hugetlb_lock);
 		cond_resched();
 		goto retry;
 	} else {
@@ -2323,7 +2323,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
 		__enqueue_huge_page(&h->hugepage_freelists[nid], new_page);
 	}
 unlock:
-	spin_unlock(&hugetlb_lock);
+	spin_unlock_irq(&hugetlb_lock);

 	return ret;
 }
@@ -2339,15 +2339,15 @@ int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
 	 * to carefully check the state under the lock.
 	 * Return success when racing as if we dissolved the page ourselves.
 	 */
-	spin_lock(&hugetlb_lock);
+	spin_lock_irq(&hugetlb_lock);
 	if (PageHuge(page)) {
 		head = compound_head(page);
 		h = page_hstate(head);
 	} else {
-		spin_unlock(&hugetlb_lock);
+		spin_unlock_irq(&hugetlb_lock);
 		return 0;
 	}
-	spin_unlock(&hugetlb_lock);
+	spin_unlock_irq(&hugetlb_lock);

 	/*
 	 * Fence off gigantic pages as there is a cyclic dependency between
@@ -2737,7 +2737,7 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 	 * pages in hstate via the proc/sysfs interfaces.
 	 */
 	mutex_lock(&h->resize_lock);
-	spin_lock(&hugetlb_lock);
+	spin_lock_irq(&hugetlb_lock);

 	/*
 	 * Check for a node specific request.
--
2.30.2
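One detail worth spelling out: spin_lock_irq()/spin_unlock_irq()
unconditionally re-enable interrupts on unlock, so they are only
correct in contexts known to have IRQs enabled. A path that can be
entered with IRQs already disabled, such as the last page reference
being dropped from the networking code as described above, has to use
the save/restore variants instead. A minimal sketch of that shape,
with a placeholder body rather than the real free_huge_page logic:

	static void free_huge_page(struct page *page)
	{
		struct hstate *h = page_hstate(page);
		unsigned long flags;

		/*
		 * The caller may already hold IRQ-disabled locks (the
		 * TCP TX zerocopy case), so save and restore the IRQ
		 * state instead of unconditionally re-enabling
		 * interrupts on unlock.
		 */
		spin_lock_irqsave(&hugetlb_lock, flags);
		h->free_huge_pages++;	/* placeholder accounting */
		spin_unlock_irqrestore(&hugetlb_lock, flags);
	}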