On Tue, 9 Feb 2021, Naoya Horiguchi wrote:

> From: Naoya Horiguchi <naoya.horigu...@nec.com>
> 
> Currently hwpoison code checks PageAnon() for thp and refuses to handle
> errors on non-anonymous thps (just for historical reason).  We now
> support non-anonymou thp like shmem one, so this patch suggests to enable
> to handle shmem thps. Fortunately, we already have can_split_huge_page()
> to check if a give thp is splittable, so this patch relies on it.

Fortunately? I don't understand. Why call can_split_huge_page()
at all, instead of simply trying split_huge_page() directly?
And could it do better than -EBUSY when split_huge_page() fails?

> 
> Signed-off-by: Naoya Horiguchi <naoya.horigu...@nec.com>

Thanks for trying to add shmem+file THP support, but I think this
does not work as intended - Andrew, if Naoya agrees, please drop from
mmotm for now, the fixup needed will be more than a line or two.

I'm not much into memory-failure myself, but Jue discovered that the
SIGBUS never arrives: because split_huge_page() on a shmem or file
THP unmaps all its pmds and ptes, and (unlike with anon) leaves them
unmapped - in normal circumstances, to be faulted back on demand.
So the page_mapped() check in hwpoison_user_mappings() fails,
and the intended SIGBUS is not delivered.

(Or, is it acceptable that the SIGBUS is not delivered to those who
have the huge page mapped: should it get delivered later, to anyone
who faults back in the bad 4k?)

We believe the tokill list has to be set up earlier, before
split_huge_page() is called, then passed in to hwpoison_user_mappings().

Sorry, we don't have a proper patch for that right now, but I expect
you can see what needs to be done.  But something we found on the way,
we do have a patch for: add_to_kill() uses page_address_in_vma(), but
that has not been used on file THP tails before - fix appended at the
end below, so as not to waste your time on that bit.

> ---
>  mm/memory-failure.c | 34 +++++++++-------------------------
>  1 file changed, 9 insertions(+), 25 deletions(-)
> 
> diff --git v5.11-rc6/mm/memory-failure.c v5.11-rc6_patched/mm/memory-failure.c
> index e9481632fcd1..8c1575de0507 100644
> --- v5.11-rc6/mm/memory-failure.c
> +++ v5.11-rc6_patched/mm/memory-failure.c
> @@ -956,20 +956,6 @@ static int __get_hwpoison_page(struct page *page)
>  {
>       struct page *head = compound_head(page);
>  
> -     if (!PageHuge(head) && PageTransHuge(head)) {
> -             /*
> -              * Non anonymous thp exists only in allocation/free time. We
> -              * can't handle such a case correctly, so let's give it up.
> -              * This should be better than triggering BUG_ON when kernel
> -              * tries to touch the "partially handled" page.
> -              */
> -             if (!PageAnon(head)) {
> -                     pr_err("Memory failure: %#lx: non anonymous thp\n",
> -                             page_to_pfn(page));
> -                     return 0;
> -             }
> -     }
> -
>       if (get_page_unless_zero(head)) {
>               if (head == compound_head(page))
>                       return 1;
> @@ -1197,21 +1183,19 @@ static int identify_page_state(unsigned long pfn, 
> struct page *p,
>  
>  static int try_to_split_thp_page(struct page *page, const char *msg)
>  {
> -     lock_page(page);
> -     if (!PageAnon(page) || unlikely(split_huge_page(page))) {
> -             unsigned long pfn = page_to_pfn(page);
> +     struct page *head;
>  
> +     lock_page(page);
> +     head = compound_head(page);
> +     if (PageTransHuge(head) && can_split_huge_page(head, NULL) &&
> +         !split_huge_page(page)) {
>               unlock_page(page);
> -             if (!PageAnon(page))
> -                     pr_info("%s: %#lx: non anonymous thp\n", msg, pfn);
> -             else
> -                     pr_info("%s: %#lx: thp split failed\n", msg, pfn);
> -             put_page(page);
> -             return -EBUSY;
> +             return 0;
>       }
> +     pr_info("%s: %#lx: thp split failed\n", msg, page_to_pfn(page));
>       unlock_page(page);
> -
> -     return 0;
> +     put_page(page);
> +     return -EBUSY;
>  }
>  
>  static int memory_failure_hugetlb(unsigned long pfn, int flags)
> -- 
> 2.25.1

[PATCH] mm: fix page_address_in_vma() on file THP tails
From: Jue Wang <j...@google.com>

Anon THP tails were already supported, but memory-failure now needs to use
page_address_in_vma() on file THP tails, which its page->mapping check did
not permit: fix it.

Signed-off-by: Jue Wang <j...@google.com>
Signed-off-by: Hugh Dickins <hu...@google.com>
---

 mm/rmap.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- 5.12-rc2/mm/rmap.c  2021-02-28 16:58:57.950450151 -0800
+++ linux/mm/rmap.c     2021-03-10 20:29:21.591475177 -0800
@@ -717,11 +717,11 @@ unsigned long page_address_in_vma(struct
                if (!vma->anon_vma || !page__anon_vma ||
                    vma->anon_vma->root != page__anon_vma->root)
                        return -EFAULT;
-       } else if (page->mapping) {
-               if (!vma->vm_file || vma->vm_file->f_mapping != page->mapping)
-                       return -EFAULT;
-       } else
+       } else if (!vma->vm_file) {
+               return -EFAULT;
+       } else if (vma->vm_file->f_mapping != compound_head(page)->mapping) {
                return -EFAULT;
+       }
        address = __vma_address(page, vma);
        if (unlikely(address < vma->vm_start || address >= vma->vm_end))
                return -EFAULT;

Reply via email to