On Fri, Sep 25, 2020 at 11:21:58AM +0800, Huang, Ying wrote:
> Rafael Aquini <aqu...@redhat.com> writes:
> >> Or, could you help run the test with a debug kernel based on the upstream
> >> kernel?  I can provide a debug patch.
> >> 
> >
> > Sure, I can set your patches to run with the test cases we have that tend
> > to reproduce the issue with some degree of success.
> 
> Thanks!
> 
> I found a race condition.  During THP splitting, "head" may be unlocked
> before calling split_swap_cluster(), because head != page during
> deferred splitting.  So we should call split_swap_cluster() before
> unlocking.  The debug patch to do that is as below.  Can you help to
> test it?
>


I finally managed to grab a good crashdump and confirm that head really is
not locked. I still need to dig into it to figure out more about the crash.
I guess your patch will guarantee the lock on head, but it still doesn't
help explain how we got the THP marked PG_swapcache in the first place,
given that it should fail at add_to_swap()->get_swap_page(), right?
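
To spell out the path I have in mind (paraphrased from my reading of the
upstream mm/swap_state.c and mm/swap_slots.c; the details vary by kernel
version, so treat this as a sketch rather than a verbatim quote):

  /* Sketch of the swap entry allocation path, not literal kernel code. */
  swp_entry_t get_swap_page(struct page *page)
  {
          swp_entry_t entry;

          entry.val = 0;
          if (PageTransHuge(page)) {
                  /* Only a CONFIG_THP_SWAP kernel hands out a full cluster. */
                  if (IS_ENABLED(CONFIG_THP_SWAP))
                          get_swap_pages(1, &entry, HPAGE_PMD_NR);
                  goto out;
          }
          /* ... order-0 path via the per-cpu swap slots cache ... */
  out:
          return entry;
  }

  int add_to_swap(struct page *page)
  {
          swp_entry_t entry = get_swap_page(page);

          if (!entry.val)
                  return 0;       /* bail out before touching the swap cache */
          /* ... add_to_swap_cache() is what ends up setting PG_swapcache ... */
          return 1;
  }

So if get_swap_page() had failed for that THP as I would expect, add_to_swap()
should have returned before anything could mark the page PG_swapcache.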

I'll give your patch a run over the weekend; hopefully we'll have more
info on this next week.
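
For anyone else following the thread, this is how I am reading the window
you describe, simplified from the tail of __split_huge_page() in
mm/huge_memory.c (a sketch of the ordering, not the literal code):

  static void __split_huge_page(struct page *page, struct list_head *list,
                                pgoff_t end, unsigned long flags)
  {
          /* With a deferred split, "page" can be a tail, so head != page. */
          struct page *head = compound_head(page);
          int i;

          /* ... tear the compound page apart ... */
          remap_page(head);

          for (i = 0; i < HPAGE_PMD_NR; i++) {
                  struct page *subpage = head + i;

                  if (subpage == page)
                          continue;
                  /* When head != page, this is where head gets unlocked. */
                  unlock_page(subpage);
                  put_page(subpage);
          }
  }

With the current placement, split_huge_page_to_list() only checks
PageSwapCache(head) and calls split_swap_cluster() after __split_huge_page()
returns, i.e. after head may already have been unlocked (and possibly freed
and reused).  Moving the call above that unlock loop, as your patch does,
means it always runs with head still locked.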

  
> Best Regards,
> Huang, Ying
> 
> ------------------------8<----------------------------
> From 24ce0736a9f587d2dba12f12491c88d3e296a491 Mon Sep 17 00:00:00 2001
> From: Huang Ying <ying.hu...@intel.com>
> Date: Fri, 25 Sep 2020 11:10:56 +0800
> Subject: [PATCH] dbg: Call split_swap_cluster() before unlocking page during
>  THP split
> 
> ---
>  mm/huge_memory.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index faadc449cca5..8d79e5e6b46e 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2444,6 +2444,12 @@ static void __split_huge_page(struct page *page, struct list_head *list,
>  
>       remap_page(head);
>  
> +     if (PageSwapCache(head)) {
> +             swp_entry_t entry = { .val = page_private(head) };
> +
> +             split_swap_cluster(entry);
> +     }
> +
>       for (i = 0; i < HPAGE_PMD_NR; i++) {
>               struct page *subpage = head + i;
>               if (subpage == page)
> @@ -2678,12 +2684,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
>               }
>  
>               __split_huge_page(page, list, end, flags);
> -             if (PageSwapCache(head)) {
> -                     swp_entry_t entry = { .val = page_private(head) };
> -
> -                     ret = split_swap_cluster(entry);
> -             } else
> -                     ret = 0;
> +             ret = 0;
>       } else {
>               if (IS_ENABLED(CONFIG_DEBUG_VM) && mapcount) {
>                       pr_alert("total_mapcount: %u, page_count(): %u\n",
> -- 
> 2.28.0
> 
