On Wed, Apr 16, 2014 at 5:48 AM, Kirill A. Shutemov
<kirill.shute...@linux.intel.com> wrote:
> Sasha Levin has reported two THP BUGs[1][2]. I believe both of them have
> the same root cause. Let's look at them one by one.
>
> The first bug[1] is "kernel BUG at mm/huge_memory.c:1829!".
> It's BUG_ON(mapcount != page_mapcount(page)) in __split_huge_page().
> From my testing I see that page_mapcount() is higher than mapcount here.
>
> I think it happens due to a race between zap_huge_pmd() and
> page_check_address_pmd(): page_check_address_pmd() misses a PMD
> which is under zap:
>
Nice catch!

>       CPU0                                    CPU1
>                                               zap_huge_pmd()
>                                                 pmdp_get_and_clear()
> __split_huge_page()
>   anon_vma_interval_tree_foreach()
>     __split_huge_page_splitting()
>       page_check_address_pmd()
>         mm_find_pmd()
>           /*
>            * We check if PMD present without taking ptl: no
>            * serialization against zap_huge_pmd(). We miss this PMD,
>            * it's not accounted to 'mapcount' in __split_huge_page().
>            */
>           pmd_present(pmd) == 0
>
>   BUG_ON(mapcount != page_mapcount(page)) // CRASH!!!
>
>                                                 page_remove_rmap(page)
>                                                   atomic_add_negative(-1,
>                                                       &page->_mapcount)
>
> The second bug[2] is "kernel BUG at mm/huge_memory.c:1371!".
> It's VM_BUG_ON_PAGE(!PageHead(page), page) in zap_huge_pmd().
>
> This happens in a similar way:
>
>       CPU0                                    CPU1
>                                               zap_huge_pmd()
>                                                 pmdp_get_and_clear()
>                                                 page_remove_rmap(page)
>                                                   atomic_add_negative(-1,
>                                                       &page->_mapcount)
> __split_huge_page()
>   anon_vma_interval_tree_foreach()
>     __split_huge_page_splitting()
>       page_check_address_pmd()
>         mm_find_pmd()
>           pmd_present(pmd) == 0 /* The same comment as above */
>   /*
>    * No crash this time since we already decremented page->_mapcount in
>    * zap_huge_pmd().
>    */
>   BUG_ON(mapcount != page_mapcount(page))
>
>   /*
>    * We split the compound page here into small pages without
>    * serialization against zap_huge_pmd()
>    */
>   __split_huge_page_refcount()
>
>                                               VM_BUG_ON_PAGE(!PageHead(page), page); // CRASH!!!
>
> So my understanding is that the problem is the pmd_present() check in
> mm_find_pmd(), done without taking the page table lock.
>
> The bug was introduced by my commit 117b0791ac42. Sorry for that. :(
>
> Let's open code mm_find_pmd() in page_check_address_pmd() and do the
> check under the page table lock.
>
> Note that __page_check_address() does the same for PTE entries
> if sync != 0.
>
> I've stress tested the split and zap code paths for 36+ hours by now and
> don't see crashes with the patch applied. Before, it took <20 min to
> trigger the first bug and a few hours for the second one (if we ignore
> the first).
>
> [1] https://lkml.kernel.org/g/<53440991.9090...@oracle.com>
> [2] https://lkml.kernel.org/g/<5310c56c.60...@oracle.com>
>
> Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
> Reported-by: Sasha Levin <sasha.le...@oracle.com>
> Cc: <sta...@vger.kernel.org> #3.13+
> ---
>  mm/huge_memory.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 5025709bb3b5..d02a83852ee9 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1536,16 +1536,23 @@ pmd_t *page_check_address_pmd(struct page *page,
>                               enum page_check_address_pmd_flag flag,
>                               spinlock_t **ptl)
>  {
> +       pgd_t *pgd;
> +       pud_t *pud;
>         pmd_t *pmd;
>
>         if (address & ~HPAGE_PMD_MASK)
>                 return NULL;
>
> -       pmd = mm_find_pmd(mm, address);
> -       if (!pmd)
> +       pgd = pgd_offset(mm, address);
> +       if (!pgd_present(*pgd))
>                 return NULL;
> +       pud = pud_offset(pgd, address);
> +       if (!pud_present(*pud))
> +               return NULL;
> +       pmd = pmd_offset(pud, address);
> +
>         *ptl = pmd_lock(mm, pmd);
> -       if (pmd_none(*pmd))
> +       if (!pmd_present(*pmd))
>                 goto unlock;

But I don't get why the pmd_none() check was replaced with
!pmd_present() here?

--
Regards,
--Bob
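
P.S. To check my own reading of the change: with your patch, the page
table walk in page_check_address_pmd() effectively becomes something
like the sketch below. The helper name lookup_pmd_locked() is made up
by me and the error handling is trimmed, so this is only a paraphrase
of the diff hunk above, not the actual kernel function. The point I
take away is that the presence check is now done only after pmd_lock()
is taken, so it is serialized against pmdp_get_and_clear() in
zap_huge_pmd() and the window in the two traces above is closed.

/*
 * Simplified paraphrase of the patched lookup (kernel context assumed,
 * e.g. linux/mm.h; names and structure follow the diff hunk above).
 */
static pmd_t *lookup_pmd_locked(struct mm_struct *mm, unsigned long address,
				spinlock_t **ptl)
{
	pgd_t *pgd;
	pud_t *pud;
	pmd_t *pmd;

	/* Walk pgd/pud without the lock, as the posted patch does. */
	pgd = pgd_offset(mm, address);
	if (!pgd_present(*pgd))
		return NULL;
	pud = pud_offset(pgd, address);
	if (!pud_present(*pud))
		return NULL;
	pmd = pmd_offset(pud, address);

	/*
	 * Take the page table lock first, then check the PMD. zap_huge_pmd()
	 * clears the PMD under the same lock, so the check cannot race with
	 * pmdp_get_and_clear() any more.
	 */
	*ptl = pmd_lock(mm, pmd);
	if (!pmd_present(*pmd)) {
		spin_unlock(*ptl);
		return NULL;
	}
	return pmd;	/* caller drops *ptl when done with the PMD */
}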