On Wed, 12 Aug 2015 17:35:09 +0300 "Kirill A. Shutemov" <kir...@shutemov.name> wrote:

> On Thu, Aug 06, 2015 at 12:24:22PM -0700, Hugh Dickins wrote:
> > > IIUC, the only potentially problematic callsites left are physical memory
> > > scanners. This code requires audit. I'll do that.
> > 
> > Please.
> 
> I haven't finished the exercise yet. But here's an issue I believe is present
> in the current *Linus* tree:
> 
> From e78eec7d7a8c4cba8b5952a997973f7741e704f4 Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shute...@linux.intel.com>
> Date: Wed, 12 Aug 2015 17:09:16 +0300
> Subject: [PATCH] mm: fix potential race in isolate_migratepages_block()
> 
> Hugh has pointed out that a compound_head() call can be unsafe in some
> contexts. Here's one example:
> 
>       CPU0                                    CPU1
> 
> isolate_migratepages_block()
>   page_count()
>     compound_head()
>       !!PageTail() == true
>                                       put_page()
>                                         tail->first_page = NULL
>       head = tail->first_page
>                                       alloc_pages(__GFP_COMP)
>                                          prep_compound_page()
>                                            tail->first_page = head
>                                            __SetPageTail(p);
>       !!PageTail() == true
>     <head == NULL dereferencing>
> 
> The race is purely theoretical. I don't think it's possible to trigger it in
> practice. But who knows.
> 
> This can be fixed by avoiding compound_head() in unsafe contexts.

This is nuts :( page_count() should Just Work without us having to
worry about bizarre races against splitting.  Sigh.

> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -787,7 +787,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>                * admittedly racy check.
>                */
>               if (!page_mapping(page) &&
> -                 page_count(page) > page_mapcount(page))
> +                 atomic_read(&page->_count) > page_mapcount(page))
>                       continue;

If we're going to do this sort of thing, can we please do it in a more
transparent manner?  Let's not sprinkle unexplained and
incomprehensible direct accesses to ->_count all over the place.

Create a formal function to do this, with an appropriate name and with
documentation which fully explains what's going on.  Then use that
here, and in has_unmovable_pages() (at least).
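
As a rough illustration of the kind of helper meant here, the following is a minimal user-space sketch, not kernel code. The name page_count_raw() is hypothetical, the struct page below is a simplified stand-in for the real one, and C11 atomic_load() stands in for the kernel's atomic_read(). The only point it shows is a documented accessor that reads ->_count of exactly the page it is given, without going through compound_head().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/*
 * Simplified stand-in for the kernel's struct page (assumption: only the
 * fields relevant to the discussion are modelled here).
 */
struct page {
	atomic_int _count;		/* reference count of this struct page */
	struct page *first_page;	/* head page; meaningful for tail pages only */
	int is_tail;			/* stand-in for PageTail() */
};

/*
 * page_count_raw() - hypothetical helper, name chosen for illustration.
 *
 * Return the reference count stored in @page itself, without calling
 * compound_head(). Intended for callers such as physical memory scanners
 * that hold no reference on the page, so a tail page's first_page pointer
 * may change (or be NULL) underneath them. The result is inherently racy
 * and must only be used as a hint.
 */
static inline int page_count_raw(struct page *page)
{
	assert(page != NULL);
	/* Never dereferences page->first_page, so a NULL head is harmless. */
	return atomic_load(&page->_count);
}
```

With something along these lines, the check in isolate_migratepages_block() would call the helper rather than open-coding the access to ->_count, and the comment on the helper would be the single place that explains why compound_head() is avoided.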