On Tue, Nov 10, 2020 at 07:44:09PM -0400, Jason Gunthorpe wrote:
> Since commit 70e806e4e645 ("mm: Do early cow for pinned pages during
> fork() for ptes") pages under a FOLL_PIN will not be write protected
> during COW for fork. This means that pages returned from
> pin_user_pages(FOLL_WRITE) should not become write protected while the pin
> is active.
> 
> However, there is a small race where get_user_pages_fast(FOLL_PIN) can
> establish a FOLL_PIN at the same time copy_present_page() is write
> protecting it:
> 
>         CPU 0                             CPU 1
>    get_user_pages_fast()
>     internal_get_user_pages_fast()
>                                        copy_page_range()
>                                          pte_alloc_map_lock()
>                                            copy_present_page()
>                                              atomic_read(has_pinned) == 0
>                                            page_maybe_dma_pinned() == false
>      atomic_set(has_pinned, 1);
>      gup_pgd_range()
>       gup_pte_range()
>        pte_t pte = gup_get_pte(ptep)
>        pte_access_permitted(pte)
>        try_grab_compound_head()
>                                              pte = pte_wrprotect(pte)
>                                            set_pte_at();
>                                          pte_unmap_unlock()
>       // GUP now returns with a write protected page
> 
> The first attempt to resolve this by using the write protect caused
> problems (and was missing a barrrier), see commit f3c64eda3e50 ("mm: avoid
> early COW write protect games during fork()")
> 
> Instead wrap copy_p4d_range() with the write side of a seqcount and check
> the read side around gup_pgd_range(). If there is a collision then
> get_user_pages_fast() fails and falls back to slow GUP.
> 
> Slow GUP is safe against this race because copy_page_range() is only
> called while holding the exclusive side of the mmap_lock on the src
> mm_struct.
> 
> Fixes: f3c64eda3e50 ("mm: avoid early COW write protect games during fork()")
> Suggested-by: Linus Torvalds <torva...@linux-foundation.org>
> Link: 
> https://lore.kernel.org/r/CAHk-=wi=icnycarbpgjkvju9eyyez13n64tzyldob8cp5q_...@mail.gmail.com
> Reviewed-by: John Hubbard <jhubb...@nvidia.com>
> Reviewed-by: Jan Kara <j...@suse.cz>
> Signed-off-by: Jason Gunthorpe <j...@nvidia.com>

Reviewed-by: Peter Xu <pet...@redhat.com>

-- 
Peter Xu

Reply via email to