On 11/29/2016 01:15 AM, Linus Torvalds wrote:
> However, I also independently think I found an actual bug while
> looking at the code as part of looking at the patch.
> 
> This part looks racy:
> 
>                 /*
>                  * We are remapping a dirty PTE, make sure to
>                  * flush TLB before we drop the PTL for the
>                  * old PTE or we may race with page_mkclean().
>                  */
>                 if (pte_present(*old_pte) && pte_dirty(*old_pte))
>                         force_flush = true;
>                 pte = ptep_get_and_clear(mm, old_addr, old_pte);
> 
> where the issue is that another thread might make the pte be dirty (in
> the hardware walker, so no locking of ours make any difference)
> *after* we checked whether it was dirty, but *before* we removed it
> from the page tables.

Ah, very right. Thanks for the catch!

> 
> So I think the "check for force-flush" needs to come *after*, and we should do
> 
>                 pte = ptep_get_and_clear(mm, old_addr, old_pte);
>                 if (pte_present(pte) && pte_dirty(pte))
>                         force_flush = true;
> 
> instead.
> 
> This happens for the pmd case too.

Here is a patch to fix it. Sorry for the trouble.

From c0dc52fd3d3be93afb5b97804937a1b1b7ef136e Mon Sep 17 00:00:00 2001
From: Aaron Lu <aaron...@intel.com>
Date: Tue, 29 Nov 2016 10:33:37 +0800
Subject: [PATCH] mremap: move_ptes: check pte dirty after its removal

Linus found that there is still a race in mremap() after commit
5d1904204c99 ("mremap: fix race between mremap() and page cleanning").

As described by Linus:
the issue is that another thread might make the pte be dirty (in
the hardware walker, so no locking of ours make any difference)
*after* we checked whether it was dirty, but *before* we removed it
from the page tables.

Fix it by doing the dirty check after the PTE has been removed from the
page table. Do the same for the PMD in move_huge_pmd().

Suggested-by: Linus Torvalds <torva...@linux-foundation.org>
Signed-off-by: Aaron Lu <aaron...@intel.com>
---
 mm/huge_memory.c | 4 ++--
 mm/mremap.c      | 8 ++++++--
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index eff3de359d50..a3e466c489a9 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1456,9 +1456,9 @@ bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
                new_ptl = pmd_lockptr(mm, new_pmd);
                if (new_ptl != old_ptl)
                        spin_lock_nested(new_ptl, SINGLE_DEPTH_NESTING);
-               if (pmd_present(*old_pmd) && pmd_dirty(*old_pmd))
-                       force_flush = true;
                pmd = pmdp_huge_get_and_clear(mm, old_addr, old_pmd);
+               if (pmd_present(pmd) && pmd_dirty(pmd))
+                       force_flush = true;
                VM_BUG_ON(!pmd_none(*new_pmd));
 
                if (pmd_move_must_withdraw(new_ptl, old_ptl) &&
diff --git a/mm/mremap.c b/mm/mremap.c
index 6ccecc03f56a..4b39dd0974e5 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -149,14 +149,18 @@ static void move_ptes(struct vm_area_struct *vma, pmd_t *old_pmd,
                if (pte_none(*old_pte))
                        continue;
 
+               pte = ptep_get_and_clear(mm, old_addr, old_pte);
                /*
                 * We are remapping a dirty PTE, make sure to
                 * flush TLB before we drop the PTL for the
                 * old PTE or we may race with page_mkclean().
+                *
+                * This check has to be done after we removed the
+                * old PTE from page tables or another thread may
+                * dirty it after the check and before the removal.
                 */
-               if (pte_present(*old_pte) && pte_dirty(*old_pte))
+               if (pte_present(pte) && pte_dirty(pte))
                        force_flush = true;
-               pte = ptep_get_and_clear(mm, old_addr, old_pte);
                pte = move_pte(pte, new_vma->vm_page_prot, old_addr, new_addr);
                pte = move_soft_dirty_pte(pte);
                set_pte_at(mm, new_addr, new_pte, pte);
-- 
2.5.5
