Linus found there still is a race in mremap after commit 5d1904204c99 ("mremap: fix race between mremap() and page cleanning").
As described by Linus: the issue is that another thread might make the pte be dirty (in the hardware walker, so no locking of ours make any difference) *after* we checked whether it was dirty, but *before* we removed it from the page tables. Fix it by moving the check after we removed it from the page table. Suggested-by: Linus Torvalds <torva...@linux-foundation.org> Signed-off-by: Aaron Lu <aaron...@intel.com> --- mm/huge_memory.c | 4 ++-- mm/mremap.c | 8 ++++++-- 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index eff3de359d50..d4a6e4001512 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1456,9 +1456,9 @@ bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr, new_ptl = pmd_lockptr(mm, new_pmd); if (new_ptl != old_ptl) spin_lock_nested(new_ptl, SINGLE_DEPTH_NESTING); - if (pmd_present(*old_pmd) && pmd_dirty(*old_pmd)) - force_flush = true; pmd = pmdp_huge_get_and_clear(mm, old_addr, old_pmd); + if (pmd_present(pmd) && pmd_dirty(pmd)) + force_flush = true; VM_BUG_ON(!pmd_none(*new_pmd)); if (pmd_move_must_withdraw(new_ptl, old_ptl) && diff --git a/mm/mremap.c b/mm/mremap.c index 6ccecc03f56a..53df7ec8d2ba 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -149,14 +149,18 @@ static void move_ptes(struct vm_area_struct *vma, pmd_t *old_pmd, if (pte_none(*old_pte)) continue; + pte = ptep_get_and_clear(mm, old_addr, old_pte); /* * We are remapping a dirty PTE, make sure to * flush TLB before we drop the PTL for the * old PTE or we may race with page_mkclean(). + * + * This check has to be done after we removed the + * old PTE from page tables or another thread may + * dirty it after the check and before the removal. */ - if (pte_present(*old_pte) && pte_dirty(*old_pte)) + if (pte_present(pte) && pte_dirty(pte)) force_flush = true; - pte = ptep_get_and_clear(mm, old_addr, old_pte); pte = move_pte(pte, new_vma->vm_page_prot, old_addr, new_addr); pte = move_soft_dirty_pte(pte); set_pte_at(mm, new_addr, new_pte, pte); -- 2.5.5