On Tue, 3 Apr 2018, Jerome Glisse wrote:

> > diff --git a/mm/memory.c b/mm/memory.c
> > index 21b1212a0892..4bc7b0bdcb40 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -2309,21 +2309,29 @@ static bool pte_map_lock(struct vm_fault *vmf)
> >   * parts, do_swap_page must check under lock before unmapping the pte and
> >   * proceeding (but do_wp_page is only called after already making such a 
> > check;
> >   * and do_anonymous_page can safely check later on).
> > + *
> > + * pte_unmap_same() returns:
> > + * 0                       if the PTE are the same
> > + * VM_FAULT_PTNOTSAME      if the PTE are different
> > + * VM_FAULT_RETRY          if the VMA has changed in our back during
> > + *                         a speculative page fault handling.
> >   */
> > -static inline int pte_unmap_same(struct mm_struct *mm, pmd_t *pmd,
> > -                           pte_t *page_table, pte_t orig_pte)
> > +static inline int pte_unmap_same(struct vm_fault *vmf)
> >  {
> > -   int same = 1;
> > +   int ret = 0;
> > +
> >  #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT)
> >     if (sizeof(pte_t) > sizeof(unsigned long)) {
> > -           spinlock_t *ptl = pte_lockptr(mm, pmd);
> > -           spin_lock(ptl);
> > -           same = pte_same(*page_table, orig_pte);
> > -           spin_unlock(ptl);
> > +           if (pte_spinlock(vmf)) {
> > +                   if (!pte_same(*vmf->pte, vmf->orig_pte))
> > +                           ret = VM_FAULT_PTNOTSAME;
> > +                   spin_unlock(vmf->ptl);
> > +           } else
> > +                   ret = VM_FAULT_RETRY;
> >     }
> >  #endif
> > -   pte_unmap(page_table);
> > -   return same;
> > +   pte_unmap(vmf->pte);
> > +   return ret;
> >  }
> >  
> >  static inline void cow_user_page(struct page *dst, struct page *src, 
> > unsigned long va, struct vm_area_struct *vma)
> > @@ -2913,7 +2921,8 @@ int do_swap_page(struct vm_fault *vmf)
> >     int exclusive = 0;
> >     int ret = 0;
> >  
> > -   if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte))
> > +   ret = pte_unmap_same(vmf);
> > +   if (ret)
> >             goto out;
> >  
> 
> This change what do_swap_page() returns ie before it was returning 0
> when locked pte lookup was different from orig_pte. After this patch
> it returns VM_FAULT_PTNOTSAME but this is a new return value for
> handle_mm_fault() (the do_swap_page() return value is what ultimately
> get return by handle_mm_fault())
> 
> Do we really want that ? This might confuse some existing user of
> handle_mm_fault() and i am not sure of the value of that information
> to caller.
> 
> Note i do understand that you want to return retry if anything did
> change from underneath and thus need to differentiate from when the
> pte value are not the same.
> 

I think VM_FAULT_RETRY should be handled appropriately for any user of 
handle_mm_fault() already, and would be surprised to learn differently.  
Khugepaged has the appropriate handling.  I think the concern is whether a 
user is handling anything other than VM_FAULT_RETRY and VM_FAULT_ERROR 
(which VM_FAULT_PTNOTSAME is not set in)?  I haven't done a full audit of 
the users.

Reply via email to