On Mon, Apr 17, 2017 at 03:32:20PM -0600, Alex Williamson wrote:
> On Tue, 18 Apr 2017 01:02:12 +0530
> Kirti Wankhede <kwankh...@nvidia.com> wrote:
> 
> > On 4/18/2017 12:49 AM, Alex Williamson wrote:
> > > On Tue, 18 Apr 2017 00:35:06 +0530
> > > Kirti Wankhede <kwankh...@nvidia.com> wrote:
> > >   
> > >> On 4/17/2017 8:02 PM, Alex Williamson wrote:  
> > >>> On Mon, 17 Apr 2017 14:47:54 +0800
> > >>> Peter Xu <pet...@redhat.com> wrote:
> > >>>     
> > >>>> On Sun, Apr 16, 2017 at 07:42:27PM -0600, Alex Williamson wrote:
> > >>>>
> > >>>> [...]
> > >>>>    
> > >>>>> -static void vfio_lock_acct(struct task_struct *task, long npage)
> > >>>>> +static int vfio_lock_acct(struct task_struct *task, long npage, bool lock_cap)
> > >>>>>  {
> > >>>>> -     struct vwork *vwork;
> > >>>>>       struct mm_struct *mm;
> > >>>>>       bool is_current;
> > >>>>> +     int ret;
> > >>>>>  
> > >>>>>       if (!npage)
> > >>>>> -             return;
> > >>>>> +             return 0;
> > >>>>>  
> > >>>>>       is_current = (task->mm == current->mm);
> > >>>>>  
> > >>>>>       mm = is_current ? task->mm : get_task_mm(task);
> > >>>>>       if (!mm)
> > >>>>> -             return; /* process exited */
> > >>>>> +             return -ESRCH; /* process exited */
> > >>>>>  
> > >>>>> -     if (down_write_trylock(&mm->mmap_sem)) {
> > >>>>> -             mm->locked_vm += npage;
> > >>>>> -             up_write(&mm->mmap_sem);
> > >>>>> -             if (!is_current)
> > >>>>> -                     mmput(mm);
> > >>>>> -             return;
> > >>>>> -     }
> > >>>>> +     ret = down_write_killable(&mm->mmap_sem);
> > >>>>> +     if (!ret) {
> > >>>>> +             if (npage < 0 || lock_cap) {      
> > >>>>
> > >>>> Nit: maybe we can avoid passing in lock_cap in all the callers of
> > >>>> vfio_lock_acct() and fetch it via has_capability() only if npage < 0?
> > >>>> IMHO that'll keep the vfio_lock_acct() interface cleaner, and we won't
> > >>>> need to pass in "false" any time when doing unpins.    
> > >>>
> > >>> Unfortunately vfio_pin_pages_remote() needs to know about lock_cap
> > >>> since it tests whether the user is exceeding their locked memory
> > >>> limit.  The other callers could certainly get away with
> > >>> vfio_lock_acct() testing the capability itself but that would add a
> > >>> redundant call for the most common user.  I'm not a big fan of passing
> > >>> a lock_cap bool either, but it seemed the best fix for now.  The
> > >>> cleanest alternative I can come up with is this (untested):
> > >>>     
> > >>
> > >> In my opinion, passing 'bool lock_cap' looks much cleaner and simpler.
> > >>
> > >> Reviewed-by: Kirti Wankhede <kwankh...@nvidia.com>  
> > > 
> > > Well shoot, I was just starting to warm up to the bool*.  I like that
> > > we're not presuming the polarity for the callers we expect to be
> > > removing pages and I generally just dislike passing fixed bool
> > > parameters to change the function behavior.  I've cleaned it up a bit
> > > further and was starting to do some testing on this which I'd propose
> > > for v5.  Does it change your opinion?  
> > 
> > If passing a fixed bool parameter is the concern, then I would lean
> > towards Peter's suggestion. vfio_pin_pages_remote() will check the lock
> > capability outside vfio_lock_acct() and again inside vfio_lock_acct().
> > At the other call sites, it will be taken care of within vfio_lock_acct().
> 
> Sorry, I don't see that as a viable option.  Testing for CAP_IPC_LOCK in
> both vfio_pin_pages_remote() and vfio_lock_acct() results in over a
> 10% performance hit on the mapping path with a custom micro-benchmark.
> In fact, it suggests we should probably pass that from even higher in
> the call stack.  Thanks,

Sorry, I wasn't aware of such a performance degradation with that
change. Then I would be perfectly fine with either the current patch or
the new one you proposed (with bool *). Thanks,
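For readers following along: the "bool *" variant under discussion lets hot-path
callers such as vfio_pin_pages_remote() do the CAP_IPC_LOCK lookup once and pass
the cached result down, while other callers pass NULL and let vfio_lock_acct()
look it up only when actually charging pages. The following is a minimal
userspace sketch of that interface, not the actual kernel patch;
mock_has_cap_ipc_lock(), locked_vm, and lock_limit are stand-ins for
has_capability(), mm->locked_vm, and the RLIMIT_MEMLOCK page limit.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Counts how often the (relatively expensive) capability check runs,
 * so the caching behavior is observable. */
static int capability_checks;

/* Hypothetical stand-in for has_capability(task, CAP_IPC_LOCK). */
static bool mock_has_cap_ipc_lock(void)
{
    capability_checks++;
    return true;
}

static long locked_vm;       /* stand-in for mm->locked_vm */
static long lock_limit = 16; /* stand-in for rlimit(RLIMIT_MEMLOCK) in pages */

/*
 * Sketch of the "bool *lock_cap" interface: a NULL pointer means
 * "look up the capability here"; a non-NULL pointer means the caller
 * already knows, so the lookup is skipped on the hot path.  Unpins
 * (npage < 0) never need the check at all.
 */
static int vfio_lock_acct_sketch(long npage, bool *lock_cap)
{
    if (!npage)
        return 0;

    if (npage > 0) {
        bool cap = lock_cap ? *lock_cap : mock_has_cap_ipc_lock();

        if (!cap && locked_vm + npage > lock_limit)
            return -ENOMEM; /* would exceed the locked-memory limit */
    }

    locked_vm += npage;
    return 0;
}
```

A pinning path would compute `bool cap = mock_has_cap_ipc_lock();` once up
front and pass `&cap` on every accounting call; unpin paths simply pass NULL
with a negative npage, so they never pay for the capability lookup.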

-- 
Peter Xu
