Re: [PATCH 1/6] mm: change locked_vm's type from unsigned long to atomic64_t

2019-04-22 Thread Daniel Jordan
On Tue, Apr 16, 2019 at 04:33:51PM -0700, Andrew Morton wrote:

Sorry for the delay, I was on vacation all last week.

> What's the status of this patchset, btw?
> 
> I have a note here that
> powerpc-mmu-drop-mmap_sem-now-that-locked_vm-is-atomic.patch is to be
> updated.

Yes, the series needs a few updates.  v2 should appear in the next day or two.


Re: [PATCH 1/6] mm: change locked_vm's type from unsigned long to atomic64_t

2019-04-16 Thread Andrew Morton
On Thu, 11 Apr 2019 16:28:07 -0400 Daniel Jordan wrote:

> On Thu, Apr 11, 2019 at 10:55:43AM +0100, Mark Rutland wrote:
> > On Thu, Apr 11, 2019 at 02:22:23PM +1000, Alexey Kardashevskiy wrote:
> > > On 03/04/2019 07:41, Daniel Jordan wrote:
> > 
> > > > -   dev_dbg(dev, "[%d] RLIMIT_MEMLOCK %c%ld %ld/%ld%s\n", current->pid,
> > > > +   dev_dbg(dev, "[%d] RLIMIT_MEMLOCK %c%ld %lld/%lu%s\n", current->pid,
> > > > incr ? '+' : '-', npages << PAGE_SHIFT,
> > > > -   current->mm->locked_vm << PAGE_SHIFT, rlimit(RLIMIT_MEMLOCK),
> > > > -   ret ? "- exceeded" : "");
> > > > +   (s64)atomic64_read(&current->mm->locked_vm) << PAGE_SHIFT,
> > > > +   rlimit(RLIMIT_MEMLOCK), ret ? "- exceeded" : "");
> > > 
> > > 
> > > 
> > > atomic64_read() returns "long" which matches "%ld", why this change (and
> > > similar below)? You did not do this in the two pr_debug()s above anyway.
> > 
> > Unfortunately, architectures return inconsistent types for atomic64 ops.
> > 
> > Some return long (e.g. powerpc), some return long long (e.g. arc), and
> > some return s64 (e.g. x86).
> 
> Yes, Mark said it all, I'm just chiming in to confirm that's why I added the
> cast.
> 
> Btw, thanks for doing this, Mark.

What's the status of this patchset, btw?

I have a note here that
powerpc-mmu-drop-mmap_sem-now-that-locked_vm-is-atomic.patch is to be
updated.



Re: [PATCH 1/6] mm: change locked_vm's type from unsigned long to atomic64_t

2019-04-11 Thread Daniel Jordan
On Thu, Apr 11, 2019 at 10:55:43AM +0100, Mark Rutland wrote:
> On Thu, Apr 11, 2019 at 02:22:23PM +1000, Alexey Kardashevskiy wrote:
> > On 03/04/2019 07:41, Daniel Jordan wrote:
> 
> > > - dev_dbg(dev, "[%d] RLIMIT_MEMLOCK %c%ld %ld/%ld%s\n", current->pid,
> > > + dev_dbg(dev, "[%d] RLIMIT_MEMLOCK %c%ld %lld/%lu%s\n", current->pid,
> > >   incr ? '+' : '-', npages << PAGE_SHIFT,
> > > - current->mm->locked_vm << PAGE_SHIFT, rlimit(RLIMIT_MEMLOCK),
> > > - ret ? "- exceeded" : "");
> > > + (s64)atomic64_read(&current->mm->locked_vm) << PAGE_SHIFT,
> > > + rlimit(RLIMIT_MEMLOCK), ret ? "- exceeded" : "");
> > 
> > 
> > 
> > atomic64_read() returns "long" which matches "%ld", why this change (and
> > similar below)? You did not do this in the two pr_debug()s above anyway.
> 
> Unfortunately, architectures return inconsistent types for atomic64 ops.
> 
> Some return long (e.g. powerpc), some return long long (e.g. arc), and
> some return s64 (e.g. x86).

Yes, Mark said it all, I'm just chiming in to confirm that's why I added the
cast.

Btw, thanks for doing this, Mark.


Re: [PATCH 1/6] mm: change locked_vm's type from unsigned long to atomic64_t

2019-04-11 Thread Mark Rutland
On Thu, Apr 11, 2019 at 02:22:23PM +1000, Alexey Kardashevskiy wrote:
> On 03/04/2019 07:41, Daniel Jordan wrote:

> > -   dev_dbg(dev, "[%d] RLIMIT_MEMLOCK %c%ld %ld/%ld%s\n", current->pid,
> > +   dev_dbg(dev, "[%d] RLIMIT_MEMLOCK %c%ld %lld/%lu%s\n", current->pid,
> > incr ? '+' : '-', npages << PAGE_SHIFT,
> > -   current->mm->locked_vm << PAGE_SHIFT, rlimit(RLIMIT_MEMLOCK),
> > -   ret ? "- exceeded" : "");
> > +   (s64)atomic64_read(&current->mm->locked_vm) << PAGE_SHIFT,
> > +   rlimit(RLIMIT_MEMLOCK), ret ? "- exceeded" : "");
> 
> 
> 
> atomic64_read() returns "long" which matches "%ld", why this change (and
> similar below)? You did not do this in the two pr_debug()s above anyway.

Unfortunately, architectures return inconsistent types for atomic64 ops.

Some return long (e.g. powerpc), some return long long (e.g. arc), and
some return s64 (e.g. x86).

I'm currently trying to clean things up so that all use s64 [1], but in
the meantime it's necessary for generic code to use a cast or temporary
variable to ensure a consistent type. Once that's cleaned up, we can
remove the redundant casts.
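
As a minimal illustration (hypothetical pr_debug() calls, not code from
this series), either form yields a value whose type is s64 on every
architecture, so the %lld specifier always matches:

	s64 locked_vm;

	/* option 1: assign to a temporary with a fixed type */
	locked_vm = atomic64_read(&mm->locked_vm);
	pr_debug("locked_vm %lld\n", locked_vm);

	/* option 2: cast explicitly at the use site, as this series does */
	pr_debug("locked_vm %lld\n", (s64)atomic64_read(&mm->locked_vm));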

Thanks,
Mark.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=atomics/type-cleanup


Re: [PATCH 1/6] mm: change locked_vm's type from unsigned long to atomic64_t

2019-04-10 Thread Alexey Kardashevskiy



On 03/04/2019 07:41, Daniel Jordan wrote:
> Taking and dropping mmap_sem to modify a single counter, locked_vm, is
> overkill when the counter could be synchronized separately.
> 
> Make mmap_sem a little less coarse by changing locked_vm to an atomic,
> the 64-bit variety to avoid issues with overflow on 32-bit systems.
> 
> Signed-off-by: Daniel Jordan 
> Cc: Alan Tull 
> Cc: Alexey Kardashevskiy 
> Cc: Alex Williamson 
> Cc: Andrew Morton 
> Cc: Benjamin Herrenschmidt 
> Cc: Christoph Lameter 
> Cc: Davidlohr Bueso 
> Cc: Michael Ellerman 
> Cc: Moritz Fischer 
> Cc: Paul Mackerras 
> Cc: Wu Hao 
> Cc: 
> Cc: 
> Cc: 
> Cc: 
> Cc: 
> Cc: 
> ---
>  arch/powerpc/kvm/book3s_64_vio.c    | 14 ++++++++------
>  arch/powerpc/mm/mmu_context_iommu.c | 15 ++++++++-------
>  drivers/fpga/dfl-afu-dma-region.c   | 18 ++++++++++--------
>  drivers/vfio/vfio_iommu_spapr_tce.c | 17 +++++++++--------
>  drivers/vfio/vfio_iommu_type1.c     | 10 ++++++----
>  fs/proc/task_mmu.c                  |  2 +-
>  include/linux/mm_types.h            |  2 +-
>  kernel/fork.c                       |  2 +-
>  mm/debug.c                          |  5 +++--
>  mm/mlock.c                          |  4 ++--
>  mm/mmap.c                           | 18 +++++++++---------
>  mm/mremap.c                         |  6 +++---
>  12 files changed, 61 insertions(+), 52 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
> index f02b04973710..e7fdb6d10eeb 100644
> --- a/arch/powerpc/kvm/book3s_64_vio.c
> +++ b/arch/powerpc/kvm/book3s_64_vio.c
> @@ -59,32 +59,34 @@ static unsigned long kvmppc_stt_pages(unsigned long 
> tce_pages)
>  static long kvmppc_account_memlimit(unsigned long stt_pages, bool inc)
>  {
>   long ret = 0;
> + s64 locked_vm;
>  
>   if (!current || !current->mm)
>   return ret; /* process exited */
>  
> >   down_write(&current->mm->mmap_sem);
>  
> > + locked_vm = atomic64_read(&current->mm->locked_vm);
>   if (inc) {
>   unsigned long locked, lock_limit;
>  
> - locked = current->mm->locked_vm + stt_pages;
> + locked = locked_vm + stt_pages;
>   lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
>   if (locked > lock_limit && !capable(CAP_IPC_LOCK))
>   ret = -ENOMEM;
>   else
> - current->mm->locked_vm += stt_pages;
> > +   atomic64_add(stt_pages, &current->mm->locked_vm);
>   } else {
> - if (WARN_ON_ONCE(stt_pages > current->mm->locked_vm))
> - stt_pages = current->mm->locked_vm;
> + if (WARN_ON_ONCE(stt_pages > locked_vm))
> + stt_pages = locked_vm;
>  
> - current->mm->locked_vm -= stt_pages;
> > +   atomic64_sub(stt_pages, &current->mm->locked_vm);
>   }
>  
>   pr_debug("[%d] RLIMIT_MEMLOCK KVM %c%ld %ld/%ld%s\n", current->pid,
>   inc ? '+' : '-',
>   stt_pages << PAGE_SHIFT,
> - current->mm->locked_vm << PAGE_SHIFT,
> > + atomic64_read(&current->mm->locked_vm) << PAGE_SHIFT,
>   rlimit(RLIMIT_MEMLOCK),
>   ret ? " - exceeded" : "");
>  
> diff --git a/arch/powerpc/mm/mmu_context_iommu.c b/arch/powerpc/mm/mmu_context_iommu.c
> index e7a9c4f6bfca..8038ac24a312 100644
> --- a/arch/powerpc/mm/mmu_context_iommu.c
> +++ b/arch/powerpc/mm/mmu_context_iommu.c
> @@ -55,30 +55,31 @@ static long mm_iommu_adjust_locked_vm(struct mm_struct 
> *mm,
>   unsigned long npages, bool incr)
>  {
>   long ret = 0, locked, lock_limit;
> + s64 locked_vm;
>  
>   if (!npages)
>   return 0;
>  
> >   down_write(&mm->mmap_sem);
> -
> > + locked_vm = atomic64_read(&mm->locked_vm);
>   if (incr) {
> - locked = mm->locked_vm + npages;
> + locked = locked_vm + npages;
>   lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
>   if (locked > lock_limit && !capable(CAP_IPC_LOCK))
>   ret = -ENOMEM;
>   else
> - mm->locked_vm += npages;
> > +   atomic64_add(npages, &mm->locked_vm);
>   } else {
> - if (WARN_ON_ONCE(npages > mm->locked_vm))
> - npages = mm->locked_vm;
> - mm->locked_vm -= npages;
> + if (WARN_ON_ONCE(npages > locked_vm))
> + npages = locked_vm;
> > +   atomic64_sub(npages, &mm->locked_vm);
>   }
>  
>   pr_debug("[%d] RLIMIT_MEMLOCK HASH64 %c%ld %ld/%ld\n",
>   current ? current->pid : 0,
>   incr ? '+' : '-',
>   npages << PAGE_SHIFT,
> - mm->locked_vm << PAGE_SHIFT,
> > + atomic64_read(&mm->locked_vm) << PAGE_SHIFT,
>   rlimit(RLIMIT_MEMLOCK));
> >   up_write(&mm->mmap_sem);
>  
> diff --git 

Re: [PATCH 1/6] mm: change locked_vm's type from unsigned long to atomic64_t

2019-04-03 Thread Daniel Jordan
On Tue, Apr 02, 2019 at 04:43:57PM -0700, Davidlohr Bueso wrote:
> On Tue, 02 Apr 2019, Andrew Morton wrote:
> 
> > Also, we didn't remove any down_write(mmap_sem)s from core code so I'm
> > thinking that the benefit of removing a few mmap_sem-takings from a few
> > obscure drivers (sorry ;)) is pretty small.
> 
> afaik porting the remaining incorrect users of locked_vm to pinned_vm was
> the next step before this one, which made converting locked_vm to atomic
> hardly worth it. Daniel?
 
Right, as you know I tried those incorrect users first, but there were concerns
about user-visible changes regarding RLIMIT_MEMLOCK and pinned_vm/locked_vm
without the accounting problem between all three being solved.

To my knowledge no one has a solution for that, so in the meantime I'm taking
the incremental step of getting rid of mmap_sem for locked_vm users.  The
locked_vm -> pinned_vm conversion can happen later.


Re: [PATCH 1/6] mm: change locked_vm's type from unsigned long to atomic64_t

2019-04-03 Thread Daniel Jordan
On Wed, Apr 03, 2019 at 06:46:07AM +0200, Christophe Leroy wrote:
> 
> 
> > On 02/04/2019 at 22:41, Daniel Jordan wrote:
> > Taking and dropping mmap_sem to modify a single counter, locked_vm, is
> > overkill when the counter could be synchronized separately.
> > 
> > Make mmap_sem a little less coarse by changing locked_vm to an atomic,
> > the 64-bit variety to avoid issues with overflow on 32-bit systems.
> 
> Can you elaborate on the above? Previously it was 'unsigned long', what
> were the issues?

Sure, I responded to this in another thread from this series.

> If there were such issues, shouldn't there be a first patch
> moving it from unsigned long to u64 before this atomic64_t change? Or at
> least it should be clearly explained here what the issues are and how
> switching to a 64-bit counter fixes them.

Yes, I can explain the motivation in the next version.


Re: [PATCH 1/6] mm: change locked_vm's type from unsigned long to atomic64_t

2019-04-03 Thread Daniel Jordan
On Tue, Apr 02, 2019 at 03:04:24PM -0700, Andrew Morton wrote:
> On Tue,  2 Apr 2019 16:41:53 -0400 Daniel Jordan wrote:
> >  static long kvmppc_account_memlimit(unsigned long stt_pages, bool inc)
> >  {
> > long ret = 0;
> > +   s64 locked_vm;
> >  
> > if (!current || !current->mm)
> > return ret; /* process exited */
> >  
> > down_write(&current->mm->mmap_sem);
> >  
> > +   locked_vm = atomic64_read(&current->mm->locked_vm);
> > if (inc) {
> > unsigned long locked, lock_limit;
> >  
> > -   locked = current->mm->locked_vm + stt_pages;
> > +   locked = locked_vm + stt_pages;
> > lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
> > if (locked > lock_limit && !capable(CAP_IPC_LOCK))
> > ret = -ENOMEM;
> > else
> > -   current->mm->locked_vm += stt_pages;
> > +   atomic64_add(stt_pages, &current->mm->locked_vm);
> > } else {
> > -   if (WARN_ON_ONCE(stt_pages > current->mm->locked_vm))
> > -   stt_pages = current->mm->locked_vm;
> > +   if (WARN_ON_ONCE(stt_pages > locked_vm))
> > +   stt_pages = locked_vm;
> >  
> > -   current->mm->locked_vm -= stt_pages;
> > +   atomic64_sub(stt_pages, &current->mm->locked_vm);
> > }
> 
> With the current code, current->mm->locked_vm cannot go negative. 
> After the patch, it can go negative.  If someone else decreased
> current->mm->locked_vm between this function's atomic64_read() and
> atomic64_sub().
> 
> I guess this is a can't-happen in this case because the racing code
> which performed the modification would have taken it negative anyway.
> 
> But this all makes me rather queasy.

mmap_sem is still held in this patch, so updates to locked_vm are still
serialized and I don't think what you describe can happen.  A later patch
removes mmap_sem, of course, but it also rewrites the code to do something
different.  This first patch is just a mechanical type change from unsigned
long to atomic64_t.
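
To sketch the point (abridged from kvmppc_account_memlimit() as it looks
after this patch; the ellipsis elides the limit check):

	down_write(&current->mm->mmap_sem);	/* still taken in patch 1 */
	locked_vm = atomic64_read(&current->mm->locked_vm);
	...
	atomic64_sub(stt_pages, &current->mm->locked_vm);
	up_write(&current->mm->mmap_sem);	/* no other writer between read and sub */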

So...does this alleviate your symptoms?

> Also, we didn't remove any down_write(mmap_sem)s from core code so I'm
> thinking that the benefit of removing a few mmap_sem-takings from a few
> obscure drivers (sorry ;)) is pretty small.

Not sure about the other drivers, but vfio type1 isn't obscure.  We use it
extensively in our cloud, and from Andrea's __GFP_THISNODE thread a few months
back it seems Red Hat also uses it:

  https://lore.kernel.org/linux-mm/20180820032204.9591-3-aarca...@redhat.com/

> Also, the argument for switching 32-bit arches to a 64-bit counter was
> suspiciously vague.  What overflow issues?  Or are we just being lazy?

If user-controlled values are used to increase locked_vm, multiple threads
doing it at once on a 32-bit system could theoretically cause overflow, so in
the absence of atomic overflow checking, the 64-bit counter on 32b is defensive
programming.
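
As a contrived sketch of the wrap on a 32-bit unsigned long (hypothetical
numbers, assuming racy or repeated user-driven increments got it there):

	unsigned long locked_vm = 0xffffff00UL;	/* pathological: near ULONG_MAX */
	locked_vm += 0x200;			/* silently wraps to 0x100 */
	/* RLIMIT_MEMLOCK checks now pass despite the huge real total */

With atomic64_t the counter would have to reach 2^64 pages before wrapping.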

I wouldn't have thought to do it, but Jason Gunthorpe raised the same issue in
the pinned_vm series:

  https://lore.kernel.org/linux-mm/20190115205311.gd22...@mellanox.com/

I'm fine with changing it to atomic_long_t if the scenario is too theoretical
for people.


Anyway, thanks for looking at this.


Re: [PATCH 1/6] mm: change locked_vm's type from unsigned long to atomic64_t

2019-04-02 Thread Christophe Leroy




On 02/04/2019 at 22:41, Daniel Jordan wrote:

Taking and dropping mmap_sem to modify a single counter, locked_vm, is
overkill when the counter could be synchronized separately.

Make mmap_sem a little less coarse by changing locked_vm to an atomic,
the 64-bit variety to avoid issues with overflow on 32-bit systems.


Can you elaborate on the above? Previously it was 'unsigned long', what 
were the issues? If there were such issues, shouldn't there be a first 
patch moving it from unsigned long to u64 before this atomic64_t change? 
Or at least it should be clearly explained here what the issues are and 
how switching to a 64-bit counter fixes them.


Christophe



Signed-off-by: Daniel Jordan 
Cc: Alan Tull 
Cc: Alexey Kardashevskiy 
Cc: Alex Williamson 
Cc: Andrew Morton 
Cc: Benjamin Herrenschmidt 
Cc: Christoph Lameter 
Cc: Davidlohr Bueso 
Cc: Michael Ellerman 
Cc: Moritz Fischer 
Cc: Paul Mackerras 
Cc: Wu Hao 
Cc: 
Cc: 
Cc: 
Cc: 
Cc: 
Cc: 
---
  arch/powerpc/kvm/book3s_64_vio.c    | 14 ++++++++------
  arch/powerpc/mm/mmu_context_iommu.c | 15 ++++++++-------
  drivers/fpga/dfl-afu-dma-region.c   | 18 ++++++++++--------
  drivers/vfio/vfio_iommu_spapr_tce.c | 17 +++++++++--------
  drivers/vfio/vfio_iommu_type1.c     | 10 ++++++----
  fs/proc/task_mmu.c                  |  2 +-
  include/linux/mm_types.h            |  2 +-
  kernel/fork.c                       |  2 +-
  mm/debug.c                          |  5 +++--
  mm/mlock.c                          |  4 ++--
  mm/mmap.c                           | 18 +++++++++---------
  mm/mremap.c                         |  6 +++---
  12 files changed, 61 insertions(+), 52 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index f02b04973710..e7fdb6d10eeb 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -59,32 +59,34 @@ static unsigned long kvmppc_stt_pages(unsigned long tce_pages)
 static long kvmppc_account_memlimit(unsigned long stt_pages, bool inc)
 {
 	long ret = 0;
+	s64 locked_vm;
 
 	if (!current || !current->mm)
 		return ret; /* process exited */
 
 	down_write(&current->mm->mmap_sem);
 
+	locked_vm = atomic64_read(&current->mm->locked_vm);
 	if (inc) {
 		unsigned long locked, lock_limit;
 
-		locked = current->mm->locked_vm + stt_pages;
+		locked = locked_vm + stt_pages;
 		lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
 		if (locked > lock_limit && !capable(CAP_IPC_LOCK))
 			ret = -ENOMEM;
 		else
-			current->mm->locked_vm += stt_pages;
+			atomic64_add(stt_pages, &current->mm->locked_vm);
 	} else {
-		if (WARN_ON_ONCE(stt_pages > current->mm->locked_vm))
-			stt_pages = current->mm->locked_vm;
+		if (WARN_ON_ONCE(stt_pages > locked_vm))
+			stt_pages = locked_vm;
 
-		current->mm->locked_vm -= stt_pages;
+		atomic64_sub(stt_pages, &current->mm->locked_vm);
 	}
 
 	pr_debug("[%d] RLIMIT_MEMLOCK KVM %c%ld %ld/%ld%s\n", current->pid,
 			inc ? '+' : '-',
 			stt_pages << PAGE_SHIFT,
-			current->mm->locked_vm << PAGE_SHIFT,
+			atomic64_read(&current->mm->locked_vm) << PAGE_SHIFT,
 			rlimit(RLIMIT_MEMLOCK),
 			ret ? " - exceeded" : "");
 
diff --git a/arch/powerpc/mm/mmu_context_iommu.c b/arch/powerpc/mm/mmu_context_iommu.c
index e7a9c4f6bfca..8038ac24a312 100644
--- a/arch/powerpc/mm/mmu_context_iommu.c
+++ b/arch/powerpc/mm/mmu_context_iommu.c
@@ -55,30 +55,31 @@ static long mm_iommu_adjust_locked_vm(struct mm_struct *mm,
 		unsigned long npages, bool incr)
 {
 	long ret = 0, locked, lock_limit;
+	s64 locked_vm;
 
 	if (!npages)
 		return 0;
 
 	down_write(&mm->mmap_sem);
-
+	locked_vm = atomic64_read(&mm->locked_vm);
 	if (incr) {
-		locked = mm->locked_vm + npages;
+		locked = locked_vm + npages;
 		lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
 		if (locked > lock_limit && !capable(CAP_IPC_LOCK))
 			ret = -ENOMEM;
 		else
-			mm->locked_vm += npages;
+			atomic64_add(npages, &mm->locked_vm);
 	} else {
-		if (WARN_ON_ONCE(npages > mm->locked_vm))
-			npages = mm->locked_vm;
-		mm->locked_vm -= npages;
+		if (WARN_ON_ONCE(npages > locked_vm))
+			npages = locked_vm;
+		atomic64_sub(npages, &mm->locked_vm);
 	}
 
 	pr_debug("[%d] RLIMIT_MEMLOCK HASH64 %c%ld %ld/%ld\n",
 		current ? current->pid : 0,
 		incr ? '+' : '-',
 		npages << PAGE_SHIFT,
-

Re: [PATCH 1/6] mm: change locked_vm's type from unsigned long to atomic64_t

2019-04-02 Thread Davidlohr Bueso

On Tue, 02 Apr 2019, Andrew Morton wrote:


Also, we didn't remove any down_write(mmap_sem)s from core code so I'm
thinking that the benefit of removing a few mmap_sem-takings from a few
obscure drivers (sorry ;)) is pretty small.


afaik porting the remaining incorrect users of locked_vm to pinned_vm was
the next step before this one, which made converting locked_vm to atomic
hardly worth it. Daniel?

Thanks,
Davidlohr


Re: [PATCH 1/6] mm: change locked_vm's type from unsigned long to atomic64_t

2019-04-02 Thread Andrew Morton
On Tue,  2 Apr 2019 16:41:53 -0400 Daniel Jordan wrote:

> Taking and dropping mmap_sem to modify a single counter, locked_vm, is
> overkill when the counter could be synchronized separately.
> 
> Make mmap_sem a little less coarse by changing locked_vm to an atomic,
> the 64-bit variety to avoid issues with overflow on 32-bit systems.
> 
> ...
>
> --- a/arch/powerpc/kvm/book3s_64_vio.c
> +++ b/arch/powerpc/kvm/book3s_64_vio.c
> @@ -59,32 +59,34 @@ static unsigned long kvmppc_stt_pages(unsigned long tce_pages)
>  static long kvmppc_account_memlimit(unsigned long stt_pages, bool inc)
>  {
>   long ret = 0;
> + s64 locked_vm;
>  
>   if (!current || !current->mm)
>   return ret; /* process exited */
>  
> > down_write(&current->mm->mmap_sem);
>  
> > + locked_vm = atomic64_read(&current->mm->locked_vm);
>   if (inc) {
>   unsigned long locked, lock_limit;
>  
> - locked = current->mm->locked_vm + stt_pages;
> + locked = locked_vm + stt_pages;
>   lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
>   if (locked > lock_limit && !capable(CAP_IPC_LOCK))
>   ret = -ENOMEM;
>   else
> - current->mm->locked_vm += stt_pages;
> > + atomic64_add(stt_pages, &current->mm->locked_vm);
>   } else {
> - if (WARN_ON_ONCE(stt_pages > current->mm->locked_vm))
> - stt_pages = current->mm->locked_vm;
> + if (WARN_ON_ONCE(stt_pages > locked_vm))
> + stt_pages = locked_vm;
>  
> - current->mm->locked_vm -= stt_pages;
> > + atomic64_sub(stt_pages, &current->mm->locked_vm);
>   }

With the current code, current->mm->locked_vm cannot go negative. 
After the patch, it can go negative.  If someone else decreased
current->mm->locked_vm between this function's atomic64_read() and
atomic64_sub().
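
Spelled out, the window being described (hypothetical interleaving of two
tasks sharing one mm, with stt_pages == 10; writing mm for current->mm):

	locked_vm = atomic64_read(&mm->locked_vm);	/* reads 10 */
	/* ... another task subtracts 8; the counter is now 2 ... */
	if (WARN_ON_ONCE(stt_pages > locked_vm))	/* 10 > 10 is false */
		stt_pages = locked_vm;
	atomic64_sub(stt_pages, &mm->locked_vm);	/* 2 - 10 == -8 */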

I guess this is a can't-happen in this case because the racing code
which performed the modification would have taken it negative anyway.

But this all makes me rather queasy.


Also, we didn't remove any down_write(mmap_sem)s from core code so I'm
thinking that the benefit of removing a few mmap_sem-takings from a few
obscure drivers (sorry ;)) is pretty small.


Also, the argument for switching 32-bit arches to a 64-bit counter was
suspiciously vague.  What overflow issues?  Or are we just being lazy?



[PATCH 1/6] mm: change locked_vm's type from unsigned long to atomic64_t

2019-04-02 Thread Daniel Jordan
Taking and dropping mmap_sem to modify a single counter, locked_vm, is
overkill when the counter could be synchronized separately.

Make mmap_sem a little less coarse by changing locked_vm to an atomic,
the 64-bit variety to avoid issues with overflow on 32-bit systems.

Signed-off-by: Daniel Jordan 
Cc: Alan Tull 
Cc: Alexey Kardashevskiy 
Cc: Alex Williamson 
Cc: Andrew Morton 
Cc: Benjamin Herrenschmidt 
Cc: Christoph Lameter 
Cc: Davidlohr Bueso 
Cc: Michael Ellerman 
Cc: Moritz Fischer 
Cc: Paul Mackerras 
Cc: Wu Hao 
Cc: 
Cc: 
Cc: 
Cc: 
Cc: 
Cc: 
---
 arch/powerpc/kvm/book3s_64_vio.c    | 14 ++++++++------
 arch/powerpc/mm/mmu_context_iommu.c | 15 ++++++++-------
 drivers/fpga/dfl-afu-dma-region.c   | 18 ++++++++++--------
 drivers/vfio/vfio_iommu_spapr_tce.c | 17 +++++++++--------
 drivers/vfio/vfio_iommu_type1.c     | 10 ++++++----
 fs/proc/task_mmu.c                  |  2 +-
 include/linux/mm_types.h            |  2 +-
 kernel/fork.c                       |  2 +-
 mm/debug.c                          |  5 +++--
 mm/mlock.c                          |  4 ++--
 mm/mmap.c                           | 18 +++++++++---------
 mm/mremap.c                         |  6 +++---
 12 files changed, 61 insertions(+), 52 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index f02b04973710..e7fdb6d10eeb 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -59,32 +59,34 @@ static unsigned long kvmppc_stt_pages(unsigned long tce_pages)
 static long kvmppc_account_memlimit(unsigned long stt_pages, bool inc)
 {
long ret = 0;
+   s64 locked_vm;
 
if (!current || !current->mm)
return ret; /* process exited */
 
	down_write(&current->mm->mmap_sem);
 
+   locked_vm = atomic64_read(&current->mm->locked_vm);
if (inc) {
unsigned long locked, lock_limit;
 
-   locked = current->mm->locked_vm + stt_pages;
+   locked = locked_vm + stt_pages;
lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
if (locked > lock_limit && !capable(CAP_IPC_LOCK))
ret = -ENOMEM;
else
-   current->mm->locked_vm += stt_pages;
+   atomic64_add(stt_pages, &current->mm->locked_vm);
} else {
-   if (WARN_ON_ONCE(stt_pages > current->mm->locked_vm))
-   stt_pages = current->mm->locked_vm;
+   if (WARN_ON_ONCE(stt_pages > locked_vm))
+   stt_pages = locked_vm;
 
-   current->mm->locked_vm -= stt_pages;
+   atomic64_sub(stt_pages, &current->mm->locked_vm);
}
 
pr_debug("[%d] RLIMIT_MEMLOCK KVM %c%ld %ld/%ld%s\n", current->pid,
inc ? '+' : '-',
stt_pages << PAGE_SHIFT,
-   current->mm->locked_vm << PAGE_SHIFT,
+   atomic64_read(&current->mm->locked_vm) << PAGE_SHIFT,
rlimit(RLIMIT_MEMLOCK),
ret ? " - exceeded" : "");
 
diff --git a/arch/powerpc/mm/mmu_context_iommu.c b/arch/powerpc/mm/mmu_context_iommu.c
index e7a9c4f6bfca..8038ac24a312 100644
--- a/arch/powerpc/mm/mmu_context_iommu.c
+++ b/arch/powerpc/mm/mmu_context_iommu.c
@@ -55,30 +55,31 @@ static long mm_iommu_adjust_locked_vm(struct mm_struct *mm,
unsigned long npages, bool incr)
 {
long ret = 0, locked, lock_limit;
+   s64 locked_vm;
 
if (!npages)
return 0;
 
	down_write(&mm->mmap_sem);
-
+   locked_vm = atomic64_read(&mm->locked_vm);
if (incr) {
-   locked = mm->locked_vm + npages;
+   locked = locked_vm + npages;
lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
if (locked > lock_limit && !capable(CAP_IPC_LOCK))
ret = -ENOMEM;
else
-   mm->locked_vm += npages;
+   atomic64_add(npages, &mm->locked_vm);
} else {
-   if (WARN_ON_ONCE(npages > mm->locked_vm))
-   npages = mm->locked_vm;
-   mm->locked_vm -= npages;
+   if (WARN_ON_ONCE(npages > locked_vm))
+   npages = locked_vm;
+   atomic64_sub(npages, &mm->locked_vm);
}
 
pr_debug("[%d] RLIMIT_MEMLOCK HASH64 %c%ld %ld/%ld\n",
current ? current->pid : 0,
incr ? '+' : '-',
npages << PAGE_SHIFT,
-   mm->locked_vm << PAGE_SHIFT,
+   atomic64_read(&mm->locked_vm) << PAGE_SHIFT,
rlimit(RLIMIT_MEMLOCK));
	up_write(&mm->mmap_sem);
 
diff --git a/drivers/fpga/dfl-afu-dma-region.c b/drivers/fpga/dfl-afu-dma-region.c
index e18a786fc943..08132fd9b6b7 100644
--- a/drivers/fpga/dfl-afu-dma-region.c
+++ b/drivers/fpga/dfl-afu-dma-region.c
@@ -45,6