Re: [PATCH 3/3] vdso: preallocate new vmas

2013-10-24 Thread Davidlohr Bueso
On Wed, 2013-10-23 at 02:53 -0700, wal...@google.com wrote:
> On Sun, Oct 20, 2013 at 08:26:15PM -0700, Davidlohr Bueso wrote:
> > From: Davidlohr Bueso 
> > Subject: [PATCH v2 3/3] vdso: preallocate new vmas
> > 
> > With the exception of um and tile, architectures that use
> > the install_special_mapping() function, when setting up a
> > new vma at program startup, do so with the mmap_sem lock
> > held for writing. Unless there's an error, this process
> > ends up allocating a new vma through kmem_cache_zalloc,
> > and inserting it in the task's address space.
> > 
> > This patch moves the vma's space allocation outside of
> > install_special_mapping(), and leaves the callers to do so
> > explicitly, without depending on mmap_sem. The same goes for
> > freeing: if the new vma isn't used (and thus the process fails
> > at some point), it's caller's responsibility to free it -
> > currently this is done inside install_special_mapping.
> > 
> > Furthermore, uprobes behaves exactly the same and thus now the
> > xol_add_vma() function also preallocates the new vma.
> > 
> > While the changes to x86 vdso handling have been tested on both
> > large and small 64-bit systems, the rest of the architectures
> > are totally *untested*. Note that all changes are quite similar
> > from architecture to architecture.
> > 
> > Signed-off-by: Davidlohr Bueso 
> > Cc: Russell King 
> > Cc: Catalin Marinas 
> > Cc: Will Deacon 
> > Cc: Richard Kuo 
> > Cc: Ralf Baechle 
> > Cc: Benjamin Herrenschmidt 
> > Cc: Paul Mackerras 
> > Cc: Martin Schwidefsky 
> > Cc: Heiko Carstens 
> > Cc: Paul Mundt 
> > Cc: Chris Metcalf 
> > Cc: Jeff Dike 
> > Cc: Richard Weinberger 
> > Cc: Thomas Gleixner 
> > Cc: Ingo Molnar 
> > Cc: "H. Peter Anvin" 
> > Cc: Peter Zijlstra 
> > ---
> > v2:
> > - Simplify install_special_mapping interface (Linus Torvalds)
> > - Fix return for uml_setup_stubs when mem allocation fails (Richard 
> > Weinberger)
> 
> I'm still confused as to why you're seeing any gains with this
> one. This code runs during exec when mm isn't shared with any other
> threads yet, so why does it matter how long the mmap_sem is held since
> nobody else can contend on it ? (well, except for accesses from
> /fs/proc/base.c, but I don't see why these would matter in your
> benchmarks either).

Yeah, that's why I dropped the performance numbers from the changelog in
v2, of course any differences are within the noise range. When I did the
initial runs I was scratching my head as to why I was seeing benefits,
but it was most likely a matter of clock frequency differences, and I no
longer see such boosts.

Thanks,
Davidlohr


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] vdso: preallocate new vmas

2013-10-23 Thread walken
On Sun, Oct 20, 2013 at 08:26:15PM -0700, Davidlohr Bueso wrote:
> From: Davidlohr Bueso 
> Subject: [PATCH v2 3/3] vdso: preallocate new vmas
> 
> With the exception of um and tile, architectures that use
> the install_special_mapping() function, when setting up a
> new vma at program startup, do so with the mmap_sem lock
> held for writing. Unless there's an error, this process
> ends up allocating a new vma through kmem_cache_zalloc,
> and inserting it in the task's address space.
> 
> This patch moves the vma's space allocation outside of
> install_special_mapping(), and leaves the callers to do so
> explicitly, without depending on mmap_sem. The same goes for
> freeing: if the new vma isn't used (and thus the process fails
> at some point), it's caller's responsibility to free it -
> currently this is done inside install_special_mapping.
> 
> Furthermore, uprobes behaves exactly the same and thus now the
> xol_add_vma() function also preallocates the new vma.
> 
> While the changes to x86 vdso handling have been tested on both
> large and small 64-bit systems, the rest of the architectures
> are totally *untested*. Note that all changes are quite similar
> from architecture to architecture.
> 
> Signed-off-by: Davidlohr Bueso 
> Cc: Russell King 
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Richard Kuo 
> Cc: Ralf Baechle 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Martin Schwidefsky 
> Cc: Heiko Carstens 
> Cc: Paul Mundt 
> Cc: Chris Metcalf 
> Cc: Jeff Dike 
> Cc: Richard Weinberger 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: "H. Peter Anvin" 
> Cc: Peter Zijlstra 
> ---
> v2:
> - Simplify install_special_mapping interface (Linus Torvalds)
> - Fix return for uml_setup_stubs when mem allocation fails (Richard 
> Weinberger)

I'm still confused as to why you're seeing any gains with this
one. This code runs during exec when mm isn't shared with any other
threads yet, so why does it matter how long the mmap_sem is held since
nobody else can contend on it ? (well, except for accesses from
/fs/proc/base.c, but I don't see why these would matter in your
benchmarks either).

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] vdso: preallocate new vmas

2013-10-20 Thread Davidlohr Bueso
From: Davidlohr Bueso 
Subject: [PATCH v2 3/3] vdso: preallocate new vmas

With the exception of um and tile, architectures that use
the install_special_mapping() function, when setting up a
new vma at program startup, do so with the mmap_sem lock
held for writing. Unless there's an error, this process
ends up allocating a new vma through kmem_cache_zalloc,
and inserting it in the task's address space.

This patch moves the vma's space allocation outside of
install_special_mapping(), and leaves the callers to do so
explicitly, without depending on mmap_sem. The same goes for
freeing: if the new vma isn't used (and thus the process fails
at some point), it's caller's responsibility to free it -
currently this is done inside install_special_mapping.

Furthermore, uprobes behaves exactly the same and thus now the
xol_add_vma() function also preallocates the new vma.

While the changes to x86 vdso handling have been tested on both
large and small 64-bit systems, the rest of the architectures
are totally *untested*. Note that all changes are quite similar
from architecture to architecture.

Signed-off-by: Davidlohr Bueso 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Richard Kuo 
Cc: Ralf Baechle 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Martin Schwidefsky 
Cc: Heiko Carstens 
Cc: Paul Mundt 
Cc: Chris Metcalf 
Cc: Jeff Dike 
Cc: Richard Weinberger 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Peter Zijlstra 
---
v2:
- Simplify install_special_mapping interface (Linus Torvalds)
- Fix return for uml_setup_stubs when mem allocation fails (Richard Weinberger)

 arch/arm/kernel/process.c  | 22 --
 arch/arm64/kernel/vdso.c   | 21 +
 arch/hexagon/kernel/vdso.c | 16 
 arch/mips/kernel/vdso.c| 10 +-
 arch/powerpc/kernel/vdso.c | 11 ---
 arch/s390/kernel/vdso.c| 19 +++
 arch/sh/kernel/vsyscall/vsyscall.c | 11 ++-
 arch/tile/kernel/vdso.c| 13 ++---
 arch/um/kernel/skas/mmu.c  | 16 +++-
 arch/unicore32/kernel/process.c| 17 -
 arch/x86/um/vdso/vma.c | 18 ++
 arch/x86/vdso/vdso32-setup.c   | 16 +++-
 arch/x86/vdso/vma.c| 10 +-
 include/linux/mm.h |  3 ++-
 kernel/events/uprobes.c| 14 --
 mm/mmap.c  | 17 ++---
 16 files changed, 178 insertions(+), 56 deletions(-)

diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index 94f6b05..d1eb115 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -480,6 +481,7 @@ extern struct page *get_signal_page(void);
 int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 {
struct mm_struct *mm = current->mm;
+   struct vm_area_struct *vma;
unsigned long addr;
int ret;
 
@@ -488,6 +490,10 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, 
int uses_interp)
if (!signal_page)
return -ENOMEM;
 
+   vma = kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL);
+   if (unlikely(!vma))
+   return -ENOMEM;
+
down_write(&mm->mmap_sem);
addr = get_unmapped_area(NULL, 0, PAGE_SIZE, 0, 0);
if (IS_ERR_VALUE(addr)) {
@@ -496,14 +502,18 @@ int arch_setup_additional_pages(struct linux_binprm 
*bprm, int uses_interp)
}
 
ret = install_special_mapping(mm, addr, PAGE_SIZE,
-   VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC,
-   &signal_page);
-
-   if (ret == 0)
-   mm->context.sigpage = addr;
+ VM_READ | VM_EXEC | VM_MAYREAD |
+ VM_MAYWRITE | VM_MAYEXEC,
+ &signal_page, vma);
+   if (ret)
+   goto up_fail;
 
- up_fail:
+   mm->context.sigpage = addr;
+   up_write(&mm->mmap_sem);
+   return 0;
+up_fail:
up_write(&mm->mmap_sem);
+   kmem_cache_free(vm_area_cachep, vma);
return ret;
 }
 #endif
diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c
index 6a389dc..06a01ea 100644
--- a/arch/arm64/kernel/vdso.c
+++ b/arch/arm64/kernel/vdso.c
@@ -83,20 +83,26 @@ arch_initcall(alloc_vectors_page);
 
 int aarch32_setup_vectors_page(struct linux_binprm *bprm, int uses_interp)
 {
+   struct vm_area_struct *vma;
struct mm_struct *mm = current->mm;
unsigned long addr = AARCH32_VECTORS_BASE;
int ret;
 
+   vma = kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL);
+   if (unlikely(!vma))
+   return -ENOMEM;
+
down_write(&mm->mmap_sem);
current->mm->context.vdso = (void *)addr;
 
/* Map vectors

Re: [PATCH 3/3] vdso: preallocate new vmas

2013-10-17 Thread Richard Weinberger
Am 18.10.2013 02:50, schrieb Davidlohr Bueso:
> With the exception of um and tile, architectures that use
> the install_special_mapping() function, when setting up a
> new vma at program startup, do so with the mmap_sem lock
> held for writing. Unless there's an error, this process
> ends up allocating a new vma through kmem_cache_zalloc,
> and inserting it in the task's address space.
> 
> This patch moves the vma's space allocation outside of
> install_special_mapping(), and leaves the callers to do so 
> explicitly, without depending on mmap_sem. The same goes for 
> freeing: if the new vma isn't used (and thus the process fails 
> at some point), it's caller's responsibility to free it - 
> currently this is done inside install_special_mapping.
> 
> Furthermore, uprobes behaves exactly the same and thus now the
> xol_add_vma() function also preallocates the new vma.
> 
> While the changes to x86 vdso handling have been tested on both
> large and small 64-bit systems, the rest of the architectures
> are totally *untested*. Note that all changes are quite similar
> from architecture to architecture.
> 
> This patch, when tested on a 64core, 256 Gb NUMA server, benefited
> several aim7 workloads: long +27% throughput with over 1500 users;
> compute +6.5% with over 1000 users; fserver +83% for small amounts
> of users (10-100 range) and +9% for more and new_fserver, showing
> a similar behavior, got +67% boost with 100 users and an avg of +8%
> when more users were added.
> 
> Signed-off-by: Davidlohr Bueso 
> Cc: Russell King 
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Richard Kuo 
> Cc: Ralf Baechle 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Martin Schwidefsky 
> Cc: Heiko Carstens 
> Cc: Paul Mundt 
> Cc: Chris Metcalf 
> Cc: Jeff Dike 
> Cc: Richard Weinberger 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: "H. Peter Anvin" 
> Cc: Peter Zijlstra 
> ---
>  arch/arm/kernel/process.c  | 22 --
>  arch/arm64/kernel/vdso.c   | 21 +
>  arch/hexagon/kernel/vdso.c | 16 
>  arch/mips/kernel/vdso.c| 10 +-
>  arch/powerpc/kernel/vdso.c | 11 ---
>  arch/s390/kernel/vdso.c| 19 +++
>  arch/sh/kernel/vsyscall/vsyscall.c | 11 ++-
>  arch/tile/kernel/vdso.c| 13 ++---
>  arch/um/kernel/skas/mmu.c  | 16 +++-
>  arch/unicore32/kernel/process.c| 17 -
>  arch/x86/um/vdso/vma.c | 18 ++
>  arch/x86/vdso/vdso32-setup.c   | 16 +++-
>  arch/x86/vdso/vma.c| 10 +-
>  include/linux/mm.h |  3 ++-
>  kernel/events/uprobes.c| 14 --
>  mm/mmap.c  | 18 +++---
>  16 files changed, 179 insertions(+), 56 deletions(-)
> 
> diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
> index 94f6b05..5637c92 100644
> --- a/arch/arm/kernel/process.c
> +++ b/arch/arm/kernel/process.c
> @@ -13,6 +13,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -480,6 +481,7 @@ extern struct page *get_signal_page(void);
>  int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
>  {
>   struct mm_struct *mm = current->mm;
> + struct vm_area_struct *vma;
>   unsigned long addr;
>   int ret;
>  
> @@ -488,6 +490,10 @@ int arch_setup_additional_pages(struct linux_binprm 
> *bprm, int uses_interp)
>   if (!signal_page)
>   return -ENOMEM;
>  
> + vma = kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL);
> + if (unlikely(!vma))
> + return -ENOMEM;
> +
>   down_write(&mm->mmap_sem);
>   addr = get_unmapped_area(NULL, 0, PAGE_SIZE, 0, 0);
>   if (IS_ERR_VALUE(addr)) {
> @@ -496,14 +502,18 @@ int arch_setup_additional_pages(struct linux_binprm 
> *bprm, int uses_interp)
>   }
>  
>   ret = install_special_mapping(mm, addr, PAGE_SIZE,
> - VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC,
> - &signal_page);
> -
> - if (ret == 0)
> - mm->context.sigpage = addr;
> +   VM_READ | VM_EXEC | VM_MAYREAD |
> +   VM_MAYWRITE | VM_MAYEXEC,
> +   &signal_page, &vma);
> + if (ret)
> + goto up_fail;
>  
> - up_fail:
> + mm->context.sigpage = addr;
> + up_write(&mm->mmap_sem);
> + return 0;
> +up_fail:
>   up_write(&mm->mmap_sem);
> + kmem_cache_free(vm_area_cachep, vma);
>   return ret;
>  }
>  #endif
> diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c
> index 6a389dc..519a44c 100644
> --- a/arch/arm64/kernel/vdso.c
> +++ b/arch/arm64/kernel/vdso.c
> @@ -83,20 +83,26 @@ arch_initcall(alloc_vectors_page);
>  
>  int aarch32_setup_vectors_page(struct linux_binprm *bprm, int 

Re: [PATCH 3/3] vdso: preallocate new vmas

2013-10-17 Thread Linus Torvalds
This seems somewhat insane:

   int install_special_mapping(struct mm_struct *mm,
  unsigned long addr, unsigned long len,
  +   unsigned long vm_flags, struct page **pages,
  +   struct vm_area_struct **vma_prealloc)
   {
  +   int ret = 0;
  +   struct vm_area_struct *vma = *vma_prealloc;

(removed the "old" lines to make it more readable).

Why pass in "struct vm_area_struct **vma_prealloc" when you could just
pass in a plain and more readable "struct vm_area_struct *vma"?

My *guess* is that you originally cleared the vma_prealloc thing if
you used it, but in the patch you sent out you definitely don't (the
_only_ use of that "vma_prealloc" is the line that loads the content
into "vma", so this interface looks like it is some remnant of an
earlier and more complicated patch?

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/