Re: [Qemu-devel] [PATCH 10/17] mm: rmap preparation for remap_anon_pages

2014-10-07 Thread Andrea Arcangeli
Hi Kirill, On Tue, Oct 07, 2014 at 02:10:26PM +0300, Kirill A. Shutemov wrote: On Fri, Oct 03, 2014 at 07:08:00PM +0200, Andrea Arcangeli wrote: There's one constraint enforced to allow this simplification: the source pages passed to remap_anon_pages must be mapped only in one vma

Re: [PATCH 10/17] mm: rmap preparation for remap_anon_pages

2014-10-07 Thread Andrea Arcangeli
Hello, On Tue, Oct 07, 2014 at 08:47:59AM -0400, Linus Torvalds wrote: On Mon, Oct 6, 2014 at 12:41 PM, Andrea Arcangeli aarca...@redhat.com wrote: Of course if somebody has better ideas on how to resolve an anonymous userfault they're welcome. So I'd *much* rather have a write() style

Re: [PATCH 10/17] mm: rmap preparation for remap_anon_pages

2014-10-07 Thread Andrea Arcangeli
On Tue, Oct 07, 2014 at 04:19:13PM +0200, Andrea Arcangeli wrote: mremap like interface, or file+commands protocol interface. I tend to like mremap more, that's why I opted for a remap_anon_pages syscall kept orthogonal to the userfaultfd functionality (remap_anon_pages could be also used

Re: [PATCH 08/17] mm: madvise MADV_USERFAULT

2014-10-06 Thread Andrea Arcangeli
Hi, On Sat, Oct 04, 2014 at 08:13:36AM +0900, Mike Hommey wrote: > On Fri, Oct 03, 2014 at 07:07:58PM +0200, Andrea Arcangeli wrote: > > MADV_USERFAULT is a new madvise flag that will set VM_USERFAULT in the > > vma flags. Whenever VM_USERFAULT is set in an anonymous vma, if >

Re: [PATCH 12/17] mm: sys_remap_anon_pages

2014-10-06 Thread Andrea Arcangeli
Hi, On Sat, Oct 04, 2014 at 06:13:27AM -0700, Andi Kleen wrote: > Andrea Arcangeli writes: > > > This new syscall will move anon pages across vmas, atomically and > > without touching the vmas. > > > > It only works on non shared anonymous pages because thos

Re: [PATCH 10/17] mm: rmap preparation for remap_anon_pages

2014-10-06 Thread Andrea Arcangeli
Hello, On Mon, Oct 06, 2014 at 09:55:41AM +0100, Dr. David Alan Gilbert wrote: > * Linus Torvalds (torva...@linux-foundation.org) wrote: > > On Fri, Oct 3, 2014 at 10:08 AM, Andrea Arcangeli > > wrote: > > > > > > Overall this looks a fairly small change to

Re: [PATCH 04/17] mm: gup: make get_user_pages_fast and __get_user_pages_fast latency conscious

2014-10-06 Thread Andrea Arcangeli
Hello, On Fri, Oct 03, 2014 at 11:23:53AM -0700, Linus Torvalds wrote: > On Fri, Oct 3, 2014 at 10:07 AM, Andrea Arcangeli wrote: > > This teaches gup_fast and __gup_fast to re-enable irqs and > > cond_resched() if possible every BATCH_PAGES. > > This is disgusting. >

Re: [PATCH 04/17] mm: gup: make get_user_pages_fast and __get_user_pages_fast latency conscious

2014-10-06 Thread Andrea Arcangeli
Hello, On Fri, Oct 03, 2014 at 11:23:53AM -0700, Linus Torvalds wrote: On Fri, Oct 3, 2014 at 10:07 AM, Andrea Arcangeli aarca...@redhat.com wrote: This teaches gup_fast and __gup_fast to re-enable irqs and cond_resched() if possible every BATCH_PAGES. This is disgusting. Many (most

Re: [PATCH 10/17] mm: rmap preparation for remap_anon_pages

2014-10-06 Thread Andrea Arcangeli
Hello, On Mon, Oct 06, 2014 at 09:55:41AM +0100, Dr. David Alan Gilbert wrote: * Linus Torvalds (torva...@linux-foundation.org) wrote: On Fri, Oct 3, 2014 at 10:08 AM, Andrea Arcangeli aarca...@redhat.com wrote: Overall this looks a fairly small change to the rmap code, notably

Re: [PATCH 12/17] mm: sys_remap_anon_pages

2014-10-06 Thread Andrea Arcangeli
Hi, On Sat, Oct 04, 2014 at 06:13:27AM -0700, Andi Kleen wrote: Andrea Arcangeli aarca...@redhat.com writes: This new syscall will move anon pages across vmas, atomically and without touching the vmas. It only works on non shared anonymous pages because those can be relocated without

Re: [PATCH 08/17] mm: madvise MADV_USERFAULT

2014-10-06 Thread Andrea Arcangeli
Hi, On Sat, Oct 04, 2014 at 08:13:36AM +0900, Mike Hommey wrote: On Fri, Oct 03, 2014 at 07:07:58PM +0200, Andrea Arcangeli wrote: MADV_USERFAULT is a new madvise flag that will set VM_USERFAULT in the vma flags. Whenever VM_USERFAULT is set in an anonymous vma, if userland touches a still

[PATCH 08/17] mm: madvise MADV_USERFAULT

2014-10-03 Thread Andrea Arcangeli
exclusive if set. Signed-off-by: Andrea Arcangeli --- arch/alpha/include/uapi/asm/mman.h | 3 ++ arch/mips/include/uapi/asm/mman.h | 3 ++ arch/parisc/include/uapi/asm/mman.h| 3 ++ arch/xtensa/include/uapi/asm/mman.h| 3 ++ fs/proc/task_mmu.c | 1

[PATCH 17/17] userfaultfd: implement USERFAULTFD_RANGE_REGISTER|UNREGISTER

2014-10-03 Thread Andrea Arcangeli
alling ptrace). We could also decide to retain the current -EFAULT behavior of ptrace using get_user_pages_locked with a NULL locked parameter so the FAULT_FLAG_ALLOW_RETRY flag will not be set. Either ways would be safe. Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c

[PATCH 12/17] mm: sys_remap_anon_pages

2014-10-03 Thread Andrea Arcangeli
] = 0xbb; } if (c[i] != 0xaa) printf("error %x offset %lu\n", c[i], i), exit(1); } printf("remap_anon_pages functions correctly\n"); return 0; } === Signed-off-by: Andrea Arcangeli --- ar

[PATCH 02/17] mm: gup: add get_user_pages_locked and get_user_pages_unlocked

2014-10-03 Thread Andrea Arcangeli
is not current->mm (like when get_user_pages works on some other process mm). Whenever tsk and mm matches current and current->mm get_user_pages_fast must always be used to increase performance and get the page lockless (only with irq disabled). Signed-off-by: Andrea Arcangeli Rev

[PATCH 05/17] mm: gup: use get_user_pages_fast and get_user_pages_unlocked

2014-10-03 Thread Andrea Arcangeli
Just an optimization. Signed-off-by: Andrea Arcangeli --- drivers/dma/iovlock.c | 10 ++ drivers/iommu/amd_iommu_v2.c | 6 ++ drivers/media/pci/ivtv/ivtv-udma.c | 6 ++ drivers/scsi/st.c | 10 ++ drivers/video/fbdev/pvr2fb.c

[PATCH 10/17] mm: rmap preparation for remap_anon_pages

2014-10-03 Thread Andrea Arcangeli
runs. Signed-off-by: Andrea Arcangeli --- mm/huge_memory.c | 24 mm/rmap.c| 9 + 2 files changed, 29 insertions(+), 4 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index b402d60..4277ed7 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c

[PATCH 07/17] mm: madvise MADV_USERFAULT: prepare vm_flags to allow more than 32bits

2014-10-03 Thread Andrea Arcangeli
We run out of 32bits in vm_flags, noop change for 64bit archs. Signed-off-by: Andrea Arcangeli --- fs/proc/task_mmu.c | 4 ++-- include/linux/huge_mm.h | 4 ++-- include/linux/ksm.h | 4 ++-- include/linux/mm_types.h | 2 +- mm/huge_memory.c | 2 +- mm/ksm.c

[PATCH 09/17] mm: PT lock: export double_pt_lock/unlock

2014-10-03 Thread Andrea Arcangeli
Those two helpers are needed by remap_anon_pages. Signed-off-by: Andrea Arcangeli --- include/linux/mm.h | 4 mm/fremap.c| 29 + 2 files changed, 33 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index bf3df07..71dbe03 100644

[PATCH 00/17] RFC: userfault v2

2014-10-03 Thread Andrea Arcangeli
can be found here: git clone --reference linux git://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git -b userfault The branch is rebased so you can get updates for example with: git fetch && git checkout -f origin/userfault Comments welcome, thanks! Andrea Andrea Arcangeli (15):

[PATCH 04/17] mm: gup: make get_user_pages_fast and __get_user_pages_fast latency conscious

2014-10-03 Thread Andrea Arcangeli
get_user_pages_unlocked which would be slower). Signed-off-by: Andrea Arcangeli --- arch/x86/mm/gup.c | 234 ++ 1 file changed, 149 insertions(+), 85 deletions(-) diff --git a/arch/x86/mm/gup.c b/arch/x86/mm/gup.c index 2ab183b..917d8c1 100644 --- a/arch/x86/mm

[PATCH 06/17] kvm: Faults which trigger IO release the mmap_sem

2014-10-03 Thread Andrea Arcangeli
other mmap semaphore users now stall as a function of swap or filemap latency. This patch ensures both the regular and async PF path re-enter the fault allowing for the mmap semaphore to be relinquished in the case of IO wait. Reviewed-by: Radim Krčmář Signed-off-by: Andres Lagar-Cavilla Signed-off-

[PATCH 15/17] userfaultfd: make userfaultfd_write non blocking

2014-10-03 Thread Andrea Arcangeli
address. But we should still return an error so if the application thinks this occurrence can never happen it will know it hit a bug. So just return -ENOENT instead of blocking. Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 34 +- 1 file changed, 5 insertions

[PATCH 01/17] mm: gup: add FOLL_TRIED

2014-10-03 Thread Andrea Arcangeli
From: Andres Lagar-Cavilla Reviewed-by: Radim Krčmář Signed-off-by: Andres Lagar-Cavilla Signed-off-by: Andrea Arcangeli --- include/linux/mm.h | 1 + mm/gup.c | 4 2 files changed, 5 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index 8981cc8..0f4196a

[PATCH 14/17] userfaultfd: add new syscall to provide memory externalization

2014-10-03 Thread Andrea Arcangeli
userfaults to read (POLLIN) and when there are threads waiting a wakeup through a range write (POLLOUT). Signed-off-by: Andrea Arcangeli --- arch/x86/syscalls/syscall_32.tbl | 1 + arch/x86/syscalls/syscall_64.tbl | 1 + fs/Makefile | 1 + fs/userfaultfd.c

[PATCH 16/17] powerpc: add remap_anon_pages and userfaultfd

2014-10-03 Thread Andrea Arcangeli
Add the syscall numbers. Signed-off-by: Andrea Arcangeli --- arch/powerpc/include/asm/systbl.h | 2 ++ arch/powerpc/include/asm/unistd.h | 2 +- arch/powerpc/include/uapi/asm/unistd.h | 2 ++ 3 files changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm

[PATCH 11/17] mm: swp_entry_swapcount

2014-10-03 Thread Andrea Arcangeli
in some anon_vma. Signed-off-by: Andrea Arcangeli --- include/linux/swap.h | 6 ++ mm/swapfile.c| 13 + 2 files changed, 19 insertions(+) diff --git a/include/linux/swap.h b/include/linux/swap.h index 8197452..af9977c 100644 --- a/include/linux/swap.h +++ b/include/linux

[PATCH 13/17] waitqueue: add nr wake parameter to __wake_up_locked_key

2014-10-03 Thread Andrea Arcangeli
Userfaultfd needs to wake all waitqueues (pass 0 as nr parameter), instead of the current hardcoded 1 (that would wake just the first waitqueue in the head list). Signed-off-by: Andrea Arcangeli --- include/linux/wait.h | 5 +++-- kernel/sched/wait.c | 7 --- net/sunrpc/sched.c | 2 +- 3

[PATCH 03/17] mm: gup: use get_user_pages_unlocked within get_user_pages_fast

2014-10-03 Thread Andrea Arcangeli
Signed-off-by: Andrea Arcangeli --- arch/mips/mm/gup.c | 8 +++- arch/powerpc/mm/gup.c| 6 ++ arch/s390/kvm/kvm-s390.c | 4 +--- arch/s390/mm/gup.c | 6 ++ arch/sh/mm/gup.c | 6 ++ arch/sparc/mm/gup.c | 6 ++ arch/x86/mm/gup.c| 7

[PATCH 03/17] mm: gup: use get_user_pages_unlocked within get_user_pages_fast

2014-10-03 Thread Andrea Arcangeli
Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- arch/mips/mm/gup.c | 8 +++- arch/powerpc/mm/gup.c| 6 ++ arch/s390/kvm/kvm-s390.c | 4 +--- arch/s390/mm/gup.c | 6 ++ arch/sh/mm/gup.c | 6 ++ arch/sparc/mm/gup.c | 6 ++ arch/x86/mm/gup.c

[PATCH 11/17] mm: swp_entry_swapcount

2014-10-03 Thread Andrea Arcangeli
in some anon_vma. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- include/linux/swap.h | 6 ++ mm/swapfile.c| 13 + 2 files changed, 19 insertions(+) diff --git a/include/linux/swap.h b/include/linux/swap.h index 8197452..af9977c 100644 --- a/include/linux/swap.h

[PATCH 13/17] waitqueue: add nr wake parameter to __wake_up_locked_key

2014-10-03 Thread Andrea Arcangeli
Userfaultfd needs to wake all waitqueues (pass 0 as nr parameter), instead of the current hardcoded 1 (that would wake just the first waitqueue in the head list). Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- include/linux/wait.h | 5 +++-- kernel/sched/wait.c | 7 --- net/sunrpc

[PATCH 14/17] userfaultfd: add new syscall to provide memory externalization

2014-10-03 Thread Andrea Arcangeli
userfaults to read (POLLIN) and when there are threads waiting a wakeup through a range write (POLLOUT). Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- arch/x86/syscalls/syscall_32.tbl | 1 + arch/x86/syscalls/syscall_64.tbl | 1 + fs/Makefile | 1 + fs/userfaultfd.c

[PATCH 16/17] powerpc: add remap_anon_pages and userfaultfd

2014-10-03 Thread Andrea Arcangeli
Add the syscall numbers. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- arch/powerpc/include/asm/systbl.h | 2 ++ arch/powerpc/include/asm/unistd.h | 2 +- arch/powerpc/include/uapi/asm/unistd.h | 2 ++ 3 files changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/powerpc

[PATCH 15/17] userfaultfd: make userfaultfd_write non blocking

2014-10-03 Thread Andrea Arcangeli
address. But we should still return an error so if the application thinks this occurrence can never happen it will know it hit a bug. So just return -ENOENT instead of blocking. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- fs/userfaultfd.c | 34 +- 1 file

[PATCH 04/17] mm: gup: make get_user_pages_fast and __get_user_pages_fast latency conscious

2014-10-03 Thread Andrea Arcangeli
get_user_pages_unlocked which would be slower). Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- arch/x86/mm/gup.c | 234 ++ 1 file changed, 149 insertions(+), 85 deletions(-) diff --git a/arch/x86/mm/gup.c b/arch/x86/mm/gup.c index 2ab183b..917d8c1 100644

[PATCH 06/17] kvm: Faults which trigger IO release the mmap_sem

2014-10-03 Thread Andrea Arcangeli
-Cavilla andre...@google.com Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- virt/kvm/async_pf.c | 4 +--- virt/kvm/kvm_main.c | 4 ++-- 2 files changed, 3 insertions(+), 5 deletions(-) diff --git a/virt/kvm/async_pf.c b/virt/kvm/async_pf.c index d6a3d09..44660ae 100644 --- a/virt/kvm

[PATCH 01/17] mm: gup: add FOLL_TRIED

2014-10-03 Thread Andrea Arcangeli
From: Andres Lagar-Cavilla andre...@google.com Reviewed-by: Radim Krčmář rkrc...@redhat.com Signed-off-by: Andres Lagar-Cavilla andre...@google.com Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- include/linux/mm.h | 1 + mm/gup.c | 4 2 files changed, 5 insertions

[PATCH 00/17] RFC: userfault v2

2014-10-03 Thread Andrea Arcangeli
can be found here: git clone --reference linux git://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git -b userfault The branch is rebased so you can get updates for example with: git fetch && git checkout -f origin/userfault Comments welcome, thanks! Andrea Andrea Arcangeli (15): mm: gup

[PATCH 09/17] mm: PT lock: export double_pt_lock/unlock

2014-10-03 Thread Andrea Arcangeli
Those two helpers are needed by remap_anon_pages. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- include/linux/mm.h | 4 mm/fremap.c| 29 + 2 files changed, 33 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index bf3df07

[PATCH 07/17] mm: madvise MADV_USERFAULT: prepare vm_flags to allow more than 32bits

2014-10-03 Thread Andrea Arcangeli
We run out of 32bits in vm_flags, noop change for 64bit archs. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- fs/proc/task_mmu.c | 4 ++-- include/linux/huge_mm.h | 4 ++-- include/linux/ksm.h | 4 ++-- include/linux/mm_types.h | 2 +- mm/huge_memory.c | 2 +- mm

[PATCH 10/17] mm: rmap preparation for remap_anon_pages

2014-10-03 Thread Andrea Arcangeli
userland page faults with MADV_USERFAULT. The source addresses passed to remap_anon_pages should be set as VM_DONTCOPY with MADV_DONTFORK to avoid any risk of the mapcount of the pages increasing, if fork runs in parallel in another thread, before or while remap_anon_pages runs. Signed-off-by: Andrea

[PATCH 05/17] mm: gup: use get_user_pages_fast and get_user_pages_unlocked

2014-10-03 Thread Andrea Arcangeli
Just an optimization. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- drivers/dma/iovlock.c | 10 ++ drivers/iommu/amd_iommu_v2.c | 6 ++ drivers/media/pci/ivtv/ivtv-udma.c | 6 ++ drivers/scsi/st.c | 10 ++ drivers/video

[PATCH 02/17] mm: gup: add get_user_pages_locked and get_user_pages_unlocked

2014-10-03 Thread Andrea Arcangeli
get_user_pages works on some other process mm). Whenever tsk and mm matches current and current->mm get_user_pages_fast must always be used to increase performance and get the page lockless (only with irq disabled). Signed-off-by: Andrea Arcangeli aarca...@redhat.com Reviewed-by: Andres Lagar-Cavilla andre

[PATCH 12/17] mm: sys_remap_anon_pages

2014-10-03 Thread Andrea Arcangeli
) printf("error %x offset %lu\n", c[i], i), exit(1); } printf("remap_anon_pages functions correctly\n"); return 0; } === Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- arch/x86/syscalls/syscall_32.tbl | 1 + arch/x86/syscalls/syscall_64.tbl | 1 + include/linux

[PATCH 17/17] userfaultfd: implement USERFAULTFD_RANGE_REGISTER|UNREGISTER

2014-10-03 Thread Andrea Arcangeli
could also decide to retain the current -EFAULT behavior of ptrace using get_user_pages_locked with a NULL locked parameter so the FAULT_FLAG_ALLOW_RETRY flag will not be set. Either ways would be safe. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- fs/userfaultfd.c| 411

[PATCH 08/17] mm: madvise MADV_USERFAULT

2014-10-03 Thread Andrea Arcangeli
exclusive if set. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- arch/alpha/include/uapi/asm/mman.h | 3 ++ arch/mips/include/uapi/asm/mman.h | 3 ++ arch/parisc/include/uapi/asm/mman.h| 3 ++ arch/xtensa/include/uapi/asm/mman.h| 3 ++ fs/proc/task_mmu.c

Re: RFC: get_user_pages_locked|unlocked to leverage VM_FAULT_RETRY

2014-10-02 Thread Andrea Arcangeli
On Thu, Oct 02, 2014 at 02:56:38PM +0200, Peter Zijlstra wrote: > On Thu, Oct 02, 2014 at 02:50:52PM +0200, Peter Zijlstra wrote: > > On Thu, Oct 02, 2014 at 02:31:17PM +0200, Andrea Arcangeli wrote: > > > On Wed, Oct 01, 2014 at 05:36:11PM +0200, Peter Zijlstra wrot

Re: [PATCH 2/4] mm: gup: add get_user_pages_locked and get_user_pages_unlocked

2014-10-02 Thread Andrea Arcangeli
On Wed, Oct 01, 2014 at 10:06:27AM -0700, Andres Lagar-Cavilla wrote: > On Wed, Oct 1, 2014 at 8:51 AM, Peter Feiner wrote: > > On Wed, Oct 01, 2014 at 10:56:35AM +0200, Andrea Arcangeli wrote: > >> + /* VM_FAULT_RETRY cannot return errors */ > >>

Re: RFC: get_user_pages_locked|unlocked to leverage VM_FAULT_RETRY

2014-10-02 Thread Andrea Arcangeli
On Wed, Oct 01, 2014 at 05:36:11PM +0200, Peter Zijlstra wrote: > For all these and the other _fast() users, is there an actual limit to > the nr_pages passed in? Because we used to have the 64 pages limit from > DIO, but without that we get rather long IRQ-off latencies. Ok, I would tend to

Re: RFC: get_user_pages_locked|unlocked to leverage VM_FAULT_RETRY

2014-10-02 Thread Andrea Arcangeli
On Wed, Oct 01, 2014 at 05:36:11PM +0200, Peter Zijlstra wrote: For all these and the other _fast() users, is there an actual limit to the nr_pages passed in? Because we used to have the 64 pages limit from DIO, but without that we get rather long IRQ-off latencies. Ok, I would tend to think

Re: [PATCH 2/4] mm: gup: add get_user_pages_locked and get_user_pages_unlocked

2014-10-02 Thread Andrea Arcangeli
On Wed, Oct 01, 2014 at 10:06:27AM -0700, Andres Lagar-Cavilla wrote: On Wed, Oct 1, 2014 at 8:51 AM, Peter Feiner pfei...@google.com wrote: On Wed, Oct 01, 2014 at 10:56:35AM +0200, Andrea Arcangeli wrote: + /* VM_FAULT_RETRY cannot return errors */ + if (!*locked

Re: RFC: get_user_pages_locked|unlocked to leverage VM_FAULT_RETRY

2014-10-02 Thread Andrea Arcangeli
On Thu, Oct 02, 2014 at 02:56:38PM +0200, Peter Zijlstra wrote: On Thu, Oct 02, 2014 at 02:50:52PM +0200, Peter Zijlstra wrote: On Thu, Oct 02, 2014 at 02:31:17PM +0200, Andrea Arcangeli wrote: On Wed, Oct 01, 2014 at 05:36:11PM +0200, Peter Zijlstra wrote: For all these and the other

Re: [PATCH 3/4] mm: gup: use get_user_pages_fast and get_user_pages_unlocked

2014-10-01 Thread Andrea Arcangeli
On Wed, Oct 01, 2014 at 10:56:36AM +0200, Andrea Arcangeli wrote: > diff --git a/drivers/misc/sgi-gru/grufault.c b/drivers/misc/sgi-gru/grufault.c > index f74fc0c..cd20669 100644 > --- a/drivers/misc/sgi-gru/grufault.c > +++ b/drivers/misc/sgi-gru/grufault.c > @@ -198,8 +198,

[PATCH 3/4] mm: gup: use get_user_pages_fast and get_user_pages_unlocked

2014-10-01 Thread Andrea Arcangeli
Just an optimization. Signed-off-by: Andrea Arcangeli --- drivers/dma/iovlock.c | 10 ++ drivers/iommu/amd_iommu_v2.c | 6 ++ drivers/media/pci/ivtv/ivtv-udma.c | 6 ++ drivers/misc/sgi-gru/grufault.c| 3 +-- drivers/scsi/st.c | 10

[PATCH 4/4] mm: gup: use get_user_pages_unlocked within get_user_pages_fast

2014-10-01 Thread Andrea Arcangeli
Signed-off-by: Andrea Arcangeli --- arch/mips/mm/gup.c | 8 +++- arch/powerpc/mm/gup.c| 6 ++ arch/s390/kvm/kvm-s390.c | 4 +--- arch/s390/mm/gup.c | 6 ++ arch/sh/mm/gup.c | 6 ++ arch/sparc/mm/gup.c | 6 ++ arch/x86/mm/gup.c| 7

[PATCH 0/4] leverage FAULT_FOLL_ALLOW_RETRY in get_user_pages

2014-10-01 Thread Andrea Arcangeli
iews would be welcome, thanks, Andrea Andrea Arcangeli (3): mm: gup: add get_user_pages_locked and get_user_pages_unlocked mm: gup: use get_user_pages_fast and get_user_pages_unlocked mm: gup: use get_user_pages_unlocked within get_user_pages_fast Andres Lagar-Cavilla (1): mm: gup: add

[PATCH 1/4] mm: gup: add FOLL_TRIED

2014-10-01 Thread Andrea Arcangeli
From: Andres Lagar-Cavilla Reviewed-by: Radim Krčmář Signed-off-by: Andres Lagar-Cavilla Signed-off-by: Andrea Arcangeli --- include/linux/mm.h | 1 + mm/gup.c | 4 2 files changed, 5 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index 8981cc8..0f4196a

[PATCH 2/4] mm: gup: add get_user_pages_locked and get_user_pages_unlocked

2014-10-01 Thread Andrea Arcangeli
is not current->mm (like when get_user_pages works on some other process mm). Whenever tsk and mm matches current and current->mm get_user_pages_fast must always be used to increase performance and get the page lockless (only with irq disabled). Signed-off-by: Andrea Arcangeli --

[PATCH 2/4] mm: gup: add get_user_pages_locked and get_user_pages_unlocked

2014-10-01 Thread Andrea Arcangeli
get_user_pages works on some other process mm). Whenever tsk and mm matches current and current->mm get_user_pages_fast must always be used to increase performance and get the page lockless (only with irq disabled). Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- include/linux/mm.h | 7 +++ mm

[PATCH 1/4] mm: gup: add FOLL_TRIED

2014-10-01 Thread Andrea Arcangeli
From: Andres Lagar-Cavilla andre...@google.com Reviewed-by: Radim Krčmář rkrc...@redhat.com Signed-off-by: Andres Lagar-Cavilla andre...@google.com Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- include/linux/mm.h | 1 + mm/gup.c | 4 2 files changed, 5 insertions

[PATCH 0/4] leverage FAULT_FOLL_ALLOW_RETRY in get_user_pages

2014-10-01 Thread Andrea Arcangeli
be welcome, thanks, Andrea Andrea Arcangeli (3): mm: gup: add get_user_pages_locked and get_user_pages_unlocked mm: gup: use get_user_pages_fast and get_user_pages_unlocked mm: gup: use get_user_pages_unlocked within get_user_pages_fast Andres Lagar-Cavilla (1): mm: gup: add FOLL_TRIED

[PATCH 3/4] mm: gup: use get_user_pages_fast and get_user_pages_unlocked

2014-10-01 Thread Andrea Arcangeli
Just an optimization. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- drivers/dma/iovlock.c | 10 ++ drivers/iommu/amd_iommu_v2.c | 6 ++ drivers/media/pci/ivtv/ivtv-udma.c | 6 ++ drivers/misc/sgi-gru/grufault.c| 3 +-- drivers/scsi/st.c

[PATCH 4/4] mm: gup: use get_user_pages_unlocked within get_user_pages_fast

2014-10-01 Thread Andrea Arcangeli
Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- arch/mips/mm/gup.c | 8 +++- arch/powerpc/mm/gup.c| 6 ++ arch/s390/kvm/kvm-s390.c | 4 +--- arch/s390/mm/gup.c | 6 ++ arch/sh/mm/gup.c | 6 ++ arch/sparc/mm/gup.c | 6 ++ arch/x86/mm/gup.c

Re: [PATCH 3/4] mm: gup: use get_user_pages_fast and get_user_pages_unlocked

2014-10-01 Thread Andrea Arcangeli
On Wed, Oct 01, 2014 at 10:56:36AM +0200, Andrea Arcangeli wrote: diff --git a/drivers/misc/sgi-gru/grufault.c b/drivers/misc/sgi-gru/grufault.c index f74fc0c..cd20669 100644 --- a/drivers/misc/sgi-gru/grufault.c +++ b/drivers/misc/sgi-gru/grufault.c @@ -198,8 +198,7 @@ static int

Re: RFC: get_user_pages_locked|unlocked to leverage VM_FAULT_RETRY

2014-09-28 Thread Andrea Arcangeli
On Fri, Sep 26, 2014 at 12:54:46PM -0700, Andres Lagar-Cavilla wrote: > On Fri, Sep 26, 2014 at 10:25 AM, Andrea Arcangeli > wrote: > > On Thu, Sep 25, 2014 at 02:50:29PM -0700, Andres Lagar-Cavilla wrote: > >> It's nearly impossible to name it right because 1

Re: RFC: get_user_pages_locked|unlocked to leverage VM_FAULT_RETRY

2014-09-28 Thread Andrea Arcangeli
On Fri, Sep 26, 2014 at 12:54:46PM -0700, Andres Lagar-Cavilla wrote: On Fri, Sep 26, 2014 at 10:25 AM, Andrea Arcangeli aarca...@redhat.com wrote: On Thu, Sep 25, 2014 at 02:50:29PM -0700, Andres Lagar-Cavilla wrote: It's nearly impossible to name it right because 1) it indicates we can

RFC: get_user_pages_locked|unlocked to leverage VM_FAULT_RETRY

2014-09-26 Thread Andrea Arcangeli
rotocol. But I need gup_fast to use FAULT_FLAG_ALLOW_RETRY because core places like O_DIRECT uses it. I tried to do a RFC patch below that goes into this direction and should be enough for a start to solve all my issues with the mmap_sem holding inside handle_userfault(), comments welcome. ==

RFC: get_user_pages_locked|unlocked to leverage VM_FAULT_RETRY

2014-09-26 Thread Andrea Arcangeli
that goes into this direction and should be enough for a start to solve all my issues with the mmap_sem holding inside handle_userfault(), comments welcome. === From 41918f7d922d1e7fc70f117db713377e7e2af6e9 Mon Sep 17 00:00:00 2001 From: Andrea Arcangeli aarca...@redhat.com Date: Fri, 26 Sep

Re: [PATCH v2] kvm: Faults which trigger IO release the mmap_sem

2014-09-25 Thread Andrea Arcangeli
Hi Andres, On Wed, Sep 17, 2014 at 10:51:48AM -0700, Andres Lagar-Cavilla wrote: > + if (!locked) { > + VM_BUG_ON(npages != -EBUSY); > + Shouldn't this be VM_BUG_ON(npages)? Alternatively we could patch gup to do: case -EHWPOISON: +

Re: [PATCH v2] kvm: Faults which trigger IO release the mmap_sem

2014-09-25 Thread Andrea Arcangeli
Hi Andres, On Wed, Sep 17, 2014 at 10:51:48AM -0700, Andres Lagar-Cavilla wrote: + if (!locked) { + VM_BUG_ON(npages != -EBUSY); + Shouldn't this be VM_BUG_ON(npages)? Alternatively we could patch gup to do: case -EHWPOISON: +

Re: [PATCH 0/3] mmu_notifier: Allow to manage CPU external TLBs

2014-07-24 Thread Andrea Arcangeli
out any references to a > couple of pages. This are usually the places where the CPU > TLBs are flushed too and where its important that this > happens before invalidate_range_end() is called. > > Any comments and review appreciated! Reviewed-by: Andrea Arcangeli -- To unsubscribe

Re: [PATCH 0/3] mmu_notifier: Allow to manage CPU external TLBs

2014-07-24 Thread Andrea Arcangeli
. This are usually the places where the CPU TLBs are flushed too and where its important that this happens before invalidate_range_end() is called. Any comments and review appreciated! Reviewed-by: Andrea Arcangeli aarca...@redhat.com -- To unsubscribe from this list: send the line unsubscribe

Re: [PATCH 08/10] userfaultfd: add new syscall to provide memory externalization

2014-07-03 Thread Andrea Arcangeli
Hi Andy, thanks for CC'ing linux-api. On Wed, Jul 02, 2014 at 06:56:03PM -0700, Andy Lutomirski wrote: > On 07/02/2014 09:50 AM, Andrea Arcangeli wrote: > > Once an userfaultfd is created MADV_USERFAULT regions talks through > > the userfaultfd protocol with the thread respon

Re: [PATCH 08/10] userfaultfd: add new syscall to provide memory externalization

2014-07-03 Thread Andrea Arcangeli
Hi Andy, thanks for CC'ing linux-api. On Wed, Jul 02, 2014 at 06:56:03PM -0700, Andy Lutomirski wrote: On 07/02/2014 09:50 AM, Andrea Arcangeli wrote: Once an userfaultfd is created MADV_USERFAULT regions talks through the userfaultfd protocol with the thread responsible for doing

Re: [PATCH 0/4] Volatile Ranges (v14 - madvise reborn edition!)

2014-06-16 Thread Andrea Arcangeli
Hello everyone, On Mon, Jun 16, 2014 at 01:12:41PM -0700, John Stultz wrote: > On Tue, Jun 3, 2014 at 7:57 AM, Johannes Weiner wrote: > > That, however, truly is a separate virtual memory feature. Would it > > be possible for you to take MADV_FREE and MADV_REVIVE as a base and > > implement an

Re: [PATCH 0/4] Volatile Ranges (v14 - madvise reborn edition!)

2014-06-16 Thread Andrea Arcangeli
Hello everyone, On Mon, Jun 16, 2014 at 01:12:41PM -0700, John Stultz wrote: On Tue, Jun 3, 2014 at 7:57 AM, Johannes Weiner han...@cmpxchg.org wrote: That, however, truly is a separate virtual memory feature. Would it be possible for you to take MADV_FREE and MADV_REVIVE as a base and

Re: [PATCH, RFC 00/10] THP refcounting redesign

2014-06-10 Thread Andrea Arcangeli
On Wed, Jun 11, 2014 at 01:14:31AM +0300, Kirill A. Shutemov wrote: > On Wed, Jun 11, 2014 at 12:04:51AM +0200, Andrea Arcangeli wrote: > > On Tue, Jun 10, 2014 at 11:46:40PM +0300, Kirill A. Shutemov wrote: > > > Agreed. The patchset drops tail page refcounting. > > >

Re: [PATCH, RFC 00/10] THP refcounting redesign

2014-06-10 Thread Andrea Arcangeli
On Tue, Jun 10, 2014 at 11:46:40PM +0300, Kirill A. Shutemov wrote: > Agreed. The patchset drops tail page refcounting. Very possibly I misread something or a later patch fixes this up, I just did a basic code review, but from the new code of split_huge_page it looks like it returns -EBUSY after

Re: [PATCH, RFC 00/10] THP refcounting redesign

2014-06-10 Thread Andrea Arcangeli
On Tue, Jun 10, 2014 at 03:25:42PM -0500, Christoph Lameter wrote: > I thought the idea was that we would modify the relevant code and > that at some point this requirement could go away? There were places that weren't aware and splitted unnecessarily to avoid having to make all places aware

Re: [PATCH, RFC 00/10] THP refcounting redesign

2014-06-10 Thread Andrea Arcangeli
Hello, On Tue, Jun 10, 2014 at 04:52:46PM +0300, Kirill A. Shutemov wrote: > I mean the whole compound page will not be freed until the last part page > is unmapped. It can lead to excessive memory overhead for some workloads. That is why a refcounting design like this wouldn't have been

Re: [PATCH, RFC 00/10] THP refcounting redesign

2014-06-10 Thread Andrea Arcangeli
Hello, On Tue, Jun 10, 2014 at 04:52:46PM +0300, Kirill A. Shutemov wrote: I mean the whole compound page will not be freed until the last part page is unmapped. It can lead to excessive memory overhead for some workloads. That is why a refcounting design like this wouldn't have been feasible

Re: [PATCH, RFC 00/10] THP refcounting redesign

2014-06-10 Thread Andrea Arcangeli
On Tue, Jun 10, 2014 at 03:25:42PM -0500, Christoph Lameter wrote: I thought the idea was that we would modify the relevant code and that at some point this requirement could go away? There were places that weren't aware and splitted unnecessarily to avoid having to make all places aware

Re: [PATCH, RFC 00/10] THP refcounting redesign

2014-06-10 Thread Andrea Arcangeli
On Tue, Jun 10, 2014 at 11:46:40PM +0300, Kirill A. Shutemov wrote: Agreed. The patchset drops tail page refcounting. Very possibly I misread something or a later patch fixes this up, I just did a basic code review, but from the new code of split_huge_page it looks like it returns -EBUSY after

Re: [PATCH, RFC 00/10] THP refcounting redesign

2014-06-10 Thread Andrea Arcangeli
On Wed, Jun 11, 2014 at 01:14:31AM +0300, Kirill A. Shutemov wrote: On Wed, Jun 11, 2014 at 12:04:51AM +0200, Andrea Arcangeli wrote: On Tue, Jun 10, 2014 at 11:46:40PM +0300, Kirill A. Shutemov wrote: Agreed. The patchset drops tail page refcounting. Very possibly I misread something

Re: [PATCH 0/2] KVM: async_pf: use_mm/mm_users fixes

2014-04-28 Thread Andrea Arcangeli
On Mon, Apr 28, 2014 at 01:06:05PM +0200, Paolo Bonzini wrote: > Patch 1 will be for 3.16 only, I'd like a review from Marcelo or Andrea > though (that's "KVM: async_pf: kill the unnecessary use_mm/unuse_mm > async_pf_execute()" for easier googling). Patch 1: Reviewed-by: A

Re: [PATCH 1/2] KVM: async_pf: kill the unnecessary use_mm/unuse_mm async_pf_execute()

2014-04-28 Thread Andrea Arcangeli
Hi, On Wed, Apr 23, 2014 at 09:32:28PM +0200, Oleg Nesterov wrote: > On 04/22, Christian Borntraeger wrote: > > > > On 22/04/14 22:15, Christian Borntraeger wrote: > > > On 21/04/14 15:25, Oleg Nesterov wrote: > > >> async_pf_execute() has no reasons to adopt apf->mm, gup(current, mm) > > >>

Re: [PATCH 1/2] KVM: async_pf: kill the unnecessary use_mm/unuse_mm async_pf_execute()

2014-04-28 Thread Andrea Arcangeli
Hi, On Wed, Apr 23, 2014 at 09:32:28PM +0200, Oleg Nesterov wrote: On 04/22, Christian Borntraeger wrote: On 22/04/14 22:15, Christian Borntraeger wrote: On 21/04/14 15:25, Oleg Nesterov wrote: async_pf_execute() has no reasons to adopt apf->mm, gup(current, mm) should work just fine

Re: [PATCH 0/2] KVM: async_pf: use_mm/mm_users fixes

2014-04-28 Thread Andrea Arcangeli
On Mon, Apr 28, 2014 at 01:06:05PM +0200, Paolo Bonzini wrote: Patch 1 will be for 3.16 only, I'd like a review from Marcelo or Andrea though (that's KVM: async_pf: kill the unnecessary use_mm/unuse_mm async_pf_execute() for easier googling). Patch 1: Reviewed-by: Andrea Arcangeli aarca

Re: [PATCH] thp: close race between split and zap huge pages

2014-04-17 Thread Andrea Arcangeli
Hi everyone, On Wed, Apr 16, 2014 at 12:48:56AM +0300, Kirill A. Shutemov wrote: > - pmd = mm_find_pmd(mm, address); > - if (!pmd) > + pgd = pgd_offset(mm, address); > + if (!pgd_present(*pgd)) > return NULL; > + pud = pud_offset(pgd, address); > + if

Re: [PATCH] thp: close race between split and zap huge pages

2014-04-17 Thread Andrea Arcangeli
Hi everyone, On Wed, Apr 16, 2014 at 12:48:56AM +0300, Kirill A. Shutemov wrote: - pmd = mm_find_pmd(mm, address); - if (!pmd) + pgd = pgd_offset(mm, address); + if (!pgd_present(*pgd)) return NULL; + pud = pud_offset(pgd, address); + if

Re: mm: kernel BUG at mm/huge_memory.c:1829!

2014-04-16 Thread Andrea Arcangeli
Hi Kirill, On Mon, Apr 14, 2014 at 05:42:18PM +0300, Kirill A. Shutemov wrote: > I've spent few day trying to understand rmap code. And now I think my > patch is wrong. > > I actually don't see where walk order requirement comes from. It seems all > operations (insert, remove, foreach) on

Re: mm: kernel BUG at mm/huge_memory.c:1829!

2014-04-16 Thread Andrea Arcangeli
Hi Kirill, On Mon, Apr 14, 2014 at 05:42:18PM +0300, Kirill A. Shutemov wrote: I've spent few day trying to understand rmap code. And now I think my patch is wrong. I actually don't see where walk order requirement comes from. It seems all operations (insert, remove, foreach) on anon_vma is

Re: mm: kernel BUG at mm/huge_memory.c:1829!

2014-04-10 Thread Andrea Arcangeli
Hi, On Thu, Apr 10, 2014 at 04:44:36PM +0300, Kirill A. Shutemov wrote: > Okay, below is my attempt to fix the bug. I'm not entirely sure it's > correct. Andrea, could you take a look? The possibility the interval tree implicitly broke the walk order of the anon_vma list didn't cross my mind,

Re: mm: kernel BUG at mm/huge_memory.c:1829!

2014-04-10 Thread Andrea Arcangeli
Hi, On Thu, Apr 10, 2014 at 04:44:36PM +0300, Kirill A. Shutemov wrote: Okay, below is my attempt to fix the bug. I'm not entirely sure it's correct. Andrea, could you take a look? The possibility the interval tree implicitly broke the walk order of the anon_vma list didn't cross my mind,

Re: [PATCH 0/4] hugetlb: add support gigantic page allocation at runtime

2014-04-03 Thread Andrea Arcangeli
Reviewed-by: Andrea Arcangeli -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

Re: [PATCH 0/4] hugetlb: add support gigantic page allocation at runtime

2014-04-03 Thread Andrea Arcangeli
Reviewed-by: Andrea Arcangeli aarca...@redhat.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

Re: [PATCH 0/5] Volatile Ranges (v12) & LSF-MM discussion fodder

2014-04-02 Thread Andrea Arcangeli
Hi everyone, On Tue, Apr 01, 2014 at 09:03:57PM -0700, John Stultz wrote: > So between zero-fill and SIGBUS, I think SIGBUS makes the most sense. If > you have a third option you're thinking of, I'd of course be interested > in hearing it. I actually thought the way of being notified with a page

Re: [PATCH] mm/mmu_notifier: restore set_pte_at_notify semantics

2014-04-02 Thread Andrea Arcangeli
Hi, On Wed, Apr 02, 2014 at 11:18:27AM -0400, Jerome Glisse wrote: > This would imply either to scan all mmu_notifier currently register or to > have a global flags for the mm to know if there is one mmu_notifier without > change_pte. Moreover this would means that kvm would remain "broken" if

Re: [PATCH] mm/mmu_notifier: restore set_pte_at_notify semantics

2014-04-02 Thread Andrea Arcangeli
Hi, On Wed, Apr 02, 2014 at 11:18:27AM -0400, Jerome Glisse wrote: This would imply either to scan all mmu_notifier currently register or to have a global flags for the mm to know if there is one mmu_notifier without change_pte. Moreover this would means that kvm would remain broken if one of