Re: [RFC v4 00/20] Speculative page faults

2017-06-13 Thread Laurent Dufour
On 12/06/2017 12:20, Jan Kara wrote: > Hello, > > On Fri 09-06-17 16:20:49, Laurent Dufour wrote: >> This is a port on kernel 4.12 of the work done by Peter Zijlstra to >> handle page fault without holding the mm semaphore. >> >> http://linux-kernel.2935.n7.na

Re: [RFC v4 00/20] Speculative page faults

2017-06-13 Thread Laurent Dufour
On 09/06/2017 18:59, Tim Chen wrote: > On 06/09/2017 09:35 AM, Michal Hocko wrote: >> On Fri 09-06-17 17:25:51, Laurent Dufour wrote: >> [...] >>> Thanks Michal for your feedback. >>> >>> I mostly focused on this database workload since this is the one whe

Re: [RFC v4 00/20] Speculative page faults

2017-06-13 Thread Laurent Dufour
On 09/06/2017 18:35, Michal Hocko wrote: > On Fri 09-06-17 17:25:51, Laurent Dufour wrote: > [...] >> Thanks Michal for your feedback. >> >> I mostly focused on this database workload since this is the one where >> we hit the mmap_sem bottleneck when running on big no

Re: [RFC v4 00/20] Speculative page faults

2017-06-09 Thread Laurent Dufour
On 09/06/2017 17:01, Michal Hocko wrote: > On Fri 09-06-17 16:20:49, Laurent Dufour wrote: >> This is a port on kernel 4.12 of the work done by Peter Zijlstra to >> handle page fault without holding the mm semaphore. >> >> http://linux-kernel.2935.n7.nabble.c

[RFC v4 05/20] mm: RCU free VMAs

2017-06-09 Thread Laurent Dufour
(Intel) <pet...@infradead.org> [Rename vma_is_dead() to vma_has_changed()] Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- include/linux/mm_types.h | 2 + kernel/fork.c| 1 + mm/init-mm.c | 1 + mm/internal.h|

[RFC v4 05/20] mm: RCU free VMAs

2017-06-09 Thread Laurent Dufour
s_dead() to vma_has_changed()] Signed-off-by: Laurent Dufour --- include/linux/mm_types.h | 2 + kernel/fork.c| 1 + mm/init-mm.c | 1 + mm/internal.h| 20 ++ mm/mmap.c| 102 ++- 5 files chang

[RFC v4 11/20] mm/spf: Protect changes to vm_flags

2017-06-09 Thread Laurent Dufour
Protect VMA's flags change against the speculative page fault handler. Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- fs/proc/task_mmu.c | 2 ++ mm/mempolicy.c | 2 ++ mm/mlock.c | 9 ++--- mm/mmap.c | 2 ++ mm/mprotect.c | 2 ++ 5 files chang

[RFC v4 11/20] mm/spf: Protect changes to vm_flags

2017-06-09 Thread Laurent Dufour
Protect VMA's flags change against the speculative page fault handler. Signed-off-by: Laurent Dufour --- fs/proc/task_mmu.c | 2 ++ mm/mempolicy.c | 2 ++ mm/mlock.c | 9 ++--- mm/mmap.c | 2 ++ mm/mprotect.c | 2 ++ 5 files changed, 14 insertions(+), 3 deletions

[RFC v4 07/20] mm/spf: Try spin lock in speculative path

2017-06-09 Thread Laurent Dufour
/0x280 __do_page_fault+0x187/0x580 trace_do_page_fault+0x52/0x260 do_async_page_fault+0x19/0x70 async_page_fault+0x28/0x30 Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- mm/memory.c | 17 ++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/mm/memory.

[RFC v4 07/20] mm/spf: Try spin lock in speculative path

2017-06-09 Thread Laurent Dufour
/0x280 __do_page_fault+0x187/0x580 trace_do_page_fault+0x52/0x260 do_async_page_fault+0x19/0x70 async_page_fault+0x28/0x30 Signed-off-by: Laurent Dufour --- mm/memory.c | 17 ++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 8c43895e9310

[RFC v4 09/20] mm/spf: don't set fault entry's fields if locking failed

2017-06-09 Thread Laurent Dufour
If pte_map_lock failed to lock the pte, or if the VMA is no longer valid, the fault entry's fields should not be set, so that the caller won't try to unlock it. Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- mm/memory.c | 14 +- 1 file changed, 9 insertions

[RFC v4 09/20] mm/spf: don't set fault entry's fields if locking failed

2017-06-09 Thread Laurent Dufour
If pte_map_lock failed to lock the pte, or if the VMA is no longer valid, the fault entry's fields should not be set, so that the caller won't try to unlock it. Signed-off-by: Laurent Dufour --- mm/memory.c | 14 +- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/mm

[RFC v4 03/20] mm: Introduce pte_spinlock

2017-06-09 Thread Laurent Dufour
This is needed because in handle_pte_fault() pte_offset_map() is called and then fe->ptl is fetched and spin_locked. This was previously embedded in the call to pte_offset_map_lock(). Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- mm/memory.c | 15 +++ 1 fil

[RFC v4 03/20] mm: Introduce pte_spinlock

2017-06-09 Thread Laurent Dufour
This is needed because in handle_pte_fault() pte_offset_map() is called and then fe->ptl is fetched and spin_locked. This was previously embedded in the call to pte_offset_map_lock(). Signed-off-by: Laurent Dufour --- mm/memory.c | 15 +++ 1 file changed, 11 insertions(+)

[RFC v4 14/20] mm/spf: protect madvise vs speculative pf

2017-06-09 Thread Laurent Dufour
This patch protects madvise's effect against the speculative page fault handler. Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- mm/madvise.c | 4 1 file changed, 4 insertions(+) diff --git a/mm/madvise.c b/mm/madvise.c index 25b78ee4fc2c..d1fa6a7ee604 100644 --

[RFC v4 14/20] mm/spf: protect madvise vs speculative pf

2017-06-09 Thread Laurent Dufour
This patch protects madvise's effect against the speculative page fault handler. Signed-off-by: Laurent Dufour --- mm/madvise.c | 4 1 file changed, 4 insertions(+) diff --git a/mm/madvise.c b/mm/madvise.c index 25b78ee4fc2c..d1fa6a7ee604 100644 --- a/mm/madvise.c +++ b/mm/madvise.c

[RFC v4 10/20] mm/spf; fix lock dependency against mapping->i_mmap_rwsem

2017-06-09 Thread Laurent Dufour
o: CPU0 CPU1 lock(&mapping->i_mmap_rwsem); lock(&vma->vm_sequence/1); lock(&mapping->i_mmap_rwsem); lock(&vma->vm_sequence); *** DEADLOCK *** To fix that we must grab the vm_sequence lock after any mapping one in _

[RFC v4 10/20] mm/spf; fix lock dependency against mapping->i_mmap_rwsem

2017-06-09 Thread Laurent Dufour
o: CPU0 CPU1 lock(&mapping->i_mmap_rwsem); lock(&vma->vm_sequence/1); lock(&mapping->i_mmap_rwsem); lock(&vma->vm_sequence); *** DEADLOCK *** To fix that we must grab the vm_sequence lock after any mapping one in __vma_adjus
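Archive note: the snippet above is truncated, but the fix it describes is a lock-ordering rule (take vm_sequence only after any mapping lock). The following is a minimal userspace sketch of such a rule, not the patch's code. All names (sketch_task, sketch_acquire, the SKETCH_RANK_* values) are hypothetical, and the nested vm_sequence/1 subclass from the lockdep report is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ranks: a lock may only be taken while every lock already
 * held has a strictly lower rank. This encodes "mapping lock first,
 * vm_sequence second". */
enum sketch_rank {
    SKETCH_RANK_I_MMAP_RWSEM = 1,
    SKETCH_RANK_VM_SEQUENCE  = 2
};

/* Per-task bookkeeping: highest rank currently held, 0 when none. */
struct sketch_task {
    int held_max;
};

/* Returns true if taking a lock of this rank respects the ordering rule,
 * false if it would invert the order (the deadlock-prone case). */
static bool sketch_acquire(struct sketch_task *t, enum sketch_rank rank)
{
    assert(t != NULL);
    if ((int)rank <= t->held_max)
        return false;
    t->held_max = (int)rank;
    return true;
}

/* Drops every lock the task holds in this simplified model. */
static void sketch_release_all(struct sketch_task *t)
{
    assert(t != NULL);
    t->held_max = 0;
}
```

With this rule, a task that takes the mapping lock and then vm_sequence is accepted, while a task that already holds vm_sequence and then asks for the mapping lock is refused, which is the inversion shown in the CPU0/CPU1 report.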

[RFC v4 16/20] mm/spf: Don't call user fault callback in the speculative path

2017-06-09 Thread Laurent Dufour
The handle_userfault() function assumes that the mmap_sem is held, which is not true when handling a speculative page fault. When doing a speculative page fault, let's retry it in the usual path to call handle_userfault(). Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.

[RFC v4 12/20] mm/spf Protect vm_policy's changes against speculative pf

2017-06-09 Thread Laurent Dufour
Mark the VMA touched when policy changes are applied to it so that speculative page fault will be aborted. Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- mm/mempolicy.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/mm/mempolicy.c b/mm/mempo

[RFC v4 16/20] mm/spf: Don't call user fault callback in the speculative path

2017-06-09 Thread Laurent Dufour
The handle_userfault() function assumes that the mmap_sem is held, which is not true when handling a speculative page fault. When doing a speculative page fault, let's retry it in the usual path to call handle_userfault(). Signed-off-by: Laurent Dufour --- mm/memory.c | 4 1 file

[RFC v4 12/20] mm/spf Protect vm_policy's changes against speculative pf

2017-06-09 Thread Laurent Dufour
Mark the VMA touched when policy changes are applied to it so that speculative page fault will be aborted. Signed-off-by: Laurent Dufour --- mm/mempolicy.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/mm/mempolicy.c b/mm/mempolicy.c index 13d32c25226c..5e44b3e69a0d

[RFC v4 20/20] mm/spf: Clear FAULT_FLAG_KILLABLE in the speculative path

2017-06-09 Thread Laurent Dufour
The flag FAULT_FLAG_KILLABLE should be unset so that the mmap_sem cannot be released in __lock_page_or_retry(). In this patch the unsetting of the flag FAULT_FLAG_ALLOW_RETRY is also moved into handle_speculative_fault() since this has to be done for all architectures. Signed-off-by: Laurent

[RFC v4 19/20] powerpc/mm: Add speculative page fault

2017-06-09 Thread Laurent Dufour
This patch enables the speculative page fault on the PowerPC architecture. This will try a speculative page fault without holding the mmap_sem; if it returns with VM_FAULT_RETRY, the mmap_sem is acquired and the traditional page fault processing is done. Signed-off-by: Laurent Dufour <l

[RFC v4 17/20] x86/mm: Add speculative pagefault handling

2017-06-09 Thread Laurent Dufour
From: Peter Zijlstra Try a speculative fault before acquiring mmap_sem, if it returns with VM_FAULT_RETRY continue with the mmap_sem acquisition and do the traditional fault. Signed-off-by: Peter Zijlstra (Intel) --- arch/x86/mm/fault.c | 18
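Archive note: the control flow described in this snippet (try the fault without mmap_sem, fall back to the locked path on VM_FAULT_RETRY) can be sketched in userspace as follows. This is only an illustration under stated assumptions, not the code in arch/x86/mm/fault.c. Every name here (sketch_mm, sketch_speculative_fault, SKETCH_FAULT_RETRY, and so on) is hypothetical, and the semaphore is modelled as a plain counter, so the sketch is single-threaded only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical return codes standing in for the kernel's VM_FAULT_* values. */
#define SKETCH_FAULT_OK    0
#define SKETCH_FAULT_RETRY 1

struct sketch_mm {
    int  mmap_sem_readers;   /* models read-side holders of mmap_sem */
    bool layout_changing;    /* true when the lockless path cannot be trusted */
    int  speculative_hits;   /* faults resolved without the semaphore */
    int  slow_path_hits;     /* faults resolved with the semaphore held */
};

/* Lockless attempt: never blocks, gives up with RETRY when unsure. */
static int sketch_speculative_fault(struct sketch_mm *mm)
{
    assert(mm != NULL);
    if (mm->layout_changing)
        return SKETCH_FAULT_RETRY;
    mm->speculative_hits++;
    return SKETCH_FAULT_OK;
}

/* Traditional path: take the semaphore, handle the fault, release it. */
static int sketch_locked_fault(struct sketch_mm *mm)
{
    mm->mmap_sem_readers++;          /* stands in for down_read(&mm->mmap_sem) */
    mm->slow_path_hits++;
    mm->mmap_sem_readers--;          /* stands in for up_read(&mm->mmap_sem) */
    return SKETCH_FAULT_OK;
}

/* Entry point: speculative first, locked path only on RETRY. */
static int sketch_handle_fault(struct sketch_mm *mm)
{
    int ret = sketch_speculative_fault(mm);

    if (ret != SKETCH_FAULT_RETRY)
        return ret;
    return sketch_locked_fault(mm);
}
```

The point of the shape is that the common case never touches the semaphore, and correctness does not depend on the speculative attempt succeeding, since any doubt is resolved by redoing the work under the lock.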

[RFC v4 17/20] x86/mm: Add speculative pagefault handling

2017-06-09 Thread Laurent Dufour
From: Peter Zijlstra Try a speculative fault before acquiring mmap_sem, if it returns with VM_FAULT_RETRY continue with the mmap_sem acquisition and do the traditional fault. Signed-off-by: Peter Zijlstra (Intel) --- arch/x86/mm/fault.c | 18 ++ 1 file changed, 18

[RFC v4 19/20] powerpc/mm: Add speculative page fault

2017-06-09 Thread Laurent Dufour
This patch enables the speculative page fault on the PowerPC architecture. This will try a speculative page fault without holding the mmap_sem; if it returns with VM_FAULT_RETRY, the mmap_sem is acquired and the traditional page fault processing is done. Signed-off-by: Laurent Dufour --- arch

[RFC v4 15/20] mm/spf: protect mremap() against speculative pf

2017-06-09 Thread Laurent Dufour
mremap() is modifying the VMA layout and thus must be protected against the speculative page fault handler. Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- mm/mremap.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/mm/mremap.c b/mm/mremap.c index cd8a1b

[RFC v4 18/20] x86/mm: Update the handle_speculative_fault's path

2017-06-09 Thread Laurent Dufour
If handle_speculative_fault failed due to a VM ERROR, we retry via the slow path to allow the signal to be delivered. Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- arch/x86/mm/fault.c | 21 + 1 file changed, 9 insertions(+), 12 deletions(-) diff --git

[RFC v4 15/20] mm/spf: protect mremap() against speculative pf

2017-06-09 Thread Laurent Dufour
mremap() is modifying the VMA layout and thus must be protected against the speculative page fault handler. Signed-off-by: Laurent Dufour --- mm/mremap.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/mm/mremap.c b/mm/mremap.c index cd8a1b199ef9..9c7f69c9e80f 100644 --- a/mm

[RFC v4 18/20] x86/mm: Update the handle_speculative_fault's path

2017-06-09 Thread Laurent Dufour
If handle_speculative_fault failed due to a VM ERROR, we retry via the slow path to allow the signal to be delivered. Signed-off-by: Laurent Dufour --- arch/x86/mm/fault.c | 21 + 1 file changed, 9 insertions(+), 12 deletions(-) diff --git a/arch/x86/mm/fault.c b/arch/x86

[RFC v4 02/20] mm: Prepare for FAULT_FLAG_SPECULATIVE

2017-06-09 Thread Laurent Dufour
(Intel) <pet...@infradead.org> [port to 4.10 kernel] Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- include/linux/mm.h | 1 + mm/memory.c| 57 +++--- 2 files changed, 42 insertions(+), 16 deletions(-) diff --git a/

[RFC v4 02/20] mm: Prepare for FAULT_FLAG_SPECULATIVE

2017-06-09 Thread Laurent Dufour
] Signed-off-by: Laurent Dufour --- include/linux/mm.h | 1 + mm/memory.c| 57 +++--- 2 files changed, 42 insertions(+), 16 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index b892e95d4929..6b7ec2a76953 100644

[RFC v4 04/20] mm: VMA sequence count

2017-06-09 Thread Laurent Dufour
4.10 kernel] Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- include/linux/mm_types.h | 1 + mm/memory.c | 2 ++ mm/mmap.c| 13 + 3 files changed, 16 insertions(+) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h

[RFC v4 08/20] mm/spf: Fix fe.sequence init in __handle_mm_fault()

2017-06-09 Thread Laurent Dufour
__handle_mm_fault() calls handle_pte_fault which requires the sequence field of the fault_env to be initialized. Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- mm/memory.c | 1 + 1 file changed, 1 insertion(+) diff --git a/mm/memory.c b/mm/memory.c index 9de741

[RFC v4 04/20] mm: VMA sequence count

2017-06-09 Thread Laurent Dufour
are still valid. The flip side is that we cannot distinguish between a vma_adjust() and the unmap_page_range() -- where with the former we could have re-checked the vma bounds against the address. Signed-off-by: Peter Zijlstra (Intel) [port to 4.10 kernel] Signed-off-by: Laurent Dufour --- include
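Archive note: the snippet above is the tail of a description of a per-VMA sequence count. A minimal userspace sketch of the general sequence-count pattern is given below, assuming C11 atomics. It is not the kernel's seqcount implementation, and all names (sketch_vma, sketch_lookup, and so on) are hypothetical. The start/end fields are plain variables, so the sketch is only meant to be exercised from one thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_vma {
    atomic_uint   sequence;   /* odd while a writer is in the middle of an update */
    unsigned long start;
    unsigned long end;
};

static void sketch_vma_init(struct sketch_vma *vma,
                            unsigned long start, unsigned long end)
{
    assert(vma != NULL);
    atomic_init(&vma->sequence, 0);
    vma->start = start;
    vma->end = end;
}

/* Writer side: bump the count before and after touching the fields. */
static void sketch_write_begin(struct sketch_vma *vma)
{
    atomic_fetch_add(&vma->sequence, 1);
}

static void sketch_write_end(struct sketch_vma *vma)
{
    atomic_fetch_add(&vma->sequence, 1);
}

/* Reader side: sample the count, read, then check nothing changed. */
static unsigned int sketch_read_begin(struct sketch_vma *vma)
{
    return atomic_load(&vma->sequence);
}

static bool sketch_read_retry(struct sketch_vma *vma, unsigned int seq)
{
    return (seq & 1u) || atomic_load(&vma->sequence) != seq;
}

/* Returns false when the reader must give up (a writer interfered);
 * otherwise stores in *inside whether addr lies within the VMA. */
static bool sketch_lookup(struct sketch_vma *vma, unsigned long addr,
                          bool *inside)
{
    unsigned int seq = sketch_read_begin(vma);
    bool hit = addr >= vma->start && addr < vma->end;

    if (sketch_read_retry(vma, seq))
        return false;
    *inside = hit;
    return true;
}
```

The reader only learns that something changed, not what changed, which matches the trade-off the snippet mentions: a bounds change and an unrelated update both show up as the same bumped count.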

[RFC v4 08/20] mm/spf: Fix fe.sequence init in __handle_mm_fault()

2017-06-09 Thread Laurent Dufour
__handle_mm_fault() calls handle_pte_fault which requires the sequence field of the fault_env to be initialized. Signed-off-by: Laurent Dufour --- mm/memory.c | 1 + 1 file changed, 1 insertion(+) diff --git a/mm/memory.c b/mm/memory.c index 9de741554e15..f05288797c60 100644 --- a/mm/memory.c

[RFC v4 06/20] mm: Provide speculative fault infrastructure

2017-06-09 Thread Laurent Dufour
ll pud_alloc() as it is safe since p4d is valid] Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- include/linux/mm.h | 3 ++ mm/memory.c| 148 +++-- 2 files changed, 148 insertions(+), 3 deletions(-) diff --git a/in

[RFC v4 06/20] mm: Provide speculative fault infrastructure

2017-06-09 Thread Laurent Dufour
-off-by: Laurent Dufour --- include/linux/mm.h | 3 ++ mm/memory.c| 148 +++-- 2 files changed, 148 insertions(+), 3 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 6b7ec2a76953..671541e00d26 100644 --- a/include/linux

[RFC v4 13/20] mm/spf: Add check on the VMA's flags

2017-06-09 Thread Laurent Dufour
When handling a speculative page fault, we should check the VMA's access permissions, as is done in handle_mm_fault() or in access_error in x86's fault handler. Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- mm/memory.c | 24 1 file changed, 24 inse

[RFC v4 13/20] mm/spf: Add check on the VMA's flags

2017-06-09 Thread Laurent Dufour
When handling a speculative page fault, we should check the VMA's access permissions, as is done in handle_mm_fault() or in access_error in x86's fault handler. Signed-off-by: Laurent Dufour --- mm/memory.c | 24 1 file changed, 24 insertions(+) diff --git a/mm/memory.c

[RFC v4 00/20] Speculative page faults

2017-06-09 Thread Laurent Dufour
vel paging. - abort speculative path before entering userfault code - support for PowerPC architecture - reorder the patch to fix build test errors. Laurent Dufour (14): mm: Introduce pte_spinlock mm/spf: Try spin lock in speculative path mm/spf: Fix fe.sequence init in __handle_mm_fault()

[RFC v4 01/20] mm: Dont assume page-table invariance during faults

2017-06-09 Thread Laurent Dufour
From: Peter Zijlstra One of the side effects of speculating on faults (without holding mmap_sem) is that we can race with free_pgtables() and therefore we cannot assume the page-tables will stick around. Remove the reliance on the pte pointer. Signed-off-by: Peter

[RFC v4 01/20] mm: Dont assume page-table invariance during faults

2017-06-09 Thread Laurent Dufour
From: Peter Zijlstra One of the side effects of speculating on faults (without holding mmap_sem) is that we can race with free_pgtables() and therefore we cannot assume the page-tables will stick around. Remove the reliance on the pte pointer. Signed-off-by: Peter Zijlstra (Intel) ---

[RFC v2 07/10] mm: Add a range lock parameter to GUP() and handle_page_fault()

2017-05-24 Thread Laurent Dufour
As get_user_pages*(), handle_page_fault(), fixup_user_fault() and populate_vma_page_range() functions may release the mmap_sem, they have to know the range when dealing with range locks. Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- arch/powerpc/mm/copro_f

[RFC v2 07/10] mm: Add a range lock parameter to GUP() and handle_page_fault()

2017-05-24 Thread Laurent Dufour
As get_user_pages*(), handle_page_fault(), fixup_user_fault() and populate_vma_page_range() functions may release the mmap_sem, they have to know the range when dealing with range locks. Signed-off-by: Laurent Dufour --- arch/powerpc/mm/copro_fault.c | 2 +- arch/powerpc/mm

[RFC v2 02/10] mm: Remove nest locking operation with mmap_sem

2017-05-24 Thread Laurent Dufour
The range locking framework doesn't yet provide nested locking operations. Once the range locking API provides nested operation support, this patch will have to be reviewed. Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- mm/mmap.c | 8 1 file changed, 8 inse

[RFC v2 03/10] mm: Add a range parameter to the vm_fault structure

2017-05-24 Thread Laurent Dufour
-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- include/linux/mm.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index 7cb17c6b97de..4ad96294c180 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -344,6 +344,9 @@ struct vm

[RFC v2 02/10] mm: Remove nest locking operation with mmap_sem

2017-05-24 Thread Laurent Dufour
The range locking framework doesn't yet provide nested locking operations. Once the range locking API provides nested operation support, this patch will have to be reviewed. Signed-off-by: Laurent Dufour --- mm/mmap.c | 8 1 file changed, 8 insertions(+) diff --git a/mm/mmap.c b/mm

[RFC v2 03/10] mm: Add a range parameter to the vm_fault structure

2017-05-24 Thread Laurent Dufour
-by: Laurent Dufour --- include/linux/mm.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index 7cb17c6b97de..4ad96294c180 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -344,6 +344,9 @@ struct vm_fault

[RFC v2 04/10] mm: Handle range lock field when collapsing huge pages

2017-05-24 Thread Laurent Dufour
When collapsing huge pages from a swap-in operation, a vm_fault structure is built and passed to do_swap_page(). The new range field of the vm_fault structure must be set correctly when dealing with range_lock. Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- mm/khugepaged.

[RFC v2 04/10] mm: Handle range lock field when collapsing huge pages

2017-05-24 Thread Laurent Dufour
When collapsing huge pages from a swap-in operation, a vm_fault structure is built and passed to do_swap_page(). The new range field of the vm_fault structure must be set correctly when dealing with range_lock. Signed-off-by: Laurent Dufour --- mm/khugepaged.c | 39

[RFC v2 06/10] mm: Add a range lock parameter to lock_page_or_retry()

2017-05-24 Thread Laurent Dufour
As lock_page_or_retry() may release the mmap_sem, it has to know about the range applying to the lock when using range locks. This patch adds a new range parameter to __lock_page_or_retry() and deals with the callers. Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- include

[RFC v2 05/10] mm: Add a range lock parameter to userfaultfd_remove()

2017-05-24 Thread Laurent Dufour
()'s callers are touched to deal with the range parameter as well. Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- fs/userfaultfd.c | 8 ++-- include/linux/userfaultfd_k.h | 28 mm/madvise.c

[RFC v2 06/10] mm: Add a range lock parameter to lock_page_or_retry()

2017-05-24 Thread Laurent Dufour
As lock_page_or_retry() may release the mmap_sem, it has to know about the range applying to the lock when using range locks. This patch adds a new range parameter to __lock_page_or_retry() and deals with the callers. Signed-off-by: Laurent Dufour --- include/linux/pagemap.h | 17

[RFC v2 05/10] mm: Add a range lock parameter to userfaultfd_remove()

2017-05-24 Thread Laurent Dufour
()'s callers are touched to deal with the range parameter as well. Signed-off-by: Laurent Dufour --- fs/userfaultfd.c | 8 ++-- include/linux/userfaultfd_k.h | 28 mm/madvise.c | 42 +- mm

[RFC v2 00/10] Replace mmap_sem by a range lock

2017-05-24 Thread Laurent Dufour
ns to move easily from a semaphore to a range lock. - split the leading patches adding the range parameters to some services. Laurent Dufour (10): mm: Deactivate mmap_sem assert mm: Remove nest locking operation with mmap_sem mm: Add a range parameter to the vm_fault structure mm: Handle

[RFC v2 01/10] mm: Deactivate mmap_sem assert

2017-05-24 Thread Laurent Dufour
-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- arch/powerpc/platforms/powernv/npu-dma.c | 2 ++ arch/x86/events/core.c | 2 ++ fs/userfaultfd.c | 6 ++ include/linux/huge_mm.h | 4 mm

[RFC v2 00/10] Replace mmap_sem by a range lock

2017-05-24 Thread Laurent Dufour
ns to move easily from a semaphore to a range lock. - split the leading patches adding the range parameters to some services. Laurent Dufour (10): mm: Deactivate mmap_sem assert mm: Remove nest locking operation with mmap_sem mm: Add a range parameter to the vm_fault structure mm: Handle

[RFC v2 01/10] mm: Deactivate mmap_sem assert

2017-05-24 Thread Laurent Dufour
-off-by: Laurent Dufour --- arch/powerpc/platforms/powernv/npu-dma.c | 2 ++ arch/x86/events/core.c | 2 ++ fs/userfaultfd.c | 6 ++ include/linux/huge_mm.h | 4 mm/gup.c | 2 ++ mm/memory.c

[RFC v2 10/10] mm: Introduce CONFIG_MEM_RANGE_LOCK

2017-05-24 Thread Laurent Dufour
off and requires the EXPERT mode since it is not yet complete. Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- mm/Kconfig | 12 1 file changed, 12 insertions(+) diff --git a/mm/Kconfig b/mm/Kconfig index beb7a455915d..955d9a735a49 100644 --- a/mm/Kconfig ++

[RFC v2 08/10] mm: Define mem range lock operations

2017-05-24 Thread Laurent Dufour
. Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- include/linux/mm.h | 27 +++ include/linux/mm_types.h | 5 + 2 files changed, 32 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index b09048386152..d47b28eb0a53

[RFC v2 10/10] mm: Introduce CONFIG_MEM_RANGE_LOCK

2017-05-24 Thread Laurent Dufour
off and requires the EXPERT mode since it is not yet complete. Signed-off-by: Laurent Dufour --- mm/Kconfig | 12 1 file changed, 12 insertions(+) diff --git a/mm/Kconfig b/mm/Kconfig index beb7a455915d..955d9a735a49 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -309,6 +309,18

[RFC v2 08/10] mm: Define mem range lock operations

2017-05-24 Thread Laurent Dufour
. Signed-off-by: Laurent Dufour --- include/linux/mm.h | 27 +++ include/linux/mm_types.h | 5 + 2 files changed, 32 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index b09048386152..d47b28eb0a53 100644 --- a/include/linux/mm.h +++ b/include

Re: [PATCH 2/6] locking: Introduce range reader/writer lock

2017-05-23 Thread Laurent Dufour
On 15/05/2017 11:07, Davidlohr Bueso wrote: > --- /dev/null > +++ b/include/linux/range_lock.h > @@ -0,0 +1,181 @@ > +/* > + * Range/interval rw-locking > + * - > + * > + * Interval-tree based range locking is about controlling tasks' forward > + * progress when adding an

Re: [PATCH v2 1/2] mm: Uncharge poisoned pages

2017-05-08 Thread Laurent Dufour
On 04/05/2017 03:21, Balbir Singh wrote: >> @@ -5527,7 +5527,7 @@ static void uncharge_list(struct list_head *page_list) >> next = page->lru.next; >> >> VM_BUG_ON_PAGE(PageLRU(page), page); >> -VM_BUG_ON_PAGE(page_count(page), page); >> +

Re: [RFC v3 03/17] mm: Introduce pte_spinlock

2017-05-03 Thread Laurent Dufour
On 30/04/2017 06:47, Matthew Wilcox wrote: > On Thu, Apr 27, 2017 at 05:52:42PM +0200, Laurent Dufour wrote: >> +++ b/mm/memory.c >> @@ -2100,6 +2100,13 @@ static inline void wp_page_reuse(struct vm_fault *vmf) >> pte_unmap_unlock(vmf->pte, vmf->ptl); >> }

Re: [PATCH v2 1/2] mm: Uncharge poisoned pages

2017-05-03 Thread Laurent Dufour
On 02/05/2017 20:55, Michal Hocko wrote: > On Tue 02-05-17 16:59:30, Laurent Dufour wrote: >> On 28/04/2017 15:48, Michal Hocko wrote: > [...] >>> This is getting quite hairy. What is the expected page count of the >>> hwpoison page? > > OK, so from the quick

Re: [RFC v3 05/17] RCU free VMAs

2017-05-03 Thread Laurent Dufour
On 30/04/2017 07:05, Matthew Wilcox wrote: > On Thu, Apr 27, 2017 at 05:52:44PM +0200, Laurent Dufour wrote: >> +static inline bool vma_is_dead(struct vm_area_struct *vma, unsigned int >> sequence) >> +{ >> +int ret = RB_EMPTY_NODE(&vma->vm_rb); >>

Re: [PATCH v2 1/2] mm: Uncharge poisoned pages

2017-05-02 Thread Laurent Dufour
On 28/04/2017 15:48, Michal Hocko wrote: > On Fri 28-04-17 11:17:34, Laurent Dufour wrote: >> On 28/04/2017 09:31, Michal Hocko wrote: >>> [CC Johannes and Vladimir - the patch is >>> http://lkml.kernel.org/r/1493130472-22843-2-git-send-email-lduf...@linux.vnet.ibm.com]

Re: [PATCH v2 1/2] mm: Uncharge poisoned pages

2017-04-28 Thread Laurent Dufour
On 26/04/2017 10:59, Balbir Singh wrote: > On Wed, 2017-04-26 at 04:46 +, Naoya Horiguchi wrote: >> On Wed, Apr 26, 2017 at 01:45:00PM +1000, Balbir Singh wrote: >> static int delete_from_lru_cache(struct page *p) >> { >> +if (memcg_kmem_enabled()) >> +

Re: [PATCH v2 1/2] mm: Uncharge poisoned pages

2017-04-28 Thread Laurent Dufour
>>> Michal Hocko <mho...@kernel.org> writes: >>> >>>> On Tue 25-04-17 16:27:51, Laurent Dufour wrote: >>>>> When pages are poisoned, they should be uncharged from the root memory >>>>> cgroup. >>>>> >>>>> This is

Re: [PATCH v2 1/2] mm: Uncharge poisoned pages

2017-04-28 Thread Laurent Dufour
>> Michal Hocko writes: >>> >>>> On Tue 25-04-17 16:27:51, Laurent Dufour wrote: >>>>> When pages are poisoned, they should be uncharged from the root memory >>>>> cgroup. >>>>> >>>>> This is required to avoid

[RFC v3 02/17] mm: Prepare for FAULT_FLAG_SPECULATIVE

2017-04-27 Thread Laurent Dufour
(Intel) <pet...@infradead.org> [port to 4.10 kernel] Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- include/linux/mm.h | 1 + mm/memory.c| 57 +++--- 2 files changed, 42 insertions(+), 16 deletions(-) diff --git a/

[RFC v3 02/17] mm: Prepare for FAULT_FLAG_SPECULATIVE

2017-04-27 Thread Laurent Dufour
] Signed-off-by: Laurent Dufour --- include/linux/mm.h | 1 + mm/memory.c| 57 +++--- 2 files changed, 42 insertions(+), 16 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index b84615b0f64c..555ac9ac7202 100644

[RFC v3 03/17] mm: Introduce pte_spinlock

2017-04-27 Thread Laurent Dufour
This is needed because in handle_pte_fault() pte_offset_map() is called and then fe->ptl is fetched and spin_locked. This was previously embedded in the call to pte_offset_map_lock(). Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- mm/memory.c | 15 +++ 1 fil

[RFC v3 03/17] mm: Introduce pte_spinlock

2017-04-27 Thread Laurent Dufour
This is needed because in handle_pte_fault() pte_offset_map() is called and then fe->ptl is fetched and spin_locked. This was previously embedded in the call to pte_offset_map_lock(). Signed-off-by: Laurent Dufour --- mm/memory.c | 15 +++ 1 file changed, 11 insertions(+)

[RFC v3 09/17] mm/spf: Fix fe.sequence init in __handle_mm_fault()

2017-04-27 Thread Laurent Dufour
__handle_mm_fault() calls handle_pte_fault which requires the sequence field of the fault_env to be initialized. Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- mm/memory.c | 1 + 1 file changed, 1 insertion(+) diff --git a/mm/memory.c b/mm/memory.c index 458f57

[RFC v3 09/17] mm/spf: Fix fe.sequence init in __handle_mm_fault()

2017-04-27 Thread Laurent Dufour
__handle_mm_fault() calls handle_pte_fault which requires the sequence field of the fault_env to be initialized. Signed-off-by: Laurent Dufour --- mm/memory.c | 1 + 1 file changed, 1 insertion(+) diff --git a/mm/memory.c b/mm/memory.c index 458f579feb6f..f8afd52f0d34 100644 --- a/mm/memory.c

[RFC v3 04/17] mm: VMA sequence count

2017-04-27 Thread Laurent Dufour
4.10 kernel] Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- include/linux/mm_types.h | 1 + mm/memory.c | 2 ++ mm/mmap.c| 13 + 3 files changed, 16 insertions(+) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h

[RFC v3 08/17] mm/spf: Try spin lock in speculative path

2017-04-27 Thread Laurent Dufour
/0x280 __do_page_fault+0x187/0x580 trace_do_page_fault+0x52/0x260 do_async_page_fault+0x19/0x70 async_page_fault+0x28/0x30 Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- mm/memory.c | 17 ++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/mm/memory.

[RFC v3 10/17] mm/spf: don't set fault entry's fields if locking failed

2017-04-27 Thread Laurent Dufour
If pte_map_lock failed to lock the pte, or if the VMA is no longer valid, the fault entry's fields should not be set, so that the caller won't try to unlock it. Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- mm/memory.c | 14 +- 1 file changed, 9 insertions

[RFC v3 04/17] mm: VMA sequence count

2017-04-27 Thread Laurent Dufour
are still valid. The flip side is that we cannot distinguish between a vma_adjust() and the unmap_page_range() -- where with the former we could have re-checked the vma bounds against the address. Signed-off-by: Peter Zijlstra (Intel) [port to 4.10 kernel] Signed-off-by: Laurent Dufour --- include

[RFC v3 08/17] mm/spf: Try spin lock in speculative path

2017-04-27 Thread Laurent Dufour
/0x280 __do_page_fault+0x187/0x580 trace_do_page_fault+0x52/0x260 do_async_page_fault+0x19/0x70 async_page_fault+0x28/0x30 Signed-off-by: Laurent Dufour --- mm/memory.c | 17 ++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index fd3a0dc122c5

[RFC v3 10/17] mm/spf: don't set fault entry's fields if locking failed

2017-04-27 Thread Laurent Dufour
If pte_map_lock failed to lock the pte, or if the VMA is no longer valid, the fault entry's fields should not be set, so that the caller won't try to unlock it. Signed-off-by: Laurent Dufour --- mm/memory.c | 14 +- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/mm

[RFC v3 07/17] mm,x86: Add speculative pagefault handling

2017-04-27 Thread Laurent Dufour
From: Peter Zijlstra Try a speculative fault before acquiring mmap_sem, if it returns with VM_FAULT_RETRY continue with the mmap_sem acquisition and do the traditional fault. Signed-off-by: Peter Zijlstra (Intel) --- arch/x86/mm/fault.c | 18

[RFC v3 07/17] mm,x86: Add speculative pagefault handling

2017-04-27 Thread Laurent Dufour
From: Peter Zijlstra Try a speculative fault before acquiring mmap_sem, if it returns with VM_FAULT_RETRY continue with the mmap_sem acquisition and do the traditional fault. Signed-off-by: Peter Zijlstra (Intel) --- arch/x86/mm/fault.c | 18 ++ 1 file changed, 18

[RFC v3 16/17] mm: protect madvise vs speculative pf

2017-04-27 Thread Laurent Dufour
This is an attempt to protect madvise's effect against the speculative page fault handler. Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> --- mm/madvise.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/mm/madvise.c b/mm/madvise.c index 0e3828
