Re: [PATCH v2 11/11] mm,sched: conditionally skip lazy TLB mm refcounting

2018-07-30 Thread Rik van Riel
On Mon, 2018-07-30 at 18:26 +0200, Peter Zijlstra wrote: > > So for ARCH_NO_ACTIVE_MM we never touch ->active_mm and therefore > ->active_mm == ->mm. Close, but not true for kernel threads, which have a NULL ->mm, but a non-null ->active_mm that gets passed to enter_lazy_tlb(). I stuck to the

Re: [PATCH v2 11/11] mm,sched: conditionally skip lazy TLB mm refcounting

2018-07-30 Thread Rik van Riel
On Mon, 2018-07-30 at 12:49 -0700, Andy Lutomirski wrote: > > I think it's a big step in the right direction, but it still makes me > nervous. I'd be more comfortable with it if you at least had a > functional set of patches that result in active_mm being gone, > because > that will mean that

Re: [PATCH v2 11/11] mm,sched: conditionally skip lazy TLB mm refcounting

2018-07-30 Thread Rik van Riel
On Mon, 2018-07-30 at 12:30 -0700, Andy Lutomirski wrote: > On Mon, Jul 30, 2018 at 12:15 PM, Rik van Riel > wrote: > > On Mon, 2018-07-30 at 18:26 +0200, Peter Zijlstra wrote: > > > On Mon, Jul 30, 2018 at 10:30:11AM -0400, Rik van Riel wrote: > > > > > >

Re: [PATCH v2 11/11] mm,sched: conditionally skip lazy TLB mm refcounting

2018-07-30 Thread Rik van Riel
On Mon, 2018-07-30 at 18:26 +0200, Peter Zijlstra wrote: > On Mon, Jul 30, 2018 at 10:30:11AM -0400, Rik van Riel wrote: > > > > What happened to the rework I did there? That not only avoided > > > fiddling > > > with active_mm, but also avoids grab/drop cycles

Re: [PATCH v2 11/11] mm,sched: conditionally skip lazy TLB mm refcounting

2018-07-30 Thread Rik van Riel
On Mon, 2018-07-30 at 11:55 +0200, Peter Zijlstra wrote: > On Sun, Jul 29, 2018 at 03:54:52PM -0400, Rik van Riel wrote: > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > > index c45de46fdf10..11724c9e88b0 100644 > > --- a/kernel/sched/core.c > > +++ b/kernel/

[PATCH v2 11/11] mm,sched: conditionally skip lazy TLB mm refcounting

2018-07-29 Thread Rik van Riel
On Sat, 28 Jul 2018 21:21:17 -0700 Andy Lutomirski wrote: > On Sat, Jul 28, 2018 at 2:53 PM, Rik van Riel wrote: > > Conditionally skip lazy TLB mm refcounting. When an architecture has > > CONFIG_ARCH_NO_ACTIVE_MM_REFCOUNTING enabled, an mm that is used in > > lazy TLB m

[PATCH v2 10/11] x86,tlb: really leave mm on shootdown

2018-07-29 Thread Rik van Riel
On Sat, 28 Jul 2018 21:21:17 -0700 Andy Lutomirski wrote: > On Sat, Jul 28, 2018 at 2:53 PM, Rik van Riel wrote: > > Conditionally skip lazy TLB mm refcounting. When an architecture has > > CONFIG_ARCH_NO_ACTIVE_MM_REFCOUNTING enabled, an mm that is used in > > lazy TLB m

Re: [PATCH 03/10] smp,cpumask: introduce on_each_cpu_cond_mask

2018-07-29 Thread Rik van Riel
On Sun, 2018-07-29 at 08:36 -0700, Andy Lutomirski wrote: > On Jul 29, 2018, at 5:00 AM, Rik van Riel wrote: > > > On Sat, 2018-07-28 at 19:57 -0700, Andy Lutomirski wrote: > > > On Sat, Jul 28, 2018 at 2:53 PM, Rik van Riel > > > wrote: > > > >

Re: [PATCH 10/10] mm,sched: conditionally skip lazy TLB mm refcounting

2018-07-29 Thread Rik van Riel
On Sun, 2018-07-29 at 08:29 -0700, Andy Lutomirski wrote: > > On Jul 29, 2018, at 5:11 AM, Rik van Riel wrote: > > > > > On Sat, 2018-07-28 at 21:21 -0700, Andy Lutomirski wrote: > > > On Sat, Jul 28, 2018 at 2:53 PM, Rik van Riel > > > wrote: > >

Re: [PATCH 10/10] mm,sched: conditionally skip lazy TLB mm refcounting

2018-07-29 Thread Rik van Riel
On Sat, 2018-07-28 at 21:21 -0700, Andy Lutomirski wrote: > On Sat, Jul 28, 2018 at 2:53 PM, Rik van Riel > wrote: > > Conditionally skip lazy TLB mm refcounting. When an architecture > > has > > CONFIG_ARCH_NO_ACTIVE_MM_REFCOUNTING enabled, an mm that is used in >

Re: [PATCH 04/10] x86,mm: use on_each_cpu_cond for TLB flushes

2018-07-29 Thread Rik van Riel
On Sat, 2018-07-28 at 19:58 -0700, Andy Lutomirski wrote: > On Sat, Jul 28, 2018 at 2:53 PM, Rik van Riel > wrote: > > Instead of open coding bitmap magic, use on_each_cpu_cond > > to determine which CPUs to send TLB flush IPIs to. > > > > This might be a li

Re: [PATCH 03/10] smp,cpumask: introduce on_each_cpu_cond_mask

2018-07-29 Thread Rik van Riel
On Sat, 2018-07-28 at 19:57 -0700, Andy Lutomirski wrote: > On Sat, Jul 28, 2018 at 2:53 PM, Rik van Riel > wrote: > > Introduce a variant of on_each_cpu_cond that iterates only over the > > CPUs in a cpumask, in order to avoid making callbacks for every > > single >

[PATCH 06/10] mm,x86: skip cr4 and ldt reload when mm stays the same

2018-07-28 Thread Rik van Riel
to the same mm after a flush. Suggested-by: Andy Lutomirski Signed-off-by: Rik van Riel --- arch/x86/mm/tlb.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 671cc66df801..149fb64e4bf4 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm

[PATCH 03/10] smp,cpumask: introduce on_each_cpu_cond_mask

2018-07-28 Thread Rik van Riel
Introduce a variant of on_each_cpu_cond that iterates only over the CPUs in a cpumask, in order to avoid making callbacks for every single CPU in the system when we only need to test a subset. Signed-off-by: Rik van Riel --- include/linux/smp.h | 4 kernel/smp.c| 17

[PATCH 08/10] arch,mm: add config variable to skip lazy TLB mm refcounting

2018-07-28 Thread Rik van Riel
Add a config variable indicating that this architecture does not require lazy TLB mm refcounting, because lazy TLB mms get shot down instantly at exit_mmap time. Signed-off-by: Rik van Riel --- arch/Kconfig | 4 1 file changed, 4 insertions(+) diff --git a/arch/Kconfig b/arch/Kconfig

[PATCH 07/10] x86,mm: remove leave_mm cpu argument

2018-07-28 Thread Rik van Riel
The function leave_mm does not use its cpu argument, but always works on the CPU where it is called. Change the argument to a void *, so leave_mm can be called directly from smp_call_function_mask, and stop looking up the CPU number in current leave_mm callers. Signed-off-by: Rik van Riel

[PATCH 05/10] mm,tlb: turn dummy defines into inline functions

2018-07-28 Thread Rik van Riel
Turn the dummy tlb_flush_remove_tables* defines into inline functions, in order to get compiler type checking, etc. Suggested-by: Peter Zijlstra Signed-off-by: Rik van Riel --- include/asm-generic/tlb.h | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/include/asm

[PATCH 09/10] mm,x86: shoot down lazy TLB references at exit_mmap time

2018-07-28 Thread Rik van Riel
Shooting down lazy TLB references to an mm at exit_mmap time ensures that no users of the mm_struct will be left anywhere in the system, allowing it to be torn down and freed immediately. Signed-off-by: Rik van Riel Suggested-by: Andy Lutomirski Suggested-by: Peter Zijlstra --- arch/x86

[PATCH 02/10] smp: use __cpumask_set_cpu in on_each_cpu_cond

2018-07-28 Thread Rik van Riel
The code in on_each_cpu_cond sets CPUs in a locally allocated bitmask, which should never be used by other CPUs simultaneously. There is no need to use locked memory accesses to set the bits in this bitmap. Switch to __cpumask_set_cpu. Suggested-by: Peter Zijlstra Signed-off-by: Rik van Riel

[PATCH 04/10] x86,mm: use on_each_cpu_cond for TLB flushes

2018-07-28 Thread Rik van Riel
Instead of open coding bitmap magic, use on_each_cpu_cond to determine which CPUs to send TLB flush IPIs to. This might be a little bit slower than examining the bitmaps, but it should be a lot easier to maintain in the long run. Suggested-by: Peter Zijlstra Signed-off-by: Rik van Riel

[PATCH 01/10] x86,tlb: clarify memory barrier in switch_mm_irqs_off

2018-07-28 Thread Rik van Riel
Clarify exactly what the memory barrier synchronizes with. Suggested-by: Peter Zijlstra Signed-off-by: Rik van Riel --- arch/x86/mm/tlb.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 752dbf4e0e50..5321e02c4e09 100644

[PATCH 0/10] x86,tlb,mm: more lazy TLB cleanups & optimizations

2018-07-28 Thread Rik van Riel
This patch series implements the cleanups suggested by Peter and Andy, removes lazy TLB mm refcounting on x86, and shows how other architectures could implement that same optimization. The previous patch series already seems to have removed most of the cache line contention I was seeing at

[PATCH 10/10] mm,sched: conditionally skip lazy TLB mm refcounting

2018-07-28 Thread Rik van Riel
is about to start using. Signed-off-by: Rik van Riel --- fs/exec.c| 2 +- include/linux/sched/mm.h | 25 + kernel/sched/core.c | 6 +++--- mm/mmu_context.c | 21 ++--- 4 files changed, 43 insertions(+), 11 deletions(-) diff

Re: [PATCH v2 00/19] Fixes for sched/numa_balancing

2018-07-23 Thread Rik van Riel
On Mon, 2018-07-23 at 08:09 -0700, Srikar Dronamraju wrote: > * Peter Zijlstra [2018-07-23 15:57:00]: > > > On Wed, Jun 20, 2018 at 10:32:41PM +0530, Srikar Dronamraju wrote: > > > Srikar Dronamraju (19): > > > > > sched/numa: Stop multiple tasks from moving to the cpu at the > > > same time

Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-07-18 Thread Rik van Riel
> On Jul 17, 2018, at 4:04 PM, Andy Lutomirski wrote: > > > I think you've introduced a minor-ish performance regression due to > changing the old (admittedly terribly documented) control flow a bit. > Before, if real_prev == next, we would skip: > > load_mm_cr4(next); >

Re: [tip:x86/mm] x86/mm/tlb: Make lazy TLB mode lazier

2018-07-18 Thread Rik van Riel
> On Jul 18, 2018, at 2:23 PM, Peter Zijlstra wrote: > > On Wed, Jul 18, 2018 at 01:22:19PM -0400, Rik van Riel wrote: >>> On Jul 18, 2018, at 12:00 PM, Peter Zijlstra wrote: > >>> Also, I don't suppose you've looked at the paravirt instances of >>>

Re: [tip:x86/mm] x86/mm/tlb: Make lazy TLB mode lazier

2018-07-18 Thread Rik van Riel
> On Jul 17, 2018, at 7:33 AM, Peter Zijlstra wrote: > > On Tue, Jul 17, 2018 at 02:35:08AM -0700, tip-bot for Rik van Riel wrote: >> @@ -242,17 +244,40 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct >> mm_struct *next, >>

Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-07-17 Thread Rik van Riel
> On Jul 17, 2018, at 5:29 PM, Andy Lutomirski wrote: > > On Tue, Jul 17, 2018 at 1:16 PM, Rik van Riel wrote: >> Can I skip both the cr4 and LDT switches when the TLB contents >> are no longer valid and got reloaded? >> >> If the TLB contents are still va

[tip:x86/mm] x86/mm/tlb: Skip atomic operations for 'init_mm' in switch_mm_irqs_off()

2018-07-17 Thread tip-bot for Rik van Riel
Commit-ID: e9d8c61557687b7126101e9550bdf243223f0d8f Gitweb: https://git.kernel.org/tip/e9d8c61557687b7126101e9550bdf243223f0d8f Author: Rik van Riel AuthorDate: Mon, 16 Jul 2018 15:03:37 -0400 Committer: Ingo Molnar CommitDate: Tue, 17 Jul 2018 09:35:34 +0200 x86/mm/tlb: Skip atomic

[tip:x86/mm] x86/mm/tlb: Always use lazy TLB mode

2018-07-17 Thread tip-bot for Rik van Riel
Commit-ID: 95b0e6357d3e4e05349668940d7ff8f3b7e7e11e Gitweb: https://git.kernel.org/tip/95b0e6357d3e4e05349668940d7ff8f3b7e7e11e Author: Rik van Riel AuthorDate: Mon, 16 Jul 2018 15:03:36 -0400 Committer: Ingo Molnar CommitDate: Tue, 17 Jul 2018 09:35:34 +0200 x86/mm/tlb: Always use

[tip:x86/mm] x86/mm/tlb: Only send page table free TLB flush to lazy TLB CPUs

2018-07-17 Thread tip-bot for Rik van Riel
Commit-ID: 64482aafe55fc7e84d0741c356f8176ee7bde357 Gitweb: https://git.kernel.org/tip/64482aafe55fc7e84d0741c356f8176ee7bde357 Author: Rik van Riel AuthorDate: Mon, 16 Jul 2018 15:03:35 -0400 Committer: Ingo Molnar CommitDate: Tue, 17 Jul 2018 09:35:33 +0200 x86/mm/tlb: Only send

[tip:x86/mm] x86/mm/tlb: Make lazy TLB mode lazier

2018-07-17 Thread tip-bot for Rik van Riel
Commit-ID: ac0315896970d8589291e9d8a1569fc65967b7f1 Gitweb: https://git.kernel.org/tip/ac0315896970d8589291e9d8a1569fc65967b7f1 Author: Rik van Riel AuthorDate: Mon, 16 Jul 2018 15:03:34 -0400 Committer: Ingo Molnar CommitDate: Tue, 17 Jul 2018 09:35:33 +0200 x86/mm/tlb: Make lazy TLB

[tip:x86/mm] x86/mm/tlb: Restructure switch_mm_irqs_off()

2018-07-17 Thread tip-bot for Rik van Riel
Commit-ID: 61d0beb5796ab11f7f3bf38cb2eccc6579aaa70b Gitweb: https://git.kernel.org/tip/61d0beb5796ab11f7f3bf38cb2eccc6579aaa70b Author: Rik van Riel AuthorDate: Mon, 16 Jul 2018 15:03:33 -0400 Committer: Ingo Molnar CommitDate: Tue, 17 Jul 2018 09:35:32 +0200 x86/mm/tlb: Restructure

[tip:x86/mm] x86/mm/tlb: Leave lazy TLB mode at page table free time

2018-07-17 Thread tip-bot for Rik van Riel
Commit-ID: 2ff6ddf19c0ec40633bd14d8fe28a289816bd98d Gitweb: https://git.kernel.org/tip/2ff6ddf19c0ec40633bd14d8fe28a289816bd98d Author: Rik van Riel AuthorDate: Mon, 16 Jul 2018 15:03:32 -0400 Committer: Ingo Molnar CommitDate: Tue, 17 Jul 2018 09:35:31 +0200 x86/mm/tlb: Leave lazy

[tip:x86/mm] mm: Allocate the mm_cpumask (mm->cpu_bitmap[]) dynamically based on nr_cpu_ids

2018-07-17 Thread tip-bot for Rik van Riel
Commit-ID: c1a2f7f0c06454387c2cd7b93ff1491c715a8c69 Gitweb: https://git.kernel.org/tip/c1a2f7f0c06454387c2cd7b93ff1491c715a8c69 Author: Rik van Riel AuthorDate: Mon, 16 Jul 2018 15:03:31 -0400 Committer: Ingo Molnar CommitDate: Tue, 17 Jul 2018 09:35:30 +0200 mm: Allocate

[PATCH 5/7] x86,tlb: only send page table free TLB flush to lazy TLB CPUs

2018-07-16 Thread Rik van Riel
be up to date yet. Signed-off-by: Rik van Riel Acked-by: Dave Hansen Tested-by: Song Liu --- arch/x86/mm/tlb.c | 43 +++ 1 file changed, 39 insertions(+), 4 deletions(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 26542cc17043..e4156e37aa71

[PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-07-16 Thread Rik van Riel
oad on two socket systems, and by about 1% for a heavily multi-process netperf between two systems. Signed-off-by: Rik van Riel Acked-by: Dave Hansen Tested-by: Song Liu --- arch/x86/mm/tlb.c | 68 +++ 1 file changed, 59 insertions(+), 9 deleti

[PATCH 7/7] x86,switch_mm: skip atomic operations for init_mm

2018-07-16 Thread Rik van Riel
. Signed-off-by: Rik van Riel Acked-by: Dave Hansen Reported-and-tested-by: Song Liu --- arch/x86/mm/tlb.c | 17 - 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 493559cae2d5..f086195f644c 100644 --- a/arch/x86/mm/tlb.c

[PATCH 6/7] x86,mm: always use lazy TLB mode

2018-07-16 Thread Rik van Riel
Now that CPUs in lazy TLB mode no longer receive TLB shootdown IPIs, except at page table freeing time, and idle CPUs will no longer get shootdown IPIs for things like mprotect and madvise, we can always use lazy TLB mode. Signed-off-by: Rik van Riel Acked-by: Dave Hansen Tested-by: Song Liu

[PATCH 3/7] x86,mm: restructure switch_mm_irqs_off

2018-07-16 Thread Rik van Riel
Move some code that will be needed for the lazy -> !lazy state transition when a lazy TLB CPU has gotten out of date. No functional changes, since the if (real_prev == next) branch always returns. Signed-off-by: Rik van Riel Acked-by: Dave Hansen Suggested-by: Andy Lutomirski --- arch/x86

[PATCH 1/7] mm: allocate mm_cpumask dynamically based on nr_cpu_ids

2018-07-16 Thread Rik van Riel
is compiled for, since we only have one init_mm in the system, anyway. Pointer magic by Mike Galbraith, to evade -Wstringop-overflow getting confused by the dynamically sized array. Signed-off-by: Rik van Riel Signed-off-by: Mike Galbraith Signed-off-by: Rik van Riel Acked-by: Dave Hansen

[PATCH v6 0/7] x86,tlb,mm: make lazy TLB mode even lazier

2018-07-16 Thread Rik van Riel
Song noticed switch_mm_irqs_off taking a lot of CPU time in recent kernels, using 1.9% of a 48 CPU system during a netperf run. Digging into the profile, the atomic operations in cpumask_clear_cpu and cpumask_set_cpu are responsible for about half of that CPU use. However, the CPUs running

[PATCH 2/7] x86,tlb: leave lazy TLB mode at page table free time

2018-07-16 Thread Rik van Riel
workloads, but do not involve page table freeing. Also, on munmap, batching of page table freeing covers much larger ranges of virtual memory than the batching of unmapped user pages. Signed-off-by: Rik van Riel Acked-by: Dave Hansen Tested-by: Song Liu --- arch/x86/include/asm/tlbflush.h | 5

Re: [PATCH 7/7] x86,switch_mm: skip atomic operations for init_mm

2018-07-16 Thread Rik van Riel
On Mon, 2018-07-16 at 03:04 +0200, Ingo Molnar wrote: > * Rik van Riel wrote: > > > On Mon, 2018-07-16 at 01:04 +0200, Ingo Molnar wrote: > > > * Rik van Riel wrote: > > > > > > > + /* > > > > +

Re: [PATCH 1/7] mm: allocate mm_cpumask dynamically based on nr_cpu_ids

2018-07-15 Thread Rik van Riel
On Mon, 2018-07-16 at 00:59 +0200, Ingo Molnar wrote: > * Rik van Riel wrote: > > > The mm_struct always contains a cpumask bitmap, regardless of > > CONFIG_CPUMASK_OFFSTACK. That means the first step can be to > > simplify things, and simply have one bitmask at the

Re: [PATCH 7/7] x86,switch_mm: skip atomic operations for init_mm

2018-07-15 Thread Rik van Riel
On Mon, 2018-07-16 at 01:04 +0200, Ingo Molnar wrote: > * Rik van Riel wrote: > > > + /* > > +* Stop remote flushes for the previous mm. > > +* Skip the idle task; we never send init_mm TLB > > flushing IPIs, > > +

Re: Lazy FPU restoration / moving kernel_fpu_end() to context switch

2018-07-11 Thread Rik van Riel
On Wed, 2018-07-11 at 18:28 +0200, Sebastian Andrzej Siewior wrote: > On 2018-06-15 22:33:47 [+0200], Jason A. Donenfeld wrote: > > On Fri, Jun 15, 2018 at 8:32 PM Andy Lutomirski > > wrote: > > > quite in the form you imagined. The idea that we've tossed > > > around is > > > to restore FPU

[PATCH 1/7] mm: allocate mm_cpumask dynamically based on nr_cpu_ids

2018-07-10 Thread Rik van Riel
is compiled for, since we only have one init_mm in the system, anyway. Pointer magic by Mike Galbraith, to evade -Wstringop-overflow getting confused by the dynamically sized array. Signed-off-by: Rik van Riel Signed-off-by: Mike Galbraith Acked-by: Dave Hansen Tested-by: Song Liu

[PATCH 3/7] x86,mm: restructure switch_mm_irqs_off

2018-07-10 Thread Rik van Riel
Move some code that will be needed for the lazy -> !lazy state transition when a lazy TLB CPU has gotten out of date. No functional changes, since the if (real_prev == next) branch always returns. Signed-off-by: Rik van Riel Acked-by: Dave Hansen Suggested-by: Andy Lutomirski --- arch/x86
