Re: [RFC PATCH 0/4] sched+mm: Track lazy active mm existence with hazard pointers

2024-10-07 Thread Peter Zijlstra
On Sat, Oct 05, 2024 at 09:56:24AM -0700, Linus Torvalds wrote:
> On Sat, 5 Oct 2024 at 09:16, Peter Zijlstra  wrote:
> >
> > On Wed, Oct 02, 2024 at 10:39:15AM -0700, Linus Torvalds wrote:
> > > So I think the real issue is that "active_mm" is an old hack from a
> > > bygone era when we didn't have the (much more involved) full TLB
> > > tracking.
> >
> > I still seem to have these patches that neither Andy nor I ever managed
> > to find time to finish:
> >
> >   
> > https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/log/?h=x86/lazy
> 
> Yes, that looks very much like what I had in mind.
> 
> In fact, it looks a lot smaller and simpler than what my mental model was.
> 
> I was thinking I'd do it by removing "active_mm" entirely from 'struct
> task_struct', and turn it into a per-cpu variable instead, and then
> try to massage that into some global new world order. That patch
> series you point to seems to be much simpler and clearer.
> 
> Of course, you also say "never managed to finish", so presumably
> there's something completely broken in that series, and it doesn't
> actually work?

Last time I tried it, it worked fine. I just didn't get around to
actually fully thinking it through and making sure nothing subtle was
broken etc. Pesky details and such..



Re: [RFC PATCH 0/4] sched+mm: Track lazy active mm existence with hazard pointers

2024-10-05 Thread Linus Torvalds
On Sat, 5 Oct 2024 at 09:16, Peter Zijlstra  wrote:
>
> On Wed, Oct 02, 2024 at 10:39:15AM -0700, Linus Torvalds wrote:
> > So I think the real issue is that "active_mm" is an old hack from a
> > bygone era when we didn't have the (much more involved) full TLB
> > tracking.
>
> I still seem to have these patches that neither Andy nor I ever managed
> to find time to finish:
>
>   
> https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/log/?h=x86/lazy

Yes, that looks very much like what I had in mind.

In fact, it looks a lot smaller and simpler than what my mental model was.

I was thinking I'd do it by removing "active_mm" entirely from 'struct
task_struct', and turn it into a per-cpu variable instead, and then
try to massage that into some global new world order. That patch
series you point to seems to be much simpler and clearer.

Of course, you also say "never managed to finish", so presumably
there's something completely broken in that series, and it doesn't
actually work?

   Linus



Re: [RFC PATCH 0/4] sched+mm: Track lazy active mm existence with hazard pointers

2024-10-05 Thread Peter Zijlstra
On Wed, Oct 02, 2024 at 10:39:15AM -0700, Linus Torvalds wrote:
> So I think the real issue is that "active_mm" is an old hack from a
> bygone era when we didn't have the (much more involved) full TLB
> tracking.

I still seem to have these patches that neither Andy nor I ever managed
to find time to finish:

  
https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/log/?h=x86/lazy



Re: [RFC PATCH 0/4] sched+mm: Track lazy active mm existence with hazard pointers

2024-10-02 Thread Linus Torvalds
On Tue, 1 Oct 2024 at 18:04, Mathieu Desnoyers wrote:
>
> Hazard pointers appear to be a good fit for replacing refcount based lazy
> active mm tracking.

If the mm refcount is this expensive, I suspect we really shouldn't
use it at all.

The thing is, we don't _need_ to use the mm refcount - the reason the
lazy-tlb handling uses it is because we already had that refcount and
it was easy to extend on existing logic, not because it's really
required any more.

The lazy-tlb activation is basically "I'm switching to a kernel
thread, so I'll re-use the TLB state of the previous thread".

(And yes, it also has a secondary case of "I'm exiting, so I will turn
the mm I already have into a lazy one").

But in the actual task switch case, the previous thread hasn't _lost_
that mm, so we don't actually need to take the refcount at all.
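
For readers unfamiliar with the scheme under discussion, here is a minimal
userspace sketch of refcount-based lazy mm handling at task switch. It is an
illustration only: struct mm, struct task and switch_mm_refcounted() are
hypothetical stand-ins, not the kernel's structures or functions. What it
tries to show is that every switch to or from a kernel thread performs an
atomic update on a counter shared by all CPUs that use the same mm.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct mm {
	atomic_int mm_count;	/* lazy references, shared by every CPU */
};

struct task {
	struct mm *mm;		/* NULL for a kernel thread */
	struct mm *active_mm;	/* mm whose page tables are really loaded */
};

/* Switch from prev to next, mimicking refcount-based lazy-TLB handling. */
static void switch_mm_refcounted(struct task *prev, struct task *next)
{
	if (!next->mm) {
		/* Kernel thread: borrow prev's address space and pin it. */
		next->active_mm = prev->active_mm;
		atomic_fetch_add(&next->active_mm->mm_count, 1);
	} else {
		next->active_mm = next->mm;
	}

	if (!prev->mm) {
		/* Leaving a kernel thread: drop the borrowed reference. */
		atomic_fetch_sub(&prev->active_mm->mm_count, 1);
		prev->active_mm = NULL;
	}
}

/* Returns 0 if the count goes 1 -> 2 -> 1 over a user/kthread round trip. */
int lazy_refcount_demo(void)
{
	struct mm mm;
	struct task user = { &mm, &mm };
	struct task kthread = { NULL, NULL };

	atomic_init(&mm.mm_count, 1);

	switch_mm_refcounted(&user, &kthread);	/* user -> kernel thread */
	if (atomic_load(&mm.mm_count) != 2 || kthread.active_mm != &mm)
		return 1;

	switch_mm_refcounted(&kthread, &user);	/* kernel thread -> user */
	if (atomic_load(&mm.mm_count) != 1 || kthread.active_mm != NULL)
		return 2;

	return 0;
}
```

The two atomic operations per round trip are the cost the thread is about:
with many CPUs bouncing between threads of one process and the idle or kernel
threads, they all hit the same cache line.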

We really just need to make sure to invalidate it before it's torn
down, but we do that *anyway* as part of TLB flushing.

(The exit case is actually different: we are setting it up to be lost,
although delayed - and the lazy count is the delay).

The only thing the refcount means is that we don't actually have to be
as careful when we actually *really* get rid of the MM. We can be a
bit laissez-faire about things because even if we weren't to
invalidate the lazy mm, it does have its own refcount, so we don't
much care.

But in reality, we're actually very careful about the active_mm
_anyway_, because of a fairly fundamental issue: the TLB shootdown and
PCID handling that we need to do even when mm's aren't lazy.

So we actually keep track of things like "which CPUs have seen this
MM state" in all the TLB code.

And even the exit case doesn't actually need the special thing - it
*does* need the "this CPU is still using this MM", but we have that
too as part of the TLB code - entirely independently of 'active_mm'.

So in many ways, I'm pretty sure not just the refcount, but all of
'active_mm', is largely pointless to begin with.

And if the refcount really is this big of a deal:

> nr threads (-t)  speedup
>             192     +28%

then we should probably just strive to get rid of 'active_mm' altogether.

Look, at least on x86 we ALREADY have a better replacement: it's the
percpu 'cpu_tlbstate'.

It basically duplicates all we do with active_mm and the whole "keep
track of old mm state" (the 'loaded_mm' member is basically the true
'active' mm), except it has some additional fixes:

 - it has some extra housekeeping data that the architecture wants
(for PCID updates etc)

 - it's actually atomic wrt the low-level code in ways that
'current->active_mm' isn't
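
To make the per-CPU idea concrete, here is a rough single-threaded sketch of
tracking the loaded mm per CPU and clearing stale users at teardown, with no
reference count involved. Only the name 'loaded_mm' comes from the text
above; struct cpu_tlb_state, the is_lazy flag, NR_CPUS and both helper
functions are assumptions made for the illustration, and a real kernel would
use IPIs rather than a plain loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4	/* arbitrary size for the sketch */

struct mm { int id; };

/* Loosely modelled on per-CPU TLB state; fields are illustrative only. */
struct cpu_tlb_state {
	struct mm *loaded_mm;	/* address space currently loaded on this CPU */
	bool is_lazy;		/* true while a kernel thread merely borrows it */
};

static struct cpu_tlb_state tlb_state[NR_CPUS];
static struct mm init_mm;	/* kernel-only address space to fall back on */

/* Record what a CPU has loaded; no shared counter is touched. */
static void cpu_load_mm(int cpu, struct mm *mm, bool lazy)
{
	tlb_state[cpu].loaded_mm = mm;
	tlb_state[cpu].is_lazy = lazy;
}

/*
 * Before 'mm' is freed, move every CPU that still has it loaded over to
 * init_mm. Returns the number of CPUs that had to be switched.
 */
static int mm_teardown_shootdown(struct mm *mm)
{
	int kicked = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (tlb_state[cpu].loaded_mm == mm) {
			cpu_load_mm(cpu, &init_mm, false);
			kicked++;
		}
	}
	return kicked;
}

/* Returns 0 if only the CPUs lazily holding the dying mm are switched. */
int loaded_mm_demo(void)
{
	struct mm dying = { 1 }, other = { 2 };

	cpu_load_mm(0, &init_mm, false);
	cpu_load_mm(1, &dying, true);
	cpu_load_mm(2, &other, false);
	cpu_load_mm(3, &dying, true);

	if (mm_teardown_shootdown(&dying) != 2)
		return 1;
	if (tlb_state[1].loaded_mm != &init_mm || tlb_state[3].loaded_mm != &init_mm)
		return 2;
	if (tlb_state[2].loaded_mm != &other)
		return 3;
	return 0;
}
```

The trade-off in this shape is that the fast path (task switch) writes only
CPU-local state, while the rare path (mm teardown) pays for a scan.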

So I think the real issue is that "active_mm" is an old hack from a
bygone era when we didn't have the (much more involved) full TLB
tracking.

   Linus



Re: [RFC PATCH 0/4] sched+mm: Track lazy active mm existence with hazard pointers

2024-10-02 Thread Jens Axboe
On 10/2/24 10:02 AM, Mathieu Desnoyers wrote:
> On 2024-10-02 17:58, Jens Axboe wrote:
>> On 10/2/24 9:53 AM, Mathieu Desnoyers wrote:
>>> On 2024-10-02 17:36, Mathieu Desnoyers wrote:
>>>> On 2024-10-02 17:33, Matthew Wilcox wrote:
>>>>> On Wed, Oct 02, 2024 at 11:26:27AM -0400, Mathieu Desnoyers wrote:
>>>>>> On 2024-10-02 16:09, Paul E. McKenney wrote:
>>>>>>> On Tue, Oct 01, 2024 at 09:02:01PM -0400, Mathieu Desnoyers wrote:
>>>>>>>> Hazard pointers appear to be a good fit for replacing refcount based lazy
>>>>>>>> active mm tracking.
>>>>>>>>
>>>>>>>> Highlight:
>>>>>>>>
>>>>>>>> will-it-scale context_switch1_threads
>>>>>>>>
>>>>>>>> nr threads (-t)  speedup
>>>>>>>>              24      +3%
>>>>>>>>              48     +12%
>>>>>>>>              96     +21%
>>>>>>>>             192     +28%
>>>>>>>
>>>>>>> Impressive!!!
>>>>>>>
>>>>>>> I have to ask...  Any data for smaller numbers of CPUs?
>>>>>>
>>>>>> Sure, but they are far less exciting ;-)
>>>>>
>>>>> How many CPUs in the system under test?
>>>>
>>>> 2 sockets, 96-core per socket:
>>>>
>>>> CPU(s):                   384
>>>>   On-line CPU(s) list:    0-383
>>>> Vendor ID:                AuthenticAMD
>>>>   Model name:             AMD EPYC 9654 96-Core Processor
>>>>     CPU family:           25
>>>>     Model:                17
>>>>     Thread(s) per core:   2
>>>>     Core(s) per socket:   96
>>>>     Socket(s):            2
>>>>     Stepping:             1
>>>>     Frequency boost:      enabled
>>>>     CPU(s) scaling MHz:   68%
>>>>     CPU max MHz:          3709.
>>>>     CPU min MHz:          400.
>>>>     BogoMIPS:             4800.00
>>>>
>>>> Note that Jens Axboe got even more impressive speedups testing this
>>>> on his 512-hw-thread EPYC [1] (390% speedup for 192 threads). I've
>>>> noticed I had schedstats and sched debug enabled in my config, so I'll
>>>> have to re-run my tests.
>>>
>>> A quick re-run of the 128-thread case with schedstats and sched debug
>>> disabled still shows around 26% speedup, similar to my prior numbers.
>>>
>>> I'm not sure why Jens has much better speedups on a similar system.
>>>
>>> I'm attaching my config in case someone spots anything obvious. Note
>>> that my BIOS is configured to show 24 NUMA nodes to the kernel (one
>>> NUMA node per core complex).
>>
>> Here's my .config - note it's from the stock kernel run, which is why it
>> still has:
>>
>> CONFIG_MMU_LAZY_TLB_REFCOUNT=y
>>
>> set. Have the same numa configuration as you, just end up with 32 nodes
>> on this box.
> 
> Just to make sure: did you use other command line options when starting
> the test program (other than -t N ?).

I did not, this is literally what I ran:

for i in 24 48 96 192 256 512 1024 2048; do echo $i threads; timeout -s INT -k 30 30 ./context_switch1_threads -t $i; done

and the numbers I got were very stable between runs and reboots.

-- 
Jens Axboe



Re: [RFC PATCH 0/4] sched+mm: Track lazy active mm existence with hazard pointers

2024-10-02 Thread Mathieu Desnoyers

On 2024-10-02 17:58, Jens Axboe wrote:

On 10/2/24 9:53 AM, Mathieu Desnoyers wrote:

On 2024-10-02 17:36, Mathieu Desnoyers wrote:

On 2024-10-02 17:33, Matthew Wilcox wrote:

On Wed, Oct 02, 2024 at 11:26:27AM -0400, Mathieu Desnoyers wrote:

On 2024-10-02 16:09, Paul E. McKenney wrote:

On Tue, Oct 01, 2024 at 09:02:01PM -0400, Mathieu Desnoyers wrote:

Hazard pointers appear to be a good fit for replacing refcount based lazy
active mm tracking.

Highlight:

will-it-scale context_switch1_threads

nr threads (-t)  speedup
             24      +3%
             48     +12%
             96     +21%
            192     +28%


Impressive!!!

I have to ask...  Any data for smaller numbers of CPUs?


Sure, but they are far less exciting ;-)


How many CPUs in the system under test?


2 sockets, 96-core per socket:

CPU(s):   384
On-line CPU(s) list:0-383
Vendor ID:AuthenticAMD
Model name: AMD EPYC 9654 96-Core Processor
  CPU family:   25
  Model:17
  Thread(s) per core:   2
  Core(s) per socket:   96
  Socket(s):2
  Stepping: 1
  Frequency boost:  enabled
  CPU(s) scaling MHz:   68%
  CPU max MHz:  3709.
  CPU min MHz:  400.
  BogoMIPS: 4800.00

Note that Jens Axboe got even more impressive speedups testing this
on his 512-hw-thread EPYC [1] (390% speedup for 192 threads). I've
noticed I had schedstats and sched debug enabled in my config, so I'll have to 
re-run my tests.


A quick re-run of the 128-thread case with schedstats and sched debug
disabled still shows around 26% speedup, similar to my prior numbers.

I'm not sure why Jens has much better speedups on a similar system.

I'm attaching my config in case someone spots anything obvious. Note
that my BIOS is configured to show 24 NUMA nodes to the kernel (one
NUMA node per core complex).


Here's my .config - note it's from the stock kernel run, which is why it
still has:

CONFIG_MMU_LAZY_TLB_REFCOUNT=y

set. Have the same numa configuration as you, just end up with 32 nodes
on this box.


Just to make sure: did you use other command line options when starting
the test program (other than -t N ?).

Thanks,

Mathieu



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com




Re: [RFC PATCH 0/4] sched+mm: Track lazy active mm existence with hazard pointers

2024-10-02 Thread Jens Axboe
On 10/2/24 9:53 AM, Mathieu Desnoyers wrote:
> On 2024-10-02 17:36, Mathieu Desnoyers wrote:
>> On 2024-10-02 17:33, Matthew Wilcox wrote:
>>> On Wed, Oct 02, 2024 at 11:26:27AM -0400, Mathieu Desnoyers wrote:
>>>> On 2024-10-02 16:09, Paul E. McKenney wrote:
>>>>> On Tue, Oct 01, 2024 at 09:02:01PM -0400, Mathieu Desnoyers wrote:
>>>>>> Hazard pointers appear to be a good fit for replacing refcount based lazy
>>>>>> active mm tracking.
>>>>>>
>>>>>> Highlight:
>>>>>>
>>>>>> will-it-scale context_switch1_threads
>>>>>>
>>>>>> nr threads (-t)  speedup
>>>>>>              24      +3%
>>>>>>              48     +12%
>>>>>>              96     +21%
>>>>>>             192     +28%
>>>>>
>>>>> Impressive!!!
>>>>>
>>>>> I have to ask...  Any data for smaller numbers of CPUs?
>>>>
>>>> Sure, but they are far less exciting ;-)
>>>
>>> How many CPUs in the system under test?
>>
>> 2 sockets, 96-core per socket:
>>
>> CPU(s):   384
>>On-line CPU(s) list:0-383
>> Vendor ID:AuthenticAMD
>>Model name: AMD EPYC 9654 96-Core Processor
>>  CPU family:   25
>>  Model:17
>>  Thread(s) per core:   2
>>  Core(s) per socket:   96
>>  Socket(s):2
>>  Stepping: 1
>>  Frequency boost:  enabled
>>  CPU(s) scaling MHz:   68%
>>  CPU max MHz:  3709.
>>  CPU min MHz:  400.
>>  BogoMIPS: 4800.00
>>
>> Note that Jens Axboe got even more impressive speedups testing this
>> on his 512-hw-thread EPYC [1] (390% speedup for 192 threads). I've
>> noticed I had schedstats and sched debug enabled in my config, so I'll have 
>> to re-run my tests.
> 
> A quick re-run of the 128-thread case with schedstats and sched debug
> disabled still shows around 26% speedup, similar to my prior numbers.
> 
> I'm not sure why Jens has much better speedups on a similar system.
> 
> I'm attaching my config in case someone spots anything obvious. Note
> that my BIOS is configured to show 24 NUMA nodes to the kernel (one
> NUMA node per core complex).

Here's my .config - note it's from the stock kernel run, which is why it
still has:

CONFIG_MMU_LAZY_TLB_REFCOUNT=y

set. Have the same numa configuration as you, just end up with 32 nodes
on this box.

-- 
Jens Axboe

r7625.config.gz
Description: application/gzip


Re: [RFC PATCH 0/4] sched+mm: Track lazy active mm existence with hazard pointers

2024-10-02 Thread Mathieu Desnoyers

On 2024-10-02 17:33, Matthew Wilcox wrote:

On Wed, Oct 02, 2024 at 11:26:27AM -0400, Mathieu Desnoyers wrote:

On 2024-10-02 16:09, Paul E. McKenney wrote:

On Tue, Oct 01, 2024 at 09:02:01PM -0400, Mathieu Desnoyers wrote:

Hazard pointers appear to be a good fit for replacing refcount based lazy
active mm tracking.

Highlight:

will-it-scale context_switch1_threads

nr threads (-t)  speedup
             24      +3%
             48     +12%
             96     +21%
            192     +28%


Impressive!!!

I have to ask...  Any data for smaller numbers of CPUs?


Sure, but they are far less exciting ;-)


How many CPUs in the system under test?


2 sockets, 96-core per socket:

CPU(s):   384
  On-line CPU(s) list:0-383
Vendor ID:AuthenticAMD
  Model name: AMD EPYC 9654 96-Core Processor
CPU family:   25
Model:17
Thread(s) per core:   2
Core(s) per socket:   96
Socket(s):2
Stepping: 1
Frequency boost:  enabled
CPU(s) scaling MHz:   68%
CPU max MHz:  3709.
CPU min MHz:  400.
BogoMIPS: 4800.00

Note that Jens Axboe got even more impressive speedups testing this
on his 512-hw-thread EPYC [1] (390% speedup for 192 threads). I've
noticed I had schedstats and sched debug enabled in my config, so I'll 
have to re-run my tests.


Thanks,

Mathieu

[1] https://discuss.systems/@ax...@fosstodon.org/113238297041686326




nr threads (-t)  speedup
              1    -0.2%
              2    +0.4%
              3    +0.2%
              6    +0.6%
             12    +0.8%
             24      +3%
             48     +12%
             96     +21%
            192     +28%
            384      +4%
            768    -0.6%

Thanks,

Mathieu



Thanx, Paul


I'm curious to see what the build bots have to say about this.

This series applies on top of v6.11.1.

Signed-off-by: Mathieu Desnoyers 
Cc: Nicholas Piggin 
Cc: Michael Ellerman 
Cc: Greg Kroah-Hartman 
Cc: Sebastian Andrzej Siewior 
Cc: "Paul E. McKenney" 
Cc: Will Deacon 
Cc: Boqun Feng 
Cc: Alan Stern 
Cc: John Stultz 
Cc: Neeraj Upadhyay 
Cc: Boqun Feng 
Cc: Frederic Weisbecker 
Cc: Joel Fernandes 
Cc: Josh Triplett 
Cc: Uladzislau Rezki 
Cc: Steven Rostedt 
Cc: Lai Jiangshan 
Cc: Zqiang 
Cc: Ingo Molnar 
Cc: Waiman Long 
Cc: Mark Rutland 
Cc: Thomas Gleixner 
Cc: Vlastimil Babka 
Cc: maged.mich...@gmail.com
Cc: Mateusz Guzik 
Cc: Jonas Oberhauser 
Cc: r...@vger.kernel.org
Cc: linux...@kvack.org
Cc: l...@lists.linux.dev

Mathieu Desnoyers (4):
compiler.h: Introduce ptr_eq() to preserve address dependency
Documentation: RCU: Refer to ptr_eq()
hp: Implement Hazard Pointers
sched+mm: Use hazard pointers to track lazy active mm existence

   Documentation/RCU/rcu_dereference.rst |  38 ++-
   Documentation/mm/active_mm.rst|   9 +-
   arch/Kconfig  |  32 --
   arch/powerpc/Kconfig  |   1 -
   arch/powerpc/mm/book3s64/radix_tlb.c  |  23 +---
   include/linux/compiler.h  |  63 +++
   include/linux/hp.h| 154 ++
   include/linux/mm_types.h  |   3 -
   include/linux/sched/mm.h  |  71 +---
   kernel/Makefile   |   2 +-
   kernel/exit.c |   4 +-
   kernel/fork.c |  47 ++--
   kernel/hp.c   |  46 
   kernel/sched/sched.h  |   8 +-
   lib/Kconfig.debug |  10 --
   15 files changed, 346 insertions(+), 165 deletions(-)
   create mode 100644 include/linux/hp.h
   create mode 100644 kernel/hp.c

--
2.39.2


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com




--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com




Re: [RFC PATCH 0/4] sched+mm: Track lazy active mm existence with hazard pointers

2024-10-02 Thread Matthew Wilcox
On Wed, Oct 02, 2024 at 11:26:27AM -0400, Mathieu Desnoyers wrote:
> On 2024-10-02 16:09, Paul E. McKenney wrote:
> > On Tue, Oct 01, 2024 at 09:02:01PM -0400, Mathieu Desnoyers wrote:
> > > Hazard pointers appear to be a good fit for replacing refcount based lazy
> > > active mm tracking.
> > > 
> > > Highlight:
> > > 
> > > will-it-scale context_switch1_threads
> > > 
> > > nr threads (-t)  speedup
> > >              24      +3%
> > >              48     +12%
> > >              96     +21%
> > >             192     +28%
> > 
> > Impressive!!!
> > 
> > I have to ask...  Any data for smaller numbers of CPUs?
> 
> Sure, but they are far less exciting ;-)

How many CPUs in the system under test?

> nr threads (-t)  speedup
>               1    -0.2%
>               2    +0.4%
>               3    +0.2%
>               6    +0.6%
>              12    +0.8%
>              24      +3%
>              48     +12%
>              96     +21%
>             192     +28%
>             384      +4%
>             768    -0.6%
> 
> Thanks,
> 
> Mathieu
> 
> > 
> > Thanx, Paul
> > 
> > > I'm curious to see what the build bots have to say about this.
> > > 
> > > This series applies on top of v6.11.1.
> > > 
> > > Signed-off-by: Mathieu Desnoyers 
> > > Cc: Nicholas Piggin 
> > > Cc: Michael Ellerman 
> > > Cc: Greg Kroah-Hartman 
> > > Cc: Sebastian Andrzej Siewior 
> > > Cc: "Paul E. McKenney" 
> > > Cc: Will Deacon 
> > > Cc: Boqun Feng 
> > > Cc: Alan Stern 
> > > Cc: John Stultz 
> > > Cc: Neeraj Upadhyay 
> > > Cc: Boqun Feng 
> > > Cc: Frederic Weisbecker 
> > > Cc: Joel Fernandes 
> > > Cc: Josh Triplett 
> > > Cc: Uladzislau Rezki 
> > > Cc: Steven Rostedt 
> > > Cc: Lai Jiangshan 
> > > Cc: Zqiang 
> > > Cc: Ingo Molnar 
> > > Cc: Waiman Long 
> > > Cc: Mark Rutland 
> > > Cc: Thomas Gleixner 
> > > Cc: Vlastimil Babka 
> > > Cc: maged.mich...@gmail.com
> > > Cc: Mateusz Guzik 
> > > Cc: Jonas Oberhauser 
> > > Cc: r...@vger.kernel.org
> > > Cc: linux...@kvack.org
> > > Cc: l...@lists.linux.dev
> > > 
> > > Mathieu Desnoyers (4):
> > >compiler.h: Introduce ptr_eq() to preserve address dependency
> > >Documentation: RCU: Refer to ptr_eq()
> > >hp: Implement Hazard Pointers
> > >sched+mm: Use hazard pointers to track lazy active mm existence
> > > 
> > >   Documentation/RCU/rcu_dereference.rst |  38 ++-
> > >   Documentation/mm/active_mm.rst|   9 +-
> > >   arch/Kconfig  |  32 --
> > >   arch/powerpc/Kconfig  |   1 -
> > >   arch/powerpc/mm/book3s64/radix_tlb.c  |  23 +---
> > >   include/linux/compiler.h  |  63 +++
> > >   include/linux/hp.h| 154 ++
> > >   include/linux/mm_types.h  |   3 -
> > >   include/linux/sched/mm.h  |  71 +---
> > >   kernel/Makefile   |   2 +-
> > >   kernel/exit.c |   4 +-
> > >   kernel/fork.c |  47 ++--
> > >   kernel/hp.c   |  46 
> > >   kernel/sched/sched.h  |   8 +-
> > >   lib/Kconfig.debug |  10 --
> > >   15 files changed, 346 insertions(+), 165 deletions(-)
> > >   create mode 100644 include/linux/hp.h
> > >   create mode 100644 kernel/hp.c
> > > 
> > > -- 
> > > 2.39.2
> 
> -- 
> Mathieu Desnoyers
> EfficiOS Inc.
> https://www.efficios.com
> 
> 



Re: [RFC PATCH 0/4] sched+mm: Track lazy active mm existence with hazard pointers

2024-10-02 Thread Mathieu Desnoyers

On 2024-10-02 16:09, Paul E. McKenney wrote:

On Tue, Oct 01, 2024 at 09:02:01PM -0400, Mathieu Desnoyers wrote:

Hazard pointers appear to be a good fit for replacing refcount based lazy
active mm tracking.

Highlight:

will-it-scale context_switch1_threads

nr threads (-t)  speedup
             24      +3%
             48     +12%
             96     +21%
            192     +28%


Impressive!!!

I have to ask...  Any data for smaller numbers of CPUs?


Sure, but they are far less exciting ;-)

nr threads (-t)  speedup
              1    -0.2%
              2    +0.4%
              3    +0.2%
              6    +0.6%
             12    +0.8%
             24      +3%
             48     +12%
             96     +21%
            192     +28%
            384      +4%
            768    -0.6%

Thanks,

Mathieu



Thanx, Paul


I'm curious to see what the build bots have to say about this.

This series applies on top of v6.11.1.

Signed-off-by: Mathieu Desnoyers 
Cc: Nicholas Piggin 
Cc: Michael Ellerman 
Cc: Greg Kroah-Hartman 
Cc: Sebastian Andrzej Siewior 
Cc: "Paul E. McKenney" 
Cc: Will Deacon 
Cc: Boqun Feng 
Cc: Alan Stern 
Cc: John Stultz 
Cc: Neeraj Upadhyay 
Cc: Boqun Feng 
Cc: Frederic Weisbecker 
Cc: Joel Fernandes 
Cc: Josh Triplett 
Cc: Uladzislau Rezki 
Cc: Steven Rostedt 
Cc: Lai Jiangshan 
Cc: Zqiang 
Cc: Ingo Molnar 
Cc: Waiman Long 
Cc: Mark Rutland 
Cc: Thomas Gleixner 
Cc: Vlastimil Babka 
Cc: maged.mich...@gmail.com
Cc: Mateusz Guzik 
Cc: Jonas Oberhauser 
Cc: r...@vger.kernel.org
Cc: linux...@kvack.org
Cc: l...@lists.linux.dev

Mathieu Desnoyers (4):
   compiler.h: Introduce ptr_eq() to preserve address dependency
   Documentation: RCU: Refer to ptr_eq()
   hp: Implement Hazard Pointers
   sched+mm: Use hazard pointers to track lazy active mm existence

  Documentation/RCU/rcu_dereference.rst |  38 ++-
  Documentation/mm/active_mm.rst|   9 +-
  arch/Kconfig  |  32 --
  arch/powerpc/Kconfig  |   1 -
  arch/powerpc/mm/book3s64/radix_tlb.c  |  23 +---
  include/linux/compiler.h  |  63 +++
  include/linux/hp.h| 154 ++
  include/linux/mm_types.h  |   3 -
  include/linux/sched/mm.h  |  71 +---
  kernel/Makefile   |   2 +-
  kernel/exit.c |   4 +-
  kernel/fork.c |  47 ++--
  kernel/hp.c   |  46 
  kernel/sched/sched.h  |   8 +-
  lib/Kconfig.debug |  10 --
  15 files changed, 346 insertions(+), 165 deletions(-)
  create mode 100644 include/linux/hp.h
  create mode 100644 kernel/hp.c

--
2.39.2


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com




Re: [RFC PATCH 0/4] sched+mm: Track lazy active mm existence with hazard pointers

2024-10-02 Thread Paul E. McKenney
On Tue, Oct 01, 2024 at 09:02:01PM -0400, Mathieu Desnoyers wrote:
> Hazard pointers appear to be a good fit for replacing refcount based lazy
> active mm tracking.
> 
> Highlight:
> 
> will-it-scale context_switch1_threads
> 
> nr threads (-t)  speedup
>              24      +3%
>              48     +12%
>              96     +21%
>             192     +28%

Impressive!!!

I have to ask...  Any data for smaller numbers of CPUs?

Thanx, Paul

> I'm curious to see what the build bots have to say about this.
> 
> This series applies on top of v6.11.1.
> 
> Signed-off-by: Mathieu Desnoyers 
> Cc: Nicholas Piggin 
> Cc: Michael Ellerman 
> Cc: Greg Kroah-Hartman 
> Cc: Sebastian Andrzej Siewior 
> Cc: "Paul E. McKenney" 
> Cc: Will Deacon 
> Cc: Boqun Feng 
> Cc: Alan Stern 
> Cc: John Stultz 
> Cc: Neeraj Upadhyay 
> Cc: Boqun Feng 
> Cc: Frederic Weisbecker 
> Cc: Joel Fernandes 
> Cc: Josh Triplett 
> Cc: Uladzislau Rezki 
> Cc: Steven Rostedt 
> Cc: Lai Jiangshan 
> Cc: Zqiang 
> Cc: Ingo Molnar 
> Cc: Waiman Long 
> Cc: Mark Rutland 
> Cc: Thomas Gleixner 
> Cc: Vlastimil Babka 
> Cc: maged.mich...@gmail.com
> Cc: Mateusz Guzik 
> Cc: Jonas Oberhauser 
> Cc: r...@vger.kernel.org
> Cc: linux...@kvack.org
> Cc: l...@lists.linux.dev
> 
> Mathieu Desnoyers (4):
>   compiler.h: Introduce ptr_eq() to preserve address dependency
>   Documentation: RCU: Refer to ptr_eq()
>   hp: Implement Hazard Pointers
>   sched+mm: Use hazard pointers to track lazy active mm existence
> 
>  Documentation/RCU/rcu_dereference.rst |  38 ++-
>  Documentation/mm/active_mm.rst|   9 +-
>  arch/Kconfig  |  32 --
>  arch/powerpc/Kconfig  |   1 -
>  arch/powerpc/mm/book3s64/radix_tlb.c  |  23 +---
>  include/linux/compiler.h  |  63 +++
>  include/linux/hp.h| 154 ++
>  include/linux/mm_types.h  |   3 -
>  include/linux/sched/mm.h  |  71 +---
>  kernel/Makefile   |   2 +-
>  kernel/exit.c |   4 +-
>  kernel/fork.c |  47 ++--
>  kernel/hp.c   |  46 
>  kernel/sched/sched.h  |   8 +-
>  lib/Kconfig.debug |  10 --
>  15 files changed, 346 insertions(+), 165 deletions(-)
>  create mode 100644 include/linux/hp.h
>  create mode 100644 kernel/hp.c
> 
> -- 
> 2.39.2



[RFC PATCH 0/4] sched+mm: Track lazy active mm existence with hazard pointers

2024-10-01 Thread Mathieu Desnoyers
Hazard pointers appear to be a good fit for replacing refcount based lazy
active mm tracking.
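
As background, the following is a minimal userspace sketch of the general
hazard pointer technique using C11 atomics. It is not the hp.h API from this
series: hazard_slot, NR_SLOTS, hp_acquire(), hp_release() and
hp_is_protected() are hypothetical names chosen for the illustration, and the
default sequentially consistent atomics stand in for whatever barriers a
kernel implementation would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_SLOTS 4	/* e.g. one slot per CPU; arbitrary here */

/* One published "I am using this object" pointer per slot. */
static _Atomic(void *) hazard_slot[NR_SLOTS];

/*
 * Reader side: publish the pointer, then re-check that the source still
 * refers to it. If it changed in between, the object may already be on its
 * way to being freed, so withdraw the claim and retry.
 */
static void *hp_acquire(int slot, _Atomic(void *) *src)
{
	for (;;) {
		void *p = atomic_load(src);

		if (!p)
			return NULL;
		atomic_store(&hazard_slot[slot], p);
		if (atomic_load(src) == p)
			return p;
		atomic_store(&hazard_slot[slot], NULL);
	}
}

/* Reader side: done with the object. */
static void hp_release(int slot)
{
	atomic_store(&hazard_slot[slot], NULL);
}

/* Reclaimer side: after unpublishing p, it may be freed once this is false. */
static bool hp_is_protected(void *p)
{
	for (int i = 0; i < NR_SLOTS; i++)
		if (atomic_load(&hazard_slot[i]) == p)
			return true;
	return false;
}

/* Returns 0 if protection is visible until the reader releases it. */
int hazard_pointer_demo(void)
{
	int object = 42;
	_Atomic(void *) shared;

	atomic_init(&shared, &object);

	if (hp_acquire(0, &shared) != &object)
		return 1;
	atomic_store(&shared, NULL);		/* unpublish the object */
	if (!hp_is_protected(&object))		/* reader still holds it */
		return 2;
	hp_release(0);
	if (hp_is_protected(&object))		/* now safe to reclaim */
		return 3;
	return 0;
}
```

Compared with a reference count, the reader only writes to its own slot, so
readers on different CPUs do not contend on a shared cache line; the scan
cost moves to the reclaimer.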

Highlight:

will-it-scale context_switch1_threads

nr threads (-t)  speedup
             24      +3%
             48     +12%
             96     +21%
            192     +28%

I'm curious to see what the build bots have to say about this.

This series applies on top of v6.11.1.

Signed-off-by: Mathieu Desnoyers 
Cc: Nicholas Piggin 
Cc: Michael Ellerman 
Cc: Greg Kroah-Hartman 
Cc: Sebastian Andrzej Siewior 
Cc: "Paul E. McKenney" 
Cc: Will Deacon 
Cc: Boqun Feng 
Cc: Alan Stern 
Cc: John Stultz 
Cc: Neeraj Upadhyay 
Cc: Boqun Feng 
Cc: Frederic Weisbecker 
Cc: Joel Fernandes 
Cc: Josh Triplett 
Cc: Uladzislau Rezki 
Cc: Steven Rostedt 
Cc: Lai Jiangshan 
Cc: Zqiang 
Cc: Ingo Molnar 
Cc: Waiman Long 
Cc: Mark Rutland 
Cc: Thomas Gleixner 
Cc: Vlastimil Babka 
Cc: maged.mich...@gmail.com
Cc: Mateusz Guzik 
Cc: Jonas Oberhauser 
Cc: r...@vger.kernel.org
Cc: linux...@kvack.org
Cc: l...@lists.linux.dev

Mathieu Desnoyers (4):
  compiler.h: Introduce ptr_eq() to preserve address dependency
  Documentation: RCU: Refer to ptr_eq()
  hp: Implement Hazard Pointers
  sched+mm: Use hazard pointers to track lazy active mm existence
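
The cover letter does not show how ptr_eq() is built. One plausible way to
write such a helper, sketched here as an assumption rather than as the code
in patch 1, is to hide both pointers from the optimizer with an empty inline
asm before comparing them, so the compiler cannot substitute one pointer for
the other after the comparison. The OPTIMIZER_HIDE_VAR() definition below
relies on the GCC/Clang extended asm extension.

```c
#include <assert.h>

/* Make the compiler forget what it knows about 'var' (GCC/Clang only). */
#define OPTIMIZER_HIDE_VAR(var) __asm__ ("" : "+r" (var))

/*
 * Compare two pointers without letting the compiler learn that they are
 * interchangeable, which could otherwise break an address dependency on
 * the pointer that was loaded from shared memory.
 */
static inline int ptr_eq(const volatile void *a, const volatile void *b)
{
	OPTIMIZER_HIDE_VAR(a);
	OPTIMIZER_HIDE_VAR(b);
	return a == b;
}

/* Returns 0 if ptr_eq() behaves like a plain equality test. */
int ptr_eq_demo(void)
{
	int x = 0, y = 0;

	if (!ptr_eq(&x, &x))
		return 1;
	if (ptr_eq(&x, &y))
		return 2;
	return 0;
}
```

The result is the same as a plain ==; the difference is only in what the
optimizer is allowed to assume afterwards.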

 Documentation/RCU/rcu_dereference.rst |  38 ++-
 Documentation/mm/active_mm.rst        |   9 +-
 arch/Kconfig                          |  32 --
 arch/powerpc/Kconfig                  |   1 -
 arch/powerpc/mm/book3s64/radix_tlb.c  |  23 +---
 include/linux/compiler.h              |  63 +++
 include/linux/hp.h                    | 154 ++
 include/linux/mm_types.h              |   3 -
 include/linux/sched/mm.h              |  71 +---
 kernel/Makefile                       |   2 +-
 kernel/exit.c                         |   4 +-
 kernel/fork.c                         |  47 ++--
 kernel/hp.c                           |  46
 kernel/sched/sched.h                  |   8 +-
 lib/Kconfig.debug                     |  10 --
 15 files changed, 346 insertions(+), 165 deletions(-)
 create mode 100644 include/linux/hp.h
 create mode 100644 kernel/hp.c

-- 
2.39.2