Re: ftrace hangs waiting for rcu

2022-01-28 Thread Paul E. McKenney
On Fri, Jan 28, 2022 at 08:15:47AM -0800, Paul E. McKenney wrote:
> On Fri, Jan 28, 2022 at 04:11:57PM +, Mark Rutland wrote:
> > On Fri, Jan 28, 2022 at 05:08:48PM +0100, Sven Schnelle wrote:
> > > Hi Mark,
> > > 
> > > Mark Rutland  writes:
> > > 
> > > > On arm64 I bisected this down to:
> > > >
> > > >   7a30871b6a27de1a ("rcu-tasks: Introduce ->percpu_enqueue_shift for
> > > >   dynamic queue selection")
> > > >
> > > > Which was going wrong because ilog2() rounds down, and so the shift was
> > > > wrong for any nr_cpus that was not a power-of-two. Paul had already
> > > > fixed that in rcu-next, and just sent a pull request to Linus:
> > > >
> > > >   https://lore.kernel.org/lkml/20220128143251.GA2398275@paulmck-ThinkPad-P17-Gen-1/
> > > >
> > > > With that applied, I no longer see these hangs.
> > > >
> > > > Does your s390 test machine have a non-power-of-two nr_cpus, and does
> > > > that fix the issue for you?
> > > 
> > > We noticed the PR from Paul and are currently testing the fix. So far
> > > it's looking good. The configuration where we have seen the hang is a
> > > bit unusual:
> > > 
> > > - 16 physical CPUs on the kvm host
> > > - 248 logical CPUs inside kvm
> > 
> > Aha! 248 is notably *NOT* a power of two, and in this case the shift would
> > be wrong (ilog2() would give 7, when we need a shift of 8).
> > 
> > So I suspect you're hitting the same issue as I was.
> 
> And apparently no one runs -next on systems having a non-power-of-two
> number of CPUs.  ;-)

And the fix is now in mainline.

Thanx, Paul

> > Thanks,
> > Mark.
> > 
> > > - debug kernel both on the host and kvm guest
> > > 
> > > So things are likely a bit slow in the kvm guest. Interestingly, the
> > > number of CPUs is even. But maybe RCU sees an odd number of CPUs and
> > > gets confused before all CPUs are brought up. I'll have to read the code
> > > and test to see whether that could be possible.
> > > 
> > > Thanks for investigating!
> > > Sven


Re: ftrace hangs waiting for rcu

2022-01-28 Thread Sven Schnelle
Hi Mark,

Mark Rutland  writes:

> On Fri, Jan 28, 2022 at 05:08:48PM +0100, Sven Schnelle wrote:
>> We noticed the PR from Paul and are currently testing the fix. So far
>> it's looking good. The configuration where we have seen the hang is a
>> bit unusual:
>> 
>> - 16 physical CPUs on the kvm host
>> - 248 logical CPUs inside kvm
>
> Aha! 248 is notably *NOT* a power of two, and in this case the shift would be
> wrong (ilog2() would give 7, when we need a shift of 8).
>
> So I suspect you're hitting the same issue as I was.

Argh, indeed! I somehow changed 'power of two' to 'odd number' in my
head. I guess it's time for the weekend. :-)

Thanks!


Re: ftrace hangs waiting for rcu

2022-01-28 Thread Paul E. McKenney
On Fri, Jan 28, 2022 at 04:11:57PM +, Mark Rutland wrote:
> On Fri, Jan 28, 2022 at 05:08:48PM +0100, Sven Schnelle wrote:
> > Hi Mark,
> > 
> > Mark Rutland  writes:
> > 
> > > On arm64 I bisected this down to:
> > >
> > >   7a30871b6a27de1a ("rcu-tasks: Introduce ->percpu_enqueue_shift for
> > >   dynamic queue selection")
> > >
> > > Which was going wrong because ilog2() rounds down, and so the shift was
> > > wrong for any nr_cpus that was not a power-of-two. Paul had already
> > > fixed that in rcu-next, and just sent a pull request to Linus:
> > >
> > >   https://lore.kernel.org/lkml/20220128143251.GA2398275@paulmck-ThinkPad-P17-Gen-1/
> > >
> > > With that applied, I no longer see these hangs.
> > >
> > > Does your s390 test machine have a non-power-of-two nr_cpus, and does
> > > that fix the issue for you?
> > 
> > We noticed the PR from Paul and are currently testing the fix. So far
> > it's looking good. The configuration where we have seen the hang is a
> > bit unusual:
> > 
> > - 16 physical CPUs on the kvm host
> > - 248 logical CPUs inside kvm
> 
> Aha! 248 is notably *NOT* a power of two, and in this case the shift would be
> wrong (ilog2() would give 7, when we need a shift of 8).
> 
> So I suspect you're hitting the same issue as I was.

And apparently no one runs -next on systems having a non-power-of-two
number of CPUs.  ;-)

Thanx, Paul

> Thanks,
> Mark.
> 
> > - debug kernel both on the host and kvm guest
> > 
> > So things are likely a bit slow in the kvm guest. Interestingly, the
> > number of CPUs is even. But maybe RCU sees an odd number of CPUs and
> > gets confused before all CPUs are brought up. I'll have to read the code
> > and test to see whether that could be possible.
> > 
> > Thanks for investigating!
> > Sven


Re: ftrace hangs waiting for rcu

2022-01-28 Thread Mark Rutland
On Fri, Jan 28, 2022 at 05:08:48PM +0100, Sven Schnelle wrote:
> Hi Mark,
> 
> Mark Rutland  writes:
> 
> > On arm64 I bisected this down to:
> >
> >   7a30871b6a27de1a ("rcu-tasks: Introduce ->percpu_enqueue_shift for
> >   dynamic queue selection")
> >
> > Which was going wrong because ilog2() rounds down, and so the shift was
> > wrong for any nr_cpus that was not a power-of-two. Paul had already
> > fixed that in rcu-next, and just sent a pull request to Linus:
> >
> >   https://lore.kernel.org/lkml/20220128143251.GA2398275@paulmck-ThinkPad-P17-Gen-1/
> >
> > With that applied, I no longer see these hangs.
> >
> > Does your s390 test machine have a non-power-of-two nr_cpus, and does
> > that fix the issue for you?
> 
> We noticed the PR from Paul and are currently testing the fix. So far
> it's looking good. The configuration where we have seen the hang is a
> bit unusual:
> 
> - 16 physical CPUs on the kvm host
> - 248 logical CPUs inside kvm

Aha! 248 is notably *NOT* a power of two, and in this case the shift would be
wrong (ilog2() would give 7, when we need a shift of 8).

So I suspect you're hitting the same issue as I was.
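
A back-of-the-envelope check (illustrative only, not the kernel code; it
assumes the shift is used to fold a CPU number down to a callback-queue
index via cpu >> shift, which is not spelled out here):

#include <stdio.h>

int main(void)
{
	unsigned int shift, cpu;

	/*
	 * Illustration only, not the kernel code: assume the shift folds a
	 * CPU number down to a callback-queue index (cpu >> shift).
	 */
	for (shift = 7; shift <= 8; shift++) {
		unsigned int outside = 0;

		for (cpu = 0; cpu < 248; cpu++)
			if ((cpu >> shift) != 0)
				outside++;
		printf("shift=%u: %u of 248 CPUs fall outside queue 0\n",
		       shift, outside);
	}
	return 0;
}

With the rounded-down shift of 7, CPUs 128-247 fold to a nonzero index;
with 8, all 248 CPUs fold to index 0.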

Thanks,
Mark.

> - debug kernel both on the host and kvm guest
> 
> So things are likely a bit slow in the kvm guest. Interestingly, the
> number of CPUs is even. But maybe RCU sees an odd number of CPUs and
> gets confused before all CPUs are brought up. I'll have to read the code
> and test to see whether that could be possible.
> 
> Thanks for investigating!
> Sven


Re: ftrace hangs waiting for rcu

2022-01-28 Thread Sven Schnelle
Hi Mark,

Mark Rutland  writes:

> On arm64 I bisected this down to:
>
>   7a30871b6a27de1a ("rcu-tasks: Introduce ->percpu_enqueue_shift for dynamic 
> queue selection")
>
> Which was going wrong because ilog2() rounds down, and so the shift was wrong
> for any nr_cpus that was not a power-of-two. Paul had already fixed that in
> rcu-next, and just sent a pull request to Linus:
>
>   https://lore.kernel.org/lkml/20220128143251.GA2398275@paulmck-ThinkPad-P17-Gen-1/
>
> With that applied, I no longer see these hangs.
>
> Does your s390 test machine have a non-power-of-two nr_cpus, and does that fix
> the issue for you?

We noticed the PR from Paul and are currently testing the fix. So far
it's looking good. The configuration where we have seen the hang is a
bit unusual:

- 16 physical CPUs on the kvm host
- 248 logical CPUs inside kvm
- debug kernel both on the host and kvm guest

So things are likely a bit slow in the kvm guest. Interestingly, the
number of CPUs is even. But maybe RCU sees an odd number of CPUs and
gets confused before all CPUs are brought up. I'll have to read the code
and test to see whether that could be possible.

Thanks for investigating!
Sven


Re: ftrace hangs waiting for rcu (was: Re: [PATCH] ftrace: Have architectures opt-in for mcount build time sorting)

2022-01-28 Thread Mark Rutland
Hi Sven,

On Thu, Jan 27, 2022 at 07:42:35PM +0100, Sven Schnelle wrote:
> Mark Rutland  writes:
> 
> > * I intermittently see a hang when running the tests. I previously hit that
> >   when originally trying to bisect this issue (and IIRC that bisected down
> >   to some RCU changes, but I need to re-run that). When the tests hang I
> >   magic-sysrq + L tells me:
> >
> >   [  271.938438] sysrq: Show Blocked State
> >   [  271.939245] task:ftracetest  state:D stack:0 pid: 5687 ppid:  5627 flags:0x0200
> >   [  271.940961] Call trace:
> >   [  271.941472]  __switch_to+0x104/0x160
> >   [  271.942213]  __schedule+0x2b0/0x6e0
> >   [  271.942933]  schedule+0x5c/0xf0
> >   [  271.943586]  schedule_timeout+0x184/0x1c4
> >   [  271.944410]  wait_for_completion+0x8c/0x12c
> >   [  271.945274]  __wait_rcu_gp+0x184/0x190
> >   [  271.946047]  synchronize_rcu_tasks_rude+0x48/0x70
> >   [  271.947007]  update_ftrace_function+0xa4/0xec
> >   [  271.947897]  __unregister_ftrace_function+0xa4/0xf0
> >   [  271.948898]  unregister_ftrace_function+0x34/0x70
> >   [  271.949857]  wakeup_tracer_reset+0x4c/0x100
> >   [  271.950713]  tracing_set_tracer+0xd0/0x2b0
> >   [  271.951552]  tracing_set_trace_write+0xe8/0x150
> >   [  271.952477]  vfs_write+0xfc/0x284
> >   [  271.953171]  ksys_write+0x7c/0x110
> >   [  271.953874]  __arm64_sys_write+0x2c/0x40
> >   [  271.954678]  invoke_syscall+0x5c/0x130
> >   [  271.955442]  el0_svc_common.constprop.0+0x108/0x130
> >   [  271.956435]  do_el0_svc+0x74/0x90
> >   [  271.957124]  el0_svc+0x2c/0x90
> >   [  271.957757]  el0t_64_sync_handler+0xa8/0x12c
> >   [  271.958629]  el0t_64_sync+0x1a0/0x1a4

On arm64 I bisected this down to:

  7a30871b6a27de1a ("rcu-tasks: Introduce ->percpu_enqueue_shift for dynamic 
queue selection")

Which was going wrong because ilog2() rounds down, and so the shift was wrong
for any nr_cpus that was not a power-of-two. Paul had already fixed that in
rcu-next, and just sent a pull request to Linus:

  https://lore.kernel.org/lkml/20220128143251.GA2398275@paulmck-ThinkPad-P17-Gen-1/

With that applied, I no longer see these hangs.

Does your s390 test machine have a non-power-of-two nr_cpus, and does that fix
the issue for you?
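
To make the round-down problem concrete, here is a minimal standalone
sketch (illustrative only; ilog2_floor() and log2_ceil() are stand-ins
for the kernel's ilog2() and a rounded-up variant, not the actual patch).
Only the non-power-of-two count gets a different shift:

#include <stdio.h>

/* Floor log2, matching the round-down behaviour of the kernel's ilog2(). */
static unsigned int ilog2_floor(unsigned int n)
{
	unsigned int shift = 0;

	while (n >>= 1)
		shift++;
	return shift;
}

/* Rounded-up variant: smallest shift with (1 << shift) >= n. */
static unsigned int log2_ceil(unsigned int n)
{
	unsigned int shift = ilog2_floor(n);

	if ((1u << shift) < n)
		shift++;
	return shift;
}

int main(void)
{
	unsigned int nr_cpus[] = { 16, 128, 248, 256 };
	unsigned int i;

	for (i = 0; i < sizeof(nr_cpus) / sizeof(nr_cpus[0]); i++)
		printf("nr_cpus=%3u  floor=%u  ceil=%u\n",
		       nr_cpus[i], ilog2_floor(nr_cpus[i]),
		       log2_ceil(nr_cpus[i]));
	return 0;
}

For 248 this prints floor=7 but ceil=8, matching the wrong vs. needed
shift described above; for 16, 128 and 256 the two agree.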

On arm64 the startup tests didn't seem to trigger the hang, but I was able to
trigger the hang fairly reliably with the ftrace selftests, e.g.

  $ for N in $(seq 1 10); do ./ftracetest test.d/00basic/basic2.tc; done

... which prior to the fix, would hang between runs 2 to 5.

Thanks,
Mark.

> That's interesting. On s390 I'm seeing the same problem in CI, but with
> the startup ftrace tests. So that's likely not arm64 specific.
> 
> On s390, the last messages from ftrace are:
>
> [5.663568] clocksource: jiffies: mask: 0x max_cycles: 0x, max_idle_ns: 1911260446275 ns
> [5.667099] futex hash table entries: 65536 (order: 12, 16777216 bytes, vmalloc)
> [5.739549] Running postponed tracer tests:
> [5.740662] Testing tracer function: PASSED
> [6.194635] Testing dynamic ftrace: PASSED
> [6.471213] Testing dynamic ftrace ops #1: 
> [6.558445] (1 0 1 0 0) 
> [6.558458] (1 1 2 0 0) 
> [6.699135] (2 1 3 0 764347) 
> [6.699252] (2 2 4 0 766466) 
> [6.759857] (3 2 4 0 1159604)
> [..] hangs here
> 
> The backtrace looks like this, which is very similar to the one above:
> 
> crash> bt 1
> PID: 1  TASK: 80e68100  CPU: 133  COMMAND: "swapper/0"
>  #0 [380004df808] __schedule at cda39f0e
>  #1 [380004df880] schedule at cda3a488
>  #2 [380004df8b0] schedule_timeout at cda41ef6
>  #3 [380004df978] wait_for_completion at cda3bd0a
>  #4 [380004df9d8] __wait_rcu_gp at cc92
>  #5 [380004dfa30] synchronize_rcu_tasks_generic at ccdde0aa
>  #6 [380004dfad8] ftrace_shutdown at cce7b050
>  #7 [380004dfb18] unregister_ftrace_function at cce7b192
>  #8 [380004dfb50] trace_selftest_ops at cda1e0fa
>  #9 [380004dfba0] run_tracer_selftest at cda1e4f2
> #10 [380004dfc00] trace_selftest_startup_function at ce74355c
> #11 [380004dfc58] run_tracer_selftest at cda1e2fc
> #12 [380004dfc98] init_trace_selftests at ce742d30
> #13 [380004dfcd0] do_one_initcall at cccdca16
> #14 [380004dfd68] do_initcalls at ce72e776
> #15 [380004dfde0] kernel_init_freeable at ce72ea60
> #16 [380004dfe50] kernel_init at cda333fe
> #17 [380004dfe68] __ret_from_fork at cccdf920
> #18 [380004dfe98] ret_from_fork at cda444ca
> 
> I haven't had success reproducing it so far, but it is good to know that
> this also happens when running the ftrace testsuite.
> 
> I have several crashdumps, so I could try to pull out some information
> if someone tells me what to look for.
> 
> Thanks,
> Sven


ftrace hangs waiting for rcu (was: Re: [PATCH] ftrace: Have architectures opt-in for mcount build time sorting)

2022-01-27 Thread Sven Schnelle
Hi Mark,

Mark Rutland  writes:

> * I intermittently see a hang when running the tests. I previously hit that
>   when originally trying to bisect this issue (and IIRC that bisected down to
>   some RCU changes, but I need to re-run that). When the tests hang I
>   magic-sysrq + L tells me:
>
>   [  271.938438] sysrq: Show Blocked State
>   [  271.939245] task:ftracetest  state:D stack:0 pid: 5687 ppid:  5627 flags:0x0200
>   [  271.940961] Call trace:
>   [  271.941472]  __switch_to+0x104/0x160
>   [  271.942213]  __schedule+0x2b0/0x6e0
>   [  271.942933]  schedule+0x5c/0xf0
>   [  271.943586]  schedule_timeout+0x184/0x1c4
>   [  271.944410]  wait_for_completion+0x8c/0x12c
>   [  271.945274]  __wait_rcu_gp+0x184/0x190
>   [  271.946047]  synchronize_rcu_tasks_rude+0x48/0x70
>   [  271.947007]  update_ftrace_function+0xa4/0xec
>   [  271.947897]  __unregister_ftrace_function+0xa4/0xf0
>   [  271.948898]  unregister_ftrace_function+0x34/0x70
>   [  271.949857]  wakeup_tracer_reset+0x4c/0x100
>   [  271.950713]  tracing_set_tracer+0xd0/0x2b0
>   [  271.951552]  tracing_set_trace_write+0xe8/0x150
>   [  271.952477]  vfs_write+0xfc/0x284
>   [  271.953171]  ksys_write+0x7c/0x110
>   [  271.953874]  __arm64_sys_write+0x2c/0x40
>   [  271.954678]  invoke_syscall+0x5c/0x130
>   [  271.955442]  el0_svc_common.constprop.0+0x108/0x130
>   [  271.956435]  do_el0_svc+0x74/0x90
>   [  271.957124]  el0_svc+0x2c/0x90
>   [  271.957757]  el0t_64_sync_handler+0xa8/0x12c
>   [  271.958629]  el0t_64_sync+0x1a0/0x1a4

That's interesting. On s390 I'm seeing the same problem in CI, but with
the startup ftrace tests. So that's likely not arm64 specific.

On s390, the last messages from ftrace are:

[5.663568] clocksource: jiffies: mask: 0x max_cycles: 0x, max_idle_ns: 1911260446275 ns
[5.667099] futex hash table entries: 65536 (order: 12, 16777216 bytes, vmalloc)
[5.739549] Running postponed tracer tests:
[5.740662] Testing tracer function: PASSED
[6.194635] Testing dynamic ftrace: PASSED
[6.471213] Testing dynamic ftrace ops #1: 
[6.558445] (1 0 1 0 0) 
[6.558458] (1 1 2 0 0) 
[6.699135] (2 1 3 0 764347) 
[6.699252] (2 2 4 0 766466) 
[6.759857] (3 2 4 0 1159604)
[..] hangs here

The backtrace looks like this, which is very similar to the one above:

crash> bt 1
PID: 1  TASK: 80e68100  CPU: 133  COMMAND: "swapper/0"
 #0 [380004df808] __schedule at cda39f0e
 #1 [380004df880] schedule at cda3a488
 #2 [380004df8b0] schedule_timeout at cda41ef6
 #3 [380004df978] wait_for_completion at cda3bd0a
 #4 [380004df9d8] __wait_rcu_gp at cc92
 #5 [380004dfa30] synchronize_rcu_tasks_generic at ccdde0aa
 #6 [380004dfad8] ftrace_shutdown at cce7b050
 #7 [380004dfb18] unregister_ftrace_function at cce7b192
 #8 [380004dfb50] trace_selftest_ops at cda1e0fa
 #9 [380004dfba0] run_tracer_selftest at cda1e4f2
#10 [380004dfc00] trace_selftest_startup_function at ce74355c
#11 [380004dfc58] run_tracer_selftest at cda1e2fc
#12 [380004dfc98] init_trace_selftests at ce742d30
#13 [380004dfcd0] do_one_initcall at cccdca16
#14 [380004dfd68] do_initcalls at ce72e776
#15 [380004dfde0] kernel_init_freeable at ce72ea60
#16 [380004dfe50] kernel_init at cda333fe
#17 [380004dfe68] __ret_from_fork at cccdf920
#18 [380004dfe98] ret_from_fork at cda444ca

I haven't had success reproducing it so far, but it is good to know that
this also happens when running the ftrace testsuite.

I have several crashdumps, so I could try to pull out some information
if someone tells me what to look for.

Thanks,
Sven