Re: [patch 05/37] x86/topology: Remove CPU0 hotplug option

2023-06-21 Thread Paul E. McKenney
On Sat, Apr 15, 2023 at 01:44:21AM +0200, Thomas Gleixner wrote:
> This was introduced together with commit e1c467e69040 ("x86, hotplug: Wake
> up CPU0 via NMI instead of INIT, SIPI, SIPI") to eventually support
> physical hotplug of CPU0:
> 
>  "We'll change this code in the future to wake up hard offlined CPU0 if
>   real platform and request are available."
> 
> 11 years later this has not happened and physical hotplug is not officially
> supported. Remove the cruft.
> 
> Signed-off-by: Thomas Gleixner 
> ---
>  Documentation/admin-guide/kernel-parameters.txt |   14 ---
>  Documentation/core-api/cpu_hotplug.rst  |   13 ---
>  arch/x86/Kconfig|   43 --
>  arch/x86/include/asm/cpu.h  |3 
>  arch/x86/kernel/topology.c  |   98 
> 
>  arch/x86/power/cpu.c|   37 -
>  6 files changed, 6 insertions(+), 202 deletions(-)

[ . . . ]

> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -2294,49 +2294,6 @@ config HOTPLUG_CPU
>   def_bool y
>   depends on SMP
>  
> -config BOOTPARAM_HOTPLUG_CPU0

Removing this requires also removing its use in rcutorture.

I have therefore queued the commit below in -rcu, but please feel
free to take it along with the BOOTPARAM_HOTPLUG_CPU0-removal patch.
Just please let me know if you do.

(Yes, I finally got back to testing -next.  Why do you ask?)

Thanx, Paul

----

commit 95588de780c0e81004b72526aa3e3ef5ce054719
Author: Paul E. McKenney 
Date:   Wed Jun 21 09:44:52 2023 -0700

rcutorture: Remove obsolete BOOTPARAM_HOTPLUG_CPU0 Kconfig option

Now that the BOOTPARAM_HOTPLUG_CPU0 Kconfig option is in the process of
being removed, it is time to remove rcutorture's use of it.

    Link: https://lore.kernel.org/lkml/20230414232309.510911...@linutronix.de/
Signed-off-by: Paul E. McKenney 
Cc: Thomas Gleixner 
Cc: 

diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE01 
b/tools/testing/selftests/rcutorture/configs/rcu/TREE01
index 04831ef1f9b5..8ae41d5f81a3 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/TREE01
+++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE01
@@ -15,4 +15,3 @@ CONFIG_DEBUG_LOCK_ALLOC=n
 CONFIG_RCU_BOOST=n
 CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
 CONFIG_RCU_EXPERT=y
-CONFIG_BOOTPARAM_HOTPLUG_CPU0=y



Re: objtool warning for next-20221118

2022-12-01 Thread Paul E. McKenney
On Tue, Nov 29, 2022 at 11:56:55AM -0800, Paul E. McKenney wrote:
> On Fri, Nov 25, 2022 at 06:30:35AM +0100, Juergen Gross wrote:
> > On 24.11.22 17:39, Josh Poimboeuf wrote:
> > > On Thu, Nov 24, 2022 at 08:47:47AM +0100, Juergen Gross wrote:
> > > > > > +++ b/arch/x86/xen/smp_pv.c
> > > > > > @@ -385,17 +385,9 @@ static void xen_pv_play_dead(void) /* used only
> > > > > > with HOTPLUG_CPU */
> > > > > >    {
> > > > > >    play_dead_common();
> > > > > >    HYPERVISOR_vcpu_op(VCPUOP_down, 
> > > > > > xen_vcpu_nr(smp_processor_id()), NULL);
> > > > > > -    cpu_bringup();
> > > > > > -    /*
> > > > > > - * commit 4b0c0f294 (tick: Cleanup NOHZ per cpu data on cpu 
> > > > > > down)
> > > > > > - * clears certain data that the cpu_idle loop (which called us
> > > > > > - * and that we return from) expects. The only way to get that
> > > > > > - * data back is to call:
> > > > > > - */
> > > > > > -    tick_nohz_idle_enter();
> > > > > > -    tick_nohz_idle_stop_tick_protected();
> > > > > > -    cpuhp_online_idle(CPUHP_AP_ONLINE_IDLE);
> > > > > > +    /* FIXME: converge cpu_bringup_and_idle() and 
> > > > > > start_secondary() */
> > > > > > +    cpu_bringup_and_idle();
> > > > > 
> > > > > I think this will leak stack memory. Multiple cpu offline/online 
> > > > > cycles of
> > > > > the same cpu will finally exhaust the idle stack.
> > > 
> > > Doh!  Of course...
> > > 
> > > I was actually thinking ahead, to where eventually xen_pv_play_dead()
> > > can call start_cpu0(), which can be changed to automatically reset the
> > > stack pointer like this:
> > > 
> > > SYM_CODE_START(start_cpu0)
> > >   ANNOTATE_NOENDBR
> > >   UNWIND_HINT_EMPTY
> > >   movqPER_CPU_VAR(pcpu_hot + X86_top_of_stack), %rax
> > >   leaq-PTREGS_SIZE(%rax), %rsp
> > >   jmp .Ljump_to_C_code
> > > SYM_CODE_END(start_cpu0)
> > > 
> > > but that would only be possible be after more cleanups which converge
> > > cpu_bringup_and_idle() with start_secondary().
> > > 
> > > > The attached patch seems to work fine.
> > > 
> > > The patch looks good to me.
> > > 
> > > It doesn't solve Paul's original issue where arch_cpu_idle_dead() needs
> > > to be __noreturn.  But that should probably be a separate patch anyway.
> > 
> > Okay, I'll split this off.
> > 
> > > 
> > > > The __noreturn annotation seems to trigger an objtool warning, though, 
> > > > in
> > > > spite of the added BUG() at the end of xen_pv_play_dead():
> > > > 
> > > > arch/x86/xen/smp_pv.o: warning: objtool: xen_pv_play_dead() falls 
> > > > through to
> > > > next function xen_pv_cpu_die()
> > > 
> > > You'll need to tell objtool that xen_cpu_bringup_again() is noreturn by
> > > adding "xen_cpu_bringup_again" to global_noreturns[] in
> > > tools/objtool/check.c.
> > 
> > Ah, okay. Will do that.
> > 
> > > (Yes it's a pain, I'll be working an improved solution to the noreturn
> > > thing...)
> > 
> > Should be fairly easy, no?
> > 
> > "Just" extend the __noreturn macro to put the function into a 
> > ".text.noreturn"
> > section, which can be handled in a special way by objtool. This would need
> > an __init_noreturn macro, of course, for a ".init.text.noreturn" section.
> 
> And in last night's -next run, that diagnostic was gone!
> 
> But of course another appeared in its place:
> 
> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.o: warning: objtool: 
> check_relocations+0xd1: stack state mismatch: cfa1=4+32 cfa2=-1+0
> 
> The .config file is shown below.  Thoughts?

And it is back.  Which makes no sense, but there it is.

Thanx, Paul



Re: objtool warning for next-20221118

2022-11-29 Thread Paul E. McKenney
On Fri, Nov 25, 2022 at 06:30:35AM +0100, Juergen Gross wrote:
> On 24.11.22 17:39, Josh Poimboeuf wrote:
> > On Thu, Nov 24, 2022 at 08:47:47AM +0100, Juergen Gross wrote:
> > > > > +++ b/arch/x86/xen/smp_pv.c
> > > > > @@ -385,17 +385,9 @@ static void xen_pv_play_dead(void) /* used only
> > > > > with HOTPLUG_CPU */
> > > > >    {
> > > > >    play_dead_common();
> > > > >    HYPERVISOR_vcpu_op(VCPUOP_down, 
> > > > > xen_vcpu_nr(smp_processor_id()), NULL);
> > > > > -    cpu_bringup();
> > > > > -    /*
> > > > > - * commit 4b0c0f294 (tick: Cleanup NOHZ per cpu data on cpu down)
> > > > > - * clears certain data that the cpu_idle loop (which called us
> > > > > - * and that we return from) expects. The only way to get that
> > > > > - * data back is to call:
> > > > > - */
> > > > > -    tick_nohz_idle_enter();
> > > > > -    tick_nohz_idle_stop_tick_protected();
> > > > > -    cpuhp_online_idle(CPUHP_AP_ONLINE_IDLE);
> > > > > +    /* FIXME: converge cpu_bringup_and_idle() and start_secondary() 
> > > > > */
> > > > > +    cpu_bringup_and_idle();
> > > > 
> > > > I think this will leak stack memory. Multiple cpu offline/online cycles 
> > > > of
> > > > the same cpu will finally exhaust the idle stack.
> > 
> > Doh!  Of course...
> > 
> > I was actually thinking ahead, to where eventually xen_pv_play_dead()
> > can call start_cpu0(), which can be changed to automatically reset the
> > stack pointer like this:
> > 
> > SYM_CODE_START(start_cpu0)
> > ANNOTATE_NOENDBR
> > UNWIND_HINT_EMPTY
> > movqPER_CPU_VAR(pcpu_hot + X86_top_of_stack), %rax
> > leaq-PTREGS_SIZE(%rax), %rsp
> > jmp .Ljump_to_C_code
> > SYM_CODE_END(start_cpu0)
> > 
> > but that would only be possible be after more cleanups which converge
> > cpu_bringup_and_idle() with start_secondary().
> > 
> > > The attached patch seems to work fine.
> > 
> > The patch looks good to me.
> > 
> > It doesn't solve Paul's original issue where arch_cpu_idle_dead() needs
> > to be __noreturn.  But that should probably be a separate patch anyway.
> 
> Okay, I'll split this off.
> 
> > 
> > > The __noreturn annotation seems to trigger an objtool warning, though, in
> > > spite of the added BUG() at the end of xen_pv_play_dead():
> > > 
> > > arch/x86/xen/smp_pv.o: warning: objtool: xen_pv_play_dead() falls through 
> > > to
> > > next function xen_pv_cpu_die()
> > 
> > You'll need to tell objtool that xen_cpu_bringup_again() is noreturn by
> > adding "xen_cpu_bringup_again" to global_noreturns[] in
> > tools/objtool/check.c.
> 
> Ah, okay. Will do that.
> 
> > (Yes it's a pain, I'll be working an improved solution to the noreturn
> > thing...)
> 
> Should be fairly easy, no?
> 
> "Just" extend the __noreturn macro to put the function into a ".text.noreturn"
> section, which can be handled in a special way by objtool. This would need
> an __init_noreturn macro, of course, for a ".init.text.noreturn" section.

And in last night's -next run, that diagnostic was gone!

But of course another appeared in its place:

drivers/gpu/drm/i915/gem/i915_gem_execbuffer.o: warning: objtool: 
check_relocations+0xd1: stack state mismatch: cfa1=4+32 cfa2=-1+0

The .config file is shown below.  Thoughts?

Thanx, Paul



#
# Automatically generated file; DO NOT EDIT.
# Linux/x86 6.1.0-rc1 Kernel Configuration
#
CONFIG_CC_VERSION_TEXT="Ubuntu clang version 
11.1.0-++20211011094159+1fdec59bffc1-1~exp1~20211011214622.5"
CONFIG_GCC_VERSION=0
CONFIG_CC_IS_CLANG=y
CONFIG_CLANG_VERSION=110100
CONFIG_AS_IS_LLVM=y
CONFIG_AS_VERSION=110100
CONFIG_LD_IS_BFD=y
CONFIG_LD_VERSION=23400
CONFIG_LLD_VERSION=0
CONFIG_CC_CAN_LINK=y
CONFIG_CC_CAN_LINK_STATIC=y
CONFIG_CC_HAS_ASM_GOTO_OUTPUT=y
CONFIG_CC_HAS_ASM_INLINE=y
CONFIG_PAHOLE_VERSION=0
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_TABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y

#
# General setup
#
CONFIG_BROKEN_ON_SMP=y
CONFIG_INIT_ENV_ARG_LIMIT=32
# CONFIG_COMPILE_TEST is not set
CONFIG_WERROR=y
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_BUILD_SALT=""
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_HAVE_KERNEL_ZSTD=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
# CONFIG_KERNEL_ZSTD is not set
CONFIG_DEFAULT_INIT=""
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_SYSVIPC_COMPAT=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
# CONFIG_WATCH_QUEUE is not set
CONFIG_CROSS_MEMORY_ATTACH=y
# CONFIG_USELIB is not set
CONFIG_AUDIT=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
CONFIG_AUDITSYSCALL=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_HARDIRQS_SW_RESEND=y
CO

Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-07-30 Thread Paul E. McKenney
On Sat, Jul 30, 2022 at 02:40:32AM -0700, Michel Lespinasse wrote:
> On Fri, Jul 29, 2022 at 08:26:22AM -0700, Paul E. McKenney wrote:> Would you 
> be willing to try another shot in the dark, but untested
> > this time?  I freely admit that this is getting strange.
> > 
> > Thanx, Paul
> 
> Yes, adding this second change got rid of the boot time warning for me.

OK, I will make a real patch.  May I have your Tested-by?

Thanx, Paul

> > 
> > 
> > diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
> > index e374c0c923dae..279f557bf60bb 100644
> > --- a/kernel/sched/clock.c
> > +++ b/kernel/sched/clock.c
> > @@ -394,7 +394,7 @@ notrace void sched_clock_tick(void)
> > if (!static_branch_likely(&sched_clock_running))
> > return;
> >  
> > -   lockdep_assert_irqs_disabled();
> > +   WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
> >  
> > scd = this_scd();
> > __scd_stamp(scd);



Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-07-29 Thread Paul E. McKenney
Or better yet, try the patch that Rafael proposed.  ;-)

Thanx, Paul

On Fri, Jul 29, 2022 at 08:26:22AM -0700, Paul E. McKenney wrote:
> On Fri, Jul 29, 2022 at 03:24:58AM -0700, Michel Lespinasse wrote:
> > On Thu, Jul 28, 2022 at 10:20:53AM -0700, Paul E. McKenney wrote:
> > > On Mon, Jul 25, 2022 at 12:43:06PM -0700, Michel Lespinasse wrote:
> > > > On Wed, Jun 08, 2022 at 04:27:27PM +0200, Peter Zijlstra wrote:
> > > > > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > > > Xeons") wrecked intel_idle in two ways:
> > > > > 
> > > > >  - must not have tracing in idle functions
> > > > >  - must return with IRQs disabled
> > > > > 
> > > > > Additionally, it added a branch for no good reason.
> > > > > 
> > > > > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on 
> > > > > Xeons")
> > > > > Signed-off-by: Peter Zijlstra (Intel) 
> > > > 
> > > > After this change was introduced, I am seeing "WARNING: suspicious RCU
> > > > usage" when booting a kernel with debug options compiled in. Please
> > > > see the attached dmesg output. The issue starts with commit 32d4fd5751ea
> > > > and is still present in v5.19-rc8.
> > > > 
> > > > I'm not sure, is this too late to fix or revert in v5.19 final ?
> > > 
> > > I finally got a chance to take a quick look at this.
> > > 
> > > The rcu_eqs_exit() function is making a lockdep complaint about
> > > being invoked with interrupts enabled.  This function is called from
> > > rcu_idle_exit(), which is an expected code path from cpuidle_enter_state()
> > > via its call to rcu_idle_exit().  Except that rcu_idle_exit() disables
> > > interrupts before invoking rcu_eqs_exit().
> > > 
> > > The only other call to rcu_idle_exit() does not disable interrupts,
> > > but it is via rcu_user_exit(), which would be a very odd choice for
> > > cpuidle_enter_state().
> > > 
> > > It seems unlikely, but it might be that it is the use of local_irq_save()
> > > instead of raw_local_irq_save() within rcu_idle_exit() that is causing
> > > the trouble.  If this is the case, then the commit shown below would
> > > help.  Note that this commit removes the warning from lockdep, so it
> > > is necessary to build the kernel with CONFIG_RCU_EQS_DEBUG=y to enable
> > > equivalent debugging.
> > > 
> > > Could you please try your test with the -rce commit shown below applied?
> > 
> > Thanks for looking into it.
> 
> And thank you for trying this shot in the dark!
> 
> > After checking out Peter's commit 32d4fd5751ea,
> > cherry picking your commit ed4ae5eff4b3,
> > and setting CONFIG_RCU_EQS_DEBUG=y in addition of my usual debug config,
> > I am now seeing this a few seconds into the boot:
> > 
> > [3.010650] [ cut here ]
> > [3.010651] WARNING: CPU: 0 PID: 0 at kernel/sched/clock.c:397 
> > sched_clock_tick+0x27/0x60
> 
> And this is again a complaint about interrupts not being disabled.
> 
> But it does appear that the problem was the lockdep complaint, and
> eliminating that did take care of part of the problem.  But lockdep
> remained enabled, and you therefore hit the next complaint.
> 
> > [3.010657] Modules linked in:
> > [3.010660] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
> > 5.19.0-rc1-test-5-g1be22fea0611 #1
> > [3.010662] Hardware name: LENOVO 30BFS44D00/1036, BIOS S03KT51A 
> > 01/17/2022
> > [3.010663] RIP: 0010:sched_clock_tick+0x27/0x60
> 
> The most straightforward way to get to sched_clock_tick() from
> cpuidle_enter_state() is via the call to sched_clock_idle_wakeup_event().
> 
> Except that it disables interrupts before invoking sched_clock_tick().
> 
> > [3.010665] Code: 1f 40 00 53 eb 02 5b c3 66 90 8b 05 2f c3 40 01 85 c0 
> > 74 18 65 8b 05 60 88 8f 4e 85 c0 75 0d 65 8b 05 a9 85 8f 4e 85 c0 74 02 
> > <0f> 0b e8 e2 6c 89 00 48 c7 c3 40 d5 02 00
> >  89 c0 48 03 1c c5 c0 98
> > [3.010667] RSP: :b2803e28 EFLAGS: 00010002
> > [3.010670] RAX: 0001 RBX: c8ce7fa07060 RCX: 
> > 0001
> > [3.010671] RDX:  RSI: b268dd21 RDI: 
> > b269ab13
> > [3.010673] RBP: 0001 R08: ffc300d5 R09: 
> > 0002be8

Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-07-29 Thread Paul E. McKenney
On Fri, Jul 29, 2022 at 03:24:58AM -0700, Michel Lespinasse wrote:
> On Thu, Jul 28, 2022 at 10:20:53AM -0700, Paul E. McKenney wrote:
> > On Mon, Jul 25, 2022 at 12:43:06PM -0700, Michel Lespinasse wrote:
> > > On Wed, Jun 08, 2022 at 04:27:27PM +0200, Peter Zijlstra wrote:
> > > > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > > Xeons") wrecked intel_idle in two ways:
> > > > 
> > > >  - must not have tracing in idle functions
> > > >  - must return with IRQs disabled
> > > > 
> > > > Additionally, it added a branch for no good reason.
> > > > 
> > > > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > > > Signed-off-by: Peter Zijlstra (Intel) 
> > > 
> > > After this change was introduced, I am seeing "WARNING: suspicious RCU
> > > usage" when booting a kernel with debug options compiled in. Please
> > > see the attached dmesg output. The issue starts with commit 32d4fd5751ea
> > > and is still present in v5.19-rc8.
> > > 
> > > I'm not sure, is this too late to fix or revert in v5.19 final ?
> > 
> > I finally got a chance to take a quick look at this.
> > 
> > The rcu_eqs_exit() function is making a lockdep complaint about
> > being invoked with interrupts enabled.  This function is called from
> > rcu_idle_exit(), which is an expected code path from cpuidle_enter_state()
> > via its call to rcu_idle_exit().  Except that rcu_idle_exit() disables
> > interrupts before invoking rcu_eqs_exit().
> > 
> > The only other call to rcu_idle_exit() does not disable interrupts,
> > but it is via rcu_user_exit(), which would be a very odd choice for
> > cpuidle_enter_state().
> > 
> > It seems unlikely, but it might be that it is the use of local_irq_save()
> > instead of raw_local_irq_save() within rcu_idle_exit() that is causing
> > the trouble.  If this is the case, then the commit shown below would
> > help.  Note that this commit removes the warning from lockdep, so it
> > is necessary to build the kernel with CONFIG_RCU_EQS_DEBUG=y to enable
> > equivalent debugging.
> > 
> > Could you please try your test with the -rce commit shown below applied?
> 
> Thanks for looking into it.

And thank you for trying this shot in the dark!

> After checking out Peter's commit 32d4fd5751ea,
> cherry picking your commit ed4ae5eff4b3,
> and setting CONFIG_RCU_EQS_DEBUG=y in addition of my usual debug config,
> I am now seeing this a few seconds into the boot:
> 
> [3.010650] [ cut here ]
> [3.010651] WARNING: CPU: 0 PID: 0 at kernel/sched/clock.c:397 
> sched_clock_tick+0x27/0x60

And this is again a complaint about interrupts not being disabled.

But it does appear that the problem was the lockdep complaint, and
eliminating that did take care of part of the problem.  But lockdep
remained enabled, and you therefore hit the next complaint.

> [3.010657] Modules linked in:
> [3.010660] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
> 5.19.0-rc1-test-5-g1be22fea0611 #1
> [3.010662] Hardware name: LENOVO 30BFS44D00/1036, BIOS S03KT51A 01/17/2022
> [3.010663] RIP: 0010:sched_clock_tick+0x27/0x60

The most straightforward way to get to sched_clock_tick() from
cpuidle_enter_state() is via the call to sched_clock_idle_wakeup_event().

Except that it disables interrupts before invoking sched_clock_tick().

> [3.010665] Code: 1f 40 00 53 eb 02 5b c3 66 90 8b 05 2f c3 40 01 85 c0 74 
> 18 65 8b 05 60 88 8f 4e 85 c0 75 0d 65 8b 05 a9 85 8f 4e 85 c0 74 02 <0f> 0b 
> e8 e2 6c 89 00 48 c7 c3 40 d5 02 00
>  89 c0 48 03 1c c5 c0 98
> [3.010667] RSP: :b2803e28 EFLAGS: 00010002
> [3.010670] RAX: 0001 RBX: c8ce7fa07060 RCX: 
> 0001
> [3.010671] RDX:  RSI: b268dd21 RDI: 
> b269ab13
> [3.010673] RBP: 0001 R08: ffc300d5 R09: 
> 0002be80
> [3.010674] R10: 03625b53183a R11: a012b802b7a4 R12: 
> b2aa9e80
> [3.010675] R13: b2aa9e00 R14: 0001 R15: 
> 
> [3.010677] FS:  () GS:a012b800() 
> knlGS:
> [3.010678] CS:  0010 DS:  ES:  CR0: 80050033
> [3.010680] CR2: a012f81ff000 CR3: 000c99612001 CR4: 
> 003706f0
> [3.010681] DR0:  DR1:  DR2: 
> 
> [3.010682] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [3.010683] Call Trace

Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-07-28 Thread Paul E. McKenney
On Mon, Jul 25, 2022 at 12:43:06PM -0700, Michel Lespinasse wrote:
> On Wed, Jun 08, 2022 at 04:27:27PM +0200, Peter Zijlstra wrote:
> > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > Xeons") wrecked intel_idle in two ways:
> > 
> >  - must not have tracing in idle functions
> >  - must return with IRQs disabled
> > 
> > Additionally, it added a branch for no good reason.
> > 
> > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > Signed-off-by: Peter Zijlstra (Intel) 
> 
> After this change was introduced, I am seeing "WARNING: suspicious RCU
> usage" when booting a kernel with debug options compiled in. Please
> see the attached dmesg output. The issue starts with commit 32d4fd5751ea
> and is still present in v5.19-rc8.
> 
> I'm not sure, is this too late to fix or revert in v5.19 final ?

I finally got a chance to take a quick look at this.

The rcu_eqs_exit() function is making a lockdep complaint about
being invoked with interrupts enabled.  This function is called from
rcu_idle_exit(), which is an expected code path from cpuidle_enter_state()
via its call to rcu_idle_exit().  Except that rcu_idle_exit() disables
interrupts before invoking rcu_eqs_exit().

The only other call to rcu_idle_exit() does not disable interrupts,
but it is via rcu_user_exit(), which would be a very odd choice for
cpuidle_enter_state().

It seems unlikely, but it might be that it is the use of local_irq_save()
instead of raw_local_irq_save() within rcu_idle_exit() that is causing
the trouble.  If this is the case, then the commit shown below would
help.  Note that this commit removes the warning from lockdep, so it
is necessary to build the kernel with CONFIG_RCU_EQS_DEBUG=y to enable
equivalent debugging.

Could you please try your test with the -rce commit shown below applied?

Thanx, Paul

--------

commit ed4ae5eff4b38797607cbdd80da394149110fb37
Author: Paul E. McKenney 
Date:   Tue May 17 21:00:04 2022 -0700

rcu: Apply noinstr to rcu_idle_enter() and rcu_idle_exit()

This commit applies the "noinstr" tag to the rcu_idle_enter() and
rcu_idle_exit() functions, which are invoked from portions of the idle
loop that cannot be instrumented.  These tags require reworking the
rcu_eqs_enter() and rcu_eqs_exit() functions that these two functions
invoke in order to cause them to use normal assertions rather than
lockdep.  In addition, within rcu_idle_exit(), the raw versions of
local_irq_save() and local_irq_restore() are used, again to avoid issues
with lockdep in uninstrumented code.

This patch is based in part on an earlier patch by Jiri Olsa, discussions
with Peter Zijlstra and Frederic Weisbecker, earlier changes by Thomas
Gleixner, and off-list discussions with Yonghong Song.

Link: 
https://lore.kernel.org/lkml/20220515203653.4039075-1-jo...@kernel.org/
Reported-by: Jiri Olsa 
Reported-by: Alexei Starovoitov 
Reported-by: Andrii Nakryiko 
Signed-off-by: Paul E. McKenney 
Reviewed-by: Yonghong Song 

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index c25ba442044a6..9a5edab5558c9 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -631,8 +631,8 @@ static noinstr void rcu_eqs_enter(bool user)
return;
}
 
-   lockdep_assert_irqs_disabled();
instrumentation_begin();
+   lockdep_assert_irqs_disabled();
trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, 
atomic_read(&rdp->dynticks));
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && 
!is_idle_task(current));
rcu_preempt_deferred_qs(current);
@@ -659,9 +659,9 @@ static noinstr void rcu_eqs_enter(bool user)
  * If you add or remove a call to rcu_idle_enter(), be sure to test with
  * CONFIG_RCU_EQS_DEBUG=y.
  */
-void rcu_idle_enter(void)
+void noinstr rcu_idle_enter(void)
 {
-   lockdep_assert_irqs_disabled();
+   WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
rcu_eqs_enter(false);
 }
 EXPORT_SYMBOL_GPL(rcu_idle_enter);
@@ -861,7 +861,7 @@ static void noinstr rcu_eqs_exit(bool user)
struct rcu_data *rdp;
long oldval;
 
-   lockdep_assert_irqs_disabled();
+   WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
rdp = this_cpu_ptr(&rcu_data);
oldval = rdp->dynticks_nesting;
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
@@ -896,13 +896,13 @@ static void noinstr rcu_eqs_exit(bool user)
  * If you add or remove a call to rcu_idle_exit(), be sure to test with
  * CONFIG_RCU_EQS_DEBUG=y.
  */
-void rcu_idle_exit(v

Re: [PATCH 16/36] rcu: Fix rcu_idle_exit()

2022-06-14 Thread Paul E. McKenney
On Wed, Jun 08, 2022 at 04:27:39PM +0200, Peter Zijlstra wrote:
> Current rcu_idle_exit() is terminally broken because it uses
> local_irq_{save,restore}(), which are traced which uses RCU.
> 
> However, now that all the callers are sure to have IRQs disabled, we
> can remove these calls.
> 
> Signed-off-by: Peter Zijlstra (Intel) 
> Acked-by: Paul E. McKenney 

We have some fun conflicts between this series and Frederic's context-tracking
series.  But it looks like these can be resolved by:

1.  A patch on top of Frederic's series that provides the old rcu_*()
names for the functions now prefixed with ct_*() such as
ct_idle_exit().

2.  Another patch on top of Frederic's series that takes the
changes remaining from this patch, shown below.  Frederic's
series uses raw_local_irq_save() and raw_local_irq_restore(),
which can then be removed.

Or is there a better way to do this?

Thanx, Paul



commit f64cee8c159e9863a74594efe3d33fb513a6a7b5
Author: Peter Zijlstra 
Date:   Tue Jun 14 17:24:43 2022 -0700

context_tracking: Interrupts always disabled for ct_idle_exit()

Now that the idle-loop cleanups have ensured that rcu_idle_exit() is
always invoked with interrupts disabled, remove the interrupt disabling
in favor of a debug check.

Signed-off-by: Peter Zijlstra 
Cc: Frederic Weisbecker 
Signed-off-by: Paul E. McKenney 

diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 1da44803fd319..99310cf5b0254 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -332,11 +332,8 @@ EXPORT_SYMBOL_GPL(ct_idle_enter);
  */
 void noinstr ct_idle_exit(void)
 {
-   unsigned long flags;
-
-   raw_local_irq_save(flags);
+   WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
ct_kernel_enter(false, RCU_DYNTICKS_IDX - CONTEXT_IDLE);
-   raw_local_irq_restore(flags);
 }
 EXPORT_SYMBOL_GPL(ct_idle_exit);
 



Re: [PATCH 20/30] panic: Add the panic informational notifier list

2022-04-27 Thread Paul E. McKenney
On Wed, Apr 27, 2022 at 07:49:14PM -0300, Guilherme G. Piccoli wrote:
> The goal of this new panic notifier is to allow its users to
> register callbacks to run earlier in the panic path than they
> currently do. This aims at informational mechanisms, like dumping
> kernel offsets and showing device error data (in case it's simple
> registers reading, for example) as well as mechanisms to disable
> log flooding (like hung_task detector / RCU warnings) and the
> tracing dump_on_oops (when enabled).
> 
> Any (non-invasive) information that should be provided before
> kmsg_dump() as well as log flooding preventing code should fit
> here, as long it offers relatively low risk for kdump.
> 
> For now, the patch is almost a no-op, although it changes a bit
> the ordering in which some panic notifiers are executed - specially
> affected by this are the notifiers responsible for disabling the
> hung_task detector / RCU warnings, which now run first. In a
> subsequent patch, the panic path will be refactored, then the
> panic informational notifiers will effectively run earlier,
> before ksmg_dump() (and usually before kdump as well).
> 
> We also defer documenting it all properly in the subsequent
> refactor patch. Finally, while at it, we removed some useless
> header inclusions too.
> 
> Cc: Benjamin Herrenschmidt 
> Cc: Catalin Marinas 
> Cc: Florian Fainelli 
> Cc: Frederic Weisbecker 
> Cc: "H. Peter Anvin" 
> Cc: Hari Bathini 
> Cc: Joel Fernandes 
> Cc: Jonathan Hunter 
> Cc: Josh Triplett 
> Cc: Lai Jiangshan 
> Cc: Leo Yan 
> Cc: Mathieu Desnoyers 
> Cc: Mathieu Poirier 
> Cc: Michael Ellerman 
> Cc: Mike Leach 
> Cc: Mikko Perttunen 
> Cc: Neeraj Upadhyay 
> Cc: Nicholas Piggin 
> Cc: Paul Mackerras 
> Cc: Suzuki K Poulose 
> Cc: Thierry Reding 
> Cc: Thomas Bogendoerfer 
> Signed-off-by: Guilherme G. Piccoli 

>From an RCU perspective:

Acked-by: Paul E. McKenney 

> ---
>  arch/arm64/kernel/setup.c | 2 +-
>  arch/mips/kernel/relocate.c   | 2 +-
>  arch/powerpc/kernel/setup-common.c| 2 +-
>  arch/x86/kernel/setup.c   | 2 +-
>  drivers/bus/brcmstb_gisb.c| 2 +-
>  drivers/hwtracing/coresight/coresight-cpu-debug.c | 4 ++--
>  drivers/soc/tegra/ari-tegra186.c  | 3 ++-
>  include/linux/panic_notifier.h| 1 +
>  kernel/hung_task.c| 3 ++-
>  kernel/panic.c| 4 
>  kernel/rcu/tree.c | 1 -
>  kernel/rcu/tree_stall.h   | 3 ++-
>  kernel/trace/trace.c  | 2 +-
>  13 files changed, 19 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index 3505789cf4bd..ac2c7e8c9c6a 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -444,7 +444,7 @@ static struct notifier_block arm64_panic_block = {
>  
>  static int __init register_arm64_panic_block(void)
>  {
> - atomic_notifier_chain_register(&panic_notifier_list,
> + atomic_notifier_chain_register(&panic_info_list,
>  &arm64_panic_block);
>   return 0;
>  }
> diff --git a/arch/mips/kernel/relocate.c b/arch/mips/kernel/relocate.c
> index 56b51de2dc51..650811f2436a 100644
> --- a/arch/mips/kernel/relocate.c
> +++ b/arch/mips/kernel/relocate.c
> @@ -459,7 +459,7 @@ static struct notifier_block kernel_location_notifier = {
>  
>  static int __init register_kernel_offset_dumper(void)
>  {
> - atomic_notifier_chain_register(&panic_notifier_list,
> + atomic_notifier_chain_register(&panic_info_list,
>  &kernel_location_notifier);
>   return 0;
>  }
> diff --git a/arch/powerpc/kernel/setup-common.c 
> b/arch/powerpc/kernel/setup-common.c
> index 1468c3937bf4..d04b8bf8dbc7 100644
> --- a/arch/powerpc/kernel/setup-common.c
> +++ b/arch/powerpc/kernel/setup-common.c
> @@ -757,7 +757,7 @@ void __init setup_panic(void)
>  &ppc_fadump_block);
>  
>   if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_offset() > 0)
> - atomic_notifier_chain_register(&panic_notifier_list,
> + atomic_notifier_chain_register(&panic_info_list,
>  &kernel_offset_notifier);
>  
>   /* Low-level platform-specific routines that should run on panic */
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index c95b9ac5a457..599b25346964 100644
> --

Re: [PATCH v4 01/17] cpu: Add new {add,remove}_cpu() functions

2020-03-23 Thread Paul E. McKenney
On Mon, Mar 23, 2020 at 01:50:54PM +, Qais Yousef wrote:
> The new functions use device_{online,offline}() which are userspace
> safe.
> 
> This is in preparation to move cpu_{up, down} kernel users to use
> a safer interface that is not racy with userspace.
> 
> Suggested-by: "Paul E. McKenney" 
> Signed-off-by: Qais Yousef 
> CC: Thomas Gleixner 
> CC: "Paul E. McKenney" 

Reviewed-by: Paul E. McKenney 

> CC: Helge Deller 
> CC: Michael Ellerman 
> CC: "David S. Miller" 
> CC: Juergen Gross 
> CC: Mark Rutland 
> CC: Lorenzo Pieralisi 
> CC: xen-devel@lists.xenproject.org
> CC: linux-par...@vger.kernel.org
> CC: sparcli...@vger.kernel.org
> CC: linuxppc-...@lists.ozlabs.org
> CC: linux-arm-ker...@lists.infradead.org
> CC: x...@kernel.org
> CC: linux-ker...@vger.kernel.org
> ---
>  include/linux/cpu.h |  2 ++
>  kernel/cpu.c| 24 
>  2 files changed, 26 insertions(+)
> 
> diff --git a/include/linux/cpu.h b/include/linux/cpu.h
> index 1ca2baf817ed..cf8cf38dca43 100644
> --- a/include/linux/cpu.h
> +++ b/include/linux/cpu.h
> @@ -89,6 +89,7 @@ extern ssize_t arch_cpu_release(const char *, size_t);
>  #ifdef CONFIG_SMP
>  extern bool cpuhp_tasks_frozen;
>  int cpu_up(unsigned int cpu);
> +int add_cpu(unsigned int cpu);
>  void notify_cpu_starting(unsigned int cpu);
>  extern void cpu_maps_update_begin(void);
>  extern void cpu_maps_update_done(void);
> @@ -118,6 +119,7 @@ extern void cpu_hotplug_disable(void);
>  extern void cpu_hotplug_enable(void);
>  void clear_tasks_mm_cpumask(int cpu);
>  int cpu_down(unsigned int cpu);
> +int remove_cpu(unsigned int cpu);
>  
>  #else /* CONFIG_HOTPLUG_CPU */
>  
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 9c706af713fb..069802f7010f 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -1057,6 +1057,18 @@ int cpu_down(unsigned int cpu)
>  }
>  EXPORT_SYMBOL(cpu_down);
>  
> +int remove_cpu(unsigned int cpu)
> +{
> + int ret;
> +
> + lock_device_hotplug();
> + ret = device_offline(get_cpu_device(cpu));
> + unlock_device_hotplug();
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(remove_cpu);
> +
>  #else
>  #define takedown_cpu NULL
>  #endif /*CONFIG_HOTPLUG_CPU*/
> @@ -1209,6 +1221,18 @@ int cpu_up(unsigned int cpu)
>  }
>  EXPORT_SYMBOL_GPL(cpu_up);
>  
> +int add_cpu(unsigned int cpu)
> +{
> + int ret;
> +
> + lock_device_hotplug();
> + ret = device_online(get_cpu_device(cpu));
> + unlock_device_hotplug();
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(add_cpu);
> +
>  #ifdef CONFIG_PM_SLEEP_SMP
>  static cpumask_var_t frozen_cpus;
>  
> -- 
> 2.17.1
>