Hi Shrikanth, Thanks for your review. On Mon, 2026-03-30 at 14:19 +0530, Shrikanth Hegde wrote: > Hi Shivang. > > Few nitpicks. > > On 3/30/26 11:52 AM, Shivang Upadhyay wrote: > > During DLPAR operations, the newly added CPUs start in halted mode. > > The kernel then takes some time to initialize those CPUs internally > > and > > start them using the "start-cpu" RTAS call. However, if a kexec > > crash > > occurs in this window (before the new CPU has been initialized), > > the kexec NMI will try to reset all other CPUs from the crashing > > CPU. > > This leads to firmware starting the uninitialized CPUs as well. > > > > This can cause the kdump kernel to hang during bring-up. > > > > Sample Log: > > [175993.028231][ T1502] NIP [00007fffb953f394] 0x7fffb953f394 > > [175993.028314][ T1502] LR [00007fffb953f394] 0x7fffb953f394 > > [175993.028390][ T1502] --- interrupt: 3000 > > [ 5.519483][ T1] Processor 0 is stuck. > > [ 11.089481][ T1] Processor 1 is stuck. > > > > To fix this, only issue the system-reset hcall to CPUs that have > > actually been started by the kernel. > > > > Cc: Madhavan Srinivasan <[email protected]> > > Cc: Michael Ellerman <[email protected]> > > Cc: Nicholas Piggin <[email protected]> > > Cc: Christophe Leroy <[email protected]> > > Cc: Srikar Dronamraju <[email protected]> > > Cc: Shrikanth Hegde <[email protected]> > > Cc: Nysal Jan K.A. <[email protected]> > > Cc: Vishal Chourasia <[email protected]> > > Cc: Ritesh Harjani <[email protected]> > > Cc: Sourabh Jain <[email protected]> > > Reported-by: Anushree Mathur <[email protected]> > > Signed-off-by: Shivang Upadhyay <[email protected]> > > --- > > Changelog: > > > > V2: > > * added set_crash_nmi_ipi to saperate crash's case from other > > nmi_ipi > > users > > > > V1: > > * > > https://lore.kernel.org/all/[email protected]/ > > --- > > arch/powerpc/include/asm/smp.h | 1 + > > arch/powerpc/kernel/smp.c | 1 + > > arch/powerpc/platforms/pseries/smp.c | 29 > > +++++++++++++++++++++++++++- > > 3 files changed, 30 insertions(+), 1 deletion(-) > > > > diff --git a/arch/powerpc/include/asm/smp.h > > b/arch/powerpc/include/asm/smp.h > > index e41b9ea42122..cb74201f5674 100644 > > --- a/arch/powerpc/include/asm/smp.h > > +++ b/arch/powerpc/include/asm/smp.h > > @@ -47,6 +47,7 @@ struct smp_ops_t { > > void (*cause_ipi)(int cpu); > > #endif > > int (*cause_nmi_ipi)(int cpu); > > + void (*set_crash_nmi_ipi)(void); > > void (*probe)(void); > > int (*kick_cpu)(int nr); > > int (*prepare_cpu)(int nr); > > diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c > > index 3467f86fd78f..3390ee8adf79 100644 > > --- a/arch/powerpc/kernel/smp.c > > +++ b/arch/powerpc/kernel/smp.c > > @@ -594,6 +594,7 @@ void crash_send_ipi(void > > (*crash_ipi_callback)(struct pt_regs *)) > > { > > int cpu; > > > > + smp_ops->set_crash_nmi_ipi(); > > smp_send_nmi_ipi(NMI_IPI_ALL_OTHERS, crash_ipi_callback, > > 1000000); > > if (kdump_in_progress() && crash_wake_offline) { > > for_each_present_cpu(cpu) { > > diff --git a/arch/powerpc/platforms/pseries/smp.c > > b/arch/powerpc/platforms/pseries/smp.c > > index db99725e752b..c6c2baacca9a 100644 > > --- a/arch/powerpc/platforms/pseries/smp.c > > +++ b/arch/powerpc/platforms/pseries/smp.c > > @@ -51,6 +51,9 @@ > > */ > > static cpumask_var_t of_spin_mask; > > > > + > > +static int crash_nmi_ipi; > > + > > /* Query where a cpu is now. Return codes #defined in > > plpar_wrappers.h */ > > int smp_query_cpu_stopped(unsigned int pcpu) > > { > > @@ -171,12 +174,35 @@ static void dbell_or_ic_cause_ipi(int cpu) > > ic_cause_ipi(cpu); > > } > > > > +static void pseries_set_crash_nmi_ipi(void) > > +{ > > + crash_nmi_ipi = 1; > > +} > > + > > static int pseries_cause_nmi_ipi(int cpu) > > { > > int hwcpu; > > + int k, curcpu; > > Please follow inverted christmas tree for variables. Ack. > > > Can the below block be re-written with less indentations? > One level indentation removal could be. > > if (cpu == NMI_IPI_ALL_OTHERS && crash_nmi_ipi) > hwcpu = <> > else if (cpu == NMI_IPI_ALL_OTHERS) > hwcpu = H_SIGNAL_SYS_RESET_ALL_OTHERS; > else > <existing remaining logic> > > > > > + curcpu = smp_processor_id(); > > if (cpu == NMI_IPI_ALL_OTHERS) { > > - hwcpu = H_SIGNAL_SYS_RESET_ALL_OTHERS; > > + if (crash_nmi_ipi) { > > + for_each_present_cpu(k) { > > + if (k != curcpu) { > > Maybe below to reduce one more indentation level. > > if ( k == curcpu) > continue; > Sure, i can rewrite it that way. > > > + hwcpu = > > get_hard_smp_processor_id(k); > > + > > + /* it is possible that cpu > > is present, > > + * but not started yet. > > + */ > > + > > + if (paca_ptrs[hwcpu]- > > >cpu_start == 1) { > > It either 1 or 0. So if(paca_ptrs[hwcpu]->cpu_start) is enough. Ack.
~Shivang.
