During DLPAR CPU hotplug, newly added CPUs start in the RTAS stopped
(quiesced) state. If a kexec crash is triggered before the guest starts
these CPUs via the start-cpu RTAS call, H_SIGNAL_SYS_RESET_ALL_OTHERS
resets them anyway, causing the kdump kernel to hang:

  [ 5.519483][ T1] Processor 0 is stuck.
  [ 11.089481][ T1] Processor 1 is stuck.

The hypervisor should only reset CPUs that the guest has started. The
cpu->env.quiesced flag tracks RTAS stopped state - CPUs in this state
are already inactive and should not be reset.

Skip system reset for quiesced CPUs to prevent kdump hangs during CPU
hotplug operations.

Cc: Sourabh Jain <[email protected]>
Cc: Harsh Prateek Bora <[email protected]>
Cc: Mahesh J Salgaonkar <[email protected]>
Reported-by: Anushree Mathur <[email protected]>
Suggested-by: Vishal Chourasia <[email protected]>
Reviewed-by: Vishal Chourasia <[email protected]>
Signed-off-by: Shivang Upadhyay <[email protected]>
---
Changelog:
v2:
 * added braces to adhere to style guide.
 * rebase to master
v1:
 * https://lore.kernel.org/all/[email protected]/
---
 hw/ppc/spapr_hcall.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 032805a8d0..613dd893bb 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1105,6 +1105,12 @@ static target_ulong h_signal_sys_reset(PowerPCCPU *cpu,
                     continue;
                 }
             }
+
+            /* Skip quiesced CPUs */
+            if (c->env.quiesced) {
+                continue;
+            }
+
             run_on_cpu(cs, spapr_do_system_reset_on_cpu, RUN_ON_CPU_NULL);
         }
         return H_SUCCESS;
-- 
2.53.0
