From: Shivang Upadhyay <[email protected]>

During DLPAR CPU hotplug, newly added CPUs start in RTAS stopped state
(quiesced). If a kexec crash occurs before the guest starts these CPUs
via start-cpu RTAS call, H_SIGNAL_SYS_RESET_ALL_OTHERS will reset them
anyway, causing the kdump kernel to hang:

  [ 5.519483][ T1] Processor 0 is stuck.
  [ 11.089481][ T1] Processor 1 is stuck.

The hypervisor should only reset CPUs that the guest has started. The
cpu->env.quiesced flag tracks RTAS stopped state - CPUs in this state
are already inactive and should not be reset.

Skip system reset for quiesced CPUs to prevent kdump hangs during CPU
hotplug operations.

Cc: Sourabh Jain <[email protected]>
Cc: Harsh Prateek Bora <[email protected]>
Cc: Mahesh J Salgaonkar <[email protected]>
Reported-by: Anushree Mathur <[email protected]>
Suggested-by: Vishal Chourasia <[email protected]>
Reviewed-by: Vishal Chourasia <[email protected]>
Signed-off-by: Shivang Upadhyay <[email protected]>
Link: https://lore.kernel.org/qemu-devel/[email protected]
[harshpb: expanded comment to elaborate more on the rationale]
Signed-off-by: Harsh Prateek Bora <[email protected]>
---
 hw/ppc/spapr_hcall.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 032805a8d0..60ba215e86 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1105,6 +1105,15 @@ static target_ulong h_signal_sys_reset(PowerPCCPU *cpu,
                     continue;
                 }
             }
+
+            /* Skip quiesced CPUs - they are in RTAS stopped state and
+             * should not be reset. This prevents kdump hangs when CPUs
+             * are hotplugged but not yet started by the guest.
+             */
+            if (c->env.quiesced) {
+                continue;
+            }
+
             run_on_cpu(cs, spapr_do_system_reset_on_cpu, RUN_ON_CPU_NULL);
         }
         return H_SUCCESS;
-- 
2.52.0
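
For readers who want to see the filtering rule in isolation, here is a
minimal standalone sketch of the "reset all others, but skip quiesced
CPUs" loop. It is only a model under stated assumptions: ModelCpu and
model_reset_all_others() are hypothetical names invented for this
illustration and are not QEMU APIs. The real code iterates with
CPU_FOREACH() and dispatches through run_on_cpu() as shown in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vCPU; this is not QEMU's PowerPCCPU. */
typedef struct {
    int id;
    bool quiesced;  /* true while the CPU is in RTAS stopped state */
    int resets;     /* number of system resets delivered so far */
} ModelCpu;

/*
 * Model of the ALL_OTHERS case: deliver a reset to every CPU except
 * the caller and any quiesced CPU. Returns how many CPUs were reset.
 */
static int model_reset_all_others(ModelCpu *cpus, size_t n, int caller_id)
{
    int count = 0;

    assert(cpus != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (cpus[i].id == caller_id) {
            continue;   /* the calling CPU is excluded */
        }
        if (cpus[i].quiesced) {
            continue;   /* hotplugged but never started: leave alone */
        }
        cpus[i].resets++;
        count++;
    }
    return count;
}
```

With three modelled CPUs where CPU 0 is the caller and CPU 2 is still
quiesced, only CPU 1 receives a reset in this sketch.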
