During DLPAR CPU hotplug, newly added CPUs start in the RTAS stopped
(quiesced) state. If a crash kexec is triggered before the guest starts
these CPUs via the start-cpu RTAS call, H_SIGNAL_SYS_RESET_ALL_OTHERS
resets them anyway, causing the kdump kernel to hang:

  [    5.519483][    T1] Processor 0 is stuck.
  [   11.089481][    T1] Processor 1 is stuck.

The hypervisor should only reset CPUs that the guest has started. The
cpu->env.quiesced flag tracks the RTAS stopped state; CPUs in this
state are already inactive and should not be reset.

Skip system reset for quiesced CPUs to prevent kdump hangs during CPU
hotplug operations.

Cc: Sourabh Jain <[email protected]>
Cc: Harsh Prateek Bora <[email protected]>
Cc: Mahesh J Salgaonkar <[email protected]>
Reported-by: Anushree Mathur <[email protected]>
Suggested-by: Vishal Chourasia <[email protected]>
Reviewed-by: Vishal Chourasia <[email protected]>
Signed-off-by: Shivang Upadhyay <[email protected]>
---
Changelog:

v2:
 * Added braces to adhere to the style guide.
 * Rebased onto master.

v1:
 * https://lore.kernel.org/all/[email protected]/
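
Illustration (not part of the patch): the selection rule described in
the commit message can be modelled in isolation. This is a minimal,
standalone sketch. `struct model_cpu`, its fields, and
`reset_all_others()` are hypothetical names invented for this note, not
QEMU types or functions; only the idea of skipping the caller and any
quiesced CPU is taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-vCPU state; not QEMU's PowerPCCPU. */
struct model_cpu {
    int id;
    bool quiesced;   /* true while the CPU is in the RTAS stopped state */
    bool was_reset;  /* set when the model "delivers" a system reset */
};

/*
 * Model of the ALL_OTHERS case: reset every CPU except the caller and
 * except those that are still quiesced. Returns the number of CPUs
 * that were reset.
 */
static size_t reset_all_others(struct model_cpu *cpus, size_t n,
                               int caller_id)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (cpus[i].id == caller_id) {
            continue;   /* the calling CPU is not reset */
        }
        if (cpus[i].quiesced) {
            continue;   /* never started by the guest: leave it alone */
        }
        cpus[i].was_reset = true;
        count++;
    }
    return count;
}
```

With two started CPUs and two hot-added CPUs that are still quiesced, a
call from CPU 0 resets only CPU 1 in this model, which is the behaviour
the patch aims for.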
---
 hw/ppc/spapr_hcall.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 032805a8d0..613dd893bb 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1105,6 +1105,12 @@ static target_ulong h_signal_sys_reset(PowerPCCPU *cpu,
                     continue;
                 }
             }
+
+            /* Skip quiesced CPUs */
+            if (c->env.quiesced) {
+                continue;
+            }
+
             run_on_cpu(cs, spapr_do_system_reset_on_cpu, RUN_ON_CPU_NULL);
         }
         return H_SUCCESS;
-- 
2.53.0