Re: [PATCH v2] ppc/spapr: Skip system reset for quiesced CPUs

Anushree Mathur Wed, 13 May 2026 10:32:57 -0700



On 11/05/26 3:20 PM, Shivang Upadhyay wrote:

During DLPAR CPU hotplug, newly added CPUs start in RTAS stopped state
(quiesced). If a kexec crash occurs before the guest starts these CPUs
via start-cpu RTAS call, H_SIGNAL_SYS_RESET_ALL_OTHERS will reset them
anyway, causing the kdump kernel to hang:

   [    5.519483][    T1] Processor 0 is stuck.
   [   11.089481][    T1] Processor 1 is stuck.

The hypervisor should only reset CPUs that the guest has started. The
cpu->env.quiesced flag tracks RTAS stopped state - CPUs in this state
are already inactive and should not be reset.

Skip system reset for quiesced CPUs to prevent kdump hangs during CPU
hotplug operations.

Cc: Sourabh Jain <[email protected]>
Cc: Harsh Prateek Bora <[email protected]>
Cc: Mahesh J Salgaonkar <[email protected]>
Reported-by: Anushree Mathur <[email protected]>
Suggested-by: Vishal Chourasia <[email protected]>
Reviewed-by: Vishal Chourasia <[email protected]>
Signed-off-by: Shivang Upadhyay <[email protected]>
---
Changelog:

v2:
  * added braces to adhere to style guide.
  * rebase to master

v1:
  * https://lore.kernel.org/all/[email protected]/
---
  hw/ppc/spapr_hcall.c | 6 ++++++
  1 file changed, 6 insertions(+)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 032805a8d0..613dd893bb 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1105,6 +1105,12 @@ static target_ulong h_signal_sys_reset(PowerPCCPU *cpu,
                      continue;
                  }
              }
+
+            /* Skip quiesced CPUs */
+            if (c->env.quiesced) {
+                continue;
+            }
+
              run_on_cpu(cs, spapr_do_system_reset_on_cpu, RUN_ON_CPU_NULL);
          }
          return H_SUCCESS;
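
For readers following the thread, the intent of the hunk above can be
sketched as a small standalone model. This is only an illustration under
stated assumptions: ModelCPU and model_reset_all_others() are hypothetical
names, not QEMU code, and the sketch covers only the "all others" case
described in the commit message (skip the calling CPU, skip CPUs still in
RTAS stopped state, reset the rest).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vCPU; this is NOT QEMU's PowerPCCPU. */
typedef struct ModelCPU {
    int index;       /* logical CPU number */
    bool quiesced;   /* true while the CPU is in RTAS stopped state */
    int resets;      /* how many system resets were delivered to it */
} ModelCPU;

/*
 * Model of the "reset all others" behaviour with the quiesced check:
 * every CPU except the caller and except quiesced CPUs gets a reset.
 * Returns the number of CPUs that were reset.
 */
static size_t model_reset_all_others(ModelCPU *cpus, size_t n, size_t caller)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (i == caller) {
            continue;   /* the calling CPU never resets itself here */
        }
        if (cpus[i].quiesced) {
            continue;   /* hotplugged but never started: leave it alone */
        }
        cpus[i].resets++;
        count++;
    }
    return count;
}
```

With two started CPUs and two hotplugged-but-not-started CPUs, a call from
CPU 0 resets only CPU 1 in this model; the two quiesced CPUs are untouched.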



Hi Shivang,

Thanks for working on the reported issue. After applying the patch, the originally reported issue (the guest hanging when kdump is triggered while CPU hotplug is in progress) is fixed. However, repeating the same scenario several times exposed multiple other issues, the most serious being a QEMU crash. I believe this needs to be fixed.



Here is my analysis of this issue with and without the patch:

1) Without applying the patch:

i) Start the guest with maxvcpus as 64 and current vcpus as 8

ii) Start CPU hotplug [virsh setvcpus guest_name 64] and at the same time trigger kdump on the guest [echo c > /proc/sysrq-trigger]

The guest hangs:


[   32.930453][ T1208] NIP [00007fffbe35b3c4] 0x7fffbe35b3c4
[   32.930528][ T1208] LR [00007fffbe35b3c4] 0x7fffbe35b3c4
[   32.930638][ T1208] ---- interrupt: 3000
[    9.857410][    T1] Processor 0 is stuck.



2) After applying the patch


Several issues were seen across multiple attempts of this scenario:

i) In the 4th attempt I saw DLPAR-related traces along with an Oops after triggering kdump:

[    6.071156][  T121] pseries-hotplug-cpu: Cannot add cpu /cpus/PowerPC,POWER11@20; this system configuration supports 32 logical cpus.
[    6.071313][  T121] OF: changeset notifier error@/cpus/PowerPC,POWER11@20
[    6.074099][  T121] BUG: Unable to handle kernel data access at 0x151591241bba0bb6
[    6.074232][  T121] Faulting instruction address: 0xc0000000211e5b98
[    6.074311][  T121] Oops: Kernel access of bad area, sig: 11 [#1]

[    6.076695][  T121] Call Trace:
[    6.076741][  T121] [c000000026a6baf0] [c000000026a6bb30] 0xc000000026a6bb30 (unreliable)
[    6.076834][  T121] [c000000026a6bb60] [0000000010000021] 0x10000021
[    6.076930][  T121] [c000000026a6bb90] [c000000020e55b14] of_get_next_child+0x64/0xd0
[    6.077034][  T121] [c000000026a6bbd0] [c0000000201cd1dc] dlpar_cpu_add+0xbc/0x5e0
[    6.077148][  T121] [c000000026a6bcb0] [c0000000201ce9d0] dlpar_cpu+0x60/0x1f0
[    6.077241][  T121] [c000000026a6bd40] [c0000000201c5914] handle_dlpar_errorlog+0x1f4/0x6e0
[    6.077333][  T121] [c000000026a6be20] [c0000000201c5e28] pseries_hp_work_fn+0x28/0x60
[    6.077425][  T121] [c000000026a6be50] [c000000020259e6c] process_one_work+0x1dc/0x540
[    6.077516][  T121] [c000000026a6bf00] [c00000002025ae0c] worker_thread+0x36c/0x4d0
[    6.077608][  T121] [c000000026a6bf90] [c000000020269978] kthread+0x168/0x190
[    6.077700][  T121] [c000000026a6bfe0] [c00000002000de58] start_kernel_thread+0x14/0x18


ii) In the 7th attempt I saw XIVE-related errors:


[   61.692603][ T1909] ---- interrupt: 3000
[    0.010215][    T1] xive: H_INT_GET_QUEUE_INFO cpu=62 prio=6 failed -55
[    0.013834][    T1] xive: Error -55 getting queue info CPU 62 prio 6

iii) QEMU crashed after 10 attempts with the following error message in the libvirt/qemu logs:

qemu-system-ppc64: ../hw/ppc/spapr.c:4396: spapr_cpu_index_to_props: Assertion `core_slot' failed.

2026-05-13 09:46:29.656+0000: shutting down, reason=crashed


Thank you!
Anushree Mathur
