On 26/06/24 15:10, Gautam Menghani wrote:
Without this patch, we had an issue: if some CPUs are disabled in the
system and we try to do a two-stage kexec as follows:

kexec -l vmlinux ....
kexec -e

we would hit the following Oops

[ 2598.923098] kernel BUG at arch/powerpc/kernel/exceptions-64s.S:501!
[ 2598.923103] Oops: Exception in kernel mode, sig: 5 [#1]
[ 2598.923107] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
[ 2598.923111] Modules linked in: rpcrdma rdma_cm iw_cm ib_cm ib_core xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables bridge stp llc kvm_hv kvm bonding tls rfkill binfmt_misc tg3 vmx_crypto aes_gcm_p10_crypto ibmveth crct10dif_vpmsum pseries_rng nfsd auth_rpcgss nfs_acl lockd grace sunrpc fuse loop dm_multipath nfnetlink zram xfs ibmvscsi scsi_transport_srp crc32c_vpmsum pseries_wdt scsi_dh_rdac scsi_dh_emc scsi_dh_alua ip6_tables ip_tables
[ 2598.923167] CPU: 11 PID: 1548 Comm: systemd-journal Not tainted 6.9.0+ #4
[ 2598.923171] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NH1060_022) hv:phyp pSeries
[ 2598.923176] NIP:  c0000000000089e4 LR: 00007fffaa1427c4 CTR: c0000000000089b0
[ 2598.923180] REGS: c0000008dfe7fd60 TRAP: 0700   Not tainted  (6.9.0+)
[ 2598.923184] MSR:  8000000000021031 <SF,ME,IR,DR,LE>  CR: 28002413  XER: 00000000
[ 2598.923192] CFAR: c0000000000089dc IRQMASK: 0
[ 2598.923192] GPR00: 0000000000000003 00007ffff40fb110 0000000000000000 0000000000000009
[ 2598.923192] GPR04: 00007ffff40fbcf0 0000000000002000 00007ffff40fdcc0 0000000000000000
[ 2598.923192] GPR08: 00007fffaabc3b80 0000000048002413 00007ffff40fb3e0 0000000000017000
[ 2598.923192] GPR12: 8000000000009003 c0000008dfff2b00 0000000000000000 0000000000000000
[ 2598.923192] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 2598.923192] GPR20: 0000000000000000 0000000000000000 0000000000000000 00007fffaabaf448
[ 2598.923192] GPR24: 000000011bc72700 00007ffff40fddf8 0000000132490ea0 00007ffff40fddf0
[ 2598.923192] GPR28: 0000000000000000 00007ffff40fbcf0 0000000000002000 0000000000000009
[ 2598.923238] NIP [c0000000000089e4] data_access_common_virt+0x14/0x220
[ 2598.923245] LR [00007fffaa1427c4] 0x7fffaa1427c4
[ 2598.923251] Call Trace:
[ 2598.923253] Code: 2c0a0000 39400300 408242c0 e94d0020 694a0002 7d400164 60420000 718a4000 7c2a0b78 3821fd30 41c20008 e82d0910 <0981fd30> f9210160 f9610130 f9810138
[ 2598.923269] ---[ end trace 0000000000000000 ]---
[ 2598.926662] pstore: backend (nvram) writing error (-1)


With this patch, the disabled cpus are woken up and kexec goes through
fine.

I verified the same on an LPAR and observed the same behavior that Gautam described above.

Thanks for the fix Nick.

Tested-by: Sourabh Jain <sourabhj...@linux.ibm.com>

- Sourabh
