Greetings!!!

I am seeing an early-boot kernel panic caused by a NULL pointer dereference on a POWER9 (pSeries) system when testing linux-next (next-20260522).


Traces:

[    0.038567] Big cores detected but using small core scheduling
[    0.038796] BUG: Kernel NULL pointer dereference at 0x00000000
[    0.038804] Faulting instruction address: 0xc000000000e58504
[    0.038812] Oops: Kernel access of bad area, sig: 11 [#1]
[    0.038819] LE PAGE_SIZE=64K MMU=Hash  SMP NR_CPUS=8192 NUMA pSeries
[    0.038830] Modules linked in:
[    0.038840] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 7.0.0-rc6+ #14 PREEMPTLAZY
[    0.038851] Hardware name: IBM,8375-42A POWER9 (architected) 0x4e0202 0xf000005 of:IBM,FW950.80 (VL950_131) hv:phyp pSeries
[    0.038860] NIP:  c000000000e58504 LR: c000000000e58500 CTR: 0000000000000000
[    0.038869] REGS: c0000000090e78e0 TRAP: 0380   Not tainted (7.0.0-rc6+)
[    0.038878] MSR:  8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 44002242  XER: 20040003
[    0.038907] CFAR: c00000000093f3f0 IRQMASK: 0
[    0.038907] GPR00: c00000000038b3b8 c0000000090e7b80 c00000000259a800 0000000000000000
[    0.038907] GPR04: 0000000000000038 0000000000000038 c00000000c6e2560 0000000000000000
[    0.038907] GPR08: 0000000000000000 0000000000000037 0000ffffffffffff 0000000000000000
[    0.038907] GPR12: c000000000072730 c0000000051b0000 c00000000c6ee560 00000000ffffffff
[    0.038907] GPR16: 0000000000000000 0000000000000038 c0000000032c6b08 fffffffffffffff6
[    0.038907] GPR20: 0000000000000000 c000000004d1a6e0 0000000000000000 0000000000000000
[    0.038907] GPR24: 0000000000000000 0000000000000000 00000000ffffffff c00000000a3bf940
[    0.038907] GPR28: 0000000000000038 0000000000000000 0000000000000000 0000000000000000
[    0.039029] NIP [c000000000e58504] _find_first_bit+0x44/0x130
[    0.039043] LR [c000000000e58500] _find_first_bit+0x40/0x130
[    0.039054] Call Trace:
[    0.039060] [c0000000090e7b80] [c00000000416af20] schedutil_gov+0x0/0xa0 (unreliable)
[    0.039076] [c0000000090e7bc0] [c00000000038b3b8] build_sched_domains+0xad8/0xe50
[    0.039089] [c0000000090e7ce0] [c000000003045d78] sched_init_smp+0xa8/0x164
[    0.039102] [c0000000090e7d30] [c00000000300f374] kernel_init_freeable+0x250/0x370
[    0.039117] [c0000000090e7de0] [c000000000011f90] kernel_init+0x34/0x1e4
[    0.039129] [c0000000090e7e50] [c00000000000debc] ret_from_kernel_user_thread+0x14/0x1c
[    0.039142] ---- interrupt: 0 at 0x0
[    0.039150] Code: 41820090 7c0802a6 393cffff fbe10038 7c7f1b78 fba10028 fbc10030 3bc00000 793dd7e2 f8010050 4bae6e9d 60000000 <e93f0000> 2c290000 408200bc 283c0040
[    0.039196] ---[ end trace 0000000000000000 ]---


Git bisect points to b5ea300a17e3 ("sched/cache: Make LLC id continuous") as the first bad commit.


Git Bisect Logs:


# git bisect log
git bisect start
# status: waiting for both good and bad commits
# bad: [c1ecb239fa3456529a32255359fc78b69eb9d847] Add linux-next specific files for 20260522
git bisect bad c1ecb239fa3456529a32255359fc78b69eb9d847
# status: waiting for good commit(s), bad commit known
# good: [5200f5f493f79f14bbdc349e402a40dfb32f23c8] Linux 7.1-rc4
git bisect good 5200f5f493f79f14bbdc349e402a40dfb32f23c8
# good: [7cd27a0d57b8539366c98bb04fe48d1aff779ea9] Merge branch 'main' of https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
git bisect good 7cd27a0d57b8539366c98bb04fe48d1aff779ea9
# good: [efb3dd6031ec9858c7285fd673970320c86c01f3] Merge branch 'next' of https://git.kernel.org/pub/scm/linux/kernel/git/dtor/input.git
git bisect good efb3dd6031ec9858c7285fd673970320c86c01f3
# bad: [1a6066d1c1243fdc5ed464032bbdf12e6710c027] Merge branch 'driver-core-next' of https://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core.git
git bisect bad 1a6066d1c1243fdc5ed464032bbdf12e6710c027
# good: [409a99cbc316d912c999fd75b9df042b25900e50] Merge branch 'for-next' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git
git bisect good 409a99cbc316d912c999fd75b9df042b25900e50
# bad: [af73f6b022c8c09a3234176892a18216be4cd984] Merge branch 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm.git
git bisect bad af73f6b022c8c09a3234176892a18216be4cd984
# bad: [6a459eb254e4bff61546587eccd3091955123d24] Merge branch into tip/master: 'sched/core'
git bisect bad 6a459eb254e4bff61546587eccd3091955123d24
# good: [71ba4bb66c3a9287245d0f5fcfb27d4b951ba402] Merge branch into tip/master: 'locking/core'
git bisect good 71ba4bb66c3a9287245d0f5fcfb27d4b951ba402
# good: [f3b45696a160a2230d846de8f706e835984ae65b] Merge branch into tip/master: 'objtool/core'
git bisect good f3b45696a160a2230d846de8f706e835984ae65b
# bad: [c99b8593b060931c5a0a4b701689f8d6a2c00dbf] sched/cache: Fix stale preferred_llc for a new task
git bisect bad c99b8593b060931c5a0a4b701689f8d6a2c00dbf
# bad: [5b1d5e6db20a6c64ffb95d04578db8c4b0228eea] sched/cache: Respect LLC preference in task migration and detach
git bisect bad 5b1d5e6db20a6c64ffb95d04578db8c4b0228eea
# bad: [46afe3af7ead57190b6d362e214814ec804e3b7b] sched/cache: Track LLC-preferred tasks per runqueue
git bisect bad 46afe3af7ead57190b6d362e214814ec804e3b7b
# good: [f025ef275388742643a2c33f00a0d9c0af3112ee] sched/cache: Record per LLC utilization to guide cache aware scheduling decisions
git bisect good f025ef275388742643a2c33f00a0d9c0af3112ee
# bad: [b5ea300a17e37eada7a98561fbd34a3054578713] sched/cache: Make LLC id continuous
git bisect bad b5ea300a17e37eada7a98561fbd34a3054578713
# good: [23b2b5ccc45ce2a38b9336a916088fffdc4cdfb1] sched/cache: Introduce helper functions to enforce LLC migration policy
git bisect good 23b2b5ccc45ce2a38b9336a916088fffdc4cdfb1
# first bad commit: [b5ea300a17e37eada7a98561fbd34a3054578713] sched/cache: Make LLC id continuous


b5ea300a17e37eada7a98561fbd34a3054578713 is the first bad commit
commit b5ea300a17e37eada7a98561fbd34a3054578713
Author: Tim Chen <[email protected]>
Date:   Wed Apr 1 14:52:17 2026 -0700

    sched/cache: Make LLC id continuous

    Introduce an index mapping between CPUs and their LLCs. This provides
    a roughly continuous per LLC index needed for cache-aware load balancing in
    later patches.

    The existing per_cpu llc_id usually points to the first CPU of the
    LLC domain, which is sparse and unsuitable as an array index. Using
    llc_id directly would waste memory.

    With the new mapping, CPUs in the same LLC share an approximate
    continuous id:

      per_cpu(llc_id, CPU=0...15)  = 0
      per_cpu(llc_id, CPU=16...31) = 1
      per_cpu(llc_id, CPU=32...47) = 2
      ...

    Note that the LLC IDs are allocated via bitmask, so the IDs may be
    reused during CPU offline->online transitions.

    Suggested-by: Peter Zijlstra (Intel) <[email protected]>
    Originally-by: K Prateek Nayak <[email protected]>
    Co-developed-by: Chen Yu <[email protected]>
    Signed-off-by: Chen Yu <[email protected]>
    Signed-off-by: Tim Chen <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Link: https://patch.msgid.link/047ef46339e4db497b54a89940a7ebedf27fcf28.1775065312.git.tim.c.c...@linux.intel.com

 kernel/sched/core.c     |  2 ++
 kernel/sched/sched.h    |  3 ++
 kernel/sched/topology.c | 90 +++++++++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 93 insertions(+), 2 deletions(-)
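
For readers not familiar with the scheme, the bitmask-based id allocation described in the commit message can be sketched roughly as below. This is a standalone userspace illustration only, not the code from kernel/sched/topology.c; the names llc_id_mask, llc_id_alloc(), llc_id_free() and MAX_LLCS are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical upper bound on LLC ids, chosen so one 64-bit word suffices. */
#define MAX_LLCS 64

/* Bit n set means dense id n is currently in use (illustrative name). */
static uint64_t llc_id_mask;

/* Hand out the lowest free id, or -1 if every id is taken. */
static int llc_id_alloc(void)
{
	for (int id = 0; id < MAX_LLCS; id++) {
		uint64_t bit = UINT64_C(1) << id;

		if (!(llc_id_mask & bit)) {
			llc_id_mask |= bit;
			return id;
		}
	}
	return -1;
}

/* Release an id, e.g. when the last CPU of an LLC goes offline. */
static void llc_id_free(int id)
{
	assert(id >= 0 && id < MAX_LLCS);
	llc_id_mask &= ~(UINT64_C(1) << id);
}
```

Because the lowest free bit is always chosen, an id released on offline is handed out again on the next allocation, which matches the note in the commit message about ids being reused across offline->online transitions.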


If you happen to fix this, please add the tag below.


Reported-by: Venkat Rao Bagalkote <[email protected]>


Regards,

Venkat.


