Re: linux-next: boot failure after merge of the powerpc tree

2022-12-02 Thread Michael Ellerman
Stephen Rothwell  writes:
> Hi all,
>
> After merging all the trees, today's linux-next qemu run (powerpc
> pseries_le_defconfig with kvm) crashed like this:
>
> Memory: 2029504K/2097152K available (14592K kernel code, 2944K rwdata, 18176K rodata, 5120K init, 1468K bss, 67648K reserved, 0K cma-reserved)
> SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
> BUG: Kernel NULL pointer dereference on read at 0x001c
> Faulting instruction address: 0xc047e9bc
> Oops: Kernel access of bad area, sig: 7 [#1]
> LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper Not tainted 6.1.0-rc7 #1
> Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf05 of:SLOF,HEAD hv:linux,kvm pSeries
> NIP:  c047e9bc LR: c0e06718 CTR: c047e970
> REGS: c2773770 TRAP: 0300   Not tainted  (6.1.0-rc7)
> MSR:  82001033   CR: 22004220  XER: 
> CFAR: c0070508 DAR: 001c DSISR: 0008 IRQMASK: 3 
> GPR00: c0e06718 c2773a10 c116fc00  
> GPR04: 2900 2800   
> GPR08: 000e c27afc00  4000 
> GPR12: c047e970 c295  013c8ff0 
> GPR16: 000d 02be00d0 0001 013c8e60 
> GPR20: 013c8fa8 013c8d90 c27b2160  
> GPR24: 0005 c27b3568 c0e06718 2900 
> GPR28: 2900 07fff33f  c2773bc8 
> NIP [c047e9bc] kmem_cache_alloc+0x5c/0x610
> LR [c0e06718] mas_alloc_nodes+0xe8/0x350
> Call Trace:
> [c2773a10] [0040] 0x40 (unreliable)
> [c2773a70] [c0e06718] mas_alloc_nodes+0xe8/0x350
> [c2773ad0] [c0e0f7f4] mas_expected_entries+0x94/0x110
> [c2773b10] [c012cc44] dup_mmap+0x194/0x730
> [c2773c80] [c012d260] dup_mm+0x80/0x180
> [c2773cc0] [c008e7c0] text_area_cpu_up_mm+0x20/0x1a0
> [c2773d20] [c013367c] cpuhp_invoke_callback+0x15c/0x810
> [c2773db0] [c01348dc] cpuhp_issue_call+0x28c/0x2a0
> [c2773e00] [c0134e44] __cpuhp_setup_state_cpuslocked+0x154/0x3e0
> [c2773eb0] [c0135180] __cpuhp_setup_state+0xb0/0x1d0
> [c2773f10] [c2016f9c] poking_init+0x40/0x9c
> [c2773f30] [c200434c] start_kernel+0x598/0x914
> [c2773fe0] [c000d990] start_here_common+0x1c/0x20
> Code: fb81ffe0 7c9b2378 3b293968 fbc1fff0 f8010010 7c7e1b78 fba1ffe8 fbe1fff8 91610008 f821ffa1 f8410018 83b9 <83e3001c> 7fbd2038 7bbc0020 7f84e378
> ---[ end trace  ]---
>
> Kernel panic - not syncing: Attempted to kill the idle task!
>
> Reverting commits
>
>   55a02e6ea958 ("powerpc/code-patching: Use temporary mm for Radix MMU")

Looks like this is related to the conflict you got merging tip.

If I switch the powerpc code to use mm_alloc() then I don't see the
above crash.

I needed to rebase anyway so I've squashed that change in for Monday.

cheers


linux-next: boot failure after merge of the powerpc tree

2022-12-01 Thread Stephen Rothwell
Hi all,

After merging all the trees, today's linux-next qemu run (powerpc
pseries_le_defconfig with kvm) crashed like this:

Memory: 2029504K/2097152K available (14592K kernel code, 2944K rwdata, 18176K rodata, 5120K init, 1468K bss, 67648K reserved, 0K cma-reserved)
SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
BUG: Kernel NULL pointer dereference on read at 0x001c
Faulting instruction address: 0xc047e9bc
Oops: Kernel access of bad area, sig: 7 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 6.1.0-rc7 #1
Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf05 of:SLOF,HEAD hv:linux,kvm pSeries
NIP:  c047e9bc LR: c0e06718 CTR: c047e970
REGS: c2773770 TRAP: 0300   Not tainted  (6.1.0-rc7)
MSR:  82001033   CR: 22004220  XER: 
CFAR: c0070508 DAR: 001c DSISR: 0008 IRQMASK: 3 
GPR00: c0e06718 c2773a10 c116fc00  
GPR04: 2900 2800   
GPR08: 000e c27afc00  4000 
GPR12: c047e970 c295  013c8ff0 
GPR16: 000d 02be00d0 0001 013c8e60 
GPR20: 013c8fa8 013c8d90 c27b2160  
GPR24: 0005 c27b3568 c0e06718 2900 
GPR28: 2900 07fff33f  c2773bc8 
NIP [c047e9bc] kmem_cache_alloc+0x5c/0x610
LR [c0e06718] mas_alloc_nodes+0xe8/0x350
Call Trace:
[c2773a10] [0040] 0x40 (unreliable)
[c2773a70] [c0e06718] mas_alloc_nodes+0xe8/0x350
[c2773ad0] [c0e0f7f4] mas_expected_entries+0x94/0x110
[c2773b10] [c012cc44] dup_mmap+0x194/0x730
[c2773c80] [c012d260] dup_mm+0x80/0x180
[c2773cc0] [c008e7c0] text_area_cpu_up_mm+0x20/0x1a0
[c2773d20] [c013367c] cpuhp_invoke_callback+0x15c/0x810
[c2773db0] [c01348dc] cpuhp_issue_call+0x28c/0x2a0
[c2773e00] [c0134e44] __cpuhp_setup_state_cpuslocked+0x154/0x3e0
[c2773eb0] [c0135180] __cpuhp_setup_state+0xb0/0x1d0
[c2773f10] [c2016f9c] poking_init+0x40/0x9c
[c2773f30] [c200434c] start_kernel+0x598/0x914
[c2773fe0] [c000d990] start_here_common+0x1c/0x20
Code: fb81ffe0 7c9b2378 3b293968 fbc1fff0 f8010010 7c7e1b78 fba1ffe8 fbe1fff8 91610008 f821ffa1 f8410018 83b9 <83e3001c> 7fbd2038 7bbc0020 7f84e378
---[ end trace  ]---

Kernel panic - not syncing: Attempted to kill the idle task!

Reverting commits

  55a02e6ea958 ("powerpc/code-patching: Use temporary mm for Radix MMU")
  d0462ee02fdd ("powerpc/code-patching: Consolidate and cache per-cpu patching context")
(the second only because it follows the first and modifies the same file)

fixes the panic.  I have done that in linux-next today.

-- 
Cheers,
Stephen Rothwell

