On Fri, Sep 26, 2025 at 11:10:41PM +0200, Claudio Jeker wrote:
> On Sat, Sep 20, 2025 at 10:59:51PM +0200, Claudio Jeker wrote:
> > The M10-1 hits this panic in roughly 24h of running make -j 32 build in a
> > loop. First time it exploded inside the reaper for me. So maybe this is
> > closer to the truth.
>
> Another run, took a bit more than 24h this time.
> This is from my top running when it paniced:
>
> load averages: 20.59, 10.40, 6.74 m10.zyd.ch
> 08:39:30
> 135 processes: 6 starting, 4 running, 98 idle, 1 dead, 26 on up 1 days
> 11:39:33
> 32 CPUs: 19.0% user, 0.0% nice, 12.8% sys, 67.4% spin, 0.0% intr, 0.8%
> idle
> Memory: Real: 504M/7786M act/tot Free: 116G Cache: 6008M Swap: 0K/88G
>
> cpu13 crashed because on mi_switch cpuswitch corrupted the registers.
> CPU after 17 did not stop (probably because the cpu mondo broke down).
>
> Three make processes are running:
> pid 83460 cpu 13 is toast but somewhere between fork and exec.
> pid 27987 cpu 4 is in uvmspace_exec()
> pid 2638 cpu 9 is spinning on the kernel lock for some syscall
>
> No process in ps /o seems to hold the KERNEL_LOCK which is a bit strange since
> the parent make process is waiting for the lock.

Requested by mpi: what is pmap_remove+0x1e0 and pmap_release+0xf4

(gdb) x /i pmap_remove+0x1e0
   0x169a400 <pmap_remove+480>: call  0x13f2760 <smp_tlb_flush_pte>
   0x169a404 <pmap_remove+484>: mov  %i0, %o0

Which is lines 1865 and 1866 of pmap.c:

1865                    /* Here we assume nothing can get into the TLB unless it has a PTE */
1866                    tlb_flush_pte(va, pm->pm_ctx);
   0x000000000169a3fc <+476>:   ldsw  [ %l0 + 0x10 ], %o1
   0x000000000169a400 <+480>:   call  0x13f2760 <smp_tlb_flush_pte>
   0x000000000169a404 <+484>:   mov  %i0, %o0

(gdb) x /i pmap_release+0xf4
   0x1696a14 <pmap_release+244>:        call  0x1696580 <pmap_free_page>
   0x1696a18 <pmap_release+248>:        mov  %i0, %o1

Which is line 1511 of pmap.c:

1508                            }
1509                    }
1510                    stxa(pdirentp, ASI_PHYS_CACHED, 0);
   0x0000000001696a08 <+232>:   clr  %g1
   0x0000000001696a0c <+236>:   stxa  %g1, [ %l2 ] #ASI_PHYS_USE_EC

1511                    pmap_free_page((paddr_t)ptbl, pm);
   0x0000000001696a10 <+240>:   mov  %l3, %o0
   0x0000000001696a14 <+244>:   call  0x1696580 <pmap_free_page>
   0x0000000001696a18 <+248>:   mov  %i0, %o1
   0x0000000001696a1c <+252>:   inc  %l5

-- 
:wq Claudio
