On Fri, Sep 26, 2025 at 11:10:41PM +0200, Claudio Jeker wrote:
> On Sat, Sep 20, 2025 at 10:59:51PM +0200, Claudio Jeker wrote:
> > The M10-1 hits this panic in roughly 24h of running make -j 32 build in a
> > loop. First time it exploded inside the reaper for me. So maybe this is
> > closer to the truth.
> 
> Another run, took a bit more than 24h this time.
> This is from my top running when it panicked:
> 
> load averages: 20.59, 10.40,  6.74                          m10.zyd.ch 08:39:30
> 135 processes: 6 starting, 4 running, 98 idle, 1 dead, 26 on up 1 days 11:39:33
> 32  CPUs: 19.0% user,  0.0% nice, 12.8% sys, 67.4% spin,  0.0% intr,  0.8% idle
> Memory: Real: 504M/7786M act/tot Free: 116G Cache: 6008M Swap: 0K/88G
> 
> cpu13 crashed because cpuswitch corrupted the registers during mi_switch.
> The CPUs after 17 did not stop (probably because the cpu mondo broke down).
> 
> Three make processes are running:
> pid 83460 cpu 13 is toast but somewhere between fork and exec.
> pid 27987 cpu 4 is in uvmspace_exec()
> pid  2638 cpu 9 is spinning on the kernel lock for some syscall
> 
> No process in ps /o seems to hold the KERNEL_LOCK which is a bit strange since
> the parent make process is waiting for the lock.

Requested by mpi: what is at pmap_remove+0x1e0 and pmap_release+0xf4?

(gdb) x /i pmap_remove+0x1e0
   0x169a400 <pmap_remove+480>: call  0x13f2760 <smp_tlb_flush_pte>
   0x169a404 <pmap_remove+484>: mov  %i0, %o0
Which is lines 1865 and 1866 of pmap.c:
1865                            /* Here we assume nothing can get into the TLB unless it has a PTE */
1866                            tlb_flush_pte(va, pm->pm_ctx);
   0x000000000169a3fc <+476>:   ldsw  [ %l0 + 0x10 ], %o1
   0x000000000169a400 <+480>:   call  0x13f2760 <smp_tlb_flush_pte>
   0x000000000169a404 <+484>:   mov  %i0, %o0


(gdb) x /i pmap_release+0xf4
   0x1696a14 <pmap_release+244>:        call  0x1696580 <pmap_free_page>
   0x1696a18 <pmap_release+248>:        mov  %i0, %o1
Which is line 1511 of pmap.c:
1508                                                    }
1509                                            }
1510                                            stxa(pdirentp, ASI_PHYS_CACHED, 0);
   0x0000000001696a08 <+232>:   clr  %g1
   0x0000000001696a0c <+236>:   stxa  %g1, [ %l2 ] #ASI_PHYS_USE_EC

1511                                             pmap_free_page((paddr_t)ptbl, pm);
   0x0000000001696a10 <+240>:   mov  %l3, %o0
   0x0000000001696a14 <+244>:   call  0x1696580 <pmap_free_page>
   0x0000000001696a18 <+248>:   mov  %i0, %o1
   0x0000000001696a1c <+252>:   inc  %l5
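
For anyone cross-checking the offsets: gdb prints them in decimal
(<pmap_remove+480>) while the trace uses hex (pmap_remove+0x1e0), and the
function start is just the instruction address minus the offset. A minimal
sketch of that arithmetic, using only the addresses shown above; the helper
names func_start() and check_offsets() are made up for illustration and are
not part of the kernel or gdb:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: given the address of an instruction and the
 * offset from a symbol+offset pair, return where the function starts.
 */
static uint64_t
func_start(uint64_t insn_addr, uint64_t offset)
{
	return insn_addr - offset;
}

/*
 * Sanity-check the numbers from the gdb output in this mail.
 */
static void
check_offsets(void)
{
	/* hex offsets from the trace equal the decimal ones gdb prints */
	assert(0x1e0 == 480);	/* pmap_remove+0x1e0 == <pmap_remove+480> */
	assert(0xf4 == 244);	/* pmap_release+0xf4 == <pmap_release+244> */

	/* the call instructions sit at these addresses */
	assert(func_start(0x169a400ULL, 0x1e0) == 0x169a220ULL);
	assert(func_start(0x1696a14ULL, 0xf4) == 0x1696920ULL);
}
```

Both offsets land on the call instruction itself. On sparc64 the mov that
follows each call sits in the delay slot, so it executes before control
reaches the callee.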

-- 
:wq Claudio
