24.01.2026 23:43, Klemens Nanni wrote:
> 20.01.2026 00:55, Klemens Nanni wrote:
>> 20.01.2026 00:29, Klemens Nanni wrote:
>>> 19.01.2026 22:55, Miod Vallat wrote:
>>>>> Nothing besides nd6 spam (about addresses of non-OpenBSD devices that 
>>>>> work just fine):
>>>>>
>>>>> ddb{0}> dmesg
>>>>> <7>nd6_resolve: xxxx:xxxx:xxxx:xxxx:397f:4b51:7bcb:c6ff: incorrect nd6 information
>>>>> ...
>>>>> Trap cause = 2 Frame 0x980000000fd97878
>>>>> Trap PC 0xffffffff8119dbdc RA 0xffffffff8119df2c fault 0x0
>>>>
>>>> This is a NULL pointer dereference happening at 0xffffffff8119dbdc. If
>>>> you x/i 0xffffffff8119dbdc this will show you where in cnmac_recv_mbuf
>>>> this happens, and then we can figure out the corresponding line in
>>>> if_cnmac.c.
>>>
>>> x/i gives the same address from my previous mail:
>>>
>>>>> Stopped at      cnmac_recv_mbuf+0x134:  ld      v1,32(t8)
>>>
>>> I tried this:
>>>
>>>     router# objdump -d /bsd | grep -m1 cnmac_recv_mbuf  
>>>     ffffffff8119daa8 <cnmac_recv_mbuf>:
>>>     router# addr2line -e/bsd $(python3 -c'print(hex(0xffffffff8119daa8+0x134))')
>>>     ??:0
>>>
>>> I then tried against a fresh COPTS=-O0 DEBUG=-g kernel, with the same result. egdb does not help either:
>>>
>>>     builder# egdb -q -batch -ex 'info line *cnmac_recv_mbuf+0x134' obj/bsd
>>>     No line number information available for address 0xffffffff814954e4 <cnmac_recv_mbuf+308>
>>
>>
>> tb@ pointed me at https://www.openbsd.org/ddb.html, but here on octeon
>> 'objdump -dlr obj/if_cnmac.o' does not yield line info and prints this:
>>
>> BFD: Dwarf Error: found dwarf version '0', this reader only handles version 2 information.
>>
>> With llvm-objdump (thanks jca@) I do get this:
>>
>> ; /sys/arch/octeon/dev/if_cnmac.c:1146
>>     3aec: df 03 00 20   ld      $3, 0x20($24)
>>     3af0: 14 43 00 30   bne     $2, $3, 0x3bb4 <cnmac_recv_mbuf+0x1fc>
>>     3af4: 00 00 00 00   nop <cnmac_match>
>>     3af8: 7c 83 38 01   dext    $3, $4, 0x0, 0x28 <cnmac_match+0x28>
>>
>>
>>    1139         for (i = 0; i < nbufs; i++) {
>>    1140                 addr = word3 & PIP_WQE_WORD3_ADDR;
>>    1141                 back = (word3 & PIP_WQE_WORD3_BACK) >> PIP_WQE_WORD3_BACK_SHIFT;
>>    1142                 pktbuf = (addr & ~(CACHELINESIZE - 1)) - back * CACHELINESIZE;
>>    1143                 pm = (struct mbuf **)PHYS_TO_XKPHYS(pktbuf, CCA_CACHED) - 1;
>>    1144                 m = *pm;
>>    1145                 *pm = NULL;
>>    1146                 if ((paddr_t)m->m_pkthdr.ph_cookie != pktbuf)
>>    1147                         panic("%s: packet pool is corrupted, mbuf cookie %p != "
>>    1148                             "pktbuf %p", __func__, m->m_pkthdr.ph_cookie,
>>    1149                             (void *)pktbuf);
>>    1150 
>>
>>
>> So m == NULL.
> 
> Hit another one today running
>       OpenBSD 7.8-current (GENERIC.MP) #124: Wed Jan 14 11:01:22 MST 2026

Same issue.

I then downgraded to 7.8-release and the system has been stable ever since;
uptime is 46 days and counting.

> 
> 
> Trap cause = 2 Frame 0x980000000fd83ac8
> Trap PC 0xffffffff8119c6c8 RA 0xffffffff8119c6c8 fault 0x0
> cnmac_send_queue_flush+0x90 (c000000000028f38,6b268e5959439c77,705a061c1943f580,0)  ra 0xffffffff8119badc sp 0x980000000fd83c20, sz 80
> cnmac_start+0x18c (c000000000028f38,fcd6d311cc225ea8,705a061c1943f580,0)  ra 0xffffffff8146fdf4 sp 0x980000000fd83c70, sz 96
> ifq_start_task+0x5c (c000000000028f38,fcd6d311cc225ea8,705a061c1943f580,0)  ra 0x0 sp 0x980000000fd83cd0, sz 0
> User-level: pid 17060
> stopped on non ddb fault
> Stopped at      cnmac_send_queue_flush+0x90:    ld      v1,32(v0)
> ddb{3}> ddb{3}> cnmac_send_queue_flush+0x90 (c000000000028f38,6b268e5959439c77,705a061c1943f580,0)  ra 0xffffffff8119badc sp 0x980000000fd83c20, sz 80
> cnmac_start+0x18c (c000000000028f38,fcd6d311cc225ea8,705a061c1943f580,0)  ra 0xffffffff8146fdf4 sp 0x980000000fd83c70, sz 96
> ifq_start_task+0x5c (c000000000028f38,fcd6d311cc225ea8,705a061c1943f580,0)  ra 0x0 sp 0x980000000fd83cd0, sz 0
> 
> 
> # llvm-objdump -dlr obj/if_cnmac.o | tee dump | grep -F '<cnmac_send_queue_flush>'
> 0000000000002548 <cnmac_send_queue_flush>:
> # printf %x\\n $(( 0x2548 + 0x90 ))
> 25d8
> # awk '/^;/ { where = $2 } /25d8:/ { print where; exit(0) }' dump 
> /sys/arch/octeon/dev/if_cnmac.c:597
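
The three steps above (find the symbol's start address, add the ddb offset,
look up the last "; file:line" annotation before that instruction) could be
wrapped in a small helper. This is only a sketch: the name sym2line is made
up for illustration, and it assumes the dump was saved from
'llvm-objdump -dlr' and has the same layout as the output quoted above.

```shell
#!/bin/sh
# Hypothetical helper, not part of any tool: map "symbol + offset" to the
# source line recorded in a saved llvm-objdump -dlr dump.
# usage: sym2line dumpfile symbol hex-offset
sym2line() {
	dump=$1 sym=$2 off=$3

	# Start address of the symbol, from a line such as
	# "0000000000002548 <cnmac_send_queue_flush>:".
	base=$(awk -v s="<$sym>:" '$2 == s { print $1; exit }' "$dump")
	[ -n "$base" ] || return 1

	# Add the offset; printf %x drops the leading zeros, which matches
	# how instruction addresses are printed in the dump.
	addr=$(printf '%x' $(( 0x$base + $off )))

	# Remember the most recent "; file:line" annotation and print it once
	# the instruction at the computed address is reached.
	awk -v a="$addr:" '
		/^;/ { where = $2 }
		$1 == a { print where; found = 1; exit }
		END { exit !found }
	' "$dump"
}
```

With the dump from above, 'sym2line dump cnmac_send_queue_flush 0x90' should
print the same if_cnmac.c:597 location; the function returns non-zero when
the symbol or the address is not found.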
> 
> 
>     536 void
>     537 cnmac_send_queue_flush(struct cnmac_softc *sc)
>     538 {
>     539         const int64_t sent_count = sc->sc_hard_done_cnt;
>     540         int i;
>     541 
>     542         OCTEON_ETH_KASSERT(sent_count <= 0);
>     543 
>     544         for (i = 0; i < 0 - sent_count; i++) {
>     545                 struct mbuf *m;
>     546                 uint64_t *gbuf;
>     547 
>     548                 cnmac_send_queue_del(sc, &m, &gbuf);
>     549 
>     550                 cn30xxfpa_buf_put_paddr(cnmac_fb_sg, XKPHYS_TO_PHYS(gbuf));
>     551 
>     552                 m_freem(m);
>     553         }
>     554 
>     555         cn30xxfau_op_add_8(&sc->sc_fau_done, i);
>     556 }
> 
>     ...
> 
>     588 void
>     589 cnmac_send_queue_del(struct cnmac_softc *sc, struct mbuf **rm,
>     590     uint64_t **rgbuf)
>     591 {
>     592         struct mbuf *m;
>     593         m = ml_dequeue(&sc->sc_sendq);
>     594         OCTEON_ETH_KASSERT(m != NULL);
>     595 
>     596         *rm = m;
>     597         *rgbuf = m->m_pkthdr.ph_cookie;
>     598 
>     599         if (m->m_ext.ext_free_fn != 0) {
>     600                 sc->sc_ext_callback_cnt--;
>     601                 OCTEON_ETH_KASSERT(sc->sc_ext_callback_cnt >= 0);
>     602         }
>     603 }
> 
> 
> Now running a newer snap, fwiw:
>       OpenBSD 7.8-current (GENERIC.MP) #129: Thu Jan 22 09:49:17 MST 2026
> 
