On Thu, Mar 24, 2022 at 07:41:44AM +0100, Anton Lindqvist wrote:
> >Synopsis:    bse: null dereference in genet_rxintr()
> >Category:    arm64
> >Environment:
>       System      : OpenBSD 7.1
>       Details     : OpenBSD 7.1-beta (GENERIC.MP) #1594: Mon Mar 21 06:55:12 
> MDT 2022
>                       
> dera...@arm64.openbsd.org:/usr/src/sys/arch/arm64/compile/GENERIC.MP
> 
>       Architecture: OpenBSD.arm64
>       Machine     : arm64
> >Description:
> 
> Booting my rpi4 often but not always causes a panic while rc(8) tries to start
> the bse network interface:
> 
> panic: attempt to access user address 0x38 from EL1
> Stopped at      panic+0x160:    cmp     w21, #0x0
>     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> *     0      0      0     0x10000      0x200    0K swapper
> db_enter() at panic+0x15c
> panic() at do_el1h_sync+0x1f8
> do_el1h_sync() at handle_el1h_sync+0x6c
> handle_el1h_sync() at genet_rxintr+0x120
> genet_rxintr() at genet_intr+0x74
> genet_intr() at ampintc_irq_handler+0x14c
> ampintc_irq_handler() at arm_cpu_irq+0x30
> arm_cpu_irq() at handle_el1h_irq+0x6c
> handle_el1h_irq() at ampintc_splx+0x80
> ampintc_splx() at genet_ioctl+0x158
> genet_ioctl() at ifioctl+0x308
> ifioctl() at nfs_boot_init+0xc0
> nfs_boot_init() at nfs_mountroot+0x3c
> nfs_mountroot() at main+0x464
> main() at virtdone+0x70
> 
> >Fix:
> 
> The mbuf associated with the current index is NULL. I noticed that the NetBSD
> driver allocates mbufs for each ring entry in genet_setup_dma(). But even with
> that in place the same panic still occurs. Enabling GENET_DEBUG shows that the
> total is quite high:
> 
> RX pidx=0000ca07 total=51463                                                  
>                  
> 
> Since it's greater than GENET_DMA_DESC_COUNT (=256) the null dereference will
> still happen after doing more than 256 iterations in genet_rxintr() since we
> will start accessing mbufs cleared by the previous iteration.
> 
> Here's a diff with what I've tried so far. The KASSERT() is just capturing the
> problem at an earlier stage. Any pointers would be much appreciated.

Further digging reveals that writes to GENET_RX_DMA_PROD_INDEX are
ignored by the hardware. That's why I ended up with a large amount of
mbufs available in genet_rxintr() since the software and hardware state
was out of sync. Honoring any existing value makes the problem go away
and matches what u-boot[1] does as well.

The current RX cidx/pidx defaults in genet_fill_rx_ring() where probably
carefully selected as they ensure that the rx ring is filled with at
least the configured low watermark number of mbufs. However, instead of
being forced to ensure a pidx - cidx delta above 0 on the first
invocations of genet_fill_rx_ring(), RX_DESC_COUNT could simply be
passed as the max argument to if_rxr_get() which will clamp the value
anyway.

Also, I've seen up to 8 mbufs being available per rx interrupt which is
odd as only a less amount of rx ring entries are actually populated. Not
sure if the driver is missing some interrupt threshold configuration.
Increasing the rx ring low watermark to 8 "solved" it for now.
Otherwise, the same null dereference occurs while trying to access empty
mbuf ring entries.

Worth mentioning is that the NetBSD driver does not suffer from the same
problem as they keep all rx ring entries populated all the time.

Looking for feedback and OKs at this point.

[1] 
https://github.com/u-boot/u-boot/blob/a94ab561e2f49a80d8579930e840b810ab1a1330/drivers/net/bcmgenet.c#L404

diff --git sys/dev/ic/bcmgenet.c sys/dev/ic/bcmgenet.c
index 923df40bdfc..6070f9d202e 100644
--- sys/dev/ic/bcmgenet.c
+++ sys/dev/ic/bcmgenet.c
@@ -311,16 +311,13 @@ void
 genet_fill_rx_ring(struct genet_softc *sc, int qid)
 {
        struct mbuf *m;
-       uint32_t cidx, index, total;
+       uint32_t cidx, index;
        u_int slots;
        int error;
 
        cidx = sc->sc_rx.cidx;
-       total = (sc->sc_rx.pidx - cidx) & 0xffff;
-       KASSERT(total <= RX_DESC_COUNT);
-
        index = sc->sc_rx.cidx & (RX_DESC_COUNT - 1);
-       for (slots = if_rxr_get(&sc->sc_rx_ring, total);
+       for (slots = if_rxr_get(&sc->sc_rx_ring, RX_DESC_COUNT);
             slots > 0; slots--) {
                if ((m = genet_alloc_mbufcl(sc)) == NULL) {
                        printf("%s: cannot allocate RX mbuf\n",
@@ -517,9 +514,9 @@ genet_init_rings(struct genet_softc *sc, int qid)
 
        /* RX ring */
 
-       sc->sc_rx.next = 0;
-       sc->sc_rx.cidx = 0;
-       sc->sc_rx.pidx = RX_DESC_COUNT;
+       sc->sc_rx.cidx = RD4(sc, GENET_RX_DMA_CONS_INDEX(qid)) & 0xffff;
+       sc->sc_rx.pidx = RD4(sc, GENET_RX_DMA_PROD_INDEX(qid)) & 0xffff;
+       sc->sc_rx.next = sc->sc_rx.cidx & (RX_DESC_COUNT - 1);
 
        WR4(sc, GENET_RX_SCB_BURST_SIZE, 0x08);
 
@@ -543,7 +540,7 @@ genet_init_rings(struct genet_softc *sc, int qid)
 
        WR4(sc, GENET_RX_DMA_RING_CFG, __BIT(qid));     /* enable */
 
-       if_rxr_init(&sc->sc_rx_ring, 2, RX_DESC_COUNT);
+       if_rxr_init(&sc->sc_rx_ring, 8, RX_DESC_COUNT);
        genet_fill_rx_ring(sc, qid);
 
        /* Enable receive DMA */

Reply via email to