On Tue, Apr 19, 2022 at 06:07:47PM +0200, Anton Lindqvist wrote:
> On Tue, Apr 19, 2022 at 07:32:36AM +0200, Anton Lindqvist wrote:
> > On Thu, Mar 24, 2022 at 07:41:44AM +0100, Anton Lindqvist wrote:
> > > >Synopsis:        bse: null dereference in genet_rxintr()
> > > >Category:        arm64
> > > >Environment:
> > >   System      : OpenBSD 7.1
> > >   Details     : OpenBSD 7.1-beta (GENERIC.MP) #1594: Mon Mar 21 06:55:12 
> > > MDT 2022
> > >                   
> > > dera...@arm64.openbsd.org:/usr/src/sys/arch/arm64/compile/GENERIC.MP
> > > 
> > >   Architecture: OpenBSD.arm64
> > >   Machine     : arm64
> > > >Description:
> > > 
> > > Booting my rpi4 often but not always causes a panic while rc(8) tries to 
> > > start
> > > the bse network interface:
> > > 
> > > panic: attempt to access user address 0x38 from EL1
> > > Stopped at      panic+0x160:    cmp     w21, #0x0
> > >     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> > > *     0      0      0     0x10000      0x200    0K swapper
> > > db_enter() at panic+0x15c
> > > panic() at do_el1h_sync+0x1f8
> > > do_el1h_sync() at handle_el1h_sync+0x6c
> > > handle_el1h_sync() at genet_rxintr+0x120
> > > genet_rxintr() at genet_intr+0x74
> > > genet_intr() at ampintc_irq_handler+0x14c
> > > ampintc_irq_handler() at arm_cpu_irq+0x30
> > > arm_cpu_irq() at handle_el1h_irq+0x6c
> > > handle_el1h_irq() at ampintc_splx+0x80
> > > ampintc_splx() at genet_ioctl+0x158
> > > genet_ioctl() at ifioctl+0x308
> > > ifioctl() at nfs_boot_init+0xc0
> > > nfs_boot_init() at nfs_mountroot+0x3c
> > > nfs_mountroot() at main+0x464
> > > main() at virtdone+0x70
> > > 
> > > >Fix:
> > > 
> > > The mbuf associated with the current index is NULL. I noticed that the 
> > > NetBSD
> > > driver allocates mbufs for each ring entry in genet_setup_dma(). But even 
> > > with
> > > that in place the same panic still occurs. Enabling GENET_DEBUG shows 
> > > that the
> > > total is quite high:
> > > 
> > > RX pidx=0000ca07 total=51463                                              
> > >                      
> > > 
> > > Since it's greater than GENET_DMA_DESC_COUNT (=256) the null dereference 
> > > will
> > > still happen after doing more than 256 iterations in genet_rxintr() since 
> > > we
> > > will start accessing mbufs cleared by the previous iteration.
> > > 
> > > Here's a diff with what I've tried so far. The KASSERT() is just 
> > > capturing the
> > > problem at an earlier stage. Any pointers would be much appreciated.
> > 
> > Further digging reveals that writes to GENET_RX_DMA_PROD_INDEX are
> > ignored by the hardware. That's why I ended up with a large amount of
> > mbufs available in genet_rxintr() since the software and hardware state
> > was out of sync. Honoring any existing value makes the problem go away
> > and matches what u-boot[1] does as well.
> > 
> > The current RX cidx/pidx defaults in genet_fill_rx_ring() where probably
> > carefully selected as they ensure that the rx ring is filled with at
> > least the configured low watermark number of mbufs. However, instead of
> > being forced to ensure a pidx - cidx delta above 0 on the first
> > invocations of genet_fill_rx_ring(), RX_DESC_COUNT could simply be
> > passed as the max argument to if_rxr_get() which will clamp the value
> > anyway.
> > 
> > Also, I've seen up to 8 mbufs being available per rx interrupt which is
> > odd as only a less amount of rx ring entries are actually populated. Not
> > sure if the driver is missing some interrupt threshold configuration.
> > Increasing the rx ring low watermark to 8 "solved" it for now.
> > Otherwise, the same null dereference occurs while trying to access empty
> > mbuf ring entries.
> > 
> > Worth mentioning is that the NetBSD driver does not suffer from the same
> > problem as they keep all rx ring entries populated all the time.
> > 
> > Looking for feedback and OKs at this point.
> > 
> > [1] 
> > https://github.com/u-boot/u-boot/blob/a94ab561e2f49a80d8579930e840b810ab1a1330/drivers/net/bcmgenet.c#L404
> 
> While putting more pressure on the network I'm seeing up to 100 mbufs
> being available per rx interrupt. Could it simply be explained by the
> hardware operating under the assumption that all ring entries are
> available? Even if instructing the hardware about the actual amount of
> available ring entries would require the driver to keep it in sync
> whenever the if_rxr_*() implementation decides to adjust the ring.
> 
> Moving to if_rxr_init(RX_DESC_COUNT, RX_DESC_COUNT) essentially making
> all 256 ring entries always available makes the driver stable.

Here's the diff I've been running lately which is deemed to be stable.
Changes since last time:

* Ensure all 256 ring entries are always populated. Usage of the
  if_rxr_*() API doesn't do much at this point.

* Update the consumer index only after a ring entry has been consumed
  and not after filling in blank ones. This should have the same effect
  as the previous logic in genet_fill_rx_ring() but it makes our driver
  look like the one in NetBSD, FreeBSD and Linux.

diff --git sys/dev/ic/bcmgenet.c sys/dev/ic/bcmgenet.c
index 923df40bdfc..e31c0de6d95 100644
--- sys/dev/ic/bcmgenet.c
+++ sys/dev/ic/bcmgenet.c
@@ -311,16 +311,13 @@ void
 genet_fill_rx_ring(struct genet_softc *sc, int qid)
 {
        struct mbuf *m;
-       uint32_t cidx, index, total;
-       u_int slots;
+       uint32_t index;
+       u_int inuse, slots;
        int error;
 
-       cidx = sc->sc_rx.cidx;
-       total = (sc->sc_rx.pidx - cidx) & 0xffff;
-       KASSERT(total <= RX_DESC_COUNT);
-
-       index = sc->sc_rx.cidx & (RX_DESC_COUNT - 1);
-       for (slots = if_rxr_get(&sc->sc_rx_ring, total);
+       inuse = if_rxr_inuse(&sc->sc_rx_ring);
+       index = (sc->sc_rx.cidx + inuse) & (RX_DESC_COUNT - 1);
+       for (slots = if_rxr_get(&sc->sc_rx_ring, RX_DESC_COUNT);
             slots > 0; slots--) {
                if ((m = genet_alloc_mbufcl(sc)) == NULL) {
                        printf("%s: cannot allocate RX mbuf\n",
@@ -335,16 +332,10 @@ genet_fill_rx_ring(struct genet_softc *sc, int qid)
                        break;
                }
 
-               cidx = (cidx + 1) & 0xffff;
                index = RX_NEXT(index);
        }
        if_rxr_put(&sc->sc_rx_ring, slots);
 
-       if (sc->sc_rx.cidx != cidx) {
-               sc->sc_rx.cidx = cidx;
-               WR4(sc, GENET_RX_DMA_CONS_INDEX(qid), sc->sc_rx.cidx);
-       }
-
        if (if_rxr_inuse(&sc->sc_rx_ring) == 0)
                timeout_add(&sc->sc_rxto, 1);
 }
@@ -517,9 +508,9 @@ genet_init_rings(struct genet_softc *sc, int qid)
 
        /* RX ring */
 
-       sc->sc_rx.next = 0;
-       sc->sc_rx.cidx = 0;
-       sc->sc_rx.pidx = RX_DESC_COUNT;
+       sc->sc_rx.cidx = RD4(sc, GENET_RX_DMA_CONS_INDEX(qid)) & 0xffff;
+       sc->sc_rx.pidx = RD4(sc, GENET_RX_DMA_PROD_INDEX(qid)) & 0xffff;
+       sc->sc_rx.next = sc->sc_rx.cidx & (RX_DESC_COUNT - 1);
 
        WR4(sc, GENET_RX_SCB_BURST_SIZE, 0x08);
 
@@ -543,7 +534,7 @@ genet_init_rings(struct genet_softc *sc, int qid)
 
        WR4(sc, GENET_RX_DMA_RING_CFG, __BIT(qid));     /* enable */
 
-       if_rxr_init(&sc->sc_rx_ring, 2, RX_DESC_COUNT);
+       if_rxr_init(&sc->sc_rx_ring, RX_DESC_COUNT, RX_DESC_COUNT);
        genet_fill_rx_ring(sc, qid);
 
        /* Enable receive DMA */
@@ -686,7 +677,7 @@ genet_rxintr(struct genet_softc *sc, int qid)
        uint32_t status, pidx, total;
 
        pidx = RD4(sc, GENET_RX_DMA_PROD_INDEX(qid)) & 0xffff;
-       total = (pidx - sc->sc_rx.pidx) & 0xffff;
+       total = (pidx - sc->sc_rx.cidx) & 0xffff;
 
        DPRINTF("RX pidx=%08x total=%d\n", pidx, total);
 
@@ -723,6 +714,9 @@ genet_rxintr(struct genet_softc *sc, int qid)
                if_rxr_put(&sc->sc_rx_ring, 1);
 
                index = RX_NEXT(index);
+
+               sc->sc_rx.cidx = (sc->sc_rx.cidx + 1) & 0xffff;
+               WR4(sc, GENET_RX_DMA_CONS_INDEX(qid), sc->sc_rx.cidx);
        }
 
        if (sc->sc_rx.pidx != pidx) {

Reply via email to