Hi, The following diff fixes a deadlock in em(4) when system runs out of mbuf(9)s.
Tested with current on amd64 with: em0 at pci0 dev 25 function 0 "Intel I217-LM" rev 0x05: msi, mac 0x1e phy 0xb, address d0:50:99:c1:67:94 When the system runs out of mbufs, the receive ring of the em-NIC also run out of mbufs. If the driver puts an mbuf from the NICs receive ring with em_rxeof, it calls em_rxrefill to refill the ring. If its not possible to refill the ring and the receive ring is empty, it sets a timer to retry refilling later till new mbufs are available. In my case, the function em_rxeof is unable to pull the last one or two mbufs from the receive ring, because the descriptor status hasn't set the "descriptor done" flag (E1000_RXD_STAT_DD). Thus, the ring has one or two remaining mbufs, and the timeout to put new ones in is never set. When new mbufs are available later, the em driver stays in that situation. The diff always sets the timeout to put new mbufs in the receive ring, when it has lesser mbufs then the "low water mark". A side effect is: when the mbufs in use (alive) are below then the "low water mark", the "current water mark" raises up to value of the "high water mark". When new mbufs are available the number of mbufs in use also raises to that value. Because, the "current water mark" just sinks during net livelocks. I'm sure this path is not the best solution. Maybe someone else has a better idea to avoid that situation?! Bye, Jan Index: dev/pci/if_em.c =================================================================== RCS file: /cvs/src/sys/dev/pci/if_em.c,v retrieving revision 1.349 diff -u -p -r1.349 if_em.c --- dev/pci/if_em.c 24 Mar 2020 09:30:06 -0000 1.349 +++ dev/pci/if_em.c 21 Apr 2020 12:02:28 -0000 @@ -2809,7 +2809,8 @@ em_rxrefill(void *arg) if (em_rxfill(que)) E1000_WRITE_REG(&sc->hw, RDT(que->me), que->rx.sc_rx_desc_head); - else if (if_rxr_inuse(&que->rx.sc_rx_ring) == 0) + else if (if_rxr_inuse(&que->rx.sc_rx_ring) < + if_rxr_lwm(&que->rx.sc_rx_ring)) timeout_add(&que->rx_refill, 1); } Index: net/if_var.h =================================================================== RCS file: /cvs/src/sys/net/if_var.h,v retrieving revision 1.104 diff -u -p -r1.104 if_var.h --- net/if_var.h 12 Apr 2020 07:02:43 -0000 1.104 +++ net/if_var.h 21 Apr 2020 11:58:17 -0000 @@ -389,6 +389,7 @@ u_int if_rxr_get(struct if_rxring *, u_i #define if_rxr_put(_r, _c) do { (_r)->rxr_alive -= (_c); } while (0) #define if_rxr_inuse(_r) ((_r)->rxr_alive) #define if_rxr_cwm(_r) ((_r)->rxr_cwm) +#define if_rxr_lwm(_r) ((_r)->rxr_lwm) int if_rxr_info_ioctl(struct if_rxrinfo *, u_int, struct if_rxring_info *); int if_rxr_ioctl(struct if_rxrinfo *, const char *, u_int,