When an AF_XDP zero-copy application exits while an XDP program remains attached, igb can permanently stall a TX queue associated with the AF_XDP socket. The interface stops forwarding traffic and typically requires a driver reload to recover.
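One way to picture the stall: NAPI keeps re-polling a queue for as long as its poll routine reports that it consumed the whole budget. The user-space model below is only a sketch of that contract and is not igb code. `struct model_ring`, `model_clean_rx_zc()`, `model_poll_until_complete()` and the `MAX_POLLS` cap are hypothetical names introduced for illustration. It assumes a poll routine that can no longer make progress yet still claims the full budget, and shows how an early-out on a "down" flag lets polling finish.

```c
#include <stdbool.h>

#define BUDGET    64
#define MAX_POLLS 1000 /* cap so the model terminates even when stalled */

/* Hypothetical stand-in for per-ring driver state. */
struct model_ring {
	bool down;      /* models a "device is going down" flag */
	bool pool_gone; /* models the XSK pool having been destroyed */
	bool guard;     /* whether the poll routine checks `down` first */
};

/* Returns the work done; returning `budget` asks to be polled again. */
static int model_clean_rx_zc(const struct model_ring *ring, int budget)
{
	if (ring->guard && ring->down)
		return 0;      /* early-out while the device is going down */
	if (ring->pool_gone)
		return budget; /* no progress possible, full budget claimed */
	return 0;              /* idle ring */
}

/*
 * Models the polling loop: keep polling while the full budget is reported.
 * Returns the number of polls needed to complete, or -1 if polling never
 * completed within MAX_POLLS (the modelled stall).
 */
static int model_poll_until_complete(const struct model_ring *ring)
{
	int polls = 0;

	while (polls < MAX_POLLS) {
		polls++;
		if (model_clean_rx_zc(ring, BUDGET) < BUDGET)
			return polls;
	}
	return -1;
}
```

With `down` and `pool_gone` both set, `model_poll_until_complete()` returns -1 when `guard` is false and 1 when `guard` is true. In the real driver the loop is unbounded, which is why anything waiting for the poll to finish blocks.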

Reproducer:

  1. Attach an XDP program to igb
  2. Run an AF_XDP zero-copy application
  3. kill -9 the application

The TX watchdog eventually fires and the interface becomes
unresponsive. Reproduced on Intel I210 with Linux 6.17.

igb_clean_rx_irq_zc() lacks a __IGB_DOWN guard. When the AF_XDP
process exits, the XSK pool is destroyed, but NAPI continues polling.
The function then repeatedly returns the full budget, which prevents
NAPI from completing via napi_complete_done(). As a result, igb_down()
blocks in napi_synchronize() and TX completions stop being processed,
eventually triggering the TX watchdog.

Patch 1 adds a __IGB_DOWN guard to igb_clean_rx_irq_zc() to break the
infinite NAPI poll loop.

Patch 2 prevents igb_tx_timeout() from scheduling reset_task during
XDP transitions when the device is shutting down.

Patch 3 adds synchronization in igb_xdp_setup() to ensure that pending
ndo_xsk_wakeup() calls complete before the teardown continues, and
refreshes trans_start after igb_open() to prevent false TX timeouts.

igc handles a similar stale trans_start situation via
txq_trans_cond_update() (commit 86ea56c5b0c7). Patch 3 adds equivalent
protection for igb during XDP transitions.

Tested on Intel I210:
  - AF_XDP ZC app exit with XDP attached
  - XDP detach while AF_XDP running
  - repeated XDP attach/detach cycles

Alex Dvoretsky (3):
  igb: check __IGB_DOWN in igb_clean_rx_irq_zc()
  igb: skip reset in igb_tx_timeout() during XDP transition
  igb: add XDP transition guards in igb_xdp_setup()

 drivers/net/ethernet/intel/igb/igb_main.c | 15 +++++++++++++++
 drivers/net/ethernet/intel/igb/igb_xsk.c  |  3 +++
 2 files changed, 18 insertions(+)

-- 
2.51.0
