We are debugging an issue with netconsole and ixgbe where ksoftirqd takes
100% of a core. It happens with both current net and net-next.
To reproduce the issue:

1. Set up a server with ixgbe and netconsole. We bind each queue to a
   separate core via smp_affinity.

2. Start a simple netperf job from a client, e.g.:

       ./super_netperf 201 -P 0 -t TCP_RR -p 8888 -H <SERVER> -l 7200 \
           -- -r 300,300 -o -s 1M,1M -S 1M,1M

3. On the server, write to /dev/kmsg in a loop (to trigger netconsole):

       for x in {1..7200} ; do echo aa >> /dev/kmsg ; sleep 1 ; done

4. On the server, monitor ksoftirqd in top.

Within a few minutes, top shows one ksoftirqd taking 100% of its core for
many seconds in a row. While ksoftirqd is pegged at 100%, the driver keeps
hitting the "clean_complete = false" path below, so this napi stays in
polling mode:

	ixgbe_for_each_ring(ring, q_vector->rx) {
		int cleaned = ixgbe_clean_rx_irq(q_vector, ring,
						 per_ring_budget);
		work_done += cleaned;
		if (cleaned >= per_ring_budget)
			clean_complete = false;
	}

	/* If all work not completed, return budget and keep polling */
	if (!clean_complete)
		return budget;

We did not see this issue on a 4.6-based kernel. We are still debugging,
but we would like to check whether there is a known solution for it. Any
comments and suggestions are highly appreciated.

Best,
Song