On Thu, Jul 17, 2025 at 05:21:44PM -0700, Joshua Hay wrote:
> This series fixes a stability issue in the flow scheduling Tx send/clean
> path that results in a Tx timeout.
>
> The existing guardrails in the Tx path were not sufficient to prevent
> the driver from reusing completion tags that were still in flight (held
> by the HW). This collision would cause the driver to erroneously clean
> the wrong packet thus leaving the descriptor ring in a bad state.
>
> The main point of this refactor is to replace the flow scheduling buffer
> ring with a large pool/array of buffers. The completion tag then simply
> is the index into this array. The driver tracks the free tags and pulls
> the next free one from a refillq. The cleaning routines simply use the
> completion tag from the completion descriptor to index into the array to
> quickly find the buffers to clean.
>
> All of the code to support the refactor is added first to ensure traffic
> still passes with each patch. The final patch then removes all of the
> obsolete stashing code.
>
> ---
> v2:
> - Add a new patch "idpf: simplify and fix splitq Tx packet rollback
>   error path" that fixes a bug in the error path. It also sets up
>   changes in patch 4 that are necessary to prevent a crash when a packet
>   rollback occurs using the buffer pool.
>
> v1:
> https://lore.kernel.org/intel-wired-lan/[email protected]/T/#maf9f464c598951ee860e5dd24ef8a451a488c5a0
>
> Joshua Hay (6):
>   idpf: add support for Tx refillqs in flow scheduling mode
>   idpf: improve when to set RE bit logic
>   idpf: simplify and fix splitq Tx packet rollback error path
>   idpf: replace flow scheduling buffer ring with buffer pool
>   idpf: stop Tx if there are insufficient buffer resources
>   idpf: remove obsolete stashing code
>
>  .../ethernet/intel/idpf/idpf_singleq_txrx.c   |  61 +-
>  drivers/net/ethernet/intel/idpf/idpf_txrx.c   | 723 +++++++-----------
>  drivers/net/ethernet/intel/idpf/idpf_txrx.h   |  87 +--
>  3 files changed, 356 insertions(+), 515 deletions(-)

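For anyone following the thread, the tag scheme described in the quoted
cover letter can be pictured with a small user-space sketch. This is not
the idpf code: struct tag_pool, tag_pool_get(), tag_pool_put() and
POOL_SIZE are hypothetical names made up for illustration, and the
driver's real structures, sizes and locking are not shown in this message.
The sketch only assumes what the cover letter states: the completion tag
is an index into an array of buffers, and free tags are pulled from a
refill queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 8u /* hypothetical size, chosen only for the example */

struct tx_buf {
	const void *pkt; /* NULL while the slot is free */
};

struct tag_pool {
	struct tx_buf bufs[POOL_SIZE];  /* indexed directly by completion tag */
	uint16_t free_tags[POOL_SIZE];  /* refill queue: ring of free tags */
	uint32_t head;                  /* next free tag to hand out */
	uint32_t tail;                  /* where the next returned tag goes */
	uint32_t nfree;                 /* number of tags currently free */
};

static void tag_pool_init(struct tag_pool *p)
{
	for (uint32_t i = 0; i < POOL_SIZE; i++) {
		p->bufs[i].pkt = NULL;
		p->free_tags[i] = (uint16_t)i;
	}
	p->head = 0;
	p->tail = 0;
	p->nfree = POOL_SIZE;
}

/* Send path: take the next free tag, or fail so the caller can stop Tx. */
static bool tag_pool_get(struct tag_pool *p, const void *pkt, uint16_t *tag)
{
	assert(pkt != NULL);
	if (p->nfree == 0)
		return false;

	*tag = p->free_tags[p->head];
	p->head = (p->head + 1) % POOL_SIZE;
	p->nfree--;
	p->bufs[*tag].pkt = pkt;
	return true;
}

/*
 * Clean path: the completion tag indexes straight into the pool.
 * Returns the packet that was in flight, or NULL if the tag is out of
 * range or not currently in flight. The tag only becomes reusable here.
 */
static const void *tag_pool_put(struct tag_pool *p, uint16_t tag)
{
	const void *pkt;

	if (tag >= POOL_SIZE || p->bufs[tag].pkt == NULL)
		return NULL;

	pkt = p->bufs[tag].pkt;
	p->bufs[tag].pkt = NULL;
	p->free_tags[p->tail] = tag;
	p->tail = (p->tail + 1) % POOL_SIZE;
	p->nfree++;
	return pkt;
}
```

In this sketch a tag cannot be handed out again until tag_pool_put() has
seen its completion, which is the property the cover letter says the old
guardrails failed to guarantee. Completions may arrive in any order, since
lookup is a plain array index rather than a search.
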
Hi Joshua, all,

Perhaps it is not followed much anymore, but at least according to [1]
patches for stable should not be more than 100 lines, with context.

This patch-set is an order of magnitude larger.
Can something be done to create a more minimal fix?

[1] https://docs.kernel.org/process/stable-kernel-rules.html#stable-kernel-rules
