When a WQE completes with an error (e.g. SA_DROP), the hardware
generates one CQE per WQE regardless of the suppression flag. The
previous code honored the suppress_tx_cqe flag unconditionally, so it
skipped reading those error CQEs and misaligned the CQ consumer index.

This misalignment causes subsequent completions to be misinterpreted:
valid CQEs are read at wrong offsets, leading to spurious error
counts, NULL packet frees, and potential use-after-free of mbufs
that were already completed.

Check the CQE type before honoring suppression: only skip CQE reading
when the completion is CQE_TX_OKAY.

Fixes: cce2c9df44 ("net/mana: suppress Tx CQE generation whenever possible")
Cc: [email protected]

Signed-off-by: Long Li <[email protected]>
---
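Note for reviewers: the misalignment described above can be reproduced
with a small standalone model. This is only a sketch, not driver code.
The names tx_desc, walk_result, walk_completions, CQE_OKAY and CQE_ERROR
are hypothetical stand-ins, and the model assumes one descriptor per WQE
and a batch of CQEs that has already been polled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's CQE types and Tx descriptor. */
enum cqe_type { CQE_OKAY, CQE_ERROR };

struct tx_desc {
	bool suppress_tx_cqe;	/* WQE was posted with CQE suppression */
};

struct walk_result {
	size_t descs_retired;	/* descriptors treated as completed */
	size_t cqes_consumed;	/* CQEs the loop advanced past */
};

/*
 * Walk the descriptor ring against nb polled CQEs.
 * check_type == false models the old behavior (flag honored
 * unconditionally); check_type == true models the fix (flag honored
 * only when the current CQE is an OKAY completion).
 */
static struct walk_result
walk_completions(const struct tx_desc *ring, size_t ring_len,
		 const enum cqe_type *cqes, size_t nb, bool check_type)
{
	struct walk_result r = { 0, 0 };

	assert(ring != NULL && cqes != NULL);

	while (r.cqes_consumed < nb && r.descs_retired < ring_len) {
		const struct tx_desc *desc = &ring[r.descs_retired++];
		bool skip = desc->suppress_tx_cqe;

		if (check_type)
			skip = skip && cqes[r.cqes_consumed] == CQE_OKAY;
		if (skip)
			continue;	/* same CQE covers the next descriptor */

		r.cqes_consumed++;
	}
	return r;
}
```

With a ring of eight descriptors where only every fourth one requests
a CQE, and four error CQEs polled for the first four WQEs, the old
logic retires all eight descriptors while consuming two CQEs. With the
type check it retires four descriptors and consumes four CQEs. For
all-OKAY completions both variants behave the same.
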
 drivers/net/mana/tx.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mana/tx.c b/drivers/net/mana/tx.c
index 40931ac027..e5ab566e8a 100644
--- a/drivers/net/mana/tx.c
+++ b/drivers/net/mana/tx.c
@@ -228,9 +228,11 @@ mana_tx_burst(void *dpdk_txq, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
                txq->gdma_sq.tail += desc->wqe_size_in_bu;
 
                /* If TX CQE suppression is used, don't read more CQE but move
-                * on to the next packet
+                * on to the next packet. On error CQEs, HW generates one CQE
+                * per WQE regardless of suppression, so always advance.
                 */
-               if (desc->suppress_tx_cqe)
+               if (desc->suppress_tx_cqe &&
+                   oob->cqe_hdr.cqe_type == CQE_TX_OKAY)
                        continue;
 
                i++;
-- 
2.43.0
