From: Joshua Hay <[email protected]>
Date: Wed, 25 Jun 2025 09:11:52 -0700

> This is the start of a 5 patch series intended to fix a stability issue
> in the flow scheduling Tx send/clean path that results in a Tx timeout.

No need to mention "series", "start", or "patch" in commit messages.

> 
> In certain production environments, it is possible for completion tags
> to collide, meaning N packets with the same completion tag are in flight
> at the same time. In this environment, any given Tx queue is effectively
> used to send both slower traffic and higher throughput traffic
> simultaneously. This is the result of a customer's specific
> configuration in the device pipeline, the details of which Intel cannot
> provide. This configuration results in a small number of out-of-order
> completions, i.e., a small number of packets in flight. The existing
> guardrails in the driver only protect against a large number of packets
> in flight. The slower flow completions are delayed which causes the
> out-of-order completions. Meanwhile, the fast flow exhausts the pool of
> unique tags and starts reusing tags. The next packet in the fast flow
> uses the same tag for a packet that is still in flight from the slower
> flow. The driver has no idea which packet it should clean when it
> processes the completion with that tag, but it will look for the packet
> on the buffer ring before the hash table. If the slower flow packet
> completion is processed first, it will end up cleaning the fast flow
> packet on the ring prematurely. This leaves the descriptor ring in a bad
> state resulting in a Tx timeout.
> 
> This series refactors the Tx buffer management by replacing the stashing

Same.

> mechanisms and the tag generation with a large pool/array of unique
> tags. The completion tags are now simply used to index into the pool of
> Tx buffers. This implicitly prevents any tag from being reused while
> it's in flight.
> 
> First, we need a new mechanism for the send path to know what tag to use
> next. The driver will allocate and initialize a refillq for each TxQ
> with all of the possible free tag values. During send, the driver grabs
> the next free tag from the refillq at next_to_clean. While cleaning
> the packet, the clean routine posts the tag back to the refillq's
> next_to_use to indicate that it is now free to use.
> 
> This mechanism works exactly the same way as the existing Rx refill
> queues, which post the cleaned buffer IDs back to the buffer queue to be
> reposted to HW. Since we're using the refillqs for both Rx and Tx now,
> genericize some of the existing refillq support.
> 
> Note: the refillqs will not be used yet. This is only demonstrating how
> they will be used to pass free tags back to the send path.

[...]

> @@ -267,6 +270,31 @@ static int idpf_tx_desc_alloc(const struct idpf_vport *vport,
>       tx_q->next_to_clean = 0;
>       idpf_queue_set(GEN_CHK, tx_q);
>  
> +     if (idpf_queue_has(FLOW_SCH_EN, tx_q)) {

        if (!idpf_queue_has(FLOW_SCH_EN, tx_q))
                return 0;

> +             struct idpf_sw_queue *refillq = tx_q->refillq;
> +
> +             refillq->desc_count = tx_q->desc_count;
> +
> +             refillq->ring = kcalloc(refillq->desc_count, sizeof(u32),
> +                                     GFP_KERNEL);
> +             if (!refillq->ring) {
> +                     err = -ENOMEM;
> +                     goto err_alloc;
> +             }
> +
> +             for (u32 i = 0; i < refillq->desc_count; i++)
> +                     refillq->ring[i] =
> +                             FIELD_PREP(IDPF_RFL_BI_BUFID_M, i) |
> +                             FIELD_PREP(IDPF_RFL_BI_GEN_M,
> +                                        idpf_queue_has(GEN_CHK, refillq));
> +
> +             /*
> +              * Go ahead and flip the GEN bit since this counts as filling
> +              * up the ring, i.e. we already ring wrapped.
> +              */
> +             idpf_queue_change(GEN_CHK, refillq);
> +     }
> +
>       return 0;
>  
>  err_alloc:

Thanks,
Olek
