On 09/11, Zhang, Qi Z wrote:
>
>
>> -----Original Message-----
>> From: Ye, Xiaolong
>> Sent: Tuesday, September 10, 2019 11:09 PM
>> To: Zhang, Qi Z <[email protected]>
>> Cc: Yigit, Ferruh <[email protected]>; Loftus, Ciara
>> <[email protected]>; [email protected]; [email protected]; Karlsson,
>> Magnus <[email protected]>
>> Subject: Re: [PATCH] net/af_xdp: fix Tx halt when no recv packets
>>
>> On 09/10, Zhang, Qi Z wrote:
>> >
>> >
>> >> -----Original Message-----
>> >> From: Ye, Xiaolong
>> >> Sent: Tuesday, September 10, 2019 9:54 PM
>> >> To: Zhang, Qi Z <[email protected]>
>> >> Cc: Yigit, Ferruh <[email protected]>; Loftus, Ciara
>> >> <[email protected]>; [email protected]; [email protected]; Karlsson,
>> >> Magnus <[email protected]>
>> >> Subject: Re: [PATCH] net/af_xdp: fix Tx halt when no recv packets
>> >>
>> >> On 09/10, Zhang, Qi Z wrote:
>> >> >
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: Ye, Xiaolong
>> >> >> Sent: Tuesday, September 10, 2019 12:13 AM
>> >> >> To: Yigit, Ferruh <[email protected]>; Loftus, Ciara
>> >> >> <[email protected]>; Ye, Xiaolong <[email protected]>;
>> >> >> Zhang, Qi Z <[email protected]>
>> >> >> Cc: [email protected]; [email protected]
>> >> >> Subject: [PATCH] net/af_xdp: fix Tx halt when no recv packets
>> >> >>
>> >> >> The kernel only consumes Tx packets if we have some Rx traffic on
>> >> >> specified queue or we have called send(). So we need to issue a
>> >> >> send() even when the allocation fails so that kernel will start to
>> >> >> consume packets again.
>> >> >
>> >> >So "allocation fails" means "xsk_ring_prod__reserve" fails, right?
>> >>
>> >> Yes.
>> >>
>> >> >I don't understand why the kernel would stop Tx in this situation
>> >> >when xsk_ring_prod__needs_wakeup is true. Would you share more
>> >> >insight?
>> >>
>> >> Actually, the failing case is when xsk_ring_prod__needs_wakeup is
>> >> false: then we can't issue a send() when xsk_ring_prod__reserve fails.
>> >
>> >Sorry, I think my question should be about the case when
>> >xsk_ring_prod__needs_wakeup is false. I don't understand why we need to
>> >handle the two situations below differently: 1. when
>> >xsk_ring_prod__reserve fails, 2. the normal Tx scenario.
>> >My understanding is that when xsk_ring_prod__needs_wakeup(tx) is false,
>> >Tx is ongoing, so we don't need to wake up the kernel to continue.
>> >
>>
>> The problem is that the kernel does not guarantee that all entries are
>> sent for Tx. There are a number of reasons why this might not happen, but
>> usually some Rx packet will, in the very near future, trigger further Tx
>> processing and the packets will be sent. But if you only have Tx
>> processing and no Rx at all, you have to trigger a sendto() again.
>
>OK, so the question is why we have the code below.
>#if defined(XDP_USE_NEED_WAKEUP)
>if (xsk_ring_prod__needs_wakeup(&txq->tx))
>#endif
> kick_tx(txq);
>
>Here, when xsk_ring_prod__needs_wakeup is false, we can skip kick_tx (send),
>but why can't the same "if" check be applied to the case where
>xsk_ring_prod__reserve failed?
When the Tx ring runs out of free entries, some explicit action is needed to
make the kernel consume the Tx buffers.
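
To make the rule concrete, here is a rough sketch of the kick decision as I
read it from the patch. should_kick_tx() is a hypothetical helper for
illustration only; it is not a function in the driver, which makes the same
choice inline around kick_tx().

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not part of rte_eth_af_xdp.c.
 *
 * reserve_failed: xsk_ring_prod__reserve() could not get the requested
 *                 Tx entries, i.e. the Tx ring is full.
 * needs_wakeup:   value reported by xsk_ring_prod__needs_wakeup(); a build
 *                 without XDP_USE_NEED_WAKEUP would behave as if this were
 *                 always true.
 */
static bool should_kick_tx(bool reserve_failed, bool needs_wakeup)
{
	/* Ring full: always kick, otherwise nothing may ever drain it. */
	if (reserve_failed)
		return true;

	/* Normal path after submit: kick only when the kernel asks for it. */
	return needs_wakeup;
}
```

So the need_wakeup check is only an optimization for the normal path; it is
deliberately bypassed when the ring is full.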
>
>Btw, think about the case below:
>when xsk_ring_prod__reserve fails, if we don't call kick_tx and no further
>Rx happens,
>does that mean the remaining packets in the Tx queue will never get a chance
>to be transmitted? What happens if tx_burst is never called again?
This is exactly the issue this patch tries to fix. In this case, an
xsk_ring_prod__reserve
failure means there are no more available entries in the Tx queue; if we don't
call send()/sendto() and there is no Rx traffic, Tx just halts.
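
A toy model of the halt, in case it helps. Everything here (toy_tx_ring,
TOY_RING_SIZE, the toy_* functions) is made up for illustration and is not
the libbpf or driver code. It also assumes, as a simplification, that the
kernel drains the ring only when kicked, which is the Tx-only worst case
described above.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_RING_SIZE 4u /* hypothetical, tiny on purpose */

/* Toy stand-in for an AF_XDP Tx ring; not the libbpf structure. */
struct toy_tx_ring {
	unsigned int queued; /* submitted but not yet consumed */
	unsigned int sent;   /* consumed by the modelled kernel */
};

/* Models the kernel consuming all pending Tx entries after a kick. */
static void toy_kernel_drain(struct toy_tx_ring *r)
{
	r->sent += r->queued;
	r->queued = 0;
}

/* Returns true if one entry was reserved, false if the ring is full. */
static bool toy_reserve(struct toy_tx_ring *r)
{
	assert(r->queued <= TOY_RING_SIZE);
	if (r->queued == TOY_RING_SIZE)
		return false;
	r->queued++;
	return true;
}

/*
 * Issue n single-packet Tx bursts with no Rx traffic at all.
 * kick_on_full selects the patched behaviour: when the reserve fails,
 * kick the kernel and give up on this burst (the next one can succeed).
 * Returns the number of packets accepted into the ring.
 */
static unsigned int toy_tx_only(struct toy_tx_ring *r, unsigned int n,
				bool kick_on_full)
{
	unsigned int accepted = 0;

	for (unsigned int i = 0; i < n; i++) {
		if (!toy_reserve(r)) {
			if (kick_on_full)
				toy_kernel_drain(r); /* stands in for kick_tx() */
			continue; /* this burst is not accepted */
		}
		accepted++;
	}
	return accepted;
}
```

Without the kick, the model accepts TOY_RING_SIZE packets and then every
later burst fails forever; with the kick, a failed burst frees the ring for
the following ones.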
Thanks,
Xiaolong
>
>>
>> Thanks,
>> Xiaolong
>>
>> >>
>> >> Thanks,
>> >> Xiaolong
>> >>
>> >> >
>> >> >Thanks
>> >> >Qi
>> >> >
>> >> >>
>> >> >> Commit 45bba02c95b0 ("net/af_xdp: support need wakeup feature")
>> >> >> breaks the above rule by adding a condition to send(). This patch
>> >> >> fixes it while still keeping the need_wakeup feature for Tx.
>> >> >>
>> >> >> Fixes: 45bba02c95b0 ("net/af_xdp: support need wakeup feature")
>> >> >> Cc: [email protected]
>> >> >>
>> >> >> Signed-off-by: Xiaolong Ye <[email protected]>
>> >> >> ---
>> >> >> drivers/net/af_xdp/rte_eth_af_xdp.c | 28 ++++++++++++++--------------
>> >> >> 1 file changed, 14 insertions(+), 14 deletions(-)
>> >> >>
>> >> >> diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
>> >> >> index 41ed5b2af..e496e9aaa 100644
>> >> >> --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
>> >> >> +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
>> >> >> @@ -286,19 +286,16 @@ kick_tx(struct pkt_tx_queue *txq) {
>> >> >> struct xsk_umem_info *umem = txq->pair->umem;
>> >> >>
>> >> >> -#if defined(XDP_USE_NEED_WAKEUP)
>> >> >> - if (xsk_ring_prod__needs_wakeup(&txq->tx))
>> >> >> -#endif
>> >> >> - while (send(xsk_socket__fd(txq->pair->xsk), NULL,
>> >> >> - 0, MSG_DONTWAIT) < 0) {
>> >> >> - /* some thing unexpected */
>> >> >> - if (errno != EBUSY && errno != EAGAIN && errno != EINTR)
>> >> >> - break;
>> >> >> -
>> >> >> - /* pull from completion queue to leave more space */
>> >> >> - if (errno == EAGAIN)
>> >> >> - pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);
>> >> >> - }
>> >> >> + while (send(xsk_socket__fd(txq->pair->xsk), NULL,
>> >> >> + 0, MSG_DONTWAIT) < 0) {
>> >> >> + /* some thing unexpected */
>> >> >> + if (errno != EBUSY && errno != EAGAIN && errno != EINTR)
>> >> >> + break;
>> >> >> +
>> >> >> + /* pull from completion queue to leave more space */
>> >> >> + if (errno == EAGAIN)
>> >> >> + pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);
>> >> >> + }
>> >> >> pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE); }
>> >> >>
>> >> >> @@ -367,7 +364,10 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>> >> >>
>> >> >> xsk_ring_prod__submit(&txq->tx, nb_pkts);
>> >> >>
>> >> >> - kick_tx(txq);
>> >> >> +#if defined(XDP_USE_NEED_WAKEUP)
>> >> >> + if (xsk_ring_prod__needs_wakeup(&txq->tx))
>> >> >> +#endif
>> >> >> + kick_tx(txq);
>> >> >>
>> >> >> txq->stats.tx_pkts += nb_pkts;
>> >> >> txq->stats.tx_bytes += tx_bytes;
>> >> >> --
>> >> >> 2.17.1
>> >> >