Hello DPDK community,

I have a DPDK application running on a system with a tight memory constraint: each application has a memory budget to respect. So, the number of mbufs in the mempool is calculated to match this constraint.

However, I observed a strange behavior in the way the PMDs free the mbufs that have been sent on the network.

To simplify my explanation, I built a small DPDK application which only sends packets (testpmd could be used as well, I think). I ran it on top of one VF (fm10kvf or ixgbevf). The VF has 1 RxQ / 1024 RxD (not used) / 1 TxQ / 128 TxD.
Also, let's say: 1 mbuf = 1 descriptor.

===========================
First point: tx_free_thresh
===========================

According to the DPDK Programmer's Guide, section 11.4.5, ``Configuration of Transmit Queues``::

  The minimum transmit packets to free threshold (tx_free_thresh).
    When the number of descriptors used to transmit packets exceeds
    this threshold, the network adaptor should be checked to see if it
    has written back descriptors.
    The default value for tx_free_thresh is 32.
    This ensures that the PMD does not search for completed descriptors
    **until at least 32 have been processed by the NIC for this queue.**

However, in the DPDK headers, in ``rte_eth_txconf`` struct::

  uint16_t tx_free_thresh; /**< Start freeing TX buffers if there are
                                 less free descriptors than this value. */

And in the docstring of ``rte_eth_tx_queue_setup``::

  - The *tx_free_thresh* value indicates the [minimum] number of network
    buffers that must be pending in the transmit ring to trigger their
    [implicit] freeing by the driver transmit function.

After a code review and tests on the target (fm10kvf), my understanding is:
* tx_free_thresh is set to 32:
- if I send 32 packets, all mbufs stay locked in the TxQ.
- if I send 33 packets, all mbufs stay locked in the TxQ.
- if I send 96 packets, all mbufs stay locked in the TxQ.
- if I send 97 packets, the PMD tries to clean the TxQ.

* tx_free_thresh is set to 128-32=96:
- if I send 32 packets, all mbufs stay locked in the TxQ.
- if I send 33 packets, the PMD tries to clean the TxQ.

Is there a misunderstanding in the DPDK Programmer's Guide or in the docstring?
Should ``tx_free_thresh`` be described as follows instead?::

  This ensures that the PMD does not search for completed descriptors
  until fewer than 32 descriptors **are still available** for this
  queue.

================================
Second: tx_free_thresh and ixgbe
================================

My application runs on two platforms: one with fm10kvf, one with ixgbevf.

I did the following tests:
- fm10kvf / no offload / TxD = 128 / tx_free_thresh = 96
=> after 33 packets, the PMD tries to clean up mbufs, as expected.

- fm10kvf / offload (TX multiseg ON) / TxD = 128 / tx_free_thresh = 96
=> after 33 packets, the PMD tries to clean up mbufs, as expected.

- ixgbevf / no offload / TxD = 128 / tx_free_thresh = 96
=> after 33 packets, the PMD tries to clean up mbufs, as expected.

- ixgbevf / offload (TX multiseg ON) / TxD = 128 / tx_free_thresh = 96
=> after 33 packets, all mbufs stay locked in the TxQ.
=> after 97 packets, all mbufs stay locked in the TxQ.
=> after 128 packets, only the first mbuf sent is freed.

I did some analysis in this PMD. The TX function is not the same depending on whether offloads are enabled (see ixgbe_set_tx_function):
- when no offload is used:
  * the TX function is ixgbe_xmit_pkts_vec
  * ixgbe_xmit_pkts_vec correctly honors tx_free_thresh and calls ixgbe_tx_free_bufs in order to free the mbufs.
- when offload is enabled:
  * the TX function is ixgbe_xmit_pkts
  * ixgbe_xmit_pkts calls ixgbe_xmit_cleanup (instead of freeing), which seems to only manage internal pointers.
  * mbufs are freed only when ixgbe detects that a descriptor was previously used: http://git.dpdk.org/dpdk/tree/drivers/net/ixgbe/ixgbe_rxtx.c#n890

To sum up:
- when offload is disabled, the TX ring can be cleaned up quickly, which improves mbuf circulation inside the application.
- when offload is enabled, the TX ring cannot be cleaned up quickly, and eventually all mbufs end up locked in the TxQ.

I had a use case with 6 VFs / 4 TxQ / 1024 descriptors. After a few seconds of testing, 6 * 4 * 1024 = 24576 mbufs were unavailable because they were locked in TxQs. As our mempool is quite small, this behavior cannot be absorbed, and it occurs only because the PMD implementations differ.

Is this the expected and intended behavior for ixgbe with full TX features enabled?

Did I miss something in my configuration?

Thanks in advance!
Best regards,

--
Julien Meunier
