On Fri, 11 Jan 2019 22:10:39 +0000
"Soni, Shivam" <shivs...@amazon.com> wrote:

> Hi All,
> 
> We are trying to debug and fix an issue. After deployment, on a few of the
> hosts we see that TX is unable to enqueue packets to the NIC. Restarting
> our packet processor daemon resolves the issue.
> 
> We are using Intel DPDK version 17.11.4 and the i40e driver.
> 
> Looking into the driver's code, we found that whenever the issue occurs,
> the value of nb_tx_free is 0. The driver then tries to free buffers by
> calling the function i40e_tx_free_bufs().
> 
> This function returns early because the descriptor it checks has not
> finished transmitting yet. It returns at this if condition:
> 
> /* check DD bits on threshold descriptor */
> if ((txq->tx_ring[txq->tx_next_dd].cmd_type_offset_bsz &
>                 rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK)) !=
>                 rte_cpu_to_le_64(I40E_TX_DESC_DTYPE_DESC_DONE)) {
>         return 0;
> }
> 
> Hence nb_tx_free remains 0.
> 
> Our tx descriptor count is 1024.
> 
> How can we fix this issue? Can someone help us out here, please?

Use a bigger mbuf pool.  For safety, the mbuf pool has to be big enough
for Nports * (NRxd + NTxd) + NCore * (mbuf_pool_cache_size + burst_size)
mbufs.

Each NIC might have a full receive ring and a full transmit ring,
and each active core might be processing a burst of packets and have
free buffers sitting in its mbuf pool cache. This doesn't account for
additional mbufs created when doing things like reassembly, encryption,
re-encapsulation, or compression.

Anything smaller and your application is relying on statistical averages
to never see resource exhaustion; that is overcommitment.
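
As a rough sketch of the sizing rule above, the arithmetic can be written
out in C. The struct and function names here (pool_params,
min_mbuf_pool_size) are made up for illustration and are not part of DPDK;
the sketch also assumes one Rx and one Tx queue per port, so with several
queues per port the descriptor counts would need to be summed over all
queues.

```c
#include <stdint.h>

/* Hypothetical helper, not a DPDK API: inputs to the sizing rule. */
struct pool_params {
    uint32_t nb_ports;    /* Nports: ports sharing this mbuf pool */
    uint32_t nb_rxd;      /* NRxd: Rx descriptors per port */
    uint32_t nb_txd;      /* NTxd: Tx descriptors per port */
    uint32_t nb_cores;    /* NCore: active cores using the pool */
    uint32_t cache_size;  /* per-core mbuf pool cache size */
    uint32_t burst_size;  /* packets a core handles per burst */
};

/*
 * Lower bound on the number of mbufs in the pool:
 *   Nports * (NRxd + NTxd) + NCore * (cache_size + burst_size)
 * Extra mbufs for reassembly, encryption, and so on are not included.
 */
static uint32_t min_mbuf_pool_size(const struct pool_params *p)
{
    /* Every Rx and Tx ring may be completely full of mbufs. */
    uint32_t ring_mbufs = p->nb_ports * (p->nb_rxd + p->nb_txd);

    /* Each core may hold a full cache plus one burst in flight. */
    uint32_t core_mbufs = p->nb_cores * (p->cache_size + p->burst_size);

    return ring_mbufs + core_mbufs;
}
```

For example, with assumed values of 2 ports, 1024 Rx and 1024 Tx
descriptors, 4 cores, a cache size of 256, and a burst size of 32, this
gives 2 * 2048 + 4 * 288 = 5248 mbufs as the minimum.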
