On Fri, 11 Jan 2019 22:10:39 +0000 "Soni, Shivam" <shivs...@amazon.com> wrote:
> Hi All,
>
> We are trying to debug and fix an issue. After the deployment, in few of the
> hosts we see an issue where TX is unable to enqueue packets to NIC. On
> rebouncing or restarting our packet processor daemon, issue gets resolved.
>
> We are using IntelDPDK version 17.11.4 and i40e drivers.
>
> On looking into driver’s code, we found that whenever the issue is happening
> the value for nb_tx_free is ‘0’. And then it tries to free the buffer by
> calling function ‘i40e_tx_free_bufs’.
>
> This method returns early as the buffer its trying to free says it hasn’t
> finished transmitting yet. The method returns at this if condition:
>
> /* check DD bits on threshold descriptor */
> if ((txq->tx_ring[txq->tx_next_dd].cmd_type_offset_bsz &
>         rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK)) !=
>         rte_cpu_to_le_64(I40E_TX_DESC_DTYPE_DESC_DONE)) {
>     return 0;
> }
>
> Hence nb_tx_free remains 0.
>
> Our tx descriptor count is 1024.
>
> How can we fix this issue. Can someone help us out here please

Use a bigger mbuf pool. For safety, the mbuf pool has to be big enough for

  Nports * (NRxd + NTxd) + NCore * (mbuf_pool_cache_size + burst_size)

Each NIC might get a full receive ring and a full transmit ring, and each
active core might be processing a burst of packets and have free buffers
sitting in the mbuf pool cache.

This doesn't account for additional mbufs created if doing things like
reassembly, encryption, re-encapsulation, or compression.

Anything smaller and your application is relying on statistical averages
to never see resource exhaustion, i.e. overcommitment.
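
The sizing rule above can be written as a small helper. This is only a sketch: the function and parameter names are made up for illustration and are not DPDK identifiers, and it assumes one Rx ring and one Tx ring per port, as the formula does.

```c
#include <stdint.h>

/*
 * Worst-case mbuf demand, following the rule above:
 *   Nports * (NRxd + NTxd) + NCore * (mbuf_pool_cache_size + burst_size)
 *
 * Hypothetical helper; names are illustrative, not part of DPDK.
 * It does not cover extra mbufs from reassembly, encryption,
 * re-encapsulation, or compression.
 */
static inline uint64_t
min_mbuf_pool_size(uint32_t n_ports, uint32_t n_rxd, uint32_t n_txd,
		   uint32_t n_cores, uint32_t cache_size, uint32_t burst_size)
{
	/* Every port may have a completely full Rx ring and Tx ring. */
	uint64_t per_port = (uint64_t)n_rxd + n_txd;

	/* Every core may hold a burst in flight plus a full per-core cache. */
	uint64_t per_core = (uint64_t)cache_size + burst_size;

	return (uint64_t)n_ports * per_port + (uint64_t)n_cores * per_core;
}
```

As an example with assumed numbers (only the Tx descriptor count of 1024 comes from this thread): 2 ports, 1024 Rx and 1024 Tx descriptors, 4 cores, a cache size of 256, and a burst size of 32 give 2 * 2048 + 4 * 288 = 5248 mbufs as the floor. That value, or something larger, is what would go in as the element count `n` of rte_pktmbuf_pool_create().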