Hi Ivan,

Yes, only the user-space-created rings. Can you add more to your thoughts?
Ed

-----Original Message-----
From: Ivan Malov <[email protected]>
Sent: Tuesday, July 8, 2025 10:19 AM
To: Lombardo, Ed <[email protected]>
Cc: Stephen Hemminger <[email protected]>; users <[email protected]>
Subject: RE: dpdk Tx falling short

External Email: This message originated outside of NETSCOUT. Do not click links or open attachments unless you recognize the sender and know the content is safe.

Hi Ed,

On Tue, 8 Jul 2025, Lombardo, Ed wrote:

> Hi Stephen,
> When I replace rte_eth_tx_burst() with an mbuf bulk free, I do not see the
> Tx ring fill up. I think this is valuable information. Also, perf analysis
> of the tx thread shows common_ring_mp_enqueue and rte_atomic32_cmpset,
> which I did not expect to see since I created all the Tx rings as SP and SC
> (and the worker and ack rings as well, essentially all 16 rings).
>
> Perf report snippet:
> +  57.25%  DPDK_TX_1  test  [.] common_ring_mp_enqueue
> +  25.51%  DPDK_TX_1  test  [.] rte_atomic32_cmpset
> +   9.13%  DPDK_TX_1  test  [.] i40e_xmit_pkts
> +   6.50%  DPDK_TX_1  test  [.] rte_pause
>     0.21%  DPDK_TX_1  test  [.] rte_mempool_ops_enqueue_bulk.isra.0
>     0.20%  DPDK_TX_1  test  [.] dpdk_tx_thread
>
> The traffic load is a constant 10 Gbps of 84-byte packets with no idle
> gaps. The burst size of 512 is the desired number of mbufs per burst;
> however, the tx thread will transmit whatever it can get from the Tx ring.
>
> I think resolving why perf shows the ring as MP, when it was created as
> SP/SC, should resolve this issue.

'common_ring_mp_enqueue' is the enqueue method of the mempool variant 'ring', that is, a mempool based on an RTE ring internally. When you say the ring has been created as SP/SC, you seemingly refer to the regular RTE rings created by your application logic, not to the internal ring of the mempool. Am I missing something?

Thank you.
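[Editor's note: a minimal sketch of Ivan's point. The mempool's internal ring ops are chosen when the pool is created, independently of any application RTE rings; `rte_pktmbuf_pool_create_by_ops()` with the "ring_sp_sc" ops name selects single-producer/single-consumer mempool enqueue/dequeue. The pool name and sizing constants below are illustrative placeholders, not taken from the thread.]

```c
/* Sketch: create an mbuf pool whose internal ring uses SP/SC ops, so
 * mempool puts use common_ring_sp_enqueue instead of the MP variant
 * (the CAS loop seen in the perf report). Only safe if a single lcore
 * frees mbufs back to the pool and a single lcore allocates from it;
 * the choice must match the application's real threading model. */
#include <rte_mbuf.h>
#include <rte_lcore.h>
#include <rte_eal.h>

#define NB_MBUF    8192   /* placeholder pool size */
#define MBUF_CACHE  256   /* placeholder per-lcore cache */

static struct rte_mempool *
create_sp_sc_mbuf_pool(void)
{
	struct rte_mempool *pool;

	pool = rte_pktmbuf_pool_create_by_ops("mbuf_pool_sp_sc",
					      NB_MBUF,
					      MBUF_CACHE,
					      0, /* priv size */
					      RTE_MBUF_DEFAULT_BUF_SIZE,
					      rte_socket_id(),
					      "ring_sp_sc"); /* SP/SC ops */
	if (pool == NULL)
		rte_exit(EXIT_FAILURE, "cannot create SP/SC mbuf pool\n");
	return pool;
}
```

Note that a non-zero per-lcore cache already absorbs most mempool ring traffic; seeing common_ring_mp_enqueue dominate perf can also indicate the cache is too small (or bypassed) for the burst sizes in use.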
> Thanks,
> Ed
>
> -----Original Message-----
> From: Stephen Hemminger <[email protected]>
> Sent: Tuesday, July 8, 2025 9:47 AM
> To: Lombardo, Ed <[email protected]>
> Cc: Ivan Malov <[email protected]>; users <[email protected]>
> Subject: Re: dpdk Tx falling short
>
> On Tue, 8 Jul 2025 04:10:05 +0000
> "Lombardo, Ed" <[email protected]> wrote:
>
>> Hi Stephen,
>> I ensured that every pipeline stage that enqueues or dequeues mbufs uses
>> the burst version; perf showed the repercussions of doing single-mbuf
>> dequeues and enqueues.
>> For the receive stage rte_eth_rx_burst() is used, and in the Tx stage we
>> use rte_eth_tx_burst(). The burst size used in tx_thread for the dequeue
>> burst is 512 mbufs.
>
> You might try buffering like rte_eth_tx_buffer does.
> Need to add an additional mechanism to ensure that the buffer gets flushed
> when you detect an idle period.
>
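[Editor's note: a sketch of Stephen's suggestion under stated assumptions. `tx_ring`, `PORT_ID`, `QUEUE_ID`, and the drain interval are hypothetical placeholders for the application's own objects; the `rte_eth_tx_buffer*` calls themselves are the standard ethdev buffering API.]

```c
/* Sketch: accumulate mbufs with rte_eth_tx_buffer() and flush either
 * when the buffer fills (done internally) or when the Tx ring has been
 * idle for roughly 100 us, so buffered packets are not stranded. */
#include <rte_ethdev.h>
#include <rte_ring.h>
#include <rte_malloc.h>
#include <rte_cycles.h>

#define TX_BUF_SZ 512   /* matches the desired burst in the thread */
#define BURST     512

static void
tx_loop(struct rte_ring *tx_ring, uint16_t PORT_ID, uint16_t QUEUE_ID)
{
	struct rte_eth_dev_tx_buffer *txb;
	uint64_t drain_tsc, last_tx;

	txb = rte_zmalloc_socket("tx_buffer",
				 RTE_ETH_TX_BUFFER_SIZE(TX_BUF_SZ), 0,
				 rte_socket_id());
	rte_eth_tx_buffer_init(txb, TX_BUF_SZ);

	drain_tsc = rte_get_tsc_hz() / 10000; /* ~100 us, tune as needed */
	last_tx = rte_get_tsc_cycles();

	for (;;) {
		struct rte_mbuf *pkts[BURST];
		unsigned int i, n;

		n = rte_ring_dequeue_burst(tx_ring, (void **)pkts,
					   BURST, NULL);
		for (i = 0; i < n; i++)
			rte_eth_tx_buffer(PORT_ID, QUEUE_ID, txb, pkts[i]);

		if (n > 0) {
			last_tx = rte_get_tsc_cycles();
		} else if (rte_get_tsc_cycles() - last_tx > drain_tsc) {
			/* idle period detected: flush whatever is buffered */
			rte_eth_tx_buffer_flush(PORT_ID, QUEUE_ID, txb);
			last_tx = rte_get_tsc_cycles();
		}
	}
}
```

This mirrors the drain-timer pattern used by several DPDK sample applications; the trade-off is a bounded extra latency (the drain interval) on the last partial burst in exchange for full-size bursts into rte_eth_tx_burst() under load.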
