Hi Stephen,
If using DPDK rings comes with this penalty, then what should I use? Is there an alternative to rings? We do not want to use shared memory and do buffer copies.
Thanks,
Ed

-----Original Message-----
From: Stephen Hemminger <step...@networkplumber.org>
Sent: Sunday, July 6, 2025 12:03 PM
To: Lombardo, Ed <ed.lomba...@netscout.com>
Cc: Ivan Malov <ivan.ma...@arknetworks.am>; users <users@dpdk.org>
Subject: Re: dpdk Tx falling short

On Sun, 6 Jul 2025 00:03:16 +0000
"Lombardo, Ed" <ed.lomba...@netscout.com> wrote:

> Hi Stephen,
> Here are my comments on the list of obvious causes of cache misses you mentioned.
>
> Obvious cache misses:
> - Passing packets to a worker with a ring - we use lots of rings to pass mbuf
>   pointers. If I skip the rte_eth_tx_burst() and just free the mbufs in bulk,
>   the tx ring does not fill up.
> - Using spinlocks (cost 16 ns) - the driver does not use spinlocks, other
>   than what DPDK uses.
> - Fetching the TSC - we don't do this; we let the Rx offload timestamp packets.
> - Syscalls? - no syscalls are made in our driver fast path.
>
> You mention "passing packets to worker with ring"; do you mean that using rings
> to pass mbuf pointers causes cache misses and should be avoided?

Rings do cause data to be modified by one core and examined by another, so they are a cache miss.
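
A minimal sketch of the handoff being discussed (illustrative only, not code from this thread): one core enqueues mbuf pointers onto an rte_ring in bursts, and the tx worker dequeues them in bursts before calling rte_eth_tx_burst(). The cross-core cache miss mentioned above cannot be avoided when a ring is used, but batching both sides amortizes it over many packets per ring operation. The ring name, BURST size, and port/queue ids below are assumptions, not details from the thread.

/*
 * Illustrative sketch only: burst-oriented mbuf handoff over an rte_ring.
 * One core produces mbuf pointers, another core consumes them and transmits.
 */
#include <rte_ring.h>
#include <rte_mbuf.h>
#include <rte_ethdev.h>

#define BURST 32  /* assumed burst size */

/* Producer core: hand a burst of mbufs to the tx worker in one ring operation. */
static inline void
handoff_to_tx_worker(struct rte_ring *tx_ring, struct rte_mbuf **pkts,
                     unsigned int n)
{
    unsigned int sent = rte_ring_enqueue_burst(tx_ring, (void **)pkts, n, NULL);

    /* Ring full: drop (or back-pressure) whatever did not fit. */
    for (unsigned int i = sent; i < n; i++)
        rte_pktmbuf_free(pkts[i]);
}

/* Consumer (tx worker) core: drain the ring in bursts and transmit. */
static inline void
tx_worker_poll(struct rte_ring *tx_ring, uint16_t port_id, uint16_t queue_id)
{
    struct rte_mbuf *pkts[BURST];
    unsigned int n = rte_ring_dequeue_burst(tx_ring, (void **)pkts, BURST, NULL);

    if (n == 0)
        return;

    uint16_t sent = rte_eth_tx_burst(port_id, queue_id, pkts, n);

    /* NIC tx queue full: free what could not be transmitted. */
    for (unsigned int i = sent; i < n; i++)
        rte_pktmbuf_free(pkts[i]);
}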