Hi Stephen,
If using DPDK rings comes with this penalty, then what should I use? Is there an
alternative to rings? We do not want to use shared memory and do buffer copies.

Thanks,
Ed

-----Original Message-----
From: Stephen Hemminger <step...@networkplumber.org> 
Sent: Sunday, July 6, 2025 12:03 PM
To: Lombardo, Ed <ed.lomba...@netscout.com>
Cc: Ivan Malov <ivan.ma...@arknetworks.am>; users <users@dpdk.org>
Subject: Re: dpdk Tx falling short


On Sun, 6 Jul 2025 00:03:16 +0000
"Lombardo, Ed" <ed.lomba...@netscout.com> wrote:

> Hi Stephen,
> Here are comments on the list of obvious causes of cache misses you mentioned.
> 
> Obvious cache misses.
>  - passing packets to worker with ring - we use lots of rings to pass mbuf 
> pointers.  If I skip the rte_eth_tx_burst() and just free the mbufs in bulk, 
> the tx ring does not fill up.
>  - using spinlocks (cost 16ns)  - The driver does not use spinlocks, other 
> than what dpdk uses.
>  - fetching TSC  - We don't do this, we let Rx offload timestamp packets.
>  - syscalls?  - No syscalls are done in our driver fast path.
> 
> You mention "passing packets to worker with ring", do you mean using rings to 
> pass mbuf pointers causes cache misses and should be avoided?

Rings cause data to be modified by one core and examined by another, so every 
dequeue on the consumer side incurs cache misses.
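
The usual mitigation is to amortize that miss over a burst instead of paying it
per mbuf. A minimal sketch of that pattern is below; it is not from this thread,
and the names tx_ring, port_id, queue_id and TX_BURST are placeholders I chose
for illustration.

    /* Sketch: drain a ring in bursts so the shared producer/consumer
     * indexes of the ring are touched once per burst, not once per mbuf.
     */
    #include <rte_ring.h>
    #include <rte_ethdev.h>
    #include <rte_mbuf.h>
    #include <rte_prefetch.h>

    #define TX_BURST 32

    static void
    drain_tx_ring(struct rte_ring *tx_ring, uint16_t port_id, uint16_t queue_id)
    {
            struct rte_mbuf *pkts[TX_BURST];
            unsigned int n, i;
            uint16_t sent;

            /* One dequeue pays the cross-core cache miss for the whole burst. */
            n = rte_ring_dequeue_burst(tx_ring, (void **)pkts, TX_BURST, NULL);
            if (n == 0)
                    return;

            /* Prefetch mbuf headers written by the producer core before the
             * PMD reads them in rte_eth_tx_burst().
             */
            for (i = 0; i < n; i++)
                    rte_prefetch0(pkts[i]);

            sent = rte_eth_tx_burst(port_id, queue_id, pkts, n);

            /* Free whatever the NIC queue would not accept. */
            for (i = sent; i < n; i++)
                    rte_pktmbuf_free(pkts[i]);
    }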
