On Sat, 5 Jul 2025 17:36:08 +0000 "Lombardo, Ed" <ed.lomba...@netscout.com> wrote:
> Hi Stephen,
> I saw your response to more mempools and cache behavior.
>
> I have a goal to support 2x100G next, and if I can't get 10G with DPDK then
> something is seriously wrong.
>
> Should I build the dpdk static libraries with LTO?
>
> Thanks,
> Ed

Are you doing anything in the fast path that is an obvious cache miss?

At 10 Gbit/sec with a packet size of 84 bytes, the budget is 67.2 ns per packet.
CPUs haven't got that much faster; on a 3 GHz CPU that is 201 cycles.

A single cache miss is 32 ns, so two cache misses mean the per-packet budget is gone.

Obvious cache misses:
  - passing packets to worker with ring
  - using spinlocks (cost 16 ns)
  - fetching TSC
  - syscalls?

Also, never ever use floating point.

Kernel related and older but worth looking at:
  https://people.netfilter.org/hawk/presentations/LCA2015/net_stack_challenges_100G_LCA2015.pdf
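
The budget arithmetic above can be sketched in a few lines of C, in case you want to redo it for other line rates or clock speeds. This is only an illustration: the helper names (pkt_budget_ps, pkt_budget_cycles) and the WIRE_BYTES_MIN macro are made up for this example and are not DPDK APIs. It assumes the 84 bytes is a minimum-size frame on the wire (64-byte frame + 8-byte preamble/SFD + 12-byte inter-frame gap) and sticks to integer math.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed breakdown of the 84-byte figure: 64-byte minimum frame
 * + 8-byte preamble/SFD + 12-byte inter-frame gap. */
#define WIRE_BYTES_MIN 84u

/* Hypothetical helper: time available per packet, in picoseconds,
 * for a given on-wire size and line rate in Gbit/s. */
uint64_t pkt_budget_ps(uint32_t wire_bytes, uint32_t gbit_per_sec)
{
    assert(gbit_per_sec > 0);
    /* bits / (Gbit/s) gives ns; multiply by 1000 to keep ps as an integer */
    return (uint64_t)wire_bytes * 8u * 1000u / gbit_per_sec;
}

/* Hypothetical helper: whole CPU cycles available per packet,
 * given the budget in picoseconds and the core clock in MHz. */
uint64_t pkt_budget_cycles(uint64_t budget_ps, uint32_t cpu_mhz)
{
    /* ps * MHz = 1e-6 cycles, so divide by 1e6 (truncates) */
    return budget_ps * cpu_mhz / 1000000u;
}
```

With these, pkt_budget_ps(WIRE_BYTES_MIN, 10) gives 67200 ps (67.2 ns), and pkt_budget_cycles(67200, 3000) gives 201 cycles, matching the numbers above. Plugging in 100 instead of 10 shows how little room is left per packet at 100G on a single core.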