> -----Original Message-----
> From: Intel-wired-lan <[email protected]> On Behalf Of
> Alexander Lobakin
> Sent: Tuesday, August 26, 2025 9:25 PM
> To: [email protected]
> Cc: Lobakin, Aleksander <[email protected]>; Kubiak, Michal
> <[email protected]>; Fijalkowski, Maciej
> <[email protected]>; Nguyen, Anthony L
> <[email protected]>; Kitszel, Przemyslaw
> <[email protected]>; Andrew Lunn <[email protected]>;
> David S. Miller <[email protected]>; Eric Dumazet
> <[email protected]>; Jakub Kicinski <[email protected]>; Paolo Abeni
> <[email protected]>; Alexei Starovoitov <[email protected]>; Daniel
> Borkmann <[email protected]>; Simon Horman <[email protected]>;
> NXNE CNSE OSDT ITP Upstreaming
> <[email protected]>; [email protected];
> [email protected]; [email protected]
> Subject: [Intel-wired-lan] [PATCH iwl-next v5 01/13] xdp, libeth: make the
> xdp_init_buff() micro-optimization generic
> 
> Oftentimes, compilers are unable to merge two consecutive 32-bit writes into
> a single 64-bit write on architectures where that is possible. This applies
> to xdp_init_buff(), which is called for every received frame (or at least
> once per 64 frames when the frag size is fixed).
> Move the not-so-pretty hack from libeth_xdp straight to xdp_init_buff(), but
> using a proper union around ::frame_sz and ::flags.
> The optimization is limited to LE architectures due to the structure layout.
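
For readers following along, here is a minimal userspace sketch of the idea described above: two adjacent 32-bit fields overlaid by one 64-bit member, so that initialization becomes a single store on little-endian targets. The names demo_buff and demo_init_buff are hypothetical, the struct only mimics the two fields mentioned in the commit message, and the endianness check relies on the GCC/Clang predefined macros rather than the kernel's own. It is an illustration, not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two xdp_buff fields named in the commit message. */
struct demo_buff {
	union {
		struct {
			uint32_t frame_sz;
			uint32_t flags;
		};
		/* Overlays both fields; meant to be touched only by the init helper. */
		uint64_t frame_sz_flags_init;
	};
};

/* The trick only works if the two fields pack into exactly 64 bits. */
static_assert(sizeof(struct demo_buff) == sizeof(uint64_t),
	      "frame_sz and flags must fill one 64-bit word");

static inline void demo_init_buff(struct demo_buff *buf, uint32_t frame_sz)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
	/*
	 * One 64-bit store: on little-endian the low half lands in frame_sz
	 * and the zero-extended high half clears flags.
	 */
	buf->frame_sz_flags_init = frame_sz;
#else
	/* On big-endian the halves would be swapped, so use two plain stores. */
	buf->frame_sz = frame_sz;
	buf->flags = 0;
#endif
}
```

Calling demo_init_buff(&buf, 4096) on a buffer holding stale values should leave buf.frame_sz == 4096 and buf.flags == 0 on either endianness; only the number of stores differs.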
> 
> One simple example from idpf with the XDP series applied (Clang 22-git,
> CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE => -O2):
> 
> add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-27 (-27)
> Function                                     old     new   delta
> idpf_vport_splitq_napi_poll                 5076    5049     -27
> 
> The perf difference with XDP_DROP is around +0.8-1% which I see as more
> than satisfying.
> 
> Suggested-by: Simon Horman <[email protected]>
> Signed-off-by: Alexander Lobakin <[email protected]>
> ---
>  include/net/libeth/xdp.h | 11 +----------
>  include/net/xdp.h        | 28 +++++++++++++++++++++++++---
>  2 files changed, 26 insertions(+), 13 deletions(-)
> 
Tested-by: R,Ramu <[email protected]>
