On 1/21/2021 6:05 PM, Igor Russkikh wrote:
When measuring peak performance, CPU performance often limits the
maximum values a device can reach (both in pps and in Gbps).

Here, instead of recreating each packet separately, we use a clone counter
to resend the same mbuf to the line multiple times.

PMDs handle that transparently due to reference counting inside of mbuf.
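
To illustrate the reference-counting idea, here is a minimal self-contained C sketch. It does not use the DPDK API and is not the code from this patch: `fake_buf`, `fill_burst` and `release_burst` are hypothetical names, and a plain counter stands in for the mbuf reference count. It assumes that one buffer is built and then repeated up to `clones` extra times in the burst, with each extra pointer taking one more reference, so the buffer is freed only when the last copy is released.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an mbuf: a length plus a reference counter. */
struct fake_buf {
    uint16_t refcnt;   /* number of outstanding pointers to this buffer */
    uint16_t data_len; /* payload length in bytes */
};

/* Allocate one buffer owned by the caller (refcnt starts at 1). */
static struct fake_buf *fake_buf_alloc(uint16_t data_len)
{
    struct fake_buf *b = malloc(sizeof(*b));

    if (b == NULL)
        return NULL;
    b->refcnt = 1;
    b->data_len = data_len;
    return b;
}

/* Drop one reference; free the buffer on the last one. Returns 1 if freed. */
static int fake_buf_release(struct fake_buf *b)
{
    assert(b->refcnt > 0);
    if (--b->refcnt == 0) {
        free(b);
        return 1;
    }
    return 0;
}

/*
 * Fill burst[] so that each freshly allocated buffer is followed by up to
 * `clones` extra pointers to it. Every extra pointer takes one reference.
 * Returns the number of slots filled; *allocs counts real allocations.
 */
static unsigned fill_burst(struct fake_buf **burst, unsigned burst_size,
                           unsigned clones, unsigned *allocs)
{
    struct fake_buf *cur = NULL;
    unsigned remaining = 0;
    unsigned i;

    *allocs = 0;
    for (i = 0; i < burst_size; i++) {
        if (cur == NULL || remaining == 0) {
            cur = fake_buf_alloc(64);   /* build a new packet */
            if (cur == NULL)
                break;
            remaining = clones;
            (*allocs)++;
        } else {
            cur->refcnt++;              /* reuse the previous packet */
            remaining--;
        }
        burst[i] = cur;
    }
    return i;
}

/* Simulate the driver completing the burst: one release per pointer. */
static unsigned release_burst(struct fake_buf **burst, unsigned n)
{
    unsigned freed = 0;
    unsigned i;

    for (i = 0; i < n; i++)
        freed += (unsigned)fake_buf_release(burst[i]);
    return freed;
}
```

With a burst of 8 and 3 clones, this sketch performs 2 allocations instead of 8, which is where the CPU saving would come from.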

This helps reach max PPS on small packet sizes.
Some data from our 2-port x 50G device, using 2*6 Tx queues and 64-byte
packets on a PowerEdge R7525 (AMD EPYC 7452):

./build/app/dpdk-testpmd -l 32-63  -- --forward-mode=flowgen \
   --rxq=6 --txq=6  --disable-crc-strip --burst=512 \
   --flowgen-clones=0 --txd=4096 --stats-period=1 --txpkts=64

Gives ~46MPPS TX output:

   Tx-pps:     22926849          Tx-bps:  11738590176
   Tx-pps:     23642629          Tx-bps:  12105024112

Setting flowgen-clones to 512 pushes Tx almost to our device's
physical limit (68MPPS) using the same 2*6 queues (cores):

   Tx-pps:     34357556          Tx-bps:  17591073696
   Tx-pps:     34353211          Tx-bps:  17588802640

Doing similar measurements per core, I see that one core can do
6.9MPPS (without clones) vs 11MPPS (with clones).
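
The totals quoted above can be cross-checked from the per-port counters. The following is a small C sketch for that arithmetic only; the helper names `total_pps` and `bits_per_packet` are hypothetical, and the constants are copied from the two stats outputs in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum the per-port Tx-pps counters taken from the stats output. */
static uint64_t total_pps(const uint64_t *port_pps, size_t nb_ports)
{
    uint64_t sum = 0;
    size_t i;

    assert(port_pps != NULL);
    for (i = 0; i < nb_ports; i++)
        sum += port_pps[i];
    return sum;
}

/* Average number of bits per packet implied by one stats line. */
static double bits_per_packet(uint64_t bps, uint64_t pps)
{
    return pps ? (double)bps / (double)pps : 0.0;
}

/* Per-port Tx-pps values copied from the two runs quoted above. */
static const uint64_t pps_no_clones[2] = { 22926849ULL, 23642629ULL };
static const uint64_t pps_clones[2]    = { 34357556ULL, 34353211ULL };
```

Summing the two ports gives about 46.6 MPPS without clones and about 68.7 MPPS with clones, a ratio of roughly 1.48, and the Tx-bps/Tx-pps ratio comes out at about 512 bits, which matches 64-byte packets.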

Verified on Marvell qede and atlantic PMDs.

v2:
   - fixed warning on uninit var
v1:
   - fixes on Ferruh's comments

rfc v2: http://patchwork.dpdk.org/patch/78800/
   - increment ref counter for each mbuf pointer copy
rfc v1: http://patchwork.dpdk.org/patch/78674/

Signed-off-by: Igor Russkikh <irussk...@marvell.com>

Reviewed-by: Ferruh Yigit <ferruh.yi...@intel.com>

Also, a testpmd command to set 'flowgen-clones' would be useful for interactive mode. If you have time, could you please work on that too?


Applied to dpdk-next-net/main, thanks.
