Hi,
I am testing the performance of the Mellanox CX5 100GbE NIC. I found that the
performance (Mpps) of an AMD (x86) server is better than that of a Kunpeng 920
server (ARM) in the single-core scenario. The test commands are as follows:
RX-side:
dpdk-testpmd -l 1-23 -n 4 -a XXXX -- -i --rxq=1 --txq=1 --txd=1024 --rxd=1024 
--nb-cores=1 --eth-peer=0,xxxx  --burst=128 --forward-mode=rxonly -a 
--txpkts=128 --mbcache=512 --rss-udp
TX-side (payload size is 128 bytes):
dpdk-testpmd -a XXXXX -l 1-23 -n 4 -- -i --rxq=4 --txq=4 --txd=1024 --rxd=1024 
--nb-cores=4  --eth-peer=0,XXXXX --burst=64    --forward-mode=txonly -a 
--txpkts=128 --mbcache=512 --rss-udp

firmware-version:
16.32.1010 (HUA0000000004)
OS:
OpenEuler 22.03
Kernel:
5.10
DPDK:
21.11.5

Results:
ARM:
28.598Gbps, 27.928Mpps
X86:
34.015Gbps, 33.218Mpps
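As a sanity check, the Gbps and Mpps figures above are mutually consistent for 128-byte frames (128 B x 8 = 1024 bits per packet), and they put the ARM result at roughly 84% of the x86 result. A minimal Python check, using only the numbers reported above:

```python
# Sanity-check the reported results: convert Mpps to Gbps for 128-byte frames.
FRAME_BITS = 128 * 8  # 128-byte packets as configured via --txpkts=128, in bits

def mpps_to_gbps(mpps: float) -> float:
    """Throughput in Gbps implied by a packet rate (Mpps) at 128-byte frames."""
    return mpps * FRAME_BITS / 1000.0

arm_mpps, x86_mpps = 27.928, 33.218
print(f"ARM: {mpps_to_gbps(arm_mpps):.3f} Gbps")    # matches the reported 28.598 Gbps
print(f"x86: {mpps_to_gbps(x86_mpps):.3f} Gbps")    # matches the reported 34.015 Gbps
print(f"ARM/x86 ratio: {arm_mpps / x86_mpps:.1%}")  # ~84.1%
```

Note that these Gbps figures count only the 128-byte frame itself, not the Ethernet preamble, FCS, or inter-frame gap, so they sit well below the 100GbE line rate even on x86.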

After some checks, I suspect that the bottleneck is mainly the NIC. Have you
tested the performance of the CX5 on ARM servers?
Do you have any optimization suggestions for ARM servers, such as specific
parameters or firmware versions?

Thanks
