Hello,

Thanks for your mail and analysis.
The results below, showing a maximum packet rate of 214 Mpps for the dual-port 
ConnectX-6 Dx, are expected and in line with the NIC's capabilities.

Regards,
Asaf Penso

From: Дмитрий Степанов <stepanov.d...@gmail.com>
Sent: Tuesday, March 22, 2022 11:04 AM
To: users@dpdk.org
Subject: Mellanox Connectx-6 Dx dual port performance

Hi!

I'm testing overall dual-port performance on a ConnectX-6 Dx EN adapter card 
(100GbE; dual-port QSFP56; PCIe 4.0/3.0 x16) with DPDK 21.11 on Ubuntu 20.04.
I have two dual-port NICs installed in the same server (but on different NUMA 
nodes), which I use as a generator and a receiver respectively.
First, I started a custom packet generator on port 0 and got 148 Mpps TX (64-byte 
TCP packets with zero payload length), which equals the maximum 100 Gbps line 
rate. Then I launched the same generator with the same parameters simultaneously 
on port 1.
Performance on both ports decreased to 105-106 Mpps per port (210-212 Mpps in 
total). If I use 512-byte TCP packets, then running generators on both ports 
gives me 23 Mpps for each port (46 Mpps in total, which for the given TCP packet 
size equals the maximum line rate).
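
For reference, my line-rate arithmetic (assuming the usual 20 bytes of per-frame 
overhead on the wire: 8-byte preamble + 12-byte inter-frame gap):

  64-byte frames:  100 Gbps / ((64 + 20) * 8 bits)  ~= 148.8 Mpps per port
  512-byte frames: 100 Gbps / ((512 + 20) * 8 bits) ~= 23.5 Mpps per port

So the single-port 64-byte result and the dual-port 512-byte results above are at 
line rate; only the dual-port 64-byte case falls short of 2 x 148.8 Mpps.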

The Mellanox performance report 
http://fast.dpdk.org/doc/perf/DPDK_21_08_Mellanox_NIC_performance_report.pdf 
doesn't contain measurements for the TX path, only for RX.
The provided Test#11, "Mellanox ConnectX-6 Dx 100GbE PCIe Gen4 Throughput at Zero 
Packet Loss (2x 100GbE)", for the RX path shows nearly the same results that I got 
for the TX path (214 Mpps for 64-byte packets, 47 Mpps for 512-byte packets). The 
question is: should my results for the TX path coincide with the published results 
for the RX path? Why can't I get 148 x 2 Mpps for small packets when using both 
ports? What is the bottleneck here - PCIe, RAM, or the NIC itself?
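
My rough PCIe back-of-envelope, in case it helps narrow this down (please correct 
me if the assumptions are off): a Gen4 x16 link is 16 GT/s x 16 lanes x 128/130 
encoding ~= 252 Gbit/s ~= 31.5 GB/s per direction, while 214 Mpps of 64-byte 
packets is only about 13.7 GB/s of payload. However, each packet also carries TLP 
header plus descriptor/completion overhead, and my understanding is that for small 
packets the per-packet message rate of the NIC's PCIe/packet-processing engine, 
rather than raw bandwidth, is usually what caps the total. For what it's worth, a 
quick way to confirm the link actually trained at Gen4 x16 is something like:

lspci -s c1:00.0 -vvv | grep -E 'LnkCap|LnkSta'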

To test the RX path I used the testpmd and l3fwd (slightly modified to print RX 
stats) utilities.

./dpdk-testpmd -l 64-127 -n 4 -a 0000:c1:00.0,mprq_en=1,mprq_log_stride_num=9 
-a 0000:c1:00.1,mprq_en=1,mprq_log_stride_num=9 -- --stats-period 1 
--nb-cores=16 --rxq=16 --txq=16 --rxd=4096 --txd=4096 --burst=64 --mbcache=512

./build/examples/dpdk-l3fwd -l 96-111 -n 4 --socket-mem=0,4096 -a 
0000:c1:00.0,mprq_en=1,rxqs_min_mprq=1,mprq_log_stride_num=9,txq_inline_mpw=128,rxq_pkt_pad_en=1
 -a 
0000:c1:00.1,mprq_en=1,rxqs_min_mprq=1,mprq_log_stride_num=9,txq_inline_mpw=128,rxq_pkt_pad_en=1
 -- -p 0x3 -P 
--config='(0,0,111),(0,1,110),(0,2,109),(0,3,108),(0,4,107),(0,5,106),(0,6,105),(0,7,104),(1,0,103),(1,1,102),(1,2,101),(1,3,100),(1,4,99),(1,5,98),(1,6,97),(1,7,96)'
 --eth-dest=0,00:15:77:1f:eb:fb --eth-dest=1,00:15:77:1f:eb:fb
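
To see where packets get lost on the RX side, I can also run testpmd interactively 
(without --stats-period) and issue "show port xstats all"; if drops accumulate in 
a counter like rx_out_of_buffer (exact counter names depend on the PMD and 
firmware), that would point at the host/descriptor path rather than the wire.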

Then I provided 105 Mpps of 64-byte TCP packets from the other dual-port NIC to 
each port (210 Mpps in total). As described above, I can't get more than 210 Mpps 
in total from the generator. In both cases I was not able to get more than 75-85 
Mpps per port (150-170 Mpps in total) on the RX path. This contradicts the results 
in the Mellanox performance report (214 Mpps for both ports, i.e. 112 Mpps per 
port on the RX path). Running only a single generator gives me 148 Mpps on both 
the TX and RX sides. But after starting the generator on the second port, TX 
performance decreased to 105 Mpps per port (210 Mpps in total) and RX performance 
decreased to 75-85 Mpps per port (150-170 Mpps in total for both ports). Could 
these poor RX results be due to a not fully utilized generator (one cross-check I 
have in mind is sketched below), or should I be able to receive the full 210 Mpps 
that the generator provides across both ports? I applied all the system tuning 
suggestions described in the Mellanox performance report document.
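
As that cross-check on whether my custom generator is the limiter, I could drive 
the TX side with testpmd in txonly forwarding mode instead, along these lines 
(untested as written, parameters carried over from the RX run):

./dpdk-testpmd -l 64-127 -n 4 -a 0000:c1:00.0 -a 0000:c1:00.1 -- 
--forward-mode=txonly --txpkts=64 --stats-period 1 --nb-cores=16 --rxq=16 
--txq=16 --txd=4096 --burst=64

If this also tops out around 210 Mpps in total across both ports, the ceiling is 
on the NIC/host side rather than in my generator code.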
I would be grateful for any advice.

Thanks in advance!
