Hi Alex:

Thanks for your help.

I forgot to mention some details of my setup in my original post.

First, I have confirmed that my 82599 NICs are connected at PCIe Gen3 (Gen3 x8) speed.
The theoretical bandwidth can support over 160G in total.
Hence, the test should be able to reach full speed.
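(As a rough check: a Gen3 x8 link gives about 8 GT/s x 8 lanes x 128/130 encoding ≈ 63 Gb/s
per slot, and even a Gen2 x8 link gives about 5 GT/s x 8 x 8/10 ≈ 32 Gb/s, while a dual-port
10G card only needs ~20 Gb/s, so per-slot PCIe bandwidth should not be the limit.)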

Second, I have already checked the performance without DPDK (kernel path, using IRQ
balancing) at packet size 1518 in the same environment, and it does reach 160G in total.
So I was quite surprised to see this kind of result with DPDK (I also used 1518-byte
packets for the DPDK test).
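(Also, at 1518 bytes the packet rate is modest: 10 Gb/s / ((1518 + 20 bytes of preamble/IFG)
x 8 bits) ≈ 0.81 Mpps per port, about 13 Mpps for all 16 ports, so the per-core
packet-processing load at this frame size should be small.)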

BTW, I can already get 120G of throughput with 12 ports. But when I add more than 12
ports, I only get 100G.
Why does the total drop below 120G? Why do only 10 ports work fine, with no Tx or Rx
on the others?
Is this a bug or a limitation in DPDK?

Has anyone ever done a similar or the same test?


On 07/10/2014 04:40 PM, Alex Markuze wrote:
Hi Zachary,
Your issue may be with PCIe 3.0: with 16 lanes, each slot is limited to about 128 Gb/s [3].
Now, AFAIK [1] the CPU is connected to the I/O through a single PCIe slot.

Several thoughts that may help you:

1. You can figure out the max bandwidth by running netperf over the kernel interfaces
(without DPDK). Each CPU core can comfortably handle a netperf stream plus the completion
interrupts (64K messages and all offloads on) for a 10Gb NIC.
With more than 10 NICs I would disable the IRQ balancer and make sure interrupts are
spread evenly by setting the IRQ affinity manually [2]; a sketch of the commands follows
this list.
As long as you have one physical core (no hyperthreading) per NIC port, you can
figure out the max bandwidth you can get with all the NICs.

2. You can also try 40Gb and 56Gb NICs (Mellanox), if they are available to you. In that
case, to reach wire speed you will need to pin each netperf stream and its interrupts to
different cores, keeping both cores on the same NUMA node (check with lscpu); see the
example after this list.
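
For example (a rough sketch; the IRQ number, interface name, address and core IDs below are
placeholders, read the real ones from /proc/interrupts and lscpu on your box):

    # stop the IRQ balancer so it does not move interrupts around
    service irqbalance stop

    # find the NIC's IRQ numbers, then pin each one to a chosen core
    grep eth0 /proc/interrupts
    echo 4 > /proc/irq/123/smp_affinity   # hex CPU mask 0x4 = core 2; 123 is a placeholder IRQ

    # run netperf bound to a core on the same NUMA node as the NIC
    netperf -H 192.168.1.2 -T 3,3 -l 30   # -T binds local,remote netperf/netserver to CPUs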

Hope this helps.

[1] http://komposter.com.ua/documents/PCI_Express_Base_Specification_Revision_3.0.pdf
[2] http://h50146.www5.hp.com/products/software/oe/linux/mainstream/support/whitepaper/pdfs/4AA4-9294ENW.pdf
[3] http://en.wikipedia.org/wiki/PCI_Express#PCI_Express_3.x


On Thu, Jul 10, 2014 at 11:07 AM, Zachary Jen <Zachary.Jen at cas-well.com> wrote:
Hey Guys,

Recently, I have been using l2fwd to test 160G (82599 10G * 16 ports), but I
ran into a strange phenomenon in my test.

When I used 12 ports to test the performance of l2fwd, it worked fine
and achieved 120G.
But it behaves abnormally when I use more than 12 ports: some of the ports seem
to go wrong and have no Tx/Rx at all.
Does anyone know about this?

My testing environment:
1. E5-2658 v2 (10 cores) * 2
http://ark.intel.com/zh-tw/products/76160/Intel-Xeon-Processor-E5-2658-v2-25M-Cache-2_40-GHz
2. One core handles one port (in order to get the best performance).
3. No QPI-crossing issue.
4. l2fwd parameters (see the note on how the masks are read, after this list):
     4.1 -c 0xF0FF -- -P 0xF00FF  => 120G achieved!
     4.2 -c 0xFF0FF -- -P 0xFF0FF => Failed! Only the first 10 ports work.
     4.3 -c 0x3F3FF -- -P 0x3F3FF => Failed! Only the first 10 ports work.
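
(To make the masks above easier to read: EAL's -c option and l2fwd's port mask are plain
bit masks where bit N enables lcore N or port N, so for example 0xFFF covers indices 0-11
and 0xFFFF covers indices 0-15; it may be worth double-checking that each mask really
selects the lcores and ports you intend.)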

BTW, I have tried lots of parameter sets, and whenever I set the number of ports
above 12, only the first 10 ports work; otherwise everything works well.
Can anyone help me solve this issue? Or can DPDK only be configured with 12
or fewer ports?
Or is DPDK's maximum throughput 120G?




--
Best Regards,
Zachary Jen

Software RD
CAS-WELL Inc.
8th Floor, No. 242, Bo-Ai St., Shu-Lin City, Taipei County 238, Taiwan
Tel: +886-2-7705-8888#6305
Fax: +886-2-7731-9988

This email may contain confidential information. Please do not use or disclose it in any
way and delete it if you are not the intended recipient.
