[dpdk-dev] Performance degradation with multiple ports

2016-02-23 Thread Arnon Warshavsky
Hi Swamy

A somewhat similar degradation (though not with l2fwd) was experienced by
us, as described here:
http://dev.dpdk.narkive.com/OL0KiHns/dpdk-dev-missing-prefetch-in-non-vector-rx-function
In our case it surfaced because we were not using the default configuration
and were running in non-vector RX mode, and it behaved the same for both
ixgbe and i40e.
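
For illustration, below is a minimal sketch of the kind of data prefetch that
thread is about, shown here at the application level after rte_eth_rx_burst()
rather than inside the driver's RX routine. BURST_SIZE, PREFETCH_OFFSET and
handle_packet() are placeholders of mine, not code from l2fwd or the drivers:

/*
 * Sketch only: prefetch packet data a few mbufs ahead while processing
 * an RX burst, so the payload is already in cache when it is touched.
 * handle_packet() stands in for whatever forwarding work the app does.
 */
#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include <rte_prefetch.h>

#define BURST_SIZE       32
#define PREFETCH_OFFSET   3   /* how many packets ahead to prefetch */

static void
handle_packet(struct rte_mbuf *m)
{
        rte_pktmbuf_free(m);  /* placeholder for real forwarding work */
}

static void
rx_and_process(uint8_t port_id, uint16_t queue_id)
{
        struct rte_mbuf *pkts[BURST_SIZE];
        uint16_t nb_rx, i;

        nb_rx = rte_eth_rx_burst(port_id, queue_id, pkts, BURST_SIZE);

        /* Warm the cache for the first few packets of the burst. */
        for (i = 0; i < nb_rx && i < PREFETCH_OFFSET; i++)
                rte_prefetch0(rte_pktmbuf_mtod(pkts[i], void *));

        for (i = 0; i < nb_rx; i++) {
                /* Prefetch a later packet while handling the current one. */
                if (i + PREFETCH_OFFSET < nb_rx)
                        rte_prefetch0(rte_pktmbuf_mtod(pkts[i + PREFETCH_OFFSET],
                                                       void *));
                handle_packet(pkts[i]);
        }
}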

/Arnon

On Tue, Feb 23, 2016 at 5:24 AM, SwamZ  wrote:

> Hi,
>
>  I am trying to find the maximum IO-core performance of the DPDK 2.2 code
> using the l2fwd application. I got the following numbers in comparison with
> the DPDK 1.7 code.
>
>
>             One Port              Two ports
>
>  DPDK 2.2   14.86 Mpps per port   11.8 Mpps per port
>
>  DPDK 1.7   11.8 Mpps per port    11.8 Mpps per port
>
>
>
> Traffic rate from Router tester: 64-byte packets at 100% line rate
> (14.86 Mpps per port)
>
> CPU Speed : 3.3GHz
>
> NIC   : 82599ES 10-Gigabit
>
> IO Virtualization: SR-IOV
>
> Command used: ./l2fwd -c 3 -w 0000:02:00.1 -w 0000:02:00.0 -- -p 3 -T 1
>
>
> Note:
>
>  - Both ports are on the same NUMA node. I got the same results with full
> CPU cores as well as hyper-threaded cores.
>
>  - PCIe speed is the same for both ports. The lspci and other relevant
> output is attached.
>
>  - In the two-port case, each core was receiving only 11.8 Mpps, which
> suggests that RX is the bottleneck.
>
>
> Questions:
>
>  1) In the two-port case I am getting only 11.8 Mpps per port, compared to
> line rate in the single-port case. What could be the reason for this
> performance degradation? I searched the DPDK mail archive and found the
> following thread on a similar issue, but couldn't conclude anything from it.
>
> http://dpdk.org/ml/archives/dev/2013-May/000115.html
>
>
>  2) Has anybody tried this kind of performance test with an i40e NIC?
>
>
> Thanks,
>
> Swamy
>


[dpdk-dev] Performance degradation with multiple ports

2016-02-22 Thread SwamZ
Hi,

 I am trying to find the maximum IO-core performance of the DPDK 2.2 code
using the l2fwd application. I got the following numbers in comparison with
the DPDK 1.7 code.


             One Port              Two ports

 DPDK 2.2    14.86 Mpps per port   11.8 Mpps per port

 DPDK 1.7    11.8 Mpps per port    11.8 Mpps per port



Traffic rate from Router tester: 64-byte packets at 100% line rate
(14.86 Mpps per port)

CPU Speed : 3.3GHz

NIC   : 82599ES 10-Gigabit

IO Virtualization: SR-IOV

Command used: ./l2fwd -c 3 -w 0000:02:00.1 -w 0000:02:00.0 -- -p 3 -T 1
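
For reference (this arithmetic is mine, not from the original report), the
64-byte line rate on 10 GbE follows from the framing overhead:

    on-wire size = 64-byte frame + 8-byte preamble + 12-byte inter-frame gap
                 = 84 bytes
    line rate    = 10,000,000,000 bit/s / (84 * 8 bit) ~= 14.88 Mpps per port

which is consistent with the ~14.86 Mpps per port quoted above.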


Note:

 - Both ports are on the same NUMA node. I got the same results with full
CPU cores as well as hyper-threaded cores.

 - PCIe speed is the same for both ports. The lspci and other relevant
output is attached.

 - In the two-port case, each core was receiving only 11.8 Mpps, which
suggests that RX is the bottleneck (see the counters sketch after these
notes).
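
One way to confirm where packets are being lost is to watch the per-port
counters with the standard rte_eth_stats_get() API. Below is a minimal
sketch; the helper name is mine and it is not part of l2fwd. A growing
imissed count means the NIC ran out of free RX descriptors (the RX core or
the NIC/PCIe path cannot keep up), while rx_nombuf points at mbuf pool
exhaustion instead.

#include <stdio.h>
#include <inttypes.h>
#include <rte_ethdev.h>

/* Sketch only: dump the RX counters that show where packets are dropped. */
static void
print_rx_drop_stats(uint8_t port_id)
{
        struct rte_eth_stats stats;

        rte_eth_stats_get(port_id, &stats);
        printf("port %u: ipackets=%" PRIu64 " imissed=%" PRIu64
               " ierrors=%" PRIu64 " rx_nombuf=%" PRIu64 "\n",
               (unsigned)port_id, stats.ipackets, stats.imissed,
               stats.ierrors, stats.rx_nombuf);
}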


Questions:

 1) In the two-port case I am getting only 11.8 Mpps per port, compared to
line rate in the single-port case. What could be the reason for this
performance degradation? I searched the DPDK mail archive and found the
following thread on a similar issue, but couldn't conclude anything from it.

http://dpdk.org/ml/archives/dev/2013-May/000115.html


 2) Has anybody tried this kind of performance test with an i40e NIC?


Thanks,

Swamy
-- next part --

lspci output for the two NICs:

02:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ 
Network Connection (rev 01)
Subsystem: Intel Corporation Ethernet Server Adapter X520-2
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
[Remaining lspci output and core details are removed]