[dpdk-dev] [PATCH] kni:fix build on Ubuntu 12.04.5 with current HWE

2014-09-30 Thread Thomas Monjalon
2014-09-30 13:10, Daniel Mrzyglod:
> Recent Ubuntu 12.04.5 LTS is shipped with 3.13.0-36.63 as the only supported 
> kernel.
> 
> Patch a09b359daca3d8af43dc22a57b34cf317f958236 describes the problem.
> 
> Signed-off-by: Daniel Mrzyglod 

>  #if ( LINUX_VERSION_CODE < KERNEL_VERSION(3,14,0) )
>  #if (!(RHEL_RELEASE_CODE && RHEL_RELEASE_CODE >= RHEL_RELEASE_VERSION(7,0)))
> -#if (!(UBUNTU_RELEASE_CODE == UBUNTU_RELEASE_VERSION(14,4) && 
> UBUNTU_KERNEL_CODE >= UBUNTU_KERNEL_VERSION(3,13,0,30,54)))
> +#if (!((UBUNTU_RELEASE_CODE == UBUNTU_RELEASE_VERSION(14,4) \
> +|| UBUNTU_RELEASE_CODE == UBUNTU_RELEASE_VERSION(12,4)) && 
> UBUNTU_KERNEL_CODE >= UBUNTU_KERNEL_VERSION(3,13,0,30,54)))
>  #ifdef NETIF_F_RXHASH

Reordered the conditions and applied.

Thanks
-- 
Thomas


[dpdk-dev] [PATCH] ixgbe: Fix clang compilation issue

2014-09-30 Thread Thomas Monjalon
> > Issue reported by Keith Wiles.
> > Clang fails with an error about a variable being used uninitialized:
> > 
> >  CC ixgbe_rxtx_vec.o
> > /home/keithw/projects/dpdk-code/org-dpdk/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c:67:30:
> > error: variable 'dma_addr0' is uninitialized
> >  when used here [-Werror,-Wuninitialized]
> >dma_addr0 = _mm_xor_si128(dma_addr0, dma_addr0);
> >  ^
> > 
> > This error can be fixed by replacing the call to xor which
> > takes two parameters, by a call to setzero, which does not take any.
> > 
> > Signed-off-by: Bruce Richardson 
> 
> Acked-by: Keith Wiles 

Acked and applied

Thanks
-- 
Thomas
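
For illustration, the difference between the two intrinsics in the ixgbe fix
above can be sketched as follows. This is a minimal standalone sketch, not the
ixgbe code: zero_vector() and is_all_zero() are hypothetical helper names, and
it assumes an x86 compiler with SSE2 support.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return 1 if all 16 bytes of v are zero, else 0. */
static int is_all_zero(__m128i v)
{
	static const uint8_t zeros[16] = {0};
	uint8_t bytes[16];

	_mm_storeu_si128((__m128i *)bytes, v);
	return memcmp(bytes, zeros, sizeof(bytes)) == 0;
}

/*
 * Hypothetical helper: produce a zeroed vector without reading any input.
 * The xor idiom, v = _mm_xor_si128(v, v), passes v as an argument before
 * it has been given a value, which is what clang's -Wuninitialized reports.
 */
static __m128i zero_vector(void)
{
	return _mm_setzero_si128();
}
```

Both forms are meant to yield an all-zero register; the setzero form simply
has no operand that the compiler could consider uninitialized.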


[dpdk-dev] [PATCH 2/2] librte_pmd_null: Enable librte_pmd_null

2014-09-30 Thread muk...@igel.co.jp
From: Tetsuya Mukawa 

Signed-off-by: Tetsuya Mukawa 
---
 mk/rte.app.mk | 4 
 1 file changed, 4 insertions(+)

diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 34dff2a..f059290 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -179,6 +179,10 @@ LDLIBS += -lrte_pmd_xenvirt
 LDLIBS += -lxenstore
 endif

+ifeq ($(CONFIG_RTE_LIBRTE_PMD_NULL),y)
+LDLIBS += -lrte_pmd_null
+endif
+
 ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),n)
 # plugins (link only if static libraries)

-- 
1.9.1



[dpdk-dev] [PATCH 1/2] librte_pmd_null: Add null PMD

2014-09-30 Thread muk...@igel.co.jp
From: Tetsuya Mukawa 

The 'null PMD' is a driver for a virtual device designed specifically to measure
the performance of DPDK PMDs. When an application calls rx, the null PMD just
allocates mbufs and returns them. On tx, the PMD just frees the mbufs.

The PMD has the following options.
- size: specify the packet size allocated by RX. The default packet size is 64.
- copy: specify 1 or 0 to enable or disable copying during RX and TX.
The default value is 0 (disabled).
This option is used to emulate more realistic data transfer.
The copy size is equal to the packet size.
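
The behaviour described above can be modelled in plain C. This is only a toy
sketch under stated assumptions: it uses malloc/free in place of the DPDK mbuf
pool, and struct toy_buf, struct toy_null_dev, toy_null_rx() and toy_null_tx()
are hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an mbuf; not the DPDK struct rte_mbuf. */
struct toy_buf {
	uint16_t len;
	uint8_t data[2048];
};

/* Hypothetical device settings mirroring the "size" and "copy" options. */
struct toy_null_dev {
	uint16_t packet_size;  /* "size" option, 64 by default */
	int copy;              /* "copy" option, 0 (disabled) by default */
	uint8_t dummy[2048];   /* source and sink for the optional copy */
};

/* RX: allocate up to nb buffers, optionally filling them from dummy[]. */
static uint16_t toy_null_rx(struct toy_null_dev *dev,
			    struct toy_buf **bufs, uint16_t nb)
{
	uint16_t i;

	assert(dev->packet_size <= sizeof(dev->dummy));
	for (i = 0; i < nb; i++) {
		bufs[i] = malloc(sizeof(*bufs[i]));
		if (bufs[i] == NULL)
			break;
		if (dev->copy)
			memcpy(bufs[i]->data, dev->dummy, dev->packet_size);
		bufs[i]->len = dev->packet_size;
	}
	return i;  /* number of buffers actually handed to the caller */
}

/* TX: optionally copy the payload out, then free every buffer. */
static uint16_t toy_null_tx(struct toy_null_dev *dev,
			    struct toy_buf **bufs, uint16_t nb)
{
	uint16_t i;

	for (i = 0; i < nb; i++) {
		if (dev->copy)
			memcpy(dev->dummy, bufs[i]->data, bufs[i]->len);
		free(bufs[i]);
		bufs[i] = NULL;
	}
	return i;
}
```

With copy disabled, the only per-packet work left is allocation and release,
which is presumably why such a device is useful as a measurement baseline.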

Signed-off-by: Tetsuya Mukawa 
---
 config/common_bsdapp   |   5 +
 config/common_linuxapp |   5 +
 lib/Makefile   |   1 +
 lib/librte_pmd_null/Makefile   |  58 +
 lib/librte_pmd_null/rte_eth_null.c | 474 +
 5 files changed, 543 insertions(+)
 create mode 100644 lib/librte_pmd_null/Makefile
 create mode 100644 lib/librte_pmd_null/rte_eth_null.c

diff --git a/config/common_bsdapp b/config/common_bsdapp
index eebd05b..bda37f5 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -224,6 +224,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=y
 CONFIG_RTE_LIBRTE_PMD_BOND=y

 #
+# Compile null PMD
+#
+CONFIG_RTE_LIBRTE_PMD_NULL=y
+
+#
 # Do prefetch of packet data within PMD driver receive function
 #
 CONFIG_RTE_PMD_PACKET_PREFETCH=y
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 4713eb4..66d2ce1 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -252,6 +252,11 @@ CONFIG_RTE_LIBRTE_PMD_BOND=y
 CONFIG_RTE_LIBRTE_PMD_XENVIRT=n

 #
+# Compile null PMD
+#
+CONFIG_RTE_LIBRTE_PMD_NULL=y
+
+#
 # Do prefetch of packet data within PMD driver receive function
 #
 CONFIG_RTE_PMD_PACKET_PREFETCH=y
diff --git a/lib/Makefile b/lib/Makefile
index 10c5bb3..61d6ed1 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -50,6 +50,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += librte_pmd_pcap
 DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += librte_pmd_virtio
 DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += librte_pmd_null
 DIRS-$(CONFIG_RTE_LIBRTE_HASH) += librte_hash
 DIRS-$(CONFIG_RTE_LIBRTE_LPM) += librte_lpm
 DIRS-$(CONFIG_RTE_LIBRTE_ACL) += librte_acl
diff --git a/lib/librte_pmd_null/Makefile b/lib/librte_pmd_null/Makefile
new file mode 100644
index 000..e017918
--- /dev/null
+++ b/lib/librte_pmd_null/Makefile
@@ -0,0 +1,58 @@
+#   BSD LICENSE
+#
+#   Copyright (C) 2014 Nippon Telegraph and Telephone Corporation.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_null.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += rte_eth_null.c
+
+#
+# Export include files
+#
+SYMLINK-y-include +=
+
+# this lib depends upon:
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += lib/librte_mbuf
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += lib/librte_ether
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += lib/librte_malloc
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += lib/librte_kvargs
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_pmd_null/rte_eth_null.c 
b/lib/librte_pmd_null/rte_eth_null.c
new file mode 100644
index 000..7ecdd17
--- /dev/null
+++ b/lib/librte_pmd_null/rte_eth_null.c
@@ -0,0 

[dpdk-dev] GSO support by PMD drivers

2014-09-30 Thread Vadim Suraev
Thank you, Olivier
On Sep 30, 2014 5:42 PM, "Olivier MATZ"  wrote:

> Hello Alex, Vadim,
>
> On 09/28/2014 09:19 AM, Alex Markuze wrote:
>
>> LSO/TSO support is an important feature, I'm surprised it's not
>> supported in DPDK.
>> I personally would like to see these patches.
>>
>> On Fri, Sep 26, 2014 at 1:23 PM, Vadim Suraev 
>> wrote:
>>
>>> Hi, all,
>>> I found that ixgbe coupled with rte_mbuf (and probably other PMD drivers)
>>> doesn't support GSO. I reverse engineered the Linux kernel's ixgbe GSO
>>> support and got it working in 1.6. Would it be useful to provide the
>>> patch?
>>>
>>
> I already submitted a patch providing segmentation offload a few months
> ago:
> http://dpdk.org/ml/archives/dev/2014-May/002537.html
>
> A part of this series has been reworked by Bruce and is now integrated
> in head. I think rebasing the rest of the series should not be too
> difficult since Bruce's work is based on this series. Unfortunately,
> I won't have time to do the rebase by myself in the coming weeks, but
> feel free to do it if you need this feature.
>
> Regards,
> Olivier
>
>


[dpdk-dev] Hi all, does Amazon VMs supported DPDK or not?

2014-09-30 Thread Saha, Avik (AWS)
I have only experimented with C3.8xlarge instances with DPDK. You have to 
attach at least 2 ENIs to the instances since one of them would be taken over 
by DPDK. Depending on the instance size, you can attach 8, 16 or 32 ENIs to 
the instance (these will be visible as ethN devices in ifconfig).

Let me know if this helps (or I am not getting the question :) )

Avik

-----Original Message-----
From: Patel, Rashmin N [mailto:rashmin.n.pa...@intel.com] 
Sent: Monday, September 29, 2014 1:54 PM
To: Wang, Shawn; Dong, Binghua; dev at dpdk.org; Saha, Avik (AWS)
Subject: RE: Hi all, does Amazon VMs supported DPDK or not?

Hi Shawn,

Which network interface is visible to the VM? I mean, which virtual 
ethernet port is used in the Amazon-VM-DPDK app? And which interfaces are 
offered based on the VM size and requirements?

Thanks,
Rashmin

-----Original Message-----
From: Wang, Shawn [mailto:xing...@amazon.com] 
Sent: Monday, September 29, 2014 1:50 PM
To: Dong, Binghua; Patel, Rashmin N; dev at dpdk.org; Saha, Avik (AWS)
Subject: RE: Hi all, does Amazon VMs supported DPDK or not?

Yes, you can.

From my colleague Saha, Avik: they are running Intel DPDK 1.7 on c3.8xlarge instances.

Thanks.

From: dev [dev-bounces at dpdk.org] on behalf of Dong, Binghua 
[binghua.d...@intel.com]
Sent: Saturday, September 27, 2014 10:05 PM
To: Patel, Rashmin N; dev at dpdk.org
Subject: Re: [dpdk-dev] Hi all,  does Amazon VMs supported DPDK or not?

Hi Patel,

The customer considers deploying DPDK applications in Amazon VMs to be very 
flexible and an easy way to deploy across global sites.

For example, they only need to buy one 2-lcore VM if a site needs only 200Mbps 
of throughput, or one 4-lcore VM if the throughput is 400Mbps.

They can also buy VMs at different Amazon sites (US, Germany, ...) for lower access latency.

-----Original Message-----
From: Patel, Rashmin N
Sent: Saturday, September 27, 2014 12:41 AM
To: Dong, Binghua; dev at dpdk.org
Subject: RE: Hi all, does Amazon VMs supported DPDK or not?

It really depends on the devices offered in the VM. If direct device assignment 
is not provided to a VM or if the node hypervisor doesn't have an optimized 
para-virtual interface to a VM, I don't see any benefit using DPDK in VMs.

-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Dong, Binghua
Sent: Friday, September 26, 2014 5:47 AM
To: dev at dpdk.org
Subject: [dpdk-dev] Hi all, does Amazon VMs supported DPDK or not?

A customer plans to buy Amazon VMs to run their DPDK 1.3-based VPN applications 
(they will upgrade to DPDK 1.6 or 1.7) at sites around the world.

Thanks a lot;



[dpdk-dev] [memnic PATCH v2 0/7] MEMNIC PMD performance improvement

2014-09-30 Thread Thomas Monjalon
> This patchset improves MEMNIC PMD performance.
> 
> Hiroshi Shimamoto (7):
>   guest: memnic-tester: PMD benchmark in guest
>   pmd: remove needless assignment
>   pmd: use helper macros
>   pmd: use compiler barrier
>   pmd: packet receiving optimization with prefetch
>   pmd: add branch hint in recv/xmit
>   pmd: burst mbuf freeing in xmit

Applied with Huawei's wording comment.

If there are no more patches, it will be tagged v1.3 at the end
of the week.

Thanks
-- 
Thomas


[dpdk-dev] [PATCH RFC] mbuf: Adjust TX flags to start at bit 32

2014-09-30 Thread Bruce Richardson
On Tue, Sep 30, 2014 at 05:00:04PM +0100, Wiles, Roger Keith wrote:
> Hi Bruce,
> 
> I like the idea of the split, which should make it easier to do the testing 
> of those bits.
> One comment below.
> 
> On Sep 30, 2014, at 10:33 AM, Bruce Richardson  intel.com> wrote:
> 
> > On Tue, Sep 30, 2014 at 04:26:02PM +0100, Bruce Richardson wrote:
> >> This patch takes the existing TX flags defined for the mbuf and shifts
> >> each uniquely defined one left so that additional RX flags can be
> >> defined without having RX and TX flags mixed together.
> >>
> >> Signed-off-by: Bruce Richardson 
> >> ---
> >
> > This is just an RFC patch for now, as I'm looking for input to make sure
> > this is done right. Couple of opens, if people have input:
> > * is a 32/32 split for RX/TX flags appropriate? Are we likely to have about
> >  equal numbers of each?
> > * Doing a grep for the TX flag use, it seems the defines are commonly used,
> >  but if anyone is aware of anywhere where the code depends on the flags
> > having a particular value, please let me know.
> >
> > If I have time, I also hope to look at doing rework on the testpmd flag
> > handling based off Olivier's previous patches, but since that is not
> > affecting the public ABI, I consider it a bit lower priority.
> >
> > Thanks,
> > /Bruce
> >
> >> lib/librte_mbuf/rte_mbuf.h | 26 +-
> >> 1 file changed, 13 insertions(+), 13 deletions(-)
> >>
> >> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> >> index 1c6e115..c9fc4ec 100644
> >> --- a/lib/librte_mbuf/rte_mbuf.h
> >> +++ b/lib/librte_mbuf/rte_mbuf.h
> >> @@ -86,26 +86,26 @@ extern "C" {
> >> #define PKT_RX_IEEE1588_PTP  0x0200 /**< RX IEEE1588 L2 Ethernet PT 
> >> Packet. */
> >> #define PKT_RX_IEEE1588_TMST 0x0400 /**< RX IEEE1588 L2/L4 timestamped 
> >> packet.*/
> >>
> >> -#define PKT_TX_VLAN_PKT  0x0800 /**< TX packet is a 802.1q VLAN 
> >> packet. */
> >> -#define PKT_TX_IP_CKSUM  0x1000 /**< IP cksum of TX pkt. computed by 
> >> NIC. */
> >> -#define PKT_TX_IPV4_CSUM 0x1000 /**< Alias of PKT_TX_IP_CKSUM. */
> >> -#define PKT_TX_IPV4  PKT_RX_IPV4_HDR /**< IPv4 with no IP 
> >> checksum offload. */
> >> -#define PKT_TX_IPV6  PKT_RX_IPV6_HDR /**< IPv6 packet */
> >> +#define PKT_TX_VLAN_PKT  (0x0001 << 32) /**< TX packet is a 802.1q VLAN 
> >> packet. */
> >> +#define PKT_TX_IP_CKSUM  (0x0002 << 32) /**< IP cksum of TX pkt. computed 
> >> by NIC. */
> 
> One little nit in the patch: does (0x0001 << 32) need to be (0x0001ULL << 
> 32)? I have not tested it; it is just a thought.

Yes, indeed it does, good catch!

/Bruce
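
To make the point concrete: in C, an unsuffixed constant such as 0x0001 has
type int, and shifting a 32-bit int left by 32 is undefined behaviour, so the
constant has to be widened before the shift. A minimal sketch, using
hypothetical EX_-prefixed names rather than the real PKT_TX_* defines:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical flag names for illustration only. The ULL suffix makes the
 * left operand 64 bits wide, so a shift by 32 is well defined.
 */
#define EX_TX_VLAN_PKT  (0x0001ULL << 32)
#define EX_TX_IP_CKSUM  (0x0002ULL << 32)

/* Return nonzero if any flag at or above bit 32 is set. */
static int has_upper_flags(uint64_t ol_flags)
{
	return (ol_flags >> 32) != 0;
}
```

Compilers typically warn about the unsuffixed form ("shift count >= width of
type"), which is a quick way to find any define that was missed.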

> 
> Thanks
> ++Keith
> >> +#define PKT_TX_IPV4_CSUM PKT_TX_IP_CKSUM /**< Alias of PKT_TX_IP_CKSUM. */
> >> +#define PKT_TX_IPV4  PKT_RX_IPV4_HDR /**< IPv4 with no IP checksum 
> >> offload. */
> >> +#define PKT_TX_IPV6  PKT_RX_IPV6_HDR /**< IPv6 packet */
> >>
> >> /*
> >> - * Bit 14~13 used for L4 packet type with checksum enabled.
> >> + * Bit 35~34 used for L4 packet type with checksum enabled.
> >>  * 00: Reserved
> >>  * 01: TCP checksum
> >>  * 10: SCTP checksum
> >>  * 11: UDP checksum
> >>  */
> >> -#define PKT_TX_L4_MASK   0x6000 /**< Mask bits for L4 checksum 
> >> offload request. */
> >> -#define PKT_TX_L4_NO_CKSUM   0x /**< Disable L4 cksum of TX pkt. */
> >> -#define PKT_TX_TCP_CKSUM 0x2000 /**< TCP cksum of TX pkt. computed by 
> >> NIC. */
> >> -#define PKT_TX_SCTP_CKSUM0x4000 /**< SCTP cksum of TX pkt. computed 
> >> by NIC. */
> >> -#define PKT_TX_UDP_CKSUM 0x6000 /**< UDP cksum of TX pkt. computed by 
> >> NIC. */
> >> -/* Bit 15 */
> >> -#define PKT_TX_IEEE1588_TMST 0x8000 /**< TX IEEE1588 packet to timestamp. 
> >> */
> >> +#define PKT_TX_L4_NO_CKSUM   (0x << 32) /**< Disable L4 cksum of TX 
> >> pkt. */
> >> +#define PKT_TX_TCP_CKSUM (0x0004 << 32) /**< TCP cksum of TX pkt. 
> >> computed by NIC. */
> >> +#define PKT_TX_SCTP_CKSUM(0x0008 << 32) /**< SCTP cksum of TX pkt. 
> >> computed by NIC. */
> >> +#define PKT_TX_UDP_CKSUM (0x000C << 32) /**< UDP cksum of TX pkt. 
> >> computed by NIC. */
> >> +#define PKT_TX_L4_MASK   (0x000C << 32) /**< Mask for L4 cksum 
> >> offload request. */
> >> +/* Bit 36 */
> >> +#define PKT_TX_IEEE1588_TMST (0x0010 << 32) /**< TX IEEE1588 packet to 
> >> timestamp. */
> >>
> >> /* Use final bit of flags to indicate a control mbuf */
> >> #define CTRL_MBUF_FLAG   (1ULL << 63)
> >> --
> >> 1.9.3
> >>
> 
> Keith Wiles, Principal Technologist with CTO office, Wind River mobile 
> 972-213-5533
> 


[dpdk-dev] [memnic PATCH v2 6/7] pmd: add branch hint in recv/xmit

2014-09-30 Thread Thomas Monjalon
2014-09-30 14:38, Xie, Huawei:
> > -   if (++next >= MEMNIC_NR_PACKET)
> > +   if (unlikely(++next >= MEMNIC_NR_PACKET))
> 
> On IA, the compiler can use add, cmp and cmov to avoid the branch.
> But if MEMNIC_NR_PACKET is always a power of 2, 
>   it is better to just use next = (next + 1) & (MEMNIC_NR_PACKET - 1)

Power of 2 is not enforced for MEMNIC_NR_PACKET.

-- 
Thomas
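
The two ways of advancing a ring index discussed above can be compared in a
small sketch. The ring sizes and function names here are hypothetical; the
actual value of MEMNIC_NR_PACKET is not stated in this thread.

```c
#include <assert.h>

/* Hypothetical ring sizes, chosen only to contrast the two cases. */
#define RING_SIZE_ANY   1000u  /* not a power of 2 */
#define RING_SIZE_POW2  1024u  /* power of 2 */

/* Works for any ring size: increment, compare, reset. */
static unsigned next_index_cmp(unsigned idx, unsigned size)
{
	if (++idx >= size)
		idx = 0;
	return idx;
}

/* Branch-free, but only correct when size is a power of 2. */
static unsigned next_index_mask(unsigned idx, unsigned size)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	return (idx + 1) & (size - 1);
}
```

With a size of 1000, the mask expression would map index 999 to 992 instead
of 0, which is why the mask form is only safe once the power-of-2 property is
enforced.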


[dpdk-dev] GSO support by PMD drivers

2014-09-30 Thread Olivier MATZ
Hello Alex, Vadim,

On 09/28/2014 09:19 AM, Alex Markuze wrote:
> LSO/TSO support is an important feature, I'm surprised it's not
> supported in DPDK.
> I personally would like to see these patches.
>
> On Fri, Sep 26, 2014 at 1:23 PM, Vadim Suraev  
> wrote:
>> Hi, all,
>> I found that ixgbe coupled with rte_mbuf (and probably other PMD drivers)
>> doesn't support GSO. I reverse engineered the Linux kernel's ixgbe GSO
>> support and got it working in 1.6. Would it be useful to provide the patch?

I already submitted a patch providing segmentation offload a few months
ago:
http://dpdk.org/ml/archives/dev/2014-May/002537.html

A part of this series has been reworked by Bruce and is now integrated
in head. I think rebasing the rest of the series should not be too
difficult since Bruce's work is based on this series. Unfortunately,
I won't have time to do the rebase by myself in the coming weeks, but
feel free to do it if you need this feature.

Regards,
Olivier



[dpdk-dev] [PATCH RFC] mbuf: Adjust TX flags to start at bit 32

2014-09-30 Thread Bruce Richardson
On Tue, Sep 30, 2014 at 04:26:02PM +0100, Bruce Richardson wrote:
> This patch takes the existing TX flags defined for the mbuf and shifts
> each uniquely defined one left so that additional RX flags can be
> defined without having RX and TX flags mixed together.
> 
> Signed-off-by: Bruce Richardson 
> ---

This is just an RFC patch for now, as I'm looking for input to make sure 
this is done right. Couple of opens, if people have input:
* is a 32/32 split for RX/TX flags appropriate? Are we likely to have about 
  equal numbers of each?
* Doing a grep for the TX flag use, it seems the defines are commonly used, 
  but if anyone is aware of anywhere where the code depends on the flags  
having a particular value, please let me know.

If I have time, I also hope to look at doing rework on the testpmd flag 
handling based off Olivier's previous patches, but since that is not 
affecting the public ABI, I consider it a bit lower priority.

Thanks,
/Bruce

>  lib/librte_mbuf/rte_mbuf.h | 26 +-
>  1 file changed, 13 insertions(+), 13 deletions(-)
> 
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 1c6e115..c9fc4ec 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -86,26 +86,26 @@ extern "C" {
>  #define PKT_RX_IEEE1588_PTP  0x0200 /**< RX IEEE1588 L2 Ethernet PT Packet. 
> */
>  #define PKT_RX_IEEE1588_TMST 0x0400 /**< RX IEEE1588 L2/L4 timestamped 
> packet.*/
>  
> -#define PKT_TX_VLAN_PKT  0x0800 /**< TX packet is a 802.1q VLAN packet. 
> */
> -#define PKT_TX_IP_CKSUM  0x1000 /**< IP cksum of TX pkt. computed by 
> NIC. */
> -#define PKT_TX_IPV4_CSUM 0x1000 /**< Alias of PKT_TX_IP_CKSUM. */
> -#define PKT_TX_IPV4  PKT_RX_IPV4_HDR /**< IPv4 with no IP checksum 
> offload. */
> -#define PKT_TX_IPV6  PKT_RX_IPV6_HDR /**< IPv6 packet */
> +#define PKT_TX_VLAN_PKT  (0x0001 << 32) /**< TX packet is a 802.1q VLAN 
> packet. */
> +#define PKT_TX_IP_CKSUM  (0x0002 << 32) /**< IP cksum of TX pkt. computed by 
> NIC. */
> +#define PKT_TX_IPV4_CSUM PKT_TX_IP_CKSUM /**< Alias of PKT_TX_IP_CKSUM. */
> +#define PKT_TX_IPV4  PKT_RX_IPV4_HDR /**< IPv4 with no IP checksum 
> offload. */
> +#define PKT_TX_IPV6  PKT_RX_IPV6_HDR /**< IPv6 packet */
>  
>  /*
> - * Bit 14~13 used for L4 packet type with checksum enabled.
> + * Bit 35~34 used for L4 packet type with checksum enabled.
>   * 00: Reserved
>   * 01: TCP checksum
>   * 10: SCTP checksum
>   * 11: UDP checksum
>   */
> -#define PKT_TX_L4_MASK   0x6000 /**< Mask bits for L4 checksum offload 
> request. */
> -#define PKT_TX_L4_NO_CKSUM   0x /**< Disable L4 cksum of TX pkt. */
> -#define PKT_TX_TCP_CKSUM 0x2000 /**< TCP cksum of TX pkt. computed by 
> NIC. */
> -#define PKT_TX_SCTP_CKSUM0x4000 /**< SCTP cksum of TX pkt. computed by 
> NIC. */
> -#define PKT_TX_UDP_CKSUM 0x6000 /**< UDP cksum of TX pkt. computed by 
> NIC. */
> -/* Bit 15 */
> -#define PKT_TX_IEEE1588_TMST 0x8000 /**< TX IEEE1588 packet to timestamp. */
> +#define PKT_TX_L4_NO_CKSUM   (0x << 32) /**< Disable L4 cksum of TX pkt. 
> */
> +#define PKT_TX_TCP_CKSUM (0x0004 << 32) /**< TCP cksum of TX pkt. 
> computed by NIC. */
> +#define PKT_TX_SCTP_CKSUM(0x0008 << 32) /**< SCTP cksum of TX pkt. 
> computed by NIC. */
> +#define PKT_TX_UDP_CKSUM (0x000C << 32) /**< UDP cksum of TX pkt. 
> computed by NIC. */
> +#define PKT_TX_L4_MASK   (0x000C << 32) /**< Mask for L4 cksum offload 
> request. */
> +/* Bit 36 */
> +#define PKT_TX_IEEE1588_TMST (0x0010 << 32) /**< TX IEEE1588 packet to 
> timestamp. */
>  
>  /* Use final bit of flags to indicate a control mbuf */
>  #define CTRL_MBUF_FLAG   (1ULL << 63)
> -- 
> 1.9.3
> 


[dpdk-dev] [PATCH RFC] mbuf: Adjust TX flags to start at bit 32

2014-09-30 Thread Bruce Richardson
This patch takes the existing TX flags defined for the mbuf and shifts
each uniquely defined one left so that additional RX flags can be
defined without having RX and TX flags mixed together.

Signed-off-by: Bruce Richardson 
---
 lib/librte_mbuf/rte_mbuf.h | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 1c6e115..c9fc4ec 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -86,26 +86,26 @@ extern "C" {
 #define PKT_RX_IEEE1588_PTP  0x0200 /**< RX IEEE1588 L2 Ethernet PT Packet. */
 #define PKT_RX_IEEE1588_TMST 0x0400 /**< RX IEEE1588 L2/L4 timestamped 
packet.*/

-#define PKT_TX_VLAN_PKT  0x0800 /**< TX packet is a 802.1q VLAN packet. */
-#define PKT_TX_IP_CKSUM  0x1000 /**< IP cksum of TX pkt. computed by NIC. 
*/
-#define PKT_TX_IPV4_CSUM 0x1000 /**< Alias of PKT_TX_IP_CKSUM. */
-#define PKT_TX_IPV4  PKT_RX_IPV4_HDR /**< IPv4 with no IP checksum 
offload. */
-#define PKT_TX_IPV6  PKT_RX_IPV6_HDR /**< IPv6 packet */
+#define PKT_TX_VLAN_PKT  (0x0001 << 32) /**< TX packet is a 802.1q VLAN 
packet. */
+#define PKT_TX_IP_CKSUM  (0x0002 << 32) /**< IP cksum of TX pkt. computed by 
NIC. */
+#define PKT_TX_IPV4_CSUM PKT_TX_IP_CKSUM /**< Alias of PKT_TX_IP_CKSUM. */
+#define PKT_TX_IPV4  PKT_RX_IPV4_HDR /**< IPv4 with no IP checksum 
offload. */
+#define PKT_TX_IPV6  PKT_RX_IPV6_HDR /**< IPv6 packet */

 /*
- * Bit 14~13 used for L4 packet type with checksum enabled.
+ * Bit 35~34 used for L4 packet type with checksum enabled.
  * 00: Reserved
  * 01: TCP checksum
  * 10: SCTP checksum
  * 11: UDP checksum
  */
-#define PKT_TX_L4_MASK   0x6000 /**< Mask bits for L4 checksum offload 
request. */
-#define PKT_TX_L4_NO_CKSUM   0x /**< Disable L4 cksum of TX pkt. */
-#define PKT_TX_TCP_CKSUM 0x2000 /**< TCP cksum of TX pkt. computed by NIC. 
*/
-#define PKT_TX_SCTP_CKSUM0x4000 /**< SCTP cksum of TX pkt. computed by 
NIC. */
-#define PKT_TX_UDP_CKSUM 0x6000 /**< UDP cksum of TX pkt. computed by NIC. 
*/
-/* Bit 15 */
-#define PKT_TX_IEEE1588_TMST 0x8000 /**< TX IEEE1588 packet to timestamp. */
+#define PKT_TX_L4_NO_CKSUM   (0x << 32) /**< Disable L4 cksum of TX pkt. */
+#define PKT_TX_TCP_CKSUM (0x0004 << 32) /**< TCP cksum of TX pkt. computed 
by NIC. */
+#define PKT_TX_SCTP_CKSUM(0x0008 << 32) /**< SCTP cksum of TX pkt. 
computed by NIC. */
+#define PKT_TX_UDP_CKSUM (0x000C << 32) /**< UDP cksum of TX pkt. computed 
by NIC. */
+#define PKT_TX_L4_MASK   (0x000C << 32) /**< Mask for L4 cksum offload 
request. */
+/* Bit 36 */
+#define PKT_TX_IEEE1588_TMST (0x0010 << 32) /**< TX IEEE1588 packet to 
timestamp. */

 /* Use final bit of flags to indicate a control mbuf */
 #define CTRL_MBUF_FLAG   (1ULL << 63)
-- 
1.9.3
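
The L4 checksum request in this layout is a two-bit field (bits 35~34), so it
has to be read by masking and comparing rather than by testing single bits. A
minimal sketch, assuming the constants carry a ULL suffix and that the
no-checksum value is zero; the EX_-prefixed names and the enum are
hypothetical and only mirror the proposed values:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; values mirror the proposed layout with ULL added. */
#define EX_TX_L4_NO_CKSUM  (0x0000ULL << 32)
#define EX_TX_TCP_CKSUM    (0x0004ULL << 32)
#define EX_TX_SCTP_CKSUM   (0x0008ULL << 32)
#define EX_TX_UDP_CKSUM    (0x000CULL << 32)
#define EX_TX_L4_MASK      (0x000CULL << 32)

enum ex_l4 { EX_L4_NONE, EX_L4_TCP, EX_L4_SCTP, EX_L4_UDP };

/* Decode the two-bit L4 field by masking first, then comparing. */
static enum ex_l4 ex_l4_type(uint64_t ol_flags)
{
	switch (ol_flags & EX_TX_L4_MASK) {
	case EX_TX_TCP_CKSUM:
		return EX_L4_TCP;
	case EX_TX_SCTP_CKSUM:
		return EX_L4_SCTP;
	case EX_TX_UDP_CKSUM:
		return EX_L4_UDP;
	default:
		return EX_L4_NONE;
	}
}
```

A plain bit test such as (flags & EX_TX_TCP_CKSUM) would also be true for a
UDP request, because the UDP encoding sets both bits of the field.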



[dpdk-dev] [PATCH RFC] mbuf: Adjust TX flags to start at bit 32

2014-09-30 Thread Wiles, Roger Keith
Hi Bruce,

I like the idea of the split, which should make it easier to do the testing of 
those bits.
One comment below.

On Sep 30, 2014, at 10:33 AM, Bruce Richardson  
wrote:

> On Tue, Sep 30, 2014 at 04:26:02PM +0100, Bruce Richardson wrote:
>> This patch takes the existing TX flags defined for the mbuf and shifts
>> each uniquely defined one left so that additional RX flags can be
>> defined without having RX and TX flags mixed together.
>> 
>> Signed-off-by: Bruce Richardson 
>> ---
> 
> This is just an RFC patch for now, as I'm looking for input to make sure 
> this is done right. Couple of opens, if people have input:
> * is a 32/32 split for RX/TX flags appropriate? Are we likely to have about 
>  equal numbers of each?
> * Doing a grep for the TX flag use, it seems the defines are commonly used, 
>  but if anyone is aware of anywhere where the code depends on the flags  
> having a particular value, please let me know.
> 
> If I have time, I also hope to look at doing rework on the testpmd flag 
> handling based off Olivier's previous patches, but since that is not 
> affecting the public ABI, I consider it a bit lower priority.
> 
> Thanks,
> /Bruce
> 
>> lib/librte_mbuf/rte_mbuf.h | 26 +-
>> 1 file changed, 13 insertions(+), 13 deletions(-)
>> 
>> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
>> index 1c6e115..c9fc4ec 100644
>> --- a/lib/librte_mbuf/rte_mbuf.h
>> +++ b/lib/librte_mbuf/rte_mbuf.h
>> @@ -86,26 +86,26 @@ extern "C" {
>> #define PKT_RX_IEEE1588_PTP  0x0200 /**< RX IEEE1588 L2 Ethernet PT Packet. 
>> */
>> #define PKT_RX_IEEE1588_TMST 0x0400 /**< RX IEEE1588 L2/L4 timestamped 
>> packet.*/
>> 
>> -#define PKT_TX_VLAN_PKT  0x0800 /**< TX packet is a 802.1q VLAN packet. 
>> */
>> -#define PKT_TX_IP_CKSUM  0x1000 /**< IP cksum of TX pkt. computed by 
>> NIC. */
>> -#define PKT_TX_IPV4_CSUM 0x1000 /**< Alias of PKT_TX_IP_CKSUM. */
>> -#define PKT_TX_IPV4  PKT_RX_IPV4_HDR /**< IPv4 with no IP checksum 
>> offload. */
>> -#define PKT_TX_IPV6  PKT_RX_IPV6_HDR /**< IPv6 packet */
>> +#define PKT_TX_VLAN_PKT  (0x0001 << 32) /**< TX packet is a 802.1q VLAN 
>> packet. */
>> +#define PKT_TX_IP_CKSUM  (0x0002 << 32) /**< IP cksum of TX pkt. computed 
>> by NIC. */

One little nit in the patch: does (0x0001 << 32) need to be (0x0001ULL << 
32)? I have not tested it; it is just a thought.

Thanks
++Keith
>> +#define PKT_TX_IPV4_CSUM PKT_TX_IP_CKSUM /**< Alias of PKT_TX_IP_CKSUM. */
>> +#define PKT_TX_IPV4  PKT_RX_IPV4_HDR /**< IPv4 with no IP checksum 
>> offload. */
>> +#define PKT_TX_IPV6  PKT_RX_IPV6_HDR /**< IPv6 packet */
>> 
>> /*
>> - * Bit 14~13 used for L4 packet type with checksum enabled.
>> + * Bit 35~34 used for L4 packet type with checksum enabled.
>>  * 00: Reserved
>>  * 01: TCP checksum
>>  * 10: SCTP checksum
>>  * 11: UDP checksum
>>  */
>> -#define PKT_TX_L4_MASK   0x6000 /**< Mask bits for L4 checksum offload 
>> request. */
>> -#define PKT_TX_L4_NO_CKSUM   0x /**< Disable L4 cksum of TX pkt. */
>> -#define PKT_TX_TCP_CKSUM 0x2000 /**< TCP cksum of TX pkt. computed by 
>> NIC. */
>> -#define PKT_TX_SCTP_CKSUM0x4000 /**< SCTP cksum of TX pkt. computed by 
>> NIC. */
>> -#define PKT_TX_UDP_CKSUM 0x6000 /**< UDP cksum of TX pkt. computed by 
>> NIC. */
>> -/* Bit 15 */
>> -#define PKT_TX_IEEE1588_TMST 0x8000 /**< TX IEEE1588 packet to timestamp. */
>> +#define PKT_TX_L4_NO_CKSUM   (0x << 32) /**< Disable L4 cksum of TX 
>> pkt. */
>> +#define PKT_TX_TCP_CKSUM (0x0004 << 32) /**< TCP cksum of TX pkt. 
>> computed by NIC. */
>> +#define PKT_TX_SCTP_CKSUM(0x0008 << 32) /**< SCTP cksum of TX pkt. 
>> computed by NIC. */
>> +#define PKT_TX_UDP_CKSUM (0x000C << 32) /**< UDP cksum of TX pkt. 
>> computed by NIC. */
>> +#define PKT_TX_L4_MASK   (0x000C << 32) /**< Mask for L4 cksum offload 
>> request. */
>> +/* Bit 36 */
>> +#define PKT_TX_IEEE1588_TMST (0x0010 << 32) /**< TX IEEE1588 packet to 
>> timestamp. */
>> 
>> /* Use final bit of flags to indicate a control mbuf */
>> #define CTRL_MBUF_FLAG   (1ULL << 63)
>> -- 
>> 1.9.3
>> 

Keith Wiles, Principal Technologist with CTO office, Wind River mobile 
972-213-5533



[dpdk-dev] [memnic PATCH v2 6/7] pmd: add branch hint in recv/xmit

2014-09-30 Thread Xie, Huawei


> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Hiroshi Shimamoto
> Sent: Tuesday, September 30, 2014 7:15 PM
> To: dev at dpdk.org
> Cc: Hayato Momma
> Subject: [dpdk-dev] [memnic PATCH v2 6/7] pmd: add branch hint in recv/xmit
> 
> From: Hiroshi Shimamoto 
> 
> To reduce instruction cache misses, add branch condition hints to the
> recv/xmit functions. This improves performance slightly.
> 
> We can see performance improvements with memnic-tester.
> Using Xeon E5-2697 v2 @ 2.70GHz, 4 vCPU.
>  size |  before  |  after
>    64 | 5.54Mpps | 5.55Mpps
>   128 | 5.46Mpps | 5.44Mpps
>   256 | 5.21Mpps | 5.22Mpps
>   512 | 4.50Mpps | 4.52Mpps
>  1024 | 3.71Mpps | 3.73Mpps
>  1280 | 3.21Mpps | 3.22Mpps
>  1518 | 2.92Mpps | 2.93Mpps
> 
> Signed-off-by: Hiroshi Shimamoto 
> Reviewed-by: Hayato Momma 
> ---
>  pmd/pmd_memnic.c | 18 +-
>  1 file changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/pmd/pmd_memnic.c b/pmd/pmd_memnic.c
> index 7fc3093..875d3ea 100644
> --- a/pmd/pmd_memnic.c
> +++ b/pmd/pmd_memnic.c
> @@ -289,26 +289,26 @@ static uint16_t memnic_recv_pkts(void *rx_queue,
>   int idx, next;
>   struct rte_eth_stats *st = >stats[rte_lcore_id()];
> 
> - if (!adapter->nic->hdr.valid)
> + if (unlikely(!adapter->nic->hdr.valid))
>   return 0;
> 
>   pkts = bytes = errs = 0;
>   idx = adapter->up_idx;
>   for (nr = 0; nr < nb_pkts; nr++) {
>   p = >packets[idx];
> - if (p->status != MEMNIC_PKT_ST_FILLED)
> + if (unlikely(p->status != MEMNIC_PKT_ST_FILLED))
>   break;
>   /* prefetch the next area */
>   next = idx;
> - if (++next >= MEMNIC_NR_PACKET)
> + if (unlikely(++next >= MEMNIC_NR_PACKET))
On IA, the compiler can use add, cmp and cmov to avoid the branch.
But if MEMNIC_NR_PACKET is always a power of 2, 
it is better to just use next = (next + 1) & (MEMNIC_NR_PACKET - 1)

>   next = 0;
>   rte_prefetch0(>packets[next]);
> - if (p->len > framesz) {
> + if (unlikely(p->len > framesz)) {
>   errs++;
>   goto drop;
>   }
>   mb = rte_pktmbuf_alloc(adapter->mp);
> - if (!mb)
> + if (unlikely(!mb))
>   break;
> 
>   rte_memcpy(rte_pktmbuf_mtod(mb, void *), p->data, p->len);
> @@ -350,7 +350,7 @@ static uint16_t memnic_xmit_pkts(void *tx_queue,
>   uint64_t pkts, bytes, errs;
>   uint32_t framesz = adapter->framesz;
> 
> - if (!adapter->nic->hdr.valid)
> + if (unlikely(!adapter->nic->hdr.valid))
>   return 0;
> 
>   pkts = bytes = errs = 0;
> @@ -360,7 +360,7 @@ static uint16_t memnic_xmit_pkts(void *tx_queue,
>   struct rte_mbuf *sg;
>   void *ptr;
> 
> - if (pkt_len > framesz) {
> + if (unlikely(pkt_len > framesz)) {
>   errs++;
>   break;
>   }
> @@ -379,7 +379,7 @@ retry:
>   goto retry;
>   }
> 
> - if (idx != ACCESS_ONCE(adapter->down_idx)) {
> + if (unlikely(idx != ACCESS_ONCE(adapter->down_idx))) {
>   /*
>* host freed this and got false positive,
>* need to recover the status and retry.
> @@ -388,7 +388,7 @@ retry:
>   goto retry;
>   }
> 
> - if (++idx >= MEMNIC_NR_PACKET)
> + if (unlikely(++idx >= MEMNIC_NR_PACKET))
>   idx = 0;
>   adapter->down_idx = idx;
> 
> --
> 1.8.3.1
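
For readers unfamiliar with the hint used in the patch above, unlikely() is
commonly built on the GCC/clang builtin __builtin_expect. The sketch below
shows that common definition under hypothetical ex_ names and applies it to a
rare error path; count_valid() is an illustrative function, not MEMNIC code.

```c
#include <assert.h>
#include <stddef.h>

/* Common GCC/clang form of the hints, under hypothetical names. */
#define ex_likely(x)    __builtin_expect(!!(x), 1)
#define ex_unlikely(x)  __builtin_expect(!!(x), 0)

/* Count frames that fit in framesz; oversized frames are the rare case. */
static size_t count_valid(const unsigned *lens, size_t n, unsigned framesz)
{
	size_t i, ok = 0;

	for (i = 0; i < n; i++) {
		if (ex_unlikely(lens[i] > framesz))
			continue;  /* error path, expected to be cold */
		ok++;
	}
	return ok;
}
```

The hint does not change the result of the condition; it only tells the
compiler which side to lay out as the fall-through path.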



[dpdk-dev] [PATCH] Fix for LRU corrupted returns

2014-09-30 Thread Neil Horman
On Tue, Sep 30, 2014 at 06:14:46PM +, Saha, Avik (AWS) wrote:
> I have to point out that I am commenting out the power_of_2 check on 
> entry_size. I am not sure if this is the right way, but I don't know why this 
> soft assumption is important (since I cannot find the power of 2 constraint 
> in the documentation). I agree with the 0 check, but the only reason I did not 
> put that in is that the entry size would be at least sizeof(struct 
> rte_pipeline_table_entry) = 8 bytes (to which the action_data_size is added).
> 
> Avik
> 
I would imagine the power of two check is in place specifically because of the
zero bit searches immediately below it.  I.e. you can't really create bit masks
for multi-field values, when those fields aren't contiguous.

Neil
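
A small sketch of the shift calculations being discussed, assuming GCC/Clang
builtins and a 32-bit entry size. The function names are hypothetical and are
not taken from librte_table; shift_from_msb mirrors the expression in the
patch, the other two are shown only for comparison.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Bit width of uint32_t, spelled without a magic 32. */
#define U32_BITS ((uint32_t)(sizeof(uint32_t) * CHAR_BIT))

/* Exact shift: (1u << shift) == size. Only valid for a power of two. */
static uint32_t shift_exact(uint32_t size)
{
	assert(size != 0 && (size & (size - 1u)) == 0);
	return (uint32_t)__builtin_ctz(size);
}

/* Expression used in the patch: position just above the most significant bit. */
static uint32_t shift_from_msb(uint32_t size)
{
	assert(size != 0);	/* __builtin_clz(0) is undefined */
	return U32_BITS - (uint32_t)__builtin_clz(size);
}

/* Smallest shift such that (1u << shift) >= size. */
static uint32_t shift_round_up(uint32_t size)
{
	assert(size != 0 && size <= (1u << (U32_BITS - 1u)));
	if (size == 1u)
		return 0;
	return U32_BITS - (uint32_t)__builtin_clz(size - 1u);
}
```

Under these definitions an entry size of 8 gives a shift of 3 from shift_exact
and shift_round_up but 4 from shift_from_msb, so the MSB-based form would space
8-byte entries 16 bytes apart. For a size of 12, both shift_from_msb and
shift_round_up give 4.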

> -Original Message-
> From: Neil Horman [mailto:nhorman at tuxdriver.com] 
> Sent: Tuesday, September 30, 2014 5:51 AM
> To: Saha, Avik (AWS)
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] Fix for LRU corrupted returns
> 
> On Tue, Sep 30, 2014 at 06:26:23AM +, Saha, Avik (AWS) wrote:
> > Sorry about the delay. The number 32 is not really a CACHE_LINE_SIZE but 
> > since __builtin_clz returns the number of leading 0's before the most 
> > significant set bit in a 32 bit number (entry_size is uint32_t), I subtract 
> > that number from 32 to get the number of trailing bits after the most 
> > significant set bit. This will be the separation in my data_mem regions.
> > 
> Ah, ok, then change that 32 to sizeof(t->data_size_shl) to protect you 
> against type changes and to avoid having magic values running around in your 
> code.  Also, you might want to do some sanity checking of entry_size as it 
> seems like there's a soft assumption that entry size is non-zero and a power 
> of two.
> While the latter is checked higher in the function, the former isn't and 
> __builtin_clz has undefined behavior if it's passed a zero value.
> 
> Neil
> 
> > -Original Message-
> > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > Sent: Thursday, September 25, 2014 3:22 AM
> > To: Saha, Avik (AWS)
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH] Fix for LRU corrupted returns
> > 
> > On Thu, Sep 25, 2014 at 07:46:16AM +, Saha, Avik (AWS) wrote:
> > > This is a patch to a problem that I have faced (described in the  thread) 
> > > and this works for me.
> > > 
> > > 1)  Since the data_size_shl was getting its value from the key_size, 
> > > the table data entries were being corrupted when the calculation to shift 
> > > the number of bits was being made based on the key_size (according to the 
> > > document the key_size and entry_size are independently configurable) - 
> > > With this fix, we get the MSB that is set in entry_size (also removes the 
> > > constraint of this having to be a power of 2 - not entirely sure if this 
> > > was the reason the constraint was kept though)
> > > 2)  The document does not say that the entry_size needs to be a power 
> > > of 2 and this was failing silently when I was trying to bring my 
> > > application up.
> > > 
> > > diff --git a/DPDK/lib/librte_table/rte_table_hash_lru.c
> > > b/DPDK/lib/librte_table/rte_table_hash_lru.c
> > > index d1a4984..4ec9aa4 100644
> > > --- a/DPDK/lib/librte_table/rte_table_hash_lru.c
> > > +++ b/DPDK/lib/librte_table/rte_table_hash_lru.c
> > > @@ -153,8 +153,10 @@ rte_table_hash_lru_create(void *params, int 
> > > socket_id, uint32_t entry_size)
> > > uint32_t i;
> > > 
> > > /* Check input parameters */
> > > -   if ((check_params_create(p) != 0) ||
> > > -   (!rte_is_power_of_2(entry_size)) ||
> > > +   // Commenting out the power of 2 check on the entry_size since the
> > > +   // Programmers Guide does not call this out and we are going to 
> > > handle
> > > +   // the data_size_shl of the table later on (Line 197)
> > > Please remove the reference to Line 197 here.  That's not going to remain 
> > > accurate for very long.
> > 
> > > +   if ((check_params_create(p) != 0) ||
> > > ((sizeof(struct rte_table_hash) % CACHE_LINE_SIZE) != 0) 
> > > ||
> > > (sizeof(struct bucket) != (CACHE_LINE_SIZE / 2))) {
> > > return NULL;
> > > @@ -192,7 +194,7 @@ rte_table_hash_lru_create(void *params, int 
> > > socket_id, uint32_t entry_size)
> > > /* Internal */
> > > t->bucket_mask = t->n_buckets - 1;
> > > t->key_size_shl = __builtin_ctzl(p->key_size);
> > > -   t->data_size_shl = __builtin_ctzl(p->key_size);
> > > +   t->data_size_shl = 32 - (__builtin_clz(entry_size));
> > > I presume the 32 value here is a cache line size?  That should be replaced 
> > > with CACHE_LINE_SIZE...Though looking at it, that doesn't seem sufficient.  
> > > Seems like we need an EAL abstraction to dynamically tell us what the cache 
> > > line size is (we can read it from /proc/cpuinfo in Linux, not sure about 
> > > BSD).
> > 
> > Neil
> > 
> > 
> 


[dpdk-dev] [PATCH v3 7/7] app/testpmd: add commands to support hash filter control

2014-09-30 Thread Helin Zhang
To demonstrate the hash filter control, commands are added.
They are
- get_sym_hash_ena_per_port
- set_sym_hash_ena_per_port
- get_sym_hash_ena_per_pctype
- set_sym_hash_ena_per_pctype
- get_filter_swap
- set_filter_swap
- get_hash_function
- set_hash_function

v3 changes:
* Renamed the command names.
* Used the re-designed filter control APIs and structures.

Signed-off-by: Helin Zhang 
Acked-by: Jingjing Wu 
---
 app/test-pmd/cmdline.c | 565 +
 1 file changed, 565 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 225f669..c8c0bcd 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -74,6 +74,7 @@
 #include 
 #include 
 #include 
+#include 

 #include 
 #include 
@@ -660,6 +661,35 @@ static void cmd_help_long_parsed(void *parsed_result,

"get_flex_filter (port_id) index (idx)\n"
"get info of a flex filter.\n\n"
+
+   "get_sym_hash_ena_per_port (port_id)\n"
+   "get symmetric hash enable configuration per port.\n\n"
+
+   "set_sym_hash_ena_per_port (port_id)"
+   " (enable|disable)\n"
+   "set symmetric hash enable configuration per port"
+   " to enable or disable.\n\n"
+
+   "get_sym_hash_ena_per_pctype (port_id) (pctype)\n"
+   "get symmetric hash enable configuration per pctype.\n\n"
+
+   "set_sym_hash_ena_per_pctype (port_id) (pctype)"
+   " (enable|disable)\n"
+   "set symmetric hash enable configuration per"
+   " pctype to enable or disable.\n\n"
+
+   "get_filter_swap (port_id) (pctype)\n"
+   "get filter swap configurations.\n\n"
+
+   "set_filter_swap (port_id) (pctype) (off0_src0) (off0_src1)"
+   " (len0) (off1_src0) (off1_src1) (len1)\n"
+   "set filter swap configurations.\n\n"
+
+   "get_hash_function (port_id)\n"
+   "get hash function of Toeplitz or Simple XOR.\n\n"
+
+   "set_hash_function (port_id) (toeplitz|simple_xor)\n"
+   "set the hash function to Toeplitz or Simple XOR.\n\n"
);
}
 }
@@ -7411,6 +7441,533 @@ cmdline_parse_inst_t cmd_get_flex_filter = {
},
 };

+/* *** Classification Filters Control *** */
+
+/* *** Get symmetric hash enable per port *** */
+struct cmd_get_sym_hash_ena_per_port_result {
+   cmdline_fixed_string_t get_sym_hash_ena_per_port;
+   uint8_t port_id;
+};
+
+static void
+cmd_get_sym_hash_per_port_parsed(void *parsed_result,
+__rte_unused struct cmdline *cl,
+__rte_unused void *data)
+{
+   struct cmd_get_sym_hash_ena_per_port_result *res = parsed_result;
+   struct rte_eth_hash_filter_info info;
+   int ret;
+
+   if (rte_eth_dev_filter_supported(res->port_id,
+   RTE_ETH_FILTER_HASH) < 0) {
+   printf("RTE_ETH_FILTER_HASH not supported on port: %d\n",
+   res->port_id);
+   return;
+   }
+
+   memset(&info, 0, sizeof(info));
+   info.info_type = RTE_ETH_HASH_FILTER_INFO_TYPE_SYM_HASH_ENA_PER_PORT;
+   ret = rte_eth_dev_filter_ctrl(res->port_id, RTE_ETH_FILTER_HASH,
+   RTE_ETH_FILTER_OP_GET, &info);
+   if (ret < 0) {
+   printf("Cannot get symmetric hash enable per port "
+   "on port %u\n", res->port_id);
+   return;
+   }
+
+   printf("Symmetric hash is %s on port %u\n", info.info.enable ?
+   "enabled" : "disabled", res->port_id);
+}
+
+cmdline_parse_token_string_t cmd_get_sym_hash_ena_per_port_all =
+   TOKEN_STRING_INITIALIZER(struct cmd_get_sym_hash_ena_per_port_result,
+   get_sym_hash_ena_per_port, "get_sym_hash_ena_per_port");
+cmdline_parse_token_num_t cmd_get_sym_hash_ena_per_port_port_id =
+   TOKEN_NUM_INITIALIZER(struct cmd_get_sym_hash_ena_per_port_result,
+   port_id, UINT8);
+
+cmdline_parse_inst_t cmd_get_sym_hash_ena_per_port = {
+   .f = cmd_get_sym_hash_per_port_parsed,
+   .data = NULL,
+   .help_str = "get_sym_hash_ena_per_port port_id",
+   .tokens = {
+   (void *)&cmd_get_sym_hash_ena_per_port_all,
+   (void *)&cmd_get_sym_hash_ena_per_port_port_id,
+   NULL,
+   },
+};
+
+/* *** Set symmetric hash enable per port *** */
+struct cmd_set_sym_hash_ena_per_port_result {
+   cmdline_fixed_string_t set_sym_hash_ena_per_port;
+   cmdline_fixed_string_t enable;
+   uint8_t port_id;
+};
+
+static 

[dpdk-dev] [PATCH v3 6/7] i40e: Use constant random hash keys

2014-09-30 Thread Helin Zhang
To simplify the code and remove a race condition, use prepared
constant random hash keys instead of generating the hash keys
at runtime.
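
A rough sketch of the fallback pattern described above, assuming a fixed table
of 13 key words (the count in the patch). The names KEY_WORDS, default_key,
key_ref and pick_key are hypothetical and are not part of the i40e driver; the
point is only that a constant table needs no runtime write and so cannot race.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative word count; matches the 13 words listed in the patch. */
#define KEY_WORDS 13u

/* Read-only default key: never written at runtime, so there is no race. */
static const uint32_t default_key[KEY_WORDS] = {
	0x6b793944, 0x23504cb5, 0x5bea75b6, 0x309f4f12, 0x3dc0a2b8,
	0x024ddcdf, 0x339b8ca0, 0x4c4af64a, 0x34fac605, 0x55d85839,
	0x3a58997d, 0x2ec938e1, 0x66031581,
};

/* Hypothetical view of the key that will be programmed. */
struct key_ref {
	const uint8_t *data;
	size_t len;
};

/* Use the caller's key if it is long enough, otherwise the constant default. */
static struct key_ref pick_key(const uint8_t *user_key, size_t user_len)
{
	struct key_ref ref;

	assert(user_key != NULL || user_len == 0);
	if (user_key == NULL || user_len < sizeof(default_key)) {
		ref.data = (const uint8_t *)default_key;
		ref.len = sizeof(default_key);
	} else {
		ref.data = user_key;
		ref.len = user_len;
	}
	return ref;
}
```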

v3 changes:
* Use prepared random hash keys.

Signed-off-by: Helin Zhang 
Acked-by: Jingjing Wu 
---
 lib/librte_pmd_i40e/i40e_ethdev.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index f23e0bf..87a5f4d 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -211,9 +211,6 @@ static int i40e_dev_filter_ctrl(struct rte_eth_dev *dev,
void *arg);
 static void i40e_hw_init(struct i40e_hw *hw);

-/* Default hash key buffer for RSS */
-static uint32_t rss_key_default[I40E_PFQF_HKEY_MAX_INDEX + 1];
-
 static struct rte_pci_id pci_id_i40e_map[] = {
 #define RTE_PCI_DEV_ID_DECL_I40E(vend, dev) {RTE_PCI_DEVICE(vend, dev)},
 #include "rte_pci_dev_ids.h"
@@ -4113,9 +4110,12 @@ i40e_pf_config_rss(struct i40e_pf *pf)
}
if (rss_conf.rss_key == NULL || rss_conf.rss_key_len <
(I40E_PFQF_HKEY_MAX_INDEX + 1) * sizeof(uint32_t)) {
-   /* Calculate the default hash key */
-   for (i = 0; i <= I40E_PFQF_HKEY_MAX_INDEX; i++)
-   rss_key_default[i] = (uint32_t)rte_rand();
+   /* Random default keys */
+   static uint32_t rss_key_default[] = {0x6b793944,
+   0x23504cb5, 0x5bea75b6, 0x309f4f12, 0x3dc0a2b8,
+   0x024ddcdf, 0x339b8ca0, 0x4c4af64a, 0x34fac605,
+   0x55d85839, 0x3a58997d, 0x2ec938e1, 0x66031581};
+
rss_conf.rss_key = (uint8_t *)rss_key_default;
rss_conf.rss_key_len = (I40E_PFQF_HKEY_MAX_INDEX + 1) *
sizeof(uint32_t);
-- 
1.8.1.4



[dpdk-dev] [PATCH v3 5/7] i40e: add hardware initialization

2014-09-30 Thread Helin Zhang
As global registers are reset only by a whole chip reset,
those registers might not be in their initial state each time
a physical port is launched. Hardware initialization is added
to put specific global registers into an initial state.

v3 changes:
* Renamed hardware initialization function.
* Added initialization of register 'PFQF_CTL_0'.

Signed-off-by: Helin Zhang 
Acked-by: Jingjing Wu 
---
 lib/librte_pmd_i40e/i40e_ethdev.c | 78 +++
 1 file changed, 78 insertions(+)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index ee7c9de..f23e0bf 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -209,6 +209,7 @@ static int i40e_dev_filter_ctrl(struct rte_eth_dev *dev,
enum rte_filter_type filter_type,
enum rte_filter_op filter_op,
void *arg);
+static void i40e_hw_init(struct i40e_hw *hw);

 /* Default hash key buffer for RSS */
 static uint32_t rss_key_default[I40E_PFQF_HKEY_MAX_INDEX + 1];
@@ -390,6 +391,9 @@ eth_i40e_dev_init(__rte_unused struct eth_driver *eth_drv,
/* Make sure all is clean before doing PF reset */
i40e_clear_hw(hw);

+   /* Initialize the hardware */
+   i40e_hw_init(hw);
+
/* Reset here to make sure all is clean for each PF */
ret = i40e_pf_reset(hw);
if (ret) {
@@ -4533,3 +4537,77 @@ i40e_dev_filter_ctrl(struct rte_eth_dev *dev,

return ret;
 }
+
+/* Initialization for hash function */
+static void
+i40e_hash_function_hw_init(struct i40e_hw *hw)
+{
+   uint32_t i;
+   const struct rte_eth_sym_hash_ena_info sym_hash_ena_info[] = {
+   {ETH_RSS_NONF_IPV4_UDP_SHIFT, 0},
+   {ETH_RSS_NONF_IPV4_TCP_SHIFT, 0},
+   {ETH_RSS_NONF_IPV4_SCTP_SHIFT, 0},
+   {ETH_RSS_NONF_IPV4_OTHER_SHIFT, 0},
+   {ETH_RSS_FRAG_IPV4_SHIFT, 0},
+   {ETH_RSS_NONF_IPV6_UDP_SHIFT, 0},
+   {ETH_RSS_NONF_IPV6_TCP_SHIFT, 0},
+   {ETH_RSS_NONF_IPV6_SCTP_SHIFT, 0},
+   {ETH_RSS_NONF_IPV6_OTHER_SHIFT, 0},
+   {ETH_RSS_FRAG_IPV6_SHIFT, 0},
+   {ETH_RSS_L2_PAYLOAD_SHIFT, 0},
+   };
+   const struct rte_eth_filter_swap_info swap_info[] = {
+   {ETH_RSS_NONF_IPV4_UDP_SHIFT,
+   0x1e, 0x36, 0x04, 0x3a, 0x3c, 0x02},
+   {ETH_RSS_NONF_IPV4_TCP_SHIFT,
+   0x1e, 0x36, 0x04, 0x3a, 0x3c, 0x02},
+   {ETH_RSS_NONF_IPV4_SCTP_SHIFT,
+   0x1e, 0x36, 0x04, 0x00, 0x00, 0x00},
+   {ETH_RSS_NONF_IPV4_OTHER_SHIFT,
+   0x1e, 0x36, 0x04, 0x00, 0x00, 0x00},
+   {ETH_RSS_FRAG_IPV4_SHIFT,
+   0x1e, 0x36, 0x04, 0x00, 0x00, 0x00},
+   {ETH_RSS_NONF_IPV6_UDP_SHIFT,
+   0x1a, 0x2a, 0x10, 0x3a, 0x3c, 0x02},
+   {ETH_RSS_NONF_IPV6_TCP_SHIFT,
+   0x1a, 0x2a, 0x10, 0x3a, 0x3c, 0x02},
+   {ETH_RSS_NONF_IPV6_SCTP_SHIFT,
+   0x1a, 0x2a, 0x10, 0x00, 0x00, 0x00},
+   {ETH_RSS_NONF_IPV6_OTHER_SHIFT,
+   0x1a, 0x2a, 0x10, 0x00, 0x00, 0x00},
+   {ETH_RSS_FRAG_IPV6_SHIFT,
+   0x1a, 0x2a, 0x10, 0x00, 0x00, 0x00},
+   {ETH_RSS_L2_PAYLOAD_SHIFT,
+   0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
+   };
+
+   /* Disable symmetric hash per PCTYPE */
+   for (i = 0; i < RTE_DIM(sym_hash_ena_info); i++)
+   i40e_set_symmetric_hash_enable_per_pctype(hw,
+   &sym_hash_ena_info[i]);
+
+   /* Disable symmetric hash per port */
+   i40e_set_symmetric_hash_enable_per_port(hw, 0);
+
+   /* Initialize filter swap */
+   for (i = 0; i < RTE_DIM(swap_info); i++)
+   i40e_set_filter_swap(hw, &swap_info[i]);
+
+   /* Set hash function to Toeplitz by default */
+   i40e_set_hash_function(hw, RTE_ETH_HASH_FUNCTION_TOEPLITZ);
+}
+
+/*
+ * As global registers wouldn't be reset unless a global hardware reset,
+ * hardware initialization is needed to put those registers into an
+ * expected initial state.
+ */
+static void
+i40e_hw_init(struct i40e_hw *hw)
+{
+   /* clear the PF Queue Filter control register */
+   I40E_WRITE_REG(hw, I40E_PFQF_CTL_0, 0);
+
+   /* Initialize hardware for hash function */
+   i40e_hash_function_hw_init(hw);
+}
-- 
1.8.1.4



[dpdk-dev] [PATCH v3 4/7] i40e: add hash filter control implementation

2014-09-30 Thread Helin Zhang
Hash filter control has been implemented for i40e. It includes
getting/setting
- hash function type
- symmetric hash enable per pctype (packet classification type)
- symmetric hash enable per port
- filter swap configurations

v3 changes:
* Remove public header file specific for i40e.
* Use the re-designed filter control API, filter types,
  and operations.

Signed-off-by: Helin Zhang 
Acked-by: Jingjing Wu 
---
 lib/librte_pmd_i40e/i40e_ethdev.c | 402 ++
 1 file changed, 402 insertions(+)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index 26f1799..ee7c9de 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -205,6 +205,10 @@ static int i40e_dev_rss_hash_update(struct rte_eth_dev 
*dev,
struct rte_eth_rss_conf *rss_conf);
 static int i40e_dev_rss_hash_conf_get(struct rte_eth_dev *dev,
  struct rte_eth_rss_conf *rss_conf);
+static int i40e_dev_filter_ctrl(struct rte_eth_dev *dev,
+   enum rte_filter_type filter_type,
+   enum rte_filter_op filter_op,
+   void *arg);

 /* Default hash key buffer for RSS */
 static uint32_t rss_key_default[I40E_PFQF_HKEY_MAX_INDEX + 1];
@@ -256,6 +260,7 @@ static struct eth_dev_ops i40e_eth_dev_ops = {
.reta_query   = i40e_dev_rss_reta_query,
.rss_hash_update  = i40e_dev_rss_hash_update,
.rss_hash_conf_get= i40e_dev_rss_hash_conf_get,
+   .filter_ctrl  = i40e_dev_filter_ctrl,
 };

 static struct eth_driver rte_i40e_pmd = {
@@ -4131,3 +4136,400 @@ i40e_pf_config_mq_rx(struct i40e_pf *pf)

return 0;
 }
+
+/* Get the symmetric hash enable configurations per PCTYPE */
+static int
+i40e_get_symmetric_hash_enable_per_pctype(struct i40e_hw *hw,
+   struct rte_eth_sym_hash_ena_info *info)
+{
+   uint32_t reg;
+
+   switch (info->pctype) {
+   case ETH_RSS_NONF_IPV4_UDP_SHIFT:
+   case ETH_RSS_NONF_IPV4_TCP_SHIFT:
+   case ETH_RSS_NONF_IPV4_SCTP_SHIFT:
+   case ETH_RSS_NONF_IPV4_OTHER_SHIFT:
+   case ETH_RSS_FRAG_IPV4_SHIFT:
+   case ETH_RSS_NONF_IPV6_UDP_SHIFT:
+   case ETH_RSS_NONF_IPV6_TCP_SHIFT:
+   case ETH_RSS_NONF_IPV6_SCTP_SHIFT:
+   case ETH_RSS_NONF_IPV6_OTHER_SHIFT:
+   case ETH_RSS_FRAG_IPV6_SHIFT:
+   case ETH_RSS_L2_PAYLOAD_SHIFT:
+   reg = I40E_READ_REG(hw, I40E_GLQF_HSYM(info->pctype));
+   info->enable = reg & I40E_GLQF_HSYM_SYMH_ENA_MASK ? 1 : 0;
+   break;
+   default:
+   PMD_DRV_LOG(ERR, "PCTYPE[%u] not supported", info->pctype);
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
+/* Set the symmetric hash enable configurations per PCTYPE */
+static int
+i40e_set_symmetric_hash_enable_per_pctype(struct i40e_hw *hw,
+   const struct rte_eth_sym_hash_ena_info *info)
+{
+   uint32_t reg;
+
+   switch (info->pctype) {
+   case ETH_RSS_NONF_IPV4_UDP_SHIFT:
+   case ETH_RSS_NONF_IPV4_TCP_SHIFT:
+   case ETH_RSS_NONF_IPV4_SCTP_SHIFT:
+   case ETH_RSS_NONF_IPV4_OTHER_SHIFT:
+   case ETH_RSS_FRAG_IPV4_SHIFT:
+   case ETH_RSS_NONF_IPV6_UDP_SHIFT:
+   case ETH_RSS_NONF_IPV6_TCP_SHIFT:
+   case ETH_RSS_NONF_IPV6_SCTP_SHIFT:
+   case ETH_RSS_NONF_IPV6_OTHER_SHIFT:
+   case ETH_RSS_FRAG_IPV6_SHIFT:
+   case ETH_RSS_L2_PAYLOAD_SHIFT:
+   reg = info->enable ? I40E_GLQF_HSYM_SYMH_ENA_MASK : 0;
+   I40E_WRITE_REG(hw, I40E_GLQF_HSYM(info->pctype), reg);
+   I40E_WRITE_FLUSH(hw);
+   break;
+   default:
+   PMD_DRV_LOG(ERR, "PCTYPE[%u] not supported", info->pctype);
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
+/* Get the symmetric hash enable configurations per port */
+static void
+i40e_get_symmetric_hash_enable_per_port(struct i40e_hw *hw, uint8_t *enable)
+{
+   uint32_t reg = I40E_READ_REG(hw, I40E_PRTQF_CTL_0);
+
+   *enable = reg & I40E_PRTQF_CTL_0_HSYM_ENA_MASK ? 1 : 0;
+}
+
+/* Set the symmetric hash enable configurations per port */
+static void
+i40e_set_symmetric_hash_enable_per_port(struct i40e_hw *hw, uint8_t enable)
+{
+   uint32_t reg = I40E_READ_REG(hw, I40E_PRTQF_CTL_0);
+
+   if (enable > 0) {
+   if (reg & I40E_PRTQF_CTL_0_HSYM_ENA_MASK) {
+   PMD_DRV_LOG(INFO, "Symmetric hash has already "
+   "been enabled");
+   return;
+   }
+   reg |= I40E_PRTQF_CTL_0_HSYM_ENA_MASK;
+   } else {
+   if (!(reg & I40E_PRTQF_CTL_0_HSYM_ENA_MASK)) {
+   PMD_DRV_LOG(INFO, "Symmetric hash has already "
+   "been 

[dpdk-dev] [PATCH v3 3/7] ethdev: add structures and enum for hash filter control

2014-09-30 Thread Helin Zhang
Structures and enum are added in rte_eth_ctrl.h to support hash
filter control.

v3 changes:
* Common structures are added in rte_eth_ctrl.h to support hash
  filter control.
* Hash filter info types and hash function types are added in
  rte_eth_ctrl.h to support filter control.
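
As an illustration of the "info type selects the union member" layout this
patch introduces, here is a cut-down sketch with locally defined, hypothetical
types (info_type, hash_function, filter_info, describe). It is not the
rte_eth_ctrl.h definition, only the same tagged-union idea in miniature.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced versions of the info type and hash function enums. */
enum info_type {
	INFO_UNKNOWN = 0,
	INFO_SYM_HASH_PER_PORT,
	INFO_HASH_FUNCTION,
};

enum hash_function {
	HASH_FN_UNKNOWN = 0,
	HASH_FN_TOEPLITZ,
	HASH_FN_SIMPLE_XOR,
};

/* The type field says which union member is meaningful. */
struct filter_info {
	enum info_type type;
	union {
		uint8_t enable;			/* INFO_SYM_HASH_PER_PORT */
		enum hash_function hash_function;	/* INFO_HASH_FUNCTION */
	} info;
};

/* Read only the union member that matches the type tag. */
static const char *describe(const struct filter_info *fi)
{
	assert(fi != NULL);
	switch (fi->type) {
	case INFO_SYM_HASH_PER_PORT:
		return fi->info.enable ? "symmetric hash enabled"
				       : "symmetric hash disabled";
	case INFO_HASH_FUNCTION:
		switch (fi->info.hash_function) {
		case HASH_FN_TOEPLITZ:
			return "toeplitz";
		case HASH_FN_SIMPLE_XOR:
			return "simple_xor";
		default:
			return "unknown hash function";
		}
	default:
		return "unknown info type";
	}
}
```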

Signed-off-by: Helin Zhang 
Acked-by: Jingjing Wu 
---
 lib/librte_ether/rte_eth_ctrl.h | 74 +
 1 file changed, 74 insertions(+)

diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index aaea075..10197fc 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -73,6 +73,80 @@ enum rte_filter_op {
RTE_ETH_FILTER_OP_MAX,
 };

+/**
+ * Hash filter information types.
+ */
+enum rte_eth_hash_filter_info_type {
+   RTE_ETH_HASH_FILTER_INFO_TYPE_UNKNOWN = 0,
+   RTE_ETH_HASH_FILTER_INFO_TYPE_SYM_HASH_ENA_PER_PCTYPE,
+   RTE_ETH_HASH_FILTER_INFO_TYPE_SYM_HASH_ENA_PER_PORT,
+   RTE_ETH_HASH_FILTER_INFO_TYPE_FILTER_SWAP,
+   RTE_ETH_HASH_FILTER_INFO_TYPE_HASH_FUNCTION,
+   RTE_ETH_HASH_FILTER_INFO_TYPE_MAX,
+};
+
+/**
+ * Hash function types.
+ */
+enum rte_eth_hash_function {
+   RTE_ETH_HASH_FUNCTION_UNKNOWN = 0,
+   RTE_ETH_HASH_FUNCTION_TOEPLITZ,
+   RTE_ETH_HASH_FUNCTION_SIMPLE_XOR,
+   RTE_ETH_HASH_FUNCTION_MAX,
+};
+
+/**
+ * A structure used to set or get symmetric hash enable information, to support
+ * 'RTE_ETH_FILTER_HASH', 'RTE_ETH_FILTER_OP_GET/RTE_ETH_FILTER_OP_SET', with
+ * information type 'RTE_ETH_HASH_FILTER_INFO_TYPE_SYM_HASH_ENA_PER_PCTYPE'.
+ */
+struct rte_eth_sym_hash_ena_info {
+   /**< packet classification type, defined in rte_ethdev.h */
+   uint8_t pctype;
+   uint8_t enable; /**< enable or disable flag */
+};
+
+/**
+ * A structure used to set or get filter swap information, to support
+ * 'RTE_ETH_FILTER_HASH', 'RTE_ETH_FILTER_OP_GET/RTE_ETH_FILTER_OP_SET',
+ * with information type 'RTE_ETH_HASH_FILTER_INFO_TYPE_FILTER_SWAP'.
+ */
+struct rte_eth_filter_swap_info {
+   /**< Packet classification type, defined in rte_ethdev.h */
+   uint8_t pctype;
+   /**< Offset of the 1st field of the 1st couple to be swapped. */
+   uint8_t off0_src0;
+   /**< Offset of the 2nd field of the 1st couple to be swapped. */
+   uint8_t off0_src1;
+   /**< Field length of the first couple. */
+   uint8_t len0;
+   /**< Offset of the 1st field of the 2nd couple to be swapped. */
+   uint8_t off1_src0;
+   /**< Offset of the 2nd field of the 2nd couple to be swapped. */
+   uint8_t off1_src1;
+   /**< Field length of the second couple. */
+   uint8_t len1;
+};
+
+/**
+ * A structure used to set or get hash filter information, to support filter
+ * type of 'RTE_ETH_FILTER_HASH' and its operations.
+ */
+struct rte_eth_hash_filter_info {
+   enum rte_eth_hash_filter_info_type info_type; /**< Information type. */
+   /**< Details of hash filter information */
+   union {
+   /* For RTE_ETH_HASH_FILTER_INFO_TYPE_SYM_HASH_ENA_PER_PCTYPE */
+   struct rte_eth_sym_hash_ena_info sym_hash_ena;
+   /* For RTE_ETH_HASH_FILTER_INFO_TYPE_FILTER_SWAP */
+   struct rte_eth_filter_swap_info filter_swap;
+   /* For RTE_ETH_HASH_FILTER_INFO_TYPE_SYM_HASH_ENA_PER_PORT */
+   uint8_t enable;
+   /* For RTE_ETH_HASH_FILTER_INFO_TYPE_HASH_FUNCTION */
+   enum rte_eth_hash_function hash_function;
+   } info;
+};
+
 #ifdef __cplusplus
 }
 #endif
-- 
1.8.1.4



[dpdk-dev] [PATCH v3 2/7] ethdev: add interfaces and relevant for filter control

2014-09-30 Thread Helin Zhang
To support flexible filter control, 'rte_eth_dev_filter_ctrl()'
and 'rte_eth_dev_filter_supported()' are added. In addition, filter
types and operations are defined in a newly added header file.
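
A minimal sketch of how a single control callback can serve both calls,
assuming (from the comment on RTE_ETH_FILTER_OP_NONE in this patch) that the
"supported" query is just the control call with the NONE operation. All types
and names below (device, filter_ctrl_t, dev_filter_ctrl, dev_filter_supported,
hash_only_ctrl) are hypothetical stand-ins, not the ethdev API itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced filter types and operations. */
enum filter_type { FILTER_NONE = 0, FILTER_HASH, FILTER_MAX };
enum filter_op { FILTER_OP_NONE = 0, FILTER_OP_GET, FILTER_OP_SET };

struct device;
typedef int (*filter_ctrl_t)(struct device *dev, enum filter_type type,
			     enum filter_op op, void *arg);

/* Stand-in for a device with an optional driver callback. */
struct device {
	filter_ctrl_t filter_ctrl;
};

/* Dispatch to the driver, or report -ENOTSUP if it has no callback. */
static int dev_filter_ctrl(struct device *dev, enum filter_type type,
			   enum filter_op op, void *arg)
{
	assert(dev != NULL);
	if (dev->filter_ctrl == NULL)
		return -ENOTSUP;
	return dev->filter_ctrl(dev, type, op, arg);
}

/* Assumed shape of the support query: the NONE operation with no argument. */
static int dev_filter_supported(struct device *dev, enum filter_type type)
{
	return dev_filter_ctrl(dev, type, FILTER_OP_NONE, NULL);
}

/* Example driver callback that only knows the hash filter type. */
static int hash_only_ctrl(struct device *dev, enum filter_type type,
			  enum filter_op op, void *arg)
{
	(void)dev;
	if (type != FILTER_HASH)
		return -EINVAL;
	if (op == FILTER_OP_NONE)
		return 0;
	return arg == NULL ? -EINVAL : 0;
}
```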

v3 changes:
* Interfaces to be added have been re-designed.
* Header file has been renamed.

Signed-off-by: Helin Zhang 
Acked-by: Jingjing Wu 
---
 lib/librte_ether/Makefile   |  1 +
 lib/librte_ether/rte_eth_ctrl.h | 80 +
 lib/librte_ether/rte_ethdev.c   | 32 +
 lib/librte_ether/rte_ethdev.h   | 48 +
 4 files changed, 161 insertions(+)
 create mode 100644 lib/librte_ether/rte_eth_ctrl.h

diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
index b310f8b..a461c31 100644
--- a/lib/librte_ether/Makefile
+++ b/lib/librte_ether/Makefile
@@ -46,6 +46,7 @@ SRCS-y += rte_ethdev.c
 #
 SYMLINK-y-include += rte_ether.h
 SYMLINK-y-include += rte_ethdev.h
+SYMLINK-y-include += rte_eth_ctrl.h

 # this lib depends upon:
 DEPDIRS-y += lib/librte_eal lib/librte_mempool lib/librte_ring lib/librte_mbuf
diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
new file mode 100644
index 0000000..aaea075
--- /dev/null
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -0,0 +1,80 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ETH_CTRL_H_
+#define _RTE_ETH_CTRL_H_
+
+/**
+ * @file
+ *
+ * Ethernet device features and related data structures used
+ * by control APIs should be defined in this file.
+ *
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Feature filter types
+ */
+enum rte_filter_type {
+   RTE_ETH_FILTER_NONE = 0,
+   RTE_ETH_FILTER_HASH,
+   RTE_ETH_FILTER_FDIR,
+   RTE_ETH_FILTER_TUNNEL,
+   RTE_ETH_FILTER_MAX,
+};
+
+/**
+ * All generic operations to filters
+ */
+enum rte_filter_op {
+   /**< used to check whether the type filter is supported */
+   RTE_ETH_FILTER_OP_NONE = 0,
+   RTE_ETH_FILTER_OP_ADD,  /**< add filter entry */
+   RTE_ETH_FILTER_OP_UPDATE,   /**< update filter entry */
+   RTE_ETH_FILTER_OP_DELETE,   /**< delete filter entry */
+   RTE_ETH_FILTER_OP_GET,  /**< get filter entry */
+   RTE_ETH_FILTER_OP_SET,  /**< configurations */
+   /**< get information of filter, such as status or statistics */
+   RTE_ETH_FILTER_OP_GET_INFO,
+   RTE_ETH_FILTER_OP_MAX,
+};
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ETH_CTRL_H_ */
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index b71b679..fdafb15 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3139,3 +3139,35 @@ rte_eth_dev_get_flex_filter(uint8_t port_id, uint16_t 
index,
return (*dev->dev_ops->get_flex_filter)(dev, index, filter,
rx_queue);
 }
+
+int
+rte_eth_dev_filter_supported(uint8_t port_id, enum rte_filter_type filter_type)
+{
+   struct rte_eth_dev *dev;
+
+   if (port_id >= nb_ports) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return -ENODEV;
+   }
+
+   dev = &rte_eth_devices[port_id];
+   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->filter_ctrl, -ENOTSUP);
+   return (*dev->dev_ops->filter_ctrl)(dev, filter_type,
+   

[dpdk-dev] [PATCH v3 1/7] ethdev: add more annotations

2014-09-30 Thread Helin Zhang
Add more annotations about packet classification type.

v3 changes:
* Remove renamings of RSS 'SHIFT's.
* Add more annotations for RSS 'SHIFT's.

Signed-off-by: Helin Zhang 
Acked-by: Jingjing Wu 
---
 lib/librte_ether/rte_ethdev.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index bbc6022..ad7b9d4 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -334,7 +334,10 @@ struct rte_eth_rss_conf {
uint64_t rss_hf; /**< Hash functions to apply - see below. */
 };

-/* Supported RSS offloads */
+/*
+ * Supported RSS offloads, below '_SHIFT' can also be used to represent
+ * the 'Packet Classification type (pctype)'.
+ */
 /* for 1G & 10G */
 #define ETH_RSS_IPV4_SHIFT0
 #define ETH_RSS_IPV4_TCP_SHIFT1
-- 
1.8.1.4



[dpdk-dev] [PATCH] ixgbe: Fix clang compilation issue

2014-09-30 Thread Wiles, Roger Keith
Acked-by: Keith Wiles 

On Sep 30, 2014, at 4:40 AM, Bruce Richardson  
wrote:

> Issue reported by Keith Wiles.
> Clang fails with an error about a variable being used uninitialized:
> 
>  CC ixgbe_rxtx_vec.o
> /home/keithw/projects/dpdk-code/org-dpdk/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c:67:30:
> error: variable 'dma_addr0' is uninitialized
>  when used here [-Werror,-Wuninitialized]
>dma_addr0 = _mm_xor_si128(dma_addr0, dma_addr0);
>  ^
> 
> This error can be fixed by replacing the call to xor which
> takes two parameters, by a call to setzero, which does not take any.
> 
> Signed-off-by: Bruce Richardson 
> ---
> lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c 
> b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> index 457f267..2236250 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> @@ -64,7 +64,7 @@ ixgbe_rxq_rearm(struct igb_rx_queue *rxq)
>RTE_IXGBE_RXQ_REARM_THRESH) < 0) {
>   if (rxq->rxrearm_nb + RTE_IXGBE_RXQ_REARM_THRESH >=
>   rxq->nb_rx_desc) {
> - dma_addr0 = _mm_xor_si128(dma_addr0, dma_addr0);
> + dma_addr0 = _mm_setzero_si128();
>   for (i = 0; i < RTE_IXGBE_DESCS_PER_LOOP; i++) {
>   rxep[i].mbuf = &rxq->fake_mbuf;
>   _mm_store_si128((__m128i *)&rxdp[i].read,
> -- 
> 1.9.3
> 

Keith Wiles, Principal Technologist with CTO office, Wind River mobile 
972-213-5533



[dpdk-dev] vmxnet3 pmd dev restart

2014-09-30 Thread Navakanth M
Hi

I am using DPDK v1.7.0 running on VMware ESXi 5.1 and am trying to
reset a port that uses the pmd_vmxnet3 library by calling the
functions below.
rte_eth_dev_stop
rte_eth_dev_start

Doing this, I hit a panic in rte_free(ring->buf_info) in
vmxnet3_cmd_ring_release().
I have gone through the following thread, but the patch mentioned there
didn't help; instead it crashed in the start function while accessing
buf_info in vmxnet3_post_rx_bufs. I see that buf_info is allocated in the
queue setup functions, which are called at initialization.
http://thread.gmane.org/gmane.comp.networking.dpdk.devel/4683

I tried not freeing it, and then rx packets are not received due to a mismatch in
while (rcd->gen == rxq->comp_ring.gen) in vmxnet3_recv_pkts()

Is this the right way to reset the device port?
Or do I have to call vmxnet3_dev_tx_queue_setup() and
vmxnet3_dev_rx_queue_setup() once stop is called? I have checked
recent patches and threads but did not find much information on this.
Thanks
Navakanth


[dpdk-dev] [memnic PATCH v2 6/7] pmd: add branch hint in recv/xmit

2014-09-30 Thread Xie, Huawei
The patch is ok. For the commit message, would it be better to say
"to reduce branch misprediction"?
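
For reference, a minimal sketch of what such a branch hint looks like, assuming
GCC or Clang. The macro names sketch_likely/sketch_unlikely and the function
count_accepted are made up for this example; DPDK has its own likely() and
unlikely() macros in rte_branch_prediction.h. The hint tells the compiler which
path to lay out as the fall-through case; it does not change the result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for likely()/unlikely(), built on __builtin_expect. */
#define sketch_likely(x)   __builtin_expect(!!(x), 1)
#define sketch_unlikely(x) __builtin_expect(!!(x), 0)

/* Count leading frames that fit in max_len; stop at the first oversized one. */
static size_t count_accepted(const uint32_t *lens, size_t n, uint32_t max_len)
{
	size_t i;

	assert(lens != NULL || n == 0);
	for (i = 0; i < n; i++) {
		/* Oversized frames are expected to be rare. */
		if (sketch_unlikely(lens[i] > max_len))
			break;
	}
	return i;
}
```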

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Hiroshi Shimamoto
> Sent: Tuesday, September 30, 2014 7:15 PM
> To: dev at dpdk.org
> Cc: Hayato Momma
> Subject: [dpdk-dev] [memnic PATCH v2 6/7] pmd: add branch hint in recv/xmit
> 
> From: Hiroshi Shimamoto 
> 
> To reduce instruction cache misses, add branch condition hints to the
> recv/xmit functions. This improves performance slightly.
> 
> We can see performance improvements with memnic-tester.
> Using Xeon E5-2697 v2 @ 2.70GHz, 4 vCPU.
>  size |  before  |  after
>64 | 5.54Mpps | 5.55Mpps
>   128 | 5.46Mpps | 5.44Mpps
>   256 | 5.21Mpps | 5.22Mpps
>   512 | 4.50Mpps | 4.52Mpps
>  1024 | 3.71Mpps | 3.73Mpps
>  1280 | 3.21Mpps | 3.22Mpps
>  1518 | 2.92Mpps | 2.93Mpps
> 
> Signed-off-by: Hiroshi Shimamoto 
> Reviewed-by: Hayato Momma 
> ---
>  pmd/pmd_memnic.c | 18 +-
>  1 file changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/pmd/pmd_memnic.c b/pmd/pmd_memnic.c
> index 7fc3093..875d3ea 100644
> --- a/pmd/pmd_memnic.c
> +++ b/pmd/pmd_memnic.c
> @@ -289,26 +289,26 @@ static uint16_t memnic_recv_pkts(void *rx_queue,
>   int idx, next;
>   struct rte_eth_stats *st = >stats[rte_lcore_id()];
> 
> - if (!adapter->nic->hdr.valid)
> + if (unlikely(!adapter->nic->hdr.valid))
>   return 0;
> 
>   pkts = bytes = errs = 0;
>   idx = adapter->up_idx;
>   for (nr = 0; nr < nb_pkts; nr++) {
>   p = >packets[idx];
> - if (p->status != MEMNIC_PKT_ST_FILLED)
> + if (unlikely(p->status != MEMNIC_PKT_ST_FILLED))
>   break;
>   /* prefetch the next area */
>   next = idx;
> - if (++next >= MEMNIC_NR_PACKET)
> + if (unlikely(++next >= MEMNIC_NR_PACKET))
>   next = 0;
>   rte_prefetch0(>packets[next]);
> - if (p->len > framesz) {
> + if (unlikely(p->len > framesz)) {
>   errs++;
>   goto drop;
>   }
>   mb = rte_pktmbuf_alloc(adapter->mp);
> - if (!mb)
> + if (unlikely(!mb))
>   break;
> 
>   rte_memcpy(rte_pktmbuf_mtod(mb, void *), p->data, p->len);
> @@ -350,7 +350,7 @@ static uint16_t memnic_xmit_pkts(void *tx_queue,
>   uint64_t pkts, bytes, errs;
>   uint32_t framesz = adapter->framesz;
> 
> - if (!adapter->nic->hdr.valid)
> + if (unlikely(!adapter->nic->hdr.valid))
>   return 0;
> 
>   pkts = bytes = errs = 0;
> @@ -360,7 +360,7 @@ static uint16_t memnic_xmit_pkts(void *tx_queue,
>   struct rte_mbuf *sg;
>   void *ptr;
> 
> - if (pkt_len > framesz) {
> + if (unlikely(pkt_len > framesz)) {
>   errs++;
>   break;
>   }
> @@ -379,7 +379,7 @@ retry:
>   goto retry;
>   }
> 
> - if (idx != ACCESS_ONCE(adapter->down_idx)) {
> + if (unlikely(idx != ACCESS_ONCE(adapter->down_idx))) {
>   /*
>* host freed this and got false positive,
>* need to recover the status and retry.
> @@ -388,7 +388,7 @@ retry:
>   goto retry;
>   }
> 
> - if (++idx >= MEMNIC_NR_PACKET)
> + if (unlikely(++idx >= MEMNIC_NR_PACKET))
>   idx = 0;
>   adapter->down_idx = idx;
> 
> --
> 1.8.3.1



[dpdk-dev] [PATCH RFC] mbuf: Adjust TX flags to start at bit 32

2014-09-30 Thread Neil Horman
On Tue, Sep 30, 2014 at 04:26:02PM +0100, Bruce Richardson wrote:
> This patch takes the existing TX flags defined for the mbuf and shifts
> each uniquely defined one left so that additional RX flags can be
> defined without having RX and TX flags mixed together.
> 
> Signed-off-by: Bruce Richardson 
> ---
>  lib/librte_mbuf/rte_mbuf.h | 26 +-
>  1 file changed, 13 insertions(+), 13 deletions(-)
> 
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 1c6e115..c9fc4ec 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -86,26 +86,26 @@ extern "C" {
>  #define PKT_RX_IEEE1588_PTP  0x0200 /**< RX IEEE1588 L2 Ethernet PT Packet. */
>  #define PKT_RX_IEEE1588_TMST 0x0400 /**< RX IEEE1588 L2/L4 timestamped packet.*/
>  
> -#define PKT_TX_VLAN_PKT  0x0800 /**< TX packet is a 802.1q VLAN packet. */
> -#define PKT_TX_IP_CKSUM  0x1000 /**< IP cksum of TX pkt. computed by NIC. */
> -#define PKT_TX_IPV4_CSUM 0x1000 /**< Alias of PKT_TX_IP_CKSUM. */
> -#define PKT_TX_IPV4  PKT_RX_IPV4_HDR /**< IPv4 with no IP checksum offload. */
> -#define PKT_TX_IPV6  PKT_RX_IPV6_HDR /**< IPv6 packet */
> +#define PKT_TX_VLAN_PKT  (0x0001 << 32) /**< TX packet is a 802.1q VLAN packet. */
> +#define PKT_TX_IP_CKSUM  (0x0002 << 32) /**< IP cksum of TX pkt. computed by NIC. */
> +#define PKT_TX_IPV4_CSUM PKT_TX_IP_CKSUM /**< Alias of PKT_TX_IP_CKSUM. */
> +#define PKT_TX_IPV4  PKT_RX_IPV4_HDR /**< IPv4 with no IP checksum offload. */
> +#define PKT_TX_IPV6  PKT_RX_IPV6_HDR /**< IPv6 packet */
>  
>  /*
> - * Bit 14~13 used for L4 packet type with checksum enabled.
> + * Bit 35~34 used for L4 packet type with checksum enabled.
>   * 00: Reserved
>   * 01: TCP checksum
>   * 10: SCTP checksum
>   * 11: UDP checksum
>   */
> -#define PKT_TX_L4_MASK   0x6000 /**< Mask bits for L4 checksum offload request. */
> -#define PKT_TX_L4_NO_CKSUM   0x /**< Disable L4 cksum of TX pkt. */
> -#define PKT_TX_TCP_CKSUM 0x2000 /**< TCP cksum of TX pkt. computed by NIC. */
> -#define PKT_TX_SCTP_CKSUM 0x4000 /**< SCTP cksum of TX pkt. computed by NIC. */
> -#define PKT_TX_UDP_CKSUM 0x6000 /**< UDP cksum of TX pkt. computed by NIC. */
> -/* Bit 15 */
> -#define PKT_TX_IEEE1588_TMST 0x8000 /**< TX IEEE1588 packet to timestamp. */
> +#define PKT_TX_L4_NO_CKSUM   (0x << 32) /**< Disable L4 cksum of TX pkt. */
> +#define PKT_TX_TCP_CKSUM (0x0004 << 32) /**< TCP cksum of TX pkt. computed by NIC. */
> +#define PKT_TX_SCTP_CKSUM (0x0008 << 32) /**< SCTP cksum of TX pkt. computed by NIC. */
> +#define PKT_TX_UDP_CKSUM (0x000C << 32) /**< UDP cksum of TX pkt. computed by NIC. */
> +#define PKT_TX_L4_MASK   (0x000C << 32) /**< Mask for L4 cksum offload request. */
> +/* Bit 36 */
> +#define PKT_TX_IEEE1588_TMST (0x0010 << 32) /**< TX IEEE1588 packet to timestamp. */
>  
>  /* Use final bit of flags to indicate a control mbuf */
>  #define CTRL_MBUF_FLAG   (1ULL << 63)
> -- 
> 1.9.3
> 
> 

I'm not opposed to the patch at all, but I would like to point out that this is
the sort of change that breaks ABI very easily (which is fine right now given
the mbuf changes already staged for the release, but still something to be aware
of).  As such, are there advantages to this patch (other than the niceness of
human readability)?

If we're going to reshuffle these flags now, it might be nice to start tx flags
at the most significant bit and count back, and start rx flags at the least
significant bit and count up.  That would ensure that we don't reserve flags for
a direction without need.

Best
Neil
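
A minimal sketch of the two layouts discussed above, not the code from the
patch. All SKETCH_* names are hypothetical. The constants carry a ULL suffix on
the assumption that the flags field is 64 bits wide; shifting a plain int
constant left by 32 is undefined where int is 32 bits. The two helper functions
illustrate the suggestion of counting RX flags up from bit 0 and TX flags down
from the top, here starting at bit 62 on the assumption that bit 63 stays
reserved for CTRL_MBUF_FLAG.

```c
#include <assert.h>
#include <stdint.h>

/* Layout from the patch: TX flags start at bit 32 (hypothetical names). */
#define SKETCH_TX_VLAN_PKT   (1ULL << 32)
#define SKETCH_TX_IP_CKSUM   (1ULL << 33)
#define SKETCH_TX_TCP_CKSUM  (1ULL << 34) /* L4 type 01 */
#define SKETCH_TX_SCTP_CKSUM (2ULL << 34) /* L4 type 10 */
#define SKETCH_TX_UDP_CKSUM  (3ULL << 34) /* L4 type 11 */
#define SKETCH_TX_L4_MASK    (3ULL << 34) /* bits 35~34 */
#define SKETCH_CTRL_FLAG     (1ULL << 63) /* assumed to stay reserved */

/* Alternative layout: RX flags grow up from bit 0. */
static inline uint64_t sketch_rx_flag(unsigned int n)
{
	assert(n < 63);
	return 1ULL << n;
}

/* Alternative layout: TX flags grow down from bit 62. */
static inline uint64_t sketch_tx_flag(unsigned int n)
{
	assert(n < 63);
	return 1ULL << (62 - n);
}

/* Extract the two-bit L4 checksum request from a flags word. */
static inline uint64_t sketch_tx_l4_type(uint64_t ol_flags)
{
	return ol_flags & SKETCH_TX_L4_MASK;
}
```

With the second layout, neither direction reserves bits ahead of need: the two
ranges only meet once all 63 remaining bits are in use.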



[dpdk-dev] [PATCH v2] Change alarm cancel function to thread-safe:

2014-09-30 Thread Wodkowski, PawelX
> -Original Message-
> Paweł
> 
> > On Mon, Sep 29, 2014 at 10:11:38AM +, Wodkowski, PawelX wrote:
> > > > >
> > > > > Imagine how you will be damned by someone who did not even notice
> > > > > your change and who is managing some kind of resource based on the
> > > > > returned number of set/canceled timers. If you suddenly start
> > > > > returning negative values, how will those applications behave?
> > > > > Silently changing the returned value domain is evil in its purest
> > > > > form.
> > > >
> > > > As I can see the impact is very limited.
> > >
> > > It is small impact to DPDK but can be huge to user application:
> >
> > This is why we traditionally have in the release-notes for each release a
> > section dedicated to calling out changes from one release to another. [See
> > http://dpdk.org/doc/intel/dpdk-release-notes-1.7.0.pdf section 5]. Since
> > from release-to-release there are generally only a couple of changes -
> > though our next release may be a little different - the actual changes are
> > clear enough to read about without wading through pages of documentation.
> > I think calling out the change in both the release notes and the API docs
> > is sufficient even for a change like this.
> >
> > Basically, I wouldn't let API stability factor in too much in trying to get
> > a proper fix for this issue.
> >
> > /Bruce
> >
> 
> To summarize the proposed solutions so that I can produce a final patch:
> what is the desired behavior after the fix?
> 
> 1. Leave it as in patch v2:
> 1.1 If canceling from another thread and one of the alarms is executing,
>   wait for it to finish and then cancel any rearmed alarms.
> 1.2 If canceling from the alarm handler and one of the alarms to cancel is
>   the currently executing callback, do not wait for it to finish; cancel
>   any others.
> 
>  In both cases, return the number of canceled callbacks.
> 
> 2. Do exactly as in 1, but return -EINPROGRESS instead of the number of
>   canceled alarms if one of the alarms to cancel is currently executing its
>   callback on the alarm thread (the number of canceled alarms is then lost).

Or, instead of returning -EINPROGRESS, set errno to EINPROGRESS? Reporting the
error through errno rather than the return value means the function can always
return the number of canceled callbacks (0 in the error condition).

> 
> 3. Refuse to cancel anything if canceling the currently executing alarm from
>   the alarm callback, and return -EINPROGRESS; otherwise do as in 1.1.
> 
> 4. Implement behaviour 1, 2, or 3 (which?) and add an API call to query the
>   list of alarms and return the state of the given alarm(s).
> 
> 5. other solutions?
> 
> Pawel


[dpdk-dev] [PATCH v3] distributor_app: new sample app

2014-09-30 Thread reshmapa
From: Reshma Pattan 

A new sample app that shows the usage of the distributor library. This
app works as follows:

* An RX thread runs which pulls packets from each ethernet port in turn
  and passes those packets to worker using a distributor component.
* The workers take the packets in turn, and determine the output port
  for those packets using basic l2forwarding doing an xor on the source
  port id.
* The RX thread takes the returned packets from the workers and enqueue
  those packets into an rte_ring structure.
* A TX thread pulls the packets off the rte_ring structure and then
  sends each packet out the output port specified previously by the worker
* Command-line option support provided only for portmask.
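
The xor-based port selection in the worker step can be sketched roughly as
follows. This is an illustration under assumptions, not the code from main.c:
the function name, the nb_ports parameter, and the fallback for a port without
a partner are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pair ports as (0,1), (2,3), ... by flipping the lowest bit of the
 * source port id. A port with no partner sends back out on itself.
 */
static inline uint8_t sketch_output_port(uint8_t in_port, unsigned int nb_ports)
{
	uint8_t out_port;

	assert(nb_ports > 0 && in_port < nb_ports);
	out_port = (uint8_t)(in_port ^ 1);
	if (out_port >= nb_ports)
		return in_port;
	return out_port;
}
```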

Signed-off-by: Bruce Richardson 
Signed-off-by: Reshma Pattan
---
 examples/Makefile |1 +
 examples/distributor_app/Makefile |   57 
 examples/distributor_app/main.c   |  600 +
 examples/distributor_app/main.h   |   46 +++
 4 files changed, 704 insertions(+), 0 deletions(-)
 create mode 100644 examples/distributor_app/Makefile
 create mode 100644 examples/distributor_app/main.c
 create mode 100644 examples/distributor_app/main.h

diff --git a/examples/Makefile b/examples/Makefile
index 6245f83..2ba82b0 100644
--- a/examples/Makefile
+++ b/examples/Makefile
@@ -66,5 +66,6 @@ DIRS-y += vhost
 DIRS-$(CONFIG_RTE_LIBRTE_XEN_DOM0) += vhost_xen
 DIRS-y += vmdq
 DIRS-y += vmdq_dcb
+DIRS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += distributor_app

 include $(RTE_SDK)/mk/rte.extsubdir.mk
diff --git a/examples/distributor_app/Makefile b/examples/distributor_app/Makefile
new file mode 100644
index 000..6a5bada
--- /dev/null
+++ b/examples/distributor_app/Makefile
@@ -0,0 +1,57 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overriden by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = distributor_app
+
+# all source are stored in SRCS-y
+SRCS-y := main.c
+
+CFLAGS += $(WERROR_FLAGS)
+
+# workaround for a gcc bug with noreturn attribute
+# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
+ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
+CFLAGS_main.o += -Wno-return-type
+endif
+
+EXTRA_CFLAGS += -O3 -Wfatal-errors
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/distributor_app/main.c b/examples/distributor_app/main.c
new file mode 100644
index 000..f555d93
--- /dev/null
+++ b/examples/distributor_app/main.c
@@ -0,0 +1,600 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the 

[dpdk-dev] [PATCH 1/4 v4] compat: Add infrastructure to support symbol versioning

2014-09-30 Thread Neil Horman
Add initial pass header files to support symbol versioning.

---
Change notes
v2)
* Fixed ifdef in rte_compat.h to test for RTE_BUILD_SHARED_LIB instead of the
non-existent RTE_SYMBOL_VERSIONING

* Fixed VERSION_SYMBOL macro to add the needed extra @ to make versioning work
properly

* Improved/Clarified documentation

v3)
* Added missing macros to fully export the symver directive specification

v4)
* Added macro definitions for !SHARED_LIB case
* Improved documentation
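
As a rough illustration of the idea in the change notes, the sketch below
builds GNU assembler .symver directives from macro arguments and compiles them
out when a shared library is not being built. Every SKETCH_* name and the
"DPDK_" version node prefix are assumptions made for this example; the macros
in rte_compat.h may differ. In a .symver directive a single '@' binds a symbol
to a non-default version and '@@' marks the default version.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_STR_(x) #x
#define SKETCH_STR(x) SKETCH_STR_(x)

/* b: base symbol name, e: suffix of the implementation, n: version node */
#define SKETCH_SYMVER(b, e, n) \
	".symver " SKETCH_STR(b) SKETCH_STR(e) ", " \
	SKETCH_STR(b) "@DPDK_" SKETCH_STR(n)
#define SKETCH_SYMVER_DEFAULT(b, e, n) \
	".symver " SKETCH_STR(b) SKETCH_STR(e) ", " \
	SKETCH_STR(b) "@@DPDK_" SKETCH_STR(n)

#ifdef SKETCH_BUILD_SHARED_LIB
/* Shared build: emit the directive into the object file. */
#define SKETCH_VERSION_SYMBOL(b, e, n) __asm__(SKETCH_SYMVER(b, e, n))
#define SKETCH_BIND_DEFAULT_SYMBOL(b, e, n) \
	__asm__(SKETCH_SYMVER_DEFAULT(b, e, n))
#else
/* Static build: versioning is meaningless, so expand to nothing. */
#define SKETCH_VERSION_SYMBOL(b, e, n)
#define SKETCH_BIND_DEFAULT_SYMBOL(b, e, n)
#endif

/* Returns 1 if the directive string marks a default ('@@') version. */
static inline int sketch_is_default(const char *directive)
{
	assert(directive != NULL);
	return strstr(directive, "@@") != NULL;
}
```

In a shared build the linker also needs a version script that defines the
version nodes named in the directives.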

Signed-off-by: Neil Horman 
CC: Thomas Monjalon 
CC: "Richardson, Bruce" 
CC: "Gonzalez Monroy, Sergio" 
---
 lib/Makefile   |  1 +
 lib/librte_compat/Makefile | 38 +
 lib/librte_compat/rte_compat.h | 96 ++
 mk/rte.lib.mk  |  6 +++
 4 files changed, 141 insertions(+)
 create mode 100644 lib/librte_compat/Makefile
 create mode 100644 lib/librte_compat/rte_compat.h

diff --git a/lib/Makefile b/lib/Makefile
index 10c5bb3..a85b55b 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -32,6 +32,7 @@
 include $(RTE_SDK)/mk/rte.vars.mk

 DIRS-$(CONFIG_RTE_LIBC) += libc
+DIRS-y += librte_compat
 DIRS-$(CONFIG_RTE_LIBRTE_EAL) += librte_eal
 DIRS-$(CONFIG_RTE_LIBRTE_MALLOC) += librte_malloc
 DIRS-$(CONFIG_RTE_LIBRTE_RING) += librte_ring
diff --git a/lib/librte_compat/Makefile b/lib/librte_compat/Makefile
new file mode 100644
index 000..3415c7b
--- /dev/null
+++ b/lib/librte_compat/Makefile
@@ -0,0 +1,38 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Neil Horman 
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+
+# install includes
+SYMLINK-y-include := rte_compat.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_compat/rte_compat.h b/lib/librte_compat/rte_compat.h
new file mode 100644
index 000..d99e362
--- /dev/null
+++ b/lib/librte_compat/rte_compat.h
@@ -0,0 +1,96 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Neil Horman .
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT

[dpdk-dev] [PATCH v2] bond: Add mode 4 support.

2014-09-30 Thread Wodkowski, PawelX
Fixed patch version sent.

Pawel


[dpdk-dev] [memnic PATCH v2 7/7] pmd: burst mbuf freeing in xmit

2014-09-30 Thread Hiroshi Shimamoto
From: Hiroshi Shimamoto 

rte_pktmbuf_free() may incur cache misses and memory stalls.
With small packets, this can hurt performance.

Move the frees out of the transmit loop and release the mbufs in one burst
afterwards. According to the memnic-tester results, performance improves for
frame sizes below 1024 bytes.

Using Xeon E5-2697 v2 @ 2.70GHz, 4 vCPU.
 size |  before  |  after
   64 | 5.55Mpps | 5.83Mpps
  128 | 5.44Mpps | 5.71Mpps
  256 | 5.22Mpps | 5.40Mpps
  512 | 4.52Mpps | 4.64Mpps
 1024 | 3.73Mpps | 3.68Mpps
 1280 | 3.22Mpps | 3.17Mpps
 1518 | 2.93Mpps | 2.90Mpps

Signed-off-by: Hiroshi Shimamoto 
Reviewed-by: Hayato Momma 
---
 pmd/pmd_memnic.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/pmd/pmd_memnic.c b/pmd/pmd_memnic.c
index 875d3ea..59ee332 100644
--- a/pmd/pmd_memnic.c
+++ b/pmd/pmd_memnic.c
@@ -344,7 +344,7 @@ static uint16_t memnic_xmit_pkts(void *tx_queue,
struct memnic_adapter *adapter = q->adapter;
struct memnic_data *data = >nic->down;
struct memnic_packet *p;
-   uint16_t nr;
+   uint16_t i, nr;
int idx;
struct rte_eth_stats *st = >stats[rte_lcore_id()];
uint64_t pkts, bytes, errs;
@@ -408,9 +408,9 @@ retry:

rte_compiler_barrier();
p->status = MEMNIC_PKT_ST_FILLED;
-
-   rte_pktmbuf_free(tx_pkts[nr]);
}
+   for (i = 0; i < nr; i++)
+   rte_pktmbuf_free(tx_pkts[i]);

/* stats */
st->opackets += pkts;
-- 
1.8.3.1
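
The restructuring in this patch can be sketched in isolation as below. This is
a simplified model under assumptions, not the PMD code: buffers come from
malloc(), the "room" parameter stands in for free slots in the shared ring, and
all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

static unsigned int sketch_freed; /* counts frees, for illustration only */

static void sketch_buf_free(void *buf)
{
	free(buf);
	sketch_freed++;
}

/*
 * Send up to nb_bufs buffers, limited by room. The frees are deferred
 * until the transmit loop is done, so the hot loop does not touch the
 * allocator. Returns the number of buffers sent (and freed).
 */
static uint16_t sketch_xmit_burst(void **bufs, uint16_t nb_bufs, uint16_t room)
{
	uint16_t i, nr;

	assert(bufs != NULL || nb_bufs == 0);
	for (nr = 0; nr < nb_bufs; nr++) {
		if (nr >= room)
			break;
		/* copying bufs[nr] into the shared ring would happen here */
	}
	/* release everything that was sent in one pass */
	for (i = 0; i < nr; i++)
		sketch_buf_free(bufs[i]);
	return nr;
}
```

Buffers that were not sent stay owned by the caller, which matches the usual
tx_burst contract of returning how many packets were consumed.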



[dpdk-dev] [memnic PATCH v2 6/7] pmd: add branch hint in recv/xmit

2014-09-30 Thread Hiroshi Shimamoto
From: Hiroshi Shimamoto 

To reduce instruction cache misses, add branch prediction hints to the
recv/xmit functions. This improves performance slightly.

We can see performance improvements with memnic-tester.
Using Xeon E5-2697 v2 @ 2.70GHz, 4 vCPU.
 size |  before  |  after
   64 | 5.54Mpps | 5.55Mpps
  128 | 5.46Mpps | 5.44Mpps
  256 | 5.21Mpps | 5.22Mpps
  512 | 4.50Mpps | 4.52Mpps
 1024 | 3.71Mpps | 3.73Mpps
 1280 | 3.21Mpps | 3.22Mpps
 1518 | 2.92Mpps | 2.93Mpps

Signed-off-by: Hiroshi Shimamoto 
Reviewed-by: Hayato Momma 
---
 pmd/pmd_memnic.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/pmd/pmd_memnic.c b/pmd/pmd_memnic.c
index 7fc3093..875d3ea 100644
--- a/pmd/pmd_memnic.c
+++ b/pmd/pmd_memnic.c
@@ -289,26 +289,26 @@ static uint16_t memnic_recv_pkts(void *rx_queue,
int idx, next;
struct rte_eth_stats *st = >stats[rte_lcore_id()];

-   if (!adapter->nic->hdr.valid)
+   if (unlikely(!adapter->nic->hdr.valid))
return 0;

pkts = bytes = errs = 0;
idx = adapter->up_idx;
for (nr = 0; nr < nb_pkts; nr++) {
p = >packets[idx];
-   if (p->status != MEMNIC_PKT_ST_FILLED)
+   if (unlikely(p->status != MEMNIC_PKT_ST_FILLED))
break;
/* prefetch the next area */
next = idx;
-   if (++next >= MEMNIC_NR_PACKET)
+   if (unlikely(++next >= MEMNIC_NR_PACKET))
next = 0;
rte_prefetch0(>packets[next]);
-   if (p->len > framesz) {
+   if (unlikely(p->len > framesz)) {
errs++;
goto drop;
}
mb = rte_pktmbuf_alloc(adapter->mp);
-   if (!mb)
+   if (unlikely(!mb))
break;

rte_memcpy(rte_pktmbuf_mtod(mb, void *), p->data, p->len);
@@ -350,7 +350,7 @@ static uint16_t memnic_xmit_pkts(void *tx_queue,
uint64_t pkts, bytes, errs;
uint32_t framesz = adapter->framesz;

-   if (!adapter->nic->hdr.valid)
+   if (unlikely(!adapter->nic->hdr.valid))
return 0;

pkts = bytes = errs = 0;
@@ -360,7 +360,7 @@ static uint16_t memnic_xmit_pkts(void *tx_queue,
struct rte_mbuf *sg;
void *ptr;

-   if (pkt_len > framesz) {
+   if (unlikely(pkt_len > framesz)) {
errs++;
break;
}
@@ -379,7 +379,7 @@ retry:
goto retry;
}

-   if (idx != ACCESS_ONCE(adapter->down_idx)) {
+   if (unlikely(idx != ACCESS_ONCE(adapter->down_idx))) {
/*
 * host freed this and got false positive,
 * need to recover the status and retry.
@@ -388,7 +388,7 @@ retry:
goto retry;
}

-   if (++idx >= MEMNIC_NR_PACKET)
+   if (unlikely(++idx >= MEMNIC_NR_PACKET))
idx = 0;
adapter->down_idx = idx;

-- 
1.8.3.1
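
For readers unfamiliar with the hint, a standalone sketch follows. It assumes
GCC or Clang, where __builtin_expect() is available, and uses a hypothetical
sketch_unlikely() macro and a toy status ring instead of the DPDK unlikely()
macro and the MEMNIC structures. The two rarely taken branches (slot not
filled, index wraparound) are marked so the compiler can keep the common path
straight.

```c
#include <assert.h>
#include <stdint.h>

#define sketch_unlikely(x) __builtin_expect(!!(x), 0)

#define SKETCH_NR_PACKET 8u
#define SKETCH_ST_FREE   0u
#define SKETCH_ST_FILLED 1u

/*
 * Consume up to max filled slots starting at *idx_io, marking each one
 * free. Returns the number of slots consumed and stores the next index.
 */
static uint16_t sketch_drain(uint8_t *status, unsigned int *idx_io,
			     uint16_t max)
{
	unsigned int idx = *idx_io;
	uint16_t nr;

	assert(idx < SKETCH_NR_PACKET);
	for (nr = 0; nr < max; nr++) {
		if (sketch_unlikely(status[idx] != SKETCH_ST_FILLED))
			break;
		status[idx] = SKETCH_ST_FREE;
		if (sketch_unlikely(++idx >= SKETCH_NR_PACKET))
			idx = 0;
	}
	*idx_io = idx;
	return nr;
}
```

The hint changes code layout only, never behaviour, which fits the small
deltas in the table above.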



[dpdk-dev] [memnic PATCH v2 5/7] pmd: packet receiving optimization with prefetch

2014-09-30 Thread Hiroshi Shimamoto
From: Hiroshi Shimamoto 

Prefetch the next packet area to reduce memory stall cycles.

Prefetching the next packet area can hide the memory stall, because that
area is accessed right after the current packet has been processed.

We can see performance improvements with memnic-tester.
Using Xeon E5-2697 v2 @ 2.70GHz, 4 vCPU.
 size |  before  |  after
   64 | 4.59Mpps | 5.54Mpps
  128 | 4.87Mpps | 5.46Mpps
  256 | 4.72Mpps | 5.21Mpps
  512 | 4.41Mpps | 4.50Mpps
 1024 | 3.64Mpps | 3.71Mpps
 1280 | 3.15Mpps | 3.21Mpps
 1518 | 2.87Mpps | 2.92Mpps

Signed-off-by: Hiroshi Shimamoto 
Reviewed-by: Hayato Momma 
---
 pmd/pmd_memnic.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/pmd/pmd_memnic.c b/pmd/pmd_memnic.c
index 0783440..7fc3093 100644
--- a/pmd/pmd_memnic.c
+++ b/pmd/pmd_memnic.c
@@ -286,7 +286,7 @@ static uint16_t memnic_recv_pkts(void *rx_queue,
uint16_t nr;
uint64_t pkts, bytes, errs;
uint32_t framesz = adapter->framesz;
-   int idx;
+   int idx, next;
struct rte_eth_stats *st = >stats[rte_lcore_id()];

if (!adapter->nic->hdr.valid)
@@ -298,6 +298,11 @@ static uint16_t memnic_recv_pkts(void *rx_queue,
p = >packets[idx];
if (p->status != MEMNIC_PKT_ST_FILLED)
break;
+   /* prefetch the next area */
+   next = idx;
+   if (++next >= MEMNIC_NR_PACKET)
+   next = 0;
+   rte_prefetch0(>packets[next]);
if (p->len > framesz) {
errs++;
goto drop;
@@ -318,9 +323,7 @@ static uint16_t memnic_recv_pkts(void *rx_queue,
 drop:
rte_compiler_barrier();
p->status = MEMNIC_PKT_ST_FREE;
-
-   if (++idx >= MEMNIC_NR_PACKET)
-   idx = 0;
+   idx = next;
}
adapter->up_idx = idx;

-- 
1.8.3.1
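
The pattern in this patch, reduced to a standalone sketch: compute the next
ring index first, issue a prefetch for it, and only then work on the current
slot. The example assumes GCC or Clang and uses __builtin_prefetch() in place
of rte_prefetch0(); the slot structure and the sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_RING_SIZE 4u

struct sketch_slot {
	uint32_t len;
	uint8_t data[60];
};

/* Advance a ring index with wraparound. */
static inline unsigned int sketch_next(unsigned int idx)
{
	if (++idx >= SKETCH_RING_SIZE)
		idx = 0;
	return idx;
}

/* Sum the lengths of count slots, prefetching one slot ahead. */
static uint64_t sketch_sum_len(const struct sketch_slot *ring,
			       unsigned int start, unsigned int count)
{
	uint64_t bytes = 0;
	unsigned int idx = start, i;

	assert(start < SKETCH_RING_SIZE);
	for (i = 0; i < count; i++) {
		unsigned int next = sketch_next(idx);

		/* read access, keep in all cache levels */
		__builtin_prefetch(&ring[next], 0, 3);
		bytes += ring[idx].len;
		idx = next;
	}
	return bytes;
}
```

Reusing the precomputed next index at the end of the loop body is what lets the
patch drop the separate wraparound check after the drop label.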



[dpdk-dev] [memnic PATCH v2 4/7] pmd: use compiler barrier

2014-09-30 Thread Hiroshi Shimamoto
From: Hiroshi Shimamoto 

x86 preserves store ordering for ordinary stores, so a compiler barrier is
enough here.

A full memory barrier is expensive in the main packet processing loop.
Replacing it with a compiler barrier improves xmit/recv packet performance.

We can see performance improvements with memnic-tester.
Using Xeon E5-2697 v2 @ 2.70GHz, 4 vCPU.
 size |  before  |  after
   64 | 4.18Mpps | 4.59Mpps
  128 | 3.85Mpps | 4.87Mpps
  256 | 4.01Mpps | 4.72Mpps
  512 | 3.52Mpps | 4.41Mpps
 1024 | 3.18Mpps | 3.64Mpps
 1280 | 2.86Mpps | 3.15Mpps
 1518 | 2.59Mpps | 2.87Mpps

Note: we have to take care if we use non-temporal cache.

Signed-off-by: Hiroshi Shimamoto 
Reviewed-by: Hayato Momma 
---
 pmd/pmd_memnic.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/pmd/pmd_memnic.c b/pmd/pmd_memnic.c
index 872f3c4..0783440 100644
--- a/pmd/pmd_memnic.c
+++ b/pmd/pmd_memnic.c
@@ -316,7 +316,7 @@ static uint16_t memnic_recv_pkts(void *rx_queue,
bytes += p->len;

 drop:
-   rte_mb();
+   rte_compiler_barrier();
p->status = MEMNIC_PKT_ST_FREE;

if (++idx >= MEMNIC_NR_PACKET)
@@ -403,7 +403,7 @@ retry:
pkts++;
bytes += pkt_len;

-   rte_mb();
+   rte_compiler_barrier();
p->status = MEMNIC_PKT_ST_FILLED;

rte_pktmbuf_free(tx_pkts[nr]);
-- 
1.8.3.1
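
A standalone sketch of the publish step, assuming GCC or Clang inline assembly
and an x86 target where ordinary stores are not reordered with other stores.
The sketch_* names and the packet layout are hypothetical. The empty asm
statement with a "memory" clobber stops the compiler from moving the payload
writes past the status write, but emits no fence instruction; on a weakly
ordered CPU this would not be sufficient.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define sketch_compiler_barrier() __asm__ volatile("" ::: "memory")

#define SKETCH_ST_FREE   0u
#define SKETCH_ST_FILLED 1u

struct sketch_pkt {
	volatile uint32_t status; /* polled by the consumer */
	uint32_t len;
	uint8_t data[64];
};

/*
 * Fill the slot, then flip its status so the consumer may read it.
 * Returns 0 on success, -1 if the payload does not fit.
 */
static int sketch_publish(struct sketch_pkt *p, const void *src, uint32_t len)
{
	assert(p != NULL && src != NULL);
	if (len > sizeof(p->data))
		return -1;
	memcpy(p->data, src, len);
	p->len = len;
	/* payload must be written before the status becomes visible */
	sketch_compiler_barrier();
	p->status = SKETCH_ST_FILLED;
	return 0;
}
```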



[dpdk-dev] [PATCH v2] Change alarm cancel function to thread-safe:

2014-09-30 Thread Wodkowski, PawelX


Paweł

> -Original Message-
> From: Richardson, Bruce
> Sent: Monday, September 29, 2014 12:33
> To: Wodkowski, PawelX
> Cc: Ananyev, Konstantin; Neil Horman; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] Change alarm cancel function to thread-safe:
> 
> On Mon, Sep 29, 2014 at 10:11:38AM +, Wodkowski, PawelX wrote:
> > > >
> > > > Imagine how you will be damned by someone who did not even notice your
> > > > change and who is managing some kind of resource based on the returned
> > > > number of set/canceled timers. If you suddenly start returning negative
> > > > values, how will those applications behave? Silently changing the
> > > > returned value domain is evil in its purest form.
> > >
> > > As I can see the impact is very limited.
> >
> > It is small impact to DPDK but can be huge to user application:
> 
> This is why we traditionally have in the release-notes for each release a
> section dedicated to calling out changes from one release to another. [See
> http://dpdk.org/doc/intel/dpdk-release-notes-1.7.0.pdf section 5]. Since
> from release-to-release there are generally only a couple of changes -
> though our next release may be a little different - the actual changes are
> clear enough to read about without wading through pages of documentation. I
> think calling out the change in both the release notes and the API docs
> is sufficient even for a change like this.
> 
> Basically, I wouldn't let API stability factor in too much in trying to get
> a proper fix for this issue.
> 
> /Bruce
> 

To summarize the proposed solutions so that I can produce a final patch: what
is the desired behavior after the fix?

1. Leave it as in patch v2:
1.1 If canceling from another thread and one of the alarms is executing, wait
  for it to finish and then cancel any rearmed alarms.
1.2 If canceling from the alarm handler and one of the alarms to cancel is the
  currently executing callback, do not wait for it to finish; cancel any
  others.

 In both cases, return the number of canceled callbacks.

2. Do exactly as in 1, but return -EINPROGRESS instead of the number of
  canceled alarms if one of the alarms to cancel is currently executing its
  callback on the alarm thread (the number of canceled alarms is then lost).

3. Refuse to cancel anything if canceling the currently executing alarm from
  the alarm callback, and return -EINPROGRESS; otherwise do as in 1.1.

4. Implement behaviour 1, 2, or 3 (which?) and add an API call to query the
  list of alarms and return the state of the given alarm(s).

5. Other solutions?

Pawel


[dpdk-dev] [memnic PATCH v2 3/7] pmd: use helper macros

2014-09-30 Thread Hiroshi Shimamoto
From: Hiroshi Shimamoto 

Do not touch pktmbuf directly.

Instead of direct access, use rte_pktmbuf_pkt_len() and rte_pktmbuf_data_len()
to access these fields.

Signed-off-by: Hiroshi Shimamoto 
Reviewed-by: Hayato Momma 
---
 pmd/pmd_memnic.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/pmd/pmd_memnic.c b/pmd/pmd_memnic.c
index bbb5380..872f3c4 100644
--- a/pmd/pmd_memnic.c
+++ b/pmd/pmd_memnic.c
@@ -308,8 +308,8 @@ static uint16_t memnic_recv_pkts(void *rx_queue,

rte_memcpy(rte_pktmbuf_mtod(mb, void *), p->data, p->len);
mb->pkt.in_port = q->port_id;
-   mb->pkt.pkt_len = p->len;
-   mb->pkt.data_len = p->len;
+   rte_pktmbuf_pkt_len(mb) = p->len;
+   rte_pktmbuf_data_len(mb) = p->len;
rx_pkts[nr] = mb;

pkts++;
@@ -394,7 +394,7 @@ retry:
ptr = p->data;
for (sg = tx_pkts[nr]; sg; sg = sg->pkt.next) {
void *src = rte_pktmbuf_mtod(sg, void *);
-   int data_len = sg->pkt.data_len;
+   int data_len = rte_pktmbuf_data_len(sg);

rte_memcpy(ptr, src, data_len);
ptr += data_len;
-- 
1.8.3.1



[dpdk-dev] [memnic PATCH v2 2/7] pmd: remove needless assignment

2014-09-30 Thread Hiroshi Shimamoto
From: Hiroshi Shimamoto 

Because these assignments are already done in rte_pktmbuf_alloc(), get rid of them.

Signed-off-by: Hiroshi Shimamoto 
Reviewed-by: Hayato Momma 
---
 pmd/pmd_memnic.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/pmd/pmd_memnic.c b/pmd/pmd_memnic.c
index 994ed0a..bbb5380 100644
--- a/pmd/pmd_memnic.c
+++ b/pmd/pmd_memnic.c
@@ -308,8 +308,6 @@ static uint16_t memnic_recv_pkts(void *rx_queue,

rte_memcpy(rte_pktmbuf_mtod(mb, void *), p->data, p->len);
mb->pkt.in_port = q->port_id;
-   mb->pkt.nb_segs = 1;
-   mb->pkt.next = NULL;
mb->pkt.pkt_len = p->len;
mb->pkt.data_len = p->len;
rx_pkts[nr] = mb;
-- 
1.8.3.1



[dpdk-dev] [memnic PATCH v2 1/7] guest: memnic-tester: PMD benchmark in guest

2014-09-30 Thread Hiroshi Shimamoto
From: Hiroshi Shimamoto 

Introduce memnic-tester which benchmarks MEMNIC PMD performance in guest.

It starts two threads: one thread produces and consumes packets, and the
other receives packets and immediately transmits them back. This evaluates
the running cost of the MEMNIC PMD.

memnic-tester is a benchmark tool to measure performance of MEMNIC PMD itself.
The master thread forwards packets with Rx and Tx bursts.
The slave thread fills and clears packets in the lightest way. It doesn't send
packets out of the VM because that would increase jitter and hide PMD
performance.
Throughput (number of forwarded packets per second) is given for each frame
size.

The master thread does rx_burst and tx_burst through MEMNIC PMD.
            +--------+
            | master |
            +--------+
     rx_burst ^    | tx_burst
              |    V
          +------+------+
          |  up  | down | MEMNIC shared memory
          +------+------+
     set flag ^    | unset flag
              |    V
            +--------+
            | slave  |
            +--------+
The slave thread emulates packet-in/out by setting flag on/off.

It reports throughput for each of the following frame sizes:
  64, 128, 256, 512, 1024, 1280, 1518

Signed-off-by: Hiroshi Shimamoto 
Reviewed-by: Hayato Momma 
---
 guest/Makefile|  20 
 guest/README.rst  |  93 +
 guest/memnic-tester.c | 281 ++
 3 files changed, 394 insertions(+)
 create mode 100644 guest/Makefile
 create mode 100644 guest/README.rst
 create mode 100644 guest/memnic-tester.c

diff --git a/guest/Makefile b/guest/Makefile
new file mode 100644
index 000..3c90350
--- /dev/null
+++ b/guest/Makefile
@@ -0,0 +1,20 @@
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overriden by command line or environment
+ifeq ($(RTE_TARGET),)
+$(error "Please define RTE_TARGET environment variable")
+endif
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+COMMON_INC_OPT = -I $(PWD)/../common
+
+APP = memnic-tester
+
+CFLAGS += -Wall -g -O3 $(COMMON_INC_OPT)
+
+SRCS-y := memnic-tester.c
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/guest/README.rst b/guest/README.rst
new file mode 100644
index 000..eb230b0
--- /dev/null
+++ b/guest/README.rst
@@ -0,0 +1,93 @@
+.. Copyright 2014 NEC
+   Redistribution and use in source and binary forms, with or without
+   modification, are permitted provided that the following conditions
+   are met:
+   - Redistributions of source code must retain the above copyright
+ notice, this list of conditions and the following disclaimer.
+   - Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions and the following disclaimer in
+ the documentation and/or other materials provided with the
+ distribution.
+   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+   FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+   COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+   INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+   (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+   SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+   HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+   STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+   ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+   OF THE POSSIBILITY OF SUCH DAMAGE.
+
+MEMNIC TESTER
+=
+
+DESCRIPTION
+---
+
+This is a simple benchmark test of the MEMNIC PMD in a guest.
+
+It has two threads: one thread produces and consumes packets, while the
+other thread receives packets and directly transmits the received
+packets back on the MEMNIC interface. This evaluates the MEMNIC PMD running cost.
+
+memnic-tester is a benchmark tool to measure the performance of the MEMNIC PMD itself.
+The master thread forwards packets with Rx and Tx bursts.
+The slave thread fills and clears packets in the lightest way. It does not send
+packets out of the VM because that would increase jitter and hide PMD performance.
+Throughput (number of forwarded packets per second) is given for each frame size.
+
+The master thread does rx_burst and tx_burst through MEMNIC PMD.
++-+
+| master  |
++-+
+ rx_burst ^ | tx_burst
+  | V
+  +--+--+
+  |  up  | down | MEMNIC shared memory
+  +--+--+
+ set flag ^ | unset flag
+  | V
++-+
+|  slave  |
++-+
+The slave thread emulates packet-in/out by setting flag on/off.
+
+As in RFC 2544, evaluations are performed with packets of the following frame sizes:
+  64, 128, 256, 512, 1024, 1280, 1518
+
+It shows the 

[dpdk-dev] [memnic PATCH v2 0/7] MEMNIC PMD performance improvement

2014-09-30 Thread Hiroshi Shimamoto
From: Hiroshi Shimamoto 

This patchset improves MEMNIC PMD performance.

The first patch introduces a new benchmark test that runs in the guest;
it is used to evaluate the effect of the following patches.

This patchset improves the throughput results of memnic-tester.
Using Xeon E5-2697 v2 @ 2.70GHz, 4 vCPU.
 size |  before  |  after
   64 | 4.18Mpps | 5.83Mpps
  128 | 3.85Mpps | 5.71Mpps
  256 | 4.01Mpps | 5.40Mpps
  512 | 3.52Mpps | 4.64Mpps
 1024 | 3.18Mpps | 3.68Mpps
 1280 | 2.86Mpps | 3.17Mpps
 1518 | 2.59Mpps | 2.90Mpps
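
To make the table easier to compare across frame sizes, the figures can be
converted into a relative gain and an approximate frame bit rate. The sketch
below is only an illustration: the helper names (relative_gain_pct,
frame_gbps) are hypothetical and not part of MEMNIC, and the bit rate counts
frame bytes only (no preamble or inter-frame gap).

```c
#include <stddef.h>

/* One row of the before/after table from the cover letter. */
struct tput_row {
	int frame_size;      /* bytes */
	double before_mpps;  /* throughput before the patchset */
	double after_mpps;   /* throughput after the patchset */
};

/* Values copied from the table above. */
const struct tput_row memnic_rows[] = {
	{   64, 4.18, 5.83 },
	{  128, 3.85, 5.71 },
	{  256, 4.01, 5.40 },
	{  512, 3.52, 4.64 },
	{ 1024, 3.18, 3.68 },
	{ 1280, 2.86, 3.17 },
	{ 1518, 2.59, 2.90 },
};
const size_t memnic_row_count = sizeof(memnic_rows) / sizeof(memnic_rows[0]);

/* Relative improvement in percent, e.g. 4.18 -> 5.83 Mpps. */
double relative_gain_pct(double before_mpps, double after_mpps)
{
	return (after_mpps - before_mpps) / before_mpps * 100.0;
}

/* Frame bit rate in Gbit/s for a packet rate given in Mpps. */
double frame_gbps(int frame_size, double mpps)
{
	return mpps * 1e6 * frame_size * 8.0 / 1e9;
}
```

For the 64-byte row this gives a gain of roughly 39 percent; the gain shrinks
as the frame size grows, which is consistent with per-packet overhead being
what the patches reduce.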

Hiroshi Shimamoto (7):
  guest: memnic-tester: PMD benchmark in guest
  pmd: remove needless assignment
  pmd: use helper macros
  pmd: use compiler barrier
  pmd: packet receiving optimization with prefetch
  pmd: add branch hint in recv/xmit
  pmd: burst mbuf freeing in xmit

 guest/Makefile|  20 
 guest/README.rst  |  93 +
 guest/memnic-tester.c | 281 ++
 pmd/pmd_memnic.c  |  45 
 4 files changed, 417 insertions(+), 22 deletions(-)
 create mode 100644 guest/Makefile
 create mode 100644 guest/README.rst
 create mode 100644 guest/memnic-tester.c

-- 
1.8.3.1



[dpdk-dev] [PATCH v4 8/8] bond: unit test test macro refactor

2014-09-30 Thread Declan Doherty

Signed-off-by: Declan Doherty 
---
 app/test/test_link_bonding.c | 2574 +-
 1 file changed, 1036 insertions(+), 1538 deletions(-)

diff --git a/app/test/test_link_bonding.c b/app/test/test_link_bonding.c
index c32b685..c4fcaf7 100644
--- a/app/test/test_link_bonding.c
+++ b/app/test/test_link_bonding.c
@@ -31,6 +31,7 @@
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */

+#include "unistd.h"
 #include 
 #include 
 #include 
@@ -265,7 +266,7 @@ static pthread_cond_t cvar = PTHREAD_COND_INITIALIZER;
 static int
 test_setup(void)
 {
-   int i, retval, nb_mbuf_per_pool;
+   int i, nb_mbuf_per_pool;
struct ether_addr *mac_addr = (struct ether_addr *)slave_mac;

/* Allocate ethernet packet header with space for VLAN header */
@@ -273,10 +274,8 @@ test_setup(void)
test_params->pkt_eth_hdr = malloc(sizeof(struct ether_hdr) +
sizeof(struct vlan_hdr));

-   if (test_params->pkt_eth_hdr == NULL) {
-   printf("ethernet header struct allocation failed!\n");
-   return -1;
-   }
+   TEST_ASSERT_NOT_NULL(test_params->pkt_eth_hdr,
+   "Ethernet header struct allocation failed!");
}

nb_mbuf_per_pool = RTE_TEST_RX_DESC_MAX + DEF_PKT_BURST +
@@ -286,10 +285,8 @@ test_setup(void)
MBUF_SIZE, MBUF_CACHE_SIZE, sizeof(struct 
rte_pktmbuf_pool_private),
rte_pktmbuf_pool_init, NULL, rte_pktmbuf_init, 
NULL,
rte_socket_id(), 0);
-   if (test_params->mbuf_pool == NULL) {
-   printf("rte_mempool_create failed\n");
-   return -1;
-   }
+   TEST_ASSERT_NOT_NULL(test_params->mbuf_pool,
+   "rte_mempool_create failed");
}

/* Create / Initialize virtual eth devs */
@@ -303,20 +300,12 @@ test_setup(void)

test_params->slave_port_ids[i] = 
virtual_ethdev_create(pmd_name,
mac_addr, rte_socket_id(), 1);
-   if (test_params->slave_port_ids[i] < 0) {
-   printf("Failed to create virtual virtual ethdev 
%s\n", pmd_name);
-   return -1;
-   }
+   TEST_ASSERT(test_params->slave_port_ids[i] >= 0,
+   "Failed to create virtual virtual 
ethdev %s", pmd_name);

-   printf("Created virtual ethdev %s\n", pmd_name);
-
-   retval = 
configure_ethdev(test_params->slave_port_ids[i], 1, 0);
-   if (retval != 0) {
-   printf("Failed to configure virtual ethdev 
%s\n", pmd_name);
-   return -1;
-   }
-
-   printf("Configured virtual ethdev %s\n", pmd_name);
+   TEST_ASSERT_SUCCESS(configure_ethdev(
+   test_params->slave_port_ids[i], 1, 0),
+   "Failed to configure virtual ethdev 
%s", pmd_name);
}
slaves_initialized = 1;
}
@@ -350,14 +339,14 @@ test_create_bonded_device(void)
current_slave_count = 
rte_eth_bond_slaves_get(test_params->bonded_port_id,
slaves, RTE_MAX_ETHPORTS);

-   TEST_ASSERT(current_slave_count == 0,
+   TEST_ASSERT_EQUAL(current_slave_count, 0,
"Number of slaves %d is great than expected %d.",
current_slave_count, 0);

current_slave_count = rte_eth_bond_active_slaves_get(
test_params->bonded_port_id, slaves, RTE_MAX_ETHPORTS);

-   TEST_ASSERT(current_slave_count == 0,
+   TEST_ASSERT_EQUAL(current_slave_count, 0,
"Number of active slaves %d is great than expected %d.",
current_slave_count, 0);

@@ -375,30 +364,21 @@ test_create_bonded_device_with_invalid_params(void)
/* Invalid name */
port_id = rte_eth_bond_create(NULL, test_params->bonding_mode,
rte_socket_id());
-   if (port_id >= 0) {
-   printf("Created bonded device unexpectedly.\n");
-   return -1;
-   }
+   TEST_ASSERT(port_id < 0, "Created bonded device unexpectedly");

test_params->bonding_mode = INVALID_BONDING_MODE;

/* Invalid bonding mode */
port_id = rte_eth_bond_create(BONDED_DEV_NAME, 
test_params->bonding_mode,
rte_socket_id());
-   if (port_id >= 0) {
-   printf("Created bonded device unexpectedly.\n");
-   return -1;
-   }
+   TEST_ASSERT(port_id < 0, "Created bonded device 

[dpdk-dev] [PATCH v4 7/8] bond: lsc polling support

2014-09-30 Thread Declan Doherty

Signed-off-by: Declan Doherty 
---
 app/test-pmd/cmdline.c |  63 +
 app/test/test.h|   7 +-
 app/test/test_link_bonding.c   | 258 ---
 app/test/virtual_pmd.c |  17 +-
 app/test/virtual_pmd.h |  48 +++-
 lib/librte_pmd_bond/rte_eth_bond.h |  80 ++
 lib/librte_pmd_bond/rte_eth_bond_api.c | 315 +++
 lib/librte_pmd_bond/rte_eth_bond_args.c|  30 ++-
 lib/librte_pmd_bond/rte_eth_bond_pmd.c | 393 +
 lib/librte_pmd_bond/rte_eth_bond_private.h |  71 --
 10 files changed, 934 insertions(+), 348 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 15ca493..b14df61 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -439,6 +439,9 @@ static void cmd_help_long_parsed(void *parsed_result,

"set bonding xmit_balance_policy (port_id) 
(l2|l23|l34)\n"
"   Set the transmit balance policy for bonded 
device running in balance mode.\n\n"
+
+   "set bonding mon_period (port_id) (value) \n"
+   "   Set the bonding link status monitoring polling 
period in ms.\n\n"
 #endif

, list_pkt_forwarding_modes()
@@ -3705,6 +3708,65 @@ cmdline_parse_inst_t cmd_set_bond_mac_addr = {
}
 };

+
+/* *** SET LINK STATUS MONITORING POLLING PERIOD ON BONDED DEVICE *** */
+struct cmd_set_bond_mon_period_result {
+   cmdline_fixed_string_t set;
+   cmdline_fixed_string_t bonding;
+   cmdline_fixed_string_t mon_period;
+   uint8_t port_num;
+   uint32_t period_ms;
+};
+
+static void cmd_set_bond_mon_period_parsed(void *parsed_result,
+   __attribute__((unused))  struct cmdline *cl,
+   __attribute__((unused)) void *data)
+{
+   struct cmd_set_bond_mon_period_result *res = parsed_result;
+   int ret;
+
+   if (res->port_num >= nb_ports) {
+   printf("Port id %d must be less than %d\n", res->port_num, 
nb_ports);
+   return;
+   }
+
+   ret = rte_eth_bond_link_monitoring_set(res->port_num, res->period_ms);
+
+   /* check the return value and print it if is < 0 */
+   if (ret < 0)
+   printf("set_bond_mac_addr error: (%s)\n", strerror(-ret));
+}
+
+cmdline_parse_token_string_t cmd_set_bond_mon_period_set =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_bond_mon_period_result,
+   set, "set");
+cmdline_parse_token_string_t cmd_set_bond_mon_period_bonding =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_bond_mon_period_result,
+   bonding, "bonding");
+cmdline_parse_token_string_t cmd_set_bond_mon_period_mon_period =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_bond_mon_period_result,
+   mon_period, "mon_period");
+cmdline_parse_token_num_t cmd_set_bond_mon_period_portnum =
+   TOKEN_NUM_INITIALIZER(struct cmd_set_bond_mon_period_result,
+   port_num, UINT8);
+cmdline_parse_token_num_t cmd_set_bond_mon_period_period_ms =
+   TOKEN_NUM_INITIALIZER(struct cmd_set_bond_mon_period_result,
+   period_ms, UINT32);
+
+cmdline_parse_inst_t cmd_set_bond_mon_period = {
+   .f = cmd_set_bond_mon_period_parsed,
+   .data = (void *) 0,
+   .help_str = "set bonding mon_period (port_id) (period_ms): ",
+   .tokens = {
+   (void *)&cmd_set_bond_mon_period_set,
+   (void *)&cmd_set_bond_mon_period_bonding,
+   (void *)&cmd_set_bond_mon_period_mon_period,
+   (void *)&cmd_set_bond_mon_period_portnum,
+   (void *)&cmd_set_bond_mon_period_period_ms,
+   NULL
+   }
+};
+
 #endif /* RTE_LIBRTE_PMD_BOND */

 /* *** SET FORWARDING MODE *** */
@@ -7453,6 +7515,7 @@ cmdline_parse_ctx_t main_ctx[] = {
(cmdline_parse_inst_t *) &cmd_create_bonded_device,
(cmdline_parse_inst_t *) &cmd_set_bond_mac_addr,
(cmdline_parse_inst_t *) &cmd_set_balance_xmit_policy,
+   (cmdline_parse_inst_t *) &cmd_set_bond_mon_period,
 #endif
(cmdline_parse_inst_t *)&cmd_vlan_offload,
(cmdline_parse_inst_t *)&cmd_vlan_tpid,
diff --git a/app/test/test.h b/app/test/test.h
index 98ab804..24b1640 100644
--- a/app/test/test.h
+++ b/app/test/test.h
@@ -62,14 +62,15 @@

 #define TEST_ASSERT_SUCCESS(val, msg, ...) do {
\
if (!(val == 0)) {  
\
-   printf("TestCase %s() line %d failed: " 
\
-   msg "\n", __func__, __LINE__, 
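
The test macros touched in the truncated hunk above follow a common pattern:
print the function name and line, format a caller-supplied message, and return
from the test on failure. A minimal sketch of that pattern is shown below. The
SKETCH_* names are hypothetical, the return value of -1 is an assumption based
on the open-coded checks the bonding tests replace, and `##__VA_ARGS__` is a
GNU extension accepted by GCC and clang.

```c
#include <stdio.h>

/* Fail the enclosing test function with -1 if cond is false. */
#define SKETCH_ASSERT(cond, msg, ...) do {                          \
	if (!(cond)) {                                              \
		printf("TestCase %s() line %d failed: " msg "\n",   \
			__func__, __LINE__, ##__VA_ARGS__);         \
		return -1;                                          \
	}                                                           \
} while (0)

/* Convenience wrappers built on the generic assert. */
#define SKETCH_ASSERT_EQUAL(a, b, msg, ...) \
	SKETCH_ASSERT((a) == (b), msg, ##__VA_ARGS__)
#define SKETCH_ASSERT_SUCCESS(val, msg, ...) \
	SKETCH_ASSERT((val) == 0, msg, ##__VA_ARGS__)

/* Example test body: returns 0 on success, -1 after printing on failure. */
int check_slave_count(int current, int expected)
{
	SKETCH_ASSERT_EQUAL(current, expected,
		"Number of slaves %d is not the expected %d",
		current, expected);
	return 0;
}
```

Because the macro contains a `return`, it can only be used inside functions
that return int, which is the shape of the unit tests in this series.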

[dpdk-dev] [PATCH v4 6/8] testpmd: adding parameter to reconfig method to set socket_id when adding new port to portlist

2014-09-30 Thread Declan Doherty

Signed-off-by: Declan Doherty 
---
 app/test-pmd/cmdline.c | 2 +-
 app/test-pmd/testpmd.c | 3 ++-
 app/test-pmd/testpmd.h | 2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 225f669..15ca493 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -3614,7 +3614,7 @@ static void cmd_create_bonded_device_parsed(void 
*parsed_result,

/* Update number of ports */
nb_ports = rte_eth_dev_count();
-   reconfig(port_id);
+   reconfig(port_id, res->socket);
rte_eth_promiscuous_enable(port_id);
}

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 9f6cdc4..66e3c7c 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -628,7 +628,7 @@ init_config(void)


 void
-reconfig(portid_t new_port_id)
+reconfig(portid_t new_port_id, unsigned socket_id)
 {
struct rte_port *port;

@@ -647,6 +647,7 @@ reconfig(portid_t new_port_id)
/* set flag to initialize port/queue */
port->need_reconfig = 1;
port->need_reconfig_queues = 1;
+   port->socket_id = socket_id;

init_port_config();
 }
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 9cbfeac..5a3423c 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -457,7 +457,7 @@ void fwd_config_display(void);
 void rxtx_config_display(void);
 void fwd_config_setup(void);
 void set_def_fwd_config(void);
-void reconfig(portid_t new_port_id);
+void reconfig(portid_t new_port_id, unsigned socket_id);
 int init_fwd_streams(void);

 void port_mtu_set(portid_t port_id, uint16_t mtu);
-- 
1.7.12.2



[dpdk-dev] [PATCH v4 5/8] test app: adding support for generating variable sized packet bursts

2014-09-30 Thread Declan Doherty

Signed-off-by: Declan Doherty 
---
 app/test/packet_burst_generator.c | 25 -
 app/test/packet_burst_generator.h |  6 +-
 app/test/test_link_bonding.c  | 14 +-
 3 files changed, 22 insertions(+), 23 deletions(-)

diff --git a/app/test/packet_burst_generator.c 
b/app/test/packet_burst_generator.c
index 9e747a4..b2824dc 100644
--- a/app/test/packet_burst_generator.c
+++ b/app/test/packet_burst_generator.c
@@ -74,8 +74,7 @@ static inline void
 copy_buf_to_pkt(void *buf, unsigned len, struct rte_mbuf *pkt, unsigned offset)
 {
if (offset + len <= pkt->data_len) {
-   rte_memcpy(rte_pktmbuf_mtod(pkt, char *) + offset,
-   buf, (size_t) len);
+   rte_memcpy(rte_pktmbuf_mtod(pkt, char *) + offset, buf, 
(size_t) len);
return;
}
copy_buf_to_pkt_segs(buf, len, pkt, offset);
@@ -191,20 +190,12 @@ initialize_ipv4_header(struct ipv4_hdr *ip_hdr, uint32_t 
src_addr,
  */
 #define RTE_MAX_SEGS_PER_PKT 255 /**< pkt.nb_segs is a 8-bit unsigned char. */

-#define TXONLY_DEF_PACKET_LEN 64
-#define TXONLY_DEF_PACKET_LEN_128 128
-
-uint16_t tx_pkt_length = TXONLY_DEF_PACKET_LEN;
-uint16_t tx_pkt_seg_lengths[RTE_MAX_SEGS_PER_PKT] = {
-   TXONLY_DEF_PACKET_LEN_128,
-};
-
-uint8_t  tx_pkt_nb_segs = 1;

 int
 generate_packet_burst(struct rte_mempool *mp, struct rte_mbuf **pkts_burst,
struct ether_hdr *eth_hdr, uint8_t vlan_enabled, void *ip_hdr,
-   uint8_t ipv4, struct udp_hdr *udp_hdr, int nb_pkt_per_burst)
+   uint8_t ipv4, struct udp_hdr *udp_hdr, int nb_pkt_per_burst,
+   uint8_t pkt_len, uint8_t nb_pkt_segs)
 {
int i, nb_pkt = 0;
size_t eth_hdr_size;
@@ -221,9 +212,9 @@ nomore_mbuf:
break;
}

-   pkt->data_len = tx_pkt_seg_lengths[0];
+   pkt->data_len = pkt_len;
pkt_seg = pkt;
-   for (i = 1; i < tx_pkt_nb_segs; i++) {
+   for (i = 1; i < nb_pkt_segs; i++) {
pkt_seg->next = rte_pktmbuf_alloc(mp);
if (pkt_seg->next == NULL) {
pkt->nb_segs = i;
@@ -231,7 +222,7 @@ nomore_mbuf:
goto nomore_mbuf;
}
pkt_seg = pkt_seg->next;
-   pkt_seg->data_len = tx_pkt_seg_lengths[i];
+   pkt_seg->data_len = pkt_len;
}
pkt_seg->next = NULL; /* Last segment of packet. */

@@ -259,8 +250,8 @@ nomore_mbuf:
 * Complete first mbuf of packet and append it to the
 * burst of packets to be transmitted.
 */
-   pkt->nb_segs = tx_pkt_nb_segs;
-   pkt->pkt_len = tx_pkt_length;
+   pkt->nb_segs = nb_pkt_segs;
+   pkt->pkt_len = pkt_len;
pkt->l2_len = eth_hdr_size;

if (ipv4) {
diff --git a/app/test/packet_burst_generator.h 
b/app/test/packet_burst_generator.h
index 5b3cd6c..f86589e 100644
--- a/app/test/packet_burst_generator.h
+++ b/app/test/packet_burst_generator.h
@@ -47,6 +47,9 @@ extern "C" {
 #define IPV4_ADDR(a, b, c, d)(((a & 0xff) << 24) | ((b & 0xff) << 16) | \
((c & 0xff) << 8) | (d & 0xff))

+#define PACKET_BURST_GEN_PKT_LEN 60
+#define PACKET_BURST_GEN_PKT_LEN_128 128
+

 void
 initialize_eth_header(struct ether_hdr *eth_hdr, struct ether_addr *src_mac,
@@ -68,7 +71,8 @@ initialize_ipv4_header(struct ipv4_hdr *ip_hdr, uint32_t 
src_addr,
 int
 generate_packet_burst(struct rte_mempool *mp, struct rte_mbuf **pkts_burst,
struct ether_hdr *eth_hdr, uint8_t vlan_enabled, void *ip_hdr,
-   uint8_t ipv4, struct udp_hdr *udp_hdr, int nb_pkt_per_burst);
+   uint8_t ipv4, struct udp_hdr *udp_hdr, int nb_pkt_per_burst,
+   uint8_t pkt_len, uint8_t nb_pkt_segs);

 #ifdef __cplusplus
 }
diff --git a/app/test/test_link_bonding.c b/app/test/test_link_bonding.c
index 1a847eb..50355a3 100644
--- a/app/test/test_link_bonding.c
+++ b/app/test/test_link_bonding.c
@@ -1338,7 +1338,8 @@ generate_test_burst(struct rte_mbuf **pkts_burst, 
uint16_t burst_size,
/* Generate burst of packets to transmit */
generated_burst_size = generate_packet_burst(test_params->mbuf_pool,
pkts_burst, test_params->pkt_eth_hdr, vlan, ip_hdr, 
ipv4,
-   test_params->pkt_udp_hdr, burst_size);
+   test_params->pkt_udp_hdr, burst_size, 
PACKET_BURST_GEN_PKT_LEN_128,
+   1);
if (generated_burst_size != burst_size) {
printf("Failed to generate packet burst");
return -1;
@@ -2056,7 +2057,7 @@ test_activebackup_tx_burst(void)
/* Generate a burst of packets to transmit */
generated_burst_size = 

[dpdk-dev] [PATCH v4 4/8] bond: free mbufs if transmission fails in bonding tx_burst functions

2014-09-30 Thread Declan Doherty

Signed-off-by: Declan Doherty 
---
 app/test/test_link_bonding.c   | 393 -
 app/test/virtual_pmd.c |  80 +--
 app/test/virtual_pmd.h |   7 +
 lib/librte_pmd_bond/rte_eth_bond_pmd.c |  83 +--
 4 files changed, 525 insertions(+), 38 deletions(-)

diff --git a/app/test/test_link_bonding.c b/app/test/test_link_bonding.c
index cce32ed..1a847eb 100644
--- a/app/test/test_link_bonding.c
+++ b/app/test/test_link_bonding.c
@@ -663,6 +663,9 @@ enable_bonded_slaves(void)
int i;

for (i = 0; i < test_params->bonded_slave_count; i++) {
+   
virtual_ethdev_tx_burst_fn_set_success(test_params->slave_port_ids[i],
+   1);
+
virtual_ethdev_simulate_link_status_interrupt(
test_params->slave_port_ids[i], 1);
}
@@ -1413,6 +1416,135 @@ test_roundrobin_tx_burst(void)
 }

 static int
+verify_mbufs_ref_count(struct rte_mbuf **mbufs, int nb_mbufs, int val)
+{
+   int i, refcnt;
+
+   for (i = 0; i < nb_mbufs; i++) {
+   refcnt = rte_mbuf_refcnt_read(mbufs[i]);
+   TEST_ASSERT_EQUAL(refcnt, val,
+   "mbuf ref count (%d)is not the expected value (%d)",
+   refcnt, val);
+   }
+   return 0;
+}
+
+
+static void
+free_mbufs(struct rte_mbuf **mbufs, int nb_mbufs)
+{
+   int i;
+
+   for (i = 0; i < nb_mbufs; i++)
+   rte_pktmbuf_free(mbufs[i]);
+}
+
+#define TEST_RR_SLAVE_TX_FAIL_SLAVE_COUNT  (2)
+#define TEST_RR_SLAVE_TX_FAIL_BURST_SIZE   (64)
+#define TEST_RR_SLAVE_TX_FAIL_PACKETS_COUNT(22)
+#define TEST_RR_SLAVE_TX_FAIL_FAILING_SLAVE_IDX(1)
+
+static int
+test_roundrobin_tx_burst_slave_tx_fail(void)
+{
+   struct rte_mbuf *pkt_burst[MAX_PKT_BURST];
+   struct rte_mbuf *expected_tx_fail_pkts[MAX_PKT_BURST];
+
+   struct rte_eth_stats port_stats;
+
+   int i, first_fail_idx, tx_count;
+
+   TEST_ASSERT_SUCCESS(initialize_bonded_device_with_slaves(
+   BONDING_MODE_ROUND_ROBIN, 0,
+   TEST_RR_SLAVE_TX_FAIL_SLAVE_COUNT, 1),
+   "Failed to intialise bonded device");
+
+   /* Generate test bursts of packets to transmit */
+   TEST_ASSERT_EQUAL(generate_test_burst(pkt_burst,
+   TEST_RR_SLAVE_TX_FAIL_BURST_SIZE, 0, 1, 0, 0, 0),
+   TEST_RR_SLAVE_TX_FAIL_BURST_SIZE,
+   "Failed to generate test packet burst");
+
+   /* Copy references to packets which we expect not to be transmitted */
+   first_fail_idx = (TEST_RR_SLAVE_TX_FAIL_BURST_SIZE -
+   (TEST_RR_SLAVE_TX_FAIL_PACKETS_COUNT *
+   TEST_RR_SLAVE_TX_FAIL_SLAVE_COUNT)) +
+   TEST_RR_SLAVE_TX_FAIL_FAILING_SLAVE_IDX;
+
+   for (i = 0; i < TEST_RR_SLAVE_TX_FAIL_PACKETS_COUNT; i++) {
+   expected_tx_fail_pkts[i] = pkt_burst[first_fail_idx +
+   (i * TEST_RR_SLAVE_TX_FAIL_SLAVE_COUNT)];
+   }
+
+   /* Set virtual slave to only fail transmission of
+* TEST_RR_SLAVE_TX_FAIL_PACKETS_COUNT packets in burst */
+   virtual_ethdev_tx_burst_fn_set_success(
+   
test_params->slave_port_ids[TEST_RR_SLAVE_TX_FAIL_FAILING_SLAVE_IDX],
+   0);
+
+   virtual_ethdev_tx_burst_fn_set_tx_pkt_fail_count(
+   
test_params->slave_port_ids[TEST_RR_SLAVE_TX_FAIL_FAILING_SLAVE_IDX],
+   TEST_RR_SLAVE_TX_FAIL_PACKETS_COUNT);
+
+   tx_count = rte_eth_tx_burst(test_params->bonded_port_id, 0, pkt_burst,
+   TEST_RR_SLAVE_TX_FAIL_BURST_SIZE);
+
+   TEST_ASSERT_EQUAL(tx_count, TEST_RR_SLAVE_TX_FAIL_BURST_SIZE -
+   TEST_RR_SLAVE_TX_FAIL_PACKETS_COUNT,
+   "Transmitted (%d) an unexpected (%d) number of 
packets", tx_count,
+   TEST_RR_SLAVE_TX_FAIL_BURST_SIZE -
+   TEST_RR_SLAVE_TX_FAIL_PACKETS_COUNT);
+
+   /* Verify that failed packet are expected failed packets */
+   for (i = 0; i < TEST_RR_SLAVE_TX_FAIL_PACKETS_COUNT; i++) {
+   TEST_ASSERT_EQUAL(expected_tx_fail_pkts[i], pkt_burst[i + 
tx_count],
+   "expected mbuf (%d) pointer %p not expected 
pointer %p",
+   i, expected_tx_fail_pkts[i], pkt_burst[i + 
tx_count]);
+   }
+
+   /* Verify bonded port tx stats */
+   rte_eth_stats_get(test_params->bonded_port_id, &port_stats);
+
+   TEST_ASSERT_EQUAL(port_stats.opackets,
+   (uint64_t)TEST_RR_SLAVE_TX_FAIL_BURST_SIZE -
+   TEST_RR_SLAVE_TX_FAIL_PACKETS_COUNT,
+   "Bonded Port (%d) opackets value (%u) not as expected 
(%d)",
+   test_params->bonded_port_id, 
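
The index arithmetic used by test_roundrobin_tx_burst_slave_tx_fail() above
can be checked in isolation. The sketch below assumes that round-robin
distribution starts at slave 0, that the burst divides evenly among the
slaves, and that the failing slave drops the last fail_count packets it was
handed; the function name is hypothetical.

```c
#include <assert.h>

/*
 * Fill out[] with the burst indices expected to fail and return the first
 * such index. Mirrors the first_fail_idx computation in the test above.
 */
int expected_fail_indices(int burst_size, int slave_count, int fail_count,
		int failing_slave_idx, int *out)
{
	int i, first_fail_idx;

	/* Assumptions of this sketch, not checked by the original test. */
	assert(burst_size % slave_count == 0);
	assert(fail_count <= burst_size / slave_count);

	first_fail_idx = (burst_size - fail_count * slave_count) +
			failing_slave_idx;

	/* Each slave receives every slave_count-th packet of the burst. */
	for (i = 0; i < fail_count; i++)
		out[i] = first_fail_idx + i * slave_count;

	return first_fail_idx;
}
```

With the constants from the test (burst of 64, 2 slaves, 22 failures on
slave 1) the first failing index is 21 and the last is 63, i.e. the final 22
odd-numbered packets of the burst.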

[dpdk-dev] [PATCH v4 3/8] bond: fix naming inconsistency in tx_burst_round_robin

2014-09-30 Thread Declan Doherty

Signed-off-by: Declan Doherty 
---
 lib/librte_pmd_bond/rte_eth_bond_pmd.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/librte_pmd_bond/rte_eth_bond_pmd.c 
b/lib/librte_pmd_bond/rte_eth_bond_pmd.c
index 348e28f..66f1650 100644
--- a/lib/librte_pmd_bond/rte_eth_bond_pmd.c
+++ b/lib/librte_pmd_bond/rte_eth_bond_pmd.c
@@ -92,7 +92,7 @@ static uint16_t
 bond_ethdev_tx_burst_round_robin(void *queue, struct rte_mbuf **bufs,
uint16_t nb_pkts)
 {
-   struct bond_dev_private *dev_private;
+   struct bond_dev_private *internals;
struct bond_tx_queue *bd_tx_q;

struct rte_mbuf *slave_bufs[RTE_MAX_ETHPORTS][nb_pkts];
@@ -107,13 +107,13 @@ bond_ethdev_tx_burst_round_robin(void *queue, struct 
rte_mbuf **bufs,
int i, cs_idx = 0;

bd_tx_q = (struct bond_tx_queue *)queue;
-   dev_private = bd_tx_q->dev_private;
+   internals = bd_tx_q->dev_private;

/* Copy slave list to protect against slave up/down changes during tx
 * bursting */
-   num_of_slaves = dev_private->active_slave_count;
-   memcpy(slaves, dev_private->active_slaves,
-   sizeof(dev_private->active_slaves[0]) * num_of_slaves);
+   num_of_slaves = internals->active_slave_count;
+   memcpy(slaves, internals->active_slaves,
+   sizeof(internals->active_slaves[0]) * num_of_slaves);

if (num_of_slaves < 1)
return num_tx_total;
-- 
1.7.12.2



[dpdk-dev] [PATCH v4 2/8] bond: removing switch statement from rx burst method

2014-09-30 Thread Declan Doherty

Signed-off-by: Declan Doherty 
---
 lib/librte_pmd_bond/rte_eth_bond_pmd.c | 62 +++---
 1 file changed, 35 insertions(+), 27 deletions(-)

diff --git a/lib/librte_pmd_bond/rte_eth_bond_pmd.c 
b/lib/librte_pmd_bond/rte_eth_bond_pmd.c
index aca2dcf..348e28f 100644
--- a/lib/librte_pmd_bond/rte_eth_bond_pmd.c
+++ b/lib/librte_pmd_bond/rte_eth_bond_pmd.c
@@ -59,33 +59,37 @@ bond_ethdev_rx_burst(void *queue, struct rte_mbuf **bufs, 
uint16_t nb_pkts)

internals = bd_rx_q->dev_private;

-   switch (internals->mode) {
-   case BONDING_MODE_ROUND_ROBIN:
-   case BONDING_MODE_BROADCAST:
-   case BONDING_MODE_BALANCE:
-   for (i = 0; i < internals->active_slave_count && nb_pkts; i++) {
-   /* Offset of pointer to *bufs increases as packets are 
received
-* from other slaves */
-   num_rx_slave = 
rte_eth_rx_burst(internals->active_slaves[i],
-   bd_rx_q->queue_id, bufs + num_rx_total, 
nb_pkts);
-   if (num_rx_slave) {
-   num_rx_total += num_rx_slave;
-   nb_pkts -= num_rx_slave;
-   }
+   for (i = 0; i < internals->active_slave_count && nb_pkts; i++) {
+   /* Offset of pointer to *bufs increases as packets are received
+* from other slaves */
+   num_rx_slave = rte_eth_rx_burst(internals->active_slaves[i],
+   bd_rx_q->queue_id, bufs + num_rx_total, 
nb_pkts);
+   if (num_rx_slave) {
+   num_rx_total += num_rx_slave;
+   nb_pkts -= num_rx_slave;
}
-   break;
-   case BONDING_MODE_ACTIVE_BACKUP:
-   num_rx_slave = rte_eth_rx_burst(internals->current_primary_port,
-   bd_rx_q->queue_id, bufs, nb_pkts);
-   if (num_rx_slave)
-   num_rx_total = num_rx_slave;
-   break;
}
+
return num_rx_total;
 }

 static uint16_t
-bond_ethdev_tx_round_robin(void *queue, struct rte_mbuf **bufs,
+bond_ethdev_rx_burst_active_backup(void *queue, struct rte_mbuf **bufs,
+   uint16_t nb_pkts)
+{
+   struct bond_dev_private *internals;
+
+   /* Cast to structure, containing bonded device's port id and queue id */
+   struct bond_rx_queue *bd_rx_q = (struct bond_rx_queue *)queue;
+
+   internals = bd_rx_q->dev_private;
+
+   return rte_eth_rx_burst(internals->current_primary_port,
+   bd_rx_q->queue_id, bufs, nb_pkts);
+}
+
+static uint16_t
+bond_ethdev_tx_burst_round_robin(void *queue, struct rte_mbuf **bufs,
uint16_t nb_pkts)
 {
struct bond_dev_private *dev_private;
@@ -134,7 +138,7 @@ bond_ethdev_tx_round_robin(void *queue, struct rte_mbuf 
**bufs,
 }

 static uint16_t
-bond_ethdev_tx_active_backup(void *queue,
+bond_ethdev_tx_burst_active_backup(void *queue,
struct rte_mbuf **bufs, uint16_t nb_pkts)
 {
struct bond_dev_private *internals;
@@ -270,7 +274,8 @@ xmit_slave_hash(const struct rte_mbuf *buf, uint8_t 
slave_count, uint8_t policy)
 }

 static uint16_t
-bond_ethdev_tx_balance(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+bond_ethdev_tx_burst_balance(void *queue, struct rte_mbuf **bufs,
+   uint16_t nb_pkts)
 {
struct bond_dev_private *internals;
struct bond_tx_queue *bd_tx_q;
@@ -480,22 +485,25 @@ bond_ethdev_mode_set(struct rte_eth_dev *eth_dev, int 
mode)

switch (mode) {
case BONDING_MODE_ROUND_ROBIN:
-   eth_dev->tx_pkt_burst = bond_ethdev_tx_round_robin;
+   eth_dev->tx_pkt_burst = bond_ethdev_tx_burst_round_robin;
+   eth_dev->rx_pkt_burst = bond_ethdev_rx_burst;
break;
case BONDING_MODE_ACTIVE_BACKUP:
-   eth_dev->tx_pkt_burst = bond_ethdev_tx_active_backup;
+   eth_dev->tx_pkt_burst = bond_ethdev_tx_burst_active_backup;
+   eth_dev->rx_pkt_burst = bond_ethdev_rx_burst_active_backup;
break;
case BONDING_MODE_BALANCE:
-   eth_dev->tx_pkt_burst = bond_ethdev_tx_balance;
+   eth_dev->tx_pkt_burst = bond_ethdev_tx_burst_balance;
+   eth_dev->rx_pkt_burst = bond_ethdev_rx_burst;
break;
case BONDING_MODE_BROADCAST:
eth_dev->tx_pkt_burst = bond_ethdev_tx_burst_broadcast;
+   eth_dev->rx_pkt_burst = bond_ethdev_rx_burst;
break;
default:
return -1;
}

-   eth_dev->rx_pkt_burst = bond_ethdev_rx_burst;
internals->mode = mode;

return 0;
-- 
1.7.12.2
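
The idea of the patch above is to pick the rx handler once, when the bonding
mode is set, instead of switching on the mode for every burst. The sketch
below illustrates that pattern without any DPDK dependency; all sketch_*
names and the pending[] counters standing in for slave rx queues are
hypothetical.

```c
#include <stdint.h>

#define SKETCH_MAX_SLAVES 4

enum sketch_mode { SKETCH_MODE_ROUND_ROBIN, SKETCH_MODE_ACTIVE_BACKUP };

struct sketch_dev;
typedef uint16_t (*sketch_rx_fn)(struct sketch_dev *dev, uint16_t nb_pkts);

struct sketch_dev {
	sketch_rx_fn rx_pkt_burst;            /* chosen once at mode-set time */
	uint16_t pending[SKETCH_MAX_SLAVES];  /* packets waiting on each slave */
	uint8_t active_slave_count;
	uint8_t primary;                      /* current primary slave index */
};

/* Take up to nb_pkts packets from one slave. */
uint16_t sketch_rx_one(struct sketch_dev *dev, uint8_t slave, uint16_t nb_pkts)
{
	uint16_t n = dev->pending[slave] < nb_pkts ? dev->pending[slave] : nb_pkts;

	dev->pending[slave] -= n;
	return n;
}

/* Poll every active slave until the burst is full. */
uint16_t sketch_rx_all(struct sketch_dev *dev, uint16_t nb_pkts)
{
	uint16_t total = 0;
	uint8_t i;

	for (i = 0; i < dev->active_slave_count && nb_pkts; i++) {
		uint16_t n = sketch_rx_one(dev, i, nb_pkts);

		total += n;
		nb_pkts -= n;
	}
	return total;
}

/* Active-backup: only the primary slave is polled. */
uint16_t sketch_rx_primary(struct sketch_dev *dev, uint16_t nb_pkts)
{
	return sketch_rx_one(dev, dev->primary, nb_pkts);
}

/* The only place that still switches on the mode. */
int sketch_mode_set(struct sketch_dev *dev, int mode)
{
	switch (mode) {
	case SKETCH_MODE_ROUND_ROBIN:
		dev->rx_pkt_burst = sketch_rx_all;
		break;
	case SKETCH_MODE_ACTIVE_BACKUP:
		dev->rx_pkt_burst = sketch_rx_primary;
		break;
	default:
		return -1;
	}
	return 0;
}
```

The per-burst path is then a single indirect call, and adding a mode only
requires a new handler plus one case in the mode-set function.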



[dpdk-dev] [PATCH v4 1/8] bond: link status interrupt support

2014-09-30 Thread Declan Doherty
Adding support for lsc interrupt from bonded device to link
bonding library with supporting unit tests in the test application.

Signed-off-by: Declan Doherty 
---
 app/test/test_link_bonding.c   | 213 +++--
 lib/librte_pmd_bond/rte_eth_bond_api.c |   4 +
 lib/librte_pmd_bond/rte_eth_bond_pmd.c |   6 +
 3 files changed, 189 insertions(+), 34 deletions(-)

diff --git a/app/test/test_link_bonding.c b/app/test/test_link_bonding.c
index db5b180..cce32ed 100644
--- a/app/test/test_link_bonding.c
+++ b/app/test/test_link_bonding.c
@@ -39,6 +39,7 @@
 #include 
 #include 
 #include 
+#include 

 #include 
 #include 
@@ -224,10 +225,15 @@ static struct rte_eth_txconf tx_conf_default = {
 };

 static int
-configure_ethdev(uint8_t port_id, uint8_t start)
+configure_ethdev(uint8_t port_id, uint8_t start, uint8_t en_isr)
 {
int q_id;

+   if (en_isr)
+   default_pmd_conf.intr_conf.lsc = 1;
+   else
+   default_pmd_conf.intr_conf.lsc = 0;
+
if (rte_eth_dev_configure(port_id, test_params->nb_rx_q,
test_params->nb_tx_q, &default_pmd_conf) != 0) {
goto error;
@@ -312,7 +318,7 @@ test_setup(void)

printf("Created virtual ethdev %s\n", pmd_name);

-   retval = 
configure_ethdev(test_params->slave_port_ids[i], 1);
+   retval = 
configure_ethdev(test_params->slave_port_ids[i], 1, 0);
if (retval != 0) {
printf("Failed to configure virtual ethdev 
%s\n", pmd_name);
return -1;
@@ -341,7 +347,7 @@ test_create_bonded_device(void)
TEST_ASSERT(test_params->bonded_port_id >= 0,
"Failed to create bonded ethdev %s", 
BONDED_DEV_NAME);

-   
TEST_ASSERT_SUCCESS(configure_ethdev(test_params->bonded_port_id, 0),
+   
TEST_ASSERT_SUCCESS(configure_ethdev(test_params->bonded_port_id, 0, 0),
"Failed to configure bonded ethdev %s", 
BONDED_DEV_NAME);
}

@@ -1078,12 +1084,12 @@ test_set_explicit_bonded_mac(void)


 static int
-initialize_bonded_device_with_slaves(uint8_t bonding_mode,
+initialize_bonded_device_with_slaves(uint8_t bonding_mode, uint8_t bond_en_isr,
uint8_t number_of_slaves, uint8_t enable_slave)
 {
/* configure bonded device */
-   TEST_ASSERT_SUCCESS(configure_ethdev(test_params->bonded_port_id, 0),
-   "Failed to configure bonding port (%d) in mode %d "
+   TEST_ASSERT_SUCCESS(configure_ethdev(test_params->bonded_port_id, 0,
+   bond_en_isr), "Failed to configure bonding port (%d) in 
mode %d "
"with (%d) slaves.", test_params->bonded_port_id, 
bonding_mode,
number_of_slaves);

@@ -1116,8 +1122,8 @@ test_adding_slave_after_bonded_device_started(void)
 {
int i;

-   if (initialize_bonded_device_with_slaves(BONDING_MODE_ROUND_ROBIN, 4, 
0) !=
-   0)
+   if (initialize_bonded_device_with_slaves(BONDING_MODE_ROUND_ROBIN, 0, 
4, 0)
+   != 0)
return -1;

/* Enabled slave devices */
@@ -1141,6 +1147,144 @@ test_adding_slave_after_bonded_device_started(void)
return remove_slaves_and_stop_bonded_device();
 }

+#define TEST_STATUS_INTERRUPT_SLAVE_COUNT  4
+#define TEST_LSC_WAIT_TIMEOUT_MS   500
+
+int test_lsc_interupt_count;
+
+static pthread_mutex_t mutex;
+static pthread_cond_t cvar;
+
+static void
+test_bonding_lsc_event_callback(uint8_t port_id __rte_unused,
+   enum rte_eth_event_type type  __rte_unused, void *param 
__rte_unused)
+{
+   pthread_mutex_lock(&mutex);
+   test_lsc_interupt_count++;
+
+   pthread_cond_signal(&cvar);
+   pthread_mutex_unlock(&mutex);
+}
+
+static inline int
+lsc_timeout(int wait_us)
+{
+   int retval = 0;
+
+   struct timespec ts;
+   struct timeval tp;
+
+   gettimeofday(&tp, NULL);
+
+   /* Convert from timeval to timespec */
+   ts.tv_sec  = tp.tv_sec;
+   ts.tv_nsec = tp.tv_usec * 1000;
+   ts.tv_nsec += wait_us * 1000;
+
+   pthread_mutex_lock(&mutex);
+   if (test_lsc_interupt_count < 1)
+   retval = pthread_cond_timedwait(&cvar, &mutex, &ts);
+
+   pthread_mutex_unlock(&mutex);
+
+   return retval;
+}
+
+static int
+test_status_interrupt(void)
+{
+   int slave_count;
+   uint8_t slaves[RTE_MAX_ETHPORTS];
+
+   pthread_mutex_init(&mutex, NULL);
+   pthread_cond_init(&cvar, NULL);
+
+   /* initialized bonding device with T slaves */
+   if (initialize_bonded_device_with_slaves(BONDING_MODE_ROUND_ROBIN, 1,
+   TEST_STATUS_INTERRUPT_SLAVE_COUNT, 1) != 0)
+   return -1;
+
+   test_lsc_interupt_count = 0;
+
+   /* register link status change interrupt callback */
+   
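
One detail of lsc_timeout() in the hunk above: the wait is added to tv_nsec
without carrying into tv_sec, and POSIX expects the nanosecond field of the
absolute timeout passed to pthread_cond_timedwait() to stay below one
billion. A minimal sketch of a normalizing conversion follows; the helper
name deadline_after_us is hypothetical and a non-negative wait_us is assumed.

```c
#include <time.h>
#include <sys/time.h>

/*
 * Build an absolute deadline wait_us microseconds after *now, keeping
 * tv_nsec in the range [0, 1000000000).
 */
struct timespec deadline_after_us(const struct timeval *now, long wait_us)
{
	struct timespec ts;
	long long nsec = (long long)now->tv_usec * 1000 +
			(long long)wait_us * 1000;

	/* Carry whole seconds out of the nanosecond total. */
	ts.tv_sec = now->tv_sec + (time_t)(nsec / 1000000000LL);
	ts.tv_nsec = (long)(nsec % 1000000000LL);
	return ts;
}
```

The result can be passed directly as the abstime argument of
pthread_cond_timedwait() after filling *now with gettimeofday().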

[dpdk-dev] [PATCH v4 0/8] link bonding

2014-09-30 Thread Declan Doherty
v4:
- Rebased to account for changes in master.
- Fix for rte_eth_bond_slaves_get() introduced in v3 patch set
- Addressed issue around disabling/enabling link status polling around adding/
  removing slave devices.

v3 :
- Typo fix for the bond free mbufs patch.
- Rebased to account for changes in the mbuf patches.
- Add support for slave devices which don't support link status interrupts 
- Tidy up the link bonding unit test so that all tests use the new test macros.

v2 :
Addresses issues with the logic around the handling of failed transmissions.
In this version all modes behave in a manner similar to a standard PMD,
returning the number of successfully transmitted mbufs, with the failing
mbufs placed at the end of the bufs array for freeing / retransmission by the
application software.

v1:

This patch set adds support for link status interrupts in the link bonding
PMD. It also contains some patches to tidy up the code structure of the link
bonding code and to fix bugs relating to transmission failures in the
underlying slave PMD which could lead to leaked mbufs.

Declan Doherty (8):
  bond: link status interrupt support
  bond: removing switch statement from rx burst method
  bond: fix naming inconsistency in tx_burst_round_robin
  bond: free mbufs if transmission fails in bonding tx_burst functions
  test app: adding support for generating variable sized packet
  testpmd: adding parameter to reconfig method to set socket_id when
adding new port to portlist
  bond: lsc polling support
  bond: unit test test macro refactor

 app/test-pmd/cmdline.c |   65 +-
 app/test-pmd/testpmd.c |3 +-
 app/test-pmd/testpmd.h |2 +-
 app/test/packet_burst_generator.c  |   25 +-
 app/test/packet_burst_generator.h  |6 +-
 app/test/test.h|7 +-
 app/test/test_link_bonding.c   | 3342 ++--
 app/test/virtual_pmd.c |   97 +-
 app/test/virtual_pmd.h |   53 +-
 lib/librte_pmd_bond/rte_eth_bond.h |   80 +
 lib/librte_pmd_bond/rte_eth_bond_api.c |  319 ++-
 lib/librte_pmd_bond/rte_eth_bond_args.c|   30 +-
 lib/librte_pmd_bond/rte_eth_bond_pmd.c |  550 +++--
 lib/librte_pmd_bond/rte_eth_bond_private.h |   71 +-
 14 files changed, 2692 insertions(+), 1958 deletions(-)

-- 
1.7.12.2
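
A minimal sketch of the v2 behaviour described above, from the application's side: a burst call returns the number of packets sent and leaves the unsent ones at the tail of the array, so the caller frees (or retries) everything from that index onward. The struct pkt type and both function names are hypothetical stand-ins, not the DPDK mbuf or bonding API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an mbuf; not the DPDK structure. */
struct pkt {
	int id;
	int freed;	/* set to 1 once the caller has released the packet */
};

/*
 * Hypothetical stand-in for a tx burst call: accepts at most `limit`
 * packets and returns how many it took. Unsent packets stay at the
 * end of the bufs array.
 */
static uint16_t
fake_tx_burst(struct pkt **bufs, uint16_t nb_pkts, uint16_t limit)
{
	(void)bufs;
	return nb_pkts < limit ? nb_pkts : limit;
}

/* Send a burst and release every packet the burst call did not take. */
static uint16_t
send_and_free_unsent(struct pkt **bufs, uint16_t nb_pkts, uint16_t limit)
{
	uint16_t sent = fake_tx_burst(bufs, nb_pkts, limit);
	uint16_t i;

	/* The failing packets are bufs[sent] .. bufs[nb_pkts - 1]. */
	for (i = sent; i < nb_pkts; i++)
		bufs[i]->freed = 1;

	return sent;
}
```

An application that prefers retransmission would retry the tail of the array instead of freeing it.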



[dpdk-dev] [PATCH] llib/ibrte_net: workaround to avoid macro conflict

2014-09-30 Thread Jingjing Wu
Macros such as IPPROTO_TCP, IPPROTO_UDP are already defined in <netinet/in.h>.
If a user's application includes <netinet/in.h> and rte_ip.h at the same time,
there will be a conflict error.

This patch uses "#ifndef ... #endif" guards to avoid the conflict.

Signed-off-by: Jingjing Wu 
---
 lib/librte_net/rte_ip.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/lib/librte_net/rte_ip.h b/lib/librte_net/rte_ip.h
index e3f65c1..2bcb479 100644
--- a/lib/librte_net/rte_ip.h
+++ b/lib/librte_net/rte_ip.h
@@ -116,6 +116,8 @@ struct ipv4_hdr {

 #defineIPV4_HDR_OFFSET_UNITS   8

+#ifndef _NETINET_IN_H
+#ifndef _NETINET_IN_H_
 /* IPv4 protocols */
 #define IPPROTO_IP 0  /**< dummy for IP */
 #define IPPROTO_HOPOPTS0  /**< IP6 hop-by-hop options */
@@ -227,6 +229,9 @@ struct ipv4_hdr {
 #define IPPROTO_RAW  255  /**< raw IP packet */
 #define IPPROTO_MAX  256  /**< maximum protocol number */

+#endif /*_NETINET_IN_H_*/
+#endif /*_NETINET_IN_H*/
+
 /*
  * IPv4 address types
  */
-- 
1.8.1.4
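
For comparison, a sketch of a per-macro variant of the same "#ifndef" idea: each protocol number is guarded individually, so the definitions are skipped no matter which include-guard name the system header uses. This assumes the system header exposes these names as preprocessor macros; is_l4_proto_sketch is a hypothetical helper used only to exercise the constants.

```c
#include <assert.h>
#include <netinet/in.h>

/* Define each protocol number only if nothing has defined it already. */
#ifndef IPPROTO_TCP
#define IPPROTO_TCP 6	/* TCP */
#endif
#ifndef IPPROTO_UDP
#define IPPROTO_UDP 17	/* UDP */
#endif

/* Hypothetical helper: true for the two L4 protocols guarded above. */
static int
is_l4_proto_sketch(int proto)
{
	return proto == IPPROTO_TCP || proto == IPPROTO_UDP;
}
```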



[dpdk-dev] [PATCH] ixgbe: Fix clang compilation issue

2014-09-30 Thread Bruce Richardson
Issue reported by Keith Wiles.
Clang fails with an error about a variable being used uninitialized:

  CC ixgbe_rxtx_vec.o
/home/keithw/projects/dpdk-code/org-dpdk/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c:67:30:
error: variable 'dma_addr0' is uninitialized
  when used here [-Werror,-Wuninitialized]
dma_addr0 = _mm_xor_si128(dma_addr0, dma_addr0);
  ^

This error can be fixed by replacing the call to xor which
takes two parameters, by a call to setzero, which does not take any.

Signed-off-by: Bruce Richardson 
---
 lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c 
b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
index 457f267..2236250 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
@@ -64,7 +64,7 @@ ixgbe_rxq_rearm(struct igb_rx_queue *rxq)
 RTE_IXGBE_RXQ_REARM_THRESH) < 0) {
if (rxq->rxrearm_nb + RTE_IXGBE_RXQ_REARM_THRESH >=
rxq->nb_rx_desc) {
-   dma_addr0 = _mm_xor_si128(dma_addr0, dma_addr0);
+   dma_addr0 = _mm_setzero_si128();
for (i = 0; i < RTE_IXGBE_DESCS_PER_LOOP; i++) {
rxep[i].mbuf = >fake_mbuf;
_mm_store_si128((__m128i *)[i].read,
-- 
1.9.3



[dpdk-dev] Building current 1.8.1-rc1 with clang

2014-09-30 Thread Bruce Richardson
On Mon, Sep 29, 2014 at 09:50:34PM +, Wiles, Roger Keith wrote:
> I just pulled the current repo and started a build with "make install 
> T=x86_64-native-linuxapp-clang", which produced the following error. I do not 
> think I am allowed to modify this file, correct? If that is the case then 
> someone will have to update the original source. If you want me to submit a 
> patch I can, but I do not think I fully understand what needs to be done. 
> 
> From what I can tell the line:
>   dma_addr0 = _mm_xor_si128(dma_addr0, dma_addr0);
> needs to be:
>   dma_addr0 = _mm_setzero_si128();
> 
> == Build lib/librte_pmd_ixgbe
>   CC ixgbe_common.o
>   CC ixgbe_82598.o
>   CC ixgbe_82599.o
>   CC ixgbe_x540.o
>   CC ixgbe_phy.o
>   CC ixgbe_api.o
>   CC ixgbe_vf.o
>   CC ixgbe_dcb.o
>   CC ixgbe_dcb_82599.o
>   CC ixgbe_dcb_82598.o
>   CC ixgbe_mbx.o
>   CC ixgbe_rxtx.o
>   CC ixgbe_ethdev.o
>   CC ixgbe_fdir.o
>   CC ixgbe_pf.o
>   CC ixgbe_rxtx_vec.o
> /home/keithw/projects/dpdk-code/org-dpdk/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c:67:30:
>  error: variable 'dma_addr0' is uninitialized
>   when used here [-Werror,-Wuninitialized]
> dma_addr0 = _mm_xor_si128(dma_addr0, dma_addr0);
>   ^
> /home/keithw/projects/dpdk-code/org-dpdk/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c:57:2:
>  note: variable 'dma_addr0' is declared here
> __m128i dma_addr0, dma_addr1;
> ^
> 1 error generated.
> make[5]: *** [ixgbe_rxtx_vec.o] Error 1
> make[4]: *** [librte_pmd_ixgbe] Error 2
> make[3]: *** [lib] Error 2
> make[2]: *** [all] Error 2
> make[1]: *** [x86_64-native-linuxapp-clang_install] Error 2
> make: *** [install] Error 2
> 
> Thanks
> ++Keith
> 
> Keith Wiles, Principal Technologist with CTO office, Wind River mobile 
> 972-213-5533
> 

I think a simple one-line change like below should fix it. The xor can also 
be written as a setzero call.

diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c 
b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
index 457f267..2236250 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
@@ -64,7 +64,7 @@ ixgbe_rxq_rearm(struct igb_rx_queue *rxq)
 RTE_IXGBE_RXQ_REARM_THRESH) < 0) {
if (rxq->rxrearm_nb + RTE_IXGBE_RXQ_REARM_THRESH >=
rxq->nb_rx_desc) {
-   dma_addr0 = _mm_xor_si128(dma_addr0, dma_addr0);
+   dma_addr0 = _mm_setzero_si128();
for (i = 0; i < RTE_IXGBE_DESCS_PER_LOOP; i++) {
rxep[i].mbuf = >fake_mbuf;
_mm_store_si128((__m128i *)[i].read,

/Bruce
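
A small standalone sketch of the point made above: _mm_setzero_si128() produces an all-zero vector without reading any input, so there is no uninitialized variable for clang to complain about. The helper names are hypothetical, and the example assumes an x86 target with SSE2.

```c
#include <assert.h>
#include <emmintrin.h>
#include <stdint.h>

/* Return 1 if all 16 bytes of v are zero. */
static int
is_all_zero(__m128i v)
{
	uint8_t bytes[16];
	int i;

	_mm_storeu_si128((__m128i *)bytes, v);
	for (i = 0; i < 16; i++)
		if (bytes[i] != 0)
			return 0;
	return 1;
}

/* Zero vector obtained without reading any (possibly uninitialized) input. */
static __m128i
zero_vector(void)
{
	return _mm_setzero_si128();
}
```

XOR-ing an initialized value with itself gives the same result; the warning in the build log is only about reading dma_addr0 before it has been written.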


[dpdk-dev] [memnic PATCH v2 6/7] pmd: add branch hint in recv/xmit

2014-09-30 Thread Neil Horman
On Tue, Sep 30, 2014 at 11:14:40AM +, Hiroshi Shimamoto wrote:
> From: Hiroshi Shimamoto 
> 
> To reduce instruction cache misses, add branch condition hints into
> recv/xmit functions. This improves performance slightly.
> 
> We can see performance improvements with memnic-tester.
> Using Xeon E5-2697 v2 @ 2.70GHz, 4 vCPU.
>  size |  before  |  after
>64 | 5.54Mpps | 5.55Mpps
>   128 | 5.46Mpps | 5.44Mpps
>   256 | 5.21Mpps | 5.22Mpps
>   512 | 4.50Mpps | 4.52Mpps
>  1024 | 3.71Mpps | 3.73Mpps
>  1280 | 3.21Mpps | 3.22Mpps
>  1518 | 2.92Mpps | 2.93Mpps
> 
> Signed-off-by: Hiroshi Shimamoto 
> Reviewed-by: Hayato Momma 
> ---
>  pmd/pmd_memnic.c | 18 +-
>  1 file changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/pmd/pmd_memnic.c b/pmd/pmd_memnic.c
> index 7fc3093..875d3ea 100644
> --- a/pmd/pmd_memnic.c
> +++ b/pmd/pmd_memnic.c
> @@ -289,26 +289,26 @@ static uint16_t memnic_recv_pkts(void *rx_queue,
>   int idx, next;
>   struct rte_eth_stats *st = >stats[rte_lcore_id()];
>  
> - if (!adapter->nic->hdr.valid)
> + if (unlikely(!adapter->nic->hdr.valid))
>   return 0;
>  
>   pkts = bytes = errs = 0;
>   idx = adapter->up_idx;
>   for (nr = 0; nr < nb_pkts; nr++) {
>   p = >packets[idx];
> - if (p->status != MEMNIC_PKT_ST_FILLED)
> + if (unlikely(p->status != MEMNIC_PKT_ST_FILLED))
>   break;
>   /* prefetch the next area */
>   next = idx;
> - if (++next >= MEMNIC_NR_PACKET)
> + if (unlikely(++next >= MEMNIC_NR_PACKET))
>   next = 0;
>   rte_prefetch0(>packets[next]);
> - if (p->len > framesz) {
> + if (unlikely(p->len > framesz)) {
>   errs++;
>   goto drop;
>   }
>   mb = rte_pktmbuf_alloc(adapter->mp);
> - if (!mb)
> + if (unlikely(!mb))
>   break;
>  
>   rte_memcpy(rte_pktmbuf_mtod(mb, void *), p->data, p->len);
> @@ -350,7 +350,7 @@ static uint16_t memnic_xmit_pkts(void *tx_queue,
>   uint64_t pkts, bytes, errs;
>   uint32_t framesz = adapter->framesz;
>  
> - if (!adapter->nic->hdr.valid)
> + if (unlikely(!adapter->nic->hdr.valid))
>   return 0;
>  
>   pkts = bytes = errs = 0;
> @@ -360,7 +360,7 @@ static uint16_t memnic_xmit_pkts(void *tx_queue,
>   struct rte_mbuf *sg;
>   void *ptr;
>  
> - if (pkt_len > framesz) {
> + if (unlikely(pkt_len > framesz)) {
>   errs++;
>   break;
>   }
> @@ -379,7 +379,7 @@ retry:
>   goto retry;
>   }
>  
> - if (idx != ACCESS_ONCE(adapter->down_idx)) {
> + if (unlikely(idx != ACCESS_ONCE(adapter->down_idx))) {
Why are you using ACCESS_ONCE here?  Or for that matter, anywhere else in this
PMD?  The whole idea of the ACCESS_ONCE macro is to assign a value to a variable
once and prevent it from getting reloaded from memory at a later time. This is
exactly contrary to that, both in the sense that you're explicitly reloading the
same variable multiple times, and that you're using it as part of a comparison
operation, rather than an assignment operation.

Neil
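
A minimal sketch of the "read once into a local" pattern referred to above. The macro follows the volatile-cast form historically used in the Linux kernel; the definition used in the MEMNIC PMD is not shown in this thread and may differ. shared_down_idx and idx_in_range_sketch are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Volatile-cast read, modelled on the historical Linux kernel macro. */
#define ACCESS_ONCE_SKETCH(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical index that another thread or a peer may update. */
static uint32_t shared_down_idx;

/*
 * Take one snapshot of the shared index, then use only the local copy,
 * so the range check and the returned value refer to the same read.
 */
static int
idx_in_range_sketch(uint32_t limit)
{
	uint32_t snap = ACCESS_ONCE_SKETCH(shared_down_idx);

	return snap < limit ? (int)snap : -1;
}
```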



[dpdk-dev] [memnic PATCH v2 0/7] MEMNIC PMD performance improvement

2014-09-30 Thread Neil Horman
On Tue, Sep 30, 2014 at 11:10:45AM +, Hiroshi Shimamoto wrote:
> From: Hiroshi Shimamoto 
> 
> This patchset improves MEMNIC PMD performance.
> 
> The first patch introduces a new benchmark test run in guest,
> and will be used to evaluate the following patch effects.
> 
> This patchset improves the throughput results of memnic-tester.
> Using Xeon E5-2697 v2 @ 2.70GHz, 4 vCPU.
>  size |  before  |  after
>64 | 4.18Mpps | 5.83Mpps
>   128 | 3.85Mpps | 5.71Mpps
>   256 | 4.01Mpps | 5.40Mpps
>   512 | 3.52Mpps | 4.64Mpps
>  1024 | 3.18Mpps | 3.68Mpps
>  1280 | 2.86Mpps | 3.17Mpps
>  1518 | 2.59Mpps | 2.90Mpps
> 
> Hiroshi Shimamoto (7):
>   guest: memnic-tester: PMD benchmark in guest
>   pmd: remove needless assignment
>   pmd: use helper macros
>   pmd: use compiler barrier
>   pmd: packet receiving optimization with prefetch
>   pmd: add branch hint in recv/xmit
>   pmd: burst mbuf freeing in xmit
> 
>  guest/Makefile|  20 
>  guest/README.rst  |  93 +
>  guest/memnic-tester.c | 281 
> ++
>  pmd/pmd_memnic.c  |  45 
>  4 files changed, 417 insertions(+), 22 deletions(-)
>  create mode 100644 guest/Makefile
>  create mode 100644 guest/README.rst
>  create mode 100644 guest/memnic-tester.c
> 
> -- 
> 1.8.3.1
> 
> 
Can this PMD please be merged into the DPDK core?  Having a single list for
multiple git trees is really just frustrating.

Neil



[dpdk-dev] [PATCH 1/7] Split atomic operations to architecture specific

2014-09-30 Thread Chao CH Zhu
Bruce and Neil,

Thanks for your comments! Actually, the compiler hides the differences 
between architectures.
I'll submit another patch to correct this!


Best Regards!
--
Chao Zhu (??)
Research Staff Member
Cloud Infrastructure and Technology Group
IBM China Research Lab
Building 19 Zhongguancun Software Park
8 Dongbeiwang West Road, Haidian District,
Beijing, PRC. 100193
Tel: +86-10-58748711
Email: bjzhuc at cn.ibm.com




From:   Neil Horman 
To: Bruce Richardson 
Cc: Chao CH Zhu/China/IBM at IBMCN, dev at dpdk.org
Date:   2014/09/29 23:23
Subject:Re: [dpdk-dev] [PATCH 1/7] Split atomic operations to 
architecture specific



On Mon, Sep 29, 2014 at 12:05:22PM +0100, Bruce Richardson wrote:
> On Fri, Sep 26, 2014 at 05:33:32AM -0400, Chao Zhu wrote:
> > This patch splits the atomic operations from DPDK and push them to
> > architecture specific arch directories, so that other processor
> > architecture to support DPDK can be easily adopted.
> > 
> > Signed-off-by: Chao Zhu 
> > ---
> >  lib/librte_eal/common/Makefile |2 +-
> >  .../common/include/i686/arch/rte_atomic_arch.h |  378 
> >  lib/librte_eal/common/include/rte_atomic.h |  172 +
> >  .../common/include/x86_64/arch/rte_atomic_arch.h   |  378 
> >  4 files changed, 772 insertions(+), 158 deletions(-)
> >  create mode 100644 lib/librte_eal/common/include/i686/arch/rte_atomic_arch.h
> >  create mode 100644 lib/librte_eal/common/include/x86_64/arch/rte_atomic_arch.h
> > 
> <...snip...>
> > +#define rte_compiler_barrier() rte_arch_compiler_barrier()
> 
> Small question: shouldn't the compiler barrier be independent of 
> architecture?
> 
Agreed, compiler intrinsics I thought were used to define barriers, regardless
of arch (__memory_barrier() is the gcc intrinsic IIRC)
Neil
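
A sketch of an architecture-independent compiler barrier, assuming a GCC-compatible compiler: an empty asm statement with a "memory" clobber stops the compiler from moving memory accesses across it and emits no CPU instruction. The macro and variable names are hypothetical and are not the names used in the patch.

```c
#include <assert.h>

/* Compiler-only barrier: no fence instruction is generated. */
#define compiler_barrier_sketch() __asm__ __volatile__("" ::: "memory")

static int payload_sketch;
static int ready_sketch;

/*
 * Store the payload, then the flag. The barrier keeps the compiler from
 * reordering the two stores; it does not order them at the CPU level.
 */
static void
publish_sketch(int value)
{
	payload_sketch = value;
	compiler_barrier_sketch();
	ready_sketch = 1;
}
```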

> /Bruce
> 
> 





[dpdk-dev] [PATCH v5 05/11] lib/librte_vhost: merge Oliver's mbuf change

2014-09-30 Thread Thomas Monjalon
2014-09-30 06:43, Xie, Huawei:
> > 2014-09-30 02:41, Xie, Huawei:
> > > I would rework the patch according to your comment.
> > > > I am not clear about this comment. Do you mean that I should recreate
> > > > the patch set based on the example that already has this mbuf change?
> > 
> > Yes
> > 
> > > > Some of the background you might not know:
> > > > I fully understand your concern here to make it a better patch and I
> > > > fully agree with all of your comments.
> > > > This is really a special case. You know it is a transformation of
> > > > thousands of lines of code with modifications.
> > > > Sometimes a simple change could take me more than one day to rework the
> > > > patch, with line-by-line manual checks.
> > > > I have already spent more than one week merely on the patch format
> > > > itself. :(.
> > 
> 
> > I know. I think you are learning (the hard way) how to use git.
> > As Ouyang said in this thread, you should use "git rebase"
> > and especially the --interactive mode to update your changes.
> > And you should make small commits at first. It's easier to squash
> > commits than splitting them.
> > 
> > > > Could we possibly treat it specially when we have comments on whether
> > > > the patch can be split/merged better?
> > 
> > > I thought about it many times because I see it causes you a lot of trouble.
> > > But I still think that vhost is an important feature and we probably
> > > want to be able to understand the reasons behind the changes
> > > by looking at the git history. That's why I'd like you to make smaller
> > > refactoring commits with explanations in commit logs.
> > > 
> > > That said, we should continue working together on it.
> > Send me your drafts and I'll help you to split them. The part I cannot do
> > by myself is about the explanations in commit logs.
> >
> Based on the principle "small commit", this is the rough idea. Btw, the git tool 
> won't help here. Please have a quick read through. I will start the rework 
> asap.
> Patch 1:
>   copy examples/vhost/main.c/lib/librte_vhost/vhost/vhost_rxtx.c
>   git mv examples/vhost/eventfd_link  /lib/librte_vhost/
>   git mv examples/vhost/libvirt   /lib/librte/vhost/libvirt
>   git mv examples/vhost/vhost-net-cdev.c  /lib/librte_vhost/
>   git mv examples/vhost/vhost-net-cdev.h /lib/librte_vhost/  
>   git mv examples/vhost/virtio-net.* /lib/librte_vhost/
> comment:
> As in the previous patch set, vhost_rxtx.c is partly copied from main.c, so here I 
> decide to "copy" rather than "mv" main.c from the example to the vhost lib. The 
> drawback is that gitk can't recognize that main.c and vhost_rxtx.c have the same 
> index.
> The advantage is that even with mv, gitk couldn't recognize the partial copy in 
> vhost_rxtx.c, and later we could patch the vhost example based on existing files.
> I will emphasize in the commit message that vhost_rxtx.c is purely a copy 
> without any modification.
> Delete examples/vhost/Makefile, as the vhost example can no longer be compiled 
> from this point until the example patch.
> 
> Another option is leave all example files, and copy needed ones to vhost lib 
> directory. The cons is gitk will treat all files in vhost lib as new files. 

No please move the files to the lib. It's easier to read.
You can drop all the example in 1 patch or make as you did before:
drop old example and rework it in subsequent patches.
The most important part is the library.

> patch 2:
>   rename virtio-net.c to rte_virtio_net.h

It can be done in patch 1.

> patch 3:
>   remove zero copy logic in related files.
> patch 4:
>   remove switching related logics in related files.
> patch 5:
>   delete all functions in vhost_rxtx.c except four functions 
> virtio_dev_(merge)_rx/tx and helper functions copy_mbuf_to_rings..
> comment here:
>   here virtio_dev_(merge)_rx/tx will refer non-existing functions like 
> virtio_tx_route.
> I think it is ok, right?
> patch 6:
>   remove virtio_dev_tx, and rename virtio_dev_merge_tx to
> virtio_dev_tx.
>   patch virtio_dev_rx, virtio_dev_merge_rx and virtio_dev_tx
>   will see if this can be further divided.

Yes please try to split and explain these important things.

> patch 7:
>   Other minor fixes, like changing global vars to static vars
> patch 8:
>   fixes plenty of serious coding style issues
> Patch 9:
>   Vhost API patch
> Patch 10:
>   Added identified TODO or FIXME
> patch 11:
>   Add vhost lib makefiles.
> Patch 12...:
>   Vhost example patch

OK for the other ones.
The global idea is to isolate minor changes in order to make important
changes clearly visible in dedicated patches.

-- 
Thomas


[dpdk-dev] [PATCH v2] distributor_app: new sample app

2014-09-30 Thread Ananyev, Konstantin
> -Original Message-
> From: Pattan, Reshma
> Sent: Tuesday, September 30, 2014 9:03 AM
> To: Ananyev, Konstantin; De Lara Guarch, Pablo; dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> 
> Hi Konstantin,
> 
> Any comments on Pablo's comment below? If so, please provide them.
> 
> Thanks,
> Reshma
> 

Hi Reshma,

Not really.
I would just change printf to what I suggested.
For your app, I believe that is more than enough.
But if you'd like to introduce some sort of rate limiting for logging,
sure, go ahead; it would be interesting to see that patch.
Konstantin

> 
> 
> -Original Message-
> From: De Lara Guarch, Pablo
> Sent: Monday, September 29, 2014 2:35 PM
> To: Ananyev, Konstantin; Pattan, Reshma; dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> 
> 
> 
> > -Original Message-
> > From: Ananyev, Konstantin
> > Sent: Monday, September 29, 2014 2:07 PM
> > To: Pattan, Reshma; De Lara Guarch, Pablo; dev at dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> >
> >
> >
> > > -Original Message-
> > > From: Pattan, Reshma
> > > Sent: Monday, September 29, 2014 1:40 PM
> > > To: Ananyev, Konstantin; De Lara Guarch, Pablo; dev at dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> > >
> > >
> > >
> > > -Original Message-
> > > From: Ananyev, Konstantin
> > > Sent: Friday, September 26, 2014 4:52 PM
> > > To: De Lara Guarch, Pablo; Pattan, Reshma; dev at dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> > >
> > >
> > >
> > > > -Original Message-
> > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of De Lara
> > > > Guarch, Pablo
> > > > Sent: Friday, September 26, 2014 4:12 PM
> > > > To: Pattan, Reshma; dev at dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> > > >
> > > > Hi,
> > > >
> > > > > -Original Message-
> > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of reshmapa
> > > > > Sent: Wednesday, September 24, 2014 3:17 PM
> > > > > To: dev at dpdk.org
> > > > > Subject: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> > > > >
> > > > > From: Reshma Pattan 
> > > > >
> > > > > A new sample app that shows the usage of the distributor library.
> > > > > This app works as follows:
> > > > >
> > > > > * An RX thread runs which pulls packets from each ethernet port in 
> > > > > turn
> > > > >   and passes those packets to worker using a distributor component.
> > > > > * The workers take the packets in turn, and determine the output port
> > > > >   for those packets using basic l2forwarding doing an xor on the 
> > > > > source
> > > > >   port id.
> > > > > * The RX thread takes the returned packets from the workers and
> > enqueue
> > > > >   those packets into an rte_ring structure.
> > > > > * A TX thread pulls the packets off the rte_ring structure and then
> > > > >   sends each packet out the output port specified previously by
> > > > > the worker
> > > > > * Command-line option support provided only for portmask.
> > > > >
> > > > > Signed-off-by: Bruce Richardson 
> > > > > Signed-off-by: Reshma Pattan 
> > > > > ---
> > > > >  examples/Makefile |   1 +
> > > > >  examples/distributor_app/Makefile |  57 
> > > > >  examples/distributor_app/main.c   | 585
> > > > > ++
> > > > >  examples/distributor_app/main.h   |  46 +++
> > > > >  4 files changed, 689 insertions(+)  create mode 100644
> > > > > examples/distributor_app/Makefile  create mode
> > > > > 100644 examples/distributor_app/main.c  create mode 100644
> > > > > examples/distributor_app/main.h
> > > > >
> > > > > diff --git a/examples/Makefile b/examples/Makefile index
> > > > > 6245f83..2ba82b0 100644
> > > > > --- a/examples/Makefile
> > > > > +++ b/examples/Makefile
> > > > > @@ -66,5 +66,6 @@ DIRS-y += vhost
> > > > >  DIRS-$(CONFIG_RTE_LIBRTE_XEN_DOM0) += vhost_xen  DIRS-y +=
> > vmdq
> > > > > DIRS-y += vmdq_dcb
> > > > > +DIRS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += distributor_app
> > > > >
> > > > >  include $(RTE_SDK)/mk/rte.extsubdir.mk diff --git
> > > > > a/examples/distributor_app/Makefile
> > > > > b/examples/distributor_app/Makefile
> > > > > new file mode 100644
> > > > > index 000..394785d
> > > > > --- /dev/null
> > > > > +++ b/examples/distributor_app/Makefile
> > > > > @@ -0,0 +1,57 @@
> > > > > +#   BSD LICENSE
> > > > > +#
> > > > > +#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> > > > > +#   All rights reserved.
> > > > > +#
> > > > > +#   Redistribution and use in source and binary forms, with or 
> > > > > without
> > > > > +#   modification, are permitted provided that the following 
> > > > > conditions
> > > > > +#   are met:
> > > > > +#
> > > > > +# * Redistributions of source code must retain the above 
> > > > > copyright
> > > > > +#   notice, this 

[dpdk-dev] [PATCH 1/4 v3] compat: Add infrastructure to support symbol versioning

2014-09-30 Thread Sergio Gonzalez Monroy
On Mon, Sep 29, 2014 at 11:44:03AM -0400, Neil Horman wrote:
> Add initial pass header files to support symbol versioning.
> 
> ---
> Change notes
> v2)
> * Fixed ifdef in rte_compat.h to test for RTE_BUILD_SHARED_LIB instead of the
> non-existent RTE_SYMBOL_VERSIONING
> 
> * Fixed VERSION_SYMBOL macro to add the needed extra @ to make versioning work
> properly
> 
> * Improved/Clarified documentation
> 
> v3)
> * Added missing macros to fully export the symver directive specification
> 
> Signed-off-by: Neil Horman 
> CC: Thomas Monjalon 
> CC: "Richardson, Bruce" 
> CC: "Gonzalez Monroy, Sergio" 
> ---
>  lib/Makefile   |  1 +
>  lib/librte_compat/Makefile | 38 ++
>  lib/librte_compat/rte_compat.h | 90 
> ++
>  mk/rte.lib.mk  |  6 +++
>  4 files changed, 135 insertions(+)
>  create mode 100644 lib/librte_compat/Makefile
>  create mode 100644 lib/librte_compat/rte_compat.h
> 
> diff --git a/lib/Makefile b/lib/Makefile
> index 10c5bb3..a85b55b 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -32,6 +32,7 @@
>  include $(RTE_SDK)/mk/rte.vars.mk
>  
>  DIRS-$(CONFIG_RTE_LIBC) += libc
> +DIRS-y += librte_compat
>  DIRS-$(CONFIG_RTE_LIBRTE_EAL) += librte_eal
>  DIRS-$(CONFIG_RTE_LIBRTE_MALLOC) += librte_malloc
>  DIRS-$(CONFIG_RTE_LIBRTE_RING) += librte_ring
> diff --git a/lib/librte_compat/Makefile b/lib/librte_compat/Makefile
> new file mode 100644
> index 000..3415c7b
> --- /dev/null
> +++ b/lib/librte_compat/Makefile
> @@ -0,0 +1,38 @@
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2010-2014 Neil Horman 
> +#   All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +# * Redistributions of source code must retain the above copyright
> +#   notice, this list of conditions and the following disclaimer.
> +# * Redistributions in binary form must reproduce the above copyright
> +#   notice, this list of conditions and the following disclaimer in
> +#   the documentation and/or other materials provided with the
> +#   distribution.
> +# * Neither the name of Intel Corporation nor the names of its
> +#   contributors may be used to endorse or promote products derived
> +#   from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +
> +# install includes
> +SYMLINK-y-include := rte_compat.h
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_compat/rte_compat.h b/lib/librte_compat/rte_compat.h
> new file mode 100644
> index 000..0b76771
> --- /dev/null
> +++ b/lib/librte_compat/rte_compat.h
> @@ -0,0 +1,90 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2010-2014 Neil Horman .
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + *   notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + *   notice, this list of conditions and the following disclaimer in
> + *   the documentation and/or other materials provided with the
> + *   distribution.
> + * * Neither the name of Intel Corporation nor the names of its
> + *   contributors may be used to endorse or promote products derived
> + *   from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> 

[dpdk-dev] [PATCH] llib/ibrte_net: workaround to avoid macro conflict

2014-09-30 Thread Neil Horman
On Tue, Sep 30, 2014 at 10:49:08AM +0800, Jingjing Wu wrote:
> Macros such as IPPROTO_TCP, IPPROTO_UDP are already defined in <netinet/in.h>.
> If a user's application includes <netinet/in.h> and rte_ip.h at the same time,
> there will be a conflict error.
> 
> This patch uses "#ifndef ... #endif" guards to avoid the conflict.
> 
> Signed-off-by: Jingjing Wu 
> ---
>  lib/librte_net/rte_ip.h | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/lib/librte_net/rte_ip.h b/lib/librte_net/rte_ip.h
> index e3f65c1..2bcb479 100644
> --- a/lib/librte_net/rte_ip.h
> +++ b/lib/librte_net/rte_ip.h
> @@ -116,6 +116,8 @@ struct ipv4_hdr {
>  
>  #define  IPV4_HDR_OFFSET_UNITS   8
>  
> +#ifndef _NETINET_IN_H
> +#ifndef _NETINET_IN_H_
>  /* IPv4 protocols */
>  #define IPPROTO_IP 0  /**< dummy for IP */
>  #define IPPROTO_HOPOPTS0  /**< IP6 hop-by-hop options */
> @@ -227,6 +229,9 @@ struct ipv4_hdr {
>  #define IPPROTO_RAW  255  /**< raw IP packet */
>  #define IPPROTO_MAX  256  /**< maximum protocol number */
>  
> +#endif /*_NETINET_IN_H_*/
> +#endif /*_NETINET_IN_H*/
> +
>  /*
>   * IPv4 address types
>   */
> -- 
> 1.8.1.4
> 
> 
Why define them at all?  Why not just have rte_ip.h include netinet/in.h
directly?  It's a standard include file in a standard location for both bsd and
linux IIRC.
Neil
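
A sketch of what that suggestion looks like from the consumer's side: include the system header and use its IPPROTO_* values directly, with no private copies to keep in sync. proto_name_sketch is a hypothetical helper, not part of rte_ip.h.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Map a few IP protocol numbers from <netinet/in.h> to short names. */
static const char *
proto_name_sketch(uint8_t proto)
{
	switch (proto) {
	case IPPROTO_ICMP:
		return "icmp";
	case IPPROTO_TCP:
		return "tcp";
	case IPPROTO_UDP:
		return "udp";
	default:
		return "other";
	}
}
```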



[dpdk-dev] [PATCH v2 14/18] ixgbe: Remove unnecessary delay

2014-09-30 Thread Neil Horman
On Mon, Sep 29, 2014 at 03:16:22PM +0800, Ouyang Changchun wrote:
> This patch removes unnecessary delay when setting up physical link
> and negotiating in IXGBE share code.
> 
Why was this there in the first place then?  Was there some older hardware that
required it?  If so, is there a need to continue compatibility with it?  More
detail would be appreciated here.
Neil



[dpdk-dev] [PATCH v2] Change alarm cancel function to thread-safe:

2014-09-30 Thread Neil Horman
On Tue, Sep 30, 2014 at 12:30:08PM +, Ananyev, Konstantin wrote:
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Wodkowski, PawelX
> > Sent: Tuesday, September 30, 2014 1:05 PM
> > To: Wodkowski, PawelX; Richardson, Bruce
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2] Change alarm cancel function to 
> > thread-safe:
> > 
> > > -Original Message-
> > > Pawe?
> > >
> > > > On Mon, Sep 29, 2014 at 10:11:38AM +, Wodkowski, PawelX wrote:
> > > > > > >
> > > > > > > Imagine how you will be damned by someone who does not even notice
> > > > > > > your change and who is managing some kind of resource based on the
> > > > > > > returned number of set/canceled timers. If you suddenly start
> > > > > > > returning negative values, how will those applications behave?
> > > > > > > Silently changing the returned value domain is evil in its pure form.
> > > > > >
> > > > > > As I can see the impact is very limited.
> > > > >
> > > > > It is small impact to DPDK but can be huge to user application:
> > > >
> > > > > This is why we traditionally have in the release-notes for each release a
> > > > > section dedicated to calling out changes from one release to another. [See
> > > > > http://dpdk.org/doc/intel/dpdk-release-notes-1.7.0.pdf section 5]. Since
> > > > > from release-to-release there are generally only a couple of changes -
> > > > > though our next release may be a little different - the actual changes are
> > > > > clear enough to read about without wading through pages of documentation.
> > > > > I think calling out the change in both the release notes and the API docs
> > > > > is sufficient even for a change like this.
> > > > >
> > > > > Basically, I wouldn't let API stability factor in too much in trying to get
> > > > > a proper fix for this issue.
> > > >
> > > > /Bruce
> > > >
> > >
> > > > Summarizing all proposed solutions, and to be able to produce a final
> > > > patch: what is the desired behavior after the fix?
> > > >
> > > > 1. Do we leave it as is in patch v2:
> > > > 1.1 If canceling from another thread: if one of the alarms is executing,
> > > >   wait for it to finish its execution and then cancel any rearmed alarms.
> > > > 1.2 If canceling from an alarm handler and one of the alarms to cancel is
> > > >   this executing callback, do not wait for it to finish and cancel
> > > >   anything else.
> > > >
> > > >  In both cases return the number of canceled callbacks.
> > > >
> > > > 2. Do exactly as in 1, but return -EINPROGRESS instead of the number of
> > > >   canceled alarms if one of the alarms to cancel is currently executing
> > > >   its callback from the alarm thread (information about the number of
> > > >   canceled alarms will be lost).
> > 
> > Or instead of returning -EINPROGRESS set errno to EINPROGRESS (replace
> > returning error value by setting errno and function can always return number
> > of canceled callbacks - in error condition 0)?
> 
> Yes, that looks like a better option.
> As I remember, inside DPDK we have our own rte_errno, which we can probably
> use here.
> My vote would be for that approach.
> 
You'll want to document that interface to make sure callers understand that a
non-zero return code doesn't automatically mean complete success, but yes, in
this case that seems like a pretty reasonable solution.
Neil
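
A self-contained sketch of the interface shape being agreed on here: the function always returns the number of canceled alarms and reports "one of them was executing" through an errno-style variable. It uses plain errno and a hypothetical alarm_sketch structure; it is not the rte_eal alarm implementation, and rte_errno is not used so that the example stands alone.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical alarm record; not the DPDK alarm structure. */
struct alarm_sketch {
	int armed;	/* 1 while the alarm is still pending */
	int executing;	/* 1 while its callback is running */
};

/*
 * Cancel every pending alarm in the array and return how many were
 * canceled. An alarm whose callback is currently running is left alone
 * and errno is set to EINPROGRESS, so the count is never lost.
 */
static int
cancel_alarms_sketch(struct alarm_sketch *alarms, size_t n)
{
	int canceled = 0;
	int in_progress = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (alarms[i].executing) {
			in_progress = 1;
			continue;
		}
		if (alarms[i].armed) {
			alarms[i].armed = 0;
			canceled++;
		}
	}

	errno = in_progress ? EINPROGRESS : 0;
	return canceled;
}
```

As noted above, a caller has to check the error variable as well as the return value: a non-zero count alone does not mean every matching alarm was canceled.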

> Konstantin
> 
> > 
> > >
> > > 3. refuse to cancel anything if canceling currently executing alarm from 
> > > alarm
> > >   callback and return -EINPROGRESS otherwise do like in 1.1.
> > >
> > > 4. Implement behaviour 1/2/3 (which?) and add API call to interrogate 
> > > list of
> > > >   alarms and return state of given alarm(s).
> > >
> > > 5. other solutions?
> > >
> > > Pawel
> 
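
The errno-based variant discussed in this thread could look roughly like the
following. This is a minimal, self-contained sketch and not the DPDK
implementation: struct toy_alarm, toy_alarm_cancel() and the "executing" flag
are hypothetical names, and plain errno stands in for rte_errno.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical alarm record; not the DPDK structure. */
struct toy_alarm {
    void (*cb)(void *);   /* callback to run */
    void *arg;            /* callback argument */
    int armed;            /* 1 while the alarm is pending */
    int executing;        /* 1 while the callback is running */
};

/*
 * Cancel every pending alarm that matches cb/arg.
 * Always returns the number of alarms cancelled. If a matching alarm is
 * currently executing it is left alone and errno is set to EINPROGRESS.
 * errno is not touched otherwise, so callers clear it before the call.
 */
static int toy_alarm_cancel(struct toy_alarm *alarms, size_t n,
                            void (*cb)(void *), void *arg)
{
    int cancelled = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!alarms[i].armed || alarms[i].cb != cb || alarms[i].arg != arg)
            continue;
        if (alarms[i].executing) {
            errno = EINPROGRESS;  /* report it, but keep counting the rest */
            continue;
        }
        alarms[i].armed = 0;
        cancelled++;
    }
    return cancelled;
}
```

With this shape a caller has to check both values: a non-zero return count
together with errno == EINPROGRESS means that some, but not all, matching
alarms were cancelled.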


[dpdk-dev] [PATCH] Fix for LRU corrupted returns

2014-09-30 Thread Neil Horman
On Tue, Sep 30, 2014 at 06:26:23AM +, Saha, Avik (AWS) wrote:
> Sorry about the delay. The number 32 is not really a CACHE_LINE_SIZE but 
> since __builtin_clz returns the number of leading 0's before the most 
> significant set bit in a 32 bit number (entry_size is uint32_t), I subtract 
> that number from 32 to get the number of trailing bits after the most 
> significant set bit. This will be the separation in my data_mem regions.
> 
Ah, ok, then change that 32 to the bit width of the type, i.e.
sizeof(entry_size) * CHAR_BIT (sizeof alone yields bytes, not bits), to protect
you against type changes and to avoid having magic values running around in your
code.  Also, you might want to do some sanity checking of entry_size, as it seems
like there's a soft assumption that entry_size is non-zero and a power of two.
While the latter is checked higher in the function, the former isn't, and
__builtin_clz has undefined behavior if it's passed a zero value.

Neil
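
A guarded version of that calculation might look like the sketch below. The
helper name toy_entry_size_bits() is hypothetical and is not part of
librte_table; it assumes a GCC-compatible compiler and that unsigned int is as
wide as uint32_t, which is what __builtin_clz operates on.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper: bit length of entry_size, i.e. the value the patch
 * computes as 32 - __builtin_clz(entry_size), but with the width derived
 * from the type and with zero rejected up front.
 * Returns 0 on success and -1 if entry_size is 0.
 */
static int toy_entry_size_bits(uint32_t entry_size, uint32_t *bits)
{
    const uint32_t width = (uint32_t)(sizeof(entry_size) * CHAR_BIT);

    if (entry_size == 0)
        return -1;    /* __builtin_clz(0) is undefined */

    *bits = width - (uint32_t)__builtin_clz(entry_size);
    return 0;
}
```

Returning an error code rather than a shift of 0 keeps the zero case visible
to the caller, which could then fail table creation in the same way as the
other parameter checks.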

> -Original Message-
> From: Neil Horman [mailto:nhorman at tuxdriver.com] 
> Sent: Thursday, September 25, 2014 3:22 AM
> To: Saha, Avik (AWS)
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] Fix for LRU corrupted returns
> 
> On Thu, Sep 25, 2014 at 07:46:16AM +, Saha, Avik (AWS) wrote:
> > This is a patch to a problem that I have faced (described in the  thread) 
> > and this works for me.
> > 
> > 1)  Since the data_size_shl was getting its value from the key_size, 
> > the table data entries were being corrupted when the calculation to shift 
> > the number of bits was being made based on the key_size (according to the 
> > document the key_size and entry_size are independently configurable) - With 
> > this fix, we get the MSB that is set in entry_size (also removes the 
> > constraint of this having to be a power of 2 - not entirely sure if this 
> > was the reason the constraint was kept though)
> > 2)  The document does not say that the entry_size needs to be a power 
> > of 2 and this was failing silently when I was trying to bring my 
> > application up.
> > 
> > diff --git a/DPDK/lib/librte_table/rte_table_hash_lru.c 
> > b/DPDK/lib/librte_table/rte_table_hash_lru.c
> > index d1a4984..4ec9aa4 100644
> > --- a/DPDK/lib/librte_table/rte_table_hash_lru.c
> > +++ b/DPDK/lib/librte_table/rte_table_hash_lru.c
> > @@ -153,8 +153,10 @@ rte_table_hash_lru_create(void *params, int socket_id, 
> > uint32_t entry_size)
> > uint32_t i;
> > 
> > /* Check input parameters */
> > -   if ((check_params_create(p) != 0) ||
> > -   (!rte_is_power_of_2(entry_size)) ||
> > +   // Commenting out the power of 2 check on the entry_size since the
> > +   // Programmers Guide does not call this out and we are going to 
> > handle
> > +   // the data_size_shl of the table later on (Line 197)
> Please remove the reference to Line 197 here.  That's not going to remain 
> accurate for very long.
> 
> > +   if ((check_params_create(p) != 0) ||
> > ((sizeof(struct rte_table_hash) % CACHE_LINE_SIZE) != 0) ||
> > (sizeof(struct bucket) != (CACHE_LINE_SIZE / 2))) {
> > return NULL;
> > @@ -192,7 +194,7 @@ rte_table_hash_lru_create(void *params, int socket_id, 
> > uint32_t entry_size)
> > /* Internal */
> > t->bucket_mask = t->n_buckets - 1;
> > t->key_size_shl = __builtin_ctzl(p->key_size);
> > -   t->data_size_shl = __builtin_ctzl(p->key_size);
> > +   t->data_size_shl = 32 - (__builtin_clz(entry_size));
> I presume the 32 value here is a cache line size?  That should be replaced 
> with CACHE_LINE_SIZE...Though looking at it, that doesn't seem sufficient.  
> Seems like we need a eal abstraction to dynamically tell us what the cache 
> line size is (we can read it from /proc/cpuinfo in linux, not sure about bsd).
> 
> Neil
> 
> 


[dpdk-dev] [PATCH v2] distributor_app: new sample app

2014-09-30 Thread Pattan, Reshma
Hi Konstantin,

Any comments on Pablo's comment below? If so, please provide them.

Thanks,
Reshma



-Original Message-
From: De Lara Guarch, Pablo 
Sent: Monday, September 29, 2014 2:35 PM
To: Ananyev, Konstantin; Pattan, Reshma; dev at dpdk.org
Subject: RE: [dpdk-dev] [PATCH v2] distributor_app: new sample app



> -Original Message-
> From: Ananyev, Konstantin
> Sent: Monday, September 29, 2014 2:07 PM
> To: Pattan, Reshma; De Lara Guarch, Pablo; dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> 
> 
> 
> > -Original Message-
> > From: Pattan, Reshma
> > Sent: Monday, September 29, 2014 1:40 PM
> > To: Ananyev, Konstantin; De Lara Guarch, Pablo; dev at dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> >
> >
> >
> > -Original Message-
> > From: Ananyev, Konstantin
> > Sent: Friday, September 26, 2014 4:52 PM
> > To: De Lara Guarch, Pablo; Pattan, Reshma; dev at dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> >
> >
> >
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of De Lara 
> > > Guarch, Pablo
> > > Sent: Friday, September 26, 2014 4:12 PM
> > > To: Pattan, Reshma; dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> > >
> > > Hi,
> > >
> > > > -Original Message-
> > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of reshmapa
> > > > Sent: Wednesday, September 24, 2014 3:17 PM
> > > > To: dev at dpdk.org
> > > > Subject: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> > > >
> > > > From: Reshma Pattan 
> > > >
> > > > A new sample app that shows the usage of the distributor library.
> > > > This app works as follows:
> > > >
> > > > * An RX thread runs which pulls packets from each ethernet port in turn
> > > >   and passes those packets to worker using a distributor component.
> > > > * The workers take the packets in turn, and determine the output port
> > > >   for those packets using basic l2forwarding doing an xor on the source
> > > >   port id.
> > > > * The RX thread takes the returned packets from the workers and
> enqueue
> > > >   those packets into an rte_ring structure.
> > > > * A TX thread pulls the packets off the rte_ring structure and then
> > > >   sends each packet out the output port specified previously by 
> > > > the worker
> > > > * Command-line option support provided only for portmask.
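
The "xor on the source port id" step in the list above can be illustrated with
a few lines of C. This is only a sketch: toy_output_port() is a hypothetical
name, and XOR-ing with 1 (which pairs ports 0<->1, 2<->3, and so on) is an
assumption, since the quoted description does not state the XOR value.

```c
#include <stdint.h>

/* Assumed pairing: XOR with 1 swaps ports 0<->1, 2<->3, ... */
static inline uint8_t toy_output_port(uint8_t in_port)
{
    return (uint8_t)(in_port ^ 1u);
}
```

Applying the function twice returns the original port, so traffic entering on
either port of a pair leaves on the other one.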
> > > >
> > > > Signed-off-by: Bruce Richardson 
> > > > Signed-off-by: Reshma Pattan 
> > > > ---
> > > >  examples/Makefile |   1 +
> > > >  examples/distributor_app/Makefile |  57 
> > > >  examples/distributor_app/main.c   | 585
> > > > ++
> > > >  examples/distributor_app/main.h   |  46 +++
> > > >  4 files changed, 689 insertions(+)  create mode 100644 
> > > > examples/distributor_app/Makefile  create mode
> > > > 100644 examples/distributor_app/main.c  create mode 100644 
> > > > examples/distributor_app/main.h
> > > >
> > > > diff --git a/examples/Makefile b/examples/Makefile index
> > > > 6245f83..2ba82b0 100644
> > > > --- a/examples/Makefile
> > > > +++ b/examples/Makefile
> > > > @@ -66,5 +66,6 @@ DIRS-y += vhost
> > > >  DIRS-$(CONFIG_RTE_LIBRTE_XEN_DOM0) += vhost_xen  DIRS-y +=
> vmdq
> > > > DIRS-y += vmdq_dcb
> > > > +DIRS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += distributor_app
> > > >
> > > >  include $(RTE_SDK)/mk/rte.extsubdir.mk diff --git 
> > > > a/examples/distributor_app/Makefile
> > > > b/examples/distributor_app/Makefile
> > > > new file mode 100644
> > > > index 000..394785d
> > > > --- /dev/null
> > > > +++ b/examples/distributor_app/Makefile
> > > > @@ -0,0 +1,57 @@
> > > > +#   BSD LICENSE
> > > > +#
> > > > +#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> > > > +#   All rights reserved.
> > > > +#
> > > > +#   Redistribution and use in source and binary forms, with or without
> > > > +#   modification, are permitted provided that the following conditions
> > > > +#   are met:
> > > > +#
> > > > +# * Redistributions of source code must retain the above copyright
> > > > +#   notice, this list of conditions and the following disclaimer.
> > > > +# * Redistributions in binary form must reproduce the above
> copyright
> > > > +#   notice, this list of conditions and the following disclaimer in
> > > > +#   the documentation and/or other materials provided with the
> > > > +#   distribution.
> > > > +# * Neither the name of Intel Corporation nor the names of its
> > > > +#   contributors may be used to endorse or promote products
> derived
> > > > +#   from this software without specific prior written permission.
> > > > +#
> > > > +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> > > > CONTRIBUTORS
> > > > +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING,
> BUT
> > > > NOT
> > > > +#   LIMITED TO, 

[dpdk-dev] [memnic PATCH v2 0/7] MEMNIC PMD performance improvement

2014-09-30 Thread Venkatesan, Venky

On 9/30/2014 7:29 AM, Neil Horman wrote:
> On Tue, Sep 30, 2014 at 11:10:45AM +, Hiroshi Shimamoto wrote:
>> From: Hiroshi Shimamoto 
>>
>> This patchset improves MEMNIC PMD performance.
>>
>> The first patch introduces a new benchmark test run in guest,
>> and will be used to evaluate the following patch effects.
>>
>> This patchset improves the throughput results of memnic-tester.
>> Using Xeon E5-2697 v2 @ 2.70GHz, 4 vCPU.
>>   size |  before  |  after
>>     64 | 4.18Mpps | 5.83Mpps
>>    128 | 3.85Mpps | 5.71Mpps
>>    256 | 4.01Mpps | 5.40Mpps
>>    512 | 3.52Mpps | 4.64Mpps
>>   1024 | 3.18Mpps | 3.68Mpps
>>   1280 | 2.86Mpps | 3.17Mpps
>>   1518 | 2.59Mpps | 2.90Mpps
>>
>> Hiroshi Shimamoto (7):
>>guest: memnic-tester: PMD benchmark in guest
>>pmd: remove needless assignment
>>pmd: use helper macros
>>pmd: use compiler barrier
>>pmd: packet receiving optimization with prefetch
>>pmd: add branch hint in recv/xmit
>>pmd: burst mbuf freeing in xmit
>>
>>   guest/Makefile|  20 
>>   guest/README.rst  |  93 +
>>   guest/memnic-tester.c | 281 
>> ++
>>   pmd/pmd_memnic.c  |  45 
>>   4 files changed, 417 insertions(+), 22 deletions(-)
>>   create mode 100644 guest/Makefile
>>   create mode 100644 guest/README.rst
>>   create mode 100644 guest/memnic-tester.c
>>
>> -- 
>> 1.8.3.1
>>
>>
> Can this PMD please be merged into the DPDK core. Having a single list for
> multiple git trees is really just frustrating.
>
> Neil
>
Second that motion. This would be useful to have in the DPDK core

-Venky


[dpdk-dev] [PATCH v3] distributor_app: new sample app

2014-09-30 Thread Neil Horman
On Tue, Sep 30, 2014 at 11:39:37AM +0100, reshmapa wrote:
> From: Reshma Pattan 
> 
> A new sample app that shows the usage of the distributor library. This
> app works as follows:
> 
> * An RX thread runs which pulls packets from each ethernet port in turn
>   and passes those packets to worker using a distributor component.
> * The workers take the packets in turn, and determine the output port
>   for those packets using basic l2forwarding doing an xor on the source
>   port id.
> * The RX thread takes the returned packets from the workers and enqueue
>   those packets into an rte_ring structure.
> * A TX thread pulls the packets off the rte_ring structure and then
>   sends each packet out the output port specified previously by the worker
> * Command-line option support provided only for portmask.
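
The hand-off between the RX thread and the TX thread in the list above goes
through an rte_ring. The following is a deliberately simplified stand-in to
show the FIFO idea only: struct toy_ring and its functions are hypothetical,
have no locking or memory barriers, and are therefore safe only in a single
thread, unlike the real rte_ring.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_RING_SIZE 8u  /* must be a power of two */

struct toy_ring {
    uint32_t head;                  /* count of objects enqueued so far */
    uint32_t tail;                  /* count of objects dequeued so far */
    void *slots[TOY_RING_SIZE];
};

/* Add one object; returns 0 on success, -1 if the ring is full. */
static int toy_ring_enqueue(struct toy_ring *r, void *obj)
{
    if (r->head - r->tail == TOY_RING_SIZE)
        return -1;
    r->slots[r->head & (TOY_RING_SIZE - 1)] = obj;
    r->head++;
    return 0;
}

/* Remove the oldest object; returns 0 on success, -1 if the ring is empty. */
static int toy_ring_dequeue(struct toy_ring *r, void **obj)
{
    if (r->head == r->tail)
        return -1;
    *obj = r->slots[r->tail & (TOY_RING_SIZE - 1)];
    r->tail++;
    return 0;
}
```

In the sample app's terms, the RX side would call the enqueue function with
each packet returned by a worker and the TX side would drain the ring with the
dequeue function, preserving the order in which packets were enqueued.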
> 
> Signed-off-by: Bruce Richardson 
> Signed-off-by: Reshma Pattan
> ---
>  examples/Makefile |1 +
>  examples/distributor_app/Makefile |   57 
>  examples/distributor_app/main.c   |  600 
> +
>  examples/distributor_app/main.h   |   46 +++
>  4 files changed, 704 insertions(+), 0 deletions(-)
>  create mode 100644 examples/distributor_app/Makefile
>  create mode 100644 examples/distributor_app/main.c
>  create mode 100644 examples/distributor_app/main.h
> 
> diff --git a/examples/Makefile b/examples/Makefile
> index 6245f83..2ba82b0 100644
> --- a/examples/Makefile
> +++ b/examples/Makefile
> @@ -66,5 +66,6 @@ DIRS-y += vhost
>  DIRS-$(CONFIG_RTE_LIBRTE_XEN_DOM0) += vhost_xen
>  DIRS-y += vmdq
>  DIRS-y += vmdq_dcb
> +DIRS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += distributor_app
>  
>  include $(RTE_SDK)/mk/rte.extsubdir.mk
> diff --git a/examples/distributor_app/Makefile 
> b/examples/distributor_app/Makefile
> new file mode 100644
> index 000..6a5bada
> --- /dev/null
> +++ b/examples/distributor_app/Makefile
> @@ -0,0 +1,57 @@
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> +#   All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +# * Redistributions of source code must retain the above copyright
> +#   notice, this list of conditions and the following disclaimer.
> +# * Redistributions in binary form must reproduce the above copyright
> +#   notice, this list of conditions and the following disclaimer in
> +#   the documentation and/or other materials provided with the
> +#   distribution.
> +# * Neither the name of Intel Corporation nor the names of its
> +#   contributors may be used to endorse or promote products derived
> +#   from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +
> +ifeq ($(RTE_SDK),)
> +$(error "Please define RTE_SDK environment variable")
> +endif
> +
> +# Default target, can be overriden by command line or environment
> +RTE_TARGET ?= x86_64-native-linuxapp-gcc
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# binary name
> +APP = distributor_app
> +
> +# all source are stored in SRCS-y
> +SRCS-y := main.c
> +
> +CFLAGS += $(WERROR_FLAGS)
> +
> +# workaround for a gcc bug with noreturn attribute
> +# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
> +ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
> +CFLAGS_main.o += -Wno-return-type
> +endif
> +
> +EXTRA_CFLAGS += -O3 -Wfatal-errors
> +
> +include $(RTE_SDK)/mk/rte.extapp.mk
> diff --git a/examples/distributor_app/main.c b/examples/distributor_app/main.c
> new file mode 100644
> index 000..f555d93
> --- /dev/null
> +++ b/examples/distributor_app/main.c
> @@ -0,0 +1,600 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + *   notice, this list of conditions and the following 

[dpdk-dev] [PATCH 2/2] librte_pmd_null: Enable librte_pmd_null

2014-09-30 Thread Neil Horman
On Tue, Sep 30, 2014 at 06:56:10PM +0900, mukawa at igel.co.jp wrote:
> From: Tetsuya Mukawa 
> 
> Signed-off-by: Tetsuya Mukawa 
> ---
>  mk/rte.app.mk | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
> index 34dff2a..f059290 100644
> --- a/mk/rte.app.mk
> +++ b/mk/rte.app.mk
> @@ -179,6 +179,10 @@ LDLIBS += -lrte_pmd_xenvirt
>  LDLIBS += -lxenstore
>  endif
>  
> +ifeq ($(CONFIG_RTE_LIBRTE_PMD_NULL),y)
> +LDLIBS += -lrte_pmd_null
> +endif
> +
You don't need to add this, as the pmd can be loaded dynamically via the dlopen
call executed via the -d option on the test app command line.  The only pmds
that need explicit linking are those that offer additional API calls to an
application.

Neil

>  ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),n)
>  # plugins (link only if static libraries)
>  
> -- 
> 1.9.1
> 
> 


[dpdk-dev] [PATCH v2] bond: Add mode 4 support.

2014-09-30 Thread Pawel Wodkowski
This patch adds support for mode 4 of link bonding. It depends on Declan
Doherty's patches v3 and the rte alarms patch v2 or above.

The new version handles race issues with setting/cancelling callbacks,
fixes promiscuous mode setting in mode 4 and some other minor errors in the
mode 4 implementation.
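
Mode 4 depends on recognising slow-protocol frames (LACP and Marker), which
use EtherType 0x8809, the value this patch adds as ETHER_TYPE_SLOW. A minimal
sketch of such a check on a raw, untagged Ethernet frame is shown below;
toy_is_slow_frame() and the TOY_ constants are hypothetical names and are not
taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_ETHER_TYPE_SLOW 0x8809 /* same value the patch gives ETHER_TYPE_SLOW */
#define TOY_ETHER_HDR_LEN   14     /* dst MAC + src MAC + EtherType */

/* Return 1 if the untagged Ethernet frame carries a slow-protocol PDU. */
static int toy_is_slow_frame(const uint8_t *frame, size_t len)
{
    uint16_t ether_type;

    if (frame == NULL || len < TOY_ETHER_HDR_LEN)
        return 0;
    /* The EtherType is stored big-endian at byte offset 12. */
    ether_type = (uint16_t)((frame[12] << 8) | frame[13]);
    return ether_type == TOY_ETHER_TYPE_SLOW;
}
```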


Signed-off-by: Pawel Wodkowski 
---
 lib/librte_ether/rte_ether.h   |1 +
 lib/librte_pmd_bond/Makefile   |1 +
 lib/librte_pmd_bond/rte_eth_bond.h |4 +
 lib/librte_pmd_bond/rte_eth_bond_8023ad.c  | 1070 
 lib/librte_pmd_bond/rte_eth_bond_8023ad.h  |  405 +++
 lib/librte_pmd_bond/rte_eth_bond_api.c |   82 ++-
 lib/librte_pmd_bond/rte_eth_bond_args.c|1 +
 lib/librte_pmd_bond/rte_eth_bond_pmd.c |  261 ++-
 lib/librte_pmd_bond/rte_eth_bond_private.h |   42 +-
 9 files changed, 1821 insertions(+), 46 deletions(-)
 create mode 100644 lib/librte_pmd_bond/rte_eth_bond_8023ad.c
 create mode 100644 lib/librte_pmd_bond/rte_eth_bond_8023ad.h

diff --git a/lib/librte_ether/rte_ether.h b/lib/librte_ether/rte_ether.h
index 2e08f23..1a3711b 100644
--- a/lib/librte_ether/rte_ether.h
+++ b/lib/librte_ether/rte_ether.h
@@ -293,6 +293,7 @@ struct vlan_hdr {
 #define ETHER_TYPE_RARP 0x8035 /**< Reverse Arp Protocol. */
 #define ETHER_TYPE_VLAN 0x8100 /**< IEEE 802.1Q VLAN tagging. */
 #define ETHER_TYPE_1588 0x88F7 /**< IEEE 802.1AS 1588 Precise Time Protocol. */
+#define ETHER_TYPE_SLOW 0x8809 /**< Slow protocols (LACP and Marker). */

 #ifdef __cplusplus
 }
diff --git a/lib/librte_pmd_bond/Makefile b/lib/librte_pmd_bond/Makefile
index 953d75e..c2312c2 100644
--- a/lib/librte_pmd_bond/Makefile
+++ b/lib/librte_pmd_bond/Makefile
@@ -44,6 +44,7 @@ CFLAGS += $(WERROR_FLAGS)
 #
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_api.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_pmd.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_8023ad.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_args.c

 #
diff --git a/lib/librte_pmd_bond/rte_eth_bond.h 
b/lib/librte_pmd_bond/rte_eth_bond.h
index 6811c7b..b0223c2 100644
--- a/lib/librte_pmd_bond/rte_eth_bond.h
+++ b/lib/librte_pmd_bond/rte_eth_bond.h
@@ -75,6 +75,10 @@ extern "C" {
 /**< Broadcast (Mode 3).
  * In this mode all transmitted packets will be transmitted on all available
  * active slaves of the bonded. */
+#define BONDING_MODE_8023AD (4)
+/**< 802.3AD (Mode 4).
+ * In this mode transmission and reception of packets is managed by LACP
+ * protocol specified in 802.3AD documentation. */

 /* Balance Mode Transmit Policies */
 #define BALANCE_XMIT_POLICY_LAYER2 (0)
diff --git a/lib/librte_pmd_bond/rte_eth_bond_8023ad.c 
b/lib/librte_pmd_bond/rte_eth_bond_8023ad.c
new file mode 100644
index 000..de416c6
--- /dev/null
+++ b/lib/librte_pmd_bond/rte_eth_bond_8023ad.c
@@ -0,0 +1,1070 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include "rte_eth_bond_private.h"
+#include "rte_eth_bond_8023ad.h"
+
+#include 
+
+#ifdef RTE_LIBRTE_BOND_DEBUG_8023AD
+#define MODE4_DEBUG(fmt, ...) RTE_LOG(DEBUG, PMD, "%6u [Port %u: %s] " fmt, \
+   bond_dbg_get_time_diff_ms(), 
internals->active_slaves[port_num], \
+   

[dpdk-dev] rc1 / call for review

2014-09-30 Thread Thomas Monjalon
2014-09-29 14:33, Matthew Hall:
> On Mon, Sep 29, 2014 at 10:23:58PM +0200, Thomas Monjalon wrote:
> > - mbuf rework
> > - logs rework
> > - some eal cleanups
> 
> I was curious, did we happen to know if any of these three changes affected 
> the external API's much?
> 
> It would help us get some idea what to test and where to look, since mbuf, 
> logs, and eal are probably the three most popular parts of DPDK for us app 
> hackers to interact with regularly.

You're right.
During integration time, app hackers should be able to check the git history
for these API changes.
When it is officially released, there will be some notes in the
documentation to help with porting applications.

-- 
Thomas


[dpdk-dev] [PATCH v5 05/11] lib/librte_vhost: merge Oliver's mbuf change

2014-09-30 Thread Thomas Monjalon
2014-09-30 02:41, Xie, Huawei:
> I would rework the patch according to your comment.
> I don't get clear about this comment. Do you mean that recreate the patch set
> based on the example that already has this mbuf change?

Yes

> Some of the background you might not know:
> I fully understand your concern here to make it a better patch and I fully 
> agree
> with you total comments. 
> This is really a special case. You know it is transform of thousand lines of 
> code with modifications.
> Sometimes a simple change could take me more than one day to rework the 
> patch, lines of lines manual check. 
> I have already spent more than one week of time merely  on the patch format 
> itself. :(.

I know. I think you are learning (the hard way) how to use git.
As Ouyang said in this thread, you should use "git rebase" 
and especially the --interactive mode to update your changes.
And you should make small commits at first. It's easier to squash
commits than splitting them.

> Could we possibly treat it specially when we have comment whether the patch 
> can be split/merged better? 

I have thought about it many times because I see it causes you a lot of trouble.
But I still think that vhost is an important feature and we probably
want to be able to understand the reasons behind the changes
by looking at the git history. That's why I'd like you to make smaller
refactoring commits with explanations in commit logs.

That said, we should continue working together on it.
Send me your drafts and I'll help you to split them. The part I cannot do
by myself is about the explanations in commit logs.

Thanks
-- 
Thomas


[dpdk-dev] [PATCH v5 05/11] lib/librte_vhost: merge Oliver's mbuf change

2014-09-30 Thread Xie, Huawei


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Tuesday, September 30, 2014 12:46 PM
> To: Xie, Huawei
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v5 05/11] lib/librte_vhost: merge Oliver's mbuf
> change
> 
> 2014-09-30 02:41, Xie, Huawei:
> > I would rework the patch according to your comment.
> > I don't get clear about this comment. Do you mean that recreate the patch 
> > set
> > based on the example that already has this mbuf change?
> 
> Yes
> 
> > Some of the background you might not know:
> > I fully understand your concern here to make it a better patch and I fully 
> > agree
> > with you total comments.
> > This is really a special case. You know it is transform of thousand lines 
> > of code
> with modifications.
> > Sometimes a simple change could take me more than one day to rework the
> patch, lines of lines manual check.
> > I have already spent more than one week of time merely  on the patch format
> itself. :(.
> 

> I know. I think you are learning (the hard way) how to use git.
> As Ouyang said in this thread, you should use "git rebase"
> and especially the --interactive mode to update your changes.
> And you should make small commits at first. It's easier to squash
> commits than splitting them.
> 
> > Could we possibly treat it specially when we have comment whether the patch
> can be split/merged better?
> 
> I have thought about it many times because I see it causes you a lot of trouble.
> But I still think that vhost is an important feature and we probably
> want to be able to understand the reasons behind the changes
> by looking at the git history. That's why I'd like you to make smaller
> refactoring commits with explanations in commit logs.
> 
> That said, we should continue working together on it.
> Send me your drafts and I'll help you to split them. The part I cannot do
> by myself is about the explanations in commit logs.
>
Based on the "small commit" principle, this is the rough idea. Btw, the git tool 
won't help here. Please have a quick read through. I will start the rework 
asap.
Patch 1:
copy examples/vhost/main.c  /lib/librte_vhost/vhost/vhost_rxtx.c
git mv examples/vhost/eventfd_link  /lib/librte_vhost/
git mv examples/vhost/libvirt   /lib/librte/vhost/libvirt
git mv examples/vhost/vhost-net-cdev.c  /lib/librte_vhost/
git mv examples/vhost/vhost-net-cdev.h /lib/librte_vhost/  
git mv examples/vhost/virtio-net.* /lib/librte_vhost/
comment:
As in the previous patch set, vhost_rxtx.c is partly copied from main.c, so here I 
decided to "copy" rather than "mv" main.c from the example to the vhost lib. The 
drawback is that gitk can't recognize that main.c and vhost_rxtx.c have the same index.
The advantage is that, since gitk couldn't recognize the partial copy in vhost_rxtx.c 
even with mv, we can later patch the vhost example based on the existing files.
I will emphasize in the commit message that vhost_rxtx.c is a pure copy without any 
modification.
Delete examples/vhost/Makefile, as the vhost example can no longer be compiled 
from this point on until the example patch.

Another option is to leave all example files in place and copy the needed ones to the 
vhost lib directory. The drawback is that gitk will treat all files in the vhost lib as new files. 

patch 2:
rename virtio-net.c to rte_virtio_net.h
patch 3:
remove zero copy logic in related files.
patch 4:
remove switching related logics in related files.
patch 5:
delete all functions in vhost_rxtx.c except four functions 
virtio_dev_(merge)_rx/tx and helper functions copy_mbuf_to_rings..
comment here:
here virtio_dev_(merge)_rx/tx will refer to non-existing functions like 
virtio_tx_route.
I think it is ok, right?
patch 6:
remove virtio_dev_tx, and rename virtio_dev_merge_tx to
virtio_dev_tx.
patch virtio_dev_rx, virtio_dev_merge_rx and virtio_dev_tx
will see if this can be further divided.
patch 7:
Other minor fixes, like changing global vars to static vars
patch 8:
fixes plenty of serious coding style issues
Patch 9:
Vhost API patch
Patch 10:
Added identified TODO or FIXME
patch 11:
Add vhost lib makefiles.
Patch 12...:
Vhost example patch


> Thanks
> --
> Thomas


[dpdk-dev] [PATCH] Fix for LRU corrupted returns

2014-09-30 Thread Saha, Avik (AWS)
Sorry about the delay. The number 32 is not really a CACHE_LINE_SIZE but since 
__builtin_clz returns the number of leading 0's before the most significant set 
bit in a 32 bit number (entry_size is uint32_t), I subtract that number from 32 
to get the number of trailing bits after the most significant set bit. This 
will be the separation in my data_mem regions.
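
As a worked illustration of the arithmetic described above, the sketch below
compares the old expression (count of trailing zeros) with the new one (32
minus the count of leading zeros) for a few sizes. The helper names are
hypothetical, and the code assumes a GCC-compatible compiler, a 32-bit
unsigned int and a non-zero argument.

```c
#include <stdint.h>

/* log2 of a power-of-two size: what __builtin_ctz yields. */
static uint32_t toy_shift_ctz(uint32_t size)
{
    return (uint32_t)__builtin_ctz(size);
}

/* Bit length of size: what 32 - __builtin_clz(size) yields. */
static uint32_t toy_shift_bitlen(uint32_t size)
{
    return 32u - (uint32_t)__builtin_clz(size);
}

/*
 * size | ctz | 32 - clz | 1 << (32 - clz)
 *   48 |  4  |    6     |   64
 *   64 |  6  |    7     |  128
 */
```

Note that for a power-of-two size such as 64 the new expression gives 7, one
more than log2(64) = 6, so each entry would be spaced 128 bytes apart rather
than 64. Whether that extra spacing is intended is not stated in the thread.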

-Original Message-
From: Neil Horman [mailto:nhor...@tuxdriver.com] 
Sent: Thursday, September 25, 2014 3:22 AM
To: Saha, Avik (AWS)
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH] Fix for LRU corrupted returns

On Thu, Sep 25, 2014 at 07:46:16AM +, Saha, Avik (AWS) wrote:
> This is a patch to a problem that I have faced (described in the  thread) and 
> this works for me.
> 
> 1)  Since the data_size_shl was getting its value from the key_size, the 
> table data entries were being corrupted when the calculation to shift the 
> number of bits was being made based on the key_size (according to the 
> document the key_size and entry_size are independently configurable) - With 
> this fix, we get the MSB that is set in entry_size (also removes the 
> constraint of this having to be a power of 2 - not entirely sure if this was 
> the reason the constraint was kept though)
> 2)  The document does not say that the entry_size needs to be a power of 
> 2 and this was failing silently when I was trying to bring my application up.
> 
> diff --git a/DPDK/lib/librte_table/rte_table_hash_lru.c 
> b/DPDK/lib/librte_table/rte_table_hash_lru.c
> index d1a4984..4ec9aa4 100644
> --- a/DPDK/lib/librte_table/rte_table_hash_lru.c
> +++ b/DPDK/lib/librte_table/rte_table_hash_lru.c
> @@ -153,8 +153,10 @@ rte_table_hash_lru_create(void *params, int socket_id, 
> uint32_t entry_size)
> uint32_t i;
> 
> /* Check input parameters */
> -   if ((check_params_create(p) != 0) ||
> -   (!rte_is_power_of_2(entry_size)) ||
> +   // Commenting out the power of 2 check on the entry_size since the
> +   // Programmers Guide does not call this out and we are going to handle
> +   // the data_size_shl of the table later on (Line 197)
Please remove the reference to Line 197 here.  That's not going to remain 
accurate for very long.

> +   if ((check_params_create(p) != 0) ||
> ((sizeof(struct rte_table_hash) % CACHE_LINE_SIZE) != 0) ||
> (sizeof(struct bucket) != (CACHE_LINE_SIZE / 2))) {
> return NULL;
> @@ -192,7 +194,7 @@ rte_table_hash_lru_create(void *params, int socket_id, 
> uint32_t entry_size)
> /* Internal */
> t->bucket_mask = t->n_buckets - 1;
> t->key_size_shl = __builtin_ctzl(p->key_size);
> -   t->data_size_shl = __builtin_ctzl(p->key_size);
> +   t->data_size_shl = 32 - (__builtin_clz(entry_size));
I presume the 32 value here is a cache line size?  That should be replaced with 
CACHE_LINE_SIZE...Though looking at it, that doesn't seem sufficient.  Seems 
like we need a eal abstraction to dynamically tell us what the cache line size 
is (we can read it from /proc/cpuinfo in linux, not sure about bsd).

Neil



[dpdk-dev] [PATCH 0/3] fix of lsc interrupt in i40e PF

2014-09-30 Thread Cao, Min
Tested-by: Min Cao 
This patch has been verified on FC20 with Eagle Fountain: 4*10G.
The i40e base driver update patch works well on FC20 with basic function.

The test environment details are as follows:
HOST environment:
CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
OS: Linux 3.11.10
GCC: 4.8.3
NIC: Eagle Fountain: 4*10G 

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Helin Zhang
Sent: Wednesday, September 17, 2014 3:54 PM
To: dev at dpdk.org
Subject: [dpdk-dev] [PATCH 0/3] fix of lsc interrupt in i40e PF

The patches include the fix for link status change interrupt
in i40e PF, and code style fixes.

Helin Zhang (3):
  i40e: renaming some local variables
  i40e: rework of PF interrupt cause enable flags processing
  i40e: fix of interrupt based link status change

 lib/librte_pmd_i40e/i40e_ethdev.c | 174 ++
 1 file changed, 122 insertions(+), 52 deletions(-)

-- 
1.8.1.4



[dpdk-dev] [PATCH 1/2] librte_pmd_null: Add null PMD

2014-09-30 Thread Thomas Monjalon
2014-09-30 18:56, mukawa at igel.co.jp:
> --- /dev/null
> +++ b/lib/librte_pmd_null/Makefile
> @@ -0,0 +1,58 @@
> +#   BSD LICENSE
> +#
> +#   Copyright (C) 2014 Nippon Telegraph and Telephone Corporation.
> +#   All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +# * Redistributions of source code must retain the above copyright
> +#   notice, this list of conditions and the following disclaimer.
> +# * Redistributions in binary form must reproduce the above copyright
> +#   notice, this list of conditions and the following disclaimer in
> +#   the documentation and/or other materials provided with the
> +#   distribution.
> +# * Neither the name of Intel Corporation nor the names of its

You probably mean NTT here?

> --- /dev/null
> +++ b/lib/librte_pmd_null/rte_eth_null.c
> @@ -0,0 +1,474 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright (C) IGEL Co.,Ltd.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + *   notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + *   notice, this list of conditions and the following disclaimer in
> + *   the documentation and/or other materials provided with the
> + *   distribution.
> + * * Neither the name of IGEL Co.,Ltd. nor the names of its
> + *   contributors may be used to endorse or promote products derived
> + *   from this software without specific prior written permission.

So the Makefile is copyrighted NTT and the .c file is copyrighted IGEL?

-- 
Thomas


[dpdk-dev] VMDq Sample Application on Virtual Machines

2014-09-30 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of ANKIT BATRA
> Sent: Monday, September 29, 2014 8:27 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] VMDq Sample Application on Virtual Machines
> 
> Hi,
> 
> I am running the VMDq sample application on a host machine. There are 1Gig I350
> Ethernet cards on my host machine that support VMDq. I am sending packets
> from another machine. When I send a VLAN 5 packet, I see the packet
> count incremented in pool 5, and so on.
> 
> Now I want to test this VMDq sample application with virtual
> machines. There are 2 virtual machines running on my machine. Can anybody
> suggest how I can test the VMDq sample application with virtual machines,
> i.e. what steps need to be followed so that I can see that packets
> meant for a specific VM go to the specific queue for
> that VM?
> 

Another sample application, vhost, should help you. There is also a user guide 
for it in the DPDK documentation that you can refer to.
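
The VLAN 5 to pool 5 behaviour described in the question can be pictured as a
table lookup. The sketch below is illustrative only: the struct, the table
contents and the function name are assumptions made for this example, not the
sample application's code or a DPDK API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical VLAN-to-pool table entry; not a DPDK structure. */
struct vlan_pool_entry {
	uint16_t vlan_id;	/* VLAN tag carried by the packet */
	uint64_t pools;		/* bitmask of pools that receive this VLAN */
};

/* Assumed mapping for illustration: VLAN n is delivered to pool n, n = 0..7. */
static const struct vlan_pool_entry vlan_pool_map[] = {
	{0, 1ULL << 0}, {1, 1ULL << 1}, {2, 1ULL << 2}, {3, 1ULL << 3},
	{4, 1ULL << 4}, {5, 1ULL << 5}, {6, 1ULL << 6}, {7, 1ULL << 7},
};

/* Return the lowest-numbered pool mapped to vlan_id, or -1 if it is unmapped. */
int pool_for_vlan(uint16_t vlan_id)
{
	size_t i;

	for (i = 0; i < sizeof(vlan_pool_map) / sizeof(vlan_pool_map[0]); i++) {
		if (vlan_pool_map[i].vlan_id == vlan_id &&
		    vlan_pool_map[i].pools != 0)
			return __builtin_ctzll(vlan_pool_map[i].pools);
	}
	return -1;
}
```

Under this assumed table, a packet tagged with VLAN 5 resolves to pool 5, and a
VLAN that is not in the table resolves to no pool.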

> --
> Regards
> Ankit Batra


[dpdk-dev] Hi all, does Amazon VMs supported DPDK or not?

2014-09-30 Thread Dong, Binghua
Hi Shawn,


Thanks a lot. Is there any more information on the c3.8xlarge, e.g. a public 
detailed spec for this VM?

Is it the Amazon EC2 VM described at the following links?
http://aws.amazon.com/cn/blogs/aws/a-generation-of-ec2-instances-for-compute-intensive-workloads/
http://aws.amazon.com/ec2/


-Original Message-
From: Patel, Rashmin N 
Sent: Tuesday, September 30, 2014 4:54 AM
To: Wang, Shawn; Dong, Binghua; dev at dpdk.org; Saha, Avik (AWS)
Subject: RE: Hi all, does Amazon VMs supported DPDK or not?

Hi Shawn,

Which network interface is visible to the VM? I mean, which virtual 
Ethernet port is used by the DPDK app in the Amazon VM? And which interfaces 
are offered, depending on the VM size and requirements?

Thanks,
Rashmin

-Original Message-
From: Wang, Shawn [mailto:xing...@amazon.com] 
Sent: Monday, September 29, 2014 1:50 PM
To: Dong, Binghua; Patel, Rashmin N; dev at dpdk.org; Saha, Avik (AWS)
Subject: RE: Hi all, does Amazon VMs supported DPDK or not?

Yes, you can.

According to my colleague, Avik Saha, they are running Intel DPDK 1.7 on c3.8xlarge instances.

Thanks.

From: dev [dev-bounces at dpdk.org] on behalf of Dong, Binghua 
[binghua.d...@intel.com]
Sent: Saturday, September 27, 2014 10:05 PM
To: Patel, Rashmin N; dev at dpdk.org
Subject: Re: [dpdk-dev] Hi all,  does Amazon VMs supported DPDK or not?

Hi Patel,

The customer considers deploying DPDK applications in Amazon VMs to be very 
flexible, and to make deployment at global sites very easy.

For example, they only need to buy a 2-lcore VM if a site only needs 200Mbps 
of throughput, or a 4-lcore VM if the throughput is 400Mbps.

They can also buy VMs at different Amazon sites (US, Germany...) for lower 
access latency.

-Original Message-
From: Patel, Rashmin N
Sent: Saturday, September 27, 2014 12:41 AM
To: Dong, Binghua; dev at dpdk.org
Subject: RE: Hi all, does Amazon VMs supported DPDK or not?

It really depends on the devices offered in the VM. If direct device assignment 
is not provided to a VM or if the node hypervisor doesn't have an optimized 
para-virtual interface to a VM, I don't see any benefit using DPDK in VMs.

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Dong, Binghua
Sent: Friday, September 26, 2014 5:47 AM
To: dev at dpdk.org
Subject: [dpdk-dev] Hi all, does Amazon VMs supported DPDK or not?

A customer plans to buy Amazon VMs worldwide to run their VPN applications, 
which are based on DPDK 1.3 (to be upgraded to DPDK 1.6 or 1.7), at global sites.

Thanks a lot.



[dpdk-dev] [PATCH v5 05/11] lib/librte_vhost: merge Oliver's mbuf change

2014-09-30 Thread Xie, Huawei
> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Tuesday, September 30, 2014 3:44 AM
> To: Xie, Huawei
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v5 05/11] lib/librte_vhost: merge Oliver's mbuf
> change
> 
> > There is no rte_pktmbuf structure in mbuf now. Its fields are merged to
> > rte_mbuf structure.
> >
> > Signed-off-by: Huawei Xie 
> 
> This patch shouldn't appear but should be merged with your previous work.
> 
> --
> Thomas

Hi Thomas,
I will rework the patch according to your comment, but I am not clear about
what you are asking. Do you mean that I should recreate the patch set
based on the example that already has this mbuf change?

Some background you might not know:
I fully understand your wish to make this a better patch, and I fully agree
with all of your comments.
This is really a special case: the patch set transforms thousands of lines of
code, with modifications.
Sometimes a simple change takes me more than a day of reworking the patch,
checking it manually line by line.
I have already spent more than a week on the patch format alone. :(

Could we possibly treat this as a special case when the comment is about
whether the patch could be split or merged better?


[dpdk-dev] [PATCH v5 01/11] lib/librte_vhost: move src files in vhost example to vhost lib directory

2014-09-30 Thread Xie, Huawei


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Tuesday, September 30, 2014 3:42 AM
> To: Xie, Huawei
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v5 01/11] lib/librte_vhost: move src files in
> vhost example to vhost lib directory
> 
> Hi Huawei,
> 
> 2014-09-26 17:45, Huawei Xie:
> > "git mv examples/vhost lib/librte_vhost"
> > This is a purely src file move, without any modification.
> > Subsequent patch will transform those src files to a vhost library.
> >
> > Signed-off-by: Huawei Xie 
> > ---
> >  examples/vhost/Makefile                      |   60 -
> >  examples/vhost/eventfd_link/Makefile         |   39 -
> >  examples/vhost/eventfd_link/eventfd_link.c   |  205 --
> >  examples/vhost/eventfd_link/eventfd_link.h   |   79 -
> >  examples/vhost/libvirt/qemu-wrap.py          |  367 ---
> >  examples/vhost/main.c                        | 3725 --
> >  examples/vhost/main.h                        |   86 -
> >  examples/vhost/vhost-net-cdev.c              |  367 ---
> >  examples/vhost/vhost-net-cdev.h              |   83 -
> >  examples/vhost/virtio-net.c                  | 1165
> >  examples/vhost/virtio-net.h                  |  161 --
> >  lib/librte_vhost/eventfd_link/Makefile       |   39 +
> >  lib/librte_vhost/eventfd_link/eventfd_link.c |  205 ++
> >  lib/librte_vhost/eventfd_link/eventfd_link.h |   79 +
> >  lib/librte_vhost/libvirt/qemu-wrap.py        |  367 +++
> >  lib/librte_vhost/main.c                      | 3725 ++
> >  lib/librte_vhost/main.h                      |   86 +
> >  lib/librte_vhost/vhost-net-cdev.c            |  367 +++
> >  lib/librte_vhost/vhost-net-cdev.h            |   83 +
> >  lib/librte_vhost/virtio-net.c                | 1165
> >  lib/librte_vhost/virtio-net.h                |  161 ++
> >  21 files changed, 6277 insertions(+), 6337 deletions(-)
> 
> In patch 2, you're using main.c to create vhost_rxtx.c.
> So it would be clearer to rename it in this patch 1.
> 
Rename it to vhost_rxtx.c?
> --
> Thomas


[dpdk-dev] DPDK doesn't work with iommu=pt

2014-09-30 Thread Hiroshi Shimamoto

> Subject: Re: [dpdk-dev] DPDK doesn't work with iommu=pt
> 
> 
> 
> On Mon, Sep 29, 2014 at 2:53 AM, Hiroshi Shimamoto  ct.jp.nec.com> wrote:
> > Hi,
> >
> >> Subject: Re: [dpdk-dev] DPDK doesn't work with iommu=pt
> >>
> >> iommu=pt effectively disables iommu for the kernel and iommu is
> >> enabled only for KVM.
> >> http://lwn.net/Articles/329174/
> >
> > thanks for pointing that.
> >
> > Okay, I think DPDK cannot handle IOMMU because of no kernel code in
> > DPDK application.
> >
> > And now, I think "iommu=pt" doesn't work correctly: DMA from the host PMD
> > causes a DMAR fault, which means the IOMMU catches a wrong operation.
> > I will dig around "iommu=pt".
> >
> I agree with your analysis. It seems that a fairly recent patch (3~4 months 
> old) has introduced a bug that confuses unprotected DMA access by the device 
> with an IOMMU access, and produces the equivalent of a page fault.
> 
> >>
> >> Basically unless you have KVM running you can remove both lines for
> >> the same effect.
> >> On the other hand, if you do have KVM and you do want iommu=on, you can
> >> remove the iommu=pt for the same performance because, AFAIK, unlike the
> >> kernel drivers, DPDK doesn't dma_map and dma_unmap each and every
> >> ingress/egress packet (please correct me if I'm wrong), and will not
> >> suffer any performance penalties.
> >
> > I also tried "iommu=on", but it didn't fix the issue.
> > I saw the same error messages in kernel.
> >
> 
> Just to clarify, what I suggested you try is leaving only this string in 
> the command line: "intel_iommu=on", without iommu=pt.
> But this would work iff DPDK can handle IOVAs (I/O virtual addresses).

Okay, I tried with only "intel_iommu=on", but nothing changed.
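
When comparing boot configurations like the ones discussed in this thread, it
can help to confirm which options the running kernel actually received. The
following is a minimal sketch under stated assumptions: both function names are
hypothetical helpers written for this example (they are not part of DPDK or
the kernel), and it assumes a Linux system where /proc/cmdline is readable.
Matching is done on whole space-separated tokens, so "iommu=on" is not
mistaken for "intel_iommu=on".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Return true if opt appears as a whole space-separated token in cmdline.
 * Hypothetical helper for illustration only.
 */
bool cmdline_has_option(const char *cmdline, const char *opt)
{
	size_t len;
	const char *p;

	assert(cmdline != NULL && opt != NULL);
	len = strlen(opt);
	if (len == 0)
		return false;

	for (p = cmdline; (p = strstr(p, opt)) != NULL; p++) {
		bool starts_token = (p == cmdline) || p[-1] == ' ';
		bool ends_token = p[len] == '\0' || p[len] == ' ' ||
				  p[len] == '\n';

		if (starts_token && ends_token)
			return true;
	}
	return false;
}

/* Read the first line of /proc/cmdline into buf; false if it cannot be read. */
bool read_kernel_cmdline(char *buf, size_t size)
{
	FILE *f = fopen("/proc/cmdline", "r");
	bool ok;

	if (f == NULL)
		return false;
	ok = fgets(buf, (int)size, f) != NULL;
	fclose(f);
	return ok;
}
```

A caller would read the command line once with read_kernel_cmdline() and then
test for "iommu=pt" and "intel_iommu=on" separately before interpreting any
DMAR fault messages.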

By the way, after several tests and some investigation, I think the issue comes 
from there being no DMAR entry for hardware pass-through mode.
So using VFIO, which always turns the IOMMU on, seems to solve my issue.

After unbinding the devices from igb_uio and binding them to vfio-pci, testpmd 
appears to work.

thanks,
Hiroshi

> 
> >   [   46.978097] dmar: DRHD: handling fault status reg 2
> >   [   46.978120] dmar: DMAR:[DMA Read] Request device [21:00.0] fault addr aa01
> >   DMAR:[fault reason 02] Present bit in context entry is clear
> >
> > thanks,
> > Hiroshi
> >
> >>
> >> FYI. Kernel NIC drivers:
> >> When iommu=on{,strict} the kernel network drivers will suffer a heavy
> >> performance penalty due to regular IOVA modifications (both HW and SW
> >> at fault here). Ixgbe and Mellanox reuse dma_mapped pages on the
> >> receive side to avoid this penalty, but still suffer from iommu on TX.
> >>
> >> On Fri, Sep 26, 2014 at 5:47 PM, Choi, Sy Jong  
> >> wrote:
> >> > Hi Shimamoto-san,
> >> >
> >> > There are a lot of sightings related to "DMAR:[fault reason 06] PTE Read 
> >> > access is not set"
> >> > https://www.mail-archive.com/kvm at vger.kernel.org/msg106573.html
> >> >
> >> > This might be related to IOMMU, and kernel code.
> >> >
> >> > Here is what we know :-
> >> > 1) Disabling VT-d in the BIOS also removed the symptom
> >> > 2) Switching to another OS distribution also removed the symptom
> >> > 3) Even on different HW we do not see the symptom. In my case, I switched 
> >> > from an Engineering board to an EPSD board.
> >> >
> >> > Regards,
> >> > Choi, Sy Jong
> >> > Platform Application Engineer
> >> >
> >> >
> >> > -Original Message-
> >> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Hiroshi Shimamoto
> >> > Sent: Friday, September 26, 2014 5:14 PM
> >> > To: dev at dpdk.org
> >> > Cc: Hayato Momma
> >> > Subject: [dpdk-dev] DPDK doesn't work with iommu=pt
> >> >
> >> > I encountered an issue where DPDK doesn't work with "iommu=pt intel_iommu=on"
> >> > on an HP ProLiant DL380p Gen8 server. I'm using the following environment:
> >> >
> >> >   HW: ProLiant DL380p Gen8
> >> >   CPU: E5-2697 v2
> >> >   OS: RHEL7
> >> >   kernel: kernel-3.10.0-123 and the latest kernel 3.17-rc6+
> >> >   DPDK: v1.7.1-53-gce5abac
> >> >   NIC: 82599ES
> >> >
> >> > When booting with "iommu=pt intel_iommu=on", I get the message below and no 
> >> > packets are handled.
> >> >
> >> >   [  120.809611] dmar: DRHD: handling fault status reg 2
> >> >   [  120.809635] dmar: DMAR:[DMA Read] Request device [21:00.0] fault addr aa01
> >> >   DMAR:[fault reason 02] Present bit in context entry is clear
> >> >
> >> > How to reproduce:
> >> > just run testpmd
> >> > # ./testpmd -c 0xf -n 4 -- -i
> >> >
> >> > Configuring Port 0 (socket 0)
> >> > PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x754eafc0 hw_ring=0x7420 dma_addr=0xaa00
> >> > PMD: ixgbe_dev_tx_queue_setup(): Using full-featured tx code path
> >> > PMD: ixgbe_dev_tx_queue_setup():  - txq_flags = 0 [IXGBE_SIMPLE_FLAGS=f01]
> >> > PMD: ixgbe_dev_tx_queue_setup():  - tx_rs_thresh = 32 [RTE_PMD_IXGBE_TX_MAX_BURST=32]
> >> > PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x754ea740 hw_ring=0x7421 dma_addr=0xaa01
> >> > PMD: