date:20150703

[dpdk-dev] [PATCH] mk: enable next abi in static libs

2015-07-03 Thread Thomas Monjalon

When a change makes it really hard to keep ABI compatibility,
instead of waiting for the next release to break the ABI, it is smoother
to introduce the new code and enable it only for static libraries.
The flag RTE_NEXT_ABI may be used to "ifdef" the new code.
When the release is out, a dynamically linked application can use
the new shared libraries without a rebuild, while developers can prepare
their applications for the next ABI by reading the deprecation notice
and easily testing the new code.
When starting the next release cycle, the "ifdefs" will be removed
and the ABI break will be marked by incrementing LIBABIVER.
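
As a rough illustration of the "ifdef" approach described above, here is a minimal sketch. The structure and function names (ex_pkt_meta, ex_packet_type_bits) are hypothetical and are not the real struct rte_mbuf; the sketch only assumes that the build defines the RTE_NEXT_ABI macro when the option is enabled.

```c
#include <stdint.h>

/* Hypothetical structure for illustration only, not the real struct rte_mbuf. */
struct ex_pkt_meta {
#ifdef RTE_NEXT_ABI
	uint32_t packet_type;	/* new, wider field: only visible in next-ABI builds */
#else
	uint16_t packet_type;	/* field as shipped in the current ABI */
	uint16_t reserved;
#endif
	uint32_t pkt_len;
};

/* Width in bits of the packet_type field for the ABI selected at compile time. */
unsigned int ex_packet_type_bits(void)
{
	return (unsigned int)(8 * sizeof(((struct ex_pkt_meta *)0)->packet_type));
}
```

Compiling this once without and once with the macro defined (for example with -DRTE_NEXT_ABI on the compiler command line) gives the two field widths, which is the kind of difference a shared-library user must not see before LIBABIVER is incremented.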

The new option CONFIG_RTE_NEXT_ABI is not defined in the configuration
templates because it is deduced from CONFIG_RTE_BUILD_SHARED_LIB.
It is automatically enabled for static libraries and disabled for
shared libraries.
It can be forced to another value by editing the generated .config file.
It shouldn't be enabled for shared libraries because it would break the
ABI without changing the version number LIBABIVER. That's why a warning
is printed in this case.
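
The deduction rule and the warning condition described above can be restated as a small sketch. The function names are hypothetical; the logic only mirrors the tr 'yn' 'ny' inversion and the 'yy' check that appear in the patch below, with the two options represented as the characters 'y' and 'n'.

```c
/* Default for CONFIG_RTE_NEXT_ABI: the inverse of CONFIG_RTE_BUILD_SHARED_LIB. */
char ex_default_next_abi(char build_shared_lib)
{
	if (build_shared_lib == 'y')
		return 'n';	/* shared libraries keep the current ABI */
	if (build_shared_lib == 'n')
		return 'y';	/* static libraries get the next ABI */
	return build_shared_lib;	/* any other character is left unchanged */
}

/* Non-zero when both options are 'y', the case in which the warning is printed. */
int ex_versioning_tainted(char build_shared_lib, char next_abi)
{
	return build_shared_lib == 'y' && next_abi == 'y';
}
```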

The guideline is also updated to integrate this new possibility.

Signed-off-by: Thomas Monjalon 
---
 doc/guides/guidelines/versioning.rst | 2 ++
 lib/Makefile | 4 
 mk/rte.sdkconfig.mk  | 3 +++
 pkg/dpdk.spec| 1 +
 scripts/validate-abi.sh  | 2 ++
 5 files changed, 12 insertions(+)

diff --git a/doc/guides/guidelines/versioning.rst b/doc/guides/guidelines/versioning.rst
index a1c9368..6bc2a8e 100644
--- a/doc/guides/guidelines/versioning.rst
+++ b/doc/guides/guidelines/versioning.rst
@@ -57,6 +57,8 @@ being provided. The requirements for doing so are:

 #. A full deprecation cycle, as explained above, must be made to offer
downstream consumers sufficient warning of the change.
+   The changes may be shown and used in static builds before the deprecation
+   cycle by conditioning them with RTE_NEXT_ABI option.

 #. The ``LIBABIVER`` variable in the makefile(s) where the ABI changes are
incorporated must be incremented in parallel with the ABI changes
diff --git a/lib/Makefile b/lib/Makefile
index 5f480f9..ebf56ba 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -31,6 +31,10 @@

 include $(RTE_SDK)/mk/rte.vars.mk

+ifeq '$(CONFIG_RTE_BUILD_SHARED_LIB)$(CONFIG_RTE_NEXT_ABI)' 'yy'
+$(info WARNING: Shared libraries versioning is tainted!)
+endif
+
 DIRS-y += librte_compat
 DIRS-$(CONFIG_RTE_LIBRTE_EAL) += librte_eal
 DIRS-$(CONFIG_RTE_LIBRTE_MALLOC) += librte_malloc
diff --git a/mk/rte.sdkconfig.mk b/mk/rte.sdkconfig.mk
index f8d95b1..135825c 100644
--- a/mk/rte.sdkconfig.mk
+++ b/mk/rte.sdkconfig.mk
@@ -77,6 +77,9 @@ $(RTE_OUTPUT)/.config: $(RTE_CONFIG_TEMPLATE) FORCE | $(RTE_OUTPUT)
$(CPP) -undef -P -x assembler-with-cpp \
-ffreestanding \
-o $(RTE_OUTPUT)/.config_tmp $(RTE_CONFIG_TEMPLATE) ; \
+   printf 'CONFIG_RTE_NEXT_ABI=' >> $(RTE_OUTPUT)/.config_tmp ; \
+   sed -n 's,CONFIG_RTE_BUILD_SHARED_LIB=,,p' $(RTE_OUTPUT)/.config_tmp | \
+   tr 'yn' 'ny' >> $(RTE_OUTPUT)/.config_tmp ; \
if ! cmp -s $(RTE_OUTPUT)/.config_tmp $(RTE_OUTPUT)/.config; then \
cp $(RTE_OUTPUT)/.config_tmp $(RTE_OUTPUT)/.config ; \
cp $(RTE_OUTPUT)/.config_tmp $(RTE_OUTPUT)/.config.orig ; \
diff --git a/pkg/dpdk.spec b/pkg/dpdk.spec
index 5f6ec6a..fb71ccc 100644
--- a/pkg/dpdk.spec
+++ b/pkg/dpdk.spec
@@ -82,6 +82,7 @@ make O=%{target} T=%{target} config
 sed -ri 's,(RTE_MACHINE=).*,\1%{machine},' %{target}/.config
 sed -ri 's,(RTE_APP_TEST=).*,\1n,' %{target}/.config
 sed -ri 's,(RTE_BUILD_SHARED_LIB=).*,\1y,' %{target}/.config
+sed -ri 's,(RTE_NEXT_ABI=).*,\1n,' %{target}/.config
 sed -ri 's,(LIBRTE_VHOST=).*,\1y,' %{target}/.config
 sed -ri 's,(LIBRTE_PMD_PCAP=).*,\1y,'  %{target}/.config
 sed -ri 's,(LIBRTE_PMD_XENVIRT=).*,\1y,'   %{target}/.config
diff --git a/scripts/validate-abi.sh b/scripts/validate-abi.sh
index 1747b8b..4476433 100755
--- a/scripts/validate-abi.sh
+++ b/scripts/validate-abi.sh
@@ -157,6 +157,7 @@ git checkout $TAG1
 # Make sure we configure SHARED libraries
 # Also turn off IGB and KNI as those require kernel headers to build
 sed -i -e"$ a\CONFIG_RTE_BUILD_SHARED_LIB=y" config/defconfig_$TARGET
+sed -i -e"$ a\CONFIG_RTE_NEXT_ABI=n" config/defconfig_$TARGET
 sed -i -e"$ a\CONFIG_RTE_EAL_IGB_UIO=n" config/defconfig_$TARGET
 sed -i -e"$ a\CONFIG_RTE_LIBRTE_KNI=n" config/defconfig_$TARGET

@@ -198,6 +199,7 @@ git checkout $TAG2
 # Make sure we configure SHARED libraries
 # Also turn off IGB and KNI as those require kernel headers to build
 sed -i -e"$ a\CONFIG_RTE_BUILD_SHARED_LIB=y" config/defconfig_$TARGET
+sed -i -e"$ a\CONFIG_RTE_NEXT_ABI=n" config/defconfig_$TARGET
 sed -i -e"$ a\CONFIG_RTE_EAL_IGB_UIO=n" config/defconfig_$TARGET
 sed -i -e"$ a\CONFIG_RTE_LIBRTE_KNI=n" config/defconfi

[dpdk-dev] [PATCH v8 01/18] mbuf: redefine packet_type in rte_mbuf

2015-07-03 Thread Zhang, Helin

Hi Thomas

> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Thursday, July 2, 2015 5:03 PM
> To: Zhang, Helin
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v8 01/18] mbuf: redefine packet_type in
> rte_mbuf
> 
> 2015-06-23 09:50, Helin Zhang:
> > In order to unify the packet type, the field of 'packet_type' in
> > 'struct rte_mbuf' needs to be extended from 16 to 32 bits.
> > Accordingly, some fields in 'struct rte_mbuf' are re-organized to
> > support this change for Vector PMD. As 'struct rte_kni_mbuf' for KNI
> > should be right mapped to 'struct rte_mbuf', it should be modified
> > accordingly. In addition, Vector PMD of ixgbe is disabled by default,
> > as 'struct rte_mbuf' changed.
> [...]
> > -CONFIG_RTE_IXGBE_INC_VECTOR=y
> > +CONFIG_RTE_IXGBE_INC_VECTOR=n
> 
> It is the default configuration. Disabling it does not prevent a build break
> during a "git bisect".
> Please merge the changes for vector ixgbe in this patch.

Sure, no problem!
V9 will be sent soon. Thanks!

- Helin

[dpdk-dev] [PATCH v8 03/18] mbuf: add definitions of unified packet types

2015-07-03 Thread Zhang, Helin



> -Original Message-
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Thursday, July 2, 2015 5:32 PM
> To: Zhang, Helin; dev at dpdk.org
> Cc: Cao, Waterman; Liang, Cunming; Liu, Jijiang; Ananyev, Konstantin;
> Richardson, Bruce; yongwang at vmware.com; Wu, Jingjing
> Subject: Re: [PATCH v8 03/18] mbuf: add definitions of unified packet types
> 
> Hi Helin,
> 
> On 07/02/2015 03:30 AM, Zhang, Helin wrote:
> > Hi Olivier
> >
> > Thanks for your help!
> >
> >> -Original Message-
> >> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> >> Sent: Tuesday, June 30, 2015 4:44 PM
> >> To: Zhang, Helin; dev at dpdk.org
> >> Cc: Cao, Waterman; Liang, Cunming; Liu, Jijiang; Ananyev, Konstantin;
> >> Richardson, Bruce; yongwang at vmware.com; Wu, Jingjing
> >> Subject: Re: [PATCH v8 03/18] mbuf: add definitions of unified packet
> >> types
> >>
> >> Hi Helin,
> >>
> >> This is greatly documented, thanks!
> >> Please find a small comment below.
> >>
> >> On 06/23/2015 03:50 AM, Helin Zhang wrote:
> >>> There are only 6 bit flags in ol_flags for indicating packet
> >>> types, which is not enough to describe all the possible packet types
> >>> hardware can recognize. For example, i40e hardware can recognize
> >>> more than 150 packet types. Unified packet type is composed of L2
> >>> type, L3 type, L4 type, tunnel type, inner L2 type, inner L3 type
> >>> and inner L4 type fields, and can be stored in the 32-bit
> >>> 'packet_type' field of 'struct rte_mbuf'.
> >>> To avoid breaking ABI compatibility, all the changes would be
> >>> enabled by RTE_NEXT_ABI, which is disabled by default.
> >>>
> >>> [...]
> >>> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> >>> index 0315561..0ee0c55 100644
> >>> --- a/lib/librte_mbuf/rte_mbuf.h
> >>> +++ b/lib/librte_mbuf/rte_mbuf.h
> >>> @@ -201,6 +201,493 @@ extern "C" {
> >>>/* Use final bit of flags to indicate a control mbuf */
> >>>#define CTRL_MBUF_FLAG   (1ULL << 63) /**< Mbuf contains control data */
> >>>
> >>> +#ifdef RTE_NEXT_ABI
> >>> +/*
> >>> + * 32 bits are divided into several fields to mark packet types. Note that
> >>> + * each field is indexical.
> >>> + * - Bit 3:0 is for L2 types.
> >>> + * - Bit 7:4 is for L3 or outer L3 (for tunneling case) types.
> >>> + * - Bit 11:8 is for L4 or outer L4 (for tunneling case) types.
> >>> + * - Bit 15:12 is for tunnel types.
> >>> + * - Bit 19:16 is for inner L2 types.
> >>> + * - Bit 23:20 is for inner L3 types.
> >>> + * - Bit 27:24 is for inner L4 types.
> >>> + * - Bit 31:28 is reserved.
> >>> + *
> >>> + * To be compatible with Vector PMD, RTE_PTYPE_L3_IPV4, RTE_PTYPE_L3_IPV4_EXT,
> >>> + * RTE_PTYPE_L3_IPV6, RTE_PTYPE_L3_IPV6_EXT, RTE_PTYPE_L4_TCP, RTE_PTYPE_L4_UDP
> >>> + * and RTE_PTYPE_L4_SCTP should be kept as below in a contiguous 7 bits.
> >>> + *
> >>> + * Note that L3 types values are selected for checking IPV4/IPV6 header from
> >>> + * performance point of view. Reading annotations of RTE_ETH_IS_IPV4_HDR and
> >>> + * RTE_ETH_IS_IPV6_HDR is needed for any future changes of L3 type values.
> >>> + *
> >>> + * Note that the packet types of the same packet recognized by different
> >>> + * hardware may be different, as different hardware may have different
> >>> + * capability of packet type recognition.
> >>> + *
> >>> + * examples:
> >>> + * <'ether type'=0x0800
> >>> + * | 'version'=4, 'protocol'=0x29
> >>> + * | 'version'=6, 'next header'=0x3A
> >>> + * | 'ICMPv6 header'>
> >>> + * will be recognized on i40e hardware as packet type combination of,
> >>> + * RTE_PTYPE_L2_MAC |
> >>> + * RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
> >>> + * RTE_PTYPE_TUNNEL_IP |
> >>> + * RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
> >>> + * RTE_PTYPE_INNER_L4_ICMP.
> >>> + *
> >>> + * <'ether type'=0x86DD
> >>> + * | 'version'=6, 'next header'=0x2F
> >>> + * | 'GRE header'
> >>> + * | 'version'=6, 'next header'=0x11
> >>> + * | 'UDP header'>
> >>> + * will be recognized on i40e hardware as packet type combination of,
> >>> + * RTE_PTYPE_L2_MAC |
> >>> + * RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
> >>> + * RTE_PTYPE_TUNNEL_GRENAT |
> >>> + * RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
> >>> + * RTE_PTYPE_INNER_L4_UDP.
> >>> + */
> >>> +#define RTE_PTYPE_UNKNOWN   0x
> >>> +/**
> >>> + * MAC (Media Access Control) packet type.
> >>> + * It is used for outer packet for tunneling cases.
> >>> + *
> >>> + * Packet format:
> >>> + * <'ether type'=[0x0800|0x86DD|others]>  */
> >>> +#define RTE_PTYPE_L2_MAC0x0001
> >>
> >> I'm wondering if RTE_PTYPE_L2_ETHER is not a better name?
> > Ethernet includes both Data Link Layer and Physical Layer, while MAC
> > is for Data Link Layer only. I would prefer to keep 'MAC' in the names,
> > rather than 'ether'.
> > Any opinions from others?
> 
> Just to clarify what I'm saying: MAC is the interface between the logical
> link and the ph

[dpdk-dev] [PATCH] virtio: fix the vq size issue

2015-07-03 Thread Ouyang, Changchun



> -Original Message-
> From: Xie, Huawei
> Sent: Thursday, July 2, 2015 5:16 PM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org; Thomas Monjalon
> Subject: Re: [dpdk-dev] [PATCH] virtio: fix the vq size issue
> 
> On 7/2/2015 10:16 AM, Ouyang, Changchun wrote:
> >
> >> -Original Message-
> >> From: Xie, Huawei
> >> Sent: Thursday, July 2, 2015 10:02 AM
> >> To: Ouyang, Changchun; dev at dpdk.org; Thomas Monjalon
> >> Subject: Re: [dpdk-dev] [PATCH] virtio: fix the vq size issue
> >>
> >> On 7/2/2015 8:29 AM, Ouyang, Changchun wrote:
> >>> Hi huawei,
> >>>
>  -Original Message-
>  From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Xie, Huawei
>  Sent: Wednesday, July 1, 2015 11:53 PM
>  To: dev at dpdk.org; Thomas Monjalon
>  Subject: Re: [dpdk-dev] [PATCH] virtio: fix the vq size issue
> 
>  On 7/1/2015 3:49 PM, Ouyang Changchun wrote:
> > This commit breaks virtio basic packets rx functionality:
> >   d78deadae4dca240e85054bf2d604a801676becc
> >
> > The QEMU use 256 as default vring size, also use this default
> > value to calculate the virtio avail ring base address and used
> > ring base address, and vhost in the backend use the ring base
> > address to do packet
>  IO.
> > Virtio spec also says the queue size in PCI configuration is
> > read-only, so virtio front end can't change it. just need use the
> > read-only value to allocate space for vring and calculate the
> > avail and used ring base address. Otherwise, the avail and used
> > ring base
>  address will be different between host and guest, accordingly,
>  packet IO can't work normally.
>  virtio driver could still use the vq_size to initialize avail ring
>  and use ring so that they still have the same base address.
>  The other issue is vhost use  index & (vq->size -1) to index the ring.
> >>> I am not sure what is your clear message here, Vhost has no choice
> >>> but use vq->size -1 to index the ring, It is qemu that always use
> >>> 256 as the vq size, and set the avail and used ring base address, It
> >>> also tells vhost the vq size is 256.
> >> I mean "the same base address issue" could be resolved, but we still
> >> couldn't stop vhost using idx & vq->size -1 to index the ring.
> >>
> > Then this patch will resolve this avail ring base address issue.
> I mean different ring base isn't the root cause. The commit message which
> states that this register is read only is simple and enough.  

The direct root cause is the avail ring base address issue.
The virtio front end uses: vring->avail = vring->desc + vq_size * SIZE_OF_DESC_ELEMENT,
and fills in vring->avail->avail_idx and the ring itself.
QEMU uses: vring->avail = vring->desc + 256 * SIZE_OF_DESC_ELEMENT,
and tells vhost this address. Vhost uses this address to enqueue packets from
the physical port into the vring.

Please note that if vq_size is not 256, e.g. it is changed to 128, then
vring->avail in the host and in the guest are totally different. That is why
it fails to rx any packet: the two sides use different addresses to reach
what should be the same content.

This is why I still think it is the root cause and I need to add it to the
commit message.
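
The address mismatch described above can be checked with a little arithmetic. The following is only a sketch under stated assumptions: ex_vring_desc is an illustrative stand-in for a 16-byte vring descriptor, and the helper names are hypothetical. It also includes the "index & (vq->size - 1)" ring indexing mentioned earlier in the thread, which gives different slots for different sizes.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for one vring descriptor (16 bytes). */
struct ex_vring_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

/* Byte offset of the avail ring from the start of the descriptor table. */
size_t ex_avail_offset(unsigned int vq_size)
{
	return (size_t)vq_size * sizeof(struct ex_vring_desc);
}

/* Ring slot for a free-running index; assumes vq_size is a power of two. */
unsigned int ex_ring_slot(uint16_t idx, unsigned int vq_size)
{
	return idx & (vq_size - 1);
}
```

With these assumptions the avail ring sits 4096 bytes after the descriptor table for a queue size of 256, but 2048 bytes after it for a queue size of 128, so a guest and a host that disagree on the size read and write different memory.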

> 
>  Thomas:
>  This fix works but introduces slight change with original code.
>  Could we just rollback that commit?
> >>> What's your major concern for the slight change here?
> >>> just removing the unnecessary check for nb_desc itself.
> >>> So I think no issue for the slight change.
> >> No major concern. It is better if this patch just rollbacks that
> >> commit without introduce extra change if not necessary.
> >> The original code set nb_desc to vq_size, though it isn't used later.
> >>
> > I prefer to have the slight change to remove unnecessary setting.
> >
> >>> Thanks
> >>> Changchun
> >>>
> >>>
> >>>
> >>>
> >
> >

[dpdk-dev] [PATCH 2/3] vhost: fix the comments and log

2015-07-03 Thread Ouyang, Changchun



> -Original Message-
> From: Xie, Huawei
> Sent: Thursday, July 2, 2015 5:25 PM
> To: Ouyang, Changchun; dev at dpdk.org
> Cc: Cao, Waterman; Xu, Qian Q
> Subject: Re: [PATCH 2/3] vhost: fix the comments and log
> 
> 
> On 7/2/2015 11:33 AM, Ouyang, Changchun wrote:
> > It fixes the wrong log info when fails to unregister vhost driver.
> >
> > Signed-off-by: Changchun Ouyang 
> > ---
> >  examples/vhost/main.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/examples/vhost/main.c b/examples/vhost/main.c
> > index 7863dcf..72c4773 100644
> > --- a/examples/vhost/main.c
> > +++ b/examples/vhost/main.c
> > @@ -3051,10 +3051,10 @@ main(int argc, char *argv[])
> > if (mergeable == 0)
> > rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_MRG_RXBUF);
> >
> > -   /* Register CUSE device to handle IOCTLs. */
> > +   /* Register vhost driver to handle IOCTLs. */
> 
> Also update IOCTLS.
> or:  register vhost [cuse or user] driver to handle vhost message.

Makes sense, will update it, thanks.

> > ret = rte_vhost_driver_register((char *)&dev_basename);
> > if (ret != 0)
> > -   rte_exit(EXIT_FAILURE,"CUSE device setup failure.\n");
> > +   rte_exit(EXIT_FAILURE,"vhost driver register failure.\n");
> >
> > rte_vhost_driver_callback_register(&virtio_net_device_ops);
> >

[dpdk-dev] [PATCH 1/3] vhost: add log if fails to bind a socket

2015-07-03 Thread Ouyang, Changchun



> -Original Message-
> From: Xie, Huawei
> Sent: Thursday, July 2, 2015 5:29 PM
> To: Ouyang, Changchun; dev at dpdk.org
> Cc: Cao, Waterman; Xu, Qian Q
> Subject: Re: [PATCH 1/3] vhost: add log if fails to bind a socket
> 
> On 7/2/2015 11:33 AM, Ouyang, Changchun wrote:
> > It adds more readable log info if a socket fails to bind to local device file
> > name.
> local socket file, not device file. :).

Makes sense, will update it, thanks.

> >
> > Signed-off-by: Changchun Ouyang 
> > ---
> >  lib/librte_vhost/vhost_user/vhost-net-user.c | 5 -
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c
> > index 87a4711..f406a94 100644
> > --- a/lib/librte_vhost/vhost_user/vhost-net-user.c
> > +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
> > @@ -122,8 +122,11 @@ uds_socket(const char *path)
> > un.sun_family = AF_UNIX;
> > snprintf(un.sun_path, sizeof(un.sun_path), "%s", path);
> > ret = bind(sockfd, (struct sockaddr *)&un, sizeof(un));
> > -   if (ret == -1)
> > +   if (ret == -1) {
> > +   RTE_LOG(ERR, VHOST_CONFIG, "fail to bind fd:%d, remove file:%s and try again.\n",
> > +   sockfd, path);
> > goto err;
> > +   }
> > RTE_LOG(INFO, VHOST_CONFIG, "bind to %s\n", path);
> >
> > ret = listen(sockfd, MAX_VIRTIO_BACKLOG);

[dpdk-dev] [PATCH 3/3] vhost: call api to unregister vhost driver

2015-07-03 Thread Ouyang, Changchun


> -Original Message-
> From: Xie, Huawei
> Sent: Thursday, July 2, 2015 5:38 PM
> To: Ouyang, Changchun; dev at dpdk.org
> Cc: Cao, Waterman; Xu, Qian Q
> Subject: Re: [PATCH 3/3] vhost: call api to unregister vhost driver
> 
> On 7/2/2015 11:33 AM, Ouyang, Changchun wrote:
> >
> > /* Start CUSE session. */
> > rte_vhost_driver_session_start();
> > +
> > +   /* Unregister vhost driver. */
> > +   ret = rte_vhost_driver_unregister((char *)&dev_basename);
> > +   if (ret != 0)
> > +   rte_exit(EXIT_FAILURE,"vhost driver unregister failure.\n");
> > +
> Better remove the above code.
> It is duplicated with signal handler and actually
> rte_vhost_driver_session_start never returns.

How about calling one function to replace the code snippet?
I think we need the unregister call there; it gives a clear example of what the
vhost lib caller needs to do at the teardown stage.
Maybe 'never returns' will change some day.
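
One way to read the "call one function" suggestion is a single teardown helper shared by the signal handler and the end of main(). The sketch below is only an assumption about how that could look: ex_driver_unregister() is a hypothetical stand-in for the real unregister call, and ex_vhost_teardown() is an invented name.

```c
#include <stdio.h>

/* Hypothetical stand-in for the real unregister call; returns 0 on success. */
int ex_driver_unregister(const char *name)
{
	printf("unregistering %s\n", name);
	return 0;
}

/* Non-zero while the driver is registered; cleared by the teardown helper. */
static int ex_registered = 1;

/* Single teardown path that both the signal handler and main() could call. */
int ex_vhost_teardown(const char *name)
{
	if (!ex_registered)
		return 0;	/* already torn down, nothing to do */
	if (ex_driver_unregister(name) != 0)
		return -1;	/* leave the flag set so the caller can report failure */
	ex_registered = 0;
	return 0;
}
```

Because the helper is idempotent, it does not matter whether the signal handler or the normal exit path reaches it first.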

> 
> > return 0;
> >
> >  }

[dpdk-dev] [PATCH 3/3] vhost: call api to unregister vhost driver

2015-07-03 Thread Ouyang, Changchun



> -Original Message-
> From: Xie, Huawei
> Sent: Friday, July 3, 2015 12:04 AM
> To: Ouyang, Changchun; dev at dpdk.org
> Cc: Cao, Waterman; Xu, Qian Q
> Subject: Re: [PATCH 3/3] vhost: call api to unregister vhost driver
> 
> On 7/2/2015 11:33 AM, Ouyang, Changchun wrote:
> > The commit will break vhost sample when it runs in second time:
> > 292959c71961acde0cda6e77e737bb0a4df1559c
> >
> > It should call api to unregister vhost driver when sample exit/quit,
> > then the socket file will be removed(by calling unlink), and thus make
> > vhost sample work correctly in second time startup.
> >
> > Signed-off-by: Changchun Ouyang 
> > ---
> >  examples/vhost/main.c | 18 ++
> >  1 file changed, 18 insertions(+)
> >
> > diff --git a/examples/vhost/main.c b/examples/vhost/main.c
> > index 72c4773..90666b3 100644
> > --- a/examples/vhost/main.c
> > +++ b/examples/vhost/main.c
> > @@ -2871,6 +2871,16 @@ setup_mempool_tbl(int socket, uint32_t index, char *pool_name,
> > }
> >  }
> >
> > +/* When we receive a HUP signal, unregister vhost driver */
> > +static void
> > +sighup_handler(__rte_unused int signum)
> > +{
> > +   /* Unregister vhost driver. */
> > +   int ret = rte_vhost_driver_unregister((char *)&dev_basename);
> > +   if (ret != 0)
> > +   rte_exit(EXIT_FAILURE, "vhost driver unregister failure.\n");
> > +   exit(0);
> > +}
> >
> >  /*
> >   * Main function, does initialisation and calls the per-lcore functions. The CUSE
> > @@ -2887,6 +2897,8 @@ main(int argc, char *argv[])
> > uint16_t queue_id;
> > static pthread_t tid;
> >
> > +   signal(SIGINT, sighup_handler);
> > +
> 
> Ignore if duplicated.
> sighup->sigint

Makes sense, will update it in v2.

> 
> > /* init EAL */
> > ret = rte_eal_init(argc, argv);
> > if (ret < 0)
> > @@ -3060,6 +3072,12 @@ main(int argc, char *argv[])
> >
> > /* Start CUSE session. */
> > rte_vhost_driver_session_start();
> > +
> > +   /* Unregister vhost driver. */
> > +   ret = rte_vhost_driver_unregister((char *)&dev_basename);
> > +   if (ret != 0)
> > +   rte_exit(EXIT_FAILURE,"vhost driver unregister failure.\n");
> > +
> > return 0;
> >
> >  }

[dpdk-dev] [RFC] i40e: Add the LFC(link flow control) support for the FVL

2015-07-03 Thread Zhe Tao

Feature Add: Rx/Tx flow control support for the i40e

All the Rx/Tx enable/disable operations are done by the F/W, so the device driver
needs to use the Set PHY Config AD command to trigger the PHY to do the
auto-negotiation. After the tx/rx pause ability is negotiated, the F/W will
set the related LFC enable/disable registers, but the device driver needs
to configure the related registers to control how often the pause frame is sent
and what value the pause frame carries.

Issue: On the 40G SR4 card the Set PHY Config AD command causes the link to go
down. This issue has been reported to the F/W team; waiting for their response.

Signed-off-by: Zhe Tao 
---
 drivers/net/i40e/i40e_ethdev.c | 169 -
 drivers/net/i40e/i40e_ethdev.h |  11 +++
 2 files changed, 177 insertions(+), 3 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 2ada502..14eb41c 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -80,6 +80,27 @@

 #define I40E_PRE_TX_Q_CFG_WAIT_US   10 /* 10 us */

+/* Flow control default timer */
+#define I40E_DEFAULT_PAUSE_TIME 0xU
+
+/* Flow control default high water */
+#define I40E_DEFAULT_HIGH_WATER 0x1C40
+
+/* Flow control default low water */
+#define I40E_DEFAULT_LOW_WATER  0x1A40
+
+/* Flow control enable fwd bit */
+#define I40E_PRTMAC_FWD_CTRL   0x0001
+
+/* Receive Packet Buffer size */
+#define I40E_RXPBSIZE (968 * 1024)
+
+/* Kilobytes shift */
+#define I40E_KILOSHIFT 10
+
+/* Receive Average Packet Size in Byte*/
+#define I40E_PACKET_AVERAGE_SIZE 128
+
 /* Mask of PF interrupt causes */
 #define I40E_PFINT_ICR0_ENA_MASK ( \
I40E_PFINT_ICR0_ENA_ECC_ERR_MASK | \
@@ -137,6 +158,8 @@ static void i40e_vlan_strip_queue_set(struct rte_eth_dev *dev,
 static int i40e_vlan_pvid_set(struct rte_eth_dev *dev, uint16_t pvid, int on);
 static int i40e_dev_led_on(struct rte_eth_dev *dev);
 static int i40e_dev_led_off(struct rte_eth_dev *dev);
+static int i40e_flow_ctrl_get(struct rte_eth_dev *dev,
+ struct rte_eth_fc_conf *fc_conf);
 static int i40e_flow_ctrl_set(struct rte_eth_dev *dev,
  struct rte_eth_fc_conf *fc_conf);
 static int i40e_priority_flow_ctrl_set(struct rte_eth_dev *dev,
@@ -251,6 +274,7 @@ static const struct eth_dev_ops i40e_eth_dev_ops = {
.tx_queue_release = i40e_dev_tx_queue_release,
.dev_led_on   = i40e_dev_led_on,
.dev_led_off  = i40e_dev_led_off,
+   .flow_ctrl_get= i40e_flow_ctrl_get,
.flow_ctrl_set= i40e_flow_ctrl_set,
.priority_flow_ctrl_set   = i40e_priority_flow_ctrl_set,
.mac_addr_add = i40e_macaddr_add,
@@ -390,6 +414,9 @@ eth_i40e_dev_init(struct rte_eth_dev *dev)
pf->adapter = I40E_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
pf->adapter->eth_dev = dev;
pf->dev_data = dev->data;
+   pf->fc_conf.pause_time = I40E_DEFAULT_PAUSE_TIME;
+   pf->fc_conf.high_water[I40E_MAX_TRAFFIC_CLASS] = I40E_DEFAULT_HIGH_WATER;
+   pf->fc_conf.low_water[I40E_MAX_TRAFFIC_CLASS] = I40E_DEFAULT_LOW_WATER;

hw->back = I40E_PF_TO_ADAPTER(pf);
hw->hw_addr = (uint8_t *)(pci_dev->mem_resource[0].addr);
@@ -1673,12 +1700,147 @@ i40e_dev_led_off(struct rte_eth_dev *dev)
 }

 static int
-i40e_flow_ctrl_set(__rte_unused struct rte_eth_dev *dev,
-  __rte_unused struct rte_eth_fc_conf *fc_conf)
+i40e_flow_ctrl_get(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf)
+{
+   struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private);
+
+   fc_conf->pause_time = pf->fc_conf.pause_time;
+   fc_conf->high_water =  pf->fc_conf.high_water[I40E_MAX_TRAFFIC_CLASS];
+   fc_conf->low_water = pf->fc_conf.low_water[I40E_MAX_TRAFFIC_CLASS];
+
+   /*
+* Return current mode according to actual setting
+*/
+   switch (hw->fc.current_mode) {
+   case I40E_FC_FULL:
+   fc_conf->mode = RTE_FC_FULL;
+   break;
+   case I40E_FC_TX_PAUSE:
+   fc_conf->mode = RTE_FC_TX_PAUSE;
+   break;
+   case I40E_FC_RX_PAUSE:
+   fc_conf->mode = RTE_FC_RX_PAUSE;
+   break;
+   case I40E_FC_NONE:
+   fc_conf->mode = RTE_FC_NONE;
+   break;
+   default:
+   break;
+   };
+
+   return 0;
+}
+
+static int
+i40e_flow_ctrl_set(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf)
 {
+   uint32_t mflcn_reg, fctrl_reg, reg;
+   uint32_t max_high_water;
+   uint8_t i, aq_failure;
+   int err;
+   struct i40e_hw *hw;
+   struct i40e_pf *pf;
+   enum i40e_fc_mode rte_fcmode_2_i40e_fcmode[] = {
+   I40E_FC_NONE,
+   I40E_FC_RX_PAUSE,
+

[dpdk-dev] [PATCH v2] doc/sample_app_ug:add a VXLAN sample guide

2015-07-03 Thread Jijiang Liu

Add a VXLAN sample guide in the sample_app_ug directory.

It includes:

- Add the overlay networking picture with svg format.

- Add the TEP termination framework picture with svg format.

- Add the tep_termination.rst file

- Change the index.rst file for the above pictures index.

Signed-off-by: Jijiang Liu 
Signed-off-by: Thomas Long 

v2 changes:
optimize the two pictures
add tep_termination index in index.rst file
fix a typo and a command line

---
 .../sample_app_ug/img/overlay_networking.svg   |  786 
 .../sample_app_ug/img/tep_termination_arch.svg |  548 ++
 doc/guides/sample_app_ug/index.rst |3 +
 doc/guides/sample_app_ug/tep_termination.rst   |  321 
 4 files changed, 1658 insertions(+), 0 deletions(-)
 create mode 100644 doc/guides/sample_app_ug/img/overlay_networking.svg
 create mode 100644 doc/guides/sample_app_ug/img/tep_termination_arch.svg
 create mode 100644 doc/guides/sample_app_ug/tep_termination.rst

diff --git a/doc/guides/sample_app_ug/img/overlay_networking.svg b/doc/guides/sample_app_ug/img/overlay_networking.svg
new file mode 100644
index 000..2ce440d
--- /dev/null
+++ b/doc/guides/sample_app_ug/img/overlay_networking.svg
@@ -0,0 +1,786 @@
+[SVG markup of overlay_networking.svg (Visio export, 8.5in x 11in, viewBox "0 0 612 792") was stripped by the archive]
+

[dpdk-dev] Bifurcated driver in DPDK 2.0

2015-07-03 Thread Abhishek Verma

Hi,

I have downloaded DPDK release 2.0 and was looking at how I can install
bifurcated driver support. Can somebody point me towards it?

I don't plan to use KNI to push the slow path traffic through the kernel
space networking stack, as I would end up paying some penalty. I intend to use the
bifurcated driver and install a few filter rules in the NIC that match
certain UDP L4 ports and send those packets to queues dedicated to DPDK, with
all the rest being passed to the native kernel networking stack.

Any pointers on how I can install the bifurcated driver would be of immense
help.

Thanks,
Abhishek

[dpdk-dev] [PATCH v6 1/9] eal: move librte_malloc to eal/common

2015-07-03 Thread Gonzalez Monroy, Sergio

On 02/07/2015 13:14, Thomas Monjalon wrote:
> 2015-06-26 16:29, Sergio Gonzalez Monroy:
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -73,6 +73,7 @@ F: lib/librte_eal/common/*
>>   F: lib/librte_eal/common/include/*
>>   F: lib/librte_eal/common/include/generic/
>>   F: doc/guides/prog_guide/env_abstraction_layer.rst
>> +F: doc/guides/prog_guide/malloc_lib.rst
>>   F: app/test/test_alarm.c
>>   F: app/test/test_atomic.c
>>   F: app/test/test_byteorder.c
>> @@ -97,6 +98,8 @@ F: app/test/test_spinlock.c
>>   F: app/test/test_string_fns.c
>>   F: app/test/test_tailq.c
>>   F: app/test/test_version.c
>> +F: app/test/test_malloc.c
>> +F: app/test/test_func_reentrancy.c
> I think we should keep a separate maintainer section for memory allocator
> in EAL. I suggest this:
>
> Memory allocation
> M: Sergio Gonzalez Monroy 
> F: lib/librte_eal/common/include/rte_mem*
> F: lib/librte_eal/common/include/rte_malloc.h
> F: lib/librte_eal/common/*malloc*
> F: lib/librte_eal/common/eal_common_mem*
> F: lib/librte_eal/common/eal_hugepages.h
> F: doc/guides/prog_guide/malloc_lib.rst
> F: app/test/test_malloc.c
> F: app/test/test_func_reentrancy.c
>
>
Fine with me.
Do you need a new version of the patches with that change?

Sergio

[dpdk-dev] [PATCH v3 0/7] ethdev: add support for ieee1588 timestamping

2015-07-03 Thread Lu, Wenzhuo

Hi,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of John McNamara
> Sent: Thursday, July 2, 2015 11:16 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v3 0/7] ethdev: add support for ieee1588
> timestamping
> 
> This patchset adds ethdev API to enable and read IEEE1588/802.1AS PTP
> timestamps from devices that support it. The following functions are added:
> 
> rte_eth_timesync_enable()
> rte_eth_timesync_disable()
> rte_eth_timesync_read_rx_timestamp()
> rte_eth_timesync_read_tx_timestamp()
> 
> The "ieee1588" forwarding mode in testpmd is also refactored to demonstrate
> the new API and to clean up the code.
> 
> Adds support for igb, ixgbe and i40e.
> 
> V3:
> * Fixed issue with version.map.
> 
> V2:
> * Added i40e support.
> 
> * Renamed ethdev functions from rte_eth_ieee15888_*() to
> rte_eth_timesync_*()
>   since 802.1AS can be supported through the same interfaces.
> 
> V1:
> * Initial version for
> 
> 
> John McNamara (7):
>   ethdev: add support for ieee1588 timestamping
>   e1000: add support for ieee1588 timestamping
>   ixgbe: add support for ieee1588 timestamping
>   i40e: add support for ieee1588 timestamping
>   app/testpmd: refactor ieee1588 forwarding
>   doc: document ieee1588 forwarding mode
>   abi: announce mbuf addition for ieee1588 in DPDK 2.2
> 
>  app/test-pmd/ieee1588fwd.c  | 466 ++--
>  doc/guides/rel_notes/abi.rst|   5 +
>  doc/guides/testpmd_app_ug/run_app.rst   |   2 +-
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |   2 +
>  drivers/net/e1000/igb_ethdev.c  | 115 +++
>  drivers/net/i40e/i40e_ethdev.c  | 143 +
>  drivers/net/i40e/i40e_rxtx.c|  39 ++-
>  drivers/net/ixgbe/ixgbe_ethdev.c| 122 
>  lib/librte_ether/rte_ethdev.c   |  70 -
>  lib/librte_ether/rte_ethdev.h   |  90 +-
>  lib/librte_ether/rte_ether_version.map  |   4 +
>  11 files changed, 615 insertions(+), 443 deletions(-)
> 
> --
> 1.8.1.4
Acked-by: Wenzhuo Lu

[dpdk-dev] [PATCH v9 00/19] unified packet type

2015-07-03 Thread Helin Zhang

Currently only 6 bits which are stored in ol_flags are used to indicate the
packet types. This is not enough, as some NIC hardware can recognize quite
a lot of packet types, e.g. i40e hardware can recognize more than 150 packet
types. Hiding those packet types hides hardware offload capabilities which
could be quite useful for improving performance and for end users.
So unified packet types are needed to support all possible PMDs. The 16-bit
packet_type in the mbuf structure can be changed to 32 bits and used for this
purpose. In addition, all packet types stored in the ol_flags field should be
deleted entirely, and 6 bits of ol_flags can be saved as the benefit.

Initially, 32 bits of packet_type can be divided into several sub fields to
indicate different packet type information of a packet. The initial design
is to divide those bits into fields for L2 types, L3 types, L4 types, tunnel
types, inner L2 types, inner L3 types and inner L4 types. All PMDs should
translate the offloaded packet types into these 7 fields of information, for
user applications.
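
As a rough illustration of the seven sub-fields, the sketch below pulls each one out of a 32-bit packet type. It assumes the 4-bits-per-field layout documented in the rte_mbuf.h comment of patch 02/19 (bits 3:0 for L2 up to bits 27:24 for inner L4); the enum and the helper names are hypothetical and are not part of the patch set.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical names: one entry per sub-field, ordered from bit 0 upwards. */
enum ptype_field {
	PTYPE_FIELD_L2 = 0,    /* bits  3:0  */
	PTYPE_FIELD_L3,        /* bits  7:4  */
	PTYPE_FIELD_L4,        /* bits 11:8  */
	PTYPE_FIELD_TUNNEL,    /* bits 15:12 */
	PTYPE_FIELD_INNER_L2,  /* bits 19:16 */
	PTYPE_FIELD_INNER_L3,  /* bits 23:20 */
	PTYPE_FIELD_INNER_L4,  /* bits 27:24 */
	PTYPE_FIELD_COUNT
};

/* Return the 4-bit value stored in one sub-field of a 32-bit packet type. */
static inline uint32_t
ptype_get_field(uint32_t packet_type, enum ptype_field field)
{
	assert(field < PTYPE_FIELD_COUNT);
	return (packet_type >> (4u * (unsigned int)field)) & 0xfu;
}

/* Print every sub-field, e.g. while debugging an RX path. */
static inline void
ptype_dump(uint32_t packet_type)
{
	static const char *const names[PTYPE_FIELD_COUNT] = {
		"L2", "L3", "L4", "TUNNEL", "INNER_L2", "INNER_L3", "INNER_L4"
	};
	int i;

	for (i = 0; i < PTYPE_FIELD_COUNT; i++)
		printf("%-8s = 0x%x\n", names[i],
		       (unsigned int)ptype_get_field(packet_type,
						     (enum ptype_field)i));
}
```

Calling ptype_dump() on a received packet type would list the seven values side by side; bits 31:28 are reserved in that layout and are deliberately not decoded here.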

To avoid breaking ABI compatibility, all the code changes for the unified
packet type are currently disabled at compile time by default. Users can enable
them manually by defining the macro RTE_NEXT_ABI. The code changes will be
enabled by default in a future release, and the old code will be deleted
accordingly, after the ABI change process is done.
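
A minimal sketch of that compile-time gating is shown here. The structure, the flag bit and the helper names are hypothetical stand-ins (the real change is in 'struct rte_mbuf'); only the RTE_NEXT_ABI macro, the 16-bit and 32-bit widths and the 0x10 value of RTE_PTYPE_L3_IPV4 are taken from this patch set.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the affected part of 'struct rte_mbuf'. */
struct demo_mbuf {
	uint64_t ol_flags;
#ifdef RTE_NEXT_ABI
	uint32_t packet_type;  /* next ABI: 32-bit unified packet type */
#else
	uint16_t packet_type;  /* current ABI: 16-bit field */
#endif
};

#define DEMO_RX_IPV4_HDR   (1ULL << 5)  /* hypothetical ol_flags bit */
#define DEMO_PTYPE_L3_IPV4 0x00000010   /* same value as RTE_PTYPE_L3_IPV4 */

/* Record "this is an IPv4 packet" in whichever place the selected ABI uses. */
static inline void
demo_mark_ipv4(struct demo_mbuf *m)
{
#ifdef RTE_NEXT_ABI
	m->packet_type = DEMO_PTYPE_L3_IPV4;
#else
	m->ol_flags |= DEMO_RX_IPV4_HDR;
#endif
}

/* Width in bits of the packet_type field for the ABI being compiled. */
static inline size_t
demo_packet_type_bits(void)
{
	return 8 * sizeof(((struct demo_mbuf *)0)->packet_type);
}
```

Building the same file with and without -DRTE_NEXT_ABI selects one branch or the other, which is the pattern the driver patches in this series repeat around every packet type assignment.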

Note that this patch set should be integrated after the patch set
'[PATCH v3 0/7] support i40e QinQ stripping and insertion', to resolve the
conflicts cleanly during integration: both patch sets modify 'struct rte_mbuf',
and the final layout of 'struct rte_mbuf' is key to the vectorized ixgbe PMD.

Its v8 version was acked by Konstantin Ananyev 

v2 changes:
* Enlarged the packet_type field from 16 bits to 32 bits.
* Redefined the packet type sub-fields.
* Updated the 'struct rte_kni_mbuf' for KNI according to the mbuf changes.
* Used redefined packet types and enlarged packet_type field for all PMDs
  and corresponding applications.
* Removed changes in bond and its relevant application, as there is no need
  at all according to the recent bond changes.

v3 changes:
* Put the mbuf layout changes into a single patch.
* Put vector ixgbe changes right after mbuf changes.
* Disabled vector ixgbe PMD by default, as mbuf layout changed, and then
  re-enabled it after vector ixgbe PMD updated.
* Put the definitions of unified packet type into a single patch.
* Minor bug fixes and enhancements in l3fwd example.

v4 changes:
* Added detailed description of each packet type.
* Supported unified packet type of fm10k.
* Added printing logs of packet types of each received packet for rxonly
  mode in testpmd.
* Removed several useless code lines which block packet type unification from
  app/test/packet_burst_generator.c.

v5 changes:
* Added more detailed description for each packet type, together with examples.
* Rolled back the macro definitions of RX packet flags, for ABI compatibility.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.
* Integrated with patch set for '[PATCH v3 0/7] support i40e QinQ stripping
  and insertion', to clearly solve the conflicts during merging.

v8 changes:
* Moved the 'vlan_tci_outer' field in 'struct rte_mbuf' to the end of the 1st
  cache line, to avoid breaking any vectorized PMD stores, as the fields
  'packet_type, pkt_len, data_len, vlan_tci, rss' should be in a contiguous
  128 bits.

v9 changes:
* Put the mbuf changes and vector PMD changes together, as they are
  tightly relevant.
* Renamed MAC to ETHER in packet type names.
* Corrected the packet type explanation of RTE_PTYPE_L2_ETHER.
* Reworked newly added cxgbe driver and tep_termination example application to
  support unified packet type, which is disabled by default.

Helin Zhang (19):
  mbuf: redefine packet_type in rte_mbuf
  mbuf: add definitions of unified packet types
  e1000: replace bit mask based packet type with unified packet type
  ixgbe: replace bit mask based packet type with unified packet type
  i40e: replace bit mask based packet type with unified packet type
  enic: replace bit mask based packet type with unified packet type
  vmxnet3: replace bit mask based packet type with unified packet type
  fm10k: replace bit mask based packet type with unified packet type
  cxgbe: replace bit mask based packet type with unified packet type
  app/test-pipeline: replace bit mask based packet type with unified
    packet type
  app/testpmd: replace bit mask based packet type with unified packet
    type
  app/test: Remove useless code
  examples/ip_fragmentation: replace bit mask based packet type with
    unified packet type
  examples/ip_reassembly: replace bit mask based packet type with
    unified packet type
  examples/l3fwd-acl: replace bit mask based packet type with unified
p

[dpdk-dev] [PATCH v9 01/19] mbuf: redefine packet_type in rte_mbuf

2015-07-03 Thread Helin Zhang

In order to unify the packet type, the 'packet_type' field in 'struct rte_mbuf'
needs to be extended from 16 to 32 bits. Accordingly, some fields in
'struct rte_mbuf' are re-organized to support this change for the Vector PMD.
As 'struct rte_kni_mbuf' for KNI must map exactly onto 'struct rte_mbuf', it is
modified accordingly. In the ixgbe PMD, corresponding changes are added for the
mbuf changes; in particular, the bit masks of packet type for 'ol_flags' are
replaced by the unified packet type. In addition, more packet types (UDP, TCP
and SCTP) are supported in the vectorized ixgbe PMD.
To avoid breaking ABI compatibility, all the changes are enabled by
RTE_NEXT_ABI, which is disabled by default.
Note that a performance drop of around 2% (64B packets) was observed when doing
IO forwarding on 4 ports (1 port per 82599 card) on the same SNB core.

Signed-off-by: Helin Zhang 
Signed-off-by: Cunming Liang 
---
 drivers/net/ixgbe/ixgbe_rxtx_vec.c | 75 +-
 .../linuxapp/eal/include/exec-env/rte_kni_common.h |  6 ++
 lib/librte_mbuf/rte_mbuf.h | 26 
 3 files changed, 105 insertions(+), 2 deletions(-)

v2 changes:
* Enlarged the packet_type field from 16 bits to 32 bits.
* Redefined the packet type sub-fields.
* Updated the 'struct rte_kni_mbuf' for KNI according to the mbuf changes.

v3 changes:
* Put the mbuf layout changes into a single patch.
* Disabled vector ixgbe PMD by default, as mbuf layout changed.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.
* Integrated with changes of QinQ stripping/insertion.

v8 changes:
* Moved the field of 'vlan_tci_outer' in 'struct rte_mbuf' to the end
  of the 1st cache line, to avoid breaking any vectorized PMD storing.

v9 changes:
* Put the mbuf changes and vector PMD changes together, as they are
  tightly relevant.

diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
index abd10f6..ccea7cd 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
@@ -134,6 +134,12 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
  */
 #ifdef RTE_IXGBE_RX_OLFLAGS_ENABLE

+#ifdef RTE_NEXT_ABI
+#define OLFLAGS_MASK_V  (((uint64_t)PKT_RX_VLAN_PKT << 48) | \
+   ((uint64_t)PKT_RX_VLAN_PKT << 32) | \
+   ((uint64_t)PKT_RX_VLAN_PKT << 16) | \
+   ((uint64_t)PKT_RX_VLAN_PKT))
+#else
 #define OLFLAGS_MASK ((uint16_t)(PKT_RX_VLAN_PKT | PKT_RX_IPV4_HDR |\
 PKT_RX_IPV4_HDR_EXT | PKT_RX_IPV6_HDR |\
 PKT_RX_IPV6_HDR_EXT))
@@ -142,11 +148,26 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
  ((uint64_t)OLFLAGS_MASK << 16) | \
  ((uint64_t)OLFLAGS_MASK))
 #define PTYPE_SHIFT(1)
+#endif /* RTE_NEXT_ABI */
+
 #define VTAG_SHIFT (3)

 static inline void
 desc_to_olflags_v(__m128i descs[4], struct rte_mbuf **rx_pkts)
 {
+#ifdef RTE_NEXT_ABI
+   __m128i vtag0, vtag1;
+   union {
+   uint16_t e[4];
+   uint64_t dword;
+   } vol;
+
+   vtag0 = _mm_unpackhi_epi16(descs[0], descs[1]);
+   vtag1 = _mm_unpackhi_epi16(descs[2], descs[3]);
+   vtag1 = _mm_unpacklo_epi32(vtag0, vtag1);
+   vtag1 = _mm_srli_epi16(vtag1, VTAG_SHIFT);
+   vol.dword = _mm_cvtsi128_si64(vtag1) & OLFLAGS_MASK_V;
+#else
__m128i ptype0, ptype1, vtag0, vtag1;
union {
uint16_t e[4];
@@ -166,6 +187,7 @@ desc_to_olflags_v(__m128i descs[4], struct rte_mbuf **rx_pkts)

ptype1 = _mm_or_si128(ptype1, vtag1);
vol.dword = _mm_cvtsi128_si64(ptype1) & OLFLAGS_MASK_V;
+#endif /* RTE_NEXT_ABI */

rx_pkts[0]->ol_flags = vol.e[0];
rx_pkts[1]->ol_flags = vol.e[1];
@@ -196,6 +218,18 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
int pos;
uint64_t var;
__m128i shuf_msk;
+#ifdef RTE_NEXT_ABI
+   __m128i crc_adjust = _mm_set_epi16(
+   0, 0, 0,/* ignore non-length fields */
+   -rxq->crc_len, /* sub crc on data_len */
+   0,  /* ignore high-16bits of pkt_len */
+   -rxq->crc_len, /* sub crc on pkt_len */
+   0, 0/* ignore pkt_type field */
+   );
+   __m128i dd_check, eop_check;
+   __m128i desc_mask = _mm_set_epi32(0x, 0x,
+ 0x, 0x07F0);
+#else
__m128i crc_adjust = _mm_set_epi16(
0, 0, 0, 0, /* ignore non-length fields */
0,  /* ignore high-16bits of pkt_len */
@@ -204,

[dpdk-dev] [PATCH v9 02/19] mbuf: add definitions of unified packet types

2015-07-03 Thread Helin Zhang

There are only 6 bit flags in ol_flags for indicating packet
types, which is not enough to describe all the possible packet
types hardware can recognize. For example, i40e hardware can
recognize more than 150 packet types. The unified packet type is
composed of L2 type, L3 type, L4 type, tunnel type, inner L2 type,
inner L3 type and inner L4 type fields, and is stored in the
32-bit 'packet_type' field of 'struct rte_mbuf'.
To avoid breaking ABI compatibility, all the changes are
enabled by RTE_NEXT_ABI, which is disabled by default.
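
One consequence of this encoding is worth a small sketch. The header comment in this patch calls each field "indexical", which is read here as meaning that a field holds an enumerated index rather than independent flag bits. Under that reading a type has to be tested by masking the field and comparing, not by a single-bit test. The macro values below are the L2 definitions from this patch; the helper functions are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* L2 values and mask as defined by this patch. */
#define RTE_PTYPE_L2_ETHER          0x00000001
#define RTE_PTYPE_L2_ETHER_TIMESYNC 0x00000002
#define RTE_PTYPE_L2_ETHER_ARP      0x00000003
#define RTE_PTYPE_L2_ETHER_LLDP     0x00000004
#define RTE_PTYPE_L2_MASK           0x0000000f

/* Hypothetical helper: compare the whole L2 field against the ARP index. */
static inline bool
ptype_is_arp(uint32_t ptype)
{
	return (ptype & RTE_PTYPE_L2_MASK) == RTE_PTYPE_L2_ETHER_ARP;
}

/* Hypothetical helper: compare the whole L2 field against the timesync index. */
static inline bool
ptype_is_timesync(uint32_t ptype)
{
	return (ptype & RTE_PTYPE_L2_MASK) == RTE_PTYPE_L2_ETHER_TIMESYNC;
}

/*
 * Deliberately wrong variant, kept only to show the pitfall: ARP is 0x3,
 * so a bit test against TIMESYNC (0x2) also matches ARP packets.
 */
static inline bool
ptype_is_timesync_naive(uint32_t ptype)
{
	return (ptype & RTE_PTYPE_L2_ETHER_TIMESYNC) != 0;
}
```

Fields in different nibbles can still be combined with a bitwise OR, as the driver tables in this series do; the mask-and-compare rule only applies within a single field.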

Signed-off-by: Helin Zhang 
---
 lib/librte_mbuf/rte_mbuf.h | 486 +
 1 file changed, 486 insertions(+)

v3 changes:
* Put the definitions of unified packet type into a single patch.

v4 changes:
* Added detailed description of each packet type.

v5 changes:
* Re-worded the commit logs.
* Added more detailed description for all packet types, together with examples.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

v9 changes:
* Renamed MAC to ETHER in packet type names.
* Corrected the packet type explanation of RTE_PTYPE_L2_ETHER.

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index ac29da3..3a17d95 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -202,6 +202,492 @@ extern "C" {
 /* Use final bit of flags to indicate a control mbuf */
 #define CTRL_MBUF_FLAG   (1ULL << 63) /**< Mbuf contains control data */

+#ifdef RTE_NEXT_ABI
+/*
+ * 32 bits are divided into several fields to mark packet types. Note that
+ * each field is indexical.
+ * - Bit 3:0 is for L2 types.
+ * - Bit 7:4 is for L3 or outer L3 (for tunneling case) types.
+ * - Bit 11:8 is for L4 or outer L4 (for tunneling case) types.
+ * - Bit 15:12 is for tunnel types.
+ * - Bit 19:16 is for inner L2 types.
+ * - Bit 23:20 is for inner L3 types.
+ * - Bit 27:24 is for inner L4 types.
+ * - Bit 31:28 is reserved.
+ *
+ * To be compatible with Vector PMD, RTE_PTYPE_L3_IPV4, RTE_PTYPE_L3_IPV4_EXT,
+ * RTE_PTYPE_L3_IPV6, RTE_PTYPE_L3_IPV6_EXT, RTE_PTYPE_L4_TCP, RTE_PTYPE_L4_UDP
+ * and RTE_PTYPE_L4_SCTP should be kept as below in a contiguous 7 bits.
+ *
+ * Note that L3 types values are selected for checking IPV4/IPV6 header from
+ * performance point of view. Reading annotations of RTE_ETH_IS_IPV4_HDR and
+ * RTE_ETH_IS_IPV6_HDR is needed for any future changes of L3 type values.
+ *
+ * Note that the packet types of the same packet recognized by different
+ * hardware may be different, as different hardware may have different
+ * capability of packet type recognition.
+ *
+ * examples:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=0x29
+ * | 'version'=6, 'next header'=0x3A
+ * | 'ICMPv6 header'>
+ * will be recognized on i40e hardware as packet type combination of,
+ * RTE_PTYPE_L2_ETHER |
+ * RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+ * RTE_PTYPE_TUNNEL_IP |
+ * RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+ * RTE_PTYPE_INNER_L4_ICMP.
+ *
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=0x2F
+ * | 'GRE header'
+ * | 'version'=6, 'next header'=0x11
+ * | 'UDP header'>
+ * will be recognized on i40e hardware as packet type combination of,
+ * RTE_PTYPE_L2_ETHER |
+ * RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+ * RTE_PTYPE_TUNNEL_GRENAT |
+ * RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+ * RTE_PTYPE_INNER_L4_UDP.
+ */
+#define RTE_PTYPE_UNKNOWN   0x00000000
+/**
+ * Ethernet packet type.
+ * It is used for outer packet for tunneling cases.
+ *
+ * Packet format:
+ * <'ether type'=[0x0800|0x86DD]>
+ */
+#define RTE_PTYPE_L2_ETHER  0x0001
+/**
+ * Ethernet packet type for time sync.
+ *
+ * Packet format:
+ * <'ether type'=0x88F7>
+ */
+#define RTE_PTYPE_L2_ETHER_TIMESYNC 0x0002
+/**
+ * ARP (Address Resolution Protocol) packet type.
+ *
+ * Packet format:
+ * <'ether type'=0x0806>
+ */
+#define RTE_PTYPE_L2_ETHER_ARP  0x0003
+/**
+ * LLDP (Link Layer Discovery Protocol) packet type.
+ *
+ * Packet format:
+ * <'ether type'=0x88CC>
+ */
+#define RTE_PTYPE_L2_ETHER_LLDP 0x0004
+/**
+ * Mask of layer 2 packet types.
+ * It is used for outer packet for tunneling cases.
+ */
+#define RTE_PTYPE_L2_MASK   0x000f
+/**
+ * IP (Internet Protocol) version 4 packet type.
+ * It is used for outer packet for tunneling cases, and does not contain any
+ * header option.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'ihl'=5>
+ */
+#define RTE_PTYPE_L3_IPV4   0x0010
+/**
+ * IP (Internet Protocol) version 4 packet type.
+ * It is used for outer packet for tunneling cases, and contains header
+ * options.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'ihl'=[6-15], 'options'>
+ */
+#define RTE_PTYPE_L3_IPV4_EXT   0x0030
+/**
+ * IP (Internet Protocol)

[dpdk-dev] [PATCH v9 04/19] ixgbe: replace bit mask based packet type with unified packet type

2015-07-03 Thread Helin Zhang

To unify packet type among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.
Note that a performance drop of around 2.5% (64B packets) was observed when
doing IO forwarding on 4 ports (1 port per 82599 card) on the same SNB core.
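
The replacement is table driven: the packet type index reported in the RX descriptor selects a precomputed unified packet type. The sketch below shows the idea with three entries. The hardware indexes, mask, shift and table size mirror the IXGBE_PACKET_TYPE_* values in the diff that follows; the DEMO_ names are hypothetical, the L4 TCP value is an assumption (placed in bits 11:8 per the layout of patch 02/19), and the shift-then-mask lookup is also an assumption, since the function body is cut off in this archive.

```c
#include <stdint.h>

/* Hardware packet type indexes, mask and shift as listed in the diff. */
#define DEMO_HW_TYPE_IPV4      0x01
#define DEMO_HW_TYPE_IPV4_EXT  0x03
#define DEMO_HW_TYPE_IPV4_TCP  0x11
#define DEMO_HW_TYPE_MAX       0x80
#define DEMO_HW_TYPE_MASK      0x7F
#define DEMO_HW_TYPE_SHIFT     0x04

/* Unified packet type values; the L4 TCP value is an assumption. */
#define DEMO_PTYPE_L2_ETHER     0x00000001
#define DEMO_PTYPE_L3_IPV4      0x00000010
#define DEMO_PTYPE_L3_IPV4_EXT  0x00000030
#define DEMO_PTYPE_L4_TCP       0x00000100

/* Translate the descriptor's pkt_info word into a unified packet type. */
static inline uint32_t
demo_pkt_info_to_pkt_type(uint16_t pkt_info)
{
	/* Entries not listed are zero, i.e. an unknown packet type. */
	static const uint32_t ptype_table[DEMO_HW_TYPE_MAX] = {
		[DEMO_HW_TYPE_IPV4] = DEMO_PTYPE_L2_ETHER |
			DEMO_PTYPE_L3_IPV4,
		[DEMO_HW_TYPE_IPV4_EXT] = DEMO_PTYPE_L2_ETHER |
			DEMO_PTYPE_L3_IPV4_EXT,
		[DEMO_HW_TYPE_IPV4_TCP] = DEMO_PTYPE_L2_ETHER |
			DEMO_PTYPE_L3_IPV4 | DEMO_PTYPE_L4_TCP,
	};
	/* Assumed lookup: drop the low bits, then keep the 7 index bits. */
	uint32_t idx = ((uint32_t)pkt_info >> DEMO_HW_TYPE_SHIFT) &
			DEMO_HW_TYPE_MASK;

	return ptype_table[idx];
}
```

Because the index is masked to 7 bits and the table has 0x80 entries, the lookup cannot run past the end of the array, and the per-packet cost stays at one shift, one mask and one load.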

Signed-off-by: Helin Zhang 
---
 drivers/net/ixgbe/ixgbe_rxtx.c | 163 +
 1 file changed, 163 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

v9 changes:
* Renamed MAC to ETHER in packet type names.

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index a211096..1455e54 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -860,6 +860,110 @@ end_of_tx:
  *  RX functions
  *
  **/
+#ifdef RTE_NEXT_ABI
+#define IXGBE_PACKET_TYPE_IPV4  0X01
+#define IXGBE_PACKET_TYPE_IPV4_TCP  0X11
+#define IXGBE_PACKET_TYPE_IPV4_UDP  0X21
+#define IXGBE_PACKET_TYPE_IPV4_SCTP 0X41
+#define IXGBE_PACKET_TYPE_IPV4_EXT  0X03
+#define IXGBE_PACKET_TYPE_IPV4_EXT_SCTP 0X43
+#define IXGBE_PACKET_TYPE_IPV6  0X04
+#define IXGBE_PACKET_TYPE_IPV6_TCP  0X14
+#define IXGBE_PACKET_TYPE_IPV6_UDP  0X24
+#define IXGBE_PACKET_TYPE_IPV6_EXT  0X0C
+#define IXGBE_PACKET_TYPE_IPV6_EXT_TCP  0X1C
+#define IXGBE_PACKET_TYPE_IPV6_EXT_UDP  0X2C
+#define IXGBE_PACKET_TYPE_IPV4_IPV6 0X05
+#define IXGBE_PACKET_TYPE_IPV4_IPV6_TCP 0X15
+#define IXGBE_PACKET_TYPE_IPV4_IPV6_UDP 0X25
+#define IXGBE_PACKET_TYPE_IPV4_IPV6_EXT 0X0D
+#define IXGBE_PACKET_TYPE_IPV4_IPV6_EXT_TCP 0X1D
+#define IXGBE_PACKET_TYPE_IPV4_IPV6_EXT_UDP 0X2D
+#define IXGBE_PACKET_TYPE_MAX   0X80
+#define IXGBE_PACKET_TYPE_MASK  0X7F
+#define IXGBE_PACKET_TYPE_SHIFT 0X04
+static inline uint32_t
+ixgbe_rxd_pkt_info_to_pkt_type(uint16_t pkt_info)
+{
+   static const uint32_t
+   ptype_table[IXGBE_PACKET_TYPE_MAX] __rte_cache_aligned = {
+   [IXGBE_PACKET_TYPE_IPV4] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4,
+   [IXGBE_PACKET_TYPE_IPV4_EXT] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4_EXT,
+   [IXGBE_PACKET_TYPE_IPV6] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV6,
+   [IXGBE_PACKET_TYPE_IPV4_IPV6] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+   RTE_PTYPE_INNER_L3_IPV6,
+   [IXGBE_PACKET_TYPE_IPV6_EXT] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV6_EXT,
+   [IXGBE_PACKET_TYPE_IPV4_IPV6_EXT] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+   RTE_PTYPE_INNER_L3_IPV6_EXT,
+   [IXGBE_PACKET_TYPE_IPV4_TCP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP,
+   [IXGBE_PACKET_TYPE_IPV6_TCP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV6 | RTE_PTYPE_L4_TCP,
+   [IXGBE_PACKET_TYPE_IPV4_IPV6_TCP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+   RTE_PTYPE_INNER_L3_IPV6 | RTE_PTYPE_INNER_L4_TCP,
+   [IXGBE_PACKET_TYPE_IPV6_EXT_TCP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV6_EXT | RTE_PTYPE_L4_TCP,
+   [IXGBE_PACKET_TYPE_IPV4_IPV6_EXT_TCP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+   RTE_PTYPE_INNER_L3_IPV6_EXT | RTE_PTYPE_INNER_L4_TCP,
+   [IXGBE_PACKET_TYPE_IPV4_UDP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_UDP,
+   [IXGBE_PACKET_TYPE_IPV6_UDP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV6 | RTE_PTYPE_L4_UDP,
+   [IXGBE_PACKET_TYPE_IPV4_IPV6_UDP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+   RTE_PTYPE_INNER_L3_IPV6 | RTE_PTYPE_INNER_L4_UDP,
+   [IXGBE_PACKET_TYPE_IPV6_EXT_UDP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV6_EXT | RTE_PTYPE_L4_UDP,
+   [IXGBE_PACKET_TYPE_IPV4_IPV6_EXT_UDP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+   RTE_PTYPE_INNER_L3_IPV6_EXT | RTE_PTYPE_INNER_L4_UDP,
+   [IXGBE_PACKET_TYPE_IPV4_SCTP] = RTE_PTYPE_L2_ETHER |
+

[dpdk-dev] [PATCH v9 06/19] enic: replace bit mask based packet type with unified packet type

2015-07-03 Thread Helin Zhang

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang 
---
 drivers/net/enic/enic_main.c | 26 ++
 1 file changed, 26 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 15313c2..f47e96c 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -423,7 +423,11 @@ static int enic_rq_indicate_buf(struct vnic_rq *rq,
rx_pkt->pkt_len = bytes_written;

if (ipv4) {
+#ifdef RTE_NEXT_ABI
+   rx_pkt->packet_type = RTE_PTYPE_L3_IPV4;
+#else
rx_pkt->ol_flags |= PKT_RX_IPV4_HDR;
+#endif
if (!csum_not_calc) {
if (unlikely(!ipv4_csum_ok))
rx_pkt->ol_flags |= PKT_RX_IP_CKSUM_BAD;
@@ -432,7 +436,11 @@ static int enic_rq_indicate_buf(struct vnic_rq *rq,
rx_pkt->ol_flags |= PKT_RX_L4_CKSUM_BAD;
}
} else if (ipv6)
+#ifdef RTE_NEXT_ABI
+   rx_pkt->packet_type = RTE_PTYPE_L3_IPV6;
+#else
rx_pkt->ol_flags |= PKT_RX_IPV6_HDR;
+#endif
} else {
/* Header split */
if (sop && !eop) {
@@ -445,7 +453,11 @@ static int enic_rq_indicate_buf(struct vnic_rq *rq,
*rx_pkt_bucket = rx_pkt;
rx_pkt->pkt_len = bytes_written;
if (ipv4) {
+#ifdef RTE_NEXT_ABI
+   rx_pkt->packet_type = RTE_PTYPE_L3_IPV4;
+#else
rx_pkt->ol_flags |= PKT_RX_IPV4_HDR;
+#endif
if (!csum_not_calc) {
if (unlikely(!ipv4_csum_ok))
rx_pkt->ol_flags |=
@@ -457,13 +469,22 @@ static int enic_rq_indicate_buf(struct vnic_rq *rq,
PKT_RX_L4_CKSUM_BAD;
}
} else if (ipv6)
+#ifdef RTE_NEXT_ABI
+   rx_pkt->packet_type = RTE_PTYPE_L3_IPV6;
+#else
rx_pkt->ol_flags |= PKT_RX_IPV6_HDR;
+#endif
} else {
/* Payload */
hdr_rx_pkt = *rx_pkt_bucket;
hdr_rx_pkt->pkt_len += bytes_written;
if (ipv4) {
+#ifdef RTE_NEXT_ABI
+   hdr_rx_pkt->packet_type =
+   RTE_PTYPE_L3_IPV4;
+#else
hdr_rx_pkt->ol_flags |= PKT_RX_IPV4_HDR;
+#endif
if (!csum_not_calc) {
if (unlikely(!ipv4_csum_ok))
hdr_rx_pkt->ol_flags |=
@@ -475,7 +496,12 @@ static int enic_rq_indicate_buf(struct vnic_rq *rq,
PKT_RX_L4_CKSUM_BAD;
}
} else if (ipv6)
+#ifdef RTE_NEXT_ABI
+   hdr_rx_pkt->packet_type =
+   RTE_PTYPE_L3_IPV6;
+#else
hdr_rx_pkt->ol_flags |= PKT_RX_IPV6_HDR;
+#endif

}
}
-- 
1.9.3

[dpdk-dev] [PATCH v9 07/19] vmxnet3: replace bit mask based packet type with unified packet type

2015-07-03 Thread Helin Zhang

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.
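
The diff that follows chooses between RTE_PTYPE_L3_IPV4 and RTE_PTYPE_L3_IPV4_EXT by comparing the IPv4 header length with the size of an option-less header. A standalone sketch of that check is given here; the function name and the DEMO_ macros are hypothetical, and the 20-byte constant stands in for sizeof(struct ipv4_hdr).

```c
#include <stdint.h>

#define DEMO_PTYPE_L3_IPV4     0x00000010  /* same value as RTE_PTYPE_L3_IPV4 */
#define DEMO_PTYPE_L3_IPV4_EXT 0x00000030  /* same value as RTE_PTYPE_L3_IPV4_EXT */
#define DEMO_IPV4_MIN_HDR_LEN  20u         /* IPv4 header without options */

/*
 * The low nibble of version_ihl is the header length in 32-bit words,
 * so shifting it left by 2 converts it to bytes. Anything longer than
 * 20 bytes means the header carries options.
 */
static inline uint32_t
demo_ipv4_ptype(uint8_t version_ihl)
{
	unsigned int hdr_len = (unsigned int)(version_ihl & 0x0f) << 2;

	return hdr_len > DEMO_IPV4_MIN_HDR_LEN ?
		DEMO_PTYPE_L3_IPV4_EXT : DEMO_PTYPE_L3_IPV4;
}
```

This matches the packet format notes in patch 02/19, where 'ihl'=5 describes RTE_PTYPE_L3_IPV4 and 'ihl'=[6-15] describes RTE_PTYPE_L3_IPV4_EXT.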

Signed-off-by: Helin Zhang 
---
 drivers/net/vmxnet3/vmxnet3_rxtx.c | 8 
 1 file changed, 8 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/drivers/net/vmxnet3/vmxnet3_rxtx.c b/drivers/net/vmxnet3/vmxnet3_rxtx.c
index a1eac45..25ae2f6 100644
--- a/drivers/net/vmxnet3/vmxnet3_rxtx.c
+++ b/drivers/net/vmxnet3/vmxnet3_rxtx.c
@@ -649,9 +649,17 @@ vmxnet3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
struct ipv4_hdr *ip = (struct ipv4_hdr *)(eth + 1);

if (((ip->version_ihl & 0xf) << 2) > (int)sizeof(struct ipv4_hdr))
+#ifdef RTE_NEXT_ABI
+   rxm->packet_type = RTE_PTYPE_L3_IPV4_EXT;
+#else
rxm->ol_flags |= PKT_RX_IPV4_HDR_EXT;
+#endif
else
+#ifdef RTE_NEXT_ABI
+   rxm->packet_type = RTE_PTYPE_L3_IPV4;
+#else
rxm->ol_flags |= PKT_RX_IPV4_HDR;
+#endif

if (!rcd->cnc) {
if (!rcd->ipc)
-- 
1.9.3

[dpdk-dev] [PATCH v9 05/19] i40e: replace bit mask based packet type with unified packet type

2015-07-03 Thread Helin Zhang

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/i40e_rxtx.c | 554 +++
 1 file changed, 554 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

v9 changes:
* Renamed MAC to ETHER in packet type names.

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index b2e1d6d..a608d1f 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -176,6 +176,540 @@ i40e_rxd_error_to_pkt_flags(uint64_t qword)
return flags;
 }

+#ifdef RTE_NEXT_ABI
+/* For each value it means, datasheet of hardware can tell more details */
+static inline uint32_t
+i40e_rxd_pkt_type_mapping(uint8_t ptype)
+{
+   static const uint32_t ptype_table[UINT8_MAX] __rte_cache_aligned = {
+   /* L2 types */
+   /* [0] reserved */
+   [1] = RTE_PTYPE_L2_ETHER,
+   [2] = RTE_PTYPE_L2_ETHER_TIMESYNC,
+   /* [3] - [5] reserved */
+   [6] = RTE_PTYPE_L2_ETHER_LLDP,
+   /* [7] - [10] reserved */
+   [11] = RTE_PTYPE_L2_ETHER_ARP,
+   /* [12] - [21] reserved */
+
+   /* Non tunneled IPv4 */
+   [22] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+   RTE_PTYPE_L4_FRAG,
+   [23] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+   RTE_PTYPE_L4_NONFRAG,
+   [24] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+   RTE_PTYPE_L4_UDP,
+   /* [25] reserved */
+   [26] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+   RTE_PTYPE_L4_TCP,
+   [27] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+   RTE_PTYPE_L4_SCTP,
+   [28] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+   RTE_PTYPE_L4_ICMP,
+
+   /* IPv4 --> IPv4 */
+   [29] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+   RTE_PTYPE_TUNNEL_IP |
+   RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+   RTE_PTYPE_INNER_L4_FRAG,
+   [30] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+   RTE_PTYPE_TUNNEL_IP |
+   RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+   RTE_PTYPE_INNER_L4_NONFRAG,
+   [31] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+   RTE_PTYPE_TUNNEL_IP |
+   RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+   RTE_PTYPE_INNER_L4_UDP,
+   /* [32] reserved */
+   [33] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+   RTE_PTYPE_TUNNEL_IP |
+   RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+   RTE_PTYPE_INNER_L4_TCP,
+   [34] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+   RTE_PTYPE_TUNNEL_IP |
+   RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+   RTE_PTYPE_INNER_L4_SCTP,
+   [35] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+   RTE_PTYPE_TUNNEL_IP |
+   RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+   RTE_PTYPE_INNER_L4_ICMP,
+
+   /* IPv4 --> IPv6 */
+   [36] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+   RTE_PTYPE_TUNNEL_IP |
+   RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+   RTE_PTYPE_INNER_L4_FRAG,
+   [37] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+   RTE_PTYPE_TUNNEL_IP |
+   RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+   RTE_PTYPE_INNER_L4_NONFRAG,
+   [38] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+   RTE_PTYPE_TUNNEL_IP |
+   RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+   RTE_PTYPE_INNER_L4_UDP,
+   /* [39] reserved */
+   [40] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+   RTE_PTYPE_TUNNEL_IP |
+   RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+   RTE_PTYPE_INNER_L4_TCP,
+   [41] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+

[dpdk-dev] [PATCH v9 08/19] fm10k: replace bit mask based packet type with unified packet type

2015-07-03 Thread Helin Zhang

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang 
---
 drivers/net/fm10k/fm10k_rxtx.c | 27 +++
 1 file changed, 27 insertions(+)

v4 changes:
* Supported unified packet type of fm10k from v4.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

v9 changes:
* Renamed MAC to ETHER in packet type names.

diff --git a/drivers/net/fm10k/fm10k_rxtx.c b/drivers/net/fm10k/fm10k_rxtx.c
index f5d1ad0..d3bcdca 100644
--- a/drivers/net/fm10k/fm10k_rxtx.c
+++ b/drivers/net/fm10k/fm10k_rxtx.c
@@ -68,12 +68,37 @@ static inline void dump_rxd(union fm10k_rx_desc *rxd)
 static inline void
 rx_desc_to_ol_flags(struct rte_mbuf *m, const union fm10k_rx_desc *d)
 {
+#ifdef RTE_NEXT_ABI
+   static const uint32_t
+   ptype_table[FM10K_RXD_PKTTYPE_MASK >> FM10K_RXD_PKTTYPE_SHIFT]
+   __rte_cache_aligned = {
+   [FM10K_PKTTYPE_OTHER] = RTE_PTYPE_L2_ETHER,
+   [FM10K_PKTTYPE_IPV4] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4,
+   [FM10K_PKTTYPE_IPV4_EX] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4_EXT,
+   [FM10K_PKTTYPE_IPV6] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6,
+   [FM10K_PKTTYPE_IPV6_EX] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV6_EXT,
+   [FM10K_PKTTYPE_IPV4 | FM10K_PKTTYPE_TCP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP,
+   [FM10K_PKTTYPE_IPV6 | FM10K_PKTTYPE_TCP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV6 | RTE_PTYPE_L4_TCP,
+   [FM10K_PKTTYPE_IPV4 | FM10K_PKTTYPE_UDP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_UDP,
+   [FM10K_PKTTYPE_IPV6 | FM10K_PKTTYPE_UDP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV6 | RTE_PTYPE_L4_UDP,
+   };
+
+   m->packet_type = ptype_table[(d->w.pkt_info & FM10K_RXD_PKTTYPE_MASK)
+   >> FM10K_RXD_PKTTYPE_SHIFT];
+#else /* RTE_NEXT_ABI */
uint16_t ptype;
static const uint16_t pt_lut[] = { 0,
PKT_RX_IPV4_HDR, PKT_RX_IPV4_HDR_EXT,
PKT_RX_IPV6_HDR, PKT_RX_IPV6_HDR_EXT,
0, 0, 0
};
+#endif /* RTE_NEXT_ABI */

if (d->w.pkt_info & FM10K_RXD_RSSTYPE_MASK)
m->ol_flags |= PKT_RX_RSS_HASH;
@@ -97,9 +122,11 @@ rx_desc_to_ol_flags(struct rte_mbuf *m, const union fm10k_rx_desc *d)
if (unlikely(d->d.staterr & FM10K_RXD_STATUS_RXE))
m->ol_flags |= PKT_RX_RECIP_ERR;

+#ifndef RTE_NEXT_ABI
ptype = (d->d.data & FM10K_RXD_PKTTYPE_MASK_L3) >>
FM10K_RXD_PKTTYPE_SHIFT;
m->ol_flags |= pt_lut[(uint8_t)ptype];
+#endif
 }

 uint16_t
-- 
1.9.3

[dpdk-dev] [PATCH v9 09/19] cxgbe: replace bit mask based packet type with unified packet type

2015-07-03 Thread Helin Zhang

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be enabled
by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang 
---
 drivers/net/cxgbe/sge.c | 8 
 1 file changed, 8 insertions(+)

v9 changes:
* Added unified packet type support in newly added cxgbe driver.

diff --git a/drivers/net/cxgbe/sge.c b/drivers/net/cxgbe/sge.c
index 359296e..fdae0b4 100644
--- a/drivers/net/cxgbe/sge.c
+++ b/drivers/net/cxgbe/sge.c
@@ -1326,14 +1326,22 @@ int t4_ethrx_handler(struct sge_rspq *q, const __be64 *rsp,

mbuf->port = pkt->iff;
if (pkt->l2info & htonl(F_RXF_IP)) {
+#ifdef RTE_NEXT_ABI
+   mbuf->packet_type = RTE_PTYPE_L3_IPV4;
+#else
mbuf->ol_flags |= PKT_RX_IPV4_HDR;
+#endif
if (unlikely(!csum_ok))
mbuf->ol_flags |= PKT_RX_IP_CKSUM_BAD;

if ((pkt->l2info & htonl(F_RXF_UDP | F_RXF_TCP)) && !csum_ok)
mbuf->ol_flags |= PKT_RX_L4_CKSUM_BAD;
} else if (pkt->l2info & htonl(F_RXF_IP6)) {
+#ifdef RTE_NEXT_ABI
+   mbuf->packet_type = RTE_PTYPE_L3_IPV6;
+#else
mbuf->ol_flags |= PKT_RX_IPV6_HDR;
+#endif
}

mbuf->port = pkt->iff;
-- 
1.9.3

[dpdk-dev] [PATCH v9 03/19] e1000: replace bit mask based packet type with unified packet type

2015-07-03 Thread Helin Zhang

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang 
---
 drivers/net/e1000/igb_rxtx.c | 104 +++
 1 file changed, 104 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

v9 changes:
* Renamed MAC to ETHER in packet type names.

diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
index 43d6703..165144c 100644
--- a/drivers/net/e1000/igb_rxtx.c
+++ b/drivers/net/e1000/igb_rxtx.c
@@ -590,6 +590,101 @@ eth_igb_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
  *  RX functions
  *
  **/
+#ifdef RTE_NEXT_ABI
+#define IGB_PACKET_TYPE_IPV4  0X01
+#define IGB_PACKET_TYPE_IPV4_TCP  0X11
+#define IGB_PACKET_TYPE_IPV4_UDP  0X21
+#define IGB_PACKET_TYPE_IPV4_SCTP 0X41
+#define IGB_PACKET_TYPE_IPV4_EXT  0X03
+#define IGB_PACKET_TYPE_IPV4_EXT_SCTP 0X43
+#define IGB_PACKET_TYPE_IPV6  0X04
+#define IGB_PACKET_TYPE_IPV6_TCP  0X14
+#define IGB_PACKET_TYPE_IPV6_UDP  0X24
+#define IGB_PACKET_TYPE_IPV6_EXT  0X0C
+#define IGB_PACKET_TYPE_IPV6_EXT_TCP  0X1C
+#define IGB_PACKET_TYPE_IPV6_EXT_UDP  0X2C
+#define IGB_PACKET_TYPE_IPV4_IPV6 0X05
+#define IGB_PACKET_TYPE_IPV4_IPV6_TCP 0X15
+#define IGB_PACKET_TYPE_IPV4_IPV6_UDP 0X25
+#define IGB_PACKET_TYPE_IPV4_IPV6_EXT 0X0D
+#define IGB_PACKET_TYPE_IPV4_IPV6_EXT_TCP 0X1D
+#define IGB_PACKET_TYPE_IPV4_IPV6_EXT_UDP 0X2D
+#define IGB_PACKET_TYPE_MAX   0X80
+#define IGB_PACKET_TYPE_MASK  0X7F
+#define IGB_PACKET_TYPE_SHIFT 0X04
+static inline uint32_t
+igb_rxd_pkt_info_to_pkt_type(uint16_t pkt_info)
+{
+   static const uint32_t
+   ptype_table[IGB_PACKET_TYPE_MAX] __rte_cache_aligned = {
+   [IGB_PACKET_TYPE_IPV4] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4,
+   [IGB_PACKET_TYPE_IPV4_EXT] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4_EXT,
+   [IGB_PACKET_TYPE_IPV6] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV6,
+   [IGB_PACKET_TYPE_IPV4_IPV6] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+   RTE_PTYPE_INNER_L3_IPV6,
+   [IGB_PACKET_TYPE_IPV6_EXT] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV6_EXT,
+   [IGB_PACKET_TYPE_IPV4_IPV6_EXT] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+   RTE_PTYPE_INNER_L3_IPV6_EXT,
+   [IGB_PACKET_TYPE_IPV4_TCP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP,
+   [IGB_PACKET_TYPE_IPV6_TCP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV6 | RTE_PTYPE_L4_TCP,
+   [IGB_PACKET_TYPE_IPV4_IPV6_TCP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+   RTE_PTYPE_INNER_L3_IPV6 | RTE_PTYPE_INNER_L4_TCP,
+   [IGB_PACKET_TYPE_IPV6_EXT_TCP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV6_EXT | RTE_PTYPE_L4_TCP,
+   [IGB_PACKET_TYPE_IPV4_IPV6_EXT_TCP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+   RTE_PTYPE_INNER_L3_IPV6_EXT | RTE_PTYPE_INNER_L4_TCP,
+   [IGB_PACKET_TYPE_IPV4_UDP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_UDP,
+   [IGB_PACKET_TYPE_IPV6_UDP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV6 | RTE_PTYPE_L4_UDP,
+   [IGB_PACKET_TYPE_IPV4_IPV6_UDP] =  RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+   RTE_PTYPE_INNER_L3_IPV6 | RTE_PTYPE_INNER_L4_UDP,
+   [IGB_PACKET_TYPE_IPV6_EXT_UDP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV6_EXT | RTE_PTYPE_L4_UDP,
+   [IGB_PACKET_TYPE_IPV4_IPV6_EXT_UDP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+   RTE_PTYPE_INNER_L3_IPV6_EXT | RTE_PTYPE_INNER_L4_UDP,
+   [IGB_PACKET_TYPE_IPV4_SCTP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_SCTP,
+   [IGB_PACKET_TYPE_IPV4_EXT_SCTP] = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4_E
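
The table above is cut off in this archive, so the lookup that uses it is not visible. The following is a minimal sketch of how a designated-initializer table of this kind is commonly indexed, assuming the hardware packet-info field is shifted by IGB_PACKET_TYPE_SHIFT and masked by IGB_PACKET_TYPE_MASK. The DEMO_ flag values and the function name are illustrative stand-ins, not the definitions from the patch or from rte_mbuf.h.

```c
#include <stdint.h>

/* Stand-in flag values for illustration only (assumption). */
#define DEMO_PTYPE_L2_ETHER 0x00000001u
#define DEMO_PTYPE_L3_IPV4  0x00000010u
#define DEMO_PTYPE_L3_IPV6  0x00000040u
#define DEMO_PTYPE_L4_TCP   0x00000100u

/* Index constants copied from the patch. */
#define IGB_PACKET_TYPE_IPV4     0x01
#define IGB_PACKET_TYPE_IPV4_TCP 0x11
#define IGB_PACKET_TYPE_IPV6     0x04
#define IGB_PACKET_TYPE_MAX      0x80
#define IGB_PACKET_TYPE_MASK     0x7F
#define IGB_PACKET_TYPE_SHIFT    0x04

static inline uint32_t
demo_pkt_info_to_pkt_type(uint16_t pkt_info)
{
	/* Entries that are not listed are zero-initialized ("unknown"). */
	static const uint32_t ptype_table[IGB_PACKET_TYPE_MAX] = {
		[IGB_PACKET_TYPE_IPV4] = DEMO_PTYPE_L2_ETHER |
			DEMO_PTYPE_L3_IPV4,
		[IGB_PACKET_TYPE_IPV6] = DEMO_PTYPE_L2_ETHER |
			DEMO_PTYPE_L3_IPV6,
		[IGB_PACKET_TYPE_IPV4_TCP] = DEMO_PTYPE_L2_ETHER |
			DEMO_PTYPE_L3_IPV4 | DEMO_PTYPE_L4_TCP,
	};

	/* Assumed lookup: drop the low bits, then mask so that the
	 * index always stays below IGB_PACKET_TYPE_MAX. */
	return ptype_table[(pkt_info >> IGB_PACKET_TYPE_SHIFT) &
			   IGB_PACKET_TYPE_MASK];
}
```

Because the mask is IGB_PACKET_TYPE_MAX - 1, no descriptor value can index past the end of the table, and unlisted combinations fall back to 0.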

[dpdk-dev] [PATCH v9 10/19] app/test-pipeline: replace bit mask based packet type with unified packet type

2015-07-03 Thread Helin Zhang

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang 
---
 app/test-pipeline/pipeline_hash.c | 13 +
 1 file changed, 13 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/app/test-pipeline/pipeline_hash.c 
b/app/test-pipeline/pipeline_hash.c
index 4598ad4..aa3f9e5 100644
--- a/app/test-pipeline/pipeline_hash.c
+++ b/app/test-pipeline/pipeline_hash.c
@@ -459,20 +459,33 @@ app_main_loop_rx_metadata(void) {
signature = RTE_MBUF_METADATA_UINT32_PTR(m, 0);
key = RTE_MBUF_METADATA_UINT8_PTR(m, 32);

+#ifdef RTE_NEXT_ABI
+   if (RTE_ETH_IS_IPV4_HDR(m->packet_type)) {
+#else
if (m->ol_flags & PKT_RX_IPV4_HDR) {
+#endif
ip_hdr = (struct ipv4_hdr *)
&m_data[sizeof(struct ether_hdr)];
ip_dst = ip_hdr->dst_addr;

k32 = (uint32_t *) key;
k32[0] = ip_dst & 0xFF00;
+#ifdef RTE_NEXT_ABI
+   } else if (RTE_ETH_IS_IPV6_HDR(m->packet_type)) {
+#else
} else {
+#endif
ipv6_hdr = (struct ipv6_hdr *)
&m_data[sizeof(struct ether_hdr)];
ipv6_dst = ipv6_hdr->dst_addr;

memcpy(key, ipv6_dst, 16);
+#ifdef RTE_NEXT_ABI
+   } else
+   continue;
+#else
}
+#endif

*signature = test_hash(key, 0, 0);
}
-- 
1.9.3
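
Note the behavioural difference in the hunk above: the old code treats every non-IPv4 packet as IPv6, while the RTE_NEXT_ABI code tests for IPv6 explicitly and skips anything else. A minimal sketch of that three-way classification follows. The DEMO_ macros and values are stand-ins chosen for illustration (assumptions), not the RTE_ETH_IS_IPV4_HDR / RTE_ETH_IS_IPV6_HDR definitions.

```c
#include <stdint.h>

/* Stand-in packet type bits and helpers (assumed values). */
#define DEMO_PTYPE_L3_IPV4 0x00000010u
#define DEMO_PTYPE_L3_IPV6 0x00000040u
#define DEMO_IS_IPV4_HDR(ptype) ((ptype) & DEMO_PTYPE_L3_IPV4)
#define DEMO_IS_IPV6_HDR(ptype) ((ptype) & DEMO_PTYPE_L3_IPV6)

enum demo_l3 { DEMO_L3_NONE, DEMO_L3_IPV4, DEMO_L3_IPV6 };

static enum demo_l3
demo_classify(uint32_t packet_type)
{
	if (DEMO_IS_IPV4_HDR(packet_type))
		return DEMO_L3_IPV4;
	else if (DEMO_IS_IPV6_HDR(packet_type))
		return DEMO_L3_IPV6;
	return DEMO_L3_NONE;
}

/* Count the packets that would reach the hash computation. */
static unsigned
demo_count_hashable(const uint32_t *ptypes, unsigned n)
{
	unsigned i, hashed = 0;

	for (i = 0; i < n; i++) {
		if (demo_classify(ptypes[i]) == DEMO_L3_NONE)
			continue; /* mirrors the new "else continue" branch */
		hashed++;
	}
	return hashed;
}
```

With the old `else` branch, a packet that is neither IPv4 nor IPv6 would have been read as if it carried an IPv6 header; the explicit check avoids that.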

[dpdk-dev] [PATCH v9 11/19] app/testpmd: replace bit mask based packet type with unified packet type

2015-07-03 Thread Helin Zhang

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang 
Signed-off-by: Jijiang Liu 
---
 app/test-pmd/csumonly.c |  14 
 app/test-pmd/rxonly.c   | 183 
 2 files changed, 197 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v4 changes:
* Added printing logs of packet types of each received packet in rxonly mode.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

v9 changes:
* Renamed MAC to ETHER in packet type names.

diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 4287940..1bf3485 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -202,8 +202,14 @@ parse_ethernet(struct ether_hdr *eth_hdr, struct 
testpmd_offload_info *info)

 /* Parse a vxlan header */
 static void
+#ifdef RTE_NEXT_ABI
+parse_vxlan(struct udp_hdr *udp_hdr,
+   struct testpmd_offload_info *info,
+   uint32_t pkt_type)
+#else
 parse_vxlan(struct udp_hdr *udp_hdr, struct testpmd_offload_info *info,
uint64_t mbuf_olflags)
+#endif
 {
struct ether_hdr *eth_hdr;

@@ -211,8 +217,12 @@ parse_vxlan(struct udp_hdr *udp_hdr, struct 
testpmd_offload_info *info,
 * (rfc7348) or that the rx offload flag is set (i40e only
 * currently) */
if (udp_hdr->dst_port != _htons(4789) &&
+#ifdef RTE_NEXT_ABI
+   RTE_ETH_IS_TUNNEL_PKT(pkt_type) == 0)
+#else
(mbuf_olflags & (PKT_RX_TUNNEL_IPV4_HDR |
PKT_RX_TUNNEL_IPV6_HDR)) == 0)
+#endif
return;

info->is_tunnel = 1;
@@ -549,7 +559,11 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
struct udp_hdr *udp_hdr;
udp_hdr = (struct udp_hdr *)((char *)l3_hdr +
info.l3_len);
+#ifdef RTE_NEXT_ABI
+   parse_vxlan(udp_hdr, &info, m->packet_type);
+#else
parse_vxlan(udp_hdr, &info, m->ol_flags);
+#endif
} else if (info.l4_proto == IPPROTO_GRE) {
struct simple_gre_hdr *gre_hdr;
gre_hdr = (struct simple_gre_hdr *)
diff --git a/app/test-pmd/rxonly.c b/app/test-pmd/rxonly.c
index 4a9f86e..632056d 100644
--- a/app/test-pmd/rxonly.c
+++ b/app/test-pmd/rxonly.c
@@ -91,7 +91,11 @@ pkt_burst_receive(struct fwd_stream *fs)
uint64_t ol_flags;
uint16_t nb_rx;
uint16_t i, packet_type;
+#ifdef RTE_NEXT_ABI
+   uint16_t is_encapsulation;
+#else
uint64_t is_encapsulation;
+#endif

 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
uint64_t start_tsc;
@@ -135,8 +139,12 @@ pkt_burst_receive(struct fwd_stream *fs)
ol_flags = mb->ol_flags;
packet_type = mb->packet_type;

+#ifdef RTE_NEXT_ABI
+   is_encapsulation = RTE_ETH_IS_TUNNEL_PKT(packet_type);
+#else
is_encapsulation = ol_flags & (PKT_RX_TUNNEL_IPV4_HDR |
PKT_RX_TUNNEL_IPV6_HDR);
+#endif

print_ether_addr("  src=", &eth_hdr->s_addr);
print_ether_addr(" - dst=", &eth_hdr->d_addr);
@@ -163,6 +171,177 @@ pkt_burst_receive(struct fwd_stream *fs)
if (ol_flags & PKT_RX_QINQ_PKT)
printf(" - QinQ VLAN tci=0x%x, VLAN tci outer=0x%x",
mb->vlan_tci, mb->vlan_tci_outer);
+#ifdef RTE_NEXT_ABI
+   if (mb->packet_type) {
+   uint32_t ptype;
+
+   /* (outer) L2 packet type */
+   ptype = mb->packet_type & RTE_PTYPE_L2_MASK;
+   switch (ptype) {
+   case RTE_PTYPE_L2_ETHER:
+   printf(" - (outer) L2 type: ETHER");
+   break;
+   case RTE_PTYPE_L2_ETHER_TIMESYNC:
+   printf(" - (outer) L2 type: ETHER_Timesync");
+   break;
+   case RTE_PTYPE_L2_ETHER_ARP:
+   printf(" - (outer) L2 type: ETHER_ARP");
+   break;
+   case RTE_PTYPE_L2_ETHER_LLDP:
+   printf(" - (outer) L2 type: ETHER_LLDP");
+   break;
+   default:
+   printf(" - (outer) L2 type: Unknown");
+   break;
+   }
+
+   /* (outer) L3 pac

[dpdk-dev] [PATCH v9 13/19] examples/ip_fragmentation: replace bit mask based packet type with unified packet type

2015-07-03 Thread Helin Zhang

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang 
---
 examples/ip_fragmentation/main.c | 9 +
 1 file changed, 9 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/examples/ip_fragmentation/main.c b/examples/ip_fragmentation/main.c
index 0922ba6..b71d05f 100644
--- a/examples/ip_fragmentation/main.c
+++ b/examples/ip_fragmentation/main.c
@@ -283,7 +283,11 @@ l3fwd_simple_forward(struct rte_mbuf *m, struct 
lcore_queue_conf *qconf,
len = qconf->tx_mbufs[port_out].len;

/* if this is an IPv4 packet */
+#ifdef RTE_NEXT_ABI
+   if (RTE_ETH_IS_IPV4_HDR(m->packet_type)) {
+#else
if (m->ol_flags & PKT_RX_IPV4_HDR) {
+#endif
struct ipv4_hdr *ip_hdr;
uint32_t ip_dst;
/* Read the lookup key (i.e. ip_dst) from the input packet */
@@ -317,9 +321,14 @@ l3fwd_simple_forward(struct rte_mbuf *m, struct 
lcore_queue_conf *qconf,
if (unlikely (len2 < 0))
return;
}
+#ifdef RTE_NEXT_ABI
+   } else if (RTE_ETH_IS_IPV6_HDR(m->packet_type)) {
+   /* if this is an IPv6 packet */
+#else
}
/* if this is an IPv6 packet */
else if (m->ol_flags & PKT_RX_IPV6_HDR) {
+#endif
struct ipv6_hdr *ip_hdr;

ipv6 = 1;
-- 
1.9.3

[dpdk-dev] [PATCH v9 14/19] examples/ip_reassembly: replace bit mask based packet type with unified packet type

2015-07-03 Thread Helin Zhang

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang 
---
 examples/ip_reassembly/main.c | 9 +
 1 file changed, 9 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/examples/ip_reassembly/main.c b/examples/ip_reassembly/main.c
index 9ecb6f9..f1c47ad 100644
--- a/examples/ip_reassembly/main.c
+++ b/examples/ip_reassembly/main.c
@@ -356,7 +356,11 @@ reassemble(struct rte_mbuf *m, uint8_t portid, uint32_t 
queue,
dst_port = portid;

/* if packet is IPv4 */
+#ifdef RTE_NEXT_ABI
+   if (RTE_ETH_IS_IPV4_HDR(m->packet_type)) {
+#else
if (m->ol_flags & (PKT_RX_IPV4_HDR)) {
+#endif
struct ipv4_hdr *ip_hdr;
uint32_t ip_dst;

@@ -396,9 +400,14 @@ reassemble(struct rte_mbuf *m, uint8_t portid, uint32_t 
queue,
}

eth_hdr->ether_type = rte_be_to_cpu_16(ETHER_TYPE_IPv4);
+#ifdef RTE_NEXT_ABI
+   } else if (RTE_ETH_IS_IPV6_HDR(m->packet_type)) {
+   /* if packet is IPv6 */
+#else
}
/* if packet is IPv6 */
else if (m->ol_flags & (PKT_RX_IPV6_HDR | PKT_RX_IPV6_HDR_EXT)) {
+#endif
struct ipv6_extension_fragment *frag_hdr;
struct ipv6_hdr *ip_hdr;

-- 
1.9.3

[dpdk-dev] [PATCH v9 12/19] app/test: Remove useless code

2015-07-03 Thread Helin Zhang

Several useless lines of code were added accidentally, and they block
packet type unification. They should be removed.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang 
---
 app/test/packet_burst_generator.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

v4 changes:
* Removed several useless code lines which block packet type unification.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/app/test/packet_burst_generator.c 
b/app/test/packet_burst_generator.c
index 28d9e25..d9d808b 100644
--- a/app/test/packet_burst_generator.c
+++ b/app/test/packet_burst_generator.c
@@ -273,19 +273,21 @@ nomore_mbuf:
if (ipv4) {
pkt->vlan_tci  = ETHER_TYPE_IPv4;
pkt->l3_len = sizeof(struct ipv4_hdr);
-
+#ifndef RTE_NEXT_ABI
if (vlan_enabled)
pkt->ol_flags = PKT_RX_IPV4_HDR | PKT_RX_VLAN_PKT;
else
pkt->ol_flags = PKT_RX_IPV4_HDR;
+#endif
} else {
pkt->vlan_tci  = ETHER_TYPE_IPv6;
pkt->l3_len = sizeof(struct ipv6_hdr);
-
+#ifndef RTE_NEXT_ABI
if (vlan_enabled)
pkt->ol_flags = PKT_RX_IPV6_HDR | PKT_RX_VLAN_PKT;
else
pkt->ol_flags = PKT_RX_IPV6_HDR;
+#endif
}

pkts_burst[nb_pkt] = pkt;
-- 
1.9.3

[dpdk-dev] [PATCH v9 15/19] examples/l3fwd-acl: replace bit mask based packet type with unified packet type

2015-07-03 Thread Helin Zhang

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang 
---
 examples/l3fwd-acl/main.c | 29 +++--
 1 file changed, 23 insertions(+), 6 deletions(-)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/examples/l3fwd-acl/main.c b/examples/l3fwd-acl/main.c
index 29cb25e..b2bdf2f 100644
--- a/examples/l3fwd-acl/main.c
+++ b/examples/l3fwd-acl/main.c
@@ -645,10 +645,13 @@ prepare_one_packet(struct rte_mbuf **pkts_in, struct 
acl_search_t *acl,
struct ipv4_hdr *ipv4_hdr;
struct rte_mbuf *pkt = pkts_in[index];

+#ifdef RTE_NEXT_ABI
+   if (RTE_ETH_IS_IPV4_HDR(pkt->packet_type)) {
+#else
int type = pkt->ol_flags & (PKT_RX_IPV4_HDR | PKT_RX_IPV6_HDR);

if (type == PKT_RX_IPV4_HDR) {
-
+#endif
ipv4_hdr = rte_pktmbuf_mtod_offset(pkt, struct ipv4_hdr *,
   sizeof(struct ether_hdr));

@@ -667,9 +670,11 @@ prepare_one_packet(struct rte_mbuf **pkts_in, struct 
acl_search_t *acl,
/* Not a valid IPv4 packet */
rte_pktmbuf_free(pkt);
}
-
+#ifdef RTE_NEXT_ABI
+   } else if (RTE_ETH_IS_IPV6_HDR(pkt->packet_type)) {
+#else
} else if (type == PKT_RX_IPV6_HDR) {
-
+#endif
/* Fill acl structure */
acl->data_ipv6[acl->num_ipv6] = MBUF_IPV6_2PROTO(pkt);
acl->m_ipv6[(acl->num_ipv6)++] = pkt;
@@ -687,17 +692,22 @@ prepare_one_packet(struct rte_mbuf **pkts_in, struct 
acl_search_t *acl,
 {
struct rte_mbuf *pkt = pkts_in[index];

+#ifdef RTE_NEXT_ABI
+   if (RTE_ETH_IS_IPV4_HDR(pkt->packet_type)) {
+#else
int type = pkt->ol_flags & (PKT_RX_IPV4_HDR | PKT_RX_IPV6_HDR);

if (type == PKT_RX_IPV4_HDR) {
-
+#endif
/* Fill acl structure */
acl->data_ipv4[acl->num_ipv4] = MBUF_IPV4_2PROTO(pkt);
acl->m_ipv4[(acl->num_ipv4)++] = pkt;

-
+#ifdef RTE_NEXT_ABI
+   } else if (RTE_ETH_IS_IPV6_HDR(pkt->packet_type)) {
+#else
} else if (type == PKT_RX_IPV6_HDR) {
-
+#endif
/* Fill acl structure */
acl->data_ipv6[acl->num_ipv6] = MBUF_IPV6_2PROTO(pkt);
acl->m_ipv6[(acl->num_ipv6)++] = pkt;
@@ -745,10 +755,17 @@ send_one_packet(struct rte_mbuf *m, uint32_t res)
/* in the ACL list, drop it */
 #ifdef L3FWDACL_DEBUG
if ((res & ACL_DENY_SIGNATURE) != 0) {
+#ifdef RTE_NEXT_ABI
+   if (RTE_ETH_IS_IPV4_HDR(m->packet_type))
+   dump_acl4_rule(m, res);
+   else if (RTE_ETH_IS_IPV6_HDR(m->packet_type))
+   dump_acl6_rule(m, res);
+#else
if (m->ol_flags & PKT_RX_IPV4_HDR)
dump_acl4_rule(m, res);
else
dump_acl6_rule(m, res);
+#endif /* RTE_NEXT_ABI */
}
 #endif
rte_pktmbuf_free(m);
-- 
1.9.3

[dpdk-dev] [PATCH v9 16/19] examples/l3fwd-power: replace bit mask based packet type with unified packet type

2015-07-03 Thread Helin Zhang

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang 
---
 examples/l3fwd-power/main.c | 8 
 1 file changed, 8 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index d4eba1a..dbbebdd 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -635,7 +635,11 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid,

eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);

+#ifdef RTE_NEXT_ABI
+   if (RTE_ETH_IS_IPV4_HDR(m->packet_type)) {
+#else
if (m->ol_flags & PKT_RX_IPV4_HDR) {
+#endif
/* Handle IPv4 headers.*/
ipv4_hdr =
rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *,
@@ -670,8 +674,12 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid,
ether_addr_copy(&ports_eth_addr[dst_port], &eth_hdr->s_addr);

send_single_packet(m, dst_port);
+#ifdef RTE_NEXT_ABI
+   } else if (RTE_ETH_IS_IPV6_HDR(m->packet_type)) {
+#else
}
else {
+#endif
/* Handle IPv6 headers.*/
 #if (APP_LOOKUP_METHOD == APP_LOOKUP_EXACT_MATCH)
struct ipv6_hdr *ipv6_hdr;
-- 
1.9.3

[dpdk-dev] [PATCH v9 17/19] examples/l3fwd: replace bit mask based packet type with unified packet type

2015-07-03 Thread Helin Zhang

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang 
---
 examples/l3fwd/main.c | 123 --
 1 file changed, 120 insertions(+), 3 deletions(-)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v3 changes:
* Minor bug fixes and enhancements.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
index 5c22ed1..b1bcb35 100644
--- a/examples/l3fwd/main.c
+++ b/examples/l3fwd/main.c
@@ -939,7 +939,11 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid, 
struct lcore_conf *qcon

eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);

+#ifdef RTE_NEXT_ABI
+   if (RTE_ETH_IS_IPV4_HDR(m->packet_type)) {
+#else
if (m->ol_flags & PKT_RX_IPV4_HDR) {
+#endif
/* Handle IPv4 headers.*/
ipv4_hdr = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *,
   sizeof(struct ether_hdr));
@@ -970,8 +974,11 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid, 
struct lcore_conf *qcon
ether_addr_copy(&ports_eth_addr[dst_port], &eth_hdr->s_addr);

send_single_packet(m, dst_port);
-
+#ifdef RTE_NEXT_ABI
+   } else if (RTE_ETH_IS_IPV6_HDR(m->packet_type)) {
+#else
} else {
+#endif
/* Handle IPv6 headers.*/
struct ipv6_hdr *ipv6_hdr;

@@ -990,8 +997,13 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid, 
struct lcore_conf *qcon
ether_addr_copy(&ports_eth_addr[dst_port], &eth_hdr->s_addr);

send_single_packet(m, dst_port);
+#ifdef RTE_NEXT_ABI
+   } else
+   /* Free the mbuf that contains non-IPV4/IPV6 packet */
+   rte_pktmbuf_free(m);
+#else
}
-
+#endif
 }

 #ifdef DO_RFC_1812_CHECKS
@@ -1015,12 +1027,19 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t 
portid, struct lcore_conf *qcon
  * to BAD_PORT value.
  */
 static inline __attribute__((always_inline)) void
+#ifdef RTE_NEXT_ABI
+rfc1812_process(struct ipv4_hdr *ipv4_hdr, uint16_t *dp, uint32_t ptype)
+#else
 rfc1812_process(struct ipv4_hdr *ipv4_hdr, uint16_t *dp, uint32_t flags)
+#endif
 {
uint8_t ihl;

+#ifdef RTE_NEXT_ABI
+   if (RTE_ETH_IS_IPV4_HDR(ptype)) {
+#else
if ((flags & PKT_RX_IPV4_HDR) != 0) {
-
+#endif
ihl = ipv4_hdr->version_ihl - IPV4_MIN_VER_IHL;

ipv4_hdr->time_to_live--;
@@ -1050,11 +1069,19 @@ get_dst_port(const struct lcore_conf *qconf, struct 
rte_mbuf *pkt,
struct ipv6_hdr *ipv6_hdr;
struct ether_hdr *eth_hdr;

+#ifdef RTE_NEXT_ABI
+   if (RTE_ETH_IS_IPV4_HDR(pkt->packet_type)) {
+#else
if (pkt->ol_flags & PKT_RX_IPV4_HDR) {
+#endif
if (rte_lpm_lookup(qconf->ipv4_lookup_struct, dst_ipv4,
&next_hop) != 0)
next_hop = portid;
+#ifdef RTE_NEXT_ABI
+   } else if (RTE_ETH_IS_IPV6_HDR(pkt->packet_type)) {
+#else
} else if (pkt->ol_flags & PKT_RX_IPV6_HDR) {
+#endif
eth_hdr = rte_pktmbuf_mtod(pkt, struct ether_hdr *);
ipv6_hdr = (struct ipv6_hdr *)(eth_hdr + 1);
if (rte_lpm6_lookup(qconf->ipv6_lookup_struct,
@@ -1088,12 +1115,52 @@ process_packet(struct lcore_conf *qconf, struct 
rte_mbuf *pkt,
ve = val_eth[dp];

dst_port[0] = dp;
+#ifdef RTE_NEXT_ABI
+   rfc1812_process(ipv4_hdr, dst_port, pkt->packet_type);
+#else
rfc1812_process(ipv4_hdr, dst_port, pkt->ol_flags);
+#endif

te =  _mm_blend_epi16(te, ve, MASK_ETH);
_mm_store_si128((__m128i *)eth_hdr, te);
 }

+#ifdef RTE_NEXT_ABI
+/*
+ * Read packet_type and destination IPV4 addresses from 4 mbufs.
+ */
+static inline void
+processx4_step1(struct rte_mbuf *pkt[FWDSTEP],
+   __m128i *dip,
+   uint32_t *ipv4_flag)
+{
+   struct ipv4_hdr *ipv4_hdr;
+   struct ether_hdr *eth_hdr;
+   uint32_t x0, x1, x2, x3;
+
+   eth_hdr = rte_pktmbuf_mtod(pkt[0], struct ether_hdr *);
+   ipv4_hdr = (struct ipv4_hdr *)(eth_hdr + 1);
+   x0 = ipv4_hdr->dst_addr;
+   ipv4_flag[0] = pkt[0]->packet_type & RTE_PTYPE_L3_IPV4;
+
+   eth_hdr = rte_pktmbuf_mtod(pkt[1], struct ether_hdr *);
+   ipv4_hdr = (struct ipv4_hdr *)(eth_hdr + 1);
+   x1 = ipv4_hdr->dst_addr;
+   ipv4_flag[0] &= pkt[1]->packet_type;
+
+   eth_hdr = rte_pktmbuf_mtod(pkt[2], struct ether_hdr *);
+   ipv4_hdr = (struct ipv4_hdr *)(eth_hdr + 1);
+   x2 = ipv4_hdr->dst_addr;
+   ipv4_flag[0] &
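
The hunk above is cut off in this archive after the third packet. The visible lines suggest an AND-reduction: the IPv4 bit of the first packet is ANDed with the packet types of the others, so the result stays non-zero only if every packet in the group carries that bit. A minimal sketch of the idea follows, assuming the fourth packet is handled like the second and third. DEMO_FWDSTEP and DEMO_PTYPE_L3_IPV4 are stand-ins with assumed values.

```c
#include <stdint.h>

#define DEMO_FWDSTEP       4            /* packets handled per step (assumed) */
#define DEMO_PTYPE_L3_IPV4 0x00000010u  /* stand-in for the IPv4 bit */

/* Return a non-zero value only if all packets carry the IPv4 bit. */
static uint32_t
demo_all_ipv4(const uint32_t ptype[DEMO_FWDSTEP])
{
	uint32_t flag = ptype[0] & DEMO_PTYPE_L3_IPV4;
	int i;

	/* One packet without the bit clears the shared flag. */
	for (i = 1; i < DEMO_FWDSTEP; i++)
		flag &= ptype[i];

	return flag;
}
```

A single flag for the whole group lets the caller choose between a fast path for four IPv4 packets and a per-packet fallback with one test.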

[dpdk-dev] [PATCH v9 18/19] examples/tep_termination: replace bit mask based packet type with unified packet type

2015-07-03 Thread Helin Zhang

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be enabled
by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang 
---
 examples/tep_termination/vxlan.c | 4 
 1 file changed, 4 insertions(+)

v9 changes:
* Used unified packet type to check if it is a VXLAN packet, included in
  RTE_NEXT_ABI which is disabled by default.

diff --git a/examples/tep_termination/vxlan.c b/examples/tep_termination/vxlan.c
index b2a2f53..ae4bc9e 100644
--- a/examples/tep_termination/vxlan.c
+++ b/examples/tep_termination/vxlan.c
@@ -180,8 +180,12 @@ decapsulation(struct rte_mbuf *pkt)
 * (rfc7348) or that the rx offload flag is set (i40e only
 * currently)*/
if (udp_hdr->dst_port != rte_cpu_to_be_16(DEFAULT_VXLAN_PORT) &&
+#ifdef RTE_NEXT_ABI
+   ((pkt->packet_type & RTE_PTYPE_TUNNEL_MASK) == 0)
+#else
(pkt->ol_flags & (PKT_RX_TUNNEL_IPV4_HDR |
PKT_RX_TUNNEL_IPV6_HDR)) == 0)
+#endif
return -1;
outer_header_len = info.outer_l2_len + info.outer_l3_len
+ sizeof(struct udp_hdr) + sizeof(struct vxlan_hdr);
-- 
1.9.3
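
The check in this patch can be read as: give up on decapsulation only when the UDP destination port is not the default VXLAN port and the packet type carries no tunnel bits. A minimal sketch of that condition follows. DEMO_PTYPE_TUNNEL_MASK and its value are stand-ins (assumptions), and the port is compared in host byte order here for simplicity, whereas the patch compares in network byte order via rte_cpu_to_be_16().

```c
#include <stdint.h>

#define DEMO_PTYPE_TUNNEL_MASK 0x0000f000u /* stand-in tunnel bits (assumed) */
#define DEMO_VXLAN_PORT        4789        /* default VXLAN port, RFC 7348 */

/* Return 1 if the packet should be treated as VXLAN, 0 otherwise. */
static int
demo_is_vxlan_candidate(uint16_t dst_port_host_order, uint32_t packet_type)
{
	if (dst_port_host_order != DEMO_VXLAN_PORT &&
	    (packet_type & DEMO_PTYPE_TUNNEL_MASK) == 0)
		return 0; /* neither the port nor the hardware says "tunnel" */

	return 1;
}
```

Either signal is sufficient: a packet on a non-default port is still accepted when the NIC has already flagged it as tunnelled.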

[dpdk-dev] [PATCH v9 19/19] mbuf: remove old packet type bit masks

2015-07-03 Thread Helin Zhang

As unified packet types are used instead, those old bit masks and
the relevant macros for packet type indication need to be removed.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang 
---
 lib/librte_mbuf/rte_mbuf.c | 4 
 lib/librte_mbuf/rte_mbuf.h | 4 
 2 files changed, 8 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.
* Redefined the bit masks for packet RX offload flags.

v5 changes:
* Rolled back the bit masks of RX flags, for ABI compatibility.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index f506517..4320dd4 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -251,14 +251,18 @@ const char *rte_get_rx_ol_flag_name(uint64_t mask)
/* case PKT_RX_HBUF_OVERFLOW: return "PKT_RX_HBUF_OVERFLOW"; */
/* case PKT_RX_RECIP_ERR: return "PKT_RX_RECIP_ERR"; */
/* case PKT_RX_MAC_ERR: return "PKT_RX_MAC_ERR"; */
+#ifndef RTE_NEXT_ABI
case PKT_RX_IPV4_HDR: return "PKT_RX_IPV4_HDR";
case PKT_RX_IPV4_HDR_EXT: return "PKT_RX_IPV4_HDR_EXT";
case PKT_RX_IPV6_HDR: return "PKT_RX_IPV6_HDR";
case PKT_RX_IPV6_HDR_EXT: return "PKT_RX_IPV6_HDR_EXT";
+#endif /* RTE_NEXT_ABI */
case PKT_RX_IEEE1588_PTP: return "PKT_RX_IEEE1588_PTP";
case PKT_RX_IEEE1588_TMST: return "PKT_RX_IEEE1588_TMST";
+#ifndef RTE_NEXT_ABI
case PKT_RX_TUNNEL_IPV4_HDR: return "PKT_RX_TUNNEL_IPV4_HDR";
case PKT_RX_TUNNEL_IPV6_HDR: return "PKT_RX_TUNNEL_IPV6_HDR";
+#endif /* RTE_NEXT_ABI */
default: return NULL;
}
 }
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 3a17d95..b90c73f 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -92,14 +92,18 @@ extern "C" {
 #define PKT_RX_HBUF_OVERFLOW (0ULL << 0)  /**< Header buffer overflow. */
 #define PKT_RX_RECIP_ERR (0ULL << 0)  /**< Hardware processing error. */
 #define PKT_RX_MAC_ERR   (0ULL << 0)  /**< MAC error. */
+#ifndef RTE_NEXT_ABI
 #define PKT_RX_IPV4_HDR  (1ULL << 5)  /**< RX packet with IPv4 header. */
 #define PKT_RX_IPV4_HDR_EXT  (1ULL << 6)  /**< RX packet with extended IPv4 header. */
 #define PKT_RX_IPV6_HDR  (1ULL << 7)  /**< RX packet with IPv6 header. */
 #define PKT_RX_IPV6_HDR_EXT  (1ULL << 8)  /**< RX packet with extended IPv6 header. */
+#endif /* RTE_NEXT_ABI */
 #define PKT_RX_IEEE1588_PTP  (1ULL << 9)  /**< RX IEEE1588 L2 Ethernet PT Packet. */
 #define PKT_RX_IEEE1588_TMST (1ULL << 10) /**< RX IEEE1588 L2/L4 timestamped packet.*/
+#ifndef RTE_NEXT_ABI
 #define PKT_RX_TUNNEL_IPV4_HDR (1ULL << 11) /**< RX tunnel packet with IPv4 header.*/
 #define PKT_RX_TUNNEL_IPV6_HDR (1ULL << 12) /**< RX tunnel packet with IPv6 header. */
+#endif /* RTE_NEXT_ABI */
 #define PKT_RX_FDIR_ID   (1ULL << 13) /**< FD id reported if FDIR match. */
 #define PKT_RX_FDIR_FLX  (1ULL << 14) /**< Flexible bytes reported if FDIR match. */
 #define PKT_RX_QINQ_PKT  (1ULL << 15)  /**< RX packet with double VLAN stripped. */
-- 
1.9.3

[dpdk-dev] Could not achieve wire speed for 40GE with any DPDK version on XL710 NIC's

2015-07-03 Thread Pavel Odintsov

Hello, folks!

We have found the root of the issue.

Intel does not offer wire speed for 64-byte packets on the XL710 at all.

As mentioned in the data sheet
http://www.intel.ru/content/dam/www/public/us/en/documents/product-briefs/xl710-10-40-gbe-controller-brief.pdf
we have:

Small packet performance: Maintains wire-rate throughput on smaller
payload sizes (>128 Bytes for 40 GbE and >64 Bytes for 10 GbE)

Could anybody recommend NICs which can truly achieve wire rate for 40GE?


On Wed, Jul 1, 2015 at 9:01 PM, Anuj Kalia  wrote:
> Thanks for the comments.
>
> On Wed, Jul 1, 2015 at 1:32 PM, Vladimir Medvedkin  
> wrote:
>> Hi Anuj,
>>
>> Thanks for fixes!
>> I have 2 comments
>> - from i40e_ethdev.h : #define I40E_DEFAULT_RX_WTHRESH  0
>> - (26 + 32) / 4 (batched descriptor writeback) should be (26 + 4 * 32) / 4
>> (batched descriptor writeback)
>> , thus we have 135 bytes/packet
>>
>> This corresponds to 58.8 Mpps
>>
>> Regards,
>> Vladimir
>>
>> 2015-07-01 17:22 GMT+03:00 Anuj Kalia :
>>>
>>> Vladimir,
>>>
>>> Few possible fixes to your PCIe analysis (let me know if I'm wrong):
>>> - ECRC is probably disabled (check using sudo lspci -vvv | grep
>>> CGenEn-), so TLP header is 26 bytes
>>> - Descriptor writeback can be batched using high value of WTHRESH,
>>> which is what DPDK uses by default
>>> - Read request contains full TLP header (26 bytes)
>>>
>>> Assuming WTHRESH = 4, bytes transferred from NIC to host per packet =
>>> 26 + 64 (packet itself) +
>>> (26 + 32) / 4 (batched descriptor writeback) +
>>> (26 / 4) (read request for new descriptors) =
>>> 111 bytes / packet
>>>
>>> This corresponds to 70.9 Mpps over PCIe 3.0 x8. Assuming 5% DLLP
>>> overhead, rate = 67.4 Mpps
>>>
>>> --Anuj
>>>
>>>
>>>
>>> On Wed, Jul 1, 2015 at 9:40 AM, Vladimir Medvedkin 
>>> wrote:
>>> > In case with syn flood you should take into account return syn-ack traffic,
>>> > which generates PCIe DLLP's from NIC to host, thus pcie bandwidth exceeds
>>> > faster. And don't forget about DLLP's generated by rx traffic, which
>>> > saturates host-to-NIC bus.
>>> >
>>> > 2015-07-01 16:05 GMT+03:00 Pavel Odintsov :
>>> >
>>> >> Yes, Bruce, we understand this. But we are working with huge SYN
>>> >> attacks processing and they are 64byte only :(
>>> >>
>>> >> On Wed, Jul 1, 2015 at 3:59 PM, Bruce Richardson
>>> >>  wrote:
>>> >> > On Wed, Jul 01, 2015 at 03:44:57PM +0300, Pavel Odintsov wrote:
>>> >> >> Thanks for answer, Vladimir! So we need look for x16 NIC if we want
>>> >> >> achieve 40GE line rate...
>>> >> >>
>>> >> > Note that this would only apply for your minimal i.e. 64-byte, packet sizes.
>>> >> > Once you go up to larger e.g. 128B packets, your PCI bandwidth requirements
>>> >> > are lower and you can easier achieve line rate.
>>> >> >
>>> >> > /Bruce
>>> >> >
>>> >> >> On Wed, Jul 1, 2015 at 3:06 PM, Vladimir Medvedkin <medvedkinv at gmail.com> wrote:
>>> >> >> > Hi Pavel,
>>> >> >> >
>>> >> >> > Looks like you ran into pcie bottleneck. So let's calculate xl710 rx only
>>> >> >> > case.
>>> >> >> > Assume we have 32byte descriptors (if we want more offload).
>>> >> >> > DMA makes one pcie transaction with packet payload, one descriptor writeback
>>> >> >> > and one memory request for free descriptors for every 4 packets. For
>>> >> >> > Transaction Layer Packet (TLP) there is 30 bytes overhead (4 PHY + 6 DLL +
>>> >> >> > 16 header + 4 ECRC). So for 1 rx packet dma sends 30 + 64(packet itself) +
>>> >> >> > 30 + 32 (writeback descriptor) + (16 / 4) (read request for new
>>> >> >> > descriptors). Note that we do not take into account PCIe ACK/NACK/FC Update
>>> >> >> > DLLP. So we have 160 bytes per packet. One lane PCIe 3.0 transmits 1 byte in
>>> >> >> > 1 ns, so x8 transmits 8 bytes  in 1 ns. 1 packet transmits in 20 ns. Thus
>>> >> >> > in theory pcie 3.0 x8 may transfer not more than 50mpps.
>>> >> >> > Correct me if I'm wrong.
>>> >> >> >
>>> >> >> > Regards,
>>> >> >> > Vladimir
>>> >> >> >
>>> >> >> >
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Sincerely yours, Pavel Odintsov
>>> >>
>>
>>



-- 
Sincerely yours, Pavel Odintsov
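
The three per-packet estimates quoted in this thread (160, 111 and 135 bytes) all follow the same shape: packet TLP, plus an amortized descriptor writeback, plus an amortized read request. The sketch below reproduces that arithmetic. The struct, field and function names are hypothetical, and the link rate is left as a parameter because the thread uses both the round figure of 8 bytes/ns for x8 and, apparently, a slightly lower effective rate.

```c
#include <assert.h>

/* Inputs of the per-packet PCIe cost model used in the thread. */
struct demo_pcie_model {
	double tlp_overhead;        /* TLP framing bytes per transaction */
	double pkt_len;             /* packet payload bytes */
	double desc_len;            /* descriptor size in bytes */
	double descs_per_writeback; /* descriptors carried by one writeback */
	double pkts_per_writeback;  /* packets sharing one writeback TLP */
	double read_req_bytes;      /* bytes of one descriptor read request */
	double pkts_per_read;       /* packets sharing one read request */
};

/* Bytes moved from NIC to host for one received packet. */
static double
demo_bytes_per_packet(const struct demo_pcie_model *m)
{
	assert(m->pkts_per_writeback > 0 && m->pkts_per_read > 0);

	return m->tlp_overhead + m->pkt_len +
	       (m->tlp_overhead + m->descs_per_writeback * m->desc_len) /
		       m->pkts_per_writeback +
	       m->read_req_bytes / m->pkts_per_read;
}

/* Packets per second in millions, for a link rate given in bytes/ns. */
static double
demo_mpps(double bytes_per_packet, double link_bytes_per_ns)
{
	/* 1 packet/ns equals 1000 Mpps. */
	return link_bytes_per_ns * 1000.0 / bytes_per_packet;
}
```

With {30, 64, 32, 1, 1, 16, 4} the model gives 160 bytes and, at 8 bytes/ns, 50 Mpps, as in the first estimate. With {26, 64, 32, 1, 4, 26, 4} it gives 111 bytes, and with {26, 64, 32, 4, 4, 26, 4} it gives 135 bytes. Assuming an effective x8 rate of 8 * 128 / 130 bytes/ns (128b/130b encoding, an assumption not stated in the thread), 111 bytes works out to about 70.9 Mpps, or about 67.4 Mpps after the 5% DLLP allowance. Under the same assumption 135 bytes gives about 58.3 Mpps, close to but not exactly the 58.8 Mpps quoted, so a slightly different rate may have been used there.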

[dpdk-dev] [PATCH v2] doc: announce ABI changes planned for unified packet type

2015-07-03 Thread Helin Zhang

Significant ABI changes are planned for the unified packet type,
which will be supported from release 2.2. This patch announces those
ABI changes in detail.

Signed-off-by: Helin Zhang 
---
 doc/guides/rel_notes/abi.rst | 1 +
 1 file changed, 1 insertion(+)

v2 changes:
* Added 'struct rte_kni_mbuf' to the ABI change announcement.

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 110c486..4328367 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -12,3 +12,4 @@ Examples of Deprecation Notices

 Deprecation Notices
 ---
+* Significant ABI changes are planned for struct rte_mbuf, struct rte_kni_mbuf, and several PKT_RX_ flags will be removed, to support unified packet type from release 2.2. The upcoming release 2.1 will not have those changes. There is no backward compatibility planned from release 2.2. All binaries will need to be rebuilt from release 2.2.
-- 
1.9.3

[dpdk-dev] [PATCH v7 03/12] eal: Fix memory leaks and needless increment of pci_map_addr

2015-07-03 Thread Tetsuya Mukawa

On 2015/07/02 18:57, Bruce Richardson wrote:
> On Tue, Jun 30, 2015 at 05:24:19PM +0900, Tetsuya Mukawa wrote:
>> ---
>>  lib/librte_eal/bsdapp/eal/eal_pci.c   | 14 ++--
>>  lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 56 
>> ---
>>  2 files changed, 48 insertions(+), 22 deletions(-)
>>
>> diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c 
>> b/lib/librte_eal/bsdapp/eal/eal_pci.c
>> index 8e24fd1..b071f07 100644
>> --- a/lib/librte_eal/bsdapp/eal/eal_pci.c
>> +++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
>> @@ -235,7 +235,7 @@ pci_uio_map_resource(struct rte_pci_device *dev)
>>  if ((uio_res = rte_zmalloc("UIO_RES", sizeof (*uio_res), 0)) == NULL) {
>>  RTE_LOG(ERR, EAL,
>>  "%s(): cannot store uio mmap details\n", __func__);
>> -return -1;
>> +goto close_fd;
>>  }
>>  
>>  snprintf(uio_res->path, sizeof(uio_res->path), "%s", devname);
>> @@ -262,8 +262,7 @@ pci_uio_map_resource(struct rte_pci_device *dev)
>>  (mapaddr = pci_map_resource(NULL, devname, (off_t)offset,
>>  (size_t)maps[j].size)
>>  ) == NULL) {
>> -rte_free(uio_res);
>> -return -1;
>> +goto free_uio_res;
>>  }
>>  
>>  maps[j].addr = mapaddr;
>> @@ -274,6 +273,15 @@ pci_uio_map_resource(struct rte_pci_device *dev)
>>  TAILQ_INSERT_TAIL(uio_res_list, uio_res, next);
>>  
>>  return 0;
>> +
>> +free_uio_res:
>> +rte_free(uio_res);
> If you initialize uio_res to NULL when it is defined, you can get rid of the
> distinction between "free_uio_res" and "close_fd" labels.
>
> Similarly, if you check for a valid dev->intr_handle.fd value before close,
> you can have a generic "error" leg, and replace all return -1's with goto
> error. While not as useful as merging the labels, it might be something to
> consider.

Hi Bruce,

Thanks for the comments.
Sure, I will clean up my patch as you suggested. It will be easier to
read.

>> @@ -368,23 +373,15 @@ pci_uio_map_resource(struct rte_pci_device *dev)
>>  
>>  mapaddr = pci_map_resource(pci_map_addr, fd, 0,
>>  (size_t)dev->mem_resource[i].len, 0);
>> -if (mapaddr == MAP_FAILED)
>> -fail = 1;
>> +close(fd);
>> +if (mapaddr == MAP_FAILED) {
>> +rte_free(maps[map_idx].path);
>> +goto free_uio_res;
>> +}
>>  
>>  pci_map_addr = RTE_PTR_ADD(mapaddr,
>>  (size_t)dev->mem_resource[i].len);
>>  
>> -maps[map_idx].path = rte_malloc(NULL, strlen(devname) + 1, 0);
>> -if (maps[map_idx].path == NULL)
>> -fail = 1;
>> -
>> -if (fail) {
>> -rte_free(uio_res);
>> -close(fd);
>> -return -1;
>> -}
>> -close(fd);
>> -
>>  maps[map_idx].phaddr = dev->mem_resource[i].phys_addr;
>>  maps[map_idx].size = dev->mem_resource[i].len;
>>  maps[map_idx].addr = mapaddr;
>> @@ -399,6 +396,25 @@ pci_uio_map_resource(struct rte_pci_device *dev)
>>  TAILQ_INSERT_TAIL(uio_res_list, uio_res, next);
>>  
>>  return 0;
>> +
>> +free_uio_res:
>> +for (i = 0; i < map_idx; i++) {
> If you move the initialization of map_idx = 0 from the "for" loop to the
> definition at the start of the function, you may again be able to merge
> these two labels into one.

I will also fix this in the next version of the patches.

Regards,
Tetsuya
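
The single-error-label idea discussed in this message could look roughly like
the sketch below. It is not the DPDK code: struct uio_resource, map_all() and
the fail_at parameter are hypothetical stand-ins, and plain malloc()/free()
replace rte_zmalloc()/rte_free(). The point is only that a pointer initialized
to NULL and an index initialized to 0 at definition make one cleanup path safe
from every failure point.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define MAX_MAPS 4

struct map_entry {
	char *path;
};

struct uio_resource {
	struct map_entry maps[MAX_MAPS];
	int nb_maps;
};

/*
 * Hypothetical stand-in for a mapping function. fail_at simulates a
 * mapping failure at that index; pass -1 for no failure.
 */
static int
map_all(int fd, const char *devname, int n_maps, int fail_at,
	struct uio_resource **out)
{
	struct uio_resource *res = NULL; /* NULL so free() is always safe */
	int map_idx = 0;                 /* 0 so the cleanup loop is always safe */
	int i;

	if (n_maps > MAX_MAPS)
		goto error;

	res = calloc(1, sizeof(*res));
	if (res == NULL)
		goto error;

	for (i = 0; i < n_maps; i++) {
		if (i == fail_at)
			goto error;

		res->maps[map_idx].path = malloc(strlen(devname) + 1);
		if (res->maps[map_idx].path == NULL)
			goto error;
		strcpy(res->maps[map_idx].path, devname);
		map_idx++;
	}

	res->nb_maps = map_idx;
	*out = res;
	return 0;

error:
	/* map_idx is 0 whenever res is NULL, so res is never dereferenced. */
	for (i = 0; i < map_idx; i++)
		free(res->maps[i].path);
	free(res);
	if (fd >= 0) /* only close a descriptor that was actually opened */
		close(fd);
	return -1;
}
```

Every early exit becomes "goto error", and the checks at the label decide what
actually needs releasing.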

[dpdk-dev] [PATCH v7 05/12] eal: Fix uio mapping differences between linuxapp and bsdapp

2015-07-03 Thread Tetsuya Mukawa

On 2015/07/02 19:20, Bruce Richardson wrote:
> On Tue, Jun 30, 2015 at 05:24:21PM +0900, Tetsuya Mukawa wrote:
>> From: "Tetsuya.Mukawa" 
>>
>> This patch fixes below.
>> - bsdapp
>>  - Use map_id in pci_uio_map_resource().
>>  - Fix interface of pci_map_resource().
>>  - Move path variable of mapped_pci_resource structure to pci_map.
>> - linuxapp
>>  - Remove redundant error message of linuxapp.
>>
>> 'pci_uio_map_resource()' is implemented in both linuxapp and bsdapp,
>> but interface is different. The patch fixes the function of bsdapp
>> to do same as linuxapp. After applying it, file descriptor should be
>> opened and closed out of pci_map_resource().
>>
>> Signed-off-by: Tetsuya Mukawa 
>> ---
>>  lib/librte_eal/bsdapp/eal/eal_pci.c   | 118 
>> ++
>>  lib/librte_eal/linuxapp/eal/eal_pci_uio.c |  21 +++---
>>  2 files changed, 80 insertions(+), 59 deletions(-)
>>
>   
>>  free_uio_res:
>> +for (i = 0; i < map_idx; i++)
>> +rte_free(maps[i].path);
>>  rte_free(uio_res);
>>  close_fd:
>>  close(dev->intr_handle.fd);
> Comments on previous patch about merging error labels would also apply here.

Right, I will fix it also.

>> diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c 
>> b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
>> index c3b259b..19620fe 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
>> @@ -116,17 +116,11 @@ pci_uio_map_secondary(struct rte_pci_device *dev)
>>  fd, (off_t)uio_res->maps[i].offset,
>>  (size_t)uio_res->maps[i].size, 0);
>>  if (mapaddr != uio_res->maps[i].addr) {
>> -if (mapaddr == MAP_FAILED)
>> -RTE_LOG(ERR, EAL,
>> -"Cannot mmap device resource file %s: %s\n",
>> -uio_res->maps[i].path,
>> -strerror(errno));
>> -else
>> -RTE_LOG(ERR, EAL,
>> -"Cannot mmap device resource file %s to address: %p\n",
>> -uio_res->maps[i].path,
>> -uio_res->maps[i].addr);
>> -
>> +RTE_LOG(ERR, EAL,
>> +"Cannot mmap device resource "
>> +"file %s to address: %p\n",
>> +uio_res->maps[i].path,
>> +uio_res->maps[i].addr);
>>  close(fd);
>>  return -1;
>>  }
>> @@ -353,8 +347,11 @@ pci_uio_map_resource(struct rte_pci_device *dev)
>>  
>>  /* allocate memory to keep path */
>>  maps[map_idx].path = rte_malloc(NULL, strlen(devname) + 1, 0);
>> -if (maps[map_idx].path == NULL)
>> +if (maps[map_idx].path == NULL) {
>> +RTE_LOG(ERR, EAL, "Cannot allocate memory for "
>> +"path: %s\n", strerror(errno));
> It's recommended not to split literal strings across multiple lines. This is
> so that it's easy to find error messages in the source code. In this case,
> because of the split, someone using git grep to search for the error message
> "Cannot allocate memory for path" would not be able to find it in the code.
> Longer lines are allowed in code for literal strings.
>
> /Bruce
>

Sure, I will fix it.

Tetsuya

[dpdk-dev] [PATCH v7 06/12] eal: Add pci_uio_alloc_resource()

2015-07-03 Thread Tetsuya Mukawa

On 2015/07/02 19:46, Bruce Richardson wrote:
> On Tue, Jun 30, 2015 at 05:24:22PM +0900, Tetsuya Mukawa wrote:
>> From: "Tetsuya.Mukawa" 
>>
>> This patch adds a new function called pci_uio_alloc_resource().
>> The function hides how to prepare uio resource in linuxapp and bsdapp.
>> With the function, pci_uio_map_resource() will be more abstracted.
>>
>> Signed-off-by: Tetsuya Mukawa 
>> ---
>>  lib/librte_eal/bsdapp/eal/eal_pci.c   | 70 +++-
>>  lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 77 
>> ++-
>>  2 files changed, 104 insertions(+), 43 deletions(-)
>>
>> diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c 
>> b/lib/librte_eal/bsdapp/eal/eal_pci.c
>> index 06c564f..7d2f8b5 100644
>> --- a/lib/librte_eal/bsdapp/eal/eal_pci.c
>> +++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
>> @@ -189,28 +189,17 @@ pci_uio_map_secondary(struct rte_pci_device *dev)
>>  return 1;
>>  }
>>  
>> -/* map the PCI resource of a PCI device in virtual memory */
>>  static int
>> -pci_uio_map_resource(struct rte_pci_device *dev)
>> +pci_uio_alloc_resource(struct rte_pci_device *dev,
>> +struct mapped_pci_resource **uio_res)
> Rather than having to pass in a pointer to a pointer, why not change the
> return type to be "struct mapped_pci_resource *"? The only return values
> currently are 0 and -1, so those could map to non-NULL and NULL respectively,
> for error checking.
>
> /Bruce

That might be difficult, because pci_uio_alloc_resource() currently
returns three values: 0, -1 and 1.

The original pci_uio_map_resource() returns a negative value on error
and a positive value when no driver is found.
I followed the same convention when implementing the new function.

Tetsuya
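
The three-way return convention described in this reply can be sketched as
follows. The names are hypothetical and the body is not the DPDK
implementation; it only shows why an int return plus an output parameter is
kept: a plain pointer return could tell NULL from non-NULL, but not "error"
from "driver not found".

```c
#include <stdlib.h>

struct mapped_resource {
	int dummy; /* placeholder payload */
};

/*
 * Returns 0 on success (*out is set), a negative value on error and a
 * positive value when no driver is bound to the device.
 */
static int
alloc_resource(int driver_bound, int simulate_oom, struct mapped_resource **out)
{
	struct mapped_resource *res;

	*out = NULL;
	if (!driver_bound)
		return 1; /* not an error: the caller may simply skip the device */

	res = simulate_oom ? NULL : calloc(1, sizeof(*res));
	if (res == NULL)
		return -1;

	*out = res;
	return 0;
}
```

A caller can then propagate negative values as failures and treat a positive
value as "nothing to do for this device".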

[dpdk-dev] [PATCH v7 09/12] eal: Consolidate pci_map/unmap_resource() of linuxapp and bsdapp

2015-07-03 Thread Tetsuya Mukawa

On 2015/07/02 20:11, Bruce Richardson wrote:
> On Tue, Jun 30, 2015 at 05:24:25PM +0900, Tetsuya Mukawa wrote:
>> From: "Tetsuya.Mukawa" 
>>
>> The patch consolidates below functions, and implemented in common
>> eal code.
>>  - pci_map_resource()
>>  - pci_unmap_resource()
>>
>> Signed-off-by: Tetsuya Mukawa 
>> ---
>>  lib/librte_eal/bsdapp/eal/eal_pci.c| 22 
>>  lib/librte_eal/common/eal_common_pci.c | 39 
>>  lib/librte_eal/common/include/rte_pci.h| 11 
>>  lib/librte_eal/linuxapp/eal/eal_pci.c  | 41 
>> --
>>  lib/librte_eal/linuxapp/eal/eal_pci_init.h |  5 
>>  5 files changed, 50 insertions(+), 68 deletions(-)
>>
> 
>> diff --git a/lib/librte_eal/common/include/rte_pci.h 
>> b/lib/librte_eal/common/include/rte_pci.h
>> index 0a2ef09..56dcb46 100644
>> --- a/lib/librte_eal/common/include/rte_pci.h
>> +++ b/lib/librte_eal/common/include/rte_pci.h
>> @@ -364,6 +364,17 @@ int rte_eal_pci_scan(void);
>>   */
>>  int rte_eal_pci_probe(void);
>>  
>> +/**
>> + * Map pci resouce.
>> + */
>> +void *pci_map_resource(void *requested_addr, int fd, off_t offset,
>> +size_t size, int additional_flags);
>> +
>> +/**
>> + * Map pci resouce.
>> + */
>> +void pci_unmap_resource(void *requested_addr, size_t size);
>> +
> These functions should probably be marked "@internal", right?

Yes, it should be. I will fix it.

Tetsuya

[dpdk-dev] [PATCH v7 00/12] Clean up pci uio implementations

2015-07-03 Thread Tetsuya Mukawa

On 2015/07/02 20:32, Bruce Richardson wrote:
> On Tue, Jun 30, 2015 at 05:24:16PM +0900, Tetsuya Mukawa wrote:
>> Currently Linux implementation and BSD implementation have almost same
>> code about pci uio. This patch series cleans up it.
>>
> Overall, patchset looks a good idea. I've made some comments on some of the
> individual patches. Quick test on FreeBSD shows that PCI port scanning and
> mapping of the BARs of 82599 NICs works ok.
>
> Tested-by: Bruce Richardson 
>

I appreciate your testing.
Just for other reviewers, here is my test environment.

 - FreeBSD 10.1
 - 82572EI Gigabit Ethernet Controller (Copper) (rev 06)

Regards,
Tetsuya

[dpdk-dev] [PATCH v2] doc/sample_app_ug:add a VXLAN sample guide

2015-07-03 Thread Liu, Yong

Hi,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jijiang Liu
> Sent: Friday, July 03, 2015 2:58 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2] doc/sample_app_ug:add a VXLAN sample guide
> 
> Add a VXLAN sample guide in the sample_app_ug directory.
> 
> It includes:
> 
> - Add the overlay networking picture with svg format.
> 
> - Add the TEP termination framework picture with svg format.
> 
> - Add the tep_termination.rst file
> 
> - Change the index.rst file for the above pictures index.
> 
> Signed-off-by: Jijiang Liu 
> Signed-off-by: Thomas Long 
> 
> v2 changes:
> optimize the two pictures
> add tep_termination index in index.rst file
> fix a typo and a command line
> 
> ---
>  .../sample_app_ug/img/overlay_networking.svg   |  786
> 
>  .../sample_app_ug/img/tep_termination_arch.svg |  548 ++
>  doc/guides/sample_app_ug/index.rst |3 +
>  doc/guides/sample_app_ug/tep_termination.rst   |  321 
>  4 files changed, 1658 insertions(+), 0 deletions(-)
>  create mode 100644 doc/guides/sample_app_ug/img/overlay_networking.svg
>  create mode 100644 doc/guides/sample_app_ug/img/tep_termination_arch.svg
>  create mode 100644 doc/guides/sample_app_ug/tep_termination.rst
> 
> diff --git a/doc/guides/sample_app_ug/img/overlay_networking.svg
> b/doc/guides/sample_app_ug/img/overlay_networking.svg
> new file mode 100644
> index 000..2ce440d
> --- /dev/null
> +++ b/doc/guides/sample_app_ug/img/overlay_networking.svg
> @@ -0,0 +1,786 @@
> + [SVG markup for overlay_networking.svg: Visio export with XML namespace
> + declarations and CSS style classes .st1 to .st29; the markup is stripped
> + and truncated in the archive]

[dpdk-dev] [PATCH v4 00/11] ip_pipeline: ip_pipeline application enhancements

2015-07-03 Thread Maciej Gajdzica

This patchset enhances the functionality of the ip_pipeline application. A new
config file syntax is introduced, so the parser is changed. The structure of
the application is changed: every global variable is now stored in app_struct
in app.h. The syntax of the pipeline CLI commands is changed, and the
implementation of the CLI commands for each pipeline is moved to a separate file.

Changes in v2:
- renamed some files
- added more config files
- reworked flow classification pipeline implementation
- fixed some bugs

Changes in v3:
- fixed checkpatch errors
- fixed bug with message queues
- fixed bug with application log

Changes in v4:
- fixed build issue with gcc 5
- fixed bugs in flow classification and firewall pipelines

Daniel Mrzyglod (1):
  ip_pipeline: added new implementation of firewall pipeline

Jasvinder Singh (3):
  ip_pipeline: added config checks
  ip_pipeline: added master pipeline
  ip_pipeline: added new implementation of passthrough pipeline

Maciej Gajdzica (6):
  ip_pipeline: modified init to match new params struct
  ip_pipeline: moved pipelines to separate folder
  ip_pipeline: added application thread
  ip_pipeline: moved config files to separate folder
  ip_pipeline: added new implementation of routing pipeline
  ip_pipeline: added new implementation of flow classification pipeline

Pawel Wodkowski (1):
  ip_pipeline: add parsing for config files with new syntax

 examples/ip_pipeline/Makefile  |   36 +-
 examples/ip_pipeline/app.h |  905 
 examples/ip_pipeline/cmdline.c | 1976 
 examples/ip_pipeline/config.c  |  419 
 examples/ip_pipeline/config/ip_pipeline.cfg|9 +
 examples/ip_pipeline/config/ip_pipeline.sh |5 +
 examples/ip_pipeline/config/tm_profile.cfg |  105 +
 examples/ip_pipeline/config_check.c|  396 
 examples/ip_pipeline/config_parse.c| 2456 
 examples/ip_pipeline/config_parse_tm.c |  446 
 examples/ip_pipeline/cpu_core_map.c|  492 
 examples/ip_pipeline/cpu_core_map.h|   69 +
 examples/ip_pipeline/init.c| 1645 +
 examples/ip_pipeline/ip_pipeline.cfg   |   56 -
 examples/ip_pipeline/ip_pipeline.sh|   18 -
 examples/ip_pipeline/main.c|  137 +-
 examples/ip_pipeline/main.h|  298 ---
 examples/ip_pipeline/pipeline.h|   87 +
 examples/ip_pipeline/pipeline/hash_func.h  |  351 +++
 .../ip_pipeline/pipeline/pipeline_actions_common.h |  119 +
 examples/ip_pipeline/pipeline/pipeline_common_be.c |  206 ++
 examples/ip_pipeline/pipeline/pipeline_common_be.h |  163 ++
 examples/ip_pipeline/pipeline/pipeline_common_fe.c | 1324 +++
 examples/ip_pipeline/pipeline/pipeline_common_fe.h |  228 ++
 examples/ip_pipeline/pipeline/pipeline_firewall.c  | 1001 
 examples/ip_pipeline/pipeline/pipeline_firewall.h  |   63 +
 .../ip_pipeline/pipeline/pipeline_firewall_be.c|  740 ++
 .../ip_pipeline/pipeline/pipeline_firewall_be.h|  138 ++
 .../pipeline/pipeline_flow_classification.c| 2057 
 .../pipeline/pipeline_flow_classification.h|  105 +
 .../pipeline/pipeline_flow_classification_be.c |  589 +
 .../pipeline/pipeline_flow_classification_be.h |  140 ++
 examples/ip_pipeline/pipeline/pipeline_master.c|   47 +
 examples/ip_pipeline/pipeline/pipeline_master.h|   41 +
 examples/ip_pipeline/pipeline/pipeline_master_be.c |  150 ++
 examples/ip_pipeline/pipeline/pipeline_master_be.h |   41 +
 .../ip_pipeline/pipeline/pipeline_passthrough.c|   47 +
 .../ip_pipeline/pipeline/pipeline_passthrough.h|   41 +
 .../ip_pipeline/pipeline/pipeline_passthrough_be.c |  772 ++
 .../ip_pipeline/pipeline/pipeline_passthrough_be.h |   41 +
 examples/ip_pipeline/pipeline/pipeline_routing.c   | 1541 
 examples/ip_pipeline/pipeline/pipeline_routing.h   |   99 +
 .../ip_pipeline/pipeline/pipeline_routing_be.c |  869 +++
 .../ip_pipeline/pipeline/pipeline_routing_be.h |  230 ++
 examples/ip_pipeline/pipeline_be.h |  256 ++
 examples/ip_pipeline/pipeline_firewall.c   |  313 ---
 .../ip_pipeline/pipeline_flow_classification.c |  306 ---
 examples/ip_pipeline/pipeline_ipv4_frag.c  |  184 --
 examples/ip_pipeline/pipeline_ipv4_ras.c   |  181 --
 examples/ip_pipeline/pipeline_passthrough.c|  213 --
 examples/ip_pipeline/pipeline_routing.c|  474 
 examples/ip_pipeline/pipeline_rx.c |  385 ---
 examples/ip_pipeline/pipeline_tx.c |  283 ---
 examples/ip_pipeline/thread.c  |  110 +
 54 files changed, 17732 insertions(+), 5671 deletions(-)
 create mode 100644 examples/ip_pipeline/app.h
 delete mode 100644 examples/ip_pipeline/cmdline.c
 delete mode

[dpdk-dev] [PATCH v4 01/11] ip_pipeline: add parsing for config files with new syntax

2015-07-03 Thread Maciej Gajdzica

From: Pawel Wodkowski 

New syntax of config files is needed for ip_pipeline example
enhancements. Some old files are temporarily disabled in the Makefile.
It is part of a bigger change.

Signed-off-by: Pawel Wodkowski 
---
 examples/ip_pipeline/Makefile  |   17 +-
 examples/ip_pipeline/app.h |  905 
 examples/ip_pipeline/config.c  |  419 --
 examples/ip_pipeline/config_parse.c| 2456 
 examples/ip_pipeline/config_parse_tm.c |  446 ++
 examples/ip_pipeline/cpu_core_map.c|  492 +++
 examples/ip_pipeline/cpu_core_map.h|   69 +
 examples/ip_pipeline/main.c|  130 +-
 examples/ip_pipeline/main.h|  298 
 examples/ip_pipeline/pipeline.h|   87 ++
 examples/ip_pipeline/pipeline_be.h |  256 
 11 files changed, 4722 insertions(+), 853 deletions(-)
 create mode 100644 examples/ip_pipeline/app.h
 delete mode 100644 examples/ip_pipeline/config.c
 create mode 100644 examples/ip_pipeline/config_parse.c
 create mode 100644 examples/ip_pipeline/config_parse_tm.c
 create mode 100644 examples/ip_pipeline/cpu_core_map.c
 create mode 100644 examples/ip_pipeline/cpu_core_map.h
 delete mode 100644 examples/ip_pipeline/main.h
 create mode 100644 examples/ip_pipeline/pipeline.h
 create mode 100644 examples/ip_pipeline/pipeline_be.h

diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
index e70fdc7..b0feb4f 100644
--- a/examples/ip_pipeline/Makefile
+++ b/examples/ip_pipeline/Makefile
@@ -43,20 +43,9 @@ APP = ip_pipeline

 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) := main.c
-SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config.c
-SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += init.c
-SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += cmdline.c
-SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_rx.c
-SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_tx.c
-SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_flow_classification.c
-SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_routing.c
-SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_passthrough.c
-SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_ipv4_frag.c
-SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_ipv4_ras.c
-
-ifeq ($(CONFIG_RTE_LIBRTE_ACL),y)
-SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_firewall.c
-endif
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_parse.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_parse_tm.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += cpu_core_map.c

 CFLAGS += -O3
 CFLAGS += $(WERROR_FLAGS)
diff --git a/examples/ip_pipeline/app.h b/examples/ip_pipeline/app.h
new file mode 100644
index 000..112473a
--- /dev/null
+++ b/examples/ip_pipeline/app.h
@@ -0,0 +1,905 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_APP_H__
+#define __INCLUDE_APP_H__
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include "cpu_core_map.h"
+#include "pipeline.h"
+
+#define APP_PARAM_NAME_SIZE  PIPELINE_NAME_SIZE
+
+struct app_mempool_params {
+   char *name;
+   uint32_t parsed;
+   uint32_t buffer_size;
+   uint32_t pool_size;
+   uint32_t cache_size;
+   uint32_t cpu_socket_id;
+};
+
+struct app_link_params {
+   char *name;
+   uint32_t parsed;
+   uint32_t pmd

[dpdk-dev] [PATCH v6 1/9] eal: move librte_malloc to eal/common

2015-07-03 Thread Thomas Monjalon

2015-07-03 09:16, Gonzalez Monroy, Sergio:
> On 02/07/2015 13:14, Thomas Monjalon wrote:
> > 2015-06-26 16:29, Sergio Gonzalez Monroy:
> >> --- a/MAINTAINERS
> >> +++ b/MAINTAINERS
> >> @@ -73,6 +73,7 @@ F: lib/librte_eal/common/*
> >>   F: lib/librte_eal/common/include/*
> >>   F: lib/librte_eal/common/include/generic/
> >>   F: doc/guides/prog_guide/env_abstraction_layer.rst
> >> +F: doc/guides/prog_guide/malloc_lib.rst
> >>   F: app/test/test_alarm.c
> >>   F: app/test/test_atomic.c
> >>   F: app/test/test_byteorder.c
> >> @@ -97,6 +98,8 @@ F: app/test/test_spinlock.c
> >>   F: app/test/test_string_fns.c
> >>   F: app/test/test_tailq.c
> >>   F: app/test/test_version.c
> >> +F: app/test/test_malloc.c
> >> +F: app/test/test_func_reentrancy.c
> > I think we should keep a separate maintainer section for memory allocator
> > in EAL. I suggest this:
> >
> > Memory allocation
> > M: Sergio Gonzalez Monroy 
> > F: lib/librte_eal/common/include/rte_mem*
> > F: lib/librte_eal/common/include/rte_malloc.h
> > F: lib/librte_eal/common/*malloc*
> > F: lib/librte_eal/common/eal_common_mem*
> > F: lib/librte_eal/common/eal_hugepages.h
> > F: doc/guides/prog_guide/malloc_lib.rst
> > F: app/test/test_malloc.c
> > F: app/test/test_func_reentrancy.c
> >
> >
> Fine with me.
> Do you need a new version of the patches with that change?

Yes please.
Thanks for your involvement.

[dpdk-dev] [PATCH v4 02/11] ip_pipeline: added config checks

2015-07-03 Thread Maciej Gajdzica

From: Jasvinder Singh 

After loading configuration from a file, data integrity is checked.

Signed-off-by: Jasvinder Singh 
---
 examples/ip_pipeline/Makefile   |1 +
 examples/ip_pipeline/config_check.c |  396 +++
 examples/ip_pipeline/main.c |2 +
 3 files changed, 399 insertions(+)
 create mode 100644 examples/ip_pipeline/config_check.c

diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
index b0feb4f..bc50e71 100644
--- a/examples/ip_pipeline/Makefile
+++ b/examples/ip_pipeline/Makefile
@@ -45,6 +45,7 @@ APP = ip_pipeline
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) := main.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_parse.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_parse_tm.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_check.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += cpu_core_map.c

 CFLAGS += -O3
diff --git a/examples/ip_pipeline/config_check.c 
b/examples/ip_pipeline/config_check.c
new file mode 100644
index 000..2218238
--- /dev/null
+++ b/examples/ip_pipeline/config_check.c
@@ -0,0 +1,396 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+
+#include "app.h"
+
+static void
+check_mempools(struct app_params *app)
+{
+   uint32_t i;
+
+   for (i = 0; i < app->n_mempools; i++) {
+   struct app_mempool_params *p = &app->mempool_params[i];
+
+   APP_CHECK((p->pool_size > 0),
+   "Mempool %s size is 0\n", p->name);
+
+   APP_CHECK((p->cache_size > 0),
+   "Mempool %s cache size is 0\n", p->name);
+
+   APP_CHECK(rte_is_power_of_2(p->cache_size),
+   "Mempool %s cache size not a power of 2\n", p->name);
+   }
+}
+
+static void
+check_links(struct app_params *app)
+{
+   uint32_t i;
+
+   /* Check that number of links matches the port mask */
+   APP_CHECK((app->n_links == __builtin_popcountll(app->port_mask)),
+   "Not enough links provided in the PORT_MASK\n");
+
+   for (i = 0; i < app->n_links; i++) {
+   struct app_link_params *link = &app->link_params[i];
+   uint32_t rxq_max, n_rxq, n_txq, link_id, i;
+
+   APP_PARAM_GET_ID(link, "LINK", link_id);
+
+   /* Check that link RXQs are contiguous */
+   rxq_max = 0;
+   if (link->arp_q > rxq_max)
+   rxq_max = link->arp_q;
+   if (link->tcp_syn_local_q > rxq_max)
+   rxq_max = link->tcp_syn_local_q;
+   if (link->ip_local_q > rxq_max)
+   rxq_max = link->ip_local_q;
+   if (link->tcp_local_q > rxq_max)
+   rxq_max = link->tcp_local_q;
+   if (link->udp_local_q > rxq_max)
+   rxq_max = link->udp_local_q;
+   if (link->sctp_local_q > rxq_max)
+   rxq_max = link->sctp_local_q;
+
+   for (i = 1; i <= rxq_max; i++)
+   APP_CHECK(((link->arp_q == i) ||
+   (link->tcp_syn_local_q == i) ||
+   (link->ip_local_q == i) ||
+   (link->tcp_local_q == i) ||
+   (link->udp_local_q == i) ||
+
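
The kind of integrity checks this patch performs can be sketched in a
standalone form. The functions below are hypothetical and return 0 or 1
instead of aborting through APP_CHECK(); the queue ids are passed as a plain
array rather than as fields of the link parameters, and the power-of-2 helper
is written here (rejecting 0 explicitly) rather than taken from DPDK.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch helper: true only for 1, 2, 4, 8, ... (0 is rejected). */
static int
is_nonzero_power_of_2(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Mirrors the mempool checks: non-empty pool, power-of-2 cache size. */
static int
mempool_sizes_valid(uint32_t pool_size, uint32_t cache_size)
{
	return pool_size > 0 && is_nonzero_power_of_2(cache_size);
}

/*
 * Mirrors the RXQ check: find the highest queue id in use, then require
 * that every id from 1 up to that maximum is used by at least one entry.
 */
static int
rxqs_are_contiguous(const uint32_t *q, size_t n)
{
	uint32_t q_max = 0, id;
	size_t k;

	for (k = 0; k < n; k++)
		if (q[k] > q_max)
			q_max = q[k];

	for (id = 1; id <= q_max; id++) {
		int found = 0;

		for (k = 0; k < n; k++)
			if (q[k] == id)
				found = 1;
		if (!found)
			return 0; /* gap in the queue numbering */
	}
	return 1;
}
```

With six special-purpose queues, ids {0, 1, 2, 0, 0, 0} pass while
{0, 1, 3, 0, 0, 0} fail because queue 2 is never assigned.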

[dpdk-dev] [PATCH v4 03/11] ip_pipeline: modified init to match new params struct

2015-07-03 Thread Maciej Gajdzica

After changes in config parser, app params struct is changed and
requires modifications in initialization procedures.

Signed-off-by: Maciej Gajdzica 
---
 examples/ip_pipeline/Makefile |1 +
 examples/ip_pipeline/init.c   | 1632 ++---
 examples/ip_pipeline/main.c   |3 +
 3 files changed, 1203 insertions(+), 433 deletions(-)

diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
index bc50e71..59bea5b 100644
--- a/examples/ip_pipeline/Makefile
+++ b/examples/ip_pipeline/Makefile
@@ -46,6 +46,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) := main.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_parse.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_parse_tm.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_check.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += init.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += cpu_core_map.c

 CFLAGS += -O3
diff --git a/examples/ip_pipeline/init.c b/examples/ip_pipeline/init.c
index d79762f..8e8b290 100644
--- a/examples/ip_pipeline/init.c
+++ b/examples/ip_pipeline/init.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -32,561 +32,1327 @@
  */

 #include 
-#include 
-#include 
-#include 
-#include 
 #include 
-#include 
-#include 
-#include 
-#include 
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
+
 #include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
 #include 
-#include 
-#include 
-#include 
-#include 
-#include 
+#include 
 #include 
-#include 
-#include 
-#include 
-
-#include "main.h"
-
-#define NA APP_SWQ_INVALID
-
-struct app_params app = {
-   /* CPU cores */
-   .cores = {
-   {0, APP_CORE_MASTER, {15, 16, 17, NA, NA, NA, NA, NA},
-   {12, 13, 14, NA, NA, NA, NA, NA} },
-   {0, APP_CORE_RX, {NA, NA, NA, NA, NA, NA, NA, 12},
-   { 0,  1,  2,  3, NA, NA, NA, 15} },
-   {0, APP_CORE_FC, { 0,  1,  2,  3, NA, NA, NA, 13},
-   { 4,  5,  6,  7, NA, NA, NA, 16} },
-   {0, APP_CORE_RT, { 4,  5,  6,  7, NA, NA, NA, 14},
-   { 8,  9, 10, 11, NA, NA, NA, 17} },
-   {0, APP_CORE_TX, { 8,  9, 10, 11, NA, NA, NA, NA},
-   {NA, NA, NA, NA, NA, NA, NA, NA} },
-   },
-
-   /* Ports*/
-   .n_ports = APP_MAX_PORTS,
-   .rsz_hwq_rx = 128,
-   .rsz_hwq_tx = 512,
-   .bsz_hwq_rd = 64,
-   .bsz_hwq_wr = 64,
-
-   .port_conf = {
-   .rxmode = {
-   .split_hdr_size = 0,
-   .header_split   = 0, /* Header Split disabled */
-   .hw_ip_checksum = 1, /* IP checksum offload enabled */
-   .hw_vlan_filter = 0, /* VLAN filtering disabled */
-   .jumbo_frame= 1, /* Jumbo Frame Support enabled */
-   .max_rx_pkt_len = 9000, /* Jumbo Frame MAC pkt length */
-   .hw_strip_crc   = 0, /* CRC stripped by hardware */
-   },
-   .rx_adv_conf = {
-   .rss_conf = {
-   .rss_key = NULL,
-   .rss_hf = ETH_RSS_IP,
-   },
-   },
-   .txmode = {
-   .mq_mode = ETH_MQ_TX_NONE,
-   },
-   },
-
-   .rx_conf = {
-   .rx_thresh = {
-   .pthresh = 8,
-   .hthresh = 8,
-   .wthresh = 4,
-   },
-   .rx_free_thresh = 64,
-   .rx_drop_en = 0,
-   },
-
-   .tx_conf = {
-   .tx_thresh = {
-   .pthresh = 36,
-   .hthresh = 0,
-   .wthresh = 0,
-   },
-   .tx_free_thresh = 0,
-   .tx_rs_thresh = 0,
-   },
-
-   /* SWQs */
-   .rsz_swq = 128,
-   .bsz_swq_rd = 64,
-   .bsz_swq_wr = 64,
-
-   /* Buffer pool */
-   .pool_buffer_size = RTE_MBUF_DEFAULT_BUF_SIZE,
-   .pool_size = 32 * 1024,
-   .pool_cache_size = 256,
-
-   /* Message buffer pool */
-   .msg_pool_buffer_size = 256,
-   .msg_pool_size = 1024,
-   .msg_pool_cache_size = 64,
-
-   /* Rule tables */
-   .max_arp_rules = 1 << 10,
-   .max_firewall_rules = 1 << 5,
-   .max_routing_rules = 1 << 24,
-   .max_flow_rules = 1 << 24,
-
-   /* Application processing */
-   .ether_hdr_pop_push = 0,
-};
-
-struct app_core_params *
-app_get_core_params(uint32_t core_id)
-{
-   uint32_t i;
+#include 
+#include 

-   for (i = 0; i < RTE_MAX_LCORE; i++) {
-   struct app_core

[dpdk-dev] [PATCH v4 04/11] ip_pipeline: moved pipelines to separate folder

2015-07-03 Thread Maciej Gajdzica

Moved pipelines to a separate folder, removed pipelines that are no
longer needed, and modified the Makefile to match.

Signed-off-by: Maciej Gajdzica 
---
 examples/ip_pipeline/Makefile  |9 +-
 examples/ip_pipeline/pipeline/pipeline_firewall.c  |  313 +
 .../pipeline/pipeline_flow_classification.c|  306 +
 .../ip_pipeline/pipeline/pipeline_passthrough.c|  213 +
 examples/ip_pipeline/pipeline/pipeline_routing.c   |  474 
 examples/ip_pipeline/pipeline_firewall.c   |  313 -
 .../ip_pipeline/pipeline_flow_classification.c |  306 -
 examples/ip_pipeline/pipeline_ipv4_frag.c  |  184 
 examples/ip_pipeline/pipeline_ipv4_ras.c   |  181 
 examples/ip_pipeline/pipeline_passthrough.c|  213 -
 examples/ip_pipeline/pipeline_routing.c|  474 
 examples/ip_pipeline/pipeline_rx.c |  385 
 examples/ip_pipeline/pipeline_tx.c |  283 
 13 files changed, 1314 insertions(+), 2340 deletions(-)
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_firewall.c
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_flow_classification.c
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_passthrough.c
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_routing.c
 delete mode 100644 examples/ip_pipeline/pipeline_firewall.c
 delete mode 100644 examples/ip_pipeline/pipeline_flow_classification.c
 delete mode 100644 examples/ip_pipeline/pipeline_ipv4_frag.c
 delete mode 100644 examples/ip_pipeline/pipeline_ipv4_ras.c
 delete mode 100644 examples/ip_pipeline/pipeline_passthrough.c
 delete mode 100644 examples/ip_pipeline/pipeline_routing.c
 delete mode 100644 examples/ip_pipeline/pipeline_rx.c
 delete mode 100644 examples/ip_pipeline/pipeline_tx.c

diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
index 59bea5b..213e879 100644
--- a/examples/ip_pipeline/Makefile
+++ b/examples/ip_pipeline/Makefile
@@ -36,11 +36,17 @@ endif
 # Default target, can be overridden by command line or environment
 RTE_TARGET ?= x86_64-native-linuxapp-gcc

+DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline
+
 include $(RTE_SDK)/mk/rte.vars.mk

 # binary name
 APP = ip_pipeline

+VPATH += $(SRCDIR)/pipeline
+
+INC += $(wildcard *.h) $(wildcard pipeline/*.h)
+
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) := main.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_parse.c
@@ -49,7 +55,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_check.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += init.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += cpu_core_map.c

+CFLAGS += -I$(SRCDIR) -I$(SRCDIR)/pipeline
 CFLAGS += -O3
-CFLAGS += $(WERROR_FLAGS)
+CFLAGS += $(WERROR_FLAGS) -Wno-error=unused-function -Wno-error=unused-variable

 include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/ip_pipeline/pipeline/pipeline_firewall.c b/examples/ip_pipeline/pipeline/pipeline_firewall.c
new file mode 100644
index 0000000..b70260e
--- /dev/null
+++ b/examples/ip_pipeline/pipeline/pipeline_firewall.c
@@ -0,0 +1,313 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#inc

[dpdk-dev] [PATCH v4 05/11] ip_pipeline: added master pipeline

2015-07-03 Thread Maciej Gajdzica

From: Jasvinder Singh 

Master pipeline is responsible for command line handling and
communicating with all other pipelines via message queues. Removed
cmdline.c file, as its functionality will be split over multiple
pipeline files.
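
The request/response exchange between the master pipeline and a worker
pipeline can be pictured with the sketch below. It is only an
illustration: struct msgq, struct msg, MSG_PING and the three functions
are hypothetical names, the queue is a plain single-threaded ring rather
than the DPDK ring library, and no memory barriers are used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size queue standing in for the message queues
 * named in the commit message; this is not the DPDK rte_ring API. */
#define MSGQ_SIZE 8u /* must be a power of two */

enum { MSG_PING = 1 };

struct msg {
	uint32_t type;  /* request type, e.g. MSG_PING */
	int32_t status; /* filled in by the receiving pipeline */
};

struct msgq {
	struct msg *slots[MSGQ_SIZE];
	uint32_t head; /* next slot to write */
	uint32_t tail; /* next slot to read */
};

/* Returns 0 on success, -1 if the queue is full. */
static int msgq_send(struct msgq *q, struct msg *m)
{
	if (q->head - q->tail == MSGQ_SIZE)
		return -1;
	q->slots[q->head & (MSGQ_SIZE - 1)] = m;
	q->head++;
	return 0;
}

/* Returns the oldest message, or NULL if the queue is empty. */
static struct msg *msgq_recv(struct msgq *q)
{
	struct msg *m;

	if (q->head == q->tail)
		return NULL;
	m = q->slots[q->tail & (MSGQ_SIZE - 1)];
	q->tail++;
	return m;
}

/* Worker side: drain requests from "in" and answer each on "out".
 * Unknown request types are answered with status -1. */
static void pipeline_msg_handle(struct msgq *in, struct msgq *out)
{
	struct msg *m;

	while ((m = msgq_recv(in)) != NULL) {
		m->status = (m->type == MSG_PING) ? 0 : -1;
		(void)msgq_send(out, m);
	}
}
```

The master side would call msgq_send() on the request queue and later
poll msgq_recv() on the response queue; one queue per direction keeps
each queue at a single writer and a single reader.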

Signed-off-by: Jasvinder Singh 
---
 examples/ip_pipeline/Makefile  |5 +
 examples/ip_pipeline/cmdline.c | 1976 
 examples/ip_pipeline/init.c|5 +
 examples/ip_pipeline/pipeline/pipeline_common_be.c |  206 ++
 examples/ip_pipeline/pipeline/pipeline_common_be.h |  163 ++
 examples/ip_pipeline/pipeline/pipeline_common_fe.c | 1324 +
 examples/ip_pipeline/pipeline/pipeline_common_fe.h |  228 +++
 examples/ip_pipeline/pipeline/pipeline_master.c|   47 +
 examples/ip_pipeline/pipeline/pipeline_master.h|   41 +
 examples/ip_pipeline/pipeline/pipeline_master_be.c |  150 ++
 examples/ip_pipeline/pipeline/pipeline_master_be.h |   41 +
 11 files changed, 2210 insertions(+), 1976 deletions(-)
 delete mode 100644 examples/ip_pipeline/cmdline.c
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_common_be.c
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_common_be.h
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_common_fe.c
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_common_fe.h
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_master.c
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_master.h
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_master_be.c
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_master_be.h

diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
index 213e879..9ce80a8 100644
--- a/examples/ip_pipeline/Makefile
+++ b/examples/ip_pipeline/Makefile
@@ -55,6 +55,11 @@ SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_check.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += init.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += cpu_core_map.c

+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_common_be.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_common_fe.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_master_be.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_master.c
+
 CFLAGS += -I$(SRCDIR) -I$(SRCDIR)/pipeline
 CFLAGS += -O3
 CFLAGS += $(WERROR_FLAGS) -Wno-error=unused-function -Wno-error=unused-variable
diff --git a/examples/ip_pipeline/cmdline.c b/examples/ip_pipeline/cmdline.c
deleted file mode 100644
index 3173fd0..0000000
--- a/examples/ip_pipeline/cmdline.c
+++ /dev/null
@@ -1,1976 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- * * Redistributions of source code must retain the above copyright
- *   notice, this list of conditions and the following disclaimer.
- * * Redistributions in binary form must reproduce the above copyright
- *   notice, this list of conditions and the following disclaimer in
- *   the documentation and/or other materials provided with the
- *   distribution.
- * * Neither the name of Intel Corporation nor the names of its
- *   contributors may be used to endorse or promote products derived
- *   from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include "main.h"
-
-#define IS_RULE_PRESENT(res, rule_key, table, type)\
-do {   \
-   struct app_rule *it;\
-   \
-   (res) = NULL;   \
-   TAILQ_FOREACH(it, &table, entries) {\

[dpdk-dev] [PATCH v4 06/11] ip_pipeline: added application thread

2015-07-03 Thread Maciej Gajdzica

Application thread runs pipelines on assigned cores.
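
The shape of the per-core loop in thread.c (run every pipeline, then
look at the timers once every 16 iterations) can be tried out without
DPDK using the sketch below. The names struct pipe_data, thread_run and
the demo helpers are hypothetical, the loop is bounded so that it
terminates, and re-arming the deadline as "time + timer_period" is an
assumption of this sketch, since the quoted diff is cut off at that
point.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-pipeline state; the patch itself uses
 * struct app_thread_pipeline_data with f_run/f_timer callbacks. */
struct pipe_data {
	void (*f_run)(void *be);
	void (*f_timer)(void *be);
	void *be;
	uint64_t deadline;     /* next time f_timer is due */
	uint64_t timer_period; /* assumed interval between f_timer calls */
};

/* Run n_iter loop iterations. now() is a caller-supplied clock;
 * the patch reads rte_get_tsc_cycles() instead. */
static void thread_run(struct pipe_data *p, uint32_t n_pipes,
	uint32_t n_iter, uint64_t (*now)(void))
{
	uint32_t i, j;

	for (i = 0; i < n_iter; i++) {
		uint64_t time;

		for (j = 0; j < n_pipes; j++)
			p[j].f_run(p[j].be);

		/* Check the timers only once every 16 iterations. */
		if ((i & 0xF) != 0)
			continue;

		time = now();
		for (j = 0; j < n_pipes; j++) {
			if (p[j].deadline <= time) {
				p[j].f_timer(p[j].be);
				p[j].deadline = time + p[j].timer_period;
			}
		}
	}
}

/* Demo helpers: count callback invocations and fake a clock that
 * advances by 10 ticks on every read. */
struct counters {
	uint32_t runs;
	uint32_t timers;
};

static void demo_run(void *be) { ((struct counters *)be)->runs++; }
static void demo_timer(void *be) { ((struct counters *)be)->timers++; }

static uint64_t fake_clock_ticks;
static uint64_t fake_clock(void) { return fake_clock_ticks += 10; }
```

Reading the clock only on every 16th iteration keeps the timer check off
the packet path for most iterations, at the price of timers firing a
little late.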

Signed-off-by: Maciej Gajdzica 
---
 examples/ip_pipeline/Makefile |1 +
 examples/ip_pipeline/main.c   |6 +++
 examples/ip_pipeline/thread.c |  110 +
 3 files changed, 117 insertions(+)
 create mode 100644 examples/ip_pipeline/thread.c

diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
index 9ce80a8..f255338 100644
--- a/examples/ip_pipeline/Makefile
+++ b/examples/ip_pipeline/Makefile
@@ -53,6 +53,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_parse.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_parse_tm.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_check.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += init.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += thread.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += cpu_core_map.c

 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_common_be.c
diff --git a/examples/ip_pipeline/main.c b/examples/ip_pipeline/main.c
index ef68c86..862e2f2 100644
--- a/examples/ip_pipeline/main.c
+++ b/examples/ip_pipeline/main.c
@@ -52,5 +52,11 @@ main(int argc, char **argv)
/* Init */
app_init(&app);

+   /* Run-time */
+   rte_eal_mp_remote_launch(
+   app_thread,
+   (void *) &app,
+   CALL_MASTER);
+
return 0;
 }
diff --git a/examples/ip_pipeline/thread.c b/examples/ip_pipeline/thread.c
new file mode 100644
index 0000000..b2a8656
--- /dev/null
+++ b/examples/ip_pipeline/thread.c
@@ -0,0 +1,110 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+
+#include "pipeline_common_be.h"
+#include "app.h"
+
+int app_thread(void *arg)
+{
+   struct app_params *app = (struct app_params *) arg;
+   uint32_t core_id = rte_lcore_id(), i, j;
+   struct app_thread_data *t = &app->thread_data[core_id];
+   uint32_t n_regular = RTE_MIN(t->n_regular, RTE_DIM(t->regular));
+   uint32_t n_custom = RTE_MIN(t->n_custom, RTE_DIM(t->custom));
+
+   for (i = 0; ; i++) {
+   /* Run regular pipelines */
+   for (j = 0; j < n_regular; j++) {
+   struct app_thread_pipeline_data *data = &t->regular[j];
+   struct pipeline *p = data->be;
+
+   rte_pipeline_run(p->p);
+   }
+
+   /* Run custom pipelines */
+   for (j = 0; j < n_custom; j++) {
+   struct app_thread_pipeline_data *data = &t->custom[j];
+
+   data->f_run(data->be);
+   }
+
+   /* Timer */
+   if ((i & 0xF) == 0) {
+   uint64_t time = rte_get_tsc_cycles();
+   uint64_t t_deadline = UINT64_MAX;
+
+   if (time < t->deadline)
+   continue;
+
+   /* Timer for regular pipelines */
+   for (j = 0; j < n_regular; j++) {
+   struct app_thread_pipeline_data *data =
+   &t->regular[j];
+   uint64_t p_deadline = data->deadline;
+
+   if (p_deadline <= time) {
+   data->f_timer(data->be);
+

[dpdk-dev] [PATCH v4 07/11] ip_pipeline: moved config files to separate folder

2015-07-03 Thread Maciej Gajdzica

Created new folder for config(.cfg) and script(.sh) files.
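
The figures in config/tm_profile.cfg can be cross-checked with a little
arithmetic. The sketch below assumes a 10GbE port, as the comments in
that file state; the function names are hypothetical and nothing here is
part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per second carried by a link of the given bit rate. */
static uint64_t link_bytes_per_sec(uint64_t bits_per_sec)
{
	return bits_per_sec / 8;
}

/* Equal share of a port rate among n_pipes, rounded down. */
static uint64_t pipe_rate(uint64_t port_rate, uint32_t n_pipes)
{
	assert(n_pipes != 0);
	return port_rate / n_pipes;
}

/* Per-frame overhead in bytes: preamble + SFD + FCS + IFG. */
static uint32_t frame_overhead(void)
{
	return 7 + 1 + 4 + 12;
}
```

A 10GbE port gives 1250000000 bytes per second; split across 4096 pipes
that is 305175 bytes per second per pipe, which matches the "tb rate" in
[pipe profile 0], and the overhead sums to the 24 bytes given in [port].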

Signed-off-by: Maciej Gajdzica 
---
 examples/ip_pipeline/config/ip_pipeline.cfg |9 +++
 examples/ip_pipeline/config/ip_pipeline.sh  |5 ++
 examples/ip_pipeline/config/tm_profile.cfg  |  105 +++
 examples/ip_pipeline/ip_pipeline.cfg|   56 --
 examples/ip_pipeline/ip_pipeline.sh |   18 -
 5 files changed, 119 insertions(+), 74 deletions(-)
 create mode 100644 examples/ip_pipeline/config/ip_pipeline.cfg
 create mode 100644 examples/ip_pipeline/config/ip_pipeline.sh
 create mode 100644 examples/ip_pipeline/config/tm_profile.cfg
 delete mode 100644 examples/ip_pipeline/ip_pipeline.cfg
 delete mode 100644 examples/ip_pipeline/ip_pipeline.sh

diff --git a/examples/ip_pipeline/config/ip_pipeline.cfg b/examples/ip_pipeline/config/ip_pipeline.cfg
new file mode 100644
index 0000000..095ed25
--- /dev/null
+++ b/examples/ip_pipeline/config/ip_pipeline.cfg
@@ -0,0 +1,9 @@
+[PIPELINE0]
+type = MASTER
+core = 0
+
+[PIPELINE1]
+type = PASS-THROUGH
+core = 1
+pktq_in = RXQ0.0 RXQ1.0 RXQ2.0 RXQ3.0
+pktq_out = TXQ0.0 TXQ1.0 TXQ2.0 TXQ3.0
diff --git a/examples/ip_pipeline/config/ip_pipeline.sh b/examples/ip_pipeline/config/ip_pipeline.sh
new file mode 100644
index 0000000..4fca259
--- /dev/null
+++ b/examples/ip_pipeline/config/ip_pipeline.sh
@@ -0,0 +1,5 @@
+#
+#run config/ip_pipeline.sh
+#
+
+p 1 ping
diff --git a/examples/ip_pipeline/config/tm_profile.cfg b/examples/ip_pipeline/config/tm_profile.cfg
new file mode 100644
index 0000000..53edb67
--- /dev/null
+++ b/examples/ip_pipeline/config/tm_profile.cfg
@@ -0,0 +1,105 @@
+;   BSD LICENSE
+;
+;   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+;   All rights reserved.
+;
+;   Redistribution and use in source and binary forms, with or without
+;   modification, are permitted provided that the following conditions
+;   are met:
+;
+; * Redistributions of source code must retain the above copyright
+;   notice, this list of conditions and the following disclaimer.
+; * Redistributions in binary form must reproduce the above copyright
+;   notice, this list of conditions and the following disclaimer in
+;   the documentation and/or other materials provided with the
+;   distribution.
+; * Neither the name of Intel Corporation nor the names of its
+;   contributors may be used to endorse or promote products derived
+;   from this software without specific prior written permission.
+;
+;   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+;   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+;   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+;   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+;   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+;   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+;   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+;   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+;   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+;   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+;   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+; This file enables the following hierarchical scheduler configuration for each
+; 10GbE output port:
+;  * Single subport (subport 0):
+;  - Subport rate set to 100% of port rate
+;  - Each of the 4 traffic classes has rate set to 100% of port rate
+;  * 4K pipes per subport 0 (pipes 0 .. 4095) with identical configuration:
+;  - Pipe rate set to 1/4K of port rate
+;  - Each of the 4 traffic classes has rate set to 100% of pipe rate
+;  - Within each traffic class, the byte-level WRR weights for the 4 queues
+; are set to 1:1:1:1
+;
+; For more details, please refer to chapter "Quality of Service (QoS) Framework"
+; of Intel Data Plane Development Kit (Intel DPDK) Programmer's Guide.
+
+; Port configuration
+[port]
+frame overhead = 24 ; frame overhead = Preamble (7) + SFD (1) + FCS (4) + IFG (12)
+mtu = 1522; mtu = Q-in-Q MTU (FCS not included)
+number of subports per port = 1
+number of pipes per subport = 4096
+queue sizes = 64 64 64 64
+
+; Subport configuration
+[subport 0]
+tb rate = 1250000000   ; Bytes per second
+tb size = 100  ; Bytes
+
+tc 0 rate = 1250000000 ; Bytes per second
+tc 1 rate = 1250000000 ; Bytes per second
+tc 2 rate = 1250000000 ; Bytes per second
+tc 3 rate = 1250000000 ; Bytes per second
+tc period = 10 ; Milliseconds
+
+pipe 0-4095 = 0; These pipes are configured with pipe profile 0
+
+; Pipe configuration
+[pipe profile 0]
+tb rate = 305175   ; Bytes per second
+tb size = 100  ; Bytes
+
+tc 0 rate = 30

[dpdk-dev] [PATCH v2] doc/sample_app_ug:add a VXLAN sample guide

2015-07-03 Thread Thomas Monjalon

2015-07-03 08:53, Liu, Yong:
> Acked-by: Marvin Liu 

Please strip the patch when ack'ing.
It will avoid useless scrolling.
Thanks

[dpdk-dev] [PATCH v4 08/11] ip_pipeline: added new implementation of passthrough pipeline

2015-07-03 Thread Maciej Gajdzica

From: Jasvinder Singh 

The passthrough pipeline implementation is split into two files:
pipeline_passthrough.c handles the front-end functions (CLI command
parsing), while pipeline_passthrough_be.c implements the functions
performed by the pipeline (back-end).
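
This patch also adds hash_func.h. The XOR folding used by hash_xor_key8
and hash_xor_key16 there can be exercised on its own with the sketch
below. It is a restatement for illustration, not the header itself: the
names xor_key8, xor_key16 and bucket_index are hypothetical, the unused
key_size argument is dropped, and memcpy replaces the pointer cast so
that unaligned keys are safe, which is a choice of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* XOR the 8-byte key with the seed, then fold the upper 32 bits
 * into the lower 32 bits. */
static uint64_t xor_key8(const void *key, uint64_t seed)
{
	uint64_t k0, xor0;

	memcpy(&k0, key, sizeof(k0));
	xor0 = seed ^ k0;
	return (xor0 >> 32) ^ xor0;
}

/* Same idea for a 16-byte key: XOR both 8-byte words with the seed. */
static uint64_t xor_key16(const void *key, uint64_t seed)
{
	uint64_t k[2], xor0;

	memcpy(k, key, sizeof(k));
	xor0 = (k[0] ^ seed) ^ k[1];
	return (xor0 >> 32) ^ xor0;
}

/* Hypothetical helper: map a hash to one of n_buckets, where
 * n_buckets is assumed to be a power of two. */
static uint32_t bucket_index(uint64_t hash, uint32_t n_buckets)
{
	return (uint32_t)hash & (n_buckets - 1);
}
```

The fold matters because a table typically uses only the low bits of the
hash; without it, keys differing only in their upper 32 bits would land
in the same bucket.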

Signed-off-by: Jasvinder Singh 
---
 examples/ip_pipeline/Makefile  |2 +
 examples/ip_pipeline/init.c|2 +
 examples/ip_pipeline/pipeline/hash_func.h  |  351 +
 .../ip_pipeline/pipeline/pipeline_actions_common.h |  119 +++
 .../ip_pipeline/pipeline/pipeline_passthrough.c|  192 +
 .../ip_pipeline/pipeline/pipeline_passthrough.h|   41 ++
 .../ip_pipeline/pipeline/pipeline_passthrough_be.c |  772 
 .../ip_pipeline/pipeline/pipeline_passthrough_be.h |   41 ++
 8 files changed, 1341 insertions(+), 179 deletions(-)
 create mode 100644 examples/ip_pipeline/pipeline/hash_func.h
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_actions_common.h
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_passthrough.h
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_passthrough_be.c
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_passthrough_be.h

diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
index f255338..930dc61 100644
--- a/examples/ip_pipeline/Makefile
+++ b/examples/ip_pipeline/Makefile
@@ -60,6 +60,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_common_be.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_common_fe.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_master_be.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_master.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_passthrough_be.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_passthrough.c

 CFLAGS += -I$(SRCDIR) -I$(SRCDIR)/pipeline
 CFLAGS += -O3
diff --git a/examples/ip_pipeline/init.c b/examples/ip_pipeline/init.c
index fac2e9a..6cffbc6 100644
--- a/examples/ip_pipeline/init.c
+++ b/examples/ip_pipeline/init.c
@@ -45,6 +45,7 @@
 #include "pipeline.h"
 #include "pipeline_common_fe.h"
 #include "pipeline_master.h"
+#include "pipeline_passthrough.h"

 #define APP_NAME_SIZE  32

@@ -1270,6 +1271,7 @@ int app_init(struct app_params *app)

app_pipeline_common_cmd_push(app);
app_pipeline_type_register(app, &pipeline_master);
+   app_pipeline_type_register(app, &pipeline_passthrough);

app_init_pipelines(app);
app_init_threads(app);
diff --git a/examples/ip_pipeline/pipeline/hash_func.h b/examples/ip_pipeline/pipeline/hash_func.h
new file mode 100644
index 0000000..7846300
--- /dev/null
+++ b/examples/ip_pipeline/pipeline/hash_func.h
@@ -0,0 +1,351 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#ifndef __INCLUDE_HASH_FUNC_H__
+#define __INCLUDE_HASH_FUNC_H__
+
+static inline uint64_t
+hash_xor_key8(void *key, __rte_unused uint32_t key_size, uint64_t seed)
+{
+   uint64_t *k = key;
+   uint64_t xor0;
+
+   xor0 = seed ^ k[0];
+
+   return (xor0 >> 32) ^ xor0;
+}
+
+static inline uint64_t
+hash_xor_key16(void *key, __rte_unused uint32_t key_size, uint64_t seed)
+{
+   uint64_t *k = key;
+   uint64_t xor0;
+
+   xor0 = (k[0] ^ seed) ^ k[1];
+
+   return (xor0 >> 32) ^ xor0;
+}
+
+static inline uint64_t
+hash_xor_key24(void *

[dpdk-dev] [PATCH v4 09/11] ip_pipeline: added new implementation of firewall pipeline

2015-07-03 Thread Maciej Gajdzica

From: Daniel Mrzyglod 

The firewall pipeline implementation is split into two files:
pipeline_firewall.c handles the front-end functions (CLI command
parsing), while pipeline_firewall_be.c implements the functions
performed by the pipeline (back-end).
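
The rule printer in pipeline_firewall.c splits each IPv4 address into
octets with shifts and masks. The sketch below shows the same
decomposition in isolation; ipv4_to_str is a hypothetical helper, and it
assumes the address is held in host byte order with the most significant
byte as the first octet.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format an IPv4 address and the mask value printed after the slash
 * as "a.b.c.d/mask". Returns what snprintf returns. */
static int ipv4_to_str(char *buf, size_t len, uint32_t ip, uint32_t mask)
{
	return snprintf(buf, len, "%u.%u.%u.%u/%u",
		(unsigned)((ip >> 24) & 0xFF),
		(unsigned)((ip >> 16) & 0xFF),
		(unsigned)((ip >> 8) & 0xFF),
		(unsigned)(ip & 0xFF),
		(unsigned)mask);
}
```

For instance, the value 0xC0A80001 with mask 24 is rendered as
192.168.0.1/24.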

Signed-off-by: Daniel Mrzyglod 
---
 examples/ip_pipeline/Makefile  |2 +
 examples/ip_pipeline/init.c|2 +
 examples/ip_pipeline/pipeline/pipeline_firewall.c  | 1162 
 examples/ip_pipeline/pipeline/pipeline_firewall.h  |   63 ++
 .../ip_pipeline/pipeline/pipeline_firewall_be.c|  740 +
 .../ip_pipeline/pipeline/pipeline_firewall_be.h|  138 +++
 6 files changed, 1870 insertions(+), 237 deletions(-)
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_firewall.h
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_firewall_be.c
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_firewall_be.h

diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
index 930dc61..382fee6 100644
--- a/examples/ip_pipeline/Makefile
+++ b/examples/ip_pipeline/Makefile
@@ -62,6 +62,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_master_be.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_master.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_passthrough_be.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_passthrough.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_firewall_be.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_firewall.c

 CFLAGS += -I$(SRCDIR) -I$(SRCDIR)/pipeline
 CFLAGS += -O3
diff --git a/examples/ip_pipeline/init.c b/examples/ip_pipeline/init.c
index 6cffbc6..632d429 100644
--- a/examples/ip_pipeline/init.c
+++ b/examples/ip_pipeline/init.c
@@ -46,6 +46,7 @@
 #include "pipeline_common_fe.h"
 #include "pipeline_master.h"
 #include "pipeline_passthrough.h"
+#include "pipeline_firewall.h"

 #define APP_NAME_SIZE  32

@@ -1272,6 +1273,7 @@ int app_init(struct app_params *app)
app_pipeline_common_cmd_push(app);
app_pipeline_type_register(app, &pipeline_master);
app_pipeline_type_register(app, &pipeline_passthrough);
+   app_pipeline_type_register(app, &pipeline_firewall);

app_init_pipelines(app);
app_init_threads(app);
diff --git a/examples/ip_pipeline/pipeline/pipeline_firewall.c b/examples/ip_pipeline/pipeline/pipeline_firewall.c
index b70260e..40fbdc5 100644
--- a/examples/ip_pipeline/pipeline/pipeline_firewall.c
+++ b/examples/ip_pipeline/pipeline/pipeline_firewall.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -32,282 +32,970 @@
  */

 #include 
-#include 
-#include 
+#include 
+#include 
+#include 

+#include 
+#include 
 #include 
-#include 
-#include 
-#include 
-#include 
-#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "app.h"
+#include "pipeline_common_fe.h"
+#include "pipeline_firewall.h"
+
+struct app_pipeline_firewall_rule {
+   struct pipeline_firewall_key key;
+   int32_t priority;
+   uint32_t port_id;
+   void *entry_ptr;
+
+   TAILQ_ENTRY(app_pipeline_firewall_rule) node;
+};
+
+struct app_pipeline_firewall {
+   /* parameters */
+   uint32_t n_ports_in;
+   uint32_t n_ports_out;
+
+   /* rules */
+   TAILQ_HEAD(, app_pipeline_firewall_rule) rules;
+   uint32_t n_rules;
+   uint32_t default_rule_present;
+   uint32_t default_rule_port_id;
+   void *default_rule_entry_ptr;
+};

-#include 
-#include 
-#include 
+static void
+print_firewall_ipv4_rule(struct app_pipeline_firewall_rule *rule)
+{
+   printf("Prio = %d (SA = %u.%u.%u.%u/%u, "
+   "DA = %u.%u.%u.%u/%u, "
+   "SP = %u-%u, "
+   "DP = %u-%u, "
+   "Proto = %u / 0x%x) => "
+   "Port = %u (entry ptr = %p)\n",
+
+   rule->priority,
+
+   (rule->key.key.ipv4_5tuple.src_ip >> 24) & 0xFF,
+   (rule->key.key.ipv4_5tuple.src_ip >> 16) & 0xFF,
+   (rule->key.key.ipv4_5tuple.src_ip >> 8) & 0xFF,
+   rule->key.key.ipv4_5tuple.src_ip & 0xFF,
+   rule->key.key.ipv4_5tuple.src_ip_mask,
+
+   (rule->key.key.ipv4_5tuple.dst_ip >> 24) & 0xFF,
+   (rule->key.key.ipv4_5tuple.dst_ip >> 16) & 0xFF,
+   (rule->key.key.ipv4_5tuple.dst_ip >> 8) & 0xFF,
+   rule->key.key.ipv4_5tuple.dst_ip & 0xFF,
+   rule->key.key.ipv4_5tuple.dst_ip_mask,
+
+   rule->key.key.ipv4_5tuple.src_port_from,
+   rule->key.key.ipv4_5tuple.src_port_to,
+
+   rule->key.key.ipv4_5tuple.dst_port_from,
+   rule->key.key.ipv4_5tuple.dst_port_to,
+
+   rule->key.key.ipv4_5tuple.

[dpdk-dev] [PATCH v4 10/11] ip_pipeline: added new implementation of routing pipeline

2015-07-03 Thread Maciej Gajdzica

The routing pipeline implementation is split into two files:
pipeline_routing.c handles the front-end functions (CLI command
parsing), while pipeline_routing_be.c implements the functions
performed by the pipeline (back-end).
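
Each pipeline type is hooked into the application through a call such as
app_pipeline_type_register(app, &pipeline_routing) in init.c. One way
such a registry could work is sketched below. Everything in it is
hypothetical (struct layouts, type_register, type_find, the "ROUTING"
name); the structures in the patch set carry more, such as the front-end
CLI commands and further back-end callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PIPELINE_TYPES 8

/* Hypothetical back-end ops: what the pipeline does at run time. */
struct be_ops {
	int (*f_run)(void *state);
};

/* Hypothetical pipeline type: a name plus its back-end ops. */
struct pipeline_type {
	const char *name;
	const struct be_ops *be;
};

struct app {
	const struct pipeline_type *types[MAX_PIPELINE_TYPES];
	uint32_t n_types;
};

/* Look a type up by name, e.g. the "type = ..." value of a
 * configuration section. Returns NULL if it is not registered. */
static const struct pipeline_type *
type_find(const struct app *app, const char *name)
{
	uint32_t i;

	for (i = 0; i < app->n_types; i++)
		if (strcmp(app->types[i]->name, name) == 0)
			return app->types[i];
	return NULL;
}

/* Register a type. Returns 0 on success, -1 if the table is full
 * or the name is already taken. */
static int type_register(struct app *app, const struct pipeline_type *t)
{
	if (app->n_types == MAX_PIPELINE_TYPES)
		return -1;
	if (type_find(app, t->name) != NULL)
		return -1;
	app->types[app->n_types++] = t;
	return 0;
}

/* Demo type used to exercise the registry. */
static int demo_run(void *state) { (void)state; return 0; }
static const struct be_ops demo_be = { demo_run };
static const struct pipeline_type demo_routing = { "ROUTING", &demo_be };
```

Keeping the back-end behind a table of function pointers is what lets
init.c add a new pipeline type with a single registration line.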

Signed-off-by: Pawel Wodkowski 
---
 examples/ip_pipeline/Makefile  |2 +
 examples/ip_pipeline/init.c|2 +
 examples/ip_pipeline/pipeline/pipeline_routing.c   | 1783 
 examples/ip_pipeline/pipeline/pipeline_routing.h   |   99 ++
 .../ip_pipeline/pipeline/pipeline_routing_be.c |  869 ++
 .../ip_pipeline/pipeline/pipeline_routing_be.h |  230 +++
 6 files changed, 2627 insertions(+), 358 deletions(-)
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_routing.h
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_routing_be.c
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_routing_be.h

diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
index 382fee6..a2881a6 100644
--- a/examples/ip_pipeline/Makefile
+++ b/examples/ip_pipeline/Makefile
@@ -64,6 +64,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_passthrough_be.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_passthrough.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_firewall_be.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_firewall.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_routing_be.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_routing.c

 CFLAGS += -I$(SRCDIR) -I$(SRCDIR)/pipeline
 CFLAGS += -O3
diff --git a/examples/ip_pipeline/init.c b/examples/ip_pipeline/init.c
index 632d429..63db23c 100644
--- a/examples/ip_pipeline/init.c
+++ b/examples/ip_pipeline/init.c
@@ -47,6 +47,7 @@
 #include "pipeline_master.h"
 #include "pipeline_passthrough.h"
 #include "pipeline_firewall.h"
+#include "pipeline_routing.h"

 #define APP_NAME_SIZE  32

@@ -1274,6 +1275,7 @@ int app_init(struct app_params *app)
app_pipeline_type_register(app, &pipeline_master);
app_pipeline_type_register(app, &pipeline_passthrough);
app_pipeline_type_register(app, &pipeline_firewall);
+   app_pipeline_type_register(app, &pipeline_routing);

app_init_pipelines(app);
app_init_threads(app);
diff --git a/examples/ip_pipeline/pipeline/pipeline_routing.c 
b/examples/ip_pipeline/pipeline/pipeline_routing.c
index b1ce624..2415aa2 100644
--- a/examples/ip_pipeline/pipeline/pipeline_routing.c
+++ b/examples/ip_pipeline/pipeline/pipeline_routing.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -31,444 +31,1511 @@
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */

-#include 
-#include 
-#include 
+#include 
+#include 
+#include 
+#include 
+#include 

-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
+#include "app.h"
+#include "pipeline_common_fe.h"
+#include "pipeline_routing.h"

-#include 
-#include 
-#include 
-#include 
+struct app_pipeline_routing_route {
+   struct pipeline_routing_route_key key;
+   struct app_pipeline_routing_route_params params;
+   void *entry_ptr;

-#include "main.h"
+   TAILQ_ENTRY(app_pipeline_routing_route) node;
+};

-#include 
+struct app_pipeline_routing_arp_entry {
+   struct pipeline_routing_arp_key key;
+   struct ether_addr macaddr;
+   void *entry_ptr;

-struct app_routing_table_entry {
-   struct rte_pipeline_table_entry head;
-   uint32_t nh_ip;
-   uint32_t nh_iface;
+   TAILQ_ENTRY(app_pipeline_routing_arp_entry) node;
 };

-struct app_arp_table_entry {
-   struct rte_pipeline_table_entry head;
-   struct ether_addr nh_arp;
+struct pipeline_routing {
+   /* Parameters */
+   uint32_t n_ports_in;
+   uint32_t n_ports_out;
+
+   /* Routes */
+   TAILQ_HEAD(, app_pipeline_routing_route) routes;
+   uint32_t n_routes;
+
+   uint32_t default_route_present;
+   uint32_t default_route_port_id;
+   void *default_route_entry_ptr;
+
+   /* ARP entries */
+   TAILQ_HEAD(, app_pipeline_routing_arp_entry) arp_entries;
+   uint32_t n_arp_entries;
+
+   uint32_t default_arp_entry_present;
+   uint32_t default_arp_entry_port_id;
+   void *default_arp_entry_ptr;
 };

-static inline void
-app_routing_table_write_metadata(
-   struct rte_mbuf *pkt,
-   struct app_routing_table_entry *entry)
+static void *
+pipeline_routing_init(struct pipeline_params *params,
+   __rte_unused void *arg)
 {
-   struct app_pkt_metadata *c =
-   (struct app_pkt_metadata *) RTE_MBUF_METADATA_UINT8_PTR(pkt, 0);
+   struct pipeline_routing *p;
+   uint32_t size;
+
+   /* Check input arguments */
+   if ((params == NULL) ||
+   (params->n_por

[dpdk-dev] [PATCH v4 11/11] ip_pipeline: added new implementation of flow classification pipeline

2015-07-03 Thread Maciej Gajdzica

The flow classification pipeline implementation is split into two files.
pipeline_flow_classification.c handles the front-end functions (CLI
command parsing); pipeline_flow_classification_be.c contains the
implementation of the functions performed by the pipeline (back-end).

Signed-off-by: Maciej Gajdzica 
---
 examples/ip_pipeline/Makefile  |2 +
 examples/ip_pipeline/init.c|2 +
 .../pipeline/pipeline_flow_classification.c| 2205 ++--
 .../pipeline/pipeline_flow_classification.h|  105 +
 .../pipeline/pipeline_flow_classification_be.c |  589 ++
 .../pipeline/pipeline_flow_classification_be.h |  140 ++
 6 files changed, 2816 insertions(+), 227 deletions(-)
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_flow_classification.h
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_flow_classification_be.c
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_flow_classification_be.h

diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
index a2881a6..f3ff1ec 100644
--- a/examples/ip_pipeline/Makefile
+++ b/examples/ip_pipeline/Makefile
@@ -64,6 +64,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += 
pipeline_passthrough_be.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_passthrough.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_firewall_be.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_firewall.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_flow_classification_be.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_flow_classification.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_routing_be.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_routing.c

diff --git a/examples/ip_pipeline/init.c b/examples/ip_pipeline/init.c
index 63db23c..431b69e 100644
--- a/examples/ip_pipeline/init.c
+++ b/examples/ip_pipeline/init.c
@@ -47,6 +47,7 @@
 #include "pipeline_master.h"
 #include "pipeline_passthrough.h"
 #include "pipeline_firewall.h"
+#include "pipeline_flow_classification.h"
 #include "pipeline_routing.h"

 #define APP_NAME_SIZE  32
@@ -1274,6 +1275,7 @@ int app_init(struct app_params *app)
app_pipeline_common_cmd_push(app);
app_pipeline_type_register(app, &pipeline_master);
app_pipeline_type_register(app, &pipeline_passthrough);
+   app_pipeline_type_register(app, &pipeline_flow_classification);
app_pipeline_type_register(app, &pipeline_firewall);
app_pipeline_type_register(app, &pipeline_routing);

diff --git a/examples/ip_pipeline/pipeline/pipeline_flow_classification.c 
b/examples/ip_pipeline/pipeline/pipeline_flow_classification.c
index cc0cbf1..c3926fa 100644
--- a/examples/ip_pipeline/pipeline/pipeline_flow_classification.c
+++ b/examples/ip_pipeline/pipeline/pipeline_flow_classification.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -32,275 +32,2026 @@
  */

 #include 
-#include 
-#include 
+#include 
+#include 
+#include 

+#include 
+#include 
 #include 
-#include 
-#include 
-#include 
-#include 
-#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 

-#include 
-#include 
-#include 
+#include "app.h"
+#include "pipeline_common_fe.h"
+#include "pipeline_flow_classification.h"
+#include "hash_func.h"

-#include "main.h"
+/*
+ * Key conversion
+ */
+
+struct pkt_key_qinq {
+   uint16_t ethertype_svlan;
+   uint16_t svlan;
+   uint16_t ethertype_cvlan;
+   uint16_t cvlan;
+} __attribute__((__packed__));
+
+struct pkt_key_ipv4_5tuple {
+   uint8_t ttl;
+   uint8_t proto;
+   uint16_t checksum;
+   uint32_t ip_src;
+   uint32_t ip_dst;
+   uint16_t port_src;
+   uint16_t port_dst;
+} __attribute__((__packed__));
+
+struct pkt_key_ipv6_5tuple {
+   uint16_t payload_length;
+   uint8_t proto;
+   uint8_t hop_limit;
+   uint8_t ip_src[16];
+   uint8_t ip_dst[16];
+   uint16_t port_src;
+   uint16_t port_dst;
+} __attribute__((__packed__));
+
+static int
+app_pipeline_fc_key_convert(struct pipeline_fc_key *key_in,
+   uint8_t *key_out,
+   uint32_t *signature)
+{
+   uint8_t buffer[PIPELINE_FC_FLOW_KEY_MAX_SIZE];
+   void *key_buffer = (key_out) ? key_out : buffer;
+
+   switch (key_in->type) {
+   case FLOW_KEY_QINQ:
+   {
+   struct pkt_key_qinq *qinq = key_buffer;
+
+   qinq->ethertype_svlan = 0;
+   qinq->svlan = rte_bswap16(key_in->key.qinq.svlan);
+   qinq->ethertype_cvlan = 0;
+   qinq->cvlan = rte_bswap16(key_in->key.qinq.cvlan);
+
+   if (signature)
+   *signature = (uint32_t) hash_default_key8(qinq, 8, 0);
+   return 0;
+   }
+
+   case FLOW_KEY_IPV4_5TUPLE:
+   {
+

[dpdk-dev] [PATCH v4 00/11] ip_pipeline: ip_pipeline application enhancements

2015-07-03 Thread Dumitrescu, Cristian



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Maciej Gajdzica
> Sent: Friday, July 3, 2015 9:58 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v4 00/11] ip_pipeline: ip_pipeline application
> enhancements
> 
> 
> Changes in v4:
> - fixed build issue with gcc 5
> - fixed bugs in flow classification and firewall pipelines

Acked-by: Cristian Dumitrescu

[dpdk-dev] [PATCH v7 06/12] eal: Add pci_uio_alloc_resource()

2015-07-03 Thread Bruce Richardson

On Fri, Jul 03, 2015 at 05:52:11PM +0900, Tetsuya Mukawa wrote:
> On 2015/07/02 19:46, Bruce Richardson wrote:
> > On Tue, Jun 30, 2015 at 05:24:22PM +0900, Tetsuya Mukawa wrote:
> >> From: "Tetsuya.Mukawa" 
> >>
> >> This patch adds a new function called pci_uio_alloc_resource().
> >> The function hides how to prepare uio resource in linuxapp and bsdapp.
> >> With the function, pci_uio_map_resource() will be more abstracted.
> >>
> >> Signed-off-by: Tetsuya Mukawa 
> >> ---
> >>  lib/librte_eal/bsdapp/eal/eal_pci.c   | 70 
> >> +++-
> >>  lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 77 
> >> ++-
> >>  2 files changed, 104 insertions(+), 43 deletions(-)
> >>
> >> diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c 
> >> b/lib/librte_eal/bsdapp/eal/eal_pci.c
> >> index 06c564f..7d2f8b5 100644
> >> --- a/lib/librte_eal/bsdapp/eal/eal_pci.c
> >> +++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
> >> @@ -189,28 +189,17 @@ pci_uio_map_secondary(struct rte_pci_device *dev)
> >>return 1;
> >>  }
> >>  
> >> -/* map the PCI resource of a PCI device in virtual memory */
> >>  static int
> >> -pci_uio_map_resource(struct rte_pci_device *dev)
> >> +pci_uio_alloc_resource(struct rte_pci_device *dev,
> >> +  struct mapped_pci_resource **uio_res)
> > Rather than having to pass in a pointer to a pointer, why not change the
> > return type to be "struct mapped_pci_resource *"? The only return values 
> > currently
> > are 0 and -1, so those could map to non-NULL and NULL respectively, for 
> > error
> > checking.
> >
> > /Bruce
> 
> It might be difficult to do like above, because pci_uio_alloc_resource()
> returns 0, -1 and 1 as return value so far.
> 
> Original pci_uio_map_resource() returns negative return value as error,
> and positive value as driver not found.
> So I follow this specification while implementing the function.
> 
> Tetsuya

Ok. Keep it as is, so.

/Bruce
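
For readers following the thread, here is a minimal sketch of the tri-state
convention Tetsuya describes (0 on success, negative on error, positive when
no driver is found). It is an illustration only: struct toy_resource and
toy_alloc_resource() are hypothetical stand-ins, not the functions from the
patch. It shows why the out-parameter is kept: a bare pointer return could
only tell NULL from non-NULL and would lose the "driver not found" case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mapped_pci_resource. */
struct toy_resource {
	int id;
};

/*
 * Tri-state return convention, as described in the thread:
 *   0  -> success, *res points to the resource
 *  <0  -> error
 *  >0  -> driver not found (not an error for the caller)
 * dev_state is an assumption used here only to select the outcome.
 */
static int
toy_alloc_resource(int dev_state, struct toy_resource **res)
{
	static struct toy_resource slot;

	assert(res != NULL);
	*res = NULL;

	if (dev_state < 0)
		return -1;	/* error */
	if (dev_state > 0)
		return 1;	/* driver not found */

	slot.id = 42;
	*res = &slot;
	return 0;
}
```

A caller can then treat a positive value as "skip this device" and a negative
value as a hard failure, which a NULL return alone could not distinguish.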

[dpdk-dev] [PATCH 2/3] doc: added guidelines on dpdk documentation

2015-07-03 Thread Bruce Richardson

On Thu, Jul 02, 2015 at 06:20:39PM +0200, Thomas Monjalon wrote:
> 2015-07-02 14:50, John McNamara:
> > +* Use one sentence per line in a paragraph, i.e., put a newline character
> > +  after each period/full stop.
> 
> What about adding this?
> Only blank line will generate a newline.
> 
> I think breaking lines at end of sentence is more important than
> wrapping at 80 char, because it will help to keep patches
> readable.
>
+1 to this. I believe that one sentence per line should be the default wrapping
policy, and lines should only be wrapped if they are very, very long. The reason
I like this approach is that changes tend to be sentence-based, so having one
sentence per line makes for much cleaner diffs when updating, thereby making
review easier.

/Bruce

[dpdk-dev] [PATCH v7 0/9] Dynamic memzones

2015-07-03 Thread Sergio Gonzalez Monroy

The current implementation allows reserving/creating memzones but not the
opposite (unreserve/free). This affects mempools and other memzone-based objects.

From my point of view, implementing free functionality for memzones would look
like malloc over memsegs.
Thus, this approach moves malloc inside eal (which in turn removes a circular
dependency), where malloc heaps are composed of memsegs.
We keep both malloc and memzone APIs as they are, but memzones allocate their
memory by calling malloc_heap_alloc.
Some extra functionality is required in malloc to allow for boundary constrained
memory requests.
In summary, currently malloc is based on memzones, and with this approach
memzones are based on malloc.
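
As an illustration of what a "boundary constrained" request means, the sketch
below finds the first aligned address at which a buffer of a given length fits
without crossing a multiple of the boundary. It is a simplified, standalone
rendering of the idea and not the code from this series: first_fit_in_boundary()
and align_ceil() are hypothetical names, and both align and bound are assumed
to be non-zero powers of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round v up to the next multiple of align (align is a power of two). */
static uint64_t
align_ceil(uint64_t v, uint64_t align)
{
	return (v + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: return the first address >= base that is aligned
 * to `align` and for which [addr, addr + len) stays below `limit` and
 * does not cross a multiple of `bound`. Returns UINT64_MAX if no such
 * address exists.
 */
static uint64_t
first_fit_in_boundary(uint64_t base, uint64_t limit, uint64_t len,
		uint64_t align, uint64_t bound)
{
	uint64_t bmask, step, start, end;

	assert(align != 0 && (align & (align - 1)) == 0);
	assert(bound != 0 && (bound & (bound - 1)) == 0);

	bmask = ~(bound - 1);
	step = align > bound ? align : bound;
	start = align_ceil(base, align);

	while (start + len <= limit) {
		/* last byte of the buffer (start itself when len == 0) */
		end = start + len - (len != 0);
		/* same boundary window for first and last byte: it fits */
		if ((start & bmask) == (end & bmask))
			return start;
		/* otherwise jump to the start of the next window */
		start = align_ceil(start + 1, step);
	}
	return UINT64_MAX;
}
```

For example, a 0x200-byte request starting at 0x1F00 with a 0x1000 boundary
cannot be placed at 0x1F00, because it would straddle 0x2000, so the search
moves on to 0x2000.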

v7:
 - Create a separated maintainer section for memory allocation

v6:
 - Fix bad patch for rte_memzone_free

v5:
 - Fix rte_memzone_free
 - Improve rte_memzone_free unit test

v4:
 - Rebase and fix couple of merge issues

v3:
 - Create dummy librte_malloc
 - Add deprecation notice
 - Rework some of the code
 - Doc update
 - checkpatch

v2:
 - New rte_memzone_free
 - Support memzone len = 0
 - Add all available memsegs to malloc heap at init
 - Update memzone/malloc unit tests

Sergio Gonzalez Monroy (9):
  eal: move librte_malloc to eal/common
  eal: memzone allocated by malloc
  app/test: update malloc/memzone unit tests
  config: remove CONFIG_RTE_MALLOC_MEMZONE_SIZE
  eal: remove free_memseg and references to it
  eal: new rte_memzone_free
  app/test: rte_memzone_free unit test
  doc: announce ABI change of librte_malloc
  doc: update malloc documentation

 MAINTAINERS   |  22 +-
 app/test/test_malloc.c|  86 
 app/test/test_memzone.c   | 456 --
 config/common_bsdapp  |   8 +-
 config/common_linuxapp|   8 +-
 doc/guides/prog_guide/env_abstraction_layer.rst   | 220 ++-
 doc/guides/prog_guide/img/malloc_heap.png | Bin 81329 -> 80952 bytes
 doc/guides/prog_guide/index.rst   |   1 -
 doc/guides/prog_guide/malloc_lib.rst  | 233 ---
 doc/guides/prog_guide/overview.rst|  11 +-
 doc/guides/rel_notes/abi.rst  |   1 +
 drivers/net/af_packet/Makefile|   1 -
 drivers/net/bonding/Makefile  |   1 -
 drivers/net/e1000/Makefile|   2 +-
 drivers/net/enic/Makefile |   2 +-
 drivers/net/fm10k/Makefile|   2 +-
 drivers/net/i40e/Makefile |   2 +-
 drivers/net/ixgbe/Makefile|   2 +-
 drivers/net/mlx4/Makefile |   1 -
 drivers/net/null/Makefile |   1 -
 drivers/net/pcap/Makefile |   1 -
 drivers/net/virtio/Makefile   |   2 +-
 drivers/net/vmxnet3/Makefile  |   2 +-
 drivers/net/xenvirt/Makefile  |   2 +-
 lib/Makefile  |   2 +-
 lib/librte_acl/Makefile   |   2 +-
 lib/librte_eal/bsdapp/eal/Makefile|   4 +-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map |  19 +
 lib/librte_eal/common/Makefile|   1 +
 lib/librte_eal/common/eal_common_memzone.c| 339 ++--
 lib/librte_eal/common/include/rte_eal_memconfig.h |   5 +-
 lib/librte_eal/common/include/rte_malloc.h| 342 
 lib/librte_eal/common/include/rte_malloc_heap.h   |   3 +-
 lib/librte_eal/common/include/rte_memzone.h   |  11 +
 lib/librte_eal/common/malloc_elem.c   | 344 
 lib/librte_eal/common/malloc_elem.h   | 192 +
 lib/librte_eal/common/malloc_heap.c   | 206 ++
 lib/librte_eal/common/malloc_heap.h   |  70 
 lib/librte_eal/common/rte_malloc.c| 259 
 lib/librte_eal/linuxapp/eal/Makefile  |   4 +-
 lib/librte_eal/linuxapp/eal/eal_ivshmem.c |  17 +-
 lib/librte_eal/linuxapp/eal/rte_eal_version.map   |  19 +
 lib/librte_hash/Makefile  |   2 +-
 lib/librte_lpm/Makefile   |   2 +-
 lib/librte_malloc/Makefile|   6 +-
 lib/librte_malloc/malloc_elem.c   | 320 ---
 lib/librte_malloc/malloc_elem.h   | 190 -
 lib/librte_malloc/malloc_heap.c   | 208 --
 lib/librte_malloc/malloc_heap.h   |  70 
 lib/librte_malloc/rte_malloc.c| 228 +--
 lib/librte_malloc/rte_malloc.h| 342 
 lib/librte_malloc/rte_malloc_version.map  |  16 -
 lib/librte_mempool/Makefile   |   2 -
 lib/librte_port/Makefile  |   1 -
 lib/librt

[dpdk-dev] [PATCH v7 5/9] eal: remove free_memseg and references to it

2015-07-03 Thread Sergio Gonzalez Monroy

Remove free_memseg field from internal mem config structure as it is
not used anymore.
Also remove code in ivshmem that was setting up free_memseg on init.

Signed-off-by: Sergio Gonzalez Monroy 
---
 lib/librte_eal/common/include/rte_eal_memconfig.h | 3 ---
 lib/librte_eal/linuxapp/eal/eal_ivshmem.c | 9 -
 2 files changed, 12 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h 
b/lib/librte_eal/common/include/rte_eal_memconfig.h
index 055212a..7de906b 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -73,9 +73,6 @@ struct rte_mem_config {
struct rte_memseg memseg[RTE_MAX_MEMSEG];/**< Physmem descriptors. 
*/
struct rte_memzone memzone[RTE_MAX_MEMZONE]; /**< Memzone descriptors. 
*/

-   /* Runtime Physmem descriptors - NOT USED */
-   struct rte_memseg free_memseg[RTE_MAX_MEMSEG];
-
struct rte_tailq_head tailq_head[RTE_MAX_TAILQ]; /**< Tailqs for 
objects */

/* Heaps of Malloc per socket */
diff --git a/lib/librte_eal/linuxapp/eal/eal_ivshmem.c 
b/lib/librte_eal/linuxapp/eal/eal_ivshmem.c
index 2deaeb7..facfb80 100644
--- a/lib/librte_eal/linuxapp/eal/eal_ivshmem.c
+++ b/lib/librte_eal/linuxapp/eal/eal_ivshmem.c
@@ -725,15 +725,6 @@ map_all_segments(void)
 * expect memsegs to be empty */
memcpy(&mcfg->memseg[i], &ms,
sizeof(struct rte_memseg));
-   memcpy(&mcfg->free_memseg[i], &ms,
-   sizeof(struct rte_memseg));
-
-
-   /* adjust the free_memseg so that there's no free space left */
-   mcfg->free_memseg[i].ioremap_addr += mcfg->free_memseg[i].len;
-   mcfg->free_memseg[i].phys_addr += mcfg->free_memseg[i].len;
-   mcfg->free_memseg[i].addr_64 += mcfg->free_memseg[i].len;
-   mcfg->free_memseg[i].len = 0;

close(fd);

-- 
1.9.3

[dpdk-dev] [PATCH v7 7/9] app/test: rte_memzone_free unit test

2015-07-03 Thread Sergio Gonzalez Monroy

Add a new unit test for the rte_memzone_free API.

Signed-off-by: Sergio Gonzalez Monroy 
---
 app/test/test_memzone.c | 82 +++--
 1 file changed, 80 insertions(+), 2 deletions(-)

diff --git a/app/test/test_memzone.c b/app/test/test_memzone.c
index 6934eee..c37f950 100644
--- a/app/test/test_memzone.c
+++ b/app/test/test_memzone.c
@@ -432,7 +432,6 @@ test_memzone_reserve_max(void)
printf("Expected size = %zu, actual size = %zu\n", maxlen, 
mz->len);
rte_dump_physmem_layout(stdout);
rte_memzone_dump(stdout);
-
return -1;
}
return 0;
@@ -472,7 +471,6 @@ test_memzone_reserve_max_aligned(void)
maxlen, mz->len);
rte_dump_physmem_layout(stdout);
rte_memzone_dump(stdout);
-
return -1;
}
return 0;
@@ -684,6 +682,82 @@ test_memzone_bounded(void)
 }

 static int
+test_memzone_free(void)
+{
+   const struct rte_memzone *mz[RTE_MAX_MEMZONE];
+   int i;
+   char name[20];
+
+   mz[0] = rte_memzone_reserve("tempzone0", 2000, SOCKET_ID_ANY, 0);
+   mz[1] = rte_memzone_reserve("tempzone1", 4000, SOCKET_ID_ANY, 0);
+
+   if (mz[0] > mz[1])
+   return -1;
+   if (!rte_memzone_lookup("tempzone0"))
+   return -1;
+   if (!rte_memzone_lookup("tempzone1"))
+   return -1;
+
+   if (rte_memzone_free(mz[0])) {
+   printf("Fail memzone free - tempzone0\n");
+   return -1;
+   }
+   if (rte_memzone_lookup("tempzone0")) {
+   printf("Found previously free memzone - tempzone0\n");
+   return -1;
+   }
+   mz[2] = rte_memzone_reserve("tempzone2", 2000, SOCKET_ID_ANY, 0);
+
+   if (mz[2] > mz[1]) {
+   printf("tempzone2 should have gotten the free entry from 
tempzone0\n");
+   return -1;
+   }
+   if (rte_memzone_free(mz[2])) {
+   printf("Fail memzone free - tempzone2\n");
+   return -1;
+   }
+   if (rte_memzone_lookup("tempzone2")) {
+   printf("Found previously free memzone - tempzone2\n");
+   return -1;
+   }
+   if (rte_memzone_free(mz[1])) {
+   printf("Fail memzone free - tempzone1\n");
+   return -1;
+   }
+   if (rte_memzone_lookup("tempzone1")) {
+   printf("Found previously free memzone - tempzone1\n");
+   return -1;
+   }
+
+   i = 0;
+   do {
+   snprintf(name, sizeof(name), "tempzone%u", i);
+   mz[i] = rte_memzone_reserve(name, 1, SOCKET_ID_ANY, 0);
+   } while (mz[i++] != NULL);
+
+   if (rte_memzone_free(mz[0])) {
+   printf("Fail memzone free - tempzone0\n");
+   return -1;
+   }
+   mz[0] = rte_memzone_reserve("tempzone0new", 0, SOCKET_ID_ANY, 0);
+
+   if (mz[0] == NULL) {
+   printf("Fail to create memzone - tempzone0new - when MAX 
memzones were "
+   "created and one was free\n");
+   return -1;
+   }
+
+   for (i = i - 2; i >= 0; i--) {
+   if (rte_memzone_free(mz[i])) {
+   printf("Fail memzone free - tempzone%d\n", i);
+   return -1;
+   }
+   }
+
+   return 0;
+}
+
+static int
 test_memzone(void)
 {
const struct rte_memzone *memzone1;
@@ -763,6 +837,10 @@ test_memzone(void)
if (mz != NULL)
return -1;

+   printf("test free memzone\n");
+   if (test_memzone_free() < 0)
+   return -1;
+
printf("test reserving memzone with bigger size than the maximum\n");
if (test_memzone_reserving_zone_size_bigger_than_the_maximum() < 0)
return -1;
-- 
1.9.3

[dpdk-dev] [PATCH v7 2/9] eal: memzone allocated by malloc

2015-07-03 Thread Sergio Gonzalez Monroy

In the current memory hierarchy, memsegs are groups of physically
contiguous hugepages, memzones are slices of memsegs and malloc further
slices memzones into smaller memory chunks.

This patch modifies malloc so it partitions memsegs instead of memzones.
Thus memzones call malloc internally for memory allocation while
maintaining their ABI.

This makes it possible to free memzones and therefore any other structure
based on memzones, e.g. mempools.

Signed-off-by: Sergio Gonzalez Monroy 
---
 lib/librte_eal/common/eal_common_memzone.c| 274 ++
 lib/librte_eal/common/include/rte_eal_memconfig.h |   2 +-
 lib/librte_eal/common/include/rte_malloc_heap.h   |   3 +-
 lib/librte_eal/common/malloc_elem.c   |  68 --
 lib/librte_eal/common/malloc_elem.h   |  14 +-
 lib/librte_eal/common/malloc_heap.c   | 140 ++-
 lib/librte_eal/common/malloc_heap.h   |   6 +-
 lib/librte_eal/common/rte_malloc.c|   7 +-
 8 files changed, 197 insertions(+), 317 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c 
b/lib/librte_eal/common/eal_common_memzone.c
index aee184a..943012b 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -50,15 +50,15 @@
 #include 
 #include 

+#include "malloc_heap.h"
+#include "malloc_elem.h"
 #include "eal_private.h"

-/* internal copy of free memory segments */
-static struct rte_memseg *free_memseg = NULL;
-
 static inline const struct rte_memzone *
 memzone_lookup_thread_unsafe(const char *name)
 {
const struct rte_mem_config *mcfg;
+   const struct rte_memzone *mz;
unsigned i = 0;

/* get pointer to global configuration */
@@ -68,8 +68,9 @@ memzone_lookup_thread_unsafe(const char *name)
 * the algorithm is not optimal (linear), but there are few
 * zones and this function should be called at init only
 */
-   for (i = 0; i < RTE_MAX_MEMZONE && mcfg->memzone[i].addr != NULL; i++) {
-   if (!strncmp(name, mcfg->memzone[i].name, RTE_MEMZONE_NAMESIZE))
+   for (i = 0; i < RTE_MAX_MEMZONE; i++) {
+   mz = &mcfg->memzone[i];
+   if (mz->addr != NULL && !strncmp(name, mz->name, 
RTE_MEMZONE_NAMESIZE))
return &mcfg->memzone[i];
}

@@ -88,39 +89,45 @@ rte_memzone_reserve(const char *name, size_t len, int 
socket_id,
len, socket_id, flags, RTE_CACHE_LINE_SIZE);
 }

-/*
- * Helper function for memzone_reserve_aligned_thread_unsafe().
- * Calculate address offset from the start of the segment.
- * Align offset in that way that it satisfy istart alignmnet and
- * buffer of the  requested length would not cross specified boundary.
- */
-static inline phys_addr_t
-align_phys_boundary(const struct rte_memseg *ms, size_t len, size_t align,
-   size_t bound)
+/* Find the heap with the greatest free block size */
+static void
+find_heap_max_free_elem(int *s, size_t *len, unsigned align)
 {
-   phys_addr_t addr_offset, bmask, end, start;
-   size_t step;
+   struct rte_mem_config *mcfg;
+   struct rte_malloc_socket_stats stats;
+   unsigned i;

-   step = RTE_MAX(align, bound);
-   bmask = ~((phys_addr_t)bound - 1);
+   /* get pointer to global configuration */
+   mcfg = rte_eal_get_configuration()->mem_config;

-   /* calculate offset to closest alignment */
-   start = RTE_ALIGN_CEIL(ms->phys_addr, align);
-   addr_offset = start - ms->phys_addr;
+   for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
+   malloc_heap_get_stats(&mcfg->malloc_heaps[i], &stats);
+   if (stats.greatest_free_size > *len) {
+   *len = stats.greatest_free_size;
+   *s = i;
+   }
+   }
+   *len -= (MALLOC_ELEM_OVERHEAD + align);
+}

-   while (addr_offset + len < ms->len) {
+/* Find a heap that can allocate the requested size */
+static void
+find_heap_suitable(int *s, size_t len, unsigned align)
+{
+   struct rte_mem_config *mcfg;
+   struct rte_malloc_socket_stats stats;
+   unsigned i;

-   /* check, do we meet boundary condition */
-   end = start + len - (len != 0);
-   if ((start & bmask) == (end & bmask))
-   break;
+   /* get pointer to global configuration */
+   mcfg = rte_eal_get_configuration()->mem_config;

-   /* calculate next offset */
-   start = RTE_ALIGN_CEIL(start + 1, step);
-   addr_offset = start - ms->phys_addr;
+   for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
+   malloc_heap_get_stats(&mcfg->malloc_heaps[i], &stats);
+   if (stats.greatest_free_size >= len + MALLOC_ELEM_OVERHEAD + 
align) {
+   *s = i;
+   break;
+   }
}
-
-   return addr_offset;
 }

 static const

[dpdk-dev] [PATCH v7 4/9] config: remove CONFIG_RTE_MALLOC_MEMZONE_SIZE

2015-07-03 Thread Sergio Gonzalez Monroy

During initialization, malloc adds all available memory to the heaps.

CONFIG_RTE_MALLOC_MEMZONE_SIZE was used to specify the default memory
block size to expand the heap. The option is not used/relevant anymore,
so we remove it.

Signed-off-by: Sergio Gonzalez Monroy 
---
 config/common_bsdapp   | 1 -
 config/common_linuxapp | 1 -
 2 files changed, 2 deletions(-)

diff --git a/config/common_bsdapp b/config/common_bsdapp
index c7262e1..1d89e08 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -103,7 +103,6 @@ CONFIG_RTE_LOG_HISTORY=256
 CONFIG_RTE_EAL_ALLOW_INV_SOCKET_ID=n
 CONFIG_RTE_EAL_ALWAYS_PANIC_ON_ERROR=n
 CONFIG_RTE_MALLOC_DEBUG=n
-CONFIG_RTE_MALLOC_MEMZONE_SIZE=11M

 #
 # FreeBSD contiguous memory driver settings
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 70117e7..5deb029 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -106,7 +106,6 @@ CONFIG_RTE_EAL_ALWAYS_PANIC_ON_ERROR=n
 CONFIG_RTE_EAL_IGB_UIO=y
 CONFIG_RTE_EAL_VFIO=y
 CONFIG_RTE_MALLOC_DEBUG=n
-CONFIG_RTE_MALLOC_MEMZONE_SIZE=11M

 #
 # Special configurations in PCI Config Space for high performance
-- 
1.9.3

[dpdk-dev] [PATCH v7 8/9] doc: announce ABI change of librte_malloc

2015-07-03 Thread Sergio Gonzalez Monroy

Announce the creation of a dummy malloc library for 2.1 and the removal of
that library, now integrated into librte_eal, in the 2.2 release.

Signed-off-by: Sergio Gonzalez Monroy 
---
 doc/guides/rel_notes/abi.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 110c486..50fb6a5 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -12,3 +12,4 @@ Examples of Deprecation Notices

 Deprecation Notices
 ---
+* librte_malloc library has been integrated into librte_eal. The 2.1 release 
creates a dummy/empty malloc library to fulfill binaries with dynamic linking 
dependencies on librte_malloc.so. Such dummy library will not be created from 
release 2.2 so binaries will need to be rebuilt.
-- 
1.9.3

[dpdk-dev] [PATCH v7 3/9] app/test: update malloc/memzone unit tests

2015-07-03 Thread Sergio Gonzalez Monroy

Some unit tests are not relevant anymore. This is the case for the malloc
UTs that checked corner cases when allocating MALLOC_MEMZONE_SIZE
chunks, and for the memzone UTs relying on specific free
memsegs of the reserved memzone.

Other UTs just need to be updated, for example, to calculate the maximum free
block size available.

Signed-off-by: Sergio Gonzalez Monroy 
---
 app/test/test_malloc.c  |  86 --
 app/test/test_memzone.c | 440 
 2 files changed, 35 insertions(+), 491 deletions(-)

diff --git a/app/test/test_malloc.c b/app/test/test_malloc.c
index ea6f651..a04a751 100644
--- a/app/test/test_malloc.c
+++ b/app/test/test_malloc.c
@@ -56,10 +56,6 @@

 #define N 1

-#define QUOTE_(x) #x
-#define QUOTE(x) QUOTE_(x)
-#define MALLOC_MEMZONE_SIZE QUOTE(RTE_MALLOC_MEMZONE_SIZE)
-
 /*
  * Malloc
  * ==
@@ -292,60 +288,6 @@ test_str_to_size(void)
 }

 static int
-test_big_alloc(void)
-{
-   int socket = 0;
-   struct rte_malloc_socket_stats pre_stats, post_stats;
-   size_t size =rte_str_to_size(MALLOC_MEMZONE_SIZE)*2;
-   int align = 0;
-#ifndef RTE_LIBRTE_MALLOC_DEBUG
-   int overhead = RTE_CACHE_LINE_SIZE + RTE_CACHE_LINE_SIZE;
-#else
-   int overhead = RTE_CACHE_LINE_SIZE + RTE_CACHE_LINE_SIZE + 
RTE_CACHE_LINE_SIZE;
-#endif
-
-   rte_malloc_get_socket_stats(socket, &pre_stats);
-
-   void *p1 = rte_malloc_socket("BIG", size , align, socket);
-   if (!p1)
-   return -1;
-   rte_malloc_get_socket_stats(socket,&post_stats);
-
-   /* Check statistics reported are correct */
-   /* Allocation may increase, or may be the same as before big allocation 
*/
-   if (post_stats.heap_totalsz_bytes < pre_stats.heap_totalsz_bytes) {
-   printf("Malloc statistics are incorrect - 
heap_totalsz_bytes\n");
-   return -1;
-   }
-   /* Check that allocated size adds up correctly */
-   if (post_stats.heap_allocsz_bytes !=
-   pre_stats.heap_allocsz_bytes + size + align + overhead) 
{
-   printf("Malloc statistics are incorrect - alloc_size\n");
-   return -1;
-   }
-   /* Check free size against tested allocated size */
-   if (post_stats.heap_freesz_bytes !=
-   post_stats.heap_totalsz_bytes - 
post_stats.heap_allocsz_bytes) {
-   printf("Malloc statistics are incorrect - heap_freesz_bytes\n");
-   return -1;
-   }
-   /* Number of allocated blocks must increase after allocation */
-   if (post_stats.alloc_count != pre_stats.alloc_count + 1) {
-   printf("Malloc statistics are incorrect - alloc_count\n");
-   return -1;
-   }
-   /* New blocks now available - just allocated 1 but also 1 new free */
-   if (post_stats.free_count != pre_stats.free_count &&
-   post_stats.free_count != pre_stats.free_count - 1) {
-   printf("Malloc statistics are incorrect - free_count\n");
-   return -1;
-   }
-
-   rte_free(p1);
-   return 0;
-}
-
-static int
 test_multi_alloc_statistics(void)
 {
int socket = 0;
@@ -399,10 +341,6 @@ test_multi_alloc_statistics(void)
/* After freeing both allocations check stats return to original */
rte_malloc_get_socket_stats(socket, &post_stats);

-   /*
-* Check that no new blocks added after small allocations
-* i.e. < RTE_MALLOC_MEMZONE_SIZE
-*/
if(second_stats.heap_totalsz_bytes != first_stats.heap_totalsz_bytes) {
printf("Incorrect heap statistics: Total size \n");
return -1;
@@ -447,18 +385,6 @@ test_multi_alloc_statistics(void)
 }

 static int
-test_memzone_size_alloc(void)
-{
-   void *p1 = rte_malloc("BIG", 
(size_t)(rte_str_to_size(MALLOC_MEMZONE_SIZE) - 128), 64);
-   if (!p1)
-   return -1;
-   rte_free(p1);
-   /* one extra check - check no crashes if free(NULL) */
-   rte_free(NULL);
-   return 0;
-}
-
-static int
 test_rte_malloc_type_limits(void)
 {
/* The type-limits functionality is not yet implemented,
@@ -935,18 +861,6 @@ test_malloc(void)
}
else printf("test_str_to_size() passed\n");

-   if (test_memzone_size_alloc() < 0){
-   printf("test_memzone_size_alloc() failed\n");
-   return -1;
-   }
-   else printf("test_memzone_size_alloc() passed\n");
-
-   if (test_big_alloc() < 0){
-   printf("test_big_alloc() failed\n");
-   return -1;
-   }
-   else printf("test_big_alloc() passed\n");
-
if (test_zero_aligned_alloc() < 0){
printf("test_zero_aligned_alloc() failed\n");
return -1;
diff --git a/app/test/test_memzone.c b/app/test/test_memzone.c
index 9c7a1cb..6934eee 100644
--- a/app/test/test_memzone.c
+++ b/app/test/test_memzone.c
@@ -44,6 +44,9 @@
 #include 
 #

[dpdk-dev] [PATCH v7 6/9] eal: new rte_memzone_free

2015-07-03 Thread Sergio Gonzalez Monroy

Implement rte_memzone_free which, as its name implies, would free a
memzone.

Currently memzone are tracked in an array and cannot be free.
To be able to reuse the same array to track memzones, we have to
change how we keep track of reserved memzones.

With this patch, any memzone whose addr is NULL is considered unused, so we
also need to change how we look for the next free memzone entry.
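
The tracking scheme described above can be sketched in isolation as follows.
This is only a minimal illustration, not the code from the patch: the names
zone, zone_table, MAX_ZONES and the helper functions are hypothetical
stand-ins, and the locking and the release of the backing memory done by the
real implementation are left out.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ZONES 4             /* hypothetical, stands in for RTE_MAX_MEMZONE */

struct zone {                   /* hypothetical, stands in for struct rte_memzone */
	void *addr;             /* NULL means the slot is unused */
	size_t len;
};

struct zone_table {
	struct zone zones[MAX_ZONES];
	unsigned cnt;           /* number of slots currently in use */
};

/* Return the first unused slot, or NULL if every slot is taken. */
static struct zone *
next_free_zone(struct zone_table *t)
{
	unsigned i;

	for (i = 0; i < MAX_ZONES; i++)
		if (t->zones[i].addr == NULL)
			return &t->zones[i];
	return NULL;
}

/* Claim a slot for the given buffer; NULL if addr is NULL or the table is full. */
static struct zone *
zone_reserve(struct zone_table *t, void *addr, size_t len)
{
	struct zone *z;

	if (addr == NULL)
		return NULL;
	z = next_free_zone(t);
	if (z == NULL)
		return NULL;
	z->addr = addr;
	z->len = len;
	t->cnt++;
	return z;
}

/* Clear a slot so that a later reserve can reuse it; -1 if it was not in use. */
static int
zone_release(struct zone_table *t, struct zone *z)
{
	if (z == NULL || z->addr == NULL)
		return -1;
	memset(z, 0, sizeof(*z));
	t->cnt--;
	return 0;
}
```

With a counter instead of a running index, a released slot in the middle of
the array is found again by the linear scan, at the cost of an O(n) search on
each reservation.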

Signed-off-by: Sergio Gonzalez Monroy 
---
 lib/librte_eal/bsdapp/eal/rte_eal_version.map |  6 ++
 lib/librte_eal/common/eal_common_memzone.c| 67 ++-
 lib/librte_eal/common/include/rte_eal_memconfig.h |  2 +-
 lib/librte_eal/common/include/rte_memzone.h   | 11 
 lib/librte_eal/linuxapp/eal/eal_ivshmem.c |  8 +--
 lib/librte_eal/linuxapp/eal/rte_eal_version.map   |  6 ++
 6 files changed, 92 insertions(+), 8 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map 
b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index 0401be2..7110816 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -105,3 +105,9 @@ DPDK_2.0 {

local: *;
 };
+
+DPDK_2.1 {
+   global:
+
+   rte_memzone_free;
+} DPDK_2.0;
diff --git a/lib/librte_eal/common/eal_common_memzone.c 
b/lib/librte_eal/common/eal_common_memzone.c
index 943012b..5bc4ab4 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -77,6 +77,23 @@ memzone_lookup_thread_unsafe(const char *name)
return NULL;
 }

+static inline struct rte_memzone *
+get_next_free_memzone(void)
+{
+   struct rte_mem_config *mcfg;
+   unsigned i = 0;
+
+   /* get pointer to global configuration */
+   mcfg = rte_eal_get_configuration()->mem_config;
+
+   for (i = 0; i < RTE_MAX_MEMZONE; i++) {
+   if (mcfg->memzone[i].addr == NULL)
+   return &mcfg->memzone[i];
+   }
+
+   return NULL;
+}
+
 /*
  * Return a pointer to a correctly filled memzone descriptor. If the
  * allocation cannot be done, return NULL.
@@ -141,7 +158,7 @@ memzone_reserve_aligned_thread_unsafe(const char *name, 
size_t len,
mcfg = rte_eal_get_configuration()->mem_config;

/* no more room in config */
-   if (mcfg->memzone_idx >= RTE_MAX_MEMZONE) {
+   if (mcfg->memzone_cnt >= RTE_MAX_MEMZONE) {
RTE_LOG(ERR, EAL, "%s(): No more room in config\n", __func__);
rte_errno = ENOSPC;
return NULL;
@@ -215,7 +232,16 @@ memzone_reserve_aligned_thread_unsafe(const char *name, 
size_t len,
const struct malloc_elem *elem = malloc_elem_from_data(mz_addr);

/* fill the zone in config */
-   struct rte_memzone *mz = &mcfg->memzone[mcfg->memzone_idx++];
+   struct rte_memzone *mz = get_next_free_memzone();
+
+   if (mz == NULL) {
+   RTE_LOG(ERR, EAL, "%s(): Cannot find free memzone but there is 
room "
+   "in config!\n", __func__);
+   rte_errno = ENOSPC;
+   return NULL;
+   }
+
+   mcfg->memzone_cnt++;
snprintf(mz->name, sizeof(mz->name), "%s", name);
mz->phys_addr = rte_malloc_virt2phy(mz_addr);
mz->addr = mz_addr;
@@ -291,6 +317,41 @@ rte_memzone_reserve_bounded(const char *name, size_t len,
return mz;
 }

+int
+rte_memzone_free(const struct rte_memzone *mz)
+{
+   struct rte_mem_config *mcfg;
+   int ret = 0;
+   void *addr;
+   unsigned idx;
+
+   if (mz == NULL)
+   return -EINVAL;
+
+   mcfg = rte_eal_get_configuration()->mem_config;
+
+   rte_rwlock_write_lock(&mcfg->mlock);
+
+   idx = ((uintptr_t)mz - (uintptr_t)mcfg->memzone);
+   idx = idx / sizeof(struct rte_memzone);
+
+   addr = mcfg->memzone[idx].addr;
+   if (addr == NULL)
+   ret = -EINVAL;
+   else if (mcfg->memzone_cnt == 0) {
+   rte_panic("%s(): memzone address not NULL but memzone_cnt is 
0!\n",
+   __func__);
+   } else {
+   memset(&mcfg->memzone[idx], 0, sizeof(mcfg->memzone[idx]));
+   mcfg->memzone_cnt--;
+   }
+
+   rte_rwlock_write_unlock(&mcfg->mlock);
+
+   rte_free(addr);
+
+   return ret;
+}

 /*
  * Lookup for the memzone identified by the given name
@@ -364,7 +425,7 @@ rte_eal_memzone_init(void)
rte_rwlock_write_lock(&mcfg->mlock);

/* delete all zones */
-   mcfg->memzone_idx = 0;
+   mcfg->memzone_cnt = 0;
memset(mcfg->memzone, 0, sizeof(mcfg->memzone));

rte_rwlock_write_unlock(&mcfg->mlock);
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h 
b/lib/librte_eal/common/include/rte_eal_memconfig.h
index 7de906b..2b5e0b1 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -67,7 +67,7 @@ struct rte_mem_config {
rte_rwlock_t qlock;   /**< used

[dpdk-dev] [PATCH v7 1/9] eal: move librte_malloc to eal/common

2015-07-03 Thread Sergio Gonzalez Monroy

Move malloc inside eal and create a new section in MAINTAINERS file for
Memory Allocation in EAL.

Create a dummy malloc library to avoid breaking applications that have
librte_malloc in their DT_NEEDED entries.

This is the first step towards using malloc to allocate memory directly
from memsegs. Thus, memzones would allocate memory through malloc,
which allows memzones to be freed.

Signed-off-by: Sergio Gonzalez Monroy 
---
 MAINTAINERS |  22 +-
 config/common_bsdapp|   9 +-
 config/common_linuxapp  |   9 +-
 drivers/net/af_packet/Makefile  |   1 -
 drivers/net/bonding/Makefile|   1 -
 drivers/net/e1000/Makefile  |   2 +-
 drivers/net/enic/Makefile   |   2 +-
 drivers/net/fm10k/Makefile  |   2 +-
 drivers/net/i40e/Makefile   |   2 +-
 drivers/net/ixgbe/Makefile  |   2 +-
 drivers/net/mlx4/Makefile   |   1 -
 drivers/net/null/Makefile   |   1 -
 drivers/net/pcap/Makefile   |   1 -
 drivers/net/virtio/Makefile |   2 +-
 drivers/net/vmxnet3/Makefile|   2 +-
 drivers/net/xenvirt/Makefile|   2 +-
 lib/Makefile|   2 +-
 lib/librte_acl/Makefile |   2 +-
 lib/librte_eal/bsdapp/eal/Makefile  |   4 +-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map   |  13 +
 lib/librte_eal/common/Makefile  |   1 +
 lib/librte_eal/common/include/rte_malloc.h  | 342 
 lib/librte_eal/common/malloc_elem.c | 320 ++
 lib/librte_eal/common/malloc_elem.h | 190 +
 lib/librte_eal/common/malloc_heap.c | 208 ++
 lib/librte_eal/common/malloc_heap.h |  70 +
 lib/librte_eal/common/rte_malloc.c  | 260 ++
 lib/librte_eal/linuxapp/eal/Makefile|   4 +-
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |  13 +
 lib/librte_hash/Makefile|   2 +-
 lib/librte_lpm/Makefile |   2 +-
 lib/librte_malloc/Makefile  |   6 +-
 lib/librte_malloc/malloc_elem.c | 320 --
 lib/librte_malloc/malloc_elem.h | 190 -
 lib/librte_malloc/malloc_heap.c | 208 --
 lib/librte_malloc/malloc_heap.h |  70 -
 lib/librte_malloc/rte_malloc.c  | 228 +---
 lib/librte_malloc/rte_malloc.h  | 342 
 lib/librte_malloc/rte_malloc_version.map|  16 --
 lib/librte_mempool/Makefile |   2 -
 lib/librte_port/Makefile|   1 -
 lib/librte_ring/Makefile|   3 +-
 lib/librte_table/Makefile   |   1 -
 43 files changed, 1455 insertions(+), 1426 deletions(-)
 create mode 100644 lib/librte_eal/common/include/rte_malloc.h
 create mode 100644 lib/librte_eal/common/malloc_elem.c
 create mode 100644 lib/librte_eal/common/malloc_elem.h
 create mode 100644 lib/librte_eal/common/malloc_heap.c
 create mode 100644 lib/librte_eal/common/malloc_heap.h
 create mode 100644 lib/librte_eal/common/rte_malloc.c
 delete mode 100644 lib/librte_malloc/malloc_elem.c
 delete mode 100644 lib/librte_malloc/malloc_elem.h
 delete mode 100644 lib/librte_malloc/malloc_heap.c
 delete mode 100644 lib/librte_malloc/malloc_heap.h
 delete mode 100644 lib/librte_malloc/rte_malloc.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 5476a73..6e69d13 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -83,12 +83,9 @@ F: app/test/test_debug.c
 F: app/test/test_devargs.c
 F: app/test/test_eal*
 F: app/test/test_errno.c
-F: app/test/test_func_reentrancy.c
 F: app/test/test_interrupts.c
 F: app/test/test_logs.c
 F: app/test/test_memcpy*
-F: app/test/test_memory.c
-F: app/test/test_memzone.c
 F: app/test/test_pci.c
 F: app/test/test_per_lcore.c
 F: app/test/test_prefetch.c
@@ -98,6 +95,19 @@ F: app/test/test_string_fns.c
 F: app/test/test_tailq.c
 F: app/test/test_version.c

+Memory Allocation
+M: Sergio Gonzalez Monroy 
+F: lib/librte_eal/common/include/rte_mem*
+F: lib/librte_eal/common/include/rte_malloc.h
+F: lib/librte_eal/common/*malloc*
+F: lib/librte_eal/common/eal_common_mem*
+F: lib/librte_eal/common/eal_hugepages.h
+F: doc/guides/prog_guide/malloc_lib.rst
+F: app/test/test_func_reentrancy.c
+F: app/test/test_malloc.c
+F: app/test/test_memory.c
+F: app/test/test_memzone.c
+
 Secondary process
 K: RTE_PROC_
 F: doc/guides/prog_guide/multi_proc_support.rst
@@ -156,12 +166,6 @@ F: lib/librte_eal/bsdapp/nic_uio/
 Core Libraries
 --

-Dynamic memory
-F: lib/librte_malloc/
-F: doc/guides/prog_guide/malloc_lib.rst
-F: app/test/te

[dpdk-dev] [PATCH v7 9/9] doc: update malloc documentation

2015-07-03 Thread Sergio Gonzalez Monroy

Update malloc documentation to reflect new implementation details.

Signed-off-by: Sergio Gonzalez Monroy 
---
 doc/guides/prog_guide/env_abstraction_layer.rst | 220 +-
 doc/guides/prog_guide/img/malloc_heap.png   | Bin 81329 -> 80952 bytes
 doc/guides/prog_guide/index.rst |   1 -
 doc/guides/prog_guide/malloc_lib.rst| 233 
 doc/guides/prog_guide/overview.rst  |  11 +-
 5 files changed, 221 insertions(+), 244 deletions(-)
 delete mode 100644 doc/guides/prog_guide/malloc_lib.rst

diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst 
b/doc/guides/prog_guide/env_abstraction_layer.rst
index 25eb281..cd4d666 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -116,7 +116,6 @@ The physical address of the reserved memory for that memory 
zone is also returne
 .. note::

 Memory reservations done using the APIs provided by the rte_malloc library 
are also backed by pages from the hugetlbfs filesystem.
-However, physical address information is not available for the blocks of 
memory allocated in this way.

 Xen Dom0 support without hugetbls
 ~
@@ -366,3 +365,222 @@ We expect only 50% of CPU spend on packet IO.
 echo  5 > pkt_io/cpu.cfs_quota_us


+Malloc
+--
+
+The EAL provides a malloc API to allocate any-sized memory.
+
+The objective of this API is to provide malloc-like functions to allow
+allocation from hugepage memory and to facilitate application porting.
+The *DPDK API Reference* manual describes the available functions.
+
+Typically, these kinds of allocations should not be done in data plane
+processing because they are slower than pool-based allocation and make
+use of locks within the allocation and free paths.
+However, they can be used in configuration code.
+
+Refer to the rte_malloc() function description in the *DPDK API Reference*
+manual for more information.
+
+Cookies
+~~~
+
+When CONFIG_RTE_MALLOC_DEBUG is enabled, the allocated memory contains
+overwrite protection fields to help identify buffer overflows.
+
+Alignment and NUMA Constraints
+~~
+
+The rte_malloc() takes an align argument that can be used to request a memory
+area that is aligned on a multiple of this value (which must be a power of 
two).
+
+On systems with NUMA support, a call to the rte_malloc() function will return
+memory that has been allocated on the NUMA socket of the core which made the 
call.
+A set of APIs is also provided, to allow memory to be explicitly allocated on a
+NUMA socket directly, or by allocated on the NUMA socket where another core is
+located, in the case where the memory is to be used by a logical core other 
than
+on the one doing the memory allocation.
+
+Use Cases
+~
+
+This API is meant to be used by an application that requires malloc-like
+functions at initialization time.
+
+For allocating/freeing data at runtime, in the fast-path of an application,
+the memory pool library should be used instead.
+
+Internal Implementation
+~~~
+
+Data Structures
+^^^
+
+There are two data structure types used internally in the malloc library:
+
+*   struct malloc_heap - used to track free space on a per-socket basis
+
+*   struct malloc_elem - the basic element of allocation and free-space
+tracking inside the library.
+
+Structure: malloc_heap
+""
+
+The malloc_heap structure is used to manage free space on a per-socket basis.
+Internally, there is one heap structure per NUMA node, which allows us to
+allocate memory to a thread based on the NUMA node on which this thread runs.
+While this does not guarantee that the memory will be used on that NUMA node,
+it is no worse than a scheme where the memory is always allocated on a fixed
+or random node.
+
+The key fields of the heap structure and their function are described below
+(see also diagram above):
+
+*   lock - the lock field is needed to synchronize access to the heap.
+Given that the free space in the heap is tracked using a linked list,
+we need a lock to prevent two threads manipulating the list at the same 
time.
+
+*   free_head - this points to the first element in the list of free nodes for
+this malloc heap.
+
+.. note::
+
+The malloc_heap structure does not keep track of in-use blocks of memory,
+since these are never touched except when they are to be freed again -
+at which point the pointer to the block is an input to the free() function.
+
+.. _figure_malloc_heap:
+
+.. figure:: img/malloc_heap.*
+
+   Example of a malloc heap and malloc elements within the malloc library
+
+
+.. _malloc_elem:
+
+Structure: malloc_elem
+""
+
+The malloc_elem structure is used as a generic header structure for various
+blocks of memory.
+It is used in three different ways - all shown in the diagram above:
+

[dpdk-dev] Ethernet API - multiple post-RX-burst callbacks' run-order is opposite to their add-order

2015-07-03 Thread Bruce Richardson

On Thu, Jul 02, 2015 at 09:04:48PM +, Sanford, Robert wrote:
> When one adds multiple post-RX-burst callbacks to a queue, their execution 
> order is the opposite of the order in which they are added. For example, we 
> add callback A( ), and then we add callback B( ). When we call 
> rte_eth_rx_burst, after invoking the device's rx_pkt_burst function, it will 
> invoke B( ), and then A( ). The same goes for pre-TX-burst callbacks, too.
> 
> This is counter-intuitive. Shouldn't we either execute the callbacks in the 
> same order that we add them (by changing the internals of the add-APIs), or 
> change the add-APIs to allow one to specify whether a callback is added to 
> the head or tail of the callback list? At the least, we could document the 
> expected behavior.
> Any thoughts on this?
> 
Makes sense. I would agree that having the callbacks called in order of addition
makes more sense. 
Having the order configurable might be useful, but would require an API change,
so I'd only look to change that if it really proves necessary. If the callback
order is consistent and behaves logically (i.e. order of call == order of add),
can the app not ensure the callbacks are added in the correct order?
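
To make the two orderings concrete, here is a minimal sketch. It assumes the
callbacks are kept in a singly linked list, which is an assumption about the
internals and not something checked against the ethdev code; struct cb and the
helper names are hypothetical.

```c
#include <stddef.h>

struct cb {                     /* hypothetical callback node */
	struct cb *next;
	int id;
};

/* Head insertion: the most recently added callback runs first. */
static void
cb_add_head(struct cb **list, struct cb *c)
{
	c->next = *list;
	*list = c;
}

/* Tail insertion: callbacks run in the order in which they were added. */
static void
cb_add_tail(struct cb **list, struct cb *c)
{
	c->next = NULL;
	while (*list != NULL)
		list = &(*list)->next;
	*list = c;
}

/* Walk the list and record the ids in call order; returns how many ran. */
static size_t
cb_run(const struct cb *list, int *out, size_t max)
{
	size_t n = 0;

	for (; list != NULL && n < max; list = list->next)
		out[n++] = list->id;
	return n;
}
```

Adding A then B with cb_add_head() gives the call order B, A, while
cb_add_tail() gives A, B without any change to the add function's signature.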

/Bruce

[dpdk-dev] Ethernet API - multiple post-RX-burst callbacks' run-order is opposite to their add-order

2015-07-03 Thread Thomas Monjalon

2015-07-03 10:57, Bruce Richardson:
> On Thu, Jul 02, 2015 at 09:04:48PM +, Sanford, Robert wrote:
> > When one adds multiple post-RX-burst callbacks to a queue, their execution 
> > order is the opposite of the order in which they are added. For example, we 
> > add callback A( ), and then we add callback B( ). When we call 
> > rte_eth_rx_burst, after invoking the device's rx_pkt_burst function, it 
> > will invoke B( ), and then A( ). The same goes for pre-TX-burst callbacks, 
> > too.
> > 
> > This is counter-intuitive. Shouldn't we either execute the callbacks in the 
> > same order that we add them (by changing the internals of the add-APIs), or 
> > change the add-APIs to allow one to specify whether a callback is added to 
> > the head or tail of the callback list? At the least, we could document the 
> > expected behavior.
> > Any thoughts on this?
> > 
> Makes sense. I would agree that having the callbacks called in order of 
> addition
> makes more sense. 
> Having the order configurable might be useful, but would require an API 
> change,
> so I'd only look to change that if it really proves necessary. If the callback
> order is consistent and behaves logically (i.e. order of call == order of 
> add),
> can the app not ensure the callbacks are added in the correct order?

+1 to improve behaviour without changing the API.
If the applications can manage with a simple DPDK API, it's better for everyone.

[dpdk-dev] [PATCH 2/3] doc: added guidelines on dpdk documentation

2015-07-03 Thread Iremonger, Bernard

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> Sent: Friday, July 3, 2015 10:54 AM
> To: Thomas Monjalon
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/3] doc: added guidelines on dpdk
> documentation
> 
> On Thu, Jul 02, 2015 at 06:20:39PM +0200, Thomas Monjalon wrote:
> > 2015-07-02 14:50, John McNamara:
> > > +* Use one sentence per line in a paragraph, i.e., put a newline
> > > +character
> > > +  after each period/full stop.
> >
> > What about adding this?
> > Only blank line will generate a newline.
> >
> > I think breaking lines at end of sentence is more important than
> > wrapping at 80 char, because it will help to keep patches readable.
> >
> +1 to this. I believe that one sentence per line should be the default wrapping
> policy and only wrap if lines are very, very long. The reason I like this
> approach is
> that changes tend to be sentence-based, and so having a sentence per line
> makes for much cleaner diffs when updating - thereby making review easier.
> 
> /Bruce

Hi Bruce,

I think "very, very long" should be defined.
When I was working on the rst files previously, I wrapped sentences which were
greater than 130 characters.
I wrapped long sentences at punctuation points where possible, rather than at
the character limit.

Regards,

Bernard.

[dpdk-dev] UIO RTE_INTR_MODE_NONE issue.

2015-07-03 Thread Prathap T

Hi:



If INTX fails, igb_uio falls back to running without IRQ (refer to the
implementation in igbuio_pci_probe).

On QEMU 0.12.0, INTX seems to be broken, and the intr_mode falls back to
RTE_INTR_MODE_NONE.

However, this sets udev->info.irq = 0;



Setting udev->info.irq to "0" does not work on *2.6.36 and lower
kernels*, because UIO_IRQ_NONE is defined as



#define UIO_IRQ_NONE   -2



Because udev->info.irq is set to "0", the *2.6.36 and below*
implementation of *__uio_register_device* invokes request_irq



if (idev->info->irq >= 0) {
 ret = request_irq(idev->info->irq, uio_interrupt,

and it fails with the following dump



IRQ handler type mismatch for IRQ 0

current handler: timer

Pid: 3106, comm: dpdk_nic_bind.p Not tainted 2.6.32-504.23.4.el6.x86_64 #1

Call Trace:

[] ? __setup_irq+0x382/0x3c0

[] ? uio_interrupt+0x0/0x48 [uio]

[] ? request_threaded_irq+0x133/0x230

[] ? __uio_register_device+0x553/0x610 [uio]

[] ? igbuio_pci_probe+0x3a7/0x4a0 [igb_uio]

[] ? kobject_get+0x1a/0x30

[] ? local_pci_probe+0x17/0x20

[] ? pci_device_probe+0x101/0x120

[] ? driver_sysfs_add+0x62/0x90

[] ? driver_probe_device+0x9c/0x3e0

[] ? driver_bind+0xca/0x110

[] ? drv_attr_store+0x2c/0x30

[] ? sysfs_write_file+0xe5/0x170

[] ? vfs_write+0xb8/0x1a0

[] ? sys_write+0x51/0x90

[] ? system_call_fastpath+0x16/0x1b

igb_uio :00:03.0: PCI INT A disabled

igb_uio: probe of :00:03.0 failed with error -16



On kernel *2.6.37* and above, the definition is,



#define UIO_IRQ_NONE   0



And the check is,



if (info->irq && (info->irq != UIO_IRQ_CUSTOM)) {
   ret = request_irq(info->irq, uio_interrupt,
 info->irq_flags, info->name, idev);



So, to handle "RTE_INTR_MODE_NONE" across these different kernel versions,

we are proposing the following change to the code in the igbuio_pci_probe
function:



case RTE_INTR_MODE_NONE:

udev->mode = RTE_INTR_MODE_NONE;

#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 37)

 udev->info.irq = -2;

#else

udev->info.irq = 0;

#endif
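
The same selection can be sketched as a plain function that is testable in
user space. This is only an illustration of the proposal: VERSION_CODE and
uio_irq_none_for are hypothetical names, and the 2.6.37 boundary is taken from
the description above rather than re-checked against the kernel history.

```c
/* Same encoding as the kernel's KERNEL_VERSION() macro. */
#define VERSION_CODE(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/*
 * Hypothetical helper: value to store in info.irq so that the UIO core
 * does not call request_irq() on a kernel with the given version code.
 */
static long
uio_irq_none_for(unsigned int version_code)
{
	if (version_code < VERSION_CODE(2, 6, 37))
		return -2;      /* UIO_IRQ_NONE on 2.6.36 and lower, as quoted above */
	return 0;               /* UIO_IRQ_NONE on 2.6.37 and above */
}
```

An alternative that avoids the version check would be to assign UIO_IRQ_NONE
directly, assuming the macro is visible to the module on every kernel that has
to be supported.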



Please let me know your opinion. If it is correct, we will go ahead and
generate a patch for review.



Regards,

Prathap

[dpdk-dev] [PATCH 2/3] doc: added guidelines on dpdk documentation

2015-07-03 Thread Bruce Richardson

On Fri, Jul 03, 2015 at 11:05:16AM +0100, Iremonger, Bernard wrote:
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> > Sent: Friday, July 3, 2015 10:54 AM
> > To: Thomas Monjalon
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 2/3] doc: added guidelines on dpdk
> > documentation
> > 
> > On Thu, Jul 02, 2015 at 06:20:39PM +0200, Thomas Monjalon wrote:
> > > 2015-07-02 14:50, John McNamara:
> > > > +* Use one sentence per line in a paragraph, i.e., put a newline
> > > > +character
> > > > +  after each period/full stop.
> > >
> > > What about adding this?
> > > Only blank line will generate a newline.
> > >
> > > I think breaking lines at end of sentence is more important than
> > > wrapping at 80 char, because it will help to keep patches readable.
> > >
> > +1 to this. I believe that one sentence per line should be the default wrapping
> > policy and only wrap if lines are very, very long. The reason I like this
> > approach is
> > that changes tend to be sentence-based, and so having a sentence per line
> > makes for much cleaner diffs when updating - thereby making review easier.
> > 
> > /Bruce
> 
> Hi Bruce,
> 
> I think "very, very long" should be defined.
Obviously :-)

> When I was working on the rst files previously, I wrapped sentences which
> were greater than 130 characters.
> I wrapped long sentences at punctuation points where possible, rather than at 
> the character limit.
>
Agreed. I would suggest that wrapping at a suitable punctuation point between
100 and 120 chars is best.

/Bruce

[dpdk-dev] [PATCH 2/2] eal: Fix compilation on C++

2015-07-03 Thread Joongi Kim

 * Forward declaration of enum in C++ requires explicit underlying
   type definitions.

 * This fixes the issue at:
   http://dpdk.org/ml/archives/dev/2015-April/017065.html

Signed-off-by: Joongi Kim 
---
 lib/librte_eal/common/include/arch/x86/rte_cpuflags.h |  4 ++--
 lib/librte_eal/common/include/generic/rte_cpuflags.h  | 12 ++--
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/common/include/arch/x86/rte_cpuflags.h 
b/lib/librte_eal/common/include/arch/x86/rte_cpuflags.h
index dd56553..df1834c 100644
--- a/lib/librte_eal/common/include/arch/x86/rte_cpuflags.h
+++ b/lib/librte_eal/common/include/arch/x86/rte_cpuflags.h
@@ -45,7 +45,7 @@ extern "C" {

 #include "generic/rte_cpuflags.h"

-enum rte_cpu_flag_t {
+enum rte_cpu_flag_t __RTE_CPUFLAG_UNDERLYING_TYPE {
/* (EAX 01h) ECX features*/
RTE_CPUFLAG_SSE3 = 0,   /**< SSE3 */
RTE_CPUFLAG_PCLMULQDQ,  /**< PCLMULQDQ */
@@ -150,7 +150,7 @@ enum rte_cpu_flag_t {
RTE_CPUFLAG_NUMFLAGS,   /**< This should always be the 
last! */
 };

-enum cpu_register_t {
+enum cpu_register_t __RTE_REGISTER_UNDERLYING_TYPE {
RTE_REG_EAX = 0,
RTE_REG_EBX,
RTE_REG_ECX,
diff --git a/lib/librte_eal/common/include/generic/rte_cpuflags.h 
b/lib/librte_eal/common/include/generic/rte_cpuflags.h
index 61c4db1..5352cbc 100644
--- a/lib/librte_eal/common/include/generic/rte_cpuflags.h
+++ b/lib/librte_eal/common/include/generic/rte_cpuflags.h
@@ -44,15 +44,23 @@
 #include 
 #include 

+#ifdef __cplusplus
+#define __RTE_CPUFLAG_UNDERLYING_TYPE  : unsigned int
+#define __RTE_REGISTER_UNDERLYING_TYPE : unsigned int
+#else
+#define __RTE_CPUFLAG_UNDERLYING_TYPE
+#define __RTE_REGISTER_UNDERLYING_TYPE
+#endif
+
 /**
  * Enumeration of all CPU features supported
  */
-enum rte_cpu_flag_t;
+enum rte_cpu_flag_t __RTE_CPUFLAG_UNDERLYING_TYPE;

 /**
  * Enumeration of CPU registers
  */
-enum cpu_register_t;
+enum cpu_register_t __RTE_REGISTER_UNDERLYING_TYPE;

 typedef uint32_t cpuid_registers_t[4];

-- 
1.9.1

[dpdk-dev] [PATCH 1/2] lib: Fix pointer arithmetic for C++

2015-07-03 Thread Joongi Kim

 * C++ requires explicit conversion of void pointer types to other
   pointer types.

 * This issue was introduced by previous commits 6cf14ce4 and 7755baae8.
   Two subsequent commits, 2f935c12 and 7621d6a, have the same issue, but
   I did not fix them because they are NOT headers potentially included
   by C++ sources.
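
As a small illustration of the pattern, the sketch below recovers a header
from an object pointer with an explicit cast. PTR_SUB is a hypothetical
stand-in for RTE_PTR_SUB, assumed here to yield a void pointer, and
struct obj_hdr is made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for RTE_PTR_SUB: the result is a void pointer. */
#define PTR_SUB(ptr, x) ((void *)((uintptr_t)(ptr) - (x)))

struct obj_hdr {                /* hypothetical header stored before each object */
	uint32_t magic;
};

/* The object payload starts right after its header. */
static inline void *
obj_from_hdr(struct obj_hdr *hdr)
{
	return (void *)(hdr + 1);
}

/*
 * The explicit cast is redundant in C but required in C++, where a
 * void pointer does not convert implicitly to another pointer type.
 */
static inline struct obj_hdr *
hdr_from_obj(void *obj)
{
	return (struct obj_hdr *)PTR_SUB(obj, sizeof(struct obj_hdr));
}
```

Because the cast is harmless in C, the same inline function can live in a
header that is included from both C and C++ sources.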

Signed-off-by: Joongi Kim 
---
 lib/librte_malloc/malloc_elem.h  | 4 ++--
 lib/librte_mbuf/rte_mbuf.h   | 2 +-
 lib/librte_mempool/rte_mempool.c | 2 +-
 lib/librte_mempool/rte_mempool.h | 4 ++--
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/librte_malloc/malloc_elem.h b/lib/librte_malloc/malloc_elem.h
index 9790b1a..2b18d06 100644
--- a/lib/librte_malloc/malloc_elem.h
+++ b/lib/librte_malloc/malloc_elem.h
@@ -124,10 +124,10 @@ malloc_elem_from_data(const void *data)
if (data == NULL)
return NULL;

-   struct malloc_elem *elem = RTE_PTR_SUB(data, MALLOC_ELEM_HEADER_LEN);
+   struct malloc_elem *elem = (struct malloc_elem *) RTE_PTR_SUB(data, 
MALLOC_ELEM_HEADER_LEN);
if (!malloc_elem_cookies_ok(elem))
return NULL;
-   return elem->state != ELEM_PAD ? elem:  RTE_PTR_SUB(elem, elem->pad);
+   return elem->state != ELEM_PAD ? elem: (struct malloc_elem *) 
RTE_PTR_SUB(elem, elem->pad);
 }

 /*
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 8a2cae1..4fc770b 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -348,7 +348,7 @@ static inline uint16_t rte_pktmbuf_priv_size(struct 
rte_mempool *mp);
 static inline struct rte_mbuf *
 rte_mbuf_from_indirect(struct rte_mbuf *mi)
 {
-   return RTE_PTR_SUB(mi->buf_addr, sizeof(*mi) + mi->priv_size);
+   return (struct rte_mbuf *) RTE_PTR_SUB(mi->buf_addr, sizeof(*mi) + 
mi->priv_size);
 }

 /**
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 02699a1..e22ddb3 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -136,7 +136,7 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, 
uint32_t obj_idx,
obj = (char *)obj + mp->header_size;

/* set mempool ptr in header */
-   hdr = RTE_PTR_SUB(obj, sizeof(*hdr));
+   hdr = (struct rte_mempool_objhdr *) RTE_PTR_SUB(obj, sizeof(*hdr));
hdr->mp = mp;

 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 6d4ce9a..7c966a1 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -262,13 +262,13 @@ struct rte_mempool {
 /* return the header of a mempool object (internal) */
 static inline struct rte_mempool_objhdr *__mempool_get_header(void *obj)
 {
-   return RTE_PTR_SUB(obj, sizeof(struct rte_mempool_objhdr));
+   return (struct rte_mempool_objhdr *) RTE_PTR_SUB(obj, sizeof(struct 
rte_mempool_objhdr));
 }

 /* return the trailer of a mempool object (internal) */
 static inline struct rte_mempool_objtlr *__mempool_get_trailer(void *obj)
 {
-   return RTE_PTR_SUB(obj, sizeof(struct rte_mempool_objtlr));
+   return (struct rte_mempool_objtlr *) RTE_PTR_SUB(obj, sizeof(struct 
rte_mempool_objtlr));
 }

 /**
-- 
1.9.1

[dpdk-dev] [PATCH v3 2/7] ixgbe: add functions to get and reset xstats

2015-07-03 Thread Olivier MATZ

Hi Maryam,

On 06/26/2015 02:59 PM, Maryam Tahhan wrote:
> Implement ixgbe_dev_xstats_reset and ixgbe_dev_xstats_get.
> 
> Signed-off-by: Maryam Tahhan 
> ---
>  drivers/net/ixgbe/ixgbe_ethdev.c | 85 
> 
>  1 file changed, 85 insertions(+)
> 
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c 
> b/drivers/net/ixgbe/ixgbe_ethdev.c
> index 927021f..0f62bcb 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> @@ -131,7 +131,10 @@ static int ixgbe_dev_link_update(struct rte_eth_dev *dev,
>   int wait_to_complete);
>  static void ixgbe_dev_stats_get(struct rte_eth_dev *dev,
>   struct rte_eth_stats *stats);
> +static int ixgbe_dev_xstats_get(struct rte_eth_dev *dev,
> + struct rte_eth_xstats *xstats, unsigned n);
>  static void ixgbe_dev_stats_reset(struct rte_eth_dev *dev);
> +static void ixgbe_dev_xstats_reset(struct rte_eth_dev *dev);
>  static int ixgbe_dev_queue_stats_mapping_set(struct rte_eth_dev *eth_dev,
>uint16_t queue_id,
>uint8_t stat_idx,
> @@ -334,7 +337,9 @@ static const struct eth_dev_ops ixgbe_eth_dev_ops = {
>   .allmulticast_disable = ixgbe_dev_allmulticast_disable,
>   .link_update  = ixgbe_dev_link_update,
>   .stats_get= ixgbe_dev_stats_get,
> + .xstats_get   = ixgbe_dev_xstats_get,
>   .stats_reset  = ixgbe_dev_stats_reset,
> + .xstats_reset = ixgbe_dev_xstats_reset,
>   .queue_stats_mapping_set = ixgbe_dev_queue_stats_mapping_set,
>   .dev_infos_get= ixgbe_dev_info_get,
>   .mtu_set  = ixgbe_dev_mtu_set,
> @@ -414,6 +419,33 @@ static const struct eth_dev_ops ixgbevf_eth_dev_ops = {
>   .set_mc_addr_list = ixgbe_dev_set_mc_addr_list,
>  };
>  
> +/* store statistics names and its offset in stats structure  */ struct
> +rte_ixgbe_xstats_name_off {
> + char name[RTE_ETH_XSTATS_NAME_SIZE];
> + unsigned offset;
> +};
> +
> +static const struct rte_ixgbe_xstats_name_off rte_ixgbe_stats_strings[] = {
> + {"rx_illegal_byte_err", offsetof(struct ixgbe_hw_stats, errbc)},
> + {"rx_len_err", offsetof(struct ixgbe_hw_stats, rlec)},
> + {"rx_undersize_count", offsetof(struct ixgbe_hw_stats, ruc)},
> + {"rx_oversize_count", offsetof(struct ixgbe_hw_stats, roc)},
> + {"rx_fragment_count", offsetof(struct ixgbe_hw_stats, rfc)},
> + {"rx_jabber_count", offsetof(struct ixgbe_hw_stats, rjc)},
> + {"l3_l4_xsum_error", offsetof(struct ixgbe_hw_stats, xec)},
> + {"mac_local_fault", offsetof(struct ixgbe_hw_stats, mlfc)},
> + {"mac_remote_fault", offsetof(struct ixgbe_hw_stats, mrfc)},
> + {"mac_short_pkt_discard", offsetof(struct ixgbe_hw_stats, mspdc)},
> + {"fccrc_error", offsetof(struct ixgbe_hw_stats, fccrc)},
> + {"fcoe_drop", offsetof(struct ixgbe_hw_stats, fcoerpdc)},
> + {"fc_last_error", offsetof(struct ixgbe_hw_stats, fclast)},
> + {"rx_broadcast_packets", offsetof(struct ixgbe_hw_stats, bprc)},
> + {"mgmt_pkts_dropped", offsetof(struct ixgbe_hw_stats, mngpdc)},
> +};
> +
> +#define RTE_NB_XSTATS (sizeof(rte_ixgbe_stats_strings) / \
> + sizeof(rte_ixgbe_stats_strings[0]))
> +

Maybe RTE_NB_XSTATS should be renamed to IXGBE_NB_XSTATS?


>  /**
>   * Atomically reads the link status information from global
>   * structure rte_eth_dev.
> @@ -1982,6 +2014,59 @@ ixgbe_dev_stats_reset(struct rte_eth_dev *dev)
>   memset(stats, 0, sizeof(*stats));
>  }
>  
> +static int
> +ixgbe_dev_xstats_get(struct rte_eth_dev *dev, struct rte_eth_xstats *xstats,
> +  unsigned n)
> +{
> + struct ixgbe_hw *hw =
> + IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> + struct ixgbe_hw_stats *hw_stats =
> + IXGBE_DEV_PRIVATE_TO_STATS(dev->data->dev_private);
> + uint64_t total_missed_rx, total_qbrc, total_qprc, total_qprdc;
> + uint64_t rxnfgpc, txdgpc;
> + unsigned i, count = RTE_NB_XSTATS;
> +
> + if (n == 0)
> + return count;

I think it does not exactly match the API described in rte_ethdev.h:

 * @return
 *   - positive value lower or equal to n: success. The return value
 * is the number of entries filled in the stats table.
 *   - positive value higher than n: error, the given statistics table
 * is too small. The return value corresponds to the size that should
 * be given to succeed. The entries in the table are not valid and
 * shall not be used by the caller.
 *   - negative value on error (invalid port id)

I think it should be:

  if (n < count)
    return count;
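
To illustrate the return convention quoted above, here is a minimal standalone
sketch. It does not use the real driver structures: struct xstat, NAME_SIZE,
xstat_names and xstats_get are hypothetical, and the hardware counters are
passed in as a plain array.

```c
#include <stdint.h>
#include <stdio.h>

#define NAME_SIZE 64            /* hypothetical name length */

struct xstat {                  /* hypothetical stand-in for struct rte_eth_xstats */
	char name[NAME_SIZE];
	uint64_t value;
};

static const char *const xstat_names[] = {
	"rx_len_err", "rx_undersize_count", "rx_oversize_count",
};

#define NB_XSTATS (sizeof(xstat_names) / sizeof(xstat_names[0]))

/*
 * Fill 'out' with NB_XSTATS entries read from 'hw'.
 * If the table is too small (n < NB_XSTATS), nothing is written and the
 * required size is returned, so the caller can retry with a larger table.
 */
static int
xstats_get(const uint64_t *hw, struct xstat *out, unsigned n)
{
	unsigned i;

	if (n < NB_XSTATS)
		return (int)NB_XSTATS;

	for (i = 0; i < NB_XSTATS; i++) {
		snprintf(out[i].name, sizeof(out[i].name), "%s", xstat_names[i]);
		out[i].value = hw[i];
	}
	return (int)NB_XSTATS;
}
```

With this check, a first call with n equal to 0 still reports the required
size, so the n == 0 test becomes a special case of the general one.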


> +
> + total_missed_rx = 0;
> + total_qbrc = 0;
> + total_qprc = 0;
> + total_qprdc = 0;
> + rxnfgpc = 0;
> + txdgpc = 0;
> + count = 0;
> +
> + ixgbe_read_s

[dpdk-dev] [PATCH v3 2/7] ixgbe: add functions to get and reset xstats

2015-07-03 Thread Olivier MATZ



On 07/03/2015 03:16 PM, Olivier MATZ wrote:
> Hi Maryam,
> 
> On 06/26/2015 02:59 PM, Maryam Tahhan wrote:
>> Implement ixgbe_dev_xstats_reset and ixgbe_dev_xstats_get.
>>
>> Signed-off-by: Maryam Tahhan 
>> ---
>>  drivers/net/ixgbe/ixgbe_ethdev.c | 85 
>> 
>>  1 file changed, 85 insertions(+)
>>
>> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c 
>> b/drivers/net/ixgbe/ixgbe_ethdev.c
>> index 927021f..0f62bcb 100644
>> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
>> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
>> @@ -131,7 +131,10 @@ static int ixgbe_dev_link_update(struct rte_eth_dev 
>> *dev,
>>  int wait_to_complete);
>>  static void ixgbe_dev_stats_get(struct rte_eth_dev *dev,
>>  struct rte_eth_stats *stats);
>> +static int ixgbe_dev_xstats_get(struct rte_eth_dev *dev,
>> +struct rte_eth_xstats *xstats, unsigned n);
>>  static void ixgbe_dev_stats_reset(struct rte_eth_dev *dev);
>> +static void ixgbe_dev_xstats_reset(struct rte_eth_dev *dev);
>>  static int ixgbe_dev_queue_stats_mapping_set(struct rte_eth_dev *eth_dev,
>>   uint16_t queue_id,
>>   uint8_t stat_idx,
>> @@ -334,7 +337,9 @@ static const struct eth_dev_ops ixgbe_eth_dev_ops = {
>>  .allmulticast_disable = ixgbe_dev_allmulticast_disable,
>>  .link_update  = ixgbe_dev_link_update,
>>  .stats_get= ixgbe_dev_stats_get,
>> +.xstats_get   = ixgbe_dev_xstats_get,
>>  .stats_reset  = ixgbe_dev_stats_reset,
>> +.xstats_reset = ixgbe_dev_xstats_reset,
>>  .queue_stats_mapping_set = ixgbe_dev_queue_stats_mapping_set,
>>  .dev_infos_get= ixgbe_dev_info_get,
>>  .mtu_set  = ixgbe_dev_mtu_set,
>> @@ -414,6 +419,33 @@ static const struct eth_dev_ops ixgbevf_eth_dev_ops = {
>>  .set_mc_addr_list = ixgbe_dev_set_mc_addr_list,
>>  };
>>  
>> +/* store statistics names and its offset in stats structure  */ struct
>> +rte_ixgbe_xstats_name_off {
>> +char name[RTE_ETH_XSTATS_NAME_SIZE];
>> +unsigned offset;
>> +};
>> +
>> +static const struct rte_ixgbe_xstats_name_off rte_ixgbe_stats_strings[] = {
>> +{"rx_illegal_byte_err", offsetof(struct ixgbe_hw_stats, errbc)},
>> +{"rx_len_err", offsetof(struct ixgbe_hw_stats, rlec)},
>> +{"rx_undersize_count", offsetof(struct ixgbe_hw_stats, ruc)},
>> +{"rx_oversize_count", offsetof(struct ixgbe_hw_stats, roc)},
>> +{"rx_fragment_count", offsetof(struct ixgbe_hw_stats, rfc)},
>> +{"rx_jabber_count", offsetof(struct ixgbe_hw_stats, rjc)},
>> +{"l3_l4_xsum_error", offsetof(struct ixgbe_hw_stats, xec)},
>> +{"mac_local_fault", offsetof(struct ixgbe_hw_stats, mlfc)},
>> +{"mac_remote_fault", offsetof(struct ixgbe_hw_stats, mrfc)},
>> +{"mac_short_pkt_discard", offsetof(struct ixgbe_hw_stats, mspdc)},
>> +{"fccrc_error", offsetof(struct ixgbe_hw_stats, fccrc)},
>> +{"fcoe_drop", offsetof(struct ixgbe_hw_stats, fcoerpdc)},
>> +{"fc_last_error", offsetof(struct ixgbe_hw_stats, fclast)},
>> +{"rx_broadcast_packets", offsetof(struct ixgbe_hw_stats, bprc)},
>> +{"mgmt_pkts_dropped", offsetof(struct ixgbe_hw_stats, mngpdc)},
>> +};
>> +
>> +#define RTE_NB_XSTATS (sizeof(rte_ixgbe_stats_strings) /\
>> +sizeof(rte_ixgbe_stats_strings[0]))
>> +
> 
> Maybe RTE_NB_XSTATS should be renamed in IXGBE_NB_XSTATS?
> 
> 
>>  /**
>>   * Atomically reads the link status information from global
>>   * structure rte_eth_dev.
>> @@ -1982,6 +2014,59 @@ ixgbe_dev_stats_reset(struct rte_eth_dev *dev)
>>  memset(stats, 0, sizeof(*stats));
>>  }
>>  
>> +static int
>> +ixgbe_dev_xstats_get(struct rte_eth_dev *dev, struct rte_eth_xstats *xstats,
>> + unsigned n)
>> +{
>> +struct ixgbe_hw *hw =
>> +IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
>> +struct ixgbe_hw_stats *hw_stats =
>> +IXGBE_DEV_PRIVATE_TO_STATS(dev->data->dev_private);
>> +uint64_t total_missed_rx, total_qbrc, total_qprc, total_qprdc;
>> +uint64_t rxnfgpc, txdgpc;
>> +unsigned i, count = RTE_NB_XSTATS;
>> +
>> +if (n == 0)
>> +return count;
> 
> I think it does not exactly matches the API described in rte_ethdev.h:
> 
>  * @return
>  *   - positive value lower or equal to n: success. The return value
>  * is the number of entries filled in the stats table.
>  *   - positive value higher than n: error, the given statistics table
>  * is too small. The return value corresponds to the size that should
>  * be given to succeed. The entries in the table are not valid and
>  * shall not be used by the caller.
>  *   - negative value on error (invalid port id)
> 
> I think it should be:
> 
>   if (n < count)
> return count
> 
> 
>> +
>> +total_missed_rx = 0;
>> +total_qbr

[dpdk-dev] [PATCH v3 3/7] ethdev: expose extended error stats

2015-07-03 Thread Olivier MATZ

Hi Maryam,

On 06/26/2015 02:59 PM, Maryam Tahhan wrote:
> Extend rte_eth_xstats_get to retrieve additional stats from the device
> driver as well the ethdev generic stats.
> 
> Signed-off-by: Maryam Tahhan 
> ---
>  drivers/net/ixgbe/ixgbe_ethdev.c |  2 +-
>  lib/librte_ether/rte_ethdev.c| 20 ++--
>  2 files changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c 
> b/drivers/net/ixgbe/ixgbe_ethdev.c
> index 0f62bcb..099e464 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> @@ -2035,7 +2035,7 @@ ixgbe_dev_xstats_get(struct rte_eth_dev *dev, struct 
> rte_eth_xstats *xstats,
>   total_qprdc = 0;
>   rxnfgpc = 0;
>   txdgpc = 0;
> - count = 0;
> + count = n;
>  
>   ixgbe_read_stats_registers(hw, hw_stats, &total_missed_rx, &total_qbrc,
>  &total_qprc, 
> &rxnfgpc, &txdgpc, &total_qprdc);
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 02cd07f..6451621 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -1741,7 +1741,7 @@ rte_eth_xstats_get(uint8_t port_id, struct 
> rte_eth_xstats *xstats,
>  {
>   struct rte_eth_stats eth_stats;
>   struct rte_eth_dev *dev;
> - unsigned count, i, q;
> + unsigned count = 0, xcount = 0, i, q;
>   uint64_t val, *stats_ptr;
>  
>   if (!rte_eth_dev_is_valid_port(port_id)) {
> @@ -1751,14 +1751,18 @@ rte_eth_xstats_get(uint8_t port_id, struct 
> rte_eth_xstats *xstats,
>  
>   dev = &rte_eth_devices[port_id];
>  
> - /* implemented by the driver */
> - if (dev->dev_ops->xstats_get != NULL)
> - return (*dev->dev_ops->xstats_get)(dev, xstats, n);
> -
> - /* else, return generic statistics */
> + /* Return generic statistics */
>   count = RTE_NB_STATS;
>   count += dev->data->nb_rx_queues * RTE_NB_RXQ_STATS;
>   count += dev->data->nb_tx_queues * RTE_NB_TXQ_STATS;
> +
> + /* implemented by the driver */
> + if (dev->dev_ops->xstats_get != NULL) {
> + /* Retrieve the additional count size */
> + xcount = (*dev->dev_ops->xstats_get)(dev, xstats, 0);
> + count += xcount;
> + }
> +
>   if (n < count)
>   return count;
>  
> @@ -1805,6 +1809,10 @@ rte_eth_xstats_get(uint8_t port_id, struct 
> rte_eth_xstats *xstats,
>   }
>   }
>  
> + /* Display stats after the generic stats*/
> + if (dev->dev_ops->xstats_get != NULL)
> + (*dev->dev_ops->xstats_get)(dev, xstats, count);
> +
>   return count;
>  }
>  

I think we can avoid having 2 calls to dev->dev_ops->xstats_get.

The first call can be removed.
The second call could be as below:

    if (dev->dev_ops->xstats_get != NULL) {
        ret = (*dev->dev_ops->xstats_get)(dev, &xstats[count],
                                          n - count);
        if (ret < 0)
            return ret;
        return ret + count;
    }

- if the driver returns -1, the error code is propagated
- if the driver returns a positive value, it is added to "count",
  which is the number of generic stats. It can be higher than "n"
  if the xstats table is too small
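
For illustration only, the single-call idea can be sketched in a
self-contained form as below. Everything here is hypothetical (struct xstat,
NB_GENERIC, the demo_* functions); it is not the rte_ethdev.c code. One
choice goes beyond the snippet above and is my assumption: when the table
cannot even hold the generic stats, the driver is still asked for its size
(with n = 0) so that the caller gets the full required size in one go.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct xstat { uint64_t value; };	/* hypothetical, names omitted */

typedef int (*xstats_get_t)(struct xstat *xstats, unsigned n);

#define NB_GENERIC 2	/* hypothetical number of generic stats */

/* Hypothetical driver exporting 3 extra stats with values 100, 101, 102. */
static int
demo_driver_xstats_get(struct xstat *xstats, unsigned n)
{
	const unsigned count = 3;
	unsigned i;

	if (n < count)
		return (int)count;
	for (i = 0; i < count; i++)
		xstats[i].value = 100 + i;
	return (int)count;
}

/* Hypothetical driver that always reports an error. */
static int
demo_failing_xstats_get(struct xstat *xstats, unsigned n)
{
	(void)xstats;
	(void)n;
	return -1;
}

static int
demo_eth_xstats_get(xstats_get_t drv_get, struct xstat *xstats, unsigned n)
{
	unsigned i, count = NB_GENERIC;
	/* room left for the driver after the generic stats */
	unsigned room = (n > count) ? n - count : 0;
	int ret;

	if (n >= count) {
		assert(xstats != NULL);
		for (i = 0; i < count; i++)
			xstats[i].value = i;	/* placeholder generic values */
	}

	if (drv_get != NULL) {
		/* single driver call, writing after the generic entries */
		ret = drv_get(room ? &xstats[count] : NULL, room);
		if (ret < 0)
			return ret;		/* propagate the driver error */
		return ret + (int)count;	/* may be higher than n */
	}
	return (int)count;
}
```

With these stand-ins, a table of 4 entries yields 5 (too small), a table of
5 entries yields 5 with the driver values stored at index 2 to 4, and a
failing driver yields -1.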

Regards,
Olivier

[dpdk-dev] [PATCH v2] mempool: improve cache search

2015-07-03 Thread Olivier MATZ



On 07/01/2015 11:03 AM, Zoltan Kiss wrote:
> The current way has a few problems:
> 
> - if cache->len < n, we copy our elements into the cache first, then
>   into obj_table, that's unnecessary
> - if n >= cache_size (or the backfill fails), and we can't fulfil the
>   request from the ring alone, we don't try to combine with the cache
> - if refill fails, we don't return anything, even if the ring has enough
>   for our request
> 
> This patch rewrites it severely:
> - at the first part of the function we only try the cache if cache->len < n
> - otherwise take our elements straight from the ring
> - if that fails but we have something in the cache, try to combine them
> - the refill happens at the end, and its failure doesn't modify our return
>   value
> 
> Signed-off-by: Zoltan Kiss 


Acked-by: Olivier Matz

[dpdk-dev] [PATCH v2] mempool: improve cache search

2015-07-03 Thread Olivier MATZ



On 07/03/2015 03:32 PM, Olivier MATZ wrote:
> 
> 
> On 07/01/2015 11:03 AM, Zoltan Kiss wrote:
>> The current way has a few problems:
>>
>> - if cache->len < n, we copy our elements into the cache first, then
>>   into obj_table, that's unnecessary
>> - if n >= cache_size (or the backfill fails), and we can't fulfil the
>>   request from the ring alone, we don't try to combine with the cache
>> - if refill fails, we don't return anything, even if the ring has enough
>>   for our request
>>
>> This patch rewrites it severely:
>> - at the first part of the function we only try the cache if cache->len < n
>> - otherwise take our elements straight from the ring
>> - if that fails but we have something in the cache, try to combine them
>> - the refill happens at the end, and its failure doesn't modify our return
>>   value
>>
>> Signed-off-by: Zoltan Kiss 
> 
> 
> Acked-by: Olivier Matz 
> 

Please ignore, sorry, I missed Konstantin's relevant comment.

[dpdk-dev] [PATCH v6 1/7] i40e: changes to support PCI Port Hotplug

2015-07-03 Thread Bernard Iremonger

This patch depends on the Port Hotplug Framework.
It implements the eth_dev_uninit functions for rte_i40e_pmd and
rte_i40evf_pmd.

Signed-off-by: Bernard Iremonger 
---
 drivers/net/i40e/i40e_ethdev.c|   68 -
 drivers/net/i40e/i40e_ethdev_vf.c |   45 -
 drivers/net/i40e/i40e_pf.c|   34 ++
 drivers/net/i40e/i40e_pf.h|1 +
 4 files changed, 146 insertions(+), 2 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 2ada502..449785b 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -107,6 +107,7 @@
(1UL << RTE_ETH_FLOW_L2_PAYLOAD))

 static int eth_i40e_dev_init(struct rte_eth_dev *eth_dev);
+static int eth_i40e_dev_uninit(struct rte_eth_dev *eth_dev);
 static int i40e_dev_configure(struct rte_eth_dev *dev);
 static int i40e_dev_start(struct rte_eth_dev *dev);
 static void i40e_dev_stop(struct rte_eth_dev *dev);
@@ -268,9 +269,11 @@ static struct eth_driver rte_i40e_pmd = {
.pci_drv = {
.name = "rte_i40e_pmd",
.id_table = pci_id_i40e_map,
-   .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+   .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
+   RTE_PCI_DRV_DETACHABLE,
},
.eth_dev_init = eth_i40e_dev_init,
+   .eth_dev_uninit = eth_i40e_dev_uninit,
.dev_private_size = sizeof(struct i40e_adapter),
 };

@@ -405,6 +408,7 @@ eth_i40e_dev_init(struct rte_eth_dev *dev)
hw->subsystem_device_id = pci_dev->id.subsystem_device_id;
hw->bus.device = pci_dev->addr.devid;
hw->bus.func = pci_dev->addr.function;
+   hw->adapter_stopped = 0;

/* Make sure all is clean before doing PF reset */
i40e_clear_hw(hw);
@@ -584,6 +588,65 @@ err_get_capabilities:
 }

 static int
+eth_i40e_dev_uninit(struct rte_eth_dev *dev)
+{
+   struct rte_pci_device *pci_dev;
+   struct i40e_hw *hw;
+   struct i40e_filter_control_settings settings;
+   int ret;
+   uint8_t aq_fail = 0;
+
+   PMD_INIT_FUNC_TRACE();
+
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return 0;
+
+   hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   pci_dev = dev->pci_dev;
+
+   if (hw->adapter_stopped == 0)
+   i40e_dev_close(dev);
+
+   dev->dev_ops = NULL;
+   dev->rx_pkt_burst = NULL;
+   dev->tx_pkt_burst = NULL;
+
+   /* Disable LLDP */
+   ret = i40e_aq_stop_lldp(hw, true, NULL);
+   if (ret != I40E_SUCCESS) /* Its failure can be ignored */
+   PMD_INIT_LOG(INFO, "Failed to stop lldp");
+
+   /* Clear PXE mode */
+   i40e_clear_pxe_mode(hw);
+
+   /* Unconfigure filter control */
+   memset(&settings, 0, sizeof(settings));
+   ret = i40e_set_filter_control(hw, &settings);
+   if (ret)
+   PMD_INIT_LOG(WARNING, "setup_pf_filter_control failed: %d",
+   ret);
+
+   /* Disable flow control */
+   hw->fc.requested_mode = I40E_FC_NONE;
+   i40e_set_fc(hw, &aq_fail, TRUE);
+
+   /* uninitialize pf host driver */
+   i40e_pf_host_uninit(dev);
+
+   rte_free(dev->data->mac_addrs);
+   dev->data->mac_addrs = NULL;
+
+   /* disable uio intr before callback unregister */
+   rte_intr_disable(&(pci_dev->intr_handle));
+
+   /* unregister callback func from eal lib */
+   rte_intr_callback_unregister(&(pci_dev->intr_handle),
+   i40e_dev_interrupt_handler, (void *)dev);
+
+   return 0;
+}
+
+static int
 i40e_dev_configure(struct rte_eth_dev *dev)
 {
struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private);
@@ -858,6 +921,8 @@ i40e_dev_start(struct rte_eth_dev *dev)
struct i40e_vsi *main_vsi = pf->main_vsi;
int ret, i;

+   hw->adapter_stopped = 0;
+
if ((dev->data->dev_conf.link_duplex != ETH_LINK_AUTONEG_DUPLEX) &&
(dev->data->dev_conf.link_duplex != ETH_LINK_FULL_DUPLEX)) {
PMD_INIT_LOG(ERR, "Invalid link_duplex (%hu) for port %hhu",
@@ -965,6 +1030,7 @@ i40e_dev_close(struct rte_eth_dev *dev)
PMD_INIT_FUNC_TRACE();

i40e_dev_stop(dev);
+   hw->adapter_stopped = 1;

/* Disable interrupt */
i40e_pf_disable_irq0(hw);
diff --git a/drivers/net/i40e/i40e_ethdev_vf.c 
b/drivers/net/i40e/i40e_ethdev_vf.c
index f7332e7..fb12d63 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -1146,6 +1146,22 @@ err:
 }

 static int
+i40evf_uninit_vf(struct rte_eth_dev *dev)
+{
+   struct i40e_vf *vf = I40EVF_DEV_PRIVATE_TO_VF(dev->data->dev_private);
+   struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   PMD_INIT_FUNC_TRACE();
+
+   if (hw->adapter_stopped == 0)
+   i40evf_dev_close(dev);
+

[dpdk-dev] [PATCH v6 2/7] i40e: release vmdq vsi's in dev_close

2015-07-03 Thread Bernard Iremonger

Signed-off-by: Bernard Iremonger 
---
 drivers/net/i40e/i40e_ethdev.c |9 +
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 449785b..bc1bab2 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -1026,6 +1026,7 @@ i40e_dev_close(struct rte_eth_dev *dev)
struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private);
struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
uint32_t reg;
+   int i;

PMD_INIT_FUNC_TRACE();

@@ -1043,6 +1044,14 @@ i40e_dev_close(struct rte_eth_dev *dev)
i40e_fdir_teardown(pf);
i40e_vsi_release(pf->main_vsi);

+   for (i = 0; i < pf->nb_cfg_vmdq_vsi; i++) {
+   i40e_vsi_release(pf->vmdq[i].vsi);
+   pf->vmdq[i].vsi = NULL;
+   }
+
+   rte_free(pf->vmdq);
+   pf->vmdq = NULL;
+
/* shutdown the adminq */
i40e_aq_queue_shutdown(hw, true);
i40e_shutdown_adminq(hw);
-- 
1.7.4.1

[dpdk-dev] [PATCH v6 5/7] i40e: clear queues in i40evf_dev_stop

2015-07-03 Thread Bernard Iremonger

Signed-off-by: Bernard Iremonger 
---
 drivers/net/i40e/i40e_ethdev_vf.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev_vf.c 
b/drivers/net/i40e/i40e_ethdev_vf.c
index b2c32d6..d2b86a7 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -1620,6 +1620,7 @@ i40evf_dev_stop(struct rte_eth_dev *dev)

i40evf_disable_queues_intr(hw);
i40evf_stop_queues(dev);
+   i40e_dev_clear_queues(dev);
 }

 static int
-- 
1.7.4.1

[dpdk-dev] [PATCH v6 4/7] i40e: call _clear_cmd() when error occurs

2015-07-03 Thread Bernard Iremonger

_clear_cmd() was not being called in failure situations,
resulting in the next command also failing.
Fix several typos.

Signed-off-by: Bernard Iremonger 
---
 drivers/net/i40e/i40e_ethdev_vf.c |   11 +++
 1 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev_vf.c 
b/drivers/net/i40e/i40e_ethdev_vf.c
index f0142ad..b2c32d6 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -361,6 +361,7 @@ i40evf_execute_vf_cmd(struct rte_eth_dev *dev, struct 
vf_cmd_info *args)
 args->in_args, args->in_args_size, NULL);
if (err) {
PMD_DRV_LOG(ERR, "fail to send cmd %d", args->ops);
+   _clear_cmd(vf);
return err;
}

@@ -368,8 +369,10 @@ i40evf_execute_vf_cmd(struct rte_eth_dev *dev, struct 
vf_cmd_info *args)
/* read message and it's expected one */
if (!err && args->ops == info.ops)
_clear_cmd(vf);
-   else if (err)
+   else if (err) {
PMD_DRV_LOG(ERR, "Failed to read message from AdminQ");
+   _clear_cmd(vf);
+   }
else if (args->ops != info.ops)
PMD_DRV_LOG(ERR, "command mismatch, expect %u, get %u",
args->ops, info.ops);
@@ -794,7 +797,7 @@ i40evf_stop_queues(struct rte_eth_dev *dev)
/* Stop TX queues first */
for (i = 0; i < dev->data->nb_tx_queues; i++) {
if (i40evf_dev_tx_queue_stop(dev, i) != 0) {
-   PMD_DRV_LOG(ERR, "Fail to start queue %u", i);
+   PMD_DRV_LOG(ERR, "Fail to stop queue %u", i);
return -1;
}
}
@@ -802,7 +805,7 @@ i40evf_stop_queues(struct rte_eth_dev *dev)
/* Then stop RX queues */
for (i = 0; i < dev->data->nb_rx_queues; i++) {
if (i40evf_dev_rx_queue_stop(dev, i) != 0) {
-   PMD_DRV_LOG(ERR, "Fail to start queue %u", i);
+   PMD_DRV_LOG(ERR, "Fail to stop queue %u", i);
return -1;
}
}
@@ -1431,7 +1434,7 @@ i40evf_dev_tx_queue_stop(struct rte_eth_dev *dev, 
uint16_t tx_queue_id)
err = i40evf_switch_queue(dev, FALSE, tx_queue_id, FALSE);

if (err) {
-   PMD_DRV_LOG(ERR, "Failed to switch TX queue %u of",
+   PMD_DRV_LOG(ERR, "Failed to switch TX queue %u off",
tx_queue_id);
return err;
}
-- 
1.7.4.1
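
As a rough illustration of why the missing _clear_cmd() calls made the next
command fail, here is a small self-contained sketch of a "one pending command
at a time" slot. The names (struct demo_vf, demo_set_cmd(), demo_clear_cmd(),
demo_execute_cmd()) and the return codes are hypothetical and only model the
behaviour described in the commit message above; this is not the i40evf code.

```c
#include <assert.h>
#include <stdbool.h>

enum { CMD_NONE = 0 };

struct demo_vf {
	int pending_cmd;	/* CMD_NONE when no command is in flight */
};

/* Claim the slot; fails if a previous command was never cleared. */
static int
demo_set_cmd(struct demo_vf *vf, int op)
{
	if (vf->pending_cmd != CMD_NONE)
		return -1;
	vf->pending_cmd = op;
	return 0;
}

static void
demo_clear_cmd(struct demo_vf *vf)
{
	vf->pending_cmd = CMD_NONE;
}

/*
 * send_ok simulates the result of sending the command.
 * clear_on_error selects the behaviour after (true) or before (false)
 * the fix: without it the slot stays busy after a failed send.
 */
static int
demo_execute_cmd(struct demo_vf *vf, int op, bool send_ok, bool clear_on_error)
{
	assert(op != CMD_NONE);

	if (demo_set_cmd(vf, op) != 0)
		return -1;		/* slot still busy */

	if (!send_ok) {
		if (clear_on_error)
			demo_clear_cmd(vf);
		return -2;		/* send failed */
	}

	demo_clear_cmd(vf);
	return 0;
}
```

Without the clear on the error path, a failed send leaves the slot occupied
and the following command is rejected even though nothing is in flight.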

[dpdk-dev] [PATCH v6 6/7] i40e: check rxq parameter in i40e_reset_rx_queue

2015-07-03 Thread Bernard Iremonger

There is a segmentation fault in i40e_reset_rx_queue() if rxq is NULL.
Log a debug message and return early in that case.

Signed-off-by: Bernard Iremonger 
---
 drivers/net/i40e/i40e_rxtx.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 2de0ac4..2a89e84 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -2269,6 +2269,11 @@ i40e_reset_rx_queue(struct i40e_rx_queue *rxq)
unsigned i;
uint16_t len;

+   if (!rxq) {
+   PMD_DRV_LOG(DEBUG, "Pointer to rxq is NULL");
+   return;
+   }
+
 #ifdef RTE_LIBRTE_I40E_RX_ALLOW_BULK_ALLOC
if (check_rx_burst_bulk_alloc_preconditions(rxq) == 0)
len = (uint16_t)(rxq->nb_rx_desc + RTE_PMD_I40E_RX_MAX_BURST);
-- 
1.7.4.1

[dpdk-dev] [PATCH v6 0/7] i40e: PCI Port Hotplug Changes

2015-07-03 Thread Bernard Iremonger

Changes in V6:
Rebased to latest code.
Removed release of rx and tx queues from uninit() functions.
added patch 7, add function i40e_dev_free_queues() and call from close() 
functions.

Changes in V5:
Increased timeout in i40evf_wait_cmd_done()
Set nb_rx_queues and nb_tx_queues to 0 in uninit functions.
Rebased to latest i40e code.

Changes in V4:
added patch 6 to fix segmentation fault reported by Michael Qiu.
Rebase to latest i40e code.

Changes in V3:
Release rx and tx queues in vf uninit function.
Rebase to use latest i40e code.

Changes in V2:
Rebase to use drivers/net/i40e directory.


Bernard Iremonger (7):
  i40e: changes to support PCI Port Hotplug
  i40e: release vmdq vsi's in dev_close
  i40e: increase ASQ_DELAY_MS to 100 and MAX_TRY_TIMES to 20 in
i40evf_wait_cmd_done()
  i40e: call _clear_cmd() when error occurs
  i40e: clear queues in i40evf_dev_stop
  i40e: check rxq parameter in i40e_reset_rx_queue
  i40e: release queue memory in close functions

 drivers/net/i40e/i40e_ethdev.c|   78 -
 drivers/net/i40e/i40e_ethdev_vf.c |   62 ++---
 drivers/net/i40e/i40e_pf.c|   34 
 drivers/net/i40e/i40e_pf.h|1 +
 drivers/net/i40e/i40e_rxtx.c  |   25 
 drivers/net/i40e/i40e_rxtx.h  |1 +
 6 files changed, 193 insertions(+), 8 deletions(-)

-- 
1.7.4.1

[dpdk-dev] [PATCH v6 3/7] i40e: increase ASQ_DELAY_MS to 100 and MAX_TRY_TIMES to 20 in i40evf_wait_cmd_done()

2015-07-03 Thread Bernard Iremonger

Increase the maximum wait from 50 ms * 10 tries to 100 ms * 20 tries
to avoid i40evf_read_pfmsg() failures.

Signed-off-by: Bernard Iremonger 
---
 drivers/net/i40e/i40e_ethdev_vf.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev_vf.c 
b/drivers/net/i40e/i40e_ethdev_vf.c
index fb12d63..f0142ad 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -298,8 +298,8 @@ i40evf_wait_cmd_done(struct rte_eth_dev *dev,
int i = 0;
enum i40evf_aq_result ret;

-#define MAX_TRY_TIMES 10
-#define ASQ_DELAY_MS  50
+#define MAX_TRY_TIMES 20
+#define ASQ_DELAY_MS  100
do {
/* Delay some time first */
rte_delay_ms(ASQ_DELAY_MS);
-- 
1.7.4.1
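
To put numbers on the change above, here is a minimal sketch of a
delay-then-poll loop. It is a simplification under stated assumptions:
demo_wait_done() and demo_ready_after() are hypothetical, the loop runs
exactly max_tries times (the real loop condition may differ), and the delay
is only accumulated in a counter instead of calling rte_delay_ms().

```c
#include <assert.h>
#include <stdbool.h>

typedef bool (*poll_fn)(void *ctx);

/*
 * Delay first, then poll, up to max_tries times.
 * Returns 0 when the poll succeeds, -1 on timeout.
 * *waited_ms receives the total simulated delay.
 */
static int
demo_wait_done(poll_fn poll, void *ctx, unsigned max_tries,
	       unsigned delay_ms, unsigned *waited_ms)
{
	unsigned i = 0;

	assert(max_tries > 0);
	*waited_ms = 0;
	do {
		*waited_ms += delay_ms;	/* stands in for the real delay */
		if (poll(ctx))
			return 0;
	} while (++i < max_tries);

	return -1;
}

/* Hypothetical poll: fails *remaining times, then succeeds. */
static bool
demo_ready_after(void *ctx)
{
	unsigned *remaining = (unsigned *)ctx;

	if (*remaining == 0)
		return true;
	(*remaining)--;
	return false;
}
```

Under these assumptions the old values give up after 10 * 50 = 500 ms, while
the new ones allow up to 20 * 100 = 2000 ms, so a reply that only arrives on
the 15th poll is now picked up after 1500 ms instead of being reported as a
failure.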

[dpdk-dev] [PATCH v6 7/7] i40e: release queue memory in close functions

2015-07-03 Thread Bernard Iremonger

Add the i40e_dev_free_queues() function and call it from the close() functions.

Signed-off-by: Bernard Iremonger 
---
 drivers/net/i40e/i40e_ethdev.c|1 +
 drivers/net/i40e/i40e_ethdev_vf.c |1 +
 drivers/net/i40e/i40e_rxtx.c  |   20 
 drivers/net/i40e/i40e_rxtx.h  |1 +
 4 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index bc1bab2..cdb5576 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -1032,6 +1032,7 @@ i40e_dev_close(struct rte_eth_dev *dev)

i40e_dev_stop(dev);
hw->adapter_stopped = 1;
+   i40e_dev_free_queues(dev);

/* Disable interrupt */
i40e_pf_disable_irq0(hw);
diff --git a/drivers/net/i40e/i40e_ethdev_vf.c 
b/drivers/net/i40e/i40e_ethdev_vf.c
index d2b86a7..41c0b93 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -1756,6 +1756,7 @@ i40evf_dev_close(struct rte_eth_dev *dev)

i40evf_dev_stop(dev);
hw->adapter_stopped = 1;
+   i40e_dev_free_queues(dev);
i40evf_reset_vf(hw);
i40e_shutdown_adminq(hw);
 }
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 2a89e84..6c11c75 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -2588,6 +2588,26 @@ i40e_dev_clear_queues(struct rte_eth_dev *dev)
}
 }

+void
+i40e_dev_free_queues(struct rte_eth_dev *dev)
+{
+   uint16_t i;
+
+   PMD_INIT_FUNC_TRACE();
+
+   for (i = 0; i < dev->data->nb_rx_queues; i++) {
+   i40e_dev_rx_queue_release(dev->data->rx_queues[i]);
+   dev->data->rx_queues[i] = NULL;
+   }
+   dev->data->nb_rx_queues = 0;
+
+   for (i = 0; i < dev->data->nb_tx_queues; i++) {
+   i40e_dev_tx_queue_release(dev->data->tx_queues[i]);
+   dev->data->tx_queues[i] = NULL;
+   }
+   dev->data->nb_tx_queues = 0;
+}
+
 #define I40E_FDIR_NUM_TX_DESC  I40E_MIN_RING_DESC
 #define I40E_FDIR_NUM_RX_DESC  I40E_MIN_RING_DESC

diff --git a/drivers/net/i40e/i40e_rxtx.h b/drivers/net/i40e/i40e_rxtx.h
index decbc3d..224ebb3 100644
--- a/drivers/net/i40e/i40e_rxtx.h
+++ b/drivers/net/i40e/i40e_rxtx.h
@@ -198,6 +198,7 @@ int i40e_rx_queue_init(struct i40e_rx_queue *rxq);
 void i40e_free_tx_resources(struct i40e_tx_queue *txq);
 void i40e_free_rx_resources(struct i40e_rx_queue *rxq);
 void i40e_dev_clear_queues(struct rte_eth_dev *dev);
+void i40e_dev_free_queues(struct rte_eth_dev *dev);
 void i40e_reset_rx_queue(struct i40e_rx_queue *rxq);
 void i40e_reset_tx_queue(struct i40e_tx_queue *txq);
 void i40e_tx_queue_release_mbufs(struct i40e_tx_queue *txq);
-- 
1.7.4.1
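
The release pattern used by i40e_dev_free_queues() above (release each
queue, clear the pointer, then zero the count) can be illustrated with the
sketch below. struct demo_dev_data and the demo_* functions are hypothetical
and only handle RX queues; free() stands in for the driver's queue release
function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct demo_dev_data {
	void **rx_queues;	/* array of queue pointers */
	uint16_t nb_rx_queues;	/* number of valid entries */
};

/* Stand-in for a per-queue release function. */
static void
demo_queue_release(void *q)
{
	free(q);
}

/*
 * Release every queue, clear its slot and reset the count.
 * Clearing the pointers and the count makes a second call a no-op,
 * so close() followed by another close() does not free twice.
 */
static void
demo_dev_free_queues(struct demo_dev_data *data)
{
	uint16_t i;

	assert(data != NULL);
	for (i = 0; i < data->nb_rx_queues; i++) {
		demo_queue_release(data->rx_queues[i]);
		data->rx_queues[i] = NULL;
	}
	data->nb_rx_queues = 0;
}
```

Calling demo_dev_free_queues() twice on the same structure is harmless in
this sketch, because the second call sees a count of 0.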

[dpdk-dev] [PATCH v7 0/2] e1000: PCI Port Hotplug changes

2015-07-03 Thread Bernard Iremonger

Changes in v7:
Rebase to latest code.
Remove freeing of queue memory from uninit functions
Add patch 2, free queue memory in close functions

Changes in v6:
Set nb_rx_queues and nb_tx_queues to 0 in uninit functions.
Rebase to latest code.

Changes in v5:
Moved adapter stopped boolean to struct e1000_adapter.
Rebase to latest code.

Changes in v4:
Release rx and tx queues in eth_igbvf_dev_uninit.

Changes in v3:
Add igb_adapter_stopped and em_adapter_stopped booleans.
Release rx and tx queues in eth_igb_dev_uninit.

Changes in v2:
Call dev_close() from  dev_uninit() functions.
Remove input parameter checking from dev_uninit() functions.


Bernard Iremonger (2):
  e1000: igb and em1000 PCI Port Hotplug changes
  e1000: free queue memory in close functions

 drivers/net/e1000/e1000_ethdev.h |   10 -
 drivers/net/e1000/em_ethdev.c|   47 ++-
 drivers/net/e1000/em_rxtx.c  |   18 +++
 drivers/net/e1000/igb_ethdev.c   |   96 --
 drivers/net/e1000/igb_pf.c   |   22 +
 drivers/net/e1000/igb_rxtx.c |   18 +++
 6 files changed, 205 insertions(+), 6 deletions(-)

-- 
1.7.4.1

[dpdk-dev] [PATCH v7 1/2] e1000: igb and em1000 PCI Port Hotplug changes

2015-07-03 Thread Bernard Iremonger

This patch depends on the Port Hotplug Framework.
It implements the eth_dev_uninit functions for rte_em_pmd,
rte_igb_pmd and rte_igbvf_pmd.

Signed-off-by: Bernard Iremonger 
---
 drivers/net/e1000/e1000_ethdev.h |8 +++-
 drivers/net/e1000/em_ethdev.c|   46 ++-
 drivers/net/e1000/em_rxtx.c  |   12 +
 drivers/net/e1000/igb_ethdev.c   |   93 -
 drivers/net/e1000/igb_pf.c   |   22 +
 drivers/net/e1000/igb_rxtx.c |   12 +
 6 files changed, 188 insertions(+), 5 deletions(-)

diff --git a/drivers/net/e1000/e1000_ethdev.h b/drivers/net/e1000/e1000_ethdev.h
index c451faa..ee8b872 100644
--- a/drivers/net/e1000/e1000_ethdev.h
+++ b/drivers/net/e1000/e1000_ethdev.h
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -229,8 +229,12 @@ struct e1000_adapter {
struct e1000_vfta   shadow_vfta;
struct e1000_vf_info*vfdata;
struct e1000_filter_info filter;
+   bool stopped;
 };

+#define E1000_DEV_PRIVATE(adapter) \
+   ((struct e1000_adapter *)adapter)
+
 #define E1000_DEV_PRIVATE_TO_HW(adapter) \
(&((struct e1000_adapter *)adapter)->hw)

@@ -337,4 +341,6 @@ uint16_t eth_em_recv_pkts(void *rx_queue, struct rte_mbuf 
**rx_pkts,
 uint16_t eth_em_recv_scattered_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
uint16_t nb_pkts);

+void igb_pf_host_uninit(struct rte_eth_dev *dev);
+
 #endif /* _E1000_ETHDEV_H_ */
diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index a306c55..dfabb15 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -224,6 +224,8 @@ static int
 eth_em_dev_init(struct rte_eth_dev *eth_dev)
 {
struct rte_pci_device *pci_dev;
+   struct e1000_adapter *adapter =
+   E1000_DEV_PRIVATE(eth_dev->data->dev_private);
struct e1000_hw *hw =
E1000_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
struct e1000_vfta * shadow_vfta =
@@ -246,6 +248,7 @@ eth_em_dev_init(struct rte_eth_dev *eth_dev)

hw->hw_addr = (void *)pci_dev->mem_resource[0].addr;
hw->device_id = pci_dev->id.device_id;
+   adapter->stopped = 0;

/* For ICH8 support we'll need to map the flash memory BAR */

@@ -285,13 +288,47 @@ eth_em_dev_init(struct rte_eth_dev *eth_dev)
return (0);
 }

+static int
+eth_em_dev_uninit(struct rte_eth_dev *eth_dev)
+{
+   struct rte_pci_device *pci_dev;
+   struct e1000_adapter *adapter =
+   E1000_DEV_PRIVATE(eth_dev->data->dev_private);
+
+   PMD_INIT_FUNC_TRACE();
+
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return -EPERM;
+
+   pci_dev = eth_dev->pci_dev;
+
+   if (adapter->stopped == 0)
+   eth_em_close(eth_dev);
+
+   eth_dev->dev_ops = NULL;
+   eth_dev->rx_pkt_burst = NULL;
+   eth_dev->tx_pkt_burst = NULL;
+
+   rte_free(eth_dev->data->mac_addrs);
+   eth_dev->data->mac_addrs = NULL;
+
+   /* disable uio intr before callback unregister */
+   rte_intr_disable(&(pci_dev->intr_handle));
+   rte_intr_callback_unregister(&(pci_dev->intr_handle),
+   eth_em_interrupt_handler, (void *)eth_dev);
+
+   return 0;
+}
+
 static struct eth_driver rte_em_pmd = {
.pci_drv = {
.name = "rte_em_pmd",
.id_table = pci_id_em_map,
-   .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+   .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
+   RTE_PCI_DRV_DETACHABLE,
},
.eth_dev_init = eth_em_dev_init,
+   .eth_dev_uninit = eth_em_dev_uninit,
.dev_private_size = sizeof(struct e1000_adapter),
 };

@@ -451,6 +488,8 @@ em_set_pba(struct e1000_hw *hw)
 static int
 eth_em_start(struct rte_eth_dev *dev)
 {
+   struct e1000_adapter *adapter =
+   E1000_DEV_PRIVATE(dev->data->dev_private);
struct e1000_hw *hw =
E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
int ret, mask;
@@ -570,6 +609,8 @@ eth_em_start(struct rte_eth_dev *dev)
}
}

+   adapter->stopped = 0;
+
PMD_INIT_LOG(DEBUG, "<<");

return (0);
@@ -613,8 +654,11 @@ static void
 eth_em_close(struct rte_eth_dev *dev)
 {
struct e1000_hw *hw = E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   struct e1000_adapter *adapter =
+   E1000_DEV_PRIVATE(dev->data->dev_private);

eth_em_stop(dev);
+   adapter->stopped = 1;
e1000_phy_hw_reset(hw);
em_release_manageability(hw);
em_hw_control_release(hw);
diff --git a/drivers/net/e1000/em_rxtx.c b/drivers/net/e1000/em_rxtx.c

[dpdk-dev] [PATCH v7 2/2] e1000: free queue memory in close functions

2015-07-03 Thread Bernard Iremonger

Add new functions igb_dev_free_queues() and em_dev_free_queues()
and call them from the close functions.

Signed-off-by: Bernard Iremonger 
---
 drivers/net/e1000/e1000_ethdev.h |2 ++
 drivers/net/e1000/em_ethdev.c|1 +
 drivers/net/e1000/em_rxtx.c  |6 ++
 drivers/net/e1000/igb_ethdev.c   |3 ++-
 drivers/net/e1000/igb_rxtx.c |6 ++
 5 files changed, 17 insertions(+), 1 deletions(-)

diff --git a/drivers/net/e1000/e1000_ethdev.h b/drivers/net/e1000/e1000_ethdev.h
index ee8b872..4e69e44 100644
--- a/drivers/net/e1000/e1000_ethdev.h
+++ b/drivers/net/e1000/e1000_ethdev.h
@@ -259,6 +259,7 @@ struct e1000_adapter {
 void eth_igb_tx_queue_release(void *txq);
 void eth_igb_rx_queue_release(void *rxq);
 void igb_dev_clear_queues(struct rte_eth_dev *dev);
+void igb_dev_free_queues(struct rte_eth_dev *dev);

 int eth_igb_rx_queue_setup(struct rte_eth_dev *dev, uint16_t rx_queue_id,
uint16_t nb_rx_desc, unsigned int socket_id,
@@ -313,6 +314,7 @@ void eth_em_tx_queue_release(void *txq);
 void eth_em_rx_queue_release(void *rxq);

 void em_dev_clear_queues(struct rte_eth_dev *dev);
+void em_dev_free_queues(struct rte_eth_dev *dev);

 int eth_em_rx_queue_setup(struct rte_eth_dev *dev, uint16_t rx_queue_id,
uint16_t nb_rx_desc, unsigned int socket_id,
diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index dfabb15..d8c04e7 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -659,6 +659,7 @@ eth_em_close(struct rte_eth_dev *dev)

eth_em_stop(dev);
adapter->stopped = 1;
+   em_dev_free_queues(dev);
e1000_phy_hw_reset(hw);
em_release_manageability(hw);
em_hw_control_release(hw);
diff --git a/drivers/net/e1000/em_rxtx.c b/drivers/net/e1000/em_rxtx.c
index 976df60..9913ad0 100644
--- a/drivers/net/e1000/em_rxtx.c
+++ b/drivers/net/e1000/em_rxtx.c
@@ -1512,6 +1512,12 @@ em_dev_clear_queues(struct rte_eth_dev *dev)
em_reset_rx_queue(rxq);
}
}
+}
+
+void
+em_dev_free_queues(struct rte_eth_dev *dev)
+{
+   uint16_t i;

for (i = 0; i < dev->data->nb_rx_queues; i++) {
eth_em_rx_queue_release(dev->data->rx_queues[i]);
diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index 80e4a6c..6e92f2e 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -1089,7 +1089,7 @@ eth_igb_close(struct rte_eth_dev *dev)
E1000_WRITE_REG(hw, E1000_82580_PHY_POWER_MGMT, phpm_reg);
}

-   igb_dev_clear_queues(dev);
+   igb_dev_free_queues(dev);

memset(&link, 0, sizeof(link));
rte_igb_dev_atomic_write_link_status(dev, &link);
@@ -2363,6 +2363,7 @@ igbvf_dev_close(struct rte_eth_dev *dev)

igbvf_dev_stop(dev);
adapter->stopped = 1;
+   igb_dev_free_queues(dev);
 }

 static int igbvf_set_vfta(struct e1000_hw *hw, uint16_t vid, bool on)
diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
index 1bf8c93..0aecf8c 100644
--- a/drivers/net/e1000/igb_rxtx.c
+++ b/drivers/net/e1000/igb_rxtx.c
@@ -1498,6 +1498,12 @@ igb_dev_clear_queues(struct rte_eth_dev *dev)
igb_reset_rx_queue(rxq);
}
}
+}
+
+void
+igb_dev_free_queues(struct rte_eth_dev *dev)
+{
+   uint16_t i;

for (i = 0; i < dev->data->nb_rx_queues; i++) {
eth_igb_rx_queue_release(dev->data->rx_queues[i]);
-- 
1.7.4.1

[dpdk-dev] Failed to initialize HW 0x10bc

2015-07-03 Thread Emerson Posadas

Hello

I'm trying to use the pktgen tool to generate some traffic, but my
device (Intel
Corporation 82571EB Gigabit Ethernet) is failing to initialize. Is this
related to the resolved issue 7.19, even though it is not the same device?

http://dpdk.org/doc/guides/rel_notes/resolved_issues.html

Here is my output while running pktgen:
PMD: eth_em_dev_init(): port_id 0 vendorID=0x8086 deviceID=0x10bc: failed to init HW
EAL: Error - exiting with code: 1
  Cause: Requested device :03:00.0 cannot be used

And from lib/librte_eal/common/include/rte_pci_dev_ids.h:
#define E1000_DEV_ID_82571EB_QUAD_COPPER_LP   0x10BC

[root@localhost dpdk-2.0.0]# dpdk_nic_bind.py --status

Network devices using DPDK-compatible driver

:03:00.0 '82571EB Gigabit Ethernet Controller (Copper)' drv=igb_uio unused=
:03:00.1 '82571EB Gigabit Ethernet Controller (Copper)' drv=igb_uio unused=
:04:00.0 '82571EB Gigabit Ethernet Controller (Copper)' drv=igb_uio unused=
:04:00.1 '82571EB Gigabit Ethernet Controller (Copper)' drv=igb_uio unused=

Network devices using kernel driver
===
:00:19.0 '82579LM Gigabit Network Connection' if=eno1 drv=e1000e unused=igb_uio *Active*

Any advice would be useful.

EP

[dpdk-dev] Failed to initialize HW 0x10bc

2015-07-03 Thread Bruce Richardson

On Fri, Jul 03, 2015 at 02:59:15PM +, Emerson Posadas wrote:
> Hello
> 
> I'm trying to use the pktgen tool to generate some traffic, but my
> device (Intel Corporation 82571EB Gigabit Ethernet) is failing to
> initialize. Is this related to resolved issue 7.19, even though it is
> not the same device?
> 
> http://dpdk.org/doc/guides/rel_notes/resolved_issues.html
> 
> Here is my output while running pktgen:
> PMD: eth_em_dev_init(): port_id 0 vendorID=0x8086 deviceID=0x10bc: failed to init HW
> EAL: Error - exiting with code: 1
>   Cause: Requested device :03:00.0 cannot be used
>
Can you try running testpmd with your NIC to see if that initializes the ports
successfully?

/Bruce

[dpdk-dev] [PATCH 2/2] ixgbe: check mbuf refcnt when clearing RX/TX ring

2015-07-03 Thread Bruce Richardson

The function to clear the TX ring when a port was being closed, e.g. on
exit in testpmd, was not checking the mbuf refcnt before freeing it.
Since the function in the vector driver to clear the ring after TX does
not set the pointer to NULL post-free, this caused crashes if mbuf
debugging was turned on.

To reproduce the issue, ensure the following config variables are set:
RTE_IXGBE_INC_VECTOR
RTE_LIBRTE_MBUF_DEBUG
Then compile and run testpmd using 10G ports with the vector driver.
Start traffic and let some flow through, then type "stop" and "quit" at
the testpmd prompt, and a crash will occur. Output below:

testpmd> quit
Stopping port 0...done
Stopping port 1...PANIC in rte_mbuf_sanity_check():
bad ref cnt
[New Thread 0x7fffabfff700 (LWP 145312)]
[New Thread 0x7fffb47fe700 (LWP 145311)]
[New Thread 0x7fffb4fff700 (LWP 145310)]
[New Thread 0x76cd5700 (LWP 145309)]
18: 
[/home/bruce/dpdk.org/x86_64-native-linuxapp-gcc/app/testpmd(_start+0x29)

Program received signal SIGABRT, Aborted.
0x77120a98 in raise () from /lib64/libc.so.6

A similar error occurs when clearing the RX ring, which is also fixed by
this patch.

Signed-off-by: Bruce Richardson 
---
 drivers/net/ixgbe/ixgbe_rxtx.c | 3 ++-
 drivers/net/ixgbe/ixgbe_rxtx_vec.c | 8 +++-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index 41a062e..12e25b7 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -2108,7 +2108,8 @@ ixgbe_rx_queue_release_mbufs(struct ixgbe_rx_queue *rxq)

if (rxq->sw_ring != NULL) {
for (i = 0; i < rxq->nb_rx_desc; i++) {
-   if (rxq->sw_ring[i].mbuf != NULL) {
+   if (rxq->sw_ring[i].mbuf != NULL &&
+   rte_mbuf_refcnt_read(rxq->sw_ring[i].mbuf)) {
rte_pktmbuf_free_seg(rxq->sw_ring[i].mbuf);
rxq->sw_ring[i].mbuf = NULL;
}
diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
index 0edac82..7e633d3 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
@@ -665,7 +665,13 @@ ixgbe_tx_queue_release_mbufs(struct ixgbe_tx_queue *txq)
 nb_free < max_desc && i != txq->tx_tail;
 i = (i + 1) & max_desc) {
txe = (struct ixgbe_tx_entry_v *)&txq->sw_ring[i];
-   if (txe->mbuf != NULL)
+   /*
+*check for already freed packets.
+* Note: ixgbe_tx_free_bufs does not NULL after free,
+* so we actually have to check the reference count.
+*/
+   if (txe->mbuf != NULL &&
+   rte_mbuf_refcnt_read(txe->mbuf) != 0)
rte_pktmbuf_free_seg(txe->mbuf);
}
/* reset tx_entry */
-- 
2.4.3
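
The refcnt check added by the patch above can be sketched in isolation. This is
only an illustration under stated assumptions: struct mock_mbuf,
mock_free_seg() and ring_release_mbufs() are hypothetical stand-ins, not DPDK
APIs, and the mock free routine counts a bad free where a build with
RTE_LIBRTE_MBUF_DEBUG would panic in rte_mbuf_sanity_check().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_mbuf: only a reference count. */
struct mock_mbuf {
	uint16_t refcnt;	/* 0 means the buffer was already freed */
};

/* Number of attempts to free an already-freed buffer. */
static unsigned bad_frees;

/* Hypothetical stand-in for rte_pktmbuf_free_seg(). A debug build of the
 * real library would panic on a zero refcnt; here it is only counted. */
static void
mock_free_seg(struct mock_mbuf *m)
{
	if (m->refcnt == 0) {
		bad_frees++;
		return;
	}
	m->refcnt--;
}

/* Release every buffer the ring still owns and return how many were freed.
 * A non-NULL slot is not proof of ownership, because the TX cleanup path
 * may have freed the buffer without clearing the pointer, so the
 * reference count is checked as well. */
static unsigned
ring_release_mbufs(struct mock_mbuf **ring, size_t n)
{
	unsigned freed = 0;
	size_t i;

	assert(ring != NULL);
	for (i = 0; i < n; i++) {
		if (ring[i] != NULL && ring[i]->refcnt != 0) {
			mock_free_seg(ring[i]);
			freed++;
		}
		ring[i] = NULL;
	}
	return freed;
}
```

With a ring holding two live buffers, one stale pointer to a freed buffer and
one NULL slot, only the two live buffers are released and bad_frees stays at
zero. Dropping the refcnt test from the condition would send the stale pointer
to mock_free_seg() and increment bad_frees, which mirrors the crash reported
in the commit message.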

[dpdk-dev] [PATCH 0/2] Fix crash with vpmd and mbuf debug

2015-07-03 Thread Bruce Richardson

When mbuf debug support is turned on in the build time config, crashes will
occur when clearing up the RX/TX rings, if the 10G vector PMD is in use. This
error can be reproduced using testpmd.
This patchset makes the setup/teardown code easier to debug by marking "cold"
code paths as such, and then fixes the errors by checking for already-freed
mbufs when clearing the rings.

Bruce Richardson (2):
  ixgbe: add "cold" attribute to setup/teardown fns
  ixgbe: check mbuf refcnt when clearing RX/TX ring

 drivers/net/ixgbe/ixgbe_rxtx.c | 62 --
 drivers/net/ixgbe/ixgbe_rxtx_vec.c | 24 ++-
 2 files changed, 48 insertions(+), 38 deletions(-)

-- 
2.4.3

[dpdk-dev] [PATCH 1/2] ixgbe: add "cold" attribute to setup/teardown fns

2015-07-03 Thread Bruce Richardson

As well as the fast-path functions in the rxtx code, there are also
functions which set up and tear down the descriptor rings. Since these
are not performance critical functions, there is no need to have them
extensively optimized, so we add __attribute__((cold)) to their
definitions. This has the side-effect of making debugging them easier as
the compiler does not optimize them as heavily, so more variables are
accessible by default in gdb.

Signed-off-by: Bruce Richardson 
---
 drivers/net/ixgbe/ixgbe_rxtx.c | 59 +++---
 drivers/net/ixgbe/ixgbe_rxtx_vec.c | 16 ++-
 2 files changed, 39 insertions(+), 36 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index 3ace8a8..41a062e 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -1757,7 +1757,7 @@ ixgbe_recv_pkts_lro_bulk_alloc(void *rx_queue, struct rte_mbuf **rx_pkts,
  * needed. If the memzone is already created, then this function returns a ptr
  * to the old one.
  */
-static const struct rte_memzone *
+static const struct rte_memzone * __attribute__((cold))
 ring_dma_zone_reserve(struct rte_eth_dev *dev, const char *ring_name,
  uint16_t queue_id, uint32_t ring_size, int socket_id)
 {
@@ -1781,7 +1781,7 @@ ring_dma_zone_reserve(struct rte_eth_dev *dev, const char *ring_name,
 #endif
 }

-static void
+static void __attribute__((cold))
 ixgbe_tx_queue_release_mbufs(struct ixgbe_tx_queue *txq)
 {
unsigned i;
@@ -1796,7 +1796,7 @@ ixgbe_tx_queue_release_mbufs(struct ixgbe_tx_queue *txq)
}
 }

-static void
+static void __attribute__((cold))
 ixgbe_tx_free_swring(struct ixgbe_tx_queue *txq)
 {
if (txq != NULL &&
@@ -1804,7 +1804,7 @@ ixgbe_tx_free_swring(struct ixgbe_tx_queue *txq)
rte_free(txq->sw_ring);
 }

-static void
+static void __attribute__((cold))
 ixgbe_tx_queue_release(struct ixgbe_tx_queue *txq)
 {
if (txq != NULL && txq->ops != NULL) {
@@ -1814,14 +1814,14 @@ ixgbe_tx_queue_release(struct ixgbe_tx_queue *txq)
}
 }

-void
+void __attribute__((cold))
 ixgbe_dev_tx_queue_release(void *txq)
 {
ixgbe_tx_queue_release(txq);
 }

 /* (Re)set dynamic ixgbe_tx_queue fields to defaults */
-static void
+static void __attribute__((cold))
 ixgbe_reset_tx_queue(struct ixgbe_tx_queue *txq)
 {
static const union ixgbe_adv_tx_desc zeroed_desc = {{0}};
@@ -1870,7 +1870,7 @@ static const struct ixgbe_txq_ops def_txq_ops = {
  * the queue parameters. Used in tx_queue_setup by primary process and then
  * in dev_init by secondary process when attaching to an existing ethdev.
  */
-void
+void __attribute__((cold))
 ixgbe_set_tx_function(struct rte_eth_dev *dev, struct ixgbe_tx_queue *txq)
 {
/* Use a simple Tx queue (no offloads, no multi segs) if possible */
@@ -1900,7 +1900,7 @@ ixgbe_set_tx_function(struct rte_eth_dev *dev, struct ixgbe_tx_queue *txq)
}
 }

-int
+int __attribute__((cold))
 ixgbe_dev_tx_queue_setup(struct rte_eth_dev *dev,
 uint16_t queue_idx,
 uint16_t nb_desc,
@@ -2088,7 +2088,7 @@ ixgbe_dev_tx_queue_setup(struct rte_eth_dev *dev,
  *
  * @m scattered cluster head
  */
-static void
+static void __attribute__((cold))
 ixgbe_free_sc_cluster(struct rte_mbuf *m)
 {
uint8_t i, nb_segs = m->nb_segs;
@@ -2101,7 +2101,7 @@ ixgbe_free_sc_cluster(struct rte_mbuf *m)
}
 }

-static void
+static void __attribute__((cold))
 ixgbe_rx_queue_release_mbufs(struct ixgbe_rx_queue *rxq)
 {
unsigned i;
@@ -2133,7 +2133,7 @@ ixgbe_rx_queue_release_mbufs(struct ixgbe_rx_queue *rxq)
}
 }

-static void
+static void __attribute__((cold))
 ixgbe_rx_queue_release(struct ixgbe_rx_queue *rxq)
 {
if (rxq != NULL) {
@@ -2144,7 +2144,7 @@ ixgbe_rx_queue_release(struct ixgbe_rx_queue *rxq)
}
 }

-void
+void __attribute__((cold))
 ixgbe_dev_rx_queue_release(void *rxq)
 {
ixgbe_rx_queue_release(rxq);
@@ -2158,7 +2158,7 @@ ixgbe_dev_rx_queue_release(void *rxq)
  *  -EINVAL: the preconditions are NOT satisfied and the default Rx burst
  *   function must be used.
  */
-static inline int
+static inline int __attribute__((cold))
 #ifdef RTE_LIBRTE_IXGBE_RX_ALLOW_BULK_ALLOC
 check_rx_burst_bulk_alloc_preconditions(struct ixgbe_rx_queue *rxq)
 #else
@@ -2213,7 +2213,7 @@ check_rx_burst_bulk_alloc_preconditions(__rte_unused struct ixgbe_rx_queue *rxq)
 }

 /* Reset dynamic ixgbe_rx_queue fields back to defaults */
-static void
+static void __attribute__((cold))
 ixgbe_reset_rx_queue(struct ixgbe_adapter *adapter, struct ixgbe_rx_queue *rxq)
 {
static const union ixgbe_adv_rx_desc zeroed_desc = {{0}};
@@ -2263,7 +2263,7 @@ ixgbe_reset_rx_queue(struct ixgbe_adapter *adapter, struct ixgbe_rx_queue *rxq)
rxq->pkt_last_seg = NULL;
 }

-int
+int __attribute__((cold))
 ixgbe_dev_rx_queue_setup(struct rte_eth_dev *dev,

[dpdk-dev] [PATCH 1/2] ixgbe: add "cold" attribute to setup/teardown fns

2015-07-03 Thread Thomas Monjalon

Hi Bruce,

2015-07-03 16:40, Bruce Richardson:
> As well as the fast-path functions in the rxtx code, there are also
> functions which set up and tear down the descriptor rings. Since these
> are not performance critical functions, there is no need to have them
> extensively optimized, so we add __attribute__((cold)) to their
> definitions. This has the side-effect of making debugging them easier as
> the compiler does not optimize them as heavily, so more variables are
> accessible by default in gdb.

What is the benefit, compared to -O0?

[dpdk-dev] [PATCH 2/2] ixgbe: check mbuf refcnt when clearing RX/TX ring

2015-07-03 Thread Thomas Monjalon

2015-07-03 16:40, Bruce Richardson:
> The function to clear the TX ring when a port was being closed, e.g. on
> exit in testpmd, was not checking the mbuf refcnt before freeing it.
> Since the function in the vector driver to clear the ring after TX does
> not set the pointer to NULL post-free, this caused crashes if mbuf
> debugging was turned on.

Please, could you add a Fixes: line to reference the origin of the bug?
Thanks

[dpdk-dev] [PATCH 1/2] ixgbe: add "cold" attribute to setup/teardown fns

2015-07-03 Thread Bruce Richardson

On Fri, Jul 03, 2015 at 05:45:34PM +0200, Thomas Monjalon wrote:
> Hi Bruce,
> 
> 2015-07-03 16:40, Bruce Richardson:
> > As well as the fast-path functions in the rxtx code, there are also
> > functions which set up and tear down the descriptor rings. Since these
> > are not performance critical functions, there is no need to have them
> > extensively optimized, so we add __attribute__((cold)) to their
> > definitions. This has the side-effect of making debugging them easier as
> > the compiler does not optimize them as heavily, so more variables are
> > accessible by default in gdb.
> 
> What is the benefit, compared to -O0?

First off, it's per function, rather than having to use -O0 globally. Secondly,
it doesn't disable optimization, it just tells the compiler that the code is
not on the hotpath - whether or not the compiler optimizes it is up to the 
compiler itself. From GCC documentation: 
https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes

"The cold attribute on functions is used to inform the compiler that the 
function is unlikely to be executed. The function is optimized for size rather 
than speed and on many targets it is placed into a special subsection of the 
text section so all cold functions appear close together, improving code 
locality of non-cold parts of program. The paths leading to calls of cold
functions within code are marked as unlikely by the branch prediction mechanism.
It is thus useful to mark functions used to handle unlikely conditions, such as
perror, as cold to improve optimization of hot functions that do call marked
functions in rare occasions."
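
As a minimal sketch of per-function use, assuming GCC or Clang: the names
report_failure() and checked_div() below are hypothetical and do not come from
the ixgbe driver. Only the rarely called helper carries the attribute, and the
rest of the file is compiled with whatever optimization level is set globally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper that runs only on an error path, so it is marked
 * cold. The attribute sits after the return type, as in the patch. */
static void __attribute__((cold))
report_failure(const char *what)
{
	fprintf(stderr, "operation failed: %s\n", what);
}

/* Hypothetical frequently called function. The attribute is a hint that
 * the branch leading to report_failure() is unlikely to be taken. */
static int
checked_div(int num, int den, int *out)
{
	assert(out != NULL);
	if (den == 0) {
		report_failure("division by zero");
		return -1;
	}
	*out = num / den;
	return 0;
}
```

The attribute changes no behaviour: checked_div() returns the same results
with or without it. It only influences how the compiler lays out and optimizes
the code.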

/Bruce

[dpdk-dev] [PATCH 2/2] ixgbe: check mbuf refcnt when clearing RX/TX ring

2015-07-03 Thread Bruce Richardson

On Fri, Jul 03, 2015 at 05:46:55PM +0200, Thomas Monjalon wrote:
> 2015-07-03 16:40, Bruce Richardson:
> > The function to clear the TX ring when a port was being closed, e.g. on
> > exit in testpmd, was not checking the mbuf refcnt before freeing it.
> > Since the function in the vector driver to clear the ring after TX does
> > not set the pointer to NULL post-free, this caused crashes if mbuf
> > debugging was turned on.
> 
> Please, could you add a Fixes: line to reference the origin of the bug?
> Thanks

What benefit does that information provide? Given that this bug occurs in
two places in the code, and has been there for some time, I'm not sure that a
straight lookup of the commit that last changed the line(s) would identify the
true source of the bug. Because of that, I'd like to know the information is
really going to be useful before making an effort to track it down. :-)

/Bruce
