date:20160126

[dpdk-dev] [PATCH] doc: introduce networking driver matrix

2016-01-26 Thread Thomas Monjalon

In order to better compare the drivers and check what is missing
for a common baseline, we need to fill a matrix.

A CSS trick is used to fit the HTML page.
The PDF output needs some LaTeX wizardry.

Signed-off-by: Thomas Monjalon 
---
 doc/guides/nics/index.rst|   1 +
 doc/guides/nics/overview.rst | 145 +++
 2 files changed, 146 insertions(+)
 create mode 100644 doc/guides/nics/overview.rst

diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 33c9cea..8618114 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -35,6 +35,7 @@ Network Interface Controller Drivers
 :maxdepth: 3
 :numbered:

+overview
 bnx2x
 cxgbe
 e1000em
diff --git a/doc/guides/nics/overview.rst b/doc/guides/nics/overview.rst
new file mode 100644
index 000..e00f094
--- /dev/null
+++ b/doc/guides/nics/overview.rst
@@ -0,0 +1,145 @@
+..  BSD LICENSE
+Copyright 2016 6WIND S.A.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+
+* Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in
+the documentation and/or other materials provided with the
+distribution.
+* Neither the name of 6WIND S.A. nor the names of its
+contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+Overview of Networking Drivers
+==============================
+
+The networking drivers may be classified in two categories:
+
+- physical for real devices
+- virtual for emulated devices
+
+Some physical devices may be shaped through a virtual layer as for
+SR-IOV.
+The interface seen in the virtual environment is a VF (Virtual Function).
+
+The ethdev layer exposes an API to use the networking functions
+of these devices.
+The bottom half part of ethdev is implemented by the drivers.
+Thus some features may not be implemented.
+
+There are more differences between drivers regarding some internal properties,
+portability or even documentation availability.
+Most of these differences are summarized below.
+
+.. _table_net_pmd_features:
+
+.. raw:: html
+
+   <style>
+      table#id1 th {
+         font-size: 80%;
+         white-space: pre-wrap;
+         text-align: center;
+         vertical-align: top;
+         padding: 5px;
+      }
+      table#id1 th:first-child {
+         vertical-align: bottom;
+      }
+      table#id1 td {
+         font-size: 70%;
+         padding: 1px;
+      }
+      table#id1 td:first-child {
+         padding-left: 1em;
+      }
+   </style>
+
+.. table:: Features availability in networking drivers
+
+   ==================== = = = = = = = = = = = = = = = = = = = = = = = = =
+   Feature              a b b b c e e i i i i i i f m m m n n p r s v v x
+                        f n n o x 1 n 4 4 g g x x m l l p f u c i z i m e
+                        p x x n g 0 i 0 0 b b g g 1 x x i p l a n e r x n
+                        a 2 2 d b 0 c e e   v b b 0 4 5 p   l p g d t n v
+                        c x x i e 0     v   f e e k     e         a i e i
+                        k   v n         f       v                 t o t r
+                        e   f g                 f                 a   3 t
+                        t                                         2
+   ==================== = = = = = = = = = = = = = = = = = = = = = = = = =
+   link status
+   link status event
+   Rx interrupt
+   queue start/stop
+   MTU update
+   jumbo frame
+   scattered Rx
+   LRO
+   TSO
+   promiscuous mode
+   allmulticast mode
+   unicast MAC filter
+   multicast MAC filter
+   RSS hash
+   RSS key update
+   RSS reta update
+   VMDq
+   SR-IOV
+   DCB
+   VLAN filter
+   ethertype filter
+   n-tuple filter
+   SYN filter
+   tunnel filter
+   flexible filter
+   hash filter
+   flow director
+   flow control
+   rate limitation
+   traffic mirro

[dpdk-dev] [PATCH v6 08/11] eal: pci: introduce RTE_KDRV_VFIO_NOIOMMUi driver mode

2016-01-26 Thread Santosh Shukla

On Tue, Jan 26, 2016 at 7:58 PM, Thomas Monjalon
 wrote:
> 2016-01-26 19:35, Santosh Shukla:
>> On Tue, Jan 26, 2016 at 6:30 PM, Thomas Monjalon
>>  wrote:
>> > 2016-01-26 15:56, Santosh Shukla:
>> >> In my observation, currently virtio work for vfio-noiommu, that's why
>> >> said drv->kdrv need to know vfio mode.
>> >
>> > It is your observation. It may change in near future.
>>
>> so that mean till then, virtio support for non-x86 arch has to wait?
>
> No, absolutely not. virtio for non-x86 is welcome.
>
>> We have working model with vfio-noiommu, don't you think it make sense
>> to let vfio_noiommu implementation exist and later in-case
>> virtio+iommu gets mainline then switch to vfio __mode__ agnostic
>> approach. And for that All it takes to replace __noiommu suffix with
>> default.
>
> I'm just saying you should not touch the enum rte_kernel_driver.
> RTE_KDRV_VFIO is a driver.
> RTE_KDRV_VFIO_NOIOMMU is a mode.
> As the VFIO API is the same in both modes, there is no reason to
> distinguish them at this level.
> Your patch adds the NOIOMMU case everywhere:
> case RTE_KDRV_VFIO:
> +   case RTE_KDRV_VFIO_NOIOMMU:
>
> I'll stop commenting here to let others give their opinion.
>
> [...]
>> >> with vfio+iommu; binding virtio pci device to vfio-pci driver fail;
>> >> giving below error:
>> >> [   53.053464] VFIO - User Level meta-driver version: 0.3
>> >> [   73.077805] vfio-pci: probe of :00:03.0 failed with error -22
>> >> [   73.077852] vfio-pci: probe of :00:03.0 failed with error -22
>> >>
>> >> vfio_pci_probe() --> vfio_iommu_group_get() --> iommu_group_get()
>> >> fails: iommu doesn't have group for virtio pci device.
>> >
>> > Yes it fails when binding.
>> > So the later check in the virtio PMD is useless.
>>
>> Which check?
>
> The check for VFIO noiommu only:
> -   if (dev->kdrv == RTE_KDRV_VFIO)
> +   if (dev->kdrv == RTE_KDRV_VFIO_NOIOMMU)
>
> [...]
>> > Furthermore restricting virtio to no-iommu mode doesn't bring
>> > any improvement.
>>
>> We're not __restricting__, as soon as virtio+iommu gets working state,
>> we'll simply replace __noiommu with default. Then its upto user to try
>> out virtio with vfio default or vfio_noiommu.
>
> Yes it's up to user.
> So your code should be
> if (dev->kdrv == RTE_KDRV_VFIO)
>

Right,

>> > That's why I suggest to keep the initial semantic of kdrv and
>> > not pollute it with VFIO modes.
>>
>> I am okay to live with default and forget suffix __noiommu but there
>> are implementation problem which was discussed in other thread
>> - Virtio pmd driver should avoid interface parsing i.e.
>> virtio_resource_init_uio/vfio() etc.. For vfio case - We could easily
>> get rid of by moving /sys parsing to pci_eal layer, Right? If so then
>> virtio currently works with vfio-noiommu, it make sense to me that
>> pci_eal layer does parsing for pmd driver before that pmd driver get
>> initialized.
>
> Please reword. What is the problem?
>
>> - Another case could be: iommu-less-pmd-driver. eal layer to do
>> parsing before updating drv->kdrv.
>
> [...]
>> >> >> > If a check is needed, I would prefer using your function
>> >> >> > pci_vfio_is_noiommu() and remove driver modes from struct 
>> >> >> > rte_kernel_driver.
>> >> >>
>> >> >> I don't think calling pci_vfio_no_iommu() inside
>> >> >> virtio_reg_rd/wr_1/2/3() would be a good idea.
>> >> >
>> >> > Why? The value may be cached in the priv properties.
>> >> >
>> >> pci_vfio_is_noiommu() parses /sys for
>> >> - enable_noiommu param
>> >> - attached driver name is vfio-noiommu or not.
>> >>
>> >> It does file operation for that, I meant to say that calling this api
>> >> within register_rd/wr function is not correct. It would be better if
>> >> those low level register_rd/wr api only checks driver_types.
>> >
>> > Yes, that's why I said the return of pci_vfio_is_noiommu() may be cached
>> > to keep efficiency.
>>
>> I am not convinced though, Still find pmd driver checking driver_types
>> using drv->kdrv is better approach than introducing a new global
>> variable which may look something like;
>
> Not a global variable. A function in EAL layer. A variable in PMD priv.
>

If we agree to use the condition (drv->kdrv == RTE_KDRV_VFIO),
then resource parsing for vfio (both the default and the noiommu case)
is forced into the virtio pmd driver layer, which contradicts what we
agreed earlier in this thread [1]. In that case we also don't need a
function in the EAL layer or a variable in PMD priv; perhaps just a private
function in the virtio pmd that does the parsing for the vfio interface.

Thoughts?

[1] http://dpdk.org/dev/patchwork/patch/9862/

>> At pci_eal layer 
>> bool vfio_mode;
>> vfio_mode = pci_vfio_is_noiommu();
>>
>> At virtio pmd driver layer 
>> Checking value at vfio_mode variable before doing virtio_rd/wr for
>> vfio interface.
>>
>> Instead virtio pmd driver doing
>>
>> virtio_reg_rd/wr_1/2/4()
>> {
>> if (drv->kdrv == VFIO)
>>   do pread()/pwrite()
>> else
>>   in()/out()
>> }
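
The alternative Thomas suggests above (probe the VFIO mode once and cache the result in the PMD private data, so the register helpers never touch /sys) could be sketched roughly as follows. This is only an illustration under stated assumptions: `fake_pci_vfio_is_noiommu()` stands in for the proposed `pci_vfio_is_noiommu()`, whose real signature is not shown in this thread, and `struct virtio_priv`, `enum kdrv` and `enum io_path` are invented for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel driver enum discussed in the thread. */
enum kdrv { KDRV_UIO, KDRV_VFIO };

/* Counts how often the slow probe runs, so the caching is observable. */
static int probe_calls;

/* Stand-in for the proposed pci_vfio_is_noiommu(); the real helper would
 * parse /sys, which is why it should not run on every register access. */
static bool fake_pci_vfio_is_noiommu(void)
{
	probe_calls++;
	return true;
}

/* Invented PMD private data: the mode is probed once at init and cached. */
struct virtio_priv {
	enum kdrv kdrv;
	bool vfio_noiommu;
};

static void virtio_priv_init(struct virtio_priv *p, enum kdrv kdrv)
{
	p->kdrv = kdrv;
	/* Short-circuit: the probe only runs for VFIO-bound devices. */
	p->vfio_noiommu = (kdrv == KDRV_VFIO) && fake_pci_vfio_is_noiommu();
}

/* Access methods a register read/write helper could choose between. */
enum io_path { IO_PORT, IO_VFIO_PREAD };

/* Fast path: reads only cached fields; the driver type alone decides. */
static enum io_path virtio_reg_path(const struct virtio_priv *p)
{
	return p->kdrv == KDRV_VFIO ? IO_VFIO_PREAD : IO_PORT;
}
```

In this shape the register helpers still branch only on the driver type, as Santosh prefers, while the mode stays out of the kernel driver enum and is available from the cached field if a check is ever needed.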

[dpdk-dev] [PATCH 2.3] tools/dpdk_nic_bind.py: Verbosely warn the user on bind

2016-01-26 Thread David Marchand

On Tue, Jan 26, 2016 at 9:14 PM, Thomas Monjalon
 wrote:
> 2015-12-11 11:20, Aaron Conole:
>> DPDK ports are only detected during the EAL initialization. After that, any
>> new DPDK ports which are bound will not be visible to the application.
>>
>> The dpdk_nic_bind.py can be a bit more helpful to let users know that DPDK
>> enabled applications will not find rebound ports until after they have been
>> restarted.
>
> I think it's better to improve hotplug and allow hot binding.
> A work is in progress towards this direction.
> David, can you confirm?

I intend to provide some rfc patches later this week, for next release.


-- 
David Marchand

[dpdk-dev] [PATCH 2.3] tools/dpdk_nic_bind.py: Verbosely warn the user on bind

2016-01-26 Thread Thomas Monjalon

2015-12-11 11:20, Aaron Conole:
> DPDK ports are only detected during the EAL initialization. After that, any
> new DPDK ports which are bound will not be visible to the application.
> 
> The dpdk_nic_bind.py can be a bit more helpful to let users know that DPDK
> enabled applications will not find rebound ports until after they have been
> restarted.

I think it's better to improve hotplug and allow hot binding.
Work is in progress in this direction.
David, can you confirm?
Is this patch still valuable in the meantime?

> --- a/tools/dpdk_nic_bind.py
> +++ b/tools/dpdk_nic_bind.py
> @@ -344,8 +344,10 @@ def bind_one(dev_id, driver, force):
>  dev["Driver_str"] = "" # clear driver string
>  
>  # if we are binding to one of DPDK drivers, add PCI id's to that driver
> +bDpdkDriver = False

Please do not use camel case.

[dpdk-dev] [PATCH v6 08/11] eal: pci: introduce RTE_KDRV_VFIO_NOIOMMUi driver mode

2016-01-26 Thread Santosh Shukla

On Tue, Jan 26, 2016 at 6:30 PM, Thomas Monjalon
 wrote:
> 2016-01-26 15:56, Santosh Shukla:
>> On Mon, Jan 25, 2016 at 8:59 PM, Thomas Monjalon
>>  wrote:
>> > 2016-01-21 22:47, Santosh Shukla:
>> >> On Thu, Jan 21, 2016 at 8:16 PM, Thomas Monjalon
>> >>  wrote:
>> >> > 2016-01-21 17:34, Santosh Shukla:
>> >> >> On Thu, Jan 21, 2016 at 4:58 PM, Thomas Monjalon
>> >> >>  wrote:
>> >> >> > 2016-01-21 16:43, Santosh Shukla:
>> >> >> >> David Marchand  wrote:
>> >> >> >> > This is a mode (specific to vfio), not a new kernel driver.
>> >> >> >> >
>> >> >> >> Yes, Specific to VFIO and this is why noiommu appended after vfio 
>> >> >> >> i.e..
>> >> >> >> __VFIO and __VFIO_NOIOMMU.
>> >> >> >
>> >> >> > Woaaa! Your logic is really disappointing :)
>> >> >> > Specific to VFIO => append _NOIOMMU
>> >> >> > If it's for VFIO, it should be called VFIO (that's my logic).
>> >> >> >
>> >> >> I am confused by reading your comment. vfio works for default iommu
>> >> >> and now with noiommu. drv->kdrv need to know driver mode for vfio
>> >> >> case. So that user can simply read drv->kdrv value in their driver and
>> >> >> accordingly use vfio rd/wr api for example {pread/pwrite}. This is how
>> >> >> rte_eal_pci_vfio_read/write_bar() api implemented.
>> >> >
>> >> > Sorry I don't understand. Why EAL read/write functions should be 
>> >> > different
>> >> > depending of the VFIO mode?
>> >>
>> >> no, EAL rd/wr functions are not different for vfio or vfio modes {same
>> >> for iommu or noiommu}. Pl. see pci_eal_read/write_bar() api. Those
>> >> apis currently used for VFIO, Irrespective of vfio mode. If required,
>> >> we can add UIO bar_rd/wr api too. pci_eal_rd/wr_bar() are abstract
>> >> apis. Underneath implementation can be vfio or uio type.
>> >
>> > It means you agree the suffix _NOIOMMU is not needed?
>> > It seems we go nowhere in this discussion. You said
>> > "drv->kdrv need to know driver mode for vfio"
>>
>> In my observation, currently virtio work for vfio-noiommu, that's why
>> said drv->kdrv need to know vfio mode.
>
> It is your observation. It may change in near future.
>

So does that mean virtio support for non-x86 archs has to wait until then?
We have a working model with vfio-noiommu. Don't you think it makes sense
to let the vfio_noiommu implementation exist for now and, once
virtio+iommu gets mainlined, switch to a vfio __mode__ agnostic
approach? All that would take is replacing the __noiommu suffix with
the default.

>> > and after
>> > "Those apis currently used for VFIO, Irrespective of vfio mode"
>> > That's why I assume your first assumption was wrong.
>> >
>>
>> Newly introduced dpdk global api pci_eal_rd/wr_bar(),  can be used for
>> vfio and uio both; can be used for vfio w/IOMMU and vfio w/o IOMMU
>> both.
>>
>> >> >> > Why do we care to parse noiommu only?
>> >> >>
>> >> >> Because pmd drivers example virtio can work with vfio only in
>> >> >> _noiommu_ mode. In particular, virtio spec 0.95 / legacy virtio.
>> >> >
>> >> > Please could you explain the limitation (except IOMMU availability)?
>> >>
>> >> Ok.
>> >>
>> >> I believe - we both agree that noiommu mode is a need for pmd drivers
>> >> like virtio, right? if so then other reason is implementation driven
>> >
>> > No, noiommu is a need for some environment having no IOMMU.
>> > But in my understanding, virtio could run with a nested IOMMU.
>>
>> Interesting, like to understand nested one, I did tried in past by
>> passing "iommu=pt intel_iommu=on kvm-intel.nested=1" in cmdline for
>> x86 (for guest/host both), but virtio pci device binding to vfio-pci
>> driver fails. Tried on 4.2 kernel (qemu version 2.5), is it working
>> for >4.2 kernel/ qemu-version?
>
> I haven't tried.
>
>> >> i.e..
>> >>
>> >> Pl. look at virtio_pci.c in this patch.. VIRTIO_RD/WR/_1/2/4()
>> >> implementation. They are in-use and applicable to  virtio spec 0.95,
>> >> so far support uio/ioport-way rd/wr. Now to support vfio-way rd/wr -
>> >> need to check drv->kdrv value, that value should be of vfio_noiommu
>> >> types __not__  generic _vfio types.
>> >
>> > I still don't understand why it would not work with VFIO w/IOMMU.
>>
>> with vfio+iommu; binding virtio pci device to vfio-pci driver fail;
>> giving below error:
>> [   53.053464] VFIO - User Level meta-driver version: 0.3
>> [   73.077805] vfio-pci: probe of :00:03.0 failed with error -22
>> [   73.077852] vfio-pci: probe of :00:03.0 failed with error -22
>>
>> vfio_pci_probe() --> vfio_iommu_group_get() --> iommu_group_get()
>> fails: iommu doesn't have group for virtio pci device.
>
> Yes it fails when binding.
> So the later check in the virtio PMD is useless.
>

Which check?

>> In case of noiommu, it prepares the group / add device to iommu group,
>> so it passes.
>>
>> Jason in other thread mentioned that he is working on virtio+iommu
>> approach [1], Patches are not merged and I am yet to evaluate his
>> patches for virtio pmd driver for iommu(+vfio). so wondering how
>> virtio pci device could work unle

[dpdk-dev] bnx2x driver and 57800 versus 57810

2016-01-26 Thread Chas Williams

I have two practically identical systems, with the same hypervisor on each
(CentOS 7.x). In one, I have a 57800 card, which works fine with DPDK with
SR-IOV. In the other, I have a 57810 card, which doesn't work with SR-IOV.

For the 57810 I have tracked this down to the status block in the VF
failing to be updated. The Linux driver works fine, but it appears to
use a slightly different scheme -- writing some sort of fastpath status
block generation per interrupt.

Does anyone have any suggestions or a programming guide for this device?

[dpdk-dev] [PATCH 5/5] mempool: allow rte_pktmbuf_pool_create switch between memool handlers

2016-01-26 Thread David Hunt

If the user wants rte_pktmbuf_pool_create() to use an external mempool
handler, they simply define MEMPOOL_HANDLER_NAME to be the name of the
mempool handler they wish to use. This may later move to the build config.

Signed-off-by: David Hunt 
---
 lib/librte_mbuf/rte_mbuf.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index c18b438..362396e 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -167,10 +167,21 @@ rte_pktmbuf_pool_create(const char *name, unsigned n,
mbp_priv.mbuf_data_room_size = data_room_size;
mbp_priv.mbuf_priv_size = priv_size;

+/* #define MEMPOOL_HANDLER_NAME "custom_handler" */
+#undef MEMPOOL_HANDLER_NAME
+
+#ifndef MEMPOOL_HANDLER_NAME
return rte_mempool_create(name, n, elt_size,
cache_size, sizeof(struct rte_pktmbuf_pool_private),
rte_pktmbuf_pool_init, &mbp_priv, rte_pktmbuf_init, NULL,
socket_id, 0);
+#else
+   return rte_mempool_create_ext(name, n, elt_size,
+   cache_size, sizeof(struct rte_pktmbuf_pool_private),
+   rte_pktmbuf_pool_init, &mbp_priv, rte_pktmbuf_init, NULL,
+   socket_id, 0,
+   MEMPOOL_HANDLER_NAME);
+#endif
 }

 /* do some sanity checks on a mbuf: panic if it fails */
-- 
1.9.3
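
The compile-time switch in this patch can be shown in isolation with a minimal sketch. The two `create_*` functions below are hypothetical stand-ins for `rte_mempool_create()` and `rte_mempool_create_ext()` and only report which handler name would be selected. The default name "ring_mp_mc" is an assumption based on the handler names listed in patch 1/5 and the flags value 0 passed here.

```c
#include <assert.h>
#include <string.h>

/* Uncomment to select a (hypothetical) external handler at build time. */
/* #define MEMPOOL_HANDLER_NAME "custom_handler" */

/* Stand-in for rte_mempool_create(): reports the assumed default handler. */
static const char *create_default(void)
{
	return "ring_mp_mc";
}

/* Stand-in for rte_mempool_create_ext(): reports the requested handler. */
static const char *create_ext(const char *handler)
{
	return handler;
}

/* Mirrors the #ifndef/#else structure of rte_pktmbuf_pool_create(). */
static const char *pktmbuf_pool_handler(void)
{
#ifndef MEMPOOL_HANDLER_NAME
	return create_default();
#else
	return create_ext(MEMPOOL_HANDLER_NAME);
#endif
}
```

One observation on the patch as posted: the `#undef MEMPOOL_HANDLER_NAME` directly before the `#ifndef` appears to discard any earlier definition, so the `#else` branch would only be compiled after editing the source. The sketch leaves the `#undef` out so that a definition actually takes effect.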

[dpdk-dev] [PATCH 4/5] mempool: add autotest for external mempool custom example

2016-01-26 Thread David Hunt

Signed-off-by: David Hunt 
---
 app/test/Makefile   |   1 +
 app/test/test_ext_mempool.c | 474 
 2 files changed, 475 insertions(+)
 create mode 100644 app/test/test_ext_mempool.c

diff --git a/app/test/Makefile b/app/test/Makefile
index ec33e1a..9a2f75f 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -74,6 +74,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_TIMER) += test_timer_perf.c
 SRCS-$(CONFIG_RTE_LIBRTE_TIMER) += test_timer_racecond.c

 SRCS-y += test_mempool.c
+SRCS-y += test_ext_mempool.c
 SRCS-y += test_mempool_perf.c

 SRCS-y += test_mbuf.c
diff --git a/app/test/test_ext_mempool.c b/app/test/test_ext_mempool.c
new file mode 100644
index 000..b434f8b
--- /dev/null
+++ b/app/test/test_ext_mempool.c
@@ -0,0 +1,474 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "test.h"
+
+/*
+ * Mempool
+ * =======
+ *
+ * Basic tests: done on one core with and without cache:
+ *
+ *- Get one object, put one object
+ *- Get two objects, put two objects
+ *- Get all objects, test that their content is not modified and
+ *  put them back in the pool.
+ */
+
+#define TIME_S 5
+#define MEMPOOL_ELT_SIZE 2048
+#define MAX_KEEP 128
+#define MEMPOOL_SIZE 8192
+
+static struct rte_mempool *mp;
+static struct rte_mempool *ext_nocache, *ext_cache;
+
+static rte_atomic32_t synchro;
+
+/*
+ * For our tests, we use the following struct to pass info to our create
+ *  callback so it can call rte_mempool_create
+ */
+struct custom_mempool_alloc_params {
+   char ring_name[RTE_RING_NAMESIZE];
+   unsigned n_elt;
+   unsigned elt_size;
+};
+
+/*
+ * Simple example of custom mempool structure. Holds pointers to all the
+ * elements which are simply malloc'd in this example.
+ */
+struct custom_mempool {
+   struct rte_ring *r;/* Ring to manage elements */
+   void *elements[MEMPOOL_SIZE];  /* Element pointers */
+};
+
+/*
+ * save the object number in the first 4 bytes of object data. All
+ * other bytes are set to 0.
+ */
+static void
+my_obj_init(struct rte_mempool *mp, __attribute__((unused)) void *arg,
+   void *obj, unsigned i)
+{
+   uint32_t *objnum = obj;
+
+   memset(obj, 0, mp->elt_size);
+   *objnum = i;
+   printf("Setting objnum to %d\n", i);
+}
+
+/* basic tests (done on one core) */
+static int
+test_mempool_basic(void)
+{
+   uint32_t *objnum;
+   void **objtable;
+   void *obj, *obj2;
+   char *obj_data;
+   int ret = 0;
+   unsigned i, j;
+
+   /* dump the mempool status */
+   rte_mempool_dump(stdout, mp);
+
+   printf("Count = %d\n", rte_mempool_count(mp));
+   printf("get an object\n");
+   if (rte_mempool_get(mp, &obj) < 0) {
+   printf("get Failed\n");
+   return -1;
+   }
+   printf("Count = %d\n", rte_mempool_count(mp));
+   rte_mempool_dump(stdout, mp);
+
+   /* tests that improve coverage */
+   printf("get object count\n");
+   if (rte_mempool_count(mp) != MEMPOOL_SIZE -

[dpdk-dev] [PATCH 3/5] mempool: add custom external mempool handler example

2016-01-26 Thread David Hunt

Adds a simple ring-based mempool handler that mallocs each object.

Signed-off-by: David Hunt 
---
 lib/librte_mempool/Makefile |   1 +
 lib/librte_mempool/custom_mempool.c | 160 
 2 files changed, 161 insertions(+)
 create mode 100644 lib/librte_mempool/custom_mempool.c

diff --git a/lib/librte_mempool/Makefile b/lib/librte_mempool/Makefile
index d795b48..4f72546 100644
--- a/lib/librte_mempool/Makefile
+++ b/lib/librte_mempool/Makefile
@@ -44,6 +44,7 @@ LIBABIVER := 1
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool.c
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool_default.c
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool_stack.c
+SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  custom_mempool.c
 ifeq ($(CONFIG_RTE_LIBRTE_XEN_DOM0),y)
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_dom0_mempool.c
 endif
diff --git a/lib/librte_mempool/custom_mempool.c 
b/lib/librte_mempool/custom_mempool.c
new file mode 100644
index 000..a9da8c5
--- /dev/null
+++ b/lib/librte_mempool/custom_mempool.c
@@ -0,0 +1,160 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include "rte_mempool_internal.h"
+
+/*
+ * Mempool
+ * =======
+ *
+ * Basic tests: done on one core with and without cache:
+ *
+ *- Get one object, put one object
+ *- Get two objects, put two objects
+ *- Get all objects, test that their content is not modified and
+ *  put them back in the pool.
+ */
+
+#define TIME_S 5
+#define MEMPOOL_ELT_SIZE 2048
+#define MAX_KEEP 128
+#define MEMPOOL_SIZE 8192
+
+#if 0
+/*
+ * For our example mempool handler, we use the following struct to
+ * pass info to our create callback so it can call rte_mempool_create
+ */
+struct custom_mempool_alloc_params {
+   char ring_name[RTE_RING_NAMESIZE];
+   unsigned n_elt;
+   unsigned elt_size;
+};
+#endif
+
+/*
+ * Simple example of custom mempool structure. Holds pointers to all the
+ * elements which are simply malloc'd in this example.
+ */
+struct custom_mempool {
+   struct rte_ring *r; /* Ring to manage elements */
+   void *elements[MEMPOOL_SIZE];   /* Element pointers */
+};
+
+/*
+ * Loop through all the element pointers and allocate a chunk of memory, then
+ * insert that memory into the ring.
+ */
+static void *
+custom_mempool_alloc(struct rte_mempool *mp,
+   const char *name, unsigned n,
+   __attribute__((unused)) int socket_id,
+   __attribute__((unused)) unsigned flags)
+
+{
+   static struct custom_mempool *cm;
+   uint32_t *objnum;
+   unsigned int i;
+
+   cm = malloc(sizeof(struct custom_mempool));
+
+   /* Create the ring so we can enqueue/dequeue */
+   cm->r = rte_ring_create(name,
+   rte_align32pow2(n+1), 0, 0);
+   if (cm->r == NULL)
+   return NULL;
+
+   /*
+* Loop around the elements and allocate the required memory
+* and place them in the ring.
+* Not worried about alignment or performance for this example.
+* Also, set the first 32-bits to be the element number so we
+* can check later on.
+*/
+   for (i = 0; i < n; i++) {
+

[dpdk-dev] [PATCH 2/5] memool: add stack (lifo) based external mempool handler

2016-01-26 Thread David Hunt

Adds a simple stack-based mempool handler.

Signed-off-by: David Hunt 
---
 app/test/test_mempool_perf.c   |   1 -
 lib/librte_mempool/Makefile|   1 +
 lib/librte_mempool/rte_mempool_stack.c | 167 +
 3 files changed, 168 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_mempool/rte_mempool_stack.c

diff --git a/app/test/test_mempool_perf.c b/app/test/test_mempool_perf.c
index 091c1df..c5a1d2a 100644
--- a/app/test/test_mempool_perf.c
+++ b/app/test/test_mempool_perf.c
@@ -52,7 +52,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/lib/librte_mempool/Makefile b/lib/librte_mempool/Makefile
index 7c81ef6..d795b48 100644
--- a/lib/librte_mempool/Makefile
+++ b/lib/librte_mempool/Makefile
@@ -43,6 +43,7 @@ LIBABIVER := 1
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool.c
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool_default.c
+SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool_stack.c
 ifeq ($(CONFIG_RTE_LIBRTE_XEN_DOM0),y)
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_dom0_mempool.c
 endif
diff --git a/lib/librte_mempool/rte_mempool_stack.c 
b/lib/librte_mempool/rte_mempool_stack.c
new file mode 100644
index 000..c7d232e
--- /dev/null
+++ b/lib/librte_mempool/rte_mempool_stack.c
@@ -0,0 +1,167 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#include "rte_mempool_internal.h"
+
+struct rte_mempool_common_stack {
+   /* Spinlock to protect access */
+   rte_spinlock_t sl;
+
+   uint32_t size;
+   uint32_t len;
+   void *objs[];
+
+#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
+#endif
+};
+
+static void *
+common_stack_alloc(struct rte_mempool *mp,
+   const char *name, unsigned n, int socket_id, unsigned flags)
+{
+   struct rte_mempool_common_stack *s;
+   char stack_name[RTE_RING_NAMESIZE];
+
+   int size = sizeof(*s) + (n+16)*sizeof(void *);
+
+   flags = flags;
+
+   /* Allocate our local memory structure */
+   snprintf(stack_name, sizeof(stack_name), "%s-common-stack", name);
+   s = rte_zmalloc_socket(stack_name,
+   size, RTE_CACHE_LINE_SIZE, socket_id);
+   if (s == NULL) {
+   RTE_LOG(ERR, MEMPOOL, "Cannot allocate stack!\n");
+   return NULL;
+   }
+
+   /* And the spinlock we use to protect access */
+   rte_spinlock_init(&s->sl);
+
+   s->size = n;
+   mp->rt_pool = (void *) s;
+   mp->handler_idx = rte_get_mempool_handler("stack");
+
+   return (void *) s;
+}
+
+static int common_stack_put(void *p, void * const *obj_table,
+   unsigned n)
+{
+   struct rte_mempool_common_stack *s =
+   (struct rte_mempool_common_stack *)p;
+   void **cache_objs;
+   unsigned index;
+
+   /* Acquire lock */
+   rte_spinlock_lock(&s->sl);
+   cache_objs = &s->objs[s->len];
+
+   /* Is there sufficient space in the stack ? */
+   if ((s->len + n) > s->size) {
+   rte_spinlock_unlock(&s->sl);
+   return -ENOENT;
+   }
+
+   /* Add elements back into the cache */
+   for (index = 0; index < n; ++index, obj_table++)
+   cache_obj

[dpdk-dev] [PATCH 1/5] mempool: add external mempool manager support

2016-01-26 Thread David Hunt

Adds the new rte_mempool_create_ext api and callback mechanism for
external mempool handlers

Modifies the existing rte_mempool_create to set up the handler_idx to
the relevant mempool handler based on the handler name:
ring_sp_sc
ring_mp_mc
ring_sp_mc
ring_mp_sc

Signed-off-by: David Hunt 
---
 app/test/test_mempool_perf.c  |   1 -
 lib/librte_mempool/Makefile   |   1 +
 lib/librte_mempool/rte_mempool.c  | 210 +++
 lib/librte_mempool/rte_mempool.h  | 207 +++
 lib/librte_mempool/rte_mempool_default.c  | 229 ++
 lib/librte_mempool/rte_mempool_internal.h |  74 ++
 6 files changed, 634 insertions(+), 88 deletions(-)
 create mode 100644 lib/librte_mempool/rte_mempool_default.c
 create mode 100644 lib/librte_mempool/rte_mempool_internal.h

diff --git a/app/test/test_mempool_perf.c b/app/test/test_mempool_perf.c
index cdc02a0..091c1df 100644
--- a/app/test/test_mempool_perf.c
+++ b/app/test/test_mempool_perf.c
@@ -161,7 +161,6 @@ per_lcore_mempool_test(__attribute__((unused)) void *arg)
   n_get_bulk);
if (unlikely(ret < 0)) {
rte_mempool_dump(stdout, mp);
-   rte_ring_dump(stdout, mp->ring);
/* in this case, objects are lost... */
return -1;
}
diff --git a/lib/librte_mempool/Makefile b/lib/librte_mempool/Makefile
index a6898ef..7c81ef6 100644
--- a/lib/librte_mempool/Makefile
+++ b/lib/librte_mempool/Makefile
@@ -42,6 +42,7 @@ LIBABIVER := 1

 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool.c
+SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool_default.c
 ifeq ($(CONFIG_RTE_LIBRTE_XEN_DOM0),y)
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_dom0_mempool.c
 endif
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index aff5f6d..8c01838 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -59,10 +59,11 @@
 #include 

 #include "rte_mempool.h"
+#include "rte_mempool_internal.h"

 TAILQ_HEAD(rte_mempool_list, rte_tailq_entry);

-static struct rte_tailq_elem rte_mempool_tailq = {
+struct rte_tailq_elem rte_mempool_tailq = {
.name = "RTE_MEMPOOL",
 };
 EAL_REGISTER_TAILQ(rte_mempool_tailq)
@@ -149,7 +150,7 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
obj_init(mp, obj_init_arg, obj, obj_idx);

/* enqueue in ring */
-   rte_ring_sp_enqueue(mp->ring, obj);
+   rte_mempool_ext_put_bulk(mp, &obj, 1);
 }

 uint32_t
@@ -375,48 +376,28 @@ rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t elt_sz,
return usz;
 }

-#ifndef RTE_LIBRTE_XEN_DOM0
-/* stub if DOM0 support not configured */
-struct rte_mempool *
-rte_dom0_mempool_create(const char *name __rte_unused,
-   unsigned n __rte_unused,
-   unsigned elt_size __rte_unused,
-   unsigned cache_size __rte_unused,
-   unsigned private_data_size __rte_unused,
-   rte_mempool_ctor_t *mp_init __rte_unused,
-   void *mp_init_arg __rte_unused,
-   rte_mempool_obj_ctor_t *obj_init __rte_unused,
-   void *obj_init_arg __rte_unused,
-   int socket_id __rte_unused,
-   unsigned flags __rte_unused)
-{
-   rte_errno = EINVAL;
-   return NULL;
-}
-#endif
-
 /* create the mempool */
 struct rte_mempool *
 rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
-  unsigned cache_size, unsigned private_data_size,
-  rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-  rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
-  int socket_id, unsigned flags)
+   unsigned cache_size, unsigned private_data_size,
+   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
+   rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+   int socket_id, unsigned flags)
 {
if (rte_xen_dom0_supported())
return rte_dom0_mempool_create(name, n, elt_size,
-  cache_size, private_data_size,
-  mp_init, mp_init_arg,
-  obj_init, obj_init_arg,
-  socket_id, flags);
+   cache_size, private_data_size,
+   mp_init, mp_init_arg,
+   obj_init, obj_init_arg,
+   socket_id, flags);
else

[dpdk-dev] [PATCH 0/5] add external mempool manager

2016-01-26 Thread David Hunt

Hi all on the list.

Here's a proposed patch for an external mempool manager

The External Mempool Manager is an extension to the mempool API that allows
users to add and use an external mempool manager, which allows external memory
subsystems such as external hardware memory management systems and software
based memory allocators to be used with DPDK.

The existing API to the internal DPDK mempool manager will remain unchanged
and will be backward compatible.

There are two aspects to external mempool manager.
  1. Adding the code for your new mempool handler. This is achieved by adding a
 new mempool handler source file into the librte_mempool library, and
 using the REGISTER_MEMPOOL_HANDLER macro.
  2. Using the new API to call rte_mempool_create_ext to create a new mempool
 using the name parameter to identify which handler to use.

New API calls added
 1. A new mempool 'create' function which accepts mempool handler name.
 2. A new mempool 'rte_get_mempool_handler' function which accepts mempool
handler name, and returns the index to the relevant set of callbacks for
that mempool handler

Several external mempool managers may be used in the same application. A new
mempool can then be created by using the new 'create' function, providing the
mempool handler name to point the mempool to the relevant mempool manager
callback structure.

The old 'create' function can still be called by legacy programs, and will
internally work out the mempool handle based on the flags provided (single
producer, single consumer, etc). By default handles are created internally to
implement the built-in DPDK mempool manager and mempool types.

The external mempool manager needs to provide the following functions.
 1. alloc - allocates the mempool memory, and adds each object onto a ring
 2. put   - puts an object back into the mempool once an application has
finished with it
 3. get   - gets an object from the mempool for use by the application
 4. get_count - gets the number of available objects in the mempool
 5. free  - frees the mempool memory

Every time a get/put/get_count is called from the application/PMD, the
callback for that mempool is called. These functions are in the fastpath,
and any unoptimised handlers may limit performance.
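
As a rough illustration of the five callbacks listed above, the following
standalone C sketch models a handler as a table of function pointers backed
by a simple malloc'd LIFO array. All names here (sketch_handler, lifo_*,
free_cb) are made up for this example and are not the definitions from
rte_mempool.h; the signatures in the patch differ (for instance, alloc there
takes the mempool and its creation parameters).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical callback table; names and signatures are illustrative only. */
struct sketch_handler {
	const char *name;
	void *(*alloc)(unsigned n);
	int (*put)(void *pool, void * const *objs, unsigned n);
	int (*get)(void *pool, void **objs, unsigned n);
	unsigned (*get_count)(void *pool);
	void (*free_cb)(void *pool);
};

/* Simple LIFO backing store (assumption: no locking, single thread). */
struct lifo {
	unsigned size;	/* capacity in objects */
	unsigned len;	/* objects currently stored */
	void *objs[];
};

void *lifo_alloc(unsigned n)
{
	struct lifo *s = calloc(1, sizeof(*s) + n * sizeof(void *));

	if (s != NULL)
		s->size = n;
	return s;
}

int lifo_put(void *pool, void * const *objs, unsigned n)
{
	struct lifo *s = pool;
	unsigned i;

	assert(s != NULL);
	if (s->len + n > s->size)
		return -1;	/* not enough room left */
	for (i = 0; i < n; i++)
		s->objs[s->len++] = objs[i];
	return 0;
}

int lifo_get(void *pool, void **objs, unsigned n)
{
	struct lifo *s = pool;
	unsigned i;

	assert(s != NULL);
	if (n > s->len)
		return -1;	/* not enough objects available */
	for (i = 0; i < n; i++)
		objs[i] = s->objs[--s->len];
	return 0;
}

unsigned lifo_get_count(void *pool)
{
	return ((struct lifo *)pool)->len;
}

void lifo_free(void *pool)
{
	free(pool);
}

const struct sketch_handler sketch_lifo_handler = {
	.name = "sketch_lifo",
	.alloc = lifo_alloc,
	.put = lifo_put,
	.get = lifo_get,
	.get_count = lifo_get_count,
	.free_cb = lifo_free,
};
```

Since put/get/get_count sit behind an indirect call on every operation, the
work done inside them is what mostly determines the fastpath cost.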

The new APIs are as follows:

1. rte_mempool_create_ext

struct rte_mempool *
rte_mempool_create_ext(const char * name, unsigned n,
unsigned cache_size, unsigned private_data_size,
int socket_id, unsigned flags,
const char * handler_name);

2. rte_get_mempool_handler

int16_t
rte_get_mempool_handler(const char *name);

Please see rte_mempool.h for further information on the parameters.


The important thing to note is that the mempool handler is passed by name
to rte_mempool_create_ext, which in turn calls rte_get_mempool_handler to
get the handler index, which is stored in the rte_mempool structure. This
allows multiple processes to use the same mempool, as the function pointers
are accessed via the handler index.
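
To make the name-to-index idea concrete, here is a small standalone C sketch.
It assumes a per-process table that every process fills in the same order, so
that an index stored in shared memory resolves to the right callbacks even if
the function addresses differ between processes. The names (sketch_ops,
sketch_register, sketch_get_handler, sketch_pool) are hypothetical and not
taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_HANDLERS 8

/* Hypothetical per-handler callbacks; only one is shown for brevity. */
struct sketch_ops {
	const char *name;
	unsigned (*get_count)(const void *pool);
};

/* Per-process table; assumed to be filled in the same order everywhere. */
static struct sketch_ops sketch_table[SKETCH_MAX_HANDLERS];
static int16_t sketch_num;

/* Append a handler and return its index, or -1 if the table is full. */
int16_t sketch_register(const struct sketch_ops *ops)
{
	if (sketch_num >= SKETCH_MAX_HANDLERS)
		return -1;
	sketch_table[sketch_num] = *ops;
	return sketch_num++;
}

/* Look a handler up by name; returns its index or -1 if not found. */
int16_t sketch_get_handler(const char *name)
{
	int16_t i;

	for (i = 0; i < sketch_num; i++)
		if (strcmp(sketch_table[i].name, name) == 0)
			return i;
	return -1;
}

/* Stand-in for the shared pool structure: it stores an index, not pointers. */
struct sketch_pool {
	int16_t handler_idx;
	unsigned count;
};

/* Example callback that just reads the counter kept in the pool. */
unsigned sketch_plain_count(const void *pool)
{
	return ((const struct sketch_pool *)pool)->count;
}

/* Dispatch through the local table using the index kept in the pool. */
unsigned sketch_pool_count(const struct sketch_pool *p)
{
	return sketch_table[p->handler_idx].get_count(p);
}
```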

The mempool handler structure contains callbacks to the implementation of
the handler, and is set up for registration as follows:

static struct rte_mempool_handler handler_sp_mc = {
.name = "ring_sp_mc",
.alloc = rte_mempool_common_ring_alloc,
.put = common_ring_sp_put,
.get = common_ring_mc_get,
.get_count = common_ring_get_count,
.free = common_ring_free,
};

And then the following macro will register the handler in the array of handlers

REGISTER_MEMPOOL_HANDLER(handler_sp_mc);
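
The body of the registration macro is not shown in this message. One common
way to build such a macro, assuming a GCC/Clang toolchain, is a constructor
function that runs before main() and appends the handler to a table. The
sketch below uses hypothetical names (SKETCH_REGISTER, sketch_add,
sketch_registry) and is not claimed to match the macro in the patch.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_MAX 8

/* Reduced handler structure: just a name for this illustration. */
struct sketch_handler {
	const char *name;
};

static const struct sketch_handler *sketch_registry[SKETCH_MAX];
static int sketch_count;

/* Append a handler to the table; returns its index or -1 if full. */
static int sketch_add(const struct sketch_handler *h)
{
	if (sketch_count >= SKETCH_MAX)
		return -1;
	sketch_registry[sketch_count] = h;
	return sketch_count++;
}

/*
 * Assumption: __attribute__((constructor)) is available (GCC/Clang).
 * The generated function runs at load time, before main().
 */
#define SKETCH_REGISTER(h) \
	static void __attribute__((constructor)) sketch_ctor_##h(void) \
	{ \
		sketch_add(&h); \
	}

static const struct sketch_handler handler_sp_mc = {
	.name = "ring_sp_mc",
};

SKETCH_REGISTER(handler_sp_mc)
```

With this pattern, adding a handler only needs a new source file linked into
the library; no central list has to be edited.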

For an example of a simple malloc based mempool manager, see
lib/librte_mempool/custom_mempool.c

For an example of API usage, please see app/test/test_ext_mempool.c, which
implements a rudimentary mempool manager using simple mallocs for each
mempool object (custom_mempool.c).


David Hunt (5):
  mempool: add external mempool manager support
  mempool: add stack (lifo) based external mempool handler
  mempool: add custom external mempool handler example
  mempool: add autotest for external mempool custom example
  mempool: allow rte_pktmbuf_pool_create switch between mempool handlers

 app/test/Makefile |   1 +
 app/test/test_ext_mempool.c   | 470 ++
 app/test/test_mempool_perf.c  |   2 -
 lib/librte_mbuf/rte_mbuf.c|  11 +
 lib/librte_mempool/Makefile   |   3 +
 lib/librte_mempool/custom_mempool.c   | 158 ++
 lib/librte_mempool/rte_mempool.c  | 208 +
 lib/librte_mempool/rte_mempool.h  | 205 +++--
 lib/librte_mempool/rte_mempool_default.c  | 229 +++
 lib/librte_mempool/rte_mempool_internal.h |  70 +
 lib/librte_mempool/rte_mempool_stack.c| 162 ++
 11 files changed, 1430 insertions(+), 89 deletions(-)
 create mode 100644 app/test/test_ext_mempool.c
 create mode 100644 lib/librte_mempool/custom_mempool.c
 create mode 100644 lib

[dpdk-dev] [PATCH V1 1/1] jobstats: added function abort for job

2016-01-26 Thread Marcin Kerlin

This patch adds a new function, rte_jobstats_abort. It marks *job* as finished,
and the time of this work is added to management time instead of execution time.
This function should be used instead of rte_jobstats_finish if a condition occurs;
the condition is defined by the application, for example when receiving n>0 packets.
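
The accounting difference between the two calls can be pictured with a
simplified standalone model. The structures and functions below (sketch_ctx,
sketch_job, sketch_start/finish/abort) are hypothetical stand-ins, with time
passed in explicitly; they are not the rte_jobstats definitions and leave out
the min/max tracking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the jobstats context. */
struct sketch_ctx {
	uint64_t state_time;		/* timestamp of the last state change */
	uint64_t exec_time;		/* time accounted as job execution */
	uint64_t management_time;	/* time accounted as overhead */
};

/* Simplified stand-in for a job. */
struct sketch_job {
	struct sketch_ctx *context;	/* non-NULL while the job is running */
	uint64_t exec_cnt;
};

/* Time since the last state change counts as management time. */
void sketch_start(struct sketch_ctx *ctx, struct sketch_job *job, uint64_t now)
{
	ctx->management_time += now - ctx->state_time;
	ctx->state_time = now;
	job->context = ctx;
}

/* Normal completion: the elapsed time counts as execution time. */
int sketch_finish(struct sketch_job *job, uint64_t now)
{
	struct sketch_ctx *ctx;

	if (job == NULL || job->context == NULL)
		return -1;
	ctx = job->context;
	ctx->exec_time += now - ctx->state_time;
	ctx->state_time = now;
	job->exec_cnt++;
	job->context = NULL;
	return 0;
}

/* Abort: the elapsed time is folded into management time instead. */
int sketch_abort(struct sketch_job *job, uint64_t now)
{
	struct sketch_ctx *ctx;

	if (job == NULL || job->context == NULL)
		return -1;
	ctx = job->context;
	ctx->management_time += now - ctx->state_time;
	ctx->state_time = now;
	job->context = NULL;
	return 0;
}
```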

Signed-off-by: Marcin Kerlin 
---
 lib/librte_jobstats/rte_jobstats.c   | 22 ++
 lib/librte_jobstats/rte_jobstats.h   | 17 +
 lib/librte_jobstats/rte_jobstats_version.map |  7 +++
 3 files changed, 46 insertions(+)

diff --git a/lib/librte_jobstats/rte_jobstats.c b/lib/librte_jobstats/rte_jobstats.c
index 2eaac0c..b603125 100644
--- a/lib/librte_jobstats/rte_jobstats.c
+++ b/lib/librte_jobstats/rte_jobstats.c
@@ -170,6 +170,26 @@ rte_jobstats_start(struct rte_jobstats_context *ctx, struct rte_jobstats *job)
 }

 int
+rte_jobstats_abort(struct rte_jobstats *job)
+{
+   struct rte_jobstats_context *ctx;
+   uint64_t now, exec_time;
+
+   /* Some sanity check. */
+   if (unlikely(job == NULL || job->context == NULL))
+   return -EINVAL;
+
+   ctx = job->context;
+   now = get_time();
+   exec_time = now - ctx->state_time;
+   ADD_TIME_MIN_MAX(ctx, management, exec_time);
+   ctx->state_time = now;
+   job->context = NULL;
+
+   return 0;
+}
+
+int
 rte_jobstats_finish(struct rte_jobstats *job, int64_t job_value)
 {
struct rte_jobstats_context *ctx;
@@ -191,6 +211,7 @@ rte_jobstats_finish(struct rte_jobstats *job, int64_t job_value)
 * executed. */
now = get_time();
exec_time = now - ctx->state_time;
+   job->last_job_time = exec_time;
ADD_TIME_MIN_MAX(job, exec, exec_time);
ADD_TIME_MIN_MAX(ctx, exec, exec_time);

@@ -269,5 +290,6 @@ void
 rte_jobstats_reset(struct rte_jobstats *job)
 {
RESET_TIME_MIN_MAX(job, exec);
+   job->last_job_time = 0;
job->exec_cnt = 0;
 }
diff --git a/lib/librte_jobstats/rte_jobstats.h b/lib/librte_jobstats/rte_jobstats.h
index de6a89a..9995319 100644
--- a/lib/librte_jobstats/rte_jobstats.h
+++ b/lib/librte_jobstats/rte_jobstats.h
@@ -90,6 +90,9 @@ struct rte_jobstats {
uint64_t exec_cnt;
/**< Execute count. */

+   uint64_t last_job_time;
+   /**< Last job time */
+
char name[RTE_JOBSTATS_NAMESIZE];
/**< Name of this job */

@@ -237,6 +240,20 @@ int
 rte_jobstats_start(struct rte_jobstats_context *ctx, struct rte_jobstats *job);

 /**
+ * Mark that *job* finished its execution, but time of this work will be skipped
+ * and added to management time.
+ *
+ * @param job
+ *  Job object.
+ *
+ * @return
+ *  0 on success
+ *  -EINVAL if job is NULL or job was not started (it has no context).
+ */
+int
+rte_jobstats_abort(struct rte_jobstats *job);
+
+/**
  * Mark that *job* finished its execution. Context in which it was executing
  * will receive stat update. After this function call *job* object is ready to
  * be executed in other context.
diff --git a/lib/librte_jobstats/rte_jobstats_version.map b/lib/librte_jobstats/rte_jobstats_version.map
index cb01bfd..0ec0650 100644
--- a/lib/librte_jobstats/rte_jobstats_version.map
+++ b/lib/librte_jobstats/rte_jobstats_version.map
@@ -17,3 +17,10 @@ DPDK_2.0 {

local: *;
 };
+
+DPDK_2.3 {
+   global:
+
+   rte_jobstats_abort;
+
+} DPDK_2.0;
\ No newline at end of file
-- 
1.9.1

[dpdk-dev] [PATCH v2 1/2] ethdev: remove useless null checks

2016-01-26 Thread Jan Viktorin

What about the RTE_VERIFY? I think, it's more appropriate here.
Otherwise, feel free to add:

Reviewed-by: Jan Viktorin 

On Fri, 22 Jan 2016 15:06:57 +0100
David Marchand  wrote:

> We are in static functions and those passed arguments can't be NULL.
> 
> Signed-off-by: David Marchand 
> ---
>  lib/librte_ether/rte_ethdev.c | 15 ---
>  1 file changed, 15 deletions(-)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index ed971b4..cab74e0 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -220,9 +220,6 @@ rte_eth_dev_create_unique_device_name(char *name, size_t 
> size,
>  {
>   int ret;
>  
> - if ((name == NULL) || (pci_dev == NULL))
> - return -EINVAL;
> -
>   ret = snprintf(name, size, "%d:%d.%d",
>   pci_dev->addr.bus, pci_dev->addr.devid,
>   pci_dev->addr.function);
> @@ -505,9 +502,6 @@ rte_eth_dev_is_detachable(uint8_t port_id)
>  static int
>  rte_eth_dev_attach_pdev(struct rte_pci_addr *addr, uint8_t *port_id)
>  {
> - if ((addr == NULL) || (port_id == NULL))
> - goto err;
> -
>   /* re-construct pci_device_list */
>   if (rte_eal_pci_scan())
>   goto err;
> @@ -531,9 +525,6 @@ rte_eth_dev_detach_pdev(uint8_t port_id, struct 
> rte_pci_addr *addr)
>   struct rte_pci_addr freed_addr;
>   struct rte_pci_addr vp;
>  
> - if (addr == NULL)
> - goto err;
> -
>   /* check whether the driver supports detach feature, or not */
>   if (rte_eth_dev_is_detachable(port_id))
>   goto err;
> @@ -566,9 +557,6 @@ rte_eth_dev_attach_vdev(const char *vdevargs, uint8_t 
> *port_id)
>   char *name = NULL, *args = NULL;
>   int ret = -1;
>  
> - if ((vdevargs == NULL) || (port_id == NULL))
> - goto end;
> -
>   /* parse vdevargs, then retrieve device name and args */
>   if (rte_eal_parse_devargs_str(vdevargs, &name, &args))
>   goto end;
> @@ -602,9 +590,6 @@ rte_eth_dev_detach_vdev(uint8_t port_id, char *vdevname)
>  {
>   char name[RTE_ETH_NAME_MAX_LEN];
>  
> - if (vdevname == NULL)
> - goto err;
> -
>   /* check whether the driver supports detach feature, or not */
>   if (rte_eth_dev_is_detachable(port_id))
>   goto err;



-- 
   Jan Viktorin  E-mail: Viktorin at RehiveTech.com
   System Architect  Web:www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic

[dpdk-dev] [PATCH v2 2/2] ethdev: move code to common place in hotplug

2016-01-26 Thread Jan Viktorin

On Fri, 22 Jan 2016 15:06:58 +0100
David Marchand  wrote:

> Move these error logs and checks on detach capabilities in a common place.
> 
> Signed-off-by: David Marchand 
Reviewed-by: Jan Viktorin

[dpdk-dev] [PATCH v6 1/2] tools: Add support for handling built-in kernel modules

2016-01-26 Thread Thomas Monjalon

2016-01-20 10:48, krytarowski at caviumnetworks.com:
> --- a/tools/dpdk_nic_bind.py
> +++ b/tools/dpdk_nic_bind.py
> -for line in loaded_mods:
> +try:
> +# Get list of syfs modules, some of them might be builtin and merge 
> with mods

Please could you explain this comment?
Is it remaining from previous versions of the patch?

[...]
> +# special case for vfio_pci (module is named vfio-pci,
> +# but its .ko is named vfio_pci)

Isn't it common to have dash replaced by underscore for kernel modules?

[dpdk-dev] [PATCH v5 1/2] tools: Add support for handling built-in kernel modules

2016-01-26 Thread Thomas Monjalon

2016-01-19 17:35, Kamil Rytarowski:
> 
> W dniu 18.01.2016 o 15:32, Thomas Monjalon pisze:
> > Hi Kamil,
> >
> > 2015-12-09 14:19, Kamil Rytarowski:
> >> Currently dpdk_nic_bind.py detects Linux kernel modules via reading
> >> /proc/modules. Built-in ones aren't listed there and therefore they are not
> >> being found by the script.
> >>
> >> Add support for checking built-in modules with parsing the sysfs files.
> >>
> >> This commit obsoletes the /proc/modules parsing approach.
> >>
> >> Signed-off-by: Kamil Rytarowski 
> > I have a doubt about this tag:
> >> Signed-off-by: David Marchand 
> > What do you mean here?
> 
> Excuse me, it should be:  Acked-by: David Marchand 
> 
> 
> http://dpdk.org/ml/archives/dev/2015-December/029720.html

The ack was only for the patch 2/2

[dpdk-dev] [PATCH v6 08/11] eal: pci: introduce RTE_KDRV_VFIO_NOIOMMUi driver mode

2016-01-26 Thread Santosh Shukla

On Mon, Jan 25, 2016 at 8:59 PM, Thomas Monjalon
 wrote:
> 2016-01-21 22:47, Santosh Shukla:
>> On Thu, Jan 21, 2016 at 8:16 PM, Thomas Monjalon
>>  wrote:
>> > 2016-01-21 17:34, Santosh Shukla:
>> >> On Thu, Jan 21, 2016 at 4:58 PM, Thomas Monjalon
>> >>  wrote:
>> >> > 2016-01-21 16:43, Santosh Shukla:
>> >> >> David Marchand  wrote:
>> >> >> > This is a mode (specific to vfio), not a new kernel driver.
>> >> >> >
>> >> >> Yes, Specific to VFIO and this is why noiommu appended after vfio i.e..
>> >> >> __VFIO and __VFIO_NOIOMMU.
>> >> >
>> >> > Woaaa! Your logic is really disappointing :)
>> >> > Specific to VFIO => append _NOIOMMU
>> >> > If it's for VFIO, it should be called VFIO (that's my logic).
>> >> >
>> >> I am confused by reading your comment. vfio works for default iommu
>> >> and now with noiommu. drv->kdrv need to know driver mode for vfio
>> >> case. So that user can simply read drv->kdrv value in their driver and
>> >> accordingly use vfio rd/wr api for example {pread/pwrite}. This is how
>> >> rte_eal_pci_vfio_read/write_bar() api implemented.
>> >
>> > Sorry I don't understand. Why EAL read/write functions should be different
>> > depending of the VFIO mode?
>>
>> no, EAL rd/wr functions are not different for vfio or vfio modes {same
>> for iommu or noiommu}. Pl. see pci_eal_read/write_bar() api. Those
>> apis currently used for VFIO, Irrespective of vfio mode. If required,
>> we can add UIO bar_rd/wr api too. pci_eal_rd/wr_bar() are abstract
>> apis. Underneath implementation can be vfio or uio type.
>
> It means you agree the suffix _NOIOMMU is not needed?
> It seems we go nowhere in this discussion. You said
> "drv->kdrv need to know driver mode for vfio"

In my observation, virtio currently works with vfio-noiommu; that's why I
said drv->kdrv needs to know the vfio mode.

> and after
> "Those apis currently used for VFIO, Irrespective of vfio mode"
> That's why I assume your first assumption was wrong.
>

The newly introduced dpdk global api pci_eal_rd/wr_bar() can be used for
both vfio and uio, and for vfio both with and without IOMMU.

>> >> > Why do we care to parse noiommu only?
>> >>
>> >> Because pmd drivers example virtio can work with vfio only in
>> >> _noiommu_ mode. In particular, virtio spec 0.95 / legacy virtio.
>> >
>> > Please could you explain the limitation (except IOMMU availability)?
>>
>> Ok.
>>
>> I believe - we both agree that noiommu mode is a need for pmd drivers
>> like virtio, right? if so then other reason is implementation driven
>
> No, noiommu is a need for some environment having no IOMMU.
> But in my understanding, virtio could run with a nested IOMMU.
>

Interesting, I'd like to understand the nested one. I tried it in the past
by passing "iommu=pt intel_iommu=on kvm-intel.nested=1" on the cmdline for
x86 (for both guest and host), but binding the virtio pci device to the
vfio-pci driver fails. I tried on a 4.2 kernel (qemu version 2.5); is it
working for >4.2 kernel / qemu versions?

>> i.e..
>>
>> Pl. look at virtio_pci.c in this patch.. VIRTIO_RD/WR/_1/2/4()
>> implementation. They are in-use and applicable to  virtio spec 0.95,
>> so far support uio/ioport-way rd/wr. Now to support vfio-way rd/wr -
>> need to check drv->kdrv value, that value should be of vfio_noiommu
>> types __not__  generic _vfio types.
>
> I still don't understand why it would not work with VFIO w/IOMMU.
>

With vfio+iommu, binding the virtio pci device to the vfio-pci driver fails,
giving the error below:
[   53.053464] VFIO - User Level meta-driver version: 0.3
[   73.077805] vfio-pci: probe of :00:03.0 failed with error -22
[   73.077852] vfio-pci: probe of :00:03.0 failed with error -22

vfio_pci_probe() --> vfio_iommu_group_get() --> iommu_group_get()
fails: the iommu doesn't have a group for the virtio pci device.

In the noiommu case, it prepares the group and adds the device to the iommu
group, so it passes.

Jason mentioned in another thread that he is working on a virtio+iommu
approach [1]. The patches are not merged, and I have yet to evaluate them
for the virtio pmd driver with iommu(+vfio). So I am wondering how a
virtio pci device could work unless Jason's patches are used?

[1] https://www.mail-archive.com/qemu-devel at nongnu.org/msg337079.html

>> >> So at
>> >> the initialization (example .. virtio-net) of such pmd driver, pmd
>> >> driver should know that vfio-with-noiommu mode enabled or not? for
>> >> that pmd driver simply checks drv->kdrv value.
>> >
>> > If a check is needed, I would prefer using your function
>> > pci_vfio_is_noiommu() and remove driver modes from struct 
>> > rte_kernel_driver.
>>
>> I don't think calling pci_vfio_no_iommu() inside
>> virtio_reg_rd/wr_1/2/3() would be a good idea.
>
> Why? The value may be cached in the priv properties.
>
pci_vfio_is_noiommu() parses /sys for
- enable_noiommu param
- attached driver name is vfio-noiommu or not.

It does file operations for that; I meant to say that calling this api
within the register_rd/wr functions is not correct. It would be better if
those low level register_rd/wr ap

[dpdk-dev] [PATCH v6 08/11] eal: pci: introduce RTE_KDRV_VFIO_NOIOMMUi driver mode

2016-01-26 Thread Thomas Monjalon

2016-01-26 19:35, Santosh Shukla:
> On Tue, Jan 26, 2016 at 6:30 PM, Thomas Monjalon
>  wrote:
> > 2016-01-26 15:56, Santosh Shukla:
> >> In my observation, currently virtio work for vfio-noiommu, that's why
> >> said drv->kdrv need to know vfio mode.
> >
> > It is your observation. It may change in near future.
> 
> so that mean till then, virtio support for non-x86 arch has to wait?

No, absolutely not. virtio for non-x86 is welcome.

> We have working model with vfio-noiommu, don't you think it make sense
> to let vfio_noiommu implementation exist and later in-case
> virtio+iommu gets mainline then switch to vfio __mode__ agnostic
> approach. And for that All it takes to replace __noiommu suffix with
> default.

I'm just saying you should not touch the enum rte_kernel_driver.
RTE_KDRV_VFIO is a driver.
RTE_KDRV_VFIO_NOIOMMU is a mode.
As the VFIO API is the same in both modes, there is no reason to
distinguish them at this level.
Your patch adds the NOIOMMU case everywhere:
case RTE_KDRV_VFIO:
+   case RTE_KDRV_VFIO_NOIOMMU:

I'll stop commenting here to let others give their opinion.

[...]
> >> with vfio+iommu; binding virtio pci device to vfio-pci driver fail;
> >> giving below error:
> >> [   53.053464] VFIO - User Level meta-driver version: 0.3
> >> [   73.077805] vfio-pci: probe of :00:03.0 failed with error -22
> >> [   73.077852] vfio-pci: probe of :00:03.0 failed with error -22
> >>
> >> vfio_pci_probe() --> vfio_iommu_group_get() --> iommu_group_get()
> >> fails: iommu doesn't have group for virtio pci device.
> >
> > Yes it fails when binding.
> > So the later check in the virtio PMD is useless.
> 
> Which check?

The check for VFIO noiommu only:
-   if (dev->kdrv == RTE_KDRV_VFIO)
+   if (dev->kdrv == RTE_KDRV_VFIO_NOIOMMU)

[...]
> > Furthermore restricting virtio to no-iommu mode doesn't bring
> > any improvement.
> 
> We're not __restricting__, as soon as virtio+iommu gets working state,
> we'll simply replace __noiommu with default. Then its upto user to try
> out virtio with vfio default or vfio_noiommu.

Yes it's up to user.
So your code should be
if (dev->kdrv == RTE_KDRV_VFIO)

> > That's why I suggest to keep the initial semantic of kdrv and
> > not pollute it with VFIO modes.
> 
> I am okay to live with default and forget suffix __noiommu but there
> are implementation problem which was discussed in other thread
> - Virtio pmd driver should avoid interface parsing i.e.
> virtio_resource_init_uio/vfio() etc.. For vfio case - We could easily
> get rid of by moving /sys parsing to pci_eal layer, Right? If so then
> virtio currently works with vfio-noiommu, it make sense to me that
> pci_eal layer does parsing for pmd driver before that pmd driver get
> initialized.

Please reword. What is the problem?

> - Another case could be: iommu-less-pmd-driver. eal layer to do
> parsing before updating drv->kdrv.

[...]
> >> >> > If a check is needed, I would prefer using your function
> >> >> > pci_vfio_is_noiommu() and remove driver modes from struct 
> >> >> > rte_kernel_driver.
> >> >>
> >> >> I don't think calling pci_vfio_no_iommu() inside
> >> >> virtio_reg_rd/wr_1/2/3() would be a good idea.
> >> >
> >> > Why? The value may be cached in the priv properties.
> >> >
> >> pci_vfio_is_noiommu() parses /sys for
> >> - enable_noiommu param
> >> - attached driver name is vfio-noiommu or not.
> >>
> >> It does file operation for that, I meant to say that calling this api
> >> within register_rd/wr function is not correct. It would be better if
> >> those low level register_rd/wr api only checks driver_types.
> >
> > Yes, that's why I said the return of pci_vfio_is_noiommu() may be cached
> > to keep efficiency.
> 
> I am not convinced though, Still find pmd driver checking driver_types
> using drv->kdrv is better approach than introducing a new global
> variable which may look something like;

Not a global variable. A function in EAL layer. A variable in PMD priv.
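
One possible reading of that suggestion, as a standalone C sketch: the slow
sysfs-based check is called once at device init and its result is kept in
the PMD private data, so the register accessors only test a cached flag.
Every name below (sketch_slow_is_noiommu, sketch_priv, sketch_pick_access)
is hypothetical, and the slow check is replaced by a stub that counts calls.

```c
#include <assert.h>
#include <stdbool.h>

/* Counts how often the (expensive) probe runs; for illustration only. */
static int sketch_probe_calls;

/* Stand-in for an EAL function that would parse /sys; here just a stub. */
bool sketch_slow_is_noiommu(void)
{
	sketch_probe_calls++;
	return true;
}

/* Stand-in for the PMD private data holding the cached answer. */
struct sketch_priv {
	bool vfio_noiommu;
};

enum sketch_access {
	SKETCH_ACCESS_IOPORT,	/* in()/out() style access */
	SKETCH_ACCESS_VFIO,	/* pread()/pwrite() style access */
};

/* Called once per device at init time: run the slow check and cache it. */
void sketch_dev_init(struct sketch_priv *priv)
{
	priv->vfio_noiommu = sketch_slow_is_noiommu();
}

/* Fast path: no file operations, only a test of the cached flag. */
enum sketch_access sketch_pick_access(const struct sketch_priv *priv)
{
	return priv->vfio_noiommu ? SKETCH_ACCESS_VFIO : SKETCH_ACCESS_IOPORT;
}
```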

> At pci_eal layer 
> bool vfio_mode;
> vfio_mode = pci_vfio_is_noiommu();
> 
> At virtio pmd driver layer 
> Checking value at vfio_mode variable before doing virtio_rd/wr for
> vfio interface.
> 
> Instead virtio pmd driver doing
> 
> virtio_reg_rd/wr_1/2/4()
> {
> if (drv->kdrv == VFIO)
>   do pread()/pwrite()
> else
>   in()/out()
> }
> 
> is better approach.
> 
> Let me know if you still think former is better than latter then I'll
> send patch revision right-away.

[dpdk-dev] [PATCH 12/12] testpmd: extend commands for fdir's vlan input set

2016-01-26 Thread Jingjing Wu

This patch extends the commands for changing a filter's input set.
It adds vlan as one of the filter's input fields.

Signed-off-by: Jingjing Wu 
---
 app/test-pmd/cmdline.c  | 6 +++---
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index ecc822a..f7ffce1 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -748,7 +748,7 @@ static void cmd_help_long_parsed(void *parsed_result,
"set_fdir_input_set (port_id) "
"(ipv4-frag|ipv4-tcp|ipv4-udp|ipv4-sctp|ipv4-other|"
"ipv6-frag|ipv6-tcp|ipv6-udp|ipv6-sctp|ipv6-other|"
-   "l2_payload) (ethertype|src-ipv4|dst-ipv4|src-ipv6|"
+   "l2_payload) (ivlan|ethertype|src-ipv4|dst-ipv4|src-ipv6|"
"dst-ipv6|ipv4-tos|ipv4-proto|ipv4-ttl|ipv6-tc|"
"ipv6-next-header|ipv6-hop-limits|udp-src-port|"
"udp-dst-port|tcp-src-port|tcp-dst-port|"
@@ -9622,7 +9622,7 @@ cmdline_parse_token_string_t cmd_set_fdir_input_set_flow_type =
 cmdline_parse_token_string_t cmd_set_fdir_input_set_field =
TOKEN_STRING_INITIALIZER(struct cmd_set_fdir_input_set_result,
inset_field,
-   "ethertype#src-ipv4#dst-ipv4#src-ipv6#dst-ipv6#"
+   "ivlan#ethertype#src-ipv4#dst-ipv4#src-ipv6#dst-ipv6#"
"ipv4-tos#ipv4-proto#ipv4-ttl#ipv6-tc#ipv6-next-header#"
"ipv6-hop-limits#udp-src-port#udp-dst-port#"
"tcp-src-port#tcp-dst-port#sctp-src-port#sctp-dst-port#"
@@ -9637,7 +9637,7 @@ cmdline_parse_inst_t cmd_set_fdir_input_set = {
.help_str = "set_fdir_input_set  "
"ipv4-frag|ipv4-tcp|ipv4-udp|ipv4-sctp|ipv4-other|"
"ipv6-frag|ipv6-tcp|ipv6-udp|ipv6-sctp|ipv6-other|l2_payload "
-   "ethertype|src-ipv4|dst-ipv4|src-ipv6|dst-ipv6|"
+   "ivlan|ethertype|src-ipv4|dst-ipv4|src-ipv6|dst-ipv6|"
"ipv4-tos|ipv4-proto|ipv4-ttl|ipv6-tc|ipv6-next-header|"
"ipv6-hop-limits|udp-src-port|udp-dst-port|"
"tcp-src-port|tcp-dst-port|sctp-src-port|sctp-dst-port|"
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 417ddde..aa20d5a 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -1878,7 +1878,7 @@ Set the input set for flow director::

set_fdir_input_set (port_id) (ipv4-frag|ipv4-tcp|ipv4-udp|ipv4-sctp| \
ipv4-other|ipv6|ipv6-frag|ipv6-tcp|ipv6-udp|ipv6-sctp|ipv6-other| \
-   l2_payload) (ethertype|src-ipv4|dst-ipv4|src-ipv6|dst-ipv6|ipv4-tos| \
+   l2_payload) (ivlan|ethertype|src-ipv4|dst-ipv4|src-ipv6|dst-ipv6|ipv4-tos| \
ipv4-proto|ipv4-ttl|ipv6-tc|ipv6-next-header|ipv6-hop-limits| \
udp-src-port|udp-dst-port|tcp-src-port|tcp-dst-port|sctp-src-port| \
sctp-dst-port|sctp-veri-tag|udp-key|gre-key|none) (select|add)
-- 
2.4.0

[dpdk-dev] [PATCH 11/12] i40e: extend flow director to filter by vlan id

2016-01-26 Thread Jingjing Wu

This patch extends flow director to allow selecting the vlan id
as part of the filter's input set and to program the filter rule with the vlan id.
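
The role of a per-packet-type table of allowed input-set bits can be
sketched in standalone C as below: a requested input set is accepted only if
it asks for no bit outside the mask for that packet type. The bit values,
enum and function names are invented for this illustration and do not
correspond to the I40E_INSET_* definitions in the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical input-set bits; the values are arbitrary. */
#define SKETCH_INSET_VLAN_INNER	(1ULL << 0)
#define SKETCH_INSET_IPV4_SRC	(1ULL << 1)
#define SKETCH_INSET_IPV4_DST	(1ULL << 2)
#define SKETCH_INSET_SRC_PORT	(1ULL << 3)
#define SKETCH_INSET_DST_PORT	(1ULL << 4)
#define SKETCH_INSET_ETHERTYPE	(1ULL << 5)

enum sketch_pctype {
	SKETCH_PCTYPE_IPV4_UDP,
	SKETCH_PCTYPE_L2_PAYLOAD,
	SKETCH_PCTYPE_MAX,
};

/* Per packet type: which fields may be selected as filter input. */
static const uint64_t sketch_valid_inset[SKETCH_PCTYPE_MAX] = {
	[SKETCH_PCTYPE_IPV4_UDP] =
		SKETCH_INSET_VLAN_INNER |
		SKETCH_INSET_IPV4_SRC | SKETCH_INSET_IPV4_DST |
		SKETCH_INSET_SRC_PORT | SKETCH_INSET_DST_PORT,
	[SKETCH_PCTYPE_L2_PAYLOAD] =
		SKETCH_INSET_VLAN_INNER |
		SKETCH_INSET_ETHERTYPE,
};

/* Returns 0 if every requested bit is allowed for this type, else -1. */
int sketch_validate_inset(enum sketch_pctype type, uint64_t inset)
{
	if (inset & ~sketch_valid_inset[type])
		return -1;
	return 0;
}
```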

Signed-off-by: Jingjing Wu 
---
 doc/guides/rel_notes/release_2_3.rst |  1 +
 drivers/net/i40e/i40e_ethdev.c   | 11 +++
 drivers/net/i40e/i40e_fdir.c |  9 +
 3 files changed, 21 insertions(+)

diff --git a/doc/guides/rel_notes/release_2_3.rst b/doc/guides/rel_notes/release_2_3.rst
index 2216fee..63c7e04 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -4,6 +4,7 @@ DPDK Release 2.3
 New Features
 

+* **Added Flow director enhancements on Intel X710/XL710.**

 Resolved Issues
 ---
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 66e3a46..b4bd24b 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -6557,58 +6557,69 @@ i40e_get_valid_input_set(enum i40e_filter_pctype pctype,
 */
static const uint64_t valid_fdir_inset_table[] = {
[I40E_FILTER_PCTYPE_FRAG_IPV4] =
+   I40E_INSET_VLAN_OUTER | I40E_INSET_VLAN_INNER |
I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_PROTO |
I40E_INSET_IPV4_TTL,
[I40E_FILTER_PCTYPE_NONF_IPV4_UDP] =
+   I40E_INSET_VLAN_OUTER | I40E_INSET_VLAN_INNER |
I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_TTL |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV4_TCP] =
+   I40E_INSET_VLAN_OUTER | I40E_INSET_VLAN_INNER |
I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_TTL |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV4_SCTP] =
+   I40E_INSET_VLAN_OUTER | I40E_INSET_VLAN_INNER |
I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_TTL |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
I40E_INSET_SCTP_VT,
[I40E_FILTER_PCTYPE_NONF_IPV4_OTHER] =
+   I40E_INSET_VLAN_OUTER | I40E_INSET_VLAN_INNER |
I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_PROTO |
I40E_INSET_IPV4_TTL,
[I40E_FILTER_PCTYPE_FRAG_IPV6] =
+   I40E_INSET_VLAN_OUTER | I40E_INSET_VLAN_INNER |
I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
I40E_INSET_IPV6_TC | I40E_INSET_IPV6_NEXT_HDR |
I40E_INSET_IPV6_HOP_LIMIT,
[I40E_FILTER_PCTYPE_NONF_IPV6_UDP] =
+   I40E_INSET_VLAN_OUTER | I40E_INSET_VLAN_INNER |
I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
I40E_INSET_IPV6_TC | I40E_INSET_IPV6_HOP_LIMIT |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV6_TCP] =
+   I40E_INSET_VLAN_OUTER | I40E_INSET_VLAN_INNER |
I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
I40E_INSET_IPV6_TC | I40E_INSET_IPV6_HOP_LIMIT |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV6_SCTP] =
+   I40E_INSET_VLAN_OUTER | I40E_INSET_VLAN_INNER |
I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
I40E_INSET_IPV6_TC | I40E_INSET_IPV6_HOP_LIMIT |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
I40E_INSET_SCTP_VT,
[I40E_FILTER_PCTYPE_NONF_IPV6_OTHER] =
+   I40E_INSET_VLAN_OUTER | I40E_INSET_VLAN_INNER |
I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
I40E_INSET_IPV6_TC | I40E_INSET_IPV6_NEXT_HDR |
I40E_INSET_IPV6_HOP_LIMIT,
[I40E_FILTER_PCTYPE_L2_PAYLOAD] =
+   I40E_INSET_VLAN_OUTER | I40E_INSET_VLAN_INNER |
I40E_INSET_TUNNEL_ID |
I40E_INSET_LAST_ETHER_TYPE,
};
diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/i40e/i40e_fdir.c
index 7566017..bbe6f1f 100644
--- a/drivers/net/i40e/i40e_fdir.c
+++ b/drivers/net/i40e/i40e_fdir.c
@@ -799,6 +799,7 @@ i40e_fdir_construct_pkt(struct i40e_pf *pf,
uint8_t size, dst = 0;
uint8_t i, pit_idx, set_idx = I40E_FLXPLD_L4_IDX; /* use l4 by default*/
bool need_mac = TRUE;
+

[dpdk-dev] [PATCH 10/12] i40e: fix VLAN bitmasks for hash/fdir input sets for tunnels

2016-01-26 Thread Jingjing Wu

From: Andrey Chilikin 

This patch adds the missing VLAN bitmask for the inner frame in the
tunneling case and fixes the VLAN tag bitmasks for the single or outer
frame in the tunneling case.
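
For readers following the bitmask change, here is a minimal sketch of how
an abstract input-set flag can be translated to register bits through a
lookup table, in the spirit of i40e_translate_input_set_reg(). All
EX_* names and bit positions are hypothetical placeholders, not the
driver's real values.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical abstract input-set flags (not the driver's real values). */
#define EX_INSET_VLAN_OUTER   (1ULL << 0)
#define EX_INSET_VLAN_INNER   (1ULL << 1)
#define EX_INSET_VLAN_TUNNEL  (1ULL << 2)

/* Hypothetical register bits, one per field. */
#define EX_REG_L2_OUTER_VLAN  (1ULL << 10)
#define EX_REG_L2_INNER_VLAN  (1ULL << 11)
#define EX_REG_TUNNEL_VLAN    (1ULL << 12)

struct ex_inset_map {
	uint64_t inset;     /* abstract flag requested by the caller */
	uint64_t inset_reg; /* register bit(s) that select the field */
};

/* OR together the register bits of every flag present in 'input'. */
static uint64_t
ex_translate_input_set_reg(uint64_t input)
{
	static const struct ex_inset_map map[] = {
		{EX_INSET_VLAN_OUTER,  EX_REG_L2_OUTER_VLAN},
		{EX_INSET_VLAN_INNER,  EX_REG_L2_INNER_VLAN},
		{EX_INSET_VLAN_TUNNEL, EX_REG_TUNNEL_VLAN},
	};
	uint64_t val = 0;
	size_t i;

	for (i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
		if (input & map[i].inset)
			val |= map[i].inset_reg;
	}
	return val;
}
```

With such a table, a wrong register constant or a missing table entry
silently selects the wrong header field, which is the class of problem
this patch addresses.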

Signed-off-by: Andrey Chilikin 
Signed-off-by: Jingjing Wu 
---
 drivers/net/i40e/i40e_ethdev.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 62cdf81..66e3a46 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -206,10 +206,12 @@
 #define I40E_REG_INSET_L2_DMAC   0xE000ULL
 /* Source MAC address */
 #define I40E_REG_INSET_L2_SMAC   0x1C00ULL
-/* VLAN tag in the outer L2 header */
-#define I40E_REG_INSET_L2_OUTER_VLAN 0x0080ULL
-/* VLAN tag in the inner L2 header */
-#define I40E_REG_INSET_L2_INNER_VLAN 0x0100ULL
+/* Outer (S-Tag) VLAN tag in the outer L2 header */
+#define I40E_REG_INSET_L2_OUTER_VLAN 0x0200ULL
+/* Inner (C-Tag) or single VLAN tag in the outer L2 header */
+#define I40E_REG_INSET_L2_INNER_VLAN 0x0080ULL
+/* Single VLAN tag in the inner L2 header */
+#define I40E_REG_INSET_TUNNEL_VLAN   0x0100ULL
 /* Source IPv4 address */
 #define I40E_REG_INSET_L3_SRC_IP40x00018000ULL
 /* Destination IPv4 address */
@@ -6818,7 +6820,7 @@ i40e_translate_input_set_reg(uint64_t input)
I40E_REG_INSET_TUNNEL_L4_UDP_SRC_PORT},
{I40E_INSET_TUNNEL_DST_PORT,
I40E_REG_INSET_TUNNEL_L4_UDP_DST_PORT},
-   {I40E_INSET_TUNNEL_ID, I40E_REG_INSET_TUNNEL_ID},
+   {I40E_INSET_VLAN_TUNNEL, I40E_REG_INSET_TUNNEL_VLAN},
{I40E_INSET_FLEX_PAYLOAD_W1, I40E_REG_INSET_FLEX_PAYLOAD_WORD1},
{I40E_INSET_FLEX_PAYLOAD_W2, I40E_REG_INSET_FLEX_PAYLOAD_WORD2},
{I40E_INSET_FLEX_PAYLOAD_W3, I40E_REG_INSET_FLEX_PAYLOAD_WORD3},
-- 
2.4.0

[dpdk-dev] [PATCH 09/12] testpmd: extend commands for fdir's tunnel id input set

2016-01-26 Thread Jingjing Wu

This patch extends the testpmd commands for changing a filter's input set.
It adds the GRE/VXLAN tunnel ID as a filter input field.
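
As an illustration of the keyword handling, the sketch below maps the
tunnel keyword from the command line to a tunnel type, similar in shape
to str2fdir_tunneltype(). The EX_* enum is a stand-in for
rte_eth_fdir_tunnel_type, and the fallback to "unknown" for an unmatched
keyword such as "Notunnel" is an assumption, not something this diff
shows.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for enum rte_eth_fdir_tunnel_type (hypothetical names). */
enum ex_fdir_tunnel_type {
	EX_TUNNEL_TYPE_UNKNOWN = 0,
	EX_TUNNEL_TYPE_NVGRE,
	EX_TUNNEL_TYPE_VXLAN,
	EX_TUNNEL_TYPE_GRE,
};

/* Return the tunnel type for a keyword, or UNKNOWN when none matches. */
static enum ex_fdir_tunnel_type
ex_str2tunneltype(const char *string)
{
	static const struct {
		const char *str;
		enum ex_fdir_tunnel_type type;
	} tunneltype_str[] = {
		{"NVGRE", EX_TUNNEL_TYPE_NVGRE},
		{"VxLAN", EX_TUNNEL_TYPE_VXLAN},
		{"GRE",   EX_TUNNEL_TYPE_GRE},
	};
	size_t i;

	for (i = 0; i < sizeof(tunneltype_str) / sizeof(tunneltype_str[0]); i++) {
		if (strcmp(string, tunneltype_str[i].str) == 0)
			return tunneltype_str[i].type;
	}
	/* Assumed behaviour: anything else means "no tunnel". */
	return EX_TUNNEL_TYPE_UNKNOWN;
}
```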

Signed-off-by: Jingjing Wu 
---
 app/test-pmd/cmdline.c  | 27 +--
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 22 --
 2 files changed, 37 insertions(+), 12 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index da1d3f2..ecc822a 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -641,7 +641,8 @@ static void cmd_help_long_parsed(void *parsed_result,
" flow (ipv4-other|ipv4-frag|ipv6-other|ipv6-frag)"
" src (src_ip_address) dst (dst_ip_address)"
" tos (tos_value) proto (proto_value) ttl (ttl_value)"
-   " vlan (vlan_value) flexbytes (flexbytes_value)"
+   " vlan (vlan_value) (NVGRE|VxLAN|GRE|Notunnel)"
+   " (tunnel_id_value) flexbytes (flexbytes_value)"
" (drop|fwd) pf|vf(vf_id) queue (queue_id)"
" fd_id (fd_id_value)\n"
"Add/Del an IP type flow director filter.\n\n"
@@ -651,7 +652,8 @@ static void cmd_help_long_parsed(void *parsed_result,
" src (src_ip_address) (src_port)"
" dst (dst_ip_address) (dst_port)"
" tos (tos_value) ttl (ttl_value)"
-   " vlan (vlan_value) flexbytes (flexbytes_value)"
+   " vlan (vlan_value) (NVGRE|VxLAN|GRE|Notunnel)"
+   " (tunnel_id_value) flexbytes (flexbytes_value)"
" (drop|fwd) pf|vf(vf_id) queue (queue_id)"
" fd_id (fd_id_value)\n"
"Add/Del an UDP/TCP type flow director filter.\n\n"
@@ -663,6 +665,7 @@ static void cmd_help_long_parsed(void *parsed_result,
" tag (verification_tag) "
" tos (tos_value) ttl (ttl_value)"
" vlan (vlan_value)"
+   " (NVGRE|VxLAN|GRE|Notunnel) (tunnel_id_value)"
" flexbytes (flexbytes_value) (drop|fwd)"
" pf|vf(vf_id) queue (queue_id) fd_id (fd_id_value)\n"
"Add/Del a SCTP type flow director filter.\n\n"
@@ -749,7 +752,8 @@ static void cmd_help_long_parsed(void *parsed_result,
"dst-ipv6|ipv4-tos|ipv4-proto|ipv4-ttl|ipv6-tc|"
"ipv6-next-header|ipv6-hop-limits|udp-src-port|"
"udp-dst-port|tcp-src-port|tcp-dst-port|"
-   "sctp-src-port|sctp-dst-port|sctp-veri-tag|none)"
+   "sctp-src-port|sctp-dst-port|sctp-veri-tag|"
+   "udp-key|gre-key|none)"
" (select|add)\n"
"Set the input set for FDir.\n\n"
);
@@ -8092,6 +8096,7 @@ str2fdir_tunneltype(char *string)
} tunneltype_str[] = {
{"NVGRE", RTE_FDIR_TUNNEL_TYPE_NVGRE},
{"VxLAN", RTE_FDIR_TUNNEL_TYPE_VXLAN},
+   {"GRE",   RTE_FDIR_TUNNEL_TYPE_GRE},
};

for (i = 0; i < RTE_DIM(tunneltype_str); i++) {
@@ -8263,6 +8268,10 @@ cmd_flow_director_filter_parsed(void *parsed_result,
   RTE_ETH_FDIR_MAX_FLEXLEN);

entry.input.flow_ext.vlan_tci = rte_cpu_to_be_16(res->vlan_value);
+   entry.input.flow.tunnel_flow.tunnel_type =
+   str2fdir_tunneltype(res->tunnel_type);
+   entry.input.flow.tunnel_flow.tunnel_id =
+   rte_cpu_to_be_32(res->tunnel_id_value);

entry.action.flex_off = 0;  /*use 0 by default */
if (!strcmp(res->drop, "drop"))
@@ -8426,7 +8435,7 @@ cmdline_parse_token_string_t cmd_flow_director_tunnel =
 tunnel, "tunnel");
 cmdline_parse_token_string_t cmd_flow_director_tunnel_type =
TOKEN_STRING_INITIALIZER(struct cmd_flow_director_result,
-tunnel_type, "NVGRE#VxLAN");
+tunnel_type, "NVGRE#VxLAN#GRE#Notunnel");
 cmdline_parse_token_string_t cmd_flow_director_tunnel_id =
TOKEN_STRING_INITIALIZER(struct cmd_flow_director_result,
 tunnel_id, "tunnel-id");
@@ -8458,6 +8467,8 @@ cmdline_parse_inst_t cmd_add_del_ip_flow_director = {
(void *)&cmd_flow_director_ttl_value,
(void *)&cmd_flow_director_vlan,
(void *)&cmd_flow_director_vlan_value,
+   (void *)&cmd_flow_director_tunnel_type,
+   (void *)&cmd_flow_director_tunnel_id_value,
(void *)&cmd_flow_director_flexbytes,
(void *)&cmd_flow_director_flexbytes_value,
(void *)&cmd_flow_director_drop,
@@ -8494,6 +8505,8 @@ cmdline_parse_inst_t cmd_add_del_udp_flow_director = {

[dpdk-dev] [PATCH 08/12] i40e: extend flow director to filter by tunnel ID

2016-01-26 Thread Jingjing Wu

This patch extends flow director to select the VXLAN/GRE tunnel ID
as part of the filter's input set and to program the filter rule with
the defined tunnel type.

Signed-off-by: Jingjing Wu 
---
 drivers/net/i40e/i40e_ethdev.c |  11 
 drivers/net/i40e/i40e_fdir.c   | 125 ++---
 2 files changed, 102 insertions(+), 34 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 32ffc9f..62cdf81 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -6555,48 +6555,59 @@ i40e_get_valid_input_set(enum i40e_filter_pctype pctype,
 */
static const uint64_t valid_fdir_inset_table[] = {
[I40E_FILTER_PCTYPE_FRAG_IPV4] =
+   I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_PROTO |
I40E_INSET_IPV4_TTL,
[I40E_FILTER_PCTYPE_NONF_IPV4_UDP] =
+   I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_TTL |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV4_TCP] =
+   I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_TTL |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV4_SCTP] =
+   I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_TTL |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
I40E_INSET_SCTP_VT,
[I40E_FILTER_PCTYPE_NONF_IPV4_OTHER] =
+   I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_PROTO |
I40E_INSET_IPV4_TTL,
[I40E_FILTER_PCTYPE_FRAG_IPV6] =
+   I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
I40E_INSET_IPV6_TC | I40E_INSET_IPV6_NEXT_HDR |
I40E_INSET_IPV6_HOP_LIMIT,
[I40E_FILTER_PCTYPE_NONF_IPV6_UDP] =
+   I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
I40E_INSET_IPV6_TC | I40E_INSET_IPV6_HOP_LIMIT |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV6_TCP] =
+   I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
I40E_INSET_IPV6_TC | I40E_INSET_IPV6_HOP_LIMIT |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV6_SCTP] =
+   I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
I40E_INSET_IPV6_TC | I40E_INSET_IPV6_HOP_LIMIT |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
I40E_INSET_SCTP_VT,
[I40E_FILTER_PCTYPE_NONF_IPV6_OTHER] =
+   I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
I40E_INSET_IPV6_TC | I40E_INSET_IPV6_NEXT_HDR |
I40E_INSET_IPV6_HOP_LIMIT,
[I40E_FILTER_PCTYPE_L2_PAYLOAD] =
+   I40E_INSET_TUNNEL_ID |
I40E_INSET_LAST_ETHER_TYPE,
};

diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/i40e/i40e_fdir.c
index 5ea97e5..7566017 100644
--- a/drivers/net/i40e/i40e_fdir.c
+++ b/drivers/net/i40e/i40e_fdir.c
@@ -688,11 +688,13 @@ i40e_fdir_configure(struct rte_eth_dev *dev)

 static inline void
 i40e_fdir_fill_eth_ip_head(const struct rte_eth_fdir_input *fdir_input,
-  unsigned char *raw_pkt)
+  unsigned char *pkt, bool need_mac)
 {
-   struct ether_hdr *ether = (struct ether_hdr *)raw_pkt;
-   struct ipv4_hdr *ip;
-   struct ipv6_hdr *ip6;
+   struct ether_hdr *ether = (struct ether_hdr *)pkt;
+   struct ipv4_hdr *ip =
+   (struct ipv4_hdr *)(pkt + sizeof(struct ether_hdr));
+   struct ipv6_hdr *ip6 =
+   (struct ipv6_hdr *)(pkt + sizeof(struct ether_hdr));
static const uint8_t next_proto[] = {
[RTE_ETH_FLOW_FRAG_IPV4] = IPPROTO_IP,
[RTE_ETH_FLOW_NONFRAG_IPV4_TCP] = IPPROTO_TCP,
@@ -708,16 +710,18 @@ i40e_fdir_fill_eth_ip_head(const struct rte_eth_fdir_input *fdir_input,

switch (fdir_input->flow_type) {
case RTE_ETH_FLOW_L2_PAYLOAD:
-   ether->ether_type = fdir_input->flow.l2_flow.ether_type;
+   if (need_mac)
+   ether->ether_type = fdir_input->flow.l2_flow.ether_type;
break;
case RTE_ETH_FLOW_NONFRAG_

[dpdk-dev] [PATCH 07/12] librte_ether: extend rte_eth_fdir_flow to support tunnel format

2016-01-26 Thread Jingjing Wu

This patch changes rte_eth_fdir_flow from a union to a struct to
support more packet formats, for example VXLAN and GRE tunnel
packets with an IP inner frame.

This patch also adds the new RTE_FDIR_TUNNEL_TYPE_GRE enum value.
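
To make the layout change concrete, here is a reduced sketch with
hypothetical ex_* types standing in for the rte_eth_* ones. In the old
union every member shares storage, so an IP flow and a tunnel flow cannot
be described together. In the new struct only the per-protocol flows share
storage (through a C11 anonymous union), while tunnel_flow has its own.

```c
#include <stdint.h>
#include <string.h>

/* Reduced stand-ins for the real flow structures (hypothetical). */
struct ex_ipv4_flow {
	uint32_t src_ip;
	uint32_t dst_ip;
};

struct ex_tunnel_flow {
	int      tunnel_type;
	uint32_t tunnel_id;
};

/* Old layout: ip4_flow and tunnel_flow overlap in memory. */
union ex_flow_old {
	struct ex_ipv4_flow   ip4_flow;
	struct ex_tunnel_flow tunnel_flow;
};

/* New layout: inner-frame flows share storage, tunnel_flow is separate. */
struct ex_flow_new {
	union {
		struct ex_ipv4_flow ip4_flow;
		/* other per-protocol flows would be listed here */
	};
	struct ex_tunnel_flow tunnel_flow;
};

/* Describe a tunnelled packet: inner IPv4 addresses plus tunnel ID. */
static struct ex_flow_new
ex_make_tunnelled_ipv4(uint32_t src, uint32_t dst, int type, uint32_t id)
{
	struct ex_flow_new f;

	memset(&f, 0, sizeof(f));
	f.ip4_flow.src_ip = src;
	f.ip4_flow.dst_ip = dst;
	f.tunnel_flow.tunnel_type = type;
	f.tunnel_flow.tunnel_id = id;
	return f;
}
```

Because the anonymous union keeps the member names unchanged, existing
accesses such as flow.ip4_flow.src_ip still compile, but the size and
layout of the type change, hence the ABI note in the release notes.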

Signed-off-by: Jingjing Wu 
---
 doc/guides/rel_notes/release_2_3.rst |  4 
 lib/librte_ether/rte_eth_ctrl.h  | 27 +++
 2 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/doc/guides/rel_notes/release_2_3.rst b/doc/guides/rel_notes/release_2_3.rst
index 99de186..2216fee 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -39,6 +39,10 @@ API Changes
 ABI Changes
 ---

+* The ethdev flow director structure ``rte_eth_fdir_flow`` was changed.
+  New fields were added to extend flow director's input set, and the
+  organization was also changed to support multiple input formats.
+

 Shared Library Versions
 ---
diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index 248f719..eb4c13d 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -495,6 +495,7 @@ enum rte_eth_fdir_tunnel_type {
RTE_FDIR_TUNNEL_TYPE_UNKNOWN = 0,
RTE_FDIR_TUNNEL_TYPE_NVGRE,
RTE_FDIR_TUNNEL_TYPE_VXLAN,
+   RTE_FDIR_TUNNEL_TYPE_GRE,
 };

 /**
@@ -508,18 +509,20 @@ struct rte_eth_tunnel_flow {
 };

 /**
- * An union contains the inputs for all types of flow
+ * A struct contains the inputs for all types of flow
  */
-union rte_eth_fdir_flow {
-   struct rte_eth_l2_flow l2_flow;
-   struct rte_eth_udpv4_flow  udp4_flow;
-   struct rte_eth_tcpv4_flow  tcp4_flow;
-   struct rte_eth_sctpv4_flow sctp4_flow;
-   struct rte_eth_ipv4_flow   ip4_flow;
-   struct rte_eth_udpv6_flow  udp6_flow;
-   struct rte_eth_tcpv6_flow  tcp6_flow;
-   struct rte_eth_sctpv6_flow sctp6_flow;
-   struct rte_eth_ipv6_flow   ipv6_flow;
+struct rte_eth_fdir_flow {
+   union {
+   struct rte_eth_l2_flow l2_flow;
+   struct rte_eth_udpv4_flow  udp4_flow;
+   struct rte_eth_tcpv4_flow  tcp4_flow;
+   struct rte_eth_sctpv4_flow sctp4_flow;
+   struct rte_eth_ipv4_flow   ip4_flow;
+   struct rte_eth_udpv6_flow  udp6_flow;
+   struct rte_eth_tcpv6_flow  tcp6_flow;
+   struct rte_eth_sctpv6_flow sctp6_flow;
+   struct rte_eth_ipv6_flow   ipv6_flow;
+   };
struct rte_eth_mac_vlan_flow mac_vlan_flow;
struct rte_eth_tunnel_flow   tunnel_flow;
 };
@@ -540,7 +543,7 @@ struct rte_eth_fdir_flow_ext {
  */
 struct rte_eth_fdir_input {
uint16_t flow_type;
-   union rte_eth_fdir_flow flow;
+   struct rte_eth_fdir_flow flow;
/**< Flow fields to match, dependent on flow_type */
struct rte_eth_fdir_flow_ext flow_ext;
/**< Additional fields to match */
-- 
2.4.0

[dpdk-dev] [PATCH 06/12] testpmd: extend commands for filter's input set changing

2016-01-26 Thread Jingjing Wu

This patch extends the commands for changing a filter's input set.
It adds tos, protocol and ttl as filter input fields, and
removes the word selection from flex payloads for flow director.

Signed-off-by: Jingjing Wu 
---
 app/test-pmd/cmdline.c  | 100 ++--
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  42 +++-
 2 files changed, 104 insertions(+), 38 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 73298c9..da1d3f2 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -640,6 +640,7 @@ static void cmd_help_long_parsed(void *parsed_result,
"flow_director_filter (port_id) mode IP 
(add|del|update)"
" flow (ipv4-other|ipv4-frag|ipv6-other|ipv6-frag)"
" src (src_ip_address) dst (dst_ip_address)"
+   " tos (tos_value) proto (proto_value) ttl (ttl_value)"
" vlan (vlan_value) flexbytes (flexbytes_value)"
" (drop|fwd) pf|vf(vf_id) queue (queue_id)"
" fd_id (fd_id_value)\n"
@@ -649,6 +650,7 @@ static void cmd_help_long_parsed(void *parsed_result,
" flow (ipv4-tcp|ipv4-udp|ipv6-tcp|ipv6-udp)"
" src (src_ip_address) (src_port)"
" dst (dst_ip_address) (dst_port)"
+   " tos (tos_value) ttl (ttl_value)"
" vlan (vlan_value) flexbytes (flexbytes_value)"
" (drop|fwd) pf|vf(vf_id) queue (queue_id)"
" fd_id (fd_id_value)\n"
@@ -658,7 +660,9 @@ static void cmd_help_long_parsed(void *parsed_result,
" flow (ipv4-sctp|ipv6-sctp)"
" src (src_ip_address) (src_port)"
" dst (dst_ip_address) (dst_port)"
-   " tag (verification_tag) vlan (vlan_value)"
+   " tag (verification_tag) "
+   " tos (tos_value) ttl (ttl_value)"
+   " vlan (vlan_value)"
" flexbytes (flexbytes_value) (drop|fwd)"
" pf|vf(vf_id) queue (queue_id) fd_id (fd_id_value)\n"
"Add/Del a SCTP type flow director filter.\n\n"
@@ -738,14 +742,15 @@ static void cmd_help_long_parsed(void *parsed_result,
"fld-8th|none) (select|add)\n"
"Set the input set for hash.\n\n"

-   "set_fdir_input_set (port_id) (ipv4|ipv4-frag|"
-   "ipv4-tcp|ipv4-udp|ipv4-sctp|ipv4-other|ipv6|"
+   "set_fdir_input_set (port_id) "
+   "(ipv4-frag|ipv4-tcp|ipv4-udp|ipv4-sctp|ipv4-other|"
"ipv6-frag|ipv6-tcp|ipv6-udp|ipv6-sctp|ipv6-other|"
-   "l2_payload) (src-ipv4|dst-ipv4|src-ipv6|dst-ipv6|"
-   "udp-src-port|udp-dst-port|tcp-src-port|tcp-dst-port|"
-   "sctp-src-port|sctp-dst-port|sctp-veri-tag|fld-1st|"
-   "fld-2nd|fld-3rd|fld-4th|fld-5th|fld-6th|fld-7th|"
-   "fld-8th|none) (select|add)\n"
+   "l2_payload) (ethertype|src-ipv4|dst-ipv4|src-ipv6|"
+   "dst-ipv6|ipv4-tos|ipv4-proto|ipv4-ttl|ipv6-tc|"
+   "ipv6-next-header|ipv6-hop-limits|udp-src-port|"
+   "udp-dst-port|tcp-src-port|tcp-dst-port|"
+   "sctp-src-port|sctp-dst-port|sctp-veri-tag|none)"
+   " (select|add)\n"
"Set the input set for FDir.\n\n"
);
}
@@ -7983,6 +7988,12 @@ struct cmd_flow_director_result {
uint16_t port_dst;
cmdline_fixed_string_t verify_tag;
uint32_t verify_tag_value;
+   cmdline_ipaddr_t tos;
+   uint8_t tos_value;
+   cmdline_ipaddr_t proto;
+   uint8_t proto_value;
+   cmdline_ipaddr_t ttl;
+   uint8_t ttl_value;
cmdline_fixed_string_t vlan;
uint16_t vlan_value;
cmdline_fixed_string_t flexbytes;
@@ -8162,12 +8173,15 @@ cmd_flow_director_filter_parsed(void *parsed_result,
switch (entry.input.flow_type) {
case RTE_ETH_FLOW_FRAG_IPV4:
case RTE_ETH_FLOW_NONFRAG_IPV4_OTHER:
+   entry.input.flow.ip4_flow.proto = res->proto_value;
case RTE_ETH_FLOW_NONFRAG_IPV4_UDP:
case RTE_ETH_FLOW_NONFRAG_IPV4_TCP:
IPV4_ADDR_TO_UINT(res->ip_dst,
entry.input.flow.ip4_flow.dst_ip);
IPV4_ADDR_TO_UINT(res->ip_src,
entry.input.flow.ip4_flow.src_ip);
+   entry.input.flow.ip4_flow.tos = res->tos_value;
+   entry.input.flow.ip4_flow.ttl = res->ttl_value;
/* need convert to big endian. */
entry.input.flow.udp4_flow.dst_port =

[dpdk-dev] [PATCH 05/12] i40e: extend flow director to filter by more IP Header fields

2016-01-26 Thread Jingjing Wu

This patch extends flow director to select more IP header fields
as the filter input set.

Signed-off-by: Jingjing Wu 
---
 drivers/net/i40e/i40e_ethdev.c | 69 ++
 drivers/net/i40e/i40e_fdir.c   | 26 +++-
 2 files changed, 75 insertions(+), 20 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 7a09fbc..32ffc9f 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -218,6 +218,8 @@
 #define I40E_REG_INSET_L3_IP4_TOS0x0040ULL
 /* IPv4 Protocol */
 #define I40E_REG_INSET_L3_IP4_PROTO  0x0004ULL
+/* IPv4 Time to Live */
+#define I40E_REG_INSET_L3_IP4_TTL0x0004ULL
 /* Source IPv6 address */
 #define I40E_REG_INSET_L3_SRC_IP60x0007F800ULL
 /* Destination IPv6 address */
@@ -226,6 +228,8 @@
 #define I40E_REG_INSET_L3_IP6_TC 0x0040ULL
 /* IPv6 Next Header */
 #define I40E_REG_INSET_L3_IP6_NEXT_HDR   0x0008ULL
+/* IPv6 Hop Limit */
+#define I40E_REG_INSET_L3_IP6_HOP_LIMIT  0x0008ULL
 /* Source L4 port */
 #define I40E_REG_INSET_L4_SRC_PORT   0x0004ULL
 /* Destination L4 port */
@@ -269,10 +273,12 @@
 #define I40E_TRANSLATE_INSET 0
 #define I40E_TRANSLATE_REG   1

-#define I40E_INSET_IPV4_TOS_MASK  0x0009FF00UL
-#define I40E_INSET_IPV4_PROTO_MASK0x000DFF00UL
-#define I40E_INSET_IPV6_TC_MASK   0x0009F00FUL
-#define I40E_INSET_IPV6_NEXT_HDR_MASK 0x000C00FFUL
+#define I40E_INSET_IPV4_TOS_MASK0x0009FF00UL
+#define I40E_INSET_IPv4_TTL_MASK0x000D00FFUL
+#define I40E_INSET_IPV4_PROTO_MASK  0x000DFF00UL
+#define I40E_INSET_IPV6_TC_MASK 0x0009F00FUL
+#define I40E_INSET_IPV6_HOP_LIMIT_MASK  0x000CFF00UL
+#define I40E_INSET_IPV6_NEXT_HDR_MASK   0x000C00FFUL

 static int eth_i40e_dev_init(struct rte_eth_dev *eth_dev);
 static int eth_i40e_dev_uninit(struct rte_eth_dev *eth_dev);
@@ -6549,30 +6555,47 @@ i40e_get_valid_input_set(enum i40e_filter_pctype pctype,
 */
static const uint64_t valid_fdir_inset_table[] = {
[I40E_FILTER_PCTYPE_FRAG_IPV4] =
-   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST,
+   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
+   I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_PROTO |
+   I40E_INSET_IPV4_TTL,
[I40E_FILTER_PCTYPE_NONF_IPV4_UDP] =
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
+   I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_TTL |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV4_TCP] =
-   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST,
+   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
+   I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_TTL |
+   I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV4_SCTP] =
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
+   I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_TTL |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
I40E_INSET_SCTP_VT,
[I40E_FILTER_PCTYPE_NONF_IPV4_OTHER] =
-   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST,
+   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
+   I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_PROTO |
+   I40E_INSET_IPV4_TTL,
[I40E_FILTER_PCTYPE_FRAG_IPV6] =
-   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST,
+   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
+   I40E_INSET_IPV6_TC | I40E_INSET_IPV6_NEXT_HDR |
+   I40E_INSET_IPV6_HOP_LIMIT,
[I40E_FILTER_PCTYPE_NONF_IPV6_UDP] =
-   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST,
+   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
+   I40E_INSET_IPV6_TC | I40E_INSET_IPV6_HOP_LIMIT |
+   I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV6_TCP] =
-   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST,
+   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
+   I40E_INSET_IPV6_TC | I40E_INSET_IPV6_HOP_LIMIT |
+   I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV6_SCTP] =
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
+   I40E_INSET_IPV6_TC | I40E_INSET_IPV6_HOP_LIMIT |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
I40E_INSET_SCTP_VT,
[I40E_FILTER_PCTYPE_NONF_IPV6_OTHER] =
-   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST,
+   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
+   I40E_INSET_IPV6_TC | I40E_INSET_IPV6_NEXT_HDR |
+   I40E_INSET_IPV6_HOP_LIMIT,
[I40E_FILTER_PCTYPE_L2_PAYLO

[dpdk-dev] [PATCH 04/12] i40e: restore default setting on input set of filters

2016-01-26 Thread Jingjing Wu

This patch adds a new function that sets the input set to its default
value during initialization.
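
The new function writes each 64-bit input-set value as two 32-bit
registers. A minimal sketch of that split, and of the inverse join used
when reading the pair back, follows. The ex_* helper names are
hypothetical and no hardware access is modelled.

```c
#include <stdint.h>

#define EX_32_BIT_WIDTH 32

/* Split a 64-bit input-set value into low and high register words. */
static void
ex_split_inset_reg(uint64_t inset_reg, uint32_t *lo, uint32_t *hi)
{
	*lo = (uint32_t)(inset_reg & UINT32_MAX);
	*hi = (uint32_t)((inset_reg >> EX_32_BIT_WIDTH) & UINT32_MAX);
}

/* Rebuild the 64-bit value from the two register words. */
static uint64_t
ex_join_inset_reg(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << EX_32_BIT_WIDTH) | lo;
}
```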

Signed-off-by: Jingjing Wu 
---
 drivers/net/i40e/i40e_ethdev.c | 56 ++
 1 file changed, 56 insertions(+)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index f3c2e94..7a09fbc 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -374,6 +374,7 @@ static int i40e_dev_udp_tunnel_add(struct rte_eth_dev *dev,
struct rte_eth_udp_tunnel *udp_tunnel);
 static int i40e_dev_udp_tunnel_del(struct rte_eth_dev *dev,
struct rte_eth_udp_tunnel *udp_tunnel);
+static void i40e_filter_input_set_init(struct i40e_pf *pf);
 static int i40e_ethertype_filter_set(struct i40e_pf *pf,
struct rte_eth_ethertype_filter *filter,
bool add);
@@ -788,6 +789,8 @@ eth_i40e_dev_init(struct rte_eth_dev *dev)
 * It should be removed once issues are fixed in NVM.
 */
i40e_flex_payload_reg_init(hw);
+   /* Initialize the input set for filters (hash and fd) to default value */
+   i40e_filter_input_set_init(pf);

/* Initialize the parameters for adminq */
i40e_init_adminq_parameter(hw);
@@ -6844,6 +6847,59 @@ i40e_check_write_reg(struct i40e_hw *hw, uint32_t addr, uint32_t val)
(uint32_t)I40E_READ_REG(hw, addr));
 }

+static void
+i40e_filter_input_set_init(struct i40e_pf *pf)
+{
+   struct i40e_hw *hw = I40E_PF_TO_HW(pf);
+   enum i40e_filter_pctype pctype;
+   uint64_t input_set, inset_reg;
+   uint32_t mask_reg[I40E_INSET_MASK_NUM_REG] = {0};
+   int num, i;
+
+   for (pctype = I40E_FILTER_PCTYPE_NONF_IPV4_UDP;
+pctype <= I40E_FILTER_PCTYPE_L2_PAYLOAD; pctype++) {
+   if (!I40E_VALID_PCTYPE(pctype))
+   continue;
+   input_set = i40e_get_default_input_set(pctype);
+
+   num = i40e_generate_inset_mask_reg(input_set, mask_reg,
+  I40E_INSET_MASK_NUM_REG);
+   if (num < 0)
+   return;
+   inset_reg = i40e_translate_input_set_reg(input_set);
+
+   i40e_check_write_reg(hw, I40E_PRTQF_FD_INSET(pctype, 0),
+ (uint32_t)(inset_reg & UINT32_MAX));
+   i40e_check_write_reg(hw, I40E_PRTQF_FD_INSET(pctype, 1),
+(uint32_t)((inset_reg >>
+I40E_32_BIT_WIDTH) & UINT32_MAX));
+   i40e_check_write_reg(hw, I40E_GLQF_HASH_INSET(0, pctype),
+ (uint32_t)(inset_reg & UINT32_MAX));
+   i40e_check_write_reg(hw, I40E_GLQF_HASH_INSET(1, pctype),
+(uint32_t)((inset_reg >>
+I40E_32_BIT_WIDTH) & UINT32_MAX));
+
+   for (i = 0; i < num; i++) {
+   i40e_check_write_reg(hw, I40E_GLQF_FD_MSK(i, pctype),
+mask_reg[i]);
+   i40e_check_write_reg(hw, I40E_GLQF_HASH_MSK(i, pctype),
+mask_reg[i]);
+   }
+   /*clear unused mask registers of the pctype */
+   for (i = num; i < I40E_INSET_MASK_NUM_REG; i++) {
+   i40e_check_write_reg(hw, I40E_GLQF_FD_MSK(i, pctype),
+0);
+   i40e_check_write_reg(hw, I40E_GLQF_HASH_MSK(i, pctype),
+0);
+   }
+   I40E_WRITE_FLUSH(hw);
+
+   /* store the default input set */
+   pf->hash_input_set[pctype] = input_set;
+   pf->fdir.input_set[pctype] = input_set;
+   }
+}
+
 int
 i40e_hash_filter_inset_select(struct i40e_hw *hw,
 struct rte_eth_input_set_conf *conf)
-- 
2.4.0

[dpdk-dev] [PATCH 03/12] i40e: remove flex payload from INPUT_SET_SELECT operation

2016-01-26 Thread Jingjing Wu

In this patch, flex payload is removed from the valid fdir input set
values, because all flex payload configuration can be set in
struct rte_fdir_conf during the device configure phase.
That configuration is more flexible: it covers flex payload selection,
input set selection by word, and mask setting in bits.
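
A minimal sketch of the practical effect, assuming the driver rejects any
requested field that is not in the per-pctype valid table (the check
itself is not part of this diff). The EX_* flags and the helper name are
hypothetical.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical input-set flags (not the driver's real values). */
#define EX_INSET_IPV4_SRC     (1ULL << 0)
#define EX_INSET_IPV4_DST     (1ULL << 1)
#define EX_INSET_FLEX_PAYLOAD (1ULL << 2)

/* Accept 'inset' only if it requests no field outside 'valid'. */
static int
ex_validate_input_set(uint64_t valid, uint64_t inset)
{
	if (inset & ~valid)
		return -EINVAL;
	return 0;
}
```

Under that assumption, once the flex payload flag is dropped from the
table, a request that includes it fails validation, and flex payload has
to be configured through struct rte_fdir_conf instead.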

Signed-off-by: Jingjing Wu 
---
 drivers/net/i40e/i40e_ethdev.c | 59 +++---
 1 file changed, 26 insertions(+), 33 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 004e206..f3c2e94 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -262,7 +262,8 @@
 #define I40E_REG_INSET_FLEX_PAYLOAD_WORD70x0080ULL
 /* 8th word of flex payload */
 #define I40E_REG_INSET_FLEX_PAYLOAD_WORD80x0040ULL
-
+/* all 8 words flex payload */
+#define I40E_REG_INSET_FLEX_PAYLOAD_WORDS0x3FC0ULL
 #define I40E_REG_INSET_MASK_DEFAULT  0xULL

 #define I40E_TRANSLATE_INSET 0
@@ -6545,43 +6546,32 @@ i40e_get_valid_input_set(enum i40e_filter_pctype pctype,
 */
static const uint64_t valid_fdir_inset_table[] = {
[I40E_FILTER_PCTYPE_FRAG_IPV4] =
-   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
-   I40E_INSET_FLEX_PAYLOAD,
+   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST,
[I40E_FILTER_PCTYPE_NONF_IPV4_UDP] =
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
-   I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
-   I40E_INSET_FLEX_PAYLOAD,
+   I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV4_TCP] =
-   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
-   I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
-   I40E_INSET_FLEX_PAYLOAD,
+   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST,
[I40E_FILTER_PCTYPE_NONF_IPV4_SCTP] =
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
-   I40E_INSET_SCTP_VT | I40E_INSET_FLEX_PAYLOAD,
+   I40E_INSET_SCTP_VT,
[I40E_FILTER_PCTYPE_NONF_IPV4_OTHER] =
-   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
-   I40E_INSET_FLEX_PAYLOAD,
+   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST,
[I40E_FILTER_PCTYPE_FRAG_IPV6] =
-   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
-   I40E_INSET_FLEX_PAYLOAD,
+   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST,
[I40E_FILTER_PCTYPE_NONF_IPV6_UDP] =
-   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
-   I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
-   I40E_INSET_FLEX_PAYLOAD,
+   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST,
[I40E_FILTER_PCTYPE_NONF_IPV6_TCP] =
-   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
-   I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
-   I40E_INSET_FLEX_PAYLOAD,
+   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST,
[I40E_FILTER_PCTYPE_NONF_IPV6_SCTP] =
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
-   I40E_INSET_SCTP_VT | I40E_INSET_FLEX_PAYLOAD,
+   I40E_INSET_SCTP_VT,
[I40E_FILTER_PCTYPE_NONF_IPV6_OTHER] =
-   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
-   I40E_INSET_FLEX_PAYLOAD,
+   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST,
[I40E_FILTER_PCTYPE_L2_PAYLOAD] =
-   I40E_INSET_LAST_ETHER_TYPE | I40E_INSET_FLEX_PAYLOAD,
+   I40E_INSET_LAST_ETHER_TYPE,
};

if (pctype > I40E_FILTER_PCTYPE_L2_PAYLOAD)
@@ -6809,7 +6799,7 @@ i40e_translate_input_set_reg(uint64_t input)
return val;
 }

-static uint8_t
+static int
 i40e_generate_inset_mask_reg(uint64_t inset, uint32_t *mask, uint8_t nb_elem)
 {
uint8_t i, idx = 0;
@@ -6827,16 +6817,13 @@ i40e_generate_inset_mask_reg(uint64_t inset, uint32_t *mask, uint8_t nb_elem)
if (!inset || !mask || !nb_elem)
return 0;

-   if (!inset && nb_elem >= I40E_INSET_MASK_NUM_REG) {
-   for (i = 0; i < I40E_INSET_MASK_NUM_REG; i++)
-   mask[i] = 0;
-   return I40E_INSET_MASK_NUM_REG;
-   }

for (i = 0, idx = 0; i < RTE_DIM(inset_mask_map); i++) {
-   if (idx >= nb_elem)
-   break;
-   if (inset & inset_mask_map[i].inset) {
+   if ((inset & inset_mask_map[i].inset) == inset_mask_map[i].inset) {
+   if (idx >= nb_elem) {
+   PMD_DRV_LOG(ERR, "exceed maximal number of bitmasks");
+   return -EINVA

[dpdk-dev] [PATCH 02/12] i40e: split function for input set change of hash and fdir

2016-01-26 Thread Jingjing Wu

This patch splits the function for changing the input set of hash
and fdir, to avoid repeated checks for the different cases.
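
For context, a minimal sketch of the two supported operations, assuming
"select" replaces the stored input set and "add" ORs the new fields into
it. The error codes follow the checks visible in the diff (-EFAULT for a
bad pointer, -EINVAL for an unsupported operation); the ex_* names are
hypothetical and register programming is left out.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for enum rte_filter_input_set_op (hypothetical names). */
enum ex_input_set_op {
	EX_INPUT_SET_UNKNOWN = 0,
	EX_INPUT_SET_SELECT,
	EX_INPUT_SET_ADD,
};

/* Update the stored input set according to the requested operation. */
static int
ex_apply_input_set(uint64_t *stored, uint64_t requested,
		   enum ex_input_set_op op)
{
	if (stored == NULL)
		return -EFAULT;
	if (op != EX_INPUT_SET_SELECT && op != EX_INPUT_SET_ADD)
		return -EINVAL;

	/* "add" keeps the fields that are already selected. */
	if (op == EX_INPUT_SET_ADD)
		requested |= *stored;

	*stored = requested;
	return 0;
}
```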

Signed-off-by: Jingjing Wu 
---
 drivers/net/i40e/i40e_ethdev.c | 233 +
 drivers/net/i40e/i40e_ethdev.h |  11 +-
 drivers/net/i40e/i40e_fdir.c   |   5 +-
 3 files changed, 107 insertions(+), 142 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index bf6220d..004e206 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -6845,25 +6845,6 @@ i40e_generate_inset_mask_reg(uint64_t inset, uint32_t *mask, uint8_t nb_elem)
return idx;
 }

-static uint64_t
-i40e_get_reg_inset(struct i40e_hw *hw, enum rte_filter_type filter,
-   enum i40e_filter_pctype pctype)
-{
-   uint64_t reg = 0;
-
-   if (filter == RTE_ETH_FILTER_HASH) {
-   reg = I40E_READ_REG(hw, I40E_GLQF_HASH_INSET(1, pctype));
-   reg <<= I40E_32_BIT_WIDTH;
-   reg |= I40E_READ_REG(hw, I40E_GLQF_HASH_INSET(0, pctype));
-   } else if (filter == RTE_ETH_FILTER_FDIR) {
-   reg = I40E_READ_REG(hw, I40E_PRTQF_FD_INSET(pctype, 1));
-   reg <<= I40E_32_BIT_WIDTH;
-   reg |= I40E_READ_REG(hw, I40E_PRTQF_FD_INSET(pctype, 0));
-   }
-
-   return reg;
-}
-
 static void
 i40e_check_write_reg(struct i40e_hw *hw, uint32_t addr, uint32_t val)
 {
@@ -6876,103 +6857,96 @@ i40e_check_write_reg(struct i40e_hw *hw, uint32_t addr, uint32_t val)
(uint32_t)I40E_READ_REG(hw, addr));
 }

-static int
-i40e_set_hash_inset_mask(struct i40e_hw *hw,
-enum i40e_filter_pctype pctype,
-enum rte_filter_input_set_op op,
-uint32_t *mask_reg,
-uint8_t num)
+int
+i40e_hash_filter_inset_select(struct i40e_hw *hw,
+struct rte_eth_input_set_conf *conf)
 {
-   uint32_t reg;
-   uint8_t i;
+   struct i40e_pf *pf = &((struct i40e_adapter *)hw->back)->pf;
+   enum i40e_filter_pctype pctype;
+   uint64_t input_set, inset_reg = 0;
+   uint32_t mask_reg[I40E_INSET_MASK_NUM_REG] = {0};
+   int ret, i, num;

-   if (!mask_reg || num > RTE_ETH_INPUT_SET_SELECT)
+   if (!hw || !conf) {
+   PMD_DRV_LOG(ERR, "Invalid pointer");
+   return -EFAULT;
+   }
+   if (conf->op != RTE_ETH_INPUT_SET_SELECT &&
+   conf->op != RTE_ETH_INPUT_SET_ADD) {
+   PMD_DRV_LOG(ERR, "Unsupported input set operation");
return -EINVAL;
-
-   if (op == RTE_ETH_INPUT_SET_SELECT) {
-   for (i = 0; i < I40E_INSET_MASK_NUM_REG; i++) {
-   i40e_check_write_reg(hw, I40E_GLQF_HASH_MSK(i, pctype),
-0);
-   if (i >= num)
-   continue;
-   i40e_check_write_reg(hw, I40E_GLQF_HASH_MSK(i, pctype),
-mask_reg[i]);
-   }
-   } else if (op == RTE_ETH_INPUT_SET_ADD) {
-   uint8_t j, count = 0;
-
-   for (i = 0; i < I40E_INSET_MASK_NUM_REG; i++) {
-   reg = I40E_READ_REG(hw, I40E_GLQF_HASH_MSK(i, pctype));
-   if (reg & I40E_GLQF_HASH_MSK_FIELD)
-   count++;
-   }
-   if (count + num > I40E_INSET_MASK_NUM_REG)
-   return -EINVAL;
-
-   for (i = count, j = 0; i < I40E_INSET_MASK_NUM_REG; i++, j++)
-   i40e_check_write_reg(hw, I40E_GLQF_HASH_MSK(i, pctype),
-mask_reg[j]);
}

-   return 0;
-}
-
-static int
-i40e_set_fd_inset_mask(struct i40e_hw *hw,
-  enum i40e_filter_pctype pctype,
-  enum rte_filter_input_set_op op,
-  uint32_t *mask_reg,
-  uint8_t num)
-{
-   uint32_t reg;
-   uint8_t i;
-
-   if (!mask_reg || num > RTE_ETH_INPUT_SET_SELECT)
+   pctype = i40e_flowtype_to_pctype(conf->flow_type);
+   if (pctype == 0 || pctype > I40E_FILTER_PCTYPE_L2_PAYLOAD) {
+   PMD_DRV_LOG(ERR, "Not supported flow type (%u)",
+   conf->flow_type);
return -EINVAL;
+   }

-   if (op == RTE_ETH_INPUT_SET_SELECT) {
-   for (i = 0; i < I40E_INSET_MASK_NUM_REG; i++) {
-   i40e_check_write_reg(hw, I40E_GLQF_FD_MSK(i, pctype),
-0);
-   if (i >= num)
-   continue;
-   i40e_check_write_reg(hw, I40E_GLQF_FD_MSK(i, pctype),
-mask_reg[i]);
-   }
-   } else if (op == RTE_ETH_INPUT_SET_ADD) {
-   uint

[dpdk-dev] [PATCH 01/12] ethdev: extend flow director to support input set selection

2016-01-26 Thread Jingjing Wu

This patch adds the RTE_ETH_INPUT_SET_L3_IP4_TTL and
RTE_ETH_INPUT_SET_L3_IP6_HOP_LIMITS input field types and extends
struct rte_eth_ipv4_flow and struct rte_eth_ipv6_flow to support
filtering by TOS, protocol and TTL (traffic class, next header and
hop limits for IPv6).

Signed-off-by: Jingjing Wu 
---
 lib/librte_ether/rte_eth_ctrl.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index ce224ad..248f719 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -337,9 +337,11 @@ enum rte_eth_input_set_field {
RTE_ETH_INPUT_SET_L3_SRC_IP6,
RTE_ETH_INPUT_SET_L3_DST_IP6,
RTE_ETH_INPUT_SET_L3_IP4_TOS,
+   RTE_ETH_INPUT_SET_L3_IP4_TTL,
RTE_ETH_INPUT_SET_L3_IP4_PROTO,
RTE_ETH_INPUT_SET_L3_IP6_TC,
RTE_ETH_INPUT_SET_L3_IP6_NEXT_HEADER,
+   RTE_ETH_INPUT_SET_L3_IP6_HOP_LIMITS,

/* L4 */
RTE_ETH_INPUT_SET_L4_UDP_SRC_PORT = 257,
@@ -407,6 +409,9 @@ struct rte_eth_l2_flow {
 struct rte_eth_ipv4_flow {
uint32_t src_ip;  /**< IPv4 source address to match. */
uint32_t dst_ip;  /**< IPv4 destination address to match. */
+   uint8_t  tos; /**< Type of service to match. */
+   uint8_t  ttl; /**< Time to live */
+   uint8_t  proto;
 };

 /**
@@ -443,6 +448,9 @@ struct rte_eth_sctpv4_flow {
 struct rte_eth_ipv6_flow {
uint32_t src_ip[4];  /**< IPv6 source address to match. */
uint32_t dst_ip[4];  /**< IPv6 destination address to match. */
+   uint8_t  tc; /**< Traffic class to match. */
+   uint8_t  proto;  /**< Protocol, next header. */
+   uint8_t  hop_limits;
 };

 /**
-- 
2.4.0
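
For illustration only, here is a minimal sketch of how an application might
populate the three new IPv4 match fields. It is not part of the patch. The
struct below is a local mirror of the extended rte_eth_ipv4_flow so the
example compiles without DPDK headers, and the helper name
ipv4_flow_sketch_fill() is hypothetical. The sketch assumes the caller
already holds the addresses in the byte order the driver expects.

```c
#include <stdint.h>
#include <string.h>

/* Local mirror of the extended struct rte_eth_ipv4_flow from this patch.
 * A real application would include rte_eth_ctrl.h instead. */
struct ipv4_flow_sketch {
	uint32_t src_ip;  /* IPv4 source address to match */
	uint32_t dst_ip;  /* IPv4 destination address to match */
	uint8_t  tos;     /* type of service to match (new field) */
	uint8_t  ttl;     /* time to live to match (new field) */
	uint8_t  proto;   /* protocol to match (new field) */
};

/* Hypothetical helper: zero the flow, then set every match field.
 * Addresses are stored as passed in; no byte-order conversion is done. */
static inline void
ipv4_flow_sketch_fill(struct ipv4_flow_sketch *flow,
		      uint32_t src_ip, uint32_t dst_ip,
		      uint8_t tos, uint8_t ttl, uint8_t proto)
{
	memset(flow, 0, sizeof(*flow));
	flow->src_ip = src_ip;
	flow->dst_ip = dst_ip;
	flow->tos = tos;
	flow->ttl = ttl;
	flow->proto = proto;
}
```

Zeroing the structure first keeps fields that are not part of the selected
input set at a known value.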

[dpdk-dev] [PATCH 00/12] extend flow director's fields in i40e driver

2016-01-26 Thread Jingjing Wu

This patch set extends flow director in the i40e driver to support
filtering by the following additional fields:
 - TOS, Protocol and TTL in the IP header
 - Tunnel ID for NVGRE/GRE/VxLAN packets
 - single VLAN or inner VLAN

Jingjing Wu (12):
  ethdev: extend flow director to support input set selection
  i40e: split function for input set change of hash and fdir
  i40e: remove flex payload from INPUT_SET_SELECT operation
  i40e: restore default setting on input set of filters
  i40e: extend flow director to filter by more IP Header fields
  testpmd: extend commands for filter's input set changing
  librte_ether: extend rte_eth_fdir_flow to support tunnel format
  i40e: extend flow director to filter by tunnel ID
  testpmd: extend commands for fdir's tunnel id input set
  i40e: fix VLAN bitmasks for hash/fdir input sets for tunnels
  i40e: extend flow director to filter by vlan id
  testpmd: extend commands for fdir's vlan input set

 app/test-pmd/cmdline.c  | 121 +++--
 doc/guides/rel_notes/release_2_3.rst|   5 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  56 ++--
 drivers/net/i40e/i40e_ethdev.c  | 401 +---
 drivers/net/i40e/i40e_ethdev.h  |  11 +-
 drivers/net/i40e/i40e_fdir.c| 163 ---
 lib/librte_ether/rte_eth_ctrl.h |  35 ++-
 7 files changed, 529 insertions(+), 263 deletions(-)

-- 
2.4.0

[dpdk-dev] [PATCH 2.3] tools/dpdk_nic_bind.py: Verbosely warn the user on bind

2016-01-26 Thread Aaron Conole

Ping... This patch has been sitting^Hrotting for a bit over a month.

> DPDK ports are only detected during the EAL initialization. After that, any
> new DPDK ports which are bound will not be visible to the application.
>
> The dpdk_nic_bind.py can be a bit more helpful to let users know that DPDK
> enabled applications will not find rebound ports until after they have been
> restarted.
>
> Signed-off-by: Aaron Conole 
>
> ---
> tools/dpdk_nic_bind.py | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/tools/dpdk_nic_bind.py b/tools/dpdk_nic_bind.py
> index f02454e..ca39389 100755
> --- a/tools/dpdk_nic_bind.py
> +++ b/tools/dpdk_nic_bind.py
> @@ -344,8 +344,10 @@ def bind_one(dev_id, driver, force):
>  dev["Driver_str"] = "" # clear driver string
> 
>  # if we are binding to one of DPDK drivers, add PCI id's to that driver
> +bDpdkDriver = False
>  if driver in dpdk_drivers:
>  filename = "/sys/bus/pci/drivers/%s/new_id" % driver
> +bDpdkDriver = True
>  try:
>  f = open(filename, "w")
>  except:
> @@ -371,12 +373,18 @@ def bind_one(dev_id, driver, force):
>  try:
>  f.write(dev_id)
>  f.close()
> +if bDpdkDriver:
> +print "Device rebound to dpdk driver."
> +print "Remember to restart any application that will use this port."
>  except:
>  # for some reason, closing dev_id after adding a new PCI ID to new_id
>  # results in IOError. however, if the device was successfully bound,
>  # we don't care for any errors and can safely ignore IOError
>  tmp = get_pci_device_details(dev_id)
>  if "Driver_str" in tmp and tmp["Driver_str"] == driver:
> +if bDpdkDriver:
> +print "Device rebound to dpdk driver."
> +print "Remember to restart any application that will use this port."
>  return
>  print "Error: bind failed for %s - Cannot bind to driver %s" % (dev_id, driver)
>  if saved_driver is not None: # restore any previous driver

[dpdk-dev] [PATCH 12/12] testpmd: extend commands for fdir's vlan input set

2016-01-26 Thread Jingjing Wu

This patch extends the testpmd commands for changing a filter's input set.
It adds VLAN (ivlan) as a filter input field.

Signed-off-by: Jingjing Wu 
---
 app/test-pmd/cmdline.c  | 6 +++---
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index ecc822a..f7ffce1 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -748,7 +748,7 @@ static void cmd_help_long_parsed(void *parsed_result,
"set_fdir_input_set (port_id) "
"(ipv4-frag|ipv4-tcp|ipv4-udp|ipv4-sctp|ipv4-other|"
"ipv6-frag|ipv6-tcp|ipv6-udp|ipv6-sctp|ipv6-other|"
-   "l2_payload) (ethertype|src-ipv4|dst-ipv4|src-ipv6|"
+   "l2_payload) (ivlan|ethertype|src-ipv4|dst-ipv4|src-ipv6|"
"dst-ipv6|ipv4-tos|ipv4-proto|ipv4-ttl|ipv6-tc|"
"ipv6-next-header|ipv6-hop-limits|udp-src-port|"
"udp-dst-port|tcp-src-port|tcp-dst-port|"
@@ -9622,7 +9622,7 @@ cmdline_parse_token_string_t cmd_set_fdir_input_set_flow_type =
 cmdline_parse_token_string_t cmd_set_fdir_input_set_field =
TOKEN_STRING_INITIALIZER(struct cmd_set_fdir_input_set_result,
inset_field,
-   "ethertype#src-ipv4#dst-ipv4#src-ipv6#dst-ipv6#"
+   "ivlan#ethertype#src-ipv4#dst-ipv4#src-ipv6#dst-ipv6#"
"ipv4-tos#ipv4-proto#ipv4-ttl#ipv6-tc#ipv6-next-header#"
"ipv6-hop-limits#udp-src-port#udp-dst-port#"
"tcp-src-port#tcp-dst-port#sctp-src-port#sctp-dst-port#"
@@ -9637,7 +9637,7 @@ cmdline_parse_inst_t cmd_set_fdir_input_set = {
.help_str = "set_fdir_input_set  "
"ipv4-frag|ipv4-tcp|ipv4-udp|ipv4-sctp|ipv4-other|"
"ipv6-frag|ipv6-tcp|ipv6-udp|ipv6-sctp|ipv6-other|l2_payload "
-   "ethertype|src-ipv4|dst-ipv4|src-ipv6|dst-ipv6|"
+   "ivlan|ethertype|src-ipv4|dst-ipv4|src-ipv6|dst-ipv6|"
"ipv4-tos|ipv4-proto|ipv4-ttl|ipv6-tc|ipv6-next-header|"
"ipv6-hop-limits|udp-src-port|udp-dst-port|"
"tcp-src-port|tcp-dst-port|sctp-src-port|sctp-dst-port|"
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 417ddde..aa20d5a 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -1878,7 +1878,7 @@ Set the input set for flow director::

set_fdir_input_set (port_id) (ipv4-frag|ipv4-tcp|ipv4-udp|ipv4-sctp| \
ipv4-other|ipv6|ipv6-frag|ipv6-tcp|ipv6-udp|ipv6-sctp|ipv6-other| \
-   l2_payload) (ethertype|src-ipv4|dst-ipv4|src-ipv6|dst-ipv6|ipv4-tos| \
+   l2_payload) (ivlan|ethertype|src-ipv4|dst-ipv4|src-ipv6|dst-ipv6|ipv4-tos| \
ipv4-proto|ipv4-ttl|ipv6-tc|ipv6-next-header|ipv6-hop-limits| \
tudp-src-port|udp-dst-port|cp-src-port|tcp-dst-port|sctp-src-port| \
sctp-dst-port|sctp-veri-tag|udp-key|gre-key|none) (select|add)
-- 
2.4.0

[dpdk-dev] [PATCH 11/12] i40e: extend flow director to filter by vlan id

2016-01-26 Thread Jingjing Wu

This patch extends flow director to select the VLAN ID as part of a
filter's input set and to program the filter rule with the VLAN ID.

Signed-off-by: Jingjing Wu 
---
 doc/guides/rel_notes/release_2_3.rst |  1 +
 drivers/net/i40e/i40e_ethdev.c   | 11 +++
 drivers/net/i40e/i40e_fdir.c |  9 +
 3 files changed, 21 insertions(+)

diff --git a/doc/guides/rel_notes/release_2_3.rst b/doc/guides/rel_notes/release_2_3.rst
index 2216fee..63c7e04 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -4,6 +4,7 @@ DPDK Release 2.3
 New Features
 

+* **Added Flow director enhancements on Intel X710/XL710.**

 Resolved Issues
 ---
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 66e3a46..b4bd24b 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -6557,58 +6557,69 @@ i40e_get_valid_input_set(enum i40e_filter_pctype pctype,
 */
static const uint64_t valid_fdir_inset_table[] = {
[I40E_FILTER_PCTYPE_FRAG_IPV4] =
+   I40E_INSET_VLAN_OUTER | I40E_INSET_VLAN_INNER |
I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_PROTO |
I40E_INSET_IPV4_TTL,
[I40E_FILTER_PCTYPE_NONF_IPV4_UDP] =
+   I40E_INSET_VLAN_OUTER | I40E_INSET_VLAN_INNER |
I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_TTL |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV4_TCP] =
+   I40E_INSET_VLAN_OUTER | I40E_INSET_VLAN_INNER |
I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_TTL |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV4_SCTP] =
+   I40E_INSET_VLAN_OUTER | I40E_INSET_VLAN_INNER |
I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_TTL |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
I40E_INSET_SCTP_VT,
[I40E_FILTER_PCTYPE_NONF_IPV4_OTHER] =
+   I40E_INSET_VLAN_OUTER | I40E_INSET_VLAN_INNER |
I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_PROTO |
I40E_INSET_IPV4_TTL,
[I40E_FILTER_PCTYPE_FRAG_IPV6] =
+   I40E_INSET_VLAN_OUTER | I40E_INSET_VLAN_INNER |
I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
I40E_INSET_IPV6_TC | I40E_INSET_IPV6_NEXT_HDR |
I40E_INSET_IPV6_HOP_LIMIT,
[I40E_FILTER_PCTYPE_NONF_IPV6_UDP] =
+   I40E_INSET_VLAN_OUTER | I40E_INSET_VLAN_INNER |
I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
I40E_INSET_IPV6_TC | I40E_INSET_IPV6_HOP_LIMIT |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV6_TCP] =
+   I40E_INSET_VLAN_OUTER | I40E_INSET_VLAN_INNER |
I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
I40E_INSET_IPV6_TC | I40E_INSET_IPV6_HOP_LIMIT |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV6_SCTP] =
+   I40E_INSET_VLAN_OUTER | I40E_INSET_VLAN_INNER |
I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
I40E_INSET_IPV6_TC | I40E_INSET_IPV6_HOP_LIMIT |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
I40E_INSET_SCTP_VT,
[I40E_FILTER_PCTYPE_NONF_IPV6_OTHER] =
+   I40E_INSET_VLAN_OUTER | I40E_INSET_VLAN_INNER |
I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
I40E_INSET_IPV6_TC | I40E_INSET_IPV6_NEXT_HDR |
I40E_INSET_IPV6_HOP_LIMIT,
[I40E_FILTER_PCTYPE_L2_PAYLOAD] =
+   I40E_INSET_VLAN_OUTER | I40E_INSET_VLAN_INNER |
I40E_INSET_TUNNEL_ID |
I40E_INSET_LAST_ETHER_TYPE,
};
diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/i40e/i40e_fdir.c
index 7566017..bbe6f1f 100644
--- a/drivers/net/i40e/i40e_fdir.c
+++ b/drivers/net/i40e/i40e_fdir.c
@@ -799,6 +799,7 @@ i40e_fdir_construct_pkt(struct i40e_pf *pf,
uint8_t size, dst = 0;
uint8_t i, pit_idx, set_idx = I40E_FLXPLD_L4_IDX; /* use l4 by default*/
bool need_mac = TRUE;
+

[dpdk-dev] [PATCH 10/12] i40e: fix VLAN bitmasks for hash/fdir input sets for tunnels

2016-01-26 Thread Jingjing Wu

From: Andrey Chilikin 

This patch adds the missing VLAN bitmask for the inner frame in case of
tunneling and fixes the VLAN tag bitmasks for the single or outer frame
in case of tunneling.

Signed-off-by: Andrey Chilikin 
Signed-off-by: Jingjing Wu 
---
 drivers/net/i40e/i40e_ethdev.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 62cdf81..66e3a46 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -206,10 +206,12 @@
 #define I40E_REG_INSET_L2_DMAC   0xE000ULL
 /* Source MAC address */
 #define I40E_REG_INSET_L2_SMAC   0x1C00ULL
-/* VLAN tag in the outer L2 header */
-#define I40E_REG_INSET_L2_OUTER_VLAN 0x0080ULL
-/* VLAN tag in the inner L2 header */
-#define I40E_REG_INSET_L2_INNER_VLAN 0x0100ULL
+/* Outer (S-Tag) VLAN tag in the outer L2 header */
+#define I40E_REG_INSET_L2_OUTER_VLAN 0x0200ULL
+/* Inner (C-Tag) or single VLAN tag in the outer L2 header */
+#define I40E_REG_INSET_L2_INNER_VLAN 0x0080ULL
+/* Single VLAN tag in the inner L2 header */
+#define I40E_REG_INSET_TUNNEL_VLAN   0x0100ULL
 /* Source IPv4 address */
 #define I40E_REG_INSET_L3_SRC_IP40x00018000ULL
 /* Destination IPv4 address */
@@ -6818,7 +6820,7 @@ i40e_translate_input_set_reg(uint64_t input)
I40E_REG_INSET_TUNNEL_L4_UDP_SRC_PORT},
{I40E_INSET_TUNNEL_DST_PORT,
I40E_REG_INSET_TUNNEL_L4_UDP_DST_PORT},
-   {I40E_INSET_TUNNEL_ID, I40E_REG_INSET_TUNNEL_ID},
+   {I40E_INSET_VLAN_TUNNEL, I40E_REG_INSET_TUNNEL_VLAN},
{I40E_INSET_FLEX_PAYLOAD_W1, I40E_REG_INSET_FLEX_PAYLOAD_WORD1},
{I40E_INSET_FLEX_PAYLOAD_W2, I40E_REG_INSET_FLEX_PAYLOAD_WORD2},
{I40E_INSET_FLEX_PAYLOAD_W3, I40E_REG_INSET_FLEX_PAYLOAD_WORD3},
-- 
2.4.0

[dpdk-dev] [PATCH 09/12] testpmd: extend commands for fdir's tunnel id input set

2016-01-26 Thread Jingjing Wu

This patch extends the testpmd commands for changing a filter's input set.
It adds the GRE/VxLAN tunnel ID as a filter input field.

Signed-off-by: Jingjing Wu 
---
 app/test-pmd/cmdline.c  | 27 +--
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 22 --
 2 files changed, 37 insertions(+), 12 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index da1d3f2..ecc822a 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -641,7 +641,8 @@ static void cmd_help_long_parsed(void *parsed_result,
" flow (ipv4-other|ipv4-frag|ipv6-other|ipv6-frag)"
" src (src_ip_address) dst (dst_ip_address)"
" tos (tos_value) proto (proto_value) ttl (ttl_value)"
-   " vlan (vlan_value) flexbytes (flexbytes_value)"
+   " vlan (vlan_value) (NVGRE|VxLAN|GRE|Notunnel)"
+   " (tunnel_id_value) flexbytes (flexbytes_value)"
" (drop|fwd) pf|vf(vf_id) queue (queue_id)"
" fd_id (fd_id_value)\n"
"Add/Del an IP type flow director filter.\n\n"
@@ -651,7 +652,8 @@ static void cmd_help_long_parsed(void *parsed_result,
" src (src_ip_address) (src_port)"
" dst (dst_ip_address) (dst_port)"
" tos (tos_value) ttl (ttl_value)"
-   " vlan (vlan_value) flexbytes (flexbytes_value)"
+   " vlan (vlan_value) (NVGRE|VxLAN|GRE|Notunnel)"
+   " (tunnel_id_value) flexbytes (flexbytes_value)"
" (drop|fwd) pf|vf(vf_id) queue (queue_id)"
" fd_id (fd_id_value)\n"
"Add/Del an UDP/TCP type flow director filter.\n\n"
@@ -663,6 +665,7 @@ static void cmd_help_long_parsed(void *parsed_result,
" tag (verification_tag) "
" tos (tos_value) ttl (ttl_value)"
" vlan (vlan_value)"
+   " (NVGRE|VxLAN|GRE|Notunnel) (tunnel_id_value)"
" flexbytes (flexbytes_value) (drop|fwd)"
" pf|vf(vf_id) queue (queue_id) fd_id (fd_id_value)\n"
"Add/Del a SCTP type flow director filter.\n\n"
@@ -749,7 +752,8 @@ static void cmd_help_long_parsed(void *parsed_result,
"dst-ipv6|ipv4-tos|ipv4-proto|ipv4-ttl|ipv6-tc|"
"ipv6-next-header|ipv6-hop-limits|udp-src-port|"
"udp-dst-port|tcp-src-port|tcp-dst-port|"
-   "sctp-src-port|sctp-dst-port|sctp-veri-tag|none)"
+   "sctp-src-port|sctp-dst-port|sctp-veri-tag|"
+   "udp-key|gre-key|none)"
" (select|add)\n"
"Set the input set for FDir.\n\n"
);
@@ -8092,6 +8096,7 @@ str2fdir_tunneltype(char *string)
} tunneltype_str[] = {
{"NVGRE", RTE_FDIR_TUNNEL_TYPE_NVGRE},
{"VxLAN", RTE_FDIR_TUNNEL_TYPE_VXLAN},
+   {"GRE",   RTE_FDIR_TUNNEL_TYPE_GRE},
};

for (i = 0; i < RTE_DIM(tunneltype_str); i++) {
@@ -8263,6 +8268,10 @@ cmd_flow_director_filter_parsed(void *parsed_result,
   RTE_ETH_FDIR_MAX_FLEXLEN);

entry.input.flow_ext.vlan_tci = rte_cpu_to_be_16(res->vlan_value);
+   entry.input.flow.tunnel_flow.tunnel_type =
+   str2fdir_tunneltype(res->tunnel_type);
+   entry.input.flow.tunnel_flow.tunnel_id =
+   rte_cpu_to_be_32(res->tunnel_id_value);

entry.action.flex_off = 0;  /*use 0 by default */
if (!strcmp(res->drop, "drop"))
@@ -8426,7 +8435,7 @@ cmdline_parse_token_string_t cmd_flow_director_tunnel =
 tunnel, "tunnel");
 cmdline_parse_token_string_t cmd_flow_director_tunnel_type =
TOKEN_STRING_INITIALIZER(struct cmd_flow_director_result,
-tunnel_type, "NVGRE#VxLAN");
+tunnel_type, "NVGRE#VxLAN#GRE#Notunnel");
 cmdline_parse_token_string_t cmd_flow_director_tunnel_id =
TOKEN_STRING_INITIALIZER(struct cmd_flow_director_result,
 tunnel_id, "tunnel-id");
@@ -8458,6 +8467,8 @@ cmdline_parse_inst_t cmd_add_del_ip_flow_director = {
(void *)&cmd_flow_director_ttl_value,
(void *)&cmd_flow_director_vlan,
(void *)&cmd_flow_director_vlan_value,
+   (void *)&cmd_flow_director_tunnel_type,
+   (void *)&cmd_flow_director_tunnel_id_value,
(void *)&cmd_flow_director_flexbytes,
(void *)&cmd_flow_director_flexbytes_value,
(void *)&cmd_flow_director_drop,
@@ -8494,6 +8505,8 @@ cmdline_parse_inst_t cmd_add_del_udp_flow_director = {

[dpdk-dev] [PATCH 08/12] i40e: extend flow director to filter by tunnel ID

2016-01-26 Thread Jingjing Wu

This patch extends flow director to select the VxLAN/GRE tunnel ID
as part of a filter's input set and to program the filter rule with
the defined tunnel type.

Signed-off-by: Jingjing Wu 
---
 drivers/net/i40e/i40e_ethdev.c |  11 
 drivers/net/i40e/i40e_fdir.c   | 125 ++---
 2 files changed, 102 insertions(+), 34 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 32ffc9f..62cdf81 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -6555,48 +6555,59 @@ i40e_get_valid_input_set(enum i40e_filter_pctype pctype,
 */
static const uint64_t valid_fdir_inset_table[] = {
[I40E_FILTER_PCTYPE_FRAG_IPV4] =
+   I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_PROTO |
I40E_INSET_IPV4_TTL,
[I40E_FILTER_PCTYPE_NONF_IPV4_UDP] =
+   I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_TTL |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV4_TCP] =
+   I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_TTL |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV4_SCTP] =
+   I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_TTL |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
I40E_INSET_SCTP_VT,
[I40E_FILTER_PCTYPE_NONF_IPV4_OTHER] =
+   I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_PROTO |
I40E_INSET_IPV4_TTL,
[I40E_FILTER_PCTYPE_FRAG_IPV6] =
+   I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
I40E_INSET_IPV6_TC | I40E_INSET_IPV6_NEXT_HDR |
I40E_INSET_IPV6_HOP_LIMIT,
[I40E_FILTER_PCTYPE_NONF_IPV6_UDP] =
+   I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
I40E_INSET_IPV6_TC | I40E_INSET_IPV6_HOP_LIMIT |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV6_TCP] =
+   I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
I40E_INSET_IPV6_TC | I40E_INSET_IPV6_HOP_LIMIT |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV6_SCTP] =
+   I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
I40E_INSET_IPV6_TC | I40E_INSET_IPV6_HOP_LIMIT |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
I40E_INSET_SCTP_VT,
[I40E_FILTER_PCTYPE_NONF_IPV6_OTHER] =
+   I40E_INSET_TUNNEL_ID |
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
I40E_INSET_IPV6_TC | I40E_INSET_IPV6_NEXT_HDR |
I40E_INSET_IPV6_HOP_LIMIT,
[I40E_FILTER_PCTYPE_L2_PAYLOAD] =
+   I40E_INSET_TUNNEL_ID |
I40E_INSET_LAST_ETHER_TYPE,
};

diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/i40e/i40e_fdir.c
index 5ea97e5..7566017 100644
--- a/drivers/net/i40e/i40e_fdir.c
+++ b/drivers/net/i40e/i40e_fdir.c
@@ -688,11 +688,13 @@ i40e_fdir_configure(struct rte_eth_dev *dev)

 static inline void
 i40e_fdir_fill_eth_ip_head(const struct rte_eth_fdir_input *fdir_input,
-  unsigned char *raw_pkt)
+  unsigned char *pkt, bool need_mac)
 {
-   struct ether_hdr *ether = (struct ether_hdr *)raw_pkt;
-   struct ipv4_hdr *ip;
-   struct ipv6_hdr *ip6;
+   struct ether_hdr *ether = (struct ether_hdr *)pkt;
+   struct ipv4_hdr *ip =
+   (struct ipv4_hdr *)(pkt + sizeof(struct ether_hdr));
+   struct ipv6_hdr *ip6 =
+   (struct ipv6_hdr *)(pkt + sizeof(struct ether_hdr));
static const uint8_t next_proto[] = {
[RTE_ETH_FLOW_FRAG_IPV4] = IPPROTO_IP,
[RTE_ETH_FLOW_NONFRAG_IPV4_TCP] = IPPROTO_TCP,
@@ -708,16 +710,18 @@ i40e_fdir_fill_eth_ip_head(const struct rte_eth_fdir_input *fdir_input,

switch (fdir_input->flow_type) {
case RTE_ETH_FLOW_L2_PAYLOAD:
-   ether->ether_type = fdir_input->flow.l2_flow.ether_type;
+   if (need_mac)
+   ether->ether_type = fdir_input->flow.l2_flow.ether_type;
break;
case RTE_ETH_FLOW_NONFRAG_

[dpdk-dev] [PATCH 07/12] librte_ether: extend rte_eth_fdir_flow to support tunnel format

2016-01-26 Thread Jingjing Wu

This patch changes rte_eth_fdir_flow from a union to a struct to
support more packet formats, for example VxLAN and GRE tunnel
packets with an IP inner frame.

This patch also adds the new RTE_FDIR_TUNNEL_TYPE_GRE enum value.

Signed-off-by: Jingjing Wu 
---
 doc/guides/rel_notes/release_2_3.rst |  4 
 lib/librte_ether/rte_eth_ctrl.h  | 27 +++
 2 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/doc/guides/rel_notes/release_2_3.rst b/doc/guides/rel_notes/release_2_3.rst
index 99de186..2216fee 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -39,6 +39,10 @@ API Changes
 ABI Changes
 ---

+* The ethdev flow director structure ``rte_eth_fdir_flow`` was changed.
+  New fields were added to extend flow director's input set, and the
+  layout was reorganized to support multiple input formats.
+

 Shared Library Versions
 ---
diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index 248f719..eb4c13d 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -495,6 +495,7 @@ enum rte_eth_fdir_tunnel_type {
RTE_FDIR_TUNNEL_TYPE_UNKNOWN = 0,
RTE_FDIR_TUNNEL_TYPE_NVGRE,
RTE_FDIR_TUNNEL_TYPE_VXLAN,
+   RTE_FDIR_TUNNEL_TYPE_GRE,
 };

 /**
@@ -508,18 +509,20 @@ struct rte_eth_tunnel_flow {
 };

 /**
- * An union contains the inputs for all types of flow
+ * A struct contains the inputs for all types of flow
  */
-union rte_eth_fdir_flow {
-   struct rte_eth_l2_flow l2_flow;
-   struct rte_eth_udpv4_flow  udp4_flow;
-   struct rte_eth_tcpv4_flow  tcp4_flow;
-   struct rte_eth_sctpv4_flow sctp4_flow;
-   struct rte_eth_ipv4_flow   ip4_flow;
-   struct rte_eth_udpv6_flow  udp6_flow;
-   struct rte_eth_tcpv6_flow  tcp6_flow;
-   struct rte_eth_sctpv6_flow sctp6_flow;
-   struct rte_eth_ipv6_flow   ipv6_flow;
+struct rte_eth_fdir_flow {
+   union {
+   struct rte_eth_l2_flow l2_flow;
+   struct rte_eth_udpv4_flow  udp4_flow;
+   struct rte_eth_tcpv4_flow  tcp4_flow;
+   struct rte_eth_sctpv4_flow sctp4_flow;
+   struct rte_eth_ipv4_flow   ip4_flow;
+   struct rte_eth_udpv6_flow  udp6_flow;
+   struct rte_eth_tcpv6_flow  tcp6_flow;
+   struct rte_eth_sctpv6_flow sctp6_flow;
+   struct rte_eth_ipv6_flow   ipv6_flow;
+   };
struct rte_eth_mac_vlan_flow mac_vlan_flow;
struct rte_eth_tunnel_flow   tunnel_flow;
 };
@@ -540,7 +543,7 @@ struct rte_eth_fdir_flow_ext {
  */
 struct rte_eth_fdir_input {
uint16_t flow_type;
-   union rte_eth_fdir_flow flow;
+   struct rte_eth_fdir_flow flow;
/**< Flow fields to match, dependent on flow_type */
struct rte_eth_fdir_flow_ext flow_ext;
/**< Additional fields to match */
-- 
2.4.0
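
As a rough sketch of what the union-to-struct change allows, the example
below keeps an inner flow inside an anonymous union while the tunnel fields
sit next to it, so both can be set for the same filter. It is not part of
the patch. All type and function names ending in _sketch are hypothetical
local mirrors, reduced to two inner flow types, and anonymous unions assume
a C11 compiler.

```c
#include <stdint.h>
#include <string.h>

/* Local mirror of enum rte_eth_fdir_tunnel_type after this patch. */
enum tunnel_type_sketch {
	TUNNEL_SKETCH_UNKNOWN = 0,
	TUNNEL_SKETCH_NVGRE,
	TUNNEL_SKETCH_VXLAN,
	TUNNEL_SKETCH_GRE,
};

/* Reduced mirror of struct rte_eth_tunnel_flow (type and ID only). */
struct tunnel_flow_sketch {
	enum tunnel_type_sketch tunnel_type;
	uint32_t tunnel_id;
};

/* Reduced mirror of the new struct rte_eth_fdir_flow layout. */
struct fdir_flow_sketch {
	union {  /* exactly one inner flow description is valid at a time */
		struct { uint16_t ether_type; } l2_flow;
		struct { uint32_t src_ip; uint32_t dst_ip; } ip4_flow;
	};
	struct tunnel_flow_sketch tunnel_flow;  /* no longer aliases the union */
};

/* Hypothetical helper: describe a GRE-tunnelled flow with an IPv4 inner
 * frame. Values are stored as passed in, without byte-order conversion. */
static inline void
fdir_flow_sketch_set_gre_ipv4(struct fdir_flow_sketch *flow,
			      uint32_t src_ip, uint32_t dst_ip,
			      uint32_t gre_key)
{
	memset(flow, 0, sizeof(*flow));
	flow->ip4_flow.src_ip = src_ip;
	flow->ip4_flow.dst_ip = dst_ip;
	flow->tunnel_flow.tunnel_type = TUNNEL_SKETCH_GRE;
	flow->tunnel_flow.tunnel_id = gre_key;
}
```

With the old all-in-one union, writing tunnel_flow would have overwritten
the inner IPv4 fields. With the struct layout both survive.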

[dpdk-dev] [PATCH 06/12] testpmd: extend commands for filter's input set changing

2016-01-26 Thread Jingjing Wu

This patch extends the testpmd commands for changing a filter's input set.
It adds tos, protocol and ttl as filter input fields, and
removes the word selection from flex payloads for flow director.

Signed-off-by: Jingjing Wu 
---
 app/test-pmd/cmdline.c  | 100 ++--
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  42 +++-
 2 files changed, 104 insertions(+), 38 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 73298c9..da1d3f2 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -640,6 +640,7 @@ static void cmd_help_long_parsed(void *parsed_result,
"flow_director_filter (port_id) mode IP (add|del|update)"
" flow (ipv4-other|ipv4-frag|ipv6-other|ipv6-frag)"
" src (src_ip_address) dst (dst_ip_address)"
+   " tos (tos_value) proto (proto_value) ttl (ttl_value)"
" vlan (vlan_value) flexbytes (flexbytes_value)"
" (drop|fwd) pf|vf(vf_id) queue (queue_id)"
" fd_id (fd_id_value)\n"
@@ -649,6 +650,7 @@ static void cmd_help_long_parsed(void *parsed_result,
" flow (ipv4-tcp|ipv4-udp|ipv6-tcp|ipv6-udp)"
" src (src_ip_address) (src_port)"
" dst (dst_ip_address) (dst_port)"
+   " tos (tos_value) ttl (ttl_value)"
" vlan (vlan_value) flexbytes (flexbytes_value)"
" (drop|fwd) pf|vf(vf_id) queue (queue_id)"
" fd_id (fd_id_value)\n"
@@ -658,7 +660,9 @@ static void cmd_help_long_parsed(void *parsed_result,
" flow (ipv4-sctp|ipv6-sctp)"
" src (src_ip_address) (src_port)"
" dst (dst_ip_address) (dst_port)"
-   " tag (verification_tag) vlan (vlan_value)"
+   " tag (verification_tag) "
+   " tos (tos_value) ttl (ttl_value)"
+   " vlan (vlan_value)"
" flexbytes (flexbytes_value) (drop|fwd)"
" pf|vf(vf_id) queue (queue_id) fd_id (fd_id_value)\n"
"Add/Del a SCTP type flow director filter.\n\n"
@@ -738,14 +742,15 @@ static void cmd_help_long_parsed(void *parsed_result,
"fld-8th|none) (select|add)\n"
"Set the input set for hash.\n\n"

-   "set_fdir_input_set (port_id) (ipv4|ipv4-frag|"
-   "ipv4-tcp|ipv4-udp|ipv4-sctp|ipv4-other|ipv6|"
+   "set_fdir_input_set (port_id) "
+   "(ipv4-frag|ipv4-tcp|ipv4-udp|ipv4-sctp|ipv4-other|"
"ipv6-frag|ipv6-tcp|ipv6-udp|ipv6-sctp|ipv6-other|"
-   "l2_payload) (src-ipv4|dst-ipv4|src-ipv6|dst-ipv6|"
-   "udp-src-port|udp-dst-port|tcp-src-port|tcp-dst-port|"
-   "sctp-src-port|sctp-dst-port|sctp-veri-tag|fld-1st|"
-   "fld-2nd|fld-3rd|fld-4th|fld-5th|fld-6th|fld-7th|"
-   "fld-8th|none) (select|add)\n"
+   "l2_payload) (ethertype|src-ipv4|dst-ipv4|src-ipv6|"
+   "dst-ipv6|ipv4-tos|ipv4-proto|ipv4-ttl|ipv6-tc|"
+   "ipv6-next-header|ipv6-hop-limits|udp-src-port|"
+   "udp-dst-port|tcp-src-port|tcp-dst-port|"
+   "sctp-src-port|sctp-dst-port|sctp-veri-tag|none)"
+   " (select|add)\n"
"Set the input set for FDir.\n\n"
);
}
@@ -7983,6 +7988,12 @@ struct cmd_flow_director_result {
uint16_t port_dst;
cmdline_fixed_string_t verify_tag;
uint32_t verify_tag_value;
+   cmdline_ipaddr_t tos;
+   uint8_t tos_value;
+   cmdline_ipaddr_t proto;
+   uint8_t proto_value;
+   cmdline_ipaddr_t ttl;
+   uint8_t ttl_value;
cmdline_fixed_string_t vlan;
uint16_t vlan_value;
cmdline_fixed_string_t flexbytes;
@@ -8162,12 +8173,15 @@ cmd_flow_director_filter_parsed(void *parsed_result,
switch (entry.input.flow_type) {
case RTE_ETH_FLOW_FRAG_IPV4:
case RTE_ETH_FLOW_NONFRAG_IPV4_OTHER:
+   entry.input.flow.ip4_flow.proto = res->proto_value;
case RTE_ETH_FLOW_NONFRAG_IPV4_UDP:
case RTE_ETH_FLOW_NONFRAG_IPV4_TCP:
IPV4_ADDR_TO_UINT(res->ip_dst,
entry.input.flow.ip4_flow.dst_ip);
IPV4_ADDR_TO_UINT(res->ip_src,
entry.input.flow.ip4_flow.src_ip);
+   entry.input.flow.ip4_flow.tos = res->tos_value;
+   entry.input.flow.ip4_flow.ttl = res->ttl_value;
/* need convert to big endian. */
entry.input.flow.udp4_flow.dst_port =

[dpdk-dev] [PATCH 05/12] i40e: extend flow director to filter by more IP Header fields

2016-01-26 Thread Jingjing Wu

This patch extended flow director to select more IP Header fields
as filter input set.

Signed-off-by: Jingjing Wu 
---
 drivers/net/i40e/i40e_ethdev.c | 69 ++
 drivers/net/i40e/i40e_fdir.c   | 26 +++-
 2 files changed, 75 insertions(+), 20 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 7a09fbc..32ffc9f 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -218,6 +218,8 @@
 #define I40E_REG_INSET_L3_IP4_TOS0x0040ULL
 /* IPv4 Protocol */
 #define I40E_REG_INSET_L3_IP4_PROTO  0x0004ULL
+/* IPv4 Time to Live */
+#define I40E_REG_INSET_L3_IP4_TTL0x0004ULL
 /* Source IPv6 address */
 #define I40E_REG_INSET_L3_SRC_IP60x0007F800ULL
 /* Destination IPv6 address */
@@ -226,6 +228,8 @@
 #define I40E_REG_INSET_L3_IP6_TC 0x0040ULL
 /* IPv6 Next Header */
 #define I40E_REG_INSET_L3_IP6_NEXT_HDR   0x0008ULL
+/* IPv6 Hop Limit */
+#define I40E_REG_INSET_L3_IP6_HOP_LIMIT  0x0008ULL
 /* Source L4 port */
 #define I40E_REG_INSET_L4_SRC_PORT   0x0004ULL
 /* Destination L4 port */
@@ -269,10 +273,12 @@
 #define I40E_TRANSLATE_INSET 0
 #define I40E_TRANSLATE_REG   1

-#define I40E_INSET_IPV4_TOS_MASK  0x0009FF00UL
-#define I40E_INSET_IPV4_PROTO_MASK0x000DFF00UL
-#define I40E_INSET_IPV6_TC_MASK   0x0009F00FUL
-#define I40E_INSET_IPV6_NEXT_HDR_MASK 0x000C00FFUL
+#define I40E_INSET_IPV4_TOS_MASK0x0009FF00UL
+#define I40E_INSET_IPv4_TTL_MASK0x000D00FFUL
+#define I40E_INSET_IPV4_PROTO_MASK  0x000DFF00UL
+#define I40E_INSET_IPV6_TC_MASK 0x0009F00FUL
+#define I40E_INSET_IPV6_HOP_LIMIT_MASK  0x000CFF00UL
+#define I40E_INSET_IPV6_NEXT_HDR_MASK   0x000C00FFUL

 static int eth_i40e_dev_init(struct rte_eth_dev *eth_dev);
 static int eth_i40e_dev_uninit(struct rte_eth_dev *eth_dev);
@@ -6549,30 +6555,47 @@ i40e_get_valid_input_set(enum i40e_filter_pctype pctype,
 */
static const uint64_t valid_fdir_inset_table[] = {
[I40E_FILTER_PCTYPE_FRAG_IPV4] =
-   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST,
+   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
+   I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_PROTO |
+   I40E_INSET_IPV4_TTL,
[I40E_FILTER_PCTYPE_NONF_IPV4_UDP] =
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
+   I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_TTL |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV4_TCP] =
-   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST,
+   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
+   I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_TTL |
+   I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV4_SCTP] =
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
+   I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_TTL |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
I40E_INSET_SCTP_VT,
[I40E_FILTER_PCTYPE_NONF_IPV4_OTHER] =
-   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST,
+   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
+   I40E_INSET_IPV4_TOS | I40E_INSET_IPV4_PROTO |
+   I40E_INSET_IPV4_TTL,
[I40E_FILTER_PCTYPE_FRAG_IPV6] =
-   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST,
+   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
+   I40E_INSET_IPV6_TC | I40E_INSET_IPV6_NEXT_HDR |
+   I40E_INSET_IPV6_HOP_LIMIT,
[I40E_FILTER_PCTYPE_NONF_IPV6_UDP] =
-   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST,
+   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
+   I40E_INSET_IPV6_TC | I40E_INSET_IPV6_HOP_LIMIT |
+   I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV6_TCP] =
-   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST,
+   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
+   I40E_INSET_IPV6_TC | I40E_INSET_IPV6_HOP_LIMIT |
+   I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV6_SCTP] =
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
+   I40E_INSET_IPV6_TC | I40E_INSET_IPV6_HOP_LIMIT |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
I40E_INSET_SCTP_VT,
[I40E_FILTER_PCTYPE_NONF_IPV6_OTHER] =
-   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST,
+   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
+   I40E_INSET_IPV6_TC | I40E_INSET_IPV6_NEXT_HDR |
+   I40E_INSET_IPV6_HOP_LIMIT,
[I40E_FILTER_PCTYPE_L2_PAYLO

[dpdk-dev] [PATCH 04/12] i40e: restore default setting on input set of filters

2016-01-26 Thread Jingjing Wu

This patch adds a new function that sets the filter input sets to
their default values during initialization.

Signed-off-by: Jingjing Wu 
---
 drivers/net/i40e/i40e_ethdev.c | 56 ++
 1 file changed, 56 insertions(+)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index f3c2e94..7a09fbc 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -374,6 +374,7 @@ static int i40e_dev_udp_tunnel_add(struct rte_eth_dev *dev,
struct rte_eth_udp_tunnel *udp_tunnel);
 static int i40e_dev_udp_tunnel_del(struct rte_eth_dev *dev,
struct rte_eth_udp_tunnel *udp_tunnel);
+static void i40e_filter_input_set_init(struct i40e_pf *pf);
 static int i40e_ethertype_filter_set(struct i40e_pf *pf,
struct rte_eth_ethertype_filter *filter,
bool add);
@@ -788,6 +789,8 @@ eth_i40e_dev_init(struct rte_eth_dev *dev)
 * It should be removed once issues are fixed in NVM.
 */
i40e_flex_payload_reg_init(hw);
+   /* Initialize the input set for filters (hash and fd) to default value */
+   i40e_filter_input_set_init(pf);

/* Initialize the parameters for adminq */
i40e_init_adminq_parameter(hw);
@@ -6844,6 +6847,59 @@ i40e_check_write_reg(struct i40e_hw *hw, uint32_t addr, uint32_t val)
(uint32_t)I40E_READ_REG(hw, addr));
 }

+static void
+i40e_filter_input_set_init(struct i40e_pf *pf)
+{
+   struct i40e_hw *hw = I40E_PF_TO_HW(pf);
+   enum i40e_filter_pctype pctype;
+   uint64_t input_set, inset_reg;
+   uint32_t mask_reg[I40E_INSET_MASK_NUM_REG] = {0};
+   int num, i;
+
+   for (pctype = I40E_FILTER_PCTYPE_NONF_IPV4_UDP;
+pctype <= I40E_FILTER_PCTYPE_L2_PAYLOAD; pctype++) {
+   if (!I40E_VALID_PCTYPE(pctype))
+   continue;
+   input_set = i40e_get_default_input_set(pctype);
+
+   num = i40e_generate_inset_mask_reg(input_set, mask_reg,
+  I40E_INSET_MASK_NUM_REG);
+   if (num < 0)
+   return;
+   inset_reg = i40e_translate_input_set_reg(input_set);
+
+   i40e_check_write_reg(hw, I40E_PRTQF_FD_INSET(pctype, 0),
+ (uint32_t)(inset_reg & UINT32_MAX));
+   i40e_check_write_reg(hw, I40E_PRTQF_FD_INSET(pctype, 1),
+(uint32_t)((inset_reg >>
+I40E_32_BIT_WIDTH) & UINT32_MAX));
+   i40e_check_write_reg(hw, I40E_GLQF_HASH_INSET(0, pctype),
+ (uint32_t)(inset_reg & UINT32_MAX));
+   i40e_check_write_reg(hw, I40E_GLQF_HASH_INSET(1, pctype),
+(uint32_t)((inset_reg >>
+I40E_32_BIT_WIDTH) & UINT32_MAX));
+
+   for (i = 0; i < num; i++) {
+   i40e_check_write_reg(hw, I40E_GLQF_FD_MSK(i, pctype),
+mask_reg[i]);
+   i40e_check_write_reg(hw, I40E_GLQF_HASH_MSK(i, pctype),
+mask_reg[i]);
+   }
+   /*clear unused mask registers of the pctype */
+   for (i = num; i < I40E_INSET_MASK_NUM_REG; i++) {
+   i40e_check_write_reg(hw, I40E_GLQF_FD_MSK(i, pctype),
+0);
+   i40e_check_write_reg(hw, I40E_GLQF_HASH_MSK(i, pctype),
+0);
+   }
+   I40E_WRITE_FLUSH(hw);
+
+   /* store the default input set */
+   pf->hash_input_set[pctype] = input_set;
+   pf->fdir.input_set[pctype] = input_set;
+   }
+}
+
 int
 i40e_hash_filter_inset_select(struct i40e_hw *hw,
 struct rte_eth_input_set_conf *conf)
-- 
2.4.0
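
The register writes in i40e_filter_input_set_init() split one 64-bit input set value across two 32-bit registers. Below is a minimal standalone sketch of that split and its inverse. The names `fake_inset_regs`, `write_inset64` and `read_inset64` are hypothetical stand-ins for illustration only; the driver itself writes hardware registers through i40e_check_write_reg().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BIT_WIDTH_32 32 /* plays the role of I40E_32_BIT_WIDTH */

/* Hypothetical stand-in for a pair of 32-bit hardware registers. */
struct fake_inset_regs {
	uint32_t lo; /* index 0: bits 31..0 of the input set */
	uint32_t hi; /* index 1: bits 63..32 of the input set */
};

/* Split a 64-bit value into its low and high 32-bit halves. */
static void
write_inset64(struct fake_inset_regs *regs, uint64_t inset_reg)
{
	assert(regs != NULL);
	regs->lo = (uint32_t)(inset_reg & UINT32_MAX);
	regs->hi = (uint32_t)((inset_reg >> BIT_WIDTH_32) & UINT32_MAX);
}

/* Reassemble the 64-bit value: high half first, then OR in the low half. */
static uint64_t
read_inset64(const struct fake_inset_regs *regs)
{
	uint64_t reg;

	assert(regs != NULL);
	reg = regs->hi;
	reg <<= BIT_WIDTH_32;
	reg |= regs->lo;
	return reg;
}
```

The read side follows the same order as the i40e_get_reg_inset() helper that patch 02/12 removes: read the high word, shift, then OR in the low word.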

[dpdk-dev] [PATCH 03/12] i40e: remove flex payload from INPUT_SET_SELECT operation

2016-01-26 Thread Jingjing Wu

This patch removes flex payload from the valid fdir input set
values, because all flex payload configuration can be set
in struct rte_fdir_conf during the device configure phase.
That configuration is also more flexible: it covers flex payload
selection, input set selection by word, and mask setting in bits.

Signed-off-by: Jingjing Wu 
---
 drivers/net/i40e/i40e_ethdev.c | 59 +++---
 1 file changed, 26 insertions(+), 33 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 004e206..f3c2e94 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -262,7 +262,8 @@
 #define I40E_REG_INSET_FLEX_PAYLOAD_WORD70x0080ULL
 /* 8th word of flex payload */
 #define I40E_REG_INSET_FLEX_PAYLOAD_WORD80x0040ULL
-
+/* all 8 words flex payload */
+#define I40E_REG_INSET_FLEX_PAYLOAD_WORDS0x3FC0ULL
 #define I40E_REG_INSET_MASK_DEFAULT  0xULL

 #define I40E_TRANSLATE_INSET 0
@@ -6545,43 +6546,32 @@ i40e_get_valid_input_set(enum i40e_filter_pctype pctype,
 */
static const uint64_t valid_fdir_inset_table[] = {
[I40E_FILTER_PCTYPE_FRAG_IPV4] =
-   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
-   I40E_INSET_FLEX_PAYLOAD,
+   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST,
[I40E_FILTER_PCTYPE_NONF_IPV4_UDP] =
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
-   I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
-   I40E_INSET_FLEX_PAYLOAD,
+   I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT,
[I40E_FILTER_PCTYPE_NONF_IPV4_TCP] =
-   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
-   I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
-   I40E_INSET_FLEX_PAYLOAD,
+   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST,
[I40E_FILTER_PCTYPE_NONF_IPV4_SCTP] =
I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
-   I40E_INSET_SCTP_VT | I40E_INSET_FLEX_PAYLOAD,
+   I40E_INSET_SCTP_VT,
[I40E_FILTER_PCTYPE_NONF_IPV4_OTHER] =
-   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST |
-   I40E_INSET_FLEX_PAYLOAD,
+   I40E_INSET_IPV4_SRC | I40E_INSET_IPV4_DST,
[I40E_FILTER_PCTYPE_FRAG_IPV6] =
-   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
-   I40E_INSET_FLEX_PAYLOAD,
+   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST,
[I40E_FILTER_PCTYPE_NONF_IPV6_UDP] =
-   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
-   I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
-   I40E_INSET_FLEX_PAYLOAD,
+   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST,
[I40E_FILTER_PCTYPE_NONF_IPV6_TCP] =
-   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
-   I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
-   I40E_INSET_FLEX_PAYLOAD,
+   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST,
[I40E_FILTER_PCTYPE_NONF_IPV6_SCTP] =
I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
I40E_INSET_SRC_PORT | I40E_INSET_DST_PORT |
-   I40E_INSET_SCTP_VT | I40E_INSET_FLEX_PAYLOAD,
+   I40E_INSET_SCTP_VT,
[I40E_FILTER_PCTYPE_NONF_IPV6_OTHER] =
-   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST |
-   I40E_INSET_FLEX_PAYLOAD,
+   I40E_INSET_IPV6_SRC | I40E_INSET_IPV6_DST,
[I40E_FILTER_PCTYPE_L2_PAYLOAD] =
-   I40E_INSET_LAST_ETHER_TYPE | I40E_INSET_FLEX_PAYLOAD,
+   I40E_INSET_LAST_ETHER_TYPE,
};

if (pctype > I40E_FILTER_PCTYPE_L2_PAYLOAD)
@@ -6809,7 +6799,7 @@ i40e_translate_input_set_reg(uint64_t input)
return val;
 }

-static uint8_t
+static int
 i40e_generate_inset_mask_reg(uint64_t inset, uint32_t *mask, uint8_t nb_elem)
 {
uint8_t i, idx = 0;
@@ -6827,16 +6817,13 @@ i40e_generate_inset_mask_reg(uint64_t inset, uint32_t *mask, uint8_t nb_elem)
if (!inset || !mask || !nb_elem)
return 0;

-   if (!inset && nb_elem >= I40E_INSET_MASK_NUM_REG) {
-   for (i = 0; i < I40E_INSET_MASK_NUM_REG; i++)
-   mask[i] = 0;
-   return I40E_INSET_MASK_NUM_REG;
-   }

for (i = 0, idx = 0; i < RTE_DIM(inset_mask_map); i++) {
-   if (idx >= nb_elem)
-   break;
-   if (inset & inset_mask_map[i].inset) {
+   if ((inset & inset_mask_map[i].inset) == inset_mask_map[i].inset) {
+   if (idx >= nb_elem) {
+   PMD_DRV_LOG(ERR, "exceed maximal number of bitmasks");
+   return -EINVA

[dpdk-dev] [PATCH 02/12] i40e: split function for input set change of hash and fdir

2016-01-26 Thread Jingjing Wu

This patch splits the input set change function into separate hash
and fdir functions, to avoid repeated checks for the different cases.

Signed-off-by: Jingjing Wu 
---
 drivers/net/i40e/i40e_ethdev.c | 233 +
 drivers/net/i40e/i40e_ethdev.h |  11 +-
 drivers/net/i40e/i40e_fdir.c   |   5 +-
 3 files changed, 107 insertions(+), 142 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index bf6220d..004e206 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -6845,25 +6845,6 @@ i40e_generate_inset_mask_reg(uint64_t inset, uint32_t *mask, uint8_t nb_elem)
return idx;
 }

-static uint64_t
-i40e_get_reg_inset(struct i40e_hw *hw, enum rte_filter_type filter,
-   enum i40e_filter_pctype pctype)
-{
-   uint64_t reg = 0;
-
-   if (filter == RTE_ETH_FILTER_HASH) {
-   reg = I40E_READ_REG(hw, I40E_GLQF_HASH_INSET(1, pctype));
-   reg <<= I40E_32_BIT_WIDTH;
-   reg |= I40E_READ_REG(hw, I40E_GLQF_HASH_INSET(0, pctype));
-   } else if (filter == RTE_ETH_FILTER_FDIR) {
-   reg = I40E_READ_REG(hw, I40E_PRTQF_FD_INSET(pctype, 1));
-   reg <<= I40E_32_BIT_WIDTH;
-   reg |= I40E_READ_REG(hw, I40E_PRTQF_FD_INSET(pctype, 0));
-   }
-
-   return reg;
-}
-
 static void
 i40e_check_write_reg(struct i40e_hw *hw, uint32_t addr, uint32_t val)
 {
@@ -6876,103 +6857,96 @@ i40e_check_write_reg(struct i40e_hw *hw, uint32_t addr, uint32_t val)
(uint32_t)I40E_READ_REG(hw, addr));
 }

-static int
-i40e_set_hash_inset_mask(struct i40e_hw *hw,
-enum i40e_filter_pctype pctype,
-enum rte_filter_input_set_op op,
-uint32_t *mask_reg,
-uint8_t num)
+int
+i40e_hash_filter_inset_select(struct i40e_hw *hw,
+struct rte_eth_input_set_conf *conf)
 {
-   uint32_t reg;
-   uint8_t i;
+   struct i40e_pf *pf = &((struct i40e_adapter *)hw->back)->pf;
+   enum i40e_filter_pctype pctype;
+   uint64_t input_set, inset_reg = 0;
+   uint32_t mask_reg[I40E_INSET_MASK_NUM_REG] = {0};
+   int ret, i, num;

-   if (!mask_reg || num > RTE_ETH_INPUT_SET_SELECT)
+   if (!hw || !conf) {
+   PMD_DRV_LOG(ERR, "Invalid pointer");
+   return -EFAULT;
+   }
+   if (conf->op != RTE_ETH_INPUT_SET_SELECT &&
+   conf->op != RTE_ETH_INPUT_SET_ADD) {
+   PMD_DRV_LOG(ERR, "Unsupported input set operation");
return -EINVAL;
-
-   if (op == RTE_ETH_INPUT_SET_SELECT) {
-   for (i = 0; i < I40E_INSET_MASK_NUM_REG; i++) {
-   i40e_check_write_reg(hw, I40E_GLQF_HASH_MSK(i, pctype),
-0);
-   if (i >= num)
-   continue;
-   i40e_check_write_reg(hw, I40E_GLQF_HASH_MSK(i, pctype),
-mask_reg[i]);
-   }
-   } else if (op == RTE_ETH_INPUT_SET_ADD) {
-   uint8_t j, count = 0;
-
-   for (i = 0; i < I40E_INSET_MASK_NUM_REG; i++) {
-   reg = I40E_READ_REG(hw, I40E_GLQF_HASH_MSK(i, pctype));
-   if (reg & I40E_GLQF_HASH_MSK_FIELD)
-   count++;
-   }
-   if (count + num > I40E_INSET_MASK_NUM_REG)
-   return -EINVAL;
-
-   for (i = count, j = 0; i < I40E_INSET_MASK_NUM_REG; i++, j++)
-   i40e_check_write_reg(hw, I40E_GLQF_HASH_MSK(i, pctype),
-mask_reg[j]);
}

-   return 0;
-}
-
-static int
-i40e_set_fd_inset_mask(struct i40e_hw *hw,
-  enum i40e_filter_pctype pctype,
-  enum rte_filter_input_set_op op,
-  uint32_t *mask_reg,
-  uint8_t num)
-{
-   uint32_t reg;
-   uint8_t i;
-
-   if (!mask_reg || num > RTE_ETH_INPUT_SET_SELECT)
+   pctype = i40e_flowtype_to_pctype(conf->flow_type);
+   if (pctype == 0 || pctype > I40E_FILTER_PCTYPE_L2_PAYLOAD) {
+   PMD_DRV_LOG(ERR, "Not supported flow type (%u)",
+   conf->flow_type);
return -EINVAL;
+   }

-   if (op == RTE_ETH_INPUT_SET_SELECT) {
-   for (i = 0; i < I40E_INSET_MASK_NUM_REG; i++) {
-   i40e_check_write_reg(hw, I40E_GLQF_FD_MSK(i, pctype),
-0);
-   if (i >= num)
-   continue;
-   i40e_check_write_reg(hw, I40E_GLQF_FD_MSK(i, pctype),
-mask_reg[i]);
-   }
-   } else if (op == RTE_ETH_INPUT_SET_ADD) {
-   ui

[dpdk-dev] [PATCH 01/12] ethdev: extend flow director to support input set selection

2016-01-26 Thread Jingjing Wu

This patch adds the RTE_ETH_INPUT_SET_L3_IP4_TTL and
RTE_ETH_INPUT_SET_L3_IP6_HOP_LIMITS input field types and extends
struct rte_eth_ipv4_flow and rte_eth_ipv6_flow to support filtering
by tos, protocol and ttl.

Signed-off-by: Jingjing Wu 
---
 lib/librte_ether/rte_eth_ctrl.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index ce224ad..248f719 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -337,9 +337,11 @@ enum rte_eth_input_set_field {
RTE_ETH_INPUT_SET_L3_SRC_IP6,
RTE_ETH_INPUT_SET_L3_DST_IP6,
RTE_ETH_INPUT_SET_L3_IP4_TOS,
+   RTE_ETH_INPUT_SET_L3_IP4_TTL,
RTE_ETH_INPUT_SET_L3_IP4_PROTO,
RTE_ETH_INPUT_SET_L3_IP6_TC,
RTE_ETH_INPUT_SET_L3_IP6_NEXT_HEADER,
+   RTE_ETH_INPUT_SET_L3_IP6_HOP_LIMITS,

/* L4 */
RTE_ETH_INPUT_SET_L4_UDP_SRC_PORT = 257,
@@ -407,6 +409,9 @@ struct rte_eth_l2_flow {
 struct rte_eth_ipv4_flow {
uint32_t src_ip;  /**< IPv4 source address to match. */
uint32_t dst_ip;  /**< IPv4 destination address to match. */
+   uint8_t  tos; /**< Type of service to match. */
+   uint8_t  ttl; /**< Time to live */
+   uint8_t  proto;
 };

 /**
@@ -443,6 +448,9 @@ struct rte_eth_sctpv4_flow {
 struct rte_eth_ipv6_flow {
uint32_t src_ip[4];  /**< IPv6 source address to match. */
uint32_t dst_ip[4];  /**< IPv6 destination address to match. */
+   uint8_t  tc; /**< Traffic class to match. */
+   uint8_t  proto;  /**< Protocol, next header. */
+   uint8_t  hop_limits;
 };

 /**
-- 
2.4.0

[dpdk-dev] [PATCH 00/12] extend flow director's fields in i40e driver

2016-01-26 Thread Jingjing Wu

This patch set extends flow director in the i40e driver to support
filtering by the following additional fields:
 - TOS, Protocol and TTL in the IP header
 - Tunnel ID for NVGRE/GRE/VxLAN packets
 - single VLAN or inner VLAN

Jingjing Wu (12):
  ethdev: extend flow director to support input set selection
  i40e: split function for input set change of hash and fdir
  i40e: remove flex payload from INPUT_SET_SELECT operation
  i40e: restore default setting on input set of filters
  i40e: extend flow director to filter by more IP Header fields
  testpmd: extend commands for filter's input set changing
  librte_ether: extend rte_eth_fdir_flow to support tunnel format
  i40e: extend flow director to filter by tunnel ID
  testpmd: extend commands for fdir's tunnel id input set
  i40e: fix VLAN bitmasks for hash/fdir input sets for tunnels
  i40e: extend flow director to filter by vlan id
  testpmd: extend commands for fdir's vlan input set

 app/test-pmd/cmdline.c  | 121 +++--
 doc/guides/rel_notes/release_2_3.rst|   5 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  56 ++--
 drivers/net/i40e/i40e_ethdev.c  | 401 +---
 drivers/net/i40e/i40e_ethdev.h  |  11 +-
 drivers/net/i40e/i40e_fdir.c| 163 ---
 lib/librte_ether/rte_eth_ctrl.h |  35 ++-
 7 files changed, 529 insertions(+), 263 deletions(-)

-- 
2.4.0

[dpdk-dev] [PATCH 0/4] virtio support for container

2016-01-26 Thread Tan, Jianfeng

Hi Michael,

On 1/26/2016 2:02 PM, Qiu, Michael wrote:
> On 1/11/2016 2:43 AM, Tan, Jianfeng wrote:
...
>>
>> f. Used with vhost-net
>> $: modprobe vhost
>> $: modprobe vhost-net
>> $: docker run -i -t --privileged \
>>  -v /dev/vhost-net:/dev/vhost-net \
>>  -v /dev/net/tun:/dev/net/tun \
>>  -v /dev/hugepages:/dev/hugepages \
>>  dpdk-app-l2fwd l2fwd -c 0x4 -n 4 -m 1024 --no-pci \
>>  --vdev=eth_cvio0,path=/dev/vhost-net -- -p 0x1
> We'd better add an ifname, like
> --vdev=eth_cvio0,path=/dev/vhost-net,ifname=tap0, so that the user could add
> the tap to the bridge first.

That's an awesome suggestion.

Thanks,
Jianfeng

>
> Thanks,
> Michael

[dpdk-dev] rte_mbuf size for jumbo frame

2016-01-26 Thread Polehn, Mike A

Jumbo frames are generally handled by linked lists (but called something else)
of mbufs.
Enabling jumbo frames for the device driver should enable the portion of
the driver that handles the linked lists.

Don't make the mbufs huge.

Mike 
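
As a rough illustration of the chained layout described above, here is a minimal standalone sketch. `struct seg` is a hypothetical, simplified stand-in for a chained buffer (the real struct rte_mbuf has `next` and `data_len` fields, among many others), and `segs_needed` ignores any per-buffer headroom, which is an assumption made for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one buffer in a chain. */
struct seg {
	struct seg *next;    /* next segment, NULL on the last one */
	const uint8_t *data; /* start of this segment's payload */
	uint16_t data_len;   /* bytes of payload in this segment */
};

/* Total payload length of a chain: the sum of all segment lengths. */
static uint32_t
chain_len(const struct seg *s)
{
	uint32_t total = 0;

	for (; s != NULL; s = s->next)
		total += s->data_len;
	return total;
}

/* Number of seg_size-byte segments needed to hold one frame (rounds up). */
static uint32_t
segs_needed(uint32_t frame_len, uint32_t seg_size)
{
	assert(seg_size > 0);
	return (frame_len + seg_size - 1) / seg_size;
}
```

Under these assumptions, a 10400-byte frame stored in 2048-byte segments occupies six chained buffers rather than one oversized buffer.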

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Masaru OKI
Sent: Monday, January 25, 2016 2:41 PM
To: Saurabh Mishra; users at dpdk.org; dev at dpdk.org
Subject: Re: [dpdk-dev] rte_mbuf size for jumbo frame

Hi,

1. Take care of the unit size of the mempool for mbufs.
2. Call rte_eth_dev_set_mtu() for each interface.
Note that some PMDs do not support changing the MTU.

On 2016/01/26 6:02, Saurabh Mishra wrote:
> Hi,
>
> We wanted to use 10400 bytes size of each rte_mbuf to enable Jumbo frames.
> Do you guys see any problem with that? Would all the drivers like 
> ixgbe, i40e, vmxnet3, virtio and bnx2x work with larger rte_mbuf size?
>
> We would want to avoid dealing with chained mbufs.
>
> /Saurabh

[dpdk-dev] [PATCH v6 08/11] eal: pci: introduce RTE_KDRV_VFIO_NOIOMMUi driver mode

2016-01-26 Thread Thomas Monjalon

2016-01-26 15:56, Santosh Shukla:
> On Mon, Jan 25, 2016 at 8:59 PM, Thomas Monjalon wrote:
> > 2016-01-21 22:47, Santosh Shukla:
> >> On Thu, Jan 21, 2016 at 8:16 PM, Thomas Monjalon wrote:
> >> > 2016-01-21 17:34, Santosh Shukla:
> >> >> On Thu, Jan 21, 2016 at 4:58 PM, Thomas Monjalon wrote:
> >> >> > 2016-01-21 16:43, Santosh Shukla:
> >> >> >> David Marchand wrote:
> >> >> >> > This is a mode (specific to vfio), not a new kernel driver.
> >> >> >> >
> >> >> >> Yes, Specific to VFIO and this is why noiommu appended after vfio 
> >> >> >> i.e..
> >> >> >> __VFIO and __VFIO_NOIOMMU.
> >> >> >
> >> >> > Woaaa! Your logic is really disappointing :)
> >> >> > Specific to VFIO => append _NOIOMMU
> >> >> > If it's for VFIO, it should be called VFIO (that's my logic).
> >> >> >
> >> >> I am confused by reading your comment. vfio works for default iommu
> >> >> and now with noiommu. drv->kdrv need to know driver mode for vfio
> >> >> case. So that user can simply read drv->kdrv value in their driver and
> >> >> accordingly use vfio rd/wr api for example {pread/pwrite}. This is how
> >> >> rte_eal_pci_vfio_read/write_bar() api implemented.
> >> >
> >> > Sorry I don't understand. Why should the EAL read/write functions be
> >> > different depending on the VFIO mode?
> >>
> >> no, EAL rd/wr functions are not different for vfio or vfio modes {same
> >> for iommu or noiommu}. Pl. see pci_eal_read/write_bar() api. Those
> >> apis currently used for VFIO, Irrespective of vfio mode. If required,
> >> we can add UIO bar_rd/wr api too. pci_eal_rd/wr_bar() are abstract
> >> apis. Underneath implementation can be vfio or uio type.
> >
> > It means you agree the suffix _NOIOMMU is not needed?
> > It seems we go nowhere in this discussion. You said
> > "drv->kdrv need to know driver mode for vfio"
> 
> In my observation, currently virtio work for vfio-noiommu, that's why
> said drv->kdrv need to know vfio mode.

It is your observation. It may change in near future.

> > and after
> > "Those apis currently used for VFIO, Irrespective of vfio mode"
> > That's why I assume your first assumption was wrong.
> >
> 
> Newly introduced dpdk global api pci_eal_rd/wr_bar(),  can be used for
> vfio and uio both; can be used for vfio w/IOMMU and vfio w/o IOMMU
> both.
> 
> >> >> > Why do we care to parse noiommu only?
> >> >>
> >> >> Because pmd drivers example virtio can work with vfio only in
> >> >> _noiommu_ mode. In particular, virtio spec 0.95 / legacy virtio.
> >> >
> >> > Please could you explain the limitation (except IOMMU availability)?
> >>
> >> Ok.
> >>
> >> I believe - we both agree that noiommu mode is a need for pmd drivers
> >> like virtio, right? if so then other reason is implementation driven
> >
> > No, noiommu is a need for some environment having no IOMMU.
> > But in my understanding, virtio could run with a nested IOMMU.
> 
> Interesting, like to understand nested one, I did tried in past by
> passing "iommu=pt intel_iommu=on kvm-intel.nested=1" in cmdline for
> x86 (for guest/host both), but virtio pci device binding to vfio-pci
> driver fails. Tried on 4.2 kernel (qemu version 2.5), is it working
> for >4.2 kernel/ qemu-version?

I haven't tried.

> >> i.e..
> >>
> >> Pl. look at virtio_pci.c in this patch.. VIRTIO_RD/WR/_1/2/4()
> >> implementation. They are in-use and applicable to  virtio spec 0.95,
> >> so far support uio/ioport-way rd/wr. Now to support vfio-way rd/wr -
> >> need to check drv->kdrv value, that value should be of vfio_noiommu
> >> types __not__  generic _vfio types.
> >
> > I still don't understand why it would not work with VFIO w/IOMMU.
> 
> with vfio+iommu; binding virtio pci device to vfio-pci driver fail;
> giving below error:
> [   53.053464] VFIO - User Level meta-driver version: 0.3
> [   73.077805] vfio-pci: probe of :00:03.0 failed with error -22
> [   73.077852] vfio-pci: probe of :00:03.0 failed with error -22
> 
> vfio_pci_probe() --> vfio_iommu_group_get() --> iommu_group_get()
> fails: iommu doesn't have group for virtio pci device.

Yes it fails when binding.
So the later check in the virtio PMD is useless.

> In case of noiommu, it prepares the group / add device to iommu group,
> so it passes.
> 
> Jason in other thread mentioned that he is working on virtio+iommu
> approach [1], Patches are not merged and I am yet to evaluate his
> patches for virtio pmd driver for iommu(+vfio). so wondering how
> virtio pci device could work unless jason patches used?
> 
> [1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg337079.html

I haven't tried nested IOMMU.
All this thread was about the kernel module in use, i.e. VFIO.
We are saying that virtio could work in both VFIO modes.
Furthermore restricting virtio to no-iommu mode doesn't bring
any improvement.
That's why I suggest to keep the initial semantic of kdrv and
not pollute it with VFIO modes.

> >> >> So at
> >> >> the initialization (example .. virtio-net) of such pmd driver, pmd
> >> >

[dpdk-dev] DPDK mbuf pool in SR-IOV env and one RX/TX queue

2016-01-26 Thread Bruce Richardson

On Mon, Jan 25, 2016 at 04:15:28PM -0800, Saurabh Mishra wrote:
> Hi Bruce --
> 
> >The sharing of the mbuf pool is not an issue, but sharing of rx/tx queues is.
> >The ethdev queues are not multi-thread safe, so to share a queue between
> >processes or threads, you need to put in locks or other access control
> >mechanisms. [This also implies a performance hit due to the locking]
> >Regards,
> >/Bruce
> 
> Right. So now we have only one process to do rx/tx on queue 0 if we detect
> that max queue support is 1.
> 
> However, we have noticed that if our process, which does rx/tx, is not
> primary, then we can't transmit the packet out with SR-IOV.
> 
> Is there any specific limitation on SR-IOV (the vf driver in dpdk) that
> only primary process should receive and transmit packets?
> 
> In our model, we have an agent process which monitors links and another
> process which does packet processing. If we make our agent process the
> primary, then our secondary process is not able to send packets:
> rte_eth_tx_burst() succeeds but the recipient does not receive the packet.
> 
> Thanks,
> /Saurabh

There should be no restrictions on RX/TX from secondary processes.

/Bruce
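
The locking requirement quoted above (a queue shared between threads needs access control) can be sketched with a small spinlock wrapper. Everything here is hypothetical: `shared_txq` stands in for a queue that is not thread safe, and the lock uses C11 `atomic_flag` rather than any DPDK primitive. Sharing between processes would additionally require the structure to live in shared memory, which this sketch does not show.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define TXQ_DEPTH 8

/* Hypothetical queue that, like an ethdev queue, is not thread safe. */
struct shared_txq {
	atomic_flag lock;         /* spinlock guarding the fields below */
	uint32_t pkts[TXQ_DEPTH]; /* queued packet ids */
	uint16_t count;           /* number of valid entries in pkts[] */
};

static void
txq_init(struct shared_txq *q)
{
	assert(q != NULL);
	atomic_flag_clear(&q->lock);
	q->count = 0;
}

/* Enqueue up to n packets under the lock; returns how many were accepted. */
static uint16_t
txq_burst_locked(struct shared_txq *q, const uint32_t *pkts, uint16_t n)
{
	uint16_t sent = 0;

	assert(q != NULL && pkts != NULL);
	while (atomic_flag_test_and_set_explicit(&q->lock,
						 memory_order_acquire))
		; /* spin until the current holder releases the queue */
	while (sent < n && q->count < TXQ_DEPTH)
		q->pkts[q->count++] = pkts[sent++];
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
	return sent;
}
```

Every caller pays for the acquire and release even when there is no contention, which is the performance hit mentioned in the quoted reply.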

[dpdk-dev] [PATCH 15/16] fm10k: use default mailbox message handler for pf

2016-01-26 Thread Bruce Richardson

On Mon, Jan 25, 2016 at 02:31:05AM +, Wang, Xiao W wrote:
> Hi Bruce,
> 
> > -Original Message-
> > From: Richardson, Bruce
> > Sent: Saturday, January 23, 2016 5:32 AM
> > To: Wang, Xiao W 
> > Cc: Chen, Jing D ; dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 15/16] fm10k: use default mailbox message
> > handler for pf
> > 
> > On Thu, Jan 21, 2016 at 06:36:00PM +0800, Wang Xiao W wrote:
> > > The new share code makes fm10k_msg_update_pvid_pf function static, so
> > > we can not refer to it now in fm10k_ethdev.c. The registered pf
> > > handler is almost the same as the default pf handler, removing it has no
> > impact on mailbox.
> > >
> > > Signed-off-by: Wang Xiao W 
> > 
> > What patch makes the function static, as we need to ensure that the build is
> > not broken by having this patch in the wrong place in the patchset?
> > 
> > Also, it seems strange having this patch in the middle of a series of base
> > code updates - perhaps it should go first, so that all base code update
> > patches can go one after the other.
> > 
> > /Bruce
> 
> It's the first patch in the patch set that makes the function static.

So does this patch not need to go before patch 1, if we can't refer to the
function once patch one is applied?

/Bruce

[dpdk-dev] [RFC PATCH 5/5] virtio: Extend virtio-net PMD to support container environment

2016-01-26 Thread Tetsuya Mukawa

On 2016/01/25 19:29, Xie, Huawei wrote:
> On 1/21/2016 7:09 PM, Tetsuya Mukawa wrote:
>> +#define PCI_CONFIG_ADDR(_bus, _device, _function, _offset) ( \
>> +(1 << 31) | ((_bus) & 0xff) << 16 | ((_device) & 0x1f) << 11 | \
>> +((_function) & 0xf) << 8 | ((_offset) & 0xfc))
> (_function) & 0x7 ?

Yes, you are correct.
I will fix it.

Thanks,
Tetsuya
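
For reference, a standalone sketch of the address layout with the 0x7 function mask agreed on above. It is written as a function rather than a macro, and the name `pci_config_addr` is hypothetical. The field positions are those of the macro in the quoted patch: enable bit 31, bus in bits 23:16, device in bits 15:11, function in bits 10:8, and a dword-aligned register offset in bits 7:2.

```c
#include <stdint.h>

/*
 * Build a PCI configuration address from bus, device, function and
 * register offset. The function number is 3 bits wide, so it is masked
 * with 0x7; a 0xf mask would let bit 3 spill into the device field.
 */
static uint32_t
pci_config_addr(uint32_t bus, uint32_t device, uint32_t function,
		uint32_t offset)
{
	return (UINT32_C(1) << 31) |
	       ((bus & 0xff) << 16) |
	       ((device & 0x1f) << 11) |
	       ((function & 0x7) << 8) |
	       (offset & 0xfc);
}
```

With the 0x7 mask, an out-of-range function number such as 8 is truncated to 0 instead of setting bit 11, which belongs to the device number.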

[dpdk-dev] [RFC PATCH 5/5] virtio: Extend virtio-net PMD to support container environment

2016-01-26 Thread Tetsuya Mukawa

On 2016/01/25 19:17, Xie, Huawei wrote:
> On 1/21/2016 7:09 PM, Tetsuya Mukawa wrote:
>> +static void
>> +qtest_handle_one_message(struct qtest_session *s, char *buf)
>> +{
>> +int ret;
>> +
>> +if (strncmp(buf, interrupt_message, strlen(interrupt_message)) == 0) {
>> +if (rte_atomic16_read(&s->enable_intr) == 0)
>> +return;
>> +
>> +/* relay interrupt to pipe */
>> +ret = write(s->irqfds.writefd, "1", 1);
> How about the interrupt latency? Seems it is quite long.

Yes, I agree.
Using eventfd, or removing this read/write mechanism for handling
interrupts altogether, would probably be better.
Let me look into it further.

Tetsuya
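
A minimal sketch of what an eventfd-based relay could look like on Linux, as an alternative to writing a byte into a pipe. The helper names are hypothetical and this is not the code from the patch. Whether it actually improves the interrupt latency is not established here, since a read and a write system call are still involved.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a non-blocking, counter-style event fd; returns -1 on failure. */
static int
intr_event_create(void)
{
	return eventfd(0, EFD_NONBLOCK);
}

/* Signal one interrupt by adding 1 to the eventfd counter. */
static int
intr_event_notify(int efd)
{
	uint64_t one = 1;

	return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Read and reset the counter; returns the pending count, 0 if none. */
static uint64_t
intr_event_drain(int efd)
{
	uint64_t pending = 0;

	if (read(efd, &pending, sizeof(pending)) != (ssize_t)sizeof(pending))
		return 0;
	return pending;
}
```

One difference from the pipe is that several notifications accumulate in a single counter, so the reader can collect them with one read.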

[dpdk-dev] [RFC PATCH 5/5] virtio: Extend virtio-net PMD to support container environment

2016-01-26 Thread Tetsuya Mukawa

On 2016/01/25 19:15, Xie, Huawei wrote:
> On 1/22/2016 6:38 PM, Tetsuya Mukawa wrote:
>> On 2016/01/22 17:14, Xie, Huawei wrote:
>>> On 1/21/2016 7:09 PM, Tetsuya Mukawa wrote:
>>>> virtio: Extend virtio-net PMD to support container environment
>>>>
>>>> The patch adds a new virtio-net PMD configuration that allows the PMD to
>>>> work on host as if the PMD is in VM.
>>>> Here is new configuration for virtio-net PMD.
>>>>  - CONFIG_RTE_LIBRTE_VIRTIO_HOST_MODE
>>>> To use this mode, EAL needs physically contiguous memory. To allocate
>>>> such memory, add "--shm" option to application command line.
>>>>
>>>> To prepare virtio-net device on host, the users need to invoke QEMU
>>>> process in special qtest mode. This mode is mainly used for testing QEMU
>>>> devices from outer process. In this mode, no guest runs.
>>>> Here is QEMU command line.
>>>>
>>>>  $ qemu-system-x86_64 \
>>>>  -machine pc-i440fx-1.4,accel=qtest \
>>>>  -display none -qtest-log /dev/null \
>>>>  -qtest unix:/tmp/socket,server \
>>>>  -netdev type=tap,script=/etc/qemu-ifup,id=net0,queues=1\
>>>>  -device virtio-net-pci,netdev=net0,mq=on \
>>>>  -chardev socket,id=chr1,path=/tmp/ivshmem,server \
>>>>  -device ivshmem,size=1G,chardev=chr1,vectors=1
>>>>
>>>>  * QEMU process is needed per port.
>>> Does qtest supports hot plug virtio-net pci device, so that we could run
>>> one QEMU process in host, which provisions the virtio-net virtual
>>> devices for the container?
>> Theoretically, we can use hot plug in some cases.
>> But I guess we have 3 concerns here.
>>
>> 1. Security.
>> If we share QEMU process between multiple DPDK applications, this QEMU
>> process will have all fds of  the applications on different containers.
>> In some cases, it will be security concern.
>> So, I guess we need to support current 1:1 configuration at least.
>>
>> 2. shared memory.
>> Currently, QEMU and DPDK application will map shared memory using same
>> virtual address.
>> So if multiple DPDK application connects to one QEMU process, each DPDK
>> application should have different address for shared memory. I guess
>> this will be a big limitation.
>>
>> 3. PCI bridge.
>> So far, QEMU has one PCI bridge, so we can connect almost 10 PCI devices
>> to QEMU.
>> (I forget correct number, but it's almost 10, because some slots are
>> reserved by QEMU)
>> A DPDK application needs both virtio-net and ivshmem device, so I guess
>> almost 5 DPDK applications can connect to one QEMU process, so far.
>> To add more PCI bridges solves this.
>> But we need to add a lot of implementation to support cascaded PCI
>> bridges and PCI devices.
>> (Also we need to solve above "2nd" concern.)
>>
>> Anyway, if we use virtio-net PMD and vhost-user PMD, QEMU process will
>> not do anything after initialization.
>> (QEMU will try to read a qtest socket, then be stopped because there is
>> no message after initialization)
>> So I guess we can ignore overhead of these QEMU processes.
>> If someone cannot ignore it, I guess this is the one of cases that it's
>> nice to use your light weight container implementation.
> Thanks for the explanation, and also in your opinion where is the best
> place to run the QEMU instance? If we run QEMU instances in host, for
> vhost-kernel support, we could get rid of the root privilege issue.

Do you mean below?
If we deploy QEMU instance on host, we can start a container without the
root privilege.
(But on host, still QEMU instance needs the privilege to access to
vhost-kernel)

If so, I agree to deploy QEMU instance on host or other privileged
container will be nice.
In the case of vhost-user, to deploy on host or non-privileged container
will be good.

>
> Another issue is do you plan to support multiple virtio devices in
> container? Currently i find the code assuming only one virtio-net device
> in QEMU, right?

Yes, so far, 1 port needs 1 QEMU instance.
So if you need multiple virtio devices, you need to invoke multiple QEMU
instances.

Do you want to deploy 1 QEMU instance for each DPDK application, even if
the application has multiple virtio-net ports?

So far, I am not sure whether we need it, because this type of DPDK
application will need only one port in most cases.
But if you need this, yes, I can implement using QEMU PCI hotplug feature.
(But probably we can only attach almost 10 ports. This will be limitation.)

>
> Btw, i have read most of your qtest code. No obvious issues found so far
> but quite a couple of nits. You must have spent a lot of time on this.
> It is great work!

I appreciate your reviewing!

BTW, my container implementation needed a QEMU patch in the case of
vhost-user.
But the patch has been merged in upstream QEMU, so we don't have this
limitation any more.

Thanks,
Tetsuya

[dpdk-dev] rte_mbuf size for jumbo frame

2016-01-26 Thread Lawrence MacIntyre

Saurabh:

It sounds like you benchmarked Apache using Jumbo Packets, but not the 
DPDK app using large mbufs. Those are two entirely different issues.

You should be able to write your packet inspection routines to work with 
the mbuf chains, rather than copying them into a larger buffer (although 
if there are multiple passes through the data, it could be a bit 
complicated). Copying the data into a larger buffer will definitely 
cause the application to be slower.

Lawrence
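
One way to inspect a chain without copying it is to carry scanner state from one segment to the next. The sketch below counts "\r\n" pairs, including a pair that straddles a segment boundary. `struct seg` is a hypothetical, simplified stand-in for a chained buffer, and the pattern is only an example; a real inspection routine would carry whatever state its parser needs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one buffer in a chain. */
struct seg {
	struct seg *next;    /* next segment, NULL on the last one */
	const uint8_t *data; /* start of this segment's payload */
	uint16_t data_len;   /* bytes of payload in this segment */
};

/* Scanner state carried across segment boundaries. */
struct crlf_scan {
	int prev_was_cr;  /* the most recently seen byte was '\r' */
	uint32_t matches; /* number of "\r\n" pairs seen so far */
};

/* Feed one segment's bytes through the scanner. */
static void
crlf_scan_seg(struct crlf_scan *st, const uint8_t *data, uint16_t len)
{
	for (uint16_t i = 0; i < len; i++) {
		if (st->prev_was_cr && data[i] == '\n')
			st->matches++;
		st->prev_was_cr = (data[i] == '\r');
	}
}

/* Scan a whole chain in a single pass, with no coalescing copy. */
static uint32_t
crlf_count_chain(const struct seg *s)
{
	struct crlf_scan st = { 0, 0 };

	for (; s != NULL; s = s->next)
		crlf_scan_seg(&st, s->data, s->data_len);
	return st.matches;
}
```

This works for a single pass; as noted above, multiple passes over the data make the bookkeeping more complicated.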

This one time (01/26/2016 09:40 AM), at band camp, Saurabh Mishra wrote:
>
> Hi,
>
> Since we do full content inspection, we will end up coalescing mbuf 
> chains into one before inspecting the packet which would require 
> allocating another buffer of larger size.
>
> I am inclined towards larger size mbuf for this reason.
>
> I have benchmarked a bit using apache benchmark and we see 3x 
> performance improvement over 1500 mtu. Memory is not an issue.
>
> My only concern is that would all the dpdk drivers work with larger 
> size mbuf?
>
> Thanks,
> Saurabh
>
> On Jan 26, 2016 6:23 AM, "Lawrence MacIntyre" wrote:
>
> Saurabh:
>
> Raising the mbuf size will make the packet handling for large
> packets slightly more efficient, but it will use much more memory
> unless the great majority of the packets you are handling are of
> the jumbo size. Using more memory has its own costs. In order to
> evaluate this design choice, it is necessary to understand the
> behavior of the memory subsystem, which is VERY complicated.
>
> Before  you go down this path, at least benchmark your application
> using the regular sized mbufs and the large ones and see what the
> effect is.
>
> This one time (01/26/2016 09:01 AM), at band camp, Polehn, Mike A
> wrote:
>
> Jumbo frames are generally handled by link lists (but called
> something else) of mbufs.
> Enabling jumbo frames for the device driver should enable the
> right portion of the driver which handles the linked lists.
>
> Don't make the mbufs huge.
>
> Mike
>
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Masaru OKI
> Sent: Monday, January 25, 2016 2:41 PM
> To: Saurabh Mishra; users at dpdk.org; dev at dpdk.org
> Subject: Re: [dpdk-dev] rte_mbuf size for jumbo frame
>
> Hi,
>
> 1. Take care of the mempool element size for the mbufs.
> 2. Call rte_eth_dev_set_mtu() for each interface.
>  Note that some PMDs do not support changing the MTU.
>
> On 2016/01/26 6:02, Saurabh Mishra wrote:
>
> Hi,
>
> We wanted to use 10400 bytes size of each rte_mbuf to
> enable Jumbo frames.
> Do you guys see any problem with that? Would all the
> drivers like
> ixgbe, i40e, vmxnet3, virtio and bnx2x work with larger
> rte_mbuf size?
>
> We would want to avoid dealing with chained mbufs.
>
> /Saurabh
>
>
> -- 
> Lawrence MacIntyre macintyrelp at ornl.gov
>  Oak Ridge National Laboratory
> 865.574.7401   Cyber Space and Information
> Intelligence Research Group
>

-- 
Lawrence MacIntyre  macintyrelp at ornl.gov  Oak Ridge National Laboratory
  865.574.7401  Cyber Space and Information Intelligence Research Group

[dpdk-dev] [PATCH] eal: add function to check if primary proc alive

2016-01-26 Thread Van Haaren, Harry

> From: Qiu, Michael
> > Whatever work the secondary was performing (in its own address space)
> > won't be directly changed by the primary being killed, because the
> > shared config and hugepages stay (EAL "cleans up" when the primary
> > is re-launched, not on quit).
> 
> OK, but when the primary quits or is killed, its queues will be freed, which
> could be an issue when the secondary tries to access them. Maybe I'm wrong.

The use-case for this patch is monitoring statistics and fault-detection.
That involves reading registers directly from the NIC, and the NIC
rx/tx queues are not used. I think you are right that using the rx/tx
queues from a secondary process when they have been cleaned-up by the
primary process will indeed cause issues.

If there is a valid use-case where both primary and secondary processes
will be forwarding packets on the same NIC, this issue should be discussed
in more detail.

In its current state, this patch solves a problem for the use case of a
primary process forwarding packets, and a secondary process monitoring
and providing fault-detection.

-Harry

[dpdk-dev] [PATCH] eal: add function to check if primary proc alive

2016-01-26 Thread Bruce Richardson

On Mon, Jan 25, 2016 at 11:44:59AM +, Van Haaren, Harry wrote:
> > From: Richardson, Bruce
> > The details of what the config file is should largely be hidden from the
> > user IMHO.
> 
> Agreed, however hiding it totally removes the flexibility of waiting for a
> primary that is starting with --file-prefix (aka: in a non-default location).
> Imposing a limit on only monitoring primary procs in the default location
> seems wrong.
> 

But the secondary also needs the same prefix. Is that prefix not accessible by
this function to be used?

/Bruce

[dpdk-dev] [PATCH] eal: add function to check if primary proc alive

2016-01-26 Thread Qiu, Michael

On 1/26/2016 5:04 PM, Van Haaren, Harry wrote:
>> From: Qiu, Michael
>> On 1/25/2016 7:51 PM, Van Haaren, Harry wrote:
>>> Not really, the secondary process will need some CPU,
>>> however it can sleep so it doesn't have to use 100% of it.
>>> It shouldn't be run on a core that is used by the primary
>>> for packet-forwarding though - that will impact performance.
>> If not, what will happen if the primary is killed after you check that it
>> is alive? At that time, the secondary may be doing some work that needs
>> the primary to be alive.
> What work are you thinking of? Apart from the shared config
> and hugepages, primary and secondary processes are running
> in their own address-space, and if the primary gets killed,
> the secondary will notice when it next polls rte_eal_primary_proc_alive().
>
> Whatever work the secondary was performing (in its own address space)
> won't be directly changed by the primary being killed, because the
> shared config and hugepages stay (EAL "cleans up" when the primary
> is re-launched, not on quit).

OK, but when the primary quits or is killed, its queues will be freed, which
could be an issue when the secondary tries to access them. Maybe I'm wrong.

Thanks,
Michael

> -Harry
>
>

[dpdk-dev] DPDK Community Call - Linux Foundation

2016-01-26 Thread Dave Neary

Hi Tim,

On 01/22/2016 07:19 PM, O'Driscoll, Tim wrote:
> At the community call we held on governance in December, we agreed that a few 
> of us would work with the Linux Foundation on a proposal for a light-weight 
> governance of DPDK. This would include things like management of DPDK events, 
> registering trademarks, and any required legal support etc.
> 
> Stephen, Thomas, Dave, Vincent and I met with the Linux Foundation to discuss 
> this and came up with a draft budget proposal. We'd like to have a community 
> call to discuss this and decide how we should proceed.
> 
> The current budget estimate is ~$227k. LF guidance is that for this size of 
> budget they'd propose a flat membership fee (i.e. no tiered membership) of 
> ~$25k per member company, with a target of getting 10 or more companies to 
> contribute.
> 
> At the call we can discuss:
> - A breakdown of the proposed budget. LF are very flexible on this, so we can 
> add or remove whatever we want for DPDK.
> - Proposed membership fee.
> - Matthew also highlighted some recent changes to community representation at 
> the Linux Foundation. We should discuss any concerns associated with this.
> - Next steps.

Sounds good to me. All of the times below work. :-)

Thanks,
Dave.

> When: 
> Tue, Feb 2, 2016 15:00 - 16:00 GMT
> Tue, Feb 2, 2016 07:00 - 08:00 PST
> Tue, Feb 2, 2016 10:00 - 11:00 EST
> Tue, Feb 2, 2016 16:00 - 17:00 CET
> 
> 
> How to join:
> You can join from a computer, tablet or smartphone: 
> https://global.gotomeeting.com/join/474154717
> 
> You can also dial in by phone.
> Access Code: 474-154-717
> 
> More phone numbers
> United States : +1 (312) 757-3126
> Australia : +61 2 9087 3604
> Austria : +43 7 2088 1400
> Belgium : +32 (0) 92 98 0592
> Canada : +1 (647) 497-9350
> Denmark : +45 69 91 80 05
> Finland : +358 (0) 942 41 5778
> France : +33 (0) 182 880 456
> Germany : +49 (0) 692 5736 7313
> Ireland : +353 (0) 15 290 180
> Italy : +39 0 247 92 12 39
> Netherlands : +31 (0) 208 080 379
> New Zealand : +64 9 442 7358
> Norway : +47 21 03 58 96
> Spain : +34 911 82 9782
> Sweden : +46 (0) 313 613 558
> Switzerland : +41 (0) 435 0006 96
> United Kingdom : +44 (0) 20 3713 5028
> 
> 

-- 
Dave Neary - NFV/SDN Community Strategy
Open Source and Standards, Red Hat - http://community.redhat.com
Ph: +1-978-399-2182 / Cell: +1-978-799-3338

[dpdk-dev] [PATCH v6 1/2] tools: Add support for handling built-in kernel modules

2016-01-26 Thread Kamil Rytarowski

ping?

On 20.01.2016 at 10:48, krytarowski at caviumnetworks.com wrote:
> From: Kamil Rytarowski 
>
> Currently dpdk_nic_bind.py detects Linux kernel modules via reading
> /proc/modules. Built-in ones aren't listed there and therefore they are not
> being found by the script.
>
> Add support for checking built-in modules with parsing the sysfs files.
>
> This commit obsoletes the /proc/modules parsing approach.
>
> Signed-off-by: Kamil Rytarowski 
> Acked-by: David Marchand 
> Acked-by: Yuanhan Liu 
> ---
>   tools/dpdk_nic_bind.py | 30 --
>   1 file changed, 20 insertions(+), 10 deletions(-)
>
> diff --git a/tools/dpdk_nic_bind.py b/tools/dpdk_nic_bind.py
> index f02454e..1d16d9f 100755
> --- a/tools/dpdk_nic_bind.py
> +++ b/tools/dpdk_nic_bind.py
> @@ -156,22 +156,32 @@ def check_modules():
>   '''Checks that igb_uio is loaded'''
>   global dpdk_drivers
>   
> -fd = file("/proc/modules")
> -loaded_mods = fd.readlines()
> -fd.close()
> -
>   # list of supported modules
>   mods =  [{"Name" : driver, "Found" : False} for driver in dpdk_drivers]
>   
>   # first check if module is loaded
> -for line in loaded_mods:
> +try:
> +# Get list of sysfs modules, some of them might be builtin and merge with mods
> +sysfs_path = '/sys/module/'
> +
> +# Get the list of directories in sysfs_path
> +sysfs_mods = [os.path.join(sysfs_path, o) for o
> +  in os.listdir(sysfs_path)
> +  if os.path.isdir(os.path.join(sysfs_path, o))]
> +
> +# Extract the last element of '/sys/module/abc' in the array
> +sysfs_mods = [a.split('/')[-1] for a in sysfs_mods]
> +
> +# special case for vfio_pci (module is named vfio-pci,
> +# but its .ko is named vfio_pci)
> +sysfs_mods = map(lambda a:
> + a if a != 'vfio_pci' else 'vfio-pci', sysfs_mods)
> +
>   for mod in mods:
> -if line.startswith(mod["Name"]):
> -mod["Found"] = True
> -# special case for vfio_pci (module is named vfio-pci,
> -# but its .ko is named vfio_pci)
> -elif line.replace("_", "-").startswith(mod["Name"]):
> +if mod["Found"] == False and (mod["Name"] in sysfs_mods):
>   mod["Found"] = True
> +except:
> +pass
>   
>   # check if we have at least one loaded module
>   if True not in [mod["Found"] for mod in mods] and b_flag is not None:
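
For readers skimming the patch, the sysfs lookup it describes can be sketched on its own as follows. This is a standalone illustration, not the code from dpdk_nic_bind.py: the function name `find_modules` and its `sysfs_path` parameter are hypothetical, and the parameter exists only so the lookup can be tried against a scratch directory.

```python
import os

def find_modules(wanted, sysfs_path="/sys/module"):
    """Map each name in `wanted` to True if sysfs lists a module by that name.

    Loaded modules appear as directories under /sys/module, and so do many
    built-in ones, which /proc/modules does not list.
    """
    try:
        present = {name for name in os.listdir(sysfs_path)
                   if os.path.isdir(os.path.join(sysfs_path, name))}
    except OSError:
        # sysfs not mounted or not readable: report nothing as found
        present = set()
    # Same special case as in the patch: sysfs shows vfio_pci, while the
    # driver is referred to as vfio-pci.
    present = {"vfio-pci" if name == "vfio_pci" else name for name in present}
    return {name: name in present for name in wanted}
```

Unlike the bare `except: pass` in the patch, this sketch only swallows `OSError`, so programming errors would still surface.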

[dpdk-dev] [PATCH 1/5] vhost: refactor rte_vhost_dequeue_burst

2016-01-26 Thread Xie, Huawei

On 12/3/2015 2:03 PM, Yuanhan Liu wrote:
> Signed-off-by: Yuanhan Liu 
> ---
>  lib/librte_vhost/vhost_rxtx.c | 287 +-
>  1 file changed, 113 insertions(+), 174 deletions(-)

I'd prefer to unroll copy_mbuf_to_desc and your COPY macro. As written, they
prevent us from processing descriptors in bursts in the future.

[dpdk-dev] [RFC] eal: add cgroup-aware resource self discovery

2016-01-26 Thread Tan, Jianfeng

Hi Neil,

On 1/25/2016 9:46 PM, Neil Horman wrote:
> On Mon, Jan 25, 2016 at 02:49:53AM +0800, Jianfeng Tan wrote:
...
>> -- 
>> 2.1.4
>>
>>
>
> This doesn't make a whole lot of sense, for several reasons:
>
> 1) Applications, as a general rule shouldn't be interrogating the cgroups
> interface at all.

The main reason to do this in DPDK is that DPDK obtains resource 
information from sysfs and proc, which are not well containerized so 
far. And DPDK pre-allocates resources instead of allocating them gradually 
on demand.

>
> 2) Cgroups aren't the only way in which a cpuset or memoryset can be
> restricted (the isolcpus command line argument, or a taskset on a parent
> process for instance, but there are several others).

Yes, I agree. To enable that, I'd like to design the new API for resource 
self discovery in a flexible way. A parameter "type" is used to select 
the discovery method. In addition, I'm considering adding a 
callback function pointer so that users can write their own resource 
discovery functions.

>
> Instead of trying to figure out what cpuset is valid for your process by
> interrogating the cgroups hierarchy, you should follow the prescribed
> method of calling sched_getaffinity after calling sched_setaffinity.  That
> will give you the canonical cpuset that you are executing on, taking all
> cpuset filters into account (including cgroups and any other restrictions).
> It's far simpler as well, as it doesn't require a ton of file/string
> processing.

Yes, this way is much better for cpuset discovery. But is there such a 
syscall for hugepages?

Thanks,
Jianfeng

>
> Neil
>

[dpdk-dev] rte_mbuf size for jumbo frame

2016-01-26 Thread Lawrence MacIntyre

Saurabh:

Raising the mbuf size will make the packet handling for large packets 
slightly more efficient, but it will use much more memory unless the 
great majority of the packets you are handling are of the jumbo size. 
Using more memory has its own costs. In order to evaluate this design 
choice, it is necessary to understand the behavior of the memory 
subsystem, which is VERY complicated.

Before  you go down this path, at least benchmark your application using 
the regular sized mbufs and the large ones and see what the effect is.
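
To put rough numbers on the memory cost, here is a back-of-the-envelope sketch in Python. The pool size of 8192 mbufs, the 2048-byte and 10400-byte data rooms, and the function names are assumptions chosen for illustration; per-mbuf metadata and headroom are ignored.

```python
import math

def pool_bytes(num_mbufs, data_room):
    """Memory taken by the data rooms of a pool of `num_mbufs` buffers."""
    return num_mbufs * data_room

def segments_needed(frame_len, data_room):
    """Number of chained buffers needed to hold one frame of `frame_len` bytes."""
    return math.ceil(frame_len / data_room)

small = pool_bytes(8192, 2048)    # regular-sized buffers
large = pool_bytes(8192, 10400)   # one buffer per jumbo frame
print(small, large, round(large / small, 2))
print(segments_needed(9000, 2048), segments_needed(9000, 10400))
```

With these assumed numbers the pool grows from 16 MiB to roughly 81 MiB, while a 9000-byte frame goes from five chained buffers to one. Whether that trade is worth it depends on the traffic mix, which is why benchmarking both configurations matters.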

This one time (01/26/2016 09:01 AM), at band camp, Polehn, Mike A wrote:
> Jumbo frames are generally handled by link lists (but called something else) 
> of mbufs.
> Enabling jumbo frames for the device driver should enable the right portion 
> of the driver which handles the linked lists.
>
> Don't make the mbufs huge.
>
> Mike
>
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Masaru OKI
> Sent: Monday, January 25, 2016 2:41 PM
> To: Saurabh Mishra; users at dpdk.org; dev at dpdk.org
> Subject: Re: [dpdk-dev] rte_mbuf size for jumbo frame
>
> Hi,
>
> 1. Take care of the mempool element size for the mbufs.
> 2. Call rte_eth_dev_set_mtu() for each interface.
>  Note that some PMDs do not support changing the MTU.
>
> On 2016/01/26 6:02, Saurabh Mishra wrote:
>> Hi,
>>
>> We wanted to use 10400 bytes size of each rte_mbuf to enable Jumbo frames.
>> Do you guys see any problem with that? Would all the drivers like
>> ixgbe, i40e, vmxnet3, virtio and bnx2x work with larger rte_mbuf size?
>>
>> We would want to avoid dealing with chained mbufs.
>>
>> /Saurabh

-- 
Lawrence MacIntyre  macintyrelp at ornl.gov  Oak Ridge National Laboratory
  865.574.7401  Cyber Space and Information Intelligence Research Group

[dpdk-dev] [RFC] eal: add cgroup-aware resource self discovery

2016-01-26 Thread Neil Horman

On Tue, Jan 26, 2016 at 10:22:18AM +0800, Tan, Jianfeng wrote:
> 
> Hi Neil,
> 
> On 1/25/2016 9:46 PM, Neil Horman wrote:
> >On Mon, Jan 25, 2016 at 02:49:53AM +0800, Jianfeng Tan wrote:
> ...
> >>-- 
> >>2.1.4
> >>
> >>
> >
> >This doesn't make a whole lot of sense, for several reasons:
> >
> >1) Applications, as a general rule shouldn't be interrogating the cgroups
> >interface at all.
> 
> The main reason to do this in DPDK is that DPDK obtains resource information
> from sysfs and proc, which are not well containerized so far. And DPDK
> pre-allocates resources instead of allocating them gradually on demand.
> 
Not disagreeing with this, just suggesting that:

1) Interrogating cgroups really isn't the best way to collect that information
2) Pre-allocating those resources isn't particularly wise without some mechanism
to reallocate it, as resource constraints can change (consider your cpuset
getting rewritten)

> >
> >2) Cgroups aren't the only way in which a cpuset or memoryset can be
> >restricted (the isolcpus command line argument, or a taskset on a parent
> >process for instance, but there are several others).
> 
> Yes, I agree. To enable that, I'd like to design the new API for resource self
> discovery in a flexible way. A parameter "type" is used to select the
> discovery method. In addition, I'm considering adding a callback
> function pointer so that users can write their own resource discovery
> functions.
> 
Why?  You don't need an API for this, or if you really want one, it can be very
generic if you use POSIX apis to gather the information.  What you have here is
going to be very linux specific, and will need reimplementing for BSD or other
operating systems.  To use the cpuset example, instead of reading and parsing
the mask files in the cgroup filesystem module to find your task and
corresponding mask, just call sched_setaffinity with an all f's mask, then call
sched_getaffinity.  The returned mask will be all the cpus your process is
allowed to execute on, taking into account every limiting filter the system you
are running on offers.
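
A minimal sketch of that sequence, written in Python for brevity and assuming Linux (where `os.sched_setaffinity` and `os.sched_getaffinity` wrap the corresponding system calls). The function name `allowed_cpus` is made up for this example, and the sketch restores the original mask afterwards, which the description above does not require:

```python
import os

def allowed_cpus():
    """Return the set of CPUs this process is allowed to run on.

    Ask for every CPU, let the kernel trim the request to what is actually
    permitted, read the result back, then restore the original affinity.
    """
    original = os.sched_getaffinity(0)
    try:
        os.sched_setaffinity(0, range(os.cpu_count() or 1))
    except OSError:
        # Could not widen the mask; fall back to the current one.
        return set(original)
    widest = set(os.sched_getaffinity(0))
    os.sched_setaffinity(0, original)
    return widest

print(sorted(allowed_cpus()))
```

Note that the process briefly runs with the widened mask, so this is best done once at startup before any threads are pinned.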

There are similar OS level POSIX apis for most resources out there.  You really
don't need to dig through cgroups just to learn what some of those resources are.

> >
> >Instead of trying to figure out what cpuset is valid for your process by
> >interrogating the cgroups hierarchy, you should follow the prescribed
> >method of calling sched_getaffinity after calling sched_setaffinity.  That
> >will give you the canonical cpuset that you are executing on, taking all
> >cpuset filters into account (including cgroups and any other restrictions).
> >It's far simpler as well, as it doesn't require a ton of file/string
> >processing.
> 
> Yes, this way is much better for cpuset discovery. But is there such a
> syscall for hugepages?
> 
In what capacity?  Interrogating how many hugepages you have, or to what node
they are affined?  Capacity would require reading the requisite proc file, as
there's no POSIX API for this resource.  Node affinity can be implied by setting
the numa policy of the dpdk and then writing to /proc/sys/vm/nr_hugepages, as the
kernel will attempt to distribute hugepages evenly among the tasks' numa policy
configuration.
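
For the capacity part, reading the proc file could look roughly like this. It is a Python sketch that assumes the usual `HugePages_Total`, `HugePages_Free` and `Hugepagesize` fields of /proc/meminfo on Linux; the function names are made up for this example.

```python
def parse_hugepages(meminfo_text):
    """Extract the hugepage counters from the text of /proc/meminfo.

    HugePages_* values are page counts; Hugepagesize is reported in kB.
    """
    wanted = ("HugePages_Total", "HugePages_Free", "Hugepagesize")
    info = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in wanted:
            info[key] = int(rest.split()[0])
    return info

def read_hugepages(path="/proc/meminfo"):
    """Read and parse the hugepage counters of the running system."""
    with open(path) as f:
        return parse_hugepages(f.read())
```

This covers only the system-wide default hugepage size; per-node or per-size counters live elsewhere and are not handled here.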

That said, I would advise that you strongly consider not exporting hugepages as
a resource, as:

a) Applications generally don't need to know that they are using hugepages, and
so they don't need to know where said hugepages live, they just allocate memory
via your allocation api and you give them something appropriate

b) Hugepages are a resource that are very specific to Linux, and to X86 Linux at
that.  Some OSes implement similar resources, but they may have very different
semantics.  And other Arches may or may not implement various forms of compound
paging at all.  As the DPDK expands to support more OS'es and arches, it would
be nice to ensure that the programming surfaces that you expose have a more
broad level of support.

Neil

> Thanks,
> Jianfeng
> 
> >
> >Neil
> >
> 
>

[dpdk-dev] rte_mbuf size for jumbo frame

2016-01-26 Thread Saurabh Mishra

Hi Lawrence --

>It sounds like you benchmarked Apache using Jumbo Packets, but not the
DPDK app using large mbufs.
>Those are two entirely different issues.

I meant I ran Apache benchmark between two guest VMs through our
data-processing VM which is using DPDK.

I saw 3x better performance with a 10k mbuf size vs a 2k mbuf size (with the
MTU also set appropriately).

Unfortunately, we can't handle chained mbufs unless we copy them into a large
buffer. Even if we do start handling chained mbufs, we can't inspect
scattered mbuf payloads. We have to coalesce them into one anyway
to make sense of the content of the packet. We inspect the full packet (from
the 1st byte to the last byte).

Thanks,
/Saurabh

On Tue, Jan 26, 2016 at 8:50 AM, Lawrence MacIntyre 
wrote:

> Saurabh:
>
> It sounds like you benchmarked Apache using Jumbo Packets, but not the
> DPDK app using large mbufs. Those are two entirely different issues.
>
> You should be able to write your packet inspection routines to work with
> the mbuf chains, rather than copying them into a larger buffer (although if
> there are multiple passes through the data, it could be a bit complicated).
> Copying the data into a larger buffer will definitely cause the application
> to be slower.
>
> Lawrence
>
>
> This one time (01/26/2016 09:40 AM), at band camp, Saurabh Mishra wrote:
>
> Hi,
>
> Since we do full content inspection, we will end up coalescing mbuf chains
> into one before inspecting the packet which would require allocating
> another buffer of larger size.
>
> I am inclined towards larger size mbuf for this reason.
>
> I have benchmarked a bit using apache benchmark and we see 3x performance
> improvement over 1500 mtu. Memory is not an issue.
>
> My only concern is whether all the DPDK drivers would work with a larger
> mbuf size.
>
> Thanks,
> Saurabh
> On Jan 26, 2016 6:23 AM, "Lawrence MacIntyre" 
> wrote:
>
>> Saurabh:
>>
>> Raising the mbuf size will make the packet handling for large packets
>> slightly more efficient, but it will use much more memory unless the great
>> majority of the packets you are handling are of the jumbo size. Using more
>> memory has its own costs. In order to evaluate this design choice, it is
>> necessary to understand the behavior of the memory subsystem, which is VERY
>> complicated.
>>
>> Before  you go down this path, at least benchmark your application using
>> the regular sized mbufs and the large ones and see what the effect is.
>>
>> This one time (01/26/2016 09:01 AM), at band camp, Polehn, Mike A wrote:
>>
>>> Jumbo frames are generally handled by link lists (but called something
>>> else) of mbufs.
>>> Enabling jumbo frames for the device driver should enable the right
>>> portion of the driver which handles the linked lists.
>>>
>>> Don't make the mbufs huge.
>>>
>>> Mike
>>>
>>> -----Original Message-----
>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Masaru OKI
>>> Sent: Monday, January 25, 2016 2:41 PM
>>> To: Saurabh Mishra; users at dpdk.org; dev at dpdk.org
>>> Subject: Re: [dpdk-dev] rte_mbuf size for jumbo frame
>>>
>>> Hi,
>>>
>>> 1. Take care of the mempool element size for the mbufs.
>>> 2. Call rte_eth_dev_set_mtu() for each interface.
>>>  Note that some PMDs do not support changing the MTU.
>>>
>>> On 2016/01/26 6:02, Saurabh Mishra wrote:
>>>
>>>> Hi,
>>>>
>>>> We wanted to use 10400 bytes size of each rte_mbuf to enable Jumbo
>>>> frames.
>>>> Do you guys see any problem with that? Would all the drivers like
>>>> ixgbe, i40e, vmxnet3, virtio and bnx2x work with larger rte_mbuf size?
>>>>
>>>> We would want to avoid dealing with chained mbufs.
>>>>
>>>> /Saurabh

>>>
>> --
>> Lawrence MacIntyre  macintyrelp at ornl.gov  Oak Ridge National Laboratory
>>  865.574.7401  Cyber Space and Information Intelligence Research Group
>>
>>
> --
> Lawrence MacIntyre  macintyrelp at ornl.gov  Oak Ridge National Laboratory
>  865.574.7401  Cyber Space and Information Intelligence Research Group
>
>

[dpdk-dev] [PATCH] eal: add function to check if primary proc alive

2016-01-26 Thread Van Haaren, Harry

> From: Qiu, Michael
> On 1/25/2016 7:51 PM, Van Haaren, Harry wrote:
> > Not really, the secondary process will need some CPU,
> > however it can sleep so it doesn't have to use 100% of it.
> > It shouldn't be run on a core that is used by the primary
> > for packet-forwarding though - that will impact performance.
> 
> If not, what will happen if the primary is killed after you check that it
> is alive? At that time, the secondary may be doing some work that needs
> the primary to be alive.

What work are you thinking of? Apart from the shared config
and hugepages, primary and secondary processes are running
in their own address-space, and if the primary gets killed,
the secondary will notice when it next polls rte_eal_primary_proc_alive().

Whatever work the secondary was performing (in its own address space)
won't be directly changed by the primary being killed, because the
shared config and hugepages stay (EAL "cleans up" when the primary
is re-launched, not on quit).
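
The polling pattern discussed in this thread (check, sleep, check again) can be sketched as follows. This is a Python illustration only: `is_alive` stands in for whatever liveness check the monitor uses, such as a wrapper around rte_eal_primary_proc_alive(), and the function name, the 0.1 s interval and the `max_polls` limit are assumptions made for the example.

```python
import time

def watch_primary(is_alive, interval_s=0.1, max_polls=None, sleep=time.sleep):
    """Poll `is_alive()` until it returns False.

    Returns the number of successful polls seen before the primary was found
    dead, or None if `max_polls` checks all succeeded. Sleeping between polls
    keeps the monitor from occupying a whole core.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        if not is_alive():
            return polls
        polls += 1
        sleep(interval_s)
    return None
```

The interval bounds how long the secondary can keep working against a dead primary, so it should be chosen from the fault-detection latency that is acceptable.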

-Harry

[dpdk-dev] rte_mbuf size for jumbo frame

2016-01-26 Thread Masaru OKI

Hi,

1. Take care of the mempool element size for the mbufs.
2. Call rte_eth_dev_set_mtu() for each interface.
   Note that some PMDs do not support changing the MTU.

On 2016/01/26 6:02, Saurabh Mishra wrote:
> Hi,
>
> We wanted to use 10400 bytes size of each rte_mbuf to enable Jumbo frames.
> Do you guys see any problem with that? Would all the drivers like ixgbe,
> i40e, vmxnet3, virtio and bnx2x work with larger rte_mbuf size?
>
> We would want to avoid dealing with chained mbufs.
>
> /Saurabh

[dpdk-dev] rte_mbuf size for jumbo frame

2016-01-26 Thread Saurabh Mishra

Hi,

Since we do full content inspection, we will end up coalescing mbuf chains
into one before inspecting the packet which would require allocating
another buffer of larger size.

I am inclined towards larger size mbuf for this reason.

I have benchmarked a bit using apache benchmark and we see 3x performance
improvement over 1500 mtu. Memory is not an issue.

My only concern is whether all the DPDK drivers would work with a larger
mbuf size.

Thanks,
Saurabh
On Jan 26, 2016 6:23 AM, "Lawrence MacIntyre"  wrote:

> Saurabh:
>
> Raising the mbuf size will make the packet handling for large packets
> slightly more efficient, but it will use much more memory unless the great
> majority of the packets you are handling are of the jumbo size. Using more
> memory has its own costs. In order to evaluate this design choice, it is
> necessary to understand the behavior of the memory subsystem, which is VERY
> complicated.
>
> Before  you go down this path, at least benchmark your application using
> the regular sized mbufs and the large ones and see what the effect is.
>
> This one time (01/26/2016 09:01 AM), at band camp, Polehn, Mike A wrote:
>
>> Jumbo frames are generally handled by link lists (but called something
>> else) of mbufs.
>> Enabling jumbo frames for the device driver should enable the right
>> portion of the driver which handles the linked lists.
>>
>> Don't make the mbufs huge.
>>
>> Mike
>>
>> -----Original Message-----
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Masaru OKI
>> Sent: Monday, January 25, 2016 2:41 PM
>> To: Saurabh Mishra; users at dpdk.org; dev at dpdk.org
>> Subject: Re: [dpdk-dev] rte_mbuf size for jumbo frame
>>
>> Hi,
>>
>> 1. Take care of the mempool element size for the mbufs.
>> 2. Call rte_eth_dev_set_mtu() for each interface.
>>  Note that some PMDs do not support changing the MTU.
>>
>> On 2016/01/26 6:02, Saurabh Mishra wrote:
>>
>>> Hi,
>>>
>>> We wanted to use 10400 bytes size of each rte_mbuf to enable Jumbo
>>> frames.
>>> Do you guys see any problem with that? Would all the drivers like
>>> ixgbe, i40e, vmxnet3, virtio and bnx2x work with larger rte_mbuf size?
>>>
>>> We would want to avoid dealing with chained mbufs.
>>>
>>> /Saurabh
>>>
>>
> --
> Lawrence MacIntyre  macintyrelp at ornl.gov  Oak Ridge National Laboratory
>  865.574.7401  Cyber Space and Information Intelligence Research Group
>
>

[dpdk-dev] [PATCH 0/4] virtio support for container

2016-01-26 Thread Qiu, Michael

On 1/11/2016 2:43 AM, Tan, Jianfeng wrote:
> This patchset is to provide high performance networking interface (virtio)
> for container-based DPDK applications. The way of starting DPDK apps in
> containers with ownership of NIC devices exclusively is beyond the scope.
> The basic idea here is to present a new virtual device (named eth_cvio),
> which can be discovered and initialized in container-based DPDK apps using
> rte_eal_init(). To minimize the change, we reuse already-existing virtio
> frontend driver code (driver/net/virtio/).
>  
> Compared to QEMU/VM case, virtio device framework (translates I/O port r/w
> operations into unix socket/cuse protocol, which is originally provided in
> QEMU), is integrated in virtio frontend driver. So this converged driver
> actually plays the role of original frontend driver and the role of QEMU
> device framework.
>  
> The major difference lies in how to calculate relative address for vhost.
> The principle of virtio is that: based on one or multiple shared memory
> segments, vhost maintains a reference system with the base addresses and
> length for each segment so that an address from VM comes (usually GPA,
> Guest Physical Address) can be translated into vhost-recognizable address
> (named VVA, Vhost Virtual Address). To decrease the overhead of address
> translation, we should maintain as few segments as possible. In VM's case,
> GPA is always locally continuous. In container's case, CVA (Container
> Virtual Address) can be used. Specifically:
> a. when set_base_addr, CVA address is used;
> b. when preparing RX's descriptors, CVA address is used;
> c. when transmitting packets, CVA is filled in TX's descriptors;
> d. in TX and CQ's header, CVA is used.
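
The translation described above (a table of shared segments, each with a base address and a length, used to map a frontend address to a vhost virtual address) can be sketched like this. It is a Python illustration under assumed names: `Region`, its fields and `to_vhost_addr` are hypothetical and not taken from the patchset.

```python
from collections import namedtuple

# One shared memory segment: where it starts in the frontend's address space
# (GPA for a VM, CVA for a container), its length, and where vhost mapped it.
Region = namedtuple("Region", "front_base length vhost_base")

def to_vhost_addr(regions, addr, size=1):
    """Translate a frontend address into a vhost virtual address (VVA).

    The whole range [addr, addr + size) must fall inside a single region.
    """
    for r in regions:
        if r.front_base <= addr and addr + size <= r.front_base + r.length:
            return r.vhost_base + (addr - r.front_base)
    raise ValueError("address range is not inside any shared region")
```

The lookup is a linear scan, so its cost grows with the number of regions, which is one way to see why keeping the segment count small matters.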
>  
> How to share memory? In VM's case, qemu always shares all physical layout
> to backend. But it's not feasible for a container, as a process, to share
> all virtual memory regions to backend. So only specified virtual memory
> regions (with type of shared) are sent to backend. It's a limitation that
> only addresses in these areas can be used to transmit or receive packets.
>
> Known issues
>
> a. When used with vhost-net, root privilege is required to create tap
> device inside.
> b. Control queue and multi-queue are not supported yet.
> c. When --single-file option is used, socket_id of the memory may be
> wrong. (Use "numactl -N x -m x" to work around this for now)
>  
> How to use?
>
> a. Apply this patchset.
>
> b. To compile container apps:
> $: make config RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> $: make install RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> $: make -C examples/l2fwd RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> $: make -C examples/vhost RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
>
> c. To build a docker image using Dockerfile below.
> $: cat ./Dockerfile
> FROM ubuntu:latest
> WORKDIR /usr/src/dpdk
> COPY . /usr/src/dpdk
> ENV PATH "$PATH:/usr/src/dpdk/examples/l2fwd/build/"
> $: docker build -t dpdk-app-l2fwd .
>
> d. Used with vhost-user
> $: ./examples/vhost/build/vhost-switch -c 3 -n 4 \
>   --socket-mem 1024,1024 -- -p 0x1 --stats 1
> $: docker run -i -t -v :/var/run/usvhost \
>   -v /dev/hugepages:/dev/hugepages \
>   dpdk-app-l2fwd l2fwd -c 0x4 -n 4 -m 1024 --no-pci \
>   --vdev=eth_cvio0,path=/var/run/usvhost -- -p 0x1
>
> f. Used with vhost-net
> $: modprobe vhost
> $: modprobe vhost-net
> $: docker run -i -t --privileged \
>   -v /dev/vhost-net:/dev/vhost-net \
>   -v /dev/net/tun:/dev/net/tun \
>   -v /dev/hugepages:/dev/hugepages \
>   dpdk-app-l2fwd l2fwd -c 0x4 -n 4 -m 1024 --no-pci \
>   --vdev=eth_cvio0,path=/dev/vhost-net -- -p 0x1

We'd better add an ifname option, like
--vdev=eth_cvio0,path=/dev/vhost-net,ifname=tap0, so that the user could add
the tap to the bridge first.

Thanks,
Michael
>
> By the way, it's not necessary to run in a container.
>
> Signed-off-by: Huawei Xie 
> Signed-off-by: Jianfeng Tan 
>
> Jianfeng Tan (4):
>   mem: add --single-file to create single mem-backed file
>   mem: add API to obstain memory-backed file info
>   virtio/vdev: add ways to interact with vhost
>   virtio/vdev: add a new vdev named eth_cvio
>
>  config/common_linuxapp |   5 +
>  drivers/net/virtio/Makefile|   4 +
>  drivers/net/virtio/vhost.c | 734 +
>  drivers/net/virtio/vhost.h | 192 
>  drivers/net/virtio/virtio_ethdev.c | 338 ++---
>  drivers/net/virtio/virtio_ethdev.h |   4 +
>  drivers/net/virtio/virtio_pci.h|  52 +-
>  drivers/net/virtio/virtio_rxtx.c   |  11 +-
>  drivers/net/virtio/virtio_rxtx_simple.c|  14 +-
>  drivers/net/virtio/virtqueue.h |  13 +-
>  lib/librte_eal/common/eal_common_options.c |  17 +
>  lib/librte_eal/common/eal_internal_cfg.h   |   1 +
>  lib/librte_eal/common/eal_options.h|   2 +
>  lib/librte_eal/common/include/rte_memory.h |  16 +
>  lib/l

[dpdk-dev] [PATCH 0/3] fm10k: enable FTAG based forwarding

2016-01-26 Thread Liu, Yong

Tested-by: Yong Liu 

- Tested Commit: a38e5ec15e3fe615b94f3cc5edca5974dab325ab
- OS: Fedora20 3.11.10-301.fc20.x86_64
- GCC: gcc version 4.8.3 20140911
- CPU: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
- NIC: Intel Corporation Device RedrockCanyou [8086:15a4]
- Default x86_64-native-linuxapp-gcc configuration
- Prerequisites:
- Total 1 cases, 1 passed, 0 failed

- Prerequisites command / instruction:
  Apply FM10k ftag unit test patch.
  Turn on CONFIG_RTE_LIBRTE_FM10K_FTAG_FWD setting.
  export Port0 and Port1's GLORT ID to environment variables
export PORT1_GLORT=0x4200
export PORT0_GLORT=0x4000

- Case: Ftag forwarding unit test
  Description: check that the fm10k NIC can forward packets based on FTAG
  Command / instruction:
Start test application and run fm10k_ftag_autotest.
  ./x86_64-native-linuxapp-gcc/app/test -c f -n 4
  RTE>>fm10k_ftag_autotest
Send packet to Port0 and verify packet forwarded to Port1.
  Receive 1 packets on port 0
  test for FTAG RX passed
  Send out 1 packets with FTAG on port 0
  Receive 1 packets on port 1
  test for FTAG TX passed
  Test OK

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Wang Xiao W
> Sent: Monday, January 25, 2016 4:07 PM
> To: Chen, Jing D
> Cc: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 0/3] fm10k: enable FTAG based forwarding
> 
> This patch set adds support for FTAG based forwarding in fm10k. This feature
> is specific to fm10k, so I added an introduction to it in fm10k.rst.
> An FTAG unit test is kept internally for feature testing; it is not included
> in the patch set because it is so specific to fm10k.
> 
> Wang Xiao W (3):
>   fm10k: enable FTAG based forwarding
>   doc: add introduction for fm10k FTAG based forwarding
>   doc: update release note for fm10k FTAG support
> 
>  config/common_bsdapp |  1 +
>  config/common_linuxapp   |  1 +
>  doc/guides/nics/fm10k.rst| 15 ++-
>  doc/guides/rel_notes/release_2_3.rst |  1 +
>  drivers/net/fm10k/fm10k_ethdev.c |  8 
>  drivers/net/fm10k/fm10k_rxtx.c   | 17 +
>  drivers/net/fm10k/fm10k_rxtx_vec.c   |  9 +
>  7 files changed, 51 insertions(+), 1 deletion(-)
> 
> --
> 1.9.3

[dpdk-dev] [PATCH] eal: add function to check if primary proc alive

2016-01-26 Thread Qiu, Michael

On 1/25/2016 7:51 PM, Van Haaren, Harry wrote:
>> From: Qiu, Michael
>> Subject: Re: [dpdk-dev] [PATCH] eal: add function to check if primary proc alive
>>
>> So secondary will waste a whole lcore to do such polling?
> Not really, the secondary process will need some CPU,
> however it can sleep so it doesn't have to use 100% of it.
> It shouldn't be run on a core that is used by the primary
> for packet-forwarding though - that will impact performance.

If not, what happens if the primary is killed right after you check that
it is alive? At that point, the secondary may be doing work that needs the
primary to be alive.

Thanks,
Michael
> -Harry
>
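
The polling approach discussed above can be sketched as a loop that sleeps between checks, so the secondary does not spin on a core. This is a language-neutral illustration in Python, not the EAL implementation: wait_for_primary_exit and the is_alive callback are hypothetical names, and the real check is whatever function the patch adds. Each check is only a snapshot, so the race raised above (the primary dying just after a successful check) still exists between polls.

```python
import time


def wait_for_primary_exit(is_alive, interval=0.1, max_polls=None,
                          sleep=time.sleep):
    """Poll `is_alive()` until it returns False, sleeping between polls.

    `is_alive` is a hypothetical callback standing in for the liveness
    check proposed in the patch. Returns True once the primary is seen
    dead, or False if `max_polls` checks all found it alive.
    """
    polls = 0
    while is_alive():
        polls += 1
        if max_polls is not None and polls >= max_polls:
            return False
        sleep(interval)  # yield the CPU instead of busy-waiting
    return True


if __name__ == "__main__":
    # Simulated primary that is alive for two checks and then gone.
    states = iter([True, True, False])
    print(wait_for_primary_exit(lambda: next(states), interval=0.01))
```

A longer interval costs less CPU but widens the window in which the secondary keeps working against a dead primary.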

[dpdk-dev] [PATCH 2/4] i40e: split function for input set change of hash and fdir

2016-01-26 Thread Wu, Jingjing


> >
> > Thanks for your comments. You are correct, I removed
> > I40E_INSET_FLEX_PAYLOAD from the valid fdir input set values, and this is
> > one reason why I split the function for input set change of hash and fdir.
> > It is because all flex payload configuration can be set in struct
> > rte_fdir_conf during the device configure phase. And it is a more flexible
> > configuration, including flex payload selection, input set selection by
> > word, and mask setting in bits.
> 
> Should it then be two patches? The first patch would split the fdir and hash
> input set configuration, and the second one would remove the existing
> functionality. At the moment it is not obvious that this patch not only splits
> the fdir input set configuration but also removes some features, in a way that
> makes fdir incompatible with DPDK 2.2.
> 
OK. I will try to split it into two patches.

> >
> > If I enable it in the input set change API, it will be a duplicate. And
> > the input set change on the flexible payload works only on words, which is
> > only a subset of what rte_fdir_conf can do.
> > If flexible selection isn't done in struct rte_fdir_conf, the input
> > set selection in the input set change API doesn't make sense. If flexible
> > selection is done in struct rte_fdir_conf, why not select the input set
> > in struct rte_fdir_conf at the same time?
> 
> I do not have a problem with selecting it at the same time - it always was
> this way with the legacy systems. But now the new NIC supports a new way of
> configuring the input set, with the flexible payload as a part of this input
> set. So why not have the new way of configuration available as well, and
> change the input set using one API call instead of splitting a single
> configuration into two parts?
> 
Yes, if we support two ways, we at least need to ensure consistency.
In this patch set, I didn't add a new way to configure flexible selection and
mask setting. So to ensure consistency, I just removed the flexible payload in
this patch.

Thanks
Jingjing
> > And about your concern, "when application has to run on an old NIC and
> > on a new one": the rte_fdir_conf is for each eth_dev, so it will be fine.
> >

[dpdk-dev] [PATCH 1/3] i40e: enable extended tag

2016-01-26 Thread Zhang, Helin



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Monday, January 25, 2016 5:17 PM
> To: Zhang, Helin
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 1/3] i40e: enable extended tag
> 
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > 2015-12-21 10:38, Helin Zhang:
> > > > The PCIe 'Extended Tag' feature is important for 40G performance.
> > > > This patch enables it during each port initialization, to ensure
> > > > high performance.
> > >
> > > If it's so important, why are the values not documented?
> > > Please start to fill a file doc/guides/nics/i40e.rst to explain how
> > > the device works. Thanks
> >
> > It has already been mentioned in getting started guide for a long time.
> > Are you suggesting to move into i40e specifically? Thanks!
> 
> Yes you're right. I had forgotten that:
> http://dpdk.org/doc/guides-2.2/linux_gsg/enable_func.html#enabling-extended-tag-and-setting-max-read-request-size
> 
> Yes, it would be better moved into an i40e doc.
> And maybe max_read_request_size could be commented to give advice on
> values.
> Thanks

OK. Good idea, I will move it soon. Thanks!

Regards,
Helin
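
For reference, both settings discussed in this thread live in the PCIe Device Control register: bit 8 is Extended Tag Field Enable and bits 14:12 encode Max_Read_Request_Size as 128 bytes shifted left by the field value. The decoder below is a minimal sketch based on that register layout only; decode_devctl is a hypothetical helper, not an i40e or DPDK function, and it does not say which values are best for a given platform.

```python
EXT_TAG_ENABLE = 1 << 8          # Device Control bit 8
MRRS_SHIFT = 12                  # Max_Read_Request_Size is bits 14:12
MRRS_MASK = 0x7 << MRRS_SHIFT


def decode_devctl(value):
    """Decode extended tag and max read request size from a 16-bit
    PCIe Device Control register value (hypothetical helper)."""
    code = (value & MRRS_MASK) >> MRRS_SHIFT
    if code > 5:
        # Encodings 110b and 111b are reserved.
        raise ValueError("reserved Max_Read_Request_Size encoding: %d" % code)
    return {
        "extended_tag": bool(value & EXT_TAG_ENABLE),
        "max_read_request_bytes": 128 << code,
    }


if __name__ == "__main__":
    print(decode_devctl(0x2910))
```

A register value can be obtained with a tool such as setpci or lspci and passed to the decoder to confirm what the port initialization actually programmed.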
