date:20160621

[dpdk-dev] [PATCH v6 1/4] lib/librte_ether: support device reset

2016-06-21 Thread Lu, Wenzhuo

Hi Jerin,


> -----Original Message-----
> From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> Sent: Monday, June 20, 2016 5:14 PM
> To: Lu, Wenzhuo
> Cc: dev at dpdk.org; Ananyev, Konstantin; Richardson, Bruce; Chen, Jing D;
> Liang, Cunming; Wu, Jingjing; Zhang, Helin; thomas.monjalon at 6wind.com
> Subject: Re: [dpdk-dev] [PATCH v6 1/4] lib/librte_ether: support device reset
> 
> On Mon, Jun 20, 2016 at 02:24:27PM +0800, Wenzhuo Lu wrote:
> > Add an API to reset the device.
> > It is intended for a VF device in the kernel PF + DPDK VF scenario.
> > When the PF port goes down and comes back up, the APP should call this
> > API to reset the VF port. Most likely, the APP should call it from its
> > management thread and guarantee thread safety. That means the APP should
> > stop rx/tx and the device, then reset the device, then recover the
> > device and rx/tx.
> 
> The following is _a_ use case for device reset, but it may not be _the_ use
> case. IMO, we need to first state the expected behavior of this API and add
> a use case later.
Thanks for the suggestion, I'll reword it.

> 
> Another use case would be a PCIe VF with function-level reset for SR-IOV
> migration.
> Are we on the same page?
I'm not sure :) Does this SR-IOV migration mean the migration of a logical domain
that has a VF assigned to it?

> 
> >
> > Signed-off-by: Wenzhuo Lu 
> > ---
> >  doc/guides/nics/overview.rst           |  1 +
> >  lib/librte_ether/rte_ethdev.c          | 17 +++++++++++++++++
> >  lib/librte_ether/rte_ethdev.h          | 24 ++++++++++++++++++++++++
> >  lib/librte_ether/rte_ether_version.map |  7 +++++++
> >  4 files changed, 49 insertions(+)
> >
> > diff --git a/doc/guides/nics/overview.rst b/doc/guides/nics/overview.rst
> > index 0bd8fae..c8a4985 100644
> > --- a/doc/guides/nics/overview.rst
> > +++ b/doc/guides/nics/overview.rst
> > @@ -89,6 +89,7 @@ Most of these differences are summarized below.
> > Speed capabilities
> > Link statusY Y   Y Y   Y Y Y Y   Y Y Y Y Y Y
> >  Y Y   Y Y Y Y
> > Link status event  Y Y Y Y Y Y   Y Y Y Y
> >  Y Y Y
> > +   Link reset   Y Y   Y Y Y
> 
> More appropriate would be "Device reset" ? Right?
Yes, sounds better :)

> 
> > Queue status event  
> >  Y
> > Rx interrupt   Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y
> > Queue start/stop Y   Y Y Y Y Y Y Y Y Y Y Y Y Y Y
> >Y   Y Y
> > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> > index e148028..6c0449b 100644
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -3346,3 +3346,20 @@ rte_eth_dev_l2_tunnel_offload_set(uint8_t port_id,
> >  				-ENOTSUP);
> >  	return (*dev->dev_ops->l2_tunnel_offload_set)(dev, l2_tunnel, mask, en);
> >  }
> > +
> > +int
> > +rte_eth_dev_reset(uint8_t port_id)
> > +{
> > +   struct rte_eth_dev *dev;
> > +   int diag;
> > +
> > +   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > +
> > +   dev = &rte_eth_devices[port_id];
> > +
> > +   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_reset, -ENOTSUP);
> > +
> > +   diag = (*dev->dev_ops->dev_reset)(dev);
> > +
> > +   return diag;
> > +}
> > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > index 2757510..5b3ba12 100644
> > --- a/lib/librte_ether/rte_ethdev.h
> > +++ b/lib/librte_ether/rte_ethdev.h
> > @@ -1318,6 +1318,9 @@ typedef int (*eth_l2_tunnel_offload_set_t)
> >  uint8_t en);
> >  /**< @internal enable/disable the l2 tunnel offload functions */
> >
> > +typedef int (*eth_dev_reset_t)(struct rte_eth_dev *dev);
> > +/**< @internal Function used to reset a configured Ethernet device. */
> > +
> >  #ifdef RTE_NIC_BYPASS
> >
> >  enum {
> > @@ -1508,6 +1511,8 @@ struct eth_dev_ops {
> > eth_l2_tunnel_eth_type_conf_t l2_tunnel_eth_type_conf;
> > /** Enable/disable l2 tunnel offload functions */
> > eth_l2_tunnel_offload_set_t l2_tunnel_offload_set;
> > +   /** Reset device. */
> > +   eth_dev_reset_t dev_reset;
> >  };
> >
> >  /**
> > @@ -4253,6 +4258,25 @@ rte_eth_dev_l2_tunnel_offload_set(uint8_t port_id,
> >   uint32_t mask,
> >   uint8_t en);
> >
> > +/**
> > + * Reset an ethernet device when it's not working. One scenario is, after PF
> > + * port is down and up, the related VF port should be reset.
> > + * The API will stop the port, clear the rx/tx queues, re-setup the rx/tx
> > + * queues, restart the port.
> > + * Before calling this API, APP should stop the rx/tx. When tx is being stopped,
> > + * APP can drop the packets and release the buffer instead of sending them.
> 
> Same as first comment.
I'll reword it.

> 
> > + *
> > + * @param port_id
> > + *   The port identifier of the Ethernet device.
> > + *
> > + * @return
> > + *   - (0) if success

[dpdk-dev] [PATCH v5 1/2] app/test: reworks the crypto AES unit test

2016-06-21 Thread Chen, Zhaoyan

Tested-by: Chen, Zhaoyan

* Commit: 3901ed99c2f82d3e979bb1bea001d61898241829
* Patch Apply: Success
* Compilation: Success
* Kernel/OS: 3.11.10-301.fc20.x86_64
* GCC: 4.8.3 20140911

* Case 1
./app/test -cf -n4
cryptodev_aesni_mb_autotest

Verified that the AES-128-CBC / HMAC-224/384 unit tests have been added.

* Case 2
./app/test -cf -n4
crypto_qat_autotest
cryptodev_aesni_gcm_autotest
cryptodev_aesni_mb_perftest
cryptodev_qat_perftest
cryptodev_null_autotest

Checked all cryptodev unit tests run smoothly.

Thanks,
Joey


> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of De Lara Guarch, Pablo
> Sent: Wednesday, June 15, 2016 6:41 PM
> To: Zhang, Roy Fan; dev at dpdk.org
> Cc: Doherty, Declan
> Subject: Re: [dpdk-dev] [PATCH v5 1/2] app/test: reworks the crypto AES unit test
> 
> 
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Fan Zhang
> > Sent: Tuesday, June 14, 2016 10:57 AM
> > To: dev at dpdk.org
> > Cc: Doherty, Declan
> > Subject: [dpdk-dev] [PATCH v5 1/2] app/test: reworks the crypto AES unit test
> >
> > This patch reworks the crypto AES unit test by introducing a new
> > unified test function
> >
> > Signed-off-by: Fan Zhang 
> > ---
> >  app/test/Makefile                              |    1 +
> >  app/test/test_cryptodev.c                      | 1613 ++--
> >  app/test/test_cryptodev_aes.c                  |  663 ++
> >  app/test/test_cryptodev_aes.h                  |  828
> >  app/test/test_cryptodev_aes_ctr_test_vectors.h |  257
> >  5 files changed, 1614 insertions(+), 1748 deletions(-)
> >  create mode 100644 app/test/test_cryptodev_aes.c
> >  create mode 100644 app/test/test_cryptodev_aes.h
> >  delete mode 100644 app/test/test_cryptodev_aes_ctr_test_vectors.h
> >
> 
> [...]
> 
> > diff --git a/app/test/test_cryptodev_aes.c b/app/test/test_cryptodev_aes.c
> > new file mode 100644
> > index 000..8c43441
> > --- /dev/null
> > +++ b/app/test/test_cryptodev_aes.c
> > @@ -0,0 +1,663 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2015-2016 Intel Corporation. All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *  * Redistributions of source code must retain the above copyright
> > + *notice, this list of conditions and the following disclaimer.
> > + *  * Redistributions in binary form must reproduce the above copyright
> > + *notice, this list of conditions and the following disclaimer in
> > + *the documentation and/or other materials provided with the
> > + *distribution.
> > + *  * Neither the name of Intel Corporation nor the names of its
> > + *contributors may be used to endorse or promote products derived
> > + *from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#include "test.h"
> > +#include "test_cryptodev_aes.h"
> > +
> > +#ifndef MAX_N_AES_TESTS
> > +#define MAX_N_AES_TESTS    256
> > +#endif
> 
> This macro is not used anywhere.
> 
> > +
> > +#ifndef AES_TEST_MSG_LEN
> > +#define AES_TEST_MSG_LEN   256
> > +#endif
> > +
> > +#define AES_TEST_OP_ENCRYPT        0x01
> > +#define AES_TEST_OP_DECRYPT        0x02
> > +#define AES_TEST_OP_AUTH_GEN       0x04
> > +#define AES_TEST_OP_AUTH_VERIFY    0x08

[dpdk-dev] [PATCH v3 1/2] mempool: add stack (lifo) mempool handler

2016-06-21 Thread Jerin Jacob

On Mon, Jun 20, 2016 at 05:56:40PM +0000, Ananyev, Konstantin wrote:
> 
> 
> > -----Original Message-----
> > From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> > Sent: Monday, June 20, 2016 3:22 PM
> > To: Ananyev, Konstantin
> > Cc: Thomas Monjalon; dev at dpdk.org; Hunt, David; olivier.matz at 6wind.com;
> > viktorin at rehivetech.com; shreyansh.jain at nxp.com
> > Subject: Re: [dpdk-dev] [PATCH v3 1/2] mempool: add stack (lifo) mempool handler
> > 
> > On Mon, Jun 20, 2016 at 01:58:04PM +0000, Ananyev, Konstantin wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> > > > Sent: Monday, June 20, 2016 2:54 PM
> > > > To: Jerin Jacob
> > > > Cc: dev at dpdk.org; Hunt, David; olivier.matz at 6wind.com;
> > > > viktorin at rehivetech.com; shreyansh.jain at nxp.com
> > > > Subject: Re: [dpdk-dev] [PATCH v3 1/2] mempool: add stack (lifo) mempool handler
> > > >
> > > > 2016-06-20 18:55, Jerin Jacob:
> > > > > On Mon, Jun 20, 2016 at 02:08:10PM +0100, David Hunt wrote:
> > > > > > This is a mempool handler that is useful for pipelining apps, where
> > > > > > the mempool cache doesn't really work - for example, where we have one
> > > > > > core doing Rx (and alloc), and another core doing Tx (and return). In
> > > > > > such a case, the mempool ring simply cycles through all the mbufs,
> > > > > > resulting in an LLC miss on every mbuf allocated when the number of
> > > > > > mbufs is large. A stack recycles buffers more effectively in this
> > > > > > case.
> > > > > >
> > > > > > Signed-off-by: David Hunt 
> > > > > > ---
> > > > > >  lib/librte_mempool/Makefile|   1 +
> > > > > >  lib/librte_mempool/rte_mempool_stack.c | 145 +
> > > > >
> > > > > How about moving new mempool handlers to drivers/mempool (or
> > > > > similar)? In the future, adding HW-specific handlers in
> > > > > lib/librte_mempool/ may be a bad idea.
> > > >
> > > > You're probably right.
> > > > However we need to check and understand what a HW mempool handler will be.
> > > > I imagine the first of them will have to move handlers in drivers/
> > >
> > > Does it mean we'll have to move mbuf into drivers too?
> > > Again other libs do use mempool too.
> > > Why not just lib/librte_mempool/arch/
> > > ?
> > 
> > I was proposing to move only the new
> > handler (lib/librte_mempool/rte_mempool_stack.c), not any library or any
> > other common code.
> > 
> > Just like a DPDK crypto device: even if it is a software implementation, it's
> > better to move it into drivers/crypto instead of lib/librte_cryptodev.
> > 
> > "lib/librte_mempool/arch/" is not the correct place, as it is platform specific,
> > not architecture specific, and a HW mempool device may be a PCIe or platform
> > device.
> 
> Ok, but why rte_mempool_stack.c has to be moved?

Just thought of having all the mempool handlers in one place.
We can't move all HW mempool handlers to lib/librte_mempool/

Jerin

> I can hardly imagine it is 'platform specific'.
> From my understanding it is a generic code.
> Konstantin
> 
> 
> > 
> > > Konstantin
> > >
> > >
> > > > Jerin, are you volunteering?

[dpdk-dev] [PATCH v3 1/2] mempool: add stack (lifo) mempool handler

2016-06-21 Thread Jerin Jacob

On Mon, Jun 20, 2016 at 03:54:20PM +0200, Thomas Monjalon wrote:
> 2016-06-20 18:55, Jerin Jacob:
> > On Mon, Jun 20, 2016 at 02:08:10PM +0100, David Hunt wrote:
> > > This is a mempool handler that is useful for pipelining apps, where
> > > the mempool cache doesn't really work - for example, where we have one
> > > core doing Rx (and alloc), and another core doing Tx (and return). In
> > > such a case, the mempool ring simply cycles through all the mbufs,
> > > resulting in an LLC miss on every mbuf allocated when the number of
> > > mbufs is large. A stack recycles buffers more effectively in this
> > > case.
> > > 
> > > Signed-off-by: David Hunt 
> > > ---
> > >  lib/librte_mempool/Makefile|   1 +
> > >  lib/librte_mempool/rte_mempool_stack.c | 145 +
> > 
> > How about moving new mempool handlers to drivers/mempool (or similar)?
> > In the future, adding HW-specific handlers in lib/librte_mempool/ may be
> > a bad idea.
> 
> You're probably right.
> However we need to check and understand what a HW mempool handler will be.
> I imagine the first of them will have to move handlers in drivers/
> Jerin, are you volunteering?

Thomas,

We are planning to upstream a HW based mempool handler.
Not sure about the timelines. We will take up this as part of a HW
based mempool upstreaming if no one takes it before that.

Jerin
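
For readers following the thread, the recycling behaviour David describes for a stack (LIFO) pool can be sketched in a few lines of C. This is a minimal single-threaded sketch and not the code in rte_mempool_stack.c; the names sketch_lifo_pool, sketch_lifo_put and sketch_lifo_get are hypothetical, and a real handler would need locking and bulk operations.

```c
#include <stddef.h>

#define SKETCH_POOL_CAPACITY 8

/* Hypothetical LIFO pool: a fixed array used as a stack of object pointers. */
struct sketch_lifo_pool {
	void *objs[SKETCH_POOL_CAPACITY];
	size_t len; /* number of objects currently stored */
};

/* Return an object to the pool; -1 if the pool is full. */
static int sketch_lifo_put(struct sketch_lifo_pool *p, void *obj)
{
	if (p->len == SKETCH_POOL_CAPACITY)
		return -1;
	p->objs[p->len++] = obj;
	return 0;
}

/* Take the most recently returned object; NULL if the pool is empty. */
static void *sketch_lifo_get(struct sketch_lifo_pool *p)
{
	if (p->len == 0)
		return NULL;
	return p->objs[--p->len];
}

/* Returns 1 if a buffer that was just freed is the next one allocated. */
static int sketch_lifo_reuses_last_freed(void)
{
	static char bufs[4][64];
	struct sketch_lifo_pool pool = { .len = 0 };
	void *freed, *next;
	size_t i;

	for (i = 0; i < 4; i++)
		sketch_lifo_put(&pool, bufs[i]);

	freed = sketch_lifo_get(&pool);  /* "Rx core" allocates a buffer */
	sketch_lifo_put(&pool, freed);   /* "Tx core" returns it */
	next = sketch_lifo_get(&pool);   /* next allocation */
	return next == freed;
}
```

With a FIFO ring the same sequence would hand out the oldest buffer instead, which is the cycling through all mbufs that the commit message refers to.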

[dpdk-dev] [PATCH v6 1/4] lib/librte_ether: support device reset

2016-06-21 Thread Jerin Jacob

On Mon, Jun 20, 2016 at 09:17:14AM -0700, Stephen Hemminger wrote:
> On Mon, 20 Jun 2016 14:44:11 +0530
> Jerin Jacob  wrote:
> 
> > On Mon, Jun 20, 2016 at 02:24:27PM +0800, Wenzhuo Lu wrote:
> > > Add an API to reset the device.
> > > It is intended for a VF device in the kernel PF + DPDK VF scenario.
> > > When the PF port goes down and comes back up, the APP should call
> > > this API to reset the VF port. Most likely, the APP should call it
> > > from its management thread and guarantee thread safety. That means
> > > the APP should stop rx/tx and the device, then reset the
> > > device, then recover the device and rx/tx.
> > 
> > The following is _a_ use case for device reset, but it may not be _the_
> > use case. IMO, we need to first state the expected behavior of this API
> > and add a use case later.
> > 
> > Another use case would be a PCIe VF with function-level reset for SR-IOV
> > migration.
> > Are we on the same page?
> 
> 
> In my experience with Linux devices, this is normally handled by the
> device driver in the start routine.  Since any use case which needs
> this is going to do a stop/reset/start sequence, why not just have
> the VF device driver do this in the start routine?
> 
> Adding yet another API and state transition, if not necessary, increases
> the complexity and required test cases for all devices.

I agree with Stephen here. I think if the application needs to call start
after the device reset, then we could add this logic to start itself
rather than exposing yet another API.

>

[dpdk-dev] [PATCH v3] i40e: fix olflags for vector Rx

2016-06-21 Thread Peng, Yuan

Tested-by: Peng Yuan 

- Test Commit: 04920e693a053a923f94c271ee68881756649cec
- OS/Kernel: Fedora 23/4.2.3
- GCC: gcc version 5.3.1 20151207 (Red Hat 5.3.1-2) (GCC)
- CPU: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
- Total 1 cases, 1 passed, 0 failed.

Case1: read RSS HASH and RSS queue in the received packet.  Passed.

DUT:
./tools/dpdk_nic_bind.py --bind=igb_uio 0000:82:00.0 0000:82:00.1
./x86_64-native-linuxapp-gcc/app/testpmd -c f -n 4 -- -i --coremask=0xe --portmask=0x3 --rxq=16 --txq=16 --txqflags=0
testpmd> set verbose 8
testpmd> set fwd rxonly
testpmd> port stop all
testpmd> set_hash_global_config 0 toeplitz ipv4-udp enable
testpmd> port start all
testpmd> port config all rss udp
testpmd> start

tester:
scapy
>>> sendp([Ether(dst="00:00:00:00:01:00", 
>>> src=get_if_hwaddr("enp132s0f1"))/IP(src="192.168.0.1", 
>>> dst="192.168.0.2")/UDP(sport=1024,dport=1024)], iface="enp132s0f1")

When tested at commit 04920e693a053a923f94c271ee68881756649cec (without the
patch), the DUT receives the packet:
testpmd> port 0/queue 1: received 1 packets
  src=00:00:00:00:01:01 - dst=00:00:00:00:01:00 - type=0x0800 - length=60 - 
nb_segs=1 - FDIR matched hash=0xc3f2 ID=0x5263 Unknown packet type
 - Receive queue=0x1
  PKT_RX_FDIR
The RSS hash and RSS queue are not reported.

When tested with [PATCH v3] i40e: fix olflags for vector Rx applied, the DUT receives the packet:
testpmd> port 0/queue 1: received 1 packets
  src=00:00:00:00:01:01 - dst=00:00:00:00:01:00 - type=0x0800 - length=60 - 
nb_segs=1 - RSS hash=0x5263c3f2 - RSS queue=0x1 - (outer) L2 type: ETHER - 
(outer) L3 type: IPV4_EXT_UNKNOWN - (outer) L4 type: UDP - Tunnel type: Unknown 
- Inner L2 type: Unknown - Inner L3 type: Unknown - Inner L4 type: Unknown
 - Receive queue=0x1
  PKT_RX_RSS_HASH
The output now reports RSS hash=0x5263c3f2 and RSS queue=0x1.

The case was run in the default settings: CONFIG_RTE_LIBRTE_I40E_INC_VECTOR=y

so the issue has been fixed.

Thank you.
Yuan.

-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Zhe Tao
Sent: Tuesday, June 14, 2016 1:24 PM
To: dev at dpdk.org
Cc: Tao, Zhe ; Wu, Jingjing 
Subject: [dpdk-dev] [PATCH v3] i40e: fix olflags for vector Rx

Problem:
The flags for RSS and flow director are not set correctly in the vector Rx
function, so an upper-layer APP that relies on these flags will not work
correctly.

Fix this problem by changing the shuffle table; the original shuffle table is
not correct.

Fixes: 9ed94e5bb04e ("i40e: add vector Rx")

Signed-off-by: Zhe Tao 
---
v2: Changed the comments according to the code change.
v3: Fixed the issues reported by check-git-log.sh.

 drivers/net/i40e/i40e_rxtx_vec.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/net/i40e/i40e_rxtx_vec.c b/drivers/net/i40e/i40e_rxtx_vec.c
index eef80d9..704924f 100644
--- a/drivers/net/i40e/i40e_rxtx_vec.c
+++ b/drivers/net/i40e/i40e_rxtx_vec.c
@@ -144,12 +144,13 @@ desc_to_olflags_v(__m128i descs[4], struct rte_mbuf **rx_pkts)
uint64_t dword;
} vol;

-   /* mask everything except rss and vlan flags
-   *bit2 is for vlan tag, bits 13:12 for rss
-   */
+   /* mask everything except RSS, flow director and VLAN flags
+* bit2 is for VLAN tag, bit11 for flow director indication
+* bit13:12 for RSS indication.
+*/
const __m128i rss_vlan_msk = _mm_set_epi16(
0x0000, 0x0000, 0x0000, 0x0000,
-   0x3004, 0x3004, 0x3004, 0x3004);
+   0x3804, 0x3804, 0x3804, 0x3804);

/* map rss and vlan type to rss hash and vlan flag */
const __m128i vlan_flags = _mm_set_epi8(0, 0, 0, 0,
@@ -159,8 +160,8 @@ desc_to_olflags_v(__m128i descs[4], struct rte_mbuf **rx_pkts)

const __m128i rss_flags = _mm_set_epi8(0, 0, 0, 0,
0, 0, 0, 0,
-   0, 0, 0, 0,
-   PKT_RX_FDIR, 0, PKT_RX_RSS_HASH, 0);
+   PKT_RX_RSS_HASH | PKT_RX_FDIR, PKT_RX_RSS_HASH, 0, 0,
+   0, 0, PKT_RX_FDIR, 0);

vlan0 = _mm_unpackhi_epi16(descs[0], descs[1]);
vlan1 = _mm_unpackhi_epi16(descs[2], descs[3]);
@@ -169,7 +170,7 @@ desc_to_olflags_v(__m128i descs[4], struct rte_mbuf **rx_pkts)
vlan1 = _mm_and_si128(vlan0, rss_vlan_msk);
vlan0 = _mm_shuffle_epi8(vlan_flags, vlan1);

-   rss = _mm_srli_epi16(vlan1, 12);
+   rss = _mm_srli_epi16(vlan1, 11);
rss = _mm_shuffle_epi8(rss_flags, rss);

vlan0 = _mm_or_si128(vlan0, rss);
--
2.1.4
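
A scalar model may help when reviewing the table change above. The sketch below mirrors the mask, shift and lookup-table values from the diff for a single 16-bit status word; it is not the vector code itself. The function names are hypothetical and the two SKETCH_RX_* values are stand-ins chosen for illustration, not the real PKT_RX_* definitions from rte_mbuf.h.

```c
#include <stdint.h>

/* Stand-in flag values for illustration only. */
#define SKETCH_RX_RSS_HASH 0x02u
#define SKETCH_RX_FDIR     0x04u

/* Before the patch: mask 0x3004, shift 12, index = descriptor bits 13:12. */
static uint8_t sketch_olflags_before(uint16_t status)
{
	static const uint8_t table[4] = {
		[1] = SKETCH_RX_RSS_HASH,
		[3] = SKETCH_RX_FDIR,
	};

	return table[(status & 0x3004u) >> 12];
}

/*
 * After the patch: mask 0x3804, shift 11, index = descriptor bits 13:11.
 * Bit 0 of the index is the flow director indication (bit 11) and
 * bits 2:1 are the RSS indication (bits 13:12).
 */
static uint8_t sketch_olflags_after(uint16_t status)
{
	static const uint8_t table[8] = {
		[1] = SKETCH_RX_FDIR,
		[6] = SKETCH_RX_RSS_HASH,
		[7] = SKETCH_RX_RSS_HASH | SKETCH_RX_FDIR,
	};

	return table[(status & 0x3804u) >> 11];
}
```

For a status word with bits 13:12 set and bit 11 clear (0x3000), the old table yields the flow director flag while the new one yields the RSS hash flag, which appears consistent with the testpmd output quoted earlier in this message.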

[dpdk-dev] [PATCH] bnx2x: Correctly determine MSIX vector count

2016-06-21 Thread Harish Patil

>
>From: "Charles (Chas) Williams" 
>
>If MSIX is available, the vector count given by the table size is one
>less than the actual count.  This count also limits the receive and
>transmit queue resources the VF can support.
>
>Fixes: 540a211084a7 ("bnx2x: driver core")
>
>Signed-off-by: Chas Williams 
>---
> drivers/net/bnx2x/bnx2x.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
>diff --git a/drivers/net/bnx2x/bnx2x.c b/drivers/net/bnx2x/bnx2x.c
>index 6edb2f9..4be732f 100644
>--- a/drivers/net/bnx2x/bnx2x.c
>+++ b/drivers/net/bnx2x/bnx2x.c
>@@ -9570,8 +9570,10 @@ static int bnx2x_pci_get_caps(struct bnx2x_softc *sc)
> static void bnx2x_init_rte(struct bnx2x_softc *sc)
> {
>   if (IS_VF(sc)) {
>-  sc->max_tx_queues = BNX2X_VF_MAX_QUEUES_PER_VF;
>-  sc->max_rx_queues = BNX2X_VF_MAX_QUEUES_PER_VF;
>+  sc->max_tx_queues = min(BNX2X_VF_MAX_QUEUES_PER_VF,
>+  sc->igu_sb_cnt);
>+  sc->max_rx_queues = min(BNX2X_VF_MAX_QUEUES_PER_VF,
>+  sc->igu_sb_cnt);
>   } else {
>   sc->max_tx_queues = 128;
>   sc->max_rx_queues = 128;
>@@ -9713,7 +9715,7 @@ int bnx2x_attach(struct bnx2x_softc *sc)
>   pci_read(sc,
>(sc->devinfo.pcie_msix_cap_reg + PCIR_MSIX_CTRL), &val,
>2);
>-  sc->igu_sb_cnt = (val & PCIM_MSIXCTRL_TABLE_SIZE);
>+  sc->igu_sb_cnt = (val & PCIM_MSIXCTRL_TABLE_SIZE) + 1;
>   } else {
>   sc->igu_sb_cnt = 1;
>   }
>-- 
>2.5.5
>
>

Acked-by: Harish Patil
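
The off-by-one in the patch follows from how the MSI-X Message Control register encodes the table size: the low 11 bits hold the number of entries minus one. A minimal sketch of the two calculations the patch makes is below; the macro and function names are hypothetical, and the mask value 0x07FF is the standard Table Size field, which is assumed here to match PCIM_MSIXCTRL_TABLE_SIZE.

```c
#include <stdint.h>

/* Table Size field of the MSI-X Message Control register, stored as N - 1. */
#define SKETCH_MSIX_TABLE_SIZE_MASK 0x07FFu

/* Decode the actual number of MSI-X vectors from Message Control. */
static uint16_t sketch_msix_vector_count(uint16_t msg_ctrl)
{
	return (uint16_t)((msg_ctrl & SKETCH_MSIX_TABLE_SIZE_MASK) + 1u);
}

/* A VF cannot use more queues than it has vectors for. */
static uint16_t sketch_vf_max_queues(uint16_t per_vf_limit, uint16_t vectors)
{
	return per_vf_limit < vectors ? per_vf_limit : vectors;
}
```

A device advertising a field value of 0 therefore has one vector, not zero, which is why the uncorrected count under-reports by one.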

[dpdk-dev] [DPDK16.04: Error While compiling]

2016-06-21 Thread amartya....@wipro.com

Hi,

I am facing a compilation error for DPDK 16.04, as shown below:
In file included from 
/home/cran/dpdk-16.04/lib/librte_eal/common/eal_common_options.c:52:0:
/home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:673:9: 
error: called from here
 _mm_storeu_si128((__m128i *)((uint8_t *)dst + 6 * 16), 
_mm_alignr_epi8(xmm7, xmm6, offset));\
 ^
/home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:730:16: 
note: in expansion of macro 'MOVEUNALIGNED_LEFT47_IMM'
 case 0x0F: MOVEUNALIGNED_LEFT47_IMM(dst, src, n, 0x0F); break;\
^
/home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:870:2: 
note: in expansion of macro 'MOVEUNALIGNED_LEFT47'
  MOVEUNALIGNED_LEFT47(dst, src, n, srcofs);
  ^
In file included from 
/usr/lib/gcc/x86_64-redhat-linux/4.9.2/include/x86intrin.h:37:0,
 from 
/home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_vect.h:67,
 from 
/home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:46,
 from 
/home/cran/dpdk-16.04/lib/librte_eal/common/eal_common_options.c:52:
/usr/lib/gcc/x86_64-redhat-linux/4.9.2/include/tmmintrin.h:185:1: error: 
inlining failed in call to always_inline '_mm_alignr_epi8': target specific 
option mismatch
_mm_alignr_epi8(__m128i __X, __m128i __Y, const int __N)
^
In file included from 
/home/cran/dpdk-16.04/lib/librte_eal/common/eal_common_options.c:52:0:
/home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:674:9: 
error: called from here
 _mm_storeu_si128((__m128i *)((uint8_t *)dst + 7 * 16), 
_mm_alignr_epi8(xmm8, xmm7, offset));\
 ^
/home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:730:16: 
note: in expansion of macro 'MOVEUNALIGNED_LEFT47_IMM'
 case 0x0F: MOVEUNALIGNED_LEFT47_IMM(dst, src, n, 0x0F); break;\
^
/home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:870:2: 
note: in expansion of macro 'MOVEUNALIGNED_LEFT47'
  MOVEUNALIGNED_LEFT47(dst, src, n, srcofs);
  ^
In file included from 
/usr/lib/gcc/x86_64-redhat-linux/4.9.2/include/x86intrin.h:37:0,
 from 
/home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_vect.h:67,
 from 
/home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:46,
 from 
/home/cran/dpdk-16.04/lib/librte_eal/common/eal_common_options.c:52:
/usr/lib/gcc/x86_64-redhat-linux/4.9.2/include/tmmintrin.h:185:1: error: 
inlining failed in call to always_inline '_mm_alignr_epi8': target specific 
option mismatch
_mm_alignr_epi8(__m128i __X, __m128i __Y, const int __N)
^
In file included from 
/home/cran/dpdk-16.04/lib/librte_eal/common/eal_common_options.c:52:0:
/home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:689:13: 
error: called from here
 _mm_storeu_si128((__m128i *)((uint8_t *)dst + 0 * 16), 
_mm_alignr_epi8(xmm1, xmm0, offset));\
 ^
/home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:730:16: 
note: in expansion of macro 'MOVEUNALIGNED_LEFT47_IMM'
 case 0x0F: MOVEUNALIGNED_LEFT47_IMM(dst, src, n, 0x0F); break;\
^
/home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:870:2: 
note: in expansion of macro 'MOVEUNALIGNED_LEFT47'
  MOVEUNALIGNED_LEFT47(dst, src, n, srcofs);
  ^
In file included from 
/usr/lib/gcc/x86_64-redhat-linux/4.9.2/include/x86intrin.h:37:0,
 from 
/home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_vect.h:67,
 from 
/home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:46,
 from 
/home/cran/dpdk-16.04/lib/librte_eal/common/eal_common_options.c:52:
/usr/lib/gcc/x86_64-redhat-linux/4.9.2/include/tmmintrin.h:185:1: error: 
inlining failed in call to always_inline '_mm_alignr_epi8': target specific 
option mismatch
_mm_alignr_epi8(__m128i __X, __m128i __Y, const int __N)
^
In file included from 
/home/cran/dpdk-16.04/lib/librte_eal/common/eal_common_options.c:52:0:
/home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:690:13: 
error: called from here
 _mm_storeu_si128((__m128i *)((uint8_t *)dst + 1 * 16), 
_mm_alignr_epi8(xmm2, xmm1, offset));\
 ^
/home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:730:16: 
note: in expansion of macro 'MOVEUNALIGNED_LEFT47_IMM'
 case 0x0F: MOVEUNALIGNED_LEFT47_IMM(dst, src, n, 0x0F); break;\
^
/home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:870:2: 
note: in expansion of macro 'MOVEUNALIGNED_LEFT47'
  MOVEUNALIGNED_LEFT47(dst, src, n, srcofs);
  ^
/home/cran/dpdk-16.04/mk/internal/rte.compile-pre.mk:129: recipe for target 
'eal_common_options.o' failed
make[7]: *** [eal_common_options.o] Error 1
/home/cran/dpdk-16.04/mk/rte.subdir.mk:61: recipe for target 'eal' failed
make[6]: *** [eal] Err

[dpdk-dev] [PATCH v6 1/4] lib/librte_ether: support device reset

2016-06-21 Thread Lu, Wenzhuo

Hi Jerin, Stephen,


> -----Original Message-----
> From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> Sent: Tuesday, June 21, 2016 11:51 AM
> To: Stephen Hemminger
> Cc: Lu, Wenzhuo; dev at dpdk.org; Ananyev, Konstantin; Richardson, Bruce;
> Chen, Jing D; Liang, Cunming; Wu, Jingjing; Zhang, Helin;
> thomas.monjalon at 6wind.com
> Subject: Re: [dpdk-dev] [PATCH v6 1/4] lib/librte_ether: support device reset
> 
> On Mon, Jun 20, 2016 at 09:17:14AM -0700, Stephen Hemminger wrote:
> > On Mon, 20 Jun 2016 14:44:11 +0530
> > Jerin Jacob  wrote:
> >
> > > On Mon, Jun 20, 2016 at 02:24:27PM +0800, Wenzhuo Lu wrote:
> > > > Add an API to reset the device.
> > > > It is intended for a VF device in the kernel PF + DPDK VF scenario.
> > > > When the PF port goes down and comes back up, the APP should call
> > > > this API to reset the VF port. Most likely, the APP should call it
> > > > from its management thread and guarantee thread safety. That means
> > > > the APP should stop rx/tx and the device, then reset the device,
> > > > then recover the device and rx/tx.
> > >
> > > The following is _a_ use case for device reset, but it may not be _the_
> > > use case. IMO, we need to first state the expected behavior of this API
> > > and add a use case later.
> > >
> > > Another use case would be a PCIe VF with function-level reset for
> > > SR-IOV migration.
> > > Are we on the same page?
> >
> >
> > In my experience with Linux devices, this is normally handled by the
> > device driver in the start routine.  Since any use case which needs
> > this is going to do a stop/reset/start sequence, why not just have the
> > VF device driver do this in the start routine?
> >
> > Adding yet another API and state transition, if not necessary,
> > increases the complexity and required test cases for all devices.
> 
> I agree with Stephen here. I think if the application needs to call start after
> the device reset, then we could add this logic to start itself rather than
> exposing yet another API.
Do you mean changing device_start to include all these actions: stop device
-> stop queue -> re-setup queue -> start queue -> start device?
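
To make the question concrete, here is one possible reading of the "do it in start" suggestion, written as a toy state machine. It is only a sketch under assumptions: none of these names are DPDK APIs, the pf_bounced flag stands in for however a driver would learn that the PF went down and up, and the queue re-setup steps are collapsed into a single reset function.

```c
#include <stdbool.h>

/* Hypothetical VF device state, for illustration only. */
struct sketch_vf_dev {
	bool started;     /* device is running */
	bool pf_bounced;  /* PF went down and up since the last reset */
	int resets_done;  /* number of resets performed so far */
};

/* Stop the device; the application is assumed to have quiesced rx/tx. */
static void sketch_vf_stop(struct sketch_vf_dev *dev)
{
	dev->started = false;
}

/* Reset is only legal on a stopped device; returns -1 otherwise. */
static int sketch_vf_reset(struct sketch_vf_dev *dev)
{
	if (dev->started)
		return -1;
	/* queue teardown and re-setup would happen here */
	dev->pf_bounced = false;
	dev->resets_done++;
	return 0;
}

/* Start folds the reset in, so no separate reset API is exposed. */
static int sketch_vf_start(struct sketch_vf_dev *dev)
{
	if (dev->pf_bounced) {
		int rc = sketch_vf_reset(dev);

		if (rc != 0)
			return rc;
	}
	dev->started = true;
	return 0;
}
```

In this reading the application still calls stop and then start, and the driver decides inside start whether a reset is needed; whether that matches what Stephen and Jerin have in mind is exactly the open question.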

> 
> >

[dpdk-dev] [PATCH / RFC] sched: Correct subport calcuation

2016-06-21 Thread Simon Kågström

Hi again!

Any news about this patch? I'm going on parental leave starting next week
(until January), so any comments (or simply dropping it!) would be good
to have before that :-)

// Simon

On 2016-06-10 08:29, Simon Kagstrom wrote:
> Signed-off-by: Simon Kagstrom 
> ---
> I'm a total newbie to the rte_sched design and implementation, so I've
> added the RFC.
> 
> We get crashes (at other places in the scheduler) without this code.
> 
>  lib/librte_sched/rte_sched.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
> index 1609ea8..b46ecfb 100644
> --- a/lib/librte_sched/rte_sched.c
> +++ b/lib/librte_sched/rte_sched.c
> @@ -1869,7 +1869,7 @@ grinder_next_pipe(struct rte_sched_port *port, uint32_t pos)
>  
>   /* Install new pipe in the grinder */
>   grinder->pindex = pipe_qindex >> 4;
> - grinder->subport = port->subport + (grinder->pindex / port->n_pipes_per_subport);
> + grinder->subport = port->subport + (grinder->pindex / port->n_subports_per_port);
>   grinder->pipe = port->pipe + grinder->pindex;
>   grinder->pipe_params = NULL; /* to be set after the pipe structure is prefetched */
>   grinder->productive = 0;
>
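
Since this is an RFC, a small arithmetic check may help reviewers compare the two divisors. The sketch below assumes pipes are numbered contiguously across the port, so that pipe index p sits in subport p / n_pipes_per_subport; that layout is an assumption that is not stated in this thread and should be checked against rte_sched.c. The function names are hypothetical.

```c
#include <stdint.h>

/* Subport index as computed before the patch. */
static uint32_t sketch_subport_by_pipes_per_subport(uint32_t pindex,
		uint32_t n_pipes_per_subport)
{
	return pindex / n_pipes_per_subport;
}

/* Subport index as computed with the patch applied. */
static uint32_t sketch_subport_by_subports_per_port(uint32_t pindex,
		uint32_t n_subports_per_port)
{
	return pindex / n_subports_per_port;
}
```

For a port with 2 subports of 4 pipes each, the last pipe (index 7) gives 7 / 4 = 1 with the first form and 7 / 2 = 3 with the second. Under the contiguous-numbering assumption the two forms agree only when both counts are equal, so the configuration that triggers the reported crashes would be useful to know.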

[dpdk-dev] [PATCH v2 00/25] Refactor mlx5 to improve performance

2016-06-21 Thread Nélio Laranjeiro

Hi Ferruh,

On Mon, Jun 20, 2016 at 06:01:49PM +0100, Ferruh Yigit wrote:
> Hi Nelio,
> 
> On 6/20/2016 5:10 PM, Nelio Laranjeiro wrote:
> > Enhance mlx5 with a data path that bypasses Verbs.
> > 
> > The first half of this patchset removes support for functionality completely
> > rewritten in the second half (scatter/gather, inline send), while the data
> > path is refactored without Verbs.
> > 
> > The PMD remains usable during the transition.
> > 
> > This patchset must be applied after "Miscellaneous fixes for mlx4 and mlx5".
> > 
> > Changes in v2:
> > - Rebased patchset on top of dpdk/master.
> 
> This is driver patch, and should be rebased on top of
> dpdk-next-net/rel_16_07.

I just applied it on this branch; in fact, some of the patches fail to
apply.

> I tried to apply to a few branches but all failed, am I missing something?

No, I missed something: I did not notice it should be rebased on top of
rel_16_07.

> The error log for applying to dpdk-next-net/rel_16_07:
> 
> Applying patch #14086 using 'git am'
> Description: [dpdk-dev,v2,01/25] drivers: fix PCI class id support
> Applying: drivers: fix PCI class id support
> 
> Applying patch #14087 using 'git am'
> Description: [dpdk-dev,v2,02/25] mlx5: split memory registration function
> Applying: mlx5: split memory registration function
> 
> Applying patch #14088 using 'git am'
> Description: [dpdk-dev,v2,03/25] mlx5: remove Tx gather support
> Applying: mlx5: remove Tx gather support
> 
> Applying patch #14089 using 'git am'
> Description: [dpdk-dev,v2,04/25] mlx5: remove Rx scatter support
> Applying: mlx5: remove Rx scatter support
> error: patch failed: drivers/net/mlx5/mlx5_rxtx.c:502
> error: drivers/net/mlx5/mlx5_rxtx.c: patch does not apply
> Patch failed at 0001 mlx5: remove Rx scatter support
> The copy of the patch that failed is found in:
>/tmp/dpdk-b/.git/rebase-apply/patch


I will prepare a v3 on top of this branch.

-- 
Nélio Laranjeiro
6WIND

[dpdk-dev] [PATCH] dropping librte_ivshmem - was log: deprecate history dump

2016-06-21 Thread Panu Matilainen

On 06/10/2016 12:26 AM, Thomas Monjalon wrote:
> Looking a bit more into librte_ivshmem, the documentation says we need
> a Qemu patch but the URL doesn't exist anymore:
>   https://01.org/packet-processing/intel%C2%AE-ovdk
>   -> 404 Oops, we couldn't find that page
>
> I've never understood why we should keep this wart and now I'm going
> to be upset.

Good :)

> To sum up the situation, eal depends on ivshmem which depends on
> ring/mempool which depends... on eal.
> The truth is that eal should not depend on librte_ivshmem.
> And the option CONFIG_RTE_LIBRTE_IVSHMEM should not exist.
>
> There are 3 parts to distinguish:
>
> 1/ The librte_ivshmem API to export some data structures from host.
> No real problem here.
>
> 2/ The scan of the ivshmem devices in the guest init.
> It should be handled as any other PCI device with an appropriate driver.
> The scan is done by rte_eal_pci_init.
>
> 3/ The automatic mapped allocation of DPDK objects in the guest.
> It should not be done in EAL.
> An ivshmem driver would be called by rte_eal_dev_init.
> It would check where are the shared DPDK structures, as currently done
> with the IVSHMEM_MAGIC (0x0BADC0DE), and do the appropriate allocations.
> Thus only the driver would depend on ring and mempool.
>
> The last step of the ivshmem cleanup will be to remove the memory hack
> RTE_EAL_SINGLE_FILE_SEGMENTS. Then CONFIG_RTE_LIBRTE_IVSHMEM could be
> removed.
>
> So this is my proposal:
> Someone start working on the above cleanup now, otherwise the whole
> rte_ivshmem feature will be deprecated in 16.07 and removed in 16.11.
> We already talked about the rte_ivshmem design issues several times
> and nobody declared using it.

+1 (more like +100) to that.

In addition to the technical mess in EAL, there are quite some 
eyebrow-raisers related to IVSHMEM:

That it all starts with "you'll need to build a special version of qemu" 
with this special patch from the 'net, a patch which doesn't even exist 
anymore, is a complete non-starter. Such a situation can occur during 
early development, but it's been years by now. Dependencies on
non-upstreamed features in other projects are not a healthy sign.

Regardless of whether the patch has been integrated into qemu upstream or
not, the situation is quite telling: nobody cares enough to have updated
the information. I found a copy of the patch on my laptop, and as far
as I can tell, the patch has never been proposed upstream, much less
applied. Certainly the patch would not come even close to applying to 
current qemu. And apparently IVSHMEM is unmaintained in qemu upstream 
too (according to MAINTAINERS).

On the DPDK side, that the most obvious (to me at least) user of memnic PMD
has been unmaintained for two years now, and allowed to fall off the edge
of the world (witness http://dpdk.org/browse/memnic/) is also quite telling.

Just deprecate it already. If somebody shows up with actual patches to
clean it all up, the deprecation can be lifted of course, but cleaning
up this abandonware seems like a waste of engineering resources to me.

- Panu -

[dpdk-dev] [RFC] librte_vhost: Add unix domain socket fd registration

2016-06-21 Thread Yuanhan Liu

On Fri, Jun 17, 2016 at 11:32:36AM -0400, Aaron Conole wrote:
> Prior to this commit, the only way to add a vhost-user socket to the
> system is by relying on librte_vhost to open the unix domain socket and
> add it to the unix socket list.  This is problematic for applications
> which would like to set the permissions,

So, you want to address the issue raised by the following patch?

http://dpdk.org/dev/patchwork/patch/1/

I would still like to stick to my proposal, that is to introduce a
new API to do the permission change at any time, if we end up
wanting to introduce a new API.

> or applications which are not
> directly allowed to open sockets due to policy restrictions.

Could you name a specific example?

BTW, JFYI, since 16.07, DPDK supports client mode. It's QEMU (acting
as the server) that will create the socket file. I guess that would diminish
(or even avoid?) the permission pain that DPDK acting as server brings.
I doubt the API to do the permission change is really needed then.

--yliu

[dpdk-dev] [PATCH v3 00/25] Refactor mlx5 to improve performance

2016-06-21 Thread Nelio Laranjeiro

Enhance mlx5 with a data path that bypasses Verbs.

The first half of this patchset removes support for functionality completely
rewritten in the second half (scatter/gather, inline send), while the data
path is refactored without Verbs.

The PMD remains usable during the transition.

This patchset must be applied after "Miscellaneous fixes for mlx4 and mlx5".

Changes in v3:
- Rebased patchset on top of next-net/rel_16_07.

Changes in v2:
- Rebased patchset on top of dpdk/master.
- Fixed CQE size on Power8.
- Fixed mbuf assertion failure in debug mode.
- Fixed missing class_id field in rte_pci_id by using RTE_PCI_DEVICE.

Adrien Mazarguil (8):
  mlx5: replace countdown with threshold for Tx completions
  mlx5: add debugging information about Tx queues capabilities
  mlx5: check remaining space while processing Tx burst
  mlx5: resurrect Tx gather support
  mlx5: work around spurious compilation errors
  mlx5: remove redundant Rx queue initialization code
  mlx5: make Rx queue reinitialization safer
  mlx5: resurrect Rx scatter support

Nelio Laranjeiro (16):
  drivers: fix PCI class id support
  mlx5: split memory registration function
  mlx5: remove Tx gather support
  mlx5: remove Rx scatter support
  mlx5: remove configuration variable
  mlx5: remove inline Tx support
  mlx5: split Tx queue structure
  mlx5: split Rx queue structure
  mlx5: update prerequisites for upcoming enhancements
  mlx5: add definitions for data path without Verbs
  mlx5: add support for configuration through kvargs
  mlx5: add Tx/Rx burst function selection wrapper
  mlx5: refactor Rx data path
  mlx5: refactor Tx data path
  mlx5: handle Rx CQE compression
  mlx5: add support for multi-packet send

Yaacov Hazan (1):
  mlx5: add support for inline send

 config/common_base |2 -
 doc/guides/nics/mlx5.rst   |   94 +-
 drivers/crypto/qat/rte_qat_cryptodev.c |5 +-
 drivers/net/mlx4/mlx4.c|   18 +-
 drivers/net/mlx5/Makefile  |   49 +-
 drivers/net/mlx5/mlx5.c|  182 ++-
 drivers/net/mlx5/mlx5.h|   10 +
 drivers/net/mlx5/mlx5_defs.h   |   26 +-
 drivers/net/mlx5/mlx5_ethdev.c |  188 ++-
 drivers/net/mlx5/mlx5_fdir.c   |   20 +-
 drivers/net/mlx5/mlx5_mr.c |  280 
 drivers/net/mlx5/mlx5_prm.h|  163 +++
 drivers/net/mlx5/mlx5_rxmode.c |8 -
 drivers/net/mlx5/mlx5_rxq.c|  762 ---
 drivers/net/mlx5/mlx5_rxtx.c   | 2210 +++-
 drivers/net/mlx5/mlx5_rxtx.h   |  176 ++-
 drivers/net/mlx5/mlx5_txq.c|  368 +++---
 drivers/net/mlx5/mlx5_vlan.c   |6 +-
 drivers/net/nfp/nfp_net.c  |   12 +-
 19 files changed, 2624 insertions(+), 1955 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_mr.c
 create mode 100644 drivers/net/mlx5/mlx5_prm.h

-- 
2.1.4

[dpdk-dev] [PATCH v3 02/25] mlx5: split memory registration function

2016-06-21 Thread Nelio Laranjeiro

Except for the first time when memory registration occurs, the lkey is
always cached. Since memory registration is slow and performs system calls,
performance can be improved by moving that code to its own function outside
of the data path so only the lookup code is left in the original inlined
function.
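
As a rough illustration of the split described above, the sketch below keeps
a small inlined lookup on the fast path and pushes the rare registration
into a separate non-inlined function. Every name here (demo_txq,
demo_mp2mr, demo_mp2mr_reg) is hypothetical, the "registration" only fakes
a key, and the eviction policy is an assumption; it is not the code from
this patch. The noinline attribute is GCC/Clang syntax.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CACHE_N 8

struct demo_mr_entry {
	const void *mp; /* memory pool used as the lookup key */
	uint32_t lkey;  /* key obtained when the pool was registered */
};

struct demo_txq {
	struct demo_mr_entry cache[DEMO_CACHE_N];
	unsigned int registrations; /* number of slow-path calls */
};

/* Slow path, deliberately kept out of line. */
static __attribute__((noinline)) uint32_t
demo_mp2mr_reg(struct demo_txq *txq, const void *mp, unsigned int idx)
{
	assert(idx <= DEMO_CACHE_N);
	if (idx == DEMO_CACHE_N) {
		/* Cache is full: drop the oldest entry. */
		--idx;
		memmove(&txq->cache[0], &txq->cache[1],
			idx * sizeof(txq->cache[0]));
	}
	txq->cache[idx].mp = mp;
	/* Fake key; a real driver would register the memory here. */
	txq->cache[idx].lkey = 0x1000u + txq->registrations;
	++txq->registrations;
	return txq->cache[idx].lkey;
}

/* Fast path: only the lookup is inlined into the caller. */
static inline uint32_t
demo_mp2mr(struct demo_txq *txq, const void *mp)
{
	unsigned int i;

	for (i = 0; i != DEMO_CACHE_N; ++i) {
		if (txq->cache[i].mp == NULL)
			break; /* Unused slot: mp is not cached. */
		if (txq->cache[i].mp == mp)
			return txq->cache[i].lkey;
	}
	return demo_mp2mr_reg(txq, mp, i);
}
```

After the first call for a given pool, later calls return from the loop
without touching the slow path, which is the behaviour the commit message
relies on.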

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/Makefile|   1 +
 drivers/net/mlx5/mlx5_mr.c   | 277 +++
 drivers/net/mlx5/mlx5_rxtx.c | 209 ++--
 drivers/net/mlx5/mlx5_rxtx.h |   8 +-
 4 files changed, 295 insertions(+), 200 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_mr.c

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 82558aa..999ada5 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -47,6 +47,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_vlan.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_stats.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_rss.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_fdir.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_mr.c

 # Dependencies.
 DEPDIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += lib/librte_ether
diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
new file mode 100644
index 000..7c3e87f
--- /dev/null
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -0,0 +1,277 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of 6WIND S.A. nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/* Verbs header. */
+/* ISO C doesn't support unnamed structs/unions, disabling -pedantic. */
+#ifdef PEDANTIC
+#pragma GCC diagnostic ignored "-pedantic"
+#endif
+#include 
+#ifdef PEDANTIC
+#pragma GCC diagnostic error "-pedantic"
+#endif
+
+/* DPDK headers don't like -pedantic. */
+#ifdef PEDANTIC
+#pragma GCC diagnostic ignored "-pedantic"
+#endif
+#include 
+#ifdef PEDANTIC
+#pragma GCC diagnostic error "-pedantic"
+#endif
+
+#include "mlx5.h"
+#include "mlx5_rxtx.h"
+
+struct mlx5_check_mempool_data {
+   int ret;
+   char *start;
+   char *end;
+};
+
+/* Called by mlx5_check_mempool() when iterating the memory chunks. */
+static void mlx5_check_mempool_cb(struct rte_mempool *mp,
+   void *opaque, struct rte_mempool_memhdr *memhdr,
+   unsigned mem_idx)
+{
+   struct mlx5_check_mempool_data *data = opaque;
+
+   (void)mp;
+   (void)mem_idx;
+
+   /* It already failed, skip the next chunks. */
+   if (data->ret != 0)
+   return;
+   /* It is the first chunk. */
+   if (data->start == NULL && data->end == NULL) {
+   data->start = memhdr->addr;
+   data->end = data->start + memhdr->len;
+   return;
+   }
+   if (data->end == memhdr->addr) {
+   data->end += memhdr->len;
+   return;
+   }
+   if (data->start == (char *)memhdr->addr + memhdr->len) {
+   data->start -= memhdr->len;
+   return;
+   }
+   /* Error, mempool is not virtually contiguous. */
+   data->ret = -1;
+}
+
+/**
+ * Check if a mempool can be used: it must be virtually contiguous.
+ *
+ * @param[in] mp
+ *   Pointer to memory pool.
+ * @param[out] start
+ *   Pointer to the start address of the mempool virtual memory area
+ * @param[out] end
+ *   Pointer to the end address of the mempool virtual memory area
+ *
+ * @return
+ *   0 on success (mempo

[dpdk-dev] [PATCH v3 01/25] drivers: fix PCI class id support

2016-06-21 Thread Nelio Laranjeiro

Fixes: 701c8d80c820 ("pci: support class id probing")

Signed-off-by: Nelio Laranjeiro 
---
 drivers/crypto/qat/rte_qat_cryptodev.c |  5 +
 drivers/net/mlx4/mlx4.c| 18 ++
 drivers/net/mlx5/mlx5.c| 24 
 drivers/net/nfp/nfp_net.c  | 12 
 4 files changed, 19 insertions(+), 40 deletions(-)

diff --git a/drivers/crypto/qat/rte_qat_cryptodev.c 
b/drivers/crypto/qat/rte_qat_cryptodev.c
index a7912f5..f46ec85 100644
--- a/drivers/crypto/qat/rte_qat_cryptodev.c
+++ b/drivers/crypto/qat/rte_qat_cryptodev.c
@@ -69,10 +69,7 @@ static struct rte_cryptodev_ops crypto_qat_ops = {

 static struct rte_pci_id pci_id_qat_map[] = {
{
-   .vendor_id = 0x8086,
-   .device_id = 0x0443,
-   .subsystem_vendor_id = PCI_ANY_ID,
-   .subsystem_device_id = PCI_ANY_ID
+   RTE_PCI_DEVICE(0x8086, 0x0443),
},
{.device_id = 0},
 };
diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 9e94630..6228688 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -5807,22 +5807,16 @@ error:

 static const struct rte_pci_id mlx4_pci_id_map[] = {
{
-   .vendor_id = PCI_VENDOR_ID_MELLANOX,
-   .device_id = PCI_DEVICE_ID_MELLANOX_CONNECTX3,
-   .subsystem_vendor_id = PCI_ANY_ID,
-   .subsystem_device_id = PCI_ANY_ID
+   RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX,
+  PCI_DEVICE_ID_MELLANOX_CONNECTX3)
},
{
-   .vendor_id = PCI_VENDOR_ID_MELLANOX,
-   .device_id = PCI_DEVICE_ID_MELLANOX_CONNECTX3PRO,
-   .subsystem_vendor_id = PCI_ANY_ID,
-   .subsystem_device_id = PCI_ANY_ID
+   RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX,
+  PCI_DEVICE_ID_MELLANOX_CONNECTX3PRO)
},
{
-   .vendor_id = PCI_VENDOR_ID_MELLANOX,
-   .device_id = PCI_DEVICE_ID_MELLANOX_CONNECTX3VF,
-   .subsystem_vendor_id = PCI_ANY_ID,
-   .subsystem_device_id = PCI_ANY_ID
+   RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX,
+  PCI_DEVICE_ID_MELLANOX_CONNECTX3VF)
},
{
.vendor_id = 0
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 67a541c..350028b 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -610,28 +610,20 @@ error:

 static const struct rte_pci_id mlx5_pci_id_map[] = {
{
-   .vendor_id = PCI_VENDOR_ID_MELLANOX,
-   .device_id = PCI_DEVICE_ID_MELLANOX_CONNECTX4,
-   .subsystem_vendor_id = PCI_ANY_ID,
-   .subsystem_device_id = PCI_ANY_ID
+   RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX,
+  PCI_DEVICE_ID_MELLANOX_CONNECTX4)
},
{
-   .vendor_id = PCI_VENDOR_ID_MELLANOX,
-   .device_id = PCI_DEVICE_ID_MELLANOX_CONNECTX4VF,
-   .subsystem_vendor_id = PCI_ANY_ID,
-   .subsystem_device_id = PCI_ANY_ID
+   RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX,
+  PCI_DEVICE_ID_MELLANOX_CONNECTX4VF)
},
{
-   .vendor_id = PCI_VENDOR_ID_MELLANOX,
-   .device_id = PCI_DEVICE_ID_MELLANOX_CONNECTX4LX,
-   .subsystem_vendor_id = PCI_ANY_ID,
-   .subsystem_device_id = PCI_ANY_ID
+   RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX,
+  PCI_DEVICE_ID_MELLANOX_CONNECTX4LX)
},
{
-   .vendor_id = PCI_VENDOR_ID_MELLANOX,
-   .device_id = PCI_DEVICE_ID_MELLANOX_CONNECTX4LXVF,
-   .subsystem_vendor_id = PCI_ANY_ID,
-   .subsystem_device_id = PCI_ANY_ID
+   RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX,
+  PCI_DEVICE_ID_MELLANOX_CONNECTX4LXVF)
},
{
.vendor_id = 0
diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index ea5a2a3..dd0c559 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -2446,16 +2446,12 @@ nfp_net_init(struct rte_eth_dev *eth_dev)

 static struct rte_pci_id pci_id_nfp_net_map[] = {
{
-   .vendor_id = PCI_VENDOR_ID_NETRONOME,
-   .device_id = PCI_DEVICE_ID_NFP6000_PF_NIC,
-   .subsystem_vendor_id = PCI_ANY_ID,
-   .subsystem_device_id = PCI_ANY_ID,
+   RTE_PCI_DEVICE(PCI_VENDOR_ID_NETRONOME,
+  PCI_DEVICE_ID_NFP6000_PF_NIC)
},
{
-   .vendor_id = PCI_VENDOR_ID_NETRONOME,
-   .device_id = PCI_DEVICE_ID_NFP6000_VF_NIC,
-   .subsystem_vendor_id = PCI_ANY_ID,
-   .subsystem_device_id = PCI_AN

[dpdk-dev] [PATCH v3 03/25] mlx5: remove Tx gather support

2016-06-21 Thread Nelio Laranjeiro

This is done in preparation for bypassing Verbs entirely for the data path
as a performance improvement. TX gather cannot be maintained during the
transition and will be reimplemented later.

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/mlx5_ethdev.c |   2 +-
 drivers/net/mlx5/mlx5_rxtx.c   | 315 -
 drivers/net/mlx5/mlx5_rxtx.h   |  17 ---
 drivers/net/mlx5/mlx5_txq.c|  49 ++-
 4 files changed, 69 insertions(+), 314 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 0a881b6..280a90a 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1260,7 +1260,7 @@ mlx5_secondary_data_setup(struct priv *priv)
if (txq != NULL) {
if (txq_setup(priv->dev,
  txq,
- primary_txq->elts_n * MLX5_PMD_SGE_WR_N,
+ primary_txq->elts_n,
  primary_txq->socket,
  NULL) == 0) {
txq->stats.idx = primary_txq->stats.idx;
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 616cf7a..6e184c3 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -228,156 +228,6 @@ insert_vlan_sw(struct rte_mbuf *buf)
return 0;
 }

-#if MLX5_PMD_SGE_WR_N > 1
-
-/**
- * Copy scattered mbuf contents to a single linear buffer.
- *
- * @param[out] linear
- *   Linear output buffer.
- * @param[in] buf
- *   Scattered input buffer.
- *
- * @return
- *   Number of bytes copied to the output buffer or 0 if not large enough.
- */
-static unsigned int
-linearize_mbuf(linear_t *linear, struct rte_mbuf *buf)
-{
-   unsigned int size = 0;
-   unsigned int offset;
-
-   do {
-   unsigned int len = DATA_LEN(buf);
-
-   offset = size;
-   size += len;
-   if (unlikely(size > sizeof(*linear)))
-   return 0;
-   memcpy(&(*linear)[offset],
-  rte_pktmbuf_mtod(buf, uint8_t *),
-  len);
-   buf = NEXT(buf);
-   } while (buf != NULL);
-   return size;
-}
-
-/**
- * Handle scattered buffers for mlx5_tx_burst().
- *
- * @param txq
- *   TX queue structure.
- * @param segs
- *   Number of segments in buf.
- * @param elt
- *   TX queue element to fill.
- * @param[in] buf
- *   Buffer to process.
- * @param elts_head
- *   Index of the linear buffer to use if necessary (normally txq->elts_head).
- * @param[out] sges
- *   Array filled with SGEs on success.
- *
- * @return
- *   A structure containing the processed packet size in bytes and the
- *   number of SGEs. Both fields are set to (unsigned int)-1 in case of
- *   failure.
- */
-static struct tx_burst_sg_ret {
-   unsigned int length;
-   unsigned int num;
-}
-tx_burst_sg(struct txq *txq, unsigned int segs, struct txq_elt *elt,
-   struct rte_mbuf *buf, unsigned int elts_head,
-   struct ibv_sge (*sges)[MLX5_PMD_SGE_WR_N])
-{
-   unsigned int sent_size = 0;
-   unsigned int j;
-   int linearize = 0;
-
-   /* When there are too many segments, extra segments are
-* linearized in the last SGE. */
-   if (unlikely(segs > RTE_DIM(*sges))) {
-   segs = (RTE_DIM(*sges) - 1);
-   linearize = 1;
-   }
-   /* Update element. */
-   elt->buf = buf;
-   /* Register segments as SGEs. */
-   for (j = 0; (j != segs); ++j) {
-   struct ibv_sge *sge = &(*sges)[j];
-   uint32_t lkey;
-
-   /* Retrieve Memory Region key for this memory pool. */
-   lkey = txq_mp2mr(txq, txq_mb2mp(buf));
-   if (unlikely(lkey == (uint32_t)-1)) {
-   /* MR does not exist. */
-   DEBUG("%p: unable to get MP <-> MR association",
- (void *)txq);
-   /* Clean up TX element. */
-   elt->buf = NULL;
-   goto stop;
-   }
-   /* Update SGE. */
-   sge->addr = rte_pktmbuf_mtod(buf, uintptr_t);
-   if (txq->priv->sriov)
-   rte_prefetch0((volatile void *)
- (uintptr_t)sge->addr);
-   sge->length = DATA_LEN(buf);
-   sge->lkey = lkey;
-   sent_size += sge->length;
-   buf = NEXT(buf);
-   }
-   /* If buf is not NULL here and is not going to be linearized,
-* nb_segs is not valid. */
-   assert(j == segs);
-   assert((buf == NULL) || (linearize));
-   /* Linearize extra segments. */
-   if (linearize) {
-   struct ibv_sge *sge = &(*sges)[segs];
-   linear_t *linear = &(*txq->

[dpdk-dev] [PATCH v3 04/25] mlx5: remove Rx scatter support

2016-06-21 Thread Nelio Laranjeiro

This is done in preparation for bypassing Verbs entirely for the data path
as a performance improvement. RX scatter cannot be maintained during the
transition and will be reimplemented later.
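
While scatter is unavailable, a frame has to fit in a single buffer. The
check this patch adds to the MTU handler can be summarised by the sketch
below; demo_check_single_segment() is a hypothetical helper written for
illustration, with the frame length, buffer size and headroom passed in as
plain parameters.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: return 0 when a frame of max_frame_len bytes fits
 * in one buffer of mb_len bytes (of which `headroom` bytes are reserved),
 * or ENOTSUP when it would need several segments.
 */
static int
demo_check_single_segment(uint32_t max_frame_len, uint32_t mb_len,
			  uint32_t headroom)
{
	assert(mb_len >= headroom);
	if (max_frame_len > mb_len - headroom)
		return ENOTSUP;
	return 0;
}
```

With a 2176-byte buffer and 128 bytes of headroom, for example, any frame
up to 2048 bytes is accepted and anything larger is refused.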

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/mlx5_ethdev.c |  31 +---
 drivers/net/mlx5/mlx5_rxq.c| 314 ++---
 drivers/net/mlx5/mlx5_rxtx.c   | 211 +--
 drivers/net/mlx5/mlx5_rxtx.h   |  13 +-
 4 files changed, 53 insertions(+), 516 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 280a90a..ca57021 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -623,8 +623,7 @@ mlx5_dev_supported_ptypes_get(struct rte_eth_dev *dev)

};

-   if (dev->rx_pkt_burst == mlx5_rx_burst ||
-   dev->rx_pkt_burst == mlx5_rx_burst_sp)
+   if (dev->rx_pkt_burst == mlx5_rx_burst)
return ptypes;
return NULL;
 }
@@ -762,19 +761,11 @@ mlx5_dev_set_mtu(struct rte_eth_dev *dev, uint16_t mtu)
mb_len = rte_pktmbuf_data_room_size(rxq->mp);
assert(mb_len >= RTE_PKTMBUF_HEADROOM);
sp = (max_frame_len > (mb_len - RTE_PKTMBUF_HEADROOM));
-   /* Provide new values to rxq_setup(). */
-   dev->data->dev_conf.rxmode.jumbo_frame = sp;
-   dev->data->dev_conf.rxmode.max_rx_pkt_len = max_frame_len;
-   ret = rxq_rehash(dev, rxq);
-   if (ret) {
-   /* Force SP RX if that queue requires it and abort. */
-   if (rxq->sp)
-   rx_func = mlx5_rx_burst_sp;
-   break;
+   if (sp) {
+   ERROR("%p: RX scatter is not supported", (void *)dev);
+   ret = ENOTSUP;
+   goto out;
}
-   /* Scattered burst function takes priority. */
-   if (rxq->sp)
-   rx_func = mlx5_rx_burst_sp;
}
/* Burst functions can now be called again. */
rte_wmb();
@@ -1103,22 +1094,12 @@ priv_set_link(struct priv *priv, int up)
 {
struct rte_eth_dev *dev = priv->dev;
int err;
-   unsigned int i;

if (up) {
err = priv_set_flags(priv, ~IFF_UP, IFF_UP);
if (err)
return err;
-   for (i = 0; i < priv->rxqs_n; i++)
-   if ((*priv->rxqs)[i]->sp)
-   break;
-   /* Check if an sp queue exists.
-* Note: Some old frames might be received.
-*/
-   if (i == priv->rxqs_n)
-   dev->rx_pkt_burst = mlx5_rx_burst;
-   else
-   dev->rx_pkt_burst = mlx5_rx_burst_sp;
+   dev->rx_pkt_burst = mlx5_rx_burst;
dev->tx_pkt_burst = mlx5_tx_burst;
} else {
err = priv_set_flags(priv, ~IFF_UP, ~IFF_UP);
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 0bcf55b..38ff9fd 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -634,145 +634,6 @@ priv_rehash_flows(struct priv *priv)
 }

 /**
- * Allocate RX queue elements with scattered packets support.
- *
- * @param rxq
- *   Pointer to RX queue structure.
- * @param elts_n
- *   Number of elements to allocate.
- * @param[in] pool
- *   If not NULL, fetch buffers from this array instead of allocating them
- *   with rte_pktmbuf_alloc().
- *
- * @return
- *   0 on success, errno value on failure.
- */
-static int
-rxq_alloc_elts_sp(struct rxq *rxq, unsigned int elts_n,
- struct rte_mbuf **pool)
-{
-   unsigned int i;
-   struct rxq_elt_sp (*elts)[elts_n] =
-   rte_calloc_socket("RXQ elements", 1, sizeof(*elts), 0,
- rxq->socket);
-   int ret = 0;
-
-   if (elts == NULL) {
-   ERROR("%p: can't allocate packets array", (void *)rxq);
-   ret = ENOMEM;
-   goto error;
-   }
-   /* For each WR (packet). */
-   for (i = 0; (i != elts_n); ++i) {
-   unsigned int j;
-   struct rxq_elt_sp *elt = &(*elts)[i];
-   struct ibv_sge (*sges)[RTE_DIM(elt->sges)] = &elt->sges;
-
-   /* These two arrays must have the same size. */
-   assert(RTE_DIM(elt->sges) == RTE_DIM(elt->bufs));
-   /* For each SGE (segment). */
-   for (j = 0; (j != RTE_DIM(elt->bufs)); ++j) {
-   struct ibv_sge *sge = &(*sges)[j];
-   struct rte_mbuf *buf;
-
-   if (pool != NULL) {
-   buf = *(pool++);
-   assert(buf != NULL);
-   rte_pktmbuf_reset(buf);
-

[dpdk-dev] [PATCH v3 05/25] mlx5: remove configuration variable

2016-06-21 Thread Nelio Laranjeiro

Since there is no scatter/gather support anymore,
CONFIG_RTE_LIBRTE_MLX5_SGE_WR_N has no purpose and can be removed.

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
---
 config/common_base   | 1 -
 doc/guides/nics/mlx5.rst | 7 ---
 drivers/net/mlx5/Makefile| 4 
 drivers/net/mlx5/mlx5_defs.h | 5 -
 drivers/net/mlx5/mlx5_rxq.c  | 4 
 drivers/net/mlx5/mlx5_txq.c  | 4 
 6 files changed, 25 deletions(-)

diff --git a/config/common_base b/config/common_base
index ead5984..39e6333 100644
--- a/config/common_base
+++ b/config/common_base
@@ -207,7 +207,6 @@ CONFIG_RTE_LIBRTE_MLX4_SOFT_COUNTERS=1
 #
 CONFIG_RTE_LIBRTE_MLX5_PMD=n
 CONFIG_RTE_LIBRTE_MLX5_DEBUG=n
-CONFIG_RTE_LIBRTE_MLX5_SGE_WR_N=4
 CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE=0
 CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE=8

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index d9196d1..84c35a0 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -114,13 +114,6 @@ These options can be modified in the ``.config`` file.
   adds additional run-time checks and debugging messages at the cost of
   lower performance.

-- ``CONFIG_RTE_LIBRTE_MLX5_SGE_WR_N`` (default **4**)
-
-  Number of scatter/gather elements (SGEs) per work request (WR). Lowering
-  this number improves performance but also limits the ability to receive
-  scattered packets (packets that do not fit a single mbuf). The default
-  value is a safe tradeoff.
-
 - ``CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE`` (default **0**)

   Amount of data to be inlined during TX operations. Improves latency.
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 999ada5..656a6e1 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -86,10 +86,6 @@ else
 CFLAGS += -DNDEBUG -UPEDANTIC
 endif

-ifdef CONFIG_RTE_LIBRTE_MLX5_SGE_WR_N
-CFLAGS += -DMLX5_PMD_SGE_WR_N=$(CONFIG_RTE_LIBRTE_MLX5_SGE_WR_N)
-endif
-
 ifdef CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE
 CFLAGS += -DMLX5_PMD_MAX_INLINE=$(CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE)
 endif
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index 09207d9..da1c90e 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -54,11 +54,6 @@
 /* RSS Indirection table size. */
 #define RSS_INDIRECTION_TABLE_SIZE 256

-/* Maximum number of Scatter/Gather Elements per Work Request. */
-#ifndef MLX5_PMD_SGE_WR_N
-#define MLX5_PMD_SGE_WR_N 4
-#endif
-
 /* Maximum size for inline data. */
 #ifndef MLX5_PMD_MAX_INLINE
 #define MLX5_PMD_MAX_INLINE 0
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 38ff9fd..4000624 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -976,10 +976,6 @@ rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, 
uint16_t desc,
ERROR("%p: invalid number of RX descriptors", (void *)dev);
return EINVAL;
}
-   if (MLX5_PMD_SGE_WR_N > 1) {
-   ERROR("%p: RX scatter is not supported", (void *)dev);
-   return ENOTSUP;
-   }
/* Toggle RX checksum offload if hardware supports it. */
if (priv->hw_csum)
tmpl.csum = !!dev->data->dev_conf.rxmode.hw_ip_checksum;
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index 5a248c9..59974c5 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -264,10 +264,6 @@ txq_setup(struct rte_eth_dev *dev, struct txq *txq, 
uint16_t desc,
ERROR("%p: invalid number of TX descriptors", (void *)dev);
return EINVAL;
}
-   if (MLX5_PMD_SGE_WR_N > 1) {
-   ERROR("%p: TX gather is not supported", (void *)dev);
-   return EINVAL;
-   }
/* MRs will be registered in mp2mr[] later. */
attr.rd = (struct ibv_exp_res_domain_init_attr){
.comp_mask = (IBV_EXP_RES_DOMAIN_THREAD_MODEL |
-- 
2.1.4

[dpdk-dev] [PATCH v3 06/25] mlx5: remove inline Tx support

2016-06-21 Thread Nelio Laranjeiro

Inline TX will be fully managed by the PMD after Verbs is bypassed in the
data path. Remove the current code until then.

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
---
 config/common_base   |  1 -
 doc/guides/nics/mlx5.rst | 10 --
 drivers/net/mlx5/Makefile|  4 ---
 drivers/net/mlx5/mlx5_defs.h |  5 ---
 drivers/net/mlx5/mlx5_rxtx.c | 73 +++-
 drivers/net/mlx5/mlx5_rxtx.h |  9 --
 drivers/net/mlx5/mlx5_txq.c  | 16 --
 7 files changed, 25 insertions(+), 93 deletions(-)

diff --git a/config/common_base b/config/common_base
index 39e6333..5fbac47 100644
--- a/config/common_base
+++ b/config/common_base
@@ -207,7 +207,6 @@ CONFIG_RTE_LIBRTE_MLX4_SOFT_COUNTERS=1
 #
 CONFIG_RTE_LIBRTE_MLX5_PMD=n
 CONFIG_RTE_LIBRTE_MLX5_DEBUG=n
-CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE=0
 CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE=8

 #
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 84c35a0..77fa957 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -114,16 +114,6 @@ These options can be modified in the ``.config`` file.
   adds additional run-time checks and debugging messages at the cost of
   lower performance.

-- ``CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE`` (default **0**)
-
-  Amount of data to be inlined during TX operations. Improves latency.
-  Can improve PPS performance when PCI backpressure is detected and may be
-  useful for scenarios involving heavy traffic on many queues.
-
-  Since the additional software logic necessary to handle this mode can
-  lower performance when there is no backpressure, it is not enabled by
-  default.
-
 - ``CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE`` (default **8**)

   Maximum number of cached memory pools (MPs) per TX queue. Each MP from
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 656a6e1..289c85e 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -86,10 +86,6 @@ else
 CFLAGS += -DNDEBUG -UPEDANTIC
 endif

-ifdef CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE
-CFLAGS += -DMLX5_PMD_MAX_INLINE=$(CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE)
-endif
-
 ifdef CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE
 CFLAGS += -DMLX5_PMD_TX_MP_CACHE=$(CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE)
 endif
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index da1c90e..9a19835 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -54,11 +54,6 @@
 /* RSS Indirection table size. */
 #define RSS_INDIRECTION_TABLE_SIZE 256

-/* Maximum size for inline data. */
-#ifndef MLX5_PMD_MAX_INLINE
-#define MLX5_PMD_MAX_INLINE 0
-#endif
-
 /*
  * Maximum number of cached Memory Pools (MPs) per TX queue. Each RTE MP
  * from which buffers are to be transmitted will have to be mapped by this
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 07d95eb..4ba88ea 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -329,56 +329,33 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
rte_prefetch0((volatile void *)
  (uintptr_t)buf_next_addr);
}
-   /* Put packet into send queue. */
-#if MLX5_PMD_MAX_INLINE > 0
-   if (length <= txq->max_inline) {
-#ifdef HAVE_VERBS_VLAN_INSERTION
-   if (insert_vlan)
-   err = txq->send_pending_inline_vlan
-   (txq->qp,
-(void *)addr,
-length,
-send_flags,
-&buf->vlan_tci);
-   else
-#endif /* HAVE_VERBS_VLAN_INSERTION */
-   err = txq->send_pending_inline
-   (txq->qp,
-(void *)addr,
-length,
-send_flags);
-   } else
-#endif
-   {
-   /* Retrieve Memory Region key for this
-* memory pool. */
-   lkey = txq_mp2mr(txq, txq_mb2mp(buf));
-   if (unlikely(lkey == (uint32_t)-1)) {
-   /* MR does not exist. */
-   DEBUG("%p: unable to get MP <-> MR"
- " association", (void *)txq);
-   /* Clean up TX element. */
-   elt->buf = NULL;
-   goto stop;
-   }
+   /* Retrieve Memory Region key for this memory pool. */
+   lkey = txq_mp2mr(txq, txq_mb2mp(buf));
+   if (unlikely(lkey == (uint32_t)-1)) {
+   /* MR does not exist. */
+   DEBUG("%p: unable t

[dpdk-dev] [PATCH v3 07/25] mlx5: split Tx queue structure

2016-06-21 Thread Nelio Laranjeiro

To keep the data path as efficient as possible, move fields only useful to
the control path into a new structure, txq_ctrl.
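
The pattern can be sketched as follows: the data-path structure is embedded
in the control structure, and control code recovers the wrapper from a
data-path pointer with a container_of-style macro. The structures below are
reduced to a few illustrative fields, and demo_container_of is a local
definition made for this sketch rather than the macro used by the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Local definition for the sketch: member pointer back to its container. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Data path: only the fields touched for every burst (illustrative). */
struct demo_txq {
	unsigned int elts_n;
	unsigned int elts_head;
	unsigned int elts_tail;
};

/* Control path: rarely used fields wrap the data-path structure. */
struct demo_txq_ctrl {
	unsigned int socket; /* illustrative control-only field */
	struct demo_txq txq; /* embedded data-path part */
};

/* Recover the control structure from a data-path pointer. */
static struct demo_txq_ctrl *
demo_txq_to_ctrl(struct demo_txq *txq)
{
	assert(txq != NULL);
	return demo_container_of(txq, struct demo_txq_ctrl, txq);
}
```

Queue arrays can then keep pointing at the compact data-path part, while
setup and cleanup code converts to the wrapper only when it needs the
control fields.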

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/mlx5.c|  21 +++--
 drivers/net/mlx5/mlx5_ethdev.c |  27 +++---
 drivers/net/mlx5/mlx5_mr.c |  39 
 drivers/net/mlx5/mlx5_rxtx.h   |   9 +-
 drivers/net/mlx5/mlx5_txq.c| 198 +
 5 files changed, 158 insertions(+), 136 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 350028b..3d30e00 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -98,7 +98,6 @@ static void
 mlx5_dev_close(struct rte_eth_dev *dev)
 {
struct priv *priv = mlx5_get_priv(dev);
-   void *tmp;
unsigned int i;

priv_lock(priv);
@@ -122,12 +121,13 @@ mlx5_dev_close(struct rte_eth_dev *dev)
/* XXX race condition if mlx5_rx_burst() is still running. */
usleep(1000);
for (i = 0; (i != priv->rxqs_n); ++i) {
-   tmp = (*priv->rxqs)[i];
-   if (tmp == NULL)
+   struct rxq *rxq = (*priv->rxqs)[i];
+
+   if (rxq == NULL)
continue;
(*priv->rxqs)[i] = NULL;
-   rxq_cleanup(tmp);
-   rte_free(tmp);
+   rxq_cleanup(rxq);
+   rte_free(rxq);
}
priv->rxqs_n = 0;
priv->rxqs = NULL;
@@ -136,12 +136,15 @@ mlx5_dev_close(struct rte_eth_dev *dev)
/* XXX race condition if mlx5_tx_burst() is still running. */
usleep(1000);
for (i = 0; (i != priv->txqs_n); ++i) {
-   tmp = (*priv->txqs)[i];
-   if (tmp == NULL)
+   struct txq *txq = (*priv->txqs)[i];
+   struct txq_ctrl *txq_ctrl;
+
+   if (txq == NULL)
continue;
+   txq_ctrl = container_of(txq, struct txq_ctrl, txq);
(*priv->txqs)[i] = NULL;
-   txq_cleanup(tmp);
-   rte_free(tmp);
+   txq_cleanup(txq_ctrl);
+   rte_free(txq_ctrl);
}
priv->txqs_n = 0;
priv->txqs = NULL;
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index ca57021..3992b2c 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1232,28 +1232,31 @@ mlx5_secondary_data_setup(struct priv *priv)
/* TX queues. */
for (i = 0; i != nb_tx_queues; ++i) {
struct txq *primary_txq = (*sd->primary_priv->txqs)[i];
-   struct txq *txq;
+   struct txq_ctrl *primary_txq_ctrl;
+   struct txq_ctrl *txq_ctrl;

if (primary_txq == NULL)
continue;
-   txq = rte_calloc_socket("TXQ", 1, sizeof(*txq), 0,
-   primary_txq->socket);
-   if (txq != NULL) {
+   primary_txq_ctrl = container_of(primary_txq,
+   struct txq_ctrl, txq);
+   txq_ctrl = rte_calloc_socket("TXQ", 1, sizeof(*txq_ctrl), 0,
+primary_txq_ctrl->socket);
+   if (txq_ctrl != NULL) {
if (txq_setup(priv->dev,
- txq,
+ primary_txq_ctrl,
  primary_txq->elts_n,
- primary_txq->socket,
+ primary_txq_ctrl->socket,
  NULL) == 0) {
-   txq->stats.idx = primary_txq->stats.idx;
-   tx_queues[i] = txq;
+   txq_ctrl->txq.stats.idx = primary_txq->stats.idx;
+   tx_queues[i] = &txq_ctrl->txq;
continue;
}
-   rte_free(txq);
+   rte_free(txq_ctrl);
}
while (i) {
-   txq = tx_queues[--i];
-   txq_cleanup(txq);
-   rte_free(txq);
+   txq_ctrl = tx_queues[--i];
+   txq_cleanup(txq_ctrl);
+   rte_free(txq_ctrl);
}
goto error;
}
diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index 7c3e87f..79d5568 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -183,33 +183,36 @@ mlx5_mp2mr(struct ibv_pd *pd, struct rte_mempool *mp)

[dpdk-dev] [PATCH v3 08/25] mlx5: split Rx queue structure

2016-06-21 Thread Nelio Laranjeiro

To keep the data path as efficient as possible, move fields only useful to
the control path into new structure rxq_ctrl.

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/mlx5.c  |   6 +-
 drivers/net/mlx5/mlx5_fdir.c |   8 +-
 drivers/net/mlx5/mlx5_rxq.c  | 250 ++-
 drivers/net/mlx5/mlx5_rxtx.c |   1 -
 drivers/net/mlx5/mlx5_rxtx.h |  13 ++-
 5 files changed, 148 insertions(+), 130 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 3d30e00..27a7a30 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -122,12 +122,14 @@ mlx5_dev_close(struct rte_eth_dev *dev)
usleep(1000);
for (i = 0; (i != priv->rxqs_n); ++i) {
struct rxq *rxq = (*priv->rxqs)[i];
+   struct rxq_ctrl *rxq_ctrl;

if (rxq == NULL)
continue;
+   rxq_ctrl = container_of(rxq, struct rxq_ctrl, rxq);
(*priv->rxqs)[i] = NULL;
-   rxq_cleanup(rxq);
-   rte_free(rxq);
+   rxq_cleanup(rxq_ctrl);
+   rte_free(rxq_ctrl);
}
priv->rxqs_n = 0;
priv->rxqs = NULL;
diff --git a/drivers/net/mlx5/mlx5_fdir.c b/drivers/net/mlx5/mlx5_fdir.c
index 63e43ad..e3b97ba 100644
--- a/drivers/net/mlx5/mlx5_fdir.c
+++ b/drivers/net/mlx5/mlx5_fdir.c
@@ -424,7 +424,9 @@ create_flow:
 static struct fdir_queue *
 priv_get_fdir_queue(struct priv *priv, uint16_t idx)
 {
-   struct fdir_queue *fdir_queue = &(*priv->rxqs)[idx]->fdir_queue;
+   struct rxq_ctrl *rxq_ctrl =
+   container_of((*priv->rxqs)[idx], struct rxq_ctrl, rxq);
+   struct fdir_queue *fdir_queue = &rxq_ctrl->fdir_queue;
struct ibv_exp_rwq_ind_table *ind_table = NULL;
struct ibv_qp *qp = NULL;
struct ibv_exp_rwq_ind_table_init_attr ind_init_attr;
@@ -629,8 +631,10 @@ priv_fdir_disable(struct priv *priv)
/* Run on every RX queue to destroy related flow director QP and
 * indirection table. */
for (i = 0; (i != priv->rxqs_n); i++) {
-   fdir_queue = &(*priv->rxqs)[i]->fdir_queue;
+   struct rxq_ctrl *rxq_ctrl =
+   container_of((*priv->rxqs)[i], struct rxq_ctrl, rxq);

+   fdir_queue = &rxq_ctrl->fdir_queue;
if (fdir_queue->qp != NULL) {
claim_zero(ibv_destroy_qp(fdir_queue->qp));
fdir_queue->qp = NULL;
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 4000624..8d32e74 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -636,7 +636,7 @@ priv_rehash_flows(struct priv *priv)
 /**
  * Allocate RX queue elements.
  *
- * @param rxq
+ * @param rxq_ctrl
  *   Pointer to RX queue structure.
  * @param elts_n
  *   Number of elements to allocate.
@@ -648,16 +648,17 @@ priv_rehash_flows(struct priv *priv)
  *   0 on success, errno value on failure.
  */
 static int
-rxq_alloc_elts(struct rxq *rxq, unsigned int elts_n, struct rte_mbuf **pool)
+rxq_alloc_elts(struct rxq_ctrl *rxq_ctrl, unsigned int elts_n,
+  struct rte_mbuf **pool)
 {
unsigned int i;
struct rxq_elt (*elts)[elts_n] =
rte_calloc_socket("RXQ elements", 1, sizeof(*elts), 0,
- rxq->socket);
+ rxq_ctrl->socket);
int ret = 0;

if (elts == NULL) {
-   ERROR("%p: can't allocate packets array", (void *)rxq);
+   ERROR("%p: can't allocate packets array", (void *)rxq_ctrl);
ret = ENOMEM;
goto error;
}
@@ -672,10 +673,10 @@ rxq_alloc_elts(struct rxq *rxq, unsigned int elts_n, struct rte_mbuf **pool)
assert(buf != NULL);
rte_pktmbuf_reset(buf);
} else
-   buf = rte_pktmbuf_alloc(rxq->mp);
+   buf = rte_pktmbuf_alloc(rxq_ctrl->rxq.mp);
if (buf == NULL) {
assert(pool == NULL);
-   ERROR("%p: empty mbuf pool", (void *)rxq);
+   ERROR("%p: empty mbuf pool", (void *)rxq_ctrl);
ret = ENOMEM;
goto error;
}
@@ -691,15 +692,15 @@ rxq_alloc_elts(struct rxq *rxq, unsigned int elts_n, struct rte_mbuf **pool)
sge->addr = (uintptr_t)
((uint8_t *)buf->buf_addr + RTE_PKTMBUF_HEADROOM);
sge->length = (buf->buf_len - RTE_PKTMBUF_HEADROOM);
-   sge->lkey = rxq->mr->lkey;
+   sge->lkey = rxq_ctrl->mr->lkey;
/* Redundant check for tailroom. */
assert(sge->length == rte_pktmbuf_tailroom

[dpdk-dev] [PATCH v3 09/25] mlx5: update prerequisites for upcoming enhancements

2016-06-21 Thread Nelio Laranjeiro

The latest version of Mellanox OFED exposes hardware definitions necessary
to implement data path operation bypassing Verbs. Update the minimum
version requirement to MLNX_OFED >= 3.3 and clean up compatibility checks
for previous releases.

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
---
 doc/guides/nics/mlx5.rst   | 44 +++---
 drivers/net/mlx5/Makefile  | 39 -
 drivers/net/mlx5/mlx5.c| 23 --
 drivers/net/mlx5/mlx5.h|  5 +
 drivers/net/mlx5/mlx5_defs.h   |  9 -
 drivers/net/mlx5/mlx5_fdir.c   | 10 --
 drivers/net/mlx5/mlx5_rxmode.c |  8 
 drivers/net/mlx5/mlx5_rxq.c| 30 
 drivers/net/mlx5/mlx5_rxtx.c   |  4 
 drivers/net/mlx5/mlx5_rxtx.h   |  8 
 drivers/net/mlx5/mlx5_txq.c|  2 --
 drivers/net/mlx5/mlx5_vlan.c   |  3 ---
 12 files changed, 16 insertions(+), 169 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 77fa957..3a07928 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -125,16 +125,6 @@ These options can be modified in the ``.config`` file.
 Environment variables
 ~~~~~~~~~~~~~~~~~~~~~

-- ``MLX5_ENABLE_CQE_COMPRESSION``
-
-  A nonzero value lets ConnectX-4 return smaller completion entries to
-  improve performance when PCI backpressure is detected. It is most useful
-  for scenarios involving heavy traffic on many queues.
-
-  Since the additional software logic necessary to handle this mode can
-  lower performance when there is no backpressure, it is not enabled by
-  default.
-
 - ``MLX5_PMD_ENABLE_PADDING``

   Enables HW packet padding in PCI bus transactions.
@@ -211,40 +201,12 @@ DPDK and must be installed separately:

 Currently supported by DPDK:

-- Mellanox OFED **3.1-1.0.3**, **3.1-1.5.7.1** or **3.2-2.0.0.0** depending
-  on usage.
-
-The following features are supported with version **3.1-1.5.7.1** and
-above only:
-
-- IPv6, UPDv6, TCPv6 RSS.
-- RX checksum offloads.
-- IBM POWER8.
-
-The following features are supported with version **3.2-2.0.0.0** and
-above only:
-
-- Flow director.
-- RX VLAN stripping.
-- TX VLAN insertion.
-- RX CRC stripping configuration.
+- Mellanox OFED **3.3-1.0.0.0**.

 - Minimum firmware version:

-  With MLNX_OFED **3.1-1.0.3**:
-
-  - ConnectX-4: **12.12.1240**
-  - ConnectX-4 Lx: **14.12.1100**
-
-  With MLNX_OFED **3.1-1.5.7.1**:
-
-  - ConnectX-4: **12.13.0144**
-  - ConnectX-4 Lx: **14.13.0144**
-
-  With MLNX_OFED **3.2-2.0.0.0**:
-
-  - ConnectX-4: **12.14.2036**
-  - ConnectX-4 Lx: **14.14.2036**
+  - ConnectX-4: **12.16.1006**
+  - ConnectX-4 Lx: **14.16.1006**

 Getting Mellanox OFED
 ~~~~~~~~~~~~~~~~~~~~~
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 289c85e..dc99797 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -106,42 +106,19 @@ mlx5_autoconf.h.new: FORCE
 mlx5_autoconf.h.new: $(RTE_SDK)/scripts/auto-config-h.sh
$Q $(RM) -f -- '$@'
$Q sh -- '$<' '$@' \
-   HAVE_EXP_QUERY_DEVICE \
-   infiniband/verbs.h \
-   type 'struct ibv_exp_device_attr' $(AUTOCONF_OUTPUT)
-   $Q sh -- '$<' '$@' \
-   HAVE_FLOW_SPEC_IPV6 \
-   infiniband/verbs.h \
-   type 'struct ibv_exp_flow_spec_ipv6' $(AUTOCONF_OUTPUT)
-   $Q sh -- '$<' '$@' \
-   HAVE_EXP_QP_BURST_CREATE_ENABLE_MULTI_PACKET_SEND_WR \
-   infiniband/verbs.h \
-   enum IBV_EXP_QP_BURST_CREATE_ENABLE_MULTI_PACKET_SEND_WR \
-   $(AUTOCONF_OUTPUT)
-   $Q sh -- '$<' '$@' \
-   HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS \
-   infiniband/verbs.h \
-   enum IBV_EXP_DEVICE_ATTR_VLAN_OFFLOADS \
-   $(AUTOCONF_OUTPUT)
-   $Q sh -- '$<' '$@' \
-   HAVE_EXP_CQ_RX_TCP_PACKET \
+   HAVE_VERBS_VLAN_INSERTION \
infiniband/verbs.h \
-   enum IBV_EXP_CQ_RX_TCP_PACKET \
+   enum IBV_EXP_RECEIVE_WQ_CVLAN_INSERTION \
$(AUTOCONF_OUTPUT)
$Q sh -- '$<' '$@' \
-   HAVE_VERBS_FCS \
-   infiniband/verbs.h \
-   enum IBV_EXP_CREATE_WQ_FLAG_SCATTER_FCS \
+   HAVE_VERBS_IBV_EXP_CQ_COMPRESSED_CQE \
+   infiniband/verbs_exp.h \
+   enum IBV_EXP_CQ_COMPRESSED_CQE \
$(AUTOCONF_OUTPUT)
$Q sh -- '$<' '$@' \
-   HAVE_VERBS_RX_END_PADDING \
-   infiniband/verbs.h \
-   enum IBV_EXP_CREATE_WQ_FLAG_RX_END_PADDING \
-   $(AUTOCONF_OUTPUT)
-   $Q sh -- '$<' '$@' \
-   HAVE_VERBS_VLAN_INSERTION \
-   infiniband/verbs.h \
-   enum IBV_EXP_RECEIVE_WQ_CVLAN_INSERTION \
+   HAVE_VERBS_MLX5_ETH_VLAN_INLINE_HEADER_SI

[dpdk-dev] [PATCH v3 10/25] mlx5: add definitions for data path without Verbs

2016-06-21 Thread Nelio Laranjeiro

These structures and macros extend those exposed by libmlx5 (in mlx5_hw.h)
to let the PMD manage work queue and completion queue elements directly.
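
As an illustration of how such macros pick a completion entry's op_own byte apart, here is a small self-contained sketch. The DEMO_ names are hypothetical, and the two mask values (0x01 and 0x0c) are assumptions standing in for the definitions that come from mlx5_hw.h; only the shifts mirror the macros added by this patch.

```c
#include <stdint.h>

/* Assumed mask values; the real ones are provided by libmlx5 (mlx5_hw.h). */
#define DEMO_CQE_OWNER_MASK 0x01
#define DEMO_CQE_FORMAT_MASK 0x0c

/* Same bit layout as the macros in the patch: owner in bit 0, solicited
 * event in bit 1, format in bits 2-3, opcode in bits 4-7. */
#define DEMO_CQE_OWNER(op_own) ((op_own) & DEMO_CQE_OWNER_MASK)
#define DEMO_CQE_SE(op_own) (((op_own) >> 1) & 1)
#define DEMO_CQE_FORMAT(op_own) (((op_own) & DEMO_CQE_FORMAT_MASK) >> 2)
#define DEMO_CQE_OPCODE(op_own) (((op_own) & 0xf0) >> 4)

/* Decoded view of one op_own byte. */
struct demo_cqe_fields {
	uint8_t owner;
	uint8_t se;
	uint8_t format;
	uint8_t opcode;
};

/* Split an op_own byte into its four fields. */
static struct demo_cqe_fields
demo_cqe_decode(uint8_t op_own)
{
	struct demo_cqe_fields f = {
		.owner = DEMO_CQE_OWNER(op_own),
		.se = DEMO_CQE_SE(op_own),
		.format = DEMO_CQE_FORMAT(op_own),
		.opcode = DEMO_CQE_OPCODE(op_own),
	};

	return f;
}
```

With these assumed masks, demo_cqe_decode(0x2d) gives owner 1, solicited event 0, format 3 and opcode 2.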

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/mlx5_prm.h | 163 
 1 file changed, 163 insertions(+)
 create mode 100644 drivers/net/mlx5/mlx5_prm.h

diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
new file mode 100644
index 0000000..5db219b
--- /dev/null
+++ b/drivers/net/mlx5/mlx5_prm.h
@@ -0,0 +1,163 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of 6WIND S.A. nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef RTE_PMD_MLX5_PRM_H_
+#define RTE_PMD_MLX5_PRM_H_
+
+/* Verbs header. */
+/* ISO C doesn't support unnamed structs/unions, disabling -pedantic. */
+#ifdef PEDANTIC
+#pragma GCC diagnostic ignored "-pedantic"
+#endif
+#include 
+#ifdef PEDANTIC
+#pragma GCC diagnostic error "-pedantic"
+#endif
+
+/* Get CQE owner bit. */
+#define MLX5_CQE_OWNER(op_own) ((op_own) & MLX5_CQE_OWNER_MASK)
+
+/* Get CQE format. */
+#define MLX5_CQE_FORMAT(op_own) (((op_own) & MLX5E_CQE_FORMAT_MASK) >> 2)
+
+/* Get CQE opcode. */
+#define MLX5_CQE_OPCODE(op_own) (((op_own) & 0xf0) >> 4)
+
+/* Get CQE solicited event. */
+#define MLX5_CQE_SE(op_own) (((op_own) >> 1) & 1)
+
+/* Invalidate a CQE. */
+#define MLX5_CQE_INVALIDATE (MLX5_CQE_INVALID << 4)
+
+/* CQE value to inform that VLAN is stripped. */
+#define MLX5_CQE_VLAN_STRIPPED 0x1
+
+/* Maximum number of packets a multi-packet WQE can handle. */
+#define MLX5_MPW_DSEG_MAX 5
+
+/* Room for inline data in regular work queue element. */
+#define MLX5_WQE64_INL_DATA 12
+
+/* Room for inline data in multi-packet WQE. */
+#define MLX5_MWQE64_INL_DATA 28
+
+/* Subset of struct mlx5_wqe_eth_seg. */
+struct mlx5_wqe_eth_seg_small {
+   uint32_t rsvd0;
+   uint8_t cs_flags;
+   uint8_t rsvd1;
+   uint16_t mss;
+   uint32_t rsvd2;
+   uint16_t inline_hdr_sz;
+};
+
+/* Regular WQE. */
+struct mlx5_wqe_regular {
+   union {
+   struct mlx5_wqe_ctrl_seg ctrl;
+   uint32_t data[4];
+   } ctrl;
+   struct mlx5_wqe_eth_seg eseg;
+   struct mlx5_wqe_data_seg dseg;
+} __rte_aligned(64);
+
+/* Inline WQE. */
+struct mlx5_wqe_inl {
+   union {
+   struct mlx5_wqe_ctrl_seg ctrl;
+   uint32_t data[4];
+   } ctrl;
+   struct mlx5_wqe_eth_seg eseg;
+   uint32_t byte_cnt;
+   uint8_t data[MLX5_WQE64_INL_DATA];
+} __rte_aligned(64);
+
+/* Multi-packet WQE. */
+struct mlx5_wqe_mpw {
+   union {
+   struct mlx5_wqe_ctrl_seg ctrl;
+   uint32_t data[4];
+   } ctrl;
+   struct mlx5_wqe_eth_seg_small eseg;
+   struct mlx5_wqe_data_seg dseg[2];
+} __rte_aligned(64);
+
+/* Multi-packet WQE with inline. */
+struct mlx5_wqe_mpw_inl {
+   union {
+   struct mlx5_wqe_ctrl_seg ctrl;
+   uint32_t data[4];
+   } ctrl;
+   struct mlx5_wqe_eth_seg_small eseg;
+   uint32_t byte_cnt;
+   uint8_t data[MLX5_MWQE64_INL_DATA];
+} __rte_aligned(64);
+
+/* Union of all WQE types. */
+union mlx5_wqe {
+   struct mlx5_wqe_regular wqe;
+   struct mlx5_wqe_inl inl;
+   struct mlx5_wqe_mpw mpw;
+   struct mlx5_wqe_mpw_inl mpw_inl;
+   uint8_t data[64];
+};
+
+/* MPW session status. */
+enum mlx5_mpw_state {
+

[dpdk-dev] [PATCH v3 11/25] mlx5: add support for configuration through kvargs

2016-06-21 Thread Nelio Laranjeiro

The intent is to replace the remaining compile-time options and environment
variables with a common means of runtime configuration. This commit only
adds the kvargs handling code; subsequent commits will update the rest.
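
To show the general shape of such key/value handling without depending on DPDK, here is a minimal standalone sketch. It does not use librte_kvargs; demo_args_parse(), demo_args_check(), struct demo_conf and the "feature_en" key are all hypothetical. It walks a "key=value,key=value" string and hands each pair to a callback that validates and stores it.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Callback invoked for each key/value pair; returns 0 or an errno value. */
typedef int (*demo_arg_handler)(const char *key, const char *val,
				void *opaque);

/* Hypothetical configuration filled in by the callback. */
struct demo_conf {
	unsigned int feature_en;
};

/* Validate one pair and store it; unknown keys are rejected. */
static int
demo_args_check(const char *key, const char *val, void *opaque)
{
	struct demo_conf *conf = opaque;
	char *end;
	unsigned long tmp;

	errno = 0;
	tmp = strtoul(val, &end, 0);
	if (errno != 0 || end == val || *end != '\0')
		return EINVAL;
	if (strcmp(key, "feature_en") == 0) {
		conf->feature_en = !!tmp;
		return 0;
	}
	return EINVAL;
}

/* Split "k1=v1,k2=v2" and call the handler on each pair. */
static int
demo_args_parse(const char *args, demo_arg_handler handler, void *opaque)
{
	size_t len;
	char *copy;
	char *pair;
	int ret = 0;

	if (args == NULL)
		return 0;
	len = strlen(args) + 1;
	copy = malloc(len);
	if (copy == NULL)
		return ENOMEM;
	memcpy(copy, args, len);
	pair = copy;
	while (pair != NULL && *pair != '\0') {
		char *next = strchr(pair, ',');
		char *eq;

		if (next != NULL)
			*(next++) = '\0';
		eq = strchr(pair, '=');
		if (eq == NULL) {
			ret = EINVAL;
			break;
		}
		*eq = '\0';
		ret = handler(pair, eq + 1, opaque);
		if (ret != 0)
			break;
		pair = next;
	}
	free(copy); /* Released on every path, including errors. */
	return ret;
}
```

The sketch checks the end pointer returned by strtoul() in addition to errno, since errno alone does not reveal input such as "abc".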

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/mlx5.c | 72 +
 1 file changed, 72 insertions(+)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 3f45d84..56b1dfc 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -37,6 +37,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 

 /* Verbs header. */
@@ -57,6 +58,7 @@
 #include 
 #include 
 #include 
+#include 
 #ifdef PEDANTIC
 #pragma GCC diagnostic error "-pedantic"
 #endif
@@ -237,6 +239,70 @@ mlx5_dev_idx(struct rte_pci_addr *pci_addr)
return ret;
 }

+/**
+ * Verify and store value for device argument.
+ *
+ * @param[in] key
+ *   Key argument to verify.
+ * @param[in] val
+ *   Value associated with key.
+ * @param opaque
+ *   User data.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+mlx5_args_check(const char *key, const char *val, void *opaque)
+{
+   struct priv *priv = opaque;
+
+   /* No parameters are expected at the moment. */
+   (void)priv;
+   (void)val;
+   WARN("%s: unknown parameter", key);
+   return EINVAL;
+}
+
+/**
+ * Parse device parameters.
+ *
+ * @param priv
+ *   Pointer to private structure.
+ * @param devargs
+ *   Device arguments structure.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+mlx5_args(struct priv *priv, struct rte_devargs *devargs)
+{
+   static const char *params[] = {
+   NULL,
+   };
+   struct rte_kvargs *kvlist;
+   int ret = 0;
+   int i;
+
+   if (devargs == NULL)
+   return 0;
+   kvlist = rte_kvargs_parse(devargs->args, params);
+   if (kvlist == NULL)
+   return 0;
+   /* Process parameters. */
+   for (i = 0; (i != RTE_DIM(params)); ++i) {
+   if (rte_kvargs_count(kvlist, params[i])) {
+   ret = rte_kvargs_process(kvlist, params[i],
+mlx5_args_check, priv);
+   if (ret != 0)
+   return ret;
+   }
+   }
+   rte_kvargs_free(kvlist);
+   return 0;
+}
+
 static struct eth_driver mlx5_driver;

 /**
@@ -408,6 +474,12 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
priv->port = port;
priv->pd = pd;
priv->mtu = ETHER_MTU;
+   err = mlx5_args(priv, pci_dev->devargs);
+   if (err) {
+   ERROR("failed to process device arguments: %s",
+ strerror(err));
+   goto port_error;
+   }
if (ibv_exp_query_device(ctx, &exp_device_attr)) {
ERROR("ibv_exp_query_device() failed");
goto port_error;
-- 
2.1.4

[dpdk-dev] [PATCH v3 12/25] mlx5: add Tx/Rx burst function selection wrapper

2016-06-21 Thread Nelio Laranjeiro

These wrappers are meant to prevent code duplication later.
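
A rough sketch of why a single selection point helps is given below. All names are hypothetical, and the second burst variant is an assumption about where this could go later (this patch itself only ever selects one function): every place that used to assign the callback directly calls the wrapper instead, so adding a variant means touching one function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical burst callback type. */
typedef uint16_t (*demo_burst_fn)(void *queue, void **pkts, uint16_t n);

struct demo_dev {
	demo_burst_fn tx_burst;
};

struct demo_priv {
	struct demo_dev *dev;
	unsigned int mps:1; /* Assumed capability flag. */
};

/* Default transmit routine (stub). */
static uint16_t
demo_tx_burst(void *queue, void **pkts, uint16_t n)
{
	(void)queue;
	(void)pkts;
	return n;
}

/* Hypothetical alternative routine used when the capability is set (stub). */
static uint16_t
demo_tx_burst_mpw(void *queue, void **pkts, uint16_t n)
{
	(void)queue;
	(void)pkts;
	return n;
}

/* Single place where the transmit callback is chosen. */
static void
demo_select_tx_function(struct demo_priv *priv)
{
	priv->dev->tx_burst = demo_tx_burst;
	if (priv->mps)
		priv->dev->tx_burst = demo_tx_burst_mpw;
}
```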

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/mlx5.h|  2 ++
 drivers/net/mlx5/mlx5_ethdev.c | 34 --
 drivers/net/mlx5/mlx5_txq.c|  2 +-
 3 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 935e1b0..3dca03d 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -196,6 +196,8 @@ void priv_dev_interrupt_handler_install(struct priv *, struct rte_eth_dev *);
 int mlx5_set_link_down(struct rte_eth_dev *dev);
 int mlx5_set_link_up(struct rte_eth_dev *dev);
 struct priv *mlx5_secondary_data_setup(struct priv *priv);
+void priv_select_tx_function(struct priv *);
+void priv_select_rx_function(struct priv *);

 /* mlx5_mac.c */

diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 3992b2c..771d8b5 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1099,8 +1099,8 @@ priv_set_link(struct priv *priv, int up)
err = priv_set_flags(priv, ~IFF_UP, IFF_UP);
if (err)
return err;
-   dev->rx_pkt_burst = mlx5_rx_burst;
-   dev->tx_pkt_burst = mlx5_tx_burst;
+   priv_select_tx_function(priv);
+   priv_select_rx_function(priv);
} else {
err = priv_set_flags(priv, ~IFF_UP, ~IFF_UP);
if (err)
@@ -1289,13 +1289,11 @@ mlx5_secondary_data_setup(struct priv *priv)
rte_mb();
priv->dev->data = &sd->data;
rte_mb();
-   priv->dev->tx_pkt_burst = mlx5_tx_burst;
-   priv->dev->rx_pkt_burst = removed_rx_burst;
+   priv_select_tx_function(priv);
+   priv_select_rx_function(priv);
priv_unlock(priv);
 end:
/* More sanity checks. */
-   assert(priv->dev->tx_pkt_burst == mlx5_tx_burst);
-   assert(priv->dev->rx_pkt_burst == removed_rx_burst);
assert(priv->dev->data == &sd->data);
rte_spinlock_unlock(&sd->lock);
return priv;
@@ -1306,3 +1304,27 @@ error:
rte_spinlock_unlock(&sd->lock);
return NULL;
 }
+
+/**
+ * Configure the TX function to use.
+ *
+ * @param priv
+ *   Pointer to private structure.
+ */
+void
+priv_select_tx_function(struct priv *priv)
+{
+   priv->dev->tx_pkt_burst = mlx5_tx_burst;
+}
+
+/**
+ * Configure the RX function to use.
+ *
+ * @param priv
+ *   Pointer to private structure.
+ */
+void
+priv_select_rx_function(struct priv *priv)
+{
+   priv->dev->rx_pkt_burst = mlx5_rx_burst;
+}
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index 9f3a33b..d7cc39d 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -477,7 +477,7 @@ mlx5_tx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
  (void *)dev, (void *)txq_ctrl);
(*priv->txqs)[idx] = &txq_ctrl->txq;
/* Update send callback. */
-   dev->tx_pkt_burst = mlx5_tx_burst;
+   priv_select_tx_function(priv);
}
priv_unlock(priv);
return -ret;
-- 
2.1.4

[dpdk-dev] [PATCH v3 13/25] mlx5: refactor Rx data path

2016-06-21 Thread Nelio Laranjeiro

Bypass Verbs to improve RX performance.

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Yaacov Hazan 
Signed-off-by: Adrien Mazarguil 
Signed-off-by: Vasily Philipov 
---
 drivers/net/mlx5/mlx5_ethdev.c |   4 +-
 drivers/net/mlx5/mlx5_fdir.c   |   2 +-
 drivers/net/mlx5/mlx5_rxq.c| 303 -
 drivers/net/mlx5/mlx5_rxtx.c   | 289 ---
 drivers/net/mlx5/mlx5_rxtx.h   |  38 +++---
 drivers/net/mlx5/mlx5_vlan.c   |   3 +-
 6 files changed, 325 insertions(+), 314 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 771d8b5..8628321 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1262,7 +1262,9 @@ mlx5_secondary_data_setup(struct priv *priv)
}
/* RX queues. */
for (i = 0; i != nb_rx_queues; ++i) {
-   struct rxq *primary_rxq = (*sd->primary_priv->rxqs)[i];
+   struct rxq_ctrl *primary_rxq =
+   container_of((*sd->primary_priv->rxqs)[i],
+struct rxq_ctrl, rxq);

if (primary_rxq == NULL)
continue;
diff --git a/drivers/net/mlx5/mlx5_fdir.c b/drivers/net/mlx5/mlx5_fdir.c
index 1850218..73eb00e 100644
--- a/drivers/net/mlx5/mlx5_fdir.c
+++ b/drivers/net/mlx5/mlx5_fdir.c
@@ -431,7 +431,7 @@ priv_get_fdir_queue(struct priv *priv, uint16_t idx)
ind_init_attr = (struct ibv_exp_rwq_ind_table_init_attr){
.pd = priv->pd,
.log_ind_tbl_size = 0,
-   .ind_tbl = &((*priv->rxqs)[idx]->wq),
+   .ind_tbl = &rxq_ctrl->wq,
.comp_mask = 0,
};

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 7db4ce7..a8f68a3 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -43,6 +43,8 @@
 #pragma GCC diagnostic ignored "-pedantic"
 #endif
 #include 
+#include 
+#include 
 #ifdef PEDANTIC
 #pragma GCC diagnostic error "-pedantic"
 #endif
@@ -373,8 +375,13 @@ priv_create_hash_rxqs(struct priv *priv)
DEBUG("indirection table extended to assume %u WQs",
  priv->reta_idx_n);
}
-   for (i = 0; (i != priv->reta_idx_n); ++i)
-   wqs[i] = (*priv->rxqs)[(*priv->reta_idx)[i]]->wq;
+   for (i = 0; (i != priv->reta_idx_n); ++i) {
+   struct rxq_ctrl *rxq_ctrl;
+
+   rxq_ctrl = container_of((*priv->rxqs)[(*priv->reta_idx)[i]],
+   struct rxq_ctrl, rxq);
+   wqs[i] = rxq_ctrl->wq;
+   }
/* Get number of hash RX queues to configure. */
for (i = 0, hash_rxqs_n = 0; (i != ind_tables_n); ++i)
hash_rxqs_n += ind_table_init[i].hash_types_n;
@@ -638,21 +645,13 @@ rxq_alloc_elts(struct rxq_ctrl *rxq_ctrl, unsigned int elts_n,
   struct rte_mbuf **pool)
 {
unsigned int i;
-   struct rxq_elt (*elts)[elts_n] =
-   rte_calloc_socket("RXQ elements", 1, sizeof(*elts), 0,
- rxq_ctrl->socket);
int ret = 0;

-   if (elts == NULL) {
-   ERROR("%p: can't allocate packets array", (void *)rxq_ctrl);
-   ret = ENOMEM;
-   goto error;
-   }
/* For each WR (packet). */
for (i = 0; (i != elts_n); ++i) {
-   struct rxq_elt *elt = &(*elts)[i];
-   struct ibv_sge *sge = &(*elts)[i].sge;
struct rte_mbuf *buf;
+   volatile struct mlx5_wqe_data_seg *scat =
+   &(*rxq_ctrl->rxq.wqes)[i];

if (pool != NULL) {
buf = *(pool++);
@@ -666,40 +665,36 @@ rxq_alloc_elts(struct rxq_ctrl *rxq_ctrl, unsigned int elts_n,
ret = ENOMEM;
goto error;
}
-   elt->buf = buf;
/* Headroom is reserved by rte_pktmbuf_alloc(). */
assert(DATA_OFF(buf) == RTE_PKTMBUF_HEADROOM);
/* Buffer is supposed to be empty. */
assert(rte_pktmbuf_data_len(buf) == 0);
assert(rte_pktmbuf_pkt_len(buf) == 0);
-   /* sge->addr must be able to store a pointer. */
-   assert(sizeof(sge->addr) >= sizeof(uintptr_t));
-   /* SGE keeps its headroom. */
-   sge->addr = (uintptr_t)
-   ((uint8_t *)buf->buf_addr + RTE_PKTMBUF_HEADROOM);
-   sge->length = (buf->buf_len - RTE_PKTMBUF_HEADROOM);
-   sge->lkey = rxq_ctrl->mr->lkey;
-   /* Redundant check for tailroom. */
-   assert(sge->length == rte_pktmbuf_tailroom(buf));
+   assert(!buf->next);
+   PORT(buf) = rxq_ctrl->rxq.port_id;
+   DATA_LEN(buf) = rte_pktmbuf_tailroom(buf);
+   PKT_LEN(buf) = DATA_LEN(buf);
+   NB_SEGS

[dpdk-dev] [PATCH v3 14/25] mlx5: refactor Tx data path

2016-06-21 Thread Nelio Laranjeiro

Bypass Verbs to improve Tx performance.

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Yaacov Hazan 
Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/Makefile  |   5 -
 drivers/net/mlx5/mlx5_ethdev.c |  10 +-
 drivers/net/mlx5/mlx5_mr.c |   4 +-
 drivers/net/mlx5/mlx5_rxtx.c   | 359 ++---
 drivers/net/mlx5/mlx5_rxtx.h   |  52 +++---
 drivers/net/mlx5/mlx5_txq.c| 216 +
 6 files changed, 343 insertions(+), 303 deletions(-)

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index dc99797..66687e8 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -106,11 +106,6 @@ mlx5_autoconf.h.new: FORCE
 mlx5_autoconf.h.new: $(RTE_SDK)/scripts/auto-config-h.sh
$Q $(RM) -f -- '$@'
$Q sh -- '$<' '$@' \
-   HAVE_VERBS_VLAN_INSERTION \
-   infiniband/verbs.h \
-   enum IBV_EXP_RECEIVE_WQ_CVLAN_INSERTION \
-   $(AUTOCONF_OUTPUT)
-   $Q sh -- '$<' '$@' \
HAVE_VERBS_IBV_EXP_CQ_COMPRESSED_CQE \
infiniband/verbs_exp.h \
enum IBV_EXP_CQ_COMPRESSED_CQE \
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 8628321..4e125a7 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1242,11 +1242,11 @@ mlx5_secondary_data_setup(struct priv *priv)
txq_ctrl = rte_calloc_socket("TXQ", 1, sizeof(*txq_ctrl), 0,
 primary_txq_ctrl->socket);
if (txq_ctrl != NULL) {
-   if (txq_setup(priv->dev,
- primary_txq_ctrl,
- primary_txq->elts_n,
- primary_txq_ctrl->socket,
- NULL) == 0) {
+   if (txq_ctrl_setup(priv->dev,
+  primary_txq_ctrl,
+  primary_txq->elts_n,
+  primary_txq_ctrl->socket,
+  NULL) == 0) {
txq_ctrl->txq.stats.idx = primary_txq->stats.idx;
tx_queues[i] = &txq_ctrl->txq;
continue;
diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index 79d5568..e5e8a04 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -189,7 +189,7 @@ txq_mp2mr_reg(struct txq *txq, struct rte_mempool *mp, unsigned int idx)
/* Add a new entry, register MR first. */
DEBUG("%p: discovered new memory pool \"%s\" (%p)",
  (void *)txq_ctrl, mp->name, (void *)mp);
-   mr = mlx5_mp2mr(txq_ctrl->txq.priv->pd, mp);
+   mr = mlx5_mp2mr(txq_ctrl->priv->pd, mp);
if (unlikely(mr == NULL)) {
DEBUG("%p: unable to configure MR, ibv_reg_mr() failed.",
  (void *)txq_ctrl);
@@ -208,7 +208,7 @@ txq_mp2mr_reg(struct txq *txq, struct rte_mempool *mp, unsigned int idx)
/* Store the new entry. */
txq_ctrl->txq.mp2mr[idx].mp = mp;
txq_ctrl->txq.mp2mr[idx].mr = mr;
-   txq_ctrl->txq.mp2mr[idx].lkey = mr->lkey;
+   txq_ctrl->txq.mp2mr[idx].lkey = htonl(mr->lkey);
DEBUG("%p: new MR lkey for MP \"%s\" (%p): 0x%08" PRIu32,
  (void *)txq_ctrl, mp->name, (void *)mp,
  txq_ctrl->txq.mp2mr[idx].lkey);
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 27d8852..95bf981 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -119,68 +119,52 @@ get_cqe64(volatile struct mlx5_cqe cqes[],
  *
  * @param txq
  *   Pointer to TX queue structure.
- *
- * @return
- *   0 on success, -1 on failure.
  */
-static int
+static void
 txq_complete(struct txq *txq)
 {
-   unsigned int elts_comp = txq->elts_comp;
-   unsigned int elts_tail = txq->elts_tail;
-   unsigned int elts_free = txq->elts_tail;
const unsigned int elts_n = txq->elts_n;
-   int wcs_n;
-
-   if (unlikely(elts_comp == 0))
-   return 0;
-#ifdef DEBUG_SEND
-   DEBUG("%p: processing %u work requests completions",
- (void *)txq, elts_comp);
-#endif
-   wcs_n = txq->poll_cnt(txq->cq, elts_comp);
-   if (unlikely(wcs_n == 0))
-   return 0;
-   if (unlikely(wcs_n < 0)) {
-   DEBUG("%p: ibv_poll_cq() failed (wcs_n=%d)",
- (void *)txq, wcs_n);
-   return -1;
+   const unsigned int cqe_n = txq->cqe_n;
+   uint16_t elts_free = txq->elts_tail;
+   uint16_t elts_tail;
+   uint16_t cq_ci = txq->cq_ci;
+   unsigned int wqe_ci = (unsigned int)-1;
+   int ret = 0;
+
+   while (ret == 0) {
+   volatile struct mlx5_cqe64 *cqe;
+
+   cqe = get_cqe64(*txq->cqes, cq

[dpdk-dev] [PATCH v3 15/25] mlx5: handle Rx CQE compression

2016-06-21 Thread Nelio Laranjeiro

Mini (compressed) CQEs are returned by the NIC when PCI back pressure is
detected, in which case the first CQE64 contains common packet information
followed by a number of CQE8 providing the rest, followed by a matching
number of empty CQE64 entries to be used by software for decompression.

Before decompression:

      0           1          2           6         7         8
  +-------+  +---------+ +-------+   +-------+ +-------+ +-------+
  | CQE64 |  |  CQE64  | | CQE64 |   | CQE64 | | CQE64 | | CQE64 |
  |-------|  |---------| |-------|   |-------| |-------| |-------|
  | ..... |  | cqe8[0] | |       | . |       | |       | | ..... |
  | ..... |  | cqe8[1] | |       | . |       | |       | | ..... |
  | ..... |  | ....... | |       | . |       | |       | | ..... |
  | ..... |  | cqe8[7] | |       |   |       | |       | | ..... |
  +-------+  +---------+ +-------+   +-------+ +-------+ +-------+

After decompression:

      0          1     ...     8
  +-------+  +-------+     +-------+
  | CQE64 |  | CQE64 |     | CQE64 |
  |-------|  |-------|     |-------|
  | ..... |  | ..... |  .  | ..... |
  | ..... |  | ..... |  .  | ..... |
  | ..... |  | ..... |  .  | ..... |
  | ..... |  | ..... |     | ..... |
  +-------+  +-------+     +-------+

This patch does not perform the entire decompression step, as that would be
really expensive. Instead, the first CQE64 is consumed and an internal
context is maintained to interpret the following CQE8 entries directly.

Intermediate empty CQE64 entries are handed back to HW without further
processing.
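
The idea of keeping a small context instead of expanding every entry can be sketched with a toy model. Everything below is hypothetical and heavily simplified: the slot layout, the field names and the limit of eight mini entries per session are assumptions made for the example, not the hardware format or the code from this patch.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MINI_MAX 8 /* Mini entries packed in one full-size slot. */

/* Toy completion slot: either one full entry or an array of mini entries. */
union demo_slot {
	struct {
		uint32_t byte_cnt;  /* Packet length, or mini count if compressed. */
		uint8_t compressed; /* Nonzero on the first entry of a session. */
	} full;
	uint32_t mini_len[DEMO_MINI_MAX]; /* Per-packet lengths. */
};

/* Consumer state, including the context kept across a compressed session. */
struct demo_cq {
	union demo_slot *ring;
	uint16_t mask;  /* Ring size minus one (size is a power of two). */
	uint16_t ci;    /* Consumer index. */
	uint16_t zip_n; /* Mini entries in the current session, 0 if none. */
	uint16_t zip_i; /* Next mini entry to interpret. */
};

/* Return the length of the next packet without expanding mini entries. */
static uint32_t
demo_cq_next_len(struct demo_cq *cq)
{
	union demo_slot *slot = &cq->ring[cq->ci & cq->mask];
	uint32_t len;

	if (cq->zip_n == 0) {
		if (!slot->full.compressed) {
			/* Regular entry: one slot, one packet. */
			++cq->ci;
			return slot->full.byte_cnt;
		}
		/* Session start: consume the first entry, remember the count. */
		assert(slot->full.byte_cnt >= 1 &&
		       slot->full.byte_cnt <= DEMO_MINI_MAX);
		cq->zip_n = (uint16_t)slot->full.byte_cnt;
		cq->zip_i = 0;
		++cq->ci;
		slot = &cq->ring[cq->ci & cq->mask];
	}
	/* cq->ci now designates the slot holding the mini entries. */
	len = slot->mini_len[cq->zip_i++];
	if (cq->zip_i == cq->zip_n) {
		/* Session over: skip the mini slot and the unused ones. */
		cq->ci = (uint16_t)(cq->ci + cq->zip_n);
		cq->zip_n = 0;
	}
	return len;
}
```

In this model a session of n packets costs one read of the first entry plus n reads from the packed array, and the consumer index jumps over the unused slots in one step when the session ends.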

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
Signed-off-by: Olga Shern 
Signed-off-by: Vasily Philipov 
---
 doc/guides/nics/mlx5.rst |   6 +
 drivers/net/mlx5/mlx5.c  |  25 -
 drivers/net/mlx5/mlx5.h  |   1 +
 drivers/net/mlx5/mlx5_rxq.c  |   9 +-
 drivers/net/mlx5/mlx5_rxtx.c | 260 ---
 drivers/net/mlx5/mlx5_rxtx.h |  11 ++
 drivers/net/mlx5/mlx5_txq.c  |   5 +
 7 files changed, 248 insertions(+), 69 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 3a07928..756153b 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -148,6 +148,12 @@ Run-time configuration

 - **ethtool** operations on related kernel interfaces also affect the PMD.

+- ``rxq_cqe_comp_en`` parameter [int]
+
+  A nonzero value enables the compression of CQE on RX side. This feature
+  allows to save PCI bandwidth and improve performance at the cost of a
+  slightly higher CPU usage.  Enabled by default.
+
 Prerequisites
 -------------

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 56b1dfc..7e8c579 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -69,6 +69,9 @@
 #include "mlx5_autoconf.h"
 #include "mlx5_defs.h"

+/* Device parameter to enable RX completion queue compression. */
+#define MLX5_RXQ_CQE_COMP_EN "rxq_cqe_comp_en"
+
 /**
  * Retrieve integer value from environment variable.
  *
@@ -256,12 +259,21 @@ static int
 mlx5_args_check(const char *key, const char *val, void *opaque)
 {
struct priv *priv = opaque;
+   unsigned long tmp;

-   /* No parameters are expected at the moment. */
-   (void)priv;
-   (void)val;
-   WARN("%s: unknown parameter", key);
-   return EINVAL;
+   errno = 0;
+   tmp = strtoul(val, NULL, 0);
+   if (errno) {
+   WARN("%s: \"%s\" is not a valid integer", key, val);
+   return errno;
+   }
+   if (strcmp(MLX5_RXQ_CQE_COMP_EN, key) == 0)
+   priv->cqe_comp = !!tmp;
+   else {
+   WARN("%s: unknown parameter", key);
+   return EINVAL;
+   }
+   return 0;
 }

 /**
@@ -279,7 +291,7 @@ static int
 mlx5_args(struct priv *priv, struct rte_devargs *devargs)
 {
static const char *params[] = {
-   NULL,
+   MLX5_RXQ_CQE_COMP_EN,
};
struct rte_kvargs *kvlist;
int ret = 0;
@@ -474,6 +486,7 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct 
rte_pci_device *pci_dev)
priv->port = port;
priv->pd = pd;
priv->mtu = ETHER_MTU;
+   priv->cqe_comp = 1; /* Enable compression by default. */
err = mlx5_args(priv, pci_dev->devargs);
if (err) {
ERROR("failed to process device arguments: %s",
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 3dca03d..8f5a6df 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -111,6 +111,7 @@ struct priv {
unsigned int hw_padding:1; /* End alignment padding is supported. */
unsigned int sriov:1; /* This is a VF or PF with VF devices. */
unsigned int mps:1; /* Whether multi-packet send is supported. */
+   unsigned int cqe_comp:1; /* Whether CQE compression is enabled. */
unsigned int pending_alarm:1; /* An alarm is pend

[dpdk-dev] [PATCH v3 16/25] mlx5: replace countdown with threshold for Tx completions

2016-06-21 Thread Nelio Laranjeiro

From: Adrien Mazarguil 

Replacing the variable countdown (which depends on the number of
descriptors) with a fixed relative threshold known at compile time improves
performance by reducing the TX queue structure footprint and the amount of
code to manage completions during a burst.

Completions are now requested at most once per burst after the threshold is
reached.
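
The accounting described above can be sketched as follows. This is a minimal
model, not the driver code: the part of the diff that implements the
threshold check is not shown in this message, so struct txq_model,
TX_COMP_THRESH and tx_burst_wants_completion() are hypothetical names that
only mirror MLX5_TX_COMP_THRESH and the elts_comp counter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical threshold mirroring MLX5_TX_COMP_THRESH from the patch. */
#define TX_COMP_THRESH 32u

/* Minimal stand-in for the queue state; not the driver's struct txq. */
struct txq_model {
    unsigned int elts_comp; /* Descriptors posted since the last request. */
};

/*
 * Account for a burst of `sent` descriptors. Returns 1 when the caller
 * should flag the last WQE of the burst to request a completion, else 0.
 * At most one request is issued per burst, whatever the burst size.
 */
static int
tx_burst_wants_completion(struct txq_model *txq, unsigned int sent)
{
    unsigned int comp = txq->elts_comp + sent;

    if (comp >= TX_COMP_THRESH) {
        /* Threshold reached: request a completion and start over. */
        txq->elts_comp = 0;
        return 1;
    }
    /* Below threshold: only remember how many descriptors are pending. */
    txq->elts_comp = comp;
    return 0;
}
```

Compared with a per-packet countdown, the check runs once per burst and the
queue only needs a single counter, which is the footprint reduction the
commit message refers to.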

Signed-off-by: Adrien Mazarguil 
Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Vasily Philipov 
---
 drivers/net/mlx5/mlx5_defs.h |  7 +--
 drivers/net/mlx5/mlx5_rxtx.c | 42 --
 drivers/net/mlx5/mlx5_rxtx.h |  5 ++---
 drivers/net/mlx5/mlx5_txq.c  | 19 ---
 4 files changed, 43 insertions(+), 30 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index 8d2ec7a..cc2a6f3 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -48,8 +48,11 @@
 /* Maximum number of special flows. */
 #define MLX5_MAX_SPECIAL_FLOWS 4

-/* Request send completion once in every 64 sends, might be less. */
-#define MLX5_PMD_TX_PER_COMP_REQ 64
+/*
+ * Request TX completion every time descriptors reach this threshold since
+ * the previous request. Must be a power of two for performance reasons.
+ */
+#define MLX5_TX_COMP_THRESH 32

 /* RSS Indirection table size. */
 #define RSS_INDIRECTION_TABLE_SIZE 256
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 30d413c..d56c9e9 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -154,9 +154,6 @@ check_cqe64(volatile struct mlx5_cqe64 *cqe,
  * Manage TX completions.
  *
  * When sending a burst, mlx5_tx_burst() posts several WRs.
- * To improve performance, a completion event is only required once every
- * MLX5_PMD_TX_PER_COMP_REQ sends. Doing so discards completion information
- * for other WRs, but this information would not be used anyway.
  *
  * @param txq
  *   Pointer to TX queue structure.
@@ -170,14 +167,16 @@ txq_complete(struct txq *txq)
uint16_t elts_free = txq->elts_tail;
uint16_t elts_tail;
uint16_t cq_ci = txq->cq_ci;
-   unsigned int wqe_ci = (unsigned int)-1;
+   volatile struct mlx5_cqe64 *cqe = NULL;
+   volatile union mlx5_wqe *wqe;

do {
-   unsigned int idx = cq_ci & cqe_cnt;
-   volatile struct mlx5_cqe64 *cqe = &(*txq->cqes)[idx].cqe64;
+   volatile struct mlx5_cqe64 *tmp;

-   if (check_cqe64(cqe, cqe_n, cq_ci) == 1)
+   tmp = &(*txq->cqes)[cq_ci & cqe_cnt].cqe64;
+   if (check_cqe64(tmp, cqe_n, cq_ci))
break;
+   cqe = tmp;
 #ifndef NDEBUG
if (MLX5_CQE_FORMAT(cqe->op_own) == MLX5_COMPRESSED) {
if (!check_cqe64_seen(cqe))
@@ -191,14 +190,15 @@ txq_complete(struct txq *txq)
return;
}
 #endif /* NDEBUG */
-   wqe_ci = ntohs(cqe->wqe_counter);
++cq_ci;
} while (1);
-   if (unlikely(wqe_ci == (unsigned int)-1))
+   if (unlikely(cqe == NULL))
return;
+   wqe = &(*txq->wqes)[htons(cqe->wqe_counter) & (txq->wqe_n - 1)];
+   elts_tail = wqe->wqe.ctrl.data[3];
+   assert(elts_tail < txq->wqe_n);
/* Free buffers. */
-   elts_tail = (wqe_ci + 1) & (elts_n - 1);
-   do {
+   while (elts_free != elts_tail) {
struct rte_mbuf *elt = (*txq->elts)[elts_free];
unsigned int elts_free_next =
(elts_free + 1) & (elts_n - 1);
@@ -214,7 +214,7 @@ txq_complete(struct txq *txq)
/* Only one segment needs to be freed. */
rte_pktmbuf_free_seg(elt);
elts_free = elts_free_next;
-   } while (elts_free != elts_tail);
+   }
txq->cq_ci = cq_ci;
txq->elts_tail = elts_tail;
/* Update the consumer index. */
@@ -435,6 +435,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
const unsigned int elts_n = txq->elts_n;
unsigned int i;
unsigned int max;
+   unsigned int comp;
volatile union mlx5_wqe *wqe;
struct rte_mbuf *buf;

@@ -484,12 +485,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
buf->vlan_tci);
else
mlx5_wqe_write(txq, wqe, addr, length, lkey);
-   /* Request completion if needed. */
-   if (unlikely(--txq->elts_comp == 0)) {
-   wqe->wqe.ctrl.data[2] = htonl(8);
-   txq->elts_comp = txq->elts_comp_cd_init;
-   } else
-   wqe->wqe.ctrl.data[2] = 0;
+   wqe->wqe.ctrl.data[2] = 0;
/* Should we enable HW CKSUM offload */
if (buf->ol_flags &
(PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM |

[dpdk-dev] [PATCH v3 17/25] mlx5: add support for inline send

2016-06-21 Thread Nelio Laranjeiro

From: Yaacov Hazan 

Implement the inline send feature, which copies packet data directly into
WQEs for improved latency. The maximum packet size and the minimum number of
Tx queues to qualify for inline send are user-configurable.

This feature is effective when HW causes a performance bottleneck.
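
As a rough illustration of how the two parameters interact, the condition
used by priv_select_tx_function() in this patch can be restated as a
standalone predicate. The function name inline_send_selected() and its
argument names are hypothetical; only the condition itself comes from the
diff below.

```c
#include <assert.h>

/*
 * Return nonzero when inline send would be selected:
 *  - txq_inline:      value of the "txq_inline" parameter (0 disables it),
 *  - txqs_min_inline: value of the "txqs_min_inline" parameter,
 *  - txqs_n:          number of configured TX queues.
 */
static int
inline_send_selected(unsigned int txq_inline,
                     unsigned int txqs_min_inline,
                     unsigned int txqs_n)
{
    return txq_inline != 0 && txqs_n >= txqs_min_inline;
}
```

With txq_inline left at its default of 0 the predicate is always false, so
txqs_min_inline has no effect on its own.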

Signed-off-by: Yaacov Hazan 
Signed-off-by: Adrien Mazarguil 
Signed-off-by: Nelio Laranjeiro 
---
 doc/guides/nics/mlx5.rst   |  17 +++
 drivers/net/mlx5/mlx5.c|  13 ++
 drivers/net/mlx5/mlx5.h|   2 +
 drivers/net/mlx5/mlx5_ethdev.c |   5 +
 drivers/net/mlx5/mlx5_rxtx.c   | 271 +
 drivers/net/mlx5/mlx5_rxtx.h   |   2 +
 drivers/net/mlx5/mlx5_txq.c|   4 +
 7 files changed, 314 insertions(+)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 756153b..9ada221 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -154,6 +154,23 @@ Run-time configuration
   allows to save PCI bandwidth and improve performance at the cost of a
   slightly higher CPU usage.  Enabled by default.

+- ``txq_inline`` parameter [int]
+
+  Amount of data to be inlined during TX operations. Improves latency.
+  Can improve PPS performance when PCI back pressure is detected and may be
+  useful for scenarios involving heavy traffic on many queues.
+
+  It is not enabled by default (set to 0) since the additional software
+  logic necessary to handle this mode can lower performance when back
+  pressure is not expected.
+
+- ``txqs_min_inline`` parameter [int]
+
+  Enable inline send only when the number of TX queues is greater or equal
+  to this value.
+
+  This option should be used in combination with ``txq_inline`` above.
+
 Prerequisites
 -------------

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 7e8c579..8c8c5e4 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -72,6 +72,13 @@
 /* Device parameter to enable RX completion queue compression. */
 #define MLX5_RXQ_CQE_COMP_EN "rxq_cqe_comp_en"

+/* Device parameter to configure inline send. */
+#define MLX5_TXQ_INLINE "txq_inline"
+
+/* Device parameter to configure the number of TX queues threshold for
+ * enabling inline send. */
+#define MLX5_TXQS_MIN_INLINE "txqs_min_inline"
+
 /**
  * Retrieve integer value from environment variable.
  *
@@ -269,6 +276,10 @@ mlx5_args_check(const char *key, const char *val, void 
*opaque)
}
if (strcmp(MLX5_RXQ_CQE_COMP_EN, key) == 0)
priv->cqe_comp = !!tmp;
+   else if (strcmp(MLX5_TXQ_INLINE, key) == 0)
+   priv->txq_inline = tmp;
+   else if (strcmp(MLX5_TXQS_MIN_INLINE, key) == 0)
+   priv->txqs_inline = tmp;
else {
WARN("%s: unknown parameter", key);
return EINVAL;
@@ -292,6 +303,8 @@ mlx5_args(struct priv *priv, struct rte_devargs *devargs)
 {
static const char *params[] = {
MLX5_RXQ_CQE_COMP_EN,
+   MLX5_TXQ_INLINE,
+   MLX5_TXQS_MIN_INLINE,
};
struct rte_kvargs *kvlist;
int ret = 0;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 8f5a6df..3a86609 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -113,6 +113,8 @@ struct priv {
unsigned int mps:1; /* Whether multi-packet send is supported. */
unsigned int cqe_comp:1; /* Whether CQE compression is enabled. */
unsigned int pending_alarm:1; /* An alarm is pending. */
+   unsigned int txq_inline; /* Maximum packet size for inlining. */
+   unsigned int txqs_inline; /* Queue number threshold for inlining. */
/* RX/TX queues. */
unsigned int rxqs_n; /* RX queues array size. */
unsigned int txqs_n; /* TX queues array size. */
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 4e125a7..a2bdc56 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1317,6 +1317,11 @@ void
 priv_select_tx_function(struct priv *priv)
 {
priv->dev->tx_pkt_burst = mlx5_tx_burst;
+   if (priv->txq_inline && (priv->txqs_n >= priv->txqs_inline)) {
+   priv->dev->tx_pkt_burst = mlx5_tx_burst_inline;
+   DEBUG("selected inline TX function (%u >= %u queues)",
+ priv->txqs_n, priv->txqs_inline);
+   }
 }

 /**
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index d56c9e9..43fe532 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -374,6 +374,139 @@ mlx5_wqe_write_vlan(struct txq *txq, volatile union 
mlx5_wqe *wqe,
 }

 /**
+ * Write an inline WQE.
+ *
+ * @param txq
+ *   Pointer to TX queue structure.
+ * @param wqe
+ *   Pointer to the WQE to fill.
+ * @param addr
+ *   Buffer data address.
+ * @param length
+ *   Packet length.
+ * @param lkey
+ *   Memory region lkey.
+ */
+static inline void
+mlx5_wqe_write_inline(struct txq *txq, volatile union m

[dpdk-dev] [PATCH v3 18/25] mlx5: add support for multi-packet send

2016-06-21 Thread Nelio Laranjeiro

This feature enables the TX burst function to emit up to 5 packets using
only two WQEs on devices that support it. This saves PCI bandwidth and
improves performance.

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
Signed-off-by: Olga Shern 
---
 doc/guides/nics/mlx5.rst   |  10 ++
 drivers/net/mlx5/mlx5.c|  14 +-
 drivers/net/mlx5/mlx5_ethdev.c |  15 +-
 drivers/net/mlx5/mlx5_rxtx.c   | 400 +
 drivers/net/mlx5/mlx5_rxtx.h   |   2 +
 drivers/net/mlx5/mlx5_txq.c|   2 +-
 6 files changed, 439 insertions(+), 4 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 9ada221..063c4a5 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -171,6 +171,16 @@ Run-time configuration

   This option should be used in combination with ``txq_inline`` above.

+- ``txq_mpw_en`` parameter [int]
+
+  A nonzero value enables multi-packet send. This feature allows the TX
+  burst function to pack up to five packets in two descriptors in order to
+  save PCI bandwidth and improve performance at the cost of a slightly
+  higher CPU usage.
+
+  It is currently only supported on the ConnectX-4 Lx family of adapters.
+  Enabled by default.
+
 Prerequisites
 -------------

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 8c8c5e4..b85030a 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -79,6 +79,9 @@
  * enabling inline send. */
 #define MLX5_TXQS_MIN_INLINE "txqs_min_inline"

+/* Device parameter to enable multi-packet send WQEs. */
+#define MLX5_TXQ_MPW_EN "txq_mpw_en"
+
 /**
  * Retrieve integer value from environment variable.
  *
@@ -280,6 +283,8 @@ mlx5_args_check(const char *key, const char *val, void 
*opaque)
priv->txq_inline = tmp;
else if (strcmp(MLX5_TXQS_MIN_INLINE, key) == 0)
priv->txqs_inline = tmp;
+   else if (strcmp(MLX5_TXQ_MPW_EN, key) == 0)
+   priv->mps = !!tmp;
else {
WARN("%s: unknown parameter", key);
return EINVAL;
@@ -305,6 +310,7 @@ mlx5_args(struct priv *priv, struct rte_devargs *devargs)
MLX5_RXQ_CQE_COMP_EN,
MLX5_TXQ_INLINE,
MLX5_TXQS_MIN_INLINE,
+   MLX5_TXQ_MPW_EN,
};
struct rte_kvargs *kvlist;
int ret = 0;
@@ -499,6 +505,7 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct 
rte_pci_device *pci_dev)
priv->port = port;
priv->pd = pd;
priv->mtu = ETHER_MTU;
+   priv->mps = mps; /* Enable MPW by default if supported. */
priv->cqe_comp = 1; /* Enable compression by default. */
err = mlx5_args(priv, pci_dev->devargs);
if (err) {
@@ -547,7 +554,12 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct 
rte_pci_device *pci_dev)

priv_get_num_vfs(priv, &num_vfs);
priv->sriov = (num_vfs || sriov);
-   priv->mps = mps;
+   if (priv->mps && !mps) {
+   ERROR("multi-packet send not supported on this device"
+ " (" MLX5_TXQ_MPW_EN ")");
+   err = ENOTSUP;
+   goto port_error;
+   }
/* Allocate and register default RSS hash keys. */
priv->rss_conf = rte_calloc(__func__, hash_rxq_init_n,
sizeof((*priv->rss_conf)[0]), 0);
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index a2bdc56..69bfe03 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -584,7 +584,8 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *info)
  DEV_RX_OFFLOAD_UDP_CKSUM |
  DEV_RX_OFFLOAD_TCP_CKSUM) :
 0);
-   info->tx_offload_capa = DEV_TX_OFFLOAD_VLAN_INSERT;
+   if (!priv->mps)
+   info->tx_offload_capa = DEV_TX_OFFLOAD_VLAN_INSERT;
if (priv->hw_csum)
info->tx_offload_capa |=
(DEV_TX_OFFLOAD_IPV4_CKSUM |
@@ -1317,7 +1318,17 @@ void
 priv_select_tx_function(struct priv *priv)
 {
priv->dev->tx_pkt_burst = mlx5_tx_burst;
-   if (priv->txq_inline && (priv->txqs_n >= priv->txqs_inline)) {
+   /* Display warning for unsupported configurations. */
+   if (priv->sriov && priv->mps)
+   WARN("multi-packet send WQE cannot be used on a SR-IOV setup");
+   /* Select appropriate TX function. */
+   if ((priv->sriov == 0) && priv->mps && priv->txq_inline) {
+   priv->dev->tx_pkt_burst = mlx5_tx_burst_mpw_inline;
+   DEBUG("selected MPW inline TX function");
+   } else if ((priv->sriov == 0) && priv->mps) {
+   priv->dev->tx_pkt_burst = mlx5_tx_burst_mpw;
+   DEBUG("selected MPW TX function");
+

[dpdk-dev] [PATCH v3 19/25] mlx5: add debugging information about Tx queues capabilities

2016-06-21 Thread Nelio Laranjeiro

From: Adrien Mazarguil 

Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/mlx5_txq.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index 4f17fb0..bae9f3d 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -343,6 +343,11 @@ txq_ctrl_setup(struct rte_eth_dev *dev, struct txq_ctrl 
*txq_ctrl,
  (void *)dev, strerror(ret));
goto error;
}
+   DEBUG("TX queue capabilities: max_send_wr=%u, max_send_sge=%u,"
+ " max_inline_data=%u",
+ attr.init.cap.max_send_wr,
+ attr.init.cap.max_send_sge,
+ attr.init.cap.max_inline_data);
attr.mod = (struct ibv_exp_qp_attr){
/* Move the QP to this state. */
.qp_state = IBV_QPS_INIT,
-- 
2.1.4

[dpdk-dev] [PATCH v3 20/25] mlx5: check remaining space while processing Tx burst

2016-06-21 Thread Nelio Laranjeiro

From: Adrien Mazarguil 

The space necessary to store segmented packets cannot be known in advance
and must be verified for each of them.
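
A minimal sketch of the per-packet check, assuming a ring whose size is a
power of two. ring_free() follows the computation of max in the diff below;
count_sendable() and the needed[] array are hypothetical and only show why
the test has to move inside the loop once packets can occupy more than one
entry. In this patch each packet still takes exactly one entry.

```c
#include <assert.h>
#include <stdint.h>

/* Free entries in a ring of elts_n slots, given head and tail indexes. */
static unsigned int
ring_free(unsigned int elts_n, uint16_t head, uint16_t tail)
{
    unsigned int max = elts_n - (head - tail);

    /* head wrapped around and is now behind tail. */
    if (max > elts_n)
        max -= elts_n;
    return max;
}

/*
 * Count how many packets fit when packet i consumes needed[i] entries.
 * One ring entry always remains unused, hence the "+ 1".
 */
static unsigned int
count_sendable(unsigned int max, const unsigned int *needed,
               unsigned int pkts_n)
{
    unsigned int i = 0;

    while (i != pkts_n) {
        if (max < needed[i] + 1)
            break;
        max -= needed[i];
        ++i;
    }
    return i;
}
```

Clamping the burst size once before the loop, as the removed code did, is
only correct when every packet costs the same number of entries.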

Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/mlx5_rxtx.c | 136 ++-
 1 file changed, 70 insertions(+), 66 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 2ee504d..7097713 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -583,50 +583,49 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
struct txq *txq = (struct txq *)dpdk_txq;
uint16_t elts_head = txq->elts_head;
const unsigned int elts_n = txq->elts_n;
-   unsigned int i;
+   unsigned int i = 0;
unsigned int max;
unsigned int comp;
volatile union mlx5_wqe *wqe;
-   struct rte_mbuf *buf;

if (unlikely(!pkts_n))
return 0;
-   buf = pkts[0];
/* Prefetch first packet cacheline. */
tx_prefetch_cqe(txq, txq->cq_ci);
tx_prefetch_cqe(txq, txq->cq_ci + 1);
-   rte_prefetch0(buf);
+   rte_prefetch0(*pkts);
/* Start processing. */
txq_complete(txq);
max = (elts_n - (elts_head - txq->elts_tail));
if (max > elts_n)
max -= elts_n;
-   assert(max >= 1);
-   assert(max <= elts_n);
-   /* Always leave one free entry in the ring. */
-   --max;
-   if (max == 0)
-   return 0;
-   if (max > pkts_n)
-   max = pkts_n;
-   for (i = 0; (i != max); ++i) {
-   unsigned int elts_head_next = (elts_head + 1) & (elts_n - 1);
+   do {
+   struct rte_mbuf *buf;
+   unsigned int elts_head_next;
uintptr_t addr;
uint32_t length;
uint32_t lkey;

+   /* Make sure there is enough room to store this packet and
+* that one ring entry remains unused. */
+   if (max < 1 + 1)
+   break;
+   --max;
+   --pkts_n;
+   buf = *(pkts++);
+   elts_head_next = (elts_head + 1) & (elts_n - 1);
wqe = &(*txq->wqes)[txq->wqe_ci & (txq->wqe_n - 1)];
rte_prefetch0(wqe);
-   if (i + 1 < max)
-   rte_prefetch0(pkts[i + 1]);
+   if (pkts_n)
+   rte_prefetch0(*pkts);
/* Retrieve buffer information. */
addr = rte_pktmbuf_mtod(buf, uintptr_t);
length = DATA_LEN(buf);
/* Update element. */
(*txq->elts)[elts_head] = buf;
/* Prefetch next buffer data. */
-   if (i + 1 < max)
-   rte_prefetch0(rte_pktmbuf_mtod(pkts[i + 1],
+   if (pkts_n)
+   rte_prefetch0(rte_pktmbuf_mtod(*pkts,
   volatile void *));
/* Retrieve Memory Region key for this memory pool. */
lkey = txq_mp2mr(txq, txq_mb2mp(buf));
@@ -649,8 +648,8 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
txq->stats.obytes += length;
 #endif
elts_head = elts_head_next;
-   buf = pkts[i + 1];
-   }
+   ++i;
+   } while (pkts_n);
/* Take a shortcut if nothing must be sent. */
if (unlikely(i == 0))
return 0;
@@ -693,44 +692,43 @@ mlx5_tx_burst_inline(void *dpdk_txq, struct rte_mbuf 
**pkts, uint16_t pkts_n)
struct txq *txq = (struct txq *)dpdk_txq;
uint16_t elts_head = txq->elts_head;
const unsigned int elts_n = txq->elts_n;
-   unsigned int i;
+   unsigned int i = 0;
unsigned int max;
unsigned int comp;
volatile union mlx5_wqe *wqe;
-   struct rte_mbuf *buf;
unsigned int max_inline = txq->max_inline;

if (unlikely(!pkts_n))
return 0;
-   buf = pkts[0];
/* Prefetch first packet cacheline. */
tx_prefetch_cqe(txq, txq->cq_ci);
tx_prefetch_cqe(txq, txq->cq_ci + 1);
-   rte_prefetch0(buf);
+   rte_prefetch0(*pkts);
/* Start processing. */
txq_complete(txq);
max = (elts_n - (elts_head - txq->elts_tail));
if (max > elts_n)
max -= elts_n;
-   assert(max >= 1);
-   assert(max <= elts_n);
-   /* Always leave one free entry in the ring. */
-   --max;
-   if (max == 0)
-   return 0;
-   if (max > pkts_n)
-   max = pkts_n;
-   for (i = 0; (i != max); ++i) {
-   unsigned int elts_head_next = (elts_head + 1) & (elts_n - 1);
+   do {
+   struct rte_mbuf *buf;
+   unsigned int elts_head_next;
uintptr_t addr;
uint32_t length;
uint32_t

[dpdk-dev] [PATCH v3 21/25] mlx5: resurrect Tx gather support

2016-06-21 Thread Nelio Laranjeiro

From: Adrien Mazarguil 

Compared to its previous incarnation, the software limit on the number of
mbuf segments is gone (previously MLX5_PMD_SGE_WR_N, set to 4 by default).
There is therefore no need for linearization code and the related buffers,
which permanently consumed a non-negligible amount of memory to handle
oversized mbufs.

The resulting code is both lighter and faster.

Signed-off-by: Adrien Mazarguil 
Signed-off-by: Nelio Laranjeiro 
---
 drivers/net/mlx5/mlx5_rxtx.c | 231 +--
 drivers/net/mlx5/mlx5_txq.c  |   6 +-
 2 files changed, 182 insertions(+), 55 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 7097713..db784c0 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -301,6 +301,7 @@ mlx5_wqe_write(struct txq *txq, volatile union mlx5_wqe 
*wqe,
 {
wqe->wqe.ctrl.data[0] = htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND);
wqe->wqe.ctrl.data[1] = htonl((txq->qp_num_8s) | 4);
+   wqe->wqe.ctrl.data[2] = 0;
wqe->wqe.ctrl.data[3] = 0;
wqe->inl.eseg.rsvd0 = 0;
wqe->inl.eseg.rsvd1 = 0;
@@ -346,6 +347,7 @@ mlx5_wqe_write_vlan(struct txq *txq, volatile union 
mlx5_wqe *wqe,

wqe->wqe.ctrl.data[0] = htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND);
wqe->wqe.ctrl.data[1] = htonl((txq->qp_num_8s) | 4);
+   wqe->wqe.ctrl.data[2] = 0;
wqe->wqe.ctrl.data[3] = 0;
wqe->inl.eseg.rsvd0 = 0;
wqe->inl.eseg.rsvd1 = 0;
@@ -423,6 +425,7 @@ mlx5_wqe_write_inline(struct txq *txq, volatile union 
mlx5_wqe *wqe,
assert(size < 64);
wqe->inl.ctrl.data[0] = htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND);
wqe->inl.ctrl.data[1] = htonl(txq->qp_num_8s | size);
+   wqe->inl.ctrl.data[2] = 0;
wqe->inl.ctrl.data[3] = 0;
wqe->inl.eseg.rsvd0 = 0;
wqe->inl.eseg.rsvd1 = 0;
@@ -496,6 +499,7 @@ mlx5_wqe_write_inline_vlan(struct txq *txq, volatile union 
mlx5_wqe *wqe,
assert(size < 64);
wqe->inl.ctrl.data[0] = htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND);
wqe->inl.ctrl.data[1] = htonl(txq->qp_num_8s | size);
+   wqe->inl.ctrl.data[2] = 0;
wqe->inl.ctrl.data[3] = 0;
wqe->inl.eseg.rsvd0 = 0;
wqe->inl.eseg.rsvd1 = 0;
@@ -584,6 +588,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
uint16_t elts_head = txq->elts_head;
const unsigned int elts_n = txq->elts_n;
unsigned int i = 0;
+   unsigned int j = 0;
unsigned int max;
unsigned int comp;
volatile union mlx5_wqe *wqe;
@@ -600,21 +605,25 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
if (max > elts_n)
max -= elts_n;
do {
-   struct rte_mbuf *buf;
+   struct rte_mbuf *buf = *(pkts++);
unsigned int elts_head_next;
uintptr_t addr;
uint32_t length;
uint32_t lkey;
+   unsigned int segs_n = buf->nb_segs;
+   volatile struct mlx5_wqe_data_seg *dseg;
+   unsigned int ds = sizeof(*wqe) / 16;

/* Make sure there is enough room to store this packet and
 * that one ring entry remains unused. */
-   if (max < 1 + 1)
+   assert(segs_n);
+   if (max < segs_n + 1)
break;
-   --max;
+   max -= segs_n;
--pkts_n;
-   buf = *(pkts++);
elts_head_next = (elts_head + 1) & (elts_n - 1);
wqe = &(*txq->wqes)[txq->wqe_ci & (txq->wqe_n - 1)];
+   dseg = &wqe->wqe.dseg;
rte_prefetch0(wqe);
if (pkts_n)
rte_prefetch0(*pkts);
@@ -634,7 +643,6 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
buf->vlan_tci);
else
mlx5_wqe_write(txq, wqe, addr, length, lkey);
-   wqe->wqe.ctrl.data[2] = 0;
/* Should we enable HW CKSUM offload */
if (buf->ol_flags &
(PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM | PKT_TX_UDP_CKSUM)) {
@@ -643,6 +651,35 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
MLX5_ETH_WQE_L4_CSUM;
} else
wqe->wqe.eseg.cs_flags = 0;
+   while (--segs_n) {
+   /* Spill on next WQE when the current one does not have
+* enough room left. Size of WQE must be a multiple
+* of data segment size. */
+   assert(!(sizeof(*wqe) % sizeof(*dseg)));
+   if (!(ds % (sizeof(*wqe) / 16)))
+   dseg = (volatile void *)
+   &(*

[dpdk-dev] [PATCH v3 22/25] mlx5: work around spurious compilation errors

2016-06-21 Thread Nelio Laranjeiro

From: Adrien Mazarguil 

Since commit "mlx5: resurrect Tx gather support", older GCC versions (such
as 4.8.5) may complain about the following:

 mlx5_rxtx.c: In function `mlx5_tx_burst':
 mlx5_rxtx.c:705:25: error: `wqe' may be used uninitialized in this
 function [-Werror=maybe-uninitialized]

 mlx5_rxtx.c: In function `mlx5_tx_burst_inline':
 mlx5_rxtx.c:864:25: error: `wqe' may be used uninitialized in this
 function [-Werror=maybe-uninitialized]

In both cases, this code cannot be reached when wqe is not initialized.

Considering older GCC versions are still widely used, work around this
issue by initializing wqe preemptively, even if it should not be necessary.
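
The pattern behind this kind of warning can be reproduced outside the driver.
The sketch below is a generic illustration, not mlx5 code: last_positive() is
a hypothetical function in which a pointer is only assigned inside a loop and
only dereferenced when a counter proves the loop assigned it. Some compilers
cannot follow that reasoning, and the preemptive NULL initialization is the
same kind of workaround as the one applied to wqe here.

```c
#include <assert.h>
#include <stddef.h>

/* Return the last strictly positive value in vals[], or -1 if none. */
static int
last_positive(const int *vals, size_t n)
{
    /* Preemptive initialization; logically unnecessary, see below. */
    const int *last = NULL;
    size_t hits = 0;
    size_t i;

    for (i = 0; i != n; ++i) {
        if (vals[i] > 0) {
            last = &vals[i];
            ++hits;
        }
    }
    /* When hits is zero, last was never assigned and is not used. */
    if (hits == 0)
        return -1;
    return *last;
}
```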

Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/mlx5_rxtx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index db784c0..2fc57dc 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -591,7 +591,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
unsigned int j = 0;
unsigned int max;
unsigned int comp;
-   volatile union mlx5_wqe *wqe;
+   volatile union mlx5_wqe *wqe = NULL;

if (unlikely(!pkts_n))
return 0;
@@ -733,7 +733,7 @@ mlx5_tx_burst_inline(void *dpdk_txq, struct rte_mbuf 
**pkts, uint16_t pkts_n)
unsigned int j = 0;
unsigned int max;
unsigned int comp;
-   volatile union mlx5_wqe *wqe;
+   volatile union mlx5_wqe *wqe = NULL;
unsigned int max_inline = txq->max_inline;

if (unlikely(!pkts_n))
-- 
2.1.4

[dpdk-dev] [PATCH v3 23/25] mlx5: remove redundant Rx queue initialization code

2016-06-21 Thread Nelio Laranjeiro

From: Adrien Mazarguil 

Toggling RX checksum offloads is already done at initialization time. This
code does not belong in rxq_rehash().

Signed-off-by: Adrien Mazarguil 
Signed-off-by: Nelio Laranjeiro 
---
 drivers/net/mlx5/mlx5_rxq.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 6881cdd..707296c 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -798,7 +798,6 @@ rxq_cleanup(struct rxq_ctrl *rxq_ctrl)
 int
 rxq_rehash(struct rte_eth_dev *dev, struct rxq_ctrl *rxq_ctrl)
 {
-   struct priv *priv = rxq_ctrl->priv;
struct rxq_ctrl tmpl = *rxq_ctrl;
unsigned int mbuf_n;
unsigned int desc_n;
@@ -811,15 +810,6 @@ rxq_rehash(struct rte_eth_dev *dev, struct rxq_ctrl 
*rxq_ctrl)
/* Number of descriptors and mbufs currently allocated. */
desc_n = tmpl.rxq.elts_n;
mbuf_n = desc_n;
-   /* Toggle RX checksum offload if hardware supports it. */
-   if (priv->hw_csum) {
-   tmpl.rxq.csum = !!dev->data->dev_conf.rxmode.hw_ip_checksum;
-   rxq_ctrl->rxq.csum = tmpl.rxq.csum;
-   }
-   if (priv->hw_csum_l2tun) {
-   tmpl.rxq.csum_l2tun = !!dev->data->dev_conf.rxmode.hw_ip_checksum;
-   rxq_ctrl->rxq.csum_l2tun = tmpl.rxq.csum_l2tun;
-   }
/* From now on, any failure will render the queue unusable.
 * Reinitialize WQ. */
mod = (struct ibv_exp_wq_attr){
-- 
2.1.4

[dpdk-dev] [PATCH v3 24/25] mlx5: make Rx queue reinitialization safer

2016-06-21 Thread Nelio Laranjeiro

From: Adrien Mazarguil 

The primary purpose of the rxq_rehash() function is to stop and restart
reception on a queue after re-posting buffers. This may fail if the array
that temporarily stores existing buffers for reuse cannot be allocated.

Update rxq_rehash() to work on the target queue directly (not through a
template copy) and avoid this allocation.

rxq_alloc_elts() is modified accordingly to take buffers from an existing
queue directly and update their refcount.

Unlike rxq_rehash(), rxq_setup() must work on a temporary structure but
should not allocate new mbufs from the pool while reinitializing an
existing queue. This is achieved by using the refcount-aware
rxq_alloc_elts() before overwriting queue data.

Signed-off-by: Adrien Mazarguil 
Signed-off-by: Vasily Philipov 
---
 drivers/net/mlx5/mlx5_rxq.c | 83 ++---
 1 file changed, 41 insertions(+), 42 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 707296c..0a3225e 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -642,7 +642,7 @@ priv_rehash_flows(struct priv *priv)
  */
 static int
 rxq_alloc_elts(struct rxq_ctrl *rxq_ctrl, unsigned int elts_n,
-  struct rte_mbuf **pool)
+  struct rte_mbuf *(*pool)[])
 {
unsigned int i;
int ret = 0;
@@ -654,9 +654,10 @@ rxq_alloc_elts(struct rxq_ctrl *rxq_ctrl, unsigned int 
elts_n,
&(*rxq_ctrl->rxq.wqes)[i];

if (pool != NULL) {
-   buf = *(pool++);
+   buf = (*pool)[i];
assert(buf != NULL);
rte_pktmbuf_reset(buf);
+   rte_pktmbuf_refcnt_update(buf, 1);
} else
buf = rte_pktmbuf_alloc(rxq_ctrl->rxq.mp);
if (buf == NULL) {
@@ -781,7 +782,7 @@ rxq_cleanup(struct rxq_ctrl *rxq_ctrl)
 }

 /**
- * Reconfigure a RX queue with new parameters.
+ * Reconfigure RX queue buffers.
  *
  * rxq_rehash() does not allocate mbufs, which, if not done from the right
  * thread (such as a control thread), may corrupt the pool.
@@ -798,67 +799,48 @@ rxq_cleanup(struct rxq_ctrl *rxq_ctrl)
 int
 rxq_rehash(struct rte_eth_dev *dev, struct rxq_ctrl *rxq_ctrl)
 {
-   struct rxq_ctrl tmpl = *rxq_ctrl;
-   unsigned int mbuf_n;
-   unsigned int desc_n;
-   struct rte_mbuf **pool;
-   unsigned int i, k;
+   unsigned int elts_n = rxq_ctrl->rxq.elts_n;
+   unsigned int i;
struct ibv_exp_wq_attr mod;
int err;

DEBUG("%p: rehashing queue %p", (void *)dev, (void *)rxq_ctrl);
-   /* Number of descriptors and mbufs currently allocated. */
-   desc_n = tmpl.rxq.elts_n;
-   mbuf_n = desc_n;
/* From now on, any failure will render the queue unusable.
 * Reinitialize WQ. */
mod = (struct ibv_exp_wq_attr){
.attr_mask = IBV_EXP_WQ_ATTR_STATE,
.wq_state = IBV_EXP_WQS_RESET,
};
-   err = ibv_exp_modify_wq(tmpl.wq, &mod);
+   err = ibv_exp_modify_wq(rxq_ctrl->wq, &mod);
if (err) {
ERROR("%p: cannot reset WQ: %s", (void *)dev, strerror(err));
assert(err > 0);
return err;
}
-   /* Allocate pool. */
-   pool = rte_malloc(__func__, (mbuf_n * sizeof(*pool)), 0);
-   if (pool == NULL) {
-   ERROR("%p: cannot allocate memory", (void *)dev);
-   return ENOBUFS;
-   }
/* Snatch mbufs from original queue. */
-   k = 0;
-   for (i = 0; (i != desc_n); ++i)
-   pool[k++] = (*rxq_ctrl->rxq.elts)[i];
-   assert(k == mbuf_n);
-   rte_free(pool);
+   claim_zero(rxq_alloc_elts(rxq_ctrl, elts_n, rxq_ctrl->rxq.elts));
+   for (i = 0; i != elts_n; ++i) {
+   struct rte_mbuf *buf = (*rxq_ctrl->rxq.elts)[i];
+
+   assert(rte_mbuf_refcnt_read(buf) == 2);
+   rte_pktmbuf_free_seg(buf);
+   }
/* Change queue state to ready. */
mod = (struct ibv_exp_wq_attr){
.attr_mask = IBV_EXP_WQ_ATTR_STATE,
.wq_state = IBV_EXP_WQS_RDY,
};
-   err = ibv_exp_modify_wq(tmpl.wq, &mod);
+   err = ibv_exp_modify_wq(rxq_ctrl->wq, &mod);
if (err) {
ERROR("%p: WQ state to IBV_EXP_WQS_RDY failed: %s",
  (void *)dev, strerror(err));
goto error;
}
-   /* Post SGEs. */
-   err = rxq_alloc_elts(&tmpl, desc_n, pool);
-   if (err) {
-   ERROR("%p: cannot reallocate WRs, aborting", (void *)dev);
-   rte_free(pool);
-   assert(err > 0);
-   return err;
-   }
/* Update doorbell counter. */
-   rxq_ctrl->rxq.rq_ci = desc_n;
+   rxq_ctrl->rxq.rq_ci = elts_n;
rte_wmb();
*rxq_ctrl->rxq.rq_db = htonl(rxq_ctrl->rxq.rq_ci);
 error:
-

[dpdk-dev] [PATCH v3 25/25] mlx5: resurrect Rx scatter support

2016-06-21 Thread Nelio Laranjeiro

From: Adrien Mazarguil 

This commit brings back Rx scatter and related support by the MTU update
function. The maximum number of segments per packet is no longer a fixed
value (previously MLX5_PMD_SGE_WR_N, set to 4 by default), as that caused
performance issues when fewer segments were actually needed, as well as
limitations on the maximum packet size that could be received with the
default mbuf size (supporting at most 8576 bytes).

These limitations are now lifted as the number of SGEs is derived from the
MTU (which implies MRU) at queue initialization and during MTU update.
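
As a rough illustration of that derivation (a sketch, not the driver's code):
the frame length is header + MTU + CRC, the mbuf headroom is added, and the
result is divided by the mbuf data room and rounded up to a power of two.
HDR_LEN, CRC_LEN, HEADROOM, ceil_log2() and sges_log2() are names and values
assumed for this example only.

```c
#include <assert.h>

/* Assumed values for illustration; DPDK defines its own constants. */
#define HDR_LEN   14u  /* Ethernet header */
#define CRC_LEN    4u  /* Ethernet CRC */
#define HEADROOM 128u  /* assumed mbuf headroom */

/* Smallest n such that (1 << n) >= v, for v >= 1 (hypothetical helper). */
static unsigned int ceil_log2(unsigned int v)
{
	unsigned int n = 0;

	assert(v >= 1);
	while ((1u << n) < v)
		++n;
	return n;
}

/* log2 of the number of segments needed to hold one full frame. */
static unsigned int sges_log2(unsigned int mtu, unsigned int mb_len)
{
	unsigned int size = HEADROOM + HDR_LEN + mtu + CRC_LEN;

	assert(mb_len > 0);
	/* Ceiling division, then round up to the next power of two. */
	return ceil_log2((size + mb_len - 1) / mb_len);
}
```

With a 2176-byte data room, an MTU of 1500 fits in a single segment, while an
MTU of 9000 needs 5 segments, which rounds up to 8.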

Signed-off-by: Adrien Mazarguil 
Signed-off-by: Vasily Philipov 
Signed-off-by: Nelio Laranjeiro 
---
 drivers/net/mlx5/mlx5_ethdev.c |  84 +
 drivers/net/mlx5/mlx5_rxq.c|  73 +-
 drivers/net/mlx5/mlx5_rxtx.c   | 139 -
 drivers/net/mlx5/mlx5_rxtx.h   |   1 +
 4 files changed, 215 insertions(+), 82 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 69bfe03..757f8e4 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -725,6 +725,9 @@ mlx5_dev_set_mtu(struct rte_eth_dev *dev, uint16_t mtu)
unsigned int i;
uint16_t (*rx_func)(void *, struct rte_mbuf **, uint16_t) =
mlx5_rx_burst;
+   unsigned int max_frame_len;
+   int rehash;
+   int restart = priv->started;

if (mlx5_is_secondary())
return -E_RTE_SECONDARY;
@@ -738,7 +741,6 @@ mlx5_dev_set_mtu(struct rte_eth_dev *dev, uint16_t mtu)
goto out;
} else
DEBUG("adapter port %u MTU set to %u", priv->port, mtu);
-   priv->mtu = mtu;
/* Temporarily replace RX handler with a fake one, assuming it has not
 * been copied elsewhere. */
dev->rx_pkt_burst = removed_rx_burst;
@@ -746,28 +748,88 @@ mlx5_dev_set_mtu(struct rte_eth_dev *dev, uint16_t mtu)
 * removed_rx_burst() instead. */
rte_wmb();
usleep(1000);
+   /* MTU does not include header and CRC. */
+   max_frame_len = ETHER_HDR_LEN + mtu + ETHER_CRC_LEN;
+   /* Check if at least one queue is going to need a SGE update. */
+   for (i = 0; i != priv->rxqs_n; ++i) {
+   struct rxq *rxq = (*priv->rxqs)[i];
+   unsigned int mb_len;
+   unsigned int size = RTE_PKTMBUF_HEADROOM + max_frame_len;
+   unsigned int sges_n;
+
+   if (rxq == NULL)
+   continue;
+   mb_len = rte_pktmbuf_data_room_size(rxq->mp);
+   assert(mb_len >= RTE_PKTMBUF_HEADROOM);
+   /* Determine the number of SGEs needed for a full packet
+* and round it to the next power of two. */
+   sges_n = log2above((size / mb_len) + !!(size % mb_len));
+   if (sges_n != rxq->sges_n)
+   break;
+   }
+   /* If all queues have the right number of SGEs, a simple rehash
+* of their buffers is enough, otherwise SGE information can only
+* be updated in a queue by recreating it. All resources that depend
+* on queues (flows, indirection tables) must be recreated as well in
+* that case. */
+   rehash = (i == priv->rxqs_n);
+   if (!rehash) {
+   /* Clean up everything as with mlx5_dev_stop(). */
+   priv_special_flow_disable_all(priv);
+   priv_mac_addrs_disable(priv);
+   priv_destroy_hash_rxqs(priv);
+   priv_fdir_disable(priv);
+   priv_dev_interrupt_handler_uninstall(priv, dev);
+   }
+recover:
/* Reconfigure each RX queue. */
for (i = 0; (i != priv->rxqs_n); ++i) {
struct rxq *rxq = (*priv->rxqs)[i];
-   unsigned int mb_len;
-   unsigned int max_frame_len;
+   struct rxq_ctrl *rxq_ctrl =
+   container_of(rxq, struct rxq_ctrl, rxq);
int sp;
+   unsigned int mb_len;
+   unsigned int tmp;

if (rxq == NULL)
continue;
-   /* Calculate new maximum frame length according to MTU and
-* toggle scattered support (sp) if necessary. */
-   max_frame_len = (priv->mtu + ETHER_HDR_LEN +
-(ETHER_MAX_VLAN_FRAME_LEN - ETHER_MAX_LEN));
mb_len = rte_pktmbuf_data_room_size(rxq->mp);
assert(mb_len >= RTE_PKTMBUF_HEADROOM);
+   /* Toggle scattered support (sp) if necessary. */
sp = (max_frame_len > (mb_len - RTE_PKTMBUF_HEADROOM));
-   if (sp) {
-   ERROR("%p: RX scatter is not supported", (void *)dev);
-   ret = ENOTSUP;
-   goto out;
+   /* Provide new values to rxq_setup(). */
+   dev->data->d

[dpdk-dev] enic in passhtrough mode tx drops

2016-06-21 Thread Ruth Christen

Just wanted to update that the traffic problem had a completely different
cause, not related to the VLAN priority as I thought.
It happened because I was changing the mbuf data offset and was missing this
patch: http://www.dpdk.org/ml/archives/dev/2016-February/033887.html

Thanks !

-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ruth Christen
Sent: Thursday, June 16, 2016 04:13 PM
To: dev at dpdk.org
Subject: [dpdk-dev] enic in passhtrough mode tx drops

Hi all,

I'm running a VM attached to 2 Cisco Virtual Card Interfaces in passthrough
mode in a Cisco UCS. The vNICs are configured in access mode without a VLAN ID.

The incoming packets arrive with an 802.1q header containing the VLAN priority
bits according to the class of service configured on the vNIC. I understood this
is expected from a fiber channel Ethernet card.

According to the DPDK documentation, the VLAN_STRIP_OFFLOAD flag needs to be
set and rte_eth_dev_set_vlan_offload called on the ports.

If I run a simple l2fwd application, where a packet received on one port
is sent out through the other, the traffic works OK.

If I generate the packets in my VM and send them out, the traffic doesn't work.
(I tried sending the traffic with and without an 802.1q header with priority bits.)

Is there a specific configuration to be added to the mbuf for the Tx packets
generated in the VM? Could vlan_tci, ol_flags, or some other flag be missing?

Does somebody know the exact behavior of the enic card with priority
tagging?

BTW, in virtio mode the traffic works in both flows.



Thanks a lot!

[dpdk-dev] [PATCH v6 1/4] lib/librte_ether: support device reset

2016-06-21 Thread Jerin Jacob

On Tue, Jun 21, 2016 at 06:14:29AM +, Lu, Wenzhuo wrote:
> Hi Jerin, Stephen,
> 
> 
> > -Original Message-
> > From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> > Sent: Tuesday, June 21, 2016 11:51 AM
> > To: Stephen Hemminger
> > Cc: Lu, Wenzhuo; dev at dpdk.org; Ananyev, Konstantin; Richardson, Bruce; 
> > Chen,
> > Jing D; Liang, Cunming; Wu, Jingjing; Zhang, Helin;
> > thomas.monjalon at 6wind.com
> > Subject: Re: [dpdk-dev] [PATCH v6 1/4] lib/librte_ether: support device 
> > reset
> > 
> > On Mon, Jun 20, 2016 at 09:17:14AM -0700, Stephen Hemminger wrote:
> > > On Mon, 20 Jun 2016 14:44:11 +0530
> > > Jerin Jacob  wrote:
> > >
> > > > On Mon, Jun 20, 2016 at 02:24:27PM +0800, Wenzhuo Lu wrote:
> > > > > Add an API to reset the device.
> > > > > It's for VF device in this scenario, kernel PF + DPDK VF.
> > > > > When the PF port down->up, APP should call this API to reset VF
> > > > > port. Most likely, APP should call it in its management thread and
> > > > > guarantee the thread safe. It means APP should stop the rx/tx and
> > > > > the device, then reset the device, then recover the device and
> > > > > rx/tx.
> > > >
> > > > Following is _a_ use-case for Device reset. But may be not be _the_
> > > > use case. IMO, We need to first say expected behavior of this API
> > > > and add a use-case later.
> > > >
> > > > Other use-case would be, PCIe VF with functional level reset for
> > > > SRIOV migration.
> > > > Are we on same page?
> > >
> > >
> > > In my experience with Linux devices, this is normally handled by the
> > > device driver in the start routine.  Since any use case which needs
> > > this is going to do a stop/reset/start sequence, why not just have the
> > > VF device driver do this in the start routine?.
> > >
> > > Adding yet another API and state transistion if not necessary
> > > increases the complexity and required test cases for all devices.
> > 
> > I agree with Stephen here.I think if application needs to call start after 
> > the
> > device reset then we could add this logic in start itself rather exposing a 
> > yet
> > another API
> Do you mean changing the device_start to include all these actions, stop 
> device -> stop queue -> re-setup queue -> start queue -> start device ?

What was the expected API call sequence when you introduced this API?

The point was to have an implicit device reset in the API call
sequence (wherever it makes sense for a specific PMD).

Jerin

> 
> > 
> > >

[dpdk-dev] [PATCH v3 00/25] Refactor mlx5 to improve performance

2016-06-21 Thread Yuanhan Liu

Hi,

Here is an off-topic comment: would you please add the following line to
the sendemail section of your git config file?

    chainreplyto = false

That would make it much easier for me to break up long threads in my client.
Otherwise it's hard for me to do, and your thread ends up occupying
several screens on my side.

It seems that Tetsuya also has the issue, thus CC'ed.

Thanks.

--yliu

[dpdk-dev] [PATCH v3 00/25] Refactor mlx5 to improve performance

2016-06-21 Thread Nélio Laranjeiro

On Tue, Jun 21, 2016 at 03:43:08PM +0800, Yuanhan Liu wrote:
> Hi,
> 
> Here is an off-topic comment: would you please add following line to
> the sendemail section of your git config file?
> 
> chainreplyto = false
> 
> That would let me to break the long threads in my client much easier.
> Otherwise, it's hard for me to do it, leading that your thread occupies
> several screens on my side.
> 
> It seems that Tetsuya also has the issue, thus CC'ed.
> 
> Thanks.
> 
>   --yliu

I already have it in my sendemail section (copied from the
http://dpdk.org/dev web page).

I was wondering why it did not split the patchset threads correctly.

I will try the "--no-chain-reply-to" command line option next
time.

Thanks.

-- 
Nélio Laranjeiro
6WIND

[dpdk-dev] [PATCH v3 00/25] Refactor mlx5 to improve performance

2016-06-21 Thread Yuanhan Liu

On Tue, Jun 21, 2016 at 10:00:34AM +0200, Nélio Laranjeiro wrote:
> On Tue, Jun 21, 2016 at 03:43:08PM +0800, Yuanhan Liu wrote:
> > Hi,
> > 
> > Here is an off-topic comment: would you please add following line to
> > the sendemail section of your git config file?
> > 
> > chainreplyto = false
> > 
> > That would let me to break the long threads in my client much easier.
> > Otherwise, it's hard for me to do it, leading that your thread occupies
> > several screens on my side.
> > 
> > It seems that Tetsuya also has the issue, thus CC'ed.
> > 
> > Thanks.
> > 
> > --yliu
> 
> I already have it in my sendemail section (copied from
> http://dpdk.org/dev web page).
> 
> I was wondering it did not split correctly the patchset threads.

No idea, and here is a blind guess: maybe you have a local git config
file that overrides the global options?

> I will try to use the command line "--no-chain-reply-to" option next
> time.

Thanks!

--yliu

[dpdk-dev] [PATCH 1/3] mk: fix librte_pipeline dependency list truncation

2016-06-21 Thread Panu Matilainen

In other libraries, the dependency list is always appended to, but
in commit 6cbf4f75e059 it is set with an assignment. This causes the
librte_eal dependency added in commit 6cbf4f75e059 to get discarded,
resulting in a missing dependency on librte_eal.

Fixes: b3688bee81a8 ("pipeline: new packet framework logic")
Fixes: 6cbf4f75e059 ("mk: fix missing internal dependencies")

Signed-off-by: Panu Matilainen 
---
 lib/librte_pipeline/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_pipeline/Makefile b/lib/librte_pipeline/Makefile
index 95387aa..a8f3128 100644
--- a/lib/librte_pipeline/Makefile
+++ b/lib/librte_pipeline/Makefile
@@ -53,7 +53,7 @@ SYMLINK-$(CONFIG_RTE_LIBRTE_PIPELINE)-include += 
rte_pipeline.h

 # this lib depends upon:
 DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_eal
-DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) := lib/librte_table
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_table
 DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_port

 include $(RTE_SDK)/mk/rte.lib.mk
-- 
2.5.5

[dpdk-dev] [PATCH 2/3] pdump: fix missing dependency on libpthread

2016-06-21 Thread Panu Matilainen

Fixes: 278f945402c5 ("pdump: add new library for packet capture")

Signed-off-by: Panu Matilainen 
---
 lib/librte_pdump/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/librte_pdump/Makefile b/lib/librte_pdump/Makefile
index af81a28..a506c4d 100644
--- a/lib/librte_pdump/Makefile
+++ b/lib/librte_pdump/Makefile
@@ -36,6 +36,7 @@ LIB = librte_pdump.a

 CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
 CFLAGS += -D_GNU_SOURCE
+LDLIBS += -lpthread

 EXPORT_MAP := rte_pdump_version.map

-- 
2.5.5

[dpdk-dev] [PATCH 3/3] mk: fail build on incomplete shared library dependencies

2016-06-21 Thread Panu Matilainen

Require all symbols used by a DSO to be resolvable via LDLIBS at
build-time. Previously it was possible to build a library with
incomplete dependencies which could then fail at run-time.

Signed-off-by: Panu Matilainen 
---
 mk/rte.lib.mk | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mk/rte.lib.mk b/mk/rte.lib.mk
index 1ff403f..f75ec10 100644
--- a/mk/rte.lib.mk
+++ b/mk/rte.lib.mk
@@ -94,7 +94,7 @@ O_TO_A_DO = @set -e; \
echo $(O_TO_A_CMD) > $(call exe2cmd,$(@))

 O_TO_S = $(LD) -L$(RTE_OUTPUT)/lib $(_CPU_LDFLAGS) $(EXTRA_LDFLAGS) \
- -shared $(OBJS-y) $(LDLIBS) -Wl,-soname,$(LIB) -o $(LIB)
+ -shared $(OBJS-y) -z defs $(LDLIBS) -Wl,-soname,$(LIB) -o $(LIB)
 O_TO_S_STR = $(subst ','\'',$(O_TO_S)) #'# fix syntax highlight
 O_TO_S_DISP = $(if $(V),"$(O_TO_S_STR)","  LD $(@)")
 O_TO_S_DO = @set -e; \
-- 
2.5.5

[dpdk-dev] [PATCH v6 1/4] lib/librte_ether: support device reset

2016-06-21 Thread Lu, Wenzhuo

Hi Jerin,

> -----Original Message-----
> From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> Sent: Tuesday, June 21, 2016 3:37 PM
> To: Lu, Wenzhuo
> Cc: Stephen Hemminger; dev at dpdk.org; Ananyev, Konstantin; Richardson,
> Bruce; Chen, Jing D; Liang, Cunming; Wu, Jingjing; Zhang, Helin;
> thomas.monjalon at 6wind.com
> Subject: Re: [dpdk-dev] [PATCH v6 1/4] lib/librte_ether: support device reset
> 
> On Tue, Jun 21, 2016 at 06:14:29AM +, Lu, Wenzhuo wrote:
> > Hi Jerin, Stephen,
> >
> >
> > > -Original Message-
> > > From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> > > Sent: Tuesday, June 21, 2016 11:51 AM
> > > To: Stephen Hemminger
> > > Cc: Lu, Wenzhuo; dev at dpdk.org; Ananyev, Konstantin; Richardson,
> > > Bruce; Chen, Jing D; Liang, Cunming; Wu, Jingjing; Zhang, Helin;
> > > thomas.monjalon at 6wind.com
> > > Subject: Re: [dpdk-dev] [PATCH v6 1/4] lib/librte_ether: support
> > > device reset
> > >
> > > On Mon, Jun 20, 2016 at 09:17:14AM -0700, Stephen Hemminger wrote:
> > > > On Mon, 20 Jun 2016 14:44:11 +0530 Jerin Jacob
> > > >  wrote:
> > > >
> > > > > On Mon, Jun 20, 2016 at 02:24:27PM +0800, Wenzhuo Lu wrote:
> > > > > > Add an API to reset the device.
> > > > > > It's for VF device in this scenario, kernel PF + DPDK VF.
> > > > > > When the PF port down->up, APP should call this API to reset
> > > > > > VF port. Most likely, APP should call it in its management
> > > > > > thread and guarantee the thread safe. It means APP should stop
> > > > > > the rx/tx and the device, then reset the device, then recover
> > > > > > the device and rx/tx.
> > > > >
> > > > > Following is _a_ use-case for Device reset. But may be not be
> > > > > _the_ use case. IMO, We need to first say expected behavior of
> > > > > this API and add a use-case later.
> > > > >
> > > > > Other use-case would be, PCIe VF with functional level reset for
> > > > > SRIOV migration.
> > > > > Are we on same page?
> > > >
> > > >
> > > > In my experience with Linux devices, this is normally handled by
> > > > the device driver in the start routine.  Since any use case which
> > > > needs this is going to do a stop/reset/start sequence, why not
> > > > just have the VF device driver do this in the start routine?.
> > > >
> > > > Adding yet another API and state transistion if not necessary
> > > > increases the complexity and required test cases for all devices.
> > >
> > > I agree with Stephen here.I think if application needs to call start
> > > after the device reset then we could add this logic in start itself
> > > rather exposing a yet another API
> > Do you mean changing the device_start to include all these actions, stop
> device -> stop queue -> re-setup queue -> start queue -> start device ?
> 
> What was the expected API call sequence when you were introduced this API?
> 
> Point was to have implicit device reset in the API call sequence(Wherever make
> sense for specific PMD)
I think the API call sequence depends on the implementation of the APP. Say
there is no reset API: the APP can use this API call sequence to handle the
PF link down/up event: rte_eth_dev_close -> rte_eth_rx_queue_setup ->
rte_eth_tx_queue_setup -> rte_eth_dev_start.
Actually, our purpose is to use this reset API instead of that API call
sequence. As you can see, the reset API is not strictly necessary; its benefit
is to save code in the APP.
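
The call sequence described above can be sketched roughly as follows. This is
only an illustration under stated assumptions: struct port_ops, recover_port()
and the stub_* callbacks are hypothetical names, and the stubs merely record
the call order instead of calling rte_eth_dev_close(),
rte_eth_rx_queue_setup(), rte_eth_tx_queue_setup() and rte_eth_dev_start().

```c
#include <assert.h>
#include <stddef.h>

/* Recovery steps, in the order given in the message above. */
enum step { STEP_CLOSE, STEP_RX_SETUP, STEP_TX_SETUP, STEP_START, STEP_MAX };

/* Hypothetical callback table; a real application would wrap the
 * corresponding rte_eth_* calls behind these pointers. */
struct port_ops {
	int (*step[STEP_MAX])(void *ctx);
};

/* Run every step in order and stop at the first failure. */
static int recover_port(const struct port_ops *ops, void *ctx)
{
	int i;

	assert(ops != NULL);
	for (i = 0; i < STEP_MAX; i++) {
		int ret = ops->step[i](ctx);

		if (ret != 0)
			return ret;
	}
	return 0;
}

/* Stand-in state: remembers which steps ran, and can force a failure. */
struct trace {
	enum step seen[STEP_MAX];
	unsigned int n;
	int fail_at; /* step that should fail, or -1 for none */
};

static int record(void *ctx, enum step s)
{
	struct trace *t = ctx;

	t->seen[t->n++] = s;
	return ((int)s == t->fail_at) ? -1 : 0;
}

static int stub_close(void *ctx) { return record(ctx, STEP_CLOSE); }
static int stub_rx_setup(void *ctx) { return record(ctx, STEP_RX_SETUP); }
static int stub_tx_setup(void *ctx) { return record(ctx, STEP_TX_SETUP); }
static int stub_start(void *ctx) { return record(ctx, STEP_START); }

static const struct port_ops stub_ops = {
	{ stub_close, stub_rx_setup, stub_tx_setup, stub_start }
};
```

Calling recover_port(&stub_ops, &trace) runs the four steps in order; the
application is still responsible for quiescing its rx/tx threads beforehand.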

> 
> Jerin
> 
> >
> > >
> > > >

[dpdk-dev] [PATCH v2] examples/ip_pipeline: fix build error for gcc 4.8

2016-06-21 Thread Daniel Mrzyglod

This patch fixes a maybe-uninitialized warning when compiling DPDK with GCC 4.8

examples/ip_pipeline/pipeline/pipeline_common_fe.c: In function 'app_pipeline_track_pktq_out_to_link':
examples/ip_pipeline/pipeline/pipeline_common_fe.c:66:31: error: 'reader' may be used uninitialized in this function [-Werror=maybe-uninitialized]

   struct app_pktq_out_params *pktq_out =

Fixes: 760064838ec0 ("examples/ip_pipeline: link routing output ports to devices")

Signed-off-by: Daniel Mrzyglod 
Acked-by: Cristian Dumitrescu 
---
 examples/ip_pipeline/app.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/examples/ip_pipeline/app.h b/examples/ip_pipeline/app.h
index 7611341..242dae8 100644
--- a/examples/ip_pipeline/app.h
+++ b/examples/ip_pipeline/app.h
@@ -667,11 +667,11 @@ app_swq_get_reader(struct app_params *app,
struct app_pktq_swq_params *swq,
uint32_t *pktq_in_id)
 {
-   struct app_pipeline_params *reader;
+   struct app_pipeline_params *reader = NULL;
uint32_t pos = swq - app->swq_params;
uint32_t n_pipelines = RTE_MIN(app->n_pipelines,
RTE_DIM(app->pipeline_params));
-   uint32_t n_readers = 0, id, i;
+   uint32_t n_readers = 0, id = 0, i;

for (i = 0; i < n_pipelines; i++) {
struct app_pipeline_params *p = &app->pipeline_params[i];
@@ -727,11 +727,11 @@ app_tm_get_reader(struct app_params *app,
struct app_pktq_tm_params *tm,
uint32_t *pktq_in_id)
 {
-   struct app_pipeline_params *reader;
+   struct app_pipeline_params *reader = NULL;
uint32_t pos = tm - app->tm_params;
uint32_t n_pipelines = RTE_MIN(app->n_pipelines,
RTE_DIM(app->pipeline_params));
-   uint32_t n_readers = 0, id, i;
+   uint32_t n_readers = 0, id = 0, i;

for (i = 0; i < n_pipelines; i++) {
struct app_pipeline_params *p = &app->pipeline_params[i];
-- 
2.5.5

[dpdk-dev] [PATCH] kni : fix build errors for gcc --version >= 6.1

2016-06-21 Thread Anupam Kapoor

This commit fixes build errors triggered by misleading indentation.

Fixes: 366113dbfb696 (e1000: suppress misleading indentation warning)


Signed-off-by: Anupam Kapoor 
---
 lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_phy.c | 12 
 lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_82599.c |  6 +++---
 2 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_phy.c 
b/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_phy.c
index df224702ed7d..26352da15101 100644
--- a/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_phy.c
+++ b/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_phy.c
@@ -3299,13 +3299,15 @@ s32 e1000_read_phy_reg_mphy(struct e1000_hw *hw, u32 
address, u32 *data)
return -E1000_ERR_PHY;
*data = E1000_READ_REG(hw, E1000_MPHY_DATA);

-   /* Disable access to mPHY if it was originally disabled */
-   if (locked)
+   /* Disable access to mPHY if it was originally enabled */
+   if (locked) {
ready = e1000_is_mphy_ready(hw);
if (!ready)
return -E1000_ERR_PHY;
+
E1000_WRITE_REG(hw, E1000_MPHY_ADDR_CTRL,
E1000_MPHY_DIS_ACCESS);
+   }

return E1000_SUCCESS;
 }
@@ -3364,13 +3366,15 @@ s32 e1000_write_phy_reg_mphy(struct e1000_hw *hw, u32 
address, u32 data,
return -E1000_ERR_PHY;
E1000_WRITE_REG(hw, E1000_MPHY_DATA, data);

-   /* Disable access to mPHY if it was originally disabled */
-   if (locked)
+   /* Disable access to mPHY if it was originally enabled */
+   if (locked) {
ready = e1000_is_mphy_ready(hw);
if (!ready)
return -E1000_ERR_PHY;
+
E1000_WRITE_REG(hw, E1000_MPHY_ADDR_CTRL,
E1000_MPHY_DIS_ACCESS);
+   }

return E1000_SUCCESS;
 }
diff --git a/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_82599.c 
b/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_82599.c
index 017dfe16c73f..dc2a4fb61c25 100644
--- a/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_82599.c
+++ b/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_82599.c
@@ -870,9 +870,9 @@ s32 ixgbe_setup_mac_link_82599(struct ixgbe_hw *hw,
if (speed & IXGBE_LINK_SPEED_10GB_FULL)
if (orig_autoc & IXGBE_AUTOC_KX4_SUPP)
autoc |= IXGBE_AUTOC_KX4_SUPP;
-   if ((orig_autoc & IXGBE_AUTOC_KR_SUPP) &&
-   (hw->phy.smart_speed_active == false))
-   autoc |= IXGBE_AUTOC_KR_SUPP;
+   if ((orig_autoc & IXGBE_AUTOC_KR_SUPP) &&
+   (hw->phy.smart_speed_active == false))
+   autoc |= IXGBE_AUTOC_KR_SUPP;
if (speed & IXGBE_LINK_SPEED_1GB_FULL)
autoc |= IXGBE_AUTOC_KX_SUPP;
} else if ((pma_pmd_1g == IXGBE_AUTOC_1G_SFI) &&
--
2.9.0

[dpdk-dev] random pkt generator PMD

2016-06-21 Thread Yerden Zhumabekov

I've developed a preliminary version of the driver. The code is
derived from the Null PMD, but required a lot of rework.

It uses the following devargs to generate packets:

1) edit=offset:size:[rnd|value]
 Edit a field within the mbuf packet data at the given offset and size.
Mark it as 'rnd' or assign it a hex value, for example:
'edit=8:16:rnd' marks the 16-byte field at offset 8 as randomly generated,
'edit=14:4:0xdeadbeef' assigns the specified sequence of bytes to the
field (network byte order).

2) tmpl=name
 Use the template with the given name. Instead of editing data manually,
specify a hard-coded template and then edit only the intended fields.
icmp4 and tcp4 are implemented so far; more templates are needed.

3) size=len
 Specify the packet size. It may not be less than the size of the template
(checked on devinit).
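
To make the 'edit' syntax concrete, here is a minimal parsing sketch. It is
not the driver's code: struct edit_arg and parse_edit() are hypothetical
names, the "edit=" key is assumed to be already stripped, and hex values are
assumed to fit in 8 bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed form of one "offset:size:[rnd|value]" argument. */
struct edit_arg {
	unsigned int offset;      /* byte offset into the packet data */
	unsigned int size;        /* field size in bytes */
	int is_random;            /* 1 when the field is marked 'rnd' */
	unsigned long long value; /* hex value otherwise (up to 8 bytes here) */
};

/* Parse "offset:size:rnd" or "offset:size:0x<hex>". Returns 0 on success. */
static int parse_edit(const char *s, struct edit_arg *out)
{
	int consumed = 0;
	char *end;

	assert(s != NULL && out != NULL);
	/* %n stores how many characters were read up to the second ':'. */
	if (sscanf(s, "%u:%u:%n", &out->offset, &out->size, &consumed) != 2 ||
	    consumed == 0 || out->size == 0)
		return -1;
	s += consumed;
	if (strcmp(s, "rnd") == 0) {
		out->is_random = 1;
		out->value = 0;
		return 0;
	}
	/* Anything else must be a hex literal with a 0x prefix. */
	if (strncmp(s, "0x", 2) != 0)
		return -1;
	out->is_random = 0;
	out->value = strtoull(s, &end, 16);
	if (end == s || *end != '\0')
		return -1;
	return 0;
}
```

For instance, parse_edit("14:4:0xdeadbeef", &arg) fills in offset 14, size 4
and value 0xdeadbeef, while a string without the third field is rejected.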

I ran testpmd (start/stop), then l2fwd, and it looks like it works, but I'd be
happy to hear about additional tests I need to run to ensure PMD
conformance.

With 64-byte packets and one 8-byte random field it reaches about 6-7 Mpps
now. I use rte_rand()/lrand48() as the source of random bytes, which hurts
performance, but I haven't come up with anything better.

On 15.06.2016 16:07, Bruce Richardson wrote:
> On Wed, Jun 15, 2016 at 04:03:59PM +0600, Yerden Zhumabekov wrote:
>>
>> Right, but development of various features regarding L3/L4 etc requires more
>> subtle approach, like live packets, different protocol versions, fields
>> manipulation. In this case some packet mangling/randomizing capabilities
>> would be quite useful. Something similar to what is done in Pktgen, but more
>> lightweight approach, in a same app.
>>
>> I've almost made my mind :) so the next question: is there any guide on PMD
>> dev? I'm looking through rte_ether.h right now, but some doc would be very
>> nice.
> Unfortunately not. My suggestion is to take one of the simple vdev's e.g. 
> ring,
> pcap, null, and work off a copy of it.
>
> /Bruce

[dpdk-dev] [PATCH v3 00/25] Refactor mlx5 to improve performance

2016-06-21 Thread Nélio Laranjeiro

On Tue, Jun 21, 2016 at 04:05:44PM +0800, Yuanhan Liu wrote:
> On Tue, Jun 21, 2016 at 10:00:34AM +0200, Nélio Laranjeiro wrote:
> > On Tue, Jun 21, 2016 at 03:43:08PM +0800, Yuanhan Liu wrote:
> > > Hi,
> > > 
> > > Here is an off-topic comment: would you please add following line to
> > > the sendemail section of your git config file?
> > > 
> > > chainreplyto = false
> > > 
> > > That would let me to break the long threads in my client much easier.
> > > Otherwise, it's hard for me to do it, leading that your thread occupies
> > > several screens on my side.
> > > 
> > > It seems that Tetsuya also has the issue, thus CC'ed.
> > > 
> > > Thanks.
> > > 
> > >   --yliu
> > 
> > I already have it in my sendemail section (copied from
> > http://dpdk.org/dev web page).
> > 
> > I was wondering it did not split correctly the patchset threads.
> 
> No idea, and here is a blind guess: maybe you have a local git config
> file that overwrites the globle options?

No, my local git/config does not have any sendemail section.  It worked
once, maybe an update of the package on my machine broke the script.

> > I will try to use the command line "--no-chain-reply-to" option next
> > time.
> 
> Thanks!
> 
>   --yliu

-- 
Nélio Laranjeiro
6WIND

[dpdk-dev] [PATCH v6 1/4] lib/librte_ether: support device reset

2016-06-21 Thread Jerin Jacob

On Tue, Jun 21, 2016 at 08:24:36AM +, Lu, Wenzhuo wrote:
> Hi Jerin,

Hi Wenzhuo,

> > > > > > On Mon, Jun 20, 2016 at 02:24:27PM +0800, Wenzhuo Lu wrote:
> > > > > > > Add an API to reset the device.
> > > > > > > It's for VF device in this scenario, kernel PF + DPDK VF.
> > > > > > > When the PF port down->up, APP should call this API to reset
> > > > > > > VF port. Most likely, APP should call it in its management
> > > > > > > thread and guarantee the thread safe. It means APP should stop
> > > > > > > the rx/tx and the device, then reset the device, then recover
> > > > > > > the device and rx/tx.
> > > > > >
> > > > > > Following is _a_ use-case for Device reset. But may be not be
> > > > > > _the_ use case. IMO, We need to first say expected behavior of
> > > > > > this API and add a use-case later.
> > > > > >
> > > > > > Other use-case would be, PCIe VF with functional level reset for
> > > > > > SRIOV migration.
> > > > > > Are we on same page?
> > > > >
> > > > >
> > > > > In my experience with Linux devices, this is normally handled by
> > > > > the device driver in the start routine.  Since any use case which
> > > > > needs this is going to do a stop/reset/start sequence, why not
> > > > > just have the VF device driver do this in the start routine?.
> > > > >
> > > > > Adding yet another API and state transistion if not necessary
> > > > > increases the complexity and required test cases for all devices.
> > > >
> > > > I agree with Stephen here.I think if application needs to call start
> > > > after the device reset then we could add this logic in start itself
> > > > rather exposing a yet another API
> > > Do you mean changing the device_start to include all these actions, stop
> > device -> stop queue -> re-setup queue -> start queue -> start device ?
> > 
> > What was the expected API call sequence when you were introduced this API?
> > 
> > Point was to have implicit device reset in the API call sequence(Wherever 
> > make
> > sense for specific PMD)
> I think the API call sequence depends on the implementation of the APP. Let's 
> say if there's not this reset API, APP can use this API call sequence to 
> handle the PF link down/up event, rte_eth_dev_close -> rte_eth_rx_queue_setup 
> -> rte_eth_tx_queue_setup -> rte_eth_dev_start. 
> Actually our purpose is to use this reset API instead of the API call 
> sequence. You can see the reset API is not necessary. The benefit is to save 
> the code for APP.

Then I am a bit confused by the original commit log description:
|
|It means APP should stop the rx/tx and the device, then reset the
|device, then recover the device and rx/tx.
|
I was under the impression that it is a low-level reset API for this device.
Isn't it?

The other issue is the generalized outlook of the API. Will certain PMDs not
have a PF link down/up event? Link down/up may be connected only to the VF,
with the PF used only for configuration.

How about fixing it more transparently in the PMD driver itself? The PMD
driver knows about the PF link up/down event, so is it possible to recover
the VF on that event if it is only a matter of resetting it?

Jerin

[dpdk-dev] supported packet types

2016-06-21 Thread Olivier Matz

Hi Konstantin,

On 06/16/2016 01:29 PM, Ananyev, Konstantin wrote:
>>>> I suggest instead to set the ptype
>>>> in an opportunistic fashion instead:
>>>> - if the driver/hw knows the ptype, set it
>>>> - else, set it to unknown
>>>
>>> That's what PMD does now... and I don't think it can do much more -
>>> only interpret information provided by the HW in a proper way.
>>> Probably I misunderstood you here...
>>
>> My suggestion was to remove get_supported_ptypes an set the ptype in
>> mbuf when the pmd recognize a type.
>>
>> But we could also keep get_supported_ptypes() for ptypes that will
>> always be recognized, allowing a PMD to set any other ptype if it
>> is recognized. This is probably what we have today in some PMDs, I
>> think it just requires some clarification.
> 
> Yes, +1 to the second option.

What about the following API comment?

'''
Retrieve the supported packet types of an Ethernet device.

When a packet type is announced as supported, it *must* be recognized by
the PMD. For instance, if RTE_PTYPE_L2_ETHER, RTE_PTYPE_L2_ETHER_VLAN
and RTE_PTYPE_L3_IPV4 are announced, the PMD must return the following
packet types for these packets:
- Ether/IPv4                 -> RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4
- Ether/Vlan/IPv4            -> RTE_PTYPE_L2_ETHER_VLAN | RTE_PTYPE_L3_IPV4
- Ether/[anything else]      -> RTE_PTYPE_L2_ETHER
- Ether/Vlan/[anything else] -> RTE_PTYPE_L2_ETHER_VLAN

When a packet is received by a PMD, the most precise type must be
returned among the ones supported. However, a PMD is allowed to set a
packet type that is not in the supported list, on the condition that it
is more precise. Therefore, a PMD announcing no supported packet types
can still set a matching packet type in a received packet.
'''
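
The mandatory part of the rule in the comment above can be sketched as a small
classifier. This is only a model under stated assumptions: the PT_* constants,
struct ptype_support and classify() are hypothetical stand-ins (not the
RTE_PTYPE_* definitions), every packet is assumed to be Ethernet, and the
optional "more precise than announced" case is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for packet type values; the numbers are assumed. */
enum {
	PT_UNKNOWN       = 0x00,
	PT_L2_ETHER      = 0x01,
	PT_L2_ETHER_VLAN = 0x06,
	PT_L3_IPV4       = 0x10
};

/* Which packet types the (hypothetical) PMD announces as supported. */
struct ptype_support {
	int l2_ether;
	int l2_ether_vlan;
	int l3_ipv4;
};

/* Return the most precise announced type for an Ethernet packet. */
static unsigned int classify(const struct ptype_support *sup,
			     int has_vlan, int is_ipv4)
{
	unsigned int pt = PT_UNKNOWN;

	assert(sup != NULL);
	/* L2: prefer the VLAN variant when it applies and is announced. */
	if (has_vlan && sup->l2_ether_vlan)
		pt |= PT_L2_ETHER_VLAN;
	else if (sup->l2_ether)
		pt |= PT_L2_ETHER;
	/* L3: only reported when recognized and announced. */
	if (is_ipv4 && sup->l3_ipv4)
		pt |= PT_L3_IPV4;
	return pt;
}
```

With all three types announced, classify() reproduces the four mappings listed
in the comment; with nothing announced it falls back to PT_UNKNOWN.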

If it's fine I'll submit it as a patch.

Regards,
Olivier

[dpdk-dev] [PATCH v4 2/3] bnx2x: enhance stats get

2016-06-21 Thread Remy Horton

Morning,

On 11/05/2016 01:58, Rasesh Mody wrote:
[..]
>>> We shall split this patch into an enhancement and a bug fix.
>>
>> Keep in mind that the xstats API is changing so that stats_get() no
>> longer includes strings:
>>
>> http://thread.gmane.org/gmane.comp.networking.dpdk.devel/37079
>> http://thread.gmane.org/gmane.comp.networking.dpdk.devel/37571
>>
>> ..Remy
>
> Sure, made a note of it. Do we know when will these changes be picked
> up? We can incorporate the related changes into our patches if the
> patches are about to be accepted.
>

Quick heads-up - the XStats changes were merged into master a few days ago.

Regards,

..Remy

[dpdk-dev] [PATCH / RFC] sched: Correct subport calcuation

2016-06-21 Thread Dumitrescu, Cristian

Hi Simon,

I am going to take a look at it this week and come back to you.

Thanks,
Cristian

> -----Original Message-----
> From: Simon Kågström [mailto:simon.kagstrom at netinsight.net]
> Sent: Tuesday, June 21, 2016 7:41 AM
> To: Dumitrescu, Cristian ;
> stephen at networkplumber.org; dev at dpdk.org;
> thomas.monjalon at 6wind.com
> Subject: Re: [dpdk-dev] [PATCH / RFC] sched: Correct subport calcuation
> 
> Hi again!
> 
> Any news about this patch? I'm off for parental leave starting next week
> (until january), so any comments (or simply dropping it!) would be good
> to have before that :-)
> 
> // Simon
> 
> On 2016-06-10 08:29, Simon Kagstrom wrote:
> > Signed-off-by: Simon Kagstrom 
> > ---
> > I'm a total newbie to the rte_sched design and implementation, so I've
> > added the RFC.
> >
> > We get crashes (at other places in the scheduler) without this code.
> >
> >  lib/librte_sched/rte_sched.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
> > index 1609ea8..b46ecfb 100644
> > --- a/lib/librte_sched/rte_sched.c
> > +++ b/lib/librte_sched/rte_sched.c
> > @@ -1869,7 +1869,7 @@ grinder_next_pipe(struct rte_sched_port *port,
> uint32_t pos)
> >
> > /* Install new pipe in the grinder */
> > grinder->pindex = pipe_qindex >> 4;
> > -   grinder->subport = port->subport + (grinder->pindex / port-
> >n_pipes_per_subport);
> > +   grinder->subport = port->subport + (grinder->pindex / port-
> >n_subports_per_port);
> > grinder->pipe = port->pipe + grinder->pindex;
> > grinder->pipe_params = NULL; /* to be set after the pipe structure is
> prefetched */
> > grinder->productive = 0;
> >

[dpdk-dev] [PATCH v3] qat: fix for VFs not getting recognized

2016-06-21 Thread Deepak Kumar Jain

From: "Jain, Deepak K" 

Updated the code to use RTE_PCI_DEVICE.

Fixes: 701c8d80c820 ("pci: support class id probing")

Signed-off-by: Jain, Deepak K 
---
v3: kept PCI id in the driver file

v2: updated code to use RTE_PCI_DEVICE

 drivers/crypto/qat/rte_qat_cryptodev.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/crypto/qat/rte_qat_cryptodev.c 
b/drivers/crypto/qat/rte_qat_cryptodev.c
index a7912f5..f46ec85 100644
--- a/drivers/crypto/qat/rte_qat_cryptodev.c
+++ b/drivers/crypto/qat/rte_qat_cryptodev.c
@@ -69,10 +69,7 @@ static struct rte_cryptodev_ops crypto_qat_ops = {

 static struct rte_pci_id pci_id_qat_map[] = {
{
-   .vendor_id = 0x8086,
-   .device_id = 0x0443,
-   .subsystem_vendor_id = PCI_ANY_ID,
-   .subsystem_device_id = PCI_ANY_ID
+   RTE_PCI_DEVICE(0x8086, 0x0443),
},
{.device_id = 0},
 };
-- 
2.5.5
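
A possible reading of why the macro matters, sketched with stand-in types rather than the real rte_pci definitions: once the probe code also compares a class id, a table entry that leaves that field zero-initialized no longer matches, while an entry built through a macro that sets the class wildcard does. All DEMO_/demo_ names and the wildcard values below are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_ANY_ID        0xffffu    /* stand-in 16-bit wildcard */
#define DEMO_CLASS_ANY_ID  0xffffffu  /* stand-in class wildcard */

struct demo_pci_id {
	uint32_t class_id;
	uint16_t vendor_id;
	uint16_t device_id;
};

/* Stand-in for a helper macro that fills in the class wildcard so that
 * driver tables do not have to spell it out. */
#define DEMO_PCI_DEVICE(vend, dev) \
	.class_id = DEMO_CLASS_ANY_ID, \
	.vendor_id = (vend), \
	.device_id = (dev)

/* Return 1 if the table entry matches the probed device, 0 otherwise. */
int
demo_pci_match(const struct demo_pci_id *entry, const struct demo_pci_id *dev)
{
	assert(entry != NULL && dev != NULL);

	if (entry->vendor_id != dev->vendor_id &&
	    entry->vendor_id != DEMO_ANY_ID)
		return 0;
	if (entry->device_id != dev->device_id &&
	    entry->device_id != DEMO_ANY_ID)
		return 0;
	/* An entry with class_id left at 0 fails here for any real device. */
	if (entry->class_id != dev->class_id &&
	    entry->class_id != DEMO_CLASS_ANY_ID)
		return 0;
	return 1;
}
```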

[dpdk-dev] supported packet types

2016-06-21 Thread Ananyev, Konstantin

Hi Olivier,

> 
> Hi Konstantin,
> 
> On 06/16/2016 01:29 PM, Ananyev, Konstantin wrote:
> >>>> I suggest instead to set the ptype
> >>>> in an opportunistic fashion instead:
> >>>> - if the driver/hw knows the ptype, set it
> >>>> - else, set it to unknown
> >>>
> >>> That's what PMD does now... and I don't think it can do much more -
> >>> only interpret information provided by the HW in a proper way.
> >>> Probably I misunderstood you here...
> >>
> >> My suggestion was to remove get_supported_ptypes and set the ptype in
> >> mbuf when the PMD recognizes a type.
> >>
> >> But we could also keep get_supported_ptypes() for ptypes that will
> >> always be recognized, allowing a PMD to set any other ptype if it
> >> is recognized. This is probably what we have today in some PMDs, I
> >> think it just requires some clarification.
> >
> > Yes, +1 to the second option.
> 
> What about the following API comment?
> 
> '''
> Retrieve the supported packet types of an Ethernet device.
> 
> When a packet type is announced as supported, it *must* be recognized by
> the PMD. For instance, if RTE_PTYPE_L2_ETHER, RTE_PTYPE_L2_ETHER_VLAN
> and RTE_PTYPE_L3_IPV4 are announced, the PMD must return the following
> packet types for these packets:
> - Ether/IPv4  -> RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4
> - Ether/Vlan/IPv4 -> RTE_PTYPE_L2_ETHER_VLAN | RTE_PTYPE_L3_IPV4
> - Ether/   -> RTE_PTYPE_L2_ETHER
> - Ether/Vlan/ -> RTE_PTYPE_L2_ETHER_VLAN
> 
> When a packet is received by a PMD, the most precise type must be
> returned among the ones supported. However, a PMD is allowed to set a
> packet type that is not in the supported list, on the condition that it
> is more precise. Therefore, a PMD announcing no supported packet types
> can still set a matching packet type in a received packet.
> '''
> 
> If it's fine I'll submit it as a patch.

Yep, looks good to me.
Thanks
Konstantin
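
The mapping in the proposed API comment can be sketched as a small classifier. This is only an illustration: the DEMO_ flag values are local stand-ins (real code would use the RTE_PTYPE_* definitions from rte_mbuf.h), and demo_classify() is a hypothetical helper, not a PMD function.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in flag values for illustration only. */
#define DEMO_PTYPE_L2_ETHER       0x00000001u
#define DEMO_PTYPE_L2_ETHER_VLAN  0x00000006u
#define DEMO_PTYPE_L3_IPV4        0x00000010u

#define DEMO_ETHERTYPE_IPV4       0x0800u

/* Return the most precise type among the three announced ones:
 * the L2 part is always known, the L3 part is only set for IPv4. */
uint32_t
demo_classify(int has_vlan, uint16_t payload_ethertype)
{
	uint32_t ptype;

	assert(has_vlan == 0 || has_vlan == 1);

	ptype = has_vlan ? DEMO_PTYPE_L2_ETHER_VLAN : DEMO_PTYPE_L2_ETHER;
	if (payload_ethertype == DEMO_ETHERTYPE_IPV4)
		ptype |= DEMO_PTYPE_L3_IPV4;
	return ptype;
}
```

An Ether/IPv4 frame yields the L2_ETHER and L3_IPV4 flags together, while an Ether frame carrying anything else yields only the L2 flag, matching the four cases listed in the comment.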

[dpdk-dev] [PATCH v6 1/4] lib/librte_ether: support device reset

2016-06-21 Thread Ananyev, Konstantin

Hi Jerin,

> -Original Message-
> From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> Sent: Tuesday, June 21, 2016 9:56 AM
> To: Lu, Wenzhuo
> Cc: Stephen Hemminger; dev at dpdk.org; Ananyev, Konstantin; Richardson, 
> Bruce; Chen, Jing D; Liang, Cunming; Wu, Jingjing; Zhang,
> Helin; thomas.monjalon at 6wind.com
> Subject: Re: [dpdk-dev] [PATCH v6 1/4] lib/librte_ether: support device reset
> 
> On Tue, Jun 21, 2016 at 08:24:36AM +, Lu, Wenzhuo wrote:
> > Hi Jerin,
> 
> Hi Wenzhuo,
> 
> > > > > > > On Mon, Jun 20, 2016 at 02:24:27PM +0800, Wenzhuo Lu wrote:
> > > > > > > > Add an API to reset the device.
> > > > > > > > It's for VF device in this scenario, kernel PF + DPDK VF.
> > > > > > > > When the PF port down->up, APP should call this API to reset
> > > > > > > > VF port. Most likely, APP should call it in its management
> > > > > > > > thread and guarantee the thread safe. It means APP should stop
> > > > > > > > the rx/tx and the device, then reset the device, then recover
> > > > > > > > the device and rx/tx.
> > > > > > >
> > > > > > > Following is _a_ use-case for Device reset. But may be not be
> > > > > > > _the_ use case. IMO, We need to first say expected behavior of
> > > > > > > this API and add a use-case later.
> > > > > > >
> > > > > > > Other use-case would be, PCIe VF with functional level reset for
> > > > > > > SRIOV migration.
> > > > > > > Are we on same page?
> > > > > >
> > > > > >
> > > > > > In my experience with Linux devices, this is normally handled by
> > > > > > the device driver in the start routine.  Since any use case which
> > > > > > needs this is going to do a stop/reset/start sequence, why not
> > > > > > just have the VF device driver do this in the start routine?.
> > > > > >
> > > > > > > Adding yet another API and state transition, if not necessary,
> > > > > > > increases the complexity and required test cases for all devices.
> > > > >
> > > > > > I agree with Stephen here. I think if the application needs to call start
> > > > > > after the device reset then we could add this logic in start itself
> > > > > > rather than exposing yet another API
> > > > Do you mean changing the device_start to include all these actions, stop
> > > device -> stop queue -> re-setup queue -> start queue -> start device ?
> > >
> > > > What was the expected API call sequence when you introduced this API?
> > >
> > > > Point was to have implicit device reset in the API call sequence (wherever it
> > > > makes sense for a specific PMD)
> > I think the API call sequence depends on the implementation of the APP.
> > Let's say there is no reset API: the APP can use this API call sequence
> > to handle the PF link down/up event: rte_eth_dev_close ->
> > rte_eth_rx_queue_setup -> rte_eth_tx_queue_setup -> rte_eth_dev_start.
> > Actually our purpose is to use this reset API instead of that API call
> > sequence. You can see the reset API is not strictly necessary. The benefit
> > is to save code in the APP.
> 
> Then I am bit confused with original commit log description.
> |
> |It means APP should stop the rx/tx and the device, then reset the
> |device, then recover the device and rx/tx.
> |
> I was under the impression that it is a low-level reset API for this
> device, isn't it?
> 
> The other issue is generalized outlook of the API, Certain PMD will not
> have PF link down/up event? Link down/up and only connected to VF and PF
> only for configuration.
> 
> How about fixing it more transparently in PMD driver itself as
> PMD driver knows the PF link up/down event, Is it possible to
> recover the VF on that event if its only matter of resetting it?

I think we already went through that discussion on the list.
Unfortunately with current dpdk design it is hardly possible.
To achieve that we need to introduce some sort of synchronisation
between IO and control APIs (locking or so).
Actually I am not sure why having a special reset function will be a problem.
Yes, it would exist only for VFs, for PF it could be left unimplemented.
Though it definitely seems more convenient from user point of view,
they would know: to handle VF reset event, they just need to call that
particular function, not to re-implement their own.

Konstantin

> 
> Jerin
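
To make the trade-off in this thread concrete, here is a rough sketch of the call sequence quoted above (close, rx queue setup, tx queue setup, start) wrapped in one helper. The demo_ functions are stubs standing in for the rte_eth_* calls named in the thread; they only record the order of calls. The sketch does not show the real reset API and does not address the IO/control synchronisation Konstantin mentions.

```c
#include <assert.h>
#include <stddef.h>

enum demo_step { DEMO_CLOSE, DEMO_RX_SETUP, DEMO_TX_SETUP, DEMO_START };

struct demo_port {
	enum demo_step log[8]; /* order in which the stubs were called */
	size_t n_steps;
};

static int
demo_record(struct demo_port *p, enum demo_step s)
{
	assert(p != NULL);
	if (p->n_steps >= sizeof(p->log) / sizeof(p->log[0]))
		return -1;
	p->log[p->n_steps++] = s;
	return 0;
}

/* Stubs standing in for rte_eth_dev_close(), rte_eth_rx_queue_setup(),
 * rte_eth_tx_queue_setup() and rte_eth_dev_start(). */
int demo_dev_close(struct demo_port *p)      { return demo_record(p, DEMO_CLOSE); }
int demo_rx_queue_setup(struct demo_port *p) { return demo_record(p, DEMO_RX_SETUP); }
int demo_tx_queue_setup(struct demo_port *p) { return demo_record(p, DEMO_TX_SETUP); }
int demo_dev_start(struct demo_port *p)      { return demo_record(p, DEMO_START); }

/* What the application would otherwise write by hand on a PF down/up
 * event. The caller is assumed to have stopped rx/tx on this port. */
int
demo_port_reset(struct demo_port *p)
{
	int ret;

	ret = demo_dev_close(p);
	if (ret != 0)
		return ret;
	ret = demo_rx_queue_setup(p);
	if (ret != 0)
		return ret;
	ret = demo_tx_queue_setup(p);
	if (ret != 0)
		return ret;
	return demo_dev_start(p);
}
```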

[dpdk-dev] [PATCH v3 1/2] mempool: add stack (lifo) mempool handler

2016-06-21 Thread Ananyev, Konstantin



> -Original Message-
> From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> Sent: Tuesday, June 21, 2016 4:35 AM
> To: Ananyev, Konstantin
> Cc: Thomas Monjalon; dev at dpdk.org; Hunt, David; olivier.matz at 6wind.com; 
> viktorin at rehivetech.com; shreyansh.jain at nxp.com
> Subject: Re: [dpdk-dev] [PATCH v3 1/2] mempool: add stack (lifo) mempool 
> handler
> 
> On Mon, Jun 20, 2016 at 05:56:40PM +, Ananyev, Konstantin wrote:
> >
> >
> > > -Original Message-
> > > From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> > > Sent: Monday, June 20, 2016 3:22 PM
> > > To: Ananyev, Konstantin
> > > Cc: Thomas Monjalon; dev at dpdk.org; Hunt, David; olivier.matz at 
> > > 6wind.com; viktorin at rehivetech.com; shreyansh.jain at nxp.com
> > > Subject: Re: [dpdk-dev] [PATCH v3 1/2] mempool: add stack (lifo) mempool 
> > > handler
> > >
> > > On Mon, Jun 20, 2016 at 01:58:04PM +, Ananyev, Konstantin wrote:
> > > >
> > > >
> > > > > -Original Message-
> > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas 
> > > > > Monjalon
> > > > > Sent: Monday, June 20, 2016 2:54 PM
> > > > > To: Jerin Jacob
> > > > > Cc: dev at dpdk.org; Hunt, David; olivier.matz at 6wind.com; viktorin 
> > > > > at rehivetech.com; shreyansh.jain at nxp.com
> > > > > Subject: Re: [dpdk-dev] [PATCH v3 1/2] mempool: add stack (lifo) 
> > > > > mempool handler
> > > > >
> > > > > 2016-06-20 18:55, Jerin Jacob:
> > > > > > On Mon, Jun 20, 2016 at 02:08:10PM +0100, David Hunt wrote:
> > > > > > > This is a mempool handler that is useful for pipelining apps, 
> > > > > > > where
> > > > > > > the mempool cache doesn't really work - example, where we have one
> > > > > > > core doing rx (and alloc), and another core doing Tx (and 
> > > > > > > return). In
> > > > > > > such a case, the mempool ring simply cycles through all the mbufs,
> > > > > > > resulting in a LLC miss on every mbuf allocated when the number of
> > > > > > > mbufs is large. A stack recycles buffers more effectively in this
> > > > > > > case.
> > > > > > >
> > > > > > > Signed-off-by: David Hunt 
> > > > > > > ---
> > > > > > >  lib/librte_mempool/Makefile|   1 +
> > > > > > >  lib/librte_mempool/rte_mempool_stack.c | 145 
> > > > > > > +
> > > > > >
> > > > > > How about moving new mempool handlers to drivers/mempool? (or 
> > > > > > similar).
> > > > > > In future, adding HW specific handlers in lib/librte_mempool/ may 
> > > > > > be bad idea.
> > > > >
> > > > > You're probably right.
> > > > > However we need to check and understand what a HW mempool handler 
> > > > > will be.
> > > > > I imagine the first of them will have to move handlers in drivers/
> > > >
> > > > Does it mean we'll have to move mbuf into drivers too?
> > > > Again other libs do use mempool too.
> > > > Why not just lib/librte_mempool/arch/
> > > > ?
> > >
> > > I was proposing only to move only the new
> > > handler(lib/librte_mempool/rte_mempool_stack.c). Not any library or any
> > > other common code.
> > >
> > > Just like DPDK crypto device, Even if it is software implementation its
> > > better to move in driver/crypto instead of lib/librte_cryptodev
> > >
> > > "lib/librte_mempool/arch/" is not correct place as it is platform specific
> > > not architecture specific and HW mempool device may be PCIe or platform
> > > device.
> >
> > Ok, but why rte_mempool_stack.c has to be moved?
> 
> Just thought of having all the mempool handlers at one place.
> We can't move all HW mempool handlers at lib/librte_mempool/

Yep, but from your previous mail I thought we might have specific ones
for specific devices, no? 
If so, why to put them in one place, why just not in:
Drivers/xxx_dev/xxx_mempool.[h,c]
?
And keep generic ones in lib/librte_mempool
?
Konstantin

> 
> Jerin
> 
> > I can hardly imagine it is 'platform specific'.
> > From my understanding it is a generic code.
> > Konstantin
> >
> >
> > >
> > > > Konstantin
> > > >
> > > >
> > > > > Jerin, are you volunteer?
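
The motivation in David's commit message (a ring cycles through every buffer, a stack hands back the most recently freed one) can be illustrated with a toy pool. This is a single-threaded sketch with hypothetical demo_ names; it is not the code in rte_mempool_stack.c and has no locking.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_POOL_SIZE 8

/* LIFO pool: free objects sit on a stack, so the object freed last
 * (and most likely still in cache) is the next one handed out. */
struct demo_stack_pool {
	void *objs[DEMO_POOL_SIZE];
	size_t len;
};

int
demo_stack_put(struct demo_stack_pool *s, void *obj)
{
	assert(s != NULL);
	if (s->len == DEMO_POOL_SIZE)
		return -1;
	s->objs[s->len++] = obj;
	return 0;
}

void *
demo_stack_get(struct demo_stack_pool *s)
{
	assert(s != NULL);
	if (s->len == 0)
		return NULL;
	return s->objs[--s->len];
}

/* FIFO pool for contrast: the object freed first is handed out first,
 * so allocation walks through every object in the pool in turn. */
struct demo_ring_pool {
	void *objs[DEMO_POOL_SIZE];
	size_t head;  /* next slot to dequeue from */
	size_t count; /* number of objects stored */
};

int
demo_ring_put(struct demo_ring_pool *r, void *obj)
{
	assert(r != NULL);
	if (r->count == DEMO_POOL_SIZE)
		return -1;
	r->objs[(r->head + r->count) % DEMO_POOL_SIZE] = obj;
	r->count++;
	return 0;
}

void *
demo_ring_get(struct demo_ring_pool *r)
{
	void *obj;

	assert(r != NULL);
	if (r->count == 0)
		return NULL;
	obj = r->objs[r->head];
	r->head = (r->head + 1) % DEMO_POOL_SIZE;
	r->count--;
	return obj;
}
```

After freeing buffers a, b, c in that order, the stack returns c first while the ring returns a first.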

[dpdk-dev] [DPDK16.04: Error While compiling]

2016-06-21 Thread amartya....@wipro.com

Hi,

It's a Fedora 21 VM, gcc 4.9.

Thanks,
Amartya

-Original Message-
From: Anupam Kapoor [mailto:akap...@parallelwireless.com]
Sent: Tuesday, June 21, 2016 2:54 PM
To: Amartya Kumar Das (MFG & Tech) 
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [DPDK16.04: Error While compiling]

> [2016-06-21T11:27:50+0530]: "amartya.das" (amartya-das):
,[ amartya-das ]
| I am facing compilation error for DPDK 16.04 as below:
| In file included from /home/cran/dpdk-16.04/lib/librte_eal/common/eal_common_options.c:52:0:
| /home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:673:9: error: called from here
| _mm_storeu_si128((__m128i *)((uint8_t *)dst + 6 * 16), _mm_alignr_epi8(xmm7, xmm6, offset));\
| ^
| /home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:730:16: note: in expansion of macro 'MOVEUNALIGNED_LEFT47_IMM'
| case 0x0F: MOVEUNALIGNED_LEFT47_IMM(dst, src, n, 0x0F); break;\
| ^
| /home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:870:2: note: in expansion of macro 'MOVEUNALIGNED_LEFT47'
| MOVEUNALIGNED_LEFT47(dst, src, n, srcofs);
| ^
| In file included from /usr/lib/gcc/x86_64-redhat-linux/4.9.2/include/x86intrin.h:37:0,
| from /home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_vect.h:67,
| from /home/cran/dpdk-16.04/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:46,
| from /home/cran/dpdk-16.04/lib/librte_eal/common/eal_common_options.c:52:
| /usr/lib/gcc/x86_64-redhat-linux/4.9.2/include/tmmintrin.h:185:1: error: inlining failed in call to always_inline '_mm_alignr_epi8': target specific option mismatch
| _mm_alignr_epi8(__m128i __X, __m128i __Y, const int __N)
`
can you please provide information on the os and platform where you are 
attempting this ? from the paths it seems like gcc 4.9 on redhat (7 maybe ?)

---
thanks
anupam

[dpdk-dev] [PATCH v3 1/2] mempool: add stack (lifo) mempool handler

2016-06-21 Thread Olivier Matz

Hi,

On 06/21/2016 11:28 AM, Ananyev, Konstantin wrote:
>>>> I was proposing only to move only the new
>>>> handler(lib/librte_mempool/rte_mempool_stack.c). Not any library or any
>>>> other common code.
>>>>
>>>> Just like DPDK crypto device, Even if it is software implementation its
>>>> better to move in driver/crypto instead of lib/librte_cryptodev
>>>>
>>>> "lib/librte_mempool/arch/" is not correct place as it is platform specific
>>>> not architecture specific and HW mempool device may be PCIe or platform
>>>> device.
>>>
>>> Ok, but why rte_mempool_stack.c has to be moved?
>>
>> Just thought of having all the mempool handlers at one place.
>> We can't move all HW mempool handlers at lib/librte_mempool/
> 
> Yep, but from your previous mail I thought we might have specific ones
> for specific devices, no? 
> If so, why to put them in one place, why just not in:
> Drivers/xxx_dev/xxx_mempool.[h,c]
> ?
> And keep generic ones in lib/librte_mempool
> ?

I think all drivers (generic or not) should be at the same place for
consistency.

I'm not sure having them in lib/librte_mempool is really a problem today,
but once hardware-dependent handlers are pushed, we may move all of them
to drivers/mempool because I think we should avoid having hw-specific
code in lib/.

I don't think it will cause ABI/API breakage since the user always
talks to the generic mempool API.

Regards
Olivier

[dpdk-dev] [PATCH v3 1/2] ethdev: add callback to get register size in bytes

2016-06-21 Thread Zyta Szpak

OK, I will do the v4.

On 17.06.2016 12:20, Thomas Monjalon wrote:
> 2016-06-13 16:51, Remy Horton:
>> On 12/06/2016 15:51, Zyta Szpak wrote:
>>>  I would prefer having only one function rte_eth_dev_get_regs()
>>>  which returns length and width if data is NULL.
>>>  The first call is a parameter request before buffer allocation,
>>>  and the second call fills the buffer.
>>>
>>>  We can deprecate the old API and introduce this new one.
>>>
>>>  Opinions?
>>>
>>> In my opinion as it is now it works fine. Gathering all parameters in
>>> one callback might be a good idea if the maintainer also agrees to that
>>> because as I mentioned, it interferes.
>>   From my perspective changing rte_eth_dev_get_regs() isn't a problem, as
>> it isn't used directly rather than through rte_ethtool_get_regs()..
> Zyta, would you like to make a v4?
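
The single-function approach Thomas describes (a first call with a NULL buffer to learn the sizes, a second call to fill the buffer) could look roughly like the sketch below. The struct and function are hypothetical stand-ins, not the rte_eth_dev_get_regs() prototype, and the register values are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a register dump descriptor. */
struct demo_reg_info {
	void *data;      /* caller-provided buffer, or NULL to query sizes */
	uint32_t length; /* number of registers */
	uint32_t width;  /* size of one register in bytes */
};

/* Made-up register contents for the sketch. */
static const uint32_t demo_hw_regs[] = { 0x11, 0x22, 0x33 };

/* Always report length and width; copy the registers only when the
 * caller supplied a buffer of at least length * width bytes. */
int
demo_get_regs(struct demo_reg_info *info)
{
	assert(info != NULL);

	info->length = sizeof(demo_hw_regs) / sizeof(demo_hw_regs[0]);
	info->width = sizeof(demo_hw_regs[0]);
	if (info->data == NULL)
		return 0; /* size query only */

	memcpy(info->data, demo_hw_regs, sizeof(demo_hw_regs));
	return 0;
}
```

A caller would first pass data = NULL, allocate length * width bytes, then call again with the buffer set.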

[dpdk-dev] [PATCH 1/3] mk: fix librte_pipeline dependency list truncation

2016-06-21 Thread Dumitrescu, Cristian



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Panu Matilainen
> Sent: Tuesday, June 21, 2016 9:12 AM
> To: dev at dpdk.org
> Cc: christian.ehrhardt at canonical.com; thomas.monjalon at 6wind.com
> Subject: [dpdk-dev] [PATCH 1/3] mk: fix librte_pipeline dependency list
> truncation
> 
> In other libraries, dependency list is always appended to, but
> in commit 6cbf4f75e059 it with an assignment. This causes the
> librte_eal dependency added in commit 6cbf4f75e059 to get discarded,
> resulting in missing dependency on librte_eal.
> 
> Fixes: b3688bee81a8 ("pipeline: new packet framework logic")
> Fixes: 6cbf4f75e059 ("mk: fix missing internal dependencies")
> 
> Signed-off-by: Panu Matilainen 
> ---
>  lib/librte_pipeline/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_pipeline/Makefile b/lib/librte_pipeline/Makefile
> index 95387aa..a8f3128 100644
> --- a/lib/librte_pipeline/Makefile
> +++ b/lib/librte_pipeline/Makefile
> @@ -53,7 +53,7 @@ SYMLINK-$(CONFIG_RTE_LIBRTE_PIPELINE)-include +=
> rte_pipeline.h
> 
>  # this lib depends upon:
>  DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_eal
> -DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) := lib/librte_table
> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_table
>  DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_port
> 
>  include $(RTE_SDK)/mk/rte.lib.mk
> --
> 2.5.5


In release 16.04, EAL was missing from the dependency list, now it is added. The 
librte_pipeline uses rte_malloc, therefore it depends on librte_eal being 
present.

In the Makefile of the other Packet Framework libraries (librte_port, 
librte_table), it looks like the first dependency in the list is EAL, which is 
listed with the assignment operator, followed by others that are listed with 
the append operator:
DEPDIRS-$(CONFIG_RTE_LIBRTE_XYZ) := lib/librte_eal
DEPDIRS-$(CONFIG_RTE_LIBRTE_XYZ) += lib/librte_some other lib

Therefore, at least for cosmetic reasons, we should probably do the same in 
librte_pipeline, which requires changing both the librte_eal and the 
librte_table lines as below:
DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) := lib/librte_eal
DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_table
DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_port

However, some other libraries e.g. librte_lpm simply add the EAL dependency 
using the append operator:
DEPDIRS-$(CONFIG_RTE_LIBRTE_LPM) += lib/librte_eal

To be honest, I need to refresh my knowledge on make, I don't remember right 
now when we should use the assignment and when the append. Do we need to use 
the assign for first dependency (EAL) and append for others or should we use 
append everywhere?

Thanks,
Cristian

[dpdk-dev] [DPDK16.04: Error While compiling]

2016-06-21 Thread Anupam Kapoor


> [2016-06-21T15:12:27+0530]: "amartya.das" (amarty-das):
,[ amartya-das ]
| Its VM fedora21, gcc 4.9.
`
does your vcpu support sse3 ?

---
thanks
anupam

[dpdk-dev] [PATCH] kni : fix build errors for gcc --version >= 6.1

2016-06-21 Thread Ferruh Yigit

Hi Anupam,

Thank you for the patch.


On 6/21/2016 9:37 AM, Anupam Kapoor wrote:
> This commit fixes build errors triggered due misleading indentation.
> 
> Fixes: 366113dbfb696 (e1000: suppress misleading indentation warning)
This looks like the wrong commit id for the fix; can you please double check?
Also, you may need two Fixes lines, since the patch touches two different driver files.

> 
> 
> Signed-off-by: Anupam Kapoor 
> ---
>  lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_phy.c | 12 
>  lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_82599.c |  6 +++---
>  2 files changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_phy.c 
> b/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_phy.c
> index df224702ed7d..26352da15101 100644
> --- a/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_phy.c
> +++ b/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_phy.c
> @@ -3299,13 +3299,15 @@ s32 e1000_read_phy_reg_mphy(struct e1000_hw *hw, u32 
> address, u32 *data)
> return -E1000_ERR_PHY;
> *data = E1000_READ_REG(hw, E1000_MPHY_DATA);
> 
> -   /* Disable access to mPHY if it was originally disabled */
> -   if (locked)
> +   /* Disable access to mPHY if it was originally enabled */
I think the original comment is correct.
As far as I can see, if access is disabled at the beginning of the
function, it is enabled and then disabled again here. The original state is
saved in the locked variable.

> +   if (locked) {
> ready = e1000_is_mphy_ready(hw);
> if (!ready)
> return -E1000_ERR_PHY;
> +
> E1000_WRITE_REG(hw, E1000_MPHY_ADDR_CTRL,
> E1000_MPHY_DIS_ACCESS);
> +   }
> 

...

> diff --git a/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_82599.c 
> b/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_82599.c
> index 017dfe16c73f..dc2a4fb61c25 100644
> --- a/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_82599.c
> +++ b/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_82599.c
> @@ -870,9 +870,9 @@ s32 ixgbe_setup_mac_link_82599(struct ixgbe_hw *hw,
> if (speed & IXGBE_LINK_SPEED_10GB_FULL)
> if (orig_autoc & IXGBE_AUTOC_KX4_SUPP)
> autoc |= IXGBE_AUTOC_KX4_SUPP;
> -   if ((orig_autoc & IXGBE_AUTOC_KR_SUPP) &&
> -   (hw->phy.smart_speed_active == false))
> -   autoc |= IXGBE_AUTOC_KR_SUPP;
> +   if ((orig_autoc & IXGBE_AUTOC_KR_SUPP) &&
> +   (hw->phy.smart_speed_active == false))
> +   autoc |= IXGBE_AUTOC_KR_SUPP;
Can you please check following commit:
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=55461ddbc

> if (speed & IXGBE_LINK_SPEED_1GB_FULL)
> autoc |= IXGBE_AUTOC_KX_SUPP;
> } else if ((pma_pmd_1g == IXGBE_AUTOC_1G_SFI) &&
> --
> 2.9.0
> 


Would you mind sending a new version of the patch according to the above comments?

Thanks,
ferruh
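
For readers unfamiliar with the warning, the class of problem the e1000 hunk addresses can be shown in isolation. This sketch uses made-up functions, not the driver code: without braces only the first statement is guarded by the condition, whatever the original indentation suggested. The unbraced version is indented truthfully here so that it compiles without the warning.

```c
#include <assert.h>

/* Without braces, only the statement directly after the 'if' is
 * conditional; the second update always runs. */
int
demo_unbraced(int locked)
{
	int writes = 0;

	assert(locked == 0 || locked == 1);
	if (locked)
		writes += 1;
	writes += 10; /* runs regardless of 'locked' */
	return writes;
}

/* With braces, both updates happen only when 'locked' is set. */
int
demo_braced(int locked)
{
	int writes = 0;

	assert(locked == 0 || locked == 1);
	if (locked) {
		writes += 1;
		writes += 10;
	}
	return writes;
}
```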

[dpdk-dev] [PATCH] bnx2x: Correctly determine MSIX vector count

2016-06-21 Thread Bruce Richardson

On Tue, Jun 21, 2016 at 05:55:19AM +, Harish Patil wrote:
> >
> >From: "Charles (Chas) Williams" 
> >
> >If MSIX is available, the vector count given by the table size is one
> >less than the actual count.  This count also limits the receive and
> >transmit queue resources the VF can support.
> >
> >Fixes: 540a211084a7 ("bnx2x: driver core")
> >
> >Signed-off-by: Chas Williams 

> 
> Acked-by: Harish Patil 
> 
Applied to dpdk-next-net/rel_16_07

/Bruce
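
As background to the fix: in the PCI MSI-X capability, the Table Size field (bits 10:0 of the Message Control register) is encoded as N-1. A minimal sketch of the decoding follows; the macro and function names are local to this example and are not taken from the bnx2x driver.

```c
#include <assert.h>
#include <stdint.h>

/* Table Size occupies bits 10:0 of the MSI-X Message Control register. */
#define DEMO_MSIX_TABLE_SIZE_MASK 0x07ffu

/* Decode the number of MSI-X vectors: the field stores N-1. */
unsigned int
demo_msix_vector_count(uint16_t msg_ctrl)
{
	unsigned int count;

	count = (unsigned int)(msg_ctrl & DEMO_MSIX_TABLE_SIZE_MASK) + 1u;
	assert(count >= 1u && count <= 2048u);
	return count;
}
```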

[dpdk-dev] [PATCH v3 0/9] IPSec Enhancements

2016-06-21 Thread Thomas Monjalon

> > Sergio Gonzalez Monroy (9):
> >   examples/ipsec-secgw: fix esp padding check
> >   examples/ipsec-secgw: fix stack smashing error
> >   examples/ipsec-secgw: add build option and cleanup
> >   examples/ipsec-secgw: rework ipsec execution loop
> >   examples/ipsec-secgw: fix no sa found case
> >   examples/ipsec-secgw: consistent config variable names
> >   examples/ipsec-secgw: ipv6 support
> >   examples/ipsec-secgw: transport mode support
> >   doc: update ipsec sample guide
> 
> Series-acked-by: Pablo de Lara 

Applied (with the fix suggested by John for the PDF doc), thanks.

[dpdk-dev] [PATCH 1/3] mk: fix librte_pipeline dependency list truncation

2016-06-21 Thread Panu Matilainen

On 06/21/2016 01:01 PM, Dumitrescu, Cristian wrote:
>
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Panu Matilainen
>> Sent: Tuesday, June 21, 2016 9:12 AM
>> To: dev at dpdk.org
>> Cc: christian.ehrhardt at canonical.com; thomas.monjalon at 6wind.com
>> Subject: [dpdk-dev] [PATCH 1/3] mk: fix librte_pipeline dependency list
>> truncation
>>
>> In other libraries, dependency list is always appended to, but
>> in commit 6cbf4f75e059 it with an assignment. This causes the
>> librte_eal dependency added in commit 6cbf4f75e059 to get discarded,
>> resulting in missing dependency on librte_eal.
>>
>> Fixes: b3688bee81a8 ("pipeline: new packet framework logic")
>> Fixes: 6cbf4f75e059 ("mk: fix missing internal dependencies")
>>
>> Signed-off-by: Panu Matilainen 
>> ---
>>  lib/librte_pipeline/Makefile | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/lib/librte_pipeline/Makefile b/lib/librte_pipeline/Makefile
>> index 95387aa..a8f3128 100644
>> --- a/lib/librte_pipeline/Makefile
>> +++ b/lib/librte_pipeline/Makefile
>> @@ -53,7 +53,7 @@ SYMLINK-$(CONFIG_RTE_LIBRTE_PIPELINE)-include +=
>> rte_pipeline.h
>>
>>  # this lib depends upon:
>>  DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_eal
>> -DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) := lib/librte_table
>> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_table
>>  DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_port
>>
>>  include $(RTE_SDK)/mk/rte.lib.mk
>> --
>> 2.5.5
>
>
> In release 16.04, EAL was missing from the dependency list, now it is added. 
> The librte_pipeline uses rte_malloc, therefore it depends on librte_eal being 
> present.
>
> In the Makefile of the other Packet Framework libraries (librte_port, 
> librte_table), it looks like the first dependency in the list is EAL, which 
> is listed with the assignment operator, followed by others that are listed 
> with the append operator:
>   DEPDIRS-$(CONFIG_RTE_LIBRTE_XYZ) := lib/librte_eal
>   DEPDIRS-$(CONFIG_RTE_LIBRTE_XYZ) += lib/librte_some other lib
>
> Therefore, at least for cosmetic reasons, we should probably do the same in 
> librte_pipeline, which requires changing both the librte_eal and the 
> librte_table lines as below:
>   DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) := lib/librte_eal
>   DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_table
>   DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_port

Ah, didn't notice those because the assignment is first of the dependencies.

>
> However, some other libraries e.g. librte_lpm simply add the EAL dependency 
> using the append operator:
>   DEPDIRS-$(CONFIG_RTE_LIBRTE_LPM) += lib/librte_eal
>
> To be honest, I need to refresh my knowledge on make, I don't remember right 
> now when we should use the assignment and when the append. Do we need to use 
> the assign for first dependency (EAL) and append for others or should we use 
> append everywhere?

At least in automake, you need to assign before you can append. But in 
gmake this apparently is not the case, quoting from 
https://www.gnu.org/software/make/manual/html_node/Appending.html:

"When the variable in question has not been defined before, '+=' acts 
just like normal '=': it defines a recursively-expanded variable. 
However, when there is a previous definition, exactly what '+=' does 
depends on what flavor of variable you defined originally."

So there's no need to use := anywhere for the dependencies; in fact it's 
probably best avoided to prevent issues like this. Of course after the 
third patch in this "series" is applied, mistakes like these can no 
longer go unnoticed.

- Panu -

> Thanks,
> Cristian
>

[dpdk-dev] [PATCH v2] i40e: modify the meaning of single VLAN type

2016-06-21 Thread Bruce Richardson

On Mon, Jun 13, 2016 at 04:03:32PM +0800, Beilei Xing wrote:
> In current i40e codebase, if single VLAN header is added in a packet,
> it's treated as inner VLAN. Generally, a single VLAN header is
> treated as the outer VLAN header. So change corresponding register
> for single VLAN.
> At the same time, change the meanings of inner VLAN and outer VLAN.
> 
> Signed-off-by: Beilei Xing 

This patch changes the ABI, since an app written to the original API as
specified, e.g. to set a single vlan header, would no longer work with this
change. Therefore, even though the original behaviour was inconsistent with
other drivers, it may still need to be preserved.

I'm thinking that we may need to provide appropriately versioned copies of the
vlan_offload_set and vlan_tpid_set functions for backward compatibility with
the old ABI.

Any other comments or thoughts on this? 
Neil, Thomas, Panu - is this fix something that we need to provide backward
version-compatibility for, or does the fact that the functions are called through
a generic ethdev API mean that this can just go in as a straight bug-fix?

/Bruce

[dpdk-dev] [PATCH 1/3] mk: fix librte_pipeline dependency list truncation

2016-06-21 Thread Bruce Richardson

On Tue, Jun 21, 2016 at 01:25:52PM +0300, Panu Matilainen wrote:
> On 06/21/2016 01:01 PM, Dumitrescu, Cristian wrote:
> >
> >
> >>-Original Message-
> >>From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Panu Matilainen
> >>Sent: Tuesday, June 21, 2016 9:12 AM
> >>To: dev at dpdk.org
> >>Cc: christian.ehrhardt at canonical.com; thomas.monjalon at 6wind.com
> >>Subject: [dpdk-dev] [PATCH 1/3] mk: fix librte_pipeline dependency list
> >>truncation
> >>
> >>In other libraries, dependency list is always appended to, but
> >>in commit 6cbf4f75e059 it with an assignment. This causes the
> >>librte_eal dependency added in commit 6cbf4f75e059 to get discarded,
> >>resulting in missing dependency on librte_eal.
> >>
> >>Fixes: b3688bee81a8 ("pipeline: new packet framework logic")
> >>Fixes: 6cbf4f75e059 ("mk: fix missing internal dependencies")
> >>
> >>Signed-off-by: Panu Matilainen 
> >>---
> >> lib/librte_pipeline/Makefile | 2 +-
> >> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >>diff --git a/lib/librte_pipeline/Makefile b/lib/librte_pipeline/Makefile
> >>index 95387aa..a8f3128 100644
> >>--- a/lib/librte_pipeline/Makefile
> >>+++ b/lib/librte_pipeline/Makefile
> >>@@ -53,7 +53,7 @@ SYMLINK-$(CONFIG_RTE_LIBRTE_PIPELINE)-include +=
> >>rte_pipeline.h
> >>
> >> # this lib depends upon:
> >> DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_eal
> >>-DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) := lib/librte_table
> >>+DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_table
> >> DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_port
> >>
> >> include $(RTE_SDK)/mk/rte.lib.mk
> >>--
> >>2.5.5
> >
> >
> >In release 16.04, EAL was missing from the dependency list, now it is added. 
> >The librte_pipeline uses rte_malloc, therefore it depends on librte_eal 
> >being present.
> >
> >In the Makefile of the other Packet Framework libraries (librte_port, 
> >librte_table), it looks like the first dependency in the list is EAL, which 
> >is listed with the assignment operator, followed by others that are listed 
> >with the append operator:
> > DEPDIRS-$(CONFIG_RTE_LIBRTE_XYZ) := lib/librte_eal
> > DEPDIRS-$(CONFIG_RTE_LIBRTE_XYZ) += lib/librte_some other lib
> >
> >Therefore, at least for cosmetic reasons, we should probably do the same in 
> >librte_pipeline, which requires changing both the librte_eal and the 
> >librte_table lines as below:
> > DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) := lib/librte_eal
> > DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_table
> > DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_port
> 
> Ah, didn't notice those because the assignment is first of the dependencies.
> 
> >
> >However, some other libraries e.g. librte_lpm simply add the EAL dependency 
> >using the append operator:
> > DEPDIRS-$(CONFIG_RTE_LIBRTE_LPM) += lib/librte_eal
> >
> >To be honest, I need to refresh my knowledge on make, I don't remember right 
> >now when we should use the assignment and when the append. Do we need to use 
> >the assign for first dependency (EAL) and append for others or should we use 
> >append everywhere?
> 
> At least in automake, you need to assign before you can append. But in gmake
> this apparently is not the case, quoting from
> https://www.gnu.org/software/make/manual/html_node/Appending.html:
> 
> "When the variable in question has not been defined before, '+=' acts just
> like normal '=': it defines a recursively-expanded variable. However, when
> there is a previous definition, exactly what '+=' does depends on what
> flavor of variable you defined originally."
> 
> So there's no need to use := anywhere for the dependencies, in fact its
> probably best avoided to avoid issues like this. Of course after the third
> patch in this "series" is applied, mistakes like these can no longer go
> unnoticed.
> 
Will the build be any slower with everything defaulting to recursively expanded
variables rather than the simply-expanded variables defined by the initial ":="?

/Bruce

[dpdk-dev] [PATCH v3 00/25] Refactor mlx5 to improve performance

2016-06-21 Thread Ferruh Yigit

On 6/21/2016 8:43 AM, Yuanhan Liu wrote:
> Hi,
> 
> Here is an off-topic comment: would you please add following line to
> the sendemail section of your git config file?
> 
> chainreplyto = false
> 
> That would let me to break the long threads in my client much easier.
> Otherwise, it's hard for me to do it, leading that your thread occupies
> several screens on my side.
> 
> It seems that Tetsuya also has the issue, thus CC'ed.
> 

As far as I can see this is not related to the chainreplyto option; rather,
"--no-thread" seems to be set, because all patchsets are sent as replies to
the first mail of the first patchset [C].
The correct setting should be "--thread" and "--no-chain-reply-to".
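
For reference, both options also exist as configuration keys, documented for
git send-email as sendemail.thread and sendemail.chainReplyTo. A minimal
sketch follows; the scratch file is an assumption of this demo so that no real
configuration is touched, and "--file" can be swapped for "--global" to apply
the settings for real:

```sh
#!/bin/sh
# Sketch: config equivalents of "--thread" and "--no-chain-reply-to".
# The scratch config file is only there to keep the demo side-effect free.
cfg=$(mktemp)
git config --file "$cfg" sendemail.thread true
git config --file "$cfg" sendemail.chainreplyto false

# Read the values back to confirm what git send-email would see.
thread=$(git config --file "$cfg" --get sendemail.thread)
chain=$(git config --file "$cfg" --get sendemail.chainreplyto)
echo "thread=$thread chainreplyto=$chain"
rm -f "$cfg"
```

With these two values every patch is sent as a reply to the first mail of its
own series rather than to the previous patch.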

Although this is really a detail, for multi-version patchsets, if there is
a preferred way between (A) and (B), I would like to learn it too.

A)

- [0/N]
- - [1/N]
- - [2/N]
- - [v2 0/N]
- - - [v2 1/N]
- - - [v2 N/N]
- - - [v3 0/N]
- - - - [v3 1/N]
- - - - [v3 N/N]
- - - - [v4 0/N]
- - - - - [v4 1/N]
- - - - - [v4 N/N]



B)

- [0/N]
- - [1/N]
- - [2/N]
- - [v2 0/N]
- - - [v2 1/N]
- - - [v2 N/N]
- - [v3 0/N]
- - - [v3 1/N]
- - - [v3 N/N]
- - [v4 0/N]
- - - [v4 1/N]
- - - [v4 N/N]


C)

- [0/N]
- - [1/N]
- - [2/N]
- - [v2 0/N]
- - [v2 1/N]
- - [v2 N/N]
- - [v3 0/N]
- - [v3 1/N]
- - [v3 N/N]

[dpdk-dev] [PATCH 1/3] mk: fix librte_pipeline dependency list truncation

2016-06-21 Thread Panu Matilainen

On 06/21/2016 01:31 PM, Bruce Richardson wrote:
> On Tue, Jun 21, 2016 at 01:25:52PM +0300, Panu Matilainen wrote:
>> On 06/21/2016 01:01 PM, Dumitrescu, Cristian wrote:
>>>
>>>
>>>> -Original Message-
>>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Panu Matilainen
>>>> Sent: Tuesday, June 21, 2016 9:12 AM
>>>> To: dev at dpdk.org
>>>> Cc: christian.ehrhardt at canonical.com; thomas.monjalon at 6wind.com
>>>> Subject: [dpdk-dev] [PATCH 1/3] mk: fix librte_pipeline dependency list
>>>> truncation
>>>>
>>>> In other libraries, dependency list is always appended to, but
>>>> in commit 6cbf4f75e059 it with an assignment. This causes the
>>>> librte_eal dependency added in commit 6cbf4f75e059 to get discarded,
>>>> resulting in missing dependency on librte_eal.
>>>>
>>>> Fixes: b3688bee81a8 ("pipeline: new packet framework logic")
>>>> Fixes: 6cbf4f75e059 ("mk: fix missing internal dependencies")
>>>>
>>>> Signed-off-by: Panu Matilainen 
>>>> ---
>>>> lib/librte_pipeline/Makefile | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/lib/librte_pipeline/Makefile b/lib/librte_pipeline/Makefile
>>>> index 95387aa..a8f3128 100644
>>>> --- a/lib/librte_pipeline/Makefile
>>>> +++ b/lib/librte_pipeline/Makefile
>>>> @@ -53,7 +53,7 @@ SYMLINK-$(CONFIG_RTE_LIBRTE_PIPELINE)-include +=
>>>> rte_pipeline.h
>>>>
>>>> # this lib depends upon:
>>>> DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_eal
>>>> -DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) := lib/librte_table
>>>> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_table
>>>> DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_port
>>>>
>>>> include $(RTE_SDK)/mk/rte.lib.mk
>>>> --
>>>> 2.5.5
>>>
>>>
>>> In release 16.4, EAL was missing from the dependency list, now it is added. 
>>> The librte_pipeline uses rte_malloc, therefore it depends on librte_eal 
>>> being present.
>>>
>>> In the Makefile of the other Packet Framework libraries (librte_port, 
>>> librte_table), it looks like the first dependency in the list is EAL, which 
>>> is listed with the assignment operator, followed by others that are listed 
>>> with the append operator:
>>> DEPDIRS-$(CONFIG_RTE_LIBRTE_XYZ) := lib/librte_eal
>>> DEPDIRS-$(CONFIG_RTE_LIBRTE_XYZ) += lib/librte_some other lib
>>>
>>> Therefore, at least for cosmetic reasons, we should probably do the same in 
>>> librte_pipeline, which requires changing both the librte_eal and the 
>>> librte_table lines as below:
>>> DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) := lib/librte_eal
>>> DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_table
>>> DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_port
>>
>> Ah, didn't notice those because the assignment is first of the dependencies.
>>
>>>
>>> However, some other libraries e.g. librte_lpm simply add the EAL dependency 
>>> using the append operator:
>>> DEPDIRS-$(CONFIG_RTE_LIBRTE_LPM) += lib/librte_eal
>>>
>>> To be honest, I need to refresh my knowledge on make, I don't remember 
>>> right now when we should use the assignment and when the append. Do we need 
>>> to use the assign for first dependency (EAL) and append for others or 
>>> should we use append everywhere?
>>
>> At least in automake, you need to assign before you can append. But in gmake
>> this apparently is not the case, quoting from
>> https://www.gnu.org/software/make/manual/html_node/Appending.html:
>>
>> "When the variable in question has not been defined before, '+=' acts just
>> like normal '=': it defines a recursively-expanded variable. However, when
>> there is a previous definition, exactly what '+=' does depends on what
>> flavor of variable you defined originally."
>>
>> So there's no need to use := anywhere for the dependencies, in fact it's
>> probably best avoided to avoid issues like this. Of course after the third
>> patch in this "series" is applied, mistakes like these can no longer go
>> unnoticed.
>>
> Will the build be any slower with everything defaulting to recursively expanded
> variables rather than the simply-expanded variables defined by the initial ":="?

Bruce, everything already *is* defaulting to recursively expanded 
variables, except for the three libraries here which have used := for 
who knows what (historical or other) reason. And out of those three 
exceptions, one is buggy. Which is what I'm addressing here.

- Panu -
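
For anyone who wants to refresh the make semantics discussed above, the
truncation can be reproduced outside DPDK. What follows is only a sketch:
CONFIG_RTE_LIBRTE_DEMO and the two scratch makefiles are invented for the
demonstration, only the DEPDIRS pattern comes from the patch, and GNU make is
assumed to be installed as "make".

```sh
#!/bin/sh
# Sketch only: CONFIG_RTE_LIBRTE_DEMO and the scratch makefiles are made up.
tmp=$(mktemp -d)

# Same shape as the buggy librte_pipeline Makefile: "+=", then ":=", then "+=".
cat > "$tmp/buggy.mk" <<'EOF'
CONFIG_RTE_LIBRTE_DEMO = y
DEPDIRS-$(CONFIG_RTE_LIBRTE_DEMO) += lib/librte_eal
DEPDIRS-$(CONFIG_RTE_LIBRTE_DEMO) := lib/librte_table
DEPDIRS-$(CONFIG_RTE_LIBRTE_DEMO) += lib/librte_port
all: ; @echo $(DEPDIRS-y)
EOF

# Append-only variant, mirroring the one-line fix in the patch.
sed 's/ := lib/ += lib/' "$tmp/buggy.mk" > "$tmp/fixed.mk"

buggy=$(make -s -f "$tmp/buggy.mk")
fixed=$(make -s -f "$tmp/fixed.mk")
echo "buggy: $buggy"
echo "fixed: $fixed"
rm -rf "$tmp"
```

With GNU make this should print "buggy: lib/librte_table lib/librte_port" and
"fixed: lib/librte_eal lib/librte_table lib/librte_port": the ":=" line
replaces whatever the earlier "+=" had accumulated, so librte_eal is lost.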

[dpdk-dev] [PATCH v4 1/4] port: kni interface support

2016-06-21 Thread Ethan Zhuang

From: WeiJie Zhuang 

add KNI port type to the packet framework

Signed-off-by: WeiJie Zhuang 
---
v2:
* Fix check patch error.
v3:
* Fix code review comments.
v4:
* Reorganize patch sets and fix some comments
---
 lib/librte_port/Makefile |   7 +
 lib/librte_port/rte_port_kni.c   | 545 +++
 lib/librte_port/rte_port_kni.h   |  95 ++
 lib/librte_port/rte_port_version.map |   9 +
 4 files changed, 656 insertions(+)
 create mode 100644 lib/librte_port/rte_port_kni.c
 create mode 100644 lib/librte_port/rte_port_kni.h

diff --git a/lib/librte_port/Makefile b/lib/librte_port/Makefile
index dc6a601..0fc929b 100644
--- a/lib/librte_port/Makefile
+++ b/lib/librte_port/Makefile
@@ -56,6 +56,9 @@ SRCS-$(CONFIG_RTE_LIBRTE_PORT) += rte_port_frag.c
 SRCS-$(CONFIG_RTE_LIBRTE_PORT) += rte_port_ras.c
 endif
 SRCS-$(CONFIG_RTE_LIBRTE_PORT) += rte_port_sched.c
+ifeq ($(CONFIG_RTE_LIBRTE_KNI),y)
+SRCS-$(CONFIG_RTE_LIBRTE_PORT) += rte_port_kni.c
+endif
 SRCS-$(CONFIG_RTE_LIBRTE_PORT) += rte_port_source_sink.c

 # install includes
@@ -67,6 +70,9 @@ SYMLINK-$(CONFIG_RTE_LIBRTE_PORT)-include += rte_port_frag.h
 SYMLINK-$(CONFIG_RTE_LIBRTE_PORT)-include += rte_port_ras.h
 endif
 SYMLINK-$(CONFIG_RTE_LIBRTE_PORT)-include += rte_port_sched.h
+ifeq ($(CONFIG_RTE_LIBRTE_KNI),y)
+SYMLINK-$(CONFIG_RTE_LIBRTE_PORT)-include += rte_port_kni.h
+endif
 SYMLINK-$(CONFIG_RTE_LIBRTE_PORT)-include += rte_port_source_sink.h

 # this lib depends upon:
@@ -76,5 +82,6 @@ DEPDIRS-$(CONFIG_RTE_LIBRTE_PORT) += lib/librte_mempool
 DEPDIRS-$(CONFIG_RTE_LIBRTE_PORT) += lib/librte_ether
 DEPDIRS-$(CONFIG_RTE_LIBRTE_PORT) += lib/librte_ip_frag
 DEPDIRS-$(CONFIG_RTE_LIBRTE_PORT) += lib/librte_sched
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PORT) += lib/librte_kni

 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_port/rte_port_kni.c b/lib/librte_port/rte_port_kni.c
new file mode 100644
index 000..08f4ac2
--- /dev/null
+++ b/lib/librte_port/rte_port_kni.c
@@ -0,0 +1,545 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Ethan Zhuang .
+ *   Copyright(c) 2016 Intel Corporation.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#include 
+
+#include 
+#include 
+#include 
+
+#include "rte_port_kni.h"
+
+/*
+ * Port KNI Reader
+ */
+#ifdef RTE_PORT_STATS_COLLECT
+
+#define RTE_PORT_KNI_READER_STATS_PKTS_IN_ADD(port, val) \
+   port->stats.n_pkts_in += val
+#define RTE_PORT_KNI_READER_STATS_PKTS_DROP_ADD(port, val) \
+   port->stats.n_pkts_drop += val
+
+#else
+
+#define RTE_PORT_KNI_READER_STATS_PKTS_IN_ADD(port, val)
+#define RTE_PORT_KNI_READER_STATS_PKTS_DROP_ADD(port, val)
+
+#endif
+
+struct rte_port_kni_reader {
+   struct rte_port_in_stats stats;
+
+   struct rte_kni *kni;
+};
+
+static void *
+rte_port_kni_reader_create(void *params, int socket_id)
+{
+   struct rte_port_kni_reader_params *conf =
+   (struct rte_port_kni_reader_params *) params;
+   struct rte_port_kni_reader *port;
+
+   /* Check input parameters */
+   if (conf == NULL) {
+   RTE_LOG(ERR, PORT, "%s: params is NULL\n", __func__);
+   return NULL;
+   }
+
+   /* Memory allocation */
+   port = rte_zmalloc_socket("PORT", sizeof(*port),
+   RTE_CACHE_LINE_SIZE, socket_id);
+   if (port == NULL) {
+   RTE_LOG(ERR, PORT, "%s: Failed to allocate port\n", __func__);
+

[dpdk-dev] [PATCH v6 1/4] lib/librte_ether: support device reset

2016-06-21 Thread Jerin Jacob

On Tue, Jun 21, 2016 at 09:26:12AM +, Ananyev, Konstantin wrote:

Hi Konstantin,

> Hi Jerin,
> 
> > -Original Message-
> > From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> > Sent: Tuesday, June 21, 2016 9:56 AM
> > To: Lu, Wenzhuo
> > Cc: Stephen Hemminger; dev at dpdk.org; Ananyev, Konstantin; Richardson, 
> > Bruce; Chen, Jing D; Liang, Cunming; Wu, Jingjing; Zhang,
> > Helin; thomas.monjalon at 6wind.com
> > Subject: Re: [dpdk-dev] [PATCH v6 1/4] lib/librte_ether: support device 
> > reset
> > 
> > On Tue, Jun 21, 2016 at 08:24:36AM +, Lu, Wenzhuo wrote:
> > > Hi Jerin,
> > 
> > Hi Wenzhuo,
> > 
> > > > > > > > On Mon, Jun 20, 2016 at 02:24:27PM +0800, Wenzhuo Lu wrote:
> > > > > > > > > Add an API to reset the device.
> > > > > > > > > It's for VF device in this scenario, kernel PF + DPDK VF.
> > > > > > > > > When the PF port down->up, APP should call this API to reset
> > > > > > > > > VF port. Most likely, APP should call it in its management
> > > > > > > > > thread and guarantee the thread safe. It means APP should stop
> > > > > > > > > the rx/tx and the device, then reset the device, then recover
> > > > > > > > > the device and rx/tx.
> > > > > > > >
> > > > > > > > Following is _a_ use-case for Device reset. But may be not be
> > > > > > > > _the_ use case. IMO, We need to first say expected behavior of
> > > > > > > > this API and add a use-case later.
> > > > > > > >
> > > > > > > > Other use-case would be, PCIe VF with functional level reset for
> > > > > > > > SRIOV migration.
> > > > > > > > Are we on same page?
> > > > > > >
> > > > > > >
> > > > > > > In my experience with Linux devices, this is normally handled by
> > > > > > > the device driver in the start routine.  Since any use case which
> > > > > > > needs this is going to do a stop/reset/start sequence, why not
> > > > > > > just have the VF device driver do this in the start routine?.
> > > > > > >
> > > > > > > Adding yet another API and state transistion if not necessary
> > > > > > > increases the complexity and required test cases for all devices.
> > > > > >
> > > > > > I agree with Stephen here.I think if application needs to call start
> > > > > > after the device reset then we could add this logic in start itself
> > > > > > rather exposing a yet another API
> > > > > Do you mean changing the device_start to include all these actions, 
> > > > > stop
> > > > device -> stop queue -> re-setup queue -> start queue -> start device ?
> > > >
> > > > What was the expected API call sequence when you were introduced this 
> > > > API?
> > > >
> > > > Point was to have implicit device reset in the API call 
> > > > sequence(Wherever make
> > > > sense for specific PMD)
> > > I think the API call sequence depends on the implementation of the APP. 
> > > Let's say if there's not this reset API, APP can use this API
> > call sequence to handle the PF link down/up event, rte_eth_dev_close -> 
> > rte_eth_rx_queue_setup -> rte_eth_tx_queue_setup ->
> > rte_eth_dev_start.
> > > Actually our purpose is to use this reset API instead of the API call 
> > > sequence. You can see the reset API is not necessary. The benefit
> > is to save the code for APP.
> > 
> > Then I am bit confused with original commit log description.
> > |
> > |It means APP should stop the rx/tx and the device, then reset the
> > |device, then recover the device and rx/tx.
> > |
> > I was under impression that it a low level reset API for this device? Is
> > n't it?
> > 
> > The other issue is generalized outlook of the API, Certain PMD will not
> > have PF link down/up event? Link down/up and only connected to VF and PF
> > only for configuration.
> > 
> > How about fixing it more transparently in PMD driver itself as
> > PMD driver knows the PF link up/down event, Is it possible to
> > recover the VF on that event if its only matter of resetting it?
> 
> I think we already went through that discussion on the list.
> Unfortunately with current dpdk design it is hardly possible.
> To achieve that we need to introduce some sort of synchronisation
> between IO and control APIs (locking or so).
> Actually I am not sure why having a special reset function will be a problem.

|
|It means APP should stop the rx/tx and the device, then reset the
|device, then recover the device and rx/tx.
|
Just to understand: if the application still needs to do the stop, then what
value addition does the reset API bring to the table?


> Yes, it would exist only for VFs, for PF it could be left unimplemented.
> Though it definitely seems more convenient from user point of view,
> they would know: to handle VF reset event, they just need to call that
> particular function, not to re-implement their own.
What if the driver returns "not implemented"? Then the application will have
to do the generic rte_eth_dev_stop/rte_eth_dev_start. That way, from the
application's perspective, we are NOT solving any problem.

Jerin

[dpdk-dev] [PATCH v4 2/4] examples/ip_pipeline: kni interface support

2016-06-21 Thread Ethan Zhuang

From: WeiJie Zhuang 

1. add KNI support to the IP Pipeline sample Application
2. some bug fix

Signed-off-by: WeiJie Zhuang 
---
 examples/ip_pipeline/app.h | 183 ++-
 examples/ip_pipeline/config_check.c|  26 ++-
 examples/ip_pipeline/config_parse.c| 203 -
 examples/ip_pipeline/init.c| 148 ++-
 examples/ip_pipeline/pipeline/pipeline_common_fe.c |  27 +++
 examples/ip_pipeline/pipeline/pipeline_master_be.c |   9 +
 examples/ip_pipeline/pipeline_be.h |  33 
 7 files changed, 618 insertions(+), 11 deletions(-)

diff --git a/examples/ip_pipeline/app.h b/examples/ip_pipeline/app.h
index 7611341..6a6fdd9 100644
--- a/examples/ip_pipeline/app.h
+++ b/examples/ip_pipeline/app.h
@@ -44,6 +44,9 @@
 #include 

 #include 
+#ifdef RTE_LIBRTE_KNI
+#include 
+#endif

 #include "cpu_core_map.h"
 #include "pipeline.h"
@@ -132,6 +135,22 @@ struct app_pktq_swq_params {
uint32_t mempool_indirect_id;
 };

+struct app_pktq_kni_params {
+   char *name;
+   uint32_t parsed;
+
+   uint32_t socket_id;
+   uint32_t core_id;
+   uint32_t hyper_th_id;
+   uint32_t force_bind;
+
+   uint32_t mempool_id; /* Position in the app->mempool_params */
+   uint32_t burst_read;
+   uint32_t burst_write;
+   uint32_t dropless;
+   uint64_t n_retries;
+};
+
 #ifndef APP_FILE_NAME_SIZE
 #define APP_FILE_NAME_SIZE   256
 #endif
@@ -185,6 +204,7 @@ enum app_pktq_in_type {
APP_PKTQ_IN_HWQ,
APP_PKTQ_IN_SWQ,
APP_PKTQ_IN_TM,
+   APP_PKTQ_IN_KNI,
APP_PKTQ_IN_SOURCE,
 };

@@ -197,6 +217,7 @@ enum app_pktq_out_type {
APP_PKTQ_OUT_HWQ,
APP_PKTQ_OUT_SWQ,
APP_PKTQ_OUT_TM,
+   APP_PKTQ_OUT_KNI,
APP_PKTQ_OUT_SINK,
 };

@@ -420,6 +441,8 @@ struct app_eal_params {

 #define APP_MAX_PKTQ_TM  APP_MAX_LINKS

+#define APP_MAX_PKTQ_KNI APP_MAX_LINKS
+
 #ifndef APP_MAX_PKTQ_SOURCE
 #define APP_MAX_PKTQ_SOURCE  64
 #endif
@@ -471,6 +494,7 @@ struct app_params {
struct app_pktq_hwq_out_params hwq_out_params[APP_MAX_HWQ_OUT];
struct app_pktq_swq_params swq_params[APP_MAX_PKTQ_SWQ];
struct app_pktq_tm_params tm_params[APP_MAX_PKTQ_TM];
+   struct app_pktq_kni_params kni_params[APP_MAX_PKTQ_KNI];
struct app_pktq_source_params source_params[APP_MAX_PKTQ_SOURCE];
struct app_pktq_sink_params sink_params[APP_MAX_PKTQ_SINK];
struct app_msgq_params msgq_params[APP_MAX_MSGQ];
@@ -482,6 +506,7 @@ struct app_params {
uint32_t n_pktq_hwq_out;
uint32_t n_pktq_swq;
uint32_t n_pktq_tm;
+   uint32_t n_pktq_kni;
uint32_t n_pktq_source;
uint32_t n_pktq_sink;
uint32_t n_msgq;
@@ -495,6 +520,9 @@ struct app_params {
struct app_link_data link_data[APP_MAX_LINKS];
struct rte_ring *swq[APP_MAX_PKTQ_SWQ];
struct rte_sched_port *tm[APP_MAX_PKTQ_TM];
+#ifdef RTE_LIBRTE_KNI
+   struct rte_kni *kni[APP_MAX_PKTQ_KNI];
+#endif /* RTE_LIBRTE_KNI */
struct rte_ring *msgq[APP_MAX_MSGQ];
struct pipeline_type pipeline_type[APP_MAX_PIPELINE_TYPES];
struct app_pipeline_data pipeline_data[APP_MAX_PIPELINES];
@@ -667,11 +695,11 @@ app_swq_get_reader(struct app_params *app,
struct app_pktq_swq_params *swq,
uint32_t *pktq_in_id)
 {
-   struct app_pipeline_params *reader;
+   struct app_pipeline_params *reader = NULL;
uint32_t pos = swq - app->swq_params;
uint32_t n_pipelines = RTE_MIN(app->n_pipelines,
RTE_DIM(app->pipeline_params));
-   uint32_t n_readers = 0, id, i;
+   uint32_t n_readers = 0, id = 0, i;

for (i = 0; i < n_pipelines; i++) {
struct app_pipeline_params *p = &app->pipeline_params[i];
@@ -727,11 +755,11 @@ app_tm_get_reader(struct app_params *app,
struct app_pktq_tm_params *tm,
uint32_t *pktq_in_id)
 {
-   struct app_pipeline_params *reader;
+   struct app_pipeline_params *reader = NULL;
uint32_t pos = tm - app->tm_params;
uint32_t n_pipelines = RTE_MIN(app->n_pipelines,
RTE_DIM(app->pipeline_params));
-   uint32_t n_readers = 0, id, i;
+   uint32_t n_readers = 0, id = 0, i;

for (i = 0; i < n_pipelines; i++) {
struct app_pipeline_params *p = &app->pipeline_params[i];
@@ -758,6 +786,66 @@ app_tm_get_reader(struct app_params *app,
 }

 static inline uint32_t
+app_kni_get_readers(struct app_params *app, struct app_pktq_kni_params *kni)
+{
+   uint32_t pos = kni - app->kni_params;
+   uint32_t n_pipelines = RTE_MIN(app->n_pipelines,
+   RTE_DIM(app->pipeline_params));
+   uint32_t n_readers = 0, i;
+
+   for (i = 0; i < n_pipelines; i++) {
+   struct app_pipeline_params *p = &app->

[dpdk-dev] [PATCH v4 3/4] examples/ip_pipeline: kni example configuration

2016-06-21 Thread Ethan Zhuang

From: WeiJie Zhuang 

config file with two KNI interfaces connected using a Linux kernel bridge

Signed-off-by: WeiJie Zhuang 
---
 examples/ip_pipeline/config/kni.cfg | 67 +
 1 file changed, 67 insertions(+)
 create mode 100644 examples/ip_pipeline/config/kni.cfg

diff --git a/examples/ip_pipeline/config/kni.cfg 
b/examples/ip_pipeline/config/kni.cfg
new file mode 100644
index 000..cea208b
--- /dev/null
+++ b/examples/ip_pipeline/config/kni.cfg
@@ -0,0 +1,67 @@
+;   BSD LICENSE
+;
+;   Copyright(c) 2016 Intel Corporation.
+;   All rights reserved.
+;
+;   Redistribution and use in source and binary forms, with or without
+;   modification, are permitted provided that the following conditions
+;   are met:
+;
+; * Redistributions of source code must retain the above copyright
+;   notice, this list of conditions and the following disclaimer.
+; * Redistributions in binary form must reproduce the above copyright
+;   notice, this list of conditions and the following disclaimer in
+;   the documentation and/or other materials provided with the
+;   distribution.
+; * Neither the name of Intel Corporation nor the names of its
+;   contributors may be used to endorse or promote products derived
+;   from this software without specific prior written permission.
+;
+;   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+;   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+;   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+;   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+;   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+;   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+;   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+;   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+;   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+;   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+;   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+;
+;             ______________          ______________________
+;            |              |  KNI0  |                      |
+; RXQ0.0 --->|              |------->|--+                   |
+;            |              |  KNI1  |  | br0               |
+; TXQ1.0 <---|              |<-------|<-+                   |
+;            | Pass-through |        |     Linux Kernel     |
+;            |     (P1)     |        |    Network Stack     |
+;            |              |  KNI1  |                      |
+; RXQ1.0 --->|              |------->|--+                   |
+;            |              |  KNI0  |  | br0               |
+; TXQ0.0 <---|              |<-------|<-+                   |
+;            |______________|        |______________________|
+;
+; Insert Linux kernel KNI module:
+;[Linux]$ insmod rte_kni.ko
+;
+; Configure Linux kernel bridge between KNI0 and KNI1 interfaces:
+;[Linux]$ ifconfig KNI0 up
+;[Linux]$ ifconfig KNI1 up
+;[Linux]$ brctl addbr "br0"
+;[Linux]$ brctl addif br0 KNI0
+;[Linux]$ brctl addif br0 KNI1
+;[Linux]$ ifconfig br0 up
+
+[EAL]
+log_level = 0
+
+[PIPELINE0]
+type = MASTER
+core = 0
+
+[PIPELINE1]
+type = PASS-THROUGH
+core = 1
+pktq_in = RXQ0.0 KNI1 RXQ1.0 KNI0
+pktq_out = KNI0 TXQ1.0 KNI1 TXQ0.0
-- 
2.7.4

[dpdk-dev] [PATCH v4 4/4] doc: kni port support in the packet framework

2016-06-21 Thread Ethan Zhuang

From: WeiJie Zhuang 

add some descriptions for the kni port in the packet framework

Signed-off-by: WeiJie Zhuang 
---
 doc/api/doxy-api-index.md|   1 +
 doc/guides/sample_app_ug/ip_pipeline.rst | 112 +++
 2 files changed, 84 insertions(+), 29 deletions(-)

diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index f626386..5e7f024 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -118,6 +118,7 @@ There are many libraries, so their headers may be grouped 
by topics:
 [frag] (@ref rte_port_frag.h),
 [reass](@ref rte_port_ras.h),
 [sched](@ref rte_port_sched.h),
+[kni]  (@ref rte_port_kni.h),
 [src/sink] (@ref rte_port_source_sink.h)
   * [table](@ref rte_table.h):
 [lpm IPv4] (@ref rte_table_lpm.h),
diff --git a/doc/guides/sample_app_ug/ip_pipeline.rst 
b/doc/guides/sample_app_ug/ip_pipeline.rst
index 899fd4a..09cbc17 100644
--- a/doc/guides/sample_app_ug/ip_pipeline.rst
+++ b/doc/guides/sample_app_ug/ip_pipeline.rst
@@ -1,5 +1,5 @@
 ..  BSD LICENSE
-Copyright(c) 2015 Intel Corporation. All rights reserved.
+Copyright(c) 2015-2016 Intel Corporation. All rights reserved.
 All rights reserved.

 Redistribution and use in source and binary forms, with or without
@@ -351,33 +351,35 @@ Application resources present in the configuration file

 .. table:: Application resource names in the configuration file

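-   +--------------------------+-----------------------+-------------------------------------------------+
-   | Resource type            | Format                | Examples                                        |
-   +==========================+=======================+=================================================+
-   | Pipeline                 | ``PIPELINE``          | ``PIPELINE0``, ``PIPELINE1``                    |
-   +--------------------------+-----------------------+-------------------------------------------------+
-   | Mempool                  | ``MEMPOOL``           | ``MEMPOOL0``, ``MEMPOOL1``                      |
-   +--------------------------+-----------------------+-------------------------------------------------+
-   | Link (network interface) | ``LINK``              | ``LINK0``, ``LINK1``                            |
-   +--------------------------+-----------------------+-------------------------------------------------+
-   | Link RX queue            | ``RXQ.``              | ``RXQ0.0``, ``RXQ1.5``                          |
-   +--------------------------+-----------------------+-------------------------------------------------+
-   | Link TX queue            | ``TXQ.``              | ``TXQ0.0``, ``TXQ1.5``                          |
-   +--------------------------+-----------------------+-------------------------------------------------+
-   | Software queue           | ``SWQ``               | ``SWQ0``, ``SWQ1``                              |
-   +--------------------------+-----------------------+-------------------------------------------------+
-   | Traffic Manager          | ``TM``                | ``TM0``, ``TM1``                                |
-   +--------------------------+-----------------------+-------------------------------------------------+
-   | Source                   | ``SOURCE``            | ``SOURCE0``, ``SOURCE1``                        |
-   +--------------------------+-----------------------+-------------------------------------------------+
-   | Sink                     | ``SINK``              | ``SINK0``, ``SINK1``                            |
-   +--------------------------+-----------------------+-------------------------------------------------+
-   | Message queue            | ``MSGQ``              | ``MSGQ0``, ``MSGQ1``,                           |
-   |                          | ``MSGQ-REQ-PIPELINE`` | ``MSGQ-REQ-PIPELINE2``, ``MSGQ-RSP-PIPELINE2,`` |
-   |                          | ``MSGQ-RSP-PIPELINE`` | ``MSGQ-REQ-CORE-s0c1``, ``MSGQ-RSP-CORE-s0c1``  |
-   |                          | ``MSGQ-REQ-CORE-``    |                                                 |
-   |                          | ``MSGQ-RSP-CORE-``    |                                                 |
-   +--------------------------+-----------------------+-------------------------------------------------+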
+   
++-+-+
+   | Resource type  | Format  | Examples   
 |
+   
++=+=+
+   | Pipeline   | ``PIPELINE``| ``PIPELINE0``, 
``PIPELINE1``|
+   
++--

[dpdk-dev] [PATCH 1/3] mk: fix librte_pipeline dependency list truncation

2016-06-21 Thread Dumitrescu, Cristian



> -Original Message-
> From: Panu Matilainen [mailto:pmatilai at redhat.com]
> Sent: Tuesday, June 21, 2016 11:45 AM
> To: Richardson, Bruce 
> Cc: Dumitrescu, Cristian ; dev at dpdk.org;
> christian.ehrhardt at canonical.com; thomas.monjalon at 6wind.com
> Subject: Re: [dpdk-dev] [PATCH 1/3] mk: fix librte_pipeline dependency list
> truncation
> 
> On 06/21/2016 01:31 PM, Bruce Richardson wrote:
> > On Tue, Jun 21, 2016 at 01:25:52PM +0300, Panu Matilainen wrote:
> >> On 06/21/2016 01:01 PM, Dumitrescu, Cristian wrote:
> >>>
> >>>
> >>>> -Original Message-
> >>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Panu Matilainen
> >>>> Sent: Tuesday, June 21, 2016 9:12 AM
> >>>> To: dev at dpdk.org
> >>>> Cc: christian.ehrhardt at canonical.com; thomas.monjalon at 6wind.com
> >>>> Subject: [dpdk-dev] [PATCH 1/3] mk: fix librte_pipeline dependency list
> >>>> truncation
> >>>>
> >>>> In other libraries, dependency list is always appended to, but
> >>>> in commit 6cbf4f75e059 it with an assignment. This causes the
> >>>> librte_eal dependency added in commit 6cbf4f75e059 to get discarded,
> >>>> resulting in missing dependency on librte_eal.
> >>>>
> >>>> Fixes: b3688bee81a8 ("pipeline: new packet framework logic")
> >>>> Fixes: 6cbf4f75e059 ("mk: fix missing internal dependencies")
> >>>>
> >>>> Signed-off-by: Panu Matilainen 
> >>>> ---
> >>>> lib/librte_pipeline/Makefile | 2 +-
> >>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/lib/librte_pipeline/Makefile b/lib/librte_pipeline/Makefile
> >>>> index 95387aa..a8f3128 100644
> >>>> --- a/lib/librte_pipeline/Makefile
> >>>> +++ b/lib/librte_pipeline/Makefile
> >>>> @@ -53,7 +53,7 @@ SYMLINK-$(CONFIG_RTE_LIBRTE_PIPELINE)-include +=
> >>>> rte_pipeline.h
> >>>>
> >>>> # this lib depends upon:
> >>>> DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_eal
> >>>> -DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) := lib/librte_table
> >>>> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_table
> >>>> DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_port
> >>>>
> >>>> include $(RTE_SDK)/mk/rte.lib.mk
> >>>> --
> >>>> 2.5.5
> >>>
> >>>
> >>> In release 16.4, EAL was missing from the dependency list, now it is
> added. The librte_pipeline uses rte_malloc, therefore it depends on
> librte_eal being present.
> >>>
> >>> In the Makefile of the other Packet Framework libraries (librte_port,
> librte_table), it looks like the first dependency in the list is EAL, which 
> is listed
> with the assignment operator, followed by others that are listed with the
> append operator:
> >>>   DEPDIRS-$(CONFIG_RTE_LIBRTE_XYZ) := lib/librte_eal
> >>>   DEPDIRS-$(CONFIG_RTE_LIBRTE_XYZ) += lib/librte_some other lib
> >>>
> >>> Therefore, at least for cosmetic reasons, we should probably do the
> same in librte_pipeline, which requires changing both the librte_eal and the
> librte_table lines as below:
> >>>   DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) := lib/librte_eal
> >>>   DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_table
> >>>   DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_port
> >>
> >> Ah, didn't notice those because the assignment is first of the
> dependencies.
> >>
> >>>
> >>> However, some other libraries e.g. librte_lpm simply add the EAL
> dependency using the append operator:
> >>>   DEPDIRS-$(CONFIG_RTE_LIBRTE_LPM) += lib/librte_eal
> >>>
> >>> To be honest, I need to refresh my knowledge on make, I don't
> remember right now when we should use the assignment and when the
> append. Do we need to use the assign for first dependency (EAL) and
> append for others or should we use append everywhere?
> >>
> >> At least in automake, you need to assign before you can append. But in
> gmake
> >> this apparently is not the case, quoting from
> >>
> https://www.gnu.org/software/make/manual/html_node/Appending.html:
> >>
> >> "When the variable in question has not been defined before, '+=' acts just
> >> like normal '=': it defines a recursively-expanded variable. However, when
> >> there is a previous definition, exactly what '+=' does depends on what
> >> flavor of variable you defined originally."
> >>
> >> So there's no need to use := anywhere for the dependencies, in fact it's
> >> probably best avoided to avoid issues like this. Of course after the third
> >> patch in this "series" is applied, mistakes like these can no longer go
> >> unnoticed.
> >>
> > Will the build be any slower with everything defaulting to recursively expanded
> > variables rather than the simply-expanded variables defined by the initial ":="?
> 
> Bruce, everything already *is* defaulting to recursively expanded
> variables, except for the three libraries here which have used := for
> who knows what (historical or other) reason. And out of those three
> exceptions, one is buggy. Which is what I'm addressing here.
> 
>   - Panu -

Yes, you're right, looks like the assign operator is only used in these 3 
places. Therefore, it wou

[dpdk-dev] [PATCH v2] i40e: modify the meaning of single VLAN type

2016-06-21 Thread Panu Matilainen

On 06/21/2016 01:29 PM, Bruce Richardson wrote:
> On Mon, Jun 13, 2016 at 04:03:32PM +0800, Beilei Xing wrote:
>> In current i40e codebase, if single VLAN header is added in a packet,
>> it's treated as inner VLAN. Generally, a single VLAN header is
>> treated as the outer VLAN header. So change corresponding register
>> for single VLAN.
>> At the meanwhile, change the meanings of inner VLAN and outer VLAN.
>>
>> Signed-off-by: Beilei Xing 
>
> This patch changes the ABI, since an app written to the original API as 
> specified
> e.g. to set a single vlan header, would no longer work with this change.
> Therefore, even though the original behaviour was inconsistent with other 
> drivers
> it may still need to be preserved.
>
> I'm thinking that we may need to provide appropriately versioned copies of the
> vlan_offload_set and vlan_tpid_set functions for backward compatibility with
> the old ABI.
>
> Any other comments or thoughts on this?
> Neil, Thomas, Panu - is this fix something that we need to provide backward
> version-compatibility for, or does the fact that the functions are called
> through a generic ethdev API mean that this can just go in as a straight
> bug-fix?

Since it's currently inconsistent with everything else, I'd just call it 
a bug-fix and leave it at that.

Besides, I don't think you could version it via the ordinary means even
if you wanted to, due to the way it's called through eth_dev_ops etc.

- Panu -

[dpdk-dev] [PATCH v3 1/3] port: add kni interface support

2016-06-21 Thread Ethan

Hi Cristian,

New patch has been submitted. All comments are fixed except this one:
"Here is one bug for you, you need to make sure you add the following line
here:
param->parsed = 1;"
I think the new convention is to set this flag by the
macro PARSE_CHECK_DUPLICATE_SECTION.

BTW, although I use the  --cover-letter and --annotate flags in the
send-email command, it seems no cover letter is created.
I am not very familiar with this. So sorry!


B.R.
Ethan

2016-06-19 0:44 GMT+08:00 Dumitrescu, Cristian <
cristian.dumitrescu at intel.com>:

> Hi Ethan,
>
> Thank you, here are some comments inlined below.
>
> Please reorganize this patch in a slightly different way to look similar
> to other DPDK patch sets and also ease up the integration work for Thomas:
> Patch 0: I suggest adding a cover letter;
> Patch 1: all librte_port changes (rte_port_kni.h, rte_port_kni.c,
> Makefile, rte_port_version.map), including the "nodrop" KNI port version
> Patch 2: all ip_pipeline app changes
> Patch 3: ip_pipeline app kni.cfg file
> Patch 4: Documentation changes
>
> > -Original Message-
> > From: WeiJie Zhuang [mailto:zhuangwj at gmail.com]
> > Sent: Thursday, June 16, 2016 12:27 PM
> > To: Dumitrescu, Cristian 
> > Cc: dev at dpdk.org; Singh, Jasvinder ; Yigit,
> > Ferruh ; WeiJie Zhuang 
> > Subject: [PATCH v3 1/3] port: add kni interface support
> >
> > 1. add KNI port type to the packet framework
> > 2. add KNI support to the IP Pipeline sample Application
> > 3. some bug fix
> >
> > Signed-off-by: WeiJie Zhuang 
> > ---
> > v2:
> > * Fix check patch error.
> > v3:
> > * Fix code review comments.
> > ---
> >  doc/api/doxy-api-index.md  |   1 +
> >  examples/ip_pipeline/Makefile  |   2 +-
> >  examples/ip_pipeline/app.h | 181 +++-
> >  examples/ip_pipeline/config/kni.cfg|  67 +
> >  examples/ip_pipeline/config_check.c|  26 +-
> >  examples/ip_pipeline/config_parse.c| 166 ++-
> >  examples/ip_pipeline/init.c| 132 -
> >  examples/ip_pipeline/pipeline/pipeline_common_fe.c |  29 ++
> >  examples/ip_pipeline/pipeline/pipeline_master_be.c |   6 +
> >  examples/ip_pipeline/pipeline_be.h |  27 ++
> >  lib/librte_port/Makefile   |   7 +
> >  lib/librte_port/rte_port_kni.c | 325
> +
> >  lib/librte_port/rte_port_kni.h |  82 ++
> >  lib/librte_port/rte_port_version.map   |   8 +
> >  14 files changed, 1047 insertions(+), 12 deletions(-)
> >  create mode 100644 examples/ip_pipeline/config/kni.cfg
> >  create mode 100644 lib/librte_port/rte_port_kni.c
> >  create mode 100644 lib/librte_port/rte_port_kni.h
> >
> > diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> > index f626386..5e7f024 100644
> > --- a/doc/api/doxy-api-index.md
> > +++ b/doc/api/doxy-api-index.md
> > @@ -118,6 +118,7 @@ There are many libraries, so their headers may be
> > grouped by topics:
> >  [frag] (@ref rte_port_frag.h),
> >  [reass](@ref rte_port_ras.h),
> >  [sched](@ref rte_port_sched.h),
> > +[kni]  (@ref rte_port_kni.h),
> >  [src/sink] (@ref rte_port_source_sink.h)
> >* [table](@ref rte_table.h):
> >  [lpm IPv4] (@ref rte_table_lpm.h),
> > diff --git a/examples/ip_pipeline/Makefile
> b/examples/ip_pipeline/Makefile
> > index 5827117..6dc3f52 100644
> > --- a/examples/ip_pipeline/Makefile
> > +++ b/examples/ip_pipeline/Makefile
> > @@ -1,6 +1,6 @@
> >  #   BSD LICENSE
> >  #
> > -#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> > +#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
> >  #   All rights reserved.
> >  #
> >  #   Redistribution and use in source and binary forms, with or without
> > diff --git a/examples/ip_pipeline/app.h b/examples/ip_pipeline/app.h
> > index 7611341..abbd6d4 100644
> > --- a/examples/ip_pipeline/app.h
> > +++ b/examples/ip_pipeline/app.h
> > @@ -44,6 +44,9 @@
> >  #include 
> >
> >  #include 
> > +#ifdef RTE_LIBRTE_KNI
> > +#include 
> > +#endif
> >
> >  #include "cpu_core_map.h"
> >  #include "pipeline.h"
> > @@ -132,6 +135,20 @@ struct app_pktq_swq_params {
> >   uint32_t mempool_indirect_id;
> >  };
> >
> > +struct app_pktq_kni_params {
> > + char *name;
> > + uint32_t parsed;
> > +
> > + uint32_t socket_id;
> > + uint32_t core_id;
> > + uint32_t hyper_th_id;
> > + uint32_t force_bind;
> > +
> > + uint32_t mempool_id; /* Position in the app->mempool_params */
> > + uint32_t burst_read;
> > + uint32_t burst_write;
> > +};
> > +
> >  #ifndef APP_FILE_NAME_SIZE
> >  #define APP_FILE_NAME_SIZE   256
> >  #endif
> > @@ -185,6 +202,7 @@ enum app_pktq_in_type {
> >   APP_PK

[dpdk-dev] [PATCH] lib/table: fix wrong type of nht field

2016-06-21 Thread Dumitrescu, Cristian



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Monday, June 20, 2016 11:14 AM
> To: Jastrzebski, MichalX K ; Kobylinski,
> MichalX 
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] lib/table: fix wrong type of nht field
> 
> 2016-06-20 12:10, Michal Jastrzebski:
> > From: Michal Kobylinski 
> >
> > Change type of nht field from uint32_t to uint8_t and increase max of
> > next hops.
> >
> > Fixes: dc81ebbacaeb ("lpm: extend IPv4 next hop field")
> 
> > Why is the type wrong?

The lpm->nht is simply some raw memory allocated at the end of the table
context using the usual pattern:

struct rte_table_lpm {
	...
	uint8_t nht[0] __rte_cache_aligned;
};

Therefore, when we do:

	nht_entry = &lpm->nht[i * lpm->entry_size];

in several places, it makes a big difference whether nht_entry is declared as
uint8_t * (correct) or uint32_t * (incorrect), as the position computed by the
latter is 4 times the position computed by the former ;(
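
The factor of 4 can be checked with a small standalone C sketch. This is only
an illustration under stated assumptions, not the librte_table code: the
nht_offset_* helpers and the 64-element arrays are hypothetical, and an entry
size of 16 bytes is picked arbitrarily.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte distance from base to element (i * entry_size) of a uint8_t array. */
static size_t
nht_offset_u8(const uint8_t *base, uint32_t i, uint32_t entry_size)
{
	const uint8_t *entry = &base[i * entry_size];

	return (size_t)(entry - base);
}

/* Same index expression on a uint32_t array: every step is 4 bytes wide. */
static size_t
nht_offset_u32(const uint32_t *base, uint32_t i, uint32_t entry_size)
{
	const uint32_t *entry = &base[i * entry_size];

	return (size_t)(entry - base) * sizeof(*base);
}

/* Compares both element types for entry 3 of a table with 16-byte entries. */
static int
nht_offsets_differ_by_four(void)
{
	static uint8_t as_bytes[64];
	static uint32_t as_words[64];
	size_t good = nht_offset_u8(as_bytes, 3, 16);
	size_t bad = nht_offset_u32(as_words, 3, 16);

	assert(good == 48);
	return bad == 4 * good;
}
```

With i = 3 and entry_size = 16 the byte-typed pointer lands 48 bytes in, while
the uint32_t-typed pointer lands 192 bytes in.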

Michal K and Michal J,
I just realized we still need to do a small change to this patch, I 
will reply to the original mail now. So Thomas, sorry, there is one small 
change, we'll send new version soon.

Thanks,
Cristian

[dpdk-dev] [PATCH] lib/table: fix wrong type of nht field

2016-06-21 Thread Dumitrescu, Cristian



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michal Jastrzebski
> Sent: Monday, June 20, 2016 11:10 AM
> To: dev at dpdk.org
> Cc: Kobylinski, MichalX 
> Subject: [dpdk-dev] [PATCH] lib/table: fix wrong type of nht field
> 
> From: Michal Kobylinski 
> 
> Change type of nht field from uint32_t to uint8_t and increase max of
> next hops.
> 
> Fixes: dc81ebbacaeb ("lpm: extend IPv4 next hop field")
> 
> Signed-off-by: Michal Kobylinski 
> Acked-by: Cristian Dumitrescu 
> ---
>  examples/ip_pipeline/pipeline/pipeline_routing_be.h | 2 +-
>  lib/librte_table/rte_table_lpm.c| 8 
>  2 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/examples/ip_pipeline/pipeline/pipeline_routing_be.h
> b/examples/ip_pipeline/pipeline/pipeline_routing_be.h
> index 1276342..ea50896 100644
> --- a/examples/ip_pipeline/pipeline/pipeline_routing_be.h
> +++ b/examples/ip_pipeline/pipeline/pipeline_routing_be.h
> @@ -42,7 +42,7 @@
>   * Pipeline argument parsing
>   */
>  #ifndef PIPELINE_ROUTING_N_ROUTES_DEFAULT
> -#define PIPELINE_ROUTING_N_ROUTES_DEFAULT  4096
> +#define PIPELINE_ROUTING_N_ROUTES_DEFAULT  65536
>  #endif
> 

Changing the PIPELINE_ROUTING_N_ROUTES_DEFAULT  is actually not required, this 
is simply the default value which can be changed through the configuration 
file. Please remove this.

>  enum pipeline_routing_encap {
> diff --git a/lib/librte_table/rte_table_lpm.c
> b/lib/librte_table/rte_table_lpm.c
> index cdeb0f5..f2eaed5 100644
> --- a/lib/librte_table/rte_table_lpm.c
> +++ b/lib/librte_table/rte_table_lpm.c
> @@ -44,7 +44,7 @@
> 
>  #include "rte_table_lpm.h"
> 
> -#define RTE_TABLE_LPM_MAX_NEXT_HOPS 256
> +#define RTE_TABLE_LPM_MAX_NEXT_HOPS 65536
> 

With the next hop size of 24 bits, we can now make this configurable, so please 
use:

#ifndef RTE_TABLE_LPM_MAX_NEXT_HOPS
#define RTE_TABLE_LPM_MAX_NEXT_HOPS 65536
#endif

>  #ifdef RTE_TABLE_STATS_COLLECT
> 
> @@ -74,7 +74,7 @@ struct rte_table_lpm {
> 
>   /* Next Hop Table (NHT) */
>   uint32_t nht_users[RTE_TABLE_LPM_MAX_NEXT_HOPS];
> - uint32_t nht[0] __rte_cache_aligned;
> + uint8_t nht[0] __rte_cache_aligned;
>  };
> 
>  static void *
> @@ -188,7 +188,7 @@ nht_find_existing(struct rte_table_lpm *lpm, void
> *entry, uint32_t *pos)
>   uint32_t i;
> 
>   for (i = 0; i < RTE_TABLE_LPM_MAX_NEXT_HOPS; i++) {
> - uint32_t *nht_entry = &lpm->nht[i * lpm->entry_size];
> + uint8_t *nht_entry = &lpm->nht[i * lpm->entry_size];
> 
>   if ((lpm->nht_users[i] > 0) && (memcmp(nht_entry, entry,
>   lpm->entry_unique_size) == 0)) {
> @@ -242,7 +242,7 @@ rte_table_lpm_entry_add(
> 
>   /* Find existing or free NHT entry */
>   if (nht_find_existing(lpm, entry, &nht_pos) == 0) {
> - uint32_t *nht_entry;
> + uint8_t *nht_entry;
> 
>   if (nht_find_free(lpm, &nht_pos) == 0) {
>   RTE_LOG(ERR, TABLE, "%s: NHT full\n", __func__);
> --
> 1.9.1

[dpdk-dev] [PATCH] kni : fix build errors for gcc --version >= 6.1

2016-06-21 Thread Anupam Kapoor

On Tue, Jun 21, 2016 at 3:40 PM, Ferruh Yigit 
wrote:

> Hi Anupam,
>
> Thank you for the patch.
>
>
> On 6/21/2016 9:37 AM, Anupam Kapoor wrote:
> > This commit fixes build errors triggered due to misleading indentation.
> >
> > Fixes: 366113dbfb696 (e1000: suppress misleading indentation warning)
> This looks like the wrong commit id for the Fixes line, can you please double
> check? Also you may need two Fixes lines, since you are fixing two different
> driver files.
>
yes, you are right, i erroneously looked at the commit-id which included the
'-Wno-misleading-indentation' and used that.


> >
> >
> > Signed-off-by: Anupam Kapoor 
> > ---
> >  lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_phy.c | 12
> 
> >  lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_82599.c |  6 +++---
> >  2 files changed, 11 insertions(+), 7 deletions(-)
> >
> > diff --git a/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_phy.c
> b/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_phy.c
> > index df224702ed7d..26352da15101 100644
> > --- a/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_phy.c
> > +++ b/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_phy.c
> > @@ -3299,13 +3299,15 @@ s32 e1000_read_phy_reg_mphy(struct e1000_hw *hw,
> u32 address, u32 *data)
> > return -E1000_ERR_PHY;
> > *data = E1000_READ_REG(hw, E1000_MPHY_DATA);
> >
> > -   /* Disable access to mPHY if it was originally disabled */
> > -   if (locked)
> > +   /* Disable access to mPHY if it was originally enabled */
> I think original comment is correct.
> As far as I can see, if access disabled in the beginning of the
> function, it is enabled and here disabled back. Original state saved to
> locked variable.
>
yes, the locked variable keeps that state...


> > +   if (locked) {
> > ready = e1000_is_mphy_ready(hw);
> > if (!ready)
> > return -E1000_ERR_PHY;
> > +
> > E1000_WRITE_REG(hw, E1000_MPHY_ADDR_CTRL,
> > E1000_MPHY_DIS_ACCESS);
> > +   }
> >
>
> ...
>
> > diff --git a/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_82599.c
> b/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_82599.c
> > index 017dfe16c73f..dc2a4fb61c25 100644
> > --- a/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_82599.c
> > +++ b/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_82599.c
> > @@ -870,9 +870,9 @@ s32 ixgbe_setup_mac_link_82599(struct ixgbe_hw *hw,
> > if (speed & IXGBE_LINK_SPEED_10GB_FULL)
> > if (orig_autoc & IXGBE_AUTOC_KX4_SUPP)
> > autoc |= IXGBE_AUTOC_KX4_SUPP;
> > -   if ((orig_autoc & IXGBE_AUTOC_KR_SUPP) &&
> > -   (hw->phy.smart_speed_active == false))
> > -   autoc |= IXGBE_AUTOC_KR_SUPP;
> > +   if ((orig_autoc & IXGBE_AUTOC_KR_SUPP) &&
> > +   (hw->phy.smart_speed_active == false))
> > +   autoc |= IXGBE_AUTOC_KR_SUPP;
> Can you please check following commit:
>
> https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=55461ddbc

cool thanks for the information.

>
> > if (speed & IXGBE_LINK_SPEED_1GB_FULL)
> > autoc |= IXGBE_AUTOC_KX_SUPP;
> > } else if ((pma_pmd_1g == IXGBE_AUTOC_1G_SFI) &&
> > --
> > 2.9.0
> >
>
>
> Would you mind sending a new version of the patch according to the above comments?

yes sure. in a few minutes...



>
> Thanks,
> ferruh
>



-- 
thanks
anupam

[dpdk-dev] [PATCH v2] i40e: modify the meaning of single VLAN type

2016-06-21 Thread Bruce Richardson

On Tue, Jun 21, 2016 at 02:06:38PM +0300, Panu Matilainen wrote:
> On 06/21/2016 01:29 PM, Bruce Richardson wrote:
> >On Mon, Jun 13, 2016 at 04:03:32PM +0800, Beilei Xing wrote:
> >>In current i40e codebase, if single VLAN header is added in a packet,
> >>it's treated as inner VLAN. Generally, a single VLAN header is
> >>treated as the outer VLAN header. So change corresponding register
> >>for single VLAN.
> >>Meanwhile, change the meanings of inner VLAN and outer VLAN.
> >>
> >>Signed-off-by: Beilei Xing 
> >
> >This patch changes the ABI, since an app written to the original API as 
> >specified
> >e.g. to set a single vlan header, would no longer work with this change.
> >Therefore, even though the original behaviour was inconsistent with other 
> >drivers
> >it may still need to be preserved.
> >
> >I'm thinking that we may need to provide appropriately versioned copies of 
> >the
> >vlan_offload_set and vlan_tpid_set functions for backward compatibility with
> >the old ABI.
> >
> >Any other comments or thoughts on this?
> >Neil, Thomas, Panu - is this fix something that we need to provide backward
> >version-compatibility for, or given that the functions are being called 
> >through
> >a generic ethdev API mean that this can just go in as a straight bug-fix?
> 
> Since it's currently inconsistent with everything else, I'd just call it a
> bug-fix and leave it at that.
> 

Yep, makes sense.

> Besides, I don't think you could version it via the ordinary means even if
> you wanted to, due to the way it's called through eth_dev_ops etc.
> 

Good point, never thought of that! :-(

>   - Panu -

Thanks for the guidance.

/Bruce

[dpdk-dev] [PATCH v2] i40e: modify the meaning of single VLAN type

2016-06-21 Thread Bruce Richardson

On Mon, Jun 13, 2016 at 04:03:32PM +0800, Beilei Xing wrote:
> In current i40e codebase, if single VLAN header is added in a packet,
> it's treated as inner VLAN. Generally, a single VLAN header is
> treated as the outer VLAN header. So change corresponding register
> for single VLAN.
> Meanwhile, change the meanings of inner VLAN and outer VLAN.
> 
> Signed-off-by: Beilei Xing 
> ---
> v2 changes:
>  Combine corresponding i40e driver changes into this patch.
> 
>  doc/guides/rel_notes/release_16_07.rst |  3 +++
>  drivers/net/i40e/i40e_ethdev.c | 29 -
>  lib/librte_ether/rte_ethdev.h  |  4 ++--
>  3 files changed, 25 insertions(+), 11 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/release_16_07.rst 
> b/doc/guides/rel_notes/release_16_07.rst
> index c0f6b02..ae02824 100644
> --- a/doc/guides/rel_notes/release_16_07.rst
> +++ b/doc/guides/rel_notes/release_16_07.rst
> @@ -135,6 +135,9 @@ API Changes
>ibadcrc, ibadlen, imcasts, fdirmatch, fdirmiss,
>tx_pause_xon, rx_pause_xon, tx_pause_xoff, rx_pause_xoff.
>  
> +* The meanings of ``ETH_VLAN_TYPE_INNER`` and ``ETH_VLAN_TYPE_OUTER`` in
> +  ``rte_vlan_type`` are changed.
> +

This change of meaning is not a general change across the whole API but is
just for the i40e driver. Rather than noting it as an API change, I think it's
better to document it as a "fixed issue" in the i40e driver.

Also, I think it is better explained as a change in the type of a single vlan
tag, rather than referring to it as a switch in meaning of inner and outer
tags generally. Can you reword the message, please?

/Bruce

[dpdk-dev] [PATCH v3 1/3] port: add kni interface support

2016-06-21 Thread Dumitrescu, Cristian

Hi Ethan,

Thanks very much for sending the new version.

You are absolutely right about the param->parsed issue, sorry, my fault.

I think you need to use the --cover-letter flag for the git format-patch
command. You can practice by sending the patch set to your email address first
before you send it to the list. Do you want to try sending a new revision of
this patch set with the cover letter included (preferred) or do you want to
stop at v3? Please let us know.

Thanks,
Cristian

From: zhuangweijie at gmail.com [mailto:zhuangwei...@gmail.com] On Behalf Of 
Ethan
Sent: Tuesday, June 21, 2016 12:11 PM
To: Dumitrescu, Cristian 
Cc: dev at dpdk.org; Singh, Jasvinder ; Yigit, 
Ferruh 
Subject: Re: [PATCH v3 1/3] port: add kni interface support

Hi Cristian,

New patch has been submitted. All comments are fixed except this one:
"Here is one bug for you, you need to make sure you add the following line here:
param->parsed = 1;"
I think the new convention is to set this flag by the macro 
PARSE_CHECK_DUPLICATE_SECTION.

BTW, although I use the  --cover-letter and --annotate flags in the send-email 
command, it seems no cover letter is created.
I am not very familiar with this. So sorry!


B.R.
Ethan

2016-06-19 0:44 GMT+08:00 Dumitrescu, Cristian <cristian.dumitrescu at intel.com>:
Hi Ethan,

Thank you, here are some comments inlined below.

Please reorganize this patch in a slightly different way to look similar to 
other DPDK patch sets and also ease up the integration work for Thomas:
Patch 0: I suggest adding a cover letter;
Patch 1: all librte_port changes (rte_port_kni.h, rte_port_kni.c, 
Makefile, rte_port_version.map), including the "nodrop" KNI port version
Patch 2: all ip_pipeline app changes
Patch 3: ip_pipeline app kni.cfg file
Patch 4: Documentation changes

> -Original Message-
> From: WeiJie Zhuang [mailto:zhuangwj at gmail.com]
> Sent: Thursday, June 16, 2016 12:27 PM
> To: Dumitrescu, Cristian <cristian.dumitrescu at intel.com>
> Cc: dev at dpdk.org; Singh, Jasvinder <jasvinder.singh at intel.com>; Yigit,
> Ferruh <ferruh.yigit at intel.com>; WeiJie Zhuang <zhuangwj at gmail.com>
> Subject: [PATCH v3 1/3] port: add kni interface support
>
> 1. add KNI port type to the packet framework
> 2. add KNI support to the IP Pipeline sample Application
> 3. some bug fix
>
> Signed-off-by: WeiJie Zhuang <zhuangwj at gmail.com>
> ---
> v2:
> * Fix check patch error.
> v3:
> * Fix code review comments.
> ---
>  doc/api/doxy-api-index.md  |   1 +
>  examples/ip_pipeline/Makefile  |   2 +-
>  examples/ip_pipeline/app.h | 181 +++-
>  examples/ip_pipeline/config/kni.cfg|  67 +
>  examples/ip_pipeline/config_check.c|  26 +-
>  examples/ip_pipeline/config_parse.c| 166 ++-
>  examples/ip_pipeline/init.c| 132 -
>  examples/ip_pipeline/pipeline/pipeline_common_fe.c |  29 ++
>  examples/ip_pipeline/pipeline/pipeline_master_be.c |   6 +
>  examples/ip_pipeline/pipeline_be.h |  27 ++
>  lib/librte_port/Makefile   |   7 +
>  lib/librte_port/rte_port_kni.c | 325 
> +
>  lib/librte_port/rte_port_kni.h |  82 ++
>  lib/librte_port/rte_port_version.map   |   8 +
>  14 files changed, 1047 insertions(+), 12 deletions(-)
>  create mode 100644 examples/ip_pipeline/config/kni.cfg
>  create mode 100644 lib/librte_port/rte_port_kni.c
>  create mode 100644 lib/librte_port/rte_port_kni.h
>
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index f626386..5e7f024 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -118,6 +118,7 @@ There are many libraries, so their headers may be
> grouped by topics:
>  [frag] (@ref rte_port_frag.h),
>  [reass](@ref rte_port_ras.h),
>  [sched](@ref rte_port_sched.h),
> +[kni]  (@ref rte_port_kni.h),
>  [src/sink] (@ref rte_port_source_sink.h)
>* [table](@ref rte_table.h):
>  [lpm IPv4] (@ref rte_table_lpm.h),
> diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
> index 5827117..6dc3f52 100644
> --- a/examples/ip_pipeline/Makefile
> +++ b/examples/ip_pipeline/Makefile
> @@ -1,6 +1,6 @@
>  #   BSD LICENSE
>  #
> -#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> +#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
>  #   All rights reserved.
>  #
>  #   Redistribution and use in source and binary forms, with or witho

[dpdk-dev] [PATCH] kni : fix build errors for gcc --version >= 6.1

2016-06-21 Thread Anupam Kapoor

This commit fixes build errors triggered due to misleading indentation.

Fixes: 38db3f7f50bde (e1000: update base driver)
Fixes: 3fc5ca2f63529 (kni: initial import)


Signed-off-by: Anupam Kapoor 
---
 lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_phy.c | 6 --
 lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_82599.c | 3 ++-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_phy.c
b/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_phy.c
index df224702ed7d..140a2a476ed2 100644
--- a/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_phy.c
+++ b/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_phy.c
@@ -3300,12 +3300,13 @@ s32 e1000_read_phy_reg_mphy(struct e1000_hw *hw,
u32 address, u32 *data)
*data = E1000_READ_REG(hw, E1000_MPHY_DATA);

/* Disable access to mPHY if it was originally disabled */
-   if (locked)
+   if (locked) {
ready = e1000_is_mphy_ready(hw);
if (!ready)
return -E1000_ERR_PHY;
E1000_WRITE_REG(hw, E1000_MPHY_ADDR_CTRL,
E1000_MPHY_DIS_ACCESS);
+   }

return E1000_SUCCESS;
 }
@@ -3365,12 +3366,13 @@ s32 e1000_write_phy_reg_mphy(struct e1000_hw *hw,
u32 address, u32 data,
E1000_WRITE_REG(hw, E1000_MPHY_DATA, data);

/* Disable access to mPHY if it was originally disabled */
-   if (locked)
+   if (locked) {
ready = e1000_is_mphy_ready(hw);
if (!ready)
return -E1000_ERR_PHY;
E1000_WRITE_REG(hw, E1000_MPHY_ADDR_CTRL,
E1000_MPHY_DIS_ACCESS);
+   }

return E1000_SUCCESS;
 }
diff --git a/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_82599.c
b/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_82599.c
index 017dfe16c73f..c6f4130d78ab 100644
--- a/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_82599.c
+++ b/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_82599.c
@@ -867,12 +867,13 @@ s32 ixgbe_setup_mac_link_82599(struct ixgbe_hw *hw,
link_mode == IXGBE_AUTOC_LMS_KX4_KX_KR_SGMII) {
/* Set KX4/KX/KR support according to speed requested */
autoc &= ~(IXGBE_AUTOC_KX4_KX_SUPP_MASK |
IXGBE_AUTOC_KR_SUPP);
-   if (speed & IXGBE_LINK_SPEED_10GB_FULL)
+   if (speed & IXGBE_LINK_SPEED_10GB_FULL) {
if (orig_autoc & IXGBE_AUTOC_KX4_SUPP)
autoc |= IXGBE_AUTOC_KX4_SUPP;
if ((orig_autoc & IXGBE_AUTOC_KR_SUPP) &&
(hw->phy.smart_speed_active == false))
autoc |= IXGBE_AUTOC_KR_SUPP;
+   }
if (speed & IXGBE_LINK_SPEED_1GB_FULL)
autoc |= IXGBE_AUTOC_KX_SUPP;
} else if ((pma_pmd_1g == IXGBE_AUTOC_1G_SFI) &&
-- 
2.9.0

--
thanks
anupam
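
For readers following the indentation discussion, the sketch below shows in
plain C why GCC 6 and later warn here (-Wmisleading-indentation): without
braces, only the first statement after the if is conditional. It is a
simplified illustration, not the e1000 code. All demo_* names are
hypothetical, and the unbraced variant is written with honest indentation so
that it compiles cleanly. Note that in this sketch adding the braces changes
when the "disable" step runs; which behaviour the driver wants is a question
for the review thread, not something the sketch settles.

```c
#include <assert.h>
#include <stdbool.h>

static int disable_calls; /* counts "disable access" writes in this demo */

/* Hypothetical stand-ins for e1000_is_mphy_ready() and the register write. */
static bool demo_is_ready(void) { return true; }
static void demo_disable_access(void) { disable_calls++; }

/* How the compiler parses the pre-patch code: only the assignment is guarded. */
static int
demo_restore_unbraced(bool locked)
{
	bool ready = true;

	if (locked)
		ready = demo_is_ready();
	if (!ready)
		return -1;
	demo_disable_access(); /* reached even when locked is false */
	return 0;
}

/* Post-patch shape: the braces put all three statements under the condition. */
static int
demo_restore_braced(bool locked)
{
	if (locked) {
		bool ready = demo_is_ready();

		if (!ready)
			return -1;
		demo_disable_access();
	}
	return 0;
}

/* Self-check: with locked == false only the unbraced variant writes. */
static void
demo_check(void)
{
	disable_calls = 0;
	assert(demo_restore_unbraced(false) == 0 && disable_calls == 1);
	assert(demo_restore_braced(false) == 0 && disable_calls == 1);
}
```

With locked set to false, the unbraced variant still performs the write while
the braced variant does not.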

[dpdk-dev] [PATCH 1/3] mk: fix librte_pipeline dependency list truncation

2016-06-21 Thread Panu Matilainen

On 06/21/2016 01:58 PM, Dumitrescu, Cristian wrote:
>
>
>> -Original Message-
>> From: Panu Matilainen [mailto:pmatilai at redhat.com]
>> Sent: Tuesday, June 21, 2016 11:45 AM
>> To: Richardson, Bruce 
>> Cc: Dumitrescu, Cristian ; dev at dpdk.org;
>> christian.ehrhardt at canonical.com; thomas.monjalon at 6wind.com
>> Subject: Re: [dpdk-dev] [PATCH 1/3] mk: fix librte_pipeline dependency list
>> truncation
>>
>> On 06/21/2016 01:31 PM, Bruce Richardson wrote:
>>> On Tue, Jun 21, 2016 at 01:25:52PM +0300, Panu Matilainen wrote:
 On 06/21/2016 01:01 PM, Dumitrescu, Cristian wrote:
>
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Panu
>> Matilainen
>> Sent: Tuesday, June 21, 2016 9:12 AM
>> To: dev at dpdk.org
>> Cc: christian.ehrhardt at canonical.com; thomas.monjalon at 6wind.com
>> Subject: [dpdk-dev] [PATCH 1/3] mk: fix librte_pipeline dependency list
>> truncation
>>
>> In other libraries, dependency list is always appended to, but
>> in commit 6cbf4f75e059 it with an assignment. This causes the
>> librte_eal dependency added in commit 6cbf4f75e059 to get discarded,
>> resulting in missing dependency on librte_eal.
>>
>> Fixes: b3688bee81a8 ("pipeline: new packet framework logic")
>> Fixes: 6cbf4f75e059 ("mk: fix missing internal dependencies")
>>
>> Signed-off-by: Panu Matilainen 
>> ---
>> lib/librte_pipeline/Makefile | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/lib/librte_pipeline/Makefile b/lib/librte_pipeline/Makefile
>> index 95387aa..a8f3128 100644
>> --- a/lib/librte_pipeline/Makefile
>> +++ b/lib/librte_pipeline/Makefile
>> @@ -53,7 +53,7 @@ SYMLINK-$(CONFIG_RTE_LIBRTE_PIPELINE)-
>> include +=
>> rte_pipeline.h
>>
>> # this lib depends upon:
>> DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_eal
>> -DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) := lib/librte_table
>> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_table
>> DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_port
>>
>> include $(RTE_SDK)/mk/rte.lib.mk
>> --
>> 2.5.5
>
>
> In release 16.4, EAL was missing from the dependency list, now it is
>> added. The librte_pipeline uses rte_malloc, therefore it depends on
>> librte_eal being present.
>
> In the Makefile of the other Packet Framework libraries (librte_port,
>> librte_table), it looks like the first dependency in the list is EAL, which 
>> is listed
>> with the assignment operator, followed by others that are listed with the
>> append operator:
>   DEPDIRS-$(CONFIG_RTE_LIBRTE_XYZ) := lib/librte_eal
>   DEPDIRS-$(CONFIG_RTE_LIBRTE_XYZ) += lib/librte_some other lib
>
> Therefore, at least for cosmetic reasons, we should probably do the
>> same in librte_pipeline, which requires changing both the librte_eal and the
>> librte_table lines as below:
>   DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) := lib/librte_eal
>   DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_table
>   DEPDIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += lib/librte_port

 Ah, didn't notice those because the assignment is first of the
>> dependencies.

>
> However, some other libraries e.g. librte_lpm simply add the EAL
>> dependency using the append operator:
>   DEPDIRS-$(CONFIG_RTE_LIBRTE_LPM) += lib/librte_eal
>
> To be honest, I need to refresh my knowledge on make, I don't
>> remember right now when we should use the assignment and when the
>> append. Do we need to use the assign for first dependency (EAL) and
>> append for others or should we use append everywhere?

 At least in automake, you need to assign before you can append. But in gmake
 this apparently is not the case, quoting from
 https://www.gnu.org/software/make/manual/html_node/Appending.html:

 "When the variable in question has not been defined before, '+=' acts just
 like normal '=': it defines a recursively-expanded variable. However, when
 there is a previous definition, exactly what '+=' does depends on what
 flavor of variable you defined originally."

 So there's no need to use := anywhere for the dependencies; in fact it's
 probably best avoided to prevent issues like this. Of course after the third
 patch in this "series" is applied, mistakes like these can no longer go
 unnoticed.

>>> Will the build be any slower with everything defaulting to recursively
>> expanded
>>> variables rather than the simply-expanded variables defined by the initial
>> ":="?
>>
>> Bruce, everything already *is* defaulting to recursively expanded
>> variables, except for the three libraries here which have used := for
>> who knows what (historical or other) reason. And out of those three
>> exceptions, one is buggy. Which is what I'm addressing here.
>>
>>  - Panu -
>
> Yes, you'

[dpdk-dev] [PATCH] mem: skip memory locking on failure

2016-06-21 Thread Panu Matilainen

On 06/14/2016 05:12 PM, Olivier MATZ wrote:
> Hi Panu,
>
> On 06/14/2016 03:21 PM, Panu Matilainen wrote:
>> On 06/13/2016 01:26 PM, Olivier Matz wrote:
>>> Since recently [1], it is not possible to run the dpdk with user
>>> (non-root) privileges and the --no-huge option. This is because the eal
>>> layer tries to lock the memory. Using locked memory is mandatory for
>>> physical devices because they reference physical addresses.
>>>
>>> But a user may want to start the dpdk without locked memory, because he
>>> does not have the permission to do so, and/or does not have this need.
>>>
>>> Moreover, the option --no-huge is still not functional today since the
>>> physical memory address is not properly filled, so the initial patch is
>>> not really useful.
>>>
>>> This commit fixes this issue by retrying the mmap() without the
>>> MAP_LOCKED flag if the first mmap() failed.
>>>
>>> [1] http://www.dpdk.org/ml/archives/dev/2016-May/039404.html
>>>
>>> Fixes: 593a084afc2b ("mem: lock pages when not using hugepages")
>>> Reported-by: Panu Matilainen 
>>> Signed-off-by: Olivier Matz 
>>> ---
>>>  lib/librte_eal/linuxapp/eal/eal_memory.c | 9 +
>>>  1 file changed, 9 insertions(+)
>>>
>>> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c
>>> b/lib/librte_eal/linuxapp/eal/eal_memory.c
>>> index 79d1d2d..08692d1 100644
>>> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
>>> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
>>> @@ -1075,6 +1075,15 @@ rte_eal_hugepage_init(void)
>>>  if (internal_config.no_hugetlbfs) {
>>>  addr = mmap(NULL, internal_config.memory, PROT_READ |
>>> PROT_WRITE,
>>>  MAP_LOCKED | MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
>>> +/* retry without MAP_LOCKED */
>>> +if (addr == MAP_FAILED && errno == EAGAIN) {
>>> +addr = mmap(NULL, internal_config.memory,
>>> +PROT_READ | PROT_WRITE,
>>> +MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
>>> +if (addr != MAP_FAILED)
>>> +RTE_LOG(NOTICE, EAL,
>>> +"Cannot lock memory: don't use physical
>>> devices\n");
>>> +}
>>>  if (addr == MAP_FAILED) {
>>>  RTE_LOG(ERR, EAL, "%s: mmap() failed: %s\n", __func__,
>>>  strerror(errno));
>>>
>>
>> I'm not really that familiar with dpdk memory usage, but gut feeling
>> says such a thing needs to be explicit - either you explicitly ask for
>> memory that doesn't need to be locked, or this simply fails with no
>> retries. Or something like that. But "maybe I did, maybe I didn't"
>> doesn't seem like very good API semantics to me :)
>
> Yes, you're right. Anyway as this commit is not useful today,
> it would be better to revert it.

I suppose you mean revert the memlock commit, ie this?

commit 593a084afc2b441895aeca78a2c4465e450d0ef5
Author: Olivier Matz 
Date:   Wed May 18 13:04:42 2016 +0200

 mem: lock pages when not using hugepages

Reverting that would help in the sense that then we could make the 
test-suite runnable by regular users (I've some patches for this), and 
once that is in place it would sort of force dealing with the issue one 
way or the other in future work in this area :)

>
>> Are there actual plans to make --no-huge work with real devices?
>
> I think this is something that could be part of the memory
> rework referenced by Thomas:
> http://dpdk.org/ml/archives/dev/2016-April/037444.html
>
> I don't know if it's planned yet.
>
>
>> If not then documenting --no-huge to imply unlocked memory is one
>> option I guess.
>
> There are already some words in the known issues:
> http://dpdk.org/doc/guides/rel_notes/known_issues.html?highlight=known%20issues#pmd-does-not-work-with-no-huge-eal-command-line-parameter

Right, so it wouldn't be a regression at least.

- Panu -

>
> Maybe we could add something somewhere else, but I did not find
> any doc referencing eal options. Only a guide for testpmd here:
> http://dpdk.org/doc/guides/testpmd_app_ug/run_app.html#eal-command-line-options
>
>
> John, maybe you have an idea?
>
> Thanks
> Olivier
>
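
As a reference for the retry being discussed, here is a minimal Linux-only C
sketch of "try MAP_LOCKED, fall back on EAGAIN". It is not the EAL code:
demo_map_prefer_locked and its locked out-parameter are hypothetical, added
here so that the caller can see which kind of mapping it received instead of
getting a silent "maybe locked, maybe not" result. The sketch passes -1 as
the file descriptor, the conventional value for anonymous mappings, whereas
the quoted patch passes 0.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for MAP_ANONYMOUS and MAP_LOCKED on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map len bytes of anonymous memory, preferring locked
 * pages. *locked is set to 1 if the locked mapping succeeded, 0 otherwise.
 */
static void *
demo_map_prefer_locked(size_t len, int *locked)
{
	const int prot = PROT_READ | PROT_WRITE;
	const int flags = MAP_PRIVATE | MAP_ANONYMOUS;
	void *addr;

	assert(locked != NULL);

	addr = mmap(NULL, len, prot, flags | MAP_LOCKED, -1, 0);
	*locked = (addr != MAP_FAILED);

	/* Retry without MAP_LOCKED only for EAGAIN, as the quoted patch does. */
	if (addr == MAP_FAILED && errno == EAGAIN)
		addr = mmap(NULL, len, prot, flags, -1, 0);

	return addr;
}
```

A caller that needs physical devices could treat locked == 0 as an error,
while a test suite run by a regular user could accept the unlocked mapping.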

[dpdk-dev] [PATCH v4 00/17] prepare for rte_device / rte_driver

2016-06-21 Thread Shreyansh Jain

* Original patch series is from David Marchand [1], [2].
* Cover letter text has been modified to make it author agnostic

David created the original patchset based on the discussions on list [3].
Being a large piece of work, this patchset introduces first level of changes
for generalizing the driver-device relationship for supporting hotplug.

Pending work, as per discussions in thread [3]:
- Hierarchical relationship between rte_driver/device, pci_*, crypto_*
- Cleaner device init/deinit methods (probably from rte_driver onwards)
- Moving generic flags/fields from pci_* structure to rte_* structure
- Removing dependency on devargs for pdev/vdev distinction
- Device/Driver lists: discussion and decision on separate or unified lists

Changes since v3:
- rebase over HEAD (913154e)
- Update arguments to RTE_EAL_PCI_REGISTER macro as per Jan's suggestion
- modify qede driver to use RTE_EAL_PCI_REGISTER
- Argument check in hotplug functions

Changes since v2:
- rebase over HEAD (d76c193)
- Move SYSFS_PCI_DRIVERS macro to rte_pci.h to avoid compilation issue

Changes since v1:
- rebased on HEAD, new drivers should be okay
- patches have been split into smaller pieces
- RTE_INIT macro has been added, but in the end, I am not sure it is useful
- device type has been removed from ethdev, as it was used only by hotplug
- getting rid of pmd type in eal patch (patch 5 of initial series) has been
  dropped for now, we can do this once vdev drivers have been converted

[1] http://dpdk.org/ml/archives/dev/2016-January/032387.html
[2] http://dpdk.org/ml/archives/dev/2016-April/037686.html
[3] http://dpdk.org/ml/archives/dev/2016-January/031390.html

David Marchand (17):
  pci: no need for dynamic tailq init
  crypto: no need for a crypto pmd type
  drivers: align pci driver definitions
  eal: remove duplicate function declaration
  eal: introduce init macros
  crypto: export init/uninit common wrappers for pci drivers
  ethdev: export init/uninit common wrappers for pci drivers
  drivers: convert all pdev drivers as pci drivers
  crypto: get rid of crypto driver register callback
  ethdev: get rid of eth driver register callback
  eal/linux: move back interrupt thread init before setting affinity
  pci: add a helper for device name
  pci: add a helper to update a device
  ethdev: do not scan all pci devices on attach
  eal: add hotplug operations for pci and vdev
  ethdev: convert to eal hotplug
  ethdev: get rid of device type

 app/test/virtual_pmd.c  |   2 +-
 drivers/crypto/qat/rte_qat_cryptodev.c  |  18 +-
 drivers/net/af_packet/rte_eth_af_packet.c   |   2 +-
 drivers/net/bnx2x/bnx2x_ethdev.c|  35 +---
 drivers/net/bonding/rte_eth_bond_api.c  |   2 +-
 drivers/net/cxgbe/cxgbe_ethdev.c|  24 +--
 drivers/net/cxgbe/cxgbe_main.c  |   2 +-
 drivers/net/e1000/em_ethdev.c   |  16 +-
 drivers/net/e1000/igb_ethdev.c  |  40 +
 drivers/net/ena/ena_ethdev.c|  20 +--
 drivers/net/enic/enic_ethdev.c  |  23 +--
 drivers/net/fm10k/fm10k_ethdev.c|  23 +--
 drivers/net/i40e/i40e_ethdev.c  |  26 +--
 drivers/net/i40e/i40e_ethdev_vf.c   |  25 +--
 drivers/net/ixgbe/ixgbe_ethdev.c|  47 +
 drivers/net/mlx4/mlx4.c |  22 +--
 drivers/net/mlx5/mlx5.c |  21 +--
 drivers/net/mpipe/mpipe_tilegx.c|   2 +-
 drivers/net/nfp/nfp_net.c   |  23 +--
 drivers/net/null/rte_eth_null.c |   2 +-
 drivers/net/pcap/rte_eth_pcap.c |   2 +-
 drivers/net/qede/qede_ethdev.c  |  40 +
 drivers/net/ring/rte_eth_ring.c |   2 +-
 drivers/net/szedata2/rte_eth_szedata2.c |  25 +--
 drivers/net/vhost/rte_eth_vhost.c   |   2 +-
 drivers/net/virtio/virtio_ethdev.c  |  26 +--
 drivers/net/vmxnet3/vmxnet3_ethdev.c|  23 +--
 drivers/net/xenvirt/rte_eth_xenvirt.c   |   2 +-
 examples/ip_pipeline/init.c |  22 ---
 lib/librte_cryptodev/rte_cryptodev.c|  67 ++-
 lib/librte_cryptodev/rte_cryptodev.h|   2 -
 lib/librte_cryptodev/rte_cryptodev_pmd.h|  45 ++---
 lib/librte_cryptodev/rte_cryptodev_version.map  |   9 +-
 lib/librte_eal/bsdapp/eal/eal_pci.c |  52 +-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map   |   2 +
 lib/librte_eal/common/eal_common_dev.c  |  47 +
 lib/librte_eal/common/eal_common_pci.c  |  19 +-
 lib/librte_eal/common/eal_private.h |  20 ++-
 lib/librte_eal/common/include/rte_dev.h |  29 +++-
 lib/librte_eal/common/include/rte_eal.h |   3 +
 lib/librte_eal/common/include/rte_pci.h |  36 
 lib/librte_eal/common/include/rte_tailq.h   |   4 +-
 lib/librte_eal/linuxapp/eal/eal.c   |   7 +-
 lib/l

[dpdk-dev] [PATCH v4 01/17] pci: no need for dynamic tailq init

2016-06-21 Thread Shreyansh Jain

These lists can be initialized once and for all at build time.
With this, those lists are only manipulated in a common place
(and we could even make them private).

A nice side effect is that pci drivers can now register in constructors.

Signed-off-by: David Marchand 
Reviewed-by: Jan Viktorin 
Signed-off-by: Shreyansh Jain 
---
 lib/librte_eal/bsdapp/eal/eal_pci.c| 3 ---
 lib/librte_eal/common/eal_common_pci.c | 6 --
 lib/librte_eal/linuxapp/eal/eal_pci.c  | 3 ---
 3 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c 
b/lib/librte_eal/bsdapp/eal/eal_pci.c
index 7fdd6f1..880483d 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -623,9 +623,6 @@ rte_eal_pci_ioport_unmap(struct rte_pci_ioport *p)
 int
 rte_eal_pci_init(void)
 {
-   TAILQ_INIT(&pci_driver_list);
-   TAILQ_INIT(&pci_device_list);
-
/* for debug purposes, PCI can be disabled */
if (internal_config.no_pci)
return 0;
diff --git a/lib/librte_eal/common/eal_common_pci.c 
b/lib/librte_eal/common/eal_common_pci.c
index ba5283d..fee4aa5 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -82,8 +82,10 @@

 #include "eal_private.h"

-struct pci_driver_list pci_driver_list;
-struct pci_device_list pci_device_list;
+struct pci_driver_list pci_driver_list =
+   TAILQ_HEAD_INITIALIZER(pci_driver_list);
+struct pci_device_list pci_device_list =
+   TAILQ_HEAD_INITIALIZER(pci_device_list);

 #define SYSFS_PCI_DEVICES "/sys/bus/pci/devices"

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c 
b/lib/librte_eal/linuxapp/eal/eal_pci.c
index f9c3efd..bfc410f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -743,9 +743,6 @@ rte_eal_pci_ioport_unmap(struct rte_pci_ioport *p)
 int
 rte_eal_pci_init(void)
 {
-   TAILQ_INIT(&pci_driver_list);
-   TAILQ_INIT(&pci_device_list);
-
/* for debug purposes, PCI can be disabled */
if (internal_config.no_pci)
return 0;
-- 
2.7.4
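
The effect of the change above can be sketched outside DPDK. This is a minimal illustration, assuming a GCC/Clang toolchain and the BSD-style <sys/queue.h>; the demo_* names are hypothetical and are not part of the patch. It shows why a list head initialized at build time lets a constructor register an entry before any explicit init function has run.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical driver record, standing in for struct rte_pci_driver. */
struct demo_driver {
	TAILQ_ENTRY(demo_driver) next;
	const char *name;
};

TAILQ_HEAD(demo_driver_list, demo_driver);

/* Initialized at build time, so the head is valid before any code runs. */
static struct demo_driver_list demo_driver_list =
	TAILQ_HEAD_INITIALIZER(demo_driver_list);

/* Append a driver to the list (hypothetical registration helper). */
static void demo_register(struct demo_driver *drv)
{
	assert(drv != NULL);
	TAILQ_INSERT_TAIL(&demo_driver_list, drv, next);
}

static struct demo_driver demo_a = { .name = "demo_a" };

/* Runs before main(); safe only because the head needs no TAILQ_INIT(). */
static void __attribute__((constructor, used)) demo_a_init(void)
{
	demo_register(&demo_a);
}

/* Count the registered drivers by walking the list. */
static size_t demo_count(void)
{
	size_t n = 0;
	struct demo_driver *drv;

	TAILQ_FOREACH(drv, &demo_driver_list, next)
		n++;
	return n;
}
```

With a runtime TAILQ_INIT() in an init function, the same constructor would insert into a head that is reset afterwards, and the entry would be lost.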

[dpdk-dev] [PATCH v4 02/17] crypto: no need for a crypto pmd type

2016-06-21 Thread Shreyansh Jain

This information is not used and just adds noise.

Signed-off-by: David Marchand 
Signed-off-by: Shreyansh Jain 
---
 lib/librte_cryptodev/rte_cryptodev.c | 8 +++-
 lib/librte_cryptodev/rte_cryptodev.h | 2 --
 lib/librte_cryptodev/rte_cryptodev_pmd.h | 3 +--
 3 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/lib/librte_cryptodev/rte_cryptodev.c 
b/lib/librte_cryptodev/rte_cryptodev.c
index 960e2d5..b0d806c 100644
--- a/lib/librte_cryptodev/rte_cryptodev.c
+++ b/lib/librte_cryptodev/rte_cryptodev.c
@@ -230,7 +230,7 @@ rte_cryptodev_find_free_device_index(void)
 }

 struct rte_cryptodev *
-rte_cryptodev_pmd_allocate(const char *name, enum pmd_type type, int socket_id)
+rte_cryptodev_pmd_allocate(const char *name, int socket_id)
 {
struct rte_cryptodev *cryptodev;
uint8_t dev_id;
@@ -269,7 +269,6 @@ rte_cryptodev_pmd_allocate(const char *name, enum pmd_type 
type, int socket_id)
cryptodev->data->dev_started = 0;

cryptodev->attached = RTE_CRYPTODEV_ATTACHED;
-   cryptodev->pmd_type = type;

cryptodev_globals.nb_devs++;
}
@@ -318,7 +317,7 @@ rte_cryptodev_pmd_virtual_dev_init(const char *name, size_t 
dev_private_size,
struct rte_cryptodev *cryptodev;

/* allocate device structure */
-   cryptodev = rte_cryptodev_pmd_allocate(name, PMD_VDEV, socket_id);
+   cryptodev = rte_cryptodev_pmd_allocate(name, socket_id);
if (cryptodev == NULL)
return NULL;

@@ -360,8 +359,7 @@ rte_cryptodev_init(struct rte_pci_driver *pci_drv,
rte_cryptodev_create_unique_device_name(cryptodev_name,
sizeof(cryptodev_name), pci_dev);

-   cryptodev = rte_cryptodev_pmd_allocate(cryptodev_name, PMD_PDEV,
-   rte_socket_id());
+   cryptodev = rte_cryptodev_pmd_allocate(cryptodev_name, rte_socket_id());
if (cryptodev == NULL)
return -ENOMEM;

diff --git a/lib/librte_cryptodev/rte_cryptodev.h 
b/lib/librte_cryptodev/rte_cryptodev.h
index 27cf8ef..f22eb43 100644
--- a/lib/librte_cryptodev/rte_cryptodev.h
+++ b/lib/librte_cryptodev/rte_cryptodev.h
@@ -700,8 +700,6 @@ struct rte_cryptodev {

enum rte_cryptodev_type dev_type;
/**< Crypto device type */
-   enum pmd_type pmd_type;
-   /**< PMD type - PDEV / VDEV */

struct rte_cryptodev_cb_list link_intr_cbs;
/**< User application callback for interrupts if present */
diff --git a/lib/librte_cryptodev/rte_cryptodev_pmd.h 
b/lib/librte_cryptodev/rte_cryptodev_pmd.h
index 7d049ea..c977c61 100644
--- a/lib/librte_cryptodev/rte_cryptodev_pmd.h
+++ b/lib/librte_cryptodev/rte_cryptodev_pmd.h
@@ -454,13 +454,12 @@ struct rte_cryptodev_ops {
  * to that slot for the driver to use.
  *
  * @param  nameUnique identifier name for each device
- * @param  typeDevice type of this Crypto device
  * @param  socket_id   Socket to allocate resources on.
  * @return
  *   - Slot in the rte_dev_devices array for a new device;
  */
 struct rte_cryptodev *
-rte_cryptodev_pmd_allocate(const char *name, enum pmd_type type, int 
socket_id);
+rte_cryptodev_pmd_allocate(const char *name, int socket_id);

 /**
  * Creates a new virtual crypto device and returns the pointer
-- 
2.7.4

[dpdk-dev] [PATCH v4 03/17] drivers: align pci driver definitions

2016-06-21 Thread Shreyansh Jain

Pure coding style, but it might make it easier later if we want to move
fields in rte_cryptodev_driver and eth_driver structures.

Signed-off-by: David Marchand 
Signed-off-by: Shreyansh Jain 
---
 drivers/crypto/qat/rte_qat_cryptodev.c | 2 +-
 drivers/net/ena/ena_ethdev.c   | 2 +-
 drivers/net/nfp/nfp_net.c  | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/crypto/qat/rte_qat_cryptodev.c 
b/drivers/crypto/qat/rte_qat_cryptodev.c
index a7912f5..08496ab 100644
--- a/drivers/crypto/qat/rte_qat_cryptodev.c
+++ b/drivers/crypto/qat/rte_qat_cryptodev.c
@@ -116,7 +116,7 @@ crypto_qat_dev_init(__attribute__((unused)) struct 
rte_cryptodev_driver *crypto_
 }

 static struct rte_cryptodev_driver rte_qat_pmd = {
-   {
+   .pci_drv = {
.name = "rte_qat_pmd",
.id_table = pci_id_qat_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index e157587..8d01e9a 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -1427,7 +1427,7 @@ static uint16_t eth_ena_xmit_pkts(void *tx_queue, struct 
rte_mbuf **tx_pkts,
 }

 static struct eth_driver rte_ena_pmd = {
-   {
+   .pci_drv = {
.name = "rte_ena_pmd",
.id_table = pci_id_ena_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 5c9f350..ef7011e 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -2463,7 +2463,7 @@ static struct rte_pci_id pci_id_nfp_net_map[] = {
 };

 static struct eth_driver rte_nfp_net_pmd = {
-   {
+   .pci_drv = {
.name = "rte_nfp_net_pmd",
.id_table = pci_id_nfp_net_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
-- 
2.7.4

[dpdk-dev] [PATCH v4 04/17] eal: remove duplicate function declaration

2016-06-21 Thread Shreyansh Jain

rte_eal_dev_init is declared in both eal_private.h and rte_dev.h since its
introduction.
This function has been exported in ABI, so remove it from eal_private.h

Fixes: e57f20e05177 ("eal: make vdev init path generic for both virtual and pci 
devices")
Signed-off-by: David Marchand 
Signed-off-by: Shreyansh Jain 
---
 lib/librte_eal/common/eal_private.h | 7 ---
 lib/librte_eal/linuxapp/eal/eal.c   | 1 +
 2 files changed, 1 insertion(+), 7 deletions(-)

diff --git a/lib/librte_eal/common/eal_private.h 
b/lib/librte_eal/common/eal_private.h
index 857dc3e..06a68f6 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -259,13 +259,6 @@ int rte_eal_intr_init(void);
 int rte_eal_alarm_init(void);

 /**
- * This function initialises any virtual devices
- *
- * This function is private to the EAL.
- */
-int rte_eal_dev_init(void);
-
-/**
  * Function is to check if the kernel module(like, vfio, vfio_iommu_type1,
  * etc.) loaded.
  *
diff --git a/lib/librte_eal/linuxapp/eal/eal.c 
b/lib/librte_eal/linuxapp/eal/eal.c
index 4f22c18..29fba52 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -70,6 +70,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
-- 
2.7.4

[dpdk-dev] [PATCH v4 05/17] eal: introduce init macros

2016-06-21 Thread Shreyansh Jain

Introduce a RTE_INIT macro used to mark an init function as a constructor.
Current eal macros have been converted to use this (no functional impact).
RTE_EAL_PCI_REGISTER is added as a helper for pci drivers.

RTE_EAL_PCI_REGISTER assumes that object expanded contains a pci_drv member.

Suggested-by: Jan Viktorin 
Signed-off-by: David Marchand 
Signed-off-by: Shreyansh Jain 
---
 lib/librte_eal/common/include/rte_dev.h   | 4 ++--
 lib/librte_eal/common/include/rte_eal.h   | 3 +++
 lib/librte_eal/common/include/rte_pci.h   | 8 
 lib/librte_eal/common/include/rte_tailq.h | 4 ++--
 4 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_dev.h 
b/lib/librte_eal/common/include/rte_dev.h
index f1b5507..85e48f2 100644
--- a/lib/librte_eal/common/include/rte_dev.h
+++ b/lib/librte_eal/common/include/rte_dev.h
@@ -179,8 +179,8 @@ int rte_eal_vdev_init(const char *name, const char *args);
 int rte_eal_vdev_uninit(const char *name);

 #define PMD_REGISTER_DRIVER(d)\
-void devinitfn_ ##d(void);\
-void __attribute__((constructor, used)) devinitfn_ ##d(void)\
+RTE_INIT(devinitfn_ ##d);\
+static void devinitfn_ ##d(void)\
 {\
rte_eal_driver_register(&d);\
 }
diff --git a/lib/librte_eal/common/include/rte_eal.h 
b/lib/librte_eal/common/include/rte_eal.h
index a71d6f5..186f3c6 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -252,6 +252,9 @@ static inline int rte_gettid(void)
return RTE_PER_LCORE(_thread_id);
 }

+#define RTE_INIT(func) \
+static void __attribute__((constructor, used)) func(void)
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/common/include/rte_pci.h 
b/lib/librte_eal/common/include/rte_pci.h
index fa74962..ac890fc 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -470,6 +470,14 @@ void rte_eal_pci_dump(FILE *f);
  */
 void rte_eal_pci_register(struct rte_pci_driver *driver);

+/** Helper for PCI device registeration from driver (eth, crypto) instance */
+#define RTE_EAL_PCI_REGISTER(name) \
+RTE_INIT(pciinitfn_ ##name); \
+static void pciinitfn_ ##name(void) \
+{ \
+   rte_eal_pci_register(&(name).pci_drv); \
+}
+
 /**
  * Unregister a PCI driver.
  *
diff --git a/lib/librte_eal/common/include/rte_tailq.h 
b/lib/librte_eal/common/include/rte_tailq.h
index 4a686e6..71ed3bb 100644
--- a/lib/librte_eal/common/include/rte_tailq.h
+++ b/lib/librte_eal/common/include/rte_tailq.h
@@ -148,8 +148,8 @@ struct rte_tailq_head *rte_eal_tailq_lookup(const char 
*name);
 int rte_eal_tailq_register(struct rte_tailq_elem *t);

 #define EAL_REGISTER_TAILQ(t) \
-void tailqinitfn_ ##t(void); \
-void __attribute__((constructor, used)) tailqinitfn_ ##t(void) \
+RTE_INIT(tailqinitfn_ ##t); \
+static void tailqinitfn_ ##t(void) \
 { \
if (rte_eal_tailq_register(&t) < 0) \
rte_panic("Cannot initialize tailq: %s\n", t.name); \
-- 
2.7.4
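
The macro pattern in this patch can be tried in isolation. The sketch below mirrors the shape of RTE_INIT and RTE_EAL_PCI_REGISTER under hypothetical DEMO_* names, assuming GCC/Clang attribute support; the registry array is an illustration only and does not reflect how the EAL stores drivers.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as RTE_INIT: declare func as a static constructor. */
#define DEMO_INIT(func) \
static void __attribute__((constructor, used)) func(void)

/* Hypothetical stand-ins for rte_pci_driver and eth_driver. */
struct demo_pci_driver { const char *name; };
struct demo_eth_driver { struct demo_pci_driver pci_drv; };

/* Toy registry; zero-initialized at load time, before constructors run. */
#define DEMO_MAX 8
static struct demo_pci_driver *demo_registered[DEMO_MAX];
static size_t demo_nb_registered;

static void demo_pci_register(struct demo_pci_driver *drv)
{
	assert(demo_nb_registered < DEMO_MAX);
	demo_registered[demo_nb_registered++] = drv;
}

/* Same shape as RTE_EAL_PCI_REGISTER: assumes a pci_drv member in name. */
#define DEMO_PCI_REGISTER(name) \
DEMO_INIT(demoinitfn_ ##name); \
static void demoinitfn_ ##name(void) \
{ \
	demo_pci_register(&(name).pci_drv); \
}

static struct demo_eth_driver demo_pmd = {
	.pci_drv = { .name = "demo_pmd" },
};

/* Expands to a constructor that registers demo_pmd.pci_drv before main(). */
DEMO_PCI_REGISTER(demo_pmd)
```

The first line of the expansion is a declaration carrying the constructor attribute and the second is the definition, which is why the driver object only needs a single macro line at file scope.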

[dpdk-dev] [PATCH v4 06/17] crypto: export init/uninit common wrappers for pci drivers

2016-06-21 Thread Shreyansh Jain

Preparing for getting rid of rte_cryptodev_driver, here are two wrappers
that can be used by pci drivers that assume a 1 to 1 association between
pci resource and upper interface.

Signed-off-by: David Marchand 
Signed-off-by: Shreyansh Jain 
---
 lib/librte_cryptodev/rte_cryptodev.c   | 16 
 lib/librte_cryptodev/rte_cryptodev_pmd.h   | 12 
 lib/librte_cryptodev/rte_cryptodev_version.map |  8 
 3 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/lib/librte_cryptodev/rte_cryptodev.c 
b/lib/librte_cryptodev/rte_cryptodev.c
index b0d806c..65a2e29 100644
--- a/lib/librte_cryptodev/rte_cryptodev.c
+++ b/lib/librte_cryptodev/rte_cryptodev.c
@@ -340,9 +340,9 @@ rte_cryptodev_pmd_virtual_dev_init(const char *name, size_t 
dev_private_size,
return cryptodev;
 }

-static int
-rte_cryptodev_init(struct rte_pci_driver *pci_drv,
-   struct rte_pci_device *pci_dev)
+int
+rte_cryptodev_pci_probe(struct rte_pci_driver *pci_drv,
+   struct rte_pci_device *pci_dev)
 {
struct rte_cryptodev_driver *cryptodrv;
struct rte_cryptodev *cryptodev;
@@ -401,8 +401,8 @@ rte_cryptodev_init(struct rte_pci_driver *pci_drv,
return -ENXIO;
 }

-static int
-rte_cryptodev_uninit(struct rte_pci_device *pci_dev)
+int
+rte_cryptodev_pci_remove(struct rte_pci_device *pci_dev)
 {
const struct rte_cryptodev_driver *cryptodrv;
struct rte_cryptodev *cryptodev;
@@ -450,15 +450,15 @@ rte_cryptodev_pmd_driver_register(struct 
rte_cryptodev_driver *cryptodrv,
 {
/* Call crypto device initialization directly if device is virtual */
if (type == PMD_VDEV)
-   return rte_cryptodev_init((struct rte_pci_driver *)cryptodrv,
+   return rte_cryptodev_pci_probe((struct rte_pci_driver 
*)cryptodrv,
NULL);

/*
 * Register PCI driver for physical device intialisation during
 * PCI probing
 */
-   cryptodrv->pci_drv.devinit = rte_cryptodev_init;
-   cryptodrv->pci_drv.devuninit = rte_cryptodev_uninit;
+   cryptodrv->pci_drv.devinit = rte_cryptodev_pci_probe;
+   cryptodrv->pci_drv.devuninit = rte_cryptodev_pci_remove;

rte_eal_pci_register(&cryptodrv->pci_drv);

diff --git a/lib/librte_cryptodev/rte_cryptodev_pmd.h 
b/lib/librte_cryptodev/rte_cryptodev_pmd.h
index c977c61..3fb7c7c 100644
--- a/lib/librte_cryptodev/rte_cryptodev_pmd.h
+++ b/lib/librte_cryptodev/rte_cryptodev_pmd.h
@@ -534,6 +534,18 @@ rte_cryptodev_pmd_driver_register(struct 
rte_cryptodev_driver *crypto_drv,
 void rte_cryptodev_pmd_callback_process(struct rte_cryptodev *dev,
enum rte_cryptodev_event_type event);

+/**
+ * Wrapper for use by pci drivers as a .devinit function to attach to a crypto
+ * interface.
+ */
+int rte_cryptodev_pci_probe(struct rte_pci_driver *pci_drv,
+   struct rte_pci_device *pci_dev);
+
+/**
+ * Wrapper for use by pci drivers as a .devuninit function to detach a crypto
+ * interface.
+ */
+int rte_cryptodev_pci_remove(struct rte_pci_device *pci_dev);

 #ifdef __cplusplus
 }
diff --git a/lib/librte_cryptodev/rte_cryptodev_version.map 
b/lib/librte_cryptodev/rte_cryptodev_version.map
index 41004e1..8d0edfb 100644
--- a/lib/librte_cryptodev/rte_cryptodev_version.map
+++ b/lib/librte_cryptodev/rte_cryptodev_version.map
@@ -32,3 +32,11 @@ DPDK_16.04 {

local: *;
 };
+
+DPDK_16.07 {
+   global:
+
+   rte_cryptodev_pci_probe;
+   rte_cryptodev_pci_remove;
+
+} DPDK_16.04;
-- 
2.7.4

[dpdk-dev] [PATCH v4 07/17] ethdev: export init/uninit common wrappers for pci drivers

2016-06-21 Thread Shreyansh Jain

Preparing for getting rid of eth_drv, here are two wrappers that can be
used by pci drivers that assume a 1 to 1 association between pci resource and
upper interface.

Signed-off-by: David Marchand 
Signed-off-by: Shreyansh Jain 
---
 lib/librte_ether/rte_ethdev.c  | 14 +++---
 lib/librte_ether/rte_ethdev.h  | 13 +
 lib/librte_ether/rte_ether_version.map |  3 +++
 3 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 42aaef7..312c42c 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -245,9 +245,9 @@ rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
return 0;
 }

-static int
-rte_eth_dev_init(struct rte_pci_driver *pci_drv,
-struct rte_pci_device *pci_dev)
+int
+rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
+ struct rte_pci_device *pci_dev)
 {
struct eth_driver*eth_drv;
struct rte_eth_dev *eth_dev;
@@ -299,8 +299,8 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
return diag;
 }

-static int
-rte_eth_dev_uninit(struct rte_pci_device *pci_dev)
+int
+rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev)
 {
const struct eth_driver *eth_drv;
struct rte_eth_dev *eth_dev;
@@ -357,8 +357,8 @@ rte_eth_dev_uninit(struct rte_pci_device *pci_dev)
 void
 rte_eth_driver_register(struct eth_driver *eth_drv)
 {
-   eth_drv->pci_drv.devinit = rte_eth_dev_init;
-   eth_drv->pci_drv.devuninit = rte_eth_dev_uninit;
+   eth_drv->pci_drv.devinit = rte_eth_dev_pci_probe;
+   eth_drv->pci_drv.devuninit = rte_eth_dev_pci_remove;
   rte_eal_pci_register(&eth_drv->pci_drv);
 }

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index bd93bf6..2249466 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -4354,6 +4354,19 @@ rte_eth_dev_get_port_by_name(const char *name, uint8_t 
*port_id);
 int
 rte_eth_dev_get_name_by_port(uint8_t port_id, char *name);

+/**
+ * Wrapper for use by pci drivers as a .devinit function to attach to a ethdev
+ * interface.
+ */
+int rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
+ struct rte_pci_device *pci_dev);
+
+/**
+ * Wrapper for use by pci drivers as a .devuninit function to detach a ethdev
+ * interface.
+ */
+int rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_ether/rte_ether_version.map 
b/lib/librte_ether/rte_ether_version.map
index 97ed0b0..cf4581c 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -140,4 +140,7 @@ DPDK_16.07 {
rte_eth_dev_get_name_by_port;
rte_eth_dev_get_port_by_name;
rte_eth_xstats_get_names;
+   rte_eth_dev_pci_probe;
+   rte_eth_dev_pci_remove;
+
 } DPDK_16.04;
-- 
2.7.4

[dpdk-dev] [PATCH v4 08/17] drivers: convert all pdev drivers as pci drivers

2016-06-21 Thread Shreyansh Jain

Simplify crypto and ethdev pci drivers init by using newly introduced
init macros and helpers.
Those drivers then don't need to register as "rte_driver"s anymore.

virtio and mlx* drivers use the general purpose RTE_INIT macro, as they both
need some special stuff to be done before registering a pci driver.

Signed-off-by: David Marchand 
Signed-off-by: Shreyansh Jain 
---
 drivers/crypto/qat/rte_qat_cryptodev.c  | 16 +++
 drivers/net/bnx2x/bnx2x_ethdev.c| 35 +---
 drivers/net/cxgbe/cxgbe_ethdev.c| 24 +++--
 drivers/net/e1000/em_ethdev.c   | 16 +++
 drivers/net/e1000/igb_ethdev.c  | 40 +---
 drivers/net/ena/ena_ethdev.c| 18 +++--
 drivers/net/enic/enic_ethdev.c  | 23 +++-
 drivers/net/fm10k/fm10k_ethdev.c| 23 +++-
 drivers/net/i40e/i40e_ethdev.c  | 26 +++---
 drivers/net/i40e/i40e_ethdev_vf.c   | 25 +++---
 drivers/net/ixgbe/ixgbe_ethdev.c| 47 +
 drivers/net/mlx4/mlx4.c | 20 +++---
 drivers/net/mlx5/mlx5.c | 19 +++--
 drivers/net/nfp/nfp_net.c   | 21 +++
 drivers/net/qede/qede_ethdev.c  | 40 ++--
 drivers/net/szedata2/rte_eth_szedata2.c | 25 +++---
 drivers/net/virtio/virtio_ethdev.c  | 26 +-
 drivers/net/vmxnet3/vmxnet3_ethdev.c| 23 +++-
 18 files changed, 76 insertions(+), 391 deletions(-)

diff --git a/drivers/crypto/qat/rte_qat_cryptodev.c 
b/drivers/crypto/qat/rte_qat_cryptodev.c
index 08496ab..43bccdc 100644
--- a/drivers/crypto/qat/rte_qat_cryptodev.c
+++ b/drivers/crypto/qat/rte_qat_cryptodev.c
@@ -120,21 +120,11 @@ static struct rte_cryptodev_driver rte_qat_pmd = {
.name = "rte_qat_pmd",
.id_table = pci_id_qat_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
+   .devinit = rte_cryptodev_pci_probe,
+   .devuninit = rte_cryptodev_pci_remove,
},
.cryptodev_init = crypto_qat_dev_init,
.dev_private_size = sizeof(struct qat_pmd_private),
 };

-static int
-rte_qat_pmd_init(const char *name __rte_unused, const char *params 
__rte_unused)
-{
-   PMD_INIT_FUNC_TRACE();
-   return rte_cryptodev_pmd_driver_register(&rte_qat_pmd, PMD_PDEV);
-}
-
-static struct rte_driver pmd_qat_drv = {
-   .type = PMD_PDEV,
-   .init = rte_qat_pmd_init,
-};
-
-PMD_REGISTER_DRIVER(pmd_qat_drv);
+RTE_EAL_PCI_REGISTER(rte_qat_pmd);
diff --git a/drivers/net/bnx2x/bnx2x_ethdev.c b/drivers/net/bnx2x/bnx2x_ethdev.c
index 071b44f..5ab3c75 100644
--- a/drivers/net/bnx2x/bnx2x_ethdev.c
+++ b/drivers/net/bnx2x/bnx2x_ethdev.c
@@ -506,11 +506,15 @@ static struct eth_driver rte_bnx2x_pmd = {
.name = "rte_bnx2x_pmd",
.id_table = pci_id_bnx2x_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+   .devinit = rte_eth_dev_pci_probe,
+   .devuninit = rte_eth_dev_pci_remove,
},
.eth_dev_init = eth_bnx2x_dev_init,
.dev_private_size = sizeof(struct bnx2x_softc),
 };

+RTE_EAL_PCI_REGISTER(rte_bnx2x_pmd);
+
 /*
  * virtual function driver struct
  */
@@ -519,36 +523,11 @@ static struct eth_driver rte_bnx2xvf_pmd = {
.name = "rte_bnx2xvf_pmd",
.id_table = pci_id_bnx2xvf_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
+   .devinit = rte_eth_dev_pci_probe,
+   .devuninit = rte_eth_dev_pci_remove,
},
.eth_dev_init = eth_bnx2xvf_dev_init,
.dev_private_size = sizeof(struct bnx2x_softc),
 };

-static int rte_bnx2x_pmd_init(const char *name __rte_unused, const char 
*params __rte_unused)
-{
-   PMD_INIT_FUNC_TRACE();
-   rte_eth_driver_register(&rte_bnx2x_pmd);
-
-   return 0;
-}
-
-static int rte_bnx2xvf_pmd_init(const char *name __rte_unused, const char 
*params __rte_unused)
-{
-   PMD_INIT_FUNC_TRACE();
-   rte_eth_driver_register(&rte_bnx2xvf_pmd);
-
-   return 0;
-}
-
-static struct rte_driver rte_bnx2x_driver = {
-   .type = PMD_PDEV,
-   .init = rte_bnx2x_pmd_init,
-};
-
-static struct rte_driver rte_bnx2xvf_driver = {
-   .type = PMD_PDEV,
-   .init = rte_bnx2xvf_pmd_init,
-};
-
-PMD_REGISTER_DRIVER(rte_bnx2x_driver);
-PMD_REGISTER_DRIVER(rte_bnx2xvf_driver);
+RTE_EAL_PCI_REGISTER(rte_bnx2xvf_pmd);
diff --git a/drivers/net/cxgbe/cxgbe_ethdev.c b/drivers/net/cxgbe/cxgbe_ethdev.c
index 04eddaf..1389371 100644
--- a/drivers/net/cxgbe/cxgbe_ethdev.c
+++ b/drivers/net/cxgbe/cxgbe_ethdev.c
@@ -869,29 +869,11 @@ static struct eth_driver rte_cxgbe_pmd = {
.name = "rte_cxgbe_pmd",
.id_table = cxgb4_pci_tbl,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+

[dpdk-dev] [PATCH v4 09/17] crypto: get rid of crypto driver register callback

2016-06-21 Thread Shreyansh Jain

Now that all pdev are pci drivers, we don't need to register crypto drivers
through a dedicated channel.

Signed-off-by: David Marchand 
Signed-off-by: Shreyansh Jain 
---
 lib/librte_cryptodev/rte_cryptodev.c   | 22 ---
 lib/librte_cryptodev/rte_cryptodev_pmd.h   | 30 --
 lib/librte_cryptodev/rte_cryptodev_version.map |  1 -
 3 files changed, 53 deletions(-)

diff --git a/lib/librte_cryptodev/rte_cryptodev.c 
b/lib/librte_cryptodev/rte_cryptodev.c
index 65a2e29..a7cb33a 100644
--- a/lib/librte_cryptodev/rte_cryptodev.c
+++ b/lib/librte_cryptodev/rte_cryptodev.c
@@ -444,28 +444,6 @@ rte_cryptodev_pci_remove(struct rte_pci_device *pci_dev)
return 0;
 }

-int
-rte_cryptodev_pmd_driver_register(struct rte_cryptodev_driver *cryptodrv,
-   enum pmd_type type)
-{
-   /* Call crypto device initialization directly if device is virtual */
-   if (type == PMD_VDEV)
-   return rte_cryptodev_pci_probe((struct rte_pci_driver 
*)cryptodrv,
-   NULL);
-
-   /*
-* Register PCI driver for physical device intialisation during
-* PCI probing
-*/
-   cryptodrv->pci_drv.devinit = rte_cryptodev_pci_probe;
-   cryptodrv->pci_drv.devuninit = rte_cryptodev_pci_remove;
-
-   rte_eal_pci_register(&cryptodrv->pci_drv);
-
-   return 0;
-}
-
-
 uint16_t
 rte_cryptodev_queue_pair_count(uint8_t dev_id)
 {
diff --git a/lib/librte_cryptodev/rte_cryptodev_pmd.h 
b/lib/librte_cryptodev/rte_cryptodev_pmd.h
index 3fb7c7c..99fd69e 100644
--- a/lib/librte_cryptodev/rte_cryptodev_pmd.h
+++ b/lib/librte_cryptodev/rte_cryptodev_pmd.h
@@ -491,36 +491,6 @@ rte_cryptodev_pmd_virtual_dev_init(const char *name, 
size_t dev_private_size,
 extern int
 rte_cryptodev_pmd_release_device(struct rte_cryptodev *cryptodev);

-
-/**
- * Register a Crypto [Poll Mode] driver.
- *
- * Function invoked by the initialization function of a Crypto driver
- * to simultaneously register itself as Crypto Poll Mode Driver and to either:
- *
- * a - register itself as PCI driver if the crypto device is a physical
- * device, by invoking the rte_eal_pci_register() function to
- * register the *pci_drv* structure embedded in the *crypto_drv*
- * structure, after having stored the address of the
- * rte_cryptodev_init() function in the *devinit* field of the
- * *pci_drv* structure.
- *
- * During the PCI probing phase, the rte_cryptodev_init()
- * function is invoked for each PCI [device] matching the
- * embedded PCI identifiers provided by the driver.
- *
- * b, complete the initialization sequence if the device is a virtual
- * device by calling the rte_cryptodev_init() directly passing a
- * NULL parameter for the rte_pci_device structure.
- *
- *   @param crypto_drv crypto_driver structure associated with the crypto
- * driver.
- *   @param type   pmd type
- */
-extern int
-rte_cryptodev_pmd_driver_register(struct rte_cryptodev_driver *crypto_drv,
-   enum pmd_type type);
-
 /**
  * Executes all the user application registered callbacks for the specific
  * device.
diff --git a/lib/librte_cryptodev/rte_cryptodev_version.map 
b/lib/librte_cryptodev/rte_cryptodev_version.map
index 8d0edfb..e0a9620 100644
--- a/lib/librte_cryptodev/rte_cryptodev_version.map
+++ b/lib/librte_cryptodev/rte_cryptodev_version.map
@@ -14,7 +14,6 @@ DPDK_16.04 {
rte_cryptodev_info_get;
rte_cryptodev_pmd_allocate;
rte_cryptodev_pmd_callback_process;
-   rte_cryptodev_pmd_driver_register;
rte_cryptodev_pmd_release_device;
rte_cryptodev_pmd_virtual_dev_init;
rte_cryptodev_sym_session_create;
-- 
2.7.4
