[dpdk-dev] [PATCH v3] af_packet: make the device detachable

2016-02-29 Thread Wojciech Żmuda
Hi Bernard,

> Does making rte_pmd_af_packet_devinit local result in an ABI breakage?
If someone uses it in their app, they'll be forced to change it.
However, as this function was never intended to be public and there is an
API to create devices that ultimately calls rte_pmd_af_packet_devinit(),
I'm not sure any special caution is needed here.

> Should the DPDK_2.0 structure be kept and a DPDK_2.3 structure added?
Should it be just `DPDK_2.3 { local: *} DPDK_2.0`? Doesn't inheritance
of DPDK_2.0 make the symbol also global in 2.3?

> A deprecation notice may need to be added to the 
> doc/guides/rel_notes/deprecation.rst  file.
As far as I understand, deprecation.rst is used to announce that something
will be removed in a future release. Changes already made should be
moved from deprecation.rst to the release's .rst file. At least, this
is what I see in commit logs. If this change should be announced in
deprecation.rst, does this mean there should be another patch in the
future (after 2.3 release?) making this function static? And that
future patch will add DPDK_2.3 structure in the map file?

Thank you for your time,
Wojtek


[dpdk-dev] [PATCH v4 4/4] virtio: return 1 to tell the upper layer we don't take over this device

2016-02-29 Thread Santosh Shukla
On Fri, Feb 26, 2016 at 7:23 AM, Huawei Xie  wrote:
> v4 changes:
>  Rebase as io port map is moved to eal.
>  Only fall back to PORT IO when there isn't any kernel driver (including

Please mention that the fallback behaviour applies to the x86 arch only.

However, this patch also fixes a problem on non-x86 archs. Example: a VM
has 8 virtio interfaces and only 2 of the 8 are attached. In the default
case, after the 2nd interface, the ioport code tries to program
interfaces 3..8, which fails and makes the dpdk application exit. The
patch fixes this problem for non-x86 archs; tested on an arm64 platform.

> VFIO/UIO) managing the device. Before v4, we fall back to PORT IO even if
> VFIO/UIO fails.
>  Reword the commit message.
>
> v3 changes:
>  Change log message to tell user that the virtio device is skipped
> due to it is managed by kernel driver, instead of asking user to
> unbind it from kernel driver.
>
> v2 changes:
>  Remove unnecessary assignment of NULL to dev->data->mac_addrs.
>  Adjust one comment's position.
>
> virtio PMD could use IO port to configure the virtio device without
> using UIO/VFIO driver in legacy mode.
>
> There are two issues with the previous implementation:
> 1) virtio PMD will take over the virtio device(s) blindly even if not
> intended for DPDK.
> 2) driver conflict between virtio PMD and virtio-net kernel driver.
>
> This patch checks if there is kernel driver other than UIO/VFIO managing
> the virtio device before using port IO.
>
> If legacy_virtio_resource_init fails and kernel driver other than
> VFIO/UIO is managing the device, return 1 to tell the upper layer we
> don't take over this device.
> For all other IO port mapping errors, return -1.
>
> Note that if VFIO/UIO fails, now we don't fall back to port IO.
>
> Fixes: da978dfdc43b ("virtio: use port IO to get PCI resource")
>
> Signed-off-by: Huawei Xie 
> ---
>  drivers/net/virtio/virtio_ethdev.c |  9 +++--
>  drivers/net/virtio/virtio_pci.c| 15 ++-
>  2 files changed, 21 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/virtio/virtio_ethdev.c 
> b/drivers/net/virtio/virtio_ethdev.c
> index caa970c..8601080 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -1,4 +1,5 @@
>  /*-
> +
>   *   BSD LICENSE
>   *
>   *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
> @@ -1015,6 +1016,7 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
> struct virtio_net_config *config;
> struct virtio_net_config local_config;
> struct rte_pci_device *pci_dev;
> +   int ret;
>
> RTE_BUILD_BUG_ON(RTE_PKTMBUF_HEADROOM < sizeof(struct 
> virtio_net_hdr));
>
> @@ -1037,8 +1039,11 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
>
> pci_dev = eth_dev->pci_dev;
>
> -   if (vtpci_init(pci_dev, hw) < 0)
> -   return -1;
> +   ret = vtpci_init(pci_dev, hw);
> +   if (ret) {
> +   rte_free(eth_dev->data->mac_addrs);
> +   return ret;
> +   }
>
> /* Reset the device although not necessary at startup */
> vtpci_reset(hw);
> diff --git a/drivers/net/virtio/virtio_pci.c b/drivers/net/virtio/virtio_pci.c
> index 85fbe88..f159b2a 100644
> --- a/drivers/net/virtio/virtio_pci.c
> +++ b/drivers/net/virtio/virtio_pci.c
> @@ -622,6 +622,13 @@ next:
> return 0;
>  }
>
> +/*
> + * Return -1:
> + *   if there is error mapping with VFIO/UIO.
> + *   if port map error when driver type is KDRV_NONE.
> + * Return 1 if kernel driver is managing the device.
> + * Return 0 on success.
> + */
>  int
>  vtpci_init(struct rte_pci_device *dev, struct virtio_hw *hw)
>  {
> @@ -641,8 +648,14 @@ vtpci_init(struct rte_pci_device *dev, struct virtio_hw 
> *hw)
> }
>
> PMD_INIT_LOG(INFO, "trying with legacy virtio pci.");
> -   if (legacy_virtio_resource_init(dev, hw) < 0)
> +   if (legacy_virtio_resource_init(dev, hw) < 0) {
> +   if (dev->kdrv == RTE_KDRV_UNKNOWN) {
> +   PMD_INIT_LOG(INFO,
> +   "skip kernel managed virtio device.");
> +   return 1;
> +   }
> return -1;
> +   }
>
> hw->vtpci_ops = &legacy_ops;
> hw->use_msix = legacy_virtio_has_msix(&dev->addr);

Tested-by: Santosh Shukla 
Acked-by: Santosh Shukla 

> --
> 1.8.1.4
>
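
For readers following the return convention in the commit message above (0 on success, 1 when a kernel driver owns the device, -1 on a real error), here is a minimal sketch of how a caller can treat a positive return as "skip this device" rather than as a failure. All fake_* names are hypothetical stand-ins for illustration; this is not the virtio PMD or EAL code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real rte_pci_device / rte_kernel_driver. */
enum fake_kdrv { FAKE_KDRV_NONE, FAKE_KDRV_UIO, FAKE_KDRV_VFIO, FAKE_KDRV_UNKNOWN };

struct fake_dev {
	enum fake_kdrv kdrv;
	int map_ok;	/* 1 if resource mapping would succeed */
};

/* 0: device claimed, 1: skipped (kernel driver owns it), -1: hard error. */
static int fake_init(const struct fake_dev *dev)
{
	assert(dev != NULL);
	if (dev->map_ok)
		return 0;
	if (dev->kdrv == FAKE_KDRV_UNKNOWN)
		return 1;
	return -1;
}

/* A positive return skips the device instead of aborting the whole scan. */
static int fake_probe_all(const struct fake_dev *devs, size_t n, size_t *claimed)
{
	size_t i;

	*claimed = 0;
	for (i = 0; i < n; i++) {
		int ret = fake_init(&devs[i]);

		if (ret < 0)
			return -1;	/* real error: stop */
		if (ret > 0)
			continue;	/* not ours: move on */
		(*claimed)++;
	}
	return 0;
}
```

With this shape, a scan over devices where some are left to a kernel driver still succeeds and simply claims fewer devices.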


[dpdk-dev] [PATCH v2] virtio: Use cpuflag for vector api

2016-02-29 Thread Santosh Shukla
Check cpuflag macro before using vectored api.
- virtio_recv_pkts_vec() uses SSSE3 SIMD instructions for now, so add a
  cpuflag check.
- Also wrap the other vectored friend APIs, i.e.:
1) virtqueue_enqueue_recv_refill_simple
2) virtio_rxq_vec_setup

- removed VIRTIO_PMD=n from armv7/v8 config.

todo:
1) Move virtio_recv_pkts_vec() implementation to
   drivers/virtio/virtio_vec_.h file.
2) Remove use_simple_rxtx flag, so that virtio/virtio_vec_.h
   files to provide vectored/non-vectored rx/tx apis.

Signed-off-by: Santosh Shukla 
---
- v2: Removed VIRTIO_PMD=n from arm v7/v8

- v1: This is a rework of patch [1].
Note: This patch will let non-x86 arch to use virtio pmd.

[1] http://dpdk.org/dev/patchwork/patch/10429/
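
As a rough illustration of the guard this patch applies, the sketch below selects a receive function at compile time depending on RTE_MACHINE_CPUFLAG_SSSE3. The fake_* names and the function-pointer type are assumptions made for the example; the bodies are placeholders, and a real vector path would use SIMD intrinsics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical burst function type, for illustration only. */
typedef uint16_t (*fake_rx_burst_t)(void *rxq, void **pkts, uint16_t n);

/* Scalar path: always built, on every architecture. */
static uint16_t fake_recv_scalar(void *rxq, void **pkts, uint16_t n)
{
	(void)rxq; (void)pkts; (void)n;
	return 0;	/* placeholder body */
}

#ifdef RTE_MACHINE_CPUFLAG_SSSE3
/* Vector path: only built when the target advertises SSSE3. */
static uint16_t fake_recv_vec(void *rxq, void **pkts, uint16_t n)
{
	(void)rxq; (void)pkts; (void)n;
	return 0;	/* placeholder body */
}
#endif

/* Pick the vector path only if it was compiled in and was requested. */
static fake_rx_burst_t fake_select_rx(int want_simple)
{
	fake_rx_burst_t rx = fake_recv_scalar;

#ifdef RTE_MACHINE_CPUFLAG_SSSE3
	if (want_simple)
		rx = fake_recv_vec;
#else
	(void)want_simple;
#endif
	assert(rx != NULL);
	return rx;
}
```

On a target without the flag, the vector symbol is never referenced, so the common code still compiles and links.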


 config/defconfig_arm-armv7a-linuxapp-gcc   |1 -
 config/defconfig_arm64-armv8a-linuxapp-gcc |1 -
 drivers/net/virtio/virtio_rxtx.c   |   16 +++-
 drivers/net/virtio/virtio_rxtx.h   |2 ++
 drivers/net/virtio/virtio_rxtx_simple.c|   11 ++-
 5 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc 
b/config/defconfig_arm-armv7a-linuxapp-gcc
index cbebd64..4bfdfad 100644
--- a/config/defconfig_arm-armv7a-linuxapp-gcc
+++ b/config/defconfig_arm-armv7a-linuxapp-gcc
@@ -70,7 +70,6 @@ CONFIG_RTE_LIBRTE_I40E_PMD=n
 CONFIG_RTE_LIBRTE_IXGBE_PMD=n
 CONFIG_RTE_LIBRTE_MLX4_PMD=n
 CONFIG_RTE_LIBRTE_MPIPE_PMD=n
-CONFIG_RTE_LIBRTE_VIRTIO_PMD=n
 CONFIG_RTE_LIBRTE_VMXNET3_PMD=n
 CONFIG_RTE_LIBRTE_PMD_XENVIRT=n
 CONFIG_RTE_LIBRTE_PMD_BNX2X=n
diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc 
b/config/defconfig_arm64-armv8a-linuxapp-gcc
index eacd01c..f6f5d18 100644
--- a/config/defconfig_arm64-armv8a-linuxapp-gcc
+++ b/config/defconfig_arm64-armv8a-linuxapp-gcc
@@ -44,7 +44,6 @@ CONFIG_RTE_TOOLCHAIN="gcc"
 CONFIG_RTE_TOOLCHAIN_GCC=y

 CONFIG_RTE_IXGBE_INC_VECTOR=n
-CONFIG_RTE_LIBRTE_VIRTIO_PMD=n
 CONFIG_RTE_LIBRTE_IVSHMEM=n
 CONFIG_RTE_LIBRTE_FM10K_PMD=n
 CONFIG_RTE_LIBRTE_I40E_PMD=n
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 41a1366..ec0b8de 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -67,7 +67,9 @@
 #define VIRTIO_SIMPLE_FLAGS ((uint32_t)ETH_TXQ_FLAGS_NOMULTSEGS | \
ETH_TXQ_FLAGS_NOOFFLOADS)

+#ifdef RTE_MACHINE_CPUFLAG_SSSE3
 static int use_simple_rxtx;
+#endif

 static void
 vq_ring_free_chain(struct virtqueue *vq, uint16_t desc_idx)
@@ -307,12 +309,13 @@ virtio_dev_vring_start(struct virtqueue *vq, int 
queue_type)
nbufs = 0;
error = ENOSPC;

+#ifdef RTE_MACHINE_CPUFLAG_SSSE3
if (use_simple_rxtx)
for (i = 0; i < vq->vq_nentries; i++) {
vq->vq_ring.avail->ring[i] = i;
vq->vq_ring.desc[i].flags = VRING_DESC_F_WRITE;
}
-
+#endif
memset(&vq->fake_mbuf, 0, sizeof(vq->fake_mbuf));
for (i = 0; i < RTE_PMD_VIRTIO_RX_MAX_BURST; i++)
vq->sw_ring[vq->vq_nentries + i] = &vq->fake_mbuf;
@@ -325,9 +328,11 @@ virtio_dev_vring_start(struct virtqueue *vq, int 
queue_type)
/**
* Enqueue allocated buffers*
***/
+#ifdef RTE_MACHINE_CPUFLAG_SSSE3
if (use_simple_rxtx)
error = 
virtqueue_enqueue_recv_refill_simple(vq, m);
else
+#endif
error = virtqueue_enqueue_recv_refill(vq, m);
if (error) {
rte_pktmbuf_free(m);
@@ -340,6 +345,7 @@ virtio_dev_vring_start(struct virtqueue *vq, int queue_type)

PMD_INIT_LOG(DEBUG, "Allocated %d bufs", nbufs);
} else if (queue_type == VTNET_TQ) {
+#ifdef RTE_MACHINE_CPUFLAG_SSSE3
if (use_simple_rxtx) {
int mid_idx  = vq->vq_nentries >> 1;
for (i = 0; i < mid_idx; i++) {
@@ -357,6 +363,7 @@ virtio_dev_vring_start(struct virtqueue *vq, int queue_type)
for (i = mid_idx; i < vq->vq_nentries; i++)
vq->vq_ring.avail->ring[i] = i;
}
+#endif
}
 }

@@ -423,7 +430,9 @@ virtio_dev_rx_queue_setup(struct rte_eth_dev *dev,

dev->data->rx_queues[queue_idx] = vq;

+#ifdef RTE_MACHINE_CPUFLAG_SSSE3
virtio_rxq_vec_setup(vq);
+#endif

return 0;
 }
@@ -449,7 +458,10 @@ virtio_dev_tx_queue_setup(struct rte_eth_dev *dev,
const struct rte_eth_txconf *tx_conf)
 {
uint8_t vtpci_queue_idx = 2 * queue_idx + VTNET_SQ_TQ_QUEUE_IDX;
+
+#ifdef RTE_MACHINE_CPUFLAG_SSSE3
struct virtio_hw *hw = dev->data->dev_private;
+#endif
struct virtqueue *vq;
uint16_t 

[dpdk-dev] [PATCH] Adding maintainers for Intel QAT PMD

2016-02-29 Thread Jain, Deepak K

On 05/02/16 16:36, Fiona Trahe wrote:
> Signed-off-by: Fiona Trahe 

Acked-by: John Griffin 

Acked-by: Deepak Kumar Jain 



[dpdk-dev] [PATCH] Adding maintainers for Intel QAT PMD

2016-02-29 Thread John Griffin
On 05/02/16 16:36, Fiona Trahe wrote:
> Signed-off-by: Fiona Trahe 

Acked-by: John Griffin 



[dpdk-dev] [PATCH 1/1] arm: set CONFIG_RTE_ARCH_STRICT_ALIGN=y for armv7 target

2016-02-29 Thread Jan Viktorin
On Mon, 29 Feb 2016 16:55:38 +0100
Jan Viktorin  wrote:

> On Mon, 29 Feb 2016 16:14:58 +0100
> Thomas Monjalon  wrote:
> 
> > 2015-12-09 16:16, Jan Viktorin:  
> > > This patch reduces number of warnings from 53 to 40. It removes the usual 
> > > false
> > > positives utilizing unaligned_uint*_t data types.
> > > 
> > > Signed-off-by: Jan Viktorin 
> > 
> > Applied, thanks
> > 
> > Jan, what is the problem with the other ARM alignment warnings?
> > Can they be fixed?  
> 
> This is the full list of warnings I can see on the current origin/master
> for ARMv7 (42 occurrences) including examples (+10 more). The origin of
> all of them is:
> 
>   cast increases required alignment of target type [-Wcast-align]
> 
> After skimming through the list, you can see that they are mostly casts
> to uint32_t * or something similar. I believe that all of them are OK.
> However, I don't know how to persuade GCC to not be angry...
> 
> Probably, we can add some explicit alignment of certain structures.
> 
[snip]
> 
> lib/librte_vhost/vhost_user/virtio-net-user.c
> 433   rarp = (struct ether_arp *)(eth_hdr + 1);
> 527   ifr = (struct ifreq *)ifc.ifc_buf;

Fixed recently in
http://dpdk.org/browse/dpdk/commit/?id=bb66588304632a7e4a043d2921d06709d40f9ed4

> 
> Regards
> Jan


[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module

2016-02-29 Thread Panu Matilainen
On 02/29/2016 05:27 PM, Thomas Monjalon wrote:
> 2016-02-29 17:19, Panu Matilainen:
>> On 02/29/2016 01:35 PM, Ferruh Yigit wrote:
>>> On 2/29/2016 11:06 AM, Thomas Monjalon wrote:
>>>> Hi,
>>>> I totally agree with Avi's comments.
>>>> This topic is really important for the future of DPDK.
>>>> So I think we must give some time to continue the discussion
>>>> and have netdev involved in the choices done.
>>>> As a consequence, these series should not be merged in the release 16.04.
>>>> Thanks for continuing the work.
>>>>
>>> Hi Thomas,
>>>
>>> It is great to have some discussion and feedbacks.
>>> But I doubt not merging in this release will help to have more discussion.
>>>
>>> It is better to have them in this release and let people experiment it,
>>> this gives more chance to better discussion.
>>>
>>> These features are replacement of KNI, and KNI is not intended to be
>>> removed in this release, so who are using KNI as solution can continue
>>> to use KNI and can test KCP/KDP, so that we can get more feedbacks.
>>
>> So make the work available from a separate git repo and make it easy for
>> people to experiment with it. Code doesn't have to be in a release for
>> the sake of experimenting, and removing code is much harder than not
>> adding it in the first place, witness KNI.
>
> Good idea.
> What about a -next tree to experiment on kernel interactions?

Here's another, related but more radical (and rather unbaked) idea:

Move all the kernel modules and their associated libraries (thinking of 
KNI here) to a separate repo with perhaps more relaxed rules, but OTOH 
require upstream kernel support for any features to be included in dpdk 
itself. Carrot-and-stick of sorts :)

- Panu -





[dpdk-dev] [PATCH v1] virtio: Use cpuflag for vector api

2016-02-29 Thread Santosh Shukla
On Mon, Feb 29, 2016 at 9:57 AM, Yuanhan Liu
 wrote:
> On Fri, Feb 26, 2016 at 02:21:02PM +0530, Santosh Shukla wrote:
>> Check cpuflag macro before using vectored api.
>> - virtio_recv_pkts_vec() uses SSSE3 SIMD instructions for now, so add a
>>   cpuflag check.
>> - Also wrap the other vectored friend APIs, i.e.:
>> 1) virtqueue_enqueue_recv_refill_simple
>> 2) virtio_rxq_vec_setup
>>
> ...
>> diff --git a/drivers/net/virtio/virtio_rxtx_simple.c 
>> b/drivers/net/virtio/virtio_rxtx_simple.c
>> index 3a1de9d..be51d7c 100644
>> --- a/drivers/net/virtio/virtio_rxtx_simple.c
>> +++ b/drivers/net/virtio/virtio_rxtx_simple.c
>
> Hmm, why not wrapping the whole file, instead of just few functions?
>

Better to refactor the code and make it arch specific. The current
implementation is temporary.

> Or maybe better, do a compile time check at the Makefile, something
> like:
>
> if has_CPUFLAG_xxx
> SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple.c
> endif
>
I tried this approach but ended up with a link error. If I try to fix the
link error below, I will end up writing similar code.
Linker error snippet:

/work/santosh/thunder/nfs/dpdk/arm64-thunderx-linuxapp-gcc/lib/librte_pmd_virtio.a(virtio_rxtx.o):
In function `virtio_dev_rxtx_start':
virtio_rxtx.c:(.text+0x168c): undefined reference to
`virtqueue_enqueue_recv_refill_simple'
/work/santosh/thunder/nfs/dpdk/arm64-thunderx-linuxapp-gcc/lib/librte_pmd_virtio.a(virtio_rxtx.o):
In function `virtio_dev_rx_queue_setup':
virtio_rxtx.c:(.text+0x2364): undefined reference to `virtio_rxq_vec_setup'
/work/santosh/thunder/nfs/dpdk/arm64-thunderx-linuxapp-gcc/lib/librte_pmd_virtio.a(virtio_rxtx.o):
In function `virtio_dev_tx_queue_setup':
virtio_rxtx.c:(.text+0x2460): undefined reference to `virtio_xmit_pkts_simple'
virtio_rxtx.c:(.text+0x2464): undefined reference to `virtio_recv_pkts_vec'
virtio_rxtx.c:(.text+0x2468): undefined reference to `virtio_xmit_pkts_simple'
virtio_rxtx.c:(.text+0x246c): undefined reference to `virtio_recv_pkts_vec'
collect2: error: ld returned 1 exit status
make[5]: *** [test] Error 1
make[4]: *** [test] Error 2
make[3]: *** [app] Error 2

>
> --yliu


[dpdk-dev] [PATCH v5 00/11] Add API to get packet type info

2016-02-29 Thread Adrien Mazarguil
On Mon, Feb 29, 2016 at 04:54:19PM +, Ananyev, Konstantin wrote:
> 
> 
> > -Original Message-
> > From: Tan, Jianfeng
> > Sent: Friday, February 26, 2016 7:34 AM
> > To: dev at dpdk.org
> > Cc: Zhang, Helin; Ananyev, Konstantin; nelio.laranjeiro at 6wind.com; 
> > adrien.mazarguil at 6wind.com; rahul.lakkireddy at chelsio.com;
> > Tan, Jianfeng
> > Subject: [PATCH v5 00/11] Add API to get packet type info
> > 
> > To achieve this, a new function pointer, dev_ptype_info_get, is added
> > into struct eth_dev_ops. For those devices who do not implement it, it
> > means it will not provide any ptype info.
> > 
> > v5:
> >   - Exclude l3fwd change from this series, as a separated one.
> >   - Fix malposition of mlx4 code in mlx5 commit introduced in v4.
> > 
> > v4:
> >   - Change how to use this API: to previously agreement reached in mail.
> > 
> > v3:
> >   - Change how to use this API: api to allocate mem for storing ptype
> > array; and caller to free the mem.
> >   - Change how to return back ptypes from PMDs: return a pointer to
> > corresponding static const array of supported ptypes, terminated
> > by RTE_PTYPE_UNKNOWN.
> >   - Fix l3fwd parse_packet_type() when EXACT_MATCH is enabled.
> >   - Fix l3fwd memory leak when calling the API.
> > 
> > v2:
> >   - Move ptype_mask filter function from each PMDs into ether layer.
> >   - Add ixgbe vPMD's ptype info.
> >   - Fix code style issues.
> > 
> > Signed-off-by: Jianfeng Tan 
> > 
> 
> Acked-by: Konstantin Ananyev 

Fine for me as well.

Acked-by: Adrien Mazarguil 

-- 
Adrien Mazarguil
6WIND


[dpdk-dev] [PATCH v4 0/2] cryptodev API changes

2016-02-29 Thread Trahe, Fiona


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Declan Doherty
> Sent: Monday, February 29, 2016 4:52 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v4 0/2] cryptodev API changes
> 
> This patch set separates the symmetric crypto operations from generic
> operations and then modifies the cryptodev burst API to accept bursts of
> rte_crypto_op rather than rte_mbufs.
> 
> V4:
> - Fixes for issues introduced in __rte_crypto_op_raw_bulk_alloc in V3
> patchset.
> - Typo fix in cached attribute on rte_crypto_op structure.
> 
> V3:
>  - Addresses V2 comments
>  - Rebased for head
> 
> 
> Declan Doherty (1):
>   cryptodev: change burst API to be crypto op oriented
> 
> Fiona Trahe (1):
>   cryptodev: API tidy and changes to support future extensions
> 
>  MAINTAINERS|   6 +-
>  app/test/test_cryptodev.c  | 894 
> +++--
>  app/test/test_cryptodev.h  |   9 +-
>  app/test/test_cryptodev_perf.c | 270 ---
>  config/common_bsdapp   |   8 -
>  config/common_linuxapp |   8 -
>  doc/api/doxy-api-index.md  |   1 -
>  drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c | 199 ++---
>  drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c |  18 +-
>  drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h |   6 +-
>  drivers/crypto/qat/qat_crypto.c| 150 ++--
>  drivers/crypto/qat/qat_crypto.h|  14 +-
>  drivers/crypto/qat/rte_qat_cryptodev.c |   8 +-
>  examples/l2fwd-crypto/main.c   | 300 ---
>  lib/Makefile   |   1 -
>  lib/librte_cryptodev/Makefile  |   1 +
>  lib/librte_cryptodev/rte_crypto.h  | 819 +++
>  lib/librte_cryptodev/rte_crypto_sym.h  | 642 +++
>  lib/librte_cryptodev/rte_cryptodev.c   | 115 ++-
>  lib/librte_cryptodev/rte_cryptodev.h   | 185 ++---
>  lib/librte_cryptodev/rte_cryptodev_pmd.h   |  32 +-
>  lib/librte_cryptodev/rte_cryptodev_version.map |   3 +-
>  lib/librte_mbuf/rte_mbuf.h |   6 -
>  lib/librte_mbuf_offload/Makefile   |  52 --
>  lib/librte_mbuf_offload/rte_mbuf_offload.c | 100 ---
>  lib/librte_mbuf_offload/rte_mbuf_offload.h | 310 ---
>  .../rte_mbuf_offload_version.map   |   7 -
>  27 files changed, 2143 insertions(+), 2021 deletions(-)
>  create mode 100644 lib/librte_cryptodev/rte_crypto_sym.h
>  delete mode 100644 lib/librte_mbuf_offload/Makefile
>  delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload.c
>  delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload.h
>  delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload_version.map
> 
> --
> 2.5.0
Series Acked-by: Fiona Trahe 




[dpdk-dev] [PATCH v7] mbuf: provide rte_pktmbuf_alloc_bulk API

2016-02-29 Thread Thomas Monjalon
2016-02-28 20:44, Huawei Xie:
> v7 changes:
>  rte_pktmbuf_alloc_bulk isn't exported as API, so shouldn't be listed in
> version map
> 
> v6 changes:
>  reflect the changes in release notes and library version map file
>  revise our duff's code style a bit to make it more readable
> 
> v5 changes:
>  add comment about duff's device and our variant implementation
> 
> v3 changes:
>  move while after case 0
>  add context about duff's device and why we use while loop in the commit
> message
> 
> v2 changes:
>  unroll the loop a bit to help the performance
> 
> rte_pktmbuf_alloc_bulk allocates a bulk of packet mbufs.
> 
> There is related thread about this bulk API.
> http://dpdk.org/dev/patchwork/patch/4718/
> Thanks to Konstantin's loop unrolling.
> 
> Attached the wiki page about duff's device. It explains the performance
> optimization through loop unwinding, and also the most dramatic use of
> case label fall-through.
> https://en.wikipedia.org/wiki/Duff%27s_device
> 
> In this implementation, while() loop is used because we could not assume
> count is strictly positive. Using while() loop saves one line of check.
> 
> Signed-off-by: Gerald Rogers 
> Signed-off-by: Huawei Xie 
> Acked-by: Konstantin Ananyev 
> Acked-by: Olivier Matz 

Applied, thanks
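
For readers unfamiliar with the variant of Duff's device mentioned in the changelog, here is a small standalone sketch: the loop is unrolled by 4, the switch jumps into the middle of the loop body to handle the remainder, and a while() is used so that count == 0 does nothing. The fake_* names are hypothetical; this only mirrors the structure described in the commit message, not the actual rte_mbuf.h code.

```c
#include <assert.h>
#include <stddef.h>

struct fake_mbuf { int refcnt; };

static void fake_reset(struct fake_mbuf *m)
{
	m->refcnt = 1;
}

/* Reset count mbufs, 4 per loop iteration, remainder handled by the switch. */
static void fake_reset_bulk(struct fake_mbuf **mbufs, unsigned int count)
{
	unsigned int idx = 0;

	assert(mbufs != NULL || count == 0);

	switch (count % 4) {
	case 0:
		while (idx != count) {
			fake_reset(mbufs[idx]);
			idx++;
			/* fall through */
	case 3:
			fake_reset(mbufs[idx]);
			idx++;
			/* fall through */
	case 2:
			fake_reset(mbufs[idx]);
			idx++;
			/* fall through */
	case 1:
			fake_reset(mbufs[idx]);
			idx++;
		}
	}
}
```

For count = 5 the switch enters at case 1 and handles one element, then the loop runs one full pass of four; for count = 0 the while condition is false on entry and nothing is touched.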



[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module

2016-02-29 Thread Panu Matilainen
On 02/29/2016 01:35 PM, Ferruh Yigit wrote:
> On 2/29/2016 11:06 AM, Thomas Monjalon wrote:
>> Hi,
>> I totally agree with Avi's comments.
>> This topic is really important for the future of DPDK.
>> So I think we must give some time to continue the discussion
>> and have netdev involved in the choices done.
>> As a consequence, these series should not be merged in the release 16.04.
>> Thanks for continuing the work.
>>
> Hi Thomas,
>
> It is great to have some discussion and feedbacks.
> But I doubt not merging in this release will help to have more discussion.
>
> It is better to have them in this release and let people experiment it,
> this gives more chance to better discussion.
>
> These features are replacement of KNI, and KNI is not intended to be
> removed in this release, so who are using KNI as solution can continue
> to use KNI and can test KCP/KDP, so that we can get more feedbacks.

So make the work available from a separate git repo and make it easy for 
people to experiment with it. Code doesn't have to be in a release for 
the sake of experimenting, and removing code is much harder than not 
adding it in the first place, witness KNI.

- Panu -



[dpdk-dev] VIRTIO interface with DPDK in Guest VM not receiving packets

2016-02-29 Thread Thomas Monjalon
May I kindly ask you to remove this footer from your emails?
Thanks

> =-=-=
> Notice: The information contained in this e-mail
> message and/or attachments to it may contain 
> confidential or privileged information. If you are 
> not the intended recipient, any dissemination, use, 
> review, distribution, printing or copying of the 
> information contained in this e-mail message 
> and/or attachments to it are strictly prohibited. If 
> you have received this communication in error, 
> please notify us by reply e-mail or telephone and 
> immediately and permanently delete the message 
> and any attachments. Thank you



[dpdk-dev] [PATCH v6 1/2] mbuf: provide rte_pktmbuf_alloc_bulk API

2016-02-29 Thread Thomas Monjalon
2016-02-29 12:51, Panu Matilainen:
> On 02/24/2016 03:23 PM, Ananyev, Konstantin wrote:
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Panu Matilainen
> >> On 02/23/2016 07:35 AM, Xie, Huawei wrote:
> >>> On 2/22/2016 10:52 PM, Xie, Huawei wrote:
> >>>> On 2/4/2016 1:24 AM, Olivier MATZ wrote:
> > On 01/27/2016 02:56 PM, Panu Matilainen wrote:
> >> Since rte_pktmbuf_alloc_bulk() is an inline function, it is not part of
> >> the library ABI and should not be listed in the version map.
> >>
> >> I assume it's inline for performance reasons, but then you lose the
> >> benefits of dynamic linking such as ability to fix bugs and/or improve
> >> it by just updating the library. Since the point of having a bulk API is
> >> to improve performance by reducing the number of calls required, does it
> >> really have to be inline? As in, have you actually measured the
> >> difference between inline and non-inline and decided it's worth all the
> >> downsides?
> > Agree with Panu. It would be interesting to compare the performance
> > between inline and non inline to decide whether inlining it or not.
> >>>> Will update after I have gathered more data. Inline could show an
> >>>> obvious performance difference in some cases.
> >>>
> >>> Panu and Oliver:
> >>> I write a simple benchmark. This benchmark run 10M rounds, in each round
> >>> 8 mbufs are allocated through bulk API, and then freed.
> >>> These are the CPU cycles measured(Intel(R) Xeon(R) CPU E5-2680 0 @
> >>> 2.70GHz, CPU isolated, timer interrupt disabled, rcu offloaded).
> >>> Btw, i have removed some exceptional data, the frequency of which is
> >>> like 1/10. Sometimes observed user usage suddenly disappeared, no clue
> >>> what happened.
> >>>
> >>> With 8 mbufs allocated, there is about 6% performance increase using 
> >>> inline.
> >> [...]
> >>>
> >>> With 16 mbufs allocated, we could still observe obvious performance
> >>> difference, though only 1%-2%
> >>>
> >> [...]
> >>>
> >>> With 32/64 mbufs allocated, the deviation of the data itself would hide
> >>> the performance difference.
> >>> So we prefer using inline for performance.
> >>
> >> At least I was more after real-world performance in a real-world
> >> use-case rather than CPU cycles in a microbenchmark, we know function
> >> calls have a cost but the benefits tend to outweigh the cons.
> >>
> >> Inline functions have their place and they're far less evil in project
> >> internal use, but in library public API they are BAD and should be ...
> >> well, not banned because there are exceptions to every rule, but highly
> >> discouraged.
> >
> > Why is that?
> 
> For all the reasons static linking is bad, and what's worse it forces 
> the static linking badness into dynamically linked builds.
> 
> If there's a bug (security or otherwise) in a library, a distro wants to 
> supply an updated package which fixes that bug and be done with it. But 
> if that bug is in an inlined code, supplying an update is not enough, 
> you also need to recompile everything using that code, and somehow 
> inform customers possibly using that code that they need to not only 
> update the library but to recompile their apps as well. That is 
> precisely the reason distros go to great lengths to avoid *any* 
> statically linked apps and libs in the distro, completely regardless of 
> the performance overhead.
> 
> In addition, inlined code complicates ABI compatibility issues because 
> some of the code is on the "wrong" side, and worse, it bypasses all the 
> other ABI compatibility safeguards like soname and symbol versioning.
> 
> Like said, inlined code is fine for internal consumption, but incredibly 
> bad for public interfaces. And of course, the more complicated a 
> function is, greater the potential of needing bugfixes.
> 
> Mind you, none of this is magically specific to this particular 
> function. Except in the sense that bulk operations offer a better way of 
> performance improvements than just inlining everything.
> 
> > As you can see right now we have all mbuf alloc/free routines as static 
> > inline.
> > And I think we would like to keep it like that.
> > So why that particular function should be different?
> 
> Because there's much less need to have it inlined since the function 
> call overhead is "amortized" by the fact it's doing bulk operations. "We 
> always did it that way" is not a very good reason :)
> 
> > After all that function is nothing more than a wrapper
> > around rte_mempool_get_bulk()  unrolled by 4 loop {rte_pktmbuf_reset()}
> > So unless mempool get/put API would change, I can hardly see there could be 
> > any ABI
> > breakages in future.
> > About 'real world' performance gain - it was a 'real world' performance 
> > problem,
> > that we tried to solve by introducing that function:
> > http://dpdk.org/ml/archives/dev/2015-May/017633.html
> >
> > And according to the user feedback, it does help:
> > 

[dpdk-dev] [PATCH v2 0/7] vhost rxtx refactor

2016-02-29 Thread Thomas Monjalon
Hi Yuanhan

2016-02-18 21:49, Yuanhan Liu:
> Here is a patchset for refactoring vhost rxtx code, mainly for
> improving readability.

This series needs to be rebased.

And maybe you could also check the series about numa_realloc.

Thanks


[dpdk-dev] [PATCH 0/3 v2] ixgbe fixes

2016-02-29 Thread Ananyev, Konstantin


> -Original Message-
> From: Iremonger, Bernard
> Sent: Friday, February 26, 2016 2:49 PM
> To: dev at dpdk.org
> Cc: Ananyev, Konstantin; Zhang, Helin; Iremonger, Bernard
> Subject: [PATCH 0/3 v2] ixgbe fixes
> 
> This patch set implements the following:
> Removes code which was duplicated in eth_ixgbevf_dev_init().
> Adds more information to the error message in ixgbe_check_mq_mode().
> Allows the MAC address of the VF to be set to zero.
> 
> Changes in v2:
> Do not overwrite the VF perm_add with zero.
> 
> Bernard Iremonger (3):
>   ixgbe: cleanup eth_ixgbevf_dev_uninit
>   ixgbe: add more information to the error message
>   ixgbe: fix setting of VF MAC address
> 
>  drivers/net/ixgbe/ixgbe_ethdev.c | 29 +
>  drivers/net/ixgbe/ixgbe_pf.c |  7 ---
>  2 files changed, 17 insertions(+), 19 deletions(-)
> 
> --

Acked-by: Konstantin Ananyev 

> 2.6.3



[dpdk-dev] [PATCH v3 2/4] kcp: add kernel control path kernel module

2016-02-29 Thread Stephen Hemminger
On Fri, 26 Feb 2016 14:10:39 +
Ferruh Yigit  wrote:

> +#define KCP_ERR(args...) printk(KERN_ERR "KCP: " args)
> +#define KCP_INFO(args...) printk(KERN_INFO "KCP: " args)
> +
> +#ifdef RTE_KCP_KO_DEBUG
> +#define KCP_DBG(args...) printk(KERN_DEBUG "KCP: " args)
> +#else
> +#define KCP_DBG(args...)
> +#endif

These macros will not make netdev developers happy.

Use the standard printk macros, and if you want a prefix, use pr_fmt:


#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
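
To make the mechanism concrete, here is a userspace imitation of the idiom. It is an illustration only and not kernel code: in a real module, pr_fmt is defined before the kernel headers are included and pr_err()/pr_info() apply it to their format string. FAKE_MODNAME stands in for KBUILD_MODNAME, and fake_pr_info and fake_report_link are hypothetical helpers; the variadic macro relies on the GNU ##__VA_ARGS__ extension.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for KBUILD_MODNAME, which the kernel build system provides. */
#define FAKE_MODNAME "kcp"

/* Same shape as the kernel idiom: prepend the module name to every format. */
#define pr_fmt(fmt) FAKE_MODNAME ": " fmt

/* Imitation of pr_info(): format into a buffer instead of the kernel log. */
#define fake_pr_info(buf, len, fmt, ...) \
	snprintf(buf, len, pr_fmt(fmt), ##__VA_ARGS__)

/* Callers write the message only; the prefix comes from pr_fmt. */
static int fake_report_link(char *buf, size_t len, int port, int up)
{
	assert(buf != NULL && len > 0);
	return fake_pr_info(buf, len, "port %d link %s\n",
			    port, up ? "up" : "down");
}
```

The point of the idiom is that the prefix is defined once per file, so the per-module KCP_ERR/KCP_INFO wrappers become unnecessary.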


[dpdk-dev] [PATCH] vhost: broadcast RARP pkt by injecting it to receiving mbuf array

2016-02-29 Thread Thomas Monjalon
2016-02-22 22:36, Yuanhan Liu:
> The wrong mac table leads all the packets to the VM to go to the "ovsbr0"
> in the end, which ends up with all packets being lost, until the guest
> sends an ARP request (or reply) to refresh the mac learning table.
> 
> Jianfeng then came up with a solution I have thought of firstly but NAKed
> by myself, concerning it has potential issues [0]. The solution is as title
> stated: broadcast the RARP packet by injecting it to the receiving mbuf
> arrays at rte_vhost_dequeue_burst(). The re-bring of that idea made me
> think it twice; it looked like a false concern to me then. And I had done
> a rough verification: it worked as expected.
> 
> [0]: http://dpdk.org/ml/archives/dev/2016-February/033527.html
> 
> Another note is that while preparing this version, I found that DPDK has
> some ARP related structures and macros defined. So, use them instead of
> the one from standard header files here.
> 
> Cc: Thibaut Collet 
> Suggested-by: Jianfeng Tan 
> Signed-off-by: Yuanhan Liu 

Applied, thanks



[dpdk-dev] [PATCH 1/1] arm: set CONFIG_RTE_ARCH_STRICT_ALIGN=y for armv7 target

2016-02-29 Thread Jan Viktorin
On Mon, 29 Feb 2016 16:14:58 +0100
Thomas Monjalon  wrote:

> 2015-12-09 16:16, Jan Viktorin:
> > This patch reduces number of warnings from 53 to 40. It removes the usual 
> > false
> > positives utilizing unaligned_uint*_t data types.
> > 
> > Signed-off-by: Jan Viktorin   
> 
> Applied, thanks
> 
> Jan, what is the problem with the other ARM alignment warnings?
> Can they be fixed?

This is the full list of warnings I can see on the current origin/master
for ARMv7 (42 occurrences) including examples (+10 more). The origin of
all of them is:

  cast increases required alignment of target type [-Wcast-align]

After skimming through the list, you can see that they are mostly casts
to uint32_t * or something similar. I believe that all of them are OK.
However, I don't know how to persuade GCC to not be angry...

Probably, we can add some explicit alignment of certain structures.

app/test/test_thash.c
116   rte_convert_rss_key((uint32_t *)&default_rss_key,
117 (uint32_t *)rss_key_be, RTE_DIM(default_rss_key));

build/include/test_thash.h
179 *((uint32_t *)targ->v6.src_addr + i) =
180   rte_be_to_cpu_32(*((const uint32_t *)orig->src_addr + i));
181 *((uint32_t *)targ->v6.dst_addr + i) =
182   rte_be_to_cpu_32(*((const uint32_t *)orig->dst_addr + i));
207 ret ^= rte_cpu_to_be_32(((const uint32_t *)rss_key)[j]) << i |
208   (uint32_t)((uint64_t)(rte_cpu_to_be_32(((const uint32_t *)rss_key)[j + 1])) >>
238 ret ^= ((const uint32_t *)rss_key)[j] << i |
239   (uint32_t)((uint64_t)(((const uint32_t *)rss_key)[j + 1]) >> (32 - i));

examples-sdk/usr/local/share/dpdk/arm-armv7a-linuxapp-gcc/include/rte_mbuf.h
1617   ((t)((char *)(m)->buf_addr + (m)->data_off + (o)))

examples/l3fwd-acl/main.c
1074   next = (struct rte_acl_rule *)(route_rules +
1079   next = (struct rte_acl_rule *)(acl_rules +
1115   *pacl_base = (struct rte_acl_rule *)acl_rules;
1117   *proute_base = (struct rte_acl_rule *)route_rules;

netmap_user.h
65 #define NETMAP_IF(b, o)  (struct netmap_if *)((char *)(b) + (o))
68   ((struct netmap_ring *)((char *)(nifp) +  \
72   ((struct netmap_ring *)((char *)(nifp) +  \

examples/vhost/main.c
121 #define MBUF_HEADROOM_UINT32(mbuf) (*(uint32_t *)((uint8_t *)(mbuf) \
945   return ((*(uint64_t *)ea ^ *(uint64_t *)eb) & MAC_ADDR_CMP) == 0;

lib/librte_acl/acl_gen.c
391 qtrp = (uint32_t *)node->transitions;

lib/librte_acl/acl_run.h
46   (*((const int32_t *)((prm)[(idx)].data + *(prm)[idx].data_index++)))

lib/librte_eal/linuxapp/eal/eal_interrupts.c
150   irq_set = (struct vfio_irq_set *) irq_set_buf;
156   fd_ptr = (int *) &irq_set->data;
196   irq_set = (struct vfio_irq_set *) irq_set_buf;
239   irq_set = (struct vfio_irq_set *) irq_set_buf;
245   fd_ptr = (int *) &irq_set->data;
267   irq_set = (struct vfio_irq_set *) irq_set_buf;
293   irq_set = (struct vfio_irq_set *) irq_set_buf;
304   fd_ptr = (int *) &irq_set->data;
330   irq_set = (struct vfio_irq_set *) irq_set_buf;

lib/librte_eal/linuxapp/eal/eal_pci_vfio_mp_sync.c
176   chdr = (struct cmsghdr *) chdr_buf;
209   chdr = (struct cmsghdr *) chdr_buf;

595   k = (struct rte_hash_key *) ((char *)keys +
615   k = (struct rte_hash_key *) ((char *)keys +
726   k = (struct rte_hash_key *) ((char *)keys +
749   k = (struct rte_hash_key *) ((char *)keys +
841   k = (struct rte_hash_key *) ((char *)keys +
864   k = (struct rte_hash_key *) ((char *)keys +
959   *key_slot = (const struct rte_hash_key *) ((const char *)keys +
1233   next_key = (struct rte_hash_key *) ((char *)h->key_store +

lib/librte_sched/rte_bitmap.h
262   bmp = (struct rte_bitmap *) mem;
264   bmp->array1 = (uint64_t *) &mem[array1_byte_offset];
266   bmp->array2 = (uint64_t *) &mem[array2_byte_offset];

lib/librte_sched/rte_sched.c
684   port->subport = (struct rte_sched_subport *)
687   port->pipe = (struct rte_sched_pipe *)
690   port->queue = (struct rte_sched_queue *)
693   port->queue_extra = (struct rte_sched_queue_extra *)
696   port->pipe_profiles = (struct rte_sched_pipe_profile *)
701   port->queue_array = (struct rte_mbuf **)

lib/librte_vhost/vhost_user/virtio-net-user.c
433   rarp = (struct ether_arp *)(eth_hdr + 1);
527   ifr = (struct ifreq *)ifc.ifc_buf;
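
As an illustration of the warning class listed above, the sketch below shows the flagged pattern in a comment and two common ways to avoid it: copying through memcpy, or giving the backing storage explicit alignment. The names `load32_memcpy`, `key_buf` and `key_word` are made up for this example and do not come from DPDK.

```c
#include <stdint.h>
#include <string.h>

/*
 * The pattern GCC complains about on ARMv7 (shown as a comment only):
 *
 *     uint32_t v = *(const uint32_t *)(buf + off);
 *
 * buf is a byte pointer, so the cast raises the required alignment from 1
 * to 4, even when the data happens to be aligned at run time.
 */

/* Option 1: copy the bytes; no pointer with stricter alignment is formed. */
static uint32_t
load32_memcpy(const uint8_t *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/*
 * Option 2: give the backing storage explicit alignment. The union makes
 * the byte array at least as aligned as uint32_t, so word access is valid.
 */
struct key_buf {
	union {
		uint8_t  bytes[40];
		uint32_t words[10];
	} u;
};

static uint32_t
key_word(const struct key_buf *k, unsigned int i)
{
	return k->u.words[i];
}
```

Which option fits depends on the call site: the memcpy form suits data at arbitrary offsets in a packet, while explicit alignment suits buffers the code allocates itself.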

Regards
Jan


[dpdk-dev] [PATCH v5 00/11] Add API to get packet type info

2016-02-29 Thread Ananyev, Konstantin


> -Original Message-
> From: Tan, Jianfeng
> Sent: Friday, February 26, 2016 7:34 AM
> To: dev at dpdk.org
> Cc: Zhang, Helin; Ananyev, Konstantin; nelio.laranjeiro at 6wind.com; 
> adrien.mazarguil at 6wind.com; rahul.lakkireddy at chelsio.com;
> Tan, Jianfeng
> Subject: [PATCH v5 00/11] Add API to get packet type info
> 
> To achieve this, a new function pointer, dev_ptype_info_get, is added
> into struct eth_dev_ops. For those devices who do not implement it, it
> means it will not provide any ptype info.
> 
> v5:
>   - Exclude l3fwd change from this series, as a separated one.
>   - Fix malposition of mlx4 code in mlx5 commit introduced in v4.
> 
> v4:
>   - Change how to use this API: follow the agreement previously reached by mail.
> 
> v3:
>   - Change how to use this API: api to allocate mem for storing ptype
> array; and caller to free the mem.
>   - Change how to return back ptypes from PMDs: return a pointer to
> corresponding static const array of supported ptypes, terminated
> by RTE_PTYPE_UNKNOWN.
>   - Fix l3fwd parse_packet_type() when EXACT_MATCH is enabled.
>   - Fix l3fwd memory leak when calling the API.
> 
> v2:
>   - Move ptype_mask filter function from each PMDs into ether layer.
>   - Add ixgbe vPMD's ptype info.
>   - Fix code style issues.
> 
> Signed-off-by: Jianfeng Tan 
> 

Acked-by: Konstantin Ananyev 

> --
> 2.1.4
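
The v3 note in the quoted cover letter describes the return convention: a PMD hands back a pointer to a static const array of supported ptypes, terminated by RTE_PTYPE_UNKNOWN, and the ether layer applies the ptype mask. A rough, self-contained sketch of that convention follows. The `TOY_` constants and both function names are placeholders for illustration, not the real DPDK definitions.

```c
#include <stdint.h>

#define TOY_PTYPE_UNKNOWN   0x00000000u  /* stand-in for RTE_PTYPE_UNKNOWN */
#define TOY_PTYPE_L2_ETHER  0x00000001u
#define TOY_PTYPE_L3_IPV4   0x00000010u
#define TOY_PTYPE_L4_TCP    0x00000100u
#define TOY_PTYPE_L2_MASK   0x0000000fu
#define TOY_PTYPE_L3_MASK   0x000000f0u

/* Driver side: a static table ending with the sentinel value. */
static const uint32_t *
toy_dev_ptype_info_get(void)
{
	static const uint32_t ptypes[] = {
		TOY_PTYPE_L2_ETHER,
		TOY_PTYPE_L3_IPV4,
		TOY_PTYPE_L4_TCP,
		TOY_PTYPE_UNKNOWN,
	};

	return ptypes;
}

/*
 * Generic layer: walk the driver's table up to the sentinel and copy the
 * entries selected by mask into out (at most num of them).
 * Returns the number of matching entries, which may exceed num.
 */
static int
toy_get_supported_ptypes(uint32_t mask, uint32_t *out, int num)
{
	const uint32_t *all = toy_dev_ptype_info_get();
	int i, n = 0;

	for (i = 0; all[i] != TOY_PTYPE_UNKNOWN; i++) {
		if ((all[i] & mask) == 0)
			continue;
		if (n < num)
			out[n] = all[i];
		n++;
	}
	return n;
}
```

Keeping the filter in the generic layer means each driver only has to publish its table, which matches the v2 note about moving the ptype_mask filter out of the PMDs.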



[dpdk-dev] [PATCH v4 2/2] cryptodev: change burst API to be crypto op oriented

2016-02-29 Thread Declan Doherty
This patch modifies the crypto burst enqueue/dequeue APIs to operate on bursts
of rte_crypto_op's rather than the current implementation, which operates on
rte_mbuf bursts. This simplifies the burst processing in the crypto PMDs and the
use of crypto operations in general.

The changes also continue the separation of the symmetric operation parameters
from the more general operation parameters; this will simplify the integration
of asymmetric crypto operations in the future.

As well as the changes to the crypto APIs this patch adds functions for managing
rte_crypto_op pools to the cryptodev API. It modifies the existing PMDs, unit
tests and sample application to work with the modified APIs and finally
removes the now unused rte_mbuf_offload library.
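
To make the shape of the change concrete, here is a toy model of an op-oriented burst interface. `toy_op`, `toy_qp`, `toy_enqueue_burst` and `toy_dequeue_burst` are invented for this sketch; they mimic only the calling pattern (arrays of operation pointers in, arrays of operation pointers out) and are not the cryptodev API itself.

```c
#include <stddef.h>
#include <stdint.h>

enum toy_status { TOY_OP_NOT_PROCESSED, TOY_OP_SUCCESS };

/* The operation, not the packet buffer, is the unit that gets queued. */
struct toy_op {
	enum toy_status status;
	void *pkt;                 /* buffer the operation refers to */
};

#define TOY_QP_DEPTH 8

struct toy_qp {
	struct toy_op *ring[TOY_QP_DEPTH];
	uint16_t head, tail;       /* free-running counters, wrapped by modulo */
};

/* Queue up to nb_ops operations; returns how many were accepted. */
static uint16_t
toy_enqueue_burst(struct toy_qp *qp, struct toy_op **ops, uint16_t nb_ops)
{
	uint16_t n = 0;

	while (n < nb_ops && (uint16_t)(qp->tail - qp->head) < TOY_QP_DEPTH) {
		qp->ring[qp->tail % TOY_QP_DEPTH] = ops[n++];
		qp->tail++;
	}
	return n;
}

/* Hand back up to nb_ops completed operations; returns how many. */
static uint16_t
toy_dequeue_burst(struct toy_qp *qp, struct toy_op **ops, uint16_t nb_ops)
{
	uint16_t n = 0;

	while (n < nb_ops && qp->head != qp->tail) {
		struct toy_op *op = qp->ring[qp->head % TOY_QP_DEPTH];

		op->status = TOY_OP_SUCCESS;  /* pretend the device did the work */
		ops[n++] = op;
		qp->head++;
	}
	return n;
}
```

In this model the caller checks the status on each returned operation instead of inspecting the packet buffer, which is the simplification the commit message refers to.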

Signed-off-by: Declan Doherty 
---
 MAINTAINERS|   6 +-
 app/test/test_cryptodev.c  | 804 +++--
 app/test/test_cryptodev.h  |   9 +-
 app/test/test_cryptodev_perf.c | 253 +++
 config/common_bsdapp   |   8 -
 config/common_linuxapp |   8 -
 doc/api/doxy-api-index.md  |   1 -
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c | 171 +++--
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c |  12 +-
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h |   2 +-
 drivers/crypto/qat/qat_crypto.c| 123 ++--
 drivers/crypto/qat/qat_crypto.h|  12 +-
 drivers/crypto/qat/rte_qat_cryptodev.c |   4 +-
 examples/l2fwd-crypto/main.c   | 283 
 lib/Makefile   |   1 -
 lib/librte_cryptodev/rte_crypto.h  | 364 +-
 lib/librte_cryptodev/rte_crypto_sym.h  | 379 +-
 lib/librte_cryptodev/rte_cryptodev.c   |  76 ++
 lib/librte_cryptodev/rte_cryptodev.h   | 109 ++-
 lib/librte_cryptodev/rte_cryptodev_version.map |   3 +-
 lib/librte_mbuf/rte_mbuf.h |   6 -
 lib/librte_mbuf_offload/Makefile   |  52 --
 lib/librte_mbuf_offload/rte_mbuf_offload.c | 100 ---
 lib/librte_mbuf_offload/rte_mbuf_offload.h | 310 
 .../rte_mbuf_offload_version.map   |   7 -
 25 files changed, 1575 insertions(+), 1528 deletions(-)
 delete mode 100644 lib/librte_mbuf_offload/Makefile
 delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload.c
 delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload.h
 delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 628bc05..ad6b45e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -222,16 +222,12 @@ F: lib/librte_mbuf/
 F: doc/guides/prog_guide/mbuf_lib.rst
 F: app/test/test_mbuf.c

-Packet buffer offload - EXPERIMENTAL
-M: Declan Doherty 
-F: lib/librte_mbuf_offload/
-
 Ethernet API
 M: Thomas Monjalon 
 F: lib/librte_ether/
 F: scripts/test-null.sh

-Crypto API - EXPERIMENTAL
+Crypto API
 M: Declan Doherty 
 F: lib/librte_cryptodev/
 F: app/test/test_cryptodev*
diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index 951b443..208fc14 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -35,7 +35,6 @@
 #include 
 #include 
 #include 
-#include 

 #include 
 #include 
@@ -48,7 +47,7 @@ static enum rte_cryptodev_type gbl_cryptodev_type;

 struct crypto_testsuite_params {
struct rte_mempool *mbuf_pool;
-   struct rte_mempool *mbuf_ol_pool;
+   struct rte_mempool *op_mpool;
struct rte_cryptodev_config conf;
struct rte_cryptodev_qp_conf qp_conf;

@@ -62,8 +61,7 @@ struct crypto_unittest_params {

struct rte_cryptodev_sym_session *sess;

-   struct rte_mbuf_offload *ol;
-   struct rte_crypto_sym_op *op;
+   struct rte_crypto_op *op;

struct rte_mbuf *obuf, *ibuf;

@@ -104,7 +102,7 @@ setup_test_string(struct rte_mempool *mpool,
return m;
 }

-#if HEX_DUMP
+#ifdef HEX_DUMP
 static void
 hexdump_mbuf_data(FILE *f, const char *title, struct rte_mbuf *m)
 {
@@ -112,27 +110,29 @@ hexdump_mbuf_data(FILE *f, const char *title, struct rte_mbuf *m)
 }
 #endif

-static struct rte_mbuf *
-process_crypto_request(uint8_t dev_id, struct rte_mbuf *ibuf)
+static struct rte_crypto_op *
+process_crypto_request(uint8_t dev_id, struct rte_crypto_op *op)
 {
-   struct rte_mbuf *obuf = NULL;
-#if HEX_DUMP
+#ifdef HEX_DUMP
hexdump_mbuf_data(stdout, "Enqueued Packet", ibuf);
 #endif

-   if (rte_cryptodev_enqueue_burst(dev_id, 0, &ibuf, 1) != 1) {
+   if (rte_cryptodev_enqueue_burst(dev_id, 0, &op, 1) != 1) {
printf("Error sending packet for encryption");
return NULL;
}
-   while (rte_cryptodev_dequeue_burst(dev_id, 0, &obuf, 1) == 0)
+
+   op = NULL;
+
+   while (rte_cryptodev_dequeue_burst(dev_id, 0, &op, 1) 

[dpdk-dev] [PATCH v4 1/2] cryptodev: API tidy and changes to support future extensions

2016-02-29 Thread Declan Doherty
From: Fiona Trahe 

This patch splits symmetric specific definitions and functions away from the
common crypto APIs to facilitate the future extension and expansion of the
cryptodev framework, in order to allow asymmetric crypto operations to be
introduced at a later date, as well as to clean up the logical structure of the
public includes. The patch also introduces the _sym prefix to symmetric
specific structures and functions to improve clarity in the API.

Signed-off-by: Fiona Trahe 
Signed-off-by: Declan Doherty 
---
 app/test/test_cryptodev.c  | 164 +++---
 app/test/test_cryptodev_perf.c |  79 +--
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c |  44 +-
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c |   6 +-
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h |   4 +-
 drivers/crypto/qat/qat_crypto.c|  51 +-
 drivers/crypto/qat/qat_crypto.h|  10 +-
 drivers/crypto/qat/rte_qat_cryptodev.c |   8 +-
 examples/l2fwd-crypto/main.c   |  33 +-
 lib/librte_cryptodev/Makefile  |   1 +
 lib/librte_cryptodev/rte_crypto.h  | 563 +--
 lib/librte_cryptodev/rte_crypto_sym.h  | 613 +
 lib/librte_cryptodev/rte_cryptodev.c   |  39 +-
 lib/librte_cryptodev/rte_cryptodev.h   |  80 ++-
 lib/librte_cryptodev/rte_cryptodev_pmd.h   |  32 +-
 lib/librte_mbuf_offload/rte_mbuf_offload.h |  22 +-
 16 files changed, 912 insertions(+), 837 deletions(-)
 create mode 100644 lib/librte_cryptodev/rte_crypto_sym.h

diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index 62f8fb0..951b443 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2015 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2015-2016 Intel Corporation. All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
  *   modification, are permitted provided that the following conditions
@@ -57,13 +57,13 @@ struct crypto_testsuite_params {
 };

 struct crypto_unittest_params {
-   struct rte_crypto_xform cipher_xform;
-   struct rte_crypto_xform auth_xform;
+   struct rte_crypto_sym_xform cipher_xform;
+   struct rte_crypto_sym_xform auth_xform;

-   struct rte_cryptodev_session *sess;
+   struct rte_cryptodev_sym_session *sess;

struct rte_mbuf_offload *ol;
-   struct rte_crypto_op *op;
+   struct rte_crypto_sym_op *op;

struct rte_mbuf *obuf, *ibuf;

@@ -78,7 +78,7 @@ test_AES_CBC_HMAC_SHA512_decrypt_create_session_params(
struct crypto_unittest_params *ut_params);

 static int
-test_AES_CBC_HMAC_SHA512_decrypt_perform(struct rte_cryptodev_session *sess,
+test_AES_CBC_HMAC_SHA512_decrypt_perform(struct rte_cryptodev_sym_session *sess,
struct crypto_unittest_params *ut_params,
struct crypto_testsuite_params *ts_param);

@@ -165,7 +165,8 @@ testsuite_setup(void)
ts_params->mbuf_ol_pool = rte_pktmbuf_offload_pool_create(
"MBUF_OFFLOAD_POOL",
NUM_MBUFS, MBUF_CACHE_SIZE,
-   DEFAULT_NUM_XFORMS * sizeof(struct rte_crypto_xform),
+   DEFAULT_NUM_XFORMS *
+   sizeof(struct rte_crypto_sym_xform),
rte_socket_id());
if (ts_params->mbuf_ol_pool == NULL) {
RTE_LOG(ERR, USER1, "Can't create CRYPTO_OP_POOL\n");
@@ -220,7 +221,7 @@ testsuite_setup(void)

ts_params->conf.nb_queue_pairs = info.max_nb_queue_pairs;
ts_params->conf.socket_id = SOCKET_ID_ANY;
-   ts_params->conf.session_mp.nb_objs = info.max_nb_sessions;
+   ts_params->conf.session_mp.nb_objs = info.sym.max_nb_sessions;

TEST_ASSERT_SUCCESS(rte_cryptodev_configure(dev_id,
&ts_params->conf),
@@ -275,7 +276,7 @@ ut_setup(void)
ts_params->conf.nb_queue_pairs = DEFAULT_NUM_QPS_PER_QAT_DEVICE;
ts_params->conf.socket_id = SOCKET_ID_ANY;
ts_params->conf.session_mp.nb_objs =
-   (gbl_cryptodev_type == RTE_CRYPTODEV_QAT_PMD) ?
+   (gbl_cryptodev_type == RTE_CRYPTODEV_QAT_SYM_PMD) ?
DEFAULT_NUM_OPS_INFLIGHT :
DEFAULT_NUM_OPS_INFLIGHT;

@@ -319,7 +320,7 @@ ut_teardown(void)

/* free crypto session structure */
if (ut_params->sess) {
-   rte_cryptodev_session_free(ts_params->valid_devs[0],
+   rte_cryptodev_sym_session_free(ts_params->valid_devs[0],
ut_params->sess);
ut_params->sess = NULL;
}
@@ -464,7 +465,7 @@ 

[dpdk-dev] [PATCH v4 0/2] cryptodev API changes

2016-02-29 Thread Declan Doherty
This patch set separates the symmetric crypto operations from generic operations
and then modifies the cryptodev burst API to accept bursts of rte_crypto_op
rather than rte_mbufs.

V4:
- Fixes for issues introduced in __rte_crypto_op_raw_bulk_alloc in V3 patchset.
- Typo fix in cached attribute on rte_crypto_op structure.

V3:
 - Addresses V2 comments
 - Rebased for head


Declan Doherty (1):
  cryptodev: change burst API to be crypto op oriented

Fiona Trahe (1):
  cryptodev: API tidy and changes to support future extensions

 MAINTAINERS|   6 +-
 app/test/test_cryptodev.c  | 894 +++--
 app/test/test_cryptodev.h  |   9 +-
 app/test/test_cryptodev_perf.c | 270 ---
 config/common_bsdapp   |   8 -
 config/common_linuxapp |   8 -
 doc/api/doxy-api-index.md  |   1 -
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c | 199 ++---
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c |  18 +-
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h |   6 +-
 drivers/crypto/qat/qat_crypto.c| 150 ++--
 drivers/crypto/qat/qat_crypto.h|  14 +-
 drivers/crypto/qat/rte_qat_cryptodev.c |   8 +-
 examples/l2fwd-crypto/main.c   | 300 ---
 lib/Makefile   |   1 -
 lib/librte_cryptodev/Makefile  |   1 +
 lib/librte_cryptodev/rte_crypto.h  | 819 +++
 lib/librte_cryptodev/rte_crypto_sym.h  | 642 +++
 lib/librte_cryptodev/rte_cryptodev.c   | 115 ++-
 lib/librte_cryptodev/rte_cryptodev.h   | 185 ++---
 lib/librte_cryptodev/rte_cryptodev_pmd.h   |  32 +-
 lib/librte_cryptodev/rte_cryptodev_version.map |   3 +-
 lib/librte_mbuf/rte_mbuf.h |   6 -
 lib/librte_mbuf_offload/Makefile   |  52 --
 lib/librte_mbuf_offload/rte_mbuf_offload.c | 100 ---
 lib/librte_mbuf_offload/rte_mbuf_offload.h | 310 ---
 .../rte_mbuf_offload_version.map   |   7 -
 27 files changed, 2143 insertions(+), 2021 deletions(-)
 create mode 100644 lib/librte_cryptodev/rte_crypto_sym.h
 delete mode 100644 lib/librte_mbuf_offload/Makefile
 delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload.c
 delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload.h
 delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload_version.map

-- 
2.5.0



[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module

2016-02-29 Thread Thomas Monjalon
2016-02-29 17:19, Panu Matilainen:
> On 02/29/2016 01:35 PM, Ferruh Yigit wrote:
> > On 2/29/2016 11:06 AM, Thomas Monjalon wrote:
> >> Hi,
> >> I totally agree with Avi's comments.
> >> This topic is really important for the future of DPDK.
> >> So I think we must give some time to continue the discussion
> >> and have netdev involved in the choices done.
> >> As a consequence, these series should not be merged in the release 16.04.
> >> Thanks for continuing the work.
> >>
> > Hi Thomas,
> >
> > It is great to have some discussion and feedback.
> > But I doubt that not merging in this release will help to have more discussion.
> >
> > It is better to have them in this release and let people experiment with them;
> > this gives a better chance of discussion.
> >
> > These features are a replacement for KNI, and KNI is not intended to be
> > removed in this release, so those who are using KNI as a solution can continue
> > to use KNI and can test KCP/KDP, so that we can get more feedback.
> 
> So make the work available from a separate git repo and make it easy for 
> people to experiment with it. Code doesn't have to be in a release for 
> the sake of experimenting, and removing code is much harder than not 
> adding it in the first place, witness KNI.

Good idea.
What about a -next tree to experiment on kernel interactions?


[dpdk-dev] [PATCH 1/1] arm: set CONFIG_RTE_ARCH_STRICT_ALIGN=y for armv7 target

2016-02-29 Thread Thomas Monjalon
2015-12-09 16:16, Jan Viktorin:
> This patch reduces the number of warnings from 53 to 40. It removes the usual
> false positives utilizing unaligned_uint*_t data types.
> 
> Signed-off-by: Jan Viktorin 

Applied, thanks

Jan, what is the problem with the other ARM alignment warnings?
Can they be fixed?


[dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for tcpdump

2016-02-29 Thread Pattan, Reshma
Hi,

> -Original Message-
> From: Pavel Fedin [mailto:p.fedin at samsung.com]
> Sent: Wednesday, February 24, 2016 3:05 PM
> To: Pattan, Reshma 
> Cc: dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for
> tcpdump
> 
>  Hello!
> 
> > >  2. What if i don't want separate RX and TX streams either? It only
> > > prevents me from seeing the complete picture.
> >
> > Do you mean not to have separate pcap files for tx and rx? If so, I
> > would prefer to keep this as it is.
> 
>  I mean - add an option not to have separate files.

OK, I will make changes in v3.

> 
> > >  3. vhostuser ports are missing. Perhaps not really related to this
> > > patchset, i just don't know how much code "server" part of vhostuser
> > > shares with normal PMDs, but anyway, ability to dump them too would be
> > > nice to have.
> > >
> >
> > I think this can be done in future i.e. when vhost as PMD is
> > available. But as of now vhost is library.
> 
>  I expected "server"-side vhost to be the same as "client" part (AKA virtio),
> just use another mechanism for exchanging control information (via socket).
> Is it not true? I suppose, driving queues from both sides should be quite
> symmetric.
> 

At this stage of the release, adding these changes is difficult as I don't have
knowledge of vhost.
But at the same time, if anyone from the community would like to make these
enhancements, they are welcome.

Thanks,
Reshma




[dpdk-dev] [PATCH] log: add missing symbol

2016-02-29 Thread Thomas Monjalon
2016-01-27 10:35, Thomas Monjalon:
> 2015-12-16 16:38, Stephen Hemminger:
> > > rte_get_log_type and rte_get_log_level functions have been available
> > for many versions. But they are missing from the shared library map
> > and therefore do not get exported correctly.
> > 
> > Signed-off-by: Stephen Hemminger 
> > ---
> >  lib/librte_eal/linuxapp/eal/rte_eal_version.map | 2 ++
> >  1 file changed, 2 insertions(+)
> 
> Why only in linuxapp?
> 
> > diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map 
> > b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> > index cbe175f..51a241c 100644
> > --- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> > +++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> > @@ -93,7 +93,9 @@ DPDK_2.0 {
> > rte_realloc;
> > rte_set_application_usage_hook;
> > rte_set_log_level;
> > +   rte_get_log_level;
> > rte_set_log_type;
> > +   rte_get_log_type;
> 
> We try to keep an alphabetical order :)

Reordered, updated in bsdapp/ and
Applied, thanks


[dpdk-dev] [PATCH v3 0/2] cryptodev API changes

2016-02-29 Thread Declan Doherty
On 26/02/16 17:30, Declan Doherty wrote:
> This patch set separates the symmetric crypto operations from generic 
> operations
> and then modifies the cryptodev burst API to accept bursts of rte_crypto_op
> rather than rte_mbufs.
>
> V3:
>   - Addresses V2 comments
>   - Rebased for head
>
> Declan Doherty (1):
>cryptodev: change burst API to be crypto op oriented
>
> Fiona Trahe (1):
>cryptodev: API tidy and changes to support future extensions
>
>   MAINTAINERS|   6 +-
>   app/test/test_cryptodev.c  | 894 +++--
>   app/test/test_cryptodev.h  |   9 +-
>   app/test/test_cryptodev_perf.c | 270 ---
>   config/common_bsdapp   |   8 -
>   config/common_linuxapp |   8 -
>   doc/api/doxy-api-index.md  |   1 -
>   drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c | 199 ++---
>   drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c |  18 +-
>   drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h |   6 +-
>   drivers/crypto/qat/qat_crypto.c| 150 ++--
>   drivers/crypto/qat/qat_crypto.h|  14 +-
>   drivers/crypto/qat/rte_qat_cryptodev.c |   8 +-
>   examples/l2fwd-crypto/main.c   | 300 ---
>   lib/Makefile   |   1 -
>   lib/librte_cryptodev/Makefile  |   1 +
>   lib/librte_cryptodev/rte_crypto.h  | 822 ---
>   lib/librte_cryptodev/rte_crypto_sym.h  | 642 +++
>   lib/librte_cryptodev/rte_cryptodev.c   | 115 ++-
>   lib/librte_cryptodev/rte_cryptodev.h   | 185 ++---
>   lib/librte_cryptodev/rte_cryptodev_pmd.h   |  32 +-
>   lib/librte_cryptodev/rte_cryptodev_version.map |   3 +-
>   lib/librte_mbuf/rte_mbuf.h |   6 -
>   lib/librte_mbuf_offload/Makefile   |  52 --
>   lib/librte_mbuf_offload/rte_mbuf_offload.c | 100 ---
>   lib/librte_mbuf_offload/rte_mbuf_offload.h | 310 ---
>   .../rte_mbuf_offload_version.map   |   7 -
>   27 files changed, 2146 insertions(+), 2021 deletions(-)
>   create mode 100644 lib/librte_cryptodev/rte_crypto_sym.h
>   delete mode 100644 lib/librte_mbuf_offload/Makefile
>   delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload.c
>   delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload.h
>   delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload_version.map
>
self NAK.

There is an issue with mis-merged code in __rte_crypto_op_raw_bulk_alloc 
function in rte_crypto.h


[dpdk-dev] [dpdk-dev,v2] Clean up rte_memcpy.h file

2016-02-29 Thread Wang, Zhihong


> -Original Message-
> From: Ravi Kerur [mailto:rkerur at gmail.com]
> Sent: Saturday, February 27, 2016 10:06 PM
> To: Wang, Zhihong 
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev,v2] Clean up rte_memcpy.h file
> 
> 
> 
> On Wed, Jan 27, 2016 at 8:18 PM, Zhihong Wang 
> wrote:
> > Remove unnecessary type casting in functions.
> >
> > Tested on Ubuntu (14.04 x86_64) with "make test".
> > "make test" results match the results with baseline.
> > "Memcpy perf" results match the results with baseline.
> >
> > Signed-off-by: Ravi Kerur 
> > Acked-by: Stephen Hemminger 
> >
> > ---
> > .../common/include/arch/x86/rte_memcpy.h           | 340 +++------
> >  1 file changed, 175 insertions(+), 165 deletions(-)
> >
> > diff --git a/lib/librte_eal/common/include/arch/x86/rte_memcpy.h
> b/lib/librte_eal/common/include/arch/x86/rte_memcpy.h
> > index 6a57426..839d4ec 100644
> > --- a/lib/librte_eal/common/include/arch/x86/rte_memcpy.h
> > +++ b/lib/librte_eal/common/include/arch/x86/rte_memcpy.h
> 
> [...]
> 
> >? /**
> > @@ -150,13 +150,16 @@ rte_mov64blocks(uint8_t *dst, const uint8_t *src,
> size_t n)
> >? ? ? ?__m256i ymm0, ymm1;
> >
> >? ? ? ?while (n >= 64) {
> > -? ? ? ? ? ? ?ymm0 = _mm256_loadu_si256((const __m256i *)((const uint8_t
> *)src + 0 * 32));
> > +
> > +? ? ? ? ? ? ?ymm0 = _mm256_loadu_si256((const __m256i *)(src + 0 * 32));
> > +? ? ? ? ? ? ?ymm1 = _mm256_loadu_si256((const __m256i *)(src + 1 * 32));
> > +
> > +? ? ? ? ? ? ?_mm256_storeu_si256((__m256i *)(dst + 0 * 32), ymm0);
> > +? ? ? ? ? ? ?_mm256_storeu_si256((__m256i *)(dst + 1 * 32), ymm1);
> > +
> 
> Any particular reason to change the order of the statements here? :)
> Overall this patch looks good.
> 
> I checked the code changes, initial code had moving addresses (src and dst)
> and decrement counter scattered between store and load instructions. I changed
> it to loads, followed by stores and handle address/counters increment/decrement
> without changing functionality.
> 

It's definitely okay to do this. Actually, changing it or not won't affect
the final output at all, since gcc will optimize it while generating code.
It's C code we're writing after all.

But personally I prefer to keep the original order, just as a reminder
that what's needed later should be calculated ASAP, and that
different kinds (CPU port) of instructions should be mixed together. :)

Could you please rebase this patch, since there have been some changes
already?

> >               n -= 64;
> > -             ymm1 = _mm256_loadu_si256((const __m256i *)((const uint8_t *)src + 1 * 32));
> > -             src = (const uint8_t *)src + 64;
> > -             _mm256_storeu_si256((__m256i *)((uint8_t *)dst + 0 * 32), ymm0);
> > -             _mm256_storeu_si256((__m256i *)((uint8_t *)dst + 1 * 32), ymm1);
> > -             dst = (uint8_t *)dst + 64;
> > +             src = src + 64;
> > +             dst = dst + 64;
> >       }
> >  }
> >
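
The point about statement order can be checked with a portable stand-in. In the sketch below a 32-byte struct copied with memcpy plays the role of a ymm register; `blk32`, `mov64blocks_interleaved` and `mov64blocks_grouped` are hypothetical names, and no AVX intrinsics are used, so it says nothing about actual instruction scheduling. It only shows that both orderings copy the same bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct blk32 { uint8_t b[32]; };   /* stands in for one 256-bit register */

/* Original style: pointer and counter updates interleaved with the moves. */
static void
mov64blocks_interleaved(uint8_t *dst, const uint8_t *src, size_t n)
{
	struct blk32 r0, r1;

	while (n >= 64) {
		memcpy(&r0, src, 32);
		n -= 64;
		memcpy(&r1, src + 32, 32);
		src += 64;
		memcpy(dst, &r0, 32);
		memcpy(dst + 32, &r1, 32);
		dst += 64;
	}
}

/* Patched style: loads, then stores, then bookkeeping. */
static void
mov64blocks_grouped(uint8_t *dst, const uint8_t *src, size_t n)
{
	struct blk32 r0, r1;

	while (n >= 64) {
		memcpy(&r0, src, 32);
		memcpy(&r1, src + 32, 32);

		memcpy(dst, &r0, 32);
		memcpy(dst + 32, &r1, 32);

		n -= 64;
		src += 64;
		dst += 64;
	}
}
```

Both functions copy only whole 64-byte blocks and leave any remainder untouched, mirroring the loop condition in the quoted diff.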



[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module

2016-02-29 Thread Ferruh Yigit
On 2/29/2016 11:35 AM, Ferruh Yigit wrote:
> On 2/29/2016 11:06 AM, Thomas Monjalon wrote:
>> Hi,
>> I totally agree with Avi's comments.
>> This topic is really important for the future of DPDK.
>> So I think we must give some time to continue the discussion
>> and have netdev involved in the choices done.
>> As a consequence, these series should not be merged in the release 16.04.
>> Thanks for continuing the work.
>>
> Hi Thomas,
> 
> It is great to have some discussion and feedback.
> But I doubt that not merging in this release will help to have more discussion.
> 
> It is better to have them in this release and let people experiment with them;
> this gives a better chance of discussion.
> 
> These features are a replacement for KNI, and KNI is not intended to be
> removed in this release, so those who are using KNI as a solution can continue
> to use KNI and can test KCP/KDP, so that we can get more feedback.
> 
One more thing: the overall reason for working on KCP/KDP is to reduce the KNI
maintenance cost and add more features, not to add more maintenance cost.
Most of the maintenance cost of KNI comes from the Linux network drivers in
it, which KCP removes, so there is an improvement.

Although it is not as good as removing them completely, KCP/KDP is one
step closer to being upstreamed than the existing KNI.

Thanks,
ferruh



[dpdk-dev] [PATCH v2 2/2] modify action handlers in test_pipeline and ip_pipeline

2016-02-29 Thread Jasvinder Singh
Changes are made to the ports and table action handlers defined
in app/test_pipeline and ip_pipeline sample application.

Signed-off-by: Jasvinder Singh 
Acked-by: Cristian Dumitrescu 
---
 app/test-pipeline/pipeline_acl.c   |  3 +-
 app/test-pipeline/pipeline_hash.c  |  3 +-
 app/test-pipeline/pipeline_lpm.c   |  3 +-
 app/test-pipeline/pipeline_lpm_ipv6.c  |  3 +-
 app/test-pipeline/pipeline_stub.c  |  3 +-
 .../ip_pipeline/pipeline/pipeline_actions_common.h | 47 +-
 .../ip_pipeline/pipeline/pipeline_firewall_be.c|  3 +-
 .../pipeline/pipeline_flow_actions_be.c|  3 +-
 .../pipeline/pipeline_flow_classification_be.c |  3 +-
 .../ip_pipeline/pipeline/pipeline_passthrough_be.c |  3 +-
 .../ip_pipeline/pipeline/pipeline_routing_be.c |  3 +-
 11 files changed, 37 insertions(+), 40 deletions(-)

diff --git a/app/test-pipeline/pipeline_acl.c b/app/test-pipeline/pipeline_acl.c
index f163e55..22d5f36 100644
--- a/app/test-pipeline/pipeline_acl.c
+++ b/app/test-pipeline/pipeline_acl.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -159,7 +159,6 @@ app_main_loop_worker_pipeline_acl(void) {
.ops = _port_ring_writer_ops,
.arg_create = (void *) _ring_params,
.f_action = NULL,
-   .f_action_bulk = NULL,
.arg_ah = NULL,
};

diff --git a/app/test-pipeline/pipeline_hash.c b/app/test-pipeline/pipeline_hash.c
index 8b888d7..f8aac0d 100644
--- a/app/test-pipeline/pipeline_hash.c
+++ b/app/test-pipeline/pipeline_hash.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -140,7 +140,6 @@ app_main_loop_worker_pipeline_hash(void) {
.ops = _port_ring_writer_ops,
.arg_create = (void *) _ring_params,
.f_action = NULL,
-   .f_action_bulk = NULL,
.arg_ah = NULL,
};

diff --git a/app/test-pipeline/pipeline_lpm.c b/app/test-pipeline/pipeline_lpm.c
index 2d7bc01..916abd4 100644
--- a/app/test-pipeline/pipeline_lpm.c
+++ b/app/test-pipeline/pipeline_lpm.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -99,7 +99,6 @@ app_main_loop_worker_pipeline_lpm(void) {
.ops = _port_ring_writer_ops,
.arg_create = (void *) _ring_params,
.f_action = NULL,
-   .f_action_bulk = NULL,
.arg_ah = NULL,
};

diff --git a/app/test-pipeline/pipeline_lpm_ipv6.c b/app/test-pipeline/pipeline_lpm_ipv6.c
index c895b62..3352e89 100644
--- a/app/test-pipeline/pipeline_lpm_ipv6.c
+++ b/app/test-pipeline/pipeline_lpm_ipv6.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -100,7 +100,6 @@ app_main_loop_worker_pipeline_lpm_ipv6(void) {
.ops = _port_ring_writer_ops,
.arg_create = (void *) _ring_params,
.f_action = NULL,
-   .f_action_bulk = NULL,
.arg_ah = NULL,
};

diff --git a/app/test-pipeline/pipeline_stub.c b/app/test-pipeline/pipeline_stub.c
index 0ad6f9b..ba710ca 100644
--- a/app/test-pipeline/pipeline_stub.c
+++ b/app/test-pipeline/pipeline_stub.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -94,7 +94,6 @@ app_main_loop_worker_pipeline_stub(void) {
.ops = _port_ring_writer_ops,
.arg_create = (void *) _ring_params,
.f_action = NULL,
-   .f_action_bulk = NULL,
.arg_ah = NULL,
};

diff --git 

[dpdk-dev] [PATCH v2 1/2] librte_pipeline: add support for packet redirection at action handlers

2016-02-29 Thread Jasvinder Singh
Currently, there is no mechanism that allows the pipeline ports (in/out) and
table action handlers to override the default forwarding decision (as
previously configured per input port or in the table entry). Therefore, new
pipeline API functions have been added which allow action handlers to
hijack packets and remove them from the pipeline processing, and then either
drop them or send them out of the pipeline on any output port. The port
(in/out) and table action handler prototypes have been changed to make
use of these new API functions. This feature will be helpful for implementing
functions such as exception handling (e.g. TTL = 0), load balancing etc.
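
As a rough illustration of the mechanism described above, the toy model below tracks packets with a 64-bit mask and lets an action handler pull expired packets out of normal processing. Every name prefixed with `toy_` is invented for this sketch; it does not reproduce the librte_pipeline prototypes, only the idea that the handler receives the pipeline handle and can override the default decision.

```c
#include <stdint.h>

#define TOY_BURST_MAX 64

struct toy_pkt { uint8_t ttl; };

struct toy_pipeline {
	uint64_t pkts_mask;       /* packets still owned by the pipeline */
	uint64_t pkts_drop_mask;  /* packets handed to the drop path */
};

/* Take the selected packets out of regular pipeline processing. */
static void
toy_ah_packet_hijack(struct toy_pipeline *p, uint64_t mask)
{
	p->pkts_mask &= ~mask;
}

/* Hijack the selected packets and mark them for dropping. */
static void
toy_ah_packet_drop(struct toy_pipeline *p, uint64_t mask)
{
	toy_ah_packet_hijack(p, mask);
	p->pkts_drop_mask |= mask;
}

/*
 * Example action handler: it receives the pipeline handle, so it can
 * override the default forwarding decision for packets whose TTL is 0.
 */
static int
toy_ttl_action_handler(struct toy_pipeline *p, struct toy_pkt **pkts,
		       uint32_t n_pkts)
{
	uint64_t expired = 0;
	uint32_t i;

	for (i = 0; i < n_pkts && i < TOY_BURST_MAX; i++)
		if (pkts[i]->ttl == 0)
			expired |= UINT64_C(1) << i;

	if (expired)
		toy_ah_packet_drop(p, expired);
	return 0;
}
```

Packets whose bit stays set in `pkts_mask` continue along the default path; a handler could equally send hijacked packets to another output port instead of dropping them.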

Signed-off-by: Jasvinder Singh 
Acked-by: Cristian Dumitrescu 
---
v2:
* rebased on master

 doc/guides/rel_notes/deprecation.rst |   5 -
 doc/guides/rel_notes/release_16_04.rst   |   6 +-
 lib/librte_pipeline/Makefile |   4 +-
 lib/librte_pipeline/rte_pipeline.c   | 461 ++-
 lib/librte_pipeline/rte_pipeline.h   |  98 +++---
 lib/librte_pipeline/rte_pipeline_version.map |   8 +
 6 files changed, 308 insertions(+), 274 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index e94d4a2..1a7d660 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -40,11 +40,6 @@ Deprecation Notices
 * The scheduler statistics structure will change to allow keeping track of
   RED actions.

-* librte_pipeline: The prototype for the pipeline input port, output port
-  and table action handlers will be updated:
-  the pipeline parameter will be added, the packets mask parameter will be
-  either removed (for input port action handler) or made input-only.
-
 * ABI changes are planned in cmdline buffer size to allow the use of long
   commands (such as RETA update in testpmd).  This should impact
   CMDLINE_PARSE_RESULT_BUFSIZE, STR_TOKEN_SIZE and RDLINE_BUF_SIZE.
diff --git a/doc/guides/rel_notes/release_16_04.rst 
b/doc/guides/rel_notes/release_16_04.rst
index e2219d0..bbfd248 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -118,6 +118,10 @@ ABI Changes
   the previous releases and made in this release. Use fixed width quotes for
   ``rte_function_names`` or ``rte_struct_names``. Use the past tense.

+* librte_pipeline: The prototype for the pipeline input port, output port
+  and table action handlers are updated:the pipeline parameter is added,
+  the packets mask parameter has been either removed or made input-only.
+

 Shared Library Versions
 ---
@@ -144,7 +148,7 @@ The libraries prepended with a plus sign were incremented 
in this version.
  librte_mbuf.so.2
  librte_mempool.so.1
  librte_meter.so.1
- librte_pipeline.so.2
+   + librte_pipeline.so.3
  librte_pmd_bond.so.1
  librte_pmd_ring.so.2
  librte_port.so.2
diff --git a/lib/librte_pipeline/Makefile b/lib/librte_pipeline/Makefile
index 1166d3c..822fd41 100644
--- a/lib/librte_pipeline/Makefile
+++ b/lib/librte_pipeline/Makefile
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -41,7 +41,7 @@ CFLAGS += $(WERROR_FLAGS)

 EXPORT_MAP := rte_pipeline_version.map

-LIBABIVER := 2
+LIBABIVER := 3

 #
 # all source are stored in SRCS-y
diff --git a/lib/librte_pipeline/rte_pipeline.c 
b/lib/librte_pipeline/rte_pipeline.c
index d625fd2..87f7634 100644
--- a/lib/librte_pipeline/rte_pipeline.c
+++ b/lib/librte_pipeline/rte_pipeline.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -49,14 +49,30 @@
 #define RTE_TABLE_INVALID UINT32_MAX

 #ifdef RTE_PIPELINE_STATS_COLLECT
-#define RTE_PIPELINE_STATS_ADD(counter, val) \
-   ({ (counter) += (val); })

-#define RTE_PIPELINE_STATS_ADD_M(counter, mask) \
-   ({ (counter) += __builtin_popcountll(mask); })
+#define RTE_PIPELINE_STATS_AH_DROP_WRITE(p, mask)  \
+   ({ (p)->n_pkts_ah_drop = __builtin_popcountll(mask); })
+
+#define RTE_PIPELINE_STATS_AH_DROP_READ(p, counter)\
+   ({ (counter) += (p)->n_pkts_ah_drop; (p)->n_pkts_ah_drop = 0; })
+
+#define RTE_PIPELINE_STATS_TABLE_DROP0(p)  \
+   ({ (p)->pkts_drop_mask = (p)->action_mask0[RTE_PIPELINE_ACTION_DROP]; })
+
+#define RTE_PIPELINE_STATS_TABLE_DROP1(p, counter) \
+({ \
+   uint64_t mask = 

[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module

2016-02-29 Thread Ferruh Yigit
On 2/29/2016 11:39 AM, Avi Kivity wrote:
> 
> 
> On 02/29/2016 01:27 PM, Ferruh Yigit wrote:
>> On 2/29/2016 10:58 AM, Avi Kivity wrote:
>>>
>>> On 02/29/2016 12:43 PM, Ferruh Yigit wrote:
>>>> On 2/29/2016 9:43 AM, Avi Kivity wrote:
>>>>> On 02/28/2016 10:16 PM, Ferruh Yigit wrote:
>>>>>> On 2/28/2016 3:34 PM, Avi Kivity wrote:
>>>>>>> On 01/27/2016 06:24 PM, Ferruh Yigit wrote:
>>>>>>>> This kernel module is based on KNI module, but this one is stripped
>>>>>>>> version of it and only for control messages, no data transfer
>>>>>>>> functionality provided.
>>>>>>>>
>>>>>>>> This Linux kernel module helps userspace application create virtual
>>>>>>>> interfaces and when a control command issued into that virtual
>>>>>>>> interface, module pushes the command to the userspace and gets the
>>>>>>>> response back for the caller application.
>>>>>>>>
>>>>>>>> The Linux tools like ethtool/ifconfig/ip can be used on virtual
>>>>>>>> interfaces but not ones for related data, like tcpdump.
>>>>>>>>
>>>>>>>> In long term this patch intends to replace the KNI and KNI will be
>>>>>>>> depreciated.
>>>>>>> Instead of adding yet another out-of-tree kernel module, why not
>>>>>>> extend
>>>>>>> the existing in-tree tap driver?  This will make everyone's life
>>>>>>> easier.
>>>>>>>
>>>>>>> Since tap also supports data transfer, an application can also
>>>>>>> forward
>>>>>>> packets not intended to it to the kernel, and forward packets
>>>>>>> from the
>>>>>>> kernel through the device.
>>>>>>>
>>>>>> Hi Avi,
>>>>>>
>>>>>> KDP (Kernel Data Path) does what you have described, it is
>>>>>> implemented
>>>>>> as PMD and it benefits from tap driver to data transfer through the
>>>>>> kernel. It also support custom kernel module for better performance.
>>>>>>
>>>>>> For KCP (Kernel Control Path), network driver forwards control
>>>>>> commands
>>>>>> to the userspace driver, I doubt this is something wanted for tun/tap
>>>>>> driver, so extending tun/tap driver like this can be hard to
>>>>>> upstream.
>>>>> Have you tried asking?  Maybe if you explain it they will be open
>>>>> to the
>>>>> extension.
>>>>>
>>>> Not communicated but tun/tap already doing something different.
>>>> For KCP, created interface is map of the DPDK port. All data interface
>>>> shows coming from DPDK port. For example if you get stats information
>>>> with ifconfig, the values you observe are DPDK port statistics -not
>>>> statistics of data between userspace and kernelspace, statistics of
>>>> data
>>>> forwarded between DPDK ports. If you down the interface, DPDK port
>>>> stopped, etc...
>>>>
>>>> If you extend the tun/tap, it won't be map of the DPDK port, and if you
>>>> get statistics information from that interface, what do you expect to
>>>> see, the data transferred between kernel and userspace, or underlying
>>>> DPDK port forwarding statistics?
>>> Good point.  But you really have to involve netdev on this, or you'll
>>> live out-of-tree forever.
>>>
>> Why do we need to touch netdev?
> 
> By netdev, I meant the mailing list.  If you don't touch it, your driver
> will remain out-of-tree forever.
> 
Sorry, I thought you were suggesting updating netdev (struct net_device)
for this.

>> A simple network driver, similar to kcp, can be solution.
>>
>> This driver implements all net_device_ops and ethtool_ops in a way to
>> forward everything to the userspace via netlink. All needs to know about
>> userspace driver is it's unique id. Any userspace application, not only
>> DPDK drivers, can listen the netlink messages and response to the
>> requests come to itself.
>>
>> This kind of driver is not big or complicated, kcp already does %90 of
>> what described above.
> 
> I am not arguing against kcp.  It fulfills an important need.  This is
> my argument:
> 
> 1. having multiple interfaces for the control and data path is bad for
> the user
> 2. therefore, we need to either add tap functionality to kcp, or add kcp
> functionality to tap
> 3. netdev@ is more likely (IMO) to accept additional functionality to
> tap than a new driver, but the only way to know is to engage with them
> 
Agreed, an incremental update to tap may be easier to get in, but
this does not really work for us, as explained above.

The concern about having two separate interfaces can be solved without
merging the data and control paths. I believe this is not a showstopper for
the functionality and can be an incremental improvement.

>>
>>>> Extending tun/tap in a way we want, forwarding all control commands to
>>>> userspace, will break the current tun/tap, this doesn't looks like a
>>>> valid option to me.
>>> It's possible to enhance it while preserving backwards compatibility, by
>>> enabling a feature flag (statistics from userspace).
>>>
>>>> For data path, using tun/tap is OK and we are already doing it, for the
>>>> control path I believe we need a new driver.
>>>>
>>>>> Certainly it will be better to have KCP and KDP use the same kernel
>>>>> interface name; so we'll need

[dpdk-dev] Issue with configuring iproute using netdpcmd and running opendp

2016-02-29 Thread Mariappan Rajendran
Hi,

I am trying to configure the iproute using netdpcmd (from the dpdk-odp
repository), but it is failing. Kindly help to resolve this issue.

root@ICSCHELAP1003:/home/hadmin/Mari/dpdk-odp/netdp_cmd# ./build/netdpcmd
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 1 on socket 0
EAL: Detected lcore 2 as core 0 on socket 0
EAL: Detected lcore 3 as core 1 on socket 0
EAL: Support maximum 128 logical core(s) by configuration.
EAL: Detected 4 lcore(s)
EAL: Setting up physically contiguous memory...
EAL: Analysing 64 files
EAL: Mapped segment 0 of size 0x20
EAL: Mapped segment 1 of size 0x40
EAL: Mapped segment 2 of size 0x60
EAL: Mapped segment 3 of size 0x20
EAL: Mapped segment 4 of size 0x20
EAL: Mapped segment 5 of size 0x20
EAL: Mapped segment 6 of size 0x20
EAL: Mapped segment 7 of size 0x20
EAL: Mapped segment 8 of size 0x20
EAL: Mapped segment 9 of size 0x20
EAL: Mapped segment 10 of size 0x40
EAL: Mapped segment 11 of size 0x40
EAL: Mapped segment 12 of size 0x40
EAL: Mapped segment 13 of size 0x20
EAL: Mapped segment 14 of size 0x220
EAL: Mapped segment 15 of size 0x100
EAL: Mapped segment 16 of size 0x20
EAL: Mapped segment 17 of size 0x40
EAL: Mapped segment 18 of size 0x200
EAL: memzone_reserve_aligned_thread_unsafe(): memzone  
already exists
RING: Cannot reserve memory


EAL: TSC frequency is ~1895612 KHz
EAL: Master lcore 0 is ready (tid=f7fdc940;cpuset=[0])
Lookup ring(NETDP_CTRL_PRI_2_SEC) failed
PANIC in main():
Cannot init ring
5: [./build/netdpcmd() [0x42c223]]
4: [/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x77105ec5]]
3: [./build/netdpcmd() [0x42ab3c]]
2: [./build/netdpcmd(__rte_panic+0xc9) [0x424f31]]
1: [./build/netdpcmd(rte_dump_stack+0x28) [0x495128]]
Aborted

Also, I am getting the below error while running opendp.

root@ICSCHELAP1003:/home/hadmin/Mari/dpdk-odp/opendp# ./build/opendp -c 0x1 -n 1  -- -p 0x1 --config="(0,0,0)"
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 1 on socket 0
EAL: Detected lcore 2 as core 0 on socket 0
EAL: Detected lcore 3 as core 1 on socket 0
EAL: Support maximum 128 logical core(s) by configuration.
EAL: Detected 4 lcore(s)
EAL: VFIO modules not all loaded, skip VFIO support...
EAL: Setting up physically contiguous memory...
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x76e0 (size = 0x20)
EAL: Ask a virtual area of 0x40 bytes
EAL: Virtual area found at 0x7680 (size = 0x40)
EAL: Ask a virtual area of 0x60 bytes
EAL: Virtual area found at 0x7600 (size = 0x60)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x75c0 (size = 0x20)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7580 (size = 0x20)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7540 (size = 0x20)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7500 (size = 0x20)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x74c0 (size = 0x20)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7480 (size = 0x20)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7440 (size = 0x20)
EAL: Ask a virtual area of 0x40 bytes
EAL: Virtual area found at 0x73e0 (size = 0x40)
EAL: Ask a virtual area of 0x40 bytes
EAL: Virtual area found at 0x7380 (size = 0x40)
EAL: Ask a virtual area of 0x40 bytes
EAL: Virtual area found at 0x7320 (size = 0x40)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x72e0 (size = 0x20)
EAL: Ask a virtual area of 0x220 bytes
EAL: Virtual area found at 0x70a0 (size = 0x220)
EAL: Ask a virtual area of 0x100 bytes
EAL: Virtual area found at 0x7fffef80 (size = 0x100)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7fffef40 (size = 0x20)
EAL: Ask a virtual area of 0x40 bytes
EAL: Virtual area found at 0x7fffeee0 (size = 0x40)
EAL: Ask a virtual area of 0x200 bytes
EAL: Virtual area found at 0x7fffecc0 (size = 0x200)
EAL: Requesting 64 pages of size 2MB from socket 0
EAL: TSC frequency is ~1895612 KHz
EAL: Master lcore 0 is ready (tid=f7fdc980;cpuset=[0])
param nb 1 ports 0
port id 0
 port 0 is not present on the board
EAL: Error - exiting with code: 1
  Cause: check_port_config failed

Below is my ifconfig. Do I need to configure anything before running opendp?
root@ICSCHELAP1003:/home/hadmin/Mari/dpdk-odp/opendp# ifconfig
eth13 Link encap:Ethernet  HWaddr 34:e6:d7:2b:89:60
  inet addr:172.27.10.27  Bcast:172.27.10.255  Mask:255.255.255.0
  inet6 addr: fe80::36e6:d7ff:fe2b:8960/64 Scope:Link
  UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
 

[dpdk-dev] [PATCH] mk: add makefile extention support

2016-02-29 Thread Wiles, Keith
>2016-02-28 21:47, Wiles, Keith:
>> >Hi,
>> >
>> >2016-02-09 11:35, Keith Wiles:
>> >> Adding support to the build system to allow for Makefile.XXX
>> >> extention to a subtree, which already has Makefiles. These
>> >> Makefiles could be from the autotools and others places. Using
>> >> the Makefile extention RTE_MKFILE_SUFFIX in a makefile subtree
>> >> using 'export RTE_MKFILE_SUFFIX=.XXX' to use Makefile.XXX in
>> >> that subtree.
>> >> 
>> >> The main reason I needed this feature was to integrate a autotool
>> >> open source projects with DPDK and keep the original Makefiles.
>> >
>> >Sorry I fail to understand why it is needed.
>> >Are you trying to add autotool in DPDK? I don't think it is a good approach.
>> >The DPDK must provide a pkgconfig interface to be integrated anywhere.
>> 
>> I was not trying to add autotools to DPDK. On a number of times I wanted to 
>> integrate a open source project(s) with DPDK and use DPDK's build system, 
>> but because the open source project already contained Makefile files you can 
>> not use DPDK build system without modify or moving the original Makefile 
>> files. Using this method I can just add a exported variable and supply my 
>> own Makefile.XXX files.
>> 
>> One case was building FreeBSD source, but I did not want to modify FreeBSD 
>> Makefiles (or reply on previous built Makefiles as they would not work on 
>> Linux anyway) as I was pulling the source down from freebsd.org repo. Using 
>> a patch to add the Makefiles with a different suffix allows me to build 
>> FreeBSD using DPDK, without having to modify or own the FreeBSD source. I 
>> have had this problem a number of times with open source code I did not want 
>> to modify, but just build within DPDK build system and adding the support 
>> for a different suffix to DPDK provided a clean way. The change does not 
>> effect the correct build system and just allows someone to define a new 
>> suffix for a given subtree in the code.
>
>Why would you like to have another project inside the DPDK files tree?
>If you want to integrate the lib inside an existing project, the solution
>is pkgconfig.

The goal for me was to use DPDK build system for that project, instead of using 
autotools or some other makefile system. In the case of FreeBSD code, the 
FreeBSD build system requires FreeBSD tools to be built as the 'make' and the 
Makefiles are very different on a Linux machine.
>


Regards,
Keith






[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module

2016-02-29 Thread Avi Kivity


On 02/29/2016 01:27 PM, Ferruh Yigit wrote:
> On 2/29/2016 10:58 AM, Avi Kivity wrote:
>>
>> On 02/29/2016 12:43 PM, Ferruh Yigit wrote:
>>> On 2/29/2016 9:43 AM, Avi Kivity wrote:
>>>> On 02/28/2016 10:16 PM, Ferruh Yigit wrote:
>>>>> On 2/28/2016 3:34 PM, Avi Kivity wrote:
>>>>>> On 01/27/2016 06:24 PM, Ferruh Yigit wrote:
>>>>>>> This kernel module is based on KNI module, but this one is stripped
>>>>>>> version of it and only for control messages, no data transfer
>>>>>>> functionality provided.
>>>>>>>
>>>>>>> This Linux kernel module helps userspace application create virtual
>>>>>>> interfaces and when a control command issued into that virtual
>>>>>>> interface, module pushes the command to the userspace and gets the
>>>>>>> response back for the caller application.
>>>>>>>
>>>>>>> The Linux tools like ethtool/ifconfig/ip can be used on virtual
>>>>>>> interfaces but not ones for related data, like tcpdump.
>>>>>>>
>>>>>>> In long term this patch intends to replace the KNI and KNI will be
>>>>>>> depreciated.
>>>>>> Instead of adding yet another out-of-tree kernel module, why not
>>>>>> extend
>>>>>> the existing in-tree tap driver?  This will make everyone's life
>>>>>> easier.
>>>>>>
>>>>>> Since tap also supports data transfer, an application can also forward
>>>>>> packets not intended to it to the kernel, and forward packets from the
>>>>>> kernel through the device.
>>>>>>
>>>>> Hi Avi,
>>>>>
>>>>> KDP (Kernel Data Path) does what you have described, it is implemented
>>>>> as PMD and it benefits from tap driver to data transfer through the
>>>>> kernel. It also support custom kernel module for better performance.
>>>>>
>>>>> For KCP (Kernel Control Path), network driver forwards control commands
>>>>> to the userspace driver, I doubt this is something wanted for tun/tap
>>>>> driver, so extending tun/tap driver like this can be hard to upstream.
>>>> Have you tried asking?  Maybe if you explain it they will be open to the
>>>> extension.
>>>>
>>> Not communicated but tun/tap already doing something different.
>>> For KCP, created interface is map of the DPDK port. All data interface
>>> shows coming from DPDK port. For example if you get stats information
>>> with ifconfig, the values you observe are DPDK port statistics -not
>>> statistics of data between userspace and kernelspace, statistics of data
>>> forwarded between DPDK ports. If you down the interface, DPDK port
>>> stopped, etc...
>>>
>>> If you extend the tun/tap, it won't be map of the DPDK port, and if you
>>> get statistics information from that interface, what do you expect to
>>> see, the data transferred between kernel and userspace, or underlying
>>> DPDK port forwarding statistics?
>> Good point.  But you really have to involve netdev on this, or you'll
>> live out-of-tree forever.
>>
> Why do we need to touch netdev?

By netdev, I meant the mailing list.  If you don't touch it, your driver 
will remain out-of-tree forever.

> A simple network driver, similar to kcp, can be solution.
>
> This driver implements all net_device_ops and ethtool_ops in a way to
> forward everything to the userspace via netlink. All needs to know about
> userspace driver is it's unique id. Any userspace application, not only
> DPDK drivers, can listen the netlink messages and response to the
> requests come to itself.
>
> This kind of driver is not big or complicated, kcp already does %90 of
> what described above.

I am not arguing against kcp.  It fulfills an important need.  This is 
my argument:

1. having multiple interfaces for the control and data path is bad for 
the user
2. therefore, we need to either add tap functionality to kcp, or add kcp 
functionality to tap
3. netdev@ is more likely (IMO) to accept additional functionality to 
tap than a new driver, but the only way to know is to engage with them

>
>>> Extending tun/tap in a way we want, forwarding all control commands to
>>> userspace, will break the current tun/tap, this doesn't looks like a
>>> valid option to me.
>> It's possible to enhance it while preserving backwards compatibility, by
>> enabling a feature flag (statistics from userspace).
>>
>>> For data path, using tun/tap is OK and we are already doing it, for the
>>> control path I believe we need a new driver.
>>>
>>>> Certainly it will be better to have KCP and KDP use the same kernel
>>>> interface name; so we'll need to either add data path support to kcp
>>>> (causing duplication with tap), or add control path support to tap. I
>>>> think the latter is preferable.
>>>>
>>> Why it is better to have same interface? Anyone who is not interested
>>> with kernel data path may want to control DPDK ports using common tools,
>>> or want to get some basic information and stats using ethtool or
>>> ifconfig. Why we need to bind two different functionality together?
>> Having two interfaces will be confusing for the user.  If I wish to
>> firewall data packets coming from the dpdk port, do I set firewall rules
>> on 

[dpdk-dev] [PATCH v5 01/11] ethdev: add API to query packet type filling info

2016-02-29 Thread Panu Matilainen
On 02/26/2016 09:34 AM, Jianfeng Tan wrote:
> Add a new API rte_eth_dev_get_ptype_info to query whether/what packet
> type can be filled by given pmd rx burst function.
>
> Signed-off-by: Jianfeng Tan 
> ---
>   lib/librte_ether/rte_ethdev.c | 26 ++
>   lib/librte_ether/rte_ethdev.h | 26 ++
>   2 files changed, 52 insertions(+)
>
[...]
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 16da821..16f32a0 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1021,6 +1021,9 @@ typedef void (*eth_dev_infos_get_t)(struct rte_eth_dev 
> *dev,
>   struct rte_eth_dev_info *dev_info);
>   /**< @internal Get specific informations of an Ethernet device. */
>
> +typedef const uint32_t *(*eth_dev_ptype_info_get_t)(struct rte_eth_dev *dev);
> +/**< @internal Get ptype info of eth_rx_burst_t. */
> +
>   typedef int (*eth_queue_start_t)(struct rte_eth_dev *dev,
>   uint16_t queue_id);
>   /**< @internal Start rx and tx of a queue of an Ethernet device. */
> @@ -1347,6 +1350,7 @@ struct eth_dev_ops {
>   eth_queue_stats_mapping_set_t queue_stats_mapping_set;
>   /**< Configure per queue stat counter mapping. */
>   eth_dev_infos_get_tdev_infos_get; /**< Get device info. */
> + eth_dev_ptype_info_get_t   dev_ptype_info_get; /** Get ptype info */
>   mtu_set_t  mtu_set; /**< Set MTU. */
>   vlan_filter_set_t  vlan_filter_set;  /**< Filter VLAN Setup. */
>   vlan_tpid_set_tvlan_tpid_set;  /**< Outer VLAN TPID 
> Setup. */
> @@ -2268,6 +2272,28 @@ void rte_eth_macaddr_get(uint8_t port_id, struct 
> ether_addr *mac_addr);

Technically this is an ABI break but it's marked internal and I guess it 
falls into the "drivers only" territory similar to what was discussed in 
this thread: http://dpdk.org/ml/archives/dev/2016-January/032348.html so 
it's probably ok.

>   void rte_eth_dev_info_get(uint8_t port_id, struct rte_eth_dev_info 
> *dev_info);
>
>   /**
> + * Retrieve the packet type information of an Ethernet device.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param ptype_mask
> + *   A hint of what kind of packet type which the caller is interested in.
> + * @param ptypes
> + *   An array pointer to store adequent packet types, allocated by caller.
> + * @param num
> + *  Size of the array pointed by param ptypes.
> + * @return
> + *   - (>0) Number of ptypes supported. If it exceeds param num, exceeding
> + *  packet types will not be filled in the given array.
> + *   - (0 or -ENOTSUP) if PMD does not fill the specified ptype.
> + *   - (-ENODEV) if *port_id* invalid.
> + */
> +extern int rte_eth_dev_get_ptype_info(uint8_t port_id,
> +   uint32_t ptype_mask,
> +   uint32_t *ptypes,
> +   int num);
> +
> +/**
>* Retrieve the MTU of an Ethernet device.
>*
>* @param port_id
>

"extern" is redundant in headers. We just saw a round of removing them 
(commit dd34ff1f0e03b2c5e4a97e9fbcba5c8238aac573), let's not add them back :)

More importantly, to export a function you need to add an entry for it 
in rte_ether_version.map.
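
To illustrate the "extern" point: in C, a function declaration at file scope
has external linkage by default, so the two prototypes below declare exactly
the same thing. This is a minimal sketch; get_ptype_count is a hypothetical
stand-in and not the function from the patch.

```c
/* Hypothetical stand-in for a public prototype; not the real ethdev API. */
extern int get_ptype_count(int num); /* explicit extern */
int get_ptype_count(int num);        /* same declaration, extern is implied */

/* Single definition matching both declarations above. */
int get_ptype_count(int num)
{
	const int supported = 4; /* pretend the driver supports 4 packet types */

	return num < supported ? num : supported;
}
```

Both prototypes can coexist in one translation unit without any diagnostic,
which is why the keyword adds nothing in a header.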

- Panu -




[dpdk-dev] [PATCH v2] doc/nic: add ixgbe statistics on read frequency

2016-02-29 Thread Kerlin, MarcinX
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Harry van Haaren
> Sent: Monday, February 29, 2016 1:17 PM
> To: Mcnamara, John 
> Cc: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2] doc/nic: add ixgbe statistics on read
> frequency
> 
> This patch adds a note to the ixgbe PMD guide, stating the minimum time
> that statistics must be polled from the hardware in order to avoid register
> values becoming saturated and "sticking" to the max value.
> 
> Reported-by: Jerry Zhang 
> Tested-by: Marcin Kerlin 
> Signed-off-by: Harry van Haaren 

Acked-by: Marcin Kerlin 
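
For a feel of the time scale involved in counter saturation, the arithmetic
can be sketched in C as below. The numbers are assumptions for illustration
only: a 32-bit packet counter and a 10 Gb/s link carrying minimum-size
64-byte frames (plus 20 bytes of preamble and inter-frame gap per frame).
The actual register widths and the recommended polling interval are whatever
the patch and the hardware documentation state; the helper names here are
hypothetical.

```c
#include <stdint.h>

/* Packets per second at line rate for a given frame size (without preamble/IFG). */
double line_rate_pps(double link_bps, unsigned frame_bytes)
{
	/* 20 extra bytes on the wire: 8 preamble/SFD + 12 inter-frame gap */
	return link_bps / ((frame_bytes + 20) * 8.0);
}

/* Seconds until a counter with the given maximum value saturates. */
double secs_to_saturate(uint64_t counter_max, double events_per_sec)
{
	return (double)counter_max / events_per_sec;
}
```

Under these assumptions, secs_to_saturate(UINT32_MAX, line_rate_pps(10e9, 64))
comes out at a little under five minutes, so the counter would have to be read
more often than that.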


[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module

2016-02-29 Thread Avi Kivity


On 02/29/2016 12:43 PM, Ferruh Yigit wrote:
> On 2/29/2016 9:43 AM, Avi Kivity wrote:
>> On 02/28/2016 10:16 PM, Ferruh Yigit wrote:
>>> On 2/28/2016 3:34 PM, Avi Kivity wrote:
>>>> On 01/27/2016 06:24 PM, Ferruh Yigit wrote:
>>>>> This kernel module is based on KNI module, but this one is stripped
>>>>> version of it and only for control messages, no data transfer
>>>>> functionality provided.
>>>>>
>>>>> This Linux kernel module helps userspace application create virtual
>>>>> interfaces and when a control command issued into that virtual
>>>>> interface, module pushes the command to the userspace and gets the
>>>>> response back for the caller application.
>>>>>
>>>>> The Linux tools like ethtool/ifconfig/ip can be used on virtual
>>>>> interfaces but not ones for related data, like tcpdump.
>>>>>
>>>>> In long term this patch intends to replace the KNI and KNI will be
>>>>> depreciated.
>>>> Instead of adding yet another out-of-tree kernel module, why not extend
>>>> the existing in-tree tap driver?  This will make everyone's life easier.
>>>>
>>>> Since tap also supports data transfer, an application can also forward
>>>> packets not intended to it to the kernel, and forward packets from the
>>>> kernel through the device.
>>>>
>>> Hi Avi,
>>>
>>> KDP (Kernel Data Path) does what you have described, it is implemented
>>> as PMD and it benefits from tap driver to data transfer through the
>>> kernel. It also support custom kernel module for better performance.
>>>
>>> For KCP (Kernel Control Path), network driver forwards control commands
>>> to the userspace driver, I doubt this is something wanted for tun/tap
>>> driver, so extending tun/tap driver like this can be hard to upstream.
>> Have you tried asking?  Maybe if you explain it they will be open to the
>> extension.
>>
> Not communicated but tun/tap already doing something different.
> For KCP, created interface is map of the DPDK port. All data interface
> shows coming from DPDK port. For example if you get stats information
> with ifconfig, the values you observe are DPDK port statistics -not
> statistics of data between userspace and kernelspace, statistics of data
> forwarded between DPDK ports. If you down the interface, DPDK port
> stopped, etc...
>
> If you extend the tun/tap, it won't be map of the DPDK port, and if you
> get statistics information from that interface, what do you expect to
> see, the data transferred between kernel and userspace, or underlying
> DPDK port forwarding statistics?

Good point.  But you really have to involve netdev on this, or you'll 
live out-of-tree forever.

> Extending tun/tap in a way we want, forwarding all control commands to
> userspace, will break the current tun/tap, this doesn't looks like a
> valid option to me.

It's possible to enhance it while preserving backwards compatibility, by 
enabling a feature flag (statistics from userspace).

> For data path, using tun/tap is OK and we are already doing it, for the
> control path I believe we need a new driver.
>
>> Certainly it will be better to have KCP and KDP use the same kernel
>> interface name; so we'll need to either add data path support to kcp
>> (causing duplication with tap), or add control path support to tap. I
>> think the latter is preferable.
>>
> Why it is better to have same interface? Anyone who is not interested
> with kernel data path may want to control DPDK ports using common tools,
> or want to get some basic information and stats using ethtool or
> ifconfig. Why we need to bind two different functionality together?

Having two interfaces will be confusing for the user.  If I wish to 
firewall data packets coming from the dpdk port, do I set firewall rules 
on dpdk0 or tap0?

I don't think it matters whether you extend tap, or add a data path to 
kcp, but if you want to upstream it, it needs to be blessed by netdev.

>
>>> We are investigating about adding a native support to Linux kernel for
>>> KCP, but there is no task started for this right now, any support is
>>> welcome.
>>>
>>>



[dpdk-dev] [PATCH v6 1/2] mbuf: provide rte_pktmbuf_alloc_bulk API

2016-02-29 Thread Panu Matilainen
On 02/24/2016 03:23 PM, Ananyev, Konstantin wrote:
> Hi Panu,
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Panu Matilainen
>> Sent: Wednesday, February 24, 2016 12:12 PM
>> To: Xie, Huawei; Olivier MATZ; dev at dpdk.org
>> Cc: dprovan at bivio.net
>> Subject: Re: [dpdk-dev] [PATCH v6 1/2] mbuf: provide rte_pktmbuf_alloc_bulk 
>> API
>>
>> On 02/23/2016 07:35 AM, Xie, Huawei wrote:
>>> On 2/22/2016 10:52 PM, Xie, Huawei wrote:
>>>> On 2/4/2016 1:24 AM, Olivier MATZ wrote:
>>>>> Hi,
>>>>>
>>>>> On 01/27/2016 02:56 PM, Panu Matilainen wrote:
>>>>>> Since rte_pktmbuf_alloc_bulk() is an inline function, it is not part of
>>>>>> the library ABI and should not be listed in the version map.
>>>>>>
>>>>>> I assume its inline for performance reasons, but then you lose the
>>>>>> benefits of dynamic linking such as ability to fix bugs and/or improve
>>>>>> it by just updating the library. Since the point of having a bulk API is
>>>>>> to improve performance by reducing the number of calls required, does it
>>>>>> really have to be inline? As in, have you actually measured the
>>>>>> difference between inline and non-inline and decided its worth all the
>>>>>> downsides?
>>>>> Agree with Panu. It would be interesting to compare the performance
>>>>> between inline and non inline to decide whether inlining it or not.
>>>> Will update after i gathered more data. inline could show obvious
>>>> performance difference in some cases.
>>>
>>> Panu and Oliver:
>>> I write a simple benchmark. This benchmark run 10M rounds, in each round
>>> 8 mbufs are allocated through bulk API, and then freed.
>>> These are the CPU cycles measured(Intel(R) Xeon(R) CPU E5-2680 0 @
>>> 2.70GHz, CPU isolated, timer interrupt disabled, rcu offloaded).
>>> Btw, i have removed some exceptional data, the frequency of which is
>>> like 1/10. Sometimes observed user usage suddenly disappeared, no clue
>>> what happened.
>>>
>>> With 8 mbufs allocated, there is about 6% performance increase using inline.
>> [...]
>>>
>>> With 16 mbufs allocated, we could still observe obvious performance
>>> difference, though only 1%-2%
>>>
>> [...]
>>>
>>> With 32/64 mbufs allocated, the deviation of the data itself would hide
>>> the performance difference.
>>> So we prefer using inline for performance.
>>
>> At least I was more after real-world performance in a real-world
>> use-case rather than CPU cycles in a microbenchmark, we know function
>> calls have a cost but the benefits tend to outweight the cons.
>>
>> Inline functions have their place and they're far less evil in project
>> internal use, but in library public API they are BAD and should be ...
>> well, not banned because there are exceptions to every rule, but highly
>> discouraged.
>
> Why is that?

For all the reasons static linking is bad, and what's worse it forces 
the static linking badness into dynamically linked builds.

If there's a bug (security or otherwise) in a library, a distro wants to 
supply an updated package which fixes that bug and be done with it. But 
if that bug is in an inlined code, supplying an update is not enough, 
you also need to recompile everything using that code, and somehow 
inform customers possibly using that code that they need to not only 
update the library but to recompile their apps as well. That is 
precisely the reason distros go to great lengths to avoid *any* 
statically linked apps and libs in the distro, completely regardless of 
the performance overhead.

In addition, inlined code complicates ABI compatibility issues because 
some of the code is on the "wrong" side, and worse, it bypasses all the 
other ABI compatibility safeguards like soname and symbol versioning.

As I said, inlined code is fine for internal consumption, but incredibly 
bad for public interfaces. And of course, the more complicated a 
function is, the greater the potential for needing bugfixes.

Mind you, none of this is magically specific to this particular 
function. Except in the sense that bulk operations offer a better way of 
performance improvements than just inlining everything.

> As you can see right now we have all mbuf alloc/free routines as static 
> inline.
> And I think we would like to keep it like that.
> So why that particular function should be different?

Because there's much less need to have it inlined since the function 
call overhead is "amortized" by the fact it's doing bulk operations. "We 
always did it that way" is not a very good reason :)
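
The "amortized" argument above can be illustrated with a small sketch. It
is not DPDK code: struct obj, reset_one() and reset_bulk() are hypothetical
stand-ins for an mbuf and its reset routines, and the noinline attribute
(GCC/Clang) is used only so that every call is a real function call. It
counts calls rather than cycles, so it says nothing about the measured
percentages quoted earlier in the thread.

```c
#include <stddef.h>

/* Hypothetical object type; stands in for an mbuf in this sketch. */
struct obj {
    int refcnt;
};

/* Counts function calls so the two strategies can be compared. */
unsigned long call_count;

/* One call per object: the call overhead is paid n times. */
__attribute__((noinline)) void reset_one(struct obj *o)
{
    call_count++;
    o->refcnt = 1;
}

/* One call per burst: the call overhead is paid once for n objects. */
__attribute__((noinline)) void reset_bulk(struct obj *objs, size_t n)
{
    call_count++;
    for (size_t i = 0; i < n; i++)
        objs[i].refcnt = 1;
}

/* Returns the number of calls made to reset n objects one at a time. */
unsigned long calls_single(struct obj *objs, size_t n)
{
    call_count = 0;
    for (size_t i = 0; i < n; i++)
        reset_one(&objs[i]);
    return call_count;
}

/* Returns the number of calls made to reset n objects in one burst. */
unsigned long calls_bulk(struct obj *objs, size_t n)
{
    call_count = 0;
    reset_bulk(objs, n);
    return call_count;
}
```

For a burst of 8 objects the per-object variant makes 8 calls and the bulk
variant makes 1, so whatever a non-inlined call costs is spread over the
whole burst.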

> After all that function is nothing more than a wrapper
> around rte_mempool_get_bulk()  unrolled by 4 loop {rte_pktmbuf_reset()}
> So unless mempool get/put API would change, I can hardly see there could be 
> any ABI
> breakages in future.
> About 'real world' performance gain - it was a 'real world' performance 
> problem,
> that we tried to solve by introducing that function:
> http://dpdk.org/ml/archives/dev/2015-May/017633.html
>
> And according to the user feedback, it 

[dpdk-dev] [PATCH v1] virtio: Use cpuflag for vector api

2016-02-29 Thread Yuanhan Liu
On Fri, Feb 26, 2016 at 02:21:02PM +0530, Santosh Shukla wrote:
> Check cpuflag macro before using vectored api.
> -virtio_recv_pkts_vec() uses _sse3__ simd instruction for now so added 
> cpuflag.
> - Also wrap other vectored friend api, i.e.:
> 1) virtqueue_enqueue_recv_refill_simple
> 2) virtio_rxq_vec_setup
> 
...
> diff --git a/drivers/net/virtio/virtio_rxtx_simple.c 
> b/drivers/net/virtio/virtio_rxtx_simple.c
> index 3a1de9d..be51d7c 100644
> --- a/drivers/net/virtio/virtio_rxtx_simple.c
> +++ b/drivers/net/virtio/virtio_rxtx_simple.c

Hmm, why not wrap the whole file, instead of just a few functions?

Or maybe better, do a compile-time check in the Makefile, something
like:

if has_CPUFLAG_xxx
SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple.c
endif
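
For comparison, the in-source alternative (wrapping the vector code in a
CPU-flag check, with a scalar fallback) might look like the sketch below.
This is only an illustration: __SSE2__ is the compiler's own predefined
macro and is used here as a stand-in for whatever flag the patch tests, and
sum_u32() and sum_u32_is_vectorized() are hypothetical helpers, not virtio
PMD functions.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>

/* Vector path: only compiled when the compiler advertises SSE2. */
uint32_t sum_u32(const uint32_t *v, size_t n)
{
    __m128i acc = _mm_setzero_si128();
    uint32_t lanes[4];
    uint32_t sum;
    size_t i = 0;

    /* Add four 32-bit lanes per iteration. */
    for (; i + 4 <= n; i += 4)
        acc = _mm_add_epi32(acc,
                            _mm_loadu_si128((const __m128i *)(v + i)));

    _mm_storeu_si128((__m128i *)lanes, acc);
    sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];

    /* Scalar tail for the remaining 0..3 elements. */
    for (; i < n; i++)
        sum += v[i];
    return sum;
}

int sum_u32_is_vectorized(void)
{
    return 1;
}
#else
/* Scalar fallback: used when the CPU flag is not available at build time. */
uint32_t sum_u32(const uint32_t *v, size_t n)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum;
}

int sum_u32_is_vectorized(void)
{
    return 0;
}
#endif
```

Either branch gives the same result; the Makefile approach moves the same
decision out of the source so that the file is not compiled at all when the
flag is missing.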


--yliu


[dpdk-dev] [PATCH v2] doc/nic: add ixgbe statistics on read frequency

2016-02-29 Thread Harry van Haaren
This patch adds a note to the ixgbe PMD guide, stating
the maximum interval at which statistics must be polled from
the hardware in order to avoid register values becoming
saturated and "sticking" to the max value.

Reported-by: Jerry Zhang 
Tested-by: Marcin Kerlin 
Signed-off-by: Harry van Haaren 
---

v2: Add reported-by and tested-by

 doc/guides/nics/ixgbe.rst | 24 
 1 file changed, 24 insertions(+)

diff --git a/doc/guides/nics/ixgbe.rst b/doc/guides/nics/ixgbe.rst
index 8cae299..c8085a8 100644
--- a/doc/guides/nics/ixgbe.rst
+++ b/doc/guides/nics/ixgbe.rst
@@ -178,3 +178,27 @@ load_balancer

 As in the case of l3fwd, set configure port_conf.rxmode.hw_ip_checksum=0 to 
enable vPMD.
 In addition, for improved performance, use -bsz "(32,32),(64,64),(32,32)" in 
load_balancer to avoid using the default burst size of 144.
+
+Statistics
+----------
+
+The statistics of ixgbe hardware must be polled regularly in order for them to
+remain consistent. Running a DPDK application without polling the statistics will
+cause registers on hardware to count to their maximum value, and "stick" at
+that value.
+
+In order to avoid statistic registers ever reaching their maximum value,
+read the statistics from the hardware using ``rte_eth_stats_get()`` or
+``rte_eth_xstats_get()``.
+
+The maximum time between statistics polls that ensures consistent results can
+be calculated as follows:
+
+.. code-block:: c
+
+  max_read_interval = UINT_MAX / max_packets_per_second
+  max_read_interval = 4294967295 / 14880952
+  max_read_interval = 288.6218096127183 (seconds)
+  max_read_interval = ~4 mins 48 sec.
+
+In order to ensure valid results, it is recommended to poll every 4 minutes.
-- 
2.5.0
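
The arithmetic in the patch can be checked with a few lines of C. This is
a sketch under stated assumptions: the counters are taken to be 32 bits
wide (the patch divides UINT_MAX), 14880952 packets per second is the rate
used in the patch (it corresponds to 64-byte frames at 10 Gbit/s), and
max_read_interval_sec() is a hypothetical helper, not part of the ethdev
API.

```c
#include <stdint.h>

/* Packet rate used in the patch's calculation. */
#define MAX_PACKETS_PER_SECOND 14880952ULL

/*
 * Whole seconds a 32-bit counter can run at `pps` packets per second
 * before it reaches UINT32_MAX. Returns 0 if pps is 0.
 */
uint64_t max_read_interval_sec(uint64_t pps)
{
    if (pps == 0)
        return 0;
    return (uint64_t)UINT32_MAX / pps;
}
```

With the rate above the helper returns 288 seconds, i.e. 4 minutes 48
seconds, which is why polling every 4 minutes leaves some margin.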



[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module

2016-02-29 Thread Stephen Hemminger
On Wed, 27 Jan 2016 16:24:07 +
Ferruh Yigit  wrote:

> +static int
> +kcp_ioctl_release(unsigned int ioctl_num, unsigned long ioctl_param)
> +{
> + int ret = -EINVAL;
> + struct kcp_dev *dev;
> + struct kcp_dev *n;
> + char name[RTE_KCP_NAMESIZE];
> + unsigned int instance = ioctl_param;
> +
> + snprintf(name, RTE_KCP_NAMESIZE, "dpdk%u", instance);
> +
> + down_write(_list_lock);


Some observations about how acceptable this will be to upstream
kernel developers.

ioctl's are the least favored form of API.

You chose the worst possible mutual exclusion: read/write semaphores.
Read/write is slower than simpler primitives, and semaphores were
replaced for almost all usage models by mutexes (about 4 years ago).

Looks like you copied the out-of-date kernel APIs used
by KNI.


[dpdk-dev] ACL memory allocation failures

2016-02-29 Thread Rapelly, Varun
Thanks Konstantin. A few more questions inline:

> 
> Previous allocation error was coming with 1024 huge pages of 2 MB size.
> 
> After increasing the huge pages to 2048, I was able to add another
> ~140 rules [IPv4 rule data--> with src, dst IP address & port, next header ] 
> more, ie., 950 rules were added.

That's strange; according to your log, all you need is ~13MB of hugepage memory:
ACL: allocation of 12966528 bytes on socket 0 for ipv4_acl_table1
Wonder what consumed the rest of the 4GB?
>> We are creating mem pools (for DPDK compatible 3 ports) for packet 
>> processing.
>>> And there are no free huge pages available after our DPDK app 
>>> initialization.
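
The sizes being discussed can be put side by side with a little
arithmetic. The sketch below assumes 2 MB hugepages as stated in the
thread and ignores allocator overhead, alignment and the need for
contiguous memory, so it gives a lower bound only; both helper names are
hypothetical.

```c
#include <stdint.h>

/* 2 MB hugepages, as used in the thread. */
#define HUGEPAGE_SIZE (2ULL * 1024 * 1024)

/* Total bytes provided by nr_pages hugepages. */
uint64_t hugepage_pool_bytes(uint64_t nr_pages)
{
    return nr_pages * HUGEPAGE_SIZE;
}

/* Minimum number of hugepages needed to hold `bytes` (rounded up). */
uint64_t hugepages_needed(uint64_t bytes)
{
    return (bytes + HUGEPAGE_SIZE - 1) / HUGEPAGE_SIZE;
}
```

1024 pages give 2 GB and 2048 pages give 4 GB, while the failing request
of 12966528 bytes fits in 7 pages. The failure therefore points at the
pool already being consumed elsewhere (here, by the mempools) rather than
at the size of the ACL table itself.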

Again, do you rebuild your table after every rule you add?
If so, that seems a rather strange approach (and definitely not the fastest
one).
>>Yes, we are rebuilding the rules every time, for 2 reasons:
>>1. Our application gives the full list of rules every time you add a new rule.
>>2. There is no way to delete a specific rule in the trie. Is there any way to
>>delete a specific ACL rule?


What you can do instead: create context; add all your rules into it; build; 
>>> By following the same approach (what I explained above, rebuilding the ACL
>>> trie every time), can we fix this memory allocation issue?
>>> If yes, please provide me some pointers to modify the code.

> 
> Logically it did not increase the number of rules [expected 2*817, but only 950
> were added]. Is it really using huge pages memory only?
> 
> From the code it looks like heap memory. [ ret = 
> malloc_heap_alloc(&mcfg->malloc_heaps[i], type, size, 0, align == 0 ?
> 1 : align, 0) ]

As I can see from the log it fails at GEN phase, when trying to allocate 
hugepages for RT table.
At lib/librte_acl/acl_gen.c:509

rte_acl_gen(struct rte_acl_ctx *ctx, struct rte_acl_trie *trie,
	struct rte_acl_bld_trie *node_bld_trie, uint32_t num_tries,
	uint32_t num_categories, uint32_t data_index_sz, size_t max_size)
{
	...
	mem = rte_zmalloc_socket(ctx->name, total_size, RTE_CACHE_LINE_SIZE,
			ctx->socket_id);
	if (mem == NULL) {
		RTE_LOG(ERR, ACL,
			"allocation of %zu bytes on socket %d for %s failed\n",
			total_size, ctx->socket_id, ctx->name);
		return -ENOMEM;
	}
>>> Is there any way to reserve some particular amount of huge page memory for 
>>> ACL trie (in eal_init())?

Konstantin

> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Rapelly, Varun
> > Sent: Friday, February 26, 2016 10:28 AM
> > To: dev at dpdk.org
> > Subject: Re: [dpdk-dev] ACL memory allocation failures
> >
> > Hi All,
> >
> > When I'm trying to configure some 5000+ ACL rules with different 
> > source IP addresses, getting ACL memory allocation failure. I'm using DPDK 
> > 2.1.
> >
> > [root at ACLISSUE log_2015_10_26_08_19_42]# vim np.log
> > match nodes/bytes used: 816/104448
> > total: 12940832 bytes
> > ACL: Build phase for ACL "ipv4_acl_table2":
> > memory consumed: 947913495
> > ACL: trie 0: number of rules: 816
> > ACL: allocation of 12966528 bytes on socket 0 for ipv4_acl_table1 failed
> > ACL: Build phase for ACL "ipv4_acl_table1":
> > memory consumed: 947913495
> > ACL: trie 0: number of rules: 817
> > EAL: Error - exiting with code: 1
> >   Cause: Failed to build ACL trie
> >
> > Again sourced the ACL config file. After adding around 77 more rules, the
> > same error came again.
> >
> > total: 14912784 bytes
> > ACL: Build phase for ACL "ipv4_acl_table1":
> > memory consumed: 1040188260
> > ACL: trie 0: number of rules: 893
> > ACL: allocation of 14938480 bytes on socket 0 for ipv4_acl_table2 failed
> 
> You are running out of hugepages memory.
> 
> > ACL: Build phase for ACL "ipv4_acl_table2":
> > memory consumed: 1040188260
> > ACL: trie 0: number of rules: 894
> > EAL: Error - exiting with code: 1
> >   Cause: Failed to build ACL trie
> >
> > Where to increase the memory to avoid this issue?
> 
>  Refer to:
> http://dpdk.org/doc/guides/linux_gsg/sys_reqs.html#running-dpdk-applications
> Section 2.3.2
> 
> Konstantin



[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module

2016-02-29 Thread Thomas Monjalon
Hi,
I totally agree with Avi's comments.
This topic is really important for the future of DPDK.
So I think we must give some time to continue the discussion
and have netdev involved in the choices made.
As a consequence, these series should not be merged in the release 16.04.
Thanks for continuing the work.


2016-02-29 12:58, Avi Kivity:
> On 02/29/2016 12:43 PM, Ferruh Yigit wrote:
> > On 2/29/2016 9:43 AM, Avi Kivity wrote:
> >> On 02/28/2016 10:16 PM, Ferruh Yigit wrote:
> >>> On 2/28/2016 3:34 PM, Avi Kivity wrote:
> >>>> On 01/27/2016 06:24 PM, Ferruh Yigit wrote:
> >>>>> This kernel module is based on KNI module, but this one is stripped
> >>>>> version of it and only for control messages, no data transfer
> >>>>> functionality provided.
> >>>>>
> >>>>> This Linux kernel module helps userspace application create virtual
> >>>>> interfaces and when a control command issued into that virtual
> >>>>> interface, module pushes the command to the userspace and gets the
> >>>>> response back for the caller application.
> >>>>>
> >>>>> The Linux tools like ethtool/ifconfig/ip can be used on virtual
> >>>>> interfaces but not ones for related data, like tcpdump.
> >>>>>
> >>>>> In long term this patch intends to replace the KNI and KNI will be
> >>>>> deprecated.
> >>>> Instead of adding yet another out-of-tree kernel module, why not extend
> >>>> the existing in-tree tap driver?  This will make everyone's life easier.
> >>>>
> >>>> Since tap also supports data transfer, an application can also forward
> >>>> packets not intended to it to the kernel, and forward packets from the
> >>>> kernel through the device.
> >>>>
> >>> Hi Avi,
> >>>
> >>> KDP (Kernel Data Path) does what you have described, it is implemented
> >>> as PMD and it benefits from tap driver to data transfer through the
> >>> kernel. It also support custom kernel module for better performance.
> >>>
> >>> For KCP (Kernel Control Path), network driver forwards control commands
> >>> to the userspace driver, I doubt this is something wanted for tun/tap
> >>> driver, so extending tun/tap driver like this can be hard to upstream.
> >> Have you tried asking?  Maybe if you explain it they will be open to the
> >> extension.
> >>
> > Not communicated but tun/tap already doing something different.
> > For KCP, created interface is map of the DPDK port. All data interface
> > shows coming from DPDK port. For example if you get stats information
> > with ifconfig, the values you observe are DPDK port statistics -not
> > statistics of data between userspace and kernelspace, statistics of data
> > forwarded between DPDK ports. If you down the interface, DPDK port
> > stopped, etc...
> >
> > If you extend the tun/tap, it won't be map of the DPDK port, and if you
> > get statistics information from that interface, what do you expect to
> > see, the data transferred between kernel and userspace, or underlying
> > DPDK port forwarding statistics?
> 
> Good point.  But you really have to involve netdev on this, or you'll 
> live out-of-tree forever.

+1

> > Extending tun/tap in a way we want, forwarding all control commands to
> > userspace, will break the current tun/tap, this doesn't looks like a
> > valid option to me.
> 
> It's possible to enhance it while preserving backwards compatibility, by 
> enabling a feature flag (statistics from userspace).

+1

> > For data path, using tun/tap is OK and we are already doing it, for the
> > control path I believe we need a new driver.
> >
> >> Certainly it will be better to have KCP and KDP use the same kernel
> >> interface name; so we'll need to either add data path support to kcp
> >> (causing duplication with tap), or add control path support to tap. I
> >> think the latter is preferable.
> >>
> > Why it is better to have same interface? Anyone who is not interested
> > with kernel data path may want to control DPDK ports using common tools,
> > or want to get some basic information and stats using ethtool or
> > ifconfig. Why we need to bind two different functionality together?
> 
> Having two interfaces will be confusing for the user.  If I wish to 
> firewall data packets coming from the dpdk port, do I set firewall rules 
> on dpdk0 or tap0?

+1

> I don't think it matters whether you extend tap, or add a data path to 
> kcp, but if you want to upstream it, it needs to be blessed by netdev.

+1

> >>> We are investigating about adding a native support to Linux kernel for
> >>> KCP, but there is no task started for this right now, any support is
> >>> welcome.



[dpdk-dev] [PATCH v3] examples/l3fwd: exact-match rework

2016-02-29 Thread Thomas Monjalon
2016-02-29 11:33, Tomasz Kulasek:
> The current implementation of Exact-Match uses a different execution path than
> the one for LPM. Unifying them allows reusing a big part of the LPM code and
> slightly increases the performance of Exact-Match.
> 
> Main changes:
> -------------
> * Packet classification stage is separated from the rest of path for both
>   LPM and EM.
> * Packet processing, modifying and transmit part is the same for LPM and EM
>   and mostly based on the current LPM implementation.
> * Shared code is moved to the common file "l3fwd_sse.h".
> * Since sequential packet classification in the EM path seems to be faster
>   than the multi hash lookup used before, it is used by default. The old
>   implementation is moved to the file l3fwd_em_hlm_sse.h and can be enabled
>   with the HASH_LOOKUP_MULTI global define at compilation time.
> 
> This patch depends on Ravi Kerur's "Modify and modularize l3fwd code" and
> should be applied after it.
> 
> Changes in v3:
>  - fixed error: unused function 'l3fwd_em_simple_forward'. This function is
>    used only in l3fwd_em_no_opt_send_packets, and after moving it to new
>    header file l3fwd_em.h in Ravi's patch, also should be moved there.
> 
> Changes in v2:
>  - patch rebase to be applicable on top of "Modify and modularize l3fwd
>    code" v3
> 
> Signed-off-by: Tomasz Kulasek 
> Acked-by: Konstantin Ananyev 

Applied, thanks


[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module

2016-02-29 Thread Avi Kivity
On 02/28/2016 10:16 PM, Ferruh Yigit wrote:
> On 2/28/2016 3:34 PM, Avi Kivity wrote:
>> On 01/27/2016 06:24 PM, Ferruh Yigit wrote:
>>> This kernel module is based on KNI module, but this one is stripped
>>> version of it and only for control messages, no data transfer
>>> functionality provided.
>>>
>>> This Linux kernel module helps userspace application create virtual
>>> interfaces and when a control command issued into that virtual
>>> interface, module pushes the command to the userspace and gets the
>>> response back for the caller application.
>>>
>>> The Linux tools like ethtool/ifconfig/ip can be used on virtual
>>> interfaces but not ones for related data, like tcpdump.
>>>
>>> In long term this patch intends to replace the KNI and KNI will be
>>> deprecated.
>> Instead of adding yet another out-of-tree kernel module, why not extend
>> the existing in-tree tap driver?  This will make everyone's life easier.
>>
>> Since tap also supports data transfer, an application can also forward
>> packets not intended to it to the kernel, and forward packets from the
>> kernel through the device.
>>
> Hi Avi,
>
> KDP (Kernel Data Path) does what you have described, it is implemented
> as PMD and it benefits from tap driver to data transfer through the
> kernel. It also support custom kernel module for better performance.
>
> For KCP (Kernel Control Path), network driver forwards control commands
> to the userspace driver, I doubt this is something wanted for tun/tap
> driver, so extending tun/tap driver like this can be hard to upstream.

Have you tried asking?  Maybe if you explain it they will be open to the 
extension.

Certainly it will be better to have KCP and KDP use the same kernel 
interface name; so we'll need to either add data path support to kcp 
(causing duplication with tap), or add control path support to tap. I 
think the latter is preferable.

> We are investigating about adding a native support to Linux kernel for
> KCP, but there is no task started for this right now, any support is
> welcome.
>
>



[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module

2016-02-29 Thread Ferruh Yigit
On 2/29/2016 11:06 AM, Thomas Monjalon wrote:
> Hi,
> I totally agree with Avi's comments.
> This topic is really important for the future of DPDK.
> So I think we must give some time to continue the discussion
> and have netdev involved in the choices done.
> As a consequence, these series should not be merged in the release 16.04.
> Thanks for continuing the work.
> 
Hi Thomas,

It is great to have some discussion and feedback.
But I doubt that not merging in this release will help to have more discussion.

It is better to have them in this release and let people experiment with them;
this gives a better chance of a good discussion.

These features are a replacement for KNI, and KNI is not intended to be
removed in this release, so those who are using KNI as a solution can continue
to use KNI and can test KCP/KDP, so that we can get more feedback.

Thanks,
ferruh



[dpdk-dev] [PATCH v3] examples/l3fwd: exact-match rework

2016-02-29 Thread Tomasz Kulasek
The current implementation of Exact-Match uses a different execution path than
the one for LPM. Unifying them allows reusing a big part of the LPM code and
slightly increases the performance of Exact-Match.

Main changes:
-------------
* Packet classification stage is separated from the rest of path for both
  LPM and EM.
* Packet processing, modifying and transmit part is the same for LPM and EM
  and mostly based on the current LPM implementation.
* Shared code is moved to the common file "l3fwd_sse.h".
* Since sequential packet classification in the EM path seems to be faster
  than the multi hash lookup used before, it is used by default. The old
  implementation is moved to the file l3fwd_em_hlm_sse.h and can be enabled
  with the HASH_LOOKUP_MULTI global define at compilation time.

This patch depends on Ravi Kerur's "Modify and modularize l3fwd code" and
should be applied after it.

Changes in v3:
 - fixed error: unused function 'l3fwd_em_simple_forward'. This function is
   used only in l3fwd_em_no_opt_send_packets, and after moving it to new
   header file l3fwd_em.h in Ravi's patch, also should be moved there.

Changes in v2:
 - patch rebase to be applicable on top of "Modify and modularize l3fwd
   code" v3

Signed-off-by: Tomasz Kulasek 
Acked-by: Konstantin Ananyev 
---
 examples/l3fwd/l3fwd.h            |   8 +
 examples/l3fwd/l3fwd_em.c         |  80 +-
 examples/l3fwd/l3fwd_em.h         |  68 +
 examples/l3fwd/l3fwd_em_hlm_sse.h | 341 +
 examples/l3fwd/l3fwd_em_sse.h     | 447 +++-
 examples/l3fwd/l3fwd_lpm.c        |  15 +-
 examples/l3fwd/l3fwd_lpm_sse.h    | 507 -
 examples/l3fwd/l3fwd_sse.h        | 501
 8 files changed, 1011 insertions(+), 956 deletions(-)
 create mode 100644 examples/l3fwd/l3fwd_em_hlm_sse.h
 create mode 100644 examples/l3fwd/l3fwd_sse.h

diff --git a/examples/l3fwd/l3fwd.h b/examples/l3fwd/l3fwd.h
index f450269..da6d369 100644
--- a/examples/l3fwd/l3fwd.h
+++ b/examples/l3fwd/l3fwd.h
@@ -53,6 +53,14 @@
 /* Configure how many packets ahead to prefetch, when reading packets */
 #define PREFETCH_OFFSET  3

+/* Used to mark destination port as 'invalid'. */
+#define BAD_PORT ((uint16_t)-1)
+
+#define FWDSTEP 4
+
+/* replace first 12B of the ethernet header. */
+#define MASK_ETH 0x3f
+
 /* Hash parameters. */
 #ifdef RTE_ARCH_X86_64
 /* default to 4 million hash entries (approx) */
diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c
index ace06cf..f6a65d8 100644
--- a/examples/l3fwd/l3fwd_em.c
+++ b/examples/l3fwd/l3fwd_em.c
@@ -300,81 +300,17 @@ em_get_ipv6_dst_port(void *ipv6_hdr,  uint8_t portid, 
void *lookup_struct)
return (uint8_t)((ret < 0) ? portid : ipv6_l3fwd_out_if[ret]);
 }

-static inline __attribute__((always_inline)) void
-l3fwd_em_simple_forward(struct rte_mbuf *m, uint8_t portid,
-   struct lcore_conf *qconf)
-{
-   struct ether_hdr *eth_hdr;
-   struct ipv4_hdr *ipv4_hdr;
-   uint8_t dst_port;
-
-   eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
-
-   if (RTE_ETH_IS_IPV4_HDR(m->packet_type)) {
-   /* Handle IPv4 headers.*/
-   ipv4_hdr = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *,
-  sizeof(struct ether_hdr));
-
-#ifdef DO_RFC_1812_CHECKS
-   /* Check to make sure the packet is valid (RFC1812) */
-   if (is_valid_ipv4_pkt(ipv4_hdr, m->pkt_len) < 0) {
-   rte_pktmbuf_free(m);
-   return;
-   }
-#endif
-dst_port = em_get_ipv4_dst_port(ipv4_hdr, portid,
-   qconf->ipv4_lookup_struct);
-
-   if (dst_port >= RTE_MAX_ETHPORTS ||
-   (enabled_port_mask & 1 << dst_port) == 0)
-   dst_port = portid;
-
-#ifdef DO_RFC_1812_CHECKS
-   /* Update time to live and header checksum */
-   --(ipv4_hdr->time_to_live);
-   ++(ipv4_hdr->hdr_checksum);
-#endif
-   /* dst addr */
-   *(uint64_t *)&eth_hdr->d_addr = dest_eth_addr[dst_port];
-
-   /* src addr */
-   ether_addr_copy(&ports_eth_addr[dst_port], &eth_hdr->s_addr);
-
-   send_single_packet(qconf, m, dst_port);
-   } else if (RTE_ETH_IS_IPV6_HDR(m->packet_type)) {
-   /* Handle IPv6 headers.*/
-   struct ipv6_hdr *ipv6_hdr;
-
-   ipv6_hdr = rte_pktmbuf_mtod_offset(m, struct ipv6_hdr *,
-  sizeof(struct ether_hdr));
-
-   dst_port = em_get_ipv6_dst_port(ipv6_hdr, portid,
-   qconf->ipv6_lookup_struct);
-
-   if (dst_port >= RTE_MAX_ETHPORTS ||
-   (enabled_port_mask & 1 << dst_port) == 0)
-   dst_port = 

[dpdk-dev] [PATCH] doc/nic: add ixgbe statistics on read frequency

2016-02-29 Thread Harry van Haaren
This patch adds a note to the ixgbe PMD guide, stating
the maximum interval at which statistics must be polled from
the hardware in order to avoid register values becoming
saturated and "sticking" to the max value.

Signed-off-by: Harry van Haaren 
---
 doc/guides/nics/ixgbe.rst | 24 
 1 file changed, 24 insertions(+)

diff --git a/doc/guides/nics/ixgbe.rst b/doc/guides/nics/ixgbe.rst
index 8cae299..c8085a8 100644
--- a/doc/guides/nics/ixgbe.rst
+++ b/doc/guides/nics/ixgbe.rst
@@ -178,3 +178,27 @@ load_balancer

 As in the case of l3fwd, set configure port_conf.rxmode.hw_ip_checksum=0 to 
enable vPMD.
 In addition, for improved performance, use -bsz "(32,32),(64,64),(32,32)" in 
load_balancer to avoid using the default burst size of 144.
+
+Statistics
+----------
+
+The statistics of ixgbe hardware must be polled regularly in order for them to
+remain consistent. Running a DPDK application without polling the statistics will
+cause registers on hardware to count to their maximum value, and "stick" at
+that value.
+
+In order to avoid statistic registers ever reaching their maximum value,
+read the statistics from the hardware using ``rte_eth_stats_get()`` or
+``rte_eth_xstats_get()``.
+
+The maximum time between statistics polls that ensures consistent results can
+be calculated as follows:
+
+.. code-block:: c
+
+  max_read_interval = UINT_MAX / max_packets_per_second
+  max_read_interval = 4294967295 / 14880952
+  max_read_interval = 288.6218096127183 (seconds)
+  max_read_interval = ~4 mins 48 sec.
+
+In order to ensure valid results, it is recommended to poll every 4 minutes.
-- 
2.5.0



[dpdk-dev] [PATCH v6] cfgfile: support looking up sections by index

2016-02-29 Thread Thomas Monjalon
> > This is useful when sections have duplicate names.
> > 
> > Signed-off-by: Rich Lane 
> > ---
> > v5->v6:
> > - Reordered sectionname argument in comment.
> 
> Acked-by: Cristian Dumitrescu 
> 
> Thanks, Rich!

Applied, thanks


[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module

2016-02-29 Thread Ferruh Yigit
On 2/29/2016 10:58 AM, Avi Kivity wrote:
> 
> 
> On 02/29/2016 12:43 PM, Ferruh Yigit wrote:
>> On 2/29/2016 9:43 AM, Avi Kivity wrote:
>>> On 02/28/2016 10:16 PM, Ferruh Yigit wrote:
>>>> On 2/28/2016 3:34 PM, Avi Kivity wrote:
>>>>> On 01/27/2016 06:24 PM, Ferruh Yigit wrote:
>>>>>> This kernel module is based on KNI module, but this one is stripped
>>>>>> version of it and only for control messages, no data transfer
>>>>>> functionality provided.
>>>>>>
>>>>>> This Linux kernel module helps userspace application create virtual
>>>>>> interfaces and when a control command issued into that virtual
>>>>>> interface, module pushes the command to the userspace and gets the
>>>>>> response back for the caller application.
>>>>>>
>>>>>> The Linux tools like ethtool/ifconfig/ip can be used on virtual
>>>>>> interfaces but not ones for related data, like tcpdump.
>>>>>>
>>>>>> In long term this patch intends to replace the KNI and KNI will be
>>>>>> deprecated.
>>>>> Instead of adding yet another out-of-tree kernel module, why not extend
>>>>> the existing in-tree tap driver?  This will make everyone's life easier.
>>>>>
>>>>> Since tap also supports data transfer, an application can also forward
>>>>> packets not intended to it to the kernel, and forward packets from the
>>>>> kernel through the device.
>>>>>
>>>> Hi Avi,
>>>>
>>>> KDP (Kernel Data Path) does what you have described, it is implemented
>>>> as PMD and it benefits from tap driver to data transfer through the
>>>> kernel. It also support custom kernel module for better performance.
>>>>
>>>> For KCP (Kernel Control Path), network driver forwards control commands
>>>> to the userspace driver, I doubt this is something wanted for tun/tap
>>>> driver, so extending tun/tap driver like this can be hard to upstream.
>>> Have you tried asking?  Maybe if you explain it they will be open to the
>>> extension.
>>>
>> Not communicated but tun/tap already doing something different.
>> For KCP, created interface is map of the DPDK port. All data interface
>> shows coming from DPDK port. For example if you get stats information
>> with ifconfig, the values you observe are DPDK port statistics -not
>> statistics of data between userspace and kernelspace, statistics of data
>> forwarded between DPDK ports. If you down the interface, DPDK port
>> stopped, etc...
>>
>> If you extend the tun/tap, it won't be map of the DPDK port, and if you
>> get statistics information from that interface, what do you expect to
>> see, the data transferred between kernel and userspace, or underlying
>> DPDK port forwarding statistics?
> 
> Good point.  But you really have to involve netdev on this, or you'll
> live out-of-tree forever.
> 
Why do we need to touch netdev?

A simple network driver, similar to kcp, can be a solution.

This driver implements all net_device_ops and ethtool_ops in a way that
forwards everything to the userspace via netlink. All it needs to know about
the userspace driver is its unique id. Any userspace application, not only
DPDK drivers, can listen to the netlink messages and respond to the
requests that come to it.

This kind of driver is not big or complicated; kcp already does 90% of
what is described above.

>> Extending tun/tap in a way we want, forwarding all control commands to
>> userspace, will break the current tun/tap, this doesn't looks like a
>> valid option to me.
> 
> It's possible to enhance it while preserving backwards compatibility, by
> enabling a feature flag (statistics from userspace).
> 
>> For data path, using tun/tap is OK and we are already doing it, for the
>> control path I believe we need a new driver.
>>
>>> Certainly it will be better to have KCP and KDP use the same kernel
>>> interface name; so we'll need to either add data path support to kcp
>>> (causing duplication with tap), or add control path support to tap. I
>>> think the latter is preferable.
>>>
>> Why it is better to have same interface? Anyone who is not interested
>> with kernel data path may want to control DPDK ports using common tools,
>> or want to get some basic information and stats using ethtool or
>> ifconfig. Why we need to bind two different functionality together?
> 
> Having two interfaces will be confusing for the user.  If I wish to
> firewall data packets coming from the dpdk port, do I set firewall rules
> on dpdk0 or tap0?
> 
Agreed that it is confusing to have two interfaces.

I think if the user wants to use both data and control paths, a way can be
found to end up with a single interface, using module params or something
else. Two different drivers for data and control do not conflict with each
other and can cooperate.
But to work on this, first both KCP and KDP should go in.

> I don't think it matters whether you extend tap, or add a data path to
> kcp, but if you want to upstream it, it needs to be blessed by netdev.
> 
I still think it is not a good idea to merge these two, because they may be used
independently, but we can improve how they work together.


[dpdk-dev] [PATCH v3 1/1] jobstats: added function abort for job

2016-02-29 Thread Thomas Monjalon
2016-02-16 13:19, Zhang, Roy Fan:
> 
> On 12/02/2016 16:04, Marcin Kerlin wrote:
> > This patch adds a new function, rte_jobstats_abort. It marks *job* as finished,
> > and the time of this work will be added to management time instead of
> > execution time. This function should be used instead of rte_jobstats_finish
> > if a condition occurs; the condition is defined by the application, for
> > example when receiving n>0 packets. An example of usage is added to the
> > l2fwd-jobstats example. At maximum load the do-while loop inside the Idle job
> > will be executed once because one or more jobs are waiting to be executed, so
> > this time should not be included in the execution time; this is done by
> > calling rte_jobstats_abort().
> >
> > v2:
> > * removed redundant field
> > v3:
> > * added an example of using
> >
> > Signed-off-by: Marcin Kerlin 
[...]
> > --- a/lib/librte_jobstats/rte_jobstats_version.map
> > +++ b/lib/librte_jobstats/rte_jobstats_version.map
> > @@ -17,3 +17,10 @@ DPDK_2.0 {
> >   
> > local: *;
> >   };
> > +
> > +DPDK_2.3 {

updated to 16.04

> > +   global:
> > +
> > +   rte_jobstats_abort;
> > +
> > +} DPDK_2.0;
> 
> Acked-by : Fan Zhang

Applied, thanks
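
The accounting difference between finishing and aborting a job, as the
patch description puts it, can be modelled in a few lines. This is a toy
model and not the rte_jobstats API: struct job_account and the account_*()
helpers are hypothetical, and the "loop ran only once" policy is one
reading of the Idle job example in the description.

```c
#include <stdint.h>

/* Hypothetical accounting record; not the layout of struct rte_jobstats. */
struct job_account {
    uint64_t exec_time;       /* time counted as job execution */
    uint64_t management_time; /* time counted as management overhead */
};

/* Job ran normally: charge the elapsed time to execution. */
void account_finish(struct job_account *a, uint64_t elapsed)
{
    a->exec_time += elapsed;
}

/* Job was aborted: charge the elapsed time to management instead. */
void account_abort(struct job_account *a, uint64_t elapsed)
{
    a->management_time += elapsed;
}

/*
 * Idle job policy from the description: if the idle loop ran only once,
 * other jobs were already waiting, so the time is not counted as execution.
 */
void account_idle(struct job_account *a, unsigned int repeats,
                  uint64_t elapsed)
{
    if (repeats == 1)
        account_abort(a, elapsed);
    else
        account_finish(a, elapsed);
}
```

In this model an idle pass that ran once adds its cycles to management
time, while a pass that actually spun for several iterations adds them to
execution time.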


[dpdk-dev] [PATCH 1/6] mempool: add external mempool manager support

2016-02-29 Thread Hunt, David

On 2/19/2016 1:30 PM, Olivier MATZ wrote:
> Hi David,
>
> On 02/16/2016 03:48 PM, David Hunt wrote:
>> Adds the new rte_mempool_create_ext api and callback mechanism for
>> external mempool handlers
>>
>> Modifies the existing rte_mempool_create to set up the handler_idx to
>> the relevant mempool handler based on the handler name:
>>  ring_sp_sc
>>  ring_mp_mc
>>  ring_sp_mc
>>  ring_mp_sc
>>
>> v2: merges the duplicated code in rte_mempool_xmem_create and
>> rte_mempool_create_ext into one common function. The old functions
>> now call the new common function with the relevant parameters.
>>
>> Signed-off-by: David Hunt 
> I think the refactoring of rte_mempool_create() (adding of
> mempool_create()) should go in another commit. It will make the
> patches much easier to read.
>
> Also, I'm sorry but it seems that several comments or questions I've made
> in http://dpdk.org/ml/archives/dev/2016-February/032706.html are
> not addressed.
>
> Examples:
> - putting some part of the patch in separate commits
> - meaning of "rt_pool"
> - put_pool_bulk unclear comment
> - should we also have get_pool_bulk stats?
> - missing _MEMPOOL_STAT_ADD() in mempool_bulk()
> - why internal in rte_mempool_internal.h?
> - why default in rte_mempool_default.c?
> - remaining references to stack handler (in a comment)
> - ...?
>
> As you know, doing a proper code review takes a lot of time. If I
> have to re-check all of my previous comments, it will take even
> more. I'm not saying all my comments require a code change, but in case
> you don't agree, please at least explain your opinion so we can debate
> on the list.
>
Hi Olivier,
Sincerest apologies. I had intended to come back around to your
original comments after refactoring the code. I will do that now. I did
take them into consideration, but I see now that I need to do further
work, such as a clearer name for rt_pool, etc. I will respond to your
original email.
Thanks
David.


[dpdk-dev] [PATCH 2/6] mempool: add stack (lifo) based external mempool handler

2016-02-29 Thread Hunt, David

On 2/19/2016 1:31 PM, Olivier MATZ wrote:
> Hi David,
>
> On 02/16/2016 03:48 PM, David Hunt wrote:
>> adds a simple stack based mempool handler
>>
>> Signed-off-by: David Hunt 
>> ---
>>   lib/librte_mempool/Makefile|   2 +-
>>   lib/librte_mempool/rte_mempool.c   |   4 +-
>>   lib/librte_mempool/rte_mempool.h   |   1 +
>>   lib/librte_mempool/rte_mempool_stack.c | 164 +
>>   4 files changed, 169 insertions(+), 2 deletions(-)
>>   create mode 100644 lib/librte_mempool/rte_mempool_stack.c
>>
> I don't get what is the purpose of this handler. Is it an example
> or is it something that could be useful for dpdk applications?
>
> If it's an example, we should find a way to put the code outside
> the librte_mempool library, maybe in the test program. I see there
> is also a "custom handler". Do we really need to have both?
They are both example handlers. I agree that we could reduce down to 
one, and since the 'custom' handler has autotests, I would suggest we 
keep that one.

The next question is where it should live. I agree that it's not ideal 
to have example code living in the same directory as the mempool 
library, but they are an integral part of the library itself. How about 
creating a handlers sub-directory? We could then keep all additional and 
sample handlers in there, away from the built-in handlers. Also, seeing 
as the handler code is intended to be part of the library, I think 
moving it out to the examples directory may confuse matters further.

Regards,
David.



[dpdk-dev] [PATCH 0/6] external mempool manager

2016-02-29 Thread Hunt, David


On 2/19/2016 1:25 PM, Olivier MATZ wrote:
> Hi,
>
> On 02/16/2016 03:48 PM, David Hunt wrote:
>> Hi list.
>>
>> Here's the v2 version of a proposed patch for an external mempool manager
> Just to notice the "v2" is missing in the title, it would help
> to have it for next versions of the series.
>
Thanks, Olivier, I will ensure it's in the next patchset.
Regards,
David.



[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module

2016-02-29 Thread Ferruh Yigit
On 2/29/2016 9:43 AM, Avi Kivity wrote:
> On 02/28/2016 10:16 PM, Ferruh Yigit wrote:
>> On 2/28/2016 3:34 PM, Avi Kivity wrote:
>>> On 01/27/2016 06:24 PM, Ferruh Yigit wrote:
>>>> This kernel module is based on the KNI module, but this one is a stripped
>>>> version of it and only for control messages; no data transfer
>>>> functionality is provided.
>>>>
>>>> This Linux kernel module helps a userspace application create virtual
>>>> interfaces, and when a control command is issued on such a virtual
>>>> interface, the module pushes the command to userspace and gets the
>>>> response back for the caller application.
>>>>
>>>> Linux tools like ethtool/ifconfig/ip can be used on the virtual
>>>> interfaces, but not data-related ones like tcpdump.
>>>>
>>>> In the long term this patch intends to replace KNI, and KNI will be
>>>> deprecated.
>>> Instead of adding yet another out-of-tree kernel module, why not extend
>>> the existing in-tree tap driver?  This will make everyone's life easier.
>>>
>>> Since tap also supports data transfer, an application can also forward
>>> packets not intended to it to the kernel, and forward packets from the
>>> kernel through the device.
>>>
>> Hi Avi,
>>
>> KDP (Kernel Data Path) does what you have described, it is implemented
>> as PMD and it benefits from tap driver to data transfer through the
>> kernel. It also support custom kernel module for better performance.
>>
>> For KCP (Kernel Control Path), network driver forwards control commands
>> to the userspace driver, I doubt this is something wanted for tun/tap
>> driver, so extending tun/tap driver like this can be hard to upstream.
> 
> Have you tried asking?  Maybe if you explain it they will be open to the
> extension.
> 
I have not asked, but tun/tap is already doing something different.
For KCP, the created interface is a map of the DPDK port. All data the
interface shows comes from the DPDK port. For example, if you get stats
information with ifconfig, the values you observe are DPDK port
statistics: not statistics of data between userspace and kernelspace,
but statistics of data forwarded between DPDK ports. If you bring the
interface down, the DPDK port is stopped, etc...

If you extend tun/tap, it won't be a map of the DPDK port, and if you
get statistics information from that interface, what do you expect to
see: the data transferred between kernel and userspace, or the underlying
DPDK port forwarding statistics?

Extending tun/tap in the way we want, forwarding all control commands to
userspace, will break the current tun/tap; this doesn't look like a
valid option to me.

For data path, using tun/tap is OK and we are already doing it, for the
control path I believe we need a new driver.

> Certainly it will be better to have KCP and KDP use the same kernel
> interface name; so we'll need to either add data path support to kcp
> (causing duplication with tap), or add control path support to tap. I
> think the latter is preferable.
> 
Why is it better to have the same interface? Anyone who is not interested
in the kernel data path may still want to control DPDK ports using common
tools, or want to get some basic information and stats using ethtool or
ifconfig. Why do we need to bind two different functionalities together?

>> We are investigating about adding a native support to Linux kernel for
>> KCP, but there is no task started for this right now, any support is
>> welcome.
>>
>>
> 



[dpdk-dev] [PATCH v2] I217 and I218 changes

2016-02-29 Thread Ravi Kerur
v2:
Incorporate Wenzhou's comments
Compiled and tested (via testpmd) on Ubuntu 14.04 on target
x86_64-native-linuxapp-gcc
Compiled for target x86_64-native-linuxapp-clang

v1:
Modified driver and eal code to recognize and support I217 and
I218 Intel NICs.
Compiled and tested (via testpmd) on Ubuntu 14.04 for target
x86_64-native-linuxapp-gcc
Compiled for target x86_64-native-linuxapp-clang

Signed-off-by: Ravi Kerur 
---
 drivers/net/e1000/base/e1000_osdep.h| 26 +++-
 drivers/net/e1000/em_ethdev.c   | 32 +
 lib/librte_eal/common/include/rte_pci_dev_ids.h |  9 +++
 3 files changed, 61 insertions(+), 6 deletions(-)

diff --git a/drivers/net/e1000/base/e1000_osdep.h b/drivers/net/e1000/base/e1000_osdep.h
index b2c76e3..47a1948 100644
--- a/drivers/net/e1000/base/e1000_osdep.h
+++ b/drivers/net/e1000/base/e1000_osdep.h
@@ -96,21 +96,35 @@ typedef int bool;

 #define E1000_PCI_REG(reg) (*((volatile uint32_t *)(reg)))

+#define E1000_PCI_REG16(reg) (*((volatile uint16_t *)(reg)))
+
 #define E1000_PCI_REG_WRITE(reg, value) do { \
E1000_PCI_REG((reg)) = (rte_cpu_to_le_32(value)); \
 } while (0)

+#define E1000_PCI_REG_WRITE16(reg, value) do { \
+   E1000_PCI_REG16((reg)) = (rte_cpu_to_le_16(value)); \
+} while (0)
+
 #define E1000_PCI_REG_ADDR(hw, reg) \
((volatile uint32_t *)((char *)(hw)->hw_addr + (reg)))

 #define E1000_PCI_REG_ARRAY_ADDR(hw, reg, index) \
E1000_PCI_REG_ADDR((hw), (reg) + ((index) << 2))

-static inline uint32_t e1000_read_addr(volatile void* addr)
+#define E1000_PCI_REG_FLASH_ADDR(hw, reg) \
+   ((volatile uint32_t *)((char *)(hw)->flash_address + (reg)))
+
+static inline uint32_t e1000_read_addr(volatile void *addr)
 {
return rte_le_to_cpu_32(E1000_PCI_REG(addr));
 }

+static inline uint16_t e1000_read_addr16(volatile void *addr)
+{
+   return rte_le_to_cpu_16(E1000_PCI_REG16(addr));
+}
+
 /* Necessary defines */
 #define E1000_MRQC_ENABLE_MASK  0x0007
 #define E1000_MRQC_RSS_FIELD_IPV6_EX   0x0008
@@ -155,20 +169,20 @@ static inline uint32_t e1000_read_addr(volatile void* addr)
E1000_WRITE_REG(hw, reg, value)

 /*
- * Not implemented.
+ * Tested on I217/I218 chipset.
  */

 #define E1000_READ_FLASH_REG(hw, reg) \
-   (E1000_ACCESS_PANIC(E1000_READ_FLASH_REG, hw, reg, 0), 0)
+   e1000_read_addr(E1000_PCI_REG_FLASH_ADDR((hw), (reg)))

 #define E1000_READ_FLASH_REG16(hw, reg)  \
-   (E1000_ACCESS_PANIC(E1000_READ_FLASH_REG16, hw, reg, 0), 0)
+   e1000_read_addr16(E1000_PCI_REG_FLASH_ADDR((hw), (reg)))

 #define E1000_WRITE_FLASH_REG(hw, reg, value)  \
-   E1000_ACCESS_PANIC(E1000_WRITE_FLASH_REG, hw, reg, value)
+   E1000_PCI_REG_WRITE(E1000_PCI_REG_FLASH_ADDR((hw), (reg)), (value))

 #define E1000_WRITE_FLASH_REG16(hw, reg, value) \
-   E1000_ACCESS_PANIC(E1000_WRITE_FLASH_REG16, hw, reg, value)
+   E1000_PCI_REG_WRITE16(E1000_PCI_REG_FLASH_ADDR((hw), (reg)), (value))

 #define STATIC static

diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index 4a843fe..a8c26ed 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -231,6 +231,32 @@ rte_em_dev_atomic_write_link_status(struct rte_eth_dev *dev,
return 0;
 }

+/**
+ *  eth_em_dev_is_ich8 - Check for ICH8 device
+ *  @hw: pointer to the HW structure
+ *
+ *  return TRUE for ICH8, otherwise FALSE
+ **/
+static bool
+eth_em_dev_is_ich8(struct e1000_hw *hw)
+{
+   DEBUGFUNC("eth_em_dev_is_ich8");
+
+   switch (hw->device_id) {
+   case E1000_DEV_ID_PCH_LPT_I217_LM:
+   case E1000_DEV_ID_PCH_LPT_I217_V:
+   case E1000_DEV_ID_PCH_LPTLP_I218_LM:
+   case E1000_DEV_ID_PCH_LPTLP_I218_V:
+   case E1000_DEV_ID_PCH_I218_V2:
+   case E1000_DEV_ID_PCH_I218_LM2:
+   case E1000_DEV_ID_PCH_I218_V3:
+   case E1000_DEV_ID_PCH_I218_LM3:
+   return 1;
+   default:
+   return 0;
+   }
+}
+
 static int
 eth_em_dev_init(struct rte_eth_dev *eth_dev)
 {
@@ -265,6 +291,8 @@ eth_em_dev_init(struct rte_eth_dev *eth_dev)
adapter->stopped = 0;

/* For ICH8 support we'll need to map the flash memory BAR */
+   if (eth_em_dev_is_ich8(hw))
+   hw->flash_address = (void *)pci_dev->mem_resource[1].addr;

if (e1000_setup_init_funcs(hw, TRUE) != E1000_SUCCESS ||
em_hw_init(hw) != 0) {
@@ -490,6 +518,7 @@ em_set_pba(struct e1000_hw *hw)
break;
case e1000_pchlan:
case e1000_pch2lan:
+   case e1000_pch_lpt:
pba = E1000_PBA_26K;
break;
default:
@@ -798,6 +827,8 @@ em_hardware_init(struct e1000_hw *hw)
hw->fc.low_water = 0x5048;
hw->fc.pause_time = 

[dpdk-dev] [PATCH v3 3/3] keepalive: add rte_keepalive_xstats_get()

2016-02-29 Thread Thomas Monjalon
Hi,

There is a compilation error for 32-bit arch:

2016-02-22 11:26, Harry van Haaren:
> +   for (i = 0; i < nstats; i++)
> +   printf("%s\t%lu\n", xstats[i].name, xstats[i].value);

examples/l2fwd-keepalive/main.c:206:10: error:
format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3
has type ‘uint64_t {aka long long unsigned int}’

Please keep acks when re-sending.
Thanks
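
For reference, a minimal sketch of the usual portable way to print a
uint64_t, which avoids the 32-bit error quoted above. This is not the fix
that was actually submitted; the struct and the helper name format_xstat
are hypothetical stand-ins for the xstats entries in the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for one xstats entry (name plus 64-bit counter). */
struct xstat_entry {
	const char *name;
	uint64_t value;
};

/*
 * Format one "name<TAB>value" line into buf.
 * PRIu64 expands to the correct conversion specifier for uint64_t on
 * both 32-bit and 64-bit targets, unlike a hard-coded "%lu".
 * Returns what snprintf returns: the length the full line needs.
 */
static int
format_xstat(char *buf, size_t len, const struct xstat_entry *e)
{
	assert(buf != NULL && e != NULL);
	return snprintf(buf, len, "%s\t%" PRIu64 "\n", e->name, e->value);
}
```

The same specifier can be used directly in the printf() call of the loop
quoted above.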


[dpdk-dev] [PATCH v4 2/4] eal: set kdrv to RTE_KDRV_NONE if kernel driver isn't managing the device

2016-02-29 Thread David Marchand
On Mon, Feb 29, 2016 at 10:00 AM, Xie, Huawei  wrote:
> On 2/29/2016 4:47 PM, David Marchand wrote:
>> On Fri, Feb 26, 2016 at 2:53 AM, Huawei Xie  wrote:
>>> v4 changes:
>>>  reword the commit message. When we mention kernel driver, emphasizes
>>> that it includes UIO/VFIO.
>> Annotations should not be part of the commitlog itself.
>
> Do you mean that "rewording the commit message" should not appear in the
> commit message itself?  Those version changes will not appear in the
> commit log when applied, right? So i added this so that reviewers know

Try to apply it.

http://dpdk.org/dev :

"Annotations take place after the 3 dashes and should explicit what
has changed since the previous version.".


-- 
David Marchand


[dpdk-dev] [PATCH v4 3/4] eal: call pci_ioport_map when kernel driver isn't managing the device

2016-02-29 Thread David Marchand
On Fri, Feb 26, 2016 at 2:53 AM, Huawei Xie  wrote:
> Call rte_eal_pci_ioport_map only if driver type is RTE_KDRV_NONE, which
> means kernel driver(including UIO/VFIO) isn't managing the device.

I suppose you meant 'Call pci_ioport_map when the pci device is not
bound to a kernel driver'.
If you keep on with your choice of words, at least put a space before the (.


> other minor changes:
>  * use RTE_ARCH_X86 for pci ioport map

This is a trivial change, but this should not be here.

>  * rework rte_eal_pci_ioport_map a bit

Well, not sure this comment helps the review, and anyway, why did you
need to change this ?
Your modification should be the smallest possible.


> Signed-off-by: Huawei Xie 

These nits aside:
Acked-by: David Marchand 


-- 
David Marchand


[dpdk-dev] [PATCH] eal: make resource initialization more robust

2016-02-29 Thread Tan, Jianfeng
Hi Thomas,

On 2/29/2016 5:12 AM, Thomas Monjalon wrote:
> Hi,
>
> 2016-01-29 19:22, Jianfeng Tan:
>> Current issue: DPDK is not that friendly to container environments,
>> because it pre-allocates resources like cores and hugepages, while
>> there are various resource limitations, for example cgroup, rlimit,
>> cpuset, etc.
>>
>> For cores, this patch makes use of pthread_getaffinity_np to further
>> narrow down detected cores before parsing coremask (-c), corelist (-l),
>> and coremap (--lcores).
>>
>> For hugepages, this patch adds a recovery mechanism for the case where
>> not that many hugepages can be used. It relies on a memory access
>> to fault in hugepages and, if that fails with SIGBUS, recovers to the
>> previously saved stack environment with siglongjmp().
> These are some interesting ideas.
> However, I am not sure a library should try to be so smart silently.
> It needs more feedback to decide whether it can be the default behaviour
> or an option.
>
> Please send coremask and hugepage mapping as separate patches as they
> are totally different and may be integrated separately.

Good advice, thanks! I'll do it.

And one more thing FYI, coremask using pthread_getaffinity_np() may have
an issue on some Linux versions or distros: it excludes isolcpus. This was
reported by Sergio Gonzalez Monroy,
and I'm still working it out.

Thanks,
Jianfeng

>
> Thanks



[dpdk-dev] [PATCH v4 1/4] eal: make the comment more accurate

2016-02-29 Thread David Marchand
On Fri, Feb 26, 2016 at 2:53 AM, Huawei Xie  wrote:
> positive return of rte_eal_pci_probe_one_driver means the driver doesn't
> support the device.
>
> Signed-off-by: Huawei Xie 
> Acked-by: Yuanhan Liu 

Acked-by: David Marchand 

-- 
David Marchand


[dpdk-dev] [PATCH v4 2/4] eal: set kdrv to RTE_KDRV_NONE if kernel driver isn't managing the device

2016-02-29 Thread David Marchand
On Fri, Feb 26, 2016 at 2:53 AM, Huawei Xie  wrote:
> v4 changes:
>  reword the commit message. When we mention kernel driver, emphasizes
> that it includes UIO/VFIO.

Annotations should not be part of the commitlog itself.

> Use RTE_KDRV_NONE to indicate that kernel driver(including UIO/VFIO)
> isn't manipulating the device.

missing space before (

> Signed-off-by: Huawei Xie 
> Acked-by: Yuanhan Liu 

Thought I already acked this.
Anyway,
Acked-by: David Marchand 


-- 
David Marchand


[dpdk-dev] [PATCH v3 0/4] Use common Linux tools to control DPDK ports

2016-02-29 Thread Remy Horton


On 26/02/2016 14:10, Ferruh Yigit wrote:

> Ferruh Yigit (4):
>lib/librte_ethtool: move librte_ethtool from examples to lib folder
>kcp: add kernel control path kernel module
>rte_ctrl_if: add control interface library
>examples/ethtool: add control interface support to the application

Acked-by: Remy Horton 


[dpdk-dev] [PATCH] mk: add makefile extention support

2016-02-29 Thread Thomas Monjalon
2016-02-28 21:47, Wiles, Keith:
> >Hi,
> >
> >2016-02-09 11:35, Keith Wiles:
> >> Adding support to the build system to allow for a Makefile.XXX
> >> extension in a subtree which already has Makefiles. These
> >> Makefiles could be from autotools or other places. Set the
> >> Makefile extension variable RTE_MKFILE_SUFFIX in a makefile subtree
> >> with 'export RTE_MKFILE_SUFFIX=.XXX' to use Makefile.XXX in
> >> that subtree.
> >> 
> >> The main reason I needed this feature was to integrate autotools-based
> >> open source projects with DPDK and keep the original Makefiles.
> >
> >Sorry I fail to understand why it is needed.
> >Are you trying to add autotool in DPDK? I don't think it is a good approach.
> >The DPDK must provide a pkgconfig interface to be integrated anywhere.
> 
> I was not trying to add autotools to DPDK. A number of times I wanted to
> integrate open source project(s) with DPDK and use DPDK's build system, but
> because the open source project already contained Makefile files you cannot
> use the DPDK build system without modifying or moving the original Makefile
> files. Using this method I can just add an exported variable and supply my
> own Makefile.XXX files.
> 
> One case was building FreeBSD source, but I did not want to modify FreeBSD
> Makefiles (or rely on previously built Makefiles as they would not work on
> Linux anyway) as I was pulling the source down from the freebsd.org repo.
> Using a patch to add the Makefiles with a different suffix allows me to
> build FreeBSD using DPDK, without having to modify or own the FreeBSD
> source. I have had this problem a number of times with open source code I
> did not want to modify, but just build within the DPDK build system, and
> adding the support for a different suffix to DPDK provided a clean way. The
> change does not affect the correct build system and just allows someone to
> define a new suffix for a given subtree in the code.

Why would you like to have another project inside the DPDK files tree?
If you want to integrate the lib inside an existing project, the solution
is pkgconfig.


[dpdk-dev] [PATCH v4 2/4] eal: set kdrv to RTE_KDRV_NONE if kernel driver isn't managing the device

2016-02-29 Thread Xie, Huawei
On 2/29/2016 4:47 PM, David Marchand wrote:
> On Fri, Feb 26, 2016 at 2:53 AM, Huawei Xie  wrote:
>> v4 changes:
>>  reword the commit message. When we mention kernel driver, emphasizes
>> that it includes UIO/VFIO.
> Annotations should not be part of the commitlog itself.

Do you mean that "rewording the commit message" should not appear in the
commit message itself?  Those version changes will not appear in the
commit log when applied, right? So I added this so that reviewers know
that I have changed the commit message; otherwise they don't need to
waste their time reviewing the commit message again. Is it that even if
I send a new patch version with only changes to the commit message,
I needn't mention this?
>
>> Use RTE_KDRV_NONE to indicate that kernel driver(including UIO/VFIO)
>> isn't manipulating the device.
> missing space before (

Thomas, could you help change this?

>
>> Signed-off-by: Huawei Xie 
>> Acked-by: Yuanhan Liu 
> Thought I already acked this.
> Anyway,
> Acked-by: David Marchand 
>
>



[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module

2016-02-29 Thread Jay Rolette
On Mon, Feb 29, 2016 at 5:06 AM, Thomas Monjalon wrote:

> Hi,
> I totally agree with Avi's comments.
> This topic is really important for the future of DPDK.
> So I think we must give some time to continue the discussion
> and have netdev involved in the choices done.
> As a consequence, these series should not be merged in the release 16.04.
> Thanks for continuing the work.
>

I know you guys are very interested in getting rid of the out-of-tree
drivers, but please do not block incremental improvements to DPDK in the
meantime. Ferruh's patch improves the usability of KNI. Don't throw out
good and useful enhancements just because it isn't where you want to be in
the end.

I'd like to see these be merged.

Jay


[dpdk-dev] [PATCH] i40e: remove redundant compiler warning disablers

2016-02-29 Thread Pei, Yulong
This patch caused a build error with i686-native-linuxapp-gcc (gcc version is
4.8.3).


> > i686-native-linuxapp-gcc compile error info:
> >
> > INSTALL-LIB librte_pmd_vmxnet3_uio.a
> > /root/dpdk/drivers/net/i40e/base/i40e_common.c: In function ‘i40e_aq_set_lldp_mib’:
> > /root/dpdk/drivers/net/i40e/base/i40e_common.c:3772:32: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
> >   cmd->address_high = CPU_TO_LE32(I40E_HI_WORD((u64)buff));
> > ^
> > /root/dpdk/drivers/net/i40e/base/i40e_common.c:3773:30: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
> >   cmd->address_low =  CPU_TO_LE32(I40E_LO_DWORD((u64)buff));
> >   ^
> > /root/dpdk/drivers/net/i40e/base/i40e_common.c: In function ‘i40e_aq_set_arp_proxy_config’:
> > /root/dpdk/drivers/net/i40e/base/i40e_common.c:5817:33: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
> >   cmd->address_high = CPU_TO_LE32(I40E_HI_DWORD((u64)proxy_config));
> >  ^
> > /root/dpdk/drivers/net/i40e/base/i40e_common.c:5818:30: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
> >   cmd->address_low = CPU_TO_LE32(I40E_LO_DWORD((u64)proxy_config));
> >   ^
> > /root/dpdk/drivers/net/i40e/base/i40e_common.c: In function ‘i40e_aq_set_ns_proxy_table_entry’:
> > /root/dpdk/drivers/net/i40e/base/i40e_common.c:5852:14: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
> >CPU_TO_LE32(I40E_HI_DWORD((u64)ns_proxy_table_entry));
> >   ^
> > /root/dpdk/drivers/net/i40e/base/i40e_common.c:5854:12: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
> >CPU_TO_LE32(I40E_LO_DWORD((u64)ns_proxy_table_entry));
> > ^
> > /root/dpdk/drivers/net/i40e/base/i40e_common.c: In function ‘i40e_aq_set_clear_wol_filter’:
> > /root/dpdk/drivers/net/i40e/base/i40e_common.c:5914:33: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
> >   cmd->address_high = CPU_TO_LE32(I40E_HI_DWORD((u64)filter));
> >  ^
> > /root/dpdk/drivers/net/i40e/base/i40e_common.c:5915:30: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
> >   cmd->address_low = CPU_TO_LE32(I40E_LO_DWORD((u64)filter));
> >   ^
> > cc1: all warnings being treated as errors
> > make[6]: *** [i40e_common.o] Error 1
> > make[5]: *** [i40e] Error 2
> > make[5]: *** Waiting for unfinished jobs
> >   INSTALL-LIB librte_pmd_ixgbe.a
> >   AR librte_pmd_e1000.a
> >   INSTALL-LIB librte_pmd_e1000.a
> > make[4]: *** [net] Error 2
> > make[3]: *** [drivers] Error 2
> > make[2]: *** [all] Error 2
> > make[1]: *** [pre_install] Error 2
> > make: *** [install] Error 2
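
For context, the warning above fires because a 32-bit pointer is cast
straight to a 64-bit integer. A common way to avoid it is to cast through
uintptr_t first. The sketch below illustrates that general idiom only; the
helper names are hypothetical and are not the i40e macros, and this is not
necessarily how the driver was fixed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Split a pointer into the high and low 32-bit halves of a 64-bit
 * address. Casting via uintptr_t (an integer type wide enough for a
 * pointer) and only then widening to uint64_t avoids
 * -Wpointer-to-int-cast on 32-bit targets.
 */
static inline uint32_t
addr_hi32(const void *p)
{
	uint64_t a = (uint64_t)(uintptr_t)p;

	assert(sizeof(uintptr_t) <= sizeof(uint64_t));
	return (uint32_t)(a >> 32);
}

static inline uint32_t
addr_lo32(const void *p)
{
	uint64_t a = (uint64_t)(uintptr_t)p;

	return (uint32_t)(a & 0xFFFFFFFFu);
}
```

On an i686 build the high half is simply zero, which is what a 64-bit
descriptor field expects for a 32-bit address.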


-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Panu Matilainen
Sent: Monday, December 7, 2015 8:37 PM
To: dev at dpdk.org
Subject: [dpdk-dev] [PATCH] i40e: remove redundant compiler warning disablers

These may have been required at some point but current i40e base driver 
compiles cleanly without them, at least with clang 3.7.0 and gcc 5.1.1.

Signed-off-by: Panu Matilainen 
---
 drivers/net/i40e/Makefile | 13 -
 1 file changed, 13 deletions(-)

diff --git a/drivers/net/i40e/Makefile b/drivers/net/i40e/Makefile
index 033ee4a..4ffaf0d 100644
--- a/drivers/net/i40e/Makefile
+++ b/drivers/net/i40e/Makefile
@@ -53,23 +53,10 @@ CFLAGS_BASE_DRIVER = -wd593 -wd188
 else ifeq ($(CC), clang)
 CFLAGS_BASE_DRIVER += -Wno-sign-compare
 CFLAGS_BASE_DRIVER += -Wno-unused-value
-CFLAGS_BASE_DRIVER += -Wno-unused-parameter
-CFLAGS_BASE_DRIVER += -Wno-strict-aliasing
-CFLAGS_BASE_DRIVER += -Wno-format
-CFLAGS_BASE_DRIVER += -Wno-missing-field-initializers
-CFLAGS_BASE_DRIVER += -Wno-pointer-to-int-cast
-CFLAGS_BASE_DRIVER += -Wno-format-nonliteral
 CFLAGS_BASE_DRIVER += -Wno-unused-variable
 else
 CFLAGS_BASE_DRIVER  = -Wno-sign-compare
 CFLAGS_BASE_DRIVER += -Wno-unused-value
-CFLAGS_BASE_DRIVER += -Wno-unused-parameter
-CFLAGS_BASE_DRIVER += -Wno-strict-aliasing
-CFLAGS_BASE_DRIVER += -Wno-format
-CFLAGS_BASE_DRIVER += -Wno-missing-field-initializers
-CFLAGS_BASE_DRIVER += -Wno-pointer-to-int-cast
-CFLAGS_BASE_DRIVER += -Wno-format-nonliteral
-CFLAGS_BASE_DRIVER += -Wno-format-security
 CFLAGS_BASE_DRIVER += -Wno-unused-variable

 ifeq ($(shell test $(GCC_VERSION) -ge 44 && echo 1), 1)
--
2.5.0



[dpdk-dev] [PATCH v1] virtio: Use cpuflag for vector api

2016-02-29 Thread Xie, Huawei
On 2/29/2016 12:26 PM, Yuanhan Liu wrote:
> On Fri, Feb 26, 2016 at 02:21:02PM +0530, Santosh Shukla wrote:
>> Check cpuflag macro before using vectored api.
>> - virtio_recv_pkts_vec() uses _sse3__ simd instructions for now, so added
>>   cpuflag.
>> - Also wrap other vectored friend APIs, i.e.:
>> 1) virtqueue_enqueue_recv_refill_simple
>> 2) virtio_rxq_vec_setup
>>
> ...
>> diff --git a/drivers/net/virtio/virtio_rxtx_simple.c 
>> b/drivers/net/virtio/virtio_rxtx_simple.c
>> index 3a1de9d..be51d7c 100644
>> --- a/drivers/net/virtio/virtio_rxtx_simple.c
>> +++ b/drivers/net/virtio/virtio_rxtx_simple.c
> Hmm, why not wrapping the whole file, instead of just few functions?
>
> Or maybe better, do a compile time check at the Makefile, something
> like:
>
> if has_CPUFLAG_xxx
> SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple.c
> endif
>
>
>   --yliu
>
For next release, we could consider providing arch level framework for
different arch optimizations. It is more complicated for rte_memcpy. For
the time being, except the small issue, ok with the temporary solution
using CPUFLAG.


[dpdk-dev] ACL memory allocation failures

2016-02-29 Thread Rapelly, Varun
> 
> Thanks Konstantin.
> 
> Previous allocation error was coming with 1024 huge pages of 2 MB size.
> 
> After increasing the huge pages to 2048, I was able to add another 
> ~140 rules [IPv4 rule data--> with src, dst IP address & port, next header ] 
> more, ie., 950 rules were added.

That's strange; according to your log, all you need is ~13MB of hugepage memory:
ACL: allocation of 12966528 bytes on socket 0 for ipv4_acl_table1
Wonder what consumed the rest of the 4GB?


>> We are creating mem pools (for DPDK compatible 3 ports) for packet 
>> processing.

Again do you re-build your table after every rule you add?
If so, then it seems a bit strange approach (and definitely not the fastest 
one).
>>Yes, we are rebuilding the rules every time, for 2 reasons:
>>1. Our application gives the full list of rules every time you add a new rule.
>>2. There is no way to delete a specific rule in the trie. Is there any way to
>>delete a specific ACL rule?
What you can do instead: create context; add all your rules into it; build; 

> 
> Logically it did not increase number of rules [expected 2*817, but only 950 
> were added]. Is it really using huge pages memory only?
> 
> From the code it looks like heap memory. [ ret =
> malloc_heap_alloc(&mcfg->malloc_heaps[i], type, size, 0, align == 0 ?
> 1 : align, 0) ]

As I can see from the log it fails at GEN phase, when trying to allocate 
hugepages for RT table.
At lib/librte_acl/acl_gen.c:509

rte_acl_gen(struct rte_acl_ctx *ctx, struct rte_acl_trie *trie,
	struct rte_acl_bld_trie *node_bld_trie, uint32_t num_tries,
	uint32_t num_categories, uint32_t data_index_sz, size_t max_size)
{
	...
	mem = rte_zmalloc_socket(ctx->name, total_size, RTE_CACHE_LINE_SIZE,
			ctx->socket_id);
	if (mem == NULL) {
		RTE_LOG(ERR, ACL,
			"allocation of %zu bytes on socket %d for %s failed\n",
			total_size, ctx->socket_id, ctx->name);
		return -ENOMEM;
	}

Konstantin

> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Rapelly, Varun
> > Sent: Friday, February 26, 2016 10:28 AM
> > To: dev at dpdk.org
> > Subject: Re: [dpdk-dev] ACL memory allocation failures
> >
> > Hi All,
> >
> > When I'm trying to configure some 5000+ ACL rules with different
> > source IP addresses, I get an ACL memory allocation failure. I'm using
> > DPDK 2.1.
> >
> > [root at ACLISSUE log_2015_10_26_08_19_42]# vim np.log
> > match nodes/bytes used: 816/104448
> > total: 12940832 bytes
> > ACL: Build phase for ACL "ipv4_acl_table2":
> > memory consumed: 947913495
> > ACL: trie 0: number of rules: 816
> > ACL: allocation of 12966528 bytes on socket 0 for ipv4_acl_table1 failed
> > ACL: Build phase for ACL "ipv4_acl_table1":
> > memory consumed: 947913495
> > ACL: trie 0: number of rules: 817
> > EAL: Error - exiting with code: 1
> >   Cause: Failed to build ACL trie
> >
> > I sourced the ACL config file again. After adding around 77 more rules,
> > the same error came again.
> >
> > total: 14912784 bytes
> > ACL: Build phase for ACL "ipv4_acl_table1":
> > memory consumed: 1040188260
> > ACL: trie 0: number of rules: 893
> > ACL: allocation of 14938480 bytes on socket 0 for ipv4_acl_table2 failed
> 
> You are running out of hugepages memory.
> 
> > ACL: Build phase for ACL "ipv4_acl_table2":
> > memory consumed: 1040188260
> > ACL: trie 0: number of rules: 894
> > EAL: Error - exiting with code: 1
> >   Cause: Failed to build ACL trie
> >
> > Where to increase the memory to avoid this issue?
> 
>  Refer to:
> http://dpdk.org/doc/guides/linux_gsg/sys_reqs.html#running-dpdk-applications
> Section 2.3.2
> 
> Konstantin



[dpdk-dev] [PATCH v4 2/4] eal: set kdrv to RTE_KDRV_NONE if kernel driver isn't managing the device

2016-02-29 Thread Xie, Huawei
On 2/27/2016 1:47 AM, Xie, Huawei wrote:
> Use RTE_KDRV_NONE to indicate that kernel driver(including UIO/VFIO)
> isn't manipulating the device.
Thomas, could you kindly help change manipulating->managing? I have
changed others per Panu's suggestion but missed this.


[dpdk-dev] [PATCH v3 00/18] fm10k: update shared code

2016-02-29 Thread Ding, HengX
Tested-by: Heng Ding 

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Wang Xiao W
Sent: Friday, February 19, 2016 7:07 PM
To: Chen, Jing D
Cc: dev at dpdk.org
Subject: [dpdk-dev] [PATCH v3 00/18] fm10k: update shared code

v3:
* Fixed checkpatch.pl warning about long commit message.
* Fixed the issue of compile failure about part of patches applied.
* Split the misc-small-fixes patch into several patches.

v2:
* Put the two extra fix patches ahead of the base code patches.

This patch set has passed regression test.

Wang Xiao W (18):
  fm10k: use default mailbox message handler for PF
  fm10k/base: correct typecast in fm10k_update_xc_addr_pf
  fm10k/base: cleanup namespace pollution
  fm10k/base: use bitshift for itr_scale
  fm10k/base: reset max_queues on init_hw_vf failure
  fm10k/base: document ITR scale workaround in VF TDLEN register
  fm10k/base: cleanup lines over 80 characters
  fm10k/base: cleanup useless else
  fm10k/base: use BIT macro instead of open-coded bit-shifting of 1
  fm10k/base: do not use CamelCase
  fm10k/base: use memcpy for mac addr copy
  fm10k/base: allow removal of is_slot_appropriate function
  fm10k/base: consistently use VLAN ID when referencing vid variables
  fm10k/base: imporve comment per upstream review changes
  fm10k/base: fix TLV structures alignment
  fm10k/base: move constants to the right of binary operators
  fm10k/base: minor cleanups
  fm10k/base: remove unused struct element

 drivers/net/fm10k/base/fm10k_api.c   |   2 +
 drivers/net/fm10k/base/fm10k_api.h   |   2 +
 drivers/net/fm10k/base/fm10k_mbx.c   |  63 +++-
 drivers/net/fm10k/base/fm10k_mbx.h   |  11 +--
 drivers/net/fm10k/base/fm10k_osdep.h |  32 ++
 drivers/net/fm10k/base/fm10k_pf.c|  88 +
 drivers/net/fm10k/base/fm10k_pf.h|  18 ++--
 drivers/net/fm10k/base/fm10k_tlv.c   |  40 
 drivers/net/fm10k/base/fm10k_tlv.h   |   9 +-
 drivers/net/fm10k/base/fm10k_type.h  | 182 +++
 drivers/net/fm10k/base/fm10k_vf.c|  32 --
 drivers/net/fm10k/fm10k_ethdev.c |  41 +++-
 12 files changed, 222 insertions(+), 298 deletions(-)

-- 
1.9.3



[dpdk-dev] [PATCH v3 1/3] fm10k: enable FTAG based forwarding

2016-02-29 Thread Wang, Xiao W


> -Original Message-
> From: Richardson, Bruce
> Sent: Saturday, February 27, 2016 12:33 AM
> To: David Marchand 
> Cc: Wang, Xiao W ; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 1/3] fm10k: enable FTAG based forwarding
> 
> On Fri, Feb 26, 2016 at 04:00:49PM +0100, David Marchand wrote:
> > On Fri, Feb 26, 2016 at 3:48 PM, Bruce Richardson
> >  wrote:
> > > On Fri, Feb 26, 2016 at 09:24:06AM +, Wang, Xiao W wrote:
> > >> Hi,
> > >> > > Thanks for the discussion, Thomas, do you have any suggestions?
> > >> >
> > >> > I don't understand why you say this feature is specific to fm10k.
> > >> > Can we imagine another NIC having this capability?
> > >>
> > > >> As you know, fm10k has switch logic between the MAC and PHY;
> > > >> every packet sent out from the host is switched inside the
> > > >> NIC. Other NICs don't have a switch inside, and the FTAG feature
> > > >> is related to the switch function.
> > >>
> > >> As introduced in the second patch:
> > >> The FM10K family of NICs support the addition of a Fabric Tag
> > >> (FTAG) to carry special information. The FTAG is placed at the
> > >> beginning of the frame, it contains information such as where the
> > > >> packet comes from and goes, and the vlan tag. In FTAG based
> > > >> forwarding mode, the switch logic forwards packets according to
> > > >> glort (global resource tag) information, rather than the mac and
> > > >> vlan table.
> > >> So this is a feature specific to fm10k.
> > >
> > > If it is fm10k specific, how about just adding a public function to
> > > the fm10k driver to turn it on. The user app will be non-portable
> > > across NICs, but that's the price of using nic-specific features.
> >
> > What about using a devargs ?
> > Something like :
> > -w :xx:xx.x,enable_ftag=1
> >
> > The application still needs to know about this to enable it, but that
> > sounds better to me.
> > The only issue is that it can't work with hotplug at the moment.
> >
> Seems a good enough solution to me. Xiao, any other thoughts?
> 
> /Bruce

I also agree with the devargs solution. This way, the build-time config can
be removed and we don't need to add extra fields to ethdev.
I'll rework the patch. Thanks for the suggestions.
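
To illustrate the devargs idea, here is a minimal sketch in plain C of
checking a device argument string for an enable_ftag=1 option. It is not
the fm10k driver code and does not use the DPDK kvargs library; the
function name devarg_flag_enabled and the PCI address in the usage note
are made up for the example.

```c
#include <string.h>

/*
 * Hypothetical helper, not a DPDK API.
 * devargs has the form "<device address>[,key=value]...".
 * Returns 1 if the first occurrence of "key" is exactly "key=1", else 0.
 */
int devarg_flag_enabled(const char *devargs, const char *key)
{
    size_t klen = strlen(key);
    const char *p = strchr(devargs, ',');   /* skip the device address */

    while (p != NULL) {
        p++;                                 /* step past the comma */
        if (strncmp(p, key, klen) == 0 && p[klen] == '=') {
            /* Accept only the single character '1' as the value. */
            return p[klen + 1] == '1' &&
                   (p[klen + 2] == '\0' || p[klen + 2] == ',');
        }
        p = strchr(p, ',');                  /* move on to the next option */
    }
    return 0;                                /* key not present */
}
```

With this, devarg_flag_enabled("0000:84:00.0,enable_ftag=1", "enable_ftag")
returns 1, and the same call on "0000:84:00.0" alone returns 0. A real
driver would more likely reuse the EAL's existing devargs parsing than
hand-roll it like this.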

Best Regards,
Xiao


[dpdk-dev] [PATCH v2] doc: Malicious Driver Detection not supported by ixgbe

2016-02-29 Thread Lu, Wenzhuo
Hi Bruce,

> -----Original Message-----
> From: Richardson, Bruce
> Sent: Friday, February 26, 2016 10:41 PM
> To: Lu, Wenzhuo
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] doc: Malicious Driver Detection not
> supported by ixgbe
> 
> On Fri, Feb 26, 2016 at 12:48:37PM +0800, Wenzhuo Lu wrote:
> > Announce that Malicious Driver Detection is not supported.
> >
> > V2:
> >  *Rework the words.
> >
> > Signed-off-by: Wenzhuo Lu 
> 
> Hi Wenzhuo,
> 
> just for future reference, please put the V2, v3 etc. updates below the cut
> line "---" so that they can be auto-stripped when applying the patch.
> 
> /Bruce
Got it. Thanks for the reminder :)
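
For reference, a sketch of the layout being asked for, reusing this
patch's own commit message. The bracketed last line is a placeholder,
not real output. Text between the "---" line and the start of the diff
is dropped by git am, so the version notes never reach the commit log.

```
Announce that Malicious Driver Detection is not supported.

Signed-off-by: Wenzhuo Lu
---
v2:
* Rework the words.

[diffstat and diff follow here]
```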

> 
> > ---
> >  doc/guides/nics/ixgbe.rst  | 20 
> >  doc/guides/rel_notes/release_16_04.rst | 23 +++
> >  2 files changed, 43 insertions(+)
> >
>