date:20161003

[dpdk-dev] [PATCH] app/test: add mempool walk

2016-10-03 Thread Thomas Monjalon

The mempool function rte_mempool_walk was not tested.
It will print the name of all mempools.

Signed-off-by: Thomas Monjalon 
---
 app/test/test_mempool.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index fffbf8d..b9880b3 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -500,6 +500,12 @@ test_mempool_xmem_misc(void)
return 0;
 }

+static void
+walk_cb(struct rte_mempool *mp, void *userdata __rte_unused)
+{
+   printf("\t%s\n", mp->name);
+}
+
 static int
 test_mempool(void)
 {
@@ -561,6 +567,9 @@ test_mempool(void)
goto err;
}

+   printf("Walk into mempools:\n");
+   rte_mempool_walk(walk_cb, NULL);
+
rte_mempool_list_dump(stdout);

/* basic tests without cache */
-- 
2.7.0
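
The walk-with-callback pattern this test exercises can be sketched without DPDK. The following is only an illustration: `toy_pool`, `toy_pool_walk`, `print_cb` and `count_cb` are hypothetical names, the pool names are made up, and the fixed array stands in for whatever list the real library keeps internally.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for a mempool: only the name matters here. */
struct toy_pool {
    const char *name;
};

/* Callback type: invoked once per pool, with an opaque user pointer. */
typedef void (toy_walk_cb)(struct toy_pool *p, void *userdata);

/* Made-up registry of pools; a real library would keep its own list. */
static struct toy_pool toy_pools[] = {
    { "pool_a" }, { "pool_b" }, { "pool_c" },
};

/* Call cb on every registered pool, passing userdata through untouched. */
void toy_pool_walk(toy_walk_cb *cb, void *userdata)
{
    size_t i;

    for (i = 0; i < sizeof(toy_pools) / sizeof(toy_pools[0]); i++)
        cb(&toy_pools[i], userdata);
}

/* Same shape as walk_cb in the patch: print the name, ignore userdata. */
void print_cb(struct toy_pool *p, void *userdata)
{
    (void)userdata;
    printf("\t%s\n", p->name);
}

/* Variant that uses userdata as a counter, so a test can assert on it. */
void count_cb(struct toy_pool *p, void *userdata)
{
    (void)p;
    (*(unsigned int *)userdata)++;
}
```

Passing a counter through `userdata`, as `count_cb` does, would let a test check how many pools were visited instead of only printing them.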

[dpdk-dev] [PATCH 1/1 v2] eal: Fix misleading error messages, errno can't be trusted.

2016-10-03 Thread Thomas Monjalon

2016-10-03 08:55, Jean Tourrilhes:
> On Mon, Oct 03, 2016 at 02:25:40PM +0100, Sergio Gonzalez Monroy wrote:
> > Hi Jean,
> > 
> > There are some format issues with the patch:
> > 
> > You can run scripts/check-git-log.sh to check them:
> > Wrong headline format:
> > eal: Fix misleading error messages, errno can't be trusted.
> > Wrong headline uppercase:
> > eal: Fix misleading error messages, errno can't be trusted.
> > Missing 'Fixes' tag:
> > eal: Fix misleading error messages, errno can't be trusted.
> > 
> > The script's output highlights the different issues.
> 
>   Sorry about that, I casually read the page on
> http://dpdk.org/dev, but obviously I need to look at it again.

No problem. This guide is more oriented towards regular contributors.
You come with a bug and its fix, we can make some effort to format
the patch :)

The title could be "mem: fix hugepage mapping error messages"

> > On 21/09/2016 22:10, Jean Tourrilhes wrote:
> > >@@ -263,9 +264,16 @@ rte_eal_config_reattach(void)
> > >   mem_config = (struct rte_mem_config *) mmap(rte_mem_cfg_addr,
> > >   sizeof(*mem_config), PROT_READ | PROT_WRITE, MAP_SHARED,
> > >   mem_cfg_fd, 0);
> > >+  if (mem_config == MAP_FAILED || mem_config != rte_mem_cfg_addr) {
> > >+  if (mem_config != MAP_FAILED)
> > >+  /* errno is stale, don't use */
> > >+  rte_panic("Cannot mmap memory for rte_config at [%p], 
> > >got [%p] - please use '--base-virtaddr' option\n",
> > >+rte_mem_cfg_addr, mem_config);
> > >+  else
> > >+  rte_panic("Cannot mmap memory for rte_config! error %i 
> > >(%s)\n",
> > >+errno, strerror(errno));
> > >+  }
> > >   close(mem_cfg_fd);
> > >-  if (mem_config == MAP_FAILED || mem_config != rte_mem_cfg_addr)
> > >-  rte_panic("Cannot mmap memory for rte_config\n");
> > 
> > NIT but any reason you moved the check before closing the file descriptor?
> > (not that it matters with current code as we panic anyway)
> 
>   "close()" may change "errno" according to its man page.

Sergio, do you have more comments?
Should we wait another version or is it OK?
Maybe you'd prefer to rework it yourself?
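
The two points raised in this thread (errno is only meaningful when mmap() itself failed, and close() may overwrite it) can be sketched in plain POSIX C. This is a minimal illustration, not the EAL code: `map_at` is a hypothetical helper, and its return convention is an assumption made for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map len bytes of fd, optionally at a requested address (hint may be NULL),
 * then close fd.
 * Returns 0 on success, the errno value saved from a failed mmap(), or -1
 * if mmap() succeeded but not at the requested address. In that last case
 * errno was not set by mmap() and must not be reported.
 */
int map_at(void *hint, size_t len, int fd, void **out)
{
    void *addr = mmap(hint, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    /* Save errno right away: the close() below may change it. */
    int saved_errno = (addr == MAP_FAILED) ? errno : 0;

    if (fd >= 0)
        close(fd);

    if (addr == MAP_FAILED)
        return saved_errno;

    if (hint != NULL && addr != hint) {
        /* Mapped, but at the wrong place: errno is stale here. */
        munmap(addr, len);
        return -1;
    }

    *out = addr;
    return 0;
}
```

A caller can then pass a positive return value to strerror() and print a separate "wrong address" message for -1, which mirrors the two rte_panic() branches in the quoted patch.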

[dpdk-dev] [PATCH v5 0/4] new crypto software based device

2016-10-03 Thread De Lara Guarch, Pablo

Hi,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Slawomir Mrozowicz
> Sent: Monday, October 03, 2016 7:26 AM
> To: dev at dpdk.org
> Cc: Mrozowicz, SlawomirX
> Subject: [dpdk-dev] [PATCH v5 0/4] new crypto software based device
> 
> This code provides the initial implementation of the libcrypto poll mode
> driver.
> All cryptography operations are using Openssl library crypto API.
> Each algorithm uses EVP_ interface from openssl API - which is recommended
> by
> Openssl maintainers.
> 
> For more information about how to use this driver, go to:
> doc/guides/cryptodevs/libcrypto.rst
> 
> Changes in V5:
> - reduce source of big data test
> 
> Changes in V4:
> - move aes test rework to another patch
> - move big data test to another patch
> - checking if libcrypto pmd is available
> 
> Changes in V3:
> - add negative verification tests
> - add big data test
> - fix pmd according to negative verification tests
> - change gmac aad max size
> - update documentation and commits comments
> 
> Changes in V2:
> - add gcm/gmac algorithm correction
> - unit test rework
> 
> Slawomir Mrozowicz (1):
>   libcrypto_pmd: initial implementation of SW crypto device
> 
> Piotr Azarewicz (2)
>   app/test: cryptodev AES tests rework
>   app/test: added tests for libcrypto PMD
> 
> Daniel Mrzyglod (1)
>   examples/l2fwd-crypto: updated example for libcrypto PMD
> 
>  MAINTAINERS|4 +
>  app/test/Makefile  |2 +-
>  app/test/test_cryptodev.c  | 1581 
> ++--
>  app/test/test_cryptodev.h  |1 +
>  app/test/test_cryptodev_aes.c  |  687 -
>  app/test/test_cryptodev_aes.h  | 1124 --
>  app/test/test_cryptodev_aes_test_vectors.h | 1095 ++
>  app/test/test_cryptodev_blockcipher.c  |  531 +++
>  app/test/test_cryptodev_blockcipher.h  |  125 ++
>  app/test/test_cryptodev_des_test_vectors.h |  952 
>  app/test/test_cryptodev_gcm_test_vectors.h |   36 +-
>  app/test/test_cryptodev_hash_test_vectors.h|  491 ++
>  app/test/test_cryptodev_perf.c |  689 -
>  config/common_base |6 +
>  doc/guides/cryptodevs/index.rst|1 +
>  doc/guides/cryptodevs/libcrypto.rst|  116 ++
>  doc/guides/rel_notes/release_16_11.rst |   23 +-
>  drivers/crypto/Makefile|1 +
>  drivers/crypto/libcrypto/Makefile  |   60 +
>  drivers/crypto/libcrypto/rte_libcrypto_pmd.c   | 1051 +
>  drivers/crypto/libcrypto/rte_libcrypto_pmd_ops.c   |  708 +
>  .../crypto/libcrypto/rte_libcrypto_pmd_private.h   |  174 +++
>  .../crypto/libcrypto/rte_pmd_libcrypto_version.map |3 +
>  examples/l2fwd-crypto/main.c   |9 +
>  lib/librte_cryptodev/rte_cryptodev.h   |5 +-
>  mk/rte.app.mk  |   23 +-
>  26 files changed, 7563 insertions(+), 1935 deletions(-)
>  delete mode 100644 app/test/test_cryptodev_aes.c
>  delete mode 100644 app/test/test_cryptodev_aes.h
>  create mode 100644 app/test/test_cryptodev_aes_test_vectors.h
>  create mode 100644 app/test/test_cryptodev_blockcipher.c
>  create mode 100644 app/test/test_cryptodev_blockcipher.h
>  create mode 100644 app/test/test_cryptodev_des_test_vectors.h
>  create mode 100644 app/test/test_cryptodev_hash_test_vectors.h
>  create mode 100644 doc/guides/cryptodevs/libcrypto.rst
>  create mode 100644 drivers/crypto/libcrypto/Makefile
>  create mode 100644 drivers/crypto/libcrypto/rte_libcrypto_pmd.c
>  create mode 100644 drivers/crypto/libcrypto/rte_libcrypto_pmd_ops.c
>  create mode 100644 drivers/crypto/libcrypto/rte_libcrypto_pmd_private.h
>  create mode 100644
> drivers/crypto/libcrypto/rte_pmd_libcrypto_version.map
> 
> --
> 2.5.0

There are still some checkpatch errors, mainly related to exceeding maximum 
line length.
Some of these lines are more than 90 characters long, so at least those 
should be fixed.

Thanks,
Pablo

[dpdk-dev] [PATCH] eal: fix c++ compilation issue with rte_delay_us()

2016-10-03 Thread Konstantin Ananyev

When a C++ source file includes the header, the compiler treats
void (*rte_delay_us)(unsigned int us);
as a definition of a global variable rather than a declaration.
Linking with librte_eal then fails with a multiple definition error.

Fixes: b4d63fb62240 ("eal: customize delay function")

Steps to reproduce:

$ cat rttm1.cpp

#include <iostream>
#include <rte_eal.h>
#include <rte_cycles.h>

using namespace std;

int main(int argc, char *argv[])
{
int ret = rte_eal_init(argc, argv);
rte_delay_us(1);
cout << "return code ";
cout << ret;
return ret;
}

$ g++ -m64 -I/${RTE_SDK}/${RTE_TARGET}/include -c  -o rttm1.o rttm1.cpp
$ gcc -m64 -pthread -o rttm1 rttm1.o -ldl -Wl,-lstdc++ \
  -L/${RTE_SDK}/${RTE_TARGET}/lib -Wl,-lrte_eal
.../librte_eal.a(eal_common_timer.o):
(.bss+0x0): multiple definition of `rte_delay_us'
rttm1.o:(.bss+0x0): first defined here
collect2: error: ld returned 1 exit status

$ nm rttm1.o | grep rte_delay_us
0092 t _GLOBAL__sub_I_rte_delay_us
 B rte_delay_us


Signed-off-by: Konstantin Ananyev 
---
 lib/librte_eal/common/include/generic/rte_cycles.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/include/generic/rte_cycles.h 
b/lib/librte_eal/common/include/generic/rte_cycles.h
index 96a2da9..00103ca 100644
--- a/lib/librte_eal/common/include/generic/rte_cycles.h
+++ b/lib/librte_eal/common/include/generic/rte_cycles.h
@@ -188,7 +188,7 @@ rte_get_timer_hz(void)
  * @param us
  *   The number of microseconds to wait.
  */
-void
+extern void
 (*rte_delay_us)(unsigned int us);

 /**
-- 
2.4.3
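
The declaration/definition split behind this fix can be sketched in a single C file. In C, a file-scope object declared without `extern` and without an initializer is only a tentative definition; in C++ the same line is a full definition, so every C++ translation unit that includes the header ends up defining the symbol. The names below (`toy_delay_us`, `toy_delay_us_count`, `toy_delay_total`) are hypothetical, and the "delay" just accumulates a counter so the example stays testable.

```c
/* --- What a shared header would contain --- */

/* Declaration only: 'extern' reserves no storage in the including file. */
extern void (*toy_delay_us)(unsigned int us);

/* --- What exactly one .c file would contain --- */

static unsigned int toy_total_us;

/* Stand-in for a busy-wait: just record how long we were asked to wait. */
static void toy_delay_us_count(unsigned int us)
{
    toy_total_us += us;
}

/* The single definition of the function pointer, with an initializer. */
void (*toy_delay_us)(unsigned int us) = toy_delay_us_count;

/* Accessor used to observe the effect of calls through the pointer. */
unsigned int toy_delay_total(void)
{
    return toy_total_us;
}
```

With the `extern` in place, the header line is a pure declaration in both languages, and only the one `.c` file that provides the initializer emits the symbol.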

[dpdk-dev] qos: traffic shaping at queue level

2016-10-03 Thread Dumitrescu, Cristian

From: Nikhil Jagtap [mailto:nikhil.jag...@gmail.com]
Sent: Friday, September 30, 2016 7:12 AM
To: dev at dpdk.org; Dumitrescu, Cristian ; 
users at dpdk.org
Subject: Re: qos: traffic shaping at queue level

Hi,
Can someone please answer my queries?
I tried using queue weights to distribute traffic-class bandwidth among the 
child queues, but did not get the desired results.
[Cristian] Can you please describe what issues you see?

Regards,
Nikhil

On 27 September 2016 at 15:34, Nikhil Jagtap <nikhil.jagtap at gmail.com> wrote:
Hi,

I have a few questions about the hierarchical scheduler. I am taking a simple 
example here to get a better understanding.

Reference example:
  pipe rate = 30 mbps
  tc 0 rate = 30 mbps
  traffic-type 0 being queued to queue 0, tc 0.
  traffic-type 1 being queued to queue 1, tc 0.
  Assume traffic-type 0 is being received at the rate of 25 mbps.
  Assume traffic-type 1 is also being received at the rate of 25 mbps.

Requirement:
  To limit traffic-type 0 to (CIR =  5 mbps, PIR = 30 mbps), AND
  limit traffic-type 1 to (CIR = 25 mbps, PIR = 30 mbps).

The questions:
1) I understand that with the scheduler, it is possible to do rate limiting 
only at the sub-port and pipe levels and not at the individual queue level.
[Cristian] Yes, correct, only subports and pipes own token buckets, with all 
the pipe traffic classes and queues sharing their pipe token bucket.

Is it possible to achieve rate limiting using the notion of queue weights? For 
the above example, will assigning weights in 1:5 ratio to the two queues help 
achieve shaping the two traffic-types at the two different rates?
[Cristian] Yes. However, getting the weight observed accurately relies on all 
the queues being backlogged (always having packets to dequeue). When a pipe and 
certain TC is examined for dequeuing, the relative weights are enforced between 
the queues that have packets at that precise moment in time, with the empty 
queues being ignored. The fully backlogged scenario is not taking place in 
practice, and the set of non-empty queues changes over time. As said in the 
past, having big relative weight ratios between queues helps (1:5 should be 
good).

2) In continuation to previous question: if queue weights don't help, would it 
be possible to use metering to achieve rate limiting? Assume we meter 
individual traffic-types (using CIR-PIR config mentioned above) before queuing 
it to the scheduler queues. So to achieve the respective queue rates, the 
dequeuer would be expected to prioritise green packets over yellow.
Looking into the code, the packet color is used as an input to the dropper 
block, but does not seem to be used anywhere in the scheduler. So I guess it is 
not possible to prioritise green packets when dequeuing?
[Cristian] Packet color is used by Weighted RED (WRED) congestion management 
scheme on the enqueue side, not on the dequeue side. Once the packet has been 
enqueued, it cannot be dropped (i.e. every enqueued packet will eventually be 
dequeued), so rate limiting cannot be enforced on the dequeue side.

Regards,
Nikhil
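
The arithmetic behind the 1:5 weight suggestion can be sketched as follows. This is an idealized model, not the librte_sched implementation: it assumes a larger weight means a proportionally larger share and that weights apply only among the queues that currently have packets. `toy_queue_share` is a hypothetical name.

```c
#include <stddef.h>

/*
 * Idealized bandwidth share of queue q inside one traffic class.
 * tc_rate    : rate available to the traffic class (e.g. in mbps)
 * weight     : per-queue weights
 * backlogged : non-zero if the queue currently has packets to dequeue
 * Empty queues are ignored, so their share goes to the backlogged ones.
 */
double toy_queue_share(double tc_rate, const unsigned int *weight,
                       const int *backlogged, size_t nq, size_t q)
{
    unsigned int total = 0;
    size_t i;

    if (!backlogged[q])
        return 0.0;

    for (i = 0; i < nq; i++)
        if (backlogged[i])
            total += weight[i];

    if (total == 0)
        return 0.0;

    return tc_rate * weight[q] / total;
}
```

With a 30 mbps traffic class and weights 1:5, this model gives 5 mbps and 25 mbps while both queues are backlogged. If queue 1 goes empty, queue 0 gets the full 30 mbps, which is why the observed ratio drifts whenever the backlog condition does not hold.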

[dpdk-dev] [PATCH 1/2] mbuf: add rte_pktmbuff_reset_headroom function

2016-10-03 Thread Olivier Matz

Hi Maxime,

On 09/29/2016 02:20 PM, Maxime Coquelin wrote:
> Some applications use rte_mbuf_raw_alloc() function to improve
> performance by not resetting mbuf's fields to their default state.
> 
> This can be however problematic for mbuf consumers that need some
> headroom, meaning that data_off field gets decremented after
> allocation. When the mbuf is re-used afterwards, there might not
> be enough room for the consumer to prepend anything, if the data_off
> field is not reset to its default value.
> 
> This patch adds a new rte_pktmbuf_reset_headroom() function that
> applications can call to reset the data_off field.
> This patch also replaces current data_off assignments in the mbuf
> lib with a call to this function.
> 
> Signed-off-by: Maxime Coquelin 

Sounds like a good idea. Just one small comment below.

>  
>  /**
> + * Reset the data_off field of a packet mbuf to its default value.
> + *
> + * The given mbuf must have only one segment.
> + *
> + * @param m
> + *   The packet mbuf's data_off field has to be reset.
> + */
> +static inline void rte_pktmbuf_reset_headroom(struct rte_mbuf *m)
> +{
> + m->data_off = RTE_MIN(RTE_PKTMBUF_HEADROOM, (uint16_t)m->buf_len);
> +}

Maybe we should also highlight in the API comment that the segment
should be empty.


Thanks,
Olivier
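
The problem described in the commit message can be sketched with a toy buffer descriptor. Nothing below is DPDK code: `toy_mbuf`, `TOY_HEADROOM`, `toy_reset_headroom` and `toy_prepend` are hypothetical stand-ins, and the headroom value of 128 is an assumption for the example.

```c
#include <stdint.h>

#define TOY_HEADROOM 128 /* hypothetical default headroom, in bytes */

/* Toy single-segment buffer descriptor. */
struct toy_mbuf {
    uint16_t buf_len;  /* total size of the underlying buffer */
    uint16_t data_off; /* offset of the first data byte in the buffer */
    uint16_t data_len; /* number of data bytes */
};

/* Reset data_off to the default headroom, capped by the buffer size.
 * Only meaningful on an empty segment: moving data_off under live
 * data would make the descriptor point at the wrong bytes. */
void toy_reset_headroom(struct toy_mbuf *m)
{
    m->data_off = (TOY_HEADROOM < m->buf_len) ? TOY_HEADROOM : m->buf_len;
}

/* Prepend len bytes by consuming headroom.
 * Returns 0 on success, -1 if there is not enough headroom left. */
int toy_prepend(struct toy_mbuf *m, uint16_t len)
{
    if (len > m->data_off)
        return -1;
    m->data_off = (uint16_t)(m->data_off - len);
    m->data_len = (uint16_t)(m->data_len + len);
    return 0;
}
```

A buffer reused without the reset keeps the reduced `data_off` from its previous life, so a second 100-byte prepend fails in this model until `toy_reset_headroom()` is called on the emptied buffer.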

[dpdk-dev] [RFC 0/7] changing mbuf pool handler

2016-10-03 Thread Olivier Matz

Hi Hemant,

Thank you for your feedback.

On 09/22/2016 01:52 PM, Hemant Agrawal wrote:
> Hi Olivier
> 
> On 9/19/2016 7:12 PM, Olivier Matz wrote:
>> Hello,
>>
>> Following discussion from [1] ("usages issue with external mempool").
>>
>> This is an attempt to make the mempool_ops feature introduced
>> by David Hunt [2] more widely used by applications.
>>
>> It applies on top of a minor fix in mbuf lib [3].
>>
>> To summarize the needs (please comment if I did not get it right):
>>
>> - new hw-assisted mempool handlers will soon be introduced
>> - to make use of it, the new mempool API [4] (rte_mempool_create_empty,
>>   rte_mempool_populate, ...) has to be used
>> - the legacy mempool API (rte_mempool_create) does not allow to change
>>   the mempool ops. The default is "ring_p_c" depending on
>>   flags.
>> - the mbuf helper (rte_pktmbuf_pool_create) does not allow to change
>>   them either, and the default is RTE_MBUF_DEFAULT_MEMPOOL_OPS
>>   ("ring_mp_mc")
>> - today, most (if not all) applications and examples use either
>>   rte_pktmbuf_pool_create or rte_mempool_create to create the mbuf
>>   pool, making it difficult to take advantage of this feature with
>>   existing apps.
>>
>> My initial idea was to deprecate both rte_pktmbuf_pool_create() and
>> rte_mempool_create(), forcing the applications to use the new API, which
>> is more flexible. But after digging a bit, it appeared that
>> rte_mempool_create() is widely used, and not only for mbufs. Deprecating
>> it would have a big impact on applications, and replacing it with the
>> new API would be overkill in many use-cases.
> 
> I agree with the proposal.
> 
>>
>> So I finally tried the following approach (inspired by a suggestion
>> from Jerin [5]):
>>
>> - add a new mempool_ops parameter to rte_pktmbuf_pool_create(). This
>>   unfortunately breaks the API, but I implemented an ABI compat layer.
>>   If the patch is accepted, we could discuss how to announce/schedule
>>   the API change.
>> - update the applications and documentation to prefer
>>   rte_pktmbuf_pool_create() as much as possible
>> - update most used examples (testpmd, l2fwd, l3fwd) to add a new command
>>   line argument to select the mempool handler
>>
>> I hope the external applications would then switch to
>> rte_pktmbuf_pool_create(), since it supports most of the use-cases (even
>> priv_size != 0, since we can call rte_mempool_obj_iter() after) .
>>
> 
> I would still prefer if you could add the "rte_mempool_obj_cb_t *obj_cb,
> void *obj_cb_arg" into "rte_pktmbuf_pool_create". This single
> consolidated wrapper will make it almost certain that applications will
> not try to use rte_mempool_create for packet buffers.

The patch changes the example applications. I'm not sure I understand
why adding these arguments would force applications not to use
rte_mempool_create() for packet buffers. Do you have an application in mind?

For the mempool_ops parameter, we must pass it at init because we need
to know the mempool handler before populating the pool. For object
initialization, it can be done after, so I thought it was better to
reduce the number of arguments to avoid falling into the mempool_create()
syndrome :)

Any other opinions?

Regards,
Olivier

[dpdk-dev] [PATCH] log: do not drop debug logs at compile time

2016-10-03 Thread Olivier Matz



On 10/03/2016 05:27 PM, Wiles, Keith wrote:
> 
> Regards,
> Keith
> 
>> On Oct 3, 2016, at 10:02 AM, Olivier Matz  wrote:
>>
>> Hi Keith,
>>
>> On 09/30/2016 05:48 PM, Wiles, Keith wrote:
>>>> On Sep 30, 2016, at 4:33 AM, Thomas Monjalon wrote:
>>>>
>>>> 2016-09-16 09:43, Olivier Matz:
>>>>> Today, all logs whose level is lower than INFO are dropped at
>>>>> compile-time. This prevents from enabling debug logs at runtime using
>>>>> --log-level=8.
>>>>>
>>>>> The rationale was to remove debug logs from the data path at
>>>>> compile-time, avoiding a test at run-time.
>>>>>
>>>>> This patch changes the behavior of RTE_LOG() to avoid the compile-time
>>>>> optimization, and introduces the RTE_LOG_DP() macro that has the same
>>>>> behavior than the previous RTE_LOG(), for the rare cases where debug
>>>>> logs are in the data path.
>>>>>
>>>>> So it is now possible to enable debug logs at run-time by just
>>>>> specifying --log-level=8. Some drivers still have special compile-time
>>>>> options to enable more debug log. Maintainers may consider to
>>>>> remove/reduce them.
>>>>>
>>>>> Signed-off-by: Olivier Matz
>>>>
>>>> I think it is a good change.
>>>> However I'm not sure we should take it for 16.11 as it was sent late and
>>>> there is no review comment.
>>>> It is neither really a fix nor really a feature.
>>>> If there are some +1, and no opinions against, it will go in 16.11.
>>>> Note that some drivers would need some changes to fully benefit of
>>>> debug logs enabled at run-time.
>>>
>>> Would this be easier to add a new LOG level instead say DEBUG_DATAPATH and 
>>> then change the RTE_LOG to exclude the new log level?
>>>
>>>
>>
>> The log levels are quite standard, I don't feel it would be very clear
>> to have a new level for that. It would also prevent having different
>> log levels inside the data path.
> 
> I am not following you here. Having one more log level for DEBUG in the data 
> path is not a big change and you can still have any other log level in the 
> data or anyplace else for that matter.

Adding a new log level is not a big change, you are right.
But to me it looks confusing to have DEBUG, INFO, ..., WARNING, ERROR,
plus a DEBUG_DATAPATH. For instance, how do you compare levels? Or if
your log stream forwards logs to syslog, you cannot do a 1:1 mapping
with standard syslog levels.

What makes you feel it's easier to add a log level instead of adding a
new RTE_LOG_DP() function?


Regards,
Olivier
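
The difference between the two macros discussed in this thread can be sketched as follows. This is a toy model, not the rte_log implementation: the `TOY_*` names are hypothetical, and the level numbering is chosen only so that 8 is the debug level, matching the --log-level=8 mentioned above.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical levels: larger value means more verbose. */
#define TOY_LVL_ERR   4
#define TOY_LVL_INFO  7
#define TOY_LVL_DEBUG 8

/* Compile-time ceiling, applied only to data-path logs. */
#ifndef TOY_DP_MAX_LEVEL
#define TOY_DP_MAX_LEVEL TOY_LVL_INFO
#endif

int toy_runtime_level = TOY_LVL_INFO; /* what --log-level would set */
unsigned int toy_emitted;             /* number of messages written */

/* Run-time filtered logger: one comparison per call. */
int toy_log(int level, const char *fmt, ...)
{
    va_list ap;
    int n;

    if (level > toy_runtime_level)
        return 0;
    va_start(ap, fmt);
    n = vfprintf(stderr, fmt, ap);
    va_end(ap);
    toy_emitted++;
    return n;
}

/* Control path: always compiled in, filtered only at run time. */
#define TOY_LOG(l, ...) toy_log(TOY_LVL_ ## l, __VA_ARGS__)

/* Data path: the constant comparison lets the compiler drop the call
 * entirely when the level is above the compile-time ceiling. */
#define TOY_LOG_DP(l, ...) \
    (void)((TOY_LVL_ ## l <= TOY_DP_MAX_LEVEL) ? \
        toy_log(TOY_LVL_ ## l, __VA_ARGS__) : 0)
```

In this model, raising `toy_runtime_level` to debug enables `TOY_LOG(DEBUG, ...)` messages without a rebuild, while `TOY_LOG_DP(DEBUG, ...)` stays compiled out. The set of levels is unchanged, so a 1:1 mapping to syslog levels remains possible.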

[dpdk-dev] [PATCH v5 4/4] examples/l2fwd-crypto: updated example for libcrypto PMD

2016-10-03 Thread Slawomir Mrozowicz

Libcrypto PMD has support for:

Supported cipher algorithms:
RTE_CRYPTO_CIPHER_3DES_CBC
RTE_CRYPTO_CIPHER_AES_CBC
RTE_CRYPTO_CIPHER_AES_CTR
RTE_CRYPTO_CIPHER_3DES_CTR
RTE_CRYPTO_CIPHER_AES_GCM

Supported authentication algorithms:
RTE_CRYPTO_AUTH_AES_GMAC
RTE_CRYPTO_AUTH_MD5
RTE_CRYPTO_AUTH_SHA1
RTE_CRYPTO_AUTH_SHA224
RTE_CRYPTO_AUTH_SHA256
RTE_CRYPTO_AUTH_SHA384
RTE_CRYPTO_AUTH_SHA512
RTE_CRYPTO_AUTH_MD5_HMAC
RTE_CRYPTO_AUTH_SHA1_HMAC
RTE_CRYPTO_AUTH_SHA224_HMAC
RTE_CRYPTO_AUTH_SHA256_HMAC
RTE_CRYPTO_AUTH_SHA384_HMAC
RTE_CRYPTO_AUTH_SHA512_HMAC

Signed-off-by: Daniel Mrzyglod 
---
v3:
- change description
---
 examples/l2fwd-crypto/main.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/examples/l2fwd-crypto/main.c b/examples/l2fwd-crypto/main.c
index 0593734..dae45f5 100644
--- a/examples/l2fwd-crypto/main.c
+++ b/examples/l2fwd-crypto/main.c
@@ -340,15 +340,22 @@ fill_supported_algorithm_tables(void)
strcpy(supported_auth_algo[i], "NOT_SUPPORTED");

strcpy(supported_auth_algo[RTE_CRYPTO_AUTH_AES_GCM], "AES_GCM");
+   strcpy(supported_auth_algo[RTE_CRYPTO_AUTH_AES_GMAC], "AES_GMAC");
strcpy(supported_auth_algo[RTE_CRYPTO_AUTH_MD5_HMAC], "MD5_HMAC");
+   strcpy(supported_auth_algo[RTE_CRYPTO_AUTH_MD5], "MD5");
strcpy(supported_auth_algo[RTE_CRYPTO_AUTH_NULL], "NULL");
strcpy(supported_auth_algo[RTE_CRYPTO_AUTH_AES_XCBC_MAC],
"AES_XCBC_MAC");
strcpy(supported_auth_algo[RTE_CRYPTO_AUTH_SHA1_HMAC], "SHA1_HMAC");
+   strcpy(supported_auth_algo[RTE_CRYPTO_AUTH_SHA1], "SHA1");
strcpy(supported_auth_algo[RTE_CRYPTO_AUTH_SHA224_HMAC], "SHA224_HMAC");
+   strcpy(supported_auth_algo[RTE_CRYPTO_AUTH_SHA224], "SHA224");
strcpy(supported_auth_algo[RTE_CRYPTO_AUTH_SHA256_HMAC], "SHA256_HMAC");
+   strcpy(supported_auth_algo[RTE_CRYPTO_AUTH_SHA256], "SHA256");
strcpy(supported_auth_algo[RTE_CRYPTO_AUTH_SHA384_HMAC], "SHA384_HMAC");
+   strcpy(supported_auth_algo[RTE_CRYPTO_AUTH_SHA384], "SHA384");
strcpy(supported_auth_algo[RTE_CRYPTO_AUTH_SHA512_HMAC], "SHA512_HMAC");
+   strcpy(supported_auth_algo[RTE_CRYPTO_AUTH_SHA512], "SHA512");
strcpy(supported_auth_algo[RTE_CRYPTO_AUTH_SNOW3G_UIA2], "SNOW3G_UIA2");
strcpy(supported_auth_algo[RTE_CRYPTO_AUTH_ZUC_EIA3], "ZUC_EIA3");
strcpy(supported_auth_algo[RTE_CRYPTO_AUTH_KASUMI_F9], "KASUMI_F9");
@@ -363,6 +370,8 @@ fill_supported_algorithm_tables(void)
strcpy(supported_cipher_algo[RTE_CRYPTO_CIPHER_SNOW3G_UEA2], 
"SNOW3G_UEA2");
strcpy(supported_cipher_algo[RTE_CRYPTO_CIPHER_ZUC_EEA3], "ZUC_EEA3");
strcpy(supported_cipher_algo[RTE_CRYPTO_CIPHER_KASUMI_F8], "KASUMI_F8");
+   strcpy(supported_cipher_algo[RTE_CRYPTO_CIPHER_3DES_CTR], "3DES_CTR");
+   strcpy(supported_cipher_algo[RTE_CRYPTO_CIPHER_3DES_CBC], "3DES_CBC");
 }


-- 
2.5.0
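
The table filled in by this patch maps enum values to names, with "NOT_SUPPORTED" as the default for entries nobody set. A minimal sketch of that pattern, plus a reverse lookup, is given below. The enum, table and function names are hypothetical, and the assumption that such a table is used to match names given on the command line is mine, not something stated in the patch.

```c
#include <string.h>

/* Hypothetical, shortened algorithm list. */
enum toy_auth_algo {
    TOY_AUTH_NULL,
    TOY_AUTH_MD5,
    TOY_AUTH_MD5_HMAC,
    TOY_AUTH_SHA1,
    TOY_AUTH_RESERVED, /* deliberately left without a name */
    TOY_AUTH_LIST_END
};

#define TOY_NAME_LEN 32
static char toy_auth_name[TOY_AUTH_LIST_END][TOY_NAME_LEN];

/* Default every entry, then overwrite the ones that are supported. */
void toy_fill_auth_names(void)
{
    unsigned int i;

    for (i = 0; i < TOY_AUTH_LIST_END; i++)
        strcpy(toy_auth_name[i], "NOT_SUPPORTED");

    strcpy(toy_auth_name[TOY_AUTH_NULL], "NULL");
    strcpy(toy_auth_name[TOY_AUTH_MD5], "MD5");
    strcpy(toy_auth_name[TOY_AUTH_MD5_HMAC], "MD5_HMAC");
    strcpy(toy_auth_name[TOY_AUTH_SHA1], "SHA1");
}

/* Reverse lookup: return the enum value for a name, or -1 if the name
 * is unknown or belongs to an entry that was never filled in. */
int toy_parse_auth_algo(const char *arg)
{
    unsigned int i;

    for (i = 0; i < TOY_AUTH_LIST_END; i++) {
        if (strcmp(toy_auth_name[i], "NOT_SUPPORTED") == 0)
            continue;
        if (strcmp(toy_auth_name[i], arg) == 0)
            return (int)i;
    }
    return -1;
}
```

In a lookup of this kind, an algorithm that exists in the enum but has no table entry cannot be selected by name, which is why a patch adding driver support also needs to add the corresponding strings.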

[dpdk-dev] [PATCH v5 3/4] app/test: added tests for libcrypto PMD

2016-10-03 Thread Slawomir Mrozowicz

This patch contains unit tests for the libcrypto PMD. Users can
run the app/test application to see how to use this PMD and to
verify crypto processing.

Test name is cryptodev_libcrypto_autotest.
For performance test cryptodev_libcrypto_perftest can be used.

Signed-off-by: Piotr Azarewicz 
Signed-off-by: Marcin Kerlin 
Signed-off-by: Daniel Mrzyglod 
---
v2:
- rename AES-named functions to blockcipher
- replace different test cases with blockcipher functions pattern
- add 3DES tests into QuickAssist PMD testsuite

v3:
- add negative verification tests
- add big data test

v4:
- move aes test rework to another patch
- move big data test to another patch
- checking if libcrypto pmd is available

v5:
- add reduced big data test
---
 app/test/test_cryptodev.c   | 1495 +--
 app/test/test_cryptodev.h   |1 +
 app/test/test_cryptodev_aes_test_vectors.h  |  304 +-
 app/test/test_cryptodev_blockcipher.c   |   22 +
 app/test/test_cryptodev_blockcipher.h   |1 +
 app/test/test_cryptodev_des_test_vectors.h  |  952 +
 app/test/test_cryptodev_gcm_test_vectors.h  |   36 +-
 app/test/test_cryptodev_hash_test_vectors.h |  491 +
 app/test/test_cryptodev_perf.c  |  689 +++-
 9 files changed, 3911 insertions(+), 80 deletions(-)
 create mode 100644 app/test/test_cryptodev_des_test_vectors.h
 create mode 100644 app/test/test_cryptodev_hash_test_vectors.h

diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index c46db94..54982d2 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -45,6 +45,8 @@

 #include "test_cryptodev_blockcipher.h"
 #include "test_cryptodev_aes_test_vectors.h"
+#include "test_cryptodev_des_test_vectors.h"
+#include "test_cryptodev_hash_test_vectors.h"
 #include "test_cryptodev_kasumi_test_vectors.h"
 #include "test_cryptodev_kasumi_hash_test_vectors.h"
 #include "test_cryptodev_snow3g_test_vectors.h"
@@ -167,7 +169,7 @@ testsuite_setup(void)
/* Not already created so create */
ts_params->mbuf_pool = rte_pktmbuf_pool_create(
"CRYPTO_MBUFPOOL",
-   NUM_MBUFS, MBUF_CACHE_SIZE, 0, MBUF_SIZE,
+   NUM_MBUFS, MBUF_CACHE_SIZE, 0, UINT16_MAX,
rte_socket_id());
if (ts_params->mbuf_pool == NULL) {
RTE_LOG(ERR, USER1, "Can't create CRYPTO_MBUFPOOL\n");
@@ -308,6 +310,26 @@ testsuite_setup(void)
}
}

+   /* Create 2 LIBCRYPTO devices if required */
+   if (gbl_cryptodev_type == RTE_CRYPTODEV_LIBCRYPTO_PMD) {
+#ifndef RTE_LIBRTE_PMD_LIBCRYPTO
+   RTE_LOG(ERR, USER1, "CONFIG_RTE_LIBRTE_PMD_LIBCRYPTO must be"
+   " enabled in config file to run this testsuite.\n");
+   return TEST_FAILED;
+#endif
+   nb_devs = 
rte_cryptodev_count_devtype(RTE_CRYPTODEV_LIBCRYPTO_PMD);
+   if (nb_devs < 2) {
+   for (i = nb_devs; i < 2; i++) {
+   ret = rte_eal_vdev_init(
+   RTE_STR(CRYPTODEV_NAME_LIBCRYPTO_PMD), 
NULL);
+
+   TEST_ASSERT(ret == 0,
+   "Failed to create instance %u of pmd : 
%s",
+   i, 
RTE_STR(CRYPTODEV_NAME_LIBCRYPTO_PMD));
+   }
+   }
+   }
+
 #ifndef RTE_LIBRTE_PMD_QAT
if (gbl_cryptodev_type == RTE_CRYPTODEV_QAT_SYM_PMD) {
RTE_LOG(ERR, USER1, "CONFIG_RTE_LIBRTE_PMD_QAT must be enabled "
@@ -877,6 +899,315 @@ static const uint8_t 
catch_22_quote_2_512_bytes_AES_CBC_HMAC_SHA1_digest[] = {
0x18, 0x8c, 0x1d, 0x32
 };

+
+/* Multisession Vector context Test */
+/*Begin Session 0 */
+static uint8_t ms_aes_cbc_key0[] = {
+   0xf0, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xf7,
+   0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff
+};
+
+static uint8_t ms_aes_cbc_iv0[] = {
+   0xf0, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xf7,
+   0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff
+};
+
+static const uint8_t ms_aes_cbc_cipher0[] = {
+   0x3C, 0xE4, 0xEE, 0x42, 0xB6, 0x9B, 0xC3, 0x38,
+   0x5F, 0xAD, 0x54, 0xDC, 0xA8, 0x32, 0x81, 0xDC,
+   0x7A, 0x6F, 0x85, 0x58, 0x07, 0x35, 0xED, 0xEB,
+   0xAD, 0x79, 0x79, 0x96, 0xD3, 0x0E, 0xA6, 0xD9,
+   0xAA, 0x86, 0xA4, 0x8F, 0xB5, 0xD6, 0x6E, 0x6D,
+   0x0C, 0x91, 0x2F, 0xC4, 0x67, 0x98, 0x0E, 0xC4,
+   0x8D, 0x83, 0x68, 0x69, 0xC4, 0xD3, 0x94, 0x34,
+   0xC4, 0x5D, 0x60, 0x55, 0x22, 0x87, 0x8F, 0x6F,
+   0x17, 0x8E, 0x75, 0xE4, 0x02, 0xF5, 0x1B, 0x99,
+   0xC8, 0x39, 0xA9, 0xAB, 0x23, 0x91, 0x12, 0xED,
+   0x08, 0xE7, 0xD9, 0x25, 0x89, 0x24, 0x4F, 0x8D,
+   0x68

[dpdk-dev] [PATCH v2 1/2] mempool: fix comments for mempool create functions

2016-10-03 Thread Olivier Matz



On 09/28/2016 03:59 PM, Ferruh Yigit wrote:
> Fixes: 85226f9c526b ("mempool: introduce a function to create an empty pool")
> Fixes: d1d914ebbc25 ("mempool: allocate in several memory chunks by default")
> 
> Signed-off-by: Ferruh Yigit 
> ---

Series:
Acked-by: Olivier Matz 

Thanks

[dpdk-dev] [PATCH v5 2/4] app/test: cryptodev AES tests rework

2016-10-03 Thread Slawomir Mrozowicz

This patch reworks the AES tests.
In general, AES-named functions are renamed to follow the blockcipher functions pattern.

Signed-off-by: Piotr Azarewicz 
Signed-off-by: Fiona Trahe 
---
 app/test/Makefile  |2 +-
 app/test/test_cryptodev.c  |   74 +-
 app/test/test_cryptodev_aes.c  |  687 -
 app/test/test_cryptodev_aes.h  | 1124 
 app/test/test_cryptodev_aes_test_vectors.h |  797 
 app/test/test_cryptodev_blockcipher.c  |  509 +
 app/test/test_cryptodev_blockcipher.h  |  124 +++
 7 files changed, 1478 insertions(+), 1839 deletions(-)
 delete mode 100644 app/test/test_cryptodev_aes.c
 delete mode 100644 app/test/test_cryptodev_aes.h
 create mode 100644 app/test/test_cryptodev_aes_test_vectors.h
 create mode 100644 app/test/test_cryptodev_blockcipher.c
 create mode 100644 app/test/test_cryptodev_blockcipher.h

diff --git a/app/test/Makefile b/app/test/Makefile
index 611d77a..5be023a 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -193,7 +193,7 @@ endif
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_RING) += test_pmd_ring.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_RING) += test_pmd_ring_perf.c

-SRCS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += test_cryptodev_aes.c
+SRCS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += test_cryptodev_blockcipher.c
 SRCS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += test_cryptodev_perf.c
 SRCS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += test_cryptodev.c

diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index 9d7caba..c46db94 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -43,7 +43,8 @@
 #include "test.h"
 #include "test_cryptodev.h"

-#include "test_cryptodev_aes.h"
+#include "test_cryptodev_blockcipher.h"
+#include "test_cryptodev_aes_test_vectors.h"
 #include "test_cryptodev_kasumi_test_vectors.h"
 #include "test_cryptodev_kasumi_hash_test_vectors.h"
 #include "test_cryptodev_snow3g_test_vectors.h"
@@ -86,12 +87,16 @@ struct crypto_unittest_params {
  */
 static int
 test_AES_CBC_HMAC_SHA512_decrypt_create_session_params(
-   struct crypto_unittest_params *ut_params);
+   struct crypto_unittest_params *ut_params, uint8_t *cipher_key,
+   uint8_t *hmac_key);

 static int
 test_AES_CBC_HMAC_SHA512_decrypt_perform(struct rte_cryptodev_sym_session 
*sess,
struct crypto_unittest_params *ut_params,
-   struct crypto_testsuite_params *ts_param);
+   struct crypto_testsuite_params *ts_param,
+   const uint8_t *cipher,
+   const uint8_t *digest,
+   const uint8_t *iv);

 static struct rte_mbuf *
 setup_test_string(struct rte_mempool *mpool,
@@ -313,7 +318,7 @@ testsuite_setup(void)

nb_devs = rte_cryptodev_count();
if (nb_devs < 1) {
-   RTE_LOG(ERR, USER1, "No crypto devices found?");
+   RTE_LOG(ERR, USER1, "No crypto devices found?\n");
return TEST_FAILED;
}

@@ -872,7 +877,6 @@ static const uint8_t 
catch_22_quote_2_512_bytes_AES_CBC_HMAC_SHA1_digest[] = {
0x18, 0x8c, 0x1d, 0x32
 };

-
 static int
 test_AES_CBC_HMAC_SHA1_encrypt_digest(void)
 {
@@ -1003,17 +1007,24 @@ static const uint8_t 
catch_22_quote_2_512_bytes_AES_CBC_HMAC_SHA512_digest[] = {

 static int
 test_AES_CBC_HMAC_SHA512_decrypt_create_session_params(
-   struct crypto_unittest_params *ut_params);
+   struct crypto_unittest_params *ut_params,
+   uint8_t *cipher_key,
+   uint8_t *hmac_key);

 static int
 test_AES_CBC_HMAC_SHA512_decrypt_perform(struct rte_cryptodev_sym_session 
*sess,
struct crypto_unittest_params *ut_params,
-   struct crypto_testsuite_params *ts_params);
+   struct crypto_testsuite_params *ts_params,
+   const uint8_t *cipher,
+   const uint8_t *digest,
+   const uint8_t *iv);


 static int
 test_AES_CBC_HMAC_SHA512_decrypt_create_session_params(
-   struct crypto_unittest_params *ut_params)
+   struct crypto_unittest_params *ut_params,
+   uint8_t *cipher_key,
+   uint8_t *hmac_key)
 {

/* Setup Cipher Parameters */
@@ -1022,7 +1033,7 @@ test_AES_CBC_HMAC_SHA512_decrypt_create_session_params(

ut_params->cipher_xform.cipher.algo = RTE_CRYPTO_CIPHER_AES_CBC;
ut_params->cipher_xform.cipher.op = RTE_CRYPTO_CIPHER_OP_DECRYPT;
-   ut_params->cipher_xform.cipher.key.data = aes_cbc_key;
+   ut_params->cipher_xform.cipher.key.data = cipher_key;
ut_params->cipher_xform.cipher.key.length = CIPHER_KEY_LENGTH_AES_CBC;

/* Setup HMAC Parameters */
@@ -1031,7 +1042,7 @@ test_AES_CBC_HMAC_SHA512_decrypt_create_session_params(

ut_params->auth_xform.auth.op = RTE_CRYPTO_AUTH_OP_VERIFY;
ut_params->auth_xform.auth.algo = RTE_CRYPTO_AUTH_SHA512_HMAC;
-   ut_params->auth_xform.auth.key.d

[dpdk-dev] [PATCH] log: do not drop debug logs at compile time

2016-10-03 Thread Olivier Matz

Hi Keith,

On 09/30/2016 05:48 PM, Wiles, Keith wrote:
>> On Sep 30, 2016, at 4:33 AM, Thomas Monjalon  
>> wrote:
>>
>> 2016-09-16 09:43, Olivier Matz:
>>> Today, all logs whose level is lower than INFO are dropped at
>>> compile-time. This prevents from enabling debug logs at runtime using
>>> --log-level=8.
>>>
>>> The rationale was to remove debug logs from the data path at
>>> compile-time, avoiding a test at run-time.
>>>
>>> This patch changes the behavior of RTE_LOG() to avoid the compile-time
>>> optimization, and introduces the RTE_LOG_DP() macro that has the same
>>> behavior than the previous RTE_LOG(), for the rare cases where debug
>>> logs are in the data path.
>>>
>>> So it is now possible to enable debug logs at run-time by just
>>> specifying --log-level=8. Some drivers still have special compile-time
>>> options to enable more debug log. Maintainers may consider to
>>> remove/reduce them.
>>>
>>> Signed-off-by: Olivier Matz 
>>
>> I think it is a good change.
>> However I'm not sure we should take it for 16.11 as it was sent late and
>> there is no review comment.
>> It is neither really a fix nor really a feature.
>> If there are some +1, and no opinions against, it will go in 16.11.
>> Note that some drivers would need some changes to fully benefit of
>> debug logs enabled at run-time.
> 
> Would this be easier to add a new LOG level instead say DEBUG_DATAPATH and 
> then change the RTE_LOG to exclude the new log level?
> 
> 

The log levels are quite standard, so I don't think a new level for that
would be very clear. It would also prevent having different log levels
inside the data path.

Regards,
Olivier
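
For readers following the thread, the difference between the two macros can be
sketched as below. This is only an illustration: the SK_* names, the level
numbers and the build-time ceiling are assumptions made for the example and
are not DPDK's actual definitions.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical levels; a higher number means a more verbose message. */
enum { SK_LOG_ERR = 4, SK_LOG_INFO = 7, SK_LOG_DEBUG = 8 };

/* Assumed build-time ceiling, standing in for a compile-time log level. */
#define SK_BUILD_LEVEL SK_LOG_INFO

static int sk_runtime_level = SK_LOG_INFO; /* what --log-level would set */
static int sk_emitted;                     /* messages actually printed */

static int sk_log(int level, const char *fmt, ...)
{
	va_list ap;
	int ret;

	if (level > sk_runtime_level)
		return 0; /* run-time filter */
	va_start(ap, fmt);
	ret = vfprintf(stderr, fmt, ap);
	va_end(ap);
	sk_emitted++;
	return ret;
}

/* Control path: always compiled in, filtered at run time only. */
#define SK_LOG(level, ...) sk_log(level, __VA_ARGS__)

/*
 * Data path: the comparison uses two constants, so the compiler can
 * drop the call entirely when the level is above the build ceiling.
 */
#define SK_LOG_DP(level, ...) \
	(void)(((level) <= SK_BUILD_LEVEL) ? sk_log(level, __VA_ARGS__) : 0)
```

With the run-time level raised to SK_LOG_DEBUG, SK_LOG(SK_LOG_DEBUG, ...) prints
while SK_LOG_DP(SK_LOG_DEBUG, ...) stays silent, because the latter was already
removed at build time.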

[dpdk-dev] [PATCH v5 1/4] libcrypto_pmd: initial implementation of SW crypto device

2016-10-03 Thread Slawomir Mrozowicz

This code provides the initial implementation of the libcrypto
poll mode driver. All cryptography operations use the OpenSSL
library crypto API. Each algorithm uses the EVP_ interface from
the OpenSSL API, which is recommended by the OpenSSL maintainers.

This patch adds libcrypto poll mode driver support to librte_cryptodev
library.

Signed-off-by: Slawomir Mrozowicz 
Signed-off-by: Michal Kobylinski 
Signed-off-by: Tomasz Kulasek 
Signed-off-by: Daniel Mrzyglod 
---
v2:
- add gcm crypto cipher and authentication algorithm
- rework gmac crypto authentication algorithm

v3:
- fix pmd according to negative verification tests
- change gmac aad max size
- update documentation
---
 MAINTAINERS|4 +
 config/common_base |6 +
 doc/guides/cryptodevs/index.rst|1 +
 doc/guides/cryptodevs/libcrypto.rst|  116 +++
 doc/guides/rel_notes/release_16_11.rst |   23 +-
 drivers/crypto/Makefile|1 +
 drivers/crypto/libcrypto/Makefile  |   60 ++
 drivers/crypto/libcrypto/rte_libcrypto_pmd.c   | 1051 
 drivers/crypto/libcrypto/rte_libcrypto_pmd_ops.c   |  708 +
 .../crypto/libcrypto/rte_libcrypto_pmd_private.h   |  174 
 .../crypto/libcrypto/rte_pmd_libcrypto_version.map |3 +
 lib/librte_cryptodev/rte_cryptodev.h   |5 +-
 mk/rte.app.mk  |   23 +-
 13 files changed, 2162 insertions(+), 13 deletions(-)
 create mode 100644 doc/guides/cryptodevs/libcrypto.rst
 create mode 100644 drivers/crypto/libcrypto/Makefile
 create mode 100644 drivers/crypto/libcrypto/rte_libcrypto_pmd.c
 create mode 100644 drivers/crypto/libcrypto/rte_libcrypto_pmd_ops.c
 create mode 100644 drivers/crypto/libcrypto/rte_libcrypto_pmd_private.h
 create mode 100644 drivers/crypto/libcrypto/rte_pmd_libcrypto_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 58a10b8..1e9d1f8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -439,6 +439,10 @@ M: Declan Doherty 
 F: drivers/crypto/null/
 F: doc/guides/cryptodevs/null.rst

+LibCrypto Crypto PMD
+M: Declan Doherty 
+F: drivers/crypto/libcrypto/
+F: doc/guides/cryptodevs/libcrypto.rst

 Packet processing
 -
diff --git a/config/common_base b/config/common_base
index 3a412ee..87b8646 100644
--- a/config/common_base
+++ b/config/common_base
@@ -376,6 +376,12 @@ CONFIG_RTE_LIBRTE_PMD_AESNI_MB=n
 CONFIG_RTE_LIBRTE_PMD_AESNI_MB_DEBUG=n

 #
+# Compile PMD for Software backed device
+#
+CONFIG_RTE_LIBRTE_PMD_LIBCRYPTO=n
+CONFIG_RTE_LIBRTE_PMD_LIBCRYPTO_DEBUG=n
+
+#
 # Compile PMD for AESNI GCM device
 #
 CONFIG_RTE_LIBRTE_PMD_AESNI_GCM=n
diff --git a/doc/guides/cryptodevs/index.rst b/doc/guides/cryptodevs/index.rst
index 906f1b4..bae8e53 100644
--- a/doc/guides/cryptodevs/index.rst
+++ b/doc/guides/cryptodevs/index.rst
@@ -39,6 +39,7 @@ Crypto Device Drivers
 aesni_mb
 aesni_gcm
 kasumi
+libcrypto
 null
 snow3g
 qat
diff --git a/doc/guides/cryptodevs/libcrypto.rst b/doc/guides/cryptodevs/libcrypto.rst
new file mode 100644
index 000..77eff95
--- /dev/null
+++ b/doc/guides/cryptodevs/libcrypto.rst
@@ -0,0 +1,116 @@
+..  BSD LICENSE
+Copyright(c) 2016 Intel Corporation. All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+
+* Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in
+the documentation and/or other materials provided with the
+distribution.
+* Neither the name of Intel Corporation nor the names of its
+contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+LibCrypto Crypto Poll Mode Driver
+
+This code provides the initial implementation of the libcrypto poll mode
+dri

[dpdk-dev] [PATCH] log: do not drop debug logs at compile time

2016-10-03 Thread Wiles, Keith

Regards,
Keith

> On Oct 3, 2016, at 10:37 AM, Olivier Matz  wrote:
> 
> 
> 
> On 10/03/2016 05:27 PM, Wiles, Keith wrote:
>> 
>> Regards,
>> Keith
>> 
>>> On Oct 3, 2016, at 10:02 AM, Olivier Matz  wrote:
>>> 
>>> Hi Keith,
>>> 
>>> On 09/30/2016 05:48 PM, Wiles, Keith wrote:
>>>>> On Sep 30, 2016, at 4:33 AM, Thomas Monjalon wrote:
>>>>> 
>>>>> 2016-09-16 09:43, Olivier Matz:
>>>>>> Today, all logs whose level is lower than INFO are dropped at
>>>>>> compile-time. This prevents from enabling debug logs at runtime using
>>>>>> --log-level=8.
>>>>>> 
>>>>>> The rationale was to remove debug logs from the data path at
>>>>>> compile-time, avoiding a test at run-time.
>>>>>> 
>>>>>> This patch changes the behavior of RTE_LOG() to avoid the compile-time
>>>>>> optimization, and introduces the RTE_LOG_DP() macro that has the same
>>>>>> behavior than the previous RTE_LOG(), for the rare cases where debug
>>>>>> logs are in the data path.
>>>>>> 
>>>>>> So it is now possible to enable debug logs at run-time by just
>>>>>> specifying --log-level=8. Some drivers still have special compile-time
>>>>>> options to enable more debug log. Maintainers may consider to
>>>>>> remove/reduce them.
>>>>>> 
>>>>>> Signed-off-by: Olivier Matz 
>>>>> 
>>>>> I think it is a good change.
>>>>> However I'm not sure we should take it for 16.11 as it was sent late and
>>>>> there is no review comment.
>>>>> It is neither really a fix nor really a feature.
>>>>> If there are some +1, and no opinions against, it will go in 16.11.
>>>>> Note that some drivers would need some changes to fully benefit of
>>>>> debug logs enabled at run-time.
>>>> 
>>>> Would this be easier to add a new LOG level instead say DEBUG_DATAPATH and 
>>>> then change the RTE_LOG to exclude the new log level?
>>>> 
>>> 
>>> The log levels are quite standard, I don't feel it would be very clear
>>> to have a new level for that. It would also prevent to have different
>>> log level inside data path.
>> 
>> I am not following you here. Having one more log level for DEBUG in the data 
>> path is not a big change and you can still have any other log level in the 
>> data or anyplace else for that matter.
> 
> Adding a new log level is not a big change, you are right.
> But to me it looks confusing to have DEBUG, INFO, ..., WARNING, ERROR,
> plus a DEBUG_DATAPATH. For instance, how do you compare levels? Or if
> your log stream forwards logs to syslog, you cannot do a 1:1 mapping
> with standard syslog levels.

Doing 1:1 mapping is not a big problem unless you are trying to compare to old
logs, which may be the case the first time; after that it should not be a
problem.

> 
> What makes you feel it's easier to add a log level instead of adding a
> new RTE_LOG_DP() function?

It seems to me the log levels are for displaying logs at different levels.
Adding a new macro to not log is just a hack because we do not have a log level
for data path. This is why I would like to see a log level added and not a new
macro.

It also appears the new RTE_LOG() will always be in the code as you moved the 
test to the RTE_LOG_DP() macro. This would mean all RTE_LOG() in the code will 
always call rte_log(), correct?

If using a new DEBUG_DP (maybe DATAPATH is a better log level name) level we
can use the same macro as before and modify the level only. This way we can
remove via the compiler any log that is below the default RTE_LOG_LEVEL. I see
keeping the rte_log() could be a performance problem or code bloat when you
really want to remove them all.

The DATAPATH log level would be above (smaller number) than DEBUG in the enum
list. To remove all debug logs just set the RTE_LOG_LEVEL to RTE_LOG_DATAPATH.

> 
> 
> Regards,
> Olivier

[dpdk-dev] [PATCH v5 0/4] new crypto software based device

2016-10-03 Thread Slawomir Mrozowicz

This code provides the initial implementation of the libcrypto poll mode driver.
All cryptography operations use the OpenSSL library crypto API.
Each algorithm uses the EVP_ interface from the OpenSSL API, which is recommended
by the OpenSSL maintainers.

For more information about how to use this driver, go to:
doc/guides/cryptodevs/libcrypto.rst

Changes in V5:
- reduce source of big data test

Changes in V4:
- move aes test rework to another patch
- move big data test to another patch
- checking if libcrypto pmd is available

Changes in V3:
- add negative verification tests
- add big data test
- fix pmd according to negative verification tests
- change gmac aad max size
- update documentation and commits comments

Changes in V2:
- add gcm/gmac algorithm correction
- unit test rework

Slawomir Mrozowicz (1):
  libcrypto_pmd: initial implementation of SW crypto device

Piotr Azarewicz (2):
  app/test: cryptodev AES tests rework
  app/test: added tests for libcrypto PMD

Daniel Mrzyglod (1):
  examples/l2fwd-crypto: updated example for libcrypto PMD

 MAINTAINERS|4 +
 app/test/Makefile  |2 +-
 app/test/test_cryptodev.c  | 1581 ++--
 app/test/test_cryptodev.h  |1 +
 app/test/test_cryptodev_aes.c  |  687 -
 app/test/test_cryptodev_aes.h  | 1124 --
 app/test/test_cryptodev_aes_test_vectors.h | 1095 ++
 app/test/test_cryptodev_blockcipher.c  |  531 +++
 app/test/test_cryptodev_blockcipher.h  |  125 ++
 app/test/test_cryptodev_des_test_vectors.h |  952 
 app/test/test_cryptodev_gcm_test_vectors.h |   36 +-
 app/test/test_cryptodev_hash_test_vectors.h|  491 ++
 app/test/test_cryptodev_perf.c |  689 -
 config/common_base |6 +
 doc/guides/cryptodevs/index.rst|1 +
 doc/guides/cryptodevs/libcrypto.rst|  116 ++
 doc/guides/rel_notes/release_16_11.rst |   23 +-
 drivers/crypto/Makefile|1 +
 drivers/crypto/libcrypto/Makefile  |   60 +
 drivers/crypto/libcrypto/rte_libcrypto_pmd.c   | 1051 +
 drivers/crypto/libcrypto/rte_libcrypto_pmd_ops.c   |  708 +
 .../crypto/libcrypto/rte_libcrypto_pmd_private.h   |  174 +++
 .../crypto/libcrypto/rte_pmd_libcrypto_version.map |3 +
 examples/l2fwd-crypto/main.c   |9 +
 lib/librte_cryptodev/rte_cryptodev.h   |5 +-
 mk/rte.app.mk  |   23 +-
 26 files changed, 7563 insertions(+), 1935 deletions(-)
 delete mode 100644 app/test/test_cryptodev_aes.c
 delete mode 100644 app/test/test_cryptodev_aes.h
 create mode 100644 app/test/test_cryptodev_aes_test_vectors.h
 create mode 100644 app/test/test_cryptodev_blockcipher.c
 create mode 100644 app/test/test_cryptodev_blockcipher.h
 create mode 100644 app/test/test_cryptodev_des_test_vectors.h
 create mode 100644 app/test/test_cryptodev_hash_test_vectors.h
 create mode 100644 doc/guides/cryptodevs/libcrypto.rst
 create mode 100644 drivers/crypto/libcrypto/Makefile
 create mode 100644 drivers/crypto/libcrypto/rte_libcrypto_pmd.c
 create mode 100644 drivers/crypto/libcrypto/rte_libcrypto_pmd_ops.c
 create mode 100644 drivers/crypto/libcrypto/rte_libcrypto_pmd_private.h
 create mode 100644 drivers/crypto/libcrypto/rte_pmd_libcrypto_version.map

-- 
2.5.0

[dpdk-dev] [PATCH v11 07/24] driver: probe/remove common wrappers for PCI drivers

2016-10-03 Thread Thomas Monjalon

2016-09-20 18:11, Shreyansh Jain:
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -4372,6 +4372,19 @@ rte_eth_dev_get_port_by_name(const char *name, uint8_t *port_id);
>  int
>  rte_eth_dev_get_name_by_port(uint8_t port_id, char *name);
>  
> +/**
> + * Wrapper for use by pci drivers as a .probe function to attach to a ethdev
> + * interface.
> + */
> +int rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
> +   struct rte_pci_device *pci_dev);
> +
> +/**
> + * Wrapper for use by pci drivers as a .remove function to detach a ethdev
> + * interface.
> + */
> +int rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev);

These functions are used by the drivers only (as helpers).
So they should be marked @internal (added after applying the patch).

[dpdk-dev] [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature

2016-10-03 Thread Maxime Coquelin



On 09/29/2016 10:21 PM, Michael S. Tsirkin wrote:
> On Thu, Sep 29, 2016 at 10:05:22PM +0200, Maxime Coquelin wrote:
>> >
>> >
>> > On 09/29/2016 07:57 PM, Michael S. Tsirkin wrote:
>>> > > On Thu, Sep 29, 2016 at 05:30:53PM +0200, Maxime Coquelin wrote:
>> > ...
>>>> > > >
>>>> > > > Before enabling anything by default, we should first optimize the 1 slot
>>>> > > > case. Indeed, micro-benchmark using testpmd in txonly[0] shows ~17%
>>>> > > > perf regression for 64 bytes case:
>>>> > > >  - 2 descs per packet: 11.6Mpps
>>>> > > >  - 1 desc per packet: 9.6Mpps
>>>> > > >
>>>> > > > This is due to the virtio header clearing in virtqueue_enqueue_xmit().
>>>> > > > Removing it, we get better results than with 2 descs (1.20Mpps).
>>>> > > > Since the Virtio PMD doesn't support offloads, I wonder whether we can
>>>> > > > just drop the memset?
>>> > >
>>> > > What will happen? Will the header be uninitialized?
>> > Yes..
>> > I didn't look closely at the spec, but just looked at DPDK's and Linux
>> > vhost implementations. IIUC, the header is just skipped in the two
>> > implementations.
> In linux guest skbs are initialized AFAIK. See virtio_net_hdr_from_skb
> first thing it does is
> memset(hdr, 0, sizeof(*hdr));
>
>
>
>>> > >
>>> > > The spec says:
>>> > > The driver can send a completely checksummed packet. In this 
>>> > > case, flags
>>> > > will be zero, and gso_type
>>> > > will be VIRTIO_NET_HDR_GSO_NONE.
>>> > >
>>> > > and
>>> > > The driver MUST set num_buffers to zero.
>>> > > If VIRTIO_NET_F_CSUM is not negotiated, the driver MUST set 
>>> > > flags to
>>> > > zero and SHOULD supply a fully
>>> > > checksummed packet to the device.
>>> > >
>>> > > and
>>> > > If none of the VIRTIO_NET_F_HOST_TSO4, TSO6 or UFO options have 
>>> > > been
>>> > > negotiated, the driver MUST
>>> > > set gso_type to VIRTIO_NET_HDR_GSO_NONE.
>>> > >
>>> > > so doing this unconditionally would be a spec violation, but if you see
>>> > > value in this, we can add a feature bit.
>> > Right it would be a spec violation, so it should be done conditionally.
>> > If a feature bit is to be added, what about VIRTIO_NET_F_NO_TX_HEADER?
>> > It would imply VIRTIO_NET_F_CSUM not set, and no GSO features set.
>> > If negotiated, we wouldn't need to prepend a header.
> Yes but two points.
>
> 1. why is this memset expensive? Is the test completely skipping looking
>at the packet otherwise?
>
> 2. As long as we are doing this, see
>   Alignment vs. Networking
>   
> in Documentation/unaligned-memory-access.txt

This change will not have an impact on the IP header alignment,
as its offset in the mbuf will not change.

Regards,
Maxime

[dpdk-dev] [PATCH 1/1 v2] eal: Fix misleading error messages, errno can't be trusted.

2016-10-03 Thread Mcnamara, John



> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jean Tourrilhes
> Sent: Monday, October 3, 2016 4:56 PM
> To: Gonzalez Monroy, Sergio 
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 1/1 v2] eal: Fix misleading error messages,
> errno can't be trusted.
> 
> On Mon, Oct 03, 2016 at 02:25:40PM +0100, Sergio Gonzalez Monroy wrote:
> > Hi Jean,
> >
> > There are some format issues with the patch:
> >
> > You can run scripts/check-git-log.sh to check them:
> > Wrong headline format:
> > eal: Fix misleading error messages, errno can't be trusted.
> > Wrong headline uppercase:
> > eal: Fix misleading error messages, errno can't be trusted.
> > Missing 'Fixes' tag:
> > eal: Fix misleading error messages, errno can't be trusted.
> >
> > The script's output highlights the different issues.
> 
>   SOrry about that, I casually read the page on http://dpdk.org/dev,
> but obviously I need to look at it again.

The longer, more detailed version is here: "Contributing Code to DPDK":

http://dpdk.org/doc/guides/contributing/patches.html

John

[dpdk-dev] [PATCH] eal: check cpu flags at init

2016-10-03 Thread Thomas Monjalon

2016-09-29 16:42, Aaron Conole:
> Flavio Leitner  writes:
> 
> > On Mon, Sep 26, 2016 at 11:43:37AM -0400, Aaron Conole wrote:
> >> My only concern is whether this change would be considered ABI
> >> breaking.  I wouldn't think so, since it doesn't seem as though an
> >> application would want to call this explicitly (and is spelled out as
> >> such), but I can't be sure that it isn't already included in the
> >> standard application API, and therefore needs to go through the change
> >> process.
> >
> > I didn't want to change the original behavior more than needed.
> >
> > I think another patch would be necessary to change the whole EAL
> > initialization because there's a bunch of rte_panic() there which
> > aren't friendly with callers either.

Yes please, we need to remove all those panic/exit calls.

> Okay makes sense.
> 
> Acked-by: Aaron Conole 

Applied, thanks

[dpdk-dev] [PATCH] eal: remove single file segments related code

2016-10-03 Thread Thomas Monjalon

2016-09-30 15:48, Sergio Gonzalez Monroy:
> On 30/09/2016 15:32, David Marchand wrote:
> > On Fri, Sep 23, 2016 at 12:08 PM, Tan, Jianfeng wrote:
> >>> -----Original Message-----
> >>> From: Yuanhan Liu [mailto:yuanhan.liu at linux.intel.com]
> >>> Sent: Friday, September 23, 2016 5:15 PM
> >>> To: Thomas Monjalon
> >>> Cc: dev at dpdk.org; David Marchand; Tan, Jianfeng
> >>> Subject: Re: [PATCH] eal: remove single file segments related code
> >>>
> >>> On Fri, Sep 23, 2016 at 10:50:06AM +0200, Thomas Monjalon wrote:
> >>>> 2016-09-23 15:10, Yuanhan Liu:
> >>>>> Commit c711ccb30987 ("ivshmem: remove library and its EAL integration")
> >>>>> removed ivshmem support, but seems David forgot to remove the another
> >>>>> piece of code: code for RTE_EAL_SINGLE_FILE_SEGMENTS, which is introduced
> >>>>> when ivshmem was firstly added.
> >>>> It is not a mistake. We thought it is used by container use case.
> >>> I think no. It would help the container case a bit, but not too much I
> >>> would think, especially when the memory goes fragement.
> >>>
> >>> Jianfeng, IIRC, you don't use that option for container case, right?
> >>>
> >>>--yliu
> >> No, I don't use this option for container case. As yuanhan said, it cannot 
> >> provide much help for virtio_user memory region number limitation.
> > Ok, as said, since this feature had been introduced with ivshmem
> > 40b966a211ab ("ivshmem: library changes for mmaping using ivshmem"),
> > if Sergio has nothing against this removal, I am all for removing
> > unused code.
> 
> I certainly do not have anything against this removal :)
> 
> Acked-by: Sergio Gonzalez Monroy 

Applied, thanks

[dpdk-dev] [PATCH] eal: fix crash on mmap error in rte_eal_hugepage_attach()

2016-10-03 Thread Thomas Monjalon

2016-10-03 14:04, Sergio Gonzalez Monroy:
> On 28/09/2016 11:52, maciej.czekaj at caviumnetworks.com wrote:
> > From: Maciej Czekaj 
> >
> > In ASLR-enabled system, it is possible that selected
> > virtual space is occupied by program segments. Therefore,
> > error path should not blindly unmap all memory segments
> > but only those already mapped.
> >
> > Steps that lead to crash:
> > 1. memseg 0 in secondary process overlaps
> > with libc.so
> > 2. mmap of /dev/zero fails for virtual space of memseg 0
> > 3. munmap of memseg 0 leads to unmapping libc.so itself
> > 4. app gets SIGSEGV after returning from syscall to libc
> >
> > Fixes: ea329d7f8e34 ("mem: fix leak after mapping failure")
> >
> > Signed-off-by: Maciej Czekaj 
> > ---
> >   lib/librte_eal/linuxapp/eal/eal_memory.c | 11 ++-
> >   1 file changed, 6 insertions(+), 5 deletions(-)
> 
> Acked-by: Sergio Gonzalez Monroy 

Applied, thanks

[dpdk-dev] [PATCH v1 0/4] Generalize PCI specific EAL function/structures

2016-10-03 Thread Thomas Monjalon

2016-10-03 11:07, Shreyansh Jain:
> Hi David,
> 
> On Friday 30 September 2016 09:01 PM, David Marchand wrote:
> > On Tue, Sep 27, 2016 at 4:12 PM, Shreyansh Jain wrote:
> >> (I rebased these over HEAD 7b3c4f3)
> >>
> >> These patches were initially part of Jan's original series on SoC
> >> Framework ([1],[2]). An update to that series, without these patches,
> >> was posted here [3].
> >>
> >> Main motivation for these is aim of introducing a non-PCI centric
> >> subsystem in EAL. As of now the first usecase is SoC, but not limited to
> >> it.
> >>
> >> 4 patches in this series are independent of each other, as well as SoC
> >> framework. All these focus on generalizing some structure or functions
> >> present with the PCI specific code to EAL Common area (or splitting a
> >> function to be more useful).
> >
> > Those patches move linux specifics (binding pci devices using sysfs)
> > to common infrastructure.
> > We have no proper hotplug support on bsd, but if we had some common
> > code we should at least try to make the apis generic.
> >
> 
> I am not sure if I understood your point well. Just to confirm - you are 
> stating that the movement done in the patches might not suit BSD. 
> Probably you are talking about (Patch 3/4 and 4/4).
> Is my understanding correct?
> 
> So, movement to just Linux area is not enough?
> I am not well versed with BSD way of doing something similar so if 
> someone can point it out, I can integrate that. (I will investigate it 
> at my end as well).
> 
> This patchset makes the PCI->EAL movement *only* for Linux for sysfs 
> bind/unbind. (I should add this to cover letter, at the least).

The concern is about function declarations in
lib/librte_eal/common/eal_private.h
We cannot be sure they are applicable to anything other than Linux.
As they are implemented for Linux only, they should not be in a common header.

[dpdk-dev] [PATCH v3 1/2] librte_ether: add internal callback functions

2016-10-03 Thread Iremonger, Bernard

Hi Stephen,

From: Stephen Hemminger [mailto:step...@networkplumber.org]
Sent: Sunday, October 2, 2016 10:13 AM
To: Iremonger, Bernard 
Cc: dev at dpdk.org; Lu, Wenzhuo ; jerin.jacob at caviumnetworks.com; az5157 at att.com; Shah, Rahul R 
Subject: Re: [dpdk-dev] [PATCH v3 1/2] librte_ether: add internal callback functions

I know callbacks are needed, in fact even more are necessary. That is why I
don't like this design. It expands the API for each event. I think something
like the Linux kernel netlink callback mechanism, which passes an event and a
device handle, would be better.

The current rte_eth_dev_callback_register() function takes a parameter void
*cb_arg. This allows a parameter to be passed to the callback. For the events
RTE_ETH_EVENT_QUEUE_STATE and RTE_ETH_EVENT_INTR_RESET the callback parameter is
not used. In some cases the callback parameter is used for
RTE_ETH_EVENT_INTR_LSC.

This patch adds a new event RTE_ETH_VF_MBOX and a parameter for this event,
struct rte_eth_mb_event_param{}. This parameter is only used with the
RTE_ETH_VF_MBOX event and does not affect the other events.

The struct rte_eth_mb_event_param{} should probably not be in rte_eth_dev.h, I
will send a v4.

Regards,

Bernard.
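
As a rough illustration of the two ideas discussed above (one callback type
that receives the event, plus an optional per-event parameter), a minimal
sketch could look like this. Every name here (sk_event, sk_mbox_param,
sk_register, sk_dispatch) is hypothetical; it is not the librte_ether API and
not the layout of struct rte_eth_mb_event_param.

```c
#include <stddef.h>

enum sk_event {
	SK_EVENT_LSC,
	SK_EVENT_QUEUE_STATE,
	SK_EVENT_VF_MBOX,
	SK_EVENT_MAX
};

/* Hypothetical payload, only meaningful for SK_EVENT_VF_MBOX. */
struct sk_mbox_param {
	int vfid;
	int retval;
};

/* One callback type for all events: event id, user arg, event payload. */
typedef void (*sk_cb_fn)(int port, enum sk_event ev, void *cb_arg, void *param);

struct sk_cb {
	sk_cb_fn fn;
	void *cb_arg;
};

static struct sk_cb sk_cbs[SK_EVENT_MAX];

static int sk_register(enum sk_event ev, sk_cb_fn fn, void *cb_arg)
{
	if ((unsigned int)ev >= SK_EVENT_MAX || fn == NULL)
		return -1;
	sk_cbs[ev].fn = fn;
	sk_cbs[ev].cb_arg = cb_arg;
	return 0;
}

/* param may be NULL for events that carry no payload. */
static void sk_dispatch(int port, enum sk_event ev, void *param)
{
	if ((unsigned int)ev < SK_EVENT_MAX && sk_cbs[ev].fn != NULL)
		sk_cbs[ev].fn(port, ev, sk_cbs[ev].cb_arg, param);
}

/* Example handler: records the VF id and accepts the request. */
static void sk_on_mbox(int port, enum sk_event ev, void *cb_arg, void *param)
{
	struct sk_mbox_param *p = param;
	int *last_vf = cb_arg;

	(void)port;
	(void)ev;
	*last_vf = p->vfid;
	p->retval = 0;
}
```

Because the event id travels with the call, adding a new event only extends
the enum; the registration and dispatch functions stay unchanged.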

[dpdk-dev] [PATCH] log: do not drop debug logs at compile time

2016-10-03 Thread Wiles, Keith

Regards,
Keith

> On Oct 3, 2016, at 10:02 AM, Olivier Matz  wrote:
> 
> Hi Keith,
> 
> On 09/30/2016 05:48 PM, Wiles, Keith wrote:
>>> On Sep 30, 2016, at 4:33 AM, Thomas Monjalon wrote:
>>> 
>>> 2016-09-16 09:43, Olivier Matz:
>>>> Today, all logs whose level is lower than INFO are dropped at
>>>> compile-time. This prevents from enabling debug logs at runtime using
>>>> --log-level=8.
>>>> 
>>>> The rationale was to remove debug logs from the data path at
>>>> compile-time, avoiding a test at run-time.
>>>> 
>>>> This patch changes the behavior of RTE_LOG() to avoid the compile-time
>>>> optimization, and introduces the RTE_LOG_DP() macro that has the same
>>>> behavior than the previous RTE_LOG(), for the rare cases where debug
>>>> logs are in the data path.
>>>> 
>>>> So it is now possible to enable debug logs at run-time by just
>>>> specifying --log-level=8. Some drivers still have special compile-time
>>>> options to enable more debug log. Maintainers may consider to
>>>> remove/reduce them.
>>>> 
>>>> Signed-off-by: Olivier Matz 
>>> 
>>> I think it is a good change.
>>> However I'm not sure we should take it for 16.11 as it was sent late and
>>> there is no review comment.
>>> It is neither really a fix nor really a feature.
>>> If there are some +1, and no opinions against, it will go in 16.11.
>>> Note that some drivers would need some changes to fully benefit of
>>> debug logs enabled at run-time.
>> 
>> Would this be easier to add a new LOG level instead say DEBUG_DATAPATH and 
>> then change the RTE_LOG to exclude the new log level?
>> 
>> 
> 
> The log levels are quite standard, I don't feel it would be very clear
> to have a new level for that. It would also prevent to have different
> log level inside data path.

I am not following you here. Having one more log level for DEBUG in the data 
path is not a big change and you can still have any other log level in the data 
or anyplace else for that matter.

> 
> Regards,
> Olivier

[dpdk-dev] [PATCH v1 0/4] Generalize PCI specific EAL function/structures

2016-10-03 Thread Jan Viktorin

On Mon, 3 Oct 2016 11:07:43 +0530
Shreyansh Jain  wrote:

> Hi David,
> 
> On Friday 30 September 2016 09:01 PM, David Marchand wrote:
> > On Tue, Sep 27, 2016 at 4:12 PM, Shreyansh Jain wrote:
> >> (I rebased these over HEAD 7b3c4f3)
> >>
> >> These patches were initially part of Jan's original series on SoC
> >> Framework ([1],[2]). An update to that series, without these patches,
> >> was posted here [3].
> >>
> >> Main motivation for these is aim of introducing a non-PCI centric
> >> subsystem in EAL. As of now the first usecase is SoC, but not limited to
> >> it.
> >>
> >> 4 patches in this series are independent of each other, as well as SoC
> >> framework. All these focus on generalizing some structure or functions
> >> present with the PCI specific code to EAL Common area (or splitting a
> >> function to be more useful).
> >
> > Those patches move linux specifics (binding pci devices using sysfs)
> > to common infrastructure.
> > We have no proper hotplug support on bsd, but if we had some common
> > code we should at least try to make the apis generic.
> >  
> 
> I am not sure if I understood your point well. Just to confirm - you are 

I don't clearly see the point, either.

Jan

> stating that the movement done in the patches might not suit BSD. 
> Probably you are talking about (Patch 3/4 and 4/4).
> Is my understanding correct?
> 
> So, movement to just Linux area is not enough?
> I am not well versed with BSD way of doing something similar so if 
> someone can point it out, I can integrate that. (I will investigate it 
> at my end as well).
> 
> This patchset makes the PCI->EAL movement *only* for Linux for sysfs 
> bind/unbind. (I should add this to cover letter, at the least).
> 
> -
> Shreyansh



-- 
   Jan Viktorin  E-mail: Viktorin at RehiveTech.com
   System Architect  Web:www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic

[dpdk-dev] [PATCH v2 09/12] virtio: add Rx checksum offload support

2016-10-03 Thread Maxime Coquelin

Hi Olivier,


On 10/03/2016 11:00 AM, Olivier Matz wrote:
> Signed-off-by: Olivier Matz 
> ---
>  drivers/net/virtio/virtio_ethdev.c | 14 
>  drivers/net/virtio/virtio_ethdev.h |  2 +-
>  drivers/net/virtio/virtio_rxtx.c   | 69 
> ++
>  drivers/net/virtio/virtqueue.h |  1 +
>  4 files changed, 78 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
> index fa56032..43cb096 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -1262,7 +1262,7 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
>   eth_dev->data->dev_flags = dev_flags;
>
>   /* reset device and negotiate default features */
> - ret = virtio_init_device(eth_dev, VIRTIO_PMD_GUEST_FEATURES);
> + ret = virtio_init_device(eth_dev, VIRTIO_PMD_DEFAULT_GUEST_FEATURES);
>   if (ret < 0)
>   return ret;
>
> @@ -1351,13 +1351,10 @@ virtio_dev_configure(struct rte_eth_dev *dev)
>   int ret;
>
>   PMD_INIT_LOG(DEBUG, "configure");
> + req_features = VIRTIO_PMD_DEFAULT_GUEST_FEATURES;
> + if (rxmode->hw_ip_checksum)
> + req_features |= (1ULL << VIRTIO_NET_F_GUEST_CSUM);
>
> - if (rxmode->hw_ip_checksum) {
> - PMD_DRV_LOG(ERR, "HW IP checksum not supported");
> - return -EINVAL;
> - }
> -
> - req_features = VIRTIO_PMD_GUEST_FEATURES;
>   /* if request features changed, reinit the device */
>   if (req_features != hw->req_guest_features) {
>   ret = virtio_init_device(dev, req_features);
> @@ -1578,6 +1575,9 @@ virtio_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
>   dev_info->default_txconf = (struct rte_eth_txconf) {
>   .txq_flags = ETH_TXQ_FLAGS_NOOFFLOADS
>   };
> + dev_info->rx_offload_capa =
> + DEV_RX_OFFLOAD_TCP_CKSUM |
> + DEV_RX_OFFLOAD_UDP_CKSUM;
>  }
>
>  /*
> diff --git a/drivers/net/virtio/virtio_ethdev.h b/drivers/net/virtio/virtio_ethdev.h
> index 5d5e788..2fc9218 100644
> --- a/drivers/net/virtio/virtio_ethdev.h
> +++ b/drivers/net/virtio/virtio_ethdev.h
> @@ -54,7 +54,7 @@
>  #define VIRTIO_MAX_RX_PKTLEN  9728
>
>  /* Features desired/implemented by this driver. */
> -#define VIRTIO_PMD_GUEST_FEATURES\
> +#define VIRTIO_PMD_DEFAULT_GUEST_FEATURES\
>   (1u << VIRTIO_NET_F_MAC   | \
>1u << VIRTIO_NET_F_STATUS| \
>1u << VIRTIO_NET_F_MQ| \
> diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
> index 724517e..eda678a 100644
> --- a/drivers/net/virtio/virtio_rxtx.c
> +++ b/drivers/net/virtio/virtio_rxtx.c
> @@ -50,6 +50,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include "virtio_logs.h"
>  #include "virtio_ethdev.h"
> @@ -627,6 +628,56 @@ virtio_update_packet_stats(struct virtnet_stats *stats, struct rte_mbuf *mbuf)
>   }
>  }
>
> +/* Optionally fill offload information in structure */
> +static int
> +virtio_rx_offload(struct rte_mbuf *m, struct virtio_net_hdr *hdr)
> +{
> + struct rte_net_hdr_lens hdr_lens;
> + uint32_t hdrlen, ptype;
> + int l4_supported = 0;
> +
> + /* nothing to do */
> + if (hdr->flags == 0 && hdr->gso_type == VIRTIO_NET_HDR_GSO_NONE)
> + return 0;
Maybe we could first check whether offload features were negotiated?
Doing this, we could return before accessing the header and so avoid a
cache miss.

Maxime
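
The ordering suggested above can be sketched like this, under the assumption
that the negotiated feature word lives in a per-device structure that is
already hot in cache. The sk_* types and the bit number are placeholders for
the example, not the virtio PMD's real definitions.

```c
#include <stdint.h>

/* Placeholder values, assumed for illustration only. */
#define SK_F_GUEST_CSUM 1 /* feature bit position */
#define SK_GSO_NONE     0

struct sk_net_hdr {
	uint8_t flags;
	uint8_t gso_type;
};

struct sk_hw {
	uint64_t negotiated_features;
};

static inline int sk_has_feature(const struct sk_hw *hw, unsigned int bit)
{
	return (hw->negotiated_features & (1ULL << bit)) != 0;
}

/* Returns 1 if the header requests offload processing, 0 otherwise. */
static int sk_rx_needs_offload(const struct sk_hw *hw,
			       const struct sk_net_hdr *hdr)
{
	/* Test the device state first: hdr is not touched on this path. */
	if (!sk_has_feature(hw, SK_F_GUEST_CSUM))
		return 0;

	/* Only now read the per-packet header. */
	if (hdr->flags == 0 && hdr->gso_type == SK_GSO_NONE)
		return 0;

	return 1;
}
```

When the feature was not negotiated, the function returns before the first
read of hdr, which is the access the comment above wants to avoid.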

[dpdk-dev] [PATCH 1/1 v2] eal: Fix misleading error messages, errno can't be trusted.

2016-10-03 Thread Sergio Gonzalez Monroy

Hi Jean,

There are some format issues with the patch:

You can run scripts/check-git-log.sh to check them:
Wrong headline format:
 eal: Fix misleading error messages, errno can't be trusted.
Wrong headline uppercase:
 eal: Fix misleading error messages, errno can't be trusted.
Missing 'Fixes' tag:
 eal: Fix misleading error messages, errno can't be trusted.

The script's output highlights the different issues.


On 21/09/2016 22:10, Jean Tourrilhes wrote:
>   lib/librte_eal/linuxapp/eal/eal.c| 14 +++---
>   lib/librte_eal/linuxapp/eal/eal_memory.c | 16 
>   2 files changed, 23 insertions(+), 7 deletions(-)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal.c 
> b/lib/librte_eal/linuxapp/eal/eal.c
> index 3fb2188..5df9f6a 100644
> --- a/lib/librte_eal/linuxapp/eal/eal.c
> +++ b/lib/librte_eal/linuxapp/eal/eal.c
> @@ -238,7 +238,8 @@ rte_eal_config_attach(void)
>   mem_config = (struct rte_mem_config *) mmap(NULL, sizeof(*mem_config),
>   PROT_READ, MAP_SHARED, mem_cfg_fd, 0);
>   if (mem_config == MAP_FAILED)
> - rte_panic("Cannot mmap memory for rte_config\n");
> + rte_panic("Cannot mmap memory for rte_config! error %i (%s)\n",
> +   errno, strerror(errno));
>   
>   rte_config.mem_config = mem_config;
>   }
> @@ -263,9 +264,16 @@ rte_eal_config_reattach(void)
>   mem_config = (struct rte_mem_config *) mmap(rte_mem_cfg_addr,
>   sizeof(*mem_config), PROT_READ | PROT_WRITE, MAP_SHARED,
>   mem_cfg_fd, 0);
> + if (mem_config == MAP_FAILED || mem_config != rte_mem_cfg_addr) {
> + if (mem_config != MAP_FAILED)
> + /* errno is stale, don't use */
> + rte_panic("Cannot mmap memory for rte_config at [%p], 
> got [%p] - please use '--base-virtaddr' option\n",
> +   rte_mem_cfg_addr, mem_config);
> + else
> + rte_panic("Cannot mmap memory for rte_config! error %i 
> (%s)\n",
> +   errno, strerror(errno));
> + }
>   close(mem_cfg_fd);
> - if (mem_config == MAP_FAILED || mem_config != rte_mem_cfg_addr)
> - rte_panic("Cannot mmap memory for rte_config\n");
>   

NIT but any reason you moved the check before closing the file 
descriptor? (not that it matters with current code as we panic anyway)
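
For illustration, the two failure modes the patch separates can be sketched as a small classifier. The enum and function names are hypothetical. One possible reason to check before close() is that a later call may modify errno, but that is an assumption and not something stated in the patch.

```c
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical result codes for an mmap() call made with an address hint. */
enum map_result {
	MAP_RES_OK,         /* mapped at the requested address */
	MAP_RES_ERRNO,      /* mmap() failed, errno is meaningful */
	MAP_RES_WRONG_ADDR  /* mmap() succeeded elsewhere, errno is stale */
};

/* Classify the return value of mmap() against the address that was wanted. */
static enum map_result
classify_mmap(const void *wanted, const void *got)
{
	if (got == MAP_FAILED)
		return MAP_RES_ERRNO;
	if (wanted != NULL && got != wanted)
		return MAP_RES_WRONG_ADDR;
	return MAP_RES_OK;
}
```

Only in the `MAP_RES_ERRNO` case would strerror(errno) describe the actual problem; in the `MAP_RES_WRONG_ADDR` case the two addresses are the useful information.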

Thanks,
Sergio

>   rte_config.mem_config = mem_config;
>   }
> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
> b/lib/librte_eal/linuxapp/eal/eal_memory.c
> index 41e0a92..b036ffc 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
> @@ -1615,10 +1615,18 @@ rte_eal_hugepage_attach(void)
>PROT_READ, MAP_PRIVATE, fd_zero, 0);
>   if (base_addr == MAP_FAILED ||
>   base_addr != mcfg->memseg[s].addr) {
> - RTE_LOG(ERR, EAL, "Could not mmap %llu bytes "
> - "in /dev/zero to requested address [%p]: 
> '%s'\n",
> - (unsigned long long)mcfg->memseg[s].len,
> - mcfg->memseg[s].addr, strerror(errno));
> + if (base_addr != MAP_FAILED)
> + /* errno is stale, don't use */
> + RTE_LOG(ERR, EAL, "Could not mmap %llu bytes "
> + "in /dev/zero at [%p], got [%p] - "
> + "please use '--base-virtaddr' option\n",
> + (unsigned long long)mcfg->memseg[s].len,
> + mcfg->memseg[s].addr, base_addr);
> + else
> + RTE_LOG(ERR, EAL, "Could not mmap %llu bytes "
> + "in /dev/zero at [%p]: '%s'\n",
> + (unsigned long long)mcfg->memseg[s].len,
> + mcfg->memseg[s].addr, strerror(errno));
>   if (aslr_enabled() > 0) {
>   RTE_LOG(ERR, EAL, "It is recommended to "
>   "disable ASLR in the kernel "

[dpdk-dev] [PATCH] eal: fix crash on mmap error in rte_eal_hugepage_attach()

2016-10-03 Thread Sergio Gonzalez Monroy

On 28/09/2016 11:52, maciej.czekaj at caviumnetworks.com wrote:
> From: Maciej Czekaj 
>
> In ASLR-enabled system, it is possible that selected
> virtual space is occupied by program segments. Therefore,
> error path should not blindly unmap all memory segments
> but only those already mapped.
>
> Steps that lead to crash:
> 1. memseg 0 in secondary process overlaps
> with libc.so
> 2. mmap of /dev/zero fails for virtual space of memseg 0
> 3. munmap of memseg 0 leads to unmapping libc.so itself
> 4. app gets SIGSEGV after returning from syscall to libc
>
> Fixes: ea329d7f8e34 ("mem: fix leak after mapping failure")
>
> Signed-off-by: Maciej Czekaj 
> ---
>   lib/librte_eal/linuxapp/eal/eal_memory.c | 11 ++-
>   1 file changed, 6 insertions(+), 5 deletions(-)

Acked-by: Sergio Gonzalez Monroy
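
The rule described in the commit message (roll back only what was actually mapped) can be sketched as below. This is not the code from the patch, whose diff is not included in this message; `struct seg`, `rollback_mapped()` and `fake_unmap()` are hypothetical names, and a callback stands in for munmap() so the logic can be exercised without touching real mappings.

```c
#include <stddef.h>

/* Hypothetical descriptor for one memory segment. */
struct seg {
	void *addr;
	size_t len;
};

typedef int (*unmap_fn)(void *addr, size_t len);

/*
 * Error path: segment 'failed_idx' could not be mapped, so only segments
 * [0, failed_idx) belong to us. Unmapping anything else could remove an
 * unrelated mapping that happens to occupy the same virtual range.
 * Returns the number of segments successfully unmapped.
 */
static size_t
rollback_mapped(const struct seg *segs, size_t failed_idx, unmap_fn unmap)
{
	size_t i, n = 0;

	for (i = 0; i < failed_idx; i++) {
		if (unmap(segs[i].addr, segs[i].len) == 0)
			n++;
	}
	return n;
}

/* Stand-in for munmap() that only counts how often it is called. */
static size_t fake_unmap_calls;

static int
fake_unmap(void *addr, size_t len)
{
	(void)addr;
	(void)len;
	fake_unmap_calls++;
	return 0;
}
```

With `failed_idx == 0`, which corresponds to the crash scenario in the steps above, nothing is unmapped at all.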

[dpdk-dev] [PATCH] l2fwd:mac learning

2016-10-03 Thread Rafat Jahan

Added MAC learning to reduce load at l2

Signed-off-by: Rafat Jahan 
---
 examples/l2fwd-mac/Makefile |   50 ++
 examples/l2fwd-mac/main.c   | 1325 +++
 2 files changed, 1375 insertions(+)
 create mode 100644 examples/l2fwd-mac/Makefile
 create mode 100644 examples/l2fwd-mac/main.c

diff --git a/examples/l2fwd-mac/Makefile b/examples/l2fwd-mac/Makefile
new file mode 100644
index 000..6ab93f4
--- /dev/null
+++ b/examples/l2fwd-mac/Makefile
@@ -0,0 +1,50 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overriden by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = l2fwd-mac
+
+# all source are stored in SRCS-y
+SRCS-y := main.c
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/l2fwd-mac/main.c b/examples/l2fwd-mac/main.c
new file mode 100644
index 000..33d6a6e
--- /dev/null
+++ b/examples/l2fwd-mac/main.c
@@ -0,0 +1,1325 @@
+/*-thread created and in which two seperate threads for updation and checking 
are created infinately
+
+   final working code
+
+
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#inclu

[dpdk-dev] [PATCH v1 0/4] Generalize PCI specific EAL function/structures

2016-10-03 Thread Shreyansh Jain

Hi David,

On Friday 30 September 2016 09:01 PM, David Marchand wrote:
> On Tue, Sep 27, 2016 at 4:12 PM, Shreyansh Jain  
> wrote:
>> (I rebased these over HEAD 7b3c4f3)
>>
>> These patches were initially part of Jan's original series on SoC
>> Framework ([1],[2]). An update to that series, without these patches,
>> was posted here [3].
>>
>> Main motivation for these is aim of introducing a non-PCI centric
>> subsystem in EAL. As of now the first usecase is SoC, but not limited to
>> it.
>>
>> 4 patches in this series are independent of each other, as well as SoC
>> framework. All these focus on generalizing some structure or functions
>> present with the PCI specific code to EAL Common area (or splitting a
>> function to be more useful).
>
> Those patches move linux specifics (binding pci devices using sysfs)
> to common infrastructure.
> We have no proper hotplug support on bsd, but if we had some common
> code we should at least try to make the apis generic.
>

I am not sure if I understood your point well. Just to confirm - you are 
stating that the movement done in the patches might not suit BSD. 
Probably you are talking about (Patch 3/4 and 4/4).
Is my understanding correct?

So, movement to just Linux area is not enough?
I am not well versed with BSD way of doing something similar so if 
someone can point it out, I can integrate that. (I will investigate it 
at my end as well).

This patchset makes the PCI->EAL movement *only* for Linux for sysfs 
bind/unbind. (I should add this to cover letter, at the least).
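
To make the Linux-specific part concrete, here is a small sketch of how a sysfs bind/unbind path is typically formed. The helper name is hypothetical and this is not code from the patchset; the path layout is the usual Linux one for PCI drivers.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build "/sys/bus/pci/drivers/<driver>/bind" or ".../unbind" into buf.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int
pci_sysfs_bind_path(char *buf, size_t buflen, const char *driver, int unbind)
{
	int n = snprintf(buf, buflen, "/sys/bus/pci/drivers/%s/%s",
			 driver, unbind ? "unbind" : "bind");

	if (n < 0 || (size_t)n >= buflen)
		return -1;
	return 0;
}
```

Binding then amounts to writing the device address string into that file, which is the part with no direct BSD equivalent.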

-
Shreyansh

[dpdk-dev] [PATCH v2 0/8] Misc enhancements in testpmd

2016-10-03 Thread Olivier Matz

Hello,

On 09/09/2016 09:55 AM, Olivier Matz wrote:
> This patchset introduces several enhancements or minor fixes
> in testpmd. It is targeted for v16.11, and applies on top of
> software ptype v2 patchset [1].
> 
> These patches are useful to validate the virtio offload
> patchset [2] (to be rebased).
> 
> [1] http://dpdk.org/ml/archives/dev/2016-August/045876.html
> [2] http://dpdk.org/ml/archives/dev/2016-July/044404.html
> 
> changes v1 -> v2:
> - rebase on top of sw ptype v2 patch

Any comment on this patchset?


Thanks,
Olivier

[dpdk-dev] [PATCH v2 12/12] virtio: add Tso support

2016-10-03 Thread Olivier Matz

Signed-off-by: Olivier Matz 
---
 drivers/net/virtio/virtio_ethdev.c |   6 ++
 drivers/net/virtio/virtio_ethdev.h |   2 +
 drivers/net/virtio/virtio_rxtx.c   | 129 -
 3 files changed, 134 insertions(+), 3 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index fd33364..5728ca1 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1563,6 +1563,7 @@ virtio_dev_link_update(struct rte_eth_dev *dev, 
__rte_unused int wait_to_complet
 static void
 virtio_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 {
+   uint64_t tso_mask;
struct virtio_hw *hw = dev->data->dev_private;

if (dev->pci_dev)
@@ -1590,6 +1591,11 @@ virtio_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
DEV_TX_OFFLOAD_UDP_CKSUM |
DEV_TX_OFFLOAD_TCP_CKSUM;
}
+
+   tso_mask = (1ULL << VIRTIO_NET_F_HOST_TSO4) |
+   (1ULL << VIRTIO_NET_F_HOST_TSO6);
+   if ((hw->guest_features & tso_mask) == tso_mask)
+   dev_info->tx_offload_capa |= DEV_TX_OFFLOAD_TCP_TSO;
 }

 /*
diff --git a/drivers/net/virtio/virtio_ethdev.h 
b/drivers/net/virtio/virtio_ethdev.h
index daa6bff..ab3b138 100644
--- a/drivers/net/virtio/virtio_ethdev.h
+++ b/drivers/net/virtio/virtio_ethdev.h
@@ -63,6 +63,8 @@
 1u << VIRTIO_NET_F_CTRL_RX   | \
 1u << VIRTIO_NET_F_CTRL_VLAN | \
 1u << VIRTIO_NET_F_CSUM  | \
+1u << VIRTIO_NET_F_HOST_TSO4 | \
+1u << VIRTIO_NET_F_HOST_TSO6 | \
 1u << VIRTIO_NET_F_MRG_RXBUF | \
 1ULL << VIRTIO_F_VERSION_1)

diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 0464bd1..134995e 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -51,6 +51,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 

 #include "virtio_logs.h"
 #include "virtio_ethdev.h"
@@ -209,6 +211,111 @@ virtqueue_enqueue_recv_refill(struct virtqueue *vq, 
struct rte_mbuf *cookie)
return 0;
 }

+/* When doing TSO, the IP length is not included in the pseudo header
+ * checksum of the packet given to the PMD, but for virtio it is
+ * expected.
+ */
+static void
+virtio_tso_fix_cksum(struct rte_mbuf *m)
+{
+   /* common case: header is not fragmented */
+   if (likely(rte_pktmbuf_data_len(m) >= m->l2_len + m->l3_len +
+   m->l4_len)) {
+   struct ipv4_hdr *iph;
+   struct ipv6_hdr *ip6h;
+   struct tcp_hdr *th;
+   uint16_t prev_cksum, new_cksum, ip_len, ip_paylen;
+   uint32_t tmp;
+
+   iph = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *, m->l2_len);
+   th = RTE_PTR_ADD(iph, m->l3_len);
+   if ((iph->version_ihl >> 4) == 4) {
+   iph->hdr_checksum = 0;
+   iph->hdr_checksum = rte_ipv4_cksum(iph);
+   ip_len = iph->total_length;
+   ip_paylen = rte_cpu_to_be_16(rte_be_to_cpu_16(ip_len) -
+   m->l3_len);
+   } else {
+   ip6h = (struct ipv6_hdr *)iph;
+   ip_paylen = ip6h->payload_len;
+   }
+
+   /* calculate the new phdr checksum not including ip_paylen */
+   prev_cksum = th->cksum;
+   tmp = prev_cksum;
+   tmp += ip_paylen;
+   tmp = (tmp & 0xffff) + (tmp >> 16);
+   new_cksum = tmp;
+
+   /* replace it in the packet */
+   th->cksum = new_cksum;
+   } else {
+   const struct ipv4_hdr *iph;
+   struct ipv4_hdr iph_copy;
+   union {
+   uint16_t u16;
+   uint8_t u8[2];
+   } prev_cksum, new_cksum, ip_len, ip_paylen, ip_csum;
+   uint32_t tmp;
+
+   /* Same code than above, but we use rte_pktmbuf_read()
+* or we read/write in mbuf data one byte at a time to
+* avoid issues if the packet is multi segmented.
+*/
+
+   uint8_t ip_version;
+
+   ip_version = *rte_pktmbuf_mtod_offset(m, uint8_t *,
+   m->l2_len) >> 4;
+
+   /* calculate ip checksum (API imposes to set it to 0)
+* and get ip payload len */
+   if (ip_version == 4) {
+   *rte_pktmbuf_mtod_offset(m, uint8_t *,
+   m->l2_len + 10) = 0;
+   *rte_pktmbuf_mtod_offset(m, uint8_t *,
+   m->l2_len + 11) = 0;
+   iph = rte_pktmbuf_read(m, m->l2_len,
+   sizeof(*iph), &iph_copy);
+   ip_cs
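
The core arithmetic of the checksum fix-up (adding a 16-bit value into an existing one's-complement sum and folding the carry) can be sketched on its own. The function name is hypothetical and the sketch ignores byte order, which the patch handles separately.

```c
#include <stdint.h>

/*
 * Add 'value' to a 16-bit one's-complement sum. Any carry out of bit 15
 * is folded back into the low 16 bits (end-around carry).
 */
static inline uint16_t
csum16_add(uint16_t cksum, uint16_t value)
{
	uint32_t tmp = (uint32_t)cksum + value;

	tmp = (tmp & 0xffff) + (tmp >> 16);
	return (uint16_t)tmp;
}
```

A single fold is enough here because the sum of two 16-bit values is at most 0x1fffe, which folds to 0xffff without producing a further carry.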

[dpdk-dev] [PATCH v2 11/12] virtio: add Lro support

2016-10-03 Thread Olivier Matz

Signed-off-by: Olivier Matz 
---
 drivers/net/virtio/virtio_ethdev.c |  7 ++-
 drivers/net/virtio/virtio_ethdev.h |  9 -
 drivers/net/virtio/virtio_rxtx.c   | 21 +
 3 files changed, 27 insertions(+), 10 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index 55024cd..fd33364 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1354,6 +1354,10 @@ virtio_dev_configure(struct rte_eth_dev *dev)
req_features = VIRTIO_PMD_DEFAULT_GUEST_FEATURES;
if (rxmode->hw_ip_checksum)
req_features |= (1ULL << VIRTIO_NET_F_GUEST_CSUM);
+   if (rxmode->enable_lro)
+   req_features |=
+   (1ULL << VIRTIO_NET_F_GUEST_TSO4) |
+   (1ULL << VIRTIO_NET_F_GUEST_TSO6);

/* if request features changed, reinit the device */
if (req_features != hw->req_guest_features) {
@@ -1577,7 +1581,8 @@ virtio_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
};
dev_info->rx_offload_capa =
DEV_RX_OFFLOAD_TCP_CKSUM |
-   DEV_RX_OFFLOAD_UDP_CKSUM;
+   DEV_RX_OFFLOAD_UDP_CKSUM |
+   DEV_RX_OFFLOAD_TCP_LRO;
dev_info->tx_offload_capa = 0;

if (hw->guest_features & (1ULL << VIRTIO_NET_F_CSUM)) {
diff --git a/drivers/net/virtio/virtio_ethdev.h 
b/drivers/net/virtio/virtio_ethdev.h
index 202aa2e..daa6bff 100644
--- a/drivers/net/virtio/virtio_ethdev.h
+++ b/drivers/net/virtio/virtio_ethdev.h
@@ -116,13 +116,4 @@ uint16_t virtio_xmit_pkts_simple(void *tx_queue, struct 
rte_mbuf **tx_pkts,

 int eth_virtio_dev_init(struct rte_eth_dev *eth_dev);

-/*
- * The VIRTIO_NET_F_GUEST_TSO[46] features permit the host to send us
- * frames larger than 1514 bytes. We do not yet support software LRO
- * via tcp_lro_rx().
- */
-#define VTNET_LRO_FEATURES (VIRTIO_NET_F_GUEST_TSO4 | \
-   VIRTIO_NET_F_GUEST_TSO6 | VIRTIO_NET_F_GUEST_ECN)
-
-
 #endif /* _VIRTIO_ETHDEV_H_ */
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 4ae11e7..0464bd1 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -692,6 +692,27 @@ virtio_rx_offload(struct rte_mbuf *m, struct 
virtio_net_hdr *hdr)
m->ol_flags |= PKT_RX_L4_CKSUM_GOOD;
}

+   /* GSO request, save required information in mbuf */
+   if (hdr->gso_type != VIRTIO_NET_HDR_GSO_NONE) {
+   /* Check unsupported modes */
+   if ((hdr->gso_type & VIRTIO_NET_HDR_GSO_ECN) ||
+   (hdr->gso_size == 0)) {
+   return -EINVAL;
+   }
+
+   /* Update mss lengthes in mbuf */
+   m->tso_segsz = hdr->gso_size;
+   switch (hdr->gso_type & ~VIRTIO_NET_HDR_GSO_ECN) {
+   case VIRTIO_NET_HDR_GSO_TCPV4:
+   case VIRTIO_NET_HDR_GSO_TCPV6:
+   m->ol_flags |= PKT_RX_LRO | \
+   PKT_RX_L4_CKSUM_NONE;
+   break;
+   default:
+   return -EINVAL;
+   }
+   }
+
return 0;
 }

-- 
2.8.1
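
The validation added to virtio_rx_offload() can be sketched as a standalone function. The `GSO_*` macros are local stand-ins for the VIRTIO_NET_HDR_GSO_* constants, with values taken from the virtio specification; the function name and its tri-state return convention are hypothetical.

```c
#include <errno.h>
#include <stdint.h>

/* Local stand-ins for the virtio-net header gso_type values. */
#define GSO_NONE  0x00
#define GSO_TCPV4 0x01
#define GSO_UDP   0x03
#define GSO_TCPV6 0x04
#define GSO_ECN   0x80

/*
 * Returns 1 and stores the segment size if the packet should be flagged
 * as LRO, 0 if the header carries no GSO request, and -EINVAL for modes
 * this sketch does not handle (ECN, zero segment size, non-TCP types).
 */
static int
lro_check(uint8_t gso_type, uint16_t gso_size, uint16_t *segsz)
{
	if (gso_type == GSO_NONE)
		return 0;
	if ((gso_type & GSO_ECN) || gso_size == 0)
		return -EINVAL;

	switch (gso_type & ~GSO_ECN) {
	case GSO_TCPV4:
	case GSO_TCPV6:
		*segsz = gso_size;
		return 1;
	default:
		return -EINVAL;
	}
}
```

Unlike the patch, this sketch writes the segment size only on success; that is a simplification for testing, not a suggested change.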

[dpdk-dev] [PATCH v2 10/12] virtio: add Tx checksum offload support

2016-10-03 Thread Olivier Matz

Signed-off-by: Olivier Matz 
---
 drivers/net/virtio/virtio_ethdev.c |  7 +
 drivers/net/virtio/virtio_ethdev.h |  1 +
 drivers/net/virtio/virtio_rxtx.c   | 57 +-
 3 files changed, 45 insertions(+), 20 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index 43cb096..55024cd 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1578,6 +1578,13 @@ virtio_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
dev_info->rx_offload_capa =
DEV_RX_OFFLOAD_TCP_CKSUM |
DEV_RX_OFFLOAD_UDP_CKSUM;
+   dev_info->tx_offload_capa = 0;
+
+   if (hw->guest_features & (1ULL << VIRTIO_NET_F_CSUM)) {
+   dev_info->tx_offload_capa |=
+   DEV_TX_OFFLOAD_UDP_CKSUM |
+   DEV_TX_OFFLOAD_TCP_CKSUM;
+   }
 }

 /*
diff --git a/drivers/net/virtio/virtio_ethdev.h 
b/drivers/net/virtio/virtio_ethdev.h
index 2fc9218..202aa2e 100644
--- a/drivers/net/virtio/virtio_ethdev.h
+++ b/drivers/net/virtio/virtio_ethdev.h
@@ -62,6 +62,7 @@
 1u << VIRTIO_NET_F_CTRL_VQ   | \
 1u << VIRTIO_NET_F_CTRL_RX   | \
 1u << VIRTIO_NET_F_CTRL_VLAN | \
+1u << VIRTIO_NET_F_CSUM  | \
 1u << VIRTIO_NET_F_MRG_RXBUF | \
 1ULL << VIRTIO_F_VERSION_1)

diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index eda678a..4ae11e7 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -213,13 +213,14 @@ static inline void
 virtqueue_enqueue_xmit(struct virtnet_tx *txvq, struct rte_mbuf *cookie,
   uint16_t needed, int use_indirect, int can_push)
 {
+   struct virtio_tx_region *txr = txvq->virtio_net_hdr_mz->addr;
struct vq_desc_extra *dxp;
struct virtqueue *vq = txvq->vq;
struct vring_desc *start_dp;
uint16_t seg_num = cookie->nb_segs;
uint16_t head_idx, idx;
uint16_t head_size = vq->hw->vtnet_hdr_size;
-   unsigned long offs;
+   struct virtio_net_hdr *hdr;

head_idx = vq->vq_desc_head_idx;
idx = head_idx;
@@ -230,10 +231,9 @@ virtqueue_enqueue_xmit(struct virtnet_tx *txvq, struct 
rte_mbuf *cookie,
start_dp = vq->vq_ring.desc;

if (can_push) {
-   /* put on zero'd transmit header (no offloads) */
-   void *hdr = rte_pktmbuf_prepend(cookie, head_size);
-
-   memset(hdr, 0, head_size);
+   /* prepend cannot fail, checked by caller */
+   hdr = (struct virtio_net_hdr *)
+   rte_pktmbuf_prepend(cookie, head_size);
} else if (use_indirect) {
/* setup tx ring slot to point to indirect
 * descriptor list stored in reserved region.
@@ -241,14 +241,11 @@ virtqueue_enqueue_xmit(struct virtnet_tx *txvq, struct 
rte_mbuf *cookie,
 * the first slot in indirect ring is already preset
 * to point to the header in reserved region
 */
-   struct virtio_tx_region *txr = txvq->virtio_net_hdr_mz->addr;
-
-   offs = idx * sizeof(struct virtio_tx_region)
-   + offsetof(struct virtio_tx_region, tx_indir);
-
-   start_dp[idx].addr  = txvq->virtio_net_hdr_mem + offs;
+   start_dp[idx].addr  = txvq->virtio_net_hdr_mem +
+   RTE_PTR_DIFF(&txr[idx].tx_indir, txr);
start_dp[idx].len   = (seg_num + 1) * sizeof(struct vring_desc);
start_dp[idx].flags = VRING_DESC_F_INDIRECT;
+   hdr = (struct virtio_net_hdr *)&txr[idx].tx_hdr;

/* loop below will fill in rest of the indirect elements */
start_dp = txr[idx].tx_indir;
@@ -257,15 +254,40 @@ virtqueue_enqueue_xmit(struct virtnet_tx *txvq, struct 
rte_mbuf *cookie,
/* setup first tx ring slot to point to header
 * stored in reserved region.
 */
-   offs = idx * sizeof(struct virtio_tx_region)
-   + offsetof(struct virtio_tx_region, tx_hdr);
-
-   start_dp[idx].addr  = txvq->virtio_net_hdr_mem + offs;
+   start_dp[idx].addr  = txvq->virtio_net_hdr_mem +
+   RTE_PTR_DIFF(&txr[idx].tx_hdr, txr);
start_dp[idx].len   = vq->hw->vtnet_hdr_size;
start_dp[idx].flags = VRING_DESC_F_NEXT;
+   hdr = (struct virtio_net_hdr *)&txr[idx].tx_hdr;
+
idx = start_dp[idx].next;
}

+   /* Checksum Offload */
+   switch (cookie->ol_flags & PKT_TX_L4_MASK) {
+   case PKT_TX_UDP_CKSUM:
+   hdr->csum_start = cookie->l2_len + cookie->l3_len;
+   hdr->csum_offset = 6;
+   hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
+
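
The constant 6 used for the UDP csum_offset is the byte offset of the checksum field inside the UDP header. A sketch that derives it with offsetof() instead of a literal is shown below. The struct layouts are local, hypothetical copies of the standard UDP and TCP headers; the TCP value is derived from the standard layout and is not taken from this message, where the diff is cut off.

```c
#include <stddef.h>
#include <stdint.h>

/* Local copies of the standard L4 header layouts (network byte order). */
struct udp_hdr_sketch {
	uint16_t src_port;
	uint16_t dst_port;
	uint16_t dgram_len;
	uint16_t dgram_cksum;
};

struct tcp_hdr_sketch {
	uint16_t src_port;
	uint16_t dst_port;
	uint32_t sent_seq;
	uint32_t recv_ack;
	uint8_t  data_off;
	uint8_t  tcp_flags;
	uint16_t rx_win;
	uint16_t cksum;
	uint16_t tcp_urp;
};

enum l4_kind { L4_KIND_UDP, L4_KIND_TCP };

/* Offset of the checksum field from the start of the L4 header. */
static uint16_t
l4_csum_offset(enum l4_kind kind)
{
	if (kind == L4_KIND_UDP)
		return (uint16_t)offsetof(struct udp_hdr_sketch, dgram_cksum);
	return (uint16_t)offsetof(struct tcp_hdr_sketch, cksum);
}
```

In the virtio header, csum_start would then be the L2 plus L3 header length, and csum_offset the value returned here.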

[dpdk-dev] [PATCH v2 09/12] virtio: add Rx checksum offload support

2016-10-03 Thread Olivier Matz

Signed-off-by: Olivier Matz 
---
 drivers/net/virtio/virtio_ethdev.c | 14 
 drivers/net/virtio/virtio_ethdev.h |  2 +-
 drivers/net/virtio/virtio_rxtx.c   | 69 ++
 drivers/net/virtio/virtqueue.h |  1 +
 4 files changed, 78 insertions(+), 8 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index fa56032..43cb096 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1262,7 +1262,7 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
eth_dev->data->dev_flags = dev_flags;

/* reset device and negotiate default features */
-   ret = virtio_init_device(eth_dev, VIRTIO_PMD_GUEST_FEATURES);
+   ret = virtio_init_device(eth_dev, VIRTIO_PMD_DEFAULT_GUEST_FEATURES);
if (ret < 0)
return ret;

@@ -1351,13 +1351,10 @@ virtio_dev_configure(struct rte_eth_dev *dev)
int ret;

PMD_INIT_LOG(DEBUG, "configure");
+   req_features = VIRTIO_PMD_DEFAULT_GUEST_FEATURES;
+   if (rxmode->hw_ip_checksum)
+   req_features |= (1ULL << VIRTIO_NET_F_GUEST_CSUM);

-   if (rxmode->hw_ip_checksum) {
-   PMD_DRV_LOG(ERR, "HW IP checksum not supported");
-   return -EINVAL;
-   }
-
-   req_features = VIRTIO_PMD_GUEST_FEATURES;
/* if request features changed, reinit the device */
if (req_features != hw->req_guest_features) {
ret = virtio_init_device(dev, req_features);
@@ -1578,6 +1575,9 @@ virtio_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
dev_info->default_txconf = (struct rte_eth_txconf) {
.txq_flags = ETH_TXQ_FLAGS_NOOFFLOADS
};
+   dev_info->rx_offload_capa =
+   DEV_RX_OFFLOAD_TCP_CKSUM |
+   DEV_RX_OFFLOAD_UDP_CKSUM;
 }

 /*
diff --git a/drivers/net/virtio/virtio_ethdev.h 
b/drivers/net/virtio/virtio_ethdev.h
index 5d5e788..2fc9218 100644
--- a/drivers/net/virtio/virtio_ethdev.h
+++ b/drivers/net/virtio/virtio_ethdev.h
@@ -54,7 +54,7 @@
 #define VIRTIO_MAX_RX_PKTLEN  9728

 /* Features desired/implemented by this driver. */
-#define VIRTIO_PMD_GUEST_FEATURES  \
+#define VIRTIO_PMD_DEFAULT_GUEST_FEATURES  \
(1u << VIRTIO_NET_F_MAC   | \
 1u << VIRTIO_NET_F_STATUS| \
 1u << VIRTIO_NET_F_MQ| \
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 724517e..eda678a 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -50,6 +50,7 @@
 #include 
 #include 
 #include 
+#include 

 #include "virtio_logs.h"
 #include "virtio_ethdev.h"
@@ -627,6 +628,56 @@ virtio_update_packet_stats(struct virtnet_stats *stats, 
struct rte_mbuf *mbuf)
}
 }

+/* Optionally fill offload information in structure */
+static int
+virtio_rx_offload(struct rte_mbuf *m, struct virtio_net_hdr *hdr)
+{
+   struct rte_net_hdr_lens hdr_lens;
+   uint32_t hdrlen, ptype;
+   int l4_supported = 0;
+
+   /* nothing to do */
+   if (hdr->flags == 0 && hdr->gso_type == VIRTIO_NET_HDR_GSO_NONE)
+   return 0;
+
+   m->ol_flags |= PKT_RX_IP_CKSUM_UNKNOWN;
+
+   ptype = rte_net_get_ptype(m, &hdr_lens, RTE_PTYPE_ALL_MASK);
+   m->packet_type = ptype;
+   if ((ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_TCP ||
+   (ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_UDP ||
+   (ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_SCTP)
+   l4_supported = 1;
+
+   if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
+   hdrlen = hdr_lens.l2_len + hdr_lens.l3_len + hdr_lens.l4_len;
+   if (hdr->csum_start <= hdrlen && l4_supported) {
+   m->ol_flags |= PKT_RX_L4_CKSUM_NONE;
+   } else {
+   /* Unknown proto or tunnel, do sw cksum. We can assume
+* the cksum field is in the first segment since the
+* buffers we provided to the host are large enough.
+* In case of SCTP, this will be wrong since it's a CRC
+* but there's nothing we can do.
+*/
+   uint16_t csum, off;
+
+   csum = rte_raw_cksum_mbuf(m, hdr->csum_start,
+   rte_pktmbuf_pkt_len(m) - hdr->csum_start);
+   if (csum != 0xffff)
+   csum = ~csum;
+   off = hdr->csum_offset + hdr->csum_start;
+   if (rte_pktmbuf_data_len(m) >= off + 1)
+   *rte_pktmbuf_mtod_offset(m, uint16_t *,
+   off) = csum;
+   }
+   } else if (hdr->flags & VIRTIO_NET_HDR_F_DATA_VALID && l4_supported) {
+   m->ol_flags |= PKT_RX_L4_CKSUM_GOOD;
+
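
The software fallback in the NEEDS_CSUM branch is an Internet checksum computed from csum_start to the end of the packet. A self-contained sketch over a flat byte buffer follows. It is not rte_raw_cksum_mbuf(): the function names are hypothetical, and it sums big-endian 16-bit words explicitly so the result does not depend on host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * One's-complement sum of 'len' bytes, read as big-endian 16-bit words.
 * An odd trailing byte is padded with zero. The 32-bit accumulator is
 * adequate for packet-sized buffers (well under 128 KiB).
 */
static uint16_t
raw_cksum_be(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Complement the raw sum, except when it is already 0xffff. */
static uint16_t
finalize_cksum(uint16_t raw)
{
	return raw != 0xffff ? (uint16_t)~raw : raw;
}
```

Leaving 0xffff uncomplemented avoids storing 0, which UDP reserves to mean that no checksum was computed.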

[dpdk-dev] [PATCH v2 08/12] app/testpmd: display lro segment size

2016-10-03 Thread Olivier Matz

In csumonly engine, display the value of LRO segment if the
LRO flag is set.

Signed-off-by: Olivier Matz 
---
 app/test-pmd/csumonly.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 8c88ee8..3f71595 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -792,6 +792,8 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
"l4_proto=%d l4_len=%d flags=%s\n",
info.l2_len, rte_be_to_cpu_16(info.ethertype),
info.l3_len, info.l4_proto, info.l4_len, buf);
+   if (rx_ol_flags & PKT_RX_LRO)
+   printf("rx: m->lro_segsz=%u\n", m->tso_segsz);
if (info.is_tunnel == 1)
printf("rx: outer_l2_len=%d outer_ethertype=%x "
"outer_l3_len=%d\n", info.outer_l2_len,
-- 
2.8.1

[dpdk-dev] [PATCH v2 07/12] mbuf: new flag for LRO

2016-10-03 Thread Olivier Matz

When receiving coalesced packets in virtio, the original size of the
segments is provided. This is useful information because it allows
resegmenting with the same size.

Add a new RX flag in mbuf that can be set when packets are coalesced by
a hardware or virtual driver, meaning that the m->tso_segsz field is
valid and is set to the segment size of the original packets.

This flag is used in the next commits in the virtio pmd.

Signed-off-by: Olivier Matz 
---
 doc/guides/rel_notes/release_16_11.rst | 5 +
 lib/librte_mbuf/rte_mbuf.c | 2 ++
 lib/librte_mbuf/rte_mbuf.h | 7 +++
 3 files changed, 14 insertions(+)

diff --git a/doc/guides/rel_notes/release_16_11.rst 
b/doc/guides/rel_notes/release_16_11.rst
index 2aff84c..a8ad9ab 100644
--- a/doc/guides/rel_notes/release_16_11.rst
+++ b/doc/guides/rel_notes/release_16_11.rst
@@ -66,6 +66,11 @@ New Features
   good, bad, or not present (useful for virtual drivers). This modification
   was done for IP and L4.

+* **Added a LRO mbuf flag.**
+
+  Added a new RX LRO mbuf flag, used when packets are coalesced. This
+  flag indicates that the segment size of original packets is known.
+
 Resolved Issues
 ---

diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index c55cb57..61bcd7e 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -317,6 +317,7 @@ const char *rte_get_rx_ol_flag_name(uint64_t mask)
case PKT_RX_IEEE1588_PTP: return "PKT_RX_IEEE1588_PTP";
case PKT_RX_IEEE1588_TMST: return "PKT_RX_IEEE1588_TMST";
case PKT_RX_QINQ_STRIPPED: return "PKT_RX_QINQ_STRIPPED";
+   case PKT_RX_LRO: return "PKT_RX_LRO";
default: return NULL;
}
 }
@@ -349,6 +350,7 @@ int rte_get_rx_ol_flag_list(uint64_t mask, char *buf, 
size_t buflen)
{ PKT_RX_IEEE1588_PTP, PKT_RX_IEEE1588_PTP, NULL },
{ PKT_RX_IEEE1588_TMST, PKT_RX_IEEE1588_TMST, NULL },
{ PKT_RX_QINQ_STRIPPED, PKT_RX_QINQ_STRIPPED, NULL },
+   { PKT_RX_LRO, PKT_RX_LRO, NULL },
};
const char *name;
unsigned int i;
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 7061cfc..f9d7bfa 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -170,6 +170,13 @@ extern "C" {
  */
 #define PKT_RX_QINQ_PKT  PKT_RX_QINQ_STRIPPED

+/**
+ * When packets are coalesced by a hardware or virtual driver, this flag
+ * can be set in the RX mbuf, meaning that the m->tso_segsz field is
+ * valid and is set to the segment size of original packets.
+ */
+#define PKT_RX_LRO   (1ULL << 16)
+
 /* add new RX flags here */

 /* add new TX flags here */
-- 
2.8.1
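
From the application side, consuming the flag could look like the sketch below. `struct mbuf_sketch` is a hypothetical two-field stand-in for struct rte_mbuf, and the macro is a local copy of the value defined in the patch above.

```c
#include <stdint.h>

/* Local copy of the flag value introduced by the patch. */
#define SKETCH_PKT_RX_LRO (1ULL << 16)

/* Hypothetical stand-in holding only the two fields that matter here. */
struct mbuf_sketch {
	uint64_t ol_flags;
	uint16_t tso_segsz;
};

/*
 * Return the original segment size of a coalesced packet, or 0 when the
 * flag is not set and tso_segsz must therefore not be trusted.
 */
static uint16_t
lro_segsz(const struct mbuf_sketch *m)
{
	return (m->ol_flags & SKETCH_PKT_RX_LRO) ? m->tso_segsz : 0;
}
```

A forwarding application could pass a non-zero result back as the TSO segment size on transmit, which is the resegmenting use case the commit message describes.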

[dpdk-dev] [PATCH v2 06/12] app/testpmd: fix checksum stats in csum engine

2016-10-03 Thread Olivier Matz

---
 app/test-pmd/csumonly.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index d5eb260..8c88ee8 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -679,8 +679,10 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
rx_ol_flags = m->ol_flags;

/* Update the L3/L4 checksum error packet statistics */
-   rx_bad_ip_csum += ((rx_ol_flags & PKT_RX_IP_CKSUM_BAD) != 0);
-   rx_bad_l4_csum += ((rx_ol_flags & PKT_RX_L4_CKSUM_BAD) != 0);
+   if ((rx_ol_flags & PKT_RX_IP_CKSUM_MASK) == PKT_RX_IP_CKSUM_BAD)
+   rx_bad_ip_csum += 1;
+   if ((rx_ol_flags & PKT_RX_L4_CKSUM_MASK) == PKT_RX_L4_CKSUM_BAD)
+   rx_bad_l4_csum += 1;

/* step 1: dissect packet, parsing optional vlan, ip4/ip6, vxlan
 * and inner headers */
-- 
2.8.1
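
Why the plain bit test had to go can be shown with a small sketch. The bit positions below are an assumption (they are not visible in this message); what matters is only that the "none" state is encoded with both bits set, so it shares a bit with "bad".

```c
#include <stdint.h>

/* Assumed two-bit encoding; the exact bit positions are illustrative. */
#define SK_L4_CKSUM_BAD  (1ULL << 3)
#define SK_L4_CKSUM_GOOD (1ULL << 8)
#define SK_L4_CKSUM_NONE (SK_L4_CKSUM_BAD | SK_L4_CKSUM_GOOD)
#define SK_L4_CKSUM_MASK (SK_L4_CKSUM_BAD | SK_L4_CKSUM_GOOD)

/* Old style: true for BAD, but also for NONE, which is a false positive. */
static int
is_bad_bit_test(uint64_t ol_flags)
{
	return (ol_flags & SK_L4_CKSUM_BAD) != 0;
}

/* New style: true only when the two-bit field equals BAD exactly. */
static int
is_bad_masked(uint64_t ol_flags)
{
	return (ol_flags & SK_L4_CKSUM_MASK) == SK_L4_CKSUM_BAD;
}
```

Under this encoding, a packet flagged "none" by a virtual driver would be counted as a bad checksum by the old test and correctly skipped by the masked comparison.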

[dpdk-dev] [PATCH v2 05/12] mbuf: add new Rx checksum mbuf flags

2016-10-03 Thread Olivier Matz

Following discussions in [1] and [2], introduce a new bit to
describe the Rx checksum status in mbuf.

Before this patch, only one flag was available:
  PKT_RX_L4_CKSUM_BAD: L4 cksum of RX pkt. is not OK.

And same for L3:
  PKT_RX_IP_CKSUM_BAD: IP cksum of RX pkt. is not OK.

This had 2 issues:
- it was not possible to differentiate "checksum good" from
  "checksum unknown".
- it was not possible for a virtual driver to say "the checksum
  in packet may be wrong, but data integrity is valid".

This patch tries to solve these issues by having 4 states (2 bits)
for the IP and L4 Rx checksums. New values are:

 - PKT_RX_L4_CKSUM_UNKNOWN: no information about the RX L4 checksum
   -> the application should verify the checksum by sw
 - PKT_RX_L4_CKSUM_BAD: the L4 checksum in the packet is wrong
   -> the application can drop the packet without additional check
 - PKT_RX_L4_CKSUM_GOOD: the L4 checksum in the packet is valid
   -> the application can accept the packet without verifying the
  checksum by sw
 - PKT_RX_L4_CKSUM_NONE: the L4 checksum is not correct in the packet
   data, but the integrity of the L4 data is verified.
   -> the application can process the packet but must not verify the
  checksum by sw. It has to take care to recalculate the cksum
  if the packet is transmitted (either by sw or using tx offload)

  And same for L3 (replace L4 by IP in description above).
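
As a minimal sketch of how an application might act on these four
states (this is not code from the patch): the DEMO_* constants below
are hypothetical stand-ins for the PKT_RX_L4_CKSUM_* flags, and their
bit positions are assumptions chosen for illustration. The point is
that the two bits are compared against a mask rather than tested one
bit at a time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for PKT_RX_L4_CKSUM_*; bit positions are assumed. */
#define DEMO_L4_CKSUM_BAD     (1ULL << 3)
#define DEMO_L4_CKSUM_GOOD    (1ULL << 8)
#define DEMO_L4_CKSUM_UNKNOWN 0ULL
#define DEMO_L4_CKSUM_NONE    (DEMO_L4_CKSUM_BAD | DEMO_L4_CKSUM_GOOD)
#define DEMO_L4_CKSUM_MASK    (DEMO_L4_CKSUM_BAD | DEMO_L4_CKSUM_GOOD)

enum demo_action {
	DEMO_VERIFY_IN_SW,        /* unknown: check the checksum in software */
	DEMO_DROP,                /* bad: drop without further checks */
	DEMO_ACCEPT,              /* good: accept without verifying */
	DEMO_ACCEPT_RECALC_ON_TX  /* none: accept, recompute if forwarded */
};

/* Map the 2-bit L4 checksum state of a received packet to an action. */
enum demo_action demo_l4_action(uint64_t ol_flags)
{
	switch (ol_flags & DEMO_L4_CKSUM_MASK) {
	case DEMO_L4_CKSUM_BAD:
		return DEMO_DROP;
	case DEMO_L4_CKSUM_GOOD:
		return DEMO_ACCEPT;
	case DEMO_L4_CKSUM_NONE:
		return DEMO_ACCEPT_RECALC_ON_TX;
	default: /* DEMO_L4_CKSUM_UNKNOWN */
		return DEMO_VERIFY_IN_SW;
	}
}
```

With this encoding, a single-bit test such as (flags & BAD) != 0 cannot
tell "bad" from "none", which is why a masked comparison is used here.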

This commit tries to be compatible with existing applications that
only check the existing flag (CKSUM_BAD).

[1] http://dpdk.org/ml/archives/dev/2016-May/039920.html
[2] http://dpdk.org/ml/archives/dev/2016-June/040007.html

Signed-off-by: Olivier Matz 
---
 doc/guides/rel_notes/release_16_11.rst |  6 
 lib/librte_mbuf/rte_mbuf.c | 16 +--
 lib/librte_mbuf/rte_mbuf.h | 51 --
 3 files changed, 68 insertions(+), 5 deletions(-)

diff --git a/doc/guides/rel_notes/release_16_11.rst 
b/doc/guides/rel_notes/release_16_11.rst
index f29b44c..2aff84c 100644
--- a/doc/guides/rel_notes/release_16_11.rst
+++ b/doc/guides/rel_notes/release_16_11.rst
@@ -60,6 +60,12 @@ New Features
   Added a new function ``rte_raw_cksum_mbuf()`` to process the checksum of
   data embedded in an mbuf chain.

+* **Added new Rx checksum mbuf flags.**
+
+  Added new Rx checksum flags in mbufs to described more states: unknown,
+  good, bad, or not present (useful for virtual drivers). This modification
+  was done for IP and L4.
+
 Resolved Issues
 ---

diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index bd5bd48..c55cb57 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -307,7 +307,11 @@ const char *rte_get_rx_ol_flag_name(uint64_t mask)
case PKT_RX_RSS_HASH: return "PKT_RX_RSS_HASH";
case PKT_RX_FDIR: return "PKT_RX_FDIR";
case PKT_RX_L4_CKSUM_BAD: return "PKT_RX_L4_CKSUM_BAD";
+   case PKT_RX_L4_CKSUM_GOOD: return "PKT_RX_L4_CKSUM_GOOD";
+   case PKT_RX_L4_CKSUM_NONE: return "PKT_RX_L4_CKSUM_NONE";
case PKT_RX_IP_CKSUM_BAD: return "PKT_RX_IP_CKSUM_BAD";
+   case PKT_RX_IP_CKSUM_GOOD: return "PKT_RX_IP_CKSUM_GOOD";
+   case PKT_RX_IP_CKSUM_NONE: return "PKT_RX_IP_CKSUM_NONE";
case PKT_RX_EIP_CKSUM_BAD: return "PKT_RX_EIP_CKSUM_BAD";
case PKT_RX_VLAN_STRIPPED: return "PKT_RX_VLAN_STRIPPED";
case PKT_RX_IEEE1588_PTP: return "PKT_RX_IEEE1588_PTP";
@@ -330,8 +334,16 @@ int rte_get_rx_ol_flag_list(uint64_t mask, char *buf, 
size_t buflen)
{ PKT_RX_VLAN_PKT, PKT_RX_VLAN_PKT, NULL },
{ PKT_RX_RSS_HASH, PKT_RX_RSS_HASH, NULL },
{ PKT_RX_FDIR, PKT_RX_FDIR, NULL },
-   { PKT_RX_L4_CKSUM_BAD, PKT_RX_L4_CKSUM_BAD, NULL },
-   { PKT_RX_IP_CKSUM_BAD, PKT_RX_IP_CKSUM_BAD, NULL },
+   { PKT_RX_L4_CKSUM_BAD, PKT_RX_L4_CKSUM_MASK, NULL },
+   { PKT_RX_L4_CKSUM_GOOD, PKT_RX_L4_CKSUM_MASK, NULL },
+   { PKT_RX_L4_CKSUM_NONE, PKT_RX_L4_CKSUM_MASK, NULL },
+   { PKT_RX_L4_CKSUM_UNKNOWN, PKT_RX_L4_CKSUM_MASK,
+ "PKT_RX_L4_CKSUM_UNKNOWN" },
+   { PKT_RX_IP_CKSUM_BAD, PKT_RX_IP_CKSUM_MASK, NULL },
+   { PKT_RX_IP_CKSUM_GOOD, PKT_RX_IP_CKSUM_MASK, NULL },
+   { PKT_RX_IP_CKSUM_NONE, PKT_RX_IP_CKSUM_MASK, NULL },
+   { PKT_RX_IP_CKSUM_UNKNOWN, PKT_RX_IP_CKSUM_MASK,
+ "PKT_RX_IP_CKSUM_UNKNOWN" },
{ PKT_RX_EIP_CKSUM_BAD, PKT_RX_EIP_CKSUM_BAD, NULL },
{ PKT_RX_VLAN_STRIPPED, PKT_RX_VLAN_STRIPPED, NULL },
{ PKT_RX_IEEE1588_PTP, PKT_RX_IEEE1588_PTP, NULL },
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 5e349e7..7061cfc 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -91,8 +91,25 @@ extern "C" {

 #define PKT_RX_RSS_HASH  (1ULL << 1)  /**< RX packet with RSS hash result. 
*/
 #define PKT_R

[dpdk-dev] [PATCH v2 04/12] net: add function to calculate a checksum in a mbuf

2016-10-03 Thread Olivier Matz

This function can be used to calculate the checksum of data embedded in
mbuf, that can be composed of several segments.

This function will be used by the virtio pmd in next commits to calculate
the checksum in software in case the protocol is not recognized.
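
The multi-segment case can be illustrated with a standalone sketch,
assuming a simplified segment type (struct demo_seg) instead of
struct rte_mbuf and hypothetical demo_* helpers that are not part of
librte_net. It shows why the partial sum of a segment that starts at an
odd byte offset has to be byte-swapped before it is accumulated. Inputs
are assumed to be shorter than 64 KiB so the 32-bit accumulator cannot
overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for one mbuf segment (hypothetical type). */
struct demo_seg {
	const uint8_t *data;
	size_t len;
};

/* Sum 16-bit big-endian words; an odd trailing byte is zero-padded. */
uint32_t demo_raw_sum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8;
	return sum;
}

/* Fold the carries back in (one's complement addition). */
uint16_t demo_fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Raw (non complemented) checksum of data spread over n segments. */
uint16_t demo_cksum_segs(const struct demo_seg *segs, size_t n)
{
	uint32_t sum = 0;
	size_t done = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		uint16_t part = demo_fold(demo_raw_sum(segs[i].data,
						       segs[i].len));

		/* segment starts at an odd offset: swap its two bytes */
		if (done & 1)
			part = (uint16_t)((part << 8) | (part >> 8));
		sum += part;
		done += segs[i].len;
	}
	return demo_fold(sum);
}
```

Splitting a buffer at any byte position and passing the pieces to
demo_cksum_segs() should give the same value as folding
demo_raw_sum() over the whole buffer.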

Signed-off-by: Olivier Matz 
---
 doc/guides/rel_notes/release_16_11.rst |  5 +++
 lib/librte_net/rte_ip.h| 60 ++
 2 files changed, 65 insertions(+)

diff --git a/doc/guides/rel_notes/release_16_11.rst 
b/doc/guides/rel_notes/release_16_11.rst
index 3d3c417..f29b44c 100644
--- a/doc/guides/rel_notes/release_16_11.rst
+++ b/doc/guides/rel_notes/release_16_11.rst
@@ -55,6 +55,11 @@ New Features
   Added two new functions ``rte_get_rx_ol_flag_list()`` and
   ``rte_get_tx_ol_flag_list()`` to dump offload flags as a string.

+* **Added a functions to calculate the checksum of data in a mbuf.**
+
+  Added a new function ``rte_raw_cksum_mbuf()`` to process the checksum of
+  data embedded in an mbuf chain.
+
 Resolved Issues
 ---

diff --git a/lib/librte_net/rte_ip.h b/lib/librte_net/rte_ip.h
index 5b7554a..8499356 100644
--- a/lib/librte_net/rte_ip.h
+++ b/lib/librte_net/rte_ip.h
@@ -230,6 +230,66 @@ rte_raw_cksum(const void *buf, size_t len)
 }

 /**
+ * Compute the raw (non complemented) checksum of a packet.
+ *
+ * @param m
+ *   The pointer to the mbuf.
+ * @param off
+ *   The offset in bytes to start the checksum.
+ * @param len
+ *   The length in bytes of the data to ckecksum.
+ */
+static inline uint16_t
+rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len)
+{
+   const struct rte_mbuf *seg;
+   const char *buf;
+   uint32_t sum, tmp;
+   uint32_t seglen, done;
+
+   /* easy case: all data in the first segment */
+   if (off + len <= rte_pktmbuf_data_len(m))
+   return rte_raw_cksum(rte_pktmbuf_mtod_offset(m,
+   const char *, off), len);
+
+   if (off + len > rte_pktmbuf_pkt_len(m))
+   return 0; /* invalid params, return a dummy value */
+
+   /* else browse the segment to find offset */
+   seglen = 0;
+   for (seg = m; seg != NULL; seg = seg->next) {
+   seglen = rte_pktmbuf_data_len(seg);
+   if (off < seglen)
+   break;
+   off -= seglen;
+   }
+   seglen -= off;
+   buf = rte_pktmbuf_mtod_offset(seg, const char *, off);
+   if (seglen >= len) /* all in one segment */
+   return rte_raw_cksum(buf, len);
+
+   /* hard case: process checksum of several segments */
+   sum = 0;
+   done = 0;
+   for (;;) {
+   tmp = __rte_raw_cksum(buf, seglen, 0);
+   if (done & 1)
+   tmp = rte_bswap16(tmp);
+   sum += tmp;
+   done += seglen;
+   if (done == len)
+   break;
+   seg = seg->next;
+   buf = rte_pktmbuf_mtod(seg, const char *);
+   seglen = rte_pktmbuf_data_len(seg);
+   if (seglen > len - done)
+   seglen = len - done;
+   }
+
+   return __rte_raw_cksum_reduce(sum);
+}
+
+/**
  * Process the IPv4 checksum of an IPv4 header.
  *
  * The checksum field must be set to 0 by the caller.
-- 
2.8.1

[dpdk-dev] [PATCH v2 03/12] virtio: reinitialize the device in configure callback

2016-10-03 Thread Olivier Matz

Add the ability to reset the virtio device in the configure callback
if the features flag changed since previous reset. This will be possible
with the introduction of offload support in next commits.
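
The idea can be sketched outside the driver as follows. This is an
illustration under stated assumptions, not the driver code: the
demo_* names are hypothetical, and feature negotiation is reduced to
the intersection of the requested and host feature bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced device state. */
struct demo_hw {
	uint64_t req_features; /* features requested at last (re)init */
	uint64_t features;     /* features actually negotiated */
	unsigned int init_count;
};

/* Reset and negotiate; host_features models what the device offers. */
int demo_init_device(struct demo_hw *hw, uint64_t req_features,
		     uint64_t host_features)
{
	hw->features = req_features & host_features;
	hw->req_features = req_features;
	hw->init_count++;
	return 0;
}

/* Configure callback: reinitialize only if the request changed. */
int demo_configure(struct demo_hw *hw, uint64_t req_features,
		   uint64_t host_features)
{
	if (req_features != hw->req_features)
		return demo_init_device(hw, req_features, host_features);
	return 0;
}
```

Caching the requested feature set avoids a device reset on every
configure call when nothing changed.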

Signed-off-by: Olivier Matz 
---
 drivers/net/virtio/virtio_ethdev.c | 26 +++---
 drivers/net/virtio/virtio_pci.h|  1 +
 2 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index b1056a1..fa56032 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1045,14 +1045,13 @@ virtio_vlan_filter_set(struct rte_eth_dev *dev, 
uint16_t vlan_id, int on)
 }

 static int
-virtio_negotiate_features(struct virtio_hw *hw)
+virtio_negotiate_features(struct virtio_hw *hw, uint64_t req_features)
 {
uint64_t host_features;

/* Prepare guest_features: feature that driver wants to support */
-   hw->guest_features = VIRTIO_PMD_GUEST_FEATURES;
PMD_INIT_LOG(DEBUG, "guest_features before negotiate = %" PRIx64,
-   hw->guest_features);
+   req_features);

/* Read device(host) feature bits */
host_features = hw->vtpci_ops->get_features(hw);
@@ -1063,6 +1062,7 @@ virtio_negotiate_features(struct virtio_hw *hw)
 * Negotiate features: Subset of device feature bits are written back
 * guest feature bits.
 */
+   hw->guest_features = req_features;
hw->guest_features = vtpci_negotiate_features(hw, host_features);
PMD_INIT_LOG(DEBUG, "features after negotiate = %" PRIx64,
hw->guest_features);
@@ -1081,6 +1081,8 @@ virtio_negotiate_features(struct virtio_hw *hw)
}
}

+   hw->req_guest_features = req_features;
+
return 0;
 }

@@ -1121,8 +1123,9 @@ rx_func_get(struct rte_eth_dev *eth_dev)
eth_dev->rx_pkt_burst = &virtio_recv_pkts;
 }

+/* reset device and renegotiate features if needed */
 static int
-virtio_init_device(struct rte_eth_dev *eth_dev)
+virtio_init_device(struct rte_eth_dev *eth_dev, uint64_t req_features)
 {
struct virtio_hw *hw = eth_dev->data->dev_private;
struct virtio_net_config *config;
@@ -1137,7 +1140,7 @@ virtio_init_device(struct rte_eth_dev *eth_dev)

/* Tell the host we've known how to drive the device. */
vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_DRIVER);
-   if (virtio_negotiate_features(hw) < 0)
+   if (virtio_negotiate_features(hw, req_features) < 0)
return -1;

/* If host does not support status then disable LSC */
@@ -1258,8 +1261,8 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)

eth_dev->data->dev_flags = dev_flags;

-   /* reset device and negotiate features */
-   ret = virtio_init_device(eth_dev);
+   /* reset device and negotiate default features */
+   ret = virtio_init_device(eth_dev, VIRTIO_PMD_GUEST_FEATURES);
if (ret < 0)
return ret;

@@ -1344,6 +1347,7 @@ virtio_dev_configure(struct rte_eth_dev *dev)
 {
const struct rte_eth_rxmode *rxmode = &dev->data->dev_conf.rxmode;
struct virtio_hw *hw = dev->data->dev_private;
+   uint64_t req_features;
int ret;

PMD_INIT_LOG(DEBUG, "configure");
@@ -1353,6 +1357,14 @@ virtio_dev_configure(struct rte_eth_dev *dev)
return -EINVAL;
}

+   req_features = VIRTIO_PMD_GUEST_FEATURES;
+   /* if request features changed, reinit the device */
+   if (req_features != hw->req_guest_features) {
+   ret = virtio_init_device(dev, req_features);
+   if (ret < 0)
+   return ret;
+   }
+
/* Setup and start control queue */
if (vtpci_with_feature(hw, VIRTIO_NET_F_CTRL_VQ)) {
ret = virtio_dev_cq_queue_setup(dev,
diff --git a/drivers/net/virtio/virtio_pci.h b/drivers/net/virtio/virtio_pci.h
index 552166d..d1a7d1e 100644
--- a/drivers/net/virtio/virtio_pci.h
+++ b/drivers/net/virtio/virtio_pci.h
@@ -245,6 +245,7 @@ struct virtio_net_config;
 struct virtio_hw {
struct virtnet_ctl *cvq;
struct rte_pci_ioport io;
+   uint64_treq_guest_features;
uint64_tguest_features;
uint32_tmax_queue_pairs;
uint16_tvtnet_hdr_size;
-- 
2.8.1

[dpdk-dev] [PATCH v2 02/12] virtio: setup and start cq in configure callback

2016-10-03 Thread Olivier Matz

Move the configuration of control queue in the configure callback.
This is needed by next commit, which introduces the reinitialization
of the device in the configure callback to change the feature flags.
Therefore, the control queue will have to be restarted at the same
place.

As virtio_dev_cq_queue_setup() is called from a place where
config->max_virtqueue_pairs is not available, we need to store this in
the private structure. It replaces max_rx_queues and max_tx_queues which
have the same value. The log showing the value of max_rx_queues and
max_tx_queues is also removed since config->max_virtqueue_pairs is
already displayed above.
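
A sketch of the resulting bookkeeping, assuming hypothetical limits
(DEMO_MAX_RX_QUEUES and DEMO_MAX_TX_QUEUES, values chosen for
illustration) and reduced structures: only the number of queue pairs is
stored, and the per-direction limits are derived from it when device
information is requested.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical driver limits; the values are assumptions. */
#define DEMO_MAX_RX_QUEUES 128U
#define DEMO_MAX_TX_QUEUES 128U
#define DEMO_MIN(a, b) ((a) < (b) ? (a) : (b))

/* Reduced private structure: one field replaces max_rx/max_tx. */
struct demo_vhw {
	uint32_t max_queue_pairs;
};

struct demo_dev_info {
	uint16_t max_rx_queues;
	uint16_t max_tx_queues;
};

/* Derive both limits from the stored number of queue pairs. */
void demo_info_get(const struct demo_vhw *hw, struct demo_dev_info *info)
{
	info->max_rx_queues =
		(uint16_t)DEMO_MIN(hw->max_queue_pairs, DEMO_MAX_RX_QUEUES);
	info->max_tx_queues =
		(uint16_t)DEMO_MIN(hw->max_queue_pairs, DEMO_MAX_TX_QUEUES);
}
```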

Signed-off-by: Olivier Matz 
---
 drivers/net/virtio/virtio_ethdev.c | 43 +++---
 drivers/net/virtio/virtio_ethdev.h |  4 ++--
 drivers/net/virtio/virtio_pci.h|  3 +--
 3 files changed, 24 insertions(+), 26 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index 21ed945..b1056a1 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -552,6 +552,9 @@ virtio_dev_close(struct rte_eth_dev *dev)
if (hw->started == 1)
virtio_dev_stop(dev);

+   if (hw->cvq)
+   virtio_dev_queue_release(hw->cvq->vq);
+
/* reset the NIC */
if (dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC)
vtpci_irq_config(hw, VIRTIO_MSI_NO_VECTOR);
@@ -1191,16 +1194,7 @@ virtio_init_device(struct rte_eth_dev *eth_dev)
config->max_virtqueue_pairs = 1;
}

-   hw->max_rx_queues =
-   (VIRTIO_MAX_RX_QUEUES < config->max_virtqueue_pairs) ?
-   VIRTIO_MAX_RX_QUEUES : config->max_virtqueue_pairs;
-   hw->max_tx_queues =
-   (VIRTIO_MAX_TX_QUEUES < config->max_virtqueue_pairs) ?
-   VIRTIO_MAX_TX_QUEUES : config->max_virtqueue_pairs;
-
-   virtio_dev_cq_queue_setup(eth_dev,
-   config->max_virtqueue_pairs * 2,
-   SOCKET_ID_ANY);
+   hw->max_queue_pairs = config->max_virtqueue_pairs;

PMD_INIT_LOG(DEBUG, "config->max_virtqueue_pairs=%d",
config->max_virtqueue_pairs);
@@ -1211,19 +1205,15 @@ virtio_init_device(struct rte_eth_dev *eth_dev)
config->mac[2], config->mac[3],
config->mac[4], config->mac[5]);
} else {
-   hw->max_rx_queues = 1;
-   hw->max_tx_queues = 1;
+   PMD_INIT_LOG(DEBUG, "config->max_virtqueue_pairs=1");
+   hw->max_queue_pairs = 1;
}

-   PMD_INIT_LOG(DEBUG, "hw->max_rx_queues=%d   hw->max_tx_queues=%d",
-   hw->max_rx_queues, hw->max_tx_queues);
if (pci_dev)
PMD_INIT_LOG(DEBUG, "port %d vendorID=0x%x deviceID=0x%x",
eth_dev->data->port_id, pci_dev->id.vendor_id,
pci_dev->id.device_id);

-   virtio_dev_cq_start(eth_dev);
-
return 0;
 }

@@ -1285,7 +1275,6 @@ static int
 eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev)
 {
struct rte_pci_device *pci_dev;
-   struct virtio_hw *hw = eth_dev->data->dev_private;

PMD_INIT_FUNC_TRACE();

@@ -1301,9 +1290,6 @@ eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev)
eth_dev->tx_pkt_burst = NULL;
eth_dev->rx_pkt_burst = NULL;

-   if (hw->cvq)
-   virtio_dev_queue_release(hw->cvq->vq);
-
rte_free(eth_dev->data->mac_addrs);
eth_dev->data->mac_addrs = NULL;

@@ -1358,6 +1344,7 @@ virtio_dev_configure(struct rte_eth_dev *dev)
 {
const struct rte_eth_rxmode *rxmode = &dev->data->dev_conf.rxmode;
struct virtio_hw *hw = dev->data->dev_private;
+   int ret;

PMD_INIT_LOG(DEBUG, "configure");

@@ -1366,6 +1353,16 @@ virtio_dev_configure(struct rte_eth_dev *dev)
return -EINVAL;
}

+   /* Setup and start control queue */
+   if (vtpci_with_feature(hw, VIRTIO_NET_F_CTRL_VQ)) {
+   ret = virtio_dev_cq_queue_setup(dev,
+   hw->max_queue_pairs * 2,
+   SOCKET_ID_ANY);
+   if (ret < 0)
+   return ret;
+   virtio_dev_cq_start(dev);
+   }
+
hw->vlan_strip = rxmode->hw_vlan_strip;

if (rxmode->hw_vlan_filter
@@ -1559,8 +1556,10 @@ virtio_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
dev_info->driver_name = dev->driver->pci_drv.name;
else
dev_info->driver_name = "virtio_user PMD";
-   dev_info->max_rx_queues = (uint16_t)hw->max_rx_queues;
-   dev_info->max_tx_queues = (uint16_t)hw->max_tx_queues;
+   dev_info->max_rx_queues =
+   RTE_MIN(hw->max_queue_pa

[dpdk-dev] [PATCH v2 01/12] virtio: move device initialization in a function

2016-10-03 Thread Olivier Matz

Move all code related to device initialization in a new function
virtio_init_device().

This commit brings no functional change, it prepares the next commits
that will add the offload support. For that, it will be needed to
reinitialize the device from ethdev->configure(), using this new
function.

Signed-off-by: Olivier Matz 
---
 drivers/net/virtio/virtio_ethdev.c | 99 ++
 1 file changed, 58 insertions(+), 41 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index ef0d6ee..21ed945 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1118,46 +1118,13 @@ rx_func_get(struct rte_eth_dev *eth_dev)
eth_dev->rx_pkt_burst = &virtio_recv_pkts;
 }

-/*
- * This function is based on probe() function in virtio_pci.c
- * It returns 0 on success.
- */
-int
-eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
+static int
+virtio_init_device(struct rte_eth_dev *eth_dev)
 {
struct virtio_hw *hw = eth_dev->data->dev_private;
struct virtio_net_config *config;
struct virtio_net_config local_config;
-   struct rte_pci_device *pci_dev;
-   uint32_t dev_flags = RTE_ETH_DEV_DETACHABLE;
-   int ret;
-
-   RTE_BUILD_BUG_ON(RTE_PKTMBUF_HEADROOM < sizeof(struct 
virtio_net_hdr_mrg_rxbuf));
-
-   eth_dev->dev_ops = &virtio_eth_dev_ops;
-   eth_dev->tx_pkt_burst = &virtio_xmit_pkts;
-
-   if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
-   rx_func_get(eth_dev);
-   return 0;
-   }
-
-   /* Allocate memory for storing MAC addresses */
-   eth_dev->data->mac_addrs = rte_zmalloc("virtio", VIRTIO_MAX_MAC_ADDRS * 
ETHER_ADDR_LEN, 0);
-   if (eth_dev->data->mac_addrs == NULL) {
-   PMD_INIT_LOG(ERR,
-   "Failed to allocate %d bytes needed to store MAC 
addresses",
-   VIRTIO_MAX_MAC_ADDRS * ETHER_ADDR_LEN);
-   return -ENOMEM;
-   }
-
-   pci_dev = eth_dev->pci_dev;
-
-   if (pci_dev) {
-   ret = vtpci_init(pci_dev, hw, &dev_flags);
-   if (ret)
-   return ret;
-   }
+   struct rte_pci_device *pci_dev = eth_dev->pci_dev;

/* Reset the device although not necessary at startup */
vtpci_reset(hw);
@@ -1172,10 +1139,11 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)

/* If host does not support status then disable LSC */
if (!vtpci_with_feature(hw, VIRTIO_NET_F_STATUS))
-   dev_flags &= ~RTE_ETH_DEV_INTR_LSC;
+   eth_dev->data->dev_flags &= ~RTE_ETH_DEV_INTR_LSC;
+   else
+   eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;

rte_eth_copy_pci_info(eth_dev, pci_dev);
-   eth_dev->data->dev_flags = dev_flags;

rx_func_get(eth_dev);

@@ -1254,12 +1222,61 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
eth_dev->data->port_id, pci_dev->id.vendor_id,
pci_dev->id.device_id);

+   virtio_dev_cq_start(eth_dev);
+
+   return 0;
+}
+
+/*
+ * This function is based on probe() function in virtio_pci.c
+ * It returns 0 on success.
+ */
+int
+eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
+{
+   struct virtio_hw *hw = eth_dev->data->dev_private;
+   struct rte_pci_device *pci_dev;
+   uint32_t dev_flags = RTE_ETH_DEV_DETACHABLE;
+   int ret;
+
+   RTE_BUILD_BUG_ON(RTE_PKTMBUF_HEADROOM < sizeof(struct 
virtio_net_hdr_mrg_rxbuf));
+
+   eth_dev->dev_ops = &virtio_eth_dev_ops;
+   eth_dev->tx_pkt_burst = &virtio_xmit_pkts;
+
+   if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+   rx_func_get(eth_dev);
+   return 0;
+   }
+
+   /* Allocate memory for storing MAC addresses */
+   eth_dev->data->mac_addrs = rte_zmalloc("virtio", VIRTIO_MAX_MAC_ADDRS * 
ETHER_ADDR_LEN, 0);
+   if (eth_dev->data->mac_addrs == NULL) {
+   PMD_INIT_LOG(ERR,
+   "Failed to allocate %d bytes needed to store MAC 
addresses",
+   VIRTIO_MAX_MAC_ADDRS * ETHER_ADDR_LEN);
+   return -ENOMEM;
+   }
+
+   pci_dev = eth_dev->pci_dev;
+
+   if (pci_dev) {
+   ret = vtpci_init(pci_dev, hw, &dev_flags);
+   if (ret)
+   return ret;
+   }
+
+   eth_dev->data->dev_flags = dev_flags;
+
+   /* reset device and negotiate features */
+   ret = virtio_init_device(eth_dev);
+   if (ret < 0)
+   return ret;
+
/* Setup interrupt callback  */
if (eth_dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC)
rte_intr_callback_register(&pci_dev->intr_handle,
-  virtio_interrupt_handler, eth_dev);
-
-   virtio_dev_cq_start(eth_dev);
+   virtio_interrupt_handler, eth_dev);

return 0;
 }
-- 
2.8.1

[dpdk-dev] [PATCH v2 00/12] net/virtio: add offload support

2016-10-03 Thread Olivier Matz

This patchset, targeted for 16.11, introduces the support of rx and tx
offload in virtio pmd.  To achieve this, some new mbuf flags must be
introduced, as discussed in [1].

It applies on top of:
- software packet type [2]
- testpmd enhancements [3]

The new mbuf checksum flags are backward compatible for current
applications that assume that unknown_csum = good_csum (since there
was only a bad_csum flag). But if the patchset is integrated, we
should consider updating the PMDs to match the new API for 16.11.

[1] http://dpdk.org/ml/archives/dev/2016-May/039920.html
[2] http://dpdk.org/ml/archives/dev/2016-October/048073.html
[3] http://dpdk.org/ml/archives/dev/2016-September/046443.html

changes v1 -> v2
- change mbuf checksum calculation static inline
- fix checksum calculation for protocol where csum=0 means no csum
- move mbuf checksum calculation in librte_net
- use RTE_MIN() to set max rx/tx queue
- rebase on top of head

Olivier Matz (12):
  virtio: move device initialization in a function
  virtio: setup and start cq in configure callback
  virtio: reinitialize the device in configure callback
  net: add function to calculate a checksum in a mbuf
  mbuf: add new Rx checksum mbuf flags
  app/testpmd: fix checksum stats in csum engine
  mbuf: new flag for LRO
  app/testpmd: display lro segment size
  virtio: add Rx checksum offload support
  virtio: add Tx checksum offload support
  virtio: add Lro support
  virtio: add Tso support

 app/test-pmd/csumonly.c|   8 +-
 doc/guides/rel_notes/release_16_11.rst |  16 ++
 drivers/net/virtio/virtio_ethdev.c | 182 +-
 drivers/net/virtio/virtio_ethdev.h |  18 +--
 drivers/net/virtio/virtio_pci.h|   4 +-
 drivers/net/virtio/virtio_rxtx.c   | 270 ++---
 drivers/net/virtio/virtqueue.h |   1 +
 lib/librte_mbuf/rte_mbuf.c |  18 ++-
 lib/librte_mbuf/rte_mbuf.h |  58 ++-
 lib/librte_net/rte_ip.h|  60 
 10 files changed, 526 insertions(+), 109 deletions(-)

Test plan
=========

(not fully replayed on v2, but no major change)

Platform description
--------------------

  guest (dpdk)
  +----------------+
  |                |
  |                |
  | port0          +-<---+
  |   ixgbe /      |     |
  |   directio     |     |
  |                |     |
  |    port1       |     ^ flow1
  +----------------+     | (flow2 is the reverse)
       |                 |
       | virtio          |
       v                 |
  +----------------+     |
  | tap0         / |     |
  | 1.1.1.1    /   |     |
  | ns-tap   /     |     |
  |        /       |     |
  |      /  ixgbe2 +-->--+
  |    /   1.1.1.2 |
  |  /    ns-ixgbe |
  +----------------+
  host (linux, vhost-net)


flow1:
  host -(ixgbe)-> guest -(virtio)-> host
  1.1.1.2 -> 1.1.1.1

flow2:
  host -(virtio)-> guest -(ixgbe)-> host
  1.1.1.1 -> 1.1.1.2

Host configuration
------------------

Start qemu with:

- a ne2k management interface to avoid any conflict with dpdk
- 2 ixgbe interfaces given to the vm through vfio
- a virtio net device, connected to a tap interface through vhost-net

  /usr/bin/qemu-system-x86_64 -k fr -daemonize --enable-kvm -m 1G -cpu host \
-smp 3 -serial telnet::40564,server,nowait -serial null \
-qmp tcp::44340,server,nowait -monitor telnet::49229,server,nowait \
-device ne2k_pci,mac=de:ad:de:01:02:03,netdev=user.0,addr=03 \
-netdev user,id=user.0,hostfwd=tcp::34965-:22 \
-device vfio-pci,host=:04:00.0 -device vfio-pci,host=:04:00.1 \
-netdev type=tap,id=vhostnet0,script=no,vhost=on,queues=8 \
-device virtio-net-pci,netdev=vhostnet0,ioeventfd=on,mq=on,vectors=17 \
-hda "/path/to/ubuntu-14.04-template.qcow2" \
-snapshot -vga none -display none

Move the tap interface in a netns, and configure it:

  ip netns add ns-tap
  ip netns exec ns-tap ip l set lo up
  ip link set tap0 netns ns-tap
  ip netns exec ns-tap ip l set tap0 down
  ip netns exec ns-tap ip l set addr 02:00:00:00:00:01 dev tap0
  ip netns exec ns-tap ip l set tap0 up
  ip netns exec ns-tap ip a a 1.1.1.1/24 dev tap0
  ip netns exec ns-tap arp -s 1.1.1.2 02:00:00:00:00:00
  ip netns exec ns-tap ip a

Move the ixgbe interface in a netns, and configure it:

  IXGBE=ixgbe2
  ip netns add ns-ixgbe
  ip netns exec ns-ixgbe ip l set lo up
  ip link set ${IXGBE} netns ns-ixgbe
  ip netns exec ns-ixgbe ip l set ${IXGBE} down
  ip netns exec ns-ixgbe ip l set addr 02:00:00:00:00:00 dev ${IXGBE}
  ip netns exec ns-ixgbe ip l set ${IXGBE} up
  ip netns exec ns-ixgbe ip a a 1.1.1.2/24 dev ${IXGBE}
  ip netns exec ns-ixgbe arp -s 1.1.1.1 02:00:00:00:00:01
  ip netns exec ns-ixgbe ip a

Guest configuration
-------------------

List of pci devices:

  00:02.0 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit 
SFI/SFP+ Network Connection [8086:10fb] (rev 01)
  00:03.0 Ethernet controller [0200]:

[dpdk-dev] [PATCH v4 0/4] Cuckoo hash enhancements

2016-10-03 Thread Bruce Richardson

On Fri, Sep 30, 2016 at 08:38:52AM +0100, Pablo de Lara wrote:
> This patchset improves lookup performance on the current hash library
> by changing the existing lookup bulk pipeline, with an improved pipeline,
> based on a loop-and-jump model, instead of the current 4-stage 2-entry
> pipeline.
> Also, x86 vectorized intrinsics are used to improve performance when
> comparing signatures.
> 
> First patch reorganizes the order of the hash structure.
> The structure takes more than one 64-byte cache line, but not all
> the fields are used in the lookup operation (the most common operation).
> Therefore, all these fields have been moved to the first part of the
> structure, so they all fit in one cache line, improving slightly the
> performance in some scenarios.
> 
> Second patch modifies the order of the bucket structure.
> Currently, the buckets store all the signatures together (current and
> alternative).
> In order to be able to perform a vectorized signature comparison,
> all current signatures have to be together, so the order of the bucket
> has been changed, having separated all the current signatures from the
> alternative signatures.
> 
> Third patch introduces x86 vectorized intrinsics.
> When performing a lookup bulk operation, all current signatures in a bucket
> are compared against the signature of the key being looked up.
> Now that they all are together, a vectorized comparison can be performed,
> which takes less instructions to be carried out.
> In case of having a machine with AVX2, number of entries per bucket are
> increased from 4 to 8, as AVX2 allows comparing two 256-bit values, with
> 8x32-bit integers, which are the 8 signatures on the bucket.
> 
> Fourth (and last) patch modifies the current pipeline of the lookup bulk
> function.
> The new pipeline is based on a loop-and-jump model. The two key
> improvements are:
> 
> - Better prefetching: in this case, first 4 keys to be looked up are
>   prefetched, and after that, the rest of the keys are prefetched at the
>   time the calculation of the signatures are being performed. This gives
>   more time for the CPU to prefetch the data requesting before actually
>   need it, which result in less cache misses and therefore, higher
>   throughput.
> 
> - Lower performance penalty when using fallback: the lookup bulk algorithm
>   assumes that most times there will not be a collision in a bucket, but
>   it might happen that two or more signatures are equal, which means that
>   more than one key comparison might be necessary. In that case, only the
>   key of the first hit is prefetched, like in the current implementation.
>   The difference now is that if this comparison results in a miss, the
>   information of the other keys to be compared has been stored, unlike
>   the current implementation, which needs to perform an entire simple
>   lookup again.
> 
> Changes in v4:
> - Reordered hash structure, so alt signature is at the start
>   of the next cache line, and explain in the commit message
>   why it has been moved
> - Reordered hash structure, so name field is on top of the structure,
>   leaving all the fields used in lookup in the next cache line
>   (instead of the first cache line)
> 
> Changes in v3:
> - Corrected the cover letter (wrong number of patches)
> 
> Changes in v2:
> - Increased entries per bucket from 4 to 8 for all cases,
>   so it is not architecture dependent any longer.
> - Replaced compile-time signature comparison function election
>   with run-time election, so best optimization available
>   will be used from a single binary.
> - Reordered the hash structure, so all the fields used by lookup
>   are in the same cache line (first).
> 
> Byron Marohn (3):
>   hash: reorganize bucket structure
>   hash: add vectorized comparison
>   hash: modify lookup bulk pipeline
> 

Hi,

Firstly, checkpatches is reporting some style errors in these patches.

Secondly, when I run the "hash_multiwriter_autotest" I get what I assume to be
an error after applying this patchset. Before this set is applied, running
that test shows the cycles per insert with/without lock elision. Now, though
I'm getting an error about a key being dropped or failing to insert in the lock
elision case, e.g. 

  Core #2 inserting 1572864: 0 - 1,572,864
  key 1497087 is lost
  1 key lost

I've run the test a number of times, and there is a single key lost each time.
Please check on this, is it expected or is it a problem?

Thanks,
/Bruce

[dpdk-dev] [PATCH] l2fwd:mac learning

2016-10-03 Thread Rafat Jahan

Added MAC learning to reduce load at l2

Signed-off-by: Rafat Jahan 
---
 examples/l2fwd-mac/Makefile |   50 ++
 examples/l2fwd-mac/main.c   | 1325 +++
 2 files changed, 1375 insertions(+)
 create mode 100644 examples/l2fwd-mac/Makefile
 create mode 100644 examples/l2fwd-mac/main.c

diff --git a/examples/l2fwd-mac/Makefile b/examples/l2fwd-mac/Makefile
new file mode 100644
index 000..6ab93f4
--- /dev/null
+++ b/examples/l2fwd-mac/Makefile
@@ -0,0 +1,50 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overriden by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = l2fwd-mac
+
+# all source are stored in SRCS-y
+SRCS-y := main.c
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/l2fwd-mac/main.c b/examples/l2fwd-mac/main.c
new file mode 100644
index 000..33d6a6e
--- /dev/null
+++ b/examples/l2fwd-mac/main.c
@@ -0,0 +1,1325 @@
+/*-thread created and in which two seperate threads for updation and checking 
are created infinately
+
+   final working code
+
+
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#inclu

[dpdk-dev] [PATCH v3 16/16] app/testpmd: display software packet type

2016-10-03 Thread Olivier Matz

In addition to the packet type returned by the PMD, also display the
packet type calculated by parsing the packet in software. This is
particularly useful to compare the 2 values.

Note: it does not mean that both hw and sw always have to provide the
same value, since it depends on what hardware supports.
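
A minimal sketch of how the two values could be compared per layer, assuming one nibble per layer in the 32-bit packet type. The `ex_` names and mask values are illustrative and not taken from this patch. A layer that either side leaves at 0 is skipped, since hardware may simply not classify it.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative per-layer masks (assumed layout: one nibble per layer). */
static const uint32_t ex_layer_masks[] = {
    0x0000000fu, /* L2 */
    0x000000f0u, /* L3 */
    0x00000f00u, /* L4 */
    0x0000f000u, /* tunnel */
    0x000f0000u, /* inner L2 */
    0x00f00000u, /* inner L3 */
    0x0f000000u, /* inner L4 */
};

/* Return the OR of the layer masks where hw and sw both report a type
 * and those types differ. Layers where one side reports 0 are skipped. */
static uint32_t ex_ptype_diff(uint32_t hw, uint32_t sw)
{
    uint32_t diff = 0;
    size_t i;

    for (i = 0; i < sizeof(ex_layer_masks) / sizeof(ex_layer_masks[0]); i++) {
        uint32_t m = ex_layer_masks[i];

        if ((hw & m) == 0 || (sw & m) == 0)
            continue; /* one side has no information for this layer */
        if ((hw & m) != (sw & m))
            diff |= m;
    }
    return diff;
}
```

A non-zero return only says that both sides classified a layer and disagreed; it does not say which one is right.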

Signed-off-by: Olivier Matz 
---
 app/test-pmd/rxonly.c | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/rxonly.c b/app/test-pmd/rxonly.c
index 9a6e394..9acc4c6 100644
--- a/app/test-pmd/rxonly.c
+++ b/app/test-pmd/rxonly.c
@@ -66,6 +66,7 @@
 #include 
 #include 
 #include 
+#include 

 #include "testpmd.h"

@@ -92,6 +93,8 @@ pkt_burst_receive(struct fwd_stream *fs)
uint16_t i, packet_type;
uint16_t is_encapsulation;
char buf[256];
+   struct rte_net_hdr_lens hdr_lens;
+   uint32_t sw_packet_type;

 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
uint64_t start_tsc;
@@ -163,8 +166,26 @@ pkt_burst_receive(struct fwd_stream *fs)
mb->vlan_tci, mb->vlan_tci_outer);
if (mb->packet_type) {
rte_get_ptype_name(mb->packet_type, buf, sizeof(buf));
-   printf(" - %s", buf);
+   printf(" - hw ptype: %s", buf);
}
+   sw_packet_type = rte_net_get_ptype(mb, &hdr_lens,
+   RTE_PTYPE_ALL_MASK);
+   rte_get_ptype_name(sw_packet_type, buf, sizeof(buf));
+   printf(" - sw ptype: %s", buf);
+   if (sw_packet_type & RTE_PTYPE_L2_MASK)
+   printf(" - l2_len=%d", hdr_lens.l2_len);
+   if (sw_packet_type & RTE_PTYPE_L3_MASK)
+   printf(" - l3_len=%d", hdr_lens.l3_len);
+   if (sw_packet_type & RTE_PTYPE_L4_MASK)
+   printf(" - l4_len=%d", hdr_lens.l4_len);
+   if (sw_packet_type & RTE_PTYPE_TUNNEL_MASK)
+   printf(" - tunnel_len=%d", hdr_lens.tunnel_len);
+   if (sw_packet_type & RTE_PTYPE_INNER_L2_MASK)
+   printf(" - inner_l2_len=%d", hdr_lens.inner_l2_len);
+   if (sw_packet_type & RTE_PTYPE_INNER_L3_MASK)
+   printf(" - inner_l3_len=%d", hdr_lens.inner_l3_len);
+   if (sw_packet_type & RTE_PTYPE_INNER_L4_MASK)
+   printf(" - inner_l4_len=%d", hdr_lens.inner_l4_len);
if (is_encapsulation) {
struct ipv4_hdr *ipv4_hdr;
struct ipv6_hdr *ipv6_hdr;
-- 
2.8.1

[dpdk-dev] [PATCH v3 15/16] app/testpmd: dump ptype using the new function

2016-10-03 Thread Olivier Matz

Use the function introduced in the previous commit to dump the packet type
of the received packet.

Signed-off-by: Olivier Matz 
---
 app/test-pmd/rxonly.c | 175 ++
 1 file changed, 4 insertions(+), 171 deletions(-)

diff --git a/app/test-pmd/rxonly.c b/app/test-pmd/rxonly.c
index b1fc5bf..9a6e394 100644
--- a/app/test-pmd/rxonly.c
+++ b/app/test-pmd/rxonly.c
@@ -91,6 +91,7 @@ pkt_burst_receive(struct fwd_stream *fs)
uint16_t nb_rx;
uint16_t i, packet_type;
uint16_t is_encapsulation;
+   char buf[256];

 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
uint64_t start_tsc;
@@ -161,177 +162,9 @@ pkt_burst_receive(struct fwd_stream *fs)
printf(" - QinQ VLAN tci=0x%x, VLAN tci outer=0x%x",
mb->vlan_tci, mb->vlan_tci_outer);
if (mb->packet_type) {
-   uint32_t ptype;
-
-   /* (outer) L2 packet type */
-   ptype = mb->packet_type & RTE_PTYPE_L2_MASK;
-   switch (ptype) {
-   case RTE_PTYPE_L2_ETHER:
-   printf(" - (outer) L2 type: ETHER");
-   break;
-   case RTE_PTYPE_L2_ETHER_TIMESYNC:
-   printf(" - (outer) L2 type: ETHER_Timesync");
-   break;
-   case RTE_PTYPE_L2_ETHER_ARP:
-   printf(" - (outer) L2 type: ETHER_ARP");
-   break;
-   case RTE_PTYPE_L2_ETHER_LLDP:
-   printf(" - (outer) L2 type: ETHER_LLDP");
-   break;
-   case RTE_PTYPE_L2_ETHER_NSH:
-   printf(" - (outer) L2 type: ETHER_NSH");
-   break;
-   default:
-   printf(" - (outer) L2 type: Unknown");
-   break;
-   }
-
-   /* (outer) L3 packet type */
-   ptype = mb->packet_type & RTE_PTYPE_L3_MASK;
-   switch (ptype) {
-   case RTE_PTYPE_L3_IPV4:
-   printf(" - (outer) L3 type: IPV4");
-   break;
-   case RTE_PTYPE_L3_IPV4_EXT:
-   printf(" - (outer) L3 type: IPV4_EXT");
-   break;
-   case RTE_PTYPE_L3_IPV6:
-   printf(" - (outer) L3 type: IPV6");
-   break;
-   case RTE_PTYPE_L3_IPV4_EXT_UNKNOWN:
-   printf(" - (outer) L3 type: IPV4_EXT_UNKNOWN");
-   break;
-   case RTE_PTYPE_L3_IPV6_EXT:
-   printf(" - (outer) L3 type: IPV6_EXT");
-   break;
-   case RTE_PTYPE_L3_IPV6_EXT_UNKNOWN:
-   printf(" - (outer) L3 type: IPV6_EXT_UNKNOWN");
-   break;
-   default:
-   printf(" - (outer) L3 type: Unknown");
-   break;
-   }
-
-   /* (outer) L4 packet type */
-   ptype = mb->packet_type & RTE_PTYPE_L4_MASK;
-   switch (ptype) {
-   case RTE_PTYPE_L4_TCP:
-   printf(" - (outer) L4 type: TCP");
-   break;
-   case RTE_PTYPE_L4_UDP:
-   printf(" - (outer) L4 type: UDP");
-   break;
-   case RTE_PTYPE_L4_FRAG:
-   printf(" - (outer) L4 type: L4_FRAG");
-   break;
-   case RTE_PTYPE_L4_SCTP:
-   printf(" - (outer) L4 type: SCTP");
-   break;
-   case RTE_PTYPE_L4_ICMP:
-   printf(" - (outer) L4 type: ICMP");
-   break;
-   case RTE_PTYPE_L4_NONFRAG:
-   printf(" - (outer) L4 type: L4_NONFRAG");
-   break;
-   default:
-   printf(" - (outer) L4 type: Unknown");
-   break;
-   }
-
-   /* packet tunnel type */
-   ptype = mb->packet_type & RTE_PTYPE_TUNNEL_MASK;
-   switch (ptype) {
-   case RTE_PTYPE_TUNNEL_IP:
-   printf(" - Tunnel type: IP");
-

[dpdk-dev] [PATCH v3 14/16] mbuf: clarify definition of fragment packet types

2016-10-03 Thread Olivier Matz

An IPv4 packet is considered a fragment if:
- the MF (more fragments) bit is set
- or the Fragment_Offset field is non-zero

Update the API documentation of packet types to reflect this.
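
The rule above can be sketched as a single mask test on the 16-bit flags/fragment-offset field of the IPv4 header. This is an illustration only: the `ex_` names are hypothetical, and the field is assumed to be already converted to host byte order.

```c
#include <stdbool.h>
#include <stdint.h>

/* Layout of the 16-bit IPv4 flags/fragment-offset field (host order). */
#define EX_IPV4_MF_FLAG     0x2000u /* "more fragments" bit */
#define EX_IPV4_OFFSET_MASK 0x1fffu /* 13-bit fragment offset */

/* A packet is a fragment if MF is set or the fragment offset is non-zero.
 * The DF bit (0x4000) is deliberately not part of the test. */
static bool ex_ipv4_is_fragment(uint16_t frag_field_host)
{
    return (frag_field_host & (EX_IPV4_MF_FLAG | EX_IPV4_OFFSET_MASK)) != 0;
}
```

With this test the last fragment of a datagram (MF=0, offset non-zero) is still reported as a fragment, which checking MF alone would miss.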

Signed-off-by: Olivier Matz 
---
 lib/librte_mbuf/rte_mbuf_ptype.h | 26 --
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf_ptype.h b/lib/librte_mbuf/rte_mbuf_ptype.h
index f19c56c..ff6de9d 100644
--- a/lib/librte_mbuf/rte_mbuf_ptype.h
+++ b/lib/librte_mbuf/rte_mbuf_ptype.h
@@ -227,7 +227,7 @@ extern "C" {
  *
  * Packet format:
  * <'ether type'=0x0800
- * | 'version'=4, 'protocol'=6, 'MF'=0>
+ * | 'version'=4, 'protocol'=6, 'MF'=0, 'frag_offset'=0>
  * or,
  * <'ether type'=0x86DD
  * | 'version'=6, 'next header'=6>
@@ -239,7 +239,7 @@ extern "C" {
  *
  * Packet format:
  * <'ether type'=0x0800
- * | 'version'=4, 'protocol'=17, 'MF'=0>
+ * | 'version'=4, 'protocol'=17, 'MF'=0, 'frag_offset'=0>
  * or,
  * <'ether type'=0x86DD
  * | 'version'=6, 'next header'=17>
@@ -258,6 +258,9 @@ extern "C" {
  * <'ether type'=0x0800
  * | 'version'=4, 'MF'=1>
  * or,
+ * <'ether type'=0x0800
+ * | 'version'=4, 'frag_offset'!=0>
+ * or,
  * <'ether type'=0x86DD
  * | 'version'=6, 'next header'=44>
  */
@@ -268,7 +271,7 @@ extern "C" {
  *
  * Packet format:
  * <'ether type'=0x0800
- * | 'version'=4, 'protocol'=132, 'MF'=0>
+ * | 'version'=4, 'protocol'=132, 'MF'=0, 'frag_offset'=0>
  * or,
  * <'ether type'=0x86DD
  * | 'version'=6, 'next header'=132>
@@ -280,7 +283,7 @@ extern "C" {
  *
  * Packet format:
  * <'ether type'=0x0800
- * | 'version'=4, 'protocol'=1, 'MF'=0>
+ * | 'version'=4, 'protocol'=1, 'MF'=0, 'frag_offset'=0>
  * or,
  * <'ether type'=0x86DD
  * | 'version'=6, 'next header'=1>
@@ -296,7 +299,7 @@ extern "C" {
  *
  * Packet format:
  * <'ether type'=0x0800
- * | 'version'=4, 'protocol'!=[6|17|132|1], 'MF'=0>
+ * | 'version'=4, 'protocol'!=[6|17|132|1], 'MF'=0, 'frag_offset'=0>
  * or,
  * <'ether type'=0x86DD
  * | 'version'=6, 'next header'!=[6|17|44|132|1]>
@@ -473,7 +476,7 @@ extern "C" {
  *
  * Packet format (inner only):
  * <'ether type'=0x0800
- * | 'version'=4, 'protocol'=6, 'MF'=0>
+ * | 'version'=4, 'protocol'=6, 'MF'=0, 'frag_offset'=0>
  * or,
  * <'ether type'=0x86DD
  * | 'version'=6, 'next header'=6>
@@ -485,7 +488,7 @@ extern "C" {
  *
  * Packet format (inner only):
  * <'ether type'=0x0800
- * | 'version'=4, 'protocol'=17, 'MF'=0>
+ * | 'version'=4, 'protocol'=17, 'MF'=0, 'frag_offset'=0>
  * or,
  * <'ether type'=0x86DD
  * | 'version'=6, 'next header'=17>
@@ -499,6 +502,9 @@ extern "C" {
  * <'ether type'=0x0800
  * | 'version'=4, 'MF'=1>
  * or,
+ * <'ether type'=0x0800
+ * | 'version'=4, 'frag_offset'!=0>
+ * or,
  * <'ether type'=0x86DD
  * | 'version'=6, 'next header'=44>
  */
@@ -509,7 +515,7 @@ extern "C" {
  *
  * Packet format (inner only):
  * <'ether type'=0x0800
- * | 'version'=4, 'protocol'=132, 'MF'=0>
+ * | 'version'=4, 'protocol'=132, 'MF'=0, 'frag_offset'=0>
  * or,
  * <'ether type'=0x86DD
  * | 'version'=6, 'next header'=132>
@@ -521,7 +527,7 @@ extern "C" {
  *
  * Packet format (inner only):
  * <'ether type'=0x0800
- * | 'version'=4, 'protocol'=1, 'MF'=0>
+ * | 'version'=4, 'protocol'=1, 'MF'=0, 'frag_offset'=0>
  * or,
  * <'ether type'=0x86DD
  * | 'version'=6, 'next header'=1>
@@ -534,7 +540,7 @@ extern "C" {
  *
  * Packet format (inner only):
  * <'ether type'=0x0800
- * | 'version'=4, 'protocol'!=[6|17|132|1], 'MF'=0>
+ * | 'version'=4, 'protocol'!=[6|17|132|1], 'MF'=0, 'frag_offset'=0>
  * or,
  * <'ether type'=0x86DD
  * | 'version'=6, 'next header'!=[6|17|44|132|1]>
-- 
2.8.1

[dpdk-dev] [PATCH v3 13/16] mbuf: add functions to dump packet type

2016-10-03 Thread Olivier Matz

Dumping the packet type is useful for debugging. Instead of having
each application provide its own function for this, introduce
common functions to do it.

This factorizes the code and reduces the risk of desynchronization between
the new packet types and the dump function.
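
As a rough, self-contained sketch of the idea (not the code of this patch): one name function per layer, plus a helper that concatenates them into a caller-supplied buffer. The `ex_` names and constant values are assumptions made for the example; the real definitions live in rte_mbuf_ptype.h.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative values only. */
#define EX_L2_MASK  0x0000000fu
#define EX_L2_ETHER 0x00000001u
#define EX_L3_MASK  0x000000f0u
#define EX_L3_IPV4  0x00000010u
#define EX_L3_IPV6  0x00000040u

/* Name of the L2 part of a packet type. */
static const char *ex_l2_name(uint32_t ptype)
{
    switch (ptype & EX_L2_MASK) {
    case EX_L2_ETHER: return "L2_ETHER";
    default: return "L2_UNKNOWN";
    }
}

/* Name of the L3 part of a packet type. */
static const char *ex_l3_name(uint32_t ptype)
{
    switch (ptype & EX_L3_MASK) {
    case EX_L3_IPV4: return "L3_IPV4";
    case EX_L3_IPV6: return "L3_IPV6";
    default: return "L3_UNKNOWN";
    }
}

/* Write "<l2> <l3>" into buf. Returns 0 on success, -1 if buf is too small. */
static int ex_ptype_name(uint32_t ptype, char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen, "%s %s",
                     ex_l2_name(ptype), ex_l3_name(ptype));

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

Keeping the strings next to the masks means a new packet type only needs one extra `case`, instead of an update in every application.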

Signed-off-by: Olivier Matz 
---
 doc/guides/rel_notes/release_16_11.rst |   4 +
 lib/librte_mbuf/Makefile   |   2 +-
 lib/librte_mbuf/rte_mbuf_ptype.c   | 227 +
 lib/librte_mbuf/rte_mbuf_ptype.h   |  89 +
 lib/librte_mbuf/rte_mbuf_version.map   |   8 ++
 5 files changed, 329 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_mbuf/rte_mbuf_ptype.c

diff --git a/doc/guides/rel_notes/release_16_11.rst b/doc/guides/rel_notes/release_16_11.rst
index b3b9dfb..40c09ca 100644
--- a/doc/guides/rel_notes/release_16_11.rst
+++ b/doc/guides/rel_notes/release_16_11.rst
@@ -46,6 +46,10 @@ New Features
   Added a new function ``rte_net_get_ptype()`` to parse an Ethernet packet
   in an mbuf chain and retrieve its packet type by software.

+* **Added functions to dump the packet type as a string.**
+
+  Added new functions ``rte_get_ptype_*()`` to dump a packet type as a string.
+
 Resolved Issues
 ---

diff --git a/lib/librte_mbuf/Makefile b/lib/librte_mbuf/Makefile
index 27e037c..4ae2e8c 100644
--- a/lib/librte_mbuf/Makefile
+++ b/lib/librte_mbuf/Makefile
@@ -41,7 +41,7 @@ EXPORT_MAP := rte_mbuf_version.map
 LIBABIVER := 2

 # all source are stored in SRCS-y
-SRCS-$(CONFIG_RTE_LIBRTE_MBUF) := rte_mbuf.c
+SRCS-$(CONFIG_RTE_LIBRTE_MBUF) := rte_mbuf.c rte_mbuf_ptype.c

 # install includes
 SYMLINK-$(CONFIG_RTE_LIBRTE_MBUF)-include := rte_mbuf.h rte_mbuf_ptype.h
diff --git a/lib/librte_mbuf/rte_mbuf_ptype.c b/lib/librte_mbuf/rte_mbuf_ptype.c
new file mode 100644
index 000..e5c4fae
--- /dev/null
+++ b/lib/librte_mbuf/rte_mbuf_ptype.c
@@ -0,0 +1,227 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of 6WIND S.A. nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+
+#include 
+#include 
+
+/* get the name of the l2 packet type */
+const char *rte_get_ptype_l2_name(uint32_t ptype)
+{
+   switch (ptype & RTE_PTYPE_L2_MASK) {
+   case RTE_PTYPE_L2_ETHER: return "L2_ETHER";
+   case RTE_PTYPE_L2_ETHER_TIMESYNC: return "L2_ETHER_TIMESYNC";
+   case RTE_PTYPE_L2_ETHER_ARP: return "L2_ETHER_ARP";
+   case RTE_PTYPE_L2_ETHER_LLDP: return "L2_ETHER_LLDP";
+   case RTE_PTYPE_L2_ETHER_NSH: return "L2_ETHER_NSH";
+   case RTE_PTYPE_L2_ETHER_VLAN: return "L2_ETHER_VLAN";
+   case RTE_PTYPE_L2_ETHER_QINQ: return "L2_ETHER_QINQ";
+   default: return "L2_UNKNOWN";
+   }
+}
+
+/* get the name of the l3 packet type */
+const char *rte_get_ptype_l3_name(uint32_t ptype)
+{
+   switch (ptype & RTE_PTYPE_L3_MASK) {
+   case RTE_PTYPE_L3_IPV4: return "L3_IPV4";
+   case RTE_PTYPE_L3_IPV4_EXT: return "L3_IPV4_EXT";
+   case RTE_PTYPE_L3_IPV6: return "L3_IPV6";
+   case RTE_PTYPE_L3_IPV4_EXT_UNKNOWN: return "L3_IPV4_EXT_UNKNOWN";
+   case RTE_PTYPE_L3_IPV6_EXT: return "L3_IPV6_EXT";
+   case RTE_PTYPE_L3_IPV6_EXT_UNKNOWN: return "L3_IPV6_EXT_UNKNOWN";
+   default: return "L3_UNKNOWN";
+   }
+}
+
+/* get the name of the l4 packet type */
+const char *rte_get_ptype_l4_name(uint32_t ptype)
+{
+   switch (ptype & RTE_PTYPE_L4_MASK) {
+   case RTE_PTYPE_L4_TCP: retu

[dpdk-dev] [PATCH v3 12/16] net: get ptype for the first layers only

2016-10-03 Thread Olivier Matz

Add a parameter to rte_net_get_ptype() to select which
layers should be parsed. This avoids parsing all layers when
only the first ones are required.
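
A toy model of the stop-at-first-unrequested-layer behaviour, for illustration. It does not parse packets: `full_ptype` stands for what a complete parse would return, and the `ex_` names and mask values are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative per-layer masks, ordered from outermost to innermost. */
static const uint32_t ex_layer_masks[] = {
    0x0000000fu, /* L2 */
    0x000000f0u, /* L3 */
    0x00000f00u, /* L4 */
    0x0000f000u, /* tunnel */
    0x000f0000u, /* inner L2 */
    0x00f00000u, /* inner L3 */
    0x0f000000u, /* inner L4 */
};

/* Walk the layers in order and stop at the first one that the caller
 * did not request in 'layers'. Deeper layers are never looked at. */
static uint32_t ex_parse_layers(uint32_t full_ptype, uint32_t layers)
{
    uint32_t pkt_type = 0;
    size_t i;

    for (i = 0; i < sizeof(ex_layer_masks) / sizeof(ex_layer_masks[0]); i++) {
        if ((layers & ex_layer_masks[i]) == 0)
            break;
        pkt_type |= full_ptype & ex_layer_masks[i];
    }
    return pkt_type;
}
```

Note that requesting L2 and L4 but not L3 yields only the L2 part in this model, because parsing stops at the first gap.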

Signed-off-by: Olivier Matz 
---
 lib/librte_net/rte_net.c | 33 -
 lib/librte_net/rte_net.h |  7 ++-
 2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/lib/librte_net/rte_net.c b/lib/librte_net/rte_net.c
index 53cfef8..a8c7aff 100644
--- a/lib/librte_net/rte_net.c
+++ b/lib/librte_net/rte_net.c
@@ -254,7 +254,7 @@ skip_ip6_ext(uint16_t proto, const struct rte_mbuf *m, uint32_t *off,

 /* parse mbuf data to get packet type */
 uint32_t rte_net_get_ptype(const struct rte_mbuf *m,
-   struct rte_net_hdr_lens *hdr_lens)
+   struct rte_net_hdr_lens *hdr_lens, uint32_t layers)
 {
struct rte_net_hdr_lens local_hdr_lens;
const struct ether_hdr *eh;
@@ -273,6 +273,9 @@ uint32_t rte_net_get_ptype(const struct rte_mbuf *m,
off = sizeof(*eh);
hdr_lens->l2_len = off;

+   if ((layers & RTE_PTYPE_L2_MASK) == 0)
+   return 0;
+
if (proto == rte_cpu_to_be_16(ETHER_TYPE_IPv4))
goto l3; /* fast path if packet is IPv4 */

@@ -302,6 +305,9 @@ uint32_t rte_net_get_ptype(const struct rte_mbuf *m,
}

  l3:
+   if ((layers & RTE_PTYPE_L3_MASK) == 0)
+   return pkt_type;
+
if (proto == rte_cpu_to_be_16(ETHER_TYPE_IPv4)) {
const struct ipv4_hdr *ip4h;
struct ipv4_hdr ip4h_copy;
@@ -313,6 +319,10 @@ uint32_t rte_net_get_ptype(const struct rte_mbuf *m,
pkt_type |= ptype_l3_ip(ip4h->version_ihl);
hdr_lens->l3_len = ip4_hlen(ip4h);
off += hdr_lens->l3_len;
+
+   if ((layers & RTE_PTYPE_L4_MASK) == 0)
+   return pkt_type;
+
if (ip4h->fragment_offset & rte_cpu_to_be_16(
IPV4_HDR_OFFSET_MASK | IPV4_HDR_MF_FLAG)) {
pkt_type |= RTE_PTYPE_L4_FRAG;
@@ -340,6 +350,10 @@ uint32_t rte_net_get_ptype(const struct rte_mbuf *m,
}
if (proto == 0)
return pkt_type;
+
+   if ((layers & RTE_PTYPE_L4_MASK) == 0)
+   return pkt_type;
+
if (frag) {
pkt_type |= RTE_PTYPE_L4_FRAG;
hdr_lens->l4_len = 0;
@@ -368,6 +382,10 @@ uint32_t rte_net_get_ptype(const struct rte_mbuf *m,
uint32_t prev_off = off;

hdr_lens->l4_len = 0;
+
+   if ((layers & RTE_PTYPE_TUNNEL_MASK) == 0)
+   return pkt_type;
+
pkt_type |= ptype_tunnel(&proto, m, &off);
hdr_lens->tunnel_len = off - prev_off;
}
@@ -375,6 +393,9 @@ uint32_t rte_net_get_ptype(const struct rte_mbuf *m,
/* same job for inner header: we need to duplicate the code
 * because the packet types do not have the same value.
 */
+   if ((layers & RTE_PTYPE_INNER_L2_MASK) == 0)
+   return pkt_type;
+
if (proto == rte_cpu_to_be_16(ETHER_TYPE_TEB)) {
eh = rte_pktmbuf_read(m, off, sizeof(*eh), &eh_copy);
if (unlikely(eh == NULL))
@@ -412,6 +433,9 @@ uint32_t rte_net_get_ptype(const struct rte_mbuf *m,
proto = vh->eth_proto;
}

+   if ((layers & RTE_PTYPE_INNER_L3_MASK) == 0)
+   return pkt_type;
+
if (proto == rte_cpu_to_be_16(ETHER_TYPE_IPv4)) {
const struct ipv4_hdr *ip4h;
struct ipv4_hdr ip4h_copy;
@@ -423,6 +447,9 @@ uint32_t rte_net_get_ptype(const struct rte_mbuf *m,
pkt_type |= ptype_inner_l3_ip(ip4h->version_ihl);
hdr_lens->inner_l3_len = ip4_hlen(ip4h);
off += hdr_lens->inner_l3_len;
+
+   if ((layers & RTE_PTYPE_INNER_L4_MASK) == 0)
+   return pkt_type;
if (ip4h->fragment_offset &
rte_cpu_to_be_16(IPV4_HDR_OFFSET_MASK |
IPV4_HDR_MF_FLAG)) {
@@ -455,6 +482,10 @@ uint32_t rte_net_get_ptype(const struct rte_mbuf *m,
}
if (proto == 0)
return pkt_type;
+
+   if ((layers & RTE_PTYPE_INNER_L4_MASK) == 0)
+   return pkt_type;
+
if (frag) {
pkt_type |= RTE_PTYPE_INNER_L4_FRAG;
hdr_lens->inner_l4_len = 0;
diff --git a/lib/librte_net/rte_net.h b/lib/librte_net/rte_net.h
index 02299db..d4156ae 100644
--- a/lib/librte_net/rte_net.h
+++ b/lib/librte_net/rte_net.h
@@ -75,11 +75,16 @@ struct rte_net_hdr_lens {
  * @param hdr_lens
  *   A pointer to a structure where the header lengths will be returned,
  *   or NULL.
+ * @param layers
+ *   List of layers to parse. The function will stop at the first
+ *   empty layer. Examp

[dpdk-dev] [PATCH v3 11/16] net: support Nvgre in software packet type parser

2016-10-03 Thread Olivier Matz

Add support for Nvgre tunnels in rte_net_get_ptype(). At the same
time, as Nvgre transports Ethernet, we need to add support for inner
Vlan, QinQ, and Mpls.
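
For illustration, a simplified sketch of the classification and of the inner L2 length accounting. The `ex_` names are hypothetical, all values are in host byte order, and the fixed sizes (14-byte Ethernet header, 4 bytes per VLAN tag) are stated as assumptions of the sketch.

```c
#include <stdint.h>

#define EX_ETHER_TYPE_VLAN 0x8100u
#define EX_ETHER_TYPE_QINQ 0x88A8u
#define EX_ETHER_TYPE_TEB  0x6558u /* Transparent Ethernet Bridging */

enum ex_tunnel { EX_TUNNEL_GRE, EX_TUNNEL_NVGRE };

/* A GRE header whose protocol is TEB carries an Ethernet frame,
 * which is classified as Nvgre here. */
static enum ex_tunnel ex_classify_gre(uint16_t gre_proto)
{
    return gre_proto == EX_ETHER_TYPE_TEB ? EX_TUNNEL_NVGRE : EX_TUNNEL_GRE;
}

/* Inner L2 length: 14 bytes if an inner Ethernet header is present,
 * plus 4 bytes for one VLAN tag or 8 bytes for two tags (QinQ). */
static uint32_t ex_inner_l2_len(uint16_t gre_proto, uint16_t inner_ether_type)
{
    uint32_t len = 0;
    uint16_t proto = gre_proto;

    if (proto == EX_ETHER_TYPE_TEB) {
        len += 14;
        proto = inner_ether_type;
    }
    if (proto == EX_ETHER_TYPE_VLAN)
        len += 4;
    else if (proto == EX_ETHER_TYPE_QINQ)
        len += 8;
    return len;
}
```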

Signed-off-by: Jean Dao 
Signed-off-by: Olivier Matz 
---
 lib/librte_mbuf/rte_mbuf_ptype.h |  7 +++
 lib/librte_net/rte_net.c | 42 ++--
 lib/librte_net/rte_net.h |  2 +-
 3 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf_ptype.h b/lib/librte_mbuf/rte_mbuf_ptype.h
index 6e62492..fbe764a 100644
--- a/lib/librte_mbuf/rte_mbuf_ptype.h
+++ b/lib/librte_mbuf/rte_mbuf_ptype.h
@@ -396,6 +396,13 @@ extern "C" {
  */
 #define RTE_PTYPE_INNER_L2_ETHER_VLAN   0x0002
 /**
+ * QinQ packet type.
+ *
+ * Packet format:
+ * <'ether type'=[0x88A8]>
+ */
+#define RTE_PTYPE_INNER_L2_ETHER_QINQ   0x0003
+/**
  * Mask of inner layer 2 packet types.
  */
 #define RTE_PTYPE_INNER_L2_MASK 0x000f
diff --git a/lib/librte_net/rte_net.c b/lib/librte_net/rte_net.c
index 66db2c8..53cfef8 100644
--- a/lib/librte_net/rte_net.c
+++ b/lib/librte_net/rte_net.c
@@ -183,7 +183,10 @@ ptype_tunnel(uint16_t *proto, const struct rte_mbuf *m,

*off += opt_len[flags];
*proto = gh->proto;
-   return RTE_PTYPE_TUNNEL_GRE;
+   if (*proto == rte_cpu_to_be_16(ETHER_TYPE_TEB))
+   return RTE_PTYPE_TUNNEL_NVGRE;
+   else
+   return RTE_PTYPE_TUNNEL_GRE;
}
case IPPROTO_IPIP:
*proto = rte_cpu_to_be_16(ETHER_TYPE_IPv4);
@@ -372,7 +375,42 @@ uint32_t rte_net_get_ptype(const struct rte_mbuf *m,
/* same job for inner header: we need to duplicate the code
 * because the packet types do not have the same value.
 */
-   hdr_lens->inner_l2_len = 0;
+   if (proto == rte_cpu_to_be_16(ETHER_TYPE_TEB)) {
+   eh = rte_pktmbuf_read(m, off, sizeof(*eh), &eh_copy);
+   if (unlikely(eh == NULL))
+   return pkt_type;
+   pkt_type |= RTE_PTYPE_INNER_L2_ETHER;
+   proto = eh->ether_type;
+   off += sizeof(*eh);
+   hdr_lens->inner_l2_len = sizeof(*eh);
+   }
+
+   if (proto == rte_cpu_to_be_16(ETHER_TYPE_VLAN)) {
+   const struct vlan_hdr *vh;
+   struct vlan_hdr vh_copy;
+
+   pkt_type &= ~RTE_PTYPE_INNER_L2_MASK;
+   pkt_type |= RTE_PTYPE_INNER_L2_ETHER_VLAN;
+   vh = rte_pktmbuf_read(m, off, sizeof(*vh), &vh_copy);
+   if (unlikely(vh == NULL))
+   return pkt_type;
+   off += sizeof(*vh);
+   hdr_lens->inner_l2_len += sizeof(*vh);
+   proto = vh->eth_proto;
+   } else if (proto == rte_cpu_to_be_16(ETHER_TYPE_QINQ)) {
+   const struct vlan_hdr *vh;
+   struct vlan_hdr vh_copy;
+
+   pkt_type &= ~RTE_PTYPE_INNER_L2_MASK;
+   pkt_type |= RTE_PTYPE_INNER_L2_ETHER_QINQ;
+   vh = rte_pktmbuf_read(m, off + sizeof(*vh), sizeof(*vh),
+   &vh_copy);
+   if (unlikely(vh == NULL))
+   return pkt_type;
+   off += 2 * sizeof(*vh);
+   hdr_lens->inner_l2_len += 2 * sizeof(*vh);
+   proto = vh->eth_proto;
+   }

if (proto == rte_cpu_to_be_16(ETHER_TYPE_IPv4)) {
const struct ipv4_hdr *ip4h;
diff --git a/lib/librte_net/rte_net.h b/lib/librte_net/rte_net.h
index 4a72b1b..02299db 100644
--- a/lib/librte_net/rte_net.h
+++ b/lib/librte_net/rte_net.h
@@ -68,7 +68,7 @@ struct rte_net_hdr_lens {
  *   L2: Ether, Vlan, QinQ
  *   L3: IPv4, IPv6
  *   L4: TCP, UDP, SCTP
- *   Tunnels: IPv4, IPv6, Gre
+ *   Tunnels: IPv4, IPv6, Gre, Nvgre
  *
  * @param m
  *   The packet mbuf to be parsed.
-- 
2.8.1

[dpdk-dev] [PATCH v3 10/16] net: support Gre in software packet type parser

2016-10-03 Thread Olivier Matz

Add support of Gre tunnels in rte_net_get_ptype().
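
The Gre header length depends on which optional fields are flagged in the first 16 bits. A minimal sketch computing it from the flag bits, assuming the C, K and S bits each add one 4-byte word and that a header with the reserved bit next to C is rejected. The function name is hypothetical and the input is in host byte order.

```c
#include <stdint.h>

/* Return the GRE header length in bytes, or 0 if the flags are unsupported.
 * The top nibble of the first 16-bit word holds C (0x8), a reserved
 * bit (0x4), K (0x2) and S (0x1). */
static uint32_t ex_gre_hdr_len(uint16_t flags_ver_host)
{
    uint16_t flags = flags_ver_host >> 12;
    uint32_t len = 4; /* flags/version + protocol type */

    if (flags & 0x4)
        return 0; /* reserved bit set: not handled */
    if (flags & 0x8)
        len += 4; /* checksum + reserved1 */
    if (flags & 0x2)
        len += 4; /* key */
    if (flags & 0x1)
        len += 4; /* sequence number */
    return len;
}
```

A 16-entry lookup table indexed by the same nibble gives identical results and avoids the branches; the arithmetic form is used here only because it shows where each 4 bytes comes from.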

Signed-off-by: Jean Dao 
Signed-off-by: Olivier Matz 
---
 lib/librte_net/rte_net.c | 40 
 lib/librte_net/rte_net.h |  2 +-
 2 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/lib/librte_net/rte_net.c b/lib/librte_net/rte_net.c
index 87294bb..66db2c8 100644
--- a/lib/librte_net/rte_net.c
+++ b/lib/librte_net/rte_net.c
@@ -41,6 +41,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 

 /* get l3 packet type from ip6 next protocol */
@@ -150,11 +151,40 @@ ptype_inner_l4(uint8_t proto)
return ptype_inner_l4_proto[proto];
 }

-/* get the tunnel packet type if any, update proto. */
+/* get the tunnel packet type if any, update proto and off. */
 static uint32_t
-ptype_tunnel(uint16_t *proto)
+ptype_tunnel(uint16_t *proto, const struct rte_mbuf *m,
+   uint32_t *off)
 {
switch (*proto) {
+   case IPPROTO_GRE: {
+   static const uint8_t opt_len[16] = {
+   [0x0] = 4,
+   [0x1] = 8,
+   [0x2] = 8,
+   [0x8] = 8,
+   [0x3] = 12,
+   [0x9] = 12,
+   [0xa] = 12,
+   [0xb] = 16,
+   };
+   const struct gre_hdr *gh;
+   struct gre_hdr gh_copy;
+   uint16_t flags;
+
+   gh = rte_pktmbuf_read(m, *off, sizeof(*gh), &gh_copy);
+   if (unlikely(gh == NULL))
+   return 0;
+
+   flags = rte_be_to_cpu_16(*(const uint16_t *)gh);
+   flags >>= 12;
+   if (opt_len[flags] == 0)
+   return 0;
+
+   *off += opt_len[flags];
+   *proto = gh->proto;
+   return RTE_PTYPE_TUNNEL_GRE;
+   }
case IPPROTO_IPIP:
*proto = rte_cpu_to_be_16(ETHER_TYPE_IPv4);
return RTE_PTYPE_TUNNEL_IP;
@@ -332,9 +362,11 @@ uint32_t rte_net_get_ptype(const struct rte_mbuf *m,
hdr_lens->l4_len = sizeof(struct sctp_hdr);
return pkt_type;
} else {
+   uint32_t prev_off = off;
+
hdr_lens->l4_len = 0;
-   pkt_type |= ptype_tunnel(&proto);
-   hdr_lens->tunnel_len = 0;
+   pkt_type |= ptype_tunnel(&proto, m, &off);
+   hdr_lens->tunnel_len = off - prev_off;
}

/* same job for inner header: we need to duplicate the code
diff --git a/lib/librte_net/rte_net.h b/lib/librte_net/rte_net.h
index f433389..4a72b1b 100644
--- a/lib/librte_net/rte_net.h
+++ b/lib/librte_net/rte_net.h
@@ -68,7 +68,7 @@ struct rte_net_hdr_lens {
  *   L2: Ether, Vlan, QinQ
  *   L3: IPv4, IPv6
  *   L4: TCP, UDP, SCTP
- *   Tunnels: IPv4, IPv6
+ *   Tunnels: IPv4, IPv6, Gre
  *
  * @param m
  *   The packet mbuf to be parsed.
-- 
2.8.1

[dpdk-dev] [PATCH v3 09/16] net: add Gre header structure

2016-10-03 Thread Olivier Matz

Add the Gre header structure in librte_net. It will be used by the next
patches, which add support for Gre tunnels in the software packet type
parser.

The extended headers (checksum, key or sequence number) are not defined.
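
For readers who prefer not to rely on bit-field layout, the same 4 base bytes can be decoded with shifts. This is a sketch with hypothetical names (`ex_gre_fields`, `ex_gre_decode`), not part of the patch. It reads the header as raw bytes in network order, so it does not depend on host endianness.

```c
#include <stdint.h>

/* Decoded view of the 4-byte base GRE header. */
struct ex_gre_fields {
    uint8_t c;      /* Checksum Present bit */
    uint8_t k;      /* Key Present bit */
    uint8_t s;      /* Sequence Number Present bit */
    uint8_t ver;    /* Version Number */
    uint16_t proto; /* Protocol Type, host byte order */
};

/* Decode the base header from its on-wire bytes. */
static struct ex_gre_fields ex_gre_decode(const uint8_t hdr[4])
{
    struct ex_gre_fields f;

    f.c = (hdr[0] >> 7) & 1;
    f.k = (hdr[0] >> 5) & 1;
    f.s = (hdr[0] >> 4) & 1;
    f.ver = hdr[1] & 0x7;
    f.proto = (uint16_t)((hdr[2] << 8) | hdr[3]);
    return f;
}
```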

Signed-off-by: Jean Dao 
Signed-off-by: Olivier Matz 
---
 lib/librte_net/Makefile  |  2 +-
 lib/librte_net/rte_gre.h | 71 
 2 files changed, 72 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_net/rte_gre.h

diff --git a/lib/librte_net/Makefile b/lib/librte_net/Makefile
index c16b542..e5758ce 100644
--- a/lib/librte_net/Makefile
+++ b/lib/librte_net/Makefile
@@ -43,7 +43,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_NET) := rte_net.c
 # install includes
 SYMLINK-$(CONFIG_RTE_LIBRTE_NET)-include := rte_ip.h rte_tcp.h rte_udp.h
 SYMLINK-$(CONFIG_RTE_LIBRTE_NET)-include += rte_sctp.h rte_icmp.h rte_arp.h
-SYMLINK-$(CONFIG_RTE_LIBRTE_NET)-include += rte_ether.h rte_net.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_NET)-include += rte_ether.h rte_gre.h rte_net.h

 DEPDIRS-$(CONFIG_RTE_LIBRTE_NET) += lib/librte_eal lib/librte_mempool
 DEPDIRS-$(CONFIG_RTE_LIBRTE_NET) += lib/librte_mbuf
diff --git a/lib/librte_net/rte_gre.h b/lib/librte_net/rte_gre.h
new file mode 100644
index 000..46568ff
--- /dev/null
+++ b/lib/librte_net/rte_gre.h
@@ -0,0 +1,71 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_GRE_H_
+#define _RTE_GRE_H_
+
+#include 
+#include 
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * GRE Header
+ */
+struct gre_hdr {
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+   uint16_t res2:4; /**< Reserved */
+   uint16_t s:1;/**< Sequence Number Present bit */
+   uint16_t k:1;/**< Key Present bit */
+   uint16_t res1:1; /**< Reserved */
+   uint16_t c:1;/**< Checksum Present bit */
+   uint16_t ver:3;  /**< Version Number */
+   uint16_t res3:5; /**< Reserved */
+#elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
+   uint16_t c:1;/**< Checksum Present bit */
+   uint16_t res1:1; /**< Reserved */
+   uint16_t k:1;/**< Key Present bit */
+   uint16_t s:1;/**< Sequence Number Present bit */
+   uint16_t res2:4; /**< Reserved */
+   uint16_t res3:5; /**< Reserved */
+   uint16_t ver:3;  /**< Version Number */
+#endif
+   uint16_t proto;  /**< Protocol Type */
+} __attribute__((__packed__));
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_GRE_H_ */
-- 
2.8.1
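
Editor's note: the header above only declares the fixed 4-byte part of the GRE header. As a rough sketch of how the C, K and S bits could be used, the helper below computes the total GRE header length from raw bytes. It assumes the RFC 2784/2890 layout, where each present optional field (checksum plus reserved, key, sequence number) adds 4 bytes. The names gre_hdr_len and GRE_FLAG_* are hypothetical and not part of this patch; masks are used instead of the bitfield struct so the sketch does not depend on host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical masks for the first 16 bits of a GRE header, host order. */
#define GRE_FLAG_C 0x8000 /* Checksum Present bit */
#define GRE_FLAG_K 0x2000 /* Key Present bit */
#define GRE_FLAG_S 0x1000 /* Sequence Number Present bit */

/*
 * Return the GRE header length in bytes, or 0 if fewer than that many
 * bytes are available. The fixed part is 4 bytes (flags/version and
 * protocol type); each optional field that is present adds 4 bytes.
 */
static size_t
gre_hdr_len(const uint8_t *gre, size_t avail)
{
	uint16_t flags;
	size_t len = 4;

	if (avail < len)
		return 0;

	/* the flags word is carried in network (big endian) order */
	flags = (uint16_t)((gre[0] << 8) | gre[1]);

	if (flags & GRE_FLAG_C)
		len += 4; /* checksum + reserved */
	if (flags & GRE_FLAG_K)
		len += 4; /* key */
	if (flags & GRE_FLAG_S)
		len += 4; /* sequence number */

	return avail < len ? 0 : len;
}
```

A caller would add the returned length to its running offset before parsing the inner packet, and treat 0 as a truncated header.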

[dpdk-dev] [PATCH v3 08/16] net: support IP tunnels in software packet type parser

2016-10-03 Thread Olivier Matz

Add support for IP and IPv6 tunnels in rte_net_get_ptype().

Some code has to be duplicated because the packet type values for a
given protocol differ between the inner and outer headers.

Signed-off-by: Jean Dao 
Signed-off-by: Olivier Matz 
---
 lib/librte_net/rte_net.c | 158 ++-
 lib/librte_net/rte_net.h |   1 +
 2 files changed, 156 insertions(+), 3 deletions(-)

diff --git a/lib/librte_net/rte_net.c b/lib/librte_net/rte_net.c
index dc9e376..87294bb 100644
--- a/lib/librte_net/rte_net.c
+++ b/lib/librte_net/rte_net.c
@@ -93,6 +93,79 @@ ptype_l4(uint8_t proto)
return ptype_l4_proto[proto];
 }

+/* get inner l3 packet type from ip6 next protocol */
+static uint32_t
+ptype_inner_l3_ip6(uint8_t ip6_proto)
+{
+   static const uint32_t ptype_inner_ip6_ext_proto_map[256] = {
+   [IPPROTO_HOPOPTS] = RTE_PTYPE_INNER_L3_IPV6_EXT -
+   RTE_PTYPE_INNER_L3_IPV6,
+   [IPPROTO_ROUTING] = RTE_PTYPE_INNER_L3_IPV6_EXT -
+   RTE_PTYPE_INNER_L3_IPV6,
+   [IPPROTO_FRAGMENT] = RTE_PTYPE_INNER_L3_IPV6_EXT -
+   RTE_PTYPE_INNER_L3_IPV6,
+   [IPPROTO_ESP] = RTE_PTYPE_INNER_L3_IPV6_EXT -
+   RTE_PTYPE_INNER_L3_IPV6,
+   [IPPROTO_AH] = RTE_PTYPE_INNER_L3_IPV6_EXT -
+   RTE_PTYPE_INNER_L3_IPV6,
+   [IPPROTO_DSTOPTS] = RTE_PTYPE_INNER_L3_IPV6_EXT -
+   RTE_PTYPE_INNER_L3_IPV6,
+   };
+
+   return RTE_PTYPE_INNER_L3_IPV6 +
+   ptype_inner_ip6_ext_proto_map[ip6_proto];
+}
+
+/* get inner l3 packet type from ip version and header length */
+static uint32_t
+ptype_inner_l3_ip(uint8_t ipv_ihl)
+{
+   static const uint32_t ptype_inner_l3_ip_proto_map[256] = {
+   [0x45] = RTE_PTYPE_INNER_L3_IPV4,
+   [0x46] = RTE_PTYPE_INNER_L3_IPV4_EXT,
+   [0x47] = RTE_PTYPE_INNER_L3_IPV4_EXT,
+   [0x48] = RTE_PTYPE_INNER_L3_IPV4_EXT,
+   [0x49] = RTE_PTYPE_INNER_L3_IPV4_EXT,
+   [0x4A] = RTE_PTYPE_INNER_L3_IPV4_EXT,
+   [0x4B] = RTE_PTYPE_INNER_L3_IPV4_EXT,
+   [0x4C] = RTE_PTYPE_INNER_L3_IPV4_EXT,
+   [0x4D] = RTE_PTYPE_INNER_L3_IPV4_EXT,
+   [0x4E] = RTE_PTYPE_INNER_L3_IPV4_EXT,
+   [0x4F] = RTE_PTYPE_INNER_L3_IPV4_EXT,
+   };
+
+   return ptype_inner_l3_ip_proto_map[ipv_ihl];
+}
+
+/* get inner l4 packet type from proto */
+static uint32_t
+ptype_inner_l4(uint8_t proto)
+{
+   static const uint32_t ptype_inner_l4_proto[256] = {
+   [IPPROTO_UDP] = RTE_PTYPE_INNER_L4_UDP,
+   [IPPROTO_TCP] = RTE_PTYPE_INNER_L4_TCP,
+   [IPPROTO_SCTP] = RTE_PTYPE_INNER_L4_SCTP,
+   };
+
+   return ptype_inner_l4_proto[proto];
+}
+
+/* get the tunnel packet type if any, update proto. */
+static uint32_t
+ptype_tunnel(uint16_t *proto)
+{
+   switch (*proto) {
+   case IPPROTO_IPIP:
+   *proto = rte_cpu_to_be_16(ETHER_TYPE_IPv4);
+   return RTE_PTYPE_TUNNEL_IP;
+   case IPPROTO_IPV6:
+   *proto = rte_cpu_to_be_16(ETHER_TYPE_IPv6);
+   return RTE_PTYPE_TUNNEL_IP; /* IP is also valid for IPv6 */
+   default:
+   return 0;
+   }
+}
+
 /* get the ipv4 header length */
 static uint8_t
 ip4_hlen(const struct ipv4_hdr *hdr)
@@ -207,9 +280,8 @@ uint32_t rte_net_get_ptype(const struct rte_mbuf *m,
pkt_type |= ptype_l3_ip(ip4h->version_ihl);
hdr_lens->l3_len = ip4_hlen(ip4h);
off += hdr_lens->l3_len;
-   if (ip4h->fragment_offset &
-   rte_cpu_to_be_16(IPV4_HDR_OFFSET_MASK |
-   IPV4_HDR_MF_FLAG)) {
+   if (ip4h->fragment_offset & rte_cpu_to_be_16(
+   IPV4_HDR_OFFSET_MASK | IPV4_HDR_MF_FLAG)) {
pkt_type |= RTE_PTYPE_L4_FRAG;
hdr_lens->l4_len = 0;
return pkt_type;
@@ -245,6 +317,7 @@ uint32_t rte_net_get_ptype(const struct rte_mbuf *m,

if ((pkt_type & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_UDP) {
hdr_lens->l4_len = sizeof(struct udp_hdr);
+   return pkt_type;
} else if ((pkt_type & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_TCP) {
const struct tcp_hdr *th;
struct tcp_hdr th_copy;
@@ -254,10 +327,89 @@ uint32_t rte_net_get_ptype(const struct rte_mbuf *m,
return pkt_type & (RTE_PTYPE_L2_MASK |
RTE_PTYPE_L3_MASK);
hdr_lens->l4_len = (th->data_off & 0xf0) >> 2;
+   return pkt_type;
} else if ((pkt_type & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_SCTP) {
hdr_lens->l4_len = sizeof(struct sctp_hdr);
+   return pkt_type;
} else {

[dpdk-dev] [PATCH v3 07/16] net: support QinQ in software packet type parser

2016-10-03 Thread Olivier Matz

Add a new RTE_PTYPE_L2_ETHER_QINQ packet type, and its support in
rte_net_get_ptype().

Signed-off-by: Didier Pallard 
Signed-off-by: Olivier Matz 
---
 lib/librte_mbuf/rte_mbuf_ptype.h |  7 +++
 lib/librte_net/rte_ether.h   |  1 +
 lib/librte_net/rte_net.c | 16 
 lib/librte_net/rte_net.h |  2 +-
 4 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/lib/librte_mbuf/rte_mbuf_ptype.h b/lib/librte_mbuf/rte_mbuf_ptype.h
index a955c5a..6e62492 100644
--- a/lib/librte_mbuf/rte_mbuf_ptype.h
+++ b/lib/librte_mbuf/rte_mbuf_ptype.h
@@ -143,6 +143,13 @@ extern "C" {
  */
 #define RTE_PTYPE_L2_ETHER_VLAN 0x0006
 /**
+ * QinQ packet type.
+ *
+ * Packet format:
+ * <'ether type'=[0x88A8]>
+ */
+#define RTE_PTYPE_L2_ETHER_QINQ 0x0007
+/**
  * Mask of layer 2 packet types.
  * It is used for outer packet for tunneling cases.
  */
diff --git a/lib/librte_net/rte_ether.h b/lib/librte_net/rte_ether.h
index 647e6c9..ff3d065 100644
--- a/lib/librte_net/rte_ether.h
+++ b/lib/librte_net/rte_ether.h
@@ -329,6 +329,7 @@ struct vxlan_hdr {
 #define ETHER_TYPE_ARP  0x0806 /**< Arp Protocol. */
 #define ETHER_TYPE_RARP 0x8035 /**< Reverse Arp Protocol. */
 #define ETHER_TYPE_VLAN 0x8100 /**< IEEE 802.1Q VLAN tagging. */
+#define ETHER_TYPE_QINQ 0x88A8 /**< IEEE 802.1ad QinQ tagging. */
 #define ETHER_TYPE_1588 0x88F7 /**< IEEE 802.1AS 1588 Precise Time Protocol. */
 #define ETHER_TYPE_SLOW 0x8809 /**< Slow protocols (LACP and Marker). */
 #define ETHER_TYPE_TEB  0x6558 /**< Transparent Ethernet Bridging. */
diff --git a/lib/librte_net/rte_net.c b/lib/librte_net/rte_net.c
index a75b509..dc9e376 100644
--- a/lib/librte_net/rte_net.c
+++ b/lib/librte_net/rte_net.c
@@ -167,6 +167,9 @@ uint32_t rte_net_get_ptype(const struct rte_mbuf *m,
off = sizeof(*eh);
hdr_lens->l2_len = off;

+   if (proto == rte_cpu_to_be_16(ETHER_TYPE_IPv4))
+   goto l3; /* fast path if packet is IPv4 */
+
if (proto == rte_cpu_to_be_16(ETHER_TYPE_VLAN)) {
const struct vlan_hdr *vh;
struct vlan_hdr vh_copy;
@@ -178,8 +181,21 @@ uint32_t rte_net_get_ptype(const struct rte_mbuf *m,
off += sizeof(*vh);
hdr_lens->l2_len += sizeof(*vh);
proto = vh->eth_proto;
+   } else if (proto == rte_cpu_to_be_16(ETHER_TYPE_QINQ)) {
+   const struct vlan_hdr *vh;
+   struct vlan_hdr vh_copy;
+
+   pkt_type = RTE_PTYPE_L2_ETHER_QINQ;
+   vh = rte_pktmbuf_read(m, off + sizeof(*vh), sizeof(*vh),
+   &vh_copy);
+   if (unlikely(vh == NULL))
+   return pkt_type;
+   off += 2 * sizeof(*vh);
+   hdr_lens->l2_len += 2 * sizeof(*vh);
+   proto = vh->eth_proto;
}

+ l3:
if (proto == rte_cpu_to_be_16(ETHER_TYPE_IPv4)) {
const struct ipv4_hdr *ip4h;
struct ipv4_hdr ip4h_copy;
diff --git a/lib/librte_net/rte_net.h b/lib/librte_net/rte_net.h
index 81979f1..1224b0e 100644
--- a/lib/librte_net/rte_net.h
+++ b/lib/librte_net/rte_net.h
@@ -65,7 +65,7 @@ struct rte_net_hdr_lens {
  * (retval & RTE_PTYPE_L2_MASK) != RTE_PTYPE_UNKNOWN.
  *
  * Supported packet types are:
- *   L2: Ether
+ *   L2: Ether, Vlan, QinQ
  *   L3: IPv4, IPv6
  *   L4: TCP, UDP, SCTP
  *
-- 
2.8.1
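
Editor's note: the L2 handling added by the VLAN and QinQ patches can be illustrated on a flat byte buffer. This is only a sketch under stated assumptions: it does not use rte_pktmbuf_read() or any DPDK type, and l2_len, rd_be16 and the ETYPE_*/ETH_HLEN/VLAN_HLEN macros are hypothetical names. Like the patch above, it skips two tags for QinQ and takes the next protocol from the second tag, without checking that the inner tag is 0x8100.

```c
#include <stddef.h>
#include <stdint.h>

#define ETYPE_VLAN 0x8100 /* IEEE 802.1Q */
#define ETYPE_QINQ 0x88A8 /* IEEE 802.1ad */
#define ETH_HLEN   14     /* two MAC addresses + ether type */
#define VLAN_HLEN  4      /* TCI + encapsulated ether type */

/* read a big endian 16-bit value */
static uint16_t
rd_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Return the L2 header length and store the ether type of the payload
 * in *proto (host byte order). Return 0 if the frame is too short.
 */
static size_t
l2_len(const uint8_t *pkt, size_t len, uint16_t *proto)
{
	size_t off = ETH_HLEN;
	unsigned int tags = 0;

	if (len < off)
		return 0;
	*proto = rd_be16(pkt + 12);

	if (*proto == ETYPE_VLAN)
		tags = 1;
	else if (*proto == ETYPE_QINQ)
		tags = 2;
	if (tags == 0)
		return off;

	if (len < off + tags * VLAN_HLEN)
		return 0;
	off += tags * VLAN_HLEN;
	/* the last 2 bytes of the last tag hold the payload ether type */
	*proto = rd_be16(pkt + off - 2);
	return off;
}
```

For an untagged frame this gives 14, for a single VLAN tag 18, and for QinQ 22, which matches the l2_len updates in the diff.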

[dpdk-dev] [PATCH v3 06/16] net: support VLAN in software packet type parser

2016-10-03 Thread Olivier Matz

Add a new RTE_PTYPE_L2_ETHER_VLAN packet type, and its support in
rte_net_get_ptype().

Signed-off-by: Didier Pallard 
Signed-off-by: Olivier Matz 
---
 lib/librte_mbuf/rte_mbuf_ptype.h |  7 +++
 lib/librte_net/rte_net.c | 13 +
 2 files changed, 20 insertions(+)

diff --git a/lib/librte_mbuf/rte_mbuf_ptype.h b/lib/librte_mbuf/rte_mbuf_ptype.h
index 65e9ced..a955c5a 100644
--- a/lib/librte_mbuf/rte_mbuf_ptype.h
+++ b/lib/librte_mbuf/rte_mbuf_ptype.h
@@ -136,6 +136,13 @@ extern "C" {
  */
 #define RTE_PTYPE_L2_ETHER_NSH  0x0005
 /**
+ * VLAN packet type.
+ *
+ * Packet format:
+ * <'ether type'=[0x8100]>
+ */
+#define RTE_PTYPE_L2_ETHER_VLAN 0x0006
+/**
  * Mask of layer 2 packet types.
  * It is used for outer packet for tunneling cases.
  */
diff --git a/lib/librte_net/rte_net.c b/lib/librte_net/rte_net.c
index 93e9df0..a75b509 100644
--- a/lib/librte_net/rte_net.c
+++ b/lib/librte_net/rte_net.c
@@ -167,6 +167,19 @@ uint32_t rte_net_get_ptype(const struct rte_mbuf *m,
off = sizeof(*eh);
hdr_lens->l2_len = off;

+   if (proto == rte_cpu_to_be_16(ETHER_TYPE_VLAN)) {
+   const struct vlan_hdr *vh;
+   struct vlan_hdr vh_copy;
+
+   pkt_type = RTE_PTYPE_L2_ETHER_VLAN;
+   vh = rte_pktmbuf_read(m, off, sizeof(*vh), &vh_copy);
+   if (unlikely(vh == NULL))
+   return pkt_type;
+   off += sizeof(*vh);
+   hdr_lens->l2_len += sizeof(*vh);
+   proto = vh->eth_proto;
+   }
+
if (proto == rte_cpu_to_be_16(ETHER_TYPE_IPv4)) {
const struct ipv4_hdr *ip4h;
struct ipv4_hdr ip4h_copy;
-- 
2.8.1

[dpdk-dev] [PATCH v3 05/16] net: add function to get packet type from data

2016-10-03 Thread Olivier Matz

Introduce the function rte_net_get_ptype() that parses an mbuf and
returns its packet type. For now, the following packet types are parsed:
   L2: Ether
   L3: IPv4, IPv6
   L4: TCP, UDP, SCTP

The goal here is to provide a reference implementation for packet type
parsing. This function will be used by testpmd in the next commits,
making it possible to compare its result with the value given by the
hardware.

This function will also be useful when implementing Rx offload support
in the virtio pmd. Indeed, the virtio protocol gives the csum start and
offset, but it does not give the L4 protocol, nor does it tell whether
the checksum is relevant for the inner or the outer header. This
information has to be known to properly set the ol_flags in the mbuf.

Signed-off-by: Didier Pallard 
Signed-off-by: Jean Dao 
Signed-off-by: Olivier Matz 
---
 doc/guides/rel_notes/release_16_11.rst |   5 +
 lib/librte_net/Makefile|   4 +-
 lib/librte_net/rte_net.c   | 235 +
 lib/librte_net/rte_net.h   |  88 
 lib/librte_net/rte_net_version.map |   3 +
 5 files changed, 334 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_net/rte_net.h

diff --git a/doc/guides/rel_notes/release_16_11.rst b/doc/guides/rel_notes/release_16_11.rst
index ae24da2..b3b9dfb 100644
--- a/doc/guides/rel_notes/release_16_11.rst
+++ b/doc/guides/rel_notes/release_16_11.rst
@@ -41,6 +41,11 @@ New Features
   Added a new function ``rte_pktmbuf_read()`` to read the packet data from an
   mbuf chain, linearizing if required.

+* **Added a function to get the packet type from packet data.**
+
+  Added a new function ``rte_net_get_ptype()`` to parse an Ethernet packet
+  in an mbuf chain and retrieve its packet type by software.
+
 Resolved Issues
 ---

diff --git a/lib/librte_net/Makefile b/lib/librte_net/Makefile
index a6be7ae..c16b542 100644
--- a/lib/librte_net/Makefile
+++ b/lib/librte_net/Makefile
@@ -41,7 +41,9 @@ LIBABIVER := 1
 SRCS-$(CONFIG_RTE_LIBRTE_NET) := rte_net.c

 # install includes
-SYMLINK-$(CONFIG_RTE_LIBRTE_NET)-include := rte_ip.h rte_tcp.h rte_udp.h rte_sctp.h rte_icmp.h rte_arp.h rte_ether.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_NET)-include := rte_ip.h rte_tcp.h rte_udp.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_NET)-include += rte_sctp.h rte_icmp.h rte_arp.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_NET)-include += rte_ether.h rte_net.h

 DEPDIRS-$(CONFIG_RTE_LIBRTE_NET) += lib/librte_eal lib/librte_mempool
 DEPDIRS-$(CONFIG_RTE_LIBRTE_NET) += lib/librte_mbuf
diff --git a/lib/librte_net/rte_net.c b/lib/librte_net/rte_net.c
index e69de29..93e9df0 100644
--- a/lib/librte_net/rte_net.c
+++ b/lib/librte_net/rte_net.c
@@ -0,0 +1,235 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of 6WIND S.A. nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* get l3 packet type from ip6 next protocol */
+static uint32_t
+ptype_l3_ip6(uint8_t ip6_proto)
+{
+   static const uint32_t ip6_ext_proto_map[256] = {
+   [IPPROTO_HOPOPTS] = RTE_PTYPE_L3_IPV6_EXT - RTE_PTYPE_L3_IPV6,
+   [IPPROTO_ROUTING] = RTE_PTYPE_L3_IPV6_EXT - RTE_PTYPE_L3_IPV6,
+   [IPPROTO_FRAGMENT] = RTE_PTYPE_L3_IPV6_EXT - RTE_PTYPE_L3_IPV6,
+   [IPPROTO_ESP] = RTE_PTYPE_L3_IPV6_EXT - RTE_PTYPE_L3_IPV6,
+   [IPPROTO_AH] = RTE_PTYPE_L3_IPV6_

[dpdk-dev] [PATCH v3 04/16] net: introduce net library

2016-10-03 Thread Olivier Matz

Previously, librte_net only contained header files. Add a C file
(empty for now) and generate a library. It will contain network helpers
like checksum calculation, software packet type parser, ...

Signed-off-by: Olivier Matz 
---
 MAINTAINERS|  1 +
 lib/librte_net/Makefile| 11 ++-
 lib/librte_net/rte_net.c   |  0
 lib/librte_net/rte_net_version.map |  3 +++
 mk/rte.app.mk  |  1 +
 mk/rte.lib.mk  |  2 +-
 6 files changed, 16 insertions(+), 2 deletions(-)
 create mode 100644 lib/librte_net/rte_net.c
 create mode 100644 lib/librte_net/rte_net_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 7c33ad4..3885df5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -439,6 +439,7 @@ Packet processing
 -

 Network headers
+M: Olivier Matz 
 F: lib/librte_net/

 IP fragmentation & reassembly
diff --git a/lib/librte_net/Makefile b/lib/librte_net/Makefile
index fc332ff..a6be7ae 100644
--- a/lib/librte_net/Makefile
+++ b/lib/librte_net/Makefile
@@ -31,10 +31,19 @@

 include $(RTE_SDK)/mk/rte.vars.mk

+LIB = librte_net.a
+
 CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3

+EXPORT_MAP := rte_net_version.map
+LIBABIVER := 1
+
+SRCS-$(CONFIG_RTE_LIBRTE_NET) := rte_net.c
+
 # install includes
 SYMLINK-$(CONFIG_RTE_LIBRTE_NET)-include := rte_ip.h rte_tcp.h rte_udp.h rte_sctp.h rte_icmp.h rte_arp.h rte_ether.h

+DEPDIRS-$(CONFIG_RTE_LIBRTE_NET) += lib/librte_eal lib/librte_mempool
+DEPDIRS-$(CONFIG_RTE_LIBRTE_NET) += lib/librte_mbuf

-include $(RTE_SDK)/mk/rte.install.mk
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_net/rte_net.c b/lib/librte_net/rte_net.c
new file mode 100644
index 000..e69de29
diff --git a/lib/librte_net/rte_net_version.map b/lib/librte_net/rte_net_version.map
new file mode 100644
index 000..cc5829e
--- /dev/null
+++ b/lib/librte_net/rte_net_version.map
@@ -0,0 +1,3 @@
+DPDK_16.11 {
+   local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 1a0095b..b519e08 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -90,6 +90,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_VHOST)  += -lrte_vhost

 _LDLIBS-$(CONFIG_RTE_LIBRTE_KVARGS) += -lrte_kvargs
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MBUF)   += -lrte_mbuf
+_LDLIBS-$(CONFIG_RTE_LIBRTE_NET)+= -lrte_net
 _LDLIBS-$(CONFIG_RTE_LIBRTE_ETHER)  += -lethdev
 _LDLIBS-$(CONFIG_RTE_LIBRTE_CRYPTODEV)  += -lrte_cryptodev
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MEMPOOL)+= -lrte_mempool
diff --git a/mk/rte.lib.mk b/mk/rte.lib.mk
index 830f81a..7b96fd4 100644
--- a/mk/rte.lib.mk
+++ b/mk/rte.lib.mk
@@ -79,7 +79,7 @@ endif

 # Translate DEPDIRS-y into LDLIBS
 # Ignore (sub)directory dependencies which do not provide an actual library
-_IGNORE_DIRS = lib/librte_eal/% lib/librte_net lib/librte_compat
+_IGNORE_DIRS = lib/librte_eal/% lib/librte_compat
 _DEPDIRS = $(filter-out $(_IGNORE_DIRS),$(DEPDIRS-y))
 _LDDIRS = $(subst librte_ether,libethdev,$(_DEPDIRS))
 LDLIBS += $(subst lib/lib,-l,$(_LDDIRS))
-- 
2.8.1

[dpdk-dev] [PATCH v3 03/16] mbuf: move packet type definitions in a new file

2016-10-03 Thread Olivier Matz

The file rte_mbuf.h is getting quite big, and the next commits
will introduce more functions related to packet types. Let's
move the packet type definitions to a new file.

Signed-off-by: Olivier Matz 
---
 lib/librte_mbuf/Makefile |   2 +-
 lib/librte_mbuf/rte_mbuf.h   | 495 +--
 lib/librte_mbuf/rte_mbuf_ptype.h | 552 +++
 3 files changed, 554 insertions(+), 495 deletions(-)
 create mode 100644 lib/librte_mbuf/rte_mbuf_ptype.h

diff --git a/lib/librte_mbuf/Makefile b/lib/librte_mbuf/Makefile
index 8d62b0d..27e037c 100644
--- a/lib/librte_mbuf/Makefile
+++ b/lib/librte_mbuf/Makefile
@@ -44,7 +44,7 @@ LIBABIVER := 2
 SRCS-$(CONFIG_RTE_LIBRTE_MBUF) := rte_mbuf.c

 # install includes
-SYMLINK-$(CONFIG_RTE_LIBRTE_MBUF)-include := rte_mbuf.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_MBUF)-include := rte_mbuf.h rte_mbuf_ptype.h

 # this lib needs eal
 DEPDIRS-$(CONFIG_RTE_LIBRTE_MBUF) += lib/librte_eal lib/librte_mempool
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index a26b9b9..1451ec3 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -60,6 +60,7 @@
 #include 
 #include 
 #include 
+#include 

 #ifdef __cplusplus
 extern "C" {
@@ -225,500 +226,6 @@ extern "C" {
 /* Use final bit of flags to indicate a control mbuf */
 #define CTRL_MBUF_FLAG   (1ULL << 63) /**< Mbuf contains control data */

-/*
- * 32 bits are divided into several fields to mark packet types. Note that
- * each field is indexical.
- * - Bit 3:0 is for L2 types.
- * - Bit 7:4 is for L3 or outer L3 (for tunneling case) types.
- * - Bit 11:8 is for L4 or outer L4 (for tunneling case) types.
- * - Bit 15:12 is for tunnel types.
- * - Bit 19:16 is for inner L2 types.
- * - Bit 23:20 is for inner L3 types.
- * - Bit 27:24 is for inner L4 types.
- * - Bit 31:28 is reserved.
- *
- * To be compatible with Vector PMD, RTE_PTYPE_L3_IPV4, RTE_PTYPE_L3_IPV4_EXT,
- * RTE_PTYPE_L3_IPV6, RTE_PTYPE_L3_IPV6_EXT, RTE_PTYPE_L4_TCP, RTE_PTYPE_L4_UDP
- * and RTE_PTYPE_L4_SCTP should be kept as below in a contiguous 7 bits.
- *
- * Note that L3 types values are selected for checking IPV4/IPV6 header from
- * performance point of view. Reading annotations of RTE_ETH_IS_IPV4_HDR and
- * RTE_ETH_IS_IPV6_HDR is needed for any future changes of L3 type values.
- *
- * Note that the packet types of the same packet recognized by different
- * hardware may be different, as different hardware may have different
- * capability of packet type recognition.
- *
- * examples:
- * <'ether type'=0x0800
- * | 'version'=4, 'protocol'=0x29
- * | 'version'=6, 'next header'=0x3A
- * | 'ICMPv6 header'>
- * will be recognized on i40e hardware as packet type combination of,
- * RTE_PTYPE_L2_ETHER |
- * RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
- * RTE_PTYPE_TUNNEL_IP |
- * RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
- * RTE_PTYPE_INNER_L4_ICMP.
- *
- * <'ether type'=0x86DD
- * | 'version'=6, 'next header'=0x2F
- * | 'GRE header'
- * | 'version'=6, 'next header'=0x11
- * | 'UDP header'>
- * will be recognized on i40e hardware as packet type combination of,
- * RTE_PTYPE_L2_ETHER |
- * RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
- * RTE_PTYPE_TUNNEL_GRENAT |
- * RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
- * RTE_PTYPE_INNER_L4_UDP.
- */
-#define RTE_PTYPE_UNKNOWN   0x
-/**
- * Ethernet packet type.
- * It is used for outer packet for tunneling cases.
- *
- * Packet format:
- * <'ether type'=[0x0800|0x86DD]>
- */
-#define RTE_PTYPE_L2_ETHER  0x0001
-/**
- * Ethernet packet type for time sync.
- *
- * Packet format:
- * <'ether type'=0x88F7>
- */
-#define RTE_PTYPE_L2_ETHER_TIMESYNC 0x0002
-/**
- * ARP (Address Resolution Protocol) packet type.
- *
- * Packet format:
- * <'ether type'=0x0806>
- */
-#define RTE_PTYPE_L2_ETHER_ARP  0x0003
-/**
- * LLDP (Link Layer Discovery Protocol) packet type.
- *
- * Packet format:
- * <'ether type'=0x88CC>
- */
-#define RTE_PTYPE_L2_ETHER_LLDP 0x0004
-/**
- * NSH (Network Service Header) packet type.
- *
- * Packet format:
- * <'ether type'=0x894F>
- */
-#define RTE_PTYPE_L2_ETHER_NSH  0x0005
-/**
- * Mask of layer 2 packet types.
- * It is used for outer packet for tunneling cases.
- */
-#define RTE_PTYPE_L2_MASK   0x000f
-/**
- * IP (Internet Protocol) version 4 packet type.
- * It is used for outer packet for tunneling cases, and does not contain any
- * header option.
- *
- * Packet format:
- * <'ether type'=0x0800
- * | 'version'=4, 'ihl'=5>
- */
-#define RTE_PTYPE_L3_IPV4   0x0010
-/**
- * IP (Internet Protocol) version 4 packet type.
- * It is used for outer packet for tunneling cases, and contains header
- * options.
- *
- * Packet format:
- * <'ether type'=0x0800
- * | 'version'=4, 'ihl'=[6-15], 'options'>
- */
-#define RTE_PTYPE_L3_IPV4_EXT   0x0030
-/**
- * IP (Internet Protocol) version 6 packet type.
- * It is used for outer

[dpdk-dev] [PATCH v3 02/16] net: move Ethernet header definitions to the net library

2016-10-03 Thread Olivier Matz

The proper place for rte_ether.h is in librte_net because it defines
network headers.

Moving it also avoids circular references in the following patches,
which will require the Ethernet header definition in rte_mbuf.c.
While at it, fix minor checkpatch issues.

Signed-off-by: Didier Pallard 
Signed-off-by: Olivier Matz 
---
 lib/librte_ether/Makefile|   3 +-
 lib/librte_ether/rte_ether.h | 416 ---
 lib/librte_net/Makefile  |   2 +-
 lib/librte_net/rte_ether.h   | 416 +++
 4 files changed, 418 insertions(+), 419 deletions(-)
 delete mode 100644 lib/librte_ether/rte_ether.h
 create mode 100644 lib/librte_net/rte_ether.h

diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
index 0bb5dc9..488b7c8 100644
--- a/lib/librte_ether/Makefile
+++ b/lib/librte_ether/Makefile
@@ -48,12 +48,11 @@ SRCS-y += rte_ethdev.c
 #
 # Export include files
 #
-SYMLINK-y-include += rte_ether.h
 SYMLINK-y-include += rte_ethdev.h
 SYMLINK-y-include += rte_eth_ctrl.h
 SYMLINK-y-include += rte_dev_info.h

 # this lib depends upon:
-DEPDIRS-y += lib/librte_eal lib/librte_mempool lib/librte_ring lib/librte_mbuf
+DEPDIRS-y += lib/librte_net lib/librte_eal lib/librte_mempool lib/librte_ring lib/librte_mbuf

 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_ether/rte_ether.h b/lib/librte_ether/rte_ether.h
deleted file mode 100644
index 1d62d8e..000
--- a/lib/librte_ether/rte_ether.h
+++ /dev/null
@@ -1,416 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- * * Redistributions of source code must retain the above copyright
- *   notice, this list of conditions and the following disclaimer.
- * * Redistributions in binary form must reproduce the above copyright
- *   notice, this list of conditions and the following disclaimer in
- *   the documentation and/or other materials provided with the
- *   distribution.
- * * Neither the name of Intel Corporation nor the names of its
- *   contributors may be used to endorse or promote products derived
- *   from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_ETHER_H_
-#define _RTE_ETHER_H_
-
-/**
- * @file
- *
- * Ethernet Helpers in RTE
- */
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include 
-#include 
-
-#include 
-#include 
-#include 
-#include 
-
-#define ETHER_ADDR_LEN  6 /**< Length of Ethernet address. */
-#define ETHER_TYPE_LEN  2 /**< Length of Ethernet type field. */
-#define ETHER_CRC_LEN   4 /**< Length of Ethernet CRC. */
-#define ETHER_HDR_LEN   \
-   (ETHER_ADDR_LEN * 2 + ETHER_TYPE_LEN) /**< Length of Ethernet header. */
-#define ETHER_MIN_LEN   64/**< Minimum frame len, including CRC. */
-#define ETHER_MAX_LEN   1518  /**< Maximum frame len, including CRC. */
-#define ETHER_MTU   \
-   (ETHER_MAX_LEN - ETHER_HDR_LEN - ETHER_CRC_LEN) /**< Ethernet MTU. */
-
-#define ETHER_MAX_VLAN_FRAME_LEN \
-   (ETHER_MAX_LEN + 4) /**< Maximum VLAN frame length, including CRC. */
-
-#define ETHER_MAX_JUMBO_FRAME_LEN \
-   0x3F00 /**< Maximum Jumbo frame length, including CRC. */
-
-#define ETHER_MAX_VLAN_ID  4095 /**< Maximum VLAN ID. */
-
-#define ETHER_MIN_MTU 68 /**< Minimum MTU for IPv4 packets, see RFC 791. */
-
-/**
- * Ethernet address:
- * A universally administered address is uniquely assigned to a device by its
- * manufacturer. The first three octets (in transmission order) contain the
- * Organizationally Unique Identifier (OUI). The following three (MAC-48 and
- * EUI-48) octets are assigned by that organization with the only constraint
- * of uniqueness.
- * A locally administered address is assigned to a device by a network
- * administrator and does not contain OUIs.
- * See http://standards.ieee.org/regauth/groupmac/tutorial.html
- */
-struct ether_addr {
-   uint8_t addr_bytes[ETHER_ADDR_LEN]; /**< Ad

[dpdk-dev] [PATCH v3 01/16] mbuf: add function to read packet data

2016-10-03 Thread Olivier Matz

Introduce a new function to read the packet data from an mbuf chain. It
linearizes the data if required, and also ensures that the mbuf is large
enough.

This function is used in the next commits, which add a software parser
to retrieve the packet type.

Signed-off-by: Olivier Matz 
---
 doc/guides/rel_notes/release_16_11.rst |  4 
 lib/librte_mbuf/rte_mbuf.c | 35 ++
 lib/librte_mbuf/rte_mbuf.h | 35 ++
 lib/librte_mbuf/rte_mbuf_version.map   |  7 +++
 4 files changed, 81 insertions(+)

diff --git a/doc/guides/rel_notes/release_16_11.rst b/doc/guides/rel_notes/release_16_11.rst
index a9a6095..ae24da2 100644
--- a/doc/guides/rel_notes/release_16_11.rst
+++ b/doc/guides/rel_notes/release_16_11.rst
@@ -36,6 +36,10 @@ New Features

  This section is a comment. Make sure to start the actual text at the margin.

+* **Added function to read packet data.**
+
+  Added a new function ``rte_pktmbuf_read()`` to read the packet data from an
+  mbuf chain, linearizing if required.

 Resolved Issues
 ---
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 80b1713..37fd72b 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -58,6 +58,7 @@
 #include 
 #include 
 #include 
+#include 

 /*
  * ctrlmbuf constructor, given as a callback function to
@@ -261,6 +262,40 @@ rte_pktmbuf_dump(FILE *f, const struct rte_mbuf *m, unsigned dump_len)
}
 }

+/* read len data bytes in a mbuf at specified offset (internal) */
+const void *__rte_pktmbuf_read(const struct rte_mbuf *m, uint32_t off,
+   uint32_t len, void *buf)
+{
+   const struct rte_mbuf *seg = m;
+   uint32_t buf_off = 0, copy_len;
+
+   if (off + len > rte_pktmbuf_pkt_len(m))
+   return NULL;
+
+   while (off >= rte_pktmbuf_data_len(seg)) {
+   off -= rte_pktmbuf_data_len(seg);
+   seg = seg->next;
+   }
+
+   if (off + len <= rte_pktmbuf_data_len(seg))
+   return rte_pktmbuf_mtod_offset(seg, char *, off);
+
+   /* rare case: header is split among several segments */
+   while (len > 0) {
+   copy_len = rte_pktmbuf_data_len(seg) - off;
+   if (copy_len > len)
+   copy_len = len;
+   rte_memcpy((char *)buf + buf_off,
+   rte_pktmbuf_mtod_offset(seg, char *, off), copy_len);
+   off = 0;
+   buf_off += copy_len;
+   len -= copy_len;
+   seg = seg->next;
+   }
+
+   return buf;
+}
+
 /*
  * Get the name of a RX offload flag. Must be kept synchronized with flag
  * definitions in rte_mbuf.h.
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 23b7bf8..a26b9b9 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -1960,6 +1960,41 @@ static inline int rte_pktmbuf_is_contiguous(const struct rte_mbuf *m)
 }

 /**
+ * @internal used by rte_pktmbuf_read().
+ */
+const void *__rte_pktmbuf_read(const struct rte_mbuf *m, uint32_t off,
+   uint32_t len, void *buf);
+
+/**
+ * Read len data bytes in a mbuf at specified offset.
+ *
+ * If the data is contiguous, return the pointer in the mbuf data, else
+ * copy the data in the buffer provided by the user and return its
+ * pointer.
+ *
+ * @param m
+ *   The pointer to the mbuf.
+ * @param off
+ *   The offset of the data in the mbuf.
+ * @param len
+ *   The amount of bytes to read.
+ * @param buf
+ *   The buffer where data is copied if it is not contiguous in mbuf
+ *   data. Its length should be at least equal to the len parameter.
+ * @return
+ *   The pointer to the data, either in the mbuf if it is contiguous,
+ *   or in the user buffer. If mbuf is too small, NULL is returned.
+ */
+static inline const void *rte_pktmbuf_read(const struct rte_mbuf *m,
+   uint32_t off, uint32_t len, void *buf)
+{
+   if (likely(off + len <= rte_pktmbuf_data_len(m)))
+   return rte_pktmbuf_mtod_offset(m, char *, off);
+   else
+   return __rte_pktmbuf_read(m, off, len, buf);
+}
+
+/**
  * Chain an mbuf to another, thereby creating a segmented packet.
  *
  * Note: The implementation will do a linear walk over the segments to find
diff --git a/lib/librte_mbuf/rte_mbuf_version.map b/lib/librte_mbuf/rte_mbuf_version.map
index e10f6bd..79e4dd8 100644
--- a/lib/librte_mbuf/rte_mbuf_version.map
+++ b/lib/librte_mbuf/rte_mbuf_version.map
@@ -18,3 +18,10 @@ DPDK_2.1 {
rte_pktmbuf_pool_create;

 } DPDK_2.0;
+
+DPDK_16.11 {
+   global:
+
+   __rte_pktmbuf_read;
+
+} DPDK_2.1;
-- 
2.8.1
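
To illustrate the read contract documented in the patch above (return a
pointer into the packet when the range is contiguous, otherwise copy into the
caller's buffer), here is a minimal standalone sketch of the same walk over a
segment chain. It assumes a simplified struct seg in place of struct rte_mbuf,
and seg_read() is a hypothetical name; neither is DPDK API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a chained mbuf; NOT struct rte_mbuf. */
struct seg {
	const uint8_t *data;    /* payload of this segment */
	uint32_t data_len;      /* bytes in this segment (may be 0) */
	uint32_t pkt_len;       /* total packet length, valid in first segment */
	const struct seg *next; /* next segment or NULL */
};

/*
 * Return a pointer to len bytes at offset off. The pointer refers to the
 * segment data when the range is contiguous, else to buf after copying.
 * Returns NULL when the packet is too short.
 */
static const void *
seg_read(const struct seg *s, uint32_t off, uint32_t len, void *buf)
{
	uint32_t buf_off = 0, copy_len;

	assert(s != NULL);
	if ((uint64_t)off + len > s->pkt_len)
		return NULL;

	/* find the segment holding off; ">=" also steps over empty segments */
	while (s != NULL && off >= s->data_len) {
		off -= s->data_len;
		s = s->next;
	}
	if (s == NULL)
		return NULL; /* zero-length read at the very end of the packet */

	if (off + len <= s->data_len)
		return s->data + off; /* common case: no copy needed */

	/* rare case: the range is split among several segments */
	while (len > 0) {
		copy_len = s->data_len - off;
		if (copy_len > len)
			copy_len = len;
		memcpy((uint8_t *)buf + buf_off, s->data + off, copy_len);
		off = 0;
		buf_off += copy_len;
		len -= copy_len;
		s = s->next;
	}
	return buf;
}
```

With a chain "ABCD" -> "" -> "EFGH", reading 4 bytes at offset 2 copies "CDEF"
into the buffer, while reading 2 bytes at offset 4 returns a pointer straight
into the third segment.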

[dpdk-dev] [PATCH v3 00/16] software parser for packet type

2016-10-03 Thread Olivier Matz

This patchset introduces a software packet type parser. This
feature is targeted for v16.11.

The goal here is to provide a reference implementation for packet type
parsing. This function will be used by testpmd to compare its result
with the value given by the hardware.

It will also be useful when implementing Rx offload support in the virtio
pmd. Indeed, the virtio protocol gives the csum start and offset, but
it gives neither the L4 protocol nor whether the checksum applies to the
inner or the outer header. This information has to be known to
properly set the ol_flags in the mbuf.
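
As a rough idea of what a software packet type parser does, the following
standalone sketch classifies a raw Ethernet frame. It is an assumption-laden
illustration, not the librte_net code: the PT_* flags and sw_ptype() are
hypothetical names, and tunnels, IPv4 fragments and IPv6 extension headers
are deliberately ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values for this sketch; not the RTE_PTYPE_* constants. */
enum {
	PT_L2_ETHER      = 1u << 0,
	PT_L2_ETHER_VLAN = 1u << 1,
	PT_L2_ETHER_QINQ = 1u << 2,
	PT_L3_IPV4       = 1u << 3,
	PT_L3_IPV6       = 1u << 4,
	PT_L4_TCP        = 1u << 5,
	PT_L4_UDP        = 1u << 6,
};

/* Read a big-endian 16-bit field. */
static uint16_t
rd16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Return a mask of PT_* flags, or 0 if the L2 header is truncated. */
static uint32_t
sw_ptype(const uint8_t *pkt, size_t len)
{
	uint32_t ptype = PT_L2_ETHER;
	size_t off = 14; /* two MAC addresses + EtherType */
	uint16_t proto;
	uint8_t l4;

	assert(pkt != NULL);
	if (len < off)
		return 0;
	proto = rd16(pkt + 12);

	if (proto == 0x8100) {        /* one 802.1Q tag: TCI + EtherType */
		if (len < off + 4)
			return 0;
		ptype = PT_L2_ETHER_VLAN;
		proto = rd16(pkt + off + 2);
		off += 4;
	} else if (proto == 0x88A8) { /* QinQ: assume exactly two tags */
		if (len < off + 8)
			return 0;
		ptype = PT_L2_ETHER_QINQ;
		proto = rd16(pkt + off + 6);
		off += 8;
	}

	if (proto == 0x0800) {        /* IPv4: protocol is byte 9 */
		if (len < off + 20)
			return ptype;
		ptype |= PT_L3_IPV4;
		l4 = pkt[off + 9];
	} else if (proto == 0x86DD) { /* IPv6: next header is byte 6 */
		if (len < off + 40)
			return ptype;
		ptype |= PT_L3_IPV6;
		l4 = pkt[off + 6];
	} else {
		return ptype;
	}

	if (l4 == 6)
		ptype |= PT_L4_TCP;
	else if (l4 == 17)
		ptype |= PT_L4_UDP;
	return ptype;
}
```

A driver that only receives a checksum start and offset could run such a
classification on the packet data to decide which flags to set.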

changes v2 -> v3
- fix in rte_pktmbuf_read(): allow empty segments
- fix shared lib compilation by removing librte_net from automatic
  directory dependency filter in rte.lib.mk
- fix typo in license header
- rebase on top of head

changes v1 -> v2
- implement sw parser in librte_net instead of librte_mbuf
- remove MPLS parser for now, mapping mpls to packet type requires
  more discussion
- remove the patch adding the 16.11 release notes template, the
  file is already present now
- rebase on current head

Olivier Matz (16):
  mbuf: add function to read packet data
  net: move Ethernet header definitions to the net library
  mbuf: move packet type definitions in a new file
  net: introduce net library
  net: add function to get packet type from data
  net: support Vlan in software packet type parser
  net: support QinQ in software packet type parser
  net: support Ip tunnels in software packet type parser
  net: add Gre header structure
  net: support Gre in software packet type parser
  net: support Nvgre in software packet type parser
  net: get ptype for the first layers only
  mbuf: add functions to dump packet type
  mbuf: clarify definition of fragment packet types
  app/testpmd: dump ptype using the new function
  app/testpmd: display software packet type

 MAINTAINERS                            |   1 +
 app/test-pmd/rxonly.c                  | 196 ++
 doc/guides/rel_notes/release_16_11.rst |  13 +
 lib/librte_ether/Makefile              |   3 +-
 lib/librte_ether/rte_ether.h           | 416
 lib/librte_mbuf/Makefile               |   4 +-
 lib/librte_mbuf/rte_mbuf.c             |  35 ++
 lib/librte_mbuf/rte_mbuf.h             | 530 ++
 lib/librte_mbuf/rte_mbuf_ptype.c       | 227 +++
 lib/librte_mbuf/rte_mbuf_ptype.h       | 668 +
 lib/librte_mbuf/rte_mbuf_version.map   |  15 +
 lib/librte_net/Makefile                |  15 +-
 lib/librte_net/rte_ether.h             | 417
 lib/librte_net/rte_gre.h               |  71
 lib/librte_net/rte_net.c               | 517 +
 lib/librte_net/rte_net.h               |  94 +
 lib/librte_net/rte_net_version.map     |   6 +
 mk/rte.app.mk                          |   1 +
 mk/rte.lib.mk                          |   2 +-
 19 files changed, 2143 insertions(+), 1088 deletions(-)
 delete mode 100644 lib/librte_ether/rte_ether.h
 create mode 100644 lib/librte_mbuf/rte_mbuf_ptype.c
 create mode 100644 lib/librte_mbuf/rte_mbuf_ptype.h
 create mode 100644 lib/librte_net/rte_ether.h
 create mode 100644 lib/librte_net/rte_gre.h
 create mode 100644 lib/librte_net/rte_net.c
 create mode 100644 lib/librte_net/rte_net.h
 create mode 100644 lib/librte_net/rte_net_version.map

Test report
===========

(not fully replayed on v3, but no major change)

Topology:

         dut
   +-------------+
   |             |
   | ixgbe pmd   +---.
   |             |   |
   |             |   |
   | ixgbe linux +---'
   |             |
   +-------------+

We will send packets with scapy from the kernel interface to
testpmd with rxonly engine, and check the logs to verify the
packet type.

# compile and run testpmd
cd dpdk.org/
make config T=x86_64-native-linuxapp-gcc
make -j32

mkdir -p /mnt/huge
mount -t hugetlbfs nodev /mnt/huge
echo 256 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
modprobe uio_pci_generic
python tools/dpdk_nic_bind.py -b uio_pci_generic :04:00.0

./build/app/testpmd -l 2,4 -- --total-num-mbufs=65536 -i --port-topology=chained --enable-rx-cksum --disable-hw-vlan-filter --disable-hw-vlan-strip
  set fwd rxonly
  set verbose 1
  start

# on another terminal, run scapy
scapy

eh = Ether(src="00:01:02:03:04:05", dst="00:1B:21:AB:8F:10")
vlan = Dot1Q(vlan=0x666)
eth = "ixgbe2"

bind_layers(GRE, IPv6, type=0x86dd)

v4/udp
======

# scapy
p = eh/IP()/UDP()/Raw("x"*32)
sendp(p, iface=eth)
p = eh/vlan/IP()/UDP()/Raw("x"*32)
sendp(p, iface=eth)
p = eh/vlan/vlan/IP()/UDP()/Raw("x"*32)
p.type=0x88A8 # QinQ
sendp(p, iface=eth)
p = eh/IP(options=IPOption('\x83\x03\x10'))/UDP()/Raw("x"*32)
sendp(p, iface=eth)

# displayed in testpmd
port 0/queue 0: received 1 packets
  src=00:01:02:03:04:05 - dst=00:1B:21:AB:8F:10 - type=0x0800 - length=74 - nb_segs=1 - hw ptype: L2_ETHER L3_IPV4 L4_UDP  - sw ptype: L2_ETHER L3_IPV4 L4_UDP  -

[dpdk-dev] [PATCH 1/1 v2] eal: Fix misleading error messages, errno can't be trusted.

2016-10-03 Thread Jean Tourrilhes

On Mon, Oct 03, 2016 at 04:15:11PM +, Mcnamara, John wrote:
> 
> The longer more detailed version is here: "Contributing Code to DPDK":
> 
> http://dpdk.org/doc/guides/contributing/patches.html
> 
> John

Thanks a lot. I'll try to find time to look at it.

Jean

[dpdk-dev] [PATCH 1/1 v2] eal: Fix misleading error messages, errno can't be trusted.

2016-10-03 Thread Jean Tourrilhes

On Mon, Oct 03, 2016 at 02:25:40PM +0100, Sergio Gonzalez Monroy wrote:
> Hi Jean,
> 
> NIT but any reason you moved the check before closing the file descriptor?
> (not that it matters with current code as we panic anyway)
> 
> Thanks,
> Sergio

More details, as I admit I was terse.
Running a secondary process is tricky because the memory region must be
mapped at the right place, which is whatever address the primary has
chosen. If the base address chosen by the primary happens to be already
mapped in the secondary, we will hit precisely this error message (well,
in a few cases we might hit the other one). This is why there is already
a comment about ASLR.
A colleague of mine hit that message and was misled by errno
claiming "permission denied", which sent him down the wrong
track. It's such a common error for secondary processes that I feel this
error message should be unambiguous and helpful.
Regards,

Jean
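
The two failure modes described above can be told apart as in the following
minimal sketch. It is not the EAL code: map_at() and map_at_describe() are
hypothetical helpers, and an anonymous mapping stands in for the shared
rte_config file. The point is that errno is only meaningful when mmap()
returns MAP_FAILED; when the call succeeds at a different address, errno is
stale.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

enum map_status { MAP_AT_OK, MAP_AT_FAILED, MAP_AT_MOVED };

/*
 * Hypothetical helper: ask for a mapping at the hint address `want`
 * without MAP_FIXED, so the kernel may choose another address.
 * *err receives errno only when mmap() itself failed.
 */
static enum map_status
map_at(void *want, size_t len, void **got, int *err)
{
	assert(got != NULL && err != NULL);
	*err = 0;
	*got = mmap(want, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (*got == MAP_FAILED) {
		*err = errno;        /* valid: the call failed */
		return MAP_AT_FAILED;
	}
	if (*got != want)
		return MAP_AT_MOVED; /* call succeeded: errno is stale */
	return MAP_AT_OK;
}

/* Build a message that mentions errno only when it is meaningful. */
static void
map_at_describe(enum map_status st, void *want, void *got, int err,
		char *msg, size_t n)
{
	switch (st) {
	case MAP_AT_FAILED:
		snprintf(msg, n, "cannot mmap: error %d (%s)",
			 err, strerror(err));
		break;
	case MAP_AT_MOVED:
		snprintf(msg, n, "wanted [%p], got [%p]: address in use?",
			 want, got);
		break;
	default:
		snprintf(msg, n, "mapped at [%p]", got);
		break;
	}
}
```

Requesting an address that is already mapped in the calling process yields
MAP_AT_MOVED, which is the situation a secondary process hits when the
primary's base address is taken.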

[dpdk-dev] [RFC PATCH v2 1/5] librte_ether: add internal callback functions

2016-10-03 Thread Iremonger, Bernard

Hi Jerin,

> -Original Message-
> From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> Sent: Wednesday, September 14, 2016 12:28 PM
> To: ZELEZNIAK, ALEX 
> Cc: Iremonger, Bernard ; Shah, Rahul R
> ; Lu, Wenzhuo ;
> dev at dpdk.org
> Subject: Re: [dpdk-dev] [RFC PATCH v2 1/5] librte_ether: add internal
> callback functions
> 
> On Tue, Sep 13, 2016 at 02:05:49PM +, ZELEZNIAK, ALEX wrote:
> > Idea here is not to allow VM to control policies assigned to it for
> > security and other reasons. PF is controlled by host and dictates what
> > VM can and can't do in regards of setting VF parameters.
> 
> I think that in the proposed scheme, the VM does not take any action on its
> own. The VM will just do what the centralized entity tells it to do.
> I think if you are planning to support different varieties of PMD then this
> could be an option. However, if you wish to support only a subset of PMDs
> then the PF MBOX based scheme may be enough.
> In any case, I think exposing the fine details of the PF/VF MBOX scheme in
> the ethdev spec is not a good idea.

I have reworked these patches (1/5 and 2/5) using the new rte_pmd_ixgbe.h file
and submitted them as a separate patchset.

[PATCH v3 0/2] add callbacks for VF management
http://dpdk.org/dev/patchwork/patch/16321/
http://dpdk.org/dev/patchwork/patch/16322/

Regards,

Bernard.

> > > -Original Message-
> > > From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> > > Sent: Tuesday, September 13, 2016 4:46 AM
> > > To: ZELEZNIAK, ALEX 
> > > Cc: Bernard Iremonger ;
> > > rahul.r.shah at intel.com; wenzhuo.lu at intel.com; dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [RFC PATCH v2 1/5] librte_ether: add
> > > internal callback functions
> > >
> > > On Fri, Sep 09, 2016 at 04:32:07PM +, ZELEZNIAK, ALEX wrote:
> > > > Use case could be to inform application managing SRIOV about VM's
> > > intention
> > > > to modify parameters like add VLAN which might not be the one
> > > > which is assigned to VF or inform about VF reset and reapply
> > > > settings like
> > > strip/insert
> > > > VLAN id based on policy.
> > >
> > > Is there any other way(more portable way) where we can realize the
> > > same use case?
> > >
> > > Something like,
> > >
> > > 1) The assigned VM operates/control the VF
> > > 2) A centralized entity post messages through UNIX socket or
> > > something(like vhost user communicates with VM).
> > > On message receive, VM can take necessary action on assigned VF.
> > >
> > > This will avoid the need of defining specifics of PF to VF mailbox
> > > communication in normative ethdev specification.
> > >
> > > And I guess it will work with almost all the PMD drivers as there is
> > > no PMD specific work here.
> > >
> > > Just a thought.
> > >
> > > >
> > > > > -Original Message-
> > > > > From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> > > > > Sent: Friday, September 09, 2016 10:11 AM
> > > > > To: Bernard Iremonger 
> > > > > Cc: rahul.r.shah at intel.com; wenzhuo.lu at intel.com; dev at 
> > > > > dpdk.org;
> > > > > ZELEZNIAK, ALEX 
> > > > > Subject: Re: [dpdk-dev] [RFC PATCH v2 1/5] librte_ether: add
> > > > > internal callback functions
> > > > >
> > > > > On Fri, Aug 26, 2016 at 10:10:16AM +0100, Bernard Iremonger wrote:
> > > > > > add _rte_eth_dev_callback_process_vf function.
> > > > > > add _rte_eth_dev_callback_process_generic function
> > > > > >
> > > > > > Adding a callback to the user application on VF to PF mailbox
> > > > > > message, allows passing information to the application
> > > > > > controlling the PF when a VF mailbox event message is received,
> such as VF reset.
> > > > > >
> > > > > > Signed-off-by: azelezniak 
> > > > > > Signed-off-by: Bernard Iremonger 
> > > > > > ---
> > > > > >  lib/librte_ether/rte_ethdev.c  | 17 ++
> > > > > >  lib/librte_ether/rte_ethdev.h  | 61
> > > > > ++
> > > > > >  lib/librte_ether/rte_ether_version.map |  7 
> > > > > >  3 files changed, 85 insertions(+)
> > > > > >
> > > > > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > b/lib/librte_ether/rte_ethdev.c
> > > > > > index f62a9ec..1388ea3 100644
> > > > > > --- a/lib/librte_ether/rte_ethdev.c
> > > > > > +++ b/lib/librte_ether/rte_ethdev.c
> > > > > > @@ -2690,6 +2690,20 @@ void
> > > > > >  _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
> > > > > > enum rte_eth_event_type event)  {
> > > > > > +   return _rte_eth_dev_callback_process_generic(dev, event,
> > > > > > +NULL); }
> > > > > > +
> > > > > > +void
> > > > > > +_rte_eth_dev_callback_process_vf(struct rte_eth_dev *dev,
> > > > > > +   enum rte_eth_event_type event, void *param) {
> > > > > > +   return _rte_eth_dev_callback_process_generic(dev, event,
> > > > > > +param); }
> > > > > > +
> > > > > > +void
> > > > > > +_rte_eth_dev_callback_process_generic(struct rte_eth_dev
> *dev,
> > > > > > +   enum rte_eth_event_type event, void *param) {
> > > > > > struct rte_eth_dev_c

[dpdk-dev] [PATCH 1/1 v2] eal: Fix misleading error messages, errno can't be trusted.

2016-10-03 Thread Jean Tourrilhes

On Mon, Oct 03, 2016 at 02:25:40PM +0100, Sergio Gonzalez Monroy wrote:
> Hi Jean,
> 
> There are some format issues with the patch:
> 
> You can run scripts/check-git-log.sh to check them:
> Wrong headline format:
> eal: Fix misleading error messages, errno can't be trusted.
> Wrong headline uppercase:
> eal: Fix misleading error messages, errno can't be trusted.
> Missing 'Fixes' tag:
> eal: Fix misleading error messages, errno can't be trusted.
> 
> The script's output highlights the different issues.

Sorry about that, I casually read the page on
http://dpdk.org/dev, but obviously I need to look at it again.

> On 21/09/2016 22:10, Jean Tourrilhes wrote:
> >@@ -263,9 +264,16 @@ rte_eal_config_reattach(void)
> > mem_config = (struct rte_mem_config *) mmap(rte_mem_cfg_addr,
> > sizeof(*mem_config), PROT_READ | PROT_WRITE, MAP_SHARED,
> > mem_cfg_fd, 0);
> >+if (mem_config == MAP_FAILED || mem_config != rte_mem_cfg_addr) {
> >+if (mem_config != MAP_FAILED)
> >+/* errno is stale, don't use */
> >+rte_panic("Cannot mmap memory for rte_config at [%p], 
> >got [%p] - please use '--base-virtaddr' option\n",
> >+  rte_mem_cfg_addr, mem_config);
> >+else
> >+rte_panic("Cannot mmap memory for rte_config! error %i 
> >(%s)\n",
> >+  errno, strerror(errno));
> >+}
> > close(mem_cfg_fd);
> >-if (mem_config == MAP_FAILED || mem_config != rte_mem_cfg_addr)
> >-rte_panic("Cannot mmap memory for rte_config\n");
> 
> NIT but any reason you moved the check before closing the file descriptor?
> (not that it matters with current code as we panic anyway)

"close()" may change "errno" according to its man page.

> Thanks,
> Sergio

Thanks for the review !

Jean
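
The ordering concern above (report the error before close() can touch errno)
can also be handled by taking a snapshot of errno. A minimal sketch, assuming
a hypothetical helper map_then_close() that is not part of the patch:

```c
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map len bytes of fd read-only, then close fd.
 * Returns the mapping (or MAP_FAILED) and stores the errno of mmap()
 * in *err before close() gets a chance to overwrite it.
 */
static void *
map_then_close(int fd, size_t len, int *err)
{
	void *p;

	assert(err != NULL);
	p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	*err = (p == MAP_FAILED) ? errno : 0; /* snapshot first */
	close(fd);                            /* may modify errno */
	return p;
}
```

The caller then formats its message from *err rather than from errno, so the
position of close() relative to the check no longer matters.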

[dpdk-dev] Last few spaces & Agenda released for DPDK Userspace Summit 2016

2016-10-03 Thread Butler, Siobhan A

Hi all,
There are just a few spaces left at this year's DPDK Userspace Summit. If you
want to attend, please register at www.dpdksummit.com,
where you can review the agenda for the two days of the event. Thanks to those
who have already registered; if you can't make it after all, please let us know!

Look forward to seeing you in Dublin!
Siobhán

[dpdk-dev] [PATCH] l2fwd:mac learning

2016-10-03 Thread Thomas Monjalon

Hi,

2016-10-03 11:47, Rafat Jahan:
> Added MAC learning to reduce load at l2

I'm sorry but I don't think it is worth adding a new example.
The examples are here to demonstrate the usage of the DPDK libraries.
And it would be valuable for project maintainability to reduce the
number of examples, not to add new ones.
Hope you'll understand.

[dpdk-dev] [PATCH v4 4/5] app/test: added big data GMAC test for libcrypto

2016-10-03 Thread Mrozowicz, SlawomirX



>-Original Message-
>From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
>Sent: Friday, September 30, 2016 7:19 PM
>To: Mrozowicz, SlawomirX ; Azarewicz,
>PiotrX T 
>Cc: dev at dpdk.org; Dai, Wei 
>Subject: Re: [dpdk-dev] [PATCH v4 4/5] app/test: added big data GMAC test
>for libcrypto
>
>2016-09-30 18:32, Slawomir Mrozowicz:
>> This patch adds big data AES-GMAC test for libcrypto PMD.
>>
>> Signed-off-by: Piotr Azarewicz 
>> ---
>>  app/test/test_cryptodev.c  |   18 +-
>>  app/test/test_cryptodev_gcm_test_vectors.h | 8245
>+++-
>>  2 files changed, 8242 insertions(+), 21 deletions(-)
>
>The test data are really too big.
>Is it possible to generate them as Wei Dai did for LPM?
>   http://dpdk.org/patch/16175
>   http://dpdk.org/patch/16253

Yes it is possible.
We will prepare a new patch set with the changes you proposed.
Sławek

[dpdk-dev] [PATCH v11 00/24] Introducing rte_driver/rte_device generalization

2016-10-03 Thread Thomas Monjalon

Applied, thanks everybody for the great (re)work!

2016-09-20 18:11, Shreyansh Jain:
> Future Work/Pending:
> ===
>  - Presently eth_driver, rte_eth_dev are not aligned to the rte_driver/
>rte_device model. eth_driver still is a PCI specific entity. This
>has been highlighted by comments from Ferruh in [9].
>  - Some variables, like drv_name (as highlighted by Ferruh), are getting
>duplicated across rte_xxx_driver/device and rte_driver/device.

What about this pending work?

I would add more remaining issues:
- probe/remove naming could be applied to vdev for consistency
- rte_eal_device_insert must be called in vdev
- REGISTER macros should be prefixed with RTE_
- Some functions in EAL do not need eal_ in their prefix:
rte_eal_pci_   -> rte_pci_
rte_eal_dev_   -> rte_dev_
rte_eal_vdev_  -> rte_vdev_
rte_eal_driver -> rte_drv_
rte_eal_vdrv   -> rte_vdrv_
