date:20150625

[dpdk-dev] [PATCH v2 11/11] doc: update hash documentation

2015-06-25 Thread Pablo de Lara

Updates hash library documentation, reflecting
the new implementation changes.

Signed-off-by: Pablo de Lara 
---
 doc/guides/prog_guide/hash_lib.rst | 77 +++---
 1 file changed, 64 insertions(+), 13 deletions(-)

diff --git a/doc/guides/prog_guide/hash_lib.rst b/doc/guides/prog_guide/hash_lib.rst
index 9b83835..be64a74 100644
--- a/doc/guides/prog_guide/hash_lib.rst
+++ b/doc/guides/prog_guide/hash_lib.rst
@@ -1,5 +1,5 @@
 ..  BSD LICENSE
-Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
 All rights reserved.

 Redistribution and use in source and binary forms, with or without
@@ -50,8 +50,6 @@ The hash also allows the configuration of some low-level 
implementation related

 *   Hash function to translate the key into a bucket index

-*   Number of entries per bucket
-
 The main methods exported by the hash are:

 *   Add entry with key: The key is provided as input. If a new entry is 
successfully added to the hash for the specified key,
@@ -65,10 +63,27 @@ The main methods exported by the hash are:
 *   Lookup for entry with key: The key is provided as input. If an entry with 
the specified key is found in the hash (lookup hit),
 then the position of the entry is returned, otherwise (lookup miss) a 
negative value is returned.

-The current hash implementation handles the key management only.
-The actual data associated with each key has to be managed by the user using a 
separate table that
+Apart from the methods explained above, the API allows the user three more 
options:
+
+*   Add / lookup / delete with key and precomputed hash: Both the key and its 
precomputed hash are provided as input. This allows
+the user to perform these operations faster, as the hash is already computed.
+
+*   Add / lookup / delete with key and data: A key-value pair is provided 
as input. This allows the user to store
+not only the key, but also either an integer or a pointer in the table 
itself, which should perform better
+than storing the data in an external table.
+
+*   Combination of the two options above: The user can provide the key, precomputed 
hash and data.
+
+Also, the API contains a method to allow the user to look up entries in 
bursts, achieving higher performance
+than looking up individual entries, as the function prefetches the next entries while 
it is operating
+on the first ones, which significantly reduces the impact of the necessary 
memory accesses.
+Note that this method uses a pipeline of 8 entries (4 stages of 2 entries), 
so it is highly recommended
+to use at least 8 entries per burst.
+
+The actual data associated with each key can be either managed by the user 
using a separate table that
 mirrors the hash in terms of number of entries and position of each entry,
-as shown in the Flow Classification use case describes in the following 
sections.
+as shown in the Flow Classification use case described in the following 
sections,
+or stored in the hash table itself.

 The example hash tables in the L2/L3 Forwarding sample applications defines 
which port to forward a packet to based on a packet flow identified by the 
five-tuple lookup.
 However, this table could also be used for more sophisticated features and 
provide many other functions and actions that could be performed on the packets 
and flows.
@@ -76,17 +91,26 @@ However, this table could also be used for more 
sophisticated features and provi
 Implementation Details
 ----------------------

-The hash table is implemented as an array of entries which is further divided 
into buckets,
-with the same number of consecutive array entries in each bucket.
-For any input key, there is always a single bucket where that key can be 
stored in the hash,
-therefore only the entries within that bucket need to be examined when the key 
is looked up.
+The hash table has two main tables:
+
+* First table is an array of entries which is further divided into buckets,
+  with the same number of consecutive array entries in each bucket. Each entry 
contains the computed primary
+  and secondary hashes of a given key (explained below), and an index to the 
second table.
+
+* The second table is an array of all the keys stored in the hash table and 
the data associated with each key.
+
+The hash library uses the cuckoo hash method to resolve collisions.
+For any input key, there are two possible buckets (primary and 
secondary/alternative location)
+where that key can be stored in the hash, therefore only the entries within 
those buckets need to be examined
+when the key is looked up.
 The lookup speed is achieved by reducing the number of entries to be scanned 
from the total
-number of hash entries down to the number of entries in a hash bucket,
+number of hash entries down to the number of entries in the two hash buckets,
 as opposed to the basic method of linearly scanning all the entries in the 
array.
 The has

[dpdk-dev] [PATCH v2 10/11] doc: announce ABI change of librte_hash

2015-06-25 Thread Pablo de Lara

rte_hash structure is now private for version 2.1, and two
of the macros in rte_hash.h are now deprecated, so this patch
adds notice of these changes.

Signed-off-by: Pablo de Lara 
---
 doc/guides/rel_notes/abi.rst | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index f00a6ee..fae09fd 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -38,3 +38,5 @@ Examples of Deprecation Notices

 Deprecation Notices
 -------------------
+* Structure rte_hash in librte_hash library has been changed and has been made 
private in release 2.1, as applications should never have accessed its 
internal data (library should have been marked as internal).
+* The macros #RTE_HASH_BUCKET_ENTRIES_MAX and #RTE_HASH_KEY_LENGTH_MAX are 
deprecated and will be removed with version 2.2.
-- 
2.4.2

[dpdk-dev] [PATCH v2 09/11] MAINTAINERS: claim responsibility for hash library

2015-06-25 Thread Pablo de Lara

Signed-off-by: Pablo de Lara 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 54f0973..a536992 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -339,6 +339,7 @@ F: doc/guides/sample_app_ug/l3_forward_access_ctrl.rst

 Hashes
 M: Bruce Richardson 
+M: Pablo de Lara 
 F: lib/librte_hash/
 F: doc/guides/prog_guide/hash_lib.rst
 F: app/test/test_hash*
-- 
2.4.2

[dpdk-dev] [PATCH v2 08/11] hash: add new functionality to store data in hash table

2015-06-25 Thread Pablo de Lara

Hash tables usually store not only keys, but also the data associated
with them. To maintain the existing API,
the key index is still returned when
adding/looking up/deleting an entry, but the user is now able
to store/look up data associated with a key.
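
As a rough illustration of the idea, the sketch below stores an integer
(or pointer) value next to each key in a single slot array, loosely following
the rte_hash_key layout added by this patch. The names key_slot, key_store,
key_store_init, slot_store and slot_load are hypothetical and are not part of
librte_hash; rounding the slot size up to KEY_ALIGNMENT is an assumption made
here so that every slot stays aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define KEY_ALIGNMENT 16

/* Toy mirror of a key-store slot: user data first, then the key bytes. */
struct key_slot {
    union {
        uintptr_t idata;
        void *pdata;
    };
    char key[] __attribute__((aligned(KEY_ALIGNMENT)));
};

/* Hypothetical container for the slot array; not a DPDK type. */
struct key_store {
    unsigned char *mem;
    size_t slot_size;
    size_t key_len;
    uint32_t entries;
};

static int
key_store_init(struct key_store *ks, uint32_t entries, size_t key_len)
{
    size_t raw = sizeof(struct key_slot) + key_len;
    size_t total;

    /* Assumption: round each slot up so every slot stays aligned. */
    ks->slot_size = (raw + KEY_ALIGNMENT - 1) & ~(size_t)(KEY_ALIGNMENT - 1);
    ks->key_len = key_len;
    ks->entries = entries;
    /* One extra slot: index 0 is kept as a dummy entry. */
    total = ks->slot_size * ((size_t)entries + 1);
    ks->mem = aligned_alloc(KEY_ALIGNMENT, total);
    if (ks->mem == NULL)
        return -1;
    memset(ks->mem, 0, total);
    return 0;
}

/* Translate a key index into a pointer to its slot. */
static struct key_slot *
slot_at(const struct key_store *ks, uint32_t idx)
{
    assert(idx <= ks->entries);
    return (struct key_slot *)(ks->mem + (size_t)idx * ks->slot_size);
}

/* Copy the key and its associated value into slot idx. */
static void
slot_store(struct key_store *ks, uint32_t idx, const void *key, uintptr_t data)
{
    struct key_slot *s = slot_at(ks, idx);

    memcpy(s->key, key, ks->key_len);
    s->idata = data;
}

/* Return 0 and the stored value if slot idx holds this key, else -1. */
static int
slot_load(const struct key_store *ks, uint32_t idx, const void *key,
        uintptr_t *data)
{
    const struct key_slot *s = slot_at(ks, idx);

    if (memcmp(s->key, key, ks->key_len) != 0)
        return -1;
    *data = s->idata;
    return 0;
}
```

Keeping the value in the same slot as the key means a lookup hit touches one
memory region instead of two, which is the motivation given above for storing
data in the table rather than in a mirrored external array.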

Signed-off-by: Pablo de Lara 
---
 lib/librte_hash/rte_cuckoo_hash.c| 231 +--
 lib/librte_hash/rte_hash.h   | 144 +-
 lib/librte_hash/rte_hash_version.map |   6 +
 3 files changed, 314 insertions(+), 67 deletions(-)

diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 8e5b9a6..37e72ab 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -84,6 +84,8 @@ EAL_REGISTER_TAILQ(rte_hash_tailq)
 #endif

 #define NULL_SIGNATURE  0
+/* Stored key size is a multiple of this value */
+#define KEY_ALIGNMENT   16

 typedef int (*rte_hash_cmp_eq_t)(const void *key1, const void *key2, size_t 
key_len);
 static int rte_hash_k16_cmp_eq(const void *key1, const void *key2, size_t 
key_len);
@@ -133,6 +135,15 @@ struct rte_hash_bucket {
uint32_t key_idx[RTE_HASH_BUCKET_ENTRIES + 1];
 } __rte_cache_aligned;

+struct rte_hash_key {
+   union {
+   uintptr_t idata;
+   void *pdata;
+   };
+   /* Variable key size */
+   char key[] __attribute__((aligned(KEY_ALIGNMENT)));
+};
+
 struct rte_hash *
 rte_hash_find_existing(const char *name)
 {
@@ -200,7 +211,7 @@ rte_hash_create(const struct rte_hash_parameters *params)
/* Total memory required for hash context */
const uint32_t mem_size = sizeof(struct rte_hash) +
num_buckets * sizeof(struct rte_hash_bucket);
-   const uint8_t key_entry_size = params->key_len;
+   const uint8_t key_entry_size = sizeof(struct rte_hash_key) + 
params->key_len;
/* Store all keys and leave the first entry as a dummy entry for 
lookup_bulk */
const uint64_t key_tbl_size = key_entry_size * (params->entries + 1);

@@ -396,7 +407,7 @@ run_cuckoo(const struct rte_hash *h, struct rte_hash_bucket 
*bkt,
const void *original_key)
 {
static unsigned number_pushes;
-   void *k, *keys = h->key_store;
+   struct rte_hash_key *k, *keys = h->key_store;
unsigned i, j;

hash_sig_t current_hash_stored, alt_hash_stored;
@@ -421,8 +432,8 @@ run_cuckoo(const struct rte_hash *h, struct rte_hash_bucket 
*bkt,
 * we just entered in a loop and key cannot be added
 */
if (++number_pushes > 1 && current_hash == original_hash) {
-   k = (char *)keys + key_idx * h->key_entry_size;
-   if (!h->rte_hash_cmp_eq(k, original_key, h->key_len)) {
+   k = (struct rte_hash_key *) ((char *)keys + key_idx * 
h->key_entry_size);
+   if (!h->rte_hash_cmp_eq(k->key, original_key, h->key_len)) {
rte_ring_sp_enqueue(h->free_slots,
(void *)((uintptr_t)key_idx));
number_pushes = 0;
@@ -436,6 +447,9 @@ run_cuckoo(const struct rte_hash *h, struct rte_hash_bucket 
*bkt,
 */
for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
key_idx_stored = bkt->key_idx[i];
+   k = (struct rte_hash_key *) ((char *)keys +
+   key_idx_stored * 
h->key_entry_size);
+
current_hash_stored = bkt->signatures[i].current;
alt_hash_stored = bkt->signatures[i].alt;

@@ -479,20 +493,21 @@ run_cuckoo(const struct rte_hash *h, struct 
rte_hash_bucket *bkt,

 static inline int32_t
 __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
-   hash_sig_t sig)
+   hash_sig_t sig, uintptr_t data)
 {
hash_sig_t hash0, hash1;
uint32_t bucket_idx0, bucket_idx1;
unsigned i;
struct rte_hash_bucket *bkt0, *bkt1;
-   void *new_k, *k, *keys = h->key_store;
+   struct rte_hash_key *new_k, *k, *keys = h->key_store;
void *slot_id;
int ret;

/* Get a new slot for storing the new key */
if (rte_ring_sc_dequeue(h->free_slots, &slot_id) != 0)
return -ENOSPC;
-   new_k = (char *)keys + (uintptr_t)slot_id * h->key_entry_size;
+   new_k = (struct rte_hash_key *) ((char *)keys +
+   (uintptr_t)slot_id * h->key_entry_size);
rte_prefetch0(new_k);

hash0 = sig;
@@ -508,9 +523,12 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, 
const void *key,
for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
if (bkt0->signatures[i].current == hash0 &&
bkt0->signatures[i].alt == hash1)  {
-   k = (char *)keys + bkt0->key_idx[i] * h->key_entry_size;
-

[dpdk-dev] [PATCH v2 07/11] hash: add new function rte_hash_reset

2015-06-25 Thread Pablo de Lara

Added reset function to be able to empty the table,
without having to destroy and create it again.
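
The reset idea can be sketched without DPDK as follows: clear the key storage
and refill a free-slot queue with indices 1..entries, keeping index 0 reserved
for misses. The toy_table type and the toy_* helpers are hypothetical stand-ins
for the real table and rte_ring, and this sketch clears all entries + 1 slots.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature table; a plain FIFO stands in for rte_ring. */
struct toy_table {
    uint32_t entries;
    uint32_t *free_slots;   /* FIFO of free key indices */
    uint32_t free_head;
    uint32_t free_count;
    uint8_t *key_store;     /* entries + 1 slots, slot 0 is reserved */
    size_t slot_size;
};

/* Empty the table: zero the keys and make every slot index free again. */
static void
toy_reset(struct toy_table *t)
{
    uint32_t i;

    memset(t->key_store, 0, t->slot_size * ((size_t)t->entries + 1));
    t->free_head = 0;
    t->free_count = 0;
    /* Index 0 is reserved for lookup misses, so start at 1. */
    for (i = 1; i <= t->entries; i++) {
        t->free_slots[(t->free_head + t->free_count) % t->entries] = i;
        t->free_count++;
    }
}

static int
toy_create(struct toy_table *t, uint32_t entries, size_t slot_size)
{
    if (entries == 0)
        return -1;
    t->entries = entries;
    t->slot_size = slot_size;
    t->free_slots = malloc(entries * sizeof(*t->free_slots));
    t->key_store = malloc(slot_size * ((size_t)entries + 1));
    if (t->free_slots == NULL || t->key_store == NULL) {
        free(t->free_slots);
        free(t->key_store);
        return -1;
    }
    toy_reset(t);
    return 0;
}

/* Take the next free key index, or 0 when the table is full. */
static uint32_t
toy_alloc_slot(struct toy_table *t)
{
    uint32_t idx;

    if (t->free_count == 0)
        return 0;
    idx = t->free_slots[t->free_head];
    t->free_head = (t->free_head + 1) % t->entries;
    t->free_count--;
    return idx;
}

static void
toy_destroy(struct toy_table *t)
{
    free(t->free_slots);
    free(t->key_store);
}
```

Reusing the existing allocations avoids the allocation and name-lookup cost
of a free/create cycle, which is what the test changes in this patch rely on.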

Signed-off-by: Pablo de Lara 
---
 app/test/test_hash.c |  3 +--
 app/test/test_hash_perf.c| 11 +++
 lib/librte_hash/rte_cuckoo_hash.c| 21 +
 lib/librte_hash/rte_hash.h   | 11 +--
 lib/librte_hash/rte_hash_version.map |  1 +
 5 files changed, 35 insertions(+), 12 deletions(-)

diff --git a/app/test/test_hash.c b/app/test/test_hash.c
index 52de1bd..b1ca939 100644
--- a/app/test/test_hash.c
+++ b/app/test/test_hash.c
@@ -1156,8 +1156,7 @@ static int test_average_table_utilization(void)
no_space = 0;

/* Reset the table */
-   rte_hash_free(handle);
-   rte_hash_create(&ut_params);
+   rte_hash_reset(handle);
}

const unsigned average_keys_added = added_keys_until_no_space / 
ITERATIONS;
diff --git a/app/test/test_hash_perf.c b/app/test/test_hash_perf.c
index 469000e..db0c5e8 100644
--- a/app/test/test_hash_perf.c
+++ b/app/test/test_hash_perf.c
@@ -352,14 +352,10 @@ free_table(unsigned table_index)
rte_hash_free(h[table_index]);
 }

-static int
+static void
 reset_table(unsigned table_index)
 {
-   free_table(table_index);
-   if (create_table(table_index) != 0)
-   return -1;
-
-   return 0;
+   rte_hash_reset(h[table_index]);
 }

 static int
@@ -396,8 +392,7 @@ run_all_tbl_perf_tests(void)
if (timed_deletes(1, i) < 0)
return -1;

-   if (reset_table(i) < 0)
-   return -1;
+   reset_table(i);

printf("\n - WITH JUST KEYS -\n\n");

diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index f1b6df0..8e5b9a6 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -356,6 +356,27 @@ rte_hash_secondary_hash(const hash_sig_t primary_hash)
return (primary_hash ^ ((tag + 1) * alt_bits_xor));
 }

+void
+rte_hash_reset(struct rte_hash *h)
+{
+   void *ptr;
+   unsigned i;
+
+   if (h == NULL)
+   return;
+
+   memset(h->buckets, 0, h->num_buckets * sizeof(struct rte_hash_bucket));
+   memset(h->key_store, 0, h->key_entry_size * h->entries);
+
+   /* clear the free ring */
+   while (rte_ring_dequeue(h->free_slots, &ptr) == 0)
+   rte_pause();
+
+   /* Repopulate the free slots ring. Entry zero is reserved for key 
misses */
+   for (i = 1; i < h->entries + 1; i++)
+   rte_ring_sp_enqueue(h->free_slots, (void *)((uintptr_t) i));
+}
+
 /*
  * Try to insert a new entry. If bucket has space, hash value and key index
  * to the key table are copied.
diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index 7f7e75f..fa327c2 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -132,8 +132,15 @@ void
 rte_hash_free(struct rte_hash *h);

 /**
- * Add a key to an existing hash table.
- * This operation is not multi-thread safe
+ * Reset all hash structure, by zeroing all entries
+ * @param h
+ *   Hash table to reset
+ */
+void
+rte_hash_reset(struct rte_hash *h);
+
+/**
+ * Add a key to an existing hash table. This operation is not multi-thread safe
  * and should only be called from one thread.
  *
  * @param h
diff --git a/lib/librte_hash/rte_hash_version.map b/lib/librte_hash/rte_hash_version.map
index fd92def..f011054 100644
--- a/lib/librte_hash/rte_hash_version.map
+++ b/lib/librte_hash/rte_hash_version.map
@@ -22,6 +22,7 @@ DPDK_2.1 {
global:

rte_hash_lookup_bulk_with_hash;
+   rte_hash_reset;

local: *;
 } DPDK_2.0;
-- 
2.4.2

[dpdk-dev] [PATCH v2 06/11] hash: add new lookup_bulk_with_hash function

2015-06-25 Thread Pablo de Lara

The previous implementation lacked a function
to look up a burst of entries given precalculated hash values.
This patch implements such a function, which is useful for
looking up keys from packets that have precalculated hash values
from a 5-tuple key.

Added the function in the hash unit test as well.
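
A minimal sketch of how a burst can be walked with a 64-bit lookup mask,
taking the caller's precalculated hashes instead of computing them. The helper
names make_lookup_mask, pop_next_index and collect_hashes are hypothetical,
and LOOKUP_BULK_MAX is assumed to be 64. The sketch uses the GCC/Clang builtin
__builtin_ctzll and tests the mask for zero before calling it, since the
builtin is undefined for a zero argument.

```c
#include <assert.h>
#include <stdint.h>

#define LOOKUP_BULK_MAX 64

/* One bit per key in the burst; a shift by 64 would be undefined. */
static uint64_t
make_lookup_mask(uint32_t num_keys)
{
    if (num_keys >= LOOKUP_BULK_MAX)
        return UINT64_MAX;
    return (UINT64_C(1) << num_keys) - 1;
}

/* Return the lowest pending index and clear its bit; 0 if none is left. */
static unsigned
pop_next_index(uint64_t *mask)
{
    unsigned idx;

    if (*mask == 0)
        return 0;
    idx = (unsigned)__builtin_ctzll(*mask);
    *mask &= ~(UINT64_C(1) << idx);
    return idx;
}

/* Stage-0 style loop: fetch each key's precalculated primary hash. */
static unsigned
collect_hashes(const uint32_t *hash_vals, uint32_t num_keys, uint32_t *out)
{
    uint64_t mask = make_lookup_mask(num_keys);
    unsigned n = 0;

    while (mask != 0) {
        unsigned idx = pop_next_index(&mask);

        out[n++] = hash_vals[idx];
    }
    return n;
}
```

Because the hashes come from the caller, this stage only reads an array and
never touches the keys, which is why the function suits packets that already
carry a hash value.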

Signed-off-by: Pablo de Lara 
---
 app/test/test_hash.c |  19 ++-
 lib/librte_hash/rte_cuckoo_hash.c| 222 ++-
 lib/librte_hash/rte_hash.h   |  27 +
 lib/librte_hash/rte_hash_version.map |   8 ++
 4 files changed, 272 insertions(+), 4 deletions(-)

diff --git a/app/test/test_hash.c b/app/test/test_hash.c
index 0176219..52de1bd 100644
--- a/app/test/test_hash.c
+++ b/app/test/test_hash.c
@@ -456,6 +456,7 @@ static int test_five_keys(void)
 {
struct rte_hash *handle;
const void *key_array[5] = {0};
+   hash_sig_t hashes[5];
int pos[5];
int expected_pos[5];
unsigned i;
@@ -475,12 +476,24 @@ static int test_five_keys(void)
}

/* Lookup */
-   for(i = 0; i < 5; i++)
+   for (i = 0; i < 5; i++) {
key_array[i] = &keys[i];
+   hashes[i] = rte_hash_hash(handle, &keys[i]);
+   }

ret = rte_hash_lookup_multi(handle, &key_array[0], 5, (int32_t *)pos);
-   if(ret == 0)
-   for(i = 0; i < 5; i++) {
+   if (ret == 0)
+   for (i = 0; i < 5; i++) {
+   print_key_info("Lkp", key_array[i], pos[i]);
+   RETURN_IF_ERROR(pos[i] != expected_pos[i],
+   "failed to find key (pos[%u]=%d)", i, 
pos[i]);
+   }
+
+   /* Lookup with precalculated hashes */
+   ret = rte_hash_lookup_multi_with_hash(handle, &key_array[0], hashes,
+   5, (int32_t *)pos);
+   if (ret == 0)
+   for (i = 0; i < 5; i++) {
print_key_info("Lkp", key_array[i], pos[i]);
RETURN_IF_ERROR(pos[i] != expected_pos[i],
"failed to find key (pos[%u]=%d)", i, 
pos[i]);
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index e19b179..f1b6df0 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -709,6 +709,21 @@ lookup_stage0(unsigned *idx, uint64_t *lookup_mask,
*lookup_mask &= ~(1llu << *idx);
 }

+/* Lookup bulk stage 0: Get primary hash value and calculate secondary hash 
value */
+static inline void
+lookup_stage0_with_hash(unsigned *idx, uint64_t *lookup_mask,
+   hash_sig_t *primary_hash, hash_sig_t *secondary_hash,
+   const hash_sig_t *hash_vals)
+{
+   *idx = __builtin_ctzl(*lookup_mask);
+   if (*lookup_mask == 0)
+   *idx = 0;
+
+   *primary_hash = hash_vals[*idx];
+   *secondary_hash = rte_hash_secondary_hash(*primary_hash);
+
+   *lookup_mask &= ~(1llu << *idx);
+}

 /* Lookup bulk stage 1: Prefetch primary/secondary buckets */
 static inline void
@@ -725,7 +740,7 @@ lookup_stage1(hash_sig_t primary_hash, hash_sig_t 
secondary_hash,
 }

 /*
- * Lookup bulk stage 2:  Search for match hashes in primary/secondary locations
+ * Lookup bulk stage 2: Search for match hashes in primary/secondary locations
  * and prefetch first key slot
  */
 static inline void
@@ -971,6 +986,198 @@ __rte_hash_lookup_bulk(const struct rte_hash *h, const 
void **keys,
return 0;
 }

+static inline int
+__rte_hash_lookup_bulk_with_hash(const struct rte_hash *h, const void **keys,
+   const hash_sig_t *hash_vals, uint32_t num_keys,
+   int32_t *positions)
+{
+   uint64_t hits = 0;
+   uint64_t next_mask = 0;
+   uint64_t extra_hits_mask = 0;
+   uint64_t lookup_mask;
+   unsigned idx;
+   const void *key_store = h->key_store;
+
+   unsigned idx00, idx01, idx10, idx11, idx20, idx21, idx30, idx31;
+   const struct rte_hash_bucket *primary_bkt10, *primary_bkt11;
+   const struct rte_hash_bucket *secondary_bkt10, *secondary_bkt11;
+   const struct rte_hash_bucket *primary_bkt20, *primary_bkt21;
+   const struct rte_hash_bucket *secondary_bkt20, *secondary_bkt21;
+   const void *k_slot20, *k_slot21, *k_slot30, *k_slot31;
+   hash_sig_t primary_hash00, primary_hash01;
+   hash_sig_t secondary_hash00, secondary_hash01;
+   hash_sig_t primary_hash10, primary_hash11;
+   hash_sig_t secondary_hash10, secondary_hash11;
+   hash_sig_t primary_hash20, primary_hash21;
+   hash_sig_t secondary_hash20, secondary_hash21;
+
+   if (num_keys == RTE_HASH_LOOKUP_BULK_MAX)
+   lookup_mask = 0xffffffffffffffff;
+   else
+   lookup_mask = (1ULL << num_keys) - 1;
+
+   lookup_stage0_with_hash(&idx00, &lookup_mask, &primary_hash00,
+   &secondary_hash00, hash_vals);
+   lookup_s

[dpdk-dev] [PATCH v2 05/11] hash: replace existing hash library with cuckoo hash implementation

2015-06-25 Thread Pablo de Lara

This patch replaces the existing hash library with another approach,
using the Cuckoo Hash method to resolve collisions (open addressing).
When a new entry is added to a full bucket, an existing item is pushed
out of that bucket and stored in an alternative location,
which is selected with a secondary hash function.

This gives the user the ability to store more entries when a bucket
is full, in comparison with the previous implementation.
Therefore, the unit test has been updated, as some scenarios have changed
(for example, the earlier full-bucket restriction no longer applies).

Also note that the API has not been changed, although new fields
have been added to the rte_hash structure (the structure is internal now).
The main change when creating a new table is that the number of entries
per bucket is now fixed, so the corresponding parameter is ignored
(it is still there to keep the same parameters structure).
The hash unit test has been updated to reflect these changes.

As a last note, the maximum burst size in the lookup_burst function
has been increased to 64, to improve performance.
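
The collision handling described above can be illustrated with a toy table.
This is a simplified sketch, not the code in rte_cuckoo_hash.c: the hash
functions bucket1 and bucket2, the table sizes and all function names are
hypothetical, and the sketch performs at most one push per insertion, whereas
a full implementation may chain several pushes.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_BUCKETS 8
#define BUCKET_ENTRIES 4

struct toy_entry {
    uint32_t key;
    uint8_t used;
};

static struct toy_entry table[NUM_BUCKETS][BUCKET_ENTRIES];

/* Deliberately simple toy hashes so the behaviour is easy to follow. */
static unsigned bucket1(uint32_t key) { return key % NUM_BUCKETS; }
static unsigned bucket2(uint32_t key) { return (key / NUM_BUCKETS) % NUM_BUCKETS; }

/* The other candidate bucket of a key that currently lives in bucket b. */
static unsigned
alt_bucket(unsigned b, uint32_t key)
{
    return (b == bucket1(key)) ? bucket2(key) : bucket1(key);
}

/* Store key in the first free slot of bucket b; 1 on success, 0 if full. */
static int
place(unsigned b, uint32_t key)
{
    unsigned i;

    for (i = 0; i < BUCKET_ENTRIES; i++) {
        if (!table[b][i].used) {
            table[b][i].key = key;
            table[b][i].used = 1;
            return 1;
        }
    }
    return 0;
}

/* Only the two candidate buckets are scanned; returns bucket index or -1. */
static int
cuckoo_lookup(uint32_t key)
{
    unsigned cand[2] = { bucket1(key), bucket2(key) };
    unsigned c, i;

    for (c = 0; c < 2; c++)
        for (i = 0; i < BUCKET_ENTRIES; i++)
            if (table[cand[c]][i].used && table[cand[c]][i].key == key)
                return (int)cand[c];
    return -1;
}

/* Returns 0 on success, -1 when no space could be made. */
static int
cuckoo_insert(uint32_t key)
{
    unsigned b1 = bucket1(key), b2 = bucket2(key);
    unsigned i;

    if (cuckoo_lookup(key) >= 0)
        return 0; /* already present */
    if (place(b1, key) || place(b2, key))
        return 0;
    /* Both buckets are full: push one resident to its alternative bucket. */
    for (i = 0; i < BUCKET_ENTRIES; i++) {
        uint32_t resident = table[b1][i].key;
        unsigned alt = alt_bucket(b1, resident);

        if (alt != b1 && place(alt, resident)) {
            table[b1][i].key = key; /* new key takes the vacated slot */
            return 0;
        }
    }
    return -1;
}
```

With these toy hashes, keys 0, 8, 16 and 24 fill bucket 0. Inserting key 64,
whose two candidate buckets are both bucket 0, succeeds by pushing key 8 to
bucket 1, similar in spirit to the updated test_full_bucket scenario.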

Signed-off-by: Pablo de Lara 
---
 app/test/test_hash.c  |  106 +---
 app/test/test_hash_perf.c |2 +-
 lib/librte_hash/Makefile  |8 +-
 lib/librte_hash/rte_cuckoo_hash.c | 1054 +
 lib/librte_hash/rte_hash.c|  499 --
 lib/librte_hash/rte_hash.h|   67 ++-
 6 files changed, 1127 insertions(+), 609 deletions(-)
 create mode 100644 lib/librte_hash/rte_cuckoo_hash.c
 delete mode 100644 lib/librte_hash/rte_hash.c

diff --git a/app/test/test_hash.c b/app/test/test_hash.c
index 46174db..0176219 100644
--- a/app/test/test_hash.c
+++ b/app/test/test_hash.c
@@ -169,7 +169,6 @@ static struct flow_key keys[5] = { {
 /* Parameters used for hash table in unit test functions. Name set later. */
 static struct rte_hash_parameters ut_params = {
.entries = 64,
-   .bucket_entries = 4,
.key_len = sizeof(struct flow_key), /* 13 */
.hash_func = rte_jhash,
.hash_func_init_val = 0,
@@ -516,9 +515,18 @@ static int test_five_keys(void)
pos[i] = rte_hash_lookup(handle, &keys[i]);
print_key_info("Lkp", &keys[i], pos[i]);
RETURN_IF_ERROR(pos[i] != -ENOENT,
-   "failed to find key (pos[%u]=%d)", i, pos[i]);
+   "found non-existent key (pos[%u]=%d)", i, 
pos[i]);
}

+   /* Lookup multi */
+   ret = rte_hash_lookup_multi(handle, &key_array[0], 5, (int32_t *)pos);
+   if (ret == 0)
+   for (i = 0; i < 5; i++) {
+   print_key_info("Lkp", key_array[i], pos[i]);
+   RETURN_IF_ERROR(pos[i] != -ENOENT,
+   "found not-existent key (pos[%u]=%d)", 
i, pos[i]);
+   }
+
rte_hash_free(handle);

return 0;
@@ -527,21 +535,18 @@ static int test_five_keys(void)
 /*
  * Add keys to the same bucket until bucket full.
  * - add 5 keys to the same bucket (hash created with 4 keys per bucket):
- *   first 4 successful, 5th unsuccessful
- * - lookup the 5 keys: 4 hits, 1 miss
- * - add the 5 keys again: 4 OK, one error as bucket is full
- * - lookup the 5 keys: 4 hits (updated data), 1 miss
- * - delete the 5 keys: 5 OK (even if the 5th is not in the table)
+ *   first 4 successful, 5th successful, pushing existing item in bucket
+ * - lookup the 5 keys: 5 hits
+ * - add the 5 keys again: 5 OK
+ * - lookup the 5 keys: 5 hits (updated data)
+ * - delete the 5 keys: 5 OK
  * - lookup the 5 keys: 5 misses
- * - add the 5th key: OK
- * - lookup the 5th key: hit
  */
 static int test_full_bucket(void)
 {
struct rte_hash_parameters params_pseudo_hash = {
.name = "test4",
.entries = 64,
-   .bucket_entries = 4,
.key_len = sizeof(struct flow_key), /* 13 */
.hash_func = pseudo_hash,
.hash_func_init_val = 0,
@@ -555,7 +560,7 @@ static int test_full_bucket(void)
handle = rte_hash_create(&params_pseudo_hash);
RETURN_IF_ERROR(handle == NULL, "hash creation failed");

-   /* Fill bucket*/
+   /* Fill bucket */
for (i = 0; i < 4; i++) {
pos[i] = rte_hash_add_key(handle, &keys[i]);
print_key_info("Add", &keys[i], pos[i]);
@@ -563,47 +568,39 @@ static int test_full_bucket(void)
"failed to add key (pos[%u]=%d)", i, pos[i]);
expected_pos[i] = pos[i];
}
-   /* This shouldn't work because the bucket is full */
+   /*
+* This should work and will push one of the items
+* in the bucket because it is full
+*/
pos[4] = rte_hash_add_key(handle, &keys[4]);
print_key_info("Add", &keys[4], pos[4]);
-   RETURN_IF_ERROR(pos[4] != -ENOSPC,
-   "fail: added k

[dpdk-dev] [PATCH v2 04/11] test/hash: rename new hash perf unit test back to original name

2015-06-25 Thread Pablo de Lara

To make the diff easier to read, the new performance unit test was
named differently from the old unit test. This patch renames the
new unit test back to the old name.

Signed-off-by: Pablo de Lara 
---
 app/test/Makefile |   2 +-
 app/test/test_hash_perf.c | 553 ++
 app/test/test_hash_perf_new.c | 553 --
 3 files changed, 554 insertions(+), 554 deletions(-)
 create mode 100644 app/test/test_hash_perf.c
 delete mode 100644 app/test/test_hash_perf_new.c

diff --git a/app/test/Makefile b/app/test/Makefile
index 8624e95..2e2758c 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -82,7 +82,7 @@ SRCS-y += test_memcpy.c
 SRCS-y += test_memcpy_perf.c

 SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash.c
-SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_perf_new.c
+SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_perf.c
 SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_functions.c
 SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_scaling.c

diff --git a/app/test/test_hash_perf.c b/app/test/test_hash_perf.c
new file mode 100644
index 000..6bba0c5
--- /dev/null
+++ b/app/test/test_hash_perf.c
@@ -0,0 +1,553 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "test.h"
+
+#define KEYS_TO_ADD (1 << 18)
+#define MAX_ENTRIES (KEYS_TO_ADD * 4) /* 25% table utilization */
+#define NUM_LOOKUPS (KEYS_TO_ADD * 10) /* Loop among keys added, several times 
*/
+#define BUCKET_SIZE 4
+#define NUM_BUCKETS (MAX_ENTRIES / BUCKET_SIZE)
+#define MAX_KEYSIZE 64
+#define NUM_KEYSIZES 10
+#define NUM_OPERATIONS 4 /* Add, lookup, lookup_bulk, delete */
+#define NUM_SHUFFLES 10
+#define BURST_SIZE 16
+
+static uint32_t hashtest_key_lens[] = {
+   4, 8, 16, 32, 48, 64, /* standard key sizes */
+   9,/* IPv4 SRC + DST + protocol, unpadded */
+   13,   /* IPv4 5-tuple, unpadded */
+   37,   /* IPv6 5-tuple, unpadded */
+   40/* IPv6 5-tuple, padded to 8-byte boundary */
+};
+struct rte_hash *h[NUM_KEYSIZES];
+/* Array that stores if a slot is full */
+uint8_t slot_taken[MAX_ENTRIES];
+/* Array to store number of cycles per operation */
+uint64_t cycles[NUM_KEYSIZES][NUM_OPERATIONS][2];
+/* Array to store all input keys */
+uint8_t keys[KEYS_TO_ADD][MAX_KEYSIZE];
+/* Array to store the precomputed hash for 'keys' */
+hash_sig_t signatures[KEYS_TO_ADD];
+/* Array to store how many busy entries have each bucket */
+uint8_t buckets[NUM_BUCKETS];
+
+/* Parameters used for hash table in unit test functions. */
+static struct rte_hash_parameters ut_params = {
+   .entries = MAX_ENTRIES,
+   .bucket_entries = BUCKET_SIZE,
+   .hash_func = rte_jhash,
+   .hash_func_init_val = 0,
+};
+
+static int
+create_table(unsigned table_index)
+{
+   char name[RTE_HASH_NAMESIZE];
+
+   sprintf(name, "test_hash%d", hashtest_key_lens[table_index]);
+   ut_params.name = name;
+   ut_params.key_len = hashtest_key_lens[table_index];
+   ut_params.socket_id = rte_socket_id();
+   h[table_index] = rte_hash_find_existing(name);
+   if (h[table_index] != NULL)
+   /*
+

[dpdk-dev] [PATCH v2 03/11] test/hash: enhance hash unit tests

2015-06-25 Thread Pablo de Lara

Add a new unit test that calculates the average table utilization,
using random keys, based on the number of entries that can be added
until one cannot be added (bucket is full).

Also, replace the current hash_perf unit test to show performance more clearly.
The current hash_perf unit test takes too long and adds keys that
may or may not fit in the table, and looks up/deletes keys that may not be
in the table. The new unit test gets a set of keys that are known
to fit in the table, and then measures the time to add/look up/delete
them.

Signed-off-by: Pablo de Lara 
---
 app/test/Makefile |   2 +-
 app/test/test_hash.c  |  59 
 app/test/test_hash_perf.c | 702 --
 app/test/test_hash_perf_new.c | 553 +
 4 files changed, 613 insertions(+), 703 deletions(-)
 delete mode 100644 app/test/test_hash_perf.c
 create mode 100644 app/test/test_hash_perf_new.c

diff --git a/app/test/Makefile b/app/test/Makefile
index 2e2758c..8624e95 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -82,7 +82,7 @@ SRCS-y += test_memcpy.c
 SRCS-y += test_memcpy_perf.c

 SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash.c
-SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_perf.c
+SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_perf_new.c
 SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_functions.c
 SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_scaling.c

diff --git a/app/test/test_hash.c b/app/test/test_hash.c
index 4300de9..46174db 100644
--- a/app/test/test_hash.c
+++ b/app/test/test_hash.c
@@ -1147,6 +1147,63 @@ test_hash_creation_with_good_parameters(void)
return 0;
 }

+#define ITERATIONS 50
+/*
+ * Test to see the average table utilization (entries added/max entries)
+ * before hitting a random entry that cannot be added
+ */
+static int test_average_table_utilization(void)
+{
+   struct rte_hash *handle;
+   void *simple_key;
+   unsigned i, j, no_space = 0;
+   double added_keys_until_no_space = 0;
+   int ret;
+
+   ut_params.entries = 1 << 20;
+   ut_params.name = "test_average_utilization";
+   ut_params.hash_func = rte_hash_crc;
+   handle = rte_hash_create(&ut_params);
+   RETURN_IF_ERROR(handle == NULL, "hash creation failed");
+
+   simple_key = rte_zmalloc(NULL, ut_params.key_len, 0);
+
+   for (j = 0; j < ITERATIONS; j++) {
+   while (!no_space) {
+   for (i = 0; i < ut_params.key_len; i++)
+   ((uint8_t *) simple_key)[i] = rte_rand() % 255;
+
+   ret = rte_hash_add_key(handle, simple_key);
+   print_key_info("Add", simple_key, ret);
+
+   if (ret == -ENOSPC) {
+   if (-ENOENT != rte_hash_lookup(handle, 
simple_key))
+   printf("Found key that should not be 
present\n");
+   no_space = 1;
+   } else {
+   if (ret < 0)
+   rte_free(simple_key);
+   RETURN_IF_ERROR(ret < 0, "failed to add key 
(ret=%d)", ret);
+   added_keys_until_no_space++;
+   }
+   }
+   no_space = 0;
+
+   /* Reset the table */
+   rte_hash_free(handle);
+   rte_hash_create(&ut_params);
+   }
+
+   const unsigned average_keys_added = added_keys_until_no_space / 
ITERATIONS;
+
+   printf("Average table utilization = %.2f%% (%u/%u)\n",
+   ((double) average_keys_added / ut_params.entries * 100),
+   average_keys_added, ut_params.entries);
+   rte_hash_free(handle);
+
+   return 0;
+}
+
 static uint8_t key[16] = {0x00, 0x01, 0x02, 0x03,
0x04, 0x05, 0x06, 0x07,
0x08, 0x09, 0x0a, 0x0b,
@@ -1405,6 +1462,8 @@ test_hash(void)
return -1;
if (test_hash_creation_with_good_parameters() < 0)
return -1;
+   if (test_average_table_utilization() < 0)
+   return -1;

run_hash_func_tests();

diff --git a/app/test/test_hash_perf.c b/app/test/test_hash_perf.c
deleted file mode 100644
index d0e5ce0..000
--- a/app/test/test_hash_perf.c
+++ /dev/null
@@ -1,702 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- * * Redistributions of source code must retain the above copyright
- *   notice, this list of conditions and the following disclaimer.
- * * Redistributions in binary form must reproduce the above copyright
- *   notice, this list of conditions and the following disclaimer in
- *   the

[dpdk-dev] [PATCH v2 02/11] hash: move rte_hash structure to C file and make it internal

2015-06-25 Thread Pablo de Lara

rte_hash structure should not be a public structure,
and therefore it should be moved to the C file and be declared
as internal.

This patch also removes part of a unit test that was checking
a field of the structure.
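
As a rough illustration of what "moving the structure to the C file" means for
callers, here is a minimal sketch of the opaque-type pattern. The toy_table
names are hypothetical and are not part of librte_hash; in a real library the
first part would live in the public header and the second part in the .c file.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* --- what the public header would expose (hypothetical toy_table API) --- */
struct toy_table;                        /* incomplete type: layout is hidden */
struct toy_table *toy_table_create(uint32_t entries);
uint32_t toy_table_entries(const struct toy_table *t);
void toy_table_free(struct toy_table *t);

/* --- what the .c file would keep private --- */
struct toy_table {
	uint32_t entries;                /* total table entries */
};

struct toy_table *toy_table_create(uint32_t entries)
{
	struct toy_table *t = malloc(sizeof(*t));

	if (t != NULL)
		t->entries = entries;
	return t;
}

/* Callers read fields through accessors instead of dereferencing the handle. */
uint32_t toy_table_entries(const struct toy_table *t)
{
	assert(t != NULL);
	return t->entries;
}

void toy_table_free(struct toy_table *t)
{
	free(t);
}
```

With this layout, code outside the .c file can no longer write something like
handle->hash_func, which is why the unit test check on that field had to go.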

Signed-off-by: Pablo de Lara 
---
 app/test/test_hash.c   |  6 +-
 lib/librte_hash/rte_hash.c | 30 +-
 lib/librte_hash/rte_hash.h | 37 +
 3 files changed, 35 insertions(+), 38 deletions(-)

diff --git a/app/test/test_hash.c b/app/test/test_hash.c
index 4ecb11b..4300de9 100644
--- a/app/test/test_hash.c
+++ b/app/test/test_hash.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -1110,10 +1110,6 @@ test_hash_creation_with_good_parameters(void)
printf("Creating hash with null hash_func failed\n");
return -1;
}
-   if (handle->hash_func == NULL) {
-   printf("Hash function should have been DEFAULT_HASH_FUNC\n");
-   return -1;
-   }

/* this test is trying to create a hash with the same name as previous one.
 * this should return a pointer to the hash we previously created.
diff --git a/lib/librte_hash/rte_hash.c b/lib/librte_hash/rte_hash.c
index 67dff5b..5100a75 100644
--- a/lib/librte_hash/rte_hash.c
+++ b/lib/librte_hash/rte_hash.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -92,6 +92,27 @@ EAL_REGISTER_TAILQ(rte_hash_tailq)
 /* The high bit is always set in real signatures */
 #define NULL_SIGNATURE  0

+struct rte_hash {
+   char name[RTE_HASH_NAMESIZE];   /**< Name of the hash. */
+   uint32_t entries;   /**< Total table entries. */
+   uint32_t bucket_entries;/**< Bucket entries. */
+   uint32_t key_len;   /**< Length of hash key. */
+   rte_hash_function hash_func;/**< Function used to calculate hash. */
+   uint32_t hash_func_init_val;/**< Init value used by hash_func. */
+   uint32_t num_buckets;   /**< Number of buckets in table. */
+   uint32_t bucket_bitmask;/**< Bitmask for getting bucket index
+   from hash signature. */
+   hash_sig_t sig_msb; /**< MSB is always set in valid signatures. */
+   uint8_t *sig_tbl;   /**< Flat array of hash signature buckets. */
+   uint32_t sig_tbl_bucket_size;   /**< Signature buckets may be padded for
+  alignment reasons, and this is the
+  bucket size used by sig_tbl. */
+   uint8_t *key_tbl;   /**< Flat array of key value buckets. */
+   uint32_t key_tbl_key_size;  /**< Keys may be padded for alignment
+  reasons, and this is the key size
+  used by key_tbl. */
+};
+
 /* Returns a pointer to the first signature in specified bucket. */
 static inline hash_sig_t *
 get_sig_tbl_bucket(const struct rte_hash *h, uint32_t bucket_index)
@@ -291,6 +312,13 @@ rte_hash_free(struct rte_hash *h)
rte_free(te);
 }

+hash_sig_t
+rte_hash_hash(const struct rte_hash *h, const void *key)
+{
+   /* calc hash result by key */
+   return h->hash_func(key, h->key_len, h->hash_func_init_val);
+}
+
 static inline int32_t
 __rte_hash_add_key_with_hash(const struct rte_hash *h,
const void *key, hash_sig_t sig)
diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index 821a9d4..79827a6 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -40,9 +40,6 @@
  * RTE Hash Table
  */

-#include 
-#include 
-
 #ifdef __cplusplus
 extern "C" {
 #endif
@@ -84,27 +81,8 @@ struct rte_hash_parameters {
int socket_id;  /**< NUMA Socket ID for memory. */
 };

-/** A hash table structure. */
-struct rte_hash {
-   char name[RTE_HASH_NAMESIZE];   /**< Name of the hash. */
-   uint32_t entries;   /**< Total table entries. */
-   uint32_t bucket_entries;/**< Bucket entries. */
-   uint32_t key_len;   /**< Length of hash key. */
-   rte_hash_function hash_func;/**< Function used

[dpdk-dev] [PATCH v2 01/11] eal: add const in prefetch functions

2015-06-25 Thread Pablo de Lara

The rte_prefetchX functions took a volatile void *p parameter, but they do
not modify the memory it points to, so the parameter should be const-qualified.
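
To show why the qualifier matters to callers, here is a small sketch. It is
not the DPDK implementation: toy_prefetch0 and sum_table are hypothetical
names, and the GCC/Clang __builtin_prefetch builtin stands in for the
per-architecture asm. The point is only that a function walking a const table
can prefetch from it without casting the const away.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for rte_prefetch0, built on the compiler builtin (assumption). */
static inline void toy_prefetch0(const volatile void *p)
{
	__builtin_prefetch((const void *)p, 0, 3);
}

/* Read-only walk: 'tbl' is const, yet it can be prefetched without a cast. */
static uint32_t sum_table(const uint32_t *tbl, size_t n)
{
	uint32_t sum = 0;
	size_t i;

	assert(tbl != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (i + 8 < n)
			toy_prefetch0(&tbl[i + 8]);  /* hint for a later element */
		sum += tbl[i];
	}
	return sum;
}
```

With the old non-const prototype, the call inside sum_table would need an
explicit cast to compile without a discarded-qualifier warning.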

Signed-off-by: Pablo de Lara 
Acked-by: Bruce Richardson 
---
 lib/librte_eal/common/include/arch/ppc_64/rte_prefetch.h |  6 +++---
 lib/librte_eal/common/include/arch/x86/rte_prefetch.h| 14 +++---
 lib/librte_eal/common/include/generic/rte_prefetch.h |  8 
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_prefetch.h b/lib/librte_eal/common/include/arch/ppc_64/rte_prefetch.h
index 9df0d13..fea3be1 100644
--- a/lib/librte_eal/common/include/arch/ppc_64/rte_prefetch.h
+++ b/lib/librte_eal/common/include/arch/ppc_64/rte_prefetch.h
@@ -39,17 +39,17 @@ extern "C" {

 #include "generic/rte_prefetch.h"

-static inline void rte_prefetch0(volatile void *p)
+static inline void rte_prefetch0(const volatile void *p)
 {
asm volatile ("dcbt 0,%[p],1" : : [p] "r" (p));
 }

-static inline void rte_prefetch1(volatile void *p)
+static inline void rte_prefetch1(const volatile void *p)
 {
asm volatile ("dcbt 0,%[p],1" : : [p] "r" (p));
 }

-static inline void rte_prefetch2(volatile void *p)
+static inline void rte_prefetch2(const volatile void *p)
 {
asm volatile ("dcbt 0,%[p],1" : : [p] "r" (p));
 }
diff --git a/lib/librte_eal/common/include/arch/x86/rte_prefetch.h b/lib/librte_eal/common/include/arch/x86/rte_prefetch.h
index ec2454d..8e6e02c 100644
--- a/lib/librte_eal/common/include/arch/x86/rte_prefetch.h
+++ b/lib/librte_eal/common/include/arch/x86/rte_prefetch.h
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -40,19 +40,19 @@ extern "C" {

 #include "generic/rte_prefetch.h"

-static inline void rte_prefetch0(volatile void *p)
+static inline void rte_prefetch0(const volatile void *p)
 {
-   asm volatile ("prefetcht0 %[p]" : [p] "+m" (*(volatile char *)p));
+   asm volatile ("prefetcht0 %[p]" : : [p] "m" (*(const volatile char *)p));
 }

-static inline void rte_prefetch1(volatile void *p)
+static inline void rte_prefetch1(const volatile void *p)
 {
-   asm volatile ("prefetcht1 %[p]" : [p] "+m" (*(volatile char *)p));
+   asm volatile ("prefetcht1 %[p]" : : [p] "m" (*(const volatile char *)p));
 }

-static inline void rte_prefetch2(volatile void *p)
+static inline void rte_prefetch2(const volatile void *p)
 {
-   asm volatile ("prefetcht2 %[p]" : [p] "+m" (*(volatile char *)p));
+   asm volatile ("prefetcht2 %[p]" : : [p] "m" (*(const volatile char *)p));
 }

 #ifdef __cplusplus
diff --git a/lib/librte_eal/common/include/generic/rte_prefetch.h b/lib/librte_eal/common/include/generic/rte_prefetch.h
index 217f319..725715f 100644
--- a/lib/librte_eal/common/include/generic/rte_prefetch.h
+++ b/lib/librte_eal/common/include/generic/rte_prefetch.h
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -51,14 +51,14 @@
  * @param p
  *   Address to prefetch
  */
-static inline void rte_prefetch0(volatile void *p);
+static inline void rte_prefetch0(const volatile void *p);

 /**
  * Prefetch a cache line into all cache levels except the 0th cache level.
  * @param p
  *   Address to prefetch
  */
-static inline void rte_prefetch1(volatile void *p);
+static inline void rte_prefetch1(const volatile void *p);

 /**
  * Prefetch a cache line into all cache levels except the 0th and 1th cache
@@ -66,6 +66,6 @@ static inline void rte_prefetch1(volatile void *p);
  * @param p
  *   Address to prefetch
  */
-static inline void rte_prefetch2(volatile void *p);
+static inline void rte_prefetch2(const volatile void *p);

 #endif /* _RTE_PREFETCH_H_ */
-- 
2.4.2

[dpdk-dev] [PATCH v2 00/11] Cuckoo hash

2015-06-25 Thread Pablo de Lara

This patchset is to replace the existing hash library with
a more efficient and functional approach, using the Cuckoo hash
method to deal with collisions. This method is based on using
two different hash functions to have two possible locations
in the hash table where an entry can be.
So, if a bucket is full, a new entry can push one of the items
in that bucket to its alternative location, making space for itself.
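
The displacement idea described above can be sketched as follows. This is a
toy model under stated assumptions, not the code in rte_cuckoo_hash.c: keys
are non-zero 32-bit integers, the two hash functions and all toy_* names are
made up for the example, and a failed insertion is rolled back so that no
stored key is lost.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define TOY_BITS      3                  /* log2 of the bucket count */
#define TOY_BUCKETS   (1u << TOY_BITS)
#define TOY_SLOTS     2                  /* entries per bucket */
#define TOY_MAX_KICKS 16                 /* displacement limit before giving up */

struct toy_cuckoo {
	uint32_t key[TOY_BUCKETS][TOY_SLOTS]; /* 0 marks an empty slot */
};

/* Two multiplicative hashes; the constants are arbitrary choices. */
static uint32_t toy_h1(uint32_t k) { return (k * 0x9E3779B1u) >> (32 - TOY_BITS); }
static uint32_t toy_h2(uint32_t k) { return (k * 0x85EBCA6Bu) >> (32 - TOY_BITS); }

static void toy_init(struct toy_cuckoo *t) { memset(t, 0, sizeof(*t)); }

/* A key can only ever live in one of its two candidate buckets. */
static int toy_lookup(const struct toy_cuckoo *t, uint32_t k)
{
	uint32_t b[2] = { toy_h1(k), toy_h2(k) };
	int i, s;

	for (i = 0; i < 2; i++)
		for (s = 0; s < TOY_SLOTS; s++)
			if (t->key[b[i]][s] == k)
				return 1;
	return 0;
}

/* Store k in a free slot of bucket b; returns 1 on success, 0 if full. */
static int toy_place(struct toy_cuckoo *t, uint32_t b, uint32_t k)
{
	int s;

	for (s = 0; s < TOY_SLOTS; s++) {
		if (t->key[b][s] == 0) {
			t->key[b][s] = k;
			return 1;
		}
	}
	return 0;
}

static int toy_add(struct toy_cuckoo *t, uint32_t key)
{
	uint32_t path_b[TOY_MAX_KICKS], path_s[TOY_MAX_KICKS];
	uint32_t b, s, tmp, k = key;
	int n;

	if (key == 0)
		return -EINVAL;
	if (toy_lookup(t, key))
		return 0;
	if (toy_place(t, toy_h1(key), key) || toy_place(t, toy_h2(key), key))
		return 0;

	/* Both buckets are full: push a resident to its alternative bucket. */
	b = toy_h1(key);
	for (n = 0; n < TOY_MAX_KICKS; n++) {
		s = (uint32_t)n % TOY_SLOTS;
		tmp = t->key[b][s];          /* evict the resident ...      */
		t->key[b][s] = k;            /* ... and take its slot       */
		k = tmp;
		path_b[n] = b;
		path_s[n] = s;
		b = (toy_h1(k) == b) ? toy_h2(k) : toy_h1(k);
		if (toy_place(t, b, k))
			return 0;
	}

	/* Looks like a loop: undo the swaps in reverse order. */
	while (n-- > 0) {
		tmp = t->key[path_b[n]][path_s[n]];
		t->key[path_b[n]][path_s[n]] = k;
		k = tmp;
	}
	assert(k == key);
	return -ENOSPC;
}
```

A lookup never inspects more than two buckets, which is where the constant
worst-case lookup time comes from. The cost moves to insertion, which may
displace several entries or give up with -ENOSPC.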

Advantages
~~~~~~~~~~
- Offers the option to store more entries when the target bucket is full
  (unlike the previous implementation)
- Memory efficient: for storing those entries, it is not necessary to
  request new memory, as the entries will be stored in the same table
- Constant worst lookup time: in worst case scenario, it always takes
  the same time to look up an entry, as there are only two possible locations
  where an entry can be.
- Storing data: the user can store data in the hash table, unlike the
  previous implementation, while the old API remains usable

This implementation typically offers over 90% utilization.
Note that the API has been extended, but the old API remains. The main
ABI changes are that the rte_hash structure is now private and that
two macros are deprecated.

Changes in v2:

- Fixed issue where table could not store maximum number of entries
- Fixed issue where lookup burst could not be more than 32 (instead of 64)
- Remove unnecessary macros and add other useful ones
- Added missing library dependencies
- Used directly rte_hash_secondary instead of rte_hash_alt
- Renamed rte_hash.c to rte_cuckoo_hash.c to ease the view of the new library
- Renamed test_hash_perf.c temporarily to ease the view of the improved unit test
- Moved rte_hash, rte_bucket and rte_hash_key structures to rte_cuckoo_hash.c to
  make them private
- Corrected copyright dates
- Added an optimized function to compare keys that are multiple of 16 bytes
- Improved the way to use primary/secondary signatures. Now both are stored in
  the bucket, so there is no need to recalculate them when they are required.
  Also, there is no need to use the MSB of a signature to differentiate between
  an empty entry and signature 0, since we are storing both signatures,
  which cannot both be 0.
- Removed rte_hash_rehash, as it was a very expensive operation.
  Therefore, the add function now returns -ENOSPC if a key cannot be added
  because of a loop.
- Prefetched new slot for new key in add function to improve performance.
- Made doxygen comments clearer.
- Removed unnecessary rte_hash_del_key_data and rte_hash_del_key_with_data,
  as we can use the lookup functions if we want to get the data before deleting.
- Removed some unnecessary includes in rte_hash.h
- Removed some unnecessary variables in rte_cuckoo_hash.c
- Removed some unnecessary checks before creating a new hash table 
- Added documentation (in release notes and programmers guide)
- Added new unit tests and replaced the performance one for hash tables

Pablo de Lara (11):
  eal: add const in prefetch functions
  hash: move rte_hash structure to C file and make it internal
  test/hash: enhance hash unit tests
  test/hash: rename new hash perf unit test back to original name
  hash: replace existing hash library with cuckoo hash implementation
  hash: add new lookup_bulk_with_hash function
  hash: add new function rte_hash_reset
  hash: add new functionality to store data in hash table
  MAINTAINERS: claim responsability for hash library
  doc: announce ABI change of librte_hash
  doc: update hash documentation

 MAINTAINERS|1 +
 app/test/test_hash.c   |  189 +--
 app/test/test_hash_perf.c  |  906 ++---
 doc/guides/prog_guide/hash_lib.rst |   77 +-
 doc/guides/rel_notes/abi.rst   |2 +
 .../common/include/arch/ppc_64/rte_prefetch.h  |6 +-
 .../common/include/arch/x86/rte_prefetch.h |   14 +-
 .../common/include/generic/rte_prefetch.h  |8 +-
 lib/librte_hash/Makefile   |8 +-
 lib/librte_hash/rte_cuckoo_hash.c  | 1394 
 lib/librte_hash/rte_hash.c |  471 ---
 lib/librte_hash/rte_hash.h |  274 +++-
 lib/librte_hash/rte_hash_version.map   |   15 +
 13 files changed, 2191 insertions(+), 1174 deletions(-)
 create mode 100644 lib/librte_hash/rte_cuckoo_hash.c
 delete mode 100644 lib/librte_hash/rte_hash.c

-- 
2.4.2

[dpdk-dev] VMXNET3 on vmware, ping delay

2015-06-25 Thread Avi Kivity

On 06/25/2015 09:44 PM, Thomas Monjalon wrote:
> 2015-06-25 18:46, Avi Kivity:
>> On 06/25/2015 06:18 PM, Matthew Hall wrote:
>>> On Thu, Jun 25, 2015 at 09:14:53AM +, Vass, Sandor (Nokia - 
>>> HU/Budapest) wrote:
>>>> According to my understanding each packet should go
>>>> through BR as fast as possible, but it seems that the rte_eth_rx_burst
>>>> retrieves packets only when there are at least 2 packets on the RX queue of
>>>> the NIC. At least most of the times as there are cases (rarely - according
>>>> to my console log) when it can retrieve 1 packet also and sometimes only 3
>>>> packets can be retrieved...
>>> By default DPDK is optimized for throughput not latency. Try a test with
>>> heavier traffic.
>>>
>>> There is also some work going on now for DPDK interrupt-driven mode, which
>>> will work more like traditional Ethernet drivers instead of polling mode
>>> Ethernet drivers.
>>>
>>> Though I'm not an expert on it, there is also a series of ways to optimize 
>>> for
>>> latency, which hopefully some others could discuss... or maybe search the
>>> archives / web site / Intel tuning documentation.
>>>
>> What would be useful is a runtime switch between polling and interrupt
>> modes.  This way, if the load is low you use interrupts, and as
>> mitigation, you switch to poll mode, until the load drops again.
> DPDK is not a stack. It's up to the DPDK application to poll or use interrupts
> when needed.

As long as DPDK provides a mechanism for a runtime switch, the 
application can do that.

[dpdk-dev] Cancel

2015-06-25 Thread zhhxu2011

[dpdk-dev] [PATCH 0/2] kni: fix build with kernel 4.1

2015-06-25 Thread De Lara Guarch, Pablo

Hi Miguel,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Miguel Bernal
> Marin
> Sent: Thursday, June 25, 2015 8:10 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 0/2] kni: fix build with kernel 4.1
> 
> Due to API changes in netdevice.h in 4.1 kernel release, KNI modules
> would not build.  This patch set adds the proper checks to fix
> compilation.
> 
> Miguel Bernal Marin (2):
>   kni: fix igb_ndo_bridge_getlink in 4.1
>   kni: fix header_ops in 4.1
> 
>  lib/librte_eal/linuxapp/kni/ethtool/igb/igb_main.c | 10 ++
>  lib/librte_eal/linuxapp/kni/ethtool/igb/kcompat.h  |  5 +
>  lib/librte_eal/linuxapp/kni/kni_net.c  |  4 
>  3 files changed, 19 insertions(+)
> 
> --
> 2.3.3

I have tested your fix and it works for kernel 4.1, but I get an error if
CONFIG_RTE_KNI_VHOST=y:

  CC [M]  /root/dpdk-latest/x86_64-native-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/kni_vhost.o
/root/dpdk-latest/x86_64-native-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/kni_vhost.c:593:2: error: initialization from incompatible pointer type [-Werror]
  .sendmsg = kni_sock_sndmsg,
  ^
/root/dpdk-latest/x86_64-native-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/kni_vhost.c:593:2: error: (near initialization for 'kni_socket_ops.sendmsg') [-Werror]
/root/dpdk-latest/x86_64-native-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/kni_vhost.c:594:2: error: initialization from incompatible pointer type [-Werror]
  .recvmsg = kni_sock_rcvmsg,
  ^
/root/dpdk-latest/x86_64-native-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/kni_vhost.c:594:2: error: (near initialization for 'kni_socket_ops.recvmsg') [-Werror]
cc1: all warnings being treated as errors
/home/kernel_test/kernels_rc/linux-4.1/scripts/Makefile.build:258: recipe for target '/root/dpdk-latest/x86_64-native-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/kni_vhost.o' failed
make[10]: *** [/root/dpdk-latest/x86_64-native-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/kni_vhost.o] Error 1
/home/kernel_test/kernels_rc/linux-4.1/Makefile:1383: recipe for target '_module_/root/dpdk-latest/x86_64-native-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni' failed

The fix for this is to remove struct socket *sock from kni_sock_sndmsg and
kni_sock_rcvmsg in kni_vhost.c.

Could you send a v2 with this fix as well?

Thanks for this,
Pablo

[dpdk-dev] VMXNET3 on vmware, ping delay

2015-06-25 Thread Vass, Sandor (Nokia - HU/Budapest)

It seems I have found the cause, but I still don't understand the reason.
So, let me describe my setup a bit further. I installed VMWare Workstation
on my laptop. It has a mobile i5 CPU: 2 cores with hyperthreading, so
basically 4 cores.
In VMWare I assigned 1 CPU with one core to each of the C1 and C2 nodes; BR
has one CPU with 4 cores allocated (the maximum possible value).

If I execute 'basicfwd' or the multi-process master (and two clients) on
any of the cores out of [2,3,4], then the ping is received immediately (less
than 0.5ms) and the transfer speed is immediately high (starting from ~30MB/s
and finishing at around 80-90MB/s, with basicfwd and with test-pmd as well *).

If I allocate them on core 1 (the clients on any other cores), then the ping
behaves as I originally described: 1sec delays. When I tried to transfer a
bigger file (I used scp) it started really slowly (some 16-32KB/s), and
sometimes it even stalled. Later on it got faster, as Matthew wrote, but it
didn't go above 20-30MB/s.

test-pmd worked from the start.
This is because test-pmd needs 2 cores to be defined, and I always passed
'-c 3'. Checking with top, it could be seen that it always used CPU#2 (top
showed that the second CPU was 100% utilized).

Can anyone tell me the reason for this behavior? Using CPU 1 there are huge
latencies; using other CPUs everything works as expected...
Checking on the laptop (Windows Task Manager) it could be seen that none of the
VMs was driving a single CPU to 100% on my laptop. The 100% utilization of the
dpdk processes was somehow distributed among the physical CPU cores, so no
single core was allocated exclusively to a VM. Why is the situation different
when I use the first CPU on BR rather than the others? It doesn't seem that C1
and C2 are blocking that CPU. Anyway, the host OS already uses all the cores
(not heavily).

Rashmin, thanks for the docs. I think I already saw that one but I didn't take
it seriously. I thought latency tuning in VMWare ESXi makes sense when one
would like to go from 5ms to 0.5ms. But I had 1000ms latency at low
load... I will check whether those params apply to Workstation at all.

*) Top speed of Multi process master-client example was around 20-30 MB/s, 
immediately. I think this is a normal limitation because the processes have to 
talk with each other through shared mem, so it is anyway slower. I didn't test 
its speed when the Master process was bound to core 1

Sandor

-Original Message-
From: ext Patel, Rashmin N [mailto:rashmin.n.pa...@intel.com] 
Sent: Thursday, June 25, 2015 10:56 PM
To: Matthew Hall; Vass, Sandor (Nokia - HU/Budapest)
Cc: dev at dpdk.org
Subject: RE: [dpdk-dev] VMXNET3 on vmware, ping delay

For tuning ESXi and vSwitch for latency sensitive workloads, I remember the 
following paper published by VMware: 
https://www.vmware.com/files/pdf/techpaper/VMW-Tuning-Latency-Sensitive-Workloads.pdf
 that you can try out.

The overall latency in setup (vmware and dpdk-vm using vmxnet3) remains in 
vmware-native-driver/vmkernel/vmxnet3-backend/vmx-emulation threads in ESXi. So 
you can better tune ESXi (as explained in the above white paper) and/or make 
sure that these important threads are not starving to improve latency and 
throughput in some cases of this setup.

Thanks,
Rashmin

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Matthew Hall
Sent: Thursday, June 25, 2015 8:19 AM
To: Vass, Sandor (Nokia - HU/Budapest)
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] VMXNET3 on vmware, ping delay

On Thu, Jun 25, 2015 at 09:14:53AM +, Vass, Sandor (Nokia - HU/Budapest) 
wrote:
> According to my understanding each packet should go through BR as fast 
> as possible, but it seems that the rte_eth_rx_burst retrieves packets 
> only when there are at least 2 packets on the RX queue of the NIC. At 
> least most of the times as there are cases (rarely - according to my 
> console log) when it can retrieve 1 packet also and sometimes only 3 
> packets can be retrieved...

By default DPDK is optimized for throughput not latency. Try a test with 
heavier traffic.

There is also some work going on now for DPDK interrupt-driven mode, which will 
work more like traditional Ethernet drivers instead of polling mode Ethernet 
drivers.

Though I'm not an expert on it, there is also a series of ways to optimize for 
latency, which hopefully some others could discuss... or maybe search the 
archives / web site / Intel tuning documentation.

Matthew.

[dpdk-dev] [PATCH v7 1/4] ethdev: add apis to support access device info

2015-06-25 Thread Wang, Liang-min



> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Thursday, June 25, 2015 9:44 AM
> To: Wang, Liang-min
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v7 1/4] ethdev: add apis to support access
> device info
> 
> On Wed, 17 Jun 2015 18:22:12 -0400
> Liang-Min Larry Wang  wrote:
> 
> > +int
> > +rte_eth_dev_reg_length(uint8_t port_id)
> > +{
> > +   struct rte_eth_dev *dev;
> > +
> > +   if ((dev= &rte_eth_devices[port_id]) == NULL) {
> > +   PMD_DEBUG_TRACE("Invalid port device\n");
> > +   return -ENODEV;
> > +   }
> 
> Some minor nits:
>   * for consistency you should add valid port check here.
>   * style:
> - don't do assignment in if() unless it really helps readability
> - need whitespace
> 
>   if (!rte_eth_dev_is_valid_port(port_id)) {
>   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
>   return -ENODEV;
>   }
> 
>   dev = &rte_eth_devices[port_id];
>   if (dev == NULL) {
>   PMD_DEBUG_TRACE("Invalid port device\n");
>   return -ENODEV;
>   }
> ...
> 
> This code pattern is so common it really should be a function.
> 
>   dev = rte_eth_dev_get(port_id);
>   if (dev == NULL) {
>   PMD_DEBUG_TRACE("Invalid port device\n");
>   return -ENODEV;
>   }
> 
> And then add a macro to generate this??

This is used throughout rte_ethdev.c; should it be done to the entire file?
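
For reference, the helper-plus-macro idea from the review could look roughly
like the sketch below. Everything here is a mock: the toy_* types, the device
array and TOY_ETH_DEV_OR_ERR_RET are hypothetical, and rte_eth_dev_get is only
the name proposed in the review, not an existing function. Note that taking
the address of an array element never yields NULL in C, so the validity check
has to happen before the array is indexed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PORTS 4

struct toy_eth_dev {
	int attached;                    /* non-zero once a port is in use */
	int reg_len;                     /* example per-device attribute */
};

static struct toy_eth_dev toy_eth_devices[TOY_MAX_PORTS];

/* Test helper: mark a port as attached. */
static void toy_eth_dev_attach(uint8_t port_id, int reg_len)
{
	assert(port_id < TOY_MAX_PORTS);
	toy_eth_devices[port_id].attached = 1;
	toy_eth_devices[port_id].reg_len = reg_len;
}

/* Single place for the port check; returns NULL for an invalid port. */
static struct toy_eth_dev *toy_eth_dev_get(uint8_t port_id)
{
	if (port_id >= TOY_MAX_PORTS || !toy_eth_devices[port_id].attached)
		return NULL;
	return &toy_eth_devices[port_id];
}

/* Macro form of the repeated "look up or return an error" pattern. */
#define TOY_ETH_DEV_OR_ERR_RET(dev, port_id, retval) do { \
	(dev) = toy_eth_dev_get(port_id);                 \
	if ((dev) == NULL)                                \
		return (retval);                          \
} while (0)

static int toy_eth_dev_reg_length(uint8_t port_id)
{
	struct toy_eth_dev *dev;

	TOY_ETH_DEV_OR_ERR_RET(dev, port_id, -ENODEV);
	return dev->reg_len;
}
```

Whether a macro with a hidden return is preferable to spelling out the three
lines at each call site is a style question for the maintainers.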

[dpdk-dev] [PATCH v7 1/4] ethdev: add apis to support access device info

2015-06-25 Thread Wang, Liang-min



> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Thursday, June 25, 2015 9:40 AM
> To: Wang, Liang-min
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v7 1/4] ethdev: add apis to support access
> device info
> 
> On Wed, 17 Jun 2015 18:22:12 -0400
> Liang-Min Larry Wang  wrote:
> 
> >  int
> > +rte_eth_dev_default_mac_addr_set(uint8_t port_id, struct ether_addr
> *addr)
> > +{
> > +   struct rte_eth_dev *dev;
> > +
> > +   if (!rte_eth_dev_is_valid_port(port_id)) {
> > +   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > +   return -ENODEV;
> > +   }
> > +
> > +   if (!is_valid_assigned_ether_addr(addr))
> > +   return -EINVAL;
> > +
> > +   dev = &rte_eth_devices[port_id];
> > +   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->mac_addr_set, -
> ENOTSUP);
> > +
> > +   /* Update default address in NIC data structure */
> > +   ether_addr_copy(addr, &dev->data->mac_addrs[0]);
> > +
> > +   (*dev->dev_ops->mac_addr_set)(dev, addr);
> 
> Would it be possible to directly set mac_addr[0] if device does not
> provide a device driver specific override?

I would yield this question to Konstantin since this information is used by get 
mac addr API.

[dpdk-dev] VMXNET3 on vmware, ping delay

2015-06-25 Thread Patel, Rashmin N

For tuning ESXi and vSwitch for latency sensitive workloads, I remember the 
following paper published by VMware: 
https://www.vmware.com/files/pdf/techpaper/VMW-Tuning-Latency-Sensitive-Workloads.pdf
 that you can try out.

The overall latency in setup (vmware and dpdk-vm using vmxnet3) remains in 
vmware-native-driver/vmkernel/vmxnet3-backend/vmx-emulation threads in ESXi. So 
you can better tune ESXi (as explained in the above white paper) and/or make 
sure that these important threads are not starving to improve latency and 
throughput in some cases of this setup.

Thanks,
Rashmin

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Matthew Hall
Sent: Thursday, June 25, 2015 8:19 AM
To: Vass, Sandor (Nokia - HU/Budapest)
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] VMXNET3 on vmware, ping delay

On Thu, Jun 25, 2015 at 09:14:53AM +, Vass, Sandor (Nokia - HU/Budapest) 
wrote:
> According to my understanding each packet should go through BR as fast 
> as possible, but it seems that the rte_eth_rx_burst retrieves packets 
> only when there are at least 2 packets on the RX queue of the NIC. At 
> least most of the times as there are cases (rarely - according to my 
> console log) when it can retrieve 1 packet also and sometimes only 3 
> packets can be retrieved...

By default DPDK is optimized for throughput not latency. Try a test with 
heavier traffic.

There is also some work going on now for DPDK interrupt-driven mode, which will 
work more like traditional Ethernet drivers instead of polling mode Ethernet 
drivers.

Though I'm not an expert on it, there is also a series of ways to optimize for 
latency, which hopefully some others could discuss... or maybe search the 
archives / web site / Intel tuning documentation.

Matthew.

[dpdk-dev] VMXNET3 on vmware, ping delay

2015-06-25 Thread Thomas Monjalon

2015-06-25 18:46, Avi Kivity:
> On 06/25/2015 06:18 PM, Matthew Hall wrote:
> > On Thu, Jun 25, 2015 at 09:14:53AM +, Vass, Sandor (Nokia - 
> > HU/Budapest) wrote:
> >> According to my understanding each packet should go
> >> through BR as fast as possible, but it seems that the rte_eth_rx_burst
> >> retrieves packets only when there are at least 2 packets on the RX queue of
> >> the NIC. At least most of the times as there are cases (rarely - according
> >> to my console log) when it can retrieve 1 packet also and sometimes only 3
> >> packets can be retrieved...
> > By default DPDK is optimized for throughput not latency. Try a test with
> > heavier traffic.
> >
> > There is also some work going on now for DPDK interrupt-driven mode, which
> > will work more like traditional Ethernet drivers instead of polling mode
> > Ethernet drivers.
> >
> > Though I'm not an expert on it, there is also a series of ways to optimize 
> > for
> > latency, which hopefully some others could discuss... or maybe search the
> > archives / web site / Intel tuning documentation.
> >
> 
> What would be useful is a runtime switch between polling and interrupt 
> modes.  This way, if the load is low you use interrupts, and as 
> mitigation, you switch to poll mode, until the load drops again.

DPDK is not a stack. It's up to the DPDK application to poll or use interrupts
when needed.

[dpdk-dev] [PATCH] mempool: improve cache search

2015-06-25 Thread Zoltan Kiss

The current way has a few problems:

- if cache->len < n, we copy our elements into the cache first, then
  into obj_table, that's unnecessary
- if n >= cache_size (or the backfill fails), and we can't fulfil the
  request from the ring alone, we don't try to combine with the cache
- if refill fails, we don't return anything, even if the ring has enough
  for our request

This patch substantially rewrites the function:
- in the first part of the function we only use the cache if cache->len >= n
- otherwise take our elements straight from the ring
- if that fails but we have something in the cache, try to combine them
- the refill happens at the end, and its failure doesn't modify our return
  value

Signed-off-by: Zoltan Kiss 
---
 lib/librte_mempool/rte_mempool.h | 63 +---
 1 file changed, 39 insertions(+), 24 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index a8054e1..896946c 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -948,34 +948,14 @@ __mempool_get_bulk(struct rte_mempool *mp, void **obj_table,
unsigned lcore_id = rte_lcore_id();
uint32_t cache_size = mp->cache_size;

-   /* cache is not enabled or single consumer */
+   cache = &mp->local_cache[lcore_id];
+   /* cache is not enabled or single consumer or not enough */
if (unlikely(cache_size == 0 || is_mc == 0 ||
-n >= cache_size || lcore_id >= RTE_MAX_LCORE))
+cache->len < n || lcore_id >= RTE_MAX_LCORE))
goto ring_dequeue;

-   cache = &mp->local_cache[lcore_id];
cache_objs = cache->objs;

-   /* Can this be satisfied from the cache? */
-   if (cache->len < n) {
-   /* No. Backfill the cache first, and then fill from it */
-   uint32_t req = n + (cache_size - cache->len);
-
-   /* How many do we require i.e. number to fill the cache + the request */
-   ret = rte_ring_mc_dequeue_bulk(mp->ring, &cache->objs[cache->len], req);
-   if (unlikely(ret < 0)) {
-   /*
-* In the offchance that we are buffer constrained,
-* where we are not able to allocate cache + n, go to
-* the ring directly. If that fails, we are truly out of
-* buffers.
-*/
-   goto ring_dequeue;
-   }
-
-   cache->len += req;
-   }
-
/* Now fill in the response ... */
for (index = 0, len = cache->len - 1; index < n; ++index, len--, obj_table++)
*obj_table = cache_objs[len];
@@ -984,7 +964,8 @@ __mempool_get_bulk(struct rte_mempool *mp, void **obj_table,

__MEMPOOL_STAT_ADD(mp, get_success, n);

-   return 0;
+   ret = 0;
+   goto cache_refill;

 ring_dequeue:
 #endif /* RTE_MEMPOOL_CACHE_MAX_SIZE > 0 */
@@ -995,11 +976,45 @@ ring_dequeue:
else
ret = rte_ring_sc_dequeue_bulk(mp->ring, obj_table, n);

+#if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
+   if (ret < 0 && is_mc == 1 && cache->len > 0) {
+   uint32_t req = n - cache->len;
+
+   ret = rte_ring_mc_dequeue_bulk(mp->ring, obj_table, req);
+   if (ret == 0) {
+   cache_objs = cache->objs;
+   obj_table += req;
+   for (index = 0; index < cache->len;
+++index, ++obj_table)
+   *obj_table = cache_objs[index];
+   cache->len = 0;
+   }
+   }
+#endif /* RTE_MEMPOOL_CACHE_MAX_SIZE > 0 */
+
if (ret < 0)
__MEMPOOL_STAT_ADD(mp, get_fail, n);
else
__MEMPOOL_STAT_ADD(mp, get_success, n);

+#if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
+cache_refill:
+   /* If previous dequeue was OK and we have less than n, start refill */
+   if (ret == 0 && cache_size > 0 && cache->len < n) {
+   uint32_t req = cache_size - cache->len;
+
+   cache_objs = cache->objs;
+   ret = rte_ring_mc_dequeue_bulk(mp->ring,
+  &cache->objs[cache->len],
+  req);
+   if (likely(ret == 0))
+   cache->len += req;
+   else
+   /* Don't spoil the return value */
+   ret = 0;
+   }
+#endif /* RTE_MEMPOOL_CACHE_MAX_SIZE > 0 */
+
return ret;
 }

-- 
1.9.1

[dpdk-dev] VMXNET3 on vmware, ping delay

2015-06-25 Thread Avi Kivity



On 06/25/2015 06:18 PM, Matthew Hall wrote:
> On Thu, Jun 25, 2015 at 09:14:53AM +, Vass, Sandor (Nokia - HU/Budapest) 
> wrote:
>> According to my understanding each packet should go
>> through BR as fast as possible, but it seems that the rte_eth_rx_burst
>> retrieves packets only when there are at least 2 packets on the RX queue of
>> the NIC. At least most of the times as there are cases (rarely - according
>> to my console log) when it can retrieve 1 packet also and sometimes only 3
>> packets can be retrieved...
> By default DPDK is optimized for throughput not latency. Try a test with
> heavier traffic.
>
> There is also some work going on now for DPDK interrupt-driven mode, which
> will work more like traditional Ethernet drivers instead of polling mode
> Ethernet drivers.
>
> Though I'm not an expert on it, there is also a series of ways to optimize for
> latency, which hopefully some others could discuss... or maybe search the
> archives / web site / Intel tuning documentation.
>

What would be useful is a runtime switch between polling and interrupt 
modes.  This way, if the load is low you use interrupts, and as 
mitigation you switch to poll mode until the load drops again.

[dpdk-dev] [PATCH] librte_ether: release memory in uninit function.

2015-06-25 Thread Ananyev, Konstantin

Hi Bernard,

> -Original Message-
> From: Iremonger, Bernard
> Sent: Thursday, June 25, 2015 3:30 PM
> To: dev at dpdk.org
> Cc: Zhang, Helin; Ananyev, Konstantin; Qiu, Michael; mukawa at igel.co.jp; 
> Iremonger, Bernard
> Subject: [PATCH] librte_ether: release memory in uninit function.
> 
> 
> Signed-off-by: Bernard Iremonger 
> ---
>  lib/librte_ether/rte_ethdev.c |8 +++-
>  1 files changed, 7 insertions(+), 1 deletions(-)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index e13fde5..2404556 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -369,8 +369,14 @@ rte_eth_dev_uninit(struct rte_pci_device *pci_dev)
>   /* free ether device */
>   rte_eth_dev_release_port(eth_dev);
> 
> - if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> + if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> + rte_free(eth_dev->data->rx_queues);
> + rte_free(eth_dev->data->tx_queues);
>   rte_free(eth_dev->data->dev_private);
> + rte_free(eth_dev->data->mac_addrs);
> + rte_free(eth_dev->data->hash_mac_addrs);

Sorry, but I don't understand why you put the last 2 rte_free()s here.
You already release the mac_addrs and hash_mac_addrs memory in each PMD _uninit 
routine.
Plus, as Stephen said, it would be better if the same component (the PMD in that case) 
did both the alloc and the free.
Apart from that, the patch looks good to me.

Konstantin 


> + memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
> + }
> 
>   eth_dev->pci_dev = NULL;
>   eth_dev->driver = NULL;
> --
> 1.7.4.1

[dpdk-dev] [PATCH] examples/tep_termination: Add a compilation option for the VXLAN sample

2015-06-25 Thread Jijiang Liu

Add a compilation option for the VXLAN sample.

Signed-off-by: Jijiang Liu 
---
 examples/Makefile |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/examples/Makefile b/examples/Makefile
index 081b768..b4eddbd 100644
--- a/examples/Makefile
+++ b/examples/Makefile
@@ -67,7 +67,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_SCHED) += qos_sched
 DIRS-y += quota_watermark
 DIRS-$(CONFIG_RTE_ETHDEV_RXTX_CALLBACKS) += rxtx_callbacks
 DIRS-y += skeleton
-DIRS-y += tep_termination
+DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += tep_termination
 DIRS-$(CONFIG_RTE_LIBRTE_TIMER) += timer
 DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += vhost
 DIRS-$(CONFIG_RTE_LIBRTE_XEN_DOM0) += vhost_xen
-- 
1.7.7.6

[dpdk-dev] [PATCH] vfio-pci: Fixing type used to unsigned long

2015-06-25 Thread Alejandro.Lucero

From: "Alejandro.Lucero" 

The VFIO kernel driver and the mmap system call expect offset and size to be 64 bits wide.

Due to this bug, the BAR index info given to the VFIO driver is always 0 when 
checking the validity of the resource mapping.
---
 lib/librte_eal/linuxapp/eal/eal_pci_vfio.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c 
b/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
index aea1fb1..29d8806 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
@@ -728,7 +728,7 @@ pci_vfio_map_resource(struct rte_pci_device *dev)
struct vfio_region_info reg = { .argsz = sizeof(reg) };
void *bar_addr;
struct memreg {
-   uint32_t offset, size;
+   unsigned long offset, size;
} memreg[2] = {};

reg.index = i;
-- 
1.7.9.5

[dpdk-dev] VMXNET3 on vmware, ping delay

2015-06-25 Thread Matthew Hall

On Thu, Jun 25, 2015 at 09:13:59PM +, Vass, Sandor (Nokia - HU/Budapest) 
wrote:
> Can anyone tell me the reason of this behavior? Using CPU 1 there are huge 
> latencies, using other CPUs everything work as expected...

One possible guess as to what could be related: normally DPDK uses "Core #0" as the 
"master lcore".

That core's behavior is ever so slightly different from other cores.

Matthew.

[dpdk-dev] [PATCH] librte_ether: release memory in uninit function.

2015-06-25 Thread Bernard Iremonger


Signed-off-by: Bernard Iremonger 
---
 lib/librte_ether/rte_ethdev.c |8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index e13fde5..2404556 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -369,8 +369,14 @@ rte_eth_dev_uninit(struct rte_pci_device *pci_dev)
/* free ether device */
rte_eth_dev_release_port(eth_dev);

-   if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+   if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+   rte_free(eth_dev->data->rx_queues);
+   rte_free(eth_dev->data->tx_queues);
rte_free(eth_dev->data->dev_private);
+   rte_free(eth_dev->data->mac_addrs);
+   rte_free(eth_dev->data->hash_mac_addrs);
+   memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
+   }

eth_dev->pci_dev = NULL;
eth_dev->driver = NULL;
-- 
1.7.4.1

[dpdk-dev] [PATCHv2 2/2] ABI: Add some documentation

2015-06-25 Thread Thomas Monjalon

2015-06-25 07:35, Neil Horman:
> On Wed, Jun 24, 2015 at 11:09:29PM +0200, Thomas Monjalon wrote:
> > 2015-06-24 14:34, Neil Horman:
> > > +Some ABI changes may be too significant to reasonably maintain multiple
> > > +versions. In those cases ABI's may be updated without backward 
> > > compatibility
> > > +being provided. The requirements for doing so are:
> > > +
> > > +#. At least 3 acknowledgments of the need to do so must be made on the
> > > +   dpdk.org mailing list.
> > > +
> > > +#. A full deprecation cycle, as explained above, must be made to offer
> > > +   downstream consumers sufficient warning of the change.
> > > +
> > > +#. The ``LIBABIVER`` variable in the makefile(s) where the ABI changes 
> > > are
> > > +   incorporated must be incremented in parallel with the ABI changes
> > > +   themselves.
> > 
> > The proposal was to provide the old and the new ABI in the same source code
> > during the deprecation cycle. The old ABI would be the default and people
> > can build the new one by enabling the NEXT_ABI config option.
> > So the migration to the new ABI is smoother.
> 
> Yes... I'm not sure what you're saying here.  The ABI doesn't 'Change' until the
> old ABI is removed (i.e. old applications are forced to adopt a new ABI), and so
> LIBABIVER has to be updated in parallel with that removal

I'm referring to previous threads suggesting a NEXT_ABI build option to be able
to build the old (default) ABI or the next one.
So the LIBABIVER and .map file would depend on whether NEXT_ABI is enabled or not:
http://dpdk.org/ml/archives/dev/2015-June/019147.html
http://dpdk.org/ml/archives/dev/2015-June/019784.html
http://dpdk.org/ml/archives/dev/2015-June/019810.html

> > [...]
> > > +The macros exported are:
> > > +
> > > +* ``VERSION_SYMBOL(b, e, n)``: Creates a symbol version table entry 
> > > binding
> > > +  unversioned symbol ``b`` to the internal function ``b_e``.
> > 
> > The definition is the same as BASE_SYMBOL.
> > 
> No, they're different.  VERSION_SYMBOL is defined as:
> VERSION_SYMBOL(b, e, n) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", " 
> RTE_STR(b) "@DPDK_" RTE_STR(n))
> 
> while BASE_SYMBOL is
> #define BASE_SYMBOL(b, e) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", " 
> RTE_STR(b)"@")

Yes. I mean the comments are the same, so don't reflect the difference.

> > [...]
> > > +   DPDK_2.0 {
> > > +global:
> > > +
> > > +rte_acl_add_rules;
> > > +rte_acl_build;
> > > +rte_acl_classify;
> > > +rte_acl_classify_alg;
> > > +rte_acl_classify_scalar;
> > > +rte_acl_create;
> > 
> > So it's declared twice, right?
> > I think it should be explicit.
> > 
> Yes, it's listed once for each version node, so 2 declarations.  I thought that
> was made explicit by the use of the code block.  What else would you like to
> see?

I think you should say it explicitly in the comment below the block.

> > > +rte_acl_dump;
> > > +rte_acl_find_existing;
> > > +rte_acl_free;
> > > +rte_acl_ipv4vlan_add_rules;
> > > +rte_acl_ipv4vlan_build;
> > > +rte_acl_list_dump;
> > > +rte_acl_reset;
> > > +rte_acl_reset_rules;
> > > +rte_acl_set_ctx_classify;
> > > +
> > > +local: *;
> > > +   };
> > > +
> > > +   DPDK_2.1 {
> > > +global:
> > > +rte_acl_create;
> > > +
> > > +   } DPDK_2.0;

> > [...]
> > > +the macros used for versioning symbols.  That is our next step, mapping 
> > > this new
> > > +symbol name to the initial symbol name at version node 2.0.  Immediately 
> > > after
> > > +the function, we add this line of code
> > > +
> > > +.. code-block:: c
> > > +
> > > +   VERSION_SYMBOL(rte_acl_create, _v20, 2.0);
> > 
> > Can it be declared before the function?
> > 
> Strictly speaking yes, though it's a bit odd from a stylistic point to declare
> versioned aliases for a symbol prior to defining the symbol itself (it's like a
> forward declaration)

It allows declaring it near the function header.

> > When do we need to use BASE_SYMBOL?
> > 
> For our purposes you currently don't, because there are no unversioned symbols
> in DPDK (since we use a map file).  I've just included it here for 
> completeness
> in the header file should it ever be needed in the future.

If it can be useful, please integrate a note to explain when it should be used.

> > [...]
> > > +This code serves as our new API call.  Its the same as our old call, but 
> > > adds
> > > +the new parameter in place.  Next we need to map this function to the 
> > > symbol
> > > +``rte_acl_create@DPDK_2.1``.  To do this, we modify the public 
> > > prototype of the call
> > > +in the header file, adding the macro there to inform all including 
> > > applications,
> > > +that on re-link, the default rte_acl_create symbol should point to this
> > > +function.  Note that we could do this by simply naming the function above
> > > +rte_acl_create, and the linker would chose the mo

[dpdk-dev] [PATCH v4 9/9] doc: update malloc documentation

2015-06-25 Thread Sergio Gonzalez Monroy

Update malloc documentation to reflect new implementation details.

Signed-off-by: Sergio Gonzalez Monroy 
---
 doc/guides/prog_guide/env_abstraction_layer.rst | 220 +-
 doc/guides/prog_guide/img/malloc_heap.png   | Bin 81329 -> 80952 bytes
 doc/guides/prog_guide/index.rst |   1 -
 doc/guides/prog_guide/malloc_lib.rst| 233 
 doc/guides/prog_guide/overview.rst  |  11 +-
 5 files changed, 221 insertions(+), 244 deletions(-)
 delete mode 100644 doc/guides/prog_guide/malloc_lib.rst

diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst 
b/doc/guides/prog_guide/env_abstraction_layer.rst
index 25eb281..cd4d666 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -116,7 +116,6 @@ The physical address of the reserved memory for that memory 
zone is also returne
 .. note::

 Memory reservations done using the APIs provided by the rte_malloc library 
are also backed by pages from the hugetlbfs filesystem.
-However, physical address information is not available for the blocks of 
memory allocated in this way.

 Xen Dom0 support without hugetbls
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -366,3 +365,222 @@ We expect only 50% of CPU spend on packet IO.
 echo  5 > pkt_io/cpu.cfs_quota_us


+Malloc
+------
+
+The EAL provides a malloc API to allocate any-sized memory.
+
+The objective of this API is to provide malloc-like functions to allow
+allocation from hugepage memory and to facilitate application porting.
+The *DPDK API Reference* manual describes the available functions.
+
+Typically, these kinds of allocations should not be done in data plane
+processing because they are slower than pool-based allocation and make
+use of locks within the allocation and free paths.
+However, they can be used in configuration code.
+
+Refer to the rte_malloc() function description in the *DPDK API Reference*
+manual for more information.
+
+Cookies
+~~~~~~~
+
+When CONFIG_RTE_MALLOC_DEBUG is enabled, the allocated memory contains
+overwrite protection fields to help identify buffer overflows.
+
+Alignment and NUMA Constraints
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The rte_malloc() takes an align argument that can be used to request a memory
+area that is aligned on a multiple of this value (which must be a power of 
two).
+
+On systems with NUMA support, a call to the rte_malloc() function will return
+memory that has been allocated on the NUMA socket of the core which made the 
call.
+A set of APIs is also provided, to allow memory to be explicitly allocated on a
+NUMA socket directly, or by allocated on the NUMA socket where another core is
+located, in the case where the memory is to be used by a logical core other 
than
+on the one doing the memory allocation.
+
+Use Cases
+~~~~~~~~~
+
+This API is meant to be used by an application that requires malloc-like
+functions at initialization time.
+
+For allocating/freeing data at runtime, in the fast-path of an application,
+the memory pool library should be used instead.
+
+Internal Implementation
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Data Structures
+^^^^^^^^^^^^^^^
+
+There are two data structure types used internally in the malloc library:
+
+*   struct malloc_heap - used to track free space on a per-socket basis
+
+*   struct malloc_elem - the basic element of allocation and free-space
+tracking inside the library.
+
+Structure: malloc_heap
+""""""""""""""""""""""
+
+The malloc_heap structure is used to manage free space on a per-socket basis.
+Internally, there is one heap structure per NUMA node, which allows us to
+allocate memory to a thread based on the NUMA node on which this thread runs.
+While this does not guarantee that the memory will be used on that NUMA node,
+it is no worse than a scheme where the memory is always allocated on a fixed
+or random node.
+
+The key fields of the heap structure and their function are described below
+(see also diagram above):
+
+*   lock - the lock field is needed to synchronize access to the heap.
+Given that the free space in the heap is tracked using a linked list,
+we need a lock to prevent two threads manipulating the list at the same 
time.
+
+*   free_head - this points to the first element in the list of free nodes for
+this malloc heap.
+
+.. note::
+
+The malloc_heap structure does not keep track of in-use blocks of memory,
+since these are never touched except when they are to be freed again -
+at which point the pointer to the block is an input to the free() function.
+
+.. _figure_malloc_heap:
+
+.. figure:: img/malloc_heap.*
+
+   Example of a malloc heap and malloc elements within the malloc library
+
+
+.. _malloc_elem:
+
+Structure: malloc_elem
+""""""""""""""""""""""
+
+The malloc_elem structure is used as a generic header structure for various
+blocks of memory.
+It is used in three different ways - all shown in the diagram above:
+

[dpdk-dev] [PATCH v4 8/9] doc: announce ABI change of librte_malloc

2015-06-25 Thread Sergio Gonzalez Monroy

Announce the creation of a dummy malloc library for 2.1 and the removal of
that library, now integrated in librte_eal, in the 2.2 release.

Signed-off-by: Sergio Gonzalez Monroy 
---
 doc/guides/rel_notes/abi.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index f00a6ee..2aaf900 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -38,3 +38,4 @@ Examples of Deprecation Notices

 Deprecation Notices
 -------------------
+* librte_malloc library has been integrated into librte_eal. The 2.1 release 
creates a dummy/empty malloc library to fulfill binaries with dynamic linking 
dependencies on librte_malloc.so. Such dummy library will not be created from 
release 2.2 so binaries will need to be rebuilt.
-- 
1.9.3

[dpdk-dev] [PATCH v4 7/9] app/test: update unit test with rte_memzone_free

2015-06-25 Thread Sergio Gonzalez Monroy

Update memzone unit test for the new rte_memzone_free API.

Signed-off-by: Sergio Gonzalez Monroy 
---
 app/test/test_memzone.c | 53 +
 1 file changed, 53 insertions(+)

diff --git a/app/test/test_memzone.c b/app/test/test_memzone.c
index 6934eee..501ad12 100644
--- a/app/test/test_memzone.c
+++ b/app/test/test_memzone.c
@@ -684,6 +684,55 @@ test_memzone_bounded(void)
 }

 static int
+test_memzone_free(void)
+{
+   const struct rte_memzone *mz[4];
+
+   mz[0] = rte_memzone_reserve("tempzone0", 2000, SOCKET_ID_ANY, 0);
+   mz[1] = rte_memzone_reserve("tempzone1", 4000, SOCKET_ID_ANY, 0);
+
+   if (mz[0] > mz[1])
+   return -1;
+   if (!rte_memzone_lookup("tempzone0"))
+   return -1;
+   if (!rte_memzone_lookup("tempzone1"))
+   return -1;
+
+   if (rte_memzone_free(mz[0])) {
+   printf("Fail memzone free - tempzone0\n");
+   return -1;
+   }
+   if (rte_memzone_lookup("tempzone0")) {
+   printf("Found previously free memzone - tempzone0\n");
+   return -1;
+   }
+   mz[2] = rte_memzone_reserve("tempzone2", 2000, SOCKET_ID_ANY, 0);
+
+   if (mz[2] > mz[1]) {
+   printf("tempzone2 should have gotten the free entry from 
tempzone0\n");
+   return -1;
+   }
+   if (rte_memzone_free(mz[2])) {
+   printf("Fail memzone free - tempzone2\n");
+   return -1;
+   }
+   if (rte_memzone_lookup("tempzone2")) {
+   printf("Found previously free memzone - tempzone2\n");
+   return -1;
+   }
+   if (rte_memzone_free(mz[1])) {
+   printf("Fail memzone free - tempzone1\n");
+   return -1;
+   }
+   if (rte_memzone_lookup("tempzone1")) {
+   printf("Found previously free memzone - tempzone1\n");
+   return -1;
+   }
+
+   return 0;
+}
+
+static int
 test_memzone(void)
 {
const struct rte_memzone *memzone1;
@@ -791,6 +840,10 @@ test_memzone(void)
if (test_memzone_reserve_max_aligned() < 0)
return -1;

+   printf("test free memzone\n");
+   if (test_memzone_free() < 0)
+   return -1;
+
return 0;
 }

-- 
1.9.3

[dpdk-dev] [PATCH v4 6/9] eal: new rte_memzone_free

2015-06-25 Thread Sergio Gonzalez Monroy

Implement rte_memzone_free which, as its name implies, would free a
memzone.

Currently memzones are tracked in an array and cannot be freed.
To be able to reuse the same array to track memzones, we have to
change how we keep track of reserved memzones.

With this patch, any memzone with addr NULL is not used, so we also need
to change how we look for the next free memzone entry.

Signed-off-by: Sergio Gonzalez Monroy 
---
 lib/librte_eal/bsdapp/eal/rte_eal_version.map |  6 +++
 lib/librte_eal/common/eal_common_memzone.c| 55 +--
 lib/librte_eal/common/include/rte_eal_memconfig.h |  2 +-
 lib/librte_eal/common/include/rte_memzone.h   | 11 +
 lib/librte_eal/linuxapp/eal/eal_ivshmem.c |  8 ++--
 lib/librte_eal/linuxapp/eal/rte_eal_version.map   |  6 +++
 6 files changed, 80 insertions(+), 8 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map 
b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index 0401be2..7110816 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -105,3 +105,9 @@ DPDK_2.0 {

local: *;
 };
+
+DPDK_2.1 {
+   global:
+
+   rte_memzone_free;
+} DPDK_2.0;
diff --git a/lib/librte_eal/common/eal_common_memzone.c 
b/lib/librte_eal/common/eal_common_memzone.c
index 943012b..dbb3844 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -78,6 +78,27 @@ memzone_lookup_thread_unsafe(const char *name)
 }

 /*
+ * This function is called only if the number of memzones is smaller
+ * than RTE_MAX_MEMZONE, so it is expected to always succeed.
+ */
+static inline struct rte_memzone *
+get_next_free_memzone(void)
+{
+   struct rte_mem_config *mcfg;
+   unsigned i = 0;
+
+   /* get pointer to global configuration */
+   mcfg = rte_eal_get_configuration()->mem_config;
+
+   for (i = 0; i < RTE_MAX_MEMZONE; i++) {
+   if (mcfg->memzone[i].addr == NULL)
+   break;
+   }
+
+   return &mcfg->memzone[i];
+}
+
+/*
  * Return a pointer to a correctly filled memzone descriptor. If the
  * allocation cannot be done, return NULL.
  */
@@ -141,7 +162,7 @@ memzone_reserve_aligned_thread_unsafe(const char *name, 
size_t len,
mcfg = rte_eal_get_configuration()->mem_config;

/* no more room in config */
-   if (mcfg->memzone_idx >= RTE_MAX_MEMZONE) {
+   if (mcfg->memzone_cnt >= RTE_MAX_MEMZONE) {
RTE_LOG(ERR, EAL, "%s(): No more room in config\n", __func__);
rte_errno = ENOSPC;
return NULL;
@@ -215,7 +236,9 @@ memzone_reserve_aligned_thread_unsafe(const char *name, 
size_t len,
const struct malloc_elem *elem = malloc_elem_from_data(mz_addr);

/* fill the zone in config */
-   struct rte_memzone *mz = &mcfg->memzone[mcfg->memzone_idx++];
+   struct rte_memzone *mz = get_next_free_memzone();
+
+   mcfg->memzone_cnt++;
snprintf(mz->name, sizeof(mz->name), "%s", name);
mz->phys_addr = rte_malloc_virt2phy(mz_addr);
mz->addr = mz_addr;
@@ -291,6 +314,32 @@ rte_memzone_reserve_bounded(const char *name, size_t len,
return mz;
 }

+int
+rte_memzone_free(const struct rte_memzone *mz)
+{
+   struct rte_mem_config *mcfg;
+   int ret = 0;
+   void *addr;
+   unsigned idx;
+
+   if (mz == NULL)
+   return -EINVAL;
+
+   mcfg = rte_eal_get_configuration()->mem_config;
+
+   rte_rwlock_read_lock(&mcfg->mlock);
+
+   idx = ((uintptr_t)mz - (uintptr_t)mcfg->memzone);
+   idx = idx / sizeof(struct rte_memzone);
+
+   addr = mcfg->memzone[idx].addr;
+   mcfg->memzone[idx].addr = NULL;
+   rte_free(addr);
+
+   rte_rwlock_read_unlock(&mcfg->mlock);
+
+   return ret;
+}

 /*
  * Lookup for the memzone identified by the given name
@@ -364,7 +413,7 @@ rte_eal_memzone_init(void)
rte_rwlock_write_lock(&mcfg->mlock);

/* delete all zones */
-   mcfg->memzone_idx = 0;
+   mcfg->memzone_cnt = 0;
memset(mcfg->memzone, 0, sizeof(mcfg->memzone));

rte_rwlock_write_unlock(&mcfg->mlock);
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h 
b/lib/librte_eal/common/include/rte_eal_memconfig.h
index 7de906b..2b5e0b1 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -67,7 +67,7 @@ struct rte_mem_config {
rte_rwlock_t qlock;   /**< used for tailq operation for thread safe. */
rte_rwlock_t mplock;  /**< only used by mempool LIB for thread-safe. */

-   uint32_t memzone_idx; /**< Index of memzone */
+   uint32_t memzone_cnt; /**< Number of allocated memzones */

/* memory segments and zones */
struct rte_memseg memseg[RTE_MAX_MEMSEG];/**< Physmem descriptors. 
*/
diff --git a/lib/librte_eal/common/include/rte_memzone.h 
b/lib/librte_eal/common/include/r

[dpdk-dev] [PATCH v4 5/9] eal: remove free_memseg and references to it

2015-06-25 Thread Sergio Gonzalez Monroy

Remove free_memseg field from internal mem config structure as it is
not used anymore.
Also remove code in ivshmem that was setting up free_memseg on init.

Signed-off-by: Sergio Gonzalez Monroy 
---
 lib/librte_eal/common/include/rte_eal_memconfig.h | 3 ---
 lib/librte_eal/linuxapp/eal/eal_ivshmem.c | 9 -
 2 files changed, 12 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h 
b/lib/librte_eal/common/include/rte_eal_memconfig.h
index 055212a..7de906b 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -73,9 +73,6 @@ struct rte_mem_config {
struct rte_memseg memseg[RTE_MAX_MEMSEG];/**< Physmem descriptors. 
*/
struct rte_memzone memzone[RTE_MAX_MEMZONE]; /**< Memzone descriptors. 
*/

-   /* Runtime Physmem descriptors - NOT USED */
-   struct rte_memseg free_memseg[RTE_MAX_MEMSEG];
-
struct rte_tailq_head tailq_head[RTE_MAX_TAILQ]; /**< Tailqs for 
objects */

/* Heaps of Malloc per socket */
diff --git a/lib/librte_eal/linuxapp/eal/eal_ivshmem.c 
b/lib/librte_eal/linuxapp/eal/eal_ivshmem.c
index 2deaeb7..facfb80 100644
--- a/lib/librte_eal/linuxapp/eal/eal_ivshmem.c
+++ b/lib/librte_eal/linuxapp/eal/eal_ivshmem.c
@@ -725,15 +725,6 @@ map_all_segments(void)
 * expect memsegs to be empty */
memcpy(&mcfg->memseg[i], &ms,
sizeof(struct rte_memseg));
-   memcpy(&mcfg->free_memseg[i], &ms,
-   sizeof(struct rte_memseg));
-
-
-   /* adjust the free_memseg so that there's no free space left */
-   mcfg->free_memseg[i].ioremap_addr += mcfg->free_memseg[i].len;
-   mcfg->free_memseg[i].phys_addr += mcfg->free_memseg[i].len;
-   mcfg->free_memseg[i].addr_64 += mcfg->free_memseg[i].len;
-   mcfg->free_memseg[i].len = 0;

close(fd);

-- 
1.9.3

[dpdk-dev] [PATCH v4 4/9] config: remove CONFIG_RTE_MALLOC_MEMZONE_SIZE

2015-06-25 Thread Sergio Gonzalez Monroy

During initialization, malloc sets all available memory as part of the heaps.

CONFIG_RTE_MALLOC_MEMZONE_SIZE was used to specify the default memory
block size to expand the heap. The option is not used/relevant anymore,
so we remove it.

Signed-off-by: Sergio Gonzalez Monroy 
---
 config/common_bsdapp   | 1 -
 config/common_linuxapp | 1 -
 2 files changed, 2 deletions(-)

diff --git a/config/common_bsdapp b/config/common_bsdapp
index 2b0c877..a54957d 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -103,7 +103,6 @@ CONFIG_RTE_LOG_HISTORY=256
 CONFIG_RTE_EAL_ALLOW_INV_SOCKET_ID=n
 CONFIG_RTE_EAL_ALWAYS_PANIC_ON_ERROR=n
 CONFIG_RTE_MALLOC_DEBUG=n
-CONFIG_RTE_MALLOC_MEMZONE_SIZE=11M

 #
 # FreeBSD contiguous memory driver settings
diff --git a/config/common_linuxapp b/config/common_linuxapp
index fc6dc2e..72611c9 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -106,7 +106,6 @@ CONFIG_RTE_EAL_ALWAYS_PANIC_ON_ERROR=n
 CONFIG_RTE_EAL_IGB_UIO=y
 CONFIG_RTE_EAL_VFIO=y
 CONFIG_RTE_MALLOC_DEBUG=n
-CONFIG_RTE_MALLOC_MEMZONE_SIZE=11M

 #
 # Special configurations in PCI Config Space for high performance
-- 
1.9.3

[dpdk-dev] [PATCH v4 3/9] app/test: update malloc/memzone unit tests

2015-06-25 Thread Sergio Gonzalez Monroy

Some unit tests are not relevant anymore. This is the case for those malloc
UTs that checked corner cases when allocating MALLOC_MEMZONE_SIZE
chunks, and for those memzone UTs relying on specific free
memsegs of the reserved memzone.

Other UTs just need to be updated, for example, to calculate the maximum free
block size available.

Signed-off-by: Sergio Gonzalez Monroy 
---
 app/test/test_malloc.c  |  86 --
 app/test/test_memzone.c | 440 
 2 files changed, 35 insertions(+), 491 deletions(-)

diff --git a/app/test/test_malloc.c b/app/test/test_malloc.c
index ea6f651..a04a751 100644
--- a/app/test/test_malloc.c
+++ b/app/test/test_malloc.c
@@ -56,10 +56,6 @@

 #define N 1

-#define QUOTE_(x) #x
-#define QUOTE(x) QUOTE_(x)
-#define MALLOC_MEMZONE_SIZE QUOTE(RTE_MALLOC_MEMZONE_SIZE)
-
 /*
  * Malloc
  * ==
@@ -292,60 +288,6 @@ test_str_to_size(void)
 }

 static int
-test_big_alloc(void)
-{
-   int socket = 0;
-   struct rte_malloc_socket_stats pre_stats, post_stats;
-   size_t size =rte_str_to_size(MALLOC_MEMZONE_SIZE)*2;
-   int align = 0;
-#ifndef RTE_LIBRTE_MALLOC_DEBUG
-   int overhead = RTE_CACHE_LINE_SIZE + RTE_CACHE_LINE_SIZE;
-#else
-   int overhead = RTE_CACHE_LINE_SIZE + RTE_CACHE_LINE_SIZE + 
RTE_CACHE_LINE_SIZE;
-#endif
-
-   rte_malloc_get_socket_stats(socket, &pre_stats);
-
-   void *p1 = rte_malloc_socket("BIG", size , align, socket);
-   if (!p1)
-   return -1;
-   rte_malloc_get_socket_stats(socket,&post_stats);
-
-   /* Check statistics reported are correct */
-   /* Allocation may increase, or may be the same as before big allocation 
*/
-   if (post_stats.heap_totalsz_bytes < pre_stats.heap_totalsz_bytes) {
-   printf("Malloc statistics are incorrect - 
heap_totalsz_bytes\n");
-   return -1;
-   }
-   /* Check that allocated size adds up correctly */
-   if (post_stats.heap_allocsz_bytes !=
-   pre_stats.heap_allocsz_bytes + size + align + overhead) 
{
-   printf("Malloc statistics are incorrect - alloc_size\n");
-   return -1;
-   }
-   /* Check free size against tested allocated size */
-   if (post_stats.heap_freesz_bytes !=
-   post_stats.heap_totalsz_bytes - 
post_stats.heap_allocsz_bytes) {
-   printf("Malloc statistics are incorrect - heap_freesz_bytes\n");
-   return -1;
-   }
-   /* Number of allocated blocks must increase after allocation */
-   if (post_stats.alloc_count != pre_stats.alloc_count + 1) {
-   printf("Malloc statistics are incorrect - alloc_count\n");
-   return -1;
-   }
-   /* New blocks now available - just allocated 1 but also 1 new free */
-   if (post_stats.free_count != pre_stats.free_count &&
-   post_stats.free_count != pre_stats.free_count - 1) {
-   printf("Malloc statistics are incorrect - free_count\n");
-   return -1;
-   }
-
-   rte_free(p1);
-   return 0;
-}
-
-static int
 test_multi_alloc_statistics(void)
 {
int socket = 0;
@@ -399,10 +341,6 @@ test_multi_alloc_statistics(void)
/* After freeing both allocations check stats return to original */
rte_malloc_get_socket_stats(socket, &post_stats);

-   /*
-* Check that no new blocks added after small allocations
-* i.e. < RTE_MALLOC_MEMZONE_SIZE
-*/
if(second_stats.heap_totalsz_bytes != first_stats.heap_totalsz_bytes) {
printf("Incorrect heap statistics: Total size \n");
return -1;
@@ -447,18 +385,6 @@ test_multi_alloc_statistics(void)
 }

 static int
-test_memzone_size_alloc(void)
-{
-   void *p1 = rte_malloc("BIG", 
(size_t)(rte_str_to_size(MALLOC_MEMZONE_SIZE) - 128), 64);
-   if (!p1)
-   return -1;
-   rte_free(p1);
-   /* one extra check - check no crashes if free(NULL) */
-   rte_free(NULL);
-   return 0;
-}
-
-static int
 test_rte_malloc_type_limits(void)
 {
/* The type-limits functionality is not yet implemented,
@@ -935,18 +861,6 @@ test_malloc(void)
}
else printf("test_str_to_size() passed\n");

-   if (test_memzone_size_alloc() < 0){
-   printf("test_memzone_size_alloc() failed\n");
-   return -1;
-   }
-   else printf("test_memzone_size_alloc() passed\n");
-
-   if (test_big_alloc() < 0){
-   printf("test_big_alloc() failed\n");
-   return -1;
-   }
-   else printf("test_big_alloc() passed\n");
-
if (test_zero_aligned_alloc() < 0){
printf("test_zero_aligned_alloc() failed\n");
return -1;
diff --git a/app/test/test_memzone.c b/app/test/test_memzone.c
index 9c7a1cb..6934eee 100644
--- a/app/test/test_memzone.c
+++ b/app/test/test_memzone.c
@@ -44,6 +44,9 @@
 #include 
 #

[dpdk-dev] [PATCH v4 2/9] eal: memzone allocated by malloc

2015-06-25 Thread Sergio Gonzalez Monroy

In the current memory hierarchy, memsegs are groups of physically
contiguous hugepages, memzones are slices of memsegs and malloc further
slices memzones into smaller memory chunks.

This patch modifies malloc so it partitions memsegs instead of memzones.
Thus memzones would call malloc internally for memory allocation while
maintaining its ABI.

With this change it becomes possible to free memzones, and therefore any other
structure based on memzones, e.g. mempools.

Signed-off-by: Sergio Gonzalez Monroy 
---
 lib/librte_eal/common/eal_common_memzone.c| 274 ++
 lib/librte_eal/common/include/rte_eal_memconfig.h |   2 +-
 lib/librte_eal/common/include/rte_malloc_heap.h   |   3 +-
 lib/librte_eal/common/malloc_elem.c   |  68 --
 lib/librte_eal/common/malloc_elem.h   |  14 +-
 lib/librte_eal/common/malloc_heap.c   | 140 ++-
 lib/librte_eal/common/malloc_heap.h   |   6 +-
 lib/librte_eal/common/rte_malloc.c|   7 +-
 8 files changed, 197 insertions(+), 317 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c 
b/lib/librte_eal/common/eal_common_memzone.c
index aee184a..943012b 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -50,15 +50,15 @@
 #include 
 #include 

+#include "malloc_heap.h"
+#include "malloc_elem.h"
 #include "eal_private.h"

-/* internal copy of free memory segments */
-static struct rte_memseg *free_memseg = NULL;
-
 static inline const struct rte_memzone *
 memzone_lookup_thread_unsafe(const char *name)
 {
const struct rte_mem_config *mcfg;
+   const struct rte_memzone *mz;
unsigned i = 0;

/* get pointer to global configuration */
@@ -68,8 +68,9 @@ memzone_lookup_thread_unsafe(const char *name)
 * the algorithm is not optimal (linear), but there are few
 * zones and this function should be called at init only
 */
-   for (i = 0; i < RTE_MAX_MEMZONE && mcfg->memzone[i].addr != NULL; i++) {
-   if (!strncmp(name, mcfg->memzone[i].name, RTE_MEMZONE_NAMESIZE))
+   for (i = 0; i < RTE_MAX_MEMZONE; i++) {
+   mz = &mcfg->memzone[i];
+   if (mz->addr != NULL && !strncmp(name, mz->name, 
RTE_MEMZONE_NAMESIZE))
return &mcfg->memzone[i];
}

@@ -88,39 +89,45 @@ rte_memzone_reserve(const char *name, size_t len, int 
socket_id,
len, socket_id, flags, RTE_CACHE_LINE_SIZE);
 }

-/*
- * Helper function for memzone_reserve_aligned_thread_unsafe().
- * Calculate address offset from the start of the segment.
- * Align offset in that way that it satisfy istart alignmnet and
- * buffer of the  requested length would not cross specified boundary.
- */
-static inline phys_addr_t
-align_phys_boundary(const struct rte_memseg *ms, size_t len, size_t align,
-   size_t bound)
+/* Find the heap with the greatest free block size */
+static void
+find_heap_max_free_elem(int *s, size_t *len, unsigned align)
 {
-   phys_addr_t addr_offset, bmask, end, start;
-   size_t step;
+   struct rte_mem_config *mcfg;
+   struct rte_malloc_socket_stats stats;
+   unsigned i;

-   step = RTE_MAX(align, bound);
-   bmask = ~((phys_addr_t)bound - 1);
+   /* get pointer to global configuration */
+   mcfg = rte_eal_get_configuration()->mem_config;

-   /* calculate offset to closest alignment */
-   start = RTE_ALIGN_CEIL(ms->phys_addr, align);
-   addr_offset = start - ms->phys_addr;
+   for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
+   malloc_heap_get_stats(&mcfg->malloc_heaps[i], &stats);
+   if (stats.greatest_free_size > *len) {
+   *len = stats.greatest_free_size;
+   *s = i;
+   }
+   }
+   *len -= (MALLOC_ELEM_OVERHEAD + align);
+}

-   while (addr_offset + len < ms->len) {
+/* Find a heap that can allocate the requested size */
+static void
+find_heap_suitable(int *s, size_t len, unsigned align)
+{
+   struct rte_mem_config *mcfg;
+   struct rte_malloc_socket_stats stats;
+   unsigned i;

-   /* check, do we meet boundary condition */
-   end = start + len - (len != 0);
-   if ((start & bmask) == (end & bmask))
-   break;
+   /* get pointer to global configuration */
+   mcfg = rte_eal_get_configuration()->mem_config;

-   /* calculate next offset */
-   start = RTE_ALIGN_CEIL(start + 1, step);
-   addr_offset = start - ms->phys_addr;
+   for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
+   malloc_heap_get_stats(&mcfg->malloc_heaps[i], &stats);
+   if (stats.greatest_free_size >= len + MALLOC_ELEM_OVERHEAD + 
align) {
+   *s = i;
+   break;
+   }
}
-
-   return addr_offset;
 }

 static const

[dpdk-dev] [PATCH v4 1/9] eal: move librte_malloc to eal/common

2015-06-25 Thread Sergio Gonzalez Monroy

Move malloc inside eal.

Create a dummy malloc library to avoid breaking applications that have
librte_malloc in their DT_NEEDED entries.

This is the first step towards using malloc to allocate memory directly
from memsegs. Thus, memzones would allocate memory through malloc,
which makes it possible to free memzones.

Signed-off-by: Sergio Gonzalez Monroy 
---
 MAINTAINERS |   9 +-
 config/common_bsdapp|   9 +-
 config/common_linuxapp  |   9 +-
 drivers/net/af_packet/Makefile  |   1 -
 drivers/net/bonding/Makefile|   1 -
 drivers/net/e1000/Makefile  |   2 +-
 drivers/net/enic/Makefile   |   2 +-
 drivers/net/fm10k/Makefile  |   2 +-
 drivers/net/i40e/Makefile   |   2 +-
 drivers/net/ixgbe/Makefile  |   2 +-
 drivers/net/mlx4/Makefile   |   1 -
 drivers/net/null/Makefile   |   1 -
 drivers/net/pcap/Makefile   |   1 -
 drivers/net/virtio/Makefile |   2 +-
 drivers/net/vmxnet3/Makefile|   2 +-
 drivers/net/xenvirt/Makefile|   2 +-
 lib/Makefile|   2 +-
 lib/librte_acl/Makefile |   2 +-
 lib/librte_eal/bsdapp/eal/Makefile  |   4 +-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map   |  13 +
 lib/librte_eal/common/Makefile  |   1 +
 lib/librte_eal/common/include/rte_malloc.h  | 342 
 lib/librte_eal/common/malloc_elem.c | 320 ++
 lib/librte_eal/common/malloc_elem.h | 190 +
 lib/librte_eal/common/malloc_heap.c | 208 ++
 lib/librte_eal/common/malloc_heap.h |  70 +
 lib/librte_eal/common/rte_malloc.c  | 260 ++
 lib/librte_eal/linuxapp/eal/Makefile|   4 +-
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |  13 +
 lib/librte_hash/Makefile|   2 +-
 lib/librte_lpm/Makefile |   2 +-
 lib/librte_malloc/Makefile  |   6 +-
 lib/librte_malloc/malloc_elem.c | 320 --
 lib/librte_malloc/malloc_elem.h | 190 -
 lib/librte_malloc/malloc_heap.c | 208 --
 lib/librte_malloc/malloc_heap.h |  70 -
 lib/librte_malloc/rte_malloc.c  | 228 +---
 lib/librte_malloc/rte_malloc.h  | 342 
 lib/librte_malloc/rte_malloc_version.map|  16 --
 lib/librte_mempool/Makefile |   2 -
 lib/librte_port/Makefile|   1 -
 lib/librte_ring/Makefile|   3 +-
 lib/librte_table/Makefile   |   1 -
 43 files changed, 1445 insertions(+), 1423 deletions(-)
 create mode 100644 lib/librte_eal/common/include/rte_malloc.h
 create mode 100644 lib/librte_eal/common/malloc_elem.c
 create mode 100644 lib/librte_eal/common/malloc_elem.h
 create mode 100644 lib/librte_eal/common/malloc_heap.c
 create mode 100644 lib/librte_eal/common/malloc_heap.h
 create mode 100644 lib/librte_eal/common/rte_malloc.c
 delete mode 100644 lib/librte_malloc/malloc_elem.c
 delete mode 100644 lib/librte_malloc/malloc_elem.h
 delete mode 100644 lib/librte_malloc/malloc_heap.c
 delete mode 100644 lib/librte_malloc/malloc_heap.h
 delete mode 100644 lib/librte_malloc/rte_malloc.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 54f0973..bb08e0a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -73,6 +73,7 @@ F: lib/librte_eal/common/*
 F: lib/librte_eal/common/include/*
 F: lib/librte_eal/common/include/generic/
 F: doc/guides/prog_guide/env_abstraction_layer.rst
+F: doc/guides/prog_guide/malloc_lib.rst
 F: app/test/test_alarm.c
 F: app/test/test_atomic.c
 F: app/test/test_byteorder.c
@@ -97,6 +98,8 @@ F: app/test/test_spinlock.c
 F: app/test/test_string_fns.c
 F: app/test/test_tailq.c
 F: app/test/test_version.c
+F: app/test/test_malloc.c
+F: app/test/test_func_reentrancy.c

 Secondary process
 K: RTE_PROC_
@@ -155,12 +158,6 @@ F: lib/librte_eal/bsdapp/nic_uio/
 Core Libraries
 --

-Dynamic memory
-F: lib/librte_malloc/
-F: doc/guides/prog_guide/malloc_lib.rst
-F: app/test/test_malloc.c
-F: app/test/test_func_reentrancy.c
-
 Memory pool
 M: Olivier Matz 
 F: lib/librte_mempool/
diff --git a/config/common_bsdapp b/config/common_bsdapp
index 464250b..2b0c877 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -102,6 +102,8 @@ CONFIG_RTE_LOG_LEVEL=8
 CONFIG_RTE_LOG_HISTORY=256
 CONFIG_RTE_EAL_ALLOW_INV_SOCKET_ID=n
 CONFIG_RTE_EAL_ALWAYS_PANIC_ON_ERROR=n
+CONFIG_RTE_MALLOC_DEBUG=n
+CONFIG_RTE_MALLOC_MEMZONE_SIZE=11M

 #
 # FreeBSD contiguous memory driver settings
@@ -300,13 +302,6 @

[dpdk-dev] [PATCH v4 0/9] Dynamic memzone

2015-06-25 Thread Sergio Gonzalez Monroy

The current implementation allows reserving/creating memzones but not the opposite
(unreserve/free). This affects mempools and other memzone-based objects.

From my point of view, implementing free functionality for memzones would look
like malloc over memsegs.
Thus, this approach moves malloc inside eal (which in turn removes a circular
dependency), where malloc heaps are composed of memsegs.
We keep both the malloc and memzone APIs as they are, but memzones allocate their
memory by calling malloc_heap_alloc.
Some extra functionality is required in malloc to allow for boundary-constrained
memory requests.
In summary, malloc is currently based on memzones, and with this approach
memzones are based on malloc.
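
As a rough illustration of the layering described above, here is a toy
accounting model in plain C. It is only a sketch: every toy_* name is
hypothetical and none of this is DPDK code. It assumes a single heap that
tracks free bytes, and a memzone table whose lookup scans every slot because
freed zones can leave holes (as the memzone lookup change in patch 2/9 does).
The series' handling of len = 0 is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy accounting model; every name here is hypothetical, not a DPDK API. */
#define TOY_MAX_MEMZONE 4
#define TOY_NAMESIZE    32

struct toy_heap {
	size_t free_bytes;              /* bytes still available in the heap */
};

struct toy_memzone {
	char name[TOY_NAMESIZE];
	size_t len;                     /* 0 means the slot is unused */
};

static struct toy_heap heap = { .free_bytes = 4096 };
static struct toy_memzone zones[TOY_MAX_MEMZONE];

/* Heap layer: hands out bytes (len = 0 is rejected in this toy). */
static int toy_heap_alloc(struct toy_heap *h, size_t len)
{
	if (len == 0 || len > h->free_bytes)
		return -1;
	h->free_bytes -= len;
	return 0;
}

/* Heap layer: takes bytes back. */
static void toy_heap_free(struct toy_heap *h, size_t len)
{
	h->free_bytes += len;
}

/* Lookup scans every slot, because freed zones can leave holes. */
struct toy_memzone *toy_memzone_lookup(const char *name)
{
	for (size_t i = 0; i < TOY_MAX_MEMZONE; i++) {
		struct toy_memzone *mz = &zones[i];
		if (mz->len != 0 && strncmp(name, mz->name, TOY_NAMESIZE) == 0)
			return mz;
	}
	return NULL;
}

/* Memzone layer: a named reservation backed by a heap allocation. */
struct toy_memzone *toy_memzone_reserve(const char *name, size_t len)
{
	if (strlen(name) >= TOY_NAMESIZE || toy_memzone_lookup(name) != NULL)
		return NULL;
	for (size_t i = 0; i < TOY_MAX_MEMZONE; i++) {
		struct toy_memzone *mz = &zones[i];
		if (mz->len != 0)
			continue;       /* slot taken, try the next one */
		if (toy_heap_alloc(&heap, len) != 0)
			return NULL;    /* heap cannot satisfy the request */
		strcpy(mz->name, name);
		mz->len = len;
		return mz;
	}
	return NULL;                    /* table full */
}

/* Freeing a zone returns its bytes to the heap and clears the slot. */
void toy_memzone_free(struct toy_memzone *mz)
{
	assert(mz != NULL && mz->len != 0);
	toy_heap_free(&heap, mz->len);
	memset(mz, 0, sizeof(*mz));
}

/* Accessor so callers can observe the heap state. */
size_t toy_heap_free_bytes(void)
{
	return heap.free_bytes;
}
```

The point of the sketch is the direction of the dependency: the memzone layer
never touches memory bookkeeping itself, so freeing a zone is just a heap free
plus clearing a table slot.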

v4:
 - rebase and fix couple of merge issues

v3:
 - Create dummy librte_malloc
 - Add deprecation notice
 - Rework some of the code
 - Doc update
 - checkpatch

v2:
 - New rte_memzone_free
 - Support memzone len = 0
 - Add all available memsegs to malloc heap at init
 - Update memzone/malloc unit tests

Sergio Gonzalez Monroy (9):
  eal: move librte_malloc to eal/common
  eal: memzone allocated by malloc
  app/test: update malloc/memzone unit tests
  config: remove CONFIG_RTE_MALLOC_MEMZONE_SIZE
  eal: remove free_memseg and references to it
  eal: new rte_memzone_free
  app/test: update unit test with rte_memzone_free
  doc: announce ABI change of librte_malloc
  doc: update malloc documentation

 MAINTAINERS   |   9 +-
 app/test/test_malloc.c|  86 -
 app/test/test_memzone.c   | 441 +++---
 config/common_bsdapp  |   8 +-
 config/common_linuxapp|   8 +-
 doc/guides/prog_guide/env_abstraction_layer.rst   | 220 ++-
 doc/guides/prog_guide/img/malloc_heap.png | Bin 81329 -> 80952 bytes
 doc/guides/prog_guide/index.rst   |   1 -
 doc/guides/prog_guide/malloc_lib.rst  | 233 
 doc/guides/prog_guide/overview.rst|  11 +-
 doc/guides/rel_notes/abi.rst  |   1 +
 drivers/net/af_packet/Makefile|   1 -
 drivers/net/bonding/Makefile  |   1 -
 drivers/net/e1000/Makefile|   2 +-
 drivers/net/enic/Makefile |   2 +-
 drivers/net/fm10k/Makefile|   2 +-
 drivers/net/i40e/Makefile |   2 +-
 drivers/net/ixgbe/Makefile|   2 +-
 drivers/net/mlx4/Makefile |   1 -
 drivers/net/null/Makefile |   1 -
 drivers/net/pcap/Makefile |   1 -
 drivers/net/virtio/Makefile   |   2 +-
 drivers/net/vmxnet3/Makefile  |   2 +-
 drivers/net/xenvirt/Makefile  |   2 +-
 lib/Makefile  |   2 +-
 lib/librte_acl/Makefile   |   2 +-
 lib/librte_eal/bsdapp/eal/Makefile|   4 +-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map |  19 +
 lib/librte_eal/common/Makefile|   1 +
 lib/librte_eal/common/eal_common_memzone.c| 329 ++--
 lib/librte_eal/common/include/rte_eal_memconfig.h |   5 +-
 lib/librte_eal/common/include/rte_malloc.h| 342 +
 lib/librte_eal/common/include/rte_malloc_heap.h   |   3 +-
 lib/librte_eal/common/include/rte_memzone.h   |  11 +
 lib/librte_eal/common/malloc_elem.c   | 344 +
 lib/librte_eal/common/malloc_elem.h   | 192 ++
 lib/librte_eal/common/malloc_heap.c   | 206 ++
 lib/librte_eal/common/malloc_heap.h   |  70 
 lib/librte_eal/common/rte_malloc.c| 259 +
 lib/librte_eal/linuxapp/eal/Makefile  |   4 +-
 lib/librte_eal/linuxapp/eal/eal_ivshmem.c |  17 +-
 lib/librte_eal/linuxapp/eal/rte_eal_version.map   |  19 +
 lib/librte_hash/Makefile  |   2 +-
 lib/librte_lpm/Makefile   |   2 +-
 lib/librte_malloc/Makefile|   6 +-
 lib/librte_malloc/malloc_elem.c   | 320 
 lib/librte_malloc/malloc_elem.h   | 190 --
 lib/librte_malloc/malloc_heap.c   | 208 --
 lib/librte_malloc/malloc_heap.h   |  70 
 lib/librte_malloc/rte_malloc.c| 228 +--
 lib/librte_malloc/rte_malloc.h| 342 -
 lib/librte_malloc/rte_malloc_version.map  |  16 -
 lib/librte_mempool/Makefile   |   2 -
 lib/librte_port/Makefile  |   1 -
 lib/librte_ring/Makefile  |   3 +-
 lib/librte_table/Makefile |   1 -
 56 files changed, 1897 insertions(+), 2362 dele

[dpdk-dev] [PATCH 2/2] kni: fix header_ops to build with 4.1

2015-06-25 Thread Miguel Bernal Marin

The rebuild member was removed from header_ops in kernel release
4.1, so compilation of the kni module breaks.

This patch adds the proper checks to fix it.

Fixes: d476059e77d1 ("net: Kill dev_rebuild_header")

Signed-off-by: Miguel Bernal Marin 
---
 lib/librte_eal/linuxapp/kni/kni_net.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/lib/librte_eal/linuxapp/kni/kni_net.c 
b/lib/librte_eal/linuxapp/kni/kni_net.c
index e34a0fd..ab5add4 100644
--- a/lib/librte_eal/linuxapp/kni/kni_net.c
+++ b/lib/librte_eal/linuxapp/kni/kni_net.c
@@ -605,6 +605,7 @@ kni_net_header(struct sk_buff *skb, struct net_device *dev,
 /*
  * Re-fill the eth header
  */
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(4, 1, 0))
 static int
 kni_net_rebuild_header(struct sk_buff *skb)
 {
@@ -616,6 +617,7 @@ kni_net_rebuild_header(struct sk_buff *skb)

return 0;
 }
+#endif /* < 4.1.0  */

 /**
  * kni_net_set_mac - Change the Ethernet Address of the KNI NIC
@@ -646,7 +648,9 @@ static int kni_net_change_carrier(struct net_device *dev, 
bool new_carrier)

 static const struct header_ops kni_net_header_ops = {
.create  = kni_net_header,
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(4, 1, 0))
.rebuild = kni_net_rebuild_header,
+#endif /* < 4.1.0  */
.cache   = NULL,  /* disable caching */
 };

-- 
2.3.3

[dpdk-dev] [PATCH 1/2] kni: fix igb_ndo_bridge_getlink to build with 4.1

2015-06-25 Thread Miguel Bernal Marin

ndo_bridge_getlink changed in kernel release 4.1. It
takes a new parameter, which breaks compilation.

This patch adds the proper checks to fix it.

Fixes: 46c264d5 ("bridge/nl: remove wrong use of NLM_F_MULTI")

Signed-off-by: Miguel Bernal Marin 
---
 lib/librte_eal/linuxapp/kni/ethtool/igb/igb_main.c | 10 ++
 lib/librte_eal/linuxapp/kni/ethtool/igb/kcompat.h  |  5 +
 2 files changed, 15 insertions(+)

diff --git a/lib/librte_eal/linuxapp/kni/ethtool/igb/igb_main.c 
b/lib/librte_eal/linuxapp/kni/ethtool/igb/igb_main.c
index fa24d16..47198bb 100644
--- a/lib/librte_eal/linuxapp/kni/ethtool/igb/igb_main.c
+++ b/lib/librte_eal/linuxapp/kni/ethtool/igb/igb_main.c
@@ -2250,8 +2250,14 @@ static int igb_ndo_bridge_setlink(struct net_device *dev,
 }

 #ifdef HAVE_BRIDGE_FILTER
+#ifdef HAVE_NDO_BRIDGE_GETLINK_FILTER_MASK
+static int igb_ndo_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq,
+ struct net_device *dev, u32 filter_mask,
+ int nlflags)
+#else
 static int igb_ndo_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq,
  struct net_device *dev, u32 filter_mask)
+#endif /* HAVE_NDO_BRIDGE_GETLINK_FILTER_MASK */
 #else
 static int igb_ndo_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq,
  struct net_device *dev)
@@ -2269,7 +2275,11 @@ static int igb_ndo_bridge_getlink(struct sk_buff *skb, 
u32 pid, u32 seq,
mode = BRIDGE_MODE_VEPA;

 #ifdef HAVE_NDO_FDB_ADD_VID
+#ifdef HAVE_NDO_BRIDGE_GETLINK_FILTER_MASK
+   return ndo_dflt_bridge_getlink(skb, pid, seq, dev, mode, 0, 0, nlflags);
+#else
return ndo_dflt_bridge_getlink(skb, pid, seq, dev, mode, 0, 0);
+#endif /* HAVE_NDO_BRIDGE_GETLINK_FILTER_MASK */
 #else
return ndo_dflt_bridge_getlink(skb, pid, seq, dev, mode);
 #endif /* HAVE_NDO_FDB_ADD_VID */
diff --git a/lib/librte_eal/linuxapp/kni/ethtool/igb/kcompat.h 
b/lib/librte_eal/linuxapp/kni/ethtool/igb/kcompat.h
index 44b9ebf..96d68a2 100644
--- a/lib/librte_eal/linuxapp/kni/ethtool/igb/kcompat.h
+++ b/lib/librte_eal/linuxapp/kni/ethtool/igb/kcompat.h
@@ -3891,4 +3891,9 @@ skb_set_hash(struct sk_buff *skb, __u32 hash, 
__always_unused int type)
 #define vlan_tx_tag_present skb_vlan_tag_present
 #define HAVE_NDO_BRIDGE_SET_DEL_LINK_FLAGS
 #endif /* 4.0.0 */
+
+#if ( LINUX_VERSION_CODE >= KERNEL_VERSION(4,1,0) )
+/* ndo_bridge_getlink adds new nlflags parameter */
+#define HAVE_NDO_BRIDGE_GETLINK_FILTER_MASK
+#endif /* >= 4.1.0 */
 #endif /* _KCOMPAT_H_ */
-- 
2.3.3

[dpdk-dev] [PATCH 0/2] kni: fix build with kernel 4.1

2015-06-25 Thread Miguel Bernal Marin

Due to API changes in netdevice.h in the 4.1 kernel release, the KNI modules
do not build.  This patch set adds the proper checks to fix
compilation.
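
For readers less familiar with the version gating used in both patches, the
sketch below shows how the comparison works. TOY_KERNEL_VERSION is a local
copy of the classic KERNEL_VERSION() encoding from <linux/version.h>, and the
two toy_* helpers are hypothetical names that only restate the conditions used
in this series; real code uses the preprocessor, not runtime checks.

```c
#include <assert.h>

/*
 * Same encoding as the classic KERNEL_VERSION() macro: major, minor and
 * patch level packed into one integer. The TOY_ prefix marks a local copy
 * made for illustration only.
 */
#define TOY_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Compile-time sanity check of the encoding: 4.1.0 packs to 0x040100. */
static_assert(TOY_KERNEL_VERSION(4, 1, 0) == 0x040100, "encoding check");

/* Returns 1 when header_ops still has the .rebuild member (kernels < 4.1). */
int toy_has_header_ops_rebuild(int version_code)
{
	return version_code < TOY_KERNEL_VERSION(4, 1, 0);
}

/* Returns 1 when ndo_bridge_getlink takes the extra nlflags argument (>= 4.1). */
int toy_getlink_has_nlflags(int version_code)
{
	return version_code >= TOY_KERNEL_VERSION(4, 1, 0);
}
```

Because the packed value orders the same way as the version triple, a plain
integer comparison is enough to select the right prototype.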

Miguel Bernal Marin (2):
  kni: fix igb_ndo_bridge_getlink in 4.1
  kni: fix header_ops in 4.1

 lib/librte_eal/linuxapp/kni/ethtool/igb/igb_main.c | 10 ++
 lib/librte_eal/linuxapp/kni/ethtool/igb/kcompat.h  |  5 +
 lib/librte_eal/linuxapp/kni/kni_net.c  |  4 
 3 files changed, 19 insertions(+)

-- 
2.3.3

[dpdk-dev] [PATCH v3 2/2] vhost: realloc vhost device and queues to the same numa node of vring desc table

2015-06-25 Thread Huawei Xie

When we get the address of the vring descriptor table in the VHOST_SET_VRING_ADDR
message, we try to reallocate the vhost device and virt queue on the same numa
node.

v3 changes:
- remove unnecessary rte_free of new_vq and new_ll_dev

v2 changes:
- fix uninitialised new_vq and new_ll_device
- fix missed endif in rte.app.mk
- fix new_ll_dev and new_vq allocation failure issue
- return old virtio device if new_ll_dev isn't allocated

Signed-off-by: Huawei Xie 
---
 config/common_linuxapp|  1 +
 lib/librte_vhost/Makefile |  4 ++
 lib/librte_vhost/virtio-net.c | 88 +++
 mk/rte.app.mk |  4 ++
 4 files changed, 97 insertions(+)

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 0078dc9..4ace24e 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -421,6 +421,7 @@ CONFIG_RTE_KNI_VHOST_DEBUG_TX=n
 #
 CONFIG_RTE_LIBRTE_VHOST=n
 CONFIG_RTE_LIBRTE_VHOST_USER=y
+CONFIG_RTE_LIBRTE_VHOST_NUMA=n
 CONFIG_RTE_LIBRTE_VHOST_DEBUG=n

 #
diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
index a8645a6..6681f22 100644
--- a/lib/librte_vhost/Makefile
+++ b/lib/librte_vhost/Makefile
@@ -46,6 +46,10 @@ CFLAGS += -I vhost_cuse -lfuse
 LDFLAGS += -lfuse
 endif

+ifeq ($(CONFIG_RTE_LIBRTE_VHOST_NUMA),y)
+LDFLAGS += -lnuma
+endif
+
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := virtio-net.c vhost_rxtx.c
 ifeq ($(CONFIG_RTE_LIBRTE_VHOST_USER),y)
diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index 19b74d6..fcaefd6 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -38,6 +38,9 @@
 #include 
 #include 
 #include 
+#ifdef RTE_LIBRTE_VHOST_NUMA
+#include 
+#endif

 #include 

@@ -481,6 +484,88 @@ set_vring_num(struct vhost_device_ctx ctx, struct 
vhost_vring_state *state)
 }

 /*
+ * Reallocate virtio_net and vhost_virtqueue data structures so that they are on the
+ * same numa node as the memory of vring descriptor.
+ */
+#ifdef RTE_LIBRTE_VHOST_NUMA
+static struct virtio_net*
+numa_realloc(struct virtio_net *dev, int index)
+{
+   int oldnode, newnode;
+   struct virtio_net_config_ll *old_ll_dev, *new_ll_dev = NULL;
+   struct vhost_virtqueue *old_vq, *new_vq = NULL;
+   int ret;
+   int realloc_dev = 0, realloc_vq = 0;
+
+   old_ll_dev = (struct virtio_net_config_ll *)dev;
+   old_vq = dev->virtqueue[index];
+
+   ret  = get_mempolicy(&newnode, NULL, 0, old_vq->desc,
+   MPOL_F_NODE | MPOL_F_ADDR);
+   ret = ret | get_mempolicy(&oldnode, NULL, 0, old_ll_dev,
+   MPOL_F_NODE | MPOL_F_ADDR);
+   if (ret) {
+   RTE_LOG(ERR, VHOST_CONFIG,
+   "Unable to get vring desc or dev numa information.\n");
+   return dev;
+   }
+   if (oldnode != newnode)
+   realloc_dev = 1;
+
+   ret = get_mempolicy(&oldnode, NULL, 0, old_vq,
+   MPOL_F_NODE | MPOL_F_ADDR);
+   if (ret) {
+   RTE_LOG(ERR, VHOST_CONFIG,
+   "Unable to get vq numa information.\n");
+   return dev;
+   }
+   if (oldnode != newnode)
+   realloc_vq = 1;
+
+   if (realloc_dev == 0 && realloc_vq == 0)
+   return dev;
+
+   if (realloc_dev)
+   new_ll_dev = rte_malloc_socket(NULL,
+   sizeof(struct virtio_net_config_ll), 0, newnode);
+   if (realloc_vq)
+   new_vq = rte_malloc_socket(NULL,
+   sizeof(struct vhost_virtqueue), 0, newnode);
+   if (!new_ll_dev && !new_vq)
+   return dev;
+
+   if (realloc_vq)
+   memcpy(new_vq, old_vq, sizeof(*new_vq));
+   if (realloc_dev)
+   memcpy(new_ll_dev, old_ll_dev, sizeof(*new_ll_dev));
+   (new_ll_dev ? new_ll_dev : old_ll_dev)->dev.virtqueue[index] =
+   new_vq ? new_vq : old_vq;
+   if (realloc_vq)
+   rte_free(old_vq);
+   if (realloc_dev) {
+   if (ll_root == old_ll_dev)
+   ll_root = new_ll_dev;
+   else {
+   struct virtio_net_config_ll *prev = ll_root;
+   while (prev->next != old_ll_dev)
+   prev = prev->next;
+   prev->next = new_ll_dev;
+   new_ll_dev->next = old_ll_dev->next;
+   }
+   rte_free(old_ll_dev);
+   }
+
+   return realloc_dev ? &new_ll_dev->dev : dev;
+}
+#else
+static struct virtio_net*
+numa_realloc(struct virtio_net *dev, int index __rte_unused)
+{
+   return dev;
+}
+#endif
+
+/*
  * Called from CUSE IOCTL: VHOST_SET_VRING_ADDR
  * The virtio device sends us the desc, used and avail ring addresses.
  * This function then converts these to our address space.
@@ -508,6 +593,9 @@ set_vring_addr(struct vhost_device_ctx ctx, struct 
vhost_vring_addr *addr)

[dpdk-dev] [PATCH v3 1/2] vhost: use rte_malloc to allocate device and queues

2015-06-25 Thread Huawei Xie

Use rte_malloc to allocate the vhost device and queues.


Signed-off-by: Huawei Xie 
---
 lib/librte_vhost/virtio-net.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index 4672e67..19b74d6 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -45,6 +45,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 

 #include "vhost-net.h"
@@ -202,9 +203,9 @@ static void
 free_device(struct virtio_net_config_ll *ll_dev)
 {
/* Free any malloc'd memory */
-   free(ll_dev->dev.virtqueue[VIRTIO_RXQ]);
-   free(ll_dev->dev.virtqueue[VIRTIO_TXQ]);
-   free(ll_dev);
+   rte_free(ll_dev->dev.virtqueue[VIRTIO_RXQ]);
+   rte_free(ll_dev->dev.virtqueue[VIRTIO_TXQ]);
+   rte_free(ll_dev);
 }

 /*
@@ -278,7 +279,7 @@ new_device(struct vhost_device_ctx ctx)
struct vhost_virtqueue *virtqueue_rx, *virtqueue_tx;

/* Setup device and virtqueues. */
-   new_ll_dev = malloc(sizeof(struct virtio_net_config_ll));
+   new_ll_dev = rte_malloc(NULL, sizeof(struct virtio_net_config_ll), 0);
if (new_ll_dev == NULL) {
RTE_LOG(ERR, VHOST_CONFIG,
"(%"PRIu64") Failed to allocate memory for dev.\n",
@@ -286,19 +287,19 @@ new_device(struct vhost_device_ctx ctx)
return -1;
}

-   virtqueue_rx = malloc(sizeof(struct vhost_virtqueue));
+   virtqueue_rx = rte_malloc(NULL, sizeof(struct vhost_virtqueue), 0);
if (virtqueue_rx == NULL) {
-   free(new_ll_dev);
+   rte_free(new_ll_dev);
RTE_LOG(ERR, VHOST_CONFIG,
"(%"PRIu64") Failed to allocate memory for rxq.\n",
ctx.fh);
return -1;
}

-   virtqueue_tx = malloc(sizeof(struct vhost_virtqueue));
+   virtqueue_tx = rte_malloc(NULL, sizeof(struct vhost_virtqueue), 0);
if (virtqueue_tx == NULL) {
-   free(virtqueue_rx);
-   free(new_ll_dev);
+   rte_free(virtqueue_rx);
+   rte_free(new_ll_dev);
RTE_LOG(ERR, VHOST_CONFIG,
"(%"PRIu64") Failed to allocate memory for txq.\n",
ctx.fh);
-- 
1.8.1.4

[dpdk-dev] [PATCH v3 0/2] vhost: numa aware allocation of vhost device and queues

2015-06-25 Thread Huawei Xie

The vhost device and queues should be allocated on the same numa node as the vring
descriptor table.
When we first allocate the vhost device and queues, we don't know the numa
node of the vring descriptor table.
When we receive the VHOST_SET_VRING_ADDR message, we learn the numa node of the vring
descriptor table and try to reallocate the vhost device and queues on the same
numa node.
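
The move-if-remote pattern described above can be sketched in plain C as
follows. This is a toy under stated assumptions: the node is just an integer
tag on the object, calloc() stands in for a node-aware allocator such as
rte_malloc_socket(), and all toy_* names are hypothetical. The real patch also
has to fix up the device list and the virtqueue pointers, which is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy object that remembers which node it was "allocated" on. */
struct toy_obj {
	int node;        /* stand-in for the NUMA node backing this object */
	int payload;     /* stand-in for device or queue state */
};

/* Hypothetical node-aware allocator; the patch uses rte_malloc_socket(). */
struct toy_obj *toy_alloc_on_node(int node)
{
	struct toy_obj *o = calloc(1, sizeof(*o));

	if (o != NULL)
		o->node = node;
	return o;
}

/*
 * Move obj to target_node if it lives elsewhere. On allocation failure the
 * original object is returned untouched, so the caller always gets a valid
 * pointer back.
 */
struct toy_obj *toy_numa_realloc(struct toy_obj *obj, int target_node)
{
	struct toy_obj *moved;

	assert(obj != NULL);
	if (obj->node == target_node)
		return obj;               /* already local, nothing to do */

	moved = toy_alloc_on_node(target_node);
	if (moved == NULL)
		return obj;               /* keep using the old copy */

	memcpy(moved, obj, sizeof(*moved));
	moved->node = target_node;        /* memcpy overwrote the node tag */
	free(obj);
	return moved;
}
```

Callers must replace their pointer with the return value, since the old
object may have been freed.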


Huawei Xie (2):
  use rte_malloc to allocate vhost device and queues
  reallocate vhost device and queues when we get the address of vring 
descriptor table

 config/common_linuxapp|   1 +
 lib/librte_vhost/Makefile |   4 ++
 lib/librte_vhost/virtio-net.c | 107 ++
 mk/rte.app.mk |   4 ++
 4 files changed, 107 insertions(+), 9 deletions(-)

-- 
1.8.1.4

[dpdk-dev] KNI performance numbers...

2015-06-25 Thread Maciej Grochowski

I ran into a similar issue with KNI-connected VMs, but in my case I run 2 VM
guests based on KNI and measure network performance between them:

Session:

### I just started demo with kni

./build/kni -c 0xf0 -n 4 -- -P -p 0x3 --config="(0,4,6,8),(1,5,7,9)"

###starting...

###set kni on vEthX to connect (as in example)

echo 1 > /sys/class/net/vEth0_0/sock_en
fd=`cat /sys/class/net/vEth0_0/sock_fd`

## start first guest VM
kvm -nographic -name vm1 -cpu host -m 2048 -smp 1 -hda
.../debian_squeeze_amd64.qcow2 -netdev tap,fd=$fd,id=hostnet1,vhost=on
-device virtio-net-pci,netdev=hostnet1,id=net1,bus=pci.0,addr=0x4

## start second guest VM
echo 1 > /sys/class/net/vEth1_0/sock_en
fd=`cat /sys/class/net/vEth1_0/sock_fd`

kvm -nographic -name vm2 -cpu host -m 2048 -smp 1 -hda
.../debian_squeeze2_amd64.qcow2 -netdev tap,fd=$fd,id=hostnet1,vhost=on
-device virtio-net-pci,netdev=hostnet1,id=net1,bus=pci.0,addr=0x4

###END: setting up 2 kvm instances with virtual guests


### first VM node start server
 netserver -p 22113

### performance from second VM guest to first (server) using netperf

root@debian-amd64:~# netperf -H 10.0.0.200 -p 22113 -t TCP_STREAM
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
10.0.0.200 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.01     219.86

So I got about 220 Mbit/s between two VMs using KNI, but it was only an
experiment (I didn't analyze it deeply).

On Wed, Jun 24, 2015 at 7:58 AM, Vithal S Mohare 
wrote:

> Hi,
>
> I am running the DPDK KNI application on a linux (3.18 kernel) VM (ESXi 5.5),
> directly connected to another linux box to measure throughput using  iperf
> tool.  Link speed: 1Gbps.   Maximum throughput I get is 50% with 1470
> Bytes.  With 512B pkt sizes, throughput drops to 282 Mbps.
>
> Tried using KNI loopback modes (and traffic from Ixia), but no change in
> throughput.
>
> KNI is running in single thread mode.  One lcore for rx, one for tx and
> another for the kni thread.
>
> Is the result expected?  Has anybody got better numbers?  I would appreciate
> any input and relevant info.
>
> Thanks,
> -Vithal
>
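
To compare the two figures quoted above in terms of packet rate rather than
bit rate, a small calculation helps. The sketch below is an assumption-laden
approximation: it takes "50% of 1 Gbps" as 500 Mbit/s, treats the quoted size
as the full packet size, and ignores Ethernet framing overhead. The function
name is made up for this example.

```c
#include <assert.h>

/*
 * Packets per second for a given throughput.
 * Assumption: packet_bytes is the full size counted by the throughput
 * figure; preamble and inter-frame gap are ignored.
 */
double toy_packets_per_second(double bits_per_second, unsigned packet_bytes)
{
	assert(packet_bytes > 0);
	return bits_per_second / (8.0 * packet_bytes);
}
```

With these assumptions, 500 Mbit/s at 1470 bytes is roughly 42.5k packets/s,
while 282 Mbit/s at 512 bytes is roughly 68.8k packets/s, so the smaller
packets run at a higher packet rate even though the bit rate is lower.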

[dpdk-dev] [PATCH v2 11/11] ip_pipeline: added new implementation of flow classification pipeline

2015-06-25 Thread Maciej Gajdzica

The flow classification pipeline implementation is split into two files.
pipeline_flow_classification.c handles the front-end functions (CLI
command parsing); pipeline_flow_classification_be.c contains the
implementation of the functions performed by the pipeline (back-end).
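
The key and field offsets in the example .cfg files are sums that the config
comments spell out (headroom + Ethernet header + offset inside the L3 header).
The sketch below only restates that arithmetic in C; the enum and function
names are illustrative and do not come from the patch. The 158 value is the
ip_da_offset from rt.cfg in the routing patch of this series.

```c
#include <assert.h>
#include <stddef.h>

/* Values taken from the comments in the .cfg files; names are illustrative. */
enum {
	TOY_HEADROOM = 128,             /* "headroom (128)" */
	TOY_ETH_HDR_LEN = 14,           /* untagged Ethernet header */
	TOY_IPV4_TTL_OFFSET = 8,        /* TTL is byte 8 of the IPv4 header */
	TOY_IPV4_DA_OFFSET = 16,        /* destination address starts at byte 16 */
	TOY_IPV6_PAYLOAD_LEN_OFFSET = 4 /* payload length starts at byte 4 */
};

/* Sum used in the config comments: headroom + Ethernet header + field offset. */
size_t toy_l3_field_offset(size_t offset_in_l3_header)
{
	return TOY_HEADROOM + TOY_ETH_HDR_LEN + offset_in_l3_header;
}

/* Compile-time checks against the values written in the config files. */
static_assert(TOY_HEADROOM + TOY_ETH_HDR_LEN + TOY_IPV4_TTL_OFFSET == 150,
	      "key_offset_rd in fc_ipv4_5tuple.cfg");
static_assert(TOY_HEADROOM + TOY_ETH_HDR_LEN + TOY_IPV6_PAYLOAD_LEN_OFFSET == 146,
	      "key_offset_rd in fc_ipv6_5tuple.cfg");
static_assert(TOY_HEADROOM + TOY_ETH_HDR_LEN + TOY_IPV4_DA_OFFSET == 158,
	      "ip_da_offset in rt.cfg");
```

For an IPv4 header without options, the 16 bytes starting at the TTL field
cover TTL, protocol, checksum, both addresses and both L4 ports, which is
presumably why the IPv4 example uses key_size = 16.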

Signed-off-by: Maciej Gajdzica 
---
 examples/ip_pipeline/Makefile  |2 +
 examples/ip_pipeline/config/fc_ipv4_5tuple.cfg |   23 +
 examples/ip_pipeline/config/fc_ipv4_5tuple.sh  |9 +
 examples/ip_pipeline/config/fc_ipv6_5tuple.cfg |   23 +
 examples/ip_pipeline/config/fc_ipv6_5tuple.sh  |8 +
 examples/ip_pipeline/config/fc_qinq.cfg|   23 +
 examples/ip_pipeline/config/fc_qinq.sh |8 +
 examples/ip_pipeline/init.c|2 +
 .../pipeline/pipeline_flow_classification.c| 2063 +---
 .../pipeline/pipeline_flow_classification.h|  106 +
 .../pipeline/pipeline_flow_classification_be.c |  569 ++
 .../pipeline/pipeline_flow_classification_be.h |  140 ++
 12 files changed, 2755 insertions(+), 221 deletions(-)
 create mode 100644 examples/ip_pipeline/config/fc_ipv4_5tuple.cfg
 create mode 100644 examples/ip_pipeline/config/fc_ipv4_5tuple.sh
 create mode 100644 examples/ip_pipeline/config/fc_ipv6_5tuple.cfg
 create mode 100644 examples/ip_pipeline/config/fc_ipv6_5tuple.sh
 create mode 100644 examples/ip_pipeline/config/fc_qinq.cfg
 create mode 100644 examples/ip_pipeline/config/fc_qinq.sh
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_flow_classification.h
 create mode 100644 
examples/ip_pipeline/pipeline/pipeline_flow_classification_be.c
 create mode 100644 
examples/ip_pipeline/pipeline/pipeline_flow_classification_be.h

diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
index a2881a6..f3ff1ec 100644
--- a/examples/ip_pipeline/Makefile
+++ b/examples/ip_pipeline/Makefile
@@ -64,6 +64,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += 
pipeline_passthrough_be.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_passthrough.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_firewall_be.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_firewall.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_flow_classification_be.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_flow_classification.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_routing_be.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_routing.c

diff --git a/examples/ip_pipeline/config/fc_ipv4_5tuple.cfg 
b/examples/ip_pipeline/config/fc_ipv4_5tuple.cfg
new file mode 100644
index 000..246df5f
--- /dev/null
+++ b/examples/ip_pipeline/config/fc_ipv4_5tuple.cfg
@@ -0,0 +1,23 @@
+[PIPELINE0]
+type = MASTER
+core = 0
+
+[PIPELINE1]
+type = PASS-THROUGH
+core = s0c1
+pktq_in = RXQ0.0 RXQ1.0 RXQ2.0 RXQ3.0
+pktq_out = SWQ0 SWQ1 SWQ2 SWQ3
+key_type = ipv4_5tuple
+key_offset_rd = 150; key_offset_rd = headroom (128) + ethernet (14) + ttl 
offset (8)
+key_offset_wr = 64
+hash_offset = 80
+
+[PIPELINE2]
+type = FLOW_CLASSIFICATION
+core = s0c2
+pktq_in = SWQ0 SWQ1 SWQ2 SWQ3
+pktq_out = TXQ0.0 TXQ1.0 TXQ2.0 TXQ3.0
+n_flows = 16777216
+key_offset = 64
+key_size = 16
+hash_offset = 80
diff --git a/examples/ip_pipeline/config/fc_ipv4_5tuple.sh 
b/examples/ip_pipeline/config/fc_ipv4_5tuple.sh
new file mode 100644
index 000..29c77f9
--- /dev/null
+++ b/examples/ip_pipeline/config/fc_ipv4_5tuple.sh
@@ -0,0 +1,9 @@
+#run config/fc_ipv4_5tuple.sh
+
+p 1 ping
+p 2 ping
+
+p 2 flow add default 3
+p 2 flow add ipv4_5tuple 1.2.3.4 5.6.7.8 256 257 6 2
+p 2 flow ls
+
diff --git a/examples/ip_pipeline/config/fc_ipv6_5tuple.cfg 
b/examples/ip_pipeline/config/fc_ipv6_5tuple.cfg
new file mode 100644
index 000..4b2b0da
--- /dev/null
+++ b/examples/ip_pipeline/config/fc_ipv6_5tuple.cfg
@@ -0,0 +1,23 @@
+[PIPELINE0]
+type = MASTER
+core = 0
+
+[PIPELINE1]
+type = PASS-THROUGH
+core = s0c1
+pktq_in = RXQ0.0 RXQ1.0 RXQ2.0 RXQ3.0
+pktq_out = SWQ0 SWQ1 SWQ2 SWQ3
+key_type = ipv6_5tuple; key_size = 64
+key_offset_rd = 146; key_offset_rd = headroom (128) + ethernet (14) + payload 
length offset (4)
+key_offset_wr = 0
+hash_offset = 64
+
+[PIPELINE2]
+type = FLOW_CLASSIFICATION
+core = s0c2
+pktq_in = SWQ0 SWQ1 SWQ2 SWQ3
+pktq_out = TXQ0.0 TXQ1.0 TXQ2.0 TXQ3.0
+n_flows = 16777216
+key_offset = 0
+key_size = 64
+hash_offset = 64
diff --git a/examples/ip_pipeline/config/fc_ipv6_5tuple.sh 
b/examples/ip_pipeline/config/fc_ipv6_5tuple.sh
new file mode 100644
index 000..b3724ee
--- /dev/null
+++ b/examples/ip_pipeline/config/fc_ipv6_5tuple.sh
@@ -0,0 +1,8 @@
+#run config/fc_ipv6_5tuple.sh
+
+p 1 ping
+p 2 ping
+
+p 2 flow add default 3
+p 2 flow add ipv6_5tuple 0001:0203:0405:0607:0809:0a0b:0c0d:0e0f 
1011:1213:1415:1617:1819:1a1b:1c1d:1e1f 256 257 6 2
+p 2 flow ls
diff --git a/examples/ip_pipeline/config/fc_qinq.cfg 
b/examples/ip_pipeline/config/fc_qinq.cfg
new file mode 100644
index 000..a502d7a
--- /dev/null
+++ b/examples/ip_pipeline/config/fc_qinq.cfg
@@ -0,0 +1,23 @@

[dpdk-dev] [PATCH v2 10/11] ip_pipeline: added new implementation of routing pipeline

2015-06-25 Thread Maciej Gajdzica

From: Pawel Wodkowski 

The routing pipeline implementation is split into two files.
pipeline_routing.c handles the front-end functions (CLI command
parsing); pipeline_routing_be.c contains the implementation of the
functions performed by the pipeline (back-end).

Signed-off-by: Pawel Wodkowski 
---
 examples/ip_pipeline/Makefile  |2 +
 examples/ip_pipeline/config/rt.cfg |   13 +
 examples/ip_pipeline/config/rt.sh  |   18 +
 examples/ip_pipeline/init.c|2 +
 examples/ip_pipeline/pipeline/pipeline_routing.c   | 1777 
 examples/ip_pipeline/pipeline/pipeline_routing.h   |   99 ++
 .../ip_pipeline/pipeline/pipeline_routing_be.c |  836 +
 .../ip_pipeline/pipeline/pipeline_routing_be.h |  230 +++
 8 files changed, 2620 insertions(+), 357 deletions(-)
 create mode 100644 examples/ip_pipeline/config/rt.cfg
 create mode 100644 examples/ip_pipeline/config/rt.sh
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_routing.h
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_routing_be.c
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_routing_be.h

diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
index 382fee6..a2881a6 100644
--- a/examples/ip_pipeline/Makefile
+++ b/examples/ip_pipeline/Makefile
@@ -64,6 +64,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += 
pipeline_passthrough_be.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_passthrough.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_firewall_be.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_firewall.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_routing_be.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_routing.c

 CFLAGS += -I$(SRCDIR) -I$(SRCDIR)/pipeline
 CFLAGS += -O3
diff --git a/examples/ip_pipeline/config/rt.cfg b/examples/ip_pipeline/config/rt.cfg
new file mode 100644
index 000..e2c614f
--- /dev/null
+++ b/examples/ip_pipeline/config/rt.cfg
@@ -0,0 +1,13 @@
+[PIPELINE0]
+type = MASTER
+core = 0
+
+[PIPELINE1]
+type = ROUTING
+core = s0c1
+pktq_in = RXQ0.0 RXQ1.0 RXQ2.0 RXQ3.0
+pktq_out = TXQ0.0 TXQ1.0 TXQ2.0 TXQ3.0
+n_routes = 4096
+n_arp_entries = 1024
+ip_da_offset = 158; ip_da_offset = headroom (128) + ethernet header (14) + ip header offset (16)
+arp_key_offset = 128; arp_key_offset = headroom (128)
diff --git a/examples/ip_pipeline/config/rt.sh b/examples/ip_pipeline/config/rt.sh
new file mode 100644
index 000..3cf2877
--- /dev/null
+++ b/examples/ip_pipeline/config/rt.sh
@@ -0,0 +1,18 @@
+#run config/routing.sh
+
+p 1 ping
+
+p 1 arp add default 2
+p 1 arp add 0 10.0.0.1 a0:b0:c0:d0:e0:f0
+p 1 arp add 1 11.0.0.1 a1:b1:c1:d1:e1:f1
+p 1 arp add 2 12.0.0.1 a2:b2:c2:d2:e2:f2
+p 1 arp add 3 13.0.0.1 a3:b3:c3:d3:e3:f3
+
+p 1 route add default 3
+p 1 route add 0.0.0.0 10 0 10.0.0.1
+p 1 route add 0.64.0.0 10 1 11.0.0.1
+p 1 route add 0.128.0.0 10 2 12.0.0.1
+p 1 route add 0.192.0.0 10 3 13.0.0.1
+
+p 1 route ls
+p 1 arp ls
diff --git a/examples/ip_pipeline/init.c b/examples/ip_pipeline/init.c
index 3583672..840bc60 100644
--- a/examples/ip_pipeline/init.c
+++ b/examples/ip_pipeline/init.c
@@ -47,6 +47,7 @@
 #include "pipeline_master.h"
 #include "pipeline_passthrough.h"
 #include "pipeline_firewall.h"
+#include "pipeline_routing.h"

 #define APP_NAME_SIZE  32

@@ -1193,6 +1194,7 @@ int app_init(struct app_params *app)
app_pipeline_type_register(app, &pipeline_master);
app_pipeline_type_register(app, &pipeline_passthrough);
app_pipeline_type_register(app, &pipeline_firewall);
+   app_pipeline_type_register(app, &pipeline_routing);

app_init_pipelines(app);
app_init_threads(app);
diff --git a/examples/ip_pipeline/pipeline/pipeline_routing.c b/examples/ip_pipeline/pipeline/pipeline_routing.c
index b1ce624..3a42bc9 100644
--- a/examples/ip_pipeline/pipeline/pipeline_routing.c
+++ b/examples/ip_pipeline/pipeline/pipeline_routing.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -31,444 +31,1507 @@
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */

-#include 
-#include 
-#include 
+#include 
+#include 
+#include 
+#include 
+#include 

-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
+#include "app.h"
+#include "pipeline_common_fe.h"
+#include "pipeline_routing.h"

-#include 
-#include 
-#include 
-#include 
+struct app_pipeline_routing_route {
+   struct pipeline_routing_route_key key;
+   struct app_pipeline_routing_route_params params;
+   void *entry_ptr;

-#include "main.h"
+   TAILQ_ENTRY(app_pipeline_routing_route) node;
+};

-#include 
+struct app_pipeline_routing_arp_entry {
+   struct pipeline_routing_arp_key key;

[dpdk-dev] [PATCH v2 09/11] ip_pipeline: added new implementation of firewall pipeline

2015-06-25 Thread Maciej Gajdzica

From: Daniel Mrzyglod 

The firewall pipeline implementation is split into two files:
pipeline_firewall.c handles the front-end functions (CLI command
parsing), while pipeline_firewall_be.c implements the functions
executed by the pipeline itself (back-end).
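
To illustrate what the four rules in fw.sh select, here is a small stand-alone sketch; it is not code from this patch. It assumes that the arguments of "firewall add ipv4" are, in order, priority, source IP and prefix depth, destination IP and prefix depth, source port range, destination port range, protocol, protocol mask and output port, and it models only the destination prefix. All sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rule type: destination prefix and output port only. */
struct sketch_rule {
	uint32_t dst_ip;    /* host byte order */
	uint32_t dst_depth; /* prefix length, 0..32 */
	uint32_t port_id;
};

/* The four destination prefixes installed by fw.sh. */
static const struct sketch_rule sketch_rules[] = {
	{ 0x00000000u, 10, 0 }, /* 0.0.0.0/10   */
	{ 0x00400000u, 10, 1 }, /* 0.64.0.0/10  */
	{ 0x00800000u, 10, 2 }, /* 0.128.0.0/10 */
	{ 0x00C00000u, 10, 3 }, /* 0.192.0.0/10 */
};

/* Builds the network mask for a prefix length. */
static uint32_t sketch_prefix_mask(uint32_t depth)
{
	assert(depth <= 32);
	return depth == 0 ? 0u : UINT32_MAX << (32 - depth);
}

/* Returns the output port of the first matching rule, or -1 on a miss. */
static int sketch_lookup(uint32_t dst_ip)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_rules) / sizeof(sketch_rules[0]); i++) {
		uint32_t mask = sketch_prefix_mask(sketch_rules[i].dst_depth);

		if ((dst_ip & mask) == (sketch_rules[i].dst_ip & mask))
			return (int) sketch_rules[i].port_id;
	}

	return -1;
}
```

Under these assumptions a packet with destination 0.64.0.1 leaves on port 1 and one with destination 0.200.1.1 on port 3, while 10.0.0.1 matches none of the four rules.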

Signed-off-by: Daniel Mrzyglod 
---
 examples/ip_pipeline/Makefile  |2 +
 examples/ip_pipeline/config/fw.cfg |   11 +
 examples/ip_pipeline/config/fw.sh  |   13 +
 examples/ip_pipeline/init.c|2 +
 examples/ip_pipeline/pipeline/pipeline_firewall.c  | 1125 +++-
 examples/ip_pipeline/pipeline/pipeline_firewall.h  |   63 ++
 .../ip_pipeline/pipeline/pipeline_firewall_be.c|  701 
 .../ip_pipeline/pipeline/pipeline_firewall_be.h|  138 +++
 8 files changed, 1816 insertions(+), 239 deletions(-)
 create mode 100644 examples/ip_pipeline/config/fw.cfg
 create mode 100644 examples/ip_pipeline/config/fw.sh
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_firewall.h
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_firewall_be.c
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_firewall_be.h

diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
index 930dc61..382fee6 100644
--- a/examples/ip_pipeline/Makefile
+++ b/examples/ip_pipeline/Makefile
@@ -62,6 +62,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_master_be.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_master.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_passthrough_be.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_passthrough.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_firewall_be.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_firewall.c

 CFLAGS += -I$(SRCDIR) -I$(SRCDIR)/pipeline
 CFLAGS += -O3
diff --git a/examples/ip_pipeline/config/fw.cfg b/examples/ip_pipeline/config/fw.cfg
new file mode 100644
index 000..fba324b
--- /dev/null
+++ b/examples/ip_pipeline/config/fw.cfg
@@ -0,0 +1,11 @@
+[PIPELINE0]
+type = MASTER
+core = 0
+
+[PIPELINE1]
+type = FIREWALL
+core = s0c1
+pktq_in = RXQ0.0 RXQ1.0 RXQ2.0 RXQ3.0
+pktq_out = TXQ0.0 TXQ1.0 TXQ2.0 TXQ3.0
+n_rules = 4096
+pkt_type = ipv4
diff --git a/examples/ip_pipeline/config/fw.sh b/examples/ip_pipeline/config/fw.sh
new file mode 100644
index 000..59e1213
--- /dev/null
+++ b/examples/ip_pipeline/config/fw.sh
@@ -0,0 +1,13 @@
+#Firewall
+
+p 1 firewall add ipv4 1 0.0.0.0 0 0.0.0.0 10 0 65535 0 65535 6 0xf 0
+p 1 firewall add ipv4 1 0.0.0.0 0 0.64.0.0 10 0 65535 0 65535 6 0xf 1
+p 1 firewall add ipv4 1 0.0.0.0 0 0.128.0.0 10 0 65535 0 65535 6 0xf 2
+p 1 firewall add ipv4 1 0.0.0.0 0 0.192.0.0 10 0 65535 0 65535 6 0xf 3
+
+p 1 firewall ls
+
+#p 1 firewall del ipv4 0.0.0.0 0 0.0.0.0 10 0 65535 0 65535 6 0xf
+#p 1 firewall del ipv4 0.0.0.0 0 0.64.0.0 10 0 65535 0 65535 6 0xf
+#p 1 firewall del ipv4 0.0.0.0 0 0.128.0.0 10 0 65535 0 65535 6 0xf
+#p 1 firewall del ipv4 0.0.0.0 0 0.192.0.0 10 0 65535 0 65535 6 0xf
diff --git a/examples/ip_pipeline/init.c b/examples/ip_pipeline/init.c
index 711c243..3583672 100644
--- a/examples/ip_pipeline/init.c
+++ b/examples/ip_pipeline/init.c
@@ -46,6 +46,7 @@
 #include "pipeline_common_fe.h"
 #include "pipeline_master.h"
 #include "pipeline_passthrough.h"
+#include "pipeline_firewall.h"

 #define APP_NAME_SIZE  32

@@ -1191,6 +1192,7 @@ int app_init(struct app_params *app)
app_pipeline_common_cmd_push(app);
app_pipeline_type_register(app, &pipeline_master);
app_pipeline_type_register(app, &pipeline_passthrough);
+   app_pipeline_type_register(app, &pipeline_firewall);

app_init_pipelines(app);
app_init_threads(app);
diff --git a/examples/ip_pipeline/pipeline/pipeline_firewall.c b/examples/ip_pipeline/pipeline/pipeline_firewall.c
index b70260e..9796fe2 100644
--- a/examples/ip_pipeline/pipeline/pipeline_firewall.c
+++ b/examples/ip_pipeline/pipeline/pipeline_firewall.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -32,282 +32,929 @@
  */

 #include 
-#include 
-#include 
+#include 
+#include 
+#include 

+#include 
+#include 
 #include 
-#include 
-#include 
-#include 
-#include 
-#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "app.h"
+#include "pipeline_common_fe.h"
+#include "pipeline_firewall.h"
+
+struct app_pipeline_firewall_rule {
+   struct pipeline_firewall_key key;
+   int32_t priority;
+   uint32_t port_id;
+   void *entry_ptr;
+
+   TAILQ_ENTRY(app_pipeline_firewall_rule) node;
+};
+
+struct app_pipeline_firewall {
+   /* parameters */
+   uint32_t n_ports_in;
+   uint32_t n_ports_out;
+
+   /* rules */
+   TAILQ_HEAD(, app_pipeline_firewa

[dpdk-dev] [PATCH v2 08/11] ip_pipeline: added new implementation of passthrough pipeline

2015-06-25 Thread Maciej Gajdzica

From: Jasvinder Singh 

The passthrough pipeline implementation is split into two files:
pipeline_passthrough.c handles the front-end functions (CLI command
parsing), while pipeline_passthrough_be.c implements the functions
executed by the pipeline itself (back-end).

Signed-off-by: Jasvinder Singh 
---
 examples/ip_pipeline/Makefile  |2 +
 examples/ip_pipeline/config/pt1.cfg|9 +
 examples/ip_pipeline/config/pt2.cfg|   15 +
 examples/ip_pipeline/config/pt3.cfg|   21 +
 examples/ip_pipeline/init.c|2 +
 examples/ip_pipeline/pipeline/hash_func.h  |  351 ++
 .../ip_pipeline/pipeline/pipeline_actions_common.h |  119 
 .../ip_pipeline/pipeline/pipeline_passthrough.c|  192 +
 .../ip_pipeline/pipeline/pipeline_passthrough.h|   41 ++
 .../ip_pipeline/pipeline/pipeline_passthrough_be.c |  741 
 .../ip_pipeline/pipeline/pipeline_passthrough_be.h |   41 ++
 11 files changed, 1355 insertions(+), 179 deletions(-)
 create mode 100644 examples/ip_pipeline/config/pt1.cfg
 create mode 100644 examples/ip_pipeline/config/pt2.cfg
 create mode 100644 examples/ip_pipeline/config/pt3.cfg
 create mode 100644 examples/ip_pipeline/pipeline/hash_func.h
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_actions_common.h
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_passthrough.h
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_passthrough_be.c
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_passthrough_be.h

diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
index f255338..930dc61 100644
--- a/examples/ip_pipeline/Makefile
+++ b/examples/ip_pipeline/Makefile
@@ -60,6 +60,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_common_be.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_common_fe.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_master_be.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_master.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_passthrough_be.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_passthrough.c

 CFLAGS += -I$(SRCDIR) -I$(SRCDIR)/pipeline
 CFLAGS += -O3
diff --git a/examples/ip_pipeline/config/pt1.cfg b/examples/ip_pipeline/config/pt1.cfg
new file mode 100644
index 000..c9cdc78
--- /dev/null
+++ b/examples/ip_pipeline/config/pt1.cfg
@@ -0,0 +1,9 @@
+[PIPELINE0]
+type = MASTER
+core = 0
+
+[PIPELINE1]
+type = PASS-THROUGH
+core = s0c1
+pktq_in = RXQ0.0 RXQ1.0 RXQ2.0 RXQ3.0
+pktq_out = TXQ0.0 TXQ1.0 TXQ2.0 TXQ3.0
diff --git a/examples/ip_pipeline/config/pt2.cfg b/examples/ip_pipeline/config/pt2.cfg
new file mode 100644
index 000..860cab6
--- /dev/null
+++ b/examples/ip_pipeline/config/pt2.cfg
@@ -0,0 +1,15 @@
+[PIPELINE0]
+type = MASTER
+core = 0
+
+[PIPELINE1]
+type = PASS-THROUGH
+core = s0c1
+pktq_in = RXQ0.0 RXQ1.0 RXQ2.0 RXQ3.0
+pktq_out = SWQ0 SWQ1 SWQ2 SWQ3
+
+[PIPELINE2]
+type = PASS-THROUGH
+core = s0c2
+pktq_in = SWQ0 SWQ1 SWQ2 SWQ3
+pktq_out = TXQ0.0 TXQ1.0 TXQ2.0 TXQ3.0
diff --git a/examples/ip_pipeline/config/pt3.cfg b/examples/ip_pipeline/config/pt3.cfg
new file mode 100644
index 000..e6159d0
--- /dev/null
+++ b/examples/ip_pipeline/config/pt3.cfg
@@ -0,0 +1,21 @@
+[PIPELINE0]
+type = MASTER
+core = 0
+
+[PIPELINE1]
+type = PASS-THROUGH
+core = s0c1
+pktq_in = RXQ0.0 RXQ1.0 RXQ2.0 RXQ3.0
+pktq_out = SWQ0 SWQ1 SWQ2 SWQ3
+
+[PIPELINE2]
+type = PASS-THROUGH
+core = s0c2
+pktq_in = SWQ0 SWQ1 SWQ2 SWQ3
+pktq_out = SWQ4 SWQ5 SWQ6 SWQ7
+
+[PIPELINE3]
+type = PASS-THROUGH
+core = s0c3
+pktq_in = SWQ4 SWQ5 SWQ6 SWQ7
+pktq_out = TXQ0.0 TXQ1.0 TXQ2.0 TXQ3.0
diff --git a/examples/ip_pipeline/init.c b/examples/ip_pipeline/init.c
index b362af0..711c243 100644
--- a/examples/ip_pipeline/init.c
+++ b/examples/ip_pipeline/init.c
@@ -45,6 +45,7 @@
 #include "pipeline.h"
 #include "pipeline_common_fe.h"
 #include "pipeline_master.h"
+#include "pipeline_passthrough.h"

 #define APP_NAME_SIZE  32

@@ -1189,6 +1190,7 @@ int app_init(struct app_params *app)

app_pipeline_common_cmd_push(app);
app_pipeline_type_register(app, &pipeline_master);
+   app_pipeline_type_register(app, &pipeline_passthrough);

app_init_pipelines(app);
app_init_threads(app);
diff --git a/examples/ip_pipeline/pipeline/hash_func.h b/examples/ip_pipeline/pipeline/hash_func.h
new file mode 100644
index 000..0d9c019
--- /dev/null
+++ b/examples/ip_pipeline/pipeline/hash_func.h
@@ -0,0 +1,351 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * R

[dpdk-dev] [PATCH v2 07/11] ip_pipeline: moved config files to separate folder

2015-06-25 Thread Maciej Gajdzica

Created a new folder for config (.cfg) and script (.sh) files.

Signed-off-by: Maciej Gajdzica 
---
 examples/ip_pipeline/config/ip_pipeline.cfg |9 ++
 examples/ip_pipeline/config/ip_pipeline.sh  |5 +
 examples/ip_pipeline/config/test.cfg|  164 +++
 examples/ip_pipeline/config/test.sh |6 +
 examples/ip_pipeline/config/tm_profile.cfg  |  105 +
 examples/ip_pipeline/ip_pipeline.cfg|   56 -
 examples/ip_pipeline/ip_pipeline.sh |   18 ---
 7 files changed, 289 insertions(+), 74 deletions(-)
 create mode 100644 examples/ip_pipeline/config/ip_pipeline.cfg
 create mode 100644 examples/ip_pipeline/config/ip_pipeline.sh
 create mode 100644 examples/ip_pipeline/config/test.cfg
 create mode 100644 examples/ip_pipeline/config/test.sh
 create mode 100644 examples/ip_pipeline/config/tm_profile.cfg
 delete mode 100644 examples/ip_pipeline/ip_pipeline.cfg
 delete mode 100644 examples/ip_pipeline/ip_pipeline.sh

diff --git a/examples/ip_pipeline/config/ip_pipeline.cfg b/examples/ip_pipeline/config/ip_pipeline.cfg
new file mode 100644
index 000..095ed25
--- /dev/null
+++ b/examples/ip_pipeline/config/ip_pipeline.cfg
@@ -0,0 +1,9 @@
+[PIPELINE0]
+type = MASTER
+core = 0
+
+[PIPELINE1]
+type = PASS-THROUGH
+core = 1
+pktq_in = RXQ0.0 RXQ1.0 RXQ2.0 RXQ3.0
+pktq_out = TXQ0.0 TXQ1.0 TXQ2.0 TXQ3.0
diff --git a/examples/ip_pipeline/config/ip_pipeline.sh b/examples/ip_pipeline/config/ip_pipeline.sh
new file mode 100644
index 000..4fca259
--- /dev/null
+++ b/examples/ip_pipeline/config/ip_pipeline.sh
@@ -0,0 +1,5 @@
+#
+#run config/ip_pipeline.sh
+#
+
+p 1 ping
diff --git a/examples/ip_pipeline/config/test.cfg b/examples/ip_pipeline/config/test.cfg
new file mode 100644
index 000..99a21dd
--- /dev/null
+++ b/examples/ip_pipeline/config/test.cfg
@@ -0,0 +1,164 @@
+; #define OFFSET_QINQ 142
+; #define OFFSET_IP_DA 166
+; #define OFFSET_HASH 128
+; #define OFFSET_FLOW_ID 132
+; #define OFFSET_COLOR 136
+
+; TBD - need to think about
+;[EAL]
+; c = not allowed
+;n = 2 ; 
+;m = 2048
+
+
+[PIPELINE0]
+type = MASTER
+core = 0
+
+[PIPELINE1]
+type = PASS-THROUGH; Packet RX
+core = s0c1
+pktq_in = RXQ0.0 RXQ1.0 SWQ1
+pktq_out = SWQ0 SWQ1
+msgq_in = MSGQ0
+msgq_out = MSGQ1
+pkt_type=qinq_ipv4
+key_type=qinq
+key_offset=OFFSET_QINQ
+hash_offset=OFFSET_HASH
+timer_period = 2
+
+[PIPELINE2]
+type = FIREWALL
+core = s1c2
+pktq_in = SWQ0 SWQ5
+pktq_out = SWQ2 SINK0
+msgq_in = MSGQ1
+msgq_out = MSGQ0
+n_rules=4K
+pkt_type=qinq_ipv4
+
+[PIPELINE3]
+type = FLOW_CLASSIF
+core = s0c3   
+pktq_in = SWQ2
+pktq_out = SWQ3 SINK1
+time_period = 100
+n_flows=16M
+key_size=8
+key_offset=$OFFSET_QINQ
+hash_offset=$OFFSET_HASH
+flow_id_offset=$OFFSET_FLOW_ID
+
+[PIPELINE4]
+type = FLOW_ACTIONS
+core = c4h
+pktq_in = SWQ3
+pktq_out = SWQ4
+n_flows=16M
+flow_id_offset=$OFFSET_FLOW_ID
+color_offset=$OFFSET_COLOR
+
+[PIPELINE5]
+type = ROUTING
+core = s1c5h
+pktq_in = SWQ4
+pktq_out = TXQ0.0 TXQ1.0 SINK2
+n_routes=1M
+next_hop_type=ipv4_mpls
+ip_da_offset=$OFFSET_IP_DA
+color_offset=$OFFSET_COLOR
+
+[MEMPOOL1]
+pool_size=2k
+cache_size=64
+ 
+[LINK0]
+ip_local_q=0
+udp_local_q =0
+arp_q=0
+tcp_local_q=0
+
+[LINK1]
+ip_local_q=0
+udp_local_q =0
+arp_q=0
+tcp_local_q=0
+
+[RXQ0.0]
+mempool=MEMPOOL0
+burst=16
+size=128
+
+[RXQ1.0]
+mempool=MEMPOOL0
+burst=16
+size=128
+
+[TXQ0.0]
+burst=16
+size=128
+dropless=yes
+
+[TXQ1.0]
+burst=16
+size=128
+dropless=no
+   
+[SWQ0]
+size=64
+
+[SWQ1]
+dropless=yes
+
+[SWQ2]
+cpu=0
+
+[SWQ3]
+dropless=yes
+
+[SWQ4]
+dropless=yes
+
+[TM1]
+cfg=config/tm_profile.cfg
+
+[SOURCE1]
+mempool=MEMPOOL3
+burst=64
+
+[MSGQ-REQ-PIPELINE1]
+size=16
+
+[MSGQ-RSP-PIPELINE1]
+size=16
+
+[MSGQ-REQ-PIPELINE2]
+size=16
+
+[MSGQ-RSP-PIPELINE2]
+size=16
+
+[MSGQ-REQ-PIPELINE3]
+size=16
+
+[MSGQ-RSP-PIPELINE3]
+size=16
+
+[MSGQ-REQ-PIPELINE4]
+size=16
+
+[MSGQ-RSP-PIPELINE4]
+size=16
+
+[MSGQ-REQ-PIPELINE5]
+size=16
+
+[MSGQ-RSP-PIPELINE5]
+size=16
+
+;[MSGQ-REQ-CORE-s1c5h]
+;size = 32
+
+;[MSGQ-RSP-CORE-s1c5h]
+;size = 32
diff --git a/examples/ip_pipeline/config/test.sh b/examples/ip_pipeline/config/test.sh
new file mode 100644
index 000..ca78a4a
--- /dev/null
+++ b/examples/ip_pipeline/config/test.sh
@@ -0,0 +1,6 @@
+p 1 ping
+p 1 stats port in 0
+p 1 stats table 0
+p 1 stats port out 1
+p 1 port in 0 disable
+p 1 port in 0 enable
diff --git a/examples/ip_pipeline/config/tm_profile.cfg b/examples/ip_pipeline/config/tm_profile.cfg
new file mode 100644
index 000..53edb67
--- /dev/null
+++ b/examples/ip_pipeline/config/tm_profile.cfg
@@ -0,0 +1,105 @@
+;   BSD LICENSE
+;
+;   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+;   All rights reserved.
+;
+;   Redistribution and use in source and binary forms, with or without
+;   modification, are permitted provided that the following conditions
+;   are met:
+;
+; * Redistributions of source code must retain the above copyright
+;   notice, this list of conditions

[dpdk-dev] [PATCH v2 06/11] ip_pipeline: added application thread

2015-06-25 Thread Maciej Gajdzica

Application thread runs pipelines on assigned cores.
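
The loop structure in thread.c can be tried out in isolation with the sketch below. It is a simplified illustration and not the code from this patch: the clock is simulated (one tick per iteration instead of reading the TSC), the per-thread deadline check is left out, and every sketch_ name is hypothetical. What it keeps is the idea visible in the diff: each pipeline runs on every iteration, while timers are only examined on every 16th iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-pipeline state, reduced to what the loop needs. */
struct sketch_pipeline {
	uint64_t deadline;     /* next time the timer step is due */
	uint64_t timer_period; /* interval between timer steps */
	uint64_t n_run;        /* number of run steps executed */
	uint64_t n_timer;      /* number of timer steps executed */
};

/*
 * Executes n_iter iterations over n pipelines. The clock advances by
 * one tick per iteration; this stands in for the TSC read.
 */
static void sketch_thread(struct sketch_pipeline *p, uint32_t n,
	uint32_t n_iter)
{
	uint64_t now = 0;
	uint32_t i, j;

	assert(p != NULL || n == 0);

	for (i = 0; i < n_iter; i++, now++) {
		/* Run step: every pipeline, every iteration */
		for (j = 0; j < n; j++)
			p[j].n_run++;

		/* Timer step: only looked at once per 16 iterations */
		if ((i & 0xF) != 0)
			continue;

		for (j = 0; j < n; j++) {
			if (p[j].deadline <= now) {
				p[j].n_timer++;
				p[j].deadline = now + p[j].timer_period;
			}
		}
	}
}
```

Because deadlines are only checked on multiples of 16 iterations, a timer can fire later than its nominal deadline; the sketch keeps that trade-off between timer accuracy and per-iteration overhead.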

Signed-off-by: Maciej Gajdzica 
---
 examples/ip_pipeline/Makefile |1 +
 examples/ip_pipeline/main.c   |6 +++
 examples/ip_pipeline/thread.c |  105 +
 3 files changed, 112 insertions(+)
 create mode 100644 examples/ip_pipeline/thread.c

diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
index 9ce80a8..f255338 100644
--- a/examples/ip_pipeline/Makefile
+++ b/examples/ip_pipeline/Makefile
@@ -53,6 +53,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_parse.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_parse_tm.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_check.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += init.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += thread.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += cpu_core_map.c

 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_common_be.c
diff --git a/examples/ip_pipeline/main.c b/examples/ip_pipeline/main.c
index ef68c86..862e2f2 100644
--- a/examples/ip_pipeline/main.c
+++ b/examples/ip_pipeline/main.c
@@ -52,5 +52,11 @@ main(int argc, char **argv)
/* Init */
app_init(&app);

+   /* Run-time */
+   rte_eal_mp_remote_launch(
+   app_thread,
+   (void *) &app,
+   CALL_MASTER);
+
return 0;
 }
diff --git a/examples/ip_pipeline/thread.c b/examples/ip_pipeline/thread.c
new file mode 100644
index 000..66b2a2b
--- /dev/null
+++ b/examples/ip_pipeline/thread.c
@@ -0,0 +1,105 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+
+#include "pipeline_common_be.h"
+#include "app.h"
+
+int app_thread(void *arg)
+{
+	struct app_params *app = (struct app_params *) arg;
+	uint32_t core_id = rte_lcore_id(), i, j;
+	struct app_thread_data *t = &app->thread_data[core_id];
+
+	for (i = 0; ; i++) {
+		/* Run regular pipelines */
+		for (j = 0; j < t->n_regular; j++) {
+			struct app_thread_pipeline_data *data = &t->regular[j];
+			struct pipeline *p = data->be;
+
+			rte_pipeline_run(p->p);
+		}
+
+		/* Run custom pipelines */
+		for (j = 0; j < t->n_custom; j++) {
+			struct app_thread_pipeline_data *data = &t->custom[j];
+
+			data->f_run(data->be);
+		}
+
+		/* Timer */
+		if ((i & 0xF) == 0) {
+			uint64_t time = rte_get_tsc_cycles();
+			uint64_t t_deadline = UINT64_MAX;
+
+			if (time < t->deadline)
+				continue;
+
+			/* Timer for regular pipelines */
+			for (j = 0; j < t->n_regular; j++) {
+				struct app_thread_pipeline_data *data = &t->regular[j];
+				uint64_t p_deadline = data->deadline;
+
+				if (p_deadline <= time) {
+					data->f_timer(data->be);
+					p_deadline = time + data->timer_period;
+					data->deadline = p_deadline;
+				}
+
+

[dpdk-dev] [PATCH v2 05/11] ip_pipeline: added master pipeline

2015-06-25 Thread Maciej Gajdzica

From: Jasvinder Singh 

The master pipeline is responsible for command line handling and
for communicating with all other pipelines via message queues. Removed
the cmdline.c file, as its functionality will be split over multiple
pipeline files.
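
As an illustration of the request/response exchange described above, here is a single-threaded sketch of a ping round trip over a pair of message queues. It is not code from this series: the sketch_ queue is a plain array ring without any synchronization, so it only shows the message flow, and the queue size of 16 is taken from the MSGQ sections of config/test.cfg in patch 07/11. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MSGQ_SIZE 16u /* power of two, as in size=16 in test.cfg */

/* Hypothetical single-threaded queue of message pointers. */
struct sketch_msgq {
	void *slots[SKETCH_MSGQ_SIZE];
	uint32_t head; /* count of messages written so far */
	uint32_t tail; /* count of messages read so far */
};

struct sketch_msg {
	uint32_t type;
	int status;
};

enum { SKETCH_MSG_PING = 0 };

/* Returns 0 on success, -1 if the queue is full. */
static int sketch_msgq_send(struct sketch_msgq *q, void *msg)
{
	assert(msg != NULL);
	if (q->head - q->tail == SKETCH_MSGQ_SIZE)
		return -1;
	q->slots[q->head++ % SKETCH_MSGQ_SIZE] = msg;
	return 0;
}

/* Returns the oldest message, or NULL if the queue is empty. */
static void *sketch_msgq_recv(struct sketch_msgq *q)
{
	if (q->head == q->tail)
		return NULL;
	return q->slots[q->tail++ % SKETCH_MSGQ_SIZE];
}

/*
 * Pipeline side: take one pending request, fill in the status and
 * hand the same buffer back on the response queue.
 */
static void sketch_pipeline_serve(struct sketch_msgq *req,
	struct sketch_msgq *rsp)
{
	struct sketch_msg *m = sketch_msgq_recv(req);

	if (m == NULL)
		return;

	m->status = (m->type == SKETCH_MSG_PING) ? 0 : -1;
	(void) sketch_msgq_send(rsp, m);
}
```

In this model the master side would call sketch_msgq_send() on the request queue for a command such as "p 1 ping", and later read the same buffer back from the response queue with the status filled in.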

Signed-off-by: Jasvinder Singh 
---
 examples/ip_pipeline/Makefile  |5 +
 examples/ip_pipeline/cmdline.c | 1976 
 examples/ip_pipeline/init.c|5 +
 3 files changed, 10 insertions(+), 1976 deletions(-)
 delete mode 100644 examples/ip_pipeline/cmdline.c

diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
index 213e879..9ce80a8 100644
--- a/examples/ip_pipeline/Makefile
+++ b/examples/ip_pipeline/Makefile
@@ -55,6 +55,11 @@ SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_check.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += init.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += cpu_core_map.c

+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_common_be.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_common_fe.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_master_be.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_master.c
+
 CFLAGS += -I$(SRCDIR) -I$(SRCDIR)/pipeline
 CFLAGS += -O3
 CFLAGS += $(WERROR_FLAGS) -Wno-error=unused-function -Wno-error=unused-variable
diff --git a/examples/ip_pipeline/cmdline.c b/examples/ip_pipeline/cmdline.c
deleted file mode 100644
index 3173fd0..000
--- a/examples/ip_pipeline/cmdline.c
+++ /dev/null
@@ -1,1976 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- * * Redistributions of source code must retain the above copyright
- *   notice, this list of conditions and the following disclaimer.
- * * Redistributions in binary form must reproduce the above copyright
- *   notice, this list of conditions and the following disclaimer in
- *   the documentation and/or other materials provided with the
- *   distribution.
- * * Neither the name of Intel Corporation nor the names of its
- *   contributors may be used to endorse or promote products derived
- *   from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include "main.h"
-
-#define IS_RULE_PRESENT(res, rule_key, table, type)\
-do {   \
-   struct app_rule *it;\
-   \
-   (res) = NULL;   \
-   TAILQ_FOREACH(it, &table, entries) {\
-   if (memcmp(&rule_key, &it->type.key, sizeof(rule_key)) == 0) {\
-   (res) = it; \
-   break;  \
-   }   \
-   }   \
-} while (0)
-
-/* Rules */
-static void
-app_init_rule_tables(void);
-
-TAILQ_HEAD(linked_list, app_rule) arp_table, routing_table, firewall_table,
-   flow_table;
-
-uint32_t n_arp_rules;
-uint32_t n_routing_rules;
-uint32_t n_firewall_rules;
-uint32_t n_flow_rules;
-
-struct app_arp_rule {
-   struct {
-   uint8_t out_iface;
-   uint32_t nh_ip;
-   } key;
-
-   struct ether_addr nh_arp;
-};
-
-struct app_routing_rule {
-   struct {
-   uint32_t ip;
-   uint8_t depth;
-   } key;
-
-   uint8_t port;
-   uint32_t nh_ip;
-};
-
-struct app_firewall_rule {
-   struct {
-   uint32_t src_ip;
-   uint32_t src_ip_mask;
-

[dpdk-dev] [PATCH v2 04/11] ip_pipeline: moved pipelines to separate folder

2015-06-25 Thread Maciej Gajdzica

Moved the pipelines to a separate folder, removed the pipelines that
are no longer needed, and modified the Makefile to match that change.

Signed-off-by: Maciej Gajdzica 
---
 examples/ip_pipeline/Makefile  |9 +-
 examples/ip_pipeline/pipeline/pipeline_common_be.c |  204 
 examples/ip_pipeline/pipeline/pipeline_common_be.h |  161 +++
 examples/ip_pipeline/pipeline/pipeline_common_fe.c | 1283 
 examples/ip_pipeline/pipeline/pipeline_common_fe.h |  248 
 examples/ip_pipeline/pipeline/pipeline_firewall.c  |  313 +
 .../pipeline/pipeline_flow_classification.c|  306 +
 examples/ip_pipeline/pipeline/pipeline_master.c|   47 +
 examples/ip_pipeline/pipeline/pipeline_master.h|   41 +
 examples/ip_pipeline/pipeline/pipeline_master_be.c |  146 +++
 examples/ip_pipeline/pipeline/pipeline_master_be.h |   41 +
 .../ip_pipeline/pipeline/pipeline_passthrough.c|  213 
 examples/ip_pipeline/pipeline/pipeline_routing.c   |  474 
 examples/ip_pipeline/pipeline_firewall.c   |  313 -
 .../ip_pipeline/pipeline_flow_classification.c |  306 -
 examples/ip_pipeline/pipeline_ipv4_frag.c  |  184 ---
 examples/ip_pipeline/pipeline_ipv4_ras.c   |  181 ---
 examples/ip_pipeline/pipeline_passthrough.c|  213 
 examples/ip_pipeline/pipeline_routing.c|  474 
 examples/ip_pipeline/pipeline_rx.c |  385 --
 examples/ip_pipeline/pipeline_tx.c |  283 -
 21 files changed, 3485 insertions(+), 2340 deletions(-)
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_common_be.c
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_common_be.h
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_common_fe.c
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_common_fe.h
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_firewall.c
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_flow_classification.c
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_master.c
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_master.h
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_master_be.c
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_master_be.h
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_passthrough.c
 create mode 100644 examples/ip_pipeline/pipeline/pipeline_routing.c
 delete mode 100644 examples/ip_pipeline/pipeline_firewall.c
 delete mode 100644 examples/ip_pipeline/pipeline_flow_classification.c
 delete mode 100644 examples/ip_pipeline/pipeline_ipv4_frag.c
 delete mode 100644 examples/ip_pipeline/pipeline_ipv4_ras.c
 delete mode 100644 examples/ip_pipeline/pipeline_passthrough.c
 delete mode 100644 examples/ip_pipeline/pipeline_routing.c
 delete mode 100644 examples/ip_pipeline/pipeline_rx.c
 delete mode 100644 examples/ip_pipeline/pipeline_tx.c

diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
index 59bea5b..213e879 100644
--- a/examples/ip_pipeline/Makefile
+++ b/examples/ip_pipeline/Makefile
@@ -36,11 +36,17 @@ endif
 # Default target, can be overridden by command line or environment
 RTE_TARGET ?= x86_64-native-linuxapp-gcc

+DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline
+
 include $(RTE_SDK)/mk/rte.vars.mk

 # binary name
 APP = ip_pipeline

+VPATH += $(SRCDIR)/pipeline
+
+INC += $(wildcard *.h) $(wildcard pipeline/*.h)
+
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) := main.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_parse.c
@@ -49,7 +55,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_check.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += init.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += cpu_core_map.c

+CFLAGS += -I$(SRCDIR) -I$(SRCDIR)/pipeline
 CFLAGS += -O3
-CFLAGS += $(WERROR_FLAGS)
+CFLAGS += $(WERROR_FLAGS) -Wno-error=unused-function -Wno-error=unused-variable

 include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/ip_pipeline/pipeline/pipeline_common_be.c b/examples/ip_pipeline/pipeline/pipeline_common_be.c
new file mode 100644
index 000..1cb107a
--- /dev/null
+++ b/examples/ip_pipeline/pipeline/pipeline_common_be.c
@@ -0,0 +1,204 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote produ

[dpdk-dev] [PATCH v2 03/11] ip_pipeline: modified init to match new params struct

2015-06-25 Thread Maciej Gajdzica

Following the changes in the config parser, the app params struct has
changed, which requires modifications to the initialization procedures.

Signed-off-by: Maciej Gajdzica 
---
 examples/ip_pipeline/Makefile |1 +
 examples/ip_pipeline/init.c   | 1550 +
 examples/ip_pipeline/main.c   |3 +
 3 files changed, 1121 insertions(+), 433 deletions(-)

diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
index bc50e71..59bea5b 100644
--- a/examples/ip_pipeline/Makefile
+++ b/examples/ip_pipeline/Makefile
@@ -46,6 +46,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) := main.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_parse.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_parse_tm.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_check.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += init.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += cpu_core_map.c

 CFLAGS += -O3
diff --git a/examples/ip_pipeline/init.c b/examples/ip_pipeline/init.c
index d79762f..d6b1768 100644
--- a/examples/ip_pipeline/init.c
+++ b/examples/ip_pipeline/init.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -32,561 +32,1245 @@
  */

 #include 
-#include 
-#include 
-#include 
-#include 
 #include 
-#include 
-#include 
-#include 
-#include 
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
+
 #include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
 #include 
-#include 
-#include 
-#include 
-#include 
-#include 
+#include 
 #include 
-#include 
-#include 
-#include 
-
-#include "main.h"
-
-#define NA APP_SWQ_INVALID
-
-struct app_params app = {
-   /* CPU cores */
-   .cores = {
-   {0, APP_CORE_MASTER, {15, 16, 17, NA, NA, NA, NA, NA},
-   {12, 13, 14, NA, NA, NA, NA, NA} },
-   {0, APP_CORE_RX, {NA, NA, NA, NA, NA, NA, NA, 12},
-   { 0,  1,  2,  3, NA, NA, NA, 15} },
-   {0, APP_CORE_FC, { 0,  1,  2,  3, NA, NA, NA, 13},
-   { 4,  5,  6,  7, NA, NA, NA, 16} },
-   {0, APP_CORE_RT, { 4,  5,  6,  7, NA, NA, NA, 14},
-   { 8,  9, 10, 11, NA, NA, NA, 17} },
-   {0, APP_CORE_TX, { 8,  9, 10, 11, NA, NA, NA, NA},
-   {NA, NA, NA, NA, NA, NA, NA, NA} },
-   },
-
-   /* Ports*/
-   .n_ports = APP_MAX_PORTS,
-   .rsz_hwq_rx = 128,
-   .rsz_hwq_tx = 512,
-   .bsz_hwq_rd = 64,
-   .bsz_hwq_wr = 64,
-
-   .port_conf = {
-   .rxmode = {
-   .split_hdr_size = 0,
-   .header_split   = 0, /* Header Split disabled */
-   .hw_ip_checksum = 1, /* IP checksum offload enabled */
-   .hw_vlan_filter = 0, /* VLAN filtering disabled */
-   .jumbo_frame= 1, /* Jumbo Frame Support enabled */
-   .max_rx_pkt_len = 9000, /* Jumbo Frame MAC pkt length */
-   .hw_strip_crc   = 0, /* CRC stripped by hardware */
-   },
-   .rx_adv_conf = {
-   .rss_conf = {
-   .rss_key = NULL,
-   .rss_hf = ETH_RSS_IP,
-   },
-   },
-   .txmode = {
-   .mq_mode = ETH_MQ_TX_NONE,
-   },
-   },
-
-   .rx_conf = {
-   .rx_thresh = {
-   .pthresh = 8,
-   .hthresh = 8,
-   .wthresh = 4,
-   },
-   .rx_free_thresh = 64,
-   .rx_drop_en = 0,
-   },
-
-   .tx_conf = {
-   .tx_thresh = {
-   .pthresh = 36,
-   .hthresh = 0,
-   .wthresh = 0,
-   },
-   .tx_free_thresh = 0,
-   .tx_rs_thresh = 0,
-   },
-
-   /* SWQs */
-   .rsz_swq = 128,
-   .bsz_swq_rd = 64,
-   .bsz_swq_wr = 64,
-
-   /* Buffer pool */
-   .pool_buffer_size = RTE_MBUF_DEFAULT_BUF_SIZE,
-   .pool_size = 32 * 1024,
-   .pool_cache_size = 256,
-
-   /* Message buffer pool */
-   .msg_pool_buffer_size = 256,
-   .msg_pool_size = 1024,
-   .msg_pool_cache_size = 64,
-
-   /* Rule tables */
-   .max_arp_rules = 1 << 10,
-   .max_firewall_rules = 1 << 5,
-   .max_routing_rules = 1 << 24,
-   .max_flow_rules = 1 << 24,
-
-   /* Application processing */
-   .ether_hdr_pop_push = 0,
-};
-
-struct app_core_params *
-app_get_core_params(uint32_t core_id)
-{
-   uint32_t i;
+#include 
+#include 

-   for (i = 0; i < RTE_MAX_LCORE; i++) {
-   struct app_core

[dpdk-dev] [PATCH v2 02/11] ip_pipeline: added config checks

2015-06-25 Thread Maciej Gajdzica

From: Jasvinder Singh 

After loading configuration from a file, data integrity is checked.

Signed-off-by: Jasvinder Singh 
---
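
For readers skimming the archive, the kind of integrity check added here can
be sketched as below. This is only an illustration modelled on the
check_mempools() hunk further down: struct mempool_cfg, is_power_of_2(),
mempool_cfg_check() and mempool_cfg_check_all() are hypothetical names, and
the sketch returns a status code instead of using APP_CHECK, whose definition
is not shown in this message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the mempool parameters checked by the patch. */
struct mempool_cfg {
	const char *name;
	uint32_t pool_size;
	uint32_t cache_size;
};

/* Nonzero when x has exactly one bit set. */
static int
is_power_of_2(uint32_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Return 0 when the entry passes every check, otherwise a negative code
 * identifying the first check that failed.
 */
static int
mempool_cfg_check(const struct mempool_cfg *p)
{
	if (p->pool_size == 0)
		return -1;	/* pool size is 0 */
	if (p->cache_size == 0)
		return -2;	/* cache size is 0 */
	if (!is_power_of_2(p->cache_size))
		return -3;	/* cache size is not a power of 2 */
	return 0;
}

/* Return the index of the first invalid entry, or -1 when all pass. */
static int
mempool_cfg_check_all(const struct mempool_cfg *cfg, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (mempool_cfg_check(&cfg[i]) != 0)
			return (int)i;
	return -1;
}
```

Running all checks once, right after parsing, keeps the later init code free
of per-field validation.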
 examples/ip_pipeline/Makefile   |1 +
 examples/ip_pipeline/config_check.c |  387 +++
 examples/ip_pipeline/main.c |2 +
 3 files changed, 390 insertions(+)
 create mode 100644 examples/ip_pipeline/config_check.c

diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
index b0feb4f..bc50e71 100644
--- a/examples/ip_pipeline/Makefile
+++ b/examples/ip_pipeline/Makefile
@@ -45,6 +45,7 @@ APP = ip_pipeline
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) := main.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_parse.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_parse_tm.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_check.c
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += cpu_core_map.c

 CFLAGS += -O3
diff --git a/examples/ip_pipeline/config_check.c b/examples/ip_pipeline/config_check.c
new file mode 100644
index 000..972d0e7
--- /dev/null
+++ b/examples/ip_pipeline/config_check.c
@@ -0,0 +1,387 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+
+#include "app.h"
+
+static void
+check_mempools(struct app_params *app)
+{
+   uint32_t i;
+   
+   for (i = 0; i < app->n_mempools; i++) {
+   struct app_mempool_params *p = &app->mempool_params[i];
+
+   APP_CHECK((p->pool_size > 0),
+   "Mempool %s size is 0\n", p->name);
+
+   APP_CHECK((p->cache_size > 0),
+   "Mempool %s cache size is 0\n", p->name);
+
+   APP_CHECK(rte_is_power_of_2(p->cache_size),
+   "Mempool %s cache size not a power of 2\n", p->name);
+   }
+}
+
+static void
+check_links(struct app_params *app)
+{
+   uint32_t i;
+
+   /* Check that number of links matches the port mask */
+   APP_CHECK((app->n_links == __builtin_popcountll(app->port_mask)),
+   "Not enough links provided in the PORT_MASK\n");
+
+   for (i = 0; i< app->n_links; i++) {
+   struct app_link_params *link = &app->link_params[i];
+   uint32_t rxq_max, n_rxq, n_txq, link_id, i;
+
+   APP_PARAM_GET_ID(link, "LINK", link_id);
+
+   /* Check that link RXQs are contiguous */
+   rxq_max = 0;
+   if (link->arp_q > rxq_max)
+   rxq_max = link->arp_q;
+   if (link->tcp_syn_local_q > rxq_max)
+   rxq_max = link->tcp_syn_local_q;
+   if (link->ip_local_q > rxq_max)
+   rxq_max = link->ip_local_q;
+   if (link->tcp_local_q > rxq_max)
+   rxq_max = link->tcp_local_q;
+   if (link->udp_local_q > rxq_max)
+   rxq_max = link->udp_local_q;
+   if (link->sctp_local_q > rxq_max)
+   rxq_max = link->sctp_local_q;
+
+   for (i = 1; i <= rxq_max; i++)
+   APP_CHECK(((link->arp_q == i) ||
+   (link->tcp_syn_local_q == i) ||
+   (link->ip_local_q == i) ||
+   (link->tcp_local_q == i) ||
+   (link->udp_local_q == i) ||
+

[dpdk-dev] [PATCH v2 01/11] ip_pipeline: add parsing for config files with new syntax

2015-06-25 Thread Maciej Gajdzica

From: Pawel Wodkowski 

A new config file syntax is needed for the ip_pipeline example
enhancements. Some old files are temporarily disabled in the Makefile.
This is part of a bigger change.

Signed-off-by: Pawel Wodkowski 
---
 examples/ip_pipeline/Makefile  |   17 +-
 examples/ip_pipeline/app.h |  850 
 examples/ip_pipeline/config.c  |  419 --
 examples/ip_pipeline/config_parse.c| 2272 
 examples/ip_pipeline/config_parse_tm.c |  373 ++
 examples/ip_pipeline/cpu_core_map.c|  465 +++
 examples/ip_pipeline/cpu_core_map.h|   69 +
 examples/ip_pipeline/main.c|  130 +-
 examples/ip_pipeline/main.h|  298 -
 examples/ip_pipeline/pipeline.h|   87 ++
 examples/ip_pipeline/pipeline_be.h |  253 
 11 files changed, 4380 insertions(+), 853 deletions(-)
 create mode 100644 examples/ip_pipeline/app.h
 delete mode 100644 examples/ip_pipeline/config.c
 create mode 100644 examples/ip_pipeline/config_parse.c
 create mode 100644 examples/ip_pipeline/config_parse_tm.c
 create mode 100644 examples/ip_pipeline/cpu_core_map.c
 create mode 100644 examples/ip_pipeline/cpu_core_map.h
 delete mode 100644 examples/ip_pipeline/main.h
 create mode 100644 examples/ip_pipeline/pipeline.h
 create mode 100644 examples/ip_pipeline/pipeline_be.h

diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
index e70fdc7..b0feb4f 100644
--- a/examples/ip_pipeline/Makefile
+++ b/examples/ip_pipeline/Makefile
@@ -43,20 +43,9 @@ APP = ip_pipeline

 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) := main.c
-SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config.c
-SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += init.c
-SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += cmdline.c
-SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_rx.c
-SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_tx.c
-SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_flow_classification.c
-SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_routing.c
-SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_passthrough.c
-SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_ipv4_frag.c
-SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_ipv4_ras.c
-
-ifeq ($(CONFIG_RTE_LIBRTE_ACL),y)
-SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += pipeline_firewall.c
-endif
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_parse.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += config_parse_tm.c
+SRCS-$(CONFIG_RTE_LIBRTE_PIPELINE) += cpu_core_map.c

 CFLAGS += -O3
 CFLAGS += $(WERROR_FLAGS)
diff --git a/examples/ip_pipeline/app.h b/examples/ip_pipeline/app.h
new file mode 100644
index 000..b6b0700
--- /dev/null
+++ b/examples/ip_pipeline/app.h
@@ -0,0 +1,850 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_APP_H__
+#define __INCLUDE_APP_H__
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include "cpu_core_map.h"
+#include "pipeline.h"
+
+#define APP_PARAM_NAME_SIZE  PIPELINE_NAME_SIZE
+
+struct app_mempool_params {
+   char name[APP_PARAM_NAME_SIZE];
+   uint32_t parsed; /* Used to check if object was parsed or only referenced */
+   uint32_t buffer_size;
+   uint32_t pool_size;
+   uint32_t cache_size;
+   uint32_t cpu_socket_id;
+};
+
+struct app_link_par

[dpdk-dev] [PATCH v2 00/11] ip_pipeline: ip_pipeline application enhancements

2015-06-25 Thread Maciej Gajdzica

This patchset enhances the functionality of the ip_pipeline application. A new
config file syntax is introduced, so the parser has changed. The structure of
the application has changed as well: every global variable is now stored in
app_struct in app.h. The syntax of the pipeline CLI commands has changed, and
the implementation of the CLI commands for each pipeline has moved to a
separate file.

Changes in v2:
- renamed some files
- added more config files
- reworked flow classification pipeline implementation
- fixed some bugs

Daniel Mrzyglod (1):
  ip_pipeline: added new implementation of firewall pipeline

Jasvinder Singh (3):
  ip_pipeline: added config checks
  ip_pipeline: added master pipeline
  ip_pipeline: added new implementation of passthrough pipeline

Maciej Gajdzica (5):
  ip_pipeline: modified init to match new params struct
  ip_pipeline: moved pipelines to separate folder
  ip_pipeline: added application thread
  ip_pipeline: moved config files to separate folder
  ip_pipeline: added new implementation of flow classification pipeline

Pawel Wodkowski (2):
  ip_pipeline: add parsing for config files with new syntax
  ip_pipeline: added new implementation of routing pipeline

 examples/ip_pipeline/Makefile  |   36 +-
 examples/ip_pipeline/app.h |  850 
 examples/ip_pipeline/cmdline.c | 1976 -
 examples/ip_pipeline/config.c  |  419 
 examples/ip_pipeline/config/fc_ipv4_5tuple.cfg |   23 +
 examples/ip_pipeline/config/fc_ipv4_5tuple.sh  |9 +
 examples/ip_pipeline/config/fc_ipv6_5tuple.cfg |   23 +
 examples/ip_pipeline/config/fc_ipv6_5tuple.sh  |8 +
 examples/ip_pipeline/config/fc_qinq.cfg|   23 +
 examples/ip_pipeline/config/fc_qinq.sh |8 +
 examples/ip_pipeline/config/fw.cfg |   11 +
 examples/ip_pipeline/config/fw.sh  |   13 +
 examples/ip_pipeline/config/ip_pipeline.cfg|9 +
 examples/ip_pipeline/config/ip_pipeline.sh |5 +
 examples/ip_pipeline/config/pt1.cfg|9 +
 examples/ip_pipeline/config/pt2.cfg|   15 +
 examples/ip_pipeline/config/pt3.cfg|   21 +
 examples/ip_pipeline/config/rt.cfg |   13 +
 examples/ip_pipeline/config/rt.sh  |   18 +
 examples/ip_pipeline/config/test.cfg   |  164 ++
 examples/ip_pipeline/config/test.sh|6 +
 examples/ip_pipeline/config/tm_profile.cfg |  105 +
 examples/ip_pipeline/config_check.c|  387 
 examples/ip_pipeline/config_parse.c| 2272 
 examples/ip_pipeline/config_parse_tm.c |  373 
 examples/ip_pipeline/cpu_core_map.c|  465 
 examples/ip_pipeline/cpu_core_map.h|   69 +
 examples/ip_pipeline/init.c| 1563 ++
 examples/ip_pipeline/ip_pipeline.cfg   |   56 -
 examples/ip_pipeline/ip_pipeline.sh|   18 -
 examples/ip_pipeline/main.c|  137 +-
 examples/ip_pipeline/main.h|  298 ---
 examples/ip_pipeline/pipeline.h|   87 +
 examples/ip_pipeline/pipeline/hash_func.h  |  351 +++
 .../ip_pipeline/pipeline/pipeline_actions_common.h |  119 +
 examples/ip_pipeline/pipeline/pipeline_common_be.c |  204 ++
 examples/ip_pipeline/pipeline/pipeline_common_be.h |  161 ++
 examples/ip_pipeline/pipeline/pipeline_common_fe.c | 1283 +++
 examples/ip_pipeline/pipeline/pipeline_common_fe.h |  248 +++
 examples/ip_pipeline/pipeline/pipeline_firewall.c  |  960 +
 examples/ip_pipeline/pipeline/pipeline_firewall.h  |   63 +
 .../ip_pipeline/pipeline/pipeline_firewall_be.c|  701 ++
 .../ip_pipeline/pipeline/pipeline_firewall_be.h|  138 ++
 .../pipeline/pipeline_flow_classification.c| 1927 +
 .../pipeline/pipeline_flow_classification.h|  106 +
 .../pipeline/pipeline_flow_classification_be.c |  569 +
 .../pipeline/pipeline_flow_classification_be.h |  140 ++
 examples/ip_pipeline/pipeline/pipeline_master.c|   47 +
 examples/ip_pipeline/pipeline/pipeline_master.h|   41 +
 examples/ip_pipeline/pipeline/pipeline_master_be.c |  146 ++
 examples/ip_pipeline/pipeline/pipeline_master_be.h |   41 +
 .../ip_pipeline/pipeline/pipeline_passthrough.c|   47 +
 .../ip_pipeline/pipeline/pipeline_passthrough.h|   41 +
 .../ip_pipeline/pipeline/pipeline_passthrough_be.c |  741 +++
 .../ip_pipeline/pipeline/pipeline_passthrough_be.h |   41 +
 examples/ip_pipeline/pipeline/pipeline_routing.c   | 1537 +
 examples/ip_pipeline/pipeline/pipeline_routing.h   |   99 +
 .../ip_pipeline/pipeline/pipeline_routing_be.c |  836 +++
 .../ip_pipeline/pipeline/pipeline_routing_be.h |  230 ++
 examples/ip_pipeline/pipeline_be.h |  253 +++
 examples/ip_pipeline/pipe

[dpdk-dev] [PATCH] ethdev: fix checking for tx_free_thresh

2015-06-25 Thread Ananyev, Konstantin



> -----Original Message-----
> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
> Sent: Tuesday, June 23, 2015 7:43 PM
> To: dev at dpdk.org
> Cc: Zoltan Kiss; Ananyev, Konstantin
> Subject: [PATCH] ethdev: fix checking for tx_free_thresh
> 
> This parameter is not consistent between the drivers: some use it as
> rte_eth_tx_burst() requires, some release buffers when the number of free
> descriptors drop below this value.
> Let's use it as most fast-path code does, which is the latter, and update
> comments throughout the code to reflect that.
> 
> Signed-off-by: Zoltan Kiss 
> ---

Acked-by: Konstantin Ananyev 

> --
> 1.9.1
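
The semantics the patch settles on (reclaim transmitted buffers once the
number of free descriptors drops below tx_free_thresh) can be sketched as
follows. This is a toy model under stated assumptions, not driver code:
struct txq_model, txq_reclaim() and txq_burst() are hypothetical names, and
completion by the device is simulated by setting nb_done by hand.

```c
#include <stdint.h>

/* Hypothetical TX queue model; not a real DPDK structure. */
struct txq_model {
	uint16_t nb_free;	 /* descriptors currently free */
	uint16_t nb_done;	 /* completed by the device, not yet reclaimed */
	uint16_t tx_free_thresh; /* reclaim when nb_free drops below this */
};

/* Return completed descriptors to the free pool. */
static uint16_t
txq_reclaim(struct txq_model *q)
{
	uint16_t n = q->nb_done;

	q->nb_free = (uint16_t)(q->nb_free + n);
	q->nb_done = 0;
	return n;
}

/*
 * Queue up to n packets. Buffers are reclaimed first, but only when the
 * number of free descriptors has dropped below tx_free_thresh.
 * Returns the number of packets actually queued.
 */
static uint16_t
txq_burst(struct txq_model *q, uint16_t n)
{
	if (q->nb_free < q->tx_free_thresh)
		txq_reclaim(q);

	if (n > q->nb_free)
		n = q->nb_free;
	q->nb_free = (uint16_t)(q->nb_free - n);
	return n;
}
```

With this reading, a larger threshold means buffers are reclaimed earlier and
more often.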

[dpdk-dev] [PATCH v5 5/5] eal: Fix uio mapping differences between linuxapp and bsdapp

2015-06-25 Thread Tetsuya Mukawa

From: "Tetsuya.Mukawa" 

This patch fixes the following.
- bsdapp
 - Use map_id in pci_uio_map_resource().
 - Fix the interface of pci_map_resource().
 - Move the path variable of the mapped_pci_resource structure to pci_map.
- linuxapp
 - Remove a redundant error message.

'pci_uio_map_resource()' is implemented in both linuxapp and bsdapp,
but the interfaces differ. This patch changes the bsdapp function
to behave the same as the linuxapp one. After applying it, the file
descriptor must be opened and closed outside of pci_map_resource().

Signed-off-by: Tetsuya Mukawa 
---
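
The interface change can be sketched as follows: the mapping helper takes an
already-open file descriptor, and the caller owns open() and close(). This is
a minimal sketch under assumptions, not the EAL code: map_resource() and
map_file() are hypothetical names, and error logging is left out.

```c
#define _POSIX_C_SOURCE 200809L

#include <stddef.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper mirroring the new interface: the fd belongs to the
 * caller. Returns MAP_FAILED on error, exactly as mmap() does.
 */
static void *
map_resource(void *requested_addr, int fd, off_t offset, size_t size,
	     int additional_flags)
{
	return mmap(requested_addr, size, PROT_READ | PROT_WRITE,
		    MAP_SHARED | additional_flags, fd, offset);
}

/*
 * Caller side: open, map, close. A mapping created by mmap() stays valid
 * after the descriptor is closed, so close() can follow immediately.
 */
static void *
map_file(const char *path, size_t size)
{
	void *addr;
	int fd = open(path, O_RDWR);

	if (fd < 0)
		return MAP_FAILED;

	addr = map_resource(NULL, fd, 0, size, 0);
	close(fd);
	return addr;
}
```

Keeping open() and close() in the caller lets both platforms share one
mapping helper while each keeps its own way of finding the device file.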
 lib/librte_eal/bsdapp/eal/eal_pci.c   | 118 ++
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c |  21 +++---
 2 files changed, 80 insertions(+), 59 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c b/lib/librte_eal/bsdapp/eal/eal_pci.c
index 8261e09..06c564f 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -85,6 +85,7 @@

 struct pci_map {
void *addr;
+   char *path;
uint64_t offset;
uint64_t size;
uint64_t phaddr;
@@ -99,7 +100,7 @@ struct mapped_pci_resource {

struct rte_pci_addr pci_addr;
char path[PATH_MAX];
-   size_t nb_maps;
+   int nb_maps;
struct pci_map maps[PCI_MAX_RESOURCE];
 };

@@ -121,47 +122,30 @@ pci_unbind_kernel_driver(struct rte_pci_device *dev __rte_unused)

 /* map a particular resource from a file */
 static void *
-pci_map_resource(void *requested_addr, const char *devname, off_t offset,
-size_t size)
+pci_map_resource(void *requested_addr, int fd, off_t offset, size_t size,
+int additional_flags)
 {
-   int fd;
void *mapaddr;

-   /*
-* open devname, to mmap it
-*/
-   fd = open(devname, O_RDWR);
-   if (fd < 0) {
-   RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
-   devname, strerror(errno));
-   goto fail;
-   }
-
/* Map the PCI memory resource of device */
mapaddr = mmap(requested_addr, size, PROT_READ | PROT_WRITE,
-   MAP_SHARED, fd, offset);
-   close(fd);
-   if (mapaddr == MAP_FAILED ||
-   (requested_addr != NULL && mapaddr != requested_addr)) {
-   RTE_LOG(ERR, EAL, "%s(): cannot mmap(%s(%d), %p, 0x%lx, 0x%lx):"
-   " %s (%p)\n", __func__, devname, fd, requested_addr,
+   MAP_SHARED | additional_flags, fd, offset);
+   if (mapaddr == MAP_FAILED) {
+   RTE_LOG(ERR, EAL,
+   "%s(): cannot mmap(%d, %p, 0x%lx, 0x%lx): %s (%p)\n",
+   __func__, fd, requested_addr,
(unsigned long)size, (unsigned long)offset,
strerror(errno), mapaddr);
-   goto fail;
-   }
-
-   RTE_LOG(DEBUG, EAL, "  PCI memory mapped at %p\n", mapaddr);
+   } else
+   RTE_LOG(DEBUG, EAL, "  PCI memory mapped at %p\n", mapaddr);

return mapaddr;
-
-fail:
-   return NULL;
 }

 static int
 pci_uio_map_secondary(struct rte_pci_device *dev)
 {
-   size_t i;
+   int i, fd;
struct mapped_pci_resource *uio_res;
struct mapped_pci_res_list *uio_res_list =
RTE_TAILQ_CAST(rte_uio_tailq.head, mapped_pci_res_list);
@@ -169,19 +153,34 @@ pci_uio_map_secondary(struct rte_pci_device *dev)
TAILQ_FOREACH(uio_res, uio_res_list, next) {

/* skip this element if it doesn't match our PCI address */
-   if (memcmp(&uio_res->pci_addr, &dev->addr, sizeof(dev->addr)))
+   if (rte_eal_compare_pci_addr(&uio_res->pci_addr, &dev->addr))
continue;

for (i = 0; i != uio_res->nb_maps; i++) {
-   if (pci_map_resource(uio_res->maps[i].addr,
-uio_res->path,
-(off_t)uio_res->maps[i].offset,
-(size_t)uio_res->maps[i].size)
-   != uio_res->maps[i].addr) {
+   /*
+* open devname, to mmap it
+*/
+   fd = open(uio_res->maps[i].path, O_RDWR);
+   if (fd < 0) {
+   RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
+   uio_res->maps[i].path, strerror(errno));
+   return -1;
+   }
+
+   void *mapaddr = pci_map_resource(uio_res->maps[i].addr,
+   fd, (off_t)uio_res->maps[i].offset,
+   (size_t)uio_res->maps[i].size, 0);
+   if (mapaddr != uio_res->maps[i].addr) {
RTE_LOG(ERR, EAL,
-   "Cannot mmap

[dpdk-dev] [PATCH v5 4/5] eal/bsdapp: Change names of pci related data structure

2015-06-25 Thread Tetsuya Mukawa

From: "Tetsuya.Mukawa" 

To prepare for merging the PCI code of linuxapp and bsdapp, this patch
renames the following:
 - uio_map to pci_map
 - uio_resource to mapped_pci_resource
 - uio_res_list to mapped_pci_res_list

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/bsdapp/eal/eal_pci.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c b/lib/librte_eal/bsdapp/eal/eal_pci.c
index b071f07..8261e09 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -83,7 +83,7 @@
  * enabling bus master.
  */

-struct uio_map {
+struct pci_map {
void *addr;
uint64_t offset;
uint64_t size;
@@ -94,16 +94,16 @@ struct uio_map {
  * For multi-process we need to reproduce all PCI mappings in secondary
  * processes, so save them in a tailq.
  */
-struct uio_resource {
-   TAILQ_ENTRY(uio_resource) next;
+struct mapped_pci_resource {
+   TAILQ_ENTRY(mapped_pci_resource) next;

struct rte_pci_addr pci_addr;
char path[PATH_MAX];
size_t nb_maps;
-   struct uio_map maps[PCI_MAX_RESOURCE];
+   struct pci_map maps[PCI_MAX_RESOURCE];
 };

-TAILQ_HEAD(uio_res_list, uio_resource);
+TAILQ_HEAD(mapped_pci_res_list, mapped_pci_resource);

 static struct rte_tailq_elem rte_uio_tailq = {
.name = "UIO_RESOURCE_LIST",
@@ -162,9 +162,9 @@ static int
 pci_uio_map_secondary(struct rte_pci_device *dev)
 {
size_t i;
-   struct uio_resource *uio_res;
-   struct uio_res_list *uio_res_list =
-   RTE_TAILQ_CAST(rte_uio_tailq.head, uio_res_list);
+   struct mapped_pci_resource *uio_res;
+   struct mapped_pci_res_list *uio_res_list =
+   RTE_TAILQ_CAST(rte_uio_tailq.head, mapped_pci_res_list);

TAILQ_FOREACH(uio_res, uio_res_list, next) {

@@ -201,10 +201,10 @@ pci_uio_map_resource(struct rte_pci_device *dev)
uint64_t offset;
uint64_t pagesz;
struct rte_pci_addr *loc = &dev->addr;
-   struct uio_resource *uio_res;
-   struct uio_res_list *uio_res_list =
-   RTE_TAILQ_CAST(rte_uio_tailq.head, uio_res_list);
-   struct uio_map *maps;
+   struct mapped_pci_resource *uio_res;
+   struct mapped_pci_res_list *uio_res_list =
+   RTE_TAILQ_CAST(rte_uio_tailq.head, mapped_pci_res_list);
+   struct pci_map *maps;

dev->intr_handle.fd = -1;
dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
-- 
2.1.4

[dpdk-dev] [PATCH v5 3/5] eal: Fix memory leaks and needless increment of pci_map_addr

2015-06-25 Thread Tetsuya Mukawa

From: "Tetsuya.Mukawa" 

This patch fixes the following memory leaks.
- When open() fails, uio_res and fds are not freed in
  pci_uio_map_resource().
- When pci_map_resource() fails but path was allocated correctly,
  path and fds are not freed in pci_uio_map_resource().
- When pci_uio_unmap() is called, path should be freed.

It also fixes the following.
- When pci_map_resource() fails, mapaddr is MAP_FAILED.
  In this case, pci_map_addr should not be incremented in
  pci_uio_map_resource().
- To shrink the code, move close().
- Remove the fail variable.

Signed-off-by: Tetsuya Mukawa 
---
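
The cleanup style used in the hunks below (stacked goto labels that release
resources in reverse order of acquisition) can be sketched as follows. This
is an illustration under assumptions, not the EAL code: struct res, xalloc(),
xfree(), res_create() and res_destroy() are hypothetical names, malloc()
stands in for both rte_malloc() and the mmap step, and the fail_at parameter
exists only to exercise each error path.

```c
#include <stdlib.h>

/* Hypothetical record; stands in for uio_res and one of its maps. */
struct res {
	char *path;
	void *mapping;
};

/* Counters so that a test can confirm every allocation is released. */
static int n_alloc, n_free;

static void *
xalloc(size_t n)
{
	void *p = malloc(n);

	if (p != NULL)
		n_alloc++;
	return p;
}

static void
xfree(void *p)
{
	if (p != NULL) {
		n_free++;
		free(p);
	}
}

/* fail_at: 0 = succeed, 1 = fail before path, 2 = fail at the mapping. */
static struct res *
res_create(size_t path_len, int fail_at)
{
	struct res *r = xalloc(sizeof(*r));

	if (r == NULL)
		return NULL;

	if (fail_at == 1)
		goto free_res;
	r->path = xalloc(path_len + 1);
	if (r->path == NULL)
		goto free_res;

	r->mapping = (fail_at == 2) ? NULL : xalloc(64);
	if (r->mapping == NULL)
		goto free_path;

	return r;

	/* Labels fall through, undoing the steps in reverse order. */
free_path:
	xfree(r->path);
free_res:
	xfree(r);
	return NULL;
}

static void
res_destroy(struct res *r)
{
	if (r == NULL)
		return;
	xfree(r->mapping);
	xfree(r->path);
	xfree(r);
}
```

Each failure site jumps to the label matching what has been acquired so far,
so a newly added step needs one new label rather than a copy of the whole
cleanup sequence.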
 lib/librte_eal/bsdapp/eal/eal_pci.c   | 14 +++--
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 51 ---
 2 files changed, 44 insertions(+), 21 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c b/lib/librte_eal/bsdapp/eal/eal_pci.c
index 8e24fd1..b071f07 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -235,7 +235,7 @@ pci_uio_map_resource(struct rte_pci_device *dev)
if ((uio_res = rte_zmalloc("UIO_RES", sizeof (*uio_res), 0)) == NULL) {
RTE_LOG(ERR, EAL,
"%s(): cannot store uio mmap details\n", __func__);
-   return -1;
+   goto close_fd;
}

snprintf(uio_res->path, sizeof(uio_res->path), "%s", devname);
@@ -262,8 +262,7 @@ pci_uio_map_resource(struct rte_pci_device *dev)
(mapaddr = pci_map_resource(NULL, devname, (off_t)offset,
(size_t)maps[j].size)
) == NULL) {
-   rte_free(uio_res);
-   return -1;
+   goto free_uio_res;
}

maps[j].addr = mapaddr;
@@ -274,6 +273,15 @@ pci_uio_map_resource(struct rte_pci_device *dev)
TAILQ_INSERT_TAIL(uio_res_list, uio_res, next);

return 0;
+
+free_uio_res:
+   rte_free(uio_res);
+close_fd:
+   close(dev->intr_handle.fd);
+   dev->intr_handle.fd = -1;
+   dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
+
+   return -1;
 }

 /* Scan one pci sysfs entry, and fill the devices list from it. */
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
index 34316b6..2dd83d3 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
@@ -308,7 +308,7 @@ pci_uio_map_resource(struct rte_pci_device *dev)
if (dev->intr_handle.uio_cfg_fd < 0) {
RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
cfgname, strerror(errno));
-   return -1;
+   goto close_fd;
}

if (dev->kdrv == RTE_KDRV_IGB_UIO)
@@ -328,7 +328,7 @@ pci_uio_map_resource(struct rte_pci_device *dev)
if (uio_res == NULL) {
RTE_LOG(ERR, EAL,
"%s(): cannot store uio mmap details\n", __func__);
-   return -1;
+   goto close_fd;
}

snprintf(uio_res->path, sizeof(uio_res->path), "%s", devname);
@@ -338,7 +338,6 @@ pci_uio_map_resource(struct rte_pci_device *dev)
maps = uio_res->maps;
for (i = 0, map_idx = 0; i != PCI_MAX_RESOURCE; i++) {
int fd;
-   int fail = 0;

/* skip empty BAR */
phaddr = dev->mem_resource[i].phys_addr;
@@ -352,6 +351,11 @@ pci_uio_map_resource(struct rte_pci_device *dev)
loc->domain, loc->bus, loc->devid, loc->function,
i);

+   /* allocate memory to keep path */
+   maps[map_idx].path = rte_malloc(NULL, strlen(devname) + 1, 0);
+   if (maps[map_idx].path == NULL)
+   goto free_uio_res;
+
/*
 * open resource file, to mmap it
 */
@@ -359,7 +363,8 @@ pci_uio_map_resource(struct rte_pci_device *dev)
if (fd < 0) {
RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
devname, strerror(errno));
-   return -1;
+   rte_free(maps[map_idx].path);
+   goto free_uio_res;
}

/* try mapping somewhere close to the end of hugepages */
@@ -368,23 +373,15 @@ pci_uio_map_resource(struct rte_pci_device *dev)

mapaddr = pci_map_resource(pci_map_addr, fd, 0,
(size_t)dev->mem_resource[i].len, 0);
-   if (mapaddr == MAP_FAILED)
-   fail = 1;
+   close(fd);
+   if (mapaddr == MAP_FAILED) {
+   rte_free(maps[map_idx].path);
+   goto free_uio_res;
+   }

pci_map_addr = RTE_PTR_ADD(mapaddr,
(size_t)dev->mem_resource

[dpdk-dev] [PATCH v5 2/5] eal: Close file descriptor of uio configuration

2015-06-25 Thread Tetsuya Mukawa

From: "Tetsuya.Mukawa" 

When pci_uio_unmap_resource() is called, a file descriptor that is used
for uio configuration should be closed.

Signed-off-by: Tetsuya Mukawa 
Acked-by: Stephen Hemminger 
---
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
index 5d3354d..34316b6 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
@@ -464,8 +464,12 @@ pci_uio_unmap_resource(struct rte_pci_device *dev)

/* close fd if in primary process */
close(dev->intr_handle.fd);
-
dev->intr_handle.fd = -1;
+
+   /* close cfg_fd if in primary process */
+   close(dev->intr_handle.uio_cfg_fd);
+   dev->intr_handle.uio_cfg_fd = -1;
+
dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
 }
 #endif /* RTE_LIBRTE_EAL_HOTPLUG */
-- 
2.1.4

[dpdk-dev] [PATCH v5 1/5] eal: Fix coding style of eal_pci.c and eal_pci_uio.c

2015-06-25 Thread Tetsuya Mukawa

From: "Tetsuya.Mukawa" 

This patch fixes the coding style of the following files in linuxapp and bsdapp.
 - eal_pci.c
 - eal_pci_uio.c

Signed-off-by: Tetsuya Mukawa 
Acked-by: Stephen Hemminger 
---
 lib/librte_eal/bsdapp/eal/eal_pci.c   | 12 +++-
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 12 
 2 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c b/lib/librte_eal/bsdapp/eal/eal_pci.c
index 2df5c1c..8e24fd1 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -161,9 +161,10 @@ fail:
 static int
 pci_uio_map_secondary(struct rte_pci_device *dev)
 {
-size_t i;
-struct uio_resource *uio_res;
-   struct uio_res_list *uio_res_list = RTE_TAILQ_CAST(rte_uio_tailq.head, uio_res_list);
+   size_t i;
+   struct uio_resource *uio_res;
+   struct uio_res_list *uio_res_list =
+   RTE_TAILQ_CAST(rte_uio_tailq.head, uio_res_list);

TAILQ_FOREACH(uio_res, uio_res_list, next) {

@@ -201,7 +202,8 @@ pci_uio_map_resource(struct rte_pci_device *dev)
uint64_t pagesz;
struct rte_pci_addr *loc = &dev->addr;
struct uio_resource *uio_res;
-   struct uio_res_list *uio_res_list = RTE_TAILQ_CAST(rte_uio_tailq.head, uio_res_list);
+   struct uio_res_list *uio_res_list =
+   RTE_TAILQ_CAST(rte_uio_tailq.head, uio_res_list);
struct uio_map *maps;

dev->intr_handle.fd = -1;
@@ -311,7 +313,7 @@ pci_scan_one(int dev_pci_fd, struct pci_conf *conf)
/* FreeBSD has no NUMA support (yet) */
dev->numa_node = 0;

-/* parse resources */
+   /* parse resources */
switch (conf->pc_hdr & PCIM_HDRTYPE) {
case PCIM_HDRTYPE_NORMAL:
max = PCIR_MAX_BAR_0;
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
index b5116a7..5d3354d 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
@@ -92,7 +92,8 @@ pci_uio_map_secondary(struct rte_pci_device *dev)
 {
int fd, i;
struct mapped_pci_resource *uio_res;
-   struct mapped_pci_res_list *uio_res_list = RTE_TAILQ_CAST(rte_uio_tailq.head, mapped_pci_res_list);
+   struct mapped_pci_res_list *uio_res_list =
+   RTE_TAILQ_CAST(rte_uio_tailq.head, mapped_pci_res_list);

TAILQ_FOREACH(uio_res, uio_res_list, next) {

@@ -272,7 +273,8 @@ pci_uio_map_resource(struct rte_pci_device *dev)
uint64_t phaddr;
struct rte_pci_addr *loc = &dev->addr;
struct mapped_pci_resource *uio_res;
-   struct mapped_pci_res_list *uio_res_list = 
RTE_TAILQ_CAST(rte_uio_tailq.head, mapped_pci_res_list);
+   struct mapped_pci_res_list *uio_res_list =
+   RTE_TAILQ_CAST(rte_uio_tailq.head, mapped_pci_res_list);
struct pci_map *maps;

dev->intr_handle.fd = -1;
@@ -417,7 +419,8 @@ static struct mapped_pci_resource *
 pci_uio_find_resource(struct rte_pci_device *dev)
 {
struct mapped_pci_resource *uio_res;
-   struct mapped_pci_res_list *uio_res_list = 
RTE_TAILQ_CAST(rte_uio_tailq.head, mapped_pci_res_list);
+   struct mapped_pci_res_list *uio_res_list =
+   RTE_TAILQ_CAST(rte_uio_tailq.head, mapped_pci_res_list);

if (dev == NULL)
return NULL;
@@ -436,7 +439,8 @@ void
 pci_uio_unmap_resource(struct rte_pci_device *dev)
 {
struct mapped_pci_resource *uio_res;
-   struct mapped_pci_res_list *uio_res_list = 
RTE_TAILQ_CAST(rte_uio_tailq.head, mapped_pci_res_list);
+   struct mapped_pci_res_list *uio_res_list =
+   RTE_TAILQ_CAST(rte_uio_tailq.head, mapped_pci_res_list);

if (dev == NULL)
return;
-- 
2.1.4

[dpdk-dev] [PATCH v5 0/5] Clean up pci uio implementations

2015-06-25 Thread Tetsuya Mukawa

This patch set cleans up the pci uio implementation. The cleanups prepare
for consolidating the pci uio implementations of linuxapp and bsdapp and
moving the consolidated functions into eal common.
For that reason, this patch set makes the linuxapp and bsdapp
implementations almost the same.
The actual consolidation will be done in a later patch set.

PATCH v5 changes:
 - Rebase to latest master branch.

PATCH v4 changes:
 - Rebase to latest master branch.
 - Fix bug in pci_uio_map_resource() of BSD code. 'maps[i].path' shouldn't be freed.
 Fixed in below patch:
 [PATCH 3/5] eal: Fix memory leaks and needless increment of pci_map_addr
 - 'path' member of 'struct mapped_pci_resource' should not be removed because it will be used in BSD code.
 Fixed in below patch:
 [PATCH 5/5] eal: Fix uio mapping differences between linuxapp and bsdapp

PATCH v3 changes:
 - Squash patches related with pci_map_resource().
 - Free maps[].path to make the code easier to understand.
   (Thanks to Iremonger, Bernard)
 - Close fds opened in this function.
 - Remove unused path variable from mapped_pci_resource structure.

PATCH v2 changes:
 - Move 'if-condition' to later patch series.
 - Fix memory leaks of path.
 - Fix typos.
   (Thanks to David Marchand)
 - Fix commit title and body.
 - Fix pci_map_resource() to handle MAP_FAILED.
   (Thanks to Iremonger, Bernard)

Changes:
 - This patch set is derived from below.
   "[PATCH v2] eal: Port Hotplug support for BSD"
 - Set cfg_fd as -1, when cfg_fd is closed.
   (Thanks to Iremonger, Bernard)
 - Remove needless coding style fixings.
 - Fix coding style of if-else condition.
   (Thanks to Richardson, Bruce)



Tetsuya.Mukawa (5):
  eal: Fix coding style of eal_pci.c and eal_pci_uio.c
  eal: Close file descriptor of uio configuration
  eal: Fix memory leaks and needless increment of pci_map_addr
  eal/bsdapp: Change names of pci related data structure
  eal: Fix uio mapping differences between linuxapp and bsdapp

 lib/librte_eal/bsdapp/eal/eal_pci.c   | 156 ++
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c |  88 ++---
 2 files changed, 149 insertions(+), 95 deletions(-)

-- 
2.1.4

[dpdk-dev] [PATCH v2 00/11] ip_pipeline: ip_pipeline application enhancements

2015-06-25 Thread Dumitrescu, Cristian



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Maciej Gajdzica
> Sent: Thursday, June 25, 2015 12:15 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2 00/11] ip_pipeline: ip_pipeline application
> enhancements
> 
> This patchset enhances functionality of ip_pipeline application. New config
> file syntax is introduced, so parser is changed. Changed structure of the
> application. Now every global variable is stored in app_struct in app.h.
> Syntax of pipeline cli commands was changed. Implementation of cli
> commands
> for every pipeline is moved to the separate file.
> 
> Changes in v2:
> - renamed some files
> - added more config files
> - reworked flow classification pipeline implementation
> - fixed some bugs
> 

Acked-by: Cristian Dumitrescu

[dpdk-dev] Packets lost

2015-06-25 Thread Daeyoung Kim

Hi all,

I'm writing a packet capture program based on l3fwd. When I send DNS
packets, Wireshark captures all the packets on both ports at the same time.
However, with my program in promiscuous mode, I see the packets on
only one port. Do you have any idea why? Could you give me some advice?

Thanks,
Dan

[dpdk-dev] VMXNET3 on vmware, ping delay

2015-06-25 Thread Matthew Hall

On Thu, Jun 25, 2015 at 08:44:51PM +0200, Thomas Monjalon wrote:
> DPDK is not a stack.

Hi Thomas,

Don't worry too much about that challenge.

When I get my app feature complete, I think we can change that.

Same for Avi and the server frameworks they are making at Cloudius. ;)

Matthew.

[dpdk-dev] Can't compile examples

2015-06-25 Thread Thomas Monjalon

2015-06-25 08:39, Liu, Jijiang:
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > 2015-06-25 11:31, Tetsuya Mukawa:
> > > Hi Jijiang,
> > >
> > > It seems below patch introduces compile error of examples.
> > >  - a50245e examples/tep_term: initialize VXLAN sample
> > >
> > > Here is log.
> > > Could you please check it?
> > >
> > [...]
> > >
> > /home/mukawa/work/dpdk.org/dpdk/examples/tep_termination/main.c:52:28:
> > > fatal error: rte_virtio_net.h: No such file or directory
> > 
> > The check before merging was with vhost enabled.
> > 
> > Jijiang, does it make sense to try make it without vhost?
> > If not, examples/Makefile must contain
> > DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += tep_termination
> 
> The CONFIG_RTE_LIBRTE_VHOST must be set 'Y' when compiling the VXLAN example.

Fixed:
http://dpdk.org/browse/dpdk/commit/?id=8b22792abbfe

[dpdk-dev] [PATCH v5] ixgbe: changes to support PCI Port Hotplug

2015-06-25 Thread Ananyev, Konstantin



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bernard Iremonger
> Sent: Wednesday, June 24, 2015 4:09 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v5] ixgbe: changes to support PCI Port Hotplug
> 
> This patch depends on the Port Hotplug Framework.
> It implements the eth_dev_uninit functions for rte_ixgbe_pmd and
> rte_ixgbevf_pmd.
> 
> Changes in V5:
> Set nb_rx_queues and nb_tx_queues to 0 in uninit functions.
> Rebased to latest ixgbe code.
> 
> Changes in V4:
> Release rx and tx queues in dev_uninit() functions.
> Replace TRUE and FALSE with 1 and 0.
> 
> Changes in V3:
> Rebased to use drivers/net/ixgbe directory.
> 
> Changes in V2:
> Added call to dev_close() in dev_uninit() functions.
> Removed input parameter checks from dev_uninit() functions.
> 
> Signed-off-by: Bernard Iremonger 
> ---

Acked-by: Konstantin Ananyev 

> 1.7.4.1

[dpdk-dev] [PATCH] examples/tep_termination: Add a compilation option for the VXLAN sample

2015-06-25 Thread Thomas Monjalon

2015-06-25 16:56, Jijiang Liu:
> Add a compilation option for the VXLAN sample.
> 
> Signed-off-by: Jijiang Liu 

Applied, thanks

[dpdk-dev] Can't compile examples

2015-06-25 Thread Tetsuya Mukawa

Hi Jijiang,

It seems below patch introduces compile error of examples.
 - a50245e examples/tep_term: initialize VXLAN sample

Here is log.
Could you please check it?

$ T=x86_64-native-linuxapp-gcc make examples -j12
== Build examples for x86_64-native-linuxapp-gcc
== cmdline
== distributor
== bond
== helloworld
== exception_path
== kni
== ip_pipeline
== ipv4_multicast
== ip_reassembly
== l2fwd
== l3fwd
== l2fwd-jobstats
== l3fwd-power
== l3fwd-acl
== l3fwd-vf
== link_status_interrupt
== load_balancer
== multi_process
== packet_ordering
== qos_sched
== netmap_compat/bridge
== qos_meter
== quota_watermark
== rxtx_callbacks
== qw
== client_server_mp
== simple_mp
== skeleton
== qwctl
== tep_termination
== symmetric_mp
== timer
== vmdq
== vmdq_dcb
== mp_client
== mp_server
== vm_power_manager
  CC main.o
/home/mukawa/work/dpdk.org/dpdk/examples/tep_termination/main.c:52:28:
fatal error: rte_virtio_net.h: No such file or directory
compilation terminated.
/home/mukawa/work/dpdk.org/dpdk/mk/internal/rte.compile-pre.mk:126:
recipe for target 'main.o' failed
make[4]: *** [main.o] Error 1
/home/mukawa/work/dpdk.org/dpdk/mk/rte.extapp.mk:42: recipe for target
'all' failed
make[3]: *** [all] Error 2
/home/mukawa/work/dpdk.org/dpdk/mk/rte.extsubdir.mk:46: recipe for
target 'tep_termination' failed
make[2]: *** [tep_termination] Error 2
/home/mukawa/work/dpdk.org/dpdk/mk/rte.sdkexamples.mk:52: recipe for
target 'x86_64-native-linuxapp-gcc_examples' failed
make[1]: *** [x86_64-native-linuxapp-gcc_examples] Error 2
/home/mukawa/work/dpdk.org/dpdk/mk/rte.sdkroot.mk:120: recipe for target
'examples' failed
make: *** [examples] Error 2

Regards,
Tetsuya

[dpdk-dev] [PATCH 2/2] ixgbe: add memory barriers in vector rx/tx

2015-06-25 Thread Eric Kinzie

Add write memory barrier before writing tail pointer.

Fixes: c95584dc2b18 ("ixgbe: new vectorized functions for Rx/Tx")

Signed-off-by: Eric Kinzie 
---
 drivers/net/ixgbe/ixgbe_rxtx_vec.c |3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c 
b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
index abd10f6..b601de8 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
@@ -123,6 +123,7 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
 (rxq->nb_rx_desc - 1) : (rxq->rxrearm_start - 1));

/* Update the tail pointer on the NIC */
+   rte_wmb();
IXGBE_PCI_REG_WRITE(rxq->rdt_reg_addr, rx_id);
 }

@@ -645,6 +646,8 @@ ixgbe_xmit_pkts_vec(void *tx_queue, struct rte_mbuf 
**tx_pkts,

txq->tx_tail = tx_id;

+   /* update tail pointer */
+   rte_wmb();
IXGBE_PCI_REG_WRITE(txq->tdt_reg_addr, txq->tx_tail);

return nb_pkts;
-- 
1.7.10.4
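
For readers following along, the ordering this patch is after can be sketched in portable C11. This is only an analogy under stated assumptions: demo_ring, demo_post() and the atomic tail_reg are hypothetical stand-ins for the descriptor ring and the memory-mapped tail register, and a C11 release fence stands in for rte_wmb(). It is not the driver code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_RING_SIZE 8

/* Hypothetical model of a descriptor ring plus its tail register. */
struct demo_ring {
	uint64_t desc[DEMO_RING_SIZE];  /* descriptors the device would read */
	_Atomic uint32_t tail_reg;      /* stands in for the MMIO tail register */
};

/* Fill one descriptor, then publish it by advancing the tail. */
static void
demo_post(struct demo_ring *r, uint32_t idx, uint64_t buf_addr)
{
	assert(r != NULL);

	/* 1. write the descriptor contents */
	r->desc[idx % DEMO_RING_SIZE] = buf_addr;

	/* 2. make the descriptor write visible before the tail write */
	atomic_thread_fence(memory_order_release);

	/* 3. update the tail so the consumer may pick the descriptor up */
	atomic_store_explicit(&r->tail_reg, (idx + 1) % DEMO_RING_SIZE,
			      memory_order_relaxed);
}
```

Without step 2, the compiler or CPU would be free to reorder the descriptor store after the tail store, so the consumer could observe the new tail and read a stale descriptor.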

[dpdk-dev] [PATCH 1/2] ixgbe: vector rx rearm after queue reset

2015-06-25 Thread Eric Kinzie

Zero the values used by vector receive in ixgbe_reset_rx_queue() so that
rearming the rx queue happens at the right time.  Not doing so can in
some cases result in the software inadvertently setting the card's rx
tail pointer equal to the head pointer, which indicates that there are
no descriptors available.  This causes receive to stop indefinitely
on that queue.

Fixes: 01fa1d6215fa ("ixgbe: unify Rx setup")

Signed-off-by: Eric Kinzie 
---
 drivers/net/ixgbe/ixgbe_rxtx.c |4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index 3ace8a8..1e840b6 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -2261,6 +2261,10 @@ ixgbe_reset_rx_queue(struct ixgbe_adapter *adapter, 
struct ixgbe_rx_queue *rxq)
rxq->nb_rx_hold = 0;
rxq->pkt_first_seg = NULL;
rxq->pkt_last_seg = NULL;
+#ifdef RTE_IXGBE_INC_VECTOR
+   rxq->rxrearm_nb = 0;
+   rxq->rxrearm_start = 0;
+#endif
 }

 int
-- 
1.7.10.4
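
A small model of the failure described above, as a sketch only. The struct and function names are hypothetical; the tail expression mirrors the one visible in the ixgbe_rxq_rearm() context of patch 2/2. It assumes the hardware head is back at descriptor 0 after a queue reset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the rx queue state used by the vector path. */
struct demo_rxq {
	uint16_t nb_desc;      /* ring size */
	uint16_t rearm_start;  /* first descriptor waiting to be refilled */
	uint16_t rearm_nb;     /* number of descriptors waiting to be refilled */
};

/* Tail value written after a rearm: one slot behind rearm_start. */
static uint16_t
demo_tail_after_rearm(const struct demo_rxq *q)
{
	assert(q != NULL);
	return (uint16_t)((q->rearm_start == 0) ?
			(q->nb_desc - 1) : (q->rearm_start - 1));
}

/* What the patch adds to the reset path: clear the rearm bookkeeping. */
static void
demo_reset(struct demo_rxq *q)
{
	assert(q != NULL);
	q->rearm_start = 0;
	q->rearm_nb = 0;
}
```

With a stale rearm_start of 1 left over from before the reset, the next rearm writes tail 0, which equals the assumed head of 0, so the ring looks empty to the NIC. After demo_reset() the tail lands on nb_desc - 1 instead.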

[dpdk-dev] [PATCH 0/2] ixgbe vector rx/tx changes

2015-06-25 Thread Eric Kinzie

Clear values specific to ixgbe vector RX during queue reset.

I've also included a patch that adds a memory barrier before writing the
rx/tx tail pointer registers in ixgbe_rxtx_vec.c.  The non-vector code
has such barriers, which looks right to me.  Comments?

Eric Kinzie (2):
  ixgbe: vector rx rearm after queue reset
  ixgbe: add memory barriers in vector rx/tx

 drivers/net/ixgbe/ixgbe_rxtx.c |4 
 drivers/net/ixgbe/ixgbe_rxtx_vec.c |3 +++
 2 files changed, 7 insertions(+)

-- 
1.7.10.4

[dpdk-dev] [PATCH v5 4/5] eal/bsdapp: Change names of pci related data structure

2015-06-25 Thread David Marchand

On Thu, Jun 25, 2015 at 5:19 AM, Tetsuya Mukawa  wrote:

> From: "Tetsuya.Mukawa" 
>
> To merge pci code of linuxapp and bsdapp, this patch changes names
> like below.
>  - uio_map to pci_map
>  - uio_resource to mapped_pci_resource
>  - uio_res_list to mapped_pci_res_list
>
> Signed-off-by: Tetsuya Mukawa 
>


Acked-by: David Marchand 


-- 
David Marchand

[dpdk-dev] [PATCH v5 3/5] eal: Fix memory leaks and needless increment of pci_map_addr

2015-06-25 Thread David Marchand

On Thu, Jun 25, 2015 at 5:19 AM, Tetsuya Mukawa  wrote:

> From: "Tetsuya.Mukawa" 
>
> This patch fixes following memory leaks.
> - When open() is failed, uio_res and fds won't be freed in
>   pci_uio_map_resource().
> - When pci_map_resource() is failed but path is allocated correctly,
>   path and fds won't be freed in pci_uio_map_resource().
> - When pci_uio_unmap() is called, path should be freed.
>
> Also, fixes below.
> - When pci_map_resource() is failed, mapaddr will be MAP_FAILED.
>   In this case, pci_map_addr should not be incremented in
>   pci_uio_map_resource().
> - To shrink code, move close().
> - Remove fail variable.
>
> Signed-off-by: Tetsuya Mukawa 
> ---
>  lib/librte_eal/bsdapp/eal/eal_pci.c   | 14 +++--
>  lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 51
> ---
>  2 files changed, 44 insertions(+), 21 deletions(-)
>
> diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c
> b/lib/librte_eal/bsdapp/eal/eal_pci.c
> index 8e24fd1..b071f07 100644
> --- a/lib/librte_eal/bsdapp/eal/eal_pci.c
> +++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
> @@ -235,7 +235,7 @@ pci_uio_map_resource(struct rte_pci_device *dev)
> if ((uio_res = rte_zmalloc("UIO_RES", sizeof (*uio_res), 0)) ==
> NULL) {
> RTE_LOG(ERR, EAL,
> "%s(): cannot store uio mmap details\n", __func__);
> -   return -1;
> +   goto close_fd;
> }
>
> snprintf(uio_res->path, sizeof(uio_res->path), "%s", devname);
> @@ -262,8 +262,7 @@ pci_uio_map_resource(struct rte_pci_device *dev)
> (mapaddr = pci_map_resource(NULL, devname,
> (off_t)offset,
> (size_t)maps[j].size)
> ) == NULL) {
> -   rte_free(uio_res);
> -   return -1;
> +   goto free_uio_res;
> }
>
> maps[j].addr = mapaddr;
> @@ -274,6 +273,15 @@ pci_uio_map_resource(struct rte_pci_device *dev)
> TAILQ_INSERT_TAIL(uio_res_list, uio_res, next);
>
> return 0;
> +
> +free_uio_res:
> +   rte_free(uio_res);
> +close_fd:
> +   close(dev->intr_handle.fd);
> +   dev->intr_handle.fd = -1;
> +   dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
> +
> +   return -1;
>  }
>
>
Thinking about it, when something fails, don't you need to unmap pci
resources in uio_res->maps before freeing ?


-- 
David Marchand
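
To make the question concrete, here is a minimal sketch of an error path that unmaps what was already mapped before freeing the bookkeeping structure. Everything here is hypothetical: demo_map() and demo_unmap() stand in for pci_map_resource() and its unmap counterpart (malloc/free are used so the sketch runs anywhere), and demo_live_maps exists only to let a test observe leaks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_MAX_MAPS 4

/* Hypothetical bookkeeping for the regions mapped so far. */
struct demo_res {
	void *maps[DEMO_MAX_MAPS];
	int nb_maps;
};

static int demo_live_maps;  /* outstanding "mappings", for leak checking */

static void *
demo_map(int fail)
{
	void *p = fail ? NULL : malloc(16);

	if (p != NULL)
		demo_live_maps++;
	return p;
}

static void
demo_unmap(void *p)
{
	demo_live_maps--;
	free(p);
}

/* Map up to 'want' regions; the region at index 'fail_at' fails to map. */
static struct demo_res *
demo_map_all(int want, int fail_at)
{
	struct demo_res *res = calloc(1, sizeof(*res));
	int i;

	if (res == NULL)
		return NULL;

	for (i = 0; i < want && i < DEMO_MAX_MAPS; i++) {
		void *addr = demo_map(i == fail_at);

		if (addr == NULL)
			goto unmap_all;
		res->maps[res->nb_maps++] = addr;
	}
	return res;

unmap_all:
	/* undo the earlier successful mappings before freeing res */
	while (res->nb_maps > 0)
		demo_unmap(res->maps[--res->nb_maps]);
	free(res);
	return NULL;
}
```

If the failure label only freed res, the regions recorded in res->maps would stay mapped with nothing left pointing at them.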

[dpdk-dev] [PATCH v5 1/5] eal: Fix coding style of eal_pci.c and eal_pci_uio.c

2015-06-25 Thread David Marchand

On Thu, Jun 25, 2015 at 5:19 AM, Tetsuya Mukawa  wrote:

> From: "Tetsuya.Mukawa" 
>
> This patch fixes coding style of below files in linuxapp and bsdapp.
>  - eal_pci.c
>  - eal_pci_uio.c
>
> Signed-off-by: Tetsuya Mukawa 
> Acked-by: Stephen Hemminger 
>

Acked-by: David Marchand 


-- 
David Marchand

[dpdk-dev] [PATCH v5 3/5] eal: Fix memory leaks and needless increment of pci_map_addr

2015-06-25 Thread David Marchand

Hello Tetsuya,


On Thu, Jun 25, 2015 at 5:19 AM, Tetsuya Mukawa  wrote:

> From: "Tetsuya.Mukawa" 
>
> This patch fixes following memory leaks.
> - When open() is failed, uio_res and fds won't be freed in
>   pci_uio_map_resource().
> - When pci_map_resource() is failed but path is allocated correctly,
>   path and fds won't be freed in pci_uio_map_resource().
> - When pci_uio_unmap() is called, path should be freed.
>
> Also, fixes below.
> - When pci_map_resource() is failed, mapaddr will be MAP_FAILED.
>   In this case, pci_map_addr should not be incremented in
>   pci_uio_map_resource().
> - To shrink code, move close().
> - Remove fail variable.
>
> Signed-off-by: Tetsuya Mukawa 
> ---
>  lib/librte_eal/bsdapp/eal/eal_pci.c   | 14 +++--
>  lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 51
> ---
>  2 files changed, 44 insertions(+), 21 deletions(-)
>
> [snip]
> diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
> b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
> index 34316b6..2dd83d3 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
> @@ -308,7 +308,7 @@ pci_uio_map_resource(struct rte_pci_device *dev)
> if (dev->intr_handle.uio_cfg_fd < 0) {
> RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
> cfgname, strerror(errno));
> -   return -1;
> +   goto close_fd;
> }
>
> if (dev->kdrv == RTE_KDRV_IGB_UIO)
>


You missed a return here:

        if (dev->kdrv == RTE_KDRV_IGB_UIO)
                dev->intr_handle.type = RTE_INTR_HANDLE_UIO;
        else {
                dev->intr_handle.type = RTE_INTR_HANDLE_UIO_INTX;

                /* set bus master that is not done by uio_pci_generic */
                if (pci_uio_set_bus_master(dev->intr_handle.uio_cfg_fd)) {
                        RTE_LOG(ERR, EAL, "Cannot set up bus mastering!\n");
                        return -1;
                }
        }

> @@ -328,7 +328,7 @@ pci_uio_map_resource(struct rte_pci_device *dev)
> if (uio_res == NULL) {
> RTE_LOG(ERR, EAL,
> "%s(): cannot store uio mmap details\n", __func__);
> -   return -1;
> +   goto close_fd;
> }
>
> snprintf(uio_res->path, sizeof(uio_res->path), "%s", devname);
> @@ -338,7 +338,6 @@ pci_uio_map_resource(struct rte_pci_device *dev)
> maps = uio_res->maps;
> for (i = 0, map_idx = 0; i != PCI_MAX_RESOURCE; i++) {
> int fd;
> -   int fail = 0;
>
> /* skip empty BAR */
> phaddr = dev->mem_resource[i].phys_addr;
>


The rest looks good to me.


-- 
David Marchand

[dpdk-dev] [PATCH] doc/sample_app_ug:add a VXLAN sample guide

2015-06-25 Thread Jijiang Liu

Add a VXLAN sample guide in the sample_app_ug directory.

It includes:

- Add the overlay networking picture with svg format.

- Add the TEP termination framework picture with svg format.

- Add the tep_termination.rst file

- Change the index.rst file for the above pictures index.

Signed-off-by: Jijiang Liu 
Signed-off-by: Thomas Long 

---
 .../sample_app_ug/img/overlay_networking.svg   |  820 
 .../sample_app_ug/img/tep_termination_arch.svg |  551 +
 doc/guides/sample_app_ug/index.rst |2 +
 doc/guides/sample_app_ug/tep_termination.rst   |  319 
 4 files changed, 1692 insertions(+), 0 deletions(-)
 create mode 100644 doc/guides/sample_app_ug/img/overlay_networking.svg
 create mode 100644 doc/guides/sample_app_ug/img/tep_termination_arch.svg
 create mode 100644 doc/guides/sample_app_ug/tep_termination.rst

diff --git a/doc/guides/sample_app_ug/img/overlay_networking.svg 
b/doc/guides/sample_app_ug/img/overlay_networking.svg
new file mode 100644
index 000..e16b5ac
--- /dev/null
+++ b/doc/guides/sample_app_ug/img/overlay_networking.svg
@@ -0,0 +1,820 @@
+[overlay_networking.svg: SVG markup of the overlay networking figure, not preserved in the archive]

[dpdk-dev] [PATCH] librte_ether: release memory in uninit function.

2015-06-25 Thread Stephen Hemminger

On Thu, 25 Jun 2015 15:30:28 +0100
Bernard Iremonger  wrote:

> Signed-off-by: Bernard Iremonger 
> ---
>  lib/librte_ether/rte_ethdev.c |8 +++-
>  1 files changed, 7 insertions(+), 1 deletions(-)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index e13fde5..2404556 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -369,8 +369,14 @@ rte_eth_dev_uninit(struct rte_pci_device *pci_dev)
>   /* free ether device */
>   rte_eth_dev_release_port(eth_dev);
>  
> - if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> + if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> + rte_free(eth_dev->data->rx_queues);
> + rte_free(eth_dev->data->tx_queues);
>   rte_free(eth_dev->data->dev_private);
> + rte_free(eth_dev->data->mac_addrs);
> + rte_free(eth_dev->data->hash_mac_addrs);
> + memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));


Glad to see this problem addressed.

I would prefer that the component that created the object be responsible
for doing its own cleanup.

[dpdk-dev] [PATCHv3 3/3] ABI: Add some documentation

2015-06-25 Thread Neil Horman

People have been asking for ways to use the ABI macros; here are some docs to
clarify their use.  Included are:

* An overview of what ABI is
* Details of the ABI deprecation process
* Details of the versioning macros
* Examples of their use
* Details of how to use the ABI validator

Thanks to John Mcnamara, who duplicated much of this effort at Intel while I was
working on it.  Much of the introductory material was gathered and cleaned up by
him.

Signed-off-by: Neil Horman 
CC: john.mcnamara at intel.com
CC: thomas.monjalon at 6wind.com

Change notes:

v2)
 * Fixed RST indentations and spelling errors
 * Rebased to upstream to fix index.rst conflict

v3)
 * Fixed in tact -> intact
 * Added docs to address static linking
 * Removed duplicate documentation from release notes
---
 doc/guides/guidelines/index.rst  |   1 +
 doc/guides/guidelines/versioning.rst | 484 +++
 doc/guides/rel_notes/abi.rst |  30 +--
 3 files changed, 487 insertions(+), 28 deletions(-)
 create mode 100644 doc/guides/guidelines/versioning.rst

diff --git a/doc/guides/guidelines/index.rst b/doc/guides/guidelines/index.rst
index 0ee9ab3..bfb9fa3 100644
--- a/doc/guides/guidelines/index.rst
+++ b/doc/guides/guidelines/index.rst
@@ -7,3 +7,4 @@ Guidelines

 coding_style
 design
+versioning
diff --git a/doc/guides/guidelines/versioning.rst 
b/doc/guides/guidelines/versioning.rst
new file mode 100644
index 000..da9eca0
--- /dev/null
+++ b/doc/guides/guidelines/versioning.rst
@@ -0,0 +1,484 @@
+Managing ABI updates
+====================
+
+Description
+-----------
+
+This document details some methods for handling ABI management in the DPDK.
+Note this document is not exhaustive, in that C library versioning is flexible
+allowing multiple methods to achieve various goals, but it will provide the user
+with some introductory methods
+
+General Guidelines
+------------------
+
+#. Whenever possible, ABI should be preserved
+#. The addition of symbols is generally not problematic
+#. The modification of symbols can generally be managed with versioning
+#. The removal of symbols generally is an ABI break and requires bumping of the
+   LIBABIVER macro
+
+What is an ABI
+--------------
+
+An ABI (Application Binary Interface) is the set of runtime interfaces exposed
+by a library. It is similar to an API (Application Programming Interface) but
+is the result of compilation.  It is also effectively cloned when applications
+link to dynamic libraries.  That is to say when an application is compiled to
+link against dynamic libraries, it is assumed that the ABI remains constant
+between the time the application is compiled/linked, and the time that it runs.
+Therefore, in the case of dynamic linking, it is critical that an ABI is
+preserved, or (when modified), done in such a way that the application is unable
+to behave improperly or in an unexpected fashion.
+
+The DPDK ABI policy
+-------------------
+
+ABI versions are set at the time of major release labeling, and the ABI may
+change multiple times, without warning, between the last release label and the
+HEAD label of the git tree.
+
+ABI versions, once released, are available until such time as their
+deprecation has been noted in the Release Notes for at least one major release
+cycle. For example consider the case where the ABI for DPDK 2.0 has been
+shipped and then a decision is made to modify it during the development of
+DPDK 2.1. The decision will be recorded in the Release Notes for the DPDK 2.1
+release and the modification will be made available in the DPDK 2.2 release.
+
+ABI versions may be deprecated in whole or in part as needed by a given
+update.
+
+Some ABI changes may be too significant to reasonably maintain multiple
+versions. In those cases ABIs may be updated without backward compatibility
+being provided. The requirements for doing so are:
+
+#. At least 3 acknowledgments of the need to do so must be made on the
+   dpdk.org mailing list.
+
+#. A full deprecation cycle, as explained above, must be made to offer
+   downstream consumers sufficient warning of the change.
+
+#. The ``LIBABIVER`` variable in the makefile(s) where the ABI changes are
+   incorporated must be incremented in parallel with the ABI changes
+   themselves.
+
+Note that the above process for ABI deprecation should not be undertaken
+lightly. ABI stability is extremely important for downstream consumers of the
+DPDK, especially when distributed in shared object form. Every effort should
+be made to preserve the ABI whenever possible. The ABI should only be changed
+for significant reasons, such as performance enhancements. ABI breakage due to
+changes such as reorganizing public structure fields for aesthetic or
+readability purposes should be avoided.
+
+Examples of Deprecation Notices
+-------------------------------
+
+The following are some examples of ABI deprecation notices which would be
+added to the Release Notes

[dpdk-dev] [PATCHv3 2/3] rte_compat: Add MAP_STATIC_SYMBOL macro

2015-06-25 Thread Neil Horman

It was pointed out in my examples that doing shared library symbol versioning by
partitioning symbols to version specific functions (as opposed to leaving the
latest symbol version at the base symbol name), neglects to take into account
static builds.  Add a macro to handle that.  If you choose a versioning approach
that uniquely names every version of the symbol, then this macro lets you map
your symbol choice to the base name when building a static library

Also, while I'm at it, since we're documenting this in the guide, take the
abbreviated example out of the header

Signed-off-by: Neil Horman 
CC: thomas.monjalon at 6wind.com
---
 lib/librte_compat/rte_compat.h | 35 ++-
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/lib/librte_compat/rte_compat.h b/lib/librte_compat/rte_compat.h
index 75920a1..d7768d5 100644
--- a/lib/librte_compat/rte_compat.h
+++ b/lib/librte_compat/rte_compat.h
@@ -49,22 +49,8 @@
  * Assumptions: DPDK 2.(X) contains a function int foo(char *string)
  *  DPDK 2.(X+1) needs to change foo to be int foo(int index)
  *
- * To accomplish this:
- * 1) Edit lib//library_version.map to add a DPDK_2.(X+1) node, in 
which
- * foo is exported as a global symbol.
- *
- * 2) rename the existing function int foo(char *string) to
- * int foo_v20(char *string)
- *
- * 3) Add this macro immediately below the function
- * VERSION_SYMBOL(foo, _v20, 2.0);
- *
- * 4) Implement a new version of foo.
- * char foo(int value, int otherval) { ...}
- *
- * 5) Mark the newest version as the default version
- * BIND_DEFAULT_SYMBOL(foo, _v21, 2.1);
- *
+ * Refer to the guidelines document in the docs subdirectory for details on the
+ * use of these macros
  */

 /*
@@ -72,6 +58,8 @@
  * b - function base name
  * e - function version extension, to be concatenated with base name
  * n - function symbol version string to be applied
+ * f - function prototype
+ * p - full function symbol name
  */

 /*
@@ -96,6 +84,19 @@
 #define BIND_DEFAULT_SYMBOL(b, e, n) __asm__(".symver " RTE_STR(b) RTE_STR(e) 
", " RTE_STR(b) "@@DPDK_" RTE_STR(n))
 #define __vsym __attribute__((used))

+/*
+ * MAP_STATIC_SYMBOL
+ * If a function has been bifurcated into multiple versions, none of which
+ * are defined as the exported symbol name in the map file, this macro can be
+ * used to alias a specific version of the symbol to its exported name.  For
+ * example, if you have 2 versions of a function foo_v1 and foo_v2, where the
+ * former is mapped to foo@DPDK_1 and the latter is mapped to foo@DPDK_2 when
+ * building a shared library, this macro can be used to map either foo_v1 or
+ * foo_v2 to the symbol foo when building a static library, e.g.:
+ * MAP_STATIC_SYMBOL(void foo(), foo_v2);
+ */
+#define MAP_STATIC_SYMBOL(f, p)
+
 #else
 /*
  * No symbol versioning in use
@@ -104,7 +105,7 @@
 #define __vsym
 #define BASE_SYMBOL(b, n)
 #define BIND_DEFAULT_SYMBOL(b, e, n)
-
+#define MAP_STATIC_SYMBOL(f, p) f __attribute__((alias(RTE_STR(p))))
 /*
  * RTE_BUILD_SHARED_LIB=n
  */
-- 
2.1.0
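
As a rough illustration of what the static-build variant of the macro does, the sketch below aliases a base name to one versioned implementation. The DEMO_ prefixed macros are local stand-ins so the snippet compiles on its own; they are not the rte_compat.h definitions. The alias attribute is a GCC/Clang extension and is assumed to be available (ELF targets such as Linux).

```c
#include <assert.h>

/* Local stand-ins for RTE_STR and MAP_STATIC_SYMBOL (hypothetical names). */
#define DEMO_STR_(x) #x
#define DEMO_STR(x) DEMO_STR_(x)
#define DEMO_MAP_STATIC_SYMBOL(f, p) f __attribute__((alias(DEMO_STR(p))))

/* Two versioned implementations; neither owns the base name "foo". */
int foo_v1(int index) { return index; }
int foo_v2(int index) { return index * 2; }

/* In a static build, make "foo" resolve to the newest version. */
DEMO_MAP_STATIC_SYMBOL(int foo(int index), foo_v2);

/* Small self-check: calling through the alias reaches foo_v2. */
static void
demo_selfcheck(void)
{
	assert(foo(21) == foo_v2(21));
}
```

The macro line expands to a declaration of foo carrying alias("foo_v2"), so callers that link statically against the base name get the chosen version without any .symver directives.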

[dpdk-dev] [PATCHv3 1/3] rte_compat.h : Clean up some typos

2015-06-25 Thread Neil Horman

Clean up some macro definition typos and comments

Signed-off-by: Neil Horman 
CC: thomas.monjalon at 6wind.com
---
 lib/librte_compat/rte_compat.h | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/lib/librte_compat/rte_compat.h b/lib/librte_compat/rte_compat.h
index fb0dc19..75920a1 100644
--- a/lib/librte_compat/rte_compat.h
+++ b/lib/librte_compat/rte_compat.h
@@ -54,7 +54,7 @@
  * foo is exported as a global symbol.
  *
  * 2) rename the existing function int foo(char *string) to
- * int __vsym foo_v20(char *string)
+ * int foo_v20(char *string)
  *
  * 3) Add this macro immediately below the function
  * VERSION_SYMBOL(foo, _v20, 2.0);
@@ -63,7 +63,7 @@
  * char foo(int value, int otherval) { ...}
  *
  * 5) Mark the newest version as the default version
- * BIND_DEFAULT_SYMBOL(foo, 2.1);
+ * BIND_DEFAULT_SYMBOL(foo, _v21, 2.1);
  *
  */

@@ -79,21 +79,21 @@
  * Creates a symbol version table entry binding symbol @DPDK_ to the 
internal
  * function name _
  */
-#define VERSION_SYMBOL(b, e, n) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", 
"RTE_STR(b)"@DPDK_"RTE_STR(n))
+#define VERSION_SYMBOL(b, e, n) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", " 
RTE_STR(b) "@DPDK_" RTE_STR(n))

 /*
  * BASE_SYMBOL
  * Creates a symbol version table entry binding unversioned symbol 
  * to the internal function _
  */
-#define BASE_SYMBOL(b, e) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", 
"RTE_STR(b)"@")
+#define BASE_SYMBOL(b, e) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", " 
RTE_STR(b)"@")

 /*
- * BNID_DEFAULT_SYMBOL
+ * BIND_DEFAULT_SYMBOL
  * Creates a symbol version entry instructing the linker to bind references to
  * symbol  to the internal symbol _
  */
-#define BIND_DEFAULT_SYMBOL(b, e, n) __asm__(".symver " RTE_STR(b) RTE_STR(e) 
", "RTE_STR(b)"@@DPDK_"RTE_STR(n))
+#define BIND_DEFAULT_SYMBOL(b, e, n) __asm__(".symver " RTE_STR(b) RTE_STR(e) 
", " RTE_STR(b) "@@DPDK_" RTE_STR(n))
 #define __vsym __attribute__((used))

 #else
@@ -103,7 +103,7 @@
 #define VERSION_SYMBOL(b, e, v)
 #define __vsym
 #define BASE_SYMBOL(b, n)
-#define BIND_DEFAULT_SYMBOL(b, v)
+#define BIND_DEFAULT_SYMBOL(b, e, n)

 /*
  * RTE_BUILD_SHARED_LIB=n
-- 
2.1.0

[dpdk-dev] Can't compile examples

2015-06-25 Thread Thomas Monjalon

2015-06-25 11:31, Tetsuya Mukawa:
> Hi Jijiang,
> 
> It seems below patch introduces compile error of examples.
>  - a50245e examples/tep_term: initialize VXLAN sample
> 
> Here is log.
> Could you please check it?
> 
[...]
> /home/mukawa/work/dpdk.org/dpdk/examples/tep_termination/main.c:52:28:
> fatal error: rte_virtio_net.h: No such file or directory

The check before merging was with vhost enabled.

Jijiang, does it make sense to try make it without vhost?
If not, examples/Makefile must contain
DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += tep_termination

[dpdk-dev] Regarding usage of vmxnet3 PMD with DPDK2.0

2015-06-25 Thread Puneet Singh

I am migrating from using DPDK1.6r2 and the external vmxnet3 user map
driver to DPDK2.0.
I believe with DPDK2.0, the vmxnet3 PMD is builtin and I don't need to
use the external driver.
I have been able to build the DPDK2.0 and I see that
CONFIG_RTE_LIBRTE_VMXNET3_PMD is set to y in the config.

Question is, what should be the steps that I follow on the ESXi node
in my virtual machine to detect the NIC with the above driver (I do
have vmxnet3 NIC there)
Should I load the uio and take over the NIC the usual way with igb_uio
and then run my application or are there any other steps to be
followed.

Regards
-Puneet

[dpdk-dev] [PATCH v7 2/4] ixgbe: add ops to support ethtool ops

2015-06-25 Thread Stephen Hemminger

On Wed, 17 Jun 2015 18:22:13 -0400
Liang-Min Larry Wang  wrote:

> +
> +static reg_info ixgbe_regs_general[] = {
> + {IXGBE_CTRL, 1, 1, "IXGBE_CTRL"},
> + {IXGBE_STATUS, 1, 1, "IXGBE_STATUS"},
> + {IXGBE_CTRL_EXT, 1, 1, "IXGBE_CTRL_EXT"},
> + {IXGBE_ESDP, 1, 1, "IXGBE_ESDP"},
> + {IXGBE_EODSDP, 1, 1, "IXGBE_EODSDP"},
> + {IXGBE_LEDCTL, 1, 1, "IXGBE_LEDCTL"},
> + {IXGBE_FRTIMER, 1, 1, "IXGBE_FRTIMER"},
> + {IXGBE_TCPTIMER, 1, 1, "IXGBE_TCPTIMER"},
> + {0, 0, 0, ""}
> +};
> +
> +static reg_info ixgbevf_regs_general[] = {
> + {IXGBE_CTRL, 1, 1, "IXGBE_CTRL"},
> + {IXGBE_STATUS, 1, 1, "IXGBE_STATUS"},
> + {IXGBE_VFLINKS, 1, 1, "IXGBE_VFLINKS"},
> + {IXGBE_FRTIMER, 1, 1, "IXGBE_FRTIMER"},
> + {IXGBE_VFMAILBOX, 1, 1, "IXGBE_VFMAILBOX"},
> + {IXGBE_VFMBMEM, 16, 4, "IXGBE_VFMBMEM"},
> + {IXGBE_VFRXMEMWRAP, 1, 1, "IXGBE_VFRXMEMWRAP"},
> + {0, 0, 0, ""}
> +};
> +

All these tables should be const
and API may need to change.
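
As a rough illustration of the const suggestion, here is a minimal sketch of a read-only, sentinel-terminated register table and a walker that only takes a const pointer. The struct layout, field names, and register offsets below are assumptions made up for the example; they are not taken from the ixgbe patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's reg_info; field names are assumed. */
struct demo_reg_info {
	uint32_t base_addr;
	uint32_t count;
	uint32_t stride;
	const char *name;
};

/* Placeholder offsets for illustration only, not real ixgbe registers. */
static const struct demo_reg_info demo_regs_general[] = {
	{0x0000, 1, 1, "DEMO_CTRL"},
	{0x0008, 1, 1, "DEMO_STATUS"},
	{0x0200, 16, 4, "DEMO_MBMEM"},
	{0, 0, 0, ""}	/* sentinel: count == 0 ends the table */
};

/* Sum the register counts of a sentinel-terminated table. */
static unsigned int
demo_reg_table_count(const struct demo_reg_info *tbl)
{
	unsigned int total = 0;

	for (; tbl->count != 0; tbl++)
		total += tbl->count;
	return total;
}
```

Because the walker takes a pointer to const, any API that currently accepts a non-const table pointer would need the same qualifier, which is presumably the API change hinted at above.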

[dpdk-dev] [PATCH v7 1/4] ethdev: add apis to support access device info

2015-06-25 Thread Stephen Hemminger

On Wed, 17 Jun 2015 18:22:12 -0400
Liang-Min Larry Wang  wrote:

> +int
> +rte_eth_dev_reg_length(uint8_t port_id)
> +{
> + struct rte_eth_dev *dev;
> +
> + if ((dev= &rte_eth_devices[port_id]) == NULL) {
> + PMD_DEBUG_TRACE("Invalid port device\n");
> + return -ENODEV;
> + }

Some minor nits:
  * for consistency you should add valid port check here.
  * style:
    - don't do assignment in if() unless it really helps readability
    - need whitespace

	if (!rte_eth_dev_is_valid_port(port_id)) {
		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
		return -ENODEV;
	}

	dev = &rte_eth_devices[port_id];
	if (dev == NULL) {
		PMD_DEBUG_TRACE("Invalid port device\n");
		return -ENODEV;
	}
...

This code pattern is so common it really should be a function.

	dev = rte_eth_dev_get(port_id);
	if (dev == NULL) {
		PMD_DEBUG_TRACE("Invalid port device\n");
		return -ENODEV;
	}

And then add a macro to generate this??
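
A minimal sketch of what such a helper and macro could look like, using mock types so it compiles on its own. Every name here (demo_eth_dev, demo_eth_dev_get, DEMO_DEV_OR_ERR_RET) is hypothetical; the real ethdev structures and any eventual helper may look quite different.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_PORTS 4

/* Mock of the device structure; the real struct rte_eth_dev is far larger. */
struct demo_eth_dev {
	int attached;	/* non-zero when the port is in use */
	int reg_length;	/* value a driver callback would report */
};

static struct demo_eth_dev demo_eth_devices[DEMO_MAX_PORTS] = {
	{1, 128}, {0, 0}, {1, 64}, {0, 0}
};

/* Return the device for port_id, or NULL if out of range or unused. */
static struct demo_eth_dev *
demo_eth_dev_get(uint8_t port_id)
{
	if (port_id >= DEMO_MAX_PORTS || !demo_eth_devices[port_id].attached)
		return NULL;
	return &demo_eth_devices[port_id];
}

/* Macro form: look up the device and return retval from the caller on failure. */
#define DEMO_DEV_OR_ERR_RET(dev, port_id, retval) do {	\
	(dev) = demo_eth_dev_get(port_id);		\
	if ((dev) == NULL)				\
		return (retval);			\
} while (0)

static int
demo_eth_dev_reg_length(uint8_t port_id)
{
	struct demo_eth_dev *dev;

	DEMO_DEV_OR_ERR_RET(dev, port_id, -ENODEV);
	return dev->reg_length;
}
```

The helper folds the port validity check and the lookup into one place, so each API function only carries a single line of boilerplate.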

[dpdk-dev] [PATCH v7 1/4] ethdev: add apis to support access device info

2015-06-25 Thread Stephen Hemminger

On Wed, 17 Jun 2015 18:22:12 -0400
Liang-Min Larry Wang  wrote:

>  int
> +rte_eth_dev_default_mac_addr_set(uint8_t port_id, struct ether_addr *addr)
> +{
> + struct rte_eth_dev *dev;
> +
> + if (!rte_eth_dev_is_valid_port(port_id)) {
> + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> + return -ENODEV;
> + }
> +
> + if (!is_valid_assigned_ether_addr(addr))
> + return -EINVAL;
> +
> + dev = &rte_eth_devices[port_id];
> + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->mac_addr_set, -ENOTSUP);
> +
> + /* Update default address in NIC data structure */
> + ether_addr_copy(addr, &dev->data->mac_addrs[0]);
> +
> + (*dev->dev_ops->mac_addr_set)(dev, addr);

Would it be possible to directly set mac_addr[0] if device does not
provide a device driver specific override?
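
One way to read this question, sketched with mock types: always update the generic copy of the address, and call into the driver only when it supplies a hook, instead of failing with -ENOTSUP. The struct and function names are invented for the example, and whether this behaviour is acceptable for real hardware is exactly the open question.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct demo_ether_addr {
	uint8_t bytes[6];
};

struct demo_dev {
	struct demo_ether_addr mac_addrs[1];
	/* Optional driver hook; NULL when the driver has no override. */
	void (*mac_addr_set)(struct demo_dev *dev,
			     const struct demo_ether_addr *addr);
	int hook_calls;	/* counts hook invocations, for the demo only */
};

static int
demo_default_mac_addr_set(struct demo_dev *dev,
			  const struct demo_ether_addr *addr)
{
	if (dev == NULL || addr == NULL)
		return -EINVAL;

	/* Always update the default address in the generic data. */
	memcpy(&dev->mac_addrs[0], addr, sizeof(*addr));

	/* Call into the driver only when it provides an override. */
	if (dev->mac_addr_set != NULL)
		dev->mac_addr_set(dev, addr);
	return 0;
}

/* Example driver hook that just records that it was called. */
static void
demo_hook(struct demo_dev *dev, const struct demo_ether_addr *addr)
{
	(void)addr;
	dev->hook_calls++;
}
```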

[dpdk-dev] VMXNET3 on vmware, ping delay

2015-06-25 Thread Matthew Hall

On Thu, Jun 25, 2015 at 06:46:30PM +0300, Avi Kivity wrote:
> What would be useful is a runtime switch between polling and interrupt
> modes.  This way, if the load is low you use interrupts, and as mitigation
> you switch to poll mode, until the load drops again.

Yes... I believe this is part of the plan. Though obviously I didn't work on 
it personally, I am still using the classic simple modes until I get my app to 
feature-complete level first.

In addition the *power* examples use adaptive polling to reduce CPU load to 
fit the current traffic profile.

Matthew.
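
To make the idea of a runtime switch concrete, here is a small sketch of a mode selector with hysteresis: stay in poll mode while packets arrive, and fall back to interrupt mode after a few consecutive empty polls. This is not how the power examples are implemented; the names and the threshold are assumptions for illustration only.

```c
#include <assert.h>

enum demo_rx_mode { DEMO_RX_POLL, DEMO_RX_INTERRUPT };

struct demo_rx_state {
	enum demo_rx_mode mode;
	unsigned int idle_polls;	/* consecutive polls with no packets */
};

/* Arbitrary threshold chosen for the example. */
#define DEMO_IDLE_THRESHOLD 3

/* Feed in the packet count seen by the latest poll or interrupt wakeup. */
static enum demo_rx_mode
demo_rx_update(struct demo_rx_state *st, unsigned int nb_rx)
{
	if (nb_rx > 0) {
		/* Traffic present: poll, and reset the idle counter. */
		st->idle_polls = 0;
		st->mode = DEMO_RX_POLL;
	} else if (st->mode == DEMO_RX_POLL &&
		   ++st->idle_polls >= DEMO_IDLE_THRESHOLD) {
		/* Quiet for long enough: wait for an interrupt instead. */
		st->mode = DEMO_RX_INTERRUPT;
	}
	return st->mode;
}
```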

[dpdk-dev] [PATCH 0/8] Dynamic RSS Configuration for Bonding

2015-06-25 Thread Kulasek, TomaszX


There's a bug in bonding itself which prevents a bonded device made of Fortville 
NICs from starting; it is not related to Dynamic RSS Configuration.

This problem is solved by the separate patch "bond: fix check initial link status of 
slave".

-Original Message-
From: Xu, HuilongX 
Sent: Friday, June 12, 2015 07:36
To: Kulasek, TomaszX; dev at dpdk.org
Subject: RE: [dpdk-dev] [PATCH 0/8] Dynamic RSS Configuration for Bonding

Tested-by: huilong xu 
- Tested Commit: 1a1109404e702d3ad1ccc1033df55c59bec1f89a + PATCH
- OS: Linux dpdk-fedora20 3.11.10-301.fc20.x86_64
- GCC: gcc version 4.8.3 20140624 (Red Hat 4.8.3-1) (GCC)
- CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
- NIC: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection 
[8086:10fb]
- NIC: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ [8086:1583]
- Default x86_64-native-linuxapp-gcc configuration
- Total 4 cases, 2 passed, 2 failed. Niantic NIC case all passed, but Fortville 
NIC case all failed.


   First case:
   This is a unit test case; no NIC is needed
   1. build dpdk driver and insmod driver
   2. set 1024*2M hugepage
   3. compile test app in app/test
   4. run test
   ./test -c f -n 4 
   5. exec dynamic rss config unit test
  link_bonding_rssconf_autotest
   6. print "test ok"
   7. this case passed
Second case:
   This is a function test case, used Fortville NIC(8086:1583)
1. build dpdk driver and insmod driver
2. bind dpdk driver to Fortville nic
3. set 1024*2M hugepage
4. run test pmd
  ./x86_64-native-linuxapp-gcc/app/testpmd -c 0xff -n 4  -- -i 
--txqflags=0x2 --mbcache=128 --burst=32 --txfreet=32 --rxfreet=64 --rxq=4 
--txq=4
5. exec testpmd cmd
   a) create bonded device 0 0
   b) add bonding slave 0 3
   c) add bonding slave 1 3
   d) port start 3
port can start, but link status is down, so this case failed.
 Third case:
   This is a function test case, used Fortville NIC(8086:1583)
1. build dpdk driver and insmod driver
2. bind dpdk driver to Fortville nic
3. set 1024*2M hugepage
4. run test pmd
  ./x86_64-native-linuxapp-gcc/app/testpmd -c 0xff -n 4  -- -i 
--txqflags=0x2 --mbcache=128 --burst=32 --txfreet=32 --rxfreet=64 --rxq=4 
--txq=4
5. exec testpmd cmd
   a) create bonded device 0 0
   b) add bonding slave 0 3
   c) add bonding slave 1 3
   d) port config all rss ip
   e) show port 3 rss-hash
  printf: 
  RSS functions:
  ipv4-frag ipv4-other ipv6-frag ipv6-other
   f) show port 0 rss-hash
  printf:
 RSS disabled
  Slave RSS is not enabled, so this case failed
   Fourth case:
  This is a function test case, used Niantic NIC(8086:10fb)
  1. build dpdk driver and insmod driver
  2. bind dpdk driver to Fortville nic
  3. set 1024*2M hugepage
  4. run test pmd
  ./x86_64-native-linuxapp-gcc/app/testpmd -c 0xff -n 4  -- -i 
--txqflags=0x2 --mbcache=128 --burst=32 --txfreet=32 --rxfreet=64 --rxq=4 
--txq=4
  5. exec testpmd cmd
a) create bonded device 0 0
b) add bonding slave 0 3
c) add bonding slave 1 3
d) port start 3
e) port stop all
f) set verbose 8
g) set fwd rxonly
h) set stat_qmap 3 0 0
i) set stat_qmap 3 1 1 
j) set stat_qmap 3 1 1
k) set stat_qmap 3 1 1
l) port config all rss ip
m) port start all
n)start
   6. send 50 IP packets to slave 0 from ixia, with the packet config as 
below
  a) dst mac: bond device (port 3) mac address.
  b) src mac: 00:00:00:12:34:56
  c) packet type: 0800
  e) dst ip: 192.168.1.1 
  f) src ip: from 192.168.1.2 to 192.168.1.51, incremented by 1 for 
each packet
7. stop 
Port 3 queue 0 received 9 packets
Port 3 queue 1 received 9 packets
Port 3 queue 2 received 16 packets
Port 3 queue 3 received 16 packets
8. send 50 IP packets to slave 1 from ixia, with the same packet 
config
9. stop and check port 3 received packets again
   From slave 0 and slave 1 the RSS results are the same, so the test passed

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Tomasz Kulasek
> Sent: Wednesday, June 03, 2015 6:59 PM
> T

[dpdk-dev] VMXNET3 on vmware, ping delay

2015-06-25 Thread Vass, Sandor (Nokia - HU/Budapest)

Hello,
I would like to create an IP packet processor program, and I chose DPDK 
because its speed looks promising.

I am trying to build a test environment to make development cheaper (so we do 
not have to buy HW for each developer), so I created a test setup with
- VMWare Workstation 11
- using DPDK 2.0.0
- with linux kernel 3.10.0, CentOS7
- gcc 4.8.3
- and standard, centos7 provided VMXNET3 driver, with uio_pci_generic kernel 
module
(shall I use vmxnet3-usermap.ko with dpdk 2.0.0? Where is it, how could I 
compile it?)

I set up 3 machines:
- set all machines' network interface type to VMXNET3
- set up one machine (C1) for issuing ping, its interface has an IP: 
192.168.3.21
- set up one machine (C2) for being the ping target, its interface has an IP: 
192.168.3.23
- set up one machine (BR) to act as an L2 bridge using some of the examples 
provided. DPDK is compiled properly, 256 x 2MB hugepages are created, and the example 
application runs without (major) errors.
- three machines are connected linearly:  C1 - BR - C2 using two private 
networks on each side of BR (VMnet2 and VMnet3), so the VMs are connected by 
vSwitches

Ping reply arrives, definitely goes through BR (extra console logs), but there 
are unexpected delays with example/skeleton/basicfwd...
[root at localhost ~]# ping 192.168.3.23
PING 192.168.3.23 (192.168.3.23) 56(84) bytes of data.
64 bytes from 192.168.3.23: icmp_seq=1 ttl=64 time=1018 ms
64 bytes from 192.168.3.23: icmp_seq=2 ttl=64 time=18.7 ms
64 bytes from 192.168.3.23: icmp_seq=3 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=4 ttl=64 time=8.87 ms
64 bytes from 192.168.3.23: icmp_seq=5 ttl=64 time=1010 ms
64 bytes from 192.168.3.23: icmp_seq=6 ttl=64 time=10.2 ms
64 bytes from 192.168.3.23: icmp_seq=7 ttl=64 time=1012 ms
64 bytes from 192.168.3.23: icmp_seq=8 ttl=64 time=12.7 ms
64 bytes from 192.168.3.23: icmp_seq=9 ttl=64 time=1049 ms
64 bytes from 192.168.3.23: icmp_seq=10 ttl=64 time=49.8 ms
64 bytes from 192.168.3.23: icmp_seq=11 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=12 ttl=64 time=9.02 ms
64 bytes from 192.168.3.23: icmp_seq=13 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=14 ttl=64 time=8.74 ms
64 bytes from 192.168.3.23: icmp_seq=15 ttl=64 time=1007 ms
64 bytes from 192.168.3.23: icmp_seq=16 ttl=64 time=8.03 ms
64 bytes from 192.168.3.23: icmp_seq=17 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=18 ttl=64 time=8.96 ms
64 bytes from 192.168.3.23: icmp_seq=19 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=20 ttl=64 time=9.27 ms
64 bytes from 192.168.3.23: icmp_seq=21 ttl=64 time=1008 ms
...

When I switched on BR to multi_process/client_server_mp, with 2 client 
processes the result was almost the same:
[root at localhost ~]# ping 192.168.3.23
PING 192.168.3.23 (192.168.3.23) 56(84) bytes of data.
64 bytes from 192.168.3.23: icmp_seq=1 ttl=64 time=1003 ms
64 bytes from 192.168.3.23: icmp_seq=2 ttl=64 time=3.50 ms
64 bytes from 192.168.3.23: icmp_seq=3 ttl=64 time=1002 ms
64 bytes from 192.168.3.23: icmp_seq=4 ttl=64 time=3.94 ms
64 bytes from 192.168.3.23: icmp_seq=5 ttl=64 time=1001 ms
64 bytes from 192.168.3.23: icmp_seq=6 ttl=64 time=1010 ms
64 bytes from 192.168.3.23: icmp_seq=7 ttl=64 time=1003 ms
64 bytes from 192.168.3.23: icmp_seq=8 ttl=64 time=2003 ms
64 bytes from 192.168.3.23: icmp_seq=10 ttl=64 time=2.29 ms
64 bytes from 192.168.3.23: icmp_seq=9 ttl=64 time=3002 ms
64 bytes from 192.168.3.23: icmp_seq=12 ttl=64 time=2.66 ms
64 bytes from 192.168.3.23: icmp_seq=11 ttl=64 time=3003 ms
64 bytes from 192.168.3.23: icmp_seq=14 ttl=64 time=2.87 ms
64 bytes from 192.168.3.23: icmp_seq=13 ttl=64 time=3003 ms
64 bytes from 192.168.3.23: icmp_seq=16 ttl=64 time=2.88 ms
64 bytes from 192.168.3.23: icmp_seq=15 ttl=64 time=1003 ms
64 bytes from 192.168.3.23: icmp_seq=17 ttl=64 time=1001 ms
64 bytes from 192.168.3.23: icmp_seq=18 ttl=64 time=2.70 ms
...

And when I switched on BR to test-pmd, the ping result was kind of normal 
(every commandline switch left as default)
[root at localhost ~]# ping 192.168.3.23
PING 192.168.3.23 (192.168.3.23) 56(84) bytes of data.
64 bytes from 192.168.3.23: icmp_seq=1 ttl=64 time=3.52 ms
64 bytes from 192.168.3.23: icmp_seq=2 ttl=64 time=33.2 ms
64 bytes from 192.168.3.23: icmp_seq=3 ttl=64 time=3.97 ms
64 bytes from 192.168.3.23: icmp_seq=4 ttl=64 time=25.5 ms
64 bytes from 192.168.3.23: icmp_seq=5 ttl=64 time=61.1 ms
64 bytes from 192.168.3.23: icmp_seq=6 ttl=64 time=36.3 ms
64 bytes from 192.168.3.23: icmp_seq=7 ttl=64 time=35.5 ms
64 bytes from 192.168.3.23: icmp_seq=8 ttl=64 time=33.0 ms
64 bytes from 192.168.3.23: icmp_seq=9 ttl=64 time=5.32 ms
64 bytes from 192.168.3.23: icmp_seq=10 ttl=64 time=14.6 ms
64 bytes from 192.168.3.23: icmp_seq=11 ttl=64 time=34.5 ms
64 bytes from 192.168.3.23: icmp_seq=12 ttl=64 time=4.67 ms
64 bytes from 192.168.3.23: icmp_seq=13 ttl=64 time=55.0 ms
64 bytes from 192.168.3.23: icmp_seq=14 ttl=64 time=4.93 ms
64 bytes from 192.16

[dpdk-dev] [PATCH v6] e1000: igb and em1000 PCI Port Hotplug changes

2015-06-25 Thread Iremonger, Bernard


> -Original Message-
> From: Zhang, Helin
> Sent: Thursday, June 25, 2015 3:34 AM
> To: Iremonger, Bernard; dev at dpdk.org
> Subject: RE: [PATCH v6] e1000: igb and em1000 PCI Port Hotplug changes
> 
> Hi Bernard
> 
> > -Original Message-
> > From: Iremonger, Bernard
> > Sent: Monday, June 22, 2015 6:44 PM
> > To: dev at dpdk.org
> > Cc: Zhang, Helin; Iremonger, Bernard
> > Subject: [PATCH v6] e1000: igb and em1000 PCI Port Hotplug changes
> >
> > This patch depends on the Port Hotplug Framework.
> > It implements the eth_dev_uninit functions for rte_em_pmd,
> rte_igb_pmd
> > and rte_igbvf_pmd.
> Would it be better to split this patch into smaller patches in a patch set as 
> you
> did for i40e hotplug?
> 
> Regards,
> Helin

Hi Helin,
I don't think there is anything to be gained by splitting up this patch.
All the changes are hotplug related.

In the case of the i40e, there are five patches which are unrelated to hotplug 
which resolve issues encountered during development.

Regards,

Bernard

[dpdk-dev] [PATCHv2 2/2] ABI: Add some documentation

2015-06-25 Thread Gonzalez Monroy, Sergio

On 25/06/2015 08:42, Gonzalez Monroy, Sergio wrote:
> On 25/06/2015 08:19, Zhang, Helin wrote:
>>
>>> -Original Message-
>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
>>> Sent: Thursday, June 25, 2015 2:35 AM
>>> To: dev at dpdk.org
>>> Subject: [dpdk-dev] [PATCHv2 2/2] ABI: Add some documentation
>>>
>>> People have been asking for ways to use the ABI macros, heres some 
>>> docs to
>>> clarify their use.  Included is:
>>>
>>> * An overview of what ABI is
>>> * Details of the ABI deprecation process
>>> * Details of the versioning macros
>>> * Examples of their use
>>> * Details of how to use the ABI validator
>>>
>>> Thanks to John Mcnamara, who duplicated much of this effort at Intel 
>>> while I was
>>> working on it.  Much of the introductory material was gathered and 
>>> cleaned up
>>> by him
>>>
>>> Signed-off-by: Neil Horman 
>>> CC: john.mcnamara at intel.com
>>> CC: thomas.monjalon at 6wind.com
>>>
>>> Change notes:
>>>
>>> v2)
>>>   * Fixed RST indentations and spelling errors
>>>   * Rebased to upstream to fix index.rst conflict
>>> ---
>>>   doc/guides/guidelines/index.rst  |   1 +
>>>   doc/guides/guidelines/versioning.rst | 456
>>> +++
>>>   2 files changed, 457 insertions(+)
>>>   create mode 100644 doc/guides/guidelines/versioning.rst
>>>
>>> diff --git a/doc/guides/guidelines/index.rst 
>>> b/doc/guides/guidelines/index.rst
>>> index 0ee9ab3..bfb9fa3 100644
>>> --- a/doc/guides/guidelines/index.rst
>>> +++ b/doc/guides/guidelines/index.rst
>>> @@ -7,3 +7,4 @@ Guidelines
>>>
>>>   coding_style
>>>   design
>>> +versioning
>>> diff --git a/doc/guides/guidelines/versioning.rst
>>> b/doc/guides/guidelines/versioning.rst
>>> new file mode 100644
>>> index 000..2aef526
>>> --- /dev/null
>>> +++ b/doc/guides/guidelines/versioning.rst
>>> @@ -0,0 +1,456 @@
>>> +Managing ABI updates
>>> +
>>> +
>>> +Description
>>> +---
>>> +
>>> +This document details some methods for handling ABI management in the
>>> DPDK.
>>> +Note this document is not exhaustive, in that C library versioning is
>>> +flexible allowing multiple methods to achieve various goals, but it
>>> +will provide the user with some introductory methods
>>> +
>>> +General Guidelines
>>> +--
>>> +
>>> +#. Whenever possible, ABI should be preserved
>>> +#. The addition of symbols is generally not problematic
>>> +#. The modification of symbols can generally be managed with versioning
>>> +#. The removal of symbols generally is an ABI break and requires bumping of the
>>> +   LIBABIVER macro
>>> +
>>> +What is an ABI
>>> +--
>>> +
>>> +An ABI (Application Binary Interface) is the set of runtime interfaces
>>> +exposed by a library. It is similar to an API (Application Programming
>>> +Interface) but is the result of compilation.  It is also effectively
>>> +cloned when applications link to dynamic libraries.  That is to say
>>> +when an application is compiled to link against dynamic libraries, it
>>> +is assumed that the ABI remains constant between the time the 
>>> application is
>>> compiled/linked, and the time that it runs.
>>> +Therefore, in the case of dynamic linking, it is critical that an ABI
>>> +is preserved, or (when modified), done in such a way that the
>>> +application is unable to behave improperly or in an unexpected 
>>> fashion.
>>> +
>>> +The DPDK ABI policy
>>> +---
>>> +
>>> +ABI versions are set at the time of major release labeling, and the 
>>> ABI
>>> +may change multiple times, without warning, between the last release
>>> +label and the HEAD label of the git tree.
>>> +
>>> +ABI versions, once released, are available until such time as their
>>> +deprecation has been noted in the Release Notes for at least one major
>>> +release cycle. For example consider the case where the ABI for DPDK 
>>> 2.0
>>> +has been shipped and then a decision is made to modify it during the
>>> +development of DPDK 2.1. The decision will be recorded in the Release
>>> +Notes for the DPDK 2.1 release and the modification will be made 
>>> available in
>>> the DPDK 2.2 release.
>>> +
>>> +ABI versions may be deprecated in whole or in part as needed by a 
>>> given
>>> +update.
>>> +
>>> +Some ABI changes may be too significant to reasonably maintain 
>>> multiple
>>> +versions. In those cases ABI's may be updated without backward
>>> +compatibility being provided. The requirements for doing so are:
>>> +
>>> +#. At least 3 acknowledgments of the need to do so must be made on the
>>> +   dpdk.org mailing list.
>>> +
>>> +#. A full deprecation cycle, as explained above, must be made to offer
>>> +   downstream consumers sufficient warning of the change.
>>> +
>>> +#. The ``LIBABIVER`` variable in the makefile(s) where the ABI 
>>> changes are
>>> +   incorporated must be incremented in parallel with the ABI changes
>>> +   themselves.
>>> +
>>> +Note that the above proc

[dpdk-dev] [PATCHv2 2/2] ABI: Add some documentation

2015-06-25 Thread Gonzalez Monroy, Sergio

On 25/06/2015 08:19, Zhang, Helin wrote:
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
>> Sent: Thursday, June 25, 2015 2:35 AM
>> To: dev at dpdk.org
>> Subject: [dpdk-dev] [PATCHv2 2/2] ABI: Add some documentation
>>
>> People have been asking for ways to use the ABI macros, heres some docs to
>> clarify their use.  Included is:
>>
>> * An overview of what ABI is
>> * Details of the ABI deprecation process
>> * Details of the versioning macros
>> * Examples of their use
>> * Details of how to use the ABI validator
>>
>> Thanks to John Mcnamara, who duplicated much of this effort at Intel while I 
>> was
>> working on it.  Much of the introductory material was gathered and cleaned up
>> by him
>>
>> Signed-off-by: Neil Horman 
>> CC: john.mcnamara at intel.com
>> CC: thomas.monjalon at 6wind.com
>>
>> Change notes:
>>
>> v2)
>>   * Fixed RST indentations and spelling errors
>>   * Rebased to upstream to fix index.rst conflict
>> ---
>>   doc/guides/guidelines/index.rst  |   1 +
>>   doc/guides/guidelines/versioning.rst | 456
>> +++
>>   2 files changed, 457 insertions(+)
>>   create mode 100644 doc/guides/guidelines/versioning.rst
>>
>> diff --git a/doc/guides/guidelines/index.rst 
>> b/doc/guides/guidelines/index.rst
>> index 0ee9ab3..bfb9fa3 100644
>> --- a/doc/guides/guidelines/index.rst
>> +++ b/doc/guides/guidelines/index.rst
>> @@ -7,3 +7,4 @@ Guidelines
>>
>>   coding_style
>>   design
>> +versioning
>> diff --git a/doc/guides/guidelines/versioning.rst
>> b/doc/guides/guidelines/versioning.rst
>> new file mode 100644
>> index 000..2aef526
>> --- /dev/null
>> +++ b/doc/guides/guidelines/versioning.rst
>> @@ -0,0 +1,456 @@
>> +Managing ABI updates
>> +
>> +
>> +Description
>> +---
>> +
>> +This document details some methods for handling ABI management in the
>> DPDK.
>> +Note this document is not exhaustive, in that C library versioning is
>> +flexible allowing multiple methods to achieve various goals, but it
>> +will provide the user with some introductory methods
>> +
>> +General Guidelines
>> +--
>> +
>> +#. Whenever possible, ABI should be preserved
>> +#. The addition of symbols is generally not problematic
>> +#. The modification of symbols can generally be managed with versioning
>> +#. The removal of symbols generally is an ABI break and requires bumping of the
>> +   LIBABIVER macro
>> +
>> +What is an ABI
>> +--
>> +
>> +An ABI (Application Binary Interface) is the set of runtime interfaces
>> +exposed by a library. It is similar to an API (Application Programming
>> +Interface) but is the result of compilation.  It is also effectively
>> +cloned when applications link to dynamic libraries.  That is to say
>> +when an application is compiled to link against dynamic libraries, it
>> +is assumed that the ABI remains constant between the time the application is
>> compiled/linked, and the time that it runs.
>> +Therefore, in the case of dynamic linking, it is critical that an ABI
>> +is preserved, or (when modified), done in such a way that the
>> +application is unable to behave improperly or in an unexpected fashion.
>> +
>> +The DPDK ABI policy
>> +---
>> +
>> +ABI versions are set at the time of major release labeling, and the ABI
>> +may change multiple times, without warning, between the last release
>> +label and the HEAD label of the git tree.
>> +
>> +ABI versions, once released, are available until such time as their
>> +deprecation has been noted in the Release Notes for at least one major
>> +release cycle. For example consider the case where the ABI for DPDK 2.0
>> +has been shipped and then a decision is made to modify it during the
>> +development of DPDK 2.1. The decision will be recorded in the Release
>> +Notes for the DPDK 2.1 release and the modification will be made available 
>> in
>> the DPDK 2.2 release.
>> +
>> +ABI versions may be deprecated in whole or in part as needed by a given
>> +update.
>> +
>> +Some ABI changes may be too significant to reasonably maintain multiple
>> +versions. In those cases ABI's may be updated without backward
>> +compatibility being provided. The requirements for doing so are:
>> +
>> +#. At least 3 acknowledgments of the need to do so must be made on the
>> +   dpdk.org mailing list.
>> +
>> +#. A full deprecation cycle, as explained above, must be made to offer
>> +   downstream consumers sufficient warning of the change.
>> +
>> +#. The ``LIBABIVER`` variable in the makefile(s) where the ABI changes are
>> +   incorporated must be incremented in parallel with the ABI changes
>> +   themselves.
>> +
>> +Note that the above process for ABI deprecation should not be
>> +undertaken lightly. ABI stability is extremely important for downstream
>> +consumers of the DPDK, especially when distributed in shared object
>> +form. Every effort should be made to

[dpdk-dev] Can't compile examples

2015-06-25 Thread Liu, Jijiang


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Thursday, June 25, 2015 4:27 PM
> To: Liu, Jijiang
> Cc: dev at dpdk.org; Tetsuya Mukawa
> Subject: Re: [dpdk-dev] Can't compile examples
> 
> 2015-06-25 11:31, Tetsuya Mukawa:
> > Hi Jijiang,
> >
> > It seems below patch introduces compile error of examples.
> >  - a50245e examples/tep_term: initialize VXLAN sample
> >
> > Here is log.
> > Could you please check it?
> >
> [...]
> >
> /home/mukawa/work/dpdk.org/dpdk/examples/tep_termination/main.c:52:28:
> > fatal error: rte_virtio_net.h: No such file or directory
> 
> The check before merging was with vhost enabled.
> 
> Jijiang, does it make sense to try make it without vhost?
> If not, examples/Makefile must contain
>   DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += tep_termination

CONFIG_RTE_LIBRTE_VHOST must be set to 'y' when compiling the VXLAN example.

[dpdk-dev] [PATCH v3 2/7] mbuf: use the reserved 16 bits for double vlan

2015-06-25 Thread Zhang, Helin

Hi Neil

> -Original Message-
> From: Zhang, Helin
> Sent: Thursday, June 11, 2015 3:04 PM
> To: dev at dpdk.org
> Cc: Cao, Min; Liu, Jijiang; Wu, Jingjing; Ananyev, Konstantin; Richardson, 
> Bruce;
> olivier.matz at 6wind.com; Zhang, Helin
> Subject: [PATCH v3 2/7] mbuf: use the reserved 16 bits for double vlan
> 
> Use the reserved 16 bits in rte_mbuf structure for the outer vlan, also add 
> QinQ
> offloading flags for both RX and TX sides.
> 
> Signed-off-by: Helin Zhang 
> ---
>  lib/librte_mbuf/rte_mbuf.h | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> v2 changes:
> * Fixed a typo.
> 
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h index
> ab6de67..84fe181 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -101,11 +101,17 @@ extern "C" {
>  #define PKT_RX_TUNNEL_IPV6_HDR (1ULL << 12) /**< RX tunnel packet with
> IPv6 header. */
>  #define PKT_RX_FDIR_ID   (1ULL << 13) /**< FD id reported if FDIR match.
> */
>  #define PKT_RX_FDIR_FLX  (1ULL << 14) /**< Flexible bytes reported if
> FDIR match. */
> +#define PKT_RX_QINQ_PKT  (1ULL << 15)  /**< RX packet with double
> VLAN stripped. */
>  /* add new RX flags here */
> 
>  /* add new TX flags here */
> 
>  /**
> + * Second VLAN insertion (QinQ) flag.
> + */
> +#define PKT_TX_QINQ_PKT(1ULL << 49)   /**< TX packet with double
> VLAN inserted. */
> +
> +/**
>   * TCP segmentation offload. To enable this offload feature for a
>   * packet to be transmitted on hardware supporting TSO:
>   *  - set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies @@
> -279,7 +285,7 @@ struct rte_mbuf {
>   uint16_t data_len;/**< Amount of data in segment buffer. */
>   uint32_t pkt_len; /**< Total pkt len: sum of all segments. */
>   uint16_t vlan_tci;/**< VLAN Tag Control Identifier (CPU order) 
> */
> - uint16_t reserved;
> + uint16_t vlan_tci_outer;  /**< Outer VLAN Tag Control Identifier (CPU
> +order) */
Do you think this is an ABI break or not? It just uses the reserved 16 bits, which 
were intended for the second_vlan_tag. Thanks in advance!
I did not see any "Incompatible" reported by validate_abi.sh.

Regards,
Helin

>   union {
>   uint32_t rss; /**< RSS hash result if RSS enabled */
>   struct {
> @@ -777,6 +783,7 @@ static inline void rte_pktmbuf_reset(struct rte_mbuf *m)
>   m->pkt_len = 0;
>   m->tx_offload = 0;
>   m->vlan_tci = 0;
> + m->vlan_tci_outer = 0;
>   m->nb_segs = 1;
>   m->port = 0xff;
> 
> @@ -849,6 +856,7 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf *mi,
> struct rte_mbuf *m)
>   mi->data_len = m->data_len;
>   mi->port = m->port;
>   mi->vlan_tci = m->vlan_tci;
> + mi->vlan_tci_outer = m->vlan_tci_outer;
>   mi->tx_offload = m->tx_offload;
>   mi->hash = m->hash;
> 
> --
> 1.9.3
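
On the ABI question raised inline above, one quick sanity check is to compare the layout before and after the rename. The sketch below uses trimmed-down stand-ins for the affected fields (not the real struct rte_mbuf) and checks that the size and the offsets are unchanged. Identical layout is a necessary condition only: it says nothing about code that relied on the reserved field staying zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed-down stand-in for the fields around the change, before the patch. */
struct demo_mbuf_old {
	uint16_t data_len;
	uint32_t pkt_len;
	uint16_t vlan_tci;
	uint16_t reserved;
	uint32_t rss;
};

/* Same fields after the patch: the reserved slot gets a name and a meaning. */
struct demo_mbuf_new {
	uint16_t data_len;
	uint32_t pkt_len;
	uint16_t vlan_tci;
	uint16_t vlan_tci_outer;
	uint32_t rss;
};

/* Compile-time check (C11): the overall size must not move. */
_Static_assert(sizeof(struct demo_mbuf_old) == sizeof(struct demo_mbuf_new),
	       "struct size changed");

/* Runtime form of the same checks; returns 1 when the layouts match. */
static int
demo_layout_compatible(void)
{
	return sizeof(struct demo_mbuf_old) == sizeof(struct demo_mbuf_new) &&
	       offsetof(struct demo_mbuf_old, reserved) ==
	       offsetof(struct demo_mbuf_new, vlan_tci_outer) &&
	       offsetof(struct demo_mbuf_old, rss) ==
	       offsetof(struct demo_mbuf_new, rss);
}
```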

[dpdk-dev] [PATCH 1/2] rte_compat.h : Clean up some typos

2015-06-25 Thread Neil Horman

On Thu, Jun 25, 2015 at 07:37:43AM +, Gajdzica, MaciejX T wrote:
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
> > Sent: Tuesday, June 23, 2015 9:34 PM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] [PATCH 1/2] rte_compat.h : Clean up some typos
> > 
> > Clean up some macro definition typos and comments
> > 
> > Signed-off-by: Neil Horman 
> > CC: thomas.monjalon at 6wind.com
> > ---
> >  lib/librte_compat/rte_compat.h | 14 +++---
> >  1 file changed, 7 insertions(+), 7 deletions(-)
> > 
> > diff --git a/lib/librte_compat/rte_compat.h b/lib/librte_compat/rte_compat.h
> > index fb0dc19..75920a1 100644
> > --- a/lib/librte_compat/rte_compat.h
> > +++ b/lib/librte_compat/rte_compat.h
> > @@ -54,7 +54,7 @@
> >   * foo is exported as a global symbol.
> >   *
> >   * 2) rename the existing function int foo(char *string) to
> > - * int __vsym foo_v20(char *string)
> > + * int foo_v20(char *string)
> >   *
> >   * 3) Add this macro immediately below the function
> >   * VERSION_SYMBOL(foo, _v20, 2.0);
> > @@ -63,7 +63,7 @@
> >   * char foo(int value, int otherval) { ...}
> >   *
> >   * 5) Mark the newest version as the default version
> > - * BIND_DEFAULT_SYMBOL(foo, 2.1);
> > + * BIND_DEFAULT_SYMBOL(foo, _v21, 2.1);
> >   *
> >   */
> > 
> > @@ -79,21 +79,21 @@
> >   * Creates a symbol version table entry binding symbol @DPDK_ to the
> > internal
> >   * function name _
> >   */
> > -#define VERSION_SYMBOL(b, e, n) __asm__(".symver " RTE_STR(b) RTE_STR(e)
> > ", "RTE_STR(b)"@DPDK_"RTE_STR(n))
> > +#define VERSION_SYMBOL(b, e, n) __asm__(".symver " RTE_STR(b)
> > +RTE_STR(e) ", " RTE_STR(b) "@DPDK_" RTE_STR(n))
> > 
> >  /*
> >   * BASE_SYMBOL
> >   * Creates a symbol version table entry binding unversioned symbol 
> >   * to the internal function _
> >   */
> > -#define BASE_SYMBOL(b, e) __asm__(".symver " RTE_STR(b) RTE_STR(e) ",
> > "RTE_STR(b)"@")
> > +#define BASE_SYMBOL(b, e) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", "
> > +RTE_STR(b)"@")
> > 
> >  /*
> > - * BNID_DEFAULT_SYMBOL
> > + * BIND_DEFAULT_SYMBOL
> >   * Creates a symbol version entry instructing the linker to bind 
> > references to
> >   * symbol  to the internal symbol _
> >   */
> > -#define BIND_DEFAULT_SYMBOL(b, e, n) __asm__(".symver " RTE_STR(b)
> > RTE_STR(e) ", "RTE_STR(b)"@@DPDK_"RTE_STR(n))
> > +#define BIND_DEFAULT_SYMBOL(b, e, n) __asm__(".symver " RTE_STR(b)
> > +RTE_STR(e) ", " RTE_STR(b) "@@DPDK_" RTE_STR(n))
> >  #define __vsym __attribute__((used))
> > 
> >  #else
> > @@ -103,7 +103,7 @@
> >  #define VERSION_SYMBOL(b, e, v)
> >  #define __vsym
> >  #define BASE_SYMBOL(b, n)
> > -#define BIND_DEFAULT_SYMBOL(b, v)
> > +#define BIND_DEFAULT_SYMBOL(b, e, n)
> > 
> >  /*
> >   * RTE_BUILD_SHARED_LIB=n
> > --
> > 2.1.0
> 
> This patch doesn't solve the issue with the static build.
> 
> You have function:
> int foo(int val)
> 
> And you want to create new version of it. So after edit you will have:
> int foo_v20(int val)
> {
> [...]
> }
> VERSION_SYMBOL(foo, _v20, 2.0);
> 
> int foo_v21(int val1, int val2)
> {
> [...]
> }
> BIND_DEFAULT_SYMBOL (foo, _v21, 2.1);
> 
> You also have an external application that uses the foo function. You try to
> compile this app with dpdk compiled as shared and as static. In the first case
> everything will work fine, but in the second the linker won't find the
> definition of foo because it doesn't exist. There are only definitions of
> foo_v20 and foo_v21.
> 
> Best Regards
> Maciek
> 
As I noted before, you can avoid that by explicitly making the latest version of
the function the static version (that is to say, only rename older versions and
allow the 'latest' to be called rte_acl_create, in this case).  You're right
though, I prefer to rename all functions, and in my example above I didn't
address the static build issue.  I'm adding a macro for that in my repost.

Neil
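
The macro mentioned for the repost is not shown in this thread. As one possible approach, the sketch below keeps both versioned names and gives the plain name to the newest implementation through a GCC/Clang alias attribute, so a static link can still resolve it. The demo_foo names are made up, and the alias attribute assumes an ELF target with the aliased function defined in the same translation unit.

```c
#include <assert.h>

/* Older implementation, kept under a versioned name. */
int demo_foo_v20(int val)
{
	return val + 1;
}

/* Newest implementation, also under a versioned name. */
int demo_foo_v21(int val1, int val2)
{
	return val1 + val2;
}

/*
 * A static build has no symbol version table, so the plain name would
 * otherwise be undefined. Map it onto the newest implementation.
 */
int demo_foo(int val1, int val2) __attribute__((alias("demo_foo_v21")));
```

In a shared build the .symver directives would take over this role, so a real macro would presumably expand to the alias only when RTE_BUILD_SHARED_LIB is not set.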

[dpdk-dev] [PATCHv2 2/2] ABI: Add some documentation

2015-06-25 Thread Neil Horman

On Thu, Jun 25, 2015 at 07:19:49AM +, Zhang, Helin wrote:
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
> > Sent: Thursday, June 25, 2015 2:35 AM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] [PATCHv2 2/2] ABI: Add some documentation
> > 
> > People have been asking for ways to use the ABI macros, heres some docs to
> > clarify their use.  Included is:
> > 
> > * An overview of what ABI is
> > * Details of the ABI deprecation process
> > * Details of the versioning macros
> > * Examples of their use
> > * Details of how to use the ABI validator
> > 
> > Thanks to John Mcnamara, who duplicated much of this effort at Intel while
> > I was working on it.  Much of the introductory material was gathered and
> > cleaned up by him.
> > 
> > Signed-off-by: Neil Horman 
> > CC: john.mcnamara at intel.com
> > CC: thomas.monjalon at 6wind.com
> > 
> > Change notes:
> > 
> > v2)
> >  * Fixed RST indentations and spelling errors
> >  * Rebased to upstream to fix index.rst conflict
> > ---
> >  doc/guides/guidelines/index.rst  |   1 +
> >  doc/guides/guidelines/versioning.rst | 456
> > +++
> >  2 files changed, 457 insertions(+)
> >  create mode 100644 doc/guides/guidelines/versioning.rst
> > 
> > diff --git a/doc/guides/guidelines/index.rst 
> > b/doc/guides/guidelines/index.rst
> > index 0ee9ab3..bfb9fa3 100644
> > --- a/doc/guides/guidelines/index.rst
> > +++ b/doc/guides/guidelines/index.rst
> > @@ -7,3 +7,4 @@ Guidelines
> > 
> >  coding_style
> >  design
> > +versioning
> > diff --git a/doc/guides/guidelines/versioning.rst
> > b/doc/guides/guidelines/versioning.rst
> > new file mode 100644
> > index 000..2aef526
> > --- /dev/null
> > +++ b/doc/guides/guidelines/versioning.rst
> > @@ -0,0 +1,456 @@
> > +Managing ABI updates
> > +====================
> > +
> > +Description
> > +-----------
> > +
> > +This document details some methods for handling ABI management in the DPDK.
> > +Note this document is not exhaustive, in that C library versioning is
> > +flexible allowing multiple methods to achieve various goals, but it
> > +will provide the user with some introductory methods
> > +
> > +General Guidelines
> > +------------------
> > +
> > +#. Whenever possible, ABI should be preserved
> > +#. The addition of symbols is generally not problematic
> > +#. The modification of symbols can generally be managed with versioning
> > +#. The removal of symbols generally is an ABI break and requires bumping of the
> > +   LIBABIVER macro
> > +
> > +What is an ABI
> > +--------------
> > +
> > +An ABI (Application Binary Interface) is the set of runtime interfaces
> > +exposed by a library. It is similar to an API (Application Programming
> > +Interface) but is the result of compilation.  It is also effectively
> > +cloned when applications link to dynamic libraries.  That is to say
> > +when an application is compiled to link against dynamic libraries, it
> > +is assumed that the ABI remains constant between the time the application
> > +is compiled/linked, and the time that it runs.
> > +Therefore, in the case of dynamic linking, it is critical that an ABI
> > +is preserved, or (when modified), done in such a way that the
> > +application is unable to behave improperly or in an unexpected fashion.
> > +
> > +The DPDK ABI policy
> > +-------------------
> > +
> > +ABI versions are set at the time of major release labeling, and the ABI
> > +may change multiple times, without warning, between the last release
> > +label and the HEAD label of the git tree.
> > +
> > +ABI versions, once released, are available until such time as their
> > +deprecation has been noted in the Release Notes for at least one major
> > +release cycle. For example consider the case where the ABI for DPDK 2.0
> > +has been shipped and then a decision is made to modify it during the
> > +development of DPDK 2.1. The decision will be recorded in the Release
> > +Notes for the DPDK 2.1 release and the modification will be made available
> > +in the DPDK 2.2 release.
> > +
> > +ABI versions may be deprecated in whole or in part as needed by a given
> > +update.
> > +
> > +Some ABI changes may be too significant to reasonably maintain multiple
> > +versions. In those cases ABIs may be updated without backward
> > +compatibility being provided. The requirements for doing so are:
> > +
> > +#. At least 3 acknowledgments of the need to do so must be made on the
> > +   dpdk.org mailing list.
> > +
> > +#. A full deprecation cycle, as explained above, must be made to offer
> > +   downstream consumers sufficient warning of the change.
> > +
> > +#. The ``LIBABIVER`` variable in the makefile(s) where the ABI changes are
> > +   incorporated must be incremented in parallel with the ABI changes
> > +   themselves.
> > +
> > +Note that the above process for ABI deprecation should not be
> > +undertaken lightly. ABI sta

[dpdk-dev] VMXNET3 on vmware, ping delay

2015-06-25 Thread Matthew Hall

On Thu, Jun 25, 2015 at 09:14:53AM +, Vass, Sandor (Nokia - HU/Budapest) 
wrote:
> According to my understanding each packet should go 
> through BR as fast as possible, but it seems that the rte_eth_rx_burst 
> retrieves packets only when there are at least 2 packets on the RX queue of 
> the NIC. At least most of the times as there are cases (rarely - according 
> to my console log) when it can retrieve 1 packet also and sometimes only 3 
> packets can be retrieved...

By default DPDK is optimized for throughput, not latency. Try a test with 
heavier traffic.

There is also some work going on now for DPDK interrupt-driven mode, which 
will work more like traditional Ethernet drivers instead of polling mode 
Ethernet drivers.

Though I'm not an expert on it, there is also a series of ways to optimize for 
latency, which hopefully some others could discuss... or maybe search the 
archives / web site / Intel tuning documentation.

Matthew.

[dpdk-dev] [PATCH 1/2] rte_compat.h : Clean up some typos

2015-06-25 Thread Gajdzica, MaciejX T



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
> Sent: Tuesday, June 23, 2015 9:34 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 1/2] rte_compat.h : Clean up some typos
> 
> Clean up some macro definition typos and comments
> 
> Signed-off-by: Neil Horman 
> CC: thomas.monjalon at 6wind.com
> ---
>  lib/librte_compat/rte_compat.h | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/librte_compat/rte_compat.h b/lib/librte_compat/rte_compat.h
> index fb0dc19..75920a1 100644
> --- a/lib/librte_compat/rte_compat.h
> +++ b/lib/librte_compat/rte_compat.h
> @@ -54,7 +54,7 @@
>   * foo is exported as a global symbol.
>   *
>   * 2) rename the existing function int foo(char *string) to
> - *   int __vsym foo_v20(char *string)
> + *   int foo_v20(char *string)
>   *
>   * 3) Add this macro immediately below the function
>   *   VERSION_SYMBOL(foo, _v20, 2.0);
> @@ -63,7 +63,7 @@
>   *   char foo(int value, int otherval) { ...}
>   *
>   * 5) Mark the newest version as the default version
> - *   BIND_DEFAULT_SYMBOL(foo, 2.1);
> + *   BIND_DEFAULT_SYMBOL(foo, _v21, 2.1);
>   *
>   */
> 
> @@ -79,21 +79,21 @@
>   * Creates a symbol version table entry binding symbol <b>@DPDK_<n> to the
>   * internal function name <b>_<e>
>   */
> -#define VERSION_SYMBOL(b, e, n) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", "RTE_STR(b)"@DPDK_"RTE_STR(n))
> +#define VERSION_SYMBOL(b, e, n) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", " RTE_STR(b) "@DPDK_" RTE_STR(n))
> 
>  /*
>   * BASE_SYMBOL
>   * Creates a symbol version table entry binding unversioned symbol <b>
>   * to the internal function <b>_<e>
>   */
> -#define BASE_SYMBOL(b, e) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", "RTE_STR(b)"@")
> +#define BASE_SYMBOL(b, e) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", " RTE_STR(b)"@")
> 
>  /*
> - * BNID_DEFAULT_SYMBOL
> + * BIND_DEFAULT_SYMBOL
>   * Creates a symbol version entry instructing the linker to bind references to
>   * symbol <b> to the internal symbol <b>_<e>
>   */
> -#define BIND_DEFAULT_SYMBOL(b, e, n) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", "RTE_STR(b)"@@DPDK_"RTE_STR(n))
> +#define BIND_DEFAULT_SYMBOL(b, e, n) __asm__(".symver " RTE_STR(b) RTE_STR(e) ", " RTE_STR(b) "@@DPDK_" RTE_STR(n))
>  #define __vsym __attribute__((used))
> 
>  #else
> @@ -103,7 +103,7 @@
>  #define VERSION_SYMBOL(b, e, v)
>  #define __vsym
>  #define BASE_SYMBOL(b, n)
> -#define BIND_DEFAULT_SYMBOL(b, v)
> +#define BIND_DEFAULT_SYMBOL(b, e, n)
> 
>  /*
>   * RTE_BUILD_SHARED_LIB=n
> --
> 2.1.0

This patch doesn't solve the issue with the static build.

You have a function:
int foo(int val)

And you want to create a new version of it. So after the edit you will have:
int foo_v20(int val)
{
[...]
}
VERSION_SYMBOL(foo, _v20, 2.0);

int foo_v21(int val1, int val2)
{
[...]
}
BIND_DEFAULT_SYMBOL (foo, _v21, 2.1);

You also have an external application that uses the foo function. You try to
compile this app against DPDK built both as shared and as static. In the first
case everything works fine, but in the second the linker won't find a
definition of foo because it doesn't exist. There are only definitions of
foo_v20 and foo_v21.

Best Regards
Maciek
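
To make the shared-library case concrete, the sketch below builds the text of
the .symver directives that the '+' versions of VERSION_SYMBOL and
BIND_DEFAULT_SYMBOL in the patch hand to the assembler for the foo example.
The helper names (STR, OLD_SYMVER_TEXT, DEFAULT_SYMVER_TEXT and the two
functions) are hypothetical and exist only for this illustration; STR stands in
for RTE_STR, which is assumed here to be an ordinary two-level stringify macro.

```c
#include <assert.h>
#include <string.h>

/* Two-level stringify helper (assumed equivalent of RTE_STR). */
#define STR_(x) #x
#define STR(x) STR_(x)

/* Directive text as assembled by VERSION_SYMBOL(b, e, n) in the patch. */
#define OLD_SYMVER_TEXT(b, e, n) \
	".symver " STR(b) STR(e) ", " STR(b) "@DPDK_" STR(n)

/* Directive text as assembled by BIND_DEFAULT_SYMBOL(b, e, n) in the patch. */
#define DEFAULT_SYMVER_TEXT(b, e, n) \
	".symver " STR(b) STR(e) ", " STR(b) "@@DPDK_" STR(n)

/* Binding for the old implementation: foo_v20 is exported as foo@DPDK_2.0. */
static const char *old_binding(void)
{
	return OLD_SYMVER_TEXT(foo, _v20, 2.0);
}

/* Binding for the default implementation: foo_v21 becomes foo@@DPDK_2.1. */
static const char *default_binding(void)
{
	return DEFAULT_SYMVER_TEXT(foo, _v21, 2.1);
}
```

In a shared build the second directive is what lets new references to foo
resolve to foo_v21. In a static build the macros expand to nothing, so no such
binding exists and only foo_v20 and foo_v21 are visible to the linker.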
