date:20150904

[dpdk-dev] how to change binding of NIC ports to NUMA nodes

2015-09-04 Thread Rajesh R

Hi Pablo,

Thank you for the reply. I think I did not convey my query properly in my
question.

I agree that the physical placement of a NIC in a PCIe slot decides the NUMA
node it is associated with.
But in the server I am experimenting with (IBM System x3850 X5 with 4 Xeon
7560 processors) there are two I/O hubs through which the PCIe slots are
connected to the CPU sockets. 4 of the PCIe slots are connected to IOH1
and 3 slots are connected to IOH2. Each IOH is connected to 2 CPU
sockets: IOH1 is connected to sockets 0 and 1, IOH2 is connected to
sockets 2 and 3. When I put 2 NICs in slots connected to IOH1, both get
bound to socket 0. Similarly, when I put 2 NICs in slots connected to
IOH2, both get bound to socket 2.

My question is: why does none of the cards get bound to NUMA nodes (sockets)
1 or 3?

Is there something that I am missing in the physical architecture of the
server? Is it that each IOH is directly connected to only 1 socket?

Regards
Rajesh

On Fri, Sep 4, 2015 at 12:50 PM, De Lara Guarch, Pablo <
pablo.de.lara.guarch at intel.com> wrote:

> Hi Rajesh,
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Rajesh R
> > Sent: Friday, September 04, 2015 5:29 AM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] how to change binding of NIC ports to NUMA nodes
> >
> > Hi,
> >
> > I am trying an application based on dpdk on a 4- processor server i.e. 4
> > numa nodes.
> > The server is having with 4 NIC cards out of which 2 cards get binded to
> > numa node 0 and other 2 cards get binded to numa node 2 (as per the
> > /sys/pci/.../numa_node for each card)
> >
> >
> > How to evenly distribute the cards to all the numa nodes so that one card
> > each gets binded to one numa node?
> >
> > Can we control the binding from dpdk, either pmd_ixgbe or igb_uio?
>
> The drivers cannot change the numa node where your NICs are,
> as those nodes are associated to the different physical sockets (CPU and
> memory)
> that you have on your platform, and your NICs are connected physically
> to these sockets via the PCI slots.
>
> So, if you want to change the numa node, you will have to move the NIC(s)
> to another PCI slot that is connected to a different socket.
> Look at the user guide of your platform to find out which PCI slots are
> connected to which socket.
>
> Regards,
>
> Pablo
> >
> >
> > --
> > Regards
> >
> > Rajesh R
>

-- 
Regards

Rajesh R
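
For reference, the per-card value discussed in this thread is the numa_node attribute that the Linux kernel exposes for each PCI device in sysfs. A minimal sketch of reading it from C follows; the function name, the return-code convention and the PCI address in the comment are illustrative assumptions, not part of DPDK.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read the NUMA node the kernel reports for one PCI device.
 * sysfs_path is e.g. "/sys/bus/pci/devices/0000:04:00.0/numa_node"
 * (the PCI address is a made-up example).
 * Returns the node number, -1 when the kernel reports no affinity,
 * or -2 (a convention of this sketch) if the file cannot be read.
 */
static int read_numa_node(const char *sysfs_path)
{
	FILE *f;
	int node;

	assert(sysfs_path != NULL);

	f = fopen(sysfs_path, "r");
	if (f == NULL)
		return -2;
	if (fscanf(f, "%d", &node) != 1)
		node = -2;
	fclose(f);
	return node;
}
```

The value is read-only: it reflects how the platform firmware describes the slot, which is why neither the PMD nor igb_uio can change it.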

[dpdk-dev] [RFC PATCH 00/18] refactor eal driver registration code

2015-09-04 Thread Neil Horman

On Fri, Sep 04, 2015 at 12:18:50PM +0100, Bruce Richardson wrote:
> On Fri, Sep 04, 2015 at 12:01:36PM +0100, Bernard Iremonger wrote:
> > At present the eal driver registration code is more complicated than it
> > needs to be.
> > 
> > This RFC proposes to simplify the eal driver registration code.
> > 
> > Remove the type field from the eal driver structure.
> > Refactor the eal driver registration code to use the name
> > field in the eal driver structure instead of the type field.
> > 
> > Modify all PMD's to use the modified eal driver structure.
> > Initialise the name field in the eal driver structure
> > in some PMD's where it is not initialised at present.
> > 
> >
> Hi,
> 
> I don't think I like this approach very much. It seems very brittle to remove
> the explicit type field and starting to rely on the drivers putting a prefix
> in the name instead i.e. implicit typing.
> 
> What is the major concern with marking drivers as virtual or physical? My thinking
> is that we should keep the type field, just perhaps change PDEV to be more
> descriptive in identifying the type of physical device, e.g. DEV_PCI.
> 
The issue is largely philosophical.  We shouldn't need to define the type of bus
a driver is on in the init structure of a pmd.  Instead we should register it
dynamically during pmd initialization.

As you note, enumerating the bus type (i.e. PCI/USB/etc.) is a step in the right
direction, but it would be better to register that dynamically than to encode it
in the data structure.
Neil

> Regards,
> /Bruce
>

[dpdk-dev] ixgbe: account more Rx errors Issue

2015-09-04 Thread Andriy Berestovskyy

Hi Maryam,
Please see below.

> XEC counts the Number of receive IPv4, TCP, UDP or SCTP XSUM errors

Please note that the UDP checksum is optional for IPv4, but UDP packets with a
zero checksum hit XEC.

> And general crc errors counts Counts the number of receive packets with CRC 
> errors.

Let me explain with an example.

DPDK 2.0 behavior:
host A sends 10M IPv4 UDP packets (no checksum) to host B
host B stats: 9M ipackets + 1M ierrors (missed) = 10M

DPDK 2.1 behavior:
host A sends 10M IPv4 UDP packets (no checksum) to host B
host B stats: 9M ipackets + 11M in ierrors (1M missed + 10M XEC) = 20M?

> So our options are we can:
> 1. Add only one of these into the error stats.
> 2. We can introduce some cooking of stats in this scenario, so only add 
> either or if they are equal or one is higher than the other.
> 3. Add them all which means you can have more errors than the number of 
> received packets, but TBH this is going to be the case if your packets have 
> multiple errors anyway.

4. ierrors should reflect NIC drops only.
XEC does not count drops, so IMO it should be removed from ierrors.

Please note that we can still access XEC using rte_eth_xstats_get().


Regards,
Andriy
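
The arithmetic in the example above can be sketched in a few lines of C. The struct and function names are hypothetical and only model the accounting; they are not the ixgbe driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the counters used in the example. */
struct rx_counters {
	uint64_t ipackets; /* packets delivered to the application */
	uint64_t missed;   /* packets dropped by the NIC */
	uint64_t xec;      /* checksum errors; these packets are still delivered */
};

/* Option 3 from the thread: every error counter is added to ierrors. */
static uint64_t ierrors_all(const struct rx_counters *c)
{
	assert(c != NULL);
	return c->missed + c->xec;
}

/* Option 4 from the thread: ierrors reflects NIC drops only. */
static uint64_t ierrors_drops_only(const struct rx_counters *c)
{
	assert(c != NULL);
	return c->missed;
}
```

With 9M ipackets, 1M missed and 10M XEC, the first function gives 11M errors (20M in total against 10M packets sent), while the second gives 1M errors, so ipackets + ierrors matches the 10M packets sent.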

[dpdk-dev] virtio optimization idea

2015-09-04 Thread Xie, Huawei

There was a formatting issue with the ASCII chart of the TX ring. The chart
is updated below.
Sorry for the trouble.


On 9/4/2015 4:25 PM, Xie, Huawei wrote:
> Hi:
>
> Recently I have done one virtio optimization proof of concept. The
> optimization includes two parts:
> 1) avail ring set with fixed descriptors
> 2) RX vectorization
> With the optimizations, we could get a several-fold performance boost
> for pure vhost-virtio throughput.
>
> Here I will only cover the first part, which is the prerequisite for the
> second part.
> Let us first take RX for example. Currently when we fill the avail ring
> with guest mbufs, we need to
> a) allocate one descriptor (for a non-sg mbuf) from the free descriptors
> b) set the idx of the desc into the entry of the avail ring
> c) set the addr/len fields of the descriptor to point to the guest's blank
> mbuf data area
>
> Those operations take time, and step b in particular leaves the cache line
> for the avail ring in the modified (M) state in the virtio processing
> core. When vhost processes the avail ring, the cache line transfer from
> the virtio processing core to the vhost processing core takes quite a lot
> of CPU cycles.
> To solve this problem, this is the arrangement of the RX ring for the DPDK
> PMD (for the non-mergeable case).
>
> avail
>  idx
>    +
>    |
> +-----+-----+-----+-----+-----+-----+
> |  0  |  1  |  2  | ... | 254 | 255 |  avail ring
> +--+--+--+--+--+--+-----+--+--+--+--+
>    |     |     |           |     |
>    v     v     v           v     v
> +--+--+--+--+--+--+-----+--+--+--+--+
> |  0  |  1  |  2  | ... | 254 | 255 |  desc ring
> +-----+-----+-----+-----+-----+-----+
>    |
>    |
> +-----+-----+-----+-----+-----+-----+
> |  0  |  1  |  2  | ... | 254 | 255 |  used ring
> +-----+-----+-----+-----+-----+-----+
>    |
>    +
> The avail ring is initialized with fixed descriptors and is never changed,
> i.e. the index value of the nth avail ring entry is always n, which
> means the virtio PMD is actually refilling the desc ring only, without
> having to change the avail ring.
> When vhost fetches the avail ring, if not evicted, it is always in its first
> level cache.
>
> When RX receives packets from the used ring, we use the used->idx as the
> desc idx. This requires that vhost processes and returns descs from the
> avail ring to the used ring in order, which is true for both the current DPDK
> vhost and the kernel vhost implementation. In my understanding, there is no
> necessity for vhost net to process descriptors OOO. One case could be
> zero copy: for example, if one descriptor doesn't meet the zero copy
> requirement, we could directly return it to the used ring, earlier than the
> descriptors in front of it.
> To enforce this, I want to use a reserved bit to indicate in-order
> processing of descriptors.
>
> For tx ring, the arrangement is like below. Each transmitted mbuf needs
> a desc for virtio_net_hdr, so actually we have only 128 free slots.
>
>                         ++
>                         ||
>                         ||
> +-----+-----+-----+-----++-----+-----+-----+-----+
> |  0  |  1  | ... | 127 || 128 | 129 | ... | 255 |  avail ring
> +--+--+--+--+-----+--+--++--+--+--+--+-----+--+--+
>    |     |           |  ||  |     |           |
>    v     v           v  ||  v     v           v
> +--+--+--+--+-----+--+--++--+--+--+--+-----+--+--+
> | 127 | 128 | ... | 255 || 127 | 128 | ... | 255 |  desc ring for virtio_net_hdr
> +--+--+--+--+-----+--+--++--+--+--+--+-----+--+--+
>    |     |           |  ||  |     |           |
>    v     v           v  ||  v     v           v
> +--+--+--+--+-----+--+--++--+--+--+--+-----+--+--+
> |  0  |  1  | ... | 127 ||  0  |  1  | ... | 127 |  desc ring for tx data
> +-----+-----+-----+-----++-----+-----+-----+-----+
>
> /huawei
>
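
The fixed avail ring arrangement described for RX can be sketched as follows. This is a simplified model with hypothetical type and function names, not the virtio PMD code: the real vring descriptor also carries flags and next fields, and the memory barriers needed before publishing the avail index are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 256 /* matches the 256-entry rings in the charts */

/* Simplified descriptor: only the fields the refill path writes. */
struct desc {
	uint64_t addr;
	uint32_t len;
};

struct rx_ring {
	uint16_t avail[RING_SIZE];   /* avail ring entries (desc indexes) */
	struct desc desc[RING_SIZE]; /* descriptor ring */
	uint16_t avail_idx;          /* free-running avail index */
};

/* One-time setup: the nth avail entry always points to descriptor n. */
static void rx_ring_init(struct rx_ring *r)
{
	unsigned int i;

	assert(r != NULL);
	for (i = 0; i < RING_SIZE; i++)
		r->avail[i] = (uint16_t)i;
	r->avail_idx = 0;
}

/*
 * Refill one slot: only the descriptor and the avail index are written.
 * The avail ring entries themselves are never touched again, so their
 * cache lines are not dirtied on the refill path.
 */
static void rx_ring_refill(struct rx_ring *r, uint64_t buf_addr,
			   uint32_t buf_len)
{
	uint16_t slot;

	assert(r != NULL);
	slot = r->avail_idx % RING_SIZE;
	r->desc[slot].addr = buf_addr;
	r->desc[slot].len = buf_len;
	r->avail_idx++;
}
```

Because the slot is derived from the avail index, this only works if descriptors come back from the used ring in the same order, which is the in-order requirement stated in the message.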

[dpdk-dev] [PATCH v2 3/3] version: 2.2.0-rc0

2015-09-04 Thread Thomas Monjalon

Signed-off-by: Thomas Monjalon 
---
 lib/librte_eal/common/include/rte_version.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_version.h b/lib/librte_eal/common/include/rte_version.h
index 29baa06..08cc87a 100644
--- a/lib/librte_eal/common/include/rte_version.h
+++ b/lib/librte_eal/common/include/rte_version.h
@@ -60,7 +60,7 @@ extern "C" {
 /**
  * Minor version number i.e. the y in x.y.z
  */
-#define RTE_VER_MINOR 1
+#define RTE_VER_MINOR 2

 /**
  * Patch level number i.e. the z in x.y.z
@@ -70,14 +70,14 @@ extern "C" {
 /**
  * Extra string to be appended to version number
  */
-#define RTE_VER_SUFFIX ""
+#define RTE_VER_SUFFIX "-rc"

 /**
  * Patch release number
  *   0-15 = release candidates
  *   16   = release
  */
-#define RTE_VER_PATCH_RELEASE 16
+#define RTE_VER_PATCH_RELEASE 0

 /**
  * Macro to compute a version number usable for comparisons
-- 
2.5.1

[dpdk-dev] [PATCH v2 2/3] hash: remove deprecated function and macros

2015-09-04 Thread Thomas Monjalon

From: Pablo de Lara 

The function rte_jhash2() was renamed rte_jhash_32b and
macros RTE_HASH_KEY_LENGTH_MAX and RTE_HASH_BUCKET_ENTRIES_MAX
were tagged as deprecated, so they can be removed in 2.2.

RTE_HASH_KEY_LENGTH is replaced in unit tests by an internal macro
for the memory allocation of all keys used.

The library version number is incremented.

Signed-off-by: Pablo de Lara 
Signed-off-by: Thomas Monjalon 
---
 app/test/test_hash.c |  7 ---
 app/test/test_hash_functions.c   |  4 ++--
 app/test/test_hash_perf.c|  2 +-
 doc/guides/rel_notes/deprecation.rst |  5 -
 doc/guides/rel_notes/release_2_2.rst |  5 -
 lib/librte_hash/Makefile |  2 +-
 lib/librte_hash/rte_hash.h   |  6 --
 lib/librte_hash/rte_jhash.h  | 15 ++-
 8 files changed, 14 insertions(+), 32 deletions(-)

diff --git a/app/test/test_hash.c b/app/test/test_hash.c
index 7f8c0d3..4f2509d 100644
--- a/app/test/test_hash.c
+++ b/app/test/test_hash.c
@@ -66,6 +66,7 @@
 static rte_hash_function hashtest_funcs[] = {rte_jhash, rte_hash_crc};
 static uint32_t hashtest_initvals[] = {0};
 static uint32_t hashtest_key_lens[] = {0, 2, 4, 5, 6, 7, 8, 10, 11, 15, 16, 21, 31, 32, 33, 63, 64};
+#define MAX_KEYSIZE 64
 
/**/
 #define LOCAL_FBK_HASH_ENTRIES_MAX (1 << 15)

@@ -238,7 +239,7 @@ test_crc32_hash_alg_equiv(void)
 static void run_hash_func_test(rte_hash_function f, uint32_t init_val,
uint32_t key_len)
 {
-   static uint8_t key[RTE_HASH_KEY_LENGTH_MAX];
+   static uint8_t key[MAX_KEYSIZE];
unsigned i;


@@ -1100,7 +1101,7 @@ test_hash_creation_with_good_parameters(void)
 static int test_average_table_utilization(void)
 {
struct rte_hash *handle;
-   uint8_t simple_key[RTE_HASH_KEY_LENGTH_MAX];
+   uint8_t simple_key[MAX_KEYSIZE];
unsigned i, j;
unsigned added_keys, average_keys_added = 0;
int ret;
@@ -1154,7 +1155,7 @@ static int test_hash_iteration(void)
 {
struct rte_hash *handle;
unsigned i;
-   uint8_t keys[NUM_ENTRIES][RTE_HASH_KEY_LENGTH_MAX];
+   uint8_t keys[NUM_ENTRIES][MAX_KEYSIZE];
const void *next_key;
void *next_data;
void *data[NUM_ENTRIES];
diff --git a/app/test/test_hash_functions.c b/app/test/test_hash_functions.c
index 8c7cf63..3ad6d80 100644
--- a/app/test/test_hash_functions.c
+++ b/app/test/test_hash_functions.c
@@ -85,7 +85,7 @@ static uint32_t hash_values_crc[2][10] = {{
  * from the array entries is tested.
  */
 #define HASHTEST_ITERATIONS 100
-
+#define MAX_KEYSIZE 64
 static rte_hash_function hashtest_funcs[] = {rte_jhash, rte_hash_crc};
 static uint32_t hashtest_initvals[] = {0, 0xdeadbeef};
 static uint32_t hashtest_key_lens[] = {
@@ -119,7 +119,7 @@ static void
 run_hash_func_perf_test(uint32_t key_len, uint32_t init_val,
rte_hash_function f)
 {
-   static uint8_t key[HASHTEST_ITERATIONS][RTE_HASH_KEY_LENGTH_MAX];
+   static uint8_t key[HASHTEST_ITERATIONS][MAX_KEYSIZE];
uint64_t ticks, start, end;
unsigned i, j;

diff --git a/app/test/test_hash_perf.c b/app/test/test_hash_perf.c
index a87fc80..9d53c14 100644
--- a/app/test/test_hash_perf.c
+++ b/app/test/test_hash_perf.c
@@ -140,7 +140,7 @@ shuffle_input_keys(unsigned table_index)
 {
unsigned i;
uint32_t swap_idx;
-   uint8_t temp_key[RTE_HASH_KEY_LENGTH_MAX];
+   uint8_t temp_key[MAX_KEYSIZE];
hash_sig_t temp_signature;
int32_t temp_position;

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 5f6079b..fffad80 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -13,11 +13,6 @@ Deprecation Notices
   There is no backward compatibility planned from release 2.2.
   All binaries will need to be rebuilt from release 2.2.

-* The Macros RTE_HASH_BUCKET_ENTRIES_MAX and RTE_HASH_KEY_LENGTH_MAX are
-  deprecated and will be removed with version 2.2.
-
-* The function rte_jhash2 is deprecated and should be removed.
-
 * The following fields have been deprecated in rte_eth_stats:
   imissed, ibadcrc, ibadlen, imcasts, fdirmatch, fdirmiss,
   tx_pause_xon, rx_pause_xon, tx_pause_xoff, rx_pause_xoff
diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst
index abe57b4..682f468 100644
--- a/doc/guides/rel_notes/release_2_2.rst
+++ b/doc/guides/rel_notes/release_2_2.rst
@@ -21,6 +21,9 @@ API Changes

 * The deprecated ACL API ipv4vlan is removed.

+* The deprecated hash function rte_jhash2() is removed.
+  It was replaced by rte_jhash_32b().
+
 * The deprecated KNI functions are removed:
   rte_kni_create(), rte_kni_get_port_id() and rte_kni_info_get().

@@ -58,7 +61,7 @@ The libraries prepended with a plus sign were incremented in this version.

[dpdk-dev] [PATCH v2 1/3] enic: use appropriate key length in hash table

2015-09-04 Thread Thomas Monjalon

From: Pablo de Lara 

RTE_HASH_KEY_LENGTH_MAX was deprecated, and the hash table
actually is hosting bigger keys than that size, so key length
has been increased to properly allocate all keys.

Signed-off-by: Pablo de Lara 
Acked-by: Sujith Sankar 
---
 drivers/net/enic/enic_clsf.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/enic/enic_clsf.c b/drivers/net/enic/enic_clsf.c
index 9c2abfb..656b25b 100644
--- a/drivers/net/enic/enic_clsf.c
+++ b/drivers/net/enic/enic_clsf.c
@@ -214,7 +214,7 @@ int enic_fdir_add_fltr(struct enic *enic, struct rte_eth_fdir_filter *params)
enic->fdir.stats.add++;
}

-   pos = rte_hash_add_key(enic->fdir.hash, (void *)key);
+   pos = rte_hash_add_key(enic->fdir.hash, params);
enic->fdir.nodes[pos] = key;
return 0;
 }
@@ -244,7 +244,7 @@ int enic_clsf_init(struct enic *enic)
struct rte_hash_parameters hash_params = {
.name = "enicpmd_clsf_hash",
.entries = ENICPMD_CLSF_HASH_ENTRIES,
-   .key_len = RTE_HASH_KEY_LENGTH_MAX,
+   .key_len = sizeof(struct rte_eth_fdir_filter),
.hash_func = DEFAULT_HASH_FUNC,
.hash_func_init_val = 0,
.socket_id = SOCKET_0,
-- 
2.5.1

[dpdk-dev] [PATCH v2 0/3] clean deprecated code in hash library

2015-09-04 Thread Thomas Monjalon

This patchset removes all deprecated macros and functions
from the hash library.
Then the DPDK version can be changed to 2.2.0-rc0.

Changes in v2:
- increment hash library version
- merge hash patches
- increment DPDK version

Pablo de Lara (2):
  enic: use appropriate key length in hash table
  hash: remove deprecated function and macros

Thomas Monjalon (1):
  version: 2.2.0-rc0

 app/test/test_hash.c|  7 ---
 app/test/test_hash_functions.c  |  4 ++--
 app/test/test_hash_perf.c   |  2 +-
 doc/guides/rel_notes/deprecation.rst|  5 -
 doc/guides/rel_notes/release_2_2.rst|  5 -
 drivers/net/enic/enic_clsf.c|  4 ++--
 lib/librte_eal/common/include/rte_version.h |  6 +++---
 lib/librte_hash/Makefile|  2 +-
 lib/librte_hash/rte_hash.h  |  6 --
 lib/librte_hash/rte_jhash.h | 15 ++-
 10 files changed, 19 insertions(+), 37 deletions(-)

-- 
2.5.1

[dpdk-dev] [PATCH 1/1] ip_frag: fix creating ipv6 fragment extension header

2015-09-04 Thread Ananyev, Konstantin

Hi Piotr,

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Piotr
> Sent: Wednesday, September 02, 2015 3:13 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 1/1] ip_frag: fix creating ipv6 fragment extension header
> 
> From: Piotr Azarewicz 
> 
> Previous implementation won't work on every environment. The order of
> allocation of bit-fields within a unit (high-order to low-order or
> low-order to high-order) is implementation-defined.
> Solution: used bytes instead of bit fields.

Seems like the right thing to do to me.
Though I think we should also replace:
union {
struct {
uint16_t frag_offset:13; /**< Offset from the start of the packet */
uint16_t reserved2:2; /**< Reserved */
uint16_t more_frags:1;
/**< 1 if more fragments left, 0 if last fragment */
};
uint16_t frag_data;
/**< union of all fragmentation data */
};

with just:
uint16_t frag_data;
and probably provide macros to read/set the frag_offset and more_frags values.
Otherwise people might keep using the wrong layout.
Konstantin

> 
> Signed-off-by: Piotr Azarewicz 
> ---
>  lib/librte_ip_frag/rte_ipv6_fragmentation.c |6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/librte_ip_frag/rte_ipv6_fragmentation.c b/lib/librte_ip_frag/rte_ipv6_fragmentation.c
> index 0e32aa8..7342421 100644
> --- a/lib/librte_ip_frag/rte_ipv6_fragmentation.c
> +++ b/lib/librte_ip_frag/rte_ipv6_fragmentation.c
> @@ -65,10 +65,8 @@ __fill_ipv6hdr_frag(struct ipv6_hdr *dst,
> 
>   fh = (struct ipv6_extension_fragment *) ++dst;
>   fh->next_header = src->proto;
> - fh->reserved1   = 0;
> - fh->frag_offset = rte_cpu_to_be_16(fofs);
> - fh->reserved2   = 0;
> - fh->more_frags  = rte_cpu_to_be_16(mf);
> + fh->reserved1 = 0;
> + fh->frag_data = rte_cpu_to_be_16((fofs & ~IPV6_HDR_FO_MASK) | mf);
>   fh->id = 0;
>  }
> 
> --
> 1.7.9.5
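
The accessors suggested in the reply above could look roughly like the sketch below. All names are hypothetical (they are not the librte_ip_frag API). The layout follows the IPv6 Fragment header: in host byte order the 13-bit fragment offset (in 8-byte units) occupies the upper bits, followed by 2 reserved bits and the M flag in the lowest bit.

```c
#include <assert.h>
#include <stdint.h>

#define FRAG_OFS_SHIFT 3       /* offset sits above 2 reserved bits + M */
#define FRAG_OFS_MAX   0x1FFFu /* 13-bit field */
#define FRAG_MF_MASK   0x0001u /* "more fragments" flag */

/* Build the 16-bit field in host byte order. */
static uint16_t frag_data_pack(uint16_t ofs_units, int more_frags)
{
	assert(ofs_units <= FRAG_OFS_MAX);
	return (uint16_t)((ofs_units << FRAG_OFS_SHIFT) |
			  (more_frags ? FRAG_MF_MASK : 0));
}

/* Fragment offset in 8-byte units. */
static uint16_t frag_data_offset(uint16_t frag_data)
{
	return (uint16_t)(frag_data >> FRAG_OFS_SHIFT);
}

/* 1 if more fragments follow, 0 for the last fragment. */
static int frag_data_more(uint16_t frag_data)
{
	return (frag_data & FRAG_MF_MASK) != 0;
}

/* Store the field in network byte order, independent of host endianness. */
static void frag_data_store(uint8_t wire[2], uint16_t frag_data)
{
	wire[0] = (uint8_t)(frag_data >> 8);
	wire[1] = (uint8_t)(frag_data & 0xFF);
}
```

Since only shifts and masks on a plain uint16_t are used, the result does not depend on how the compiler allocates bit-fields, which is the portability problem the patch addresses.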

[dpdk-dev] [RFC PATCH 01/18] librte_eal: remove type field from rte_driver structure.

2015-09-04 Thread Thomas Monjalon

2015-09-04 12:01, Bernard Iremonger:
> Signed-off-by: Bernard Iremonger 

There is no explanation in this patch.

> - if (driver->type != PMD_PDEV)
> - continue;
> - /* PDEV drivers don't get passed any parameters */
> - driver->init(NULL, NULL);
> +
> + /* PCI drivers don't get passed any parameters */
> + /*
> +  * Search a virtual driver prefix in device name.
> +  * It should not be found for PCI devices.
> +  * Use strncmp to compare.
> +  */
> +
> + if ((driver->name) &&
> + (strncmp(driver->name, "eth_", strlen("eth_")) != 0)) {
> + driver->init(NULL, NULL);
> + }

You don't need to submit a full patchset with changes in every driver
for an RFC. Having just this patch is enough to form an opinion.
Here it is a nack.
We need to have a common init path instead of the current VDEV/PDEV branches.
And instead of "pmd_type", bus information would be more meaningful.
So just replacing a type by a magic string is worse.

Please don't try to fix wrong problems and focus on your goal.
We had some discussions about possible PCI EAL refactoring but
it probably needs to be done step by step with a clear cleaning motivation
at each step. I think other people involved in EAL will have other ideas.

[dpdk-dev] [PATCH] app/testpmd: add engine for UDP echo server support

2015-09-04 Thread Thadeu Lima de Souza Cascardo

Adapt the ICMP echo code to reply to UDP echo requests on port 7. The testpmd
forward engine udpecho is used for that.

Signed-off-by: Thadeu Lima de Souza Cascardo 
---
 app/test-pmd/config.c   |  7 ++-
 app/test-pmd/icmpecho.c | 90 ++---
 app/test-pmd/testpmd.c  |  1 +
 app/test-pmd/testpmd.h  |  1 +
 doc/guides/testpmd_app_ug/run_app.rst   |  2 +-
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  6 +-
 6 files changed, 79 insertions(+), 28 deletions(-)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index cf2aa6e..0b5c4e6 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -1239,7 +1239,7 @@ dcb_fwd_config_setup(void)
 }

 static void
-icmp_echo_config_setup(void)
+echo_config_setup(void)
 {
portid_t  rxp;
queueid_t rxq;
@@ -1297,8 +1297,9 @@ void
 fwd_config_setup(void)
 {
cur_fwd_config.fwd_eng = cur_fwd_eng;
-   if (strcmp(cur_fwd_eng->fwd_mode_name, "icmpecho") == 0) {
-   icmp_echo_config_setup();
+   if (strcmp(cur_fwd_eng->fwd_mode_name, "icmpecho") == 0 ||
+   strcmp(cur_fwd_eng->fwd_mode_name, "udpecho") == 0) {
+   echo_config_setup();
return;
}
if ((nb_rxq > 1) && (nb_txq > 1)){
diff --git a/app/test-pmd/icmpecho.c b/app/test-pmd/icmpecho.c
index e510f9b..a7f882a 100644
--- a/app/test-pmd/icmpecho.c
+++ b/app/test-pmd/icmpecho.c
@@ -61,6 +61,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 

 #include "testpmd.h"
@@ -301,7 +302,7 @@ ipv4_hdr_cksum(struct ipv4_hdr *ip_h)
  * send back ICMP echo replies.
  */
 static void
-reply_to_icmp_echo_rqsts(struct fwd_stream *fs)
+reply_to_echo_rqsts(struct fwd_stream *fs, int proto)
 {
struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
struct rte_mbuf *pkt;
@@ -310,6 +311,7 @@ reply_to_icmp_echo_rqsts(struct fwd_stream *fs)
struct arp_hdr  *arp_h;
struct ipv4_hdr *ip_h;
struct icmp_hdr *icmp_h;
+   struct udp_hdr *udp_h;
struct ether_addr eth_addr;
uint32_t ip_addr;
uint16_t nb_rx;
@@ -319,6 +321,7 @@ reply_to_icmp_echo_rqsts(struct fwd_stream *fs)
uint16_t vlan_id;
uint16_t arp_op;
uint16_t arp_pro;
+   uint16_t udp_port;
uint32_t cksum;
uint8_t  i;
int l2_len;
@@ -448,24 +451,40 @@ reply_to_icmp_echo_rqsts(struct fwd_stream *fs)
   ip_proto_name(ip_h->next_proto_id));
}

-   /*
-* Check if packet is a ICMP echo request.
-*/
-   icmp_h = (struct icmp_hdr *) ((char *)ip_h +
- sizeof(struct ipv4_hdr));
-   if (! ((ip_h->next_proto_id == IPPROTO_ICMP) &&
-  (icmp_h->icmp_type == IP_ICMP_ECHO_REQUEST) &&
-  (icmp_h->icmp_code == 0))) {
-   rte_pktmbuf_free(pkt);
-   continue;
+   if (proto == IPPROTO_ICMP) {
+   /*
+* Check if packet is a ICMP echo request.
+*/
+   icmp_h = (struct icmp_hdr *) ((char *)ip_h +
+ sizeof(struct ipv4_hdr));
+   if (! ((ip_h->next_proto_id == IPPROTO_ICMP) &&
+  (icmp_h->icmp_type == IP_ICMP_ECHO_REQUEST) &&
+  (icmp_h->icmp_code == 0))) {
+   rte_pktmbuf_free(pkt);
+   continue;
+   }
+   } else if (proto == IPPROTO_UDP) {
+   udp_h = (struct udp_hdr *) ((char *)ip_h +
+ sizeof(struct ipv4_hdr));
+   if ((ip_h->next_proto_id != IPPROTO_UDP) ||
+   (rte_be_to_cpu_16(udp_h->dst_port) != 7)) {
+   rte_pktmbuf_free(pkt);
+   continue;
+   }
}

-   if (verbose_level > 0)
-   printf("  ICMP: echo request seq id=%d\n",
-  rte_be_to_cpu_16(icmp_h->icmp_seq_nb));
+   if (proto == IPPROTO_ICMP) {
+   if (verbose_level > 0)
+   printf("  ICMP: echo request seq id=%d\n",
+  rte_be_to_cpu_16(icmp_h->icmp_seq_nb));
+   } else if (proto == IPPROTO_UDP) {
+   if (verbose_level > 0)
+   printf("  UDP: echo request from port=%d\n",
+  rte_be_to_cpu_16(udp_h->src_port));
+   }

/*
-* Prepare ICMP echo reply to be sent back.
+* Prepare ICMP or UDP

[dpdk-dev] [PATCH 4/4] virtio: use any layout on transmit

2015-09-04 Thread Stephen Hemminger

Virtio supports a feature that allows the sender to prepend the transmit
header to the data.  It requires that the mbuf be writable and correctly
aligned, and that the feature has been negotiated.  If all this works out,
then it will be the optimum way to transmit a single segment packet.

Signed-off-by: Stephen Hemminger 
---
 drivers/net/virtio/virtio_ethdev.h |  3 +-
 drivers/net/virtio/virtio_rxtx.c   | 67 ++
 2 files changed, 49 insertions(+), 21 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.h b/drivers/net/virtio/virtio_ethdev.h
index 07a9265..f260fbb 100644
--- a/drivers/net/virtio/virtio_ethdev.h
+++ b/drivers/net/virtio/virtio_ethdev.h
@@ -65,7 +65,8 @@
 1u << VIRTIO_NET_F_CTRL_RX   | \
 1u << VIRTIO_NET_F_CTRL_VLAN | \
 1u << VIRTIO_NET_F_MRG_RXBUF | \
-1u << VIRTIO_RING_F_INDIRECT_DESC)
+1u << VIRTIO_RING_F_INDIRECT_DESC| \
+1u << VIRTIO_F_ANY_LAYOUT)

 /*
  * CQ function prototype
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 8979695..5ec9b29 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -200,13 +200,14 @@ virtqueue_enqueue_recv_refill(struct virtqueue *vq, struct rte_mbuf *cookie)

 static int
 virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf *cookie,
-  int use_indirect)
+  int use_indirect, int can_push)
 {
struct vq_desc_extra *dxp;
struct vring_desc *start_dp;
uint16_t seg_num = cookie->nb_segs;
-   uint16_t needed = use_indirect ? 1 : 1 + seg_num;
+   uint16_t needed = use_indirect ? 1 : !can_push + seg_num;
uint16_t head_idx, idx;
+   uint16_t head_size = txvq->hw->vtnet_hdr_size;
unsigned long offs;

if (unlikely(txvq->vq_free_cnt == 0))
@@ -236,27 +237,31 @@ virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf *cookie,
idx = 0;
}

-   offs = offsetof(struct virtio_tx_region, tx_hdr)
-   + idx * sizeof(struct virtio_tx_region);
+   if (can_push) {
+   /* put on zero'd transmit header (no offloads) */
+   void *hdr = rte_pktmbuf_prepend(cookie, head_size);

-   start_dp[idx].addr = txvq->virtio_net_hdr_mem + offs;
-   start_dp[idx].len = txvq->hw->vtnet_hdr_size;
-   start_dp[idx].flags = VRING_DESC_F_NEXT;
+   memset(hdr, 0, head_size);
+   } else {
+   offs = offsetof(struct virtio_tx_region, tx_hdr)
+   + idx * sizeof(struct virtio_tx_region);

-   for (; ((seg_num > 0) && (cookie != NULL)); seg_num--) {
+   start_dp[idx].addr = txvq->virtio_net_hdr_mem + offs;
+   start_dp[idx].len = head_size;
+   start_dp[idx].flags = VRING_DESC_F_NEXT;
idx = start_dp[idx].next;
+   }
+
+   for (; ((seg_num > 0) && (cookie != NULL)); seg_num--) {
start_dp[idx].addr  = RTE_MBUF_DATA_DMA_ADDR(cookie);
start_dp[idx].len   = cookie->data_len;
-   start_dp[idx].flags = VRING_DESC_F_NEXT;
cookie = cookie->next;
+   start_dp[idx].flags = cookie ? VRING_DESC_F_NEXT : 0;
+   idx = start_dp[idx].next;
}

-   start_dp[idx].flags &= ~VRING_DESC_F_NEXT;
-
if (use_indirect)
idx = txvq->vq_ring.desc[head_idx].next;
-   else
-   idx = start_dp[idx].next;

txvq->vq_desc_head_idx = idx;
if (txvq->vq_desc_head_idx == VQ_RING_DESC_CHAIN_END)
@@ -762,6 +767,26 @@ virtio_recv_mergeable_pkts(void *rx_queue,
return nb_rx;
 }

+/* Evaluate whether the virtio header can just be put in place in the mbuf */
+static int virtio_xmit_push_ok(const struct virtqueue *txvq,
+  const struct rte_mbuf *m)
+{
+   if (rte_mbuf_refcnt_read(m) != 1)
+   return 0;   /* no mbuf is shared */
+
+   if (rte_pktmbuf_headroom(m) < txvq->hw->vtnet_hdr_size)
+   return 0;   /* no space in headroom */
+
+   if (!rte_is_aligned(rte_pktmbuf_mtod(m, char *),
+   sizeof(struct virtio_net_hdr_mrg_rxbuf)))
+   return 0;   /* not aligned */
+
+   if (m->nb_segs > 1)
+   return 0;   /* better off using indirect */
+
+   return vtpci_with_feature(txvq->hw, VIRTIO_F_ANY_LAYOUT);
+}
+
 uint16_t
 virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 {
@@ -781,14 +806,16 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)

for (nb_tx = 0; nb_tx < nb_pkts; nb_tx++) {
struct rte_mbuf *txm = tx_pkts[nb_tx];
-   int use_indirect, slots, need;
-
-   use_indirect = vtpci_with_feature(txvq->hw,
-

[dpdk-dev] [PATCH 3/4] virtio: use indirect ring elements

2015-09-04 Thread Stephen Hemminger

The virtio ring in QEMU/KVM is usually limited to 256 entries
and the normal way that virtio driver was queuing mbufs required
nsegs + 1 ring elements. By using the indirect ring element feature
if available, each packet will take only one ring slot even for
multi-segment packets.

Signed-off-by: Stephen Hemminger 
---
 drivers/net/virtio/virtio_ethdev.c | 11 +---
 drivers/net/virtio/virtio_ethdev.h |  3 ++-
 drivers/net/virtio/virtio_rxtx.c   | 51 ++
 drivers/net/virtio/virtqueue.h |  8 ++
 4 files changed, 57 insertions(+), 16 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 465d3cd..bcfb87b 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -359,12 +359,15 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
if (queue_type == VTNET_TQ) {
/*
 * For each xmit packet, allocate a virtio_net_hdr
+* and indirect ring elements
 */
snprintf(vq_name, sizeof(vq_name), "port%d_tvq%d_hdrzone",
-   dev->data->port_id, queue_idx);
-   vq->virtio_net_hdr_mz = rte_memzone_reserve_aligned(vq_name,
-   vq_size * hw->vtnet_hdr_size,
-   socket_id, 0, RTE_CACHE_LINE_SIZE);
+dev->data->port_id, queue_idx);
+
+   vq->virtio_net_hdr_mz =
+   rte_memzone_reserve_aligned(vq_name,
+   vq_size * sizeof(struct virtio_tx_region),
+   socket_id, 0, RTE_CACHE_LINE_SIZE);
if (vq->virtio_net_hdr_mz == NULL) {
if (rte_errno == EEXIST)
vq->virtio_net_hdr_mz =
diff --git a/drivers/net/virtio/virtio_ethdev.h b/drivers/net/virtio/virtio_ethdev.h
index 9026d42..07a9265 100644
--- a/drivers/net/virtio/virtio_ethdev.h
+++ b/drivers/net/virtio/virtio_ethdev.h
@@ -64,7 +64,8 @@
 1u << VIRTIO_NET_F_CTRL_VQ   | \
 1u << VIRTIO_NET_F_CTRL_RX   | \
 1u << VIRTIO_NET_F_CTRL_VLAN | \
-1u << VIRTIO_NET_F_MRG_RXBUF)
+1u << VIRTIO_NET_F_MRG_RXBUF | \
+1u << VIRTIO_RING_F_INDIRECT_DESC)

 /*
  * CQ function prototype
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index dbe6665..8979695 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -199,14 +199,15 @@ virtqueue_enqueue_recv_refill(struct virtqueue *vq, struct rte_mbuf *cookie)
 }

 static int
-virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf *cookie)
+virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf *cookie,
+  int use_indirect)
 {
struct vq_desc_extra *dxp;
struct vring_desc *start_dp;
uint16_t seg_num = cookie->nb_segs;
-   uint16_t needed = 1 + seg_num;
+   uint16_t needed = use_indirect ? 1 : 1 + seg_num;
uint16_t head_idx, idx;
-   uint16_t head_size = txvq->hw->vtnet_hdr_size;
+   unsigned long offs;

if (unlikely(txvq->vq_free_cnt == 0))
return -ENOSPC;
@@ -220,11 +221,26 @@ virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf *cookie)
dxp = &txvq->vq_descx[idx];
dxp->cookie = (void *)cookie;
dxp->ndescs = needed;
-
start_dp = txvq->vq_ring.desc;
-   start_dp[idx].addr =
-   txvq->virtio_net_hdr_mem + idx * head_size;
-   start_dp[idx].len = (uint32_t)head_size;
+
+   if (use_indirect) {
+   offs = offsetof(struct virtio_tx_region, tx_indir)
+   + idx * sizeof(struct virtio_tx_region);
+
+   start_dp[idx].addr = txvq->virtio_net_hdr_mem + offs;
+   start_dp[idx].len = sizeof(struct vring_desc);
+   start_dp[idx].flags = VRING_DESC_F_INDIRECT;
+
+   start_dp = (struct vring_desc *)
+   ((char *)txvq->virtio_net_hdr_mz->addr + offs);
+   idx = 0;
+   }
+
+   offs = offsetof(struct virtio_tx_region, tx_hdr)
+   + idx * sizeof(struct virtio_tx_region);
+
+   start_dp[idx].addr = txvq->virtio_net_hdr_mem + offs;
+   start_dp[idx].len = txvq->hw->vtnet_hdr_size;
start_dp[idx].flags = VRING_DESC_F_NEXT;

for (; ((seg_num > 0) && (cookie != NULL)); seg_num--) {
@@ -236,7 +252,12 @@ virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf *cookie)
}

start_dp[idx].flags &= ~VRING_DESC_F_NEXT;
-   idx = start_dp[idx].next;
+
+   if (use_indirect)
+   idx = txvq->vq_ring.desc[head_idx].next;
+   else
+   idx = start_dp[idx].next;
+
txvq->vq_desc_head_idx = idx;
if (txvq->vq_desc_head_idx == VQ_RING_DESC_CHAIN_END)
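
The descriptor accounting in the hunks above can be summarised with a small sketch. This is an illustration under assumptions, not code from the patch: `tx_ring_slots_needed` is a hypothetical helper that only mirrors the `needed` computation, where an indirect transmit takes a single slot in the main ring and the header and segments live in a separate descriptor table.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: number of slots a packet
 * consumes in the main vring.
 *
 * Direct layout:   one slot for the virtio-net header plus one per segment.
 * Indirect layout: one slot flagged as indirect that points at a separate
 *                  descriptor table holding the header and the segments.
 */
static uint16_t
tx_ring_slots_needed(uint16_t nb_segs, int use_indirect)
{
	return use_indirect ? 1 : (uint16_t)(1 + nb_segs);
}
```

With three segments, the direct layout would take four ring slots and the indirect layout one, which is why indirect descriptors can help when the ring is close to full.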

[dpdk-dev] [PATCH 2/4] virtio: don't use unlikely for normal tx stuff

2015-09-04 Thread Stephen Hemminger

Don't use unlikely() for VLAN insertion or for the ring getting full.
GCC will not optimize code in unlikely paths, and since both cases can
happen in normal operation, that can hurt performance.

Signed-off-by: Stephen Hemminger 
---
 drivers/net/virtio/virtio_rxtx.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 5b50ed0..dbe6665 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -763,7 +763,7 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
int need = txm->nb_segs - txvq->vq_free_cnt + 1;

/* Positive value indicates it need free vring descriptors */
-   if (unlikely(need > 0)) {
+   if (need > 0) {
nb_used = VIRTQUEUE_NUSED(txvq);
virtio_rmb();
need = RTE_MIN(need, (int)nb_used);
@@ -778,7 +778,7 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
}

/* Do VLAN tag insertion */
-   if (unlikely(txm->ol_flags & PKT_TX_VLAN_PKT)) {
+   if (txm->ol_flags & PKT_TX_VLAN_PKT) {
error = rte_vlan_insert(&txm);
if (unlikely(error)) {
rte_pktmbuf_free(txm);
@@ -798,10 +798,9 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
break;
}
txvq->bytes += txm->pkt_len;
+   ++txvq->packets;
}

-   txvq->packets += nb_tx;
-
if (likely(nb_tx)) {
vq_update_avail_idx(txvq);

-- 
2.1.4
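
For readers unfamiliar with the branch hints discussed in this patch, here is a minimal sketch. It assumes GCC or Clang (`__builtin_expect` is a compiler builtin); the `sketch_*` macros, the flag value and both functions are hypothetical and defined locally so the example is self-contained. It leaves an ordinary condition unhinted and keeps the hint on a genuine error path.

```c
#include <stdint.h>

/* Branch-hint macros in the usual style; local names to avoid clashes. */
#define sketch_likely(x)   __builtin_expect(!!(x), 1)
#define sketch_unlikely(x) __builtin_expect(!!(x), 0)

#define SKETCH_TX_VLAN 0x1u /* hypothetical offload flag */

/* Count packets that request VLAN insertion. */
static unsigned
count_vlan_pkts(const uint32_t *ol_flags, unsigned n)
{
	unsigned i, vlan = 0;

	for (i = 0; i < n; i++) {
		/* No hint: VLAN-tagged traffic is a normal case. */
		if (ol_flags[i] & SKETCH_TX_VLAN)
			vlan++;
	}
	return vlan;
}

/* A real error path is still a reasonable place for the hint. */
static int
check_len(uint32_t len)
{
	if (sketch_unlikely(len == 0))
		return -1;
	return 0;
}
```

The hint changes only how the compiler lays out the code, never the result, so removing it from a commonly taken branch is safe from a correctness point of view.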

[dpdk-dev] [PATCH 1/4] virtio: clean up space checks on xmit

2015-09-04 Thread Stephen Hemminger

The space check for the transmit ring only needs a single conditional.
I.e. we only need to recheck for space if the first check found none.

This can help performance and simplifies the loop.

Signed-off-by: Stephen Hemminger 
---
 drivers/net/virtio/virtio_rxtx.c | 66 
 1 file changed, 27 insertions(+), 39 deletions(-)

diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index c5b53bb..5b50ed0 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -745,7 +745,6 @@ uint16_t
 virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 {
struct virtqueue *txvq = tx_queue;
-   struct rte_mbuf *txm;
uint16_t nb_used, nb_tx;
int error;

@@ -759,57 +758,46 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
if (likely(nb_used > txvq->vq_nentries - txvq->vq_free_thresh))
virtio_xmit_cleanup(txvq, nb_used);

-   nb_tx = 0;
+   for (nb_tx = 0; nb_tx < nb_pkts; nb_tx++) {
+   struct rte_mbuf *txm = tx_pkts[nb_tx];
+   int need = txm->nb_segs - txvq->vq_free_cnt + 1;

-   while (nb_tx < nb_pkts) {
-   /* Need one more descriptor for virtio header. */
-   int need = tx_pkts[nb_tx]->nb_segs - txvq->vq_free_cnt + 1;
-
-   /*Positive value indicates it need free vring descriptors */
+   /* Positive value indicates it need free vring descriptors */
if (unlikely(need > 0)) {
nb_used = VIRTQUEUE_NUSED(txvq);
virtio_rmb();
need = RTE_MIN(need, (int)nb_used);

virtio_xmit_cleanup(txvq, need);
-   need = (int)tx_pkts[nb_tx]->nb_segs -
-   txvq->vq_free_cnt + 1;
-   }
-
-   /*
-* Zero or negative value indicates it has enough free
-* descriptors to use for transmitting.
-*/
-   if (likely(need <= 0)) {
-   txm = tx_pkts[nb_tx];
-
-   /* Do VLAN tag insertion */
-   if (unlikely(txm->ol_flags & PKT_TX_VLAN_PKT)) {
-   error = rte_vlan_insert(&txm);
-   if (unlikely(error)) {
-   rte_pktmbuf_free(txm);
-   ++nb_tx;
-   continue;
-   }
+   need = txm->nb_segs - txvq->vq_free_cnt + 1;
+   if (unlikely(need > 0)) {
+   PMD_TX_LOG(ERR,
+  "No free tx descriptors to transmit");
+   break;
}
+   }

-   /* Enqueue Packet buffers */
-   error = virtqueue_enqueue_xmit(txvq, txm);
+   /* Do VLAN tag insertion */
+   if (unlikely(txm->ol_flags & PKT_TX_VLAN_PKT)) {
+   error = rte_vlan_insert(&txm);
if (unlikely(error)) {
-   if (error == ENOSPC)
-   PMD_TX_LOG(ERR, "virtqueue_enqueue Free count = 0");
-   else if (error == EMSGSIZE)
-   PMD_TX_LOG(ERR, "virtqueue_enqueue Free count < 1");
-   else
-   PMD_TX_LOG(ERR, "virtqueue_enqueue error: %d", error);
-   break;
+   rte_pktmbuf_free(txm);
+   continue;
}
-   nb_tx++;
-   txvq->bytes += txm->pkt_len;
-   } else {
-   PMD_TX_LOG(ERR, "No free tx descriptors to transmit");
+   }
+
+   /* Enqueue Packet buffers */
+   error = virtqueue_enqueue_xmit(txvq, txm);
+   if (unlikely(error)) {
+   if (error == ENOSPC)
+   PMD_TX_LOG(ERR, "virtqueue_enqueue Free count = 0");
+   else if (error == EMSGSIZE)
+   PMD_TX_LOG(ERR, "virtqueue_enqueue Free count < 1");
+   else
+   PMD_TX_LOG(ERR, "virtqueue_enqueue error: %d", error);
break;
}
+   txvq->bytes += txm->pkt_len;
}

txvq->packets += nb_tx;
-- 
2.1.4
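
The loop shape this patch arrives at can be modelled in isolation. The sketch below is a simplified stand-in, not the driver code: `struct sketch_txq`, `sketch_cleanup` and `sketch_xmit` are hypothetical, and the queue is reduced to two counters. It shows that space is re-checked only on the slow path, after a cleanup attempt.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for a tx virtqueue. */
struct sketch_txq {
	uint16_t free_cnt; /* descriptors currently free */
	uint16_t used_cnt; /* descriptors the device has finished with */
};

/* Reclaim up to 'n' completed descriptors. */
static void
sketch_cleanup(struct sketch_txq *q, uint16_t n)
{
	if (n > q->used_cnt)
		n = q->used_cnt;
	q->used_cnt = (uint16_t)(q->used_cnt - n);
	q->free_cnt = (uint16_t)(q->free_cnt + n);
}

/* Returns how many packets were accepted; segs[i] is a segment count. */
static uint16_t
sketch_xmit(struct sketch_txq *q, const uint16_t *segs, uint16_t nb_pkts)
{
	uint16_t nb_tx;

	for (nb_tx = 0; nb_tx < nb_pkts; nb_tx++) {
		/* One extra descriptor for the virtio header. */
		int need = segs[nb_tx] - q->free_cnt + 1;

		if (need > 0) {
			/* Only this slow path re-checks for space. */
			sketch_cleanup(q, (uint16_t)need);
			need = segs[nb_tx] - q->free_cnt + 1;
			if (need > 0)
				break; /* ring is still full */
		}
		q->free_cnt = (uint16_t)(q->free_cnt - segs[nb_tx] - 1);
	}
	return nb_tx;
}
```

When the first check already finds room, the packet goes straight to the enqueue step without a second comparison.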

[dpdk-dev] [PATCH 0/4] RFC virtio performance enhancement and cleanups

2015-09-04 Thread Stephen Hemminger

These are compile-tested only; I haven't debugged them or checked the corner
cases. Submitted for discussion and future planning.

Stephen Hemminger (4):
  virtio: clean up space checks on xmit
  virtio: don't use unlikely for normal tx stuff
  virtio: use indirect ring elements
  virtio: use any layout on transmit

 drivers/net/virtio/virtio_ethdev.c |  11 ++-
 drivers/net/virtio/virtio_ethdev.h |   4 +-
 drivers/net/virtio/virtio_rxtx.c   | 151 -
 drivers/net/virtio/virtqueue.h |   8 ++
 4 files changed, 115 insertions(+), 59 deletions(-)

-- 
2.1.4

[dpdk-dev] [RFC PATCH 00/18] refactor eal driver registration code

2015-09-04 Thread Bruce Richardson

On Fri, Sep 04, 2015 at 01:46:11PM +0100, Iremonger, Bernard wrote:
> Hi Bruce,
> 
> > Subject: Re: [dpdk-dev] [RFC PATCH 00/18] refactor eal driver registration
> > code
> > 
> > On Fri, Sep 04, 2015 at 12:01:36PM +0100, Bernard Iremonger wrote:
> > > At present the eal driver registration code is more complicated than
> > > it needs to be.
> > >
> > > This RFC proposes to simplify the eal driver registration code.
> > >
> > > Remove the type field from the eal driver structure.
> > > Refactor the eal driver registration code to use the name field in the
> > > eal driver structure instead of the type field.
> > >
> > > Modify all PMD's to use the modified eal driver structure.
> > > Initialise the name field in the eal driver structure in some PMD's
> > > where it is not initialised at present.
> > >
> > >
> > Hi,
> > 
> > > I don't think I like this approach very much. It seems very brittle to
> > > remove the explicit type field and start relying on the drivers putting a
> > > prefix in the name instead, i.e. implicit typing.
> > > 
> > > What is the major concern with marking drivers as virtual or physical? My
> > > thinking is that we should keep the type field, just perhaps change PDEV to
> > > be more descriptive in identifying the type of physical device, e.g.
> > > DEV_PCI.
> > 
> > Regards,
> > /Bruce
> 
> The eth_ prefix is already required for vdevs, for example:
> testpmd -c f -n 4 --vdev='eth_pcap0,iface=eth0'
> testpmd -c f -n 4 --vdev=eth_ring0
> 
> The eth_ prefix should not be used for pdevs.
> 
> Keeping both the type field and the name field duplicates information.
> 
> Regards,
> 
> Bernard.

Hi Bernard,

It's duplicating information until such a time as we decide to relax the
restriction on having vdevs starting with "eth", or we want to have a driver
for a physical NIC starting with "eth". :-)
Overall, I'm not seeing the need for this particular patchset right now. I see
your previous patchset (removing the need for a pci_dev structure on vdevs) as
the more important change for cleaning up our code.

Regards,
/Bruce
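
The brittleness concern raised in this thread can be shown with a short sketch. Everything here is hypothetical: `struct sketch_driver` is a trimmed-down record, not the real `rte_driver`, and `type_from_name` is only an assumed form of the name-prefix rule under discussion.

```c
#include <string.h>

enum sketch_dev_type { SKETCH_PDEV, SKETCH_VDEV };

/* Hypothetical, trimmed-down driver record; not the real rte_driver. */
struct sketch_driver {
	enum sketch_dev_type type; /* explicit typing */
	const char *name;
};

/* Implicit typing: any name starting with "eth_" is treated as virtual. */
static enum sketch_dev_type
type_from_name(const char *name)
{
	if (name != NULL && strncmp(name, "eth_", 4) == 0)
		return SKETCH_VDEV;
	return SKETCH_PDEV;
}
```

A physical driver whose name happened to start with "eth_" would be classified as virtual by `type_from_name`, whereas an explicit `type` field does not depend on naming discipline.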

[dpdk-dev] PMD/l3fwd issue

2015-09-04 Thread Ananyev, Konstantin



> -Original Message-
> From: Harish Patil [mailto:harish.patil at qlogic.com]
> Sent: Friday, September 04, 2015 2:08 PM
> To: Ananyev, Konstantin; dev at dpdk.org
> Cc: Ameen Rahman
> Subject: Re: PMD/l3fwd issue
> 
> Hi Konstantin,
> 
> >Hi Patil,
> >
> >> -Original Message-
> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Harish Patil
> >> Sent: Thursday, September 03, 2015 4:53 PM
> >> To: dev at dpdk.org
> >> Subject: [dpdk-dev] PMD/l3fwd issue
> >>
> >> Hello,
> >> I have a question regarding the l3fwd application. The l3fwd application
> >> expects the poll mode driver to return packets whose L2 header is 16-byte
> >> aligned.
> >
> >Yep, and as I remember, by default the PMD returns mbufs to the upper layer
> >with data offsets aligned to the cache line size (64B),
> >unless you change the RTE_PKTMBUF_HEADROOM config parameter.
> >
> >> Otherwise, it results in a crash. This is due to the use of the
> >> _mm_load_si128() and _mm_store_si128() intrinsics, which expect the address
> >> to be 16-byte aligned. However, most real protocol stacks expect packets
> >> whose IP header (not the L2 header) is aligned on a 16-byte boundary. This
> >> applies not just to IP but to any L3 protocol. That's why we usually see
> >> skb_reserve(skb, NET_IP_ALIGN) calls in Linux drivers.
> >
> >Well, l3fwd is just an example application that demonstrates usage of the
> >DPDK API and the maximum performance it can achieve for that type of workload.
> >No one forces you to use aligned load/store in your own application.
> 
> Yes, I agree if it's our private application. But l3fwd is widely used
> as a benchmarking/testing tool, and its users may run into this issue.
> 

If someone tries to run it with an RTE_PKTMBUF_HEADROOM that is not aligned
to 16B, then probably yes.

> >
> >>
> >> So I'm looking for suggestions here: should the l3fwd application or the
> >> poll mode driver be changed to fix that? What is the right thing to do?
> >> Can a check be added in l3fwd to use the _mm_loadu_si128/_mm_storeu_si128
> >> intrinsics instead of _mm_load_si128/_mm_store_si128 if the address is
> >> found not to be 16B aligned?
> >
> >I'd personally just change l3fwd to use
> >_mm_loadu_si128/_mm_storeu_si128 unconditionally.
> >As the address is 16B aligned by default anyway, I think that using MOVDQU
> >instead of MOVDQA here shouldn't make that big a difference.
> >But of course testing needs to be done to make sure there is no
> >performance drop with that change.
> 
> I too would just change the l3fwd application so that all poll mode drivers
> would just work. Are you proposing that we upstream the l3fwd change if we
> don't see a performance drop?

Yep, I'd suggest verifying that there is no performance difference and then
submitting a patch.
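
The change discussed in this thread amounts to swapping the aligned SSE2 intrinsics for their unaligned counterparts. The sketch below is not the l3fwd code; `copy16_unaligned` and `is_aligned16` are hypothetical helpers, and an x86 target with SSE2 is assumed.

```c
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stdint.h>

/* Copy 16 bytes between addresses of arbitrary alignment. */
static void
copy16_unaligned(uint8_t *dst, const uint8_t *src)
{
	/* The loadu/storeu variants accept any address. */
	__m128i v = _mm_loadu_si128((const __m128i *)src);

	_mm_storeu_si128((__m128i *)dst, v);
}

/* Nonzero if 'p' is on a 16-byte boundary, as the aligned variants need. */
static int
is_aligned16(const void *p)
{
	return ((uintptr_t)p & 15u) == 0;
}
```

Using the unaligned variants unconditionally avoids a per-packet branch on `is_aligned16`; whether that costs anything on a given CPU is what the testing suggested above would have to show.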

[dpdk-dev] PMD/l3fwd issue

2015-09-04 Thread Harish Patil

Hi Konstantin,

>Hi Patil,
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Harish Patil
>> Sent: Thursday, September 03, 2015 4:53 PM
>> To: dev at dpdk.org
>> Subject: [dpdk-dev] PMD/l3fwd issue
>>
>> Hello,
>> I have a question regarding the l3fwd application. The l3fwd application
>> expects the poll mode driver to return packets whose L2 header is 16-byte
>> aligned.
>
>Yep, and as I remember, by default the PMD returns mbufs to the upper layer
>with data offsets aligned to the cache line size (64B),
>unless you change the RTE_PKTMBUF_HEADROOM config parameter.
>
>> Otherwise, it results in a crash. This is due to the use of the
>> _mm_load_si128() and _mm_store_si128() intrinsics, which expect the address
>> to be 16-byte aligned. However, most real protocol stacks expect packets
>> whose IP header (not the L2 header) is aligned on a 16-byte boundary. This
>> applies not just to IP but to any L3 protocol. That's why we usually see
>> skb_reserve(skb, NET_IP_ALIGN) calls in Linux drivers.
>
>Well, l3fwd is just an example application that demonstrates usage of the
>DPDK API and the maximum performance it can achieve for that type of workload.
>No one forces you to use aligned load/store in your own application.

Yes, I agree if it's our private application. But l3fwd is widely used
as a benchmarking/testing tool, and its users may run into this issue.

>
>>
>> So I'm looking for suggestions here: should the l3fwd application or the
>> poll mode driver be changed to fix that? What is the right thing to do?
>> Can a check be added in l3fwd to use the _mm_loadu_si128/_mm_storeu_si128
>> intrinsics instead of _mm_load_si128/_mm_store_si128 if the address is
>> found not to be 16B aligned?
>
>I'd personally just change l3fwd to use
>_mm_loadu_si128/_mm_storeu_si128 unconditionally.
>As the address is 16B aligned by default anyway, I think that using MOVDQU
>instead of MOVDQA here shouldn't make that big a difference.
>But of course testing needs to be done to make sure there is no
>performance drop with that change.

I too would just change the l3fwd application so that all poll mode drivers
would just work. Are you proposing that we upstream the l3fwd change if we
don't see a performance drop?

>Konstantin
>
>>
>> Thanks,
>> Harish
>>
>>
>>
>> 
>>
>> This message and any attached documents contain information from the
>>sending company or its parent company(s), subsidiaries,
>> divisions or branch offices that may be confidential. If you are not
>>the intended recipient, you may not read, copy, distribute, or use
>> this information. If you have received this transmission in error,
>>please notify the sender immediately by reply e-mail and then delete
>> this message.
>






[dpdk-dev] [RFC PATCH 00/18] refactor eal driver registration code

2015-09-04 Thread Iremonger, Bernard

Hi Bruce,

> Subject: Re: [dpdk-dev] [RFC PATCH 00/18] refactor eal driver registration
> code
> 
> On Fri, Sep 04, 2015 at 12:01:36PM +0100, Bernard Iremonger wrote:
> > At present the eal driver registration code is more complicated than
> > it needs to be.
> >
> > This RFC proposes to simplify the eal driver registration code.
> >
> > Remove the type field from the eal driver structure.
> > Refactor the eal driver registration code to use the name field in the
> > eal driver structure instead of the type field.
> >
> > Modify all PMD's to use the modified eal driver structure.
> > Initialise the name field in the eal driver structure in some PMD's
> > where it is not initialised at present.
> >
> >
> Hi,
> 
> I don't think I like this approach very much. It seems very brittle to remove
> the explicit type field and start relying on the drivers putting a prefix in
> the name instead, i.e. implicit typing.
> 
> What is the major concern with marking drivers as virtual or physical? My
> thinking is that we should keep the type field, just perhaps change PDEV to
> be more descriptive in identifying the type of physical device, e.g. DEV_PCI.
> 
> Regards,
> /Bruce

The eth_ prefix is already required for vdevs, for example:
testpmd -c f -n 4 --vdev='eth_pcap0,iface=eth0'
testpmd -c f -n 4 --vdev=eth_ring0

The eth_ prefix should not be used for pdevs.

Keeping both the type field and the name field duplicates information.

Regards,

Bernard.

[dpdk-dev] PMD/l3fwd issue

2015-09-04 Thread Ananyev, Konstantin

Hi Patil,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Harish Patil
> Sent: Thursday, September 03, 2015 4:53 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] PMD/l3fwd issue
> 
> Hello,
> I have a question regarding the l3fwd application. The l3fwd application
> expects the poll mode driver to return packets whose L2 header is 16-byte
> aligned.

Yep, and as I remember, by default the PMD returns mbufs to the upper layer
with data offsets aligned to the cache line size (64B),
unless you change the RTE_PKTMBUF_HEADROOM config parameter.

> Otherwise, it results in a crash. This is due to the use of the
> _mm_load_si128() and _mm_store_si128() intrinsics, which expect the address
> to be 16-byte aligned. However, most real protocol stacks expect packets
> whose IP header (not the L2 header) is aligned on a 16-byte boundary. This
> applies not just to IP but to any L3 protocol. That's why we usually see
> skb_reserve(skb, NET_IP_ALIGN) calls in Linux drivers.

Well, l3fwd is just an example application that demonstrates usage of the
DPDK API and the maximum performance it can achieve for that type of workload.
No one forces you to use aligned load/store in your own application.

> 
> So I'm looking for suggestions here: should the l3fwd application or the
> poll mode driver be changed to fix that? What is the right thing to do?
> Can a check be added in l3fwd to use the _mm_loadu_si128/_mm_storeu_si128
> intrinsics instead of _mm_load_si128/_mm_store_si128 if the address is
> found not to be 16B aligned?

I'd personally just change l3fwd to use _mm_loadu_si128/_mm_storeu_si128
unconditionally.
As the address is 16B aligned by default anyway, I think that using MOVDQU
instead of MOVDQA here shouldn't make that big a difference.
But of course testing needs to be done to make sure there is no performance
drop with that change.
Konstantin

> 
> Thanks,
> Harish
> 

[dpdk-dev] testpmd - configuration of the fdir filter

2015-09-04 Thread Jan Fruhbauer

Hi,

I want to use the fdir filtering on a NIC based on the Intel 82599. I 
have tested the testpmd application. I configured masks and added a 
filter but the fdir filter never matched any packet. I even tried 
different masks and filters (with/without ports, TCP/UDP flow, IP 
prefixes, ...), but it never worked. Here is an example of commands I 
used during testing:

./testpmd -c 0xff -n 2 -- -i --rxq=2 --txq=2 --pkt-filter-mode=perfect --portmask=0x3 --nb-ports=2 --disable-rss
testpmd> port stop 0
testpmd> flow_director_mask 0 vlan 0x src_mask 255.255.255.255 
::::::: 0x dst_mask 255.255.255.255 
::::::: 0x
testpmd> flow_director_flex_mask 0 flow all 
(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)
testpmd> port start 0
testpmd> flow_director_filter 0 add flow ipv4-tcp src 1.0.0.1 1 dst 
2.0.0.1 1 vlan 0x0 flexbytes (0x00,0x00) fwd queue 1 fd_id 1
testpmd> start

Then I sent generated traffic (simple packets with Ethernet/IP/TCP
headers) with the parameters I specified in the flow_director_filter to
NIC port 0, and all packets arrived at queue 0.

Could you please advise me on what I am doing wrong? Is there some other
configuration that I have missed?

Regards,
Jan

[dpdk-dev] ixgbe: account more Rx errors Issue

2015-09-04 Thread Tahhan, Maryam

> From: Andriy Berestovskyy [mailto:aber at semihalf.com]
> Sent: Friday, September 4, 2015 10:38 AM
> To: Tahhan, Maryam; dev at dpdk.org
> Subject: ixgbe: account more Rx errors Issue
> 
> Hi,
> Updating to DPDK 2.1 I noticed an issue with the ixgbe stats.
> 
> In commit f6bf669b9900 "ixgbe: account more Rx errors" we now add the XEC
> hardware counter (l3_l4_xsum_error) to ierrors. The issue is that UDP
> packets with a zero checksum are counted in XEC and now in ierrors too.
> 
> I've tried to disable hw_ip_checksum in rxmode, but it didn't help.
> 
> I'm not sure we should add XEC to ierrors, because packets counted in XEC
> are not dropped by the NIC actually. So in my case ierrors counter is now
> greater than actual number of packets received by the NIC, which makes no
> sense.
> 
> What's your opinion?

Hi Andriy,
Thanks for flagging this. I'm aware of this phenomenon; unfortunately, it means
we are hitting 2 hw registers on the NIC.

XEC counts the number of received IPv4, TCP, UDP or SCTP checksum (XSUM) errors.

The general CRC error counter counts the number of received packets with CRC
errors. In order for a packet to be counted in this register, it must be 64
bytes or greater (from  through , inclusively) in
length. This register counts all packets received, regardless of L2 filtering
and receive enablement.

So our options are:
1. Add only one of these to the error stats.
2. Introduce some cooking of the stats in this scenario, so that we add only
one of the two if they are equal or one is higher than the other.
3. Add them all, which means you can have more errors than the number of
received packets, but TBH this is going to be the case anyway if your packets
have multiple errors.

I'm happy to go with any of 1, 2 or 3, but would like some more feedback from
the community on this front.

Regards
Maryam
> Regards,
> Andriy
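
To make the trade-off between the options concrete, here is a small sketch of options 1 and 3 only. Option 2 is left out because the thread does not pin down its exact rule. The struct and both functions are hypothetical and do not reflect the ixgbe driver's stats code.

```c
#include <stdint.h>

/* Hypothetical snapshot of the two hardware counters discussed above. */
struct sketch_rx_counters {
	uint64_t crc_errors;  /* packets received with CRC errors */
	uint64_t xsum_errors; /* XEC: L3/L4 checksum errors */
};

/* Option 1: report only one counter (here, CRC errors). */
static uint64_t
ierrors_crc_only(const struct sketch_rx_counters *c)
{
	return c->crc_errors;
}

/* Option 3: add everything; the sum can exceed the received packet count. */
static uint64_t
ierrors_sum_all(const struct sketch_rx_counters *c)
{
	return c->crc_errors + c->xsum_errors;
}
```

With, say, 2 CRC errors and 5 checksum errors, option 1 would report 2 and option 3 would report 7, even if some of those packets were delivered to the application.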

[dpdk-dev] [PATCH v1] change hugepage sorting to avoid overlapping memcpy

2015-09-04 Thread Ralf Hoffmann

With only one hugepage, or with hugepage addresses that are already sorted,
the sort function called memcpy with the same src and dst pointer. Debugging
with valgrind then issues a warning about overlapping areas. This patch changes
the sort to avoid this behavior. Also, the function can no longer fail.

Signed-off-by: Ralf Hoffmann 
---
 lib/librte_eal/linuxapp/eal/eal_memory.c | 27 +--
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index ac2745e..6d01f61 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -699,25 +699,25 @@ error:
  * higher address first on powerpc). We use a slow algorithm, but we won't
  * have millions of pages, and this is only done at init time.
  */
-static int
+static void
 sort_by_physaddr(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
 {
unsigned i, j;
-   int compare_idx;
+   unsigned compare_idx;
uint64_t compare_addr;
struct hugepage_file tmp;

for (i = 0; i < hpi->num_pages[0]; i++) {
-   compare_addr = 0;
-   compare_idx = -1;
+   compare_addr = hugepg_tbl[i].physaddr;
+   compare_idx = i;

/*
-* browse all entries starting at 'i', and find the
+* browse all entries starting at 'i+1', and find the
 * entry with the smallest addr
 */
-   for (j=i; j< hpi->num_pages[0]; j++) {
+   for (j=i + 1; j < hpi->num_pages[0]; j++) {

-   if (compare_addr == 0 ||
+   if (
 #ifdef RTE_ARCH_PPC_64
hugepg_tbl[j].physaddr > compare_addr) {
 #else
@@ -728,10 +728,9 @@ sort_by_physaddr(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
}
}

-   /* should not happen */
-   if (compare_idx == -1) {
-   RTE_LOG(ERR, EAL, "%s(): error in physaddr sorting\n", __func__);
-   return -1;
+   if (compare_idx == i) {
+   /* no smaller page found */
+   continue;
}

/* swap the 2 entries in the table */
@@ -741,7 +740,8 @@ sort_by_physaddr(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
sizeof(struct hugepage_file));
memcpy(&hugepg_tbl[i], &tmp, sizeof(struct hugepage_file));
}
-   return 0;
+
+   return;
 }

 /*
@@ -1164,8 +1164,7 @@ rte_eal_hugepage_init(void)
goto fail;
}

-   if (sort_by_physaddr(_hp[hp_offset], hpi) < 0)
-   goto fail;
+   sort_by_physaddr(_hp[hp_offset], hpi);

 #ifdef RTE_EAL_SINGLE_FILE_SEGMENTS
/* remap all hugepages into single file segments */
-- 
2.1.4
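
The idea behind this patch can be sketched outside the EAL. The code below is an illustration under assumptions: `struct sketch_page` and `sort_pages_by_physaddr` are hypothetical, only ascending order is shown (the PPC64 descending case is omitted), and the loop is a selection sort that skips the swap when an entry is already in place, so memcpy is never called with identical source and destination.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct hugepage_file. */
struct sketch_page {
	uint64_t physaddr;
	int file_id;
};

/* Selection sort by ascending physical address. */
static void
sort_pages_by_physaddr(struct sketch_page *tbl, unsigned n)
{
	unsigned i, j, min_idx;
	struct sketch_page tmp;

	for (i = 0; i < n; i++) {
		min_idx = i;

		/* Look only at entries after 'i' for a smaller address. */
		for (j = i + 1; j < n; j++) {
			if (tbl[j].physaddr < tbl[min_idx].physaddr)
				min_idx = j;
		}

		if (min_idx == i)
			continue; /* already in place: no copy onto itself */

		/* Swap the two entries in the table. */
		memcpy(&tmp, &tbl[min_idx], sizeof(tmp));
		memcpy(&tbl[min_idx], &tbl[i], sizeof(tmp));
		memcpy(&tbl[i], &tmp, sizeof(tmp));
	}
}
```

Because the inner search starts from a valid candidate (`i` itself), there is no "nothing found" state left, which is why the function no longer needs an error return.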

[dpdk-dev] [RFC PATCH 18/18] xenvirt: remove type field from rte_driver structure

2015-09-04 Thread Bernard Iremonger

Signed-off-by: Bernard Iremonger 
---
 drivers/net/xenvirt/rte_eth_xenvirt.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/xenvirt/rte_eth_xenvirt.c b/drivers/net/xenvirt/rte_eth_xenvirt.c
index 73e8bce..4ce1730 100644
--- a/drivers/net/xenvirt/rte_eth_xenvirt.c
+++ b/drivers/net/xenvirt/rte_eth_xenvirt.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -706,8 +706,7 @@ rte_pmd_xenvirt_devinit(const char *name, const char *params)
 }

 static struct rte_driver pmd_xenvirt_drv = {
-   .name = "eth_xenvirt",
-   .type = PMD_VDEV,
+   .name = "eth_xenvirt",  /* Virtual device */
.init = rte_pmd_xenvirt_devinit,
 };

-- 
1.9.1

[dpdk-dev] [RFC PATCH 17/18] vmxnet3: remove type field and initialise name field in rte_driver structure

2015-09-04 Thread Bernard Iremonger

Signed-off-by: Bernard Iremonger 
---
 drivers/net/vmxnet3/vmxnet3_ethdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.c b/drivers/net/vmxnet3/vmxnet3_ethdev.c
index a70be5c..04fff43 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethdev.c
+++ b/drivers/net/vmxnet3/vmxnet3_ethdev.c
@@ -884,7 +884,7 @@ vmxnet3_process_events(struct vmxnet3_hw *hw)
 #endif

 static struct rte_driver rte_vmxnet3_driver = {
-   .type = PMD_PDEV,
+   .name = "rte_vmxnet3_driver",   /* PCI device */
.init = rte_vmxnet3_pmd_init,
 };

-- 
1.9.1

[dpdk-dev] [RFC PATCH 15/18] ring: remove type field from rte_driver structure

2015-09-04 Thread Bernard Iremonger

Signed-off-by: Bernard Iremonger 
---
 drivers/net/ring/rte_eth_ring.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ring/rte_eth_ring.c b/drivers/net/ring/rte_eth_ring.c
index 6fd3d0a..cbb3dc7 100644
--- a/drivers/net/ring/rte_eth_ring.c
+++ b/drivers/net/ring/rte_eth_ring.c
@@ -624,8 +624,7 @@ rte_pmd_ring_devuninit(const char *name)
 }

 static struct rte_driver pmd_ring_drv = {
-   .name = "eth_ring",
-   .type = PMD_VDEV,
+   .name = "eth_ring", /* Virtual device */
.init = rte_pmd_ring_devinit,
.uninit = rte_pmd_ring_devuninit,
 };
-- 
1.9.1

[dpdk-dev] [RFC PATCH 14/18] pcap: remove type field from rte_driver structure

2015-09-04 Thread Bernard Iremonger

Signed-off-by: Bernard Iremonger 
---
 drivers/net/pcap/rte_eth_pcap.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index f2e4634..fd38894 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -1104,8 +1104,7 @@ rte_pmd_pcap_devuninit(const char *name)
 }

 static struct rte_driver pmd_pcap_drv = {
-   .name = "eth_pcap",
-   .type = PMD_VDEV,
+   .name = "eth_pcap", /* Virtual device */
.init = rte_pmd_pcap_devinit,
.uninit = rte_pmd_pcap_devuninit,
 };
-- 
1.9.1

[dpdk-dev] [RFC PATCH 13/18] null: remove type field from rte_driver structure

2015-09-04 Thread Bernard Iremonger

Signed-off-by: Bernard Iremonger 
---
 drivers/net/null/rte_eth_null.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/null/rte_eth_null.c b/drivers/net/null/rte_eth_null.c
index e244595..5f9871c 100644
--- a/drivers/net/null/rte_eth_null.c
+++ b/drivers/net/null/rte_eth_null.c
@@ -577,8 +577,7 @@ rte_pmd_null_devuninit(const char *name)
 }

 static struct rte_driver pmd_null_drv = {
-   .name = "eth_null",
-   .type = PMD_VDEV,
+   .name = "eth_null", /* Virtual device */
.init = rte_pmd_null_devinit,
.uninit = rte_pmd_null_devuninit,
 };
-- 
1.9.1

[dpdk-dev] [RFC PATCH 12/18] mpipe: remove type field and update name in rte_driver structure

2015-09-04 Thread Bernard Iremonger

Signed-off-by: Bernard Iremonger 
---
 drivers/net/mpipe/mpipe_tilegx.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/mpipe/mpipe_tilegx.c b/drivers/net/mpipe/mpipe_tilegx.c
index 743feef..9454d4e 100644
--- a/drivers/net/mpipe/mpipe_tilegx.c
+++ b/drivers/net/mpipe/mpipe_tilegx.c
@@ -2,6 +2,7 @@
  *   BSD LICENSE
  *
  *   Copyright(c) 2015 EZchip Semiconductor Ltd. All rights reserved.
+ *   Copyright(c) 2015 Intel Corporation. All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
  *   modification, are permitted provided that the following conditions
@@ -1602,14 +1603,12 @@ rte_pmd_mpipe_devinit(const char *ifname,
 }

 static struct rte_driver pmd_mpipe_xgbe_drv = {
-   .name = "xgbe",
-   .type = PMD_VDEV,
+   .name = "eth_xgbe", /* Virtual device */
.init = rte_pmd_mpipe_devinit,
 };

 static struct rte_driver pmd_mpipe_gbe_drv = {
-   .name = "gbe",
-   .type = PMD_VDEV,
+   .name = "eth_gbe",  /* Virtual device */
.init = rte_pmd_mpipe_devinit,
 };

-- 
1.9.1

[dpdk-dev] [RFC PATCH 11/18] mlx4: remove type field from rte_driver structure

2015-09-04 Thread Bernard Iremonger

Signed-off-by: Bernard Iremonger 
---
 drivers/net/mlx4/mlx4.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index fa3cb7e..532307d 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -3,6 +3,7 @@
  *
  *   Copyright 2012-2015 6WIND S.A.
  *   Copyright 2012 Mellanox.
+ *   Copyright 2015 Intel.
  *
  *   Redistribution and use in source and binary forms, with or without
  *   modification, are permitted provided that the following conditions
@@ -5107,8 +5108,7 @@ rte_mlx4_pmd_init(const char *name, const char *args)
 }

 static struct rte_driver rte_mlx4_driver = {
-   .type = PMD_PDEV,
-   .name = MLX4_DRIVER_NAME,
+   .name = MLX4_DRIVER_NAME,   /* PCI device */
.init = rte_mlx4_pmd_init,
 };

-- 
1.9.1

[dpdk-dev] [RFC PATCH 10/18] ixgbe: remove type field and initialise name field in rte_driver structure

2015-09-04 Thread Bernard Iremonger

Signed-off-by: Bernard Iremonger 
---
 drivers/net/ixgbe/ixgbe_ethdev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index b8ee1e9..d59d4b5 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -5514,12 +5514,12 @@ ixgbe_set_eeprom(struct rte_eth_dev *dev,
 }

 static struct rte_driver rte_ixgbe_driver = {
-   .type = PMD_PDEV,
+   .name = "rte_ixgbe_driver", /* PCI device */
.init = rte_ixgbe_pmd_init,
 };

 static struct rte_driver rte_ixgbevf_driver = {
-   .type = PMD_PDEV,
+   .name = "rte_ixgbevf_driver",   /* PCI device */
.init = rte_ixgbevf_pmd_init,
 };

-- 
1.9.1

[dpdk-dev] [RFC PATCH 09/18] i40e: remove type field and initialise name field in rte_driver structures

2015-09-04 Thread Bernard Iremonger

Signed-off-by: Bernard Iremonger 
---
 drivers/net/i40e/i40e_ethdev.c| 2 +-
 drivers/net/i40e/i40e_ethdev_vf.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 40b0526..2d0551c 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -347,7 +347,7 @@ rte_i40e_pmd_init(const char *name __rte_unused,
 }

 static struct rte_driver rte_i40e_driver = {
-   .type = PMD_PDEV,
+   .name = "rte_i40e_driver",  /* PCI device */
.init = rte_i40e_pmd_init,
 };

diff --git a/drivers/net/i40e/i40e_ethdev_vf.c b/drivers/net/i40e/i40e_ethdev_vf.c
index b694400..fe44966 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -1268,7 +1268,7 @@ rte_i40evf_pmd_init(const char *name __rte_unused,
 }

 static struct rte_driver rte_i40evf_driver = {
-   .type = PMD_PDEV,
+   .name = "rte_i40evf_driver",/* PCI device */
.init = rte_i40evf_pmd_init,
 };

-- 
1.9.1

[dpdk-dev] [RFC PATCH 08/18] fm10k: remove type field and initialise name field in rte_driver structure

2015-09-04 Thread Bernard Iremonger

Signed-off-by: Bernard Iremonger 
---
 drivers/net/fm10k/fm10k_ethdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index a69c990..bda5a81 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -2312,7 +2312,7 @@ rte_pmd_fm10k_init(__rte_unused const char *name,
 }

 static struct rte_driver rte_fm10k_driver = {
-   .type = PMD_PDEV,
+   .name = "rte_fm10k_driver", /* PCI device */
.init = rte_pmd_fm10k_init,
 };

-- 
1.9.1

[dpdk-dev] [RFC PATCH 07/18] enic: remove type field and initialise name field in rte_driver structure

2015-09-04 Thread Bernard Iremonger

Signed-off-by: Bernard Iremonger 
---
 drivers/net/enic/enic_ethdev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c
index 8280cea..af2c57e 100644
--- a/drivers/net/enic/enic_ethdev.c
+++ b/drivers/net/enic/enic_ethdev.c
@@ -3,6 +3,7 @@
  * Copyright 2007 Nuova Systems, Inc.  All rights reserved.
  *
  * Copyright (c) 2014, Cisco Systems, Inc.
+ * Copyright(c) 2015 Intel Corporation.
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
@@ -633,7 +634,7 @@ rte_enic_pmd_init(const char *name __rte_unused,
 }

 static struct rte_driver rte_enic_driver = {
-   .type = PMD_PDEV,
+   .name = "rte_enic_driver",  /* PCI device */
.init = rte_enic_pmd_init,
 };

-- 
1.9.1

[dpdk-dev] [RFC PATCH 04/18] bonding: remove type field from rte_driver structure

2015-09-04 Thread Bernard Iremonger

Signed-off-by: Bernard Iremonger 
---
 drivers/net/bonding/rte_eth_bond_pmd.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c 
b/drivers/net/bonding/rte_eth_bond_pmd.c
index 5cc6372..0e222b2 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -2302,8 +2302,7 @@ bond_ethdev_configure(struct rte_eth_dev *dev)
 }

 static struct rte_driver bond_drv = {
-   .name = "eth_bond",
-   .type = PMD_VDEV,
+   .name = "eth_bond", /* Virtual device */
.init = bond_init,
.uninit = bond_uninit,
 };
-- 
1.9.1

[dpdk-dev] [RFC PATCH 03/18] bnx2x: remove type field and initialise name field in rte_driver structure

2015-09-04 Thread Bernard Iremonger

Signed-off-by: Bernard Iremonger 
---
 drivers/net/bnx2x/bnx2x_ethdev.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bnx2x/bnx2x_ethdev.c b/drivers/net/bnx2x/bnx2x_ethdev.c
index 09b5920..b25ca21 100644
--- a/drivers/net/bnx2x/bnx2x_ethdev.c
+++ b/drivers/net/bnx2x/bnx2x_ethdev.c
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2013-2015 Brocade Communications Systems, Inc.
+ * Copyright(c) 2015 Intel Corporation.
  *
  * All rights reserved.
  */
@@ -529,12 +530,12 @@ static int rte_bnx2xvf_pmd_init(const char *name 
__rte_unused, const char *param
 }

 static struct rte_driver rte_bnx2x_driver = {
-   .type = PMD_PDEV,
+   .name = "rte_bnx2x_driver", /* PCI device */
.init = rte_bnx2x_pmd_init,
 };

 static struct rte_driver rte_bnx2xvf_driver = {
-   .type = PMD_PDEV,
+   .name = "rte_bnx2xvf_driver",   /* PCI device */
.init = rte_bnx2xvf_pmd_init,
 };

-- 
1.9.1

[dpdk-dev] [RFC PATCH 02/18] af_packet: remove type field from rte_driver structure

2015-09-04 Thread Bernard Iremonger

Signed-off-by: Bernard Iremonger 
---
 drivers/net/af_packet/rte_eth_af_packet.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/af_packet/rte_eth_af_packet.c 
b/drivers/net/af_packet/rte_eth_af_packet.c
index bdd9628..0ce6540 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -5,7 +5,7 @@
  *
  *   Originally based upon librte_pmd_pcap code:
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
  *   Copyright(c) 2014 6WIND S.A.
  *   All rights reserved.
  *
@@ -839,8 +839,7 @@ exit:
 }

 static struct rte_driver pmd_af_packet_drv = {
-   .name = "eth_af_packet",
-   .type = PMD_VDEV,
+   .name = "eth_af_packet",   /* Virtual device */
.init = rte_pmd_af_packet_devinit,
 };

-- 
1.9.1

[dpdk-dev] [RFC PATCH 01/18] librte_eal: remove type field from rte_driver structure.

2015-09-04 Thread Bernard Iremonger

Signed-off-by: Bernard Iremonger 
---
 lib/librte_eal/common/eal_common_dev.c  | 22 +-
 lib/librte_eal/common/include/rte_dev.h | 11 +--
 2 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_dev.c 
b/lib/librte_eal/common/eal_common_dev.c
index 4089d66..ccfbb8c 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
  *   Copyright(c) 2014 6WIND S.A.
  *   All rights reserved.
  *
@@ -72,8 +72,6 @@ rte_eal_vdev_init(const char *name, const char *args)
return -EINVAL;

TAILQ_FOREACH(driver, &dev_driver_list, next) {
-   if (driver->type != PMD_VDEV)
-   continue;

/*
 * search a driver prefix in virtual device name.
@@ -117,10 +115,18 @@ rte_eal_dev_init(void)

/* Once the vdevs are initalized, start calling all the pdev drivers */
TAILQ_FOREACH(driver, &dev_driver_list, next) {
-   if (driver->type != PMD_PDEV)
-   continue;
-   /* PDEV drivers don't get passed any parameters */
-   driver->init(NULL, NULL);
+
+   /* PCI drivers don't get passed any parameters */
+   /*
+* Search a virtual driver prefix in device name.
+* It should not be found for PCI devices.
+* Use strncmp to compare.
+*/
+
+   if ((driver->name) &&
+   (strncmp(driver->name, "eth_", strlen("eth_")) != 0)) {
+   driver->init(NULL, NULL);
+   }
}
return 0;
 }
@@ -134,8 +140,6 @@ rte_eal_vdev_uninit(const char *name)
return -EINVAL;

TAILQ_FOREACH(driver, &dev_driver_list, next) {
-   if (driver->type != PMD_VDEV)
-   continue;

/*
 * search a driver prefix in virtual device name.
diff --git a/lib/librte_eal/common/include/rte_dev.h 
b/lib/librte_eal/common/include/rte_dev.h
index f601d21..6253185 100644
--- a/lib/librte_eal/common/include/rte_dev.h
+++ b/lib/librte_eal/common/include/rte_dev.h
@@ -62,20 +62,11 @@ typedef int (rte_dev_init_t)(const char *name, const char 
*args);
 typedef int (rte_dev_uninit_t)(const char *name);

 /**
- * Driver type enumeration
- */
-enum pmd_type {
-   PMD_VDEV = 0,
-   PMD_PDEV = 1,
-};
-
-/**
  * A structure describing a device driver.
  */
 struct rte_driver {
TAILQ_ENTRY(rte_driver) next;  /**< Next in list. */
-   enum pmd_type type;/**< PMD Driver type */
-   const char *name;   /**< Driver name. */
+   const char *name;  /**< Driver name. */
rte_dev_init_t *init;  /**< Device init. function. */
rte_dev_uninit_t *uninit;  /**< Device uninit. function. */
 };
-- 
1.9.1
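
The name-prefix check that this patch adds to rte_eal_dev_init() can be
sketched in isolation as follows. This is a minimal illustration, not the
patch itself: struct drv and drv_is_pci() are hypothetical stand-ins for
struct rte_driver and the inline test in the loop; only the "eth_" prefix
convention and the NULL-name guard are taken from the diff above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_driver, reduced to the name field. */
struct drv {
	const char *name;
};

/*
 * Return 1 if the driver would be initialised as a PCI driver under the
 * proposed rule: it has a name, and that name does not start with the
 * virtual-device prefix "eth_". Return 0 otherwise.
 */
static int
drv_is_pci(const struct drv *d)
{
	static const char vdev_prefix[] = "eth_";

	assert(d != NULL);
	return d->name != NULL &&
	       strncmp(d->name, vdev_prefix, strlen(vdev_prefix)) != 0;
}
```

With this rule "rte_ixgbe_driver" is treated as a PCI driver, while
"eth_bond" and any driver without a name are not. Classification then
depends on a naming convention rather than an explicit field, so a PCI
driver whose name happened to begin with "eth_" would be skipped.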

[dpdk-dev] [RFC PATCH 00/18] refactor eal driver registration code

2015-09-04 Thread Bernard Iremonger

At present the eal driver registration code is more complicated than it
needs to be.

This RFC proposes to simplify the eal driver registration code.

Remove the type field from the eal driver structure.
Refactor the eal driver registration code to use the name
field in the eal driver structure instead of the type field.

Modify all PMDs to use the modified eal driver structure.
Initialise the name field in the eal driver structure
in those PMDs where it is not initialised at present.


Bernard Iremonger (18):
  librte_eal: remove type field from rte_driver structure.
  af_packet: remove type field from rte_driver structure
  bnx2x: remove type field and initialise name field in rte_driver
structure
  bonding: remove type field from rte_driver structure
  cxgbe: remove type field from rte_driver structure
  e1000: remove type field and initialise name field in rte_driver
structures
  enic: remove type field and initialise name field in rte_driver
structure
  fm10k: remove type field and initialise name field in rte_driver
structure
  i40e: remove type field and initialise name field in rte_driver
structures
  ixgbe: remove type field and initialise name field in rte_driver
structure
  mlx4: remove type field from rte_driver structure
  mpipe: remove type field and update name in rte_driver structure
  null: remove type field from rte_driver structure
  pcap: remove type field from rte_driver structure
  ring: remove type field from rte_driver structure
  virtio_ethdev: remove type field and initialise name field in
rte_driver structure
  vmxnet3: remove type field and initialise name field in rte_driver
structure
  xenvirt: remove type field from rte_driver structure

 drivers/net/af_packet/rte_eth_af_packet.c |  5 ++---
 drivers/net/bnx2x/bnx2x_ethdev.c  |  5 +++--
 drivers/net/bonding/rte_eth_bond_pmd.c|  3 +--
 drivers/net/cxgbe/cxgbe_ethdev.c  |  4 ++--
 drivers/net/e1000/em_ethdev.c |  2 +-
 drivers/net/e1000/igb_ethdev.c|  4 ++--
 drivers/net/enic/enic_ethdev.c|  3 ++-
 drivers/net/fm10k/fm10k_ethdev.c  |  2 +-
 drivers/net/i40e/i40e_ethdev.c|  2 +-
 drivers/net/i40e/i40e_ethdev_vf.c |  2 +-
 drivers/net/ixgbe/ixgbe_ethdev.c  |  4 ++--
 drivers/net/mlx4/mlx4.c   |  4 ++--
 drivers/net/mpipe/mpipe_tilegx.c  |  7 +++
 drivers/net/null/rte_eth_null.c   |  3 +--
 drivers/net/pcap/rte_eth_pcap.c   |  3 +--
 drivers/net/ring/rte_eth_ring.c   |  3 +--
 drivers/net/virtio/virtio_ethdev.c|  2 +-
 drivers/net/vmxnet3/vmxnet3_ethdev.c  |  2 +-
 drivers/net/xenvirt/rte_eth_xenvirt.c |  5 ++---
 lib/librte_eal/common/eal_common_dev.c| 22 +-
 lib/librte_eal/common/include/rte_dev.h   | 11 +--
 21 files changed, 44 insertions(+), 54 deletions(-)

-- 
1.9.1

[dpdk-dev] [PATCH v3] librte_cfgfile(rte_cfgfile.h): modify the macros values

2015-09-04 Thread Jasvinder Singh

This patch implements the ABI change announced for librte_cfgfile
(rte_cfgfile.h). To allow for longer names and values, the macros
CFG_NAME_LEN and CFG_VALUE_LEN are increased to 64 and 256, respectively.

Signed-off-by: Jasvinder Singh 
---
 doc/guides/rel_notes/deprecation.rst | 4 
 doc/guides/rel_notes/release_2_2.rst | 7 ++-
 lib/librte_cfgfile/Makefile  | 2 +-
 lib/librte_cfgfile/rte_cfgfile.h | 9 +++--
 4 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index 5f6079b..2fbdee2 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -53,10 +53,6 @@ Deprecation Notices
 * The scheduler statistics structure will change to allow keeping track of
   RED actions.

-* librte_cfgfile: In order to allow for longer names and values,
-  the value of macros CFG_NAME_LEN and CFG_NAME_VAL will be increased.
-  Most likely, the new values will be 64 and 256, respectively.
-
 * librte_port: Macros to access the packet meta-data stored within the
   packet buffer will be adjusted to cover the packet mbuf structure as well,
   as currently they are able to access any packet buffer location except the
diff --git a/doc/guides/rel_notes/release_2_2.rst 
b/doc/guides/rel_notes/release_2_2.rst
index abe57b4..ff64da8 100644
--- a/doc/guides/rel_notes/release_2_2.rst
+++ b/doc/guides/rel_notes/release_2_2.rst
@@ -44,6 +44,11 @@ ABI Changes

 * The LPM structure is changed. The deprecated field mem_location is removed.

+* librte_cfgfile: In order to allow for longer names and values,
+  the value of macros CFG_NAME_LEN and CFG_NAME_VAL is increased,
+  the new values are 64 and 256, respectively
+
+

 Shared Library Versions
 ---
@@ -54,7 +59,7 @@ The libraries prepended with a plus sign were incremented in 
this version.

+ libethdev.so.2
+ librte_acl.so.2
- librte_cfgfile.so.1
+   + librte_cfgfile.so.2
  librte_cmdline.so.1
  librte_distributor.so.1
+ librte_eal.so.2
diff --git a/lib/librte_cfgfile/Makefile b/lib/librte_cfgfile/Makefile
index 032c240..616aef0 100644
--- a/lib/librte_cfgfile/Makefile
+++ b/lib/librte_cfgfile/Makefile
@@ -41,7 +41,7 @@ CFLAGS += $(WERROR_FLAGS)

 EXPORT_MAP := rte_cfgfile_version.map

-LIBABIVER := 1
+LIBABIVER := 2

 #
 # all source are stored in SRCS-y
diff --git a/lib/librte_cfgfile/rte_cfgfile.h b/lib/librte_cfgfile/rte_cfgfile.h
index 7c9fc91..d443782 100644
--- a/lib/librte_cfgfile/rte_cfgfile.h
+++ b/lib/librte_cfgfile/rte_cfgfile.h
@@ -47,8 +47,13 @@ extern "C" {
 *
 ***/

-#define CFG_NAME_LEN 32
-#define CFG_VALUE_LEN 64
+#ifndef CFG_NAME_LEN
+#define CFG_NAME_LEN 64
+#endif
+
+#ifndef CFG_VALUE_LEN
+#define CFG_VALUE_LEN 256
+#endif

 /** Configuration file */
 struct rte_cfgfile;
-- 
2.1.0
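
The #ifndef guards in the hunk above mean that a definition made before
the header is included takes precedence over the new defaults. A minimal
sketch of that effect, assuming an application that wants 128-byte names;
struct cfg_entry and cfg_entry_set_name() are hypothetical and only mirror
the idea of buffers sized by the two macros.

```c
#include <assert.h>
#include <string.h>

/* Override chosen by the application; equivalent to -DCFG_NAME_LEN=128. */
#define CFG_NAME_LEN 128

/* Guarded defaults, mirroring the hunk in rte_cfgfile.h. */
#ifndef CFG_NAME_LEN
#define CFG_NAME_LEN 64
#endif

#ifndef CFG_VALUE_LEN
#define CFG_VALUE_LEN 256
#endif

/* Hypothetical entry type whose buffers are sized by the two macros. */
struct cfg_entry {
	char name[CFG_NAME_LEN];
	char value[CFG_VALUE_LEN];
};

/* Copy a name into an entry, including the NUL; return -1 if too long. */
static int
cfg_entry_set_name(struct cfg_entry *e, const char *name)
{
	size_t len;

	assert(e != NULL && name != NULL);
	len = strlen(name);
	if (len >= sizeof(e->name))
		return -1;
	memcpy(e->name, name, len + 1);
	return 0;
}
```

Here the name buffer ends up 128 bytes long while the value buffer keeps
the default of 256. Because these sizes are part of a structure layout,
the library and the application would have to be built with the same
values; overriding them on one side only would make the layouts disagree.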

[dpdk-dev] ixgbe: account more Rx errors Issue

2015-09-04 Thread Andriy Berestovskyy

Hi,
While updating to DPDK 2.1 I noticed an issue with the ixgbe stats.

Since commit f6bf669b9900 "ixgbe: account more Rx errors", the XEC
hardware counter (l3_l4_xsum_error) is added to ierrors. The issue is
that UDP packets with a zero checksum are counted in XEC and therefore
now in ierrors too.

I've tried to disable hw_ip_checksum in rxmode, but it didn't help.

I'm not sure we should add XEC to ierrors, because packets counted in
XEC are not actually dropped by the NIC. So in my case the ierrors
counter is now greater than the actual number of packets received by
the NIC, which makes no sense.

What's your opinion?

Regards,
Andriy

[dpdk-dev] [RFC PATCH 03/18] bnx2x: remove type field and initialise name field in rte_driver structure

2015-09-04 Thread Harish Patil

>
>Signed-off-by: Bernard Iremonger 
>---
> drivers/net/bnx2x/bnx2x_ethdev.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
>diff --git a/drivers/net/bnx2x/bnx2x_ethdev.c
>b/drivers/net/bnx2x/bnx2x_ethdev.c
>index 09b5920..b25ca21 100644
>--- a/drivers/net/bnx2x/bnx2x_ethdev.c
>+++ b/drivers/net/bnx2x/bnx2x_ethdev.c
>@@ -1,5 +1,6 @@
> /*
>  * Copyright (c) 2013-2015 Brocade Communications Systems, Inc.
>+ * Copyright(c) 2015 Intel Corporation.
>  *
>  * All rights reserved.
>  */
>@@ -529,12 +530,12 @@ static int rte_bnx2xvf_pmd_init(const char *name
>__rte_unused, const char *param
> }
>
> static struct rte_driver rte_bnx2x_driver = {
>-  .type = PMD_PDEV,
>+  .name = "rte_bnx2x_driver", /* PCI device */
>   .init = rte_bnx2x_pmd_init,
> };
>
> static struct rte_driver rte_bnx2xvf_driver = {
>-  .type = PMD_PDEV,
>+  .name = "rte_bnx2xvf_driver",   /* PCI device */
>   .init = rte_bnx2xvf_pmd_init,
> };
>
>--
>1.9.1
>
>

Acked-by: Harish Patil 


Thanks,
Harish





[dpdk-dev] i40e PMD VSI/QUEUE setting can't be satisfied

2015-09-04 Thread Nicolas A Buchanan



Hello All,

Hopefully this is the correct place to post questions.  I am a brand-new
user of DPDK and I am having issues starting the i40e PMD.  I get the
following error message when I start any of the test applications:

PMD: eth_i40e_dev_init(): FW 0.0 API 0.0 NVM 04.02.04 eetrack 800013fc
PMD: i40e_pf_parameter_init(): Max supported VSIs:0
PMD: i40e_pf_parameter_init(): PF queue pairs:1
PMD: i40e_pf_parameter_init(): VSI/QUEUE setting can't be satisfied
PMD: i40e_pf_parameter_init(): Max VSIs: 0, asked:0
PMD: i40e_pf_parameter_init(): Total queue pairs:0, asked:1
PMD: eth_i40e_dev_init(): Failed to do parameter init: -22
EAL: Error - exiting with code: 1
  Cause: Requested device :04:00.0 cannot be used

I get the same error binding the device to the uio_pci_generic or igb_uio
driver.

My setup:

CentOS 6.6 running kernel 3.18.12-11.el6.x86_64
DPDK version 2.1.0
Intel XL710 dual port NIC

Any help would be appreciated.

Nick Buchanan

[dpdk-dev] [PATCH 2/3] enic: use appropriate key length in hash table

2015-09-04 Thread Sujith Sankar (ssujith)


On 04/09/15 2:35 pm, "Pablo de Lara" 
wrote:

>RTE_HASH_KEY_LENGTH_MAX was deprecated, and the hash table
>actually is hosting bigger keys than that size, so key length
>has been increased to properly allocate all keys.
>
>Signed-off-by: Pablo de Lara 
>---
> drivers/net/enic/enic_clsf.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
>diff --git a/drivers/net/enic/enic_clsf.c b/drivers/net/enic/enic_clsf.c
>index 9c2abfb..656b25b 100644
>--- a/drivers/net/enic/enic_clsf.c
>+++ b/drivers/net/enic/enic_clsf.c
>@@ -214,7 +214,7 @@ int enic_fdir_add_fltr(struct enic *enic, struct
>rte_eth_fdir_filter *params)
>   enic->fdir.stats.add++;
>   }
> 
>-  pos = rte_hash_add_key(enic->fdir.hash, (void *)key);
>+  pos = rte_hash_add_key(enic->fdir.hash, params);
>   enic->fdir.nodes[pos] = key;
>   return 0;
> }
>@@ -244,7 +244,7 @@ int enic_clsf_init(struct enic *enic)
>   struct rte_hash_parameters hash_params = {
>   .name = "enicpmd_clsf_hash",
>   .entries = ENICPMD_CLSF_HASH_ENTRIES,
>-  .key_len = RTE_HASH_KEY_LENGTH_MAX,
>+  .key_len = sizeof(struct rte_eth_fdir_filter),
>   .hash_func = DEFAULT_HASH_FUNC,
>   .hash_func_init_val = 0,
>   .socket_id = SOCKET_0,
>--

Looks good.

Thanks,
-Sujith


> 
>2.4.2
>

[dpdk-dev] pcap->eth low TX performance

2015-09-04 Thread Yerden Zhumabekov

Hello,

Has anyone tried to work with the pcap PMD recently? We're testing our app
with this setup:

PCAP --rte_eth_rx_burst--> APP --rte_eth_tx_burst--> ethdev

I'm experiencing very low TX performance, leading to massive mbuf drops
when trying to send those packets over the Ethernet device. I tried
running ordinary l2fwd and got the same issue, with 80-90% of
packets dropped. When I substitute PCAP with another ordinary Ethernet
device, everything works fine. Can anyone share an idea?

-- 
Sincerely,

Yerden Zhumabekov
State Technical Service
Astana, KZ

[dpdk-dev] [PATCH 3/3] hash: remove deprecated functions and macros

2015-09-04 Thread Pablo de Lara

The function rte_jhash2() was renamed to rte_jhash_32b(), and the
macros RTE_HASH_KEY_LENGTH_MAX and RTE_HASH_BUCKET_ENTRIES_MAX
were tagged as deprecated, so they can all be removed in 2.2.

Signed-off-by: Pablo de Lara 
---
 doc/guides/rel_notes/deprecation.rst |  5 -
 doc/guides/rel_notes/release_2_2.rst |  3 +++
 lib/librte_hash/rte_hash.h   |  6 --
 lib/librte_hash/rte_jhash.h  | 15 ++-
 4 files changed, 5 insertions(+), 24 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index 5f6079b..fffad80 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -13,11 +13,6 @@ Deprecation Notices
   There is no backward compatibility planned from release 2.2.
   All binaries will need to be rebuilt from release 2.2.

-* The Macros RTE_HASH_BUCKET_ENTRIES_MAX and RTE_HASH_KEY_LENGTH_MAX are
-  deprecated and will be removed with version 2.2.
-
-* The function rte_jhash2 is deprecated and should be removed.
-
 * The following fields have been deprecated in rte_eth_stats:
   imissed, ibadcrc, ibadlen, imcasts, fdirmatch, fdirmiss,
   tx_pause_xon, rx_pause_xon, tx_pause_xoff, rx_pause_xoff
diff --git a/doc/guides/rel_notes/release_2_2.rst 
b/doc/guides/rel_notes/release_2_2.rst
index abe57b4..aa44862 100644
--- a/doc/guides/rel_notes/release_2_2.rst
+++ b/doc/guides/rel_notes/release_2_2.rst
@@ -27,6 +27,9 @@ API Changes
 * The deprecated ring PMD functions are removed:
   rte_eth_ring_pair_create() and rte_eth_ring_pair_attach().

+* The function rte_jhash2() is removed.
+  It was replaced by rte_jhash_32b().
+

 ABI Changes
 ---
diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index 1cddc07..175c0bb 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -49,12 +49,6 @@ extern "C" {
 /** Maximum size of hash table that can be created. */
 #define RTE_HASH_ENTRIES_MAX   (1 << 30)

-/** @deprecated Maximum bucket size that can be created. */
-#define RTE_HASH_BUCKET_ENTRIES_MAX4
-
-/** @deprecated Maximum length of key that can be used. */
-#define RTE_HASH_KEY_LENGTH_MAX64
-
 /** Maximum number of characters in hash name.*/
 #define RTE_HASH_NAMESIZE  32

diff --git a/lib/librte_hash/rte_jhash.h b/lib/librte_hash/rte_jhash.h
index f9a8266..457f225 100644
--- a/lib/librte_hash/rte_jhash.h
+++ b/lib/librte_hash/rte_jhash.h
@@ -267,10 +267,10 @@ rte_jhash_2hashes(const void *key, uint32_t length, 
uint32_t *pc, uint32_t *pb)
 }

 /**
- * Same as rte_jhash2, but takes two seeds and return two uint32_ts.
+ * Same as rte_jhash_32b, but takes two seeds and return two uint32_ts.
  * pc and pb must be non-null, and *pc and *pb must both be initialized
  * with seeds. If you pass in (*pb)=0, the output (*pc) will be
- * the same as the return value from rte_jhash2.
+ * the same as the return value from rte_jhash_32b.
  *
  * @param k
  *   Key to calculate hash of.
@@ -335,17 +335,6 @@ rte_jhash_32b(const uint32_t *k, uint32_t length, uint32_t 
initval)
 }

 static inline uint32_t
-__attribute__ ((deprecated))
-rte_jhash2(const uint32_t *k, uint32_t length, uint32_t initval)
-{
-   uint32_t initval2 = 0;
-
-   rte_jhash_32b_2hashes(k, length, &initval, &initval2);
-
-   return initval;
-}
-
-static inline uint32_t
 __rte_jhash_3words(uint32_t a, uint32_t b, uint32_t c, uint32_t initval)
 {
a += RTE_JHASH_GOLDEN_RATIO + initval;
-- 
2.4.2

[dpdk-dev] [PATCH 2/3] enic: use appropriate key length in hash table

2015-09-04 Thread Pablo de Lara

RTE_HASH_KEY_LENGTH_MAX was deprecated, and the hash table
actually holds keys larger than that size, so the key length is now
sizeof(struct rte_eth_fdir_filter) so that whole keys are stored.

Signed-off-by: Pablo de Lara 
---
 drivers/net/enic/enic_clsf.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/enic/enic_clsf.c b/drivers/net/enic/enic_clsf.c
index 9c2abfb..656b25b 100644
--- a/drivers/net/enic/enic_clsf.c
+++ b/drivers/net/enic/enic_clsf.c
@@ -214,7 +214,7 @@ int enic_fdir_add_fltr(struct enic *enic, struct 
rte_eth_fdir_filter *params)
enic->fdir.stats.add++;
}

-   pos = rte_hash_add_key(enic->fdir.hash, (void *)key);
+   pos = rte_hash_add_key(enic->fdir.hash, params);
enic->fdir.nodes[pos] = key;
return 0;
 }
@@ -244,7 +244,7 @@ int enic_clsf_init(struct enic *enic)
struct rte_hash_parameters hash_params = {
.name = "enicpmd_clsf_hash",
.entries = ENICPMD_CLSF_HASH_ENTRIES,
-   .key_len = RTE_HASH_KEY_LENGTH_MAX,
+   .key_len = sizeof(struct rte_eth_fdir_filter),
.hash_func = DEFAULT_HASH_FUNC,
.hash_func_init_val = 0,
.socket_id = SOCKET_0,
-- 
2.4.2
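
Why the key length has to cover the whole key can be shown with a small
stand-alone sketch. Everything in it is an assumption made for
illustration: struct filter_key is a hypothetical 80-byte key (the real
size of struct rte_eth_fdir_filter is not stated in this thread), and
keys_equal() only imitates a table that compares key_len raw bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OLD_KEY_LEN 64	/* value of the removed RTE_HASH_KEY_LENGTH_MAX */

/* Hypothetical filter key, deliberately larger than OLD_KEY_LEN. */
struct filter_key {
	unsigned char match[OLD_KEY_LEN];
	unsigned char action[16];
};

/* Compare two keys the way a byte-wise hash table would: over key_len bytes. */
static int
keys_equal(const struct filter_key *a, const struct filter_key *b,
	   size_t key_len)
{
	assert(a != NULL && b != NULL);
	assert(key_len <= sizeof(*a));
	return memcmp(a, b, key_len) == 0;
}
```

Two keys that differ only in the bytes past offset 64 compare equal when
key_len is OLD_KEY_LEN and differ when key_len is sizeof(struct
filter_key), which is the kind of aliasing a too-short key length allows.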

[dpdk-dev] [PATCH 1/3] hash: use max key length as internal macro instead of deprecated one

2015-09-04 Thread Pablo de Lara

RTE_HASH_KEY_LENGTH_MAX has been deprecated in DPDK 2.1 and is going
to be removed in 2.2, so the unit tests now define their own
MAX_KEYSIZE macro for sizing the key buffers they allocate.

Signed-off-by: Pablo de Lara 
---
 app/test/test_hash.c   | 7 ---
 app/test/test_hash_functions.c | 4 ++--
 app/test/test_hash_perf.c  | 2 +-
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/app/test/test_hash.c b/app/test/test_hash.c
index 7f8c0d3..4f2509d 100644
--- a/app/test/test_hash.c
+++ b/app/test/test_hash.c
@@ -66,6 +66,7 @@
 static rte_hash_function hashtest_funcs[] = {rte_jhash, rte_hash_crc};
 static uint32_t hashtest_initvals[] = {0};
 static uint32_t hashtest_key_lens[] = {0, 2, 4, 5, 6, 7, 8, 10, 11, 15, 16, 
21, 31, 32, 33, 63, 64};
+#define MAX_KEYSIZE 64
 
/**/
 #define LOCAL_FBK_HASH_ENTRIES_MAX (1 << 15)

@@ -238,7 +239,7 @@ test_crc32_hash_alg_equiv(void)
 static void run_hash_func_test(rte_hash_function f, uint32_t init_val,
uint32_t key_len)
 {
-   static uint8_t key[RTE_HASH_KEY_LENGTH_MAX];
+   static uint8_t key[MAX_KEYSIZE];
unsigned i;


@@ -1100,7 +1101,7 @@ test_hash_creation_with_good_parameters(void)
 static int test_average_table_utilization(void)
 {
struct rte_hash *handle;
-   uint8_t simple_key[RTE_HASH_KEY_LENGTH_MAX];
+   uint8_t simple_key[MAX_KEYSIZE];
unsigned i, j;
unsigned added_keys, average_keys_added = 0;
int ret;
@@ -1154,7 +1155,7 @@ static int test_hash_iteration(void)
 {
struct rte_hash *handle;
unsigned i;
-   uint8_t keys[NUM_ENTRIES][RTE_HASH_KEY_LENGTH_MAX];
+   uint8_t keys[NUM_ENTRIES][MAX_KEYSIZE];
const void *next_key;
void *next_data;
void *data[NUM_ENTRIES];
diff --git a/app/test/test_hash_functions.c b/app/test/test_hash_functions.c
index 8c7cf63..3ad6d80 100644
--- a/app/test/test_hash_functions.c
+++ b/app/test/test_hash_functions.c
@@ -85,7 +85,7 @@ static uint32_t hash_values_crc[2][10] = {{
  * from the array entries is tested.
  */
 #define HASHTEST_ITERATIONS 100
-
+#define MAX_KEYSIZE 64
 static rte_hash_function hashtest_funcs[] = {rte_jhash, rte_hash_crc};
 static uint32_t hashtest_initvals[] = {0, 0xdeadbeef};
 static uint32_t hashtest_key_lens[] = {
@@ -119,7 +119,7 @@ static void
 run_hash_func_perf_test(uint32_t key_len, uint32_t init_val,
rte_hash_function f)
 {
-   static uint8_t key[HASHTEST_ITERATIONS][RTE_HASH_KEY_LENGTH_MAX];
+   static uint8_t key[HASHTEST_ITERATIONS][MAX_KEYSIZE];
uint64_t ticks, start, end;
unsigned i, j;

diff --git a/app/test/test_hash_perf.c b/app/test/test_hash_perf.c
index a87fc80..9d53c14 100644
--- a/app/test/test_hash_perf.c
+++ b/app/test/test_hash_perf.c
@@ -140,7 +140,7 @@ shuffle_input_keys(unsigned table_index)
 {
unsigned i;
uint32_t swap_idx;
-   uint8_t temp_key[RTE_HASH_KEY_LENGTH_MAX];
+   uint8_t temp_key[MAX_KEYSIZE];
hash_sig_t temp_signature;
int32_t temp_position;

-- 
2.4.2

[dpdk-dev] [PATCH 0/3] clean deprecated code in hash library

2015-09-04 Thread Pablo de Lara

This patchset is to remove all deprecated macros and functions
from the hash library, as well as to modify the unit tests and
ENIC driver that were using them.

Pablo de Lara (3):
  hash: use max key length as internal macro instead of deprecated one
  enic: use appropriate key length in hash table
  hash: remove deprecated functions and macros

 app/test/test_hash.c |  7 ---
 app/test/test_hash_functions.c   |  4 ++--
 app/test/test_hash_perf.c|  2 +-
 doc/guides/rel_notes/deprecation.rst |  5 -
 doc/guides/rel_notes/release_2_2.rst |  3 +++
 drivers/net/enic/enic_clsf.c |  4 ++--
 lib/librte_hash/rte_hash.h   |  6 --
 lib/librte_hash/rte_jhash.h  | 15 ++-
 8 files changed, 14 insertions(+), 32 deletions(-)

-- 
2.4.2

[dpdk-dev] pcap->eth low TX performance

2015-09-04 Thread Kyle Larose

Are you reading from the pcap faster than the device can transmit? Does the
app hold off reading from the pcap when the ethdev is pushing back, or does
it just tail drop?

On Fri, Sep 4, 2015 at 12:14 AM, Yerden Zhumabekov 
wrote:

> Hello,
>
> Did anyone try to work with pcap PMD recently? We're testing our app
> with this setup:
>
> PCAP --- rte_eth_rx_burst--> APP-> rte_eth_tx_burst -> ethdev
>
> I'm experiencing very low TX performance leading to massive mbuf drop
> while trying to send those packets over the Ethernet device. I tried
> running ordinary l2fwd and got the same issue with over 80-90% of
> packets drop. When I substitute PCAP with another ordinary Ethernet
> device, everything works fine. Can anyone share an idea?
>
> --
> Sincerely,
>
> Yerden Zhumabekov
> State Technical Service
> Astana, KZ
>
>

[dpdk-dev] [PATCH v2 00/10] clean deprecated code

2015-09-04 Thread Thomas Monjalon

2015-09-02 15:16, Thomas Monjalon:
> Before starting a new integration cycle (2.2.0-rc0),
> the deprecated code is removed.
> 
> The hash library is not cleaned in this patchset and would be
> better done by its maintainers. Bruce, Pablo, please check the
> file doc/guides/rel_notes/deprecation.rst.
> 
> Changes in v2:
> - increment KNI and ring PMD versions
> - list library versions in release notes
> - list API/ABI changes in release notes
> 
> Stephen Hemminger (2):
>   kni: remove deprecated functions
>   ring: remove deprecated functions
> 
> Thomas Monjalon (8):
>   doc: init next release notes
>   ethdev: remove Rx interrupt switch
>   mbuf: remove packet type from offload flags
>   ethdev: remove SCTP flow entries switch
>   eal: remove deprecated function
>   mem: remove dummy malloc library
>   lpm: remove deprecated field
>   acl: remove old API

Applied

[dpdk-dev] [PATCH v2 01/10] doc: init next release notes

2015-09-04 Thread Thomas Monjalon

2015-09-03 15:44, Mcnamara, John:
> P.S. Perhaps we should announce, or maybe this will do as an announcement, 
> that from this release forward the Release Notes should be updated as part of 
> a patchset that contains one of the following:
> 
> * New Features
> * Resolved Issues (in relation to features existing in the previous 
> releases)
> * Known Issues
> * API Changes
> * ABI Changes
> * Shared Library Versions

Maybe we should update doc/guides/contributing/documentation.rst to clearly 
state it.

[dpdk-dev] "cannot use T= with gcov target" when doing "makefile clean" with DPDK-2.1.0

2015-09-04 Thread Montorsi, Francesco

Hi John,

> -----Original Message-----
> From: Mcnamara, John [mailto:john.mcnamara at intel.com]
> Sent: Wednesday, 2 September 2015 16:32
> To: Montorsi, Francesco ; dev at dpdk.org
> Subject: RE: "cannot use T= with gcov target" when doing "makefile clean"
> with DPDK-2.1.0
>...
> That fix seems reasonable and you should submit it as a patch.
> 
> There may be other ways to fix this (there are several ways to fix things
> within the build system) but if you submit a patch we can get some
> comments.

I will submit the patch ASAP (together with a few others). I guess I need to 
follow closely what's written here:

  http://dpdk.org/dev


Thanks,
Francesco

[dpdk-dev] virtio optimization idea

2015-09-04 Thread Xie, Huawei

Hi:

Recently I did a proof of concept for a virtio optimization. The
optimization has two parts:
1) avail ring set with fixed descriptors
2) RX vectorization
With these optimizations, we could get a severalfold performance boost
for pure vhost-virtio throughput.

Here I will only cover the first part, which is the prerequisite for the
second part.
Let us first take RX as an example. Currently, when we fill the avail ring
with guest mbufs, we need to:
a) allocate one descriptor (for a non-sg mbuf) from the free descriptors
b) set the idx of the desc into the entry of the avail ring
c) set the addr/len fields of the descriptor to point to the guest's blank
mbuf data area

Those operations take time, and step b in particular leaves the cache line
for the avail ring in the modified (M) state on the virtio processing
core. When vhost processes the avail ring, the cache line transfer from
the virtio processing core to the vhost processing core takes a lot of CPU
cycles.
To solve this problem, this is the arrangement of the RX ring for the DPDK
PMD (for the non-mergeable case).

                    avail
                    idx
                    +
                    |
+----+----+---+-----+-------+------+
| 0  | 1  | 2 | ... |  254  | 255  |  avail ring
+-+--+-+--+-+-+-----+---+---+--+---+
  |    |    |       |   |      |
  |    |    |       |   |      |
  v    v    v       |   v      v
+-+--+-+--+-+-+-----+---+---+--+---+
| 0  | 1  | 2 | ... |  254  | 255  |  desc ring
+----+----+---+-----+-------+------+
                    |
                    |
+----+----+---+-----+-------+------+
| 0  | 1  | 2 |     |  254  | 255  |  used ring
+----+----+---+-----+-------+------+
                    |
                    +

The avail ring is initialized with fixed descriptors and is never changed,
i.e., the index value of the nth avail ring entry is always n, which
means the virtio PMD is actually refilling the desc ring only, without
having to change the avail ring.
When vhost fetches the avail ring, if not evicted, it is always in its
first level cache.

When RX receives packets from the used ring, we use used->idx as the
desc idx. This requires that vhost processes and returns descs from the
avail ring to the used ring in order, which is true for both the current
DPDK vhost and the kernel vhost implementation. In my understanding, there
is no necessity for vhost net to process descriptors out of order (OOO).
One case could be zero copy: for example, if one descriptor doesn't meet
the zero copy requirement, we could directly return it to the used ring,
earlier than the descriptors in front of it.
To enforce this, I want to use a reserved bit to indicate in-order
processing of descriptors.
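
The fixed avail ring described above can be sketched as follows. This is
an illustration under stated assumptions, not the proof-of-concept code:
struct rx_ring is a hypothetical, reduced ring (not the real vring
layout), RING_SIZE of 256 is taken from the diagram, and the memory
barrier a real driver needs before publishing the index is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 256	/* must be a power of two */

/* Simplified descriptor: only the fields used in this sketch. */
struct desc {
	uint64_t addr;
	uint32_t len;
};

/* Hypothetical, reduced RX virtqueue; not the real vring layout. */
struct rx_ring {
	struct desc desc[RING_SIZE];
	uint16_t avail_ring[RING_SIZE];	/* written once at init */
	uint16_t avail_idx;		/* free-running producer index */
};

/* One-time setup: entry n of the avail ring always names descriptor n. */
static void
rx_ring_init(struct rx_ring *r)
{
	uint16_t i;

	assert(r != NULL);
	for (i = 0; i < RING_SIZE; i++)
		r->avail_ring[i] = i;
	r->avail_idx = 0;
}

/*
 * Refill one slot: only the descriptor and the producer index are
 * written; the avail ring entry itself is left untouched.
 * Returns the slot that was refilled.
 */
static uint16_t
rx_ring_refill(struct rx_ring *r, uint64_t buf_addr, uint32_t buf_len)
{
	uint16_t slot;

	assert(r != NULL);
	slot = r->avail_idx & (RING_SIZE - 1);
	r->desc[slot].addr = buf_addr;
	r->desc[slot].len = buf_len;
	r->avail_idx++;
	return slot;
}
```

Steps a) and b) from the list above disappear from the refill path: no
descriptor is allocated and no avail ring entry is written, so the avail
ring cache line is not dirtied by the refilling core after init.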

For tx ring, the arrangement is like below. Each transmitted mbuf needs
a desc for virtio_net_hdr, so actually we have only 128 free slots.


                         ++
                         ||
                         ||
+-----+-----+-----+------++------+------+------+------+
|  0  |  1  | ... |  127 || 128  | 129  | ...  | 255  |   avail ring with fixed descriptor
+--+--+--+--+-----+---+--++--+---+--+---+------+---+--+
   |     |            |  ||  |      |              |
   v     v            v  ||  v      v              v
+--+--+--+--+-----+---+--++--+---+--+---+------+---+--+
| 127 | 128 | ... |  255 || 127  | 128  | ...  | 255  |   desc ring for virtio_net_hdr
+--+--+--+--+-----+---+--++--+---+--+---+------+---+--+
   |     |            |  ||  |      |              |
   v     v            v  ||  v      v              v
+--+--+--+--+-----+---+--++--+---+--+---+------+---+--+
|  0  |  1  | ... |  127 ||  0   |  1   | ...  | 127  |   desc ring for tx data
+-----+-----+-----+------++------+------+------+------+
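
A rough sketch in C of why a 256-entry TX ring leaves 128 usable slots:
each packet consumes one header descriptor chained to one data
descriptor. The exact index pairing is an assumption here (headers in the
upper half, data in the lower half, slot i using header 128+i chained to
data i); the index labels in the diagram are not unambiguous, and all
names below are hypothetical rather than taken from the PMD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TX_RING_SIZE 256                /* assumed ring size */
#define TX_SLOTS     (TX_RING_SIZE / 2) /* one hdr desc + one data desc per packet */

/* Simplified stand-in for a vring descriptor (hypothetical layout). */
struct tx_desc {
    uint64_t addr;
    uint32_t len;
    uint16_t next; /* index of the chained descriptor */
};

/* Assumed pairing: header descriptors live in the upper half of the ring. */
static uint16_t tx_hdr_desc(uint16_t slot)
{
    assert(slot < TX_SLOTS);
    return (uint16_t)(slot + TX_SLOTS);
}

/* Assumed pairing: data descriptors live in the lower half of the ring. */
static uint16_t tx_data_desc(uint16_t slot)
{
    assert(slot < TX_SLOTS);
    return slot;
}

/* One-time setup: every header descriptor is pre-filled and pre-chained
 * to its data descriptor, so the chain never has to be rebuilt. */
static void tx_ring_init(struct tx_desc *desc, uint64_t hdr_base, uint32_t hdr_len)
{
    assert(desc != NULL);
    for (uint16_t slot = 0; slot < TX_SLOTS; slot++) {
        uint16_t h = tx_hdr_desc(slot);

        desc[h].addr = hdr_base + (uint64_t)slot * hdr_len;
        desc[h].len = hdr_len;
        desc[h].next = tx_data_desc(slot);
    }
}

/* Per-packet work: only the data descriptor is written. */
static void tx_enqueue(struct tx_desc *desc, uint16_t slot,
                       uint64_t pkt_addr, uint32_t pkt_len)
{
    uint16_t d = tx_data_desc(slot);

    desc[d].addr = pkt_addr;
    desc[d].len = pkt_len;
}
```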



/huawei

[dpdk-dev] how to change binding of NIC ports to NUMA nodes

2015-09-04 Thread De Lara Guarch, Pablo

Hi Rajesh,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Rajesh R
> Sent: Friday, September 04, 2015 5:29 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] how to change binding of NIC ports to NUMA nodes
> 
> Hi,
> 
> I am trying an application based on DPDK on a 4-processor server, i.e. 4
> NUMA nodes.
> The server has 4 NIC cards, of which 2 cards get bound to
> NUMA node 0 and the other 2 cards get bound to NUMA node 2 (as per the
> /sys/pci/.../numa_node for each card).
> 
> 
> How can I evenly distribute the cards across all the NUMA nodes so that
> one card gets bound to each NUMA node?
> 
> Can we control the binding from DPDK, either pmd_ixgbe or igb_uio?

The drivers cannot change the NUMA node where your NICs are,
as those nodes are associated with the different physical sockets
(CPU and memory) that you have on your platform, and your NICs are
connected physically to these sockets via the PCI slots.

So, if you want to change the NUMA node, you will have to move the NIC(s)
to another PCI slot that is connected to a different socket.
Look at the user guide of your platform to find out which PCI slots are
connected to which socket.
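
To check which node a slot maps to after moving a card, a small helper
along these lines can read the numa_node file that the kernel exposes for
each PCI device. This is only a sketch: the function name read_numa_node
is made up for illustration, and the path passed in (for example
/sys/bus/pci/devices/0000:04:00.0/numa_node, where the PCI address is a
placeholder) depends on your system.

```c
#include <stdio.h>

/* Return the NUMA node stored in a sysfs numa_node file, or -1 if the
 * file cannot be opened or parsed. Note that the kernel itself reports
 * -1 when it has no NUMA information for the device. */
static int read_numa_node(const char *path)
{
    FILE *f = fopen(path, "r");
    int node = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &node) != 1)
        node = -1;
    fclose(f);
    return node;
}
```

From inside a DPDK application, rte_eth_dev_socket_id() reports the same
association per port, so it reflects the platform wiring rather than
changing it.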

Regards,

Pablo
> 
> 
> --
> Regards
> 
> Rajesh R

[dpdk-dev] vmxnet2-usermap kmod compile errors with ubuntu 15.04

2015-09-04 Thread Ale Mansoor

Downloaded the latest vmxnet3-usermap package (ver 1.2) from dpdk.org and
tried compiling it under an Ubuntu VM, but it fails to compile. Is there a
newer version of this driver available somewhere that will compile
correctly under Ubuntu 15.04?

The kernel (Ubuntu 15.04), from "uname -a":
Linux ubuntu-vm-mansoor 3.19.0-15-generic #15-Ubuntu SMP Thu Apr 16 23:32:37 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

First I got an error about undefined VM_RESERVED, which I fixed by setting
it to (VM_DONTEXPAND | VM_DONTDUMP) to get past the error. Now I get the
following compile errors. I have followed the instructions inside the
"vmxnet3-usermap-1.2/kmod/README" file.
I also noticed the message "Using 2.6.x kernel build system". I have set up
the RTE environment variables as below:

# env | grep RTE
RTE_INCLUDE=/home/mansoor/dpdk_download/dpdk_2.1/dpdk-2.1.0/build/include
RTE_SDK=/home/mansoor/dpdk_download/dpdk_2.1/dpdk-2.1.0
RTE_TARGET=x86_64-native-linuxapp-gcc

Thanks in advance for your help.

# make
Using 2.6.x kernel build system.
make -C /lib/modules/3.19.0-15-generic/build/include/.. SUBDIRS=$PWD SRCROOT=$PWD/. \
  MODULEBUILDDIR= modules
make[1]: Entering directory '/usr/src/linux-headers-3.19.0-15-generic'
  CC [M]  /home/mansoor/dpdk_download/dpdk_2.1/dpdk-2.1.0/vmxnet3-usermap-1.2/kmod/vmxnet3_ethtool.o
/home/mansoor/dpdk_download/dpdk_2.1/dpdk-2.1.0/vmxnet3-usermap-1.2/kmod/vmxnet3_ethtool.c: In function 'vmxnet3_set_features':
/home/mansoor/dpdk_download/dpdk_2.1/dpdk-2.1.0/vmxnet3-usermap-1.2/kmod/vmxnet3_ethtool.c:361:48: error: 'NETIF_F_HW_VLAN_RX' undeclared (first use in this function)
  if (changed & (NETIF_F_RXCSUM | NETIF_F_LRO | NETIF_F_HW_VLAN_RX)) {
                                                ^
/home/mansoor/dpdk_download/dpdk_2.1/dpdk-2.1.0/vmxnet3-usermap-1.2/kmod/vmxnet3_ethtool.c:361:48: note: each undeclared identifier is reported only once for each function it appears in
/home/mansoor/dpdk_download/dpdk_2.1/dpdk-2.1.0/vmxnet3-usermap-1.2/kmod/vmxnet3_ethtool.c: In function 'vmxnet3_set_ethtool_ops':
/home/mansoor/dpdk_download/dpdk_2.1/dpdk-2.1.0/vmxnet3-usermap-1.2/kmod/vmxnet3_ethtool.c:677:2: error: implicit declaration of function 'SET_ETHTOOL_OPS' [-Werror=implicit-function-declaration]
  SET_ETHTOOL_OPS(netdev, _ethtool_ops);
  ^
cc1: some warnings being treated as errors
scripts/Makefile.build:257: recipe for target '/home/mansoor/dpdk_download/dpdk_2.1/dpdk-2.1.0/vmxnet3-usermap-1.2/kmod/vmxnet3_ethtool.o' failed
make[2]: *** [/home/mansoor/dpdk_download/dpdk_2.1/dpdk-2.1.0/vmxnet3-usermap-1.2/kmod/vmxnet3_ethtool.o] Error 1
Makefile:1394: recipe for target '_module_/home/mansoor/dpdk_download/dpdk_2.1/dpdk-2.1.0/vmxnet3-usermap-1.2/kmod' failed
make[1]: *** [_module_/home/mansoor/dpdk_download/dpdk_2.1/dpdk-2.1.0/vmxnet3-usermap-1.2/kmod] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-3.19.0-15-generic'
Makefile:123: recipe for target 'vmxnet3-usermap.ko' failed
make: *** [vmxnet3-usermap.ko] Error 2

[dpdk-dev] how to change binding of NIC ports to NUMA nodes

2015-09-04 Thread Rajesh R

Hi,

I am trying an application based on DPDK on a 4-processor server, i.e. 4
NUMA nodes.
The server has 4 NIC cards, of which 2 cards get bound to
NUMA node 0 and the other 2 cards get bound to NUMA node 2 (as per the
/sys/pci/.../numa_node for each card).


How can I evenly distribute the cards across all the NUMA nodes so that
one card gets bound to each NUMA node?

Can we control the binding from DPDK, either pmd_ixgbe or igb_uio?


-- 
Regards

Rajesh R
