[dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x?
Hi Stephen, Can you please advise, from your experience, what kind of data rates you have been able to achieve with vmxnet3? Also, did you have to do any special optimizations at the vmnic of ESXi for the above? Kindly let me know. Regards -Prashant -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Prashant Upadhyaya Sent: Tuesday, March 11, 2014 10:57 AM To: Stephen Hemminger Cc: dev at dpdk.org Subject: Re: [dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x? Hi Stephen, This is great news! I can wait for a formal release of DPDK with your driver. Please let me know when the release is expected; I will happily migrate to it. Regards -Prashant -Original Message- From: Stephen Hemminger [mailto:step...@networkplumber.org] Sent: Monday, March 10, 2014 9:21 PM To: Prashant Upadhyaya Cc: Srinivasan J; dev at dpdk.org Subject: Re: [dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x? On Mon, 10 Mar 2014 13:30:48 +0530 Prashant Upadhyaya wrote: > Hi Srini, > > Thanks, I could also make it work, thanks to your cue! > > Now then, this multi-segment not being supported in the vmxnet3 driver is a big > party-pooper for me. Unfortunately, in my use case I do indeed make heavy use > of multi-segment buffers for sending out the data, so my use case has failed > and I will have to fix that. > > Also, can you please advise what is the maximum data rate you have been able > to achieve with one vmxnet3 10G port. > > Thanks a lot for the advice once again. > > Regards > -Prashant I am integrating our driver with the 1.6.1 DPDK driver. We support multi-segment; if you want, I will backport that feature first. === Please refer to http://www.aricent.com/legal/email_disclaimer.html for important disclosures regarding this electronic communication. ===
[dpdk-dev] RSS, performance, and stability issues vmxnet3-pmd
I'm unable to get RSS to work properly with vmxnet3-pmd. The first issue is that the number of rxqs must be a power of 2; otherwise, rte_eth_dev_start() fails due to an inability to activate the vmxnet3 NIC. This is not too big of a deal, but physical NICs don't have this requirement. The second issue is that RSS is simply not working at all for me. The rxmode is set to ETH_MQ_RX_RSS and rss_hf = ETH_RSS_IPV4_TCP | ETH_RSS_IPV4_UDP | ETH_RSS_IPV4 | ETH_RSS_IPV6. The same configuration works for a real NIC. When I checked mb->pkt.hash, the value is all zeroed out. Even with RSS disabled, I found the performance of vmxnet3-pmd to be quite poor, peaking at 600k pps with 64-byte packets, while libpcap can do 650k pps. Lastly, there is a stability issue: on a number of occasions, vmxnet3-pmd stops receiving packets after some random time and several million packets. I'm not sure if anyone else is having as many issues as I am; I will give vmxnet3-usermap a try. Finally, does either vmxnet3-usermap or vmxnet3-pmd work well with a non-Intel underlying physical NIC? Thanks. Dan
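[Editorial note: the power-of-two queue-count constraint described above can be validated before configuring the device, which gives a clearer failure than a failed rte_eth_dev_start(). A minimal sketch in plain C; is_power_of_two and round_down_pow2 are hypothetical helpers, not DPDK APIs.]

```c
#include <stdint.h>

/* vmxnet3-pmd reportedly requires the RX queue count to be a power of
 * two; checking up front gives a clearer error than a failed start. */
static int is_power_of_two(uint16_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Round a requested queue count down to the nearest power of two, so a
 * request like 6 queues degrades to 4 instead of failing device start. */
static uint16_t round_down_pow2(uint16_t n)
{
    uint16_t p = 1;

    if (n == 0)
        return 0;
    while (p <= n / 2)
        p <<= 1;
    return p;
}
```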
[dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x?
Hi Stephen, This is great news! I can wait for a formal release of DPDK with your driver. Please let me know when the release is expected; I will happily migrate to it. Regards -Prashant -Original Message- From: Stephen Hemminger [mailto:step...@networkplumber.org] Sent: Monday, March 10, 2014 9:21 PM To: Prashant Upadhyaya Cc: Srinivasan J; dev at dpdk.org Subject: Re: [dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x? On Mon, 10 Mar 2014 13:30:48 +0530 Prashant Upadhyaya wrote: > Hi Srini, > > Thanks, I could also make it work, thanks to your cue! > > Now then, this multi-segment not being supported in the vmxnet3 driver is a big > party-pooper for me. Unfortunately, in my use case I do indeed make heavy use > of multi-segment buffers for sending out the data, so my use case has failed > and I will have to fix that. > > Also, can you please advise what is the maximum data rate you have been able > to achieve with one vmxnet3 10G port. > > Thanks a lot for the advice once again. > > Regards > -Prashant I am integrating our driver with the 1.6.1 DPDK driver. We support multi-segment; if you want, I will backport that feature first.
[dpdk-dev] Setting up hugepage memory failed when there is a java process also using hugepages
Hello, I am having a rare problem with hugepages. I am running DPDK 1.5.1 on CentOS 6.4. Everything is running smoothly and we have made quite good progress integrating DPDK into our solution. But sometimes there is an error in the init of the huge pages. The problem occurs when another process that also uses hugepages has been started, in this case a Java 1.7 process (JDK 1.7.0_45, 64-bit). The machine only has one socket with 1 Xeon CPU with 4 cores and 4 GB. I have configured 256 pages of 2MB. Everything is running in 64 bits.

grep Huge /proc/meminfo
AnonHugePages:  253952 kB
HugePages_Total:   256
HugePages_Free:    256
HugePages_Rsvd:      0
HugePages_Surp:      0
Hugepagesize:     2048 kB

When I start the DPDK process (with -c 01 -n 4 -m 32 options), this is the trace:

EAL: Detected lcore 0 as core 0 on socket 0
EAL: Setting up hugepage memory...
EAL: Ask a virtual area of 0x394264576 bytes
EAL: Virtual area found at 0x7fc0c900 (size = 0x1780)
EAL: Ask a virtual area of 0x4194304 bytes
EAL: Virtual area found at 0x7fc0c8a0 (size = 0x40)
EAL: Ask a virtual area of 0x132120576 bytes
EAL: Virtual area found at 0x7fc0c0a0 (size = 0x7e0)
EAL: Ask a virtual area of 0x4194304 bytes
EAL: Virtual area found at 0x7fc0c040 (size = 0x40)
EAL: Ask a virtual area of 0x2097152 bytes
EAL: Virtual area found at 0x7fc0c000 (size = 0x20)
EAL: Requesting 16 pages of size 2MB from socket 0
...

Everything goes well. After the start, this is the hugepages information:

AnonHugePages:  280576 kB
HugePages_Total:   256
HugePages_Free:    220
HugePages_Rsvd:     35
HugePages_Surp:      0
Hugepagesize:     2048 kB

If I then start the java process, it starts OK and this is the hugepages information:

HugePages_Total:   256
HugePages_Free:    204
HugePages_Rsvd:     74
HugePages_Surp:      0
Hugepagesize:     2048 kB

If I stop and start the DPDK process, it fails with the following trace:

EAL: Detected lcore 0 as core 0 on socket 0
EAL: Setting up hugepage memory...
EAL: map_all_hugepages(): mmap failed: Cannot allocate memory
EAL: Failed to mmap 2 MB hugepages
PANIC in rte_eal_init(): Cannot init memory

And after the fail, this is the hugepage information:

AnonHugePages:  251904 kB
HugePages_Total:   256
HugePages_Free:     40
HugePages_Rsvd:     40
HugePages_Surp:      0
Hugepagesize:     2048 kB

It seems that it tries to map all the hugepages even though it should be limited with -m 32. After the fail, there are 201 rtemap_X files in /mnt/huge. When I stop the java process, the DPDK process starts without any problem. I don't know if this also happens with processes other than java that use hugepages. Am I doing something wrong? Thanks a lot, Carlos
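[Editorial note: the arithmetic behind -m 32 matches the "Requesting 16 pages" line in the trace above: 32 MB over 2 MB hugepages is 16 pages, so the failure comes from EAL touching all free pages at init, not from the -m request itself. A small sketch of that calculation; pages_needed is an illustrative helper, not an EAL function.]

```c
#include <stdint.h>

/* Number of hugepages needed to back a memory request.
 * EAL's -m option is in megabytes; the hugepage size here is the
 * 2048 kB Hugepagesize reported in /proc/meminfo above. */
static unsigned pages_needed(unsigned mem_mb, unsigned hugepage_kb)
{
    unsigned page_mb = hugepage_kb / 1024;

    return (mem_mb + page_mb - 1) / page_mb;  /* round up */
}
```

With -m 32 and 2 MB pages this yields 16 pages, yet the report shows EAL mapping far more of the 256 configured pages before releasing the surplus, which is why a second hugepage consumer makes init fail.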
[dpdk-dev] [memnic PATCH 5/5] pmd: handle multiple segments on xmit
From: Hiroshi Shimamoto

The current MEMNIC PMD cannot handle multiple segments. Add the functionality to transmit an mbuf which has multiple segments: walk every segment of the mbuf being transmitted and copy the data into the MEMNIC packet buffer.

Signed-off-by: Hiroshi Shimamoto
Reviewed-by: Hayato Momma
---
 pmd/pmd_memnic.c | 19 ++-
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/pmd/pmd_memnic.c b/pmd/pmd_memnic.c
index abfd437..4ee655d 100644
--- a/pmd/pmd_memnic.c
+++ b/pmd/pmd_memnic.c
@@ -324,9 +324,11 @@ static uint16_t memnic_xmit_pkts(void *tx_queue,
 	pkts = bytes = errs = 0;
 	for (nr = 0; nr < nb_pkts; nr++) {
-		int len = rte_pktmbuf_data_len(tx_pkts[nr]);
+		int pkt_len = rte_pktmbuf_pkt_len(tx_pkts[nr]);
+		struct rte_mbuf *sg;
+		void *ptr;

-		if (len > MEMNIC_MAX_FRAME_LEN) {
+		if (pkt_len > MEMNIC_MAX_FRAME_LEN) {
 			errs++;
 			break;
 		}
@@ -356,12 +358,19 @@ retry:
 			idx = 0;
 		adapter->down_idx = idx;

-		p->len = len;
+		p->len = pkt_len;

-		rte_memcpy(p->data, rte_pktmbuf_mtod(tx_pkts[nr], void *), len);
+		ptr = p->data;
+		for (sg = tx_pkts[nr]; sg; sg = sg->pkt.next) {
+			void *src = rte_pktmbuf_mtod(sg, void *);
+			int data_len = sg->pkt.data_len;
+
+			rte_memcpy(ptr, src, data_len);
+			ptr += data_len;
+		}

 		pkts++;
-		bytes += len;
+		bytes += pkt_len;

 		rte_mb();
 		p->status = MEMNIC_PKT_ST_FILLED;
--
1.8.4
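[Editorial note: the segment walk in this patch can be modeled outside DPDK with a plain linked list of buffers. A self-contained sketch; struct seg and flatten are illustrative stand-ins for the chained mbuf and the copy loop, not the MEMNIC API.]

```c
#include <stddef.h>
#include <string.h>

/* A stand-in for a chained mbuf: each segment holds part of the frame. */
struct seg {
    const char *data;
    size_t data_len;
    struct seg *next;
};

/* Walk every segment and copy its bytes into one contiguous buffer,
 * mirroring what memnic_xmit_pkts() now does into p->data.
 * Returns the total number of bytes copied (the packet length), or 0
 * if the segments would overflow the destination frame buffer. */
static size_t flatten(const struct seg *head, char *dst, size_t dst_len)
{
    size_t off = 0;

    for (const struct seg *s = head; s != NULL; s = s->next) {
        if (off + s->data_len > dst_len)
            return 0;  /* would overflow the frame buffer: drop */
        memcpy(dst + off, s->data, s->data_len);
        off += s->data_len;
    }
    return off;
}
```

The overflow check plays the role of the MEMNIC_MAX_FRAME_LEN comparison on the whole pkt_len rather than on a single segment.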
[dpdk-dev] [memnic PATCH 4/5] pmd: account statistics
From: Hiroshi Shimamoto

Implement packet accounting in MEMNIC on TX/RX.

Signed-off-by: Hiroshi Shimamoto
Reviewed-by: Hayato Momma
---
 pmd/pmd_memnic.c | 37 +++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/pmd/pmd_memnic.c b/pmd/pmd_memnic.c
index fc2d990..abfd437 100644
--- a/pmd/pmd_memnic.c
+++ b/pmd/pmd_memnic.c
@@ -255,18 +255,23 @@ static uint16_t memnic_recv_pkts(void *rx_queue,
 	struct memnic_packet *p;
 	struct rte_mbuf *mb;
 	uint16_t nr;
+	uint64_t pkts, bytes, errs;
 	int idx;
+	struct rte_eth_stats *st = &adapter->stats[rte_lcore_id()];

 	if (!adapter->nic->hdr.valid)
 		return 0;

+	pkts = bytes = errs = 0;
 	idx = adapter->up_idx;
 	for (nr = 0; nr < nb_pkts; nr++) {
 		p = >packets[idx];
 		if (p->status != MEMNIC_PKT_ST_FILLED)
 			break;
-		if (p->len > MEMNIC_MAX_FRAME_LEN)
+		if (p->len > MEMNIC_MAX_FRAME_LEN) {
+			errs++;
 			goto drop;
+		}
 		mb = rte_pktmbuf_alloc(adapter->mp);
 		if (!mb)
 			break;
@@ -279,6 +284,9 @@ static uint16_t memnic_recv_pkts(void *rx_queue,
 		mb->pkt.data_len = p->len;
 		rx_pkts[nr] = mb;

+		pkts++;
+		bytes += p->len;
+
 drop:
 		rte_mb();
 		p->status = MEMNIC_PKT_ST_FREE;
@@ -288,6 +296,13 @@ drop:
 	}
 	adapter->up_idx = idx;

+	/* stats */
+	st->ipackets += pkts;
+	st->ibytes += bytes;
+	st->ierrors += errs;
+	st->q_ipackets[0] += pkts;
+	st->q_ibytes[0] += bytes;
+
 	return nr;
 }

@@ -300,14 +315,21 @@ static uint16_t memnic_xmit_pkts(void *tx_queue,
 	struct memnic_packet *p;
 	uint16_t nr;
 	int idx, old;
+	struct rte_eth_stats *st = &adapter->stats[rte_lcore_id()];
+	uint64_t pkts, bytes, errs;

 	if (!adapter->nic->hdr.valid)
 		return 0;

+	pkts = bytes = errs = 0;
+
 	for (nr = 0; nr < nb_pkts; nr++) {
 		int len = rte_pktmbuf_data_len(tx_pkts[nr]);
-		if (len > MEMNIC_MAX_FRAME_LEN)
+
+		if (len > MEMNIC_MAX_FRAME_LEN) {
+			errs++;
 			break;
+		}
 retry:
 		idx = ACCESS_ONCE(adapter->down_idx);
 		p = >packets[idx];
@@ -315,6 +337,7 @@ retry:
 		if (old != MEMNIC_PKT_ST_FREE) {
 			if (old == MEMNIC_PKT_ST_FILLED &&
 			    idx == ACCESS_ONCE(adapter->down_idx)) {
+				errs++;
 				break;
 			}
 			goto retry;
 		}
@@ -337,12 +360,22 @@ retry:
 		rte_memcpy(p->data, rte_pktmbuf_mtod(tx_pkts[nr], void *), len);

+		pkts++;
+		bytes += len;
+
 		rte_mb();
 		p->status = MEMNIC_PKT_ST_FILLED;

 		rte_pktmbuf_free(tx_pkts[nr]);
 	}

+	/* stats */
+	st->opackets += pkts;
+	st->obytes += bytes;
+	st->oerrors += errs;
+	st->q_opackets[0] += pkts;
+	st->q_obytes[0] += bytes;
+
 	return nr;
 }
--
1.8.4
[dpdk-dev] [memnic PATCH 3/5] pmd: implement stats of MEMNIC
From: Hiroshi Shimamoto

Implement the missing feature to account statistics. This patch adds just the infrastructure.

Signed-off-by: Hiroshi Shimamoto
Reviewed-by: Hayato Momma
---
 pmd/pmd_memnic.c | 45 ++---
 1 file changed, 42 insertions(+), 3 deletions(-)

diff --git a/pmd/pmd_memnic.c b/pmd/pmd_memnic.c
index bf5fc2e..fc2d990 100644
--- a/pmd/pmd_memnic.c
+++ b/pmd/pmd_memnic.c
@@ -51,6 +51,7 @@ struct memnic_adapter {
 	int up_idx, down_idx;
 	struct rte_mempool *mp;
 	struct ether_addr mac_addr;
+	struct rte_eth_stats stats[RTE_MAX_LCORE];
 };

 static inline struct memnic_adapter *get_adapter(const struct rte_eth_dev *dev)
@@ -126,13 +127,51 @@ static void memnic_dev_infos_get(struct rte_eth_dev *dev,
 	dev_info->max_mac_addrs = 1;
 }

-static void memnic_dev_stats_get(__rte_unused struct rte_eth_dev *dev,
-				 __rte_unused struct rte_eth_stats *stats)
+static void memnic_dev_stats_get(struct rte_eth_dev *dev,
+				 struct rte_eth_stats *stats)
 {
+	struct memnic_adapter *adapter = get_adapter(dev);
+	int i;
+
+	memset(stats, 0, sizeof(*stats));
+	for (i = 0; i < RTE_MAX_LCORE; i++) {
+		struct rte_eth_stats *st = &adapter->stats[i];
+
+		stats->ipackets += st->ipackets;
+		stats->opackets += st->opackets;
+		stats->ibytes += st->ibytes;
+		stats->obytes += st->obytes;
+		stats->ierrors += st->ierrors;
+		stats->oerrors += st->oerrors;
+		stats->imcasts += st->imcasts;
+		stats->rx_nombuf += st->rx_nombuf;
+		stats->fdirmatch += st->fdirmatch;
+		stats->fdirmiss += st->fdirmiss;
+
+		/* no multiqueue support now */
+		stats->q_ipackets[0] = st->q_ipackets[0];
+		stats->q_opackets[0] = st->q_opackets[0];
+		stats->q_ibytes[0] = st->q_ibytes[0];
+		stats->q_obytes[0] = st->q_obytes[0];
+		stats->q_errors[0] = st->q_errors[0];
+
+		stats->ilbpackets += st->ilbpackets;
+		stats->olbpackets += st->olbpackets;
+		stats->ilbbytes += st->ilbbytes;
+		stats->olbbytes += st->olbbytes;
+	}
 }

-static void memnic_dev_stats_reset(__rte_unused struct rte_eth_dev *dev)
+static void memnic_dev_stats_reset(struct rte_eth_dev *dev)
 {
+	struct memnic_adapter *adapter = get_adapter(dev);
+	int i;
+
+	for (i = 0; i < RTE_MAX_LCORE; i++) {
+		struct rte_eth_stats *st = &adapter->stats[i];
+
+		memset(st, 0, sizeof(*st));
+	}
 }

 static int memnic_dev_link_update(struct rte_eth_dev *dev,
--
1.8.4
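[Editorial note: the pattern this patch introduces, one rte_eth_stats slot per lcore updated without locks and summed on read, can be sketched generically. The names below are illustrative, not the MEMNIC code.]

```c
#include <stdint.h>
#include <string.h>

#define MAX_CORES 8  /* stands in for RTE_MAX_LCORE */

/* Minimal stand-in for the counters each core updates locklessly. */
struct core_stats {
    uint64_t ipackets, opackets, ibytes, obytes;
};

static struct core_stats per_core[MAX_CORES];

/* Sum every core's slot into one view, like memnic_dev_stats_get(). */
static void stats_get(struct core_stats *out)
{
    memset(out, 0, sizeof(*out));
    for (int i = 0; i < MAX_CORES; i++) {
        out->ipackets += per_core[i].ipackets;
        out->opackets += per_core[i].opackets;
        out->ibytes   += per_core[i].ibytes;
        out->obytes   += per_core[i].obytes;
    }
}

/* Zero every slot, like memnic_dev_stats_reset(). */
static void stats_reset(void)
{
    memset(per_core, 0, sizeof(per_core));
}
```

Per-core slots avoid any locking on the datapath at the cost of a summation loop on the (rare) stats read.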
[dpdk-dev] [memnic PATCH 2/5] pmd: check frame length from host
From: Hiroshi Shimamoto

Drop packets which have an invalid length. Normally this must not happen while the vSwitch works correctly; however, it is better to put a sentinel in place to prevent memory corruption.

Signed-off-by: Hiroshi Shimamoto
Reviewed-by: Hayato Momma
---
 pmd/pmd_memnic.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/pmd/pmd_memnic.c b/pmd/pmd_memnic.c
index 805f0b2..bf5fc2e 100644
--- a/pmd/pmd_memnic.c
+++ b/pmd/pmd_memnic.c
@@ -226,6 +226,8 @@ static uint16_t memnic_recv_pkts(void *rx_queue,
 		p = >packets[idx];
 		if (p->status != MEMNIC_PKT_ST_FILLED)
 			break;
+		if (p->len > MEMNIC_MAX_FRAME_LEN)
+			goto drop;
 		mb = rte_pktmbuf_alloc(adapter->mp);
 		if (!mb)
 			break;
@@ -238,6 +240,7 @@ static uint16_t memnic_recv_pkts(void *rx_queue,
 		mb->pkt.data_len = p->len;
 		rx_pkts[nr] = mb;

+drop:
 		rte_mb();
 		p->status = MEMNIC_PKT_ST_FREE;
--
1.8.4
[dpdk-dev] [memnic PATCH 1/5] pmd: fix race condition
From: Hiroshi Shimamoto

There is a race condition on transmit to the vSwitch:

Guest PMD                     Host
Thread-A       Thread-B       vSwitch
|idx=0         |idx=0         |
|p[0] st!=2    |              |
|cmpxchg       |              |
|p[0] st->1    |              |
|idx=1         |              |
|fill data     |              |
|p[0] st->2    |              |p[0] st==2
|              |              |receive data
|              |              |p[0] st->0
|              |cmpxchg       |
|              |success       |
|p[1] st!=2    |              |
|              |p[0] st->1    |  This is BAD

That causes traffic to stop. We have to take care of that race condition by checking whether the current index is correct.

Signed-off-by: Hiroshi Shimamoto
Reviewed-by: Hayato Momma
---
 pmd/pmd_memnic.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/pmd/pmd_memnic.c b/pmd/pmd_memnic.c
index 30d5a1b..805f0b2 100644
--- a/pmd/pmd_memnic.c
+++ b/pmd/pmd_memnic.c
@@ -278,6 +278,15 @@ retry:
 		goto retry;
 	}

+	if (idx != ACCESS_ONCE(adapter->down_idx)) {
+		/*
+		 * vSwitch freed this and got false positive,
+		 * need to recover the status and retry.
+		 */
+		p->status = MEMNIC_PKT_ST_FREE;
+		goto retry;
+	}
+
 	if (++idx >= MEMNIC_NR_PACKET)
 		idx = 0;
 	adapter->down_idx = idx;
--
1.8.4
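[Editorial note: the fix amounts to claiming a slot with compare-and-swap, then re-reading the ring index and undoing the claim if the consumer advanced it in between. A single-threaded sketch of that logic with C11 atomics; all names are illustrative, not the MEMNIC code.]

```c
#include <stdatomic.h>

enum { ST_FREE = 0, ST_USED = 1, ST_FILLED = 2 };
#define NR_SLOTS 4

static _Atomic int status[NR_SLOTS];
static _Atomic int down_idx;

/* Try to claim the current TX slot. Returns the claimed index, or -1
 * when the slot is still FILLED (ring full). Retries when another
 * producer raced us, and undoes a claim made against a stale index,
 * which is the false positive the patch above recovers from. */
static int claim_slot(void)
{
    for (;;) {
        int idx = atomic_load(&down_idx);
        int expected = ST_FREE;

        if (!atomic_compare_exchange_strong(&status[idx], &expected,
                                            ST_USED)) {
            if (expected == ST_FILLED && idx == atomic_load(&down_idx))
                return -1;      /* genuinely full */
            continue;           /* lost a race: retry */
        }
        if (idx != atomic_load(&down_idx)) {
            /* Consumer freed this slot and advanced the index between
             * our load and the cmpxchg: recover the status and retry. */
            atomic_store(&status[idx], ST_FREE);
            continue;
        }
        atomic_store(&down_idx, (idx + 1) % NR_SLOTS);
        return idx;
    }
}
```

The second check is the essence of the patch: without it, a claim made against an index the consumer has already moved past silently poisons a freed slot.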
[dpdk-dev] On vmxnet-pmd crash in DPDK 1.6.0r1
virtio-pmd has the same pattern. I wonder if vmxnet3-pmd just blindly copied the same paradigm.

lib/librte_pmd_virtio/virtio_ethdev.c
473 static struct eth_driver rte_virtio_pmd = {
474 	{
475 		.name = "rte_virtio_pmd",
476 		.id_table = pci_id_virtio_map,
477 #ifdef RTE_EAL_UNBIND_PORTS
478 		.drv_flags = RTE_PCI_DRV_NEED_IGB_UIO,
479 #endif
480 	},
481 	.eth_dev_init = eth_virtio_dev_init,
482 	.dev_private_size = sizeof(struct virtio_adapter),
483 };

On Mar 10, 2014, at 11:20 PM, Daniel Kan wrote:

> Upon further trace, I know what caused it. The uio hardware resources were never memory mapped when RTE_EAL_UNBIND_PORTS is not enabled. Specifically, pci_dev->mem_resource[] is not mapped. This explains why setting CONFIG_RTE_EAL_UNBIND_PORTS=y fixes the problem.
>
> lib/librte_pmd_vmxnet3/vmxnet3_ethdev.c
> 266 static struct eth_driver rte_vmxnet3_pmd = {
> 267 	{
> 268 		.name = "rte_vmxnet3_pmd",
> 269 		.id_table = pci_id_vmxnet3_map,
> 270 #ifdef RTE_EAL_UNBIND_PORTS
> 271 		.drv_flags = RTE_PCI_DRV_NEED_IGB_UIO,
> 272 #endif
> 273 	},
> 274 	.eth_dev_init = eth_vmxnet3_dev_init,
> 275 	.dev_private_size = sizeof(struct vmxnet3_adapter),
> 276 };
>
> Note drv_flags will be 0.
>
> lib/librte_eal/linuxapp/eal/eal_pci.c
> 1039 #ifdef RTE_EAL_UNBIND_PORTS
> 1040 	if (dr->drv_flags & RTE_PCI_DRV_NEED_IGB_UIO) {
> 1041 		/* unbind driver and load uio resources for Intel NICs */
> 1042 		if (pci_switch_module(dr, dev, 1, IGB_UIO_NAME) < 0)
> ...
> 1050 #else
> 1051 	if (dr->drv_flags & RTE_PCI_DRV_NEED_IGB_UIO)
> 1052 		/* just map resources for Intel NICs */
> 1053 		if (pci_uio_map_resource(dev) < 0)
> 1054 			return -1;
> 1055 #endif
>
> If RTE_EAL_UNBIND_PORTS is defined, pci_switch_module will call pci_uio_map_resource.
>
> I then looked at the bsdapp's version; it simply has:
> lib/librte_eal/bsdapp/eal/eal_pci.c
> 479 	/* just map the NIC resources */
> 480 	if (pci_uio_map_resource(dev) < 0)
> 481 		return -1;
>
> I don't know the history behind why .drv_flags = RTE_PCI_DRV_NEED_IGB_UIO is set only for RTE_EAL_UNBIND_PORTS. Can we just ensure pci_uio_map_resource is called on Linux just like in the BSD version? One way is to ensure drv_flags is always set to RTE_PCI_DRV_NEED_IGB_UIO, but I don't know if this fix will break other parts.
>
> +++ vmxnet3_ethdev.c	2014-03-10 23:18:02.087742434 -0700
> @@ -267,9 +267,7 @@
>  	{
>  		.name = "rte_vmxnet3_pmd",
>  		.id_table = pci_id_vmxnet3_map,
> -#ifdef RTE_EAL_UNBIND_PORTS
>  		.drv_flags = RTE_PCI_DRV_NEED_IGB_UIO,
> -#endif
>  	},
>  	.eth_dev_init = eth_vmxnet3_dev_init,
>  	.dev_private_size = sizeof(struct vmxnet3_adapter),
>
> Thanks.
>
> Dan
>
> On Mar 10, 2014, at 10:33 PM, Daniel Kan wrote:
>
>> I also got the segfault. Setting CONFIG_RTE_EAL_UNBIND_PORTS=y is not ideal because it would also unbind my VMXNET3 management interface.
>>
>> Does anyone know why the crash is happening? According to the stacktrace, hw_addrX is 0x0 during eth_vmxnet3_dev_init.
>>
>> I'm running on Ubuntu 12.04 LTS on ESXi 5.5 with an Intel I350 NIC as the physical adapter for VMXNET3.
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x004fe0eb in vmxnet3_read_addr (addr=0x0) at /home/dkan/nyansa/3rd-party/dpdk-1.6.0r1/lib/librte_pmd_vmxnet3/vmxnet3_ethdev.h:139
>> (gdb) up
>> #1  0x004fe331 in eth_vmxnet3_dev_init (eth_drv=0x799440 , eth_dev=0x7d5280 ) at /home/dkan/nyansa/3rd-party/dpdk-1.6.0r1/lib/librte_pmd_vmxnet3/vmxnet3_ethdev.c:218
>> (gdb) print *hw
>> $12 = {hw_addr0 = 0x0, hw_addr1 = 0x0, back = 0x0, device_id = 1968, vendor_id = 5549, subsystem_device_id = 0, subsystem_vendor_id = 0, adapter_stopped = 0, perm_addr = "\000\000\000\000\000", num_tx_queues = 1 '\001', num_rx_queues = 1 '\001', bufs_per_pkt = 1 '\001', cur_mtu = 1500, tqd_start = 0x0, rqd_start = 0x0, shared = 0x0, sharedPA = 0, queueDescPA = 0, queue_desc_len = 0, rss_conf = 0x0, rss_confPA = 0, mf_table = 0x0}
>>
>> Thanks in advance.
>>
>> Dan
>>
>> On Mar 10, 2014, at 1:00 AM, Prashant Upadhyaya aricent.com> wrote:
>>
>>> Hi Srini,
>>>
>>> Thanks, I could also make it work, thanks to your cue!
>>>
>>> Now then, this multi-segment not being supported in the vmxnet3 driver is a big party-pooper for me. Unfortunately, in my use case I do indeed make heavy use of multi-segment buffers for sending out the data, so my use case has failed and I will have to fix that.
>>>
>>> Also, can you please advise what is the maximum data rate you have been able to achieve with one vmxnet3 10G port.
>>>
>>> Thanks a lot for the advice once again.
>>>
>>> Regards
>>> -Prashant
>>>
>>>
>>> -Original Message-
>>> From: Srinivasan J [mailto:srinidpdk at
[dpdk-dev] On vmxnet-pmd crash in DPDK 1.6.0r1
Upon further trace, I know what caused it. The uio hardware resources were never memory mapped when RTE_EAL_UNBIND_PORTS is not enabled. Specifically, pci_dev->mem_resource[] is not mapped. This explains why setting CONFIG_RTE_EAL_UNBIND_PORTS=y fixes the problem.

lib/librte_pmd_vmxnet3/vmxnet3_ethdev.c
266 static struct eth_driver rte_vmxnet3_pmd = {
267 	{
268 		.name = "rte_vmxnet3_pmd",
269 		.id_table = pci_id_vmxnet3_map,
270 #ifdef RTE_EAL_UNBIND_PORTS
271 		.drv_flags = RTE_PCI_DRV_NEED_IGB_UIO,
272 #endif
273 	},
274 	.eth_dev_init = eth_vmxnet3_dev_init,
275 	.dev_private_size = sizeof(struct vmxnet3_adapter),
276 };

Note drv_flags will be 0.

lib/librte_eal/linuxapp/eal/eal_pci.c
1039 #ifdef RTE_EAL_UNBIND_PORTS
1040 	if (dr->drv_flags & RTE_PCI_DRV_NEED_IGB_UIO) {
1041 		/* unbind driver and load uio resources for Intel NICs */
1042 		if (pci_switch_module(dr, dev, 1, IGB_UIO_NAME) < 0)
...
1050 #else
1051 	if (dr->drv_flags & RTE_PCI_DRV_NEED_IGB_UIO)
1052 		/* just map resources for Intel NICs */
1053 		if (pci_uio_map_resource(dev) < 0)
1054 			return -1;
1055 #endif

If RTE_EAL_UNBIND_PORTS is defined, pci_switch_module will call pci_uio_map_resource.

I then looked at the bsdapp's version; it simply has:
lib/librte_eal/bsdapp/eal/eal_pci.c
479 	/* just map the NIC resources */
480 	if (pci_uio_map_resource(dev) < 0)
481 		return -1;

I don't know the history behind why .drv_flags = RTE_PCI_DRV_NEED_IGB_UIO is set only for RTE_EAL_UNBIND_PORTS. Can we just ensure pci_uio_map_resource is called on Linux just like in the BSD version? One way is to ensure drv_flags is always set to RTE_PCI_DRV_NEED_IGB_UIO, but I don't know if this fix will break other parts.

+++ vmxnet3_ethdev.c	2014-03-10 23:18:02.087742434 -0700
@@ -267,9 +267,7 @@
 	{
 		.name = "rte_vmxnet3_pmd",
 		.id_table = pci_id_vmxnet3_map,
-#ifdef RTE_EAL_UNBIND_PORTS
 		.drv_flags = RTE_PCI_DRV_NEED_IGB_UIO,
-#endif
 	},
 	.eth_dev_init = eth_vmxnet3_dev_init,
 	.dev_private_size = sizeof(struct vmxnet3_adapter),

Thanks.

Dan

On Mar 10, 2014, at 10:33 PM, Daniel Kan wrote:

> I also got the segfault. Setting CONFIG_RTE_EAL_UNBIND_PORTS=y is not ideal because it would also unbind my VMXNET3 management interface.
>
> Does anyone know why the crash is happening? According to the stacktrace, hw_addrX is 0x0 during eth_vmxnet3_dev_init.
>
> I'm running on Ubuntu 12.04 LTS on ESXi 5.5 with an Intel I350 NIC as the physical adapter for VMXNET3.
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x004fe0eb in vmxnet3_read_addr (addr=0x0) at /home/dkan/nyansa/3rd-party/dpdk-1.6.0r1/lib/librte_pmd_vmxnet3/vmxnet3_ethdev.h:139
> (gdb) up
> #1  0x004fe331 in eth_vmxnet3_dev_init (eth_drv=0x799440 , eth_dev=0x7d5280 ) at /home/dkan/nyansa/3rd-party/dpdk-1.6.0r1/lib/librte_pmd_vmxnet3/vmxnet3_ethdev.c:218
> (gdb) print *hw
> $12 = {hw_addr0 = 0x0, hw_addr1 = 0x0, back = 0x0, device_id = 1968, vendor_id = 5549, subsystem_device_id = 0, subsystem_vendor_id = 0, adapter_stopped = 0, perm_addr = "\000\000\000\000\000", num_tx_queues = 1 '\001', num_rx_queues = 1 '\001', bufs_per_pkt = 1 '\001', cur_mtu = 1500, tqd_start = 0x0, rqd_start = 0x0, shared = 0x0, sharedPA = 0, queueDescPA = 0, queue_desc_len = 0, rss_conf = 0x0, rss_confPA = 0, mf_table = 0x0}
>
> Thanks in advance.
>
> Dan
>
> On Mar 10, 2014, at 1:00 AM, Prashant Upadhyaya aricent.com> wrote:
>
>> Hi Srini,
>>
>> Thanks, I could also make it work, thanks to your cue!
>>
>> Now then, this multi-segment not being supported in the vmxnet3 driver is a big party-pooper for me. Unfortunately, in my use case I do indeed make heavy use of multi-segment buffers for sending out the data, so my use case has failed and I will have to fix that.
>>
>> Also, can you please advise what is the maximum data rate you have been able to achieve with one vmxnet3 10G port.
>>
>> Thanks a lot for the advice once again.
>>
>> Regards
>> -Prashant
>>
>>
>> -Original Message-
>> From: Srinivasan J [mailto:srinidpdk at gmail.com]
>> Sent: Sunday, March 09, 2014 12:38 AM
>> To: Prashant Upadhyaya
>> Cc: David Marchand; dev at dpdk.org
>> Subject: Re: [dpdk-dev] Which vmxnet3 pmd is to be used in dpdk 1.6.x?
>>
>> Prashant,
>>            I was also able to hit the issue you're hitting using ESXi 5.1.0 evaluation and Fedora 20 x86_64 guest. I was able to fix the issue by setting the CONFIG_RTE_EAL_UNBIND_PORTS=y option in the defconfig_x86_64-default-linuxapp-gcc configuration file.
>>
>> Issue seen
>>
>> EAL: PCI device :03:00.0 on NUMA socket -1
>> EAL: probe driver: 15ad:7b0 rte_vmxnet3_pmd
>> EAL: Device is blacklisted, not initializing
>> EAL: PCI device :0b:00.0 on NUMA socket -1
>> EAL: probe