[dpdk-dev] [PATCH v2] hash: fix compilation for non-x86 systems

2015-07-17 Thread Tony Lu
>-Original Message-
>From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Pablo de Lara
>Sent: Friday, July 17, 2015 5:18 PM
>To: dev at dpdk.org
>Subject: [dpdk-dev] [PATCH v2] hash: fix compilation for non-x86 systems
>
>From: "Pablo de Lara" 
>
>Hash library uses optimized compare functions that use
>x86 intrinsics, therefore non-x86 systems could not build
>the library. In that case, the compare function is set
>to the generic memcmp.
>
>Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation")
>
>Reported-by: Tony Lu 
>Signed-off-by: Pablo de Lara 
>---
>Changes in v2:
>- Renamed new file rte_cmp_fns.h to rte_cmp_x86.h
>- Removed blank line
>
> lib/librte_hash/rte_cmp_x86.h | 109
>++
> lib/librte_hash/rte_cuckoo_hash.c |  96 -
> 2 files changed, 120 insertions(+), 85 deletions(-)
> create mode 100644 lib/librte_hash/rte_cmp_x86.h
>
>diff --git a/lib/librte_hash/rte_cmp_x86.h b/lib/librte_hash/rte_cmp_x86.h
>new file mode 100644
>index 0000000..7f79bac
>--- /dev/null
>+++ b/lib/librte_hash/rte_cmp_x86.h
>@@ -0,0 +1,109 @@
>+/*-
>+ *   BSD LICENSE
>+ *
>+ *   Copyright(c) 2015 Intel Corporation. All rights reserved.
>+ *   All rights reserved.
>+ *
>+ *   Redistribution and use in source and binary forms, with or without
>+ *   modification, are permitted provided that the following conditions
>+ *   are met:
>+ *
>+ * * Redistributions of source code must retain the above copyright
>+ *   notice, this list of conditions and the following disclaimer.
>+ * * Redistributions in binary form must reproduce the above copyright
>+ *   notice, this list of conditions and the following disclaimer in
>+ *   the documentation and/or other materials provided with the
>+ *   distribution.
>+ * * Neither the name of Intel Corporation nor the names of its
>+ *   contributors may be used to endorse or promote products derived
>+ *   from this software without specific prior written permission.
>+ *
>+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
>CONTRIBUTORS
>+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
>NOT
>+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
>FITNESS FOR
>+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
>COPYRIGHT
>+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
>INCIDENTAL,
>+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
>NOT
>+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
>OF USE,
>+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
>AND ON ANY
>+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
>TORT
>+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
>THE USE
>+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
>DAMAGE.
>+ */
>+
>+/* Functions to compare multiple of 16 byte keys (up to 128 bytes) */
>+static int
>+rte_hash_k16_cmp_eq(const void *key1, const void *key2, size_t key_len
>__rte_unused)
>+{
>+  const __m128i k1 = _mm_loadu_si128((const __m128i *) key1);
>+  const __m128i k2 = _mm_loadu_si128((const __m128i *) key2);
>+#ifdef RTE_MACHINE_CPUFLAG_SSE4_1
>+  const __m128i x = _mm_xor_si128(k1, k2);
>+
>+  return !_mm_test_all_zeros(x, x);
>+#else
>+  const __m128i x = _mm_cmpeq_epi32(k1, k2);
>+
>+  return (_mm_movemask_epi8(x) != 0xffff);
>+#endif
>+}
>+
>+static int
>+rte_hash_k32_cmp_eq(const void *key1, const void *key2, size_t key_len)
>+{
>+  return rte_hash_k16_cmp_eq(key1, key2, key_len) ||
>+  rte_hash_k16_cmp_eq((const char *) key1 + 16,
>+  (const char *) key2 + 16, key_len);
>+}
>+
>+static int
>+rte_hash_k48_cmp_eq(const void *key1, const void *key2, size_t key_len)
>+{
>+  return rte_hash_k16_cmp_eq(key1, key2, key_len) ||
>+  rte_hash_k16_cmp_eq((const char *) key1 + 16,
>+  (const char *) key2 + 16, key_len) ||
>+  rte_hash_k16_cmp_eq((const char *) key1 + 32,
>+  (const char *) key2 + 32, key_len);
>+}
>+
>+static int
>+rte_hash_k64_cmp_eq(const void *key1, const void *key2, size_t key_len)
>+{
>+  return rte_hash_k32_cmp_eq(key1, key2, key_len) ||
>+  rte_hash_k32_cmp_eq((const char *) key1 + 32,
>+  (const char *) key2 + 32, key_len);
>+}
>+
>+static int
>+rte_hash_k80_cmp_eq(const void *key1, const void *key2, size_t key_len)
>+{
>+  return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
>+  rte_hash_k16_cmp_eq((const char *) key1 + 64,
>+  (const char *) key2 + 64, key_len);
>+}
>+
>+static int
>+rte_hash_k96_cmp_eq(const void *key1, const void *key2, size_t key_len)
>+{
>+  return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
>+  rte_hash_k32_cmp_eq((const char *) key1 + 64,
>+  

[dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library with cuckoo hash implementation

2015-07-17 Thread Tony Lu
Hi, Pablo

The patch "hash: fix compilation for non-x86 systems" works for non-x86
arches.
Thanks for your quick fix.
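
For readers following the thread, the approach described in the fix (use the
SSE compare only where it is available and fall back to the generic memcmp
elsewhere) can be sketched roughly as below. This is a minimal illustration,
not the code from the patch: the names key_cmp_fn, k16_cmp_ne and select_cmp
are hypothetical, and the compiler-defined __SSE2__ macro stands in for
whatever build-time check the library actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef __SSE2__
#include <emmintrin.h> /* SSE2 intrinsics, only present on x86 toolchains */
#endif

/* Same contract as memcmp: 0 when the keys match, non-zero otherwise. */
typedef int (*key_cmp_fn)(const void *key1, const void *key2, size_t key_len);

#ifdef __SSE2__
/* Hypothetical 16-byte compare using SSE2 only (no SSE4.1 required). */
static int
k16_cmp_ne(const void *key1, const void *key2, size_t key_len)
{
	(void)key_len; /* length is fixed at 16 bytes */
	assert(key1 != NULL && key2 != NULL);

	const __m128i k1 = _mm_loadu_si128((const __m128i *)key1);
	const __m128i k2 = _mm_loadu_si128((const __m128i *)key2);
	const __m128i eq = _mm_cmpeq_epi32(k1, k2);

	/* All 16 mask bits are set only when every byte compared equal. */
	return _mm_movemask_epi8(eq) != 0xffff;
}
#endif

/* Pick the optimized compare when it was compiled in, else use memcmp. */
static key_cmp_fn
select_cmp(size_t key_len)
{
#ifdef __SSE2__
	if (key_len == 16)
		return k16_cmp_ne;
#endif
	(void)key_len;
	return memcmp;
}
```

Because the intrinsic code sits entirely inside the #ifdef, a non-x86 compiler
never sees __m128i or the _mm_* calls, which is the property the build needs.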

>-Original Message-
>From: De Lara Guarch, Pablo [mailto:pablo.de.lara.guarch at intel.com]
>Sent: Friday, July 17, 2015 5:06 PM
>To: Tony Lu; dev at dpdk.org
>Subject: RE: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library
with
>cuckoo hash implementation
>
>Hi Tony,
>
>> -Original Message-
>> From: Tony Lu [mailto:zlu at ezchip.com]
>> Sent: Friday, July 17, 2015 8:58 AM
>> To: De Lara Guarch, Pablo; dev at dpdk.org
>> Subject: RE: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash
>> library with cuckoo hash implementation
>>
>> >-Original Message-
>> >From: De Lara Guarch, Pablo [mailto:pablo.de.lara.guarch at intel.com]
>> >Sent: Friday, July 17, 2015 3:35 PM
>> >To: Tony Lu; dev at dpdk.org
>> >Subject: RE: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash
>> >library
>> with
>> >cuckoo hash implementation
>> >
>> >
>> >
>> >> -Original Message-
>> >> From: Tony Lu [mailto:zlu at ezchip.com]
>> >> Sent: Friday, July 17, 2015 4:35 AM
>> >> To: De Lara Guarch, Pablo; dev at dpdk.org
>> >> Subject: RE: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash
>> >> library with cuckoo hash implementation
>> >>
>> >> Hi, Pablo
>> >>
>> >> >-Original Message-
>> >> >From: De Lara Guarch, Pablo
>> >> >[mailto:pablo.de.lara.guarch at intel.com]
>> >> >Sent: Friday, July 17, 2015 4:42 AM
>> >> >To: Tony Lu; dev at dpdk.org
>> >> >Subject: RE: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash
>> >> >library
>> >> with
>> >> >cuckoo hash implementation
>> >> >
>> >> >Hi Tony,
>> >> >
>> >> >> -Original Message-
>> >> >> From: Tony Lu [mailto:zlu at ezchip.com]
>> >> >> Sent: Thursday, July 16, 2015 10:40 AM
>> >> >> To: De Lara Guarch, Pablo; dev at dpdk.org
>> >> >> Subject: RE: [dpdk-dev] [PATCH v7 1/7] hash: replace existing
>> >> >> hash library with cuckoo hash implementation
>> >> >>
>> >> >> >diff --git a/lib/librte_hash/rte_cuckoo_hash.c
>> >> >> b/lib/librte_hash/rte_cuckoo_hash.c
>> >> >> >new file mode 100644
>> >> >> >index 0000000..50e3acd
>> >> >> >--- /dev/null
>> >> >> >+++ b/lib/librte_hash/rte_cuckoo_hash.c
>> >> >> >@@ -0,0 +1,1027 @@
>> >> >> ...
>> >> >> >+
>> >> >> >+/* Functions to compare multiple of 16 byte keys (up to 128
>> >> >> >+bytes) */ static int rte_hash_k16_cmp_eq(const void *key1,
>> >> >> >+const void *key2, size_t key_len
>> >> >> >__rte_unused)
>> >> >> >+{
>> >> >> >+ const __m128i k1 = _mm_loadu_si128((const __m128i *)
>> key1);
>> >> >> >+ const __m128i k2 = _mm_loadu_si128((const __m128i *)
>> key2);
>> >> >> >+ const __m128i x = _mm_xor_si128(k1, k2);
>> >> >> >+
>> >> >> >+ return !_mm_test_all_zeros(x, x); }
>> >> >> ...
>> >> >>
>> >> >> When compiling the latest dev DPDK for non-x86 arch, it fails on
>> >> >> the above code, as the SSE is x86 specific defined in
>> >> >> .  Is it possible to replace this function with
>> >> >> platform
>> >independent code?
>> >> >
>> >> >Thanks for spotting this. I just sent a patch that should fix the
>> problem.
>> >> >Can you check if it works?
>> >>
>> >> Thanks for your quick response, but __m128i and all the _mm_
>> >> related functions are X86 specific defined in .  This
>> >> header file is only available in X86 compiler library, but no-X86
>> >> archs do not have this file.  So if we can replace all the X86
>> >> specific code in the above function, that would be great.
>> >>
>> >With the patch that I sent, if you are compiling for a non-x86 arch,
>> >you
>> should
>> >not have any problem, since all that code will only be used if using
>> >x86
>> arch.
>> >Have you tried compiling DPDK with the patch?
>>
>> Yes, I have built the DPDK with your patch, and got the following errors.
>> This is
>> because there are no __m128i, _mm_loadu_si128(), _mm_cmpeq_epi32() and
>> _mm_movemask_epi8() on no-X86 arches.
>>
>> == Build lib/librte_hash
>>   CC rte_cuckoo_hash.o
>> /u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c: In function
>> 'rte_hash_k16_cmp_eq':
>> /u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: error:
>> expected '=', ',', ';', 'asm' or '__attribute__' before 'k1'
>> /u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: error:
'k1'
>> undeclared (first use in this function)
>> /u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: error:
>> (Each undeclared identifier is reported only once
>> /u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: error:
>> for each function it appears in.)
>> /u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: warning:
>> implicit declaration of function '_mm_loadu_si128'
>> /u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: warning:
>> nested extern declaration of '_mm_loadu_si128'
>> /u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: error:
>> expected ')' before '__m128i'
>> 

[dpdk-dev] can eth_igb_xmit_pkts called with len 0 affect transmission?

2015-07-17 Thread ciprian.barbu
Hi,

I'm seeing strange behavior when calling rte_eth_tx_burst with len ==
0. I'll explain the reason for this situation further below. What
I'm seeing is that after making this call my application keeps returning
from eth_igb_xmit_pkts here, even when len > 0:
http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_e1000/igb_rxtx.c?h=releases#n476

I can't really explain this behavior. I don't understand very
well what the NIC does once it receives buffers in its rings, but to me
it looks like calling rte_eth_tx_burst with len 0 has this effect.

What I'm using in my tests is the Linaro odp-dpdk implementation and the
odp_l2fwd example. The odp-dpdk implementation makes this call to try
to make the PMD flush the TX queue in case there are no more free
buffers in the pool, but this was only verified for ixgbe 82599 cards;
for igb the packets are not actually flushed until the tail circles back
the whole length of the queue. I'm pretty much the only one (that I know
of) who uses ODP with 1G igb i350 cards and sees this issue.

Can anyone explain whether I'm getting this right and there could be 
side effects to calling eth_igb_xmit_pkts with len 0?

Thank you,
/Ciprian


[dpdk-dev] [dpdk-virtio] Performance tuning for dpdk with virtio?

2015-07-17 Thread Clarylin L
My VM has two ports connecting to two linux bridges (in turn connecting two
physical ports). DPDK is used to forward between these two ports (one port
connected to traffic generator and the other connected to sink). I used
iperf to test the throughput between the traffic generator and one port on
VM, as well as throughput between the other port and the sink. Both legs
show around 7.5G throughput.

Traffic has to go through the bridges to reach the VM ports anyway, so I
think the Linux bridge does support much higher throughput, doesn't it?

On Fri, Jul 17, 2015 at 2:20 PM, Stephen Hemminger <
stephen at networkplumber.org> wrote:

> On Fri, 17 Jul 2015 11:03:15 -0700
> Clarylin L  wrote:
>
> > I am running dpdk with a virtual guest as a L2 forwarder.
> >
> > If the virtual guest is on passthrough, dpdk can achieve around 10G
> > throughput. However if the virtual guest is on virtio, dpdk achieves just
> > 150M throughput, which is a huge degrade. Any idea what could be the
> cause
> > of such poor performance on virtio? and any performance tuning
> techniques I
> > could try? Thanks a lot!
>
> The default Linux bridge (and OVS) switch are your bottleneck.
> It is not DPDK virtio issue in general. There are some small performance
> gains still possible with virtio enhancements (like offloading).
>
> Did you try running OVS-DPDK on the host?
>


[dpdk-dev] Non-working TX IP checksum offload

2015-07-17 Thread Andriy Berestovskyy
Hi Angela,
Make sure your NIC is configured properly as described in this thread:
http://dpdk.org/ml/archives/dev/2015-May/018096.html

Andriy

On Fri, Jul 17, 2015 at 4:23 PM, Angela Czubak  wrote:
> Hi,
>
> I have some difficulties using ip checksum tx offload capabilities - I
> think I set everything as advised by the API documentation, but
> unfortunately the packet leaves the interface with its ip checksum still
> being zero (it reaches its destination).
>
> What I do is:
> buffer->ol_flags |= PKT_TX_IP_CKSUM|PKT_TX_IPV4;
> ip_header->hdr_checksum = 0;
> buffer->l3_len = sizeof(struct ipv4_hdr);
> buffer->l2_len = sizeof(struct ether_hdr);
>
> In L4 there's UDP, which checksum is zeroed if that matters.
>
> Is there something I am missing? The NIC is Intel Corporation Ethernet
> Controller X710 for 10GbE SFP+ (rev 01).
>
> What is more, is there any particular reason for assuming in
> i40e_xmit_pkts that offloading checksums is unlikely (I mean the line no
> 1307 "if (unlikely(ol_flags & I40E_TX_CKSUM_OFFLOAD_MASK))" at
> dpdk-2.0.0/lib/librte_pmd_i40e/i40e_rxtx.c)?
>
> Regards,
> Angela



-- 
Andriy Berestovskyy


[dpdk-dev] Non-working TX IP checksum offload

2015-07-17 Thread Angela Czubak
Hi,

I have some difficulties using ip checksum tx offload capabilities - I 
think I set everything as advised by the API documentation, but 
unfortunately the packet leaves the interface with its ip checksum still 
being zero (it reaches its destination).

What I do is:
buffer->ol_flags |= PKT_TX_IP_CKSUM|PKT_TX_IPV4;
ip_header->hdr_checksum = 0;
buffer->l3_len = sizeof(struct ipv4_hdr);
buffer->l2_len = sizeof(struct ether_hdr);

L4 is UDP, whose checksum is zeroed, if that matters.

Is there something I am missing? The NIC is Intel Corporation Ethernet 
Controller X710 for 10GbE SFP+ (rev 01).

What is more, is there any particular reason for assuming in 
i40e_xmit_pkts that offloading checksums is unlikely (I mean the line no 
1307 "if (unlikely(ol_flags & I40E_TX_CKSUM_OFFLOAD_MASK))" at 
dpdk-2.0.0/lib/librte_pmd_i40e/i40e_rxtx.c)?

Regards,
Angela
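
When debugging a case like this, it can help to compute the expected IPv4
header checksum in software and compare it with what arrives on the wire.
The following is a minimal sketch of the standard Internet checksum
(RFC 1071) over a raw header. The function name ipv4_hdr_cksum_sw is
hypothetical and is not part of the DPDK API; the result is returned as a
host-order number whose high byte is the first checksum byte on the wire.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * One's-complement sum of 16-bit big-endian words over the header.
 * The checksum field itself must be zero when computing a new checksum;
 * over a header that already carries a valid checksum the result is 0.
 */
static uint16_t
ipv4_hdr_cksum_sw(const uint8_t *hdr, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(hdr != NULL);

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
	if (len & 1) /* odd trailing byte is padded with zero */
		sum += (uint32_t)hdr[len - 1] << 8;

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

For the 20-byte header 45 00 00 73 00 00 40 00 40 11 00 00 c0 a8 00 01
c0 a8 00 c7 this returns 0xb861. If a captured packet still shows zero in the
checksum field, the offload was not applied.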


[dpdk-dev] [PATCH] doc/testpmd_app_ug:add a comment for outer-ip option in csum

2015-07-17 Thread Jijiang Liu
Add a comment for outer-ip option in csum command.

Set outer-ip option only when the packet is an IPv4 packet.

Signed-off-by: Jijiang Liu 
---
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst 
b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 4652962..c8baa76 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -541,7 +541,7 @@ csum set (ip|udp|tcp|sctp|outer-ip) (hw|sw) (port_id)

 - ip|udp|tcp|sctp always concern the inner layer.

-- outer-ip concerns the outer IP layer in case the packet is recognized
+- outer-ip concerns the outer IP layer(only for IPv4) in case the packet is 
recognized
   as a tunnel packet by the forward engine (vxlan, gre and ipip are
   supported). See "csum parse-tunnel" command.

-- 
1.7.7.6



[dpdk-dev] [PATCH v3] i40e: Fix the endian issue for the i40e read registers functions

2015-07-17 Thread Chao Zhu
Acked-by: Chao Zhu 

On 2015/7/17 15:25, Zhe Tao wrote:
> Signed-off-by: Zhe Tao 
> ---
> PATCH v3: Edit the subject make it more clear
>
> PATCH v2: Edit the comments make it more clear
>
> PATCH v1: Add the endian conversion for registers operations.
>
>   drivers/net/i40e/base/i40e_osdep.h | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/i40e/base/i40e_osdep.h 
> b/drivers/net/i40e/base/i40e_osdep.h
> index 3ce8057..70d2721 100644
> --- a/drivers/net/i40e/base/i40e_osdep.h
> +++ b/drivers/net/i40e/base/i40e_osdep.h
> @@ -122,10 +122,10 @@ do {
> \
>   ((volatile uint32_t *)((char *)(a)->hw_addr + (reg)))
>   static inline uint32_t i40e_read_addr(volatile void *addr)
>   {
> - return I40E_PCI_REG(addr);
> + return rte_le_to_cpu_32(I40E_PCI_REG(addr));
>   }
>   #define I40E_PCI_REG_WRITE(reg, value) \
> - do {I40E_PCI_REG((reg)) = (value);} while(0)
> + do { I40E_PCI_REG((reg)) = rte_cpu_to_le_32(value); } while (0)
>
>   #define I40E_WRITE_FLUSH(a) I40E_READ_REG(a, I40E_GLGEN_STAT)
>   #define I40EVF_WRITE_FLUSH(a) I40E_READ_REG(a, I40E_VFGEN_RSTAT)



[dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library with cuckoo hash implementation

2015-07-17 Thread Tony Lu
>-Original Message-
>From: De Lara Guarch, Pablo [mailto:pablo.de.lara.guarch at intel.com]
>Sent: Friday, July 17, 2015 3:35 PM
>To: Tony Lu; dev at dpdk.org
>Subject: RE: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library
with
>cuckoo hash implementation
>
>
>
>> -Original Message-
>> From: Tony Lu [mailto:zlu at ezchip.com]
>> Sent: Friday, July 17, 2015 4:35 AM
>> To: De Lara Guarch, Pablo; dev at dpdk.org
>> Subject: RE: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash
>> library with cuckoo hash implementation
>>
>> Hi, Pablo
>>
>> >-Original Message-
>> >From: De Lara Guarch, Pablo [mailto:pablo.de.lara.guarch at intel.com]
>> >Sent: Friday, July 17, 2015 4:42 AM
>> >To: Tony Lu; dev at dpdk.org
>> >Subject: RE: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash
>> >library
>> with
>> >cuckoo hash implementation
>> >
>> >Hi Tony,
>> >
>> >> -Original Message-
>> >> From: Tony Lu [mailto:zlu at ezchip.com]
>> >> Sent: Thursday, July 16, 2015 10:40 AM
>> >> To: De Lara Guarch, Pablo; dev at dpdk.org
>> >> Subject: RE: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash
>> >> library with cuckoo hash implementation
>> >>
>> >> >diff --git a/lib/librte_hash/rte_cuckoo_hash.c
>> >> b/lib/librte_hash/rte_cuckoo_hash.c
>> >> >new file mode 100644
>> >> >index 0000000..50e3acd
>> >> >--- /dev/null
>> >> >+++ b/lib/librte_hash/rte_cuckoo_hash.c
>> >> >@@ -0,0 +1,1027 @@
>> >> ...
>> >> >+
>> >> >+/* Functions to compare multiple of 16 byte keys (up to 128
>> >> >+bytes) */ static int rte_hash_k16_cmp_eq(const void *key1, const
>> >> >+void *key2, size_t key_len
>> >> >__rte_unused)
>> >> >+{
>> >> >+const __m128i k1 = _mm_loadu_si128((const __m128i *) key1);
>> >> >+const __m128i k2 = _mm_loadu_si128((const __m128i *) key2);
>> >> >+const __m128i x = _mm_xor_si128(k1, k2);
>> >> >+
>> >> >+return !_mm_test_all_zeros(x, x); }
>> >> ...
>> >>
>> >> When compiling the latest dev DPDK for non-x86 arch, it fails on
>> >> the above code, as the SSE is x86 specific defined in
>> >> .  Is it possible to replace this function with platform
>independent code?
>> >
>> >Thanks for spotting this. I just sent a patch that should fix the
problem.
>> >Can you check if it works?
>>
>> Thanks for your quick response, but __m128i and all the _mm_ related
>> functions are X86 specific defined in .  This header file
>> is only available in X86 compiler library, but no-X86 archs do not
>> have this file.  So if we can replace all the X86 specific code in the
>> above function, that would be great.
>>
>With the patch that I sent, if you are compiling for a non-x86 arch, you
should
>not have any problem, since all that code will only be used if using x86
arch.
>Have you tried compiling DPDK with the patch?

Yes, I have built the DPDK with your patch, and got the following errors.
This is
because there are no __m128i, _mm_loadu_si128(), _mm_cmpeq_epi32() and
_mm_movemask_epi8() on no-X86 arches.

== Build lib/librte_hash
  CC rte_cuckoo_hash.o
/u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c: In function
'rte_hash_k16_cmp_eq':
/u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: error:
expected '=', ',', ';', 'asm' or '__attribute__' before 'k1'
/u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: error: 'k1'
undeclared (first use in this function)
/u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: error: (Each
undeclared identifier is reported only once
/u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: error: for
each function it appears in.)
/u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: warning:
implicit declaration of function '_mm_loadu_si128'
/u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: warning:
nested extern declaration of '_mm_loadu_si128'
/u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: error:
expected ')' before '__m128i'
/u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: warning:
type defaults to 'int' in declaration of 'type name'
/u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: warning:
cast from pointer to integer of different size
/u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1127: error:
expected '=', ',', ';', 'asm' or '__attribute__' before 'k2'
/u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1127: error: 'k2'
undeclared (first use in this function)
/u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1127: error:
expected ')' before '__m128i'
/u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1127: warning:
type defaults to 'int' in declaration of 'type name'
/u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1127: warning:
cast from pointer to integer of different size
/u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1133: error:
expected '=', ',', ';', 'asm' or '__attribute__' before 'x'
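
For illustration, a platform-independent 16-byte comparison of the kind asked
about above can be written with plain 64-bit integer operations. This is only
a sketch with hypothetical function names; the fix that was posted for this
report falls back to the generic memcmp instead.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 0 when the two 16-byte keys match, 1 otherwise. */
static int
k16_cmp_ne_portable(const void *key1, const void *key2)
{
	uint64_t a[2], b[2];

	assert(key1 != NULL && key2 != NULL);

	/* memcpy avoids alignment and aliasing problems on any architecture. */
	memcpy(a, key1, sizeof(a));
	memcpy(b, key2, sizeof(b));

	/* XOR is zero only for identical words; OR merges both halves. */
	return ((a[0] ^ b[0]) | (a[1] ^ b[1])) != 0;
}

/* Larger keys are built from the 16-byte compare, as in the x86 version. */
static int
k32_cmp_ne_portable(const void *key1, const void *key2)
{
	return k16_cmp_ne_portable(key1, key2) ||
		k16_cmp_ne_portable((const char *)key1 + 16,
				(const char *)key2 + 16);
}
```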

[dpdk-dev] [PATCH v3 1/2] librte_ether: release memory in uninit function.

2015-07-17 Thread Thomas Monjalon
2015-07-13 14:04, Bernard Iremonger:
> @@ -387,8 +387,12 @@ rte_eth_dev_uninit(struct rte_pci_device *pci_dev)
>   /* free ether device */
>   rte_eth_dev_release_port(eth_dev);
>  
> - if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> + if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> + rte_free(eth_dev->data->rx_queues);
> + rte_free(eth_dev->data->tx_queues);
>   rte_free(eth_dev->data->dev_private);
> + memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
> + }

What is the benefit of freeing queues in detach/uninit function?
It is already freed in the close function of your other patch
and calling close() is mandatory before calling detach():

http://dpdk.org/browse/dpdk/tree/doc/guides/prog_guide/port_hotplug_framework.rst#n63
http://dpdk.org/browse/dpdk/tree/lib/librte_ether/rte_ethdev.h#n1699



[dpdk-dev] How to get net_device and use struct ethtool_cmd at DPDK enverinment?

2015-07-17 Thread "Scott.Jhuang (莊清翔) : 6309"
Hi Sy Jong,

If I am using KNI in DPDK, can I use other applications at the same time (e.g.
L2 forward, L3 forward)?

On 2015-07-15 18:01, Choi, Sy Jong wrote:
Hi Scott,

You will need to start the KNI sample app, which creates the vEth interface.
Once the KNI app is running, the interface will be there. The KNI app is the
datapath; it gets packets into the kernel.

http://dpdk.org/doc/guides/prog_guide/kernel_nic_interface.html


  1.  Insert the KNI kernel module:
      insmod ./rte_kni.ko
      If using KNI in multi-thread mode, use the following command line:
      insmod ./rte_kni.ko kthread_mode=multiple

  2.  Run the KNI sample application:
      ./kni -c 0xf0 -n 4 -- -p 0x3 -P --config="(0,4,6),(1,5,7)"
      This command runs the kni sample application with two physical ports.
      Each port pins two forwarding cores (ingress/egress) in user space.


Regards,
Choi, Sy Jong
Platform Application Engineer

From: "Scott.Jhuang (?? ?) : 6309" [mailto:scott.jhu...@cas-well.com]
Sent: Wednesday, July 15, 2015 5:54 PM
To: Choi, Sy Jong; dev at dpdk.org; "Sandy.Liu (?? ?) : 
6817"; "Alan Yu (?? ?) : 6632"
Subject: Re: [dpdk-dev] How to get net_device and use struct ethtool_cmd at 
DPDK enverinment?

Hi Sy Jong,

If I load the "rte_kni.ko" driver, the net_device structs will be initialized by
KNI, right?
If yes, how can I handle these net_device structs in another driver?
Using the "for_each_netdev()" kernel API, I can't find the net_device
structs that KNI initialized.
Or have these structs not been exported to the kernel?

On 2015-07-01 15:55, Choi, Sy Jong wrote:
Hi Scott,

Please refer to our KNI library at:-
dpdk-1.8.0\lib\librte_eal\linuxapp\kni\ethtool\igb\igb.h

Regards,
Choi, Sy Jong
Platform Application Engineer

From: "Scott.Jhuang (?? ?) : 6309" [mailto:scott.jhu...@cas-well.com]
Sent: Wednesday, July 01, 2015 2:44 PM
To: Choi, Sy Jong; dev at dpdk.org
Subject: Re: [dpdk-dev] How to get net_device and use struct ethtool_cmd at 
DPDK enverinment?

Hi Sy Jong,

Do you have any ideas?

On 2015-06-23 21:24, "Scott.Jhuang (莊清翔) : 6309" wrote:
Dear Sy Jong,

Yes, I have checked out DPDK KNI, but I still can't find how to prepare the
net_device structure...
I also can't find how to get "ethtool_cmd.phy_address".
Could you let me know the path of the source code folder?

On 2015-06-19 10:35, Choi, Sy Jong wrote:
Hi Scott,

The DPDK PMDs interface through rte_ethdev.c, which links to ixgbe_ethdev.c;
there's no "net_device" in our code.

But if you search the DPDK code base, we have a KNI example that shows how to
prepare the net_device structure.
Have you checked out our DPDK KNI code?

Regards,
Choi, Sy Jong
Platform Application Engineer

From: "Scott.Jhuang (? ? ?) : 6309" [mailto:scott.jhu...@cas-well.com]
Sent: Thursday, June 18, 2015 12:25 PM
To: Choi, Sy Jong; dev at dpdk.org
Subject: Re: [dpdk-dev] How to get net_device and use struct ethtool_cmd at 
DPDK enverinment?

Dear Sy Jong,

I'm planning to write a driver to get all the Ethernet ports' net_device structures,
because I need some information from these net_device structures.
I also need to use the net_device struct's ethtool_cmd to get some information,
e.g. ethtool_cmd.phy_address, net_device->ethtool_ops->get_settings.

In fact, I need some information from the net_device struct to access and control
the PHY's link-up/down,
and I referenced the igb driver to design the link-up/down functions. Since the DPDK
environment doesn't have the igb driver,
I don't know how to get the network device's net_device
structs and the other information that is initialized by the igb driver.

On 2015-06-17 11:15, Choi, Sy Jong wrote:
Hi Scott,

You are right, the KNI will be a good reference for you. It demonstrates how
a DPDK PMD interfaces with the kernel.
May I know whether you are planning to build the interface to ethtool? You can try
running the KNI app.

Regards,
Choi, Sy Jong
Platform Application Engineer

From: "Scott.Jhuang (?? ?) : 6309" [mailto:scott.jhu...@cas-well.com]
Sent: Wednesday, June 17, 2015 11:12 AM
To: Choi, Sy Jong; dev at dpdk.org
Subject: Re: [dpdk-dev] How to get net_device and use struct ethtool_cmd at 
DPDK enverinment?

Hi Sy Jong,

But I am writing a driver now. Is there a sample driver I can reference?

On 2015-06-16 14:48, Choi, Sy Jong wrote:

Hi Scott,

You can review the DPDK KNI sample app; it provides ethtool support using a vEth
device interfacing to the DPDK PMD.

A pure DPDK PMD requires programming to display the information in ethtool. The
interfacing is demonstrated in the KNI sample app.

Regards,
Choi, Sy Jong
Platform Application Engineer



-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of "Scott.Jhuang (???) : 6309"
Sent: Monday, June 15, 2015 6:35 PM
To: dev at dpdk.org
Subject: [dpdk-dev] How to get net_device and use struct ethtool_cmd at DPDK
enverinment?

Hi,

I want to get etherport's net_device structs and using ethtool_cmd to get some 

[dpdk-dev] [PATCH v3] i40e: Fix the endian issue for the i40e read registers functions

2015-07-17 Thread Zhe Tao
Signed-off-by: Zhe Tao 
---
PATCH v3: Edit the subject to make it clearer

PATCH v2: Edit the comments to make them clearer

PATCH v1: Add the endian conversion for register operations.

 drivers/net/i40e/base/i40e_osdep.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/i40e/base/i40e_osdep.h 
b/drivers/net/i40e/base/i40e_osdep.h
index 3ce8057..70d2721 100644
--- a/drivers/net/i40e/base/i40e_osdep.h
+++ b/drivers/net/i40e/base/i40e_osdep.h
@@ -122,10 +122,10 @@ do {  
  \
((volatile uint32_t *)((char *)(a)->hw_addr + (reg)))
 static inline uint32_t i40e_read_addr(volatile void *addr)
 {
-   return I40E_PCI_REG(addr);
+   return rte_le_to_cpu_32(I40E_PCI_REG(addr));
 }
 #define I40E_PCI_REG_WRITE(reg, value) \
-   do {I40E_PCI_REG((reg)) = (value);} while(0)
+   do { I40E_PCI_REG((reg)) = rte_cpu_to_le_32(value); } while (0)

 #define I40E_WRITE_FLUSH(a) I40E_READ_REG(a, I40E_GLGEN_STAT)
 #define I40EVF_WRITE_FLUSH(a) I40E_READ_REG(a, I40E_VFGEN_RSTAT)
-- 
1.9.3
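
The idea behind the conversion can be illustrated without DPDK. Assuming, as
the patch implies, that the device registers hold little-endian values, the
raw 32-bit word has to be interpreted byte by byte so that the result is also
correct on big-endian CPUs. The helpers below are hypothetical portable
stand-ins for rte_le_to_cpu_32() and rte_cpu_to_le_32(), written only for the
sake of the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Interpret a raw little-endian 32-bit word as a CPU-order value. */
static uint32_t
le32_to_cpu(uint32_t raw)
{
	uint8_t b[4];

	assert(sizeof(raw) == sizeof(b));
	memcpy(b, &raw, sizeof(b));

	/* Byte 0 is the least significant byte in little-endian layout. */
	return (uint32_t)b[0] |
		((uint32_t)b[1] << 8) |
		((uint32_t)b[2] << 16) |
		((uint32_t)b[3] << 24);
}

/* Produce the raw little-endian representation of a CPU-order value. */
static uint32_t
cpu_to_le32(uint32_t v)
{
	uint8_t b[4];
	uint32_t raw;

	b[0] = (uint8_t)(v & 0xff);
	b[1] = (uint8_t)((v >> 8) & 0xff);
	b[2] = (uint8_t)((v >> 16) & 0xff);
	b[3] = (uint8_t)((v >> 24) & 0xff);
	memcpy(&raw, b, sizeof(raw));

	return raw;
}
```

On a little-endian host both helpers reduce to the identity, which is why the
missing conversion only shows up on big-endian platforms.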



[dpdk-dev] [PATCH v2 0/3] Fix vhost startup issue

2015-07-17 Thread Thomas Monjalon
2015-07-10 14:20, Xie, Huawei:
> On 7/6/2015 10:27 AM, Ouyang, Changchun wrote:
> > The patch set fixes the vhost sample failing to start up a second time:
> >  
> > It should call the API to unregister the vhost driver when the sample
> > exits/quits, so that the socket file is removed (by calling unlink),
> > which makes the vhost sample work correctly on a second startup.
> >  
> > It also adds/refines some log information.
> >
> > Changchun Ouyang (3):
> >   vhost: add log when failing to bind a socket
> >   vhost: fix the comments and log
> >   vhost: call api to unregister vhost driver
> 
> Acked-by: Huawei Xie 

Applied, thanks


[dpdk-dev] [PATCH v14 10/13] ethdev: add rx intr enable, disable and ctl functions

2015-07-17 Thread Stephen Hemminger

> +/**
>   * Turn on the LED on the Ethernet device.
>   * This function turns on the LED on the Ethernet device.
>   *
> diff --git a/lib/librte_ether/rte_ether_version.map 
> b/lib/librte_ether/rte_ether_version.map
> index 39baf11..fa09d75 100644
> --- a/lib/librte_ether/rte_ether_version.map
> +++ b/lib/librte_ether/rte_ether_version.map
> @@ -109,6 +109,10 @@ DPDK_2.0 {
>  DPDK_2.1 {
>   global:
>  
> + rte_eth_dev_rx_intr_ctl;
> + rte_eth_dev_rx_intr_ctl_q;
> + rte_eth_dev_rx_intr_disable;
> + rte_eth_dev_rx_intr_enable;
>   rte_eth_dev_set_mc_addr_list;
>   rte_eth_timesync_disable;
>   rte_eth_timesync_enable;

This needs rebase to current master, minor conflict here


[dpdk-dev] [dpdk-virtio] Performance tuning for dpdk with virtio?

2015-07-17 Thread Stephen Hemminger
On Fri, 17 Jul 2015 11:03:15 -0700
Clarylin L  wrote:

> I am running dpdk with a virtual guest as a L2 forwarder.
> 
> If the virtual guest is on passthrough, dpdk can achieve around 10G
> throughput. However if the virtual guest is on virtio, dpdk achieves just
> 150M throughput, which is a huge degrade. Any idea what could be the cause
> of such poor performance on virtio? and any performance tuning techniques I
> could try? Thanks a lot!

The default Linux bridge (and OVS) switch are your bottleneck.
It is not DPDK virtio issue in general. There are some small performance
gains still possible with virtio enhancements (like offloading).

Did you try running OVS-DPDK on the host?


[dpdk-dev] [PATCH v14 13/13] l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode switch

2015-07-17 Thread Cunming Liang
The patch demonstrates how to handle per-rx-queue interrupts in a NAPI-like
implementation in userspace. The working thread mainly runs in polling mode
and switches to interrupt mode only if no packets were received in recent
polls.
The working thread returns to polling mode immediately once it receives an
interrupt notification caused by incoming packets.
The sample keeps running in polling mode if the bound PMD does not support
the rx interrupt yet. Currently only ixgbe (pf/vf) and igb support it.
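
The mode switch described above can be sketched as a small state machine.
This is an illustration under stated assumptions, not the code in the patch:
the type and function names are hypothetical, and switching exactly when the
count of consecutive empty polls reaches MIN_ZERO_POLL_COUNT is an assumed
threshold rule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_ZERO_POLL_COUNT 10 /* value used by this patch */

enum rx_mode {
	RX_MODE_POLL, /* busy-poll the rx queue */
	RX_MODE_INTR  /* sleep until the rx interrupt fires */
};

struct rx_state {
	enum rx_mode mode;
	uint32_t zero_polls; /* consecutive polls that returned no packets */
};

/* Account for one poll of the queue that returned nb_rx packets. */
static void
rx_on_poll(struct rx_state *s, unsigned int nb_rx)
{
	assert(s != NULL);

	if (nb_rx > 0) {
		s->zero_polls = 0; /* traffic seen, stay in polling mode */
		return;
	}
	if (++s->zero_polls >= MIN_ZERO_POLL_COUNT)
		s->mode = RX_MODE_INTR; /* idle long enough, arm the interrupt */
}

/* An rx interrupt arrived: go straight back to polling. */
static void
rx_on_interrupt(struct rx_state *s)
{
	assert(s != NULL);

	s->mode = RX_MODE_POLL;
	s->zero_polls = 0;
}
```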

Signed-off-by: Danny Zhou 
Signed-off-by: Cunming Liang 
---
v14 changes
 - per-patch basis ABI compatibility rework
 - reword commit comments

v7 changes
 - using new APIs
 - demo multiple port/queue pair wait on the same epoll instance

v6 changes
 - Split event fd add and wait

v5 changes
 - Change invoked function name and parameter to accommodate EAL change

v3 changes
 - Add spinlock to ensure thread safety when accessing interrupt mask
   register

v2 changes
 - Remove unused function which is for debug purpose

 examples/l3fwd-power/main.c | 202 +++-
 1 file changed, 162 insertions(+), 40 deletions(-)

diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index b3c5f43..bec78e1 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -74,12 +74,14 @@
 #include 
 #include 
 #include 
+#include 
+#include 

 #define RTE_LOGTYPE_L3FWD_POWER RTE_LOGTYPE_USER1

 #define MAX_PKT_BURST 32

-#define MIN_ZERO_POLL_COUNT 5
+#define MIN_ZERO_POLL_COUNT 10

 /* around 100ms at 2 Ghz */
 #define TIMER_RESOLUTION_CYCLES   200000000ULL
@@ -153,6 +155,9 @@ static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;
 /* ethernet addresses of ports */
 static struct ether_addr ports_eth_addr[RTE_MAX_ETHPORTS];

+/* ethernet addresses of ports */
+static rte_spinlock_t locks[RTE_MAX_ETHPORTS];
+
 /* mask of enabled ports */
 static uint32_t enabled_port_mask = 0;
 /* Ports set in promiscuous mode off by default. */
@@ -185,6 +190,9 @@ struct lcore_rx_queue {
 #define MAX_TX_QUEUE_PER_PORT RTE_MAX_ETHPORTS
 #define MAX_RX_QUEUE_PER_PORT 128

+#define MAX_RX_QUEUE_INTERRUPT_PER_PORT 16
+
+
 #define MAX_LCORE_PARAMS 1024
 struct lcore_params {
uint8_t port_id;
@@ -211,7 +219,7 @@ static uint16_t nb_lcore_params = sizeof(lcore_params_array_default) /

 static struct rte_eth_conf port_conf = {
.rxmode = {
-   .mq_mode= ETH_MQ_RX_RSS,
+   .mq_mode = ETH_MQ_RX_RSS,
.max_rx_pkt_len = ETHER_MAX_LEN,
.split_hdr_size = 0,
.header_split   = 0, /**< Header Split disabled */
@@ -223,11 +231,14 @@ static struct rte_eth_conf port_conf = {
.rx_adv_conf = {
.rss_conf = {
.rss_key = NULL,
-   .rss_hf = ETH_RSS_IP,
+   .rss_hf = ETH_RSS_UDP,
},
},
.txmode = {
-   .mq_mode = ETH_DCB_NONE,
+   .mq_mode = ETH_MQ_TX_NONE,
+   },
+   .intr_conf = {
+   .lsc = 1,
},
 };

@@ -399,19 +410,22 @@ power_timer_cb(__attribute__((unused)) struct rte_timer *tim,
/* accumulate total execution time in us when callback is invoked */
sleep_time_ratio = (float)(stats[lcore_id].sleep_time) /
(float)SCALING_PERIOD;
-
/**
 * check whether need to scale down frequency a step if it sleep a lot.
 */
-   if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD)
-   rte_power_freq_down(lcore_id);
+   if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD) {
+   if (rte_power_freq_down)
+   rte_power_freq_down(lcore_id);
+   }
else if ( (unsigned)(stats[lcore_id].nb_rx_processed /
-   stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST)
+   stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST) {
/**
 * scale down a step if average packet per iteration less
 * than expectation.
 */
-   rte_power_freq_down(lcore_id);
+   if (rte_power_freq_down)
+   rte_power_freq_down(lcore_id);
+   }

/**
 * initialize another timer according to current frequency to ensure
@@ -712,22 +726,20 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid,

 }

-#define SLEEP_GEAR1_THRESHOLD 100
-#define SLEEP_GEAR2_THRESHOLD 1000
+#define MINIMUM_SLEEP_TIME 1
+#define SUSPEND_THRESHOLD  300

 static inline uint32_t
 power_idle_heuristic(uint32_t zero_rx_packet_count)
 {
-   /* If zero count is less than 100, use it as the sleep time in us */
-   if (zero_rx_packet_count < SLEEP_GEAR1_THRESHOLD)
-   return zero_rx_packet_count;
-   /* If zero count is less than 1000, sleep time should be 100 us */

[dpdk-dev] [PATCH v14 12/13] igb: enable rx queue interrupts for PF

2015-07-17 Thread Cunming Liang
The patch does the following for igb PF:
- Setup NIC to generate MSI-X interrupts
- Set the IVAR register to map interrupt causes to vectors
- Implement interrupt enable/disable functions

Signed-off-by: Danny Zhou 
Signed-off-by: Cunming Liang 
---
v14 changes
 - per-patch basis ABI compatibility rework

v9 changes
 - move queue-vec mapping init from dev_configure to dev_start
 - fix link interrupt not working issue in vfio-msix

v8 changes
 - add vfio-msi/vfio-legacy and uio-legacy support

v7 changes
 - add condition check when intr vector is not enabled

v6 changes
 - fill queue-vector mapping table

v5 changes
 - Rebase the patchset onto the HEAD

v3 changes
 - Remove unnecessary variables in e1000_mac_info
 - Remove spinlock from PMD

v2 changes
 - Consolidate review comments related to coding style

 drivers/net/e1000/igb_ethdev.c | 311 -
 1 file changed, 277 insertions(+), 34 deletions(-)

diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index eb97218..fd92c80 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -104,6 +104,9 @@ static int  eth_igb_flow_ctrl_get(struct rte_eth_dev *dev,
 static int  eth_igb_flow_ctrl_set(struct rte_eth_dev *dev,
struct rte_eth_fc_conf *fc_conf);
 static int eth_igb_lsc_interrupt_setup(struct rte_eth_dev *dev);
+#ifdef RTE_NEXT_ABI
+static int eth_igb_rxq_interrupt_setup(struct rte_eth_dev *dev);
+#endif
 static int eth_igb_interrupt_get_status(struct rte_eth_dev *dev);
 static int eth_igb_interrupt_action(struct rte_eth_dev *dev);
 static void eth_igb_interrupt_handler(struct rte_intr_handle *handle,
@@ -201,7 +204,6 @@ static int eth_igb_filter_ctrl(struct rte_eth_dev *dev,
 enum rte_filter_type filter_type,
 enum rte_filter_op filter_op,
 void *arg);
-
 static int eth_igb_set_mc_addr_list(struct rte_eth_dev *dev,
struct ether_addr *mc_addr_set,
uint32_t nb_mc_addr);
@@ -212,6 +214,17 @@ static int igb_timesync_read_rx_timestamp(struct rte_eth_dev *dev,
  uint32_t flags);
 static int igb_timesync_read_tx_timestamp(struct rte_eth_dev *dev,
  struct timespec *timestamp);
+#ifdef RTE_NEXT_ABI
+static int eth_igb_rx_queue_intr_enable(struct rte_eth_dev *dev,
+   uint16_t queue_id);
+static int eth_igb_rx_queue_intr_disable(struct rte_eth_dev *dev,
+uint16_t queue_id);
+static void eth_igb_assign_msix_vector(struct e1000_hw *hw, int8_t direction,
+  uint8_t queue, uint8_t msix_vector);
+static void eth_igb_write_ivar(struct e1000_hw *hw, uint8_t msix_vector,
+  uint8_t index, uint8_t offset);
+#endif
+static void eth_igb_configure_msix_intr(struct rte_eth_dev *dev);

 /*
  * Define VF Stats MACRO for Non "cleared on read" register
@@ -272,6 +285,10 @@ static const struct eth_dev_ops eth_igb_ops = {
.vlan_tpid_set= eth_igb_vlan_tpid_set,
.vlan_offload_set = eth_igb_vlan_offload_set,
.rx_queue_setup   = eth_igb_rx_queue_setup,
+#ifdef RTE_NEXT_ABI
+   .rx_queue_intr_enable = eth_igb_rx_queue_intr_enable,
+   .rx_queue_intr_disable = eth_igb_rx_queue_intr_disable,
+#endif
.rx_queue_release = eth_igb_rx_queue_release,
.rx_queue_count   = eth_igb_rx_queue_count,
.rx_descriptor_done   = eth_igb_rx_descriptor_done,
@@ -609,12 +626,6 @@ eth_igb_dev_init(struct rte_eth_dev *eth_dev)
 eth_dev->data->port_id, pci_dev->id.vendor_id,
 pci_dev->id.device_id);

-   rte_intr_callback_register(&(pci_dev->intr_handle),
-   eth_igb_interrupt_handler, (void *)eth_dev);
-
-   /* enable uio intr after callback register */
-   rte_intr_enable(&(pci_dev->intr_handle));
-
/* enable support intr */
igb_intr_enable(eth_dev);

@@ -777,7 +788,11 @@ eth_igb_start(struct rte_eth_dev *dev)
 {
struct e1000_hw *hw =
E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
-   int ret, i, mask;
+   struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+#ifdef RTE_NEXT_ABI
+   uint32_t intr_vector = 0;
+#endif
+   int ret, mask;
uint32_t ctrl_ext;

PMD_INIT_FUNC_TRACE();
@@ -817,6 +832,29 @@ eth_igb_start(struct rte_eth_dev *dev)
/* configure PF module if SRIOV enabled */
igb_pf_host_configure(dev);

+#ifdef RTE_NEXT_ABI
+   /* check and configure queue intr-vector mapping */
+   if (dev->data->dev_conf.intr_conf.rxq != 0)
+   intr_vector = dev->data->nb_rx_queues;
+
+   if (rte_intr_efd_enable(intr_handle, intr_vector))
+   return -1;
+
+   if 

[dpdk-dev] [PATCH v14 09/13] eal/bsd: fix inappropriate linuxapp referred in bsd

2015-07-17 Thread Cunming Liang

Signed-off-by: Cunming Liang 
---
 lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h
index eaf5410..4c4b761 100644
--- a/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h
@@ -35,8 +35,8 @@
 #error "don't include this file directly, please include generic 
"
 #endif

-#ifndef _RTE_LINUXAPP_INTERRUPTS_H_
-#define _RTE_LINUXAPP_INTERRUPTS_H_
+#ifndef _RTE_BSDAPP_INTERRUPTS_H_
+#define _RTE_BSDAPP_INTERRUPTS_H_

 #include 

@@ -137,4 +137,4 @@ rte_intr_allow_others(struct rte_intr_handle *intr_handle)
return 1;
 }

-#endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
+#endif /* _RTE_BSDAPP_INTERRUPTS_H_ */
-- 
1.8.1.4



[dpdk-dev] [PATCH v14 08/13] eal/bsd: dummy for new intr definition

2015-07-17 Thread Cunming Liang
Add dummy definitions so that the bsd build compiles with the new intr changes.

Signed-off-by: Cunming Liang 
---
v14 changes
 - per-patch basis ABI compatibility rework

v13 changes
 - version map cleanup for v2.1

v12 changes
 - fix unused variables compiling warning

v8 changes
 - add stub for new function

v7 changes
 - remove stub 'linux only' function from source file

 lib/librte_eal/bsdapp/eal/eal_interrupts.c | 28 +++
 .../bsdapp/eal/include/exec-env/rte_interrupts.h   | 85 ++
 lib/librte_eal/bsdapp/eal/rte_eal_version.map  |  5 ++
 3 files changed, 118 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_interrupts.c b/lib/librte_eal/bsdapp/eal/eal_interrupts.c
index 26a55c7..a550ece 100644
--- a/lib/librte_eal/bsdapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/bsdapp/eal/eal_interrupts.c
@@ -68,3 +68,31 @@ rte_eal_intr_init(void)
 {
return 0;
 }
+
+int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
+   int epfd, int op, unsigned int vec, void *data)
+{
+   RTE_SET_USED(intr_handle);
+   RTE_SET_USED(epfd);
+   RTE_SET_USED(op);
+   RTE_SET_USED(vec);
+   RTE_SET_USED(data);
+
+   return -ENOTSUP;
+}
+
+int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd)
+{
+   RTE_SET_USED(intr_handle);
+   RTE_SET_USED(nb_efd);
+
+   return 0;
+}
+
+void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle)
+{
+   RTE_SET_USED(intr_handle);
+}
diff --git a/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h
index d4c388f..eaf5410 100644
--- a/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h
@@ -38,6 +38,8 @@
 #ifndef _RTE_LINUXAPP_INTERRUPTS_H_
 #define _RTE_LINUXAPP_INTERRUPTS_H_

+#include 
+
 enum rte_intr_handle_type {
RTE_INTR_HANDLE_UNKNOWN = 0,
RTE_INTR_HANDLE_UIO,  /**< uio device handle */
@@ -50,6 +52,89 @@ struct rte_intr_handle {
int fd;  /**< file descriptor */
int uio_cfg_fd;  /**< UIO config file descriptor */
enum rte_intr_handle_type type;  /**< handle type */
+#ifdef RTE_NEXT_ABI
+   /**
+* RTE_NEXT_ABI will be removed from v2.2.
+* It's only used to avoid ABI(unannounced) broken in v2.1.
+* Make sure being aware of the impact before turning on the feature.
+*/
+   int max_intr;/**< max interrupt requested */
+   uint32_t nb_efd; /**< number of available efds */
+   int *intr_vec;   /**< intr vector number array */
+#endif
 };

+/**
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ * @param epfd
+ *   Epoll instance fd which the intr vector associated to.
+ * @param op
+ *   The operation be performed for the vector.
+ *   Operation type of {ADD, DEL}.
+ * @param vec
+ *   RX intr vector number added to the epoll instance wait list.
+ * @param data
+ *   User raw data.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
+   int epfd, int op, unsigned int vec, void *data);
+
+/**
+ * It enables the fastpath event fds if it's necessary.
+ * It creates event fds when multi-vectors allowed,
+ * otherwise it multiplexes the single event fds.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ * @param nb_vec
+ *   Number of interrupt vector trying to enable.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd);
+
+/**
+ * It disable the fastpath event fds.
+ * It deletes registered eventfds and closes the open fds.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle);
+
+/**
+ * The fastpath interrupt is enabled or not.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+static inline int
+rte_intr_dp_is_en(struct rte_intr_handle *intr_handle)
+{
+   RTE_SET_USED(intr_handle);
+   return 0;
+}
+
+/**
+ * The interrupt handle instance allows other cause or not.
+ * Other cause stands for none fastpath interrupt.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+static inline int
+rte_intr_allow_others(struct rte_intr_handle *intr_handle)
+{
+   RTE_SET_USED(intr_handle);
+   return 1;
+}
+
 #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index e537b42..b527ad4 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -116,4 +116,9 @@ DPDK_2.1 {
global:

rte_memzone_free;
+   rte_intr_allow_others;
+   

[dpdk-dev] [PATCH v14 07/13] eal/linux: fix lsc read error in uio_pci_generic

2015-07-17 Thread Cunming Liang
The interrupt handle type RTE_INTR_HANDLE_UIO_INTX was introduced for
uio_pci_generic.
When the lsc interrupt is turned on, an fd read error is reported.
The patch uses the correct read size in the case of RTE_INTR_HANDLE_UIO_INTX.

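Why the read size matters can be illustrated with a small standalone
sketch. It is not taken from the patch: the enum, the union and both
functions are hypothetical, loosely modelled on the per-type switch in the
diff below, and an eventfd is used in place of a UIO device because no such
device is available in a self-contained example. An eventfd rejects a read
with a buffer shorter than 8 bytes with EINVAL, which mirrors the kind of
failure seen when a handle type falls through to the wrong length.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical handle types, loosely modelled on enum rte_intr_handle_type. */
enum handle_type { HANDLE_UIO, HANDLE_UIO_INTX, HANDLE_EVENTFD };

/* Hypothetical read buffer: one member per counter width. */
union read_buffer {
	int uio_intr_count;       /* UIO-style counter, read as an int */
	uint64_t efd_intr_count;  /* eventfd counter, always 8 bytes */
};

/* Select the read length from the handle type. */
static size_t
read_size_for(enum handle_type type)
{
	switch (type) {
	case HANDLE_UIO:
	case HANDLE_UIO_INTX:     /* shares the UIO read size */
		return sizeof(int);
	case HANDLE_EVENTFD:
		return sizeof(uint64_t);
	}
	return 1;                 /* unknown type: most likely the wrong size */
}

/* Read the interrupt counter from 'fd' using the length for 'type'. */
static ssize_t
read_counter(int fd, enum handle_type type)
{
	union read_buffer buf;
	size_t len = read_size_for(type);

	assert(len <= sizeof(buf));
	return read(fd, &buf, len);
}
```

Reading an eventfd with the UIO length fails, while the eventfd length
succeeds, so each handle type needs its own case in the switch.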
Fixes: 3f313bef3467 ("eal/linux: fix irq handling with igb_uio")

Reported-by: Yong Liu 
Signed-off-by: Cunming Liang 
---
 lib/librte_eal/linuxapp/eal/eal_interrupts.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 0266d98..69ce974 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -686,6 +686,7 @@ eal_intr_process_interrupts(struct epoll_event *events, int nfds)
/* set the length to be read dor different handle type */
switch (src->intr_handle.type) {
case RTE_INTR_HANDLE_UIO:
+   case RTE_INTR_HANDLE_UIO_INTX:
bytes_read = sizeof(buf.uio_intr_count);
break;
case RTE_INTR_HANDLE_ALARM:
-- 
1.8.1.4



[dpdk-dev] [PATCH v14 06/13] eal/linux: standalone intr event fd create support

2015-07-17 Thread Cunming Liang
The patch exposes interrupt event fd creation and release to the PMD.
The device driver can assign the number of events associated with interrupt
vectors.
It also provides helper functions to check 1) whether other slowpath
interrupts (e.g. lsc) are allowed;
2) whether interrupt events on the fastpath are enabled.
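
As a rough standalone illustration of event fd creation and release (not
the code of this patch), the sketch below creates one eventfd per requested
vector and closes them again. struct efd_set, MAX_VEC and both functions
are hypothetical names; MAX_VEC only plays the role of
RTE_MAX_RXTX_INTR_VEC_ID and its value is an assumption. Unlike the diff
below, this sketch also closes the fds already created when a later
eventfd() call fails.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

#define MAX_VEC 32  /* hypothetical cap on the number of event fds */

/* Hypothetical container for the per-vector event fds. */
struct efd_set {
	int efds[MAX_VEC];
	uint32_t nb_efd;
};

/* Create up to MAX_VEC eventfds; return 0 on success, -1 on failure. */
static int
efd_set_enable(struct efd_set *set, uint32_t nb_efd)
{
	uint32_t n = nb_efd < MAX_VEC ? nb_efd : MAX_VEC;
	uint32_t i;

	assert(set != NULL);
	for (i = 0; i < n; i++) {
		int fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);

		if (fd < 0) {
			/* roll back the fds created so far */
			while (i > 0)
				close(set->efds[--i]);
			set->nb_efd = 0;
			return -1;
		}
		set->efds[i] = fd;
	}
	set->nb_efd = n;
	return 0;
}

/* Close every event fd and mark the set as empty. */
static void
efd_set_disable(struct efd_set *set)
{
	uint32_t i;

	assert(set != NULL);
	for (i = 0; i < set->nb_efd; i++)
		close(set->efds[i]);
	set->nb_efd = 0;
}
```

Requests above the cap are clamped rather than rejected, which follows the
RTE_MIN() clamp used by rte_intr_efd_enable() in the diff below.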

Signed-off-by: Cunming Liang 
---
v14 changes
 - per-patch basis ABI compatibility rework
 - minor changes on API description comments

v13 changes
 - version map cleanup for v2.1

v11 changes
 - typo cleanup

 lib/librte_eal/linuxapp/eal/eal_interrupts.c   | 57 ++
 .../linuxapp/eal/include/exec-env/rte_interrupts.h | 87 ++
 lib/librte_eal/linuxapp/eal/rte_eal_version.map|  4 +
 3 files changed, 148 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index b18ab86..0266d98 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -44,6 +44,7 @@
 #include 
 #include 
 #include 
+#include 

 #include 
 #include 
@@ -68,6 +69,7 @@
 #include "eal_vfio.h"

 #define EAL_INTR_EPOLL_WAIT_FOREVER (-1)
+#define NB_OTHER_INTR   1

 static RTE_DEFINE_PER_LCORE(int, _epfd) = -1; /**< epoll fd per thread */

@@ -1121,4 +1123,59 @@ rte_intr_rx_ctl(struct rte_intr_handle *intr_handle, int epfd,

return rc;
 }
+
+int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd)
+{
+   uint32_t i;
+   int fd;
+   uint32_t n = RTE_MIN(nb_efd, (uint32_t)RTE_MAX_RXTX_INTR_VEC_ID);
+
+   if (intr_handle->type == RTE_INTR_HANDLE_VFIO_MSIX) {
+   for (i = 0; i < n; i++) {
+   fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
+   if (fd < 0) {
+   RTE_LOG(ERR, EAL,
+   "cannot setup eventfd,"
+   "error %i (%s)\n",
+   errno, strerror(errno));
+   return -1;
+   }
+   intr_handle->efds[i] = fd;
+   }
+   intr_handle->nb_efd   = n;
+   intr_handle->max_intr = NB_OTHER_INTR + n;
+   } else {
+   intr_handle->efds[0]  = intr_handle->fd;
+   intr_handle->nb_efd   = RTE_MIN(nb_efd, 1U);
+   intr_handle->max_intr = NB_OTHER_INTR;
+   }
+
+   return 0;
+}
+
+void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle)
+{
+   uint32_t i;
+   struct rte_epoll_event *rev;
+
+   for (i = 0; i < intr_handle->nb_efd; i++) {
+   rev = &intr_handle->elist[i];
+   if (rev->status == RTE_EPOLL_INVALID)
+   continue;
+   if (rte_epoll_ctl(rev->epfd, EPOLL_CTL_DEL, rev->fd, rev)) {
+   /* force free if the entry valid */
+   eal_epoll_data_safe_free(rev);
+   rev->status = RTE_EPOLL_INVALID;
+   }
+   }
+
+   if (intr_handle->max_intr > intr_handle->nb_efd) {
+   for (i = 0; i < intr_handle->nb_efd; i++)
+   close(intr_handle->efds[i]);
+   }
+   intr_handle->nb_efd = 0;
+   intr_handle->max_intr = 0;
+}
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
index 918246f..3f17f29 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
@@ -191,4 +191,91 @@ rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
 }
 #endif

+/**
+ * It enables the packet I/O interrupt event if it's necessary.
+ * It creates event fd for each interrupt vector when MSIX is used,
+ * otherwise it multiplexes a single event fd.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ * @param nb_vec
+ *   Number of interrupt vector trying to enable.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+#ifdef RTE_NEXT_ABI
+extern int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd);
+#else
+static inline int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd)
+{
+   RTE_SET_USED(intr_handle);
+   RTE_SET_USED(nb_efd);
+   return 0;
+}
+#endif
+
+/**
+ * It disables the packet I/O interrupt event.
+ * It deletes registered eventfds and closes the open fds.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+#ifdef RTE_NEXT_ABI
+extern void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle);
+#else
+static inline void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle)
+{
+   RTE_SET_USED(intr_handle);
+}
+#endif
+
+/**
+ * The packet I/O interrupt on datapath is enabled or not.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+#ifdef RTE_NEXT_ABI

[dpdk-dev] [PATCH v14 05/13] eal/linux: map eventfd to VFIO MSI-X intr vector

2015-07-17 Thread Cunming Liang
The patch assigns an event fd to each VFIO MSI-X interrupt vector via ioctl.
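
The request that carries the event fds can be sketched as below. This is
an illustration under assumptions rather than the patch code:
build_msix_irq_set() is a hypothetical helper, and it only fills in the
struct vfio_irq_set from <linux/vfio.h> without issuing the ioctl, since
that needs a real VFIO device fd.

```c
#include <assert.h>
#include <linux/vfio.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: fill 'buf' with a VFIO_DEVICE_SET_IRQS request that
 * maps 'count' eventfds to MSI-X vectors 0..count-1.
 * Returns the request length, or 0 if 'buf' is too small or count is 0.
 */
static size_t
build_msix_irq_set(void *buf, size_t buflen, const int *fds, uint32_t count)
{
	struct vfio_irq_set *irq_set = buf;
	size_t len = sizeof(*irq_set) + (size_t)count * sizeof(int);

	assert(buf != NULL && fds != NULL);
	if (count == 0 || len > buflen)
		return 0;

	irq_set->argsz = (uint32_t)len;
	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
			 VFIO_IRQ_SET_ACTION_TRIGGER;
	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
	irq_set->start = 0;          /* first vector covered by the request */
	irq_set->count = count;      /* one eventfd per vector */
	memcpy(irq_set->data, fds, (size_t)count * sizeof(int));
	return len;
}
```

The filled buffer would then be passed to
ioctl(vfio_dev_fd, VFIO_DEVICE_SET_IRQS, buf), as vfio_enable_msix() does
in the diff below.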

Signed-off-by: Danny Zhou 
Signed-off-by: Cunming Liang 
---
v14 changes
 - per-patch basis ABI compatibility rework
 - reword commit comments

v8 changes
 - move eventfd creation out of the setup_interrupts to a standalone function

v7 changes
 - cleanup unnecessary code change
 - split event and intr operation to other patches

 lib/librte_eal/linuxapp/eal/eal_interrupts.c | 56 ++--
 1 file changed, 20 insertions(+), 36 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index cca2efd..b18ab86 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -128,6 +128,9 @@ static pthread_t intr_thread;
 #ifdef VFIO_PRESENT

 #define IRQ_SET_BUF_LEN  (sizeof(struct vfio_irq_set) + sizeof(int))
+/* irq set buffer length for queue interrupts and LSC interrupt */
+#define MSIX_IRQ_SET_BUF_LEN (sizeof(struct vfio_irq_set) + \
+ sizeof(int) * (RTE_MAX_RXTX_INTR_VEC_ID + 1))

 /* enable legacy (INTx) interrupts */
 static int
@@ -245,23 +248,6 @@ vfio_enable_msi(struct rte_intr_handle *intr_handle) {
intr_handle->fd);
return -1;
}
-
-   /* manually trigger interrupt to enable it */
-   memset(irq_set, 0, len);
-   len = sizeof(struct vfio_irq_set);
-   irq_set->argsz = len;
-   irq_set->count = 1;
-   irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
-   irq_set->index = VFIO_PCI_MSI_IRQ_INDEX;
-   irq_set->start = 0;
-
-   ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
-
-   if (ret) {
-   RTE_LOG(ERR, EAL, "Error triggering MSI interrupts for fd %d\n",
-   intr_handle->fd);
-   return -1;
-   }
return 0;
 }

@@ -294,7 +280,7 @@ vfio_disable_msi(struct rte_intr_handle *intr_handle) {
 static int
 vfio_enable_msix(struct rte_intr_handle *intr_handle) {
int len, ret;
-   char irq_set_buf[IRQ_SET_BUF_LEN];
+   char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
struct vfio_irq_set *irq_set;
int *fd_ptr;

@@ -302,12 +288,26 @@ vfio_enable_msix(struct rte_intr_handle *intr_handle) {

irq_set = (struct vfio_irq_set *) irq_set_buf;
irq_set->argsz = len;
+#ifdef RTE_NEXT_ABI
+   if (!intr_handle->max_intr)
+   intr_handle->max_intr = 1;
+   else if (intr_handle->max_intr > RTE_MAX_RXTX_INTR_VEC_ID)
+   intr_handle->max_intr = RTE_MAX_RXTX_INTR_VEC_ID + 1;
+
+   irq_set->count = intr_handle->max_intr;
+#else
irq_set->count = 1;
+#endif
irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
irq_set->start = 0;
fd_ptr = (int *) &irq_set->data;
-   *fd_ptr = intr_handle->fd;
+#ifdef RTE_NEXT_ABI
+   memcpy(fd_ptr, intr_handle->efds, sizeof(intr_handle->efds));
+   fd_ptr[intr_handle->max_intr - 1] = intr_handle->fd;
+#else
+   fd_ptr[0] = intr_handle->fd;
+#endif

ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);

@@ -317,22 +317,6 @@ vfio_enable_msix(struct rte_intr_handle *intr_handle) {
return -1;
}

-   /* manually trigger interrupt to enable it */
-   memset(irq_set, 0, len);
-   len = sizeof(struct vfio_irq_set);
-   irq_set->argsz = len;
-   irq_set->count = 1;
-   irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
-   irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
-   irq_set->start = 0;
-
-   ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
-
-   if (ret) {
-   RTE_LOG(ERR, EAL, "Error triggering MSI-X interrupts for fd %d\n",
-   intr_handle->fd);
-   return -1;
-   }
return 0;
 }

@@ -340,7 +324,7 @@ vfio_enable_msix(struct rte_intr_handle *intr_handle) {
 static int
 vfio_disable_msix(struct rte_intr_handle *intr_handle) {
struct vfio_irq_set *irq_set;
-   char irq_set_buf[IRQ_SET_BUF_LEN];
+   char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
int len, ret;

len = sizeof(struct vfio_irq_set);
-- 
1.8.1.4



[dpdk-dev] [PATCH v14 04/13] eal/linux: fix comments typo on vfio msi

2015-07-17 Thread Cunming Liang

Signed-off-by: Cunming Liang 
---
 lib/librte_eal/linuxapp/eal/eal_interrupts.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 4e34abc..cca2efd 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -219,7 +219,7 @@ vfio_disable_intx(struct rte_intr_handle *intr_handle) {
return 0;
 }

-/* enable MSI-X interrupts */
+/* enable MSI interrupts */
 static int
 vfio_enable_msi(struct rte_intr_handle *intr_handle) {
int len, ret;
@@ -265,7 +265,7 @@ vfio_enable_msi(struct rte_intr_handle *intr_handle) {
return 0;
 }

-/* disable MSI-X interrupts */
+/* disable MSI interrupts */
 static int
 vfio_disable_msi(struct rte_intr_handle *intr_handle) {
struct vfio_irq_set *irq_set;
-- 
1.8.1.4



[dpdk-dev] [PATCH v14 03/13] eal/linux: add API to set rx interrupt event monitor

2015-07-17 Thread Cunming Liang
The patch adds 'rte_intr_rx_ctl' to add or delete monitoring of interrupt
vector events on a specified epoll instance.
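
The add/delete bookkeeping can be sketched with plain epoll as follows.
This is a simplified illustration, not the patch code: struct vec_event,
the VEC_EV_* states and vec_event_ctl() are hypothetical, and the operation
is passed as EPOLL_CTL_ADD or EPOLL_CTL_DEL directly instead of the
RTE_INTR_EVENT_* values. The return codes for the two state checks follow
the ones visible in the diff below.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

enum { VEC_EV_INVALID = 0, VEC_EV_VALID };  /* hypothetical states */

/* Hypothetical per-vector record of where the fd is registered. */
struct vec_event {
	int status;
	int fd;
	int epfd;
};

/* Add or delete one vector's event fd on an epoll instance. */
static int
vec_event_ctl(struct vec_event *ve, int epfd, int op, int fd)
{
	struct epoll_event ev;

	assert(ve != NULL);
	switch (op) {
	case EPOLL_CTL_ADD:
		if (ve->status != VEC_EV_INVALID)
			return -EEXIST;       /* already being monitored */
		ev.events = EPOLLIN | EPOLLPRI | EPOLLET;
		ev.data.ptr = ve;             /* lets the waiter find the record */
		if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0)
			return -errno;
		ve->status = VEC_EV_VALID;
		ve->fd = fd;
		ve->epfd = epfd;
		return 0;
	case EPOLL_CTL_DEL:
		if (ve->status == VEC_EV_INVALID)
			return -EPERM;        /* nothing to delete */
		if (epoll_ctl(ve->epfd, EPOLL_CTL_DEL, ve->fd, NULL) < 0)
			return -errno;
		ve->status = VEC_EV_INVALID;
		ve->fd = -1;
		return 0;
	default:
		return -EINVAL;
	}
}
```

Tracking the state per vector is what allows a second add, or a delete of
an unregistered vector, to be rejected without calling into the kernel.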

Signed-off-by: Cunming Liang 
---
v14 changes
 - per-patch basis ABI compatibility rework
 - remove unnecessary 'local: *' from version map

v13 changes
 - version map cleanup for v2.1

v12 changes:
 - fix awkward line split in using RTE_LOG

v10 changes:
 - add RTE_INTR_HANDLE_UIO_INTX for uio_pci_generic

v8 changes
 - fix EWOULDBLOCK and EINTR processing
 - add event status check

v7 changes
 - rename rte_intr_rx_set to rte_intr_rx_ctl.
 - rte_intr_rx_ctl uses rte_epoll_ctl to register epoll event instance.
 - the intr rx event instance includes an intr process callback.

v6 changes
 - split rte_intr_wait_rx_pkt into two functions, wait and set.
 - rewrite rte_intr_rx_wait/rte_intr_rx_set to remove queue visibility on eal.
 - rte_intr_rx_wait to support multiplexing.
 - allow epfd as input to support flexible event fd combination.

 lib/librte_eal/linuxapp/eal/eal_interrupts.c   | 105 +
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |  38 
 lib/librte_eal/linuxapp/eal/rte_eal_version.map|   1 +
 3 files changed, 144 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 5fe5b99..4e34abc 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -897,6 +897,51 @@ rte_eal_intr_init(void)
return -ret;
 }

+#ifdef RTE_NEXT_ABI
+static void
+eal_intr_proc_rxtx_intr(int fd, const struct rte_intr_handle *intr_handle)
+{
+   union rte_intr_read_buffer buf;
+   int bytes_read = 1;
+
+   switch (intr_handle->type) {
+   case RTE_INTR_HANDLE_UIO:
+   case RTE_INTR_HANDLE_UIO_INTX:
+   bytes_read = sizeof(buf.uio_intr_count);
+   break;
+#ifdef VFIO_PRESENT
+   case RTE_INTR_HANDLE_VFIO_MSIX:
+   case RTE_INTR_HANDLE_VFIO_MSI:
+   case RTE_INTR_HANDLE_VFIO_LEGACY:
+   bytes_read = sizeof(buf.vfio_intr_count);
+   break;
+#endif
+   default:
+   bytes_read = 1;
+   RTE_LOG(INFO, EAL, "unexpected intr type\n");
+   break;
+   }
+
+   /**
+* read out to clear the ready-to-be-read flag
+* for epoll_wait.
+*/
+   do {
+   bytes_read = read(fd, &buf, bytes_read);
+   if (bytes_read < 0) {
+   if (errno == EINTR || errno == EWOULDBLOCK ||
+   errno == EAGAIN)
+   continue;
+   RTE_LOG(ERR, EAL,
+   "Error reading from fd %d: %s\n",
+   fd, strerror(errno));
+   } else if (bytes_read == 0)
+   RTE_LOG(ERR, EAL, "Read nothing from fd %d\n", fd);
+   return;
+   } while (1);
+}
+#endif
+
 static int
 eal_epoll_process_event(struct epoll_event *evs, unsigned int n,
struct rte_epoll_event *events)
@@ -1033,3 +1078,63 @@ rte_epoll_ctl(int epfd, int op, int fd,

return 0;
 }
+
+#ifdef RTE_NEXT_ABI
+int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle, int epfd,
+   int op, unsigned int vec, void *data)
+{
+   struct rte_epoll_event *rev;
+   struct rte_epoll_data *epdata;
+   int epfd_op;
+   int rc = 0;
+
+   if (!intr_handle || intr_handle->nb_efd == 0 ||
+   vec >= intr_handle->nb_efd) {
+   RTE_LOG(ERR, EAL, "Wrong intr vector number.\n");
+   return -EPERM;
+   }
+
+   switch (op) {
+   case RTE_INTR_EVENT_ADD:
+   epfd_op = EPOLL_CTL_ADD;
+   rev = &intr_handle->elist[vec];
+   if (rev->status != RTE_EPOLL_INVALID) {
+   RTE_LOG(INFO, EAL, "Event already been added.\n");
+   return -EEXIST;
+   }
+
+   /* attach to intr vector fd */
+   epdata = &rev->epdata;
+   epdata->event  = EPOLLIN | EPOLLPRI | EPOLLET;
+   epdata->data   = data;
+   epdata->cb_fun = (rte_intr_event_cb_t)eal_intr_proc_rxtx_intr;
+   epdata->cb_arg = (void *)intr_handle;
+   rc = rte_epoll_ctl(epfd, epfd_op, intr_handle->efds[vec], rev);
+   if (!rc)
+   RTE_LOG(DEBUG, EAL,
+   "efd %d associated with vec %d added on epfd %d"
+   "\n", rev->fd, vec, epfd);
+   else
+   rc = -EPERM;
+   break;
+   case RTE_INTR_EVENT_DEL:
+   epfd_op = EPOLL_CTL_DEL;
+   rev = &intr_handle->elist[vec];
+   if (rev->status == RTE_EPOLL_INVALID) {
+   RTE_LOG(INFO, EAL, "Event does not exist.\n");
+   return -EPERM;
+   }
+
+   rc 

[dpdk-dev] [PATCH v14 02/13] eal/linux: add rte_epoll_wait/ctl support

2015-07-17 Thread Cunming Liang
The patch adds 'rte_epoll_wait' and 'rte_epoll_ctl' for async event wakeup.
It defines 'struct rte_epoll_event' as the event param.
When event fds are added to a specified epoll instance, 'eptrs' holds the
rte_epoll_event object pointer.
The 'op' uses the same values as epoll_ctl does.
The epoll event can carry raw user data and register a callback
which is executed during wakeup.
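
The idea of an epoll event that carries user data plus a wakeup callback
can be sketched with plain epoll as below. This is a simplified
illustration, not the rte_epoll API: struct cb_event, cb_event_add(),
cb_event_wait(), drain_eventfd_cb() and MAX_EVENTS are hypothetical names,
and the status handling of the real implementation is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_EVENTS 16  /* hypothetical cap on events handled per wait */

typedef void (*event_cb_t)(int fd, void *arg);

/* Hypothetical event object: fd, raw user data and an optional callback. */
struct cb_event {
	int fd;
	void *data;         /* raw user data, returned to the caller */
	event_cb_t cb_fun;  /* executed during wakeup, may be NULL */
	void *cb_arg;
};

/* Register the event; the epoll data pointer refers back to the object. */
static int
cb_event_add(int epfd, struct cb_event *e)
{
	struct epoll_event ev;

	assert(e != NULL);
	ev.events = EPOLLIN;
	ev.data.ptr = e;
	return epoll_ctl(epfd, EPOLL_CTL_ADD, e->fd, &ev);
}

/* Wait for events, run each callback and report the ready objects. */
static int
cb_event_wait(int epfd, struct cb_event **out, int maxevents, int timeout_ms)
{
	struct epoll_event evs[MAX_EVENTS];
	int rc, i;

	assert(out != NULL && maxevents > 0);
	if (maxevents > MAX_EVENTS)
		maxevents = MAX_EVENTS;

	do {
		rc = epoll_wait(epfd, evs, maxevents, timeout_ms);
	} while (rc < 0 && errno == EINTR);  /* retry when interrupted */

	for (i = 0; i < rc; i++) {
		struct cb_event *e = evs[i].data.ptr;

		if (e->cb_fun != NULL)
			e->cb_fun(e->fd, e->cb_arg);
		out[i] = e;
	}
	return rc;
}

/* Example callback: drain an eventfd and add its count to *arg. */
static void
drain_eventfd_cb(int fd, void *arg)
{
	uint64_t value;

	if (read(fd, &value, sizeof(value)) == (ssize_t)sizeof(value))
		*(uint64_t *)arg += value;
}
```

Storing the object pointer in the epoll data field is what lets the wait
path recover both the user data and the callback from a bare epoll event.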

Signed-off-by: Cunming Liang 
---
v14 changes
 - per-patch basis ABI compatibility rework
 - remove unnecessary 'local: *' from version map

v13 changes
 - version map cleanup for v2.1

v11 changes
 - cleanup spelling error

v9 changes
 - rework on coding style

v8 changes
 - support safe event deletion during the wakeup execution
 - add EINTR process during epoll_wait

v7 changes
 - split v6[4/8] into two patches, one for epoll event(this one)
   another for rx intr(next patch)
 - introduce rte_epoll_event definition
 - rte_epoll_wait/ctl for more generic RTE epoll API

v6 changes
 - split rte_intr_wait_rx_pkt into two functions, wait and set.
 - rewrite rte_intr_rx_wait/rte_intr_rx_set to remove queue visibility on eal.
 - rte_intr_rx_wait to support multiplexing.
 - allow epfd as input to support flexible event fd combination.

 lib/librte_eal/linuxapp/eal/eal_interrupts.c   | 139 +
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |  80 
 lib/librte_eal/linuxapp/eal/rte_eal_version.map|   3 +
 3 files changed, 222 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index b5f369e..5fe5b99 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -69,6 +69,8 @@

 #define EAL_INTR_EPOLL_WAIT_FOREVER (-1)

+static RTE_DEFINE_PER_LCORE(int, _epfd) = -1; /**< epoll fd per thread */
+
 /**
  * union for pipe fds.
  */
@@ -894,3 +896,140 @@ rte_eal_intr_init(void)

return -ret;
 }
+
+static int
+eal_epoll_process_event(struct epoll_event *evs, unsigned int n,
+   struct rte_epoll_event *events)
+{
+   unsigned int i, count = 0;
+   struct rte_epoll_event *rev;
+
+   for (i = 0; i < n; i++) {
+   rev = evs[i].data.ptr;
+   if (!rev || !rte_atomic32_cmpset(&rev->status, RTE_EPOLL_VALID,
+RTE_EPOLL_EXEC))
+   continue;
+
+   events[count].status= RTE_EPOLL_VALID;
+   events[count].fd= rev->fd;
+   events[count].epfd  = rev->epfd;
+   events[count].epdata.event  = rev->epdata.event;
+   events[count].epdata.data   = rev->epdata.data;
+   if (rev->epdata.cb_fun)
+   rev->epdata.cb_fun(rev->fd,
+  rev->epdata.cb_arg);
+
+   rte_compiler_barrier();
+   rev->status = RTE_EPOLL_VALID;
+   count++;
+   }
+   return count;
+}
+
+static inline int
+eal_init_tls_epfd(void)
+{
+   int pfd = epoll_create(255);
+
+   if (pfd < 0) {
+   RTE_LOG(ERR, EAL,
+   "Cannot create epoll instance\n");
+   return -1;
+   }
+   return pfd;
+}
+
+int
+rte_intr_tls_epfd(void)
+{
+   if (RTE_PER_LCORE(_epfd) == -1)
+   RTE_PER_LCORE(_epfd) = eal_init_tls_epfd();
+
+   return RTE_PER_LCORE(_epfd);
+}
+
+int
+rte_epoll_wait(int epfd, struct rte_epoll_event *events,
+  int maxevents, int timeout)
+{
+   struct epoll_event evs[maxevents];
+   int rc;
+
+   if (!events) {
+   RTE_LOG(ERR, EAL, "rte_epoll_event can't be NULL\n");
+   return -1;
+   }
+
+   /* using per thread epoll fd */
+   if (epfd == RTE_EPOLL_PER_THREAD)
+   epfd = rte_intr_tls_epfd();
+
+   while (1) {
+   rc = epoll_wait(epfd, evs, maxevents, timeout);
+   if (likely(rc > 0)) {
+   /* epoll_wait has at least one fd ready to read */
+   rc = eal_epoll_process_event(evs, rc, events);
+   break;
+   } else if (rc < 0) {
+   if (errno == EINTR)
+   continue;
+   /* epoll_wait fail */
+   RTE_LOG(ERR, EAL, "epoll_wait returns with fail %s\n",
+   strerror(errno));
+   rc = -1;
+   break;
+   }
+   }
+
+   return rc;
+}
+
+static inline void
+eal_epoll_data_safe_free(struct rte_epoll_event *ev)
+{
+   while (!rte_atomic32_cmpset(&ev->status, RTE_EPOLL_VALID,
+   RTE_EPOLL_INVALID))
+   while (ev->status != RTE_EPOLL_VALID)
+   rte_pause();
+   memset(&ev->epdata, 0, sizeof(ev->epdata));
+   ev->fd = -1;
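
The status handshake used by the two functions above (claim an event with a compare-and-set before running its callback, and spin in the free path until the event is back in the valid state) can be sketched with C11 atomics. This is only an illustration under stated assumptions: `struct ev_slot`, the `EV_*` constants and the three helpers are hypothetical names, not part of the DPDK API, and the numeric state values are chosen arbitrarily.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical states; the patch uses RTE_EPOLL_INVALID/VALID/EXEC. */
enum { EV_INVALID = 0, EV_VALID = 1, EV_EXEC = 2 };

struct ev_slot {
	atomic_int status;
	int fd;
};

/* Claim the slot for callback execution; fails unless it is VALID. */
static bool
ev_try_begin(struct ev_slot *s)
{
	int expected = EV_VALID;

	return atomic_compare_exchange_strong(&s->status, &expected, EV_EXEC);
}

/* Release the slot after the callback has run. */
static void
ev_end(struct ev_slot *s)
{
	atomic_store(&s->status, EV_VALID);
}

/*
 * Retire the slot. Spins while another thread holds it in EXEC, so the
 * data is never cleared underneath a running callback. Like the patch,
 * this never returns if the slot is already INVALID.
 */
static void
ev_safe_free(struct ev_slot *s)
{
	int expected = EV_VALID;

	while (!atomic_compare_exchange_weak(&s->status, &expected, EV_INVALID))
		expected = EV_VALID; /* a failed exchange overwrote it */
	s->fd = -1;
}
```

A slot that has been retired can no longer be claimed, which is what makes deleting an event safe while a wakeup is in flight.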
+

[dpdk-dev] [PATCH v14 00/13] Interrupt mode PMD

2015-07-17 Thread Cunming Liang
v14 changes
 - per-patch basis ABI compatibility rework
 - remove unnecessary 'local: *' from version map
 - minor comments rework

v13 changes
 - version map cleanup for v2.1
 - replace RTE_EAL_RX_INTR by RTE_NEXT_ABI for ABI compatibility

Patch series v12
Acked-by: Stephen Hemminger 
Acked-by: Danny Zhou 

v12 changes
 - bsd cleanup for unused variable warning
 - fix awkward line split in debug message

v11 changes
 - typo cleanup and check kernel style

v10 changes
 - code rework to return actual error code
 - bug fix for lsc when using uio_pci_generic

v9 changes
 - code rework to fix open comment
 - bug fix for igb lsc when both lsc and rxq are enabled in vfio-msix
 - new patch to turn off the feature by default so as to avoid breaking the v2.1 ABI

v8 changes
 - remove condition check for only vfio-msix
 - add multiplex intr support when only one intr vector allowed
 - lsc and rxq interrupt runtime enable decision
 - add safe event delete while the event wakeup execution happens

v7 changes
 - decouple epoll event and intr operation
 - add condition check in the case intr vector is disabled
 - renaming some APIs

v6 changes
 - split rte_intr_wait_rx_pkt into two APIs 'wait' and 'set'.
 - rewrite rte_intr_rx_wait/rte_intr_rx_set.
 - using vector number instead of queue_id as interrupt API params.
 - patch reorder and split.

v5 changes
 - Rebase the patchset onto the HEAD
 - Isolate ethdev from EAL for new-added wait-for-rx interrupt function
 - Export wait-for-rx interrupt function for shared libraries
 - Split-off a new patch file for changed struct rte_intr_handle that
   other patches depend on, to avoid breaking git bisect
 - Change sample application to accommodate EAL function spec change
   accordingly

v4 changes
 - Export interrupt enable/disable functions for shared libraries
 - Adjust position of new-added structure fields and functions to
   avoid breaking ABI

v3 changes
 - Add return value for interrupt enable/disable functions
 - Move spinlock from PMD to L3fwd-power
 - Remove unnecessary variables in e1000_mac_info
 - Fix miscellaneous review comments

v2 changes
 - Fix compilation issue in Makefile for missed header file.
 - Consolidate internal and community review comments of v1 patch set.

This patch series introduces low-latency one-shot rx interrupts into DPDK, with
an example of polling and interrupt mode switch control.

The DPDK userspace interrupt notification and handling mechanism is based on UIO
and has the following limitations:
1) It is designed to handle the LSC interrupt only, with an inefficient suspended
   pthread wakeup procedure (e.g. UIO wakes up the LSC interrupt handling thread,
   which then wakes up the DPDK polling thread). This introduces
   non-deterministic wakeup latency for the DPDK polling thread, as well as packet
   latency if it is used to handle Rx interrupts.
2) UIO only supports a single interrupt vector, which has to be shared by the
   LSC interrupt and the interrupts assigned to dedicated rx queues.

This patchset includes the following features:
1) Enable one-shot rx queue interrupts in the ixgbe PMD (PF & VF) and the igb PMD
   (PF only).
2) Build on top of the VFIO mechanism instead of UIO, so it can support
   up to 64 interrupt vectors for rx queue interrupts.
3) Have one DPDK polling thread handle each Rx queue interrupt with a dedicated
   VFIO eventfd, which eliminates non-deterministic pthread wakeup latency in
   user space.
4) Demonstrate interrupt control APIs and userspace NAPI-like polling/interrupt
   switch algorithms in the L3fwd-power example.
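
The wakeup path in feature 3 (one thread blocking on a dedicated eventfd through epoll) can be sketched with plain Linux system calls. This is a minimal sketch, not the code from this series: `wait_retry_eintr` and `demo_one_shot_wakeup` are hypothetical names, and the write to the eventfd merely stands in for the kernel signalling an rx interrupt through VFIO.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Wait on an epoll instance, retrying when a signal interrupts the call.
 * Returns the number of ready fds, 0 on timeout, or -1 on error.
 */
static int
wait_retry_eintr(int epfd, struct epoll_event *evs, int maxevents,
		 int timeout_ms)
{
	int rc;

	do {
		rc = epoll_wait(epfd, evs, maxevents, timeout_ms);
	} while (rc < 0 && errno == EINTR);

	return rc;
}

/*
 * Register one eventfd, signal it once, and wait for the wakeup.
 * Returns the eventfd counter that was read (1 here), or -1 on failure.
 */
static int
demo_one_shot_wakeup(void)
{
	struct epoll_event ev = { .events = EPOLLIN };
	struct epoll_event out;
	uint64_t cnt = 1;
	int efd, epfd, rc = -1;

	efd = eventfd(0, EFD_NONBLOCK);
	epfd = epoll_create1(0);
	if (efd < 0 || epfd < 0)
		goto done;

	ev.data.fd = efd;
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev) < 0)
		goto done;

	/* Stand-in for the kernel signalling an rx interrupt. */
	if (write(efd, &cnt, sizeof(cnt)) != (ssize_t)sizeof(cnt))
		goto done;

	if (wait_retry_eintr(epfd, &out, 1, 1000) != 1)
		goto done;

	/* Reading the counter clears the pending event. */
	if (read(out.data.fd, &cnt, sizeof(cnt)) != (ssize_t)sizeof(cnt))
		goto done;
	rc = (int)cnt;
done:
	if (efd >= 0)
		close(efd);
	if (epfd >= 0)
		close(epfd);
	return rc;
}
```

In a polling/interrupt switch, a thread would typically poll while traffic is present and fall back to a blocking wait like this one only after the queue has stayed idle for some time.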

Known limitations:
1) It does not work for UIO, because a single interrupt eventfd shared by the LSC
   and rx queue interrupt handlers causes a mess. [FIXED]
2) LSC interrupt is not supported by the VF driver, so it is disabled by default
   in L3fwd-power now. Feel free to turn it on if you want to support both LSC
   and rx queue interrupts on a PF.

Cunming Liang (13):
  eal/linux: add interrupt vectors support in intr_handle
  eal/linux: add rte_epoll_wait/ctl support
  eal/linux: add API to set rx interrupt event monitor
  eal/linux: fix comments typo on vfio msi
  eal/linux: map eventfd to VFIO MSI-X intr vector
  eal/linux: standalone intr event fd create support
  eal/linux: fix lsc read error in uio_pci_generic
  eal/bsd: dummy for new intr definition
  eal/bsd: fix inappropriate linuxapp referred in bsd
  ethdev: add rx intr enable, disable and ctl functions
  ixgbe: enable rx queue interrupts for both PF and VF
  igb: enable rx queue interrupts for PF
  l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode switch

 drivers/net/e1000/igb_ethdev.c | 311 ++--
 drivers/net/ixgbe/ixgbe_ethdev.c   | 527 -
 drivers/net/ixgbe/ixgbe_ethdev.h   |   4 +
 examples/l3fwd-power/main.c| 202 ++--
 lib/librte_eal/bsdapp/eal/eal_interrupts.c |  28 ++
 .../bsdapp/eal/include/exec-env/rte_interrupts.h   |  91 +++-
 

[dpdk-dev] jumbo frame support for 82583V

2015-07-17 Thread Klaus Degner
Hi Wenzhuo,

We are testing different Intel NICs for DPDK. We have tested the master
branch, which supports the Intel 82583V chip.
It works very well, except that we can only capture packets of up to
1518 bytes.
We have debugged this, and the size is restricted in the initialization:

http://dpdk.org/browse/dpdk/tree/drivers/net/e1000/em_ethdev.c#n855

Without DPDK, the Linux driver supports an MTU of up to 9k, and ark.intel
reports that this chip is jumbo frame capable:

http://ark.intel.com/de/products/41676/Intel-82583V-Gigabit-Ethernet-Controller

Is there any specific reason why DPDK cannot use jumbo frames for this NIC?

Thanks for your help!

Klaus


[dpdk-dev] [PATCH] virtio: fix the vq size issue

2015-07-17 Thread Thomas Monjalon
Stephen,

This patch is partially reverting yours:
http://dpdk.org/browse/dpdk/commit/?id=d78deadae4dca240

A comment or an ack?

2015-07-07 02:32, Ouyang, Changchun:
> 
> > -Original Message-
> > From: Ouyang, Changchun
> > Sent: Wednesday, July 1, 2015 3:49 PM
> > To: dev at dpdk.org
> > Cc: Cao, Waterman; Xu, Qian Q; Ouyang, Changchun
> > Subject: [PATCH] virtio: fix the vq size issue
> > 
> > This commit breaks virtio basic packets rx functionality:
> >   d78deadae4dca240e85054bf2d604a801676becc
> > 
> > QEMU uses 256 as the default vring size, and also uses this default value to
> > calculate the virtio avail ring base address and used ring base address; vhost
> > in the backend uses the ring base addresses to do packet IO.
> > 
> > The virtio spec also says the queue size in PCI configuration is read-only, so
> > the virtio front end can't change it. It just needs to use the read-only value
> > to allocate space for the vring and calculate the avail and used ring base
> > addresses. Otherwise, the avail and used ring base addresses will differ
> > between host and guest, and packet IO can't work normally.
> > 
> > Signed-off-by: Changchun Ouyang 
> > ---
> >  drivers/net/virtio/virtio_ethdev.c | 14 +++---
> >  1 file changed, 3 insertions(+), 11 deletions(-)
> > 
> > diff --git a/drivers/net/virtio/virtio_ethdev.c
> > b/drivers/net/virtio/virtio_ethdev.c
> > index fe5f9a1..d84de13 100644
> > --- a/drivers/net/virtio/virtio_ethdev.c
> > +++ b/drivers/net/virtio/virtio_ethdev.c
> > @@ -263,8 +263,6 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
> >  */
> > vq_size = VIRTIO_READ_REG_2(hw, VIRTIO_PCI_QUEUE_NUM);
> > PMD_INIT_LOG(DEBUG, "vq_size: %d nb_desc:%d", vq_size,
> > nb_desc);
> > -   if (nb_desc == 0)
> > -   nb_desc = vq_size;
> > if (vq_size == 0) {
> > PMD_INIT_LOG(ERR, "%s: virtqueue does not exist",
> > __func__);
> > return -EINVAL;
> > @@ -275,15 +273,9 @@ int virtio_dev_queue_setup(struct rte_eth_dev
> > *dev,
> > return -EINVAL;
> > }
> > 
> > -   if (nb_desc < vq_size) {
> > -   if (!rte_is_power_of_2(nb_desc)) {
> > -   PMD_INIT_LOG(ERR,
> > -"nb_desc(%u) size is not powerof 2",
> > -nb_desc);
> > -   return -EINVAL;
> > -   }
> > -   vq_size = nb_desc;
> > -   }
> > +   if (nb_desc != vq_size)
> > +   PMD_INIT_LOG(ERR, "Warning: nb_desc(%d) is not equal to
> > vq size (%d), fall to vq size",
> > +   nb_desc, vq_size);
> > 
> > if (queue_type == VTNET_RQ) {
> > snprintf(vq_name, sizeof(vq_name), "port%d_rvq%d",
> > --
> > 1.8.4.2
> 
> Any more comments for this patch?
> 
> Thanks
> Changchun
> 




[dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline

2015-07-17 Thread Singh, Jasvinder


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 6:08 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline
> 
> 
> Signed-off-by: Cristian Dumitrescu 
> ---

Acked-by: Jasvinder Singh 


[dpdk-dev] [PATCH] doc: announce ABI change for librte_table

2015-07-17 Thread Singh, Jasvinder


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 6:00 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_table
> 
> 
> Signed-off-by: Cristian Dumitrescu 
> ---

Acked-by: Jasvinder Singh 


[dpdk-dev] ACL Libraries

2015-07-17 Thread Sugumaran, Varthamanan
Hi,
I am exploring the librte_acl library. I have a query on the number of tries
per context when we build an ACL context.

1.  Can we have more than one trie per context? Though it is mentioned
that we can have 8 tries (RTE_ACL_MAX_TRIES), what is the use case for
having more than one trie?

2.  If we allow more than one trie per context, is it possible to have a
separate trie for each category?

Thanks in advance.

Thanks
Vartha


[dpdk-dev] [PATCH v2] Fix the endian issue for the i40e read registers functions

2015-07-17 Thread Zhe Tao
When the i40e NIC is used with a big endian Power CPU,
the current i40e register operations cause a problem,
because the i40e registers are little endian, which is inconsistent with
a big endian CPU. Add the conversion to handle the inconsistency.

Signed-off-by: Zhe Tao 
---
PATCH v2: Edit the comments to make them clearer

PATCH v1: Add the endian conversion for registers operations.

 drivers/net/i40e/base/i40e_osdep.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/i40e/base/i40e_osdep.h 
b/drivers/net/i40e/base/i40e_osdep.h
index 3ce8057..70d2721 100644
--- a/drivers/net/i40e/base/i40e_osdep.h
+++ b/drivers/net/i40e/base/i40e_osdep.h
@@ -122,10 +122,10 @@ do { \
((volatile uint32_t *)((char *)(a)->hw_addr + (reg)))
 static inline uint32_t i40e_read_addr(volatile void *addr)
 {
-   return I40E_PCI_REG(addr);
+   return rte_le_to_cpu_32(I40E_PCI_REG(addr));
 }
 #define I40E_PCI_REG_WRITE(reg, value) \
-   do {I40E_PCI_REG((reg)) = (value);} while(0)
+   do { I40E_PCI_REG((reg)) = rte_cpu_to_le_32(value); } while (0)

 #define I40E_WRITE_FLUSH(a) I40E_READ_REG(a, I40E_GLGEN_STAT)
 #define I40EVF_WRITE_FLUSH(a) I40E_READ_REG(a, I40E_VFGEN_RSTAT)
-- 
1.9.3
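
The conversion added by this patch can be illustrated without DPDK. The sketch below is an assumption-labelled stand-in for `rte_le_to_cpu_32()` and `rte_cpu_to_le_32()`: the helper names `le32_to_host` and `host_to_le32` are hypothetical, and the code builds the value from individual bytes so that it gives the same result on little endian and big endian hosts.

```c
#include <stdint.h>
#include <string.h>

/*
 * Interpret a raw 32-bit word, as read from a little endian register,
 * independently of the host byte order.
 */
static uint32_t
le32_to_host(uint32_t raw)
{
	uint8_t b[4];

	memcpy(b, &raw, sizeof(b));
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Lay out a host value in little endian byte order before writing it. */
static uint32_t
host_to_le32(uint32_t value)
{
	uint8_t b[4] = {
		(uint8_t)(value & 0xff),
		(uint8_t)((value >> 8) & 0xff),
		(uint8_t)((value >> 16) & 0xff),
		(uint8_t)(value >> 24),
	};
	uint32_t raw;

	memcpy(&raw, b, sizeof(raw));
	return raw;
}
```

On a little endian host both helpers return their argument unchanged; on a big endian host they swap the bytes.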



[dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code

2015-07-17 Thread Mcnamara, John
> -Original Message-
> From: Neil Horman [mailto:nhorman at tuxdriver.com]
> Sent: Monday, July 13, 2015 3:00 PM
> To: Mcnamara, John
> Cc: dev at dpdk.org; vladz at cloudius-systems.com
> Subject: Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
> 
> > > > -   dev_started : 1;   /**< Device state: STARTED(1) / STOPPED(0). */
> > > > +   dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
> > > > +   lro : 1;   /**< RX LRO is ON(1) / OFF(0) */
> > > >
> > > >
> >
> Thank you, I'll ack as soon as Chao confirms it's not a problem on ppc
> Neil

Hi,

Just pinging Chao Zhu on this again so that it isn't forgotten.

Neil, just to be clear, are you looking for a validate-abi.sh check on PPC?

Just for context, the lro flag doesn't seem to be used anywhere that would be 
affected by endianness:

$ ag -w "\->lro" 
drivers/net/ixgbe/ixgbe_rxtx.c
3767:   if (dev->data->lro) {
3967:   dev->data->lro = 1;

drivers/net/ixgbe/ixgbe_ethdev.c
1689:   dev->data->lro = 0;

John.
-- 



[dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library with cuckoo hash implementation

2015-07-17 Thread Tony Lu
Hi, Pablo

>-Original Message-
>From: De Lara Guarch, Pablo [mailto:pablo.de.lara.guarch at intel.com]
>Sent: Friday, July 17, 2015 4:42 AM
>To: Tony Lu; dev at dpdk.org
>Subject: RE: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library with
>cuckoo hash implementation
>
>Hi Tony,
>
>> -Original Message-
>> From: Tony Lu [mailto:zlu at ezchip.com]
>> Sent: Thursday, July 16, 2015 10:40 AM
>> To: De Lara Guarch, Pablo; dev at dpdk.org
>> Subject: RE: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash
>> library with cuckoo hash implementation
>>
>> >diff --git a/lib/librte_hash/rte_cuckoo_hash.c
>> b/lib/librte_hash/rte_cuckoo_hash.c
>> >new file mode 100644
>> >index 000..50e3acd
>> >--- /dev/null
>> >+++ b/lib/librte_hash/rte_cuckoo_hash.c
>> >@@ -0,0 +1,1027 @@
>> ...
>> >+
>> >+/* Functions to compare multiple of 16 byte keys (up to 128 bytes) */
>> >+static int
>> >+rte_hash_k16_cmp_eq(const void *key1, const void *key2, size_t key_len __rte_unused)
>> >+{
>> >+   const __m128i k1 = _mm_loadu_si128((const __m128i *) key1);
>> >+   const __m128i k2 = _mm_loadu_si128((const __m128i *) key2);
>> >+   const __m128i x = _mm_xor_si128(k1, k2);
>> >+
>> >+   return !_mm_test_all_zeros(x, x);
>> >+}
>> ...
>>
>> When compiling the latest dev DPDK for a non-x86 arch, it fails on the
>> above code, as the SSE intrinsics are x86-specific and defined in an
>> x86-only header. Is it possible to replace this function with
>> platform-independent code?
>
>Thanks for spotting this. I just sent a patch that should fix the problem.
>Can you check if it works?

Thanks for your quick response, but __m128i and all the _mm_ related
functions are x86-specific and defined in an x86-only header. This header
file is only available in the x86 compiler library; non-x86 archs do not
have this file. So if we can replace all the x86-specific code in the above
function, that would be great.

Thanks
-Tony


>Thanks,
>Pablo
>>
>> Thanks
>> -Zhigang Lu




[dpdk-dev] ethdev cleanup following hotplug changes

2015-07-17 Thread Thomas Monjalon
Hi Tetsuya,

Any news about this comment in ethdev?

 * TODO:
 * rte_eal_vdev_init() should return port_id,
 * And rte_eth_dev_save() and rte_eth_dev_get_changed_port()
 * should be removed. */

http://dpdk.org/browse/dpdk/tree/lib/librte_ether/rte_ethdev.c#n618


[dpdk-dev] igb PMD should set the default tx wthresh correctly.

2015-07-17 Thread Wiles, Keith


On 7/17/15, 9:15 AM, "dev on behalf of Thomas Monjalon"
 wrote:

>2015-07-16 19:49, Stephen Hemminger:
>> On Fri, 17 Jul 2015 00:52:09 +
>> "Lu, Wenzhuo"  wrote:
>> 
>> > Hi Stephen,
>> > I don't think there's a conflict. The message just reminds us that
>> > we can adjust the values to achieve better performance.
>> > I saw ixgbe and i40e also use 0 as the same default value. In my
>> > opinion, it's good to keep the same behavior.
>> > Thanks.
>> 
>> In my opinion, no application should have to make special case setup
>> for each device type. Having to have a table that lists all the
>> parameters for each device name is not supportable or scalable.
>> 
>> The DPDK started out as "lets do benchmarks fast" but as a production
>> toolkit it needs to stop having this kind of thing.
>> 
>> The message shows up to the end-user, who thinks it is a driver bug.
>> The "us" is now real customers not DPDK developers.
>
>+1 to have better default values and less scary messages.
+1 I agree we should have default values. The scary message is for
debugging only in the best of cases and just wrong for the normal case.
>



[dpdk-dev] [ovs-discuss] ovs-dpdk performance is not good

2015-07-17 Thread Ouyang, Changchun


On 7/16/2015 9:45 PM, Traynor, Kevin wrote:
>
> (re-adding the ovs-discuss list)
>
> This might be better on the dpdk dev mailing list. For the OVS part, 
> see this thread 
> http://openvswitch.org/pipermail/discuss/2015-July/018095.html
>
> Kevin.
>
> *From:*Na Zhu [mailto:zhunatuzi at gmail.com]
> *Sent:* Wednesday, July 15, 2015 6:16 AM
> *To:* Traynor, Kevin
> *Subject:* Re: [ovs-discuss] ovs-dpdk performance is not good
>
> Hi Kevin,
>
> The interface MTU is 1500, the TCP message size is 16384 and the UDP 
> message size is 65507.
>
> How to use DPDK virtio PMD?
>
The DPDK virtio PMD uses the mergeable feature to support jumbo frames.
The mergeable feature needs to be negotiated with vhost on the backend,
so if OVS enables the mergeable feature and virtio succeeds in
negotiating it, then jumbo frames can be supported.

thanks
Changchun

> 2015-07-14 20:25 GMT+08:00 Traynor, Kevin:
>
> *From:*discuss [mailto:discuss-bounces at openvswitch.org
> ] *On Behalf Of *Na Zhu
> *Sent:* Monday, July 13, 2015 3:15 AM
> *To:* bugs at openvswitch.org 
> *Subject:* [ovs-discuss] ovs-dpdk performance is not good
>
> Dear all,
>
> I want to use ovs-dpdk to improve my nfv performance. But when i
> compare the throughput between standard ovs and ovs-dpdk, the ovs
> is better, does anyone know why?
>
> I use netperf to test the throughput.
>
> use vhost-net to test standard ovs.
>
> use vhost-user to test ovs-dpdk.
>
> My topology is as follows:
>
> [image 1 not preserved in the archive]
>
> The result is that standard ovs performance is better. Throughput
> unit Mbps.
>
> [images 2 and 3 not preserved in the archive]
>
> [kt] I would check your core affinitization to ensure that the vswitchd
> pmd is on a separate core to the vCPUs (set with
> other_config:pmd-cpu-mask).
>
> Also, this test is not using the DPDK virtio PMD in the guest, which
> provides performance gains.
>
> What packet sizes are you using? You should see a greater gain from DPDK
> at lower packet sizes (i.e. more PPS).
>



[dpdk-dev] [dpdk-virtio] Performance tuning for dpdk with virtio?

2015-07-17 Thread Clarylin L
I am running DPDK in a virtual guest as an L2 forwarder.

If the virtual guest uses passthrough, DPDK can achieve around 10G
throughput. However, if the virtual guest uses virtio, DPDK achieves just
150M throughput, which is a huge degradation. Any idea what could be the cause
of such poor performance on virtio? And are there any performance tuning
techniques I could try? Thanks a lot!

lab at vpc-2:~$ ps aux | grep qemu

libvirt+ 12020  228  0.0 102832508 52860 ? Sl   14:54  61:06
qemu-system-x86_64
-enable-kvm -name dpdk-perftest -S -machine
pc-i440fx-trusty,accel=kvm,usb=off,mem-merge=off -cpu host -m 98304
-mem-prealloc -mem-path /dev/hugepages/libvirt/qemu -realtime mlock=off
-smp 24,sockets=2,cores=12,threads=1 -numa
node,nodeid=0,cpus=0-11,mem=49152 -numa node,nodeid=1,cpus=12-23,mem=49152
-uuid eb5f8848-9983-4f13-983c-e3bd4c59387d -no-user-config -nodefaults
-chardev 
socket,id=charmonitor,path=/var/lib/libvirt/qemu/dpdk-perftest.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown
-boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive
file=/var/lib/libvirt/images/dpdk-perftest-hda.img,if=none,id=drive-ide0-0-0,format=qcow2
-device
ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -drive
file=/var/lib/libvirt/images/dpdk-perftest-hdb.img,if=none,id=drive-ide0-0-1,format=qcow2
-device ide-hd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 -drive
if=none,id=drive-ide0-1-0,readonly=on,format=raw -device
ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=2
-netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=25 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:45:ff:5e,bus=pci.0,addr=0x5
-netdev
tap,fds=26:27:28:29:30:31:32:33,id=hostnet1,vhost=on,vhostfds=34:35:36:37:38:39:40:41
-device
virtio-net-pci,mq=on,vectors=17,netdev=hostnet1,id=net1,mac=52:54:00:7e:b5:6b,bus=pci.0,addr=0x6
-netdev
tap,fds=42:43:44:45:46:47:48:49,id=hostnet2,vhost=on,vhostfds=50:51:52:53:54:55:56:57
-device
virtio-net-pci,mq=on,vectors=17,netdev=hostnet2,id=net2,mac=52:54:00:f1:a5:20,bus=pci.0,addr=0x7
-chardev pty,id=charserial0 -device
isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1
-device isa-serial,chardev=charserial1,id=serial1 -vnc 127.0.0.1:0 -device
cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device
i6300esb,id=watchdog0,bus=pci.0,addr=0x3 -watchdog-action reset -device
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4


[dpdk-dev] [PATCH v2] hash: fix compilation for non-x86 systems

2015-07-17 Thread Pablo de Lara
From: "Pablo de Lara" 

Hash library uses optimized compare functions that use
x86 intrinsics, therefore non-x86 systems could not build
the library. In that case, the compare function is set
to the generic memcmp.

Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation")

Reported-by: Tony Lu 
Signed-off-by: Pablo de Lara 
---
Changes in v2:
- Renamed new file rte_cmp_fns.h to rte_cmp_x86.h
- Removed blank line

 lib/librte_hash/rte_cmp_x86.h | 109 ++
 lib/librte_hash/rte_cuckoo_hash.c |  96 -
 2 files changed, 120 insertions(+), 85 deletions(-)
 create mode 100644 lib/librte_hash/rte_cmp_x86.h

diff --git a/lib/librte_hash/rte_cmp_x86.h b/lib/librte_hash/rte_cmp_x86.h
new file mode 100644
index 000..7f79bac
--- /dev/null
+++ b/lib/librte_hash/rte_cmp_x86.h
@@ -0,0 +1,109 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/* Functions to compare multiple of 16 byte keys (up to 128 bytes) */
+static int
+rte_hash_k16_cmp_eq(const void *key1, const void *key2, size_t key_len __rte_unused)
+{
+   const __m128i k1 = _mm_loadu_si128((const __m128i *) key1);
+   const __m128i k2 = _mm_loadu_si128((const __m128i *) key2);
+#ifdef RTE_MACHINE_CPUFLAG_SSE4_1
+   const __m128i x = _mm_xor_si128(k1, k2);
+
+   return !_mm_test_all_zeros(x, x);
+#else
+   const __m128i x = _mm_cmpeq_epi32(k1, k2);
+
+   return (_mm_movemask_epi8(x) != 0xffff);
+#endif
+}
+
+static int
+rte_hash_k32_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+   return rte_hash_k16_cmp_eq(key1, key2, key_len) ||
+   rte_hash_k16_cmp_eq((const char *) key1 + 16,
+   (const char *) key2 + 16, key_len);
+}
+
+static int
+rte_hash_k48_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+   return rte_hash_k16_cmp_eq(key1, key2, key_len) ||
+   rte_hash_k16_cmp_eq((const char *) key1 + 16,
+   (const char *) key2 + 16, key_len) ||
+   rte_hash_k16_cmp_eq((const char *) key1 + 32,
+   (const char *) key2 + 32, key_len);
+}
+
+static int
+rte_hash_k64_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+   return rte_hash_k32_cmp_eq(key1, key2, key_len) ||
+   rte_hash_k32_cmp_eq((const char *) key1 + 32,
+   (const char *) key2 + 32, key_len);
+}
+
+static int
+rte_hash_k80_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+   return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
+   rte_hash_k16_cmp_eq((const char *) key1 + 64,
+   (const char *) key2 + 64, key_len);
+}
+
+static int
+rte_hash_k96_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+   return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
+   rte_hash_k32_cmp_eq((const char *) key1 + 64,
+   (const char *) key2 + 64, key_len);
+}
+
+static int
+rte_hash_k112_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+   return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
+   rte_hash_k32_cmp_eq((const char *) key1 + 64,
+   (const char *) key2 + 64, 

[dpdk-dev] [PATCH] virtio: fix the vq size issue

2015-07-17 Thread Stephen Hemminger
On Wed,  1 Jul 2015 15:48:50 +0800
Ouyang Changchun  wrote:

> This commit breaks virtio basic packets rx functionality:
>   d78deadae4dca240e85054bf2d604a801676becc
> 
> QEMU uses 256 as the default vring size, and also uses this default value to
> calculate the virtio avail ring base address and used ring base address; vhost
> in the backend uses the ring base addresses to do packet IO.
> 
> The virtio spec also says the queue size in PCI configuration is read-only, so
> the virtio front end can't change it. It just needs to use the read-only value
> to allocate space for the vring and calculate the avail and used ring base
> addresses. Otherwise, the avail and used ring base addresses will differ
> between host and guest, and packet IO can't work normally.
> 
> Signed-off-by: Changchun Ouyang 
> ---
>  drivers/net/virtio/virtio_ethdev.c | 14 +++---
>  1 file changed, 3 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/net/virtio/virtio_ethdev.c 
> b/drivers/net/virtio/virtio_ethdev.c
> index fe5f9a1..d84de13 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -263,8 +263,6 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
>*/
>   vq_size = VIRTIO_READ_REG_2(hw, VIRTIO_PCI_QUEUE_NUM);
>   PMD_INIT_LOG(DEBUG, "vq_size: %d nb_desc:%d", vq_size, nb_desc);
> - if (nb_desc == 0)
> - nb_desc = vq_size;

The command queue is set up with nb_desc = 0.

>   if (vq_size == 0) {
>   PMD_INIT_LOG(ERR, "%s: virtqueue does not exist", __func__);
>   return -EINVAL;
> @@ -275,15 +273,9 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
>   return -EINVAL;
>   }
>  
> - if (nb_desc < vq_size) {
> - if (!rte_is_power_of_2(nb_desc)) {
> - PMD_INIT_LOG(ERR,
> -  "nb_desc(%u) size is not powerof 2",
> -  nb_desc);
> - return -EINVAL;
> - }
> - vq_size = nb_desc;
> - }
> + if (nb_desc != vq_size)
> + PMD_INIT_LOG(ERR, "Warning: nb_desc(%d) is not equal to vq size 
> (%d), fall to vq size",
> + nb_desc, vq_size);

Nack. This breaks on Google Compute Engine, where the vring size is 16K.

An application that wants to work on both QEMU and GCE will want to pass a
reasonable size and have the negotiation resolve to best value.

For example, vRouter passes 512 as Rx ring size.
On QEMU this gets rounded down to 256 and on GCE only 512 elements
are used.

This is what the Linux kernel virtio does.
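
The negotiation described above can be sketched as a small helper, modelled on the checks that the patch under review removes. This is a hedged illustration only: `pick_ring_size` is a hypothetical name, and returning 0 for an invalid request is a convention of this sketch rather than the driver's behaviour (the driver returns -EINVAL).

```c
#include <stdint.h>

/*
 * Resolve the ring size to use from the application's request (nb_desc)
 * and the size advertised by the device (dev_size).
 * Returns 0 when the queue does not exist or the request is invalid.
 */
static uint16_t
pick_ring_size(uint16_t nb_desc, uint16_t dev_size)
{
	if (dev_size == 0)
		return 0;		/* virtqueue does not exist */
	if (nb_desc == 0 || nb_desc >= dev_size)
		return dev_size;	/* no request, or capped by the device */
	if ((nb_desc & (nb_desc - 1)) != 0)
		return 0;		/* a smaller ring must be a power of 2 */
	return nb_desc;
}
```

With a request of 512, a device advertising 256 entries resolves to 256, while a device advertising 16K entries resolves to 512, matching the QEMU and GCE cases in the message.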



[dpdk-dev] [PATCH v6 6/6] test-pmd: remove call to rte_eth_promiscuous_disable() from detach_port()

2015-07-17 Thread Xu, Qian Q
Bernard,
I applied the patchset and ran the vhost sample with both vhost-user and
vhost-cuse; no impact. Thx.

Thanks
Qian


-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Bernard Iremonger
Sent: Wednesday, July 15, 2015 9:51 PM
To: dev at dpdk.org
Subject: [dpdk-dev] [PATCH v6 6/6] test-pmd: remove call to rte_eth_promiscuous_disable() from detach_port()

At this point the stop() and close() functions have already been called.
The rte_eth_promiscuous_disable() function does not return on the VM.

Signed-off-by: Bernard Iremonger 
---
 app/test-pmd/testpmd.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 82b465d..4769533 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -1542,8 +1542,6 @@ detach_port(uint8_t port_id)
return;
}

-   rte_eth_promiscuous_disable(port_id);
-
if (rte_eth_dev_detach(port_id, name))
return;

--
1.9.1



[dpdk-dev] [PATCH] i40e: fix the VF rss issue when nb_rx_queue is less than nb_tx_queue

2015-07-17 Thread Xu, Qian Q
Tested-by: Qian Xu 

- Test Commit: 58d3da9eddad28012c16523aa0b5f63dae791bcb
- OS: Fedora 21
- GCC: gcc (GCC) 4.9.2 20141101 (Red Hat 4.9.2-1)
- CPU: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
- NIC: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+
- Target: x86_64-native-linuxapp-gcc
- Guest: Fedora 20/ 3.11 kernel
- Total 1 cases, 1 passed, 0 failed.

Test case: L3FWD-VF on VM
1.  The Fortville PF generates 2 VFs.
2.  Assign the VFs to the VM and launch the VM:
taskset -c 4-9  qemu-system-x86_64 \
-object memory-backend-file,id=mem,size=2048M,mem-path=/mnt/huge -mem-prealloc \
-enable-kvm -m 2048 -smp cores=6,sockets=1 -cpu host -name dpdk1-vm1 \
-drive file=/home/img/fc21-vm1.img \
-device pci-assign,host=03:02.0 \
-device pci-assign,host=03:02.1 \
-device pci-assign,host=05:02.0 \
-device pci-assign,host=05:02.1 \
-netdev tap,id=ipvm1,ifname=tap3,script=/etc/qemu-ifup -device 
rtl8139,netdev=ipvm1,id=net0,mac=00:00:00:00:00:01 \
-localtime -nographic
3.  Keep the PF bound to the DPDK driver; run testpmd but do not start it.
4.  Run l3fwd in the VM and send packets to the VFs; no packets are dropped.

Thanks
Qian


-Original Message-
From: Wu, Jingjing 
Sent: Monday, July 13, 2015 9:22 AM
To: dev at dpdk.org
Cc: Wu, Jingjing; Xu, Qian Q; Zhang, Helin
Subject: [PATCH] i40e: fix the VF rss issue when nb_rx_queue is less than nb_tx_queue

From: "jingjing.wu" 

The i40e VF driver uses num_queue_pairs in the vf structure to construct the
queue index lookup table. When nb_rx_queue is less than nb_tx_queue,
num_queue_pairs is equal to nb_tx_queue. This makes the table use invalid
queue indexes, so the application cannot poll packets on these queues.

This patch also moves the inline function i40e_align_floor from i40e_ethdev.c
to i40e_ethdev.h.

Signed-off-by: jingjing.wu 
---
 drivers/net/i40e/i40e_ethdev.c| 8 
 drivers/net/i40e/i40e_ethdev.h| 8 
 drivers/net/i40e/i40e_ethdev_vf.c | 6 +-
 3 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 5fb6b4c..051fd02 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -282,14 +282,6 @@ static struct eth_driver rte_i40e_pmd = {
 };

 static inline int
-i40e_align_floor(int n)
-{
-   if (n == 0)
-   return 0;
-   return (1 << (sizeof(n) * CHAR_BIT - 1 - __builtin_clz(n)));
-}
-
-static inline int
 rte_i40e_dev_atomic_read_link_status(struct rte_eth_dev *dev,
 struct rte_eth_link *link)
 {
diff --git a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h 
index 883ee06..6185657 100644
--- a/drivers/net/i40e/i40e_ethdev.h
+++ b/drivers/net/i40e/i40e_ethdev.h
@@ -563,6 +563,14 @@ i40e_init_adminq_parameter(struct i40e_hw *hw)
hw->aq.asq_buf_size = I40E_AQ_BUF_SZ;
 }

+static inline int
+i40e_align_floor(int n)
+{
+   if (n == 0)
+   return 0;
+   return 1 << (sizeof(n) * CHAR_BIT - 1 - __builtin_clz(n));
+}
+
 #define I40E_VALID_FLOW(flow_type) \
((flow_type) == RTE_ETH_FLOW_FRAG_IPV4 || \
(flow_type) == RTE_ETH_FLOW_NONFRAG_IPV4_TCP || \
diff --git a/drivers/net/i40e/i40e_ethdev_vf.c b/drivers/net/i40e/i40e_ethdev_vf.c
index b150b62..c4ce2cf 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -1481,6 +1481,8 @@ i40evf_rx_init(struct rte_eth_dev *dev)

i40evf_config_rss(vf);
for (i = 0; i < dev->data->nb_rx_queues; i++) {
+   if (!rxq[i] || !rxq[i]->q_set)
+   continue;
if (i40evf_rxq_init(dev, rxq[i]) < 0)
return -EFAULT;
}
@@ -1857,6 +1859,7 @@ i40evf_config_rss(struct i40e_vf *vf)
struct i40e_hw *hw = I40E_VF_TO_HW(vf);
struct rte_eth_rss_conf rss_conf;
uint32_t i, j, lut = 0, nb_q = (I40E_VFQF_HLUT_MAX_INDEX + 1) * 4;
+   uint16_t num;

if (vf->dev_data->dev_conf.rxmode.mq_mode != ETH_MQ_RX_RSS) {
i40evf_disable_rss(vf);
@@ -1864,9 +1867,10 @@ i40evf_config_rss(struct i40e_vf *vf)
return 0;
}

+   num = i40e_align_floor(vf->dev_data->nb_rx_queues);
/* Fill out the look up table */
for (i = 0, j = 0; i < nb_q; i++, j++) {
-   if (j >= vf->num_queue_pairs)
+   if (j >= num)
j = 0;
lut = (lut << 8) | j;
if ((i & 3) == 3)
--
2.4.0
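
For readers following the patch above, here is a minimal standalone sketch of the same idea: round the RX queue count down to a power of two, then fill the lookup table round-robin using only those queue indices. The names `align_floor` and `fill_lut`, and the flat byte-array table, are illustrative assumptions and not the driver's code; `__builtin_clz` is a GCC/Clang builtin.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Round n down to the nearest power of two; values <= 0 yield 0.
 * Assumes a GCC/Clang compiler (for __builtin_clz). */
static inline int
align_floor(int n)
{
	if (n <= 0)
		return 0;
	return 1 << ((int)(sizeof(n) * CHAR_BIT) - 1 -
		     __builtin_clz((unsigned int)n));
}

/* Fill lut[0..nb_entries-1] with queue indices 0..num-1 in round-robin
 * order, where num is nb_rx_queues rounded down to a power of two.
 * Hypothetical helper: one byte per entry instead of packed registers. */
static void
fill_lut(uint8_t *lut, size_t nb_entries, uint16_t nb_rx_queues)
{
	uint16_t num = (uint16_t)align_floor(nb_rx_queues);
	uint16_t j = 0;
	size_t i;

	for (i = 0; i < nb_entries; i++, j++) {
		if (j >= num)
			j = 0;
		lut[i] = (uint8_t)j;
	}
}
```

With 3 RX queues, for example, only queue indices 0 and 1 end up in the table, so no entry points at a queue the application never set up.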



[dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library with cuckoo hash implementation

2015-07-17 Thread De Lara Guarch, Pablo
Hi Tony,

> -Original Message-
> From: Tony Lu [mailto:zlu at ezchip.com]
> Sent: Friday, July 17, 2015 8:58 AM
> To: De Lara Guarch, Pablo; dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library
> with cuckoo hash implementation
> 
> >-Original Message-
> >From: De Lara Guarch, Pablo [mailto:pablo.de.lara.guarch at intel.com]
> >Sent: Friday, July 17, 2015 3:35 PM
> >To: Tony Lu; dev at dpdk.org
> >Subject: RE: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library
> with
> >cuckoo hash implementation
> >
> >
> >
> >> -Original Message-
> >> From: Tony Lu [mailto:zlu at ezchip.com]
> >> Sent: Friday, July 17, 2015 4:35 AM
> >> To: De Lara Guarch, Pablo; dev at dpdk.org
> >> Subject: RE: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash
> >> library with cuckoo hash implementation
> >>
> >> Hi, Pablo
> >>
> >> >-Original Message-
> >> >From: De Lara Guarch, Pablo [mailto:pablo.de.lara.guarch at intel.com]
> >> >Sent: Friday, July 17, 2015 4:42 AM
> >> >To: Tony Lu; dev at dpdk.org
> >> >Subject: RE: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash
> >> >library
> >> with
> >> >cuckoo hash implementation
> >> >
> >> >Hi Tony,
> >> >
> >> >> -Original Message-
> >> >> From: Tony Lu [mailto:zlu at ezchip.com]
> >> >> Sent: Thursday, July 16, 2015 10:40 AM
> >> >> To: De Lara Guarch, Pablo; dev at dpdk.org
> >> >> Subject: RE: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash
> >> >> library with cuckoo hash implementation
> >> >>
> >> >> >diff --git a/lib/librte_hash/rte_cuckoo_hash.c
> >> >> b/lib/librte_hash/rte_cuckoo_hash.c
> >> >> >new file mode 100644
> >> >> >index 000..50e3acd
> >> >> >--- /dev/null
> >> >> >+++ b/lib/librte_hash/rte_cuckoo_hash.c
> >> >> >@@ -0,0 +1,1027 @@
> >> >> ...
> >> >> >+
> >> >> >+/* Functions to compare multiple of 16 byte keys (up to 128 bytes) */
> >> >> >+static int
> >> >> >+rte_hash_k16_cmp_eq(const void *key1, const void *key2,
> >> >> >+  size_t key_len __rte_unused)
> >> >> >+{
> >> >> >+  const __m128i k1 = _mm_loadu_si128((const __m128i *) key1);
> >> >> >+  const __m128i k2 = _mm_loadu_si128((const __m128i *) key2);
> >> >> >+  const __m128i x = _mm_xor_si128(k1, k2);
> >> >> >+
> >> >> >+  return !_mm_test_all_zeros(x, x);
> >> >> >+}
> >> >> ...
> >> >>
> >> >> When compiling the latest dev DPDK for non-x86 arch, it fails on
> >> >> the above code, as the SSE is x86 specific defined in
> >> >> .  Is it possible to replace this function with platform
> >independent code?
> >> >
> >> >Thanks for spotting this. I just sent a patch that should fix the
> problem.
> >> >Can you check if it works?
> >>
> >> Thanks for your quick response, but __m128i and all the _mm_ related
> >> functions are X86 specific defined in .  This header file
> >> is only available in X86 compiler library, but no-X86 archs do not
> >> have this file.  So if we can replace all the X86 specific code in the
> >> above function, that would be great.
> >>
> >With the patch that I sent, if you are compiling for a non-x86 arch, you
> should
> >not have any problem, since all that code will only be used if using x86
> arch.
> >Have you tried compiling DPDK with the patch?
> 
> Yes, I have built DPDK with your patch, and got the following errors.
> This is because there are no __m128i, _mm_loadu_si128(), _mm_cmpeq_epi32()
> and _mm_movemask_epi8() on non-x86 arches.
> 
> == Build lib/librte_hash
>   CC rte_cuckoo_hash.o
> /u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c: In function
> 'rte_hash_k16_cmp_eq':
> /u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: error:
> expected '=', ',', ';', 'asm' or '__attribute__' before 'k1'
> /u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: error: 'k1'
> undeclared (first use in this function)
> /u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: error: (Each
> undeclared identifier is reported only once
> /u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: error: for
> each function it appears in.)
> /u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: warning:
> implicit declaration of function '_mm_loadu_si128'
> /u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: warning:
> nested extern declaration of '_mm_loadu_si128'
> /u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: error:
> expected ')' before '__m128i'
> /u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: warning:
> type defaults to 'int' in declaration of 'type name'
> /u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1126: warning:
> cast from pointer to integer of different size
> /u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1127: error:
> expected '=', ',', ';', 'asm' or '__attribute__' before 'k2'
> /u/zlu.bjg/git/dpdk.org/lib/librte_hash/rte_cuckoo_hash.c:1127: error: 'k2'
> undeclared (first use in this function)
> 
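
As an illustration of the fallback discussed in this thread (use the SSE compare only where the intrinsics exist, otherwise fall back to memcmp), here is a minimal sketch. The function name `key16_cmp` is hypothetical, and the guard macro `__SSE4_1__` is the one GCC and Clang predefine when SSE4.1 is enabled; the actual patch may select the implementation differently.

```c
#include <stddef.h>
#include <string.h>

#if defined(__SSE4_1__)
#include <smmintrin.h>	/* SSE4.1, provides _mm_test_all_zeros */

/* Returns 0 if the two 16-byte keys are equal, non-zero otherwise. */
static int
key16_cmp(const void *key1, const void *key2, size_t key_len)
{
	(void)key_len;	/* length is fixed at 16 bytes in this variant */
	const __m128i k1 = _mm_loadu_si128((const __m128i *)key1);
	const __m128i k2 = _mm_loadu_si128((const __m128i *)key2);
	const __m128i x = _mm_xor_si128(k1, k2);

	return !_mm_test_all_zeros(x, x);
}
#else
/* Generic fallback for targets without the x86 intrinsics. */
static int
key16_cmp(const void *key1, const void *key2, size_t key_len)
{
	return memcmp(key1, key2, key_len) != 0;
}
#endif
```

Because the x86-only types and headers sit entirely inside the preprocessor guard, a non-x86 compiler never sees `__m128i` or the `_mm_*` calls.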

[dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code

2015-07-17 Thread Neil Horman
On Fri, Jul 17, 2015 at 11:45:10AM +, Mcnamara, John wrote:
> > -Original Message-
> > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > Sent: Monday, July 13, 2015 3:00 PM
> > To: Mcnamara, John
> > Cc: dev at dpdk.org; vladz at cloudius-systems.com
> > Subject: Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
> > 
> > > > > - dev_started : 1;   /**< Device state: STARTED(1) / 
> > > > > STOPPED(0). */
> > > > > + dev_started : 1,   /**< Device state: STARTED(1) / 
> > > > > STOPPED(0). */
> > > > > + lro : 1;   /**< RX LRO is ON(1) / OFF(0) */
> > > > >
> > > > >
> > >
> > Thank you, I'll ack as soon as Chao confirms it's not a problem on ppc
> > Neil
> 
> Hi,
> 
> Just pinging Chao Zhu on this again so that it isn't forgotten.
> 
> Neil, just to be clear, are you looking for a validate-abi.sh check on PPC?
> 
Yes, correct.
> Just for context, the lro flag doesn't seem to be used anywhere that would be 
> affected by endianness:
> 
> $ ag -w "\->lro" 
> drivers/net/ixgbe/ixgbe_rxtx.c
> 3767:   if (dev->data->lro) {
> 3967:   dev->data->lro = 1;
> 
> drivers/net/ixgbe/ixgbe_ethdev.c
> 1689:   dev->data->lro = 0;
> 
But this data is visible to the outside application, correct?  If so, then we
can't rely on internal-only usage as a guide.  If it is only internally visible,
then yes, you are correct, endianness is not an issue.
Neil

> John.
> -- 
> 
> 
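
A small sketch of why appending the new bit is the safer change: if the new flag goes last, every existing flag keeps its position within the same storage unit. The struct names and the reduced field set below are hypothetical, and bit-field layout is implementation-defined, so this only checks the compiler and target it is built with; that is why a per-architecture ABI check is still useful.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced layout before the LRO flag existed. */
struct flags_before {
	uint8_t promiscuous  : 1,
		scattered_rx : 1,
		dev_started  : 1;
};

/* Same flags with the new bit appended at the end. */
struct flags_after {
	uint8_t promiscuous  : 1,
		scattered_rx : 1,
		dev_started  : 1,
		lro          : 1;
};

/* Returns 1 if an object written through the old layout reads back
 * with the same flag values through the new layout, 0 otherwise. */
static int
layout_is_compatible(void)
{
	struct flags_before old;
	struct flags_after cur;

	if (sizeof(old) != sizeof(cur))
		return 0;

	memset(&old, 0, sizeof(old));
	old.scattered_rx = 1;
	old.dev_started = 1;
	memcpy(&cur, &old, sizeof(cur));

	return cur.promiscuous == 0 && cur.scattered_rx == 1 &&
	       cur.dev_started == 1 && cur.lro == 0;
}
```

Inserting `lro` in the middle instead would shift `dev_started` by one bit, which is the kind of breakage the patch under discussion undoes.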


[dpdk-dev] How to get net_device and use struct ethtool_cmd at DPDK enverinment?

2015-07-17 Thread Choi, Sy Jong
HI Scott,

The KNI app effectively becomes your datapath. It is a sample app, so you can 
modify it.
It receives packets from the PMD and passes all of them from userspace to the 
kernel path.

You can run l2fwd or another app, but it has to use other PMD ports, 
not the ports used by KNI.
Otherwise you will need to modify the KNI app to distribute packets between the 
kernel path and your new datapath. Basically, that means combining the kni app 
and the l2fwd app in code.

Regards,
Choi, Sy Jong
Platform Application Engineer

From: "Scott.Jhuang (???) : 6309" [mailto:scott.jhu...@cas-well.com]
Sent: Friday, July 17, 2015 3:53 PM
To: Choi, Sy Jong; dev at dpdk.org; "Sandy.Liu (???) : 6817"; "Alan Yu (???) : 
6632"
Subject: Re: [dpdk-dev] How to get net_device and use struct ethtool_cmd at 
DPDK enverinment?

Hi Sy Jong,

If I use KNI in DPDK, can I run other applications at the same time (e.g. 
L2 forward, L3 forward)?

On 2015-07-15 18:01, Choi, Sy Jong wrote:
Hi Scott,

You will need to start the KNI sample app, which creates the vEth interface. 
Once the kni app is running, the interface will be there. The kni app is the 
datapath: it gets the packets into the kernel.

http://dpdk.org/doc/guides/prog_guide/kernel_nic_interface.html


  1.  Insert the KNI kernel module:

      insmod ./rte_kni.ko

      If using KNI in multi-thread mode, use the following command line:

      insmod ./rte_kni.ko kthread_mode=multiple

  2.  Run the KNI sample application:

      ./kni -c 0xf0 -n 4 -- -p 0x3 -P --config="(0,4,6),(1,5,7)"

      This command runs the kni sample application with two physical ports. Each 
      port pins two forwarding cores (ingress/egress) in user space.


Regards,
Choi, Sy Jong
Platform Application Engineer

From: "Scott.Jhuang (?? ?) : 6309" [mailto:scott.jhu...@cas-well.com]
Sent: Wednesday, July 15, 2015 5:54 PM
To: Choi, Sy Jong; dev at dpdk.org; "Sandy.Liu (?? ?) : 
6817"; "Alan Yu (?? ?) : 6632"
Subject: Re: [dpdk-dev] How to get net_device and use struct ethtool_cmd at 
DPDK enverinment?

Hi Sy Jong,

If I load the "rte_kni.ko" driver, the net_device structs will be initialized by 
KNI, right?
If so, how can I access these net_device structs from another driver?
Using the "for_each_netdev()" kernel API, I can't find the net_device 
structs that KNI initialized.
Or have these structs not been exported to the kernel?

On 2015-07-01 15:55, Choi, Sy Jong wrote:
Hi Scott,

Please refer to our KNI library at:-
dpdk-1.8.0\lib\librte_eal\linuxapp\kni\ethtool\igb\igb.h

Regards,
Choi, Sy Jong
Platform Application Engineer

From: "Scott.Jhuang (?? ?) : 6309" [mailto:scott.jhu...@cas-well.com]
Sent: Wednesday, July 01, 2015 2:44 PM
To: Choi, Sy Jong; dev at dpdk.org
Subject: Re: [dpdk-dev] How to get net_device and use struct ethtool_cmd at 
DPDK enverinment?

Hi Sy Jong,

Have any idea?

On 2015-06-23 21:24, Scott.Jhuang wrote:
Dear Sy Jong,

Yes, I have checked out DPDK KNI, but I still can't find how to prepare the 
net_device structure...
I also can't find how to get "ethtool_cmd.phy_address".
Could you let me know the path of the source code folder?

On 2015-06-19 10:35, Choi, Sy Jong wrote:
Hi Scott,

DPDK PMDs interface through rte_ethdev.c, which links to ixgbe_ethdev.c; 
there's no "net_device" in our code.

But if you search the DPDK code base, we have the KNI example to show you how to 
prepare the net_device structure.
Have you checked out our DPDK KNI code?

Regards,
Choi, Sy Jong
Platform Application Engineer

From: "Scott.Jhuang (? ? ?) : 6309" [mailto:scott.jhu...@cas-well.com]
Sent: Thursday, June 18, 2015 12:25 PM
To: Choi, Sy Jong; dev at dpdk.org
Subject: Re: [dpdk-dev] How to get net_device and use struct ethtool_cmd at 
DPDK enverinment?

Dear Sy Jong,

I'm planning to write a driver to get the net_device structures of all the 
Ethernet ports, because I need some information from them.
I also need to use the net_device struct's ethtool_cmd to get some information, 
e.g. ethtool_cmd.phy_address, net_device->ethtool_ops->get_settings.

In fact, I need some information from the net_device struct to access and control 
the PHY's link-up/down.
I used the igb driver as a reference to design the link-up/down functions, but 
the DPDK environment doesn't have the igb driver,
so in the DPDK environment I don't know how to get the network devices' net_device 
structs and the other information that the igb driver initializes.

On 2015-06-17 11:15, Choi, Sy Jong wrote:
Hi Scott,

You are right, KNI will be a good reference for you. It demonstrates how 
DPDK PMDs interface with the kernel.
May I know whether you are planning to build the interface to ethtool? You can try 
running the KNI app.

Regards,
Choi, Sy Jong
Platform Application Engineer

From: "Scott.Jhuang (?? ?) : 6309" [mailto:scott.jhu...@cas-well.com]
Sent: Wednesday, June 17, 2015 11:12 AM
To: Choi, Sy Jong; dev at dpdk.org
Subject: Re: [dpdk-dev] How to get net_device and use struct ethtool_cmd at 
DPDK enverinment?

Hi Sy 

[dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline

2015-07-17 Thread Mrzyglod, DanielX T
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 7:08 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline
> 
> 
> Signed-off-by: Cristian Dumitrescu 

Acked-by: Daniel Mrzyglod 


[dpdk-dev] [PATCH] doc: announce ABI change for librte_table

2015-07-17 Thread Mrzyglod, DanielX T
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 7:00 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_table
> 
> 
> Signed-off-by: Cristian Dumitrescu 

Acked-by: Daniel Mrzyglod 


[dpdk-dev] [PATCH v2] doc: announce ABI change for librte_port

2015-07-17 Thread Mrzyglod, DanielX T
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 5:27 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2] doc: announce ABI change for librte_port
> 
> v2 changes:
> -text simplification
> 
> Signed-off-by: Cristian Dumitrescu 

Acked-by: Daniel Mrzyglod 


[dpdk-dev] [PATCH v2] doc: announce ABI change for librte_port

2015-07-17 Thread Gajdzica, MaciejX T


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 5:27 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2] doc: announce ABI change for librte_port
> 
> v2 changes:
> -text simplification
> 
> Signed-off-by: Cristian Dumitrescu 

Acked-by: Maciej Gajdzica 


[dpdk-dev] [PATCH] doc: announce ABI change for librte_table

2015-07-17 Thread Gajdzica, MaciejX T


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 7:00 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_table
> 
> 
> Signed-off-by: Cristian Dumitrescu 

Acked-by: Maciej Gajdzica 


[dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline

2015-07-17 Thread Gajdzica, MaciejX T


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 7:08 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline
> 
> 
> Signed-off-by: Cristian Dumitrescu 

Acked-by: Maciej Gajdzica 


[dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library with cuckoo hash implementation

2015-07-17 Thread De Lara Guarch, Pablo


> -Original Message-
> From: Tony Lu [mailto:zlu at ezchip.com]
> Sent: Friday, July 17, 2015 4:35 AM
> To: De Lara Guarch, Pablo; dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library
> with cuckoo hash implementation
> 
> Hi, Pablo
> 
> >-Original Message-
> >From: De Lara Guarch, Pablo [mailto:pablo.de.lara.guarch at intel.com]
> >Sent: Friday, July 17, 2015 4:42 AM
> >To: Tony Lu; dev at dpdk.org
> >Subject: RE: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library
> with
> >cuckoo hash implementation
> >
> >Hi Tony,
> >
> >> -Original Message-
> >> From: Tony Lu [mailto:zlu at ezchip.com]
> >> Sent: Thursday, July 16, 2015 10:40 AM
> >> To: De Lara Guarch, Pablo; dev at dpdk.org
> >> Subject: RE: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash
> >> library with cuckoo hash implementation
> >>
> >> >diff --git a/lib/librte_hash/rte_cuckoo_hash.c
> >> b/lib/librte_hash/rte_cuckoo_hash.c
> >> >new file mode 100644
> >> >index 000..50e3acd
> >> >--- /dev/null
> >> >+++ b/lib/librte_hash/rte_cuckoo_hash.c
> >> >@@ -0,0 +1,1027 @@
> >> ...
> >> >+
> >> >+/* Functions to compare multiple of 16 byte keys (up to 128 bytes) */
> >> >+static int
> >> >+rte_hash_k16_cmp_eq(const void *key1, const void *key2,
> >> >+ size_t key_len __rte_unused)
> >> >+{
> >> >+ const __m128i k1 = _mm_loadu_si128((const __m128i *) key1);
> >> >+ const __m128i k2 = _mm_loadu_si128((const __m128i *) key2);
> >> >+ const __m128i x = _mm_xor_si128(k1, k2);
> >> >+
> >> >+ return !_mm_test_all_zeros(x, x);
> >> >+}
> >> ...
> >>
> >> When compiling the latest dev DPDK for non-x86 arch, it fails on the
> >> above code, as the SSE is x86 specific defined in .  Is
> >> it possible to replace this function with platform independent code?
> >
> >Thanks for spotting this. I just sent a patch that should fix the problem.
> >Can you check if it works?
> 
> Thanks for your quick response, but __m128i and all the _mm_ related
> functions
> are X86 specific defined in .  This header file is only
> available in X86
> compiler library, but no-X86 archs do not have this file.  So if we can
> replace all
> the X86 specific code in the above function, that would be great.
> 
With the patch that I sent, if you are compiling for a non-x86 arch, you should 
not have any problem,
since all that code will only be used if using x86 arch. Have you tried 
compiling DPDK with the patch?

Pablo

> Thanks
> -Tony
> 
> 
> >Thanks,
> >Pablo
> >>
> >> Thanks
> >> -Zhigang Lu
> 



[dpdk-dev] [PATCH v13 00/14] Interrupt mode PMD

2015-07-17 Thread Liang, Cunming

> -Original Message-
> From: David Marchand [mailto:david.marchand at 6wind.com] 
> Sent: Thursday, July 09, 2015 9:59 PM
> To: Liang, Cunming
> Cc: dev at dpdk.org; Stephen Hemminger; Thomas Monjalon; Zhou, Danny; Wang, 
> Liang-min; Richardson, Bruce; Liu, Yong; Neil Horman
> Subject: Re: [PATCH v13 00/14] Interrupt mode PMD

> On Fri, Jun 19, 2015 at 6:00 AM, Cunming Liang  
> wrote:
> v13 changes
> - version map cleanup for v2.1
> - replace RTE_EAL_RX_INTR by RTE_NEXT_ABI for ABI compatibility
>
> Please, this patchset ends with a patch that deals with ABI compatibility 
> while it should do so on a per-patch basis.
> Besides, some patches are introducing stuff that is reworked in other patches 
> without a clear reason.
> 
> Can you rework this to ease review and ensure patch atomicity ?
> 
> Thanks.
> 
> -- 
> David Marchand

Will split it, thanks.


[dpdk-dev] [PATCH v13 13/14] l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode switch

2015-07-17 Thread Liang, Cunming


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Tuesday, July 14, 2015 1:13 AM
> To: Liang, Cunming
> Cc: dev at dpdk.org; shemming at brocade.com; david.marchand at 6wind.com;
> Zhou, Danny; Wang, Liang-min; Richardson, Bruce; Liu, Yong;
> nhorman at tuxdriver.com
> Subject: Re: [PATCH v13 13/14] l3fwd-power: enable one-shot rx interrupt and
> polling/interrupt mode switch
> 
> 2015-06-19 12:00, Cunming Liang:
> > Demonstrate how to handle per rx queue interrupt in a NAPI-like
> > implementation in usersapce. PDK polling thread mainly works in
> > polling mode and switch to interrupt mode only if there is no
> > any packet received in recent polls.
> > Usersapce interrupt notification generally takes a lot more cycles
> > than kernel, so one-shot interrupt is used here to guarantee minimum
> > overhead and DPDK polling thread returns to polling mode immediately
> > once it receives an interrupt notificaiton for incoming packet.
> 
> Besides typos, it should be noted that it works only with igb and ixgbe.
Will reword the commit comments, thanks for the notice.


[dpdk-dev] [PATCH v13 08/14] eal/bsd: dummy for new intr definition

2015-07-17 Thread Liang, Cunming


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Tuesday, July 14, 2015 1:06 AM
> To: Liang, Cunming
> Cc: dev at dpdk.org; shemming at brocade.com; david.marchand at 6wind.com;
> Zhou, Danny; Wang, Liang-min; Richardson, Bruce; Liu, Yong;
> nhorman at tuxdriver.com
> Subject: Re: [PATCH v13 08/14] eal/bsd: dummy for new intr definition
> 
> 2015-06-19 12:00, Cunming Liang:
> > To make bsd compiling happy with new intr changes.
> 
> This patch doesn't make FreeBSD happy.
> DPDK works on Linux and FreeBSD.
> Why not adopt an API which could be implemented for FreeBSD, instead of being
> tightly linked to Linux epoll?
The *rte_epoll_* API is not provided as an EAL API; it only exists in linuxapp for 
the low-level need of combining rx events with other user events.
I haven't defined the abstract EAL API for both Linux and FreeBSD yet.
The next steps are 1) to have a bsdapp-level API (based on kqueue), and 2) to try to 
provide a common EAL API to cover both.
This is planned for the next release.



[dpdk-dev] [PATCH v13 06/14] eal/linux: standalone intr event fd create support

2015-07-17 Thread Liang, Cunming


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Tuesday, July 14, 2015 1:02 AM
> To: Liang, Cunming
> Cc: dev at dpdk.org; shemming at brocade.com; david.marchand at 6wind.com;
> Zhou, Danny; Wang, Liang-min; Richardson, Bruce; Liu, Yong;
> nhorman at tuxdriver.com
> Subject: Re: [PATCH v13 06/14] eal/linux: standalone intr event fd create 
> support
> 
> 2015-06-19 12:00, Cunming Liang:
> > +/**
> > + * It enables the fastpath event fds if it's necessary.
> 
> What means fastpath here?
Here it means the RX/TX packet I/O interrupts, as distinct from the link status 
interrupt, which is processed in a standalone thread.
Will reword the description.
> 
> > + * It creates event fds when multi-vectors allowed,
> > + * otherwise it multiplexes the single event fds.
> 
> Maybe a reference to allow multi-vectors is needed.
Will rework it, thanks.
> 
> > + *
> > + * @param intr_handle
> > + *   Pointer to the interrupt handle.
> > + * @param nb_efd
> > + *   Number of interrupt vector trying to enable.
> > + * @return
> > + *   - On success, zero.
> > + *   - On failure, a negative value.
> > + */
> > +int
> > +rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd);
> >
> 



[dpdk-dev] [PATCH v13 02/14] eal/linux: add rte_epoll_wait/ctl support

2015-07-17 Thread Liang, Cunming


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Tuesday, July 14, 2015 12:56 AM
> To: Liang, Cunming
> Cc: dev at dpdk.org; shemming at brocade.com; david.marchand at 6wind.com;
> Zhou, Danny; Wang, Liang-min; Richardson, Bruce; Liu, Yong;
> nhorman at tuxdriver.com
> Subject: Re: [PATCH v13 02/14] eal/linux: add rte_epoll_wait/ctl support
> 
> 2015-06-19 12:00, Cunming Liang:
> > +int
> > +rte_epoll_wait(int epfd, struct rte_epoll_event *events,
> > +  int maxevents, int timeout)
> > +{
> > +   struct epoll_event evs[maxevents];
> > +   int rc;
> > +
> > +   if (!events) {
> > +   RTE_LOG(ERR, EAL, "rte_epoll_event can't be NULL\n");
> > +   return -1;
> > +   }
> > +
> > +   /* using per thread epoll fd */
> > +   if (epfd == RTE_EPOLL_PER_THREAD)
> > +   epfd = rte_intr_tls_epfd();
> > +
> > +   while (1) {
> > +   rc = epoll_wait(epfd, evs, maxevents, timeout);
> > +   if (likely(rc > 0)) {
> > +   /* epoll_wait has at least one fd ready to read */
> > +   rc = eal_epoll_process_event(evs, rc, events);
> > +   break;
> > +   } else if (rc < 0) {
> > +   if (errno == EINTR)
> > +   continue;
> > +   /* epoll_wait fail */
> > +   RTE_LOG(ERR, EAL, "epoll_wait returns with fail %s\n",
> > +   strerror(errno));
> > +   rc = -1;
> > +   break;
> > +   }
> > +   }
> > +
> > +   return rc;
> > +}
> 
> In general, such loop is application-level.
> What is the added value of rte_epoll_wait()?
> Do we need some wrappers to libc in DPDK?
Some motivations for doing it:
1) 'epoll_event' takes either an fd or a data pointer. However, we require more to 
cover both rx interrupts and other user events.
2) Some errno processing can be handled in common, which helps callers focus on 
the real events they're interested in.
3) Usually there's one epoll instance per lcore to serve the events. This provides 
a default one if none is assigned.
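
To illustrate the second point (handling errno in one common place), here is a minimal Linux-only sketch of a wrapper that retries epoll_wait() when a signal interrupts it. The name `epoll_wait_retry` is hypothetical and this is not the rte_epoll_wait() implementation; it uses only the standard epoll_wait() call from <sys/epoll.h>.

```c
#include <errno.h>
#include <sys/epoll.h>

/* Wait on epfd, transparently retrying when a signal interrupts the call.
 * Returns the number of ready events (0 on timeout) or -1 on any other
 * error, with errno left as set by epoll_wait(). */
static int
epoll_wait_retry(int epfd, struct epoll_event *evs, int maxevents,
		 int timeout)
{
	int rc;

	do {
		rc = epoll_wait(epfd, evs, maxevents, timeout);
	} while (rc < 0 && errno == EINTR);

	return rc;
}
```

Note that this sketch returns 0 when the timeout expires, whereas the loop quoted above calls epoll_wait() again when it returns 0. Callers then only have to distinguish "events ready", "timeout" and "real error".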


[dpdk-dev] [PATCH v13 01/14] eal/linux: add interrupt vectors support in intr_handle

2015-07-17 Thread Liang, Cunming


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Tuesday, July 14, 2015 12:41 AM
> To: Liang, Cunming
> Cc: dev at dpdk.org; shemming at brocade.com; david.marchand at 6wind.com;
> Zhou, Danny; Wang, Liang-min; Richardson, Bruce; Liu, Yong;
> nhorman at tuxdriver.com
> Subject: Re: [PATCH v13 01/14] eal/linux: add interrupt vectors support in
> intr_handle
> 
> 2015-06-19 12:00, Cunming Liang:
> > @@ -58,6 +60,10 @@ struct rte_intr_handle {
> > };
> > int fd;  /**< interrupt event file descriptor */
> > enum rte_intr_handle_type type;  /**< handle type */
> > +   uint32_t max_intr;   /**< max interrupt requested */
> > +   uint32_t nb_efd; /**< number of available efds */
> > +   int efds[RTE_MAX_RXTX_INTR_VEC_ID];  /**< intr vectors/efds mapping */
> 
> efd is not defined in these comments.
> 
Will expand abbreviation in comments, thanks.


[dpdk-dev] processor/core count issue

2015-07-17 Thread Jeff Venable, Sr.
In a VM Fusion environment configured with 2 processors with 1 core each I get 
the following:

PANIC in rte_eal_init():
Cannot init logs
4: [/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f9fd3bdaec5]]
3: [/usr/lib/dpdk-2.0.0/librte_eal.so.1(rte_eal_init+0x123a) [0x7f9fd4f462fa]]
2: [/usr/lib/dpdk-2.0.0/librte_eal.so.1(__rte_panic+0xc3) [0x7f9fd4f449e8]]
1: [/usr/lib/dpdk-2.0.0/librte_eal.so.1(rte_dump_stack+0x16) [0x7f9fd4f4daa6]]
Aborted (core dumped)

rte_socket_id() was returning 0x here:

(gdb) bt
#0  0x7797ae95 in rte_malloc_socket () from /usr/lib/dpdk-2.0.0/librte_malloc.so.1
#1  0x7797af4e in rte_zmalloc_socket () from /usr/lib/dpdk-2.0.0/librte_malloc.so.1
#2  0x77573ef8 in rte_ring_create () from /usr/lib/dpdk-2.0.0/librte_ring.so.1
#3  0x770d in rte_mempool_xmem_create () from /usr/lib/dpdk-2.0.0/librte_mempool.so.1
#4  0x7b97 in rte_mempool_create () from /usr/lib/dpdk-2.0.0/librte_mempool.so.1
#5  0x77b90899 in ?? () from /usr/lib/dpdk-2.0.0/librte_eal.so.1
#6  0x77b88d34 in ?? () from /usr/lib/dpdk-2.0.0/librte_eal.so.1
#7  0x77b84950 in rte_eal_init () from /usr/lib/dpdk-2.0.0/librte_eal.so.1

When I reconfigured the VM to 1 processor with 2 cores, everything works as 
expected.

This was not a problem in 1.6.

Jeff


[dpdk-dev] [PATCH v6 0/9] Expose IXGBE extended stats to DPDK apps

2015-07-17 Thread Thomas Monjalon
2015-07-16 09:54, Olivier MATZ:
> On 07/15/2015 03:11 PM, Maryam Tahhan wrote:
> > This patch set implements xstats_get() and xstats_reset() in dev_ops for
> > ixgbe to expose detailed error statistics to DPDK applications. The
> > dump_cfg application was extended to demonstrate the usage of
> > retrieving statistics for DPDK interfaces and renamed to proc_info
> > in order to reflect this new functionality. This patch set also removes non
> > generic statistics from the statistics strings at the ethdev level and
> > marks the relevant registers as deprecated in struct rte_eth_stats.
> 
> Acked-by: Olivier Matz 

Applied with minor fixes, thanks


[dpdk-dev] [dpdk-virtio]: cannot start testpmd after binding virtio devices to gib_uio

2015-07-17 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Clarylin L
> Sent: Friday, July 17, 2015 7:10 AM
> To: dev at dpdk.org
> Subject: Re: [dpdk-dev] [dpdk-virtio]: cannot start testpmd after binding
> virtio devices to gib_uio
> 
> > I am running a virtual guest on Ubuntu and trying to use dpdk testpmd
> > as a packet forwarder.
> >
> > After starting the virtual guest, I do
> >
> > insmod igb_uio.ko
> > insmod rte_kni.ko
> > echo ":00:06.0" > /sys/bus/pci/drivers/virtio-pci/unbind
> > echo ":00:07.0" > /sys/bus/pci/drivers/virtio-pci/unbind
> > echo "1af4 1000" > /sys/bus/pci/drivers/igb_uio/new_id


Instead of the above, you can try the following to bind the virtio ports to igb_uio:
tools/dpdk-nic-bind.py --bind igb_uio 00:06.0 00:07.0

> > mkdir -p /tmp/huge
> > mount -t hugetlbfs nodev /tmp/huge
> > echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> >
> > Where :00:06.0 and :00:07.0 are the two virtio devices I am
> > gonna use, and 1af4 1000 is the corresponding vendor and device id.
> >
> > After the above steps, I verified that the virtio devices are actually
> > bound to igb_uio:
> >
> > lspci -s 00:06.0 -vvv | grep driver
> >
> > Kernel driver in use: igb_uio
> >
> >
> > However, I couldn't start testpmd; it hangs at the last line
> > below,
> > "PMD: rte_eth_dev_config_restore.."
> >
> > ...
> >
> > EAL: PCI device :00:05.0 on NUMA socket -1
> >
> > EAL:   probe driver: 1af4:1000 rte_virtio_pmd
> >
> > EAL:   Device is blacklisted, not initializing
> >
> > EAL: PCI device :00:06.0 on NUMA socket -1
> >
> > EAL:   probe driver: 1af4:1000 rte_virtio_pmd
> >
> > EAL: PCI device :00:07.0 on NUMA socket -1
> >
> > EAL:   probe driver: 1af4:1000 rte_virtio_pmd
> >
> > Interactive-mode selected
> >
> > Set mac packet forwarding mode
> >
> > Configuring Port 0 (socket 0)
> >
> > PMD: rte_eth_dev_config_restore: port 0: MAC address array not
> > supported
> >
> >
> > If I do not bind interface to igb_uio, testpmd can start successfully
> > which also shows "probe driver: 1af4:1000 rte_virtio_pmd" during
> > starting process. However, even after testpmd started, virtio devices
> > are bound to nothing ("lspci -s 00:06.0 -vvv | grep driver" shows nothing).
> >
> >
> > I am also attaching my virtual guest configuration below. Thanks for
> > your help. Highly appreciate!!
> >
> >
> >
> > lab at vpc-2:~$ ps aux | grep qemu
> >
> > libvirt+ 12020  228  0.0 102832508 52860 ? Sl   14:54  61:06 qemu-system-x86_64
> > -enable-kvm -name dpdk-perftest -S
> > -machine pc-i440fx-trusty,accel=kvm,usb=off,mem-merge=off
> > -cpu host -m 98304
> > -mem-prealloc -mem-path /dev/hugepages/libvirt/qemu
> > -realtime mlock=off -smp 24,sockets=2,cores=12,threads=1
> > -numa node,nodeid=0,cpus=0-11,mem=49152
> > -numa node,nodeid=1,cpus=12-23,mem=49152
> > -uuid eb5f8848-9983-4f13-983c-e3bd4c59387d -no-user-config -nodefaults
> > -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/dpdk-perftest.monitor,server,nowait
> > -mon chardev=charmonitor,id=monitor,mode=control
> > -rtc base=utc -no-shutdown -boot strict=on
> > -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
> > -drive file=/var/lib/libvirt/images/dpdk-perftest-hda.img,if=none,id=drive-ide0-0-0,format=qcow2
> > -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1
> > -drive file=/var/lib/libvirt/images/dpdk-perftest-hdb.img,if=none,id=drive-ide0-0-1,format=qcow2
> > -device ide-hd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1
> > -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw
> > -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=2
> > -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=25
> > -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:45:ff:5e,bus=pci.0,addr=0x5
> > -netdev tap,fds=26:27:28:29:30:31:32:33,id=hostnet1,vhost=on,vhostfds=34:35:36:37:38:39:40:41
> > -device virtio-net-pci,mq=on,vectors=17,netdev=hostnet1,id=net1,mac=52:54:00:7e:b5:6b,bus=pci.0,addr=0x6
> > -netdev tap,fds=42:43:44:45:46:47:48:49,id=hostnet2,vhost=on,vhostfds=50:51:52:53:54:55:56:57
> > -device virtio-net-pci,mq=on,vectors=17,netdev=hostnet2,id=net2,mac=52:54:00:f1:a5:20,bus=pci.0,addr=0x7
> > -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0
> > -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1
> > -vnc 127.0.0.1:0
> > -device cirrus-vga,id=video0,bus=pci.0,addr=0x2
> > -device i6300esb,id=watchdog0,bus=pci.0,addr=0x3 -watchdog-action reset
> > -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4
> >


[dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code

2015-07-17 Thread Vlad Zolotarov


On 07/13/15 13:26, John McNamara wrote:
> Fix for ABI breakage introduced in LRO addition. Moves
> lro bitfield to the end of the struct/member.
>
> Fixes: 8eecb3295aed (ixgbe: add LRO support)
>
> Signed-off-by: John McNamara 
> ---
>   lib/librte_ether/rte_ethdev.h | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 79bde89..1c3ace1 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1578,9 +1578,9 @@ struct rte_eth_dev_data {
>   uint8_t port_id;   /**< Device [external] port identifier. */
>   uint8_t promiscuous   : 1, /**< RX promiscuous mode ON(1) / OFF(0). */
>   scattered_rx : 1,  /**< RX of scattered packets is ON(1) / OFF(0) */
> - lro  : 1,  /**< RX LRO is ON(1) / OFF(0) */
>   all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0). */
> - dev_started : 1;   /**< Device state: STARTED(1) / STOPPED(0). */
> + dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
> + lro : 1;   /**< RX LRO is ON(1) / OFF(0) */

Acked-by: Vlad Zolotarov 

>   };
>   
>   /**
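
For readers following the ABI argument, here is a minimal sketch of why the position of the new bit matters. The struct and helper names below are hypothetical and only mirror the four flags visible in the quoted hunk; the real struct rte_eth_dev_data has many more members. Bit-field layout is implementation-defined, so this assumes a compiler that accepts uint8_t bit-fields and allocates them sequentially, as GCC and Clang do.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Layout before LRO support: what already-built binaries expect. */
struct flags_old {
	uint8_t promiscuous   : 1,
		scattered_rx  : 1,
		all_multicast : 1,
		dev_started   : 1;
};

/* LRO bit inserted in the middle: every later flag moves by one bit. */
struct flags_lro_middle {
	uint8_t promiscuous   : 1,
		scattered_rx  : 1,
		lro           : 1,
		all_multicast : 1,
		dev_started   : 1;
};

/* LRO bit appended last: the existing flags keep their positions. */
struct flags_lro_last {
	uint8_t promiscuous   : 1,
		scattered_rx  : 1,
		all_multicast : 1,
		dev_started   : 1,
		lro           : 1;
};

/* Hypothetical helper: the raw byte an old binary stores for "started". */
static uint8_t old_started_byte(void)
{
	struct flags_old f;
	uint8_t raw;

	assert(sizeof(f) == sizeof(raw));
	memset(&f, 0, sizeof(f));
	f.dev_started = 1;
	memcpy(&raw, &f, sizeof(raw));
	return raw;
}

/* Reinterpret that byte with the LRO bit in the middle. */
static struct flags_lro_middle read_as_middle(uint8_t raw)
{
	struct flags_lro_middle f;

	memcpy(&f, &raw, sizeof(f));
	return f;
}

/* Reinterpret that byte with the LRO bit at the end. */
static struct flags_lro_last read_as_last(uint8_t raw)
{
	struct flags_lro_last f;

	memcpy(&f, &raw, sizeof(f));
	return f;
}
```

With the bit in the middle, a byte written as "dev_started = 1" under the old layout is read back as "all_multicast = 1, dev_started = 0". With the bit appended last, the old flags read back unchanged and lro reads as 0.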



[dpdk-dev] [PATCH v6 4/9] ethdev: remove HW specific stats in stats structs

2015-07-17 Thread Thomas Monjalon
2015-07-15 14:11, Maryam Tahhan:
> Remove non-generic stats in rte_stats_strings and mark the relevant
> fields in struct rte_eth_stats as deprecated.
> 
> Signed-off-by: Maryam Tahhan 

> - uint64_t imissed;   /**< Total of RX missed packets (e.g full FIFO). */
> - uint64_t ibadcrc;   /**< Total of RX packets with CRC error. */
> - uint64_t ibadlen;   /**< Total of RX packets with bad length. */
> + /**< Deprecated; Total of RX missed packets (e.g full FIFO). */
> + uint64_t imissed;
> + /**< Deprecated; Total of RX packets with CRC error. */
> + uint64_t ibadcrc;
> + /**< Deprecated; Total of RX packets with bad length. */
> + uint64_t ibadlen;

The /**< style comments must be put *after* the field.
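
A small sketch of the two Doxygen placements, for reference. The struct name and the helper are hypothetical; only the imissed comment text is taken from the quoted patch. In Doxygen, a comment opened with /**< documents the member that precedes it, while a plain /** comment documents the member that follows.

```c
#include <stdint.h>

struct example_stats {
	uint64_t ipackets; /**< Trailing form: documents the member on its left. */
	/** Leading form: documents the member that follows. */
	uint64_t opackets;
	uint64_t imissed;  /**< Deprecated; Total of RX missed packets (e.g full FIFO). */
};

/* Hypothetical helper, only here so the struct is exercised. */
static uint64_t example_stats_total(const struct example_stats *s)
{
	return s->ipackets + s->opackets;
}
```

Placing a /**< comment on the line above a field attaches it to the previous member, which is presumably why the patch as posted would produce misleading documentation.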


[dpdk-dev] [PATCH v6 3/9] ethdev: expose extended error stats

2015-07-17 Thread Thomas Monjalon
2015-07-15 14:11, Maryam Tahhan:
> Extend rte_eth_xstats_get to retrieve additional stats from the device
> driver as well as the ethdev generic stats.
> 
> Signed-off-by: Maryam Tahhan 
> ---
>  lib/librte_ether/rte_ethdev.c | 31 ---
>  1 file changed, 20 insertions(+), 11 deletions(-)
>  mode change 100644 => 100755 lib/librte_ether/rte_ethdev.c
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> old mode 100644
> new mode 100755

Be careful not to change the file mode.


[dpdk-dev] [PATCH v5 0/4] rte_sched: cleanup and deprecation

2015-07-17 Thread Thomas Monjalon
> > This is a subset of earlier rte_sched patches.
> > 
> > It does not address the read/clearing API issue since that
> > was still under discussion.
> > 
> 
> Acked-by: Cristian Dumitrescu 

Applied, thanks


[dpdk-dev] [PATCH v17 0/5] User-space Ethtool

2015-07-17 Thread Thomas Monjalon
2015-07-16 21:55, Wang, Liang-min:
> Thomas,
>   Do you want me to create a separate patch that just includes the
> example/l2fwd-ethtool?

Yes

>   Do you also mean that, besides the identified Makefiles, you see more
> rework that needs to be done, or do I just need to fix the Makefile issue?
>   If it is just the Makefile issue, I could make another attempt tomorrow.

After checking the build there is probably more review to do.
Let's take more time to have something clean and maybe more complete in 2.2.
This patchset is your first contribution to DPDK and is already a nice 
achievement.
The new API must now be implemented in more drivers to be effective.


> > -Original Message-
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > Sent: Thursday, July 16, 2015 5:48 PM
> > To: Wang, Liang-min
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v17 0/5] User-space Ethtool
> > 
> > 2015-07-16 09:25, Liang-Min Larry Wang:
> > > This implementation is designed to provide a familiar interface for
> > > applications that rely on a kernel-space driver to support ethtool_op and
> > > net_device_op for device management. The initial implementation focuses
> > > on ops that can be implemented through existing netdev APIs. More ops will
> > > be supported in a later release.
> > 
> > Applied without example which needs more work, thanks




[dpdk-dev] [PATCH] ethdev: fix macro VALID_PORTID_OR_ERR_RTE

2015-07-17 Thread Thomas Monjalon
2015-07-15 13:22, Liang-Min Larry Wang:
> fix return value, using the macro input instead of -EINVAL.
> 
> Signed-off-by: Liang-Min Larry Wang 

Applied, thanks
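
To illustrate the kind of bug the commit message describes, here is a hedged sketch. The macro names, port_is_valid() and MAX_PORTS are all hypothetical stand-ins, not the ethdev code: the point is only that a "validate or return" macro should return the value the caller passed in rather than a hard-coded -EINVAL.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_PORTS 4 /* hypothetical port count for this sketch */

/* Hypothetical validity check standing in for the real one. */
static int port_is_valid(uint8_t port_id)
{
	return port_id < MAX_PORTS;
}

/* Before: the retval argument is accepted but ignored. */
#define CHECK_PORT_OR_RET_OLD(port_id, retval) do { \
	if (!port_is_valid(port_id)) \
		return -EINVAL; \
} while (0)

/* After: the caller-supplied value is what gets returned. */
#define CHECK_PORT_OR_RET_NEW(port_id, retval) do { \
	if (!port_is_valid(port_id)) \
		return retval; \
} while (0)

/* Two callers that both ask for -ENODEV on an invalid port. */
static int query_port_old(uint8_t port_id)
{
	CHECK_PORT_OR_RET_OLD(port_id, -ENODEV);
	return 0;
}

static int query_port_new(uint8_t port_id)
{
	CHECK_PORT_OR_RET_NEW(port_id, -ENODEV);
	return 0;
}
```

With an invalid port, query_port_old() returns -EINVAL despite the caller's request, while query_port_new() returns -ENODEV as asked.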


[dpdk-dev] [PATCH v17 0/5] User-space Ethtool

2015-07-17 Thread Thomas Monjalon
2015-07-16 09:25, Liang-Min Larry Wang:
> This implementation is designed to provide a familiar interface for
> applications that rely on a kernel-space driver to support ethtool_op and
> net_device_op for device management. The initial implementation focuses on
> ops that can be implemented through existing netdev APIs. More ops will be
> supported in a later release.

Applied without example which needs more work, thanks


[dpdk-dev] [PATCH v17 5/5] examples: new example: l2fwd-ethtool

2015-07-17 Thread Thomas Monjalon
2015-07-16 09:25, Liang-Min Larry Wang:
> The example includes an ethtool library and two applications:
> one application is a non-DPDK process (nic-control)
> and the other is a DPDK l2fwd application (l2fwd-app).
> The nic-control process sends ethtool-like device management
> requests to l2fwd-app through a named pipe IPC. This example
> is designed to show how to build an ethtool shim library and
> how to use ethtool APIs to manage device parameters.

The makefiles need some clean-up and it does not build in shared lib mode.

Beginning of a cleanup patch to merge with this one:

--- a/examples/Makefile
+++ b/examples/Makefile
@@ -50,12 +50,10 @@ DIRS-y += ip_reassembly
 DIRS-$(CONFIG_RTE_IP_FRAG) += ip_fragmentation
 DIRS-y += ipv4_multicast
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += kni
-DIRS-y += l2fwd-ethtool/lib
-DIRS-y += l2fwd-ethtool/nic-control
-DIRS-y += l2fwd-ethtool/l2fwd-app
 DIRS-y += l2fwd
 DIRS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += l2fwd-ivshmem
 DIRS-$(CONFIG_RTE_LIBRTE_JOBSTATS) += l2fwd-jobstats
+DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += l2fwd-ethtool
 DIRS-y += l3fwd
 DIRS-$(CONFIG_RTE_LIBRTE_ACL) += l3fwd-acl
 DIRS-$(CONFIG_RTE_LIBRTE_POWER) += l3fwd-power

--- a/examples/l2fwd-ethtool/Makefile
+++ b/examples/l2fwd-ethtool/Makefile
@@ -37,7 +37,6 @@ endif
 RTE_TARGET ?= x86_64-native-linuxapp-gcc

 include $(RTE_SDK)/mk/rte.vars.mk
-unexport RTE_SRCDIR RTE_OUTPUT RTE_EXTMK

 ifneq ($(CONFIG_RTE_EXEC_ENV),"linuxapp")
 $(error This application can only operate in a linuxapp environment, \
@@ -46,10 +45,4 @@ endif

 DIRS-y += lib nic-control l2fwd-app

-.PHONY: all clean $(DIRS-y)
-
-all: $(DIRS-y)
-clean: $(DIRS-y)
-
-$(DIRS-y):
-   $(MAKE) -C $@ $(MAKECMDGOALS) O=$(RTE_OUTPUT)
+include $(RTE_SDK)/mk/rte.extsubdir.mk



[dpdk-dev] [PATCH] hash: fix compilation for non-x86 systems

2015-07-17 Thread Thomas Monjalon
2015-07-16 21:41, Pablo de Lara:
> Hash library uses optimized compare functions that use
> x86 intrinsics, therefore non-x86 systems could not build
> the library. In that case, the compare function is set
> to the generic memcmp.
[...]
> --- /dev/null
> +++ b/lib/librte_hash/rte_cmp_fns.h

Renaming it to rte_cmp_x86.h would allow other arch in separate files.
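
A rough sketch of the approach the commit message describes: use an intrinsics-based compare only where the architecture provides it, and fall back to memcmp everywhere else. Everything here is an assumption made for illustration: the names key_cmp_fn, key16_cmp_sse2 and select_key_cmp are hypothetical, the guard uses the compiler-defined __SSE2__ macro rather than whatever build-config macro the patch relies on, and the SSE2 compare is not the function from rte_cmp_x86.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same contract as memcmp for equality checks: 0 means "keys match". */
typedef int (*key_cmp_fn)(const void *key1, const void *key2, size_t key_len);

#if defined(__SSE2__) /* assumption: stands in for the arch check in the patch */
#include <emmintrin.h>

/* Compare two 16-byte keys with SSE2; returns 0 when all bytes are equal. */
static int key16_cmp_sse2(const void *key1, const void *key2, size_t key_len)
{
	assert(key_len == 16);
	const __m128i k1 = _mm_loadu_si128((const __m128i *)key1);
	const __m128i k2 = _mm_loadu_si128((const __m128i *)key2);

	/* One mask bit per byte; 0xFFFF means all 16 bytes compared equal. */
	return _mm_movemask_epi8(_mm_cmpeq_epi8(k1, k2)) != 0xFFFF;
}
#endif

/* Pick the compare function once, e.g. when the table is created. */
static key_cmp_fn select_key_cmp(size_t key_len)
{
#if defined(__SSE2__)
	if (key_len == 16)
		return key16_cmp_sse2;
#else
	(void)key_len;
#endif
	return memcmp; /* generic fallback, builds on every architecture */
}
```

Keeping the x86-specific function behind its own guard (or in its own header, as suggested above) leaves room for other architectures to add their own optimized compare later without touching the generic path.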