date:20140929

[dpdk-dev] [PATCH v2 0/3] add i40e RSS support in VF

2014-09-29 Thread Zhan, Zhaochen

Tested-by: Zhaochen Zhan 

This patch has been verified in a KVM virtual environment with
4*10G, 2*40G and 1*40G NICs. The VF is generated by SR-IOV.
testpmd must be running on the host for testpmd to work in the VM.
The RSS function works well in the testpmd app in the KVM guest.

The test environment was as follows:
HOST environment:
CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
OS: Linux 3.11.10-301.fc20.x86_64
GCC: 4.8.3
NIC: 4*10G(1572), 2*40G(1583), 1*40G(1584)

VM environment generated by KVM:
CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
OS: Linux 3.11.10-301.fc20.x86_64
GCC: 4.8.2
NIC: VF generated through SRIOV

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Helin Zhang
> Sent: Friday, September 19, 2014 9:15 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2 0/3] add i40e RSS support in VF
> 
> As hardware supports RSS in VF, the patches add that support
> in driver. In addition, minor improvements are added for
> defining macro with constant.
> 
> v2 changes:
> * Remove support of updating/querying redirection table, as it
>   will be implemented in other patches later.
> * Remove changes in testpmd, as it is not needed at all for
>   supporting RSS in VF.
> 
> Helin Zhang (3):
>   ethdev: improvement for constant usage
>   i40e: extern two functions and relevant macros
>   i40evf: support of RSS in VF
> 
>  lib/librte_ether/rte_ethdev.h|  47 ++--
>  lib/librte_pmd_i40e/i40e_ethdev.c|   4 +-
>  lib/librte_pmd_i40e/i40e_ethdev.h|  40 +-
>  lib/librte_pmd_i40e/i40e_ethdev_vf.c | 142 +++
>  4 files changed, 207 insertions(+), 26 deletions(-)
> 
> --
> 1.8.1.4

[dpdk-dev] [PATCH v2 4/5] i40e: set crc stripping in rx queue configuration

2014-09-29 Thread Xu, HuilongX

Tested-by: HuilongX xu 

This patch has been verified on FC20 with Eagle Fountain (4*10G Fortville) and
Spirit Falls (1*40G Fortville and 2*40G Fortville).
The VF is generated by SR-IOV, and testpmd must be running on the host for the VF
to work in the VM.

CRC stripping function works well in the testpmd app in VM and host.

The test environment details are as follows:
HOST environment:
CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
OS: Linux 3.11.10-301.fc20.x86_64
GCC: 4.8.3
NIC: Eagle Fountain: 4*10G Fortville, Spirit Falls: 1*40G Fortville and 2*40G Fortville.

VM environment generated by KVM:
CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
OS: Linux 3.11.10-301.fc20.x86_64
GCC: 4.8.2
NIC: VF generated through SRIOV   

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Helin Zhang
Sent: Sunday, September 14, 2014 10:48 PM
To: dev at dpdk.org
Subject: [dpdk-dev] [PATCH v2 4/5] i40e: set crc stripping in rx queue configuration

It enables/disables the crc stripping in the rx queue contexts,
according to the extra configuration carried from VF.

v2 changes:
* Put setting the crc stripping into a single patch.

Signed-off-by: Helin Zhang 
Reviewed-by: Jingjing Wu 
Reviewed-by: Jing Chen 
---
 lib/librte_pmd_i40e/i40e_pf.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/librte_pmd_i40e/i40e_pf.c b/lib/librte_pmd_i40e/i40e_pf.c
index bc9bfcb..2910fd5 100644
--- a/lib/librte_pmd_i40e/i40e_pf.c
+++ b/lib/librte_pmd_i40e/i40e_pf.c
@@ -357,7 +357,10 @@ i40e_pf_host_hmc_config_rxq(struct i40e_hw *hw,
 	rx_ctx.tphdata_ena = 1;
 	rx_ctx.tphhead_ena = 1;
 	rx_ctx.lrxqthresh = 2;
-	rx_ctx.crcstrip = 1;
+	if (qpei) /* For DPDK PF host */
+		rx_ctx.crcstrip = qpei->crcstrip ? 1 : 0;
+	else /* For Linux PF host */
+		rx_ctx.crcstrip = 1;
 	rx_ctx.l2tsel = 1;
 	rx_ctx.prefena = 1;

-- 
1.8.1.4
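
The selection made by the hunk above can be sketched in isolation. This is only an illustration, not the driver code: `struct demo_rxq_ext_info` and `demo_select_crcstrip()` are hypothetical names standing in for the extra configuration carried from the VF and for the assignment to `rx_ctx.crcstrip`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the extra per-queue configuration sent by the VF. */
struct demo_rxq_ext_info {
	uint8_t crcstrip;
};

/*
 * Return the value to program into the RX queue context.
 * If extra configuration is present, honor its request (normalized to 0/1);
 * otherwise keep the previous behavior and always strip the CRC.
 */
static uint8_t
demo_select_crcstrip(const struct demo_rxq_ext_info *qpei)
{
	if (qpei != NULL)
		return qpei->crcstrip ? 1 : 0;
	return 1;
}
```

Passing NULL models a request that carries no extra configuration, which is why the default of 1 is preserved in that branch.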

[dpdk-dev] [PATCH 00/15] i40e base driver update

2014-09-29 Thread Xu, HuilongX

Tested-by: HuilongX xu 

This patch has been verified on FC20 with Eagle Fountain (4*10G) and Spirit Falls
(1*40G Fortville and 2*40G Fortville).
The i40e base driver update works well on FC20 in basic functional and
performance tests.

The test environment details are as follows:
HOST environment:
CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
OS: Linux 3.11.10-301.fc20.x86_64
GCC: 4.8.3
NIC: Eagle Fountain: 4*10G, Spirit Falls: 1*40G Fortville and 2*40G Fortville.

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Helin Zhang
Sent: Tuesday, September 09, 2014 3:21 PM
To: dev at dpdk.org
Subject: [dpdk-dev] [PATCH 00/15] i40e base driver update

Here is the update of i40e base driver. Also it involves a few
relevant necessary code changes in i40e PMD.

Helin Zhang (15):
  i40e: make the indentation more consistent in share code
  i40e: support nvmupdate by default
  i40e: remove useless code which was written for Solaris
  i40e: remove test code for 'ethtool'
  i40e: force a shifted '1' to be 'unsigned'
  i40e: remove useless code for pre-boot support
  i40e: Get rid of sparse warnings, and remove unreachable code
  i40e: remove code which is for software validation only
  i40e: remove code for TPH (TLP Processing Hints)
  i40e: support of 10G base T
  i40e: expose debug_write_register request
  i40e: workaround of get_firmware_version, and enhancements
  i40e: Use get_link_status to report FC settings
  i40e: fix and enhancement in arq_event_info struct
  i40e: support redefined struct of 'i40e_arq_event_info'

 lib/librte_pmd_i40e/i40e/i40e_adminq.c |   55 +-
 lib/librte_pmd_i40e/i40e/i40e_adminq.h |5 +-
 lib/librte_pmd_i40e/i40e/i40e_adminq_cmd.h | 2132 ++--
 lib/librte_pmd_i40e/i40e/i40e_common.c |  173 +--
 lib/librte_pmd_i40e/i40e/i40e_dcb.c|  625 
 lib/librte_pmd_i40e/i40e/i40e_dcb.h|  103 --
 lib/librte_pmd_i40e/i40e/i40e_diag.c   |   10 -
 lib/librte_pmd_i40e/i40e/i40e_hmc.h|5 +-
 lib/librte_pmd_i40e/i40e/i40e_lan_hmc.c|  227 +--
 lib/librte_pmd_i40e/i40e/i40e_lan_hmc.h|   14 -
 lib/librte_pmd_i40e/i40e/i40e_nvm.c|  120 +-
 lib/librte_pmd_i40e/i40e/i40e_prototype.h  |   19 +-
 lib/librte_pmd_i40e/i40e/i40e_type.h   |   49 +-
 lib/librte_pmd_i40e/i40e_ethdev.c  |8 +-
 lib/librte_pmd_i40e/i40e_ethdev_vf.c   |   10 +-
 15 files changed, 1242 insertions(+), 2313 deletions(-)

-- 
1.8.1.4

[dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture

2014-09-29 Thread hem...@freescale.com

Hi Chao,

This Patch seems to be incomplete. You may also need to patch the 
librte_eal\common\include\rte_atomic.h 
e.g.
#if !(defined RTE_ARCH_X86_64) || !(defined RTE_ARCH_I686)
#include 
#else /* if Intel*/

Otherwise you will get compilation errors for "_mm_mfence".

The same is true for other common header files as well.


Regards,
Hemant
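
One detail of the guard quoted above: the condition `!(defined A) || !(defined B)` is false only when both macros are defined at once, so the Intel branch would rarely be taken. A minimal sketch of an architecture guard for a full memory barrier follows, assuming the intent is to use `_mm_mfence()` on x86 only. `demo_mb()` and `demo_publish()` are hypothetical names, and the non-x86 branch uses the GCC/Clang `__sync_synchronize()` builtin rather than anything from the patch under review.

```c
/* Assumption: the build system defines RTE_ARCH_X86_64 or RTE_ARCH_I686
 * for Intel targets only. */
#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
#include <emmintrin.h>              /* declares _mm_mfence() */
#define demo_mb() _mm_mfence()
#else
/* GCC/Clang builtin that issues a full memory barrier on any target. */
#define demo_mb() __sync_synchronize()
#endif

/* Write a value, then raise a flag, with a full barrier in between. */
static inline void
demo_publish(int *data, volatile int *flag, int value)
{
	*data = value;
	demo_mb();
	*flag = 1;
}
```

With a guard of this shape, a non-x86 build never sees the SSE2 header, which is where the "_mm_mfence" compilation error would otherwise come from.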

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Chao Zhu
> Sent: 26/Sep/2014 3:06 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power
> architecture
> 
> The atomic operations implemented with assembly code in DPDK only support
> x86. This patch adds architecture-specific atomic operations for the IBM Power
> architecture.
> 
> Signed-off-by: Chao Zhu 
> ---
>  .../common/include/powerpc/arch/rte_atomic.h   |  387
> 
>  .../common/include/powerpc/arch/rte_atomic_arch.h  |  318
> 
>  2 files changed, 705 insertions(+), 0 deletions(-)
>  create mode 100644 lib/librte_eal/common/include/powerpc/arch/rte_atomic.h
>  create mode 100644 lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h
> 
> diff --git a/lib/librte_eal/common/include/powerpc/arch/rte_atomic.h b/lib/librte_eal/common/include/powerpc/arch/rte_atomic.h
> new file mode 100644
> index 000..7f5214e
> --- /dev/null
> +++ b/lib/librte_eal/common/include/powerpc/arch/rte_atomic.h
> @@ -0,0 +1,387 @@
> +/*
> + *   BSD LICENSE
> + *
> + *   Copyright (C) IBM Corporation 2014.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + *   notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + *   notice, this list of conditions and the following disclaimer in
> + *   the documentation and/or other materials provided with the
> + *   distribution.
> + * * Neither the name of IBM Corporation nor the names of its
> + *   contributors may be used to endorse or promote products derived
> + *   from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +*/
> +
> +/*
> + * Inspired from FreeBSD src/sys/powerpc/include/atomic.h
> + * Copyright (c) 2008 Marcel Moolenaar
> + * Copyright (c) 2001 Benno Rice
> + * Copyright (c) 2001 David E. O'Brien
> + * Copyright (c) 1998 Doug Rabson
> + * All rights reserved.
> + */
> +
> +#ifndef _RTE_ATOMIC_H_
> +#error "don't include this file directly, please include generic 
> "
> +#endif
> +
> +#ifndef _RTE_POWERPC_64_ATOMIC_H_
> +#define _RTE_POWERPC_64_ATOMIC_H_
> +
> +/*- 64 bit atomic operations
> +-*/
> +
> +/**
> + * An atomic compare and set function used by the mutex functions.
> + * (atomic) equivalent to:
> + *   if (*dst == exp)
> + * *dst = src (all 64-bit words)
> + *
> + * @param dst
> + *   The destination into which the value will be written.
> + * @param exp
> + *   The expected value.
> + * @param src
> + *   The new value.
> + * @return
> + *   Non-zero on success; 0 on failure.
> + */
> +static inline int
> +rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
> +{
> + unsigned int ret = 0;
> +
> + asm volatile (
> + "\tlwsync\n"
> + "1: ldarx %[ret], 0, %[dst]\n"
> + "cmpld %[exp], %[ret]\n"
> + "bne 2f\n"
> + "stdcx. %[src], 0, %[dst]\n"
> + "bne- 1b\n"
> + "li %[ret], 1\n"
> + "b 3f\n"
> + "2:\n"
> + "stdcx. %[ret], 0, %[dst]\n"
> + "li %[ret], 0\n"
> + "3:\n"
> + "isync\n"
> + : [ret] "=&r" (ret), "=m" (*dst)
> + : [dst] "r" (dst), [exp] "r" (exp), [src] "r" (src), 
> "m" (*dst)
> +

[dpdk-dev] [PATCH 10/12] Add cache size define for IBM Power Architecture

2014-09-29 Thread hem...@freescale.com

> --- a/mk/arch/powerpc/rte.vars.mk
> +++ b/mk/arch/powerpc/rte.vars.mk
> @@ -32,7 +32,7 @@
>  ARCH  ?= powerpc
>  CROSS ?=
> 
> -CPU_CFLAGS  ?= -m64
> +CPU_CFLAGS  ?= -m64 -DCACHE_LINE_SIZE=128

 [hemant]  Instead of hardcoding the CACHE_LINE_SIZE, can you derive the
CACHE_LINE_SIZE from the config file? Other PowerPC processors have it as 64.


>  CPU_LDFLAGS ?=
>  CPU_ASFLAGS ?= -felf64
> 
> --
> 1.7.1

[dpdk-dev] DPDK doesn't work with iommu=pt

2014-09-29 Thread Alex Markuze

On Mon, Sep 29, 2014 at 2:53 AM, Hiroshi Shimamoto wrote:
> Hi,
>
>> Subject: Re: [dpdk-dev] DPDK doesn't work with iommu=pt
>>
>> iommu=pt effectively disables iommu for the kernel and iommu is
>> enabled only for KVM.
>> http://lwn.net/Articles/329174/
>
> thanks for pointing that.
>
> Okay, I think DPDK cannot handle IOMMU because of no kernel code in
> DPDK application.
>
> And now, I think "iommu=pt" doesn't work correctly DMA on host PMD
> causes DMAR fault which means IOMMU catches a wrong operation.
> Will dig around "iommu=pt".
>
I agree with your analysis. It seems that a fairly recent patch (3~4 months
old) has introduced a bug that confuses unprotected DMA access by the device
with an IOMMU access, and produces the equivalent of a page fault.

>>
>> Basically, unless you have KVM running you can remove both lines for
>> the same effect.
>> On the other hand, if you do have KVM and you do want iommu=on, you can
>> remove the iommu=pt for the same performance because AFAIK, unlike the
>> kernel drivers, DPDK doesn't dma_map and dma_unmap each and every
>> ingress/egress packet (please correct me if I'm wrong), and will not
>> suffer any performance penalties.
>
> I also tried "iommu=on", but it didn't fix the issue.
> I saw the same error messages in kernel.
>

Just to clarify, what I suggested you try is leaving only this string in
the command line: "intel_iommu=on", without iommu=pt.
But this would work iff DPDK can handle IOVAs (I/O virtual addresses).

>   [   46.978097] dmar: DRHD: handling fault status reg 2
>   [   46.978120] dmar: DMAR:[DMA Read] Request device [21:00.0] fault addr aa01
>   DMAR:[fault reason 02] Present bit in context entry is clear
>
> thanks,
> Hiroshi
>
>>
>> FYI. Kernel NIC drivers:
>> When iommu=on{,strict} the kernel network drivers will suffer a heavy
>> performance penalty due to regular IOVA modifications (both HW and SW
>> at fault here). Ixgbe and Mellanox reuse dma_mapped pages on the
>> receive side to avoid this penalty, but still suffer from iommu on TX.
>>
>> On Fri, Sep 26, 2014 at 5:47 PM, Choi, Sy Jong wrote:
>> > Hi Shimamoto-san,
>> >
>> > There are a lot of sightings related to "DMAR:[fault reason 06] PTE Read access is not set"
>> > https://www.mail-archive.com/kvm@vger.kernel.org/msg106573.html
>> >
>> > This might be related to IOMMU, and kernel code.
>> >
>> > Here is what we know :-
>> > 1) Disabling VT-d in the BIOS also removed the symptom
>> > 2) Switching to another OS distribution also removed the symptom
>> > 3) Even on different HW we do not see the symptom. In my case, I switched from an Engineering board to an EPSD board.
>> >
>> > Regards,
>> > Choi, Sy Jong
>> > Platform Application Engineer
>> >
>> >
>> > -Original Message-
>> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Hiroshi Shimamoto
>> > Sent: Friday, September 26, 2014 5:14 PM
>> > To: dev at dpdk.org
>> > Cc: Hayato Momma
>> > Subject: [dpdk-dev] DPDK doesn't work with iommu=pt
>> >
>> > I encountered an issue that DPDK doesn't work with "iommu=pt intel_iommu=on"
>> > on HP ProLiant DL380p Gen8 server. I'm using the following environment;
>> >
>> >   HW: ProLiant DL380p Gen8
>> >   CPU: E5-2697 v2
>> >   OS: RHEL7
>> >   kernel: kernel-3.10.0-123 and the latest kernel 3.17-rc6+
>> >   DPDK: v1.7.1-53-gce5abac
>> >   NIC: 82599ES
>> >
>> > When booting with "iommu=pt intel_iommu=on", I got the message below and no packets are handled.
>> >
>> >   [  120.809611] dmar: DRHD: handling fault status reg 2
>> >   [  120.809635] dmar: DMAR:[DMA Read] Request device [21:00.0] fault addr aa01
>> >   DMAR:[fault reason 02] Present bit in context entry is clear
>> >
>> > How to reproduce;
>> > just run testpmd
>> > # ./testpmd -c 0xf -n 4 -- -i
>> >
>> > Configuring Port 0 (socket 0)
>> > PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x754eafc0 hw_ring=0x7420 dma_addr=0xaa00
>> > PMD: ixgbe_dev_tx_queue_setup(): Using full-featured tx code path
>> > PMD: ixgbe_dev_tx_queue_setup():  - txq_flags = 0 [IXGBE_SIMPLE_FLAGS=f01]
>> > PMD: ixgbe_dev_tx_queue_setup():  - tx_rs_thresh = 32 [RTE_PMD_IXGBE_TX_MAX_BURST=32]
>> > PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x754ea740 hw_ring=0x7421 dma_addr=0xaa01
>> > PMD: check_rx_burst_bulk_alloc_preconditions(): Rx Burst Bulk Alloc Preconditions: rxq->rx_free_thresh=0, RTE_PMD_IXGBE_RX_MAX_BURST=32
>> > PMD: ixgbe_dev_rx_queue_setup(): Rx Burst Bulk Alloc Preconditions are not satisfied, Scattered Rx is requested, or RTE_LIBRTE_IXGBE_RX_ALLOW_BULK_ALLOC is not enabled (port=0, queue=0).
>> > PMD: check_rx_burst_bulk_alloc_preconditions(): Rx Burst Bulk Alloc Preconditions: rxq->rx_free_thresh=0, RTE_PMD_IXGBE_RX_MAX_BURST=32
>> >
>> > testpmd> start
>> >   io packet forwarding - CRC stripping disabled - packets/burst=32
>> >   nb forwarding cores=1 - nb forwarding ports=2
>> >   RX queues=1 - RX desc=128 - RX free threshold=0
>> >   RX threshold registers: pthresh=8 hthresh=8 wthresh=0
>> >   TX q

[dpdk-dev] [PATCH v2] Change alarm cancel function to thread-safe:

2014-09-29 Thread Wodkowski, PawelX

> Yes, this is my concern exactly.
> 
> >  If that's so, then I suppose we can do: make alarm_cancel() return a
> > negative value for case #3 (-EINPROGRESS or something).
> >  Something like:
> > ...
> > if (ap->executing == 0) {
> >     LIST_REMOVE(ap, next);
> >     rte_free(ap);
> >     count++;
> >     ap = ap_prev;
> > } else if (pthread_equal(ap->executing_id, pthread_self()) == 0) {
> >     executing++;
> > } else {
> >     ret = -EINPROGRESS;
> > }
> > ...
> > return ((ret != 0) ? ret : count);
> >
> > So the return value will be > 0 for #1, 0 for #2, <0 for #3.
> > As I remember, you already suggested something similar in one of the
> > previous mails.
> Yes, I rolled the API changes I suggested in with this model, because I wanted
> to be able to do precise specification of a timer instance to cancel, but if
> we're not ready to make that change, I think what you propose above would be
> sufficient.  There's some question as to whether we would cancel timers that
> are still pending on a return of -EINPROGRESS, but I think if we document it
> accordingly, then it can be worked out just fine.
> 
> Best
> Neil
> 

Imagine how you will be cursed by someone who did not even notice your change
and who is managing some kind of resource based on the returned number of
set/canceled timers. If you suddenly start returning negative values, how will
those applications behave? Silently changing the domain of a return value is
evil in its purest form.

From my point of view, the problem is virtual, because it is the user
application's task to know what it can and cannot do. If you really want to
inform the user application about timer state, you can introduce an API call
which will interrogate the timer list and return an appropriate value, but for
God's sake, do not introduce untraceable bugs.

Pawel
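
The separate query call suggested above might look roughly like the sketch below. It is an illustration only: `struct demo_alarm`, `demo_alarm_query()` and the state names are hypothetical and are not part of the rte_alarm API, and a real version would need to hold the alarm list lock while walking the list.

```c
#include <stddef.h>

typedef void (*demo_alarm_cb)(void *);

enum demo_alarm_state {
	DEMO_ALARM_NOT_FOUND = 0,  /* no matching alarm in the list */
	DEMO_ALARM_PENDING,        /* matching alarm(s) set, none running */
	DEMO_ALARM_EXECUTING       /* at least one matching callback is running */
};

/* Hypothetical, simplified alarm entry. */
struct demo_alarm {
	struct demo_alarm *next;
	demo_alarm_cb cb;
	void *arg;
	int executing;
};

/* No-op callback, provided only so the sketch can be exercised. */
static void
demo_noop_cb(void *arg)
{
	(void)arg;
}

/* Report the state of alarms matching (cb, arg) without modifying the list. */
static enum demo_alarm_state
demo_alarm_query(const struct demo_alarm *head, demo_alarm_cb cb,
		 const void *arg)
{
	enum demo_alarm_state state = DEMO_ALARM_NOT_FOUND;
	const struct demo_alarm *ap;

	for (ap = head; ap != NULL; ap = ap->next) {
		if (ap->cb != cb || ap->arg != arg)
			continue;
		if (ap->executing)
			return DEMO_ALARM_EXECUTING;
		state = DEMO_ALARM_PENDING;
	}
	return state;
}
```

With a call of this kind, the cancel function can keep returning only a non-negative count, and callers that care about an in-flight callback ask for it explicitly.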

[dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture

2014-09-29 Thread Chao CH Zhu

Hi, Hemant 

Actually, I submitted another set of patches to split the architecture 
specific operations which includes the patch to 
librte_eal\common\include\rte_atomic.h. Please refer to the previous 
email.   

Best Regards!
--
Chao Zhu (??)
Research Staff Member
Cloud Infrastructure and Technology Group
IBM China Research Lab
Building 19 Zhongguancun Software Park
8 Dongbeiwang West Road, Haidian District,
Beijing, PRC. 100193
Tel: +86-10-58748711
Email: bjzhuc at cn.ibm.com




From:   "Hemant at freescale.com" 
To: Chao CH Zhu/China/IBM at IBMCN, "dev at dpdk.org" 
Date:   2014/09/29 14:15
Subject:RE: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM 
Power   architecture



Hi Chao,

This Patch seems to be incomplete. You may also need to patch the 
librte_eal\common\include\rte_atomic.h 
e.g.
#if !(defined RTE_ARCH_X86_64) || !(defined RTE_ARCH_I686)
#include 
#else /* if Intel*/

Otherwise you will get compilation errors for "_mm_mfence".

The same is true for other common header files as well.


Regards,
Hemant


[dpdk-dev] [PATCH 10/12] Add cache size define for IBM Power Architecture

2014-09-29 Thread Chao CH Zhu

Hi,Hemant, 

Actually, the set of patches is only for IBM Power7/8, which have a different
cache line size. Of course, a better way may be to detect the cache line
size at runtime rather than from configuration files. Maybe we can submit that
kind of patch later.
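
A minimal sketch of such runtime detection, assuming a glibc-based Linux system: `_SC_LEVEL1_DCACHE_LINESIZE` is a glibc extension to `sysconf()` and may be missing or report 0 on some platforms, so the sketch falls back to a default. `demo_cache_line_size()` and `DEMO_DEFAULT_CACHE_LINE` are hypothetical names, and the fallback of 64 is only an assumed value.

```c
#include <unistd.h>

/* Assumed fallback when the system cannot report the line size. */
#define DEMO_DEFAULT_CACHE_LINE 64

/* Return the L1 data cache line size in bytes, or the fallback. */
static long
demo_cache_line_size(void)
{
	long sz = -1;

#ifdef _SC_LEVEL1_DCACHE_LINESIZE
	sz = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
#endif
	return sz > 0 ? sz : DEMO_DEFAULT_CACHE_LINE;
}
```

A value obtained this way is only known at run time, so it cannot replace a compile-time constant used for structure alignment; that is one reason a build-time setting may still be needed.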

Best Regards!
--
Chao Zhu (??)
Research Staff Member
Cloud Infrastructure and Technology Group
IBM China Research Lab
Building 19 Zhongguancun Software Park
8 Dongbeiwang West Road, Haidian District,
Beijing, PRC. 100193
Tel: +86-10-58748711
Email: bjzhuc at cn.ibm.com




From:   "Hemant at freescale.com" 
To: Chao CH Zhu/China/IBM at IBMCN, "dev at dpdk.org" 
Date:   2014/09/29 14:20
Subject:RE: [dpdk-dev] [PATCH 10/12] Add cache size define for IBM 
Power   Architecture



> --- a/mk/arch/powerpc/rte.vars.mk
> +++ b/mk/arch/powerpc/rte.vars.mk
> @@ -32,7 +32,7 @@
>  ARCH  ?= powerpc
>  CROSS ?=
> 
> -CPU_CFLAGS  ?= -m64
> +CPU_CFLAGS  ?= -m64 -DCACHE_LINE_SIZE=128

 [hemant]  Instead of hardcoding the CACHE_LINE_SIZE, can you derive the
CACHE_LINE_SIZE from the config file? Other PowerPC processors have it as 64.


>  CPU_LDFLAGS ?=
>  CPU_ASFLAGS ?= -felf64
> 
> --
> 1.7.1

[dpdk-dev] [PATCH v2 00/18] Update IXGBE base code

2014-09-29 Thread Ouyang Changchun

This patch series updates the IXGBE base code (a.k.a. share code) from
package 2014.03.13 to package 2014.09.04.

v2 change:
  -- Regenerate the patch files based on the latest commit; otherwise git apply
     will fail.

v1 change:
  -- The update includes the following changes:
1. Change comments and fix typos in IXGBE base code.
2. Clean up IXGBE base code.
3. Implement a function to check command completion for flow director in
   IXGBE base code.
4. Support cloud filter and tunnel in IXGBE base code.
5. Refine the EEPROM checksum calculation function to return either a
   negative error code on error, or the 16-bit checksum, in IXGBE base
   code.
6. Let the caller determine whether it needs to read and return data after
   executing a host interface command in IXGBE base code.
7. Extend the mask from 16 bits to 32 bits for releasing or acquiring the SWFW
   semaphore in IXGBE base code. It is used in reading and writing I2C
   bytes.
8. Implement functions to do I2C byte read and write in IXGBE base code;
   relocate the function ixgbe_mng_enabled.
9. Support device IDs 82599_QSFP and 82599_LS in IXGBE base code.
10. Wait 5 ms when polling the EEC register in IXGBE X540 base code.
11. Define new error types in IXGBE base code; they are used to report
    different kinds of error.
12. Use the hardware MAC type to determine I2C control, clock in/out, and data
    in/out in IXGBE base code.
13. Store lan_id and the physical semaphore mask in the hardware physical
    information, and use them to control reads and writes of physical
    registers in IXGBE base code.
14. Remove an unnecessary delay when setting up the physical link and
    negotiating in IXGBE base code.
15. Implement a function to reset VF registers to their initial values in IXGBE
    base code.
16. Support these functionalities in IXGBE base code: thermal sensor,
    DMA coalescing, EEE support, source address pruning,
    anti-spoofing, IOSF buffer reading and writing, malicious
    driver detection.
17. Support X550 in IXGBE base code.
18. Support X550 in IXGBE poll mode driver.
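
The return convention described in item 5 (a negative error code on failure, otherwise the 16-bit checksum) can be illustrated with a small sketch. It does not reproduce the base code: `demo_calc_checksum()` and `DEMO_ERR_INVALID_ARG` are hypothetical names, and the plain 16-bit sum is only a placeholder for the real EEPROM checksum algorithm.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_ERR_INVALID_ARG (-1)   /* hypothetical error code */

/*
 * Return a negative error code on failure, or the checksum in the
 * range 0..0xFFFF on success. A signed 32-bit return type leaves room
 * for both without ambiguity.
 */
static int32_t
demo_calc_checksum(const uint16_t *words, size_t count)
{
	uint16_t sum = 0;
	size_t i;

	if (words == NULL)
		return DEMO_ERR_INVALID_ARG;

	for (i = 0; i < count; i++)
		sum = (uint16_t)(sum + words[i]);   /* wraps modulo 2^16 */

	return (int32_t)sum;
}
```

A caller would test the result for a negative value first and only then narrow it to a 16-bit checksum.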

Changchun Ouyang (18):
  Update comments and fix some comments typo in IXGBE share code.
  Clean up IXGBE share code.
  Implement a function to check command complete for flow director in
IXGBE share code.
  Support cloud mode in IXGBE share code.
  Refine function to let eeprom checksum calculation return either a
negative error code on error, or the 16-bit checksum in IXGBE share
code.
  Let caller determine if it need read and return data or not after
executing host interface command in IXGBE share code.
  Extend mask from 16 bits to 32 bits for releasing or acquiring SWFW
semaphore in IXGBE share code. It is used in reading and writing I2C
byte.
  Implement functions to do I2C byte read and write in IXGBE share code;
relocate function of ixgbe_mng_enabled.
  Support device id 82599_QSFP and 82599_LS in IXGBE share code.
  It need wait for 5 msec for polling EEC register in IXGBE X540 share
code.
  Define new error type in IXGBE share code, they are used to report
different kinds of error.
  Use hardware MAC type to determine I2C control, clock in/out, data
in/out in IXGBE share code.
  Store lan_id and physical semaphore mask into hw->phy, and use them to
control read and write physical registers in IXGBE share code.
  Remove unnecessary delay when setting up physical link and negotiate
in IXGBE share code.
  Implement a function to reset VF register to initial values in IXGBE
share code.
  Support these functionalities in IXGBE share code: Thermal sensor,
DMA coalescing, EEE support, Source address pruning,
Anti-spoofing, Iosf buffer reading and writing, Malicious
driver detection.
  Support X550 in IXGBE share code.
  Support X550 in IXGBE poll mode driver.

 lib/librte_eal/common/include/rte_pci_dev_ids.h |   14 +
 lib/librte_ether/rte_ethdev.h   |2 +-
 lib/librte_pmd_ixgbe/Makefile   |2 +
 lib/librte_pmd_ixgbe/ixgbe/README   |3 +-
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_82598.c|2 -
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c|  404 +++--
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.c  |  212 ++-
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.h  |   24 +-
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c   |  386 -
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.h   |   23 +-
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_dcb.c  |   20 +
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_dcb_82598.c|2 +
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_dcb_82599.c|1 +
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_mbx.c  |4 +
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_osdep.h|   12 +
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c  |  651 ++--
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.h  |   23 +-
 lib/librte_pmd_ixgbe/ixgbe/ixgb

[dpdk-dev] [PATCH v2 01/18] ixgbe: Update comments and fix some comments typo in IXGBE base code

2014-09-29 Thread Ouyang Changchun

This patch updates comments and fixes some typos in comments; for example,
'tx' is changed to 'Tx' and 'cloude' to 'cloud'.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c  | 36 +++
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c |  2 +-
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_x540.c   | 16 ++
 3 files changed, 25 insertions(+), 29 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
index ed97ad9..835331b 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
@@ -146,7 +146,7 @@ s32 ixgbe_init_phy_ops_82599(struct ixgbe_hw *hw)
  &ixgbe_get_copper_link_capabilities_generic;
}

-   /* Set necessary function pointers based on phy type */
+   /* Set necessary function pointers based on PHY type */
switch (hw->phy.type) {
case ixgbe_phy_tn:
phy->ops.setup_link = &ixgbe_setup_phy_link_tnx;
@@ -264,7 +264,7 @@ s32 prot_autoc_read_82599(struct ixgbe_hw *hw, bool *locked, u32 *reg_val)
  * @locked: bool to indicate whether the SW/FW lock was already taken by
  *   previous proc_autoc_read_82599.
  *
- * This part (82599) may need to hold a the SW/FW lock around all writes to
+ * This part (82599) may need to hold the SW/FW lock around all writes to
  * AUTOC. Likewise after a write we need to do a pipeline reset.
  */
 s32 prot_autoc_write_82599(struct ixgbe_hw *hw, u32 autoc, bool locked)
@@ -664,7 +664,7 @@ void ixgbe_disable_tx_laser_multispeed_fiber(struct ixgbe_hw *hw)
if (ixgbe_check_reset_blocked(hw))
return;

-   /* Disable tx laser; allow 100us to go dark per spec */
+   /* Disable Tx laser; allow 100us to go dark per spec */
esdp_reg |= IXGBE_ESDP_SDP3;
IXGBE_WRITE_REG(hw, IXGBE_ESDP, esdp_reg);
IXGBE_WRITE_FLUSH(hw);
@@ -683,7 +683,7 @@ void ixgbe_enable_tx_laser_multispeed_fiber(struct ixgbe_hw *hw)
 {
u32 esdp_reg = IXGBE_READ_REG(hw, IXGBE_ESDP);

-   /* Enable tx laser; allow 100ms to light up */
+   /* Enable Tx laser; allow 100ms to light up */
esdp_reg &= ~IXGBE_ESDP_SDP3;
IXGBE_WRITE_REG(hw, IXGBE_ESDP, esdp_reg);
IXGBE_WRITE_FLUSH(hw);
@@ -697,7 +697,7 @@ void ixgbe_enable_tx_laser_multispeed_fiber(struct ixgbe_hw *hw)
  *  When the driver changes the link speeds that it can support,
  *  it sets autotry_restart to true to indicate that we need to
  *  initiate a new autotry session with the link partner.  To do
- *  so, we set the speed then disable and re-enable the tx laser, to
+ *  so, we set the speed then disable and re-enable the Tx laser, to
  *  alert the link partner that it also needs to restart autotry on its
  *  end.  This is consistent with true clause 37 autoneg, which also
  *  involves a loss of signal.
@@ -842,7 +842,7 @@ s32 ixgbe_setup_mac_link_multispeed_fiber(struct ixgbe_hw *hw,
if (status != IXGBE_SUCCESS)
return status;

-   /* Flap the tx laser if it has not already been done */
+   /* Flap the Tx laser if it has not already been done */
ixgbe_flap_tx_laser(hw);

/* Wait for the link partner to also set speed */
@@ -1461,7 +1461,7 @@ s32 ixgbe_init_fdir_signature_82599(struct ixgbe_hw *hw, 
u32 fdirctrl)
  *  @hw: pointer to hardware structure
  *  @fdirctrl: value to write to flow director control register, initially
  *  contains just the value of the Rx packet buffer allocation
- *  @cloud_mode: true - cloude mode, false - other mode
+ *  @cloud_mode: true - cloud mode, false - other mode
  **/
 s32 ixgbe_init_fdir_perfect_82599(struct ixgbe_hw *hw, u32 fdirctrl,
bool cloud_mode)
@@ -1513,14 +1513,14 @@ do { \
bucket_hash ^= hi_hash_dword >> n; \
else if (IXGBE_ATR_SIGNATURE_HASH_KEY & (0x01 << (n + 16))) \
sig_hash ^= hi_hash_dword << (16 - n); \
-} while (0);
+} while (0)

 /**
  *  ixgbe_atr_compute_sig_hash_82599 - Compute the signature hash
  *  @stream: input bitstream to compute the hash on
  *
  *  This function is almost identical to the function above but contains
- *  several optomizations such as unwinding all of the loops, letting the
+ *  several optimizations such as unwinding all of the loops, letting the
  *  compiler work out all of the conditional ifs since the keys are static
  *  defines, and computing two keys at once since the hashed dword stream
  *  will be the same for both keys.
@@ -1549,7 +1549,7 @@ u32 ixgbe_atr_compute_sig_hash_82599(union 
ixgbe_atr_hash_dword input,
/*
 * apply flow ID/VM pool/VLAN ID bits to lo hash dword, we had to
 * delay this because bit 0 of the stream should not be processed
-* so we do not add the vlan until after bit 0 was processed
+* so we do no

[dpdk-dev] [PATCH v2 03/18] ixgbe: New function to check command complete in IXGBE base code

2014-09-29 Thread Ouyang Changchun

This patch implements a function that checks whether a flow director command
has completed in the IXGBE base code, and replaces the related code snippets
with calls to this function.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c | 63 +++-
 1 file changed, 38 insertions(+), 25 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
index 046a35e..126aa24 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
@@ -1313,11 +1313,30 @@ reset_hw_out:
 }

 /**
+ * ixgbe_fdir_check_cmd_complete - poll to check whether FDIRCMD is complete
+ * @hw: pointer to hardware structure
+ */
+STATIC s32 ixgbe_fdir_check_cmd_complete(struct ixgbe_hw *hw)
+{
+   int i;
+
+   for (i = 0; i < IXGBE_FDIRCMD_CMD_POLL; i++) {
+   if (!(IXGBE_READ_REG(hw, IXGBE_FDIRCMD) &
+ IXGBE_FDIRCMD_CMD_MASK))
+   return IXGBE_SUCCESS;
+   usec_delay(10);
+   }
+
+   return IXGBE_ERR_FDIR_CMD_INCOMPLETE;
+}
+
+/**
  *  ixgbe_reinit_fdir_tables_82599 - Reinitialize Flow Director tables.
  *  @hw: pointer to hardware structure
  **/
 s32 ixgbe_reinit_fdir_tables_82599(struct ixgbe_hw *hw)
 {
+   s32 err;
int i;
u32 fdirctrl = IXGBE_READ_REG(hw, IXGBE_FDIRCTRL);
fdirctrl &= ~IXGBE_FDIRCTRL_INIT_DONE;
@@ -1328,16 +1347,10 @@ s32 ixgbe_reinit_fdir_tables_82599(struct ixgbe_hw *hw)
 * Before starting reinitialization process,
 * FDIRCMD.CMD must be zero.
 */
-   for (i = 0; i < IXGBE_FDIRCMD_CMD_POLL; i++) {
-   if (!(IXGBE_READ_REG(hw, IXGBE_FDIRCMD) &
- IXGBE_FDIRCMD_CMD_MASK))
-   break;
-   usec_delay(10);
-   }
-   if (i >= IXGBE_FDIRCMD_CMD_POLL) {
-   DEBUGOUT("Flow Director previous command isn't complete, "
-"aborting table re-initialization.\n");
-   return IXGBE_ERR_FDIR_REINIT_FAILED;
+   err = ixgbe_fdir_check_cmd_complete(hw);
+   if (err) {
+   DEBUGOUT("Flow Director previous command did not complete, aborting table re-initialization.\n");
+   return err;
}

IXGBE_WRITE_REG(hw, IXGBE_FDIRFREE, 0);
@@ -1593,8 +1606,9 @@ s32 ixgbe_fdir_add_signature_filter_82599(struct ixgbe_hw 
*hw,
  union ixgbe_atr_hash_dword common,
  u8 queue)
 {
-   u64  fdirhashcmd;
-   u32  fdircmd;
+   u64 fdirhashcmd;
+   u32 fdircmd;
+   s32 err;

DEBUGFUNC("ixgbe_fdir_add_signature_filter_82599");

@@ -1630,6 +1644,12 @@ s32 ixgbe_fdir_add_signature_filter_82599(struct 
ixgbe_hw *hw,
fdirhashcmd |= ixgbe_atr_compute_sig_hash_82599(input, common);
IXGBE_WRITE_REG64(hw, IXGBE_FDIRHASH, fdirhashcmd);

+   err = ixgbe_fdir_check_cmd_complete(hw);
+   if (err) {
+   DEBUGOUT("Flow Director command did not complete!\n");
+   return err;
+   }
+
DEBUGOUT2("Tx Queue=%x hash=%x\n", queue, (u32)fdirhashcmd);

return IXGBE_SUCCESS;
@@ -1906,8 +1926,7 @@ s32 ixgbe_fdir_erase_perfect_filter_82599(struct ixgbe_hw 
*hw,
 {
u32 fdirhash;
u32 fdircmd = 0;
-   u32 retry_count;
-   s32 err = IXGBE_SUCCESS;
+   s32 err;

/* configure FDIRHASH register */
fdirhash = input->formatted.bkt_hash;
@@ -1920,18 +1939,12 @@ s32 ixgbe_fdir_erase_perfect_filter_82599(struct 
ixgbe_hw *hw,
/* Query if filter is present */
IXGBE_WRITE_REG(hw, IXGBE_FDIRCMD, IXGBE_FDIRCMD_CMD_QUERY_REM_FILT);

-   for (retry_count = 10; retry_count; retry_count--) {
-   /* allow 10us for query to process */
-   usec_delay(10);
-   /* verify query completed successfully */
-   fdircmd = IXGBE_READ_REG(hw, IXGBE_FDIRCMD);
-   if (!(fdircmd & IXGBE_FDIRCMD_CMD_MASK))
-   break;
+   err = ixgbe_fdir_check_cmd_complete(hw);
+   if (err) {
+   DEBUGOUT("Flow Director command did not complete!\n");
+   return err;
}

-   if (!retry_count)
-   err = IXGBE_ERR_FDIR_REINIT_FAILED;
-
/* if filter exists in hardware then remove it */
if (fdircmd & IXGBE_FDIRCMD_FILTER_VALID) {
IXGBE_WRITE_REG(hw, IXGBE_FDIRHASH, fdirhash);
@@ -1940,7 +1953,7 @@ s32 ixgbe_fdir_erase_perfect_filter_82599(struct ixgbe_hw 
*hw,
IXGBE_FDIRCMD_CMD_REMOVE_FLOW);
}

-   return err;
+   return IXGBE_SUCCESS;
 }

 /**
-- 
1.8.4.2
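
The helper added by this patch is a bounded polling loop: read the command
register up to a fixed number of times and report failure if the command bits
never clear. A minimal standalone sketch of that pattern follows. The register
is simulated, and the names fake_hw, read_cmd_reg, CMD_MASK and POLL_LIMIT are
hypothetical stand-ins, not the real ixgbe definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real values live in the ixgbe base code. */
#define CMD_MASK   0x3u   /* command bits that must read back as zero */
#define POLL_LIMIT 10     /* maximum number of register reads */

/* Simulated command register: reports "busy" for a fixed number of reads. */
struct fake_hw {
	uint32_t busy_reads_left;
	uint32_t reads;          /* how many times the register was read */
};

static uint32_t read_cmd_reg(struct fake_hw *hw)
{
	hw->reads++;
	if (hw->busy_reads_left > 0) {
		hw->busy_reads_left--;
		return CMD_MASK;     /* command still in flight */
	}
	return 0;                    /* command bits cleared */
}

/* Returns 0 once the command bits clear, -1 if the poll budget runs out. */
static int check_cmd_complete(struct fake_hw *hw)
{
	int i;

	assert(hw != NULL);

	for (i = 0; i < POLL_LIMIT; i++) {
		if (!(read_cmd_reg(hw) & CMD_MASK))
			return 0;
		/* A real driver would delay between reads here. */
	}
	return -1;
}
```

Returning from inside the loop, as the patch does, removes the need for the
separate "did the loop run out" test that the old open-coded versions carried.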

[dpdk-dev] [PATCH v2 02/18] ixgbe: Clean up IXGBE base codes

2014-09-29 Thread Ouyang Changchun

This patch cleans up the IXGBE base code, for example by removing unnecessary
return statements and reducing the use of goto statements.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_82598.c | 2 --
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c | 7 +++
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c| 2 +-
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.h| 2 --
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_dcb_82598.c | 2 ++
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_dcb_82599.c | 1 +
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c   | 3 ++-
 7 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82598.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82598.c
index ee2217d..c8ce893 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82598.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82598.c
@@ -1417,8 +1417,6 @@ STATIC void ixgbe_set_rxpba_82598(struct ixgbe_hw *hw, 
int num_pb,
/* Setup Tx packet buffer sizes */
for (i = 0; i < IXGBE_MAX_PACKET_BUFFERS; i++)
IXGBE_WRITE_REG(hw, IXGBE_TXPBSIZE(i), IXGBE_TXPBSIZE_40KB);
-
-   return;
 }

 /**
diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
index 835331b..046a35e 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
@@ -2103,7 +2103,7 @@ s32 ixgbe_identify_phy_82599(struct ixgbe_hw *hw)
if (status != IXGBE_SUCCESS) {
/* 82599 10GBASE-T requires an external PHY */
if (hw->mac.ops.get_media_type(hw) == ixgbe_media_type_copper)
-   goto out;
+   return status;
else
status = ixgbe_identify_module_generic(hw);
}
@@ -2111,14 +2111,13 @@ s32 ixgbe_identify_phy_82599(struct ixgbe_hw *hw)
/* Set PHY type none if no PHY detected */
if (hw->phy.type == ixgbe_phy_unknown) {
hw->phy.type = ixgbe_phy_none;
-   status = IXGBE_SUCCESS;
+   return IXGBE_SUCCESS;
}

/* Return error if SFP module has been detected but is not supported */
if (hw->phy.type == ixgbe_phy_sfp_unsupported)
-   status = IXGBE_ERR_SFP_NOT_SUPPORTED;
+   return IXGBE_ERR_SFP_NOT_SUPPORTED;

-out:
return status;
 }

diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
index 8084659..e36b3a8 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
@@ -1060,7 +1060,7 @@ s32 ixgbe_stop_adapter_generic(struct ixgbe_hw *hw)
hw->adapter_stopped = true;

/* Disable the receive unit */
-   IXGBE_WRITE_REG(hw, IXGBE_RXCTRL, 0);
+   ixgbe_disable_rx(hw);

/* Clear interrupt mask to stop interrupts from being generated */
IXGBE_WRITE_REG(hw, IXGBE_EIMC, IXGBE_IRQ_CLEAR_MASK);
diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.h 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.h
index 80e47c1..8ee1dba 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.h
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.h
@@ -41,9 +41,7 @@ POSSIBILITY OF SUCH DAMAGE.
IXGBE_WRITE_REG(hw, reg, (u32) value); \
IXGBE_WRITE_REG(hw, reg + 4, (u32) (value >> 32)); \
} while (0)
-#ifndef IXGBE_REMOVED
 #define IXGBE_REMOVED(a) (0)
-#endif /* IXGBE_REMOVED */
 struct ixgbe_pba {
u16 word[2];
u16 *pba_block;
diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_dcb_82598.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_dcb_82598.c
index 52c7e72..a6161cd 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_dcb_82598.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_dcb_82598.c
@@ -347,6 +347,8 @@ s32 ixgbe_dcb_hw_config_82598(struct ixgbe_hw *hw, int 
link_speed,
  u16 *refill, u16 *max, u8 *bwg_id,
  u8 *tsa)
 {
+   UNREFERENCED_1PARAMETER(link_speed);
+
ixgbe_dcb_config_rx_arbiter_82598(hw, refill, max, tsa);
ixgbe_dcb_config_tx_desc_arbiter_82598(hw, refill, max, bwg_id,
   tsa);
diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_dcb_82599.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_dcb_82599.c
index 2bcf1c7..e754d1a 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_dcb_82599.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_dcb_82599.c
@@ -580,6 +580,7 @@ s32 ixgbe_dcb_hw_config_82599(struct ixgbe_hw *hw, int 
link_speed,
  u16 *refill, u16 *max, u8 *bwg_id, u8 *tsa,
  u8 *map)
 {
+   UNREFERENCED_1PARAMETER(link_speed);

ixgbe_dcb_config_rx_arbiter_82599(hw, refill, max, bwg_id, tsa,
  map);
diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c
index 7d2ed2a..4271f70 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c
@@ -1804,7 +1804,7

[dpdk-dev] [PATCH v2 04/18] ixgbe: Support cloud mode in IXGBE base code

2014-09-29 Thread Ouyang Changchun

This patch adds support for cloud mode in the IXGBE base code.
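
In cloud mode the diff below packs the six bytes of the inner MAC address into
two 32-bit words before writing them to the filter registers. The following is
a standalone sketch of that packing only; pack_mac is a hypothetical helper
name, and the register writes and the tunnel-type flag are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack a 6-byte MAC address into two 32-bit words, byte 0 in the lowest bits. */
static void pack_mac(const uint8_t mac[6], uint32_t *low, uint32_t *high)
{
	assert(mac != NULL && low != NULL && high != NULL);

	/* Bytes 0..3 fill the low word. */
	*low = (uint32_t)mac[0] |
	       ((uint32_t)mac[1] << 8) |
	       ((uint32_t)mac[2] << 16) |
	       ((uint32_t)mac[3] << 24);

	/* Bytes 4..5 fill the lower half of the high word. */
	*high = (uint32_t)mac[4] |
		((uint32_t)mac[5] << 8);
}
```

The casts to a 32-bit type before shifting matter: without them the bytes are
promoted to int, and shifting a byte with its top bit set left by 24 would
overflow a signed int.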

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c | 70 
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h  | 10 +
 2 files changed, 80 insertions(+)

diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
index 126aa24..adf0e52 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
@@ -1497,6 +1497,9 @@ s32 ixgbe_init_fdir_perfect_82599(struct ixgbe_hw *hw, 
u32 fdirctrl,
(0xA << IXGBE_FDIRCTRL_MAX_LENGTH_SHIFT) |
(4 << IXGBE_FDIRCTRL_FULL_THRESH_SHIFT);

+   if (cloud_mode)
+   fdirctrl |=(IXGBE_FDIRCTRL_FILTERMODE_CLOUD <<
+   IXGBE_FDIRCTRL_FILTERMODE_SHIFT);

/* write hashes and fdirctrl register, poll for completion */
ixgbe_fdir_enable_82599(hw, fdirctrl);
@@ -1766,6 +1769,7 @@ s32 ixgbe_fdir_set_input_mask_82599(struct ixgbe_hw *hw,
/* mask IPv6 since it is currently not supported */
u32 fdirm = IXGBE_FDIRM_DIPv6;
u32 fdirtcpm;
+   u32 fdirip6m;
DEBUGFUNC("ixgbe_fdir_set_atr_input_mask_82599");

/*
@@ -1838,6 +1842,49 @@ s32 ixgbe_fdir_set_input_mask_82599(struct ixgbe_hw *hw,
return IXGBE_ERR_CONFIG;
}

+   if (cloud_mode) {
+   fdirm |= IXGBE_FDIRM_L3P;
+   fdirip6m = ((u32) 0xU << IXGBE_FDIRIP6M_DIPM_SHIFT);
+   fdirip6m |= IXGBE_FDIRIP6M_ALWAYS_MASK;
+
+   switch (input_mask->formatted.inner_mac[0] & 0xFF) {
+   case 0x00:
+   /* Mask inner MAC, fall through */
+   fdirip6m |= IXGBE_FDIRIP6M_INNER_MAC;
+   case 0xFF:
+   break;
+   default:
+   DEBUGOUT(" Error on inner_mac byte mask\n");
+   return IXGBE_ERR_CONFIG;
+   }
+
+   switch (input_mask->formatted.tni_vni & 0x) {
+   case 0x0:
+   /* Mask vxlan id */
+   fdirip6m |= IXGBE_FDIRIP6M_TNI_VNI;
+   break;
+   case 0x00FF:
+   fdirip6m |= IXGBE_FDIRIP6M_TNI_VNI_24;
+   break;
+   case 0x:
+   break;
+   default:
+   DEBUGOUT(" Error on TNI/VNI byte mask\n");
+   return IXGBE_ERR_CONFIG;
+   }
+
+   switch (input_mask->formatted.tunnel_type & 0x) {
+   case 0x0:
+   /* Mask tunnel type, fall through */
+   fdirip6m |= IXGBE_FDIRIP6M_TUNNEL_TYPE;
+   case 0x:
+   break;
+   default:
+   DEBUGOUT(" Error on tunnel type byte mask\n");
+   return IXGBE_ERR_CONFIG;
+   }
+   IXGBE_WRITE_REG_BE32(hw, IXGBE_FDIRIP6M, fdirip6m);
+   }

/* Now mask VM pool and destination IPv6 - bits 5 and 2 */
IXGBE_WRITE_REG(hw, IXGBE_FDIRM, fdirm);
@@ -1863,6 +1910,9 @@ s32 ixgbe_fdir_write_perfect_filter_82599(struct ixgbe_hw 
*hw,
  u16 soft_id, u8 queue, bool cloud_mode)
 {
u32 fdirport, fdirvlan, fdirhash, fdircmd;
+   u32 addr_low, addr_high;
+   u32 cloud_type = 0;
+   s32 err;

DEBUGFUNC("ixgbe_fdir_write_perfect_filter_82599");

@@ -1892,6 +1942,21 @@ s32 ixgbe_fdir_write_perfect_filter_82599(struct 
ixgbe_hw *hw,
fdirvlan |= IXGBE_NTOHS(input->formatted.vlan_id);
IXGBE_WRITE_REG(hw, IXGBE_FDIRVLAN, fdirvlan);

+   if (cloud_mode) {
+   if (input->formatted.tunnel_type != 0)
+   cloud_type = 0x8000;
+
+   addr_low = ((u32)input->formatted.inner_mac[0] |
+   ((u32)input->formatted.inner_mac[1] << 8) |
+   ((u32)input->formatted.inner_mac[2] << 16) |
+   ((u32)input->formatted.inner_mac[3] << 24));
+   addr_high = ((u32)input->formatted.inner_mac[4] |
+   ((u32)input->formatted.inner_mac[5] << 8));
+   cloud_type |= addr_high;
+   IXGBE_WRITE_REG_BE32(hw, IXGBE_FDIRSIPv6(0), addr_low);
+   IXGBE_WRITE_REG_BE32(hw, IXGBE_FDIRSIPv6(1), cloud_type);
+   IXGBE_WRITE_REG_BE32(hw, IXGBE_FDIRSIPv6(2), input->formatted.tni_vni);
+   }

/* configure FDIRHASH register */
fdirhash = input->formatted.bkt_hash;
@@ -1916,6 +1981,11 @@ s32 ixgbe_fdir_write_perfect_filter_82599(struct 
ixgbe_hw *hw,
fdircmd |= (u32)input->formatted.vm_pool << IXGBE_FDIRCMD_VT_POOL_SHIFT;

IXGBE_WRITE_REG(hw, IXGBE_FDIRCMD, fdircmd);
+

[dpdk-dev] [PATCH v2 06/18] ixgbe: New argument in host interface command function

2014-09-29 Thread Ouyang Changchun

This patch introduces a new argument that lets the caller decide whether data
should be read back and returned after a host interface command is executed in
the IXGBE base code.
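
The effect of such a flag can be sketched in isolation: the reply is copied
into the caller's buffer only when the flag is true. This is an illustration
under stated assumptions, not the driver code. The names run_command and
fake_reply are hypothetical, the device is replaced by a constant array, and
all failures are collapsed into a single -1 where the patch returns distinct
error codes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reply area standing in for the device's RAM window. */
static const uint32_t fake_reply[2] = { 0x11223344u, 0x55667788u };

/*
 * Copies the reply into buffer only when return_data is true; otherwise
 * the caller's buffer is left untouched.
 */
static int run_command(uint32_t *buffer, uint32_t length, bool return_data)
{
	assert(buffer != NULL);

	/* Reject empty lengths and lengths that are not a multiple of 4. */
	if (length == 0 || length % sizeof(uint32_t) != 0)
		return -1;

	/* (Writing the command and waiting for completion is omitted.) */

	if (!return_data)
		return 0;

	/* Never copy more than the simulated reply holds. */
	if (length > sizeof(fake_reply))
		length = (uint32_t)sizeof(fake_reply);
	memcpy(buffer, fake_reply, length);
	return 0;
}
```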

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c | 66 ++-
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.h |  3 +-
 2 files changed, 40 insertions(+), 29 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
index 4bd004c..f8f4e7e 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
@@ -4341,41 +4341,50 @@ u8 ixgbe_calculate_checksum(u8 *buffer, u32 length)
  *  @buffer: contains the command to write and where the return status will
  *   be placed
  *  @length: length of buffer, must be multiple of 4 bytes
+ *  @return_data: read and return data from the buffer (true) or not (false)
+ *   Needed because FW structures are big endian and decoding of
+ *   these fields can be 8 bit or 16 bit based on command. Decoding
+ *   is not easily understood without making a table of commands.
+ *   So we will leave this up to the caller to read back the data
+ *   in these cases.
  *
  *  Communicates with the manageability block.  On success return IXGBE_SUCCESS
  *  else return IXGBE_ERR_HOST_INTERFACE_COMMAND.
  **/
 s32 ixgbe_host_interface_command(struct ixgbe_hw *hw, u32 *buffer,
-u32 length)
+u32 length, bool return_data)
 {
-   u32 hicr, i, bi;
+   u32 hicr, i, bi, fwsts;
u32 hdr_size = sizeof(struct ixgbe_hic_hdr);
-   u8 buf_len, dword_len;
-
-   s32 ret_val = IXGBE_SUCCESS;
+   u16 buf_len;
+   u8 dword_len;

DEBUGFUNC("ixgbe_host_interface_command");

-   if (length == 0 || length & 0x3 ||
-   length > IXGBE_HI_MAX_BLOCK_BYTE_LENGTH) {
-   DEBUGOUT("Buffer length failure.\n");
-   ret_val = IXGBE_ERR_HOST_INTERFACE_COMMAND;
-   goto out;
+   if (length == 0 || length > IXGBE_HI_MAX_BLOCK_BYTE_LENGTH) {
+   DEBUGOUT1("Buffer length failure buffersize=%d.\n", length);
+   return IXGBE_ERR_HOST_INTERFACE_COMMAND;
}
+   /* Set bit 9 of FWSTS clearing FW reset indication */
+   fwsts = IXGBE_READ_REG(hw, IXGBE_FWSTS);
+   IXGBE_WRITE_REG(hw, IXGBE_FWSTS, fwsts | IXGBE_FWSTS_FWRI);

/* Check that the host interface is enabled. */
hicr = IXGBE_READ_REG(hw, IXGBE_HICR);
if ((hicr & IXGBE_HICR_EN) == 0) {
DEBUGOUT("IXGBE_HOST_EN bit disabled.\n");
-   ret_val = IXGBE_ERR_HOST_INTERFACE_COMMAND;
-   goto out;
+   return IXGBE_ERR_HOST_INTERFACE_COMMAND;
+   }
+
+   /* Calculate length in DWORDs. We must be DWORD aligned */
+   if ((length % (sizeof(u32))) != 0) {
+   DEBUGOUT("Buffer length failure, not aligned to dword");
+   return IXGBE_ERR_INVALID_ARGUMENT;
}

-   /* Calculate length in DWORDs */
dword_len = length >> 2;

-   /*
-* The device driver writes the relevant command block
+   /* The device driver writes the relevant command block
 * into the ram area.
 */
for (i = 0; i < dword_len; i++)
@@ -4392,14 +4401,17 @@ s32 ixgbe_host_interface_command(struct ixgbe_hw *hw, 
u32 *buffer,
msec_delay(1);
}

-   /* Check command successful completion. */
+   /* Check command completion */
if (i == IXGBE_HI_COMMAND_TIMEOUT ||
-   (!(IXGBE_READ_REG(hw, IXGBE_HICR) & IXGBE_HICR_SV))) {
-   DEBUGOUT("Command has failed with no status valid.\n");
-   ret_val = IXGBE_ERR_HOST_INTERFACE_COMMAND;
-   goto out;
+   !(IXGBE_READ_REG(hw, IXGBE_HICR) & IXGBE_HICR_SV)) {
+   ERROR_REPORT1(IXGBE_ERROR_CAUTION,
+"Command has failed with no status valid.\n");
+   return IXGBE_ERR_HOST_INTERFACE_COMMAND;
}

+   if (!return_data)
+   return 0;
+
/* Calculate length in DWORDs */
dword_len = hdr_size >> 2;

@@ -4412,25 +4424,23 @@ s32 ixgbe_host_interface_command(struct ixgbe_hw *hw, 
u32 *buffer,
/* If there is any thing in data position pull it in */
buf_len = ((struct ixgbe_hic_hdr *)buffer)->buf_len;
if (buf_len == 0)
-   goto out;
+   return 0;

-   if (length < (buf_len + hdr_size)) {
+   if (length < buf_len + hdr_size) {
DEBUGOUT("Buffer not large enough for reply message.\n");
-   ret_val = IXGBE_ERR_HOST_INTERFACE_COMMAND;
-   goto out;
+   return IXGBE_ERR_HOST_INTERFACE_COMMAND;
}

/* Calculate length in DWORDs, add 3 for odd lengths */
dword_len = (buf_len + 3) >> 2;

-   /* Pull in the rest of the buffer (bi is where we left

[dpdk-dev] [PATCH v2 07/18] ixgbe: Extend mask for SWFW semaphore

2014-09-29 Thread Ouyang Changchun

This patch extends the mask from 16 bits to 32 bits for acquiring and
releasing the SWFW semaphore in the IXGBE base code. The wider mask is used
when reading and writing I2C bytes.
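
Why the parameter width matters can be shown with a small standalone sketch:
any mask bit above bit 15 is silently dropped when it passes through a 16-bit
parameter. The function names are hypothetical, and bit 20 is an arbitrary
choice for illustration; this excerpt does not show which semaphore bits the
driver actually uses.

```c
#include <assert.h>
#include <stdint.h>

/* Build a single-bit mask; the bit must fit in a 32-bit register. */
static uint32_t bit_mask(unsigned int bit)
{
	assert(bit < 32);
	return (uint32_t)1 << bit;
}

/* What a callee sees when its mask parameter is only 16 bits wide. */
static uint32_t seen_through_u16(uint16_t mask)
{
	return mask;
}

/* What a callee sees with the widened 32-bit parameter. */
static uint32_t seen_through_u32(uint32_t mask)
{
	return mask;
}
```

Passing bit_mask(20) through the 16-bit version yields 0, so the callee would
act on an empty mask; the 32-bit version preserves the bit.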

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.c|   4 +-
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.h|   4 +-
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c |   4 +-
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.h |   4 +-
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c|  32 +++--
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h   |   4 +-
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_x540.c   | 108 ++
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_x540.h   |   4 +-
 8 files changed, 88 insertions(+), 76 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.c
index 7e6b092..378304f 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.c
@@ -1178,7 +1178,7 @@ s32 ixgbe_enable_sec_rx_path(struct ixgbe_hw *hw)
  *  Acquires the SWFW semaphore through SW_FW_SYNC register for the specified
  *  function (CSR, PHY0, PHY1, EEPROM, Flash)
  **/
-s32 ixgbe_acquire_swfw_semaphore(struct ixgbe_hw *hw, u16 mask)
+s32 ixgbe_acquire_swfw_semaphore(struct ixgbe_hw *hw, u32 mask)
 {
return ixgbe_call_func(hw, hw->mac.ops.acquire_swfw_sync,
   (hw, mask), IXGBE_NOT_IMPLEMENTED);
@@ -1192,7 +1192,7 @@ s32 ixgbe_acquire_swfw_semaphore(struct ixgbe_hw *hw, u16 
mask)
  *  Releases the SWFW semaphore through SW_FW_SYNC register for the specified
  *  function (CSR, PHY0, PHY1, EEPROM, Flash)
  **/
-void ixgbe_release_swfw_semaphore(struct ixgbe_hw *hw, u16 mask)
+void ixgbe_release_swfw_semaphore(struct ixgbe_hw *hw, u32 mask)
 {
if (hw->mac.ops.release_swfw_sync)
hw->mac.ops.release_swfw_sync(hw, mask);
diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.h 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.h
index da41d95..88a31e8 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.h
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.h
@@ -172,8 +172,8 @@ s32 ixgbe_write_i2c_eeprom(struct ixgbe_hw *hw, u8 
byte_offset, u8 eeprom_data);
 s32 ixgbe_get_san_mac_addr(struct ixgbe_hw *hw, u8 *san_mac_addr);
 s32 ixgbe_set_san_mac_addr(struct ixgbe_hw *hw, u8 *san_mac_addr);
 s32 ixgbe_get_device_caps(struct ixgbe_hw *hw, u16 *device_caps);
-s32 ixgbe_acquire_swfw_semaphore(struct ixgbe_hw *hw, u16 mask);
-void ixgbe_release_swfw_semaphore(struct ixgbe_hw *hw, u16 mask);
+s32 ixgbe_acquire_swfw_semaphore(struct ixgbe_hw *hw, u32 mask);
+void ixgbe_release_swfw_semaphore(struct ixgbe_hw *hw, u32 mask);
 s32 ixgbe_get_wwn_prefix(struct ixgbe_hw *hw, u16 *wwnn_prefix,
 u16 *wwpn_prefix);
 s32 ixgbe_get_fcoe_boot_status(struct ixgbe_hw *hw, u16 *bs);
diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
index f8f4e7e..749188d 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
@@ -3172,7 +3172,7 @@ out:
  *  Acquires the SWFW semaphore through the GSSR register for the specified
  *  function (CSR, PHY0, PHY1, EEPROM, Flash)
  **/
-s32 ixgbe_acquire_swfw_sync(struct ixgbe_hw *hw, u16 mask)
+s32 ixgbe_acquire_swfw_sync(struct ixgbe_hw *hw, u32 mask)
 {
u32 gssr = 0;
u32 swmask = mask;
@@ -3219,7 +3219,7 @@ s32 ixgbe_acquire_swfw_sync(struct ixgbe_hw *hw, u16 mask)
  *  Releases the SWFW semaphore through the GSSR register for the specified
  *  function (CSR, PHY0, PHY1, EEPROM, Flash)
  **/
-void ixgbe_release_swfw_sync(struct ixgbe_hw *hw, u16 mask)
+void ixgbe_release_swfw_sync(struct ixgbe_hw *hw, u32 mask)
 {
u32 gssr;
u32 swmask = mask;
diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.h 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.h
index 8b8bd0b..14f1fec 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.h
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.h
@@ -114,8 +114,8 @@ bool ixgbe_device_supports_autoneg_fc(struct ixgbe_hw *hw);
 void ixgbe_fc_autoneg(struct ixgbe_hw *hw);

 s32 ixgbe_validate_mac_addr(u8 *mac_addr);
-s32 ixgbe_acquire_swfw_sync(struct ixgbe_hw *hw, u16 mask);
-void ixgbe_release_swfw_sync(struct ixgbe_hw *hw, u16 mask);
+s32 ixgbe_acquire_swfw_sync(struct ixgbe_hw *hw, u32 mask);
+void ixgbe_release_swfw_sync(struct ixgbe_hw *hw, u32 mask);
 s32 ixgbe_disable_pcie_master(struct ixgbe_hw *hw);

 s32 prot_autoc_read_generic(struct ixgbe_hw *hw, bool *, u32 *reg_val);
diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c
index 4271f70..4351f4f 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c
@@ -1510,26 +1510,18 @@ s32 ixgbe_write_i2c_eeprom_generic(struct ixgbe_hw *hw, 
u8 byte_offset,
 s32 ixgbe_read_i2c_byte_generic(struct ixgbe_hw *hw, u8 byte_offset,
u8 dev_addr, u8 *data)
 {
-   s32 status = IXGBE_SUCCESS;
+   s32 status;
u

[dpdk-dev] [PATCH v2 10/18] ixgbe: Modify time to wait in polling flash update

2014-09-29 Thread Ouyang Changchun

The IXGBE X540 shared code needs to wait 5 ms between polls of the EEC register.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_x540.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_x540.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_x540.c
index e47fb1d..ab38450 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_x540.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_x540.c
@@ -726,7 +726,7 @@ STATIC s32 ixgbe_poll_flash_update_done_X540(struct 
ixgbe_hw *hw)
status = IXGBE_SUCCESS;
break;
}
-   usec_delay(5);
+   msec_delay(5);
}

if (i == IXGBE_FLUDONE_ATTEMPTS)
-- 
1.8.4.2
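
Switching the per-iteration delay from microseconds to milliseconds scales the
worst-case wait of the whole loop by a factor of 1000. A rough sketch of that
arithmetic follows. The helper name is hypothetical, and the attempt count in
the usage note is a made-up value, since the real IXGBE_FLUDONE_ATTEMPTS is not
shown in this excerpt.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case total wait, in microseconds, of a bounded polling loop. */
static uint64_t worst_case_wait_us(uint32_t attempts, uint32_t delay_us)
{
	assert(attempts > 0);

	/* Widen before multiplying so large inputs cannot overflow 32 bits. */
	return (uint64_t)attempts * delay_us;
}
```

With a hypothetical budget of 100 attempts, a 5 us delay gives at most 500 us
of waiting, while a 5 ms delay gives at most 500000 us (0.5 s).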

[dpdk-dev] [PATCH v2 08/18] ixgbe: New function to read and write I2C bytes

2014-09-29 Thread Ouyang Changchun

This patch implements functions for I2C byte reads and writes in the
IXGBE base code; it also relocates the function ixgbe_mng_enabled.
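
Both new functions follow the same shape: request the shared bus, poll for the
grant with a timeout, do the transfer, and release the request on every path.
A minimal standalone sketch of that control flow is given here. The bus is
simulated, and fake_bus, bus_granted and transfer_with_bus are hypothetical
names rather than driver APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Simulated shared bus: the grant appears after grant_after polls. */
struct fake_bus {
	int grant_after;
	int request_line;   /* 1 while we are asking for the bus */
	int polls;          /* number of grant checks performed */
	int transfers;      /* number of completed transfers */
};

static int bus_granted(struct fake_bus *bus)
{
	bus->polls++;
	return bus->polls > bus->grant_after;
}

/* Returns 0 on success, -1 if the grant never arrived within timeout polls. */
static int transfer_with_bus(struct fake_bus *bus, int timeout)
{
	int status;

	assert(bus != NULL);

	bus->request_line = 1;              /* acquire: raise the request */

	while (timeout) {
		if (bus_granted(bus))
			break;
		/* A real driver would delay between polls here. */
		timeout--;
	}

	if (!timeout) {
		status = -1;                /* never granted */
		goto release;
	}

	bus->transfers++;                   /* the actual byte transfer */
	status = 0;

release:
	bus->request_line = 0;              /* always drop the request */
	return status;
}
```

The single release label keeps the request line from staying asserted after a
timeout, which would otherwise block the other user of the bus.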

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c  | 136 +-
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c |  27 ++
 2 files changed, 144 insertions(+), 19 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
index adf0e52..277cc25 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
@@ -53,25 +53,10 @@ STATIC s32 ixgbe_read_eeprom_82599(struct ixgbe_hw *hw,
   u16 offset, u16 *data);
 STATIC s32 ixgbe_read_eeprom_buffer_82599(struct ixgbe_hw *hw, u16 offset,
  u16 words, u16 *data);
-
-bool ixgbe_mng_enabled(struct ixgbe_hw *hw)
-{
-   u32 fwsm, manc, factps;
-
-   fwsm = IXGBE_READ_REG(hw, IXGBE_FWSM);
-   if ((fwsm & IXGBE_FWSM_MODE_MASK) != IXGBE_FWSM_FW_MODE_PT)
-   return false;
-
-   manc = IXGBE_READ_REG(hw, IXGBE_MANC);
-   if (!(manc & IXGBE_MANC_RCV_TCO_EN))
-   return false;
-
-   factps = IXGBE_READ_REG(hw, IXGBE_FACTPS);
-   if (factps & IXGBE_FACTPS_MNGCG)
-   return false;
-
-   return true;
-}
+STATIC s32 ixgbe_read_i2c_byte_82599(struct ixgbe_hw *hw, u8 byte_offset,
+   u8 dev_addr, u8 *data);
+STATIC s32 ixgbe_write_i2c_byte_82599(struct ixgbe_hw *hw, u8 byte_offset,
+   u8 dev_addr, u8 data);

 void ixgbe_init_mac_link_ops_82599(struct ixgbe_hw *hw)
 {
@@ -2583,4 +2568,117 @@ reset_pipeline_out:
 }


+/**
+ *  ixgbe_read_i2c_byte_82599 - Reads 8 bit word over I2C
+ *  @hw: pointer to hardware structure
+ *  @byte_offset: byte offset to read
+ *  @data: value read
+ *
+ *  Performs byte read operation to SFP module's EEPROM over I2C interface at
+ *  a specified device address.
+ **/
+STATIC s32 ixgbe_read_i2c_byte_82599(struct ixgbe_hw *hw, u8 byte_offset,
+   u8 dev_addr, u8 *data)
+{
+   u32 esdp;
+   s32 status;
+   s32 timeout = 200;
+
+   DEBUGFUNC("ixgbe_read_i2c_byte_82599");
+
+   if (hw->phy.qsfp_shared_i2c_bus == TRUE) {
+   /* Acquire I2C bus ownership. */
+   esdp = IXGBE_READ_REG(hw, IXGBE_ESDP);
+   esdp |= IXGBE_ESDP_SDP0;
+   IXGBE_WRITE_REG(hw, IXGBE_ESDP, esdp);
+   IXGBE_WRITE_FLUSH(hw);
+
+   while (timeout) {
+   esdp = IXGBE_READ_REG(hw, IXGBE_ESDP);
+   if (esdp & IXGBE_ESDP_SDP1)
+   break;
+
+   msec_delay(5);
+   timeout--;
+   }
+
+   if (!timeout) {
+   DEBUGOUT("Driver can't access resource,"
+" acquiring I2C bus timeout.\n");
+   status = IXGBE_ERR_I2C;
+   goto release_i2c_access;
+   }
+   }
+
+   status = ixgbe_read_i2c_byte_generic(hw, byte_offset, dev_addr, data);
+
+release_i2c_access:
+
+   if (hw->phy.qsfp_shared_i2c_bus == TRUE) {
+   /* Release I2C bus ownership. */
+   esdp = IXGBE_READ_REG(hw, IXGBE_ESDP);
+   esdp &= ~IXGBE_ESDP_SDP0;
+   IXGBE_WRITE_REG(hw, IXGBE_ESDP, esdp);
+   IXGBE_WRITE_FLUSH(hw);
+   }
+
+   return status;
+}
+
+/**
+ *  ixgbe_write_i2c_byte_82599 - Writes 8 bit word over I2C
+ *  @hw: pointer to hardware structure
+ *  @byte_offset: byte offset to write
+ *  @data: value to write
+ *
+ *  Performs byte write operation to SFP module's EEPROM over I2C interface at
+ *  a specified device address.
+ **/
+STATIC s32 ixgbe_write_i2c_byte_82599(struct ixgbe_hw *hw, u8 byte_offset,
+u8 dev_addr, u8 data)
+{
+   u32 esdp;
+   s32 status;
+   s32 timeout = 200;
+
+   DEBUGFUNC("ixgbe_write_i2c_byte_82599");
+
+   if (hw->phy.qsfp_shared_i2c_bus == TRUE) {
+   /* Acquire I2C bus ownership. */
+   esdp = IXGBE_READ_REG(hw, IXGBE_ESDP);
+   esdp |= IXGBE_ESDP_SDP0;
+   IXGBE_WRITE_REG(hw, IXGBE_ESDP, esdp);
+   IXGBE_WRITE_FLUSH(hw);
+
+   while (timeout) {
+   esdp = IXGBE_READ_REG(hw, IXGBE_ESDP);
+   if (esdp & IXGBE_ESDP_SDP1)
+   break;
+
+   msec_delay(5);
+   timeout--;
+   }
+
+   if (!timeout) {
+   DEBUGOUT("Driver can't access resource,"
+" acquiring I2C bus timeout.\n");
+   status = IXGBE_ERR_I2C;
+   goto release_i2c_access;
+   }
+   }

[dpdk-dev] [PATCH v2 11/18] ixgbe: New error type

2014-09-29 Thread Ouyang Changchun

This patch defines new error types in the IXGBE shared code; they are
used to report different kinds of errors.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_osdep.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_osdep.h 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_osdep.h
index ae9c280..ab13d64 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_osdep.h
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_osdep.h
@@ -82,6 +82,16 @@
 #define UNREFERENCED_3PARAMETER(_p, _q, _r) 
 #define UNREFERENCED_4PARAMETER(_p, _q, _r, _s) 

+/* Shared code error reporting */
+enum {
+   IXGBE_ERROR_SOFTWARE,
+   IXGBE_ERROR_POLLING,
+   IXGBE_ERROR_INVALID_STATE,
+   IXGBE_ERROR_UNSUPPORTED,
+   IXGBE_ERROR_ARGUMENT,
+   IXGBE_ERROR_CAUTION,
+};
+
 #define STATIC static
 #define IXGBE_NTOHL(_i)rte_be_to_cpu_32(_i)
 #define IXGBE_NTOHS(_i)rte_be_to_cpu_16(_i)
-- 
1.8.4.2
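
One way such error types might be consumed is to turn them into short labels
for log output. The sketch below copies the enum from the patch so that it
compiles on its own; the helper error_type_label and its label strings are
hypothetical and are not part of the patch or of the ixgbe code.

```c
#include <assert.h>
#include <stddef.h>

/* Copied from the patch so the sketch compiles on its own. */
enum {
	IXGBE_ERROR_SOFTWARE,
	IXGBE_ERROR_POLLING,
	IXGBE_ERROR_INVALID_STATE,
	IXGBE_ERROR_UNSUPPORTED,
	IXGBE_ERROR_ARGUMENT,
	IXGBE_ERROR_CAUTION,
};

/* Hypothetical helper: map an error type to a short label for logging. */
static const char *error_type_label(int type)
{
	static const char *const labels[] = {
		"software", "polling", "invalid state",
		"unsupported", "argument", "caution",
	};
	const size_t count = sizeof(labels) / sizeof(labels[0]);

	/* The table must cover every enumerator declared above. */
	assert(count == (size_t)IXGBE_ERROR_CAUTION + 1);

	if (type < 0 || (size_t)type >= count)
		return "unknown";
	return labels[type];
}
```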

[dpdk-dev] [PATCH v2 09/18] ixgbe: Support new device id 82599_QSFP and 82599_LS

2014-09-29 Thread Ouyang Changchun

This patch adds support for the new device IDs 82599_QSFP and 82599_LS in the
IXGBE base code.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c  |  79 +++--
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.c|   2 +
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c |   8 +-
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c| 481 --
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.h|  18 +-
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h   |  24 +-
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_vf.c |  15 +
 7 files changed, 541 insertions(+), 86 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
index 277cc25..3e442f7 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
@@ -111,9 +111,27 @@ s32 ixgbe_init_phy_ops_82599(struct ixgbe_hw *hw)
struct ixgbe_mac_info *mac = &hw->mac;
struct ixgbe_phy_info *phy = &hw->phy;
s32 ret_val = IXGBE_SUCCESS;
+   u32 esdp;

DEBUGFUNC("ixgbe_init_phy_ops_82599");

+   if (hw->device_id == IXGBE_DEV_ID_82599_QSFP_SF_QP) {
+   /* Store flag indicating I2C bus access control unit. */
+   hw->phy.qsfp_shared_i2c_bus = TRUE;
+
+   /* Initialize access to QSFP+ I2C bus */
+   esdp = IXGBE_READ_REG(hw, IXGBE_ESDP);
+   esdp |= IXGBE_ESDP_SDP0_DIR;
+   esdp &= ~IXGBE_ESDP_SDP1_DIR;
+   esdp &= ~IXGBE_ESDP_SDP0;
+   esdp &= ~IXGBE_ESDP_SDP0_NATIVE;
+   esdp &= ~IXGBE_ESDP_SDP1_NATIVE;
+   IXGBE_WRITE_REG(hw, IXGBE_ESDP, esdp);
+   IXGBE_WRITE_FLUSH(hw);
+
+   phy->ops.read_i2c_byte = &ixgbe_read_i2c_byte_82599;
+   phy->ops.write_i2c_byte = &ixgbe_write_i2c_byte_82599;
+   }
/* Identify the PHY or SFP module */
ret_val = phy->ops.identify(hw);
if (ret_val == IXGBE_ERR_SFP_NOT_SUPPORTED)
@@ -397,10 +415,8 @@ s32 ixgbe_get_link_capabilities_82599(struct ixgbe_hw *hw,
/* Check if 1G SFP module. */
if (hw->phy.sfp_type == ixgbe_sfp_type_1g_cu_core0 ||
hw->phy.sfp_type == ixgbe_sfp_type_1g_cu_core1 ||
-#ifdef SUPPORT_1000BASE_LX
hw->phy.sfp_type == ixgbe_sfp_type_1g_lx_core0 ||
hw->phy.sfp_type == ixgbe_sfp_type_1g_lx_core1 ||
-#endif
hw->phy.sfp_type == ixgbe_sfp_type_1g_sx_core0 ||
hw->phy.sfp_type == ixgbe_sfp_type_1g_sx_core1) {
*speed = IXGBE_LINK_SPEED_1GB_FULL;
@@ -477,7 +493,13 @@ s32 ixgbe_get_link_capabilities_82599(struct ixgbe_hw *hw,
*speed |= IXGBE_LINK_SPEED_10GB_FULL |
  IXGBE_LINK_SPEED_1GB_FULL;

-   *autoneg = true;
+   /* QSFP must not enable full auto-negotiation
+* Limited autoneg is enabled at 1G
+*/
+   if (hw->phy.media_type == ixgbe_media_type_fiber_qsfp)
+   *autoneg = false;
+   else
+   *autoneg = true;
}

 out:
@@ -530,6 +552,12 @@ enum ixgbe_media_type ixgbe_get_media_type_82599(struct 
ixgbe_hw *hw)
case IXGBE_DEV_ID_82599_T3_LOM:
media_type = ixgbe_media_type_copper;
break;
+   case IXGBE_DEV_ID_82599_LS:
+   media_type = ixgbe_media_type_fiber_lco;
+   break;
+   case IXGBE_DEV_ID_82599_QSFP_SF_QP:
+   media_type = ixgbe_media_type_fiber_qsfp;
+   break;
default:
media_type = ixgbe_media_type_unknown;
break;
@@ -755,6 +783,9 @@ s32 ixgbe_setup_mac_link_multispeed_fiber(struct ixgbe_hw 
*hw,
IXGBE_WRITE_REG(hw, IXGBE_ESDP, esdp_reg);
IXGBE_WRITE_FLUSH(hw);
break;
+   case ixgbe_media_type_fiber_qsfp:
+   /* QSFP module automatically detects MAC link speed */
+   break;
default:
DEBUGOUT("Unexpected media type.\n");
break;
@@ -813,6 +844,9 @@ s32 ixgbe_setup_mac_link_multispeed_fiber(struct ixgbe_hw 
*hw,
IXGBE_WRITE_REG(hw, IXGBE_ESDP, esdp_reg);
IXGBE_WRITE_FLUSH(hw);
break;
+   case ixgbe_media_type_fiber_qsfp:
+   /* QSFP module automatically detects link speed */
+   break;
default:
DEBUGOUT("Unexpected media type.\n");
break;
@@ -1052,7 +1086,7 @@ s32 ixgbe_setup_mac_link_82599(struct ixgbe_hw *hw,
if ((speed == IXGBE_LINK_SPEED_1GB_FULL) &&
(pma_pmd_1g == IXGBE_AUTOC_1G_SFI)) {
autoc &= ~IXGBE_AUTOC_LMS_MASK;
-   if (autoneg)
+   if (autoneg || hw->phy.type == ixgbe_phy

[dpdk-dev] [PATCH v2 13/18] ixgbe: semaphore mask move into hardware physical information

2014-09-29 Thread Ouyang Changchun

This patch stores lan_id and the PHY semaphore mask in the hardware PHY
information (hw->phy), and uses them when reading and writing PHY registers
in IXGBE base code.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c | 23 +++
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c
index 2e8fe93..f39df9a 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c
@@ -283,6 +283,15 @@ s32 ixgbe_identify_phy_generic(struct ixgbe_hw *hw)

DEBUGFUNC("ixgbe_identify_phy_generic");

+   if (!hw->phy.phy_semaphore_mask) {
+   hw->phy.lan_id = IXGBE_READ_REG(hw, IXGBE_STATUS) &
+   IXGBE_STATUS_LAN_ID_1;
+   if (hw->phy.lan_id)
+   hw->phy.phy_semaphore_mask = IXGBE_GSSR_PHY1_SM;
+   else
+   hw->phy.phy_semaphore_mask = IXGBE_GSSR_PHY0_SM;
+   }
+
if (hw->phy.type == ixgbe_phy_unknown) {
for (phy_addr = 0; phy_addr < IXGBE_MAX_PHY_ADDR; phy_addr++) {
if (ixgbe_validate_phy_addr(hw, phy_addr)) {
@@ -587,15 +596,10 @@ s32 ixgbe_read_phy_reg_generic(struct ixgbe_hw *hw, u32 
reg_addr,
   u32 device_type, u16 *phy_data)
 {
s32 status;
-   u16 gssr;
+   u32 gssr = hw->phy.phy_semaphore_mask;

DEBUGFUNC("ixgbe_read_phy_reg_generic");

-   if (IXGBE_READ_REG(hw, IXGBE_STATUS) & IXGBE_STATUS_LAN_ID_1)
-   gssr = IXGBE_GSSR_PHY1_SM;
-   else
-   gssr = IXGBE_GSSR_PHY0_SM;
-
if (hw->mac.ops.acquire_swfw_sync(hw, gssr) == IXGBE_SUCCESS) {
status = ixgbe_read_phy_reg_mdi(hw, reg_addr, device_type,
phy_data);
@@ -693,15 +697,10 @@ s32 ixgbe_write_phy_reg_generic(struct ixgbe_hw *hw, u32 
reg_addr,
u32 device_type, u16 phy_data)
 {
s32 status;
-   u16 gssr;
+   u32 gssr = hw->phy.phy_semaphore_mask;

DEBUGFUNC("ixgbe_write_phy_reg_generic");

-   if (IXGBE_READ_REG(hw, IXGBE_STATUS) & IXGBE_STATUS_LAN_ID_1)
-   gssr = IXGBE_GSSR_PHY1_SM;
-   else
-   gssr = IXGBE_GSSR_PHY0_SM;
-
if (hw->mac.ops.acquire_swfw_sync(hw, gssr) == IXGBE_SUCCESS) {
status = ixgbe_write_phy_reg_mdi(hw, reg_addr, device_type,
 phy_data);
-- 
1.8.4.2
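
The caching idea in this patch can be sketched outside the driver as follows. This is a minimal, standalone illustration, not the base code itself: the mock_* types, mock_read_status() and the numeric constants are hypothetical stand-ins for struct ixgbe_hw, IXGBE_READ_REG() and the IXGBE_* definitions.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the driver's definitions; values are illustrative. */
#define STATUS_LAN_ID_1  0x00000004u
#define GSSR_PHY0_SM     0x0002u
#define GSSR_PHY1_SM     0x0004u

struct mock_phy {
	uint32_t lan_id;
	uint32_t phy_semaphore_mask;   /* 0 means "not resolved yet" */
};

struct mock_hw {
	uint32_t status_reg;           /* simulated STATUS register contents */
	unsigned status_reads;         /* how many times STATUS was read */
	struct mock_phy phy;
};

/* Mock register read that counts accesses to STATUS. */
static uint32_t mock_read_status(struct mock_hw *hw)
{
	hw->status_reads++;
	return hw->status_reg;
}

/* Resolve the semaphore mask once; later calls reuse the cached value. */
static uint32_t mock_phy_semaphore(struct mock_hw *hw)
{
	if (!hw->phy.phy_semaphore_mask) {
		hw->phy.lan_id = mock_read_status(hw) & STATUS_LAN_ID_1;
		hw->phy.phy_semaphore_mask =
			hw->phy.lan_id ? GSSR_PHY1_SM : GSSR_PHY0_SM;
	}
	return hw->phy.phy_semaphore_mask;
}
```

With this shape, a PHY register read or write costs no extra STATUS access after the first lookup, which appears to be the point of moving the mask into hw->phy.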

[dpdk-dev] [PATCH v2 14/18] ixgbe: Remove unnecessary delay

2014-09-29 Thread Ouyang Changchun

This patch removes the unnecessary wait for auto-negotiation to complete
when setting up the PHY link in IXGBE shared code.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c | 53 --
 1 file changed, 6 insertions(+), 47 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c
index f39df9a..e1e560b 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c
@@ -713,16 +713,14 @@ s32 ixgbe_write_phy_reg_generic(struct ixgbe_hw *hw, u32 
reg_addr,
 }

 /**
- *  ixgbe_setup_phy_link_generic - Set and restart autoneg
+ *  ixgbe_setup_phy_link_generic - Set and restart auto-neg
  *  @hw: pointer to hardware structure
  *
- *  Restart autonegotiation and PHY and waits for completion.
+ *  Restart auto-negotiation and PHY and waits for completion.
  **/
 s32 ixgbe_setup_phy_link_generic(struct ixgbe_hw *hw)
 {
s32 status = IXGBE_SUCCESS;
-   u32 time_out;
-   u32 max_time_out = 10;
u16 autoneg_reg = IXGBE_MII_AUTONEG_REG;
bool autoneg = false;
ixgbe_link_speed speed;
@@ -783,7 +781,7 @@ s32 ixgbe_setup_phy_link_generic(struct ixgbe_hw *hw)
if (ixgbe_check_reset_blocked(hw))
return status;

-   /* Restart PHY autonegotiation and wait for completion */
+   /* Restart PHY auto-negotiation. */
hw->phy.ops.read_reg(hw, IXGBE_MDIO_AUTO_NEG_CONTROL,
 IXGBE_MDIO_AUTO_NEG_DEV_TYPE, &autoneg_reg);

@@ -792,25 +790,6 @@ s32 ixgbe_setup_phy_link_generic(struct ixgbe_hw *hw)
hw->phy.ops.write_reg(hw, IXGBE_MDIO_AUTO_NEG_CONTROL,
  IXGBE_MDIO_AUTO_NEG_DEV_TYPE, autoneg_reg);

-   /* Wait for autonegotiation to finish */
-   for (time_out = 0; time_out < max_time_out; time_out++) {
-   usec_delay(10);
-   /* Restart PHY autonegotiation and wait for completion */
-   status = hw->phy.ops.read_reg(hw, IXGBE_MDIO_AUTO_NEG_STATUS,
- IXGBE_MDIO_AUTO_NEG_DEV_TYPE,
- &autoneg_reg);
-
-   autoneg_reg &= IXGBE_MII_AUTONEG_COMPLETE;
-   if (autoneg_reg == IXGBE_MII_AUTONEG_COMPLETE)
-   break;
-   }
-
-   if (time_out == max_time_out) {
-   status = IXGBE_ERR_LINK_SETUP;
-   ERROR_REPORT1(IXGBE_ERROR_POLLING,
-"PHY autonegotiation time out");
-   }
-
return status;
 }

@@ -934,16 +913,14 @@ s32 ixgbe_check_phy_link_tnx(struct ixgbe_hw *hw, 
ixgbe_link_speed *speed,
 }

 /**
- * ixgbe_setup_phy_link_tnx - Set and restart autoneg
+ * ixgbe_setup_phy_link_tnx - Set and restart auto-neg
  * @hw: pointer to hardware structure
  *
- * Restart autonegotiation and PHY and waits for completion.
+ * Restart auto-negotiation and PHY and waits for completion.
  **/
 s32 ixgbe_setup_phy_link_tnx(struct ixgbe_hw *hw)
 {
s32 status = IXGBE_SUCCESS;
-   u32 time_out;
-   u32 max_time_out = 10;
u16 autoneg_reg = IXGBE_MII_AUTONEG_REG;
bool autoneg = false;
ixgbe_link_speed speed;
@@ -1001,7 +978,7 @@ s32 ixgbe_setup_phy_link_tnx(struct ixgbe_hw *hw)
if (ixgbe_check_reset_blocked(hw))
return status;

-   /* Restart PHY autonegotiation and wait for completion */
+   /* Restart PHY auto-negotiation. */
hw->phy.ops.read_reg(hw, IXGBE_MDIO_AUTO_NEG_CONTROL,
 IXGBE_MDIO_AUTO_NEG_DEV_TYPE, &autoneg_reg);

@@ -1010,24 +987,6 @@ s32 ixgbe_setup_phy_link_tnx(struct ixgbe_hw *hw)
hw->phy.ops.write_reg(hw, IXGBE_MDIO_AUTO_NEG_CONTROL,
  IXGBE_MDIO_AUTO_NEG_DEV_TYPE, autoneg_reg);

-   /* Wait for autonegotiation to finish */
-   for (time_out = 0; time_out < max_time_out; time_out++) {
-   usec_delay(10);
-   /* Restart PHY autonegotiation and wait for completion */
-   status = hw->phy.ops.read_reg(hw, IXGBE_MDIO_AUTO_NEG_STATUS,
- IXGBE_MDIO_AUTO_NEG_DEV_TYPE,
- &autoneg_reg);
-
-   autoneg_reg &= IXGBE_MII_AUTONEG_COMPLETE;
-   if (autoneg_reg == IXGBE_MII_AUTONEG_COMPLETE)
-   break;
-   }
-
-   if (time_out == max_time_out) {
-   status = IXGBE_ERR_LINK_SETUP;
-   DEBUGOUT("ixgbe_setup_phy_link_tnx: time out");
-   }
-
return status;
 }

-- 
1.8.4.2
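
Since the setup functions no longer wait for completion, a caller that still needs to know when auto-negotiation has finished would have to poll for it itself. A minimal sketch of such a bounded poll is shown below. Everything in it is an assumption for illustration: the AUTONEG_COMPLETE bit value, the read_status_fn callback and the fake_phy reader are hypothetical and are not part of the IXGBE base code.

```c
#include <stdbool.h>
#include <stdint.h>

#define AUTONEG_COMPLETE 0x0020u   /* hypothetical "autoneg complete" status bit */

/* Callback that returns the current auto-negotiation status word. */
typedef uint16_t (*read_status_fn)(void *ctx);

/* Poll up to max_tries times; return true as soon as completion is seen. */
static bool poll_autoneg_complete(read_status_fn read_status, void *ctx,
				  unsigned max_tries)
{
	for (unsigned i = 0; i < max_tries; i++) {
		if (read_status(ctx) & AUTONEG_COMPLETE)
			return true;
		/* a real driver would insert a short delay here */
	}
	return false;
}

/* Example reader: reports completion starting with the Nth read. */
struct fake_phy {
	unsigned reads;
	unsigned ready_after;
};

static uint16_t fake_read_status(void *ctx)
{
	struct fake_phy *p = ctx;

	return (++p->reads >= p->ready_after) ? AUTONEG_COMPLETE : 0;
}
```

The loop has the same structure as the code removed by the patch, only moved to the caller, so the cost of waiting is paid only where it is actually wanted.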

[dpdk-dev] [PATCH v2 05/18] ixgbe: eeprom checksum calculation return new value in IXGBE base code

2014-09-29 Thread Ouyang Changchun

This patch refines the EEPROM checksum calculation in IXGBE base code so
that it returns either a negative error code on failure or the 16-bit
checksum.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c |  96 ++--
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.h |   2 +-
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h   |   2 +-
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_x540.c   | 118 +++---
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_x540.h   |   2 +-
 5 files changed, 123 insertions(+), 97 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
index e36b3a8..4bd004c 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c
@@ -2096,8 +2096,10 @@ STATIC void ixgbe_release_eeprom(struct ixgbe_hw *hw)
 /**
  *  ixgbe_calc_eeprom_checksum_generic - Calculates and returns the checksum
  *  @hw: pointer to hardware structure
+ *
+ *  Returns a negative error code on error, or the 16-bit checksum
  **/
-u16 ixgbe_calc_eeprom_checksum_generic(struct ixgbe_hw *hw)
+s32 ixgbe_calc_eeprom_checksum_generic(struct ixgbe_hw *hw)
 {
u16 i;
u16 j;
@@ -2110,33 +2112,44 @@ u16 ixgbe_calc_eeprom_checksum_generic(struct ixgbe_hw 
*hw)

/* Include 0x0-0x3F in the checksum */
for (i = 0; i < IXGBE_EEPROM_CHECKSUM; i++) {
-   if (hw->eeprom.ops.read(hw, i, &word) != IXGBE_SUCCESS) {
+   if (hw->eeprom.ops.read(hw, i, &word)) {
DEBUGOUT("EEPROM read failed\n");
-   break;
+   return IXGBE_ERR_EEPROM;
}
checksum += word;
}

/* Include all data from pointers except for the fw pointer */
for (i = IXGBE_PCIE_ANALOG_PTR; i < IXGBE_FW_PTR; i++) {
-   hw->eeprom.ops.read(hw, i, &pointer);
+   if (hw->eeprom.ops.read(hw, i, &pointer)) {
+   DEBUGOUT("EEPROM read failed\n");
+   return IXGBE_ERR_EEPROM;
+   }

-   /* Make sure the pointer seems valid */
-   if (pointer != 0xFFFF && pointer != 0) {
-   hw->eeprom.ops.read(hw, pointer, &length);
+   /* If the pointer seems invalid */
+   if (pointer == 0xFFFF || pointer == 0)
+   continue;
+
+   if (hw->eeprom.ops.read(hw, pointer, &length)) {
+   DEBUGOUT("EEPROM read failed\n");
+   return IXGBE_ERR_EEPROM;
+   }

-   if (length != 0xFFFF && length != 0) {
-   for (j = pointer+1; j <= pointer+length; j++) {
-   hw->eeprom.ops.read(hw, j, &word);
-   checksum += word;
-   }
+   if (length == 0xFFFF || length == 0)
+   continue;
+
+   for (j = pointer + 1; j <= pointer + length; j++) {
+   if (hw->eeprom.ops.read(hw, j, &word)) {
+   DEBUGOUT("EEPROM read failed\n");
+   return IXGBE_ERR_EEPROM;
}
+   checksum += word;
}
}

checksum = (u16)IXGBE_EEPROM_SUM - checksum;

-   return checksum;
+   return (s32)checksum;
 }

 /**
@@ -2156,32 +2169,38 @@ s32 ixgbe_validate_eeprom_checksum_generic(struct 
ixgbe_hw *hw,

DEBUGFUNC("ixgbe_validate_eeprom_checksum_generic");

-   /*
-* Read the first word from the EEPROM. If this times out or fails, do
+   /* Read the first word from the EEPROM. If this times out or fails, do
 * not continue or we could be in for a very long wait while every
 * EEPROM read fails
 */
status = hw->eeprom.ops.read(hw, 0, &checksum);
+   if (status) {
+   DEBUGOUT("EEPROM read failed\n");
+   return status;
+   }

-   if (status == IXGBE_SUCCESS) {
-   checksum = hw->eeprom.ops.calc_checksum(hw);
-
-   hw->eeprom.ops.read(hw, IXGBE_EEPROM_CHECKSUM, &read_checksum);
+   status = hw->eeprom.ops.calc_checksum(hw);
+   if (status < 0)
+   return status;

-   /*
-* Verify read checksum from EEPROM is the same as
-* calculated checksum
-*/
-   if (read_checksum != checksum)
-   status = IXGBE_ERR_EEPROM_CHECKSUM;
+   checksum = (u16)(status & 0xffff);

-   /* If the user cares, return the calculated checksum */
-   if (checksum_val)
-   *checksum_val = checksum;
-   } else {
+   status = hw->eeprom.ops.read(hw, IXGBE_EEPROM_CHECKSUM, &read_checksum);
+   if (status) {
DEBUGOUT("EEPROM read f
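
The return convention introduced here (negative value on error, otherwise the 16-bit checksum carried in a signed 32-bit result) can be sketched in isolation as below. This is only an illustration under stated assumptions: ERR_EEPROM, EEPROM_SUM, the eeprom_read_fn callback and the fake_eeprom backing store are hypothetical, and the real function also walks pointer sections, which this sketch omits.

```c
#include <stdint.h>

#define ERR_EEPROM  (-1)      /* hypothetical negative error code */
#define EEPROM_SUM  0xBABA    /* illustrative target sum */

/* Read one 16-bit word; returns 0 on success, non-zero on failure. */
typedef int (*eeprom_read_fn)(void *ctx, uint16_t offset, uint16_t *word);

/* Returns a negative error code, or the 16-bit checksum in 0..0xFFFF. */
static int32_t calc_checksum(eeprom_read_fn read, void *ctx, uint16_t nwords)
{
	uint16_t checksum = 0;
	uint16_t word;

	for (uint16_t i = 0; i < nwords; i++) {
		if (read(ctx, i, &word))
			return ERR_EEPROM;      /* propagate instead of ignoring */
		checksum += word;
	}
	checksum = (uint16_t)(EEPROM_SUM - checksum);
	return (int32_t)checksum;               /* never negative */
}

/* Caller side: split the combined result back into status and value. */
static int get_checksum(eeprom_read_fn read, void *ctx, uint16_t nwords,
			uint16_t *out)
{
	int32_t status = calc_checksum(read, ctx, nwords);

	if (status < 0)
		return status;
	*out = (uint16_t)(status & 0xFFFF);
	return 0;
}

/* Example backing store that can be told to fail at one offset. */
struct fake_eeprom {
	const uint16_t *words;
	int fail_at;                            /* -1 means never fail */
};

static int fake_read(void *ctx, uint16_t offset, uint16_t *word)
{
	struct fake_eeprom *e = ctx;

	if (e->fail_at >= 0 && offset == (uint16_t)e->fail_at)
		return -1;
	*word = e->words[offset];
	return 0;
}
```

Because a valid checksum is at most 0xFFFF, it can never collide with a negative error code once widened to 32 bits, which is what makes the single return value unambiguous.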

[dpdk-dev] [PATCH v2 15/18] ixgbe: New function for resetting VF register

2014-09-29 Thread Ouyang Changchun

This patch implements a function to reset the VF registers to their initial
values in IXGBE base code.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_vf.c | 46 +++
 1 file changed, 46 insertions(+)

diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_vf.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_vf.c
index a2d6e61..e6b6c51 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_vf.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_vf.c
@@ -89,6 +89,49 @@ s32 ixgbe_init_ops_vf(struct ixgbe_hw *hw)
return IXGBE_SUCCESS;
 }

+/* ixgbe_virt_clr_reg - Set register to default (power on) state.
+ *  @hw: pointer to hardware structure
+ */
+static void ixgbe_virt_clr_reg(struct ixgbe_hw *hw)
+{
+   int i;
+   u32 vfsrrctl;
+   u32 vfdca_rxctrl;
+   u32 vfdca_txctrl;
+
+   /* VRSRRCTL default values (BSIZEPACKET = 2048, BSIZEHEADER = 256) */
+   vfsrrctl = 0x100 << IXGBE_SRRCTL_BSIZEHDRSIZE_SHIFT;
+   vfsrrctl |= 0x800 >> IXGBE_SRRCTL_BSIZEPKT_SHIFT;
+
+   /* DCA_RXCTRL default value */
+   vfdca_rxctrl = IXGBE_DCA_RXCTRL_DESC_RRO_EN |
+  IXGBE_DCA_RXCTRL_DATA_WRO_EN |
+  IXGBE_DCA_RXCTRL_HEAD_WRO_EN;
+
+   /* DCA_TXCTRL default value */
+   vfdca_txctrl = IXGBE_DCA_TXCTRL_DESC_RRO_EN |
+  IXGBE_DCA_TXCTRL_DESC_WRO_EN |
+  IXGBE_DCA_TXCTRL_DATA_RRO_EN;
+
+   IXGBE_WRITE_REG(hw, IXGBE_VFPSRTYPE, 0);
+
+   for (i = 0; i < 7; i++) {
+   IXGBE_WRITE_REG(hw, IXGBE_VFRDH(i), 0);
+   IXGBE_WRITE_REG(hw, IXGBE_VFRDT(i), 0);
+   IXGBE_WRITE_REG(hw, IXGBE_VFRXDCTL(i), 0);
+   IXGBE_WRITE_REG(hw, IXGBE_VFSRRCTL(i), vfsrrctl);
+   IXGBE_WRITE_REG(hw, IXGBE_VFTDH(i), 0);
+   IXGBE_WRITE_REG(hw, IXGBE_VFTDT(i), 0);
+   IXGBE_WRITE_REG(hw, IXGBE_VFTXDCTL(i), 0);
+   IXGBE_WRITE_REG(hw, IXGBE_VFTDWBAH(i), 0);
+   IXGBE_WRITE_REG(hw, IXGBE_VFTDWBAL(i), 0);
+   IXGBE_WRITE_REG(hw, IXGBE_VFDCA_RXCTRL(i), vfdca_rxctrl);
+   IXGBE_WRITE_REG(hw, IXGBE_VFDCA_TXCTRL(i), vfdca_txctrl);
+   }
+
+   IXGBE_WRITE_FLUSH(hw);
+}
+
 /**
  *  ixgbe_start_hw_vf - Prepare hardware for Tx/Rx
  *  @hw: pointer to hardware structure
@@ -161,6 +204,9 @@ s32 ixgbe_reset_hw_vf(struct ixgbe_hw *hw)
if (!timeout)
return IXGBE_ERR_RESET_FAILED;

+   /* Reset VF registers to initial values */
+   ixgbe_virt_clr_reg(hw);
+
/* mailbox timeout can now become active */
mbx->timeout = IXGBE_VF_MBX_INIT_TIMEOUT;

-- 
1.8.4.2
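
The SRRCTL default computed in ixgbe_virt_clr_reg() combines a header buffer size and a packet buffer size into one register value. The arithmetic can be checked with the small sketch below. The two shift values are assumptions (they are believed to match ixgbe_type.h, but are not shown in this patch), and srrctl_default() is a hypothetical helper name.

```c
#include <stdint.h>

/* Assumed shift values; not taken from this patch. */
#define SRRCTL_BSIZEPKT_SHIFT      10  /* packet buffer size in 1 KB units */
#define SRRCTL_BSIZEHDRSIZE_SHIFT  2   /* header size field placement */

/* Build the register value from byte sizes, as the patch does with
 * 0x100 (256-byte header) and 0x800 (2048-byte packet buffer). */
static uint32_t srrctl_default(uint32_t hdr_bytes, uint32_t pkt_bytes)
{
	uint32_t v = hdr_bytes << SRRCTL_BSIZEHDRSIZE_SHIFT;

	v |= pkt_bytes >> SRRCTL_BSIZEPKT_SHIFT;
	return v;
}
```

Under these assumptions, srrctl_default(0x100, 0x800) evaluates to 0x402: the header size lands at 0x400 and the packet size field holds 2, that is 2 KB.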

[dpdk-dev] [PATCH v2 16/18] ixgbe: New functionalities in IXGBE base code

2014-09-29 Thread Ouyang Changchun

This patch adds support for the following functionalities in IXGBE base code:
thermal sensor, DMA coalescing, EEE, source address pruning, anti-spoofing,
IOSF buffer reading and writing, and malicious driver detection.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c  |   4 +
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.c| 181 ++
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.h|  18 +++
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c | 169 
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.h |  12 ++
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h   | 152 -
 6 files changed, 531 insertions(+), 5 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
index 3e442f7..2b74374 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
@@ -388,6 +388,10 @@ s32 ixgbe_init_ops_82599(struct ixgbe_hw *hw)
/* Manageability interface */
mac->ops.set_fw_drv_ver = &ixgbe_set_fw_drv_ver_generic;

+   mac->ops.get_thermal_sensor_data =
+&ixgbe_get_thermal_sensor_data_generic;
+   mac->ops.init_thermal_sensor_thresh =
+ &ixgbe_init_thermal_sensor_thresh_generic;

mac->ops.get_rtrup2tc = &ixgbe_dcb_get_rtrup2tc_generic;

diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.c
index 8ed4b75..b3e89c5 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.c
@@ -1018,7 +1018,188 @@ s32 ixgbe_set_fw_drv_ver(struct ixgbe_hw *hw, u8 maj, 
u8 min, u8 build,
 }


+/**
+ *  ixgbe_get_thermal_sensor_data - Gathers thermal sensor data
+ *  @hw: pointer to hardware structure
+ *
+ *  Updates the temperatures in mac.thermal_sensor_data
+ **/
+s32 ixgbe_get_thermal_sensor_data(struct ixgbe_hw *hw)
+{
+   return ixgbe_call_func(hw, hw->mac.ops.get_thermal_sensor_data, (hw),
+   IXGBE_NOT_IMPLEMENTED);
+}
+
+/**
+ *  ixgbe_init_thermal_sensor_thresh - Inits thermal sensor thresholds
+ *  @hw: pointer to hardware structure
+ *
+ *  Inits the thermal sensor thresholds according to the NVM map
+ **/
+s32 ixgbe_init_thermal_sensor_thresh(struct ixgbe_hw *hw)
+{
+   return ixgbe_call_func(hw, hw->mac.ops.init_thermal_sensor_thresh, (hw),
+   IXGBE_NOT_IMPLEMENTED);
+}
+
+/**
+ *  ixgbe_dmac_config - Configure DMA Coalescing registers.
+ *  @hw: pointer to hardware structure
+ *
+ *  Configure DMA coalescing. If enabling dmac, dmac is activated.
+ *  When disabling dmac, dmac enable dmac bit is cleared.
+ **/
+s32 ixgbe_dmac_config(struct ixgbe_hw *hw)
+{
+   return ixgbe_call_func(hw, hw->mac.ops.dmac_config, (hw),
+   IXGBE_NOT_IMPLEMENTED);
+}
+
+/**
+ *  ixgbe_dmac_update_tcs - Configure DMA Coalescing registers.
+ *  @hw: pointer to hardware structure
+ *
+ *  Disables dmac, updates per TC settings, and then enable dmac.
+ **/
+s32 ixgbe_dmac_update_tcs(struct ixgbe_hw *hw)
+{
+   return ixgbe_call_func(hw, hw->mac.ops.dmac_update_tcs, (hw),
+   IXGBE_NOT_IMPLEMENTED);
+}
+
+/**
+ *  ixgbe_dmac_config_tcs - Configure DMA Coalescing registers.
+ *  @hw: pointer to hardware structure
+ *
+ *  Configure DMA coalescing threshold per TC and set high priority bit for
+ *  FCOE TC. The dmac enable bit must be cleared before configuring.
+ **/
+s32 ixgbe_dmac_config_tcs(struct ixgbe_hw *hw)
+{
+   return ixgbe_call_func(hw, hw->mac.ops.dmac_config_tcs, (hw),
+   IXGBE_NOT_IMPLEMENTED);
+}
+
+/**
+ *  ixgbe_setup_eee - Enable/disable EEE support
+ *  @hw: pointer to the HW structure
+ *  @enable_eee: boolean flag to enable EEE
+ *
+ *  Enable/disable EEE based on enable_ee flag.
+ *  Auto-negotiation must be started after BASE-T EEE bits in PHY register 7.3C
+ *  are modified.
+ *
+ **/
+s32 ixgbe_setup_eee(struct ixgbe_hw *hw, bool enable_eee)
+{
+   return ixgbe_call_func(hw, hw->mac.ops.setup_eee, (hw, enable_eee),
+   IXGBE_NOT_IMPLEMENTED);
+}
+
+/**
+ * ixgbe_set_source_address_pruning - Enable/Disable source address pruning
+ * @hw: pointer to hardware structure
+ * @enbale: enable or disable source address pruning
+ * @pool: Rx pool - Rx pool to toggle source address pruning
+ **/
+void ixgbe_set_source_address_pruning(struct ixgbe_hw *hw, bool enable,
+ unsigned int pool)
+{
+   if (hw->mac.ops.set_source_address_pruning)
+   hw->mac.ops.set_source_address_pruning(hw, enable, pool);
+}
+
+/**
+ *  ixgbe_set_ethertype_anti_spoofing - Enable/Disable Ethertype anti-spoofing
+ *  @hw: pointer to hardware structure
+ *  @enable: enable or disable switch for Ethertype anti-spoofing
+ *  @vf: Virtual Function pool - VF Pool to set for Ethertype anti-spoofing
+ *
+ **/
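
Most of the wrappers added to ixgbe_api.c follow one dispatch pattern: call the MAC-specific operation if it is set, otherwise return a "not implemented" code. A standalone sketch of that pattern is given below. The CALL_OP macro, the mock_* types and the NOT_IMPLEMENTED value are hypothetical stand-ins written for this illustration; they are not copied from the base code.

```c
#include <stddef.h>
#include <stdint.h>

#define NOT_IMPLEMENTED 0x7FFFFFFF   /* hypothetical sentinel return code */

struct mock_hw;

struct mock_ops {
	int32_t (*get_thermal_sensor_data)(struct mock_hw *hw);
};

struct mock_hw {
	struct mock_ops ops;
	int temperature;
};

/* Call func with the parenthesised params if it is set, else yield error. */
#define CALL_OP(func, params, error) \
	(((func) != NULL) ? (func) params : (error))

/* API-level wrapper: safe to call even when the MAC has no such op. */
static int32_t mock_get_thermal_sensor_data(struct mock_hw *hw)
{
	return CALL_OP(hw->ops.get_thermal_sensor_data, (hw), NOT_IMPLEMENTED);
}

/* Example MAC-specific implementation that a per-MAC init would install. */
static int32_t thermal_generic(struct mock_hw *hw)
{
	hw->temperature = 42;
	return 0;
}
```

The benefit of this layout is that callers never test function pointers themselves; a MAC type that lacks a feature simply leaves the pointer NULL.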

[dpdk-dev] [PATCH v2 18/18] ixgbe: Support X550 in IXGBE poll mode driver

2014-09-29 Thread Ouyang Changchun

This patch updates the device IDs and the PF driver in the IXGBE PMD to support X550.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_eal/common/include/rte_pci_dev_ids.h | 14 ++
 lib/librte_ether/rte_ethdev.h   |  2 +-
 lib/librte_pmd_ixgbe/ixgbe_bypass_api.h |  9 +++
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 11 +---
 lib/librte_pmd_ixgbe/ixgbe_fdir.c   | 35 -
 lib/librte_pmd_ixgbe/ixgbe_pf.c |  6 +++--
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c   |  6 +
 7 files changed, 70 insertions(+), 13 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_pci_dev_ids.h 
b/lib/librte_eal/common/include/rte_pci_dev_ids.h
index 978b0ed..dea620f 100644
--- a/lib/librte_eal/common/include/rte_pci_dev_ids.h
+++ b/lib/librte_eal/common/include/rte_pci_dev_ids.h
@@ -391,6 +391,9 @@ RTE_PCI_DEV_ID_DECL_IGB(PCI_VENDOR_ID_INTEL, 
E1000_DEV_ID_DH89XXCC_SFP)
 #define IXGBE_DEV_ID_82599_T3_LOM   0x151C
 #define IXGBE_DEV_ID_X540T  0x1528
 #define IXGBE_DEV_ID_X540T1 0x1560
+#define IXGBE_DEV_ID_X550T  0x1563
+#define IXGBE_DEV_ID_X550EM_X_KX4   0x15AA
+#define IXGBE_DEV_ID_X550EM_X_KR0x15AB

 #ifdef RTE_NIC_BYPASS
 #define IXGBE_DEV_ID_82599_BYPASS   0x155D
@@ -433,6 +436,9 @@ RTE_PCI_DEV_ID_DECL_IXGBE(PCI_VENDOR_ID_INTEL, 
IXGBE_DEV_ID_82599_XAUI_LOM)
 RTE_PCI_DEV_ID_DECL_IXGBE(PCI_VENDOR_ID_INTEL, IXGBE_DEV_ID_82599_T3_LOM)
 RTE_PCI_DEV_ID_DECL_IXGBE(PCI_VENDOR_ID_INTEL, IXGBE_DEV_ID_X540T)
 RTE_PCI_DEV_ID_DECL_IXGBE(PCI_VENDOR_ID_INTEL, IXGBE_DEV_ID_X540T1)
+RTE_PCI_DEV_ID_DECL_IXGBE(PCI_VENDOR_ID_INTEL, IXGBE_DEV_ID_X550T)
+RTE_PCI_DEV_ID_DECL_IXGBE(PCI_VENDOR_ID_INTEL, IXGBE_DEV_ID_X550EM_X_KX4)
+RTE_PCI_DEV_ID_DECL_IXGBE(PCI_VENDOR_ID_INTEL, IXGBE_DEV_ID_X550EM_X_KR)

 #ifdef RTE_NIC_BYPASS
 RTE_PCI_DEV_ID_DECL_IXGBE(PCI_VENDOR_ID_INTEL, IXGBE_DEV_ID_82599_BYPASS)
@@ -480,11 +486,19 @@ RTE_PCI_DEV_ID_DECL_IGBVF(PCI_VENDOR_ID_INTEL, 
E1000_DEV_ID_I350_VF_HV)
 #define IXGBE_DEV_ID_82599_VF_HV0x152E
 #define IXGBE_DEV_ID_X540_VF0x1515
 #define IXGBE_DEV_ID_X540_VF_HV 0x1530
+#define IXGBE_DEV_ID_X550_VF_HV 0x1564
+#define IXGBE_DEV_ID_X550_VF0x1565
+#define IXGBE_DEV_ID_X550EM_X_VF0x15A8
+#define IXGBE_DEV_ID_X550EM_X_VF_HV 0x15A9

 RTE_PCI_DEV_ID_DECL_IXGBEVF(PCI_VENDOR_ID_INTEL, IXGBE_DEV_ID_82599_VF)
 RTE_PCI_DEV_ID_DECL_IXGBEVF(PCI_VENDOR_ID_INTEL, IXGBE_DEV_ID_82599_VF_HV)
 RTE_PCI_DEV_ID_DECL_IXGBEVF(PCI_VENDOR_ID_INTEL, IXGBE_DEV_ID_X540_VF)
 RTE_PCI_DEV_ID_DECL_IXGBEVF(PCI_VENDOR_ID_INTEL, IXGBE_DEV_ID_X540_VF_HV)
+RTE_PCI_DEV_ID_DECL_IXGBEVF(PCI_VENDOR_ID_INTEL, IXGBE_DEV_ID_X550_VF_HV)
+RTE_PCI_DEV_ID_DECL_IXGBEVF(PCI_VENDOR_ID_INTEL, IXGBE_DEV_ID_X550_VF)
+RTE_PCI_DEV_ID_DECL_IXGBEVF(PCI_VENDOR_ID_INTEL, IXGBE_DEV_ID_X550EM_X_VF)
+RTE_PCI_DEV_ID_DECL_IXGBEVF(PCI_VENDOR_ID_INTEL, IXGBE_DEV_ID_X550EM_X_VF_HV)

 /** Virtual I40E devices from i40e_type.h /

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 60b24c5..1539c49 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -2175,7 +2175,7 @@ extern int rte_eth_dev_vlan_filter(uint8_t port_id, 
uint16_t vlan_id , int on);

 /**
  * Enable/Disable hardware VLAN Strip by a rx queue of an Ethernet device.
- * 82599/X540 can support VLAN stripping at the rx queue level
+ * 82599/X540/X550 can support VLAN stripping at the rx queue level
  *
  * @param port_id
  *   The port identifier of the Ethernet device.
diff --git a/lib/librte_pmd_ixgbe/ixgbe_bypass_api.h 
b/lib/librte_pmd_ixgbe/ixgbe_bypass_api.h
index 6af370a..b4a7386 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_bypass_api.h
+++ b/lib/librte_pmd_ixgbe/ixgbe_bypass_api.h
@@ -76,6 +76,15 @@ static s32 ixgbe_bypass_rw_generic(struct ixgbe_hw *hw, u32 
cmd, u32 *status)
dir_sdi = IXGBE_ESDP_SDP0_DIR;
dir_sdo = IXGBE_ESDP_SDP1_DIR;
break;
+   case ixgbe_mac_X550:
+   case ixgbe_mac_X550EM_x:
+   sck = IXGBE_ESDP_SDP2;
+   sdi = IXGBE_ESDP_SDP0;
+   sdo = IXGBE_ESDP_SDP1;
+   dir_sck = IXGBE_ESDP_SDP2_DIR;
+   dir_sdi = IXGBE_ESDP_SDP0_DIR;
+   dir_sdo = IXGBE_ESDP_SDP1_DIR;
+   break;
default:
return IXGBE_ERR_DEVICE_NOT_SUPPORTED;
}
diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c 
b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
index f4b590b..c835c67 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
@@ -544,7 +544,10 @@ ixgbe_dev_queue_stats_mapping_set(struct rte_eth_dev 
*eth_dev,
uint32_t q_map;
uint8_t n, offset;

-   if ((hw->mac.type != ixgbe_mac_82599EB) && (hw->mac.type != 
i

[dpdk-dev] [PATCH v2 12/18] ixgbe: Use hardware MAC type for I2C control

2014-09-29 Thread Ouyang Changchun

This patch uses hardware MAC type to determine I2C control, clock
in/out, and data in/out in IXGBE base code.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c  | 58 -
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h | 12 ---
 2 files changed, 37 insertions(+), 33 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c
index 462e884..2e8fe93 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c
@@ -2075,7 +2075,7 @@ write_byte_out:
  **/
 STATIC void ixgbe_i2c_start(struct ixgbe_hw *hw)
 {
-   u32 i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL);
+   u32 i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL_BY_MAC(hw));

DEBUGFUNC("ixgbe_i2c_start");

@@ -2106,7 +2106,7 @@ STATIC void ixgbe_i2c_start(struct ixgbe_hw *hw)
  **/
 STATIC void ixgbe_i2c_stop(struct ixgbe_hw *hw)
 {
-   u32 i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL);
+   u32 i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL_BY_MAC(hw));

DEBUGFUNC("ixgbe_i2c_stop");

@@ -2170,9 +2170,9 @@ STATIC s32 ixgbe_clock_out_i2c_byte(struct ixgbe_hw *hw, 
u8 data)
}

/* Release SDA line (set high) */
-   i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL);
-   i2cctl |= IXGBE_I2C_DATA_OUT;
-   IXGBE_WRITE_REG(hw, IXGBE_I2CCTL, i2cctl);
+   i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL_BY_MAC(hw));
+   i2cctl |= IXGBE_I2C_DATA_OUT_BY_MAC(hw);
+   IXGBE_WRITE_REG(hw, IXGBE_I2CCTL_BY_MAC(hw), i2cctl);
IXGBE_WRITE_FLUSH(hw);

return status;
@@ -2188,7 +2188,7 @@ STATIC s32 ixgbe_get_i2c_ack(struct ixgbe_hw *hw)
 {
s32 status = IXGBE_SUCCESS;
u32 i = 0;
-   u32 i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL);
+   u32 i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL_BY_MAC(hw));
u32 timeout = 10;
bool ack = 1;

@@ -2203,17 +2203,16 @@ STATIC s32 ixgbe_get_i2c_ack(struct ixgbe_hw *hw)
/* Poll for ACK.  Note that ACK in I2C spec is
 * transition from 1 to 0 */
for (i = 0; i < timeout; i++) {
-   i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL);
-   ack = ixgbe_get_i2c_data(&i2cctl);
+   i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL_BY_MAC(hw));
+   ack = ixgbe_get_i2c_data(hw, &i2cctl);

usec_delay(1);
if (!ack)
break;
}

-   if (ack == 1) {
-   ERROR_REPORT1(IXGBE_ERROR_POLLING,
-"I2C ack was not received.\n");
+   if (ack) {
+   DEBUGOUT("I2C ack was not received.\n");
status = IXGBE_ERR_I2C;
}

@@ -2234,7 +2233,7 @@ STATIC s32 ixgbe_get_i2c_ack(struct ixgbe_hw *hw)
  **/
 STATIC s32 ixgbe_clock_in_i2c_bit(struct ixgbe_hw *hw, bool *data)
 {
-   u32 i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL);
+   u32 i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL_BY_MAC(hw));

DEBUGFUNC("ixgbe_clock_in_i2c_bit");

@@ -2243,8 +2242,8 @@ STATIC s32 ixgbe_clock_in_i2c_bit(struct ixgbe_hw *hw, 
bool *data)
/* Minimum high period of clock is 4us */
usec_delay(IXGBE_I2C_T_HIGH);

-   i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL);
-   *data = ixgbe_get_i2c_data(&i2cctl);
+   i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL_BY_MAC(hw));
+   *data = ixgbe_get_i2c_data(hw, &i2cctl);

ixgbe_lower_i2c_clk(hw, &i2cctl);

@@ -2264,7 +2263,7 @@ STATIC s32 ixgbe_clock_in_i2c_bit(struct ixgbe_hw *hw, 
bool *data)
 STATIC s32 ixgbe_clock_out_i2c_bit(struct ixgbe_hw *hw, bool data)
 {
s32 status;
-   u32 i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL);
+   u32 i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL_BY_MAC(hw));

DEBUGFUNC("ixgbe_clock_out_i2c_bit");

@@ -2306,15 +2305,15 @@ STATIC void ixgbe_raise_i2c_clk(struct ixgbe_hw *hw, 
u32 *i2cctl)
DEBUGFUNC("ixgbe_raise_i2c_clk");

for (i = 0; i < timeout; i++) {
-   *i2cctl |= IXGBE_I2C_CLK_OUT;
+   *i2cctl |= IXGBE_I2C_CLK_OUT_BY_MAC(hw);

-   IXGBE_WRITE_REG(hw, IXGBE_I2CCTL, *i2cctl);
+   IXGBE_WRITE_REG(hw, IXGBE_I2CCTL_BY_MAC(hw), *i2cctl);
IXGBE_WRITE_FLUSH(hw);
/* SCL rise time (1000ns) */
usec_delay(IXGBE_I2C_T_RISE);

-   i2cctl_r = IXGBE_READ_REG(hw, IXGBE_I2CCTL);
-   if (i2cctl_r & IXGBE_I2C_CLK_IN)
+   i2cctl_r = IXGBE_READ_REG(hw, IXGBE_I2CCTL_BY_MAC(hw));
+   if (i2cctl_r & IXGBE_I2C_CLK_IN_BY_MAC(hw))
break;
}
 }
@@ -2331,9 +2330,9 @@ STATIC void ixgbe_lower_i2c_clk(struct ixgbe_hw *hw, u32 
*i2cctl)

DEBUGFUNC("ixgbe_lower_i2c_clk");

-   *i2cctl &= ~IXGBE_I2C_CLK_OUT;
+   *i2cctl &= ~(IXGBE_I2C_CLK_OUT_BY_MAC(hw));

-   IXGBE_WRITE_REG(hw, IXGBE_I2CCTL, *i2cctl);
+   IXGBE_WRITE_REG(hw, IXGBE_I2CCTL_BY_MAC(hw), *i2cctl);
IXGBE_WRITE_FLUSH(
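
The *_BY_MAC(hw) macros used throughout this patch select a register offset or bit mask according to the MAC type. Their definitions live in ixgbe_type.h and are not visible in this excerpt, so the sketch below only illustrates the general shape. The enum, the offsets and the selection rule are all hypothetical placeholders.

```c
#include <stdint.h>

/* Hypothetical MAC types, ordered oldest to newest. */
enum mock_mac_type { MAC_82599, MAC_X540, MAC_X550 };

struct mock_hw {
	enum mock_mac_type mac_type;
};

/* Placeholder register offsets, for illustration only. */
#define I2CCTL_LEGACY  0x00028u
#define I2CCTL_NEW     0x15F5Cu

/* Pick the I2C control register according to the MAC generation. */
#define I2CCTL_BY_MAC(hw) \
	(((hw)->mac_type >= MAC_X550) ? I2CCTL_NEW : I2CCTL_LEGACY)

/* Helper showing how call sites stay identical across MAC types. */
static uint32_t i2cctl_offset(const struct mock_hw *hw)
{
	return I2CCTL_BY_MAC(hw);
}
```

Call sites then read and write I2CCTL_BY_MAC(hw) instead of a fixed offset, which is the substitution the diff above applies line by line.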

[dpdk-dev] [PATCH v2 17/18] ixgbe: Support X550 in IXGBE base code

2014-09-29 Thread Ouyang Changchun

This patch adds new files to support the X550 controller and updates the
Makefile and README accordingly.
It also updates the API functions, DCB-related functions, mailbox-related
functions, etc. to support X550.
In addition, some new macros used by X550 are added.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_ixgbe/Makefile |2 +
 lib/librte_pmd_ixgbe/ixgbe/README |3 +-
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c  |9 +
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.c|   25 +
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.h|2 +
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c |   12 +-
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_dcb.c|   20 +
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_mbx.c|4 +
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_osdep.h  |2 +
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c|1 +
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.h|5 +
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h   |  267 -
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_vf.h |3 +
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_x550.c   | 1809 +
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_x550.h   |   88 ++
 15 files changed, 2246 insertions(+), 6 deletions(-)
 create mode 100644 lib/librte_pmd_ixgbe/ixgbe/ixgbe_x550.c
 create mode 100644 lib/librte_pmd_ixgbe/ixgbe/ixgbe_x550.h

diff --git a/lib/librte_pmd_ixgbe/Makefile b/lib/librte_pmd_ixgbe/Makefile
index 00ccedb..0b647bd 100644
--- a/lib/librte_pmd_ixgbe/Makefile
+++ b/lib/librte_pmd_ixgbe/Makefile
@@ -64,6 +64,7 @@ CFLAGS_BASE_DRIVER += -Wno-strict-aliasing 
-Wno-format-extra-args

 ifeq ($(shell test $(GCC_MAJOR_VERSION) -ge 4 -a $(GCC_MINOR_VERSION) -ge 6 && 
echo 1), 1)
 CFLAGS_ixgbe_common.o += -Wno-unused-but-set-variable
+CFLAGS_ixgbe_x550.o += -Wno-unused-but-set-variable -Wno-maybe-uninitialized
 endif
 endif

@@ -83,6 +84,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe_common.c
 SRCS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe_82598.c
 SRCS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe_82599.c
 SRCS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe_x540.c
+SRCS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe_x550.c
 SRCS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe_phy.c
 SRCS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe_api.c
 SRCS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe_vf.c
diff --git a/lib/librte_pmd_ixgbe/ixgbe/README 
b/lib/librte_pmd_ixgbe/ixgbe/README
index fc71e85..e0e5f0d 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/README
+++ b/lib/librte_pmd_ixgbe/ixgbe/README
@@ -34,7 +34,7 @@ Intel® IXGBE driver
 ===

 This directory contains source code of FreeBSD ixgbe driver of version
-cid-10g-shared-code.2014.03.13 released by LAD. The sub-directory of lad/
+cid-10g-shared-code.2014.09.04 released by LAD. The sub-directory of lad/
 contains the original source package.
 This driver is valid for the product(s) listed below

@@ -50,6 +50,7 @@ This driver is valid for the product(s) listed below
 * Intel® Ethernet Controller X540-AT2
 * Intel® Ethernet Server Adapter X520 Series
 * Intel® Ethernet Server Adapter X520-T2
+* Intel® Ethernet Controller X550-BT2

 Updating driver
 ===
diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
index 2b74374..a06b57c 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
@@ -1918,6 +1918,15 @@ s32 ixgbe_fdir_set_input_mask_82599(struct ixgbe_hw *hw,
/* write both the same so that UDP and TCP use the same mask */
IXGBE_WRITE_REG(hw, IXGBE_FDIRTCPM, ~fdirtcpm);
IXGBE_WRITE_REG(hw, IXGBE_FDIRUDPM, ~fdirtcpm);
+   /* also use it for SCTP */
+   switch (hw->mac.type) {
+   case ixgbe_mac_X550:
+   case ixgbe_mac_X550EM_x:
+   IXGBE_WRITE_REG(hw, IXGBE_FDIRSCTPM, ~fdirtcpm);
+   break;
+   default:
+   break;
+   }

/* store source and destination IP masks (big-endian) */
IXGBE_WRITE_REG_BE32(hw, IXGBE_FDIRSIP4M,
diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.c
index b3e89c5..1802760 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_api.c
@@ -81,8 +81,16 @@ s32 ixgbe_init_shared_code(struct ixgbe_hw *hw)
case ixgbe_mac_X540:
status = ixgbe_init_ops_X540(hw);
break;
+   case ixgbe_mac_X550:
+   status = ixgbe_init_ops_X550(hw);
+   break;
+   case ixgbe_mac_X550EM_x:
+   status = ixgbe_init_ops_X550EM(hw);
+   break;
case ixgbe_mac_82599_vf:
case ixgbe_mac_X540_vf:
+   case ixgbe_mac_X550_vf:
+   case ixgbe_mac_X550EM_x_vf:
status = ixgbe_init_ops_vf(hw);
break;
default:
@@ -157,6 +165,23 @@ s32 ixgbe_set_mac_type(struct ixgbe_hw *hw)
case IXGBE_DEV_ID_X540T1:
hw->mac.type = ixgbe_mac_X540;
break;
+   case IXGBE_DEV_ID_X550T:
+   hw->mac.type = ixgbe_ma

[dpdk-dev] [PATCH v2] Change alarm cancel function to thread-safe:

2014-09-29 Thread Ananyev, Konstantin



> -----Original Message-----
> From: Wodkowski, PawelX
> Sent: Monday, September 29, 2014 7:41 AM
> To: Neil Horman; Ananyev, Konstantin
> Cc: dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2] Change alarm cancel function to 
> thread-safe:
> 
> > Yes, this is my concern exactly.
> >
> > > If that's so, then I suppose we can make alarm_cancel() return a
> > > negative value for case #3 (-EINPROGRESS or something).
> > > Something like:
> > > ...
> > > if (ap->executing == 0) {
> > >     LIST_REMOVE(ap, next);
> > >     rte_free(ap);
> > >     count++;
> > >     ap = ap_prev;
> > > } else if (pthread_equal(ap->executing_id, pthread_self()) == 0) {
> > >     executing++;
> > > } else {
> > >     ret = -EINPROGRESS;
> > > }
> > > ...
> > > return ((ret != 0) ? ret : count);
> > >
> > > So the return value will be > 0 for #1, 0 for #2, < 0 for #3.
> > > As I remember, you already suggested something similar in one of the
> > > previous mails.
> > Yes, I rolled the API changes I suggested in with this model, because I
> > wanted to be able to do precise specification of a timer instance to cancel,
> > but if we're not ready to make that change, I think what you propose above
> > would be sufficient.  There's some question as to whether we would cancel
> > timers that are still pending on a return of -EINPROGRESS, but I think if we
> > document it accordingly, then it can be worked out just fine.
> >
> > Best
> > Neil
> >
> 
> Imagine how you will be damned by someone who does not even notice your change
> and is managing some kind of resource based on the returned number of
> set/canceled timers. If you suddenly start returning negative values, how will
> those applications behave? Silently changing the returned value domain is evil
> in its pure form.

As far as I can see, the impact is very limited.
Only code that checks (rte_eal_alarm_cancel(...) == 0 / != 0) inside an alarm
callback function might be affected.
On the other hand, there could indeed be situations where the caller needs to
know whether the alarm was successfully cancelled or not,
and if not, for what reason.
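
As a rough illustration of the convention discussed above (> 0, 0, < 0), a
caller could separate the error case before touching any counter. This is
only a sketch: account_cancelled() is a hypothetical helper, not a DPDK
function, and it takes an already-obtained return value instead of calling
rte_eal_alarm_cancel() itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of DPDK): fold the result of a cancel call
 * into an application-side counter without letting an error code corrupt it.
 * 'ret' is assumed to follow the convention proposed in this thread:
 *   > 0   number of alarms cancelled
 *   == 0  nothing was cancelled
 *   < 0   negative errno value, e.g. -EINPROGRESS
 * Returns 0 on success, or the negative error code unchanged.
 */
int
account_cancelled(int ret, unsigned int *counter)
{
	assert(counter != NULL);

	if (ret < 0)
		return ret;	/* error: leave the counter untouched */

	*counter += (unsigned int)ret;
	return 0;
}
```

With this shape, a negative value such as -EINPROGRESS is handed back to the
caller and never added to the counter.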

> 
> From my point of view, the problem is virtual, because it is the user
> application's task to know what it can and cannot do. If you really want to
> inform the user application about timer state, you can introduce an API call
> that interrogates the timer list and returns an appropriate value, but for
> God's sake, do not introduce untraceable bugs.
> 
> Pawel

[dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices

2014-09-29 Thread Bruce Richardson

On Fri, Sep 26, 2014 at 10:08:55AM -0400, Neil Horman wrote:
> On Fri, Sep 26, 2014 at 11:28:05AM +0200, Thomas Monjalon wrote:
> > 2014-09-16 16:16, Neil Horman:
> > > On Fri, Sep 12, 2014 at 02:05:23PM -0400, John W. Linville wrote:
> > > > Ping?  Are there objections to this patch from mid-July?
> > > 
> > > > Thomas, where are you on this?  It seems like if you don't have any
> > > > objections to this patch, it should go in, in light of the lack of
> > > > further commentary.
> > 
> > 1) It doesn't appear as a top priority.
> That's your responsibility.  Patches can't languish and rot on a list forever
> just because others aren't willing to test them.  If there's further testing
> that you feel it needs, ask. But from my read, it's been tested for
> functionality and performance (though high performance is never expected from
> an AF_PACKET PMD).  Given that any one PMD will not affect the performance of
> another in isolation, I'm not sure what more you're waiting for here.
> 
> > 2) It's competing with pcap PMD and bifurcated PMD to come
> >(http://dpdk.org/ml/archives/dev/2014-September/005379.html)
> Regarding the pcap PMD, so?  It's an alternate implementation that provides
> different features with different limitations.  The fact that they are
> similar is irrelevant.  If similarity was the test, then we wouldn't bother
> with the bifurcated driver either, because the pcap PMD already exists.
> 
> Regarding the bifurcated driver, you can't hold existing patches on the
> promise of another PMD that's coming at an indeterminate time in the future.
> There's no reason not to take this now and deprecate it in the future if there
> is sufficient overlap with the bifurcated driver, though to my point above,
> they still address different needs with different limitations, so I don't see
> doing so as necessary.
>  
> > 3) There is no test associated with this PMD.
> That would have been a great comment to make a few months back, though what's
> wrong with testpmd here?  That seems to be the same test that every other PMD
> uses. What exactly are you looking for?
> 
> 
> > If one of these items becomes wrong, it should go in.
> > 
> 
> > Currently, 2 projects are being initiated for validation (dcts) and
> > documentation. Keeping new things outside of the DPDK core makes it
> > clear that they do not have to be supported by dcts and doc yet.
> > So, it is better to have an external PMD, like memnic, acting as a
> > staging area.
> > 
> So, this brings up an excellent point - validation and support.  Commonly,
> open source projects don't provide support at the upstream HEAD. Those items
> are applied and enforced by distributors.  There's no need to ensure that the
> upstream head is always the most performant and stable point of the tree.
> It's that need that keeps the development pace slow, and creates frustrations
> like this one, where a patch sits unaddressed for long periods of time.
> Commonly the workflow for most open source projects is for there to be a
> window of time where visual review and basic functional testing are sufficient
> for acceptance into the head of the tree.  After the development window closes
> there is a stabilization period where testing/validation is done to ensure
> that no regressions have been encountered, optionally with a -next branch
> temporarily being created to accept patches for upcoming future releases.  If
> regressions are found, it's a simple matter in git to bisect back to the
> offending patch, and allow the contributing developer an opportunity to fix
> the issue, or to drop the patch.  Using a workflow like this we can have a
> reasonable balance of needs (good patch turnaround time, as well as reasonable
> testing).  We've discussed this when I posted the PMD_REGISTER_DRIVER patch
> months ago, and I thought you were going to move in the direction of this
> workflow.  What happened?
> 
> > During this time, keeping this PMD separately will allow you to update it
> > with a maintainer account in dpdk.org. I just need your SSH public key.
> > 
> We've discussed this too; keeping PMDs maintained separately is a very bad
> idea.  Doing so means developers have to constantly be aware of changes to the
> core tree and try to keep up individually.  Integrating them all means that
> API changes can be easily propagated to all PMDs when needed without making
> work for many people.  It's exactly the reason we encourage driver writers to
> open source drivers in Linux, because not doing so closes developers off from
> the free maintenance they get when optimizations are made to APIs.  And if you
> follow the development model above, you don't need to worry about implied
> support, as that correctly becomes a distributor issue.
> 
> 
> Neil

While not wanting to get too involved in the discussion, I'd just like to 
express my support for getting this new PMD merged in.

/Bruce

[dpdk-dev] [PATCH v2] Change alarm cancel function to thread-safe:

2014-09-29 Thread Wodkowski, PawelX

> >
> > Imagine how you will be damned by someone who does not even notice your change
> > and is managing some kind of resource based on the returned number of
> > set/canceled timers. If you suddenly start returning negative values, how will
> > those applications behave? Silently changing the returned value domain is evil
> > in its pure form.
> 
> As far as I can see, the impact is very limited.

The impact on DPDK is small, but it can be huge for a user application:
Ex:
If someone uses this kind of expression in a callback (skipping the user app
serialization part):
callback () {
...
    some_simple_semaphore += rte_eal_alarm_cancel(...);
...
}

Anywhere in the code:
...
if (some_simple_semaphore) {
    some_simple_semaphore--;
    if (rte_eal_alarm_set(...) != 0)
        some_simple_semaphore++;
}
...

1. Would you notice the change in the cancel function?
2. How many hours would you spend finding this issue in a big app/system?

> Only code that checks (rte_eal_alarm_cancel(...) == 0 / != 0) inside an alarm
> callback function might be affected.
> On the other hand, there could indeed be situations where the caller needs to
> know whether the alarm was successfully cancelled or not,
> and if not, for what reason.
> 

I can extend the rte alarm API to add alarm state checking in the next patch,
but for now, since this is not urgent, I think the original patch v2 should be
enough.

Pawel

[dpdk-dev] [PATCH] KNI: use a memzone pool for KNI alloc/release

2014-09-29 Thread Marc Sune

This patch implements the KNI memzone pool in order to:

* prevent memzone exhaustion when allocating/deallocating KNI
  interfaces.
* be able to allocate KNI interfaces with the same name as
  previously deallocated ones.

It adds a new API call, rte_kni_init(max_kni_ifaces) that shall
be called before any call to rte_kni_alloc() if KNI is used.

Signed-off-by: Marc Sune 
---
 lib/librte_kni/rte_kni.c |  302 ++
 lib/librte_kni/rte_kni.h |   18 +++
 2 files changed, 269 insertions(+), 51 deletions(-)

diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
index 76feef4..df55789 100644
--- a/lib/librte_kni/rte_kni.c
+++ b/lib/librte_kni/rte_kni.c
@@ -40,6 +40,7 @@
 #include 
 #include 

+#include 
 #include 
 #include 
 #include 
@@ -58,7 +59,7 @@

 #define KNI_REQUEST_MBUF_NUM_MAX  32

-#define KNI_MZ_CHECK(mz) do { if (mz) goto fail; } while (0)
+#define KNI_MEM_CHECK(cond) do { if (cond) goto kni_fail; } while (0)

 /**
  * KNI context
@@ -66,6 +67,7 @@
 struct rte_kni {
char name[RTE_KNI_NAMESIZE];/**< KNI interface name */
uint16_t group_id;  /**< Group ID of KNI devices */
+   unsigned slot_id;   /**< KNI pool slot ID */
struct rte_mempool *pktmbuf_pool;   /**< pkt mbuf mempool */
unsigned mbuf_size; /**< mbuf size */

@@ -88,10 +90,48 @@ enum kni_ops_status {
KNI_REQ_REGISTERED,
 };

+/**
+* KNI memzone pool slot
+*/
+struct rte_kni_memzone_slot{
+   unsigned id;
+   uint8_t in_use : 1;/**< slot in use */
+
+   //Memzones
+   const struct rte_memzone *m_ctx;   /**< KNI ctx */
+   const struct rte_memzone *m_tx_q;  /**< TX queue */
+   const struct rte_memzone *m_rx_q;  /**< RX queue */
+   const struct rte_memzone *m_alloc_q;   /**< Allocated mbufs queue */
+   const struct rte_memzone *m_free_q;/**< To be freed mbufs queue */
+   const struct rte_memzone *m_req_q; /**< Request queue */
+   const struct rte_memzone *m_resp_q;/**< Response queue */
+   const struct rte_memzone *m_sync_addr; 
+   
+   /* Free linked list */
+   struct rte_kni_memzone_slot *next; /**< Next slot link.list */
+};
+
+/**
+* KNI memzone pool
+*/
+struct rte_kni_memzone_pool{
+   uint8_t initialized : 1;/**< Global KNI pool init flag */
+ 
+   unsigned max_ifaces;/**< Max. num of KNI ifaces */
+   struct rte_kni_memzone_slot *slots;/**< Pool slots */
+   rte_spinlock_t mutex;   /**< alloc/relase mutex */
+
+   //Free memzone slots linked-list
+   struct rte_kni_memzone_slot *free; /**< First empty slot */
+   struct rte_kni_memzone_slot *free_tail;/**< Last empty slot */
+};
+
+
 static void kni_free_mbufs(struct rte_kni *kni);
 static void kni_allocate_mbufs(struct rte_kni *kni);

 static volatile int kni_fd = -1;
+static struct rte_kni_memzone_pool kni_memzone_pool = {0};

 static const struct rte_memzone *
 kni_memzone_reserve(const char *name, size_t len, int socket_id,
@@ -105,6 +145,154 @@ kni_memzone_reserve(const char *name, size_t len, int 
socket_id,
return mz;
 }

+/* Pool mgmt */
+static struct rte_kni_memzone_slot*
+kni_memzone_pool_alloc(void)
+{
+   struct rte_kni_memzone_slot* slot;
+   
+   rte_spinlock_lock(&kni_memzone_pool.mutex); 
+
+   if(!kni_memzone_pool.free) {
+   rte_spinlock_unlock(&kni_memzone_pool.mutex);   
+   return NULL;
+   }
+
+   slot = kni_memzone_pool.free;
+   kni_memzone_pool.free = slot->next;
+
+   if(!kni_memzone_pool.free)
+   kni_memzone_pool.free_tail = NULL;
+
+   rte_spinlock_unlock(&kni_memzone_pool.mutex);
+
+   return slot;
+}
+
+static void 
+kni_memzone_pool_dealloc(struct rte_kni_memzone_slot* slot)
+{
+   rte_spinlock_lock(&kni_memzone_pool.mutex); 
+
+   if(kni_memzone_pool.free)
+   kni_memzone_pool.free_tail->next = slot;
+   else
+   kni_memzone_pool.free = slot;
+
+   kni_memzone_pool.free_tail = slot;
+   slot->next = NULL;
+
+   rte_spinlock_unlock(&kni_memzone_pool.mutex);
+}
+
+
+/* Shall be called before any allocation happens */
+void
+rte_kni_init(unsigned int max_kni_ifaces)
+{
+   unsigned i;
+   struct rte_kni_memzone_slot* it;
+   const struct rte_memzone *mz;
+#define OBJNAMSIZ 32
+   char obj_name[OBJNAMSIZ];
+   char mz_name[RTE_MEMZONE_NAMESIZE];
+
+   if(max_kni_ifaces == 0) {
+   //Panic
+   RTE_LOG(ERR, KNI, "Invalid number of max_kni_ifaces %d\n",
+   max_kni_ifaces);
+   rte_panic("Unable to initialize KNI\n");
+   }
+
+   //Allocate slot objects
+   kni_memzone_pool.slots = (struct rte_kni_memzone_slot*)rte_malloc(NULL,
+

[dpdk-dev] [PATCH v2] Change alarm cancel function to thread-safe:

2014-09-29 Thread Bruce Richardson

On Mon, Sep 29, 2014 at 10:11:38AM +0000, Wodkowski, PawelX wrote:
> > >
> > > Imagine how you will be damned by someone who does not even notice your change
> > > and is managing some kind of resource based on the returned number of
> > > set/canceled timers. If you suddenly start returning negative values, how will
> > > those applications behave? Silently changing the returned value domain is evil
> > > in its pure form.
> > 
> > As far as I can see, the impact is very limited.
> 
> The impact on DPDK is small, but it can be huge for a user application:

This is why we traditionally have in the release notes for each release a
section dedicated to calling out changes from one release to another. [See
http://dpdk.org/doc/intel/dpdk-release-notes-1.7.0.pdf section 5]. Since
from release to release there are generally only a couple of changes -
though our next release may be a little different - the actual changes are
clear enough to read about without wading through pages of documentation. I
think calling out the change in both the release notes and the API docs
is sufficient even for a change like this.

Basically, I wouldn't let API stability factor in too much in trying to get
a proper fix for this issue.

/Bruce

> Ex:
> If someone uses this kind of expression in a callback (skipping the user app
> serialization part):
> callback () {
> ...
>     some_simple_semaphore += rte_eal_alarm_cancel(...);
> ...
> }
> 
> Anywhere in the code:
> ...
> if (some_simple_semaphore) {
>     some_simple_semaphore--;
>     if (rte_eal_alarm_set(...) != 0)
>         some_simple_semaphore++;
> }
> ...
> 
> 1. Would you notice the change in the cancel function?
> 2. How many hours would you spend finding this issue in a big app/system?
> 
> > Only code that checks (rte_eal_alarm_cancel(...) == 0 / != 0) inside an
> > alarm callback function might be affected.
> > On the other hand, there could indeed be situations where the caller needs
> > to know whether the alarm was successfully cancelled or not,
> > and if not, for what reason.
> > 
> 
> I can extend the rte alarm API to add alarm state checking in the next patch,
> but for now, since this is not urgent, I think the original patch v2 should be
> enough.
> 
> Pawel

[dpdk-dev] [PATCH v2] Change alarm cancel function to thread-safe:

2014-09-29 Thread Bruce Richardson

On Fri, Sep 26, 2014 at 02:13:55PM +0000, Ananyev, Konstantin wrote:
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
> > Sent: Friday, September 26, 2014 2:40 PM
> > To: Wodkowski, PawelX
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2] Change alarm cancel function to 
> > thread-safe:
> > 
> > On Fri, Sep 26, 2014 at 12:37:54PM +0000, Wodkowski, PawelX wrote:
> > > > So basically cancel() just sets ALARM_CANCELLED and leaves actual alarm
> > > > deletion to the callback()?
> > > > That was the thought, yes.
> > > >
> > > > > I think it is doable - but I don't see any real advantage with that
> > > > > approach.
> > > > > Yes, code will become a bit simpler, as we'll have one point when we
> > > > > remove alarm from the list.
> > > > Yes, that would be the advantage, that the code would be much simpler.
> > > >
> > > > > But from other side, imagine such simple test-case:
> > > > >
> > > > > for (i = 0; i < 0x100000; i++) {
> > > > >    rte_eal_alarm_set(ONE_MIN, cb_func, (void *)i);
> > > > >    rte_eal_alarm_cancel(cb_func, (void *)i);
> > > > > }
> > > > >
> > > > > We'll end up with 1M of cancelled, but still not removed entries in the
> > > > > alarm_list.
> > > > > With current implementation that means - few MBs of wasted memory,
> > > > That's correct, and the tradeoff to choose between.  Do you want simpler
> > > > code that is easier to maintain, or do you want a high speed cancel and
> > > > set operation?  I'm not aware of all the use cases, but I have a hard
> > > > time seeing a use case in which the in-flight alarm list grows
> > > > unboundedly large, which in my mind mitigates the risk of deferred
> > > > removal, but I'm perfectly willing to believe that there are use cases
> > > > which I'm not aware of.
> 
> After executing the example above, from the user's perspective there are no
> active alarms in the system at all,
> though in fact alarm_list contains 1M entries.

This would concern me. It's likely that in applications, e.g. those with a 
network stack for instance, timers could be used for timeouts e.g. on 
connections, which would mean that the common case by far would be for 
timers to be cancelled or rescheduled without ever timing out.
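
To illustrate the concern, here is a toy model of a list where cancel only
marks an entry and removal is deferred. It is not the DPDK alarm code; every
toy_* name is made up for this sketch, and the reap step merely stands in for
whatever would eventually remove cancelled entries.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical miniature of an alarm list with deferred removal. */
struct toy_alarm {
	struct toy_alarm *next;
	int cancelled;
};

struct toy_alarm_list {
	struct toy_alarm *head;
	size_t entries;	/* entries still linked, cancelled or not */
	size_t active;	/* entries not yet cancelled */
};

/* Add a new alarm at the head of the list. Returns 0, or -1 if out of memory. */
int
toy_alarm_set(struct toy_alarm_list *l)
{
	struct toy_alarm *a = calloc(1, sizeof(*a));

	assert(l != NULL);
	if (a == NULL)
		return -1;
	a->next = l->head;
	l->head = a;
	l->entries++;
	l->active++;
	return 0;
}

/* Deferred cancel: mark the first active entry, but leave it linked. */
int
toy_alarm_cancel_deferred(struct toy_alarm_list *l)
{
	struct toy_alarm *a;

	for (a = l->head; a != NULL; a = a->next) {
		if (!a->cancelled) {
			a->cancelled = 1;
			l->active--;
			return 1;
		}
	}
	return 0;
}

/* Unlink and free every cancelled entry; returns how many were freed. */
size_t
toy_alarm_reap(struct toy_alarm_list *l)
{
	struct toy_alarm **pp = &l->head;
	size_t freed = 0;

	while (*pp != NULL) {
		struct toy_alarm *a = *pp;

		if (a->cancelled) {
			*pp = a->next;
			free(a);
			l->entries--;
			freed++;
		} else {
			pp = &a->next;
		}
	}
	return freed;
}
```

After N set/cancel pairs on this model, `active` is zero while `entries` is
still N until toy_alarm_reap() runs, which is the growth pattern a
timeout-heavy application would hit.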

/Bruce

[dpdk-dev] [PATCH v4 2/8]i40e:support VxLAN packet identification in librte_pmd_i40e

2014-09-29 Thread Bruce Richardson

On Fri, Sep 26, 2014 at 10:02:03AM +0800, Jijiang Liu wrote:
> Support tunneling UDP port configuration on i40e in librte_pmd_i40e.
> Currently, only VxLAN is implemented, which include
>  -  VxLAN UDP port initialization
>  -  Implement the APIs to configure VxLAN UDP port in librte_pmd_i40e.
>  
> Signed-off-by: Jijiang Liu 
> Acked-by: Helin Zhang 
> Acked-by: Jingjing Wu 
> Acked-by: Jing Chen 
> 
> ---
>  config/common_linuxapp|5 +
>  lib/librte_mbuf/rte_mbuf.h|2 +
>  lib/librte_pmd_i40e/i40e_ethdev.c |  200 
> -
>  lib/librte_pmd_i40e/i40e_ethdev.h |5 +
>  lib/librte_pmd_i40e/i40e_rxtx.c   |   10 ++
>  5 files changed, 221 insertions(+), 1 deletions(-)
> 
> diff --git a/config/common_linuxapp b/config/common_linuxapp
> index 5bee910..75a4cd7 100644
> --- a/config/common_linuxapp
> +++ b/config/common_linuxapp
> @@ -212,6 +212,11 @@ CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VF=4
>  CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL=-1
>  
>  #
> +# Compile tunneling UDP port support
> +#
> +CONFIG_RTE_LIBRTE_TUNNEL_UDP_PORT=4789
> +
> +#
>  # Compile burst-oriented VIRTIO PMD driver
>  #
>  CONFIG_RTE_LIBRTE_VIRTIO_PMD=y
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 1c6e115..4955684 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -538,6 +538,7 @@ static inline void rte_pktmbuf_reset(struct rte_mbuf *m)
>   m->port = 0xff;
>  
>   m->ol_flags = 0;
> + m->reserved = 0;
>   m->data_off = (RTE_PKTMBUF_HEADROOM <= m->buf_len) ?
>   RTE_PKTMBUF_HEADROOM : m->buf_len;
>  
> @@ -607,6 +608,7 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf 
> *mi, struct rte_mbuf *md)
>   mi->pkt_len = mi->data_len;
>   mi->nb_segs = 1;
>   mi->ol_flags = md->ol_flags;
> + mi->reserved = md->reserved;
>  
>   __rte_mbuf_sanity_check(mi, 1);
>   __rte_mbuf_sanity_check(md, 0);

If the "reserved" field in the mbuf is now being used, it should be renamed
to what it's actually being used for. If it is still not being used, why this
change?

/Bruce

[dpdk-dev] [PATCH v4 7/8]i40e:support VxLAN Tx checksum offload

2014-09-29 Thread Bruce Richardson

On Fri, Sep 26, 2014 at 10:02:08AM +0800, Jijiang Liu wrote:
> Support VxLAN Tx checksum offload, which include
>   - outer L3(IP) checksum offload
>   - inner L3(IP) checksum offload
>   - inner L4(UDP, TCP and SCTP) checksum offload
>  
> Signed-off-by: Jijiang Liu 
> Acked-by: Helin Zhang 
> Acked-by: Jingjing Wu 
> Acked-by: Jing Chen 
> 
> ---
>  lib/librte_mbuf/rte_mbuf.h|2 +
>  lib/librte_pmd_i40e/i40e_ethdev.c |4 +-
>  lib/librte_pmd_i40e/i40e_rxtx.c   |   47 ++--
>  3 files changed, 48 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 4955684..1f3f4eb 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -86,6 +86,8 @@ extern "C" {
>  #define PKT_RX_IEEE1588_PTP  0x0200 /**< RX IEEE1588 L2 Ethernet PT Packet. 
> */
>  #define PKT_RX_IEEE1588_TMST 0x0400 /**< RX IEEE1588 L2/L4 timestamped 
> packet.*/
>  
> +#define PKT_TX_VXLAN_CKSUM   0x0001 /**< Checksum of TX VxLAN pkt. computed 
> by NIC.. */
> +#define PKT_TX_IVLAN_PKT 0x0002 /**< TX packet is VxLAN packet with an 
> inner VLAN. */
>  #define PKT_TX_VLAN_PKT  0x0800 /**< TX packet is a 802.1q VLAN packet. 
> */
>  #define PKT_TX_IP_CKSUM  0x1000 /**< IP cksum of TX pkt. computed by 
> NIC. */
>  #define PKT_TX_IPV4_CSUM 0x1000 /**< Alias of PKT_TX_IP_CKSUM. */

These flag values overlap with ones already defined for RX. We have an
additional 48 flags (47 after you subtract one I reused for control mbuf flag)
following the mbuf rework, so overlap should not be needed, I think.
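
As a sketch of what non-overlapping allocation could look like, assuming a
64-bit flags word (the 16 original bits plus the 48 additional ones mentioned
above): the TOY_* names and bit positions below are hypothetical and are not
the values in rte_mbuf.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical RX flags, allocated from the low end of the word. */
#define TOY_PKT_RX_FLAG_A (1ULL << 0)
#define TOY_PKT_RX_FLAG_B (1ULL << 1)
#define TOY_PKT_RX_MASK   (TOY_PKT_RX_FLAG_A | TOY_PKT_RX_FLAG_B)

/* Hypothetical TX flags, allocated from the high end of the word. */
#define TOY_PKT_TX_FLAG_A (1ULL << 62)
#define TOY_PKT_TX_FLAG_B (1ULL << 63)
#define TOY_PKT_TX_MASK   (TOY_PKT_TX_FLAG_A | TOY_PKT_TX_FLAG_B)

/* Return nonzero if the two masks share at least one bit. */
int
toy_flags_overlap(uint64_t a, uint64_t b)
{
	return (a & b) != 0;
}

/* Sanity check: no RX flag may alias a TX flag. */
void
toy_check_flag_layout(void)
{
	assert(!toy_flags_overlap(TOY_PKT_RX_MASK, TOY_PKT_TX_MASK));
}
```

A check like toy_check_flag_layout() fails as soon as a new TX flag is given a
value that an RX flag already uses.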

/Bruce

[dpdk-dev] [PATCH] pci: remove flag for multiple devices with single id

2014-09-29 Thread Thomas Monjalon

The flag RTE_PCI_DRV_MULTIPLE was used to register an eth_driver allowing
multiple devices with a single PCI id.
It is now possible to register a pci_driver and create ethdev objects
using rte_eth_dev_allocate().

Suggested-by: David Marchand 
Signed-off-by: Thomas Monjalon 
---
 lib/librte_eal/common/eal_common_pci.c  | 10 --
 lib/librte_eal/common/include/rte_pci.h |  4 ++--
 2 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_pci.c 
b/lib/librte_eal/common/eal_common_pci.c
index af809a8..f3c7f71 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -103,10 +103,6 @@ static struct rte_devargs *pci_devargs_lookup(struct 
rte_pci_device *dev)
  * If vendor/device ID match, call the devinit() function of all
  * registered driver for the given device. Return -1 if initialization
  * failed, return 1 if no driver is found for this device.
- * For drivers with the RTE_PCI_DRV_MULTIPLE flag enabled, register
- * the same device multiple times until failure to do so.
- * It is required for non-Intel NIC drivers provided by third-parties such
- * as 6WIND.
  */
 static int
 pci_probe_all_drivers(struct rte_pci_device *dev)
@@ -122,12 +118,6 @@ pci_probe_all_drivers(struct rte_pci_device *dev)
if (rc > 0)
/* positive value means driver not found */
continue;
-   /* initialize subsequent driver instances for this device */
-   if ((dr->drv_flags & RTE_PCI_DRV_MULTIPLE) &&
-   (dev->devargs == NULL ||
-   dev->devargs->type != 
RTE_DEVTYPE_BLACKLISTED_PCI))
-   while (rte_eal_pci_probe_one_driver(dr, dev) == 0)
-   ;
return 0;
}
return 1;
diff --git a/lib/librte_eal/common/include/rte_pci.h 
b/lib/librte_eal/common/include/rte_pci.h
index d6b1c1b..66ed793 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -193,8 +193,8 @@ struct rte_pci_driver {

 /** Device needs PCI BAR mapping (done with either IGB_UIO or VFIO) */
 #define RTE_PCI_DRV_NEED_MAPPING 0x0001
-/** Device driver must be registered several times until failure */
-#define RTE_PCI_DRV_MULTIPLE 0x0002
+/** Device driver must be registered several times until failure - deprecated 
*/
+#pragma GCC poison RTE_PCI_DRV_MULTIPLE
 /** Device needs to be unbound even if no module is provided */
 #define RTE_PCI_DRV_FORCE_UNBIND 0x0004
 /** Device driver supports link state interrupt */
-- 
2.0.4

[dpdk-dev] [PATCH] ixgbe: fix crash caused by bulk allocation failure in vector pmd

2014-09-29 Thread Thomas Monjalon

> > Since the introduction of vector PMD, a bug in ixgbe_rxq_rearm could
> > cause a crash. As long as the memory pool allocated to the RX queue
> > has mbufs available, there is no problem. After allocation of _all_
> > mbufs from the memory pool, previously returned mbufs by
> > rte_eth_rx_burst could be accessed by subsequent calls to the PMD and
> > could be returned by subsequent calls to rte_eth_rx_burst. From the
> > perspective of the application, this means that fields within the mbuf
> > could change and that previously allocated mbufs could appear multiple
> > times.
> > 
> > After failure of mbuf allocation, the dd bits should indicate that the
> > packets are not ready. For this, this patch adds code to reset the dd
> > bits in the first RTE_IXGBE_DESCS_PER_LOOP packets of the next
> > RTE_IXGBE_RXQ_REARM_THRESH packets only if the next
> > RTE_IXGBE_RXQ_REARM_THRESH packets that will be accessed contain
> > previously allocated packets.
> > 
> > Setting the bits is not enough. The bits are checked _after_ setting
> > the mbuf fields, thus a mechanism is needed to prevent the previously
> > used mbuf pointers from being accessed during the speculative load of
> > the mbuf fields. For this reason, not only the dd bits are reset, but
> > also the mbufs associated to those descriptors are set to point to a
> > "fake" mbuf.
> > 
> > Signed-off-by: Balazs Nemeth 
> 
> Acked-by: Konstantin Ananyev 

Applied

Thanks
-- 
Thomas

[dpdk-dev] [PATCH 1/7] Split atomic operations to architecture specific

2014-09-29 Thread Bruce Richardson

On Fri, Sep 26, 2014 at 05:33:32AM -0400, Chao Zhu wrote:
> This patch splits the atomic operations from DPDK and pushes them to
> architecture specific arch directories, so that support for other processor
> architectures can be easily added to DPDK.
> 
> Signed-off-by: Chao Zhu 
> ---
>  lib/librte_eal/common/Makefile |2 +-
>  .../common/include/i686/arch/rte_atomic_arch.h |  378 
> 
>  lib/librte_eal/common/include/rte_atomic.h |  172 +
>  .../common/include/x86_64/arch/rte_atomic_arch.h   |  378 
> 
>  4 files changed, 772 insertions(+), 158 deletions(-)
>  create mode 100644 lib/librte_eal/common/include/i686/arch/rte_atomic_arch.h
>  create mode 100644 
> lib/librte_eal/common/include/x86_64/arch/rte_atomic_arch.h
> 
<...snip...>
> +#define  rte_compiler_barrier() rte_arch_compiler_barrier()

Small question: shouldn't the compiler barrier be independent of 
architecture?
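
For reference, a compiler-only barrier can be written without any
architecture-specific instruction, assuming a GCC-compatible compiler.
toy_compiler_barrier() and toy_publish() are hypothetical names for this
sketch; the empty asm with a "memory" clobber constrains only the compiler
and emits no CPU fence.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical compiler barrier (GCC/Clang inline asm assumed): the compiler
 * may not reorder or cache memory accesses across this point. It generates
 * no instructions, so it does not order accesses at the CPU level.
 */
#define toy_compiler_barrier() do { \
	__asm__ __volatile__ ("" : : : "memory"); \
} while (0)

/* Store a value, then set a ready flag, keeping the two stores in order
 * as far as the compiler is concerned. Returns the stored value. */
int
toy_publish(int *data, int *ready, int value)
{
	assert(data != NULL && ready != NULL);

	*data = value;
	toy_compiler_barrier();	/* compiler keeps the data store before the flag store */
	*ready = 1;
	return *data;
}
```

Nothing in the macro depends on the target CPU; only the compiler's inline asm
syntax matters.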

/Bruce

[dpdk-dev] [PATCH] ixgbe: allow unsupported SFP

2014-09-29 Thread Thomas Monjalon

> No need to restrict usage of non Intel SFP.
> If (hw->phy.type == ixgbe_phy_sfp_intel) is false,
> a warning will be logged.
> It was disabled for ixgbe and enabled but unused for i40e.
> 
> Signed-off-by: Thomas Monjalon 

Applied
-- 
Thomas

[dpdk-dev] [PATCH v2] Change alarm cancel function to thread-safe:

2014-09-29 Thread Neil Horman

On Mon, Sep 29, 2014 at 10:11:38AM +0000, Wodkowski, PawelX wrote:
> > >
> > > Imagine how you will be damned by someone who does not even notice your change
> > > and is managing some kind of resource based on the returned number of
> > > set/canceled timers. If you suddenly start returning negative values, how will
> > > those applications behave? Silently changing the returned value domain is evil
> > > in its pure form.
> > 
> > As far as I can see, the impact is very limited.
> 
> The impact on DPDK is small, but it can be huge for a user application:
> Ex:
> If someone uses this kind of expression in a callback (skipping the user app
> serialization part):
> callback () {
> ...
>     some_simple_semaphore += rte_eal_alarm_cancel(...);

This code would be broken to begin with, as rte_eal_alarm_cancel is already
written to return negative return codes.  It's not documented as such, but it's
still the case.  Note that if you run an application built against a shared
library on BSD, the definition of rte_eal_alarm_cancel returns -ENOTSUP.  The
above code would be broken because it doesn't account for that.  You can argue
that the documentation should be updated, but the dpdk in the wild already
conforms to the model Konstantin and I are proposing.

> ...
> }
> 
> Anywhere in the code:
> ...
> if (some_simple_semaphore) {
>     some_simple_semaphore--;
>     if (rte_eal_alarm_set(...) != 0)
>         some_simple_semaphore++;
> }
> ...
> 
> 1. Would you notice the change in the cancel function?
The application crashes, or otherwise misbehaves.

> 2. How many hours would you spend finding this issue in a big app/system?
You don't.  Such a problem as you describe would very likely result in a
semaphore deadlock, as the count would be incorrectly lowered, so you put
watches on the variable, note that sometimes the count goes down on a cancel,
which is completely counter-intuitive, read the updated documentation that
indicates error codes are possible (which you should have been prepared for
anyway), and move on with your day.

> 
> > Only code that checks (rte_eal_alarm_cancel(...) == 0 / != 0) inside an
> > alarm callback function might be affected.
> > On the other hand, there could indeed be situations where the caller needs
> > to know whether the alarm was successfully cancelled or not,
> > and if not, for what reason.
> > 
> 
> I can extend the rte alarm API to add alarm state checking in the next patch,
> but for now, since this is not urgent, I think the original patch v2 should be
> enough.
I reassert my original argument here: without the above change, you haven't
really fixed the race.  If you can find another way to do it, that's fine with
me, but keep in mind once again that some implementations of rte_eal_alarm_set
already do what's being proposed.

Neil

> 
> Pawel
>

[dpdk-dev] [RFC] More changes for rte_mempool.h:__mempool_get_bulk()

2014-09-29 Thread Bruce Richardson

On Sun, Sep 28, 2014 at 11:17:34PM +0000, Wiles, Roger Keith wrote:
> 
> On Sep 28, 2014, at 5:41 PM, Ananyev, Konstantin  intel.com> wrote:
> 
> > 
> > 
> >> -----Original Message-----
> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Wiles, Roger Keith
> >> Sent: Sunday, September 28, 2014 6:52 PM
> >> To: 
> >> Subject: [dpdk-dev] [RFC] More changes for 
> >> rte_mempool.h:__mempool_get_bulk()
> >> 
> >> Here is a Request for Comment on __mempool_get_bulk() routine. I believe I 
> >> am seeing a few more issues in this routine, please look
> >> at the code below and see if these seem to fix some concerns in how the 
> >> ring is handled.
> >> 
> >> The first issue I believe is cache->len is increased by ret and not req as 
> >> we do not know if ret == req. This also means the cache->len
> >> may still not satisfy the request from the cache.
> >> 
> >> The second issue is if you believe the above code then we have to account 
> >> for that issue in the stats.
> >> 
> >> Let me know what you think?
> >> ++Keith
> >> ---
> >> 
> >> diff --git a/lib/librte_mempool/rte_mempool.h 
> >> b/lib/librte_mempool/rte_mempool.h
> >> index 199a493..b1b1f7a 100644
> >> --- a/lib/librte_mempool/rte_mempool.h
> >> +++ b/lib/librte_mempool/rte_mempool.h
> >> @@ -945,9 +945,7 @@ __mempool_get_bulk(struct rte_mempool *mp, void 
> >> **obj_table,
> >>   unsigned n, int is_mc)
> >> {
> >>int ret;
> >> -#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
> >> -   unsigned n_orig = n;
> >> -#endif
> > 
> > Yep, as I said in my previous mail n_orig could be removed in total.
> > Though from other side - it is harmless.
> > 
> >> +
> >> #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
> >>struct rte_mempool_cache *cache;
> >>uint32_t index, len;
> >> @@ -979,7 +977,21 @@ __mempool_get_bulk(struct rte_mempool *mp, void 
> >> **obj_table,
> >>goto ring_dequeue;
> >>}
> >> 
> >> -   cache->len += req;
> >> +   cache->len += ret;  // Need to adjust len by ret not 
> >> req, as (ret != req)
> >> +
> > 
> > rte_ring_mc_dequeue_bulk(.., req) at line 971, would either get all req 
> > objects from the ring and return 0 (success),
> > or wouldn't get any entry from the ring and return negative value (failure).
> > So  this change is erroneous.
> 
> Sorry, I combined my thoughts on changing the get_bulk behavior and you would 
> be correct for the current design. This is why I decided to make it an RFC :-)
> > 
> >> +   if ( cache->len < n ) {
> > 
> > If n > cache_size, then we will go straight to  'ring_dequeue' see line 959.
> > So no need for that check here.
> 
> My thinking (at the time) was get_bulk should return "n" instead of zero, 
> which I feel is the better coding. You are correct it does not make sense 
> unless you factor in my thinking at the time :-(
> > 
> >> +   /*
> >> +* Number (ret + cache->len) may not be >= n. As
> >> +* the 'ret' value maybe zero or less then 'req'.
> >> +*
> >> +* Note:
> >> +* An issue of order from the cache and common 
> >> pool could
> >> +* be an issue if (cache->len != 0 and less then 
> >> n), but the
> >> +* normal case it should be OK. If the user needs 
> >> to preserve
> >> +* the order of packets then he must set 
> >> cache_size == 0.
> >> +*/
> >> +   goto ring_dequeue;
> >> +   }
> >>}
> >> 
> >>/* Now fill in the response ... */
> >> @@ -1002,9 +1014,12 @@ ring_dequeue:
> >>ret = rte_ring_sc_dequeue_bulk(mp->ring, obj_table, n);
> >> 
> >>if (ret < 0)
> >> -   __MEMPOOL_STAT_ADD(mp, get_fail, n_orig);
> >> -   else
> >> +   __MEMPOOL_STAT_ADD(mp, get_fail, n);
> >> +   else {
> >>__MEMPOOL_STAT_ADD(mp, get_success, ret);
> >> +   // Catch the case when ret != n, adding zero should not be 
> >> a problem.
> >> +   __MEMPOOL_STAT_ADD(mp, get_fail, n - ret);
> > 
> > > As I said above, ret == 0 on success, so no need for that change.
> > Just n (or n_orig) is ok here.
> > 
> >> +   }
> >> 
> >>return ret;
> >> }
> >> 
> >> Keith Wiles, Principal Technologist with CTO office, Wind River mobile 
> >> 972-213-5533
> 
> Do we think it is worth it to change the behavior of get_bulk to return "n" 
> instead of zero on success? It would remove a few tests IMO in a couple of 
> places. We could also return <0 on the zero case as well, just to make sure 
> code did not try to follow the success case by mistake.

If you want to have such a function, I think it should align with the 
functions on the rings. In this case, this would mean having a get_burst 
function, which returns less than or equal to the number of eleme

[dpdk-dev] Bulk dequeue of packets and the returned values, question

2014-09-29 Thread Bruce Richardson

On Sun, Sep 28, 2014 at 11:06:17PM +, Wiles, Roger Keith wrote:
> Thanks Venky,
> On Sep 28, 2014, at 5:23 PM, Venkatesan, Venky  intel.com> wrote:
> 
> > Keith,
> > 
> > On 9/28/2014 11:04 AM, Wiles, Roger Keith wrote:
> >> I am also looking at the bulk dequeue routines, which the ring can be 
> >> fixed or variable. On fixed  < 0 on error is returned and 0 if successful. 
> >> On a variable ring < 0 on error or n on success, but I think n can be zero 
> >> in the variable case, correct?
> >> 
> >> If these are true then why not have the routines return  < 0 on error and 
> >> >= 0 on success. Which means a dequeue from a fixed ring would return only 
> >> "requested size n" or < 0 if you error off the 0 case. The 0 case could be 
> >> OK, if you allow zero to be returned on an empty ring for the fixed ring case.
> >> 
> >> Does this make sense to anyone?
> > It won't make sense unless you're aware of the history behind these 
> > functions. The original functions that were implemented for the ring were 
> > only the bulk functions (i.e. FIXED). They would return exactly the number 
> > of items requested for dequeue (0 if success, negative if error), and not 
> > return any if the required number were not available.
> > 
> > The burst (i.e. VARIABLE) functions came in much later (think it was r1.3 
> > where we introduced them), and by that time, there were already quite a 
> > number of deployments of DPDK in the field using the legacy ring functions. 
> > Therefore we made the decision to keep the legacy behavior intact & not 
> > impacting deployed code - and merging the burst functions into the code. 
> > Given that there was no "versioning" of the API/ABI in those releases :).
> 
> I see why the code is this way. If the developers used "if ( ret == 0 ) { /* 
> do something */ }" then it would break if it returned a positive value on 
> success. I would expect the normal behavior to be "if ( ret < 0 ) { /* error 
> case */ }" and fall through for the success case. I would love to change the 
> code to just return <0 on error or >= 0 on success. I wonder how many 
> customers' code would break changing the code to do just the two steps. I 
> think it will remove some code in a couple of places that were testing for FIXED 
> or VARIABLE?
> > 
> > Hope that helps.
> > -Venky
> > 
> >> 
> >> Thanks
> >> ++Keith
> >> 
> >> Keith Wiles, Principal Technologist with CTO office, Wind River mobile 
> >> 972-213-5533
> 
> Keith Wiles, Principal Technologist with CTO office, Wind River mobile 
> 972-213-5533
> 

Since we are looking at making considerable ABI changes in this release and 
(hopefully) also looking to version our ABI going forward, I would be in 
favour of making any changes to these APIs in this current release if 
possible. While the current behaviour makes sense for historical reasons, I 
think an overall change to the behaviour as Keith describes would be more 
sensible long-term. 

(Also to note my previous suggestion about upping the major version to 2.0 
if we continue to increase the number of ABI/API changes in this release.  
Anyone else any thoughts on that?)

/Bruce
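
The bulk (fixed) and burst (variable) return conventions discussed in this
thread can be sketched with a toy ring. This is only a minimal illustration:
the toy_ring type and the toy_dequeue_bulk() / toy_dequeue_burst() helpers
are hypothetical names, not part of DPDK, and the real rte_ring functions are
lockless and differ in detail. The sketch uses -ENOENT as the failure code
for the bulk case.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_RING_SIZE 8

/* Hypothetical single-threaded ring, for illustration only. */
struct toy_ring {
    void *slots[TOY_RING_SIZE];
    unsigned head;   /* index of the oldest stored object */
    unsigned count;  /* number of objects currently stored */
};

/* Copy out n objects in FIFO order; caller guarantees n <= r->count. */
static void toy_take(struct toy_ring *r, void **objs, unsigned n)
{
    unsigned i;

    for (i = 0; i < n; i++)
        objs[i] = r->slots[(r->head + i) % TOY_RING_SIZE];
    r->head = (r->head + n) % TOY_RING_SIZE;
    r->count -= n;
}

/* Bulk (fixed): dequeue exactly n objects or none at all.
 * Returns 0 on success, -ENOENT if fewer than n objects are stored. */
static int toy_dequeue_bulk(struct toy_ring *r, void **objs, unsigned n)
{
    assert(r != NULL && objs != NULL);
    if (r->count < n)
        return -ENOENT;
    toy_take(r, objs, n);
    return 0;
}

/* Burst (variable): dequeue up to n objects.
 * Returns the number actually dequeued, which may be 0. */
static unsigned toy_dequeue_burst(struct toy_ring *r, void **objs, unsigned n)
{
    assert(r != NULL && objs != NULL);
    if (n > r->count)
        n = r->count;
    toy_take(r, objs, n);
    return n;
}
```

With three objects stored, a bulk request for four fails and leaves the ring
untouched, while a burst request for four returns three.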

[dpdk-dev] [PATCH v2] ADD mode 5(tlb) to link bonding pmd

2014-09-29 Thread Mrzyglod, DanielX T

Add this Release note
This patch set adds support of mode 5 to link bonding pmd

This patchset depends on Declan Doherty's patch set:
http://dpdk.org/ml/archives/dev/2014-September/005641.html

v2 change:
Add Unit Tests
Modification that updates obytes structure in virtualpmd driver.
change internals->slaves[i].last_obytes to have proper values.
Update codebase to Declan's patches.

v1 change
Add support for mode 5 (Transmit load balancing) into pmd driver

> -Original Message-
> From: Mrzyglod, DanielX T
> Sent: Friday, September 26, 2014 5:41 PM
> To: dev at dpdk.org
> Cc: Mrzyglod, DanielX T
> Subject: [PATCH v2] ADD mode 5(tlb) to link bonding pmd
> 
> 
> Signed-off-by: Daniel Mrzyglod 
> ---
>  app/test/test_link_bonding.c   |  501 
> +++-
>  app/test/virtual_pmd.c |6 +-
>  app/test/virtual_pmd.h |7 +
>  lib/librte_pmd_bond/rte_eth_bond.h |   23 ++
>  lib/librte_pmd_bond/rte_eth_bond_args.c|1 +
>  lib/librte_pmd_bond/rte_eth_bond_pmd.c |  161 -
>  lib/librte_pmd_bond/rte_eth_bond_private.h |3 +-
>  7 files changed, 696 insertions(+), 6 deletions(-)
> 
> diff --git a/app/test/test_link_bonding.c b/app/test/test_link_bonding.c
> index c4fcaf7..77f791f 100644
> --- a/app/test/test_link_bonding.c
> +++ b/app/test/test_link_bonding.c
> @@ -41,7 +41,7 @@
>  #include 
>  #include 
>  #include 
> -
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -3845,6 +3845,500 @@ testsuite_teardown(void)
>   return remove_slaves_and_stop_bonded_device();
>  }
> 
> +#define NINETY_PERCENT_NUMERAL 90
> +#define ONE_HUNDRED_PERCENT_DENOMINATOR 100
> +#define ONE_HUNDRED_PERCENT_AND_TEN_NUMERAL 110
> +static int
> +test_tlb_tx_burst(void)
> +{
> + int i, burst_size, nb_tx;
> + uint64_t nb_tx2 = 0;
> + struct rte_mbuf *pkt_burst[MAX_PKT_BURST];
> + struct rte_eth_stats port_stats[32];
> + uint64_t sum_ports_opackets = 0, all_bond_opackets = 0,
> all_bond_obytes = 0;
> + uint16_t pktlen;
> +
> + TEST_ASSERT_SUCCESS(initialize_bonded_device_with_slaves
> +
>   (BONDING_MODE_ADAPTIVE_TRANSMIT_LOAD_BALANCING, 1, 3, 1),
> + "Failed to initialise bonded device");
> +
> + burst_size = 20 * test_params->bonded_slave_count;
> +
> + TEST_ASSERT(burst_size < MAX_PKT_BURST,
> + "Burst size specified is greater than supported.\n");
> +
> +
> + /* Generate 40 test bursts in 2s of packets to transmit  */
> + for (i = 0; i < 40; i++) {
> + /*test two types of mac src own(bonding) and others */
> + if (i % 2 == 0) {
> + initialize_eth_header(test_params->pkt_eth_hdr,
> + (struct ether_addr *)src_mac, (struct
> ether_addr *)dst_mac_0, 0, 0);
> + } else {
> + initialize_eth_header(test_params->pkt_eth_hdr,
> + (struct ether_addr *)test_params-
> >default_slave_mac,
> + (struct ether_addr *)dst_mac_0, 0, 0);
> + }
> + pktlen = initialize_udp_header(test_params->pkt_udp_hdr,
> src_port,
> + dst_port_0, 16);
> + pktlen = initialize_ipv4_header(test_params->pkt_ipv4_hdr,
> src_addr,
> + dst_addr_0, pktlen);
> + generate_packet_burst(test_params->mbuf_pool, pkt_burst,
> + test_params->pkt_eth_hdr, 0, test_params-
> >pkt_ipv4_hdr,
> + 1, test_params->pkt_udp_hdr, burst_size, 60,
> 1);
> + /* Send burst on bonded port */
> + nb_tx = rte_eth_tx_burst(test_params->bonded_port_id, 0,
> pkt_burst,
> + burst_size);
> + nb_tx2 += nb_tx;
> +
> + TEST_ASSERT_EQUAL(nb_tx, burst_size,
> + "number of packet not equal burst size");
> +
> + rte_delay_us(5);
> + }
> +
> +
> + /* Verify bonded port tx stats */
> + rte_eth_stats_get(test_params->bonded_port_id, &port_stats[0]);
> +
> + all_bond_opackets = port_stats[0].opackets;
> + all_bond_obytes = port_stats[0].obytes;
> +
> + TEST_ASSERT_EQUAL(port_stats[0].opackets, (uint64_t)nb_tx2,
> + "Bonded Port (%d) opackets value (%u) not as expected
> (%d)\n",
> + test_params->bonded_port_id, (unsigned
> int)port_stats[0].opackets,
> + burst_size);
> +
> +
> + /* Verify slave ports tx stats */
> + for (i = 0; i < test_params->bonded_slave_count; i++) {
> + rte_eth_stats_get(test_params->slave_port_ids[i],
> &port_stats[i]);
> + sum_ports_opackets += port_stats[i].opackets;
> + }
> +
> + TEST_ASSERT_EQUAL(sum_ports_opackets,
> (uint64_t)all_bond_opackets,
> + "Total packets sent by slaves is not equalto p

[dpdk-dev] [PATCH 0/3] eal / bonding pmd cleanup

2014-09-29 Thread Thomas Monjalon

> > >> This patchset reworks the bonding pmd so that we don't need to modify the
> > >> eal
> > >> for this pmd to work.
> > >>
> > >> Basically, the arguments parsed at bond_init are stored in the bond
> > >> private
> > >> structure to be used at dev_configure time.
> > >> If no argument are present, we suppose that the bonding api has been
> > >> called.
> > >>
> > >
> > > I did not get any comment on these patches.
> > > Anyone ?
> > >
> > > The idea here is to keep pmd stuff in the pmds and avoid polluting the 
> > > eal.
> > 
> > ping
> 
> Sorry, it was on my todo list and it kept getting pushed back.  I like the
> change, it makes great sense and does proper isolation of the PMD.
> 
> Acked-by: Neil Horman 

Acked-by: Thomas Monjalon 

Applied.
Patches 2 and 3 were merged since 2 is a revert and 3 is a partial revert
of the revert ;)

This patchset proves that my initial request was not so complex :)
(http://dpdk.org/ml/archives/dev/2014-June/003833.html)

Thanks
-- 
Thomas

[dpdk-dev] [PATCH] examples: do not probe pci twice

2014-09-29 Thread Thomas Monjalon

> > Since commit a155d430119 ("support link bonding device initialization"),
> > rte_eal_pci_probe() is called in rte_eal_init().
> > So it doesn't have to be called by application anymore.
> > It has been fixed for testpmd in commit 2950a769315,
> > and this patch removes it from other applications.
> > 
> > Signed-off-by: Thomas Monjalon 
> 
> Acked-by: Neil Horman 

Applied in the series about bonding cleanup.

-- 
Thomas

[dpdk-dev] [RFC] More changes for rte_mempool.h:__mempool_get_bulk()

2014-09-29 Thread Ananyev, Konstantin



> -Original Message-
> From: Richardson, Bruce
> Sent: Monday, September 29, 2014 1:06 PM
> To: Wiles, Roger Keith (Wind River)
> Cc: Ananyev, Konstantin; 
> Subject: Re: [dpdk-dev] [RFC] More changes for 
> rte_mempool.h:__mempool_get_bulk()
> 
> On Sun, Sep 28, 2014 at 11:17:34PM +, Wiles, Roger Keith wrote:
> >
> > On Sep 28, 2014, at 5:41 PM, Ananyev, Konstantin  > intel.com> wrote:
> >
> > >
> > >
> > >> -Original Message-
> > >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Wiles, Roger 
> > >> Keith
> > >> Sent: Sunday, September 28, 2014 6:52 PM
> > >> To: 
> > >> Subject: [dpdk-dev] [RFC] More changes for 
> > >> rte_mempool.h:__mempool_get_bulk()
> > >>
> > >> Here is a Request for Comment on __mempool_get_bulk() routine. I believe 
> > >> I am seeing a few more issues in this routine, please
> look
> > >> at the code below and see if these seem to fix some concerns in how the 
> > >> ring is handled.
> > >>
> > >> The first issue I believe is cache->len is increased by ret and not req 
> > >> as we do not know if ret == req. This also means the cache-
> >len
> > >> may still not satisfy the request from the cache.
> > >>
> > >> The second issue is if you believe the above code then we have to 
> > >> account for that issue in the stats.
> > >>
> > >> Let me know what you think?
> > >> ++Keith
> > >> ---
> > >>
> > >> diff --git a/lib/librte_mempool/rte_mempool.h 
> > >> b/lib/librte_mempool/rte_mempool.h
> > >> index 199a493..b1b1f7a 100644
> > >> --- a/lib/librte_mempool/rte_mempool.h
> > >> +++ b/lib/librte_mempool/rte_mempool.h
> > >> @@ -945,9 +945,7 @@ __mempool_get_bulk(struct rte_mempool *mp, void 
> > >> **obj_table,
> > >>   unsigned n, int is_mc)
> > >> {
> > >>int ret;
> > >> -#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
> > >> -   unsigned n_orig = n;
> > >> -#endif
> > >
> > > Yep, as I said in my previous mail n_orig could be removed in total.
> > > Though from other side - it is harmless.
> > >
> > >> +
> > >> #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
> > >>struct rte_mempool_cache *cache;
> > >>uint32_t index, len;
> > >> @@ -979,7 +977,21 @@ __mempool_get_bulk(struct rte_mempool *mp, void 
> > >> **obj_table,
> > >>goto ring_dequeue;
> > >>}
> > >>
> > >> -   cache->len += req;
> > >> +   cache->len += ret;  // Need to adjust len by ret not 
> > >> req, as (ret != req)
> > >> +
> > >
> > > rte_ring_mc_dequeue_bulk(.., req) at line 971, would either get all req 
> > > objects from the ring and return 0 (success),
> > > or wouldn't get any entry from the ring and return negative value 
> > > (failure).
> > > So  this change is erroneous.
> >
> > Sorry, I combined my thoughts on changing the get_bulk behavior and you 
> > would be correct for the current design. This is why I
> decided to make it an RFC :-)
> > >
> > >> +   if ( cache->len < n ) {
> > >
> > > If n > cache_size, then we will go straight to  'ring_dequeue' see line 
> > > 959.
> > > So no need for that check here.
> >
> > My thinking (at the time) was get_bulk should return "n" instead of zero, 
> > which I feel is the better coding. You are correct it does not
> > make sense unless you factor in my thinking at the time :-(
> > >
> > >> +   /*
> > >> +* Number (ret + cache->len) may not be >= n. As
> > >> +* the 'ret' value maybe zero or less then 'req'.
> > >> +*
> > >> +* Note:
> > >> +* An issue of order from the cache and common 
> > >> pool could
> > >> +* be an issue if (cache->len != 0 and less then 
> > >> n), but the
> > >> +* normal case it should be OK. If the user 
> > >> needs to preserve
> > >> +* the order of packets then he must set 
> > >> cache_size == 0.
> > >> +*/
> > >> +   goto ring_dequeue;
> > >> +   }
> > >>}
> > >>
> > >>/* Now fill in the response ... */
> > >> @@ -1002,9 +1014,12 @@ ring_dequeue:
> > >>ret = rte_ring_sc_dequeue_bulk(mp->ring, obj_table, n);
> > >>
> > >>if (ret < 0)
> > >> -   __MEMPOOL_STAT_ADD(mp, get_fail, n_orig);
> > >> -   else
> > >> +   __MEMPOOL_STAT_ADD(mp, get_fail, n);
> > >> +   else {
> > >>__MEMPOOL_STAT_ADD(mp, get_success, ret);
> > >> +   // Catch the case when ret != n, adding zero should not 
> > >> be a problem.
> > >> +   __MEMPOOL_STAT_ADD(mp, get_fail, n - ret);
> > >
> > > > As I said above, ret == 0 on success, so no need for that change.
> > > Just n (or n_orig) is ok here.
> > >
> > >> +   }
> > >>
> > >>return ret;
> > >> }
> > >>
> > >> Keith Wiles, Principal Technologist with CTO office, Wind River mobile 
> > >> 972-213-5533
> >
> > D

[dpdk-dev] VMDq Sample Application on Virtual Machines

2014-09-29 Thread ANKIT BATRA

Hi,

I am running the VMDq sample application on a host machine. The host has
1 Gigabit I350 Ethernet cards that support VMDq. I am sending packets from
another machine, and when I send a VLAN 5 packet, I see the packet count
for pool 5 being incremented, and so on.

Now I want to test this VMDq sample application with virtual machines.
There are 2 virtual machines running on my machine. Can anybody suggest
how I can test the VMDq sample application with virtual machines, i.e. what
steps need to be followed so that I can see packets meant for a specific
VM going to the specific queue for that VM?

-- 
Regards
Ankit Batra

[dpdk-dev] Bulk dequeue of packets and the returned values, question

2014-09-29 Thread Neil Horman

On Mon, Sep 29, 2014 at 01:10:22PM +0100, Bruce Richardson wrote:
> On Sun, Sep 28, 2014 at 11:06:17PM +, Wiles, Roger Keith wrote:
> > Thanks Venky,
> > On Sep 28, 2014, at 5:23 PM, Venkatesan, Venky  > intel.com> wrote:
> > 
> > > Keith,
> > > 
> > > On 9/28/2014 11:04 AM, Wiles, Roger Keith wrote:
> > >> I am also looking at the bulk dequeue routines, which the ring can be 
> > >> fixed or variable. On fixed  < 0 on error is returned and 0 if 
> > >> successful. On a variable ring < 0 on error or n on success, but I think 
> > >> n can be zero in the variable case, correct?
> > >> 
> > >> If these are true then why not have the routines return  < 0 on error 
> > >> and >= 0 on success. Which means a dequeue from a fixed ring would 
> > >> return only "requested size n" or < 0 if you error off the 0 case. The 0 
> > >> case could be OK, if you allow zero to be returned on an empty ring for the 
> > >> fixed ring case.
> > >> 
> > >> Does this make sense to anyone?
> > > It won't make sense unless you're aware of the history behind these 
> > > functions. The original functions that were implemented for the ring were 
> > > only the bulk functions (i.e. FIXED). They would return exactly the 
> > > number of items requested for dequeue (0 if success, negative if error), 
> > > and not return any if the required number were not available.
> > > 
> > > The burst (i.e. VARIABLE) functions came in much later (think it was r1.3 
> > > where we introduced them), and by that time, there were already quite a 
> > > number of deployments of DPDK in the field using the legacy ring 
> > > functions. Therefore we made the decision to keep the legacy behavior 
> > > intact & not impacting deployed code - and merging the burst functions 
> > > into the code. Given that there was no "versioning" of the API/ABI in 
> > > those releases :).
> > 
> > I see why the code is this way. If the developers used "if ( ret == 0 ) { 
> > /* do something */ }" then it would break if it returned a positive value 
> > on success. I would expect the normal behavior to be "if ( ret < 0 ) { /* 
> > error case */ }" and fall through for the success case. I would love to change 
> > the code to just return <0 on error or >= 0 on success. I wonder how many 
> > customers' code would break changing the code to do just the two steps. 
> > I think it will remove some code in a couple of places that were testing for 
> > FIXED or VARIABLE?
> > > 
> > > Hope that helps.
> > > -Venky
> > > 
> > >> 
> > >> Thanks
> > >> ++Keith
> > >> 
> > >> Keith Wiles, Principal Technologist with CTO office, Wind River mobile 
> > >> 972-213-5533
> > 
> > Keith Wiles, Principal Technologist with CTO office, Wind River mobile 
> > 972-213-5533
> > 
> 
> Since we are looking at making considerable ABI changes in this release and 
> (hopefully) also looking to version our ABI going forward, I would be in 
> favour of making any changes to these APIs in this current release if 
> possible. While the current behaviour makes sense for historical reasons, I 
> think an overall change to the behaviour as Keith describes would be more 
> sensible long-term. 
> 
I agree, this seems like a sensible time to make these sorts of changes as we
identify them.

> (Also to note my previous suggestion about upping the major version to 2.0 
> if we continue to increase the number of ABI/API changes in this release.  
> Anyone else any thoughts on that?)
> 
I feel like this is a policy decision, as I view the versioning as arbitrary.
I'm really fine with it either way.  Presumably moving to 2.0 would represent a
major shift in design, and I suppose adding versioning does amount to something
like that, so I could be supportive.
Neil

> /Bruce
>

[dpdk-dev] Bulk dequeue of packets and the returned values, question

2014-09-29 Thread Ananyev, Konstantin



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> Sent: Monday, September 29, 2014 1:10 PM
> To: Wiles, Roger Keith (Wind River)
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] Bulk dequeue of packets and the returned values, 
> question
> 
> On Sun, Sep 28, 2014 at 11:06:17PM +, Wiles, Roger Keith wrote:
> > Thanks Venky,
> > On Sep 28, 2014, at 5:23 PM, Venkatesan, Venky  > intel.com> wrote:
> >
> > > Keith,
> > >
> > > On 9/28/2014 11:04 AM, Wiles, Roger Keith wrote:
> > >> I am also looking at the bulk dequeue routines, which the ring can be 
> > >> fixed or variable. On fixed  < 0 on error is returned and 0 if
> successful. On a variable ring < 0 on error or n on success, but I think n 
> can be zero in the variable case, correct?
> > >>
> > >> If these are true then why not have the routines return  < 0 on error 
> > >> and >= 0 on success. Which means a dequeue from a fixed
> ring would return only "requested size n" or < 0 if you error off the 0 case. 
> The 0 case could be OK, if you allow zero to be returned on an
> empty ring for the fixed ring case.
> > >>
> > >> Does this make sense to anyone?
> > > It won't make sense unless you're aware of the history behind these 
> > > functions. The original functions that were implemented for
> the ring were only the bulk functions (i.e. FIXED). They would return exactly 
> the number of items requested for dequeue (0 if success,
> negative if error), and not return any if the required number were not 
> available.
> > >
> > > The burst (i.e. VARIABLE) functions came in much later (think it was r1.3 
> > > where we introduced them), and by that time, there were
> already quite a number of deployments of DPDK in the field using the legacy 
> ring functions. Therefore we made the decision to keep
> the legacy behavior intact & not impacting deployed code - and merging the 
> burst functions into the code. Given that there was no
> "versioning" of the API/ABI in those releases :).
> >
> > I see why the code is this way. If the developers used "if ( ret == 0 ) { 
> > /* do something */ }" then it would break if it returned a
> positive value on success. I would expect the normal behavior to be "if ( ret 
> < 0 ) { /* error case */ }" and fall through for the success case. I
> would love to change the code to just return <0 on error or >= 0 on success. 
> I wonder how many customers' code would break
> changing the code to do just the two steps. I think it will remove some 
> code in a couple of places that were testing for FIXED or
> VARIABLE?
> > >
> > > Hope that helps.
> > > -Venky
> > >
> > >>
> > >> Thanks
> > >> ++Keith
> > >>
> > >> Keith Wiles, Principal Technologist with CTO office, Wind River mobile 
> > >> 972-213-5533
> >
> > Keith Wiles, Principal Technologist with CTO office, Wind River mobile 
> > 972-213-5533
> >
> 
> Since we are looking at making considerable ABI changes in this release and
> (hopefully) also looking to version our ABI going forward, I would be in
> favour of making any changes to these APIs in this current release if
> possible. While the current behaviour makes sense for historical reasons, I
> think an overall change to the behaviour as Keith describes would be more
> sensible long-term.

It is doable, I suppose, but might become quite messy:
I don't know how many people are using rte_ring_dequeue_bulk() all over the
place; I suspect quite a lot.
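
The compatibility concern can be made concrete with a small sketch. It
models the two return conventions discussed above with hypothetical
stand-in functions (dequeue_bulk_current() and dequeue_bulk_proposed() are
not DPDK APIs, and -ENOENT is an assumed failure code) and shows which
caller-side success checks survive a switch from "0 on success" to
"n on success".

```c
#include <errno.h>

/* Current convention (as described in the thread):
 * 0 on success, negative value on failure. */
static int dequeue_bulk_current(unsigned avail, unsigned n)
{
    return avail >= n ? 0 : -ENOENT;
}

/* Proposed convention (hypothetical): number of objects dequeued (n)
 * on success, negative value on failure. */
static int dequeue_bulk_proposed(unsigned avail, unsigned n)
{
    return avail >= n ? (int)n : -ENOENT;
}

/* Caller idiom that is only correct under the current convention. */
static int succeeded_eq_zero(int ret)
{
    return ret == 0;
}

/* Caller idiom that is correct under both conventions. */
static int succeeded_non_negative(int ret)
{
    return ret >= 0;
}
```

A caller written as "if (ret == 0)" would treat a successful dequeue of 4
objects under the proposed convention as a failure, whereas a caller that
only tests "ret < 0" for the error case keeps working either way.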

[dpdk-dev] [RFC] More changes for rte_mempool.h:__mempool_get_bulk()

2014-09-29 Thread Bruce Richardson

On Mon, Sep 29, 2014 at 01:25:11PM +0100, Ananyev, Konstantin wrote:
> 
> 
> > -Original Message-
> > From: Richardson, Bruce
> > Sent: Monday, September 29, 2014 1:06 PM
> > To: Wiles, Roger Keith (Wind River)
> > Cc: Ananyev, Konstantin; 
> > Subject: Re: [dpdk-dev] [RFC] More changes for 
> > rte_mempool.h:__mempool_get_bulk()
> > 
> > On Sun, Sep 28, 2014 at 11:17:34PM +, Wiles, Roger Keith wrote:
> > >
> > > On Sep 28, 2014, at 5:41 PM, Ananyev, Konstantin  > > intel.com> wrote:
> > >
> > > >
> > > >
> > > >> -Original Message-
> > > >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Wiles, Roger 
> > > >> Keith
> > > >> Sent: Sunday, September 28, 2014 6:52 PM
> > > >> To: 
> > > >> Subject: [dpdk-dev] [RFC] More changes for 
> > > >> rte_mempool.h:__mempool_get_bulk()
> > > >>
> > > >> Here is a Request for Comment on __mempool_get_bulk() routine. I 
> > > >> believe I am seeing a few more issues in this routine, please
> > look
> > > >> at the code below and see if these seem to fix some concerns in how 
> > > >> the ring is handled.
> > > >>
> > > >> The first issue I believe is cache->len is increased by ret and not 
> > > >> req as we do not know if ret == req. This also means the cache-
> > >len
> > > >> may still not satisfy the request from the cache.
> > > >>
> > > >> The second issue is if you believe the above code then we have to 
> > > >> account for that issue in the stats.
> > > >>
> > > >> Let me know what you think?
> > > >> ++Keith
> > > >> ---
> > > >>
> > > >> diff --git a/lib/librte_mempool/rte_mempool.h 
> > > >> b/lib/librte_mempool/rte_mempool.h
> > > >> index 199a493..b1b1f7a 100644
> > > >> --- a/lib/librte_mempool/rte_mempool.h
> > > >> +++ b/lib/librte_mempool/rte_mempool.h
> > > >> @@ -945,9 +945,7 @@ __mempool_get_bulk(struct rte_mempool *mp, void 
> > > >> **obj_table,
> > > >>   unsigned n, int is_mc)
> > > >> {
> > > >>int ret;
> > > >> -#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
> > > >> -   unsigned n_orig = n;
> > > >> -#endif
> > > >
> > > > Yep, as I said in my previous mail n_orig could be removed in total.
> > > > Though from other side - it is harmless.
> > > >
> > > >> +
> > > >> #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
> > > >>struct rte_mempool_cache *cache;
> > > >>uint32_t index, len;
> > > >> @@ -979,7 +977,21 @@ __mempool_get_bulk(struct rte_mempool *mp, void 
> > > >> **obj_table,
> > > >>goto ring_dequeue;
> > > >>}
> > > >>
> > > >> -   cache->len += req;
> > > >> +   cache->len += ret;  // Need to adjust len by ret 
> > > >> not req, as (ret != req)
> > > >> +
> > > >
> > > > rte_ring_mc_dequeue_bulk(.., req) at line 971, would either get all req 
> > > > objects from the ring and return 0 (success),
> > > > or wouldn't get any entry from the ring and return negative value 
> > > > (failure).
> > > > So  this change is erroneous.
> > >
> > > Sorry, I combined my thoughts on changing the get_bulk behavior and you 
> > > would be correct for the current design. This is why I
> > decided to make it an RFC :-)
> > > >
> > > >> +   if ( cache->len < n ) {
> > > >
> > > > If n > cache_size, then we will go straight to  'ring_dequeue' see line 
> > > > 959.
> > > > So no need for that check here.
> > >
> > > My thinking (at the time) was get_bulk should return "n" instead of zero, 
> > > which I feel is the better coding. You are correct it does not
> > > make sense unless you factor in my thinking at the time :-(
> > > >
> > > >> +   /*
> > > >> +* Number (ret + cache->len) may not be >= n. 
> > > >> As
> > > >> +* the 'ret' value maybe zero or less then 
> > > >> 'req'.
> > > >> +*
> > > >> +* Note:
> > > >> +* An issue of order from the cache and common 
> > > >> pool could
> > > >> +* be an issue if (cache->len != 0 and less 
> > > >> then n), but the
> > > >> +* normal case it should be OK. If the user 
> > > >> needs to preserve
> > > >> +* the order of packets then he must set 
> > > >> cache_size == 0.
> > > >> +*/
> > > >> +   goto ring_dequeue;
> > > >> +   }
> > > >>}
> > > >>
> > > >>/* Now fill in the response ... */
> > > >> @@ -1002,9 +1014,12 @@ ring_dequeue:
> > > >>ret = rte_ring_sc_dequeue_bulk(mp->ring, obj_table, n);
> > > >>
> > > >>if (ret < 0)
> > > >> -   __MEMPOOL_STAT_ADD(mp, get_fail, n_orig);
> > > >> -   else
> > > >> +   __MEMPOOL_STAT_ADD(mp, get_fail, n);
> > > >> +   else {
> > > >>__MEMPOOL_STAT_ADD(mp, get_success, ret);
> > > >> +   // Catch the case when ret != n, adding zero should 
> > > >> not be a problem.
> > > >> +

[dpdk-dev] [PATCH v2] distributor_app: new sample app

2014-09-29 Thread Pattan, Reshma



-Original Message-
From: Ananyev, Konstantin 
Sent: Friday, September 26, 2014 4:52 PM
To: De Lara Guarch, Pablo; Pattan, Reshma; dev at dpdk.org
Subject: RE: [dpdk-dev] [PATCH v2] distributor_app: new sample app



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of De Lara Guarch, 
> Pablo
> Sent: Friday, September 26, 2014 4:12 PM
> To: Pattan, Reshma; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> 
> Hi,
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of reshmapa
> > Sent: Wednesday, September 24, 2014 3:17 PM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> >
> > From: Reshma Pattan 
> >
> > A new sample app that shows the usage of the distributor library. 
> > This app works as follows:
> >
> > * An RX thread runs which pulls packets from each ethernet port in turn
> >   and passes those packets to worker using a distributor component.
> > * The workers take the packets in turn, and determine the output port
> >   for those packets using basic l2forwarding doing an xor on the source
> >   port id.
> > * The RX thread takes the returned packets from the workers and enqueue
> >   those packets into an rte_ring structure.
> > * A TX thread pulls the packets off the rte_ring structure and then
> >   sends each packet out the output port specified previously by the 
> > worker
> > * Command-line option support provided only for portmask.
> >
> > Signed-off-by: Bruce Richardson 
> > Signed-off-by: Reshma Pattan 
> > ---
> >  examples/Makefile |   1 +
> >  examples/distributor_app/Makefile |  57 
> >  examples/distributor_app/main.c   | 585
> > ++
> >  examples/distributor_app/main.h   |  46 +++
> >  4 files changed, 689 insertions(+)
> >  create mode 100644 examples/distributor_app/Makefile  create mode 
> > 100644 examples/distributor_app/main.c  create mode 100644 
> > examples/distributor_app/main.h
> >
> > diff --git a/examples/Makefile b/examples/Makefile index 
> > 6245f83..2ba82b0 100644
> > --- a/examples/Makefile
> > +++ b/examples/Makefile
> > @@ -66,5 +66,6 @@ DIRS-y += vhost
> >  DIRS-$(CONFIG_RTE_LIBRTE_XEN_DOM0) += vhost_xen  DIRS-y += vmdq  
> > DIRS-y += vmdq_dcb
> > +DIRS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += distributor_app
> >
> >  include $(RTE_SDK)/mk/rte.extsubdir.mk diff --git 
> > a/examples/distributor_app/Makefile
> > b/examples/distributor_app/Makefile
> > new file mode 100644
> > index 000..394785d
> > --- /dev/null
> > +++ b/examples/distributor_app/Makefile
> > @@ -0,0 +1,57 @@
> > +#   BSD LICENSE
> > +#
> > +#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> > +#   All rights reserved.
> > +#
> > +#   Redistribution and use in source and binary forms, with or without
> > +#   modification, are permitted provided that the following conditions
> > +#   are met:
> > +#
> > +# * Redistributions of source code must retain the above copyright
> > +#   notice, this list of conditions and the following disclaimer.
> > +# * Redistributions in binary form must reproduce the above copyright
> > +#   notice, this list of conditions and the following disclaimer in
> > +#   the documentation and/or other materials provided with the
> > +#   distribution.
> > +# * Neither the name of Intel Corporation nor the names of its
> > +#   contributors may be used to endorse or promote products derived
> > +#   from this software without specific prior written permission.
> > +#
> > +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> > CONTRIBUTORS
> > +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
> > NOT
> > +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> > FITNESS FOR
> > +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> > COPYRIGHT
> > +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> > INCIDENTAL,
> > +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
> > NOT
> > +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
> > OF USE,
> > +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
> > AND ON ANY
> > +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
> > TORT
> > +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> > THE USE
> > +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> > DAMAGE.
> > +
> > +ifeq ($(RTE_SDK),)
> > +$(error "Please define RTE_SDK environment variable") endif
> > +
> > +# Default target, can be overriden by command line or environment 
> > +RTE_TARGET ?= x86_64-default-linuxapp-gcc
> 
> This target is not present anymore. Change it to x86_64-native-linuxapp-gcc.
> 
> > +
> > +include $(RTE_SDK)/mk/rte.vars.mk
> > +
> > +# binary name
> > +APP = distributor_app
> > +
> > +# all source are stored in SRCS-y
> > +SRCS-y := main.c
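
The patch summary quoted above says the workers pick the output port by
"doing an xor on the source port id". A minimal sketch of that rule follows.
It assumes the xor value is 1, so ports are paired 0<->1, 2<->3 and so on;
the function name and the num_ports parameter are hypothetical and are not
taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: map an input port to its paired output port.
 * Assumption: ports are paired by flipping the lowest bit of the port id.
 * With an odd number of ports the last port has no partner, so a real
 * application would have to handle that case separately.
 */
uint8_t
l2fwd_xor_output_port(uint8_t in_port, unsigned num_ports)
{
	/* With a single port there is nothing to pair with: send it back out. */
	uint8_t xor_val = (num_ports > 1) ? 1 : 0;

	assert(in_port < num_ports);
	return (uint8_t)(in_port ^ xor_val);
}
```

With two ports, traffic received on port 0 leaves on port 1 and vice versa.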

[dpdk-dev] [PATCH RFC 0/3] only call iopl when necessary

2014-09-29 Thread Thomas Monjalon

2014-08-26 16:11, David Marchand:
> This patch series is just a little clean up to remove the unconditional call
> to iopl on Linux.
> Rather than call iopl() at the eal level, let the PMD that needs it call
> rte_eal_iopl_init().

Acked and applied

Thanks for the cleanup
-- 
Thomas

[dpdk-dev] [PATCH] eal: remove rte_snprintf

2014-09-29 Thread Thomas Monjalon

> > The function rte_snprintf() was deprecated in version 1.7.0
> > (commit 6f41fe75e2dd).
> > It's now totally removed.
> > 
> > Signed-off-by: Thomas Monjalon 
> 
> Acked-by: Neil Horman 

Applied

-- 
Thomas

[dpdk-dev] [PATCH v2] distributor_app: new sample app

2014-09-29 Thread Ananyev, Konstantin



> -Original Message-
> From: Pattan, Reshma
> Sent: Monday, September 29, 2014 1:40 PM
> To: Ananyev, Konstantin; De Lara Guarch, Pablo; dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> 
> 
> 
> -Original Message-
> From: Ananyev, Konstantin
> Sent: Friday, September 26, 2014 4:52 PM
> To: De Lara Guarch, Pablo; Pattan, Reshma; dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> 
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of De Lara Guarch,
> > Pablo
> > Sent: Friday, September 26, 2014 4:12 PM
> > To: Pattan, Reshma; dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> >
> > Hi,
> >
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of reshmapa
> > > Sent: Wednesday, September 24, 2014 3:17 PM
> > > To: dev at dpdk.org
> > > Subject: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> > >
> > > From: Reshma Pattan 
> > >
> > > A new sample app that shows the usage of the distributor library.
> > > This app works as follows:
> > >
> > > * An RX thread runs which pulls packets from each ethernet port in turn
> > >   and passes those packets to worker using a distributor component.
> > > * The workers take the packets in turn, and determine the output port
> > >   for those packets using basic l2forwarding doing an xor on the source
> > >   port id.
> > > * The RX thread takes the returned packets from the workers and enqueue
> > >   those packets into an rte_ring structure.
> > > * A TX thread pulls the packets off the rte_ring structure and then
> > >   sends each packet out the output port specified previously by the
> > > worker
> > > * Command-line option support provided only for portmask.
> > >
> > > Signed-off-by: Bruce Richardson 
> > > Signed-off-by: Reshma Pattan 
> > > ---
> > >  examples/Makefile |   1 +
> > >  examples/distributor_app/Makefile |  57 
> > >  examples/distributor_app/main.c   | 585
> > > ++
> > >  examples/distributor_app/main.h   |  46 +++
> > >  4 files changed, 689 insertions(+)
> > >  create mode 100644 examples/distributor_app/Makefile  create mode
> > > 100644 examples/distributor_app/main.c  create mode 100644
> > > examples/distributor_app/main.h
> > >
> > > diff --git a/examples/Makefile b/examples/Makefile index
> > > 6245f83..2ba82b0 100644
> > > --- a/examples/Makefile
> > > +++ b/examples/Makefile
> > > @@ -66,5 +66,6 @@ DIRS-y += vhost
> > >  DIRS-$(CONFIG_RTE_LIBRTE_XEN_DOM0) += vhost_xen  DIRS-y += vmdq
> > > DIRS-y += vmdq_dcb
> > > +DIRS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += distributor_app
> > >
> > >  include $(RTE_SDK)/mk/rte.extsubdir.mk diff --git
> > > a/examples/distributor_app/Makefile
> > > b/examples/distributor_app/Makefile
> > > new file mode 100644
> > > index 000..394785d
> > > --- /dev/null
> > > +++ b/examples/distributor_app/Makefile
> > > @@ -0,0 +1,57 @@
> > > +#   BSD LICENSE
> > > +#
> > > +#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> > > +#   All rights reserved.
> > > +#
> > > +#   Redistribution and use in source and binary forms, with or without
> > > +#   modification, are permitted provided that the following conditions
> > > +#   are met:
> > > +#
> > > +# * Redistributions of source code must retain the above copyright
> > > +#   notice, this list of conditions and the following disclaimer.
> > > +# * Redistributions in binary form must reproduce the above copyright
> > > +#   notice, this list of conditions and the following disclaimer in
> > > +#   the documentation and/or other materials provided with the
> > > +#   distribution.
> > > +# * Neither the name of Intel Corporation nor the names of its
> > > +#   contributors may be used to endorse or promote products derived
> > > +#   from this software without specific prior written permission.
> > > +#
> > > +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> > > CONTRIBUTORS
> > > +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
> > > NOT
> > > +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> > > FITNESS FOR
> > > +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> > > COPYRIGHT
> > > +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> > > INCIDENTAL,
> > > +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
> > > NOT
> > > +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
> > > OF USE,
> > > +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
> > > AND ON ANY
> > > +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
> > > TORT
> > > +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> > > THE USE
> > > +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> > > DAMAGE.
> > > +
> > > +ifeq

[dpdk-dev] [PATCH] Fix librte_pmd_pcap driver double stop error

2014-09-29 Thread Thomas Monjalon

2014-09-10 17:17, Nicolás Pernas Maradei:
> From: Nicolás Pernas Maradei 
> 
> librte_pmd_pcap driver was opening the pcap/interfaces only at init time and
> closing them only when the port was being stopped. This behaviour would cause
> problems (leading to segfault) if the user closed the port 2 times. The first
> time the pcap/interfaces would be normally closed but libpcap would throw an
> error causing a segfault if the closed pcaps/interfaces were closed again.
> This behaviour is solved by reopening pcaps/interfaces when the port is
> started (only if these weren't open already, for example at init time).
> 
> Signed-off-by: Nicolás Pernas Maradei 
> ---
>  lib/librte_pmd_pcap/rte_eth_pcap.c | 254 
> +
>  1 file changed, 202 insertions(+), 52 deletions(-)

Someone to review this patch?

-- 
Thomas

[dpdk-dev] [PATCH v2] bond: Add mode 4 support.

2014-09-29 Thread Pawel Wodkowski

This patch adds support for mode 4 of link bonding. It depends on Declan
Doherty's patches v3 and the rte alarms patch v2 or above.

The new version handles race issues with setting/cancelling callbacks,
fixes promiscuous mode setting in mode 4 and some other minor errors in the
mode 4 implementation.


Signed-off-by: Pawel Wodkowski 
---
 lib/librte_ether/rte_ether.h   |1 +
 lib/librte_pmd_bond/Makefile   |1 +
 lib/librte_pmd_bond/rte_eth_bond.h |4 +
 lib/librte_pmd_bond/rte_eth_bond_api.c |   82 ++---
 lib/librte_pmd_bond/rte_eth_bond_args.c|1 +
 lib/librte_pmd_bond/rte_eth_bond_pmd.c |  261 +---
 lib/librte_pmd_bond/rte_eth_bond_private.h |   42 -
 7 files changed, 346 insertions(+), 46 deletions(-)

diff --git a/lib/librte_ether/rte_ether.h b/lib/librte_ether/rte_ether.h
index 2e08f23..1a3711b 100644
--- a/lib/librte_ether/rte_ether.h
+++ b/lib/librte_ether/rte_ether.h
@@ -293,6 +293,7 @@ struct vlan_hdr {
 #define ETHER_TYPE_RARP 0x8035 /**< Reverse Arp Protocol. */
 #define ETHER_TYPE_VLAN 0x8100 /**< IEEE 802.1Q VLAN tagging. */
 #define ETHER_TYPE_1588 0x88F7 /**< IEEE 802.1AS 1588 Precise Time Protocol. */
+#define ETHER_TYPE_SLOW 0x8809 /**< Slow protocols (LACP and Marker). */

 #ifdef __cplusplus
 }
diff --git a/lib/librte_pmd_bond/Makefile b/lib/librte_pmd_bond/Makefile
index 953d75e..c2312c2 100644
--- a/lib/librte_pmd_bond/Makefile
+++ b/lib/librte_pmd_bond/Makefile
@@ -44,6 +44,7 @@ CFLAGS += $(WERROR_FLAGS)
 #
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_api.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_pmd.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_8023ad.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_args.c

 #
diff --git a/lib/librte_pmd_bond/rte_eth_bond.h 
b/lib/librte_pmd_bond/rte_eth_bond.h
index 6811c7b..b0223c2 100644
--- a/lib/librte_pmd_bond/rte_eth_bond.h
+++ b/lib/librte_pmd_bond/rte_eth_bond.h
@@ -75,6 +75,10 @@ extern "C" {
 /**< Broadcast (Mode 3).
  * In this mode all transmitted packets will be transmitted on all available
  * active slaves of the bonded. */
+#define BONDING_MODE_8023AD(4)
+/**< 802.3AD (Mode 4).
+ * In this mode transmission and reception of packets is managed by LACP
+ * protocol specified in 802.3AD documentation. */

 /* Balance Mode Transmit Policies */
 #define BALANCE_XMIT_POLICY_LAYER2 (0)
diff --git a/lib/librte_pmd_bond/rte_eth_bond_api.c 
b/lib/librte_pmd_bond/rte_eth_bond_api.c
index c690ceb..c547164 100644
--- a/lib/librte_pmd_bond/rte_eth_bond_api.c
+++ b/lib/librte_pmd_bond/rte_eth_bond_api.c
@@ -31,6 +31,8 @@
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */

+#include 
+
 #include 
 #include 
 #include 
@@ -104,6 +106,44 @@ valid_slave_port_id(uint8_t port_id)
return 0;
 }

+void
+activate_slave(struct rte_eth_dev *eth_dev, uint8_t port_id)
+{
+   struct bond_dev_private *internals = eth_dev->data->dev_private;
+   uint8_t active_count = internals->active_slave_count;
+
+   internals->active_slaves[active_count] = port_id;
+
+   if (internals->mode == BONDING_MODE_8023AD)
+   bond_mode_8023ad_slave_append(eth_dev);
+
+   internals->active_slave_count = active_count + 1;
+}
+
+void
+deactivate_slave(struct rte_eth_dev *eth_dev,
+   uint8_t slave_pos)
+{
+   struct bond_dev_private *internals = eth_dev->data->dev_private;
+   uint8_t active_count = internals->active_slave_count;
+
+   if (internals->mode == BONDING_MODE_8023AD)
+   bond_mode_8023ad_deactivate_slave(eth_dev, slave_pos);
+
+   active_count--;
+
+   /* If slave was not at the end of the list
+* shift active slaves up active array list */
+   if (slave_pos < active_count) {
+   memmove(internals->active_slaves + slave_pos,
+   internals->active_slaves + slave_pos + 1,
+   (active_count - slave_pos) *
+   sizeof(internals->active_slaves[0]));
+   }
+
+   internals->active_slave_count = active_count;
+}
+
 uint8_t
 number_of_sockets(void)
 {
@@ -216,12 +256,8 @@ rte_eth_bond_create(const char *name, uint8_t mode, 
uint8_t socket_id)
eth_dev->dev_ops = &default_dev_ops;
eth_dev->pci_dev = pci_dev;

-   if (bond_ethdev_mode_set(eth_dev, mode)) {
-   RTE_BOND_LOG(ERR, "Failed to set bonded device %d mode too %d",
-eth_dev->data->port_id, mode);
-   goto err;
-   }
-
+   internals->port_id = eth_dev->data->port_id;
+   internals->mode = BONDING_MODE_INVALID;
internals->current_primary_port = 0;
internals->balance_xmit_policy = BALANCE_XMIT_POLICY_LAYER2;
internals->user_defined_mac = 0;
@@ -241,6 +277,12 @@ rte_eth_bond_create(const char *name, uint8_t mode, 
uint8_t socket_id)
memset(internals->activ
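
The deactivate_slave() hunk above removes one entry from a compact array by
shifting the tail down with memmove(). A standalone sketch of the same
technique follows. The struct and function names here are hypothetical
stand-ins, not the bonding PMD's own types, and MAX_ACTIVE is an arbitrary
size chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_ACTIVE 8	/* illustrative capacity, not the PMD's limit */

/* Hypothetical stand-in for the active slave list kept by the bonded device. */
struct active_list {
	uint8_t ports[MAX_ACTIVE];
	uint8_t count;
};

/* Append a port id at the end of the list. */
void
active_list_append(struct active_list *l, uint8_t port_id)
{
	assert(l->count < MAX_ACTIVE);
	l->ports[l->count++] = port_id;
}

/* Remove the entry at position pos and keep the remaining entries packed. */
void
active_list_remove_at(struct active_list *l, uint8_t pos)
{
	assert(pos < l->count);
	l->count--;

	/* Only shift if the removed entry was not the last one. */
	if (pos < l->count)
		memmove(&l->ports[pos], &l->ports[pos + 1],
			(size_t)(l->count - pos) * sizeof(l->ports[0]));
}
```

memmove() rather than memcpy() is needed because the source and destination
ranges overlap.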

[dpdk-dev] [RFC] More changes for rte_mempool.h:__mempool_get_bulk()

2014-09-29 Thread Ananyev, Konstantin



> -Original Message-
> From: Richardson, Bruce
> Sent: Monday, September 29, 2014 1:34 PM
> To: Ananyev, Konstantin
> Cc: Wiles, Roger Keith (Wind River); 
> Subject: Re: [dpdk-dev] [RFC] More changes for 
> rte_mempool.h:__mempool_get_bulk()
> 
> On Mon, Sep 29, 2014 at 01:25:11PM +0100, Ananyev, Konstantin wrote:
> >
> >
> > > -Original Message-
> > > From: Richardson, Bruce
> > > Sent: Monday, September 29, 2014 1:06 PM
> > > To: Wiles, Roger Keith (Wind River)
> > > Cc: Ananyev, Konstantin; 
> > > Subject: Re: [dpdk-dev] [RFC] More changes for 
> > > rte_mempool.h:__mempool_get_bulk()
> > >
> > > On Sun, Sep 28, 2014 at 11:17:34PM +, Wiles, Roger Keith wrote:
> > > >
> > > > On Sep 28, 2014, at 5:41 PM, Ananyev, Konstantin wrote:
> > > >
> > > > >
> > > > >
> > > > >> -Original Message-
> > > > >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Wiles, Roger 
> > > > >> Keith
> > > > >> Sent: Sunday, September 28, 2014 6:52 PM
> > > > >> To: 
> > > > >> Subject: [dpdk-dev] [RFC] More changes for 
> > > > >> rte_mempool.h:__mempool_get_bulk()
> > > > >>
> > > > > >> Here is a Request for Comment on __mempool_get_bulk() routine. I
> > > > > >> believe I am seeing a few more issues in this routine, please look
> > > > > >> at the code below and see if these seem to fix some concerns in how
> > > > > >> the ring is handled.
> > > > > >>
> > > > > >> The first issue I believe is cache->len is increased by ret and not
> > > > > >> req as we do not know if ret == req. This also means the cache->len
> > > > > >> may still not satisfy the request from the cache.
> > > > >>
> > > > >> The second issue is if you believe the above code then we have to 
> > > > >> account for that issue in the stats.
> > > > >>
> > > > >> Let me know what you think?
> > > > >> ++Keith
> > > > >> ---
> > > > >>
> > > > >> diff --git a/lib/librte_mempool/rte_mempool.h 
> > > > >> b/lib/librte_mempool/rte_mempool.h
> > > > >> index 199a493..b1b1f7a 100644
> > > > >> --- a/lib/librte_mempool/rte_mempool.h
> > > > >> +++ b/lib/librte_mempool/rte_mempool.h
> > > > >> @@ -945,9 +945,7 @@ __mempool_get_bulk(struct rte_mempool *mp, void 
> > > > >> **obj_table,
> > > > >>   unsigned n, int is_mc)
> > > > >> {
> > > > >>int ret;
> > > > >> -#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
> > > > >> -   unsigned n_orig = n;
> > > > >> -#endif
> > > > >
> > > > > Yep, as I said in my previous mail n_orig could be removed in total.
> > > > > Though from other side - it is harmless.
> > > > >
> > > > >> +
> > > > >> #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
> > > > >>struct rte_mempool_cache *cache;
> > > > >>uint32_t index, len;
> > > > >> @@ -979,7 +977,21 @@ __mempool_get_bulk(struct rte_mempool *mp, void 
> > > > >> **obj_table,
> > > > >>goto ring_dequeue;
> > > > >>}
> > > > >>
> > > > >> -   cache->len += req;
> > > > >> +   cache->len += ret;  // Need to adjust len by ret 
> > > > >> not req, as (ret != req)
> > > > >> +
> > > > >
> > > > > rte_ring_mc_dequeue_bulk(.., req) at line 971, would either get all 
> > > > > req objects from the ring and return 0 (success),
> > > > > or wouldn't get any entry from the ring and return negative value 
> > > > > (failure).
> > > > > So  this change is erroneous.
> > > >
> > > > > Sorry, I combined my thoughts on changing the get_bulk behavior and you
> > > > > would be correct for the current design. This is why I decided to
> > > > > make it an RFC :-)
> > > > >
> > > > >> +   if ( cache->len < n ) {
> > > > >
> > > > > If n > cache_size, then we will go straight to  'ring_dequeue' see 
> > > > > line 959.
> > > > > So no need for that check here.
> > > >
> > > > > My thinking (at the time) was get_bulk should return 'n' instead of
> > > > > zero, which I feel is the better coding. You are correct it does not
> > > > > make sense unless you factor in my thinking at the time :-(
> > > > >
> > > > >> +   /*
> > > > >> +* Number (ret + cache->len) may not be >= 
> > > > >> n. As
> > > > >> +* the 'ret' value maybe zero or less then 
> > > > >> 'req'.
> > > > >> +*
> > > > >> +* Note:
> > > > >> +* An issue of order from the cache and 
> > > > >> common pool could
> > > > >> +* be an issue if (cache->len != 0 and less 
> > > > >> then n), but the
> > > > >> +* normal case it should be OK. If the user 
> > > > >> needs to preserve
> > > > >> +* the order of packets then he must set 
> > > > >> cache_size == 0.
> > > > >> +*/
> > > > >> +   goto ring_dequeue;
> > > > >> +   }
> > > > >>}
> > > > >>
> > > > >>/* Now fill in the response ... */
> > > > >> @@ -1002,9 +1014,12 @@ ring_d

[dpdk-dev] [PATCH v2] distributor_app: new sample app

2014-09-29 Thread De Lara Guarch, Pablo



> -Original Message-
> From: Ananyev, Konstantin
> Sent: Monday, September 29, 2014 2:07 PM
> To: Pattan, Reshma; De Lara Guarch, Pablo; dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> 
> 
> 
> > -Original Message-
> > From: Pattan, Reshma
> > Sent: Monday, September 29, 2014 1:40 PM
> > To: Ananyev, Konstantin; De Lara Guarch, Pablo; dev at dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> >
> >
> >
> > -Original Message-
> > From: Ananyev, Konstantin
> > Sent: Friday, September 26, 2014 4:52 PM
> > To: De Lara Guarch, Pablo; Pattan, Reshma; dev at dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> >
> >
> >
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of De Lara Guarch,
> > > Pablo
> > > Sent: Friday, September 26, 2014 4:12 PM
> > > To: Pattan, Reshma; dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> > >
> > > Hi,
> > >
> > > > -Original Message-
> > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of reshmapa
> > > > Sent: Wednesday, September 24, 2014 3:17 PM
> > > > To: dev at dpdk.org
> > > > Subject: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> > > >
> > > > From: Reshma Pattan 
> > > >
> > > > A new sample app that shows the usage of the distributor library.
> > > > This app works as follows:
> > > >
> > > > * An RX thread runs which pulls packets from each ethernet port in turn
> > > >   and passes those packets to worker using a distributor component.
> > > > * The workers take the packets in turn, and determine the output port
> > > >   for those packets using basic l2forwarding doing an xor on the source
> > > >   port id.
> > > > > * The RX thread takes the returned packets from the workers and enqueue
> > > > >   those packets into an rte_ring structure.
> > > > * A TX thread pulls the packets off the rte_ring structure and then
> > > >   sends each packet out the output port specified previously by the
> > > > worker
> > > > * Command-line option support provided only for portmask.
> > > >
> > > > Signed-off-by: Bruce Richardson 
> > > > Signed-off-by: Reshma Pattan 
> > > > ---
> > > >  examples/Makefile |   1 +
> > > >  examples/distributor_app/Makefile |  57 
> > > >  examples/distributor_app/main.c   | 585
> > > > ++
> > > >  examples/distributor_app/main.h   |  46 +++
> > > >  4 files changed, 689 insertions(+)
> > > >  create mode 100644 examples/distributor_app/Makefile  create mode
> > > > 100644 examples/distributor_app/main.c  create mode 100644
> > > > examples/distributor_app/main.h
> > > >
> > > > diff --git a/examples/Makefile b/examples/Makefile index
> > > > 6245f83..2ba82b0 100644
> > > > --- a/examples/Makefile
> > > > +++ b/examples/Makefile
> > > > @@ -66,5 +66,6 @@ DIRS-y += vhost
> > > >  DIRS-$(CONFIG_RTE_LIBRTE_XEN_DOM0) += vhost_xen  DIRS-y +=
> vmdq
> > > > DIRS-y += vmdq_dcb
> > > > +DIRS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += distributor_app
> > > >
> > > >  include $(RTE_SDK)/mk/rte.extsubdir.mk diff --git
> > > > a/examples/distributor_app/Makefile
> > > > b/examples/distributor_app/Makefile
> > > > new file mode 100644
> > > > index 000..394785d
> > > > --- /dev/null
> > > > +++ b/examples/distributor_app/Makefile
> > > > @@ -0,0 +1,57 @@
> > > > +#   BSD LICENSE
> > > > +#
> > > > +#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> > > > +#   All rights reserved.
> > > > +#
> > > > +#   Redistribution and use in source and binary forms, with or without
> > > > +#   modification, are permitted provided that the following conditions
> > > > +#   are met:
> > > > +#
> > > > +# * Redistributions of source code must retain the above copyright
> > > > +#   notice, this list of conditions and the following disclaimer.
> > > > > +# * Redistributions in binary form must reproduce the above copyright
> > > > +#   notice, this list of conditions and the following disclaimer in
> > > > +#   the documentation and/or other materials provided with the
> > > > +#   distribution.
> > > > +# * Neither the name of Intel Corporation nor the names of its
> > > > > +#   contributors may be used to endorse or promote products derived
> > > > +#   from this software without specific prior written permission.
> > > > +#
> > > > +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> > > > CONTRIBUTORS
> > > > +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING,
> BUT
> > > > NOT
> > > > +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> > > > FITNESS FOR
> > > > +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> > > > COPYRIGHT
> > > > +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> > > > INCIDENTAL,
> > > > +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
> BUT
>

[dpdk-dev] [PATCH v2] bond: Add mode 4 support.

2014-09-29 Thread Jastrzebski, MichalX K

Please don't take this patch into account. Two files are missing.

Best regards
Michal


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Pawel Wodkowski
> Sent: Monday, September 29, 2014 3:23 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2] bond: Add mode 4 support.
> 
> This patch adds support for mode 4 of link bonding. It depends on Declan
> Doherty's patches v3 and the rte alarms patch v2 or above.
> 
> The new version handles race issues with setting/cancelling callbacks,
> fixes promiscuous mode setting in mode 4 and some other minor errors in
> the mode 4 implementation.
> 
> 
> Signed-off-by: Pawel Wodkowski 
> ---
>  lib/librte_ether/rte_ether.h   |1 +
>  lib/librte_pmd_bond/Makefile   |1 +
>  lib/librte_pmd_bond/rte_eth_bond.h |4 +
>  lib/librte_pmd_bond/rte_eth_bond_api.c |   82 ++---
>  lib/librte_pmd_bond/rte_eth_bond_args.c|1 +
>  lib/librte_pmd_bond/rte_eth_bond_pmd.c |  261
> +---
>  lib/librte_pmd_bond/rte_eth_bond_private.h |   42 -
>  7 files changed, 346 insertions(+), 46 deletions(-)
> 
> diff --git a/lib/librte_ether/rte_ether.h b/lib/librte_ether/rte_ether.h
> index 2e08f23..1a3711b 100644
> --- a/lib/librte_ether/rte_ether.h
> +++ b/lib/librte_ether/rte_ether.h
> @@ -293,6 +293,7 @@ struct vlan_hdr {
>  #define ETHER_TYPE_RARP 0x8035 /**< Reverse Arp Protocol. */
>  #define ETHER_TYPE_VLAN 0x8100 /**< IEEE 802.1Q VLAN tagging. */
>  #define ETHER_TYPE_1588 0x88F7 /**< IEEE 802.1AS 1588 Precise Time
> Protocol. */
> +#define ETHER_TYPE_SLOW 0x8809 /**< Slow protocols (LACP and Marker).
> */
> 
>  #ifdef __cplusplus
>  }
> diff --git a/lib/librte_pmd_bond/Makefile b/lib/librte_pmd_bond/Makefile
> index 953d75e..c2312c2 100644
> --- a/lib/librte_pmd_bond/Makefile
> +++ b/lib/librte_pmd_bond/Makefile
> @@ -44,6 +44,7 @@ CFLAGS += $(WERROR_FLAGS)
>  #
>  SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_api.c
>  SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_pmd.c
> +SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_8023ad.c
>  SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_args.c
> 
>  #
> diff --git a/lib/librte_pmd_bond/rte_eth_bond.h
> b/lib/librte_pmd_bond/rte_eth_bond.h
> index 6811c7b..b0223c2 100644
> --- a/lib/librte_pmd_bond/rte_eth_bond.h
> +++ b/lib/librte_pmd_bond/rte_eth_bond.h
> @@ -75,6 +75,10 @@ extern "C" {
>  /**< Broadcast (Mode 3).
>   * In this mode all transmitted packets will be transmitted on all available
>   * active slaves of the bonded. */
> +#define BONDING_MODE_8023AD  (4)
> +/**< 802.3AD (Mode 4).
> + * In this mode transmission and reception of packets is managed by LACP
> + * protocol specified in 802.3AD documentation. */
> 
>  /* Balance Mode Transmit Policies */
>  #define BALANCE_XMIT_POLICY_LAYER2   (0)
> diff --git a/lib/librte_pmd_bond/rte_eth_bond_api.c
> b/lib/librte_pmd_bond/rte_eth_bond_api.c
> index c690ceb..c547164 100644
> --- a/lib/librte_pmd_bond/rte_eth_bond_api.c
> +++ b/lib/librte_pmd_bond/rte_eth_bond_api.c
> @@ -31,6 +31,8 @@
>   *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> DAMAGE.
>   */
> 
> +#include 
> +
>  #include 
>  #include 
>  #include 
> @@ -104,6 +106,44 @@ valid_slave_port_id(uint8_t port_id)
>   return 0;
>  }
> 
> +void
> +activate_slave(struct rte_eth_dev *eth_dev, uint8_t port_id)
> +{
> + struct bond_dev_private *internals = eth_dev->data->dev_private;
> + uint8_t active_count = internals->active_slave_count;
> +
> + internals->active_slaves[active_count] = port_id;
> +
> + if (internals->mode == BONDING_MODE_8023AD)
> + bond_mode_8023ad_slave_append(eth_dev);
> +
> + internals->active_slave_count = active_count + 1;
> +}
> +
> +void
> +deactivate_slave(struct rte_eth_dev *eth_dev,
> + uint8_t slave_pos)
> +{
> + struct bond_dev_private *internals = eth_dev->data->dev_private;
> + uint8_t active_count = internals->active_slave_count;
> +
> + if (internals->mode == BONDING_MODE_8023AD)
> + bond_mode_8023ad_deactivate_slave(eth_dev, slave_pos);
> +
> + active_count--;
> +
> + /* If slave was not at the end of the list
> +  * shift active slaves up active array list */
> + if (slave_pos < active_count) {
> + memmove(internals->active_slaves + slave_pos,
> + internals->active_slaves + slave_pos + 1,
> + (active_count - slave_pos) *
> + sizeof(internals->active_slaves[0]));
> + }
> +
> + internals->active_slave_count = active_count;
> +}
> +
>  uint8_t
>  number_of_sockets(void)
>  {
> @@ -216,12 +256,8 @@ rte_eth_bond_create(const char *name, uint8_t
> mode, uint8_t socket_id)
>   eth_dev->dev_ops = &default_dev_ops;
>   eth_dev->pci_dev = pci_dev;
> 
> - if (bond_ethdev_mode_set(eth_dev, mode)) {
> - RTE_BOND_LOG(ERR, "Faile

[dpdk-dev] [PATCH 0/2] Added functions to get RX/TX default configuration

2014-09-29 Thread De Lara Guarch, Pablo

Hi David,

> From: David Marchand [mailto:david.marchand at 6wind.com]
> Sent: Saturday, September 27, 2014 7:45 PM
> To: De Lara Guarch, Pablo
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 0/2] Added functions to get RX/TX default
> configuration
> 
> Hello Pablo,
> - All I can see in this patchset is stuff that should remain in the PMD (since
> this is really specific to them).
> 
> - Anyway, if you want to let applications get this information, why the new
> API? From my point of view, this should go in rte_eth_dev_info_get().

Thanks for the comments. The main changes are in the PMDs, and I only added two
functions in rte_ethdev.c, which basically call the functions in the PMDs.
Anyway, so you suggest modifying the rte_eth_dev_info structure, so that it also
contains these two structures populated with the default values?

Thanks,
Pablo
> 
> 
> --
> David Marchand
> 
> On Fri, Sep 26, 2014 at 4:19 PM, Pablo de Lara
>  wrote:
> These patches add two new API functions to get optimal values
> for the RX/TX configuration structures (rte_eth_rxconf and rte_eth_txconf),
> so users can get these configurations and modify or use them directly
> to set up RX/TX queues. Besides, most of the apps that were modifying few
> or none of the default values of the structures have been modified to use
> these functions to simplify the code and avoid duplication.
> 
> Pablo de Lara (2):
>   pmd: Added rte_eth_rxconf_defaults and rte_eth_txconf defaults
>     functions
>   app: Used rte_eth_rxconf_defaults and rte_eth_txconf_defaults in apps
> 
>  examples/dpdk_qat/main.c                   |   44 ++---
>  examples/exception_path/main.c             |   30 +
>  examples/ip_fragmentation/main.c           |   42 ++---
>  examples/ip_reassembly/main.c              |   44 ++---
>  examples/ipv4_multicast/main.c             |   44 ++---
>  examples/kni/main.c                        |   34 +-
>  examples/l2fwd-ivshmem/host/host.c         |   43 +---
>  examples/l2fwd/main.c                      |   48 +-
>  examples/l3fwd-acl/main.c                  |   46 ++
>  examples/l3fwd-power/main.c                |   46 ++---
>  examples/l3fwd-vf/main.c                   |   31 ++---
>  examples/l3fwd/main.c                      |   54 +++-
>  examples/link_status_interrupt/main.c      |   43 +---
>  examples/load_balancer/init.c              |   24 +--
>  .../client_server_mp/mp_server/init.c      |   41 +---
>  examples/multi_process/l2fwd_fork/main.c   |   44 +
>  examples/multi_process/symmetric_mp/main.c |   36 +-
>  examples/netmap_compat/bridge/bridge.c     |   25 ---
>  examples/netmap_compat/lib/compat_netmap.c |    6 +-
>  examples/netmap_compat/lib/compat_netmap.h |    2 -
>  examples/qos_meter/main.c                  |   36 ---
>  examples/quota_watermark/qw/init.c         |   26 ++--
>  examples/vhost_xen/main.c                  |   31 ++---
>  examples/vmdq/main.c                       |   60 ++---
>  examples/vmdq_dcb/main.c                   |   36 +-
>  lib/librte_ether/rte_ethdev.c              |   68
>  lib/librte_ether/rte_ethdev.h              |   29
>  lib/librte_pmd_e1000/igb_ethdev.c          |   56 -
>  lib/librte_pmd_i40e/i40e_ethdev.c          |   56
>  lib/librte_pmd_ixgbe/ixgbe_ethdev.c        |   59 +
>  30 files changed, 385 insertions(+), 799 deletions(-)
> 
> --
> 1.7.7.6

[dpdk-dev] [PATCH] Fix librte_pmd_pcap driver double stop error

2014-09-29 Thread Neil Horman

On Wed, Sep 10, 2014 at 05:17:05PM -0300, Nicolás Pernas Maradei wrote:
> From: Nicolás Pernas Maradei 
> 
> librte_pmd_pcap driver was opening the pcap/interfaces only at init time and
> closing them only when the port was being stopped. This behaviour would cause
> problems (leading to segfault) if the user closed the port 2 times. The first
> time the pcap/interfaces would be normally closed but libpcap would throw an
> error causing a segfault if the closed pcaps/interfaces were closed again.
> This behaviour is solved by reopening pcaps/interfaces when the port is
> started (only if these weren't open already, for example at init time).
> 
> Signed-off-by: Nicolás Pernas Maradei 

This patch assigns pointers to strings that are allocated in the devargs_list.
Given that there exists an API interface free_devargs_list(), I'm not sure that
what's being done here is consistently safe.  It seems like you should dup the
strings to make sure you always have the storage allocated, or find some other
method to store the needed information.

Neil
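
A minimal sketch of the "dup the strings" suggestion above, assuming the driver
keeps two names per port. The struct and function names are hypothetical and do
not come from librte_pmd_pcap; the point is only that the port owns its own
copies, so a later free of the caller's storage cannot leave dangling pointers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-port record; not the real librte_pmd_pcap layout. */
struct sketch_port_names {
	char *rx_name; /* owned copy, released by sketch_port_names_free() */
	char *tx_name; /* owned copy, released by sketch_port_names_free() */
};

/* Portable string duplicate: returns a heap copy or NULL on failure. */
static char *
sketch_dup(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy != NULL)
		memcpy(copy, s, len);
	return copy;
}

/* Store private copies so the record outlives the caller's strings. */
static int
sketch_port_names_set(struct sketch_port_names *p,
		const char *rx, const char *tx)
{
	char *rx_copy, *tx_copy;

	assert(p != NULL && rx != NULL && tx != NULL);
	rx_copy = sketch_dup(rx);
	tx_copy = sketch_dup(tx);
	if (rx_copy == NULL || tx_copy == NULL) {
		free(rx_copy);
		free(tx_copy);
		return -1;
	}
	p->rx_name = rx_copy;
	p->tx_name = tx_copy;
	return 0;
}

/* Release the copies and clear the pointers so a second call is harmless. */
static void
sketch_port_names_free(struct sketch_port_names *p)
{
	free(p->rx_name);
	free(p->tx_name);
	p->rx_name = NULL;
	p->tx_name = NULL;
}
```

With this shape, the caller is free to release its argument list right after
sketch_port_names_set() returns.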

[dpdk-dev] [PATCH v2] distributor_app: new sample app

2014-09-29 Thread Neil Horman

On Mon, Sep 29, 2014 at 01:35:21PM +, De Lara Guarch, Pablo wrote:
> 
> 
> > -Original Message-
> > From: Ananyev, Konstantin
> > Sent: Monday, September 29, 2014 2:07 PM
> > To: Pattan, Reshma; De Lara Guarch, Pablo; dev at dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> > 
> > 
> > 
> > > -Original Message-
> > > From: Pattan, Reshma
> > > Sent: Monday, September 29, 2014 1:40 PM
> > > To: Ananyev, Konstantin; De Lara Guarch, Pablo; dev at dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> > >
> > >
> > >
> > > -Original Message-
> > > From: Ananyev, Konstantin
> > > Sent: Friday, September 26, 2014 4:52 PM
> > > To: De Lara Guarch, Pablo; Pattan, Reshma; dev at dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> > >
> > >
> > >
> > > > -Original Message-
> > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of De Lara Guarch,
> > > > Pablo
> > > > Sent: Friday, September 26, 2014 4:12 PM
> > > > To: Pattan, Reshma; dev at dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> > > >
> > > > Hi,
> > > >
> > > > > -Original Message-
> > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of reshmapa
> > > > > Sent: Wednesday, September 24, 2014 3:17 PM
> > > > > To: dev at dpdk.org
> > > > > Subject: [dpdk-dev] [PATCH v2] distributor_app: new sample app
> > > > >
> > > > > From: Reshma Pattan 
> > > > >
> > > > > A new sample app that shows the usage of the distributor library.
> > > > > This app works as follows:
> > > > >
> > > > > > * An RX thread runs which pulls packets from each ethernet port in turn
> > > > > >   and passes those packets to worker using a distributor component.
> > > > > > * The workers take the packets in turn, and determine the output port
> > > > > >   for those packets using basic l2forwarding doing an xor on the source
> > > > > >   port id.
> > > > > > * The RX thread takes the returned packets from the workers and enqueue
> > > > > >   those packets into an rte_ring structure.
> > > > > > * A TX thread pulls the packets off the rte_ring structure and then
> > > > > >   sends each packet out the output port specified previously by the
> > > > > >   worker
> > > > > > * Command-line option support provided only for portmask.
> > > > >
> > > > > Signed-off-by: Bruce Richardson 
> > > > > Signed-off-by: Reshma Pattan 
> > > > > ---
> > > > >  examples/Makefile |   1 +
> > > > >  examples/distributor_app/Makefile |  57 
> > > > >  examples/distributor_app/main.c   | 585
> > > > > ++
> > > > >  examples/distributor_app/main.h   |  46 +++
> > > > >  4 files changed, 689 insertions(+)
> > > > >  create mode 100644 examples/distributor_app/Makefile  create mode
> > > > > 100644 examples/distributor_app/main.c  create mode 100644
> > > > > examples/distributor_app/main.h
> > > > >
> > > > > diff --git a/examples/Makefile b/examples/Makefile index
> > > > > 6245f83..2ba82b0 100644
> > > > > --- a/examples/Makefile
> > > > > +++ b/examples/Makefile
> > > > > @@ -66,5 +66,6 @@ DIRS-y += vhost
> > > > >  DIRS-$(CONFIG_RTE_LIBRTE_XEN_DOM0) += vhost_xen  DIRS-y +=
> > vmdq
> > > > > DIRS-y += vmdq_dcb
> > > > > +DIRS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += distributor_app
> > > > >
> > > > >  include $(RTE_SDK)/mk/rte.extsubdir.mk diff --git
> > > > > a/examples/distributor_app/Makefile
> > > > > b/examples/distributor_app/Makefile
> > > > > new file mode 100644
> > > > > index 000..394785d
> > > > > --- /dev/null
> > > > > +++ b/examples/distributor_app/Makefile
> > > > > @@ -0,0 +1,57 @@
> > > > > +#   BSD LICENSE
> > > > > +#
> > > > > +#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> > > > > +#   All rights reserved.
> > > > > +#
> > > > > +#   Redistribution and use in source and binary forms, with or 
> > > > > without
> > > > > +#   modification, are permitted provided that the following 
> > > > > conditions
> > > > > +#   are met:
> > > > > +#
> > > > > +# * Redistributions of source code must retain the above 
> > > > > copyright
> > > > > +#   notice, this list of conditions and the following disclaimer.
> > > > > +# * Redistributions in binary form must reproduce the above
> > copyright
> > > > > +#   notice, this list of conditions and the following disclaimer 
> > > > > in
> > > > > +#   the documentation and/or other materials provided with the
> > > > > +#   distribution.
> > > > > +# * Neither the name of Intel Corporation nor the names of its
> > > > > +#   contributors may be used to endorse or promote products
> > derived
> > > > > +#   from this software without specific prior written permission.
> > > > > +#
> > > > > +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> > > > > CONTRIBUTORS
> > > > > +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING,
>
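
The worker step in the quoted description (an XOR on the source port id) could
look roughly like the sketch below. The function name and the idea of passing
the XOR value as a parameter are assumptions made for illustration; the quoted
patch text does not show the worker code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: derive the output port from the input port.
 * With xor_val == 1, ports are paired (0,1), (2,3), ...
 * With xor_val == 0, packets go back out the port they arrived on.
 */
static inline uint8_t
sketch_l2fwd_out_port(uint8_t in_port, uint8_t xor_val)
{
	return (uint8_t)(in_port ^ xor_val);
}
```

XOR is its own inverse, so applying the same value twice returns the original
port, which is why the pairing is symmetric.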

[dpdk-dev] [PATCH] pci: remove flag for multiple devices with single id

2014-09-29 Thread Thomas Monjalon

> The flag RTE_PCI_DRV_MULTIPLE was used to register an eth_driver allowing
> multiple devices with a single PCI id.
> It is now possible to register a pci_driver and create ethdev objects
> using rte_eth_dev_allocate().
> 
> Suggested-by: David Marchand 
> Signed-off-by: Thomas Monjalon 

Applied

-- 
Thomas

[dpdk-dev] [PATCH] pcap: set in_port value in packet mbuf data when each packet is received

2014-09-29 Thread Thomas Monjalon

> The pkt.in_port parameter in mbuf should be set with an input port id
> because DPDK apps may use it to know where each packet came from.
> 
> Signed-off-by: Saori USAMI 

Acked, adapted to mbuf rework and applied.

Thanks
-- 
Thomas

[dpdk-dev] [PATCH 0/2] Added functions to get RX/TX default configuration

2014-09-29 Thread David Marchand

Hello Pablo,

On Mon, Sep 29, 2014 at 4:02 PM, De Lara Guarch, Pablo <
pablo.de.lara.guarch at intel.com> wrote:

> Hi David,
>
> > - All I can see in this patchset is stuff that should remain in the PMD
> > (since this is really specific to them).
> >
> > - Anyway, if you want to let application get this information, why the
> > new API?
> > From my point of view, this should go in rte_eth_dev_info_get().
>
> Thanks for the comments. Main changes are in the PMDs, and I only added
> two functions in rte_ethdev.c, which basically call the functions in the
> PMDs.
> Anyway, so you suggest modifying the rte_eth_dev_info structure, so it
> also contains these two structures populated with the default values?
>

- Yep, that would be the idea.
This way applications can reuse these structures "as is" or change some
values before calling rx / tx _queue_setup.


- By the way, I noticed that rte_eth_dev_info_get() is only resetting part
of the dev_info structure given by the user.
We might want to reset the dev_info structure to 0 before filling it ?
From my point of view, rte_eth_dev_info_get() is not in the datapath.
I would prefer it to be safe.


-- 
David Marchand
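
A rough sketch of the two ideas in this reply: default RX/TX configuration
carried inside the info structure, and the whole structure zeroed before the
driver fills it. Every type, field and function name here is a hypothetical
stand-in, not the real ethdev API, and the values are made up.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, cut-down stand-ins for the queue configuration structures. */
struct sketch_rxconf { uint16_t rx_free_thresh; uint8_t rx_drop_en; };
struct sketch_txconf { uint16_t tx_free_thresh; uint16_t tx_rs_thresh; };

/* Hypothetical device info carrying the driver's preferred defaults. */
struct sketch_dev_info {
	uint16_t max_rx_queues;
	uint16_t max_tx_queues;
	struct sketch_rxconf default_rxconf;
	struct sketch_txconf default_txconf;
};

/* Hypothetical driver callback: fills only the fields it knows about. */
static void
sketch_pmd_infos_get(struct sketch_dev_info *info)
{
	info->max_rx_queues = 16;
	info->default_rxconf.rx_free_thresh = 32;
	info->default_txconf.tx_rs_thresh = 32;
}

/* Generic layer: reset everything first so unset fields read as 0. */
static void
sketch_dev_info_get(struct sketch_dev_info *info)
{
	assert(info != NULL);
	memset(info, 0, sizeof(*info));
	sketch_pmd_infos_get(info);
}
```

An application could then copy info.default_rxconf, adjust one or two values,
and pass the copy to its queue setup call. The memset costs little because
this path is not per-packet.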

[dpdk-dev] [PATCH] doc: apply one comment to all members of a group

2014-09-29 Thread Thomas Monjalon

> A doxygen group begins with /**@{*/ and ends with /**@}*/.
> By enabling DISTRIBUTE_GROUP_DOC, the first comment is applied
> to each undocumented member of the group.
> 
> Signed-off-by: Thomas Monjalon 

Applied

-- 
Thomas
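
For illustration, a group of the kind described above might look like the
following sketch. It assumes DISTRIBUTE_GROUP_DOC = YES in the Doxyfile and
that doxygen reuses the first documented member's comment for the rest of the
group; the macro names are hypothetical.

```c
/**@{*/
/** Hypothetical link speed flag (comment reused for the undocumented members). */
#define SKETCH_LINK_SPEED_AUTONEG 0x0
#define SKETCH_LINK_SPEED_10G     0x1
#define SKETCH_LINK_SPEED_40G     0x2
/**@}*/
```

Without the option, only the first macro would carry documentation in the
generated output.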

[dpdk-dev] [PATCH v2 02/18] ixgbe: Clean up IXGBE base codes

2014-09-29 Thread Neil Horman

On Mon, Sep 29, 2014 at 03:16:10PM +0800, Ouyang Changchun wrote:
> This patch cleans up some IXGBE base code, such as removing unnecessary
> return statements and reducing goto statements.
> 
> Signed-off-by: Changchun Ouyang 
> ---
>  lib/librte_pmd_ixgbe/ixgbe/ixgbe_82598.c | 2 --
>  lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c | 7 +++
>  lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.c| 2 +-
>  lib/librte_pmd_ixgbe/ixgbe/ixgbe_common.h| 2 --
>  lib/librte_pmd_ixgbe/ixgbe/ixgbe_dcb_82598.c | 2 ++
>  lib/librte_pmd_ixgbe/ixgbe/ixgbe_dcb_82599.c | 1 +
>  lib/librte_pmd_ixgbe/ixgbe/ixgbe_phy.c   | 3 ++-
>  7 files changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82598.c b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82598.c
> index ee2217d..c8ce893 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82598.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82598.c
> @@ -1417,8 +1417,6 @@ STATIC void ixgbe_set_rxpba_82598(struct ixgbe_hw *hw, int num_pb,
>   /* Setup Tx packet buffer sizes */
>   for (i = 0; i < IXGBE_MAX_PACKET_BUFFERS; i++)
>   IXGBE_WRITE_REG(hw, IXGBE_TXPBSIZE(i), IXGBE_TXPBSIZE_40KB);
> -
> - return;
>  }
>  
>  /**
> diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
> index 835331b..046a35e 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
> @@ -2103,7 +2103,7 @@ s32 ixgbe_identify_phy_82599(struct ixgbe_hw *hw)
>   if (status != IXGBE_SUCCESS) {
>   /* 82599 10GBASE-T requires an external PHY */
>   if (hw->mac.ops.get_media_type(hw) == ixgbe_media_type_copper)
> - goto out;
> + return status;
>   else
>   status = ixgbe_identify_module_generic(hw);
>   }
> @@ -2111,14 +2111,13 @@ s32 ixgbe_identify_phy_82599(struct ixgbe_hw *hw)
>   /* Set PHY type none if no PHY detected */
>   if (hw->phy.type == ixgbe_phy_unknown) {
>   hw->phy.type = ixgbe_phy_none;
> - status = IXGBE_SUCCESS;
> + return IXGBE_SUCCESS;
>   }
>  
>   /* Return error if SFP module has been detected but is not supported */
>   if (hw->phy.type == ixgbe_phy_sfp_unsupported)
> - status = IXGBE_ERR_SFP_NOT_SUPPORTED;
> + return IXGBE_ERR_SFP_NOT_SUPPORTED;
>  
> -out:
>   return status;
>  }
>  
How is this a cleanup?  I understand that you've removed a set of goto
statements from this function, and, while I don't think gotos are a problem, it's
fine that you did.  But there are literally dozens of other goto statements that
could be cleaned up in a similar fashion that you ignored in this patch.  Why
just this one location?

Also, isn't this code lifted directly from the Linux ixgbe driver?  Wouldn't it
be prudent to just keep it in line with that driver as much as possible?

Neil
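
As a side note on the quoted hunk, the sketch below models the control flow of
the function before and after the change, to show that the two shapes return
the same value for the same inputs. It is a toy under stated assumptions: the
struct, enum, error values and function names are hypothetical stand-ins, and
the hardware probes are replaced by plain fields.

```c
#include <assert.h>

/* Hypothetical stand-ins for the PHY types and status codes. */
enum sketch_phy { PHY_UNKNOWN, PHY_NONE, PHY_SFP_UNSUPPORTED, PHY_OK };
#define SKETCH_SUCCESS 0
#define SKETCH_ERR_SFP_NOT_SUPPORTED (-2)

struct sketch_hw {
	int generic_status;      /* result of the first probe */
	int module_status;       /* result of the fallback probe */
	int is_copper;           /* media type check */
	enum sketch_phy phy_type;
};

/* Shape of the function before the patch: single exit via a label. */
static int
sketch_identify_goto(struct sketch_hw *hw)
{
	int status = hw->generic_status;

	if (status != SKETCH_SUCCESS) {
		if (hw->is_copper)
			goto out;
		else
			status = hw->module_status;
	}
	if (hw->phy_type == PHY_UNKNOWN) {
		hw->phy_type = PHY_NONE;
		status = SKETCH_SUCCESS;
	}
	if (hw->phy_type == PHY_SFP_UNSUPPORTED)
		status = SKETCH_ERR_SFP_NOT_SUPPORTED;
out:
	return status;
}

/* Shape of the function after the patch: early returns. */
static int
sketch_identify_returns(struct sketch_hw *hw)
{
	int status = hw->generic_status;

	if (status != SKETCH_SUCCESS) {
		if (hw->is_copper)
			return status;
		status = hw->module_status;
	}
	if (hw->phy_type == PHY_UNKNOWN) {
		hw->phy_type = PHY_NONE;
		return SKETCH_SUCCESS;
	}
	if (hw->phy_type == PHY_SFP_UNSUPPORTED)
		return SKETCH_ERR_SFP_NOT_SUPPORTED;
	return status;
}

/* Returns 1 if both shapes agree on every input combination of the model. */
static int
sketch_variants_agree(void)
{
	static const int statuses[] = { SKETCH_SUCCESS, -1 };

	for (int g = 0; g < 2; g++)
	for (int m = 0; m < 2; m++)
	for (int c = 0; c < 2; c++)
	for (int t = PHY_UNKNOWN; t <= PHY_OK; t++) {
		struct sketch_hw a = { statuses[g], statuses[m], c,
				(enum sketch_phy)t };
		struct sketch_hw b = a;

		if (sketch_identify_goto(&a) != sketch_identify_returns(&b) ||
				a.phy_type != b.phy_type)
			return 0;
	}
	return 1;
}
```

In this model the change is purely stylistic, which fits the question raised
in the review about why only this one location was touched.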

[dpdk-dev] Bulk dequeue of packets and the returned values, question

2014-09-29 Thread Wiles, Roger Keith


On Sep 29, 2014, at 7:30 AM, Ananyev, Konstantin  wrote:

> 
> 
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
>> Sent: Monday, September 29, 2014 1:10 PM
>> To: Wiles, Roger Keith (Wind River)
>> Cc: dev at dpdk.org
>> Subject: Re: [dpdk-dev] Bulk dequeue of packets and the returned values, 
>> question
>> 
>> On Sun, Sep 28, 2014 at 11:06:17PM +, Wiles, Roger Keith wrote:
>>> Thanks Venky,
>>> On Sep 28, 2014, at 5:23 PM, Venkatesan, Venky >> intel.com> wrote:
>>> 
>>>> Keith,
>>>>
>>>> On 9/28/2014 11:04 AM, Wiles, Roger Keith wrote:
>>>>> I am also looking at the bulk dequeue routines, which the ring can be
>>>>> fixed or variable. On fixed  < 0 on error is returned and 0 if
>> successful. On a variable ring < 0 on error or n on success, but I think n
>> can be zero in the variable case, correct?
>>>>>
>>>>> If these are true then why not have the routines return  < 0 on error and
>>>>> >= 0 on success. Which means a dequeue from a fixed
>> ring would return only "requested size n" or < 0 if you error off the 0
>> case. The 0 case could be OK, if you allow zero to be returned on an
>> empty ring for the fixed ring case.
>>>>>
>>>>> Does this make sense to anyone?
>>>> It won't make sense unless you're aware of the history behind these
>>>> functions. The original functions that were implemented for
>> the ring were only the bulk functions (i.e. FIXED). They would return
>> exactly the number of items requested for dequeue (0 if success,
>> negative if error), and not return any if the required number were not
>> available.
>>>>
>>>> The burst (i.e. VARIABLE) functions came in much later (think it was r1.3
>>>> where we introduced them), and by that time, there were
>> already quite a number of deployments of DPDK in the field using the legacy
>> ring functions. Therefore we made the decision to keep
>> the legacy behavior intact & not impacting deployed code - and merging the
>> burst functions into the code. Given that there was no
>> "versioning" of the API/ABI in those releases :).
>>>
>>> I see why the code is this way. If the developers used "if ( ret == 0 ) {
>>> /* do something */ }" then it would break if it returned a
>> positive value on success. I would expect the normal behavior to be "if (
>> ret < 0 ) { /* error case */ }" and fall thru for the success case. I
>> would love to change the code to just return <0 on error or >= 0 on success.
>> I wonder how many customers code would break
>> changing the code to do just the two steps. I think it will remove some
>> code in a couple places that were testing for FIXED or
>> VARIABLE?
>>>>
>>>> Hope that helps.
>>>> -Venky
>>>>
>>>>>
>>>>> Thanks
>>>>> ++Keith
>>>>>
>>>>> Keith Wiles, Principal Technologist with CTO office, Wind River mobile
>>>>> 972-213-5533
>>> 
>>> Keith Wiles, Principal Technologist with CTO office, Wind River mobile 
>>> 972-213-5533
>>> 
>> 
>> Since we are looking at making considerable ABI changes in this release and
>> (hopefully) also looking to version our ABI going forward, I would be in
>> favour of making any changes to these APIs in this current release if
>> possible. While the current behaviour makes sense for historical reason, I
>> think an overall change to the behaviour as Keith describes would be more
>> sensible long-term.
> 
> It is doable, I suppose, but might become quite messy:
> Don't know how many people are using  rte_ring_dequeue_bulk() all over the 
> place.
> I suspect quite a lot.
> From other side - what the real gain we'll have from it?
> I don't see much so far.
> Konstantin
> 
I see two possible gains: one is a consistent return method for Fixed/Variable,
and the other is some code reduction in a few places. Let me see if I can create
a patch we can review and see if it seems reasonable.

>> 
>> (Also to note my previous suggestion about upping the major version to 2.0
>> if we continue to increase the number of ABI/API changes in this release.
>> Anyone else any thoughts on that?)

Keith Wiles, Principal Technologist with CTO office, Wind River mobile 
972-213-5533
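
To make the return conventions discussed in this thread concrete, here is a toy
ring, not the rte_ring implementation. All names are hypothetical and the
choice of -ENOENT and -ENOBUFS as error values is an assumption of the sketch.
It shows the fixed (bulk) convention of 0 on success, the variable (burst)
convention of returning a count, and a wrapper with the proposed unified
convention of < 0 on error and >= 0 on success.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_RING_SIZE 8

struct toy_ring {
	void *slots[TOY_RING_SIZE];
	unsigned head;   /* index of the next object to dequeue */
	unsigned count;  /* number of objects currently stored */
};

static int
toy_enqueue(struct toy_ring *r, void *obj)
{
	if (r->count == TOY_RING_SIZE)
		return -ENOBUFS;
	r->slots[(r->head + r->count) % TOY_RING_SIZE] = obj;
	r->count++;
	return 0;
}

/* Internal helper: remove exactly n objects; caller guarantees n <= count. */
static unsigned
toy_take(struct toy_ring *r, void **objs, unsigned n)
{
	assert(n <= r->count);
	for (unsigned i = 0; i < n; i++) {
		objs[i] = r->slots[r->head];
		r->head = (r->head + 1) % TOY_RING_SIZE;
	}
	r->count -= n;
	return n;
}

/* Fixed ("bulk"): all n or nothing; 0 on success, negative on failure. */
static int
toy_dequeue_bulk(struct toy_ring *r, void **objs, unsigned n)
{
	if (r->count < n)
		return -ENOENT;
	toy_take(r, objs, n);
	return 0;
}

/* Variable ("burst"): up to n objects; returns how many, possibly 0. */
static int
toy_dequeue_burst(struct toy_ring *r, void **objs, unsigned n)
{
	if (n > r->count)
		n = r->count;
	return (int)toy_take(r, objs, n);
}

/* Proposed unified convention for the fixed case: < 0 error, else n. */
static int
toy_dequeue_bulk_unified(struct toy_ring *r, void **objs, unsigned n)
{
	int ret = toy_dequeue_bulk(r, objs, n);

	return ret < 0 ? ret : (int)n;
}
```

The compatibility concern raised above is visible here: a caller that tests
"ret == 0" for success works with toy_dequeue_bulk() but would misread a
successful toy_dequeue_bulk_unified() call, whereas a caller that tests
"ret < 0" works with both.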

[dpdk-dev] [PATCH v3 00/10] VM Power Management

2014-09-29 Thread Alan Carew

Virtual Machine Power Management.

The following patches add two DPDK sample applications and an alternate
implementation of librte_power for use in virtualized environments.
The idea is to provide librte_power functionality from within a VM to address
the lack of MSRs to facilitate frequency changes from within a VM.
It is ideally suited for Haswell which provides per core frequency scaling.

The current librte_power effects frequency changes via the acpi-cpufreq
'userspace' power governor, accessed via sysfs.

General Overview:(more information in each patch that follows).
The VM Power Management solution provides two components:

 1)VM: Allows a DPDK application in a VM to reuse the librte_power
 interface. Each lcore opens a Virtio-Serial endpoint channel to the host,
 where the re-implementation of librte_power simply forwards the requests for
 frequency change to a host based monitor. The host monitor itself uses
 librte_power.
 Each lcore channel corresponds to a
 serial device '/dev/virtio-ports/virtio.serial.port.poweragent.'
 which is opened in non-blocking mode.
 While each Virtual CPU can be mapped to multiple physical CPUs it is
 recommended that each vCPU should be mapped to a single core only.

 2)Host: The host monitor is managed by a CLI, it allows for adding qemu/KVM
 virtual machines and associated channels to the monitor, manually changing
 CPU frequency, inspecting the state of VMs, vCPU to pCPU pinning and managing
 channels.
 Host channel endpoints are Virtio-Serial endpoints configured as AF_UNIX file
 sockets which follow a specific naming convention
 i.e /tmp/powermonitor/.,
 each channel has a 1:1 mapping to a VM endpoint
 i.e. /dev/virtio-ports/virtio.serial.port.poweragent.
 Host channel endpoints are opened in non-blocking mode and are monitored via
 epoll.
 Requests over each channel to change frequency are forwarded to the original
 librte_power.

Channels must be manually configured as qemu-kvm command line arguments or
libvirt domain definition(xml) e.g.

 


  
  


Where multiple channels can be configured by specifying multiple 
elements, by replacing , .
(port number) should be incremented by 1 for each new channel element.
More information on Virtio-Serial can be found here:
http://fedoraproject.org/wiki/Features/VirtioSerial
To enable the Hypervisor creation of channels, the host endpoint directory
must be created with qemu permissions:
mkdir /tmp/powermonitor
chown qemu:qemu /tmp/powermonitor

The host application runs on two separate lcores:
Core N) CLI: For management of Virtual Machines adding channels to Monitor
 thread, inspecting state and manually setting CPU frequency [PATCH 02/09]
Core N+1) Monitor Thread: An epoll based infinite loop that waits on channel
 events from VMs and calls the corresponding librte_power functions.

A sample application is also provided to run on Virtual Machines, this
application provides a CLI to manually set the frequency of a vCPU [PATCH 08/09]

The current l3fwd-power sample application can also be run on a VM.

Changes in V3:
 Fixed crash in Guest CLI when host application is not running.
 Renamed #defines to be more specific to the module they belong
 Added vCPU pinning via CLI
 Testing feedback

Changes in V2:
 Runtime selection of librte_power implementations.
 Updated Unit tests to cover librte_power changes.
 PATCH[0/3] was sent twice, again as PATCH[0/4]
 Miscellaneous fixes.

Alan Carew (10):
  Channel Manager and Monitor for VM Power Management(Host).
  VM Power Management CLI(Host).
  CPU Frequency Power Management(Host).
  VM Power Management application and Makefile.
  VM Power Management CLI(Guest).
  VM communication channels for VM Power Management(Guest).
  librte_power common interface for Guest and Host
  Packet format for VM Power Management(Host and Guest).
  Build system integration for VM Power Management(Guest and Host)
  VM Power Management Unit Tests

 app/test/Makefile  |   3 +-
 app/test/autotest_data.py  |  26 +
 app/test/test_power.c  | 445 +---
 app/test/test_power_acpi_cpufreq.c | 544 ++
 app/test/test_power_kvm_vm.c   | 308 
 examples/vm_power_manager/Makefile |  57 ++
 examples/vm_power_manager/channel_manager.c| 804 +
 examples/vm_power_manager/channel_manager.h| 314 
 examples/vm_power_manager/channel_monitor.c| 228 ++
 examples/vm_power_manager/channel_monitor.h| 102 +++
 examples/vm_power_manager/guest_cli/Makefile   |  56 ++
 examples/vm_power_manager/guest_cli/main.c |  87 +++
 examples/vm_power_manager/guest_cli/main.h |  52 ++
 .../guest_cli/vm_power_cli_guest.c | 155 
 .../guest_cli/vm_power_cli_guest.h |  55 ++
 examples/vm_power_manager/main.c   | 113 +++
 examples/vm_power_manage

[dpdk-dev] [PATCH v3 01/10] Channel Manager and Monitor for VM Power Management(Host).

2014-09-29 Thread Alan Carew

The manager is responsible for adding communications channels to the Monitor
thread, tracking and reporting VM state and employs the libvirt API for
synchronization with the KVM Hypervisor. The manager interacts with the
Hypervisor to discover the mapping of virtual CPUS(vCPUs) to the host
physical CPUS(pCPUs) and to inspect the VM running state.

The manager provides the following functionality to the CLI:
1) Connect to a libvirtd instance, default: qemu:///system
2) Add a VM to an internal list, each VM is identified by a "name" which must
   correspond to a valid libvirt Domain Name.
3) Add communication channels associated with a VM to the epoll based Monitor
   thread.
   The channels must exist and be in the form of:
   /tmp/powermonitor/.. Each channel is a
   Virtio-Serial endpoint configured as an AF_UNIX file socket and opened in
   non-blocking mode.
   Each VM can have a maximum of 64 channels associated with it.
4) Disable or re-enable VM communication channels, channels once added to the
   Monitor thread remain in that thread's control, however acting on channel
   requests can be disabled and re-enabled via CLI.

The monitor is an epoll based infinite loop running in a separate thread that
waits on channel events from VMs and calls the corresponding functions. Channel
definitions from the manager are registered via the epoll event opaque pointer
when calling epoll_ctl(EPOLL_CTL_ADD), this allows for obtaining the channel's
file descriptor for reading EPOLLIN events and mapping the vCPU to pCPU(s)
associated with a request from a particular VM.
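
A minimal, Linux-only sketch of the opaque pointer registration described
above. The channel struct and the function names are hypothetical and far
simpler than what the patch defines; only epoll_ctl(), epoll_wait() and the
data.ptr member of struct epoll_event are standard.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical channel record; the real patch keeps more state than this. */
struct sketch_channel {
	int fd;              /* channel file descriptor */
	uint64_t pcpu_mask;  /* pCPU(s) the requesting vCPU is pinned to */
};

/* Register a channel; the record rides along as the epoll opaque pointer. */
static int
sketch_monitor_add(int epfd, struct sketch_channel *chan)
{
	struct epoll_event ev = { .events = EPOLLIN, .data = { .ptr = chan } };

	assert(chan != NULL);
	return epoll_ctl(epfd, EPOLL_CTL_ADD, chan->fd, &ev);
}

/* Wait for one readable channel; NULL on timeout or error. */
static struct sketch_channel *
sketch_monitor_wait_one(int epfd, int timeout_ms)
{
	struct epoll_event ev;
	int n = epoll_wait(epfd, &ev, 1, timeout_ms);

	if (n != 1 || !(ev.events & EPOLLIN))
		return NULL;
	/* The pointer stored at registration comes back with the event. */
	return ev.data.ptr;
}
```

Because the event hands back the record itself, the monitor loop needs no
separate lookup from file descriptor to VM or CPU mapping.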

Signed-off-by: Alan Carew 
---
 examples/vm_power_manager/channel_manager.c | 804 
 examples/vm_power_manager/channel_manager.h | 314 +++
 examples/vm_power_manager/channel_monitor.c | 228 
 examples/vm_power_manager/channel_monitor.h | 102 
 4 files changed, 1448 insertions(+)
 create mode 100644 examples/vm_power_manager/channel_manager.c
 create mode 100644 examples/vm_power_manager/channel_manager.h
 create mode 100644 examples/vm_power_manager/channel_monitor.c
 create mode 100644 examples/vm_power_manager/channel_monitor.h

diff --git a/examples/vm_power_manager/channel_manager.c b/examples/vm_power_manager/channel_manager.c
new file mode 100644
index 000..a14f191
--- /dev/null
+++ b/examples/vm_power_manager/channel_manager.c
@@ -0,0 +1,804 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include "channel_manager.h"
+#include "channel_commands.h"
+#include "channel_monitor.h"
+
+
+#define RTE_LOGTYPE_CHANNEL_MANAGER RTE_LOGTYPE_USER1
+
+#define ITERATIVE_BITMASK_CHECK_64(mask_u64b, i) \
+   for (i = 0; mask_u64b; mask_u64b &= ~(1ULL << i++)) \
+   if ((mask_u64b >> i) & 1) \
+
+/* Global pointer to libvirt connection */
+static virConnectPtr global_vir_conn_ptr;
+
+static unsigned char *global_cpumaps;
+static virVcpuInfo *global_vircpuinfo;
+static size_t global_maplen;
+
+static unsigned global_n_host_cpus;
+
+/*
+ * Represents a single Virtual Machine
+ */
+struct virtual_machine_info {
+   cha

[dpdk-dev] [PATCH v3 02/10] VM Power Management CLI(Host).

2014-09-29 Thread Alan Carew

The CLI is used for administrating the channel monitor and manager and
manually setting the CPU frequency on the host.

Supports the following commands:
 add_vm [Mul-choice STRING]: add_vm|rm_vm , add a VM for subsequent
  operations with the CLI or remove a previously added VM from the VM Power
  Manager

 rm_vm [Mul-choice STRING]: add_vm|rm_vm , add a VM for subsequent
  operations with the CLI or remove a previously added VM from the VM Power
  Manager

 add_channels [Fixed STRING]: add_channels  |all, add
  communication channels for the specified VM, the virtio channels must be
  enabled in the VM configuration(qemu/libvirt) and the associated VM must be
  active.  is a comma-separated list of channel numbers to add, using the
  keyword 'all' will attempt to add all channels for the VM

 set_channel_status [Fixed STRING]:
  set_channel_status  |all enabled|disabled,  enable or disable
  the communication channels in list(comma-separated) for the specified VM,
  alternatively list can be replaced with keyword 'all'. Disabled channels will
  still receive packets on the host, however the commands they specify will be
  ignored. Set status to 'enabled' to begin processing requests again.

 show_vm [Fixed STRING]: show_vm , prints the information on the
  specified VM(s), the information lists the number of vCPUS, the pinning to
  pCPU(s) as a bit mask, along with any communication channels associated with
  each VM

 show_cpu_freq_mask [Fixed STRING]: show_cpu_freq_mask , Get the current
  frequency for each core specified in the mask

 set_cpu_freq_mask [Fixed STRING]: set_cpu_freq  ,
  Set the current frequency for the cores specified in  by scaling
  each up/down/min/max.

 show_cpu_freq [Fixed STRING]: Get the current frequency for the specified core

 set_cpu_freq [Fixed STRING]: set_cpu_freq  ,
  Set the current frequency for the specified core by scaling up/down/min/max

 quit [Fixed STRING]: close the application

Signed-off-by: Alan Carew 
---
 examples/vm_power_manager/vm_power_cli.c | 669 +++
 examples/vm_power_manager/vm_power_cli.h |  47 +++
 2 files changed, 716 insertions(+)
 create mode 100644 examples/vm_power_manager/vm_power_cli.c
 create mode 100644 examples/vm_power_manager/vm_power_cli.h

diff --git a/examples/vm_power_manager/vm_power_cli.c b/examples/vm_power_manager/vm_power_cli.c
new file mode 100644
index 000..a8cfb3a
--- /dev/null
+++ b/examples/vm_power_manager/vm_power_cli.c
@@ -0,0 +1,669 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "vm_power_cli.h"
+#include "channel_manager.h"
+#include "channel_monitor.h"
+#include "power_manager.h"
+#include "channel_commands.h"
+
+struct cmd_quit_result {
+   cmdline_fixed_string_t quit;
+};
+
+static void cmd_quit_parsed(__attribute__((unused)) void *parsed_result,
+   struct cmdline *cl,
+   __attribute__((unused)) void *data)
+{
+   channel_monitor_exit();
+   channel_manager_exit();
+   power_manager_exit();
+   cmdline_quit(cl);
+}
+
+cmdline_parse_token_string_t cmd_quit_quit =
+   TOKEN_STRING_INITIALIZER(struct cmd_quit_result, qu

[dpdk-dev] [PATCH v3 03/10] CPU Frequency Power Management(Host).

2014-09-29 Thread Alan Carew

A wrapper around librte_power(using ACPI cpufreq), providing locking around the
non-threadsafe library, allowing for frequency changes based on core masks and
core numbers from both the CLI thread and epoll monitor thread.
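
The locking idea can be sketched as follows, loosely mirroring the per-core
scheme described above. This is an illustration under assumptions: it uses a
C11 atomic_flag as a stand-in spinlock instead of rte_spinlock, the frequency
call is faked with a counter, and all names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_MAX_CPUS 64

struct sketch_core {
	atomic_flag lock;  /* stand-in for a per-core spinlock */
	int freq_index;    /* stand-in for hardware state, 0 = lowest */
};

static struct sketch_core sketch_cores[SKETCH_MAX_CPUS];
static uint64_t sketch_enabled_cpus;

static void
sketch_power_init(uint64_t enabled_mask)
{
	for (unsigned i = 0; i < SKETCH_MAX_CPUS; i++) {
		atomic_flag_clear(&sketch_cores[i].lock);
		sketch_cores[i].freq_index = 0;
	}
	sketch_enabled_cpus = enabled_mask;
}

/* Stand-in for the non-thread-safe library call; returns 1 on change. */
static int
sketch_unsafe_freq_up(unsigned core)
{
	assert(core < SKETCH_MAX_CPUS);
	sketch_cores[core].freq_index++;
	return 1;
}

/* Scale one core up while holding that core's lock. */
static int
sketch_scale_up(unsigned core)
{
	int ret;

	if (core >= SKETCH_MAX_CPUS)
		return -1;
	if (!(sketch_enabled_cpus & (1ULL << core)))
		return -1;
	while (atomic_flag_test_and_set_explicit(&sketch_cores[core].lock,
			memory_order_acquire))
		; /* spin until the other thread releases the lock */
	ret = sketch_unsafe_freq_up(core);
	atomic_flag_clear_explicit(&sketch_cores[core].lock,
			memory_order_release);
	return ret;
}

/* Scale every enabled core in the mask; disabled cores are skipped. */
static int
sketch_scale_mask_up(uint64_t core_mask)
{
	int ret = 0;

	for (unsigned i = 0; i < SKETCH_MAX_CPUS; i++) {
		if (!(core_mask & sketch_enabled_cpus & (1ULL << i)))
			continue;
		if (sketch_scale_up(i) != 1)
			ret = -1;
	}
	return ret;
}
```

One lock per core lets the CLI thread and the monitor thread change different
cores at the same time, while requests for the same core are serialized.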

Signed-off-by: Alan Carew 
---
 examples/vm_power_manager/power_manager.c | 244 ++
 examples/vm_power_manager/power_manager.h | 188 +++
 2 files changed, 432 insertions(+)
 create mode 100644 examples/vm_power_manager/power_manager.c
 create mode 100644 examples/vm_power_manager/power_manager.h

diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
new file mode 100644
index 000..b7b1fca
--- /dev/null
+++ b/examples/vm_power_manager/power_manager.c
@@ -0,0 +1,244 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#include "power_manager.h"
+
+#define RTE_LOGTYPE_POWER_MANAGER RTE_LOGTYPE_USER1
+
+#define POWER_SCALE_CORE(DIRECTION, core_num , ret) do { \
+   if (core_num >= POWER_MGR_MAX_CPUS) \
+   return -1; \
+   if (!(global_enabled_cpus & (1ULL << core_num))) \
+   return -1; \
+   rte_spinlock_lock(&global_core_freq_info[core_num].power_sl); \
+   ret = rte_power_freq_##DIRECTION(core_num); \
+   rte_spinlock_unlock(&global_core_freq_info[core_num].power_sl); \
+} while (0)
+
+#define POWER_SCALE_MASK(DIRECTION, core_mask, ret) do { \
+   int i; \
+   for (i = 0; core_mask; core_mask &= ~(1ULL << i++)) { \
+   if ((core_mask >> i) & 1) { \
+   if (!(global_enabled_cpus & (1ULL << i))) \
+   continue; \
+   rte_spinlock_lock(&global_core_freq_info[i].power_sl); \
+   if (rte_power_freq_##DIRECTION(i) != 1) \
+   ret = -1; \
+   rte_spinlock_unlock(&global_core_freq_info[i].power_sl); \
+   } \
+   } \
+} while (0)
+
+struct freq_info {
+   rte_spinlock_t power_sl;
+   uint32_t freqs[RTE_MAX_LCORE_FREQS];
+   unsigned num_freqs;
+} __rte_cache_aligned;
+
+static struct freq_info global_core_freq_info[POWER_MGR_MAX_CPUS];
+
+static uint64_t global_enabled_cpus;
+
+#define SYSFS_CPU_PATH "/sys/devices/system/cpu/cpu%u/topology/core_id"
+
+static unsigned
+set_host_cpus_mask(void)
+{
+   char path[PATH_MAX];
+   unsigned i;
+   unsigned num_cpus = 0;
+   for (i = 0; i < POWER_MGR_MAX_CPUS; i++) {
+   snprintf(path, sizeof(path), SYSFS_CPU_PATH, i);
+   if (access(path, F_OK) == 0) {
+   global_enabled_cpus |= 1ULL << i;
+   num_cpus++;
+   } else
+   return num_cpus;
+   }
+   return num_cpus;
+}
+
+int
+power_manager_init(void)
+{
+   unsigned i, num_cpus;
+   uint64_t cpu_mask;
+   int ret = 0;
+
+   num_cpus = set_host_cpus_mask();
+   if (num_cpus == 0) {
+   RTE_LOG(ERR, POWER_MANAGER, "Unable to detect host CPUs, please "
+   "ensure that sufficient privileges exist to inspect sysfs\n");
+   return -1;
+

[dpdk-dev] [PATCH v3 04/10] VM Power Management application and Makefile.

2014-09-29 Thread Alan Carew

Launches the CLI thread and the monitor thread and initialises resources.
Requires a minimum of two lcores to run; additional cores specified by the EAL
core mask are not used.

Signed-off-by: Alan Carew 
---
 examples/vm_power_manager/Makefile |  57 +++
 examples/vm_power_manager/main.c   | 113 +
 examples/vm_power_manager/main.h   |  52 +
 3 files changed, 222 insertions(+)
 create mode 100644 examples/vm_power_manager/Makefile
 create mode 100644 examples/vm_power_manager/main.c
 create mode 100644 examples/vm_power_manager/main.h

diff --git a/examples/vm_power_manager/Makefile b/examples/vm_power_manager/Makefile
new file mode 100644
index 000..7d6f943
--- /dev/null
+++ b/examples/vm_power_manager/Makefile
@@ -0,0 +1,57 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-default-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = vm_power_mgr
+
+# all source are stored in SRCS-y
+SRCS-y := main.c vm_power_cli.c power_manager.c channel_manager.c
+SRCS-y += channel_monitor.c
+
+CFLAGS += -O3 -lvirt -I$(RTE_SDK)/lib/librte_power/
+CFLAGS += $(WERROR_FLAGS)
+
+# workaround for a gcc bug with noreturn attribute
+# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
+ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
+CFLAGS_main.o += -Wno-return-type
+endif
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
new file mode 100644
index 000..e819e6f
--- /dev/null
+++ b/examples/vm_power_manager/main.c
@@ -0,0 +1,113 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OT

[dpdk-dev] [PATCH v3 06/10] VM communication channels for VM Power Management(Guest).

2014-09-29 Thread Alan Carew

Allows for the opening of Virtio-Serial devices on a VM, where a DPDK
application can send packets to the host-based monitor. The packet format is
specified in channel_commands.h.
Each device appears as a serial device in path
/dev/virtio-ports/virtio.serial.port.. where each lcore
in a DPDK application has exclusive access to a device/channel.
Each channel is opened in non-blocking mode; after a successful open, a test
packet is sent to the host to ensure the host side is monitoring.
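
As a rough illustration of the open sequence described above, the sketch below builds the per-lcore path, opens it read/write and then switches the descriptor to non-blocking mode. It uses only POSIX calls; open_channel_nonblocking() and the 4096-byte path buffer are assumptions made for this example, and the test-packet step is left out.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: open "<base_path>.<lcore_id>" read/write and put the
 * descriptor into non-blocking mode. Returns the fd, or -1 on any failure.
 */
static int open_channel_nonblocking(const char *base_path, unsigned lcore_id)
{
	char fd_path[4096]; /* assumed buffer size for this sketch */
	int fd, flags, n;

	n = snprintf(fd_path, sizeof(fd_path), "%s.%u", base_path, lcore_id);
	if (n < 0 || (size_t)n >= sizeof(fd_path))
		return -1; /* path was truncated */

	fd = open(fd_path, O_RDWR);
	if (fd < 0)
		return -1;

	/* Read the current flags, then add O_NONBLOCK to them. */
	flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
		close(fd); /* do not leak the descriptor on failure */
		return -1;
	}
	return fd;
}
```

Closing the descriptor on the fcntl failure path keeps the helper from leaking an fd when the mode change is refused.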

Signed-off-by: Alan Carew 
---
 lib/librte_power/guest_channel.c | 162 +++
 lib/librte_power/guest_channel.h |  89 +
 2 files changed, 251 insertions(+)
 create mode 100644 lib/librte_power/guest_channel.c
 create mode 100644 lib/librte_power/guest_channel.h

diff --git a/lib/librte_power/guest_channel.c b/lib/librte_power/guest_channel.c
new file mode 100644
index 000..2295665
--- /dev/null
+++ b/lib/librte_power/guest_channel.c
@@ -0,0 +1,162 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+
+#include 
+#include 
+
+#include "guest_channel.h"
+#include "channel_commands.h"
+
+#define RTE_LOGTYPE_GUEST_CHANNEL RTE_LOGTYPE_USER1
+
+static int global_fds[RTE_MAX_LCORE];
+
+int
+guest_channel_host_connect(const char *path, unsigned lcore_id)
+{
+   int flags, ret;
+   struct channel_packet pkt;
+   char fd_path[PATH_MAX];
+   int fd = -1;
+
+   if (lcore_id >= RTE_MAX_LCORE) {
+   RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is out of range 0...%d\n",
+   lcore_id, RTE_MAX_LCORE-1);
+   return -1;
+   }
+   /* check if path is already open */
+   if (global_fds[lcore_id] != 0) {
+   RTE_LOG(ERR, GUEST_CHANNEL, "Channel(%u) is already open with fd %d\n",
+   lcore_id, global_fds[lcore_id]);
+   return -1;
+   }
+
+   snprintf(fd_path, PATH_MAX, "%s.%u", path, lcore_id);
+   RTE_LOG(INFO, GUEST_CHANNEL, "Opening channel '%s' for lcore %u\n",
+   fd_path, lcore_id);
+   fd = open(fd_path, O_RDWR);
+   if (fd < 0) {
+   RTE_LOG(ERR, GUEST_CHANNEL, "Unable to connect to '%s' with error "
+   "%s\n", fd_path, strerror(errno));
+   return -1;
+   }
+
+   flags = fcntl(fd, F_GETFL, 0);
+   if (flags < 0) {
+   RTE_LOG(ERR, GUEST_CHANNEL, "Failed on fcntl get flags for file %s\n",
+   fd_path);
+   goto error;
+   }
+
+   flags |= O_NONBLOCK;
+   if (fcntl(fd, F_SETFL, flags) < 0) {
+   RTE_LOG(ERR, GUEST_CHANNEL, "Failed on setting non-blocking mode for "
+   "file %s\n", fd_path);
+   goto error;
+   }
+   /* QEMU needs a delay after connection */
+   sleep(1);
+
+   /* Send a test packet, this command is ignored by the host, but a successful
+* send indicates that the host endpoint is monitoring.
+*/
+   pkt.command = CPU_POWER_CONNECT;
+   global_fds[lcore_id] = fd;
+   ret = guest_channel_send_msg(&pkt, lcore_id);
+   if (ret != 0) {
+

[dpdk-dev] [PATCH v3 05/10] VM Power Management CLI(Guest).

2014-09-29 Thread Alan Carew

Provides a small sample application (guest_vm_power_mgr) to run on a VM.
The application is run by providing a core mask (-c) and number of memory
channels (-n). The core mask determines the lcore channels that the application
attempts to open. A maximum of 64 channels per VM is allowed. The channels must
be monitored by the host.
After successful initialisation, a CPU frequency command can be sent to the host
using:
set_cpu_freq  .

Signed-off-by: Alan Carew 
---
 examples/vm_power_manager/guest_cli/Makefile   |  56 
 examples/vm_power_manager/guest_cli/main.c |  87 
 examples/vm_power_manager/guest_cli/main.h |  52 +++
 .../guest_cli/vm_power_cli_guest.c | 155 +
 .../guest_cli/vm_power_cli_guest.h |  55 
 5 files changed, 405 insertions(+)
 create mode 100644 examples/vm_power_manager/guest_cli/Makefile
 create mode 100644 examples/vm_power_manager/guest_cli/main.c
 create mode 100644 examples/vm_power_manager/guest_cli/main.h
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.c
 create mode 100644 examples/vm_power_manager/guest_cli/vm_power_cli_guest.h

diff --git a/examples/vm_power_manager/guest_cli/Makefile b/examples/vm_power_manager/guest_cli/Makefile
new file mode 100644
index 000..167a7ed
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/Makefile
@@ -0,0 +1,56 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-default-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = guest_vm_power_mgr
+
+# all source are stored in SRCS-y
+SRCS-y := main.c vm_power_cli_guest.c
+
+CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power/
+CFLAGS += $(WERROR_FLAGS)
+
+# workaround for a gcc bug with noreturn attribute
+# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
+ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
+CFLAGS_main.o += -Wno-return-type
+endif
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/vm_power_manager/guest_cli/main.c b/examples/vm_power_manager/guest_cli/main.c
new file mode 100644
index 000..1e4767a
--- /dev/null
+++ b/examples/vm_power_manager/guest_cli/main.c
@@ -0,0 +1,87 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPL

[dpdk-dev] [PATCH v3 08/10] Packet format for VM Power Management(Host and Guest).

2014-09-29 Thread Alan Carew

Provides a command packet format for host and guest.
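
For illustration, a guest could fill in such a packet along the following lines. The struct layout and constants are copied from channel_commands.h in this patch; make_scale_packet() is a hypothetical helper added only for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values and layout as defined in channel_commands.h in this patch. */
#define CPU_POWER            1
#define CPU_POWER_SCALE_UP   1
#define CPU_POWER_SCALE_DOWN 2
#define CPU_POWER_SCALE_MAX  3
#define CPU_POWER_SCALE_MIN  4

struct channel_packet {
	uint64_t resource_id; /* core_num, device */
	uint32_t unit;        /* scale down/up/min/max */
	uint32_t command;     /* Power, IO, etc */
};

/* Hypothetical helper: build a CPU_POWER request for one core. */
static struct channel_packet
make_scale_packet(unsigned core_num, uint32_t scale)
{
	struct channel_packet pkt;

	memset(&pkt, 0, sizeof(pkt)); /* start from a fully zeroed packet */
	pkt.resource_id = core_num;
	pkt.unit = scale;
	pkt.command = CPU_POWER;
	return pkt;
}
```

Because host and guest share the same header, both sides agree on field order and widths without any extra serialisation step.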

Signed-off-by: Alan Carew 
---
 lib/librte_power/channel_commands.h | 77 +
 1 file changed, 77 insertions(+)
 create mode 100644 lib/librte_power/channel_commands.h

diff --git a/lib/librte_power/channel_commands.h b/lib/librte_power/channel_commands.h
new file mode 100644
index 000..7e78a8b
--- /dev/null
+++ b/lib/librte_power/channel_commands.h
@@ -0,0 +1,77 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef CHANNEL_COMMANDS_H_
+#define CHANNEL_COMMANDS_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include 
+
+/* Maximum number of CPUs */
+#define CHANNEL_CMDS_MAX_CPUS 64
+#if CHANNEL_CMDS_MAX_CPUS > 64
+#error Maximum number of cores is 64, overflow is guaranteed to \
+   cause problems with VM Power Management
+#endif
+
+/* Maximum number of channels per VM */
+#define CHANNEL_CMDS_MAX_VM_CHANNELS 64
+
+/* Maximum number of channels per VM */
+#define CHANNEL_CMDS_MAX_VM_CHANNELS 64
+
+/* Valid Commands */
+#define CPU_POWER   1
+#define CPU_POWER_CONNECT   2
+
+/* CPU Power Command Scaling */
+#define CPU_POWER_SCALE_UP  1
+#define CPU_POWER_SCALE_DOWN 2
+#define CPU_POWER_SCALE_MAX 3
+#define CPU_POWER_SCALE_MIN 4
+
+struct channel_packet {
+   uint64_t resource_id; /**< core_num, device */
+   uint32_t unit; /**< scale down/up/min/max */
+   uint32_t command; /**< Power, IO, etc */
+};
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* CHANNEL_COMMANDS_H_ */
-- 
1.9.3

[dpdk-dev] [PATCH v3 09/10] Build system integration for VM Power Management(Guest and Host)

2014-09-29 Thread Alan Carew

librte_power now contains both rte_power_acpi_cpufreq and rte_power_kvm_vm
implementations.

Signed-off-by: Alan Carew 
---
 lib/librte_power/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/librte_power/Makefile b/lib/librte_power/Makefile
index 6185812..d672a5a 100644
--- a/lib/librte_power/Makefile
+++ b/lib/librte_power/Makefile
@@ -37,7 +37,8 @@ LIB = librte_power.a
 CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3 -fno-strict-aliasing

 # all source are stored in SRCS-y
-SRCS-$(CONFIG_RTE_LIBRTE_POWER) := rte_power.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) := rte_power.c rte_power_acpi_cpufreq.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) += rte_power_kvm_vm.c guest_channel.c

 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_POWER)-include := rte_power.h
-- 
1.9.3

[dpdk-dev] [PATCH v3 07/10] librte_power common interface for Guest and Host

2014-09-29 Thread Alan Carew

Moved the current librte_power implementation to rte_power_acpi_cpufreq, with
renaming of functions only.
Added rte_power_kvm_vm implementation to support Power Management from a VM.

librte_power now hides the implementation based on the environment used.
A new call, rte_power_set_env(), can explicitly set the environment; if it is
not called, auto-detection takes place.

rte_power_kvm_vm is a subset of the librte_power APIs; the following are supported:
 rte_power_init(unsigned lcore_id)
 rte_power_exit(unsigned lcore_id)
 rte_power_freq_up(unsigned lcore_id)
 rte_power_freq_down(unsigned lcore_id)
 rte_power_freq_min(unsigned lcore_id)
 rte_power_freq_max(unsigned lcore_id)

All other APIs are unsupported and return -ENOTSUP.
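
The dispatch idea described above can be sketched with plain function pointers. This is an illustrative sketch, not the patch's implementation: only PM_ENV_NOT_SET appears in the diff shown here, so the other enum values, power_set_env() and the stub backends are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Only PM_ENV_NOT_SET is visible in the patch; the others are assumed. */
enum pm_env { PM_ENV_NOT_SET, PM_ENV_ACPI_CPUFREQ, PM_ENV_KVM_VM };

typedef int (*freq_change_fn)(unsigned lcore_id);
typedef int (*get_freq_fn)(unsigned lcore_id);

/* Public entry points are pointers, bound once the environment is known. */
static freq_change_fn power_freq_up = NULL;
static get_freq_fn power_get_freq = NULL;

/* Stub backends standing in for the two implementations. */
static int acpi_freq_up(unsigned lcore_id) { (void)lcore_id; return 1; }
static int acpi_get_freq(unsigned lcore_id) { (void)lcore_id; return 0; }
static int vm_freq_up(unsigned lcore_id) { (void)lcore_id; return 1; }
/* An API the VM backend does not support reports -ENOTSUP. */
static int vm_get_freq(unsigned lcore_id) { (void)lcore_id; return -ENOTSUP; }

/* Hypothetical selector: bind the pointers for the chosen environment. */
static int power_set_env(enum pm_env env)
{
	switch (env) {
	case PM_ENV_ACPI_CPUFREQ:
		power_freq_up = acpi_freq_up;
		power_get_freq = acpi_get_freq;
		return 0;
	case PM_ENV_KVM_VM:
		power_freq_up = vm_freq_up;
		power_get_freq = vm_get_freq;
		return 0;
	default:
		return -1; /* unknown or unset environment */
	}
}
```

Callers keep using the same names on host and guest; only the binding step differs between environments.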

Signed-off-by: Alan Carew 
---
 lib/librte_power/rte_power.c  | 540 -
 lib/librte_power/rte_power.h  | 120 +--
 lib/librte_power/rte_power_acpi_cpufreq.c | 545 ++
 lib/librte_power/rte_power_acpi_cpufreq.h | 192 +++
 lib/librte_power/rte_power_common.h   |  39 +++
 lib/librte_power/rte_power_kvm_vm.c   | 135 
 lib/librte_power/rte_power_kvm_vm.h   | 179 ++
 7 files changed, 1248 insertions(+), 502 deletions(-)
 create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.c
 create mode 100644 lib/librte_power/rte_power_acpi_cpufreq.h
 create mode 100644 lib/librte_power/rte_power_common.h
 create mode 100644 lib/librte_power/rte_power_kvm_vm.c
 create mode 100644 lib/librte_power/rte_power_kvm_vm.h

diff --git a/lib/librte_power/rte_power.c b/lib/librte_power/rte_power.c
index 856da9a..998ed1c 100644
--- a/lib/librte_power/rte_power.c
+++ b/lib/librte_power/rte_power.c
@@ -31,515 +31,113 @@
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */

-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include 
 #include 

 #include "rte_power.h"
+#include "rte_power_acpi_cpufreq.h"
+#include "rte_power_kvm_vm.h"
+#include "rte_power_common.h"

-#ifdef RTE_LIBRTE_POWER_DEBUG
-#define POWER_DEBUG_TRACE(fmt, args...) do { \
-   RTE_LOG(ERR, POWER, "%s: " fmt, __func__, ## args); \
-   } while (0)
-#else
-#define POWER_DEBUG_TRACE(fmt, args...)
-#endif
-
-#define FOPEN_OR_ERR_RET(f, retval) do { \
-   if ((f) == NULL) { \
-   RTE_LOG(ERR, POWER, "File not openned\n"); \
-   return (retval); \
-   } \
-} while(0)
-
-#define FOPS_OR_NULL_GOTO(ret, label) do { \
-   if ((ret) == NULL) { \
-   RTE_LOG(ERR, POWER, "fgets returns nothing\n"); \
-   goto label; \
-   } \
-} while(0)
-
-#define FOPS_OR_ERR_GOTO(ret, label) do { \
-   if ((ret) < 0) { \
-   RTE_LOG(ERR, POWER, "File operations failed\n"); \
-   goto label; \
-   } \
-} while(0)
-
-#define STR_SIZE 1024
-#define POWER_CONVERT_TO_DECIMAL 10
+enum power_management_env global_default_env = PM_ENV_NOT_SET;

-#define POWER_GOVERNOR_USERSPACE "userspace"
-#define POWER_SYSFILE_GOVERNOR   \
-   "/sys/devices/system/cpu/cpu%u/cpufreq/scaling_governor"
-#define POWER_SYSFILE_AVAIL_FREQ \
-   "/sys/devices/system/cpu/cpu%u/cpufreq/scaling_available_frequencies"
-#define POWER_SYSFILE_SETSPEED   \
-   "/sys/devices/system/cpu/cpu%u/cpufreq/scaling_setspeed"
+volatile uint32_t global_env_cfg_status = 0;

-enum power_state {
-   POWER_IDLE = 0,
-   POWER_ONGOING,
-   POWER_USED,
-   POWER_UNKNOWN
-};
+/* function pointers */
+rte_power_freqs_t rte_power_freqs  = NULL;
+rte_power_get_freq_t rte_power_get_freq = NULL;
+rte_power_set_freq_t rte_power_set_freq = NULL;
+rte_power_freq_change_t rte_power_freq_up = NULL;
+rte_power_freq_change_t rte_power_freq_down = NULL;
+rte_power_freq_change_t rte_power_freq_max = NULL;
+rte_power_freq_change_t rte_power_freq_min = NULL;

-/**
- * Power info per lcore.
- */
-struct rte_power_info {
-   unsigned lcore_id;   /**< Logical core id */
-   uint32_t freqs[RTE_MAX_LCORE_FREQS]; /**< Frequency array */
-   uint32_t nb_freqs;   /**< number of available freqs */
-   FILE *f; /**< FD of scaling_setspeed */
-   char governor_ori[32];   /**< Original governor name */
-   uint32_t curr_idx;   /**< Freq index in freqs array */
-   volatile uint32_t state; /**< Power in use state */
-} __rte_cache_aligned;
-
-static struct rte_power_info lcore_power_info[RTE_MAX_LCORE];
-
-/**
- * It is to set specific freq for specific logical core, according to the index
- * of supported frequencies.
- */
-static int
-set_freq_internal(struct rte_power_info *pi, uint32_t idx)
+int
+rte_power_set_env(enum power_management_env env)
 {
-   if (idx >= RTE_MAX_LCORE_FREQS || idx >= pi->nb_freqs) {
-   RTE_LOG(ERR, POWER, "Invalid frequency index %u, which "
-

[dpdk-dev] [PATCH v3 10/10] VM Power Management Unit Tests

2014-09-29 Thread Alan Carew

Updated the unit tests to cover both librte_power implementations as well as
the external API.

Signed-off-by: Alan Carew 
---
 app/test/Makefile  |   3 +-
 app/test/autotest_data.py  |  26 ++
 app/test/test_power.c  | 445 +++---
 app/test/test_power_acpi_cpufreq.c | 544 +
 app/test/test_power_kvm_vm.c   | 308 +
 5 files changed, 917 insertions(+), 409 deletions(-)
 create mode 100644 app/test/test_power_acpi_cpufreq.c
 create mode 100644 app/test/test_power_kvm_vm.c

diff --git a/app/test/Makefile b/app/test/Makefile
index 37a3772..03ade39 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -119,7 +119,8 @@ endif

 SRCS-$(CONFIG_RTE_LIBRTE_METER) += test_meter.c
 SRCS-$(CONFIG_RTE_LIBRTE_KNI) += test_kni.c
-SRCS-$(CONFIG_RTE_LIBRTE_POWER) += test_power.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) += test_power.c test_power_acpi_cpufreq.c
+SRCS-$(CONFIG_RTE_LIBRTE_POWER) += test_power_kvm_vm.c
 SRCS-y += test_common.c
 SRCS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += test_ivshmem.c

diff --git a/app/test/autotest_data.py b/app/test/autotest_data.py
index 878c72e..618a946 100644
--- a/app/test/autotest_data.py
+++ b/app/test/autotest_data.py
@@ -425,6 +425,32 @@ non_parallel_test_group_list = [
]
 },
 {
+   "Prefix" :  "power_acpi_cpufreq",
+   "Memory" :  all_sockets(512),
+   "Tests" :
+   [
+   {
+"Name" :   "Power ACPI cpufreq autotest",
+"Command" :"power_acpi_cpufreq_autotest",
+"Func" :   default_autotest,
+"Report" : None,
+   },
+   ]
+},
+{
+   "Prefix" :  "power_kvm_vm",
+   "Memory" :  "512",
+   "Tests" :
+   [
+   {
+"Name" :   "Power KVM VM  autotest",
+"Command" :"power_kvm_vm_autotest",
+"Func" :   default_autotest,
+"Report" : None,
+   },
+   ]
+},
+{
"Prefix" :  "lpm6",
"Memory" :  "512",
"Tests" :
diff --git a/app/test/test_power.c b/app/test/test_power.c
index d9eb420..64a2305 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -41,437 +41,66 @@

 #include 

-#define TEST_POWER_LCORE_ID  2U
-#define TEST_POWER_LCORE_INVALID ((unsigned)RTE_MAX_LCORE)
-#define TEST_POWER_FREQS_NUM_MAX ((unsigned)RTE_MAX_LCORE_FREQS)
-
-#define TEST_POWER_SYSFILE_CUR_FREQ \
-   "/sys/devices/system/cpu/cpu%u/cpufreq/scaling_cur_freq"
-
-static uint32_t total_freq_num;
-static uint32_t freqs[TEST_POWER_FREQS_NUM_MAX];
-
-static int
-check_cur_freq(unsigned lcore_id, uint32_t idx)
-{
-#define TEST_POWER_CONVERT_TO_DECIMAL 10
-   FILE *f;
-   char fullpath[PATH_MAX];
-   char buf[BUFSIZ];
-   uint32_t cur_freq;
-   int ret = -1;
-
-   if (snprintf(fullpath, sizeof(fullpath),
-   TEST_POWER_SYSFILE_CUR_FREQ, lcore_id) < 0) {
-   return 0;
-   }
-   f = fopen(fullpath, "r");
-   if (f == NULL) {
-   return 0;
-   }
-   if (fgets(buf, sizeof(buf), f) == NULL) {
-   goto fail_get_cur_freq;
-   }
-   cur_freq = strtoul(buf, NULL, TEST_POWER_CONVERT_TO_DECIMAL);
-   ret = (freqs[idx] == cur_freq ? 0 : -1);
-
-fail_get_cur_freq:
-   fclose(f);
-
-   return ret;
-}
-
-/* Check rte_power_freqs() */
-static int
-check_power_freqs(void)
-{
-   uint32_t ret;
-
-   total_freq_num = 0;
-   memset(freqs, 0, sizeof(freqs));
-
-   /* test with an invalid lcore id */
-   ret = rte_power_freqs(TEST_POWER_LCORE_INVALID, freqs,
-   TEST_POWER_FREQS_NUM_MAX);
-   if (ret > 0) {
-   printf("Unexpectedly get available freqs successfully on "
-   "lcore %u\n", TEST_POWER_LCORE_INVALID);
-   return -1;
-   }
-
-   /* test with NULL buffer to save available freqs */
-   ret = rte_power_freqs(TEST_POWER_LCORE_ID, NULL,
-   TEST_POWER_FREQS_NUM_MAX);
-   if (ret > 0) {
-   printf("Unexpectedly get available freqs successfully with "
-   "NULL buffer on lcore %u\n", TEST_POWER_LCORE_ID);
-   return -1;
-   }
-
-   /* test of getting zero number of freqs */
-   ret = rte_power_freqs(TEST_POWER_LCORE_ID, freqs, 0);
-   if (ret > 0) {
-   printf("Unexpectedly get available freqs successfully with "
-   "zero buffer size on lcore %u\n", TEST_POWER_LCORE_ID);
-   return -1;
-   }
-
-   /* test with all valid input parameters */
-   ret = rte_power_freqs(TEST_POWER_LCORE_ID, freqs,
-   TEST_POWER_FREQS_NUM_MAX);
-   if (ret == 0 || ret > TEST_POWER_FREQS_NUM_MAX) {
-   printf("Fail to get available fre

[dpdk-dev] [PATCH 1/7] Split atomic operations to architecture specific

2014-09-29 Thread Neil Horman

On Mon, Sep 29, 2014 at 12:05:22PM +0100, Bruce Richardson wrote:
> On Fri, Sep 26, 2014 at 05:33:32AM -0400, Chao Zhu wrote:
> > This patch splits the atomic operations from DPDK and push them to
> > architecture specific arch directories, so that other processor
> > architecture to support DPDK can be easily adopted.
> > 
> > Signed-off-by: Chao Zhu 
> > ---
> >  lib/librte_eal/common/Makefile |2 +-
> >  .../common/include/i686/arch/rte_atomic_arch.h |  378
> >  lib/librte_eal/common/include/rte_atomic.h |  172 +
> >  .../common/include/x86_64/arch/rte_atomic_arch.h   |  378
> >  4 files changed, 772 insertions(+), 158 deletions(-)
> >  create mode 100644 lib/librte_eal/common/include/i686/arch/rte_atomic_arch.h
> >  create mode 100644 lib/librte_eal/common/include/x86_64/arch/rte_atomic_arch.h
> > 
> <...snip...>
> > +#define rte_compiler_barrier() rte_arch_compiler_barrier()
> 
> Small question: shouldn't the compiler barrier be independent of 
> architecture?
> 
Agreed. I thought compiler intrinsics were used to define barriers, regardless
of arch (__memory_barrier() is the gcc intrinsic, IIRC).
Neil

> /Bruce
> 
>
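
For context on the compiler-barrier point discussed above, a barrier that is independent of the target architecture is commonly written with GCC or Clang as an empty asm statement with a "memory" clobber. The sketch below assumes one of those compilers; compiler_barrier(), publish(), payload and ready are names invented for the example. To my knowledge, __memory_barrier() is the Intel compiler's intrinsic rather than a GCC one, so treat that detail as unverified.

```c
#include <assert.h>

/*
 * Compiler-only barrier (GCC/Clang extension): emits no instruction, but
 * stops the compiler from moving memory accesses across this point.
 * It is not a CPU memory fence.
 */
#define compiler_barrier() __asm__ __volatile__("" : : : "memory")

static int payload;
static int ready;

/* Write the data first, then the flag, in program order. */
static void publish(int value)
{
	payload = value;
	compiler_barrier(); /* keep the payload store ahead of the flag store */
	ready = 1;
}
```

Nothing in the macro depends on the instruction set, which is why it can live in a common header rather than a per-architecture one.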

[dpdk-dev] [PATCH 1/4 v3] compat: Add infrastructure to support symbol versioning

2014-09-29 Thread Neil Horman

Add initial pass header files to support symbol versioning.

---
Change notes
v2)
* Fixed ifdef in rte_compat.h to test for RTE_BUILD_SHARED_LIB instead of the
non-existent RTE_SYMBOL_VERSIONING

* Fixed VERSION_SYMBOL macro to add the needed extra @ to make versioning work
properly

* Improved/Clarified documentation

v3)
* Added missing macros to fully export the symver directive specification

Signed-off-by: Neil Horman 
CC: Thomas Monjalon 
CC: "Richardson, Bruce" 
CC: "Gonzalez Monroy, Sergio" 
---
 lib/Makefile   |  1 +
 lib/librte_compat/Makefile | 38 ++
 lib/librte_compat/rte_compat.h | 90 ++
 mk/rte.lib.mk  |  6 +++
 4 files changed, 135 insertions(+)
 create mode 100644 lib/librte_compat/Makefile
 create mode 100644 lib/librte_compat/rte_compat.h

diff --git a/lib/Makefile b/lib/Makefile
index 10c5bb3..a85b55b 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -32,6 +32,7 @@
 include $(RTE_SDK)/mk/rte.vars.mk

 DIRS-$(CONFIG_RTE_LIBC) += libc
+DIRS-y += librte_compat
 DIRS-$(CONFIG_RTE_LIBRTE_EAL) += librte_eal
 DIRS-$(CONFIG_RTE_LIBRTE_MALLOC) += librte_malloc
 DIRS-$(CONFIG_RTE_LIBRTE_RING) += librte_ring
diff --git a/lib/librte_compat/Makefile b/lib/librte_compat/Makefile
new file mode 100644
index 000..3415c7b
--- /dev/null
+++ b/lib/librte_compat/Makefile
@@ -0,0 +1,38 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Neil Horman 
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+
+# install includes
+SYMLINK-y-include := rte_compat.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_compat/rte_compat.h b/lib/librte_compat/rte_compat.h
new file mode 100644
index 000..0b76771
--- /dev/null
+++ b/lib/librte_compat/rte_compat.h
@@ -0,0 +1,90 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Neil Horman .
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE

[dpdk-dev] [PATCH v2 0/5] Mbuf Structure Rework, part 3

2014-09-29 Thread De Lara Guarch, Pablo



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Richardson, Bruce
> Sent: Tuesday, September 23, 2014 12:08 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2 0/5] Mbuf Structure Rework, part 3
> 
> This is the final planned set of patches to make changes to the mbuf
> data structure and associated files. This patch set makes more changes to
> help improve performance following the mbuf changes and adds in two new
> fields into the mbuf structure.
> 
> It is planned to add other fields other than the two provided here, but
> patches for adding those fields will be included in the patch sets for the
> changes making use of those fields, since adding them does not affect, or
> move, any other mbuf fields.
> 
> Changes in V2:
> * Updated userdata pointer in mbuf to always be 8 bytes big
> * Updated a number of commit messages to have more details about the
> performance benefits of the changes proposed in the patches
> * Removed old patch 5 which added the second vlan tag, and replaced it with
> a new, smaller patch which just moves the existing vlan_tci field above the
> 16-bit reserved space.
> 
> Bruce Richardson (5):
>   mbuf: ensure next pointer is set to null on free
>   ixgbe: add prefetch to improve slow-path tx perf
>   testpmd: Change rxfreet default to 32
>   mbuf: add userdata pointer field
>   mbuf: switch vlan_tci and reserved2 fields
> 
>  app/test-pmd/testpmd.c  |  4 +++-
>  .../linuxapp/eal/include/exec-env/rte_kni_common.h  |  6 --
>  lib/librte_mbuf/rte_mbuf.h  | 12 ++--
>  lib/librte_pmd_ixgbe/ixgbe_rxtx.c   | 13 
> -
>  4 files changed, 25 insertions(+), 10 deletions(-)
> 
> --
> 1.9.3

Acked-by: Pablo de Lara
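
One common way to keep a pointer field "always 8 bytes big", as the V2 notes
above put it, is to overlay it with a 64-bit integer. The sketch below assumes
that approach; the type and helper names are hypothetical and the actual patch
may lay the field out differently.

```c
#include <stdint.h>

/*
 * Hypothetical layout: the union is as large as its largest member, so it
 * occupies 8 bytes on both 32-bit and 64-bit builds and the fields that
 * follow it keep the same offsets.
 */
union sketch_userdata {
	void *userdata;    /* what the application reads and writes */
	uint64_t udata64;  /* pads the slot to 8 bytes on 32-bit targets */
};

/* Compile-time check of the size guarantee (C11). */
_Static_assert(sizeof(union sketch_userdata) == 8,
	       "userdata slot must be 8 bytes");

static void sketch_set_userdata(union sketch_userdata *u, void *p)
{
	u->udata64 = 0;   /* clear the padding bytes on 32-bit builds */
	u->userdata = p;
}

static void *sketch_get_userdata(const union sketch_userdata *u)
{
	return u->userdata;
}
```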

[dpdk-dev] [PATCH v2 04/18] ixgbe: Support cloud mode in IXGBE base code

2014-09-29 Thread Neil Horman

On Mon, Sep 29, 2014 at 03:16:12PM +0800, Ouyang Changchun wrote:
> This patch supports cloud mode in IXGBE base code.
> 
> Signed-off-by: Changchun Ouyang 
> ---
>  lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c | 70 
> 
>  lib/librte_pmd_ixgbe/ixgbe/ixgbe_type.h  | 10 +
>  2 files changed, 80 insertions(+)
> 
Could you please add a little bit of description as to what cloud mode is here?
The changelogs in this entire series are a bit spartan.  More detail would be
appreciated across the board.

Also (and this is likely just my lack of understanding regarding how this module
works), I notice there are two implementations of
ixgbe_fdir_set_input_mask_82599 (one in the kni library and one in the pmd
here).  All the call sites for this function appear to use the former
implementation (to judge by the function prototype used).  How is an application
expected to reach this function?  I don't see any generic api for this sort of
work.

Neil

> diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c 
> b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
> index 126aa24..adf0e52 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_82599.c
> @@ -1497,6 +1497,9 @@ s32 ixgbe_init_fdir_perfect_82599(struct ixgbe_hw *hw, 
> u32 fdirctrl,
>   (0xA << IXGBE_FDIRCTRL_MAX_LENGTH_SHIFT) |
>   (4 << IXGBE_FDIRCTRL_FULL_THRESH_SHIFT);
>  
> + if (cloud_mode)
> + fdirctrl |=(IXGBE_FDIRCTRL_FILTERMODE_CLOUD <<
> + IXGBE_FDIRCTRL_FILTERMODE_SHIFT);
>  
>   /* write hashes and fdirctrl register, poll for completion */
>   ixgbe_fdir_enable_82599(hw, fdirctrl);
> @@ -1766,6 +1769,7 @@ s32 ixgbe_fdir_set_input_mask_82599(struct ixgbe_hw *hw,
>   /* mask IPv6 since it is currently not supported */
>   u32 fdirm = IXGBE_FDIRM_DIPv6;
>   u32 fdirtcpm;
> + u32 fdirip6m;
>   DEBUGFUNC("ixgbe_fdir_set_atr_input_mask_82599");
>  
>   /*
> @@ -1838,6 +1842,49 @@ s32 ixgbe_fdir_set_input_mask_82599(struct ixgbe_hw 
> *hw,
>   return IXGBE_ERR_CONFIG;
>   }
>  
> + if (cloud_mode) {
> + fdirm |= IXGBE_FDIRM_L3P;
> + fdirip6m = ((u32) 0xU << IXGBE_FDIRIP6M_DIPM_SHIFT);
> + fdirip6m |= IXGBE_FDIRIP6M_ALWAYS_MASK;
> +
> + switch (input_mask->formatted.inner_mac[0] & 0xFF) {
> + case 0x00:
> + /* Mask inner MAC, fall through */
> + fdirip6m |= IXGBE_FDIRIP6M_INNER_MAC;
> + case 0xFF:
> + break;
> + default:
> + DEBUGOUT(" Error on inner_mac byte mask\n");
> + return IXGBE_ERR_CONFIG;
> + }
> +
> + switch (input_mask->formatted.tni_vni & 0x) {
> + case 0x0:
> + /* Mask vxlan id */
> + fdirip6m |= IXGBE_FDIRIP6M_TNI_VNI;
> + break;
> + case 0x00FF:
> + fdirip6m |= IXGBE_FDIRIP6M_TNI_VNI_24;
> + break;
> + case 0x:
> + break;
> + default:
> + DEBUGOUT(" Error on TNI/VNI byte mask\n");
> + return IXGBE_ERR_CONFIG;
> + }
> +
> + switch (input_mask->formatted.tunnel_type & 0x) {
> + case 0x0:
> + /* Mask tunnel type, fall through */
> + fdirip6m |= IXGBE_FDIRIP6M_TUNNEL_TYPE;
> + case 0x:
> + break;
> + default:
> + DEBUGOUT(" Error on tunnel type byte mask\n");
> + return IXGBE_ERR_CONFIG;
> + }
> + IXGBE_WRITE_REG_BE32(hw, IXGBE_FDIRIP6M, fdirip6m);
> + }
>  
>   /* Now mask VM pool and destination IPv6 - bits 5 and 2 */
>   IXGBE_WRITE_REG(hw, IXGBE_FDIRM, fdirm);
> @@ -1863,6 +1910,9 @@ s32 ixgbe_fdir_write_perfect_filter_82599(struct 
> ixgbe_hw *hw,
> u16 soft_id, u8 queue, bool 
> cloud_mode)
>  {
>   u32 fdirport, fdirvlan, fdirhash, fdircmd;
> + u32 addr_low, addr_high;
> + u32 cloud_type = 0;
> + s32 err;
>  
>   DEBUGFUNC("ixgbe_fdir_write_perfect_filter_82599");
>  
> @@ -1892,6 +1942,21 @@ s32 ixgbe_fdir_write_perfect_filter_82599(struct 
> ixgbe_hw *hw,
>   fdirvlan |= IXGBE_NTOHS(input->formatted.vlan_id);
>   IXGBE_WRITE_REG(hw, IXGBE_FDIRVLAN, fdirvlan);
>  
> + if (cloud_mode) {
> + if (input->formatted.tunnel_type != 0)
> + cloud_type = 0x8000;
> +
> + addr_low = ((u32)input->formatted.inner_mac[0] |
> + ((u32)input->formatted.inner_mac[1] << 8) |
> + ((u32)input->formatted.inner_mac[2] << 16) |
> +

[dpdk-dev] Regarding Hardware Crypto Accelerator

2014-09-29 Thread Prashant Upadhyaya

Hi,

Currently I have a machine with Xeon processor and it does not have a
hardware crypto accelerator. I am running my DPDK based application
successfully on it.

Now I want to use a hardware crypto accelerator with DPDK for
IPSec operations in my application.
I am planning to buy the following --
PE3iS4CO2 -- Silicom's Quad HW Accelerator Crypto Compression PCI Express
Gen 3.0 Server
Adapter / ColetoCreek SKU2

Can somebody advise whether this would work properly with DPDK, or whether there
is any other catch I should be careful about before I go ahead and invest in
the equipment? I would really appreciate any advice.

Regards
-Prashant

[dpdk-dev] Regarding Hardware Crypto Accelerator

2014-09-29 Thread Jayakumar, Muthurajan

Prashant, 

You may find Chapter 19 of the Sample Application User Guide useful for reference:
http://dpdk.org/doc/intel/dpdk-sample-apps-1.7.0.pdf
For the ColetoCreek configuration, refer to Ch 19.3.1.

Regards, 


[dpdk-dev] [PATCH v3 00/10] VM Power Management

2014-09-29 Thread Neil Horman

On Mon, Sep 29, 2014 at 04:18:13PM +0100, Alan Carew wrote:
> Virtual Machine Power Management.
> 
> The following patches add two DPDK sample applications and an alternate
> implementation of librte_power for use in virtualized environments.
> The idea is to provide librte_power functionality from within a VM to address
> the lack of MSRs to facilitate frequency changes from within a VM.
> It is ideally suited for Haswell which provides per core frequency scaling.
> 
> The current librte_power effects frequency changes via the acpi-cpufreq
> 'userspace' power governor, accessed via sysfs.
> 
> General Overview (more information in each patch that follows):
> The VM Power Management solution provides two components:
> 
>  1)VM: Allows a DPDK application in a VM to reuse the librte_power
>  interface. Each lcore opens a Virtio-Serial endpoint channel to the host,
>  where the re-implementation of librte_power simply forwards the requests for
>  frequency change to a host based monitor. The host monitor itself uses
>  librte_power.
>  Each lcore channel corresponds to a
>  serial device '/dev/virtio-ports/virtio.serial.port.poweragent.'
>  which is opened in non-blocking mode.
>  While each Virtual CPU can be mapped to multiple physical CPUs it is
>  recommended that each vCPU should be mapped to a single core only.
> 
>  2)Host: The host monitor is managed by a CLI, which allows for adding qemu/KVM
>  virtual machines and associated channels to the monitor, manually changing
>  CPU frequency, inspecting the state of VMs, vCPU to pCPU pinning and managing
>  channels.
>  Host channel endpoints are Virtio-Serial endpoints configured as AF_UNIX file
>  sockets which follow a specific naming convention
>  i.e /tmp/powermonitor/.,
>  each channel has a 1:1 mapping to a VM endpoint
>  i.e. /dev/virtio-ports/virtio.serial.port.poweragent.
>  Host channel endpoints are opened in non-blocking mode and are monitored via 
> epoll.
>  Requests over each channel to change frequency are forwarded to the original
>  librte_power.
>  
> Channels must be manually configured as qemu-kvm command line arguments or
> libvirt domain definition(xml) e.g.
> 
>  
> 
> 
>   
>   
> 
> 
> Where multiple channels can be configured by specifying multiple 
> elements, by replacing , .
> (port number) should be incremented by 1 for each new channel element.
> More information on Virtio-Serial can be found here:
> http://fedoraproject.org/wiki/Features/VirtioSerial
> To enable the Hypervisor creation of channels, the host endpoint directory
> must be created with qemu permissions:
> mkdir /tmp/powermonitor
> chown qemu:qemu /tmp/powermonitor
> 
> The host application runs on two separate lcores:
> Core N) CLI: For management of Virtual Machines adding channels to Monitor 
> thread,
>  inspecting state and manually setting CPU frequency [PATCH 02/09]
> Core N+1) Monitor Thread: An epoll based infinite loop that waits on channel 
> events
>  from VMs and calls the corresponding librte_power functions.
> 
> A sample application is also provided to run on Virtual Machines, this
> application provides a CLI to manually set the frequency of a 
> vCPU[PATCH 08/09]
> 
> The current l3fwd-power sample application can also be run on a VM.
> 
> Changes in V3:
>  Fixed crash in Guest CLI when host application is not running.
>  Renamed #defines to be more specific to the module they belong
>  Added vCPU pinning via CLI
>  Testing feedback
> 
> Changes in V2:
>  Runtime selection of librte_power implementations.
>  Updated Unit tests to cover librte_power changes.
>  PATCH[0/3] was sent twice, again as PATCH[0/4]
>  Miscellaneous fixes.
> 
> Alan Carew (10):
>   Channel Manager and Monitor for VM Power Management(Host).
>   VM Power Management CLI(Host).
>   CPU Frequency Power Management(Host).
>   VM Power Management application and Makefile.
>   VM Power Management CLI(Guest).
>   VM communication channels for VM Power Management(Guest).
>   librte_power common interface for Guest and Host
>   Packet format for VM Power Management(Host and Guest).
>   Build system integration for VM Power Management(Guest and Host)
>   VM Power Management Unit Tests
> 
>  app/test/Makefile  |   3 +-
>  app/test/autotest_data.py  |  26 +
>  app/test/test_power.c  | 445 +---
>  app/test/test_power_acpi_cpufreq.c | 544 ++
>  app/test/test_power_kvm_vm.c   | 308 
>  examples/vm_power_manager/Makefile |  57 ++
>  examples/vm_power_manager/channel_manager.c| 804 
> +
>  examples/vm_power_manager/channel_manager.h| 314 
>  examples/vm_power_manager/channel_monitor.c| 228 ++
>  examples/vm_power_manager/channel_monitor.h| 102 +++
>  examples/vm_power_manager/guest_cli/Makefile   |  56 ++
>  examples/vm_power_manager/guest_cli/main.c
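
The host-side monitor described in the cover letter above (non-blocking channel
endpoints watched through epoll) can be sketched roughly as follows. This is an
illustration under stated assumptions, not code from the patch set: the
function names are hypothetical, and the channel is assumed to be an
already-connected AF_UNIX stream socket. Only standard Linux calls (fcntl,
epoll_ctl, epoll_wait) are used.

```c
#include <fcntl.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Put an already-open channel fd into non-blocking mode. 0 on success. */
static int channel_set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Register a channel with the monitor's epoll instance for read events. */
static int monitor_add_channel(int epfd, int fd)
{
	struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };

	return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/*
 * Wait up to timeout_ms for one channel to become readable.
 * Returns the ready fd, or -1 on timeout or error.
 */
static int monitor_wait_one(int epfd, int timeout_ms)
{
	struct epoll_event ev;
	int n = epoll_wait(epfd, &ev, 1, timeout_ms);

	return n == 1 ? ev.data.fd : -1;
}
```

A monitor thread would call monitor_wait_one() in a loop, read the request from
the returned fd and then call the matching librte_power function; that last step
is left out here because the packet format is defined in a later patch of the
series.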

[dpdk-dev] [PATCH v3] ethdev: Rename RX/TX enable queue field for queue start and stop

2014-09-29 Thread Thomas Monjalon

2014-09-26 13:00, Ouyang Changchun:
> V3 change:
>  - Rename field name to rx_deferred_start/tx_deferred_start in
>both ixgbe and i40e PMD. 
>  - Move the doxygen comments for rx_deferred_start after it is declared.
>  - Simplify/split the long description and move some to doxygen comments of
>rte_eth_dev_rx_queue_start and rte_eth_dev_tx_queue_start.
> 
> V2 and V1 change:
>  - Update comments for the field start_rx_per_q for better readability.
>  - Rename the field name to rx_enable_queue for better readability too.
>  - Accordingly Update its reference in sample vhost.
> 
> Signed-off-by: Changchun Ouyang 

Acked and applied with some minor changes in comments.

Thanks
-- 
Thomas

[dpdk-dev] [PATCH] virtio: fix crash if VIRTIO_NET_F_CTRL_VQ is not negotiated

2014-09-29 Thread Damjan Marion (damarion)



On 17 Sep 2014, at 09:32, Olivier MATZ  wrote:

> Hello,
> 
> On 09/12/2014 12:25 AM, damarion at cisco.com wrote:
>> From: Damjan Marion 
>> 
>> If VIRTIO_NET_F_CTRL_VQ is not negotiated hw->cvq will be NULL
>> 
>> Signed-off-by: Damjan Marion 
>> ---
>>  lib/librte_pmd_virtio/virtio_rxtx.c | 6 --
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>> 
> 
> Acked-by: Olivier Matz 
> 

Is this going to be applied, or is any action pending on my side?

Thanks,

Damjan

[dpdk-dev] [PATCH v5 01/11] lib/librte_vhost: move src files in vhost example to vhost lib directory

2014-09-29 Thread Thomas Monjalon

Hi Huawei,

2014-09-26 17:45, Huawei Xie:
> "git mv examples/vhost lib/librte_vhost"
> This is a purely src file move, without any modification.
> Subsequent patch will transform those src files to a vhost library.
> 
> Signed-off-by: Huawei Xie 
> ---
>  examples/vhost/Makefile  |   60 -
>  examples/vhost/eventfd_link/Makefile |   39 -
>  examples/vhost/eventfd_link/eventfd_link.c   |  205 --
>  examples/vhost/eventfd_link/eventfd_link.h   |   79 -
>  examples/vhost/libvirt/qemu-wrap.py  |  367 ---
>  examples/vhost/main.c| 3725 
> --
>  examples/vhost/main.h|   86 -
>  examples/vhost/vhost-net-cdev.c  |  367 ---
>  examples/vhost/vhost-net-cdev.h  |   83 -
>  examples/vhost/virtio-net.c  | 1165 
>  examples/vhost/virtio-net.h  |  161 --
>  lib/librte_vhost/eventfd_link/Makefile   |   39 +
>  lib/librte_vhost/eventfd_link/eventfd_link.c |  205 ++
>  lib/librte_vhost/eventfd_link/eventfd_link.h |   79 +
>  lib/librte_vhost/libvirt/qemu-wrap.py|  367 +++
>  lib/librte_vhost/main.c  | 3725 
> ++
>  lib/librte_vhost/main.h  |   86 +
>  lib/librte_vhost/vhost-net-cdev.c|  367 +++
>  lib/librte_vhost/vhost-net-cdev.h|   83 +
>  lib/librte_vhost/virtio-net.c| 1165 
>  lib/librte_vhost/virtio-net.h|  161 ++
>  21 files changed, 6277 insertions(+), 6337 deletions(-)

In patch 2, you're using main.c to create vhost_rxtx.c.
So it would be clearer to rename it in this patch 1.

-- 
Thomas

[dpdk-dev] [PATCH v5 05/11] lib/librte_vhost: merge Oliver's mbuf change

2014-09-29 Thread Thomas Monjalon

> There is no rte_pktmbuf structure in mbuf now. Its fields are merged to
> rte_mbuf structure.
> 
> Signed-off-by: Huawei Xie 

This patch shouldn't appear but should be merged with your previous work.

-- 
Thomas

[dpdk-dev] [PATCH v5 03/11] lib/librte_vhost: vhost lib transform

2014-09-29 Thread Thomas Monjalon

2014-09-26 17:45, Huawei Xie:
> This vhost lib consists of five APIs plus several other helper routines
> for feature disable/enable.
> 1) rte_vhost_driver_register initialises vhost driver.
> 2) rte_vhost_driver_callback_register registers the callbacks.
> Callbacks are called from vhost driver when virtio device is ready
> for polling or is de-activated by guest.
> 3) rte_vhost_driver_session_start, a blocking API to start vhost
> message handler session.
> 4) rte_vhost_enqueue_burst and rte_vhost_dequeue_burst for
> enqueue/dequeue to/from virtio ring.

There are probably many things here to split in different patches.
It's not mandatory but would be very nice. Example: a patch to remove
hpa_memory_regions would explain why it is removed.

> Modifications include:
> 1) in vhost_rxtx.c
>virtio_dev_rx -> rte_vhost_enqueue_burst
>virtio_dev_tx -> rte_vhost_dequeue_burst
> 2) VMDQ, MAC learning and other switch related logics are removed.
> 3) zero copy feature isn't generic at this stage, and is removed.
> 4) retry logic is removed from vhost rx functions.
> The above three logics will be implemented in example as reference.
> 5) Add several TODO/FIXME:
>-allow application to disable cmpset reserve in rte_vhost_enqueue_burst
> in case there is no contention.
>-fix memcpy from mbuf to vring desc when mbuf is chained and the
> desc couldn't hold all the data
>-fix vhost_set_mem_table possible race condition: two vqs concurrently
> call set_mem_table, which causes the saved mem_temp to be overridden.
> 6) merge-able feature is removed, which will be merged in subsequent patch.

Please do not remove a feature which is re-added later. It's really difficult
to follow such history.

-- 
Thomas

[dpdk-dev] [PATCH v5 02/11] lib/librte_vhost: refactor vhost lib for subsequent transform

2014-09-29 Thread Thomas Monjalon

2014-09-26 17:45, Huawei Xie:
> This patch does simple split of the original vhost example source
> files in vhost lib directory.
> vhost rx/tx functions virtio_dev_rx/tx are copied from main.c to
> new file vhost_rxtx.c and license header is added.
> main.c and main.h are removed and will be copied to new vhost
> example in subsequent patch.
> virtio-net.h is renamed to rte_virtio_net.h as API header file.
> 
> Signed-off-by: Huawei Xie 

You are removing functions for mergeable buffer feature.
Please keep it instead of re-adding it later.

-- 
Thomas

[dpdk-dev] [PATCH v5 02/11] lib/librte_vhost: refactor vhost lib for subsequent transform

2014-09-29 Thread Thomas Monjalon

2014-09-29 21:55, Thomas Monjalon:
> 2014-09-26 17:45, Huawei Xie:
> > This patch does simple split of the original vhost example source
> > files in vhost lib directory.
> > vhost rx/tx functions virtio_dev_rx/tx are copied from main.c to
> > new file vhost_rxtx.c and license header is added.
> > main.c and main.h are removed and will be copied to new vhost
> > example in subsequent patch.
> > virtio-net.h is renamed to rte_virtio_net.h as API header file.
> > 
> > Signed-off-by: Huawei Xie 
> 
> You are removing functions for mergeable buffer feature.
> Please keep it instead of re-adding it later.

Another comment: you are silently increasing these values to 64:

 #define MAX_PKT_BURST 32   /* Max burst size for RX/TX */
 #define MAX_MRG_PKT_BURST 16   /* Max burst for merge buffers. Set to 1 due to 
performance issue. */

-- 
Thomas

[dpdk-dev] rc1 / call for review

2014-09-29 Thread Thomas Monjalon

Hello,

There is a new tag: 1.8.0-rc1. It's not really a release candidate,
it's a first step toward the new release including:
- mbuf rework
- logs rework
- some eal cleanups
- extended statistics
- fixes for i211 and ixgbe
- removal of rte_snprintf and RTE_PCI_DRV_MULTIPLE

In the next weeks, the features which are already sent must be properly
reviewed, tested and integrated in this coming new release.
Having more eyes to look at the discussed patches would be very helpful.

Thanks everyone
-- 
Thomas

[dpdk-dev] Hi all, does Amazon VMs supported DPDK or not?

2014-09-29 Thread Wang, Shawn

Yes, you can.

According to my colleague, Saha, Avik, they are running Intel DPDK 1.7 on c3.8xlarge instances.

Thanks.

From: dev [dev-bounces at dpdk.org] on behalf of Dong, Binghua 
[binghua.d...@intel.com]
Sent: Saturday, September 27, 2014 10:05 PM
To: Patel, Rashmin N; dev at dpdk.org
Subject: Re: [dpdk-dev] Hi all,  does Amazon VMs supported DPDK or not?

Hi Patel,

The customer considers deploying DPDK applications in Amazon VMs to be very 
flexible and to make global site deployment very easy:

for example, they only need to buy a 2-lcore VM if a site only needs 200Mbps 
throughput, or one 4-lcore VM if the throughput is 400Mbps;

they can buy VMs at different Amazon sites (US, Germany, ...) for lower access latency;

-Original Message-
From: Patel, Rashmin N
Sent: Saturday, September 27, 2014 12:41 AM
To: Dong, Binghua; dev at dpdk.org
Subject: RE: Hi all, does Amazon VMs supported DPDK or not?

It really depends on the devices offered in the VM. If direct device assignment 
is not provided to a VM or if the node hypervisor doesn't have an optimized 
para-virtual interface to a VM, I don't see any benefit using DPDK in VMs.

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Dong, Binghua
Sent: Friday, September 26, 2014 5:47 AM
To: dev at dpdk.org
Subject: [dpdk-dev] Hi all, does Amazon VMs supported DPDK or not?

A customer plans to buy some global Amazon VMs to run their DPDK 1.3 (to be 
upgraded to DPDK 1.6 or 1.7) based VPN applications on global sites.

Thanks a lot;

[dpdk-dev] Hi all, does Amazon VMs supported DPDK or not?

2014-09-29 Thread Patel, Rashmin N

Hi Shawn,

Which network interface is visible to the VM? I mean, which virtual 
ethernet port is used in the Amazon-VM-DPDK app? And which interfaces are 
offered based on the VM size and requirements?

Thanks,
Rashmin

[dpdk-dev] rc1 / call for review

2014-09-29 Thread Matthew Hall

On Mon, Sep 29, 2014 at 10:23:58PM +0200, Thomas Monjalon wrote:
>   - mbuf rework
>   - logs rework
>   - some eal cleanups

Hi Thomas,

I was curious: do we happen to know whether any of these three changes affected 
the external APIs much?

It would help us get some idea what to test and where to look, since mbuf, 
logs, and eal are probably the three most popular parts of DPDK for us app 
hackers to interact with regularly.

Thanks,
Matthew.

[dpdk-dev] Building current 1.8.1-rc1 with clang

2014-09-29 Thread Wiles, Roger Keith

I just pulled the current repo and started a build with "make install 
T=x86_64-native-linuxapp-clang", which produced the following error. I do not 
think I am allowed to modify this file, correct? If that is the case then 
someone will have to update the original source. If you want me to submit a 
patch I can, but I do not think I fully understand what needs to be done. 

From what I can tell, the line:
dma_addr0 = _mm_xor_si128(dma_addr0, dma_addr0);
needs to be:
dma_addr0 = _mm_setzero_si128();

== Build lib/librte_pmd_ixgbe
  CC ixgbe_common.o
  CC ixgbe_82598.o
  CC ixgbe_82599.o
  CC ixgbe_x540.o
  CC ixgbe_phy.o
  CC ixgbe_api.o
  CC ixgbe_vf.o
  CC ixgbe_dcb.o
  CC ixgbe_dcb_82599.o
  CC ixgbe_dcb_82598.o
  CC ixgbe_mbx.o
  CC ixgbe_rxtx.o
  CC ixgbe_ethdev.o
  CC ixgbe_fdir.o
  CC ixgbe_pf.o
  CC ixgbe_rxtx_vec.o
/home/keithw/projects/dpdk-code/org-dpdk/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c:67:30:
 error: variable 'dma_addr0' is uninitialized
  when used here [-Werror,-Wuninitialized]
dma_addr0 = _mm_xor_si128(dma_addr0, dma_addr0);
  ^
/home/keithw/projects/dpdk-code/org-dpdk/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c:57:2:
 note: variable 'dma_addr0' is declared here
__m128i dma_addr0, dma_addr1;
^
1 error generated.
make[5]: *** [ixgbe_rxtx_vec.o] Error 1
make[4]: *** [librte_pmd_ixgbe] Error 2
make[3]: *** [lib] Error 2
make[2]: *** [all] Error 2
make[1]: *** [x86_64-native-linuxapp-clang_install] Error 2
make: *** [install] Error 2
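
A minimal sketch of the replacement suggested above, assuming an x86 target with
SSE2 (<emmintrin.h>). The helper names are hypothetical, and whether this is the
fix that ends up in the tree is not known from this thread. XOR-ing a register
with itself reads the variable before it has been written, which is what clang
flags; _mm_setzero_si128() produces the same all-zero value without reading
anything.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Zero a 128-bit vector without reading an uninitialized variable. */
static __m128i zero_vector(void)
{
	return _mm_setzero_si128();
}

/* Returns 1 if every byte of v is zero, 0 otherwise. */
static int is_all_zero(__m128i v)
{
	/* cmpeq yields 0xFF in each equal byte; movemask packs the 16 top bits. */
	return _mm_movemask_epi8(_mm_cmpeq_epi8(v, _mm_setzero_si128())) == 0xFFFF;
}
```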

Thanks
++Keith

Keith Wiles, Principal Technologist with CTO office, Wind River mobile 
972-213-5533

[dpdk-dev] rc1 / call for review

2014-09-29 Thread Matthew Hall

On Tue, Sep 30, 2014 at 06:52:45AM +0200, Thomas Monjalon wrote:
> You're right.
> During integration time, app hackers should be able to check the git history
> for these API changes.
> When it will be officially released, there will be some notes in the
> documentation to help porting applications.

It works for commercial apps where we have 40 hrs / week to look. But for my 
open source app I guess I just have to do step 1) compile, step 2) pray that 
it still works. ;)

Matthew.
