date:20150202

[dpdk-dev] [PATCH 2/7] rte_sched: use reserved field to allow more VLAN's

2015-02-02 Thread Stephen Hemminger

On Mon, 2 Feb 2015 14:21:58 +
"Ananyev, Konstantin"  wrote:

> Hi Stephen,
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Stephen Hemminger
> > Sent: Sunday, February 01, 2015 10:04 AM
> > To: dev at dpdk.org
> > Cc: Stephen Hemminger
> > Subject: [dpdk-dev] [PATCH 2/7] rte_sched: use reserved field to allow more 
> > VLAN's
> > 
> > From: Stephen Hemminger 
> > 
> > The QoS subport is limited to 8 bits in original code.
> > But customers demanded ability to support full number of VLAN's (4096)
> > therefore use reserved field of mbuf for this field instead
> > of packing inside other classify portions.
> > 
> > Signed-off-by: Stephen Hemminger 
> > ---
> >  lib/librte_mbuf/rte_mbuf.h   |  2 +-
> >  lib/librte_sched/rte_sched.h | 31 ---
> >  2 files changed, 21 insertions(+), 12 deletions(-)
> > 
> > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > index 16059c6..b6b08f4 100644
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -242,7 +242,7 @@ struct rte_mbuf {
> > uint16_t data_len;/**< Amount of data in segment buffer. */
> > uint32_t pkt_len; /**< Total pkt len: sum of all segments. */
> > uint16_t vlan_tci;/**< VLAN Tag Control Identifier (CPU order) 
> > */
> > -   uint16_t reserved;
> > +   uint16_t subport; /**< SCHED Subport ID */
> 
> As I remember, we keep these reserved 2 bytes for RX 2 double vlan tag 
> offload.
> So probably not a good idea to use it for something that is rte_sched 
> specific.
> If you really need extra space for rte_sched fields inside mbuf, can't you 
> move them into the second cache line?
> Or maybe you can use userdata, to either store sched information directly, 
> or as a pointer to some external memory location? 
> Another possibility - union mbuf.hash is 64bit now, while sched uses only 
> 32bits.
> So maybe you can rearrange it to make sched 64bits too?
> Something like:
> 
> union {
> uint32_t rss; /**< RSS hash result if RSS enabled */
> struct {
> union {
> struct {
> uint16_t hash;
> uint16_t id;
> };
> uint32_t lo;
> /**< Second 4 flexible bytes */
> };
> uint32_t hi;
> /**< First 4 flexible bytes or FD ID, dependent on
>  PKT_RX_FDIR_* flag in ol_flags. */
> } fdir;   /**< Filter identifier if FDIR enabled */
> -uint32_t sched;   /**< Hierarchical scheduler */
> +   uint64_t sched;   /**< Hierarchical scheduler */
> uint32_t usr; /**< User defined tags. See 
> @rte_distributor_p
> rocess */
> } hash;   /**< hash information */

Increasing the size of that union totally breaks other alignment and is a 
non-starter.

The reserved field is not used in upstream merged code and is therefore fair game.
First to claim it wins.
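
The layout concern can be checked with a small experiment. The sketch below
uses a simplified stand-in for the hash union, not the real rte_mbuf
definition; the names hash_narrow, hash_wide and print_hash_layout are made up
for illustration. It compares size and alignment with a 32-bit and a 64-bit
sched member.

```c
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-in for the hash union with a 32-bit sched member. */
union hash_narrow {
	uint32_t rss;
	struct {
		uint32_t lo;
		uint32_t hi;
	} fdir;
	uint32_t sched;
	uint32_t usr;
};

/* Same union with sched widened to 64 bits, as suggested in the thread. */
union hash_wide {
	uint32_t rss;
	struct {
		uint32_t lo;
		uint32_t hi;
	} fdir;
	uint64_t sched;
	uint32_t usr;
};

/* Print size and alignment of both variants for the current ABI. */
static void print_hash_layout(void)
{
	printf("narrow: size=%zu align=%zu\n",
	       sizeof(union hash_narrow), _Alignof(union hash_narrow));
	printf("wide:   size=%zu align=%zu\n",
	       sizeof(union hash_wide), _Alignof(union hash_wide));
}
```

On a typical x86-64 ABI both variants occupy 8 bytes, but the wide one takes
the alignment of uint64_t, so members around it may move or gain padding
depending on where the union sits in the enclosing struct.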

[dpdk-dev] site down?

2015-02-02 Thread Thomas Monjalon

2015-02-02 20:10, Vipin Agrawal:
> I've been trying to connect to download the 1.6 version.

You should try to download a newer version :)

> Does anybody have a status on dpdk.org?

Yes it was down but now the problem seems to be fixed.
We are going to investigate why the kernel has crashed.
It may be due to a recent upgrade of the allocated resources.

Sorry for the inconvenience
-- 
Thomas

[dpdk-dev] site down?

2015-02-02 Thread Vipin Agrawal

I've been trying to connect to download the 1.6 version.  Does anybody have a 
status on dpdk.org?

Vipin

[dpdk-dev] [PATCH v2 1/2] eal: sort and align options lists

2015-02-02 Thread Thomas Monjalon

Options listing in usage help was a mess.
The main usage line is fixed and shorter.
The options in usage output are logically sorted (cpu/mem/dev/proc),
aligned and lightly reworded.
The options in declarations are alphabetically sorted.
Code in switch statement is not moved.

Signed-off-by: Thomas Monjalon 
---
changes in v2:
- sort and align options enum in .h
---
 lib/librte_eal/common/eal_common_options.c | 112 ++---
 lib/librte_eal/common/eal_options.h|  62 
 lib/librte_eal/linuxapp/eal/eal.c  |  19 +++--
 3 files changed, 95 insertions(+), 98 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_options.c 
b/lib/librte_eal/common/eal_common_options.c
index 67e02dc..4890e78 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -55,37 +55,38 @@
 const char
 eal_short_options[] =
"b:" /* pci-blacklist */
-   "w:" /* pci-whitelist */
"c:" /* coremask */
-   "d:"
+   "d:" /* driver */
"l:" /* corelist */
-   "m:"
-   "n:"
-   "r:"
-   "v";
+   "m:" /* memory size */
+   "n:" /* memory channels */
+   "r:" /* memory ranks */
+   "v"  /* version */
+   "w:" /* pci-whitelist */
+   ;

 const struct option
 eal_long_options[] = {
-   {OPT_HUGE_DIR, 1, 0, OPT_HUGE_DIR_NUM},
-   {OPT_MASTER_LCORE, 1, 0, OPT_MASTER_LCORE_NUM},
-   {OPT_PROC_TYPE, 1, 0, OPT_PROC_TYPE_NUM},
-   {OPT_NO_SHCONF, 0, 0, OPT_NO_SHCONF_NUM},
-   {OPT_NO_HPET, 0, 0, OPT_NO_HPET_NUM},
-   {OPT_VMWARE_TSC_MAP, 0, 0, OPT_VMWARE_TSC_MAP_NUM},
-   {OPT_NO_PCI, 0, 0, OPT_NO_PCI_NUM},
-   {OPT_NO_HUGE, 0, 0, OPT_NO_HUGE_NUM},
-   {OPT_FILE_PREFIX, 1, 0, OPT_FILE_PREFIX_NUM},
-   {OPT_SOCKET_MEM, 1, 0, OPT_SOCKET_MEM_NUM},
-   {OPT_PCI_WHITELIST, 1, 0, OPT_PCI_WHITELIST_NUM},
-   {OPT_PCI_BLACKLIST, 1, 0, OPT_PCI_BLACKLIST_NUM},
-   {OPT_VDEV, 1, 0, OPT_VDEV_NUM},
-   {OPT_SYSLOG, 1, NULL, OPT_SYSLOG_NUM},
-   {OPT_LOG_LEVEL, 1, NULL, OPT_LOG_LEVEL_NUM},
-   {OPT_BASE_VIRTADDR, 1, 0, OPT_BASE_VIRTADDR_NUM},
-   {OPT_XEN_DOM0, 0, 0, OPT_XEN_DOM0_NUM},
-   {OPT_CREATE_UIO_DEV, 1, NULL, OPT_CREATE_UIO_DEV_NUM},
-   {OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM},
-   {0, 0, 0, 0}
+   {OPT_BASE_VIRTADDR, 1, NULL, OPT_BASE_VIRTADDR_NUM},
+   {OPT_CREATE_UIO_DEV,1, NULL, OPT_CREATE_UIO_DEV_NUM   },
+   {OPT_FILE_PREFIX,   1, NULL, OPT_FILE_PREFIX_NUM  },
+   {OPT_HUGE_DIR,  1, NULL, OPT_HUGE_DIR_NUM },
+   {OPT_LOG_LEVEL, 1, NULL, OPT_LOG_LEVEL_NUM},
+   {OPT_MASTER_LCORE,  1, NULL, OPT_MASTER_LCORE_NUM },
+   {OPT_NO_HPET,   0, NULL, OPT_NO_HPET_NUM  },
+   {OPT_NO_HUGE,   0, NULL, OPT_NO_HUGE_NUM  },
+   {OPT_NO_PCI,0, NULL, OPT_NO_PCI_NUM   },
+   {OPT_NO_SHCONF, 0, NULL, OPT_NO_SHCONF_NUM},
+   {OPT_PCI_BLACKLIST, 1, NULL, OPT_PCI_BLACKLIST_NUM},
+   {OPT_PCI_WHITELIST, 1, NULL, OPT_PCI_WHITELIST_NUM},
+   {OPT_PROC_TYPE, 1, NULL, OPT_PROC_TYPE_NUM},
+   {OPT_SOCKET_MEM,1, NULL, OPT_SOCKET_MEM_NUM   },
+   {OPT_SYSLOG,1, NULL, OPT_SYSLOG_NUM   },
+   {OPT_VDEV,  1, NULL, OPT_VDEV_NUM },
+   {OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM},
+   {OPT_VMWARE_TSC_MAP,0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
+   {OPT_XEN_DOM0,  0, NULL, OPT_XEN_DOM0_NUM },
+   {0, 0, NULL, 0}
 };

 static int lcores_parsed;
@@ -578,37 +579,36 @@ eal_check_common_options(struct internal_config 
*internal_cfg)
 void
 eal_common_usage(void)
 {
-   printf("-c COREMASK -n NUM [-m NB] [-r NUM] [-b 
]"
-  "[--proc-type primary|secondary|auto]\n\n"
+   printf("-c COREMASK|-l CORELIST -n CHANNELS [options]\n\n"
   "EAL common options:\n"
-  "  -c COREMASK  : A hexadecimal bitmask of cores to run on\n"
-  "  -l CORELIST  : List of cores to run on\n"
-  " The argument format is 
[-c2][,c3[-c4],...]\n"
-  " where c1, c2, etc are core indexes between 0 
and %d\n"
-  "  --"OPT_MASTER_LCORE" ID: Core ID that is used as master\n"
-  "  -n NUM   : Number of memory channels\n"
-  "  -v   : Display version information on startup\n"
-  "  -m MB: memory to allocate (see also 
--"OPT_SOCKET_MEM")\n"
-  "  -r NUM   : force number of memory ranks (don't detect)\n"
-  "  --"OPT_SYSLOG" : set syslog facility\n"
-  "  --"OPT_LOG_LEVEL"  : set default log level\n"
-  "  --"OPT_PROC_TYPE"  : type of this process\n"
-  "

[dpdk-dev] [PATCH v2 0/2] help option

2015-02-02 Thread Thomas Monjalon

This is a small reorganization of options.
The main goal is to provide a nice --help option.

changes in v2:
- sort also the options enum

Thomas Monjalon (2):
  eal: sort and align options lists
  eal: add help option

 lib/librte_eal/bsdapp/eal/eal.c|   7 +-
 lib/librte_eal/common/eal_common_options.c | 115 +++--
 lib/librte_eal/common/eal_options.h|  64 
 lib/librte_eal/linuxapp/eal/eal.c  |  27 ---
 4 files changed, 113 insertions(+), 100 deletions(-)

-- 
2.2.2
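
For readers unfamiliar with the pattern, here is a minimal sketch of an
alphabetically sorted long-option table with a help entry, parsed with
getopt_long(). It is not the EAL code: every demo_* name, the option set and
the numbering are assumptions chosen for illustration.

```c
#include <getopt.h>
#include <stddef.h>

enum {
	/* long-only options are numbered above the short-option characters */
	DEMO_OPT_LONG_MIN_NUM = 256,
	DEMO_OPT_HELP_NUM,
	DEMO_OPT_NO_PCI_NUM
};

/* long options kept in alphabetical order */
static const struct option demo_long_options[] = {
	{"help",   no_argument, NULL, DEMO_OPT_HELP_NUM  },
	{"no-pci", no_argument, NULL, DEMO_OPT_NO_PCI_NUM},
	{NULL,     0,           NULL, 0                  }
};

struct demo_config {
	int help;             /* set when --help was given */
	int no_pci;           /* set when --no-pci was given */
	const char *coremask; /* argument of -c, or NULL */
};

/* Parse argv into cfg. Returns 0 on success, -1 on an unknown option. */
static int demo_parse(int argc, char **argv, struct demo_config *cfg)
{
	int opt;

	optind = 1; /* restart scanning so the function can be called again */
	cfg->help = 0;
	cfg->no_pci = 0;
	cfg->coremask = NULL;

	while ((opt = getopt_long(argc, argv, "c:",
				  demo_long_options, NULL)) != -1) {
		switch (opt) {
		case 'c':
			cfg->coremask = optarg;
			break;
		case DEMO_OPT_HELP_NUM:
			cfg->help = 1;
			break;
		case DEMO_OPT_NO_PCI_NUM:
			cfg->no_pci = 1;
			break;
		default:
			return -1;
		}
	}
	return 0;
}
```

A caller would print the usage text and exit when cfg.help is set, before any
further initialization.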

[dpdk-dev] [RFC PATCH 3/4] doc: nics guide

2015-02-02 Thread Iremonger, Bernard

> > > > It would be better to recreate the references  in  
> > > > doc/guides/nics/index.rst rather than just
> delete them.
> > > > I don't think there is any need to renumber them.
> > > > The references are global across documents and must be unique.
> > >
> > > No, the references are not really unique:
> > >   - Figures in doc/guides/sample_app_ug/index.rst start from 1 to 17.
> > >   - Figures in doc/guides/prog_guide/index.rst start from 1 to 39.
> > > When adding a new figure in a document, we must renumber them.
> >
> > I had this problem before with duplicate numbers, so in the prog_guide I 
> > prefixed the links with
> "pg_" so the links are unique.
> > For example:
> >  :ref:`Figure 16. Memory Sharing in the Intel(r) DPDK Multi-process
> > Sample Application `
> 
> OK I understand what is unique.
> But I still think that the number in "Figure 16." should be automatically 
> generated.

Hi Thomas,

I looked at this before and I do not think there is a way to automatically 
generate the number

> > > Are you speaking about the figure references at the bottom of this page?
> > >   http://dpdk.org/doc/guides/prog_guide
> > > It doesn't really help for browsing.
> >
> > No, what I mean is that within a document you can jump to another section 
> > and back for example:
> >
> > file:///home/bairemon/git_home/dpdk_master/build/doc/html/guides/prog_
> > guide/mbuf_lib.html There is link on this page that allows the reader
> > to jump to Mempool Library
> > file:///home/bairemon/git_home/dpdk_master/build/doc/html/guides/prog_
> > guide/mempool_lib.html#mempool-library
> > The reader can use the back arrow in the browser to return to the original 
> > page.
> > All of the links are used for this purpose.
> 
> OK, it was a misunderstanding. I agree the cross references are useful.
> I'm speaking only about the Figure/Table list at the bottom of the index.

The references were all gathered here in the original 1.7 MSWord documents.
It is also useful for testing the links to have them all in one place.

> 
> --
> Thomas

Regards,

Bernard.

[dpdk-dev] [PATCH] fix testpmd show port info error

2015-02-02 Thread xuelin....@freescale.com

From: Xuelin Shi 

The port number type should be consistent with librte_cmdline;
otherwise there is a potential endianness issue.

Signed-off-by: Xuelin Shi 
---
 app/test-pmd/cmdline.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 4beb404..488ac63 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -5568,7 +5568,7 @@ cmdline_parse_token_string_t cmd_showport_what =
TOKEN_STRING_INITIALIZER(struct cmd_showport_result, what,
 "info#stats#xstats#fdir#stat_qmap");
 cmdline_parse_token_num_t cmd_showport_portnum =
-   TOKEN_NUM_INITIALIZER(struct cmd_showport_result, portnum, INT32);
+   TOKEN_NUM_INITIALIZER(struct cmd_showport_result, portnum, INT8);

 cmdline_parse_inst_t cmd_showport = {
.f = cmd_showport_parsed,
-- 
1.9.1
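
The endianness problem can be illustrated without librte_cmdline. The sketch
below assumes that a number parser stores a value of the declared token width
at the address of the result field; that behaviour is an assumption here, and
all names are hypothetical. It shows what an 8-bit field reads back after a
32-bit store to the same address.

```c
#include <stdint.h>
#include <string.h>

/* Return the byte found at the lowest address of a 32-bit value. */
static uint8_t first_byte_of_u32(uint32_t value)
{
	uint8_t bytes[sizeof(value)];

	memcpy(bytes, &value, sizeof(value));
	return bytes[0];
}

/* Return 1 on a little-endian host, 0 otherwise. */
static int host_is_little_endian(void)
{
	return first_byte_of_u32(1u) == 1;
}

/*
 * Hypothetical result slot: read an 8-bit field after a 32-bit number
 * has been stored at the same address.
 */
static uint8_t portnum_seen_after_u32_store(uint32_t parsed)
{
	union {
		uint32_t as_u32;  /* what a 32-bit store writes */
		uint8_t  portnum; /* what the 8-bit field reads */
	} slot;

	slot.as_u32 = parsed;
	return slot.portnum;
}
```

On a little-endian host the field happens to read the expected port number; on
a big-endian host it reads the most significant byte, which is 0 for small port
numbers. In a real struct the wider store would also overwrite whatever follows
the 8-bit field.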

[dpdk-dev] [PATCH v2] ABI: Add abi checking utility

2015-02-02 Thread Neil Horman

There was a request for an ABI validation utility for the ongoing ABI stability
work.  As it turns out, there is an ABI compliance checker that
seems to be under active development and provides fairly detailed ABI compliance
reports.  It's not yet intelligent enough to understand symbol versioning, but it
does provide the ability to identify symbols which have changed between
releases, along with details of the change, and offers developers the
opportunity to identify which symbols then need versioning and validation for a
given update via manual testing.

This script automates the use of the compliance checker between two arbitrarily
specified tags within the dpdk tree.  To execute enter the $RTE_SDK directory
and run:

./scripts/validate_abi.sh $GIT_TAG1 $GIT_TAG2 $CONFIG

where $GIT_TAG1 and $GIT_TAG2 are git tags and $CONFIG is a config specification
suitable for passing as the T= variable in the make config command.

Signed-off-by: Neil Horman 

Change Notes:

v2) Fixed some typos as requested by Thomas
---
 scripts/validate_abi.sh | 241 
 1 file changed, 241 insertions(+)
 create mode 100755 scripts/validate_abi.sh

diff --git a/scripts/validate_abi.sh b/scripts/validate_abi.sh
new file mode 100755
index 000..31583df
--- /dev/null
+++ b/scripts/validate_abi.sh
@@ -0,0 +1,241 @@
+#!/bin/sh
+#   BSD LICENSE
+#
+#   Copyright(c) 2015 Neil Horman. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+TAG1=$1
+TAG2=$2
+TARGET=$3
+ABI_DIR=`mktemp -d -p /tmp ABI.XX`
+
+usage() {
+   echo "$0   "
+}
+
+log() {
+   local level=$1
+   shift
+   echo "$*"
+}
+
+validate_tags() {
+   git tag -l | grep -q "$TAG1"
+   if [ $? -ne 0 ]
+   then
+   echo "$TAG1 is invalid"
+   return
+   fi
+   git tag -l | grep -q "$TAG2"
+   if [ $? -ne 0 ]
+   then
+   echo "$TAG2 is invalid"
+   return
+   fi
+}
+
+validate_args() {
+   if [ -z "$TAG1" ]
+   then
+   echo "Must Specify TAG1"
+   return
+   fi
+   if [ -z "$TAG2" ]
+   then
+   echo "Must Specify TAG2"
+   return
+   fi
+   if [ -z "$TARGET" ]
+   then
+   echo "Must Specify a build target"
+   fi
+}
+
+
+cleanup_and_exit() {
+   rm -rf $ABI_DIR
+   exit $1
+}
+
+###
+#START
+
+
+#Save the current branch
+CURRENT_BRANCH=`git branch | grep \* | cut -d' ' -f2`
+
+if [ -n "$VERBOSE" ]
+then
+   export VERBOSE=/dev/stdout
+else
+   export VERBOSE=/dev/null
+fi
+
+# Validate that we have all the arguments we need
+res=$(validate_args)
+if [ -n "$res" ]
+then
+   echo $res
+   usage
+   cleanup_and_exit 1
+fi
+
+# Make sure our tags exist
+res=$(validate_tags)
+if [ -n "$res" ]
+then
+   echo $res
+   cleanup_and_exit 1
+fi
+
+ABICHECK=`which abi-compliance-checker 2>/dev/null`
+if [ $? -ne 0 ]
+then
+   log "INFO" "Cant find abi-compliance-checker utility"
+   cleanup_and_exit 1
+fi
+
+ABIDUMP=`which abi-dumper 2>/dev/null`
+if [ $? -ne 0 ]
+then
+   log "INFO" "Cant find abi-dumper utility"
+   cleanup_and_exit 1
+fi
+
+log "INFO" "We're going to check and make sure that applications built"
+log "INFO" "against DPDK DSOs from tag $TAG1 will still run when executed"
+log "INFO" "against DPDK DSOs built from tag $TAG2."
+log "INFO" ""
+
+# Check to make sure we have a clean tree
+git status | grep -q clean
+if [ $? -ne 0 ]
+then
+   log "WARN" "Working

[dpdk-dev] [PATCH v6 2/7] hash: add assembly implementation of CRC32 intrinsics

2015-02-02 Thread Liang, Cunming


On 1/29/2015 4:48 PM, Yerden Zhumabekov wrote:
> Added:
> - crc32c_sse42_u32() emits 'crc32l' asm instruction;
> - crc32c_sse42_u64() emits 'crc32q' asm instruction;
> - crc32c_sse42_u64_mimic(), wrapper in case of run on 32-bit platform.
>
> Signed-off-by: Yerden Zhumabekov 
> ---
>   lib/librte_hash/rte_hash_crc.h |   34 ++
>   1 file changed, 34 insertions(+)
>
> diff --git a/lib/librte_hash/rte_hash_crc.h b/lib/librte_hash/rte_hash_crc.h
> index 4da7ca4..fe35996 100644
> --- a/lib/librte_hash/rte_hash_crc.h
> +++ b/lib/librte_hash/rte_hash_crc.h
> @@ -363,6 +363,40 @@ crc32c_2words(uint64_t data, uint32_t init_val)
>   return crc;
>   }
>   
> +static inline uint32_t
> +crc32c_sse42_u32(uint32_t data, uint32_t init_val)
> +{
> + __asm__ volatile(
> + "crc32l %[data], %[init_val];"
> + : [init_val] "+r" (init_val)
> + : [data] "rm" (data));
> + return init_val;
> +}
> +
> +static inline uint32_t
> +crc32c_sse42_u64(uint64_t data, uint64_t init_val)
> +{
> + __asm__ volatile(
> + "crc32q %[data], %[init_val];"
> + : [init_val] "+r" (init_val)
> + : [data] "rm" (data));
> + return init_val;
> +}
[LCM] I'm curious about the benefit of replacing CRC32 intrinsic 
"_mm_crc32_u32/64".
> +
> +static inline uint32_t
> +crc32c_sse42_u64_mimic(uint64_t data, uint64_t init_val)
> +{
> + union {
> + uint32_t u32[2];
> + uint64_t u64;
> + } d;
> +
> + d.u64 = data;
> + init_val = crc32c_sse42_u32(d.u32[0], init_val);
> + init_val = crc32c_sse42_u32(d.u32[1], init_val);
> + return init_val;
> +}
> +
>   /**
>* Use single crc32 instruction to perform a hash on a 4 byte value.
>*

[dpdk-dev] [PATCH 10/12] lib/librte_vhost: vhost user support

2015-02-02 Thread Tetsuya Mukawa

Hi Xie,

On 2015/01/30 15:36, Huawei Xie wrote:
> In rte_vhost_driver_register(), vhost unix domain socket listener fd is 
> created
> and added to the selected fdset.
>
> In rte_vhost_driver_session_start(), fds in the fdset are checked for 
> processing.
> If there is new connection on listener fd from qemu, connection fd accepted is
> added to the selected fdset. The listener and connection fds in the fdset are
> then both checked there. When there is message on the connection fd, its
> callback vserver_message_handler is called to process the vhost messages.
>
> To support identifying which virtio is from which guest VM, 
> rte_vhost_driver_register
> is allowed to be called multiple times to specify different socket path for 
> different
> virtio device. The socket path is then set in the virtio_net device.
>
> Signed-off-by: Huawei Xie 
> ---
>  lib/librte_vhost/Makefile |   8 +-
>  lib/librte_vhost/rte_virtio_net.h |   2 +
>  lib/librte_vhost/vhost-net.h  |   4 +-
>  lib/librte_vhost/vhost_user/vhost-net-user.c  | 455 
> ++
>  lib/librte_vhost/vhost_user/vhost-net-user.h  | 106 ++
>  lib/librte_vhost/vhost_user/virtio-net-user.c | 322 ++
>  lib/librte_vhost/vhost_user/virtio-net-user.h |  49 +++
>  lib/librte_vhost/virtio-net.c |  26 +-
>  8 files changed, 957 insertions(+), 15 deletions(-)
>  create mode 100644 lib/librte_vhost/vhost_user/vhost-net-user.c
>  create mode 100644 lib/librte_vhost/vhost_user/vhost-net-user.h
>  create mode 100644 lib/librte_vhost/vhost_user/virtio-net-user.c
>  create mode 100644 lib/librte_vhost/vhost_user/virtio-net-user.h
>
> diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
> index 92ab9a6..22319b8 100644
> --- a/lib/librte_vhost/Makefile
> +++ b/lib/librte_vhost/Makefile
> @@ -34,10 +34,14 @@ include $(RTE_SDK)/mk/rte.vars.mk
>  # library name
>  LIB = librte_vhost.a
>  
> -CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -I vhost_cuse -O3 
> -D_FILE_OFFSET_BITS=64 -lfuse
> +CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3 -D_FILE_OFFSET_BITS=64
> +CFLAGS += -I vhost_cuse -lfuse
> +CFLAGS += -I vhost_user
>  LDFLAGS += -lfuse

I have been rethinking the abstraction layer for vhost-user and cuse.
A few months ago, I thought some users still used cuse, so we should
not obsolete the cuse implementation in DPDK-2.0.
Once the popular Linux distributions adopt QEMU-2.1, we will be able to
obsolete it.
While we need to maintain cuse and vhost-user in parallel, I guessed an
abstraction layer would be good.

And now we have your vhost implementation.
With your implementation, we can easily choose vhost-user or
cuse in the Makefile.
It only takes changing a few lines.

Even if we implement the abstraction, we still need to change a
parameter of the functions below.
- int rte_vhost_driver_register();
- int rte_vhost_driver_session_start();

So now, the only advantage of the abstraction is that we can use
vhost-user and cuse at the same time.
I guess not many users want to use vhost like that.

I guess the above abstraction isn't needed any more.
Probably we can say that your implementation already has some kind
of abstraction.


Anyway, how about having a new config option like below.

In "config/common_linux"
CONFIG_RTE_LIBRTE_VHOST_CUSE=n

Check it in Makefile like below.
---
CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3 -D_FILE_OFFSET_BITS=64

ifeq ($(CONFIG_RTE_LIBRTE_VHOST_CUSE), y)
CFLAGS += -I vhost_cuse -lfuse
LDFLAGS += -lfuse
SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := vhost_cuse/vhost-net-cdev.c
vhost_cuse/virtio-net-cdev.c vhost_cuse/eventfd_copy.c
else
CFLAGS += -I vhost_user
SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := vhost_user/vhost-net-user.c
vhost_user/virtio-net-user.c vhost_user/fd_man.c
endif

# all source are stored in SRCS-y
SRCS-$(CONFIG_RTE_LIBRTE_VHOST) += virtio-net.c vhost_rxtx.c
---

And after obsoleting cuse, just remove this option and cuse files.
What do you think?

Thanks,
Tetsuya

>  # all source are stored in SRCS-y
> -SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := vhost_cuse/vhost-net-cdev.c 
> vhost_cuse/virtio-net-cdev.c vhost_cuse/eventfd_copy.c virtio-net.c 
> vhost_rxtx.c
> +SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := virtio-net.c vhost_rxtx.c
> +#SRCS-$(CONFIG_RTE_LIBRTE_VHOST) += vhost_cuse/vhost-net-cdev.c 
> vhost_cuse/virtio-net-cdev.c vhost_cuse/eventfd_copy.c
> +SRCS-$(CONFIG_RTE_LIBRTE_VHOST) += vhost_user/vhost-net-user.c 
> vhost_user/virtio-net-user.c vhost_user/fd_man.c
>  
>  # install includes
>  SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_virtio_net.h
> diff --git a/lib/librte_vhost/rte_virtio_net.h 
> b/lib/librte_vhost/rte_virtio_net.h
> index 0bf07c7..46c2072 100644
> --- a/lib/librte_vhost/rte_virtio_net.h
> +++ b/lib/librte_vhost/rte_virtio_net.h
> @@ -50,6 +50,8 @@
>  #include 
>  #include 
>  
> +#define VHOST_MEMORY_MAX_NREGIONS 8
> +
>  /* Used to indicate that the device

[dpdk-dev] [PATCH v6 0/7] rte_hash_crc reworked to be platform-independent

2015-02-02 Thread Yerden Zhumabekov


02.02.2015 9:31, Neil Horman wrote:
> On Mon, Feb 02, 2015 at 09:07:45AM +0600, Yerden Zhumabekov wrote:
>
>> I think so, I've just successfully built it against latest snapshot with
>> RTE_TARGET
>> equal to 'x86_64-native-linuxapp-gcc'.
>>
> Please confirm that setting the machine type to default builds and runs 
> properly.

If I understood you correctly, I set CONFIG_RTE_MACHINE="default" in the
config and the build was successful.

-- 
Sincerely,

Yerden Zhumabekov
State Technical Service
Astana, KZ

[dpdk-dev] [PATCH v6 2/7] hash: add assembly implementation of CRC32 intrinsics

2015-02-02 Thread Yerden Zhumabekov


02.02.2015 11:15, Liang, Cunming wrote:
>
>> +static inline uint32_t
>> +crc32c_sse42_u64(uint64_t data, uint64_t init_val)
>> +{
>> +__asm__ volatile(
>> +"crc32q %[data], %[init_val];"
>> +: [init_val] "+r" (init_val)
>> +: [data] "rm" (data));
>> +return init_val;
>> +}
> [LCM] I'm curious about the benefit of replacing CRC32 intrinsic
> "_mm_crc32_u32/64".

These intrinsics are not available on a platform which has no SSE4.2
support so the build would fail.

See previous suggestion from Neil: 
http://dpdk.org/ml/archives/dev/2014-November/008353.html

-- 
Sincerely,

Yerden Zhumabekov
State Technical Service
Astana, KZ
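
As a point of comparison for the assembly helpers, here is a portable bitwise
CRC-32C sketch (reflected Castagnoli polynomial 0x82F63B78). It is not the
table-driven fallback from rte_hash_crc.h; the crc32c_sw_* names are
hypothetical. The 32-bit step is meant to give the same result as a crc32l
step on the same inputs, which is an assumption that is not verified against
hardware here. The 64-bit step is built from two 32-bit steps, low word first,
in the spirit of crc32c_sse42_u64_mimic() on a little-endian host.

```c
#include <stddef.h>
#include <stdint.h>

#define CRC32C_POLY_REFLECTED 0x82F63B78u

/* Fold one byte into the running CRC, one bit at a time. */
static uint32_t crc32c_sw_u8(uint8_t data, uint32_t crc)
{
	crc ^= data;
	for (int bit = 0; bit < 8; bit++)
		crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED & (0u - (crc & 1u)));
	return crc;
}

/* Fold a 32-bit word (least significant byte first) into the running CRC. */
static uint32_t crc32c_sw_u32(uint32_t data, uint32_t crc)
{
	crc ^= data;
	for (int bit = 0; bit < 32; bit++)
		crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED & (0u - (crc & 1u)));
	return crc;
}

/* Fold a 64-bit word as two 32-bit steps, low word first. */
static uint32_t crc32c_sw_u64(uint64_t data, uint32_t crc)
{
	crc = crc32c_sw_u32((uint32_t)data, crc);
	return crc32c_sw_u32((uint32_t)(data >> 32), crc);
}

/* Standard CRC-32C of a buffer: initial value and final XOR are all ones. */
static uint32_t crc32c_sw_buf(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint32_t crc = 0xFFFFFFFFu;

	while (len--)
		crc = crc32c_sw_u8(*p++, crc);
	return crc ^ 0xFFFFFFFFu;
}
```

The per-word helpers take and return the raw running value, without the
initial and final inversion, so they can be chained the same way as the
helpers in the patch.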

[dpdk-dev] [PATCH v1] librte_vhost: Add an abstraction to hide vhost-user and cuse devices.

2015-02-02 Thread Tetsuya Mukawa

On 2015/02/02 10:06, Linhaifeng wrote:
> On 2015/2/1 18:36, Tetsuya Mukawa wrote:
>> This patch should be put on "lib/librte_vhost: vhost-user support"
>> patch series written by Xie, Huawei.
>>
>> There are 2 types of vhost devices. One is cuse, the other is vhost-user.
>> So far, we can use only one of them. To use the other, DPDK needs to be
>> recompiled.
> If we use vhost-user, do we also have to install the cuse and fuse modules?
> I think that is not a good idea.
>

OK, I am going to rethink how to implement the abstraction.

Thanks,
Tetsuya

>> The patch introduces rte_vhost_dev_type parameter. Using type parameter,
>> the DPDK application can use both vhost devices without recompile.
>>
>> The type parameter should be specified when following vhost APIs are called.
>> - int rte_vhost_driver_register();
>> - int rte_vhost_driver_session_start();
>>
>> Signed-off-by: Tetsuya Mukawa 
>> ---
>>  examples/vhost/main.c|  4 +-
>>  lib/librte_vhost/Makefile|  4 +-
>>  lib/librte_vhost/rte_virtio_net.h| 15 +-
>>  lib/librte_vhost/vhost-net.c | 74 
>> 
>>  lib/librte_vhost/vhost_cuse/vhost-net-cdev.c |  5 +-
>>  lib/librte_vhost/vhost_cuse/vhost-net-cdev.h | 42 
>>  lib/librte_vhost/vhost_user/vhost-net-user.c |  4 +-
>>  lib/librte_vhost/vhost_user/vhost-net-user.h |  7 +++
>>  8 files changed, 145 insertions(+), 10 deletions(-)
>>  create mode 100644 lib/librte_vhost/vhost-net.c
>>  create mode 100644 lib/librte_vhost/vhost_cuse/vhost-net-cdev.h
>>
>> diff --git a/examples/vhost/main.c b/examples/vhost/main.c
>> index 04f0118..545df72 100644
>> --- a/examples/vhost/main.c
>> +++ b/examples/vhost/main.c
>> @@ -3040,14 +3040,14 @@ main(int argc, char *argv[])
>>  rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_MRG_RXBUF);
>>  
>>  /* Register CUSE device to handle IOCTLs. */
>> -ret = rte_vhost_driver_register((char *)_basename);
>> +ret = rte_vhost_driver_register((char *)_basename, VHOST_DEV_CUSE);
>>  if (ret != 0)
>>  rte_exit(EXIT_FAILURE,"CUSE device setup failure.\n");
>>  
>>  rte_vhost_driver_callback_register(_net_device_ops);
>>  
>>  /* Start CUSE session. */
>> -rte_vhost_driver_session_start();
>> +rte_vhost_driver_session_start(VHOST_DEV_CUSE);
>>  return 0;
>>  
>>  }
>> diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
>> index 22319b8..cc95415 100644
>> --- a/lib/librte_vhost/Makefile
>> +++ b/lib/librte_vhost/Makefile
>> @@ -39,8 +39,8 @@ CFLAGS += -I vhost_cuse -lfuse
>>  CFLAGS += -I vhost_user
>>  LDFLAGS += -lfuse
>>  # all source are stored in SRCS-y
>> -SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := virtio-net.c vhost_rxtx.c
>> -#SRCS-$(CONFIG_RTE_LIBRTE_VHOST) += vhost_cuse/vhost-net-cdev.c 
>> vhost_cuse/virtio-net-cdev.c vhost_cuse/eventfd_copy.c
>> +SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := virtio-net.c vhost-net.c vhost_rxtx.c
>> +SRCS-$(CONFIG_RTE_LIBRTE_VHOST) += vhost_cuse/vhost-net-cdev.c 
>> vhost_cuse/virtio-net-cdev.c vhost_cuse/eventfd_copy.c
>>  SRCS-$(CONFIG_RTE_LIBRTE_VHOST) += vhost_user/vhost-net-user.c 
>> vhost_user/virtio-net-user.c vhost_user/fd_man.c
>>  
>>  # install includes
>> diff --git a/lib/librte_vhost/rte_virtio_net.h 
>> b/lib/librte_vhost/rte_virtio_net.h
>> index 611a3d4..7b3952c 100644
>> --- a/lib/librte_vhost/rte_virtio_net.h
>> +++ b/lib/librte_vhost/rte_virtio_net.h
>> @@ -166,6 +166,15 @@ gpa_to_vva(struct virtio_net *dev, uint64_t guest_pa)
>>  }
>>  
>>  /**
>> + * Enum for vhost device types.
>> + */
>> +enum rte_vhost_dev_type {
>> +VHOST_DEV_CUSE, /* cuse driver */
>> +VHOST_DEV_USER, /* vhost-user driver */
>> +VHOST_DEV_MAX   /* the number of vhost driver types */
>> +};
>> +
>> +/**
>>   *  Disable features in feature_mask. Returns 0 on success.
>>   */
>>  int rte_vhost_feature_disable(uint64_t feature_mask);
>> @@ -181,12 +190,14 @@ uint64_t rte_vhost_feature_get(void);
>>  int rte_vhost_enable_guest_notification(struct virtio_net *dev, uint16_t 
>> queue_id, int enable);
>>  
>>  /* Register vhost driver. dev_name could be different for multiple instance 
>> support. */
>> -int rte_vhost_driver_register(const char *dev_name);
>> +int rte_vhost_driver_register(const char *dev_name,
>> +enum rte_vhost_dev_type dev_type);
>>  
>>  /* Register callbacks. */
>>  int rte_vhost_driver_callback_register(struct virtio_net_device_ops const * 
>> const);
>> +
>>  /* Start vhost driver session blocking loop. */
>> -int rte_vhost_driver_session_start(void);
>> +int rte_vhost_driver_session_start(enum rte_vhost_dev_type dev_type);
>>  
>>  /**
>>   * This function adds buffers to the virtio devices RX virtqueue. Buffers 
>> can
>> diff --git a/lib/librte_vhost/vhost-net.c b/lib/librte_vhost/vhost-net.c
>> new file mode 100644
>> index 000..d0316d7
>> --- /dev/null
>> +++ b/lib/librte_vhost/vhost-net.c
>> @@ -0,0 +1,74 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>>
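
One way to picture the type parameter is a table of backend operations indexed
by the device type. The sketch below is not the code from the patch: the
demo_* names, the ops struct and the stub backends are assumptions made up for
illustration, and the stubs only record which backend was selected.

```c
#include <stddef.h>

enum demo_vhost_dev_type {
	DEMO_VHOST_DEV_CUSE, /* cuse driver */
	DEMO_VHOST_DEV_USER, /* vhost-user driver */
	DEMO_VHOST_DEV_MAX   /* number of driver types */
};

struct demo_vhost_ops {
	int (*driver_register)(const char *dev_name);
	int (*session_start)(void);
};

/* Records the backend chosen by the last call; -1 means none yet. */
static int demo_last_backend = -1;

static int demo_cuse_register(const char *dev_name)
{
	(void)dev_name;
	demo_last_backend = DEMO_VHOST_DEV_CUSE;
	return 0;
}

static int demo_cuse_session_start(void)
{
	demo_last_backend = DEMO_VHOST_DEV_CUSE;
	return 0;
}

static int demo_user_register(const char *dev_name)
{
	(void)dev_name;
	demo_last_backend = DEMO_VHOST_DEV_USER;
	return 0;
}

static int demo_user_session_start(void)
{
	demo_last_backend = DEMO_VHOST_DEV_USER;
	return 0;
}

static const struct demo_vhost_ops demo_ops[DEMO_VHOST_DEV_MAX] = {
	[DEMO_VHOST_DEV_CUSE] = { demo_cuse_register, demo_cuse_session_start },
	[DEMO_VHOST_DEV_USER] = { demo_user_register, demo_user_session_start },
};

/* Dispatch registration to the backend selected at run time. */
static int demo_vhost_driver_register(const char *dev_name,
				      enum demo_vhost_dev_type type)
{
	if (dev_name == NULL || (unsigned)type >= DEMO_VHOST_DEV_MAX)
		return -1;
	return demo_ops[type].driver_register(dev_name);
}

/* Dispatch the session loop to the backend selected at run time. */
static int demo_vhost_driver_session_start(enum demo_vhost_dev_type type)
{
	if ((unsigned)type >= DEMO_VHOST_DEV_MAX)
		return -1;
	return demo_ops[type].session_start();
}
```

With this shape both backends are always compiled and linked in, which is the
drawback raised in the thread: a vhost-user-only deployment would still depend
on the cuse and fuse pieces.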

[dpdk-dev] [PATCH v4 17/17] timer: add support to non-EAL thread

2015-02-02 Thread Cunming Liang

Allow setting up timers only for EAL (lcore) threads (__lcore_id < MAX_LCORE_ID).
E.g. a dynamically created thread will be able to reset/stop a timer for an lcore 
thread,
but it will not be allowed to set up a timer for itself or another non-lcore 
thread.
rte_timer_manage() for a non-lcore thread would simply do nothing and return 
straight away.

Signed-off-by: Cunming Liang 
---
 lib/librte_timer/rte_timer.c | 40 +++-
 lib/librte_timer/rte_timer.h |  2 +-
 2 files changed, 32 insertions(+), 10 deletions(-)

diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
index 269a992..601c159 100644
--- a/lib/librte_timer/rte_timer.c
+++ b/lib/librte_timer/rte_timer.c
@@ -79,9 +79,10 @@ static struct priv_timer priv_timer[RTE_MAX_LCORE];

 /* when debug is enabled, store some statistics */
 #ifdef RTE_LIBRTE_TIMER_DEBUG
-#define __TIMER_STAT_ADD(name, n) do { \
-   unsigned __lcore_id = rte_lcore_id();   \
-   priv_timer[__lcore_id].stats.name += (n);   \
+#define __TIMER_STAT_ADD(name, n) do { \
+   unsigned __lcore_id = rte_lcore_id();   \
+   if (__lcore_id < RTE_MAX_LCORE) \
+   priv_timer[__lcore_id].stats.name += (n);   \
} while(0)
 #else
 #define __TIMER_STAT_ADD(name, n) do {} while(0)
@@ -127,15 +128,26 @@ timer_set_config_state(struct rte_timer *tim,
unsigned lcore_id;

lcore_id = rte_lcore_id();
+   if (lcore_id >= RTE_MAX_LCORE)
+   lcore_id = LCORE_ID_ANY;

/* wait that the timer is in correct status before update,
 * and mark it as being configured */
while (success == 0) {
prev_status.u32 = tim->status.u32;

+   /*
+* prevent race condition of non-EAL threads
+* to update the timer. When 'owner == LCORE_ID_ANY',
+* it means updated by a non-EAL thread.
+*/
+   if (lcore_id == (unsigned)LCORE_ID_ANY &&
+   (uint16_t)lcore_id == prev_status.owner)
+   return -1;
+
/* timer is running on another core, exit */
if (prev_status.state == RTE_TIMER_RUNNING &&
-   (unsigned)prev_status.owner != lcore_id)
+   prev_status.owner != (uint16_t)lcore_id)
return -1;

/* timer is being configured on another core */
@@ -366,9 +378,13 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,

/* round robin for tim_lcore */
if (tim_lcore == (unsigned)LCORE_ID_ANY) {
-   tim_lcore = rte_get_next_lcore(priv_timer[lcore_id].prev_lcore,
-  0, 1);
-   priv_timer[lcore_id].prev_lcore = tim_lcore;
+   if (lcore_id < RTE_MAX_LCORE) {
+   tim_lcore = rte_get_next_lcore(
+   priv_timer[lcore_id].prev_lcore,
+   0, 1);
+   priv_timer[lcore_id].prev_lcore = tim_lcore;
+   } else
+   tim_lcore = rte_get_next_lcore(LCORE_ID_ANY, 0, 1);
}

/* wait that the timer is in correct status before update,
@@ -378,7 +394,8 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
return -1;

__TIMER_STAT_ADD(reset, 1);
-   if (prev_status.state == RTE_TIMER_RUNNING) {
+   if (prev_status.state == RTE_TIMER_RUNNING &&
+   lcore_id < RTE_MAX_LCORE) {
priv_timer[lcore_id].updated = 1;
}

@@ -455,7 +472,8 @@ rte_timer_stop(struct rte_timer *tim)
return -1;

__TIMER_STAT_ADD(stop, 1);
-   if (prev_status.state == RTE_TIMER_RUNNING) {
+   if (prev_status.state == RTE_TIMER_RUNNING &&
+   lcore_id < RTE_MAX_LCORE) {
priv_timer[lcore_id].updated = 1;
}

@@ -499,6 +517,10 @@ void rte_timer_manage(void)
uint64_t cur_time;
int i, ret;

+   /* timer manager only runs on EAL thread */
+   if (lcore_id >= RTE_MAX_LCORE)
+   return;
+
__TIMER_STAT_ADD(manage, 1);
/* optimize for the case where per-cpu list is empty */
if (priv_timer[lcore_id].pending_head.sl_next[0] == NULL)
diff --git a/lib/librte_timer/rte_timer.h b/lib/librte_timer/rte_timer.h
index 4907cf5..5c5df91 100644
--- a/lib/librte_timer/rte_timer.h
+++ b/lib/librte_timer/rte_timer.h
@@ -76,7 +76,7 @@ extern "C" {
 #define RTE_TIMER_RUNNING 2 /**< State: timer function is running. */
 #define RTE_TIMER_CONFIG  3 /**< State: timer is being configured. */

-#define RTE_TIMER_NO_OWNER -1 /**< Timer has no owner. */
+#define RTE_TIMER_NO_OWNER -2 /**< Timer has no owner. */

 /**
  * Timer type: Periodic or single (one-shot).
-- 
1.8.1.4

[dpdk-dev] [PATCH v4 16/17] ring: add sched_yield to avoid spin forever

2015-02-02 Thread Cunming Liang

Add a sched_yield() syscall if the thread spins for too long while waiting
for another thread to finish its operations on the ring.
That gives a pre-empted thread a chance to proceed and finish its ring
enqueue/dequeue operation.
The purpose is to reduce contention on the ring.

Signed-off-by: Cunming Liang 
---
 lib/librte_ring/rte_ring.h | 35 +--
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 39bacdd..c402c73 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -126,6 +126,7 @@ struct rte_ring_debug_stats {

 #define RTE_RING_NAMESIZE 32 /**< The maximum length of a ring name. */
 #define RTE_RING_MZ_PREFIX "RG_"
+#define RTE_RING_PAUSE_REP 0x100  /**< yield after num of times pause. */

 /**
  * An RTE ring structure.
@@ -410,7 +411,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const 
*obj_table,
uint32_t cons_tail, free_entries;
const unsigned max = n;
int success;
-   unsigned i;
+   unsigned i, rep;
uint32_t mask = r->prod.mask;
int ret;

@@ -468,8 +469,19 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const 
*obj_table,
 * If there are other enqueues in progress that preceded us,
 * we need to wait for them to complete
 */
-   while (unlikely(r->prod.tail != prod_head))
-   rte_pause();
+   do {
+   /* avoid spinning too long waiting for another thread to finish */
+   for (rep = RTE_RING_PAUSE_REP;
+rep != 0 && r->prod.tail != prod_head; rep--)
+   rte_pause();
+
+   /*
+* It gives pre-empted thread a chance to proceed and
+* finish with ring enqueue operation.
+*/
+   if (rep == 0)
+   sched_yield();
+   } while (rep == 0);

r->prod.tail = prod_next;
return ret;
@@ -589,7 +601,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void 
**obj_table,
uint32_t cons_next, entries;
const unsigned max = n;
int success;
-   unsigned i;
+   unsigned i, rep;
uint32_t mask = r->prod.mask;

/* move cons.head atomically */
@@ -634,8 +646,19 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void 
**obj_table,
 * If there are other dequeues in progress that preceded us,
 * we need to wait for them to complete
 */
-   while (unlikely(r->cons.tail != cons_head))
-   rte_pause();
+   do {
+   /* avoid spinning too long waiting for another thread to finish */
+   for (rep = RTE_RING_PAUSE_REP;
+rep != 0 && r->cons.tail != cons_head; rep--)
+   rte_pause();
+
+   /*
+* It gives pre-empted thread a chance to proceed and
+* finish with ring dequeue operation.
+*/
+   if (rep == 0)
+   sched_yield();
+   } while (rep == 0);

__RING_STAT_ADD(r, deq_success, n);
r->cons.tail = cons_next;
-- 
1.8.1.4
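
The waiting strategy of the patch above can be illustrated outside of rte_ring. What follows is only a minimal sketch under stated assumptions, not the library code: wait_for_tail(), SPIN_LIMIT and the returned yield counter are hypothetical names, and the empty loop body stands in for rte_pause().

```c
#include <sched.h>
#include <stdint.h>

/* Hypothetical bound, playing the role of RTE_RING_PAUSE_REP. */
#define SPIN_LIMIT 0x100

/*
 * Spin until *tail == head. After SPIN_LIMIT unsuccessful polls,
 * call sched_yield() so that a pre-empted thread sharing this CPU
 * gets a chance to run, then start polling again.
 * Returns how many times the caller yielded (for illustration only).
 */
static unsigned
wait_for_tail(const volatile uint32_t *tail, uint32_t head)
{
	unsigned rep, yields = 0;

	do {
		for (rep = SPIN_LIMIT; rep != 0 && *tail != head; rep--)
			; /* rte_pause() would go here */

		if (rep == 0) {
			sched_yield();
			yields++;
		}
	} while (rep == 0);

	return yields;
}
```

When the tail already matches, the function returns at once without yielding. The yield path is taken only after a full round of unsuccessful polls, so the uncontended case stays as cheap as the original busy-wait.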

[dpdk-dev] [PATCH v4 15/17] ring: add support to non-EAL thread

2015-02-02 Thread Cunming Liang

Ring debug statistics are not collected for non-EAL threads.

Signed-off-by: Cunming Liang 
---
 lib/librte_ring/rte_ring.h | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 7cd5f2d..39bacdd 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -188,10 +188,12 @@ struct rte_ring {
  *   The number to add to the object-oriented statistics.
  */
 #ifdef RTE_LIBRTE_RING_DEBUG
-#define __RING_STAT_ADD(r, name, n) do {   \
-   unsigned __lcore_id = rte_lcore_id();   \
-   r->stats[__lcore_id].name##_objs += n;  \
-   r->stats[__lcore_id].name##_bulk += 1;  \
+#define __RING_STAT_ADD(r, name, n) do {\
+   unsigned __lcore_id = rte_lcore_id();   \
+   if (__lcore_id < RTE_MAX_LCORE) {   \
+   r->stats[__lcore_id].name##_objs += n;  \
+   r->stats[__lcore_id].name##_bulk += 1;  \
+   }   \
} while(0)
 #else
 #define __RING_STAT_ADD(r, name, n) do {} while(0)
-- 
1.8.1.4
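
The guard added to __RING_STAT_ADD can be shown with a small stand-alone sketch. All names here (MAX_LCORE, LCORE_ANY, stat_add_enq) are assumptions made for illustration; they only mimic RTE_MAX_LCORE, LCORE_ID_ANY and the statistics macro.

```c
#include <stdint.h>

#define MAX_LCORE 4               /* stand-in for RTE_MAX_LCORE */
#define LCORE_ANY ((unsigned)-1)  /* stand-in for (unsigned)LCORE_ID_ANY */

struct stats {
	uint64_t enq_objs;
	uint64_t enq_bulk;
};

static struct stats per_lcore_stats[MAX_LCORE];

/*
 * Account n enqueued objects to the calling thread's slot.
 * A thread without a valid lcore id has no slot, so its update
 * is dropped instead of writing past the end of the array.
 * Returns 1 if the statistics were updated, 0 if they were skipped.
 */
static int
stat_add_enq(unsigned lcore_id, unsigned n)
{
	if (lcore_id >= MAX_LCORE)
		return 0;

	per_lcore_stats[lcore_id].enq_objs += n;
	per_lcore_stats[lcore_id].enq_bulk += 1;
	return 1;
}
```

Because the id is unsigned, the single comparison against MAX_LCORE also rejects LCORE_ANY, which is the largest unsigned value.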

[dpdk-dev] [PATCH v4 14/17] mempool: add support to non-EAL thread

2015-02-02 Thread Cunming Liang

For a non-EAL thread, bypass the per-lcore cache and use the ring pool directly.
This allows rte_mempool to be used from either an EAL thread or any user pthread.
A non-EAL thread relies directly on rte_ring, which is non-preemptive.
Running multiple pthreads per CPU that compete for the same rte_mempool is
not recommended: performance will be poor, and there is a critical risk if
the scheduling policy is RT.

Signed-off-by: Cunming Liang 
---
 lib/librte_mempool/rte_mempool.h | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 3314651..4845f27 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -198,10 +198,12 @@ struct rte_mempool {
  *   Number to add to the object-oriented statistics.
  */
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-#define __MEMPOOL_STAT_ADD(mp, name, n) do {   \
-   unsigned __lcore_id = rte_lcore_id();   \
-   mp->stats[__lcore_id].name##_objs += n; \
-   mp->stats[__lcore_id].name##_bulk += 1; \
+#define __MEMPOOL_STAT_ADD(mp, name, n) do {\
+   unsigned __lcore_id = rte_lcore_id();   \
+   if (__lcore_id < RTE_MAX_LCORE) {   \
+   mp->stats[__lcore_id].name##_objs += n; \
+   mp->stats[__lcore_id].name##_bulk += 1; \
+   }   \
} while(0)
 #else
 #define __MEMPOOL_STAT_ADD(mp, name, n) do {} while(0)
@@ -767,8 +769,9 @@ __mempool_put_bulk(struct rte_mempool *mp, void * const 
*obj_table,
__MEMPOOL_STAT_ADD(mp, put, n);

 #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
-   /* cache is not enabled or single producer */
-   if (unlikely(cache_size == 0 || is_mp == 0))
+   /* cache is not enabled or single producer or non-EAL thread */
+   if (unlikely(cache_size == 0 || is_mp == 0 ||
+lcore_id >= RTE_MAX_LCORE))
goto ring_enqueue;

/* Go straight to ring if put would overflow mem allocated for cache */
@@ -952,7 +955,8 @@ __mempool_get_bulk(struct rte_mempool *mp, void **obj_table,
uint32_t cache_size = mp->cache_size;

/* cache is not enabled or single consumer */
-   if (unlikely(cache_size == 0 || is_mc == 0 || n >= cache_size))
+   if (unlikely(cache_size == 0 || is_mc == 0 ||
+n >= cache_size || lcore_id >= RTE_MAX_LCORE))
goto ring_dequeue;

cache = &mp->local_cache[lcore_id];
-- 
1.8.1.4
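
The cache bypass described in the patch above can be sketched as follows. This is a simplified illustration under stated assumptions, not the rte_mempool code: pool_put(), local_cache, shared_pool and the sizes are hypothetical, and the plain array standing in for the rte_ring is not thread-safe.

```c
#define MAX_LCORE  4   /* stand-in for RTE_MAX_LCORE */
#define CACHE_SIZE 8   /* hypothetical per-lcore cache capacity */
#define POOL_SIZE  64  /* hypothetical shared pool capacity */

struct cache {
	unsigned len;
	void *objs[CACHE_SIZE];
};

static struct cache local_cache[MAX_LCORE];
static void *shared_pool[POOL_SIZE];  /* stands in for the rte_ring */
static unsigned shared_len;

/*
 * Return one object to the pool.
 * A thread with a valid lcore id uses its private cache while it has room.
 * Every other thread (lcore_id >= MAX_LCORE) goes straight to the shared pool.
 * Returns 1 if the cache was used, 0 if the shared pool was used,
 * -1 if the shared pool is full.
 */
static int
pool_put(unsigned lcore_id, void *obj)
{
	if (lcore_id < MAX_LCORE && local_cache[lcore_id].len < CACHE_SIZE) {
		struct cache *c = &local_cache[lcore_id];

		c->objs[c->len++] = obj;
		return 1;
	}

	if (shared_len == POOL_SIZE)
		return -1;

	shared_pool[shared_len++] = obj;
	return 0;
}
```

The id check comes before any access to local_cache, so a thread without an lcore id never indexes the per-lcore array.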

[dpdk-dev] [PATCH v4 13/17] eal: fix recursive spinlock in non-EAL thread

2015-02-02 Thread Cunming Liang

In a non-EAL thread, lcore_id is always LCORE_ID_ANY.
It therefore cannot be used as a unique id for the recursive spinlock.
Use rte_gettid() instead.

Signed-off-by: Cunming Liang 
---
 lib/librte_eal/common/include/generic/rte_spinlock.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/generic/rte_spinlock.h 
b/lib/librte_eal/common/include/generic/rte_spinlock.h
index dea885c..c7fb0df 100644
--- a/lib/librte_eal/common/include/generic/rte_spinlock.h
+++ b/lib/librte_eal/common/include/generic/rte_spinlock.h
@@ -179,7 +179,7 @@ static inline void 
rte_spinlock_recursive_init(rte_spinlock_recursive_t *slr)
  */
 static inline void rte_spinlock_recursive_lock(rte_spinlock_recursive_t *slr)
 {
-   int id = rte_lcore_id();
+   int id = rte_gettid();

if (slr->user != id) {
rte_spinlock_lock(&slr->sl);
@@ -212,7 +212,7 @@ static inline void 
rte_spinlock_recursive_unlock(rte_spinlock_recursive_t *slr)
  */
 static inline int rte_spinlock_recursive_trylock(rte_spinlock_recursive_t *slr)
 {
-   int id = rte_lcore_id();
+   int id = rte_gettid();

if (slr->user != id) {
if (rte_spinlock_trylock(&slr->sl) == 0)
-- 
1.8.1.4
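
Why the owner id must be unique per thread can be seen in a reduced model of a recursive spinlock. The sketch below is an assumption-laden illustration built on C11 atomics; struct rec_lock, rec_lock() and rec_unlock() are hypothetical names and not the rte_spinlock_recursive_t implementation.

```c
#include <stdatomic.h>

/* Hypothetical miniature of a recursive spinlock: owner id plus depth. */
struct rec_lock {
	atomic_flag sl;     /* the underlying spinlock */
	volatile int user;  /* id of the owning thread, -1 when free */
	int count;          /* recursion depth */
};

#define REC_LOCK_INIT { ATOMIC_FLAG_INIT, -1, 0 }

/* Take the lock; id must be unique per thread and must never be -1. */
static void
rec_lock(struct rec_lock *l, int id)
{
	if (l->user != id) {
		while (atomic_flag_test_and_set_explicit(&l->sl,
							 memory_order_acquire))
			; /* spin until the current owner releases */
		l->user = id;
	}
	l->count++;
}

/* Drop one level of recursion; release the spinlock at depth zero. */
static void
rec_unlock(struct rec_lock *l)
{
	if (--l->count == 0) {
		l->user = -1;
		atomic_flag_clear_explicit(&l->sl, memory_order_release);
	}
}
```

If a caller passed -1 as its id, which is what LCORE_ID_ANY becomes when stored in an int, the test `l->user != id` would be false on a free lock. The thread would skip the underlying spinlock entirely, and a second thread doing the same could enter the critical section concurrently. A system thread id avoids that collision.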

[dpdk-dev] [PATCH v4 12/17] eal: set _lcore_id and _socket_id to (-1) by default

2015-02-02 Thread Cunming Liang

For non-EAL threads, *_lcore_id* shall always be LCORE_ID_ANY.
Libraries that use *_lcore_id* as an index need to handle this case.
*_socket_id* is always SOCKET_ID_ANY until the thread changes its affinity
with rte_thread_set_affinity().

Signed-off-by: Cunming Liang 
---
 lib/librte_eal/bsdapp/eal/eal_thread.c   | 4 ++--
 lib/librte_eal/linuxapp/eal/eal_thread.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c 
b/lib/librte_eal/bsdapp/eal/eal_thread.c
index 5b16302..2b3c9a8 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -56,8 +56,8 @@
 #include "eal_private.h"
 #include "eal_thread.h"

-RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
-RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = (unsigned)LCORE_ID_ANY;
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
 RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);

 /*
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c 
b/lib/librte_eal/linuxapp/eal/eal_thread.c
index 6eb1525..ab94e20 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -57,8 +57,8 @@
 #include "eal_private.h"
 #include "eal_thread.h"

-RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
-RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = (unsigned)LCORE_ID_ANY;
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
 RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);

 /*
-- 
1.8.1.4

[dpdk-dev] [PATCH v4 11/17] log: fix the gap to support non-EAL thread

2015-02-02 Thread Cunming Liang

For non-EAL threads, *_lcore_id* is invalid and probably larger than
RTE_MAX_LCORE.
The patch adds the check and allows only EAL threads to use the EAL
per-thread log level and log type.
Other threads share the global log level and log type.

Signed-off-by: Cunming Liang 
---
 lib/librte_eal/common/eal_common_log.c  | 17 +++--
 lib/librte_eal/common/include/rte_log.h |  5 +
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_log.c 
b/lib/librte_eal/common/eal_common_log.c
index cf57619..e8dc94a 100644
--- a/lib/librte_eal/common/eal_common_log.c
+++ b/lib/librte_eal/common/eal_common_log.c
@@ -193,11 +193,20 @@ rte_set_log_type(uint32_t type, int enable)
rte_logs.type &= (~type);
 }

+/* Get global log type */
+uint32_t
+rte_get_log_type(void)
+{
+   return rte_logs.type;
+}
+
 /* get the current loglevel for the message being processed */
 int rte_log_cur_msg_loglevel(void)
 {
unsigned lcore_id;
lcore_id = rte_lcore_id();
+   if (lcore_id >= RTE_MAX_LCORE)
+   return rte_get_log_level();
return log_cur_msg[lcore_id].loglevel;
 }

@@ -206,6 +215,8 @@ int rte_log_cur_msg_logtype(void)
 {
unsigned lcore_id;
lcore_id = rte_lcore_id();
+   if (lcore_id >= RTE_MAX_LCORE)
+   return rte_get_log_type();
return log_cur_msg[lcore_id].logtype;
 }

@@ -265,8 +276,10 @@ rte_vlog(__attribute__((unused)) uint32_t level,

/* save loglevel and logtype in a global per-lcore variable */
lcore_id = rte_lcore_id();
-   log_cur_msg[lcore_id].loglevel = level;
-   log_cur_msg[lcore_id].logtype = logtype;
+   if (lcore_id < RTE_MAX_LCORE) {
+   log_cur_msg[lcore_id].loglevel = level;
+   log_cur_msg[lcore_id].logtype = logtype;
+   }

ret = vfprintf(f, format, ap);
fflush(f);
diff --git a/lib/librte_eal/common/include/rte_log.h 
b/lib/librte_eal/common/include/rte_log.h
index db1ea08..f83a0d9 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -144,6 +144,11 @@ uint32_t rte_get_log_level(void);
 void rte_set_log_type(uint32_t type, int enable);

 /**
+ * Get the global log type.
+ */
+uint32_t rte_get_log_type(void);
+
+/**
  * Get the current loglevel for the message being processed.
  *
  * Before calling the user-defined stream for logging, the log
-- 
1.8.1.4

[dpdk-dev] [PATCH v4 10/17] malloc: fix the issue of SOCKET_ID_ANY

2015-02-02 Thread Cunming Liang

Add a check on rte_socket_id() to avoid an unexpected return value such as (-1).

Signed-off-by: Cunming Liang 
---
 lib/librte_malloc/malloc_heap.h | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/lib/librte_malloc/malloc_heap.h b/lib/librte_malloc/malloc_heap.h
index b4aec45..a47136d 100644
--- a/lib/librte_malloc/malloc_heap.h
+++ b/lib/librte_malloc/malloc_heap.h
@@ -44,7 +44,12 @@ extern "C" {
 static inline unsigned
 malloc_get_numa_socket(void)
 {
-   return rte_socket_id();
+   unsigned socket_id = rte_socket_id();
+
+   if (socket_id == (unsigned)SOCKET_ID_ANY)
+   return 0;
+
+   return socket_id;
 }

 void *
-- 
1.8.1.4

[dpdk-dev] [PATCH v4 09/17] enic: fix re-define freebsd compile complain

2015-02-02 Thread Cunming Liang

Some macros are already defined by FreeBSD's 'sys/param.h'.

Signed-off-by: Cunming Liang 
---
 lib/librte_pmd_enic/enic.h| 1 +
 lib/librte_pmd_enic/enic_compat.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/lib/librte_pmd_enic/enic.h b/lib/librte_pmd_enic/enic.h
index c43417c..189c3b9 100644
--- a/lib/librte_pmd_enic/enic.h
+++ b/lib/librte_pmd_enic/enic.h
@@ -66,6 +66,7 @@
 #define ENIC_CALC_IP_CKSUM  1
 #define ENIC_CALC_TCP_UDP_CKSUM 2
 #define ENIC_MAX_MTU9000
+#undef PAGE_SIZE
 #define PAGE_SIZE   4096
 #define PAGE_ROUND_UP(x) \
((((unsigned long)(x)) + PAGE_SIZE-1) & (~(PAGE_SIZE-1)))
diff --git a/lib/librte_pmd_enic/enic_compat.h 
b/lib/librte_pmd_enic/enic_compat.h
index b1af838..b84c766 100644
--- a/lib/librte_pmd_enic/enic_compat.h
+++ b/lib/librte_pmd_enic/enic_compat.h
@@ -67,6 +67,7 @@
 #define pr_warn(y, args...) dev_warning(0, y, ##args)
 #define BUG() pr_err("BUG at %s:%d", __func__, __LINE__)

+#undef ALIGN
 #define ALIGN(x, a)  __ALIGN_MASK(x, (typeof(x))(a)-1)
 #define __ALIGN_MASK(x, mask)(((x)+(mask))&~(mask))
 #define udelay usleep
-- 
1.8.1.4

[dpdk-dev] [PATCH v4 07/17] eal: add rte_gettid() to acquire unique system tid

2015-02-02 Thread Cunming Liang

rte_gettid() wraps the Linux and FreeBSD system calls that return the thread id.
It provides a persistent unique thread id for the calling thread.
It saves the unique id in TLS on the first call.

Signed-off-by: Cunming Liang 
---
 lib/librte_eal/bsdapp/eal/eal_thread.c   |  9 +
 lib/librte_eal/common/include/rte_eal.h  | 27 +++
 lib/librte_eal/linuxapp/eal/eal_thread.c |  7 +++
 3 files changed, 43 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c 
b/lib/librte_eal/bsdapp/eal/eal_thread.c
index 10220c7..d0c077b 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -39,6 +39,7 @@
 #include 
 #include 
 #include 
+#include 

 #include 
 #include 
@@ -233,3 +234,11 @@ eal_thread_loop(__attribute__((unused)) void *arg)
/* pthread_exit(NULL); */
/* return NULL; */
 }
+
+/* require calling thread tid by gettid() */
+int rte_sys_gettid(void)
+{
+   long lwpid;
+   thr_self(&lwpid);
+   return (int)lwpid;
+}
diff --git a/lib/librte_eal/common/include/rte_eal.h 
b/lib/librte_eal/common/include/rte_eal.h
index f4ecd2e..8ccdd65 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -41,6 +41,9 @@
  */

 #include 
+#include 
+
+#include 

 #ifdef __cplusplus
 extern "C" {
@@ -262,6 +265,30 @@ rte_set_application_usage_hook( rte_usage_hook_t 
usage_func );
  */
 int rte_eal_has_hugepages(void);

+/**
+ * A wrapper API for the gettid syscall.
+ *
+ * @return
+ *   The thread ID of the calling thread.
+ *   The call always succeeds.
+ */
+int rte_sys_gettid(void);
+
+/**
+ * Get the system-wide unique thread id.
+ *
+ * @return
+ *   The thread ID of the calling thread.
+ *   The call always succeeds.
+ */
+static inline int rte_gettid(void)
+{
+   static RTE_DEFINE_PER_LCORE(int, _thread_id) = -1;
+   if (RTE_PER_LCORE(_thread_id) == -1)
+   RTE_PER_LCORE(_thread_id) = rte_sys_gettid();
+   return RTE_PER_LCORE(_thread_id);
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c 
b/lib/librte_eal/linuxapp/eal/eal_thread.c
index 748a83a..ed20c93 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -39,6 +39,7 @@
 #include 
 #include 
 #include 
+#include 

 #include 
 #include 
@@ -233,3 +234,9 @@ eal_thread_loop(__attribute__((unused)) void *arg)
/* pthread_exit(NULL); */
/* return NULL; */
 }
+
+/* require calling thread tid by gettid() */
+int rte_sys_gettid(void)
+{
+   return (int)syscall(SYS_gettid);
+}
-- 
1.8.1.4
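
The caching idea behind rte_gettid() can be sketched without the EAL per-lcore macros. This is a Linux-only illustration under assumptions: cached_gettid() is a hypothetical name, and the `__thread` storage class is a GCC/Clang extension used here in place of RTE_DEFINE_PER_LCORE.

```c
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>

/*
 * Return the kernel thread id of the caller.
 * The id is fetched with a system call only on the first invocation
 * in each thread and is served from thread-local storage afterwards.
 */
static int
cached_gettid(void)
{
	static __thread int tid = -1;   /* one copy per thread */

	if (tid == -1)
		tid = (int)syscall(SYS_gettid);

	return tid;
}
```

The sentinel -1 is safe because a valid Linux thread id is always positive, so the system call is never repeated once a thread has cached its id.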

[dpdk-dev] [PATCH v4 06/17] eal: add eal_common_thread.c for common thread API

2015-02-02 Thread Cunming Liang

The API works for both EAL threads and non-EAL threads.
When rte_thread_set_affinity() is called, the *_socket_id* and
*_cpuset* of the calling thread are updated if the thread
successfully sets its CPU affinity.

Signed-off-by: Cunming Liang 
---
 lib/librte_eal/bsdapp/eal/Makefile|   1 +
 lib/librte_eal/common/eal_common_thread.c | 142 ++
 lib/librte_eal/linuxapp/eal/Makefile  |   2 +
 3 files changed, 145 insertions(+)
 create mode 100644 lib/librte_eal/common/eal_common_thread.c

diff --git a/lib/librte_eal/bsdapp/eal/Makefile 
b/lib/librte_eal/bsdapp/eal/Makefile
index d434882..78406be 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -73,6 +73,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_hexdump.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_devargs.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_options.c
+SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_thread.c

 CFLAGS_eal.o := -D_GNU_SOURCE
 #CFLAGS_eal_thread.o := -D_GNU_SOURCE
diff --git a/lib/librte_eal/common/eal_common_thread.c 
b/lib/librte_eal/common/eal_common_thread.c
new file mode 100644
index 000..d996690
--- /dev/null
+++ b/lib/librte_eal/common/eal_common_thread.c
@@ -0,0 +1,142 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include "eal_thread.h"
+
+int
+rte_thread_set_affinity(rte_cpuset_t *cpusetp)
+{
+   int s;
+   unsigned lcore_id;
+   pthread_t tid;
+
+   if (!cpusetp)
+   return -1;
+
+   lcore_id = rte_lcore_id();
+   if (lcore_id != (unsigned)LCORE_ID_ANY) {
+   /* EAL thread */
+   tid = lcore_config[lcore_id].thread_id;
+
+   s = pthread_setaffinity_np(tid, sizeof(rte_cpuset_t), cpusetp);
+   if (s != 0) {
+   RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
+   return -1;
+   }
+
+   /* store socket_id in TLS for quick access */
+   RTE_PER_LCORE(_socket_id) =
+   eal_cpuset_socket_id(cpusetp);
+
+   /* store cpuset in TLS for quick access */
+   rte_memcpy(&RTE_PER_LCORE(_cpuset), cpusetp,
+  sizeof(rte_cpuset_t));
+
+   /* update lcore_config */
+   lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id);
+   rte_memcpy(&lcore_config[lcore_id].cpuset, cpusetp,
+  sizeof(rte_cpuset_t));
+   } else {
+   /* non-EAL thread */
+   tid = pthread_self();
+
+   s = pthread_setaffinity_np(tid, sizeof(rte_cpuset_t), cpusetp);
+   if (s != 0) {
+   RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
+   return -1;
+   }
+
+   /* store cpuset in TLS for quick access */
+   rte_memcpy(&RTE_PER_LCORE(_cpuset), cpusetp,
+  sizeof(rte_cpuset_t));
+
+   /* store socket_id in TLS for quick access */
+   RTE_PER_LCORE(_socket_id) =
+

[dpdk-dev] [PATCH v4 05/17] eal: new TLS definition and API declaration

2015-02-02 Thread Cunming Liang

1. add two TLS variables, *_socket_id* and *_cpuset*
2. add two external APIs, rte_thread_set_affinity() and rte_thread_get_affinity()
3. add one internal API, eal_thread_dump_affinity()

Signed-off-by: Cunming Liang 
---
 lib/librte_eal/bsdapp/eal/eal_thread.c|  2 ++
 lib/librte_eal/common/eal_thread.h| 14 ++
 lib/librte_eal/common/include/rte_lcore.h | 29 +++--
 lib/librte_eal/linuxapp/eal/eal_thread.c  |  2 ++
 4 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c 
b/lib/librte_eal/bsdapp/eal/eal_thread.c
index ab05368..10220c7 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -56,6 +56,8 @@
 #include "eal_thread.h"

 RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);

 /*
  * Send a message to a slave lcore identified by slave_id to call a
diff --git a/lib/librte_eal/common/eal_thread.h 
b/lib/librte_eal/common/eal_thread.h
index a25ee86..28edf51 100644
--- a/lib/librte_eal/common/eal_thread.h
+++ b/lib/librte_eal/common/eal_thread.h
@@ -102,4 +102,18 @@ eal_cpuset_socket_id(rte_cpuset_t *cpusetp)
return socket_id;
 }

+/**
+ * Dump the current pthread cpuset.
+ * This function is private to EAL.
+ *
+ * @param str
+ *   The string buffer the cpuset will dump to.
+ * @param size
+ *   The string buffer size.
+ */
+#define CPU_STR_LEN256
+void
+eal_thread_dump_affinity(char str[], unsigned size);
+
+
 #endif /* EAL_THREAD_H */
diff --git a/lib/librte_eal/common/include/rte_lcore.h 
b/lib/librte_eal/common/include/rte_lcore.h
index 4c7d6bb..facdbdc 100644
--- a/lib/librte_eal/common/include/rte_lcore.h
+++ b/lib/librte_eal/common/include/rte_lcore.h
@@ -43,6 +43,7 @@
 #include 
 #include 
 #include 
+#include 

 #ifdef __cplusplus
 extern "C" {
@@ -80,7 +81,9 @@ struct lcore_config {
  */
 extern struct lcore_config lcore_config[RTE_MAX_LCORE];

-RTE_DECLARE_PER_LCORE(unsigned, _lcore_id); /**< Per core "core id". */
+RTE_DECLARE_PER_LCORE(unsigned, _lcore_id);  /**< Per thread "lcore id". */
+RTE_DECLARE_PER_LCORE(unsigned, _socket_id); /**< Per thread "socket id". */
+RTE_DECLARE_PER_LCORE(rte_cpuset_t, _cpuset); /**< Per thread "cpuset". */

 /**
  * Return the ID of the execution unit we are running on.
@@ -146,7 +149,7 @@ rte_lcore_index(int lcore_id)
 static inline unsigned
 rte_socket_id(void)
 {
-   return lcore_config[rte_lcore_id()].socket_id;
+   return RTE_PER_LCORE(_socket_id);
 }

 /**
@@ -229,6 +232,28 @@ rte_get_next_lcore(unsigned i, int skip_master, int wrap)
 i

[dpdk-dev] [PATCH v4 04/17] eal: add support parsing socket_id from cpuset

2015-02-02 Thread Cunming Liang

It returns the socket_id if all CPUs in the cpuset belong
to the same NUMA node; otherwise it returns SOCKET_ID_ANY.

Signed-off-by: Cunming Liang 
---
 lib/librte_eal/bsdapp/eal/eal_lcore.c   |  7 +
 lib/librte_eal/common/eal_thread.h  | 52 +
 lib/librte_eal/linuxapp/eal/eal_lcore.c |  7 +
 3 files changed, 66 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_lcore.c 
b/lib/librte_eal/bsdapp/eal/eal_lcore.c
index 72f8ac2..162fb4f 100644
--- a/lib/librte_eal/bsdapp/eal/eal_lcore.c
+++ b/lib/librte_eal/bsdapp/eal/eal_lcore.c
@@ -41,6 +41,7 @@
 #include 

 #include "eal_private.h"
+#include "eal_thread.h"

 /* No topology information available on FreeBSD including NUMA info */
 #define cpu_core_id(X) 0
@@ -112,3 +113,9 @@ rte_eal_cpu_init(void)

return 0;
 }
+
+unsigned
+eal_cpu_socket_id(__rte_unused unsigned cpu_id)
+{
+   return cpu_socket_id(cpu_id);
+}
diff --git a/lib/librte_eal/common/eal_thread.h 
b/lib/librte_eal/common/eal_thread.h
index b53b84d..a25ee86 100644
--- a/lib/librte_eal/common/eal_thread.h
+++ b/lib/librte_eal/common/eal_thread.h
@@ -34,6 +34,10 @@
 #ifndef EAL_THREAD_H
 #define EAL_THREAD_H

+#include 
+
+#include 
+
 /**
  * basic loop of thread, called for each thread by eal_init().
  *
@@ -50,4 +54,52 @@ __attribute__((noreturn)) void *eal_thread_loop(void *arg);
  */
 void eal_thread_init_master(unsigned lcore_id);

+/**
+ * Get the NUMA socket id from cpu id.
+ * This function is private to EAL.
+ *
+ * @param cpu_id
+ *   The logical process id.
+ * @return
+ *   socket_id or SOCKET_ID_ANY
+ */
+unsigned eal_cpu_socket_id(unsigned cpu_id);
+
+/**
+ * Get the NUMA socket id from cpuset.
+ * This function is private to EAL.
+ *
+ * @param cpusetp
+ *   Pointer to a valid cpu set.
+ * @return
+ *   socket_id or SOCKET_ID_ANY
+ */
+static inline int
+eal_cpuset_socket_id(rte_cpuset_t *cpusetp)
+{
+   unsigned cpu = 0;
+   int socket_id = SOCKET_ID_ANY;
+   int sid;
+
+   if (cpusetp == NULL)
+   return SOCKET_ID_ANY;
+
+   do {
+   if (!CPU_ISSET(cpu, cpusetp))
+   continue;
+
+   if (socket_id == SOCKET_ID_ANY)
+   socket_id = eal_cpu_socket_id(cpu);
+
+   sid = eal_cpu_socket_id(cpu);
+   if (socket_id != sid) {
+   socket_id = SOCKET_ID_ANY;
+   break;
+   }
+
+   } while (++cpu < RTE_MAX_LCORE);
+
+   return socket_id;
+}
+
 #endif /* EAL_THREAD_H */
diff --git a/lib/librte_eal/linuxapp/eal/eal_lcore.c 
b/lib/librte_eal/linuxapp/eal/eal_lcore.c
index 29615f8..922af6d 100644
--- a/lib/librte_eal/linuxapp/eal/eal_lcore.c
+++ b/lib/librte_eal/linuxapp/eal/eal_lcore.c
@@ -45,6 +45,7 @@

 #include "eal_private.h"
 #include "eal_filesystem.h"
+#include "eal_thread.h"

 #define SYS_CPU_DIR "/sys/devices/system/cpu/cpu%u"
 #define CORE_ID_FILE "topology/core_id"
@@ -197,3 +198,9 @@ rte_eal_cpu_init(void)

return 0;
 }
+
+unsigned
+eal_cpu_socket_id(unsigned cpu_id)
+{
+   return cpu_socket_id(cpu_id);
+}
-- 
1.8.1.4
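
The rule implemented by eal_cpuset_socket_id() can be shown with a portable model. The sketch below is an illustration under assumptions: it replaces rte_cpuset_t with a plain bitmask and the sysfs lookup with a fixed table, and the names cpumask_socket_id, cpu_to_socket, MAX_CPU and SOCKET_ANY are hypothetical.

```c
#include <stdint.h>

#define MAX_CPU    8      /* hypothetical number of CPUs */
#define SOCKET_ANY (-1)   /* stand-in for SOCKET_ID_ANY */

/* Hypothetical topology: CPUs 0-3 on socket 0, CPUs 4-7 on socket 1. */
static const int cpu_to_socket[MAX_CPU] = { 0, 0, 0, 0, 1, 1, 1, 1 };

/*
 * Return the socket shared by every CPU whose bit is set in cpumask,
 * or SOCKET_ANY if the mask is empty or spans more than one socket.
 */
static int
cpumask_socket_id(uint32_t cpumask)
{
	int socket_id = SOCKET_ANY;
	unsigned cpu;

	for (cpu = 0; cpu < MAX_CPU; cpu++) {
		if (!(cpumask & (UINT32_C(1) << cpu)))
			continue;

		if (socket_id == SOCKET_ANY)
			socket_id = cpu_to_socket[cpu];
		else if (socket_id != cpu_to_socket[cpu])
			return SOCKET_ANY;
	}

	return socket_id;
}
```

With this topology, a mask covering CPUs 1 and 2 resolves to socket 0, while a mask covering CPUs 3 and 4 crosses the socket boundary and yields SOCKET_ANY.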

[dpdk-dev] [PATCH v4 00/17] support multi-pthread per core

2015-02-02 Thread Cunming Liang

v4 changes:
  new patch fixing strnlen() invalid return in 32bit icc [03/17]
  update and add more comments on sched_yield() [16/17]

v3 changes:
  new patch adding sched_yield() in rte_ring to avoid long spin [16/17]

v2 changes:
  add '-' support for EAL option '--lcores' [02/17]

The patch series contains enhancements to EAL and fixes for libraries
so that multiple pthreads (either EAL or non-EAL threads) can run per physical core.
The two major changes are listed below:
- Extend the core affinity of each EAL thread to 1:n.
  Each lcore stands for an EAL thread rather than a logical core.
  The change adds a new EAL option to allow static lcore-to-cpuset assignment.
  An lcore (EAL thread) then has affinity to a cpuset; the original 1:1 mapping
  is the special case.
- Fix the libraries to allow running on any non-EAL thread.
  This closes the gaps when running libraries in a non-EAL thread (dynamically
  created by the user).
  Each fixed library takes care of the case rte_lcore_id() >= RTE_MAX_LCORE.

Thanks a million for the comments from Konstantin, Bruce, Mirek and Stephen in 
RFC review.



Cunming Liang (17):
  eal: add cpuset into per EAL thread lcore_config
  eal: new eal option '--lcores' for cpu assignment
  eal: fix wrong strnlen() return value in 32bit icc
  eal: add support parsing socket_id from cpuset
  eal: new TLS definition and API declaration
  eal: add eal_common_thread.c for common thread API
  eal: add rte_gettid() to acquire unique system tid
  eal: apply affinity of EAL thread by assigned cpuset
  enic: fix re-define freebsd compile complain
  malloc: fix the issue of SOCKET_ID_ANY
  log: fix the gap to support non-EAL thread
  eal: set _lcore_id and _socket_id to (-1) by default
  eal: fix recursive spinlock in non-EAL thread
  mempool: add support to non-EAL thread
  ring: add support to non-EAL thread
  ring: add sched_yield to avoid spin forever
  timer: add support to non-EAL thread

 lib/librte_eal/bsdapp/eal/Makefile |   1 +
 lib/librte_eal/bsdapp/eal/eal.c|  13 +-
 lib/librte_eal/bsdapp/eal/eal_lcore.c  |  14 +
 lib/librte_eal/bsdapp/eal/eal_memory.c |   2 +
 lib/librte_eal/bsdapp/eal/eal_thread.c |  76 +++---
 lib/librte_eal/common/eal_common_launch.c  |   1 -
 lib/librte_eal/common/eal_common_log.c |  17 +-
 lib/librte_eal/common/eal_common_options.c | 302 -
 lib/librte_eal/common/eal_common_thread.c  | 142 ++
 lib/librte_eal/common/eal_options.h|   2 +
 lib/librte_eal/common/eal_thread.h |  66 +
 .../common/include/generic/rte_spinlock.h  |   4 +-
 lib/librte_eal/common/include/rte_eal.h|  27 ++
 lib/librte_eal/common/include/rte_lcore.h  |  37 ++-
 lib/librte_eal/common/include/rte_log.h|   5 +
 lib/librte_eal/linuxapp/eal/Makefile   |   4 +
 lib/librte_eal/linuxapp/eal/eal.c  |   7 +-
 lib/librte_eal/linuxapp/eal/eal_lcore.c|  15 +
 lib/librte_eal/linuxapp/eal/eal_thread.c   |  78 +++---
 lib/librte_malloc/malloc_heap.h|   7 +-
 lib/librte_mempool/rte_mempool.h   |  18 +-
 lib/librte_pmd_enic/enic.h |   1 +
 lib/librte_pmd_enic/enic_compat.h  |   1 +
 lib/librte_ring/rte_ring.h |  45 ++-
 lib/librte_timer/rte_timer.c   |  40 ++-
 lib/librte_timer/rte_timer.h   |   2 +-
 26 files changed, 789 insertions(+), 138 deletions(-)
 create mode 100644 lib/librte_eal/common/eal_common_thread.c

-- 
1.8.1.4

[dpdk-dev] [PATCH 00/18] lib/librte_pmd_fm10k : fm10k pmd driver

2015-02-02 Thread Thomas Monjalon

2015-02-02 02:59, Chen, Jing D:
> Hi,
> 
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> > 2015-01-30 13:46, Jeff Shaw:
> > > On Fri, Jan 30, 2015 at 04:26:33PM -0500, Neil Horman wrote:
> > > > On Fri, Jan 30, 2015 at 01:07:16PM +0800, Chen Jing D(Mark) wrote:
> > > > > From: "Chen Jing D(Mark)" 
> > > > > Jeff Shaw (18):
> > > > >   fm10k: add base driver
> > [...]
> > > > >  lib/librte_pmd_fm10k/SHARED/fm10k_api.c |  327 
> > [...]
> > > >
> > > > Why is there a SHARED directory in the driver?  Are there other drivers
> > that use
> > > > the shared fm10k code?
> > >
> > > No, the other poll-mode drivers do not use the shared fm10k code. The
> > > directory is similar to the 'ixgbe' and 'i40e' directories in their
> > > respective PMDs, only that it is named 'SHARED' for the fm10k driver.
> > 
> > So shared is a bad name in the context of DPDK.
> > Inside Intel, it can be understood that you share it between projects,
> > but in DPDK, it's only a base driver.
> > 
> 
> OK, I'll change "SHARED" to "fm10k".

I think that "base" would be more appropriate:
fm10k/base instead of fm10k/fm10k.

-- 
Thomas


[dpdk-dev] [PATCH v6 0/7] rte_hash_crc reworked to be platform-independent

2015-02-02 Thread Yerden Zhumabekov


01.02.2015 20:13, Neil Horman wrote:
> On Thu, Jan 29, 2015 at 02:48:11PM +0600, Yerden Zhumabekov wrote:
>> This is a rework of my previous patches improving performance of 
>> rte_hash_crc.
>>
>> Summary of changes:
>> * software implementation of CRC32 introduced;
>> * in the runtime, algorithm can fall back to software version if CPU doesn't 
>> support SSE4.2;
>> * best available algorithm is automatically detected upon application 
>> startup;
>> * redundant compile checks removed from test utilities;
>> * assembly code for emitting SSE4.2 instructions is used instead of built-in 
>> intrinsics;
>> * rte_hash_crc() function performance significantly improved.
>>
>> v6 changes:
>> * added 'const' qualifier to crc32c lookup tables declaration.
> Just to be clear, this does build if you compile it against the "default"
> machine type, correct?
> Neil

I think so. I have just built it successfully against the latest snapshot
with RTE_TARGET set to 'x86_64-native-linuxapp-gcc'.
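
For reference, the software fallback mentioned in the summary can be sketched
as a table-driven CRC32C (Castagnoli, reflected polynomial 0x82F63B78). This is
only a minimal sketch and not the code from the series: the names crc32c_sw and
crc32c_init_table are made up here, and the table is built on first use instead
of being declared as a const array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reflected form of the CRC32C (Castagnoli) polynomial. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

static uint32_t crc32c_table[256];
static int crc32c_table_ready;

/* Fill the byte-wise lookup table (hypothetical helper). */
static void
crc32c_init_table(void)
{
	uint32_t i;
	int bit;

	for (i = 0; i < 256; i++) {
		uint32_t crc = i;

		for (bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^
				((crc & 1u) ? CRC32C_POLY_REFLECTED : 0u);
		crc32c_table[i] = crc;
	}
	crc32c_table_ready = 1;
}

/* Update a raw CRC32C state over a buffer; no initial or final inversion. */
static uint32_t
crc32c_sw(const void *data, size_t len, uint32_t init_val)
{
	const uint8_t *p = data;
	uint32_t crc = init_val;

	assert(data != NULL || len == 0);
	if (!crc32c_table_ready)
		crc32c_init_table();
	while (len--)
		crc = (crc >> 8) ^ crc32c_table[(crc ^ *p++) & 0xffu];
	return crc;
}
```

The function only updates the raw state, so the caller chooses the initial
value. With init_val 0xFFFFFFFF and a final inversion of the result, the ASCII
string "123456789" gives the standard CRC-32C check value 0xE3069283.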

-- 
Sincerely,

Yerden Zhumabekov
State Technical Service
Astana, KZ

[dpdk-dev] [PATCH v1] librte_vhost: Add an abstraction to hide vhost-user and cuse devices.

2015-02-02 Thread Linhaifeng


On 2015/2/1 18:36, Tetsuya Mukawa wrote:
> This patch should be put on "lib/librte_vhost: vhost-user support"
> patch series written by Xie, Huawei.
> 
> There are 2 type of vhost devices. One is cuse, the other is vhost-user.
> So far, one of them we can use. To use the other, DPDK is needed to be
> recompiled.

If we use vhost-user, do we also have to install the cuse and fuse modules?
I don't think that is a good idea.


> The patch introduces rte_vhost_dev_type parameter. Using type parameter,
> the DPDK application can use both vhost devices without recompile.
> 
> The type parameter should be specified when following vhost APIs are called.
> - int rte_vhost_driver_register();
> - int rte_vhost_driver_session_start();
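
One way such a type parameter could be dispatched is through a small table of
backend operations indexed by the enum. The following is only a sketch under
assumptions: the *_sketch functions, struct vhost_backend_ops and the
SKETCH_WITH_CUSE switch are hypothetical and do not come from the patch; only
the enum values are taken from it.

```c
#include <stddef.h>

enum rte_vhost_dev_type {
	VHOST_DEV_CUSE, /* cuse driver */
	VHOST_DEV_USER, /* vhost-user driver */
	VHOST_DEV_MAX   /* the number of vhost driver types */
};

/* Hypothetical per-backend entry points. */
struct vhost_backend_ops {
	int (*driver_register)(const char *dev_name);
	int (*session_start)(void);
};

/* Stub vhost-user backend, for illustration only. */
static int user_register(const char *dev_name) { return dev_name ? 0 : -1; }
static int user_session_start(void) { return 0; }
static const struct vhost_backend_ops user_ops = {
	user_register, user_session_start
};

#ifdef SKETCH_WITH_CUSE
/* Stub cuse backend, only built when the (hypothetical) switch is set. */
static int cuse_register(const char *dev_name) { return dev_name ? 0 : -1; }
static int cuse_session_start(void) { return 0; }
static const struct vhost_backend_ops cuse_ops = {
	cuse_register, cuse_session_start
};
#endif

/* A NULL slot means the backend was not built in. */
static const struct vhost_backend_ops *const backends[VHOST_DEV_MAX] = {
#ifdef SKETCH_WITH_CUSE
	[VHOST_DEV_CUSE] = &cuse_ops,
#endif
	[VHOST_DEV_USER] = &user_ops,
};

static int
vhost_driver_register_sketch(const char *dev_name,
		enum rte_vhost_dev_type dev_type)
{
	if (dev_name == NULL || (unsigned int)dev_type >= VHOST_DEV_MAX)
		return -1;
	if (backends[dev_type] == NULL)
		return -1; /* backend not compiled in */
	return backends[dev_type]->driver_register(dev_name);
}

static int
vhost_driver_session_start_sketch(enum rte_vhost_dev_type dev_type)
{
	if ((unsigned int)dev_type >= VHOST_DEV_MAX ||
			backends[dev_type] == NULL)
		return -1;
	return backends[dev_type]->session_start();
}
```

Leaving a slot empty when a backend is not built would let a vhost-user-only
build avoid the fuse dependency, at the cost of a runtime error when the cuse
type is requested.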
> 
> Signed-off-by: Tetsuya Mukawa 
> ---
>  examples/vhost/main.c|  4 +-
>  lib/librte_vhost/Makefile|  4 +-
>  lib/librte_vhost/rte_virtio_net.h| 15 +-
>  lib/librte_vhost/vhost-net.c | 74 
> 
>  lib/librte_vhost/vhost_cuse/vhost-net-cdev.c |  5 +-
>  lib/librte_vhost/vhost_cuse/vhost-net-cdev.h | 42 
>  lib/librte_vhost/vhost_user/vhost-net-user.c |  4 +-
>  lib/librte_vhost/vhost_user/vhost-net-user.h |  7 +++
>  8 files changed, 145 insertions(+), 10 deletions(-)
>  create mode 100644 lib/librte_vhost/vhost-net.c
>  create mode 100644 lib/librte_vhost/vhost_cuse/vhost-net-cdev.h
> 
> diff --git a/examples/vhost/main.c b/examples/vhost/main.c
> index 04f0118..545df72 100644
> --- a/examples/vhost/main.c
> +++ b/examples/vhost/main.c
> @@ -3040,14 +3040,14 @@ main(int argc, char *argv[])
>   rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_MRG_RXBUF);
>  
>   /* Register CUSE device to handle IOCTLs. */
> - ret = rte_vhost_driver_register((char *)_basename);
> + ret = rte_vhost_driver_register((char *)_basename, VHOST_DEV_CUSE);
>   if (ret != 0)
>   rte_exit(EXIT_FAILURE,"CUSE device setup failure.\n");
>  
>   rte_vhost_driver_callback_register(_net_device_ops);
>  
>   /* Start CUSE session. */
> - rte_vhost_driver_session_start();
> + rte_vhost_driver_session_start(VHOST_DEV_CUSE);
>   return 0;
>  
>  }
> diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
> index 22319b8..cc95415 100644
> --- a/lib/librte_vhost/Makefile
> +++ b/lib/librte_vhost/Makefile
> @@ -39,8 +39,8 @@ CFLAGS += -I vhost_cuse -lfuse
>  CFLAGS += -I vhost_user
>  LDFLAGS += -lfuse
>  # all source are stored in SRCS-y
> -SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := virtio-net.c vhost_rxtx.c
> -#SRCS-$(CONFIG_RTE_LIBRTE_VHOST) += vhost_cuse/vhost-net-cdev.c 
> vhost_cuse/virtio-net-cdev.c vhost_cuse/eventfd_copy.c
> +SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := virtio-net.c vhost-net.c vhost_rxtx.c
> +SRCS-$(CONFIG_RTE_LIBRTE_VHOST) += vhost_cuse/vhost-net-cdev.c 
> vhost_cuse/virtio-net-cdev.c vhost_cuse/eventfd_copy.c
>  SRCS-$(CONFIG_RTE_LIBRTE_VHOST) += vhost_user/vhost-net-user.c 
> vhost_user/virtio-net-user.c vhost_user/fd_man.c
>  
>  # install includes
> diff --git a/lib/librte_vhost/rte_virtio_net.h 
> b/lib/librte_vhost/rte_virtio_net.h
> index 611a3d4..7b3952c 100644
> --- a/lib/librte_vhost/rte_virtio_net.h
> +++ b/lib/librte_vhost/rte_virtio_net.h
> @@ -166,6 +166,15 @@ gpa_to_vva(struct virtio_net *dev, uint64_t guest_pa)
>  }
>  
>  /**
> + * Enum for vhost device types.
> + */
> +enum rte_vhost_dev_type {
> + VHOST_DEV_CUSE, /* cuse driver */
> + VHOST_DEV_USER, /* vhost-user driver */
> + VHOST_DEV_MAX   /* the number of vhost driver types */
> +};
> +
> +/**
>   *  Disable features in feature_mask. Returns 0 on success.
>   */
>  int rte_vhost_feature_disable(uint64_t feature_mask);
> @@ -181,12 +190,14 @@ uint64_t rte_vhost_feature_get(void);
>  int rte_vhost_enable_guest_notification(struct virtio_net *dev, uint16_t 
> queue_id, int enable);
>  
>  /* Register vhost driver. dev_name could be different for multiple instance 
> support. */
> -int rte_vhost_driver_register(const char *dev_name);
> +int rte_vhost_driver_register(const char *dev_name,
> + enum rte_vhost_dev_type dev_type);
>  
>  /* Register callbacks. */
>  int rte_vhost_driver_callback_register(struct virtio_net_device_ops const * 
> const);
> +
>  /* Start vhost driver session blocking loop. */
> -int rte_vhost_driver_session_start(void);
> +int rte_vhost_driver_session_start(enum rte_vhost_dev_type dev_type);
>  
>  /**
>   * This function adds buffers to the virtio devices RX virtqueue. Buffers can
> diff --git a/lib/librte_vhost/vhost-net.c b/lib/librte_vhost/vhost-net.c
> new file mode 100644
> index 000..d0316d7
> --- /dev/null
> +++ b/lib/librte_vhost/vhost-net.c
> @@ -0,0 +1,74 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2015 IGEL Co.,Ltd. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the

[dpdk-dev] [PATCH 18/18] Change mk/rte.app.mk to add fm10k lib into link

2015-02-02 Thread Chen, Jing D

Hi Neil,

> -Original Message-
> From: Neil Horman [mailto:nhorman at tuxdriver.com]
> Sent: Sunday, February 01, 2015 8:51 AM
> To: Chen, Jing D
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 18/18] Change mk/rte.app.mk to add fm10k
> lib into link
> 
> On Fri, Jan 30, 2015 at 01:07:34PM +0800, Chen Jing D(Mark) wrote:
> > From: Jeff Shaw 
> >
> > Signed-off-by: Jeff Shaw 
> > Signed-off-by: Chen Jing D(Mark) 
> > ---
> >  mk/rte.app.mk |4 
> >  1 files changed, 4 insertions(+), 0 deletions(-)
> >
> > diff --git a/mk/rte.app.mk b/mk/rte.app.mk
> > index 4294d9a..87d8763 100644
> > --- a/mk/rte.app.mk
> > +++ b/mk/rte.app.mk
> > @@ -211,6 +211,10 @@ ifeq ($(CONFIG_RTE_LIBRTE_I40E_PMD),y)
> >  LDLIBS += -lrte_pmd_i40e
> >  endif
> >
> > +ifeq ($(CONFIG_RTE_LIBRTE_FM10K_PMD),y)
> > +LDLIBS += -lrte_pmd_fm10k
> > +endif
> > +
> >  ifeq ($(CONFIG_RTE_LIBRTE_IXGBE_PMD),y)
> >  LDLIBS += -lrte_pmd_ixgbe
> >  endif
> > --
> > 1.7.7.6
> >
> >
> This patch should be merged with patch 17, and patch 2, and placed at the
> end of
> your series to avoid a FTBFS issue

My rationale is that no single patch should break the build. So I'd like to
add the library to the compile and link steps in the last 2 patches, after all
the actual code has been added. For patch 2, I think you are right; maybe a
better way is to move it so that it becomes patch "16".

But I'm not sure whether I should merge these 3 together. Some people may not
be happy to see changes in different directories appear in a single patch.

> Neil

[dpdk-dev] [PATCH 16/18] fm10k: add PF and VF interrupt handling function

2015-02-02 Thread Chen, Jing D

Hi Neil,

> -Original Message-
> From: Neil Horman [mailto:nhorman at tuxdriver.com]
> Sent: Sunday, February 01, 2015 8:43 AM
> To: Chen, Jing D
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 16/18] fm10k: add PF and VF interrupt
> handling function
> 
> On Fri, Jan 30, 2015 at 01:07:32PM +0800, Chen Jing D(Mark) wrote:
> > From: Jeff Shaw 
> >
> > 1. Add 2 interrupt handling functions, one for PF and one for VF.
> > 2. Enable interrupt after completing initialization of NIC.
> >
> This seems to do way more than enable interrupt handling.  Can you be a bit
> more
> desriptive here?

OK, I'll try to add more description in the log. 

> Neil
> 
> > Signed-off-by: Jeff Shaw 
> > Signed-off-by: Chen Jing D(Mark) 
> > ---
> >  lib/librte_pmd_fm10k/fm10k_ethdev.c |  268
> +++
> >  1 files changed, 268 insertions(+), 0 deletions(-)
> >
> > diff --git a/lib/librte_pmd_fm10k/fm10k_ethdev.c
> b/lib/librte_pmd_fm10k/fm10k_ethdev.c
> > index 40e3a2b..685fa8f 100644
> > --- a/lib/librte_pmd_fm10k/fm10k_ethdev.c
> > +++ b/lib/librte_pmd_fm10k/fm10k_ethdev.c
> > @@ -1325,6 +1325,256 @@ fm10k_rss_hash_conf_get(struct rte_eth_dev
> *dev,
> > return 0;
> >  }
> >
> > +static void
> > +fm10k_dev_enable_intr_pf(struct rte_eth_dev *dev)
> > +{
> > +   struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data-
> >dev_private);
> > +   uint32_t int_map = FM10K_INT_MAP_IMMEDIATE;
> > +
> > +   /* Bind all local non-queue interrupt to vector 0 */
> > +   int_map |= 0;
> > +
> > +   FM10K_WRITE_REG(hw, FM10K_INT_MAP(fm10k_int_Mailbox),
> int_map);
> > +   FM10K_WRITE_REG(hw, FM10K_INT_MAP(fm10k_int_PCIeFault),
> int_map);
> > +   FM10K_WRITE_REG(hw,
> FM10K_INT_MAP(fm10k_int_SwitchUpDown), int_map);
> > +   FM10K_WRITE_REG(hw, FM10K_INT_MAP(fm10k_int_SwitchEvent),
> int_map);
> > +   FM10K_WRITE_REG(hw, FM10K_INT_MAP(fm10k_int_SRAM),
> int_map);
> > +   FM10K_WRITE_REG(hw, FM10K_INT_MAP(fm10k_int_VFLR),
> int_map);
> > +
> > +   /* Enable misc causes */
> > +   FM10K_WRITE_REG(hw, FM10K_EIMR,
> FM10K_EIMR_ENABLE(PCA_FAULT) |
> > +   FM10K_EIMR_ENABLE(THI_FAULT) |
> > +   FM10K_EIMR_ENABLE(FUM_FAULT) |
> > +   FM10K_EIMR_ENABLE(MAILBOX) |
> > +   FM10K_EIMR_ENABLE(SWITCHREADY) |
> > +   FM10K_EIMR_ENABLE(SWITCHNOTREADY) |
> > +   FM10K_EIMR_ENABLE(SRAMERROR) |
> > +   FM10K_EIMR_ENABLE(VFLR));
> > +
> > +   /* Enable ITR 0 */
> > +   FM10K_WRITE_REG(hw, FM10K_ITR(0), FM10K_ITR_AUTOMASK |
> > +   FM10K_ITR_MASK_CLEAR);
> > +   FM10K_WRITE_FLUSH(hw);
> > +}
> > +
> > +static void
> > +fm10k_dev_enable_intr_vf(struct rte_eth_dev *dev)
> > +{
> > +   struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data-
> >dev_private);
> > +   uint32_t int_map = FM10K_INT_MAP_IMMEDIATE;
> > +
> > +   /* Bind all local non-queue interrupt to vector 0 */
> > +   int_map |= 0;
> > +
> > +   /* Only INT 0 availiable, other 15 are reserved. */
> > +   FM10K_WRITE_REG(hw, FM10K_VFINT_MAP, int_map);
> > +
> > +   /* Enable ITR 0 */
> > +   FM10K_WRITE_REG(hw, FM10K_VFITR(0), FM10K_ITR_AUTOMASK |
> > +   FM10K_ITR_MASK_CLEAR);
> > +   FM10K_WRITE_FLUSH(hw);
> > +}
> > +
> > +static int
> > +fm10k_dev_handle_fault(struct fm10k_hw *hw, uint32_t eicr)
> > +{
> > +   struct fm10k_fault fault;
> > +   int err;
> > +   const char *estr = "Unknown error";
> > +
> > +   /* Process PCA fault */
> > +   if (eicr & FM10K_EIMR_PCA_FAULT) {
> > +   err = fm10k_get_fault(hw, FM10K_PCA_FAULT, );
> > +   if (err)
> > +   goto error;
> > +   switch (fault.type) {
> > +   case PCA_NO_FAULT:
> > +   estr = "PCA_NO_FAULT"; break;
> > +   case PCA_UNMAPPED_ADDR:
> > +   estr = "PCA_UNMAPPED_ADDR"; break;
> > +   case PCA_BAD_QACCESS_PF:
> > +   estr = "PCA_BAD_QACCESS_PF"; break;
> > +   case PCA_BAD_QACCESS_VF:
> > +   estr = "PCA_BAD_QACCESS_VF"; break;
> > +   case PCA_MALICIOUS_REQ:
> > +   estr = "PCA_MALICIOUS_REQ"; break;
> > +   case PCA_POISONED_TLP:
> > +   estr = "PCA_POISONED_TLP"; break;
> > +   case PCA_TLP_ABORT:
> > +   estr = "PCA_TLP_ABORT"; break;
> > +   default:
> > +   goto error;
> > +   }
> > +   PMD_LOG(ERR, "%s: %s(%d) Addr:0x%"PRIu64" Spec: 0x%x",
> > +   estr, fault.func ? "VF" : "PF", fault.func,
> > +   fault.address, fault.specinfo);
> > +   }
> > +
> > +   /* Process THI fault */
> > +   if (eicr & FM10K_EIMR_THI_FAULT) {
> > +   err = fm10k_get_fault(hw, FM10K_THI_FAULT, );
> > +   if (err)
> > +   goto error;
> > +   switch (fault.type) {
> > +

[dpdk-dev] [PATCH 04/18] fm10k: add fm10k device id

2015-02-02 Thread Chen, Jing D

Hi Neil,

> -Original Message-
> From: Neil Horman [mailto:nhorman at tuxdriver.com]
> Sent: Saturday, January 31, 2015 10:20 PM
> To: Chen, Jing D
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 04/18] fm10k: add fm10k device id
> 
> On Fri, Jan 30, 2015 at 01:07:20PM +0800, Chen Jing D(Mark) wrote:
> > From: Jeff Shaw 
> >
> > Add fm10k device ID list into rte_pci_dev_ids.h.
> >
> > Signed-off-by: Jeff Shaw 
> > Signed-off-by: Chen Jing D(Mark) 
> > ---
> >  lib/librte_eal/common/include/rte_pci_dev_ids.h |   22
> ++
> >  1 files changed, 22 insertions(+), 0 deletions(-)
> >
> > diff --git a/lib/librte_eal/common/include/rte_pci_dev_ids.h
> b/lib/librte_eal/common/include/rte_pci_dev_ids.h
> > index c922de9..f54800e 100644
> > --- a/lib/librte_eal/common/include/rte_pci_dev_ids.h
> > +++ b/lib/librte_eal/common/include/rte_pci_dev_ids.h
> > @@ -132,6 +132,14 @@
> >  #define RTE_PCI_DEV_ID_DECL_VMXNET3(vend, dev)
> >  #endif
> >
> > +#ifndef RTE_PCI_DEV_ID_DECL_FM10K
> > +#define RTE_PCI_DEV_ID_DECL_FM10K(vend, dev)
> > +#endif
> > +
> > +#ifndef RTE_PCI_DEV_ID_DECL_FM10KVF
> > +#define RTE_PCI_DEV_ID_DECL_FM10KVF(vend, dev)
> > +#endif
> > +
> I know this isn't the job of this patch series, but I don't really understand
> why we bother with this pattern for filling out pci id tables.  A PMD supports
> specific hardware, we might as well use the generic RTE_PCI_DEVICE macro
> in the
> driver rather than creating a FM10K specific wrapper, only to have to do
> some
> ifdef trickery in the rte_cpi_dev_ids file and some include magic to fill it
> out.
> 
> I'd suggest that you just use RTE_PCI_DEVICE macro here, and make your
> own table
> (keep the specific device id values in the common file.  Then we can clean out
> the macro maggic in a later update.

I partially agree with you. Maybe a better solution is to use the mechanism
that the kernel applies to register PCI drivers. The driver maintains a list of
devices that it can manage and provides a hook function to detect whether a new
device can be managed by the driver. Then the DPDK core library needn't worry
about the long device list (maybe the script that binds/unbinds devices needs
such info, but that's another story).
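
A driver-owned ID table with a match hook could look roughly like this. It is a
minimal sketch, assuming made-up names (struct pci_id_sketch, struct
pci_driver_sketch, pci_driver_match) and placeholder device IDs; only the Intel
vendor ID 0x8086 is real, and none of this is an existing DPDK API.

```c
#include <stddef.h>
#include <stdint.h>

/* One supported (vendor, device) pair. */
struct pci_id_sketch {
	uint16_t vendor_id;
	uint16_t device_id;
};

/* Hypothetical driver descriptor that owns its own ID table. */
struct pci_driver_sketch {
	const char *name;
	const struct pci_id_sketch *id_table; /* ends with vendor_id == 0 */
};

/* Hook: return 1 if the driver can manage the given device, else 0. */
static int
pci_driver_match(const struct pci_driver_sketch *drv,
		uint16_t vendor_id, uint16_t device_id)
{
	const struct pci_id_sketch *id;

	if (drv == NULL || drv->id_table == NULL)
		return 0;
	for (id = drv->id_table; id->vendor_id != 0; id++) {
		if (id->vendor_id == vendor_id &&
				id->device_id == device_id)
			return 1;
	}
	return 0;
}

/* Placeholder device IDs, NOT the real fm10k PF/VF IDs. */
static const struct pci_id_sketch example_ids[] = {
	{ 0x8086, 0x0001 },
	{ 0x8086, 0x0002 },
	{ 0, 0 } /* sentinel */
};

static const struct pci_driver_sketch example_driver = {
	"example_pmd", example_ids
};
```

With this layout the common code only walks the registered drivers and calls
the hook; it never needs a global list of all supported devices.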

But I'd like to keep the current implementation unchanged since a final
decision has not been made yet. If a new mechanism is introduced, I will update
the driver to adapt to it.

> Neil

Best Regards,
Mark

[dpdk-dev] [PATCH v6 12/13] eal/pci: Add rte_eal_dev_attach/detach() functions

2015-02-02 Thread Qiu, Michael

On 2/2/2015 1:43 PM, Qiu, Michael wrote:
> On 2/1/2015 12:02 PM, Tetsuya Mukawa wrote:
>> These functions are used for attaching or detaching a port.
>> When rte_eal_dev_attach() is called, the function tries to realize the
>> device name as pci address. If this is done successfully,
>> rte_eal_dev_attach() will attach physical device port. If not, attaches
>> virtual devive port.
>> When rte_eal_dev_detach() is called, the function gets the device type
>> of this port to know whether the port is came from physical or virtual.
>> And then specific detaching function will be called.
>>
>> v5:
>> - Change function names like below.
>>   rte_eal_dev_find_and_invoke() to rte_eal_vdev_find_and_invoke().
>>   rte_eal_dev_invoke() to rte_eal_vdev_invoke().
>> - Add code to handle a return value of rte_eal_devargs_remove().
>> - Fix pci address format in rte_eal_dev_detach().
>> v4:
>> - Fix comment.
>> - Add error checking.
>> - Fix indent of 'if' statement.
>> - Change function name.
>>

[...]

>> +/* attach the new virtual device, then store port_id of the device */
>> +static int
>> +rte_eal_dev_attach_vdev(const char *vdevargs, uint8_t *port_id)
>> +{
>> +char *args;
>> +uint8_t new_port_id;
>> +struct rte_eth_dev devs[RTE_MAX_ETHPORTS];
>> +
>> +if ((vdevargs == NULL) || (port_id == NULL))
>> +goto err0;
>> +
>> +args = strdup(vdevargs);
>> +if (args == NULL)
>> +goto err0;
>> +
>> +/* save current port status */
>> +rte_eth_dev_save(devs);
>> +/* add the vdevargs to devargs_list */
>> +if (rte_eal_devargs_add(RTE_DEVTYPE_VIRTUAL, args))
>> +goto err1;
>> +/* parse vdevargs, then retrieve device name */
>> +get_vdev_name(args);
>> +/* walk around dev_driver_list to find the driver of the device,
>> + * then invoke probe function o the driver */
>> +if (rte_eal_vdev_find_and_invoke(args, RTE_EAL_INVOKE_TYPE_PROBE))
>> +goto err2;
>> +/* get port_id enabled by above procedures */
>> +if (rte_eth_dev_get_changed_port(devs, &new_port_id))
>> +goto err2;
>> +
>> +free(args);
>> +*port_id = new_port_id;
>> +return 0;
>> +err2:
>> +rte_eal_devargs_remove(RTE_DEVTYPE_VIRTUAL, args);
>> +err1:
>> +free(args);
>> +err0:
>> +RTE_LOG(ERR, EAL, "Drver, cannot detach the device\n");

Here "cannot detach the device\n" should be "cannot attach the device\n", I
think.

> Here also "Drver",
>
>
> Thanks,
> Michael
>> +return -1;
>> +}
>> +
>> +/* detach the new virtual device, then store the name of the device */
>> +static int
>> +rte_eal_dev_detach_vdev(uint8_t port_id, char *vdevname)
>> +{
>> +char name[RTE_ETH_NAME_MAX_LEN];
>> +
>> +if (vdevname == NULL)
>> +goto err;
>> +
>> +/* check whether the driver supports detach feature, or not */
>> +if (rte_eth_dev_check_detachable(port_id))
>> +goto err;
>> +
>> +/* get device name by port id */
>> +if (rte_eth_dev_get_name_by_port(port_id, name))
>> +goto err;
>> +/* walk around dev_driver_list to find the driver of the device,
>> + * then invoke close function o the driver */
>> +if (rte_eal_vdev_find_and_invoke(name, RTE_EAL_INVOKE_TYPE_CLOSE))
>> +goto err;
>> +/* remove the vdevname from devargs_list */
>> +if (rte_eal_devargs_remove(RTE_DEVTYPE_VIRTUAL, name))
>> +goto err;
>> +
>> +strncpy(vdevname, name, sizeof(name));
>> +return 0;
>> +err:
>> +RTE_LOG(ERR, EAL, "Drver, cannot detach the device\n");
>> +return -1;
>> +}
>> +
>> +/* attach the new device, then store port_id of the device */
>> +int
>> +rte_eal_dev_attach(const char *devargs, uint8_t *port_id)
>> +{
>> +struct rte_pci_addr addr;
>> +
>> +if ((devargs == NULL) || (port_id == NULL))
>> +return -EINVAL;
>> +
>> +if (eal_parse_pci_DomBDF(devargs, &addr) == 0)
>> +return rte_eal_dev_attach_pdev(&addr, port_id);
>> +else
>> +return rte_eal_dev_attach_vdev(devargs, port_id);
>> +}
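
The "try to parse as a PCI address, otherwise treat it as a virtual device"
decision above can be sketched in isolation as follows. This is a sketch under
assumptions: parse_pci_dombdf, classify_devargs and struct pci_addr_sketch are
hypothetical stand-ins and not the EAL functions used in the patch.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for struct rte_pci_addr. */
struct pci_addr_sketch {
	uint16_t domain;
	uint8_t bus;
	uint8_t devid;
	uint8_t function;
};

enum attach_kind {
	ATTACH_INVALID = -1,
	ATTACH_PHYSICAL,
	ATTACH_VIRTUAL
};

/* Parse "DDDD:BB:DD.F" (hex fields); return 0 on success, -1 otherwise. */
static int
parse_pci_dombdf(const char *s, struct pci_addr_sketch *out)
{
	unsigned int dom, bus, dev, fn;
	int consumed = 0;

	if (s == NULL || out == NULL)
		return -1;
	if (sscanf(s, "%4x:%2x:%2x.%1x%n",
			&dom, &bus, &dev, &fn, &consumed) != 4)
		return -1;
	/* reject trailing characters and out-of-range device/function */
	if (s[consumed] != '\0' || dev > 0x1f || fn > 7)
		return -1;

	out->domain = (uint16_t)dom;
	out->bus = (uint8_t)bus;
	out->devid = (uint8_t)dev;
	out->function = (uint8_t)fn;
	return 0;
}

/* Decide which attach path a devargs string would take. */
static enum attach_kind
classify_devargs(const char *devargs, struct pci_addr_sketch *addr)
{
	if (devargs == NULL || addr == NULL)
		return ATTACH_INVALID;
	if (parse_pci_dombdf(devargs, addr) == 0)
		return ATTACH_PHYSICAL;
	return ATTACH_VIRTUAL;
}
```

A string such as "0000:02:00.1" takes the physical path, while something like
"eth_pcap0,iface=eth0" fails the parse and falls through to the virtual path.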
>> +
>> +/* detach the device, then store the name of the device */
>> +int
>> +rte_eal_dev_detach(uint8_t port_id, char *name)
>> +{
>> +struct rte_pci_addr addr;
>> +int ret;
>> +
>> +if (name == NULL)
>> +return -EINVAL;
>> +
>> +if (rte_eth_dev_get_device_type(port_id) == RTE_ETH_DEV_PHYSICAL) {
>> +ret = rte_eth_dev_get_addr_by_port(port_id, &addr);
>> +if (ret < 0)
>> +return ret;
>> +
>> +ret = rte_eal_dev_detach_pdev(port_id, &addr);
>> +if (ret == 0)
>> +snprintf(name, RTE_ETH_NAME_MAX_LEN,
>> +"%04x:%02x:%02x.%d",
>> +addr.domain, addr.bus,
>> +addr.devid, addr.function);
>> +
>> +return ret;
>> +} else
>> +return rte_eal_dev_detach_vdev(port_id, name);
>> +}
>> +#else /* ENABLE_HOTPLUG */
>> +int
>>

[dpdk-dev] [PATCH v6 2/7] hash: add assembly implementation of CRC32 intrinsics

2015-02-02 Thread Liang, Cunming

Got it, thanks.

> -Original Message-
> From: Yerden Zhumabekov [mailto:e_zhumabekov at sts.kz]
> Sent: Monday, February 02, 2015 1:34 PM
> To: Liang, Cunming; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 2/7] hash: add assembly implementation of
> CRC32 intrinsics
> 
> 
> 02.02.2015 11:15, Liang, Cunming wrote:
> >
> >> +static inline uint32_t
> >> +crc32c_sse42_u64(uint64_t data, uint64_t init_val)
> >> +{
> >> +__asm__ volatile(
> >> +"crc32q %[data], %[init_val];"
> >> +: [init_val] "+r" (init_val)
> >> +: [data] "rm" (data));
> >> +return init_val;
> >> +}
> > [LCM] I'm curious about the benefit of replacing CRC32 intrinsic
> > "_mm_crc32_u32/64".
> 
> These intrinsics are not available on a platform which has no SSE4.2
> support so the build would fail.
> 
> See previous suggestion from Neil:
> http://dpdk.org/ml/archives/dev/2014-November/008353.html
> 
> --
> Sincerely,
> 
> Yerden Zhumabekov
> State Technical Service
> Astana, KZ
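
The point about intrinsics versus inline assembly can be illustrated with a
small runtime dispatch. This is a sketch, not the patch itself: the function
names are invented here, the CPU check uses the GCC/clang builtin
__builtin_cpu_supports(), and on other compilers or architectures the sketch
simply always takes the software path.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise software CRC32C update over the 8 bytes of 'data', LSB first. */
static uint32_t
crc32c_sw_u64(uint64_t data, uint32_t init_val)
{
	uint32_t crc = init_val;
	int i, bit;

	for (i = 0; i < 8; i++) {
		crc ^= (uint32_t)(data & 0xffu);
		data >>= 8;
		for (bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ ((crc & 1u) ? 0x82F63B78u : 0u);
	}
	return crc;
}

#if defined(__x86_64__) && defined(__GNUC__)
/*
 * Emitting the instruction via inline asm needs no -msse4.2 at build time,
 * so the file still compiles for a machine type without SSE4.2.
 */
static uint32_t
crc32c_hw_u64(uint64_t data, uint32_t init_val)
{
	uint64_t crc = init_val;

	__asm__ volatile("crc32q %[data], %[crc]"
			: [crc] "+r" (crc)
			: [data] "rm" (data));
	return (uint32_t)crc;
}

static int
cpu_has_sse42(void)
{
	__builtin_cpu_init();
	return __builtin_cpu_supports("sse4.2") != 0;
}
#else
/* No hardware path in this sketch on other targets. */
static uint32_t
crc32c_hw_u64(uint64_t data, uint32_t init_val)
{
	return crc32c_sw_u64(data, init_val);
}

static int
cpu_has_sse42(void)
{
	return 0;
}
#endif

/* Pick the implementation once, on first use. */
static uint32_t
crc32c_u64(uint64_t data, uint32_t init_val)
{
	static uint32_t (*impl)(uint64_t, uint32_t);

	if (impl == NULL)
		impl = cpu_has_sse42() ? crc32c_hw_u64 : crc32c_sw_u64;
	return impl(data, init_val);
}
```

Both paths update the same raw CRC32C state, so callers get identical results
whichever one is selected at runtime.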

[dpdk-dev] [PATCH v6 12/13] eal/pci: Add rte_eal_dev_attach/detach() functions

2015-02-02 Thread Qiu, Michael

On 2/1/2015 12:02 PM, Tetsuya Mukawa wrote:
> These functions are used for attaching or detaching a port.
> When rte_eal_dev_attach() is called, the function tries to realize the
> device name as pci address. If this is done successfully,
> rte_eal_dev_attach() will attach physical device port. If not, attaches
> virtual devive port.
> When rte_eal_dev_detach() is called, the function gets the device type
> of this port to know whether the port is came from physical or virtual.
> And then specific detaching function will be called.
>
> v5:
> - Change function names like below.
>   rte_eal_dev_find_and_invoke() to rte_eal_vdev_find_and_invoke().
>   rte_eal_dev_invoke() to rte_eal_vdev_invoke().
> - Add code to handle a return value of rte_eal_devargs_remove().
> - Fix pci address format in rte_eal_dev_detach().
> v4:
> - Fix comment.
> - Add error checking.
> - Fix indent of 'if' statement.
> - Change function name.
>
> Signed-off-by: Tetsuya Mukawa 
> ---
>  lib/librte_eal/common/eal_common_dev.c  | 274 
> 
>  lib/librte_eal/common/eal_private.h |  11 ++
>  lib/librte_eal/common/include/rte_dev.h |  33 
>  lib/librte_eal/linuxapp/eal/Makefile|   1 +
>  lib/librte_eal/linuxapp/eal/eal_pci.c   |   6 +-
>  5 files changed, 322 insertions(+), 3 deletions(-)
>
> diff --git a/lib/librte_eal/common/eal_common_dev.c 
> b/lib/librte_eal/common/eal_common_dev.c
> index eae5656..e3a3f54 100644
> --- a/lib/librte_eal/common/eal_common_dev.c
> +++ b/lib/librte_eal/common/eal_common_dev.c
> @@ -32,10 +32,13 @@
>   *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>   */
>  
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
>  
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -107,3 +110,274 @@ rte_eal_dev_init(void)
>   }
>   return 0;
>  }
> +
> +/* So far, DPDK hotplug function only supports linux */
> +#ifdef ENABLE_HOTPLUG
> +static void
> +rte_eal_vdev_invoke(struct rte_driver *driver,
> + struct rte_devargs *devargs, enum rte_eal_invoke_type type)
> +{
> + if ((driver == NULL) || (devargs == NULL))
> + return;
> +
> + switch (type) {
> + case RTE_EAL_INVOKE_TYPE_PROBE:
> + driver->init(devargs->virtual.drv_name, devargs->args);
> + break;
> + case RTE_EAL_INVOKE_TYPE_CLOSE:
> + driver->uninit(devargs->virtual.drv_name, devargs->args);
> + break;
> + default:
> + break;
> + }
> +}
> +
> +static int
> +rte_eal_vdev_find_and_invoke(const char *name, int type)
> +{
> + struct rte_devargs *devargs;
> + struct rte_driver *driver;
> +
> + if (name == NULL)
> + return -EINVAL;
> +
> + /* call the init function for each virtual device */
> + TAILQ_FOREACH(devargs, &devargs_list, next) {
> +
> + if (devargs->type != RTE_DEVTYPE_VIRTUAL)
> + continue;
> +
> + if (strncmp(name, devargs->virtual.drv_name, strlen(name)))
> + continue;
> +
> + TAILQ_FOREACH(driver, &dev_driver_list, next) {
> + if (driver->type != PMD_VDEV)
> + continue;
> +
> + /* search a driver prefix in virtual device name */
> + if (!strncmp(driver->name, devargs->virtual.drv_name,
> + strlen(driver->name))) {
> + rte_eal_vdev_invoke(driver, devargs, type);
> + break;
> + }
> + }
> +
> + if (driver == NULL) {
> + RTE_LOG(WARNING, EAL, "no driver found for %s\n",
> +   devargs->virtual.drv_name);
> + }
> + return 0;
> + }
> + return 1;
> +}
> +
> +/* attach the new physical device, then store port_id of the device */
> +static int
> +rte_eal_dev_attach_pdev(struct rte_pci_addr *addr, uint8_t *port_id)
> +{
> + uint8_t new_port_id;
> + struct rte_eth_dev devs[RTE_MAX_ETHPORTS];
> +
> + if ((addr == NULL) || (port_id == NULL))
> + goto err;
> +
> + /* save current port status */
> + rte_eth_dev_save(devs);
> + /* re-construct pci_device_list */
> + if (rte_eal_pci_scan())
> + goto err;
> + /* invoke probe func of the driver can handle the new device */
> + if (rte_eal_pci_probe_one(addr))
> + goto err;
> + /* get port_id enabled by above procedures */
> + if (rte_eth_dev_get_changed_port(devs, &new_port_id))
> + goto err;
> +
> + *port_id = new_port_id;
> + return 0;
> +err:
> + RTE_LOG(ERR, EAL, "Drver, cannot attach the device\n");

Sorry, what does "Drver" mean?

My English is bad, and I could not find this word on Google either.

Thanks,
Michael
> + return -1;
> +}
> +
> +/* detach the new physical device, then store pci_addr of the device */
> +static int
>

[dpdk-dev] [PATCH 13/20] testpmd: support gre tunnels in csum fwd engine

2015-02-02 Thread Liu, Jijiang

Hi Olivier,


> -Original Message-
> From: Olivier Matz [mailto:olivier.matz at 6wind.com]
> Sent: Friday, January 30, 2015 9:16 PM
> To: dev at dpdk.org
> Cc: Ananyev, Konstantin; Liu, Jijiang; Zhang, Helin; olivier.matz at 6wind.com
> Subject: [PATCH 13/20] testpmd: support gre tunnels in csum fwd engine
> 
> Add support for Ethernet over GRE and IP over GRE tunnels.
> 
> Signed-off-by: Olivier Matz 
> ---
>  app/test-pmd/cmdline.c  |  6 ++--
>  app/test-pmd/csumonly.c | 87
> +
>  2 files changed, 84 insertions(+), 9 deletions(-)
> 
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index
> 98f7a1c..aa5c178 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -321,9 +321,9 @@ static void cmd_help_long_parsed(void
> *parsed_result,
>   " checksum with when transmitting a packet using
> the"
>   " csum forward engine.\n"
>   "ip|udp|tcp|sctp always concern the inner
> layer.\n"
> - "outer-ip concerns the outer IP layer (in"
> - " case the packet is recognized as a vxlan packet by"
> - " the forward engine)\n"
> + "outer-ip concerns the outer IP layer in"
> + " case the packet is recognized as a tunnel packet by"
> + " the forward engine (vxlan and gre are
> supported)\n"
>   "Please check the NIC datasheet for HW
> limits.\n\n"
> 
>   "csum parse-tunnel (on|off) (tx_port_id)\n"
> diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c index
> 52af0e7..02c01f6 100644
> --- a/app/test-pmd/csumonly.c
> +++ b/app/test-pmd/csumonly.c
> @@ -100,6 +100,12 @@ struct testpmd_offload_info {
>   uint16_t tso_segsz;
>  };
> 
> +/* simplified GRE header (flags must be 0) */ struct simple_gre_hdr {
> + uint16_t flags;
> + uint16_t proto;
> +};
I suggest you remove the comment 'flags must be 0'. The reason is that we only
use the header structure to check what the protocol is, so it is not necessary
to require that the flags be 0.

>  static uint16_t
>  get_psd_sum(void *l3_hdr, uint16_t ethertype, uint64_t ol_flags)  { @@ -
> 218,6 +224,60 @@ parse_vxlan(struct udp_hdr *udp_hdr, struct
> testpmd_offload_info *info,
>   info->l2_len += ETHER_VXLAN_HLEN; /* add udp + vxlan */  }
> 
> +/* Parse a gre header */
> +static void
> +parse_gre(struct simple_gre_hdr *gre_hdr, struct testpmd_offload_info
> +*info) {
> + struct ether_hdr *eth_hdr;
> + struct ipv4_hdr *ipv4_hdr;
> + struct ipv6_hdr *ipv6_hdr;
> +
> + /* if flags != 0; it's not supported */
> + if (gre_hdr->flags != 0)
> + return;
I suggest you remove the check here. You can add a comment in front of this
function to explain that actual NVGRE and IP over GRE are not supported yet.

For example, when I want to support NVGRE TX checksum offload, I will make the
following change.

Or you can keep it here, but I will change it later anyway.


> +
> + if (gre_hdr->proto == _htons(ETHER_TYPE_IPv4)) {
> + info->is_tunnel = 1;
> + info->outer_ethertype = info->ethertype;
> + info->outer_l2_len = info->l2_len;
> + info->outer_l3_len = info->l3_len;
> +
> + ipv4_hdr = (struct ipv4_hdr *)((char *)gre_hdr +
> + sizeof(struct simple_gre_hdr));
> +
> + parse_ipv4(ipv4_hdr, info);
> + info->ethertype = _htons(ETHER_TYPE_IPv4);
> + info->l2_len = 0;
> +
> + } else if (gre_hdr->proto == _htons(ETHER_TYPE_IPv6)) {
> + info->is_tunnel = 1;
> + info->outer_ethertype = info->ethertype;
> + info->outer_l2_len = info->l2_len;
> + info->outer_l3_len = info->l3_len;
> +
> + ipv6_hdr = (struct ipv6_hdr *)((char *)gre_hdr +
> + sizeof(struct simple_gre_hdr));
> +
> + info->ethertype = _htons(ETHER_TYPE_IPv6);
> + parse_ipv6(ipv6_hdr, info);
> + info->l2_len = 0;
> +
> + } else if (gre_hdr->proto == _htons(0x6558)) { /* ETH_P_TEB in linux
> */
> + info->is_tunnel = 1;
> + info->outer_ethertype = info->ethertype;
> + info->outer_l2_len = info->l2_len;
> + info->outer_l3_len = info->l3_len;
> +
> + eth_hdr = (struct ether_hdr *)((char *)gre_hdr +
> + sizeof(struct simple_gre_hdr));

For NVGRE, I will make a change here and replace simple_gre_hdr with
nvgre_hdr:
eth_hdr = (struct ether_hdr *)((char *)gre_hdr +
		sizeof(struct nvgre_hdr));
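
As background for the flags discussion: in GRE the C, K and S flag bits each
add a 4-byte optional field after the 4-byte base header, and NVGRE always sets
the K bit. A minimal sketch of computing the header length from the flags is
shown below; gre_hdr_len and gre_proto are made-up helper names, and the old
routing bit is deliberately ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* GRE flag bits in the first 16-bit word (host-order values). */
#define GRE_FLAG_CSUM 0x8000u /* checksum + reserved1 present */
#define GRE_FLAG_KEY  0x2000u /* key present (always set for NVGRE) */
#define GRE_FLAG_SEQ  0x1000u /* sequence number present */

/* Read a big-endian 16-bit value from a byte buffer. */
static uint16_t
read_be16(const uint8_t *p)
{
	return (uint16_t)(((unsigned int)p[0] << 8) | p[1]);
}

/* Length of the GRE header, including optional fields, in bytes. */
static size_t
gre_hdr_len(const uint8_t *gre)
{
	uint16_t flags = read_be16(gre);
	size_t len = 4; /* flags/version + protocol type */

	if (flags & GRE_FLAG_CSUM)
		len += 4;
	if (flags & GRE_FLAG_KEY)
		len += 4;
	if (flags & GRE_FLAG_SEQ)
		len += 4;
	return len;
}

/* Protocol type carried in the GRE header (an EtherType value). */
static uint16_t
gre_proto(const uint8_t *gre)
{
	return read_be16(gre + 2);
}
```

With such a helper, the inner header would start at gre + gre_hdr_len(gre)
rather than at a fixed sizeof() offset, which covers both the flags == 0 case
and the NVGRE case with the key field.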


> + parse_ethernet(eth_hdr, info);
> + } else
> + return;
> +
> + info->l2_len += sizeof(struct simple_gre_hdr); }
> +
>  /* modify the IPv4 or IPv4 source address of a packet */  static void
>

[dpdk-dev] [PATCH 00/18] lib/librte_pmd_fm10k : fm10k pmd driver

2015-02-02 Thread Chen, Jing D

Hi,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Saturday, January 31, 2015 6:19 AM
> To: Shaw, Jeffrey B
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 00/18] lib/librte_pmd_fm10k : fm10k pmd
> driver
> 
> 2015-01-30 13:46, Jeff Shaw:
> > On Fri, Jan 30, 2015 at 04:26:33PM -0500, Neil Horman wrote:
> > > On Fri, Jan 30, 2015 at 01:07:16PM +0800, Chen Jing D(Mark) wrote:
> > > > From: "Chen Jing D(Mark)" 
> > > > Jeff Shaw (18):
> > > >   fm10k: add base driver
> [...]
> > > >  lib/librte_pmd_fm10k/SHARED/fm10k_api.c |  327 
> [...]
> > >
> > > Why is there a SHARED directory in the driver?  Are there other drivers
> that use
> > > the shared fm10k code?
> >
> > No, the other poll-mode drivers do not use the shared fm10k code. The
> > directory is similar to the 'ixgbe' and 'i40e' directories in their
> > respective PMDs, only that it is named 'SHARED' for the fm10k driver.
> 
> So shared is a bad name in the context of DPDK.
> Inside Intel, it can be understood that you share it between projects,
> but in DPDK, it's only a base driver.
> 

OK, I'll change "SHARED" to "fm10k".

> --
> Thomas

[dpdk-dev] [PATCH 00/17] unified packet type

2015-02-02 Thread Zhang, Helin

Hi Olivier

> -Original Message-
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Friday, January 30, 2015 9:31 PM
> To: Zhang, Helin; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 00/17] unified packet type
> 
> Hi Helin,
> 
> On 01/29/2015 04:15 AM, Helin Zhang wrote:
> > Currently only 6 bits which are stored in ol_flags are used to
> > indicate the packet types. This is not enough, as some NIC hardware
> > can recognize quite a lot of packet types, e.g i40e hardware can
> > recognize more than 150 packet types. Hiding those packet types hides
> > hardware offload capabilities which could be quite useful for improving
> performance and for end users.
> > So an unified packet types are needed to support all possible PMDs.
> > Recently a 16 bits packet_type field has been added in mbuf header
> > which can be used for this purpose. In addition, all packet types
> > stored in ol_flag field should be deleted at all, and 6 bits of ol_flags 
> > can be
> save as the benifit.
> >
> > Initially, 16 bits of packet_type can be divided into several sub
> > fields to indicate different packet type information of a packet. The
> > initial design is to divide those bits into 4 fields for L3 types,
> > tunnel types, inner L3 types and L4 types. All PMDs should translate
> > the offloaded packet types into this 4 fields of information, for user
> applications.
> 
> You haven't answered to my question I asked in your RFC patch [1].
> I copied it below:
> 
> 
> >> On 01/20/2015 03:28 AM, Zhang, Helin wrote:
> >>>> Another question I've asked several times[1][2] : what does having
> >>>> RTE_PTYPE_TUNNEL_IP mean? What fields are checked by the hardware
> >>>> (or the driver) and what fields should be checked by the application?
> >>>> Are you sure that all the drivers (ixgbe, i40e, vmxnet3, enic)
> >>>> check the same fields? (ethertype, ip version, ip len correct, ip
> >>>> checksum correct, flags, ...)
> >>> RTE_PTYPE_TUNNEL_IP means hardware recognizes the received packet
> as
> >>> an IP-in-IP packet.
> >>> All the fields are filled by PMD which is recognized by hardware.
> >>> The application can just use it which can save some cpu cycles to
> >>> recognize the packet type by software.
> >>> Drivers is responsible for filling with correct values according to
> >>> the packet types recognized by its hardware. Different PMDs may fill
> >>> with different values based on different capabilities.
> >>
> >> Sorry, that does not answer to my question.
> >>
> >> Let's take a simple example. Imagine a hardware-1 that is able to
> >> recognize an IP packet by checking the ethertype and that the IP
> >> version is set to 4.
> >> Another hardware-2 recognize an IP packet by checking the ethertype,
> >> the IP version and that the IP length is correct compared to m_len(m).
> >>
> >> For the same packet, both hardwares will return RTE_PTYPE_L3_IPV4,
> >> but they don't do the same checks on the packet. As I want my
> >> application behave exactly the same whatever the hardware, I need to
> >> know what checks are done in hardware, so I can decide what checks
> >> must be done in my application.
> >>
> >> Example of definition: RTE_PTYPE_L3_IPV4 means that ethertype is
> >> 0x0800 and IP.version is 4.
> >>
> >> It means that I can skip these 2 tests in my application if I have
> >> this packet_type, but all other checks must be done in software (ip
> >> length, flags, checksum, ...)
> >>
> >> For each packet type, we need a definition like above, and we must
> >> check that all drivers setting a packet type behave like described.
Hmm, I think packet_type may need to be renamed to something else, such as
offload_packet_type.
It only carries the packet type information reported by hardware. It is not
meant to describe everything about a packet.
As different hardware has different capabilities, different hardware cannot be
expected to report the same packet type in the mbuf for the same packet.
Given your question, I think hardware capability flags may be needed.
Applications could query the packet type capabilities of each port, and would
then know what kind of packet type information the corresponding hardware can
report.
What do you think? Do you have any better ideas?
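
To make the capability idea a bit more concrete, here is a rough sketch using
the hardware-1 / hardware-2 example quoted above. Every name in it (the EX_
prefix, the capability bits, the helper) is hypothetical and not part of
DPDK; it only shows how an application could work out which checks it still
has to do in software once a port reports what its hardware verifies.

```c
#include <stdint.h>

/* Hypothetical capability bits: checks done by hardware before it
 * reports an IPv4 packet type. Not a DPDK API. */
#define EX_PTYPE_CHK_ETHERTYPE  (1u << 0) /* ethertype is 0x0800 */
#define EX_PTYPE_CHK_IP_VERSION (1u << 1) /* IP.version is 4 */
#define EX_PTYPE_CHK_IP_LENGTH  (1u << 2) /* IP length matches frame */

/* Checks the application wants guaranteed whatever the hardware is. */
#define EX_APP_REQUIRED (EX_PTYPE_CHK_ETHERTYPE | \
                         EX_PTYPE_CHK_IP_VERSION | \
                         EX_PTYPE_CHK_IP_LENGTH)

/* Return the checks that remain to be done in software for a port whose
 * hardware advertises the capability mask hw_caps. */
static inline uint32_t
ex_sw_checks_needed(uint32_t hw_caps)
{
    return EX_APP_REQUIRED & ~hw_caps;
}
```

With this, hardware-1 (ethertype and version only) would leave the IP length
check to software, while hardware-2 would leave nothing.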

Thank you very much!

Regards,
Helin

> 
> I'm not opposed to have a packet_type field in rx mbuf, but I really think the
> question above is an important question to make this feature useful to the
> applications.
> 
> 
> Regards,
> Olivier
> 
> [1] http://dpdk.org/ml/archives/dev/2015-January/011273.html
>

[dpdk-dev] [PATCH 17/17] mbuf: remove old packet type bit masks for ol_flags

2015-02-02 Thread Zhang, Helin

Hi Olivier

> -Original Message-
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Friday, January 30, 2015 9:37 PM
> To: Zhang, Helin; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 17/17] mbuf: remove old packet type bit masks
> for ol_flags
> 
> Hi Helin,
> 
> On 01/29/2015 04:16 AM, Helin Zhang wrote:
> > To unify packet types among all PMDs, bit masks and relevant macros of
> > packet type for ol_flags are replaced by unified packet type and
> > relevant macros.
> >
> > Signed-off-by: Helin Zhang 
> > ---
> >  lib/librte_mbuf/rte_mbuf.c |  6 --  lib/librte_mbuf/rte_mbuf.h |
> > 10 ++
> >  2 files changed, 2 insertions(+), 14 deletions(-)
> >
> > diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> > index 1b14e02..8050ccf 100644
> > --- a/lib/librte_mbuf/rte_mbuf.c
> > +++ b/lib/librte_mbuf/rte_mbuf.c
> > @@ -215,14 +215,8 @@ const char *rte_get_rx_ol_flag_name(uint64_t
> mask)
> > /* case PKT_RX_HBUF_OVERFLOW: return "PKT_RX_HBUF_OVERFLOW";
> */
> > /* case PKT_RX_RECIP_ERR: return "PKT_RX_RECIP_ERR"; */
> > /* case PKT_RX_MAC_ERR: return "PKT_RX_MAC_ERR"; */
> > -   case PKT_RX_IPV4_HDR: return "PKT_RX_IPV4_HDR";
> > -   case PKT_RX_IPV4_HDR_EXT: return "PKT_RX_IPV4_HDR_EXT";
> > -   case PKT_RX_IPV6_HDR: return "PKT_RX_IPV6_HDR";
> > -   case PKT_RX_IPV6_HDR_EXT: return "PKT_RX_IPV6_HDR_EXT";
> > case PKT_RX_IEEE1588_PTP: return "PKT_RX_IEEE1588_PTP";
> > case PKT_RX_IEEE1588_TMST: return "PKT_RX_IEEE1588_TMST";
> > -   case PKT_RX_TUNNEL_IPV4_HDR: return "PKT_RX_TUNNEL_IPV4_HDR";
> > -   case PKT_RX_TUNNEL_IPV6_HDR: return "PKT_RX_TUNNEL_IPV6_HDR";
> 
> I see you are not removing IEEE1588. Is there a reason why it is not handled 
> as
> a packet_type?
IEEE1588 is not part of the information reported by hardware in the packet type.
Yes, your idea on this is worth taking into account.

> 
> > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > index 94ae344..5df0d61 100644
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -90,16 +90,10 @@ extern "C" {
> >  #define PKT_RX_HBUF_OVERFLOW (0ULL << 0)  /**< Header buffer
> overflow. */
> >  #define PKT_RX_RECIP_ERR (0ULL << 0)  /**< Hardware processing
> error. */
> >  #define PKT_RX_MAC_ERR   (0ULL << 0)  /**< MAC error. */
> > -#define PKT_RX_IPV4_HDR  (1ULL << 5)  /**< RX packet with IPv4
> header. */
> > -#define PKT_RX_IPV4_HDR_EXT  (1ULL << 6)  /**< RX packet with
> extended IPv4 header. */
> > -#define PKT_RX_IPV6_HDR  (1ULL << 7)  /**< RX packet with IPv6
> header. */
> > -#define PKT_RX_IPV6_HDR_EXT  (1ULL << 8)  /**< RX packet with
> > extended IPv6 header. */  #define PKT_RX_IEEE1588_PTP  (1ULL << 9)
> > /**< RX IEEE1588 L2 Ethernet PT Packet. */  #define
> > PKT_RX_IEEE1588_TMST (1ULL << 10) /**< RX IEEE1588 L2/L4 timestamped
> > packet.*/ -#define PKT_RX_TUNNEL_IPV4_HDR (1ULL << 11) /**< RX tunnel
> packet with IPv4 header.*/ -#define PKT_RX_TUNNEL_IPV6_HDR (1ULL << 12)
> /**< RX tunnel packet with IPv6 header. */
> > -#define PKT_RX_FDIR_ID   (1ULL << 13) /**< FD id reported if FDIR
> match. */
> > -#define PKT_RX_FDIR_FLX  (1ULL << 14) /**< Flexible bytes reported if
> FDIR match. */
> > +#define PKT_RX_FDIR_ID   (1ULL << 11) /**< FD id reported if FDIR
> match. */
> > +#define PKT_RX_FDIR_FLX  (1ULL << 12) /**< Flexible bytes reported if
> FDIR match. */
> 
> It looks like but numbers are not contiguous anymore (there is a hole between
> 5 and 8).
Initially I didn't want to move the following values up, as I am not sure
whether it may affect other features.
I'd prefer to keep that hole as reserved. What's your opinion?
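
If the hole is kept as reserved, it could be written down explicitly so that
nobody reuses those bits by accident. A small sketch follows; the flag values
are copied from the diff above, while the EX_ prefix, the reserved mask and
the helper are my own assumptions and not in the patch.

```c
#include <stdint.h>

/* Values as they appear in the patch above, with a local EX_ prefix. */
#define EX_PKT_RX_IEEE1588_PTP  (1ULL << 9)
#define EX_PKT_RX_IEEE1588_TMST (1ULL << 10)
#define EX_PKT_RX_FDIR_ID       (1ULL << 11)
#define EX_PKT_RX_FDIR_FLX      (1ULL << 12)

/* Hypothetical: bits 5..8 used to hold the IPv4/IPv6 header flags and
 * are marked as reserved here. */
#define EX_PKT_RX_RESERVED_MASK (0xfULL << 5)

/* Return 1 if a flag value touches one of the reserved bits. */
static inline int
ex_flag_uses_reserved_bit(uint64_t flag)
{
    return (flag & EX_PKT_RX_RESERVED_MASK) != 0;
}
```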
Thanks for the good comments!

> 
> Regards,
> Olivier

[dpdk-dev] [PATCH 12/20] testpmd: introduce parse_vxlan in csum fwd engine

2015-02-02 Thread Liu, Jijiang

Hi,

> -Original Message-
> From: Olivier Matz [mailto:olivier.matz at 6wind.com]
> Sent: Friday, January 30, 2015 9:16 PM
> To: dev at dpdk.org
> Cc: Ananyev, Konstantin; Liu, Jijiang; Zhang, Helin; olivier.matz at 6wind.com
> Subject: [PATCH 12/20] testpmd: introduce parse_vxlan in csum fwd engine
> 
> Move code parsing vxlan into a function. It will ease the support of GRE
> tunnels and IPIP tunnels in next commits.
> 
> Signed-off-by: Olivier Matz 
> ---
>  app/test-pmd/csumonly.c | 68 +++
> --
>  1 file changed, 37 insertions(+), 31 deletions(-)
> 
> diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c index
> 0b89d89..52af0e7 100644
> --- a/app/test-pmd/csumonly.c
> +++ b/app/test-pmd/csumonly.c
> @@ -93,7 +93,6 @@ struct testpmd_offload_info {
>   uint16_t l3_len;
>   uint16_t l4_len;
>   uint8_t l4_proto;
> - uint8_t l4_tun_len;
>   uint8_t is_tunnel;
>   uint16_t outer_ethertype;
>   uint16_t outer_l2_len;
> @@ -191,6 +190,34 @@ parse_ethernet(struct ether_hdr *eth_hdr, struct
> testpmd_offload_info *info)
>   }
>  }
> 
> +/* Parse a vxlan header */
> +static void
> +parse_vxlan(struct udp_hdr *udp_hdr, struct testpmd_offload_info *info,
> + uint64_t mbuf_olflags)
> +{
> + struct ether_hdr *eth_hdr;
> +
> + /* check udp destination port, 4789 is the default vxlan port
> +  * (rfc7348) or that the rx offload flag is set (i40e only
> +  * currently) */
> + if (udp_hdr->dst_port != _htons(4789) &&
> + (mbuf_olflags & (PKT_RX_TUNNEL_IPV4_HDR |
> + PKT_RX_TUNNEL_IPV6_HDR)) != 0)

It seems there is a bug in the mbuf_olflags check.
It should be:
(mbuf_olflags & (PKT_RX_TUNNEL_IPV4_HDR |
PKT_RX_TUNNEL_IPV6_HDR)) == 0
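
To show the difference, here is a standalone sketch of both conditions. It is
not the testpmd code: the EX_ names and helpers are made up, the port is
taken in host byte order to avoid byte swapping, and the two flag values are
the ones from the current rte_mbuf.h (bits 11 and 12).

```c
#include <stdint.h>

#define EX_VXLAN_PORT         4789          /* default vxlan port (rfc7348) */
#define EX_RX_TUNNEL_IPV4_HDR (1ULL << 11)
#define EX_RX_TUNNEL_IPV6_HDR (1ULL << 12)
#define EX_RX_TUNNEL_FLAGS    (EX_RX_TUNNEL_IPV4_HDR | EX_RX_TUNNEL_IPV6_HDR)

/* Condition as posted: bails out only when the port does not match
 * AND a tunnel flag is set. */
static inline int
ex_is_vxlan_as_posted(uint16_t dst_port, uint64_t ol_flags)
{
    if (dst_port != EX_VXLAN_PORT && (ol_flags & EX_RX_TUNNEL_FLAGS) != 0)
        return 0;
    return 1;
}

/* Suggested condition: bails out when the port does not match
 * AND no tunnel flag is set. */
static inline int
ex_is_vxlan_fixed(uint16_t dst_port, uint64_t ol_flags)
{
    if (dst_port != EX_VXLAN_PORT && (ol_flags & EX_RX_TUNNEL_FLAGS) == 0)
        return 0;
    return 1;
}
```

With the posted condition, a plain UDP packet to port 80 without any tunnel
flag is treated as vxlan, and a packet flagged by the NIC on a non-default
port is not.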



> + return;
> +
> + info->is_tunnel = 1;
> + info->outer_ethertype = info->ethertype;
> + info->outer_l2_len = info->l2_len;
> + info->outer_l3_len = info->l3_len;
> +
> + eth_hdr = (struct ether_hdr *)((char *)udp_hdr +
> + sizeof(struct udp_hdr) +
> + sizeof(struct vxlan_hdr));
> +
> + parse_ethernet(eth_hdr, info);
> + info->l2_len += ETHER_VXLAN_HLEN; /* add udp + vxlan */ }
> +
>  /* modify the IPv4 or IPv4 source address of a packet */  static void
> change_ip_addresses(void *l3_hdr, uint16_t ethertype) @@ -356,7 +383,6
> @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
>   struct rte_mbuf *m;
>   struct ether_hdr *eth_hdr;
>   void *l3_hdr = NULL, *outer_l3_hdr = NULL; /* can be IPv4 or IPv6 */
> - struct udp_hdr *udp_hdr;
>   uint16_t nb_rx;
>   uint16_t nb_tx;
>   uint16_t i;
> @@ -414,33 +440,15 @@ pkt_burst_checksum_forward(struct fwd_stream
> *fs)
>   /* check if it's a supported tunnel (only vxlan for now) */
>   if ((testpmd_ol_flags &
> TESTPMD_TX_OFFLOAD_PARSE_TUNNEL) &&
>   info.l4_proto == IPPROTO_UDP) {
> + struct udp_hdr *udp_hdr;
>   udp_hdr = (struct udp_hdr *)((char *)l3_hdr +
> info.l3_len);
> + parse_vxlan(udp_hdr, , m->ol_flags);
> + }
> 
> - /* check udp destination port, 4789 is the default
> -  * vxlan port (rfc7348) */
> - if (udp_hdr->dst_port == _htons(4789)) {
> - info.l4_tun_len = ETHER_VXLAN_HLEN;
> - info.is_tunnel = 1;
> -
> - /* currently, this flag is set by i40e only if the
> -  * packet is vxlan */
> - } else if (m->ol_flags & (PKT_RX_TUNNEL_IPV4_HDR
> |
> - PKT_RX_TUNNEL_IPV6_HDR))
> - info.is_tunnel = 1;
> -
> - if (info.is_tunnel == 1) {
> - info.outer_ethertype = info.ethertype;
> - info.outer_l2_len = info.l2_len;
> - info.outer_l3_len = info.l3_len;
> - outer_l3_hdr = l3_hdr;
> -
> - eth_hdr = (struct ether_hdr *)((char
> *)udp_hdr +
> - sizeof(struct udp_hdr) +
> - sizeof(struct vxlan_hdr));
> -
> - parse_ethernet(eth_hdr, );
> - l3_hdr = (char *)eth_hdr + info.l2_len;
> - }
> + /* update l3_hdr and outer_l3_hdr if a tunnel was parsed */
> + if (info.is_tunnel) {
> + outer_l3_hdr = l3_hdr;
> + l3_hdr = (char *)l3_hdr + info.outer_l3_len +
> info.l2_len;
>   }
> 
>   /* step 2: change all source IPs (v4 or v6) so we need @@ -
> 472,7 +480,7 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
>   if

[dpdk-dev] [PATCH 01/17] mbuf: add definitions of unified packet types

2015-02-02 Thread Zhang, Helin

Hi Olivier

> -Original Message-
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Friday, January 30, 2015 9:56 PM
> To: Zhang, Helin; dev at dpdk.org
> Cc: Stephen Hemminger
> Subject: Re: [dpdk-dev] [PATCH 01/17] mbuf: add definitions of unified packet
> types
> 
> Hi Helin,
> 
> On 01/29/2015 04:15 AM, Helin Zhang wrote:
> > As there are only 6 bit flags in ol_flags for indicating packet types,
> > which is not enough to describe all the possible packet types hardware
> > can recognize. For example, i40e hardware can recognize more than 150
> > packet types. Unified packet type is composed of tunnel type, L3 type,
> > L4 type and inner L3 type fields, and can be stored in 16 bits mbuf
> > field of 'packet_type'.
> >
> > Signed-off-by: Helin Zhang 
> > Signed-off-by: Cunming Liang 
> > Signed-off-by: Jijiang Liu 
> > ---
> >  lib/librte_mbuf/rte_mbuf.h | 74
> > ++
> >  1 file changed, 74 insertions(+)
> >
> > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > index 16059c6..94ae344 100644
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -165,6 +165,80 @@ extern "C" {
> >  /* Use final bit of flags to indicate a control mbuf */
> >  #define CTRL_MBUF_FLAG   (1ULL << 63) /**< Mbuf contains control
> data */
> >
> > +/*
> > + * Sixteen bits are divided into several fields to mark packet types.
> > +Note that
> > + * each field is indexical.
> > + * - Bit 3:0 is for tunnel types.
> > + * - Bit 7:4 is for L3 or outer L3 (for tunneling case) types.
> > + * - Bit 10:8 is for L4 types. It can also be used for inner L4 types for
> > + *   tunneling packets.
> > + * - Bit 13:11 is for inner L3 types.
> > + * - Bit 15:14 is reserved.
> 
> Is there a reason why using this specific order?
Yes, to support the ixgbe Vector PMD, outer L3 types and L4 types need to be
contiguous and in this order.

> 
> Also, there are 4 bits for outer L3 types and 3 bits for inner L3 types, but 
> both of
> them have 6 different supported types. Is it intentional?
Yes, it is to support the ixgbe Vector PMD. 7 contiguous bits are needed,
though 1 bit is wasted.
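
As an illustration of the layout, a small sketch follows. The tunnel and L3
masks and the L3 values are taken from the patch; the L4 and inner L3 masks
are derived from the bit ranges in the comment, and the EX_ names and helpers
are only for this example. The single-bit IPv4/IPv6 tests are my reading of
why the L3 values were chosen this way.

```c
#include <stdint.h>

#define EX_PTYPE_TUNNEL_MASK   0x000f /* bit 3:0, value from the patch */
#define EX_PTYPE_L3_MASK       0x00f0 /* bit 7:4, value from the patch */
#define EX_PTYPE_L4_MASK       0x0700 /* bit 10:8, derived from the comment */
#define EX_PTYPE_INNER_L3_MASK 0x3800 /* bit 13:11, derived from the comment */

/* The 7 contiguous bits (10:4) holding outer L3 and L4 types. */
#define EX_PTYPE_L3_L4_MASK    (EX_PTYPE_L3_MASK | EX_PTYPE_L4_MASK)

#define EX_PTYPE_L3_IPV4       0x0010
#define EX_PTYPE_L3_IPV6       0x0040

/* Extract the combined L3 + L4 field as one 7-bit value. */
static inline uint16_t
ex_ptype_l3_l4(uint16_t ptype)
{
    return (uint16_t)((ptype & EX_PTYPE_L3_L4_MASK) >> 4);
}

/* All IPv4 variants (0x10, 0x30, 0x90) have bit 4 set. */
static inline int
ex_is_ipv4(uint16_t ptype)
{
    return (ptype & EX_PTYPE_L3_IPV4) != 0;
}

/* All IPv6 variants (0x40, 0xc0, 0xe0) have bit 6 set. */
static inline int
ex_is_ipv6(uint16_t ptype)
{
    return (ptype & EX_PTYPE_L3_IPV6) != 0;
}
```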

> 
> > + *
> > + * To be compitable with Vector PMD, RTE_PTYPE_L3_IPV4,
> > + RTE_PTYPE_L3_IPV4_EXT,
> 
> compitable -> compatible
Good catch! It will be fixed in next version. Thanks!

> 
> > + * RTE_PTYPE_L3_IPV6, RTE_PTYPE_L3_IPV6_EXT, RTE_PTYPE_L4_TCP,
> > +RTE_PTYPE_L4_UDP
> > + * and RTE_PTYPE_L4_SCTP should be kept as below in a contiguous 7 bits.
> > + *
> > + * Note that L3 types values are selected for checking IPV4/IPV6
> > +header from
> > + * performance point of view. Reading annotations of
> > +RTE_ETH_IS_IPV4_HDR and
> > + * RTE_ETH_IS_IPV6_HDR is needed for any future changes of L3 type
> values.
> > + */
> > +#define RTE_PTYPE_UNKNOWN   0x /*
> 0b */
> > +/* bit 3:0 for tunnel types */
> > +#define RTE_PTYPE_TUNNEL_IP 0x0001 /*
> 0b0001 */
> > +#define RTE_PTYPE_TUNNEL_TCP0x0002 /*
> 0b0010 */
> > +#define RTE_PTYPE_TUNNEL_UDP0x0003 /*
> 0b0011 */
> > +#define RTE_PTYPE_TUNNEL_GRE0x0004 /*
> 0b0100 */
> > +#define RTE_PTYPE_TUNNEL_VXLAN  0x0005 /*
> 0b0101 */
> > +#define RTE_PTYPE_TUNNEL_NVGRE  0x0006 /*
> 0b0110 */
> > +#define RTE_PTYPE_TUNNEL_GENEVE 0x0007 /*
> 0b0111 */
> > +#define RTE_PTYPE_TUNNEL_GRENAT 0x0008 /*
> 0b1000 */
> > +#define RTE_PTYPE_TUNNEL_GRENAT_MAC 0x0009 /*
> 0b1001 */
> > +#define RTE_PTYPE_TUNNEL_GRENAT_MACVLAN 0x000a /*
> 0b1010 */
> > +#define RTE_PTYPE_TUNNEL_MASK   0x000f /*
> 0b */
> > +/* bit 7:4 for L3 types */
> > +#define RTE_PTYPE_L3_IPV4   0x0010 /*
> 0b0001 */
> > +#define RTE_PTYPE_L3_IPV4_EXT   0x0030 /*
> 0b0011 */
> > +#define RTE_PTYPE_L3_IPV6   0x0040 /*
> 0b0100 */
> > +#define RTE_PTYPE_L3_IPV4_EXT_UNKNOWN   0x0090 /*
> 0b1001 */
> > +#define RTE_PTYPE_L3_IPV6_EXT   0x00c0 /*
> 0b1100 */
> > +#define RTE_PTYPE_L3_IPV6_EXT_UNKNOWN   0x00e0 /*
> 0b1110 */
> > +#define RTE_PTYPE_L3_MASK   0x00f0 /*
> 0b */
> 
> can we expect that when RTE_PTYPE_L3_IPV4, RTE_PTYPE_L3_IPV4_EXT or
> RTE_PTYPE_L3_IPV4_EXT_UNKNOWN is set, the hardware also verified the
> L3 checksum?
RTE_PTYPE_L3_IPV4 means an IPv4 header without extensions. Only one of the
above three can be set at a time.
These bits don't indicate anything about checksums; checksum status should be
indicated by other flags.
They only describe the packet types that hardware can recognize.
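
In other words, the two pieces of information are read separately. A minimal
sketch, with made-up EX_ names: the L3 values are those from the patch, and
the bit position used for the IP checksum flag is an assumption on my side,
so please check it against rte_mbuf.h.

```c
#include <stdint.h>

#define EX_PTYPE_L3_MASK       0x00f0 /* from the patch */
#define EX_PTYPE_L3_IPV4       0x0010 /* from the patch */
#define EX_PKT_RX_IP_CKSUM_BAD (1ULL << 4) /* assumed bit position */

/* Exactly RTE_PTYPE_L3_IPV4, i.e. not the _EXT or _EXT_UNKNOWN variant:
 * compare the whole L3 field, since only one value is set at a time. */
static inline int
ex_is_plain_ipv4(uint16_t ptype)
{
    return (ptype & EX_PTYPE_L3_MASK) == EX_PTYPE_L3_IPV4;
}

/* Checksum status comes from ol_flags, not from the packet type. */
static inline int
ex_ip_cksum_bad(uint64_t ol_flags)
{
    return (ol_flags & EX_PKT_RX_IP_CKSUM_BAD) != 0;
}
```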

> 
> My understanding is:
> 
> - if packet_type is IPv4* and PKT_RX_IP_CKSUM_BAD is 0
>-> checksum was checked by hw and is good
> - if packet_type

[dpdk-dev] Does virtio-net PMD require specific QEMU virtio-net parameters?

2015-02-02 Thread Ouyang, Changchun



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Tetsuya Mukawa
> Sent: Sunday, February 1, 2015 6:03 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] Does virtio-net PMD require specific QEMU virtio-net
> parameters?
> 
> Hi,
> 
> I cannot invoke virtio-net PMD at least on Ubuntu12 and Ubuntu14 QEMU
> guest.
> This behavior might be seen on other users environment also, so I report it.
> .
> Here is error log.
> 
> $ ./testpmd -c f -n 1 -- -i

Try adding '--txqflags 0xf00' to the testpmd command line:
$ ./testpmd -c f -n 1 -- -i --txqflags 0xf00
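
For reference, 0xf00 should correspond to the "no offloads" TX queue flags.
The values below are as I remember them from rte_ethdev.h, copied here with an
EX_ prefix, so treat them as an assumption and check your tree; the helper is
only for illustration.

```c
#include <stdint.h>

/* Assumed to mirror the ETH_TXQ_FLAGS_* values in rte_ethdev.h. */
#define EX_TXQ_FLAGS_NOVLANOFFL 0x0100 /* disable VLAN offload */
#define EX_TXQ_FLAGS_NOXSUMSCTP 0x0200 /* disable SCTP checksum offload */
#define EX_TXQ_FLAGS_NOXSUMUDP  0x0400 /* disable UDP checksum offload */
#define EX_TXQ_FLAGS_NOXSUMTCP  0x0800 /* disable TCP checksum offload */

#define EX_TXQ_FLAGS_NOOFFLOADS (EX_TXQ_FLAGS_NOVLANOFFL | \
                                 EX_TXQ_FLAGS_NOXSUMSCTP | \
                                 EX_TXQ_FLAGS_NOXSUMUDP  | \
                                 EX_TXQ_FLAGS_NOXSUMTCP)

/* Return 1 if txq_flags disables every TX offload listed above. */
static inline int
ex_txq_all_offloads_disabled(uint32_t txq_flags)
{
    return (txq_flags & EX_TXQ_FLAGS_NOOFFLOADS) == EX_TXQ_FLAGS_NOOFFLOADS;
}
```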

Thanks
Changchun
