Re: [dpdk-dev] [PATCH] net/mlx5: fix error number handling

2018-06-06 Thread Nélio Laranjeiro
On Tue, Jun 05, 2018 at 09:36:32PM +, Yongseok Koh wrote:
> > On Jun 4, 2018, at 11:52 PM, Nélio Laranjeiro  
> > wrote:
> > 
> > On Mon, Jun 04, 2018 at 10:37:31AM -0700, Yongseok Koh wrote:
> >> rte_errno should be saved only if error has occurred because rte_errno
> >> could have garbage value.
> >> 
> >> Fixes: a6d83b6a9209 ("net/mlx5: standardize on negative errno values")
> >> Cc: sta...@dpdk.org
> >> 
> >> Signed-off-by: Yongseok Koh 
> >> ---
> >> drivers/net/mlx5/mlx5_flow.c | 3 ++-
> >> 1 file changed, 2 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> >> index 994be05be..eaffe7495 100644
> >> --- a/drivers/net/mlx5/mlx5_flow.c
> >> +++ b/drivers/net/mlx5/mlx5_flow.c
> >> @@ -3561,7 +3561,8 @@ mlx5_fdir_filter_delete(struct rte_eth_dev *dev,
> >>/* The flow does not match. */
> >>continue;
> >>}
> >> -  ret = rte_errno; /* Save rte_errno before cleanup. */
> >> +  if (ret)
> >> +  ret = rte_errno; /* Save rte_errno before cleanup. */
> >>if (flow)
> >>mlx5_flow_list_destroy(dev, &priv->flows, flow);
> >> exit:
> >> -- 
> >> 2.11.0
> > 
> > This patch is not enough, the returned value being -rte_errno if no
> > error is detected by the function it cannot set rte_errno nor return it.
> 
> We may need to refactor this kind of code (saving and restoring rte_errno). I
> still don't understand why we should preserve rte_errno like this.
>
> Even if this function returns success, there's no obligation to preserve
> rte_errno in the function. Once it is called, the ownership of rte_errno 
> belongs
> to this function.
>
> I can't find how we define this per-lcore variable but, from
> the man page of errno,
> 
>Theheader  file  defines  the integer variable errno, 
> which
>is set by system calls and some library functions in the event of an
>error to indicate what went wrong.  Its value is significant only when
>the return value of the call indicated an error (i.e., -1 from most
>system calls; -1 or NULL from most library  functions);
>a function that succeeds is allowed to change errno.
>
> So, I still think an API can change rte_errno even if it succeeds, no need to
> preserve it. If needed, the caller has to save it.

Functions in this PMD are defined as is:

  * @return
  *   0 on success, a negative errno value otherwise and rte_errno is set.

Which means rte_errno is only modified in case of error.

This fix does not respect the documentation of the function or any other
function of the PMD which can return errors.  rte_errno is only set if
an error is encountered and contains only the error code of the first
error sub-sequent ones are considered consequences of the first one and
thus not preserved.

Not preserving the rte_errno in roll backs is equivalent to not setting
it at all as a function called by the rollback may also set it, example:

 {
void * a;

foo_do();
a  = malloc(10);
if (!a) {
rte_errno = ENOMEM;
foo_undo();
return -rte_errno;
}
 }

If foo_undo() also encounter an error it will modify the rte_errno which
may have a value different from ENOMEM, for the callee won't be informed
the error is due to a memory issue and thus cannot make counter parts.
In such situation the rte_errno must be preserved to keep the ENOMEM
error code.
This is also the main reason almost all system function only update
errno when no error is encountered.

Regards,

-- 
Nélio Laranjeiro
6WIND


[dpdk-dev] 18.08 Intel Roadmap

2018-06-06 Thread O'Driscoll, Tim
Now that the 18.05 release is complete, we need to update the roadmap for 
18.08. The features that we plan to contribute are below. We'll submit a patch 
to update the roadmap page with this info.


Power Management: Turbo Core Thread Pinning - An API will be created to 
distinguish between regular cores and turbo cores. This will allow turbo cores 
to be used for tasks like packet distribution, with regular cores being used as 
worker cores.

Power Management: Traffic Pattern Aware Power Control - When a core is polling 
there's no easy way to determine how busy it actually is. This feature will 
provide a means for an application to determine how much work a core is 
actually performing, and to dynamically adjust power management based on that.

Power Management: 100% Busy Polling - This is similar to the previous item, but 
will provide a means of detecting traffic load and managing power that is 
transparent to the application.

Failure Handler for PCIe Hardware Hotplug - Phase 1 of device hotplug (enabling 
a device event monitoring framework) was added in 18.05. This will be extended 
by adding a failure handler which will provide a mechanism to handle any 
accesses to the device from the data path until such time as the removal is 
complete.

Virtio IN_ORDER Support - A new feature (IN_ORDER) has been added to the virtio 
spec, so that the performance can be improved when the requirement is met and 
assumptions can be made on ring manipulation. Support for this will be added to 
DPDK.

Allow Setup/Reconfiguration/Tear Down of Queues at Run Time - Currently, to 
configure a DPDK ethdev, the application specifies how many Tx and Rx queues to 
include prior to starting the device. This feature introduces a more dynamic 
approach where the application can also setup/reconfigure/tear down queues 
after the device has been started (on NICs where this is supported in hardware).

SoftNIC Restructuring - The SoftNIC will be modified to use the Packet 
Framework, which will make it more flexible and easier to add new functionality 
in future.

SoftNIC Support for NAT - Support for Network Address Translation (NAT44 & 
NAT66) was added to the Packet Framework in 18.05. This support will be added 
to the SoftNIC.

Virtual Device Hotplug - This will allow a DPDK primary process to create or 
remove a virtual device and have this change mirrored in an already-started 
secondary process by providing a framework to notify secondary processes when 
virtual devices are added or removed.

Intel(r) QuickAssist Technology Compression PMD - A new compression API was 
added in 18.05. Hardware acceleration of compression via an Intel(r) 
QuickAssist Technology PMD will be added.  

Vhost-User Extension to Improve vDPA - This will improve vHost Data Path 
Acceleration (vDPA) performance by extending the vhost-user protocol to allow 
the vhost-user backend to register memory-based host notifiers to QEMU. This 
requires changes to both DPDK and QEMU.

Unified Packet Fragmentation - There are currently two separate libraries for 
splitting large packets into smaller ones - IP Fragmentation for UDP packets 
and GSO for other packet types. This feature will add support for UDP 
fragmentation (using the IP Fragmentation library) to the GSO library, thus 
enabling applications to use a unified API to split all types of large packets.

3DES Support in AESNI_MB PMD - Support for 3DES will be added to the AESNI_MB 
PMD.

ISAL Compression PMD Enhancements - Support for Scatter Gather List (SGL) and 
other enhancements will be added to the ISAL (Intel(r) Intelligent Storage 
Acceleration Library) compression PMD.

Complete Intel PMD Support for New Descriptor Status APIs - Remaining Intel 
PMDs (E1000 VF and FM10K PF/VF) will be updated to support the new descriptor 
status APIs (rte_eth_rx_descriptor_status and rte_eth_tx_descriptor_status).


Re: [dpdk-dev] [PATCH v2] doc: add suggested new feature order to release notes

2018-06-06 Thread Mcnamara, John



> -Original Message-
> From: Yigit, Ferruh
> Sent: Tuesday, June 5, 2018 2:44 PM
> To: Mcnamara, John ; Kovacevic, Marko
> 
> Cc: dev@dpdk.org; Yigit, Ferruh ; Thomas Monjalon
> ; Shreyansh Jain 
> Subject: [PATCH v2] doc: add suggested new feature order to release notes
> 
> Signed-off-by: Ferruh Yigit 
> ---
> Cc: Shreyansh Jain 
> 
> v2:
> * Wording fixed
> ---
>  doc/guides/rel_notes/release_18_08.rst | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/release_18_08.rst
> b/doc/guides/rel_notes/release_18_08.rst
> index 5bc23c537..bce01b550 100644
> --- a/doc/guides/rel_notes/release_18_08.rst
> +++ b/doc/guides/rel_notes/release_18_08.rst
> @@ -37,6 +37,9 @@ New Features
> 
>   Refer to the previous release notes for examples.
> 
> + Suggested order of items in release note:
> + core libs, device abstraction libs and PMDs, other libs, apps,
> examples, unit test

Good idea.

The line should be wrapped or maybe added as a list like this:

   * Core libs
   * Device abstraction libs and PMDs 
   * Other libs
   * Apps
   * Examples 
   * Unit tests

But that is minor, so.

Acked-by: John McNamara 





[dpdk-dev] [PATCH 1/2] power: add get capabilities API

2018-06-06 Thread Radu Nicolau
New API added, rte_power_get_capabilities(), that allows the
application to query the power and performance capabilities
of the CPU cores.

Signed-off-by: Radu Nicolau 
---
 lib/librte_power/power_acpi_cpufreq.c  | 21 +
 lib/librte_power/power_acpi_cpufreq.h  | 17 ++
 lib/librte_power/power_kvm_vm.c|  8 +++
 lib/librte_power/power_kvm_vm.h| 17 ++
 lib/librte_power/rte_power.c   |  3 +++
 lib/librte_power/rte_power.h   | 33 ++
 lib/librte_power/rte_power_version.map |  7 ++
 test/test/test_power_acpi_cpufreq.c| 43 ++
 8 files changed, 149 insertions(+)

diff --git a/lib/librte_power/power_acpi_cpufreq.c 
b/lib/librte_power/power_acpi_cpufreq.c
index bce933e..cd5978d 100644
--- a/lib/librte_power/power_acpi_cpufreq.c
+++ b/lib/librte_power/power_acpi_cpufreq.c
@@ -623,3 +623,24 @@ power_acpi_disable_turbo(unsigned int lcore_id)
 
return 0;
 }
+
+int power_acpi_get_capabilities(unsigned int lcore_id,
+   struct rte_power_core_capabilities *caps)
+{
+   struct rte_power_info *pi;
+
+   if (lcore_id >= RTE_MAX_LCORE) {
+   RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+   return -1;
+   }
+   if (caps == NULL) {
+   RTE_LOG(ERR, POWER, "Invalid argument\n");
+   return -1;
+   }
+
+   pi = &lcore_power_info[lcore_id];
+   caps->capabilities = 0;
+   caps->turbo = !!(pi->turbo_available);
+
+   return 0;
+}
diff --git a/lib/librte_power/power_acpi_cpufreq.h 
b/lib/librte_power/power_acpi_cpufreq.h
index edeeb27..77701a9 100644
--- a/lib/librte_power/power_acpi_cpufreq.h
+++ b/lib/librte_power/power_acpi_cpufreq.h
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include "rte_power.h"
 
 #ifdef __cplusplus
 extern "C" {
@@ -196,6 +197,22 @@ int power_acpi_enable_turbo(unsigned int lcore_id);
  */
 int power_acpi_disable_turbo(unsigned int lcore_id);
 
+/**
+ * Returns power capabilities for a specific lcore.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param caps
+ *  pointer to rte_power_core_capabilities object.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_acpi_get_capabilities(unsigned int lcore_id,
+   struct rte_power_core_capabilities *caps);
+
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_power/power_kvm_vm.c b/lib/librte_power/power_kvm_vm.c
index 38e9066..20659b7 100644
--- a/lib/librte_power/power_kvm_vm.c
+++ b/lib/librte_power/power_kvm_vm.c
@@ -124,3 +124,11 @@ power_kvm_vm_disable_turbo(unsigned int lcore_id)
 {
return send_msg(lcore_id, CPU_POWER_DISABLE_TURBO);
 }
+
+struct rte_power_core_capabilities;
+int power_kvm_vm_get_capabilities(__rte_unused unsigned int lcore_id,
+   __rte_unused struct rte_power_core_capabilities *caps)
+{
+   RTE_LOG(ERR, POWER, "rte_power_get_capabilities is not implemented for 
Virtual Machine Power Management\n");
+   return -ENOTSUP;
+}
diff --git a/lib/librte_power/power_kvm_vm.h b/lib/librte_power/power_kvm_vm.h
index 446d699..94d4aa1 100644
--- a/lib/librte_power/power_kvm_vm.h
+++ b/lib/librte_power/power_kvm_vm.h
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include "rte_power.h"
 
 #ifdef __cplusplus
 extern "C" {
@@ -177,6 +178,22 @@ int power_kvm_vm_enable_turbo(unsigned int lcore_id);
  *  - Negative on error.
  */
 int power_kvm_vm_disable_turbo(unsigned int lcore_id);
+
+/**
+ * Returns power capabilities for a specific lcore.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param caps
+ *  pointer to rte_power_core_capabilities object.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_kvm_vm_get_capabilities(unsigned int lcore_id,
+   struct rte_power_core_capabilities *caps);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_power/rte_power.c b/lib/librte_power/rte_power.c
index 6c8fb40..208b791 100644
--- a/lib/librte_power/rte_power.c
+++ b/lib/librte_power/rte_power.c
@@ -24,6 +24,7 @@ rte_power_freq_change_t rte_power_freq_min = NULL;
 rte_power_freq_change_t rte_power_turbo_status;
 rte_power_freq_change_t rte_power_freq_enable_turbo;
 rte_power_freq_change_t rte_power_freq_disable_turbo;
+rte_power_get_capabilities_t rte_power_get_capabilities;
 
 int
 rte_power_set_env(enum power_management_env env)
@@ -42,6 +43,7 @@ rte_power_set_env(enum power_management_env env)
rte_power_turbo_status = power_acpi_turbo_status;
rte_power_freq_enable_turbo = power_acpi_enable_turbo;
rte_power_freq_disable_turbo = power_acpi_disable_turbo;
+   rte_power_get_capabilities = power_acpi_get_capabilities;
} else if (env == PM_ENV_KVM_VM) {
rte_power_freqs = power_kvm_vm_freqs;
rte_power_get_freq = power_kvm_vm_get_freq;
@@ -53,6 +55,7 @@ rte_power_set_env(enum power_management_env env)

[dpdk-dev] [PATCH 2/2] examples/l3fw-power: add high/regular performance cores option

2018-06-06 Thread Radu Nicolau
Added high/regular performance core pinning configuration options
that can be used in place of the existing 'config' option.

'--high-perf-cores CORELIST' option allow the user to specify a high performance
cores list; if this option is not used and the 'perf-config' option is used, the
application will query the system using the rte_power library in order to get a 
list
of available high performance cores. The cores that are considered high 
performance
are the cores that have turbo enabled.

'--perf-config 
(port,queue,hi_perf,lcore_index)[(port,queue,hi_perf,lcore_index)]'
option is similar to the existing config option, the cores are specified
as indices for bins containing high or regular performance cores.

Example:

l3fwd-power -l 6,7 -- -p 0xff --high-perf-cores 6 
--perf-config="(0,0,0,0),(1,0,1,0)"

cores 6 and 7 are used, core 6 is specified as a high performance core.
port 0 queue 0 will use a regular performance core, index 0 (core 7)
port 1 queue 0 will use a high performance core, index 0 (core 6)

Signed-off-by: Radu Nicolau 
---
 examples/l3fwd-power/Makefile|   4 +-
 examples/l3fwd-power/main.c  |  74 ++---
 examples/l3fwd-power/main.h  |  20 
 examples/l3fwd-power/meson.build |   4 +-
 examples/l3fwd-power/perf_core.c | 229 +++
 examples/l3fwd-power/perf_core.h |  12 ++
 6 files changed, 322 insertions(+), 21 deletions(-)
 create mode 100644 examples/l3fwd-power/main.h
 create mode 100644 examples/l3fwd-power/perf_core.c
 create mode 100644 examples/l3fwd-power/perf_core.h

diff --git a/examples/l3fwd-power/Makefile b/examples/l3fwd-power/Makefile
index 390b7d6..1a46033 100644
--- a/examples/l3fwd-power/Makefile
+++ b/examples/l3fwd-power/Makefile
@@ -1,11 +1,11 @@
 # SPDX-License-Identifier: BSD-3-Clause
-# Copyright(c) 2010-2014 Intel Corporation
+# Copyright(c) 2010-2018 Intel Corporation
 
 # binary name
 APP = l3fwd-power
 
 # all source are stored in SRCS-y
-SRCS-y := main.c
+SRCS-y := main.c perf_core.c
 
 # Build using pkg-config variables if possible
 $(shell pkg-config --exists libdpdk)
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index 596d645..36f31de 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2010-2016 Intel Corporation
+ * Copyright(c) 2010-2018 Intel Corporation
  */
 
 #include 
@@ -44,6 +44,9 @@
 #include 
 #include 
 
+#include "perf_core.h"
+#include "main.h"
+
 #define RTE_LOGTYPE_L3FWD_POWER RTE_LOGTYPE_USER1
 
 #define MAX_PKT_BURST 32
@@ -155,14 +158,7 @@ struct lcore_rx_queue {
 #define MAX_RX_QUEUE_INTERRUPT_PER_PORT 16
 
 
-#define MAX_LCORE_PARAMS 1024
-struct lcore_params {
-   uint16_t port_id;
-   uint8_t queue_id;
-   uint8_t lcore_id;
-} __rte_cache_aligned;
-
-static struct lcore_params lcore_params_array[MAX_LCORE_PARAMS];
+struct lcore_params lcore_params_array[MAX_LCORE_PARAMS];
 static struct lcore_params lcore_params_array_default[] = {
{0, 0, 2},
{0, 1, 2},
@@ -175,8 +171,8 @@ static struct lcore_params lcore_params_array_default[] = {
{3, 1, 3},
 };
 
-static struct lcore_params * lcore_params = lcore_params_array_default;
-static uint16_t nb_lcore_params = sizeof(lcore_params_array_default) /
+struct lcore_params * lcore_params = lcore_params_array_default;
+uint16_t nb_lcore_params = sizeof(lcore_params_array_default) /
sizeof(lcore_params_array_default[0]);
 
 static struct rte_eth_conf port_conf = {
@@ -1121,10 +1117,15 @@ print_usage(const char *prgname)
 {
printf ("%s [EAL options] -- -p PORTMASK -P"
"  [--config (port,queue,lcore)[,(port,queue,lcore]]"
+   "  [--high-perf-cores CORELIST"
+   "  [--perf-config 
(port,queue,hi_perf,lcore_index)[,(port,queue,hi_perf,lcore_index]]"
"  [--enable-jumbo [--max-pkt-len PKTLEN]]\n"
"  -p PORTMASK: hexadecimal bitmask of ports to configure\n"
"  -P : enable promiscuous mode\n"
"  --config (port,queue,lcore): rx queues configuration\n"
+   "  --high-perf-cores CORELIST: list of high performance cores\n"
+   "  --perf-config: similar as config, cores specified as indices"
+   " for bins containing high or regular performance cores\n"
"  --no-numa: optional, disable numa awareness\n"
"  --enable-jumbo: enable jumbo frame"
" which max packet len is PKTLEN in decimal (64-9600)\n"
@@ -1234,6 +1235,8 @@ parse_args(int argc, char **argv)
char *prgname = argv[0];
static struct option lgopts[] = {
{"config", 1, 0, 0},
+   {"perf-config", 1, 0, 0},
+   {"high-perf-cores", 1, 0, 0},
{"no-numa", 0, 0, 0},
{"enable-jumbo", 0, 0, 0},
{CMD_LINE_OPT_PARSE_PTYPE, 0, 0,

Re: [dpdk-dev] [PATCH v2] net/qede: fix L2-handles used for RSS hash update

2018-06-06 Thread Kevin Traynor
On 06/06/2018 12:03 AM, Rasesh Mody wrote:
> Fix fast path array index which is used for passing L2 handles to RSS
> indirection table, properly distribute rxq handles for indirection table.
> Currently, it is using the local copy of indirection table. When the RX
> queue configuration changes the local copy becomes invalid.
> 
> Fixes: 69d7ba88f1a1 ("net/qede/base: use L2-handles for RSS configuration")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Rasesh Mody 
> Reviewed-by: Kevin Traynor 

LGTM

> ---
>  drivers/net/qede/qede_ethdev.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/qede/qede_ethdev.c b/drivers/net/qede/qede_ethdev.c
> index 3206cc6..5a4071b 100644
> --- a/drivers/net/qede/qede_ethdev.c
> +++ b/drivers/net/qede/qede_ethdev.c
> @@ -2210,7 +2210,7 @@ int qede_rss_hash_update(struct rte_eth_dev *eth_dev,
>   vport_update_params.vport_id = 0;
>   /* pass the L2 handles instead of qids */
>   for (i = 0 ; i < ECORE_RSS_IND_TABLE_SIZE ; i++) {
> - idx = qdev->rss_ind_table[i];
> + idx = i % QEDE_RSS_COUNT(qdev);
>   rss_params.rss_ind_table[i] = qdev->fp_array[idx].rxq->handle;
>   }
>   vport_update_params.rss_params = &rss_params;
> 



Re: [dpdk-dev] [PATCH v2 00/26] net/ena: new features and fixes

2018-06-06 Thread Michał Krawczyk
Hi Ferruh,

2018-06-05 18:42 GMT+02:00 Ferruh Yigit :
>
> On 6/4/2018 1:09 PM, Michal Krawczyk wrote:
> > The ENA driver was updated with the new features and few fixes and minor
> > changes are introduced.
> > First of all, the communication layer which is delivered by vendor was
> > updated - the version in the HEAD is a bit outdated now. ENA is able to
> > communicate with the driver through Admin queue by using admin interrupts
> > instead of polling.
> > Admin interrupts are also used for handling AENQ events, which are used for
> > the following new features:
> >   - LSC handlers
> >   - watchdog and device rest
> >   - monitoring the admin queue
> >   - handling ENA notifications (getting hints from device)
> > For the watchdog and admin queue monitoring, the timers had to be used, so
> > the makefile was modified to do not cut out the librte_timer.
> >
> > From other fixes and changes:
> >   - legacy LLQ was removed which is now deprecated API
> >   - Rx out of order completion was added to enable cleaning up packets out
> > of order
> >   - Tx mbufs are now linearized if they exceed supported number of segments
> >   - pass information about maximum number of Tx and Rx descriptors
> >   - the IO queue number is now taking into consideration maximum number of
> > sq and cq
> >   - Tx id requested for sending is now being validated and the reset is
> > being triggered if it is invalid
> >   - branch predictioning was added for better performance
> >   - error checking and returned values were fixed
> >   - macros for allocating memory in communication layer were fixed
> >   - information about numa mode is now being passed to the NIC
> >
> > ---
> > v2:
> > * Rebased on top of dpdk-next-net
> > * Added link speed patch
> > * Added fix when allocating coherent memory in the PMD
>
> Hi Michał,
>
> I am getting build error for ICC [1] and shared library [2], can you please 
> check?
>
>
> [1]
> .../drivers/net/ena/base/ena_com.c(323): error #592: variable "flags" is used
> before its value is set
> ENA_SPINLOCK_LOCK(admin_queue->q_lock, flags);
> ^
> .../drivers/net/ena/base/ena_com.c(534): error #3656: variable "flags" may be
> used before its value is set
>   ENA_SPINLOCK_LOCK(admin_queue->q_lock, flags);
>   ^
> .../drivers/net/ena/base/ena_com.c(589): error #592: variable "flags" is used
> before its value is set
> ENA_SPINLOCK_LOCK(admin_queue->q_lock, flags);
> ^
> .../drivers/net/ena/base/ena_com.c(634): error #592: variable "flags" is used
> before its value is set
> ENA_SPINLOCK_LOCK(mmio_read->lock, flags);
> ^
> .../drivers/net/ena/base/ena_com.c(1297): error #592: variable "flags" is used
> before its value is set
> ENA_SPINLOCK_LOCK(admin_queue->q_lock, flags);
> ^
> .../drivers/net/ena/base/ena_com.c(1341): error #592: variable "flags" is used
> before its value is set
> ENA_SPINLOCK_LOCK(admin_queue->q_lock, flags);
> ^
>

That should be easy to fix - I prepared the patch but I want to send
it with second fix together.

>
>
> [2]
> ena_ethdev.o: In function `eth_ena_pci_probe':
> .../drivers/net/ena/ena_ethdev.c:(.text+0x6de): undefined reference to
> `rte_timer_subsystem_init'
> .../drivers/net/ena/ena_ethdev.c:(.text+0x6ea): undefined reference to
> `rte_timer_init'
> ena_ethdev.o: In function `eth_ena_pci_remove':
> .../drivers/net/ena/ena_ethdev.c:(.text+0x854): undefined reference to
> `rte_timer_stop_sync'
> ena_ethdev.o: In function `ena_start':
> .../drivers/net/ena/ena_ethdev.c:(.text+0x1ff6): undefined reference to
> `rte_timer_reset'
> ena_ethdev.o: In function `ena_stop':
> .../drivers/net/ena/ena_ethdev.c:(.text+0x21a1): undefined reference to
> `rte_timer_stop_sync'
> ena_ethdev.o: In function `ena_close':
> .../drivers/net/ena/ena_ethdev.c:(.text+0x21d8): undefined reference to
> `rte_timer_stop_sync'
>

Strange that this happens, it should be fixed in the patch:
  mk: link librte_timer with --whole-archive
which seems to be no longer needed, because similar commit is already
applied (eb54ef42b02f94c4093556fdd6b51e2d7fd0df47). That should
resolve the issue with linking (at least for gcc it is helping) by
linking timer library with --whole-archive.
Could you, please, send me build logs so I could look closer if it is
properly built with ICC as I don't have access to it?

Thanks,
Michał


Re: [dpdk-dev] [PATCH] net/bonding: add add/remove mac addrs

2018-06-06 Thread Alex Kiselev
Hi Chas.

Thank you for the review.

>This is fine but you are missing one bit.  If a slave is added after you can 
>configured the MAC
>addresses, you will need to replay the configured MAC addresses into that 
>slave.

Yeah, I've missed that part about adding a slave when a bond is already working.

It looks like that mac_address_slaves_update() is the right place
to configure additional MAC addresses when a slave is being added.
But there is one moment that's very unclear to me.
could you please take a look at my reasoning about it?

mac_address_slaves_update() behaves very straightforward when
the bond mode isn't BONDING_MODE_8023AD, it just sets MAC address,
so I guess adding rte_eth_dev_mac_addr_add() call would be enought.

But when mode is BONDING_MODE_8023AD, mac_address_slaves_update() calls
bond_mode_8023ad_mac_address_update(). first, it doesn't use
rte_eth_dev_default_mac_addr_set() and second, in BONDING_MODE_8023AD
mode promiscuous mode is enabled for every slave.

Those are two moments that are unclear to me.
I would say that there is no need to do any MAC manipulations
(either rte_eth_dev_default_mac_addr_set() or rte_eth_dev_mac_addr_add())
when mode is BONDING_MODE_8023AD since promiscuous mode is enabled. Am I right?


> This is fine but you are missing one bit.  If a slave is added after you can 
> configured the MAC
> addresses, you will need to replay the configured MAC addresses into that 
> slave.
>
> It's not clear to me what the contract is between bonding and DPDK as far as 
> slave
> removal.  Just looking at the behavior with the default MAC address, I would 
> guess
> that bonding also needs to remove extra MAC addresses it added during slave
> remove.  Bonding is a PMD but also an application.

On Fri, May 25, 2018 at 12:21 PM, Alex Kiselev  wrote:

add functions to add/remove MAC addresses

Signed-off-by: Alex Kiselev 
---
 drivers/net/bonding/rte_eth_bond_pmd.c | 76 +++---
 1 file changed, 71 insertions(+), 5 deletions(-)

diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c 
b/drivers/net/bonding/rte_eth_bond_pmd.c
index 02d94b1..835b4e3 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -25,6 +25,7 @@

 #define REORDER_PERIOD_MS 10
 #define DEFAULT_POLLING_INTERVAL_10_MS (10)
+#define BOND_MAX_MAC_ADDRS 16

 #define HASH_L4_PORTS(h) ((h)->src_port ^ (h)->dst_port)

@@ -2219,7 +2220,7 @@ bond_ethdev_info(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
uint16_t max_nb_rx_queues = UINT16_MAX;
uint16_t max_nb_tx_queues = UINT16_MAX;

-   dev_info->max_mac_addrs = 1;
+   dev_info->max_mac_addrs = BOND_MAX_MAC_ADDRS;

dev_info->max_rx_pktlen = internals->candidate_max_rx_pktlen ?
internals->candidate_max_rx_pktlen :
@@ -2905,6 +2906,65 @@ bond_filter_ctrl(struct rte_eth_dev *dev __rte_unused,
return -ENOTSUP;
 }

+static int
+bond_ethdev_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac_addr,
+   __rte_unused uint32_t index, uint32_t vmdq)
+{
+   struct rte_eth_dev *slave_eth_dev;
+   struct bond_dev_private *internals = dev->data->dev_private;
+   int ret, i;
+
+   rte_spinlock_lock(&internals->lock);
+
+   for (i = 0; i < internals->slave_count; i++) {
+   slave_eth_dev = &rte_eth_devices[internals->slaves[i].port_id];
+   if (*slave_eth_dev->dev_ops->mac_addr_add == NULL) {
+   ret = -ENOTSUP;
+   goto end;
+   }
+   }
+
+   for (i = 0; i < internals->slave_count; i++) {
+   ret = rte_eth_dev_mac_addr_add(internals->slaves[i].port_id, 
mac_addr,
+   vmdq);
+   if (ret < 0)
+   goto end;
+   }
+
+   ret = 0;
+end:
+   rte_spinlock_unlock(&internals->lock);
+   return ret;
+}
+
+static void
+bond_ethdev_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
+{
+   struct rte_eth_dev *slave_eth_dev;
+   struct bond_dev_private *internals = dev->data->dev_private;
+   int ret, i;
+
+   rte_spinlock_lock(&internals->lock);
+
+   for (i = 0; i < internals->slave_count; i++) {
+   slave_eth_dev = &rte_eth_devices[internals->slaves[i].port_id];
+   if (*slave_eth_dev->dev_ops->mac_addr_remove == NULL)
+   goto end;
+   }
+
+   struct ether_addr *mac_addr = &dev->data->mac_addrs[index];
+
+   for (i = 0; i < internals->slave_count; i++) {
+   ret = rte_eth_dev_mac_addr_remove(internals->slaves[i].port_id,
+   mac_addr);
+   if (ret < 0)
+   goto end;
+   }
+
+end:
+   rte_spinlock_unlock(&internals->lock);
+}
+
 const struct eth_dev_ops default_dev_ops = {
.dev_start= bond_ethdev_start,
.dev_stop = bond_ethdev_s

[dpdk-dev] [PATCH] examples/ipsec-secgw: fix incorrect IPv4 checksum at TX

2018-06-06 Thread Konstantin Ananyev
For ESP transport and BYPASS mode the app might generate output
packets with invalid IPv4 header checksum.
At least such behavior was observed on few Intel NICs.
The reason is that the app didn't set ipv4 header checksum to zero
before passing it to the HW.

Fixes: 906257e965b7 ("examples/ipsec-secgw: support IPv6")

Signed-off-by: Konstantin Ananyev 
---
 examples/ipsec-secgw/ipsec-secgw.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/examples/ipsec-secgw/ipsec-secgw.c 
b/examples/ipsec-secgw/ipsec-secgw.c
index 5d7071657..38933341b 100644
--- a/examples/ipsec-secgw/ipsec-secgw.c
+++ b/examples/ipsec-secgw/ipsec-secgw.c
@@ -331,6 +331,7 @@ prepare_tx_pkt(struct rte_mbuf *pkt, uint16_t port)
pkt->l3_len = sizeof(struct ip);
pkt->l2_len = ETHER_HDR_LEN;
 
+   ip->ip_sum = 0;
ethhdr->ether_type = rte_cpu_to_be_16(ETHER_TYPE_IPv4);
} else {
pkt->ol_flags |= PKT_TX_IPV6;
-- 
2.13.6



Re: [dpdk-dev] Compiling RSS ebpf

2018-06-06 Thread Sameeh Jubran
On Tue, May 29, 2018 at 6:48 PM, Ophir Munk  wrote:

> Hi Sameeh,
>
> RSS ebpf source code compilation is planned for dpdk next releases.
>
>
>
> In order to compile the current tap_bpf_program.c code you will need
> definitions found in iproute2:
>
>
>
> https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/
>
>
>
> 1.   Please clone iproute2 tree and make it accessible to your build
> environment, say under directory .
>
>
>
> 2.   Please add "bpf_api.h" file inclusion in tap_bpf_program.c
> (following tap_rss.h inclusion, see [1])
>
>
>
> 3.   When running the clang command please include the iproute2
> include directory, for example,
>
> clang -I /iproute2/include -O2 -emit-llvm -c
> tap_bpf_program.c -o - | llc -march=bpf -filetype=obj -o tap_bpf_program.o
>
>
>
> That’s all is needed.
>

I have just tested this and this indeed solved the compilation errors!

It'll be great if you'll add them to the documentation as it is misleading
and not up-to-date :)

Thanks

>
>
> [1]
>
> 
> ---
>
> diff --git a/drivers/net/tap/tap_bpf_program.c b/drivers/net/tap/tap_bpf_
> program.c
>
> index 1cb7382..d65f9b8 100644
>
> --- a/drivers/net/tap/tap_bpf_program.c
>
> +++ b/drivers/net/tap/tap_bpf_program.c
>
> @@ -17,6 +17,7 @@
>
> #include 
>
>
>
>   #include "tap_rss.h"
>
>   +#include "bpf_api.h"
>
>
>
>/** Create IPv4 address */
>
> #define IPv4(a, b, c, d) ((__u32)(((a) & 0xff) << 24) | \
>
> 
> ---
>
>
>
> Regards,
>
> Ophir
>
>
>
> *From:* Sameeh Jubran [mailto:sam...@daynix.com]
> *Sent:* Friday, May 25, 2018 11:54 AM
> *To:* dev@dpdk.org; Ophir Munk 
> *Cc:* Yan Vugenfirer 
> *Subject:* Re: Compiling RSS ebpf
>
>
>
> Ping.
>
>
>
> On Sun, May 13, 2018 at 1:16 PM, Sameeh Jubran  wrote:
>
> Sorry I have accidentally sent my last email without finishing it.
>
>
>
> I believe it is due misconfiguration of the linux headers and their
> inclusion. Anyone else faced these errors?
>
>
>
>
>
> On Sun, May 13, 2018 at 1:10 PM, Sameeh Jubran  wrote:
>
> Hi All,
>
>
>
> I am attempting to compile the rss epbf program dpdk and I am getting too
> much compilation errors, is there anything specific that I need to
> configure when compiling?
>
>
>
> I am running the following line:
>
>
>
>  clang -O2 -emit-llvm -c tap_bpf_program.c -o - | llc -march=bpf
> -filetype=obj -o tap_bpf_program.o
>
>
>
> I think it is due to some misconfiguration of
>
> The errors I am getting:
>
>
>
> tap_bpf_program.c:157:37: warning: implicit declaration of function
> 'offsetof' is invalid in C99 [-Wimplicit-function-declaration]
>
> __u8 *src_dst_addr = data + off + offsetof(struct iphdr,
> saddr);
>
>   ^
>
> tap_bpf_program.c:157:46: error: expected expression
>
> __u8 *src_dst_addr = data + off + offsetof(struct iphdr,
> saddr);
>
>^
>
> tap_bpf_program.c:157:60: error: use of undeclared identifier 'saddr'
>
> __u8 *src_dst_addr = data + off + offsetof(struct iphdr,
> saddr);
>
>  ^
>
> tap_bpf_program.c:180:11: error: use of undeclared identifier 'TC_ACT_OK'
>
> return TC_ACT_OK;
>
>^
>
> tap_bpf_program.c:182:15: error: expected expression
>
> offsetof(struct ipv6hdr, saddr);
>
>  ^
>
> tap_bpf_program.c:182:31: error: use of undeclared identifier 'saddr'
>
> offsetof(struct ipv6hdr, saddr);
>
>  ^
>
> tap_bpf_program.c:204:10: error: use of undeclared identifier 'TC_ACT_PIPE'
>
> return TC_ACT_PIPE;
>
>^
>
> tap_bpf_program.c:212:9: error: use of undeclared identifier
> 'TC_ACT_RECLASSIFY'
>
> return TC_ACT_RECLASSIFY;
>
>^
>
> tap_bpf_program.c:222:1: error: expected parameter declarator
>
> RSS(l3_l4)
>
> ^
>
> tap_bpf_program.c:216:12: note: expanded from macro 'RSS'
>
> __section(#L) int   \
>
>   ^
>
> :76:1: note: expanded from here
>
> "l3_l4"
>
> ^
>
> tap_bpf_program.c:222:1: error: expected ')'
>
> tap_bpf_program.c:216:12: note: expanded from macro 'RSS'
>
> __section(#L) int   \
>
>   ^
>
> :76:1: note: expanded from here
>
> "l3_l4"
>
> ^
>
> tap_bpf_program.c:222:1: note: to match this '('
>
> tap_bpf_program.c:216:11: note: expanded from macro 'RSS'
>
> __section(#L) int   \
>
>  ^
>
> tap_bpf_program.c:222:1: warnin

[dpdk-dev] [PATCH] net/bonding: don't clear active slave count

2018-06-06 Thread Chas Williams
From: "Charles (Chas) Williams" 

When the bond PMD is stopped, the active slave count is reset.
For 802.3ad mode this potentially leaks memory and clears state since
a second sequential activate_slave() will occur when the bond PMD is
restarted and the LSC callback is triggered while the active slave
count is 0. To fix this, don't clear the active slave count when
stopping. Only deactivate_slave() should be used to clear the slaves.

Signed-off-by: Chas Williams 
---
 drivers/net/bonding/rte_eth_bond_pmd.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c 
b/drivers/net/bonding/rte_eth_bond_pmd.c
index 02d94b1b1..4ae577078 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -2173,7 +2173,6 @@ bond_ethdev_stop(struct rte_eth_dev *eth_dev)
tlb_last_obytets[internals->active_slaves[i]] = 0;
}
 
-   internals->active_slave_count = 0;
internals->link_status_polling_enabled = 0;
for (i = 0; i < internals->slave_count; i++)
internals->slaves[i].last_link_status = 0;
-- 
2.14.3



[dpdk-dev] [PATCH v2 2/5] net/virtio: use simple path for Tx even if Rx mergeable

2018-06-06 Thread Maxime Coquelin
Having Rx mergeable buffers feature enabled should not be
a reason to not use Tx simple path.

Signed-off-by: Maxime Coquelin 
---
 drivers/net/virtio/virtio_ethdev.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index bdc4f09d5..73e6d6b6b 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1949,7 +1949,6 @@ virtio_dev_configure(struct rte_eth_dev *dev)
 #endif
if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) {
hw->use_simple_rx = 0;
-   hw->use_simple_tx = 0;
}
 
if (rx_offloads & (DEV_RX_OFFLOAD_UDP_CKSUM |
-- 
2.14.3



[dpdk-dev] [PATCH v2 0/5] net/virtio: Tx path selection and offload improvements

2018-06-06 Thread Maxime Coquelin
This series addresses a problem seen when running benchmars,
where we see a big difference in Tx performance depending on
whether Rx mergeable is enabled or not.
With patch 1, Tx simple path is selected even when Rx-mrg is
negotiated.

Digging a bit further, I found that Tx simple path could be
selected even when offload features had been negotiated.
With patch 2, guest does not try to negotiate Tx offload
features that haven't been selected by the application.
It means that without restarting the guest, we can now switch
from Tx simple path when application do no request offloads,
to Tx standard path when the application request offloads.
Note that to do so, one must stop the port first [0].
Another advantage of doing this than using Tx simple path
when guest application does not use offload features, is
that on host side, the virtio net header parsing is skipped
in dequeue function.

Patch 3 fixes an issue with Rx offload, as simple path was
be used if application requested TCP LRO only.

Finally, patch 4 aims at improving the offload feature checks
in the standard paths.

Below are benchmarks results when offloads are enabled at
device level and simple Tx is disabled (default).

Rx-mrg=off benchmarks:
++---+-+-+--+
|Run |  PVP  | Guest->Host | Host->Guest | Loopback |
++---+-+-+--+
| v18.05 | 14.47 |   17.02 |   17.57 |13.15 |
| + series   | 11.73 |   13.90 |   17.50 |13.19 |
++---+-+-+--+

Rx-mrg=on benchmarks:
++---+-+-+--+
|Run |  PVP  | Guest->Host | Host->Guest | Loopback |
++---+-+-+--+
| v18.05 |  9.53 |   13.80 |   16.70 |13.11 |
| + series   |  9.58 |   13.93 |   16.70 |13.11 |
++---+-+-+--+

Below are benchmarks results when offloads are enabled at
device level and simple Tx is unlocked (with passing
'simple_tx_support=1' as Virtio device devarg).

Rx-mrg=off benchmarks:
++---+-+-+--+
|Run |  PVP  | Guest->Host | Host->Guest | Loopback |
++---+-+-+--+
| v18.05 | 14.47 |   17.02 |   17.57 |13.15 |
| + series   | 14.88 |   19.64 |   17.50 |13.14 |
++---+-+-+--+

Rx-mrg=on benchmarks:
++---+-+-+--+
|Run |  PVP  | Guest->Host | Host->Guest | Loopback |
++---+-+-+--+
| v18.05 |  9.53 |   13.80 |   16.70 |13.11 |
| + series   | 12.62 |   19.69 |   16.70 |13.11 |
++---+-+-+--+


[0]:
testpmd> port start 0
EAL: Error disabling MSI-X interrupts for fd 17
set_rxtx_funcs(): virtio: using mergeable buffer Rx path on port 0
set_rxtx_funcs(): virtio: using simple Tx path on port 0
Port 0: 52:54:00:58:D7:01
Checking link statuses...
Done
testpmd> show port 0 tx_offload capabilities 
Tx Offloading Capabilities of port 0 :
  Per Queue :
  Per Port  : VLAN_INSERT UDP_CKSUM TCP_CKSUM TCP_TSO MULTI_SEGS
 
testpmd> show port 0 tx_offload configuration
Tx Offloading Configuration of port 0 :
  Port :
  Queue[ 0] :

testpmd> port stop 0
Stopping ports...
Checking link statuses...
Done
testpmd> csum set tcp hw 0
Parse tunnel is off
IP checksum offload is sw
UDP checksum offload is sw
TCP checksum offload is hw
SCTP checksum offload is sw
Outer-Ip checksum offload is sw
testpmd> port start 0
Configuring Port 0 (socket 0)
EAL: Error disabling MSI-X interrupts for fd 17
set_rxtx_funcs(): virtio: using mergeable buffer Rx path on port 0
set_rxtx_funcs(): virtio: using standard Tx path on port 0
Port 0: 52:54:00:58:D7:01
Checking link statuses...
Done 
testpmd> show port 0 tx_offload configuration
Tx Offloading Configuration of port 0 :
  Port : TCP_CKSUM
  Queue[ 0] :


Changes in v2:
==
- Introduce a devarg to disable simple Tx path by default (Tiwei)
- Use standard PATH if VLAN strip or insert offloads are requested (Tiwei)
- Use boolean instead of int/uint8_t for has_tx/rx_offload (Tiwei)

Maxime Coquelin (5):
  net/virtio: prevent simple Tx path selection by default
  net/virtio: use simple path for Tx even if Rx mergeable
  net/vhost: improve Tx path selection
  net/virtio: don't use simple Rx if TCP LRO or VLAN strip requested
  net/virtio: improve offload check performance

 doc/guides/nics/virtio.rst |   9 +++
 drivers/net/virtio/virtio_ethdev.c | 117 +++--
 drivers/net/virtio/virtio_ethdev.h |   3 -
 drivers/net/virtio/virtio_pci.h|   4 ++
 drivers/net/virtio/virtio_rxtx.c   |  31 ++-

[dpdk-dev] [PATCH v2 1/5] net/virtio: prevent simple Tx path selection by default

2018-06-06 Thread Maxime Coquelin
Simple Tx path is not compliant with the Virtio specification,
as it assumes the device will use the descriptors in order.

VIRTIO_F_IN_ORDER feature has been introduced recently, but the
simple Tx path is not compliant with it as VIRTIO_F_IN_ORDER
requires that chained descriptors are used sequentially, which
is not the case in simple Tx path.

This patch introduces 'simple_tx_support' devarg to unlock
Tx simple path selection.

Reported-by: Tiwei Bie 
Signed-off-by: Maxime Coquelin 
---
 doc/guides/nics/virtio.rst |  9 +
 drivers/net/virtio/virtio_ethdev.c | 72 +-
 drivers/net/virtio/virtio_pci.h|  1 +
 3 files changed, 81 insertions(+), 1 deletion(-)

diff --git a/doc/guides/nics/virtio.rst b/doc/guides/nics/virtio.rst
index 8922f9c0b..53ce1c12a 100644
--- a/doc/guides/nics/virtio.rst
+++ b/doc/guides/nics/virtio.rst
@@ -222,6 +222,9 @@ Tx callbacks:
 
 #. ``virtio_xmit_pkts_simple``:
Vector version fixes the available ring indexes to optimize performance.
+   This implementation does not comply with the Virtio specification, and so
+   is not selectable by default. "simple_tx_support=1" devarg must be passed
+   to unlock it.
 
 
 By default, the non-vector callbacks are used:
@@ -331,3 +334,9 @@ The user can specify below argument in devargs.
 driver, and works as a HW vhost backend. This argument is used to specify
 a virtio device needs to work in vDPA mode.
 (Default: 0 (disabled))
+
+#.  ``simple_tx_support``:
+
+This argument enables support for the simple Tx path, which is not
+compliant with the Virtio specification.
+(Default: 0 (disabled))
diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index 5833dad73..bdc4f09d5 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1331,6 +1331,8 @@ set_rxtx_funcs(struct rte_eth_dev *eth_dev)
if (hw->use_simple_tx) {
PMD_INIT_LOG(INFO, "virtio: using simple Tx path on port %u",
eth_dev->data->port_id);
+   PMD_INIT_LOG(WARNING,
+   "virtio: simple Tx path does not comply with 
Virtio spec");
eth_dev->tx_pkt_burst = virtio_xmit_pkts_simple;
} else {
PMD_INIT_LOG(INFO, "virtio: using standard Tx path on port %u",
@@ -1790,6 +1792,65 @@ rte_virtio_pmd_init(void)
rte_pci_register(&rte_virtio_pmd);
 }
 
+#define VIRTIO_SIMPLE_TX_SUPPORT "simple_tx_support"
+
+static int virtio_dev_args_check(const char *key, const char *val,
+   void *opaque)
+{
+   struct rte_eth_dev *dev = opaque;
+   struct virtio_hw *hw = dev->data->dev_private;
+   unsigned long tmp;
+   int ret = 0;
+
+   errno = 0;
+   tmp = strtoul(val, NULL, 0);
+   if (errno) {
+   PMD_INIT_LOG(INFO, "%s: \"%s\" is not a valid integer", key, 
val);
+   return errno;
+   }
+
+   if (strcmp(VIRTIO_SIMPLE_TX_SUPPORT, key) == 0)
+   hw->support_simple_tx = !!tmp;
+
+   return ret;
+}
+
+static int
+virtio_dev_args(struct rte_eth_dev *dev)
+{
+   struct rte_kvargs *kvlist;
+   struct rte_devargs *devargs;
+   const char *valid_args[] = {
+   VIRTIO_SIMPLE_TX_SUPPORT,
+   NULL,
+   };
+   int ret;
+   int i;
+
+   devargs = dev->device->devargs;
+   if (!devargs)
+   return 0; /* return success */
+
+   kvlist = rte_kvargs_parse(devargs->args, valid_args);
+   if (kvlist == NULL)
+   return -EINVAL;
+
+/* Process parameters. */
+   for (i = 0; (valid_args[i] != NULL); ++i) {
+   if (rte_kvargs_count(kvlist, valid_args[i])) {
+   ret = rte_kvargs_process(kvlist, valid_args[i],
+virtio_dev_args_check, dev);
+   if (ret) {
+   rte_kvargs_free(kvlist);
+   return ret;
+   }
+   }
+   }
+   rte_kvargs_free(kvlist);
+
+   return 0;
+}
+
 /*
  * Configure virtio device
  * It returns 0 on success.
@@ -1804,6 +1865,10 @@ virtio_dev_configure(struct rte_eth_dev *dev)
int ret;
 
PMD_INIT_LOG(DEBUG, "configure");
+
+   if (virtio_dev_args(dev))
+   return -ENOTSUP;
+
req_features = VIRTIO_PMD_DEFAULT_GUEST_FEATURES;
 
if (dev->data->dev_conf.intr_conf.rxq) {
@@ -1869,7 +1934,12 @@ virtio_dev_configure(struct rte_eth_dev *dev)
rte_spinlock_init(&hw->state_lock);
 
hw->use_simple_rx = 1;
-   hw->use_simple_tx = 1;
+   /*
+* Simple Tx does not comply with Virtio spec,
+* "simple_tx_support=1" devarg needs to be passed
+* to unlock it.
+*/
+   hw->use_simple_tx = hw->support_simple_tx;
 
 #if defined RTE_ARCH_ARM64 || defined RTE_ARCH_ARM
if 

[dpdk-dev] [PATCH v2 5/5] net/virtio: improve offload check performance

2018-06-06 Thread Maxime Coquelin
Instead of checking the multiple Virtio features bits for
every packet, let's do the check once at configure time and
store it in virtio_hw struct.

Signed-off-by: Maxime Coquelin 
---
 drivers/net/virtio/virtio_ethdev.c | 19 +++
 drivers/net/virtio/virtio_pci.h|  3 +++
 drivers/net/virtio/virtio_rxtx.c   | 31 +--
 3 files changed, 27 insertions(+), 26 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index 357968fdd..3d4b28be8 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1851,6 +1851,22 @@ virtio_dev_args(struct rte_eth_dev *dev)
return 0;
 }
 
+static bool
+rx_offload_enabled(struct virtio_hw *hw)
+{
+   return vtpci_with_feature(hw, VIRTIO_NET_F_GUEST_CSUM) ||
+   vtpci_with_feature(hw, VIRTIO_NET_F_GUEST_TSO4) ||
+   vtpci_with_feature(hw, VIRTIO_NET_F_GUEST_TSO6);
+}
+
+static bool
+tx_offload_enabled(struct virtio_hw *hw)
+{
+   return vtpci_with_feature(hw, VIRTIO_NET_F_CSUM) ||
+   vtpci_with_feature(hw, VIRTIO_NET_F_HOST_TSO4) ||
+   vtpci_with_feature(hw, VIRTIO_NET_F_HOST_TSO6);
+}
+
 /*
  * Configure virtio device
  * It returns 0 on success.
@@ -1934,6 +1950,9 @@ virtio_dev_configure(struct rte_eth_dev *dev)
return -ENOTSUP;
}
 
+   hw->has_tx_offload = tx_offload_enabled(hw);
+   hw->has_rx_offload = rx_offload_enabled(hw);
+
if (dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC)
/* Enable vector (0) for Link State Intrerrupt */
if (VTPCI_OPS(hw)->set_config_irq(hw, 0) ==
diff --git a/drivers/net/virtio/virtio_pci.h b/drivers/net/virtio/virtio_pci.h
index 7318bb318..337dc6180 100644
--- a/drivers/net/virtio/virtio_pci.h
+++ b/drivers/net/virtio/virtio_pci.h
@@ -6,6 +6,7 @@
 #define _VIRTIO_PCI_H_
 
 #include 
+#include 
 
 #include 
 #include 
@@ -234,6 +235,8 @@ struct virtio_hw {
uint8_t support_simple_tx;
uint8_t use_simple_rx;
uint8_t use_simple_tx;
+   boolhas_tx_offload;
+   boolhas_rx_offload;
uint16_tport_id;
uint8_t mac_addr[ETHER_ADDR_LEN];
uint32_tnotify_off_multiplier;
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 92fab2174..8579a7567 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -225,13 +225,6 @@ virtio_tso_fix_cksum(struct rte_mbuf *m)
}
 }
 
-static inline int
-tx_offload_enabled(struct virtio_hw *hw)
-{
-   return vtpci_with_feature(hw, VIRTIO_NET_F_CSUM) ||
-   vtpci_with_feature(hw, VIRTIO_NET_F_HOST_TSO4) ||
-   vtpci_with_feature(hw, VIRTIO_NET_F_HOST_TSO6);
-}
 
 /* avoid write operation when necessary, to lessen cache issues */
 #define ASSIGN_UNLESS_EQUAL(var, val) do { \
@@ -251,9 +244,7 @@ virtqueue_enqueue_xmit(struct virtnet_tx *txvq, struct 
rte_mbuf *cookie,
uint16_t head_idx, idx;
uint16_t head_size = vq->hw->vtnet_hdr_size;
struct virtio_net_hdr *hdr;
-   int offload;
 
-   offload = tx_offload_enabled(vq->hw);
head_idx = vq->vq_desc_head_idx;
idx = head_idx;
dxp = &vq->vq_descx[idx];
@@ -270,8 +261,7 @@ virtqueue_enqueue_xmit(struct virtnet_tx *txvq, struct 
rte_mbuf *cookie,
 * which is wrong. Below subtract restores correct pkt size.
 */
cookie->pkt_len -= head_size;
-   /* if offload disabled, it is not zeroed below, do it now */
-   if (offload == 0) {
+   if (!vq->hw->has_tx_offload) {
ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0);
ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0);
ASSIGN_UNLESS_EQUAL(hdr->flags, 0);
@@ -309,7 +299,7 @@ virtqueue_enqueue_xmit(struct virtnet_tx *txvq, struct 
rte_mbuf *cookie,
}
 
/* Checksum Offload / TSO */
-   if (offload) {
+   if (vq->hw->has_tx_offload) {
if (cookie->ol_flags & PKT_TX_TCP_SEG)
cookie->ol_flags |= PKT_TX_TCP_CKSUM;
 
@@ -686,14 +676,6 @@ virtio_rx_offload(struct rte_mbuf *m, struct 
virtio_net_hdr *hdr)
return 0;
 }
 
-static inline int
-rx_offload_enabled(struct virtio_hw *hw)
-{
-   return vtpci_with_feature(hw, VIRTIO_NET_F_GUEST_CSUM) ||
-   vtpci_with_feature(hw, VIRTIO_NET_F_GUEST_TSO4) ||
-   vtpci_with_feature(hw, VIRTIO_NET_F_GUEST_TSO6);
-}
-
 #define VIRTIO_MBUF_BURST_SZ 64
 #define DESC_PER_CACHELINE (RTE_CACHE_LINE_SIZE / sizeof(struct vring_desc))
 uint16_t
@@ -709,7 +691,6 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, 
uint16_t nb_pkts)
int error;
uint32_t i, nb_enqueued;
uint32_t hdr_size;
-   int offload;
struct virtio_net_hdr *hdr;
 
nb_rx = 0;
@@ 

[dpdk-dev] [PATCH v2 3/5] net/vhost: improve Tx path selection

2018-06-06 Thread Maxime Coquelin
This patch improves the Tx path selection depending on
whether the application request for offloads, and on whether
offload features have been negotiated.

When the application doesn't request for Tx offload features,
the corresponding features bits aren't negotiated.

When Tx offload virtio features have been negotiated, ensure
the simple Tx path isn't selected.

Signed-off-by: Maxime Coquelin 
---
 drivers/net/virtio/virtio_ethdev.c | 21 +++--
 drivers/net/virtio/virtio_ethdev.h |  3 ---
 2 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index 73e6d6b6b..b023ec02e 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1859,8 +1859,10 @@ static int
 virtio_dev_configure(struct rte_eth_dev *dev)
 {
const struct rte_eth_rxmode *rxmode = &dev->data->dev_conf.rxmode;
+   const struct rte_eth_txmode *txmode = &dev->data->dev_conf.txmode;
struct virtio_hw *hw = dev->data->dev_private;
uint64_t rx_offloads = rxmode->offloads;
+   uint64_t tx_offloads = txmode->offloads;
uint64_t req_features;
int ret;
 
@@ -1886,6 +1888,15 @@ virtio_dev_configure(struct rte_eth_dev *dev)
(1ULL << VIRTIO_NET_F_GUEST_TSO4) |
(1ULL << VIRTIO_NET_F_GUEST_TSO6);
 
+   if (tx_offloads & (DEV_TX_OFFLOAD_TCP_CKSUM |
+   DEV_TX_OFFLOAD_UDP_CKSUM))
+   req_features |= (1ULL << VIRTIO_NET_F_CSUM);
+
+   if (tx_offloads & DEV_TX_OFFLOAD_TCP_TSO)
+   req_features |=
+   (1ULL << VIRTIO_NET_F_HOST_TSO4) |
+   (1ULL << VIRTIO_NET_F_HOST_TSO6);
+
/* if request features changed, reinit the device */
if (req_features != hw->req_guest_features) {
ret = virtio_init_device(dev, req_features);
@@ -1955,6 +1966,12 @@ virtio_dev_configure(struct rte_eth_dev *dev)
   DEV_RX_OFFLOAD_TCP_CKSUM))
hw->use_simple_rx = 0;
 
+   if (tx_offloads & (DEV_TX_OFFLOAD_TCP_CKSUM |
+   DEV_TX_OFFLOAD_UDP_CKSUM |
+   DEV_TX_OFFLOAD_TCP_TSO |
+   DEV_TX_OFFLOAD_VLAN_INSERT))
+   hw->use_simple_tx = 0;
+
return 0;
 }
 
@@ -2208,14 +2225,14 @@ virtio_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
 
dev_info->tx_offload_capa = DEV_TX_OFFLOAD_MULTI_SEGS |
DEV_TX_OFFLOAD_VLAN_INSERT;
-   if (hw->guest_features & (1ULL << VIRTIO_NET_F_CSUM)) {
+   if (host_features & (1ULL << VIRTIO_NET_F_CSUM)) {
dev_info->tx_offload_capa |=
DEV_TX_OFFLOAD_UDP_CKSUM |
DEV_TX_OFFLOAD_TCP_CKSUM;
}
tso_mask = (1ULL << VIRTIO_NET_F_HOST_TSO4) |
(1ULL << VIRTIO_NET_F_HOST_TSO6);
-   if ((hw->guest_features & tso_mask) == tso_mask)
+   if ((host_features & tso_mask) == tso_mask)
dev_info->tx_offload_capa |= DEV_TX_OFFLOAD_TCP_TSO;
 }
 
diff --git a/drivers/net/virtio/virtio_ethdev.h 
b/drivers/net/virtio/virtio_ethdev.h
index bb40064ea..b603665c7 100644
--- a/drivers/net/virtio/virtio_ethdev.h
+++ b/drivers/net/virtio/virtio_ethdev.h
@@ -28,9 +28,6 @@
 1u << VIRTIO_NET_F_CTRL_VQ   | \
 1u << VIRTIO_NET_F_CTRL_RX   | \
 1u << VIRTIO_NET_F_CTRL_VLAN | \
-1u << VIRTIO_NET_F_CSUM  | \
-1u << VIRTIO_NET_F_HOST_TSO4 | \
-1u << VIRTIO_NET_F_HOST_TSO6 | \
 1u << VIRTIO_NET_F_MRG_RXBUF | \
 1u << VIRTIO_NET_F_MTU | \
 1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE |  \
-- 
2.14.3



[dpdk-dev] [PATCH v2 4/5] net/virtio: don't use simple Rx if TCP LRO or VLAN strip requested

2018-06-06 Thread Maxime Coquelin
Signed-off-by: Maxime Coquelin 
---
 drivers/net/virtio/virtio_ethdev.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index b023ec02e..357968fdd 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1963,7 +1963,9 @@ virtio_dev_configure(struct rte_eth_dev *dev)
}
 
if (rx_offloads & (DEV_RX_OFFLOAD_UDP_CKSUM |
-  DEV_RX_OFFLOAD_TCP_CKSUM))
+  DEV_RX_OFFLOAD_TCP_CKSUM |
+  DEV_RX_OFFLOAD_TCP_LRO |
+  DEV_RX_OFFLOAD_VLAN_STRIP))
hw->use_simple_rx = 0;
 
if (tx_offloads & (DEV_TX_OFFLOAD_TCP_CKSUM |
-- 
2.14.3



Re: [dpdk-dev] [RFC] hot plug failure handle mechanism

2018-06-06 Thread Bruce Richardson
+Tech-board as I think that this should have more input at the design stage
ahead of any code patches being pushed.

On Mon, Jun 04, 2018 at 09:56:10AM +0800, Guo, Jia wrote:
> hi,bruce
> 
> 
> On 5/29/2018 7:20 PM, Bruce Richardson wrote:
> > On Thu, May 24, 2018 at 07:55:43AM +0100, Guo, Jia wrote:
> > 
> > > The hot plug failure handle mechanism should be come across as bellow:
> > > 
> > > 1.  Add a new bus ops “handle_hot-unplug”in bus to handle bus
> > > read/write error, it is bus-specific and each
> > > 
> > > kind of bus can implement its own logic.
> > > 
> > > 2.  Implement pci bus specific ops“pci_handle_hot_unplug”, in the
> > > function, base on the
> > > 
> > > failure address to remap memory which belong to the corresponding
> > > device that unplugged.
> > > 
> > > 3.  Implement a new sigbus handler, and register it when start
> > > device event monitoring,
> > > 
> > > once the MMIO sigbus error exposure, it will trigger the above hot 
> > > plug
> > > failure handle mechanism,
> > > 
> > > that will keep app, that working on packet processing, would not be
> > > broken and crash, then could
> > > 
> > > keep going clean, fail-safe or other working task.
> > > 
> > > 4.  Also also will introduce the solution by use testpmd to show
> > > the example of the whole procedure like that:
> > > 
> > > device unplug ->failure handle->stop forwarding->stop port->close
> > > port->detach port.
> > > 
> > Hi Jeff,
> > 
> > so if I understand this correctly the proposal is that we need two parallel
> > solutions to handle safe removal of a device.
> > 
> > 1. We need a solution to support unpluging of the device at the bus level,
> > ie. remove the device from the list of devices and to make access to
> > that device invalid.
> > 2. Since the removal of the device from the software lists is not going to
> > be instantaneous, we need a mechanism to handle any accesses to the
> > device from the data path until such time as the removal is complete. To
> > support that, you propose to add a sigbus handler which will
> > automatically replace any mmio bar mappings with some other memory that 
> > is
> > ok to access - presumable zero memory or similar.
> > 
> > Is this understanding correct?
> 
> i think you are correct about that.
> 
> > Point #2 seems reasonably clear to me, but for #1, presumably the trigger
> > to the bus needs to come from the kernel. What is planned to be used there?
> 
> about point #1, i should clarify here is that, we will use the device event
> monitor mechanism to detect the hot unplug event.
> the monitor be enabled by app(or fail-safe driver), and app(fail-safe
> driver) register the event callback. Once the hot unplug behave be detected,
> the user's callback could be triggered to let app(fail-safe driver) know the
> event and manage the process, it will call APIs to stop the device
> and detach the device from the bus.

Ok. If there is no failsafe driver, and the app does not set up a handler,
does nothing happen when we get a removal event? Will the app just crash?

> 
> > You also talk about using testpmd as a reference for this, but you don't
> > explain how an application can be notified of a device removal, or even why
> > that is necessary. Since all applications should now be using the proper
> > macros when iterating device lists, and not just assuming devices 0-N are
> > valid, what changes would you see a normal app having to make to be
> > hotplug-safe?
> 
> we could use app or fail-safe driver to use these mechanism , but at this
> stage i will firstly use testpmd as a reference.
> as above reply, testpmd should enable device event mechanism to monitor the
> device removal, and register callback,
> the device bdf list is managed by bus and the hoplug fail handler will be
> process by the eal layer, then the app would be hotplug-safe.
> 
> is there anything i miss to clarify? please shout. and i think i will detail
> more later.

This is becoming clearer now, thanks. Just the one question above I have at
this point.
Given how long-running this issue of hotplug is, I'm hoping others on the
technical board can also review this proposal.

Regards,
/Bruce


Re: [dpdk-dev] [RFC] hot plug failure handle mechanism

2018-06-06 Thread Ananyev, Konstantin


> -Original Message-
> From: Richardson, Bruce
> Sent: Wednesday, June 6, 2018 1:55 PM
> To: Guo, Jia ; techbo...@dpdk.org
> Cc: dev@dpdk.org; Ananyev, Konstantin ; 
> step...@networkplumber.org; Yigit, Ferruh
> ; gaetan.ri...@6wind.com; Wu, Jingjing 
> ; tho...@monjalon.net;
> mo...@mellanox.com; ma...@mellanox.com; Van Haaren, Harry 
> ; Zhang, Qi Z
> ; Zhang, Helin ; 
> jblu...@infradead.org; shreyansh.j...@nxp.com
> Subject: Re: [dpdk-dev] [RFC] hot plug failure handle mechanism
> 
> +Tech-board as I think that this should have more input at the design stage
> ahead of any code patches being pushed.
> 
> On Mon, Jun 04, 2018 at 09:56:10AM +0800, Guo, Jia wrote:
> > hi,bruce
> >
> >
> > On 5/29/2018 7:20 PM, Bruce Richardson wrote:
> > > On Thu, May 24, 2018 at 07:55:43AM +0100, Guo, Jia wrote:
> > > 
> > > > The hot plug failure handle mechanism should be come across as 
> > > > bellow:
> > > >
> > > > 1.  Add a new bus ops “handle_hot-unplug”in bus to handle bus
> > > > read/write error, it is bus-specific and each
> > > >
> > > > kind of bus can implement its own logic.
> > > >
> > > > 2.  Implement pci bus specific ops“pci_handle_hot_unplug”, in 
> > > > the
> > > > function, base on the
> > > >
> > > > failure address to remap memory which belong to the corresponding
> > > > device that unplugged.
> > > >
> > > > 3.  Implement a new sigbus handler, and register it when start
> > > > device event monitoring,
> > > >
> > > > once the MMIO sigbus error exposure, it will trigger the above hot 
> > > > plug
> > > > failure handle mechanism,
> > > >
> > > > that will keep app, that working on packet processing, would not be
> > > > broken and crash, then could
> > > >
> > > > keep going clean, fail-safe or other working task.
> > > >
> > > > 4.  Also also will introduce the solution by use testpmd to show
> > > > the example of the whole procedure like that:
> > > >
> > > > device unplug ->failure handle->stop forwarding->stop port->close
> > > > port->detach port.
> > > >
> > > Hi Jeff,
> > >
> > > so if I understand this correctly the proposal is that we need two 
> > > parallel
> > > solutions to handle safe removal of a device.
> > >
> > > 1. We need a solution to support unpluging of the device at the bus level,
> > > ie. remove the device from the list of devices and to make access to
> > > that device invalid.
> > > 2. Since the removal of the device from the software lists is not going to
> > > be instantaneous, we need a mechanism to handle any accesses to the
> > > device from the data path until such time as the removal is complete. 
> > > To
> > > support that, you propose to add a sigbus handler which will
> > > automatically replace any mmio bar mappings with some other memory 
> > > that is
> > > ok to access - presumable zero memory or similar.
> > >
> > > Is this understanding correct?
> >
> > i think you are correct about that.
> >
> > > Point #2 seems reasonably clear to me, but for #1, presumably the trigger
> > > to the bus needs to come from the kernel. What is planned to be used 
> > > there?
> >
> > about point #1, i should clarify here is that, we will use the device event
> > monitor mechanism to detect the hot unplug event.
> > the monitor be enabled by app(or fail-safe driver), and app(fail-safe
> > driver) register the event callback. Once the hot unplug behave be detected,
> > the user's callback could be triggered to let app(fail-safe driver) know the
> > event and manage the process, it will call APIs to stop the device
> > and detach the device from the bus.
> 
> Ok. If there is no failsafe driver, and the app does not set up a handler,
> does nothing happen when we get a removal event? Will the app just crash?
> 
> >
> > > You also talk about using testpmd as a reference for this, but you don't
> > > explain how an application can be notified of a device removal, or even 
> > > why
> > > that is necessary. Since all applications should now be using the proper
> > > macros when iterating device lists, and not just assuming devices 0-N are
> > > valid, what changes would you see a normal app having to make to be
> > > hotplug-safe?
> >
> > we could use app or fail-safe driver to use these mechanism , but at this
> > stage i will firstly use testpmd as a reference.
> > as above reply, testpmd should enable device event mechanism to monitor the
> > device removal, and register callback,
> > the device bdf list is managed by bus and the hoplug fail handler will be
> > process by the eal layer, then the app would be hotplug-safe.
> >
> > is there anything i miss to clarify? please shout. and i think i will detail
> > more later.
> 
> This is becoming clearer now, thanks. Just the one question above I have at
> this point.
> Given how long-running this issue of hotplug is, I'm hoping others on the
> technical board can also review this proposal.

I l

Re: [dpdk-dev] [PATCH] doc: add template release notes for 18.08

2018-06-06 Thread Ferruh Yigit
On 6/6/2018 7:19 AM, Shreyansh Jain wrote:
> On 6/5/2018 6:58 PM, Ferruh Yigit wrote:
>> On 6/5/2018 1:36 PM, Shreyansh Jain wrote:
>>> Hello Ferruh,
>>>
 -Original Message-
 From: Ferruh Yigit [mailto:ferruh.yi...@intel.com]
 Sent: Tuesday, June 5, 2018 2:34 PM
 To: Thomas Monjalon ; dev@dpdk.org
 Cc: john.mcnam...@intel.com; marko.kovace...@intel.com; Hemant Agrawal
 ; Shreyansh Jain 
 Subject: Re: [dpdk-dev] [PATCH] doc: add template release notes for
 18.08

 On 5/31/2018 10:11 PM, Thomas Monjalon wrote:
> Add template release notes for DPDK 18.08 with inline
> comments and explanations of the various sections.
>
> Signed-off-by: Thomas Monjalon 

 <...>

> +Shared Library Versions
> +---
> +
> +.. Update any library version updated in this release
> +   and prepend with a ``+`` sign, like this:
> +
> + librte_acl.so.2
> +   + librte_cfgfile.so.2
> + librte_cmdline.so.2
> +
> +   This section is a comment. Do not overwrite or remove it.
> +   =
> +
> +The libraries prepended with a plus sign were incremented in this
 version.
> +
> +.. code-block:: diff
> +
 <..>
> + librte_bus_dpaa.so.1
> + librte_bus_fslmc.so.1
> + librte_bus_pci.so.1
> + librte_bus_vdev.so.1
 <..>
> + librte_common_octeontx.so.1
 <..>
> + librte_pmd_dpaa2_cmdif.so.1
>>>
>>> cmdif doesn't expose any APIs for the user. It works through the librawdev 
>>> APIs.
>>>
>>> drivers/raw/dpaa2_cmdif/± cat rte_pmd_dpaa2_cmdif_version.map
>>> DPDK_18.05 {
>>>
>>> local: *;
>>> };
>>>
> + librte_pmd_dpaa2_qdma.so.1
>>>
>>> QDMA driver indeed exposes various APIs for the user/application.
>>>

 Is above libraries provide any API for user application?

 It looks like they are for other internal libraries, if so should we
 document
 them here in release notes?
>>>
>>> You are suggesting that we add a note stating a particular library has no 
>>> exposed APIs? I can't guess what use that information would be for a 
>>> release note reader.
>>> That might be a note for the relevant documentation rst file, though.
>>
>> I suggest removing the library from release notes if it is only for internal
>> interface. Why user should concerned about version updates if the library is
>> internal only?
> 
> Are you referring to removing from 18.05 release note (published) as 
> well? Or, you are only referring to removing from template for 18.08?

>From v18.08 an further, as a cleanup.

> 
> In principle, I have no objections in removing lib_pmd_dpaa2_cmdif from 
> an exposed list of libraries as no application is going to use it directly.
> 
> -
> Shreyansh
> 



Re: [dpdk-dev] [PATCH] examples/ipsec-secgw: fix incorrect IPv4 checksum at TX

2018-06-06 Thread Radu Nicolau



On 6/6/2018 1:04 PM, Konstantin Ananyev wrote:

For ESP transport and BYPASS mode the app might generate output
packets with invalid IPv4 header checksum.
At least such behavior was observed on few Intel NICs.
The reason is that the app didn't set ipv4 header checksum to zero
before passing it to the HW.

Fixes: 906257e965b7 ("examples/ipsec-secgw: support IPv6")

Signed-off-by: Konstantin Ananyev 
---

Acked-by: Radu Nicolau  



[dpdk-dev] [PATCH] examples: make Linux environment check consistent

2018-06-06 Thread Thomas Monjalon
Some Makefiles are using CONFIG_RTE_EXEC_ENV and others
are using CONFIG_RTE_EXEC_ENV_LINUXAPP.
Use the latter one for consistency.
We could remove CONFIG_RTE_EXEC_ENV later if considered useless.

Signed-off-by: Thomas Monjalon 
---
 examples/ethtool/Makefile  | 2 +-
 examples/ethtool/lib/Makefile  | 2 +-
 examples/ip_pipeline/Makefile  | 2 +-
 examples/kni/Makefile  | 2 +-
 examples/l3fwd-power/Makefile  | 2 +-
 examples/multi_process/client_server_mp/mp_server/Makefile | 2 +-
 examples/netmap_compat/bridge/Makefile | 2 +-
 examples/qos_sched/Makefile| 2 +-
 examples/server_node_efd/server/Makefile   | 2 +-
 examples/tep_termination/Makefile  | 2 +-
 examples/vhost/Makefile| 2 +-
 examples/vhost_crypto/Makefile | 2 +-
 examples/vhost_scsi/Makefile   | 2 +-
 13 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/examples/ethtool/Makefile b/examples/ethtool/Makefile
index 2b40b4b61..3d9d4f06e 100644
--- a/examples/ethtool/Makefile
+++ b/examples/ethtool/Makefile
@@ -10,7 +10,7 @@ RTE_TARGET ?= x86_64-native-linuxapp-gcc
 
 include $(RTE_SDK)/mk/rte.vars.mk
 
-ifneq ($(CONFIG_RTE_EXEC_ENV),"linuxapp")
+ifneq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 $(info This application can only operate in a linuxapp environment, \
 please change the definition of the RTE_TARGET environment variable)
 else
diff --git a/examples/ethtool/lib/Makefile b/examples/ethtool/lib/Makefile
index 2576910f8..6eaa640bc 100644
--- a/examples/ethtool/lib/Makefile
+++ b/examples/ethtool/lib/Makefile
@@ -10,7 +10,7 @@ RTE_TARGET ?= x86_64-native-linuxapp-gcc
 
 include $(RTE_SDK)/mk/rte.vars.mk
 
-ifneq ($(CONFIG_RTE_EXEC_ENV),"linuxapp")
+ifneq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 $(error This application can only operate in a linuxapp environment, \
 please change the definition of the RTE_TARGET environment variable)
 endif
diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
index 11d2b35da..3fb98ce3e 100644
--- a/examples/ip_pipeline/Makefile
+++ b/examples/ip_pipeline/Makefile
@@ -67,7 +67,7 @@ RTE_TARGET ?= x86_64-native-linuxapp-gcc
 
 include $(RTE_SDK)/mk/rte.vars.mk
 
-ifneq ($(CONFIG_RTE_EXEC_ENV),"linuxapp")
+ifneq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 $(info This application can only operate in a linuxapp environment, \
 please change the definition of the RTE_TARGET environment variable)
 all:
diff --git a/examples/kni/Makefile b/examples/kni/Makefile
index 562dc2741..7e19d2e2a 100644
--- a/examples/kni/Makefile
+++ b/examples/kni/Makefile
@@ -48,7 +48,7 @@ RTE_TARGET ?= x86_64-native-linuxapp-gcc
 
 include $(RTE_SDK)/mk/rte.vars.mk
 
-ifneq ($(CONFIG_RTE_EXEC_ENV),"linuxapp")
+ifneq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 $(error This application can only operate in a linuxapp environment, \
 please change the definition of the RTE_TARGET environment variable)
 endif
diff --git a/examples/l3fwd-power/Makefile b/examples/l3fwd-power/Makefile
index 390b7d6b6..f2fe7463b 100644
--- a/examples/l3fwd-power/Makefile
+++ b/examples/l3fwd-power/Makefile
@@ -48,7 +48,7 @@ RTE_TARGET ?= x86_64-native-linuxapp-gcc
 
 include $(RTE_SDK)/mk/rte.vars.mk
 
-ifneq ($(CONFIG_RTE_EXEC_ENV),"linuxapp")
+ifneq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 $(info This application can only operate in a linuxapp environment, \
 please change the definition of the RTE_TARGET environment variable)
 all:
diff --git a/examples/multi_process/client_server_mp/mp_server/Makefile 
b/examples/multi_process/client_server_mp/mp_server/Makefile
index af7246e6b..9c1caae79 100644
--- a/examples/multi_process/client_server_mp/mp_server/Makefile
+++ b/examples/multi_process/client_server_mp/mp_server/Makefile
@@ -10,7 +10,7 @@ RTE_TARGET ?= x86_64-native-linuxapp-gcc
 
 include $(RTE_SDK)/mk/rte.vars.mk
 
-ifneq ($(CONFIG_RTE_EXEC_ENV),"linuxapp")
+ifneq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 $(error This application can only operate in a linuxapp environment, \
 please change the definition of the RTE_TARGET environment variable)
 endif
diff --git a/examples/netmap_compat/bridge/Makefile 
b/examples/netmap_compat/bridge/Makefile
index a7c9c14a8..856c847bd 100644
--- a/examples/netmap_compat/bridge/Makefile
+++ b/examples/netmap_compat/bridge/Makefile
@@ -10,7 +10,7 @@ RTE_TARGET ?= x86_64-native-linuxapp-gcc
 
 include $(RTE_SDK)/mk/rte.vars.mk
 
-ifneq ($(CONFIG_RTE_EXEC_ENV),"linuxapp")
+ifneq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 $(info This application can only operate in a linuxapp environment, \
 please change the definition of the RTE_TARGET environment variable)
 all:
diff --git a/examples/qos_sched/Makefile b/examples/qos_sched/Makefile
index 0f0a31ff2..a7ecf9788 100644
--- a/examples/qos_sched/Makefile
+++ b

Re: [dpdk-dev] [PATCH] net/bonding: don't clear active slave count

2018-06-06 Thread Matan Azrad
Hi Chas

From: Chas Williams
> From: "Charles (Chas) Williams" 
> 
> When the bond PMD is stopped, the active slave count is reset.
> For 802.3ad mode this potentially leaks memory and clears state since a second
> sequential activate_slave() will occur when the bond PMD is restarted and the
> LSC callback is triggered while the active slave count is 0. To fix this, 
> don't clear
> the active slave count when stopping. Only deactivate_slave() should be used 
> to
> clear the slaves.
> 

Looks like it is a fix, so need fix title, CC stable and fixes line, no?

> Signed-off-by: Chas Williams 
> ---
>  drivers/net/bonding/rte_eth_bond_pmd.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c
> b/drivers/net/bonding/rte_eth_bond_pmd.c
> index 02d94b1b1..4ae577078 100644
> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
> @@ -2173,7 +2173,6 @@ bond_ethdev_stop(struct rte_eth_dev *eth_dev)
>   tlb_last_obytets[internals->active_slaves[i]] = 0;
>   }
> 
> - internals->active_slave_count = 0;

But why not to call deactivate_slave() for all the active slaves? 

>   internals->link_status_polling_enabled = 0;
>   for (i = 0; i < internals->slave_count; i++)
>   internals->slaves[i].last_link_status = 0;
> --
> 2.14.3




Re: [dpdk-dev] [dpdk-ci] maintenance on DPDK patchwork today

2018-06-06 Thread Thomas Monjalon
06/06/2018 08:56, Thomas Monjalon:
> Patchwork will stop working during approximately one hour
> or hopefully less, today June 6th.
> It will be upgraded to patchwork 2.
> 
> Sorry for the inconvenience.

The upgrade to patchwork 2.0.2 is now done:
https://dpdk.org/dev/patchwork/about

There is a preliminary series support
and there is a new REST API.

If you see any issue, do not hesitate to send an email
or contact me on IRC.

Thanks




Re: [dpdk-dev] [PATCH] net/bonding: don't clear active slave count

2018-06-06 Thread Chas Williams
On Wed, Jun 6, 2018 at 9:57 AM, Matan Azrad  wrote:

> Hi Chas
>
> From: Chas Williams
> > From: "Charles (Chas) Williams" 
> >
> > When the bond PMD is stopped, the active slave count is reset.
> > For 802.3ad mode this potentially leaks memory and clears state since a
> second
> > sequential activate_slave() will occur when the bond PMD is restarted
> and the
> > LSC callback is triggered while the active slave count is 0. To fix
> this, don't clear
> > the active slave count when stopping. Only deactivate_slave() should be
> used to
> > clear the slaves.
> >
>
> Looks like it is a fix, so need fix title, CC stable and fixes line, no?
>

Yes, I forgot to add those.  Let's call it an RFC for now.  It's not
completely clear to
me what is the right thing to do here.  The activate state of slaves is
really dependent
on the link status of the slaves.  Does a PMD have a link status outside of
a start/stop
sequence?  Or, since we clear the last_status on stop, the start will just
get all the
activate/deactivate states corrected.  Since we might stop/start for a
short interval,
it seems like we might not want to break what 802.3ad has negotiated?


>
> > Signed-off-by: Chas Williams 
> > ---
> >  drivers/net/bonding/rte_eth_bond_pmd.c | 1 -
> >  1 file changed, 1 deletion(-)
> >
> > diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c
> > b/drivers/net/bonding/rte_eth_bond_pmd.c
> > index 02d94b1b1..4ae577078 100644
> > --- a/drivers/net/bonding/rte_eth_bond_pmd.c
> > +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
> > @@ -2173,7 +2173,6 @@ bond_ethdev_stop(struct rte_eth_dev *eth_dev)
> >   tlb_last_obytets[internals->active_slaves[i]] = 0;
> >   }
> >
> > - internals->active_slave_count = 0;
>
> But why not to call deactivate_slave() for all the active slaves?
>
> >   internals->link_status_polling_enabled = 0;
> >   for (i = 0; i < internals->slave_count; i++)
> >   internals->slaves[i].last_link_status = 0;
> > --
> > 2.14.3
>
>
>


Re: [dpdk-dev] [dpdk-ci] maintenance on DPDK patchwork today

2018-06-06 Thread Mcnamara, John
> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Wednesday, June 6, 2018 3:03 PM
> To: dev@dpdk.org
> Cc: c...@dpdk.org; dpdk...@iol.unh.edu; Patrick MacArthur
> ; alia...@mellanox.com
> Subject: Re: [dpdk-dev] [dpdk-ci] maintenance on DPDK patchwork today
> 
> 06/06/2018 08:56, Thomas Monjalon:
> > Patchwork will stop working during approximately one hour or hopefully
> > less, today June 6th.
> > It will be upgraded to patchwork 2.
> >
> > Sorry for the inconvenience.
> 
> The upgrade to patchwork 2.0.2 is now done:

\o/ Well done.

John



Re: [dpdk-dev] [PATCH] net/bonding: add add/remove mac addrs

2018-06-06 Thread Chas Williams
On Wed, Jun 6, 2018 at 7:52 AM, Alex Kiselev  wrote:

> Hi Chas.
>
> Thank you for the review.
>
> >This is fine but you are missing one bit.  If a slave is added after you
> can configured the MAC
> >addresses, you will need to replay the configured MAC addresses into that
> slave.
>
> Yeah, I've missed that part about adding a slave when a bond is already
> working.
>
> It looks like that mac_address_slaves_update() is the right place
> to configure additional MAC addresses when a slave is being added.
> But there is one moment that's very unclear to me.
> could you please take a look at my reasoning about it?
>
> mac_address_slaves_update() behaves very straightforward when
> the bond mode isn't BONDING_MODE_8023AD, it just sets MAC address,
> so I guess adding rte_eth_dev_mac_addr_add() call would be enought.
>

Yes, that seems correct.  Don't forget the remove if a slave is removed.


> But when mode is BONDING_MODE_8023AD, mac_address_slaves_update() calls
> bond_mode_8023ad_mac_address_update(). first, it doesn't use
> rte_eth_dev_default_mac_addr_set() and second, in BONDING_MODE_8023AD
> mode promiscuous mode is enabled for every slave.
>

Sorry this part is a bit hazy to me.  I need to review the specification to
see what's
going on here exactly and why.  Our local implementation does something a
little
different here.


> Those are two moments that are unclear to me.
> I would say that there is no need to do any MAC manipulations
> (either rte_eth_dev_default_mac_addr_set() or rte_eth_dev_mac_addr_add())
> when mode is BONDING_MODE_8023AD since promiscuous mode is enabled. Am I
> right?
>

One reason would be that the LAC PDUs arrive on the ETYPE_TYPE_SLOW
multicast
group and so we need a way to receive those packets.  Since adding a
multicast
group address probably didn't exist when this was written, promiscuous was
used.
It would also be the fallback if the PMD didn't support adding MAC
addresses.

If we wanted to switch away from using promiscuous, bonding would need to
use the
same interface but offset those application MAC addresses away from the MAC
addresses
that bonding would need to add for basic operation.  Doing this seems a bit
tricky and
unnecessary at this point.

So just ignore 802.3ad for now.  Just comment about it in the code in
..._slaves_update().
"802.3ad already uses promiscuous, so nothing to do here."


> > This is fine but you are missing one bit.  If a slave is added after you
> can configured the MAC
> > addresses, you will need to replay the configured MAC addresses into
> that slave.
> >
> > It's not clear to me what the contract is between bonding and DPDK as
> far as slave
> > removal.  Just looking at the behavior with the default MAC address, I
> would guess
> > that bonding also needs to remove extra MAC addresses it added during
> slave
> > remove.  Bonding is a PMD but also an application.
>
> On Fri, May 25, 2018 at 12:21 PM, Alex Kiselev  wrote:
>
> add functions to add/remove MAC addresses
>
> Signed-off-by: Alex Kiselev 
> ---
>  drivers/net/bonding/rte_eth_bond_pmd.c | 76
> +++---
>  1 file changed, 71 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c
> b/drivers/net/bonding/rte_eth_bond_pmd.c
> index 02d94b1..835b4e3 100644
> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
> @@ -25,6 +25,7 @@
>
>  #define REORDER_PERIOD_MS 10
>  #define DEFAULT_POLLING_INTERVAL_10_MS (10)
> +#define BOND_MAX_MAC_ADDRS 16
>
>  #define HASH_L4_PORTS(h) ((h)->src_port ^ (h)->dst_port)
>
> @@ -2219,7 +2220,7 @@ bond_ethdev_info(struct rte_eth_dev *dev, struct
> rte_eth_dev_info *dev_info)
> uint16_t max_nb_rx_queues = UINT16_MAX;
> uint16_t max_nb_tx_queues = UINT16_MAX;
>
> -   dev_info->max_mac_addrs = 1;
> +   dev_info->max_mac_addrs = BOND_MAX_MAC_ADDRS;
>
> dev_info->max_rx_pktlen = internals->candidate_max_rx_pktlen ?
> internals->candidate_max_rx_pktlen :
> @@ -2905,6 +2906,65 @@ bond_filter_ctrl(struct rte_eth_dev *dev
> __rte_unused,
> return -ENOTSUP;
>  }
>
> +static int
> +bond_ethdev_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr
> *mac_addr,
> +   __rte_unused uint32_t index, uint32_t vmdq)
> +{
> +   struct rte_eth_dev *slave_eth_dev;
> +   struct bond_dev_private *internals = dev->data->dev_private;
> +   int ret, i;
> +
> +   rte_spinlock_lock(&internals->lock);
> +
> +   for (i = 0; i < internals->slave_count; i++) {
> +   slave_eth_dev = &rte_eth_devices[internals->
> slaves[i].port_id];
> +   if (*slave_eth_dev->dev_ops->mac_addr_add == NULL) {
> +   ret = -ENOTSUP;
> +   goto end;
> +   }
> +   }
> +
> +   for (i = 0; i < internals->slave_count; i++) {
> +   ret = rte_eth_dev_mac_addr_add(internals->slaves[i].port_id,
> mac_addr,
> +  

Re: [dpdk-dev] [dpdk-ci] maintenance on DPDK patchwork today

2018-06-06 Thread Mcnamara, John
> -Original Message-
> From: ci [mailto:ci-boun...@dpdk.org]
> Sent: Wednesday, June 6, 2018 4:22 PM
> To: Thomas Monjalon ; dev@dpdk.org
> Cc: c...@dpdk.org; dpdk...@iol.unh.edu; Patrick MacArthur
> ; alia...@mellanox.com
> Subject: Re: [dpdk-ci] [dpdk-dev] maintenance on DPDK patchwork today
> 
> > -Original Message-
> > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Thomas Monjalon
> > Sent: Wednesday, June 6, 2018 3:03 PM
> > To: dev@dpdk.org
> > Cc: c...@dpdk.org; dpdk...@iol.unh.edu; Patrick MacArthur
> > ; alia...@mellanox.com
> > Subject: Re: [dpdk-dev] [dpdk-ci] maintenance on DPDK patchwork today
> >
> > 06/06/2018 08:56, Thomas Monjalon:
> > > Patchwork will stop working during approximately one hour or
> > > hopefully less, today June 6th.
> > > It will be upgraded to patchwork 2.
> > >
> > > Sorry for the inconvenience.
> >
> > The upgrade to patchwork 2.0.2 is now done:
> 
> \o/ Well done.
> 
> John


Hi,

 o
/ \

Arms back down. I tried the new git-pw interface to test out series support but 
it doesn't seem to list any of the new patches before yesterday:

https://git-pw.readthedocs.io/en/latest/

pip install git-pw
git config pw.server   http://dpdk.org/dev/patchwork/
git config pw.project  dpdk
git config pw.username johnmcnamara
git config pw.password password

Via the new REST API:

git-pw patch list --state new | wc -l
34

Vis the older non-REST (?) API:

pwclient search -s new | wc -l
409

Maybe that is just a feature of porting to the new version.

Also, applying a patch doesn't seem to work:

git-pw --debug patch apply 40648

The issue seems to be the REST query setting the series to %2A (*):

   http://dpdk.org/dev/patchwork/patch/40648/mbox/?series=%2A

Setting it to '' in the code works. But that is probably an issue for the 
git-pw devs (stephen).

John







Re: [dpdk-dev] [PATCH] net/mlx5: fix error number handling

2018-06-06 Thread Yongseok Koh
On Wed, Jun 06, 2018 at 08:55:01AM +0200, Nélio Laranjeiro wrote:
> On Tue, Jun 05, 2018 at 09:36:32PM +, Yongseok Koh wrote:
> > > On Jun 4, 2018, at 11:52 PM, Nélio Laranjeiro 
> > >  wrote:
> > > 
> > > On Mon, Jun 04, 2018 at 10:37:31AM -0700, Yongseok Koh wrote:
> > >> rte_errno should be saved only if error has occurred because rte_errno
> > >> could have garbage value.
> > >> 
> > >> Fixes: a6d83b6a9209 ("net/mlx5: standardize on negative errno values")
> > >> Cc: sta...@dpdk.org
> > >> 
> > >> Signed-off-by: Yongseok Koh 
> > >> ---
> > >> drivers/net/mlx5/mlx5_flow.c | 3 ++-
> > >> 1 file changed, 2 insertions(+), 1 deletion(-)
> > >> 
> > >> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> > >> index 994be05be..eaffe7495 100644
> > >> --- a/drivers/net/mlx5/mlx5_flow.c
> > >> +++ b/drivers/net/mlx5/mlx5_flow.c
> > >> @@ -3561,7 +3561,8 @@ mlx5_fdir_filter_delete(struct rte_eth_dev *dev,
> > >>  /* The flow does not match. */
> > >>  continue;
> > >>  }
> > >> -ret = rte_errno; /* Save rte_errno before cleanup. */
> > >> +if (ret)
> > >> +ret = rte_errno; /* Save rte_errno before cleanup. */
> > >>  if (flow)
> > >>  mlx5_flow_list_destroy(dev, &priv->flows, flow);
> > >> exit:
> > >> -- 
> > >> 2.11.0
> > > 
> > > This patch is not enough, the returned value being -rte_errno if no
> > > error is detected by the function it cannot set rte_errno nor return it.
> > 
> > We may need to refactor this kind of code (saving and restoring rte_errno). 
> > I
> > still don't understand why we should preserve rte_errno like this.
> >
> > Even if this function returns success, there's no obligation to preserve
> > rte_errno in the function. Once it is called, the ownership of rte_errno 
> > belongs
> > to this function.
> >
> > I can't find how we define this per-lcore variable but, from
> > the man page of errno,
> > 
> >Theheader  file  defines  the integer variable errno, 
> > which
> >is set by system calls and some library functions in the event of an
> >error to indicate what went wrong.  Its value is significant only 
> > when
> >the return value of the call indicated an error (i.e., -1 from most
> >system calls; -1 or NULL from most library  functions);
> >a function that succeeds is allowed to change errno.
> >
> > So, I still think an API can change rte_errno even if it succeeds, no need 
> > to
> > preserve it. If needed, the caller has to save it.
> 
> Functions in this PMD are defined as is:
> 
>   * @return
>   *   0 on success, a negative errno value otherwise and rte_errno is set.
> 
> Which means rte_errno is only modified in case of error.
> 
> This fix does not respect the documentation of the function or any other
> function of the PMD which can return errors.

That's logically a wrong interpretation. According to the description, if
returning error, rte_errno is set but the opposite isn't always true. Even if
rte_errno is set, it doesn't mean there's an error. So the description coincides
with that of errno. If you want to enforce preserving rte_errno in case of
success, you should amend the documentation.

> rte_errno is only set if an error is encountered and contains only the error
> code of the first error sub-sequent ones are considered consequences of the
> first one and thus not preserved.
> 
> Not preserving the rte_errno in roll backs is equivalent to not setting
> it at all as a function called by the rollback may also set it, example:
> 
>  {
> void * a;
> 
> foo_do();
> a  = malloc(10);
> if (!a) {
>   rte_errno = ENOMEM;
>   foo_undo();

This example is weird. You can simply set rte_errno after foo_undo() in this
case.

>   return -rte_errno;
> }
>  }
> 
> If foo_undo() also encounter an error it will modify the rte_errno which
> may have a value different from ENOMEM, for the callee won't be informed
> the error is due to a memory issue and thus cannot make counter parts.
> In such situation the rte_errno must be preserved to keep the ENOMEM
> error code.

I knew it. That's why rte_errno is saved before calling another API which may
change the rte_errno inside. But, we are talking about a case where an API
returns success. If caller is supposed to save rte_errno (when it's needed), why
does callee have to put some effort to preserve it even in case of success? If
rte_errno must be preserved even in case of success, we have to make a big
change to preserve rte_errno for cases where a void function is called (or cases
where we don't check its return value of non-void function).

> This is also the main reason almost all system function only update
> errno when no error is encountered.

'Almost' doesn't mean 'all", does it? It is true that such functions must update
errno when it returns error but it doesn't care about the value when it returns
success. Like the man page I attached above, the errno is significa

Re: [dpdk-dev] [PATCH v2] net/qede: fix L2-handles used for RSS hash update

2018-06-06 Thread Ferruh Yigit
On 6/6/2018 12:03 AM, Rasesh Mody wrote:
> Fix fast path array index which is used for passing L2 handles to RSS
> indirection table, properly distribute rxq handles for indirection table.
> Currently, it is using the local copy of indirection table. When the RX
> queue configuration changes the local copy becomes invalid.
> 
> Fixes: 69d7ba88f1a1 ("net/qede/base: use L2-handles for RSS configuration")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Rasesh Mody 
> Reviewed-by: Kevin Traynor 

Applied to dpdk-next-net/master, thanks.


Re: [dpdk-dev] [PATCH v2 00/26] net/ena: new features and fixes

2018-06-06 Thread Ferruh Yigit
On 6/6/2018 12:46 PM, Michał Krawczyk wrote:
> Hi Ferruh,
> 
> 2018-06-05 18:42 GMT+02:00 Ferruh Yigit :
>>
>> On 6/4/2018 1:09 PM, Michal Krawczyk wrote:
>>> The ENA driver was updated with the new features and few fixes and minor
>>> changes are introduced.
>>> First of all, the communication layer which is delivered by vendor was
>>> updated - the version in the HEAD is a bit outdated now. ENA is able to
>>> communicate with the driver through Admin queue by using admin interrupts
>>> instead of polling.
>>> Admin interrupts are also used for handling AENQ events, which are used for
>>> the following new features:
>>>   - LSC handlers
>>>   - watchdog and device rest
>>>   - monitoring the admin queue
>>>   - handling ENA notifications (getting hints from device)
>>> For the watchdog and admin queue monitoring, the timers had to be used, so
>>> the makefile was modified to do not cut out the librte_timer.
>>>
>>> From other fixes and changes:
>>>   - legacy LLQ was removed which is now deprecated API
>>>   - Rx out of order completion was added to enable cleaning up packets out
>>> of order
>>>   - Tx mbufs are now linearized if they exceed supported number of segments
>>>   - pass information about maximum number of Tx and Rx descriptors
>>>   - the IO queue number is now taking into consideration maximum number of
>>> sq and cq
>>>   - Tx id requested for sending is now being validated and the reset is
>>> being triggered if it is invalid
>>>   - branch predictioning was added for better performance
>>>   - error checking and returned values were fixed
>>>   - macros for allocating memory in communication layer were fixed
>>>   - information about numa mode is now being passed to the NIC
>>>
>>> ---
>>> v2:
>>> * Rebased on top of dpdk-next-net
>>> * Added link speed patch
>>> * Added fix when allocating coherent memory in the PMD
>>
>> Hi Michał,
>>
>> I am getting build error for ICC [1] and shared library [2], can you please 
>> check?
>>
>>
>> [1]
>> .../drivers/net/ena/base/ena_com.c(323): error #592: variable "flags" is used
>> before its value is set
>> ENA_SPINLOCK_LOCK(admin_queue->q_lock, flags);
>> ^
>> .../drivers/net/ena/base/ena_com.c(534): error #3656: variable "flags" may be
>> used before its value is set
>>   ENA_SPINLOCK_LOCK(admin_queue->q_lock, flags);
>>   ^
>> .../drivers/net/ena/base/ena_com.c(589): error #592: variable "flags" is used
>> before its value is set
>> ENA_SPINLOCK_LOCK(admin_queue->q_lock, flags);
>> ^
>> .../drivers/net/ena/base/ena_com.c(634): error #592: variable "flags" is used
>> before its value is set
>> ENA_SPINLOCK_LOCK(mmio_read->lock, flags);
>> ^
>> .../drivers/net/ena/base/ena_com.c(1297): error #592: variable "flags" is 
>> used
>> before its value is set
>> ENA_SPINLOCK_LOCK(admin_queue->q_lock, flags);
>> ^
>> .../drivers/net/ena/base/ena_com.c(1341): error #592: variable "flags" is 
>> used
>> before its value is set
>> ENA_SPINLOCK_LOCK(admin_queue->q_lock, flags);
>> ^
>>
> 
> That should be easy to fix - I prepared the patch but I want to send
> it with second fix together.
> 
>>
>>
>> [2]
>> ena_ethdev.o: In function `eth_ena_pci_probe':
>> .../drivers/net/ena/ena_ethdev.c:(.text+0x6de): undefined reference to
>> `rte_timer_subsystem_init'
>> .../drivers/net/ena/ena_ethdev.c:(.text+0x6ea): undefined reference to
>> `rte_timer_init'
>> ena_ethdev.o: In function `eth_ena_pci_remove':
>> .../drivers/net/ena/ena_ethdev.c:(.text+0x854): undefined reference to
>> `rte_timer_stop_sync'
>> ena_ethdev.o: In function `ena_start':
>> .../drivers/net/ena/ena_ethdev.c:(.text+0x1ff6): undefined reference to
>> `rte_timer_reset'
>> ena_ethdev.o: In function `ena_stop':
>> .../drivers/net/ena/ena_ethdev.c:(.text+0x21a1): undefined reference to
>> `rte_timer_stop_sync'
>> ena_ethdev.o: In function `ena_close':
>> .../drivers/net/ena/ena_ethdev.c:(.text+0x21d8): undefined reference to
>> `rte_timer_stop_sync'
>>
> 
> Strange that this happens, it should be fixed in the patch:
>   mk: link librte_timer with --whole-archive
> which seems to be no longer needed, because similar commit is already
> applied (eb54ef42b02f94c4093556fdd6b51e2d7fd0df47). That should
> resolve the issue with linking (at least for gcc it is helping) by
> linking timer library with --whole-archive.
> Could you, please, send me build logs so I could look closer if it is
> properly built with ICC as I don't have access to it?

Note "--whole-archive" is related to the static library build, but the above
error is related to shared build, via "CONFIG_RTE_BUILD_SHARED_LIB=y" option.

Adding librte_timer as a dependent library should fix the issue, like:
 diff --git a/drivers/net/ena/Makefile b/drivers/net/ena/Makefile
 index 43339f3b9..ff9ce315b 100644
 --- a/drivers/net/ena/Makefile
 +++ b/drivers/net/ena/Makefile
 @@ -58,5 +58,6 @@ CFLAGS += $(INCLUDES)
  

[dpdk-dev] [PATCH v3] doc/event: improve eventdev library documentation

2018-06-06 Thread Honnappa Nagarahalli
Add small amount of additional code, use consistent variable names
across code blocks, change the image to represent queues and
CPU cores intuitively. These help improve the eventdev library
documentation.

Signed-off-by: Honnappa Nagarahalli 
Reviewed-by: Gavin Hu 
Acked-by: Jerin Jacob 
Acked-by: Harry van Haaren 
---
v3:
* Added Acked-by info

v2:
* Added copyright
* Added reference for RX and TX cores
* Changed worker_port_id1 to worker_port_id 

 doc/guides/prog_guide/eventdev.rst   |   57 +-
 doc/guides/prog_guide/img/eventdev_usage.svg | 1519 +-
 2 files changed, 574 insertions(+), 1002 deletions(-)

diff --git a/doc/guides/prog_guide/eventdev.rst 
b/doc/guides/prog_guide/eventdev.rst
index ce19997..8fcae54 100644
--- a/doc/guides/prog_guide/eventdev.rst
+++ b/doc/guides/prog_guide/eventdev.rst
@@ -1,5 +1,6 @@
 ..  SPDX-License-Identifier: BSD-3-Clause
 Copyright(c) 2017 Intel Corporation.
+Copyright(c) 2018 Arm Limited.
 
 Event Device Library
 
@@ -129,8 +130,10 @@ API Walk-through
 
 This section will introduce the reader to the eventdev API, showing how to
 create and configure an eventdev and use it for a two-stage atomic pipeline
-with a single core for TX. The diagram below shows the final state of the
-application after this walk-through:
+with one core each for RX and TX. RX and TX cores are shown here for
+illustration, refer to Eventdev Adapter documentation for further details.
+The diagram below shows the final state of the application after this
+walk-through:
 
 .. _figure_eventdev-usage1:
 
@@ -196,23 +199,29 @@ calling the setup function. Repeat this step for each 
queue, starting from
 .nb_atomic_flows = 1024,
 .nb_atomic_order_sequences = 1024,
 };
+struct rte_event_queue_conf single_link_conf = {
+.event_queue_cfg = RTE_EVENT_QUEUE_CFG_SINGLE_LINK,
+};
 int dev_id = 0;
-int queue_id = 0;
-int err = rte_event_queue_setup(dev_id, queue_id, &atomic_conf);
+int atomic_q_1 = 0;
+int atomic_q_2 = 1;
+int single_link_q = 2;
+int err = rte_event_queue_setup(dev_id, atomic_q_1, &atomic_conf);
+int err = rte_event_queue_setup(dev_id, atomic_q_2, &atomic_conf);
+int err = rte_event_queue_setup(dev_id, single_link_q, 
&single_link_conf);
 
-The remainder of this walk-through assumes that the queues are configured as
-follows:
+As shown above, queue IDs are as follows:
 
  * id 0, atomic queue #1
  * id 1, atomic queue #2
  * id 2, single-link queue
 
+These queues are used for the remainder of this walk-through.
+
 Setting up Ports
 
 
-Once queues are set up successfully, create the ports as required. Each port
-should be set up with its corresponding port_conf type, worker for worker 
cores,
-rx and tx for the RX and TX cores:
+Once queues are set up successfully, create the ports as required.
 
 .. code-block:: c
 
@@ -232,15 +241,24 @@ rx and tx for the RX and TX cores:
 .new_event_threshold = 4096,
 };
 int dev_id = 0;
-int port_id = 0;
-int err = rte_event_port_setup(dev_id, port_id, &CORE_FUNCTION_conf);
+int rx_port_id = 0;
+int err = rte_event_port_setup(dev_id, rx_port_id, &rx_conf);
+
+for(int worker_port_id = 1; worker_port_id <= 4; worker_port_id++) {
+   int err = rte_event_port_setup(dev_id, worker_port_id, 
&worker_conf);
+}
 
-It is now assumed that:
+int tx_port_id = 5;
+   int err = rte_event_port_setup(dev_id, tx_port_id, &tx_conf);
+
+As shown above:
 
  * port 0: RX core
  * ports 1,2,3,4: Workers
  * port 5: TX core
 
+These ports are used for the remainder of this walk-through.
+
 Linking Queues and Ports
 
 
@@ -254,15 +272,14 @@ can be achieved like this:
 
 .. code-block:: c
 
-uint8_t port_id = 0;
+uint8_t rx_port_id = 0;
+uint8_t tx_port_id = 5;
 uint8_t atomic_qs[] = {0, 1};
 uint8_t single_link_q = 2;
-uint8_t tx_port_id = 5;
 uin8t_t priority = RTE_EVENT_DEV_PRIORITY_NORMAL;
 
-for(int i = 0; i < 4; i++) {
-int worker_port = i + 1;
-int links_made = rte_event_port_link(dev_id, worker_port, 
atomic_qs, NULL, 2);
+for(int worker_port_id = 1; worker_port_id <= 4; worker_port_id++) {
+int links_made = rte_event_port_link(dev_id, worker_port_id, 
atomic_qs, NULL, 2);
 }
 int links_made = rte_event_port_link(dev_id, tx_port_id, 
&single_link_q, &priority, 1);
 
@@ -295,14 +312,14 @@ The following code shows how those packets can be 
enqueued into the eventdev:
 ev[i].flow_id = mbufs[i]->hash.rss;
 ev[i].op = RTE_EVENT_OP_NEW;
 ev[i].sched_type = RTE_SCHED_TYPE_ATOMIC;
-ev[i].queue_id = 0;
+ev[i].queue_id = atomic

[dpdk-dev] [PATCH] i40evf: don't reset device_info data

2018-06-06 Thread Damjan Marion
At this point valid data is already set by rte_eth_get_device_info.
device field becomes zero and consumer is not able to retrieve pci data.

Signed-off-by: Damjan Marion 
---
 drivers/net/i40e/i40e_ethdev_vf.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/i40e/i40e_ethdev_vf.c 
b/drivers/net/i40e/i40e_ethdev_vf.c
index 804e44530..86b38d202 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -2182,7 +2182,6 @@ i40evf_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
 {
struct i40e_vf *vf = I40EVF_DEV_PRIVATE_TO_VF(dev->data->dev_private);
 
-   memset(dev_info, 0, sizeof(*dev_info));
dev_info->max_rx_queues = vf->vsi_res->num_queue_pairs;
dev_info->max_tx_queues = vf->vsi_res->num_queue_pairs;
dev_info->min_rx_bufsize = I40E_BUF_SIZE_MIN;
-- 
2.17.1



Re: [dpdk-dev] [PATCH v2] net/pcap: rx_iface_in stream type support

2018-06-06 Thread Ido Goshen
The problem is if a dpdk app uses the same iface(s) both as rx_iface and 
tx_iface then it will receive back the packets it sends.
If my app sends a packet to portid=X with rte_eth_tx_burst() then I wouldn't 
expect to receive it back by rte_eth_rx_burst() for that same portid=X  
(assuming of course there's no external loopback)
This is coming from the default nature of pcap that like a sniffer captures 
both incoming and outgoing direction.
The patch provides an option to limit pcap rx_iface to get only incoming 
traffic which is more like a real (non-pcap) dpdk device.

for example:
when using existing *rx_iface*
l2fwd -c 3 -n1 --no-huge --vdev=eth_pcap0,rx_iface=eth1,tx_iface=eth1 
--vdev=eth_pcap1,rx_iface=dummy0,tx_iface=dummy0  -- -p 3 -T 1 
sending only 1 single packet into eth1 will end in an infinite loop - 
the packet that is being transmitted by l2fwd to tx_iface=dummy0 is being 
recaptured again by the rx_iface=dummy0 (as it captures outgoing packets too) 
then l2fwd it to tx_iface=eth0 and it is received back on it by the 
rx_iface=eth0 and so on...
Port statistics 
Statistics for port 0 --
Packets sent: 4658
Packets received: 4658
Packets dropped: 0
Statistics for port 1 --
Packets sent: 4658
Packets received: 4658
Packets dropped: 0
Aggregate statistics ===
Total packets sent:   9316
Total packets received:   9316
Total packets dropped:   0


while if using the suggested *rx_iface_in*
l2fwd -c 3 -n1 --no-huge 
--vdev=eth_pcap0,rx_iface_in=eth1,tx_iface=eth1 
--vdev=eth_pcap1,rx_iface_in=dummy0,tx_iface=dummy0  -- -p 3 -T 1
The packet will be transmitted once and not received again (as I'd expect)
Port statistics 
Statistics for port 0 --
Packets sent:0
Packets received:1
Packets dropped: 0
Statistics for port 1 --
Packets sent:1
Packets received:0
Packets dropped: 0
Aggregate statistics ===
Total packets sent:  1
Total packets received:  1
Total packets dropped:   0


-Original Message-
From: Ferruh Yigit  
Sent: Tuesday, June 5, 2018 4:27 PM
To: Ido Goshen 
Cc: dev@dpdk.org
Subject: Re: [PATCH v2] net/pcap: rx_iface_in stream type support

On 6/5/2018 10:39 AM, ido goshen wrote:
> Support rx of in direction packets only Useful for apps that also tx 
> to eth_pcap ports in order not to see them echoed back in as rx when 
> out direction is also captured

Can you please explain the problem a little more?
If you are using rx_iface and tx_iface arguments you should have different pcap 
handlers for Rx and Tx and you shouldn't be getting Tx packets in Rx handler.

> 
> Signed-off-by: ido goshen 
> ---
> v2: clean checkpatch warning
> 
>  drivers/net/pcap/rte_eth_pcap.c | 36 
> 

doc/guides/nics/pcap_ring.rst also needs to be updated with this new arg, but 
please wait to clarify the above comment, to figure out if this patch really 
necessary, before updating that document.

>  1 file changed, 32 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/pcap/rte_eth_pcap.c 
> b/drivers/net/pcap/rte_eth_pcap.c index 6bd4a7d..132f469 100644
> --- a/drivers/net/pcap/rte_eth_pcap.c
> +++ b/drivers/net/pcap/rte_eth_pcap.c
> @@ -26,6 +26,7 @@
>  #define ETH_PCAP_RX_PCAP_ARG  "rx_pcap"
>  #define ETH_PCAP_TX_PCAP_ARG  "tx_pcap"
>  #define ETH_PCAP_RX_IFACE_ARG "rx_iface"
> +#define ETH_PCAP_RX_IFACE_IN_ARG "rx_iface_in"
>  #define ETH_PCAP_TX_IFACE_ARG "tx_iface"
>  #define ETH_PCAP_IFACE_ARG"iface"
>  
> @@ -83,6 +84,7 @@ struct pmd_devargs {
>   ETH_PCAP_RX_PCAP_ARG,
>   ETH_PCAP_TX_PCAP_ARG,
>   ETH_PCAP_RX_IFACE_ARG,
> + ETH_PCAP_RX_IFACE_IN_ARG,
>   ETH_PCAP_TX_IFACE_ARG,
>   ETH_PCAP_IFACE_ARG,
>   NULL
> @@ -726,6 +728,22 @@ struct pmd_devargs {
>   return 0;
>  }
>  
> +static inline int
> +set_iface_direction(const char *iface, const char *key, pcap_t *pcap) 
> +{
> + if (strcmp(key, ETH_PCAP_RX_IFACE_IN_ARG) == 0) {

set_iface_direction() can be more generic function, instead of checking the key 
here, it can be done the functions calls this.

Also can be possible to make even more generic, this function only sets 
direction as 

[dpdk-dev] [PATCH v2] net/pcap: rx_iface_in stream type support

2018-06-06 Thread ido goshen
Support rx of in direction packets only
Useful for apps that also tx to eth_pcap ports in order not to see them
echoed back in as rx when out direction is also captured

Signed-off-by: ido goshen 
---
v2: clean checkpatch warning

 drivers/net/pcap/rte_eth_pcap.c | 36 
 1 file changed, 32 insertions(+), 4 deletions(-)

diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index 6bd4a7d..132f469 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -26,6 +26,7 @@
 #define ETH_PCAP_RX_PCAP_ARG  "rx_pcap"
 #define ETH_PCAP_TX_PCAP_ARG  "tx_pcap"
 #define ETH_PCAP_RX_IFACE_ARG "rx_iface"
+#define ETH_PCAP_RX_IFACE_IN_ARG "rx_iface_in"
 #define ETH_PCAP_TX_IFACE_ARG "tx_iface"
 #define ETH_PCAP_IFACE_ARG"iface"
 
@@ -83,6 +84,7 @@ struct pmd_devargs {
ETH_PCAP_RX_PCAP_ARG,
ETH_PCAP_TX_PCAP_ARG,
ETH_PCAP_RX_IFACE_ARG,
+   ETH_PCAP_RX_IFACE_IN_ARG,
ETH_PCAP_TX_IFACE_ARG,
ETH_PCAP_IFACE_ARG,
NULL
@@ -726,6 +728,22 @@ struct pmd_devargs {
return 0;
 }
 
+static inline int
+set_iface_direction(const char *iface, const char *key, pcap_t *pcap)
+{
+   if (strcmp(key, ETH_PCAP_RX_IFACE_IN_ARG) == 0) {
+   if (pcap_setdirection(pcap, PCAP_D_IN) < 0) {
+   PMD_LOG(ERR,
+   "Setting %s pcap direction IN failed - %s\n",
+iface,
+pcap_geterr(pcap));
+   return -1;
+   }
+   PMD_LOG(INFO, "Setting %s pcap direction IN\n", iface);
+   }
+   return 0;
+}
+
 /*
  * Opens a NIC for reading packets from it
  */
@@ -740,11 +758,12 @@ struct pmd_devargs {
for (i = 0; i < rx->num_of_queue; i++) {
if (open_single_iface(iface, &pcap) < 0)
return -1;
+   if (set_iface_direction(iface, key, pcap) < 0)
+   return -1;
rx->queue[i].pcap = pcap;
rx->queue[i].name = iface;
rx->queue[i].type = key;
}
-
return 0;
 }
 
@@ -963,17 +982,25 @@ struct pmd_devargs {
is_rx_pcap = 1;
else
pcaps.num_of_queue = rte_kvargs_count(kvlist,
-   ETH_PCAP_RX_IFACE_ARG);
+   ETH_PCAP_RX_IFACE_ARG) +
+   rte_kvargs_count(kvlist,
+   ETH_PCAP_RX_IFACE_IN_ARG);
 
if (pcaps.num_of_queue > RTE_PMD_PCAP_MAX_QUEUES)
pcaps.num_of_queue = RTE_PMD_PCAP_MAX_QUEUES;
 
-   if (is_rx_pcap)
+   if (is_rx_pcap) {
ret = rte_kvargs_process(kvlist, ETH_PCAP_RX_PCAP_ARG,
&open_rx_pcap, &pcaps);
-   else
+   } else {
ret = rte_kvargs_process(kvlist, ETH_PCAP_RX_IFACE_ARG,
&open_rx_iface, &pcaps);
+   if (ret == 0)
+   ret = rte_kvargs_process(kvlist,
+   ETH_PCAP_RX_IFACE_IN_ARG,
+   &open_rx_iface,
+   &pcaps);
+   }
 
if (ret < 0)
goto free_kvlist;
@@ -1046,6 +1073,7 @@ struct pmd_devargs {
ETH_PCAP_RX_PCAP_ARG "= "
ETH_PCAP_TX_PCAP_ARG "= "
ETH_PCAP_RX_IFACE_ARG "= "
+   ETH_PCAP_RX_IFACE_IN_ARG "= "
ETH_PCAP_TX_IFACE_ARG "= "
ETH_PCAP_IFACE_ARG "=");
 
-- 
1.9.1



[dpdk-dev] [dpdk-announce] CFP for DPDK Summit Userspace

2018-06-06 Thread O'Driscoll, Tim
The CFP for DPDK Summit Userspace (September 5th & 6th in Dublin) is available 
now at: https://goo.gl/forms/Jh70MIA7QJE2nSBb2.

The deadline for submissions is Friday July 27th. Submissions should focus on 
the key technical challenges facing the project. Suggested topics include:
* New APIs, libraries and features
* DPDK changes required to facilitate hardware acceleration
* Technical challenges facing the DPDK community, including parts of the code 
that need to be refactored
* Improvements to interworking with the Linux kernel
* Improvements to performance, usability and packaging
* Virtualization and container networking
* Innovative networking technologies such as P4, and their impact on DPDK

A link for registration for the event will be provided shortly.




[dpdk-dev] [dpdk-announce] Call for speakers

2018-06-06 Thread Tibrewala, Sujata
Call for speakers Open for: 

1. June 30th, Bangalore:  Packet Processing, Cloud and Edge Next Gen SDN/NFV 
Networks - Out of the Box Network Developers meet up
  
(https://www.meetup.com/Out-of-the-Box-Network-Developers-Bangalore/events/251225679/)

2. June 28th, Taipei:  Packet Processing and the world of SDN/NFV- Out of the 
Box Network Developers meet up 
(https://www.meetup.com/Out-Of-The-Box-Network-Developers-Taipei/)

3. July 3rd, Dublin: Containers, Cloud Native, Kubernetes, IoT, 5G - Out of the 
Box Network Developers meet up
(https://www.meetup.com/Out-of-the-Box-Network-Developers-Ireland/)

Last date for Submission: June 15th
Topics: Packet processing, DPDK, Cloud Native, Edge, IoT, 5G

Please send your proposal to: sujata.tibrew...@intel.com

Thanks
Sujata Tibrewala @sujatatibre
Community Development Manager 
Intel Developer Zone
https://software.intel.com/networking



[dpdk-dev] [PATCH] net/i40e: remove VF interrupt handler

2018-06-06 Thread Qi Zhang
For i40evf, internal rx interrupt and adminq interrupt share the same
source, that cause a lot cpu cycles be wasted on interrupt handler
on rx path. This is complained by customers which require low latency
(when set I40E_ITR_INTERVAL to small value), but have to be sufferred by
tremendous interrupts handling that eat significant CPU resources.

The patch disable pci interrupt and remove the interrupt handler,
replace it with a low frequency (50ms) interrupt polling daemon
which is implemented by registering a alarm callback periodly, this
save CPU time significently: On a typical x86 server with 2.1GHz CPU,
with low latency configure (32us) we saw CPU usage from top commmand
reduced from 20% to 0% on management core in testpmd).

Also with the new method we can remove compile option: I40E_ITR_INTERVAL
which is used to balance between low latency and low CPU usage previously.
Now we don't need it since we can reach both at same time.

Suggested-by: Jingjing Wu 
Signed-off-by: Qi Zhang 
---
 config/common_base|  2 --
 drivers/net/i40e/i40e_ethdev.c|  3 +--
 drivers/net/i40e/i40e_ethdev.h| 22 +++---
 drivers/net/i40e/i40e_ethdev_vf.c | 36 ++--
 4 files changed, 26 insertions(+), 37 deletions(-)

diff --git a/config/common_base b/config/common_base
index 6b0d1cbbb..9e21c6865 100644
--- a/config/common_base
+++ b/config/common_base
@@ -264,8 +264,6 @@ CONFIG_RTE_LIBRTE_I40E_INC_VECTOR=y
 CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC=n
 CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_PF=64
 CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VM=4
-# interval up to 8160 us, aligned to 2 (or default value)
-CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL=-1
 
 #
 # Compile burst-oriented FM10K PMD
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 13c5d3296..c8f9566e0 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -1829,8 +1829,7 @@ __vsi_queues_bind_intr(struct i40e_vsi *vsi, uint16_t 
msix_vect,
/* Write first RX queue to Link list register as the head element */
if (vsi->type != I40E_VSI_SRIOV) {
uint16_t interval =
-   i40e_calc_itr_interval(RTE_LIBRTE_I40E_ITR_INTERVAL, 1,
-  pf->support_multi_driver);
+   i40e_calc_itr_interval(1, pf->support_multi_driver);
 
if (msix_vect == I40E_MISC_VEC_ID) {
I40E_WRITE_REG(hw, I40E_PFINT_LNKLST0,
diff --git a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h
index 11c4c76bd..53dac 100644
--- a/drivers/net/i40e/i40e_ethdev.h
+++ b/drivers/net/i40e/i40e_ethdev.h
@@ -178,7 +178,7 @@ enum i40e_flxpld_layer_idx {
 #define I40E_ITR_INDEX_NONE 3
 #define I40E_QUEUE_ITR_INTERVAL_DEFAULT 32 /* 32 us */
 #define I40E_QUEUE_ITR_INTERVAL_MAX 8160 /* 8160 us */
-#define I40E_VF_QUEUE_ITR_INTERVAL_DEFAULT 8160 /* 8160 us */
+#define I40E_VF_QUEUE_ITR_INTERVAL_DEFAULT 32 /* 32 us */
 /* Special FW support this floating VEB feature */
 #define FLOATING_VEB_SUPPORTED_FW_MAJ 5
 #define FLOATING_VEB_SUPPORTED_FW_MIN 0
@@ -1328,17 +1328,17 @@ i40e_align_floor(int n)
 }
 
 static inline uint16_t
-i40e_calc_itr_interval(int16_t interval, bool is_pf, bool is_multi_drv)
+i40e_calc_itr_interval(bool is_pf, bool is_multi_drv)
 {
-   if (interval < 0 || interval > I40E_QUEUE_ITR_INTERVAL_MAX) {
-   if (is_multi_drv) {
-   interval = I40E_QUEUE_ITR_INTERVAL_MAX;
-   } else {
-   if (is_pf)
-   interval = I40E_QUEUE_ITR_INTERVAL_DEFAULT;
-   else
-   interval = I40E_VF_QUEUE_ITR_INTERVAL_DEFAULT;
-   }
+   uint16_t interval = 0;
+
+   if (is_multi_drv) {
+   interval = I40E_QUEUE_ITR_INTERVAL_MAX;
+   } else {
+   if (is_pf)
+   interval = I40E_QUEUE_ITR_INTERVAL_DEFAULT;
+   else
+   interval = I40E_VF_QUEUE_ITR_INTERVAL_DEFAULT;
}
 
/* Convert to hardware count, as writing each 1 represents 2 us */
diff --git a/drivers/net/i40e/i40e_ethdev_vf.c 
b/drivers/net/i40e/i40e_ethdev_vf.c
index 804e44530..ad5c069e8 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -44,6 +44,8 @@
 #define I40EVF_BUSY_WAIT_COUNT 50
 #define MAX_RESET_WAIT_CNT 20
 
+#define I40EVF_ALARM_INTERVAL 5 /* us */
+
 struct i40evf_arq_msg_info {
enum virtchnl_ops ops;
enum i40e_status_code result;
@@ -1133,7 +1135,7 @@ i40evf_init_vf(struct rte_eth_dev *dev)
struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
struct i40e_vf *vf = I40EVF_DEV_PRIVATE_TO_VF(dev->data->dev_private);
uint16_t interval =
-   i40e_calc_itr_interval(RTE_LIBRTE_I40E_ITR_INTERVAL, 0, 0);
+   i40e_

Re: [dpdk-dev] [RFC] hot plug failure handle mechanism

2018-06-06 Thread Guo, Jia




On 6/6/2018 8:54 PM, Bruce Richardson wrote:

+Tech-board as I think that this should have more input at the design stage
ahead of any code patches being pushed.

On Mon, Jun 04, 2018 at 09:56:10AM +0800, Guo, Jia wrote:

hi,bruce


On 5/29/2018 7:20 PM, Bruce Richardson wrote:

On Thu, May 24, 2018 at 07:55:43AM +0100, Guo, Jia wrote:


 The hot plug failure handle mechanism should be come across as bellow:

 1.  Add a new bus ops “handle_hot-unplug”in bus to handle bus
 read/write error, it is bus-specific and each

 kind of bus can implement its own logic.

 2.  Implement pci bus specific ops“pci_handle_hot_unplug”, in the
 function, base on the

 failure address to remap memory which belong to the corresponding
 device that unplugged.

 3.  Implement a new sigbus handler, and register it when start
 device event monitoring,

 once the MMIO sigbus error exposure, it will trigger the above hot plug
 failure handle mechanism,

 that will keep app, that working on packet processing, would not be
 broken and crash, then could

 keep going clean, fail-safe or other working task.

 4.  Also also will introduce the solution by use testpmd to show
 the example of the whole procedure like that:

 device unplug ->failure handle->stop forwarding->stop port->close
 port->detach port.


Hi Jeff,

so if I understand this correctly the proposal is that we need two parallel
solutions to handle safe removal of a device.

1. We need a solution to support unpluging of the device at the bus level,
 ie. remove the device from the list of devices and to make access to
 that device invalid.
2. Since the removal of the device from the software lists is not going to
 be instantaneous, we need a mechanism to handle any accesses to the
 device from the data path until such time as the removal is complete. To
 support that, you propose to add a sigbus handler which will
 automatically replace any mmio bar mappings with some other memory that is
 ok to access - presumable zero memory or similar.

Is this understanding correct?

i think you are correct about that.


Point #2 seems reasonably clear to me, but for #1, presumably the trigger
to the bus needs to come from the kernel. What is planned to be used there?

about point #1, i should clarify here is that, we will use the device event
monitor mechanism to detect the hot unplug event.
the monitor be enabled by app(or fail-safe driver), and app(fail-safe
driver) register the event callback. Once the hot unplug behave be detected,
the user's callback could be triggered to let app(fail-safe driver) know the
event and manage the process, it will call APIs to stop the device
and detach the device from the bus.

Ok. If there is no failsafe driver, and the app does not set up a handler,
does nothing happen when we get a removal event? Will the app just crash?


when the device event monitor be enabled by app, the handler auto be set 
up, app or fail safe driver no need and can not directly do it.
so if app want to process this hot plug event, what they need to do is 
only enable hot plug event monitor and register their self callback,

then the app will not crash when hotplug behavior occur.


You also talk about using testpmd as a reference for this, but you don't
explain how an application can be notified of a device removal, or even why
that is necessary. Since all applications should now be using the proper
macros when iterating device lists, and not just assuming devices 0-N are
valid, what changes would you see a normal app having to make to be
hotplug-safe?

we could use app or fail-safe driver to use these mechanism , but at this
stage i will firstly use testpmd as a reference.
as above reply, testpmd should enable device event mechanism to monitor the
device removal, and register callback,
the device bdf list is managed by bus and the hoplug fail handler will be
process by the eal layer, then the app would be hotplug-safe.

is there anything i miss to clarify? please shout. and i think i will detail
more later.

This is becoming clearer now, thanks. Just the one question above I have at
this point.
Given how long-running this issue of hotplug is, I'm hoping others on the
technical board can also review this proposal.

Regards,
/Bruce




[dpdk-dev] [PATCH 1/2] net/i40e: print real global changes

2018-06-06 Thread Beilei Xing
Currently no matter if there's global change, the global
configuration will be always logged. But there's no value
to log the info if the configuration is not changed.
This patch prints only real global changes.
Also, change log level from DEBUG to WARNING.

Signed-off-by: Beilei Xing 
---
 drivers/net/i40e/i40e_ethdev.c | 124 +++--
 drivers/net/i40e/i40e_ethdev.h |  12 +++-
 2 files changed, 91 insertions(+), 45 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 7d4f1c9..e81b47e 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -698,12 +698,16 @@ i40e_write_global_rx_ctl(struct i40e_hw *hw, uint32_t 
reg_addr,
 uint32_t reg_val)
 {
uint32_t ori_reg_val;
+   struct rte_eth_dev *dev;
 
ori_reg_val = i40e_read_rx_ctl(hw, reg_addr);
+   dev = ((struct i40e_adapter *)hw->back)->eth_dev;
i40e_write_rx_ctl(hw, reg_addr, reg_val);
-   PMD_DRV_LOG(DEBUG,
-   "Global register [0x%08x] original: 0x%08x, after: 0x%08x",
-   reg_addr, ori_reg_val, reg_val);
+   if (ori_reg_val != reg_val)
+   PMD_DRV_LOG(WARNING,
+   "i40e device %s changed global register [0x%08x]."
+   " original: 0x%08x, new: 0x%08x",
+   dev->device->name, reg_addr, ori_reg_val, reg_val);
 }
 
 RTE_PMD_REGISTER_PCI(net_i40e, rte_i40e_pmd);
@@ -1165,6 +1169,7 @@ i40e_aq_debug_write_global_register(struct i40e_hw *hw,
struct i40e_asq_cmd_details *cmd_details)
 {
uint64_t ori_reg_val;
+   struct rte_eth_dev *dev;
int ret;
 
ret = i40e_aq_debug_read_register(hw, reg_addr, &ori_reg_val, NULL);
@@ -1174,11 +1179,13 @@ i40e_aq_debug_write_global_register(struct i40e_hw *hw,
reg_addr);
return -EIO;
}
+   dev = ((struct i40e_adapter *)hw->back)->eth_dev;
 
-   PMD_DRV_LOG(DEBUG,
-   "Global register [0x%08x] original: 0x%"PRIx64
-   ", after: 0x%"PRIx64,
-   reg_addr, ori_reg_val, reg_val);
+   if (ori_reg_val != reg_val)
+   PMD_DRV_LOG(WARNING,
+   "i40e device %s changed global register [0x%08x]."
+   " original: 0x%"PRIx64", after: 0x%"PRIx64,
+   dev->device->name, reg_addr, ori_reg_val, reg_val);
 
return i40e_aq_debug_write_register(hw, reg_addr, reg_val, cmd_details);
 }
@@ -7565,6 +7572,7 @@ i40e_status_code i40e_replace_mpls_l1_filter(struct 
i40e_pf *pf)
struct i40e_aqc_replace_cloud_filters_cmd  filter_replace;
struct i40e_aqc_replace_cloud_filters_cmd_buf  filter_replace_buf;
struct i40e_hw *hw = I40E_PF_TO_HW(pf);
+   struct rte_eth_dev *dev = ((struct i40e_adapter *)hw->back)->eth_dev;
enum i40e_status_code status = I40E_SUCCESS;
 
if (pf->support_multi_driver) {
@@ -7608,10 +7616,12 @@ i40e_status_code i40e_replace_mpls_l1_filter(struct 
i40e_pf *pf)
 
status = i40e_aq_replace_cloud_filters(hw, &filter_replace,
   &filter_replace_buf);
-   if (!status) {
+   if (!status && (filter_replace.old_filter_type !=
+   filter_replace.new_filter_type)) {
i40e_global_cfg_warning(I40E_WARNING_RPL_CLD_FILTER);
-   PMD_DRV_LOG(DEBUG, "Global configuration modification: "
-   "cloud l1 type is changed from 0x%x to 0x%x",
+   PMD_DRV_LOG(WARNING, "i40e device %s changed cloud l1 type."
+   " original: 0x%x, new: 0x%x",
+   dev->device->name,
filter_replace.old_filter_type,
filter_replace.new_filter_type);
}
@@ -7624,6 +7634,7 @@ i40e_status_code i40e_replace_mpls_cloud_filter(struct 
i40e_pf *pf)
struct i40e_aqc_replace_cloud_filters_cmd  filter_replace;
struct i40e_aqc_replace_cloud_filters_cmd_buf  filter_replace_buf;
struct i40e_hw *hw = I40E_PF_TO_HW(pf);
+   struct rte_eth_dev *dev = ((struct i40e_adapter *)hw->back)->eth_dev;
enum i40e_status_code status = I40E_SUCCESS;
 
if (pf->support_multi_driver) {
@@ -7652,10 +7663,13 @@ i40e_status_code i40e_replace_mpls_cloud_filter(struct 
i40e_pf *pf)
   &filter_replace_buf);
if (status < 0)
return status;
-   PMD_DRV_LOG(DEBUG, "Global configuration modification: "
-   "cloud filter type is changed from 0x%x to 0x%x",
-   filter_replace.old_filter_type,
-   filter_replace.new_filter_type);
+   if (filter_replace.old_filter_type !=
+   filter_replace.new_filter_type)
+   PMD

[dpdk-dev] [PATCH 2/2] net/i40e: remove summarized global register change info

2018-06-06 Thread Beilei Xing
The summarized global register change info will be logged
no matter if there's real global register change. Since
only real changes are logged now, there's no need to
summarize global register change info, otherwise will
cause misunderstanding.

Signed-off-by: Beilei Xing 
---
 drivers/net/i40e/i40e_ethdev.c  | 43 ++---
 drivers/net/i40e/i40e_ethdev.h  | 43 -
 drivers/net/i40e/i40e_fdir.c|  1 -
 drivers/net/i40e/i40e_flow.c|  1 -
 drivers/net/i40e/rte_pmd_i40e.c |  3 ---
 5 files changed, 10 insertions(+), 81 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index e81b47e..29eeb92 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -734,7 +734,6 @@ static inline void i40e_GLQF_reg_init(struct i40e_hw *hw)
 */
I40E_WRITE_GLB_REG(hw, I40E_GLQF_ORT(40), 0x0029);
I40E_WRITE_GLB_REG(hw, I40E_GLQF_PIT(9), 0x9420);
-   i40e_global_cfg_warning(I40E_WARNING_QINQ_PARSER);
 }
 
 static inline void i40e_config_automask(struct i40e_pf *pf)
@@ -1306,7 +1305,6 @@ eth_i40e_dev_init(struct rte_eth_dev *dev, void 
*init_params __rte_unused)
PMD_INIT_LOG(DEBUG,
 "Global register 0x%08x is changed with 0x28",
 I40E_GLQF_L3_MAP(40));
-   i40e_global_cfg_warning(I40E_WARNING_QINQ_CLOUD_FILTER);
}
 
/* Need the special FW version to support floating VEB */
@@ -1593,7 +1591,6 @@ void i40e_flex_payload_reg_set_default(struct i40e_hw *hw)
I40E_WRITE_GLB_REG(hw, I40E_GLQF_ORT(33), 0x);
I40E_WRITE_GLB_REG(hw, I40E_GLQF_ORT(34), 0x);
I40E_WRITE_GLB_REG(hw, I40E_GLQF_ORT(35), 0x);
-   i40e_global_cfg_warning(I40E_WARNING_DIS_FLX_PLD);
 }
 
 static int
@@ -3508,8 +3505,6 @@ i40e_vlan_tpid_set_by_registers(struct rte_eth_dev *dev,
"Global register 0x%08x is changed with value 0x%08x",
I40E_GL_SWT_L2TAGCTRL(reg_id), (uint32_t)reg_w);
 
-   i40e_global_cfg_warning(I40E_WARNING_TPID);
-
return 0;
 }
 
@@ -3804,7 +3799,6 @@ i40e_flow_ctrl_set(struct rte_eth_dev *dev, struct 
rte_eth_fc_conf *fc_conf)
I40E_WRITE_GLB_REG(hw, I40E_GLRPB_GLW,
   pf->fc_conf.low_water[I40E_MAX_TRAFFIC_CLASS]
   << I40E_KILOSHIFT);
-   i40e_global_cfg_warning(I40E_WARNING_FLOW_CTL);
} else {
PMD_DRV_LOG(ERR,
"Water marker configuration is not supported.");
@@ -7617,14 +7611,13 @@ i40e_status_code i40e_replace_mpls_l1_filter(struct 
i40e_pf *pf)
status = i40e_aq_replace_cloud_filters(hw, &filter_replace,
   &filter_replace_buf);
if (!status && (filter_replace.old_filter_type !=
-   filter_replace.new_filter_type)) {
-   i40e_global_cfg_warning(I40E_WARNING_RPL_CLD_FILTER);
+   filter_replace.new_filter_type))
PMD_DRV_LOG(WARNING, "i40e device %s changed cloud l1 type."
" original: 0x%x, new: 0x%x",
dev->device->name,
filter_replace.old_filter_type,
filter_replace.new_filter_type);
-   }
+
return status;
 }
 
@@ -7693,14 +7686,13 @@ i40e_status_code i40e_replace_mpls_cloud_filter(struct 
i40e_pf *pf)
status = i40e_aq_replace_cloud_filters(hw, &filter_replace,
   &filter_replace_buf);
if (!status && (filter_replace.old_filter_type !=
-   filter_replace.new_filter_type)) {
-   i40e_global_cfg_warning(I40E_WARNING_RPL_CLD_FILTER);
+   filter_replace.new_filter_type))
PMD_DRV_LOG(WARNING, "i40e device %s changed cloud filter type."
" original: 0x%x, new: 0x%x",
dev->device->name,
filter_replace.old_filter_type,
filter_replace.new_filter_type);
-   }
+
return status;
 }
 
@@ -7782,14 +7774,13 @@ i40e_replace_gtp_l1_filter(struct i40e_pf *pf)
status = i40e_aq_replace_cloud_filters(hw, &filter_replace,
   &filter_replace_buf);
if (!status && (filter_replace.old_filter_type !=
-   filter_replace.new_filter_type)) {
-   i40e_global_cfg_warning(I40E_WARNING_RPL_CLD_FILTER);
+   filter_replace.new_filter_type))
PMD_DRV_LOG(WARNING, "i40e device %s changed cloud l1 type."
" original: 0x%x, new: 0x%x",
dev->device->name,
filter_replace.old_filter

[dpdk-dev] [PATCH 0/2] net/i40e: print real global changes

2018-06-06 Thread Beilei Xing
Only real changes should be logged, so only the first
detected change will be logged. Log should be like:
i40e device xx:xx.x changed global register [xxx] original: xxx, new: xxx

Beilei Xing (2):
  net/i40e: print real global changes
  net/i40e: remove summarized global register change info

 drivers/net/i40e/i40e_ethdev.c  | 157 ++--
 drivers/net/i40e/i40e_ethdev.h  |  55 +++---
 drivers/net/i40e/i40e_fdir.c|   1 -
 drivers/net/i40e/i40e_flow.c|   1 -
 drivers/net/i40e/rte_pmd_i40e.c |   3 -
 5 files changed, 96 insertions(+), 121 deletions(-)

-- 
2.5.5



Re: [dpdk-dev] [PATCH v2 5/5] net/virtio: improve offload check performance

2018-06-06 Thread Tiwei Bie
On Wed, Jun 06, 2018 at 02:31:28PM +0200, Maxime Coquelin wrote:
> Instead of checking the multiple Virtio features bits for
> every packet, let's do the check once at configure time and
> store it in virtio_hw struct.
> 
> Signed-off-by: Maxime Coquelin 
> ---
[...]
> @@ -270,8 +261,7 @@ virtqueue_enqueue_xmit(struct virtnet_tx *txvq, struct 
> rte_mbuf *cookie,
>* which is wrong. Below subtract restores correct pkt size.
>*/
>   cookie->pkt_len -= head_size;
> - /* if offload disabled, it is not zeroed below, do it now */

I think there is no need to remove this comment.

Apart from that,

Reviewed-by: Tiwei Bie 

> - if (offload == 0) {
> + if (!vq->hw->has_tx_offload) {
>   ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0);
>   ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0);
>   ASSIGN_UNLESS_EQUAL(hdr->flags, 0);
[...]


Re: [dpdk-dev] [PATCH v2 4/5] net/virtio: don't use simple Rx if TCP LRO or VLAN strip requested

2018-06-06 Thread Tiwei Bie
On Wed, Jun 06, 2018 at 02:31:27PM +0200, Maxime Coquelin wrote:
> Signed-off-by: Maxime Coquelin 

Reviewed-by: Tiwei Bie 

> ---
>  drivers/net/virtio/virtio_ethdev.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/virtio/virtio_ethdev.c 
> b/drivers/net/virtio/virtio_ethdev.c
> index b023ec02e..357968fdd 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -1963,7 +1963,9 @@ virtio_dev_configure(struct rte_eth_dev *dev)
>   }
>  
>   if (rx_offloads & (DEV_RX_OFFLOAD_UDP_CKSUM |
> -DEV_RX_OFFLOAD_TCP_CKSUM))
> +DEV_RX_OFFLOAD_TCP_CKSUM |
> +DEV_RX_OFFLOAD_TCP_LRO |
> +DEV_RX_OFFLOAD_VLAN_STRIP))
>   hw->use_simple_rx = 0;
>  
>   if (tx_offloads & (DEV_TX_OFFLOAD_TCP_CKSUM |
> -- 
> 2.14.3
> 


Re: [dpdk-dev] [PATCH v2 3/5] net/vhost: improve Tx path selection

2018-06-06 Thread Tiwei Bie
On Wed, Jun 06, 2018 at 02:31:26PM +0200, Maxime Coquelin wrote:
[...]
> @@ -1886,6 +1888,15 @@ virtio_dev_configure(struct rte_eth_dev *dev)
>   (1ULL << VIRTIO_NET_F_GUEST_TSO4) |
>   (1ULL << VIRTIO_NET_F_GUEST_TSO6);
>  
> + if (tx_offloads & (DEV_TX_OFFLOAD_TCP_CKSUM |
> + DEV_TX_OFFLOAD_UDP_CKSUM))
> + req_features |= (1ULL << VIRTIO_NET_F_CSUM);

Hmm.. I still prefer to keep the DEV_TX_OFFLOAD_TCP_CKSUM
and DEV_TX_OFFLOAD_UDP_CKSUM aligned to make the coding
style consistent with the existing code of Rx:

1901 ¦   if (rx_offloads & (DEV_RX_OFFLOAD_UDP_CKSUM |
1902 ¦   ¦   ¦  DEV_RX_OFFLOAD_TCP_CKSUM))
1903 ¦   ¦   req_features |= (1ULL << VIRTIO_NET_F_GUEST_CSUM);

But it's up to you.

> +
> + if (tx_offloads & DEV_TX_OFFLOAD_TCP_TSO)
> + req_features |=
> + (1ULL << VIRTIO_NET_F_HOST_TSO4) |
> + (1ULL << VIRTIO_NET_F_HOST_TSO6);
> +
>   /* if request features changed, reinit the device */
>   if (req_features != hw->req_guest_features) {
>   ret = virtio_init_device(dev, req_features);
> @@ -1955,6 +1966,12 @@ virtio_dev_configure(struct rte_eth_dev *dev)
>  DEV_RX_OFFLOAD_TCP_CKSUM))
>   hw->use_simple_rx = 0;
>  
> + if (tx_offloads & (DEV_TX_OFFLOAD_TCP_CKSUM |
> + DEV_TX_OFFLOAD_UDP_CKSUM |
> + DEV_TX_OFFLOAD_TCP_TSO |
> + DEV_TX_OFFLOAD_VLAN_INSERT))
> + hw->use_simple_tx = 0;

Ditto. Below is the code for Rx:

1987 ¦   if (rx_offloads & (DEV_RX_OFFLOAD_UDP_CKSUM |
1988 ¦   ¦   ¦  DEV_RX_OFFLOAD_TCP_CKSUM |
1989 ¦   ¦   ¦  DEV_RX_OFFLOAD_TCP_LRO |
1990 ¦   ¦   ¦  DEV_RX_OFFLOAD_VLAN_STRIP))
1991 ¦   ¦   hw->use_simple_rx = 0;

So, instead of:

1987 ¦   if (rx_offloads & (DEV_RX_OFFLOAD_UDP_CKSUM |
1988 ¦   ¦   ¦  DEV_RX_OFFLOAD_TCP_CKSUM |
1989 ¦   ¦   ¦  DEV_RX_OFFLOAD_TCP_LRO |
1990 ¦   ¦   ¦  DEV_RX_OFFLOAD_VLAN_STRIP))
1991 ¦   ¦   hw->use_simple_rx = 0;
1992
1993 ¦   if (tx_offloads & (DEV_TX_OFFLOAD_TCP_CKSUM |
1994 ¦   ¦   ¦   ¦   DEV_TX_OFFLOAD_UDP_CKSUM |
1995 ¦   ¦   ¦   ¦   DEV_TX_OFFLOAD_TCP_TSO |
1996 ¦   ¦   ¦   ¦   DEV_TX_OFFLOAD_VLAN_INSERT))
1997 ¦   ¦   hw->use_simple_tx = 0;

I would prefer:

1987 ¦   if (rx_offloads & (DEV_RX_OFFLOAD_UDP_CKSUM |
1988 ¦   ¦   ¦  DEV_RX_OFFLOAD_TCP_CKSUM |
1989 ¦   ¦   ¦  DEV_RX_OFFLOAD_TCP_LRO |
1990 ¦   ¦   ¦  DEV_RX_OFFLOAD_VLAN_STRIP))
1991 ¦   ¦   hw->use_simple_rx = 0;
1992
1993 ¦   if (tx_offloads & (DEV_TX_OFFLOAD_UDP_CKSUM |
1994 ¦   ¦   ¦  DEV_TX_OFFLOAD_TCP_CKSUM |
1995 ¦   ¦   ¦  DEV_TX_OFFLOAD_TCP_TSO |
1996 ¦   ¦   ¦  DEV_TX_OFFLOAD_VLAN_INSERT))
1997 ¦   ¦   hw->use_simple_tx = 0;

But anyway, it's up to you. :)

Reviewed-by: Tiwei Bie 

Best regards,
Tiwei Bie


Re: [dpdk-dev] [PATCH v2 2/5] net/virtio: use simple path for Tx even if Rx mergeable

2018-06-06 Thread Tiwei Bie
On Wed, Jun 06, 2018 at 02:31:25PM +0200, Maxime Coquelin wrote:
> Having Rx mergeable buffers feature enabled should not be
> a reason to not use Tx simple path.
> 
> Signed-off-by: Maxime Coquelin 
> ---
>  drivers/net/virtio/virtio_ethdev.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/net/virtio/virtio_ethdev.c 
> b/drivers/net/virtio/virtio_ethdev.c
> index bdc4f09d5..73e6d6b6b 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -1949,7 +1949,6 @@ virtio_dev_configure(struct rte_eth_dev *dev)
>  #endif
>   if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) {
>   hw->use_simple_rx = 0;
> - hw->use_simple_tx = 0;
>   }

Maybe it's better to also remove { and }.

Reviewed-by: Tiwei Bie 

>  
>   if (rx_offloads & (DEV_RX_OFFLOAD_UDP_CKSUM |
> -- 
> 2.14.3
> 


[dpdk-dev] [PATCH] net/bnxt: add missing ids in xstats

2018-06-06 Thread David Marchand
The xstats api expects that the driver fills both values and ids for each
filled entries.

Fixes: bfb9c2260be2 ("net/bnxt: support xstats get/reset")

Signed-off-by: David Marchand 
---
 drivers/net/bnxt/bnxt_stats.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_stats.c b/drivers/net/bnxt/bnxt_stats.c
index bbd4e78..a5d3c86 100644
--- a/drivers/net/bnxt/bnxt_stats.c
+++ b/drivers/net/bnxt/bnxt_stats.c
@@ -278,6 +278,7 @@ int bnxt_dev_xstats_get_op(struct rte_eth_dev *eth_dev,
count = 0;
for (i = 0; i < RTE_DIM(bnxt_rx_stats_strings); i++) {
uint64_t *rx_stats = (uint64_t *)bp->hw_rx_port_stats;
+   xstats[count].id = count;
xstats[count].value = rte_le_to_cpu_64(
*(uint64_t *)((char *)rx_stats +
bnxt_rx_stats_strings[i].offset));
@@ -286,6 +287,7 @@ int bnxt_dev_xstats_get_op(struct rte_eth_dev *eth_dev,
 
for (i = 0; i < RTE_DIM(bnxt_tx_stats_strings); i++) {
uint64_t *tx_stats = (uint64_t *)bp->hw_tx_port_stats;
+   xstats[count].id = count;
xstats[count].value = rte_le_to_cpu_64(
 *(uint64_t *)((char *)tx_stats +
bnxt_tx_stats_strings[i].offset));
@@ -293,6 +295,7 @@ int bnxt_dev_xstats_get_op(struct rte_eth_dev *eth_dev,
}
 
/* The Tx drop pkts aka the Anti spoof coounter */
+   xstats[count].id = count;
xstats[count].value = rte_le_to_cpu_64(tx_drop_pkts);
count++;
 
-- 
2.7.4



Re: [dpdk-dev] [PATCH v2 1/5] net/virtio: prevent simple Tx path selection by default

2018-06-06 Thread Tiwei Bie
On Wed, Jun 06, 2018 at 02:31:24PM +0200, Maxime Coquelin wrote:
[...]
> +
> +static int
> +virtio_dev_args(struct rte_eth_dev *dev)
> +{
> + struct rte_kvargs *kvlist;
> + struct rte_devargs *devargs;
> + const char *valid_args[] = {
> + VIRTIO_SIMPLE_TX_SUPPORT,
> + NULL,
> + };
> + int ret;
> + int i;
> +
> + devargs = dev->device->devargs;
> + if (!devargs)
> + return 0; /* return success */
> +
> + kvlist = rte_kvargs_parse(devargs->args, valid_args);
> + if (kvlist == NULL)
> + return -EINVAL;

Virtio-user has defined some other mandatory devargs.
The parse will fail when other devargs have been
specified.

> +
> +  /* Process parameters. */
> + for (i = 0; (valid_args[i] != NULL); ++i) {

There is an extra space before the comment.
The () around `valid_args[i] != NULL` isn't necessary.

> + if (rte_kvargs_count(kvlist, valid_args[i])) {
> + ret = rte_kvargs_process(kvlist, valid_args[i],
> +  virtio_dev_args_check, dev);
> + if (ret) {
> + rte_kvargs_free(kvlist);
> + return ret;
> + }
> + }
> + }
> + rte_kvargs_free(kvlist);
> +
> + return 0;
> +}
> +
[...]


Re: [dpdk-dev] [PATCH] net/i40e: firmware status check

2018-06-06 Thread Xing, Beilei



> -Original Message-
> From: Li, Xiaoyun
> Sent: Wednesday, June 6, 2018 2:17 PM
> To: Xing, Beilei ; Zhang, Qi Z 
> Cc: dev@dpdk.org; Li, Xiaoyun 
> Subject: [PATCH] net/i40e: firmware status check
> 
> Check the firmware status at init time. If the status is abnormal, alert the
> user to exit DPDK.
> 
> Signed-off-by: Xiaoyun Li 
> ---
>  drivers/net/i40e/i40e_ethdev.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
> index 13c5d32..c1c19d3 100644
> --- a/drivers/net/i40e/i40e_ethdev.c
> +++ b/drivers/net/i40e/i40e_ethdev.c
> @@ -1192,7 +1192,7 @@ eth_i40e_dev_init(struct rte_eth_dev *dev, void
> *init_params __rte_unused)
>   struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data-
> >dev_private);
>   struct i40e_vsi *vsi;
>   int ret;
> - uint32_t len;
> + uint32_t len, val;
>   uint8_t aq_fail = 0;
> 
>   PMD_INIT_FUNC_TRACE();
> @@ -1236,6 +1236,14 @@ eth_i40e_dev_init(struct rte_eth_dev *dev, void
> *init_params __rte_unused)
>   hw->bus.func = pci_dev->addr.function;
>   hw->adapter_stopped = 0;
> 
> + val = I40E_READ_REG(hw, I40E_GL_FWSTS);
> + if (val & I40E_GL_FWSTS_FWS1B_MASK) {
> + PMD_INIT_LOG(ERR, "\nERROR:"
> +   " Firmware status of this driver is
> abnormal."
> +   " Please quit the app.");

"Please quit the app." can be replaced with some notice, for example "please 
check if firmware is updating" and ? 

> + return -EIO;

Seems I/O error is not suitable for firmware status. How about EBUSY or 
something else?

> + }
> +
>   /* Check if need to support multi-driver */
>   i40e_support_multi_driver(dev);
> 
> --
> 2.7.4