[dpdk-dev] [PATCH v2] eal: restrict cores detection

2016-09-01 Thread Tan, Jianfeng
Hi Stephen,

> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Wednesday, August 31, 2016 11:31 PM
> To: Tan, Jianfeng
> Cc: dev at dpdk.org; david.marchand at 6wind.com; pmatilai at redhat.com;
> thomas.monjalon at 6wind.com
> Subject: Re: [dpdk-dev] [PATCH v2] eal: restrict cores detection
> 
> On Wed, 31 Aug 2016 03:07:10 +
> Jianfeng Tan  wrote:
> 
> > This patch uses pthread_getaffinity_np() to narrow down detected
> > cores before parsing coremask (-c), corelist (-l), and coremap
> > (--lcores).
> >
> > The purpose of this patch is to leave out these core related options
> > when DPDK applications are deployed under container env, so that
> > users only specify core restriction as starting the instance.
> >
> > Note: previously, some users are using isolated CPUs, which could
> > be excluded by default. Please add commands like taskset to use
> > those cores.
> >
> > Test example:
> > $ taskset 0xc ./examples/helloworld/build/helloworld -m 1024
> >
> > Signed-off-by: Jianfeng Tan 
> > Acked-by: Neil Horman 
> > ---
> > v2:
> >   - Make it as default instead of adding the new options.
> >  lib/librte_eal/common/eal_common_lcore.c | 11 ++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_eal/common/eal_common_lcore.c
> b/lib/librte_eal/common/eal_common_lcore.c
> > index 2cd4132..62e4f67 100644
> > --- a/lib/librte_eal/common/eal_common_lcore.c
> > +++ b/lib/librte_eal/common/eal_common_lcore.c
> > @@ -57,6 +57,14 @@ rte_eal_cpu_init(void)
> > struct rte_config *config = rte_eal_get_configuration();
> > unsigned lcore_id;
> > unsigned count = 0;
> > +   rte_cpuset_t cs;
> > +   pthread_t tid = pthread_self();
> > +
> > +   /* Add below method to obtain core restrictions, like ulimit,
> > +* cgroup.cpuset, etc. Will not use those cores, which are rebuffed.
> > +*/
> > +   if (pthread_getaffinity_np(tid, sizeof(rte_cpuset_t), &cs) < 0)
> > +   CPU_ZERO(&cs);
> >
> 
> This patch makes sense but the comment is hard to read because of wording
> and grammar.
> 
> If you choose variable names better then there really is no need for
> a comment in many cases. Code is often easier to read/write than comments
> for non-native English speakers.
> 
> Remove the comment and rename 'cs' as 'affinity_set' or something equally
> as descriptive.

Great suggestion. I'll resend one as you suggest.

Thanks,
Jianfeng


[dpdk-dev] [PATCH v3] eal: restrict cores detection

2016-09-01 Thread Jianfeng Tan
This patch uses pthread_getaffinity_np() to narrow down the detected
cores before parsing the coremask (-c), corelist (-l), and coremap
(--lcores) options.

The purpose is to make these core-related options unnecessary when DPDK
applications are deployed in a container environment, so that users only
need to specify the core restriction when starting the instance.

Note: some users previously relied on isolated CPUs, which may now be
excluded by default. Use a command such as taskset to make those cores
available.

Test example:
$ taskset 0xc ./examples/helloworld/build/helloworld -m 1024

Signed-off-by: Jianfeng Tan 
Acked-by: Neil Horman 
---
v3:
  - Choose a more descriptive variable name, and remove comments
as suggested by Stephen Hemminger.
v2:
  - Make it as default instead of adding the new options.
 lib/librte_eal/common/eal_common_lcore.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_lcore.c b/lib/librte_eal/common/eal_common_lcore.c
index 2cd4132..71c575c 100644
--- a/lib/librte_eal/common/eal_common_lcore.c
+++ b/lib/librte_eal/common/eal_common_lcore.c
@@ -57,6 +57,12 @@ rte_eal_cpu_init(void)
struct rte_config *config = rte_eal_get_configuration();
unsigned lcore_id;
unsigned count = 0;
+   rte_cpuset_t affinity_set;
+   pthread_t tid = pthread_self();
+
+   if (pthread_getaffinity_np(tid, sizeof(rte_cpuset_t),
+  &affinity_set) < 0)
+   CPU_ZERO(&affinity_set);

/*
 * Parse the maximum set of logical cores, detect the subset of running
@@ -70,7 +76,8 @@ rte_eal_cpu_init(void)

/* in 1:1 mapping, record related cpu detected state */
lcore_config[lcore_id].detected = eal_cpu_detected(lcore_id);
-   if (lcore_config[lcore_id].detected == 0) {
+   if (lcore_config[lcore_id].detected == 0 ||
+   !CPU_ISSET(lcore_id, &affinity_set)) {
config->lcore_role[lcore_id] = ROLE_OFF;
lcore_config[lcore_id].core_index = -1;
continue;
-- 
2.7.4
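
The filtering that this patch adds can be tried outside DPDK. The following
is a minimal sketch, assuming Linux with glibc; get_affinity_or_empty() and
count_allowed() are hypothetical helper names, not DPDK APIs. The sketch
uses sched_getaffinity(0, ...) rather than pthread_getaffinity_np(). Both
query the calling thread's mask, but they report errors differently:
sched_getaffinity() returns -1, while pthread_getaffinity_np() returns a
positive error number.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/*
 * Hypothetical helper: fill *set with the CPUs the calling thread may run
 * on. On failure the set is emptied, mirroring the CPU_ZERO() fallback in
 * the patch.
 */
static void
get_affinity_or_empty(cpu_set_t *set)
{
	assert(set != NULL);
	/* pid 0 means "the calling thread"; -1 signals an error */
	if (sched_getaffinity(0, sizeof(*set), set) < 0)
		CPU_ZERO(set);
}

/* Hypothetical helper: count the CPUs in [0, max_cpus) that are in the set. */
static unsigned
count_allowed(const cpu_set_t *set, unsigned max_cpus)
{
	unsigned cpu, n = 0;

	for (cpu = 0; cpu < max_cpus; cpu++)
		if (CPU_ISSET(cpu, set))
			n++;
	return n;
}
```

When started under "taskset 0xc", count_allowed() would be expected to
report at most two CPUs (2 and 3), which are the only lcores the patched
rte_eal_cpu_init() would leave enabled.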



[dpdk-dev] [RFC] igb_uio: deprecate iomem and ioport mapping

2016-09-01 Thread Jianfeng Tan
Previously in igb_uio, iomem is mapped, and both ioport and iomem
are recorded in the uio framework. This is redundant and makes the
code too complex.

For iomem, DPDK user space code never opens or reads files under
/sys/bus/pci/devices/:xx:xx.x/uio/uioY/maps/. Instead,
/sys/bus/pci/devices/:xx:xx.x/resourceY is used to map device
memory.

For ioport, non-x86 platforms cannot read from files under
/sys/bus/pci/devices/:xx:xx.x/uio/uioY/portio/ directly, because
they need to map the port region for access in user space; see the
non-x86 version of pci_uio_ioport_map(). x86 platforms can use the
same method as uio_pci_generic.

This patch deprecates iomem and ioport mapping in the igb_uio kernel
module, and adjusts the ioport implementation for both igb_uio and
uio_pci_generic:
  - on x86 platforms, get port info from /proc/ioports;
  - on non-x86 platforms, map and get port info with pci_uio_ioport_map().

Note: this will affect applications that use files under
/sys/bus/pci/devices/:xx:xx.x/uio/uioY/maps/ and
/sys/bus/pci/devices/:xx:xx.x/uio/uioY/portio/.
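
On x86, finding a device's port range in /proc/ioports amounts to locating
the line whose owner field is the device's PCI address. A minimal sketch of
parsing one such line, assuming the usual "start-end : owner" layout;
parse_ioports_line() is a hypothetical helper and not the EAL
implementation:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: parse one /proc/ioports line of the form
 * "  c000-c01f : 0000:00:04.0" and, if its owner field equals pci_id,
 * store the port range. Returns 0 on a match, -1 otherwise.
 */
static int
parse_ioports_line(const char *line, const char *pci_id,
		   unsigned int *start, unsigned int *end)
{
	unsigned int s, e;
	char owner[32];

	assert(line != NULL && pci_id != NULL && start != NULL && end != NULL);

	/* leading blanks, hex range, then the first word of the owner field */
	if (sscanf(line, " %x-%x : %31s", &s, &e, owner) != 3)
		return -1;
	if (strcmp(owner, pci_id) != 0)
		return -1;

	*start = s;
	*end = e;
	return 0;
}
```

A caller would feed it each line of /proc/ioports in turn and would probably
still want to reject a start above UINT16_MAX, as the removed
pci_uio_ioport_map() did.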

Signed-off-by: Jianfeng Tan 
---
 lib/librte_eal/linuxapp/eal/eal_pci.c |   4 -
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c |  56 +-
 lib/librte_eal/linuxapp/igb_uio/igb_uio.c | 119 ++
 3 files changed, 9 insertions(+), 170 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index cd9de7c..f23e99d 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -629,8 +629,6 @@ rte_eal_pci_ioport_map(struct rte_pci_device *dev, int bar,
break;
 #endif
case RTE_KDRV_IGB_UIO:
-   ret = pci_uio_ioport_map(dev, bar, p);
-   break;
case RTE_KDRV_UIO_GENERIC:
 #if defined(RTE_ARCH_X86)
ret = pci_ioport_map(dev, bar, p);
@@ -718,8 +716,6 @@ rte_eal_pci_ioport_unmap(struct rte_pci_ioport *p)
break;
 #endif
case RTE_KDRV_IGB_UIO:
-   ret = pci_uio_ioport_unmap(p);
-   break;
case RTE_KDRV_UIO_GENERIC:
 #if defined(RTE_ARCH_X86)
ret = 0;
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
index 1786b75..28d09ed 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
@@ -370,53 +370,7 @@ error:
return -1;
 }

-#if defined(RTE_ARCH_X86)
-int
-pci_uio_ioport_map(struct rte_pci_device *dev, int bar,
-  struct rte_pci_ioport *p)
-{
-   char dirname[PATH_MAX];
-   char filename[PATH_MAX];
-   int uio_num;
-   unsigned long start;
-
-   uio_num = pci_get_uio_dev(dev, dirname, sizeof(dirname), 0);
-   if (uio_num < 0)
-   return -1;
-
-   /* get portio start */
-   snprintf(filename, sizeof(filename),
-"%s/portio/port%d/start", dirname, bar);
-   if (eal_parse_sysfs_value(filename, &start) < 0) {
-   RTE_LOG(ERR, EAL, "%s(): cannot parse portio start\n",
-   __func__);
-   return -1;
-   }
-   /* ensure we don't get anything funny here, read/write will cast to
-* uin16_t */
-   if (start > UINT16_MAX)
-   return -1;
-
-   /* FIXME only for primary process ? */
-   if (dev->intr_handle.type == RTE_INTR_HANDLE_UNKNOWN) {
-
-   snprintf(filename, sizeof(filename), "/dev/uio%u", uio_num);
-   dev->intr_handle.fd = open(filename, O_RDWR);
-   if (dev->intr_handle.fd < 0) {
-   RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
-   filename, strerror(errno));
-   return -1;
-   }
-   dev->intr_handle.type = RTE_INTR_HANDLE_UIO;
-   }
-
-   RTE_LOG(DEBUG, EAL, "PCI Port IO found start=0x%lx\n", start);
-
-   p->base = start;
-   p->len = 0;
-   return 0;
-}
-#else
+#if !defined(RTE_ARCH_X86)
 int
 pci_uio_ioport_map(struct rte_pci_device *dev, int bar,
   struct rte_pci_ioport *p)
@@ -553,14 +507,10 @@ pci_uio_ioport_write(struct rte_pci_ioport *p,
}
 }

+#if !defined(RTE_ARCH_X86)
 int
 pci_uio_ioport_unmap(struct rte_pci_ioport *p)
 {
-#if defined(RTE_ARCH_X86)
-   RTE_SET_USED(p);
-   /* FIXME close intr fd ? */
-   return 0;
-#else
return munmap((void *)(uintptr_t)p->base, p->len);
-#endif
 }
+#endif
diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
index df41e45..e9d78fb 100644
--- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
+++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
@@ -216,107 +216,6 @@ igbuio_dom0_pci_mmap(struct uio_info *info, struct vm_area_struct *vma)
 }
 #endif

-/* Remap pci resources described by bar #pci_bar in uio resource n. */
-static int
-igbuio_pci_setup_iomem(s

[dpdk-dev] [PATCH 2/2] net/i40e: fix mbufs leakage during Rx queue release

2016-09-01 Thread Xing, Beilei
Hi,

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Yury Kylulin
> Sent: Tuesday, August 30, 2016 12:51 AM
> To: Zhang, Helin ; Ananyev, Konstantin
> ; Wu, Jingjing 
> Cc: Lu, Wenzhuo ; dev at dpdk.org; Kylulin, Yury
> 
> Subject: [dpdk-dev] [PATCH 2/2] net/i40e: fix mbufs leakage during Rx
> queue release
> 
> For the vector PMD release all mbufs from the Rx queue if no packets
> received after device start.
> 
> Fixes: 9ed94e5bb04e ("i40e: add vector Rx")
> 
> Signed-off-by: Yury Kylulin 
Acked-by: Beilei Xing 


[dpdk-dev] QoS: The difference of traffic class between subport and pipe in QoS

2016-09-01 Thread lveny...@1218.com.cn
Thanks for your answer! But I still do not understand it.
What is the role of the traffic class in the subport, and what is its
relationship to the traffic class in the pipe?
Is the traffic class passed to rte_sched_port_pkt_write the subport's or
the pipe's?
void rte_sched_port_pkt_write(struct rte_mbuf *pkt,
 uint32_t subport, uint32_t pipe, uint32_t traffic_class,
 uint32_t queue, enum rte_meter_color color);



lvenyong at 1218.com.cn

From: Dumitrescu, Cristian
Date: 2016-09-01 01:44
To: lvenyong; dev at dpdk.org
CC: users at dpdk.org
Subject: RE: [dpdk-dev] QoS: The difference of traffic class between subport 
and pipe in QoS


> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of lvenyong
> Sent: Wednesday, August 31, 2016 12:34 PM
> To: dev at dpdk.org
> Cc: users at dpdk.org
> Subject: [dpdk-dev] QoS: The difference of traffic class between subport and
> pipe in QoS
> 
> HI !
> 
> Is there difference of traffic class between subport and pipe in QOS ?
> 
> After read prog_guide-2.2.pdf we kown that the scheduling hierarchy is port,
> subport, pipe, traffic class and queue. But the traffic class both in
> subport and pipe appeared in example of qos_sched .
> 
> [subport 0]
> tb rate = 125000   ; Bytes per second
> tb size = 100  ; Bytes
> tc 0 rate = 125000 ; Bytes per second
> tc 1 rate = 125000 ; Bytes per second
> tc 2 rate = 125000 ; Bytes per second
> tc 3 rate = 125000 ; Bytes per second
> tc period = 10 ; Milliseconds
> pipe 0-4095 = 0; These pipes are configured with pipe
> profile 0
> ; Pipe configuration
> [pipe profile 0]
> tb rate = 305175   ; Bytes per second
> tb size = 100  ; Bytes
> tc 0 rate = 305175 ; Bytes per second
> tc 1 rate = 305175 ; Bytes per second
> tc 2 rate = 305175 ; Bytes per second
> tc 3 rate = 305175 ; Bytes per second
> tc period = 40 ; Milliseconds
> 
> Thanks
> 
> 

There are 4 traffic classes. You can enforce a limit on the amount of traffic
belonging to each traffic class at the subport level and, if you want, at the
level of each pipe as well.




[dpdk-dev] [PATCH] sched: fix releasing enqueued packets

2016-09-01 Thread Hiroyuki Mikita
rte_sched_port_free() should release only the packets that are still
enqueued, across all queues. Previously, it released both enqueued and
already dequeued packets, and only for the first 4 queues.

Fixes: 61383240 ("sched: release enqueued mbufs when freeing port")

Signed-off-by: Hiroyuki Mikita 
---
 lib/librte_sched/rte_sched.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 8696423..371003e 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -734,19 +734,24 @@ rte_sched_port_config(struct rte_sched_port_params *params)
 void
 rte_sched_port_free(struct rte_sched_port *port)
 {
-   unsigned int queue;
+   uint32_t qindex;
+   uint32_t n_queues_per_port = RTE_SCHED_QUEUES_PER_PIPE *
+   port->n_pipes_per_subport * port->n_subports_per_port;

/* Check user parameters */
if (port == NULL)
return;

/* Free enqueued mbufs */
-   for (queue = 0; queue < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; queue++) {
-   struct rte_mbuf **mbufs = rte_sched_port_qbase(port, queue);
-   unsigned int i;
-
-   for (i = 0; i < rte_sched_port_qsize(port, queue); i++)
-   rte_pktmbuf_free(mbufs[i]);
+   for (qindex = 0; qindex < n_queues_per_port; qindex++) {
+   struct rte_mbuf **mbufs = rte_sched_port_qbase(port, qindex);
+   uint16_t qsize = rte_sched_port_qsize(port, qindex);
+   struct rte_sched_queue *queue = port->queue + qindex;
+   uint16_t qr = queue->qr & (qsize - 1);
+   uint16_t qw = queue->qw & (qsize - 1);
+
+   for (; qr != qw; qr = (qr + 1) & (qsize - 1))
+   rte_pktmbuf_free(mbufs[qr]);
}

rte_bitmap_free(port->bmp);
-- 
2.7.4
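
Releasing only the occupied slots of a power-of-two ring can be sketched in
isolation. This is a minimal sketch, not the rte_sched code: ring_drain() is
a hypothetical helper, and it assumes that qr and qw are free-running 16-bit
counters that are masked only when indexing. It derives the slot count from
the counter difference instead of comparing masked indices, and it re-masks
the index on every step; stepping with a plain increment would run past the
end of the slot array whenever the masked read index is ahead of the masked
write index.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: release every occupied slot of a ring whose size is
 * a power of two. qr and qw are assumed to be free-running read and write
 * counters. Released slots are set to NULL. Returns the number released.
 */
static unsigned
ring_drain(void **slots, uint16_t qsize, uint16_t qr, uint16_t qw,
	   void (*release)(void *))
{
	uint16_t mask = (uint16_t)(qsize - 1);
	uint16_t len = (uint16_t)(qw - qr); /* valid across counter wrap */
	uint16_t i;

	assert(slots != NULL && release != NULL);
	assert(qsize != 0 && (qsize & mask) == 0); /* power of two */
	assert(len <= qsize);

	for (i = 0; i < len; i++) {
		uint16_t idx = (uint16_t)(qr + i) & mask;

		release(slots[idx]);
		slots[idx] = NULL;
	}
	return len;
}
```

With qsize = 4, qr = 65534 and qw = 1, the helper visits slots 2, 3 and 0 in
that order and leaves slot 1 untouched.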



[dpdk-dev] [PATCH 0/5] Generalize PCI specific EAL function/structures

2016-09-01 Thread Shreyansh Jain
From: Jan Viktorin 

(I rebased these over HEAD e228563)

These patches were initially part of Jan's original series on SoC Framework
([1],[2]). An update to that series, without these patches, was posted by me
here [3].

The main motivation for these is to introduce a non-PCI-centric subsystem
in EAL. As of now the first use case is SoC, but it is not limited to that.

The 5 patches in this series are independent of each other, as well as of
the SoC framework. They all focus on generalizing some structure or function
from the PCI-specific code into the EAL common area (or on splitting a
function to make it more useful).

 - 0001: move the rte_kernel_driver enum from rte_pci to rte_dev. As of now
   this structure is embedded in rte_pci_device, but, going ahead it can be
   part of other rte_xxx_device structures. Either way, it has no impact on
   PCI.
 - 0002: eal_parse_sysfs_value function has been split into two, one
   accepting filename and other working on a file object. This is helpful if
   multiple calls to this are made from EAL - that way infra can maintain a
   file object.
 - 0003: Functions pci_map_resource/pci_unmap_resource are moved to EAL
   common as rte_eal_map_resource/rte_eal_unmap_resource, respectively.
 - 0004: Split the  pci_unbind_kernel_driver into two, still working on the
   PCI BDF sysfs layout, first handles the file path (and validations) and
   second does the actual unbind. The second part might be useful in case of
   non-PCI layouts.
 - 0005: Move pci_get_kernel_driver_by_path to
   rte_eal_get_kernel_driver_by_path in EAL common. This function is generic
   for any sysfs compliant driver and can be re-used by other non-PCI
   subsystem.

If need be, I can propose them as 5 separate patches - but I think clubbing
them together makes more sense (these are loosely related).

[1] http://dpdk.org/ml/archives/dev/2016-January/030915.html
[2] http://www.dpdk.org/ml/archives/dev/2016-May/038486.html
[3] http://dpdk.org/ml/archives/dev/2016-August/045993.html

Jan Viktorin (5):
  eal: make enum rte_kernel_driver non-PCI specific
  eal: extract function eal_parse_sysfs_valuef
  eal: Convert pci_(un)map_resource to rte_eal_(un)map_resource
  eal/linux: extract function rte_eal_unbind_kernel_driver
  eal/linux: extract function rte_eal_get_kernel_driver_by_path

 lib/librte_eal/bsdapp/eal/eal_pci.c |  2 +-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map   |  8 +++
 lib/librte_eal/common/eal_common_dev.c  | 39 +++
 lib/librte_eal/common/eal_common_pci.c  | 39 ---
 lib/librte_eal/common/eal_common_pci_uio.c  | 16 +++--
 lib/librte_eal/common/eal_filesystem.h  |  5 ++
 lib/librte_eal/common/eal_private.h | 27 
 lib/librte_eal/common/include/rte_dev.h | 44 
 lib/librte_eal/common/include/rte_pci.h | 41 ---
 lib/librte_eal/linuxapp/eal/eal.c   | 91 ++---
 lib/librte_eal/linuxapp/eal/eal_pci.c   | 62 +++--
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c   |  2 +-
 lib/librte_eal/linuxapp/eal/eal_pci_vfio.c  |  5 +-
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |  8 +++
 14 files changed, 234 insertions(+), 155 deletions(-)

-- 
2.7.4



[dpdk-dev] [PATCH 1/5] eal: make enum rte_kernel_driver non-PCI specific

2016-09-01 Thread Shreyansh Jain
From: Jan Viktorin 

Signed-off-by: Jan Viktorin 
Signed-off-by: Shreyansh Jain 
---
 lib/librte_eal/common/include/rte_dev.h | 12 
 lib/librte_eal/common/include/rte_pci.h |  9 -
 2 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_dev.h b/lib/librte_eal/common/include/rte_dev.h
index 95789f9..60bc91d 100644
--- a/lib/librte_eal/common/include/rte_dev.h
+++ b/lib/librte_eal/common/include/rte_dev.h
@@ -101,6 +101,18 @@ rte_pmd_debug_trace(const char *func_name, const char *fmt, ...)
 } while (0)


+/**
+ * Kernel driver passthrough type
+ */
+enum rte_kernel_driver {
+   RTE_KDRV_UNKNOWN = 0,
+   RTE_KDRV_IGB_UIO,
+   RTE_KDRV_VFIO,
+   RTE_KDRV_UIO_GENERIC,
+   RTE_KDRV_NIC_UIO,
+   RTE_KDRV_NONE,
+};
+
 /** Double linked list of device drivers. */
 TAILQ_HEAD(rte_driver_list, rte_driver);

diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index fa74962..a4c8156 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -141,15 +141,6 @@ struct rte_pci_addr {

 struct rte_devargs;

-enum rte_kernel_driver {
-   RTE_KDRV_UNKNOWN = 0,
-   RTE_KDRV_IGB_UIO,
-   RTE_KDRV_VFIO,
-   RTE_KDRV_UIO_GENERIC,
-   RTE_KDRV_NIC_UIO,
-   RTE_KDRV_NONE,
-};
-
 /**
  * A structure describing a PCI device.
  */
-- 
2.7.4



[dpdk-dev] [PATCH 2/5] eal: extract function eal_parse_sysfs_valuef

2016-09-01 Thread Shreyansh Jain
From: Jan Viktorin 

The eal_parse_sysfs_value function accepts a filename; however, such an
interface introduces race conditions into the code. Introduce a
variant of this function that accepts an already opened file instead of
a filename.

Signed-off-by: Jan Viktorin 
Signed-off-by: Shreyansh Jain 
---
 lib/librte_eal/common/eal_filesystem.h |  5 +
 lib/librte_eal/linuxapp/eal/eal.c  | 36 +++---
 2 files changed, 30 insertions(+), 11 deletions(-)

diff --git a/lib/librte_eal/common/eal_filesystem.h b/lib/librte_eal/common/eal_filesystem.h
index fdb4a70..7875454 100644
--- a/lib/librte_eal/common/eal_filesystem.h
+++ b/lib/librte_eal/common/eal_filesystem.h
@@ -43,6 +43,7 @@
 /** Path of rte config file. */
 #define RUNTIME_CONFIG_FMT "%s/.%s_config"

+#include 
 #include 
 #include 
 #include 
@@ -115,4 +116,8 @@ eal_get_hugefile_temp_path(char *buffer, size_t buflen, const char *hugedir, int
  * Used to read information from files on /sys */
 int eal_parse_sysfs_value(const char *filename, unsigned long *val);

+/** Function to read a single numeric value from a file on the filesystem.
+ * Used to read information from files on /sys */
+int eal_parse_sysfs_valuef(FILE *f, unsigned long *val);
+
 #endif /* EAL_FILESYSTEM_H */
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index d5b81a3..f912e4e 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -126,13 +126,30 @@ rte_eal_get_configuration(void)
return &rte_config;
 }

+int
+eal_parse_sysfs_valuef(FILE *f, unsigned long *val)
+{
+   char buf[BUFSIZ];
+   char *end = NULL;
+
+   RTE_VERIFY(f != NULL);
+
+   if (fgets(buf, sizeof(buf), f) == NULL)
+   return -1;
+
+   *val = strtoul(buf, &end, 0);
+   if ((buf[0] == '\0') || (end == NULL) || (*end != '\n'))
+   return -2;
+
+   return 0;
+}
+
 /* parse a sysfs (or other) file containing one integer value */
 int
 eal_parse_sysfs_value(const char *filename, unsigned long *val)
 {
+   int ret;
FILE *f;
-   char buf[BUFSIZ];
-   char *end = NULL;

if ((f = fopen(filename, "r")) == NULL) {
RTE_LOG(ERR, EAL, "%s(): cannot open sysfs value %s\n",
@@ -140,21 +157,18 @@ eal_parse_sysfs_value(const char *filename, unsigned long *val)
return -1;
}

-   if (fgets(buf, sizeof(buf), f) == NULL) {
+   ret = eal_parse_sysfs_valuef(f, val);
+   if (ret == -1) {
RTE_LOG(ERR, EAL, "%s(): cannot read sysfs value %s\n",
-   __func__, filename);
-   fclose(f);
-   return -1;
+   __func__, filename);
}
-   *val = strtoul(buf, &end, 0);
-   if ((buf[0] == '\0') || (end == NULL) || (*end != '\n')) {
+   else if (ret < 0) {
RTE_LOG(ERR, EAL, "%s(): cannot parse sysfs value %s\n",
__func__, filename);
-   fclose(f);
-   return -1;
}
+
fclose(f);
-   return 0;
+   return ret;
 }


-- 
2.7.4
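
Reading a value from an already opened stream can be exercised without
sysfs. A minimal standalone sketch using only the C standard library;
parse_value_from_stream() is a hypothetical stand-in for the new function,
not a copy of it. It keeps the same return convention (-1 for a read
failure, -2 for a parse failure) but uses "end == buf" to detect a line
with no digits at all.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in: read one line from f and parse it as an unsigned
 * number in any base accepted by strtoul() with base 0.
 * Returns 0 on success, -1 if nothing could be read, -2 if the line is not
 * a number followed by a newline.
 */
static int
parse_value_from_stream(FILE *f, unsigned long *val)
{
	char buf[BUFSIZ];
	char *end = NULL;

	assert(f != NULL && val != NULL);

	if (fgets(buf, sizeof(buf), f) == NULL)
		return -1;

	*val = strtoul(buf, &end, 0);
	/* end == buf means strtoul() consumed no digits */
	if (end == buf || *end != '\n')
		return -2;

	return 0;
}
```

Because the caller owns the stream, it can read several values from the same
open file or hold the file open across checks, which a filename-based
interface does not allow.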



[dpdk-dev] [PATCH 3/5] eal: Convert pci_(un)map_resource to rte_eal_(un)map_resource

2016-09-01 Thread Shreyansh Jain
From: Jan Viktorin 

The functions pci_map_resource and pci_unmap_resource are generic, so the
pci_ prefix can be dropped. They are moved to eal_common_dev.c so that
they can be reused by other infrastructure.

Signed-off-by: Jan Viktorin 
Signed-off-by: Shreyansh Jain 
---
 lib/librte_eal/bsdapp/eal/eal_pci.c |  2 +-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map   |  8 +
 lib/librte_eal/common/eal_common_dev.c  | 39 +
 lib/librte_eal/common/eal_common_pci.c  | 39 -
 lib/librte_eal/common/eal_common_pci_uio.c  | 16 +-
 lib/librte_eal/common/include/rte_dev.h | 32 
 lib/librte_eal/common/include/rte_pci.h | 32 
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c   |  2 +-
 lib/librte_eal/linuxapp/eal/eal_pci_vfio.c  |  5 ++--
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |  8 +
 10 files changed, 101 insertions(+), 82 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c b/lib/librte_eal/bsdapp/eal/eal_pci.c
index 374b68f..c021969 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -228,7 +228,7 @@ pci_uio_map_resource_by_index(struct rte_pci_device *dev, int res_idx,

/* if matching map is found, then use it */
offset = res_idx * pagesz;
-   mapaddr = pci_map_resource(NULL, fd, (off_t)offset,
+   mapaddr = rte_eal_map_resource(NULL, fd, (off_t)offset,
(size_t)dev->mem_resource[res_idx].len, 0);
close(fd);
if (mapaddr == MAP_FAILED)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index a335e04..6dd4186 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -162,3 +162,11 @@ DPDK_16.07 {
rte_thread_setname;

 } DPDK_16.04;
+
+DPDK_16.11 {
+   global:
+
+   rte_eal_map_resource;
+   rte_eal_unmap_resource;
+
+} DPDK_16.07;
diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
index a8a4146..83aa1ca 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 

 #include 
 #include 
@@ -150,3 +151,41 @@ rte_eal_vdev_uninit(const char *name)
RTE_LOG(ERR, EAL, "no driver found for %s\n", name);
return -EINVAL;
 }
+
+/* map a particular resource from a file */
+void *
+rte_eal_map_resource(void *requested_addr, int fd, off_t offset, size_t size,
+int additional_flags)
+{
+   void *mapaddr;
+
+   /* Map the Memory resource of device */
+   mapaddr = mmap(requested_addr, size, PROT_READ | PROT_WRITE,
+   MAP_SHARED | additional_flags, fd, offset);
+   if (mapaddr == MAP_FAILED) {
+   RTE_LOG(ERR, EAL, "%s(): cannot mmap(%d, %p, 0x%lx, 0x%lx): %s"
+   " (%p)\n", __func__, fd, requested_addr,
+   (unsigned long)size, (unsigned long)offset,
+   strerror(errno), mapaddr);
+   } else
+   RTE_LOG(DEBUG, EAL, "  Device memory mapped at %p\n", mapaddr);
+
+   return mapaddr;
+}
+
+/* unmap a particular resource */
+void
+rte_eal_unmap_resource(void *requested_addr, size_t size)
+{
+   if (requested_addr == NULL)
+   return;
+
+   /* Unmap the Memory resource of device */
+   if (munmap(requested_addr, size)) {
+   RTE_LOG(ERR, EAL, "%s(): cannot munmap(%p, 0x%lx): %s\n",
+   __func__, requested_addr, (unsigned long)size,
+   strerror(errno));
+   } else
+   RTE_LOG(DEBUG, EAL, "  Device memory unmapped at %p\n",
+   requested_addr);
+}
diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index 7248c38..0818b63 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -67,7 +67,6 @@
 #include 
 #include 
 #include 
-#include 

 #include 
 #include 
@@ -112,44 +111,6 @@ static struct rte_devargs *pci_devargs_lookup(struct rte_pci_device *dev)
return NULL;
 }

-/* map a particular resource from a file */
-void *
-pci_map_resource(void *requested_addr, int fd, off_t offset, size_t size,
-int additional_flags)
-{
-   void *mapaddr;
-
-   /* Map the PCI memory resource of device */
-   mapaddr = mmap(requested_addr, size, PROT_READ | PROT_WRITE,
-   MAP_SHARED | additional_flags, fd, offset);
-   if (mapaddr == MAP_FAILED) {
-   RTE_LOG(ERR, EAL, "%s(): cannot mmap(%d, %p, 0x%lx, 0x%lx): %s (%p)\n",
-   __func__, fd, requested_addr,
-   (unsigned long)size, (unsigned long)offset,
- 

[dpdk-dev] [PATCH 5/5] eal/linux: extract function rte_eal_get_kernel_driver_by_path

2016-09-01 Thread Shreyansh Jain
From: Jan Viktorin 

Generalize the PCI-specific pci_get_kernel_driver_by_path. The function
is already general enough, so it is simply moved to eal.c, given the
rte_eal prefix, and exposed privately to other parts of EAL.

Signed-off-by: Jan Viktorin 
Signed-off-by: Shreyansh Jain 
---
 lib/librte_eal/common/eal_private.h   | 14 ++
 lib/librte_eal/linuxapp/eal/eal.c | 29 +
 lib/librte_eal/linuxapp/eal/eal_pci.c | 31 +--
 3 files changed, 44 insertions(+), 30 deletions(-)

diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
index 0740c0c..5ea30a2 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -271,6 +271,20 @@ int rte_eal_check_module(const char *module_name);
 int rte_eal_unbind_kernel_driver(const char *devpath, const char *devid);

 /**
+ * Extract the kernel driver name from the absolute path to the driver.
+ *
+ * @param filename  path to the driver ("/driver")
+ * @param dri_name  target buffer where to place the driver name
+ *  (should be at least PATH_MAX long)
+ *
+ * @return
+ *  -1   on failure
+ *   0   when successful
+ *   1   when there is no such driver
+ */
+int rte_eal_get_kernel_driver_by_path(const char *filename, char *dri_name);
+
+/**
  * Get cpu core_id.
  *
  * This function is private to the EAL.
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 8711d9a..9a4c498 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -973,3 +973,32 @@ error:
fclose(f);
return -1;
 }
+
+int
+rte_eal_get_kernel_driver_by_path(const char *filename, char *dri_name)
+{
+   int count;
+   char path[PATH_MAX];
+   char *name;
+
+   if (!filename || !dri_name)
+   return -1;
+
+   count = readlink(filename, path, PATH_MAX);
+   if (count >= PATH_MAX)
+   return -1;
+
+   /* For device does not have a driver */
+   if (count < 0)
+   return 1;
+
+   path[count] = '\0';
+
+   name = strrchr(path, '/');
+   if (name) {
+   strncpy(dri_name, name + 1, strlen(name + 1) + 1);
+   return 0;
+   }
+
+   return -1;
+}
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 4792d05..f923e42 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -78,35 +78,6 @@ pci_unbind_kernel_driver(struct rte_pci_device *dev)
return rte_eal_unbind_kernel_driver(devpath, devid);
 }

-static int
-pci_get_kernel_driver_by_path(const char *filename, char *dri_name)
-{
-   int count;
-   char path[PATH_MAX];
-   char *name;
-
-   if (!filename || !dri_name)
-   return -1;
-
-   count = readlink(filename, path, PATH_MAX);
-   if (count >= PATH_MAX)
-   return -1;
-
-   /* For device does not have a driver */
-   if (count < 0)
-   return 1;
-
-   path[count] = '\0';
-
-   name = strrchr(path, '/');
-   if (name) {
-   strncpy(dri_name, name + 1, strlen(name + 1) + 1);
-   return 0;
-   }
-
-   return -1;
-}
-
 /* Map pci device */
 int
 rte_eal_pci_map_device(struct rte_pci_device *dev)
@@ -354,7 +325,7 @@ pci_scan_one(const char *dirname, uint16_t domain, uint8_t bus,

/* parse driver */
snprintf(filename, sizeof(filename), "%s/driver", dirname);
-   ret = pci_get_kernel_driver_by_path(filename, driver);
+   ret = rte_eal_get_kernel_driver_by_path(filename, driver);
if (ret < 0) {
RTE_LOG(ERR, EAL, "Fail to get kernel driver\n");
free(dev);
-- 
2.7.4
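
The lookup itself is just "read the driver symlink and keep the last path
component". A minimal standalone sketch, assuming a POSIX system;
driver_name_from_link() is a hypothetical name, and unlike the function in
the patch it takes the size of the output buffer so the copy can be bounded:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: resolve the symlink at link_path and copy the last
 * component of its target into name.
 * Returns 0 on success, 1 if the link cannot be read (no driver bound),
 * -1 on any other failure.
 */
static int
driver_name_from_link(const char *link_path, char *name, size_t name_len)
{
	char target[PATH_MAX];
	const char *base;
	ssize_t n;

	assert(link_path != NULL && name != NULL && name_len > 0);

	/* readlink() does not terminate the buffer, so leave room for '\0' */
	n = readlink(link_path, target, sizeof(target) - 1);
	if (n < 0)
		return 1;
	target[n] = '\0';

	base = strrchr(target, '/');
	if (base == NULL)
		return -1;

	/* snprintf() always terminates; a result >= name_len means truncation */
	if (snprintf(name, name_len, "%s", base + 1) >= (int)name_len)
		return -1;

	return 0;
}
```

For a link whose target is "../../../bus/pci/drivers/igb_uio", the helper
stores "igb_uio" in name.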



[dpdk-dev] [PATCH 4/5] eal/linux: extract function rte_eal_unbind_kernel_driver

2016-09-01 Thread Shreyansh Jain
From: Jan Viktorin 

Generalize the PCI-specific pci_unbind_kernel_driver. It is now divided
into two parts: first, determining the path and the string identification
of the device to be unbound; second, the actual unbind operation, which is
generic.

Signed-off-by: Jan Viktorin 
Signed-off-by: Shreyansh Jain 
---
 lib/librte_eal/common/eal_private.h   | 13 +
 lib/librte_eal/linuxapp/eal/eal.c | 26 ++
 lib/librte_eal/linuxapp/eal/eal_pci.c | 33 +
 3 files changed, 48 insertions(+), 24 deletions(-)

diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
index 19f7535..0740c0c 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -258,6 +258,19 @@ int rte_eal_dev_init(void);
 int rte_eal_check_module(const char *module_name);

 /**
+ * Unbind kernel driver bound to the device specified by the given devpath,
+ * and its string identification.
+ *
+ * @param devpath  path to the device directory ("/sys/.../devices/")
+ * @param devid    identification of the device ()
+ *
+ * @return
+ *  -1  unbind has failed
+ *   0  module has been unbound
+ */
+int rte_eal_unbind_kernel_driver(const char *devpath, const char *devid);
+
+/**
  * Get cpu core_id.
  *
  * This function is private to the EAL.
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index f912e4e..8711d9a 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -947,3 +947,29 @@ rte_eal_check_module(const char *module_name)
/* Module has been found */
return 1;
 }
+
+int
+rte_eal_unbind_kernel_driver(const char *devpath, const char *devid)
+{
+   char filename[PATH_MAX];
+   FILE *f;
+
+   snprintf(filename, sizeof(filename),
+"%s/driver/unbind", devpath);
+
+   f = fopen(filename, "w");
+   if (f == NULL) /* device was not bound */
+   return 0;
+
+   if (fwrite(devid, strlen(devid), 1, f) == 0) {
+   RTE_LOG(ERR, EAL, "%s(): could not write to %s\n", __func__,
+   filename);
+   goto error;
+   }
+
+   fclose(f);
+   return 0;
+error:
+   fclose(f);
+   return -1;
+}
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index cd9de7c..4792d05 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -59,38 +59,23 @@ int
 pci_unbind_kernel_driver(struct rte_pci_device *dev)
 {
int n;
-   FILE *f;
-   char filename[PATH_MAX];
-   char buf[BUFSIZ];
+   char devpath[PATH_MAX];
+   char devid[BUFSIZ];
struct rte_pci_addr *loc = &dev->addr;

-   /* open /sys/bus/pci/devices/:BB:CC.D/driver */
-   snprintf(filename, sizeof(filename),
-   "%s/" PCI_PRI_FMT "/driver/unbind", pci_get_sysfs_path(),
+   /* devpath /sys/bus/pci/devices/:BB:CC.D */
+   snprintf(devpath, sizeof(devpath),
+   "%s/" PCI_PRI_FMT, pci_get_sysfs_path(),
loc->domain, loc->bus, loc->devid, loc->function);

-   f = fopen(filename, "w");
-   if (f == NULL) /* device was not bound */
-   return 0;
-
-   n = snprintf(buf, sizeof(buf), PCI_PRI_FMT "\n",
+   n = snprintf(devid, sizeof(devid), PCI_PRI_FMT "\n",
 loc->domain, loc->bus, loc->devid, loc->function);
-   if ((n < 0) || (n >= (int)sizeof(buf))) {
+   if ((n < 0) || (n >= (int)sizeof(devid))) {
RTE_LOG(ERR, EAL, "%s(): snprintf failed\n", __func__);
-   goto error;
-   }
-   if (fwrite(buf, n, 1, f) == 0) {
-   RTE_LOG(ERR, EAL, "%s(): could not write to %s\n", __func__,
-   filename);
-   goto error;
+   return -1;
}

-   fclose(f);
-   return 0;
-
-error:
-   fclose(f);
-   return -1;
+   return rte_eal_unbind_kernel_driver(devpath, devid);
 }

 static int
-- 
2.7.4



[dpdk-dev] [PATCH] vhost: add pmd xstats

2016-09-01 Thread Yang, Zhiyong
Hi, all:

> -Original Message-
> From: Yang, Zhiyong
> Sent: Wednesday, August 31, 2016 3:19 PM
> To: dev at dpdk.org
> Cc: Panu Matilainen ; Liu, Yuanhan
> ; Thomas Monjalon
> ; Yao, Lei A ; Yang,
> Zhiyong ; Wang, Zhihong
> 
> Subject: RE: [dpdk-dev] [PATCH] vhost: add pmd xstats
> 
> Hi, ALL:
> 
> Physical NIC has a set of counters, such as
> u64 prc64;
> u64 prc127;
> u64 prc255; etc.
> but currently DPDK counts prc64 in two ways. A physical NIC counts prc64 in
> hardware, including the CRC. Virtio computes the equivalent counter without
> the CRC. This causes a conflict: when a 64-byte packet from the outside
> network is sent to a VM (virtio), the NIC shows prc64 + 1, but virtio actually
> receives a 64 - 4 (CRC) = 60-byte packet, so the undersize (<64) counter is
> increased. Should vhost follow the NIC's behavior or virtio's behavior?
> 
> The rfc2819 description, for reference:
> etherStatsPkts64Octets OBJECT-TYPE
> SYNTAX Counter32
> UNITS "Packets"
> "The total number of packets (including bad packets) received that were
> 64 octets in length (excluding framing bits but including FCS octets)."
>
I asked about this requirement on dev at openvswitch.com, and Jesse Gross
replied as follows:
All other stats in OVS (including flows and other counters that don't come
from hardware) count bytes without the CRC. Presumably it would be best to
adjust the physical NIC stats with DPDK to do the same.
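
To make the counting conflict discussed above concrete, here is a minimal sketch in C. The helper names (strip_fcs, size_bin) and the exact bin boundaries are assumptions chosen for illustration, loosely following the size_* counter names used in the virtio xstats; this is not code from any DPDK driver.

```c
#include <assert.h>
#include <stdint.h>

#define FCS_LEN 4 /* Ethernet frame check sequence (CRC) length in bytes */

/* Hypothetical helper: length of a frame once the FCS has been removed. */
static uint32_t strip_fcs(uint32_t wire_len)
{
	assert(wire_len >= FCS_LEN);
	return wire_len - FCS_LEN;
}

/* Hypothetical helper: map a packet length to a size-bin index.
 * Assumed layout: 0: <64, 1: 64, 2: 65-127, 3: 128-255,
 * 4: 256-511, 5: 512-1023, 6: 1024-1518, 7: 1519 and above. */
static unsigned int size_bin(uint32_t len)
{
	if (len < 64)
		return 0;
	if (len == 64)
		return 1;
	if (len <= 127)
		return 2;
	if (len <= 255)
		return 3;
	if (len <= 511)
		return 4;
	if (len <= 1023)
		return 5;
	if (len <= 1518)
		return 6;
	return 7;
}
```

With these definitions, size_bin(64) lands in the "64" bin, while size_bin(strip_fcs(64)) lands in the undersize bin, which is the mismatch between the NIC and virtio counters described in the quoted mail.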


[dpdk-dev] [PATCH] virtio: xstats name issue

2016-09-01 Thread Zhiyong Yang
This patch fixes some xstats name issues and makes the xstats names conform to
the code implementation (the function virtio_update_packet_stats).

Signed-off-by: Zhiyong Yang 
---
 drivers/net/virtio/virtio_ethdev.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 07d6449..4cee067 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -125,8 +125,8 @@ static const struct rte_virtio_xstats_name_off rte_virtio_rxq_stat_strings[] = {
{"size_128_255_packets",   offsetof(struct virtnet_rx, stats.size_bins[3])},
{"size_256_511_packets",   offsetof(struct virtnet_rx, stats.size_bins[4])},
{"size_512_1023_packets",  offsetof(struct virtnet_rx, stats.size_bins[5])},
-   {"size_1024_1517_packets", offsetof(struct virtnet_rx, stats.size_bins[6])},
-   {"size_1518_max_packets",  offsetof(struct virtnet_rx, stats.size_bins[7])},
+   {"size_1024_1518_packets", offsetof(struct virtnet_rx, stats.size_bins[6])},
+   {"size_1519_max_packets",  offsetof(struct virtnet_rx, stats.size_bins[7])},
 };

 /* [rt]x_qX_ is prepended to the name string here */
@@ -142,8 +142,8 @@ static const struct rte_virtio_xstats_name_off rte_virtio_txq_stat_strings[] = {
{"size_128_255_packets",   offsetof(struct virtnet_tx, stats.size_bins[3])},
{"size_256_511_packets",   offsetof(struct virtnet_tx, stats.size_bins[4])},
{"size_512_1023_packets",  offsetof(struct virtnet_tx, stats.size_bins[5])},
-   {"size_1024_1517_packets", offsetof(struct virtnet_tx, stats.size_bins[6])},
-   {"size_1518_max_packets",  offsetof(struct virtnet_tx, stats.size_bins[7])},
+   {"size_1024_1518_packets", offsetof(struct virtnet_tx, stats.size_bins[6])},
+   {"size_1519_max_packets",  offsetof(struct virtnet_tx, stats.size_bins[7])},
 };

 #define VIRTIO_NB_RXQ_XSTATS (sizeof(rte_virtio_rxq_stat_strings) / \
-- 
2.5.5



[dpdk-dev] [PATCH 2/5] eal: extract function eal_parse_sysfs_valuef

2016-09-01 Thread Shreyansh Jain
Hi Stephen,

On Thursday 01 September 2016 12:00 PM, Stephen Hemminger wrote:
> On Thu, 1 Sep 2016 10:11:52 +0530
> Shreyansh Jain  wrote:
>
>> From: Jan Viktorin 
>>
>> The eal_parse_sysfs_value function accepts a filename however, such
>> interface introduces race-conditions to the code. Introduce the
>> variant of this function that accepts an already opened file instead of
>> a filename.
>>
>> Signed-off-by: Jan Viktorin 
>> Signed-off-by: Shreyansh Jain 
>> ---
>
> You introduce new API, but don't use it in your other patches.

Indeed - I have not used this API anywhere in my patches. As highlighted 
in the cover letter, these are some proposed changes which *might* 
help in future (e.g., when introducing new SoC infra).
The patches don't depend on each other.

> I don't see where passing filename is racy. sysfs files only get 
> created/destroyed
> when device is added/removed.
>

Agree that 'race-condition' is not the right word.

eal_parse_sysfs_value reads a single integer from a given sysfs file and is 
a useful helper for the EAL layer, specifically for non-PCI infra as and 
when it is introduced (reading through custom sysfs files).
At that time, a caller may want to keep the file open across calls 
rather than reopen it every time - for reading more than one 
integer, for example.
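
As a rough illustration of that use case, a stream-based variant could look like the sketch below. This is only an assumption of what such a helper might do, not the function from the patch; the name parse_value_from_stream and its error convention are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch: read one unsigned integer from an already opened
 * stream, one value per line. Returns 0 on success, -1 on error. */
static int parse_value_from_stream(FILE *f, unsigned long *val)
{
	char buf[64];
	char *end = NULL;

	assert(f != NULL && val != NULL);

	if (fgets(buf, sizeof(buf), f) == NULL)
		return -1; /* end of stream or read error */

	errno = 0;
	*val = strtoul(buf, &end, 0); /* base 0 accepts decimal, hex and octal */
	if (errno != 0 || end == buf || (*end != '\n' && *end != '\0'))
		return -1; /* the line is not a clean integer */

	return 0;
}
```

Because the caller owns the FILE pointer, it can call the helper repeatedly on the same stream to read several values without reopening the file.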

-
Shreyansh


[dpdk-dev] OpenSSL and non-BSD licenses in DPDK

2016-09-01 Thread Remy Horton
Morning,

On 31/08/2016 18:26, Bodek, Zbigniew wrote:
[..]
> I would like to ask a question regarding code licensing and importing code 
> that
> uses different license than BSD-like. Especially I'm curious about the code
> that goes with OpenSSL/SSLeay license. Is it allowed to import such sources
> (or derived work) to DPDK? I've seen some GPL stuff, mostly kernel modules 
> from
> Intel. So what with the above mentioned OpenSSL?

It is not my call to make, but my guess is the answer will almost 
certainly be "No". DPDK makes its way into commercial products, and the 
companies concerned (well, their lawyers actually..) most likely won't 
like the prospect of extra licencing conditions slipping in.

..Remy


[dpdk-dev] [PATCH] examples/qos_sched: fix packets dequeue operation from ring

2016-09-01 Thread Jasvinder Singh
The app_worker_thread() and app_mixed_thread() use rte_ring_sc_dequeue_bulk
to dequeue packets from the ring. The bulk variant dequeues nothing until the
software ring holds at least the requested number of packets, so packets wait
in the ring and incur extra latency. Therefore, rte_ring_sc_dequeue_bulk is
replaced with rte_ring_sc_dequeue_burst.

Fixes: de3cfa2c9823 ("sched: initial import")

Suggested-by: Yang, Tao Y 
Signed-off-by: Jasvinder Singh 
---
 examples/qos_sched/app_thread.c | 22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/examples/qos_sched/app_thread.c b/examples/qos_sched/app_thread.c
index 3c678cc..70fdcdb 100644
--- a/examples/qos_sched/app_thread.c
+++ b/examples/qos_sched/app_thread.c
@@ -215,17 +215,16 @@ app_worker_thread(struct thread_conf **confs)

while ((conf = confs[conf_idx])) {
uint32_t nb_pkt;
-   int retval;

/* Read packet from the ring */
-   retval = rte_ring_sc_dequeue_bulk(conf->rx_ring, (void **)mbufs,
+   nb_pkt = rte_ring_sc_dequeue_burst(conf->rx_ring, (void **)mbufs,
burst_conf.ring_burst);
-   if (likely(retval == 0)) {
+   if (likely(nb_pkt)) {
int nb_sent = rte_sched_port_enqueue(conf->sched_port, mbufs,
-   burst_conf.ring_burst);
+   nb_pkt);

-   APP_STATS_ADD(conf->stat.nb_drop, burst_conf.ring_burst - nb_sent);
-   APP_STATS_ADD(conf->stat.nb_rx, burst_conf.ring_burst);
+   APP_STATS_ADD(conf->stat.nb_drop, nb_pkt - nb_sent);
+   APP_STATS_ADD(conf->stat.nb_rx, nb_pkt);
}

nb_pkt = rte_sched_port_dequeue(conf->sched_port, mbufs,
@@ -250,17 +249,16 @@ app_mixed_thread(struct thread_conf **confs)

while ((conf = confs[conf_idx])) {
uint32_t nb_pkt;
-   int retval;

/* Read packet from the ring */
-   retval = rte_ring_sc_dequeue_bulk(conf->rx_ring, (void **)mbufs,
+   nb_pkt = rte_ring_sc_dequeue_burst(conf->rx_ring, (void **)mbufs,
burst_conf.ring_burst);
-   if (likely(retval == 0)) {
+   if (likely(nb_pkt)) {
int nb_sent = rte_sched_port_enqueue(conf->sched_port, mbufs,
-   burst_conf.ring_burst);
+   nb_pkt);

-   APP_STATS_ADD(conf->stat.nb_drop, burst_conf.ring_burst - nb_sent);
-   APP_STATS_ADD(conf->stat.nb_rx, burst_conf.ring_burst);
+   APP_STATS_ADD(conf->stat.nb_drop, nb_pkt - nb_sent);
+   APP_STATS_ADD(conf->stat.nb_rx, nb_pkt);
}


-- 
2.5.5
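
The difference between the two dequeue styles in the patch above can be sketched with a toy FIFO. This is not the rte_ring implementation; the toy_* names and the ring size are assumptions made for illustration. The bulk return convention (0 on success) mirrors the `retval == 0` check that the patch removes.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_RING_SIZE 8 /* capacity of the toy ring, chosen arbitrarily */

/* Toy single-consumer FIFO; NOT the rte_ring implementation. */
struct toy_ring {
	void *slots[TOY_RING_SIZE];
	size_t head;  /* index of the next element to dequeue */
	size_t count; /* number of elements currently stored */
};

/* Enqueue one element; returns 0 on success, -1 if the ring is full. */
static int toy_enqueue(struct toy_ring *r, void *obj)
{
	assert(r != NULL);
	if (r->count == TOY_RING_SIZE)
		return -1;
	r->slots[(r->head + r->count) % TOY_RING_SIZE] = obj;
	r->count++;
	return 0;
}

/* Burst semantics: dequeue up to n elements, return how many were taken. */
static size_t toy_dequeue_burst(struct toy_ring *r, void **objs, size_t n)
{
	size_t i, taken = n < r->count ? n : r->count;

	for (i = 0; i < taken; i++)
		objs[i] = r->slots[(r->head + i) % TOY_RING_SIZE];
	r->head = (r->head + taken) % TOY_RING_SIZE;
	r->count -= taken;
	return taken;
}

/* Bulk semantics: all n elements or nothing; 0 on success, -1 otherwise. */
static int toy_dequeue_bulk(struct toy_ring *r, void **objs, size_t n)
{
	if (r->count < n)
		return -1; /* too few elements: leave them all waiting */
	(void)toy_dequeue_burst(r, objs, n);
	return 0;
}
```

With three packets queued and a request for four, the bulk call takes nothing and the packets keep waiting, while the burst call returns the three that are available.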



[dpdk-dev] eal : rte_rdtsc is wrong on some cpu

2016-09-01 Thread lveny...@1218.com.cn
Hi!

rte_rdtsc returns wrong values on some CPUs.
When running on an Intel(R) Xeon(R) CPU E5-2640 v2 @ 2.00GHz, it is OK. But on an
Intel(R) Xeon(R) CPU E5-4610 v2 @ 2.30GHz, it sometimes returns a very large value.

Here is my test using gdb. It can jump from 26460438829980939 to
2840228530541503 in less than one second!

Breakpoint 1,  rte_rdtsc() at 
dpdk-2.2.0/lib/librte_eal/common/include/arch/x86/rte_cycles.h:104
$184 = {tsc_64 = 0x5e01a008be027f, {lo_32 = 0x8be027f, hi_32 = 0x5e01a0}}
$185 = {tsc_64 = 26460434663867007, {lo_32 = 146670207, hi_32 = 6160800}}
(gdb) 
Continuing.

Breakpoint 2, rte_rdtsc() at 
dpdk-2.2.0/lib/librte_eal/common/include/arch/x86/rte_cycles.h:104
32 in /home/yangqiang/gajet/gajet_branch/2.0/code/src/nsdpf/tsc_time.h
$186 = {tsc_64 = 0x5e01a1010fdd0b, {lo_32 = 0x10fdd0b, hi_32 = 0x5e01a1}}
$187 = {tsc_64 = 26460438829980939, {lo_32 = 17816843, hi_32 = 6160801}}
(gdb) 
Continuing.

Breakpoint 2, rte_rdtsc() at 
dpdk-2.2.0/lib/librte_eal/common/include/arch/x86/rte_cycles.h:104
32 in /home/yangqiang/gajet/gajet_branch/2.0/code/src/nsdpf/tsc_time.h
$188 = {tsc_64 = 0xa172c3ca4d3bf, {lo_32 = 0x3ca4d3bf, hi_32 = 0xa172c}}
$189 = {tsc_64 = 2840228530541503, {lo_32 = 1017435071, hi_32 = 661292}}
(gdb) 
Continuing.

Breakpoint 2, tsc_time () at 
/home/yangqiang/gajet/gajet_branch/2.0/code/src/nsdpf/tsc_time.h:32
32 in /home/yangqiang/gajet/gajet_branch/2.0/code/src/nsdpf/tsc_time.h
$190 = {tsc_64 = 0xa172c7cd5f131, {lo_32 = 0x7cd5f131, hi_32 = 0xa172c}}
$191 = {tsc_64 = 2840229607502129, {lo_32 = 2094395697, hi_32 = 661292}} 



lvenyong at 1218.com.cn
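
For readers checking the numbers in the gdb output above, the following sketch rebuilds the 64-bit values from the hi_32/lo_32 halves and computes the signed difference between two samples. The helper names are assumptions for illustration; this is not the rte_rdtsc code, and it says nothing about the cause of the jump.

```c
#include <assert.h>
#include <stdint.h>

/* Rebuild the 64-bit counter from the two 32-bit halves shown by gdb. */
static uint64_t tsc_combine(uint32_t hi_32, uint32_t lo_32)
{
	return ((uint64_t)hi_32 << 32) | lo_32;
}

/* Signed difference between two samples; a negative result means the
 * later sample is smaller than the earlier one. */
static int64_t tsc_delta(uint64_t earlier, uint64_t later)
{
	/* Both samples in this report are far below 2^63. */
	assert(earlier <= INT64_MAX && later <= INT64_MAX);
	return (int64_t)later - (int64_t)earlier;
}
```

Applied to the samples $187 (hi_32 = 0x5e01a1, lo_32 = 0x10fdd0b) and $189 (hi_32 = 0xa172c, lo_32 = 0x3ca4d3bf), the delta is negative: the second reading is roughly 2.36 x 10^16 cycles smaller than the first.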


[dpdk-dev] [PATCH] crypto/qat: fix memzone creation to use a fixed size string

2016-09-01 Thread John Griffin
Remove the dependency on dev->driver->pci_drv.name when
creating the memzone for the qat hardware queues.
The pci_drv.name may grow too large for RTE_MEMZONE_NAMESIZE.

Fixes: 1703e94ac5ce ("qat: add driver for QuickAssist devices")

Signed-off-by: John Griffin 
---
 drivers/crypto/qat/qat_qp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/qat/qat_qp.c b/drivers/crypto/qat/qat_qp.c
index 5de47e3..a29ed66 100644
--- a/drivers/crypto/qat/qat_qp.c
+++ b/drivers/crypto/qat/qat_qp.c
@@ -300,7 +300,7 @@ qat_queue_create(struct rte_cryptodev *dev, struct 
qat_queue *queue,
 * Allocate a memzone for the queue - create a unique name.
 */
snprintf(queue->memz_name, sizeof(queue->memz_name), "%s_%s_%d_%d_%d",
-   dev->driver->pci_drv.name, "qp_mem", dev->data->dev_id,
+   "qat_pmd", "qp_mem", dev->data->dev_id,
queue->hw_bundle_number, queue->hw_queue_number);
qp_mz = queue_dma_zone_reserve(queue->memz_name, queue_size_bytes,
socket_id);
-- 
2.1.0
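
The truncation risk that this patch avoids can be sketched as follows. NAME_SIZE stands in for RTE_MEMZONE_NAMESIZE and its value of 32 is an assumption here; build_queue_name is a hypothetical helper, not part of the QAT driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define NAME_SIZE 32 /* stand-in for RTE_MEMZONE_NAMESIZE (value assumed) */

/* Hypothetical helper: format a queue memzone name and report truncation.
 * Returns 0 if the full name fits in buf, -1 otherwise. */
static int build_queue_name(char *buf, size_t size, const char *prefix,
			    int dev_id, int bundle, int queue)
{
	int n;

	assert(buf != NULL && prefix != NULL);

	n = snprintf(buf, size, "%s_%s_%d_%d_%d",
		     prefix, "qp_mem", dev_id, bundle, queue);
	/* snprintf returns the length it wanted to write, so a value
	 * greater than or equal to size means the name was cut short. */
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

A short fixed prefix such as "qat_pmd" leaves room for the numeric suffix, whereas a long driver name can push the result past the buffer size, and two truncated names may then no longer be unique.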



[dpdk-dev] [PATCH v1] dpdk-devbind.py: Virtio interface issue.

2016-09-01 Thread Mcnamara, John
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Mussar, Gary
> Sent: Monday, August 29, 2016 4:10 PM
> To: Dey, Souvik ; Stephen Hemminger
> 
> Cc: nhorman at tuxdriver.com; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v1] dpdk-devbind.py: Virtio interface
> issue.
> 
> We did this slightly differently. This is 100% python and is a bit more
> general. We search for the first "net" directory under the specific device
> directory.
> 
> ---
> --- tools/dpdk-devbind.py   2016-08-29 11:02:35.594202888 -0400
> +++ ../dpdk/tools/dpdk-devbind.py 2016-08-29 11:00:34.897677233 -0400
> @@ -221,11 +221,11 @@
>  name = name.strip(":") + "_str"
>  device[name] = value
>  # check for a unix interface name
> -sys_path = "/sys/bus/pci/devices/%s/net/" % dev_id
> -if exists(sys_path):
> -device["Interface"] = ",".join(os.listdir(sys_path))
> -else:
> -device["Interface"] = ""
> +device["Interface"] = ""
> +for base, dirs, files in os.walk("/sys/bus/pci/devices/%s/" %
> dev_id):
> +if "net" in dirs:
> +device["Interface"] =
> ",".join(os.listdir(os.path.join(base,"net")))
> +break
>  # check if a port is used for ssh connection
>  device["Ssh_if"] = False
>  device["Active"] = ""
> ---

Hi Gary,

That looks like a cleaner solution. Could you submit that as a patch?

Souvik, could you test this patch and confirm it fixes your issue?


Gary, if you submit a patch, could you make a few minor changes:

> +device["Interface"] = ""
> +for base, dirs, files in os.walk("/sys/bus/pci/devices/%s/" % dev_id):
> +

If "files" is unused, and it looks like it is, then replace it with "_".


> +device["Interface"] = 
> ",".join(os.listdir(os.path.join(base,"net")))

There is a space required after "," for PEP8 compliance.

John





[dpdk-dev] [PATCH v8 00/25] Introducing rte_driver/rte_device generalization

2016-09-01 Thread Shreyansh Jain
Hi Ferruh,

Sorry for the delay in my reply.
Please find some comments inline.

On Tuesday 30 August 2016 06:57 PM, Ferruh Yigit wrote:
> On 8/26/2016 2:56 PM, Shreyansh Jain wrote:
>> Based on master (e22856313fff2)
>>
>> Background:
>> ===
>>
>> It includes two different patch-sets floated on ML earlier:
>>  * Original patch series is from David Marchand [1], [2].
>>   `- This focused mainly on PCI (PDEV) part
>>   `- v7 of this was posted by me [8] in August/2016
>>  * Patch series [4] from Jan Viktorin
>>   `- This focused on VDEV and rte_device integration
>>
>> Introduction:
>> =
>>
>> This patch series introduces a generic device model, moving away from PCI
>> centric code layout. Key change is to introduce rte_driver/rte_device
>> structures at the top level which are inherited by
>> rte_XXX_driver/rte_XXX_device - where XXX belongs to {pci, vdev, soc (in
>> future),...}.
>>
>> Key motivation for this series is to move away from PCI centric design of
>> EAL to a more hierarchical device model - pivoted around a generic driver
>> and device. Each specific driver and device can inherit the common
>> properties of the generic set and build upon it through driver/device
>> specific functions.
>>
>> Earlier, the EAL device initialization model was:
>> (Refer: [3])
>>
>> --
>>  Constructor:
>>   |- PMD_DRIVER_REGISTER(rte_driver)
>>  `-  insert into dev_driver_list, rte_driver object
>>
>>  rte_eal_init():
>>   |- rte_eal_pci_init()
>>   |  `- scan and fill pci_device_list from sysfs
>>   |
>>   |- rte_eal_dev_init()
>>   |  `- For each rte_driver in dev_driver_list
>>   | `- call the rte_driver->init() function
>>   ||- PMDs designed to call rte_eth_driver_register(eth_driver)
>>   ||- eth_driver have rte_pci_driver embedded in them
>>   |`- rte_eth_driver_register installs the
>>   |   rte_pci_driver->devinit/devuninit callbacks.
>>   |
>>   |- rte_eal_pci_probe()
>>   |  |- For each device detected, dev_driver_list is parsed and matching is
>>   |  |  done.
>>   |  |- For each matching device, the rte_pci_driver->devinit() is called.
>>   |  |- Default map is to rte_eth_dev_init() which in turn creates a
>>   |  |  new ethernet device (eth_dev)
>>   |  |  `- eth_drv->eth_dev_init() is called which is implemented by
>>   `--|individual PMD drivers.
>>
>> --
>>
>> The structure of driver looks something like:
>>
>>  ++ ._.
>>  | rte_driver <-| PMD |___
>>  |  .init | `-`   \
>>  +.---+  | \
>>   `-.| What PMD actually is
>>  \   |  |
>>   +--v+ |
>>   | eth_driver| |
>>   | .eth_dev_init | |
>>   +.--+ |
>>`-.  |
>>   \ |
>>+v---+
>>| rte_pci_driver |
>>| .pci_devinit   |
>>++
>>
>>   and all devices are part of a following linked lists:
>> - dev_driver_list for all rte_drivers
>> - pci_device_list for all devices, whether PCI or VDEV
>>
>>
>> From the above:
>>  * a PMD initializes a rte_driver, eth_driver even though actually it is a
>>pci_driver
>>  * initialization routines are passed from rte_driver->pci_driver->eth_driver
>>even though they should ideally be rte_eal_init()->rte_pci_driver()
>>  * For a single driver/device type model, this is not necessarily a
>>functional issue - but more of a design language.
>>  * But, when number of driver/device type increase, this would create problem
>>in how driver<=>device links are represented.
>>
>> Proposed Architecture:
>> ==
>>
>> A nice representation has already been created by David in [3]. Copying that
>> here:
>>
>> +--+ +---+
>> |  | |   |
>> | rte_pci_device   | | rte_pci_driver|
>> |  | |   |
>> +-+ | +--+ | | +---+ |
>> | | | |  | | | |   | |
>> | rte_eth_dev +---> rte_device   +-> rte_driver| |
>> | | | |  char name[] | | | |  char name[]  | |
>> +-+ | |  | | | |  int init(rte_device *)   | |
>> | +--+ | | |  int uninit(rte_device *) | |
>> |  | | |   | |
>> +--+ | +---+ |
>>  |   |
>>  +---+
>>
>> - for ethdev on top of vdev devices
>>
>> +-

[dpdk-dev] [dpdk-dev, RFC] drivers: advertise kmod dependencies in pmdinfo

2016-09-01 Thread Trahe, Fiona
Hi Neil and Olivier,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Olivier Matz
> Sent: Wednesday, August 31, 2016 2:40 PM
> To: Neil Horman 
> Cc: dev at dpdk.org; thomas.monjalon at 6wind.com
> Subject: Re: [dpdk-dev] [dpdk-dev, RFC] drivers: advertise kmod dependencies
> in pmdinfo
> 
> Hi Neil,
> 
> On 08/31/2016 03:27 PM, Neil Horman wrote:
> > On Wed, Aug 31, 2016 at 11:21:18AM +0200, Olivier Matz wrote:
> >> Hi Neil,
> >>
> >> On 08/30/2016 03:23 PM, Neil Horman wrote:
> >>> On Fri, Aug 26, 2016 at 03:20:46PM +0200, Olivier Matz wrote:
>  Add a new macro DRIVER_REGISTER_KMOD_DEP() that allows a driver to
>  declare the list of kernel modules required to run properly.
> 
>  Today, most PCI drivers require uio/vfio.
> 
>  Signed-off-by: Olivier Matz 
> 
>  ---
>  In this RFC, I supposed that all PCI drivers require a the loading of a
>  uio/vfio module (except mlx*), this may be wrong.
>  Comments are welcome!
> 
> 
>   buildtools/pmdinfogen/pmdinfogen.c  |  1 +
>   buildtools/pmdinfogen/pmdinfogen.h  |  1 +
>   drivers/crypto/qat/rte_qat_cryptodev.c  |  2 ++
>   drivers/net/bnx2x/bnx2x_ethdev.c|  4 
>   drivers/net/bnxt/bnxt_ethdev.c  |  2 ++
>   drivers/net/cxgbe/cxgbe_ethdev.c|  2 ++
>   drivers/net/e1000/em_ethdev.c   |  2 ++
>   drivers/net/e1000/igb_ethdev.c  |  4 
>   drivers/net/ena/ena_ethdev.c|  2 ++
>   drivers/net/enic/enic_ethdev.c  |  2 ++
>   drivers/net/fm10k/fm10k_ethdev.c|  2 ++
>   drivers/net/i40e/i40e_ethdev.c  |  2 ++
>   drivers/net/i40e/i40e_ethdev_vf.c   |  2 ++
>   drivers/net/ixgbe/ixgbe_ethdev.c|  4 
>   drivers/net/mlx4/mlx4.c |  2 ++
>   drivers/net/mlx5/mlx5.c |  3 +++
>   drivers/net/nfp/nfp_net.c   |  2 ++
>   drivers/net/qede/qede_ethdev.c  |  4 
>   drivers/net/szedata2/rte_eth_szedata2.c |  2 ++
>   drivers/net/thunderx/nicvf_ethdev.c |  2 ++
>   drivers/net/virtio/virtio_ethdev.c  |  2 ++
>   drivers/net/vmxnet3/vmxnet3_ethdev.c|  2 ++
>   lib/librte_eal/common/include/rte_dev.h | 14 ++
>   tools/dpdk-pmdinfo.py   |  5 -
>   24 files changed, 69 insertions(+), 1 deletion(-)
> 
> >>>
> >>> Generally speaking, I like the idea, it makes sense to me in terms of 
> >>> using
> >>> pmdinfo to export this information
> >>>
> >>> That said, This may need to be a set of macros.  By that I mean (and 
> >>> correct
> me
> >>> if I'm wrong here), but the relationship between pmd's and kernel modules
> is in
> >>> some cases, more complex than a 'requires' or 'depends' relationship.  
> >>> That
> is
> >>> to say, some pmd may need user space hardware access, but can use either
> uio OR
> >>> vfio, but doesn't need both, and can continue to function if only one is
> >>> available.  Other PMD's may be able to use vfio or uio, but can still 
> >>> function
> >>> without either.  And some, as your patch implements, simply require one or
> the
> >>> other to function.  As such it seems like you may want a few macros, in 
> >>> the
> form
> >>> of:
> >>>
> >>> DRIVER_REGISTER_KMOD_REQUEST - List of modules to attempt loading,
> ignore any
> >>> failures
> >>> DRIVER_REGISTER_KMOD_REQUIRE - List of modules required to be
> loaded after
> >>> request macro completes, fail if any are not loaded
> >>>
> >>> Thats just spitballing, mind you, theres probably a better way to do it, 
> >>> but
> the
> >>> idea is to list a set of modules you would like to have, and then create a
> >>> parsable syntax to describe the modules that need to be loaded after the
> request
> >>> is complete so that you can accurately codify the situations I described
> above.
> >>
> >> Thank you for your feedback.
> >> However, I'm not sure I'm perfectly getting what you suggest.
> >>
> >> Do you think some PMDs could request a kernel module without really
> >> requiring it? Do you have an example in mind?
> >>
> > Yes, thats precisely it.  The most clear example I could think of (though 
> > I'm
> > not sure if any pmd currently supports this), is a pmd that supports both 
> > UIO
> > and VFIO communication with the kernel.  Such a PMD requires that one of
> those
> > two modules be loaded, but only one (i.e. both are not required), so if only
> the
> > uio kernel module loads is a success case, likewise if only the vfio module
> > loads can be treated as success.  Both loading are clearly successful.  
> > Only if
> > neither load do we have a failure case.  I'm suggesting that the grammer 
> > that
> > your exports define should take those cases into account.  Its not always as
> > simple as "I must have the following modules"
> >
> >> The syntax I've submitted lets you define several lists of modules, so
> >> th
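
The either-or case Neil describes in the quoted text (a PMD that needs uio or vfio, but not both) can be sketched as a check over a group of alternatives. The function names and the module names used as examples are assumptions for illustration; nothing here is part of the proposed pmdinfo syntax.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if name appears in the list of loaded modules, else 0. */
static int is_loaded(const char *name, const char *const *loaded,
		     size_t n_loaded)
{
	size_t i;

	for (i = 0; i < n_loaded; i++)
		if (strcmp(name, loaded[i]) == 0)
			return 1;
	return 0;
}

/* A group of alternatives is satisfied when at least one member is loaded. */
static int any_of_loaded(const char *const *alts, size_t n_alts,
			 const char *const *loaded, size_t n_loaded)
{
	size_t i;

	assert(alts != NULL && n_alts > 0);

	for (i = 0; i < n_alts; i++)
		if (is_loaded(alts[i], loaded, n_loaded))
			return 1;
	return 0;
}
```

A strict "require all" list would instead loop over the group and fail on the first missing module, which is the distinction between the two cases raised in the thread.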

[dpdk-dev] [dpdk-users] ixgbe drop all the packet

2016-09-01 Thread wei wang
my environment:
NIC: X540
DPDK version: 2.2.0

problem:
When the DPDK app is started while traffic is still flowing, the app can't
receive any packets from the DPDK ixgbe driver (about 12,000 pps of UDP packets).
This happens in 1 to 2 out of 5 test runs.

The stats of the nic (ipackets==3230 and imissed == ierrors):
ipackets 3230 imissed 9780225 ierrors 9780225
ipackets 3230 imissed 9792154 ierrors 9792154
ipackets 3230 imissed 9804310 ierrors 9804310
ipackets 3230 imissed 9816177 ierrors 9816177
ipackets 3230 imissed 9828042 ierrors 9828042
ipackets 3230 imissed 9839694 ierrors 9839694
ipackets 3230 imissed 9851412 ierrors 9851412
ipackets 3230 imissed 9863134 ierrors 9863134
ipackets 3230 imissed 9874722 ierrors 9874722
ipackets 3230 imissed 9886776 ierrors 9886776
ipackets 3230 imissed 9898616 ierrors 9898616
ipackets 3230 imissed 9910648 ierrors 9910648
ipackets 3230 imissed 9922513 ierrors 9922513

The value of register RDH:1023
The value of register RDT:959

Here is the descriptor value log for one queue; the format of the log is
[desc_idx](DD value, vlan, length, status_error):
[959]:(0,0, 0, 0) [960]:(1,0, 2048, 1342308425) [961]:(1,0, 2048,
1342308425) [962]:(1,0, 1031, 1342308427)
[963]:(1,0, 0, 3) [964]:(1,0, 0,   3)
[965]:(1,0, 2048, 1342308425) [966]:(1,0, 2048, 1342308425)
[967]:(1,0, 1031, 1342308427) [968]:(1,0, 0, 3) [969]:(1,0, 0, 3)
[970]:(1,0, 2048, 1342308425)
[971]:(1,0, 2048, 1342308425) [972]:(1,0, 1031, 1342308427)
[973]:(1,0, 0, 3) [974]:(1,0, 0, 3)
[975]:(1,0, 2048, 1342308425) [976]:(1,0, 2048, 1342308425)
[977]:(1,0, 1031, 1342308427) [978]:(1,0, 0, 3)
[979]:(1,0, 0, 3) [980]:(1,0, 2048, 1342308425) [981]:(1,0, 2048,
1342308425) [982]:(1,0, 1031, 1342308427)
[983]:(1,0, 0, 3) [984]:(1,0, 0, 3) [985]:(1,0, 2048, 1342308425)
[986]:(1,0, 2048, 1342308425)
[987]:(1,0, 1031, 1342308427) [988]:(1,0, 0, 3) [989]:(1,0, 0, 3)
[990]:(1,0, 2048, 1342308425)
[991]:(1,0, 2048, 1342308425) [992]:(1,0, 1031, 1342308427)
[993]:(1,0, 0, 3) [994]:(1,0, 0, 3)
[995]:(1,0, 2048, 1342308425) [996]:(1,0, 2048, 1342308425)
[997]:(1,0, 1031, 1342308427) [998]:(1,0, 0, 3)
[999]:(1,0, 0, 3) [1000]:(1,0, 2048, 1342308425) [1001]:(1,0, 2048,
1342308425) [1002]:(1,0, 1031, 1342308427)
[1003]:(1,0, 0, 3) [1004]:(1,0, 0, 3) [1005]:(1,0, 2048, 1342308425)
[1006]:(1,0, 2048, 1342308425)
[1007]:(1,0, 1031, 1342308427) [1008]:(1,0, 0, 3) [1009]:(1,0, 0, 3)
[1010]:(1,0, 2048, 1342308425)
[1011]:(1,0, 2048, 1342308425) [1012]:(1,0, 1031, 1342308427)
[1013]:(1,0, 0, 3)[1014]:(1,0, 0, 3)
[1015]:(1,0, 2048, 1342308425) [1016]:(1,0, 2048, 1342308425)
[1017]:(1,0, 1031, 1342308427) [1018]:(1,0, 0, 3)
 [1019]:(1,0, 0, 3) [1020]:(1,0, 2048, 1342308425) [1021]:(1,0, 2048,
1342308425)  [1022]:(1,0, 1031, 1342308427) [1023]:(0,0, 0, 0)

All other descriptors in the ring are zero.

Is this a bug?

PS: it works normally with the Linux ixgbe driver.


[dpdk-dev] [PATCH v8 22/25] eal/pci: inherit rte_driver by rte_pci_driver

2016-09-01 Thread Shreyansh Jain
Hi,

On Tuesday 30 August 2016 09:17 PM, Ferruh Yigit wrote:
> On 8/26/2016 2:57 PM, Shreyansh Jain wrote:
>> Remove the 'name' member from rte_pci_driver and move to generic rte_driver.
>>
>> Most of the PMD drivers were initially using DRIVER_REGISTER_PCI(..)
>> as well as assigning a name to eth_driver.pci_drv.name member.
>> In this patch, only the original DRIVER_REGISTER_PCI(..) name has been
>> populated into the rte_driver.name member - assignments through eth_driver
>> has been removed.
>>
>> Signed-off-by: Jan Viktorin 
>> Signed-off-by: Shreyansh Jain 
>> ---
>
> There are a few name fields:
>
> 1) eth_dev->data->name
> 2) eth_dev->data->drv_name
> 3) rte_driver->name
> 4) dev_info->driver_name
>
>
> What should be the relation between them?
>
> I guess 1) is device_name, 2, 3, 4 are same thing and driver_name.

Yes, (1) is the ethernet device name.
(2), (3) are same, i.e. driver name
(4) is an output field for eth_dev_info_get method and would represent 
same thing as (2) and (3).

>
> If this is correct, virtual drivers needs to be updated for this,
> because for them 3 != (2 == 4). They all use global variable for 2 & 4.

Ok. I didn't notice this. I will check it once again.

>
> And what do you think removing 2) completely?
> I guess it exists for virtual devices, since for them eth_driver is not
> exists and not able to access to rte_driver->name from eth_dev, but this
> is solvable.

Ok.
Probably one way to solve this is to make eth_dev->driver point to 
rte_vdev_driver. That way, rte_driver->name would replace 
eth_dev->data->drv_name.
I will give it a thought. Thanks for pointing this out.

>
>
> Thanks,
> ferruh
>

-
Shreyansh


[dpdk-dev] [PATCH 0/2] Add ptype and xsum handling in i40e rx vpmd

2016-09-01 Thread Jeff Shaw
On Fri, Jul 15, 2016 at 10:26:23PM +0200, Thomas Monjalon wrote:
> 2016-07-14 09:59, Jeff Shaw:
> > Our testing suggests minimal (in some cases zero) impact to core-bound
> > forwarding throughput as measured by testpmd. Throughput increase is
> > observed in l3fwd as now the vpmd can be used with hw_ip_checksum
> > enabled and without needing '--parse-ptype'.
> > 
> > The benefits to applications using this functionality is realized when
> > Ethernet processing and L3/L4 checksum validation can be skipped.
> > 
> > We hope others can also test performance in their applications while
> > conducting a review of this series.
> 
> Thanks for the patches. They need some careful review and are a bit late
> for an integration in 16.07. Thus they are pending for 16.11.

Hello, I noticed there are other patches going into i40e ahead of this
one. Would somebody please review and merge this one if there are no
issues?

Thanks,
Jeff


[dpdk-dev] [dpdk-dev, RFC] drivers: advertise kmod dependencies in pmdinfo

2016-09-01 Thread Neil Horman
On Thu, Sep 01, 2016 at 12:55:27PM +, Trahe, Fiona wrote:
> Hi Neil and Olivier,
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Olivier Matz
> > Sent: Wednesday, August 31, 2016 2:40 PM
> > To: Neil Horman 
> > Cc: dev at dpdk.org; thomas.monjalon at 6wind.com
> > Subject: Re: [dpdk-dev] [dpdk-dev, RFC] drivers: advertise kmod dependencies
> > in pmdinfo
> > 
> > Hi Neil,
> > 
> > On 08/31/2016 03:27 PM, Neil Horman wrote:
> > > On Wed, Aug 31, 2016 at 11:21:18AM +0200, Olivier Matz wrote:
> > >> Hi Neil,
> > >>
> > >> On 08/30/2016 03:23 PM, Neil Horman wrote:
> > >>> On Fri, Aug 26, 2016 at 03:20:46PM +0200, Olivier Matz wrote:
> >  Add a new macro DRIVER_REGISTER_KMOD_DEP() that allows a driver to
> >  declare the list of kernel modules required to run properly.
> > 
> >  Today, most PCI drivers require uio/vfio.
> > 
> >  Signed-off-by: Olivier Matz 
> > 
> >  ---
> >  In this RFC, I supposed that all PCI drivers require a the loading of a
> >  uio/vfio module (except mlx*), this may be wrong.
> >  Comments are welcome!
> > 
> > 
> >   buildtools/pmdinfogen/pmdinfogen.c  |  1 +
> >   buildtools/pmdinfogen/pmdinfogen.h  |  1 +
> >   drivers/crypto/qat/rte_qat_cryptodev.c  |  2 ++
> >   drivers/net/bnx2x/bnx2x_ethdev.c|  4 
> >   drivers/net/bnxt/bnxt_ethdev.c  |  2 ++
> >   drivers/net/cxgbe/cxgbe_ethdev.c|  2 ++
> >   drivers/net/e1000/em_ethdev.c   |  2 ++
> >   drivers/net/e1000/igb_ethdev.c  |  4 
> >   drivers/net/ena/ena_ethdev.c|  2 ++
> >   drivers/net/enic/enic_ethdev.c  |  2 ++
> >   drivers/net/fm10k/fm10k_ethdev.c|  2 ++
> >   drivers/net/i40e/i40e_ethdev.c  |  2 ++
> >   drivers/net/i40e/i40e_ethdev_vf.c   |  2 ++
> >   drivers/net/ixgbe/ixgbe_ethdev.c|  4 
> >   drivers/net/mlx4/mlx4.c |  2 ++
> >   drivers/net/mlx5/mlx5.c |  3 +++
> >   drivers/net/nfp/nfp_net.c   |  2 ++
> >   drivers/net/qede/qede_ethdev.c  |  4 
> >   drivers/net/szedata2/rte_eth_szedata2.c |  2 ++
> >   drivers/net/thunderx/nicvf_ethdev.c |  2 ++
> >   drivers/net/virtio/virtio_ethdev.c  |  2 ++
> >   drivers/net/vmxnet3/vmxnet3_ethdev.c|  2 ++
> >   lib/librte_eal/common/include/rte_dev.h | 14 ++
> >   tools/dpdk-pmdinfo.py   |  5 -
> >   24 files changed, 69 insertions(+), 1 deletion(-)
> > 
> > >>>
> > >>> Generally speaking, I like the idea, it makes sense to me in terms of 
> > >>> using
> > >>> pmdinfo to export this information
> > >>>
> > >>> That said, This may need to be a set of macros.  By that I mean (and 
> > >>> correct
> > me
> > >>> if I'm wrong here), but the relationship between pmd's and kernel 
> > >>> modules
> > is in
> > >>> some cases, more complex than a 'requires' or 'depends' relationship.  
> > >>> That
> > is
> > >>> to say, some pmd may need user space hardware access, but can use either
> > uio OR
> > >>> vfio, but doesn't need both, and can continue to function if only one is
> > >>> available.  Other PMD's may be able to use vfio or uio, but can still 
> > >>> function
> > >>> without either.  And some, as your patch implements, simply require one 
> > >>> or
> > the
> > >>> other to function.  As such it seems like you may want a few macros, in 
> > >>> the
> > form
> > >>> of:
> > >>>
> > >>> DRIVER_REGISTER_KMOD_REQUEST - List of modules to attempt loading,
> > ignore any
> > >>> failures
> > >>> DRIVER_REGISTER_KMOD_REQUIRE - List of modules required to be
> > loaded after
> > >>> request macro completes, fail if any are not loaded
> > >>>
> > >>> Thats just spitballing, mind you, theres probably a better way to do 
> > >>> it, but
> > the
> > >>> idea is to list a set of modules you would like to have, and then 
> > >>> create a
> > >>> parsable syntax to describe the modules that need to be loaded after the
> > request
> > >>> is complete so that you can accurately codify the situations I described
> > above.
> > >>
> > >> Thank you for your feedback.
> > >> However, I'm not sure I'm perfectly getting what you suggest.
> > >>
> > >> Do you think some PMDs could request a kernel module without really
> > >> requiring it? Do you have an example in mind?
> > >>
> > > Yes, thats precisely it.  The most clear example I could think of (though 
> > > I'm
> > > not sure if any pmd currently supports this), is a pmd that supports both 
> > > UIO
> > > and VFIO communication with the kernel.  Such a PMD requires that one of
> > those
> > > two modules be loaded, but only one (i.e. both are not required), so if 
> > > only
> > the
> > > uio kernel module loads is a success case, likewise if only the vfio 
> > > module
> > > loads can be treated as success.  Both loading are clearly 

[dpdk-dev] [dpdk-dev, RFC] drivers: advertise kmod dependencies in pmdinfo

2016-09-01 Thread Stephen Hemminger
On Thu, 1 Sep 2016 13:35:19 -0400
Neil Horman  wrote:

> On Thu, Sep 01, 2016 at 12:55:27PM +, Trahe, Fiona wrote:
> > Hi Neil and Olivier,
> >   
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Olivier Matz
> > > Sent: Wednesday, August 31, 2016 2:40 PM
> > > To: Neil Horman 
> > > Cc: dev at dpdk.org; thomas.monjalon at 6wind.com
> > > Subject: Re: [dpdk-dev] [dpdk-dev, RFC] drivers: advertise kmod 
> > > dependencies
> > > in pmdinfo
> > > 
> > > Hi Neil,
> > > 
> > > On 08/31/2016 03:27 PM, Neil Horman wrote:  
> > > > On Wed, Aug 31, 2016 at 11:21:18AM +0200, Olivier Matz wrote:  
> > > >> Hi Neil,
> > > >>
> > > >> On 08/30/2016 03:23 PM, Neil Horman wrote:  
> > > >>> On Fri, Aug 26, 2016 at 03:20:46PM +0200, Olivier Matz wrote:  
> > >  Add a new macro DRIVER_REGISTER_KMOD_DEP() that allows a driver to
> > >  declare the list of kernel modules required to run properly.
> > > 
> > >  Today, most PCI drivers require uio/vfio.
> > > 
> > >  Signed-off-by: Olivier Matz 
> > > 
> > >  ---
> > >  In this RFC, I supposed that all PCI drivers require a the loading 
> > >  of a
> > >  uio/vfio module (except mlx*), this may be wrong.
> > >  Comments are welcome!
> > > 
> > > 
> > >   buildtools/pmdinfogen/pmdinfogen.c  |  1 +
> > >   buildtools/pmdinfogen/pmdinfogen.h  |  1 +
> > >   drivers/crypto/qat/rte_qat_cryptodev.c  |  2 ++
> > >   drivers/net/bnx2x/bnx2x_ethdev.c|  4 
> > >   drivers/net/bnxt/bnxt_ethdev.c  |  2 ++
> > >   drivers/net/cxgbe/cxgbe_ethdev.c|  2 ++
> > >   drivers/net/e1000/em_ethdev.c   |  2 ++
> > >   drivers/net/e1000/igb_ethdev.c  |  4 
> > >   drivers/net/ena/ena_ethdev.c|  2 ++
> > >   drivers/net/enic/enic_ethdev.c  |  2 ++
> > >   drivers/net/fm10k/fm10k_ethdev.c|  2 ++
> > >   drivers/net/i40e/i40e_ethdev.c  |  2 ++
> > >   drivers/net/i40e/i40e_ethdev_vf.c   |  2 ++
> > >   drivers/net/ixgbe/ixgbe_ethdev.c|  4 
> > >   drivers/net/mlx4/mlx4.c |  2 ++
> > >   drivers/net/mlx5/mlx5.c |  3 +++
> > >   drivers/net/nfp/nfp_net.c   |  2 ++
> > >   drivers/net/qede/qede_ethdev.c  |  4 
> > >   drivers/net/szedata2/rte_eth_szedata2.c |  2 ++
> > >   drivers/net/thunderx/nicvf_ethdev.c |  2 ++
> > >   drivers/net/virtio/virtio_ethdev.c  |  2 ++
> > >   drivers/net/vmxnet3/vmxnet3_ethdev.c|  2 ++
> > >   lib/librte_eal/common/include/rte_dev.h | 14 ++
> > >   tools/dpdk-pmdinfo.py   |  5 -
> > >   24 files changed, 69 insertions(+), 1 deletion(-)
> > >   
> > > >>>
> > > >>> Generally speaking, I like the idea, it makes sense to me in terms of 
> > > >>> using
> > > >>> pmdinfo to export this information
> > > >>>
> > > >>> That said, This may need to be a set of macros.  By that I mean (and 
> > > >>> correct  
> > > me  
> > > >>> if I'm wrong here), but the relationship between pmd's and kernel 
> > > >>> modules  
> > > is in  
> > > >>> some cases, more complex than a 'requires' or 'depends' relationship. 
> > > >>>  That  
> > > is  
> > > >>> to say, some pmd may need user space hardware access, but can use 
> > > >>> either  
> > > uio OR  
> > > >>> vfio, but doesn't need both, and can continue to function if only one 
> > > >>> is
> > > >>> available.  Other PMD's may be able to use vfio or uio, but can still 
> > > >>> function
> > > >>> without either.  And some, as your patch implements, simply require 
> > > >>> one or  
> > > the  
> > > >>> other to function.  As such it seems like you may want a few macros, 
> > > >>> in the  
> > > form  
> > > >>> of:
> > > >>>
> > > >>> DRIVER_REGISTER_KMOD_REQUEST - List of modules to attempt loading,  
> > > ignore any  
> > > >>> failures
> > > >>> DRIVER_REGISTER_KMOD_REQUIRE - List of modules required to be  
> > > loaded after  
> > > >>> request macro completes, fail if any are not loaded
> > > >>>
> > > >>> Thats just spitballing, mind you, theres probably a better way to do 
> > > >>> it, but  
> > > the  
> > > >>> idea is to list a set of modules you would like to have, and then 
> > > >>> create a
> > > >>> parsable syntax to describe the modules that need to be loaded after 
> > > >>> the  
> > > request  
> > > >>> is complete so that you can accurately codify the situations I 
> > > >>> described  
> > > above.  
> > > >>
> > > >> Thank you for your feedback.
> > > >> However, I'm not sure I'm perfectly getting what you suggest.
> > > >>
> > > >> Do you think some PMDs could request a kernel module without really
> > > >> requiring it? Do you have an example in mind?
> > > >>  
> > > > Yes, thats precisely it.  The most clear example I could think of 
> > > > (though I'm
> > > > not sure if any pmd currentl

[dpdk-dev] [PATCH v8 00/25] Introducing rte_driver/rte_device generalization

2016-09-01 Thread Jan Viktorin
Hi Shreyansh,

I am sorry to be quiet on this thread. I am traveling in those
two weeks and have some vacation. However, I passively follow the
conversation. Thank you for your work so far!

Regards
Jan

On Fri, 26 Aug 2016 19:26:38 +0530
Shreyansh Jain  wrote:

> Based on master (e22856313fff2)
> 
> Background:
> ===
> 
> It includes two different patch-sets floated on ML earlier:
>  * Original patch series is from David Marchand [1], [2].
>   `- This focused mainly on PCI (PDEV) part
>   `- v7 of this was posted by me [8] in August/2016
>  * Patch series [4] from Jan Viktorin
>   `- This focused on VDEV and rte_device integration
> 
> Introduction:
> =
> 
> This patch series introduces a generic device model, moving away from PCI 
> centric code layout. Key change is to introduce rte_driver/rte_device 
> structures at the top level which are inherited by 
> rte_XXX_driver/rte_XXX_device - where XXX belongs to {pci, vdev, soc (in 
> future),...}.
> 
> Key motivation for this series is to move away from PCI centric design of 
> EAL to a more hierarchical device model - pivoted around a generic driver 
> and device. Each specific driver and device can inherit the common 
> properties of the generic set and build upon it through driver/device 
> specific functions.
> 
> Earlier, the EAL device initialization model was:
> (Refer: [3])
> 
> --
>  Constructor:
>   |- PMD_DRIVER_REGISTER(rte_driver)
>  `-  insert into dev_driver_list, rte_driver object
> 
>  rte_eal_init():
>   |- rte_eal_pci_init()
>   |  `- scan and fill pci_device_list from sysfs
>   |
>   |- rte_eal_dev_init()
>   |  `- For each rte_driver in dev_driver_list
>   | `- call the rte_driver->init() function
>   ||- PMDs designed to call rte_eth_driver_register(eth_driver)
>   ||- eth_driver have rte_pci_driver embedded in them
>   |`- rte_eth_driver_register installs the 
>   |   rte_pci_driver->devinit/devuninit callbacks.
>   |
>   |- rte_eal_pci_probe()
>   |  |- For each device detected, dev_driver_list is parsed and matching is
>   |  |  done.
>   |  |- For each matching device, the rte_pci_driver->devinit() is called.
>   |  |- Default map is to rte_eth_dev_init() which in turn creates a
>   |  |  new ethernet device (eth_dev)
>   |  |  `- eth_drv->eth_dev_init() is called which is implemented by 
>   `--|individual PMD drivers.
> 
> --
> 
> The structure of driver looks something like:
> 
>  ++ ._.
>  | rte_driver <-| PMD |___
>  |  .init | `-`   \
>  +.---+  | \
>   `-.| What PMD actually is
>  \   |  |
>   +--v+ |
>   | eth_driver| |
>   | .eth_dev_init | |
>   +.--+ |
>`-.  |
>   \ |
>+v---+
>| rte_pci_driver |
>| .pci_devinit   |
>++
> 
>   and all devices are part of a following linked lists:
> - dev_driver_list for all rte_drivers
> - pci_device_list for all devices, whether PCI or VDEV
> 
> 
> From the above:
>  * a PMD initializes a rte_driver, eth_driver even though actually it is a 
>pci_driver
>  * initialization routines are passed from rte_driver->pci_driver->eth_driver
>even though they should ideally be rte_eal_init()->rte_pci_driver()
>  * For a single driver/device type model, this is not necessarily a
>functional issue - but more of a design language.
>  * But, when number of driver/device type increase, this would create problem
>in how driver<=>device links are represented.
> 
> Proposed Architecture:
> ==
> 
> A nice representation has already been created by David in [3]. Copying that
> here:
> 
> +--+ +---+
> |  | |   |
> | rte_pci_device   | | rte_pci_driver|
> |  | |   |
> +-+ | +--+ | | +---+ |
> | | | |  | | | |   | |
> | rte_eth_dev +---> rte_device   +-> rte_driver| |
> | | | |  char name[] | | | |  char name[]  | |
> +-+ | |  | | | |  int init(rte_device *)   | |
> | +--+ | | |  int uninit(rte_device *) | |
> |  | | |   | |
> +--+ | +---+ |
>  |   |
>  +---+
> 
> - for ethdev on top of vdev devices
> 
> +--

[dpdk-dev] [dpdk-dev, RFC] drivers: advertise kmod dependencies in pmdinfo

2016-09-01 Thread Neil Horman
On Thu, Sep 01, 2016 at 10:41:22AM -0700, Stephen Hemminger wrote:
> On Thu, 1 Sep 2016 13:35:19 -0400
> Neil Horman  wrote:
> 
> > On Thu, Sep 01, 2016 at 12:55:27PM +, Trahe, Fiona wrote:
> > > Hi Neil and Olivier,
> > >   
> > > > -Original Message-
> > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Olivier Matz
> > > > Sent: Wednesday, August 31, 2016 2:40 PM
> > > > To: Neil Horman 
> > > > Cc: dev at dpdk.org; thomas.monjalon at 6wind.com
> > > > Subject: Re: [dpdk-dev] [dpdk-dev, RFC] drivers: advertise kmod 
> > > > dependencies
> > > > in pmdinfo
> > > > 
> > > > Hi Neil,
> > > > 
> > > > On 08/31/2016 03:27 PM, Neil Horman wrote:  
> > > > > On Wed, Aug 31, 2016 at 11:21:18AM +0200, Olivier Matz wrote:  
> > > > >> Hi Neil,
> > > > >>
> > > > >> On 08/30/2016 03:23 PM, Neil Horman wrote:  
> > > > >>> On Fri, Aug 26, 2016 at 03:20:46PM +0200, Olivier Matz wrote:  
> > > >  Add a new macro DRIVER_REGISTER_KMOD_DEP() that allows a driver to
> > > >  declare the list of kernel modules required to run properly.
> > > > 
> > > >  Today, most PCI drivers require uio/vfio.
> > > > 
> > > >  Signed-off-by: Olivier Matz 
> > > > 
> > > >  ---
> > > >  In this RFC, I supposed that all PCI drivers require a the loading 
> > > >  of a
> > > >  uio/vfio module (except mlx*), this may be wrong.
> > > >  Comments are welcome!
> > > > 
> > > > 
> > > >   buildtools/pmdinfogen/pmdinfogen.c  |  1 +
> > > >   buildtools/pmdinfogen/pmdinfogen.h  |  1 +
> > > >   drivers/crypto/qat/rte_qat_cryptodev.c  |  2 ++
> > > >   drivers/net/bnx2x/bnx2x_ethdev.c|  4 
> > > >   drivers/net/bnxt/bnxt_ethdev.c  |  2 ++
> > > >   drivers/net/cxgbe/cxgbe_ethdev.c|  2 ++
> > > >   drivers/net/e1000/em_ethdev.c   |  2 ++
> > > >   drivers/net/e1000/igb_ethdev.c  |  4 
> > > >   drivers/net/ena/ena_ethdev.c|  2 ++
> > > >   drivers/net/enic/enic_ethdev.c  |  2 ++
> > > >   drivers/net/fm10k/fm10k_ethdev.c|  2 ++
> > > >   drivers/net/i40e/i40e_ethdev.c  |  2 ++
> > > >   drivers/net/i40e/i40e_ethdev_vf.c   |  2 ++
> > > >   drivers/net/ixgbe/ixgbe_ethdev.c|  4 
> > > >   drivers/net/mlx4/mlx4.c |  2 ++
> > > >   drivers/net/mlx5/mlx5.c |  3 +++
> > > >   drivers/net/nfp/nfp_net.c   |  2 ++
> > > >   drivers/net/qede/qede_ethdev.c  |  4 
> > > >   drivers/net/szedata2/rte_eth_szedata2.c |  2 ++
> > > >   drivers/net/thunderx/nicvf_ethdev.c |  2 ++
> > > >   drivers/net/virtio/virtio_ethdev.c  |  2 ++
> > > >   drivers/net/vmxnet3/vmxnet3_ethdev.c|  2 ++
> > > >   lib/librte_eal/common/include/rte_dev.h | 14 ++
> > > >   tools/dpdk-pmdinfo.py   |  5 -
> > > >   24 files changed, 69 insertions(+), 1 deletion(-)
> > > >   
> > > > >>>
> > > > >>> Generally speaking, I like the idea, it makes sense to me in terms 
> > > > >>> of using
> > > > >>> pmdinfo to export this information
> > > > >>>
> > > > >>> That said, This may need to be a set of macros.  By that I mean 
> > > > >>> (and correct  
> > > > me  
> > > > >>> if I'm wrong here), but the relationship between pmd's and kernel 
> > > > >>> modules  
> > > > is in  
> > > > >>> some cases, more complex than a 'requires' or 'depends' 
> > > > >>> relationship.  That  
> > > > is  
> > > > >>> to say, some pmd may need user space hardware access, but can use 
> > > > >>> either  
> > > > uio OR  
> > > > >>> vfio, but doesn't need both, and can continue to function if only 
> > > > >>> one is
> > > > >>> available.  Other PMD's may be able to use vfio or uio, but can 
> > > > >>> still function
> > > > >>> without either.  And some, as your patch implements, simply require 
> > > > >>> one or  
> > > > the  
> > > > >>> other to function.  As such it seems like you may want a few 
> > > > >>> macros, in the  
> > > > form  
> > > > >>> of:
> > > > >>>
> > > > >>> DRIVER_REGISTER_KMOD_REQUEST - List of modules to attempt loading,  
> > > > ignore any  
> > > > >>> failures
> > > > >>> DRIVER_REGISTER_KMOD_REQUIRE - List of modules required to be  
> > > > loaded after  
> > > > >>> request macro completes, fail if any are not loaded
> > > > >>>
> > > > >>> Thats just spitballing, mind you, theres probably a better way to 
> > > > >>> do it, but  
> > > > the  
> > > > >>> idea is to list a set of modules you would like to have, and then 
> > > > >>> create a
> > > > >>> parsable syntax to describe the modules that need to be loaded 
> > > > >>> after the  
> > > > request  
> > > > >>> is complete so that you can accurately codify the situations I 
> > > > >>> described  
> > > > above.  
> > > > >>
> > > > >> Thank you for your feedback.
> > > > >> However, I'm not sure I'm perfectly getting what you

[dpdk-dev] virtio kills qemu VM after stopping/starting ports

2016-09-01 Thread Kyle Larose
Hello everyone,

In my own testing, I recently stumbled across an issue where I could get qemu 
to exit when sending traffic to my application. To do this, I simply needed to 
do the following:

1) Start my virtio interfaces
2) Send some traffic into/out of the interfaces
3) Stop the interfaces
4) Start the interfaces
5) Send some more traffic

At this point, I would lose connectivity to my VM.  Further investigation 
revealed qemu exiting with the following log:

2016-09-01T15:45:32.119059Z qemu-kvm: Guest moved used index from 5 to 1

I found the following bug report against qemu, reported by a user of DPDK: 
https://bugs.launchpad.net/qemu/+bug/1558175

That thread seems to have stalled out, so I think we probably should deal with 
the problem within DPDK itself. Either way, later in the bug report chain, we 
see a link to this patch to DPDK: 
http://dpdk.org/browse/dpdk/commit/?id=9a0615af774648. The submitter of the bug 
report claims that this patch fixes the problem. Perhaps it does. However, it 
introduces a new problem: If I remove the patch, I cannot reproduce the 
problem. So, that leads me to believe that it has caused a regression.

To summarize the patch's changes, it basically changes the virtio_dev_stop 
function to flag the device as stopped, and stops the device when 
closing/uninitializing it. However, there is a seemingly unintended 
side-effect. 

In virtio_dev_start, we have the following block of code:

	/* On restart after stop do not touch queues */
	if (hw->started)
		return 0;

	/* Do final configuration before rx/tx engine starts */
	virtio_dev_rxtx_start(dev);

	...

Prior to the patch, if an interface were stopped then started, without 
restarting the application, the queues would be left as-is, because hw->started 
would be set to 1. Now, calling stop sets hw->started to 0, which means the 
next call to start will "touch the queues". This is the unintended side-effect 
that causes the problem.

I made a change locally to break the state of the device into two: started and 
opened. The device starts out neither started nor opened. If the device is 
accepting packets, it is started. If the device has set up its queues, it is 
opened. Stopping the device does not close the device. This allows me to change 
the check above to:

	if (hw->opened) {
		hw->started = 1;
		return 0;
	}

Then, if I stop and start the device, it does not reinitialize the queues. I 
have no problem. I can restart ports as much as I want, and the system keeps 
running. Traffic flows when they've restarted as well, which is always a plus.
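
The two-flag idea described above can be illustrated with a small toy model.
This is only a sketch of the state handling, not driver code: the class name
VirtioPortModel, the queue_inits counter, and the close() behaviour are
assumptions made for illustration, and the counter merely stands in for the
call to virtio_dev_rxtx_start().

```python
class VirtioPortModel:
    """Toy model of the started/opened split; not the actual virtio PMD."""

    def __init__(self):
        self.opened = False    # queues have been set up
        self.started = False   # device is accepting packets
        self.queue_inits = 0   # hypothetical counter: times queues were set up

    def start(self):
        # On restart after stop, do not touch the queues.
        if self.opened:
            self.started = True
            return 0
        # First start: stands in for the final rx/tx queue configuration.
        self.queue_inits += 1
        self.opened = True
        self.started = True
        return 0

    def stop(self):
        # Stopping only stops packet processing; the device stays opened.
        self.started = False

    def close(self):
        # Assumed here: closing stops the device and releases the queues.
        self.stop()
        self.opened = False


if __name__ == "__main__":
    port = VirtioPortModel()
    port.start()
    port.stop()
    port.start()
    print(port.queue_inits)  # prints 1: the restart did not redo queue setup
```

With a single started flag that stop clears, the same stop/start sequence
would run the queue setup twice, which is the case that triggers the qemu
exit described above.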

Some background:
- I tested against DPDK 16.04 and DPDK 16.07.
- I'm using virtio NICs:
- CPU: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
- Host OS: CentOS Linux release 7.1.1503 (Core)
- Guest OS: CentOS Linux release 7.2.1511 (Core)
- Qemu-kvm version: 1.5.3-86.el7_1.6

I plan on submitting a patch to fix this tomorrow. Let me know if anyone has 
any thoughts about this, or a better way to fix it.

Thanks,

Kyle


[dpdk-dev] [PATCH v1] dpdk-devbind.py: Virtio interface issue.

2016-09-01 Thread Dey, Souvik
Yes this patch definitely solves my issue too. 

-Original Message-
From: Mcnamara, John [mailto:john.mcnam...@intel.com] 
Sent: Thursday, September 1, 2016 7:00 AM
To: Mussar, Gary ; Dey, Souvik ; 
Stephen Hemminger 
Cc: nhorman at tuxdriver.com; dev at dpdk.org
Subject: RE: [dpdk-dev] [PATCH v1] dpdk-devbind.py: Virtio interface issue.

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Mussar, Gary
> Sent: Monday, August 29, 2016 4:10 PM
> To: Dey, Souvik ; Stephen Hemminger 
> 
> Cc: nhorman at tuxdriver.com; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v1] dpdk-devbind.py: Virtio interface 
> issue.
> 
> We did this slightly differently. This is 100% python and is a bit 
> more general. We search for the first "net" directory under the 
> specific device directory.
> 
> ---
> --- tools/dpdk-devbind.py   2016-08-29 11:02:35.594202888 -0400
> +++ ../dpdk/tools/dpdk-devbind.py 2016-08-29 11:00:34.897677233 -0400
> @@ -221,11 +221,11 @@
>          name = name.strip(":") + "_str"
>          device[name] = value
>      # check for a unix interface name
> -    sys_path = "/sys/bus/pci/devices/%s/net/" % dev_id
> -    if exists(sys_path):
> -        device["Interface"] = ",".join(os.listdir(sys_path))
> -    else:
> -        device["Interface"] = ""
> +    device["Interface"] = ""
> +    for base, dirs, files in os.walk("/sys/bus/pci/devices/%s/" % dev_id):
> +        if "net" in dirs:
> +            device["Interface"] = ",".join(os.listdir(os.path.join(base,"net")))
> +            break
>      # check if a port is used for ssh connection
>      device["Ssh_if"] = False
>      device["Active"] = ""
> ---

Hi Gary,

That looks like a cleaner solution. Could you submit that as a patch?

Souvik, could you test this patch and confirm it fixes your issue?


Gary, if you submit a patch could you make a few minor changes:

> +device["Interface"] = ""
> +for base, dirs, files in os.walk("/sys/bus/pci/devices/%s/" % dev_id):
> +

If "files" is unused, and it looks like it is, then replace it with "_".


> +device["Interface"] = 
> + ",".join(os.listdir(os.path.join(base,"net")))

There is a space required after "," for PEP8 compliance.
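
For reference, the search with both of the above changes applied could look
roughly like the standalone sketch below. The function name find_interfaces
and the device_dir parameter are hypothetical (the patch itself does this
inline on "/sys/bus/pci/devices/%s/" % dev_id); taking the directory as an
argument just makes the logic easy to try outside of sysfs.

```python
import os


def find_interfaces(device_dir):
    """Return the comma-joined entries of the first "net" directory found
    anywhere below device_dir, or "" if there is none."""
    # "files" is unused, so it is replaced with "_".
    for base, dirs, _ in os.walk(device_dir):
        if "net" in dirs:
            # Note the space after the comma in os.path.join(), for PEP8.
            return ",".join(os.listdir(os.path.join(base, "net")))
    return ""


if __name__ == "__main__":
    # Point this at a real device directory, for example one under
    # /sys/bus/pci/devices/, to see which interface names it reports.
    print(find_interfaces("."))
```

Because os.walk() descends into subdirectories, this also finds a "net"
directory that sits one level below the PCI device directory, which a fixed
"<device>/net/" path would miss.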

John





[dpdk-dev] [PATCH v2] add mtu set in virtio

2016-09-01 Thread Dey, Souvik
Hi Maxime,
When are these patches or the new implementation going to land in a 
release? If it is not 16.11, can we keep this change until the new virtio 
changes come in the release? And if it is already planned for 16.11, can I 
get a little more information on that?

--
Regards,
Souvik

-Original Message-
From: Maxime Coquelin [mailto:maxime.coque...@redhat.com] 
Sent: Tuesday, August 30, 2016 3:58 AM
To: Dey, Souvik ; stephen at networkplumber.org; 
huawei.xie at intel.com; yuanhan.liu at linux.intel.com
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH v2] add mtu set in virtio

Hi Souvik,

On 08/30/2016 01:02 AM, souvikdey33 wrote:
> Signed-off-by: Souvik Dey 
>
> Fixes: 1fb8e8896ca8 ("Signed-off-by: Souvik Dey ")
> Reviewed-by: Stephen Hemminger 
>
> Virtio interfaces should also support setting of mtu, as in case of 
> cloud it is expected to have the consistent mtu across the 
> infrastructure that the dhcp server sends and not hardcoded to 1500(default).
> ---
>  drivers/net/virtio/virtio_ethdev.c | 12 
>  1 file changed, 12 insertions(+)

FYI, there are some on-going changes in the VIRTIO specification so that the 
VHOST interface exposes its MTU to its VIRTIO peer.
It may also be used as an alternative to what your patch achieves.

I am working on its implementation in Qemu/DPDK, our goal being to reduce 
performance drops for small packets with the Rx mergeable buffers feature enabled.

Regards,
Maxime


[dpdk-dev] problem with continual link status down events

2016-09-01 Thread Coulson, Ken
I am seeing a problem with link state change interrupts on a Broadwell
with on-board 10G-BaseT LAN ports (0x15AD) running DPDK 16.04.  An Ixia is
connected to the two 10G ports and both are initially up.  One of the
connections is brought down via the Ixia and the software gets a single down
event through the callback installed with rte_eth_dev_callback_register(),
with lsc enabled in the rte_eth_conf structure.  The connection is then
re-enabled and a single up event is received as expected.  However, on the
other 10G port, when the connection is brought down, sporadic down events are
continually received at intervals ranging from about 1 second to 20 seconds.
I do not see the problem with the two 1G ports.  The problem has been
seen on two different Broadwell boxes.

I'm looking for information from others that have seen anything like this 
problem.

Ken Coulson
Software Engineer with Ciena