date:20160104

[dpdk-dev] [PATCH] vhost: remove lockless enqueue to the virtio ring

2016-01-04 Thread Huawei Xie

This patch removes the internal lockless enqueue implementation.
DPDK doesn't support multiple threads concurrently receiving/transmitting
packets from/to the same queue. Vhost PMD wraps a vhost device as a normal
DPDK port. DPDK applications normally have their own lock implementation
when enqueuing packets to the same queue of a port.

The atomic cmpset is a costly operation. This patch should help
performance a bit.
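
The difference between the two reservation schemes can be sketched as follows. This is a minimal illustration, not the vhost code: `struct ring_idx`, `reserve_cas` and `reserve_plain` are hypothetical names, and C11 atomics stand in for `rte_atomic16_cmpset`.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the virtqueue index state. */
struct ring_idx {
    _Atomic uint16_t reserved; /* plays the role of vq->last_used_idx_res */
    uint16_t avail;            /* plays the role of vq->avail->idx */
};

/* Old scheme: several cores may race, so retry a compare-and-swap
 * until this caller owns the range [*base, *base + n). */
static uint16_t reserve_cas(struct ring_idx *r, uint16_t count, uint16_t *base)
{
    uint16_t cur, free_entries, n;

    do {
        cur = atomic_load(&r->reserved);
        free_entries = (uint16_t)(r->avail - cur);
        n = count > free_entries ? free_entries : count;
        if (n == 0)
            return 0;
    } while (!atomic_compare_exchange_weak(&r->reserved, &cur,
                                           (uint16_t)(cur + n)));
    *base = cur;
    return n;
}

/* New scheme: the caller guarantees a single enqueuer per queue, so a
 * plain (relaxed) load and store replace the CAS loop. */
static uint16_t reserve_plain(struct ring_idx *r, uint16_t count, uint16_t *base)
{
    uint16_t cur = atomic_load_explicit(&r->reserved, memory_order_relaxed);
    uint16_t free_entries = (uint16_t)(r->avail - cur);
    uint16_t n = count > free_entries ? free_entries : count;

    if (n == 0)
        return 0;
    atomic_store_explicit(&r->reserved, (uint16_t)(cur + n),
                          memory_order_relaxed);
    *base = cur;
    return n;
}
```

Both variants clamp the request to the free entries and hand back the base index; only the first is safe when two cores enqueue to the same queue without an external lock.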

Signed-off-by: Huawei Xie 
---
 lib/librte_vhost/vhost_rxtx.c | 86 +--
 1 file changed, 25 insertions(+), 61 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index bbf3fac..26a1b9c 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -69,10 +69,8 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
uint64_t buff_hdr_addr = 0;
uint32_t head[MAX_PKT_BURST];
uint32_t head_idx, packet_success = 0;
-   uint16_t avail_idx, res_cur_idx;
-   uint16_t res_base_idx, res_end_idx;
+   uint16_t avail_idx, res_cur_idx, res_end_idx;
uint16_t free_entries;
-   uint8_t success = 0;

LOG_DEBUG(VHOST_DATA, "(%"PRIu64") virtio_dev_rx()\n", dev->device_fh);
if (unlikely(!is_valid_virt_queue_idx(queue_id, 0, dev->virt_qp_nb))) {
@@ -88,29 +86,18 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,

count = (count > MAX_PKT_BURST) ? MAX_PKT_BURST : count;

-   /*
-* As many data cores may want access to available buffers,
-* they need to be reserved.
-*/
-   do {
-   res_base_idx = vq->last_used_idx_res;
-   avail_idx = *((volatile uint16_t *)&vq->avail->idx);
-
-   free_entries = (avail_idx - res_base_idx);
-   /*check that we have enough buffers*/
-   if (unlikely(count > free_entries))
-   count = free_entries;
-
-   if (count == 0)
-   return 0;
-
-   res_end_idx = res_base_idx + count;
-   /* vq->last_used_idx_res is atomically updated. */
-   /* TODO: Allow to disable cmpset if no concurrency in application. */
-   success = rte_atomic16_cmpset(&vq->last_used_idx_res,
-   res_base_idx, res_end_idx);
-   } while (unlikely(success == 0));
-   res_cur_idx = res_base_idx;
+   avail_idx = *((volatile uint16_t *)&vq->avail->idx);
+   free_entries = (avail_idx - vq->last_used_idx_res);
+   /*check that we have enough buffers*/
+   if (unlikely(count > free_entries))
+   count = free_entries;
+   if (count == 0)
+   return 0;
+
+   res_cur_idx = vq->last_used_idx_res;
+   res_end_idx = res_cur_idx + count;
+   vq->last_used_idx_res = res_end_idx;
+
LOG_DEBUG(VHOST_DATA, "(%"PRIu64") Current Index %d| End Index %d\n",
dev->device_fh, res_cur_idx, res_end_idx);

@@ -230,10 +217,6 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,

rte_compiler_barrier();

-   /* Wait until it's our turn to add our buffer to the used ring. */
-   while (unlikely(vq->last_used_idx != res_base_idx))
-   rte_pause();
-
*(volatile uint16_t *)&vq->used->idx += count;
vq->last_used_idx = res_end_idx;

@@ -474,7 +457,6 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
uint32_t pkt_idx = 0, entry_success = 0;
uint16_t avail_idx;
uint16_t res_base_idx, res_cur_idx;
-   uint8_t success = 0;

LOG_DEBUG(VHOST_DATA, "(%"PRIu64") virtio_dev_merge_rx()\n",
dev->device_fh);
@@ -496,46 +478,28 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,

for (pkt_idx = 0; pkt_idx < count; pkt_idx++) {
uint32_t pkt_len = pkts[pkt_idx]->pkt_len + vq->vhost_hlen;
+   uint32_t secure_len = 0;
+   uint32_t vec_idx = 0;

-   do {
-   /*
-* As many data cores may want access to available
-* buffers, they need to be reserved.
-*/
-   uint32_t secure_len = 0;
-   uint32_t vec_idx = 0;
-
-   res_base_idx = vq->last_used_idx_res;
-   res_cur_idx = res_base_idx;
+   res_base_idx = res_cur_idx = vq->last_used_idx_res;

-   do {
-   avail_idx = *((volatile uint16_t *)&vq->avail->idx);
-   if (unlikely(res_cur_idx == avail_idx))
-   goto merge_rx_exit;
+   do {
+   avail_idx = *((volatile uint16_t *)&vq->avail->idx);
+   if (unlikely(res_cur_idx == avail_idx))
+   goto merge_rx_exit;

-   update_secure_len(vq, res_cur_idx,
-

[dpdk-dev] vmxnet3 pmd stats counters reset after rte_eth_dev_start() is called

2016-01-04 Thread Tom Crugnale

Hi All,

I am seeing an issue where the stats counters for vmxnet3 interfaces are reset 
to 0 after rte_eth_dev_start() is called, making it difficult to track 
statistics over a period of time where interfaces could be disabled and 
re-enabled.

There is a memset in the code that clears the txq/rxq stats counters to 0 each 
time an interface is started from the stopped state.  However, even with this 
code removed, I still see the issue happening.  I believe there is something on 
the host side that is resetting these counters, since they come from shared 
memory.

Does anyone know what causes the reset, and if there is a way to prevent it?  I 
could work around the issue by caching a snapshot of the last statistics before 
calling rte_eth_dev_start() and then just using that as a baseline for 
accumulation going forward, but I don't think this behaviour exists for all 
other NIC types, so I would have to specialize on the vmxnet3 case.

Thanks,
Tom
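
The snapshot workaround described above could be sketched roughly as follows. This is only an illustration under assumptions: `struct port_counters`, `struct stats_accum`, `stats_fold` and `stats_total` are hypothetical names, and the two counters stand in for whatever subset of the device statistics an application tracks.

```c
#include <stdint.h>

/* Hypothetical subset of the per-port counters an application tracks. */
struct port_counters {
    uint64_t ipackets;
    uint64_t opackets;
};

/* Totals accumulated over earlier start/stop cycles of the port. */
struct stats_accum {
    struct port_counters base;
};

/* Call with the last raw counters read from the device, just before
 * the port is restarted and the device-side counters are reset. */
static void stats_fold(struct stats_accum *acc, const struct port_counters *raw)
{
    acc->base.ipackets += raw->ipackets;
    acc->base.opackets += raw->opackets;
}

/* Monotonic totals: the saved baseline plus the current raw counters. */
static struct port_counters stats_total(const struct stats_accum *acc,
                                        const struct port_counters *raw)
{
    struct port_counters t;

    t.ipackets = acc->base.ipackets + raw->ipackets;
    t.opackets = acc->base.opackets + raw->opackets;
    return t;
}
```

The application would read the device counters, call `stats_fold()`, then restart the port; every later report goes through `stats_total()`. For NIC types that keep their counters across a restart the fold step would have to be skipped, which is the per-driver specialization mentioned in the message.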

[dpdk-dev] [PATCH v2 0/4] vmxnet3 TSO and tx cksum offload

2016-01-04 Thread Stephen Hemminger

On Mon,  4 Jan 2016 18:28:15 -0800
Yong Wang  wrote:

> v2:
> * fixed some logging issues when debug option turned on
> * updated the txq_flags check in vmxnet3_dev_tx_queue_setup()
> 
> This patchset adds TCP/UDP checksum offload and TSO to vmxnet3 PMD.
> One of the use cases for these features is to support STT.  It also
> restores the tx data ring feature that was removed from a previous
> patch.
> 
> Yong Wang (4):
>   vmxnet3: restore tx data ring support
>   vmxnet3: add tx l4 cksum offload
>   vmxnet3: add TSO support
>   vmxnet3: announce device offload capability
> 
>  doc/guides/rel_notes/release_2_3.rst |  11 +++
>  drivers/net/vmxnet3/vmxnet3_ethdev.c |  16 +++-
>  drivers/net/vmxnet3/vmxnet3_ring.h   |  13 ---
>  drivers/net/vmxnet3/vmxnet3_rxtx.c   | 169 
> +++
>  4 files changed, 158 insertions(+), 51 deletions(-)
> 

Overall, this looks good. 

I hope STT would die (but unfortunately it won't).

[dpdk-dev] [PATCH v2 1/4] vmxnet3: restore tx data ring support

2016-01-04 Thread Stephen Hemminger

On Mon,  4 Jan 2016 18:28:16 -0800
Yong Wang  wrote:

> Tx data ring support was removed in a previous change
> to add multi-seg transmit.  This change adds it back.
> 
> Fixes: 7ba5de417e3c ("vmxnet3: support multi-segment transmit")
> 
> Signed-off-by: Yong Wang 

Do you have any numbers to confirm this?

[dpdk-dev] [PATCH v2 3/4] vmxnet3: add TSO support

2016-01-04 Thread Stephen Hemminger

On Mon,  4 Jan 2016 18:28:18 -0800
Yong Wang  wrote:

> +/* The number of descriptors that are needed for a packet. */
> +static unsigned
> +txd_estimate(const struct rte_mbuf *m)
> +{
> + return m->nb_segs;
> +}
> +

A wrapper function only really clarifies if it is hiding some information.
Why not just code this in place?

[dpdk-dev] [PATCH v2 3/4] vmxnet3: add TSO support

2016-01-04 Thread Stephen Hemminger

On Mon,  4 Jan 2016 18:28:18 -0800
Yong Wang  wrote:

> + mbuf = txq->cmd_ring.buf_info[eop_idx].m;
> + if (unlikely(mbuf == NULL))
> + rte_panic("EOP desc does not point to a valid mbuf");
> + else

The unlikely() is not really needed with rte_panic, since it is declared
with the cold attribute, which has the same effect.

Else is unnecessary because rte_panic never returns.
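
Both points can be illustrated with a small sketch. It assumes a GCC/Clang-style compiler; `die()` and `first_byte()` are hypothetical stand-ins, not DPDK functions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for rte_panic(): "cold" tells the compiler that
 * paths reaching this call are unlikely, "noreturn" that control never
 * comes back to the caller. */
static void die(const char *msg) __attribute__((cold, noreturn));

static void die(const char *msg)
{
    fprintf(stderr, "PANIC: %s\n", msg);
    abort();
}

/* No unlikely() around the test and no else branch after die(). */
static int first_byte(const unsigned char *buf)
{
    if (buf == NULL)
        die("buffer is NULL");
    return buf[0];
}
```

Because `die()` is marked noreturn, the compiler knows the `return` is reached only when `buf` is non-NULL, so the code after the check needs no `else`.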

[dpdk-dev] [PATCH 14/14] lib/ether: introduce rte_eth_copy_dev_info

2016-01-04 Thread Jan Viktorin

This function should be preferred over the rte_eth_copy_pci_info as it is not
PCI-specific.

Signed-off-by: Jan Viktorin 
---
 lib/librte_ether/rte_ethdev.c | 38 ++
 lib/librte_ether/rte_ethdev.h | 15 +++
 2 files changed, 53 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 75121bc..6d58544 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3309,3 +3309,41 @@ rte_eth_copy_pci_info(struct rte_eth_dev *eth_dev, struct rte_pci_device *pci_de
eth_dev->data->numa_node = pci_dev->numa_node;
eth_dev->data->drv_name = pci_dev->driver->name;
 }
+
+void
+rte_eth_copy_dev_info(struct rte_eth_dev *eth_dev, const union rte_device *dev)
+{
+   unsigned int drv_flags;
+
+   if (eth_dev == NULL || dev == NULL) {
+   RTE_PMD_DEBUG_TRACE("NULL pointer eth_dev=%p dev=%p\n",
+   eth_dev, dev);
+   return;
+   }
+
+   if (eth_dev->dev->magic != dev->magic) {
+   rte_panic("%s() incompatible magic set: %08x != %08x\n",
+   __func__, eth_dev->dev->magic, dev->magic);
+   return;
+   }
+
+   eth_dev->data->dev_flags = 0;
+
+   switch (eth_dev->dev->magic) {
+   case RTE_PCI_DEVICE_MAGIC:
+   drv_flags = dev->pci.driver->drv_flags;
+   if (drv_flags & RTE_PCI_DRV_INTR_LSC)
+   eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
+   if (drv_flags & RTE_PCI_DRV_DETACHABLE)
+   eth_dev->data->dev_flags |= RTE_ETH_DEV_DETACHABLE;
+
+   eth_dev->data->kdrv = dev->pci.kdrv;
+   eth_dev->data->numa_node = dev->pci.numa_node;
+   eth_dev->data->drv_name = dev->pci.driver->name;
+   break;
+   default:
+   rte_panic("%s() unrecognized dev magic: %08x\n",
+   __func__, dev->magic);
+   break;
+   }
+}
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 5dd2e1a..020c0f7 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -3864,6 +3864,21 @@ extern int rte_eth_timesync_write_time(uint8_t port_id,
  */
 extern void rte_eth_copy_pci_info(struct rte_eth_dev *eth_dev, struct rte_pci_device *pci_dev);

+/**
+ * Copy device info to the Ethernet device data. The target eth_dev must be
+ * compatible with the given device (from the same infrastructure, e.g. PCI).
+ *
+ * @param eth_dev
+ * The *eth_dev* pointer is the address of the *rte_eth_dev* structure.
+ * @param dev
+ * The *dev* pointer is the address of the *rte_device* union.
+ *
+ * @return
+ *   - 0 on success, negative on error
+ */
+extern void rte_eth_copy_dev_info(struct rte_eth_dev *eth_dev,
+   const union rte_device *dev);
+

 /**
  * Create memzone for HW rings.
-- 
2.6.3

[dpdk-dev] [PATCH 13/14] lib/ether: check magic in rte_eth_copy_pci_info

2016-01-04 Thread Jan Viktorin

Signed-off-by: Jan Viktorin 
---
 lib/librte_ether/rte_ethdev.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 6fb3423..75121bc 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3293,6 +3293,12 @@ rte_eth_copy_pci_info(struct rte_eth_dev *eth_dev, struct rte_pci_device *pci_de
return;
}

+   if (eth_dev->dev->magic != RTE_PCI_DEVICE_MAGIC) {
+   rte_panic("%s() unexpected device magic: %08x\n",
+   __func__, eth_dev->dev->magic);
+   return;
+   }
+
eth_dev->data->dev_flags = 0;
if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
-- 
2.6.3

[dpdk-dev] [PATCH 12/14] lib/ether: check magic before naming a zone

2016-01-04 Thread Jan Viktorin

Signed-off-by: Jan Viktorin 
---
 lib/librte_ether/rte_ethdev.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index b17aa11..6fb3423 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -2659,9 +2659,12 @@ rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char *ring_name,
char z_name[RTE_MEMZONE_NAMESIZE];
const struct rte_memzone *mz;

-   snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
-dev->driver->pci_drv.name, ring_name,
-dev->data->port_id, queue_id);
+   if (dev->dev->magic == RTE_PCI_DEVICE_MAGIC) {
+   snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
+dev->driver->pci_drv.name, ring_name,
+dev->data->port_id, queue_id);
+   } else
+   return NULL;

mz = rte_memzone_lookup(z_name);
if (mz)
-- 
2.6.3

[dpdk-dev] [PATCH 11/14] lib/ether: extract function rte_device_get_intr_handle

2016-01-04 Thread Jan Viktorin

Signed-off-by: Jan Viktorin 
---
 lib/librte_ether/rte_ethdev.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index a9007e7..b17aa11 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -2606,6 +2606,17 @@ _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
rte_spinlock_unlock(&rte_eth_dev_cb_lock);
 }

+static struct rte_intr_handle *
+rte_device_get_intr_handle(union rte_device *dev)
+{
+   switch (dev->magic) {
+   case RTE_PCI_DEVICE_MAGIC:
+   return &dev->pci.intr_handle;
+   default:
+   return NULL;
+   }
+}
+
 int
 rte_eth_dev_rx_intr_ctl(uint8_t port_id, int epfd, int op, void *data)
 {
@@ -2621,7 +2632,7 @@ rte_eth_dev_rx_intr_ctl(uint8_t port_id, int epfd, int op, void *data)
}

dev = &rte_eth_devices[port_id];
-   intr_handle = &dev->pci_dev->intr_handle;
+   intr_handle = rte_device_get_intr_handle(dev->dev);
if (!intr_handle->intr_vec) {
RTE_PMD_DEBUG_TRACE("RX Intr vector unset\n");
return -EPERM;
@@ -2684,7 +2695,7 @@ rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t queue_id,
return -EINVAL;
}

-   intr_handle = &dev->pci_dev->intr_handle;
+   intr_handle = rte_device_get_intr_handle(dev->dev);
if (!intr_handle->intr_vec) {
RTE_PMD_DEBUG_TRACE("RX Intr vector unset\n");
return -EPERM;
-- 
2.6.3

[dpdk-dev] [PATCH 10/14] lib/ether: copy the rte_device union instead of rte_pci_device

2016-01-04 Thread Jan Viktorin

Signed-off-by: Jan Viktorin 
---
 lib/librte_ether/rte_ethdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index db12515..a9007e7 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1660,7 +1660,7 @@ rte_eth_dev_info_get(uint8_t port_id, struct rte_eth_dev_info *dev_info)

RTE_FUNC_PTR_OR_RET(*dev->dev_ops->dev_infos_get);
(*dev->dev_ops->dev_infos_get)(dev, dev_info);
-   dev_info->pci_dev = dev->pci_dev;
+   dev_info->dev = dev->dev;
dev_info->driver_name = dev->data->drv_name;
 }

-- 
2.6.3

[dpdk-dev] [PATCH 09/14] lib/ether: generalize attach/detach of devices

2016-01-04 Thread Jan Viktorin

Make the attach and detach functions independent of the PCI infrastructure.
Mostly, this means using rte_bus_addr instead of rte_pci_addr.

Signed-off-by: Jan Viktorin 
---
 lib/librte_ether/rte_ethdev.c | 34 ++
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 826d4b9..db12515 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -432,7 +432,7 @@ rte_eth_dev_get_device_type(uint8_t port_id)
 }

 static int
-rte_eth_dev_get_addr_by_port(uint8_t port_id, struct rte_pci_addr *addr)
+rte_eth_dev_get_addr_by_port(uint8_t port_id, struct rte_bus_addr *addr)
 {
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);

@@ -441,8 +441,15 @@ rte_eth_dev_get_addr_by_port(uint8_t port_id, struct rte_pci_addr *addr)
return -EINVAL;
}

-   *addr = rte_eth_devices[port_id].pci_dev->addr;
-   return 0;
+   if (rte_eth_devices[port_id].dev_type == RTE_ETH_DEV_PCI) {
+   addr->pci = rte_eth_devices[port_id].pci_dev->addr;
+   addr->dev_magic = RTE_PCI_DEVICE_MAGIC;
+   return 0;
+   } else {
+   rte_panic("%s(): unexpected dev_type: %u\n", __func__,
+   rte_eth_devices[port_id].dev_type);
+   return -ENODEV;
+   }
 }

 static int
@@ -566,10 +573,10 @@ err:

 /* detach the new physical device, then store pci_addr of the device */
 static int
-rte_eth_dev_detach_pdev(uint8_t port_id, struct rte_pci_addr *addr)
+rte_eth_dev_detach_pdev(uint8_t port_id, struct rte_bus_addr *addr)
 {
-   struct rte_pci_addr freed_addr;
-   struct rte_pci_addr vp;
+   struct rte_bus_addr freed_addr;
+   struct rte_bus_addr vp;

if (addr == NULL)
goto err;
@@ -583,13 +590,16 @@ rte_eth_dev_detach_pdev(uint8_t port_id, struct rte_pci_addr *addr)
goto err;

/* Zeroed pci addr means the port comes from virtual device */
-   vp.domain = vp.bus = vp.devid = vp.function = 0;
-   if (rte_eal_compare_pci_addr(&vp, &freed_addr) == 0)
+   memset(&vp, 0, sizeof(vp));
+   if (rte_eal_compare_bus_addr(&vp, &freed_addr) == 0)
goto err;

/* invoke devuninit func of the pci driver,
 * also remove the device from pci_device_list */
-   if (rte_eal_pci_detach(&freed_addr))
+   if (freed_addr.dev_magic == RTE_PCI_DEVICE_MAGIC) {
+   if (rte_eal_pci_detach(&freed_addr.pci))
+   goto err;
+   } else
goto err;

*addr = freed_addr;
@@ -683,7 +693,7 @@ rte_eth_dev_attach(const char *devargs, uint8_t *port_id)
 int
 rte_eth_dev_detach(uint8_t port_id, char *name)
 {
-   struct rte_pci_addr addr;
+   struct rte_bus_addr addr;
int ret;

if (name == NULL)
@@ -698,8 +708,8 @@ rte_eth_dev_detach(uint8_t port_id, char *name)
if (ret == 0)
snprintf(name, RTE_ETH_NAME_MAX_LEN,
"%04x:%02x:%02x.%d",
-   addr.domain, addr.bus,
-   addr.devid, addr.function);
+   addr.pci.domain, addr.pci.bus,
+   addr.pci.devid, addr.pci.function);

return ret;
} else
-- 
2.6.3

[dpdk-dev] [PATCH 08/14] eal/common: introduce rte_bus_addr

2016-01-04 Thread Jan Viktorin

To support generic manipulation of devices we need a general
representation of the rte_*_addr which replaces the rte_pci_addr in the code.
Here we introduce the rte_bus_addr consisting of a union of various rte_*_addr
fields and the device magic to discriminate among them.

A wrapper around rte_eal_compare_pci_addr is introduced; it will be extended
when a new non-PCI infrastructure is added.

Signed-off-by: Jan Viktorin 
---
 lib/librte_eal/common/include/rte_dev.h | 17 +
 1 file changed, 17 insertions(+)

diff --git a/lib/librte_eal/common/include/rte_dev.h b/lib/librte_eal/common/include/rte_dev.h
index c99d038..48c46fd 100644
--- a/lib/librte_eal/common/include/rte_dev.h
+++ b/lib/librte_eal/common/include/rte_dev.h
@@ -161,6 +161,23 @@ union rte_device {
}

 /**
+ * Generic bus address of a device.
+ */
+struct rte_bus_addr {
+   union {
+   struct rte_pci_addr pci;
+   };
+   unsigned int dev_magic;
+};
+
+static inline int
+rte_eal_compare_bus_addr(const struct rte_bus_addr *a,
+   const struct rte_bus_addr *b)
+{
+   return rte_eal_compare_pci_addr(&a->pci, &b->pci);
+}
+
+/**
  * Register a device driver.
  *
  * @param driver
-- 
2.6.3
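
One way the comparison wrapper might later be extended is to dispatch on the magic before comparing. The sketch below is an assumption about such an extension, not part of the patch: all `demo_*` names and the magic value are hypothetical, and the return convention is simplified to 0 for equal and non-zero otherwise.

```c
#include <stdint.h>

#define DEMO_PCI_DEVICE_MAGIC 0x1u /* hypothetical value */

struct demo_pci_addr {
    uint16_t domain;
    uint8_t bus, devid, function;
};

/* Tagged union: dev_magic says which member of the union is valid. */
struct demo_bus_addr {
    union {
        struct demo_pci_addr pci;
    };
    unsigned int dev_magic;
};

/* Returns 0 when both PCI addresses are identical, 1 otherwise. */
static int demo_compare_pci_addr(const struct demo_pci_addr *a,
                                 const struct demo_pci_addr *b)
{
    return !(a->domain == b->domain && a->bus == b->bus &&
             a->devid == b->devid && a->function == b->function);
}

/* Addresses of different bus types never compare equal. */
static int demo_compare_bus_addr(const struct demo_bus_addr *a,
                                 const struct demo_bus_addr *b)
{
    if (a->dev_magic != b->dev_magic)
        return 1;

    switch (a->dev_magic) {
    case DEMO_PCI_DEVICE_MAGIC:
        return demo_compare_pci_addr(&a->pci, &b->pci);
    default:
        return 1; /* unknown bus type: treated as not equal here */
    }
}
```

Note that under this sketch a fully zeroed address (magic 0) never compares equal to anything, so a "zeroed address means virtual device" test would need its own handling.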

[dpdk-dev] [PATCH 07/14] lib/ether: generalize rte_eth_dev_init/uninit

2016-01-04 Thread Jan Viktorin

Generalize the strictly PCI-specific initialization and uninitialization steps
and prepare the code to be easily reused for other infrastructures. The API of
the eth_driver stays backwards compatible. The previously introduced magic is
utilized to test whether we are working with a PCI device or not.

Signed-off-by: Jan Viktorin 
---
 lib/librte_ether/rte_ethdev.c | 118 --
 lib/librte_ether/rte_ethdev.h |   9 ++--
 2 files changed, 85 insertions(+), 42 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 21fc0d7..826d4b9 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -216,16 +216,23 @@ rte_eth_dev_allocate(const char *name, enum rte_eth_dev_type type)

 static int
 rte_eth_dev_create_unique_device_name(char *name, size_t size,
-   struct rte_pci_device *pci_dev)
+   union rte_device *dev)
 {
int ret;

-   if ((name == NULL) || (pci_dev == NULL))
+   if ((name == NULL) || (dev == NULL))
return -EINVAL;

-   ret = snprintf(name, size, "%d:%d.%d",
-   pci_dev->addr.bus, pci_dev->addr.devid,
-   pci_dev->addr.function);
+   switch (dev->magic) {
+   case RTE_PCI_DEVICE_MAGIC:
+   ret = snprintf(name, size, "%d:%d.%d",
+   dev->pci.addr.bus, dev->pci.addr.devid,
+   dev->pci.addr.function);
+   break;
+   default:
+   ret = -ENODEV;
+   }
+
if (ret < 0)
return ret;
return 0;
@@ -243,33 +250,41 @@ rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
 }

 static int
-rte_eth_dev_init(struct rte_pci_driver *pci_drv,
-struct rte_pci_device *pci_dev)
+rte_device_get_dev_type(union rte_device *dev)
+{
+   switch (dev->magic) {
+   case RTE_PCI_DEVICE_MAGIC:
+   return RTE_ETH_DEV_PCI;
+   default:
+   return RTE_ETH_DEV_UNKNOWN;
+   }
+}
+
+static int
+rte_eth_dev_init(struct eth_driver *eth_drv, union rte_device *dev)
 {
-   struct eth_driver*eth_drv;
struct rte_eth_dev *eth_dev;
char ethdev_name[RTE_ETH_NAME_MAX_LEN];
-
int diag;

-   eth_drv = (struct eth_driver *)pci_drv;
-
-   /* Create unique Ethernet device name using PCI address */
+   /* Create unique Ethernet device name */
rte_eth_dev_create_unique_device_name(ethdev_name,
-   sizeof(ethdev_name), pci_dev);
+   sizeof(ethdev_name), dev);

-   eth_dev = rte_eth_dev_allocate(ethdev_name, RTE_ETH_DEV_PCI);
+   eth_dev = rte_eth_dev_allocate(ethdev_name,
+   rte_device_get_dev_type(dev));
if (eth_dev == NULL)
return -ENOMEM;

if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
eth_dev->data->dev_private = rte_zmalloc("ethdev private structure",
- eth_drv->dev_private_size,
- RTE_CACHE_LINE_SIZE);
+   eth_drv->dev_private_size,
+   RTE_CACHE_LINE_SIZE);
if (eth_dev->data->dev_private == NULL)
rte_panic("Cannot allocate memzone for private port data\n");
}
-   eth_dev->pci_dev = pci_dev;
+
+   eth_dev->dev = dev;
eth_dev->driver = eth_drv;
eth_dev->data->rx_mbuf_alloc_failed = 0;

@@ -286,10 +301,6 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
if (diag == 0)
return 0;

-   RTE_PMD_DEBUG_TRACE("driver %s: eth_dev_init(vendor_id=0x%u device_id=0x%x) failed\n",
-   pci_drv->name,
-   (unsigned) pci_dev->id.vendor_id,
-   (unsigned) pci_dev->id.device_id);
if (rte_eal_process_type() == RTE_PROC_PRIMARY)
rte_free(eth_dev->data->dev_private);
rte_eth_dev_release_port(eth_dev);
@@ -297,29 +308,41 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
 }

 static int
-rte_eth_dev_uninit(struct rte_pci_device *pci_dev)
+rte_eth_dev_pci_init(struct rte_pci_driver *pci_drv,
+struct rte_pci_device *pci_dev)
+{
+   struct eth_driver *eth_drv = (struct eth_driver *)pci_drv;
+   union rte_device *dev = (union rte_device *) pci_dev;
+   int rc;
+
+   if ((rc = rte_eth_dev_init(eth_drv, dev))) {
+   RTE_PMD_DEBUG_TRACE("driver %s: eth_dev_init(vendor_id=0x%u device_id=0x%x) failed\n",
+   pci_drv->name,
+   (unsigned) pci_dev->id.vendor_id,
+   (unsigned) pci_dev->id.device_id);
+   return rc;
+   }
+
+   return 0;
+}
+
+static int
+rte_eth_dev_uninit(struct eth_driver *eth_drv)
 {
-   const struct eth_driver *eth_drv;
+   union rte_device *dev = (union rte_device *)

[dpdk-dev] [PATCH 06/14] Include rte_dev.h instead of rte_pci.h

2016-01-04 Thread Jan Viktorin

As rte_dev.h now includes rte_pci.h, we can remove the rte_pci.h inclusion
across the DPDK code base or replace it with the more general rte_dev.h inclusion.

Signed-off-by: Jan Viktorin 
---
 app/test-pipeline/config.c | 2 +-
 app/test-pipeline/init.c   | 2 +-
 app/test-pipeline/main.c   | 2 +-
 app/test-pipeline/runtime.c| 2 +-
 app/test-pmd/cmdline.c | 2 +-
 app/test-pmd/config.c  | 2 +-
 app/test-pmd/csumonly.c| 2 +-
 app/test-pmd/flowgen.c | 2 +-
 app/test-pmd/iofwd.c   | 2 +-
 app/test-pmd/macfwd-retry.c| 2 +-
 app/test-pmd/macfwd.c  | 2 +-
 app/test-pmd/macswap.c | 2 +-
 app/test-pmd/parameters.c  | 2 +-
 app/test-pmd/rxonly.c  | 2 +-
 app/test-pmd/testpmd.c | 2 +-
 app/test-pmd/txonly.c  | 2 +-
 app/test/test_pci.c| 2 +-
 drivers/net/bnx2x/bnx2x_ethdev.h   | 2 +-
 drivers/net/cxgbe/base/t4_hw.c | 2 +-
 drivers/net/cxgbe/cxgbe_ethdev.c   | 2 +-
 drivers/net/cxgbe/cxgbe_main.c | 2 +-
 drivers/net/cxgbe/sge.c| 2 +-
 drivers/net/e1000/em_ethdev.c  | 2 +-
 drivers/net/e1000/em_rxtx.c| 2 +-
 drivers/net/e1000/igb_ethdev.c | 2 +-
 drivers/net/e1000/igb_rxtx.c   | 2 +-
 drivers/net/enic/base/vnic_dev.h   | 2 +-
 drivers/net/enic/enic_ethdev.c | 1 -
 drivers/net/enic/enic_main.c   | 2 +-
 drivers/net/i40e/i40e_ethdev.c | 2 +-
 drivers/net/i40e/i40e_ethdev_vf.c  | 2 +-
 drivers/net/i40e/i40e_pf.c | 2 +-
 drivers/net/ixgbe/ixgbe_ethdev.c   | 2 +-
 drivers/net/ixgbe/ixgbe_fdir.c | 2 +-
 drivers/net/ixgbe/ixgbe_rxtx.c | 2 +-
 drivers/net/mlx5/mlx5.c| 2 +-
 drivers/net/virtio/virtio_ethdev.c | 2 +-
 drivers/net/vmxnet3/vmxnet3_ethdev.c   | 2 +-
 drivers/net/vmxnet3/vmxnet3_rxtx.c | 2 +-
 examples/bond/main.c   | 2 +-
 examples/dpdk_qat/main.c   | 2 +-
 examples/exception_path/main.c | 2 +-
 examples/ip_fragmentation/main.c   | 2 +-
 examples/ip_reassembly/main.c  | 2 +-
 examples/ipv4_multicast/main.c | 2 +-
 examples/kni/main.c| 2 +-
 examples/l2fwd-crypto/main.c   | 2 +-
 examples/l2fwd-ivshmem/guest/guest.c   | 2 +-
 examples/l2fwd-jobstats/main.c | 2 +-
 examples/l2fwd-keepalive/main.c| 2 +-
 examples/l2fwd/main.c  | 2 +-
 examples/l3fwd-acl/main.c  | 2 +-
 examples/l3fwd-power/main.c| 2 +-
 examples/l3fwd-vf/main.c   | 2 +-
 examples/l3fwd/main.c  | 2 +-
 examples/link_status_interrupt/main.c  | 2 +-
 examples/load_balancer/config.c| 2 +-
 examples/load_balancer/init.c  | 2 +-
 examples/load_balancer/main.c  | 2 +-
 examples/load_balancer/runtime.c   | 2 +-
 examples/multi_process/client_server_mp/mp_client/client.c | 2 +-
 examples/multi_process/client_server_mp/mp_server/init.c   | 2 +-
 examples/multi_process/client_server_mp/mp_server/main.c   | 2 +-
 examples/multi_process/l2fwd_fork/flib.c   | 2 +-
 examples/multi_process/l2fwd_fork/main.c   | 2 +-
 examples/multi_process/symmetric_mp/main.c | 2 +-
 examples/performance-thread/l3fwd-thread/main.c| 2 +-
 examples/vmdq/main.c   | 2 +-
 examples/vmdq_dcb/main.c   | 2 +-
 lib/librte_cryptodev/rte_cryptodev.c   | 1 -
 lib/librte_cryptodev/rte_cryptodev_pmd.h   | 1 -
 lib/librte_eal/bsdapp/eal/eal.c

[dpdk-dev] [PATCH 05/14] eal/common: introduce function to_pci_device

2016-01-04 Thread Jan Viktorin

Signed-off-by: Jan Viktorin 
---
 lib/librte_eal/common/include/rte_pci.h | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 1321654..204ee82 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -171,6 +171,17 @@ struct rte_pci_device {
enum rte_kernel_driver kdrv; /**< Kernel driver passthrough */
 };

+static inline struct rte_pci_device *
+to_pci_device(void *p)
+{
+   unsigned int *magic = (unsigned int *) p;
+   if (*magic == RTE_PCI_DEVICE_MAGIC)
+   return (struct rte_pci_device *) p;
+
+   rte_panic("%s: bad cast (%p: %08x)\n", __func__, p, *magic);
+   return NULL;
+}
+
 /** Any PCI device identifier (vendor, device, ...) */
#define PCI_ANY_ID (0xffff)

-- 
2.6.3

[dpdk-dev] [PATCH 04/14] eal/common: introduce function to_pci_driver

2016-01-04 Thread Jan Viktorin

Signed-off-by: Jan Viktorin 
---
 lib/librte_cryptodev/rte_cryptodev.c|  3 +--
 lib/librte_eal/common/include/rte_pci.h | 12 
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/lib/librte_cryptodev/rte_cryptodev.c b/lib/librte_cryptodev/rte_cryptodev.c
index f09f67e..682c1aa 100644
--- a/lib/librte_cryptodev/rte_cryptodev.c
+++ b/lib/librte_cryptodev/rte_cryptodev.c
@@ -424,8 +424,7 @@ rte_cryptodev_pmd_driver_register(struct rte_cryptodev_driver *cryptodrv,
 {
/* Call crypto device initialization directly if device is virtual */
if (type == PMD_VDEV)
-   return rte_cryptodev_init((struct rte_pci_driver *)cryptodrv,
-   NULL);
+   return rte_cryptodev_init(to_pci_driver(cryptodrv), NULL);

/*
 * Register PCI driver for physical device intialisation during
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 54d0fe2..1321654 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -82,6 +82,7 @@ extern "C" {
 #include 
 #include 

+#include 
 #include 

 TAILQ_HEAD(pci_device_list, rte_pci_device); /**< PCI devices in D-linked Q. */
@@ -215,6 +216,17 @@ struct rte_pci_driver {
uint32_t drv_flags; /**< Flags controlling handling of device. */
 };

+static inline struct rte_pci_driver *
+to_pci_driver(void *p)
+{
+   unsigned int *magic = (unsigned int *) p;
+   if (*magic == RTE_PCI_DRV_MAGIC)
+   return (struct rte_pci_driver *) p;
+
+   rte_panic("%s: bad cast (%p: %08x)\n", __func__, p, *magic);
+   return NULL;
+}
+
 /** Device needs PCI BAR mapping (done with either IGB_UIO or VFIO) */
 #define RTE_PCI_DRV_NEED_MAPPING 0x0001
 /** Device driver must be registered several times until failure - deprecated */
-- 
2.6.3

[dpdk-dev] [PATCH 03/14] eal/common: introduce union rte_device and related

2016-01-04 Thread Jan Viktorin

The union rte_device can be used in situations where we want to work with all
devices without distinguishing among bus-specific features (PCI, ...).
The target device type can be detected by reading the magic.

Also, the macros RTE_DEVICE_DECL and RTE_DEVICE_PTR_DECL are introduced to
provide a generic way to declare a device or a pointer to a device. The macros
aim to preserve API backwards-compatibility. E.g.:

    struct old_super_struct {              =>  struct old_super_struct {
        struct rte_pci_device *pci_dev;    =>      RTE_DEVICE_PTR_DECL(pci_dev);
        ...                                =>      ...
    };                                     =>  };

struct old_super_struct inst;

The new code should reference inst.dev.pci, the old code can still use the
inst.pci_dev. The previously introduced magic is included so one can ask the
instance about its type:

if (inst.dev.magic == RTE_PCI_DEVICE_MAGIC) {
...
}

I don't like to include the rte_pci.h header here, however, I didn't find
a better way at the moment.

Signed-off-by: Jan Viktorin 
---
 lib/librte_eal/common/include/rte_dev.h | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/lib/librte_eal/common/include/rte_dev.h b/lib/librte_eal/common/include/rte_dev.h
index f1b5507..c99d038 100644
--- a/lib/librte_eal/common/include/rte_dev.h
+++ b/lib/librte_eal/common/include/rte_dev.h
@@ -49,6 +49,7 @@ extern "C" {
 #include 
 #include 

+#include <rte_pci.h>
 #include 

 __attribute__((format(printf, 2, 0)))
@@ -134,6 +135,32 @@ struct rte_driver {
 };

 /**
+ * Generic representation of a device.
+ */
+union rte_device {
+   unsigned int magic;
+   struct rte_pci_device pci;
+};
+
+/**
+ * The macro preserves API backwards compatibility.
+ */
+#define RTE_DEVICE_DECL(_pci)   \
+   union { \
+   union rte_device dev;   \
+   struct rte_pci_device _pci; \
+   }
+
+/**
+ * The macro preserves API backwards compatibility.
+ */
+#define RTE_DEVICE_PTR_DECL(_pci)\
+   union {  \
+   union rte_device *dev;   \
+   struct rte_pci_device *_pci; \
+   }
+
+/**
  * Register a device driver.
  *
  * @param driver
-- 
2.6.3
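
The effect of the pointer-declaring macro can be tried out with a self-contained sketch. Everything prefixed `demo_`/`DEMO_` is a hypothetical stand-in for the corresponding `rte_` name, the magic value is made up, and the code relies on C11 anonymous unions and on the magic being the first member of the PCI device structure.

```c
#include <stddef.h>

#define DEMO_PCI_DEVICE_MAGIC 0x1u /* hypothetical value */

struct demo_pci_device {
    unsigned int magic; /* first member, so it overlays demo_device.magic */
    int numa_node;
};

/* Generic device: the magic tells which bus-specific member is valid. */
union demo_device {
    unsigned int magic;
    struct demo_pci_device pci;
};

/* Declares one pointer that is reachable under two names: the generic
 * "dev" and the legacy PCI-specific name passed as _pci. */
#define DEMO_DEVICE_PTR_DECL(_pci)        \
    union {                               \
        union demo_device *dev;           \
        struct demo_pci_device *_pci;     \
    }

struct old_super_struct {
    DEMO_DEVICE_PTR_DECL(pci_dev);
    int other;
};

/* New-style access: check the magic, then use the PCI view. */
static int demo_numa_node(const struct old_super_struct *s)
{
    if (s->dev != NULL && s->dev->magic == DEMO_PCI_DEVICE_MAGIC)
        return s->dev->pci.numa_node;
    return -1;
}
```

Old code can keep assigning `s.pci_dev`, while new code reads the same pointer through `s.dev` and checks the magic before touching bus-specific fields.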

[dpdk-dev] [PATCH 02/14] eal/common: introduce RTE_PCI_DEVICE_MAGIC

2016-01-04 Thread Jan Viktorin

Signed-off-by: Jan Viktorin 
---
 lib/librte_eal/common/include/rte_pci.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index db8382f..54d0fe2 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -156,6 +156,8 @@ enum rte_kernel_driver {
  * A structure describing a PCI device.
  */
 struct rte_pci_device {
+#define RTE_PCI_DEVICE_MAGIC 0x0001
+   unsigned int magic; /**< PCI device magic */
TAILQ_ENTRY(rte_pci_device) next;   /**< Next probed PCI device. */
struct rte_pci_addr addr;   /**< PCI location. */
struct rte_pci_id id;   /**< PCI ID. */
-- 
2.6.3

[dpdk-dev] [PATCH 01/14] eal/common: introduce RTE_PCI_DRV_MAGIC

2016-01-04 Thread Jan Viktorin

To distinguish between different types of drivers, include a member .magic
at the beginning of the rte_pci_driver structure.
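
Placing the magic first is what makes a checked cast from an untyped pointer possible. A minimal sketch, assuming hypothetical `demo_` names and a made-up magic value; unlike the to_pci_driver() helper later in the series, which panics on a mismatch, this version returns NULL so it can be exercised in a test.

```c
#include <stddef.h>

#define DEMO_PCI_DRV_MAGIC 0x50434944u /* hypothetical value */

struct demo_pci_driver {
    unsigned int magic; /* first member: a pointer to the struct also points here */
    const char *name;
};

/* Some other driver type that shares only the leading magic field. */
struct demo_other_driver {
    unsigned int magic;
    int flags;
};

/* Return p as a PCI driver when its magic matches, NULL otherwise. */
static struct demo_pci_driver *demo_to_pci_driver(void *p)
{
    if (p == NULL)
        return NULL;
    if (*(const unsigned int *)p == DEMO_PCI_DRV_MAGIC)
        return (struct demo_pci_driver *)p;
    return NULL;
}
```

The check only works if every driver structure passed through such a pointer starts with an `unsigned int` magic, which is why each driver definition in this patch gains a `.magic` initializer.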

Signed-off-by: Jan Viktorin 
---
 drivers/net/bnx2x/bnx2x_ethdev.c| 2 ++
 drivers/net/cxgbe/cxgbe_ethdev.c| 1 +
 drivers/net/e1000/em_ethdev.c   | 1 +
 drivers/net/e1000/igb_ethdev.c  | 2 ++
 drivers/net/enic/enic_ethdev.c  | 1 +
 drivers/net/fm10k/fm10k_ethdev.c| 1 +
 drivers/net/i40e/i40e_ethdev.c  | 1 +
 drivers/net/i40e/i40e_ethdev_vf.c   | 1 +
 drivers/net/ixgbe/ixgbe_ethdev.c| 2 ++
 drivers/net/mlx4/mlx4.c | 1 +
 drivers/net/mlx5/mlx5.c | 1 +
 drivers/net/virtio/virtio_ethdev.c  | 1 +
 drivers/net/vmxnet3/vmxnet3_ethdev.c| 1 +
 lib/librte_eal/common/include/rte_pci.h | 2 ++
 14 files changed, 18 insertions(+)

diff --git a/drivers/net/bnx2x/bnx2x_ethdev.c b/drivers/net/bnx2x/bnx2x_ethdev.c
index 69df02e..e23dae2 100644
--- a/drivers/net/bnx2x/bnx2x_ethdev.c
+++ b/drivers/net/bnx2x/bnx2x_ethdev.c
@@ -498,6 +498,7 @@ eth_bnx2xvf_dev_init(struct rte_eth_dev *eth_dev)

 static struct eth_driver rte_bnx2x_pmd = {
.pci_drv = {
+   .magic = RTE_PCI_DRV_MAGIC,
.name = "rte_bnx2x_pmd",
.id_table = pci_id_bnx2x_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
@@ -511,6 +512,7 @@ static struct eth_driver rte_bnx2x_pmd = {
  */
 static struct eth_driver rte_bnx2xvf_pmd = {
.pci_drv = {
+   .magic = RTE_PCI_DRV_MAGIC,
.name = "rte_bnx2xvf_pmd",
.id_table = pci_id_bnx2xvf_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
diff --git a/drivers/net/cxgbe/cxgbe_ethdev.c b/drivers/net/cxgbe/cxgbe_ethdev.c
index 97ef152..6807e50 100644
--- a/drivers/net/cxgbe/cxgbe_ethdev.c
+++ b/drivers/net/cxgbe/cxgbe_ethdev.c
@@ -848,6 +848,7 @@ out_free_adapter:

 static struct eth_driver rte_cxgbe_pmd = {
.pci_drv = {
+   .magic = RTE_PCI_DRV_MAGIC,
.name = "rte_cxgbe_pmd",
.id_table = cxgb4_pci_tbl,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index 66e8993..ffd0363 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -336,6 +336,7 @@ eth_em_dev_uninit(struct rte_eth_dev *eth_dev)

 static struct eth_driver rte_em_pmd = {
.pci_drv = {
+   .magic = RTE_PCI_DRV_MAGIC,
.name = "rte_em_pmd",
.id_table = pci_id_em_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index d1bbcda..413072d 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -957,6 +957,7 @@ eth_igbvf_dev_uninit(struct rte_eth_dev *eth_dev)

 static struct eth_driver rte_igb_pmd = {
.pci_drv = {
+   .magic = RTE_PCI_DRV_MAGIC,
.name = "rte_igb_pmd",
.id_table = pci_id_igb_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
@@ -972,6 +973,7 @@ static struct eth_driver rte_igb_pmd = {
  */
 static struct eth_driver rte_igbvf_pmd = {
.pci_drv = {
+   .magic = RTE_PCI_DRV_MAGIC,
.name = "rte_igbvf_pmd",
.id_table = pci_id_igbvf_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c
index 2a88043..44ea5f9 100644
--- a/drivers/net/enic/enic_ethdev.c
+++ b/drivers/net/enic/enic_ethdev.c
@@ -622,6 +622,7 @@ static int eth_enicpmd_dev_init(struct rte_eth_dev *eth_dev)

 static struct eth_driver rte_enic_pmd = {
.pci_drv = {
+   .magic = RTE_PCI_DRV_MAGIC,
.name = "rte_enic_pmd",
.id_table = pci_id_enic_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index e4aed94..e372225 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -2749,6 +2749,7 @@ static const struct rte_pci_id pci_id_fm10k_map[] = {

 static struct eth_driver rte_pmd_fm10k = {
.pci_drv = {
+   .magic = RTE_PCI_DRV_MAGIC,
.name = "rte_pmd_fm10k",
.id_table = pci_id_fm10k_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index bf6220d..f642bce 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -599,6 +599,7 @@ static const struct rte_i40e_xstats_name_off rte_i40e_txq_prio_strings[] = {

 static struct eth_driver rte_i40e_pmd = {
.pci_drv = {
+

[dpdk-dev] [PATCH 00/14] Step towards PCI independency

2016-01-04 Thread Jan Viktorin

Hello DPDK community,

A few days ago, I proposed an RFC of a new infrastructure that allows detecting
non-PCI devices present on SoC systems. That is, however, the easier part
of the story. To support non-PCI devices, much deeper changes in DPDK are
necessary. In this patch series, I am proposing changes that show
a possible way to do it.

I extended the rte_pci_{device,driver} with a member .magic. This member holds
a magic number unique to the PCI-infra. Another one (SoC-infra) would get a
different set of magics. This makes it possible to define unions of bus-specific
devices and drivers without losing information about the original data type. It
can also add some type-safety into the system. It solves the problem of a missing
'type' member in the eth_driver structure.
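
To illustrate the idea, here is a minimal sketch of a magic-tagged union. It
is not the code from this series: the structure names, the SoC variant and
the value SOC_DEVICE_MAGIC are assumptions made up for the example, and only
the PCI magic value 0x0001 is taken from patch 02/14.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the structures in this series. */
#define PCI_DEVICE_MAGIC 0x0001u  /* mirrors RTE_PCI_DEVICE_MAGIC */
#define SOC_DEVICE_MAGIC 0x0002u  /* assumed value for a future SoC infra */

struct pci_device {
	unsigned int magic;      /* must stay the first member */
	int vendor_id;
};

struct soc_device {
	unsigned int magic;      /* must stay the first member */
	const char *compatible;
};

union device {
	unsigned int magic;      /* overlays the first member of each variant */
	struct pci_device pci;
	struct soc_device soc;
};

/* Return the PCI view of the device, or NULL if it is not a PCI device. */
static struct pci_device *
to_pci_device(union device *dev)
{
	if (dev == NULL || dev->magic != PCI_DEVICE_MAGIC)
		return NULL;
	return &dev->pci;
}
```

Because every variant starts with the same member, generic code can hold a
pointer to the union and check the magic before touching bus-specific fields.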

Those extensions are then used to generalize the librte_ether library, which
seems (to me) to be independent of PCI now. What is important, the API
stays backwards compatible at the moment. From the ABI point of view, I am
afraid that the .magic member breaks it anyway...

The code builds successfully for both x86_64 and ARMv7. I did not test it at
runtime as the tests are not very suitable for this.

This patch set is independent of the previous one (which was adding the SoC
infra), however, if it is approved I expect them to be joined or to be made
dependent on each other in some way.

Regards
Jan
---
Jan Viktorin (14):
  eal/common: introduce RTE_PCI_DRV_MAGIC
  eal/common: introduce RTE_PCI_DEVICE_MAGIC
  eal/common: introduce union rte_device and related
  eal/common: introduce function to_pci_driver
  eal/common: introduce function to_pci_device
  Include rte_dev.h instead of rte_pci.h
  lib/ether: generalize rte_eth_dev_init/uninit
  eal/common: introduce rte_bus_addr
  lib/ether: generalize attach/detach of devices
  lib/ether: copy the rte_device union instead of rte_pci_device
  lib/ether: extract function rte_device_get_intr_handle
  lib/ether: check magic before naming a zone
  lib/ether: check magic in rte_eth_copy_pci_info
  lib/ether: introduce rte_eth_copy_dev_info

 app/test-pipeline/config.c |   2 +-
 app/test-pipeline/init.c   |   2 +-
 app/test-pipeline/main.c   |   2 +-
 app/test-pipeline/runtime.c|   2 +-
 app/test-pmd/cmdline.c |   2 +-
 app/test-pmd/config.c  |   2 +-
 app/test-pmd/csumonly.c|   2 +-
 app/test-pmd/flowgen.c |   2 +-
 app/test-pmd/iofwd.c   |   2 +-
 app/test-pmd/macfwd-retry.c|   2 +-
 app/test-pmd/macfwd.c  |   2 +-
 app/test-pmd/macswap.c |   2 +-
 app/test-pmd/parameters.c  |   2 +-
 app/test-pmd/rxonly.c  |   2 +-
 app/test-pmd/testpmd.c |   2 +-
 app/test-pmd/txonly.c  |   2 +-
 app/test/test_pci.c|   2 +-
 drivers/net/bnx2x/bnx2x_ethdev.c   |   2 +
 drivers/net/bnx2x/bnx2x_ethdev.h   |   2 +-
 drivers/net/cxgbe/base/t4_hw.c |   2 +-
 drivers/net/cxgbe/cxgbe_ethdev.c   |   3 +-
 drivers/net/cxgbe/cxgbe_main.c |   2 +-
 drivers/net/cxgbe/sge.c|   2 +-
 drivers/net/e1000/em_ethdev.c  |   3 +-
 drivers/net/e1000/em_rxtx.c|   2 +-
 drivers/net/e1000/igb_ethdev.c |   4 +-
 drivers/net/e1000/igb_rxtx.c   |   2 +-
 drivers/net/enic/base/vnic_dev.h   |   2 +-
 drivers/net/enic/enic_ethdev.c |   2 +-
 drivers/net/enic/enic_main.c   |   2 +-
 drivers/net/fm10k/fm10k_ethdev.c   |   1 +
 drivers/net/i40e/i40e_ethdev.c |   3 +-
 drivers/net/i40e/i40e_ethdev_vf.c  |   3 +-
 drivers/net/i40e/i40e_pf.c |   2 +-
 drivers/net/ixgbe/ixgbe_ethdev.c   |   4 +-
 drivers/net/ixgbe/ixgbe_fdir.c |   2 +-
 drivers/net/ixgbe/ixgbe_rxtx.c |   2 +-
 drivers/net/mlx4/mlx4.c|   1 +
 drivers/net/mlx5/mlx5.c|   3 +-
 drivers/net/virtio/virtio_ethdev.c |   3 +-
 drivers/net/vmxnet3/vmxnet3_ethdev.c   |   3 +-
 drivers/net/vmxnet3/vmxnet3_rxtx.c |   2 +-
 examples/bond/main.c   |   2 +-
 examples/dpdk_qat/main.c   |   2 +-
 examples/exception_path/main.c |   2 +-
 examples/ip_fragmentation/main.c   |   2 +-
 examples/ip_reassembly/main.c  |   2 +-
 examples/ipv4_multicast/main.c

[dpdk-dev] [PATCH 12/12] examples/l3fwd: add option to parse ptype

2016-01-04 Thread Ananyev, Konstantin


Hi Jianfeng,
> -Original Message-
> From: Tan, Jianfeng
> Sent: Thursday, December 31, 2015 6:53 AM
> To: dev at dpdk.org
> Cc: Zhang, Helin; Ananyev, Konstantin; Tan, Jianfeng
> Subject: [PATCH 12/12] examples/l3fwd: add option to parse ptype
> 
> First, use the rte_eth_dev_get_ptype_info() API to check whether the device
> parses the needed packet types. If not, specify the newly added option
> --parse-ptype to parse them in software in an RX callback.
> 
> Signed-off-by: Jianfeng Tan 
> ---
>  examples/l3fwd/main.c | 86 
> +++
>  1 file changed, 86 insertions(+)
> 
> diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
> index 5b0c2dd..ccbdce3 100644
> --- a/examples/l3fwd/main.c
> +++ b/examples/l3fwd/main.c
> @@ -174,6 +174,7 @@ static __m128i val_eth[RTE_MAX_ETHPORTS];
>  static uint32_t enabled_port_mask = 0;
>  static int promiscuous_on = 0; /**< Ports set in promiscuous mode off by 
> default. */
>  static int numa_on = 1; /**< NUMA is enabled by default. */
> +static int parse_ptype = 0; /**< parse packet type using rx callback */
> 
>  #if (APP_LOOKUP_METHOD == APP_LOOKUP_EXACT_MATCH)
>  static int ipv6 = 0; /**< ipv6 is false by default. */
> @@ -2022,6 +2023,7 @@ parse_eth_dest(const char *optarg)
>  #define CMD_LINE_OPT_IPV6 "ipv6"
>  #define CMD_LINE_OPT_ENABLE_JUMBO "enable-jumbo"
>  #define CMD_LINE_OPT_HASH_ENTRY_NUM "hash-entry-num"
> +#define CMD_LINE_OPT_PARSE_PTYPE "parse-ptype"
> 
>  /* Parse the argument given in the command line of the application */
>  static int
> @@ -2038,6 +2040,7 @@ parse_args(int argc, char **argv)
>   {CMD_LINE_OPT_IPV6, 0, 0, 0},
>   {CMD_LINE_OPT_ENABLE_JUMBO, 0, 0, 0},
>   {CMD_LINE_OPT_HASH_ENTRY_NUM, 1, 0, 0},
> + {CMD_LINE_OPT_PARSE_PTYPE, 0, 0, 0},
>   {NULL, 0, 0, 0}
>   };
> 
> @@ -2125,6 +2128,12 @@ parse_args(int argc, char **argv)
>   }
>   }
>  #endif
> + if (!strncmp(lgopts[option_index].name, 
> CMD_LINE_OPT_PARSE_PTYPE,
> + sizeof(CMD_LINE_OPT_PARSE_PTYPE))) {
> + printf("soft parse-ptype is enabled \n");
> + parse_ptype = 1;
> + }
> +
>   break;
> 
>   default:
> @@ -2559,6 +2568,75 @@ check_all_ports_link_status(uint8_t port_num, uint32_t 
> port_mask)
>   }
>  }
> 
> +static int
> +check_packet_type_ok(int portid)
> +{
> + int i;
> + int ret;
> + uint32_t ptypes[RTE_PTYPE_L3_MAX_NUM];
> + int ptype_l3_ipv4 = 0, ptype_l3_ipv6 = 0;
> +
> + ret = rte_eth_dev_get_ptype_info(portid, RTE_PTYPE_L3_MASK, ptypes);
> + for (i = 0; i < ret; ++i) {
> + if (ptypes[i] & RTE_PTYPE_L3_IPV4)
> + ptype_l3_ipv4 = 1;
> + if (ptypes[i] & RTE_PTYPE_L3_IPV6)
> + ptype_l3_ipv6 = 1;
> + }
> +
> + if (ptype_l3_ipv4 == 0)
> + printf("port %d cannot parse RTE_PTYPE_L3_IPV4\n", portid);
> +
> + if (ptype_l3_ipv6 == 0)
> + printf("port %d cannot parse RTE_PTYPE_L3_IPV6\n", portid);
> +
> + if (ptype_l3_ipv4 || ptype_l3_ipv6)
> + return 1;
> +
> + return 0;
> +}
> +static inline void
> +parse_packet_type(struct rte_mbuf *m)
> +{
> + struct ether_hdr *eth_hdr;
> + struct vlan_hdr *vlan_hdr;
> + uint32_t packet_type = 0;
> + uint16_t ethertype;
> +
> + eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
> + ethertype = rte_be_to_cpu_16(eth_hdr->ether_type);
> + if (ethertype == ETHER_TYPE_VLAN) {

I don't think either LPM or EM support packets with VLAN right now.
So, probably there is no need to support it here.

> + vlan_hdr = (struct vlan_hdr *)(eth_hdr + 1);
> + ethertype = rte_be_to_cpu_16(vlan_hdr->eth_proto);
> + }
> + switch (ethertype) {
> + case ETHER_TYPE_IPv4:
> + packet_type |= RTE_PTYPE_L3_IPV4_EXT_UNKNOWN;
> + break;
> + case ETHER_TYPE_IPv6:
> + packet_type |= RTE_PTYPE_L3_IPV6_EXT_UNKNOWN;
> + break;
> + default:
> + break;
> + }
> +
> + m->packet_type = packet_type;

Probably:
m->packet_type |= packet_type;
in case HW supports some other packet types.
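
A small sketch of the difference, using made-up bit values rather than the
real RTE_PTYPE_* constants. It assumes the hardware has not already filled in
the L3 field, since OR-ing two different values into the same field would
produce a bogus type.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit values, chosen only for this illustration. */
#define PTYPE_L2_ETHER 0x00000001u
#define PTYPE_L3_IPV4  0x00000010u

/* Plain assignment: whatever the hardware reported is lost. */
static uint32_t
set_ptype_assign(uint32_t hw_ptype, uint32_t sw_ptype)
{
	(void)hw_ptype;
	return sw_ptype;
}

/* OR-assignment: hardware-reported bits are kept, software bits are added. */
static uint32_t
set_ptype_or(uint32_t hw_ptype, uint32_t sw_ptype)
{
	return hw_ptype | sw_ptype;
}
```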

> +}
> +
> +static uint16_t
> +cb_parse_packet_type(uint8_t port __rte_unused,
> + uint16_t queue __rte_unused,
> + struct rte_mbuf *pkts[],
> + uint16_t nb_pkts,
> + uint16_t max_pkts __rte_unused,
> + void *user_param __rte_unused)
> +{
> + unsigned i;
> +
> + for (i = 0; i < nb_pkts; ++i)
> + parse_packet_type(pkts[i]);
> +}
> +
>  int
>  main(int argc, char **argv)
>  {
> @@ -2672,6 +2750,11 @@ main(int argc, char **argv)
>   rte_exit(EXIT_FAILURE, "rte_eth_tx_queue_setup: 
> err=%d, "
>

[dpdk-dev] [PATCH v2 4/4] vmxnet3: announce device offload capability

2016-01-04 Thread Yong Wang

Signed-off-by: Yong Wang 
---
 drivers/net/vmxnet3/vmxnet3_ethdev.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.c b/drivers/net/vmxnet3/vmxnet3_ethdev.c
index c363bf6..8a40127 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethdev.c
+++ b/drivers/net/vmxnet3/vmxnet3_ethdev.c
@@ -693,7 +693,8 @@ vmxnet3_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
 }

 static void
-vmxnet3_dev_info_get(__attribute__((unused))struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
+vmxnet3_dev_info_get(__attribute__((unused))struct rte_eth_dev *dev,
+struct rte_eth_dev_info *dev_info)
 {
dev_info->max_rx_queues = VMXNET3_MAX_RX_QUEUES;
dev_info->max_tx_queues = VMXNET3_MAX_TX_QUEUES;
@@ -716,6 +717,17 @@ vmxnet3_dev_info_get(__attribute__((unused))struct rte_eth_dev *dev, struct rte_
.nb_min = VMXNET3_DEF_TX_RING_SIZE,
.nb_align = 1,
};
+
+   dev_info->rx_offload_capa =
+   DEV_RX_OFFLOAD_VLAN_STRIP |
+   DEV_RX_OFFLOAD_UDP_CKSUM |
+   DEV_RX_OFFLOAD_TCP_CKSUM;
+
+   dev_info->tx_offload_capa =
+   DEV_TX_OFFLOAD_VLAN_INSERT |
+   DEV_TX_OFFLOAD_TCP_CKSUM |
+   DEV_TX_OFFLOAD_UDP_CKSUM |
+   DEV_TX_OFFLOAD_TCP_TSO;
 }

 /* return 0 means link status changed, -1 means not changed */
@@ -819,7 +831,7 @@ vmxnet3_dev_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vid, int on)
else
VMXNET3_CLEAR_VFTABLE_ENTRY(hw->shadow_vfta, vid);

-   /* don't change active filter if in promiscious mode */
+   /* don't change active filter if in promiscuous mode */
if (rxConf->rxMode & VMXNET3_RXM_PROMISC)
return 0;

-- 
1.9.1

[dpdk-dev] [PATCH v2 3/4] vmxnet3: add TSO support

2016-01-04 Thread Yong Wang

This commit adds vmxnet3 TSO support.

Verified with test-pmd (set fwd csum) that both tso and non-tso
pkts can be successfully transmitted and all segments for a tso
pkt are correct on the receiver side.

Signed-off-by: Yong Wang 
---
 doc/guides/rel_notes/release_2_3.rst |   3 +
 drivers/net/vmxnet3/vmxnet3_ring.h   |  13 
 drivers/net/vmxnet3/vmxnet3_rxtx.c   | 117 ++-
 3 files changed, 92 insertions(+), 41 deletions(-)

diff --git a/doc/guides/rel_notes/release_2_3.rst b/doc/guides/rel_notes/release_2_3.rst
index 58205fe..ae487bb 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -24,6 +24,9 @@ Drivers

   Support TCP/UDP checksum offload.

+* **vmxnet3: add TSO support.**
+
+
 Libraries
 ~

diff --git a/drivers/net/vmxnet3/vmxnet3_ring.h b/drivers/net/vmxnet3/vmxnet3_ring.h
index 612487e..15b19e1 100644
--- a/drivers/net/vmxnet3/vmxnet3_ring.h
+++ b/drivers/net/vmxnet3/vmxnet3_ring.h
@@ -130,18 +130,6 @@ struct vmxnet3_txq_stats {
uint64_ttx_ring_full;
 };

-typedef struct vmxnet3_tx_ctx {
-   int  ip_type;
-   bool is_vlan;
-   bool is_cso;
-
-   uint16_t evl_tag;   /* only valid when is_vlan == TRUE */
-   uint32_t eth_hdr_size;  /* only valid for pkts requesting tso or csum
-* offloading */
-   uint32_t ip_hdr_size;
-   uint32_t l4_hdr_size;
-} vmxnet3_tx_ctx_t;
-
 typedef struct vmxnet3_tx_queue {
struct vmxnet3_hw*hw;
struct vmxnet3_cmd_ring  cmd_ring;
@@ -155,7 +143,6 @@ typedef struct vmxnet3_tx_queue {
uint8_t  port_id;   /**< Device port 
identifier. */
 } vmxnet3_tx_queue_t;

-
 struct vmxnet3_rxq_stats {
uint64_t drop_total;
uint64_t drop_err;
diff --git a/drivers/net/vmxnet3/vmxnet3_rxtx.c b/drivers/net/vmxnet3/vmxnet3_rxtx.c
index 08e6115..1dd793e 100644
--- a/drivers/net/vmxnet3/vmxnet3_rxtx.c
+++ b/drivers/net/vmxnet3/vmxnet3_rxtx.c
@@ -295,27 +295,46 @@ vmxnet3_dev_clear_queues(struct rte_eth_dev *dev)
}
 }

+static int
+vmxnet3_unmap_pkt(uint16_t eop_idx, vmxnet3_tx_queue_t *txq)
+{
+   int completed = 0;
+   struct rte_mbuf *mbuf;
+
+   /* Release cmd_ring descriptor and free mbuf */
+   VMXNET3_ASSERT(txq->cmd_ring.base[eop_idx].txd.eop == 1);
+
+   mbuf = txq->cmd_ring.buf_info[eop_idx].m;
+   if (unlikely(mbuf == NULL))
+   rte_panic("EOP desc does not point to a valid mbuf");
+   else
+   rte_pktmbuf_free(mbuf);
+
+   txq->cmd_ring.buf_info[eop_idx].m = NULL;
+
+   while (txq->cmd_ring.next2comp != eop_idx) {
+   /* no out-of-order completion */
+   VMXNET3_ASSERT(txq->cmd_ring.base[txq->cmd_ring.next2comp].txd.cq == 0);
+   vmxnet3_cmd_ring_adv_next2comp(&txq->cmd_ring);
+   completed++;
+   }
+
+   /* Mark the txd for which tcd was generated as completed */
+   vmxnet3_cmd_ring_adv_next2comp(&txq->cmd_ring);
+
+   return completed + 1;
+}
+
 static void
 vmxnet3_tq_tx_complete(vmxnet3_tx_queue_t *txq)
 {
int completed = 0;
-   struct rte_mbuf *mbuf;
vmxnet3_comp_ring_t *comp_ring = &txq->comp_ring;
struct Vmxnet3_TxCompDesc *tcd = (struct Vmxnet3_TxCompDesc *)
(comp_ring->base + comp_ring->next2proc);

while (tcd->gen == comp_ring->gen) {
-   /* Release cmd_ring descriptor and free mbuf */
-   VMXNET3_ASSERT(txq->cmd_ring.base[tcd->txdIdx].txd.eop == 1);
-   while (txq->cmd_ring.next2comp != tcd->txdIdx) {
-   mbuf = txq->cmd_ring.buf_info[txq->cmd_ring.next2comp].m;
-   txq->cmd_ring.buf_info[txq->cmd_ring.next2comp].m = NULL;
-   rte_pktmbuf_free_seg(mbuf);
-
-   /* Mark the txd for which tcd was generated as completed */
-   vmxnet3_cmd_ring_adv_next2comp(&txq->cmd_ring);
-   completed++;
-   }
+   completed += vmxnet3_unmap_pkt(tcd->txdIdx, txq);

vmxnet3_comp_ring_adv_next2proc(comp_ring);
tcd = (struct Vmxnet3_TxCompDesc *)(comp_ring->base +
@@ -325,6 +344,13 @@ vmxnet3_tq_tx_complete(vmxnet3_tx_queue_t *txq)
PMD_TX_LOG(DEBUG, "Processed %d tx comps & command descs.", completed);
 }

+/* The number of descriptors that are needed for a packet. */
+static unsigned
+txd_estimate(const struct rte_mbuf *m)
+{
+   return m->nb_segs;
+}
+
 uint16_t
 vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
  uint16_t nb_pkts)
@@ -351,21 +377,42 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
struct rte_mbuf *txm = tx_pkts[nb_tx];
struct rte_mbuf *m_seg = txm;
int copy_size = 0;
+

[dpdk-dev] [PATCH v2 2/4] vmxnet3: add tx l4 cksum offload

2016-01-04 Thread Yong Wang

Support TCP/UDP checksum offload.

Signed-off-by: Yong Wang 
---
 doc/guides/rel_notes/release_2_3.rst |  3 +++
 drivers/net/vmxnet3/vmxnet3_rxtx.c   | 39 +++-
 2 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/doc/guides/rel_notes/release_2_3.rst b/doc/guides/rel_notes/release_2_3.rst
index a23c8ac..58205fe 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -20,6 +20,9 @@ Drivers
   Tx data ring has been shown to improve small pkt forwarding performance
   on vSphere environment.

+* **vmxnet3: add tx l4 cksum offload.**
+
+  Support TCP/UDP checksum offload.

 Libraries
 ~
diff --git a/drivers/net/vmxnet3/vmxnet3_rxtx.c b/drivers/net/vmxnet3/vmxnet3_rxtx.c
index 2202d31..08e6115 100644
--- a/drivers/net/vmxnet3/vmxnet3_rxtx.c
+++ b/drivers/net/vmxnet3/vmxnet3_rxtx.c
@@ -332,6 +332,8 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
uint16_t nb_tx;
vmxnet3_tx_queue_t *txq = tx_queue;
struct vmxnet3_hw *hw = txq->hw;
+   Vmxnet3_TxQueueCtrl *txq_ctrl = &txq->shared->ctrl;
+   uint32_t deferred = rte_le_to_cpu_32(txq_ctrl->txNumDeferred);

if (unlikely(txq->stopped)) {
PMD_TX_LOG(DEBUG, "Tx queue is stopped.");
@@ -413,21 +415,40 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
gdesc->txd.tci = txm->vlan_tci;
}

-   /* TODO: Add transmit checksum offload here */
+   if (txm->ol_flags & PKT_TX_L4_MASK) {
+   gdesc->txd.om = VMXNET3_OM_CSUM;
+   gdesc->txd.hlen = txm->l2_len + txm->l3_len;
+
+   switch (txm->ol_flags & PKT_TX_L4_MASK) {
+   case PKT_TX_TCP_CKSUM:
+   gdesc->txd.msscof = gdesc->txd.hlen + offsetof(struct tcp_hdr, cksum);
+   break;
+   case PKT_TX_UDP_CKSUM:
+   gdesc->txd.msscof = gdesc->txd.hlen + offsetof(struct udp_hdr, dgram_cksum);
+   break;
+   default:
+   PMD_TX_LOG(WARNING, "requested cksum offload not supported %#llx",
+  txm->ol_flags & PKT_TX_L4_MASK);
+   abort();
+   }
+   } else {
+   gdesc->txd.hlen = 0;
+   gdesc->txd.om = VMXNET3_OM_NONE;
+   gdesc->txd.msscof = 0;
+   }
+
+   txq_ctrl->txNumDeferred = rte_cpu_to_le_32(++deferred);

/* flip the GEN bit on the SOP */
rte_compiler_barrier();
gdesc->dword[2] ^= VMXNET3_TXD_GEN;
-
-   txq->shared->ctrl.txNumDeferred++;
nb_tx++;
}

-   PMD_TX_LOG(DEBUG, "vmxnet3 txThreshold: %u", txq->shared->ctrl.txThreshold);
-
-   if (txq->shared->ctrl.txNumDeferred >= txq->shared->ctrl.txThreshold) {
+   PMD_TX_LOG(DEBUG, "vmxnet3 txThreshold: %u", rte_le_to_cpu_32(txq_ctrl->txThreshold));

-   txq->shared->ctrl.txNumDeferred = 0;
+   if (deferred >= rte_le_to_cpu_32(txq_ctrl->txThreshold)) {
+   txq_ctrl->txNumDeferred = 0;
/* Notify vSwitch that packets are available. */
VMXNET3_WRITE_BAR0_REG(hw, (VMXNET3_REG_TXPROD + txq->queue_id * VMXNET3_REG_ALIGN),
   txq->cmd_ring.next2fill);
@@ -728,8 +749,8 @@ vmxnet3_dev_tx_queue_setup(struct rte_eth_dev *dev,
PMD_INIT_FUNC_TRACE();

if ((tx_conf->txq_flags & ETH_TXQ_FLAGS_NOXSUMS) !=
-   ETH_TXQ_FLAGS_NOXSUMS) {
-   PMD_INIT_LOG(ERR, "TX no support for checksum offload yet");
+   ETH_TXQ_FLAGS_NOXSUMSCTP) {
+   PMD_INIT_LOG(ERR, "SCTP checksum offload not supported");
return -EINVAL;
}

-- 
1.9.1
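
As a worked illustration of the offset arithmetic in the hunk above, the
sketch below computes the byte offset of the L4 checksum field from the start
of the frame. The header structs are simplified local copies and the helper
name is hypothetical; none of this is the PMD's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified header layouts, declared locally so the sketch stands alone. */
struct tcp_hdr_sketch {
	uint16_t src_port;
	uint16_t dst_port;
	uint32_t sent_seq;
	uint32_t recv_ack;
	uint8_t  data_off;
	uint8_t  tcp_flags;
	uint16_t rx_win;
	uint16_t cksum;
	uint16_t tcp_urp;
};

struct udp_hdr_sketch {
	uint16_t src_port;
	uint16_t dst_port;
	uint16_t dgram_len;
	uint16_t dgram_cksum;
};

enum l4_proto { L4_TCP, L4_UDP };

/* Offset of the checksum field: L2 + L3 header length + field offset in L4. */
static uint16_t
cksum_field_offset(uint16_t l2_len, uint16_t l3_len, enum l4_proto proto)
{
	size_t hlen = (size_t)l2_len + l3_len;

	if (proto == L4_TCP)
		return (uint16_t)(hlen + offsetof(struct tcp_hdr_sketch, cksum));
	return (uint16_t)(hlen + offsetof(struct udp_hdr_sketch, dgram_cksum));
}
```

For an untagged Ethernet frame (14 bytes) carrying IPv4 without options
(20 bytes), this places the TCP checksum at byte 50 and the UDP checksum at
byte 40.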

[dpdk-dev] [PATCH v2 1/4] vmxnet3: restore tx data ring support

2016-01-04 Thread Yong Wang

Tx data ring support was removed in a previous change
to add multi-seg transmit.  This change adds it back.

Fixes: 7ba5de417e3c ("vmxnet3: support multi-segment transmit")

Signed-off-by: Yong Wang 
---
 doc/guides/rel_notes/release_2_3.rst |  5 +
 drivers/net/vmxnet3/vmxnet3_rxtx.c   | 17 -
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_2_3.rst b/doc/guides/rel_notes/release_2_3.rst
index 99de186..a23c8ac 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -15,6 +15,11 @@ EAL
 Drivers
 ~~~

+* **vmxnet3: restore tx data ring.**
+
+  Tx data ring has been shown to improve small pkt forwarding performance
+  on vSphere environment.
+

 Libraries
 ~
diff --git a/drivers/net/vmxnet3/vmxnet3_rxtx.c b/drivers/net/vmxnet3/vmxnet3_rxtx.c
index 4de5d89..2202d31 100644
--- a/drivers/net/vmxnet3/vmxnet3_rxtx.c
+++ b/drivers/net/vmxnet3/vmxnet3_rxtx.c
@@ -348,6 +348,7 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
uint32_t first2fill, avail, dw2;
struct rte_mbuf *txm = tx_pkts[nb_tx];
struct rte_mbuf *m_seg = txm;
+   int copy_size = 0;

/* Is this packet execessively fragmented, then drop */
if (unlikely(txm->nb_segs > VMXNET3_MAX_TXD_PER_PKT)) {
@@ -365,6 +366,14 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
break;
}

+   if (rte_pktmbuf_pkt_len(txm) <= VMXNET3_HDR_COPY_SIZE) {
+   struct Vmxnet3_TxDataDesc *tdd;
+
+   tdd = txq->data_ring.base + txq->cmd_ring.next2fill;
+   copy_size = rte_pktmbuf_pkt_len(txm);
+   rte_memcpy(tdd->data, rte_pktmbuf_mtod(txm, char *), copy_size);
+   }
+
/* use the previous gen bit for the SOP desc */
dw2 = (txq->cmd_ring.gen ^ 0x1) << VMXNET3_TXD_GEN_SHIFT;
first2fill = txq->cmd_ring.next2fill;
@@ -377,7 +386,13 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
   transmit buffer size (16K) is greater than
   maximum sizeof mbuf segment size. */
gdesc = txq->cmd_ring.base + txq->cmd_ring.next2fill;
-   gdesc->txd.addr = RTE_MBUF_DATA_DMA_ADDR(m_seg);
+   if (copy_size)
+   gdesc->txd.addr = rte_cpu_to_le_64(txq->data_ring.basePA +
+   txq->cmd_ring.next2fill *
+   sizeof(struct Vmxnet3_TxDataDesc));
+   else
+   gdesc->txd.addr = RTE_MBUF_DATA_DMA_ADDR(m_seg);
+
gdesc->dword[2] = dw2 | m_seg->data_len;
gdesc->dword[3] = 0;

-- 
1.9.1
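
The address selection in the last hunk can be sketched in isolation as below.
The helper name and both sizes are assumptions for illustration
(HDR_COPY_SIZE stands in for VMXNET3_HDR_COPY_SIZE, DATA_DESC_SIZE for
sizeof(struct Vmxnet3_TxDataDesc)); byte-order conversion is left out.

```c
#include <assert.h>
#include <stdint.h>

#define HDR_COPY_SIZE  128u  /* assumed copy threshold in bytes */
#define DATA_DESC_SIZE 128u  /* assumed size of one data ring entry */

/*
 * Pick the buffer address to place in the Tx descriptor.
 * Small packets were copied into the data ring slot that matches the
 * command ring index, so the descriptor points there; larger packets
 * are sent straight from the mbuf.
 */
static uint64_t
tx_buf_addr(uint32_t pkt_len, uint64_t mbuf_dma_addr,
	    uint64_t data_ring_base_pa, uint32_t next2fill)
{
	if (pkt_len <= HDR_COPY_SIZE)
		return data_ring_base_pa + (uint64_t)next2fill * DATA_DESC_SIZE;
	return mbuf_dma_addr;
}
```

Since the slot index is the command ring's next2fill, the data ring needs no
fill or completion tracking of its own.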

[dpdk-dev] [PATCH v2 0/4] vmxnet3 TSO and tx cksum offload

2016-01-04 Thread Yong Wang

v2:
* fixed some logging issues when debug option turned on
* updated the txq_flags check in vmxnet3_dev_tx_queue_setup()

This patchset adds TCP/UDP checksum offload and TSO to vmxnet3 PMD.
One of the use cases for these features is to support STT.  It also
restores the tx data ring feature that was removed from a previous
patch.

Yong Wang (4):
  vmxnet3: restore tx data ring support
  vmxnet3: add tx l4 cksum offload
  vmxnet3: add TSO support
  vmxnet3: announce device offload capability

 doc/guides/rel_notes/release_2_3.rst |  11 +++
 drivers/net/vmxnet3/vmxnet3_ethdev.c |  16 +++-
 drivers/net/vmxnet3/vmxnet3_ring.h   |  13 ---
 drivers/net/vmxnet3/vmxnet3_rxtx.c   | 169 +++
 4 files changed, 158 insertions(+), 51 deletions(-)

-- 
1.9.1

[dpdk-dev] [PATCH 07/12] pmd/ixgbe: add dev_ptype_info_get implementation

2016-01-04 Thread Ananyev, Konstantin



> -Original Message-
> From: Tan, Jianfeng
> Sent: Thursday, December 31, 2015 6:53 AM
> To: dev at dpdk.org
> Cc: Zhang, Helin; Ananyev, Konstantin; Tan, Jianfeng
> Subject: [PATCH 07/12] pmd/ixgbe: add dev_ptype_info_get implementation
> 
> Signed-off-by: Jianfeng Tan 
> ---
>  drivers/net/ixgbe/ixgbe_ethdev.c | 50 
> 
>  drivers/net/ixgbe/ixgbe_ethdev.h |  2 ++
>  drivers/net/ixgbe/ixgbe_rxtx.c   |  5 +++-
>  3 files changed, 56 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c 
> b/drivers/net/ixgbe/ixgbe_ethdev.c
> index 4c4c6df..de5c3a9 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> @@ -166,6 +166,8 @@ static int ixgbe_dev_queue_stats_mapping_set(struct 
> rte_eth_dev *eth_dev,
>uint8_t is_rx);
>  static void ixgbe_dev_info_get(struct rte_eth_dev *dev,
>  struct rte_eth_dev_info *dev_info);
> +static int ixgbe_dev_ptype_info_get(struct rte_eth_dev *dev,
> + uint32_t ptype_mask, uint32_t ptypes[]);
>  static void ixgbevf_dev_info_get(struct rte_eth_dev *dev,
>struct rte_eth_dev_info *dev_info);
>  static int ixgbe_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu);
> @@ -428,6 +430,7 @@ static const struct eth_dev_ops ixgbe_eth_dev_ops = {
>   .xstats_reset = ixgbe_dev_xstats_reset,
>   .queue_stats_mapping_set = ixgbe_dev_queue_stats_mapping_set,
>   .dev_infos_get= ixgbe_dev_info_get,
> + .dev_ptype_info_get   = ixgbe_dev_ptype_info_get,
>   .mtu_set  = ixgbe_dev_mtu_set,
>   .vlan_filter_set  = ixgbe_vlan_filter_set,
>   .vlan_tpid_set= ixgbe_vlan_tpid_set,
> @@ -512,6 +515,7 @@ static const struct eth_dev_ops ixgbevf_eth_dev_ops = {
>   .xstats_reset = ixgbevf_dev_stats_reset,
>   .dev_close= ixgbevf_dev_close,
>   .dev_infos_get= ixgbevf_dev_info_get,
> + .dev_ptype_info_get   = ixgbe_dev_ptype_info_get,
>   .mtu_set  = ixgbevf_dev_set_mtu,
>   .vlan_filter_set  = ixgbevf_vlan_filter_set,
>   .vlan_strip_queue_set = ixgbevf_vlan_strip_queue_set,
> @@ -2829,6 +2833,52 @@ ixgbe_dev_info_get(struct rte_eth_dev *dev, struct 
> rte_eth_dev_info *dev_info)
>   dev_info->flow_type_rss_offloads = IXGBE_RSS_OFFLOAD_ALL;
>  }
> 
> +static int
> +ixgbe_dev_ptype_info_get(struct rte_eth_dev *dev, uint32_t ptype_mask,
> + uint32_t ptypes[])
> +{
> + int num = 0;
> +
> + if ((dev->rx_pkt_burst == ixgbe_recv_pkts)
> + || (dev->rx_pkt_burst == 
> ixgbe_recv_pkts_lro_single_alloc)
> + || (dev->rx_pkt_burst == ixgbe_recv_pkts_lro_bulk_alloc)
> + || (dev->rx_pkt_burst == ixgbe_recv_pkts_bulk_alloc)
> +) {


As I remember vector RX for ixgbe sets up packet_type properly too.

> + /* refers to ixgbe_rxd_pkt_info_to_pkt_type() */
> + if ((ptype_mask & RTE_PTYPE_L2_MASK) == RTE_PTYPE_L2_MASK)
> + ptypes[num++] = RTE_PTYPE_L2_ETHER;
> +
> + if ((ptype_mask & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_MASK) {
> + ptypes[num++] = RTE_PTYPE_L3_IPV4;
> + ptypes[num++] = RTE_PTYPE_L3_IPV4_EXT;
> + ptypes[num++] = RTE_PTYPE_L3_IPV6;
> + ptypes[num++] = RTE_PTYPE_L3_IPV6_EXT;
> + }
> +
> + if ((ptype_mask & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_MASK) {
> + ptypes[num++] = RTE_PTYPE_L4_SCTP;
> + ptypes[num++] = RTE_PTYPE_L4_TCP;
> + ptypes[num++] = RTE_PTYPE_L4_UDP;
> + }
> +
> + if ((ptype_mask & RTE_PTYPE_TUNNEL_MASK) == 
> RTE_PTYPE_TUNNEL_MASK)
> + ptypes[num++] = RTE_PTYPE_TUNNEL_IP;
> +
> + if ((ptype_mask & RTE_PTYPE_INNER_L3_MASK) == 
> RTE_PTYPE_INNER_L3_MASK) {
> + ptypes[num++] = RTE_PTYPE_INNER_L3_IPV6;
> + ptypes[num++] = RTE_PTYPE_INNER_L3_IPV6_EXT;
> + }
> +
> + if ((ptype_mask & RTE_PTYPE_INNER_L4_MASK) == 
> RTE_PTYPE_INNER_L4_MASK) {
> + ptypes[num++] = RTE_PTYPE_INNER_L4_TCP;
> + ptypes[num++] = RTE_PTYPE_INNER_L4_UDP;
> + }
> + } else
> + num = -ENOTSUP;
> +
> + return num;
> +}
> +
>  static void
>  ixgbevf_dev_info_get(struct rte_eth_dev *dev,
>struct rte_eth_dev_info *dev_info)
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h 
> b/drivers/net/ixgbe/ixgbe_ethdev.h
> index d26771a..2479830 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.h
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.h
> @@ -379,6 +379,8 @@ void ixgbevf_dev_rxtx_start(struct rte_eth_dev *dev);
>  uint16_t ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
>

[dpdk-dev] [PATCH v2 4/4] virtio: check if any kernel driver is manipulating the virtio device

2016-01-04 Thread Xie, Huawei

On 1/5/2016 1:24 AM, Stephen Hemminger wrote:
> On Mon,  4 Jan 2016 01:56:13 +0800
> Huawei Xie  wrote:
>
>> +if (pci_dev->kdrv != RTE_KDRV_NONE) {
>> +PMD_INIT_LOG(INFO,
>> +"kernel driver is manipulating this device." \
>> +" Please unbind the kernel driver.");
> Splitting strings in general is a bad idea since it makes it harder to find 
> log messages.
> Also the first clause is lower case and the second is capitalized.
Got it. This was to avoid the 80-character warning. I will put it on one line to
make it friendly for searching.
The first clause is lowercase because it actually follows "%s():".
>
> Lastly, the backslash continuation is unnecessary here and will cause 
> checkpatch warning.
>
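
A small sketch of the point about string splitting, with hypothetical variable
names. Adjacent string literals are concatenated by the compiler without any
backslash, so both forms yield the same message; only the single-line form can
be found by searching the source for the full text.

```c
#include <assert.h>
#include <string.h>

/* Split across two source lines: compiles fine, but is hard to grep for. */
static const char split_msg[] =
	"kernel driver is manipulating this device."
	" Please unbind the kernel driver.";

/* One literal: longer than 80 columns, but searchable as a whole. */
static const char joined_msg[] =
	"kernel driver is manipulating this device. Please unbind the kernel driver.";
```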

[dpdk-dev] [PATCH] ixgbe: support multicast promiscuous mode on VF

2016-01-04 Thread Wenzhuo Lu

Add multicast promiscuous mode support on ixgbe VF driver.

Please note that using this promiscuous mode requires support in both the
PF and the VF driver, because this VF feature is configured on the PF.
When using a kernel PF driver with a DPDK VF driver, make sure the kernel
PF driver supports VF multicast promiscuous mode. When using a DPDK PF
with a DPDK VF, make sure the PF driver is the same version as the VF.

Signed-off-by: Wenzhuo Lu 
---
 drivers/net/ixgbe/base/ixgbe_mbx.h |  4 +++
 drivers/net/ixgbe/ixgbe_ethdev.c   | 66 ++
 drivers/net/ixgbe/ixgbe_pf.c   | 65 +
 3 files changed, 135 insertions(+)

diff --git a/drivers/net/ixgbe/base/ixgbe_mbx.h b/drivers/net/ixgbe/base/ixgbe_mbx.h
index 445df10..4a120a3 100644
--- a/drivers/net/ixgbe/base/ixgbe_mbx.h
+++ b/drivers/net/ixgbe/base/ixgbe_mbx.h
@@ -89,6 +89,7 @@ enum ixgbe_pfvf_api_rev {
ixgbe_mbox_api_10,  /* API version 1.0, linux/freebsd VF driver */
ixgbe_mbox_api_20,  /* API version 2.0, solaris Phase1 VF driver */
ixgbe_mbox_api_11,  /* API version 1.1, linux/freebsd VF driver */
+   ixgbe_mbox_api_12,  /* API version 1.2, linux/freebsd VF driver */
/* This value should always be last */
ixgbe_mbox_api_unknown, /* indicates that API version is not known */
 };
@@ -107,6 +108,9 @@ enum ixgbe_pfvf_api_rev {
 /* mailbox API, version 1.1 VF requests */
 #define IXGBE_VF_GET_QUEUES0x09 /* get queue configuration */

+/* mailbox API, version 1.2 VF requests */
+#define IXGBE_VF_UPDATE_XCAST_MODE 0x0C
+
 /* GET_QUEUES return data indices within the mailbox */
 #define IXGBE_VF_TX_QUEUES 1   /* number of Tx queues supported */
 #define IXGBE_VF_RX_QUEUES 2   /* number of Rx queues supported */
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 4c4c6df..3308a05 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -138,6 +138,12 @@

 #define IXGBE_CYCLECOUNTER_MASK   0xULL

+enum ixgbevf_xcast_modes {
+   IXGBEVF_XCAST_MODE_NONE = 0,
+   IXGBEVF_XCAST_MODE_MULTI,
+   IXGBEVF_XCAST_MODE_ALLMULTI,
+};
+
 static int eth_ixgbe_dev_init(struct rte_eth_dev *eth_dev);
 static int eth_ixgbe_dev_uninit(struct rte_eth_dev *eth_dev);
 static int  ixgbe_dev_configure(struct rte_eth_dev *dev);
@@ -237,6 +243,8 @@ static int ixgbevf_dev_rx_queue_intr_disable(struct rte_eth_dev *dev,
 static void ixgbevf_set_ivar_map(struct ixgbe_hw *hw, int8_t direction,
 uint8_t queue, uint8_t msix_vector);
 static void ixgbevf_configure_msix(struct rte_eth_dev *dev);
+static void ixgbevf_dev_allmulticast_enable(struct rte_eth_dev *dev);
+static void ixgbevf_dev_allmulticast_disable(struct rte_eth_dev *dev);

 /* For Eth VMDQ APIs support */
 static int ixgbe_uc_hash_table_set(struct rte_eth_dev *dev, struct
@@ -511,6 +519,8 @@ static const struct eth_dev_ops ixgbevf_eth_dev_ops = {
.stats_reset  = ixgbevf_dev_stats_reset,
.xstats_reset = ixgbevf_dev_stats_reset,
.dev_close= ixgbevf_dev_close,
+   .allmulticast_enable  = ixgbevf_dev_allmulticast_enable,
+   .allmulticast_disable = ixgbevf_dev_allmulticast_disable,
.dev_infos_get= ixgbevf_dev_info_get,
.mtu_set  = ixgbevf_dev_set_mtu,
.vlan_filter_set  = ixgbevf_vlan_filter_set,
@@ -1224,6 +1234,7 @@ ixgbevf_negotiate_api(struct ixgbe_hw *hw)

/* start with highest supported, proceed down */
static const enum ixgbe_pfvf_api_rev sup_ver[] = {
+   ixgbe_mbox_api_12,
ixgbe_mbox_api_11,
ixgbe_mbox_api_10,
};
@@ -6191,6 +6202,61 @@ ixgbe_dev_get_dcb_info(struct rte_eth_dev *dev,
return 0;
 }

+/* ixgbevf_update_xcast_mode - Update Multicast mode
+ * @hw: pointer to the HW structure
+ * @netdev: pointer to net device structure
+ * @xcast_mode: new multicast mode
+ *
+ * Updates the Multicast Mode of VF.
+ */
+static int ixgbevf_update_xcast_mode(struct ixgbe_hw *hw,
+int xcast_mode)
+{
+   struct ixgbe_mbx_info *mbx = &hw->mbx;
+   u32 msgbuf[2];
+   s32 err;
+
+   switch (hw->api_version) {
+   case ixgbe_mbox_api_12:
+   break;
+   default:
+   return -EOPNOTSUPP;
+   }
+
+   msgbuf[0] = IXGBE_VF_UPDATE_XCAST_MODE;
+   msgbuf[1] = xcast_mode;
+
+   err = mbx->ops.write_posted(hw, msgbuf, 2, 0);
+   if (err)
+   return err;
+
+   err = mbx->ops.read_posted(hw, msgbuf, 2, 0);
+   if (err)
+   return err;
+
+   msgbuf[0] &= ~IXGBE_VT_MSGTYPE_CTS;
+   if (msgbuf[0] == (IXGBE_VF_UPDATE_XCAST_MODE | IXGBE_VT_MSGTYPE_NACK))
+   return -EPERM;
+
+   return 0;
+}
+
+static void
+ixgbevf_dev_allmulticast_enable(struct rte_eth_dev *dev)
+{
+

[dpdk-dev] Traffic scheduling in DPDK

2016-01-04 Thread Singh, Jasvinder

Hi Uday,


> I have an issue running the qos_sched application in DPDK. Could someone tell
> me how to run the command and what each parameter in the text below
> does?
> 
> Application mandatory parameters:
> --pfc "RX PORT, TX PORT, RX LCORE, WT LCORE" : Packet flow configuration
>multiple pfc can be configured in command line


RX PORT  - Specifies the port on which packets are received.
TX PORT  - Specifies the port on which packets are transmitted.
RX LCORE - Specifies the core used for the packet reception and classification
           stage of the QoS application.
WT LCORE - Specifies the core used for packet enqueue/dequeue operations (QoS
           scheduling) and for subsequently transmitting the packets out.

Multiple --pfc options can be specified, one for each instance of qos sched
required in the application. For example, to run two instances, the following
can be used:

./build/qos_sched -c 0x7e -n 4 -- --pfc "0,1,2,3,4" --pfc "2,3,5,6" --cfg "profile.cfg"

The first qos sched instance receives packets from port 0 and transmits them
through port 1, while the second receives packets from port 2 and transmits
them through port 3. For a single qos sched instance, the following can be
used:

./build/qos_sched -c 0x1e -n 4 -- --pfc "0,1,2,3,4" --cfg "profile.cfg"
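
As a rough illustration of how such a --pfc string breaks down into fields, here is a minimal standalone sketch in C. It is not the parser used by qos_sched: struct pfc_conf and parse_pfc() are hypothetical names, and treating a fifth value as a separate TX core is an assumption based only on the five-value example above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for one --pfc entry (not a qos_sched type). */
struct pfc_conf {
	int rx_port;
	int tx_port;
	int rx_core;
	int wt_core;
	int tx_core; /* assumption: optional fifth value, defaults to wt_core */
};

/*
 * Parse "RX PORT,TX PORT,RX LCORE,WT LCORE[,fifth value]".
 * Returns 0 on success, -1 if fewer than four values are present.
 * Anything after the fifth value is ignored by this sketch.
 */
static int
parse_pfc(const char *str, struct pfc_conf *conf)
{
	int n;

	assert(str != NULL && conf != NULL);

	/* " ," tolerates optional whitespace before each comma. */
	n = sscanf(str, "%d ,%d ,%d ,%d ,%d",
		   &conf->rx_port, &conf->tx_port,
		   &conf->rx_core, &conf->wt_core, &conf->tx_core);
	if (n < 4)
		return -1;
	if (n == 4)
		conf->tx_core = conf->wt_core;
	return 0;
}
```

With this sketch, "0,1,2,3,4" yields ports 0 and 1 with cores 2, 3 and 4, while "2,3,5,6" yields ports 2 and 3 with cores 5 and 6 and the fifth field falling back to the WT core.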


Thanks, 
Jasvinder

[dpdk-dev] [RFC 1/7] eal/common: define rte_soc_* related common interface

2016-01-04 Thread Wiles, Keith

On 1/3/16, 11:12 AM, "dev on behalf of Jan Viktorin"  wrote:

>On Sat, 2 Jan 2016 19:45:40 +0100
>Jan Viktorin  wrote:
>
>> > 
>> > Do you consider this will break binary compatibility since
>> > sizeof (rte_soc_addr) is PATH_MAX (1024) and the other elements of the
>> > union inside rte_devargs are much smaller (like 32 bytes).
>> >   
>> 
>> I had a bad feeling about this... Originally, I started with a pointer
>> 'const char *' so it can be done that way... However, this drives the
>> compiler mad, as it does not allow casting char * -> const char *
>> because of the strict DPDK compilation settings. I didn't find any
>> workaround yet. I think I can make it just 'char *' for the next version
>> of the patch set.
>
>Having rte_devargs to contain only char * instead of char[PATH_MAX]
>brings an issue. There is no common devargs_free function. Inside the
>function rte_eal_devargs_add, I must malloc memory here to fill the
>devtree_path. But there is no general way to free it. At least, I
>didn't find such code... I need to do this:
>
>108 case RTE_DEVTYPE_WHITELISTED_SOC:
>109 case RTE_DEVTYPE_BLACKLISTED_SOC:
>110 devargs->soc.addr.devtree_path = strdup(buf);
>
>because the buf variable is deallocated at the end of the function.
>
>In fact, I do not clearly understand the rte_eal_devargs_add function.
>What is the expected input? The call to rte_eal_parse_devargs_str
>separates the string into drvname and drvargs. What are the drvargs
>supposed to be? I'd expect the user types -w  or -b 
>(for PCI  is the quaternion, for SoC it's the device-tree
>path).
>
>Regards
>Jan
>
>-- 
>  Jan Viktorin                  E-mail: Viktorin at RehiveTech.com
>  System Architect              Web:    www.RehiveTech.com
>  RehiveTech
>  Brno, Czech Republic
>


Regards,
Keith

[dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update

2016-01-04 Thread Olivier MATZ

Hi Hanoch,

Please find some comments below.

On 12/27/2015 10:39 AM, Hanoch Haim (hhaim) wrote:
> Hi Bruce,
> 
> I'm Hanoch from Cisco Systems works for  the 
> https://github.com/cisco-system-traffic-generator/trex-core traffic generator 
> project.
> 
> While upgrading from DPDK 1.8 to 2.2 Ido found that the following commit 
> creates a mbuf corruption and result in Tx hang
> 
> 
> 
> commit f20b50b946da9070d21e392e4dbc7d9f68bc983e
> 
> Author: Olivier Matz 
> 
> Date:   Mon Jun 8 16:57:22 2015 +0200
> 
> 
> 
> Looking at the change it is clear why there is an issue, wanted to get your 
> input.
> 
> 
> 
> Init
> 
> -
> 
> alloc const mbuf  ==> mbuf-a (ref=1)
> 
> 
> 
> Simple case that works
> 
> -
> 
> 
> 
> thread 1 , tx: alloc-mbuf->attach(mbuf-a) (ref=2)  inc- non atomic
> 
> thread 1 , tx: alloc-mbuf->attach(mbuf-a) (ref32)  inc- atomic

do you mean "(ref=3)" ?

> 
> thread 1 , drv : free()(ref=2) dec- atomic
> 
> thread 1 , drv : free()(ref=3) dec - non atomic

do you mean "(ref=1)" ?


> 
> Simple case that does not work
> 
> -
> 
> 
> 
> Both do that in parallel
> 
> 
> 
> thread 2 tx : alloc-mbuf->attach(mbuf-a)  (ref=2)  inc- non atomic
> 
> thread 1 tx : alloc-mbuf->attach(mbuf-a)  (ref=2)  inc- non atomic


It is not allowed to call a function from the mbuf API in parallel.

Example:

core0                       |   core1
----------------------------|----------------------------
m = rte_pktmbuf_alloc(m);   |
enqueue(m);                 |
                            |   m = dequeue();
do_something(m);            |   do_something(m);


do_something() is not allowed because it accesses the same mbuf
structure.
do_something() can be any function of mbuf API: rte_pktmbuf_prepend(),
rte_pktmbuf_attach(), ...


This is allowed:

core0                       |   core1
----------------------------|----------------------------
m = rte_pktmbuf_alloc(m);   |
m2 = rte_pktmbuf_attach(m); |
enqueue(m2);                |
                            |   m2 = dequeue();
do_something(m);            |   do_something(m2);



Regards,
Olivier

[dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update

2016-01-04 Thread Hanoch Haim (hhaim)

Hi Olivier,

Let's take your drawing as a reference and add my question.
The use case is sending duplicates of a multicast packet from many threads.
I can split the job across x threads and track it with my own atomic reference
count (on my multicast object, not the mbuf) until it reaches zero.

In the following example, the two cores (0 and 1) each send an indirect mbuf
(m1/m2) using alloc/attach/send:

core0                               |  core1
------------------------------------|------------------------------------
m_const = rte_pktmbuf_alloc(mp)     |
                                    |
while true:                         |  while true:
  m1 = rte_pktmbuf_alloc(mp_64)     |    m2 = rte_pktmbuf_alloc(mp_64)
  rte_pktmbuf_attach(m1, m_const)   |    rte_pktmbuf_attach(m2, m_const)
  tx_burst(m1)                      |    tx_burst(m2)

Is this example not valid?


BTW, this is our workaround:



core0                               |  core1
------------------------------------|------------------------------------
m_const = rte_pktmbuf_alloc(mp)     |
rte_mbuf_refcnt_update(m_const, 1)  |  <<-- workaround
                                    |
while true:                         |  while true:
  m1 = rte_pktmbuf_alloc(mp_64)     |    m2 = rte_pktmbuf_alloc(mp_64)
  rte_pktmbuf_attach(m1, m_const)   |    rte_pktmbuf_attach(m2, m_const)
  tx_burst(m1)                      |    tx_burst(m2)
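
To make the effect of the workaround concrete, here is a small standalone sketch in C. It is not DPDK code. It assumes the optimized refcount update takes a shortcut when the counter reads 1 (a plain store instead of an atomic add), which is how the problem is described in this thread. struct demo_mbuf and the helper names are hypothetical, and the two "cores" are replayed sequentially in one fixed interleaving so the outcome is deterministic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared (const) mbuf's reference counter. */
struct demo_mbuf {
	_Atomic uint16_t refcnt;
};

/* Step 1 of the update: does the caller appear to be the only holder? */
static int
sees_single_owner(struct demo_mbuf *m)
{
	return atomic_load(&m->refcnt) == 1;
}

/*
 * Step 2 of the update. On the shortcut path the counter is overwritten;
 * otherwise an atomic add is used. The check in step 1 and the store here
 * are two separate steps, which is where an update can be lost.
 */
static void
apply_update(struct demo_mbuf *m, int single_owner, int16_t value)
{
	if (single_owner)
		atomic_store(&m->refcnt, (uint16_t)(1 + value));
	else
		atomic_fetch_add(&m->refcnt, (uint16_t)value);
}

/*
 * Replay the interleaving in which both cores finish step 1 before either
 * performs step 2, each attaching once. Returns the final counter value.
 */
static uint16_t
two_cores_attach(uint16_t initial_refcnt)
{
	struct demo_mbuf m;
	int core0_single, core1_single;

	assert(initial_refcnt >= 1);
	atomic_init(&m.refcnt, initial_refcnt);

	core0_single = sees_single_owner(&m);
	core1_single = sees_single_owner(&m);
	apply_update(&m, core0_single, 1);
	apply_update(&m, core1_single, 1);

	return atomic_load(&m.refcnt);
}
```

Starting from 1, both cores take the shortcut and the counter ends at 2 instead of 3, so one reference is lost. Starting from 2, as after the extra rte_mbuf_refcnt_update() call in the workaround, neither core takes the shortcut and the counter ends at 4 as expected.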

thanks,
Hanoh

-Original Message-
From: Olivier MATZ [mailto:olivier.m...@6wind.com] 
Sent: Monday, January 04, 2016 3:53 PM
To: Hanoch Haim (hhaim); bruce.richardson at intel.com
Cc: dev at dpdk.org; Ido Barnea (ibarnea); Itay Marom (imarom)
Subject: Re: [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update

Hi Hanoch,

Please find some comments below.

On 12/27/2015 10:39 AM, Hanoch Haim (hhaim) wrote:
> Hi Bruce,
> 
> I'm Hanoch from Cisco Systems works for  the 
> https://github.com/cisco-system-traffic-generator/trex-core traffic generator 
> project.
> 
> While upgrading from DPDK 1.8 to 2.2 Ido found that the following 
> commit creates a mbuf corruption and result in Tx hang
> 
> 
> 
> commit f20b50b946da9070d21e392e4dbc7d9f68bc983e
> 
> Author: Olivier Matz 
> 
> Date:   Mon Jun 8 16:57:22 2015 +0200
> 
> 
> 
> Looking at the change it is clear why there is an issue, wanted to get your 
> input.
> 
> 
> 
> Init
> 
> -
> 
> alloc const mbuf  ==> mbuf-a (ref=1)
> 
> 
> 
> Simple case that works
> 
> -
> 
> 
> 
> thread 1 , tx: alloc-mbuf->attach(mbuf-a) (ref=2)  inc- non atomic
> 
> thread 1 , tx: alloc-mbuf->attach(mbuf-a) (ref32)  inc- atomic

do you mean "(ref=3)" ?
[hh] yes ref=3. 

> 
> thread 1 , drv : free()(ref=2) dec- atomic
> 
> thread 1 , drv : free()(ref=3) dec - non atomic

do you mean "(ref=1)" ?


> 
> Simple case that does not work
> 
> -
> 
> 
> 
> Both do that in parallel
> 
> 
> 
> thread 2 tx : alloc-mbuf->attach(mbuf-a)  (ref=2)  inc- non atomic
> 
> thread 1 tx : alloc-mbuf->attach(mbuf-a)  (ref=2)  inc- non atomic


It is not allowed to call a function from the mbuf API in parallel.

Example:

core0   |   core1
|---
m = rte_pktmbuf_alloc(m);   |
enqueue(m); |
|m = dequeue();
do_something(m);|do_something(m);


do_something() is not allowed because it accesses the same mbuf structure.
do_something() can be any function of mbuf API: rte_pktmbuf_prepend(), 
rte_pktmbuf_attach(), ...


This is allowed:

core0   |   core1
|---
m = rte_pktmbuf_alloc(m);   |
m2 = rte_pktmbuf_attach(m); |
enqueue(m2);|
|m2 = dequeue();
do_something(m);|do_something(m2);



Regards,
Olivier

[dpdk-dev] Traffic scheduling in DPDK

2016-01-04 Thread ravulakollu.ku...@wipro.com

Hello All,

I have an issue running the qos_sched application in DPDK. Could someone tell me 
how to run the command and what each parameter in the text below does?

Application mandatory parameters:
--pfc "RX PORT, TX PORT, RX LCORE, WT LCORE" : Packet flow configuration
   multiple pfc can be configured in command line

Thanks,
Uday

[dpdk-dev] [PATCH 01/12] ethdev: add API to query what/if packet type is set

2016-01-04 Thread Ananyev, Konstantin



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Adrien Mazarguil
> Sent: Monday, January 04, 2016 11:38 AM
> To: Tan, Jianfeng
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 01/12] ethdev: add API to query what/if packet 
> type is set
> 
> I'm not sure about the usefulness of this new callback, but one issue I see
> with rte_eth_dev_get_ptype_info() is that determining the proper size for
> ptypes[] according to a mask is awkward. For instance suppose
> RTE_PTYPE_L4_MASK is redefined to a different size at some point, the caller
> must dynamically adjust its ptypes[] array size to avoid a possible
> overflow, just in case.
> 
> I suggest one of these solutions:
> 
> - A callback to query for a single type at once instead (easiest method in
>   my opinion).
> 
> - An additional argument with the number of entries in ptypes[], in which
>   case rte_eth_dev_get_ptype_info() should return the number of entries that
>   would have been filled regardless, a bit like snprintf().

+1 for the second option.
Also not sure you really need: RTE_PTYPE_*_MAX_NUM macros.
Konstantin

> 
> On Thu, Dec 31, 2015 at 02:53:08PM +0800, Jianfeng Tan wrote:
> > Add a new API rte_eth_dev_get_ptype_info to query what/if packet type will
> > be set by current rx burst function.
> >
> > Signed-off-by: Jianfeng Tan 
> > ---
> >  lib/librte_ether/rte_ethdev.c | 12 
> >  lib/librte_ether/rte_ethdev.h | 22 ++
> >  lib/librte_mbuf/rte_mbuf.h| 13 +
> >  3 files changed, 47 insertions(+)
> >
> > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> > index ed971b4..1885374 100644
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -1614,6 +1614,18 @@ rte_eth_dev_info_get(uint8_t port_id, struct rte_eth_dev_info *dev_info)
> > dev_info->driver_name = dev->data->drv_name;
> >  }
> >
> > +int
> > +rte_eth_dev_get_ptype_info(uint8_t port_id, uint32_t ptype_mask,
> > +   uint32_t ptypes[])
> > +{
> > +   struct rte_eth_dev *dev;
> > +
> > +   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > +   dev = &rte_eth_devices[port_id];
> > +   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_ptype_info_get, -ENOTSUP);
> > +   return (*dev->dev_ops->dev_ptype_info_get)(dev, ptype_mask, ptypes);
> > +}
> > +
> >  void
> >  rte_eth_macaddr_get(uint8_t port_id, struct ether_addr *mac_addr)
> >  {
> > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > index bada8ad..e97b632 100644
> > --- a/lib/librte_ether/rte_ethdev.h
> > +++ b/lib/librte_ether/rte_ethdev.h
> > @@ -1021,6 +1021,10 @@ typedef void (*eth_dev_infos_get_t)(struct rte_eth_dev *dev,
> > struct rte_eth_dev_info *dev_info);
> >  /**< @internal Get specific informations of an Ethernet device. */
> >
> > +typedef int (*eth_dev_ptype_info_get_t)(struct rte_eth_dev *dev,
> > +   uint32_t ptype_mask, uint32_t ptypes[]);
> > +/**< @internal Get ptype info of eth_rx_burst_t. */
> > +
> >  typedef int (*eth_queue_start_t)(struct rte_eth_dev *dev,
> > uint16_t queue_id);
> >  /**< @internal Start rx and tx of a queue of an Ethernet device. */
> > @@ -1347,6 +1351,7 @@ struct eth_dev_ops {
> > eth_queue_stats_mapping_set_t queue_stats_mapping_set;
> > /**< Configure per queue stat counter mapping. */
> > eth_dev_infos_get_tdev_infos_get; /**< Get device info. */
> > +   eth_dev_ptype_info_get_t   dev_ptype_info_get; /** Get ptype info */
> > mtu_set_t  mtu_set; /**< Set MTU. */
> > vlan_filter_set_t  vlan_filter_set;  /**< Filter VLAN Setup. */
> > vlan_tpid_set_tvlan_tpid_set;  /**< Outer VLAN TPID 
> > Setup. */
> > @@ -2273,6 +2278,23 @@ extern void rte_eth_dev_info_get(uint8_t port_id,
> >  struct rte_eth_dev_info *dev_info);
> >
> >  /**
> > + * Retrieve the contextual information of an Ethernet device.
> > + *
> > + * @param port_id
> > + *   The port identifier of the Ethernet device.
> > + * @param ptype_mask
> > + *   A hint of what kind of packet type which the caller is interested in
> > + * @param ptypes
> > + *   An array of packet types to be filled with
> > + * @return
> > + *   - (>=0) if successful. Indicate number of valid values in ptypes array.
> > + *   - (-ENOTSUP) if hardware-assisted VLAN stripping not configured.
> > + *   - (-ENODEV) if *port_id* invalid.
> > + */
> > +extern int rte_eth_dev_get_ptype_info(uint8_t port_id,
> > +uint32_t ptype_mask, uint32_t ptypes[]);
> > +
> > +/**
> >   * Retrieve the MTU of an Ethernet device.
> >   *
> >   * @param port_id
> > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > index f234ac9..21d4aa2 100644
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -282,6 +282,8 @@ extern "C" {
> >   *

[dpdk-dev] DPDK OVS on Ubuntu 14.04# Issue's Resolved# Getting memory backing issues with qemu parameter passing

2016-01-04 Thread Czesnowicz, Przemyslaw

You should be able to clone networking-ovs-dpdk, switch to the kilo branch, and run
python setup.py install
in the root of networking-ovs-dpdk; that should install the agent and the mech driver.
Then you would need to enable the mech driver (ovsdpdk) on the controller in 
/etc/neutron/plugins/ml2/ml2_conf.ini
and run the right agent on the computes (networking-ovs-dpdk-agent).


There should be pip packages of networking-ovs-dpdk available shortly; I'll let 
you know when that happens.

Przemek

From: Abhijeet Karve [mailto:abhijeet.ka...@tcs.com]
Sent: Thursday, December 24, 2015 6:42 PM
To: Czesnowicz, Przemyslaw
Cc: dev at dpdk.org; discuss at openvswitch.org; Gray, Mark D
Subject: RE: [dpdk-dev] DPDK OVS on Ubuntu 14.04# Issue's Resolved# Getting 
memory backing issues with qemu parameter passing

Hi Przemek,

Thank you so much for your quick response.

The guide you suggested
(https://github.com/openstack/networking-ovs-dpdk/blob/stable/kilo/doc/source/getstarted/ubuntu.rst)
is for OpenStack vhost-user installations with devstack.
Is there any reference for including the ovs-dpdk mechanism driver in the 
OpenStack Ubuntu distribution that we are following for our
compute+controller node setup?

With our current approach we set up OpenStack Kilo interactively, replace ovs 
with ovs-dpdk, create an instance in OpenStack, and pass that instance id to the 
QEMU command line, which in turn passes the vhost-user sockets to the instance 
to enable the DPDK libraries in it. With this approach we are facing the issues 
listed below.


1. We created a flavor m1.hugepages backed by hugepage memory, but we are unable to 
spawn an instance with this flavor. The error is: No matching hugetlbfs 
for the number of hugepages assigned to the flavor.
2. We pass socket info to instances via qemu manually, and the instances created are 
not persistent.

Now, as you suggested, we are looking into enabling the ovsdpdk ml2 mechanism driver 
and agent in our OpenStack Ubuntu distribution.

We would really appreciate any help or reference with an explanation.

We are using compute + controller node setup and we are using following 
software platform on compute node:
_
Openstack: Kilo
Distribution: Ubuntu 14.04
OVS Version: 2.4.0
DPDK 2.0.0
_

Thanks,
Abhijeet Karve





From: "Czesnowicz, Przemyslaw" <przemyslaw.czesnow...@intel.com>
To: Abhijeet Karve <abhijeet.karve at tcs.com>
Cc: "dev at dpdk.org" <dev at dpdk.org>, "discuss at openvswitch.org" <discuss at openvswitch.org>, "Gray, Mark D" <mark.d.gray at intel.com>
Date: 12/17/2015 06:32 PM
Subject: RE: [dpdk-dev] DPDK OVS on Ubuntu 14.04# Issue's Resolved# 
Successfully setup DPDK OVS with vhostuser




I haven't tried that approach and I'm not sure it would work; it seems clunky.

If you enable the ovsdpdk ml2 mechanism driver and agent, all of that (adding ports to 
ovs with the right type, passing the sockets to qemu) would be done by OpenStack.

Przemek

From: Abhijeet Karve [mailto:abhijeet.ka...@tcs.com]
Sent: Thursday, December 17, 2015 12:41 PM
To: Czesnowicz, Przemyslaw
Cc: dev at dpdk.org; discuss at 
openvswitch.org; Gray, Mark D
Subject: RE: [dpdk-dev] DPDK OVS on Ubuntu 14.04# Issue's Resolved# 
Successfully setup DPDK OVS with vhostuser

Hi Przemek,

Thank you so much for sharing the ref guide.

I would appreciate it if you could clear up one doubt.

At present we are setting up OpenStack Kilo interactively and then replacing 
ovs with ovs-dpdk.
Once that setup is done, we create an instance in OpenStack and pass 
that instance id to the QEMU command line, which in turn passes the vhost-user 
sockets to the instance, enabling the DPDK libraries in it.

Isn't this the correct way of integrating ovs-dpdk with OpenStack?


Thanks & Regards
Abhijeet Karve




From: "Czesnowicz, Przemyslaw" <przemyslaw.czesnow...@intel.com>
To: Abhijeet Karve <abhijeet.karve at tcs.com>
Cc: "dev at dpdk.org" <dev at dpdk.org>, "discuss at openvswitch.org" <discuss at openvswitch.org>, "Gray, Mark D" <mark.d.gray at intel.com>
Date: 12/17/2015 05:27 PM
Subject: RE: [dpdk-dev] DPDK OVS on Ubuntu 14.04# Issue's Resolved# 
Successfully setup DPDK OVS with vhostuser





Hi Abhijeet,

For Kilo you need to use ovsdpdk mechanism driver and a matching agent to 
integrate ovs-dpdk with OpenStack.

The guide you are following only talks about running ovs-dpdk not how it should 
be integrated with OpenStack.

Please follow this guide:
https://github.com/openstack/networking-ovs-dpdk/blob/stable/kilo/doc/source/getstarted/ubuntu.rst

Best regards
Przemek


From: Abhijeet Karve [mailto:abhijeet.ka...@tcs.com]

[dpdk-dev] [PATCH v3 1/4] eal: Introduce new cache macro definitions

2016-01-04 Thread Olivier MATZ

Hi Jerin,

Please see some comments below.

On 12/14/2015 05:32 AM, Jerin Jacob wrote:
> - RTE_CACHE_MIN_LINE_SIZE(Supported minimum cache line size)
> - __rte_cache_min_aligned(Force minimum cache line alignment)
> - RTE_CACHE_LINE_SIZE_LOG2(Express cache line size in terms of log2)
> 
> Signed-off-by: Jerin Jacob 
> Suggested-by: Konstantin Ananyev 
> ---
>  lib/librte_eal/common/include/rte_memory.h | 16 
>  1 file changed, 16 insertions(+)
> 
> diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
> index 9c9e40f..b67a76f 100644
> --- a/lib/librte_eal/common/include/rte_memory.h
> +++ b/lib/librte_eal/common/include/rte_memory.h
> @@ -77,11 +77,27 @@ enum rte_page_sizes {
>   (RTE_CACHE_LINE_SIZE * ((size + RTE_CACHE_LINE_SIZE - 1) / RTE_CACHE_LINE_SIZE))
>  /**< Return the first cache-aligned value greater or equal to size. */
>  
> +/**< Cache line size in terms of log2 */
> +#if RTE_CACHE_LINE_SIZE == 64
> +#define RTE_CACHE_LINE_SIZE_LOG2 6
> +#elif RTE_CACHE_LINE_SIZE == 128
> +#define RTE_CACHE_LINE_SIZE_LOG2 7
> +#else
> +#error "Unsupported cache line size"
> +#endif
> +
> +#define RTE_CACHE_MIN_LINE_SIZE 64   /**< Minimum Cache line size. */
> +

I think RTE_CACHE_LINE_MIN_SIZE or RTE_MIN_CACHE_LINE_SIZE would
be clearer than RTE_CACHE_MIN_LINE_SIZE.

>  /**
>   * Force alignment to cache line.
>   */
>  #define __rte_cache_aligned __rte_aligned(RTE_CACHE_LINE_SIZE)
>  
> +/**
> + * Force minimum cache line alignment.
> + */
> +#define __rte_cache_min_aligned __rte_aligned(RTE_CACHE_MIN_LINE_SIZE)

I'm not really convinced that someone reading the code will readily
understand __rte_cache_min_aligned as meaning "aligned to the minimum
cache line size supported by DPDK".

In the two cases where you are using this macro (mbuf structure and queue
info), I'm wondering whether using __attribute__((aligned(64))) wouldn't be
clearer:
- for mbuf, it could be a local define, like MBUF_ALIGN_SIZE
- for queue info, using 64 makes sense as it's used to reserve space
  for future use

What do you think?
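
To illustrate the alternative being discussed, here is a minimal standalone sketch in C. The names MBUF_ALIGN_SIZE usage, struct demo_mbuf, DEMO_CACHE_LINE_SIZE and demo_cache_line_roundup() are hypothetical stand-ins, not DPDK definitions, and __attribute__((aligned())) assumes a GCC-compatible compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical local define, as suggested for the mbuf case. */
#define MBUF_ALIGN_SIZE 64

/* Stand-ins for the cache line size and its log2 form. */
#define DEMO_CACHE_LINE_SIZE 64
#define DEMO_CACHE_LINE_SIZE_LOG2 6

/* Compile-time check that the log2 constant matches the size. */
_Static_assert((1u << DEMO_CACHE_LINE_SIZE_LOG2) == DEMO_CACHE_LINE_SIZE,
	       "log2 constant does not match cache line size");

/* A toy structure forced to a fixed 64-byte alignment. */
struct demo_mbuf {
	void *buf_addr;
	uint16_t data_len;
} __attribute__((aligned(MBUF_ALIGN_SIZE)));

/* Return the first cache-line-aligned value greater than or equal to size. */
static size_t
demo_cache_line_roundup(size_t size)
{
	/* Guard against overflow in the addition below. */
	assert(size <= SIZE_MAX - (DEMO_CACHE_LINE_SIZE - 1));
	return DEMO_CACHE_LINE_SIZE *
	       ((size + DEMO_CACHE_LINE_SIZE - 1) / DEMO_CACHE_LINE_SIZE);
}
```

The fixed alignment keeps the structure layout independent of the configured cache line size, which is the property the proposed minimum-alignment macro is after.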


Regards,
Olivier

[dpdk-dev] [PATCH 01/12] ethdev: add API to query what/if packet type is set

2016-01-04 Thread Adrien Mazarguil

I'm not sure about the usefulness of this new callback, but one issue I see
with rte_eth_dev_get_ptype_info() is that determining the proper size for
ptypes[] according to a mask is awkward. For instance, if
RTE_PTYPE_L4_MASK is redefined to a different size at some point, the caller
must dynamically adjust its ptypes[] array size to avoid a possible
overflow, just in case.

I suggest one of these solutions:

- A callback to query for a single type at once instead (easiest method in
  my opinion).

- An additional argument with the number of entries in ptypes[], in which
  case rte_eth_dev_get_ptype_info() should return the number of entries that
  would have been filled regardless, a bit like snprintf().
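
The second option can be sketched as follows. This is only an illustration of the snprintf()-like contract, not a proposed implementation: demo_get_ptype_info() and the DEMO_PTYPE_* values are hypothetical, and the port lookup is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet type values, for illustration only. */
#define DEMO_PTYPE_L3_IPV4 0x10u
#define DEMO_PTYPE_L3_IPV6 0x40u
#define DEMO_PTYPE_L3_MASK 0xf0u

/*
 * Write at most num entries into ptypes[] and return the number of entries
 * matching ptype_mask, whether or not they all fit (like snprintf()).
 */
static int
demo_get_ptype_info(uint32_t ptype_mask, uint32_t ptypes[], int num)
{
	static const uint32_t supported[] = {
		DEMO_PTYPE_L3_IPV4,
		DEMO_PTYPE_L3_IPV6,
	};
	int total = 0;
	size_t i;

	assert(num <= 0 || ptypes != NULL);

	for (i = 0; i < sizeof(supported) / sizeof(supported[0]); i++) {
		if ((supported[i] & ptype_mask) == 0)
			continue;
		if (total < num)
			ptypes[total] = supported[i];
		total++;
	}
	return total;
}
```

A caller can pass num = 0 to learn how many entries are needed, size its array accordingly, and call again. A return value larger than num signals truncation.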

On Thu, Dec 31, 2015 at 02:53:08PM +0800, Jianfeng Tan wrote:
> Add a new API rte_eth_dev_get_ptype_info to query what/if packet type will
> be set by current rx burst function.
> 
> Signed-off-by: Jianfeng Tan 
> ---
>  lib/librte_ether/rte_ethdev.c | 12 
>  lib/librte_ether/rte_ethdev.h | 22 ++
>  lib/librte_mbuf/rte_mbuf.h| 13 +
>  3 files changed, 47 insertions(+)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index ed971b4..1885374 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -1614,6 +1614,18 @@ rte_eth_dev_info_get(uint8_t port_id, struct rte_eth_dev_info *dev_info)
>   dev_info->driver_name = dev->data->drv_name;
>  }
>  
> +int
> +rte_eth_dev_get_ptype_info(uint8_t port_id, uint32_t ptype_mask,
> + uint32_t ptypes[])
> +{
> + struct rte_eth_dev *dev;
> +
> + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> + dev = &rte_eth_devices[port_id];
> + RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_ptype_info_get, -ENOTSUP);
> + return (*dev->dev_ops->dev_ptype_info_get)(dev, ptype_mask, ptypes);
> +}
> +
>  void
>  rte_eth_macaddr_get(uint8_t port_id, struct ether_addr *mac_addr)
>  {
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index bada8ad..e97b632 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1021,6 +1021,10 @@ typedef void (*eth_dev_infos_get_t)(struct rte_eth_dev *dev,
>   struct rte_eth_dev_info *dev_info);
>  /**< @internal Get specific informations of an Ethernet device. */
>  
> +typedef int (*eth_dev_ptype_info_get_t)(struct rte_eth_dev *dev,
> + uint32_t ptype_mask, uint32_t ptypes[]);
> +/**< @internal Get ptype info of eth_rx_burst_t. */
> +
>  typedef int (*eth_queue_start_t)(struct rte_eth_dev *dev,
>   uint16_t queue_id);
>  /**< @internal Start rx and tx of a queue of an Ethernet device. */
> @@ -1347,6 +1351,7 @@ struct eth_dev_ops {
>   eth_queue_stats_mapping_set_t queue_stats_mapping_set;
>   /**< Configure per queue stat counter mapping. */
>   eth_dev_infos_get_tdev_infos_get; /**< Get device info. */
> + eth_dev_ptype_info_get_t   dev_ptype_info_get; /** Get ptype info */
>   mtu_set_t  mtu_set; /**< Set MTU. */
>   vlan_filter_set_t  vlan_filter_set;  /**< Filter VLAN Setup. */
>   vlan_tpid_set_tvlan_tpid_set;  /**< Outer VLAN TPID 
> Setup. */
> @@ -2273,6 +2278,23 @@ extern void rte_eth_dev_info_get(uint8_t port_id,
>struct rte_eth_dev_info *dev_info);
>  
>  /**
> + * Retrieve the contextual information of an Ethernet device.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param ptype_mask
> + *   A hint of what kind of packet type which the caller is interested in
> + * @param ptypes
> + *   An array of packet types to be filled with
> + * @return
> + *   - (>=0) if successful. Indicate number of valid values in ptypes array.
> + *   - (-ENOTSUP) if hardware-assisted VLAN stripping not configured.
> + *   - (-ENODEV) if *port_id* invalid.
> + */
> +extern int rte_eth_dev_get_ptype_info(uint8_t port_id,
> +  uint32_t ptype_mask, uint32_t ptypes[]);
> +
> +/**
>   * Retrieve the MTU of an Ethernet device.
>   *
>   * @param port_id
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index f234ac9..21d4aa2 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -282,6 +282,8 @@ extern "C" {
>   * It is used for outer packet for tunneling cases.
>   */
>  #define RTE_PTYPE_L2_MASK   0x000f
> +
> +#define RTE_PTYPE_L2_MAX_NUM 4
>  /**
>   * IP (Internet Protocol) version 4 packet type.
>   * It is used for outer packet for tunneling cases, and does not contain any
> @@ -349,6 +351,8 @@ extern "C" {
>   * It is used for outer packet for tunneling cases.
>   */
>  #define RTE_PTYPE_L3_MASK   0x00f0
> +
> +#define RTE_PTYPE_L3_MAX_NUM 6
>  /**
>   * TCP

[dpdk-dev] [PATCH 0/6 for 2.3] initial virtio 1.0 enabling

2016-01-04 Thread Yuanhan Liu

On Mon, Jan 04, 2016 at 03:55:14AM +, Xu, Qian Q wrote:
> Does dpdk vhost-switch sample support virtio1.0? I tried it but seems not 
> working. 

It has nothing to do with the vhost-switch sample. It worked in my
test; you may come and find me offline to see what might be wrong
on your side.

--yliu
> 
> Thanks
> Qian
> 
> 
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Tan, Jianfeng
> Sent: Tuesday, December 29, 2015 7:19 PM
> To: Yuanhan Liu; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 0/6 for 2.3] initial virtio 1.0 enabling
> 
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Yuanhan Liu
> > Sent: Thursday, December 10, 2015 11:54 AM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] [PATCH 0/6 for 2.3] initial virtio 1.0 enabling
> > 
> > Hi,
> > 
> > Here is an initial virtio 1.0 pmd driver enabling.
> > 
> > Almost all difference comes from virtio 1.0 are the PCI layout change:
> > the major configuration structures are stored at bar space, and their 
> > location is stored at corresponding pci cap structure. Reading/parsing 
> > them is one of the major work of patch 6.
> > 
> > To make handling virtio v1.0 and v0.95 co-exist well, this patch set 
> > introduces a virtio_pci_ops structure, to add another layer so that we 
> > could keep those vtpci_foo_bar "APIs". With that, we could do the 
> > minimum change to add virtio 1.0 support.
> 
> 
> Please point out from which version, qemu starts to support virtio 1.0 net 
> devices.
> 
> Thanks,
> Jianfeng
> 
> > 
> > Note that the enabling is still in rough state, and it's likely I may 
> > miss something. So, comments are huge welcome!
> > 
> > --yliu
> > 
> > ---
> > Yuanhan Liu (6):
> >   virtio: don't set vring address again at queue startup
> >   virtio: introduce struct virtio_pci_ops
> >   virtio: move left pci stuff to virtio_pci.c
> >   viritio: switch to 64 bit features
> >   virtio: set RTE_PCI_DRV_NEED_MAPPING flag
> >   virtio: add virtio v1.0 support
> > 
> >  drivers/net/virtio/virtio_ethdev.c | 297 +--
> >  drivers/net/virtio/virtio_ethdev.h |   3 +-
> >  drivers/net/virtio/virtio_pci.c| 752 +++--
> >  drivers/net/virtio/virtio_pci.h| 100 -
> >  drivers/net/virtio/virtio_rxtx.c   |  15 -
> >  drivers/net/virtio/virtqueue.h |   4 +-
> >  6 files changed, 843 insertions(+), 328 deletions(-)
> > 
> > --
> > 1.9.0

[dpdk-dev] [PATCH 08/12] pmd/mlx4: add dev_ptype_info_get implementation

2016-01-04 Thread Adrien Mazarguil

Hi Jianfeng,

I'm only commenting on the mlx4/mlx5 bits in this message, see below.

On Thu, Dec 31, 2015 at 02:53:15PM +0800, Jianfeng Tan wrote:
> Signed-off-by: Jianfeng Tan 
> ---
>  drivers/net/mlx4/mlx4.c | 27 +++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
> index 207bfe2..85afa32 100644
> --- a/drivers/net/mlx4/mlx4.c
> +++ b/drivers/net/mlx4/mlx4.c
> @@ -2836,6 +2836,8 @@ rxq_cleanup(struct rxq *rxq)
>   * @param flags
>   *   RX completion flags returned by poll_length_flags().
>   *
> + * @note: fix mlx4_dev_ptype_info_get() if any change here.
> + *
>   * @return
>   *   Packet type for struct rte_mbuf.
>   */
> @@ -4268,6 +4270,30 @@ mlx4_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
>   priv_unlock(priv);
>  }
>  
> +static int
> +mlx4_dev_ptype_info_get(struct rte_eth_dev *dev, uint32_t ptype_mask,
> + uint32_t ptypes[])
> +{
> + int num = 0;
> +
> + if ((dev->rx_pkt_burst == mlx4_rx_burst)
> + || (dev->rx_pkt_burst == mlx4_rx_burst_sp)) {
> + /* refers to rxq_cq_to_pkt_type() */
> + if ((ptype_mask & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_MASK) {
> + ptypes[num++] = RTE_PTYPE_L3_IPV4;
> + ptypes[num++] = RTE_PTYPE_L3_IPV6;
> + }
> +
> + if ((ptype_mask & RTE_PTYPE_INNER_L3_MASK) == 
> RTE_PTYPE_INNER_L3_MASK) {
> + ptypes[num++] = RTE_PTYPE_INNER_L3_IPV4;
> + ptypes[num++] = RTE_PTYPE_INNER_L3_IPV6;
> + }
> + } else
> + num = -ENOTSUP;
> +
> + return num;
> +}

I think checking for mlx4_rx_burst and mlx4_rx_burst_sp is unnecessary at
the moment, all RX burst functions do update the packet_type field, no need
for extra complexity.

Same comment for mlx5. 
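
For illustration, a minimal sketch of what the callback could look like once the burst-function check is dropped. The PT_* constants and sketch_ptype_info_get() are stand-ins invented here so the fragment builds without DPDK headers; they are not the real RTE_PTYPE_* definitions or the actual mlx4 code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in values for illustration only; the real RTE_PTYPE_* macros
 * come from the DPDK mbuf headers. */
#define PT_L3_IPV4        0x00000010u
#define PT_L3_IPV6        0x00000040u
#define PT_L3_MASK        0x000000f0u
#define PT_INNER_L3_IPV4  0x00100000u
#define PT_INNER_L3_IPV6  0x00400000u
#define PT_INNER_L3_MASK  0x00f00000u

/* Fill ptypes[] with the packet types reported for the requested layers
 * and return how many entries were written (at most 4). No check on the
 * RX burst function: every burst function is assumed to set packet_type. */
static int
sketch_ptype_info_get(uint32_t ptype_mask, uint32_t ptypes[])
{
	int num = 0;

	assert(ptypes != NULL);

	if ((ptype_mask & PT_L3_MASK) == PT_L3_MASK) {
		ptypes[num++] = PT_L3_IPV4;
		ptypes[num++] = PT_L3_IPV6;
	}

	if ((ptype_mask & PT_INNER_L3_MASK) == PT_INNER_L3_MASK) {
		ptypes[num++] = PT_INNER_L3_IPV4;
		ptypes[num++] = PT_INNER_L3_IPV6;
	}

	return num;
}
```

With this shape the -ENOTSUP branch disappears, since there is no burst function left for which the answer would differ.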

> +
>  /**
>   * DPDK callback to get device statistics.
>   *
> @@ -4989,6 +5015,7 @@ static const struct eth_dev_ops mlx4_dev_ops = {
>   .stats_reset = mlx4_stats_reset,
>   .queue_stats_mapping_set = NULL,
>   .dev_infos_get = mlx4_dev_infos_get,
> + .dev_ptypes_info_get = mlx4_dev_ptype_info_get,
>   .vlan_filter_set = mlx4_vlan_filter_set,
>   .vlan_tpid_set = NULL,
>   .vlan_strip_queue_set = NULL,
> -- 
> 2.1.4
> 

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [RFC PATCH 0/6] General tunneling APIs

2016-01-04 Thread Walukiewicz, Miroslaw

Hi Jijang, 

My comments below MW>

> -Original Message-
> From: Liu, Jijiang
> Sent: Monday, December 28, 2015 6:55 AM
> To: Walukiewicz, Miroslaw; dev at dpdk.org
> Subject: RE: [dpdk-dev] [RFC PATCH 0/6] General tunneling APIs
> 
> Hi Miroslaw,
> 
> The partial answer is below.
> 
> > -Original Message-
> > From: Walukiewicz, Miroslaw
> > Sent: Wednesday, December 23, 2015 7:18 PM
> > To: Liu, Jijiang; dev at dpdk.org
> > Subject: RE: [dpdk-dev] [RFC PATCH 0/6] General tunneling APIs
> >
> > Hi Jijang,
> >
> > I like an idea of tunnel API very much.
> >
> > I have a few questions.
> >
> > 1. I see that you have only i40e support due to lack of HW tunneling
> support
> > in other NICs.
> > I don't see a way how do you want to handle tunneling requests for NICs
> > without HW offload.
> 
> The flow director offload mechanism is used here, flow director is a
> common feature in current NICs.
> Here I don't use special related tunneling HW offload features, the goal is
> that we want to support  all of NICs.
> 
> > I think that we should have one common function for sending tunneled
> > packets but the initialization should check the NIC capabilities and call
> some
> > registered function making tunneling in SW in case of lack of HW support.
> Yes, we should check NIC capabilities.
> 
> > I know that making tunnel is very time consuming process, but it makes an
> > API more generic. Similar only 3 protocols are supported by i40e by HW
> and
> > we can imagine about 40 or more different tunnels working with this NIC.
> >
> > Making the SW implementation we could support missing tunnels even for
> > i40e.
> 
> In this patch set, I just use VXLAN protocol to demonstrate the framework,
> If the framework is accepted, other tunneling protocol will be added one by
> one in future.
> 
> > 2. I understand that we need RX HW queue defined in struct
> > rte_eth_tunnel_conf but why tx_queue is necessary?.
> >   As I know i40e HW we can set tunneled packet descriptors in any HW
> queue
> > and receive only on one specific queue.
> 
> As for adding tx_queue here, I have already explained here at [1]
> 
> [1] http://dpdk.org/ml/archives/dev/2015-December/030509.html
> 
> Do you think it makes sense?

MW> Unfortunately I do not see any explanation for using tx_queue parameter in 
this thread. 
For me this parameter is not necessary. The tunnels will work without it anyway 
as they are set in the packet descriptor.

> 
> > 4. In your implementation you are assuming the there is one tunnel
> > configured per DPDK interface
> >
> > rte_eth_dev_tunnel_configure(uint8_t port_id,
> > +struct rte_eth_tunnel_conf *tunnel_conf)
> >
> No, in terms of i40e,  there will  be up to 8K tunnels  in one DPDK interface,
> It depends on number of flow rules on a pair of queues.
> 
> struct rte_eth_tunnel_conf {
>   uint16_t rx_queue;
>   uint16_t tx_queue;
>   uint16_t udp_tunnel_port;
>   uint16_t nb_flow;
>   uint16_t filter_type;
>   struct rte_eth_tunnel_flow *tunnel_flow;
> };
> 
> If the ' nb_flow ' is set 2000, and you can configure 2000 flow rules on one
> queues on a port.

MW> So in your design, tunnel_flow is a table of rte_eth_tunnel_flow 
structures. 
I did not catch that.

I hope that you will add the ability to dynamically add/remove tunnels 
on an interface.

What is the purpose of the udp_tunnel_port parameter, given that the tunnel_flow 
structure also provides the same parameter?

Similarly, tunnel_type should be part of tunnel_flow as well, since we assume 
support for different tunnels on a single interface (not just VXLAN).

> 
> > The sense of tunnel is lack of interfaces in the system because number of
> > possible VLANs is too small (4095).
> > In the DPDK we have only one tunnel per physical port what is useless even
> > with such big acceleration provided with i40e.
> 
> > In normal use cases there is a need for 10,000s of tunnels per interface.
> Even
> > for Vxlan we have 24 bits for tunnel definition
> 
> 
> We use flow director HW offload here, in terms of i40e, it support up to 8K
> flow rules of exact match.
> This is HW limitation, 10,000s of tunnels per interface is not supported by
> HW.
> 
> 
> > 5. I see that you have implementations for VXLAN,TEREDO, and GENEVE
> > tunnels in i40e drivers. I could  find the implementation for VXLAN
> > encap/decap. Are all files in the patch present?
> No, I have not finished all of codes, just VXLAN here.
> Other tunneling protocol will be added one by one in future.
> 
> > Regards,
> >
> > Mirek
> >
> >
> >
> >
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jijiang Liu
> > > Sent: Wednesday, December 23, 2015 9:50 AM
> > > To: dev at dpdk.org
> > > Subject: [dpdk-dev] [RFC PATCH 0/6] General tunneling APIs
> > >
> > > I want to define a set of General tunneling APIs, which are used to
> > > accelerate tunneling packet processing in DPDK, In this RFC patch set,

[dpdk-dev] [PATCH] eal: fix compile error in eal_timer.c

2016-01-04 Thread David Marchand

Hello,

Four mails for this ? :-)
Please test off-list when you are not sure how to send patches.

On Sun, Jan 3, 2016 at 3:49 PM, Yi Lu  wrote:

> Error message:
> /root/dpdk-2.2.0/lib/librte_eal/linuxapp/eal/eal_timer.c: In function
> 'rte_eal_hpet_init':
> /root/dpdk-2.2.0/lib/librte_eal/linuxapp/eal/eal_timer.c:222:2: error:
> implicit declaration of function 'rte_thread_setname'
> [-Werror=implicit-function-declaration]
>   ret = rte_thread_setname(msb_inc_thread_id, thread_name);
>   ^
> /root/dpdk-2.2.0/lib/librte_eal/linuxapp/eal/eal_timer.c:222:2: error:
> nested extern declaration of 'rte_thread_setname' [-Werror=nested-externs]
> cc1: all warnings being treated as errors
>
> Fixes: badb3688ffa8 ("eal/linux: fix build with glibc < 2.12")
>


Well, this problem was not seen before as you need to enable hpet support.
Title and commitlog should, at least, mention hpet.



> Signed-off-by: Yi Lu 
> ---
>  lib/librte_eal/linuxapp/eal/eal_timer.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal_timer.c
> b/lib/librte_eal/linuxapp/eal/eal_timer.c
> index 9ceff33..bcadf09 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_timer.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_timer.c
> @@ -50,6 +50,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>
>  #include "eal_private.h"
>


Initial patch preferred eal_thread.h inclusion, but since
rte_thread_setname is defined in rte_lcore.h, I am fine with your patch.
So ack once the commitlog mentions hpet.

Thanks.

-- 
David Marchand

[dpdk-dev] [PATCH] fix checkpatch errors

2016-01-04 Thread Huawei Xie

Signed-off-by: Huawei Xie 
---
 app/test-pmd/cmdline.c | 12 ++--
 app/test-pmd/config.c  |  2 +-
 app/test-pmd/flowgen.c |  2 +-
 app/test-pmd/mempool_anon.c| 12 ++--
 app/test-pmd/testpmd.h |  2 +-
 app/test-pmd/txonly.c  |  2 +-
 app/test/test_mbuf.c   | 12 ++--
 app/test/test_memcpy_perf.c|  4 +-
 app/test/test_mempool.c|  4 +-
 app/test/test_memzone.c| 24 +++
 app/test/test_red.c| 42 ++--
 app/test/test_ring.c   |  4 +-
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c |  2 +-
 drivers/crypto/qat/qat_qp.c| 22 +++---
 drivers/net/bnx2x/bnx2x.c  | 34 -
 drivers/net/bnx2x/bnx2x.h  |  4 +-
 drivers/net/bnx2x/bnx2x_rxtx.c | 16 ++---
 drivers/net/bnx2x/debug.c  |  6 +-
 drivers/net/bonding/rte_eth_bond_pmd.c |  2 +-
 drivers/net/e1000/em_ethdev.c  | 40 +--
 drivers/net/e1000/em_rxtx.c| 46 ++---
 drivers/net/e1000/igb_ethdev.c | 18 ++---
 drivers/net/e1000/igb_rxtx.c   | 30 
 drivers/net/fm10k/fm10k_ethdev.c   | 40 +--
 drivers/net/i40e/i40e_ethdev.c |  2 +-
 drivers/net/i40e/i40e_ethdev.h |  2 +-
 drivers/net/i40e/i40e_ethdev_vf.c  |  2 +-
 drivers/net/i40e/i40e_rxtx.c   | 14 ++--
 drivers/net/ixgbe/ixgbe_82599_bypass.c |  4 +-
 drivers/net/ixgbe/ixgbe_bypass.c   |  2 +-
 drivers/net/ixgbe/ixgbe_ethdev.c   | 34 -
 drivers/net/ixgbe/ixgbe_rxtx.c | 36 +-
 drivers/net/mlx5/mlx5_utils.h  |  2 +-
 drivers/net/mpipe/mpipe_tilegx.c   |  4 +-
 drivers/net/nfp/nfp_net.c  | 16 ++---
 drivers/net/virtio/virtio_ethdev.c |  6 +-
 examples/ip_pipeline/cpu_core_map.c|  2 +-
 .../pipeline/pipeline_flow_actions_be.c|  2 +-
 examples/ip_reassembly/main.c  | 22 +++---
 examples/ipv4_multicast/main.c | 14 ++--
 examples/l3fwd/main.c  |  4 +-
 examples/multi_process/symmetric_mp/main.c |  2 +-
 examples/netmap_compat/bridge/bridge.c |  8 +--
 examples/netmap_compat/lib/compat_netmap.c | 80 +++---
 examples/qos_sched/args.c  |  2 +-
 examples/quota_watermark/qw/main.h |  2 +-
 examples/vhost/main.c  |  4 +-
 examples/vhost_xen/main.c  |  2 +-
 examples/vhost_xen/vhost_monitor.c |  6 +-
 lib/librte_acl/acl_run_neon.h  |  2 +-
 lib/librte_cryptodev/rte_cryptodev.c   | 22 +++---
 lib/librte_eal/common/eal_common_memzone.c |  2 +-
 .../common/include/arch/ppc_64/rte_byteorder.h |  2 +-
 lib/librte_eal/common/malloc_heap.c|  2 +-
 lib/librte_eal/linuxapp/eal/eal_xen_memory.c   |  2 +-
 lib/librte_eal/linuxapp/kni/kni_vhost.c|  2 +-
 lib/librte_ether/rte_ether.h   | 10 +--
 lib/librte_hash/rte_cuckoo_hash.c  | 18 ++---
 lib/librte_ip_frag/ip_frag_internal.c  |  4 +-
 lib/librte_lpm/rte_lpm.c   |  2 +-
 lib/librte_mempool/rte_mempool.h   |  2 +-
 lib/librte_ring/rte_ring.h |  6 +-
 lib/librte_sched/rte_bitmap.h  |  6 +-
 lib/librte_sched/rte_red.h |  2 +-
 lib/librte_sched/rte_sched.c   |  4 +-
 65 files changed, 372 insertions(+), 372 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 73298c9..a82682d 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -2418,11 +2418,11 @@ parse_item_list(char* str, const char* item_name, 
unsigned int max_items,
}
if (c != ',') {
printf("character %c is not a decimal digit\n", c);
-   return (0);
+   return 0;
}
if (! value_ok) {
printf("No valid value before comma\n");
-   return (0);
+   return 0;
}
if (nb_item < max_items) {
parsed_items[nb_item] = value;
@@ -2434,11 +2434,11 @@ parse_item_list(char* str, const char* item_name, 
unsigned int max_items,
if (nb_item >= max_items) {

[dpdk-dev] [PATCH 4/4] virtio: check if any kernel driver is manipulating the device

2016-01-04 Thread Stephen Hemminger

On Mon, 4 Jan 2016 09:02:53 +
"Xie, Huawei"  wrote:

> > +   PMD_INIT_LOG(ERR,  
> Better change ERR to INFO and revise the message followed, since user
> might not want to use this device for DPDK.
> > +   "%s(): kernel driver is manipulating this device." \
> > +   " Please unbind the kernel driver.", __func__);

The addition of __func__ here is redundant since this is already in 
PMD_INIT_LOG macro.
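
To see why the extra __func__ is redundant, here is a small sketch of a logging macro that already prepends the caller's name. SKETCH_INIT_LOG, sketch_log() and log_buf are made up for this example and write to a buffer instead of the DPDK log; the real PMD_INIT_LOG definition is not reproduced here.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

static char log_buf[256];

/* Hypothetical backend: writes "<func>(): <message>" into log_buf. */
static void
sketch_log(const char *func, const char *fmt, ...)
{
	va_list ap;
	size_t off;

	assert(func != NULL && fmt != NULL);

	snprintf(log_buf, sizeof(log_buf), "%s(): ", func);
	off = strlen(log_buf);

	va_start(ap, fmt);
	vsnprintf(log_buf + off, sizeof(log_buf) - off, fmt, ap);
	va_end(ap);
}

/* The macro supplies __func__ itself, so callers never need to. */
#define SKETCH_INIT_LOG(...) sketch_log(__func__, __VA_ARGS__)

static const char *
sketch_plain(void)
{
	SKETCH_INIT_LOG("kernel driver is manipulating this device");
	return log_buf;
}

/* Passing __func__ again only repeats the prefix in the output. */
static const char *
sketch_redundant(void)
{
	SKETCH_INIT_LOG("%s(): kernel driver is manipulating this device",
			__func__);
	return log_buf;
}
```

Calling sketch_redundant() yields a message with the function name twice, which is the effect being pointed out in the review.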

[dpdk-dev] [PATCH v2 0/4] fix the issue that DPDK takes over virtio device blindly

2016-01-04 Thread Stephen Hemminger

On Mon,  4 Jan 2016 01:56:09 +0800
Huawei Xie  wrote:

> v2 changes:
>  Remove unnecessary assignment of NULL to dev->data->mac_addrs
>  Adjust one comment's position
>  change LOG level from ERR to INFO
> 
> virtio PMD doesn't set RTE_PCI_DRV_NEED_MAPPING in drv_flags of its
> eth_driver. It will try igb_uio and PORT IO in turn to configure
> virtio device. Even user in guest VM doesn't want to use virtio for
> DPDK, virtio PMD will take over the device blindly.
> 
> The more serious problem is kernel driver is still manipulating the
> device, which causes driver conflict.
> 
> This patch checks if there is any kernel driver manipulating the
> virtio device before virtio PMD uses port IO to configure the device.
> 
> Huawei Xie (4):
>   eal: make the comment more accurate
>   eal: set kdrv to RTE_KDRV_NONE if kernel driver isn't manipulating the 
> device.
>   virtio: return 1 to tell the kernel we don't take over this device
>   virtio: check if any kernel driver is manipulating the virtio device
> 
>  drivers/net/virtio/virtio_ethdev.c | 16 ++--
>  lib/librte_eal/common/eal_common_pci.c |  8 
>  lib/librte_eal/linuxapp/eal/eal_pci.c  |  2 +-
>  3 files changed, 19 insertions(+), 7 deletions(-)
> 

Overall looks good, thanks for addressing this.

It would be good to note that VFIO no-IOMMU mode should work for this
as well.

[dpdk-dev] [PATCH v2 4/4] virtio: check if any kernel driver is manipulating the virtio device

2016-01-04 Thread Stephen Hemminger

On Mon,  4 Jan 2016 01:56:13 +0800
Huawei Xie  wrote:

> + if (pci_dev->kdrv != RTE_KDRV_NONE) {
> + PMD_INIT_LOG(INFO,
> + "kernel driver is manipulating this device." \
> + " Please unbind the kernel driver.");

Splitting strings is generally a bad idea since it makes it harder to find log 
messages.
Also, the first clause is lower case and the second is capitalized.

Lastly, the backslash continuation is unnecessary here and will cause a 
checkpatch warning.
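
A short sketch of the point about the literal: adjacent string literals are concatenated by the compiler whether or not a backslash joins the lines, and a single literal produces the same bytes while staying searchable as one sentence. The variable names are invented for this example.

```c
#include <assert.h>
#include <string.h>

/* As posted: two literals joined with a backslash continuation. */
static const char msg_split[] =
	"kernel driver is manipulating this device." \
	" Please unbind the kernel driver.";

/* Same two literals without the backslash: still one string. */
static const char msg_no_backslash[] =
	"kernel driver is manipulating this device."
	" Please unbind the kernel driver.";

/* One literal on one line: grep for the full sentence finds it. */
static const char msg_single[] = "kernel driver is manipulating this device. Please unbind the kernel driver.";

/* All three spellings compile to identical strings. */
static void
sketch_self_check(void)
{
	assert(strcmp(msg_split, msg_no_backslash) == 0);
	assert(strcmp(msg_split, msg_single) == 0);
}
```

Only the last form can be found by searching the source for the complete message as it appears in the log.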

[dpdk-dev] [PATCH 4/4] virtio: check if any kernel driver is manipulating the device

2016-01-04 Thread Xie, Huawei

On 12/25/2015 6:33 PM, Xie, Huawei wrote:
> virtio PMD could use IO port to configure the virtio device without
> using uio driver.
>
> There are two issues with previous implementation:
> 1) virtio PMD will take over each virtio device blindly even if some
> are not intended for DPDK.
> 2) driver conflict between virtio PMD and virtio-net kernel driver.
>
> This patch checks if there is any kernel driver manipulating the virtio
> device before virtio PMD uses IO port to configure the device.
>
> Fixes: da978dfdc43b ("virtio: use port IO to get PCI resource")
>
> Signed-off-by: Huawei Xie 
> ---
>  drivers/net/virtio/virtio_ethdev.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/net/virtio/virtio_ethdev.c 
> b/drivers/net/virtio/virtio_ethdev.c
> index 00015ef..504346a 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -1138,6 +1138,13 @@ static int virtio_resource_init_by_ioports(struct 
> rte_pci_device *pci_dev)
>   int found = 0;
>   size_t linesz;
>  
> + if (pci_dev->kdrv != RTE_KDRV_NONE) {
> + PMD_INIT_LOG(ERR,
Better change ERR to INFO and revise the message followed, since user
might not want to use this device for DPDK.
> + "%s(): kernel driver is manipulating this device." \
> + " Please unbind the kernel driver.", __func__);
> + return -1;
> + }
> +
>   snprintf(pci_id, sizeof(pci_id), PCI_PRI_FMT,
>pci_dev->addr.domain,
>pci_dev->addr.bus,

[dpdk-dev] KNI crashes when invalid lldp frames are received

2016-01-04 Thread laxmi.pr...@wipro.com

Hi,


We are using OVS 2.4 with DPDK 2.0 on kernel 3.18.22. The default flow rules, i.e. 
NORMAL rules, are in place. The KNI port is a member port of the OVS bridge.

We are using the Openmul controller, and when we invoke the set-controller command, i.e. 
ovs-vsctl --no-wait set-controller br0  tcp:127.0.0.1:6653, KNI crashes with 
the stack below.


<4>Stack:
<4> 880153937d48 880153937d50 880150814000 0001
<4> 2138f240 810cf36b 880153937df8 82055980
<4> 000100250151 0282 880153937da8 810d042f
<4>Call Trace:
<4> [] ? lock_timer_base.isra.35+0x2b/0x50
<4> [] ? try_to_del_timer_sync+0x4f/0x70
<4> [] ? del_timer_sync+0x52/0x60
<4> [] ? schedule_timeout+0x166/0x290
<4> [] ? internal_add_timer+0x80/0x80
<4> [] kni_net_rx+0xf/0x20 [rte_kni]
<4> [] kni_thread_single+0x5e/0xb0 [rte_kni]
<4> [] ? kni_release+0x100/0x100 [rte_kni]
<4> [] kthread+0xc9/0xe0
<4> [] ? kthread_create_on_node+0x180/0x180
<4> [] ret_from_fork+0x58/0x90
<4> [] ? k:q+0x180/0x180
<4>Code: 48 89 85 b8 fe ff ff eb 0c 0f 1f 44 00 00 48 8b 13 48 83 c3 08 49 8b 
8d a8 01 00 00 4d

8b b5 a0 01 00 00 31 ff 48 29 ca 4c 01 f2 <44> 0f b7 62 22 0f b7 42 12 48 03 02 
ba 20 00 00 00

45 0f b7 cc
<1>RIP  [] kni_net_rx_normal+0x145/0x2c0 [rte_kni]
<4> RSP 
<4>---[ end trace 8a2acec3270c7c4b ]---



During debugging we could see that LLDP frames are transmitted on the 
bridge ports whenever the controller is attached.

But those LLDP frames were invalid and had the data_offset of the struct 
rte_mbuf packets set to 2.

For normal packets, we could see that the data offset is equal to 
RTE_PKTMBUF_HEADROOM (i.e. 128).

Is there any limitation in KNI that requires the packet data offset to be equal to 
or greater than RTE_PKTMBUF_HEADROOM?



Regards

Laxmi


[dpdk-dev] [PATCH 0/6 for 2.3] initial virtio 1.0 enabling

2016-01-04 Thread Xu, Qian Q

Does the DPDK vhost-switch sample support virtio 1.0? I tried it, but it does not 
seem to work. 

Thanks
Qian


-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Tan, Jianfeng
Sent: Tuesday, December 29, 2015 7:19 PM
To: Yuanhan Liu; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH 0/6 for 2.3] initial virtio 1.0 enabling



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Yuanhan Liu
> Sent: Thursday, December 10, 2015 11:54 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 0/6 for 2.3] initial virtio 1.0 enabling
> 
> Hi,
> 
> Here is an initial virtio 1.0 pmd driver enabling.
> 
> Almost all difference comes from virtio 1.0 are the PCI layout change:
> the major configuration structures are stored at bar space, and their 
> location is stored at corresponding pci cap structure. Reading/parsing 
> them is one of the major work of patch 6.
> 
> To make handling virtio v1.0 and v0.95 co-exist well, this patch set 
> introduces a virtio_pci_ops structure, to add another layer so that we 
> could keep those vtpci_foo_bar "APIs". With that, we could do the 
> minimum change to add virtio 1.0 support.


Please point out from which version QEMU starts to support virtio 1.0 net 
devices.

Thanks,
Jianfeng

> 
> Note that the enabling is still in rough state, and it's likely I may 
> miss something. So, comments are huge welcome!
> 
>   --yliu
> 
> ---
> Yuanhan Liu (6):
>   virtio: don't set vring address again at queue startup
>   virtio: introduce struct virtio_pci_ops
>   virtio: move left pci stuff to virtio_pci.c
>   viritio: switch to 64 bit features
>   virtio: set RTE_PCI_DRV_NEED_MAPPING flag
>   virtio: add virtio v1.0 support
> 
>  drivers/net/virtio/virtio_ethdev.c | 297 +--
>  drivers/net/virtio/virtio_ethdev.h |   3 +-
>  drivers/net/virtio/virtio_pci.c| 752
> +++--
>  drivers/net/virtio/virtio_pci.h| 100 -
>  drivers/net/virtio/virtio_rxtx.c   |  15 -
>  drivers/net/virtio/virtqueue.h |   4 +-
>  6 files changed, 843 insertions(+), 328 deletions(-)
> 
> --
> 1.9.0
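
The extra layer described in the quoted cover letter can be pictured roughly as below. This is only a sketch under assumptions: the struct, field and function names are hypothetical and much smaller than whatever the patch set actually defines, and plain struct members stand in for the I/O port and BAR accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_hw;

/* Hypothetical ops table: one implementation per PCI layout. */
struct sketch_pci_ops {
	uint8_t (*get_status)(struct sketch_hw *hw);
	void (*set_status)(struct sketch_hw *hw, uint8_t status);
};

struct sketch_hw {
	const struct sketch_pci_ops *ops;
	uint8_t legacy_status;	/* stands in for an I/O port register */
	uint8_t modern_status;	/* stands in for a field in a mapped BAR */
};

static uint8_t
legacy_get_status(struct sketch_hw *hw)
{
	return hw->legacy_status;
}

static void
legacy_set_status(struct sketch_hw *hw, uint8_t status)
{
	hw->legacy_status = status;
}

static uint8_t
modern_get_status(struct sketch_hw *hw)
{
	return hw->modern_status;
}

static void
modern_set_status(struct sketch_hw *hw, uint8_t status)
{
	hw->modern_status = status;
}

static const struct sketch_pci_ops legacy_ops = {
	.get_status = legacy_get_status,
	.set_status = legacy_set_status,
};

static const struct sketch_pci_ops modern_ops = {
	.get_status = modern_get_status,
	.set_status = modern_set_status,
};

/* Callers keep one helper regardless of the device generation. */
static uint8_t
sketch_get_status(struct sketch_hw *hw)
{
	assert(hw != NULL && hw->ops != NULL);
	return hw->ops->get_status(hw);
}

static void
sketch_set_status(struct sketch_hw *hw, uint8_t status)
{
	assert(hw != NULL && hw->ops != NULL);
	hw->ops->set_status(hw, status);
}
```

The driver would pick legacy_ops or modern_ops once at probe time; the rest of the code then calls the helpers without caring which layout is underneath.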

[dpdk-dev] [PKTGEN] OK to reindent the pktgen (mix of tabs and spaces, etc.)?

2016-01-04 Thread Wiles, Keith

On 1/3/16, 4:07 PM, "dev on behalf of Matthew Hall"  wrote:

>On 1/3/16 9:09 AM, Wiles, Keith wrote:
>> Pktgen is setup for tabs for 4 (with replace tabs with spaces), using tab 
>> stop of 8 is just wrong IMO :-)
>> Just started using kdevelop instead of eclipse, so I may have corrupted the 
>> style some :-(
>
>The problem I found was a number of files had an incompatible 
>combination of the two formats.
>
>Personally, I agree tab size 4 w/ spaces instead of tabs is easiest to 
>read and edit. But I could live with any space based system for the most 
>part. I find tab based systems are unpleasant because it is difficult 
>when tabs are used for one thing and spaces for another thing. This 
>annoyance also applies to DPDK and the kernel but it's too late for both 
>of those.
>
>> At least it is suppose to be done that way. I will reformat the code (with 
>> tabs=4) and have a look at the output.
>
>Thanks this will be a big help.
>
>> I can run the astyle on the code and look at the output, if it looks OK I 
>> will submit it to the repo
>
>Sounds great... it is no big hurry on my end but I want to start with a 
>clean slate before I get invested in the code, and start really hitting 
>it hard, and making patches.
>
>The formatting command I provided is not perfect, but it was the best I 
>could do with the various popular indenter tools to try to avoid messing 
>up too much of the rest of the good code in the files in the process of 
>fixing the format.
>
>You might be able to improve it a bit further w/ some additional 
>experimentation since you are the original maintainer of the code 
>obviously. Or perhaps reformat using tools in Eclipse or KDevelop? I had 
>good luck w/ Eclipse before with special configuration but I only mostly 
>used the Java mode not the C / C++ one which is less good.

I pushed a version of Pktgen with cleaned up formatting using 'uncrustify 0.61' 
and 'UniversalIndentGUI 1.2.0 Rev 1070 13 Aug 2015'. I put the script and 
configuration file in the top directory. The formatting looks close to DPDK 
coding guidelines, but I am sure some tweaks would need to be done.

Uncrustify has a huge number of options and it is not always clear what effect 
an option has on the code. Header files seem to be affected a bit more than C 
code, as their formatting is usually done by hand I guess. The only way to see the 
deltas is with the UniversalIndentGUI tool.

All of this is on the 'dev' branch, so let me know what you think.

++Keith
>
>Matthew.
>


Regards,
Keith

[dpdk-dev] [PATCH v5 2/3] examples/l2fwd: Handle SIGINT and SIGTERM in l2fwd

2016-01-04 Thread Wang, Zhihong



> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Friday, January 1, 2016 1:02 AM
> To: Wang, Zhihong 
> Cc: dev at dpdk.org; Ananyev, Konstantin ; 
> Qiu,
> Michael 
> Subject: Re: [PATCH v5 2/3] examples/l2fwd: Handle SIGINT and SIGTERM in
> l2fwd
> 
> On Wed, 30 Dec 2015 16:59:50 -0500
> Zhihong Wang  wrote:
> 
> > +static void
> > +signal_handler(int signum)
> > +{
> > +   if (signum == SIGINT || signum == SIGTERM) {
> > +   printf("\n\nSignal %d received, preparing to exit...\n",
> > +   signum);
> > +   force_quit = true;
> 
> Actually, the if () is redundant since you only registered SIGINT, and SIGTERM
> those are the only signals you could possibly receive.

Yes it's kind of an obsession I guess, just want to make the code crystal clear 
:)
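
A minimal sketch of the simplified handler being discussed, assuming only SIGINT and SIGTERM are ever registered. The function names are invented for this example, the flag is a sig_atomic_t rather than the bool used in the patch, and the printf is left out to keep the handler trivially signal-safe.

```c
#include <assert.h>
#include <signal.h>

/* Set from the handler, polled by the forwarding loop. */
static volatile sig_atomic_t force_quit;

/* No check on signum: only the signals registered below can arrive here. */
static void
sketch_signal_handler(int signum)
{
	(void)signum;
	force_quit = 1;
}

static int
sketch_install_handlers(void)
{
	if (signal(SIGINT, sketch_signal_handler) == SIG_ERR)
		return -1;
	if (signal(SIGTERM, sketch_signal_handler) == SIG_ERR)
		return -1;
	return 0;
}

/* Quick self-check: deliver SIGTERM to ourselves and see the flag flip. */
static void
sketch_self_check(void)
{
	force_quit = 0;
	assert(sketch_install_handlers() == 0);
	raise(SIGTERM);
	assert(force_quit == 1);
}
```

Keeping the explicit signum test, as in the patch, costs nothing at runtime; the sketch only shows that behaviour is the same without it.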

> 
> Acked-by: Stephen Hemminger

[dpdk-dev] [PATCH v2 4/4] virtio: check if any kernel driver is manipulating the virtio device

2016-01-04 Thread Huawei Xie

v2 changes:
 change LOG level from ERR to INFO

virtio PMD could use IO port to configure the virtio device without
using uio driver.

There are two issues with previous implementation:
1) virtio PMD will take over each virtio device blindly even if some
are not intended for DPDK.
2) driver conflict between virtio PMD and virtio-net kernel driver.

This patch checks if there is any kernel driver manipulating the virtio
device before virtio PMD uses IO port to configure the device.

Fixes: da978dfdc43b ("virtio: use port IO to get PCI resource")

Signed-off-by: Huawei Xie 
---
 drivers/net/virtio/virtio_ethdev.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index e815acd..7a50dac 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1138,6 +1138,13 @@ static int virtio_resource_init_by_ioports(struct 
rte_pci_device *pci_dev)
int found = 0;
size_t linesz;

+   if (pci_dev->kdrv != RTE_KDRV_NONE) {
+   PMD_INIT_LOG(INFO,
+   "kernel driver is manipulating this device." \
+   " Please unbind the kernel driver.");
+   return -1;
+   }
+
snprintf(pci_id, sizeof(pci_id), PCI_PRI_FMT,
 pci_dev->addr.domain,
 pci_dev->addr.bus,
-- 
1.8.1.4

[dpdk-dev] [PATCH v2 3/4] virtio: return 1 to tell the upper layer we don't take over this device

2016-01-04 Thread Huawei Xie

v2 changes:
 Remove unnecessary assignment of NULL to dev->data->mac_addrs
 Adjust one comment's position

If virtio_resource_init fails, clean up the resource and return 1 to
tell the upper layer we don't take over this device.
Returning -1 means error and DPDK will exit.

Signed-off-by: Huawei Xie 
---
 drivers/net/virtio/virtio_ethdev.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index d928339..e815acd 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1287,8 +1287,13 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)

pci_dev = eth_dev->pci_dev;

-   if (virtio_resource_init(pci_dev) < 0)
-   return -1;
+   if (virtio_resource_init(pci_dev) < 0) {
+   rte_free(eth_dev->data->mac_addrs);
+   /* Return 1 to tell the upper layer we don't take over
+* this device.
+*/
+   return 1;
+   }

hw->use_msix = virtio_has_msix(&pci_dev->addr);
hw->io_base = (uint32_t)(uintptr_t)pci_dev->mem_resource[0].addr;
-- 
1.8.1.4

[dpdk-dev] [PATCH v2 2/4] eal: set kdrv to RTE_KDRV_NONE if kernel driver isn't manipulating the device.

2016-01-04 Thread Huawei Xie

Use RTE_KDRV_NONE to indicate that kernel driver isn't manipulating
the device.

Signed-off-by: Huawei Xie 
Acked-by: David Marchand 
---
 lib/librte_eal/linuxapp/eal/eal_pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c 
b/lib/librte_eal/linuxapp/eal/eal_pci.c
index bc5b5be..640b190 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -362,7 +362,7 @@ pci_scan_one(const char *dirname, uint16_t domain, uint8_t 
bus,
else
dev->kdrv = RTE_KDRV_UNKNOWN;
} else
-   dev->kdrv = RTE_KDRV_UNKNOWN;
+   dev->kdrv = RTE_KDRV_NONE;

/* device is valid, add in list (sorted) */
if (TAILQ_EMPTY(&pci_device_list)) {
-- 
1.8.1.4

[dpdk-dev] [PATCH v2 1/4] eal: make the comment more accurate

2016-01-04 Thread Huawei Xie

Signed-off-by: Huawei Xie 
---
 lib/librte_eal/common/eal_common_pci.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_pci.c 
b/lib/librte_eal/common/eal_common_pci.c
index dcfe947..bbcdb2b 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -204,7 +204,7 @@ rte_eal_pci_probe_one_driver(struct rte_pci_driver *dr, 
struct rte_pci_device *d
/* call the driver devinit() function */
return dr->devinit(dr, dev);
}
-   /* return positive value if driver is not found */
+   /* return positive value if driver doesn't support this device */
return 1;
 }

@@ -259,7 +259,7 @@ rte_eal_pci_detach_dev(struct rte_pci_driver *dr,
return 0;
}

-   /* return positive value if driver is not found */
+   /* return positive value if driver doesn't support this device */
return 1;
 }

@@ -283,7 +283,7 @@ pci_probe_all_drivers(struct rte_pci_device *dev)
/* negative value is an error */
return -1;
if (rc > 0)
-   /* positive value means driver not found */
+   /* positive value means driver doesn't support it */
continue;
return 0;
}
@@ -310,7 +310,7 @@ pci_detach_all_drivers(struct rte_pci_device *dev)
/* negative value is an error */
return -1;
if (rc > 0)
-   /* positive value means driver not found */
+   /* positive value means driver doesn't support it */
continue;
return 0;
}
-- 
1.8.1.4

[dpdk-dev] [PATCH v2 0/4] fix the issue that DPDK takes over virtio device blindly

2016-01-04 Thread Huawei Xie

v2 changes:
 Remove unnecessary assignment of NULL to dev->data->mac_addrs
 Adjust one comment's position
 change LOG level from ERR to INFO

virtio PMD doesn't set RTE_PCI_DRV_NEED_MAPPING in drv_flags of its
eth_driver. It will try igb_uio and PORT IO in turn to configure
virtio device. Even user in guest VM doesn't want to use virtio for
DPDK, virtio PMD will take over the device blindly.

The more serious problem is kernel driver is still manipulating the
device, which causes driver conflict.

This patch checks if there is any kernel driver manipulating the
virtio device before virtio PMD uses port IO to configure the device.

Huawei Xie (4):
  eal: make the comment more accurate
  eal: set kdrv to RTE_KDRV_NONE if kernel driver isn't manipulating the device.
  virtio: return 1 to tell the kernel we don't take over this device
  virtio: check if any kernel driver is manipulating the virtio device

 drivers/net/virtio/virtio_ethdev.c | 16 ++--
 lib/librte_eal/common/eal_common_pci.c |  8 
 lib/librte_eal/linuxapp/eal/eal_pci.c  |  2 +-
 3 files changed, 19 insertions(+), 7 deletions(-)

-- 
1.8.1.4

[dpdk-dev] [PKTGEN] OK to reindent the pktgen (mix of tabs and spaces, etc.)?

2016-01-04 Thread Wiles, Keith

On 1/3/16, 5:35 PM, "Yigit, Ferruh"  wrote:

>On Sun, Jan 03, 2016 at 05:09:16PM +, Wiles, Keith wrote:
>
>
>
>> A bigger question is what is the coding style of DPDK and where is it 
>> documented? I looked in the docs/web site and did not find any coding style 
>> suggestions. Maybe I missed it someplace.
>
>There is one in:
>http://dpdk.org/doc/guides/contributing/coding_style.html

Thanks, not sure how I missed that. I used the search tool in the html code for 
coding, style, ... I guess I was looking for Coding Style to pop up, it was under 
contributor's guidelines instead. :-(

I am playing with uncrustify's huge number of options in UniversalIndentGUI to 
see if I can get close to our guidelines.

I will post it when I get close.

>
>Regards,
>ferruh
>

Regards,
Keith
