[dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh above 1 for all NICs but 82598

2015-08-19 Thread Lu, Wenzhuo
Hi Helin,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Vlad Zolotarov
> Sent: Friday, August 14, 2015 1:38 PM
> To: Zhang, Helin; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh above 1 for
> all NICs but 82598
> 
> 
> 
> On 08/13/15 23:28, Zhang, Helin wrote:
> > Hi Vlad
> >
> > I don't think the changes are needed. It says in datasheet that the RS
> > bit should be set on the last descriptor of every packet, ONLY WHEN
> TXDCTL.WTHRESH equals to ZERO.
> 
> Of course it's needed! See below.
> Exactly the same spec a few lines above the place u've just quoted states:
> 
> "Software should not set the RS bit when TXDCTL.WTHRESH is greater than
> zero."
> 
> And since all three (3) ixgbe xmit callbacks are utilizing RS bit notation 
> ixgbe PMD
> is actually not supporting any value of WTHRESH different from zero.
I think Vlad is right. We need to fix this issue. Any suggestion? If not, I'd 
like to ack this patch.

> 
> >
> > Regards,
> > Helin
> >
> >> -Original Message-
> >> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> >> Sent: Thursday, August 13, 2015 11:07 AM
> >> To: dev at dpdk.org
> >> Cc: Zhang, Helin; Ananyev, Konstantin; avi at cloudius-systems.com; Vlad
> >> Zolotarov
> >> Subject: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh above 1 for
> all
> >> NICs but 82598
> >>
> >> According to 82599 and x540 HW specifications RS bit *must* be set in the
> last
> >> descriptor of *every* packet.
> > There is a condition that if TXDCTL.WTHRESH equal to zero.
> 
> Right and ixgbe PMD requires this condition to be fulfilled in order to
> function. See above.
> 
> >
> >> This patch fixes the Tx hang we were constantly hitting with a 
> >> seastar-based
> >> application on x540 NIC.
> > Could you help to share with us how to reproduce the tx hang issue, with 
> > using
> > typical DPDK examples?
> 
> Sorry. I'm not very familiar with the typical DPDK examples to help u
> here. However this is quite irrelevant since without this this patch
> ixgbe PMD obviously abuses the HW spec as has been explained above.
> 
> We saw the issue when u stressed the xmit path with a lot of highly
> fragmented TCP frames (packets with up to 33 fragments with non-headers
> fragments as small as 4 bytes) with all offload features enabled.
> 
> Thanks,
> vlad
> >
> >> Signed-off-by: Vlad Zolotarov 
> >> ---
> >>   drivers/net/ixgbe/ixgbe_ethdev.c |  9 +
> >>   drivers/net/ixgbe/ixgbe_rxtx.c   | 23 ++-
> >>   2 files changed, 31 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c
> >> b/drivers/net/ixgbe/ixgbe_ethdev.c
> >> index b8ee1e9..6714fd9 100644
> >> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> >> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> >> @@ -2414,6 +2414,15 @@ ixgbe_dev_info_get(struct rte_eth_dev *dev,
> struct
> >> rte_eth_dev_info *dev_info)
> >>.txq_flags = ETH_TXQ_FLAGS_NOMULTSEGS |
> >>ETH_TXQ_FLAGS_NOOFFLOADS,
> >>};
> >> +
> >> +  /*
> >> +   * According to 82599 and x540 specifications RS bit *must* be set on
> the
> >> +   * last descriptor of *every* packet. Therefore we will not allow the
> >> +   * tx_rs_thresh above 1 for all NICs newer than 82598.
> >> +   */
> >> +  if (hw->mac.type > ixgbe_mac_82598EB)
> >> +  dev_info->default_txconf.tx_rs_thresh = 1;
> >> +
> >>dev_info->hash_key_size = IXGBE_HKEY_MAX_INDEX * sizeof(uint32_t);
> >>dev_info->reta_size = ETH_RSS_RETA_SIZE_128;
> >>dev_info->flow_type_rss_offloads = IXGBE_RSS_OFFLOAD_ALL; diff --
> git
> >> a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c index
> >> 91023b9..8dbdffc 100644
> >> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> >> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> >> @@ -2085,11 +2085,19 @@ ixgbe_dev_tx_queue_setup(struct rte_eth_dev
> >> *dev,
> >>struct ixgbe_tx_queue *txq;
> >>struct ixgbe_hw *hw;
> >>uint16_t tx_rs_thresh, tx_free_thresh;
> >> +  bool rs_deferring_allowed;
> >>
> >>PMD_INIT_FUNC_TRACE();
> >>hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> >>
> >>/*
> >> +   * According to 82599 and x540 specifications RS bit *must* be set on
> the
> >> +   * last descriptor of *every* packet. Therefore we will not allow the
> >> +   * tx_rs_thresh above 1 for all NICs newer than 82598.
> >> +   */
> >> +  rs_deferring_allowed = (hw->mac.type <= ixgbe_mac_82598EB);
> >> +
> >> +  /*
> >> * Validate number of transmit descriptors.
> >> * It must not exceed hardware maximum, and must be multiple
> >> * of IXGBE_ALIGN.
> >> @@ -2110,6 +2118,8 @@ ixgbe_dev_tx_queue_setup(struct rte_eth_dev
> *dev,
> >> * to transmit a packet is greater than the number of free TX
> >> * descriptors.
> >> * The following constraints must be satisfied:
> >> +   *  tx_rs_thresh must be less than 2 for NICs for which RS deferring is
> >> +   *  forbidden (all but 8

[dpdk-dev] Cisco DPDK enabled Packet Generator.

2015-08-19 Thread Andrew Harvey (agh)
A couple of people asked me at the SF DPDK Summit about Cisco?s Open
Source Packet Generator that leverages DPDK.

TRex an open source, low cost, stateful traffic generator fueled by DPDK.
It generates L4-7 traffic based on pre-processing and smart replay of real
traffic templates. Trex amplifies both client and server side traffic and
can scale to 200Gb/sec with one Cisco UCS using an Intel XL710.

I know the team would welcome comments and join development to help
extend/improve the project.

https://github.com/cisco-system-traffic-generator




[dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev

2015-08-19 Thread Yuanhan Liu
Hi Changchun,

On Wed, Aug 12, 2015 at 04:02:37PM +0800, Ouyang Changchun wrote:
> Each virtio device could have multiple queues, say 2 or 4, at most 8.
> Enabling this feature allows virtio device/port on guest has the ability to
> use different vCPU to receive/transmit packets from/to each queue.
> 
> In multiple queues mode, virtio device readiness means all queues of
> this virtio device are ready, cleanup/destroy a virtio device also
> requires clearing all queues belong to it.
> 
> Signed-off-by: Changchun Ouyang 
> ---
[snip ..]
>  /*
> + *  Initialise all variables in vring queue pair.
> + */
> +static void
> +init_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx)
> +{
> + uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> + uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> + memset(dev->virtqueue[virt_rx_q_idx], 0, sizeof(struct 
> vhost_virtqueue));
> + memset(dev->virtqueue[virt_tx_q_idx], 0, sizeof(struct 
> vhost_virtqueue));
> +
> + dev->virtqueue[virt_rx_q_idx]->kickfd = (eventfd_t)-1;
> + dev->virtqueue[virt_rx_q_idx]->callfd = (eventfd_t)-1;
> + dev->virtqueue[virt_tx_q_idx]->kickfd = (eventfd_t)-1;
> + dev->virtqueue[virt_tx_q_idx]->callfd = (eventfd_t)-1;
> +
> + /* Backends are set to -1 indicating an inactive device. */
> + dev->virtqueue[virt_rx_q_idx]->backend = VIRTIO_DEV_STOPPED;
> + dev->virtqueue[virt_tx_q_idx]->backend = VIRTIO_DEV_STOPPED;
> +}
> +
> +/*
>   *  Initialise all variables in device structure.
>   */
>  static void
> @@ -258,17 +294,34 @@ init_device(struct virtio_net *dev)
>   /* Set everything to 0. */

There is a trick here. Let me fill the context first:

283 static void
284 init_device(struct virtio_net *dev)
285 {
286 uint64_t vq_offset;
287
288 /*
289  * Virtqueues have already been malloced so
290  * we don't want to set them to NULL.
291  */
292 vq_offset = offsetof(struct virtio_net, mem);
293
294 /* Set everything to 0. */
295 memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
296 (sizeof(struct virtio_net) - (size_t)vq_offset));
297
298 init_vring_queue_pair(dev, 0);

This piece of code's intention is to memset everything to zero, except
the `virtqueue' field, for, as the comment stated, we have already
allocated virtqueue.

It works only when `virtqueue' field is before `mem' field, and it was
before:

struct virtio_net {
struct vhost_virtqueue  *virtqueue[VIRTIO_QNUM];/**< 
Contains all virtqueue information. */
struct virtio_memory*mem;   /**< QEMU memory and memory 
region information. */
...

After this patch, it becomes:

struct virtio_net {
struct virtio_memory*mem;   /**< QEMU memory and memory 
region information. */
struct vhost_virtqueue  **virtqueue;/**< Contains all virtqueue 
information. */
...


Which actually wipes all stuff inside `struct virtio_net`, resulting to 
setting `virtqueue' to NULL as well.

While reading the code(without you patch applied), I thought that it's
error-prone, as it is very likely that someone else besides the author
doesn't know such undocumented rule. And you just gave me an example :)

Huawei, I'm proposing a fix to call rte_zmalloc() for allocating new_ll_dev
to get rid of such issue. What do you think?

--yliu



>   memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
>   (sizeof(struct virtio_net) - (size_t)vq_offset));
> - memset(dev->virtqueue[VIRTIO_RXQ], 0, sizeof(struct vhost_virtqueue));
> - memset(dev->virtqueue[VIRTIO_TXQ], 0, sizeof(struct vhost_virtqueue));
>  
> - dev->virtqueue[VIRTIO_RXQ]->kickfd = (eventfd_t)-1;
> - dev->virtqueue[VIRTIO_RXQ]->callfd = (eventfd_t)-1;
> - dev->virtqueue[VIRTIO_TXQ]->kickfd = (eventfd_t)-1;
> - dev->virtqueue[VIRTIO_TXQ]->callfd = (eventfd_t)-1;
> + init_vring_queue_pair(dev, 0);
> + dev->virt_qp_nb = 1;
> +}
> +
> +/*
> + *  Alloc mem for vring queue pair.
> + */
> +int
> +alloc_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx)
> +{
> + struct vhost_virtqueue *virtqueue = NULL;
> + uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> + uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
>  
> - /* Backends are set to -1 indicating an inactive device. */
> - dev->virtqueue[VIRTIO_RXQ]->backend = VIRTIO_DEV_STOPPED;
> - dev->virtqueue[VIRTIO_TXQ]->backend = VIRTIO_DEV_STOPPED;
> + virtqueue = rte_malloc(NULL, sizeof(struct vhost_virtqueue) * 
> VIRTIO_QNUM, 0);
> + if (virtqueue == NULL) {
> + RTE_LOG(ERR, VHOST_CONFIG,
> + "Failed to allocate memory for virt qp:%d.\n", qp_idx);
> + return -1;
> + }
> +
> + dev->virtqueue[virt_rx_q_idx] = virtqueue;
> + dev->virtqueue[virt_tx_q_idx] 

[dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh above 1 for all NICs but 82598

2015-08-19 Thread Vladislav Zolotarov
On Aug 19, 2015 03:42, "Lu, Wenzhuo"  wrote:
>
> Hi Helin,
>
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Vlad Zolotarov
> > Sent: Friday, August 14, 2015 1:38 PM
> > To: Zhang, Helin; dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh above
1 for
> > all NICs but 82598
> >
> >
> >
> > On 08/13/15 23:28, Zhang, Helin wrote:
> > > Hi Vlad
> > >
> > > I don't think the changes are needed. It says in datasheet that the RS
> > > bit should be set on the last descriptor of every packet, ONLY WHEN
> > TXDCTL.WTHRESH equals to ZERO.
> >
> > Of course it's needed! See below.
> > Exactly the same spec a few lines above the place u've just quoted
states:
> >
> > "Software should not set the RS bit when TXDCTL.WTHRESH is greater than
> > zero."
> >
> > And since all three (3) ixgbe xmit callbacks are utilizing RS bit
notation ixgbe PMD
> > is actually not supporting any value of WTHRESH different from zero.
> I think Vlad is right. We need to fix this issue. Any suggestion? If not,
I'd like to ack this patch.

Pls., note that there is a v2 of this patch on the list. I forgot to patch
ixgbevf_dev_info_get() in v1.

>
> >
> > >
> > > Regards,
> > > Helin
> > >
> > >> -Original Message-
> > >> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> > >> Sent: Thursday, August 13, 2015 11:07 AM
> > >> To: dev at dpdk.org
> > >> Cc: Zhang, Helin; Ananyev, Konstantin; avi at cloudius-systems.com; Vlad
> > >> Zolotarov
> > >> Subject: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh above
1 for
> > all
> > >> NICs but 82598
> > >>
> > >> According to 82599 and x540 HW specifications RS bit *must* be set
in the
> > last
> > >> descriptor of *every* packet.
> > > There is a condition that if TXDCTL.WTHRESH equal to zero.
> >
> > Right and ixgbe PMD requires this condition to be fulfilled in order to
> > function. See above.
> >
> > >
> > >> This patch fixes the Tx hang we were constantly hitting with a
seastar-based
> > >> application on x540 NIC.
> > > Could you help to share with us how to reproduce the tx hang issue,
with using
> > > typical DPDK examples?
> >
> > Sorry. I'm not very familiar with the typical DPDK examples to help u
> > here. However this is quite irrelevant since without this this patch
> > ixgbe PMD obviously abuses the HW spec as has been explained above.
> >
> > We saw the issue when u stressed the xmit path with a lot of highly
> > fragmented TCP frames (packets with up to 33 fragments with non-headers
> > fragments as small as 4 bytes) with all offload features enabled.
> >
> > Thanks,
> > vlad
> > >
> > >> Signed-off-by: Vlad Zolotarov 
> > >> ---
> > >>   drivers/net/ixgbe/ixgbe_ethdev.c |  9 +
> > >>   drivers/net/ixgbe/ixgbe_rxtx.c   | 23 ++-
> > >>   2 files changed, 31 insertions(+), 1 deletion(-)
> > >>
> > >> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c
> > >> b/drivers/net/ixgbe/ixgbe_ethdev.c
> > >> index b8ee1e9..6714fd9 100644
> > >> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> > >> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> > >> @@ -2414,6 +2414,15 @@ ixgbe_dev_info_get(struct rte_eth_dev *dev,
> > struct
> > >> rte_eth_dev_info *dev_info)
> > >>.txq_flags = ETH_TXQ_FLAGS_NOMULTSEGS |
> > >>ETH_TXQ_FLAGS_NOOFFLOADS,
> > >>};
> > >> +
> > >> +  /*
> > >> +   * According to 82599 and x540 specifications RS bit *must* be
set on
> > the
> > >> +   * last descriptor of *every* packet. Therefore we will not allow
the
> > >> +   * tx_rs_thresh above 1 for all NICs newer than 82598.
> > >> +   */
> > >> +  if (hw->mac.type > ixgbe_mac_82598EB)
> > >> +  dev_info->default_txconf.tx_rs_thresh = 1;
> > >> +
> > >>dev_info->hash_key_size = IXGBE_HKEY_MAX_INDEX * sizeof(uint32_t);
> > >>dev_info->reta_size = ETH_RSS_RETA_SIZE_128;
> > >>dev_info->flow_type_rss_offloads = IXGBE_RSS_OFFLOAD_ALL; diff --
> > git
> > >> a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index
> > >> 91023b9..8dbdffc 100644
> > >> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> > >> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> > >> @@ -2085,11 +2085,19 @@ ixgbe_dev_tx_queue_setup(struct rte_eth_dev
> > >> *dev,
> > >>struct ixgbe_tx_queue *txq;
> > >>struct ixgbe_hw *hw;
> > >>uint16_t tx_rs_thresh, tx_free_thresh;
> > >> +  bool rs_deferring_allowed;
> > >>
> > >>PMD_INIT_FUNC_TRACE();
> > >>hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> > >>
> > >>/*
> > >> +   * According to 82599 and x540 specifications RS bit *must* be
set on
> > the
> > >> +   * last descriptor of *every* packet. Therefore we will not allow
the
> > >> +   * tx_rs_thresh above 1 for all NICs newer than 82598.
> > >> +   */
> > >> +  rs_deferring_allowed = (hw->mac.type <= ixgbe_mac_82598EB);
> > >> +
> > >> +  /*
> > >> * Validate number of transmit descriptors.
> > >> * It must not exceed hardware maximum, and must be mult

[dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev

2015-08-19 Thread Ouyang, Changchun
Hi Yuanhan,

> -Original Message-
> From: Yuanhan Liu [mailto:yuanhan.liu at linux.intel.com]
> Sent: Wednesday, August 19, 2015 11:53 AM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org; Xie, Huawei
> Subject: Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in
> virtio dev
> 
> Hi Changchun,
> 
> On Wed, Aug 12, 2015 at 04:02:37PM +0800, Ouyang Changchun wrote:
> > Each virtio device could have multiple queues, say 2 or 4, at most 8.
> > Enabling this feature allows virtio device/port on guest has the
> > ability to use different vCPU to receive/transmit packets from/to each
> queue.
> >
> > In multiple queues mode, virtio device readiness means all queues of
> > this virtio device are ready, cleanup/destroy a virtio device also
> > requires clearing all queues belong to it.
> >
> > Signed-off-by: Changchun Ouyang 
> > ---
> [snip ..]
> >  /*
> > + *  Initialise all variables in vring queue pair.
> > + */
> > +static void
> > +init_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx) {
> > +   uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> > +   uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> > +   memset(dev->virtqueue[virt_rx_q_idx], 0, sizeof(struct
> vhost_virtqueue));
> > +   memset(dev->virtqueue[virt_tx_q_idx], 0, sizeof(struct
> > +vhost_virtqueue));
> > +
> > +   dev->virtqueue[virt_rx_q_idx]->kickfd = (eventfd_t)-1;
> > +   dev->virtqueue[virt_rx_q_idx]->callfd = (eventfd_t)-1;
> > +   dev->virtqueue[virt_tx_q_idx]->kickfd = (eventfd_t)-1;
> > +   dev->virtqueue[virt_tx_q_idx]->callfd = (eventfd_t)-1;
> > +
> > +   /* Backends are set to -1 indicating an inactive device. */
> > +   dev->virtqueue[virt_rx_q_idx]->backend = VIRTIO_DEV_STOPPED;
> > +   dev->virtqueue[virt_tx_q_idx]->backend = VIRTIO_DEV_STOPPED; }
> > +
> > +/*
> >   *  Initialise all variables in device structure.
> >   */
> >  static void
> > @@ -258,17 +294,34 @@ init_device(struct virtio_net *dev)
> > /* Set everything to 0. */
> 
> There is a trick here. Let me fill the context first:
> 
> 283 static void
> 284 init_device(struct virtio_net *dev)
> 285 {
> 286 uint64_t vq_offset;
> 287
> 288 /*
> 289  * Virtqueues have already been malloced so
> 290  * we don't want to set them to NULL.
> 291  */
> 292 vq_offset = offsetof(struct virtio_net, mem);
> 293
> 294 /* Set everything to 0. */
> 295 memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 
> 0,
> 296 (sizeof(struct virtio_net) - (size_t)vq_offset));
> 297
> 298 init_vring_queue_pair(dev, 0);
> 
> This piece of code's intention is to memset everything to zero, except the
> `virtqueue' field, for, as the comment stated, we have already allocated
> virtqueue.
> 
> It works only when `virtqueue' field is before `mem' field, and it was
> before:
> 
> struct virtio_net {
> struct vhost_virtqueue  *virtqueue[VIRTIO_QNUM];/**< 
> Contains
> all virtqueue information. */
> struct virtio_memory*mem;   /**< QEMU memory and 
> memory
> region information. */
> ...
> 
> After this patch, it becomes:
> 
> struct virtio_net {
> struct virtio_memory*mem;   /**< QEMU memory and 
> memory
> region information. */
> struct vhost_virtqueue  **virtqueue;/**< Contains all 
> virtqueue
> information. */
> ...
> 
> 
> Which actually wipes all stuff inside `struct virtio_net`, resulting to 
> setting
> `virtqueue' to NULL as well.
> 
> While reading the code(without you patch applied), I thought that it's error-
> prone, as it is very likely that someone else besides the author doesn't know
> such undocumented rule. And you just gave me an example :)
> 
> Huawei, I'm proposing a fix to call rte_zmalloc() for allocating new_ll_dev to
> get rid of such issue. What do you think?
> 
>   --yliu
> 
> 

Suggest you finish the latter patch review:
[PATCH v4 04/12] vhost: set memory layout for multiple queues mode,
After you finish reviewing this patch, I think you will change your mind :-)

This patch resolve what you concern.

> 
> > memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
> > (sizeof(struct virtio_net) - (size_t)vq_offset));
> > -   memset(dev->virtqueue[VIRTIO_RXQ], 0, sizeof(struct
> vhost_virtqueue));
> > -   memset(dev->virtqueue[VIRTIO_TXQ], 0, sizeof(struct
> vhost_virtqueue));
> >
> > -   dev->virtqueue[VIRTIO_RXQ]->kickfd = (eventfd_t)-1;
> > -   dev->virtqueue[VIRTIO_RXQ]->callfd = (eventfd_t)-1;
> > -   dev->virtqueue[VIRTIO_TXQ]->kickfd = (eventfd_t)-1;
> > -   dev->virtqueue[VIRTIO_TXQ]->callfd = (eventfd_t)-1;
> > +   init_vring_queue_pair(dev, 0);
> > +   dev->virt_qp_nb = 1;
> > +}
> > +
> > +/*
> > + *  Alloc mem for vring queue pair.
> > + */
> > +int
> > +alloc_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx) {
> > +   struct vhost_virtqueue *virtqueue = NUL

[dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev

2015-08-19 Thread Yuanhan Liu
On Wed, Aug 19, 2015 at 05:54:09AM +, Ouyang, Changchun wrote:
> Hi Yuanhan,
> 
> > -Original Message-
> > From: Yuanhan Liu [mailto:yuanhan.liu at linux.intel.com]
> > Sent: Wednesday, August 19, 2015 11:53 AM
> > To: Ouyang, Changchun
> > Cc: dev at dpdk.org; Xie, Huawei
> > Subject: Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in
> > virtio dev
> > 
> > Hi Changchun,
> > 
> > On Wed, Aug 12, 2015 at 04:02:37PM +0800, Ouyang Changchun wrote:
> > > Each virtio device could have multiple queues, say 2 or 4, at most 8.
> > > Enabling this feature allows virtio device/port on guest has the
> > > ability to use different vCPU to receive/transmit packets from/to each
> > queue.
> > >
> > > In multiple queues mode, virtio device readiness means all queues of
> > > this virtio device are ready, cleanup/destroy a virtio device also
> > > requires clearing all queues belong to it.
> > >
> > > Signed-off-by: Changchun Ouyang 
> > > ---
> > [snip ..]
> > >  /*
> > > + *  Initialise all variables in vring queue pair.
> > > + */
> > > +static void
> > > +init_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx) {
> > > + uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> > > + uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> > > + memset(dev->virtqueue[virt_rx_q_idx], 0, sizeof(struct
> > vhost_virtqueue));
> > > + memset(dev->virtqueue[virt_tx_q_idx], 0, sizeof(struct
> > > +vhost_virtqueue));
> > > +
> > > + dev->virtqueue[virt_rx_q_idx]->kickfd = (eventfd_t)-1;
> > > + dev->virtqueue[virt_rx_q_idx]->callfd = (eventfd_t)-1;
> > > + dev->virtqueue[virt_tx_q_idx]->kickfd = (eventfd_t)-1;
> > > + dev->virtqueue[virt_tx_q_idx]->callfd = (eventfd_t)-1;
> > > +
> > > + /* Backends are set to -1 indicating an inactive device. */
> > > + dev->virtqueue[virt_rx_q_idx]->backend = VIRTIO_DEV_STOPPED;
> > > + dev->virtqueue[virt_tx_q_idx]->backend = VIRTIO_DEV_STOPPED; }
> > > +
> > > +/*
> > >   *  Initialise all variables in device structure.
> > >   */
> > >  static void
> > > @@ -258,17 +294,34 @@ init_device(struct virtio_net *dev)
> > >   /* Set everything to 0. */
> > 
> > There is a trick here. Let me fill the context first:
> > 
> > 283 static void
> > 284 init_device(struct virtio_net *dev)
> > 285 {
> > 286 uint64_t vq_offset;
> > 287
> > 288 /*
> > 289  * Virtqueues have already been malloced so
> > 290  * we don't want to set them to NULL.
> > 291  */
> > 292 vq_offset = offsetof(struct virtio_net, mem);
> > 293
> > 294 /* Set everything to 0. */
> > 295 memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + 
> > vq_offset), 0,
> > 296 (sizeof(struct virtio_net) - (size_t)vq_offset));
> > 297
> > 298 init_vring_queue_pair(dev, 0);
> > 
> > This piece of code's intention is to memset everything to zero, except the
> > `virtqueue' field, for, as the comment stated, we have already allocated
> > virtqueue.
> > 
> > It works only when `virtqueue' field is before `mem' field, and it was
> > before:
> > 
> > struct virtio_net {
> > struct vhost_virtqueue  *virtqueue[VIRTIO_QNUM];/**< 
> > Contains
> > all virtqueue information. */
> > struct virtio_memory*mem;   /**< QEMU memory and 
> > memory
> > region information. */
> > ...
> > 
> > After this patch, it becomes:
> > 
> > struct virtio_net {
> > struct virtio_memory*mem;   /**< QEMU memory and 
> > memory
> > region information. */
> > struct vhost_virtqueue  **virtqueue;/**< Contains all 
> > virtqueue
> > information. */
> > ...
> > 
> > 
> > Which actually wipes all stuff inside `struct virtio_net`, resulting to 
> > setting
> > `virtqueue' to NULL as well.
> > 
> > While reading the code(without you patch applied), I thought that it's 
> > error-
> > prone, as it is very likely that someone else besides the author doesn't 
> > know
> > such undocumented rule. And you just gave me an example :)
> > 
> > Huawei, I'm proposing a fix to call rte_zmalloc() for allocating new_ll_dev 
> > to
> > get rid of such issue. What do you think?
> > 
> > --yliu
> > 
> > 
> 
> Suggest you finish the latter patch review:
> [PATCH v4 04/12] vhost: set memory layout for multiple queues mode,
> After you finish reviewing this patch, I think you will change your mind :-)
> 
> This patch resolve what you concern.

Sorry that I hadn't gone that far yet. And yes, I see. I found that you
moved the barrier to `features' field, which puts the `virtqueue' field
back to the "do not set to zero" zone.

It's still an undocumented rule, and hence, error prone, IMO. But, you
reminded me that init_device() will be invoked at other place 
else(reset_owner()).
Hence, my solution won't work, either.

I'm wondering saving the `virtqueue'(and `mem_arr' from patch 04/12)
explicitly before memset() and restoring them after that, to get rid of

[dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev

2015-08-19 Thread Yuanhan Liu
On Wed, Aug 19, 2015 at 02:28:51PM +0800, Yuanhan Liu wrote:
> On Wed, Aug 19, 2015 at 05:54:09AM +, Ouyang, Changchun wrote:
> > Hi Yuanhan,
> > 
> > > -Original Message-
> > > From: Yuanhan Liu [mailto:yuanhan.liu at linux.intel.com]
> > > Sent: Wednesday, August 19, 2015 11:53 AM
> > > To: Ouyang, Changchun
> > > Cc: dev at dpdk.org; Xie, Huawei
> > > Subject: Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in
> > > virtio dev
> > > 
> > > Hi Changchun,
> > > 
> > > On Wed, Aug 12, 2015 at 04:02:37PM +0800, Ouyang Changchun wrote:
> > > > Each virtio device could have multiple queues, say 2 or 4, at most 8.
> > > > Enabling this feature allows virtio device/port on guest has the
> > > > ability to use different vCPU to receive/transmit packets from/to each
> > > queue.
> > > >
> > > > In multiple queues mode, virtio device readiness means all queues of
> > > > this virtio device are ready, cleanup/destroy a virtio device also
> > > > requires clearing all queues belong to it.
> > > >
> > > > Signed-off-by: Changchun Ouyang 
> > > > ---
> > > [snip ..]
> > > >  /*
> > > > + *  Initialise all variables in vring queue pair.
> > > > + */
> > > > +static void
> > > > +init_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx) {
> > > > +   uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> > > > +   uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> > > > +   memset(dev->virtqueue[virt_rx_q_idx], 0, sizeof(struct
> > > vhost_virtqueue));
> > > > +   memset(dev->virtqueue[virt_tx_q_idx], 0, sizeof(struct
> > > > +vhost_virtqueue));
> > > > +
> > > > +   dev->virtqueue[virt_rx_q_idx]->kickfd = (eventfd_t)-1;
> > > > +   dev->virtqueue[virt_rx_q_idx]->callfd = (eventfd_t)-1;
> > > > +   dev->virtqueue[virt_tx_q_idx]->kickfd = (eventfd_t)-1;
> > > > +   dev->virtqueue[virt_tx_q_idx]->callfd = (eventfd_t)-1;
> > > > +
> > > > +   /* Backends are set to -1 indicating an inactive device. */
> > > > +   dev->virtqueue[virt_rx_q_idx]->backend = VIRTIO_DEV_STOPPED;
> > > > +   dev->virtqueue[virt_tx_q_idx]->backend = VIRTIO_DEV_STOPPED; }
> > > > +
> > > > +/*
> > > >   *  Initialise all variables in device structure.
> > > >   */
> > > >  static void
> > > > @@ -258,17 +294,34 @@ init_device(struct virtio_net *dev)
> > > > /* Set everything to 0. */
> > > 
> > > There is a trick here. Let me fill the context first:
> > > 
> > > 283 static void
> > > 284 init_device(struct virtio_net *dev)
> > > 285 {
> > > 286 uint64_t vq_offset;
> > > 287
> > > 288 /*
> > > 289  * Virtqueues have already been malloced so
> > > 290  * we don't want to set them to NULL.
> > > 291  */
> > > 292 vq_offset = offsetof(struct virtio_net, mem);
> > > 293
> > > 294 /* Set everything to 0. */
> > > 295 memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + 
> > > vq_offset), 0,
> > > 296 (sizeof(struct virtio_net) - (size_t)vq_offset));
> > > 297
> > > 298 init_vring_queue_pair(dev, 0);
> > > 
> > > This piece of code's intention is to memset everything to zero, except the
> > > `virtqueue' field, for, as the comment stated, we have already allocated
> > > virtqueue.
> > > 
> > > It works only when `virtqueue' field is before `mem' field, and it was
> > > before:
> > > 
> > > struct virtio_net {
> > > struct vhost_virtqueue  *virtqueue[VIRTIO_QNUM];/**< 
> > > Contains
> > > all virtqueue information. */
> > > struct virtio_memory*mem;   /**< QEMU memory and 
> > > memory
> > > region information. */
> > > ...
> > > 
> > > After this patch, it becomes:
> > > 
> > > struct virtio_net {
> > > struct virtio_memory*mem;   /**< QEMU memory and 
> > > memory
> > > region information. */
> > > struct vhost_virtqueue  **virtqueue;/**< Contains all 
> > > virtqueue
> > > information. */
> > > ...
> > > 
> > > 
> > > Which actually wipes all stuff inside `struct virtio_net`, resulting to 
> > > setting
> > > `virtqueue' to NULL as well.
> > > 
> > > While reading the code(without you patch applied), I thought that it's 
> > > error-
> > > prone, as it is very likely that someone else besides the author doesn't 
> > > know
> > > such undocumented rule. And you just gave me an example :)
> > > 
> > > Huawei, I'm proposing a fix to call rte_zmalloc() for allocating 
> > > new_ll_dev to
> > > get rid of such issue. What do you think?
> > > 
> > >   --yliu
> > > 
> > > 
> > 
> > Suggest you finish the latter patch review:
> > [PATCH v4 04/12] vhost: set memory layout for multiple queues mode,
> > After you finish reviewing this patch, I think you will change your mind :-)
> > 
> > This patch resolve what you concern.
> 
> Sorry that I hadn't gone that far yet. And yes, I see. I found that you
> moved the barrier to `features' field, which puts the `virtqu

[dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh above 1 for all NICs but 82598

2015-08-19 Thread Ananyev, Konstantin

Hi Vlad,
Sorry for delay with review, I am OOO till next week.
Meanwhile, few questions/comments from me. 

> > > >
> > > >> This patch fixes the Tx hang we were constantly hitting with a
> seastar-based
> > > >> application on x540 NIC.
> > > > Could you help to share with us how to reproduce the tx hang issue,
> with using
> > > > typical DPDK examples?
> > >
> > > Sorry. I'm not very familiar with the typical DPDK examples to help u
> > > here. However this is quite irrelevant since without this this patch
> > > ixgbe PMD obviously abuses the HW spec as has been explained above.
> > >
> > > We saw the issue when u stressed the xmit path with a lot of highly
> > > fragmented TCP frames (packets with up to 33 fragments with non-headers
> > > fragments as small as 4 bytes) with all offload features enabled.

Could you provide us with the pcap file to reproduce the issue?
My concern with you approach is that it would affect TX performance.
Right now, for simple TX PMD usually reads only (nb_tx_desc/tx_rs_thresh) TXDs,
While with your patch (if I understand it correctly) it has to read all TXDs in 
the HW TX ring.
Even if we really need to setup RS bit in each TXD (I still doubt we really do) 
- ,
I think inside PMD it still should be possible to check TX completion in chunks.
Konstantin   


> > >
> > > Thanks,
> > > vlad
> > > >
> > > >> Signed-off-by: Vlad Zolotarov 
> > > >> ---
> > > >>   drivers/net/ixgbe/ixgbe_ethdev.c |  9 +
> > > >>   drivers/net/ixgbe/ixgbe_rxtx.c   | 23 ++-
> > > >>   2 files changed, 31 insertions(+), 1 deletion(-)
> > > >>
> > > >> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c
> > > >> b/drivers/net/ixgbe/ixgbe_ethdev.c
> > > >> index b8ee1e9..6714fd9 100644
> > > >> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> > > >> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> > > >> @@ -2414,6 +2414,15 @@ ixgbe_dev_info_get(struct rte_eth_dev *dev,
> > > struct
> > > >> rte_eth_dev_info *dev_info)
> > > >>.txq_flags = ETH_TXQ_FLAGS_NOMULTSEGS |
> > > >>ETH_TXQ_FLAGS_NOOFFLOADS,
> > > >>};
> > > >> +
> > > >> +  /*
> > > >> +   * According to 82599 and x540 specifications RS bit *must* be
> set on
> > > the
> > > >> +   * last descriptor of *every* packet. Therefore we will not allow
> the
> > > >> +   * tx_rs_thresh above 1 for all NICs newer than 82598.
> > > >> +   */
> > > >> +  if (hw->mac.type > ixgbe_mac_82598EB)
> > > >> +  dev_info->default_txconf.tx_rs_thresh = 1;
> > > >> +
> > > >>dev_info->hash_key_size = IXGBE_HKEY_MAX_INDEX * sizeof(uint32_t);
> > > >>dev_info->reta_size = ETH_RSS_RETA_SIZE_128;
> > > >>dev_info->flow_type_rss_offloads = IXGBE_RSS_OFFLOAD_ALL; diff --
> > > git
> > > >> a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
> index
> > > >> 91023b9..8dbdffc 100644
> > > >> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> > > >> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> > > >> @@ -2085,11 +2085,19 @@ ixgbe_dev_tx_queue_setup(struct rte_eth_dev
> > > >> *dev,
> > > >>struct ixgbe_tx_queue *txq;
> > > >>struct ixgbe_hw *hw;
> > > >>uint16_t tx_rs_thresh, tx_free_thresh;
> > > >> +  bool rs_deferring_allowed;
> > > >>
> > > >>PMD_INIT_FUNC_TRACE();
> > > >>hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> > > >>
> > > >>/*
> > > >> +   * According to 82599 and x540 specifications RS bit *must* be
> set on
> > > the
> > > >> +   * last descriptor of *every* packet. Therefore we will not allow
> the
> > > >> +   * tx_rs_thresh above 1 for all NICs newer than 82598.
> > > >> +   */
> > > >> +  rs_deferring_allowed = (hw->mac.type <= ixgbe_mac_82598EB);
> > > >> +
> > > >> +  /*
> > > >> * Validate number of transmit descriptors.
> > > >> * It must not exceed hardware maximum, and must be multiple
> > > >> * of IXGBE_ALIGN.
> > > >> @@ -2110,6 +2118,8 @@ ixgbe_dev_tx_queue_setup(struct rte_eth_dev
> > > *dev,
> > > >> * to transmit a packet is greater than the number of free TX
> > > >> * descriptors.
> > > >> * The following constraints must be satisfied:
> > > >> +   *  tx_rs_thresh must be less than 2 for NICs for which RS
> deferring is
> > > >> +   *  forbidden (all but 82598).
> > > >> *  tx_rs_thresh must be greater than 0.
> > > >> *  tx_rs_thresh must be less than the size of the ring minus 2.
> > > >> *  tx_rs_thresh must be less than or equal to tx_free_thresh.
> > > >> @@ -2121,9 +2131,20 @@ ixgbe_dev_tx_queue_setup(struct rte_eth_dev
> > > *dev,
> > > >> * When set to zero use default values.
> > > >> */
> > > >>tx_rs_thresh = (uint16_t)((tx_conf->tx_rs_thresh) ?
> > > >> -  tx_conf->tx_rs_thresh : DEFAULT_TX_RS_THRESH);
> > > >> +  tx_conf->tx_rs_thresh :
> > > >> +  (rs_deferring_allowed ? DEFAULT_TX_RS_THRESH :
> 1));
> > > >>tx_free_thresh = (uint16_t)((tx_conf->tx_free_thresh) ?
> > > >>tx_conf->tx_free_thresh : DEFAULT

[dpdk-dev] Cisco DPDK enabled Packet Generator.

2015-08-19 Thread Thomas Monjalon
2015-08-19 03:39, Andrew Harvey:
> A couple of people asked me at the SF DPDK Summit about Cisco?s Open
> Source Packet Generator that leverages DPDK.
> 
> TRex an open source, low cost, stateful traffic generator fueled by DPDK.
> It generates L4-7 traffic based on pre-processing and smart replay of real
> traffic templates. Trex amplifies both client and server side traffic and
> can scale to 200Gb/sec with one Cisco UCS using an Intel XL710.
> 
> I know the team would welcome comments and join development to help
> extend/improve the project.
> 
> https://github.com/cisco-system-traffic-generator

You are welcome to host or mirror the project on dpdk.org:
http://dpdk.org/browse/



[dpdk-dev] [PATCH 1/3] vhost: Fix return value of GET_VRING_BASE message

2015-08-19 Thread Tetsuya Mukawa
When vhost-user frontend sends GET_VRING_BASE, last used index of vring
should be returned. In DPDK vhost library, 'last_used_idx' represents it.
But the value can be over max index value. To return correct value to
vhost frontend, it's needed to be masked.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_vhost/virtio-net.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index b520ec5..144f301 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -661,7 +661,8 @@ get_vring_base(struct vhost_device_ctx ctx, uint32_t index,

state->index = index;
/* State->index refers to the queue index. The txq is 1, rxq is 0. */
-   state->num = dev->virtqueue[state->index]->last_used_idx;
+   state->num = dev->virtqueue[state->index]->last_used_idx
+   & (dev->virtqueue[state->index]->size - 1);

return 0;
 }
-- 
2.1.4



[dpdk-dev] [PATCH 2/3] vhost: Fix RESET_OWNER handling not to close callfd

2015-08-19 Thread Tetsuya Mukawa
When RESET_OWNER message is issued, vhost backend shouldn't close
'callfd', because it will be valid while vhost-user connection
is established. It should be closed when connection is closed.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_vhost/virtio-net.c | 35 ---
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index 144f301..7794e8b 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -189,13 +189,9 @@ cleanup_device(struct virtio_net *dev)
free(dev->mem);
}

-   /* Close any event notifiers opened by device. */
-   if ((int)dev->virtqueue[VIRTIO_RXQ]->callfd >= 0)
-   close((int)dev->virtqueue[VIRTIO_RXQ]->callfd);
+   /* Close kickfd event notifiers opened by device. */
if ((int)dev->virtqueue[VIRTIO_RXQ]->kickfd >= 0)
close((int)dev->virtqueue[VIRTIO_RXQ]->kickfd);
-   if ((int)dev->virtqueue[VIRTIO_TXQ]->callfd >= 0)
-   close((int)dev->virtqueue[VIRTIO_TXQ]->callfd);
if ((int)dev->virtqueue[VIRTIO_TXQ]->kickfd >= 0)
close((int)dev->virtqueue[VIRTIO_TXQ]->kickfd);
 }
@@ -206,6 +202,12 @@ cleanup_device(struct virtio_net *dev)
 static void
 free_device(struct virtio_net_config_ll *ll_dev)
 {
+   /* close callfd when connection is closed */
+   if ((int)ll_dev->dev.virtqueue[VIRTIO_RXQ]->callfd >= 0)
+   close((int)ll_dev->dev.virtqueue[VIRTIO_RXQ]->callfd);
+   if ((int)ll_dev->dev.virtqueue[VIRTIO_TXQ]->callfd >= 0)
+   close((int)ll_dev->dev.virtqueue[VIRTIO_TXQ]->callfd);
+
/* Free any malloc'd memory */
rte_free(ll_dev->dev.virtqueue[VIRTIO_RXQ]);
rte_free(ll_dev->dev.virtqueue[VIRTIO_TXQ]);
@@ -241,6 +243,21 @@ rm_config_ll_entry(struct virtio_net_config_ll *ll_dev,
}
 }

+
+/*
+ *  Rset variables in device structure.
+ */
+static void
+reset_device(struct virtio_net *dev)
+{
+   dev->virtqueue[VIRTIO_RXQ]->kickfd = (eventfd_t)-1;
+   dev->virtqueue[VIRTIO_TXQ]->kickfd = (eventfd_t)-1;
+
+   /* Backends are set to -1 indicating an inactive device. */
+   dev->virtqueue[VIRTIO_RXQ]->backend = VIRTIO_DEV_STOPPED;
+   dev->virtqueue[VIRTIO_TXQ]->backend = VIRTIO_DEV_STOPPED;
+}
+
 /*
  *  Initialise all variables in device structure.
  */
@@ -261,14 +278,10 @@ init_device(struct virtio_net *dev)
memset(dev->virtqueue[VIRTIO_RXQ], 0, sizeof(struct vhost_virtqueue));
memset(dev->virtqueue[VIRTIO_TXQ], 0, sizeof(struct vhost_virtqueue));

-   dev->virtqueue[VIRTIO_RXQ]->kickfd = (eventfd_t)-1;
dev->virtqueue[VIRTIO_RXQ]->callfd = (eventfd_t)-1;
-   dev->virtqueue[VIRTIO_TXQ]->kickfd = (eventfd_t)-1;
dev->virtqueue[VIRTIO_TXQ]->callfd = (eventfd_t)-1;

-   /* Backends are set to -1 indicating an inactive device. */
-   dev->virtqueue[VIRTIO_RXQ]->backend = VIRTIO_DEV_STOPPED;
-   dev->virtqueue[VIRTIO_TXQ]->backend = VIRTIO_DEV_STOPPED;
+   reset_device(dev);
 }

 /*
@@ -403,7 +416,7 @@ reset_owner(struct vhost_device_ctx ctx)
ll_dev = get_config_ll_entry(ctx);

cleanup_device(&ll_dev->dev);
-   init_device(&ll_dev->dev);
+   reset_device(&ll_dev->dev);

return 0;
 }
-- 
2.1.4



[dpdk-dev] [PATCH 3/3] vhost: Fix RESET_OWNER handling not to free virtqueue

2015-08-19 Thread Tetsuya Mukawa
When RESET_OWNER message is issued, vhost backend should not clear and
free virtqueue memories. This is because vhost backend should handle
GET_VRING_BASE message followed by the RESET_OWNER, so still vhost
backend need to access virtqueue memories.
These memories should be freed when SET_MEM_TABLE message is issued,
and vhost library has already had such a functionality.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_vhost/virtio-net.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index 7794e8b..9475d00 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -182,13 +182,6 @@ add_config_ll_entry(struct virtio_net_config_ll 
*new_ll_dev)
 static void
 cleanup_device(struct virtio_net *dev)
 {
-   /* Unmap QEMU memory file if mapped. */
-   if (dev->mem) {
-   munmap((void *)(uintptr_t)dev->mem->mapped_address,
-   (size_t)dev->mem->mapped_size);
-   free(dev->mem);
-   }
-
/* Close kickfd event notifiers opened by device. */
if ((int)dev->virtqueue[VIRTIO_RXQ]->kickfd >= 0)
close((int)dev->virtqueue[VIRTIO_RXQ]->kickfd);
-- 
2.1.4



[dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh above 1 for all NICs but 82598

2015-08-19 Thread Vlad Zolotarov


On 08/19/15 10:43, Ananyev, Konstantin wrote:
> Hi Vlad,
> Sorry for delay with review, I am OOO till next week.
> Meanwhile, few questions/comments from me.

Hi, Konstantin, long time no see... ;)

>
>> This patch fixes the Tx hang we were constantly hitting with a
>> seastar-based
>> application on x540 NIC.
> Could you help to share with us how to reproduce the tx hang issue,
>> with using
> typical DPDK examples?
 Sorry. I'm not very familiar with the typical DPDK examples to help u
 here. However this is quite irrelevant since without this this patch
 ixgbe PMD obviously abuses the HW spec as has been explained above.

 We saw the issue when u stressed the xmit path with a lot of highly
 fragmented TCP frames (packets with up to 33 fragments with non-headers
 fragments as small as 4 bytes) with all offload features enabled.
> Could you provide us with the pcap file to reproduce the issue?

Well, the thing is it takes some time to reproduce it (a few minutes of 
heavy load) therefore a pcap would be quite large.

> My concern with you approach is that it would affect TX performance.

It certainly will ;) But it seem inevitable. See below.

> Right now, for simple TX PMD usually reads only (nb_tx_desc/tx_rs_thresh) 
> TXDs,
> While with your patch (if I understand it correctly) it has to read all TXDs 
> in the HW TX ring.

If by "simple" u refer an always single fragment per Tx packet - then u 
are absolutely correct.

My initial patch was to only set RS on every EOP descriptor without 
changing the rs_thresh value and this patch worked.
However HW spec doesn't ensure in a general case that packets are always 
handled/completion write-back completes in the same order the packets 
are placed on the ring (see "Tx arbitration schemes" chapter in 82599 
spec for instance). Therefore AFAIU one should not assume that if 
packet[x+1] DD bit is set then packet[x] is completed too.

That's why I changed the patch to be as u see it now. However if I miss 
something here and your HW people ensure the in-order completion this of 
course may be changed back.

> Even if we really need to setup RS bit in each TXD (I still doubt we really 
> do) - ,

Well, if u doubt u may ask the guys from the Intel networking division 
that wrote the 82599 and x540 HW specs where they clearly state that. ;)

> I think inside PMD it still should be possible to check TX completion in 
> chunks.
> Konstantin
>
>
 Thanks,
 vlad
>> Signed-off-by: Vlad Zolotarov 
>> ---
>>drivers/net/ixgbe/ixgbe_ethdev.c |  9 +
>>drivers/net/ixgbe/ixgbe_rxtx.c   | 23 ++-
>>2 files changed, 31 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c
>> b/drivers/net/ixgbe/ixgbe_ethdev.c
>> index b8ee1e9..6714fd9 100644
>> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
>> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
>> @@ -2414,6 +2414,15 @@ ixgbe_dev_info_get(struct rte_eth_dev *dev,
 struct
>> rte_eth_dev_info *dev_info)
>> .txq_flags = ETH_TXQ_FLAGS_NOMULTSEGS |
>> ETH_TXQ_FLAGS_NOOFFLOADS,
>> };
>> +
>> +  /*
>> +   * According to 82599 and x540 specifications RS bit *must* be
>> set on
 the
>> +   * last descriptor of *every* packet. Therefore we will not allow
>> the
>> +   * tx_rs_thresh above 1 for all NICs newer than 82598.
>> +   */
>> +  if (hw->mac.type > ixgbe_mac_82598EB)
>> +  dev_info->default_txconf.tx_rs_thresh = 1;
>> +
>> dev_info->hash_key_size = IXGBE_HKEY_MAX_INDEX * sizeof(uint32_t);
>> dev_info->reta_size = ETH_RSS_RETA_SIZE_128;
>> dev_info->flow_type_rss_offloads = IXGBE_RSS_OFFLOAD_ALL; diff --
 git
>> a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
>> index
>> 91023b9..8dbdffc 100644
>> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
>> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
>> @@ -2085,11 +2085,19 @@ ixgbe_dev_tx_queue_setup(struct rte_eth_dev
>> *dev,
>> struct ixgbe_tx_queue *txq;
>> struct ixgbe_hw *hw;
>> uint16_t tx_rs_thresh, tx_free_thresh;
>> +  bool rs_deferring_allowed;
>>
>> PMD_INIT_FUNC_TRACE();
>> hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
>>
>> /*
>> +   * According to 82599 and x540 specifications RS bit *must* be
>> set on
 the
>> +   * last descriptor of *every* packet. Therefore we will not allow
>> the
>> +   * tx_rs_thresh above 1 for all NICs newer than 82598.
>> +   */
>> +  rs_deferring_allowed = (hw->mac.type <= ixgbe_mac_82598EB);
>> +
>> +  /*
>>  * Validate number of transmit descriptors.
>>  * It must not exceed hardware maximum, and must be multiple
>>  * of IXGBE_ALIGN.
>> @@ -2110,6 +2118,8 @@ ixgbe_dev_tx_queue_setup(struct rte_eth_dev
 *dev,
>

[dpdk-dev] [PATCH v6 4/9] ethdev: remove HW specific stats in stats structs

2015-08-19 Thread Tahhan, Maryam
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Monday, August 17, 2015 3:54 PM
> To: Tahhan, Maryam; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 4/9] ethdev: remove HW specific stats in
> stats structs
> 
> Hi Maryam,
> 
> On 07/15/2015 03:11 PM, Maryam Tahhan wrote:
> > Remove non generic stats in rte_stats_strings and mark the relevant
> > fields in struct rte_eth_stats as deprecated.
> >
> > Signed-off-by: Maryam Tahhan 
> > ---
> >  doc/guides/rel_notes/abi.rst  | 12 
> > lib/librte_ether/rte_ethdev.c |  9 -
> > lib/librte_ether/rte_ethdev.h | 30 --
> >  3 files changed, 32 insertions(+), 19 deletions(-)
> >
> > diff --git a/doc/guides/rel_notes/abi.rst
> > b/doc/guides/rel_notes/abi.rst index 931e785..d5bf625 100644
> > --- a/doc/guides/rel_notes/abi.rst
> > +++ b/doc/guides/rel_notes/abi.rst
> > @@ -24,3 +24,15 @@ Deprecation Notices
> >
> >  * The Macros RTE_HASH_BUCKET_ENTRIES_MAX and
> RTE_HASH_KEY_LENGTH_MAX are
> >deprecated and will be removed with version 2.2.
> > +
> > +* The following fields have been deprecated in rte_eth_stats:
> > +  * uint64_t imissed
> > +  * uint64_t ibadcrc
> > +  * uint64_t ibadlen
> > +  * uint64_t imcasts
> > +  * uint64_t fdirmatch
> > +  * uint64_t fdirmiss
> > +  * uint64_t tx_pause_xon
> > +  * uint64_t rx_pause_xon
> > +  * uint64_t tx_pause_xoff
> > +  * uint64_t rx_pause_xoff
> 
> Looking again at this patch, I'm wondering if "imissed" should be kept instead
> of beeing deprecated. I think it could be useful to differentiate ierrors from
> imissed, and it's not a hw-specific statistic. What do you think?
> 
> One more comment: it seems these fields are marked as deprecated but they
> are still used on other drivers (e1000, i40e, ...).
> 
> Regards,
> Olivier
> 


Hi Olivier
I can remove the deprecated status for imissed to leave the differentiation 
between errors and missed packets.
igb and i40e will be updated soon to reflect this. I marked them as deprecated 
to deter their use in the future. Older instances/use will need to be resolved.

Regards
Maryam



[dpdk-dev] [PATCH v6 4/9] ethdev: remove HW specific stats in stats structs

2015-08-19 Thread Olivier MATZ
Hi Maryam,

On 08/19/2015 02:53 PM, Tahhan, Maryam wrote:
>> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
>> Sent: Monday, August 17, 2015 3:54 PM
>> To: Tahhan, Maryam; dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v6 4/9] ethdev: remove HW specific stats in
>> stats structs
>>
>> Hi Maryam,
>>
>> On 07/15/2015 03:11 PM, Maryam Tahhan wrote:
>>> Remove non generic stats in rte_stats_strings and mark the relevant
>>> fields in struct rte_eth_stats as deprecated.
>>>
>>
>> Looking again at this patch, I'm wondering if "imissed" should be kept 
>> instead
>> of beeing deprecated. I think it could be useful to differentiate ierrors 
>> from
>> imissed, and it's not a hw-specific statistic. What do you think?
>>
>> One more comment: it seems these fields are marked as deprecated but they
>> are still used on other drivers (e1000, i40e, ...).
>>
>> Regards,
>> Olivier
>>
>
>
> Hi Olivier
> I can remove the deprecated status for imissed to leave the differentiation 
> between errors and missed packets.
> igb and i40e will be updated soon to reflect this. I marked them as 
> deprecated to deter their use in the future. Older instances/use will need to 
> be resolved.

 From my point of view, yes, I think it's better to keep different stats
for imissed and ierrors as it can be useful to determinine the cause of
packet losses. If nobody disagrees, I think we could remove the
deprecation notice for imissed.

Regards,
Olivier



[dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh above 1 for all NICs but 82598

2015-08-19 Thread Zhang, Helin
Hi Vlad

Thank you very much for the patches! Give me a few more time to double check 
with more guys, and possibly hardware experts.

Regards,
Helin

From: Vladislav Zolotarov [mailto:vl...@cloudius-systems.com]
Sent: Tuesday, August 18, 2015 9:56 PM
To: Lu, Wenzhuo
Cc: dev at dpdk.org; Zhang, Helin
Subject: RE: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh above 1 for 
all NICs but 82598


On Aug 19, 2015 03:42, "Lu, Wenzhuo" mailto:wenzhuo.lu 
at intel.com>> wrote:
>
> Hi Helin,
>
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] 
> > On Behalf Of Vlad Zolotarov
> > Sent: Friday, August 14, 2015 1:38 PM
> > To: Zhang, Helin; dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh above 1 
> > for
> > all NICs but 82598
> >
> >
> >
> > On 08/13/15 23:28, Zhang, Helin wrote:
> > > Hi Vlad
> > >
> > > I don't think the changes are needed. It says in datasheet that the RS
> > > bit should be set on the last descriptor of every packet, ONLY WHEN
> > TXDCTL.WTHRESH equals to ZERO.
> >
> > Of course it's needed! See below.
> > Exactly the same spec a few lines above the place u've just quoted states:
> >
> > "Software should not set the RS bit when TXDCTL.WTHRESH is greater than
> > zero."
> >
> > And since all three (3) ixgbe xmit callbacks are utilizing RS bit notation 
> > ixgbe PMD
> > is actually not supporting any value of WTHRESH different from zero.
> I think Vlad is right. We need to fix this issue. Any suggestion? If not, I'd 
> like to ack this patch.

Pls., note that there is a v2 of this patch on the list. I forgot to patch 
ixgbevf_dev_info_get() in v1.

>
> >
> > >
> > > Regards,
> > > Helin
> > >
> > >> -Original Message-
> > >> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com > >> at cloudius-systems.com>]
> > >> Sent: Thursday, August 13, 2015 11:07 AM
> > >> To: dev at dpdk.org
> > >> Cc: Zhang, Helin; Ananyev, Konstantin; avi at 
> > >> cloudius-systems.com; Vlad
> > >> Zolotarov
> > >> Subject: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh above 1 for
> > all
> > >> NICs but 82598
> > >>
> > >> According to 82599 and x540 HW specifications RS bit *must* be set in the
> > last
> > >> descriptor of *every* packet.
> > > There is a condition that if TXDCTL.WTHRESH equal to zero.
> >
> > Right and ixgbe PMD requires this condition to be fulfilled in order to
> > function. See above.
> >
> > >
> > >> This patch fixes the Tx hang we were constantly hitting with a 
> > >> seastar-based
> > >> application on x540 NIC.
> > > Could you help to share with us how to reproduce the tx hang issue, with 
> > > using
> > > typical DPDK examples?
> >
> > Sorry. I'm not very familiar with the typical DPDK examples to help u
> > here. However this is quite irrelevant since without this this patch
> > ixgbe PMD obviously abuses the HW spec as has been explained above.
> >
> > We saw the issue when u stressed the xmit path with a lot of highly
> > fragmented TCP frames (packets with up to 33 fragments with non-headers
> > fragments as small as 4 bytes) with all offload features enabled.
> >
> > Thanks,
> > vlad
> > >
> > >> Signed-off-by: Vlad Zolotarov  > >> cloudius-systems.com>
> > >> ---
> > >>   drivers/net/ixgbe/ixgbe_ethdev.c |  9 +
> > >>   drivers/net/ixgbe/ixgbe_rxtx.c   | 23 ++-
> > >>   2 files changed, 31 insertions(+), 1 deletion(-)
> > >>
> > >> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c
> > >> b/drivers/net/ixgbe/ixgbe_ethdev.c
> > >> index b8ee1e9..6714fd9 100644
> > >> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> > >> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> > >> @@ -2414,6 +2414,15 @@ ixgbe_dev_info_get(struct rte_eth_dev *dev,
> > struct
> > >> rte_eth_dev_info *dev_info)
> > >>.txq_flags = ETH_TXQ_FLAGS_NOMULTSEGS |
> > >>ETH_TXQ_FLAGS_NOOFFLOADS,
> > >>};
> > >> +
> > >> +  /*
> > >> +   * According to 82599 and x540 specifications RS bit *must* be set on
> > the
> > >> +   * last descriptor of *every* packet. Therefore we will not allow the
> > >> +   * tx_rs_thresh above 1 for all NICs newer than 82598.
> > >> +   */
> > >> +  if (hw->mac.type > ixgbe_mac_82598EB)
> > >> +  dev_info->default_txconf.tx_rs_thresh = 1;
> > >> +
> > >>dev_info->hash_key_size = IXGBE_HKEY_MAX_INDEX * sizeof(uint32_t);
> > >>dev_info->reta_size = ETH_RSS_RETA_SIZE_128;
> > >>dev_info->flow_type_rss_offloads = IXGBE_RSS_OFFLOAD_ALL; diff --
> > git
> > >> a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c index
> > >> 91023b9..8dbdffc 100644
> > >> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> > >> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> > >> @@ -2085,11 +2085,19 @@ ixgbe_dev_tx_queue_setup(struct rte_eth_dev
> > >> *dev,
> > >>struct ixgbe_tx_queue *txq;

[dpdk-dev] Using pktgen-dpdk with VM's

2015-08-19 Thread Kar, Subarna
Hello,
I am new to dpdk and pktgen-dpdk. I have been able to send packets between 
NIC's in the host machine. How do I send packets between 2 VM's using pktgen? 
Is there any way to add the virtual NICs to the pktgen?

Thanks,
Subarna


[dpdk-dev] flow_director_filter error!!

2015-08-19 Thread Navneet Rao
Any suggestions?

Thanks
-Navneet


-Original Message-
From: Navneet Rao 
Sent: Tuesday, August 18, 2015 4:01 PM
To: dev at dpdk.org
Subject: [dpdk-dev] flow_director_filter error!!

Hello:



Using dpdk-2.0.0 and i540 -



I am using the testpmd app to test-drive the flow-director filter settings.

After I start the testpmd app, I am flusing the flow_director_filter settings 
and get the following error -



testpmd> flush_flow_director 0

PMD: ixgbe_fdir_flush(): Failed to re-initialize FD table.

flow director table flushing error: (Too many open files in system)



Any clues or directions?



Thanks

-Navneet




[dpdk-dev] [PATCH v2] Send updated port_id in vdev_init functions

2015-08-19 Thread Ravi Kerur
Instead of executing following functions before and after vdev_init
   > rte_eth_dev_save and
   > rte_eth_dev_get_changed_port

update following functions to return allocated port_id.
   > rte_eal_vdev_init
   > rte_eal_probe_one

Thanks to Tetsuya for his valuable inputs.

Ravi Kerur (1):
  Change rte_eal_vdev_init to update port_id

 drivers/net/enic/enic_ethdev.c  |   2 +-
 lib/librte_eal/common/eal_common_dev.c  |  13 ++--
 lib/librte_eal/common/eal_common_pci.c  |   6 +-
 lib/librte_eal/common/include/rte_dev.h |  36 +-
 lib/librte_eal/common/include/rte_pci.h |   4 +-
 lib/librte_ether/rte_ethdev.c   | 122 +---
 lib/librte_ether/rte_ether_version.map  |   8 +++
 7 files changed, 125 insertions(+), 66 deletions(-)

-- 
1.9.1



[dpdk-dev] [PATCH v2] Change rte_eal_vdev_init to update port_id

2015-08-19 Thread Ravi Kerur
v2:
   > Remove rte_pmd_mpipe_devinit changes
   > Use rte_eal_compare_pci_addr for address comparison
   > Use dpdk_2.2 in version map file for new functions

v1:
Changes include
   > Modify rte_eal_vdev_init to return allocated port_id
   > Modify rte_eal_probe_one to return allocated port_id

2. Removed following functions
   > rte_eth_dev_save and
   > rte_eth_dev_get_changed_port

3. Added 2 new functions
   > rte_eth_dev_get_port_by_name
   > rte_eth_dev_get_port_by_addr

4. Fix return error(ENOMEM) in function rte_pmd_mpipe_devinit

Compiled on Linux for following targets
   > x86_64-native-linuxapp-gcc
   > x86_64-native-linuxapp-clang
   > x86_x32-native-linuxapp-gcc

Compiled on FreeBSD for following targets
   > x86_64-native-bsdapp-clang
   > x86_64-native-bsdapp-gcc

Tested on Linux/FreeBSD:
   > port attach eth_null
   > port start all
   > port stop all
   > port close all
   > port detach 0
   > port attach eth_null
   > port start all
   > port stop all
   > port close all
   > port detach 0

Successful run of checkpatch.pl on the diffs

Successful validate_abi on Linux for following targets

   > x86_64-native-linuxapp-gcc
   > x86_64-native-linuxapp-clang

Signed-off-by: Ravi Kerur 
---
 drivers/net/enic/enic_ethdev.c  |   2 +-
 lib/librte_eal/common/eal_common_dev.c  |  13 ++--
 lib/librte_eal/common/eal_common_pci.c  |   6 +-
 lib/librte_eal/common/include/rte_dev.h |  36 +-
 lib/librte_eal/common/include/rte_pci.h |   4 +-
 lib/librte_ether/rte_ethdev.c   | 122 +---
 lib/librte_ether/rte_ether_version.map  |   8 +++
 7 files changed, 125 insertions(+), 66 deletions(-)

diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c
index 8280cea..472ef5a 100644
--- a/drivers/net/enic/enic_ethdev.c
+++ b/drivers/net/enic/enic_ethdev.c
@@ -36,8 +36,8 @@
 #include 
 #include 

-#include 
 #include 
+#include 
 #include 
 #include 

diff --git a/lib/librte_eal/common/eal_common_dev.c 
b/lib/librte_eal/common/eal_common_dev.c
index 4089d66..ffdb3b5 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -37,6 +37,7 @@
 #include 
 #include 

+#include 
 #include 
 #include 
 #include 
@@ -64,7 +65,7 @@ rte_eal_driver_unregister(struct rte_driver *driver)
 }

 int
-rte_eal_vdev_init(const char *name, const char *args)
+rte_eal_vdev_init(const char *name, const char *args, uint8_t *port_id)
 {
struct rte_driver *driver;

@@ -81,8 +82,12 @@ rte_eal_vdev_init(const char *name, const char *args)
 * will be "eth_pcap", but "name" will be "eth_pcapN".
 * So use strncmp to compare.
 */
-   if (!strncmp(driver->name, name, strlen(driver->name)))
-   return driver->init(name, args);
+   if (!strncmp(driver->name, name, strlen(driver->name))) {
+   if (!driver->init(name, args))
+   return rte_eth_dev_get_port_by_name(
+   name, port_id);
+   }
+
}

RTE_LOG(ERR, EAL, "no driver found for %s\n", name);
@@ -108,7 +113,7 @@ rte_eal_dev_init(void)
continue;

if (rte_eal_vdev_init(devargs->virtual.drv_name,
-   devargs->args)) {
+   devargs->args, NULL)) {
RTE_LOG(ERR, EAL, "failed to initialize %s device\n",
devargs->virtual.drv_name);
return -1;
diff --git a/lib/librte_eal/common/eal_common_pci.c 
b/lib/librte_eal/common/eal_common_pci.c
index 16e8629..3d97892 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -79,6 +79,7 @@
 #include 
 #include 
 #include 
+#include 

 #include "eal_private.h"

@@ -322,7 +323,7 @@ pci_detach_all_drivers(struct rte_pci_device *dev)
  * the driver of the devive.
  */
 int
-rte_eal_pci_probe_one(const struct rte_pci_addr *addr)
+rte_eal_pci_probe_one(const struct rte_pci_addr *addr, uint8_t *port_id)
 {
struct rte_pci_device *dev = NULL;
int ret = 0;
@@ -337,7 +338,8 @@ rte_eal_pci_probe_one(const struct rte_pci_addr *addr)
ret = pci_probe_all_drivers(dev);
if (ret < 0)
goto err_return;
-   return 0;
+
+   return rte_eth_dev_get_port_by_addr(addr, port_id);
}
return -1;

diff --git a/lib/librte_eal/common/include/rte_dev.h 
b/lib/librte_eal/common/include/rte_dev.h
index f601d21..564cdf3 100644
--- a/lib/librte_eal/common/include/rte_dev.h
+++ b/lib/librte_eal/common/include/rte_dev.h
@@ -110,10 +110,12 @@ int rte_eal_dev_init(void);
  *   The pointer to a driver name to be initialized.
  * @param args
  *   The pointer to arguments used by driver initialization.
+ * @param port_id
+ *   The port

[dpdk-dev] [PATCH v1] Return ENOMEM during mpipe_devinit failure

2015-08-19 Thread Ravi Kerur
In function rte_pmd_mpipe_devinit, if rte_eth_dev_allocate
fails return error which is inline with other drivers.

Signed-off-by: Ravi Kerur 
---
 drivers/net/mpipe/mpipe_tilegx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/mpipe/mpipe_tilegx.c b/drivers/net/mpipe/mpipe_tilegx.c
index 743feef..6e3e304 100644
--- a/drivers/net/mpipe/mpipe_tilegx.c
+++ b/drivers/net/mpipe/mpipe_tilegx.c
@@ -1582,6 +1582,7 @@ rte_pmd_mpipe_devinit(const char *ifname,
if (!eth_dev) {
RTE_LOG(ERR, PMD, "%s: Failed to allocate device.\n", ifname);
rte_free(priv);
+   return -ENOMEM;
}

RTE_LOG(INFO, PMD, "%s: Initialized mpipe device"
-- 
1.9.1



[dpdk-dev] [PATCH v2] Move common functions in eal_thread.c

2015-08-19 Thread Ravi Kerur
v2:
   > Remove un-needed header file eal_private.h from freeBSD
 eal_thread.c after code movement.

v1:
Changes include
   > Moving common functions in eal_thread.c in
 linuxapp and bsdapp into common/eal_common_thread.c file.
   > Rearrange eal_common_thread.c compilation in Makefile
 for ABI.

Compiled successfully for following targets
   > x86_64-native-linuxapp-clang
   > x86_64-native-linuxapp-gcc
   > x86_x32-native-linuxapp-gcc
   > i686-native-linuxapp-gcc
   > x86_64-native-bsdapp-clang
   > x86_64-native-bsdapp-gcc

Tested on
   > Ubuntu 14.04, testpmd functionality
   > FreeBSD 10.1, testpmd functionality

Signed-off-by: Ravi Kerur 
---
 lib/librte_eal/bsdapp/eal/Makefile|   3 +-
 lib/librte_eal/bsdapp/eal/eal_thread.c| 153 --
 lib/librte_eal/common/eal_common_thread.c | 147 +++-
 lib/librte_eal/linuxapp/eal/Makefile  |   3 +-
 lib/librte_eal/linuxapp/eal/eal_thread.c  | 153 --
 5 files changed, 150 insertions(+), 309 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/Makefile 
b/lib/librte_eal/bsdapp/eal/Makefile
index a969435..93d76bb 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -51,6 +51,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) := eal.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_memory.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_hugepage_info.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_thread.c
+SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_thread.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_log.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_pci.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_debug.c
@@ -76,7 +77,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_hexdump.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_devargs.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_options.c
-SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_thread.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += malloc_heap.c
@@ -90,6 +90,7 @@ CFLAGS_eal_common_log.o := -D_GNU_SOURCE
 # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
 ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
 CFLAGS_eal_thread.o += -Wno-return-type
+CFLAGS_eal_common_thread.o += -Wno-return-type
 CFLAGS_eal_hpet.o += -Wno-return-type
 endif

diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c 
b/lib/librte_eal/bsdapp/eal/eal_thread.c
index 9a03437..4036d21 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -35,163 +35,10 @@
 #include 
 #include 
 #include 
-#include 
-#include 
-#include 
-#include 
 #include 

-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include "eal_private.h"
 #include "eal_thread.h"

-RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = LCORE_ID_ANY;
-RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
-RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
-
-/*
- * Send a message to a slave lcore identified by slave_id to call a
- * function f with argument arg. Once the execution is done, the
- * remote lcore switch in FINISHED state.
- */
-int
-rte_eal_remote_launch(int (*f)(void *), void *arg, unsigned slave_id)
-{
-   int n;
-   char c = 0;
-   int m2s = lcore_config[slave_id].pipe_master2slave[1];
-   int s2m = lcore_config[slave_id].pipe_slave2master[0];
-
-   if (lcore_config[slave_id].state != WAIT)
-   return -EBUSY;
-
-   lcore_config[slave_id].f = f;
-   lcore_config[slave_id].arg = arg;
-
-   /* send message */
-   n = 0;
-   while (n == 0 || (n < 0 && errno == EINTR))
-   n = write(m2s, &c, 1);
-   if (n < 0)
-   rte_panic("cannot write on configuration pipe\n");
-
-   /* wait ack */
-   do {
-   n = read(s2m, &c, 1);
-   } while (n < 0 && errno == EINTR);
-
-   if (n <= 0)
-   rte_panic("cannot read on configuration pipe\n");
-
-   return 0;
-}
-
-/* set affinity for current thread */
-static int
-eal_thread_set_affinity(void)
-{
-   unsigned lcore_id = rte_lcore_id();
-
-   /* acquire system unique id  */
-   rte_gettid();
-
-   /* update EAL thread core affinity */
-   return rte_thread_set_affinity(&lcore_config[lcore_id].cpuset);
-}
-
-void eal_thread_init_master(unsigned lcore_id)
-{
-   /* set the lcore ID in per-lcore memory area */
-   RTE_PER_LCORE(_lcore_id) = lcore_id;
-
-   /* set CPU affinity */
-   if (eal_thread_set_affinity() < 0)
-   rte_panic("cannot set affinity\n");
-}
-
-/* main loop of threads */
-__attribute__((noreturn)) void *
-eal_thread_loop(__attribute__((unused)) void *arg)
-{
-   char c;
-   int n, ret;
-   unsigned lcore_id;
-   pthread_t t

[dpdk-dev] Pktgen error on pkts_rx from lua script

2015-08-19 Thread Posadas, Emerson
Hello

I have checked that I can access some of the pktgen data from lua scripts. I'm 
currently running a two port configuration and want to get the pkts_rx and 
pkts_tx 

>From http://dpdk.org/browse/apps/pktgen-dpdk/tree/scripts/rfc2544.lua example 
>script I see that I can access an element like this endstats[0].opackets.

But when trying to access endStats[0].pkts_rx I get this:
!ERROR!: ./Pktgen.lua:65: bad argument #2 to 'strfmt' (number expected, got nil)

I'm currently using pkgten version: 2.9.1. Any suggestion?

EP




[dpdk-dev] flow_director_filter error!!

2015-08-19 Thread Mcnamara, John
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Navneet Rao
> Sent: Tuesday, August 18, 2015 4:01 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] flow_director_filter error!!
> 
> After I start the testpmd app, I am flusing the flow_director_filter
> settings and get the following error -
> 
> 
> 
> testpmd> flush_flow_director 0
> 
> PMD: ixgbe_fdir_flush(): Failed to re-initialize FD table.
> 
> flow director table flushing error: (Too many open files in system)

Hi,

Are you setting a flow director filter before flushing? If so, could you give 
an example.

John.
--