[dpdk-dev] vhost-net stops sending to virtio pmd -- already fixed?

2015-09-17 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Kyle Larose
> Sent: Wednesday, September 16, 2015 5:05 AM
> To: Thomas Monjalon
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] vhost-net stops sending to virtio pmd -- already
> fixed?
> 
> On Sun, Sep 13, 2015 at 5:43 PM, Thomas Monjalon
>  wrote:
> >
> > Hi,
> >
> > 2015-09-11 12:32, Kyle Larose:
> > > Looking through the version tree for virtio_rxtx.c, I saw the
> > > following
> > > commit:
> > >
> > >
> http://dpdk.org/browse/dpdk/commit/lib/librte_pmd_virtio?id=8c09c20f
> > > b4cde76e53d87bd50acf2b441ecf6eb8
> > >
> > > Does anybody know offhand if the issue fixed by that commit could be
> > > the root cause of what I am seeing?
> >
> > I won't have the definitive answer but I would like to use your
> > question to highlight a common issue in git messages:
> >
> > PLEASE, authors of fixes, explain the bug you are fixing and how it
> > can be reproduced. Good commit messages are REALLY read and useful.
> >
> > Thanks
> >
> 
> I've figured out what happened. It has nothing to do with the fix I pasted
> above. Instead, the issue has to do with running low on mbufs.
> 
> Here's the general logic:
> 
> 1. If packets are not queued, return
> 2. Fetch each queued packet, as an mbuf, into the provided array. This may
>    involve some merging/etc.
> 3. Try to fill the virtio receive ring with new mbufs
>    3.a. If we fail to allocate an mbuf, break out of the refill loop
> 4. Update the receive ring information and kick the host
> 
> This is obviously a simplification, but the key point is 3.a. If we hit this 
> logic
> when the virtio receive ring is completely used up, we essentially lock up.
> The host will have no buffers with which to queue packets, so the next time
> we poll, we will hit case 1. However, since we hit case 1, we will not 
> allocate
> mbufs to the virtio receive ring, regardless of how many are now free. Rinse
> and repeat; we are stuck until the pmd is restarted or the link is restarted.
> 
> This is very easy to reproduce when the mbuf pool is fairly small, and packets
> are being passed to worker threads/processes which may increase the
> length of the pipeline.
> 
> I took a quick look at the ixgbe driver, and it looks like it checks if it 
> needs to
> allocate mbufs to the ring before trying to pull packets off the nic. Should 
> we
> not be doing something similar for virtio? Rather than breaking out early if 
> no
> packets are queued, we should first make sure there are resources with
> which to queue packets!

Try to allocate mbufs and refill the vring descriptors when case 1 is hit;
that should address your issue.
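The deadlock and the suggested fix can be modeled in a few lines. This is a toy simulation, not DPDK code: `posted` stands for the number of descriptors the host can fill, `free` for the mbufs left in the pool, and all names are illustrative stand-ins for the real logic in `virtio_recv_pkts`.

```c
#include <assert.h>

struct rxq  { int posted; };   /* descriptors currently given to the host */
struct pool { int free; };     /* mbufs remaining in the pool             */

static void refill(struct rxq *q, struct pool *p, int cap)
{
    while (q->posted < cap && p->free > 0) {
        q->posted++;           /* give the host another receive buffer */
        p->free--;
    }
}

/* Original flow: early return when nothing is queued, so the ring is
 * never refilled once it has fully drained (case 1 forever). */
static int poll_buggy(struct rxq *q, struct pool *p, int cap)
{
    if (q->posted == 0)        /* host had no buffers, so nothing queued */
        return 0;
    q->posted--;               /* dequeue one packet */
    refill(q, p, cap);         /* step 3: refill happens only on this path */
    return 1;
}

/* Suggested flow: make sure there are resources before checking for
 * queued packets, so the ring recovers once mbufs are freed back. */
static int poll_fixed(struct rxq *q, struct pool *p, int cap)
{
    refill(q, p, cap);
    if (q->posted == 0)
        return 0;
    q->posted--;
    refill(q, p, cap);
    return 1;
}
```

Starting from a fully drained ring with mbufs freed back to the pool, the buggy poll stays stuck at zero received packets while the fixed poll recovers.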

> 
> One solution here is to increase the mbuf pool to a size where such
> exhaustion is impossible, but that doesn't seem like a graceful solution. For
> example, it may be desirable to drop packets rather than have a large
> memory pool, and becoming stuck under such a situation is not good. Further,
> it isn't easy to know the exact size required. You may end up wasting a bunch
> of resources allocating far more than necessary, or you may unknowingly
> under allocate, only to find out once your application has been deployed into
> production, and it's dropping everything on the floor.
> 
> Does anyone have thoughts on this? I took a look at virtio_rxtx and head and
> I didn't see anything resembling my suggestion.
> 
> Comments would be appreciated. Thanks,
> 
> Kyle


[dpdk-dev] [RFC PATCH 4/8] driver/virtio:enqueue TSO offload

2015-09-09 Thread Ouyang, Changchun


> -Original Message-
> From: Liu, Jijiang
> Sent: Thursday, September 10, 2015 1:21 AM
> To: Ouyang, Changchun; dev at dpdk.org
> Subject: RE: [dpdk-dev] [RFC PATCH 4/8] driver/virtio:enqueue TSO offload
> 
> 
> 
> > -Original Message-----
> > From: Ouyang, Changchun
> > Sent: Tuesday, September 8, 2015 6:18 PM
> > To: Liu, Jijiang; dev at dpdk.org
> > Cc: Ouyang, Changchun
> > Subject: RE: [dpdk-dev] [RFC PATCH 4/8] driver/virtio:enqueue TSO
> > offload
> >
> >
> >
> > > -Original Message-----
> > > From: Liu, Jijiang
> > > Sent: Monday, September 7, 2015 2:11 PM
> > > To: Ouyang, Changchun; dev at dpdk.org
> > > Subject: RE: [dpdk-dev] [RFC PATCH 4/8] driver/virtio:enqueue TSO
> > > offload
> > >
> > >
> > >
> > > > -Original Message-
> > > > From: Ouyang, Changchun
> > > > Sent: Monday, August 31, 2015 8:29 PM
> > > > To: Liu, Jijiang; dev at dpdk.org
> > > > Cc: Ouyang, Changchun
> > > > Subject: RE: [dpdk-dev] [RFC PATCH 4/8] driver/virtio:enqueue TSO
> > > > offload
> > > >
> > > >
> > > >
> > > > > -Original Message-
> > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jijiang Liu
> > > > > Sent: Monday, August 31, 2015 5:42 PM
> > > > > To: dev at dpdk.org
> > > > > Subject: [dpdk-dev] [RFC PATCH 4/8] driver/virtio:enqueue TSO
> > > > > offload
> > > > >
> > > > > Enqueue TSO4/6 offload.
> > > > >
> > > > > Signed-off-by: Jijiang Liu 
> > > > > ---
> > > > >  drivers/net/virtio/virtio_rxtx.c |   23 +++
> > > > >  1 files changed, 23 insertions(+), 0 deletions(-)
> > > > >
> > > > > diff --git a/drivers/net/virtio/virtio_rxtx.c
> > > > > b/drivers/net/virtio/virtio_rxtx.c
> > > > > index c5b53bb..4c2d838 100644
> > > > > --- a/drivers/net/virtio/virtio_rxtx.c
> > > > > +++ b/drivers/net/virtio/virtio_rxtx.c
> > > > > @@ -198,6 +198,28 @@ virtqueue_enqueue_recv_refill(struct
> > > > > virtqueue *vq, struct rte_mbuf *cookie)
> > > > >   return 0;
> > > > >  }
> > > > >
> > > > > +static void
> > > > > +virtqueue_enqueue_offload(struct virtqueue *txvq, struct
> > > > > +rte_mbuf
> > > *m,
> > > > > + uint16_t idx, uint16_t hdr_sz) {
> > > > > + struct virtio_net_hdr *hdr = (struct virtio_net_hdr
> *)(uintptr_t)
> > > > > + (txvq->virtio_net_hdr_addr + idx *
> hdr_sz);
> > > > > +
> > > > > + if (m->tso_segsz != 0 && m->ol_flags & PKT_TX_TCP_SEG) {
> > > > > + if (m->ol_flags & PKT_TX_IPV4) {
> > > > > + if (!vtpci_with_feature(txvq->hw,
> > > > > VIRTIO_NET_F_HOST_TSO4))
> > > > > + return;
> > > >
> > > > Do we need return error if host can't handle tso for the packet?
> > > >
> > > > > + hdr->gso_type =
> VIRTIO_NET_HDR_GSO_TCPV4;
> > > > > + } else if (m->ol_flags & PKT_TX_IPV6) {
> > > > > + if (!vtpci_with_feature(txvq->hw,
> > > > > VIRTIO_NET_F_HOST_TSO6))
> > > > > + return;
> > > >
> > > > Same as above
> > > >
> > > > > + hdr->gso_type =
> VIRTIO_NET_HDR_GSO_TCPV6;
> > > > > + }
> > > >
> > > > Do we need else branch for the case of neither tcpv4 nor tcpv6?
> > > >
> > > > > + hdr->gso_size = m->tso_segsz;
> > > > > + hdr->hdr_len = m->l2_len + m->l3_len + m->l4_len;
> > > > > + }
> > > > > +}
> > > > > +
> > > > >  static int
> > > > >  virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf
> > > > > *cookie) { @@ -221,6 +243,7 @@ virtqueue_enqueue_xmit(struct
> > > > > virtqueue *txvq, struct rte_mbuf *cookie)
> > > > >   dxp->cookie = (void *)cookie;
> > > > >   dxp-&g

[dpdk-dev] vring_init bug

2015-09-09 Thread Ouyang, Changchun

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Xie, Huawei
> Sent: Wednesday, September 9, 2015 11:00 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] vring_init bug
> 
> static inline void
> vring_init(struct vring *vr, unsigned int num, uint8_t *p,
> unsigned long align)
> {
> vr->num = num;
> vr->desc = (struct vring_desc *) p;
> vr->avail = (struct vring_avail *) (p +
> num * sizeof(struct vring_desc));
> vr->used = (void *)
> RTE_ALIGN_CEIL((uintptr_t)(&vr->avail->ring[num]), align); }
> 
> There is a bug in the vr->used calculation: the 2 bytes of used_event_idx
> aren't considered. I will submit a fix.
> __u16 available[num];
> __u16 used_event_idx;

The vring_used ring also misses avail_event:

struct vring_used {
    u16 flags;
    u16 idx;
    struct vring_used_elem ring[qsz];
    u16 avail_event;  /* this one is missed in dpdk */
};

It doesn't affect the offset calculation, but it would be great if you could
add it at the same time.
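The fix being discussed can be sketched as follows. Types are simplified stand-ins for the real virtio headers; the key point is that with the event-index feature, a 16-bit used_event field follows `available[num]` and must be counted before aligning up to find the used ring.

```c
#include <stdint.h>
#include <assert.h>

#define ALIGN_CEIL(v, a)  (((v) + (a) - 1) & ~((uintptr_t)(a) - 1))

struct vring_desc      { uint64_t addr; uint32_t len; uint16_t flags; uint16_t next; };
struct vring_avail     { uint16_t flags, idx; uint16_t ring[]; };
struct vring_used_elem { uint32_t id, len; };
struct vring_used      { uint16_t flags, idx; struct vring_used_elem ring[]; };

struct vring {
    unsigned int        num;
    struct vring_desc  *desc;
    struct vring_avail *avail;
    struct vring_used  *used;
};

static inline void
vring_init_fixed(struct vring *vr, unsigned int num, uint8_t *p,
                 unsigned long align)
{
    vr->num   = num;
    vr->desc  = (struct vring_desc *)p;
    vr->avail = (struct vring_avail *)(p + num * sizeof(struct vring_desc));
    /* ring[num + 1], not ring[num]: the extra uint16_t accounts for the
     * used_event field that sits right after available[num]. */
    vr->used  = (struct vring_used *)
        ALIGN_CEIL((uintptr_t)&vr->avail->ring[num + 1], align);
}
```

With num = 4 and a small alignment, the used ring lands 2 bytes later than the original calculation would place it, which is exactly the missing used_event.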


[dpdk-dev] [PATCH 4/4] vhost: define callfd and kickfd as int type

2015-09-09 Thread Ouyang, Changchun


> -Original Message-
> From: Yuanhan Liu [mailto:yuanhan.liu at linux.intel.com]
> Sent: Monday, August 24, 2015 11:55 AM
> To: dev at dpdk.org
> Cc: Xie, Huawei; Ouyang, Changchun; Yuanhan Liu
> Subject: [PATCH 4/4] vhost: define callfd and kickfd as int type
> 
> So that we can remove the redundant (int) cast.
> 
> Signed-off-by: Yuanhan Liu 
> ---
>  examples/vhost/main.c |  6 ++---
>  lib/librte_vhost/rte_virtio_net.h |  4 ++--
>  lib/librte_vhost/vhost_rxtx.c |  6 ++---
>  lib/librte_vhost/vhost_user/virtio-net-user.c | 16 +++---
>  lib/librte_vhost/virtio-net.c | 32 
> +--
>  5 files changed, 32 insertions(+), 32 deletions(-)
> 
> diff --git a/examples/vhost/main.c b/examples/vhost/main.c index
> 1b137b9..b090b25 100644
> --- a/examples/vhost/main.c
> +++ b/examples/vhost/main.c
> @@ -1433,7 +1433,7 @@ put_desc_to_used_list_zcp(struct vhost_virtqueue
> *vq, uint16_t desc_idx)
> 
>   /* Kick the guest if necessary. */
>   if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
> - eventfd_write((int)vq->callfd, 1);
> + eventfd_write(vq->callfd, 1);

Don't we need a type conversion of '1' to eventfd_t here?

>  }
> 
>  /*
> @@ -1626,7 +1626,7 @@ txmbuf_clean_zcp(struct virtio_net *dev, struct
> vpool *vpool)
> 
>   /* Kick guest if required. */
>   if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
> - eventfd_write((int)vq->callfd, 1);
> + eventfd_write(vq->callfd, 1);

Same as above

> 
>   return 0;
>  }
> @@ -1774,7 +1774,7 @@ virtio_dev_rx_zcp(struct virtio_net *dev, struct
> rte_mbuf **pkts,
> 
>   /* Kick the guest if necessary. */
>   if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
> - eventfd_write((int)vq->callfd, 1);
> + eventfd_write(vq->callfd, 1);

Same as above

> 
>   return count;
>  }
> diff --git a/lib/librte_vhost/rte_virtio_net.h
> b/lib/librte_vhost/rte_virtio_net.h
> index b9bf320..a037c15 100644
> --- a/lib/librte_vhost/rte_virtio_net.h
> +++ b/lib/librte_vhost/rte_virtio_net.h
> @@ -87,8 +87,8 @@ struct vhost_virtqueue {
>   uint16_tvhost_hlen; /**< Vhost header
> length (varies depending on RX merge buffers. */
>   volatile uint16_t   last_used_idx;  /**< Last index used
> on the available ring */
>   volatile uint16_t   last_used_idx_res;  /**< Used for
> multiple devices reserving buffers. */
> - eventfd_t   callfd; /**< Used to notify
> the guest (trigger interrupt). */
> - eventfd_t   kickfd; /**< Currently
> unused as polling mode is enabled. */
> + int callfd; /**< Used to notify
> the guest (trigger interrupt). */
> + int kickfd; /**< Currently
> unused as polling mode is enabled. */
>   struct buf_vector   buf_vec[BUF_VECTOR_MAX];/**< for
> scatter RX. */
>  } __rte_cache_aligned;
> 
> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
> index d412293..887cdb6 100644
> --- a/lib/librte_vhost/vhost_rxtx.c
> +++ b/lib/librte_vhost/vhost_rxtx.c
> @@ -230,7 +230,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t
> queue_id,
> 
>   /* Kick the guest if necessary. */
>   if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
> - eventfd_write((int)vq->callfd, 1);
> + eventfd_write(vq->callfd, 1);
>   return count;
>  }
> 
> @@ -529,7 +529,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t
> queue_id,
> 
>   /* Kick the guest if necessary. */
>   if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
> - eventfd_write((int)vq->callfd, 1);
> + eventfd_write(vq->callfd, 1);
>   }
> 
>   return count;
> @@ -752,6 +752,6 @@ rte_vhost_dequeue_burst(struct virtio_net *dev,
> uint16_t queue_id,
>   vq->used->idx += entry_success;
>   /* Kick guest if required. */
>   if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
> - eventfd_write((int)vq->callfd, 1);
> + eventfd_write(vq->callfd, 1);
>   return entry_success;
>  }
> diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c
> b/lib/librte_vhost/vhost_user/virtio-net-user.c
> index c1ffc38..4689927 100644
> --- a/lib/librte_v

[dpdk-dev] [PATCH 4/4] vhost: define callfd and kickfd as int type

2015-09-09 Thread Ouyang, Changchun


> -Original Message-
> From: Yuanhan Liu [mailto:yuanhan.liu at linux.intel.com]
> Sent: Wednesday, September 9, 2015 9:55 AM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org; Xie, Huawei
> Subject: Re: [PATCH 4/4] vhost: define callfd and kickfd as int type
> 
> On Wed, Sep 09, 2015 at 01:43:06AM +, Ouyang, Changchun wrote:
> >
> >
> > > -Original Message-
> > > From: Yuanhan Liu [mailto:yuanhan.liu at linux.intel.com]
> > > Sent: Monday, August 24, 2015 11:55 AM
> > > To: dev at dpdk.org
> > > Cc: Xie, Huawei; Ouyang, Changchun; Yuanhan Liu
> > > Subject: [PATCH 4/4] vhost: define callfd and kickfd as int type
> > >
> > > So that we can remove the redundant (int) cast.
> > >
> > > Signed-off-by: Yuanhan Liu 
> > > ---
> >
> > > diff --git a/lib/librte_vhost/rte_virtio_net.h
> > > b/lib/librte_vhost/rte_virtio_net.h
> > > index b9bf320..a037c15 100644
> > > --- a/lib/librte_vhost/rte_virtio_net.h
> > > +++ b/lib/librte_vhost/rte_virtio_net.h
> > > @@ -87,8 +87,8 @@ struct vhost_virtqueue {
> > >   uint16_tvhost_hlen; /**< Vhost header
> > > length (varies depending on RX merge buffers. */
> > >   volatile uint16_t   last_used_idx;  /**< Last index used
> > > on the available ring */
> > >   volatile uint16_t   last_used_idx_res;  /**< Used for
> > > multiple devices reserving buffers. */
> > > - eventfd_t   callfd; /**< Used to notify
> > > the guest (trigger interrupt). */
> > > - eventfd_t   kickfd; /**< Currently
> > > unused as polling mode is enabled. */
> > > + int callfd; /**< Used to notify
> > > the guest (trigger interrupt). */
> > > + int kickfd; /**< Currently
> > > unused as polling mode is enabled. */
> >
> > I don't think we have to change it from 8 bytes (eventfd_t is defined as
> > uint64_t) to 4 bytes. Is there any benefit to this change?
> 
> As I stated in the commit log, to remove the redundant (int) cast. Casts like
> following are a bit ugly:
> 
> if ((int)dev->virtqueue[VIRTIO_RXQ]->callfd >= 0)
> close((int)dev->virtqueue[VIRTIO_RXQ]->callfd);
> 
> On the other hand, why it has to be uint64_t? The caller side sends the
> message(be more precisely, qemu) actually uses int type.
> 

Agreed, qemu uses 32 bits for the callfd and kickfd, so it could use int.
There is another comment on another part of this patch, which I will send out
soon.


>   --yliu


[dpdk-dev] virtio-net: bind systematically on all non blacklisted virtio-net devices

2015-09-09 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Franck Baudin
> Sent: Tuesday, September 8, 2015 4:23 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] virtio-net: bind systematically on all non blacklisted
> virtio-net devices
> 
> Hi,
> 
> virtio-net driver bind on all virtio-net devices, even if the devices are 
> used by
> the kernel (leading to kernel soft-lookup/panic). One way around is to
> blacklist the ports in use by Linux. This is the case since v2.0.0, in fact 
> since
> commit da978dfdc43b59e290a46d7ece5fd19ce79a1162
> and the removal of the RTE_PCI_DRV_NEED_MAPPING driver flag.

It allows the virtio PMD to not necessarily depend on igb_uio, a
characteristic that other PMD drivers don't have.

> 
> Questions:
>  1/ Is it the expected behaviour?
>  2/ Why is it different from vmxnet3 pmd? In other words, should't we re-
> add the RTE_PCI_DRV_NEED_MAPPING to virtio pmd or remove it from
> pmxnet3 pmd?
>  3/ If this is the expected behaviour, shouldn't we update
> dpdk_nic_bind.py (binding status irrelevant for virtio) tool and the
> documentation (mentioning igb_uio while misleading and useless)?
> 
> Thanks!
> 
> Best Regards,
> Franck
> 
> 
> 



[dpdk-dev] [PATCH 1/4] vhost: remove redundant ;

2015-09-09 Thread Ouyang, Changchun


> -Original Message-
> From: Yuanhan Liu [mailto:yuanhan.liu at linux.intel.com]
> Sent: Monday, August 24, 2015 11:55 AM
> To: dev at dpdk.org
> Cc: Xie, Huawei; Ouyang, Changchun; Yuanhan Liu
> Subject: [PATCH 1/4] vhost: remove redundant ;
> 
> Signed-off-by: Yuanhan Liu 

Acked-by: Changchun Ouyang 

> ---
>  lib/librte_vhost/vhost_rxtx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
> index 0d07338..d412293 100644
> --- a/lib/librte_vhost/vhost_rxtx.c
> +++ b/lib/librte_vhost/vhost_rxtx.c
> @@ -185,7 +185,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t
> queue_id,
>   }
>   }
>   len_to_cpy = RTE_MIN(data_len - offset, desc->len -
> vb_offset);
> - };
> + }
> 
>   /* Update used ring with desc information */
>   vq->used->ring[res_cur_idx & (vq->size - 1)].id =
> --
> 1.9.0



[dpdk-dev] [PATCH 4/4] vhost: define callfd and kickfd as int type

2015-09-09 Thread Ouyang, Changchun


> -Original Message-
> From: Yuanhan Liu [mailto:yuanhan.liu at linux.intel.com]
> Sent: Monday, August 24, 2015 11:55 AM
> To: dev at dpdk.org
> Cc: Xie, Huawei; Ouyang, Changchun; Yuanhan Liu
> Subject: [PATCH 4/4] vhost: define callfd and kickfd as int type
> 
> So that we can remove the redundant (int) cast.
> 
> Signed-off-by: Yuanhan Liu 
> ---

> diff --git a/lib/librte_vhost/rte_virtio_net.h
> b/lib/librte_vhost/rte_virtio_net.h
> index b9bf320..a037c15 100644
> --- a/lib/librte_vhost/rte_virtio_net.h
> +++ b/lib/librte_vhost/rte_virtio_net.h
> @@ -87,8 +87,8 @@ struct vhost_virtqueue {
>   uint16_tvhost_hlen; /**< Vhost header
> length (varies depending on RX merge buffers. */
>   volatile uint16_t   last_used_idx;  /**< Last index used
> on the available ring */
>   volatile uint16_t   last_used_idx_res;  /**< Used for
> multiple devices reserving buffers. */
> - eventfd_t   callfd; /**< Used to notify
> the guest (trigger interrupt). */
> - eventfd_t   kickfd; /**< Currently
> unused as polling mode is enabled. */
> + int callfd; /**< Used to notify
> the guest (trigger interrupt). */
> + int kickfd; /**< Currently
> unused as polling mode is enabled. */

I don't think we have to change it from 8 bytes (eventfd_t is defined as
uint64_t) to 4 bytes. Is there any benefit to this change?




[dpdk-dev] [PATCH 3/4] vhost: get rid of duplicate code

2015-09-09 Thread Ouyang, Changchun


> -Original Message-
> From: Yuanhan Liu [mailto:yuanhan.liu at linux.intel.com]
> Sent: Monday, August 24, 2015 11:55 AM
> To: dev at dpdk.org
> Cc: Xie, Huawei; Ouyang, Changchun; Yuanhan Liu
> Subject: [PATCH 3/4] vhost: get rid of duplicate code
> 
> Signed-off-by: Yuanhan Liu 

Acked-by: Changchun Ouyang 



[dpdk-dev] [PATCH 2/4] vhost: fix typo

2015-09-09 Thread Ouyang, Changchun


> -Original Message-
> From: Yuanhan Liu [mailto:yuanhan.liu at linux.intel.com]
> Sent: Monday, August 24, 2015 11:55 AM
> To: dev at dpdk.org
> Cc: Xie, Huawei; Ouyang, Changchun; Yuanhan Liu
> Subject: [PATCH 2/4] vhost: fix typo
> 
> _det => _dev
> 
> Signed-off-by: Yuanhan Liu 

Acked-by: Changchun Ouyang 


[dpdk-dev] [RFC PATCH 4/8] driver/virtio:enqueue TSO offload

2015-09-09 Thread Ouyang, Changchun


> -Original Message-
> From: Liu, Jijiang
> Sent: Monday, September 7, 2015 2:11 PM
> To: Ouyang, Changchun; dev at dpdk.org
> Subject: RE: [dpdk-dev] [RFC PATCH 4/8] driver/virtio:enqueue TSO offload
> 
> 
> 
> > -Original Message-----
> > From: Ouyang, Changchun
> > Sent: Monday, August 31, 2015 8:29 PM
> > To: Liu, Jijiang; dev at dpdk.org
> > Cc: Ouyang, Changchun
> > Subject: RE: [dpdk-dev] [RFC PATCH 4/8] driver/virtio:enqueue TSO
> > offload
> >
> >
> >
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jijiang Liu
> > > Sent: Monday, August 31, 2015 5:42 PM
> > > To: dev at dpdk.org
> > > Subject: [dpdk-dev] [RFC PATCH 4/8] driver/virtio:enqueue TSO
> > > offload
> > >
> > > Enqueue TSO4/6 offload.
> > >
> > > Signed-off-by: Jijiang Liu 
> > > ---
> > >  drivers/net/virtio/virtio_rxtx.c |   23 +++
> > >  1 files changed, 23 insertions(+), 0 deletions(-)
> > >
> > > diff --git a/drivers/net/virtio/virtio_rxtx.c
> > > b/drivers/net/virtio/virtio_rxtx.c
> > > index c5b53bb..4c2d838 100644
> > > --- a/drivers/net/virtio/virtio_rxtx.c
> > > +++ b/drivers/net/virtio/virtio_rxtx.c
> > > @@ -198,6 +198,28 @@ virtqueue_enqueue_recv_refill(struct virtqueue
> > > *vq, struct rte_mbuf *cookie)
> > >   return 0;
> > >  }
> > >
> > > +static void
> > > +virtqueue_enqueue_offload(struct virtqueue *txvq, struct rte_mbuf
> *m,
> > > + uint16_t idx, uint16_t hdr_sz)
> > > +{
> > > + struct virtio_net_hdr *hdr = (struct virtio_net_hdr *)(uintptr_t)
> > > + (txvq->virtio_net_hdr_addr + idx * hdr_sz);
> > > +
> > > + if (m->tso_segsz != 0 && m->ol_flags & PKT_TX_TCP_SEG) {
> > > + if (m->ol_flags & PKT_TX_IPV4) {
> > > + if (!vtpci_with_feature(txvq->hw,
> > > VIRTIO_NET_F_HOST_TSO4))
> > > + return;
> >
> > Do we need return error if host can't handle tso for the packet?
> >
> > > + hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
> > > + } else if (m->ol_flags & PKT_TX_IPV6) {
> > > + if (!vtpci_with_feature(txvq->hw,
> > > VIRTIO_NET_F_HOST_TSO6))
> > > + return;
> >
> > Same as above
> >
> > > + hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
> > > + }
> >
> > Do we need else branch for the case of neither tcpv4 nor tcpv6?
> >
> > > + hdr->gso_size = m->tso_segsz;
> > > + hdr->hdr_len = m->l2_len + m->l3_len + m->l4_len;
> > > + }
> > > +}
> > > +
> > >  static int
> > >  virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf
> > > *cookie) { @@ -221,6 +243,7 @@ virtqueue_enqueue_xmit(struct
> > > virtqueue *txvq, struct rte_mbuf *cookie)
> > >   dxp->cookie = (void *)cookie;
> > >   dxp->ndescs = needed;
> > >
> > > + virtqueue_enqueue_offload(txvq, cookie, idx, head_size);
> >
> > If TSO is not enabled in the feature bit, how to resolve here?
> 
> The TSO enablement check is in the function.
> 
> If TSO is not enabled, we don't need to fill the virtio_net_hdr.

Here I mean: if (m->ol_flags & PKT_TX_TCP_SEG) is true, that is to say, the
virtio-pmd user expects TSO to be done in vhost or virtio, but the host
feature bit doesn't support it. Then we should handle this case: either
handle it in the virtio PMD, or return an error to the caller. Otherwise a
packet flagged for TSO may not be sent out normally.
Am I right?

> 
> > >   start_dp = txvq->vq_ring.desc;
> > >   start_dp[idx].addr =
> > >   txvq->virtio_net_hdr_mem + idx * head_size;
> > > --
> > > 1.7.7.6
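The fallback the review asks for could look like the sketch below. The flag values, parameter names, and the `check_tso_support` helper are hypothetical stand-ins, not the real PMD API; the point is only that the function reports -ENOTSUP so the caller can drop the packet or fall back to software segmentation instead of silently transmitting a broken packet.

```c
#include <errno.h>
#include <stdint.h>
#include <assert.h>

/* Stand-ins for the real rte_mbuf ol_flags bits. */
#define PKT_TX_TCP_SEG (1u << 0)
#define PKT_TX_IPV4    (1u << 1)
#define PKT_TX_IPV6    (1u << 2)

/* Returns 0 on success, -ENOTSUP when the mbuf requests TSO but the
 * host did not negotiate the matching HOST_TSO feature bit. */
static int check_tso_support(uint32_t ol_flags, int host_tso4, int host_tso6)
{
    if (!(ol_flags & PKT_TX_TCP_SEG))
        return 0;                       /* no TSO requested: nothing to check */
    if ((ol_flags & PKT_TX_IPV4) && !host_tso4)
        return -ENOTSUP;
    if ((ol_flags & PKT_TX_IPV6) && !host_tso6)
        return -ENOTSUP;
    return 0;
}
```

The transmit path would call this before filling the virtio_net header and propagate the error upward rather than returning void.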



[dpdk-dev] [PATCH 3/4] virtio: use indirect ring elements

2015-09-06 Thread Ouyang, Changchun


> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Saturday, September 5, 2015 4:58 AM
> To: Xie, Huawei; Ouyang, Changchun
> Cc: dev at dpdk.org; Stephen Hemminger
> Subject: [PATCH 3/4] virtio: use indirect ring elements
> 
> The virtio ring in QEMU/KVM is usually limited to 256 entries and the normal
> way that virtio driver was queuing mbufs required nsegs + 1 ring elements.
> By using the indirect ring element feature if available, each packet will take
> only one ring slot even for multi-segment packets.
> 
> Signed-off-by: Stephen Hemminger 
> ---
>  drivers/net/virtio/virtio_ethdev.c | 11 +---
> drivers/net/virtio/virtio_ethdev.h |  3 ++-
>  drivers/net/virtio/virtio_rxtx.c   | 51 ++-
> ---
>  drivers/net/virtio/virtqueue.h |  8 ++
>  4 files changed, 57 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/net/virtio/virtio_ethdev.c
> b/drivers/net/virtio/virtio_ethdev.c
> index 465d3cd..bcfb87b 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -359,12 +359,15 @@ int virtio_dev_queue_setup(struct rte_eth_dev
> *dev,
>   if (queue_type == VTNET_TQ) {
>   /*
>* For each xmit packet, allocate a virtio_net_hdr
> +  * and indirect ring elements
>*/
>   snprintf(vq_name, sizeof(vq_name),
> "port%d_tvq%d_hdrzone",
> - dev->data->port_id, queue_idx);
> - vq->virtio_net_hdr_mz =
> rte_memzone_reserve_aligned(vq_name,
> - vq_size * hw->vtnet_hdr_size,
> - socket_id, 0, RTE_CACHE_LINE_SIZE);
> +  dev->data->port_id, queue_idx);
> +
> + vq->virtio_net_hdr_mz =
> + rte_memzone_reserve_aligned(vq_name,
> + vq_size * sizeof(struct
> virtio_tx_region),
> + socket_id, 0,
> RTE_CACHE_LINE_SIZE);
>   if (vq->virtio_net_hdr_mz == NULL) {
>   if (rte_errno == EEXIST)
>   vq->virtio_net_hdr_mz =
> diff --git a/drivers/net/virtio/virtio_ethdev.h
> b/drivers/net/virtio/virtio_ethdev.h
> index 9026d42..07a9265 100644
> --- a/drivers/net/virtio/virtio_ethdev.h
> +++ b/drivers/net/virtio/virtio_ethdev.h
> @@ -64,7 +64,8 @@
>1u << VIRTIO_NET_F_CTRL_VQ   | \
>1u << VIRTIO_NET_F_CTRL_RX   | \
>1u << VIRTIO_NET_F_CTRL_VLAN | \
> -  1u << VIRTIO_NET_F_MRG_RXBUF)
> +  1u << VIRTIO_NET_F_MRG_RXBUF | \
> +  1u << VIRTIO_RING_F_INDIRECT_DESC)
> 
>  /*
>   * CQ function prototype
> diff --git a/drivers/net/virtio/virtio_rxtx.c 
> b/drivers/net/virtio/virtio_rxtx.c
> index dbe6665..8979695 100644
> --- a/drivers/net/virtio/virtio_rxtx.c
> +++ b/drivers/net/virtio/virtio_rxtx.c
> @@ -199,14 +199,15 @@ virtqueue_enqueue_recv_refill(struct virtqueue
> *vq, struct rte_mbuf *cookie)  }
> 
>  static int
> -virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf *cookie)
> +virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf *cookie,
> +int use_indirect)
>  {
>   struct vq_desc_extra *dxp;
>   struct vring_desc *start_dp;
>   uint16_t seg_num = cookie->nb_segs;
> - uint16_t needed = 1 + seg_num;
> + uint16_t needed = use_indirect ? 1 : 1 + seg_num;
>   uint16_t head_idx, idx;
> - uint16_t head_size = txvq->hw->vtnet_hdr_size;
> + unsigned long offs;
> 
>   if (unlikely(txvq->vq_free_cnt == 0))
>   return -ENOSPC;
> @@ -220,11 +221,26 @@ virtqueue_enqueue_xmit(struct virtqueue *txvq,
> struct rte_mbuf *cookie)
>   dxp = &txvq->vq_descx[idx];
>   dxp->cookie = (void *)cookie;
>   dxp->ndescs = needed;
> -
>   start_dp = txvq->vq_ring.desc;
> - start_dp[idx].addr =
> - txvq->virtio_net_hdr_mem + idx * head_size;
> - start_dp[idx].len = (uint32_t)head_size;
> +
> + if (use_indirect) {
> + offs = offsetof(struct virtio_tx_region, tx_indir)
> + + idx * sizeof(struct virtio_tx_region);
> +
> + start_dp[idx].addr = txvq->virtio_net_hdr_mem + offs;
> + start_dp[idx].len = sizeof(struct vring_desc);

Should the length be N * sizeof(struct vring_desc)?

> + start_dp[idx].flags = VRING_DESC_F_INDIRECT;
> +
> + start_dp = (struct vring_desc *)
> + 
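The length question raised above can be made concrete. The indirect descriptor's len field tells the host how big the indirect table is, so it should cover every entry the packet uses (header descriptor plus data segments), not a single `struct vring_desc`. A minimal illustration, with a simplified descriptor layout rather than the real virtio header:

```c
#include <stdint.h>
#include <assert.h>

/* Simplified descriptor layout (the real one lives in the virtio headers). */
struct vring_desc { uint64_t addr; uint32_t len; uint16_t flags; uint16_t next; };

/* len for a VRING_DESC_F_INDIRECT descriptor: the size of the whole
 * indirect table actually used, i.e. N entries, not one. */
static uint32_t indirect_table_len(uint16_t used_entries)
{
    return used_entries * (uint32_t)sizeof(struct vring_desc);
}
```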

[dpdk-dev] [PATCH 3/4] virtio: use indirect ring elements

2015-09-06 Thread Ouyang, Changchun


> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Saturday, September 5, 2015 4:58 AM
> To: Xie, Huawei; Ouyang, Changchun
> Cc: dev at dpdk.org; Stephen Hemminger
> Subject: [PATCH 3/4] virtio: use indirect ring elements
> 
> The virtio ring in QEMU/KVM is usually limited to 256 entries and the normal
> way that virtio driver was queuing mbufs required nsegs + 1 ring elements.
> By using the indirect ring element feature if available, each packet will take
> only one ring slot even for multi-segment packets.
> 
> Signed-off-by: Stephen Hemminger 
> ---
>  drivers/net/virtio/virtio_ethdev.c | 11 +---
> drivers/net/virtio/virtio_ethdev.h |  3 ++-
>  drivers/net/virtio/virtio_rxtx.c   | 51 ++-
> ---
>  drivers/net/virtio/virtqueue.h |  8 ++
>  4 files changed, 57 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/net/virtio/virtio_ethdev.c
> b/drivers/net/virtio/virtio_ethdev.c
> index 465d3cd..bcfb87b 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -359,12 +359,15 @@ int virtio_dev_queue_setup(struct rte_eth_dev
> *dev,
>   if (queue_type == VTNET_TQ) {

Do we also need implement indirect ring elements for RX path?

>   /*
>* For each xmit packet, allocate a virtio_net_hdr
> +  * and indirect ring elements
>*/
>   snprintf(vq_name, sizeof(vq_name),
> "port%d_tvq%d_hdrzone",
> - dev->data->port_id, queue_idx);
> - vq->virtio_net_hdr_mz =
> rte_memzone_reserve_aligned(vq_name,
> - vq_size * hw->vtnet_hdr_size,
> - socket_id, 0, RTE_CACHE_LINE_SIZE);
> +  dev->data->port_id, queue_idx);
> +
> + vq->virtio_net_hdr_mz =
> + rte_memzone_reserve_aligned(vq_name,
> + vq_size * sizeof(struct
> virtio_tx_region),
> + socket_id, 0,
> RTE_CACHE_LINE_SIZE);
>   if (vq->virtio_net_hdr_mz == NULL) {
>   if (rte_errno == EEXIST)
>   vq->virtio_net_hdr_mz =
> diff --git a/drivers/net/virtio/virtio_ethdev.h
> b/drivers/net/virtio/virtio_ethdev.h
> index 9026d42..07a9265 100644
> --- a/drivers/net/virtio/virtio_ethdev.h
> +++ b/drivers/net/virtio/virtio_ethdev.h
> @@ -64,7 +64,8 @@
>1u << VIRTIO_NET_F_CTRL_VQ   | \
>1u << VIRTIO_NET_F_CTRL_RX   | \
>1u << VIRTIO_NET_F_CTRL_VLAN | \
> -  1u << VIRTIO_NET_F_MRG_RXBUF)
> +  1u << VIRTIO_NET_F_MRG_RXBUF | \
> +  1u << VIRTIO_RING_F_INDIRECT_DESC)
> 
>  /*
>   * CQ function prototype
> diff --git a/drivers/net/virtio/virtio_rxtx.c 
> b/drivers/net/virtio/virtio_rxtx.c
> index dbe6665..8979695 100644
> --- a/drivers/net/virtio/virtio_rxtx.c
> +++ b/drivers/net/virtio/virtio_rxtx.c
> @@ -199,14 +199,15 @@ virtqueue_enqueue_recv_refill(struct virtqueue
> *vq, struct rte_mbuf *cookie)  }
> 
>  static int
> -virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf *cookie)
> +virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf *cookie,
> +int use_indirect)
>  {
>   struct vq_desc_extra *dxp;
>   struct vring_desc *start_dp;
>   uint16_t seg_num = cookie->nb_segs;
> - uint16_t needed = 1 + seg_num;
> + uint16_t needed = use_indirect ? 1 : 1 + seg_num;

Do we need to check if seg_num > VIRTIO_MAX_TX_INDIRECT?
That means one slot is not enough for the whole big packet, even with an
indirect ring.

>   uint16_t head_idx, idx;
> - uint16_t head_size = txvq->hw->vtnet_hdr_size;
> + unsigned long offs;
> 
>   if (unlikely(txvq->vq_free_cnt == 0))
>   return -ENOSPC;
> @@ -220,11 +221,26 @@ virtqueue_enqueue_xmit(struct virtqueue *txvq,
> struct rte_mbuf *cookie)
>   dxp = &txvq->vq_descx[idx];
>   dxp->cookie = (void *)cookie;
>   dxp->ndescs = needed;
> -
>   start_dp = txvq->vq_ring.desc;
> - start_dp[idx].addr =
> - txvq->virtio_net_hdr_mem + idx * head_size;
> - start_dp[idx].len = (uint32_t)head_size;
> +
> + if (use_indirect) {
> + offs = offsetof(struct virtio_tx_region, tx_indir)
> + + idx * sizeof(struct virtio_tx_region);
> +
> + start_dp[idx].addr = txvq->virtio_net_hdr_mem + offs;
> + start_dp[idx].len = sizeof(struct vri
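The guard suggested in the review could be folded into the slot-count calculation, roughly as below. VIRTIO_MAX_TX_INDIRECT and its value here are assumptions for illustration; the idea is to fall back to direct (chained) descriptors when a packet has more segments than one indirect table can describe.

```c
#include <stdint.h>
#include <assert.h>

/* Hypothetical fixed size of the per-packet indirect table. */
#define VIRTIO_MAX_TX_INDIRECT 8

/* Ring slots needed for one packet: a single slot if the header plus
 * all segments fit in the indirect table, otherwise one descriptor per
 * segment plus the header, as in the non-indirect path. */
static uint16_t tx_slots_needed(int use_indirect, uint16_t nb_segs)
{
    if (use_indirect && (uint16_t)(nb_segs + 1) <= VIRTIO_MAX_TX_INDIRECT)
        return 1;
    return 1 + nb_segs;
}
```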

[dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev

2015-09-06 Thread Ouyang, Changchun
Hi Tetsuya,

> -Original Message-
> From: Tetsuya Mukawa [mailto:mukawa at igel.co.jp]
> Sent: Thursday, September 3, 2015 10:27 AM
> To: dev at dpdk.org; Ouyang, Changchun
> Subject: Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in
> virtio dev
> 
> On 2015/08/12 17:02, Ouyang Changchun wrote:
> > diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.h
> > b/lib/librte_vhost/vhost_user/virtio-net-user.h
> > index df24860..2429836 100644
> > --- a/lib/librte_vhost/vhost_user/virtio-net-user.h
> > +++ b/lib/librte_vhost/vhost_user/virtio-net-user.h
> > @@ -46,4 +46,6 @@ void user_set_vring_kick(struct vhost_device_ctx,
> > struct VhostUserMsg *);
> >
> >  /*
> > @@ -206,9 +213,17 @@ cleanup_device(struct virtio_net *dev)  static
> > void  free_device(struct virtio_net_config_ll *ll_dev)  {
> > -   /* Free any malloc'd memory */
> > -   rte_free(ll_dev->dev.virtqueue[VIRTIO_RXQ]);
> > -   rte_free(ll_dev->dev.virtqueue[VIRTIO_TXQ]);
> > +   uint32_t qp_idx;
> > +
> > +   /*
> > +* Free any malloc'd memory.
> > +*/
> > +   /* Free every queue pair. */
> > +   for (qp_idx = 0; qp_idx < ll_dev->dev.virt_qp_nb; qp_idx++) {
> > +   uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM +
> VIRTIO_RXQ;
> > +   rte_free(ll_dev->dev.virtqueue[virt_rx_q_idx]);
> 
> Hi Changchun,
> 
> Should we free tx queue also here?
>

We don't need to, as we allocate once for both the rx and tx queues.
Thus: allocate once, free once.
Please see the following code snippet:

+ *  Alloc mem for vring queue pair.
+ */
+int
+alloc_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx) {
+   struct vhost_virtqueue *virtqueue = NULL;
+   uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+   uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;

-   /* Backends are set to -1 indicating an inactive device. */
-   dev->virtqueue[VIRTIO_RXQ]->backend = VIRTIO_DEV_STOPPED;
-   dev->virtqueue[VIRTIO_TXQ]->backend = VIRTIO_DEV_STOPPED;
+   virtqueue = rte_malloc(NULL, sizeof(struct vhost_virtqueue) * 
VIRTIO_QNUM, 0);
+   if (virtqueue == NULL) {
+   RTE_LOG(ERR, VHOST_CONFIG,
+   "Failed to allocate memory for virt qp:%d.\n", qp_idx);
+   return -1;
+   }
+
+   dev->virtqueue[virt_rx_q_idx] = virtqueue;
+   dev->virtqueue[virt_tx_q_idx] = virtqueue + VIRTIO_TXQ;
+
+   init_vring_queue_pair(dev, qp_idx);
+
+   return 0;
 }

Thanks
Changchun

> 
> > +   }
> > +   rte_free(ll_dev->dev.virtqueue);
> > rte_free(ll_dev);
> >  }
> >
> >



[dpdk-dev] [RFC PATCH 5/8] lib/librte_vhost:dequeue vhost TSO offload

2015-09-01 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jijiang Liu
> Sent: Monday, August 31, 2015 5:42 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [RFC PATCH 5/8] lib/librte_vhost:dequeue vhost TSO
> offload
> 
> Dequeue vhost TSO offload
> 
> Signed-off-by: Jijiang Liu 
> ---
>  lib/librte_vhost/vhost_rxtx.c |   29 -
>  1 files changed, 28 insertions(+), 1 deletions(-)
> 
> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
> index 0d07338..9adfdb1 100644
> --- a/lib/librte_vhost/vhost_rxtx.c
> +++ b/lib/librte_vhost/vhost_rxtx.c
> @@ -545,6 +545,30 @@ rte_vhost_enqueue_burst(struct virtio_net *dev,
> uint16_t queue_id,
>   return virtio_dev_rx(dev, queue_id, pkts, count);  }
> 
> +static inline void __attribute__((always_inline))
> +vhost_dequeue_offload(uint64_t addr, struct rte_mbuf *m) {
> + struct virtio_net_hdr *hdr =
> + (struct virtio_net_hdr *)((uintptr_t)addr);
> +
> + if (hdr->gso_type != VIRTIO_NET_HDR_GSO_NONE) {
> + switch (hdr->gso_type & ~VIRTIO_NET_HDR_GSO_ECN) {
> + case VIRTIO_NET_HDR_GSO_TCPV4:
> + m->ol_flags |= (PKT_TX_IPV4 | PKT_TX_TCP_SEG);
> + m->tso_segsz = hdr->gso_size;
> + break;
> + case VIRTIO_NET_HDR_GSO_TCPV6:
> + m->ol_flags |= (PKT_TX_IPV6 | PKT_TX_TCP_SEG);
> + m->tso_segsz = hdr->gso_size;
> + break;
> + default:
> + RTE_LOG(ERR, VHOST_DATA,
> + "bad gso type %u.\n", hdr->gso_type);
> + break;

Do we need special handling for the bad GSO type?

> + }
> + }
> +}
> +
>  uint16_t
>  rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
>   struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t
> count) @@ -553,6 +577,7 @@ rte_vhost_dequeue_burst(struct virtio_net
> *dev, uint16_t queue_id,
>   struct vhost_virtqueue *vq;
>   struct vring_desc *desc;
>   uint64_t vb_addr = 0;
> + uint64_t vb_net_hdr_addr = 0;
>   uint32_t head[MAX_PKT_BURST];
>   uint32_t used_idx;
>   uint32_t i;
> @@ -604,6 +629,8 @@ rte_vhost_dequeue_burst(struct virtio_net *dev,
> uint16_t queue_id,
> 
>   desc = &vq->desc[head[entry_success]];
> 
> + vb_net_hdr_addr = gpa_to_vva(dev, desc->addr);
> +
>   /* Discard first buffer as it is the virtio header */
>   if (desc->flags & VRING_DESC_F_NEXT) {
>   desc = &vq->desc[desc->next];
> @@ -742,7 +769,7 @@ rte_vhost_dequeue_burst(struct virtio_net *dev,
> uint16_t queue_id,
>   break;
> 
>   m->nb_segs = seg_num;
> -
> + vhost_dequeue_offload(vb_net_hdr_addr, m);
>   pkts[entry_success] = m;
>   vq->last_used_idx++;
>   entry_success++;
> --
> 1.7.7.6



[dpdk-dev] [RFC PATCH 4/8] driver/virtio:enqueue TSO offload

2015-09-01 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jijiang Liu
> Sent: Monday, August 31, 2015 5:42 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [RFC PATCH 4/8] driver/virtio:enqueue TSO offload
> 
> Enqueue TSO4/6 offload.
> 
> Signed-off-by: Jijiang Liu 
> ---
>  drivers/net/virtio/virtio_rxtx.c |   23 +++
>  1 files changed, 23 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/virtio/virtio_rxtx.c 
> b/drivers/net/virtio/virtio_rxtx.c
> index c5b53bb..4c2d838 100644
> --- a/drivers/net/virtio/virtio_rxtx.c
> +++ b/drivers/net/virtio/virtio_rxtx.c
> @@ -198,6 +198,28 @@ virtqueue_enqueue_recv_refill(struct virtqueue *vq,
> struct rte_mbuf *cookie)
>   return 0;
>  }
> 
> +static void
> +virtqueue_enqueue_offload(struct virtqueue *txvq, struct rte_mbuf *m,
> + uint16_t idx, uint16_t hdr_sz)
> +{
> + struct virtio_net_hdr *hdr = (struct virtio_net_hdr *)(uintptr_t)
> + (txvq->virtio_net_hdr_addr + idx * hdr_sz);
> +
> + if (m->tso_segsz != 0 && m->ol_flags & PKT_TX_TCP_SEG) {
> + if (m->ol_flags & PKT_TX_IPV4) {
> + if (!vtpci_with_feature(txvq->hw,
> VIRTIO_NET_F_HOST_TSO4))
> + return;

Do we need to return an error if the host can't handle TSO for the packet?

> + hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
> + } else if (m->ol_flags & PKT_TX_IPV6) {
> + if (!vtpci_with_feature(txvq->hw,
> VIRTIO_NET_F_HOST_TSO6))
> + return;

Same as above

> + hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
> + }

Do we need an else branch for the case where the packet is neither TCPv4 nor TCPv6?

> + hdr->gso_size = m->tso_segsz;
> + hdr->hdr_len = m->l2_len + m->l3_len + m->l4_len;
> + }
> +}
> +
>  static int
>  virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf *cookie)
> { @@ -221,6 +243,7 @@ virtqueue_enqueue_xmit(struct virtqueue *txvq,
> struct rte_mbuf *cookie)
>   dxp->cookie = (void *)cookie;
>   dxp->ndescs = needed;
> 
> + virtqueue_enqueue_offload(txvq, cookie, idx, head_size);

If TSO is not enabled in the feature bits, how is that case handled here?

>   start_dp = txvq->vq_ring.desc;
>   start_dp[idx].addr =
>   txvq->virtio_net_hdr_mem + idx * head_size;
> --
> 1.7.7.6



[dpdk-dev] [RFC PATCH] vhost: Add VHOST PMD

2015-08-31 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Tetsuya Mukawa
> Sent: Friday, August 28, 2015 11:22 AM
> To: dev at dpdk.org
> Cc: ann.zhuangyanying at huawei.com
> Subject: [dpdk-dev] [RFC PATCH] vhost: Add VHOST PMD
> 
> The patch introduces a new PMD. This PMD is implemented as thin wrapper
> of librte_vhost. It means librte_vhost is also needed to compile the PMD.
> The PMD can have 'iface' parameter like below to specify a path to connect
> to a virtio-net device.
> 
> $ ./testpmd -c f -n 4 --vdev 'eth_vhost0,iface=/tmp/sock0' -- -i
> 
> To connect above testpmd, here is qemu command example.
> 
> $ qemu-system-x86_64 \
> 
> -chardev socket,id=chr0,path=/tmp/sock0 \
> -netdev vhost-user,id=net0,chardev=chr0,vhostforce \
> -device virtio-net-pci,netdev=net0
> 
> Signed-off-by: Tetsuya Mukawa 
> ---
>  config/common_linuxapp  |   6 +
>  drivers/net/Makefile|   4 +
>  drivers/net/vhost/Makefile  |  61 +++
>  drivers/net/vhost/rte_eth_vhost.c   | 639
> 
>  drivers/net/vhost/rte_pmd_vhost_version.map |   4 +
>  mk/rte.app.mk   |   8 +-
>  6 files changed, 721 insertions(+), 1 deletion(-)  create mode 100644
> drivers/net/vhost/Makefile  create mode 100644
> drivers/net/vhost/rte_eth_vhost.c  create mode 100644
> drivers/net/vhost/rte_pmd_vhost_version.map
> 
> diff --git a/config/common_linuxapp b/config/common_linuxapp index
> 0de43d5..7310240 100644
> --- a/config/common_linuxapp
> +++ b/config/common_linuxapp
> @@ -446,6 +446,12 @@ CONFIG_RTE_LIBRTE_VHOST_NUMA=n
> CONFIG_RTE_LIBRTE_VHOST_DEBUG=n
> 
>  #
> +# Compile vhost PMD
> +# To compile, CONFIG_RTE_LIBRTE_VHOST should be enabled.
> +#
> +CONFIG_RTE_LIBRTE_PMD_VHOST=y
> +
> +#
>  #Compile Xen domain0 support
>  #
>  CONFIG_RTE_LIBRTE_XEN_DOM0=n
> diff --git a/drivers/net/Makefile b/drivers/net/Makefile index
> 5ebf963..e46a38e 100644
> --- a/drivers/net/Makefile
> +++ b/drivers/net/Makefile
> @@ -49,5 +49,9 @@ DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio
>  DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += vmxnet3
>  DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += xenvirt
> 
> +ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y)
> +DIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += vhost
> +endif # $(CONFIG_RTE_LIBRTE_VHOST)
> +
>  include $(RTE_SDK)/mk/rte.sharelib.mk
>  include $(RTE_SDK)/mk/rte.subdir.mk
> diff --git a/drivers/net/vhost/Makefile b/drivers/net/vhost/Makefile new
> file mode 100644 index 000..018edde
> --- /dev/null
> +++ b/drivers/net/vhost/Makefile
> @@ -0,0 +1,61 @@
> +#   BSD LICENSE
> +#
> +#   Copyright (c) 2010-2015 Intel Corporation.
> +#   All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +# * Redistributions of source code must retain the above copyright
> +#   notice, this list of conditions and the following disclaimer.
> +# * Redistributions in binary form must reproduce the above copyright
> +#   notice, this list of conditions and the following disclaimer in
> +#   the documentation and/or other materials provided with the
> +#   distribution.
> +# * Neither the name of Intel corporation nor the names of its
> +#   contributors may be used to endorse or promote products derived
> +#   from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
> NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
> NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
> OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
> AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
> TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> DAMAGE.
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +#
> +# library name
> +#
> +LIB = librte_pmd_vhost.a
> +
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +
> +EXPORT_MAP := rte_pmd_vhost_version.map
> +
> +LIBABIVER := 1
> +
> +#
> +# all source are stored in SRCS-y
> +#
> +SRCS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += rte_eth_vhost.c
> +
> +#
> +# Export include files
> +#
> +SYMLINK-y-include +=
> +
> +# this lib depends upon:
> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += lib/librte_mbuf
> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += lib/librte_ether
> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += lib/librte_

[dpdk-dev] [PATCH 1/3] virtio: don't report link state feature unless available

2015-08-31 Thread Ouyang, Changchun


> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Saturday, August 29, 2015 12:24 AM
> To: Xie, Huawei; Ouyang, Changchun
> Cc: dev at dpdk.org; Stephen Hemminger; Stephen Hemminger
> Subject: [PATCH 1/3] virtio: don't report link state feature unless available
> 
> From: Stephen Hemminger 
> 
> If host does not support virtio link state (like current DPDK vhost) then 
> don't
> set the flag. This keeps applications from incorrectly assuming that link 
> state
> is available when it is not. It also avoids useless "guess what works in the
> config".
> 
> Signed-off-by: Stephen Hemminger 

Acked-by: Changchun Ouyang 

> ---
>  drivers/net/virtio/virtio_ethdev.c | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/virtio/virtio_ethdev.c
> b/drivers/net/virtio/virtio_ethdev.c
> index 465d3cd..8c3e924 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -1201,6 +1201,10 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
>   vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_DRIVER);
>   virtio_negotiate_features(hw);
> 
> + /* If host does not support status then disable LSC */
> + if (!vtpci_with_feature(hw, VIRTIO_NET_F_STATUS))
> + pci_dev->driver->drv_flags &= ~RTE_PCI_DRV_INTR_LSC;
> +
>   rx_func_get(eth_dev);
> 
>   /* Setting up rx_header size for the device */ @@ -1394,9 +1398,8
> @@ virtio_dev_start(struct rte_eth_dev *dev)
>   struct rte_pci_device *pci_dev = dev->pci_dev;
> 
>   /* check if lsc interrupt feature is enabled */
> - if ((dev->data->dev_conf.intr_conf.lsc) &&
> - (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)) {
> - if (!vtpci_with_feature(hw, VIRTIO_NET_F_STATUS)) {
> + if (dev->data->dev_conf.intr_conf.lsc) {
> + if (!(pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC))
> {
>   PMD_DRV_LOG(ERR, "link status not supported by
> host");
>   return -ENOTSUP;
>   }
> --
> 2.1.4



[dpdk-dev] [PATCH 2/3] virtio: fix Coverity unsigned warnings

2015-08-31 Thread Ouyang, Changchun
Hi Stephen,

> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Saturday, August 29, 2015 12:24 AM
> To: Xie, Huawei; Ouyang, Changchun
> Cc: dev at dpdk.org; Stephen Hemminger
> Subject: [PATCH 2/3] virtio: fix Coverity unsigned warnings
> 
> There are some places in virtio driver where uint16_t or int are used where it
> would be safer to use unsigned.

Why would it be safer?

> 
> Signed-off-by: Stephen Hemminger 
> ---
>  drivers/net/virtio/virtio_ethdev.c  |7 +-
>  drivers/net/virtio/virtio_ethdev.c.orig | 1577
> +++
>  drivers/net/virtio/virtio_ring.h|4 +-
>  drivers/net/virtio/virtio_rxtx.c|4 +-
>  4 files changed, 1584 insertions(+), 8 deletions(-)  create mode 100644
> drivers/net/virtio/virtio_ethdev.c.orig
> 
> diff --git a/drivers/net/virtio/virtio_ethdev.c
> b/drivers/net/virtio/virtio_ethdev.c
> index 8c3e924..338d891 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -261,21 +261,20 @@ int virtio_dev_queue_setup(struct rte_eth_dev
> *dev,  {
>   char vq_name[VIRTQUEUE_MAX_NAME_SZ];
>   const struct rte_memzone *mz;
> - uint16_t vq_size;
> - int size;
> + unsigned int vq_size, size;
>   struct virtio_hw *hw = dev->data->dev_private;
>   struct virtqueue *vq = NULL;
> 
>   /* Write the virtqueue index to the Queue Select Field */
>   VIRTIO_WRITE_REG_2(hw, VIRTIO_PCI_QUEUE_SEL,
> vtpci_queue_idx);
> - PMD_INIT_LOG(DEBUG, "selecting queue: %d", vtpci_queue_idx);
> + PMD_INIT_LOG(DEBUG, "selecting queue: %u", vtpci_queue_idx);
> 
>   /*
>* Read the virtqueue size from the Queue Size field
>* Always power of 2 and if 0 virtqueue does not exist
>*/
>   vq_size = VIRTIO_READ_REG_2(hw, VIRTIO_PCI_QUEUE_NUM);
> - PMD_INIT_LOG(DEBUG, "vq_size: %d nb_desc:%d", vq_size,
> nb_desc);
> + PMD_INIT_LOG(DEBUG, "vq_size: %u nb_desc:%u", vq_size,
> nb_desc);
>   if (vq_size == 0) {
>   PMD_INIT_LOG(ERR, "%s: virtqueue does not exist",
> __func__);
>   return -EINVAL;
> diff --git a/drivers/net/virtio/virtio_ethdev.c.orig
> b/drivers/net/virtio/virtio_ethdev.c.orig
> new file mode 100644
> index 000..465d3cd
> --- /dev/null
> +++ b/drivers/net/virtio/virtio_ethdev.c.orig

I don't think we need this .orig file.

> @@ -0,0 +1,1577 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + *   notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + *   notice, this list of conditions and the following disclaimer in
> + *   the documentation and/or other materials provided with the
> + *   distribution.
> + * * Neither the name of Intel Corporation nor the names of its
> + *   contributors may be used to endorse or promote products derived
> + *   from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
> NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
> NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
> OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
> AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
> TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> DAMAGE.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#ifdef RTE_EXEC_ENV_LINUXAPP
> +#include 
> +#include 
> +#endif
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#

[dpdk-dev] [PATCH 3/3] virtio: fix possible NULL dereference

2015-08-31 Thread Ouyang, Changchun


> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Saturday, August 29, 2015 12:24 AM
> To: Xie, Huawei; Ouyang, Changchun
> Cc: dev at dpdk.org; Stephen Hemminger
> Subject: [PATCH 3/3] virtio: fix possible NULL dereference
> 
> Found by Coverity. In virtio_dev_queue_release if the queue pointer is NULL,
> then driver is dereferencing it to get hw pointer.
> Also, don't do useless assignment
> 
> Signed-off-by: Stephen Hemminger 

Acked-by: Changchun Ouyang 

> ---
>  drivers/net/virtio/virtio_ethdev.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/virtio/virtio_ethdev.c
> b/drivers/net/virtio/virtio_ethdev.c
> index 338d891..914c73d 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -239,15 +239,15 @@ virtio_set_multiple_queues(struct rte_eth_dev
> *dev, uint16_t nb_queues)
> 
>  void
>  virtio_dev_queue_release(struct virtqueue *vq) {
> - struct virtio_hw *hw = vq->hw;
> 
>   if (vq) {
> + struct virtio_hw *hw = vq->hw;
> +
>   /* Select and deactivate the queue */
>   VIRTIO_WRITE_REG_2(hw, VIRTIO_PCI_QUEUE_SEL, vq-
> >queue_id);
>   VIRTIO_WRITE_REG_4(hw, VIRTIO_PCI_QUEUE_PFN, 0);
> 
>   rte_free(vq);
> - vq = NULL;
>   }
>  }
> 
> --
> 2.1.4



[dpdk-dev] [PATCH 1/6] ixgbe: Support VMDq RSS in non-SRIOV environment

2015-08-25 Thread Ouyang, Changchun
Hi Michael,

Please review the latest version (v4).

Thanks for your effort
Changchun


> -Original Message-
> From: Qiu, Michael
> Sent: Monday, August 24, 2015 6:42 PM
> To: Ouyang, Changchun; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 1/6] ixgbe: Support VMDq RSS in non-SRIOV
> environment
> 
> On 5/21/2015 3:50 PM, Ouyang Changchun wrote:
> > In non-SRIOV environment, VMDq RSS could be enabled by MRQC register.
> > In theory, the queue number per pool could be 2 or 4, but only 2
> > queues are available due to HW limitation, the same limit also exist in 
> > Linux
> ixgbe driver.
> >
> > Signed-off-by: Changchun Ouyang 
> > ---
> >  lib/librte_ether/rte_ethdev.c | 40 +++
> >  lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 82
> > +--
> >  2 files changed, 111 insertions(+), 11 deletions(-)
> >
> > diff --git a/lib/librte_ether/rte_ethdev.c
> > b/lib/librte_ether/rte_ethdev.c index 024fe8b..6535715 100644
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -933,6 +933,16 @@ rte_eth_dev_check_vf_rss_rxq_num(uint8_t
> port_id, uint16_t nb_rx_q)
> > return 0;
> >  }
> >
> > +#define VMDQ_RSS_RX_QUEUE_NUM_MAX 4
> > +
> > +static int
> > +rte_eth_dev_check_vmdq_rss_rxq_num(__rte_unused uint8_t port_id,
> > +uint16_t nb_rx_q) {
> > +   if (nb_rx_q > VMDQ_RSS_RX_QUEUE_NUM_MAX)
> > +   return -EINVAL;
> > +   return 0;
> > +}
> > +
> >  static int
> >  rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q,
> uint16_t nb_tx_q,
> >   const struct rte_eth_conf *dev_conf) @@ -1093,6
> +1103,36 @@
> > rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q,
> uint16_t nb_tx_q,
> > return -EINVAL;
> > }
> > }
> > +
> > +   if (dev_conf->rxmode.mq_mode ==
> ETH_MQ_RX_VMDQ_RSS) {
> > +   uint32_t nb_queue_pools =
> > +   dev_conf-
> >rx_adv_conf.vmdq_rx_conf.nb_queue_pools;
> > +   struct rte_eth_dev_info dev_info;
> > +
> > +   rte_eth_dev_info_get(port_id, &dev_info);
> > +   dev->data->dev_conf.rxmode.mq_mode =
> ETH_MQ_RX_VMDQ_RSS;
> > +   if (nb_queue_pools == ETH_32_POOLS ||
> nb_queue_pools == ETH_64_POOLS)
> > +   RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool =
> > +
>   dev_info.max_rx_queues/nb_queue_pools;
> > +   else {
> > +   PMD_DEBUG_TRACE("ethdev port_id=%d
> VMDQ "
> > +   "nb_queue_pools=%d invalid
> "
> > +   "in VMDQ RSS\n"
> 
> Is a "," missing here?

Yes, it is fixed in a later version.

> 
> Thanks,
> Michael
> 
> > +   port_id,
> > +   nb_queue_pools);
> > +   return -EINVAL;
> > +   }
> > +
> > +   if (rte_eth_dev_check_vmdq_rss_rxq_num(port_id,
> > +
>   RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool) != 0) {
> > +   PMD_DEBUG_TRACE("ethdev port_id=%d"
> > +   " SRIOV active, invalid queue"
> > +   " number for VMDQ RSS, allowed"
> > +   " value are 1, 2 or 4\n",
> > +   port_id);
> > +   return -EINVAL;
> > +   }
> > +   }
> > }
> > return 0;
> >  }
> >



[dpdk-dev] [PATCH] vhost: fix qemu shutdown issue

2015-08-20 Thread Ouyang Changchun
This patch originates from the patch:
[dpdk-dev] [PATCH 1/2] Patch for Qemu wrapper for US-VHost to ensure Qemu 
process ends when VM is shutdown
http://dpdk.org/ml/archives/dev/2014-June/003606.html

Also update the vhost sample guide doc.

Signed-off-by: Claire Murphy 
Signed-off-by: Changchun Ouyang 
---
 doc/guides/sample_app_ug/vhost.rst|  9 -
 lib/librte_vhost/libvirt/qemu-wrap.py | 29 +
 2 files changed, 25 insertions(+), 13 deletions(-)

diff --git a/doc/guides/sample_app_ug/vhost.rst 
b/doc/guides/sample_app_ug/vhost.rst
index 730b9da..743908d 100644
--- a/doc/guides/sample_app_ug/vhost.rst
+++ b/doc/guides/sample_app_ug/vhost.rst
@@ -717,15 +717,6 @@ Common Issues
 needs access to the shared memory from the guest to receive and transmit 
packets. It is important to make sure
 the QEMU version supports shared memory mapping.

-*   Issues with ``virsh destroy`` not destroying the VM:
-
-Using libvirt ``virsh create`` the ``qemu-wrap.py`` spawns a new process 
to run ``qemu-kvm``. This impacts the behavior
-of ``virsh destroy`` which kills the process running ``qemu-wrap.py`` 
without actually destroying the VM (it leaves
-the ``qemu-kvm`` process running):
-
-This following patch should fix this issue:
-http://dpdk.org/ml/archives/dev/2014-June/003607.html
-
 *   In an Ubuntu environment, QEMU fails to start a new guest normally with 
user space VHOST due to not being able
 to allocate huge pages for the new guest:

diff --git a/lib/librte_vhost/libvirt/qemu-wrap.py 
b/lib/librte_vhost/libvirt/qemu-wrap.py
index 5096011..30a0d50 100755
--- a/lib/librte_vhost/libvirt/qemu-wrap.py
+++ b/lib/librte_vhost/libvirt/qemu-wrap.py
@@ -76,6 +76,7 @@
 #"/dev/ptmx", "/dev/kvm", "/dev/kqemu",
 #"/dev/rtc", "/dev/hpet", "/dev/net/tun",
 #"/dev/-",
+#"/dev/hugepages",
 #]
 #
 #   4.b) Disable SELinux or set to permissive mode
@@ -161,6 +162,8 @@ hugetlbfs_dir = ""
 #

 import sys, os, subprocess
+import time
+import signal


 #List of open userspace vhost file descriptors
@@ -174,6 +177,18 @@ vhost_flags = [ "csum=off",
 "guest_ecn=off"
   ]

+#String of the path to the Qemu process pid
+qemu_pid = "/tmp/%d-qemu.pid" % os.getpid()
+
+#
+# Signal handler to kill Qemu subprocess
+#
+def kill_qemu_process(signum, stack):
+pidfile = open(qemu_pid, 'r')
+pid = int(pidfile.read())
+os.killpg(pid, signal.SIGTERM)
+pidfile.close()
+

 #
 # Find the system hugefile mount point.
@@ -280,7 +295,7 @@ def main():
 while (num < num_cmd_args):
 arg = sys.argv[num]

-   #Check netdev +1 parameter for vhostfd
+   #Check netdev +1 parameter for vhostfd
 if arg == '-netdev':
 num_vhost_devs = len(fd_list)
 new_args.append(arg)
@@ -333,7 +348,6 @@ def main():
 emul_call += mp
 emul_call += " "

-
 #add user options
 for opt in emul_opts_user:
 emul_call += opt
@@ -353,14 +367,21 @@ def main():
 emul_call+=str(arg)
 emul_call+= " "

+emul_call += "-pidfile %s " % qemu_pid
 #Call QEMU
-subprocess.call(emul_call, shell=True)
+process = subprocess.Popen(emul_call, shell=True, preexec_fn=os.setsid)
+
+for sig in [signal.SIGTERM, signal.SIGINT, signal.SIGHUP, signal.SIGQUIT]:
+signal.signal(sig, kill_qemu_process)

+process.wait()

 #Close usvhost files
 for fd in fd_list:
 os.close(fd)
-
+#Cleanup temporary files
+if os.access(qemu_pid, os.F_OK):
+os.remove(qemu_pid)

 if __name__ == "__main__":
 main()
-- 
1.8.4.2



[dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev

2015-08-19 Thread Ouyang, Changchun
Hi Yuanhan,

> -Original Message-
> From: Yuanhan Liu [mailto:yuanhan.liu at linux.intel.com]
> Sent: Wednesday, August 19, 2015 11:53 AM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org; Xie, Huawei
> Subject: Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in
> virtio dev
> 
> Hi Changchun,
> 
> On Wed, Aug 12, 2015 at 04:02:37PM +0800, Ouyang Changchun wrote:
> > Each virtio device could have multiple queues, say 2 or 4, at most 8.
> > Enabling this feature allows virtio device/port on guest has the
> > ability to use different vCPU to receive/transmit packets from/to each
> queue.
> >
> > In multiple queues mode, virtio device readiness means all queues of
> > this virtio device are ready, cleanup/destroy a virtio device also
> > requires clearing all queues belong to it.
> >
> > Signed-off-by: Changchun Ouyang 
> > ---
> [snip ..]
> >  /*
> > + *  Initialise all variables in vring queue pair.
> > + */
> > +static void
> > +init_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx) {
> > +   uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> > +   uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> > +   memset(dev->virtqueue[virt_rx_q_idx], 0, sizeof(struct
> vhost_virtqueue));
> > +   memset(dev->virtqueue[virt_tx_q_idx], 0, sizeof(struct
> > +vhost_virtqueue));
> > +
> > +   dev->virtqueue[virt_rx_q_idx]->kickfd = (eventfd_t)-1;
> > +   dev->virtqueue[virt_rx_q_idx]->callfd = (eventfd_t)-1;
> > +   dev->virtqueue[virt_tx_q_idx]->kickfd = (eventfd_t)-1;
> > +   dev->virtqueue[virt_tx_q_idx]->callfd = (eventfd_t)-1;
> > +
> > +   /* Backends are set to -1 indicating an inactive device. */
> > +   dev->virtqueue[virt_rx_q_idx]->backend = VIRTIO_DEV_STOPPED;
> > +   dev->virtqueue[virt_tx_q_idx]->backend = VIRTIO_DEV_STOPPED; }
> > +
> > +/*
> >   *  Initialise all variables in device structure.
> >   */
> >  static void
> > @@ -258,17 +294,34 @@ init_device(struct virtio_net *dev)
> > /* Set everything to 0. */
> 
> There is a trick here. Let me fill the context first:
> 
> 283 static void
> 284 init_device(struct virtio_net *dev)
> 285 {
> 286 uint64_t vq_offset;
> 287
> 288 /*
> 289  * Virtqueues have already been malloced so
> 290  * we don't want to set them to NULL.
> 291  */
> 292 vq_offset = offsetof(struct virtio_net, mem);
> 293
> 294 /* Set everything to 0. */
> 295 memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 
> 0,
> 296 (sizeof(struct virtio_net) - (size_t)vq_offset));
> 297
> 298 init_vring_queue_pair(dev, 0);
> 
> This piece of code's intention is to memset everything to zero, except the
> `virtqueue' field, for, as the comment stated, we have already allocated
> virtqueue.
> 
> It works only when `virtqueue' field is before `mem' field, and it was
> before:
> 
> struct virtio_net {
> struct vhost_virtqueue  *virtqueue[VIRTIO_QNUM];/**< 
> Contains
> all virtqueue information. */
> struct virtio_memory*mem;   /**< QEMU memory and 
> memory
> region information. */
> ...
> 
> After this patch, it becomes:
> 
> struct virtio_net {
> struct virtio_memory*mem;   /**< QEMU memory and 
> memory
> region information. */
> struct vhost_virtqueue  **virtqueue;/**< Contains all 
> virtqueue
> information. */
> ...
> 
> 
> Which actually wipes all stuff inside `struct virtio_net`, resulting to 
> setting
> `virtqueue' to NULL as well.
> 
> While reading the code (without your patch applied), I thought that it's
> error-prone, as it is very likely that someone else besides the author doesn't
> know such an undocumented rule. And you just gave me an example :)
> 
> Huawei, I'm proposing a fix to call rte_zmalloc() for allocating new_ll_dev to
> get rid of such issue. What do you think?
> 
>   --yliu
> 
> 

I suggest you first review the later patch:
[PATCH v4 04/12] vhost: set memory layout for multiple queues mode.
After you finish reviewing that patch, I think you will change your mind :-)

That patch resolves your concern.

> 
> > memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
> > (sizeof(struct virtio_net) - (size_t)vq_offset));
> > -   memset(dev->virtqueue[VIRTIO_RXQ], 0, sizeof(struct
> vhost_virtqueue));
> > - 

[dpdk-dev] [PATCH] doc: fix for vhost sample parameter

2015-08-18 Thread Ouyang Changchun
This commit removes the dev-index, so update the doc for this change:
17b8320a3e11e146868906d0082b6e402d5f2255
"vhost: remove index parameter"

Signed-off-by: Changchun Ouyang 
---
 doc/guides/sample_app_ug/vhost.rst| 18 --
 lib/librte_vhost/libvirt/qemu-wrap.py | 10 +-
 2 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/doc/guides/sample_app_ug/vhost.rst 
b/doc/guides/sample_app_ug/vhost.rst
index 730b9da..2ec2843 100644
--- a/doc/guides/sample_app_ug/vhost.rst
+++ b/doc/guides/sample_app_ug/vhost.rst
@@ -386,7 +386,7 @@ Running the Sample Code

 .. code-block:: console

-user at target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir / 
mnt/huge -- -p 0x1 --dev-basename usvhost --dev-index 1
+user at target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir / 
mnt/huge -- -p 0x1 --dev-basename usvhost

 vhost user: a socket file named usvhost will be created under current 
directory. Use its path as the socket path in guest's qemu commandline.

@@ -401,19 +401,17 @@ Running the Sample Code
 Parameters
 ~~

-**Basename and Index.**
+**Basename.**
 vhost cuse uses a Linux* character device to communicate with QEMU.
-The basename and the index are used to generate the character devices name.
-
-/dev/-
+The basename is used to generate the character devices name.

-The index parameter is provided for a situation where multiple instances of 
the virtual switch is required.
+/dev/

-For compatibility with the QEMU wrapper script, a base name of "usvhost" and 
an index of "1" should be used:
+For compatibility with the QEMU wrapper script, a base name of "usvhost" 
should be used:

 .. code-block:: console

-user at target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir / mnt/huge 
-- -p 0x1 --dev-basename usvhost --dev-index 1
+user at target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir / mnt/huge 
-- -p 0x1 --dev-basename usvhost

 **vm2vm.**
 The vm2vm parameter disable/set mode of packet switching between guests in the 
host.
@@ -678,11 +676,11 @@ To call the QEMU wrapper automatically from libvirt, the 
following configuration
 emul_path = "/usr/local/bin/qemu-system-x86_64"

 *   Configure the "us_vhost_path" variable to point to the DPDK vhost-net 
sample code's character devices name.
-DPDK vhost-net sample code's character device will be in the format 
"/dev/-".
+DPDK vhost-net sample code's character device will be in the format 
"/dev/".

 .. code-block:: xml

-us_vhost_path = "/dev/usvhost-1"
+us_vhost_path = "/dev/usvhost"

 Common Issues
 ~
diff --git a/lib/librte_vhost/libvirt/qemu-wrap.py 
b/lib/librte_vhost/libvirt/qemu-wrap.py
index 5096011..cd77c3a 100755
--- a/lib/librte_vhost/libvirt/qemu-wrap.py
+++ b/lib/librte_vhost/libvirt/qemu-wrap.py
@@ -75,7 +75,7 @@
 #"/dev/random", "/dev/urandom",
 #"/dev/ptmx", "/dev/kvm", "/dev/kqemu",
 #"/dev/rtc", "/dev/hpet", "/dev/net/tun",
-#"/dev/<basename>-<index>",
+#"/dev/<basename>",
 #]
 #
 #   4.b) Disable SELinux or set to permissive mode
@@ -129,13 +129,13 @@
 emul_path = "/usr/local/bin/qemu-system-x86_64"

 #Path to userspace vhost device file
-# This filename should match the --dev-basename --dev-index parameters of
+# This filename should match the --dev-basename parameter of
 # the command used to launch the userspace vhost sample application e.g.
 # if the sample app lauch command is:
-#./build/vhost-switch . --dev-basename usvhost --dev-index 1
+#./build/vhost-switch . --dev-basename usvhost
 # then this variable should be set to:
-#   us_vhost_path = "/dev/usvhost-1"
-us_vhost_path = "/dev/usvhost-1"
+#   us_vhost_path = "/dev/usvhost"
+us_vhost_path = "/dev/usvhost"

 #List of additional user defined emulation options. These options will
 #be added to all Qemu calls
-- 
1.8.4.2



[dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev

2015-08-14 Thread Ouyang, Changchun
Hi Flavio,

Thanks for your comments, see my response below.

> -Original Message-
> From: Flavio Leitner [mailto:fbl at sysclose.org]
> Sent: Thursday, August 13, 2015 8:52 PM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in
> virtio dev
> 
> On Wed, Aug 12, 2015 at 04:02:37PM +0800, Ouyang Changchun wrote:
> > Each virtio device could have multiple queues, say 2 or 4, at most 8.
> > Enabling this feature allows the virtio device/port on the guest the
> > ability to use different vCPUs to receive/transmit packets from/to each
> queue.
> >
> > In multiple queues mode, virtio device readiness means all queues of
> > this virtio device are ready; cleanup/destroy of a virtio device also
> > requires clearing all queues belonging to it.
> >
> > Signed-off-by: Changchun Ouyang 
> > ---
> > Changes in v4:
> >   - rebase and fix conflicts
> >   - resolve comments
> >   - init each virtq pair if mq is on
> >
> > Changes in v3:
> >   - fix coding style
> >   - check virtqueue idx validity
> >
> > Changes in v2:
> >   - remove the q_num_set api
> >   - add the qp_num_get api
> >   - determine the queue pair num from qemu message
> >   - rework for reset owner message handler
> >   - dynamically alloc mem for dev virtqueue
> >   - queue pair num could be 0x8000
> >   - fix checkpatch errors
> >
> >  lib/librte_vhost/rte_virtio_net.h |  10 +-
> >  lib/librte_vhost/vhost-net.h  |   1 +
> >  lib/librte_vhost/vhost_rxtx.c |  52 +---
> >  lib/librte_vhost/vhost_user/vhost-net-user.c  |   4 +-
> >  lib/librte_vhost/vhost_user/virtio-net-user.c |  76 +---
> >  lib/librte_vhost/vhost_user/virtio-net-user.h |   2 +
> >  lib/librte_vhost/virtio-net.c | 165 
> > +-
> >  7 files changed, 222 insertions(+), 88 deletions(-)
> >
> > diff --git a/lib/librte_vhost/rte_virtio_net.h
> > b/lib/librte_vhost/rte_virtio_net.h
> > index b9bf320..d9e887f 100644
> > --- a/lib/librte_vhost/rte_virtio_net.h
> > +++ b/lib/librte_vhost/rte_virtio_net.h
> > @@ -59,7 +59,6 @@ struct rte_mbuf;
> >  /* Backend value set by guest. */
> >  #define VIRTIO_DEV_STOPPED -1
> >
> > -
> >  /* Enum for virtqueue management. */
> >  enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
> >
> > @@ -96,13 +95,14 @@ struct vhost_virtqueue {
> >   * Device structure contains all configuration information relating to the
> device.
> >   */
> >  struct virtio_net {
> > -   struct vhost_virtqueue  *virtqueue[VIRTIO_QNUM];/**< Contains
> all virtqueue information. */
> > struct virtio_memory*mem;   /**< QEMU memory and
> memory region information. */
> > +   struct vhost_virtqueue  **virtqueue;/**< Contains all virtqueue
> information. */
> > uint64_tfeatures;   /**< Negotiated feature set.
> */
> > uint64_tdevice_fh;  /**< device identifier. */
> > uint32_tflags;  /**< Device flags. Only used
> to check if device is running on data core. */
> >  #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
> > charifname[IF_NAME_SZ]; /**< Name of the tap
> device or socket path. */
> > +   uint32_tvirt_qp_nb;
> > void*priv;  /**< private context */
> >  } __rte_cache_aligned;
> >
> > @@ -235,4 +235,10 @@ uint16_t rte_vhost_enqueue_burst(struct
> > virtio_net *dev, uint16_t queue_id,  uint16_t
> rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
> > struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t
> > count);
> >
> > +/**
> > + * This function gets the queue pair number of one vhost device.
> > + * @return
> > + *  num of queue pair of specified virtio device.
> > + */
> > +uint16_t rte_vhost_qp_num_get(struct virtio_net *dev);
> >  #endif /* _VIRTIO_NET_H_ */
> > diff --git a/lib/librte_vhost/vhost-net.h
> > b/lib/librte_vhost/vhost-net.h index c69b60b..7dff14d 100644
> > --- a/lib/librte_vhost/vhost-net.h
> > +++ b/lib/librte_vhost/vhost-net.h
> > @@ -115,4 +115,5 @@ struct vhost_net_device_ops {
> >
> >
> >  struct vhost_net_device_ops const *get_virtio_net_callbacks(void);
> > +int alloc_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx);
> >  #endif /* _VHOST_NET_CDE
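
For readers following the series, the virtqueue indexing convention it relies on (queue pair n owning an RX ring and a TX ring at consecutive indices) can be restated as a small self-contained C fragment. This is an illustration, not code from the patch; at run time the pair count would come from the rte_vhost_qp_num_get() API proposed above.

```c
/* Virtqueue index layout under multiple queue pairs (illustrative).
 * Queue pair n owns virtqueues n*VIRTIO_QNUM + VIRTIO_RXQ (receive)
 * and n*VIRTIO_QNUM + VIRTIO_TXQ (transmit). */
enum { VIRTIO_RXQ = 0, VIRTIO_TXQ = 1, VIRTIO_QNUM = 2 };

unsigned rxq_id(unsigned qp_idx) { return qp_idx * VIRTIO_QNUM + VIRTIO_RXQ; }
unsigned txq_id(unsigned qp_idx) { return qp_idx * VIRTIO_QNUM + VIRTIO_TXQ; }
```

With 4 queue pairs this yields RX rings 0, 2, 4, 6 and TX rings 1, 3, 5, 7, which matches the `queue_id / VIRTIO_QNUM` arithmetic used throughout the diffs.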

[dpdk-dev] [PATCH v4 12/12] doc: update doc for vhost multiple queues

2015-08-12 Thread Ouyang Changchun
Update the sample guide doc for vhost multiple queues;
Update the prog guide doc for vhost lib multiple queues feature;

Signed-off-by: Changchun Ouyang 
---
It is added since v3

 doc/guides/prog_guide/vhost_lib.rst |  38 
 doc/guides/sample_app_ug/vhost.rst  | 113 
 2 files changed, 151 insertions(+)

diff --git a/doc/guides/prog_guide/vhost_lib.rst 
b/doc/guides/prog_guide/vhost_lib.rst
index 48e1fff..6f2315d 100644
--- a/doc/guides/prog_guide/vhost_lib.rst
+++ b/doc/guides/prog_guide/vhost_lib.rst
@@ -128,6 +128,44 @@ VHOST_GET_VRING_BASE is used as the signal to remove vhost 
 device from data plane

 When the socket connection is closed, vhost will destroy the device.

+Vhost multiple queues feature
+-----------------------------
+This feature enables multiple queues for each virtio device in vhost.
+Currently the multiple queues feature is supported only for vhost-user, not for vhost-cuse.
+
+The new QEMU patch version (v6) enabling vhost-user multiple queues has already been sent to the
+QEMU community and is in its comment-collecting stage. The patch set must be applied to QEMU
+and QEMU rebuilt before running vhost multiple queues:
+http://patchwork.ozlabs.org/patch/506333/
+http://patchwork.ozlabs.org/patch/506334/
+
+Note: the QEMU patch is based on top of 2 other patches; see the patch description for more details.
+
+Vhost gets the queue pair number based on the communication messages with QEMU.
+
+It is strongly recommended to set the number of HW queues in each pool identical to the queue
+number used to start the QEMU guest and to the queue number of the virtio port on the guest.
+
+=========================================
+==================|   |==================|
+       vport0     |   |      vport1      |
+---  ---  ---  ---|   |---  ---  ---  ---|
+q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
+/\= =/\= =/\= =/\=|   |/\= =/\= =/\= =/\=|
+||   ||   ||   ||      ||   ||   ||   ||
+||   ||   ||   ||      ||   ||   ||   ||
+||= =||= =||= =||=|   =||== ||== ||== ||=|
+q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
+------------------|   |------------------|
+    VMDq pool0    |   |    VMDq pool1    |
+==================|   |==================|
+
+On the RX side, it first polls each queue of the pool, gets the packets from
+it, and enqueues them into the corresponding virtqueue in the virtio device/port.
+On the TX side, it dequeues packets from each virtqueue of the virtio device/port and sends
+them to either a physical port or another virtio device according to their destination
+MAC address.
+
 Vhost supported vSwitch reference
 ---------------------------------

diff --git a/doc/guides/sample_app_ug/vhost.rst 
b/doc/guides/sample_app_ug/vhost.rst
index 730b9da..e7dfe70 100644
--- a/doc/guides/sample_app_ug/vhost.rst
+++ b/doc/guides/sample_app_ug/vhost.rst
@@ -514,6 +514,13 @@ It is enabled by default.

 user at target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge 
-- --vlan-strip [0, 1]

+**rxq.**
+The rxq option specifies the RX queue number per VMDq pool; the default is 1.
+
+.. code-block:: console
+
+user at target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge 
-- --rxq [1, 2, 4]
+
 Running the Virtual Machine (QEMU)
 ----------------------------------

@@ -833,3 +840,109 @@ For example:
 The above message indicates that device 0 has been registered with MAC address 
cc:bb:bb:bb:bb:bb and VLAN tag 1000.
 Any packets received on the NIC with these values is placed on the devices 
receive queue.
 When a virtio-net device transmits packets, the VLAN tag is added to the 
packet by the DPDK vhost sample code.
+
+Vhost multiple queues
+---------------------
+
+This feature enables multiple queues for each virtio device in vhost.
+Currently the multiple queues feature is supported only for vhost-user, not for vhost-cuse.
+
+The new QEMU patch version (v6) enabling vhost-user multiple queues has already been sent to the
+QEMU community and is in its comment-collecting stage. The patch set must be applied to QEMU
+and QEMU rebuilt before running vhost multiple queues:
+http://patchwork.ozlabs.org/patch/506333/
+http://patchwork.ozlabs.org/patch/506334/
+
+Note: the QEMU patch is based on top of 2 other patches; see the patch description for more details.
+
+Basically the vhost sample leverages VMDq+RSS in HW to receive packets and distribute them
+into different queues in the pool according to their 5-tuples.
+
+On the other hand, vhost gets the queue pair number based on the communication messages with
+QEMU.
+
+It is strongly recommended to set the number of HW queues in each pool identical to the queue
+number used to start the QEMU guest and to the queue number of the virtio port on the guest.
+E.g. use '--rxq 4' to set the queue number as 4; it means there are 4 HW queues in each VMDq pool,
+and 4 queues in each vhost device/port, every queue in pool maps to

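To make the RX dispatch described in the doc text above concrete, here is a minimal self-contained sketch. Names and sizes are made up for illustration; in the sample application the two steps are rte_eth_rx_burst() on HW queue i of the device's VMDq pool and rte_vhost_enqueue_burst() on the matching RX virtqueue.

```c
/* Drain up to `burst` waiting packets from each of the rxq HW queues of
 * one device's VMDq pool into the matching RX virtqueue counters.
 * Returns the total number of packets moved (illustrative model, not
 * the sample's code). */
int dispatch_pool(int *hw_q, int *virtq_rx, int rxq, int burst)
{
    int total = 0;
    for (int i = 0; i < rxq; i++) {
        int rx_count = hw_q[i] < burst ? hw_q[i] : burst; /* poll HW queue i */
        virtq_rx[i] += rx_count;   /* enqueue into RX virtqueue i */
        hw_q[i] -= rx_count;
        total += rx_count;
    }
    return total;
}
```

The one-to-one mapping between pool queue i and virtqueue i is the invariant the doc text asks the user to preserve by matching `--rxq` with the guest's queue count.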
[dpdk-dev] [PATCH v4 11/12] vhost: alloc core to virtq

2015-08-12 Thread Ouyang Changchun
This patch allocates cores at the granularity of a virtq instead of a virtio device.
This allows vhost to poll different virtqs with different cores,
which shows better performance on vhost/virtio ports with more cores.

Add 2 APIs: rte_vhost_core_id_get and rte_vhost_core_id_set.

Signed-off-by: Changchun Ouyang 
---
It is added since v4.

 examples/vhost/Makefile   |   4 +-
 examples/vhost/main.c | 243 --
 examples/vhost/main.h |   3 +-
 lib/librte_vhost/rte_virtio_net.h |  25 
 lib/librte_vhost/virtio-net.c |  22 
 5 files changed, 178 insertions(+), 119 deletions(-)

diff --git a/examples/vhost/Makefile b/examples/vhost/Makefile
index c269466..32a3dec 100644
--- a/examples/vhost/Makefile
+++ b/examples/vhost/Makefile
@@ -50,8 +50,8 @@ APP = vhost-switch
 # all source are stored in SRCS-y
 SRCS-y := main.c

-CFLAGS += -O2 -D_FILE_OFFSET_BITS=64
-CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -O0 -g -D_FILE_OFFSET_BITS=64
+CFLAGS += $(WERROR_FLAGS) -Wno-maybe-uninitialized

 include $(RTE_SDK)/mk/rte.extapp.mk

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 54f9648..0a36c61 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1386,60 +1386,58 @@ switch_worker(__attribute__((unused)) void *arg)
}
if (likely(vdev->ready == DEVICE_RX)) {
/*Handle guest RX*/
-   for (i = 0; i < rxq; i++) {
-   rx_count = rte_eth_rx_burst(ports[0],
-   vdev->vmdq_rx_q + i, 
pkts_burst, MAX_PKT_BURST);
+   uint16_t q_idx = dev_ll->work_q_idx;
+   rx_count = rte_eth_rx_burst(ports[0],
+   vdev->vmdq_rx_q + q_idx, pkts_burst, 
MAX_PKT_BURST);

-   if (rx_count) {
-   /*
-   * Retry is enabled and the 
queue is full then we wait and retry to avoid packet loss
-   * Here MAX_PKT_BURST must be 
less than virtio queue size
-   */
-   if (enable_retry && 
unlikely(rx_count > rte_vring_available_entries(dev,
-   
VIRTIO_RXQ + i * VIRTIO_QNUM))) {
-   for (retry = 0; retry < 
burst_rx_retry_num; retry++) {
-   
rte_delay_us(burst_rx_delay_time);
-   if (rx_count <= 
rte_vring_available_entries(dev,
-   
VIRTIO_RXQ + i * VIRTIO_QNUM))
-   break;
-   }
-   }
-   ret_count = 
rte_vhost_enqueue_burst(dev, VIRTIO_RXQ + i * VIRTIO_QNUM,
-   
pkts_burst, rx_count);
-   if (enable_stats) {
-   rte_atomic64_add(
-   
&dev_statistics[dev_ll->vdev->dev->device_fh].qp_stats[i].rx_total_atomic,
-   rx_count);
-   rte_atomic64_add(
-   
&dev_statistics[dev_ll->vdev->dev->device_fh].qp_stats[i].rx_atomic, ret_count);
-   }
-   while (likely(rx_count)) {
-   rx_count--;
-   
rte_pktmbuf_free(pkts_burst[rx_count]);
+   if (rx_count) {
+   /*
+   * Retry is enabled and the queue is 
full then we wait and retry to avoid packet loss
+   * Here MAX_PKT_BURST must be less than 
virtio queue size
+   */
+   if (enable_retry && unlikely(rx_count > 
rte_vring_available_entries(dev,
+   
VIRTIO_RXQ + q_idx * VIRTIO_QNUM))) {
+   for (retry = 0; retry < 
burst_rx_retry_num; retry++) {
+

[dpdk-dev] [PATCH v4 10/12] vhost: add per queue stats info

2015-08-12 Thread Ouyang Changchun
Add per queue stats info

Signed-off-by: Changchun Ouyang 
---
Changes in v3
  - fix coding style and displaying format
  - check stats_enable to alloc mem for queue pair

Changes in v2
  - fix the stats issue in tx_local
  - dynamically alloc mem for queue pair stats info
  - fix checkpatch errors

 examples/vhost/main.c | 126 +++---
 1 file changed, 79 insertions(+), 47 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 683a300..54f9648 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -314,7 +314,7 @@ struct ipv4_hdr {
 #define VLAN_ETH_HLEN   18

 /* Per-device statistics struct */
-struct device_statistics {
+struct qp_statistics {
uint64_t tx_total;
rte_atomic64_t rx_total_atomic;
uint64_t rx_total;
@@ -322,6 +322,10 @@ struct device_statistics {
rte_atomic64_t rx_atomic;
uint64_t rx;
 } __rte_cache_aligned;
+
+struct device_statistics {
+   struct qp_statistics *qp_stats;
+};
 struct device_statistics dev_statistics[MAX_DEVICES];

 /*
@@ -766,6 +770,17 @@ us_vhost_parse_args(int argc, char **argv)
return -1;
} else {
enable_stats = ret;
+   if (enable_stats)
+   for (i = 0; i < MAX_DEVICES; 
i++) {
+   
dev_statistics[i].qp_stats =
+   
malloc(VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX * sizeof(struct qp_statistics));
+   if 
(dev_statistics[i].qp_stats == NULL) {
+   RTE_LOG(ERR, 
VHOST_CONFIG, "Failed to allocate memory for qp stats.\n");
+   return -1;
+   }
+   
memset(dev_statistics[i].qp_stats, 0,
+   
VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX * sizeof(struct qp_statistics));
+   }
}
}

@@ -1121,13 +1136,13 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf 
*m, uint32_t q_idx)
&m, 1);
if (enable_stats) {
rte_atomic64_add(
-   
&dev_statistics[tdev->device_fh].rx_total_atomic,
+   
&dev_statistics[tdev->device_fh].qp_stats[q_idx].rx_total_atomic,
1);
rte_atomic64_add(
-   
&dev_statistics[tdev->device_fh].rx_atomic,
+   
&dev_statistics[tdev->device_fh].qp_stats[q_idx].rx_atomic,
ret);
-   
dev_statistics[tdev->device_fh].tx_total++;
-   dev_statistics[tdev->device_fh].tx += 
ret;
+   
dev_statistics[dev->device_fh].qp_stats[q_idx].tx_total++;
+   
dev_statistics[dev->device_fh].qp_stats[q_idx].tx += ret;
}
}

@@ -1261,8 +1276,8 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf 
*m,
tx_q->m_table[len] = m;
len++;
if (enable_stats) {
-   dev_statistics[dev->device_fh].tx_total++;
-   dev_statistics[dev->device_fh].tx++;
+   dev_statistics[dev->device_fh].qp_stats[q_idx].tx_total++;
+   dev_statistics[dev->device_fh].qp_stats[q_idx].tx++;
}

if (unlikely(len == MAX_PKT_BURST)) {
@@ -1393,10 +1408,10 @@ switch_worker(__attribute__((unused)) void *arg)

pkts_burst, rx_count);
if (enable_stats) {
rte_atomic64_add(
-   
&dev_statistics[dev_ll->vdev->dev->device_fh].rx_total_atomic,
+   
&dev_statistics[dev_ll->vdev->dev->device_fh].qp_stats[i].rx_total_atomic,
rx_count);
rte_atomic64_add(
-   
&dev_statistics[dev_ll->vdev->dev->device_fh].rx_atomic, ret_count);
+   
&dev_statistics[dev_ll->vdev-

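The shape of the per-queue statistics introduced by this patch can be sketched in a self-contained way as below. Simplifications: the atomic RX counters are plain integers here, and VQ_PAIRS_MAX stands in for VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX; this is an illustration of the layout, not the sample's code.

```c
#include <stdlib.h>

/* Per-device counters become an array with one entry per queue pair. */
#define VQ_PAIRS_MAX 8

struct qp_statistics { unsigned long tx_total, tx, rx_total, rx; };
struct device_statistics { struct qp_statistics *qp_stats; };

/* Allocate zeroed per-queue-pair counters; returns 0 on success. */
int stats_init(struct device_statistics *ds)
{
    ds->qp_stats = calloc(VQ_PAIRS_MAX, sizeof(*ds->qp_stats));
    return ds->qp_stats ? 0 : -1;
}

/* Credit `sent` transmitted packets to queue pair q_idx. */
void record_tx(struct device_statistics *ds, int q_idx, unsigned long sent)
{
    ds->qp_stats[q_idx].tx_total += sent;
    ds->qp_stats[q_idx].tx += sent;
}
```

Indexing by q_idx is what replaces the old single per-device counter in hunks like virtio_tx_local() and virtio_tx_route() above.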
[dpdk-dev] [PATCH v4 09/12] virtio: resolve for control queue

2015-08-12 Thread Ouyang Changchun
Fix the max virtio queue pair read issue.

The control queue can't work in vhost-user multiple queue mode,
so introduce a counter to avoid the dead loop when polling the control
queue (removed in v4).

Signed-off-by: Changchun Ouyang 
---
Changes in v4:
  - revert the workaround
  - fix the max virtio queue pair read issue

Changes in v2:
  - fix checkpatch errors

 drivers/net/virtio/virtio_ethdev.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index 465d3cd..3ce11f8 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1162,7 +1162,6 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
struct virtio_hw *hw = eth_dev->data->dev_private;
struct virtio_net_config *config;
struct virtio_net_config local_config;
-   uint32_t offset_conf = sizeof(config->mac);
struct rte_pci_device *pci_dev;

RTE_BUILD_BUG_ON(RTE_PKTMBUF_HEADROOM < sizeof(struct virtio_net_hdr));
@@ -1222,7 +1221,8 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
config = &local_config;

if (vtpci_with_feature(hw, VIRTIO_NET_F_STATUS)) {
-   offset_conf += sizeof(config->status);
+   vtpci_read_dev_config(hw, offsetof(struct 
virtio_net_config, status),
+   (uint8_t *)&config->status, 
sizeof(config->status));
} else {
PMD_INIT_LOG(DEBUG,
 "VIRTIO_NET_F_STATUS is not supported");
@@ -1230,15 +1230,14 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
}

if (vtpci_with_feature(hw, VIRTIO_NET_F_MQ)) {
-   offset_conf += sizeof(config->max_virtqueue_pairs);
+   vtpci_read_dev_config(hw, offsetof(struct 
virtio_net_config, max_virtqueue_pairs),
+   (uint8_t *)&config->max_virtqueue_pairs, 
sizeof(config->max_virtqueue_pairs));
} else {
PMD_INIT_LOG(DEBUG,
 "VIRTIO_NET_F_MQ is not supported");
config->max_virtqueue_pairs = 1;
}

-   vtpci_read_dev_config(hw, 0, (uint8_t *)config, offset_conf);
-
hw->max_rx_queues =
(VIRTIO_MAX_RX_QUEUES < config->max_virtqueue_pairs) ?
VIRTIO_MAX_RX_QUEUES : config->max_virtqueue_pairs;
-- 
1.8.4.2
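The fix above replaces one bulk config-space read of the first offset_conf bytes with a per-field read at each field's own offset, so a field absent from the accumulated length (e.g. status when VIRTIO_NET_F_STATUS is not negotiated) can no longer shift the read of max_virtqueue_pairs. A simplified, self-contained model of that pattern, where read_cfg() stands in for vtpci_read_dev_config():

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout of the virtio-net config space fields used here. */
struct virtio_net_config {
    uint8_t  mac[6];
    uint16_t status;
    uint16_t max_virtqueue_pairs;
};

/* Stand-in for vtpci_read_dev_config(): copy len bytes at offset off. */
void read_cfg(const void *dev_cfg, size_t off, void *dst, size_t len)
{
    memcpy(dst, (const uint8_t *)dev_cfg + off, len);
}

/* Per-field read of max_virtqueue_pairs, independent of other fields. */
uint16_t read_max_vq_pairs(const struct virtio_net_config *dev_cfg)
{
    uint16_t mq;
    read_cfg(dev_cfg, offsetof(struct virtio_net_config, max_virtqueue_pairs),
             &mq, sizeof(mq));
    return mq;
}
```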



[dpdk-dev] [PATCH v4 08/12] vhost: support multiple queues

2015-08-12 Thread Ouyang Changchun
The sample vhost leverages VMDq+RSS in HW to receive packets and distribute them
into different queues in the pool according to 5-tuples.

It also enables multiple queues mode in the vhost/virtio layer.

The number of HW queues in each pool is exactly the same as the queue number in the virtio device,
e.g. rxq = 4 means there are 4 HW queues in each VMDq pool
and 4 queues in each virtio device/port, one mapping to each.

=========================================
==================|   |==================|
       vport0     |   |      vport1      |
---  ---  ---  ---|   |---  ---  ---  ---|
q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
/\= =/\= =/\= =/\=|   |/\= =/\= =/\= =/\=|
||   ||   ||   ||      ||   ||   ||   ||
||   ||   ||   ||      ||   ||   ||   ||
||= =||= =||= =||=|   =||== ||== ||== ||=|
q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
------------------|   |------------------|
    VMDq pool0    |   |    VMDq pool1    |
==================|   |==================|

On the RX side, it first polls each queue of the pool, gets the packets from
it, and enqueues them into the corresponding queue in the virtio device/port.
On the TX side, it dequeues packets from each queue of the virtio device/port and sends
them to either a physical port or another virtio device according to their destination
MAC address.

Signed-off-by: Changchun Ouyang 
---
Changes in v4:
  - address comments and refine var name
  - support FVL nic
  - fix check patch issue

Changes in v2:
  - check queue num per pool in VMDq and queue pair number per vhost device
  - remove the unnecessary calling q_num_set api
  - fix checkpatch errors

 examples/vhost/main.c | 190 --
 1 file changed, 124 insertions(+), 66 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 5b811af..683a300 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -368,6 +368,37 @@ validate_num_devices(uint32_t max_nb_devices)
return 0;
 }

+static int
+get_dev_nb_for_82599(struct rte_eth_dev_info dev_info)
+{
+   int dev_nb = -1;
+   switch (rxq) {
+   case 1:
+   case 2:
+   /*
+* for 82599, dev_info.max_vmdq_pools is always 64 despite rx mode.
+*/
+   dev_nb = (int)dev_info.max_vmdq_pools;
+   break;
+   case 4:
+   dev_nb = (int)dev_info.max_vmdq_pools / 2;
+   break;
+   default:
+   RTE_LOG(ERR, VHOST_CONFIG, "rxq invalid for VMDq.\n");
+   }
+   return dev_nb;
+}
+
+static int
+get_dev_nb_for_fvl(struct rte_eth_dev_info dev_info)
+{
+   /*
+* for FVL, dev_info.max_vmdq_pools is calculated according to
+* the configured value: CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VM.
+*/
+   return (int)dev_info.max_vmdq_pools;
+}
+
 /*
  * Initialises a given port using global settings and with the rx buffers
  * coming from the mbuf_pool passed as parameter
@@ -412,17 +443,14 @@ port_init(uint8_t port)
}

/* Configure the virtio devices num based on VMDQ limits */
-   switch (rxq) {
-   case 1:
-   case 2:
-   num_devices = dev_info.max_vmdq_pools;
-   break;
-   case 4:
-   num_devices = dev_info.max_vmdq_pools / 2;
-   break;
-   default:
-   RTE_LOG(ERR, VHOST_CONFIG, "rxq invalid for VMDq.\n");
-   return -1;
+   if (dev_info.max_vmdq_pools == ETH_64_POOLS) {
+   num_devices = (uint32_t)get_dev_nb_for_82599(dev_info);
+   if (num_devices == (uint32_t)-1)
+   return -1;
+   } else {
+   num_devices = (uint32_t)get_dev_nb_for_fvl(dev_info);
+   if (num_devices == (uint32_t)-1)
+   return -1;
}

if (zero_copy) {
@@ -1001,8 +1029,9 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m)

/* Enable stripping of the vlan tag as we handle routing. */
if (vlan_strip)
-   rte_eth_dev_set_vlan_strip_on_queue(ports[0],
-   (uint16_t)vdev->vmdq_rx_q, 1);
+   for (i = 0; i < (int)rxq; i++)
+   rte_eth_dev_set_vlan_strip_on_queue(ports[0],
+   (uint16_t)(vdev->vmdq_rx_q + i), 1);

/* Set device as ready for RX. */
vdev->ready = DEVICE_RX;
@@ -1017,7 +1046,7 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m)
 static inline void
 unlink_vmdq(struct vhost_dev *vdev)
 {
-   unsigned i = 0;
+   unsigned i = 0, j = 0;
unsigned rx_count;
struct rte_mbuf *pkts_burst[MAX_PKT_BURST];

@@ -1030,15 +1059,19 @@ unlink_vmdq(struct vhost_dev *vdev)
vdev->vlan_tag = 0;

/*Clear out the receive buffers*/
-   rx_count = rte_eth_rx_burst(ports[0],
-   (uint16_t)vdev->vmdq_rx_q, pkts_burst, 
MAX_PKT_BURST);
+   for (i = 0; i < rxq; i++) {
+ 

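The refactored pool-count logic above can be restated as a standalone function for an 82599-class NIC, where dev_info.max_vmdq_pools is always 64: rxq of 1 or 2 leaves all pools usable as devices, rxq of 4 halves them, and any other value is rejected. The function below is an illustrative restatement, not the sample's code.

```c
/* Number of virtio devices supportable on an 82599-class NIC for a
 * given per-device RX queue count; -1 means rxq is invalid for VMDq. */
int devices_for_82599(int max_vmdq_pools, int rxq)
{
    switch (rxq) {
    case 1:
    case 2:
        return max_vmdq_pools;
    case 4:
        return max_vmdq_pools / 2;
    default:
        return -1; /* rxq invalid for VMDq */
    }
}
```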
[dpdk-dev] [PATCH v4 07/12] vhost: add new command line option: rxq

2015-08-12 Thread Ouyang Changchun
The vhost sample needs to know the queue number the user wants to enable for each virtio
device,
so add the new option '--rxq' for it.

Signed-off-by: Changchun Ouyang 
---
Changes in v3
  - fix coding style

Changes in v2
  - refine help info
  - check if rxq = 0
  - fix checkpatch errors

 examples/vhost/main.c | 49 +
 1 file changed, 45 insertions(+), 4 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index d3c45dd..5b811af 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -163,6 +163,9 @@ static int mergeable;
 /* Do vlan strip on host, enabled on default */
 static uint32_t vlan_strip = 1;

+/* Rx queue number per virtio device */
+static uint32_t rxq = 1;
+
 /* number of descriptors to apply*/
 static uint32_t num_rx_descriptor = RTE_TEST_RX_DESC_DEFAULT_ZCP;
 static uint32_t num_tx_descriptor = RTE_TEST_TX_DESC_DEFAULT_ZCP;
@@ -408,8 +411,19 @@ port_init(uint8_t port)
txconf->tx_deferred_start = 1;
}

-   /*configure the number of supported virtio devices based on VMDQ limits 
*/
-   num_devices = dev_info.max_vmdq_pools;
+   /* Configure the virtio devices num based on VMDQ limits */
+   switch (rxq) {
+   case 1:
+   case 2:
+   num_devices = dev_info.max_vmdq_pools;
+   break;
+   case 4:
+   num_devices = dev_info.max_vmdq_pools / 2;
+   break;
+   default:
+   RTE_LOG(ERR, VHOST_CONFIG, "rxq invalid for VMDq.\n");
+   return -1;
+   }

if (zero_copy) {
rx_ring_size = num_rx_descriptor;
@@ -431,7 +445,7 @@ port_init(uint8_t port)
return retval;
/* NIC queues are divided into pf queues and vmdq queues.  */
num_pf_queues = dev_info.max_rx_queues - dev_info.vmdq_queue_num;
-   queues_per_pool = dev_info.vmdq_queue_num / dev_info.max_vmdq_pools;
+   queues_per_pool = dev_info.vmdq_queue_num / num_devices;
num_vmdq_queues = num_devices * queues_per_pool;
num_queues = num_pf_queues + num_vmdq_queues;
vmdq_queue_base = dev_info.vmdq_queue_base;
@@ -576,7 +590,8 @@ us_vhost_usage(const char *prgname)
"   --rx-desc-num [0-N]: the number of descriptors on rx, "
"used only when zero copy is enabled.\n"
"   --tx-desc-num [0-N]: the number of descriptors on tx, "
-   "used only when zero copy is enabled.\n",
+   "used only when zero copy is enabled.\n"
+   "   --rxq [1,2,4]: rx queue number for each vhost device\n",
   prgname);
 }

@@ -602,6 +617,7 @@ us_vhost_parse_args(int argc, char **argv)
{"zero-copy", required_argument, NULL, 0},
{"rx-desc-num", required_argument, NULL, 0},
{"tx-desc-num", required_argument, NULL, 0},
+   {"rxq", required_argument, NULL, 0},
{NULL, 0, 0, 0},
};

@@ -778,6 +794,18 @@ us_vhost_parse_args(int argc, char **argv)
}
}

+   /* Specify the Rx queue number for each vhost dev. */
+   if (!strncmp(long_option[option_index].name,
+   "rxq", MAX_LONG_OPT_SZ)) {
+   ret = parse_num_opt(optarg, 4);
+   if ((ret == -1) || (ret == 0) || 
(!POWEROF2(ret))) {
+   RTE_LOG(INFO, VHOST_CONFIG,
+   "Valid value for rxq is [1,2,4]\n");
+   us_vhost_usage(prgname);
+   return -1;
+   } else
+   rxq = ret;
+   }
break;

/* Invalid option - print options. */
@@ -813,6 +841,19 @@ us_vhost_parse_args(int argc, char **argv)
return -1;
}

+   if (rxq > 1) {
+   vmdq_conf_default.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
+   vmdq_conf_default.rx_adv_conf.rss_conf.rss_hf = ETH_RSS_IP |
+   ETH_RSS_UDP | ETH_RSS_TCP | ETH_RSS_SCTP;
+   }
+
+   if ((zero_copy == 1) && (rxq > 1)) {
+   RTE_LOG(INFO, VHOST_PORT,
+   "Vhost zero copy doesn't support mq mode,"
+   "please specify '--rxq 1' to disable it.\n");
+   return -1;
+   }
+
return 0;
 }

-- 
1.8.4.2
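The `--rxq` validation in the hunk above accepts only 1, 2 or 4: the value is parsed with a cap of 4 (parse_num_opt(optarg, 4)), must be non-zero, and must be a power of two. An illustrative restatement, with POWEROF2 mirroring the macro the sample uses:

```c
/* True for x in {1, 2, 4, 8, ...}; the parse cap limits us to <= 4. */
#define POWEROF2(x) ((((x) - 1) & (x)) == 0)

/* Valid --rxq values are exactly 1, 2 and 4 (ret == -1 means a parse
 * error in the sample; any non-positive value is rejected here). */
int rxq_option_valid(int ret)
{
    return ret > 0 && ret <= 4 && POWEROF2(ret);
}
```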



[dpdk-dev] [PATCH v4 06/12] vhost: support protocol feature

2015-08-12 Thread Ouyang Changchun
Support the new protocol features to communicate with qemu:
Add set and get protocol feature bits;
Add VRING_FLAG for the mq feature to set the vring flag, which
indicates whether the vq is enabled or disabled.

Reserve values as follows:
VHOST_USER_SEND_RARP = 17 (merged from the qemu community)
VHOST_USER_SET_VRING_FLAG = 18 (reserved for vhost mq)

These reservations need to be synced with the qemu community before finalizing.

Signed-off-by: Changchun Ouyang 
---
This is added since v4.

 lib/librte_vhost/rte_virtio_net.h |  2 +
 lib/librte_vhost/vhost-net.h  |  3 ++
 lib/librte_vhost/vhost_rxtx.c | 21 ++
 lib/librte_vhost/vhost_user/vhost-net-user.c  | 21 +-
 lib/librte_vhost/vhost_user/vhost-net-user.h  |  4 ++
 lib/librte_vhost/vhost_user/virtio-net-user.c | 29 ++
 lib/librte_vhost/vhost_user/virtio-net-user.h |  2 +
 lib/librte_vhost/virtio-net.c | 56 ++-
 lib/librte_vhost/virtio-net.h |  2 +
 9 files changed, 138 insertions(+), 2 deletions(-)

diff --git a/lib/librte_vhost/rte_virtio_net.h 
b/lib/librte_vhost/rte_virtio_net.h
index 8520d96..e16ad3a 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -88,6 +88,7 @@ struct vhost_virtqueue {
volatile uint16_t   last_used_idx_res;  /**< Used for multiple 
devices reserving buffers. */
eventfd_t   callfd; /**< Used to notify the 
guest (trigger interrupt). */
eventfd_t   kickfd; /**< Currently unused 
as polling mode is enabled. */
+   uint32_tenabled;/**< Indicate the queue 
is enabled or not. */
struct buf_vector   buf_vec[BUF_VECTOR_MAX];/**< for 
scatter RX. */
 } __rte_cache_aligned;

@@ -98,6 +99,7 @@ struct virtio_net {
struct vhost_virtqueue  **virtqueue;/**< Contains all virtqueue 
information. */
struct virtio_memory**mem_arr;  /**< Array for QEMU memory and 
memory region information. */
uint64_tfeatures;   /**< Negotiated feature set. */
+   uint64_tprotocol_features;  /**< Negotiated 
protocol feature set. */
uint64_tdevice_fh;  /**< device identifier. */
uint32_tflags;  /**< Device flags. Only used to 
check if device is running on data core. */
 #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
diff --git a/lib/librte_vhost/vhost-net.h b/lib/librte_vhost/vhost-net.h
index 7dff14d..bc88bad 100644
--- a/lib/librte_vhost/vhost-net.h
+++ b/lib/librte_vhost/vhost-net.h
@@ -99,6 +99,9 @@ struct vhost_net_device_ops {
int (*get_features)(struct vhost_device_ctx, uint64_t *);
int (*set_features)(struct vhost_device_ctx, uint64_t *);

+   int (*get_protocol_features)(struct vhost_device_ctx, uint64_t *);
+   int (*set_protocol_features)(struct vhost_device_ctx, uint64_t *);
+
int (*set_vring_num)(struct vhost_device_ctx, struct vhost_vring_state 
*);
int (*set_vring_addr)(struct vhost_device_ctx, struct vhost_vring_addr 
*);
int (*set_vring_base)(struct vhost_device_ctx, struct vhost_vring_state 
*);
diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index a60b542..3af0326 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -89,6 +89,14 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
}

vq = dev->virtqueue[queue_id];
+
+   if (unlikely(vq->enabled == 0)) {
+   RTE_LOG(ERR, VHOST_DATA,
+   "%s (%"PRIu64"): virtqueue idx:%d not enabled.\n",
+__func__, dev->device_fh, queue_id);
+   return 0;
+   }
+
count = (count > MAX_PKT_BURST) ? MAX_PKT_BURST : count;

/*
@@ -281,6 +289,7 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t 
queue_id,
 * (guest physical addr -> vhost virtual addr)
 */
vq = dev->virtqueue[queue_id];
+
vb_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM,
vq->buf_vec[vec_idx].buf_addr);
vb_hdr_addr = vb_addr;
@@ -491,6 +500,14 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t 
queue_id,
}

vq = dev->virtqueue[queue_id];
+
+   if (unlikely(vq->enabled == 0)) {
+   RTE_LOG(ERR, VHOST_DATA,
+   "%s (%"PRIu64"): virtqueue idx:%d not enabled.\n",
+__func__, dev->device_fh, queue_id);
+   return 0;
+   }
+
count = RTE_MIN((uint32_t)MAX_PKT_BURST, count);

if (count == 0)
@@ -590,6 +607,10 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t 
queue_id,
}

vq = dev->virtqueue[queue_id];
+
+   if (unlikely(vq->enabled == 0))
+   return 0;
+
avail_idx =  *((volatile uint16_t *)&vq

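Feature negotiation, as used for the new protocol feature bits this patch adds get/set handlers for, conceptually reduces to an intersection of bit masks: each side advertises what it supports and only the bits both sides set survive. This is a generic illustration of that idea, not the patch's handshake code.

```c
#include <stdint.h>

/* Only feature bits supported by both sides remain negotiated. */
uint64_t negotiate_features(uint64_t ours, uint64_t theirs)
{
    return ours & theirs;
}
```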
[dpdk-dev] [PATCH v4 05/12] vhost: check the virtqueue address's validity

2015-08-12 Thread Ouyang Changchun
This is added since v3.
Check the virtqueue address's validity.

Signed-off-by: Changchun Ouyang 
---
Changes in v4:
  - remove unnecessary code

 lib/librte_vhost/vhost_user/vhost-net-user.c |  4 +++-
 lib/librte_vhost/virtio-net.c| 10 ++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c 
b/lib/librte_vhost/vhost_user/vhost-net-user.c
index 3d7c373..e926ed7 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.c
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
@@ -403,7 +403,9 @@ vserver_message_handler(int connfd, void *dat, int *remove)
ops->set_vring_num(ctx, &msg.payload.state);
break;
case VHOST_USER_SET_VRING_ADDR:
-   ops->set_vring_addr(ctx, &msg.payload.addr);
+   if (ops->set_vring_addr(ctx, &msg.payload.addr) != 0)
+   RTE_LOG(INFO, VHOST_CONFIG,
+   "vring address incorrect.\n");
break;
case VHOST_USER_SET_VRING_BASE:
ops->set_vring_base(ctx, &msg.payload.state);
diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index fd66a06..8901aa5 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -643,6 +643,7 @@ set_vring_addr(struct vhost_device_ctx ctx, struct 
vhost_vring_addr *addr)
 {
struct virtio_net *dev;
struct vhost_virtqueue *vq;
+   uint32_t i;

dev = get_device(ctx);
if (dev == NULL)
@@ -673,6 +674,15 @@ set_vring_addr(struct vhost_device_ctx ctx, struct 
vhost_vring_addr *addr)
return -1;
}

+   for (i = vq->last_used_idx; i < vq->avail->idx; i++)
+   if (vq->avail->ring[i] >= vq->size) {
+   RTE_LOG(ERR, VHOST_CONFIG, "%s (%"PRIu64"):"
+   "Please check virt queue pair idx:%d is "
+   "enabled correctly on guest.\n", __func__,
+   dev->device_fh, addr->index / VIRTIO_QNUM);
+   return -1;
+   }
+
vq->used = (struct vring_used *)(uintptr_t)qva_to_vva(dev,
addr->index / VIRTIO_QNUM, addr->used_user_addr);
if (vq->used == 0) {
-- 
1.8.4.2



[dpdk-dev] [PATCH v4 04/12] vhost: set memory layout for multiple queues mode

2015-08-12 Thread Ouyang Changchun
QEMU sends separate commands, in order, to set the memory layout for each
queue in a virtio device; accordingly, vhost needs to keep memory layout
information for each queue of the virtio device.

This also requires a small adjustment to the gpa_to_vva interface: a queue
index is introduced to specify which queue of the device to use when looking
up the vhost virtual address for an incoming guest physical address.
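A minimal standalone sketch of the adjusted lookup. The struct fields below are a trimmed-down, hypothetical subset of the real vhost definitions (which live in the vhost headers and carry more fields); only the per-queue indexing idea is taken from the patch itself.

```c
#include <stdint.h>
#include <stddef.h>

/* Simplified stand-ins for vhost's memory-layout structs (hypothetical
 * field subset; the real definitions live in rte_virtio_net.h). */
struct virtio_memory_regions {
	uint64_t guest_phys_address;
	uint64_t guest_phys_address_end;
	uint64_t address_offset;	/* host virtual = guest physical + offset */
};

struct virtio_memory {
	uint32_t nregions;
	struct virtio_memory_regions regions[8];
};

struct virtio_net {
	struct virtio_memory *mem_arr[8];	/* one memory layout per queue pair */
};

/* Translate a guest physical address into a vhost virtual address,
 * now scoped to the memory layout of one queue pair (q_idx). */
static uint64_t
gpa_to_vva(struct virtio_net *dev, uint32_t q_idx, uint64_t guest_pa)
{
	struct virtio_memory *mem = dev->mem_arr[q_idx];
	uint32_t i;

	for (i = 0; i < mem->nregions; i++) {
		struct virtio_memory_regions *r = &mem->regions[i];

		if (guest_pa >= r->guest_phys_address &&
		    guest_pa < r->guest_phys_address_end)
			return guest_pa + r->address_offset;
	}
	return 0;	/* address not covered by any region */
}
```

The only behavioral change versus the single-layout version is the extra q_idx parameter selecting which layout to search.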

Signed-off-by: Changchun Ouyang 
---
Changes in v4
  - rebase and fix conflicts
  - call calloc for dev.mem_arr

Changes in v3
  - fix coding style

Changes in v2
  - q_idx is changed into qp_idx
  - dynamically alloc mem for dev mem_arr
  - fix checkpatch errors

 examples/vhost/main.c | 21 +-
 lib/librte_vhost/rte_virtio_net.h | 10 +++--
 lib/librte_vhost/vhost_cuse/virtio-net-cdev.c | 57 ++
 lib/librte_vhost/vhost_rxtx.c | 22 +-
 lib/librte_vhost/vhost_user/virtio-net-user.c | 59 ++-
 lib/librte_vhost/virtio-net.c | 38 -
 6 files changed, 119 insertions(+), 88 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 1b137b9..d3c45dd 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1466,11 +1466,11 @@ attach_rxmbuf_zcp(struct virtio_net *dev)
desc = &vq->desc[desc_idx];
if (desc->flags & VRING_DESC_F_NEXT) {
desc = &vq->desc[desc->next];
-   buff_addr = gpa_to_vva(dev, desc->addr);
+   buff_addr = gpa_to_vva(dev, 0, desc->addr);
phys_addr = gpa_to_hpa(vdev, desc->addr, desc->len,
&addr_type);
} else {
-   buff_addr = gpa_to_vva(dev,
+   buff_addr = gpa_to_vva(dev, 0,
desc->addr + vq->vhost_hlen);
phys_addr = gpa_to_hpa(vdev,
desc->addr + vq->vhost_hlen,
@@ -1722,7 +1722,7 @@ virtio_dev_rx_zcp(struct virtio_net *dev, struct rte_mbuf 
**pkts,
rte_pktmbuf_data_len(buff), 0);

/* Buffer address translation for virtio header. */
-   buff_hdr_addr = gpa_to_vva(dev, desc->addr);
+   buff_hdr_addr = gpa_to_vva(dev, 0, desc->addr);
packet_len = rte_pktmbuf_data_len(buff) + vq->vhost_hlen;

/*
@@ -1946,7 +1946,7 @@ virtio_dev_tx_zcp(struct virtio_net *dev)
desc = &vq->desc[desc->next];

/* Buffer address translation. */
-   buff_addr = gpa_to_vva(dev, desc->addr);
+   buff_addr = gpa_to_vva(dev, 0, desc->addr);
/* Need check extra VLAN_HLEN size for inserting VLAN tag */
phys_addr = gpa_to_hpa(vdev, desc->addr, desc->len + VLAN_HLEN,
&addr_type);
@@ -2604,13 +2604,14 @@ new_device (struct virtio_net *dev)
dev->priv = vdev;

if (zero_copy) {
-   vdev->nregions_hpa = dev->mem->nregions;
-   for (regionidx = 0; regionidx < dev->mem->nregions; 
regionidx++) {
+   struct virtio_memory *dev_mem = dev->mem_arr[0];
+   vdev->nregions_hpa = dev_mem->nregions;
+   for (regionidx = 0; regionidx < dev_mem->nregions; regionidx++) 
{
vdev->nregions_hpa
+= check_hpa_regions(
-   
dev->mem->regions[regionidx].guest_phys_address
-   + 
dev->mem->regions[regionidx].address_offset,
-   
dev->mem->regions[regionidx].memory_size);
+   
dev_mem->regions[regionidx].guest_phys_address
+   + 
dev_mem->regions[regionidx].address_offset,
+   
dev_mem->regions[regionidx].memory_size);

}

@@ -2626,7 +2627,7 @@ new_device (struct virtio_net *dev)


if (fill_hpa_memory_regions(
-   vdev->regions_hpa, dev->mem
+   vdev->regions_hpa, dev_mem
) != vdev->nregions_hpa) {

RTE_LOG(ERR, VHOST_CONFIG,
diff --git a/lib/librte_vhost/rte_virtio_net.h 
b/lib/librte_vhost/rte_virtio_net.h
index d9e887f..8520d96 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -95,14 +95,15 @@ struct vhost_virtqueue {
  * Device structure contains all configuration information relating to the 
device.
  */
 struct virtio_net {
-   struct virtio_memory*mem;   /**< QEMU memory and memory 
region information. */
struct vhost_virtqueue  **virtqueue;/**< Contains all virtqueue 
information. */
+   struct virtio_memory**mem_arr;  /**< Array for QEMU mem

[dpdk-dev] [PATCH v4 03/12] vhost: update version map file

2015-08-12 Thread Ouyang Changchun
From: Changchun Ouyang 

It is added in v4.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_vhost/rte_vhost_version.map | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_vhost/rte_vhost_version.map 
b/lib/librte_vhost/rte_vhost_version.map
index 3d8709e..0bb1c0f 100644
--- a/lib/librte_vhost/rte_vhost_version.map
+++ b/lib/librte_vhost/rte_vhost_version.map
@@ -18,5 +18,5 @@ DPDK_2.1 {
global:

rte_vhost_driver_unregister;
-
+   rte_vhost_qp_num_get;
 } DPDK_2.0;
-- 
1.8.4.2



[dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev

2015-08-12 Thread Ouyang Changchun
Each virtio device can have multiple queues, say 2 or 4, at most 8.
Enabling this feature allows a virtio device/port on the guest to use a
different vCPU to receive/transmit packets from/to each queue.

In multiple-queue mode, virtio device readiness means all queues of the
virtio device are ready; cleaning up/destroying a virtio device likewise
requires clearing all queues belonging to it.
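The readiness rule above can be sketched as follows, using hypothetical, trimmed-down structs (the real check in virtio-net.c inspects more state, e.g. call fds); this is an illustration of the "all queues ready" idea, not the actual implementation.

```c
#include <stdint.h>
#include <stddef.h>

#define VIRTIO_QNUM 2	/* one RX + one TX virtqueue per queue pair */

/* Hypothetical, trimmed-down versions of the vhost structs. */
struct vhost_virtqueue {
	void *desc;
	void *avail;
	void *used;
	int kickfd;
};

struct virtio_net {
	uint32_t virt_qp_nb;		/* number of queue pairs */
	struct vhost_virtqueue **virtqueue;
};

/* In multiple-queue mode a device is ready only when *every* virtqueue
 * of every queue pair has its rings set up and a valid kick fd. */
static int
virtio_is_ready(struct virtio_net *dev)
{
	uint32_t i;

	for (i = 0; i < dev->virt_qp_nb * VIRTIO_QNUM; i++) {
		struct vhost_virtqueue *vq = dev->virtqueue[i];

		if (vq == NULL || vq->desc == NULL || vq->avail == NULL ||
		    vq->used == NULL || vq->kickfd < 0)
			return 0;
	}
	return 1;
}
```

Cleanup follows the same shape: iterate over virt_qp_nb * VIRTIO_QNUM virtqueues instead of a fixed pair.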

Signed-off-by: Changchun Ouyang 
---
Changes in v4:
  - rebase and fix conflicts
  - resolve comments
  - init each virtq pair if mq is on

Changes in v3:
  - fix coding style
  - check virtqueue idx validity

Changes in v2:
  - remove the q_num_set api
  - add the qp_num_get api
  - determine the queue pair num from qemu message
  - rework for reset owner message handler
  - dynamically alloc mem for dev virtqueue
  - queue pair num could be 0x8000
  - fix checkpatch errors

 lib/librte_vhost/rte_virtio_net.h |  10 +-
 lib/librte_vhost/vhost-net.h  |   1 +
 lib/librte_vhost/vhost_rxtx.c |  52 +---
 lib/librte_vhost/vhost_user/vhost-net-user.c  |   4 +-
 lib/librte_vhost/vhost_user/virtio-net-user.c |  76 +---
 lib/librte_vhost/vhost_user/virtio-net-user.h |   2 +
 lib/librte_vhost/virtio-net.c | 165 +-
 7 files changed, 222 insertions(+), 88 deletions(-)

diff --git a/lib/librte_vhost/rte_virtio_net.h 
b/lib/librte_vhost/rte_virtio_net.h
index b9bf320..d9e887f 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -59,7 +59,6 @@ struct rte_mbuf;
 /* Backend value set by guest. */
 #define VIRTIO_DEV_STOPPED -1

-
 /* Enum for virtqueue management. */
 enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};

@@ -96,13 +95,14 @@ struct vhost_virtqueue {
  * Device structure contains all configuration information relating to the 
device.
  */
 struct virtio_net {
-   struct vhost_virtqueue  *virtqueue[VIRTIO_QNUM];/**< Contains 
all virtqueue information. */
struct virtio_memory*mem;   /**< QEMU memory and memory 
region information. */
+   struct vhost_virtqueue  **virtqueue;/**< Contains all virtqueue 
information. */
uint64_tfeatures;   /**< Negotiated feature set. */
uint64_tdevice_fh;  /**< device identifier. */
uint32_tflags;  /**< Device flags. Only used to 
check if device is running on data core. */
 #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
charifname[IF_NAME_SZ]; /**< Name of the tap 
device or socket path. */
+   uint32_tvirt_qp_nb;
void*priv;  /**< private context */
 } __rte_cache_aligned;

@@ -235,4 +235,10 @@ uint16_t rte_vhost_enqueue_burst(struct virtio_net *dev, 
uint16_t queue_id,
 uint16_t rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count);

+/**
+ * This function gets the queue pair number of one vhost device.
+ * @return
+ *  num of queue pair of specified virtio device.
+ */
+uint16_t rte_vhost_qp_num_get(struct virtio_net *dev);
 #endif /* _VIRTIO_NET_H_ */
diff --git a/lib/librte_vhost/vhost-net.h b/lib/librte_vhost/vhost-net.h
index c69b60b..7dff14d 100644
--- a/lib/librte_vhost/vhost-net.h
+++ b/lib/librte_vhost/vhost-net.h
@@ -115,4 +115,5 @@ struct vhost_net_device_ops {


 struct vhost_net_device_ops const *get_virtio_net_callbacks(void);
+int alloc_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx);
 #endif /* _VHOST_NET_CDEV_H_ */
diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 0d07338..db4ad88 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -43,6 +43,18 @@
 #define MAX_PKT_BURST 32

 /**
+ * Check the virtqueue idx validity,
+ * return 1 if pass, otherwise 0.
+ */
+static inline uint8_t __attribute__((always_inline))
+check_virtqueue_idx(uint16_t virtq_idx, uint8_t is_tx, uint32_t virtq_num)
+{
+   if ((is_tx ^ (virtq_idx & 0x1)) || (virtq_idx >= virtq_num))
+   return 0;
+   return 1;
+}
+
+/**
  * This function adds buffers to the virtio devices RX virtqueue. Buffers can
  * be received from the physical port or from another virtio device. A packet
count is returned to indicate the number of packets that are successfully
@@ -68,12 +80,15 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
uint8_t success = 0;

LOG_DEBUG(VHOST_DATA, "(%"PRIu64") virtio_dev_rx()\n", dev->device_fh);
-   if (unlikely(queue_id != VIRTIO_RXQ)) {
-   LOG_DEBUG(VHOST_DATA, "mq isn't supported in this version.\n");
+   if (unlikely(check_virtqueue_idx(queue_id, 0,
+   VIRTIO_QNUM * dev->virt_qp_nb) == 0)) {
+   RTE_LOG(ERR, VHOST_DATA,
+   "%s (%"PRIu64"): virtqueue idx:%d invalid.\n",

[dpdk-dev] [PATCH v4 01/12] ixgbe: support VMDq RSS in non-SRIOV environment

2015-08-12 Thread Ouyang Changchun
In a non-SR-IOV environment, VMDq RSS can be enabled via the MRQC register.
In theory, the queue number per pool could be 2 or 4, but only 2 queues are
available due to a HW limitation; the same limit also exists in the Linux ixgbe driver.
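The selection between the two pool layouts can be sketched as a small standalone function. The enum values below are symbolic stand-ins for illustration only; the real code in the patch's ixgbe_config_vmdq_rss() writes the corresponding MRQC register bits.

```c
/* Sketch of the layout selection in the patch: the number of RSS queues
 * per VMDq pool picks between 64-pool and 32-pool layouts. The return
 * values are symbolic stand-ins, not real MRQC register bits. */
enum vmdq_rss_mode {
	VMDQ_RSS_INVALID = 0,
	VMDQ_RSS_64POOL_2Q,	/* 64 pools x 2 queues */
	VMDQ_RSS_32POOL_4Q	/* 32 pools x 4 queues */
};

static enum vmdq_rss_mode
select_vmdq_rss_mode(unsigned int nb_q_per_pool)
{
	switch (nb_q_per_pool) {
	case 2:
		return VMDQ_RSS_64POOL_2Q;
	case 4:
		return VMDQ_RSS_32POOL_4Q;	/* possible in theory; HW limit allows only 2 */
	default:
		return VMDQ_RSS_INVALID;	/* maps to the -EINVAL path in the patch */
	}
}
```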

Signed-off-by: Changchun Ouyang 
---
Changes in v2:
  - fix checkpatch errors

Changes in v4:
  - use vmdq_queue_num to calculate queue number per pool

 drivers/net/ixgbe/ixgbe_rxtx.c | 86 +++---
 lib/librte_ether/rte_ethdev.c  | 31 +++
 2 files changed, 104 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index 91023b9..d063e12 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -3513,16 +3513,16 @@ void ixgbe_configure_dcb(struct rte_eth_dev *dev)
return;
 }

-/*
- * VMDq only support for 10 GbE NIC.
+/**
+ * Config pool for VMDq on 10 GbE NIC.
  */
 static void
-ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
+ixgbe_vmdq_pool_configure(struct rte_eth_dev *dev)
 {
struct rte_eth_vmdq_rx_conf *cfg;
struct ixgbe_hw *hw;
enum rte_eth_nb_pools num_pools;
-   uint32_t mrqc, vt_ctl, vlanctrl;
+   uint32_t vt_ctl, vlanctrl;
uint32_t vmolr = 0;
int i;

@@ -3531,12 +3531,6 @@ ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
cfg = &dev->data->dev_conf.rx_adv_conf.vmdq_rx_conf;
num_pools = cfg->nb_queue_pools;

-   ixgbe_rss_disable(dev);
-
-   /* MRQC: enable vmdq */
-   mrqc = IXGBE_MRQC_VMDQEN;
-   IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
-
/* PFVTCTL: turn on virtualisation and set the default pool */
vt_ctl = IXGBE_VT_CTL_VT_ENABLE | IXGBE_VT_CTL_REPLEN;
if (cfg->enable_default_pool)
@@ -3602,7 +3596,29 @@ ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
IXGBE_WRITE_FLUSH(hw);
 }

-/*
+/**
+ * VMDq only support for 10 GbE NIC.
+ */
+static void
+ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
+{
+   struct ixgbe_hw *hw;
+   uint32_t mrqc;
+
+   PMD_INIT_FUNC_TRACE();
+   hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   ixgbe_rss_disable(dev);
+
+   /* MRQC: enable vmdq */
+   mrqc = IXGBE_MRQC_VMDQEN;
+   IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
+   IXGBE_WRITE_FLUSH(hw);
+
+   ixgbe_vmdq_pool_configure(dev);
+}
+
+/**
  * ixgbe_dcb_config_tx_hw_config - Configure general VMDq TX parameters
  * @hw: pointer to hardware structure
  */
@@ -3707,6 +3723,41 @@ ixgbe_config_vf_rss(struct rte_eth_dev *dev)
 }

 static int
+ixgbe_config_vmdq_rss(struct rte_eth_dev *dev)
+{
+   struct ixgbe_hw *hw;
+   uint32_t mrqc;
+
+   ixgbe_rss_configure(dev);
+
+   hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   /* MRQC: enable VMDQ RSS */
+   mrqc = IXGBE_READ_REG(hw, IXGBE_MRQC);
+   mrqc &= ~IXGBE_MRQC_MRQE_MASK;
+
+   switch (RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool) {
+   case 2:
+   mrqc |= IXGBE_MRQC_VMDQRSS64EN;
+   break;
+
+   case 4:
+   mrqc |= IXGBE_MRQC_VMDQRSS32EN;
+   break;
+
+   default:
+   PMD_INIT_LOG(ERR, "Invalid pool number in non-IOV mode with 
VMDQ RSS");
+   return -EINVAL;
+   }
+
+   IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
+
+   ixgbe_vmdq_pool_configure(dev);
+
+   return 0;
+}
+
+static int
 ixgbe_config_vf_default(struct rte_eth_dev *dev)
 {
struct ixgbe_hw *hw =
@@ -3762,6 +3813,10 @@ ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev)
ixgbe_vmdq_rx_hw_configure(dev);
break;

+   case ETH_MQ_RX_VMDQ_RSS:
+   ixgbe_config_vmdq_rss(dev);
+   break;
+
case ETH_MQ_RX_NONE:
/* if mq_mode is none, disable rss mode.*/
default: ixgbe_rss_disable(dev);
@@ -4252,6 +4307,8 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)

/* Setup RX queues */
for (i = 0; i < dev->data->nb_rx_queues; i++) {
+   uint32_t psrtype = 0;
+
rxq = dev->data->rx_queues[i];

/*
@@ -4279,12 +4336,10 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
if (rx_conf->header_split) {
if (hw->mac.type == ixgbe_mac_82599EB) {
/* Must setup the PSRTYPE register */
-   uint32_t psrtype;
psrtype = IXGBE_PSRTYPE_TCPHDR |
IXGBE_PSRTYPE_UDPHDR   |
IXGBE_PSRTYPE_IPV4HDR  |
IXGBE_PSRTYPE_IPV6HDR;
-   IXGBE_WRITE_REG(hw, 
IXGBE_PSRTYPE(rxq->reg_idx), psrtype);
}
srrctl = ((rx_con

[dpdk-dev] [PATCH v4 00/12] Support multiple queues in vhost

2015-08-12 Thread Ouyang Changchun
This patch set targets R2.2; please ignore it for R2.1.
It is sent out a bit early just to seek more comments.

This patch set supports multiple queues for each virtio device in vhost.
Currently the multiple queues feature is supported only for vhost-user, not yet
for vhost-cuse.

The new QEMU patch version (v6) enabling vhost-user multiple queues has
already been sent to the QEMU community and is in its comment-collecting
stage. The patch set must be applied to QEMU and QEMU rebuilt before running
vhost multiple queues:
http://patchwork.ozlabs.org/patch/506333/
http://patchwork.ozlabs.org/patch/506334/

Note: the QEMU patch is based on top of 2 other patches; see the patch
description for more details.

Basically the vhost sample leverages VMDq+RSS in HW to receive packets and
distribute them into different queues in the pool according to their 5-tuples.

On the other hand, vhost gets the queue pair number from the communication
messages with QEMU.

It is strongly recommended to set the number of HW queues per pool identical
to the queue number used to start the QEMU guest, and identical to the queue
number used to start the virtio port on the guest.
E.g. use '--rxq 4' to set the queue number to 4: there are then 4 HW queues
in each VMDq pool and 4 queues in each vhost device/port, and every queue in
a pool maps to one queue in a vhost device.

|==================|   |==================|
|      vport0      |   |      vport1      |
|------------------|   |------------------|
| q0 | q1 | q2 | q3|   | q0 | q1 | q2 | q3|
|==================|   |==================|
  ||   ||   ||   ||      ||   ||   ||   ||
  ||   ||   ||   ||      ||   ||   ||   ||
|==================|   |==================|
| q0 | q1 | q2 | q3|   | q0 | q1 | q2 | q3|
|------------------|   |------------------|
|    VMDq pool0    |   |    VMDq pool1    |
|==================|   |==================|

On the RX side, it first polls each queue of the pool, gets the packets from
it, and enqueues them into the corresponding virtqueue of the virtio device/port.
On the TX side, it dequeues packets from each virtqueue of the virtio
device/port and sends them to either a physical port or another virtio device
according to the destination MAC address.
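The queue index mapping implied above can be sketched as follows. The index scheme is taken from the VIRTIO_RXQ/VIRTIO_TXQ enum used throughout the patches (RX virtqueues at even indices, TX at odd, matching the parity check in check_virtqueue_idx); the helper names are hypothetical.

```c
/* Virtqueue index layout used by the multi-queue patches:
 * queue pair q owns virtqueues q*VIRTIO_QNUM + VIRTIO_RXQ (even)
 * and q*VIRTIO_QNUM + VIRTIO_TXQ (odd). */
enum { VIRTIO_RXQ = 0, VIRTIO_TXQ = 1, VIRTIO_QNUM = 2 };

/* HW queue q of a VMDq pool feeds the RX virtqueue of queue pair q. */
static inline int
pool_rxq_to_virtqueue(int pool_queue)
{
	return pool_queue * VIRTIO_QNUM + VIRTIO_RXQ;
}

/* Packets headed out of the guest come from the TX virtqueue of pair q. */
static inline int
pool_txq_to_virtqueue(int pool_queue)
{
	return pool_queue * VIRTIO_QNUM + VIRTIO_TXQ;
}
```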

Here is some test guidance.
1. On the host, first mount hugepages, insmod uio and igb_uio, and bind one
NIC to igb_uio; then run the vhost sample. Key steps as follows:
sudo mount -t hugetlbfs nodev /mnt/huge
sudo modprobe uio
sudo insmod $RTE_SDK/$RTE_TARGET/kmod/igb_uio.ko

$RTE_SDK/tools/dpdk_nic_bind.py --bind igb_uio :08:00.0
sudo $RTE_SDK/examples/vhost/build/vhost-switch -c 0xf0 -n 4 --huge-dir 
/mnt/huge --socket-mem 1024,0 -- -p 1 --vm2vm 0 --dev-basename usvhost --rxq 2

Use '--stats 1' to enable stats dumping on screen for vhost.

2. After step 1, on the host, modprobe kvm and kvm_intel, and use the qemu
command line to start one guest:
modprobe kvm
modprobe kvm_intel
sudo mount -t hugetlbfs nodev /dev/hugepages -o pagesize=1G

$QEMU_PATH/qemu-system-x86_64 -enable-kvm -m 4096 -object 
memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on -numa 
node,memdev=mem -mem-prealloc -smp 10 -cpu core2duo,+sse3,+sse4.1,+sse4.2 -name 
 -drive file=/vm.img -chardev 
socket,id=char0,path=/usvhost -netdev 
type=vhost-user,id=hostnet2,chardev=char0,vhostforce=on,queues=2 -device 
virtio-net-pci,mq=on,vectors=6,netdev=hostnet2,id=net2,mac=52:54:00:12:34:56,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off
 -chardev socket,id=char1,path=/usvhost -netdev 
type=vhost-user,id=hostnet3,chardev=char1,vhostforce=on,queues=2 -device 
virtio-net-pci,mq=on,vectors=6,netdev=hostnet3,id=net3,mac=52:54:00:12:34:57,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off

3. Log in to the guest and use testpmd (DPDK-based) to test, using multiple
virtio queues to rx and tx packets.
modprobe uio
insmod $RTE_SDK/$RTE_TARGET/kmod/igb_uio.ko
echo 1024 > 
/sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
./tools/dpdk_nic_bind.py --bind igb_uio 00:03.0 00:04.0

$RTE_SDK/$RTE_TARGET/app/testpmd -c 1f -n 4 -- --rxq=2 --txq=2 --nb-cores=4 
--rx-queue-stats-mapping="(0,0,0),(0,1,1),(1,0,2),(1,1,3)" 
--tx-queue-stats-mapping="(0,0,0),(0,1,1),(1,0,2),(1,1,3)" -i --disable-hw-vlan 
--txqflags 0xf00

set fwd mac
start tx_first

4. Use a packet generator to send packets with dest MAC 52:54:00:12:34:57 and
VLAN tag 1001, selecting IPv4 as the protocol and continuously incrementing
IP addresses.

5. Testpmd on guest can display packets received/transmitted in both queues of 
each virtio port.

Changchun Ouyang (12):
  ixgbe: support VMDq RSS in non-SRIOV environment
  vhost: support multiple queues in virtio dev
  vhost: update version map file
  vhost: set memory layout for multiple queues mode
  vhost: check the virtqueue address's validity
  vhost: support protocol feature
  vhost: add new command line option: rxq
  vhost: support multiple queues
  virtio: resolve for control queue
  vhos

[dpdk-dev] [PATCH] vchost: Notify application of ownership change

2015-08-10 Thread Ouyang, Changchun


> -Original Message-
> From: Jan Kiszka [mailto:jan.kiszka at siemens.com]
> Sent: Saturday, August 8, 2015 2:43 PM
> To: Ouyang, Changchun; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] vchost: Notify application of ownership
> change
> 
> On 2015-08-08 02:25, Ouyang, Changchun wrote:
> >
> >
> >> -Original Message-
> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jan Kiszka
> >> Sent: Saturday, August 8, 2015 1:21 AM
> >> To: dev at dpdk.org
> >> Subject: [dpdk-dev] [PATCH] vchost: Notify application of ownership
> >> change
> >
> > Vchost should be vhost in the title
> 
> Oops. Unless I need to resend for some other reason, I guess the commit can
> fix this up.
> 
> >
> >>
> >> On VHOST_*_RESET_OWNER, we reinitialize the device but without
> >> telling the application. That will cause crashes when it continues to
> >> invoke vhost services on the device. Fix it by calling the
> >> destruction hook if the device is still in use.
> > What's your qemu version?
> 
> git head, see my other reply for details.
> 
> > Any validation work on this patch?
> 
> What do you mean with this? Test cases? Or steps to reproduce? For the
> latter, just fire up a recent qemu, let the guest enable the virtio device, 
> then
> reboot or simply terminate qemu.

Here I mean test cases.
We need to make sure the change works on both qemu 2.4 (with the reset commit
in qemu) and qemu 2.2/2.3 (without the commit in qemu).

> 
> >>
> >> Signed-off-by: Jan Kiszka 
> >> ---
> >>
> >> This is the surprisingly simple answer to my questions in
> >> http://thread.gmane.org/gmane.comp.networking.dpdk.devel/22661.
> >>
> >>  lib/librte_vhost/virtio-net.c | 3 +++
> >>  1 file changed, 3 insertions(+)
> >>
> >> diff --git a/lib/librte_vhost/virtio-net.c
> >> b/lib/librte_vhost/virtio-net.c index
> >> b520ec5..3c5b5b2 100644
> >> --- a/lib/librte_vhost/virtio-net.c
> >> +++ b/lib/librte_vhost/virtio-net.c
> >> @@ -402,6 +402,9 @@ reset_owner(struct vhost_device_ctx ctx)
> >>
> >>ll_dev = get_config_ll_entry(ctx);
> >>
> >> +  if ((ll_dev->dev.flags & VIRTIO_DEV_RUNNING))
> >> +  notify_ops->destroy_device(&ll_dev->dev);
> >> +
> >
> > I am not sure whether destroy_device here will affect the second time
> init_device(below) and new_device(after the reset) or not.
> > Need validation.
> 
> Cannot follow, what do you mean with "second time"? If the callback could
> invoke something that causes cleanup_device to be called as well?
> That's at least not the case with vhost-switch, but I'm far from being 
> familiar
> with the API to asses if that is possible in general.

RESET is often followed by a second virtio device creation.
If you have a chance to run testpmd with the virtio PMD on the guest, that is
exactly this case:
RESET is called, and then the virtio device is created again to make it work
for packet rx/tx.

> 
> Jan
> 
> >
> >>cleanup_device(&ll_dev->dev);
> >>init_device(&ll_dev->dev);
> >>
> >> --
> >> 2.1.4
> 
> 
> --
> Siemens AG, Corporate Technology, CT RTC ITP SES-DE Corporate
> Competence Center Embedded Linux


[dpdk-dev] vhost-switch example: huge memory need and CRC off-loading issue

2015-08-10 Thread Ouyang, Changchun


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Sunday, August 9, 2015 4:56 PM
> To: Jan Kiszka
> Cc: dev at dpdk.org; Ouyang, Changchun
> Subject: Re: [dpdk-dev] vhost-switch example: huge memory need and CRC
> off-loading issue
> 
> 2015-08-08 09:17, Jan Kiszka:
> > On 2015-08-08 02:39, Ouyang, Changchun wrote:
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jan Kiszka
> > >> - MAX_QUEUES of 512 causes pretty high memory need for the
> application
> > >>   (something between 1 and 2G) - is that really needed? I'm now running
> > >>   with 32, and I'm able to get away with 256M. Can we tune this
> > >>   default?
> > >
> > > Don't think we need change default just because your platform is 32,
> > > Well, my platform is 128, other platform may have other value, :-)
> >
> > Then let's make it configurable or explore the actual device needs
> > before allocating the buffer. The impact on memory consumption is way
> > too big to hard-code this, specifically as this is per physical port IIUC.
> 
> You can add a run-time option and/or add a comment in the documentation.
> It is just an example, so it must be simple to understand.

Agree, that is better than changing the default macro value.


[dpdk-dev] vhost-switch example: huge memory need and CRC off-loading issue

2015-08-08 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jan Kiszka
> Sent: Saturday, August 8, 2015 1:31 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] vhost-switch example: huge memory need and CRC off-
> loading issue
> 
> Hi again,
> 
> two findings in the vhost-switch example code that can cause grey hair for
> starters:
> 
> - MAX_QUEUES of 512 causes pretty high memory need for the application
>   (something between 1 and 2G) - is that really needed? I'm now running
>   with 32, and I'm able to get away with 256M. Can we tune this
>   default?

I don't think we need to change the default just because 32 works on your
platform. Well, my platform uses 128; other platforms may need other values, :-)

> 
> - hw_strip_crc is set to 0, but either the igb driver or the ET2 quad
>   port adapter I'm using is ignoring this. It does strip the CRC, so

Igb and ET2 should NOT ignore it.

>   does software, and I'm losing 4 bytes on each unpadded packet. Known
>   issue?
> 
> Thanks,
> Jan
> 
> --
> Siemens AG, Corporate Technology, CT RTC ITP SES-DE Corporate
> Competence Center Embedded Linux


[dpdk-dev] [PATCH] vchost: Notify application of ownership change

2015-08-08 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jan Kiszka
> Sent: Saturday, August 8, 2015 1:21 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] vchost: Notify application of ownership change

Vchost should be vhost in the title

> 
> On VHOST_*_RESET_OWNER, we reinitialize the device but without telling
> the application. That will cause crashes when it continues to invoke vhost
> services on the device. Fix it by calling the destruction hook if the device 
> is
> still in use.
What's your qemu version?
Any validation work on this patch?
> 
> Signed-off-by: Jan Kiszka 
> ---
> 
> This is the surprisingly simple answer to my questions in
> http://thread.gmane.org/gmane.comp.networking.dpdk.devel/22661.
> 
>  lib/librte_vhost/virtio-net.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c 
> index
> b520ec5..3c5b5b2 100644
> --- a/lib/librte_vhost/virtio-net.c
> +++ b/lib/librte_vhost/virtio-net.c
> @@ -402,6 +402,9 @@ reset_owner(struct vhost_device_ctx ctx)
> 
>   ll_dev = get_config_ll_entry(ctx);
> 
> + if ((ll_dev->dev.flags & VIRTIO_DEV_RUNNING))
> + notify_ops->destroy_device(&ll_dev->dev);
> +

I am not sure whether destroy_device here will affect the second
init_device (below) and new_device (after the reset) or not.
This needs validation.

>   cleanup_device(&ll_dev->dev);
>   init_device(&ll_dev->dev);
> 
> --
> 2.1.4


[dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding

2015-08-08 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zhang, Helin
> Sent: Saturday, August 8, 2015 12:07 AM
> To: Qiu, Michael; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] testpmd: modify the mac of csum
> forwarding
> 
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michael Qiu
> > Sent: Thursday, August 6, 2015 8:29 PM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding
> >
> > For some ethnet-switch like intel RRC, all the packet forwarded out by
> > DPDK will be dropped in switch side, so the packet generator will never
> receive the packet.
> Is it because of anti-sproof? E.g. When the hardware found that the dest mac
> is the port itself, then it will be dropped during TX.
> You need to tell the root cause, and why we need to modify like this.
> 
> >
> > Signed-off-by: Michael Qiu 
> > ---
> >  app/test-pmd/csumonly.c | 4 
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c index
> > 1bf3485..bf8af1d 100644
> > --- a/app/test-pmd/csumonly.c
> > +++ b/app/test-pmd/csumonly.c
> > @@ -550,6 +550,10 @@ pkt_burst_checksum_forward(struct fwd_stream
> *fs)
> >  * and inner headers */
> >
> > eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
> > +   ether_addr_copy(&peer_eth_addrs[fs->peer_addr],
> > +   ð_hdr->d_addr);
> > +   ether_addr_copy(&ports[fs->tx_port].eth_addr,
> > +   ð_hdr->s_addr);
> Is it really necessary? Why other NICs do not need this?
> 

This seems to change the behavior from io forwarding into mac forwarding?

> > parse_ethernet(eth_hdr, &info);
> > l3_hdr = (char *)eth_hdr + info.l2_len;
> >
> > --
> > 1.9.3



[dpdk-dev] vhost: Problem RESET_OWNER processing

2015-08-08 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jan Kiszka
> Sent: Friday, August 7, 2015 10:04 PM
> To: dev at dpdk.org; Xie, Huawei
> Subject: [dpdk-dev] vhost: Problem RESET_OWNER processing
> 
> Hi,
> 
> I was wondering if I'm alone with this: the vhost-switch example crashes on
> client disconnects if the client send a RESET_OWNER message. That's at least
> the case for QEMU and vhost-user mode (I suppose vhost-cuse is legacy

What's your qemu version?

> now). And it really ruins the party when playing with this because every VM
> shutdown or guest reboot triggers.
> 
> I was looking deeper in the librte_vhost, and I found that reset_owner() is
> doing cleanup_device and then init_device - but without letting the user
> know. So vhost-switch crashed in its main loop over continuing to use the
> device, namely calling rte_vhost_dequeue_burst (with
> dev->virtqueue[]->avail == NULL).
> 
> Do we simply need another hook in the vhost API, similar to the destruction
> notification?
> 
> Jan
> 
> --
> Siemens AG, Corporate Technology, CT RTC ITP SES-DE Corporate
> Competence Center Embedded Linux


[dpdk-dev] DPDK bond the virtio NIC run error

2015-08-05 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Hu, FangyinX
> Sent: Wednesday, August 5, 2015 10:41 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] DPDK bond the virtio NIC run error
> 
> HI, I use the qemu-kvm create a virtual machine and give a virtio-net-pci
> mode NIC to the VM, then I run the bond_app example get an error. Here is
> the steps.
> 
> 1.   qemu-kvm -enable-kvm -had /home/images/fedora.qcow2 -m 2048 -
> smp 8 -cpu qemu64,+ssse3,+sse4.2 -device virtio-net-pci,netdev=mytap -
> netdev tap,id=mytap,ifname=tap1,script=/etc/qemu-ifup,downscript=no -
> device pci-assign,host=03:00.1
> 
> 2.   compile the DPDK v2.1.0-rc3 and use the tools/dpdk_nic_bind.py script
> to bind the device to the igb_uio
> 
> 3.   execute the example/bond/build/bond_app -c f -n 2  and get the error:
> 
> EAL: memzone_reserve_aligned_thread_unsafe(): memzone 
> already exists
> 
> PMD: slave_configure(1351) - rte_eth_tx_queue_setup: port=0 queue_id 0,
> err (-22)
> 
> PMD: bond_ethdev_start(1497) - bonded port (2) failed to reconfigure slave
> device (0)
> 
> EAL: Error - exiting with code: -1
> 
> Cause: Start port 2 failed (res= -1)

Are the rx_mode.hw_vlan_* flags enabled by default?
You may disable them first and try again.


[dpdk-dev] [PATCH 2/2] virtio: allow running w/o vlan filtering

2015-08-05 Thread Ouyang, Changchun

Hi Vincent,

> -Original Message-
> From: Vincent JARDIN [mailto:vincent.jardin at 6wind.com]
> Sent: Tuesday, August 4, 2015 8:52 PM
> To: Thomas Monjalon; Ouyang, Changchun
> Cc: dev at dpdk.org; Stephen Hemminger
> Subject: Re: [dpdk-dev] [PATCH 2/2] virtio: allow running w/o vlan filtering
> 
> Thomas, Changchun,
> 
> On 29/07/2015 14:56, Thomas Monjalon wrote:
> > Back on this old patch, it seems justified but nobody agreed.
> >
> > --- a/lib/librte_pmd_virtio/virtio_ethdev.c
> > +++ b/lib/librte_pmd_virtio/virtio_ethdev.c
> > @@ -1288,7 +1288,6 @@ virtio_dev_configure(struct rte_eth_dev *dev)
> >  && !vtpci_with_feature(hw, VIRTIO_NET_F_CTRL_VLAN)) {
> >  PMD_DRV_LOG(NOTICE,
> >  "vlan filtering not available on this host");
> > -       return -ENOTSUP;
> >  }
> >
> > 2015-03-06 08:24, Stephen Hemminger:
> >> "Ouyang, Changchun"  wrote:
> >>>> From: Stephen Hemminger
> >>>> Vlan filtering is an option, and not a requirement.
> >>>> If host does not support filtering then it can be done in software.
> 
> +1 with Stephen, remove return -ENOTSUP;
> 
> applications must not fail, software stacks will handle it. We did experiment
Do you mean handling it in the software stack outside the virtio PMD?
AFAIK, inside the virtio PMD we currently have no code to handle it.

> some issues when testpmd was failing while it was supposed to run. A notice
> would be good enough.
> 

Using '--disable-hw-vlan-filter' on the testpmd command line will allow it to
continue working.
You can give it a try.
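For reference, an invocation might look like the following; note the flag goes after the `--` separator, and the EAL options (core mask, memory channels) are placeholder values, not taken from this thread.

```shell
# Hypothetical example: run testpmd with hardware VLAN filtering disabled.
./testpmd -c 0x3 -n 4 -- -i --disable-hw-vlan-filter
```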

> 
> >>>
> >>> The question is that guest only send command, no real action to do the
> vlan filter.
> >>> So if both host and guest have no real action for vlan filter, who will 
> >>> do it?
> >>
> >> The virtio driver has features.
> >> Guest can not send commands to host where feature bit not enabled.
> >> Application can call filter_set and check if filter worked or not.
> >>
> >> Our code already had to do MAC and VLAN validation of incoming
> >> packets therefore if hardware can't do vlan match, there is no problem.
> >> I would expect other applications would do the same thing.
> >>
> >> Failing during configuration is bad. DPDK API should never force
> >> application to play "guess the working configuration" with the device
> >> driver or do string match on "which device is this anyway"
> 
> Agree, it is not a failure of a configuration, it is a failure of negotiation 
> of
> virtio's capabilities.

I am not sure which one is better when the app configures a feature but fails to
negotiate it with the host (meaning the host currently has no capability to
support that feature):
1) The driver cheats the app and continues with the rest of the work (with some
hints, of course).
2) The driver gives hints and exits, then the user re-runs the app with a correct
configuration.

> 
> Let's use another example: we do not expect a guest kernel to panic()
> because it is not properly negotiated? So why should a DPDK application fail
> and return -ENOTSUP?
I think a user-mode driver/app and the kernel are different things :-)

Changchun



[dpdk-dev] [PATCH 2/2] virtio: allow running w/o vlan filtering

2015-07-30 Thread Ouyang, Changchun
I have comments on that.
Please see below.

> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, July 29, 2015 8:57 PM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org; Stephen Hemminger
> Subject: Re: [dpdk-dev] [PATCH 2/2] virtio: allow running w/o vlan filtering
> 
> Back on this old patch, it seems justified but nobody agreed.
> 
> --- a/lib/librte_pmd_virtio/virtio_ethdev.c
> +++ b/lib/librte_pmd_virtio/virtio_ethdev.c
> @@ -1288,7 +1288,6 @@ virtio_dev_configure(struct rte_eth_dev *dev)
> && !vtpci_with_feature(hw, VIRTIO_NET_F_CTRL_VLAN)) {
> PMD_DRV_LOG(NOTICE,
> "vlan filtering not available on this host");
> -   return -ENOTSUP;
>     }
> 
> 2015-03-06 08:24, Stephen Hemminger:
> > "Ouyang, Changchun"  wrote:
> > > > From: Stephen Hemminger
> > > > Vlan filtering is an option, and not a requirement.
> > > > If host does not support filtering then it can be done in software.

Yes, VLAN filtering is an option, but the virtio driver currently has no software
fallback for it.
So I would rather disable hw_vlan_filter in rxmode when the device can't really
support it than remove the return there.

> > >
> > > The question is that guest only send command, no real action to do the
> vlan filter.
> > > So if both host and guest have no real action for vlan filter, who will 
> > > do it?
> >
> > The virtio driver has features.
> > Guest can not send commands to host where feature bit not enabled.
> > Application can call filter_set and check if filter worked or not.
> >
> > Our code already had to do MAC and VLAN validation of incoming packets

There is VLAN stripping, but no VLAN filtering, in the rx function.

> > therefore if hardware can't do vlan match, there is no problem.
> > I would expect other applications would do the same thing.
> >
> > Failing during configuration is bad. DPDK API should never force
> > application to play "guess the working configuration" with the device
> > driver or do string match on "which device is this anyway"



[dpdk-dev] [PATCH] virtio: fix the vq size issue

2015-07-21 Thread Ouyang, Changchun


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Monday, July 20, 2015 6:42 PM
> To: Ouyang, Changchun
> Cc: Xu, Qian Q; Stephen Hemminger; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] virtio: fix the vq size issue
> 
> 2015-07-20 06:18, Ouyang, Changchun:
> > Another thing burst into my thought.
> > Can we think more about how to setup a mechanism to block those
> patches which causes critical regression issue?
> 
> Yes. Non-regression tests are needed. As it must be done with many
> hardwares and many configurations, it must be a shared effort.
> As a first step, you can run some automated tests by yourself and
> *manually* raise errors on the mailing lists. When it will work well, we could
> discuss about gathering test reports in a clean distributed way.
> Note that this topic is already a work in progress by few people and a public
> proposal should be done in few weeks.

That's good. 

> 
> > I did review that patch before, but fail to realize it will break the basic
> function of virtio PMD, it is my bad.
> > (Can I send the nack to that patch even after it has been merged into
> > dpdk.org?)
> 
> After being approved and merged, a nack has no effect.
> Having a revert approved is the good way.

I have acked Stephen's new patch.

> 
> > After that, we find that in our testing cycle, we spend time in
> > investigating that and root the cause, and sent out the fixing patch on 
> > July 1.
> Keeping virtio basic functionality broken more than 20 days is bad thing for
> me.
> 
> It wouldn't be so long if these 3 simple things were done:
> - use a better title: "virtio: fix Rx from Qemu" instead of a not meaningful 
> "fix
> the vq size issue"
> - cc Stephen (I did it later) who did the original patch you wants to revert
> - have an acked-by from Huawei Xie who commented the patch
> 
> > If we can run a regression automation test with every patch set sent
> > out to dpdk.org, and put those patches breaking any test cases Into
> > failing-list and notify author, reviewer and maintainer, all those things
> should be done before theirs being merged, then it will prevent from
> merging the erroneous patch into mainline, and thus reduce most reverting
> patch.
> 
> As explained above, it is planned and you can start running you own local test
> machine. But please do not spam the mailing list with automated mails from
> these tests.


[dpdk-dev] [PATCH v2 1/2] virtio: fix queue size and number of descriptors

2015-07-21 Thread Ouyang, Changchun


> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Tuesday, July 21, 2015 2:41 AM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org; Stephen Hemminger
> Subject: [PATCH v2 1/2] virtio: fix queue size and number of descriptors
> 
> The virtual queue ring size and the number of slots actually usable are
> separate parameters. In the most common environment (QEMU) the virtual
> queue ring size is 256, but some environments the ring maybe much larger.
> 
> The ring size comes from the host and the driver must use the actual size
> passed.
> 
> The number of descriptors can be either zero to use the whole available ring,
> or some value smaller. This is used to limit the number of mbufs allocated for
> the receive ring. If more descriptors are requested than available the size is
> silently truncated.
> 
> Note: the ring size (from host) must be a power of two, but the number of
> descriptors used can be any size from 1 to the size of the virtual ring.
> 
> Reported-by: Ouyang Changchun 
> Signed-off-by: Stephen Hemminger 

Basically ok for this change, so
Acked-by: Changchun Ouyang 

> ---
>  drivers/net/virtio/virtio_ethdev.c | 17 -
>  1 file changed, 4 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/net/virtio/virtio_ethdev.c
> b/drivers/net/virtio/virtio_ethdev.c
> index 9ca9bb2..d460d89 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -276,8 +276,6 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
>*/
>   vq_size = VIRTIO_READ_REG_2(hw, VIRTIO_PCI_QUEUE_NUM);
>   PMD_INIT_LOG(DEBUG, "vq_size: %d nb_desc:%d", vq_size,
> nb_desc);
> - if (nb_desc == 0)
> - nb_desc = vq_size;
>   if (vq_size == 0) {
>   PMD_INIT_LOG(ERR, "%s: virtqueue does not exist",
> __func__);
>   return -EINVAL;
> @@ -288,16 +286,6 @@ int virtio_dev_queue_setup(struct rte_eth_dev
> *dev,
>   return -EINVAL;
>   }
> 
> - if (nb_desc < vq_size) {
> - if (!rte_is_power_of_2(nb_desc)) {
> - PMD_INIT_LOG(ERR,
> -  "nb_desc(%u) size is not powerof 2",
> -  nb_desc);
> - return -EINVAL;
> - }
> - vq_size = nb_desc;
> - }
> -
>   if (queue_type == VTNET_RQ) {
>   snprintf(vq_name, sizeof(vq_name), "port%d_rvq%d",
>   dev->data->port_id, queue_idx);
> @@ -325,7 +313,10 @@ int virtio_dev_queue_setup(struct rte_eth_dev
> *dev,
>   vq->queue_id = queue_idx;
>   vq->vq_queue_index = vtpci_queue_idx;
>   vq->vq_nentries = vq_size;
> - vq->vq_free_cnt = vq_size;
> +
> + if (nb_desc == 0 || nb_desc > vq_size)
> + nb_desc = vq_size;
> + vq->vq_free_cnt = nb_desc;
> 
>   /*
>* Reserve a memzone for vring elements
> --
> 2.1.4



[dpdk-dev] [PATCH v2 0/2] virtio: fixes for 2.1-rc1

2015-07-21 Thread Ouyang, Changchun


> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Tuesday, July 21, 2015 2:41 AM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org; Stephen Hemminger
> Subject: [PATCH v2 0/2] virtio: fixes for 2.1-rc1

I think v2 here should be v3, as you already have a v2.

> 
> From: Stephen Hemminger 
> 
> This integrates my change and earlier change by Ouyang Changchun into one
> fix. And second patch is minor stuff found while reviewing.
> 
> Stephen Hemminger (2):
>   virtio: fix queue size and number of descriptors
>   virtio: small cleanups
> 
>  drivers/net/virtio/virtio_ethdev.c | 24 +++-
> drivers/net/virtio/virtio_ethdev.h |  2 +-
>  drivers/net/virtio/virtio_rxtx.c   |  2 +-
>  3 files changed, 9 insertions(+), 19 deletions(-)
> 
> --
> 2.1.4



[dpdk-dev] [PATCH v2 2/2] virtio: small cleanups

2015-07-21 Thread Ouyang, Changchun


> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Tuesday, July 21, 2015 2:41 AM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org; Stephen Hemminger
> Subject: [PATCH v2 2/2] virtio: small cleanups
> 
> Some minor cleanups.
>   * pass constant to virtio_dev_queue_setup
>   * fix message on rx_queue_setup
>   * get rid of extra double spaces
> 
> Signed-off-by: Stephen Hemminger 

Acked-by: Changchun Ouyang 

> ---
>  drivers/net/virtio/virtio_ethdev.c | 7 +++
> drivers/net/virtio/virtio_ethdev.h | 2 +-
>  drivers/net/virtio/virtio_rxtx.c   | 2 +-
>  3 files changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/virtio/virtio_ethdev.c
> b/drivers/net/virtio/virtio_ethdev.c
> index d460d89..465d3cd 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -254,7 +254,7 @@ virtio_dev_queue_release(struct virtqueue *vq) {  int
> virtio_dev_queue_setup(struct rte_eth_dev *dev,
>   int queue_type,
>   uint16_t queue_idx,
> - uint16_t  vtpci_queue_idx,
> + uint16_t vtpci_queue_idx,
>   uint16_t nb_desc,
>   unsigned int socket_id,
>   struct virtqueue **pvq)
> @@ -264,7 +264,7 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
>   uint16_t vq_size;
>   int size;
>   struct virtio_hw *hw = dev->data->dev_private;
> - struct virtqueue  *vq = NULL;
> + struct virtqueue *vq = NULL;
> 
>   /* Write the virtqueue index to the Queue Select Field */
>   VIRTIO_WRITE_REG_2(hw, VIRTIO_PCI_QUEUE_SEL,
> vtpci_queue_idx); @@ -413,13 +413,12 @@
> virtio_dev_cq_queue_setup(struct rte_eth_dev *dev, uint16_t
> vtpci_queue_idx,
>   uint32_t socket_id)
>  {
>   struct virtqueue *vq;
> - uint16_t nb_desc = 0;
>   int ret;
>   struct virtio_hw *hw = dev->data->dev_private;
> 
>   PMD_INIT_FUNC_TRACE();
>   ret = virtio_dev_queue_setup(dev, VTNET_CQ,
> VTNET_SQ_CQ_QUEUE_IDX,
> - vtpci_queue_idx, nb_desc, socket_id, &vq);
> + vtpci_queue_idx, 0, socket_id, &vq);
>   if (ret < 0) {
>   PMD_INIT_LOG(ERR, "control vq initialization failed");
>   return ret;
> diff --git a/drivers/net/virtio/virtio_ethdev.h
> b/drivers/net/virtio/virtio_ethdev.h
> index 3858b00..9026d42 100644
> --- a/drivers/net/virtio/virtio_ethdev.h
> +++ b/drivers/net/virtio/virtio_ethdev.h
> @@ -79,7 +79,7 @@ void virtio_dev_rxtx_start(struct rte_eth_dev *dev);
> int virtio_dev_queue_setup(struct rte_eth_dev *dev,
>   int queue_type,
>   uint16_t queue_idx,
> - uint16_t  vtpci_queue_idx,
> + uint16_t vtpci_queue_idx,
>   uint16_t nb_desc,
>   unsigned int socket_id,
>   struct virtqueue **pvq);
> diff --git a/drivers/net/virtio/virtio_rxtx.c 
> b/drivers/net/virtio/virtio_rxtx.c
> index 5388caa..c5b53bb 100644
> --- a/drivers/net/virtio/virtio_rxtx.c
> +++ b/drivers/net/virtio/virtio_rxtx.c
> @@ -390,7 +390,7 @@ virtio_dev_rx_queue_setup(struct rte_eth_dev *dev,
>   ret = virtio_dev_queue_setup(dev, VTNET_RQ, queue_idx,
> vtpci_queue_idx,
>   nb_desc, socket_id, &vq);
>   if (ret < 0) {
> - PMD_INIT_LOG(ERR, "tvq initialization failed");
> + PMD_INIT_LOG(ERR, "rvq initialization failed");
>   return ret;
>   }
> 
> --
> 2.1.4



[dpdk-dev] [PATCH] virtio: fix the vq size issue

2015-07-20 Thread Ouyang, Changchun
Hi Thomas,

I think we have 3 options for this issue.
1) applying this patch;
2) reverting Stephen's original patch;
3) new patch to make both QEMU and GCE work.

1) and 2) will make the test case recover quickly from failure.
As for 3), I don't know whether Stephen has such a patch that can work on both.
I don't have a GCE environment on hand and I am not an expert on it yet; my
current focus is virtio on QEMU.
So at present I have no way to make a new patch and ensure both work,
but I can help with reviewing if Stephen has a new patch that does.

Another thought occurred to me.
Can we think about setting up a mechanism to block patches that cause critical
regression issues, e.g. the case we are talking about?
Commit d78deadae4dca240e85054bf2d604a801676becc breaks basic functionality of
the virtio PMD on QEMU.
It means DPDK samples like vhost and vxlan can't receive any packets, and
accordingly can't forward any packets, with the virtio PMD.
Neither can OVS.

I did review that patch before, but failed to realize it would break the basic
functionality of the virtio PMD; that is my bad.
(Can I send a nack to a patch even after it has been merged into dpdk.org?)
After that, in our testing cycle we spent time investigating it and finding the
root cause, and sent out the fix on July 1.  Keeping basic virtio functionality
broken for more than 20 days is a bad thing in my view.

If we could run automated regression tests on every patch set sent to dpdk.org,
put patches that break any test case into a failing list, and notify the author,
reviewer and maintainer before those patches are merged, then we would prevent
erroneous patches from reaching mainline and thus avoid most reverts.

Hi Stephen, and colleagues at Brocade,

Since you nacked my patch, would you please send out a new patch ASAP to fix the
issue your previous patch introduced?
I am not sure whether you validate your patches on GCE, but I strongly suggest
you validate each of them on QEMU before sending a formal one to dpdk.org.

Hi Qian,
Thanks very much for raising this critical issue in virtio!

thanks,
Changchun


> -Original Message-
> From: Xu, Qian Q
> Sent: Monday, July 20, 2015 11:41 AM
> To: Stephen Hemminger; Ouyang, Changchun; 'Thomas Monjalon'
> Cc: dev at dpdk.org; Xu, Qian Q
> Subject: RE: [dpdk-dev] [PATCH] virtio: fix the vq size issue
> 
> Hi, Thomas and all
> I saw in the latest rc1 package, the patch is not merged, and it's a critical 
> issue
> from validation view. I'm responsible for testing the dpdk vhost/virtio
> features, and I found using the latest code, dpdk-vhost/dpdk-virtio can't
> RX/TX package, then my 50% tests are failed while in DPDK2.0 they can pass.
> As you know, it's the basic functions for dpdk virtio to RX/TX, if it's not 
> fixed, I
> think we can't release the R2.1 package. Please help merge the patch, thx.
> 
> 
> 
> Thanks
> Qian
> 
> 
> -Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Stephen
> Hemminger
> Sent: Saturday, July 18, 2015 12:28 AM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] virtio: fix the vq size issue
> 
> On Wed,  1 Jul 2015 15:48:50 +0800
> Ouyang Changchun  wrote:
> 
> > This commit breaks virtio basic packets rx functionality:
> >   d78deadae4dca240e85054bf2d604a801676becc
> >
> > The QEMU use 256 as default vring size, also use this default value to
> > calculate the virtio avail ring base address and used ring base
> > address, and vhost in the backend use the ring base address to do packet
> IO.
> >
> > Virtio spec also says the queue size in PCI configuration is
> > read-only, so virtio front end can't change it. just need use the
> > read-only value to allocate space for vring and calculate the avail
> > and used ring base address. Otherwise, the avail and used ring base
> address will be different between host and guest, accordingly, packet IO
> can't work normally.
> >
> > Signed-off-by: Changchun Ouyang 
> > ---
> >  drivers/net/virtio/virtio_ethdev.c | 14 +++---
> >  1 file changed, 3 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/net/virtio/virtio_ethdev.c
> > b/drivers/net/virtio/virtio_ethdev.c
> > index fe5f9a1..d84de13 100644
> > --- a/drivers/net/virtio/virtio_ethdev.c
> > +++ b/drivers/net/virtio/virtio_ethdev.c
> > @@ -263,8 +263,6 @@ int virtio_dev_queue_setup(struct rte_eth_dev
> *dev,
> >  */
> > vq_size = VIRTIO_READ_REG_2(hw, VIRTIO_PCI_QUEUE_NUM);
> > PMD_INIT_LOG(DEBUG, "vq_size: %d nb_desc:%d", vq_size,
> nb_desc);

[dpdk-dev] [PATCH] virtio: fix the vq size issue

2015-07-18 Thread Ouyang, Changchun
Hi Stephen,

> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Saturday, July 18, 2015 12:28 AM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] virtio: fix the vq size issue
> 
> On Wed,  1 Jul 2015 15:48:50 +0800
> Ouyang Changchun  wrote:
> 
> > This commit breaks virtio basic packets rx functionality:
> >   d78deadae4dca240e85054bf2d604a801676becc
> >
> > The QEMU use 256 as default vring size, also use this default value to
> > calculate the virtio avail ring base address and used ring base
> > address, and vhost in the backend use the ring base address to do packet
> IO.
> >
> > Virtio spec also says the queue size in PCI configuration is
> > read-only, so virtio front end can't change it. just need use the
> > read-only value to allocate space for vring and calculate the avail
> > and used ring base address. Otherwise, the avail and used ring base
> address will be different between host and guest, accordingly, packet IO
> can't work normally.
> >
> > Signed-off-by: Changchun Ouyang 
> > ---
> >  drivers/net/virtio/virtio_ethdev.c | 14 +++---
> >  1 file changed, 3 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/net/virtio/virtio_ethdev.c
> > b/drivers/net/virtio/virtio_ethdev.c
> > index fe5f9a1..d84de13 100644
> > --- a/drivers/net/virtio/virtio_ethdev.c
> > +++ b/drivers/net/virtio/virtio_ethdev.c
> > @@ -263,8 +263,6 @@ int virtio_dev_queue_setup(struct rte_eth_dev
> *dev,
> >  */
> > vq_size = VIRTIO_READ_REG_2(hw, VIRTIO_PCI_QUEUE_NUM);
> > PMD_INIT_LOG(DEBUG, "vq_size: %d nb_desc:%d", vq_size,
> nb_desc);
> > -   if (nb_desc == 0)
> > -   nb_desc = vq_size;
> 
> command queue is setup with nb_desc = 0

nb_desc is not used in the rest of the function, so why do we need such an
assignment here?
Why is the command queue set up with nb_desc = 0?
Even if that is the case, what does the code change break?

> 
> > if (vq_size == 0) {
> > PMD_INIT_LOG(ERR, "%s: virtqueue does not exist",
> __func__);
> > return -EINVAL;
> > @@ -275,15 +273,9 @@ int virtio_dev_queue_setup(struct rte_eth_dev
> *dev,
> > return -EINVAL;
> > }
> >
> > -   if (nb_desc < vq_size) {
> > -   if (!rte_is_power_of_2(nb_desc)) {
> > -   PMD_INIT_LOG(ERR,
> > -"nb_desc(%u) size is not powerof 2",
> > -nb_desc);
> > -   return -EINVAL;
> > -   }
> > -   vq_size = nb_desc;
> > -   }
> > +   if (nb_desc != vq_size)
> > +   PMD_INIT_LOG(ERR, "Warning: nb_desc(%d) is not equal to
> vq size (%d), fall to vq size",
> > +   nb_desc, vq_size);
> 
> Nack. This breaks onn Google Compute Engine the vring size is 16K.


As I mentioned before, commit d78deadae4dca240e85054bf2d604a801676becc breaks
the basic functionality of the virtio PMD.
I don't think keeping it broken is a good way forward.
We have to revert it first to restore its functionality on QEMU!
Why should we break current functionality just to meet a new requirement?

> 
> An application that wants to work on both QEMU and GCE will want to pass a
> reasonable size and have the negotiation resolve to best value.

Do you already have a patch that reverts the mistake and supports both QEMU and
GCE?
If you have one, then please send it out and let's review it.

> 
> For example, vRouter passes 512 as Rx ring size.
> On QEMU this gets rounded down to 256 and on GCE only 512 elements are
> used.
> 
> This is what the Linux kernel virtio does.




[dpdk-dev] [ovs-discuss] ovs-dpdk performance is not good

2015-07-17 Thread Ouyang, Changchun


On 7/16/2015 9:45 PM, Traynor, Kevin wrote:
>
> (re-adding the ovs-discuss list)
>
> This might be better on the dpdk dev mailing list. For the OVS part, 
> see this thread 
> http://openvswitch.org/pipermail/discuss/2015-July/018095.html
>
> Kevin.
>
> *From:*Na Zhu [mailto:zhunatuzi at gmail.com]
> *Sent:* Wednesday, July 15, 2015 6:16 AM
> *To:* Traynor, Kevin
> *Subject:* Re: [ovs-discuss] ovs-dpdk performance is not good
>
> Hi Kevin,
>
> The interface MTU is 1500, the TCP message size is 16384 and the UDP 
> message size is 65507.
>
> How to use DPDK virtio PMD?
>
In the DPDK virtio PMD, the mergeable-buffers feature is used to support jumbo
frames. The mergeable feature must be negotiated with vhost on the backend,
so if OVS enables the mergeable feature and virtio succeeds in negotiating it,
then jumbo frames can be supported.

thanks
Changchun

> 2015-07-14 20:25 GMT+08:00 Traynor, Kevin  >:
>
> *From:*discuss [mailto:discuss-bounces at openvswitch.org
> ] *On Behalf Of *Na Zhu
> *Sent:* Monday, July 13, 2015 3:15 AM
> *To:* bugs at openvswitch.org 
> *Subject:* [ovs-discuss] ovs-dpdk performance is not good
>
> Dear all,
>
> I want to use ovs-dpdk to improve my nfv performance. But when i
> compare the throughput between standard ovs and ovs-dpdk, the ovs
> is better, does anyone know why?
>
> I use netperf to test the throughput.
>
> use vhost-net to test standard ovs.
>
> use vhost-user to test ovs-dpdk.
>
> My topology is as follow:
>
>  1
>
> The result is that standard ovs performance is better. Throughput
> unit Mbps.
>
>  2
>
>  3
>
> [kt] I would check your core affinitization to ensure that the
> vswitchd
>
> pmd is on a separate core to the vCPUs (set with
> other_config:pmd-cpu-mask).
>
> Also, this test is not using the DPDK vitrio PMD in the guest
> which provides
>
> performance gains.
>
> What packet sizes are you using? you should see a greater gain
> from DPDK
>
> at lower packet sizes (i.e. more PPS)
>



[dpdk-dev] [dpdk-virtio]: cannot start testpmd after binding virtio devices to gib_uio

2015-07-17 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Clarylin L
> Sent: Friday, July 17, 2015 7:10 AM
> To: dev at dpdk.org
> Subject: Re: [dpdk-dev] [dpdk-virtio]: cannot start testpmd after binding
> virtio devices to gib_uio
> 
> > I am running a virtual guest on Ubuntu and trying to use dpdk testpmd
> > as a packet forwarder.
> >
> > After starting the virtual guest, I do insmod igb_uio.ko insmod
> > rte_kni.ko echo ":00:06.0" >
> > /sys/bus/pci/drivers/virtio-pci/unbind
> > echo ":00:07.0" > /sys/bus/pci/drivers/virtio-pci/unbind
> > echo "1af4 1000" > /sys/bus/pci/drivers/igb_uio/new_id


You can try the following instead of the above to bind the virtio ports to igb_uio:
tools/dpdk-nic-bind.py --bind igb_uio 00:06.0 00:07.0

> > mkdir -p /tmp/huge
> > mount -t hugetlbfs nodev /tmp/huge
> > echo 1024 > /sys/kernel/mm/hugepages/hugepages-
> 2048kB/nr_hugepages
> >
> > Where :00:06.0 and :00:07.0 are the two virtio devices I am
> > gonna use, and 1af4 1000 is the corresponding vendor and device id.
> >
> > After the above steps, I verified that the virtio devices are actually
> > bound to igb_uio:
> >
> > lspci -s 00:06.0 -vvv | grep driver
> >
> > Kernel driver in use: igb_uio
> >
> >
> > However, I couldn't start testpmd and it hang at the the last line
> > below
> > "PMD: rte_eh_dev_config_restore.."
> >
> > ...
> >
> > EAL: PCI device :00:05.0 on NUMA socket -1
> >
> > EAL:   probe driver: 1af4:1000 rte_virtio_pmd
> >
> > EAL:   Device is blacklisted, not initializing
> >
> > EAL: PCI device :00:06.0 on NUMA socket -1
> >
> > EAL:   probe driver: 1af4:1000 rte_virtio_pmd
> >
> > EAL: PCI device :00:07.0 on NUMA socket -1
> >
> > EAL:   probe driver: 1af4:1000 rte_virtio_pmd
> >
> > Interactive-mode selected
> >
> > Set mac packet forwarding mode
> >
> > Configuring Port 0 (socket 0)
> >
> > PMD: rte_eth_dev_config_restore: port 0: MAC address array not
> > supported
> >
> >
> > If I do not bind interface to igb_uio, testpmd can start successfully
> > which also shows "probe driver: 1af4:1000 rte_virtio_pmd" during
> > starting process. However, even after testpmd started, virtio devices
> > are bound to nothing ("lspci -s 00:06.0 -vvv | grep driver" shows nothing).
> >
> >
> > I am also attaching my virtual guest configuration below. Thanks for
> > your help. Highly appreciate!!
> >
> >
> >
> > lab at vpc-2:~$ ps aux | grep qemu
> >
> > libvirt+ 12020  228  0.0 102832508 52860 ? Sl   14:54  61:06 
> > *qemu*-system-
> x86_64
> > -enable-kvm -name dpdk-perftest -S -machine
> > pc-i440fx-trusty,accel=kvm,usb=off,mem-merge=off -cpu host -m 98304
> > -mem-prealloc -mem-path /dev/hugepages/libvirt/*qemu* -realtime
> > mlock=off -smp 24,sockets=2,cores=12,threads=1 -numa
> > node,nodeid=0,cpus=0-11,mem=49152 -numa
> > node,nodeid=1,cpus=12-23,mem=49152
> > -uuid eb5f8848-9983-4f13-983c-e3bd4c59387d -no-user-config -nodefaults
> > -chardev
> > socket,id=charmonitor,path=/var/lib/libvirt/*qemu*/dpdk-perftest.monit
> > or,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -
> rtc
> > base=utc -no-shutdown -boot strict=on -device
> > piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive
> > file=/var/lib/libvirt/images/dpdk-perftest-hda.img,if=none,id=drive-id
> > e0-0-0,format=qcow2
> > -device
> > ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1
> > -drive
> > file=/var/lib/libvirt/images/dpdk-perftest-hdb.img,if=none,id=drive-id
> > e0-0-1,format=qcow2 -device
> > ide-hd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 -drive
> > if=none,id=drive-ide0-1-0,readonly=on,format=raw -device
> > ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=2
> > -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=25 -device
> > virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:45:ff:5e,bus=pci.0
> > ,addr=0x5
> > -netdev
> > tap,fds=26:27:28:29:30:31:32:33,id=hostnet1,vhost=on,vhostfds=34:35:36
> > :37:38:39:40:41
> > -device
> > virtio-net-pci,mq=on,vectors=17,netdev=hostnet1,id=net1,mac=52:54:00:7
> > e:b5:6b,bus=pci.0,addr=0x6
> > -netdev
> > tap,fds=42:43:44:45:46:47:48:49,id=hostnet2,vhost=on,vhostfds=50:51:52
> > :53:54:55:56:57
> > -device
> > virtio-net-pci,mq=on,vectors=17,netdev=hostnet2,id=net2,mac=52:54:00:f
> > 1:a5:20,bus=pci.0,addr=0x7
> > -chardev pty,id=charserial0 -device
> > isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1
> > -device isa-serial,chardev=charserial1,id=serial1 -vnc 127.0.0.1:0
> > -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device
> > i6300esb,id=watchdog0,bus=pci.0,addr=0x3 -watchdog-action reset
> > -device
> > virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4
> >


[dpdk-dev] [PATCH 4/5] virtio: free queue memory in virtio_dev_close()

2015-07-15 Thread Ouyang, Changchun


> -Original Message-
> From: Iremonger, Bernard
> Sent: Wednesday, July 15, 2015 4:27 PM
> To: Stephen Hemminger
> Cc: dev at dpdk.org; Ouyang, Changchun
> Subject: RE: [PATCH 4/5] virtio: free queue memory in virtio_dev_close()
> 
> > -Original Message-
> > From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> > Sent: Tuesday, July 14, 2015 7:28 PM
> > To: Iremonger, Bernard
> > Cc: dev at dpdk.org; Ouyang, Changchun
> > Subject: Re: [PATCH 4/5] virtio: free queue memory in virtio_dev_close()
> >
> > On Tue, 14 Jul 2015 14:10:58 +0100
> > Bernard Iremonger  wrote:
> >
> > >  static void
> > > +virtio_free_queues(struct rte_eth_dev *dev) {
> > > + unsigned int i;
> > > +
> > > + for (i = 0; i < dev->data->nb_rx_queues; i++) {
> > > + virtio_dev_rx_queue_release(dev->data->rx_queues[i]);
> > > + dev->data->rx_queues[i] = NULL;
> > > + }
> > > + dev->data->nb_rx_queues = 0;
> > > +
> > > + for (i = 0; i < dev->data->nb_tx_queues; i++) {
> > > + virtio_dev_tx_queue_release(dev->data->tx_queues[i]);
> > > + dev->data->tx_queues[i] = NULL;
> > > + }
> > > + dev->data->nb_tx_queues = 0;
> > > +}
> > > +
> >
> > Where does command queue get freed?
> 
> The command queue is set up in the eth_virtio_dev_init() function and
> freed in the eth_virtio_dev_uninit() function.
> 

Do you mean control vq?

> Regards,
> 
> Bernard.



[dpdk-dev] [PATCH 4/5] virtio: free queue memory in virtio_dev_close()

2015-07-15 Thread Ouyang, Changchun
Hi, Bernard

> -Original Message-
> From: Iremonger, Bernard
> Sent: Wednesday, July 15, 2015 4:02 PM
> To: Ouyang, Changchun; dev at dpdk.org
> Cc: Xu, Qian Q; stephen at networkplumber.org
> Subject: RE: [PATCH 4/5] virtio: free queue memory in virtio_dev_close()
> 
> Hi  Ouyang,
> 
> 
> 
> > > --- a/drivers/net/virtio/virtio_ethdev.c
> > > +++ b/drivers/net/virtio/virtio_ethdev.c
> > > @@ -438,6 +438,24 @@ virtio_dev_cq_queue_setup(struct rte_eth_dev
> > > *dev, uint16_t vtpci_queue_idx,  }
> > >
> > >  static void
> > > +virtio_free_queues(struct rte_eth_dev *dev) {
> > > + unsigned int i;
> > > +
> > > + for (i = 0; i < dev->data->nb_rx_queues; i++) {
> > > + virtio_dev_rx_queue_release(dev->data->rx_queues[i]);
> > > + dev->data->rx_queues[i] = NULL;
> > > + }
> > > + dev->data->nb_rx_queues = 0;
> > > +
> > > + for (i = 0; i < dev->data->nb_tx_queues; i++) {
> > > + virtio_dev_tx_queue_release(dev->data->tx_queues[i]);
> > > + dev->data->tx_queues[i] = NULL;
> > > + }
> > > + dev->data->nb_tx_queues = 0;
> > > +}
> > > +
> > > +static void
> > >  virtio_dev_close(struct rte_eth_dev *dev)  {
> > >   struct virtio_hw *hw = dev->data->dev_private; @@ -451,6 +469,7
> > @@
> > > virtio_dev_close(struct rte_eth_dev *dev)
> > >   vtpci_reset(hw);
> > >   hw->started = 0;
> > >   virtio_dev_free_mbufs(dev);
> > > + virtio_free_queues(dev);
> >
> > Validate it with vhost sample or not for this change?
> 
> I have tested this change with testpmd on a Fedora VM.

I think we should make sure it will not break any current test case for virtio.
So before applying it, it needs to be tested with the vhost sample on the host
and the virtio driver in the guest.

Thanks
Changchun



[dpdk-dev] [PATCH 4/5] virtio: free queue memory in virtio_dev_close()

2015-07-15 Thread Ouyang, Changchun


> -Original Message-
> From: Iremonger, Bernard
> Sent: Tuesday, July 14, 2015 9:11 PM
> To: dev at dpdk.org
> Cc: Ouyang, Changchun; stephen at networkplumber.org; Iremonger, Bernard
> Subject: [PATCH 4/5] virtio: free queue memory in virtio_dev_close()
> 
> Add function virtio_free_queues() and call from virtio_dev_close() Use
> virtio_dev_rx_queue_release() and virtio_dev_tx_queue_release()
> 
> Signed-off-by: Bernard Iremonger 
> ---
>  drivers/net/virtio/virtio_ethdev.c | 19 +++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/drivers/net/virtio/virtio_ethdev.c
> b/drivers/net/virtio/virtio_ethdev.c
> index b32b3e9..4676ab1 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -438,6 +438,24 @@ virtio_dev_cq_queue_setup(struct rte_eth_dev
> *dev, uint16_t vtpci_queue_idx,  }
> 
>  static void
> +virtio_free_queues(struct rte_eth_dev *dev) {
> + unsigned int i;
> +
> + for (i = 0; i < dev->data->nb_rx_queues; i++) {
> + virtio_dev_rx_queue_release(dev->data->rx_queues[i]);
> + dev->data->rx_queues[i] = NULL;
> + }
> + dev->data->nb_rx_queues = 0;
> +
> + for (i = 0; i < dev->data->nb_tx_queues; i++) {
> + virtio_dev_tx_queue_release(dev->data->tx_queues[i]);
> + dev->data->tx_queues[i] = NULL;
> + }
> + dev->data->nb_tx_queues = 0;
> +}
> +
> +static void
>  virtio_dev_close(struct rte_eth_dev *dev)  {
>   struct virtio_hw *hw = dev->data->dev_private; @@ -451,6 +469,7
> @@ virtio_dev_close(struct rte_eth_dev *dev)
>   vtpci_reset(hw);
>   hw->started = 0;
>   virtio_dev_free_mbufs(dev);
> + virtio_free_queues(dev);

Was this change validated with the vhost sample?

>  }
> 
>  static void
> --
> 1.9.1



[dpdk-dev] [PATCH] virtio: fix the vq size issue

2015-07-13 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Friday, July 10, 2015 10:12 PM
> To: Xie, Huawei
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] virtio: fix the vq size issue
> 
> 2015-07-10 14:05, Xie, Huawei:
> > Thomas:
> > Could we roll back that commit or apply Changchun's patch?
> 
> It is waiting an agreement with Changchun, symbolized by an Acked-by:

I think applying this patch is better than rolling back the previous commit.
Besides fixing the issue, this patch also removes an unnecessary assignment
inside that function.

> 
> 
> > On 7/1/2015 11:53 PM, Xie, Huawei wrote:
> > > On 7/1/2015 3:49 PM, Ouyang Changchun wrote:
> > >> This commit breaks virtio basic packets rx functionality:
> > >>   d78deadae4dca240e85054bf2d604a801676becc
> > >>
> > >> The QEMU use 256 as default vring size, also use this default value
> > >> to calculate the virtio avail ring base address and used ring base
> > >> address, and vhost in the backend use the ring base address to do
> packet IO.
> > >>
> > >> Virtio spec also says the queue size in PCI configuration is
> > >> read-only, so virtio front end can't change it. just need use the
> > >> read-only value to allocate space for vring and calculate the avail
> > >> and used ring base address. Otherwise, the avail and used ring base
> address will be different between host and guest, accordingly, packet IO
> can't work normally.
> > > virtio driver could still use the vq_size to initialize avail ring
> > > and use ring so that they still have the same base address.
> > > The other issue is vhost use  index & (vq->size -1) to index the ring.
> > >
> > >
> > > Thomas:
> > > This fix works but introduces slight change with original code.
> > > Could we just rollback that commit?
> > >
> > > d78deadae4dca240e85054bf2d604a801676becc
> > >
> > >
> > >> Signed-off-by: Changchun Ouyang 
> > >> ---
> > >>  drivers/net/virtio/virtio_ethdev.c | 14 +++---
> > >>  1 file changed, 3 insertions(+), 11 deletions(-)



[dpdk-dev] [PATCH] ixgbe: fix the issue that auto negotiation for flow control cannot be disabled

2015-07-09 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Wenzhuo Lu
> Sent: Wednesday, July 8, 2015 9:14 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] ixgbe: fix the issue that auto negotiation for
> flow control cannot be disabled
> 
> There's a parameter "autoneg on|off" in testpmd CLI "set flow_ctrl ...". This
> parameter is used to enable/disable auto negotiation for flow control. But 
> it's
> not supported yet.
> The auto negotiation is enabled by default, we have no way to disable it. This
> patch lets the parameter "autoneg on|off" be supported.
> 
> Signed-off-by: Wenzhuo Lu 

Acked-by: Changchun Ouyang 


[dpdk-dev] [PATCH v4 4/4] test-pmd: remove call to rte_eth_promiscuous_disable() from detach_port()

2015-07-08 Thread Ouyang, Changchun


> -Original Message-
> From: Iremonger, Bernard
> Sent: Tuesday, July 7, 2015 5:18 PM
> To: dev at dpdk.org
> Cc: Ouyang, Changchun; Iremonger, Bernard
> Subject: [PATCH v4 4/4] test-pmd: remove call to
> rte_eth_promiscuous_disable() from detach_port()
> 
> At this point the stop() and close() functions have already been called.
> The rte_eth_promiscuous_disable() function does not return on the VM.

I think we need to find the root cause of why it doesn't return on the VM.

> 
> Signed-off-by: Bernard Iremonger 
> ---
>  app/test-pmd/testpmd.c |4 +---
>  1 files changed, 1 insertions(+), 3 deletions(-)
> 
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
> 82b465d..4769533 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -1,7 +1,7 @@
>  /*-
>   *   BSD LICENSE
>   *
> - *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> + *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
>   *   All rights reserved.
>   *
>   *   Redistribution and use in source and binary forms, with or without
> @@ -1542,8 +1542,6 @@ detach_port(uint8_t port_id)
>   return;
>   }
> 
> - rte_eth_promiscuous_disable(port_id);
> -

It seems like a workaround.

>   if (rte_eth_dev_detach(port_id, name))
>   return;
> 
> --
> 1.7.4.1



[dpdk-dev] [PATCH] virtio: fix the vq size issue

2015-07-07 Thread Ouyang, Changchun

> -Original Message-
> From: Ouyang, Changchun
> Sent: Wednesday, July 1, 2015 3:49 PM
> To: dev at dpdk.org
> Cc: Cao, Waterman; Xu, Qian Q; Ouyang, Changchun
> Subject: [PATCH] virtio: fix the vq size issue
> 
> This commit breaks virtio basic packets rx functionality:
>   d78deadae4dca240e85054bf2d604a801676becc
> 
> The QEMU use 256 as default vring size, also use this default value to
> calculate the virtio avail ring base address and used ring base address, and
> vhost in the backend use the ring base address to do packet IO.
> 
> Virtio spec also says the queue size in PCI configuration is read-only, so 
> virtio
> front end can't change it. just need use the read-only value to allocate space
> for vring and calculate the avail and used ring base address. Otherwise, the
> avail and used ring base address will be different between host and guest,
> accordingly, packet IO can't work normally.
> 
> Signed-off-by: Changchun Ouyang 
> ---
>  drivers/net/virtio/virtio_ethdev.c | 14 +++---
>  1 file changed, 3 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/net/virtio/virtio_ethdev.c
> b/drivers/net/virtio/virtio_ethdev.c
> index fe5f9a1..d84de13 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -263,8 +263,6 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
>*/
>   vq_size = VIRTIO_READ_REG_2(hw, VIRTIO_PCI_QUEUE_NUM);
>   PMD_INIT_LOG(DEBUG, "vq_size: %d nb_desc:%d", vq_size,
> nb_desc);
> - if (nb_desc == 0)
> - nb_desc = vq_size;
>   if (vq_size == 0) {
>   PMD_INIT_LOG(ERR, "%s: virtqueue does not exist",
> __func__);
>   return -EINVAL;
> @@ -275,15 +273,9 @@ int virtio_dev_queue_setup(struct rte_eth_dev
> *dev,
>   return -EINVAL;
>   }
> 
> - if (nb_desc < vq_size) {
> - if (!rte_is_power_of_2(nb_desc)) {
> - PMD_INIT_LOG(ERR,
> -  "nb_desc(%u) size is not powerof 2",
> -  nb_desc);
> - return -EINVAL;
> - }
> - vq_size = nb_desc;
> - }
> + if (nb_desc != vq_size)
> + PMD_INIT_LOG(ERR, "Warning: nb_desc(%d) is not equal to
> vq size (%d), fall to vq size",
> + nb_desc, vq_size);
> 
>   if (queue_type == VTNET_RQ) {
>   snprintf(vq_name, sizeof(vq_name), "port%d_rvq%d",
> --
> 1.8.4.2

Any more comments for this patch?

Thanks
Changchun



[dpdk-dev] [PATCH v2 0/3] Fix vhost startup issue

2015-07-07 Thread Ouyang, Changchun

> -Original Message-
> From: Ouyang, Changchun
> Sent: Monday, July 6, 2015 10:27 AM
> To: dev at dpdk.org
> Cc: Xie, Huawei; Cao, Waterman; Xu, Qian Q; Ouyang, Changchun
> Subject: [PATCH v2 0/3] Fix vhost startup issue
> 
> The patch set fix vhost sample fails to start up in second time:
> 
> It should call api to unregister vhost driver when sample exit/quit, then the
> socket file will be removed(by calling unlink), and thus make vhost sample
> work correctly in second time startup.
> 
> It also adds/refines some log information.
> 
> Changchun Ouyang (3):
>   vhost: add log when failing to bind a socket
>   vhost: fix the comments and log
>   vhost: call api to unregister vhost driver
> 
>  examples/vhost/main.c| 16 ++--
>  lib/librte_vhost/vhost_user/vhost-net-user.c |  5 -
>  2 files changed, 18 insertions(+), 3 deletions(-)
> 
> --
> 1.8.4.2

Any more comments for this patch set?

Thanks
Changchun



[dpdk-dev] [PATCH v2 3/3] vhost: call api to unregister vhost driver

2015-07-06 Thread Ouyang Changchun
The following commit broke the vhost sample when it is run a second time:
292959c71961acde0cda6e77e737bb0a4df1559c

The sample should call the API to unregister the vhost driver when it exits/quits;
the socket file will then be removed (by calling unlink), letting the vhost sample
start correctly the second time.

Signed-off-by: Changchun Ouyang 
---
 examples/vhost/main.c | 12 
 1 file changed, 12 insertions(+)

change in v2:
 - refine the signal handler name

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 56a5c70..1b137b9 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -2871,6 +2871,16 @@ setup_mempool_tbl(int socket, uint32_t index, char 
*pool_name,
}
 }

+/* When we receive a INT signal, unregister vhost driver */
+static void
+sigint_handler(__rte_unused int signum)
+{
+   /* Unregister vhost driver. */
+   int ret = rte_vhost_driver_unregister((char *)&dev_basename);
+   if (ret != 0)
+   rte_exit(EXIT_FAILURE, "vhost driver unregister failure.\n");
+   exit(0);
+}

 /*
  * Main function, does initialisation and calls the per-lcore functions. The 
CUSE
@@ -2887,6 +2897,8 @@ main(int argc, char *argv[])
uint16_t queue_id;
static pthread_t tid;

+   signal(SIGINT, sigint_handler);
+
/* init EAL */
ret = rte_eal_init(argc, argv);
if (ret < 0)
-- 
1.8.4.2



[dpdk-dev] [PATCH v2 2/3] vhost: fix the comments and log

2015-07-06 Thread Ouyang Changchun
It fixes the wrong log info printed when unregistering the vhost driver fails.

Signed-off-by: Changchun Ouyang 
---
 examples/vhost/main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

change in v2:
  - refine the comment
  - fix checkpatch issue

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 7863dcf..56a5c70 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -3051,10 +3051,10 @@ main(int argc, char *argv[])
if (mergeable == 0)
rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_MRG_RXBUF);

-   /* Register CUSE device to handle IOCTLs. */
+   /* Register vhost(cuse or user) driver to handle vhost messages. */
ret = rte_vhost_driver_register((char *)&dev_basename);
if (ret != 0)
-   rte_exit(EXIT_FAILURE,"CUSE device setup failure.\n");
+   rte_exit(EXIT_FAILURE, "vhost driver register failure.\n");

rte_vhost_driver_callback_register(&virtio_net_device_ops);

-- 
1.8.4.2



[dpdk-dev] [PATCH v2 1/3] vhost: add log when failing to bind a socket

2015-07-06 Thread Ouyang Changchun
It adds more readable log info when a socket fails to bind to the local socket
file name.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_vhost/vhost_user/vhost-net-user.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c 
b/lib/librte_vhost/vhost_user/vhost-net-user.c
index 87a4711..f406a94 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.c
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
@@ -122,8 +122,11 @@ uds_socket(const char *path)
un.sun_family = AF_UNIX;
snprintf(un.sun_path, sizeof(un.sun_path), "%s", path);
ret = bind(sockfd, (struct sockaddr *)&un, sizeof(un));
-   if (ret == -1)
+   if (ret == -1) {
+   RTE_LOG(ERR, VHOST_CONFIG, "fail to bind fd:%d, remove file:%s 
and try again.\n",
+   sockfd, path);
goto err;
+   }
RTE_LOG(INFO, VHOST_CONFIG, "bind to %s\n", path);

ret = listen(sockfd, MAX_VIRTIO_BACKLOG);
-- 
1.8.4.2



[dpdk-dev] [PATCH v2 0/3] Fix vhost startup issue

2015-07-06 Thread Ouyang Changchun
This patch set fixes the vhost sample failing to start up a second time:

The sample should call the API to unregister the vhost driver when it exits/quits;
the socket file will then be removed (by calling unlink), letting the vhost sample
start correctly the second time.

It also adds/refines some log information.

Changchun Ouyang (3):
  vhost: add log when failing to bind a socket
  vhost: fix the comments and log
  vhost: call api to unregister vhost driver

 examples/vhost/main.c| 16 ++--
 lib/librte_vhost/vhost_user/vhost-net-user.c |  5 -
 2 files changed, 18 insertions(+), 3 deletions(-)

-- 
1.8.4.2



[dpdk-dev] [PATCH 3/3] vhost: call api to unregister vhost driver

2015-07-03 Thread Ouyang, Changchun


> -Original Message-
> From: Xie, Huawei
> Sent: Friday, July 3, 2015 12:04 AM
> To: Ouyang, Changchun; dev at dpdk.org
> Cc: Cao, Waterman; Xu, Qian Q
> Subject: Re: [PATCH 3/3] vhost: call api to unregister vhost driver
> 
> On 7/2/2015 11:33 AM, Ouyang, Changchun wrote:
> > The commit will break vhost sample when it runs in second time:
> > 292959c71961acde0cda6e77e737bb0a4df1559c
> >
> > It should call api to unregister vhost driver when sample exit/quit,
> > then the socket file will be removed(by calling unlink), and thus make
> > vhost sample work correctly in second time startup.
> >
> > Signed-off-by: Changchun Ouyang 
> > ---
> >  examples/vhost/main.c | 18 ++
> >  1 file changed, 18 insertions(+)
> >
> > diff --git a/examples/vhost/main.c b/examples/vhost/main.c index
> > 72c4773..90666b3 100644
> > --- a/examples/vhost/main.c
> > +++ b/examples/vhost/main.c
> > @@ -2871,6 +2871,16 @@ setup_mempool_tbl(int socket, uint32_t index,
> char *pool_name,
> > }
> >  }
> >
> > +/* When we receive a HUP signal, unregister vhost driver */ static
> > +void sighup_handler(__rte_unused int signum) {
> > +   /* Unregister vhost driver. */
> > +   int ret = rte_vhost_driver_unregister((char *)&dev_basename);
> > +   if (ret != 0)
> > +   rte_exit(EXIT_FAILURE, "vhost driver unregister failure.\n");
> > +   exit(0);
> > +}
> >
> >  /*
> >   * Main function, does initialisation and calls the per-lcore
> > functions. The CUSE @@ -2887,6 +2897,8 @@ main(int argc, char *argv[])
> > uint16_t queue_id;
> > static pthread_t tid;
> >
> > +   signal(SIGINT, sighup_handler);
> > +
> 
> ignore if duplicated.
> sighup -> sigint

Make sense, will update it in v2

> 
> > /* init EAL */
> > ret = rte_eal_init(argc, argv);
> > if (ret < 0)
> > @@ -3060,6 +3072,12 @@ main(int argc, char *argv[])
> >
> > /* Start CUSE session. */
> > rte_vhost_driver_session_start();
> > +
> > +   /* Unregister vhost driver. */
> > +   ret = rte_vhost_driver_unregister((char *)&dev_basename);
> > +   if (ret != 0)
> > +   rte_exit(EXIT_FAILURE,"vhost driver unregister failure.\n");
> > +
> > return 0;
> >
> >  }



[dpdk-dev] [PATCH 3/3] vhost: call api to unregister vhost driver

2015-07-03 Thread Ouyang, Changchun

> -Original Message-
> From: Xie, Huawei
> Sent: Thursday, July 2, 2015 5:38 PM
> To: Ouyang, Changchun; dev at dpdk.org
> Cc: Cao, Waterman; Xu, Qian Q
> Subject: Re: [PATCH 3/3] vhost: call api to unregister vhost driver
> 
> On 7/2/2015 11:33 AM, Ouyang, Changchun wrote:
> >
> > /* Start CUSE session. */
> > rte_vhost_driver_session_start();
> > +
> > +   /* Unregister vhost driver. */
> > +   ret = rte_vhost_driver_unregister((char *)&dev_basename);
> > +   if (ret != 0)
> > +   rte_exit(EXIT_FAILURE,"vhost driver unregister failure.\n");
> > +
> Better remove the above code.
> It is duplicated with signal handler and actually
> rte_vhost_driver_session_start never returns.

How about calling one function to replace the code snippet?
I think we need the unregister there; it gives us a clear example of what a vhost
lib caller needs to do at the ramp-down stage.
Maybe "never returns" will change some day.

> 
> > return 0;
> >
> >  }



[dpdk-dev] [PATCH 1/3] vhost: add log if fails to bind a socket

2015-07-03 Thread Ouyang, Changchun


> -Original Message-
> From: Xie, Huawei
> Sent: Thursday, July 2, 2015 5:29 PM
> To: Ouyang, Changchun; dev at dpdk.org
> Cc: Cao, Waterman; Xu, Qian Q
> Subject: Re: [PATCH 1/3] vhost: add log if fails to bind a socket
> 
> On 7/2/2015 11:33 AM, Ouyang, Changchun wrote:
> > It adds more readable log info if a socket fails to bind to local device 
> > file
> name.
> local socket file, not device file. :).

Make sense, will update it, thanks

> >
> > Signed-off-by: Changchun Ouyang 
> > ---
> >  lib/librte_vhost/vhost_user/vhost-net-user.c | 5 -
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c
> b/lib/librte_vhost/vhost_user/vhost-net-user.c
> > index 87a4711..f406a94 100644
> > --- a/lib/librte_vhost/vhost_user/vhost-net-user.c
> > +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
> > @@ -122,8 +122,11 @@ uds_socket(const char *path)
> > un.sun_family = AF_UNIX;
> > snprintf(un.sun_path, sizeof(un.sun_path), "%s", path);
> > ret = bind(sockfd, (struct sockaddr *)&un, sizeof(un));
> > -   if (ret == -1)
> > +   if (ret == -1) {
> > +   RTE_LOG(ERR, VHOST_CONFIG, "fail to bind fd:%d, remove
> file:%s and try again.\n",
> > +   sockfd, path);
> > goto err;
> > +   }
> > RTE_LOG(INFO, VHOST_CONFIG, "bind to %s\n", path);
> >
> > ret = listen(sockfd, MAX_VIRTIO_BACKLOG);



[dpdk-dev] [PATCH 2/3] vhost: fix the comments and log

2015-07-03 Thread Ouyang, Changchun


> -Original Message-
> From: Xie, Huawei
> Sent: Thursday, July 2, 2015 5:25 PM
> To: Ouyang, Changchun; dev at dpdk.org
> Cc: Cao, Waterman; Xu, Qian Q
> Subject: Re: [PATCH 2/3] vhost: fix the comments and log
> 
> 
> On 7/2/2015 11:33 AM, Ouyang, Changchun wrote:
> > It fixes the wrong log info when fails to unregister vhost driver.
> >
> > Signed-off-by: Changchun Ouyang 
> > ---
> >  examples/vhost/main.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/examples/vhost/main.c b/examples/vhost/main.c index
> > 7863dcf..72c4773 100644
> > --- a/examples/vhost/main.c
> > +++ b/examples/vhost/main.c
> > @@ -3051,10 +3051,10 @@ main(int argc, char *argv[])
> > if (mergeable == 0)
> > rte_vhost_feature_disable(1ULL <<
> VIRTIO_NET_F_MRG_RXBUF);
> >
> > -   /* Register CUSE device to handle IOCTLs. */
> > +   /* Register vhost driver to handle IOCTLs. */
> 
> Also update IOCTLS.
> or:  register vhost [cuse or user] driver to handle vhost message.

Make sense, will update it, thanks

> > ret = rte_vhost_driver_register((char *)&dev_basename);
> > if (ret != 0)
> > -   rte_exit(EXIT_FAILURE,"CUSE device setup failure.\n");
> > +   rte_exit(EXIT_FAILURE,"vhost driver register failure.\n");
> >
> > rte_vhost_driver_callback_register(&virtio_net_device_ops);
> >



[dpdk-dev] [PATCH] virtio: fix the vq size issue

2015-07-03 Thread Ouyang, Changchun


> -Original Message-
> From: Xie, Huawei
> Sent: Thursday, July 2, 2015 5:16 PM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org; Thomas Monjalon
> Subject: Re: [dpdk-dev] [PATCH] virtio: fix the vq size issue
> 
> On 7/2/2015 10:16 AM, Ouyang, Changchun wrote:
> >
> >> -Original Message-
> >> From: Xie, Huawei
> >> Sent: Thursday, July 2, 2015 10:02 AM
> >> To: Ouyang, Changchun; dev at dpdk.org; Thomas Monjalon
> >> Subject: Re: [dpdk-dev] [PATCH] virtio: fix the vq size issue
> >>
> >> On 7/2/2015 8:29 AM, Ouyang, Changchun wrote:
> >>> Hi huawei,
> >>>
> >>>> -Original Message-
> >>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Xie, Huawei
> >>>> Sent: Wednesday, July 1, 2015 11:53 PM
> >>>> To: dev at dpdk.org; Thomas Monjalon
> >>>> Subject: Re: [dpdk-dev] [PATCH] virtio: fix the vq size issue
> >>>>
> >>>> On 7/1/2015 3:49 PM, Ouyang Changchun wrote:
> >>>>> This commit breaks virtio basic packets rx functionality:
> >>>>>   d78deadae4dca240e85054bf2d604a801676becc
> >>>>>
> >>>>> The QEMU use 256 as default vring size, also use this default
> >>>>> value to calculate the virtio avail ring base address and used
> >>>>> ring base address, and vhost in the backend use the ring base
> >>>>> address to do packet
> >>>> IO.
> >>>>> Virtio spec also says the queue size in PCI configuration is
> >>>>> read-only, so virtio front end can't change it. just need use the
> >>>>> read-only value to allocate space for vring and calculate the
> >>>>> avail and used ring base address. Otherwise, the avail and used
> >>>>> ring base
> >>>> address will be different between host and guest, accordingly,
> >>>> packet IO can't work normally.
> >>>> virtio driver could still use the vq_size to initialize avail ring
> >>>> and use ring so that they still have the same base address.
> >>>> The other issue is vhost use  index & (vq->size -1) to index the ring.
> >>> I am not sure what is your clear message here, Vhost has no choice
> >>> but use vq->size -1 to index the ring, It is qemu that always use
> >>> 256 as the vq size, and set the avail and used ring base address, It
> >>> also tells vhost the vq size is 256.
> >> I mean "the same base address issue" could be resolved, but we still
> >> couldn't stop vhost using idx & vq->size -1 to index the ring.
> >>
> > Then this patch will resolve this avail ring base address issue.
> I mean different ring base isn't the root cause. The commit message which
> states that this register is read only is simple and enough.  

The direct root cause is the avail ring base address issue.
The virtio front end uses: vring->avail = vring->desc + vq_size * SIZE_OF_DESC_ELEMENT,
and fills vring->avail->avail_idx and the ring itself.
QEMU uses: vring->avail = vring->desc + 256 * SIZE_OF_DESC_ELEMENT,
and tells vhost this address; vhost uses the address to enqueue packets from the
phy port into the vring.

Please note that if vq_size is not 256, e.g. it is changed to 128, then vring->avail
in the host and in the guest is totally different. That is why no packet can be
received: the two sides use different addresses to access the same content in that
space.

This is why I still think it is the root cause and I need to add it to the commit
message.

> 
> >>>> Thomas:
> >>>> This fix works but introduces slight change with original code.
> >>>> Could we just rollback that commit?
> >>> What's your major concern for the slight change here?
> >>> just removing the unnecessary check for nb_desc itself.
> >>> So I think no issue for the slight change.
> >> No major concern. It is better if this patch just rollbacks that
> >> commit without introduce extra change if not necessary.
> >> The original code set nb_desc to vq_size, though it isn't used later.
> >>
> > I prefer to have the slight change to remove unnecessary setting.
> >
> >>> Thanks
> >>> Changchun
> >>>
> >>>
> >>>
> >>>
> >
> >



[dpdk-dev] [PATCH 3/3] vhost: call api to unregister vhost driver

2015-07-02 Thread Ouyang Changchun
The commit below will break the vhost sample when it is run a second time:
292959c71961acde0cda6e77e737bb0a4df1559c

The sample should call the API to unregister the vhost driver when it exits/quits;
the socket file will then be removed (by calling unlink), letting the vhost sample
start correctly the second time.

Signed-off-by: Changchun Ouyang 
---
 examples/vhost/main.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 72c4773..90666b3 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -2871,6 +2871,16 @@ setup_mempool_tbl(int socket, uint32_t index, char 
*pool_name,
}
 }

+/* When we receive a HUP signal, unregister vhost driver */
+static void
+sighup_handler(__rte_unused int signum)
+{
+   /* Unregister vhost driver. */
+   int ret = rte_vhost_driver_unregister((char *)&dev_basename);
+   if (ret != 0)
+   rte_exit(EXIT_FAILURE, "vhost driver unregister failure.\n");
+   exit(0);
+}

 /*
  * Main function, does initialisation and calls the per-lcore functions. The 
CUSE
@@ -2887,6 +2897,8 @@ main(int argc, char *argv[])
uint16_t queue_id;
static pthread_t tid;

+   signal(SIGINT, sighup_handler);
+
/* init EAL */
ret = rte_eal_init(argc, argv);
if (ret < 0)
@@ -3060,6 +3072,12 @@ main(int argc, char *argv[])

/* Start CUSE session. */
rte_vhost_driver_session_start();
+
+   /* Unregister vhost driver. */
+   ret = rte_vhost_driver_unregister((char *)&dev_basename);
+   if (ret != 0)
+   rte_exit(EXIT_FAILURE,"vhost driver unregister failure.\n");
+
return 0;

 }
-- 
1.8.4.2



[dpdk-dev] [PATCH 2/3] vhost: fix the comments and log

2015-07-02 Thread Ouyang Changchun
It fixes the wrong log info printed when unregistering the vhost driver fails.

Signed-off-by: Changchun Ouyang 
---
 examples/vhost/main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 7863dcf..72c4773 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -3051,10 +3051,10 @@ main(int argc, char *argv[])
if (mergeable == 0)
rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_MRG_RXBUF);

-   /* Register CUSE device to handle IOCTLs. */
+   /* Register vhost driver to handle IOCTLs. */
ret = rte_vhost_driver_register((char *)&dev_basename);
if (ret != 0)
-   rte_exit(EXIT_FAILURE,"CUSE device setup failure.\n");
+   rte_exit(EXIT_FAILURE,"vhost driver register failure.\n");

rte_vhost_driver_callback_register(&virtio_net_device_ops);

-- 
1.8.4.2



[dpdk-dev] [PATCH 1/3] vhost: add log if fails to bind a socket

2015-07-02 Thread Ouyang Changchun
It adds more readable log info if a socket fails to bind to the local device file
name.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_vhost/vhost_user/vhost-net-user.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c 
b/lib/librte_vhost/vhost_user/vhost-net-user.c
index 87a4711..f406a94 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.c
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
@@ -122,8 +122,11 @@ uds_socket(const char *path)
un.sun_family = AF_UNIX;
snprintf(un.sun_path, sizeof(un.sun_path), "%s", path);
ret = bind(sockfd, (struct sockaddr *)&un, sizeof(un));
-   if (ret == -1)
+   if (ret == -1) {
+   RTE_LOG(ERR, VHOST_CONFIG, "fail to bind fd:%d, remove file:%s 
and try again.\n",
+   sockfd, path);
goto err;
+   }
RTE_LOG(INFO, VHOST_CONFIG, "bind to %s\n", path);

ret = listen(sockfd, MAX_VIRTIO_BACKLOG);
-- 
1.8.4.2



[dpdk-dev] [PATCH 0/3] Fix vhost startup issue

2015-07-02 Thread Ouyang Changchun
The commit below breaks the vhost sample when it is run a second time:
292959c71961acde0cda6e77e737bb0a4df1559c

The sample should call the API to unregister the vhost driver when it exits/quits;
the socket file will then be removed (by calling unlink), letting the vhost sample
start correctly the second time.

Also add/refine some log information.

Changchun Ouyang (3):
  vhost: add log if fails to bind a socket
  vhost: fix the comments and log
  vhost: call api to unregister vhost driver

 examples/vhost/main.c| 22 --
 lib/librte_vhost/vhost_user/vhost-net-user.c |  5 -
 2 files changed, 24 insertions(+), 3 deletions(-)

-- 
1.8.4.2



[dpdk-dev] [PATCH] virtio: fix the vq size issue

2015-07-02 Thread Ouyang, Changchun


> -Original Message-
> From: Xie, Huawei
> Sent: Thursday, July 2, 2015 10:02 AM
> To: Ouyang, Changchun; dev at dpdk.org; Thomas Monjalon
> Subject: Re: [dpdk-dev] [PATCH] virtio: fix the vq size issue
> 
> On 7/2/2015 8:29 AM, Ouyang, Changchun wrote:
> > Hi huawei,
> >
> >> -Original Message-
> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Xie, Huawei
> >> Sent: Wednesday, July 1, 2015 11:53 PM
> >> To: dev at dpdk.org; Thomas Monjalon
> >> Subject: Re: [dpdk-dev] [PATCH] virtio: fix the vq size issue
> >>
> >> On 7/1/2015 3:49 PM, Ouyang Changchun wrote:
> >>> This commit breaks virtio basic packets rx functionality:
> >>>   d78deadae4dca240e85054bf2d604a801676becc
> >>>
> >>> The QEMU use 256 as default vring size, also use this default value
> >>> to calculate the virtio avail ring base address and used ring base
> >>> address, and vhost in the backend use the ring base address to do
> >>> packet
> >> IO.
> >>> Virtio spec also says the queue size in PCI configuration is
> >>> read-only, so virtio front end can't change it. just need use the
> >>> read-only value to allocate space for vring and calculate the avail
> >>> and used ring base address. Otherwise, the avail and used ring base
> >> address will be different between host and guest, accordingly, packet
> >> IO can't work normally.
> >> virtio driver could still use the vq_size to initialize avail ring
> >> and use ring so that they still have the same base address.
> >> The other issue is vhost use  index & (vq->size -1) to index the ring.
> > I am not sure what is your clear message here, Vhost has no choice but
> > use vq->size -1 to index the ring, It is qemu that always use 256 as
> > the vq size, and set the avail and used ring base address, It also
> > tells vhost the vq size is 256.
> 
> I mean "the same base address issue" could be resolved, but we still couldn't
> stop vhost using idx & vq->size -1 to index the ring.
> 

Then this patch will resolve this avail ring base address issue.

> >>
> >> Thomas:
> >> This fix works but introduces slight change with original code. Could
> >> we just rollback that commit?
> > What's your major concern for the slight change here?
> > just removing the unnecessary check for nb_desc itself.
> > So I think no issue for the slight change.
> 
> No major concern. It is better if this patch just rollbacks that commit 
> without
> introduce extra change if not necessary.
> The original code set nb_desc to vq_size, though it isn't used later.
> 
I prefer to have the slight change to remove unnecessary setting.

> >
> > Thanks
> > Changchun
> >
> >
> >
> >



[dpdk-dev] [PATCH] virtio: fix the vq size issue

2015-07-02 Thread Ouyang, Changchun
Hi huawei,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Xie, Huawei
> Sent: Wednesday, July 1, 2015 11:53 PM
> To: dev at dpdk.org; Thomas Monjalon
> Subject: Re: [dpdk-dev] [PATCH] virtio: fix the vq size issue
> 
> On 7/1/2015 3:49 PM, Ouyang Changchun wrote:
> > This commit breaks virtio basic packets rx functionality:
> >   d78deadae4dca240e85054bf2d604a801676becc
> >
> > The QEMU use 256 as default vring size, also use this default value to
> > calculate the virtio avail ring base address and used ring base
> > address, and vhost in the backend use the ring base address to do packet
> IO.
> >
> > Virtio spec also says the queue size in PCI configuration is
> > read-only, so virtio front end can't change it. just need use the
> > read-only value to allocate space for vring and calculate the avail
> > and used ring base address. Otherwise, the avail and used ring base
> address will be different between host and guest, accordingly, packet IO
> can't work normally.
> virtio driver could still use the vq_size to initialize avail ring and use 
> ring so
> that they still have the same base address.
> The other issue is vhost use  index & (vq->size -1) to index the ring.

I am not sure what your message is here.
Vhost has no choice but to use idx & (vq->size - 1) to index the ring;
it is QEMU that always uses 256 as the vq size and sets the avail and used ring
base addresses. It also tells vhost the vq size is 256.

> 
> 
> Thomas:
> This fix works but introduces slight change with original code. Could we just
> rollback that commit?
What's your major concern with the slight change here?
It just removes the unnecessary check on nb_desc itself,
so I see no issue with the slight change.

Thanks
Changchun




[dpdk-dev] [PATCH] virtio: fix the vq size issue

2015-07-01 Thread Ouyang Changchun
This commit breaks basic virtio packet rx functionality:
  d78deadae4dca240e85054bf2d604a801676becc

QEMU uses 256 as the default vring size, and also uses this default value to
calculate the virtio avail ring base address and used ring base address; vhost in
the backend uses the ring base address to do packet IO.

The virtio spec also says the queue size in the PCI configuration is read-only, so
the virtio front end can't change it; it just needs to use the read-only value to
allocate space for the vring and calculate the avail and used ring base addresses.
Otherwise, the avail and used ring base addresses will differ between host and
guest, and packet IO can't work normally.

Signed-off-by: Changchun Ouyang 
---
 drivers/net/virtio/virtio_ethdev.c | 14 +++---
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index fe5f9a1..d84de13 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -263,8 +263,6 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
 */
vq_size = VIRTIO_READ_REG_2(hw, VIRTIO_PCI_QUEUE_NUM);
PMD_INIT_LOG(DEBUG, "vq_size: %d nb_desc:%d", vq_size, nb_desc);
-   if (nb_desc == 0)
-   nb_desc = vq_size;
if (vq_size == 0) {
PMD_INIT_LOG(ERR, "%s: virtqueue does not exist", __func__);
return -EINVAL;
@@ -275,15 +273,9 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
return -EINVAL;
}

-   if (nb_desc < vq_size) {
-   if (!rte_is_power_of_2(nb_desc)) {
-   PMD_INIT_LOG(ERR,
-"nb_desc(%u) size is not powerof 2",
-nb_desc);
-   return -EINVAL;
-   }
-   vq_size = nb_desc;
-   }
+   if (nb_desc != vq_size)
+   PMD_INIT_LOG(ERR, "Warning: nb_desc(%d) is not equal to vq size (%d), fall to vq size",
+   nb_desc, vq_size);

if (queue_type == VTNET_RQ) {
snprintf(vq_name, sizeof(vq_name), "port%d_rvq%d",
-- 
1.8.4.2



[dpdk-dev] [PATCH v4 4/4] vhost: add comment for potential unwanted callback on listenfds

2015-07-01 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Huawei Xie
> Sent: Tuesday, June 30, 2015 5:21 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v4 4/4] vhost: add comment for potential
> unwanted callback on listenfds
> 
> add comment for potential unwanted callback on listenfds
> 
> v4 changes:
> add comment for potential unwanted callback on listenfds
> 
> Signed-off-by: Huawei Xie 

Acked-by: Changchun Ouyang 

> ---
>  lib/librte_vhost/vhost_user/fd_man.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/lib/librte_vhost/vhost_user/fd_man.c
> b/lib/librte_vhost/vhost_user/fd_man.c
> index bd30f8d..d68b270 100644
> --- a/lib/librte_vhost/vhost_user/fd_man.c
> +++ b/lib/librte_vhost/vhost_user/fd_man.c
> @@ -242,6 +242,13 @@ fdset_event_dispatch(struct fdset *pfdset)
> 
>   pthread_mutex_unlock(&pfdset->fd_mutex);
> 
> + /*
> +  * When select is blocked, other threads might unregister
> +  * listenfds from and register new listenfds into fdset.
> +  * When select returns, the entries for listenfds in the fdset
> +  * might have been updated. It is ok if there is unwanted call
> +  * for new listenfds.
> +  */
>   ret = select(maxfds + 1, &rfds, &wfds, NULL, &tv);
>   if (ret <= 0)
>   continue;
> --
> 1.8.1.4



[dpdk-dev] [PATCH v4 3/4] vhost: version map file update

2015-07-01 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Huawei Xie
> Sent: Tuesday, June 30, 2015 5:21 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v4 3/4] vhost: version map file update
> 
> update version map file for rte_vhost_driver_unregister API
> 
> v3 changes:
> update version map file
> 
> Signed-off-by: Huawei Xie 

Acked-by: Changchun Ouyang 

> ---
>  lib/librte_vhost/rte_vhost_version.map | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/lib/librte_vhost/rte_vhost_version.map
> b/lib/librte_vhost/rte_vhost_version.map
> index 163dde0..fb6bb9e 100644
> --- a/lib/librte_vhost/rte_vhost_version.map
> +++ b/lib/librte_vhost/rte_vhost_version.map
> @@ -13,3 +13,11 @@ DPDK_2.0 {
> 
>   local: *;
>  };
> +
> +DPDK_2.1 {
> + global:
> +
> + rte_vhost_driver_unregister;
> +
> + local: *;
> +} DPDK_2.0;
> --
> 1.8.1.4



[dpdk-dev] [PATCH v4 2/4] vhost: vhost unix domain socket cleanup

2015-07-01 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Huawei Xie
> Sent: Tuesday, June 30, 2015 5:21 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v4 2/4] vhost: vhost unix domain socket cleanup
> 
> rte_vhost_driver_unregister API will remove the listenfd from event list, and
> then close it.
> 
> v2 changes:
> -minor code style fix, remove unnecessary new line
> 
> Signed-off-by: Huawei Xie 
> Signed-off-by: Peng Sun 

Acked-by: Changchun Ouyang 

> ---
>  lib/librte_vhost/rte_virtio_net.h|  3 ++
>  lib/librte_vhost/vhost_cuse/vhost-net-cdev.c |  9 
>  lib/librte_vhost/vhost_user/vhost-net-user.c | 68 +++-
>  lib/librte_vhost/vhost_user/vhost-net-user.h |  2 +-
>  4 files changed, 69 insertions(+), 13 deletions(-)
> 
> diff --git a/lib/librte_vhost/rte_virtio_net.h
> b/lib/librte_vhost/rte_virtio_net.h
> index 5d38185..5630fbc 100644
> --- a/lib/librte_vhost/rte_virtio_net.h
> +++ b/lib/librte_vhost/rte_virtio_net.h
> @@ -188,6 +188,9 @@ int rte_vhost_enable_guest_notification(struct
> virtio_net *dev, uint16_t queue_i
>  /* Register vhost driver. dev_name could be different for multiple instance
> support. */  int rte_vhost_driver_register(const char *dev_name);
> 
> +/* Unregister vhost driver. This is only meaningful to vhost user. */
> +int rte_vhost_driver_unregister(const char *dev_name);
> +
>  /* Register callbacks. */
>  int rte_vhost_driver_callback_register(struct virtio_net_device_ops const *
> const);
>  /* Start vhost driver session blocking loop. */
> diff --git a/lib/librte_vhost/vhost_cuse/vhost-net-cdev.c
> b/lib/librte_vhost/vhost_cuse/vhost-net-cdev.c
> index 6b68abf..1ae7c49 100644
> --- a/lib/librte_vhost/vhost_cuse/vhost-net-cdev.c
> +++ b/lib/librte_vhost/vhost_cuse/vhost-net-cdev.c
> @@ -405,6 +405,15 @@ rte_vhost_driver_register(const char *dev_name)
>  }
> 
>  /**
> + * An empty function for unregister
> + */
> +int
> +rte_vhost_driver_unregister(const char *dev_name __rte_unused)
> +{
> + return 0;
> +}
> +
> +/**
>   * The CUSE session is launched allowing the application to receive open,
>   * release and ioctl calls.
>   */
> diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c
> b/lib/librte_vhost/vhost_user/vhost-net-user.c
> index 31f1215..87a4711 100644
> --- a/lib/librte_vhost/vhost_user/vhost-net-user.c
> +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
> @@ -66,6 +66,8 @@ struct connfd_ctx {
>  struct _vhost_server {
>   struct vhost_server *server[MAX_VHOST_SERVER];
>   struct fdset fdset;
> + int vserver_cnt;
> + pthread_mutex_t server_mutex;
>  };
> 
>  static struct _vhost_server g_vhost_server = {
> @@ -74,10 +76,10 @@ static struct _vhost_server g_vhost_server = {
>   .fd_mutex = PTHREAD_MUTEX_INITIALIZER,
>   .num = 0
>   },
> + .vserver_cnt = 0,
> + .server_mutex = PTHREAD_MUTEX_INITIALIZER,
>  };
> 
> -static int vserver_idx;
> -
>  static const char *vhost_message_str[VHOST_USER_MAX] = {
>   [VHOST_USER_NONE] = "VHOST_USER_NONE",
>   [VHOST_USER_GET_FEATURES] = "VHOST_USER_GET_FEATURES",
> @@ -427,7 +429,6 @@ vserver_message_handler(int connfd, void *dat, int
> *remove)
>   }
>  }
> 
> -
>  /**
>   * Creates and initialise the vhost server.
>   */
> @@ -436,34 +437,77 @@ rte_vhost_driver_register(const char *path)
>  {
>   struct vhost_server *vserver;
> 
> - if (vserver_idx == 0)
> + pthread_mutex_lock(&g_vhost_server.server_mutex);
> + if (ops == NULL)
>   ops = get_virtio_net_callbacks();
> - if (vserver_idx == MAX_VHOST_SERVER)
> +
> + if (g_vhost_server.vserver_cnt == MAX_VHOST_SERVER) {
> + RTE_LOG(ERR, VHOST_CONFIG,
> + "error: the number of servers reaches maximum\n");
> + pthread_mutex_unlock(&g_vhost_server.server_mutex);
>   return -1;
> + }
> 
>   vserver = calloc(sizeof(struct vhost_server), 1);
> - if (vserver == NULL)
> + if (vserver == NULL) {
> + pthread_mutex_unlock(&g_vhost_server.server_mutex);
>   return -1;
> -
> - unlink(path);
> + }
> 
>   vserver->listenfd = uds_socket(path);
>   if (vserver->listenfd < 0) {
>   free(vserver);
> + pthread_mutex_unlock(&g_vhost_server.server_mutex);
>   return -1;
>   }
> - vserver->path = path;
> +
> + vserver->path = strdup(path);
> 
>   fdset_add(&g_vhost_server.fdset, vserver->listenfd,
> - vserver_new_vq_conn, NULL,
> - vserver);
> + vserver_new_vq_conn, NULL, vserver);
> 
> - g_vhost_server.server[vserver_idx++] = vserver;
> + g_vhost_server.server[g_vhost_server.vserver_cnt++] = vserver;
> + pthread_mutex_unlock(&g_vhost_server.server_mutex);
> 
>   return 0;
>  }
> 
> 
> +/**
> + * Unregister the specified vhost server
> + */
> +int
> +rte_vhost_driver_unregister(const char *path)
> +{
> + int i;
> +

[dpdk-dev] [PATCH v4 1/4] vhost: call fdset_del_slot to remove connection fd

2015-07-01 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Huawei Xie
> Sent: Tuesday, June 30, 2015 5:21 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v4 1/4] vhost: call fdset_del_slot to remove
> connection fd
> 
> In the event handler of connection fd, the connection fd could be possibly
> closed. The event dispatch loop would then try to remove the fd from fdset.
> Between these two actions, another thread might register a new listenfd
> reusing the val of just closed fd, so we couldn't call fdset_del which would
> wrongly clean up the new listenfd. A new function fdset_del_slot is provided
> to cleanup the fd at the specified location.
> 
> v4 changes:
> - call fdset_del_slot to remove connection fd
> 
> Signed-off-by: Huawei Xie 

Acked-by: Changchun Ouyang 

> ---
>  lib/librte_vhost/vhost_user/fd_man.c | 27
> ++-
>  1 file changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_vhost/vhost_user/fd_man.c
> b/lib/librte_vhost/vhost_user/fd_man.c
> index 831c9c1..bd30f8d 100644
> --- a/lib/librte_vhost/vhost_user/fd_man.c
> +++ b/lib/librte_vhost/vhost_user/fd_man.c
> @@ -188,6 +188,24 @@ fdset_del(struct fdset *pfdset, int fd)
>  }
> 
>  /**
> + *  Unregister the fd at the specified slot from the fdset.
> + */
> +static void
> +fdset_del_slot(struct fdset *pfdset, int index)
> +{
> + if (pfdset == NULL || index < 0 || index >= MAX_FDS)
> + return;
> +
> + pthread_mutex_lock(&pfdset->fd_mutex);
> +
> + pfdset->fd[index].fd = -1;
> + pfdset->fd[index].rcb = pfdset->fd[index].wcb = NULL;
> + pfdset->num--;
> +
> + pthread_mutex_unlock(&pfdset->fd_mutex);
> +}
> +
> +/**
>   * This functions runs in infinite blocking loop until there is no fd in
>   * pfdset. It calls corresponding r/w handler if there is event on the fd.
>   *
> @@ -248,8 +266,15 @@ fdset_event_dispatch(struct fdset *pfdset)
>* We don't allow fdset_del to be called in callback
>* directly.
>*/
> + /*
> +  * When we are to clean up the fd from fdset,
> +  * because the fd is closed in the cb,
> +  * the old fd val could be reused when another
> +  * thread creates a new listen fd, so we couldn't
> +  * call fdset_del.
> +  */
>   if (remove1 || remove2)
> - fdset_del(pfdset, fd);
> + fdset_del_slot(pfdset, i);
>   }
>   }
>  }
> --
> 1.8.1.4



[dpdk-dev] [PATCH v2] lib_vhost:reset secure_len when rte_atomic16_cmpset failed

2015-06-23 Thread Ouyang, Changchun
Hi Thomas,

> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Tuesday, June 23, 2015 12:34 AM
> To: Ouyang, Changchun; Wei li
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] lib_vhost:reset secure_len when
> rte_atomic16_cmpset failed
> 
> 2015-06-01 06:14, Ouyang, Changchun:
> > From: Wei li, June 1, 2015 2:12 PM:
> > > when rte_atomic16_cmpset return 0 in first loop, secure_len  should
> > > be reset to 0 in second loop, otherwise (pkt_len > secure_len)
> > > always  be false, the num of desc maybe not enough
> > >
> > > Signed-off-by: Wei li 
> >
> > Acked-by: Changchun Ouyang
> 
> Is it already fixed by this commit?
>   http://dpdk.org/browse/dpdk/commit/?id=2927c37ca4e04067

You are right, so no need to apply this patch.
Thanks
Changchun



[dpdk-dev] [PATCH] doc: fix doxygen warnings

2015-06-19 Thread Ouyang Changchun
Fix doxygen warnings in vhost

Signed-off-by: Changchun Ouyang 
---
 lib/librte_vhost/rte_virtio_net.h | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/lib/librte_vhost/rte_virtio_net.h 
b/lib/librte_vhost/rte_virtio_net.h
index 5d38185..420c05e 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -198,8 +198,14 @@ int rte_vhost_driver_session_start(void);
  * be received from the physical port or from another virtual device. A packet
  * count is returned to indicate the number of packets that were succesfully
  * added to the RX queue.
+ * @param dev
+ *  virtio-net device
  * @param queue_id
  *  virtio queue index in mq case
+ * @param pkts
+ *  array to contain packets to be enqueued
+ * @param count
+ *  packets num to be enqueued
  * @return
  *  num of packets enqueued
  */
@@ -210,10 +216,16 @@ uint16_t rte_vhost_enqueue_burst(struct virtio_net *dev, uint16_t queue_id,
  * This function gets guest buffers from the virtio device TX virtqueue,
  * construct host mbufs, copies guest buffer content to host mbufs and
  * store them in pkts to be processed.
+ * @param dev
+ *  virtio-net device
+ * @param queue_id
+ *  virtio queue index in mq case
  * @param mbuf_pool
  *  mbuf_pool where host mbuf is allocated.
- * @param queue_id
- *  virtio queue index in mq case.
+ * @param pkts
+ *  array to contain packets to be dequeued
+ * @param count
+ *  packets num to be dequeued
  * @return
  *  num of packets dequeued
  */
-- 
1.8.4.2



[dpdk-dev] vhost doxygen

2015-06-19 Thread Ouyang, Changchun
Hi Thomas

> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Friday, June 19, 2015 5:47 AM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org
> Subject: vhost doxygen
> 
> Please Changchun,
> Could you check and complete the vhost API comments?
> 
> It can be checked with this command:
>   make doc-api-html 2>&1 | grep '/librte_vhost/.*warning:
> 
I have done it. Please see the patch: doc: fix doxygen warnings

Thanks
Changchun



[dpdk-dev] [PATCH v3 2/9] lib_vhost: Support multiple queues in virtio dev

2015-06-19 Thread Ouyang, Changchun


> -Original Message-
> From: Flavio Leitner [mailto:fbl at sysclose.org]
> Sent: Thursday, June 18, 2015 9:34 PM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 2/9] lib_vhost: Support multiple queues in
> virtio dev
> 
> On Mon, Jun 15, 2015 at 03:56:39PM +0800, Ouyang Changchun wrote:
> > Each virtio device could have multiple queues, say 2 or 4, at most 8.
> > Enabling this feature allows a virtio device/port on the guest to use a
> > different vCPU to receive/transmit packets from/to each queue.
> >
> > In multiple queues mode, virtio device readiness means all queues of
> > this virtio device are ready; cleanup/destroy of a virtio device also
> > requires clearing all queues belonging to it.
> >
> > Changes in v3:
> >   - fix coding style
> >   - check virtqueue idx validity
> >
> > Changes in v2:
> >   - remove the q_num_set api
> >   - add the qp_num_get api
> >   - determine the queue pair num from qemu message
> >   - rework for reset owner message handler
> >   - dynamically alloc mem for dev virtqueue
> >   - queue pair num could be 0x8000
> >   - fix checkpatch errors
> >
> > Signed-off-by: Changchun Ouyang 
> > ---
> >  lib/librte_vhost/rte_virtio_net.h |  10 +-
> >  lib/librte_vhost/vhost-net.h  |   1 +
> >  lib/librte_vhost/vhost_rxtx.c |  49 +---
> >  lib/librte_vhost/vhost_user/vhost-net-user.c  |   4 +-
> >  lib/librte_vhost/vhost_user/virtio-net-user.c |  76 +---
> >  lib/librte_vhost/vhost_user/virtio-net-user.h |   2 +
> >  lib/librte_vhost/virtio-net.c | 161 
> > +-
> >  7 files changed, 216 insertions(+), 87 deletions(-)
> >
> > diff --git a/lib/librte_vhost/rte_virtio_net.h
> > b/lib/librte_vhost/rte_virtio_net.h
> > index 5d38185..873be3e 100644
> > --- a/lib/librte_vhost/rte_virtio_net.h
> > +++ b/lib/librte_vhost/rte_virtio_net.h
> > @@ -59,7 +59,6 @@ struct rte_mbuf;
> >  /* Backend value set by guest. */
> >  #define VIRTIO_DEV_STOPPED -1
> >
> > -
> >  /* Enum for virtqueue management. */
> >  enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
> >
> > @@ -96,13 +95,14 @@ struct vhost_virtqueue {
> >   * Device structure contains all configuration information relating to the device.
> >   */
> >  struct virtio_net {
> > -   struct vhost_virtqueue  *virtqueue[VIRTIO_QNUM];/**< Contains all virtqueue information. */
> > struct virtio_memory*mem;   /**< QEMU memory and memory region information. */
> > +   struct vhost_virtqueue  **virtqueue;/**< Contains all virtqueue information. */
> > uint64_tfeatures;   /**< Negotiated feature set. */
> > uint64_tdevice_fh;  /**< device identifier. */
> > uint32_tflags;  /**< Device flags. Only used to check if device is running on data core. */
> >  #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
> > charifname[IF_NAME_SZ]; /**< Name of the tap device or socket path. */
> > +   uint32_tnum_virt_queues;
> > void*priv;  /**< private context */
> >  } __rte_cache_aligned;
> 
> 
> As already pointed out, this breaks ABI.
> Do you have a plan for that or are you pushing this for dpdk 2.2?

Yes, I think it will be enabled in 2.2.
I have already sent out the ABI announcement a few days ago.
> 
> 
> > @@ -220,4 +220,10 @@ uint16_t rte_vhost_enqueue_burst(struct virtio_net *dev, uint16_t queue_id,
> >  uint16_t rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
> > struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count);
> >
> > +/**
> > + * This function get the queue pair number of one vhost device.
> > + * @return
> > + *  num of queue pair of specified virtio device.
> > + */
> > +uint16_t rte_vhost_qp_num_get(struct virtio_net *dev);
> 
> This needs to go to rte_vhost_version.map too.
Will update it.

Thanks
Changchun
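For reference, the requested version map addition would follow the same pattern as the DPDK_2.1 block in patch 3/4 earlier in this archive. A sketch (illustrative only; the exact release section depends on which release the API lands in):

```
DPDK_2.1 {
	global:

	rte_vhost_qp_num_get;

	local: *;
} DPDK_2.0;
```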


[dpdk-dev] [PATCH v3 2/9] lib_vhost: Support multiple queues in virtio dev

2015-06-19 Thread Ouyang, Changchun
Hi Flavio,

> -Original Message-
> From: Flavio Leitner [mailto:fbl at sysclose.org]
> Sent: Thursday, June 18, 2015 9:17 PM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 2/9] lib_vhost: Support multiple queues in
> virtio dev
> 
> On Mon, Jun 15, 2015 at 03:56:39PM +0800, Ouyang Changchun wrote:
> > Each virtio device could have multiple queues, say 2 or 4, at most 8.
> > Enabling this feature allows virtio device/port on guest has the
> > ability to use different vCPU to receive/transmit packets from/to each
> queue.
> >
> > In multiple queues mode, virtio device readiness means all queues of
> > this virtio device are ready, cleanup/destroy a virtio device also
> > requires clearing all queues belong to it.
> >
> > Changes in v3:
> >   - fix coding style
> >   - check virtqueue idx validity
> >
> > Changes in v2:
> >   - remove the q_num_set api
> >   - add the qp_num_get api
> >   - determine the queue pair num from qemu message
> >   - rework for reset owner message handler
> >   - dynamically alloc mem for dev virtqueue
> >   - queue pair num could be 0x8000
> >   - fix checkpatch errors
> >
> > Signed-off-by: Changchun Ouyang 
> [...]
> 
> > diff --git a/lib/librte_vhost/virtio-net.c
> > b/lib/librte_vhost/virtio-net.c index fced2ab..aaea7d5 100644
> > --- a/lib/librte_vhost/virtio-net.c
> > +++ b/lib/librte_vhost/virtio-net.c
> > @@ -67,10 +67,10 @@ static struct virtio_net_config_ll *ll_root;
> > #define VHOST_SUPPORTED_FEATURES ((1ULL <<
> VIRTIO_NET_F_MRG_RXBUF) | \
> > (1ULL << VIRTIO_NET_F_CTRL_VQ) | \
> > (1ULL << VIRTIO_NET_F_CTRL_RX) | \
> > -   (1ULL << VHOST_F_LOG_ALL))
> > +   (1ULL << VHOST_F_LOG_ALL)) | \
> > +   (1ULL << VIRTIO_NET_F_MQ))
> 
> One extra parenthesis after VHOST_F_LOG_ALL. BTW, this series needs a
> rebase against the latest dpdk.
> fbl
> 
Yes, will update it.
Thanks
Changchun



[dpdk-dev] [PATCH v3 2/2] virtio: check vq parameter

2015-06-17 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bernard Iremonger
> Sent: Tuesday, June 16, 2015 7:30 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v3 2/2] virtio: check vq parameter
> 
> If vq is NULL, there is a segmentation fault.
> 
> Signed-off-by: Bernard Iremonger 

Acked-by: Changchun Ouyang 


[dpdk-dev] [PATCH v3 1/2] virtio: add support for PCI Port Hotplug

2015-06-17 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bernard Iremonger
> Sent: Tuesday, June 16, 2015 7:30 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v3 1/2] virtio: add support for PCI Port Hotplug
> 
> This patch depends on the Port Hotplug Framework.
> It implements the eth_dev_uninit_t() function for virtio pmd.
> 
> Signed-off-by: Bernard Iremonger 

Acked-by: Changchun Ouyang 


[dpdk-dev] [PATCH] abi: Announce abi changes plan for vhost-user multiple queues

2015-06-16 Thread Ouyang Changchun
It announces the planned ABI changes for the vhost-user multiple queues feature
in v2.2.

Signed-off-by: Changchun Ouyang 
---
 doc/guides/rel_notes/abi.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index f00a6ee..dc1b0eb 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -38,3 +38,4 @@ Examples of Deprecation Notices

 Deprecation Notices
 ---
+* The ABI changes are planned for struct virtio_net in order to support the
vhost-user multiple queues feature. The upcoming release 2.1 will not contain
these ABI changes, but release 2.2 will, and no backwards compatibility is
planned due to the vhost-user multiple queues feature enabling. Binaries using
this library built prior to version 2.2 will require updating and recompilation.
-- 
1.8.4.2



[dpdk-dev] [RFC PATCH V2 2/2] drivers/net/virtio: check vq parameter

2015-06-16 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bernard Iremonger
> Sent: Thursday, May 28, 2015 12:01 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [RFC PATCH V2 2/2] drivers/net/virtio: check vq
> parameter
> 
> If vq is NULL, there is a segmentation fault.
> 
> Signed-off-by: Bernard 

Acked-by: Changchun Ouyang 


[dpdk-dev] [PATCH v7 0/4] Fix vhost enqueue/dequeue issue

2015-06-16 Thread Ouyang, Changchun
Hi, Thomas


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Monday, June 15, 2015 5:43 PM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v7 0/4] Fix vhost enqueue/dequeue issue
> 
> 2015-06-09 09:03, Ouyang Changchun:
> > Fix enqueue/dequeue can't handle chained vring descriptors; Remove
> > unnecessary vring descriptor length updating; Add support copying
> > scattered mbuf to vring;
> >
> > Changchun Ouyang (4):
> >   lib_vhost: Fix enqueue/dequeue can't handle chained vring descriptors
> >   lib_vhost: Refine code style
> >   lib_vhost: Extract function
> >   lib_vhost: Remove unnecessary vring descriptor length updating
> 
> What changed in v7?
> Is this test report still valuable for v7?
>   http://dpdk.org/ml/archives/dev/2015-June/018610.html
> 
Nothing really changed from v6 to v7:
v6 was signed-off-by root,
v7 is signed-off-by myself.
Yes, the test report is still valuable.

> Note: it's really convenient to put the relevant changelog in each commit,
> and it would be nicer to have a changelog summary in this cover letter.


[dpdk-dev] [PATCH v3 9/9] doc: Update doc for vhost multiple queues

2015-06-15 Thread Ouyang Changchun
Update the sample guide doc for vhost multiple queues;
Update the prog guide doc for vhost lib multiple queues feature;

It is newly added in v3.

Signed-off-by: Changchun Ouyang 
---
 doc/guides/prog_guide/vhost_lib.rst |  35 
 doc/guides/sample_app_ug/vhost.rst  | 110 
 2 files changed, 145 insertions(+)

diff --git a/doc/guides/prog_guide/vhost_lib.rst 
b/doc/guides/prog_guide/vhost_lib.rst
index 48e1fff..e444681 100644
--- a/doc/guides/prog_guide/vhost_lib.rst
+++ b/doc/guides/prog_guide/vhost_lib.rst
@@ -128,6 +128,41 @@ VHOST_GET_VRING_BASE is used as the signal to remove vhost device from data plan

 When the socket connection is closed, vhost will destroy the device.

+Vhost multiple queues feature
+-
+This feature supports multiple queues for each virtio device in vhost.
+Multiple queues can only be enabled with vhost-user; it is not ready for vhost-cuse.
+
+The QEMU patch enabling vhost-user multiple queues has already been merged into the
+upstream sub-tree in the QEMU community and it will be in QEMU 2.4. If using QEMU 2.3,
+it requires applying the same patch onto QEMU 2.3 and rebuilding QEMU before running
+vhost multiple queues:
+http://patchwork.ozlabs.org/patch/477461/
+
+Vhost will get the queue pair number based on the communication messages with QEMU.
+
+It is strongly recommended to set the number of HW queues in the pool identical
+to the queue number used to start the QEMU guest, and identical to the queue
+number of the virtio port on the guest.
+
+=
+==|   |==|
+   vport0 |   |  vport1  |
+---  ---  ---  ---|   |---  ---  ---  ---|
+q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
+/\= =/\= =/\= =/\=|   |/\= =/\= =/\= =/\=|
+||   ||   ||   ||  ||   ||   ||   ||
+||   ||   ||   ||  ||   ||   ||   ||
+||= =||= =||= =||=|   =||== ||== ||== ||=|
+q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
+--|   |--|
+ VMDq pool0   |   |VMDq pool1|
+==|   |==|
+
+On the RX side, it first polls each queue of the pool, gets the packets from
+it, and enqueues them into the corresponding virtqueue in the virtio device/port.
+On the TX side, it dequeues packets from each virtqueue of the virtio
+device/port and sends them to either a physical port or another virtio device
+according to their destination MAC address.
+
 Vhost supported vSwitch reference
 -

diff --git a/doc/guides/sample_app_ug/vhost.rst 
b/doc/guides/sample_app_ug/vhost.rst
index 730b9da..9a57d19 100644
--- a/doc/guides/sample_app_ug/vhost.rst
+++ b/doc/guides/sample_app_ug/vhost.rst
@@ -514,6 +514,13 @@ It is enabled by default.

 user at target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge 
-- --vlan-strip [0, 1]

+**rxq.**
+The rxq option specifies the rx queue number per VMDq pool; it is 1 by default.
+
+.. code-block:: console
+
+user at target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge 
-- --rxq [1, 2, 4]
+
 Running the Virtual Machine (QEMU)
 --

@@ -833,3 +840,106 @@ For example:
 The above message indicates that device 0 has been registered with MAC address 
cc:bb:bb:bb:bb:bb and VLAN tag 1000.
 Any packets received on the NIC with these values is placed on the devices 
receive queue.
 When a virtio-net device transmits packets, the VLAN tag is added to the 
packet by the DPDK vhost sample code.
+
+Vhost multiple queues
+-
+
+This feature supports multiple queues for each virtio device in vhost.
+Multiple queues can only be enabled with vhost-user; it is not ready for vhost-cuse.
+
+The QEMU patch enabling vhost-user multiple queues has already been merged into the
+upstream sub-tree in the QEMU community and it will be in QEMU 2.4. If using QEMU 2.3,
+it requires applying the same patch onto QEMU 2.3 and rebuilding QEMU before running
+vhost multiple queues:
+http://patchwork.ozlabs.org/patch/477461/
+
+Basically the vhost sample leverages VMDq+RSS in HW to receive packets and
+distribute them into different queues in the pool according to their 5-tuples.
+
+On the other hand, vhost will get the queue pair number based on the
+communication messages with QEMU.
+
+It is strongly recommended to set the number of HW queues in the pool identical
+to the queue number used to start the QEMU guest, and identical to the queue
+number of the virtio port on the guest.
+E.g. use '--rxq 4' to set the queue number to 4; it means there are 4 HW queues
+in each VMDq pool and 4 queues in each vhost device/port, and every queue in
+the pool maps to one queue in the vhost device.
+
+=
+==|   |==|
+   vport0 |   |  vport1  |
+---  ---  ---  ---|   |---  ---  ---  ---|
+q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
+/\= =/\= =/\= =/\=|   |/

[dpdk-dev] [PATCH v3 8/9] vhost: Add per queue stats info

2015-06-15 Thread Ouyang Changchun
Add per queue stats info

Changes in v3
  - fix coding style and displaying format
  - check stats_enable to alloc mem for queue pair

Changes in v2
  - fix the stats issue in tx_local
  - dynamically alloc mem for queue pair stats info
  - fix checkpatch errors

Signed-off-by: Changchun Ouyang 
---
 examples/vhost/main.c | 126 +++---
 1 file changed, 79 insertions(+), 47 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index d40cb11..76f645f 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -314,7 +314,7 @@ struct ipv4_hdr {
 #define VLAN_ETH_HLEN   18

 /* Per-device statistics struct */
-struct device_statistics {
+struct qp_statistics {
uint64_t tx_total;
rte_atomic64_t rx_total_atomic;
uint64_t rx_total;
@@ -322,6 +322,10 @@ struct device_statistics {
rte_atomic64_t rx_atomic;
uint64_t rx;
 } __rte_cache_aligned;
+
+struct device_statistics {
+   struct qp_statistics *qp_stats;
+};
 struct device_statistics dev_statistics[MAX_DEVICES];

 /*
@@ -738,6 +742,17 @@ us_vhost_parse_args(int argc, char **argv)
return -1;
} else {
enable_stats = ret;
+   if (enable_stats)
+   for (i = 0; i < MAX_DEVICES; i++) {
+   dev_statistics[i].qp_stats =
+   malloc(VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX * sizeof(struct qp_statistics));
+   if (dev_statistics[i].qp_stats == NULL) {
+   RTE_LOG(ERR, VHOST_CONFIG, "Failed to allocate memory for qp stats.\n");
+   return -1;
+   }
+   memset(dev_statistics[i].qp_stats, 0,
+   VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX * sizeof(struct qp_statistics));
+   }
}
}

@@ -1093,13 +1108,13 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf 
*m, uint32_t q_idx)
&m, 1);
if (enable_stats) {
rte_atomic64_add(
-   &dev_statistics[tdev->device_fh].rx_total_atomic,
+   &dev_statistics[tdev->device_fh].qp_stats[q_idx].rx_total_atomic,
1);
rte_atomic64_add(
-   &dev_statistics[tdev->device_fh].rx_atomic,
+   &dev_statistics[tdev->device_fh].qp_stats[q_idx].rx_atomic,
ret);
-   dev_statistics[tdev->device_fh].tx_total++;
-   dev_statistics[tdev->device_fh].tx += ret;
+   dev_statistics[dev->device_fh].qp_stats[q_idx].tx_total++;
+   dev_statistics[dev->device_fh].qp_stats[q_idx].tx += ret;
}
}

@@ -1233,8 +1248,8 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf 
*m,
tx_q->m_table[len] = m;
len++;
if (enable_stats) {
-   dev_statistics[dev->device_fh].tx_total++;
-   dev_statistics[dev->device_fh].tx++;
+   dev_statistics[dev->device_fh].qp_stats[q_idx].tx_total++;
+   dev_statistics[dev->device_fh].qp_stats[q_idx].tx++;
}

if (unlikely(len == MAX_PKT_BURST)) {
@@ -1365,10 +1380,10 @@ switch_worker(__attribute__((unused)) void *arg)

pkts_burst, rx_count);
if (enable_stats) {
rte_atomic64_add(
-   &dev_statistics[dev_ll->vdev->dev->device_fh].rx_total_atomic,
+   &dev_statistics[dev_ll->vdev->dev->device_fh].qp_stats[i].rx_total_atomic,
rx_count);
rte_atomic64_add(
-   &dev_statistics[dev_ll->vdev->dev->device_fh].rx_atomic, ret_count);
+   
&dev_statistics[dev_ll->vdev-

[dpdk-dev] [PATCH v3 7/9] virtio: Resolve for control queue

2015-06-15 Thread Ouyang Changchun
The control queue can't work in vhost-user multiple queue mode,
so introduce a counter to avoid the dead loop when polling the control queue.

Changes in v2:
  - fix checkpatch errors

Signed-off-by: Changchun Ouyang 
---
 drivers/net/virtio/virtio_ethdev.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index fe5f9a1..e4bedbd 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -61,6 +61,7 @@
 #include "virtio_logs.h"
 #include "virtqueue.h"

+#define CQ_POLL_COUNTER 500 /* Avoid dead loop when polling control queue */

 static int eth_virtio_dev_init(struct rte_eth_dev *eth_dev);
 static int  virtio_dev_configure(struct rte_eth_dev *dev);
@@ -118,6 +119,7 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,
int k, sum = 0;
virtio_net_ctrl_ack status = ~0;
struct virtio_pmd_ctrl result;
+   uint32_t cq_poll = CQ_POLL_COUNTER;

ctrl->status = status;

@@ -178,9 +180,15 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,
virtqueue_notify(vq);

rte_rmb();
-   while (vq->vq_used_cons_idx == vq->vq_ring.used->idx) {
+
+	/**
+	 * FIXME: The control queue doesn't work for vhost-user
+	 * multiple queue; bound the loop with cq_poll to avoid
+	 * spinning forever.
+	 */
+	while ((vq->vq_used_cons_idx == vq->vq_ring.used->idx) && (cq_poll != 0)) {
rte_rmb();
usleep(100);
+   cq_poll--;
}

while (vq->vq_used_cons_idx != vq->vq_ring.used->idx) {
@@ -208,7 +216,10 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,
PMD_INIT_LOG(DEBUG, "vq->vq_free_cnt=%d\nvq->vq_desc_head_idx=%d",
vq->vq_free_cnt, vq->vq_desc_head_idx);

-   memcpy(&result, vq->virtio_net_hdr_mz->addr,
+   if (cq_poll == 0)
+   result.status = 0;
+   else
+   memcpy(&result, vq->virtio_net_hdr_mz->addr,
sizeof(struct virtio_pmd_ctrl));

return result.status;
-- 
1.8.4.2
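The bounded busy-wait this patch adds can be isolated into a small, testable pattern. The sketch below is not the PMD code: `poll_used_ring`, `ring_idx`, and the callback indirection are hypothetical stand-ins for reading `vq->vq_ring.used->idx`, and the `usleep(100)` back-off between polls is elided.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CQ_POLL_COUNTER 500	/* maximum poll iterations, as in the patch */

/* Poll until the device consumes the request or the budget runs out.
 * ring_idx() stands in for reading the used ring index; it is a callback
 * here so the sketch can be exercised without real hardware. */
static int
poll_used_ring(uint16_t cons_idx, uint16_t (*ring_idx)(void *), void *ctx)
{
	uint32_t cq_poll = CQ_POLL_COUNTER;

	while (cons_idx == ring_idx(ctx) && cq_poll != 0)
		cq_poll--;	/* the real loop also sleeps, e.g. usleep(100) */

	return cq_poll == 0 ? -1 : 0;	/* -1: timed out, 0: progress seen */
}

/* Demonstration stubs: a device that never advances and one that has. */
static uint16_t stuck_idx(void *ctx)    { (void)ctx; return 0; }
static uint16_t advanced_idx(void *ctx) { (void)ctx; return 1; }
```

On timeout the caller must not trust the result buffer, which is why the patch forces `result.status = 0` when `cq_poll` reaches zero instead of copying stale memory.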



[dpdk-dev] [PATCH v3 6/9] vhost: Support multiple queues

2015-06-15 Thread Ouyang Changchun
The vhost sample leverages VMDq+RSS in HW to receive packets and distribute
them into different queues in the pool according to the 5-tuple.

It also enables multiple-queue mode in the vhost/virtio layer.

The number of HW queues per pool is exactly the same as the queue number in
the virtio device; e.g. with rxq = 4 there are 4 HW queues in each VMDq pool
and 4 queues in each virtio device/port, mapped one to one.

|==================|   |==================|
|      vport0      |   |      vport1      |
| q0 | q1 | q2 | q3|   | q0 | q1 | q2 | q3|
| ||   ||   ||   |||   | ||   ||   ||   |||
| ||   ||   ||   |||   | ||   ||   ||   |||
| q0 | q1 | q2 | q3|   | q0 | q1 | q2 | q3|
|    VMDq pool0    |   |    VMDq pool1    |
|==================|   |==================|

On the RX side, it first polls each queue of the pool, gets the packets from
it, and enqueues them into the corresponding queue in the virtio device/port.
On the TX side, it dequeues packets from each queue of the virtio device/port
and sends them to either a physical port or another virtio device according to
the destination MAC address.
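The one-to-one mapping between pool queues and virtio queues reduces to simple index arithmetic. A hedged sketch, assuming a flat NIC queue space in which each pool owns `rxq` consecutive queues starting at `vmdq_queue_base` (the function name and layout are illustrative, not taken from the patch):

```c
#include <assert.h>
#include <stdint.h>

/* With rxq HW queues per VMDq pool, queue q of pool p maps to flat NIC
 * queue base + p * rxq + q. This mirrors how the sample computes
 * vdev->vmdq_rx_q + i when polling each queue of a device's pool. */
static uint16_t
vmdq_flat_queue(uint16_t vmdq_queue_base, uint32_t rxq,
		uint32_t pool, uint32_t q)
{
	return (uint16_t)(vmdq_queue_base + pool * rxq + q);
}
```

For example, with 4 queues per pool, queue 2 of pool 1 is flat queue 6 when the VMDq region starts at queue 0.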

Changes in v2:
  - check queue num per pool in VMDq and queue pair number per vhost device
  - remove the unnecessary calling q_num_set api
  - fix checkpatch errors

Signed-off-by: Changchun Ouyang 
---
 examples/vhost/main.c | 132 ++
 1 file changed, 79 insertions(+), 53 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index cd9640e..d40cb11 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1001,8 +1001,9 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m)

/* Enable stripping of the vlan tag as we handle routing. */
if (vlan_strip)
-   rte_eth_dev_set_vlan_strip_on_queue(ports[0],
-   (uint16_t)vdev->vmdq_rx_q, 1);
+   for (i = 0; i < (int)rxq; i++)
+   rte_eth_dev_set_vlan_strip_on_queue(ports[0],
+   (uint16_t)(vdev->vmdq_rx_q + i), 1);

/* Set device as ready for RX. */
vdev->ready = DEVICE_RX;
@@ -1017,7 +1018,7 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m)
 static inline void
 unlink_vmdq(struct vhost_dev *vdev)
 {
-   unsigned i = 0;
+   unsigned i = 0, j = 0;
unsigned rx_count;
struct rte_mbuf *pkts_burst[MAX_PKT_BURST];

@@ -1030,15 +1031,19 @@ unlink_vmdq(struct vhost_dev *vdev)
vdev->vlan_tag = 0;

	/*Clear out the receive buffers*/
-	rx_count = rte_eth_rx_burst(ports[0],
-		(uint16_t)vdev->vmdq_rx_q, pkts_burst, MAX_PKT_BURST);
+	for (i = 0; i < rxq; i++) {
+		rx_count = rte_eth_rx_burst(ports[0],
+			(uint16_t)vdev->vmdq_rx_q + i,
+			pkts_burst, MAX_PKT_BURST);

-	while (rx_count) {
-		for (i = 0; i < rx_count; i++)
-			rte_pktmbuf_free(pkts_burst[i]);
+		while (rx_count) {
+			for (j = 0; j < rx_count; j++)
+				rte_pktmbuf_free(pkts_burst[j]);

-		rx_count = rte_eth_rx_burst(ports[0],
-			(uint16_t)vdev->vmdq_rx_q, pkts_burst, MAX_PKT_BURST);
+			rx_count = rte_eth_rx_burst(ports[0],
+				(uint16_t)vdev->vmdq_rx_q + i,
+				pkts_burst, MAX_PKT_BURST);
+		}
	}

vdev->ready = DEVICE_MAC_LEARNING;
@@ -1050,7 +1055,7 @@ unlink_vmdq(struct vhost_dev *vdev)
  * the packet on that devices RX queue. If not then return.
  */
 static inline int __attribute__((always_inline))
-virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m)
+virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m, uint32_t q_idx)
 {
struct virtio_net_data_ll *dev_ll;
struct ether_hdr *pkt_hdr;
@@ -1065,7 +1070,7 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m)

	while (dev_ll != NULL) {
		if ((dev_ll->vdev->ready == DEVICE_RX) && ether_addr_cmp(&(pkt_hdr->d_addr),
-			&dev_ll->vdev->mac_address)) {
+				&dev_ll->vdev->mac_address)) {

			/* Drop the packet if the TX packet is destined for the TX device. */
			if (dev_ll->vdev->dev->device_fh == dev->device_fh) {
@@ -1083,7 +1088,9 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m)
				LOG_DEBUG(VHOST_DATA, "(%"PRIu64") D

[dpdk-dev] [PATCH v3 5/9] vhost: Add new command line option: rxq

2015-06-15 Thread Ouyang Changchun
The vhost sample needs to know how many queues the user wants to enable for
each virtio device, so add the new option '--rxq' for it.

Changes in v3
  - fix coding style

Changes in v2
  - refine help info
  - check if rxq = 0
  - fix checkpatch errors

Signed-off-by: Changchun Ouyang 
---
 examples/vhost/main.c | 49 +
 1 file changed, 45 insertions(+), 4 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index aba287a..cd9640e 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -163,6 +163,9 @@ static int mergeable;
 /* Do vlan strip on host, enabled on default */
 static uint32_t vlan_strip = 1;

+/* Rx queue number per virtio device */
+static uint32_t rxq = 1;
+
 /* number of descriptors to apply*/
 static uint32_t num_rx_descriptor = RTE_TEST_RX_DESC_DEFAULT_ZCP;
 static uint32_t num_tx_descriptor = RTE_TEST_TX_DESC_DEFAULT_ZCP;
@@ -408,8 +411,19 @@ port_init(uint8_t port)
txconf->tx_deferred_start = 1;
}

-	/*configure the number of supported virtio devices based on VMDQ limits */
-	num_devices = dev_info.max_vmdq_pools;
+   /* Configure the virtio devices num based on VMDQ limits */
+   switch (rxq) {
+   case 1:
+   case 2:
+   num_devices = dev_info.max_vmdq_pools;
+   break;
+   case 4:
+   num_devices = dev_info.max_vmdq_pools / 2;
+   break;
+   default:
+   RTE_LOG(ERR, VHOST_CONFIG, "rxq invalid for VMDq.\n");
+   return -1;
+   }

if (zero_copy) {
rx_ring_size = num_rx_descriptor;
@@ -431,7 +445,7 @@ port_init(uint8_t port)
return retval;
/* NIC queues are divided into pf queues and vmdq queues.  */
num_pf_queues = dev_info.max_rx_queues - dev_info.vmdq_queue_num;
-   queues_per_pool = dev_info.vmdq_queue_num / dev_info.max_vmdq_pools;
+   queues_per_pool = dev_info.vmdq_queue_num / num_devices;
num_vmdq_queues = num_devices * queues_per_pool;
num_queues = num_pf_queues + num_vmdq_queues;
vmdq_queue_base = dev_info.vmdq_queue_base;
@@ -576,7 +590,8 @@ us_vhost_usage(const char *prgname)
"   --rx-desc-num [0-N]: the number of descriptors on rx, "
"used only when zero copy is enabled.\n"
"   --tx-desc-num [0-N]: the number of descriptors on tx, "
-   "used only when zero copy is enabled.\n",
+   "used only when zero copy is enabled.\n"
+   "   --rxq [1,2,4]: rx queue number for each vhost device\n",
   prgname);
 }

@@ -602,6 +617,7 @@ us_vhost_parse_args(int argc, char **argv)
{"zero-copy", required_argument, NULL, 0},
{"rx-desc-num", required_argument, NULL, 0},
{"tx-desc-num", required_argument, NULL, 0},
+   {"rxq", required_argument, NULL, 0},
{NULL, 0, 0, 0},
};

@@ -778,6 +794,18 @@ us_vhost_parse_args(int argc, char **argv)
}
}

+   /* Specify the Rx queue number for each vhost dev. */
+   if (!strncmp(long_option[option_index].name,
+   "rxq", MAX_LONG_OPT_SZ)) {
+   ret = parse_num_opt(optarg, 4);
			if ((ret == -1) || (ret == 0) || (!POWEROF2(ret))) {
+   RTE_LOG(INFO, VHOST_CONFIG,
+   "Valid value for rxq is [1,2,4]\n");
+   us_vhost_usage(prgname);
+   return -1;
+   } else
+   rxq = ret;
+   }
break;

/* Invalid option - print options. */
@@ -813,6 +841,19 @@ us_vhost_parse_args(int argc, char **argv)
return -1;
}

+   if (rxq > 1) {
+   vmdq_conf_default.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
+   vmdq_conf_default.rx_adv_conf.rss_conf.rss_hf = ETH_RSS_IP |
+   ETH_RSS_UDP | ETH_RSS_TCP | ETH_RSS_SCTP;
+   }
+
+   if ((zero_copy == 1) && (rxq > 1)) {
+   RTE_LOG(INFO, VHOST_PORT,
+   "Vhost zero copy doesn't support mq mode,"
+   "please specify '--rxq 1' to disable it.\n");
+   return -1;
+   }
+
return 0;
 }

-- 
1.8.4.2
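The `--rxq` handling above couples the per-device queue count to the number of usable VMDq pools. A minimal sketch of that switch, with `num_vmdq_devices` as a hypothetical helper mirroring the patch's logic in `port_init()` (in the real code `max_vmdq_pools` comes from `rte_eth_dev_info`):

```c
#include <assert.h>
#include <stdint.h>

/* Mirror of the patch's switch: rxq 1 or 2 keeps every VMDq pool usable,
 * rxq 4 halves the device count, and anything else is rejected. */
static int
num_vmdq_devices(uint32_t rxq, uint32_t max_vmdq_pools)
{
	switch (rxq) {
	case 1:
	case 2:
		return (int)max_vmdq_pools;
	case 4:
		return (int)(max_vmdq_pools / 2);
	default:
		return -1;	/* invalid --rxq value for VMDq */
	}
}
```

Note the follow-on change in the same hunk: `queues_per_pool` is derived from `num_devices` rather than `max_vmdq_pools`, so the queue arithmetic stays consistent when the device count is halved.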



[dpdk-dev] [PATCH v3 4/9] lib_vhost: Check the virtqueue address's validity

2015-06-15 Thread Ouyang Changchun
This patch is new since v3.
It checks the validity of the virtqueue addresses.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_vhost/vhost_user/vhost-net-user.c | 11 ++-
 lib/librte_vhost/virtio-net.c| 10 ++
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c 
b/lib/librte_vhost/vhost_user/vhost-net-user.c
index b66a653..552b501 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.c
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
@@ -398,7 +398,16 @@ vserver_message_handler(int connfd, void *dat, int *remove)
ops->set_vring_num(ctx, &msg.payload.state);
break;
case VHOST_USER_SET_VRING_ADDR:
-   ops->set_vring_addr(ctx, &msg.payload.addr);
+   if (ops->set_vring_addr(ctx, &msg.payload.addr) != 0) {
+   RTE_LOG(ERR, VHOST_CONFIG,
+				"error found in vhost set vring, "
+				"the vhost device will be destroyed\n");
+   close(connfd);
+   *remove = 1;
+   free(cfd_ctx);
+   user_destroy_device(ctx);
+   ops->destroy_device(ctx);
+   }
break;
case VHOST_USER_SET_VRING_BASE:
ops->set_vring_base(ctx, &msg.payload.state);
diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index 3e24841..80df0ec 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -553,6 +553,7 @@ set_vring_addr(struct vhost_device_ctx ctx, struct vhost_vring_addr *addr)
 {
struct virtio_net *dev;
struct vhost_virtqueue *vq;
+   uint32_t i;

dev = get_device(ctx);
if (dev == NULL)
@@ -580,6 +581,15 @@ set_vring_addr(struct vhost_device_ctx ctx, struct vhost_vring_addr *addr)
return -1;
}

+   for (i = vq->last_used_idx; i < vq->avail->idx; i++)
+   if (vq->avail->ring[i] >= vq->size) {
+   RTE_LOG(ERR, VHOST_CONFIG, "%s (%"PRIu64"):"
+   "Please check virt queue pair idx:%d is "
"enabled correctly on guest.\n", __func__,
+   dev->device_fh, addr->index / VIRTIO_QNUM);
+   return -1;
+   }
+
vq->used = (struct vring_used *)(uintptr_t)qva_to_vva(dev,
addr->index / VIRTIO_QNUM, addr->used_user_addr);
if (vq->used == 0) {
-- 
1.8.4.2
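The validity check the patch adds can be read as a small predicate over the available ring. A simplified, testable sketch (`check_avail_ring` and the demo arrays are illustrative only; the real code walks `vq->avail->ring` between `vq->last_used_idx` and `vq->avail->idx`):

```c
#include <assert.h>
#include <stdint.h>

/* Reject any available-ring entry that indexes past the queue size, as
 * set_vring_addr() does in the patch; such an entry means the guest side
 * of the queue pair was not enabled or set up correctly. */
static int
check_avail_ring(const uint16_t *ring, uint16_t from, uint16_t to,
		 uint16_t vq_size)
{
	uint16_t i;

	for (i = from; i < to; i++)
		if (ring[i] >= vq_size)
			return -1;	/* corrupt or unset descriptor index */
	return 0;
}

/* Demonstration data: a valid ring and one with an out-of-range entry. */
static const uint16_t demo_ok[4]  = {0, 1, 2, 3};
static const uint16_t demo_bad[4] = {0, 1, 9, 3};
```

Failing this check destroys the device rather than letting a bad descriptor index be dereferenced later on the data path.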



[dpdk-dev] [PATCH v3 3/9] lib_vhost: Set memory layout for multiple queues mode

2015-06-15 Thread Ouyang Changchun
QEMU sends separate commands, in order, to set the memory layout for each queue
in one virtio device; accordingly, vhost needs to keep memory layout information
for each queue of the virtio device.

This also requires a small adjustment to the gpa_to_vva interface: a queue
index is introduced to specify which queue of the device to use when looking up
the virtual vhost address for an incoming guest physical address.
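The per-queue lookup described above can be sketched as follows. All types and names here (`demo_memory`, `demo_region`, `demo_gpa_to_vva`) are simplified stand-ins for vhost's region tables; the point is only the extra queue-pair index added to the translation:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative region entry: a GPA range and the offset to add to reach
 * the vhost virtual address. */
struct demo_region {
	uint64_t guest_phys_address;
	uint64_t guest_phys_address_end;
	uint64_t address_offset;
};

struct demo_memory {
	uint32_t nregions;
	struct demo_region regions[2];
};

/* GPA -> VVA translation that takes a queue-pair index: each queue pair
 * has its own memory layout in mem_arr, matching the dev->mem_arr change. */
static uint64_t
demo_gpa_to_vva(struct demo_memory **mem_arr, uint32_t qp_idx, uint64_t gpa)
{
	struct demo_memory *mem = mem_arr[qp_idx];
	uint32_t i;

	for (i = 0; i < mem->nregions; i++) {
		if (gpa >= mem->regions[i].guest_phys_address &&
		    gpa < mem->regions[i].guest_phys_address_end)
			return gpa + mem->regions[i].address_offset;
	}
	return 0;	/* no matching region */
}

/* One queue pair with a single region [0x1000, 0x2000) offset by 0x9000. */
static struct demo_memory demo_mem0 = { 1, { { 0x1000, 0x2000, 0x9000 } } };
static struct demo_memory *demo_arr[1] = { &demo_mem0 };
```

This is why callers on the zero-copy path in the sample now pass an explicit index (`gpa_to_vva(dev, 0, ...)` in the hunks below): they operate on queue pair 0.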

Changes in v3
  - fix coding style

Changes in v2
  - q_idx is changed into qp_idx
  - dynamically alloc mem for dev mem_arr
  - fix checkpatch errors

Signed-off-by: Changchun Ouyang 
---
 examples/vhost/main.c | 21 +-
 lib/librte_vhost/rte_virtio_net.h | 10 +++--
 lib/librte_vhost/vhost_cuse/virtio-net-cdev.c | 57 ++
 lib/librte_vhost/vhost_rxtx.c | 21 +-
 lib/librte_vhost/vhost_user/virtio-net-user.c | 59 ++-
 lib/librte_vhost/virtio-net.c | 38 -
 6 files changed, 118 insertions(+), 88 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 7863dcf..aba287a 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1466,11 +1466,11 @@ attach_rxmbuf_zcp(struct virtio_net *dev)
desc = &vq->desc[desc_idx];
if (desc->flags & VRING_DESC_F_NEXT) {
desc = &vq->desc[desc->next];
-   buff_addr = gpa_to_vva(dev, desc->addr);
+   buff_addr = gpa_to_vva(dev, 0, desc->addr);
phys_addr = gpa_to_hpa(vdev, desc->addr, desc->len,
&addr_type);
} else {
-   buff_addr = gpa_to_vva(dev,
+   buff_addr = gpa_to_vva(dev, 0,
desc->addr + vq->vhost_hlen);
phys_addr = gpa_to_hpa(vdev,
desc->addr + vq->vhost_hlen,
@@ -1722,7 +1722,7 @@ virtio_dev_rx_zcp(struct virtio_net *dev, struct rte_mbuf **pkts,
rte_pktmbuf_data_len(buff), 0);

/* Buffer address translation for virtio header. */
-   buff_hdr_addr = gpa_to_vva(dev, desc->addr);
+   buff_hdr_addr = gpa_to_vva(dev, 0, desc->addr);
packet_len = rte_pktmbuf_data_len(buff) + vq->vhost_hlen;

/*
@@ -1946,7 +1946,7 @@ virtio_dev_tx_zcp(struct virtio_net *dev)
desc = &vq->desc[desc->next];

/* Buffer address translation. */
-   buff_addr = gpa_to_vva(dev, desc->addr);
+   buff_addr = gpa_to_vva(dev, 0, desc->addr);
/* Need check extra VLAN_HLEN size for inserting VLAN tag */
phys_addr = gpa_to_hpa(vdev, desc->addr, desc->len + VLAN_HLEN,
&addr_type);
@@ -2604,13 +2604,14 @@ new_device (struct virtio_net *dev)
dev->priv = vdev;

	if (zero_copy) {
-		vdev->nregions_hpa = dev->mem->nregions;
-		for (regionidx = 0; regionidx < dev->mem->nregions; regionidx++) {
+		struct virtio_memory *dev_mem = dev->mem_arr[0];
+		vdev->nregions_hpa = dev_mem->nregions;
+		for (regionidx = 0; regionidx < dev_mem->nregions; regionidx++) {
			vdev->nregions_hpa
				+= check_hpa_regions(
-					dev->mem->regions[regionidx].guest_phys_address
-					+ dev->mem->regions[regionidx].address_offset,
-					dev->mem->regions[regionidx].memory_size);
+					dev_mem->regions[regionidx].guest_phys_address
+					+ dev_mem->regions[regionidx].address_offset,
+					dev_mem->regions[regionidx].memory_size);

		}

@@ -2626,7 +2627,7 @@ new_device (struct virtio_net *dev)


if (fill_hpa_memory_regions(
-   vdev->regions_hpa, dev->mem
+   vdev->regions_hpa, dev_mem
) != vdev->nregions_hpa) {

RTE_LOG(ERR, VHOST_CONFIG,
diff --git a/lib/librte_vhost/rte_virtio_net.h 
b/lib/librte_vhost/rte_virtio_net.h
index 873be3e..1b75f45 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -95,14 +95,15 @@ struct vhost_virtqueue {
  * Device structure contains all configuration information relating to the 
device.
  */
 struct virtio_net {
-	struct virtio_memory	*mem;		/**< QEMU memory and memory region information. */
	struct vhost_virtqueue	**virtqueue;	/**< Contains all virtqueue information. */
+	struct virtio_memory	**mem_arr;	/**< Array for QEMU memory and memory region information. */
	uint64_t		featu
