Re: [dpdk-dev] [PATCH v4] vhost: check header for legacy dequeue offload

2021-06-16 Thread Wang, Xiao W
Hi David,

Thanks for your comments.
I agree with your suggestions. BTW, I noticed some other invalid corner cases
that need to roll back mbuf->l2_len, l3_len and ol_flags.
E.g. the default case in the "switch {}" block is also an invalid case.
Also, the l4_proto variable would be better as a uint8_t rather than a uint16_t.
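
To illustrate the points above, here is a minimal standalone sketch, not the
actual vhost code. It compares the data length against absolute offsets (using
"<" rather than "<="), treats the default switch case as invalid, and uses a
uint8_t for l4_proto. Instead of rolling fields back on each error path, it
commits l2_len, l3_len and ol_flags only after every check has passed, which
is one possible way to get the same effect. The struct, the SKETCH_* constants
and the function names are hypothetical stand-ins, not DPDK definitions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the few rte_mbuf fields discussed here. */
struct sketch_mbuf {
	const uint8_t *data;   /* start of packet data */
	uint16_t data_len;     /* bytes available in this segment */
	uint16_t l2_len;
	uint16_t l3_len;
	uint64_t ol_flags;
};

#define SKETCH_ETHER_HDR_LEN  14u
#define SKETCH_VLAN_HDR_LEN    4u
#define SKETCH_IPV4_MIN_HDR   20u
#define SKETCH_ETHERTYPE_IPV4 0x0800u
#define SKETCH_ETHERTYPE_VLAN 0x8100u
#define SKETCH_TX_IPV4        (1ULL << 0)  /* placeholder flag bit */

/* Read a big-endian 16-bit value from the packet. */
static uint16_t
sketch_read_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Parse L2/L3 headers; on any failure the mbuf metadata is left untouched. */
static int
sketch_parse_headers(struct sketch_mbuf *m, uint8_t *l4_proto)
{
	uint32_t l2_len = SKETCH_ETHER_HDR_LEN;
	uint32_t l3_len;
	uint16_t ethertype;

	if (m->data_len < l2_len)
		return -EINVAL;
	ethertype = sketch_read_be16(m->data + 12);

	if (ethertype == SKETCH_ETHERTYPE_VLAN) {
		/* Absolute check: Ethernet header plus VLAN header. */
		if (m->data_len < l2_len + SKETCH_VLAN_HDR_LEN)
			return -EINVAL;
		ethertype = sketch_read_be16(m->data + l2_len + 2);
		l2_len += SKETCH_VLAN_HDR_LEN;
	}

	switch (ethertype) {
	case SKETCH_ETHERTYPE_IPV4:
		if (m->data_len < l2_len + SKETCH_IPV4_MIN_HDR)
			return -EINVAL;
		/* IHL is the low nibble of the first byte, in 32-bit words. */
		l3_len = (uint32_t)(m->data[l2_len] & 0x0f) * 4u;
		if (l3_len < SKETCH_IPV4_MIN_HDR ||
		    m->data_len < l2_len + l3_len)
			return -EINVAL;
		*l4_proto = m->data[l2_len + 9];
		break;
	default:
		/* Unknown ethertype: invalid, and nothing was modified. */
		return -EINVAL;
	}

	/* Commit metadata only after every check has passed. */
	m->l2_len = (uint16_t)l2_len;
	m->l3_len = (uint16_t)l3_len;
	m->ol_flags |= SKETCH_TX_IPV4;
	return 0;
}
```

With this shape there is no error path that leaves l2_len set while l3_len is
cleared, which is the kind of partial state the review comments point out.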

I will prepare a new version.

BRs,
Xiao

> -----Original Message-----
> From: David Marchand
> Sent: Tuesday, June 15, 2021 3:57 PM
> To: Wang, Xiao W
> Cc: Maxime Coquelin; Xia, Chenbo; Jiang, Cheng1; dev; dpdk stable
> Subject: Re: [dpdk-dev] [PATCH v4] vhost: check header for legacy dequeue
> offload
> 
> On Tue, Jun 15, 2021 at 9:06 AM Xiao Wang wrote:
> > diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
> > index 8da8a86a10..351ff0a841 100644
> > --- a/lib/vhost/virtio_net.c
> > +++ b/lib/vhost/virtio_net.c
> > @@ -2259,44 +2259,64 @@ virtio_net_with_host_offload(struct
> virtio_net *dev)
> > return false;
> >  }
> >
> > -static void
> > -parse_ethernet(struct rte_mbuf *m, uint16_t *l4_proto, void **l4_hdr)
> > +static int
> > +parse_ethernet(struct rte_mbuf *m, uint16_t *l4_proto, void **l4_hdr,
> > +   uint16_t *len)
> >  {
> 
> 
> This function name is misleading, name could be parse_headers().
> Its semantic gets more and more confusing with those l4_hdr and len
> pointers.
> 
> This function fills ->lX_len in the mbuf, everything is available for caller.
> 
> Caller can check that rte_pktmbuf_data_len() is >= m->l2_len +
> m->l3_len + somesize.
> => no need for len.
> 
> l4_hdr can simply be deduced with rte_pktmbuf_mtod_offset(m, struct
> somestruct *, m->l2_len + m->l3_len).
> => no need for l4_hdr.
> 
> 
> > struct rte_ipv4_hdr *ipv4_hdr;
> > struct rte_ipv6_hdr *ipv6_hdr;
> > void *l3_hdr = NULL;
> 
> No need for l3_hdr.
> 
> 
> > struct rte_ether_hdr *eth_hdr;
> > uint16_t ethertype;
> > +   uint16_t data_len = m->data_len;
> 
> Avoid direct access to mbuf internals, we have inline helpers:
> rte_pktmbuf_data_len(m).
> 
> 
> > +
> > +   if (data_len <= sizeof(struct rte_ether_hdr))
> 
> Strictly speaking, < is enough.
> 
> 
> > +   return -EINVAL;
> >
> > eth_hdr = rte_pktmbuf_mtod(m, struct rte_ether_hdr *);
> >
> > m->l2_len = sizeof(struct rte_ether_hdr);
> > ethertype = rte_be_to_cpu_16(eth_hdr->ether_type);
> > +   data_len -= sizeof(struct rte_ether_hdr);
> 
> No need to decrement data_len if checks below are all done for absolute
> value.
> See suggestions below.
> 
> 
> >
> > if (ethertype == RTE_ETHER_TYPE_VLAN) {
> > +   if (data_len <= sizeof(struct rte_vlan_hdr))
> > +   return -EINVAL;
> 
> if (data_len < sizeof(rte_ether_hdr) + sizeof(struct rte_vlan_hdr))
> 
> 
> > +
> > struct rte_vlan_hdr *vlan_hdr =
> > (struct rte_vlan_hdr *)(eth_hdr + 1);
> >
> > m->l2_len += sizeof(struct rte_vlan_hdr);
> > ethertype = rte_be_to_cpu_16(vlan_hdr->eth_proto);
> > +   data_len -= sizeof(struct rte_vlan_hdr);
> 
> Idem.
> 
> 
> > }
> >
> > l3_hdr = (char *)eth_hdr + m->l2_len;
> >
> > switch (ethertype) {
> > case RTE_ETHER_TYPE_IPV4:
> > +   if (data_len <= sizeof(struct rte_ipv4_hdr))
> > +   return -EINVAL;
> 
> if (data_len < m->l2_len + sizeof(struct rte_ipv4_hdr))
> 
> 
> > ipv4_hdr = l3_hdr;
> 
> ipv4_hdr = rte_pktmbuf_mtod_offset(m, struct rte_ipv4_hdr *, m->l2_len);
> 
> 
> > *l4_proto = ipv4_hdr->next_proto_id;
> > m->l3_len = rte_ipv4_hdr_len(ipv4_hdr);
> > +   if (data_len <= m->l3_len) {
> 
> if (data_len < m->l2_len + m->l3_len)
> 
> 
> > +   m->l3_len = 0;
> > +   return -EINVAL;
> 
> Returning here leaves m->l2_len set.
> 
> 
> > +   }
> > *l4_hdr = (char *)l3_hdr + m->l3_len;
> > m->ol_flags |= PKT_TX_IPV4;
> > +   data_len -= m->l3_len;
> > break;
> > case RTE_ETHER_TYPE_IPV6:
> > +   if (data_len <= sizeof(struct rte_ipv6_hdr))
> > +   return -EI

RE: [PATCH] vdpa/ifc: fix null pointer dereference

2022-06-08 Thread Wang, Xiao W
Hi

> -----Original Message-----
> From: Pei, Andy
> Sent: Wednesday, June 8, 2022 3:34 PM
> To: dev@dpdk.org
> Cc: Xia, Chenbo; maxime.coque...@redhat.com; Wang, Xiao W; Xu, Rosen;
> Xiao, QimaiX
> Subject: [PATCH] vdpa/ifc: fix null pointer dereference
> 
> Fix null pointer dereference reported in coverity scan.
> 
> Coverity issue: 378882
> Fixes: 8162a4a9 ("vdpa/ifc/base: access correct register for blk device")
> Signed-off-by: Andy Pei 
> ---
>  drivers/vdpa/ifc/base/ifcvf.c | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/vdpa/ifc/base/ifcvf.c b/drivers/vdpa/ifc/base/ifcvf.c
> index dd475a7..0a9f71a 100644
> --- a/drivers/vdpa/ifc/base/ifcvf.c
> +++ b/drivers/vdpa/ifc/base/ifcvf.c
> @@ -255,6 +255,10 @@
>   u32 ring_state;
> 
>   cfg = hw->common_cfg;
> + if (!cfg) {
> + DEBUGOUT("common_cfg in HW is NULL.\n");
> + return;
> + }
> 
>   IFCVF_WRITE_REG16(IFCVF_MSI_NO_VECTOR, &cfg->msix_config);
>   for (i = 0; i < hw->nr_vring; i++) {
> @@ -262,6 +266,11 @@
>   IFCVF_WRITE_REG16(0, &cfg->queue_enable);
>   IFCVF_WRITE_REG16(IFCVF_MSI_NO_VECTOR, &cfg-
> >queue_msix_vector);
> 
> + if (!hw->lm_cfg) {
> + DEBUGOUT("live migration cfg in HW is NULL.\n");
> + continue;
> + }
> +
>   if (hw->device_type == IFCVF_BLK)
>   ring_state = *(u32 *)(hw->lm_cfg +
>   IFCVF_LM_RING_STATE_OFFSET +
> --
> 1.8.3.1

Acked-by: Xiao Wang 

BRs,
Xiao


Re: [dpdk-dev] [PATCH v3] vhost: add header check in dequeue offload

2021-04-02 Thread Wang, Xiao W

> -----Original Message-----
> From: David Marchand
> Sent: Thursday, April 1, 2021 8:04 PM
> To: Wang, Xiao W
> Cc: Xia, Chenbo; Maxime Coquelin; Liu, Yong; dev; Ananyev, Konstantin;
> dpdk stable
> Subject: Re: [PATCH v3] vhost: add header check in dequeue offload
> 
> On Wed, Mar 17, 2021 at 7:50 AM Xiao Wang wrote:
> >
> > When parsing the virtio net header and packet header for dequeue offload,
> > we need to perform sanity check on the packet header to ensure:
> >   - No out-of-boundary memory access.
> >   - The packet header and virtio_net header are valid and aligned.
> >
> > Fixes: d0cf91303d73 ("vhost: add Tx offload capabilities")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Xiao Wang 
> 
> I spent some time digging on this topic.
> 
> Afaiu the offload API, vhost is not supposed to populate tx offloads.
> I would drop this whole parse_ethernet function and replace
> vhost_dequeue_offload with what virtio does on the rx side.
> 
> Please have a look at this series (especially the last patch):
> http://patchwork.dpdk.org/project/dpdk/list/?series=16052
> 
> 
> Thanks.
> 
> --
> David Marchand

+Yang, Yi into this loop, who may have comments, especially from the OVS
perspective, on CKSUM/TSO/TSO in tunnel/etc.

I think the original vhost implementation here is meant to help pass virtio's
offload request on to the next output port, either a physical device or a
virtio device.
If we go with series http://patchwork.dpdk.org/project/dpdk/list/?series=16052,
then the virtual switch needs to do an extra translation on the flags,
e.g. PKT_RX_LRO --> PKT_TX_TCP_SEG. The question is: since a packet marked with
PKT_RX_LRO may come from different types of ports (non-vhost), how can the
vSwitch tell whether a TSO request should be set for this packet at
transmission?

From an endpoint app's perspective, I'm inclined to agree with your series.
From a switch/router's perspective, I'm inclined to keep the current
implementation. Maybe we can add the PKT_RX_L4_CKSUM_NONE/PKT_RX_LRO flags to
the current implementation; this seems to cover both scenarios.

BRs,
Xiao





Re: [dpdk-dev] [PATCH v2 3/3] drivers: align log names

2021-04-05 Thread Wang, Xiao W
Hi Thomas,  

> -----Original Message-----
> From: Thomas Monjalon
> Sent: Monday, April 5, 2021 6:03 PM
> To: dev@dpdk.org
> Cc: Richardson, Bruce; Xu, Rosen; Wang, Xiao W; Hemant Agrawal;
> Ajit Khaparde; Griffin, John; Trahe, Fiona; Jain, Deepak K;
> Raveendra Padasalagi; Vikas Gupta; John W. Linville; Chas Williams;
> Min Hu (Connor); Zhang, Tianfei; Nipun Gupta
> Subject: [PATCH v2 3/3] drivers: align log names
> 
> The log levels are configured by using the name of the logs.
> Some drivers are aligned to follow a common log name standard:
>   pmd.class.driver[.sub]
> Some "common" drivers skip the "class" part:
>   pmd.driver.sub
> 
> Two drivers have exceptions to be clarified:
>   pmd.vdpa.ifcvf instead of pmd.vdpa.ifc
>   pmd.afu.ipn3ke instead of pmd.net.ipn3ke
> 
> Signed-off-by: Thomas Monjalon 
> Acked-by: Bruce Richardson 
> Acked-by: Rosen Xu 
> Acked-by: Xiao Wang 
> Acked-by: Hemant Agrawal 
> Acked-by: Ajit Khaparde 
> ---
>  doc/guides/cryptodevs/qat.rst | 10 +-
>  drivers/common/qat/qat_logs.c |  4 ++--
>  drivers/crypto/bcmfs/bcmfs_logs.c |  4 ++--
>  drivers/net/af_packet/rte_eth_af_packet.c |  2 +-
>  drivers/net/bonding/rte_eth_bond_pmd.c|  2 +-
>  drivers/raw/ifpga/ifpga_rawdev.c  |  2 +-
>  drivers/raw/ioat/ioat_rawdev.c|  2 +-
>  drivers/raw/skeleton/skeleton_rawdev.c|  2 +-
>  drivers/vdpa/ifc/ifcvf_vdpa.c |  2 +-
>  9 files changed, 15 insertions(+), 15 deletions(-)
> 
> diff --git a/doc/guides/cryptodevs/qat.rst b/doc/guides/cryptodevs/qat.rst
> index cf16f03503..224b22b3f7 100644
> --- a/doc/guides/cryptodevs/qat.rst
> +++ b/doc/guides/cryptodevs/qat.rst
> @@ -659,15 +659,15 @@ Debugging
> 
>  There are 2 sets of trace available via the dynamic logging feature:
> 
> -* pmd.qat_dp exposes trace on the data-path.
> -* pmd.qat_general exposes all other trace.
> +* pmd.qat.dp exposes trace on the data-path.
> +* pmd.qat.general exposes all other trace.
> 
>  pmd.qat exposes both sets of traces.
>  They can be enabled using the log-level option (where 8=maximum log
> level) on
>  the process cmdline, e.g. using any of the following::
> 
> ---log-level="pmd.qat_general,8"
> ---log-level="pmd.qat_dp,8"
> +--log-level="pmd.qat.general,8"
> +--log-level="pmd.qat.dp,8"
>  --log-level="pmd.qat,8"
> 
>  .. Note::
> @@ -678,4 +678,4 @@ the process cmdline, e.g. using any of the following::
>  Also the dynamic global log level overrides both sets of trace, so e.g. 
> no
>  QAT trace would display in this case::
> 
> - --log-level="7" --log-level="pmd.qat_general,8"
> + --log-level="7" --log-level="pmd.qat.general,8"
> diff --git a/drivers/common/qat/qat_logs.c
> b/drivers/common/qat/qat_logs.c
> index fa48be53c3..adbe163cd9 100644
> --- a/drivers/common/qat/qat_logs.c
> +++ b/drivers/common/qat/qat_logs.c
> @@ -17,5 +17,5 @@ qat_hexdump_log(uint32_t level, uint32_t logtype,
> const char *title,
>   return 0;
>  }
> 
> -RTE_LOG_REGISTER(qat_gen_logtype, pmd.qat_general, NOTICE);
> -RTE_LOG_REGISTER(qat_dp_logtype, pmd.qat_dp, NOTICE);
> +RTE_LOG_REGISTER(qat_gen_logtype, pmd.qat.general, NOTICE);
> +RTE_LOG_REGISTER(qat_dp_logtype, pmd.qat.dp, NOTICE);
> diff --git a/drivers/crypto/bcmfs/bcmfs_logs.c
> b/drivers/crypto/bcmfs/bcmfs_logs.c
> index 701da9ecf3..9faf12f238 100644
> --- a/drivers/crypto/bcmfs/bcmfs_logs.c
> +++ b/drivers/crypto/bcmfs/bcmfs_logs.c
> @@ -21,5 +21,5 @@ bcmfs_hexdump_log(uint32_t level, uint32_t logtype,
> const char *title,
>   return 0;
>  }
> 
> -RTE_LOG_REGISTER(bcmfs_conf_logtype, pmd.bcmfs_config, NOTICE)
> -RTE_LOG_REGISTER(bcmfs_dp_logtype, pmd.bcmfs_fp, NOTICE)
> +RTE_LOG_REGISTER(bcmfs_conf_logtype, pmd.crypto.bcmfs.config,
> NOTICE)
> +RTE_LOG_REGISTER(bcmfs_dp_logtype, pmd.crypto.bcmfs.fp, NOTICE)
> diff --git a/drivers/net/af_packet/rte_eth_af_packet.c
> b/drivers/net/af_packet/rte_eth_af_packet.c
> index bfe5a0a451..a04f7c773a 100644
> --- a/drivers/net/af_packet/rte_eth_af_packet.c
> +++ b/drivers/net/af_packet/rte_eth_af_packet.c
> @@ -97,7 +97,7 @@ static struct rte_eth_link pmd_link = {
>   .link_autoneg = ETH_LINK_FIXED,
>  };
> 
> -RTE_LOG_REGISTER(af_packet_logtype, pmd.net.packet, NOTICE);
> +RTE_LOG_REGISTER(af_packet_logtype, pmd.net.af_packet, NOTICE);
> 
>  #define PMD_LOG(level, fmt, args...) \
>   rte_log(RTE_LOG_ ## level, af_packet_logtype, \
> diff --git a/drivers/net/bonding/rte_eth_bond_p

Re: [dpdk-dev] [PATCH v2 3/3] drivers: align log names

2021-04-06 Thread Wang, Xiao W
Hi Thomas,

I agree with the "ifcvf" name, as there might be more than one driver in a 
single driver dir.

Thanks,
Xiao

> -----Original Message-----
> From: Xu, Rosen
> Sent: Tuesday, April 6, 2021 5:48 PM
> To: Thomas Monjalon; Wang, Xiao W
> Cc: dev@dpdk.org; Richardson, Bruce; Hemant Agrawal; Ajit Khaparde;
> Griffin, John; Trahe, Fiona; Jain, Deepak K; Raveendra Padasalagi;
> Vikas Gupta; John W. Linville; Chas Williams; Min Hu (Connor);
> Zhang, Tianfei; Nipun Gupta; david.march...@redhat.com
> Subject: RE: [PATCH v2 3/3] drivers: align log names
> 
> Hi Thomas,
> 
> I'm OK with replacing pmd.afu.ipn3ke with pmd.net.ipn3ke. Thanks for the
> reminder.
> 
> > -----Original Message-----
> > From: Thomas Monjalon
> > Sent: Tuesday, April 06, 2021 17:31
> > To: Xu, Rosen; Wang, Xiao W
> > Cc: dev@dpdk.org; Richardson, Bruce; Hemant Agrawal; Ajit Khaparde;
> > Griffin, John; Trahe, Fiona; Jain, Deepak K; Raveendra Padasalagi;
> > Vikas Gupta; John W. Linville; Chas Williams; Min Hu (Connor);
> > Zhang, Tianfei; Nipun Gupta; david.march...@redhat.com
> > Subject: Re: [PATCH v2 3/3] drivers: align log names
> >
> > Hi Rosen,
> >
> > You already gave your ack in previous version, no need to re-ack.
> > Instead, please give your opinion and explanation as requested below.
> > We want to replace pmd.afu.ipn3ke with pmd.net.ipn3ke.
> > The use of AFU in the driver is not clear.
> >
> > Xiao, we need your opinion as well about ifcvf vs ifc name.
> >
> >
> > > > The log levels are configured by using the name of the logs.
> > > > Some drivers are aligned to follow a common log name standard:
> > > > pmd.class.driver[.sub]
> > > > Some "common" drivers skip the "class" part:
> > > > pmd.driver.sub
> > > >
> > > > Two drivers have exceptions to be clarified:
> > > > pmd.vdpa.ifcvf instead of pmd.vdpa.ifc
> > > > pmd.afu.ipn3ke instead of pmd.net.ipn3ke
> > > >
> > > > Signed-off-by: Thomas Monjalon 
> > > > Acked-by: Bruce Richardson 
> > > > Acked-by: Rosen Xu 
> > > > Acked-by: Xiao Wang 
> > > > Acked-by: Hemant Agrawal 
> > > > Acked-by: Ajit Khaparde 
> > [...]
> > > Acked-by: Rosen Xu 
> >
> >



Re: [dpdk-dev] [PATCH v3] vhost: add header check in dequeue offload

2021-04-12 Thread Wang, Xiao W
Hi,

> -----Original Message-----
> From: Wang, Xiao W
> Sent: Friday, April 2, 2021 4:39 PM
> To: David Marchand
> Cc: Xia, Chenbo; Maxime Coquelin; Liu, Yong; dev; Ananyev, Konstantin;
> dpdk stable; yangy...@inspur.com
> Subject: RE: [PATCH v3] vhost: add header check in dequeue offload
> 
> 
> > -----Original Message-----
> > From: David Marchand
> > Sent: Thursday, April 1, 2021 8:04 PM
> > To: Wang, Xiao W
> > Cc: Xia, Chenbo; Maxime Coquelin; Liu, Yong; dev; Ananyev, Konstantin;
> > dpdk stable
> > Subject: Re: [PATCH v3] vhost: add header check in dequeue offload
> >
> > On Wed, Mar 17, 2021 at 7:50 AM Xiao Wang wrote:
> > >
> > > When parsing the virtio net header and packet header for dequeue offload,
> > > we need to perform sanity check on the packet header to ensure:
> > >   - No out-of-boundary memory access.
> > >   - The packet header and virtio_net header are valid and aligned.
> > >
> > > Fixes: d0cf91303d73 ("vhost: add Tx offload capabilities")
> > > Cc: sta...@dpdk.org
> > >
> > > Signed-off-by: Xiao Wang 
> >
> > I spent some time digging on this topic.
> >
> > Afaiu the offload API, vhost is not supposed to populate tx offloads.
> > I would drop this whole parse_ethernet function and replace
> > vhost_dequeue_offload with what virtio does on the rx side.
> >
> > Please have a look at this series (especially the last patch):
> > http://patchwork.dpdk.org/project/dpdk/list/?series=16052
> >
> >
> > Thanks.
> >
> > --
> > David Marchand
> 
> +Yang, Yi into this loop, who may have comments, especially from the OVS
> perspective, on CKSUM/TSO/TSO in tunnel/etc.
> 
> I think the original vhost implementation here is meant to help pass virtio's
> offload request on to the next output port, either a physical device or a
> virtio device.
> If we go with series
> http://patchwork.dpdk.org/project/dpdk/list/?series=16052, then the virtual
> switch needs to do an extra translation on the flags,
> e.g. PKT_RX_LRO --> PKT_TX_TCP_SEG. The question is: since a packet
> marked with PKT_RX_LRO may come from different types of ports (non-vhost),
> how can the vSwitch tell whether a TSO request should be set for this packet
> at transmission?
> 
> From an endpoint app's perspective, I'm inclined to agree with your series.
> From a switch/router's perspective, I'm inclined to keep the current
> implementation. Maybe we can add the
> PKT_RX_L4_CKSUM_NONE/PKT_RX_LRO flags to the current
> implementation; this seems to cover both scenarios.
> 
> BRs,
> Xiao
> 
> 

Considering that the major consumers of the vhost API are virtual
switches/routers, I tend to keep the current implementation and apply this fix
patch.
Any comments?

BRs,
Xiao


RE: [PATCH 6/7] vhost: remove non-C++ compatible includes

2022-02-09 Thread Wang, Xiao W
Hi Bruce,

> -----Original Message-----
> From: Richardson, Bruce
> Sent: Saturday, February 5, 2022 2:19 AM
> To: dev@dpdk.org
> Cc: Maxime Coquelin; Xia, Chenbo; Wang, Xiao W; Matan Azrad;
> Viacheslav Ovsiienko
> Subject: Re: [PATCH 6/7] vhost: remove non-C++ compatible includes
> 
> On Fri, Feb 04, 2022 at 05:42:08PM +, Bruce Richardson wrote:
> > Some of the Linux header includes are explicitly noted as being
> > incompatible with C++. However, these headers can be included by C files
> > directly, or by internal headers, to avoid polluting the public DPDK
> > headers with non-C++ safe includes.
> >
> > Signed-off-by: Bruce Richardson 
> > ---
> 
> CI is reporting build issues with this patch on examples, something I'm not
> surprised to see. I will wait for maintainer feedback on best approach
> before respinning patchset.
> 
> /Bruce

Could we wrap these C++-incompatible Linux headers in
#ifndef __cplusplus
...
#endif
Then we would only need to change the rte_vhost.h file, and would not break
the build for the drivers and samples.
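
A minimal sketch of the guard pattern suggested above, assuming a public
header that must stay includable from C++. The real Linux headers are not
named here; <stddef.h> is only a stand-in for them, and the SKETCH_* macro and
the helper function are hypothetical, added just to make the effect visible.

```c
#include <stdbool.h>

#ifndef __cplusplus
/*
 * C-only section. In rte_vhost.h this is where the Linux headers noted as
 * C++-incompatible would go; <stddef.h> is only a stand-in in this sketch.
 */
#include <stddef.h>
#define SKETCH_C_ONLY_INCLUDES 1
#else
#define SKETCH_C_ONLY_INCLUDES 0
#endif

/* Reports whether the C-only includes were compiled in. */
static inline bool
sketch_c_only_includes_present(void)
{
	return SKETCH_C_ONLY_INCLUDES != 0;
}
```

One caveat with this approach: any declaration in the public header that uses
a type from the guarded includes would presumably need the same guard, or a
C++ build would see the declaration without the type.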

BRs,
Xiao


RE: [PATCH 09/12] vdpa/ifc: fix build with GCC 12

2022-05-18 Thread Wang, Xiao W
Hi,

> -----Original Message-----
> From: David Marchand
> Sent: Wednesday, May 18, 2022 6:17 PM
> To: dev@dpdk.org
> Cc: tho...@monjalon.net; ferruh.yi...@xilinx.com; sta...@dpdk.org;
> Wang, Xiao W
> Subject: [PATCH 09/12] vdpa/ifc: fix build with GCC 12
> 
> GCC 12 raises the following warning:
> 
> ../drivers/vdpa/ifc/ifcvf_vdpa.c: In function ‘vdpa_enable_vfio_intr’:
> ../drivers/vdpa/ifc/ifcvf_vdpa.c:383:62: error: writing 4 bytes into a
> region of size 0 [-Werror=stringop-overflow=]
>   383 | fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = fd;
>   | ~^~~~
> ../drivers/vdpa/ifc/ifcvf_vdpa.c:348:14: note: at offset 32 into
> destination object ‘irq_set_buf’ of size 32
>   348 | char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
>   |  ^~~
> 
> Validate number of vrings to avoid out of bound access.
> 
> Cc: sta...@dpdk.org
> 
> Signed-off-by: David Marchand 
> ---
>  drivers/vdpa/ifc/ifcvf_vdpa.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
> index 9f05595b6b..6708849bd3 100644
> --- a/drivers/vdpa/ifc/ifcvf_vdpa.c
> +++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
> @@ -354,6 +354,8 @@ vdpa_enable_vfio_intr(struct ifcvf_internal *internal,
> bool m_rx)
>   vring.callfd = -1;
> 
>   nr_vring = rte_vhost_get_vring_num(internal->vid);
> + if (nr_vring > IFCVF_MAX_QUEUES * 2)
> + return -1;
> 
>   irq_set = (struct vfio_irq_set *)irq_set_buf;
>   irq_set->argsz = sizeof(irq_set_buf);
> --
> 2.36.1

Acked-by: Xiao Wang 
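
For reference, the bounds check added by this patch can be sketched in
isolation as below. This is a simplified illustration, not the driver code:
the SKETCH_* constants are placeholders for IFCVF_MAX_QUEUES and
RTE_INTR_VEC_RXTX_OFFSET, their values are assumptions, and the fd array
stands in for the tail of irq_set_buf.

```c
#include <stddef.h>

#define SKETCH_MAX_QUEUES  1   /* placeholder for IFCVF_MAX_QUEUES */
#define SKETCH_RXTX_OFFSET 1   /* placeholder for RTE_INTR_VEC_RXTX_OFFSET */
/* Assumed layout: slots before the offset, then one slot per vring. */
#define SKETCH_FD_SLOTS    (SKETCH_MAX_QUEUES * 2 + SKETCH_RXTX_OFFSET)

/* Fill one fd per vring, refusing counts that would overflow the buffer. */
static int
sketch_fill_fds(int fds[SKETCH_FD_SLOTS], int nr_vring, int fd)
{
	int i;

	/* Same idea as the patch: validate nr_vring before indexing. */
	if (nr_vring < 0 || nr_vring > SKETCH_MAX_QUEUES * 2)
		return -1;

	for (i = 0; i < nr_vring; i++)
		fds[SKETCH_RXTX_OFFSET + i] = fd;
	return 0;
}
```

With the check in place, the highest index written is
SKETCH_RXTX_OFFSET + SKETCH_MAX_QUEUES * 2 - 1, which stays inside the buffer.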

BRs,
Xiao


Re: [dpdk-dev] [PATCH] Enhance code readability when dma_map in ifc/ifcvp_vdpa

2021-09-26 Thread Wang, Xiao W
Hi Jilei,

Please note the patch format requirements: the subject of the patch should
start with "vdpa/ifc: ".
You also need to keep it concise, around 50 characters.
Refer to "doc/guides/contributing/patches.rst" for more details.

Back to this patch: it looks like we can just change the function
ifcvf_dma_map(struct ifcvf_internal *internal, int do_map) to
ifcvf_dma_map(struct ifcvf_internal *internal, bool do_map), and use "true" or
"false" when calling it.
This would align with vdpa_enable_vfio_intr(). In the next version of your
patch, you can also change the "1" and "0" arguments to "true" and "false"
when calling vdpa_enable_vfio_intr().

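A minimal sketch of the bool-parameter suggestion above, using a hypothetical
stand-in struct and function rather than the real ifcvf_internal and
ifcvf_dma_map(). The body is invented for illustration; only the signature
change (int do_map to bool do_map) reflects the suggestion.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct ifcvf_internal. */
struct sketch_internal {
	int nr_mapped;   /* number of regions currently mapped */
};

/* A bool switch parameter makes the intent explicit at each call site. */
static int
sketch_dma_map(struct sketch_internal *internal, bool do_map)
{
	if (do_map) {
		internal->nr_mapped++;
	} else {
		if (internal->nr_mapped == 0)
			return -1;   /* nothing to unmap */
		internal->nr_mapped--;
	}
	return 0;
}
```

A call such as sketch_dma_map(&internal, true) then reads as "map", with no
extra IFCVF_MAP/IFCVF_UNMAP style macros needed.
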
BRs,
Xiao

> -----Original Message-----
> From: jilei chen
> Sent: Monday, September 27, 2021 12:45 AM
> To: Wang, Xiao W
> Cc: dev@dpdk.org
> Subject: [PATCH] Enhance code readability when dma_map in
> ifc/ifcvp_vdpa
> 
> Signed-off-by: jilei chen 
> ---
>  drivers/vdpa/ifc/ifcvf_vdpa.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
> index 1dc813d0a3..c2bf26f2b7 100644
> --- a/drivers/vdpa/ifc/ifcvf_vdpa.c
> +++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
> @@ -36,6 +36,8 @@ RTE_LOG_REGISTER(ifcvf_vdpa_logtype,
> pmd.vdpa.ifcvf, NOTICE);
> 
>  #define IFCVF_VDPA_MODE  "vdpa"
>  #define IFCVF_SW_FALLBACK_LM "sw-live-migration"
> +#define IFCVF_MAP1
> +#define IFCVF_UNMAP  0
> 
>  #define THREAD_NAME_LEN  16
> 
> @@ -538,7 +540,7 @@ update_datapath(struct ifcvf_internal *internal)
>   if (!rte_atomic32_read(&internal->running) &&
>   (rte_atomic32_read(&internal->started) &&
>rte_atomic32_read(&internal->dev_attached))) {
> - ret = ifcvf_dma_map(internal, 1);
> + ret = ifcvf_dma_map(internal, IFCVF_MAP);
>   if (ret)
>   goto err;
> 
> @@ -568,7 +570,7 @@ update_datapath(struct ifcvf_internal *internal)
>   if (ret)
>   goto err;
> 
> - ret = ifcvf_dma_map(internal, 0);
> + ret = ifcvf_dma_map(internal, IFCVF_UNMAP);
>   if (ret)
>   goto err;
> 
> @@ -875,7 +877,7 @@ ifcvf_sw_fallback_switchover(struct ifcvf_internal
> *internal)
>  unset_intr:
>   vdpa_disable_vfio_intr(internal);
>  unmap:
> - ifcvf_dma_map(internal, 0);
> + ifcvf_dma_map(internal, IFCVF_UNMAP);
>  error:
>   return -1;
>  }
> @@ -934,7 +936,7 @@ ifcvf_dev_close(int vid)
>   vdpa_disable_vfio_intr(internal);
> 
>   /* unset DMA map for guest memory */
> - ifcvf_dma_map(internal, 0);
> + ifcvf_dma_map(internal, IFCVF_UNMAP);
> 
>   internal->sw_fallback_running = false;
>   } else {
> --
> 2.12.2
> 
> 



Re: [dpdk-dev] [PATCH] [v2] vdpa/ifc: increase readability in function

2021-09-27 Thread Wang, Xiao W
Hi,

> -----Original Message-----
> From: jilei chen
> Sent: Monday, September 27, 2021 4:12 PM
> To: Wang, Xiao W
> Cc: dev@dpdk.org
> Subject: [PATCH] [v2] vdpa/ifc: increase readability in function
> 
> Optimize several parameters form order to better readability
To the best of my English knowledge, there's a grammar error here.

How about changing it to:
Use bool type for function's switch parameter, this could avoid passing "1" or
"0" which is not reader friendly.

BRs,
Xiao

> 
> Signed-off-by: jilei chen 
> ---
> v2:
> * Concise subject of the patch
> * Optimize function parameters
> ---
>  drivers/vdpa/ifc/ifcvf_vdpa.c | 16 
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
> index 1dc813d0a3..365da2a8b9 100644
> --- a/drivers/vdpa/ifc/ifcvf_vdpa.c
> +++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
> @@ -182,7 +182,7 @@ ifcvf_vfio_setup(struct ifcvf_internal *internal)
>  }
> 
>  static int
> -ifcvf_dma_map(struct ifcvf_internal *internal, int do_map)
> +ifcvf_dma_map(struct ifcvf_internal *internal, bool do_map)
>  {
>   uint32_t i;
>   int ret;
> @@ -538,11 +538,11 @@ update_datapath(struct ifcvf_internal *internal)
>   if (!rte_atomic32_read(&internal->running) &&
>   (rte_atomic32_read(&internal->started) &&
>rte_atomic32_read(&internal->dev_attached))) {
> - ret = ifcvf_dma_map(internal, 1);
> + ret = ifcvf_dma_map(internal, true);
>   if (ret)
>   goto err;
> 
> - ret = vdpa_enable_vfio_intr(internal, 0);
> + ret = vdpa_enable_vfio_intr(internal, false);
>   if (ret)
>   goto err;
> 
> @@ -568,7 +568,7 @@ update_datapath(struct ifcvf_internal *internal)
>   if (ret)
>   goto err;
> 
> - ret = ifcvf_dma_map(internal, 0);
> + ret = ifcvf_dma_map(internal, false);
>   if (ret)
>   goto err;
> 
> @@ -850,7 +850,7 @@ ifcvf_sw_fallback_switchover(struct ifcvf_internal
> *internal)
>   goto error;
> 
>   /* set up interrupt for interrupt relay */
> - ret = vdpa_enable_vfio_intr(internal, 1);
> + ret = vdpa_enable_vfio_intr(internal, true);
>   if (ret)
>   goto unmap;
> 
> @@ -875,7 +875,7 @@ ifcvf_sw_fallback_switchover(struct ifcvf_internal
> *internal)
>  unset_intr:
>   vdpa_disable_vfio_intr(internal);
>  unmap:
> - ifcvf_dma_map(internal, 0);
> + ifcvf_dma_map(internal, false);
>  error:
>   return -1;
>  }
> @@ -934,7 +934,7 @@ ifcvf_dev_close(int vid)
>   vdpa_disable_vfio_intr(internal);
> 
>   /* unset DMA map for guest memory */
> - ifcvf_dma_map(internal, 0);
> + ifcvf_dma_map(internal, false);
> 
>   internal->sw_fallback_running = false;
>   } else {
> @@ -1130,7 +1130,7 @@ ifcvf_set_vring_state(int vid, int vring, int state)
>   }
> 
>   if (state && !hw->vring[vring].enable) {
> - ret = vdpa_enable_vfio_intr(internal, 0);
> + ret = vdpa_enable_vfio_intr(internal, false);
>   if (ret)
>   return ret;
>   }
> --
> 2.12.2
> 
> 



Re: [dpdk-dev] [PATCH] [v3] vdpa/ifc: increase readability in function

2021-09-27 Thread Wang, Xiao W
Hi,

> -----Original Message-----
> From: jilei chen
> Sent: Monday, September 27, 2021 6:18 PM
> To: Wang, Xiao W
> Cc: dev@dpdk.org
> Subject: [PATCH] [v3] vdpa/ifc: increase readability in function
> 
> Use bool type for function's switch parameter,
> this could avoid passing "1" or "0" which is not reader friendly.
> 
> Signed-off-by: jilei chen 
> ---
> v3:
> * Update inappropriate description
> 
> v2:
> * Concise subject of the patch
> * Optimize function parameters
> ---
>  drivers/vdpa/ifc/ifcvf_vdpa.c | 16 
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
> index 1dc813d0a3..365da2a8b9 100644
> --- a/drivers/vdpa/ifc/ifcvf_vdpa.c
> +++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
> @@ -182,7 +182,7 @@ ifcvf_vfio_setup(struct ifcvf_internal *internal)
>  }
> 
>  static int
> -ifcvf_dma_map(struct ifcvf_internal *internal, int do_map)
> +ifcvf_dma_map(struct ifcvf_internal *internal, bool do_map)
>  {
>   uint32_t i;
>   int ret;
> @@ -538,11 +538,11 @@ update_datapath(struct ifcvf_internal *internal)
>   if (!rte_atomic32_read(&internal->running) &&
>   (rte_atomic32_read(&internal->started) &&
>rte_atomic32_read(&internal->dev_attached))) {
> - ret = ifcvf_dma_map(internal, 1);
> + ret = ifcvf_dma_map(internal, true);
>   if (ret)
>   goto err;
> 
> - ret = vdpa_enable_vfio_intr(internal, 0);
> + ret = vdpa_enable_vfio_intr(internal, false);
>   if (ret)
>   goto err;
> 
> @@ -568,7 +568,7 @@ update_datapath(struct ifcvf_internal *internal)
>   if (ret)
>   goto err;
> 
> - ret = ifcvf_dma_map(internal, 0);
> + ret = ifcvf_dma_map(internal, false);
>   if (ret)
>   goto err;
> 
> @@ -850,7 +850,7 @@ ifcvf_sw_fallback_switchover(struct ifcvf_internal
> *internal)
>   goto error;
> 
>   /* set up interrupt for interrupt relay */
> - ret = vdpa_enable_vfio_intr(internal, 1);
> + ret = vdpa_enable_vfio_intr(internal, true);
>   if (ret)
>   goto unmap;
> 
> @@ -875,7 +875,7 @@ ifcvf_sw_fallback_switchover(struct ifcvf_internal
> *internal)
>  unset_intr:
>   vdpa_disable_vfio_intr(internal);
>  unmap:
> - ifcvf_dma_map(internal, 0);
> + ifcvf_dma_map(internal, false);
>  error:
>   return -1;
>  }
> @@ -934,7 +934,7 @@ ifcvf_dev_close(int vid)
>   vdpa_disable_vfio_intr(internal);
> 
>   /* unset DMA map for guest memory */
> - ifcvf_dma_map(internal, 0);
> + ifcvf_dma_map(internal, false);
> 
>   internal->sw_fallback_running = false;
>   } else {
> @@ -1130,7 +1130,7 @@ ifcvf_set_vring_state(int vid, int vring, int state)
>   }
> 
>   if (state && !hw->vring[vring].enable) {
> - ret = vdpa_enable_vfio_intr(internal, 0);
> + ret = vdpa_enable_vfio_intr(internal, false);
>   if (ret)
>   return ret;
>   }
> --
> 2.12.2
> 
> 

Acked-by: Xiao Wang 

BRs,
Xiao


RE: [PATCH 1/3] vdpa/ifc: fix log info mismatch

2021-12-12 Thread Wang, Xiao W
Hi Andy,

Thanks for the patch.
You need to add the "Fixes: " line.

BRs,
Xiao

> -----Original Message-----
> From: Pei, Andy
> Sent: Monday, December 13, 2021 12:29 PM
> To: dev@dpdk.org
> Cc: Pei, Andy; Xia, Chenbo; Wang, Xiao W
> Subject: [PATCH 1/3] vdpa/ifc: fix log info mismatch
> 
> fix log info mismatch.
> 
> Signed-off-by: Andy Pei 
> ---
>  drivers/vdpa/ifc/base/ifcvf.c | 14 --
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/vdpa/ifc/base/ifcvf.c b/drivers/vdpa/ifc/base/ifcvf.c
> index 721cb1d..d10c1fd 100644
> --- a/drivers/vdpa/ifc/base/ifcvf.c
> +++ b/drivers/vdpa/ifc/base/ifcvf.c
> @@ -94,12 +94,14 @@
>   return -1;
>   }
> 
> - DEBUGOUT("capability mapping:\ncommon cfg: %p\n"
> - "notify base: %p\nisr cfg: %p\ndevice cfg: %p\n"
> - "multiplier: %u\n",
> - hw->common_cfg, hw->dev_cfg,
> - hw->isr, hw->notify_base,
> - hw->notify_off_multiplier);
> + DEBUGOUT("capability mapping:\n"
> +  "common cfg: %p\n"
> +  "notify base: %p\n"
> +  "isr cfg: %p\n"
> +  "device cfg: %p\n"
> +  "multiplier: %u\n",
> +  hw->common_cfg, hw->notify_base, hw->isr, hw->dev_cfg,
> +  hw->notify_off_multiplier);
> 
>   return 0;
>  }
> --
> 1.8.3.1



RE: [PATCH 2/3] vdpa/ifc: check lm_cfg is not NULL before use lm_cfg

2021-12-12 Thread Wang, Xiao W
Hi,

Comments inline.

BRs,
Xiao

> -----Original Message-----
> From: Pei, Andy
> Sent: Monday, December 13, 2021 12:29 PM
> To: dev@dpdk.org
> Cc: Pei, Andy; Xia, Chenbo; Wang, Xiao W
> Subject: [PATCH 2/3] vdpa/ifc: check lm_cfg is not NULL before use lm_cfg
> 
> check lm_cfg is not NULL before use lm_cfg.
> when init hardware, if lm_cfg is NULL, output some debug information.

1. We need to capitalize the first letter of a sentence.
2. If lm_cfg is NULL, then I assume the device doesn't support the HW LM
feature. But in the code change below, many places just return silently. Is
that an issue?
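
To make point 2 concrete, here is a small sketch of the alternative to a
silent return: the helper reports an error when lm_cfg is NULL, so the caller
can decide whether missing HW LM support matters. The offset arithmetic and
the avail/used packing follow the patch below; the SKETCH_* constant values
are placeholders, the function names are hypothetical, and memcpy() into plain
memory stands in for the driver's direct store into the mapped region.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder layout constants; the real values live in the ifcvf base code. */
#define SKETCH_LM_RING_STATE_OFFSET 0x20u
#define SKETCH_LM_CFG_SIZE          0x40u

/* Byte offset of the 32-bit ring-state word for vring i, as in the patch. */
static size_t
sketch_ring_state_offset(uint32_t i)
{
	return SKETCH_LM_RING_STATE_OFFSET +
		(i / 2) * SKETCH_LM_CFG_SIZE + (i % 2) * 4;
}

/*
 * Store last_avail_idx (low 16 bits) and last_used_idx (high 16 bits).
 * Returns -1 rather than returning silently when lm_cfg is NULL.
 */
static int
sketch_write_ring_state(uint8_t *lm_cfg, uint32_t i,
			uint16_t last_avail_idx, uint16_t last_used_idx)
{
	uint32_t state;

	if (lm_cfg == NULL)
		return -1;

	state = (uint32_t)last_avail_idx | ((uint32_t)last_used_idx << 16);
	memcpy(lm_cfg + sketch_ring_state_offset(i), &state, sizeof(state));
	return 0;
}
```

Whether an error return, a log message, or a silent skip is the right behavior
depends on whether the device is expected to work without HW LM at all.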

> 
> Signed-off-by: Andy Pei 
> ---
>  drivers/vdpa/ifc/base/ifcvf.c | 32 
>  1 file changed, 24 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/vdpa/ifc/base/ifcvf.c b/drivers/vdpa/ifc/base/ifcvf.c
> index d10c1fd..b9b061f 100644
> --- a/drivers/vdpa/ifc/base/ifcvf.c
> +++ b/drivers/vdpa/ifc/base/ifcvf.c
> @@ -87,6 +87,8 @@
>   }
> 
>   hw->lm_cfg = hw->mem_resource[4].addr;
> + if (!hw->lm_cfg)
> + DEBUGOUT("HW mem_resource[4] is NULL, so lm_cfg is
> NULL.\n");

We need to make the debug message more human readable.

> 
>   if (hw->common_cfg == NULL || hw->notify_base == NULL ||
>   hw->isr == NULL || hw->dev_cfg == NULL) {
> @@ -218,10 +220,13 @@
>   &cfg->queue_used_hi);
>   IFCVF_WRITE_REG16(hw->vring[i].size, &cfg->queue_size);
> 
> - *(u32 *)(lm_cfg + IFCVF_LM_RING_STATE_OFFSET +
> - (i / 2) * IFCVF_LM_CFG_SIZE + (i % 2) * 4) =
> - (u32)hw->vring[i].last_avail_idx |
> - ((u32)hw->vring[i].last_used_idx << 16);
> + if (lm_cfg != NULL) {
> + *(u32 *)(lm_cfg + IFCVF_LM_RING_STATE_OFFSET +
> + (i / 2) * IFCVF_LM_CFG_SIZE +
> + (i % 2) * 4) =
> + (u32)hw->vring[i].last_avail_idx |
> + ((u32)hw->vring[i].last_used_idx << 16);
> + }
> 
>   IFCVF_WRITE_REG16(i + 1, &cfg->queue_msix_vector);
>   if (IFCVF_READ_REG16(&cfg->queue_msix_vector) ==
> @@ -254,10 +259,14 @@
>   IFCVF_WRITE_REG16(i, &cfg->queue_select);
>   IFCVF_WRITE_REG16(0, &cfg->queue_enable);
>   IFCVF_WRITE_REG16(IFCVF_MSI_NO_VECTOR, &cfg-
> >queue_msix_vector);
> - ring_state = *(u32 *)(hw->lm_cfg +
> IFCVF_LM_RING_STATE_OFFSET +
> - (i / 2) * IFCVF_LM_CFG_SIZE + (i % 2) * 4);
> - hw->vring[i].last_avail_idx = (u16)(ring_state >> 16);
> - hw->vring[i].last_used_idx = (u16)(ring_state >> 16);
> + if (hw->lm_cfg != NULL) {
> + ring_state = *(u32 *)(hw->lm_cfg +
> + IFCVF_LM_RING_STATE_OFFSET +
> + (i / 2) * IFCVF_LM_CFG_SIZE +
> + (i % 2) * 4);
> + hw->vring[i].last_avail_idx = (u16)(ring_state >> 16);
> + hw->vring[i].last_used_idx = (u16)(ring_state >> 16);
> + }
>   }
>  }
> 
> @@ -292,6 +301,9 @@
> 
>   lm_cfg = hw->lm_cfg;
> 
> + if (lm_cfg == NULL)
> + return;
> +
>   *(u32 *)(lm_cfg + IFCVF_LM_BASE_ADDR_LOW) =
>   log_base & IFCVF_32_BIT_MASK;
> 
> @@ -313,6 +325,10 @@
>   u8 *lm_cfg;
> 
>   lm_cfg = hw->lm_cfg;
> +
> + if (lm_cfg == NULL)
> + return;
> +
>   *(u32 *)(lm_cfg + IFCVF_LM_LOGGING_CTRL) = IFCVF_LM_DISABLE;
>  }
> 
> --
> 1.8.3.1



RE: [PATCH v2] vdpa/ifc: fix log info mismatch

2021-12-12 Thread Wang, Xiao W
Hi Andy,

BRs,
Xiao

> -Original Message-
> From: Pei, Andy 
> Sent: Monday, December 13, 2021 2:36 PM
> To: dev@dpdk.org
> Cc: Xia, Chenbo ; Wang, Xiao W
> 
> Subject: [PATCH v2] vdpa/ifc: fix log info mismatch
> 
> fix log info mismatch.

Use "Fix".

> 
> Fixes: a3f8150eac6d ("net/ifcvf: add ifcvf vDPA driver")
> Cc: xiao.w.w...@intel.com

For a fix patch, you need to Cc "sta...@dpdk.org", not me.

> 
> Signed-off-by: Andy Pei 
> ---
>  drivers/vdpa/ifc/base/ifcvf.c | 14 --
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/vdpa/ifc/base/ifcvf.c b/drivers/vdpa/ifc/base/ifcvf.c
> index 721cb1d..d10c1fd 100644
> --- a/drivers/vdpa/ifc/base/ifcvf.c
> +++ b/drivers/vdpa/ifc/base/ifcvf.c
> @@ -94,12 +94,14 @@
>   return -1;
>   }
> 
> - DEBUGOUT("capability mapping:\ncommon cfg: %p\n"
> - "notify base: %p\nisr cfg: %p\ndevice cfg: %p\n"
> - "multiplier: %u\n",
> - hw->common_cfg, hw->dev_cfg,
> - hw->isr, hw->notify_base,
> - hw->notify_off_multiplier);
> + DEBUGOUT("capability mapping:\n"
> +  "common cfg: %p\n"
> +  "notify base: %p\n"
> +  "isr cfg: %p\n"
> +  "device cfg: %p\n"
> +  "multiplier: %u\n",
> +  hw->common_cfg, hw->notify_base, hw->isr, hw->dev_cfg,
> +  hw->notify_off_multiplier);
> 
>   return 0;
>  }
> --
> 1.8.3.1



RE: [PATCH v2] vdpa/ifc: Match ANY subsystem IDs for modern virtio devices

2022-12-08 Thread Wang, Xiao W
Hi Abhishek,

Please see comments inline.

BRs,
Xiao

> -Original Message-
> From: Maheshwari, Abhishek 
> Sent: Tuesday, December 6, 2022 8:55 PM
> To: Wang, Xiao W 
> Cc: dev@dpdk.org; sta...@dpdk.org; Xia, Chenbo ;
> Mandal, Purna Chandra ; Maheshwari,
> Abhishek 
> Subject: [PATCH v2] vdpa/ifc: Match ANY subsystem IDs for modern virtio
> devices
> 
> Fixing the match table for vdpa/ifcvf driver because as per the Virtio
> device specification, for modern virtio devices, drivers MAY match any
> PCI Subsystem Vendor ID and any PCI Subsystem Device ID value.

Here "drivers" refers to the virtio driver, not the vdpa driver.
With the change below, this vdpa/ifc driver would also hit standard virtio
devices, which do not fully match this driver.

> 
> Fixes: a60b747d0ad ("vdpa/ifc: support virtio block device")
> Fixes: 5c806b94785 ("vdpa/ifc: add PCI ID for legacy network device")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Abhishek Maheshwari 
> ---
>  drivers/vdpa/ifc/ifcvf_vdpa.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
> index 49d68ad1b1..214d6e1f60 100644
> --- a/drivers/vdpa/ifc/ifcvf_vdpa.c
> +++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
> @@ -1824,8 +1824,8 @@ static const struct rte_pci_id pci_id_ifcvf_map[] = {
>   { .class_id = RTE_CLASS_ANY_ID,
> .vendor_id = IFCVF_VENDOR_ID,
> .device_id = IFCVF_NET_MODERN_DEVICE_ID,
> -   .subsystem_vendor_id = IFCVF_SUBSYS_VENDOR_ID,
> -   .subsystem_device_id = IFCVF_SUBSYS_DEVICE_ID,
> +   .subsystem_vendor_id = RTE_PCI_ANY_ID,
> +   .subsystem_device_id = RTE_PCI_ANY_ID,
>   },
> 
>   { .class_id = RTE_CLASS_ANY_ID,
> @@ -1845,8 +1845,8 @@ static const struct rte_pci_id pci_id_ifcvf_map[] = {
>   { .class_id = RTE_CLASS_ANY_ID,
> .vendor_id = IFCVF_VENDOR_ID,
> .device_id = IFCVF_BLK_MODERN_DEVICE_ID,
> -   .subsystem_vendor_id = IFCVF_SUBSYS_VENDOR_ID,
> -   .subsystem_device_id = IFCVF_SUBSYS_BLK_DEVICE_ID,
> +   .subsystem_vendor_id = RTE_PCI_ANY_ID,
> +   .subsystem_device_id = RTE_PCI_ANY_ID,
>   },
> 
>   { .vendor_id = 0, /* sentinel */
> --
> 2.31.1



Re: [dpdk-dev] [PATCH 3/3] drivers: align log names

2021-03-10 Thread Wang, Xiao W
Hi,

BRs,
Xiao

> -Original Message-
> From: Thomas Monjalon 
> Sent: Wednesday, March 10, 2021 10:01 PM
> To: dev@dpdk.org
> Cc: Griffin, John ; Trahe, Fiona
> ; Jain, Deepak K ; Ajit
> Khaparde ; Raveendra Padasalagi
> ; Vikas Gupta
> ; Xu, Rosen ; Zhang,
> Tianfei ; Richardson, Bruce
> ; Nipun Gupta ;
> Hemant Agrawal ; Wang, Xiao W
> 
> Subject: [PATCH 3/3] drivers: align log names
> 
> The log levels are configured by using the name of the logs.
> Some drivers are aligned to follow a common log name standard:
>   pmd.class.driver[.sub]
> Some "common" drivers skip the "class" part:
>   pmd.driver.sub
> 
> Signed-off-by: Thomas Monjalon 
> ---
>  doc/guides/cryptodevs/qat.rst  | 10 +-
>  drivers/common/qat/qat_logs.c  |  4 ++--
>  drivers/crypto/bcmfs/bcmfs_logs.c  |  4 ++--
>  drivers/raw/ifpga/ifpga_rawdev.c   |  2 +-
>  drivers/raw/ioat/ioat_rawdev.c |  2 +-
>  drivers/raw/skeleton/skeleton_rawdev.c |  2 +-
>  drivers/vdpa/ifc/ifcvf_vdpa.c  |  2 +-
>  7 files changed, 13 insertions(+), 13 deletions(-)
> 
> diff --git a/doc/guides/cryptodevs/qat.rst b/doc/guides/cryptodevs/qat.rst
> index cf16f03503..224b22b3f7 100644
> --- a/doc/guides/cryptodevs/qat.rst
> +++ b/doc/guides/cryptodevs/qat.rst
> @@ -659,15 +659,15 @@ Debugging
> 
>  There are 2 sets of trace available via the dynamic logging feature:
> 
> -* pmd.qat_dp exposes trace on the data-path.
> -* pmd.qat_general exposes all other trace.
> +* pmd.qat.dp exposes trace on the data-path.
> +* pmd.qat.general exposes all other trace.
> 

[...]

> b/drivers/raw/skeleton/skeleton_rawdev.c
> index aa3beaad18..8896f0c9c5 100644
> --- a/drivers/raw/skeleton/skeleton_rawdev.c
> +++ b/drivers/raw/skeleton/skeleton_rawdev.c
> @@ -768,4 +768,4 @@ static struct rte_vdev_driver skeleton_pmd_drv = {
>  };
> 
>  RTE_PMD_REGISTER_VDEV(SKELETON_PMD_RAWDEV_NAME,
> skeleton_pmd_drv);
> -RTE_LOG_REGISTER(skeleton_pmd_logtype, rawdev.skeleton, INFO);
> +RTE_LOG_REGISTER(skeleton_pmd_logtype, pmd.raw.skeleton, INFO);
> diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
> index 6a1b44bc77..bf7afe4610 100644
> --- a/drivers/vdpa/ifc/ifcvf_vdpa.c
> +++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
> @@ -25,7 +25,7 @@
> 
>  #include "base/ifcvf.h"
> 
> -RTE_LOG_REGISTER(ifcvf_vdpa_logtype, pmd.net.ifcvf_vdpa, NOTICE);
> +RTE_LOG_REGISTER(ifcvf_vdpa_logtype, pmd.vdpa.ifcvf, NOTICE);
>  #define DRV_LOG(level, fmt, args...) \
>   rte_log(RTE_LOG_ ## level, ifcvf_vdpa_logtype, \
>   "IFCVF %s(): " fmt "\n", __func__, ##args)
> --
> 2.30.1

For vdpa/ifc part:
Acked-by: Xiao Wang 
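
As a side note on why aligned names help: a single pattern can then select a
whole driver class. The sketch below assumes glob-style matching via POSIX
fnmatch() as a stand-in for how a log-level pattern selects log names;
demo_log_name_matches is a hypothetical helper, not a DPDK API.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if the log name is selected by the glob pattern. */
static bool
demo_log_name_matches(const char *pattern, const char *name)
{
	assert(pattern != NULL && name != NULL);

	return fnmatch(pattern, name, 0) == 0;
}
```

Under that assumption, "pmd.vdpa.*" selects the new name pmd.vdpa.ifcvf but
not the old pmd.net.ifcvf_vdpa.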



Re: [dpdk-dev] [PATCH] vhost: add header check in dequeue offload

2021-03-12 Thread Wang, Xiao W
Hi Konstantin,

Comments inline.

BRs,
Xiao

> -Original Message-
> From: Ananyev, Konstantin 
> Sent: Thursday, March 11, 2021 6:38 PM
> To: Wang, Xiao W ; Xia, Chenbo
> ; maxime.coque...@redhat.com
> Cc: Liu, Yong ; dev@dpdk.org; Wang, Xiao W
> ; sta...@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] vhost: add header check in dequeue
> offload
> 
> 
> 
> >
> > When parsing the virtio net header and packet header for dequeue
> offload,
> > we need to perform sanity check on the packet header to ensure:
> >   - No out-of-boundary memory access.
> >   - The packet header and virtio_net header are valid and aligned.
> >
> > Fixes: d0cf91303d73 ("vhost: add Tx offload capabilities")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Xiao Wang 
> > ---
> >  lib/librte_vhost/virtio_net.c | 49
> +--
> >  1 file changed, 43 insertions(+), 6 deletions(-)
> >
> > diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> > index 583bf379c6..0fba0053a3 100644
> > --- a/lib/librte_vhost/virtio_net.c
> > +++ b/lib/librte_vhost/virtio_net.c
> > @@ -1821,44 +1821,64 @@ virtio_net_with_host_offload(struct
> virtio_net *dev)
> > return false;
> >  }
> >
> > -static void
> > -parse_ethernet(struct rte_mbuf *m, uint16_t *l4_proto, void **l4_hdr)
> > +static int
> > +parse_ethernet(struct rte_mbuf *m, uint16_t *l4_proto, void **l4_hdr,
> > +   uint16_t *len)
> >  {
> > struct rte_ipv4_hdr *ipv4_hdr;
> > struct rte_ipv6_hdr *ipv6_hdr;
> > void *l3_hdr = NULL;
> > struct rte_ether_hdr *eth_hdr;
> > uint16_t ethertype;
> > +   uint16_t data_len = m->data_len;
> >
> > eth_hdr = rte_pktmbuf_mtod(m, struct rte_ether_hdr *);
> >
> > +   if (data_len <= sizeof(struct rte_ether_hdr))
> > +   return -EINVAL;
> > +
> > m->l2_len = sizeof(struct rte_ether_hdr);
> > ethertype = rte_be_to_cpu_16(eth_hdr->ether_type);
> > +   data_len -= sizeof(struct rte_ether_hdr);
> >
> > if (ethertype == RTE_ETHER_TYPE_VLAN) {
> > +   if (data_len <= sizeof(struct rte_vlan_hdr))
> > +   return -EINVAL;
> > +
> > struct rte_vlan_hdr *vlan_hdr =
> > (struct rte_vlan_hdr *)(eth_hdr + 1);
> >
> > m->l2_len += sizeof(struct rte_vlan_hdr);
> > ethertype = rte_be_to_cpu_16(vlan_hdr->eth_proto);
> > +   data_len -= sizeof(struct rte_vlan_hdr);
> > }
> >
> > l3_hdr = (char *)eth_hdr + m->l2_len;
> >
> > switch (ethertype) {
> > case RTE_ETHER_TYPE_IPV4:
> > +   if (data_len <= sizeof(struct rte_ipv4_hdr))
> > +   return -EINVAL;
> > ipv4_hdr = l3_hdr;
> > *l4_proto = ipv4_hdr->next_proto_id;
> > m->l3_len = rte_ipv4_hdr_len(ipv4_hdr);
> > +   if (data_len <= m->l3_len) {
> > +   m->l3_len = 0;
> > +   return -EINVAL;
> > +   }
> > *l4_hdr = (char *)l3_hdr + m->l3_len;
> > m->ol_flags |= PKT_TX_IPV4;
> > +   data_len -= m->l3_len;
> > break;
> > case RTE_ETHER_TYPE_IPV6:
> > +   if (data_len <= sizeof(struct rte_ipv6_hdr))
> > +   return -EINVAL;
> > ipv6_hdr = l3_hdr;
> > *l4_proto = ipv6_hdr->proto;
> > m->l3_len = sizeof(struct rte_ipv6_hdr);
> > *l4_hdr = (char *)l3_hdr + m->l3_len;
> > m->ol_flags |= PKT_TX_IPV6;
> > +   data_len -= m->l3_len;
> > break;
> > default:
> > m->l3_len = 0;
> > @@ -1866,6 +1886,9 @@ parse_ethernet(struct rte_mbuf *m, uint16_t
> *l4_proto, void **l4_hdr)
> > *l4_hdr = NULL;
> > break;
> > }
> > +
> > +   *len = data_len;
> > +   return 0;
> >  }
> >
> >  static __rte_always_inline void
> > @@ -1874,24 +1897,30 @@ vhost_dequeue_offload(struct virtio_net_hdr
> *hdr, struct rte_mbuf *m)
> > uint16_t l4_proto = 0;
> > void *l4_hdr = NULL;
> > struct rte_tcp_hdr *tcp_hdr = NULL;
> > +   uint16_t len = 0;
> >
> > if (hdr->flags == 0 && hdr->gso_type ==
> VIRTIO_NET_HDR_GSO_NONE)
> > return;
> >
> > -   parse_ethernet(m, &l4_pr

Re: [dpdk-dev] [dpdk-stable] [PATCH v2] vhost: add header check in dequeue offload

2021-03-16 Thread Wang, Xiao W
Hi,

Comments inline.

BRs,
Xiao

> -Original Message-
> From: Ananyev, Konstantin 
> Sent: Tuesday, March 16, 2021 2:53 AM
> To: David Marchand ; Wang, Xiao W
> 
> Cc: Xia, Chenbo ; Maxime Coquelin
> ; Liu, Yong ; dev
> ; dpdk stable 
> Subject: RE: [dpdk-stable] [PATCH v2] vhost: add header check in dequeue
> offload
> 
> 
> 
> > -Original Message-
> > From: David Marchand 
> > Sent: Monday, March 15, 2021 4:17 PM
> > To: Wang, Xiao W 
> > Cc: Xia, Chenbo ; Maxime Coquelin
> ; Liu, Yong ; dev
> > ; Ananyev, Konstantin ;
> dpdk stable 
> > Subject: Re: [dpdk-stable] [PATCH v2] vhost: add header check in dequeue
> offload
> >
> > On Mon, Mar 15, 2021 at 4:52 PM Xiao Wang 
> wrote:
> > >
> > > When parsing the virtio net header and packet header for dequeue
> offload,
> > > we need to perform sanity check on the packet header to ensure:
> > >   - No out-of-boundary memory access.
> > >   - The packet header and virtio_net header are valid and aligned.
> > >
> > > Fixes: d0cf91303d73 ("vhost: add Tx offload capabilities")
> > > Cc: sta...@dpdk.org
> > >
> > > Signed-off-by: Xiao Wang 
> > > ---
> > > v2:
> > > Allow empty L4 payload for cksum offload.
> > > ---
> > >  lib/librte_vhost/virtio_net.c | 49
> +--
> > >  1 file changed, 43 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> > > index 583bf379c6..53a8ff2898 100644
> > > --- a/lib/librte_vhost/virtio_net.c
> > > +++ b/lib/librte_vhost/virtio_net.c
> > > @@ -1821,44 +1821,64 @@ virtio_net_with_host_offload(struct
> virtio_net *dev)
> > > return false;
> > >  }
> > >
> > > -static void
> > > -parse_ethernet(struct rte_mbuf *m, uint16_t *l4_proto, void
> **l4_hdr)
> > > +static int
> > > +parse_ethernet(struct rte_mbuf *m, uint16_t *l4_proto, void
> **l4_hdr,
> > > +   uint16_t *len)
> > >  {
> > > struct rte_ipv4_hdr *ipv4_hdr;
> > > struct rte_ipv6_hdr *ipv6_hdr;
> > > void *l3_hdr = NULL;
> > > struct rte_ether_hdr *eth_hdr;
> > > uint16_t ethertype;
> > > +   uint16_t data_len = m->data_len;
> >
> > >
> > > eth_hdr = rte_pktmbuf_mtod(m, struct rte_ether_hdr *);
> > >
> > > +   if (data_len <= sizeof(struct rte_ether_hdr))
> > > +   return -EINVAL;
> >
> > On principle, the check should happen before calling rte_pktmbuf_mtod,
> > like what rte_pktmbuf_read does.

Yes, I agree. Will fix it in v3.
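
The ordering David asks for (validate the length first, form the header
pointer afterwards) could look roughly like this. It is a sketch over a
hypothetical single-segment buffer (demo_seg, demo_seg_read), not over
struct rte_mbuf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer standing in for the first mbuf segment. */
struct demo_seg {
	const uint8_t *data;
	uint16_t data_len;
};

/*
 * Return a pointer to len bytes at offset off, or NULL if that range does
 * not fit in the segment. The range is checked before any pointer into the
 * data is produced.
 */
static const void *
demo_seg_read(const struct demo_seg *seg, uint32_t off, uint32_t len)
{
	assert(seg != NULL);

	/* Written this way to avoid overflow in off + len. */
	if (off > seg->data_len || len > seg->data_len - off)
		return NULL;

	return seg->data + off;
}
```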

> >
> > Looking at the rest of the patch, does this helper function only
> > handle mono segment mbufs?
> > My reading of copy_desc_to_mbuf() was that it could generate multi
> > segments mbufs...

copy_desc_to_mbuf() can generate multi-segment mbufs: the whole packet is
copied across several mbufs when the packet size is larger than one mbuf's
capacity.
Anyway, one mbuf's capacity is big enough to hold the L2/L3/L4 headers.

> >
> >
> > [snip]
> >
> > > case RTE_ETHER_TYPE_IPV4:
> > > +   if (data_len <= sizeof(struct rte_ipv4_hdr))
> > > +   return -EINVAL;
> > > ipv4_hdr = l3_hdr;
> > > *l4_proto = ipv4_hdr->next_proto_id;
> > > m->l3_len = rte_ipv4_hdr_len(ipv4_hdr);
> > > +   if (data_len <= m->l3_len) {
> > > +   m->l3_len = 0;
> > > +   return -EINVAL;
> > > +   }
> >
> > ... so here, comparing l3 length to only the first segment length
> > (data_len) would be invalid.
> >
> > If this helper must deal with multi segments, why not use
> rte_pktmbuf_read?
> > This function returns access to mbuf data after checking offset and
> > length are contiguous, else copy the needed data in a passed buffer.
> 
> From my understanding, yes multi-seg is allowed, but an expectation
> Is that at least packet header (l2/l3/l4?) will always reside in first 
> segment.

Yeah, I think so.
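
For comparison, the rte_pktmbuf_read-style access described above (return a
direct pointer when the range is contiguous, otherwise copy the data into a
caller-provided buffer) can be sketched as below. The demo_mseg chain and
demo_pkt_read are hypothetical stand-ins written for this illustration, not
the DPDK implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical segment chain standing in for a multi-segment mbuf. */
struct demo_mseg {
	const uint8_t *data;
	uint16_t data_len;
	const struct demo_mseg *next;
};

/*
 * Return a pointer to len bytes starting at packet offset off.
 * If the range lies within one segment, point directly into it.
 * If it spans segments, gather it into buf and return buf.
 * Return NULL if the packet is too short.
 */
static const void *
demo_pkt_read(const struct demo_mseg *m, uint32_t off, uint32_t len,
	      void *buf)
{
	uint8_t *dst = buf;
	uint32_t copied = 0;

	/* Skip the segments that lie entirely before the offset. */
	while (m != NULL && off >= m->data_len) {
		off -= m->data_len;
		m = m->next;
	}
	if (m == NULL)
		return NULL;

	/* Fast path: the whole range is contiguous in this segment. */
	if (len <= (uint32_t)m->data_len - off)
		return m->data + off;

	/* Slow path: gather the range from consecutive segments. */
	assert(buf != NULL);
	while (m != NULL && copied < len) {
		uint32_t chunk = (uint32_t)m->data_len - off;

		if (chunk > len - copied)
			chunk = len - copied;
		memcpy(dst + copied, m->data + off, chunk);
		copied += chunk;
		off = 0;
		m = m->next;
	}

	return copied == len ? buf : NULL;
}
```

If the headers are always in the first segment, as expected here, only the
fast path is ever taken.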

Thanks for all the comments,
-Xiao

> 
> >
> >
> > > *l4_hdr = (char *)l3_hdr + m->l3_len;
> > > m->ol_flags |= PKT_TX_IPV4;
> > > +   data_len -= m->l3_len;
> > > break;
> >
> >
> > --
> > David Marchand



Re: [dpdk-dev] [PATCH v3] vhost: add header check in dequeue offload

2021-05-07 Thread Wang, Xiao W
Hi Maxime and David,

I see the patch "vhost: fix offload flags in Rx path"
http://patches.dpdk.org/project/dpdk/patch/20210503164344.27916-4-david.march...@redhat.com/
has been merged, and the legacy implementation is kept. Do you think we still
need to fix the header check for the legacy implementation?

BRs,
Xiao

> -Original Message-
> From: Maxime Coquelin 
> Sent: Tuesday, April 13, 2021 10:31 PM
> To: David Marchand ; Wang, Xiao W
> 
> Cc: Xia, Chenbo ; Liu, Yong ;
> dev ; Ananyev, Konstantin
> ; dpdk stable ;
> yangy...@inspur.com
> Subject: Re: [PATCH v3] vhost: add header check in dequeue offload
> 
> 
> 
> On 4/12/21 11:33 AM, David Marchand wrote:
> > On Mon, Apr 12, 2021 at 11:09 AM Wang, Xiao W
>  wrote:
> >> Considering the major consumer of vhost API is virtual switch/router, I
> tend to keep the current implementation and apply this fix patch.
> >> Any comments?
> >
> > This is just a hack that bypasses the vswitch control.
> >
> > It happens to work when the vswitch does nothing.
> > If anything is done, like popping a vlan header, the vswitch needs to
> > update l3 offset.
> >
> >
> 
> I agree with David, current behavior is wrong.
> 
> Furthermore, when the lib is used via the Vhost PMD, the application
> should not have to handle it differently on whether it is Vhost PMD or
> any physical NIC PMD.



Re: [dpdk-dev] [PATCH] net/fm10k: fix secondary process crash

2017-03-31 Thread Wang, Xiao W
Hi Mark,

> -Original Message-
> From: Chen, Jing D
> Sent: Friday, March 31, 2017 9:30 AM
> To: Wang, Xiao W 
> Cc: dev@dpdk.org; sta...@dpdk.org
> Subject: RE: [PATCH] net/fm10k: fix secondary process crash
> 
> 
> 
> > -Original Message-
> > From: Wang, Xiao W
> > Sent: Tuesday, March 28, 2017 11:59 AM
> > To: Chen, Jing D 
> > Cc: dev@dpdk.org; Wang, Xiao W ;
> sta...@dpdk.org
> > Subject: [PATCH] net/fm10k: fix secondary process crash
> >
> > If the primary process has initialized all the queues to vector pmd mode,
> the
> > secondary process should not use scalar code path, because the per queue
> data
> > structures haven't been prepared for that, e.g. txq->ops is for vector Tx
> rather
> > than scalar Tx.
> >
> > Fixes: a6ce64a97520 ("fm10k: introduce vector driver")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Xiao Wang 
> > ---
> >  drivers/net/fm10k/fm10k_ethdev.c | 28 ++--
> >  1 file changed, 26 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/fm10k/fm10k_ethdev.c
> > b/drivers/net/fm10k/fm10k_ethdev.c
> > index 388f929..680d617 100644
> > --- a/drivers/net/fm10k/fm10k_ethdev.c
> > +++ b/drivers/net/fm10k/fm10k_ethdev.c
> > @@ -2750,6 +2750,21 @@ static void __attribute__((cold))
> > int use_sse = 1;
> > uint16_t tx_ftag_en = 0;
> >
> > +   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > +   /* primary process has set the ftag flag and txq_flags */
> > +   txq = dev->data->tx_queues[0];
> > +   if (fm10k_tx_vec_condition_check(txq)) {
> > +   dev->tx_pkt_burst = fm10k_xmit_pkts;
> > +   dev->tx_pkt_prepare = fm10k_prep_pkts;
> > +   PMD_INIT_LOG(DEBUG, "Use regular Tx func");
> > +   } else {
> > +   PMD_INIT_LOG(DEBUG, "Use vector Tx func");
> > +   dev->tx_pkt_burst = fm10k_xmit_pkts_vec;
> > +   dev->tx_pkt_prepare = NULL;
> > +   }
> > +   return;
> > +   }
> > +
> 
> Why we need to check process type? What would happen if no changes
> made here?

If no change is made, this function will re-set some fields of the txq
structure, e.g.:
	for (i = 0; i < dev->data->nb_tx_queues; i++) {
		txq = dev->data->tx_queues[i];
		fm10k_txq_vec_setup(txq);
	}
Though these fields would be re-set to the same values, it doesn't look good.
In the secondary process, we should just read the queues and not write them.
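
The intent (primary prepares the shared queue state, secondary only reads it
to select the matching burst function) can be modelled like this. All names
here (demo_proc_type, demo_txq, demo_dev, demo_set_tx_function) are
hypothetical; the real driver uses rte_eal_process_type() and its own queue
structures.

```c
#include <assert.h>
#include <stddef.h>

enum demo_proc_type { DEMO_PROC_PRIMARY, DEMO_PROC_SECONDARY };

/* State shared between processes (lives in shared memory in practice). */
struct demo_txq {
	int vec_ready;		/* set once the vector path is prepared */
	unsigned int setup_count;	/* how many times setup has run */
};

/* Per-process device view pointing at the shared queue. */
struct demo_dev {
	struct demo_txq *txq;
	int use_vector_tx;	/* per-process function selection */
};

static void
demo_set_tx_function(struct demo_dev *dev, enum demo_proc_type type)
{
	assert(dev != NULL && dev->txq != NULL);

	if (type != DEMO_PROC_PRIMARY) {
		/* Secondary: read what the primary prepared, write nothing shared. */
		dev->use_vector_tx = dev->txq->vec_ready;
		return;
	}

	/* Primary: the only process that writes the shared queue state. */
	dev->txq->vec_ready = 1;
	dev->txq->setup_count++;
	dev->use_vector_tx = 1;
}
```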

Best Regards,
Xiao
> 
> > if (fm10k_check_ftag(dev->device->devargs))
> > tx_ftag_en = 1;
> >
> > @@ -2810,6 +2825,9 @@ static void __attribute__((cold))
> > else
> > PMD_INIT_LOG(DEBUG, "Use regular Rx func");
> >
> > +   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> > +   return;
> > +
> > for (i = 0; i < dev->data->nb_rx_queues; i++) {
> > struct fm10k_rx_queue *rxq = dev->data->rx_queues[i];
> >
> > @@ -2856,9 +2874,15 @@ static void __attribute__((cold))
> > dev->tx_pkt_burst = &fm10k_xmit_pkts;
> > dev->tx_pkt_prepare = &fm10k_prep_pkts;
> >
> > -   /* only initialize in the primary process */
> > -   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> > +   /*
> > +* Primary process does the whole initialization, for secondary
> > +* processes, we just select the same Rx and Tx function as primary.
> > +*/
> > +   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > +   fm10k_set_rx_function(dev);
> > +   fm10k_set_tx_function(dev);
> > return 0;
> > +   }
> >
> > rte_eth_copy_pci_info(dev, pdev);
> > dev->data->dev_flags |= RTE_ETH_DEV_DETACHABLE;
> > --
> > 1.8.3.1



[dpdk-dev] [PATCH 01/39] net/ixgbe/base: fix delta check for setting VFTA

2016-09-22 Thread Wang, Xiao W
Hi Ferruh,

> -Original Message-
> From: Yigit, Ferruh
> Sent: Tuesday, September 20, 2016 1:01 AM
> To: Wang, Xiao W ; Lu, Wenzhuo
> 
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 01/39] net/ixgbe/base: fix delta check for
> setting VFTA
> 
> On 8/27/2016 4:47 PM, Xiao Wang wrote:
> > The delta value rather than vfta_delta pointer should be checked.
> >
> > Fixes: b978f7b38c14 ("net/ixgbe/base: simplify VLAN management")
> >
> > Signed-off-by: Xiao Wang 
> > ---
> >  drivers/net/ixgbe/base/ixgbe_82598.c  | 6 +++---
> >  drivers/net/ixgbe/base/ixgbe_api.c| 7 ---
> >  drivers/net/ixgbe/base/ixgbe_common.c | 2 +-
> >  3 files changed, 8 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/net/ixgbe/base/ixgbe_82598.c
> b/drivers/net/ixgbe/base/ixgbe_82598.c
> > index db80880..724dcbb 100644
> > --- a/drivers/net/ixgbe/base/ixgbe_82598.c
> > +++ b/drivers/net/ixgbe/base/ixgbe_82598.c
> > @@ -995,19 +995,19 @@ STATIC s32 ixgbe_clear_vmdq_82598(struct
> ixgbe_hw *hw, u32 rar, u32 vmdq)
> >   *  @vlan: VLAN id to write to VLAN filter
> >   *  @vind: VMDq output index that maps queue to VLAN id in VFTA
> >   *  @vlan_on: boolean flag to turn on/off VLAN in VFTA
> > - *  @bypass_vlvf: boolean flag - unused
> > + *  @vlvf_bypass: boolean flag - unused
> 
> bypass_vlvf -> vlvf_bypass belongs to different patch
> 
> >   *
> >   *  Turn on/off specified VLAN in the VLAN filter table.
> >   **/
> >  s32 ixgbe_set_vfta_82598(struct ixgbe_hw *hw, u32 vlan, u32 vind,
> > -bool vlan_on, bool bypass_vlvf)
> > +bool vlan_on, bool vlvf_bypass)
> >  {
> > u32 regindex;
> > u32 bitindex;
> > u32 bits;
> > u32 vftabyte;
> >
> > -   UNREFERENCED_1PARAMETER(bypass_vlvf);
> > +   UNREFERENCED_1PARAMETER(vlvf_bypass);
> >
> > DEBUGFUNC("ixgbe_set_vfta_82598");
> >
> > diff --git a/drivers/net/ixgbe/base/ixgbe_api.c
> b/drivers/net/ixgbe/base/ixgbe_api.c
> > index 1786867..5b721af 100644
> > --- a/drivers/net/ixgbe/base/ixgbe_api.c
> > +++ b/drivers/net/ixgbe/base/ixgbe_api.c
> > @@ -1090,7 +1090,7 @@ s32 ixgbe_set_vfta(struct ixgbe_hw *hw, u32 vlan,
> u32 vind, bool vlan_on,
> >bool vlvf_bypass)
> >  {
> > return ixgbe_call_func(hw, hw->mac.ops.set_vfta, (hw, vlan, vind,
> > - vlan_on, vlvf_bypass),
> IXGBE_NOT_IMPLEMENTED);
> > +  vlan_on, vlvf_bypass),
> IXGBE_NOT_IMPLEMENTED);
> >  }
> >
> >  /**
> > @@ -1100,7 +1100,7 @@ s32 ixgbe_set_vfta(struct ixgbe_hw *hw, u32 vlan,
> u32 vind, bool vlan_on,
> >   *  @vind: VMDq output index that maps queue to VLAN id in VLVFB
> >   *  @vlan_on: boolean flag to turn on/off VLAN in VLVF
> >   *  @vfta_delta: pointer to the difference between the current value of 
> > VFTA
> > - *   and the desired value
> > + *  and the desired value
> >   *  @vfta: the desired value of the VFTA
> >   *  @vlvf_bypass: boolean flag indicating updating the default pool is okay
> >   *
> > @@ -1110,7 +1110,7 @@ s32 ixgbe_set_vlvf(struct ixgbe_hw *hw, u32 vlan,
> u32 vind, bool vlan_on,
> >u32 *vfta_delta, u32 vfta, bool vlvf_bypass)
> >  {
> > return ixgbe_call_func(hw, hw->mac.ops.set_vlvf, (hw, vlan, vind,
> > -   vlan_on, vfta_delta, vfta, vlvf_bypass),
> > +  vlan_on, vfta_delta, vfta, vlvf_bypass),
> >IXGBE_NOT_IMPLEMENTED);
> >  }
> >
> > @@ -1659,6 +1659,7 @@ void ixgbe_init_swfw_semaphore(struct ixgbe_hw
> *hw)
> > hw->mac.ops.init_swfw_sync(hw);
> >  }
> >
> > +
> 
> unrelated whitespace modifications
> 
> 
> >  void ixgbe_disable_rx(struct ixgbe_hw *hw)
> >  {
> > if (hw->mac.ops.disable_rx)
> > diff --git a/drivers/net/ixgbe/base/ixgbe_common.c
> b/drivers/net/ixgbe/base/ixgbe_common.c
> > index 811875a..161bf32 100644
> > --- a/drivers/net/ixgbe/base/ixgbe_common.c
> > +++ b/drivers/net/ixgbe/base/ixgbe_common.c
> > @@ -3967,7 +3967,7 @@ s32 ixgbe_set_vlvf_generic(struct ixgbe_hw *hw,
> u32 vlan, u32 vind,
> >  * we run the risk of stray packets leaking into
> >  * the PF via the default pool
> >  */
> > -   if (vfta_delta)
> > +   if (*vfta_delta)
> 
> This seems only update mentioned in patch commit log.
> 
> What about extracting all other clean up modifications into a new patch,
> other patches also have similar fixes, all can go into that cleanup patch?

Good advice, I will put all the minor misc modifications into a separate patch.
Thanks.

> 
> > IXGBE_WRITE_REG(hw, IXGBE_VFTA(vlan / 32), vfta);
> >
> > /* disable VLVF and clear remaining bit from pool */
> >
> 



[dpdk-dev] [PATCH 23/39] net/ixgbe/base: add bound check in LED functions

2016-09-22 Thread Wang, Xiao W
Hi Ferruh,

> -Original Message-
> From: Yigit, Ferruh
> Sent: Tuesday, September 20, 2016 1:07 AM
> To: Wang, Xiao W ; Lu, Wenzhuo
> 
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 23/39] net/ixgbe/base: add bound check in LED
> functions
> 
> On 8/27/2016 4:48 PM, Xiao Wang wrote:
> > Do parameter check to prevent exceptional value being written into
> > register.
> >
> > Signed-off-by: Xiao Wang 
> > ---
> >  drivers/net/ixgbe/base/ixgbe_common.c | 15 ++-
> >  drivers/net/ixgbe/base/ixgbe_x540.c   |  6 ++
> >  2 files changed, 20 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ixgbe/base/ixgbe_common.c
> b/drivers/net/ixgbe/base/ixgbe_common.c
> > index 1c7263d..3c3272e 100644
> > --- a/drivers/net/ixgbe/base/ixgbe_common.c
> > +++ b/drivers/net/ixgbe/base/ixgbe_common.c
> > @@ -1133,6 +1133,9 @@ s32 ixgbe_led_on_generic(struct ixgbe_hw *hw,
> u32 index)
> >
> > DEBUGFUNC("ixgbe_led_on_generic");
> >
> > +   if (index > 3)
> 
> What about using macro for hardcoded value 3.

For base (shared) code updates, we'd better keep DPDK consistent
with the kernel driver.

> 
> > +   return IXGBE_ERR_PARAM;
> > +
> > /* To turn on the LED, set mode to ON. */
> > led_reg &= ~IXGBE_LED_MODE_MASK(index);
> > led_reg |= IXGBE_LED_ON << IXGBE_LED_MODE_SHIFT(index);
> > @@ -1153,6 +1156,9 @@ s32 ixgbe_led_off_generic(struct ixgbe_hw *hw,
> u32 index)
> >
> > DEBUGFUNC("ixgbe_led_off_generic");
> >
> > +   if (index > 3)
> > +   return IXGBE_ERR_PARAM;
> > +
> > /* To turn off the LED, set mode to OFF. */
> > led_reg &= ~IXGBE_LED_MODE_MASK(index);
> > led_reg |= IXGBE_LED_OFF << IXGBE_LED_MODE_SHIFT(index);
> > @@ -3341,7 +3347,7 @@ s32 prot_autoc_write_generic(struct ixgbe_hw
> *hw, u32 reg_val, bool locked)
> >   **/
> >  s32 ixgbe_enable_sec_rx_path_generic(struct ixgbe_hw *hw)
> >  {
> > -   int secrxreg;
> > +   u32 secrxreg;
> 
> This modification seems unrelated with the patch.

Will put such minor change into a cleanup patch.

> 
> >
> 



[dpdk-dev] [PATCH 29/39] net/ixgbe/base: report autoneg supported for X550

2016-09-23 Thread Wang, Xiao W
Hi Ferruh,

> -Original Message-
> From: Yigit, Ferruh
> Sent: Tuesday, September 20, 2016 1:08 AM
> To: Wang, Xiao W ; Lu, Wenzhuo
> 
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 29/39] net/ixgbe/base: report autoneg
> supported for X550
> 
> On 8/27/2016 4:48 PM, Xiao Wang wrote:
> > Make sure ixgbe_device_supports_autoneg_fc() returns true for the device
> > IDs of Seabrook and Shady Acres.
> 
> Is these IDs official public ones?

I will remove these names in v2.

> 
> >
> > Signed-off-by: Xiao Wang 
> > ---
> >  drivers/net/ixgbe/base/ixgbe_common.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/net/ixgbe/base/ixgbe_common.c
> b/drivers/net/ixgbe/base/ixgbe_common.c
> > index bc12bc1..9776ab9 100644
> > --- a/drivers/net/ixgbe/base/ixgbe_common.c
> > +++ b/drivers/net/ixgbe/base/ixgbe_common.c
> > @@ -189,6 +189,8 @@ bool ixgbe_device_supports_autoneg_fc(struct
> ixgbe_hw *hw)
> > case IXGBE_DEV_ID_X550T1:
> > case IXGBE_DEV_ID_X550EM_X_10G_T:
> > case IXGBE_DEV_ID_X550EM_A_10G_T:
> > +   case IXGBE_DEV_ID_X550EM_A_1G_T:
> > +   case IXGBE_DEV_ID_X550EM_A_1G_T_L:
> > supported = true;
> > break;
> > default:
> >
> 



[dpdk-dev] [PATCH 05/39] net/ixgbe/base: support VF multicast promiscuous

2016-09-23 Thread Wang, Xiao W
Hi Ferruh,

> -Original Message-
> From: Yigit, Ferruh
> Sent: Tuesday, September 20, 2016 1:06 AM
> To: Wang, Xiao W ; Lu, Wenzhuo
> 
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 05/39] net/ixgbe/base: support VF multicast
> promiscuous
> 
> On 8/27/2016 4:47 PM, Xiao Wang wrote:
> > Currently, VF is limited to 30 multicast addresses. In order to
> > accommodate more addresses, this patch adds support for VF multicast
> > promiscuous.
> 
> It looks like functionality not changed, just
> ixgbevf_update_xcast_mode() moved to ixgbe_mac_operations struct.
> 
> Is 30 multicast address limitation remains with this patch?
> 

You're right. We have supported this feature in ixgbe_ethdev.c before; now the
shared code supports it, so it looks like just a function movement patch.
I will rewrite the commit log to avoid misunderstanding.

> >
> > Signed-off-by: Xiao Wang 
> > ---
> >  drivers/net/ixgbe/base/ixgbe_mbx.h  |  2 +-
> >  drivers/net/ixgbe/base/ixgbe_type.h |  1 +
> >  drivers/net/ixgbe/base/ixgbe_vf.c   | 38
> 
> >  drivers/net/ixgbe/base/ixgbe_vf.h   |  1 +
> >  drivers/net/ixgbe/ixgbe_ethdev.c| 43 
> > ++---
> >  5 files changed, 43 insertions(+), 42 deletions(-)
> >
> > diff --git a/drivers/net/ixgbe/base/ixgbe_mbx.h
> b/drivers/net/ixgbe/base/ixgbe_mbx.h
> > index d775142..c3e301f 100644
> > --- a/drivers/net/ixgbe/base/ixgbe_mbx.h
> > +++ b/drivers/net/ixgbe/base/ixgbe_mbx.h
> > @@ -111,7 +111,7 @@ enum ixgbe_pfvf_api_rev {
> >  /* mailbox API, version 1.2 VF requests */
> >  #define IXGBE_VF_GET_RETA  0x0a /* VF request for RETA */
> >  #define IXGBE_VF_GET_RSS_KEY   0x0b /* get RSS key */
> > -#define IXGBE_VF_UPDATE_XCAST_MODE 0x0C
> > +#define IXGBE_VF_UPDATE_XCAST_MODE 0x0c
> >
> >  /* GET_QUEUES return data indices within the mailbox */
> >  #define IXGBE_VF_TX_QUEUES 1   /* number of Tx queues
> supported */
> > diff --git a/drivers/net/ixgbe/base/ixgbe_type.h
> b/drivers/net/ixgbe/base/ixgbe_type.h
> > index b2fdfcd..96b5cbd 100644
> > --- a/drivers/net/ixgbe/base/ixgbe_type.h
> > +++ b/drivers/net/ixgbe/base/ixgbe_type.h
> > @@ -3883,6 +3883,7 @@ struct ixgbe_mac_operations {
> > s32 (*init_uta_tables)(struct ixgbe_hw *);
> > void (*set_mac_anti_spoofing)(struct ixgbe_hw *, bool, int);
> > void (*set_vlan_anti_spoofing)(struct ixgbe_hw *, bool, int);
> > +   s32 (*update_xcast_mode)(struct ixgbe_hw *, int);
> >
> > /* Flow Control */
> > s32 (*fc_enable)(struct ixgbe_hw *);
> > diff --git a/drivers/net/ixgbe/base/ixgbe_vf.c
> b/drivers/net/ixgbe/base/ixgbe_vf.c
> > index a75074a..20a739c 100644
> > --- a/drivers/net/ixgbe/base/ixgbe_vf.c
> > +++ b/drivers/net/ixgbe/base/ixgbe_vf.c
> > @@ -75,6 +75,7 @@ s32 ixgbe_init_ops_vf(struct ixgbe_hw *hw)
> > hw->mac.ops.set_uc_addr = ixgbevf_set_uc_addr_vf;
> > hw->mac.ops.init_rx_addrs = NULL;
> > hw->mac.ops.update_mc_addr_list = ixgbe_update_mc_addr_list_vf;
> > +   hw->mac.ops.update_xcast_mode = ixgbevf_update_xcast_mode;
> > hw->mac.ops.enable_mc = NULL;
> > hw->mac.ops.disable_mc = NULL;
> > hw->mac.ops.clear_vfta = NULL;
> > @@ -419,6 +420,43 @@ s32 ixgbe_update_mc_addr_list_vf(struct ixgbe_hw
> *hw, u8 *mc_addr_list,
> >  }
> >
> >  /**
> > + *  ixgbevf_update_xcast_mode - Update Multicast mode
> > + *  @hw: pointer to the HW structure
> > + *  @xcast_mode: new multicast mode
> > + *
> > + *  Updates the Multicast Mode of VF.
> > + **/
> > +s32 ixgbevf_update_xcast_mode(struct ixgbe_hw *hw, int xcast_mode)
> > +{
> > +   struct ixgbe_mbx_info *mbx = &hw->mbx;
> > +   u32 msgbuf[2];
> > +   s32 err;
> > +
> > +   switch (hw->api_version) {
> > +   case ixgbe_mbox_api_12:
> > +   break;
> > +   default:
> > +   return IXGBE_ERR_FEATURE_NOT_SUPPORTED;
> > +   }
> > +
> > +   msgbuf[0] = IXGBE_VF_UPDATE_XCAST_MODE;
> > +   msgbuf[1] = xcast_mode;
> > +
> > +   err = mbx->ops.write_posted(hw, msgbuf, 2, 0);
> > +   if (err)
> > +   return err;
> > +
> > +   err = mbx->ops.read_posted(hw, msgbuf, 2, 0);
> > +   if (err)
> > +   return err;
> > +
> > +   msgbuf[0] &= ~IXGBE_VT_MSGTYPE_CTS;
> > +   if (msgbuf[0] == (IXGBE_VF_UPDATE_XCAST_MODE |
> IXGBE_VT_MSGTYPE_NACK))
> 
> What if other flags set in msgbuf[0]
> Please check 18/39 patch, which fixes something similar to this
> 

This condition should be (msgbuf[0] == (original value of msgbuf[0] | NACK)),
which is what the 18/39 patch fixes. There is no such issue in this function.
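
As a sketch of that condition: the reply is compared against the original
request word with the NACK bit set, after the CTS bit has been masked off.
The DEMO_* bit values and the helper name are assumptions for this
illustration; see ixgbe_mbx.h for the real definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values assumed for this sketch. */
#define DEMO_MSGTYPE_NACK 0x40000000u
#define DEMO_MSGTYPE_CTS  0x20000000u

/*
 * Return non-zero if reply is a NACK for the request word that was sent.
 * Here the request word is just the command, as in
 * ixgbevf_update_xcast_mode(), where nothing else is ORed into msgbuf[0].
 */
static int
demo_reply_is_nack(uint32_t reply, uint32_t request)
{
	reply &= ~DEMO_MSGTYPE_CTS;

	return reply == (request | DEMO_MSGTYPE_NACK);
}
```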

Thanks,
Xiao
> 
> > +   return IXGBE_ERR_FEATURE_NOT_SUPPORTED;
> > +   return IXGBE_SUCCESS;
> > +}
> > +



[dpdk-dev] [PATCH 16/39] net/ixgbe/base: bump mailbox version

2016-09-23 Thread Wang, Xiao W
Hi Ferruh,

> -Original Message-
> From: Yigit, Ferruh
> Sent: Tuesday, September 20, 2016 1:03 AM
> To: Wang, Xiao W ; Lu, Wenzhuo
> 
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 16/39] net/ixgbe/base: bump mailbox version
> 
> On 8/27/2016 4:47 PM, Xiao Wang wrote:
> > This patch will pave the way for the new VF unicast promiscuous
> > mode support.
> >
> > Signed-off-by: Xiao Wang 
> > ---
> >  drivers/net/ixgbe/base/ixgbe_mbx.h | 5 +++--
> >  drivers/net/ixgbe/base/ixgbe_vf.c  | 2 ++
> >  2 files changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/ixgbe/base/ixgbe_mbx.h
> b/drivers/net/ixgbe/base/ixgbe_mbx.h
> > index c3e301f..7556a81 100644
> > --- a/drivers/net/ixgbe/base/ixgbe_mbx.h
> > +++ b/drivers/net/ixgbe/base/ixgbe_mbx.h
> > @@ -90,6 +90,7 @@ enum ixgbe_pfvf_api_rev {
> > ixgbe_mbox_api_20,  /* API version 2.0, solaris Phase1 VF driver */
> > ixgbe_mbox_api_11,  /* API version 1.1, linux/freebsd VF driver */
> > ixgbe_mbox_api_12,  /* API version 1.2, linux/freebsd VF driver */
> > +   ixgbe_mbox_api_13,  /* API version 1.3, linux/freebsd VF driver */
> > /* This value should always be last */
> > ixgbe_mbox_api_unknown, /* indicates that API version is not
> known */
> >  };
> > @@ -109,8 +110,8 @@ enum ixgbe_pfvf_api_rev {
> >  #define IXGBE_VF_GET_QUEUES0x09 /* get queue configuration */
> >
> >  /* mailbox API, version 1.2 VF requests */
> > -#define IXGBE_VF_GET_RETA  0x0a /* VF request for RETA */
> > -#define IXGBE_VF_GET_RSS_KEY   0x0b /* get RSS key */
> > +#define IXGBE_VF_GET_RETA  0x0a/* VF request for RETA */
> > +#define IXGBE_VF_GET_RSS_KEY   0x0b/* get RSS key */
> 
> Is this intentional? It breaks tab alignment, and the values are not
> actually changed.
> 

This minor change keeps the code consistent with the kernel base code.
I will put all such modifications into one cleanup patch.

> >  #define IXGBE_VF_UPDATE_XCAST_MODE 0x0c


[dpdk-dev] [PATCH 17/39] net/ixgbe/base: access IOSF by host interface

2016-09-23 Thread Wang, Xiao W
Hi Ferruh,

> -Original Message-
> From: Yigit, Ferruh
> Sent: Tuesday, September 20, 2016 1:04 AM
> To: Wang, Xiao W ; Lu, Wenzhuo
> 
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 17/39] net/ixgbe/base: access IOSF by host
> interface
> 
> On 8/27/2016 4:48 PM, Xiao Wang wrote:
> > This patch makes sure that we access IOSF registers through the HIC
> > (host interface command) for the majority of X550em devices. All devices
> > with NVM are capable of using the HIC.
> >
> > For consistency all instances where the ixgbe_read/write_iosf_sb_reg_x550
> > is called directly are converted to function pointer calls.
> >
> > Signed-off-by: Xiao Wang 
> > ---
> >  drivers/net/ixgbe/base/ixgbe_phy.c  |  2 +-
> >  drivers/net/ixgbe/base/ixgbe_x550.c | 38 ++--
> -
> >  2 files changed, 24 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/net/ixgbe/base/ixgbe_phy.c
> b/drivers/net/ixgbe/base/ixgbe_phy.c
> > index d33d0f8..ee8618f 100644
> > --- a/drivers/net/ixgbe/base/ixgbe_phy.c
> > +++ b/drivers/net/ixgbe/base/ixgbe_phy.c
> > @@ -741,7 +741,7 @@ s32 ixgbe_write_phy_reg_generic(struct ixgbe_hw
> *hw, u32 reg_addr,
> > DEBUGFUNC("ixgbe_write_phy_reg_generic");
> >
> > if (hw->mac.ops.acquire_swfw_sync(hw, gssr) == IXGBE_SUCCESS) {
> > -   status = ixgbe_write_phy_reg_mdi(hw, reg_addr, device_type,
> > +   status = hw->phy.ops.write_reg_mdi(hw, reg_addr,
> device_type,
> >  phy_data);
> 
> Is this IOSF register?

No, it's not. For consistency, this patch converts this direct function call
to a function pointer call. I will cover this in the v2 commit log.
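
To illustrate what the conversion buys, here is a minimal sketch of dispatching
through an ops table instead of calling a fixed function. Everything prefixed
with ex_ is hypothetical and only loosely modelled on hw->phy.ops; it is not
the base driver code.

```c
#include <stdint.h>

struct ex_hw;

/* Hypothetical ops table, loosely modelled on hw->phy.ops. */
struct ex_phy_ops {
	int32_t (*write_reg_mdi)(struct ex_hw *hw, uint32_t reg, uint16_t data);
};

struct ex_hw {
	struct ex_phy_ops ops;
	uint16_t last_written;	/* stands in for a device register */
	int via_alt;		/* set when the alternative path ran */
};

/* Default write path. */
static int32_t ex_write_reg_default(struct ex_hw *hw, uint32_t reg, uint16_t data)
{
	(void)reg;
	hw->last_written = data;
	hw->via_alt = 0;
	return 0;
}

/* Alternative path that a specific device could install instead. */
static int32_t ex_write_reg_alt(struct ex_hw *hw, uint32_t reg, uint16_t data)
{
	(void)reg;
	hw->last_written = data;
	hw->via_alt = 1;
	return 0;
}

/* Generic caller: dispatches through the pointer, not a fixed function. */
static int32_t ex_write_phy_reg(struct ex_hw *hw, uint32_t reg, uint16_t data)
{
	return hw->ops.write_reg_mdi(hw, reg, data);
}

/* Install one of the two paths, do a write, report which path ran. */
static int ex_demo(int use_alt)
{
	struct ex_hw hw = {
		.ops = { use_alt ? ex_write_reg_alt : ex_write_reg_default }
	};

	ex_write_phy_reg(&hw, 0x10, 0xabcd);
	return hw.via_alt;
}
```

The generic caller stays the same whichever implementation is installed, which
is the consistency argument in a nutshell.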

> 
> ...
> 
> > @@ -4504,7 +4512,7 @@ s32 ixgbe_write_phy_reg_x550a(struct ixgbe_hw
> *hw, u32 reg_addr,
> > DEBUGFUNC("ixgbe_write_phy_reg_x550a");
> >
> > if (hw->mac.ops.acquire_swfw_sync(hw, mask) == IXGBE_SUCCESS) {
> > -   status = ixgbe_write_phy_reg_mdi(hw, reg_addr, device_type,
> > +   status = hw->phy.ops.write_reg_mdi(hw, reg_addr,
> device_type,
> 
> same question?
> 
> >  phy_data);
> > hw->mac.ops.release_swfw_sync(hw, mask);
> > } else {
> >
> 



[dpdk-dev] [PATCH 18/39] net/ixgbe/base: fix check on NACK

2016-09-23 Thread Wang, Xiao W
Hi Ferruh,

> -Original Message-
> From: Yigit, Ferruh
> Sent: Tuesday, September 20, 2016 1:07 AM
> To: Wang, Xiao W ; Lu, Wenzhuo
> 
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 18/39] net/ixgbe/base: fix check on NACK
> 
> On 8/27/2016 4:48 PM, Xiao Wang wrote:
> > Previously we checked only msgbuf[0] for
> 
> "return buffer" instead of msgbuf[0] ?
> 

Looks better. Use it in v2.

> > (IXGBE_VF_SET_MACVLAN |  IXGBE_VT_MSGTYPE_NACK), but this would not
> > work if index != 0 and as a result NACK will not be detected.
> 
> "write buffer is not 0" instead of "index != 0"
> 

msgbuf[0] |= index << IXGBE_VT_MSGINFO_SHIFT;
msgbuf[0] |= IXGBE_VF_SET_MACVLAN;

"index != 0" affects msgbuf[0], so we should emphasize "index".
I will change it to "index is not 0" in v2.
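
A small sketch of why index matters for the NACK check. The EX_ constants are
illustrative stand-ins rather than values copied from ixgbe_mbx.h, and the
sketch assumes the PF echoes the request word with the NACK bit added.

```c
#include <stdint.h>
#include <stdbool.h>

/* Illustrative values only; the real definitions live in ixgbe_mbx.h. */
#define EX_MSGTYPE_NACK   0x40000000u
#define EX_MSGTYPE_CTS    0x20000000u
#define EX_MSGINFO_SHIFT  16
#define EX_VF_SET_MACVLAN 0x06u

/* Build the request word the same way the two lines above do. */
static uint32_t ex_build_request(uint32_t index)
{
	uint32_t word = 0;

	word |= index << EX_MSGINFO_SHIFT;
	word |= EX_VF_SET_MACVLAN;
	return word;
}

/* Old check: compares against the bare opcode, ignoring the index bits. */
static bool ex_old_is_nack(uint32_t reply)
{
	reply &= ~EX_MSGTYPE_CTS;
	return reply == (EX_VF_SET_MACVLAN | EX_MSGTYPE_NACK);
}

/* New check: compares against the request word that was actually sent. */
static bool ex_new_is_nack(uint32_t reply, uint32_t request)
{
	reply &= ~EX_MSGTYPE_CTS;
	return reply == (request | EX_MSGTYPE_NACK);
}
```

With index 0 both checks agree. With index 1 the reply carries the index bits,
so the old comparison never matches and the NACK goes unnoticed.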

> >
> 
> Function also starts using ixgbevf_write_msg_read_ack() instead of
> separate write and read, is it possible to fix NACK only in this patch,
> and do ixgbevf_write_msg_read_ack() switch in patch 27/39.
> If prefer to keep in this patch, please mention about this switch in
> comment log.
> 

Agree. Thanks.

> > Fixes: af75078fece3 ("first public release")
> >
> > Signed-off-by: Xiao Wang 
> > ---
> >  drivers/net/ixgbe/base/ixgbe_vf.c | 18 --
> >  1 file changed, 8 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/net/ixgbe/base/ixgbe_vf.c
> b/drivers/net/ixgbe/base/ixgbe_vf.c
> > index c0fedea..f60ff7d 100644
> > --- a/drivers/net/ixgbe/base/ixgbe_vf.c
> > +++ b/drivers/net/ixgbe/base/ixgbe_vf.c
> > @@ -529,8 +529,7 @@ s32 ixgbe_get_mac_addr_vf(struct ixgbe_hw *hw, u8
> *mac_addr)
> >
> >  s32 ixgbevf_set_uc_addr_vf(struct ixgbe_hw *hw, u32 index, u8 *addr)
> >  {
> > -   struct ixgbe_mbx_info *mbx = &hw->mbx;
> > -   u32 msgbuf[3];
> > +   u32 msgbuf[3], msgbuf_chk;
> > u8 *msg_addr = (u8 *)(&msgbuf[1]);
> > s32 ret_val;
> >
> > @@ -543,18 +542,17 @@ s32 ixgbevf_set_uc_addr_vf(struct ixgbe_hw *hw,
> u32 index, u8 *addr)
> >  */
> > msgbuf[0] |= index << IXGBE_VT_MSGINFO_SHIFT;
> > msgbuf[0] |= IXGBE_VF_SET_MACVLAN;
> > +   msgbuf_chk = msgbuf[0];
> > if (addr)
> > memcpy(msg_addr, addr, 6);
> > -   ret_val = mbx->ops.write_posted(hw, msgbuf, 3, 0);
> >
> > -   if (!ret_val)
> > -   ret_val = mbx->ops.read_posted(hw, msgbuf, 3, 0);
> > +   ret_val = ixgbevf_write_msg_read_ack(hw, msgbuf, msgbuf, 3);
> > +   if (!ret_val) {
> > +   msgbuf[0] &= ~IXGBE_VT_MSGTYPE_CTS;
> >
> > -   msgbuf[0] &= ~IXGBE_VT_MSGTYPE_CTS;
> > -
> > -   if (!ret_val)
> > -   if (msgbuf[0] == (IXGBE_VF_SET_MACVLAN |
> IXGBE_VT_MSGTYPE_NACK))
> > -   ret_val = IXGBE_ERR_OUT_OF_MEM;
> > +   if (msgbuf[0] == (msgbuf_chk | IXGBE_VT_MSGTYPE_NACK))
> > +   return IXGBE_ERR_OUT_OF_MEM;
> 
> What about following instead of introducing msgbuf_chk:
> 
> if ((msgbuf[0] & IXGBE_VF_SET_MACVLAN) &&
>   (msgbuf[0] & IXGBE_VT_MSGTYPE_NACK))
> 
> Please check patch 15/39

It's different from 15/39, where the write buffer is simple.
Here the write buffer msgbuf[0] is more complicated; if we don't
introduce msgbuf_chk, the code will look bloated.
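
As a side note, a rough sketch of how a mask-style test and an exact comparison
can differ. The EX_ values are illustrative assumptions, and EX_VF_OTHER_OP is a
hypothetical unrelated opcode; the point is only that opcodes are small integers
rather than independent bit flags.

```c
#include <stdint.h>
#include <stdbool.h>

/* Illustrative values only, not copied from ixgbe_mbx.h. */
#define EX_MSGTYPE_NACK   0x40000000u
#define EX_VF_SET_MACVLAN 0x06u	/* opcode under test */
#define EX_VF_OTHER_OP    0x02u	/* hypothetical unrelated opcode */

/* Mask-style test: true when any opcode bit and the NACK bit are set. */
static bool ex_mask_is_nack(uint32_t reply)
{
	return (reply & EX_VF_SET_MACVLAN) && (reply & EX_MSGTYPE_NACK);
}

/* Exact test against the saved request word (the msgbuf_chk approach). */
static bool ex_exact_is_nack(uint32_t reply, uint32_t request)
{
	return reply == (request | EX_MSGTYPE_NACK);
}
```

Under these assumed values, a NACK for the other opcode shares a bit with 0x06
and passes the mask test, while the exact comparison rejects it.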

> 
> > +   }
> >
> > return ret_val;
> >  }
> >
> 
> 



[dpdk-dev] [PATCH 33/39] net/ixgbe/base: add X550em_a FW ALEF support

2016-09-23 Thread Wang, Xiao W
Hi Wenzhuo,

> -Original Message-
> From: Lu, Wenzhuo
> Sent: Thursday, September 22, 2016 10:57 AM
> To: Yigit, Ferruh ; Wang, Xiao W
> 
> Cc: dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH 33/39] net/ixgbe/base: add X550em_a FW ALEF
> support
> 
> Hi Xiao,
> 
> 
> > -Original Message-
> > From: Yigit, Ferruh
> > Sent: Tuesday, September 20, 2016 1:08 AM
> > To: Wang, Xiao W; Lu, Wenzhuo
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 33/39] net/ixgbe/base: add X550em_a FW
> ALEF
> > support
> >
> > On 8/27/2016 4:48 PM, Xiao Wang wrote:
> > > This patch adds X550em_a FW ALEF support for B0 per DCR 64. ALEF is
> >
> > Is more information on B0 and ALEF required? Is there an official name
> > for B0?
> I think we should focus on the change. Users will have no idea about B0.

OK, I will remove this info.


[dpdk-dev] [PATCH 35/39] net/ixgbe/base: hold semaphore for shadow RAM access

2016-09-23 Thread Wang, Xiao W
Hi Ferruh,

> -Original Message-
> From: Yigit, Ferruh
> Sent: Tuesday, September 20, 2016 1:08 AM
> To: Wang, Xiao W ; Lu, Wenzhuo
> 
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 35/39] net/ixgbe/base: hold semaphore for
> shadow RAM access
> 
> On 8/27/2016 4:48 PM, Xiao Wang wrote:
> > The semaphore is not being held for complete shadow RAM accesses
> > which could result in corruption. Refactor the code so that it is
> > possible to hold the semaphore around ixgbe_host_interface_command
> > by introducing an unlocked form. This patch also eliminates the
> > function ixgbe_read_ee_hostif_data_X550 in favor of the function
> > ixgbe_read_ee_hostif_X550 and it now gets both semaphore bits
> > at once instead of nesting them. The new arrangement is able to
> > get both the management interface and the EEPROM semaphores at the
> > same time instead of separately.
> >
> > Signed-off-by: Xiao Wang 
> > ---
> >  drivers/net/ixgbe/base/ixgbe_common.c | 106 ++---
> -
> >  drivers/net/ixgbe/base/ixgbe_common.h |   3 +-
> >  drivers/net/ixgbe/base/ixgbe_x550.c   |  57 +-
> >  drivers/net/ixgbe/base/ixgbe_x550.h   |   2 -
> >  4 files changed, 88 insertions(+), 80 deletions(-)
> >
> > diff --git a/drivers/net/ixgbe/base/ixgbe_common.c
> b/drivers/net/ixgbe/base/ixgbe_common.c
> > index 9776ab9..d31fb81 100644
> > --- a/drivers/net/ixgbe/base/ixgbe_common.c
> > +++ b/drivers/net/ixgbe/base/ixgbe_common.c
> > @@ -1066,7 +1066,7 @@ void ixgbe_set_lan_id_multi_port_pcie(struct
> ixgbe_hw *hw)
> > if (hw->device_id == IXGBE_DEV_ID_X550EM_A_SFP) {
> > hw->eeprom.ops.read(hw, IXGBE_EEPROM_CTRL_4,
> &ee_ctrl_4);
> > bus->instance_id = (ee_ctrl_4 & IXGBE_EE_CTRL_4_INST_ID) >>
> > -   IXGBE_EE_CTRL_4_INST_ID_SHIFT;
> > +  IXGBE_EE_CTRL_4_INST_ID_SHIFT;
> > }
> >  }
> >
> > @@ -2877,7 +2877,7 @@ out:
> >   *  advertised settings
> >   **/
> >  s32 ixgbe_negotiate_fc(struct ixgbe_hw *hw, u32 adv_reg, u32 lp_reg,
> > - u32 adv_sym, u32 adv_asm, u32 lp_sym, u32
> lp_asm)
> > +  u32 adv_sym, u32 adv_asm, u32 lp_sym, u32 lp_asm)
> >  {
> > if ((!(adv_reg)) ||  (!(lp_reg))) {
> > ERROR_REPORT3(IXGBE_ERROR_UNSUPPORTED,
> > @@ -3920,7 +3920,8 @@ s32 ixgbe_set_vfta_generic(struct ixgbe_hw *hw,
> u32 vlan, u32 vind,
> > vfta_delta = 1 << (vlan % 32);
> > vfta = IXGBE_READ_REG(hw, IXGBE_VFTA(regidx));
> >
> > -   /* vfta_delta represents the difference between the current value
> > +   /*
> > +* vfta_delta represents the difference between the current value
> 
> These whitespace fixes do not belong to this patch.
> It is good to make cleanups, but these add noise to the real patch.
> There are a few more of these in previous patches; perhaps all can be
> merged into a cleanup patch?
> 

Yes, will merge them into a cleanup patch.

> >  * of vfta and the value we want in the register.  Since the diff
> >  * is an XOR mask we can just update the vfta using an XOR
> >  */
> > @@ -3953,7 +3954,7 @@ vfta_update:
> >   *  @vind: VMDq output index that maps queue to VLAN id in VLVFB
> >   *  @vlan_on: boolean flag to turn on/off VLAN in VLVF
> >   *  @vfta_delta: pointer to the difference between the current value of 
> > VFTA
> > - * and the desired value
> > + *  and the desired value
> >   *  @vfta: the desired value of the VFTA
> >   *  @vlvf_bypass: boolean flag indicating updating default pool is okay
> >   *
> > @@ -3980,6 +3981,7 @@ s32 ixgbe_set_vlvf_generic(struct ixgbe_hw *hw,
> u32 vlan, u32 vind,
> >  */
> > if (!(IXGBE_READ_REG(hw, IXGBE_VT_CTL) &
> IXGBE_VT_CTL_VT_ENABLE))
> > return IXGBE_SUCCESS;
> > +
> > vlvf_index = ixgbe_find_vlvf_slot(hw, vlan, vlvf_bypass);
> > if (vlvf_index < 0)
> > return vlvf_index;
> > @@ -4009,6 +4011,7 @@ s32 ixgbe_set_vlvf_generic(struct ixgbe_hw *hw,
> u32 vlan, u32 vind,
> >
> > return IXGBE_SUCCESS;
> > }
> > +
> > /* If there are still bits set in the VLVFB registers
> >  * for the VLAN ID indicated we need to see if the
> >  * caller is requesting that we clear the VFTA entry bit.
> > @@ -4413,43 +4416,31 @@ u8 ixgbe_calculate_checksum(u8 *buffer, u32
> length)
> >  }
> >
> >  /**
> > - *  ixgbe_host_interface_co

[dpdk-dev] [PATCH 03/39] net/ixgbe/base: change endianness of PHY data

2016-09-25 Thread Wang, Xiao W
Hi Ferruh,

> -Original Message-
> From: Yigit, Ferruh
> Sent: Tuesday, September 20, 2016 1:01 AM
> To: Wang, Xiao W ; Lu, Wenzhuo
> 
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 03/39] net/ixgbe/base: change endianness of
> PHY data
> 
> On 8/27/2016 4:47 PM, Xiao Wang wrote:
> > The latest firmware reverses the endianness of the PHY data read and
> 
> Good to add the fw version for future references

I consulted the ND author of this base driver patch and learned that the fw
change applies only to x550a.
It has nothing to do with the other devices.
I will emphasize this in the v2 commit log.

> 
> > written via host interface command, so make corresponding changes
> > to that.
> >
> > Signed-off-by: Xiao Wang 
> 



Re: [dpdk-dev] [PATCH] net/fm10k: initialize link status in device start

2017-06-21 Thread Wang, Xiao W
Hi,

> -Original Message-
> From: Chen, Jing D
> Sent: Wednesday, June 21, 2017 10:36 AM
> To: Wang, Xiao W 
> Cc: dev@dpdk.org; sta...@dpdk.org
> Subject: RE: [PATCH] net/fm10k: initialize link status in device start
> 
> Hi,
> 
> > -Original Message-
> > From: Wang, Xiao W
> > Sent: Wednesday, May 31, 2017 7:07 PM
> > To: Chen, Jing D 
> > Cc: dev@dpdk.org; Wang, Xiao W ;
> > sta...@dpdk.org
> > Subject: [PATCH] net/fm10k: initialize link status in device start
> >
> > If port LSC interrupt is configured, application will read link status
> > directly, so driver needs to prepare that value in advance.
> 
> Fm10k host driver can't manage PHY directly and provides a fake link status,
> so it always provides a constant value, whether lsc is set or not.
> I think you need to reorganize the message. :)

OK, thanks.
> 
> >
> > Fixes: 9ae6068c86da ("fm10k: add dev start/stop")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Xiao Wang 
> > ---
> >  drivers/net/fm10k/fm10k_ethdev.c | 5 +
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/drivers/net/fm10k/fm10k_ethdev.c
> > b/drivers/net/fm10k/fm10k_ethdev.c
> > index a742eec..54bf10c 100644
> > --- a/drivers/net/fm10k/fm10k_ethdev.c
> > +++ b/drivers/net/fm10k/fm10k_ethdev.c
> > @@ -84,6 +84,8 @@ static void fm10k_MAC_filter_set(struct rte_eth_dev
> > *dev,  static void fm10k_set_rx_function(struct rte_eth_dev *dev);  static
> > void fm10k_set_tx_function(struct rte_eth_dev *dev);  static int
> > fm10k_check_ftag(struct rte_devargs *devargs);
> > +static int fm10k_link_update(struct rte_eth_dev *dev,
> > +   __rte_unused int wait_to_complete);
> >
> >  struct fm10k_xstats_name_off {
> > char name[RTE_ETH_XSTATS_NAME_SIZE];
> > @@ -1166,6 +1168,9 @@ static inline int fm10k_glort_valid(struct
> fm10k_hw
> > *hw)
> > if (!(dev->data->dev_conf.rxmode.mq_mode &
> > ETH_MQ_RX_VMDQ_FLAG))
> > fm10k_vlan_filter_set(dev, hw->mac.default_vid, true);
> >
> > +   if (dev->data->dev_conf.intr_conf.lsc != 0)
> > +   fm10k_link_update(dev, 0);
> > +
> 
> I'll recommend updating link status anyway when port starts, not considering
> lsc set status.

Agree, will send v2.

BRs,
Xiao



[dpdk-dev] [PATCH] fm10k: fix vlan flag bug in scattered RX

2015-12-18 Thread Wang Xiao W
In fm10k_recv_scattered_pkts function, a packet is stored in a linked list,
offload flags such as PKT_RX_VLAN_PKT should be set in the first segment.

Signed-off-by: Wang Xiao W 
---
 drivers/net/fm10k/fm10k_rxtx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/fm10k/fm10k_rxtx.c b/drivers/net/fm10k/fm10k_rxtx.c
index e958865..de31cad 100644
--- a/drivers/net/fm10k/fm10k_rxtx.c
+++ b/drivers/net/fm10k/fm10k_rxtx.c
@@ -305,7 +305,7 @@ fm10k_recv_scattered_pkts(void *rx_queue, struct rte_mbuf 
**rx_pkts,
 * So, always PKT_RX_VLAN_PKT flag is set and vlan_tci
 * is valid for each RX packet's mbuf.
 */
-   mbuf->ol_flags |= PKT_RX_VLAN_PKT;
+   first_seg->ol_flags |= PKT_RX_VLAN_PKT;
first_seg->vlan_tci = desc.w.vlan;

/* Prefetch data of first segment, if configured to do so. */
-- 
1.9.3
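
For readers unfamiliar with segment chains, a minimal sketch of the rule the fix
enforces: per-packet metadata belongs on the head segment. struct ex_mbuf and
EX_RX_VLAN_PKT are simplified stand-ins for struct rte_mbuf and
PKT_RX_VLAN_PKT, not the real definitions.

```c
#include <stdint.h>
#include <stddef.h>

#define EX_RX_VLAN_PKT 0x1u	/* stand-in for PKT_RX_VLAN_PKT */

/* Minimal stand-in for struct rte_mbuf; only the fields used here. */
struct ex_mbuf {
	uint64_t ol_flags;
	uint16_t vlan_tci;
	struct ex_mbuf *next;
};

/* Append seg to the chain headed by first; returns the chain head. */
static struct ex_mbuf *ex_chain_append(struct ex_mbuf *first, struct ex_mbuf *seg)
{
	struct ex_mbuf *tail;

	seg->next = NULL;
	if (first == NULL)
		return seg;
	for (tail = first; tail->next != NULL; tail = tail->next)
		;
	tail->next = seg;
	return first;
}

/* Per-packet metadata goes on the head, not on the segment just received. */
static void ex_set_vlan(struct ex_mbuf *first_seg, uint16_t tci)
{
	first_seg->ol_flags |= EX_RX_VLAN_PKT;
	first_seg->vlan_tci = tci;
}

/* Build a two-segment packet; 1 if only the head carries the flag. */
static int ex_demo(void)
{
	struct ex_mbuf a = {0}, b = {0};
	struct ex_mbuf *first = ex_chain_append(NULL, &a);

	first = ex_chain_append(first, &b);
	ex_set_vlan(first, 100);
	return (a.ol_flags & EX_RX_VLAN_PKT) &&
	       !(b.ol_flags & EX_RX_VLAN_PKT) && a.vlan_tci == 100;
}
```

An application that inspects only the first mbuf of a received packet would
miss a flag that was set on a later segment.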



[dpdk-dev] [PATCH 1/2] testpmd: optimize tx_vlan_set and tx_qinq_set function

2015-12-21 Thread Wang Xiao W
Now in cmd_tx_vlan_set_parsed function, we check the vlan_offload
capability first, if it's an invalid port we'll get a prompt saying
"Error, as QinQ has been enabled.". So we should always make sure
that we get a valid port_id first before we check other information.
It's the same problem for cmd_tx_vlan_set_qinq_parsed.

Meanwhile, tx_vlan reset operation is simple enough to be put directly
into tx_vlan_set and tx_qinq_set function.

Signed-off-by: Wang Xiao W 
---
 app/test-pmd/cmdline.c | 12 
 app/test-pmd/config.c  | 21 +++--
 2 files changed, 19 insertions(+), 14 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 73298c9..2adf6ca 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -2952,12 +2952,6 @@ cmd_tx_vlan_set_parsed(void *parsed_result,
   __attribute__((unused)) void *data)
 {
struct cmd_tx_vlan_set_result *res = parsed_result;
-   int vlan_offload = rte_eth_dev_get_vlan_offload(res->port_id);
-
-   if (vlan_offload & ETH_VLAN_EXTEND_OFFLOAD) {
-   printf("Error, as QinQ has been enabled.\n");
-   return;
-   }

tx_vlan_set(res->port_id, res->vlan_id);
 }
@@ -3004,12 +2998,6 @@ cmd_tx_vlan_set_qinq_parsed(void *parsed_result,
__attribute__((unused)) void *data)
 {
struct cmd_tx_vlan_set_qinq_result *res = parsed_result;
-   int vlan_offload = rte_eth_dev_get_vlan_offload(res->port_id);
-
-   if (!(vlan_offload & ETH_VLAN_EXTEND_OFFLOAD)) {
-   printf("Error, as QinQ hasn't been enabled.\n");
-   return;
-   }

tx_qinq_set(res->port_id, res->vlan_id, res->vlan_id_outer);
 }
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 7088f6f..7572b3e 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -1839,25 +1839,42 @@ vlan_tpid_set(portid_t port_id, uint16_t tp_id)
 void
 tx_vlan_set(portid_t port_id, uint16_t vlan_id)
 {
+   int vlan_offload;
if (port_id_is_invalid(port_id, ENABLED_WARN))
return;
if (vlan_id_is_invalid(vlan_id))
return;
-   tx_vlan_reset(port_id);
+
+   vlan_offload = rte_eth_dev_get_vlan_offload(port_id);
+   if (vlan_offload & ETH_VLAN_EXTEND_OFFLOAD) {
+   printf("Error, as QinQ has been enabled.\n");
+   return;
+   }
+
+   ports[port_id].tx_ol_flags &= ~TESTPMD_TX_OFFLOAD_INSERT_QINQ;
ports[port_id].tx_ol_flags |= TESTPMD_TX_OFFLOAD_INSERT_VLAN;
ports[port_id].tx_vlan_id = vlan_id;
+   ports[port_id].tx_vlan_id_outer = 0;
 }

 void
 tx_qinq_set(portid_t port_id, uint16_t vlan_id, uint16_t vlan_id_outer)
 {
+   int vlan_offload;
if (port_id_is_invalid(port_id, ENABLED_WARN))
return;
if (vlan_id_is_invalid(vlan_id))
return;
if (vlan_id_is_invalid(vlan_id_outer))
return;
-   tx_vlan_reset(port_id);
+
+   vlan_offload = rte_eth_dev_get_vlan_offload(port_id);
+   if (!(vlan_offload & ETH_VLAN_EXTEND_OFFLOAD)) {
+   printf("Error, as QinQ hasn't been enabled.\n");
+   return;
+   }
+
+   ports[port_id].tx_ol_flags &= ~TESTPMD_TX_OFFLOAD_INSERT_VLAN;
ports[port_id].tx_ol_flags |= TESTPMD_TX_OFFLOAD_INSERT_QINQ;
ports[port_id].tx_vlan_id = vlan_id;
ports[port_id].tx_vlan_id_outer = vlan_id_outer;
-- 
1.9.3



[dpdk-dev] [PATCH v2] testpmd: fix a bug in tx_vlan set command support

2015-12-22 Thread Wang Xiao W
v2:
* Removed the bug fix unrelated code change to make this patch a pure
  bug fix patch.

* Fixed the "PATCH 1/2" mistake in the patch title, rewrote the subject.

v1:
* Initial version for tx_vlan set command support bug fix.

Wang Xiao W (1):
  testpmd: fix a bug in tx_vlan set command support

 app/test-pmd/cmdline.c | 12 
 app/test-pmd/config.c  | 16 
 2 files changed, 16 insertions(+), 12 deletions(-)

-- 
1.9.3



[dpdk-dev] [PATCH v2] testpmd: fix a bug in tx_vlan set command support

2015-12-22 Thread Wang Xiao W
Now in cmd_tx_vlan_set_parsed function, we check the vlan_offload
capability first, if it's an invalid port_id we'll get a strange
prompt saying "Error, as QinQ has been enabled.". We should always
make sure that we get a valid port_id first before we check other
information. It's the same problem for cmd_tx_vlan_set_qinq_parsed.

Signed-off-by: Wang Xiao W 
---
 app/test-pmd/cmdline.c | 12 
 app/test-pmd/config.c  | 16 
 2 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 73298c9..2adf6ca 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -2952,12 +2952,6 @@ cmd_tx_vlan_set_parsed(void *parsed_result,
   __attribute__((unused)) void *data)
 {
struct cmd_tx_vlan_set_result *res = parsed_result;
-   int vlan_offload = rte_eth_dev_get_vlan_offload(res->port_id);
-
-   if (vlan_offload & ETH_VLAN_EXTEND_OFFLOAD) {
-   printf("Error, as QinQ has been enabled.\n");
-   return;
-   }

tx_vlan_set(res->port_id, res->vlan_id);
 }
@@ -3004,12 +2998,6 @@ cmd_tx_vlan_set_qinq_parsed(void *parsed_result,
__attribute__((unused)) void *data)
 {
struct cmd_tx_vlan_set_qinq_result *res = parsed_result;
-   int vlan_offload = rte_eth_dev_get_vlan_offload(res->port_id);
-
-   if (!(vlan_offload & ETH_VLAN_EXTEND_OFFLOAD)) {
-   printf("Error, as QinQ hasn't been enabled.\n");
-   return;
-   }

tx_qinq_set(res->port_id, res->vlan_id, res->vlan_id_outer);
 }
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 7088f6f..956d29c 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -1839,10 +1839,18 @@ vlan_tpid_set(portid_t port_id, uint16_t tp_id)
 void
 tx_vlan_set(portid_t port_id, uint16_t vlan_id)
 {
+   int vlan_offload;
if (port_id_is_invalid(port_id, ENABLED_WARN))
return;
if (vlan_id_is_invalid(vlan_id))
return;
+
+   vlan_offload = rte_eth_dev_get_vlan_offload(port_id);
+   if (vlan_offload & ETH_VLAN_EXTEND_OFFLOAD) {
+   printf("Error, as QinQ has been enabled.\n");
+   return;
+   }
+
tx_vlan_reset(port_id);
ports[port_id].tx_ol_flags |= TESTPMD_TX_OFFLOAD_INSERT_VLAN;
ports[port_id].tx_vlan_id = vlan_id;
@@ -1851,12 +1859,20 @@ tx_vlan_set(portid_t port_id, uint16_t vlan_id)
 void
 tx_qinq_set(portid_t port_id, uint16_t vlan_id, uint16_t vlan_id_outer)
 {
+   int vlan_offload;
if (port_id_is_invalid(port_id, ENABLED_WARN))
return;
if (vlan_id_is_invalid(vlan_id))
return;
if (vlan_id_is_invalid(vlan_id_outer))
return;
+
+   vlan_offload = rte_eth_dev_get_vlan_offload(port_id);
+   if (!(vlan_offload & ETH_VLAN_EXTEND_OFFLOAD)) {
+   printf("Error, as QinQ hasn't been enabled.\n");
+   return;
+   }
+
tx_vlan_reset(port_id);
ports[port_id].tx_ol_flags |= TESTPMD_TX_OFFLOAD_INSERT_QINQ;
ports[port_id].tx_vlan_id = vlan_id;
-- 
1.9.3
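
One plausible explanation for the strange prompt, sketched below with stand-in
helpers: if the offload query returns a negative error code for an unknown
port, that code can have the QinQ bit set when it is read as a flag word. The
ex_ functions, the port table, and the literal -19 (mimicking -ENODEV on Linux)
are assumptions for illustration, not testpmd code.

```c
#include <stdio.h>
#include <stdbool.h>

#define EX_MAX_PORTS 4
#define EX_VLAN_EXTEND_OFFLOAD 0x4	/* stand-in for ETH_VLAN_EXTEND_OFFLOAD */

/* Hypothetical per-port state: ports 0 and 1 exist, port 1 has QinQ on. */
static bool ex_port_attached[EX_MAX_PORTS] = { true, true, false, false };
static int ex_port_offload[EX_MAX_PORTS] = { 0, EX_VLAN_EXTEND_OFFLOAD, 0, 0 };

static bool ex_port_id_is_invalid(int port_id)
{
	if (port_id >= 0 && port_id < EX_MAX_PORTS && ex_port_attached[port_id])
		return false;
	printf("Invalid port %d\n", port_id);
	return true;
}

/* Assumed to return a negative error code for an unknown port. */
static int ex_get_vlan_offload(int port_id)
{
	if (port_id < 0 || port_id >= EX_MAX_PORTS || !ex_port_attached[port_id])
		return -19;	/* mimics -ENODEV */
	return ex_port_offload[port_id];
}

/* Order before the fix: the error code is tested as if it were flags. */
static int ex_tx_vlan_set_old(int port_id)
{
	int vlan_offload = ex_get_vlan_offload(port_id);

	if (vlan_offload & EX_VLAN_EXTEND_OFFLOAD)
		return -2;	/* "QinQ has been enabled" */
	if (ex_port_id_is_invalid(port_id))
		return -1;
	return 0;
}

/* Order after the fix: validate the port, then query it. */
static int ex_tx_vlan_set_new(int port_id)
{
	if (ex_port_id_is_invalid(port_id))
		return -1;
	if (ex_get_vlan_offload(port_id) & EX_VLAN_EXTEND_OFFLOAD)
		return -2;
	return 0;
}
```

In this sketch the old order reports the QinQ error for nonexistent port 3,
while the new order reports the invalid port.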



Re: [dpdk-dev] [PATCH 2/9] vhost: provide helpers for virtio ring relay

2018-12-11 Thread Wang, Xiao W
Hi,

> -Original Message-
> From: Bie, Tiwei
> Sent: Monday, December 3, 2018 10:23 PM
> To: Wang, Xiao W 
> Cc: maxime.coque...@redhat.com; dev@dpdk.org; Wang, Zhihong
> ; Ye, Xiaolong 
> Subject: Re: [PATCH 2/9] vhost: provide helpers for virtio ring relay
> 
> On Wed, Nov 28, 2018 at 05:46:00PM +0800, Xiao Wang wrote:
> [...]
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice
> > + *
> > + * Synchronize the available ring from guest to mediate ring, help to
> > + * check desc validity to protect against malicious guest driver.
> > + *
> > + * @param vid
> > + *  vhost device id
> > + * @param qid
> > + *  vhost queue id
> > + * @param m_vring
> > + *  mediate virtio ring pointer
> > + * @return
> > + *  number of synced available entries on success, -1 on failure
> > + */
> > +int __rte_experimental
> > +rte_vdpa_relay_avail_ring(int vid, int qid, struct vring *m_vring);
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice
> > + *
> > + * Synchronize the used ring from mediate ring to guest, log dirty
> > + * page for each Rx buffer used.
> > + *
> > + * @param vid
> > + *  vhost device id
> > + * @param qid
> > + *  vhost queue id
> > + * @param m_vring
> > + *  mediate virtio ring pointer
> > + * @return
> > + *  number of synced used entries on success, -1 on failure
> > + */
> > +int __rte_experimental
> > +rte_vdpa_relay_used_ring(int vid, int qid, struct vring *m_vring);
> 
> Above APIs are split ring specific. We also need to take
> packed ring into consideration.

After some study of the current packed ring description, I have several ideas:
1. These APIs are helpers for setting up a mediate relay layer that does dirty
page logging; we may not need this kind of ring relay for packed ring at all.
The purpose of a mediate SW layer is to help the device do dirty page logging,
so this SW-assisted VDPA tries to intercept the frontend-backend communication.
As you can see in this patch set, SW captures the device interrupt, then parses
the vring and logs dirty pages afterwards. We set up the mediate vring to make
sure the relay SW can intercept the device interrupt; this way we can control
the mediate vring's interrupt suppression structure.

2. One new point about the packed ring is that it separates the event
suppression structure from the descriptor ring. So in this case, we can just
set up a mediate event suppression structure to intercept event notification.

BTW, one troublesome point about the packed ring is that it's hard for a
mediate SW to handle the "buffer id" quickly. The guest virtio driver
understands this id well and keeps some internal info about each id, e.g. chain
list length, but the relay SW has to parse the packed ring again, which is not
efficient.

3. In the split vring, relay SW reuses the guest desc vring, and desc is not
written by DMA, so there is no logging for the desc. But in the packed vring,
desc is written by DMA, so desc ring logging is a new thing.

Packed ring is quite different, so it could need a very different mechanism
rather than following a vring relay API. Also, from a testing point of view, if
we come up with a new efficient implementation for packed ring VDPA, it's hard
to test it with HW. Testing needs HW that supports packed ring DMA and the
get_vring_base/set_vring_base interface.
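
To make "log dirty page" concrete, a minimal sketch of marking the pages
touched by one used Rx buffer in a bitmap. It assumes one bit per 4 KiB page and
a byte-addressed log; the ex_ helpers are illustrative and are not the vhost
library functions.

```c
#include <stdint.h>

#define EX_LOG_PAGE 4096u	/* assumed granularity: one bit per 4 KiB page */

/* Mark one guest page as dirty in a byte-addressed bitmap. */
static void ex_log_page(uint8_t *log_base, uint64_t page)
{
	log_base[page / 8] |= (uint8_t)(1u << (page % 8));
}

/* Mark every page touched by the buffer [addr, addr + len). */
static void ex_log_write(uint8_t *log_base, uint64_t log_size,
			 uint64_t addr, uint64_t len)
{
	uint64_t page, last;

	if (len == 0)
		return;
	last = (addr + len - 1) / EX_LOG_PAGE;
	if (last / 8 >= log_size)
		return;	/* buffer falls outside the logged range */
	for (page = addr / EX_LOG_PAGE; page <= last; page++)
		ex_log_page(log_base, page);
}

/* Log a 2-byte write straddling pages 1 and 2; return first bitmap byte. */
static uint8_t ex_demo(void)
{
	uint8_t log[8] = {0};

	ex_log_write(log, sizeof(log), 2 * EX_LOG_PAGE - 1, 2);
	return log[0];
}
```

The relay would call something like ex_log_write once per used Rx buffer, since
those are the buffers the device wrote into.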

> 
> >  #endif /* _RTE_VDPA_H_ */
> [...]
> > diff --git a/lib/librte_vhost/vdpa.c b/lib/librte_vhost/vdpa.c
> > index e7d849ee0..e41117776 100644
> > --- a/lib/librte_vhost/vdpa.c
> > +++ b/lib/librte_vhost/vdpa.c
> > @@ -122,3 +122,176 @@ rte_vdpa_get_device_num(void)
> >  {
> > return vdpa_device_num;
> >  }
> > +
> > +static int
> > +invalid_desc_check(struct virtio_net *dev, struct vhost_virtqueue *vq,
> > +   uint64_t desc_iova, uint64_t desc_len, uint8_t perm)
> > +{
> > +   uint64_t desc_addr, desc_chunck_len;
> > +
> > +   while (desc_len) {
> > +   desc_chunck_len = desc_len;
> > +   desc_addr = vhost_iova_to_vva(dev, vq,
> > +   desc_iova,
> > +   &desc_chunck_len,
> > +   perm);
> > +
> > +   if (!desc_addr)
> > +   return -1;
> > +
> > +   desc_len -= desc_chunck_len;
> > +   desc_iova += desc_chunck_len;
> > +   }
> > +
> > +   return 0;
> > +}
> > +
> > +int
> > +rte_vdpa_relay_avail_ring

Re: [dpdk-dev] [PATCH 6/9] net/ifc: add devarg for LM mode

2018-12-11 Thread Wang, Xiao W
Hi,

> -Original Message-
> From: Bie, Tiwei
> Sent: Monday, December 3, 2018 10:32 PM
> To: Wang, Xiao W 
> Cc: maxime.coque...@redhat.com; dev@dpdk.org; Wang, Zhihong
> ; Ye, Xiaolong 
> Subject: Re: [PATCH 6/9] net/ifc: add devarg for LM mode
> 
> On Wed, Nov 28, 2018 at 05:46:04PM +0800, Xiao Wang wrote:
> [...]
> > @@ -767,6 +771,7 @@ ifcvf_pci_probe(struct rte_pci_driver *pci_drv
> __rte_unused,
> > struct ifcvf_internal *internal = NULL;
> > struct internal_list *list = NULL;
> > int vdpa_mode = 0;
> > +   int sw_fallback_lm = 0;
> > struct rte_kvargs *kvlist = NULL;
> > int ret = 0;
> >
> > @@ -826,6 +831,16 @@ ifcvf_pci_probe(struct rte_pci_driver *pci_drv
> __rte_unused,
> > internal->dev_addr.type = PCI_ADDR;
> > list->internal = internal;
> >
> > +   if (rte_kvargs_count(kvlist, IFCVF_SW_FALLBACK_LM)) {
> > +   ret = rte_kvargs_process(kvlist, IFCVF_SW_FALLBACK_LM,
> > +   &open_int, &sw_fallback_lm);
> > +   if (ret < 0)
> > +   goto error;
> > +   internal->sw_lm = sw_fallback_lm ? true : false;
> > +   } else {
> > +   internal->sw_lm = false;
> > +   }
> 
> Something like this would be better:
> 
>   if (rte_kvargs_count(kvlist, IFCVF_SW_FALLBACK_LM)) {
>   ret = rte_kvargs_process(kvlist, IFCVF_SW_FALLBACK_LM,
>   &open_int, &sw_fallback_lm);
>   if (ret < 0)
>   goto error;
>   }
> 
>   internal->sw_lm = sw_fallback_lm;
> 

Yeah, fewer lines of code; I will update it.
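
A quick sketch of why the shorter form behaves the same, assuming sw_lm is a C99
bool: converting an int to bool already maps any nonzero value to true, and the
local starts at 0 when the argument is absent. The ex_ functions are
illustrative only.

```c
#include <stdbool.h>

/* Original form: explicit ternary plus an else branch for the default. */
static bool ex_sw_lm_verbose(int have_arg, int parsed)
{
	bool sw_lm;

	if (have_arg)
		sw_lm = parsed ? true : false;
	else
		sw_lm = false;
	return sw_lm;
}

/* Suggested form: the local stays 0 unless the argument is present. */
static bool ex_sw_lm_short(int have_arg, int parsed)
{
	int sw_fallback_lm = 0;

	if (have_arg)
		sw_fallback_lm = parsed;
	return sw_fallback_lm;	/* int to bool: nonzero becomes true */
}
```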

BRs,
Xiao

> 
> > internal->did = rte_vdpa_register_device(&internal->dev_addr,
> > &ifcvf_ops);
> > if (internal->did < 0) {
> > --
> > 2.15.1
> >


Re: [dpdk-dev] [PATCH 6/9] net/ifc: add devarg for LM mode

2018-12-12 Thread Wang, Xiao W
Hi Alejandro,

Yes, in this mode the datapath goes through the relay thread when LM happens;
it's not a direct interaction between the VM and the vdpa device.

BRs,
Xiao

From: Alejandro Lucero [mailto:alejandro.luc...@netronome.com]
Sent: Wednesday, December 12, 2018 2:15 AM
To: Wang, Xiao W 
Cc: Bie, Tiwei; Maxime Coquelin; dev; Wang, Zhihong; Ye, Xiaolong
Subject: Re: [dpdk-dev] [PATCH 6/9] net/ifc: add devarg for LM mode


On Wed, Nov 28, 2018 at 9:56 AM Xiao Wang wrote:
This patch series enables a new method for live migration, i.e. software
assisted live migration. This patch provides a device argument for user
to choose the method.

When "swlm=1", driver/device will do live migration with a relay thread
dealing with dirty page logging. Without this parameter, device will do
dirty page logging and there's no relay thread consuming CPU resource.

I'm a bit confused with this mode. If it is a relay thread doing the dirty page 
logging, does it mean that the datapath is through the relay thread and not 
between the VM and the vdpa device?

Signed-off-by: Xiao Wang
---
 drivers/net/ifc/ifcvf_vdpa.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/net/ifc/ifcvf_vdpa.c b/drivers/net/ifc/ifcvf_vdpa.c
index c0e50354a..e9cc8d7bc 100644
--- a/drivers/net/ifc/ifcvf_vdpa.c
+++ b/drivers/net/ifc/ifcvf_vdpa.c
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 

 #include 
 #include 
@@ -31,9 +32,11 @@
 #endif

 #define IFCVF_VDPA_MODE"vdpa"
+#define IFCVF_SW_FALLBACK_LM   "swlm"

 static const char * const ifcvf_valid_arguments[] = {
IFCVF_VDPA_MODE,
+   IFCVF_SW_FALLBACK_LM,
NULL
 };

@@ -56,6 +59,7 @@ struct ifcvf_internal {
rte_atomic32_t dev_attached;
rte_atomic32_t running;
rte_spinlock_t lock;
+   bool sw_lm;
 };

 struct internal_list {
@@ -767,6 +771,7 @@ ifcvf_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
struct ifcvf_internal *internal = NULL;
struct internal_list *list = NULL;
int vdpa_mode = 0;
+   int sw_fallback_lm = 0;
struct rte_kvargs *kvlist = NULL;
int ret = 0;

@@ -826,6 +831,16 @@ ifcvf_pci_probe(struct rte_pci_driver *pci_drv 
__rte_unused,
internal->dev_addr.type = PCI_ADDR;
list->internal = internal;

+   if (rte_kvargs_count(kvlist, IFCVF_SW_FALLBACK_LM)) {
+   ret = rte_kvargs_process(kvlist, IFCVF_SW_FALLBACK_LM,
+   &open_int, &sw_fallback_lm);
+   if (ret < 0)
+   goto error;
+   internal->sw_lm = sw_fallback_lm ? true : false;
+   } else {
+   internal->sw_lm = false;
+   }
+
internal->did = rte_vdpa_register_device(&internal->dev_addr,
&ifcvf_ops);
if (internal->did < 0) {
--
2.15.1


Re: [dpdk-dev] [PATCH v3 1/9] vhost: provide helper for host notifier ctrl

2018-12-14 Thread Wang, Xiao W
Hi,

> -Original Message-
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Friday, December 14, 2018 5:33 AM
> To: Wang, Xiao W ;
> alejandro.luc...@netronome.com; Bie, Tiwei 
> Cc: dev@dpdk.org; Wang, Zhihong ; Ye, Xiaolong
> 
> Subject: Re: [PATCH v3 1/9] vhost: provide helper for host notifier ctrl
> 
> 
> 
> On 12/13/18 11:09 AM, Xiao Wang wrote:
> > VDPA driver can decide if it needs to enable/disable the host notifier
> > > mapping, so exposing an API allows flexibility. A later patch will
> > > build on this.
> >
> > Signed-off-by: Xiao Wang 
> > ---
> > v2:
> > * Reword the vdpa host notifier control API comment.
> > ---
> >   drivers/net/ifc/ifcvf_vdpa.c   |  3 +++
> >   lib/librte_vhost/rte_vdpa.h| 18 ++
> >   lib/librte_vhost/rte_vhost_version.map |  1 +
> >   lib/librte_vhost/vhost.c   |  3 +--
> >   lib/librte_vhost/vhost_user.c  |  7 +--
> >   5 files changed, 24 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/net/ifc/ifcvf_vdpa.c b/drivers/net/ifc/ifcvf_vdpa.c
> > index 97a57f182..e844109f3 100644
> > --- a/drivers/net/ifc/ifcvf_vdpa.c
> > +++ b/drivers/net/ifc/ifcvf_vdpa.c
> > @@ -556,6 +556,9 @@ ifcvf_dev_config(int vid)
> > rte_atomic32_set(&internal->dev_attached, 1);
> > update_datapath(internal);
> >
> > +   if (rte_vhost_host_notifier_ctrl(vid, true) != 0)
> > +   DRV_LOG(NOTICE, "vDPA (%d): software relay is used.", did);
> > +
> > return 0;
> >   }
> >
> > diff --git a/lib/librte_vhost/rte_vdpa.h b/lib/librte_vhost/rte_vdpa.h
> > index a418da47c..fff657391 100644
> > --- a/lib/librte_vhost/rte_vdpa.h
> > +++ b/lib/librte_vhost/rte_vdpa.h
> > @@ -11,6 +11,8 @@
> >* Device specific vhost lib
> >*/
> >
> > +#include 
> > +
> >   #include 
> >   #include "rte_vhost.h"
> >
> > @@ -155,4 +157,20 @@ rte_vdpa_get_device(int did);
> >*/
> >   int __rte_experimental
> >   rte_vdpa_get_device_num(void);
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice
> > + *
> > + * Enable/Disable host notifier mapping for a vdpa port.
> > + *
> > + * @param vid
> > + *  vhost device id
> > > + * @param enable
> > + *  true for host notifier map, false for host notifier unmap
> > + * @return
> > + *  0 on success, -1 on failure
> > + */
> > +int __rte_experimental
> > +rte_vhost_host_notifier_ctrl(int vid, bool enable);
> >   #endif /* _RTE_VDPA_H_ */
> > > diff --git a/lib/librte_vhost/rte_vhost_version.map b/lib/librte_vhost/rte_vhost_version.map
> > index ae39b6e21..22302e972 100644
> > --- a/lib/librte_vhost/rte_vhost_version.map
> > +++ b/lib/librte_vhost/rte_vhost_version.map
> > @@ -83,4 +83,5 @@ EXPERIMENTAL {
> > rte_vhost_crypto_finalize_requests;
> > rte_vhost_crypto_set_zero_copy;
> > rte_vhost_va_from_guest_pa;
> > +   rte_vhost_host_notifier_ctrl;
> >   };
> > diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
> > index 70ac6bc9c..e7a60e0b4 100644
> > --- a/lib/librte_vhost/vhost.c
> > +++ b/lib/librte_vhost/vhost.c
> > @@ -408,8 +408,7 @@ vhost_detach_vdpa_device(int vid)
> > if (dev == NULL)
> > return;
> >
> > -   vhost_user_host_notifier_ctrl(vid, false);
> > -
> > +   vhost_destroy_device_notify(dev);
> It seems that this addition is not mentioned in the commit message.
> Why is it needed now?

To mirror vhost_attach_vdpa_device, I think detach should not only disable the
host notifier but also destroy the vhost port. Also, this internal API is
currently unused.
Yes, we need to mention this point in the commit message. BTW, I would prefer
to remove this unused internal API in a separate patch.

BRs,
Xiao

> 
> 
> > dev->vdpa_dev_id = -1;
> >   }
> >
> > diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> > index 3ea64eba6..5e0da0589 100644
> > --- a/lib/librte_vhost/vhost_user.c
> > +++ b/lib/librte_vhost/vhost_user.c
> > @@ -2045,11 +2045,6 @@ vhost_user_msg_handler(int vid, int fd)
> > if (vdpa_dev->ops->dev_conf)
> > vdpa_dev->ops->dev_conf(dev->vid);
> > dev->flags |= VIRTIO_DEV_VDPA_CONFIGURED;
> > -   if (vhost_user_host_notifier_ctrl(dev->vid, true) != 0) {
> > -   RTE_LOG(INFO, VHOST_CONFIG,
> > -   "(%d) software relay is used for vDPA,
> performance may be low.\n",
> > -   dev->vid);
> > -   }
> > }
> >
> > return 0;
> > > @@ -2144,7 +2139,7 @@ static int vhost_user_slave_set_vring_host_notifier(struct virtio_net *dev,
> > return process_slave_message_reply(dev, &msg);
> >   }
> >
> > -int vhost_user_host_notifier_ctrl(int vid, bool enable)
> > +int rte_vhost_host_notifier_ctrl(int vid, bool enable)
> >   {
> > struct virtio_net *dev;
> > struct rte_vdpa_device *vdpa_dev;
> >


Re: [dpdk-dev] [PATCH v4 03/10] vhost: provide helpers for virtio ring relay

2018-12-17 Thread Wang, Xiao W
Hi Maxime,

> -Original Message-
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Sunday, December 16, 2018 1:11 AM
> To: Wang, Xiao W ; Bie, Tiwei 
> Cc: alejandro.luc...@netronome.com; dev@dpdk.org; Wang, Zhihong
> ; Ye, Xiaolong 
> Subject: Re: [PATCH v4 03/10] vhost: provide helpers for virtio ring relay
> 
> 
> 
> On 12/14/18 10:16 PM, Xiao Wang wrote:
> > This patch provides two helpers for vdpa device driver to perform a
> > relay between the guest virtio ring and a mediate virtio ring.
> 
> s/mediate/mediated/ ?
> I'm not 100% sure, but if it is mediated, please change everywhere else
> in the patch.

"mediate" can also be used as an adjective, so "mediate" is OK here.

> 
> >
> > The available ring relay will synchronize the available entries, and
> > helps to do desc validity checking.
> 
> s/helps/help/

Yes, will update.

> 
> >
> > The used ring relay will synchronize the used entries from mediate ring
> > to guest ring, and helps to do dirty page logging for live migration.
> 
> s/helps/help/

Will update.

Thanks for the comments,
Xiao

> 
> >
> > The next patch will leverage these two helpers.
> >
> > Signed-off-by: Xiao Wang 
> > ---
> >   lib/librte_vhost/rte_vdpa.h|  39 +++
> >   lib/librte_vhost/rte_vhost_version.map |   2 +
> >   lib/librte_vhost/vdpa.c| 194
> +
> >   lib/librte_vhost/vhost.h   |  40 +++
> >   lib/librte_vhost/virtio_net.c  |  39 ---
> >   5 files changed, 275 insertions(+), 39 deletions(-)
> >
> 
> 
> Apart from that:
> Reviewed-by: Maxime Coquelin 
> 
> Thanks,
> Maxime


Re: [dpdk-dev] [PATCH v4 06/10] net/ifc: detect if VDPA mode is specified

2018-12-17 Thread Wang, Xiao W
Hi Maxime,

> -Original Message-
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Sunday, December 16, 2018 1:17 AM
> To: Wang, Xiao W ; Bie, Tiwei 
> Cc: alejandro.luc...@netronome.com; dev@dpdk.org; Wang, Zhihong
> ; Ye, Xiaolong 
> Subject: Re: [PATCH v4 06/10] net/ifc: detect if VDPA mode is specified
> 
> 
> 
> On 12/14/18 10:16 PM, Xiao Wang wrote:
> > If user wants the VF to be used in VDPA (vhost data path acceleration)
> > mode, then the user can add a "vdpa=1" parameter for the device.
> >
> > So if driver doesn't not find this option, it should quit and let the
> 
> s/doesn't not/does not/

Yes, I will fix the typo.

> 
> > bus continue the probe.
> >
> > Signed-off-by: Xiao Wang 
> > ---
> >   drivers/net/ifc/Makefile |  1 +
> >   drivers/net/ifc/ifcvf_vdpa.c | 47
> 
> >   2 files changed, 48 insertions(+)
> >
> 
> Should this option be documented somewhere?

Will add a section for this in the last doc patch.

Thanks,
Xiao

> 
> Apart from that:
> Reviewed-by: Maxime Coquelin 
> 
> Thanks,
> Maxime


Re: [dpdk-dev] [PATCH v4 07/10] net/ifc: add devarg for LM mode

2018-12-17 Thread Wang, Xiao W
Hi Maxime,

> -Original Message-
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Sunday, December 16, 2018 1:21 AM
> To: Wang, Xiao W ; Bie, Tiwei 
> Cc: alejandro.luc...@netronome.com; dev@dpdk.org; Wang, Zhihong
> ; Ye, Xiaolong 
> Subject: Re: [PATCH v4 07/10] net/ifc: add devarg for LM mode
> 
> 
> 
> On 12/14/18 10:16 PM, Xiao Wang wrote:
> > This patch series enables a new method for live migration, i.e. software
> > assisted live migration. This patch provides a device argument for user
> > > to choose the method.
> >
> > When "swlm=1", driver/device will do live migration with a relay thread
> > dealing with dirty page logging. Without this parameter, device will do
> > dirty page logging and there's no relay thread consuming CPU resource.
> >
> > Signed-off-by: Xiao Wang 
> > ---
> >   drivers/net/ifc/ifcvf_vdpa.c | 13 +
> >   1 file changed, 13 insertions(+)
> >
> > diff --git a/drivers/net/ifc/ifcvf_vdpa.c b/drivers/net/ifc/ifcvf_vdpa.c
> > index c0e50354a..395c5112f 100644
> > --- a/drivers/net/ifc/ifcvf_vdpa.c
> > +++ b/drivers/net/ifc/ifcvf_vdpa.c
> > @@ -8,6 +8,7 @@
> >   #include 
> >   #include 
> >   #include 
> > +#include 
> >
> >   #include 
> >   #include 
> > @@ -31,9 +32,11 @@
> >   #endif
> >
> >   #define IFCVF_VDPA_MODE   "vdpa"
> > +#define IFCVF_SW_FALLBACK_LM   "swlm"
> 
> 
> The patch looks good, except that I don't like the "swlm" name.
> Maybe we could have something less obscure, even if a little bit longer?
> 
> What about "sw-live-migration"?

Agree with you, making it clear is more reader-friendly than a short name.
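
For illustration, here is a minimal, self-contained sketch of how a longer
devarg name could be looked up in a comma-separated devargs string. It
deliberately does not use rte_kvargs; the helper devarg_get_int() and the
SW_LM_KEY macro are hypothetical names made up for this example, and the key
is only the name proposed above, not a final choice.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical key name, following the suggestion in this thread. */
#define SW_LM_KEY "sw-live-migration"

/*
 * Look up "key=<int>" in a comma-separated devargs string.
 * Returns 1 and stores the value if the key is present, 0 if it is absent,
 * and -1 if the key is present but its value is not a valid integer.
 */
static int
devarg_get_int(const char *args, const char *key, int *out)
{
	size_t klen = strlen(key);
	const char *p = args;

	while (p != NULL && *p != '\0') {
		/* Each token runs up to the next comma or the end of string. */
		const char *end = strchr(p, ',');
		size_t len = end ? (size_t)(end - p) : strlen(p);

		if (len > klen && strncmp(p, key, klen) == 0 &&
		    p[klen] == '=') {
			char *stop;
			long v = strtol(p + klen + 1, &stop, 0);

			/* Reject empty values and trailing garbage. */
			if (stop == p + klen + 1 || stop != p + len)
				return -1;
			*out = (int)v;
			return 1;
		}
		p = end ? end + 1 : NULL;
	}
	return 0;
}
```

With "vdpa=1,sw-live-migration=1" the helper returns 1 and stores 1. When the
key is absent it returns 0, so a driver could default its sw_lm flag to false,
as the patch does.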

Thanks,
Xiao


Re: [dpdk-dev] [PATCH v4 09/10] net/ifc: support SW assisted VDPA live migration

2018-12-17 Thread Wang, Xiao W
Hi Maxime,

> -Original Message-
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Sunday, December 16, 2018 1:35 AM
> To: Wang, Xiao W ; Bie, Tiwei 
> Cc: alejandro.luc...@netronome.com; dev@dpdk.org; Wang, Zhihong
> ; Ye, Xiaolong 
> Subject: Re: [PATCH v4 09/10] net/ifc: support SW assisted VDPA live migration
> 
> 
> 
> On 12/14/18 10:16 PM, Xiao Wang wrote:
> > In SW assisted live migration mode, driver will stop the device and
> > setup a mediate virtio ring to relay the communication between the
> > virtio driver and the VDPA device.
> >
> > This data path intervention will allow SW to help on guest dirty page
> > logging for live migration.
> >
> > > This SW fallback is an event-driven relay thread, so when the network
> > > throughput is low, this SW fallback will take little CPU resource, but
> > > when the throughput goes up, the relay thread's CPU usage will go up
> > accordinly.
> 
> s/accordinly/accordingly/
> 

Will fix it in next version.

> >
> > User needs to take all the factors including CPU usage, guest perf
> > degradation, etc. into consideration when selecting the live migration
> > support mode.
> >
> > Signed-off-by: Xiao Wang 
> > ---
> >   drivers/net/ifc/base/ifcvf.h |   1 +
> >   drivers/net/ifc/ifcvf_vdpa.c | 346
> ++-
> >   2 files changed, 344 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/ifc/base/ifcvf.h b/drivers/net/ifc/base/ifcvf.h
> > index c15c69107..e8a30d2c6 100644
> > --- a/drivers/net/ifc/base/ifcvf.h
> > +++ b/drivers/net/ifc/base/ifcvf.h
> > @@ -50,6 +50,7 @@
> >   #define IFCVF_LM_ENABLE_VF0x1
> >   #define IFCVF_LM_ENABLE_PF0x3
> >   #define IFCVF_LOG_BASE0x1000
> > +#define IFCVF_MEDIATE_VRING0x2000
> 
> > MEDIATED?

"mediate" is used as an adjective here.

> 
> >
> >   #define IFCVF_32_BIT_MASK 0x
> >
> > diff --git a/drivers/net/ifc/ifcvf_vdpa.c b/drivers/net/ifc/ifcvf_vdpa.c
> > index f181c5a6e..61757d0b4 100644
> > --- a/drivers/net/ifc/ifcvf_vdpa.c
> > +++ b/drivers/net/ifc/ifcvf_vdpa.c
> > @@ -63,6 +63,9 @@ struct ifcvf_internal {
> > rte_atomic32_t running;
> > rte_spinlock_t lock;
> > bool sw_lm;

[...]

> > +static void *
> > +vring_relay(void *arg)
> > +{
> > +   int i, vid, epfd, fd, nfds;
> > +   struct ifcvf_internal *internal = (struct ifcvf_internal *)arg;
> > +   struct rte_vhost_vring vring;
> > +   struct rte_intr_handle *intr_handle;
> > +   uint16_t qid, q_num;
> > +   struct epoll_event events[IFCVF_MAX_QUEUES * 4];
> > +   struct epoll_event ev;
> > +   int nbytes;
> > +   uint64_t buf;
> > +
> > +   vid = internal->vid;
> > +   q_num = rte_vhost_get_vring_num(vid);
> > +   /* prepare the mediate vring */
> > +   for (qid = 0; qid < q_num; qid++) {
> > +   rte_vhost_get_vring_base(vid, qid,
> > +   &internal->m_vring[qid].avail->idx,
> > +   &internal->m_vring[qid].used->idx);
> > +   rte_vdpa_relay_vring_avail(vid, qid, &internal->m_vring[qid]);
> > +   }
> > +
> > +   /* add notify fd and interrupt fd to epoll */
> > +   epfd = epoll_create(IFCVF_MAX_QUEUES * 2);
> > +   if (epfd < 0) {
> > +   DRV_LOG(ERR, "failed to create epoll instance.");
> > +   return NULL;
> > +   }
> > +   internal->epfd = epfd;
> > +
> > +   for (qid = 0; qid < q_num; qid++) {
> > +   ev.events = EPOLLIN | EPOLLPRI;
> > +   rte_vhost_get_vhost_vring(vid, qid, &vring);
> > +   ev.data.u64 = qid << 1 | (uint64_t)vring.kickfd << 32;
> > +   if (epoll_ctl(epfd, EPOLL_CTL_ADD, vring.kickfd, &ev) < 0) {
> > +   DRV_LOG(ERR, "epoll add error: %s", strerror(errno));
> > +   return NULL;
> > +   }
> > +   }
> > +
> > +   intr_handle = &internal->pdev->intr_handle;
> > +   for (qid = 0; qid < q_num; qid++) {
> > +   ev.events = EPOLLIN | EPOLLPRI;
> > +   ev.data.u64 = 1 | qid << 1 |
> > +   (uint64_t)intr_handle->efds[qid] << 32;
> > +   if (epoll_ctl(epfd, EPOLL_CTL_ADD, intr_handle->efds[qid],
> &ev)
> > +   < 0) {
> > +   DRV_LOG(

Re: [dpdk-dev] [PATCH v4 10/10] doc: update ifc NIC document

2018-12-17 Thread Wang, Xiao W
Hi Maxime,

> -Original Message-
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Sunday, December 16, 2018 1:36 AM
> To: Wang, Xiao W ; Bie, Tiwei 
> Cc: alejandro.luc...@netronome.com; dev@dpdk.org; Wang, Zhihong
> ; Ye, Xiaolong 
> Subject: Re: [PATCH v4 10/10] doc: update ifc NIC document
> 
> 
> 
> On 12/14/18 10:16 PM, Xiao Wang wrote:
> > Add the SW assisted VDPA live migration feature into NIC doc.
> >
> > Signed-off-by: Xiao Wang 
> > ---
> >   doc/guides/nics/ifc.rst| 8 
> >   doc/guides/rel_notes/release_19_02.rst | 6 ++
> >   2 files changed, 14 insertions(+)
> >
> > diff --git a/doc/guides/nics/ifc.rst b/doc/guides/nics/ifc.rst
> > index 48f9adf1d..eb55d329a 100644
> > --- a/doc/guides/nics/ifc.rst
> > +++ b/doc/guides/nics/ifc.rst
> > > @@ -39,6 +39,13 @@ the driver probe a new container is created for this device, with this
> > >   container vDPA driver can program DMA remapping table with the VM's memory
> > >   region information.
> > >
> > > +The device argument "swlm=1" will configure the driver into SW assisted live
> > > +migration mode. In this mode, the driver will set up a SW relay thread when LM
> > > +happens, this thread will help device to log dirty pages. Thus this mode does
> > > +not require HW to implement a dirty page logging function block, but will
> > > +consume some percentage of CPU resource depending on the network throughput.
> > > +If no "swlm=1" specified, driver will rely on device's logging capability.
> > +
> 
> Ok, so that's documented here.
> What about documenting vdpa option too?

Yes, will explain all the devargs in this doc.

Thanks,
Xiao

> 
> >   Key IFCVF vDPA driver ops
> >   ~
> >
> > @@ -70,6 +77,7 @@ Features
> >   Features of the IFCVF driver are:
> >
> >   - Compatibility with virtio 0.95 and 1.0.
> > +- SW assisted vDPA live migration.
> >
> >
> >   Prerequisites
> > > diff --git a/doc/guides/rel_notes/release_19_02.rst b/doc/guides/rel_notes/release_19_02.rst
> > index e86ef9511..ced6af8f0 100644
> > --- a/doc/guides/rel_notes/release_19_02.rst
> > +++ b/doc/guides/rel_notes/release_19_02.rst
> > @@ -60,6 +60,12 @@ New Features
> > * Added the handler to get firmware version string.
> > * Added support for multicast filtering.
> >
> > +* **Added support for SW-assisted VDPA live migration.**
> > +
> > +  This SW-assisted VDPA live migration facility helps VDPA devices without
> > +  logging capability to perform live migration, a mediate SW relay can help
> > +  devices to track dirty pages caused by DMA. IFC driver has enabled this
> > +  SW-assisted live migration mode.
> >
> >   Removed Items
> >   -
> >


Re: [dpdk-dev] [PATCH v4 03/10] vhost: provide helpers for virtio ring relay

2018-12-17 Thread Wang, Xiao W
Thanks for the confirmation.

BRs,
Xiao

> -Original Message-
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Monday, December 17, 2018 7:03 PM
> To: Wang, Xiao W ; Bie, Tiwei 
> Cc: alejandro.luc...@netronome.com; dev@dpdk.org; Wang, Zhihong
> ; Ye, Xiaolong 
> Subject: Re: [PATCH v4 03/10] vhost: provide helpers for virtio ring relay
> 
> Hi Xiao,
> 
> On 12/17/18 9:51 AM, Wang, Xiao W wrote:
> > Hi Maxime,
> >
> >> -Original Message-
> >> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> >> Sent: Sunday, December 16, 2018 1:11 AM
> >> To: Wang, Xiao W ; Bie, Tiwei
> 
> >> Cc: alejandro.luc...@netronome.com; dev@dpdk.org; Wang, Zhihong
> >> ; Ye, Xiaolong 
> >> Subject: Re: [PATCH v4 03/10] vhost: provide helpers for virtio ring relay
> >>
> >>
> >>
> >> On 12/14/18 10:16 PM, Xiao Wang wrote:
> >>> This patch provides two helpers for vdpa device driver to perform a
> >>> relay between the guest virtio ring and a mediate virtio ring.
> >>
> >> s/mediate/mediated/ ?
> >> I'm not 100% sure, but if it is mediated, please change everywhere else
> >> in the patch.
> >
> > "mediate" can also be used as an adjective, so "mediate" is OK here.
> 
> I got the confirmation from a native speaker that mediate sounds wrong
> in this context, and mediated should be used.
> 
> >>
> >>>
> >>> The available ring relay will synchronize the available entries, and
> >>> helps to do desc validity checking.
> >>
> >> s/helps/help/
> >
> > Yes, will update.
> >
> >>
> >>>
> >>> The used ring relay will synchronize the used entries from mediate ring
> >>> to guest ring, and helps to do dirty page logging for live migration.
> >>
> >> s/helps/help/
> >
> > Will update.
> >
> > Thanks for the comments,
> > Xiao
> >
> >>
> >>>
> >>> The next patch will leverage these two helpers.
> >>>
> >>> Signed-off-by: Xiao Wang 
> >>> ---
> >>>lib/librte_vhost/rte_vdpa.h|  39 +++
> >>>lib/librte_vhost/rte_vhost_version.map |   2 +
> >>>lib/librte_vhost/vdpa.c| 194
> >> +
> >>>lib/librte_vhost/vhost.h   |  40 +++
> >>>lib/librte_vhost/virtio_net.c  |  39 ---
> >>>5 files changed, 275 insertions(+), 39 deletions(-)
> >>>
> >>
> >>
> >> Apart from that:
> >> Reviewed-by: Maxime Coquelin 
> >>
> >> Thanks,
> >> Maxime


Re: [dpdk-dev] [PATCH v4 03/10] vhost: provide helpers for virtio ring relay

2018-12-18 Thread Wang, Xiao W
Hi,

> -Original Message-
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Tuesday, December 18, 2018 3:01 AM
> To: Wang, Xiao W ; Bie, Tiwei 
> Cc: alejandro.luc...@netronome.com; dev@dpdk.org; Wang, Zhihong
> ; Ye, Xiaolong 
> Subject: Re: [PATCH v4 03/10] vhost: provide helpers for virtio ring relay
> 
> 
> 
> On 12/17/18 3:41 PM, Wang, Xiao W wrote:
> > Thanks for the confirmation.
> 
> Please note that CI reports a checkpatch issue:
> http://patches.dpdk.org/patch/48935/

+ Thomas.

I tried the checkpatch.pl from CentOS 7.4 and 7.5, and also the one from the
latest kernel, and got no warning in my self-check with
dpdk/devtools/checkpatches.sh.
I don't know which checkpatch.pl the CI uses; it depends on the
DPDK_CHECKPATCH_PATH environment variable setting. In the v5 patch, I added the
__rte_experimental flag for the new API even in the vdpa.c file, but the CI
still reports this warning.

BRs,
Xiao

> 
> Thanks,
> Maxime
> 
> > BRs,
> > Xiao
> >
> >> -Original Message-
> >> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> >> Sent: Monday, December 17, 2018 7:03 PM
> >> To: Wang, Xiao W ; Bie, Tiwei
> 
> >> Cc: alejandro.luc...@netronome.com; dev@dpdk.org; Wang, Zhihong
> >> ; Ye, Xiaolong 
> >> Subject: Re: [PATCH v4 03/10] vhost: provide helpers for virtio ring relay
> >>
> >> Hi Xiao,
> >>
> >> On 12/17/18 9:51 AM, Wang, Xiao W wrote:
> >>> Hi Maxime,
> >>>
> >>>> -Original Message-
> >>>> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> >>>> Sent: Sunday, December 16, 2018 1:11 AM
> >>>> To: Wang, Xiao W ; Bie, Tiwei
> >> 
> >>>> Cc: alejandro.luc...@netronome.com; dev@dpdk.org; Wang, Zhihong
> >>>> ; Ye, Xiaolong 
> >>>> Subject: Re: [PATCH v4 03/10] vhost: provide helpers for virtio ring relay
> >>>>
> >>>>
> >>>>
> >>>> On 12/14/18 10:16 PM, Xiao Wang wrote:
> >>>>> This patch provides two helpers for vdpa device driver to perform a
> >>>>> relay between the guest virtio ring and a mediate virtio ring.
> >>>>
> >>>> s/mediate/mediated/ ?
> >>>> I'm not 100% sure, but if it is mediated, please change everywhere else
> >>>> in the patch.
> >>>
> >>> "mediate" can also be used as an adjective, so "mediate" is OK here.
> >>
> >> I got the confirmation from a native speaker that mediate sounds wrong
> >> in this context, and mediated should be used.
> >>
> >>>>
> >>>>>
> >>>>> The available ring relay will synchronize the available entries, and
> >>>>> helps to do desc validity checking.
> >>>>
> >>>> s/helps/help/
> >>>
> >>> Yes, will update.
> >>>
> >>>>
> >>>>>
> >>>>> The used ring relay will synchronize the used entries from mediate ring
> >>>>> to guest ring, and helps to do dirty page logging for live migration.
> >>>>
> >>>> s/helps/help/
> >>>
> >>> Will update.
> >>>
> >>> Thanks for the comments,
> >>> Xiao
> >>>
> >>>>
> >>>>>
> >>>>> The next patch will leverage these two helpers.
> >>>>>
> >>>>> Signed-off-by: Xiao Wang 
> >>>>> ---
> >>>>> lib/librte_vhost/rte_vdpa.h|  39 +++
> >>>>> lib/librte_vhost/rte_vhost_version.map |   2 +
> >>>>> lib/librte_vhost/vdpa.c| 194
> >>>> +
> >>>>> lib/librte_vhost/vhost.h   |  40 +++
> >>>>> lib/librte_vhost/virtio_net.c  |  39 ---
> >>>>> 5 files changed, 275 insertions(+), 39 deletions(-)
> >>>>>
> >>>>
> >>>>
> >>>> Apart from that:
> >>>> Reviewed-by: Maxime Coquelin 
> >>>>
> >>>> Thanks,
> >>>> Maxime


Re: [dpdk-dev] [PATCH] net/fm10k: initialize sm_down variable

2019-01-02 Thread Wang, Xiao W
Hi Julien,

> -Original Message-
> From: Julien Meunier [mailto:julien.meun...@nokia.com]
> Sent: Wednesday, January 2, 2019 11:58 PM
> To: Zhang, Qi Z ; Wang, Xiao W
> 
> Cc: dev@dpdk.org; sta...@dpdk.org
> Subject: [PATCH] net/fm10k: initialize sm_down variable
> 
> Fixes: 6f22f2f67268 ("net/fm10k: redefine link status semantics")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Julien Meunier 
> ---
>  drivers/net/fm10k/fm10k_ethdev.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
> index 85fb6c5..caf4d1b 100644
> --- a/drivers/net/fm10k/fm10k_ethdev.c
> +++ b/drivers/net/fm10k/fm10k_ethdev.c
> @@ -3003,6 +3003,7 @@ fm10k_params_init(struct rte_eth_dev *dev)
>   hw->bus.payload = fm10k_bus_payload_256;
> 
>   info->rx_vec_allowed = true;
> + info->sm_down = false;
>  }
> 
>  static int
> --
> 2.10.2

Acked-by: Xiao Wang 

BRs,
Xiao



Re: [dpdk-dev] [PATCH] net/ifc: add live migration support

2018-10-07 Thread Wang, Xiao W
Hi Ferruh,

> -Original Message-
> From: Yigit, Ferruh
> Sent: Tuesday, October 2, 2018 10:46 PM
> To: Wang, Xiao W ; Bie, Tiwei 
> Cc: dev@dpdk.org; Ye, Xiaolong ; Wang, Zhihong
> ; Chao Zhu ;
> Thomas Monjalon ; Xu, Qian Q 
> Subject: Re: [dpdk-dev] [PATCH] net/ifc: add live migration support
> 
> On 9/21/2018 12:55 AM, Ferruh Yigit wrote:
> > On 9/10/2018 12:01 PM, Xiao Wang wrote:
> >> IFCVF can help to log dirty page in live migration stage,
> >> each queue's index can be read and configured to support
> >> VHOST_USER_GET_VRING_BASE and VHOST_USER_SET_VRING_BASE.
> >>
> >> Signed-off-by: Xiao Wang 
> >
> > <...>
> >
> >> +static void
> >> +ifcvf_used_ring_log(struct ifcvf_hw *hw, uint32_t queue, uint8_t *log_buf)
> >> +{
> >> +  uint32_t i, size;
> >> +  uint64_t pfn;
> >> +
> >> +  pfn = hw->vring[queue].used / PAGE_SIZE;
> >> +  size = hw->vring[queue].size * sizeof(struct vring_used_elem) +
> >> +  sizeof(__virtio16) * 3;
> >
> > > Getting a build error for PowerPC [1], can someone from PPC side confirm it please?
> > >
> > > [1]
> > > .../drivers/net/ifc/ifcvf_vdpa.c: In function ‘ifcvf_used_ring_log’:
> > > .../drivers/net/ifc/ifcvf_vdpa.c:288:11: error: ‘__virtio16’ undeclared (first use in this function)
> > sizeof(__virtio16) * 3;
> >^~
> 
> Also, "__virtio16" seems to have been added to the Linux kernel in v3.19. Systems
> with an older kernel version hit a build error.
> 
> Can we replace "__virtio16" usage with basic types to prevent build error?
> If so can you please send this as a fix patch?

Yes, will do that.
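
A minimal sketch of the same size computation using only basic types, assuming
the used ring layout of a 16-bit flags field, a 16-bit idx field, the array of
used elements, and a trailing 16-bit event field. struct used_elem and
used_ring_bytes() are hypothetical stand-ins for this example, not the actual
fix patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the virtio used-ring element (two 32-bit fields). */
struct used_elem {
	uint32_t id;
	uint32_t len;
};

/*
 * Bytes covered by a used ring with `qsize` entries: the entries themselves
 * plus three 16-bit fields, expressed with uint16_t instead of __virtio16.
 */
static inline uint32_t
used_ring_bytes(uint16_t qsize)
{
	return (uint32_t)(qsize * sizeof(struct used_elem) +
			sizeof(uint16_t) * 3);
}
```

Since only sizeof() is taken, the endianness annotation carried by __virtio16
is not needed here, and uint16_t gives the same result on any kernel version.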

BRs,
Xiao

> 
> Thanks,
> ferruh


Re: [dpdk-dev] [PATCH] net/ifc: invoke ifcvf HW init function in probe

2018-10-10 Thread Wang, Xiao W
Hi,

> -Original Message-
> From: Ye, Xiaolong
> Sent: Wednesday, October 10, 2018 9:23 PM
> To: dev@dpdk.org; Zhang, Qi Z 
> Cc: Wang, Xiao W ; Ye, Xiaolong
> 
> Subject: [PATCH] net/ifc: invoke ifcvf HW init function in probe
> 
> As ifcvf_init_hw is independent of ifcvf_vfio_setup, it's better to
> invoke it directly in probe func.
> 
> Signed-off-by: Xiaolong Ye 
> ---
>  drivers/net/ifc/ifcvf_vdpa.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ifc/ifcvf_vdpa.c b/drivers/net/ifc/ifcvf_vdpa.c
> index 9d5594678..adc8f4166 100644
> --- a/drivers/net/ifc/ifcvf_vdpa.c
> +++ b/drivers/net/ifc/ifcvf_vdpa.c
> @@ -112,7 +112,6 @@ ifcvf_vfio_setup(struct ifcvf_internal *internal)
>   struct rte_pci_device *dev = internal->pdev;
>   char devname[RTE_DEV_NAME_MAX_LEN] = {0};
>   int iommu_group_num;
> - int ret = 0;
>   int i;
> 
>   internal->vfio_dev_fd = -1;
> @@ -146,9 +145,8 @@ ifcvf_vfio_setup(struct ifcvf_internal *internal)
>   internal->hw.mem_resource[i].len =
>   internal->pdev->mem_resource[i].len;
>   }
> - ret = ifcvf_init_hw(&internal->hw, internal->pdev);
> 
> - return ret;
> + return 0;
> 
>  err:
>   rte_vfio_container_destroy(internal->vfio_container_fd);
> @@ -758,6 +756,9 @@ ifcvf_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
>   if (ifcvf_vfio_setup(internal) < 0)
>   return -1;
> 
> + if (ifcvf_init_hw(&internal->hw, internal->pdev) < 0)
> + return -1;
> +
>   internal->max_queues = IFCVF_MAX_QUEUES;
>   features = ifcvf_get_features(&internal->hw);
>   internal->features = (features &
> --
> 2.17.1

Acked-by: Xiao Wang 

BRs,
Xiao



Re: [dpdk-dev] [PATCH v2 08/15] net/ifc: rename to ifcvf

2018-06-12 Thread Wang, Xiao W
Hi Bruce,

> -Original Message-
> From: Richardson, Bruce
> Sent: Saturday, June 9, 2018 5:21 AM
> To: dev@dpdk.org
> Cc: Richardson, Bruce ; Wang, Xiao W
> 
> Subject: [PATCH v2 08/15] net/ifc: rename to ifcvf
> 
> All files in the directory and the resulting driver have prefix of ifcvf,
> not just ifc, so rename directory for accuracy. Also rename the map file
> to standard name for meson build in the process.

Rather than renaming the dir to IFCVF and then renaming it back to IFC sometime
in the future, I think keeping the dir name as IFC is better for us, since it
avoids the extra effort.
We can just rename the files below:
doc/guides/nics/ifcvf.rst => doc/guides/nics/ifc.rst
drivers/net/ifc/rte_ifcvf_version.map => drivers/net/ifc/rte_pmd_ifc_version.

And yes, we need to update documents which refer to ifc.

Thanks!
Xiao

> 
> CC: Xiao Wang 
> Signed-off-by: Bruce Richardson 
> ---
>  MAINTAINERS   | 4 ++--
>  drivers/net/Makefile  | 2 +-
>  drivers/net/{ifc => ifcvf}/Makefile   | 2 +-
>  drivers/net/{ifc => ifcvf}/base/ifcvf.c   | 0
>  drivers/net/{ifc => ifcvf}/base/ifcvf.h   | 0
>  drivers/net/{ifc => ifcvf}/base/ifcvf_osdep.h | 0
>  drivers/net/{ifc => ifcvf}/ifcvf_vdpa.c   | 0
>  .../rte_ifcvf_version.map => ifcvf/rte_pmd_ifcvf_version.map} | 0
>  8 files changed, 4 insertions(+), 4 deletions(-)
>  rename drivers/net/{ifc => ifcvf}/Makefile (94%)
>  rename drivers/net/{ifc => ifcvf}/base/ifcvf.c (100%)
>  rename drivers/net/{ifc => ifcvf}/base/ifcvf.h (100%)
>  rename drivers/net/{ifc => ifcvf}/base/ifcvf_osdep.h (100%)
>  rename drivers/net/{ifc => ifcvf}/ifcvf_vdpa.c (100%)
>  rename drivers/net/{ifc/rte_ifcvf_version.map => ifcvf/rte_pmd_ifcvf_version.map} (100%)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 4667fa7fb..4f6055590 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -559,10 +559,10 @@ T: git://dpdk.org/next/dpdk-next-net-intel
>  F: drivers/net/avf/
>  F: doc/guides/nics/features/avf*.ini
> 
> -Intel ifc
> +Intel ifcvf
>  M: Xiao Wang 
>  T: git://dpdk.org/next/dpdk-next-net-intel
> -F: drivers/net/ifc/
> +F: drivers/net/ifcvf/
>  F: doc/guides/nics/ifcvf.rst
>  F: doc/guides/nics/features/ifcvf.ini
> 
> diff --git a/drivers/net/Makefile b/drivers/net/Makefile
> index 9f9da6651..9308f9a7b 100644
> --- a/drivers/net/Makefile
> +++ b/drivers/net/Makefile
> @@ -59,7 +59,7 @@ endif # $(CONFIG_RTE_LIBRTE_SCHED)
>  ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y)
>  DIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += vhost
>  ifeq ($(CONFIG_RTE_EAL_VFIO),y)
> -DIRS-$(CONFIG_RTE_LIBRTE_IFCVF_VDPA_PMD) += ifc
> +DIRS-$(CONFIG_RTE_LIBRTE_IFCVF_VDPA_PMD) += ifcvf
>  endif
>  endif # $(CONFIG_RTE_LIBRTE_VHOST)
> 
> diff --git a/drivers/net/ifc/Makefile b/drivers/net/ifcvf/Makefile
> similarity index 94%
> rename from drivers/net/ifc/Makefile
> rename to drivers/net/ifcvf/Makefile
> index 1011995bc..a022faaad 100644
> --- a/drivers/net/ifc/Makefile
> +++ b/drivers/net/ifcvf/Makefile
> @@ -22,7 +22,7 @@ BASE_DRIVER_OBJS=$(sort $(patsubst %.c,%.o,$(notdir
> $(wildcard $(SRCDIR)/base/*.
> 
>  VPATH += $(SRCDIR)/base
> 
> -EXPORT_MAP := rte_ifcvf_version.map
> +EXPORT_MAP := rte_pmd_ifcvf_version.map
> 
>  LIBABIVER := 1
> 
> diff --git a/drivers/net/ifc/base/ifcvf.c b/drivers/net/ifcvf/base/ifcvf.c
> similarity index 100%
> rename from drivers/net/ifc/base/ifcvf.c
> rename to drivers/net/ifcvf/base/ifcvf.c
> diff --git a/drivers/net/ifc/base/ifcvf.h b/drivers/net/ifcvf/base/ifcvf.h
> similarity index 100%
> rename from drivers/net/ifc/base/ifcvf.h
> rename to drivers/net/ifcvf/base/ifcvf.h
> diff --git a/drivers/net/ifc/base/ifcvf_osdep.h b/drivers/net/ifcvf/base/ifcvf_osdep.h
> similarity index 100%
> rename from drivers/net/ifc/base/ifcvf_osdep.h
> rename to drivers/net/ifcvf/base/ifcvf_osdep.h
> diff --git a/drivers/net/ifc/ifcvf_vdpa.c b/drivers/net/ifcvf/ifcvf_vdpa.c
> similarity index 100%
> rename from drivers/net/ifc/ifcvf_vdpa.c
> rename to drivers/net/ifcvf/ifcvf_vdpa.c
> diff --git a/drivers/net/ifc/rte_ifcvf_version.map b/drivers/net/ifcvf/rte_pmd_ifcvf_version.map
> similarity index 100%
> rename from drivers/net/ifc/rte_ifcvf_version.map
> rename to drivers/net/ifcvf/rte_pmd_ifcvf_version.map
> --
> 2.17.1



Re: [dpdk-dev] [PATCH v3 2/2] net/ifcvf: enable the host notifier support

2018-06-12 Thread Wang, Xiao W
Hi,

> -Original Message-
> From: Bie, Tiwei
> Sent: Friday, June 8, 2018 11:22 AM
> To: maxime.coque...@redhat.com; dev@dpdk.org
> Cc: Wang, Xiao W 
> Subject: [PATCH v3 2/2] net/ifcvf: enable the host notifier support
> 
> The necessary vDPA ops have already been implemented
> in ifcvf driver. So just need to announce the necessary
> protocol features to enable the host notifier support.
> 
> Signed-off-by: Tiwei Bie 
> ---
>  drivers/net/ifc/ifcvf_vdpa.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/net/ifc/ifcvf_vdpa.c b/drivers/net/ifc/ifcvf_vdpa.c
> index c6627c23a..b8e22daf3 100644
> --- a/drivers/net/ifc/ifcvf_vdpa.c
> +++ b/drivers/net/ifc/ifcvf_vdpa.c
> @@ -646,6 +646,9 @@ ifcvf_get_vdpa_features(int did, uint64_t *features)
> 
>  #define VDPA_SUPPORTED_PROTOCOL_FEATURES \
>   (1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK | \
> +  1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ | \
> +  1ULL << VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD | \
> +  1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER | \
>1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD)
>  static int
>  ifcvf_get_protocol_features(int did __rte_unused, uint64_t *features)
> --
> 2.17.0

Acked-by: Xiao Wang 

BRs,
Xiao



Re: [dpdk-dev] [PATCH v2 1/3] gso: support UDP/IPv4 fragmentation

2018-06-21 Thread Wang, Xiao W
Hi,

> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Jiayu Hu
> Sent: Sunday, June 17, 2018 11:13 AM
> To: dev@dpdk.org
> Cc: Ananyev, Konstantin ; Zhang, Yuwei1
> ; Iremonger, Bernard
> ; Hu, Jiayu 
> Subject: [dpdk-dev] [PATCH v2 1/3] gso: support UDP/IPv4 fragmentation
> 
> This patch adds GSO support for UDP/IPv4 packets. Supported packets
> may include a single VLAN tag. UDP/IPv4 GSO doesn't check if input
> packets have correct checksums, and doesn't update checksums for
> output packets (the responsibility for this lies with the application).
> Additionally, UDP/IPv4 GSO doesn't process IP fragmented packets.
> 
> UDP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
> MBUF, to organize an output packet. The direct MBUF stores the packet
> header, while the indirect mbuf simply points to a location within the
> original packet's payload. Consequently, use of UDP GSO requires
> multi-segment MBUF support in the TX functions of the NIC driver.
> 
> If a packet is GSO'd, UDP/IPv4 GSO reduces its MBUF refcnt by 1. As a
> result, when all of its GSOed segments are freed, the packet is freed
> automatically.
> 
> Signed-off-by: Jiayu Hu 
> ---
>  lib/librte_gso/Makefile |  1 +
>  lib/librte_gso/gso_common.h |  3 ++
>  lib/librte_gso/gso_udp4.c   | 81 +
>  lib/librte_gso/gso_udp4.h   | 42 +++
>  lib/librte_gso/rte_gso.c| 24 +++---
>  lib/librte_gso/rte_gso.h|  6 +++-
>  6 files changed, 151 insertions(+), 6 deletions(-)
>  create mode 100644 lib/librte_gso/gso_udp4.c
>  create mode 100644 lib/librte_gso/gso_udp4.h
> 
> diff --git a/lib/librte_gso/Makefile b/lib/librte_gso/Makefile
> index 3648ec0..1fac53a 100644
> --- a/lib/librte_gso/Makefile
> +++ b/lib/librte_gso/Makefile
> @@ -19,6 +19,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_GSO) += rte_gso.c
>  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_common.c
>  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tcp4.c
>  SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_tunnel_tcp4.c
> +SRCS-$(CONFIG_RTE_LIBRTE_GSO) += gso_udp4.c
> 

meson should be updated accordingly.

>  # install this header file
>  SYMLINK-$(CONFIG_RTE_LIBRTE_GSO)-include += rte_gso.h
> diff --git a/lib/librte_gso/gso_common.h b/lib/librte_gso/gso_common.h
> index 5ca5974..6cd764f 100644
> --- a/lib/librte_gso/gso_common.h
> +++ b/lib/librte_gso/gso_common.h
> @@ -31,6 +31,9 @@
>   (PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | \
>PKT_TX_TUNNEL_GRE))
> 
> +#define IS_IPV4_UDP(flag) (((flag) & (PKT_TX_UDP_SEG | PKT_TX_IPV4)) == \
> + (PKT_TX_UDP_SEG | PKT_TX_IPV4))
> +
>  /**
>   * Internal function which updates the UDP header of a packet, following
>   * segmentation. This is required to update the header's datagram length field.
> diff --git a/lib/librte_gso/gso_udp4.c b/lib/librte_gso/gso_udp4.c
> new file mode 100644
> index 000..a3db329
> --- /dev/null
> +++ b/lib/librte_gso/gso_udp4.c
> @@ -0,0 +1,81 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2018 Intel Corporation
> + */
> +
> +#include "gso_common.h"
> +#include "gso_udp4.h"
> +
> +#define IPV4_HDR_MF_BIT (1U << 13)
> +
> +static inline void
> +update_ipv4_udp_headers(struct rte_mbuf *pkt, struct rte_mbuf **segs,
> + uint16_t nb_segs)
> +{
> + struct ipv4_hdr *ipv4_hdr;
> + uint16_t frag_offset = 0, is_mf;
> + uint16_t l2_hdrlen = pkt->l2_len, l3_hdrlen = pkt->l3_len;
> + uint16_t tail_idx = nb_segs - 1, length, i;
> +
> + /*
> +  * Update IP header fields for output segments. Specifically,
> +  * keep the same IP id, update fragment offset and total
> +  * length.
> +  */
> + for (i = 0; i < nb_segs; i++) {
> + ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(segs[i],
> + char *) + l2_hdrlen);

You could use rte_pktmbuf_mtod_offset to simplify the code.
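For reference, a minimal self-contained sketch of what the suggestion amounts to. The struct and macros below (demo_mbuf, DEMO_MTOD, DEMO_MTOD_OFFSET) are stand-ins invented for illustration and are not DPDK's definitions; in the patch itself the call would be rte_pktmbuf_mtod_offset(segs[i], struct ipv4_hdr *, l2_hdrlen).

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct rte_mbuf: only the fields this sketch needs. */
struct demo_mbuf {
	uint8_t *buf_addr;   /* start of the buffer */
	uint16_t data_off;   /* offset of the packet data inside the buffer */
};

/* Stand-ins modelled on rte_pktmbuf_mtod() / rte_pktmbuf_mtod_offset(). */
#define DEMO_MTOD_OFFSET(m, t, o) \
	((t)((m)->buf_addr + (m)->data_off + (o)))
#define DEMO_MTOD(m, t) DEMO_MTOD_OFFSET(m, t, 0)

/* Form used in the patch: cast to char *, add the L2 length, cast again. */
static inline void *
demo_l3_hdr_manual(struct demo_mbuf *m, uint16_t l2_len)
{
	return (void *)(DEMO_MTOD(m, char *) + l2_len);
}

/* Suggested form: one macro call yields the typed pointer directly. */
static inline void *
demo_l3_hdr_offset(struct demo_mbuf *m, uint16_t l2_len)
{
	return DEMO_MTOD_OFFSET(m, void *, l2_len);
}
```

Both helpers return the same address; the second one just avoids the intermediate char * cast.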

> + length = segs[i]->pkt_len - l2_hdrlen;
> + ipv4_hdr->total_length = rte_cpu_to_be_16(length);
> +
> + is_mf = i < tail_idx ? IPV4_HDR_MF_BIT : 0;
> + ipv4_hdr->fragment_offset =
> + rte_cpu_to_be_16(frag_offset | is_mf);
> + frag_offset += ((length - l3_hdrlen) >> 3);
> + }
> +}
> +
> +int
> +gso_udp4_segment(struct rte_mbuf *pkt,
> + uint16_t gso_size,
> + struct rte_mempool *direct_pool,
> + struct rte_mempool *indirect_pool,
> + struct rte_mbuf **pkts_out,
> + uint16_t nb_pkts_out)
> +{
> + struct ipv4_hdr *ipv4_hdr;
> + uint16_t pyld_unit_size, hdr_offset;
> + uint16_t frag_off;
> + int ret;
> +
> + /* Don't process the fragmented packet */
> + ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> + pkt->l2_len);

Ditto.
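Separately from the cast, the fragment offset arithmetic in update_ipv4_udp_headers() above can be sketched as below. This is only an illustration under stated assumptions: the demo_* names are made up, values are kept in host byte order (the patch converts with rte_cpu_to_be_16), and each non-final segment is assumed to carry a payload that is a multiple of 8 bytes, since the IPv4 fragment offset is counted in 8-byte units.

```c
#include <stdint.h>

#define DEMO_IPV4_MF_BIT (1U << 13)  /* "more fragments" flag, host order */

/*
 * Build the host-order fragment_offset field for one output segment.
 * offset_units is the payload offset in 8-byte units; every segment
 * except the last one carries the MF flag.
 */
static inline uint16_t
demo_frag_field(uint16_t offset_units, int is_last)
{
	return (uint16_t)(offset_units | (is_last ? 0 : DEMO_IPV4_MF_BIT));
}

/* Advance the running offset by one segment's L3 payload length. */
static inline uint16_t
demo_next_offset(uint16_t offset_units, uint16_t l3_payload_len)
{
	return (uint16_t)(offset_units + (l3_payload_len >> 3));
}
```

With 1480-byte payloads, for example, the running offset advances by 185 units per segment.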

BRs,
Xiao


Re: [dpdk-dev] [PATCH v3 0/3] Support UDP/IPv4 GSO

2018-06-24 Thread Wang, Xiao W
Hi,

> -Original Message-
> From: Hu, Jiayu
> Sent: Friday, June 22, 2018 1:54 PM
> To: dev@dpdk.org
> Cc: Wang, Xiao W ; Ananyev, Konstantin
> ; Zhang, Yuwei1 ;
> Iremonger, Bernard ; tho...@monjalon.net;
> Hu, Jiayu 
> Subject: [PATCH v3 0/3] Support UDP/IPv4 GSO
> 
> With the support of UDP Fragmentation Offload (UFO) and TCP Segmentation
> Offload (TSO) in virtio, VMs can exchange large UDP and TCP packets
> exceeding MTU between each other, which can greatly reduce per-packet
> processing overheads.
> 
> When the destination of the large TCP and UDP packets is crossing
> machines, the host application needs to call two different libraries,
> GSO and IP fragmentation, to split the large packets respectively.
> However, the GSO and IP fragmentation libraries have quite different APIs,
> which greatly complicates the host application implementation.
> 
> To simplify application development, we propose to support UDP/IPv4
> fragmentation in the GSO library. With supporting UDP GSO, host
> applications can use the unified APIs to split large UDP and TCP packets.
> 
> This patchset is to support UDP/IPv4 GSO. The first patch is to provide
> UDP GSO function, the second patch is to enable UDP/IPv4 GSO in the
> testpmd checksum forwarding engine, and the last patch is to update the
> programmer guide and testpmd user guide.
> 
> Change log
> ==========
> v3:
> - replace rte_pktmbuf_mtod() with rte_pktmbuf_mtod_offset().
> - fix meson build.
> - add updates to document for better explaining how UDP GSO works.
> V2:
> - fix fragment offset calculation bug.
> - add UDP GSO description in testpmd user guide.
> - shorten the second patch name.
> 
> Jiayu Hu (3):
>   gso: support UDP/IPv4 fragmentation
>   app/testpmd: enable UDP GSO in csum engine
>   gso: update documents for UDP/IPv4 GSO
> 
>  app/test-pmd/cmdline.c |  5 +-
>  app/test-pmd/csumonly.c|  2 +
>  app/test-pmd/testpmd.c |  2 +-
>  .../generic_segmentation_offload_lib.rst   | 10 +++
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst|  7 ++
>  lib/librte_gso/Makefile|  1 +
>  lib/librte_gso/gso_common.h|  3 +
>  lib/librte_gso/gso_udp4.c  | 81 ++
>  lib/librte_gso/gso_udp4.h  | 42 +++
>  lib/librte_gso/meson.build |  2 +-
>  lib/librte_gso/rte_gso.c   | 24 +--
>  lib/librte_gso/rte_gso.h   |  6 +-
>  12 files changed, 175 insertions(+), 10 deletions(-)
>  create mode 100644 lib/librte_gso/gso_udp4.c
>  create mode 100644 lib/librte_gso/gso_udp4.h
> 
> --
> 2.7.4

Series Acked-by: Xiao Wang 

BRs,
Xiao



Re: [dpdk-dev] [PATCH] net/ifcvf: fix a typo

2018-11-29 Thread Wang, Xiao W
Hi,

> -Original Message-
> From: Ye, Xiaolong
> Sent: Thursday, November 29, 2018 4:22 PM
> To: dev@dpdk.org; Maxime Coquelin ; Bie,
> Tiwei ; Wang, Zhihong 
> Cc: Wang, Xiao W ; Ye, Xiaolong
> ; sta...@dpdk.org
> Subject: [PATCH] net/ifcvf: fix a typo
> 
> The struct should be ifcvf_net_config rather than ifcvf_net_device_config.
> 
> Fixes: a3f8150eac6d ("net/ifcvf: add ifcvf vDPA driver")
> Cc: xiao.w.w...@intel.com
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Xiaolong Ye 
> ---
>  drivers/net/ifc/base/ifcvf.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ifc/base/ifcvf.h b/drivers/net/ifc/base/ifcvf.h
> index f026c70ab..c15c69107 100644
> --- a/drivers/net/ifc/base/ifcvf.h
> +++ b/drivers/net/ifc/base/ifcvf.h
> @@ -121,7 +121,7 @@ struct ifcvf_hw {
>   u8 notify_region;
>   u32notify_off_multiplier;
>   struct ifcvf_pci_common_cfg *common_cfg;
> - struct ifcvf_net_device_config *dev_cfg;
> + struct ifcvf_net_config *dev_cfg;
>   u8 *isr;
>   u16*notify_base;
>   u16*notify_addr[IFCVF_MAX_QUEUES * 2];
> --
> 2.17.1

Thanks.
Acked-by: Xiao Wang 


Re: [dpdk-dev] [PATCH v8 0/5] add ifcvf vdpa driver

2018-04-16 Thread Wang, Xiao W
Thanks for the reminder. Will fix it.
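For the record, a minimal sketch of the portable form, assuming the offending argument is a uint64_t as the compiler message below says. The function name demo_format_iova and the message text are made up for illustration; PRIx64 comes from <inttypes.h>.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit value in hex. "%lx" is only correct where long is
 * 64 bits wide; PRIx64 expands to the right conversion for uint64_t
 * on both 32-bit and 64-bit builds.
 */
static int
demo_format_iova(char *buf, size_t len, uint64_t iova)
{
	return snprintf(buf, len, "iova: 0x%" PRIx64, iova);
}
```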

BRs,
Xiao

> -Original Message-
> From: Thomas Monjalon [mailto:tho...@monjalon.net]
> Sent: Tuesday, April 17, 2018 2:07 AM
> To: Wang, Xiao W 
> Cc: Yigit, Ferruh ; Burakov, Anatoly
> ; dev@dpdk.org; maxime.coque...@redhat.com;
> Wang, Zhihong ; Bie, Tiwei ;
> Tan, Jianfeng ; Liang, Cunming
> ; Daly, Dan 
> Subject: Re: [PATCH v8 0/5] add ifcvf vdpa driver
> 
> 16/04/2018 18:36, Ferruh Yigit:
> > Hi Xiao,
> >
> > Getting following build error for 32bit [1], can you please check them?
> >
> > [1]
> > .../dpdk/drivers/net/ifc/ifcvf_vdpa.c: In function ‘ifcvf_dma_map’:
> > .../dpdk/drivers/net/ifc/ifcvf_vdpa.c:24:3: error: format ‘%lx’ expects argument
> > of type ‘long unsigned int’, but argument 6 has type ‘uint64_t {aka long long
> > unsigned int}’ [-Werror=format=]
> 
> Reminder from this recent post:
>   http://dpdk.org/ml/archives/dev/2018-February/090882.html
> "
> Most of the times, using %l is wrong (except when printing a long).
> So next time you write %l, please think "I am probably wrong".
> "
> 
> 



Re: [dpdk-dev] [PATCH v4 05/20] net/virtio: dump packed virtqueue data

2018-04-24 Thread Wang, Xiao W
Hi,

> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Jens Freimann
> Sent: Thursday, April 19, 2018 3:08 PM
> To: dev@dpdk.org
> Cc: Bie, Tiwei ; y...@fridaylinux.org;
> maxime.coque...@redhat.com; m...@redhat.com; j...@freimann.org
> Subject: [dpdk-dev] [PATCH v4 05/20] net/virtio: dump packed virtqueue data
> 
> Add support to dump packed virtqueue data to the
> VIRTQUEUE_DUMP() macro.
> 
> Signed-off-by: Jens Freimann 
> ---
>  drivers/net/virtio/virtqueue.h | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h
> index 081b27a52..ea804c9c7 100644
> --- a/drivers/net/virtio/virtqueue.h
> +++ b/drivers/net/virtio/virtqueue.h
> @@ -379,6 +379,12 @@ virtqueue_notify(struct virtqueue *vq)
> 
>  #ifdef RTE_LIBRTE_VIRTIO_DEBUG_DUMP
>  #define VIRTQUEUE_DUMP(vq) do { \
> + if (vtpci_packed_queue((vq)->hw)) { \
> +   PMD_INIT_LOG(DEBUG, \
> +   "VQ: - size=%d; free=%d; last_used_idx=%d;" \

A "," is missing after the format string.
"last_used_idx" does not belong here, since the value passed for it is nused.

> +   (vq)->vq_nentries, (vq)->vq_free_cnt, nused); \

"nused" is not declared yet at this point.
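A rough sketch of the shape this could take, with the declaration moved ahead of its first use and the format string separated from its arguments. Everything here is a stand-in made up for illustration: struct demo_vq is not the real virtqueue, snprintf replaces PMD_INIT_LOG so the sketch builds on its own, and the "used=" label is only an assumption about what the field should be called.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the virtqueue fields the dump reads; not the real struct. */
struct demo_vq {
	uint16_t nentries;
	uint16_t free_cnt;
	uint16_t used_idx;
	uint16_t used_cons_idx;
};

/*
 * Declare and compute nused_ before anything uses it, and put a comma
 * between the format string and its arguments.
 */
#define DEMO_VQ_DUMP(vq, buf, len) do { \
	uint16_t nused_ = (uint16_t)((vq)->used_idx - (vq)->used_cons_idx); \
	snprintf((buf), (len), "VQ: - size=%d; free=%d; used=%d", \
		(vq)->nentries, (vq)->free_cnt, nused_); \
} while (0)
```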

BRs,
Xiao

> +   break; \
> + } \
>   uint16_t used_idx, nused; \
>   used_idx = (vq)->vq_ring.used->idx; \
>   nused = (uint16_t)(used_idx - (vq)->vq_used_cons_idx); \
> --
> 2.14.3



Re: [dpdk-dev] [PATCH] net/fm10k: add imissed stats

2018-09-10 Thread Wang, Xiao W
Hi,

-Original Message-
From: Julien Meunier [mailto:julien.meun...@nokia.com] 
Sent: Monday, September 10, 2018 11:51 PM
To: Zhang, Qi Z ; Wang, Xiao W 
Cc: dev@dpdk.org
Subject: [PATCH] net/fm10k: add imissed stats

Add support of imissed and q_errors statistics, reported by PCIE_QPRDC
register (see datasheet, section 11.27.2.60), which exposes the number
of receive packets dropped for a queue.

Signed-off-by: Julien Meunier 
---
 drivers/net/fm10k/fm10k_ethdev.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index 541a49b..a9af6c2 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -1325,7 +1325,7 @@ fm10k_xstats_get(struct rte_eth_dev *dev, struct rte_eth_xstat *xstats,
 static int
 fm10k_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
 {
-   uint64_t ipackets, opackets, ibytes, obytes;
+   uint64_t ipackets, opackets, ibytes, obytes, imissed;
struct fm10k_hw *hw =
FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
struct fm10k_hw_stats *hw_stats =
@@ -1336,22 +1336,25 @@ fm10k_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
 
fm10k_update_hw_stats(hw, hw_stats);
 
-   ipackets = opackets = ibytes = obytes = 0;
+   ipackets = opackets = ibytes = obytes = imissed = 0;
for (i = 0; (i < RTE_ETHDEV_QUEUE_STAT_CNTRS) &&
(i < hw->mac.max_queues); ++i) {
stats->q_ipackets[i] = hw_stats->q[i].rx_packets.count;
stats->q_opackets[i] = hw_stats->q[i].tx_packets.count;
stats->q_ibytes[i]   = hw_stats->q[i].rx_bytes.count;
stats->q_obytes[i]   = hw_stats->q[i].tx_bytes.count;
+   stats->q_errors[i]   = hw_stats->q[i].rx_drops.count;
ipackets += stats->q_ipackets[i];
opackets += stats->q_opackets[i];
ibytes   += stats->q_ibytes[i];
obytes   += stats->q_obytes[i];
+   imissed  += stats->q_errors[i];
}
stats->ipackets = ipackets;
stats->opackets = opackets;
stats->ibytes = ibytes;
stats->obytes = obytes;
+   stats->imissed = imissed;
return 0;
 }

Acked-by: Xiao Wang 

 
-- 
2.10.2



Re: [dpdk-dev] [PATCH] net/ifc: do not notify before HW ready

2018-09-13 Thread Wang, Xiao W
Hi Xiaolong,

> -Original Message-
> From: Ye, Xiaolong
> Sent: Thursday, September 13, 2018 8:55 PM
> To: Wang, Xiao W 
> Cc: Bie, Tiwei ; dev@dpdk.org
> Subject: Re: [PATCH] net/ifc: do not notify before HW ready
> 
> Hi, Xiao
> 
> On 09/10, Xiao Wang wrote:
> >Fixes: a3f8150eac6d ("net/ifcvf: add ifcvf vDPA driver")
> 
> Could you help describe what problem is without this fix in commit log?

Generally, a driver should finish all device configuration before notifying the 
HW to start data processing.
Without this fix, the potential problems are:
1. If the device was not cleanly reset by the previous driver and still holds an 
invalid ring address, a kick from the vDPA relay thread may trigger a bad DMA 
request.
2. The notify_addr used by the relay thread is set in the vdpa_ifcvf_start 
function. If a kick is relayed before vdpa_ifcvf_start finishes, a null address 
is accessed.
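To make the second point concrete, here is a small sketch of the ordering hazard. It is a toy model only: demo_dev, demo_start and demo_relay_kick are names invented for illustration, and a plain variable stands in for the mapped notify register. The real relay thread has no such NULL check, which is why the patch fixes the order instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in device state; the real driver keeps this in struct ifcvf_hw. */
struct demo_dev {
	uint16_t *notify_addr;  /* NULL until the device has been started */
	uint16_t doorbell;      /* stand-in for the mapped notify register */
};

/* Models the start step: the notify address only becomes valid here. */
static void
demo_start(struct demo_dev *dev)
{
	dev->doorbell = 0;
	dev->notify_addr = &dev->doorbell;
}

/*
 * Models one kick from the relay thread. Returns -1 instead of
 * dereferencing a NULL address if the device was never started.
 */
static int
demo_relay_kick(struct demo_dev *dev, uint16_t qid)
{
	if (dev->notify_addr == NULL)
		return -1;
	*dev->notify_addr = qid;
	return 0;
}
```

Calling demo_relay_kick() before demo_start() hits the NULL case; calling it afterwards writes the doorbell, which is the order the patch enforces.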

I will add this description to the commit log in v2.

Thanks for the comment,
Xiao
> 
> Thanks,
> Xiaolong
> >
> >Signed-off-by: Xiao Wang 
> >---
> > drivers/net/ifc/ifcvf_vdpa.c | 8 
> > 1 file changed, 4 insertions(+), 4 deletions(-)
> >
> >diff --git a/drivers/net/ifc/ifcvf_vdpa.c b/drivers/net/ifc/ifcvf_vdpa.c
> >index 3c5430dc0..7d3085d8d 100644
> >--- a/drivers/net/ifc/ifcvf_vdpa.c
> >+++ b/drivers/net/ifc/ifcvf_vdpa.c
> >@@ -503,11 +503,11 @@ update_datapath(struct ifcvf_internal *internal)
> > if (ret)
> > goto err;
> >
> >-ret = setup_notify_relay(internal);
> >+ret = vdpa_ifcvf_start(internal);
> > if (ret)
> > goto err;
> >
> >-ret = vdpa_ifcvf_start(internal);
> >+ret = setup_notify_relay(internal);
> > if (ret)
> > goto err;
> >
> >@@ -515,12 +515,12 @@ update_datapath(struct ifcvf_internal *internal)
> > } else if (rte_atomic32_read(&internal->running) &&
> >(!rte_atomic32_read(&internal->started) ||
> > !rte_atomic32_read(&internal->dev_attached))) {
> >-vdpa_ifcvf_stop(internal);
> >-
> > ret = unset_notify_relay(internal);
> > if (ret)
> > goto err;
> >
> >+vdpa_ifcvf_stop(internal);
> >+
> > ret = vdpa_disable_vfio_intr(internal);
> > if (ret)
> > goto err;
> >--
> >2.15.1
> >


Re: [dpdk-dev] [PATCH v2 2/2] examples/vdpa: introduce a new sample for vDPA

2018-09-19 Thread Wang, Xiao W
Hi Xiaolong,

> -Original Message-
> From: Ye, Xiaolong
> Sent: Friday, September 14, 2018 2:07 AM
> To: dev@dpdk.org; Maxime Coquelin ; Bie,
> Tiwei ; Wang, Zhihong 
> Cc: Wang, Xiao W ; Rami Rosen
> ; Wang, Haiyue ; Ye,
> Xiaolong 
> Subject: [PATCH v2 2/2] examples/vdpa: introduce a new sample for vDPA
> 
> The vdpa sample application creates vhost-user sockets by using the
> vDPA backend. vDPA stands for vhost Data Path Acceleration which utilizes
> virtio ring compatible devices to serve virtio driver directly to enable
> datapath acceleration. As vDPA driver can help to set up vhost datapath,
> this application doesn't need to launch dedicated worker threads for vhost
> enqueue/dequeue operations.
> 
> Signed-off-by: Xiao Wang 
> Signed-off-by: Xiaolong Ye 
> ---
> 
> v2 changes:
> 
> * fix a compilation error reported by Rosen
> * improve create cmd in interactive mode and add two new cmds: list,
> * quit
> * add application documentation
> 
>  MAINTAINERS|   2 +
>  doc/guides/sample_app_ug/index.rst |   1 +
>  doc/guides/sample_app_ug/vdpa.rst  | 115 
>  examples/Makefile  |   2 +-
>  examples/vdpa/Makefile |  32 +++
>  examples/vdpa/main.c   | 437 +
>  examples/vdpa/meson.build  |  16 ++
>  7 files changed, 604 insertions(+), 1 deletion(-)
>  create mode 100644 doc/guides/sample_app_ug/vdpa.rst
>  create mode 100644 examples/vdpa/Makefile
>  create mode 100644 examples/vdpa/main.c
>  create mode 100644 examples/vdpa/meson.build
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 9fd258fad..f84dbf2a7 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -682,6 +682,8 @@ F: doc/guides/sample_app_ug/vhost.rst
>  F: examples/vhost_scsi/
>  F: doc/guides/sample_app_ug/vhost_scsi.rst
>  F: examples/vhost_crypto/
> +F: examples/vdpa/
> +F: doc/guides/sample_app_ug/vdpa.rst
> 
>  Vhost PMD
>  M: Maxime Coquelin 
> diff --git a/doc/guides/sample_app_ug/index.rst b/doc/guides/sample_app_ug/index.rst
> index 5bedf4f6f..74b12af85 100644
> --- a/doc/guides/sample_app_ug/index.rst
> +++ b/doc/guides/sample_app_ug/index.rst
> @@ -45,6 +45,7 @@ Sample Applications User Guides
>  vhost
>  vhost_scsi
>  vhost_crypto
> +vdpa
>  netmap_compatibility
>  ip_pipeline
>  test_pipeline
> diff --git a/doc/guides/sample_app_ug/vdpa.rst b/doc/guides/sample_app_ug/vdpa.rst
> new file mode 100644
> index 0..ab222731e
> --- /dev/null
> +++ b/doc/guides/sample_app_ug/vdpa.rst
> @@ -0,0 +1,115 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +Copyright(c) 2018 Intel Corporation.
> +
> +Vdpa Sample Application
> +=======================
> +
> +The vdpa sample application creates vhost-user sockets by using the
> +vDPA backend. vDPA stands for vhost Data Path Acceleration which utilizes
> +virtio ring compatible devices to serve virtio driver directly to enable
> +datapath acceleration. As vDPA driver can help to set up vhost datapath,
> +this application doesn't need to launch dedicated worker threads for vhost
> +enqueue/dequeue operations.
> +

[...]

> +
> +Take IFCVF driver for example:
> +
> +.. code-block:: console
> +
> +./vdpa --log-level=9 -c 0x6 -n 4 --socket-mem 1024,1024 \
> +-w :06:00.2,vdpa=1 -w :06:00.3,vdpa=1 \
> +-- --interactive
> +
> +.. note::
> +We need to bind VFIO-pci to VFs before running vdpa sample.

Replace "VFIO-pci" with "vfio-pci".

> +
> +* modprobe vfio-pci
> +* ./usertools/dpdk-devbind.py -b vfio-pci 06:00.2 06:00.3
> +
> +Then we can create 2 vdpa ports in interactive cmdline.
> +
> +.. code-block:: console
> +
> +vdpa> list
> +device id   device address
> +0   :06:00.2
> +1   :06:00.3

Could we also show each device's features and supported queue number?

> +vdpa> create /tmp/vdpa-socket0 :06:00.2
> +vdpa> create /tmp/vdpa-socket1 :06:00.3
> +
> +.. _vdpa_app_run_vm:
> +

[...]

> +#include 
> +#include 
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define MAX_PATH_LEN 128
> +#define MAX_VDPA_SAMPLE_PORTS 1024

[...]

> + ret = rte_vhost_driver_unregister(socket_path);
> + if (ret != 0)
> + RTE_LOG(ERR, USER1,
> + "Fail to unregister vhost driver for %s.\n",
> +

Re: [dpdk-dev] [PATCH v2 1/2] vhost: introduce API to get vDPA device number

2018-09-19 Thread Wang, Xiao W
Hi,

> -Original Message-
> From: Ye, Xiaolong
> Sent: Friday, September 14, 2018 2:07 AM
> To: dev@dpdk.org; Maxime Coquelin ; Bie,
> Tiwei ; Wang, Zhihong 
> Cc: Wang, Xiao W ; Rami Rosen
> ; Wang, Haiyue ; Ye,
> Xiaolong 
> Subject: [PATCH v2 1/2] vhost: introduce API to get vDPA device number
> 
> Signed-off-by: Xiaolong Ye 
> ---
>  lib/librte_vhost/rte_vdpa.h | 3 +++
>  lib/librte_vhost/vdpa.c | 6 ++
>  2 files changed, 9 insertions(+)
> 
> diff --git a/lib/librte_vhost/rte_vdpa.h b/lib/librte_vhost/rte_vdpa.h
> index 90465ca26..b8223e337 100644
> --- a/lib/librte_vhost/rte_vdpa.h
> +++ b/lib/librte_vhost/rte_vdpa.h
> @@ -84,4 +84,7 @@ rte_vdpa_find_device_id(struct rte_vdpa_dev_addr *addr);
>  struct rte_vdpa_device * __rte_experimental
>  rte_vdpa_get_device(int did);

Please also add it to *version.map, so that it can be linked when DPDK is built as a shared lib.

> 
> +/* Get current available vdpa device number */
> +int __rte_experimental
> +rte_vdpa_get_device_num(void);
>  #endif /* _RTE_VDPA_H_ */
> diff --git a/lib/librte_vhost/vdpa.c b/lib/librte_vhost/vdpa.c
> index c82fd4370..c2c5dff1d 100644
> --- a/lib/librte_vhost/vdpa.c
> +++ b/lib/librte_vhost/vdpa.c
> @@ -113,3 +113,9 @@ rte_vdpa_get_device(int did)
> 
>   return vdpa_devices[did];
>  }
> +
> +int
> +rte_vdpa_get_device_num(void)
> +{
> + return vdpa_device_num;
> +}
> --
> 2.17.1

It's better to have a cover letter for a patch set.

BRs,
Xiao



Re: [dpdk-dev] [PATCH v2 2/2] examples/vdpa: introduce a new sample for vDPA

2018-09-19 Thread Wang, Xiao W
Hi Xiaolong,

> -Original Message-
> From: Ye, Xiaolong
> Sent: Thursday, September 20, 2018 6:23 AM
> To: Wang, Xiao W 
> Cc: dev@dpdk.org; Maxime Coquelin ; Bie,
> Tiwei ; Wang, Zhihong ;
> Rami Rosen ; Wang, Haiyue
> 
> Subject: Re: [PATCH v2 2/2] examples/vdpa: introduce a new sample for vDPA
> 
> On 09/19, Wang, Xiao W wrote:
> >Hi Xiaolong,
> >
> [snip]
> >> +.. note::
> >> +We need to bind VFIO-pci to VFs before running vdpa sample.
> >
> >Replace "VFIO-pci" with "vfio-pci".
> 
> Got it.
> 
> >
> >> +
> >> +* modprobe vfio-pci
> >> +* ./usertools/dpdk-devbind.py -b vfio-pci 06:00.2 06:00.3
> >> +
> >> +Then we can create 2 vdpa ports in interactive cmdline.
> >> +
> >> +.. code-block:: console
> >> +
> >> +vdpa> list
> >> +device id   device address
> >> +0   :06:00.2
> >> +1   :06:00.3
> >
> >Could we show out also the device's features and supported queue number?
> 
> Sure, it's a good suggestion.
> 
> [snip]
> >
> >> +  ret = rte_vhost_driver_unregister(socket_path);
> >> +  if (ret != 0)
> >> +  RTE_LOG(ERR, USER1,
> >> +  "Fail to unregister vhost driver for %s.\n",
> >> +  socket_path);
> >> +}
> >> +
> >> +static void
> >> +vdpa_sample_quit(void)
> >> +{
> >> +  int i;
> >> +  for (i = 0; i <  RTE_MIN(MAX_VDPA_SAMPLE_PORTS, dev_total); i++) {
> >
> >Double " ".
> 
> Sorry, I don't quite understand what you mean here.

I mean there are 2 blank spaces after the "<" operator.

BRs,
Xiao


Re: [dpdk-dev] [PATCH v3 2/2] examples/vdpa: introduce a new sample for vDPA

2018-09-20 Thread Wang, Xiao W
Hi Xiaolong,

> -Original Message-
> From: Ye, Xiaolong
> Sent: Friday, September 21, 2018 6:28 AM
> To: dev@dpdk.org; Maxime Coquelin ; Bie,
> Tiwei ; Wang, Zhihong 
> Cc: Wang, Xiao W ; Rami Rosen
> ; Wang, Haiyue ; Ye,
> Xiaolong 
> Subject: [PATCH v3 2/2] examples/vdpa: introduce a new sample for vDPA
> 
> The vdpa sample application creates vhost-user sockets by using the
> vDPA backend. vDPA stands for vhost Data Path Acceleration which utilizes
> virtio ring compatible devices to serve virtio driver directly to enable
> datapath acceleration. As vDPA driver can help to set up vhost datapath,
> this application doesn't need to launch dedicated worker threads for vhost
> enqueue/dequeue operations.
> 
> Signed-off-by: Xiao Wang 
> Signed-off-by: Xiaolong Ye 
> ---
>  MAINTAINERS|   2 +
>  doc/guides/sample_app_ug/index.rst |   1 +
>  doc/guides/sample_app_ug/vdpa.rst  | 115 
>  examples/Makefile  |   2 +-
>  examples/vdpa/Makefile |  32 ++
>  examples/vdpa/main.c   | 458 +
>  examples/vdpa/meson.build  |  16 +
>  7 files changed, 625 insertions(+), 1 deletion(-)
>  create mode 100644 doc/guides/sample_app_ug/vdpa.rst
>  create mode 100644 examples/vdpa/Makefile
>  create mode 100644 examples/vdpa/main.c
>  create mode 100644 examples/vdpa/meson.build
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 5967c1dd3..5656f18e8 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -683,6 +683,8 @@ F: doc/guides/sample_app_ug/vhost.rst
>  F: examples/vhost_scsi/
>  F: doc/guides/sample_app_ug/vhost_scsi.rst
>  F: examples/vhost_crypto/
> +F: examples/vdpa/
> +F: doc/guides/sample_app_ug/vdpa.rst
> 
>  Vhost PMD
>  M: Maxime Coquelin 
> diff --git a/doc/guides/sample_app_ug/index.rst b/doc/guides/sample_app_ug/index.rst
> index 5bedf4f6f..74b12af85 100644
> --- a/doc/guides/sample_app_ug/index.rst
> +++ b/doc/guides/sample_app_ug/index.rst
> @@ -45,6 +45,7 @@ Sample Applications User Guides
>  vhost
>  vhost_scsi
>  vhost_crypto
> +vdpa
>  netmap_compatibility
>  ip_pipeline
>  test_pipeline
> diff --git a/doc/guides/sample_app_ug/vdpa.rst b/doc/guides/sample_app_ug/vdpa.rst
> new file mode 100644
> index 0..44fe6736d
> --- /dev/null
> +++ b/doc/guides/sample_app_ug/vdpa.rst
> @@ -0,0 +1,115 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +Copyright(c) 2018 Intel Corporation.
> +
> +Vdpa Sample Application
> +=======================
> +
> +The vdpa sample application creates vhost-user sockets by using the
> +vDPA backend. vDPA stands for vhost Data Path Acceleration which utilizes
> +virtio ring compatible devices to serve virtio driver directly to enable
> +datapath acceleration. As vDPA driver can help to set up vhost datapath,
> +this application doesn't need to launch dedicated worker threads for vhost
> +enqueue/dequeue operations.
> +
> +Testing steps

[...]

> +
> +Then we can create 2 vdpa ports in interactive cmdline.
> +
> +.. code-block:: console
> +
> +vdpa> list
> +device id   device address
> +0   :06:00.2
> +1   :06:00.3

The features and queue numbers of each vDPA device can now be shown as well, so 
the doc should reflect this.

> +vdpa> create /tmp/vdpa-socket0 :06:00.2
> +vdpa> create /tmp/vdpa-socket1 :06:00.3
> +
> +.. _vdpa_app_run_vm:
> +
> +Start the VMs
> +~~~~~~~~~~~~~
> +
> +.. code-block:: console
> +
> +   qemu-system-x86_64 -cpu host -enable-kvm \
> +   
> +   -mem-prealloc \
> +   -chardev socket,id=char0,path= \
> +   -netdev type=vhost-user,id=vdpa,chardev=char0 \
> +   -device virtio-net-pci,netdev=vdpa,mac=00:aa:bb:cc:dd:ee \
> +
> +After the VMs launches, we can login the VMs and configure the ip, verify the
> +network connection via ping or netperf.
> +
> +.. note::
> +Suggest to use QEMU 3.0.0 which extends vhost-user for vDPA.
> +
> +Live Migration
> +~~~~~~~~~~~~~~
> +vDPA supports cross-backend live migration, user can migrate SW vhost backend
> +VM to vDPA backend VM and vice versa. Here are the detailed steps. Assume A is
> +the source host with SW vhost VM and B is the destination host with vDPA.
> +
> +1. Start vdpa sample and launch a VM with exact same parameters as the VM on A,
> +   in migration-listen mode:
> +
> +.. code-block:: console
> +
> +B:  -incoming tcp:0: (or other PORT))
> +
> +2. Start the migration (on source host):
> +
> +.. code-block:: c

Re: [dpdk-dev] [PATCH v4 2/2] examples/vdpa: introduce a new sample for vDPA

2018-09-23 Thread Wang, Xiao W
Hi Xiaolong,

Thanks for the update, 2 small comments below.

> -Original Message-
> From: Ye, Xiaolong
> Sent: Monday, September 24, 2018 4:43 PM
> To: dev@dpdk.org; Maxime Coquelin ; Bie,
> Tiwei ; Wang, Zhihong 
> Cc: Wang, Xiao W ; Rami Rosen
> ; Wang, Haiyue ; Ye,
> Xiaolong 
> Subject: [PATCH v4 2/2] examples/vdpa: introduce a new sample for vDPA
> 
> The vdpa sample application creates vhost-user sockets by using the
> vDPA backend. vDPA stands for vhost Data Path Acceleration which utilizes
> virtio ring compatible devices to serve virtio driver directly to enable
> datapath acceleration. As vDPA driver can help to set up vhost datapath,
> this application doesn't need to launch dedicated worker threads for vhost
> enqueue/dequeue operations.
> 
> Signed-off-by: Xiao Wang 
> Signed-off-by: Xiaolong Ye 
> ---
>  MAINTAINERS|   2 +
>  doc/guides/rel_notes/release_18_11.rst |   8 +
>  doc/guides/sample_app_ug/index.rst |   1 +
>  doc/guides/sample_app_ug/vdpa.rst  | 118 +++
>  examples/Makefile  |   2 +-
>  examples/vdpa/Makefile |  32 ++
>  examples/vdpa/main.c   | 466 +
>  examples/vdpa/meson.build  |  16 +
>  8 files changed, 644 insertions(+), 1 deletion(-)
>  create mode 100644 doc/guides/sample_app_ug/vdpa.rst
>  create mode 100644 examples/vdpa/Makefile
>  create mode 100644 examples/vdpa/main.c
>  create mode 100644 examples/vdpa/meson.build
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 5967c1dd3..5656f18e8 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -683,6 +683,8 @@ F: doc/guides/sample_app_ug/vhost.rst
>  F: examples/vhost_scsi/
>  F: doc/guides/sample_app_ug/vhost_scsi.rst
>  F: examples/vhost_crypto/
> +F: examples/vdpa/
> +F: doc/guides/sample_app_ug/vdpa.rst
> 
>  Vhost PMD
>  M: Maxime Coquelin 
> diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst
> index bc9b74ec4..dd53a9ecf 100644
> --- a/doc/guides/rel_notes/release_18_11.rst
> +++ b/doc/guides/rel_notes/release_18_11.rst
> @@ -67,6 +67,14 @@ New Features
>SR-IOV option in Hyper-V and Azure. This is an alternative to the previous
>vdev_netvsc, tap, and failsafe drivers combination.
> 
> +* **Add a new sample for vDPA**
> +
> +  The vdpa sample application creates vhost-user sockets by using the
> +  vDPA backend. vDPA stands for vhost Data Path Acceleration which utilizes
> +  virtio ring compatible devices to serve virtio driver directly to enable
> +  datapath acceleration. As vDPA driver can help to set up vhost datapath,
> +  this application doesn't need to launch dedicated worker threads for vhost
> +  enqueue/dequeue operations.
> 
>  API Changes
>  -----------
> diff --git a/doc/guides/sample_app_ug/index.rst b/doc/guides/sample_app_ug/index.rst
> index 5bedf4f6f..74b12af85 100644
> --- a/doc/guides/sample_app_ug/index.rst
> +++ b/doc/guides/sample_app_ug/index.rst
> @@ -45,6 +45,7 @@ Sample Applications User Guides
>  vhost
>  vhost_scsi
>  vhost_crypto
> +vdpa
>  netmap_compatibility
>  ip_pipeline
>  test_pipeline
> diff --git a/doc/guides/sample_app_ug/vdpa.rst b/doc/guides/sample_app_ug/vdpa.rst
> new file mode 100644
> index 0..d05728a37
> --- /dev/null
> +++ b/doc/guides/sample_app_ug/vdpa.rst
> @@ -0,0 +1,118 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +Copyright(c) 2018 Intel Corporation.
> +
> +Vdpa Sample Application
> +=======================
> +
> +The vdpa sample application creates vhost-user sockets by using the
> +vDPA backend. vDPA stands for vhost Data Path Acceleration which utilizes
> +virtio ring compatible devices to serve virtio driver directly to enable
> +datapath acceleration. As vDPA driver can help to set up vhost datapath,
> +this application doesn't need to launch dedicated worker threads for vhost
> +enqueue/dequeue operations.
> +
> +Testing steps
> +-------------
> +
> +This section shows the steps of how to start VMs with vDPA vhost-user
> +backend and verify network connection & live migration.
> +
> +Build
> +~~~~~
> +
> +To compile the sample application see :doc:`compiling`.
> +
> +The application is located in the ``vdpa`` sub-directory.
> +
> +Start the vdpa example
> +~~~~~~~~~~~~~~~~~~~~~~
> +
> +.. code-block:: console
> +
> +./vdpa [EAL options]  -- [--client] [--interactive|-i] or [--iface SOCKET_PATH]
> +
> +where
> +
> +* --client means running vdpa app in client mode, in the client mode, QEMU needs
> +  to run as the server m

Re: [dpdk-dev] [PATCH 1/2] eal/vfio: check if we already have the group fd open

2018-09-25 Thread Wang, Xiao W
Hi,

> -Original Message-
> From: Stojaczyk, Dariusz
> Sent: Monday, September 17, 2018 9:47 PM
> To: dev@dpdk.org
> Cc: Alejandro Lucero ; Burakov, Anatoly
> ; sta...@dpdk.org; Stojaczyk, Dariusz
> ; Wang, Xiao W 
> Subject: [PATCH 1/2] eal/vfio: check if we already have the group fd open
> 
> From: Dariusz Stojaczyk 
> 
> Always attempt to find already opened fd for an iommu
> group as subsequent attempts to open it will fail.
> 
> There's no public API to check if a group was already
> bound and has a container, so rte_vfio_container_group_bind()
> shouldn't fail in such case.
> 
> Fixes: ea2dc1066870 ("vfio: add multi container support")
> Cc: xiao.w.w...@intel.com
> 
> Signed-off-by: Dariusz Stojaczyk 
> ---
>  lib/librte_eal/linuxapp/eal/eal_vfio.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> index c68dc38e0..bcb869be1 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> @@ -1680,6 +1680,11 @@ rte_vfio_container_group_bind(int container_fd,
> int iommu_group_num)
>   return -1;
>   }
> 
> + /* check if we already have the group descriptor open */
> + for (i = 0; i < VFIO_MAX_GROUPS; i++)
> + if (vfio_cfg->vfio_groups[i].group_num == iommu_group_num)
> + return vfio_cfg->vfio_groups[i].fd;
> +
>   /* Check room for new group */
>   if (vfio_cfg->vfio_active_groups == VFIO_MAX_GROUPS) {
>   RTE_LOG(ERR, EAL, "Maximum number of VFIO groups
> reached!\n");
> --
> 2.17.1

Acked-by: Xiao Wang 
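
For readers following the thread, the lookup-before-allocate idea in the hunk above can be sketched in isolation. This is only an illustration under stated assumptions: the struct layout, MAX_GROUPS and cfg_bind_group() are hypothetical stand-ins, not the EAL code, and the real function does more (opening the group and attaching it to the container).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_GROUPS 4            /* hypothetical stand-in for VFIO_MAX_GROUPS */

struct group_slot {
	int group_num;          /* IOMMU group number, -1 when the slot is free */
	int fd;                 /* descriptor already opened for that group */
};

struct container_cfg {
	struct group_slot groups[MAX_GROUPS];
	int active_groups;
};

/* Mark every slot as free. */
static void
cfg_init(struct container_cfg *cfg)
{
	assert(cfg != NULL);
	for (int i = 0; i < MAX_GROUPS; i++) {
		cfg->groups[i].group_num = -1;
		cfg->groups[i].fd = -1;
	}
	cfg->active_groups = 0;
}

/*
 * Bind a group to the container. If the group is already known, return the
 * fd recorded earlier instead of failing; otherwise store new_fd in a free
 * slot. Returns -1 only when the table is full.
 */
static int
cfg_bind_group(struct container_cfg *cfg, int group_num, int new_fd)
{
	assert(cfg != NULL && group_num >= 0);

	/* check if we already have the group descriptor open */
	for (int i = 0; i < MAX_GROUPS; i++)
		if (cfg->groups[i].group_num == group_num)
			return cfg->groups[i].fd;

	/* check room for a new group */
	if (cfg->active_groups == MAX_GROUPS)
		return -1;

	for (int i = 0; i < MAX_GROUPS; i++) {
		if (cfg->groups[i].group_num < 0) {
			cfg->groups[i].group_num = group_num;
			cfg->groups[i].fd = new_fd;
			cfg->active_groups++;
			return new_fd;
		}
	}
	return -1;
}
```

Binding the same group twice then hands back the first descriptor rather than an error, which is the behaviour the commit message asks for.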

BRs,
Xiao



Re: [dpdk-dev] [PATCH v5 2/2] examples/vdpa: introduce a new sample for vDPA

2018-09-25 Thread Wang, Xiao W
Hi,

> -Original Message-
> From: Ye, Xiaolong
> Sent: Tuesday, September 25, 2018 8:07 PM
> To: dev@dpdk.org; Maxime Coquelin ; Bie,
> Tiwei ; Wang, Zhihong 
> Cc: Wang, Xiao W ; Rami Rosen
> ; Wang, Haiyue ; Ye,
> Xiaolong 
> Subject: [PATCH v5 2/2] examples/vdpa: introduce a new sample for vDPA
> 
> The vdpa sample application creates vhost-user sockets by using the
> vDPA backend. vDPA stands for vhost Data Path Acceleration which utilizes
> virtio ring compatible devices to serve virtio driver directly to enable
> datapath acceleration. As vDPA driver can help to set up vhost datapath,
> this application doesn't need to launch dedicated worker threads for vhost
> enqueue/dequeue operations.
> 
> Signed-off-by: Xiao Wang 
> Signed-off-by: Xiaolong Ye 
> ---
>  MAINTAINERS|   2 +
>  doc/guides/rel_notes/release_18_11.rst |   8 +
>  doc/guides/sample_app_ug/index.rst |   1 +
>  doc/guides/sample_app_ug/vdpa.rst  | 118 +++
>  examples/Makefile  |   2 +-
>  examples/vdpa/Makefile |  32 ++
>  examples/vdpa/main.c   | 466 +
>  examples/vdpa/meson.build  |  16 +
>  8 files changed, 644 insertions(+), 1 deletion(-)
>  create mode 100644 doc/guides/sample_app_ug/vdpa.rst
>  create mode 100644 examples/vdpa/Makefile
>  create mode 100644 examples/vdpa/main.c
>  create mode 100644 examples/vdpa/meson.build
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 5967c1dd3..5656f18e8 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -683,6 +683,8 @@ F: doc/guides/sample_app_ug/vhost.rst
>  F: examples/vhost_scsi/
>  F: doc/guides/sample_app_ug/vhost_scsi.rst
>  F: examples/vhost_crypto/
> +F: examples/vdpa/
> +F: doc/guides/sample_app_ug/vdpa.rst
> 
>  Vhost PMD
>  M: Maxime Coquelin 
> diff --git a/doc/guides/rel_notes/release_18_11.rst
> b/doc/guides/rel_notes/release_18_11.rst
> index bc9b74ec4..dd53a9ecf 100644
> --- a/doc/guides/rel_notes/release_18_11.rst
> +++ b/doc/guides/rel_notes/release_18_11.rst
> @@ -67,6 +67,14 @@ New Features
>SR-IOV option in Hyper-V and Azure. This is an alternative to the previous
>vdev_netvsc, tap, and failsafe drivers combination.
> 
> +* **Add a new sample for vDPA**
> +
> +  The vdpa sample application creates vhost-user sockets by using the
> +  vDPA backend. vDPA stands for vhost Data Path Acceleration which utilizes
> +  virtio ring compatible devices to serve virtio driver directly to enable
> +  datapath acceleration. As vDPA driver can help to set up vhost datapath,
> +  this application doesn't need to launch dedicated worker threads for vhost
> +  enqueue/dequeue operations.
> 
>  API Changes
>  ---
> diff --git a/doc/guides/sample_app_ug/index.rst
> b/doc/guides/sample_app_ug/index.rst
> index 5bedf4f6f..74b12af85 100644
> --- a/doc/guides/sample_app_ug/index.rst
> +++ b/doc/guides/sample_app_ug/index.rst
> @@ -45,6 +45,7 @@ Sample Applications User Guides
>  vhost
>  vhost_scsi
>  vhost_crypto
> +vdpa
>  netmap_compatibility
>  ip_pipeline
>  test_pipeline
> diff --git a/doc/guides/sample_app_ug/vdpa.rst
> b/doc/guides/sample_app_ug/vdpa.rst
> new file mode 100644
> index 0..a089393c0
> --- /dev/null
> +++ b/doc/guides/sample_app_ug/vdpa.rst
> @@ -0,0 +1,118 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +Copyright(c) 2018 Intel Corporation.
> +
> +Vdpa Sample Application
> +=======================
> +
> +The vdpa sample application creates vhost-user sockets by using the
> +vDPA backend. vDPA stands for vhost Data Path Acceleration which utilizes
> +virtio ring compatible devices to serve virtio driver directly to enable
> +datapath acceleration. As vDPA driver can help to set up vhost datapath,
> +this application doesn't need to launch dedicated worker threads for vhost
> +enqueue/dequeue operations.
> +
> +Testing steps
> +-------------
> +
> +This section shows the steps of how to start VMs with vDPA vhost-user
> +backend and verify network connection & live migration.
> +
> +Build
> +~~~~~
> +
> +To compile the sample application see :doc:`compiling`.
> +
> +The application is located in the ``vdpa`` sub-directory.
> +
> +Start the vdpa example
> +~~~~~~~~~~~~~~~~~~~~~~
> +
> +.. code-block:: console
> +
> +./vdpa [EAL options]  -- [--client] [--interactive|-i] or [--iface
> SOCKET_PATH]
> +
> +where
> +
> +* --client means running vdpa app in client mode, in the client mode, QEMU
> needs
> +  to run as the server mode and take charge of socket file creation.
> +* --if

Re: [dpdk-dev] [PATCH v2] net/ifc: do not notify before HW ready

2018-09-25 Thread Wang, Xiao W
Hi,

> -Original Message-
> From: Kevin Traynor [mailto:ktray...@redhat.com]
> Sent: Wednesday, September 26, 2018 1:16 AM
> To: Wang, Xiao W 
> Cc: Ye, Xiaolong ; Bie, Tiwei ;
> dev@dpdk.org; sta...@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] net/ifc: do not notify before HW ready
> 
> On 09/14/2018 02:25 AM, Xiao Wang wrote:
> > If the device is not clearly reset by the previous driver and holds
> > some invalid ring addr, and the relay thread kicks it before HW is
> > properly re-configured, a bad DMA request may happen.
> >
> > Besides, the notify_addr which is used by the relay thread is set in
> > the vdpa_ifcvf_start function, if a kick relay happens before
> > vdpa_ifcvf_start finishes, a null addr is accessed.
> >
> > Fixes: a3f8150eac6d ("net/ifcvf: add ifcvf vDPA driver")
> >
> 
> Looks like this should be in stable branch too. Can you confirm?  

Yes, it should also go into the stable branch. Thanks for pointing it out.
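
To make the race in the commit message concrete, here is a minimal sketch of the general idea: the relay path refuses to touch the doorbell until the start path has published it. This is not the ifcvf driver code; struct relay_dev, relay_dev_start() and relay_kick() are hypothetical names, and C11 atomics stand in for whatever synchronization the driver actually uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct relay_dev {
	atomic_bool hw_ready;            /* set only after HW is configured */
	volatile uint16_t *notify_addr;  /* doorbell, NULL until start is done */
	unsigned int dropped;            /* kicks ignored before HW was ready */
};

static void
relay_dev_init(struct relay_dev *dev)
{
	assert(dev != NULL);
	atomic_init(&dev->hw_ready, false);
	dev->notify_addr = NULL;
	dev->dropped = 0;
}

/* Start path: publish the doorbell address first, then the ready flag. */
static void
relay_dev_start(struct relay_dev *dev, volatile uint16_t *doorbell)
{
	assert(dev != NULL && doorbell != NULL);
	dev->notify_addr = doorbell;
	atomic_store_explicit(&dev->hw_ready, true, memory_order_release);
}

/* Relay path: never notify (or dereference notify_addr) before start. */
static bool
relay_kick(struct relay_dev *dev, uint16_t qid)
{
	assert(dev != NULL);
	if (!atomic_load_explicit(&dev->hw_ready, memory_order_acquire)) {
		dev->dropped++;
		return false;
	}
	*dev->notify_addr = qid;
	return true;
}
```

The release/acquire pair is what guarantees that a relay thread which sees hw_ready set also sees the notify_addr written before it.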

BRs,
Xiao


Re: [dpdk-dev] [PATCH v6 0/2] introduce vdpa sample

2018-09-25 Thread Wang, Xiao W
Hi,

> -Original Message-
> From: Ye, Xiaolong
> Sent: Wednesday, September 26, 2018 5:07 PM
> To: dev@dpdk.org; Maxime Coquelin ; Bie,
> Tiwei ; Wang, Zhihong 
> Cc: Wang, Xiao W ; Rami Rosen
> ; Wang, Haiyue ; Ye,
> Xiaolong 
> Subject: [PATCH v6 0/2] introduce vdpa sample
> 
> Hi,
> 
> This patchset introduces vdpa sample to demonstrate the vDPA use case.
> 
> v6 changes:
> * improve the document according to Xiao's comments
> * fix a typo, PRIu64 -> PRIx64
> 
> v5 changes:
> * improve print format and correct from "PRIu64" to "PRIx64"
> * use "-c 0x2" to better demonstrate app doesn't need to launch dedicated
>   worker threads for vhost enqueue/dequeue operations in vdpa.rst
> 
> v4 changes:
> * add client mode support
> * improve the format to list the vDPA device info and improve the vdpa.rst
>   accordingly
> * remove some useless comments
> * add introduction in 18.11 release note.
> 
> 
> v3 changes:
> * list cmd would show queue number and supported features of vdpa devices.
> * address Xiao's review comments
> 
> v2 changes:
> 
> * fix a compilation error reported by Rosen
> * improve create cmd in interactive mode and add two new cmds: list, quit
> * add application documentation
> 
> 
> Xiaolong Ye (2):
>   vhost: introduce API to get vDPA device number
>   examples/vdpa: introduce a new sample for vDPA
> 
>  MAINTAINERS|   2 +
>  doc/guides/sample_app_ug/index.rst |   1 +
>  doc/guides/sample_app_ug/vdpa.rst  | 115 +++
>  examples/Makefile  |   2 +-
>  examples/vdpa/Makefile |  32 ++
>  examples/vdpa/main.c   | 458 +
>  examples/vdpa/meson.build  |  16 +
>  lib/librte_vhost/rte_vdpa.h|   3 +
>  lib/librte_vhost/rte_vhost_version.map |   1 +
>  lib/librte_vhost/vdpa.c|   6 +
>  10 files changed, 635 insertions(+), 1 deletion(-)
>  create mode 100644 doc/guides/sample_app_ug/vdpa.rst
>  create mode 100644 examples/vdpa/Makefile
>  create mode 100644 examples/vdpa/main.c
>  create mode 100644 examples/vdpa/meson.build
> 
> --
> 2.17.1

Series Acked-by: Xiao Wang 

BRs,
Xiao



Re: [dpdk-dev] [PATCH] net/virtio-user: fix dev_init in legacy-mem mode

2018-05-17 Thread Wang, Xiao W
Hi,


> -Original Message-
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Thursday, May 17, 2018 4:03 PM
> To: Wang, Xiao W 
> Cc: dev@dpdk.org; Bie, Tiwei ; Yao, Lei A
> 
> Subject: Re: [PATCH] net/virtio-user: fix dev_init in legacy-mem mode
> 
> Hi Xiao,
> 
> Next time, could you please run the devtools/check-git-log.sh script
> before posting?

OK, thanks!

BRs,
Xiao
> 
> I think the commit title should be changed to:
> net/virtio-user: fix device init in legacy-mem mode
> 
> On 05/17/2018 09:35 AM, Xiao Wang wrote:
> > In legacy-mem mode, memory event callback registering is not supported,
> > we should not return error in dev_init on this case.
> >
> > Fixes: 12ecb2f63b12 ("net/virtio-user: support memory hotplug")
> >
> > Signed-off-by: Xiao Wang 
> > Suggested-by: Tiwei Bie 
> 
> I think the suggested-by line should go above the signed-off one,
> as it was suggested before being implemented.
> 
> > ---
> >   drivers/net/virtio/virtio_user/virtio_user_dev.c | 7 +--
> >   1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.c
> b/drivers/net/virtio/virtio_user/virtio_user_dev.c
> > index 992a68757..bd16fbb60 100644
> > --- a/drivers/net/virtio/virtio_user/virtio_user_dev.c
> > +++ b/drivers/net/virtio/virtio_user/virtio_user_dev.c
> > @@ -445,8 +445,11 @@ virtio_user_dev_init(struct virtio_user_dev *dev,
> char *path, int queues,
> >
> > if
> (rte_mem_event_callback_register(VIRTIO_USER_MEM_EVENT_CLB_NAME,
> > virtio_user_mem_event_cb, dev)) {
> > -   PMD_INIT_LOG(ERR, "Failed to register mem event
> callback\n");
> > -   return -1;
> > +   if (rte_errno != ENOTSUP) {
> > +   PMD_INIT_LOG(ERR, "Failed to register mem event"
> > +   " callback\n");
> > +   return -1;
> > +   }
> > }
> >
> > return 0;
> >
> 
> Apart from the above minor comments, the patch looks good to me:
> Reviewed-by: Maxime Coquelin 
> 
> I'll handle the changes when applying.
> 
> Thanks,
> Maxime
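
For reference, the error-handling pattern in the quoted hunk (treat "not supported" as non-fatal, fail on anything else) can be sketched on its own. This is a simplified illustration: it uses plain errno rather than rte_errno, and dev_init() and the register_* helpers are hypothetical stand-ins for the real callback registration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an optional registration call. */
typedef int (*register_cb_t)(void);

static int register_ok(void)          { return 0; }
static int register_unsupported(void) { errno = ENOTSUP; return -1; }
static int register_broken(void)      { errno = EINVAL;  return -1; }

/*
 * Device init that tolerates a missing optional feature: a failed
 * registration is fatal only if the reason is something other than ENOTSUP.
 */
static int
dev_init(register_cb_t do_register)
{
	assert(do_register != NULL);

	if (do_register() != 0) {
		if (errno != ENOTSUP)
			return -1;      /* real failure */
		/* feature unavailable in this mode: carry on without it */
	}
	return 0;
}
```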


Re: [dpdk-dev] [PATCH-18.08 08/15] net/ifc: rename to ifcvf

2018-05-17 Thread Wang, Xiao W
Hi Bruce,

> -Original Message-
> From: Richardson, Bruce
> Sent: Friday, May 18, 2018 4:15 AM
> To: dev@dpdk.org
> Cc: Richardson, Bruce ; Wang, Xiao W
> 
> Subject: [PATCH-18.08 08/15] net/ifc: rename to ifcvf
> 
> All files in the directory and the resulting driver have prefix of ifcvf,
> not just ifc, so rename directory for accuracy. Also rename the map file
> to standard name for meson build in the process.

Naming the directory "ifc" allows us to add an ifcpf driver to it in the future.

BRs,
Xiao

> 
> Signed-off-by: Bruce Richardson 
> ---
> CC: Xiao Wang 
> ---
>  MAINTAINERS                                   | 4 ++--
>  drivers/net/Makefile                          | 2 +-
>  drivers/net/{ifc => ifcvf}/Makefile           | 2 +-
>  drivers/net/{ifc => ifcvf}/base/ifcvf.c       | 0
>  drivers/net/{ifc => ifcvf}/base/ifcvf.h       | 0
>  drivers/net/{ifc => ifcvf}/base/ifcvf_osdep.h | 0
>  drivers/net/{ifc => ifcvf}/ifcvf_vdpa.c       | 0
>  .../{ifc/rte_ifcvf_version.map => ifcvf/rte_pmd_ifcvf_version.map} | 0
>  8 files changed, 4 insertions(+), 4 deletions(-)
>  rename drivers/net/{ifc => ifcvf}/Makefile (94%)
>  rename drivers/net/{ifc => ifcvf}/base/ifcvf.c (100%)
>  rename drivers/net/{ifc => ifcvf}/base/ifcvf.h (100%)
>  rename drivers/net/{ifc => ifcvf}/base/ifcvf_osdep.h (100%)
>  rename drivers/net/{ifc => ifcvf}/ifcvf_vdpa.c (100%)
>  rename drivers/net/{ifc/rte_ifcvf_version.map =>
> ifcvf/rte_pmd_ifcvf_version.map} (100%)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 2663f1c03..6f587477c 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -553,10 +553,10 @@ T: git://dpdk.org/next/dpdk-next-net-intel
>  F: drivers/net/avf/
>  F: doc/guides/nics/features/avf*.ini
> 
> -Intel ifc
> +Intel ifcvf
>  M: Xiao Wang 
>  T: git://dpdk.org/next/dpdk-next-net-intel
> -F: drivers/net/ifc/
> +F: drivers/net/ifcvf/
>  F: doc/guides/nics/ifcvf.rst
>  F: doc/guides/nics/features/ifcvf.ini
> 
> diff --git a/drivers/net/Makefile b/drivers/net/Makefile
> index 9f9da6651..9308f9a7b 100644
> --- a/drivers/net/Makefile
> +++ b/drivers/net/Makefile
> @@ -59,7 +59,7 @@ endif # $(CONFIG_RTE_LIBRTE_SCHED)
>  ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y)
>  DIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += vhost
>  ifeq ($(CONFIG_RTE_EAL_VFIO),y)
> -DIRS-$(CONFIG_RTE_LIBRTE_IFCVF_VDPA_PMD) += ifc
> +DIRS-$(CONFIG_RTE_LIBRTE_IFCVF_VDPA_PMD) += ifcvf
>  endif
>  endif # $(CONFIG_RTE_LIBRTE_VHOST)
> 
> diff --git a/drivers/net/ifc/Makefile b/drivers/net/ifcvf/Makefile
> similarity index 94%
> rename from drivers/net/ifc/Makefile
> rename to drivers/net/ifcvf/Makefile
> index 1011995bc..a022faaad 100644
> --- a/drivers/net/ifc/Makefile
> +++ b/drivers/net/ifcvf/Makefile
> @@ -22,7 +22,7 @@ BASE_DRIVER_OBJS=$(sort $(patsubst %.c,%.o,$(notdir
> $(wildcard $(SRCDIR)/base/*.
> 
>  VPATH += $(SRCDIR)/base
> 
> -EXPORT_MAP := rte_ifcvf_version.map
> +EXPORT_MAP := rte_pmd_ifcvf_version.map
> 
>  LIBABIVER := 1
> 
> diff --git a/drivers/net/ifc/base/ifcvf.c b/drivers/net/ifcvf/base/ifcvf.c
> similarity index 100%
> rename from drivers/net/ifc/base/ifcvf.c
> rename to drivers/net/ifcvf/base/ifcvf.c
> diff --git a/drivers/net/ifc/base/ifcvf.h b/drivers/net/ifcvf/base/ifcvf.h
> similarity index 100%
> rename from drivers/net/ifc/base/ifcvf.h
> rename to drivers/net/ifcvf/base/ifcvf.h
> diff --git a/drivers/net/ifc/base/ifcvf_osdep.h
> b/drivers/net/ifcvf/base/ifcvf_osdep.h
> similarity index 100%
> rename from drivers/net/ifc/base/ifcvf_osdep.h
> rename to drivers/net/ifcvf/base/ifcvf_osdep.h
> diff --git a/drivers/net/ifc/ifcvf_vdpa.c b/drivers/net/ifcvf/ifcvf_vdpa.c
> similarity index 100%
> rename from drivers/net/ifc/ifcvf_vdpa.c
> rename to drivers/net/ifcvf/ifcvf_vdpa.c
> diff --git a/drivers/net/ifc/rte_ifcvf_version.map
> b/drivers/net/ifcvf/rte_pmd_ifcvf_version.map
> similarity index 100%
> rename from drivers/net/ifc/rte_ifcvf_version.map
> rename to drivers/net/ifcvf/rte_pmd_ifcvf_version.map
> --
> 2.11.0



Re: [dpdk-dev] [PATCH-18.08 08/15] net/ifc: rename to ifcvf

2018-05-18 Thread Wang, Xiao W
Hi,

> -Original Message-
> From: Richardson, Bruce
> Sent: Friday, May 18, 2018 4:13 PM
> To: Wang, Xiao W 
> Cc: dev@dpdk.org
> Subject: Re: [PATCH-18.08 08/15] net/ifc: rename to ifcvf
> 
> On Fri, May 18, 2018 at 02:52:36AM +0100, Wang, Xiao W wrote:
> > Hi Bruce,
> >
> > > -Original Message-
> > > From: Richardson, Bruce
> > > Sent: Friday, May 18, 2018 4:15 AM
> > > To: dev@dpdk.org
> > > Cc: Richardson, Bruce ; Wang, Xiao W
> > > 
> > > Subject: [PATCH-18.08 08/15] net/ifc: rename to ifcvf
> > >
> > > All files in the directory and the resulting driver have prefix of ifcvf,
> > > not just ifc, so rename directory for accuracy. Also rename the map file
> > > to standard name for meson build in the process.
> >
> > Naming the directory "ifc" allows us to add an ifcpf driver to it in the
> > future.
> >
> > BRs,
> > Xiao
> >
> 
> At which point you will have to rename the version file, the rst
> documentation file, the resulting shared library file etc. Right now, most
> of the references are to ifcvf, with the only exception being the folder
> name. This patch makes things consistent by having everything refer to
> ifcvf.

OK.

> 
> If you prefer, I can do a v2 of this set renaming everything to ifc, but
> that would be a lot bigger a job, and would also result in the driver file
> itself getting a new name too. Unless there are immediate plans to add an
> ifcpf driver, I think this change makes more sense.

No need for that.

BRs,
Xiao
> 
> /Bruce



Re: [dpdk-dev] [PATCH] doc: fix an error in ifc NIC document

2019-01-17 Thread Wang, Xiao W
Hi,

> -Original Message-
> From: Rami Rosen [mailto:ramir...@gmail.com]
> Sent: Thursday, January 17, 2019 10:51 PM
> To: dev@dpdk.org
> Cc: sta...@dpdk.org; Wang, Xiao W ; Rami Rosen
> 
> Subject: [PATCH] doc: fix an error in ifc NIC document
> 
> This patch fixes an error in ifc NIC document; a previous patch
> changed the semantics to use CONFIG_RTE_LIBRTE_IFC_PMD
> instead of CONFIG_RTE_LIBRTE_IFCVF_VDPA_PMD,
> but the ifc NIC doc file remained with the old name.
> 
> Fixes: 4b614e9504a1 ("net/ifc: make driver name consistent")
> Cc: sta...@dpdk.org
> Signed-off-by: Rami Rosen 
> ---
>  doc/guides/nics/ifc.rst | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/doc/guides/nics/ifc.rst b/doc/guides/nics/ifc.rst
> index bdf7b4e..317d9ff 100644
> --- a/doc/guides/nics/ifc.rst
> +++ b/doc/guides/nics/ifc.rst
> @@ -19,7 +19,7 @@ Config File Options
> 
>  The following option can be modified in the ``config`` file.
> 
> -- ``CONFIG_RTE_LIBRTE_IFCVF_VDPA_PMD`` (default ``y`` for linux)
> +- ``CONFIG_RTE_LIBRTE_IFC_PMD`` (default ``y`` for linux)
> 
>Toggle compilation of the ``librte_ifcvf_vdpa`` driver.

Could we also squash the fix below into this patch? Or, since this patch has
already been applied, should we send a separate one?

-  Toggle compilation of the ``librte_ifcvf_vdpa`` driver.
+  Toggle compilation of the ``librte_pmd_ifc`` driver.

BRs,
Xiao



Re: [dpdk-dev] [PATCH v2] vfio: allow secondary process to query IOMMU type

2019-01-18 Thread Wang, Xiao W
Hi Anatoly,

> -Original Message-
> From: Burakov, Anatoly
> Sent: Friday, January 18, 2019 6:25 PM
> To: dev@dpdk.org
> Cc: Wang, Xiao W ; Zhang, Qi Z
> ; qingfu@alibaba-inc.com; tho...@monjalon.net;
> Stojaczyk, Dariusz ; sta...@dpdk.org
> Subject: [PATCH v2] vfio: allow secondary process to query IOMMU type
> 
> It is only possible to know IOMMU type of a given VFIO container
> by attempting to initialize it. Since secondary process never
> attempts to set up VFIO container itself (because they're shared
> between primary and secondary), it never knows which IOMMU type
> the container is using, and never sets up the appropriate config
> structures. This results in inability to perform DMA mappings in
> secondary process.
> 
> Fix this by allowing secondary process to query IOMMU type of
> primary's default container at device initialization.
> 
> Note that this fix is assuming we're only interested in default
> container.
> 
> Bugzilla ID: 174
> 
> Fixes: 6bcb7c95fe14 ("vfio: share default container in multi-process")
> Cc: dariusz.stojac...@intel.com
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Anatoly Burakov 
> ---
> 
> Notes:
> v2:
> - Check if we found our IOMMU type within list of IOMMU types
> - Don't request new default container fd as this should have
>   been done during rte_vfio_enable()
> 
>  lib/librte_eal/linuxapp/eal/eal_vfio.c| 88 +++
>  lib/librte_eal/linuxapp/eal/eal_vfio.h| 12 ++-
>  .../linuxapp/eal/eal_vfio_mp_sync.c   | 16 
>  3 files changed, 115 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> index 72cc65151..c821e8382 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> @@ -549,6 +549,65 @@ vfio_mem_event_callback(enum rte_mem_event
> type, const void *addr, size_t len,
>   }
>  }
> 
> +static int
> +vfio_sync_default_container(void)
> +{
> + struct rte_mp_msg mp_req, *mp_rep;
> + struct rte_mp_reply mp_reply;
> + struct timespec ts = {.tv_sec = 5, .tv_nsec = 0};
> + struct vfio_mp_param *p = (struct vfio_mp_param *)mp_req.param;
> + int iommu_type_id;
> + unsigned int i;
> +
> + /* cannot be called from primary */
> + if (rte_eal_process_type() != RTE_PROC_SECONDARY)
> + return -1;
> +
> + /* default container fd should have been opened in rte_vfio_enable()
> */
> + if (!default_vfio_cfg->vfio_enabled ||
> + default_vfio_cfg->vfio_container_fd < 0) {
> + RTE_LOG(ERR, EAL, "VFIO support is not initialized\n");
> + return -1;
> + }
> +
> + /* find default container's IOMMU type */
> + p->req = SOCKET_REQ_IOMMU_TYPE;

Since this function syncs the IOMMU type for the default container, should we
name the request type SOCKET_REQ_DEFAULT_IOMMU_TYPE?
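
To illustrate why the name matters, here is a rough sketch of a primary-side handler where every request explicitly targets the default container. It is not the EAL multi-process code: the enum values, struct mp_param, struct primary_state and primary_handle() are all hypothetical and only show how a "DEFAULT_" prefix keeps the request scope obvious at the call site.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request and result codes, not the EAL definitions. */
enum mp_req {
	REQ_DEFAULT_CONTAINER,   /* ask for the default container fd */
	REQ_DEFAULT_IOMMU_TYPE,  /* ask for the default container's IOMMU type */
};

enum mp_result { RESULT_OK, RESULT_ERR };

struct mp_param {
	enum mp_req req;
	enum mp_result result;
	int value;
};

struct primary_state {
	int default_container_fd;
	int default_iommu_type;  /* negative while not yet configured */
};

/* Primary side: answer a request coming from a secondary process. */
static void
primary_handle(const struct primary_state *st, struct mp_param *p)
{
	assert(st != NULL && p != NULL);

	p->result = RESULT_ERR;
	switch (p->req) {
	case REQ_DEFAULT_CONTAINER:
		if (st->default_container_fd >= 0) {
			p->value = st->default_container_fd;
			p->result = RESULT_OK;
		}
		break;
	case REQ_DEFAULT_IOMMU_TYPE:
		if (st->default_iommu_type >= 0) {
			p->value = st->default_iommu_type;
			p->result = RESULT_OK;
		}
		break;
	default:
		break;
	}
}
```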

BRs,
Xiao


Re: [dpdk-dev] [PATCH 1/2] vhost: introduce API to get vDPA device number

2019-01-23 Thread Wang, Xiao W
Please ignore this email; I sent it by mistake.

BRs,
Xiao

> -Original Message-
> From: Wang, Xiao W
> Sent: Wednesday, January 23, 2019 8:57 PM
> To: maxime.coque...@redhat.com
> Cc: dev@dpdk.org; Bie, Tiwei ; Ye, Xiaolong
> 
> Subject: [PATCH 1/2] vhost: introduce API to get vDPA device number
> 
> From: Xiaolong Ye 
> 
> It's used to get number of available registered vDPA devices.
> 
> Signed-off-by: Xiaolong Ye 
> Acked-by: Xiao Wang 
> Reviewed-by: Maxime Coquelin 
> ---
>  lib/librte_vhost/rte_vdpa.h| 3 +++
>  lib/librte_vhost/rte_vhost_version.map | 1 +
>  lib/librte_vhost/vdpa.c| 6 ++
>  3 files changed, 10 insertions(+)
> 
> diff --git a/lib/librte_vhost/rte_vdpa.h b/lib/librte_vhost/rte_vdpa.h
> index 90465ca26..b8223e337 100644
> --- a/lib/librte_vhost/rte_vdpa.h
> +++ b/lib/librte_vhost/rte_vdpa.h
> @@ -84,4 +84,7 @@ rte_vdpa_find_device_id(struct rte_vdpa_dev_addr
> *addr);
>  struct rte_vdpa_device * __rte_experimental
>  rte_vdpa_get_device(int did);
> 
> +/* Get current available vdpa device number */
> +int __rte_experimental
> +rte_vdpa_get_device_num(void);
>  #endif /* _RTE_VDPA_H_ */
> diff --git a/lib/librte_vhost/rte_vhost_version.map
> b/lib/librte_vhost/rte_vhost_version.map
> index da220dd02..ae39b6e21 100644
> --- a/lib/librte_vhost/rte_vhost_version.map
> +++ b/lib/librte_vhost/rte_vhost_version.map
> @@ -67,6 +67,7 @@ EXPERIMENTAL {
>   rte_vdpa_unregister_device;
>   rte_vdpa_find_device_id;
>   rte_vdpa_get_device;
> + rte_vdpa_get_device_num;
>   rte_vhost_driver_attach_vdpa_device;
>   rte_vhost_driver_detach_vdpa_device;
>   rte_vhost_driver_get_vdpa_device_id;
> diff --git a/lib/librte_vhost/vdpa.c b/lib/librte_vhost/vdpa.c
> index c82fd4370..c2c5dff1d 100644
> --- a/lib/librte_vhost/vdpa.c
> +++ b/lib/librte_vhost/vdpa.c
> @@ -113,3 +113,9 @@ rte_vdpa_get_device(int did)
> 
>   return vdpa_devices[did];
>  }
> +
> +int
> +rte_vdpa_get_device_num(void)
> +{
> + return vdpa_device_num;
> +}
> --
> 2.15.1



Re: [dpdk-dev] [PATCH v4 2/2] eal/vfio: export internal vfio functions

2018-04-03 Thread Wang, Xiao W
Hi Hemant,

> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Hemant Agrawal
> Sent: Tuesday, April 3, 2018 7:10 PM
> To: dev@dpdk.org
> Cc: Burakov, Anatoly ; tho...@monjalon.net
> Subject: [dpdk-dev] [PATCH v4 2/2] eal/vfio: export internal vfio functions
> 
> This patch moves some of the internal vfio functions from
> eal_vfio.h to rte_vfio.h for common uses with "rte_" prefix.
> 
> This patch also changes the FSLMC bus usage from the internal
> VFIO functions to external ones with "rte_" prefix
> 
> Signed-off-by: Hemant Agrawal 
> Acked-by: Anatoly Burakov 
> ---
>  drivers/bus/fslmc/Makefile |  1 -
>  drivers/bus/fslmc/fslmc_vfio.c |  7 +--
>  drivers/bus/fslmc/fslmc_vfio.h |  2 -
>  drivers/bus/fslmc/meson.build  |  1 -
>  lib/librte_eal/bsdapp/eal/eal.c| 20 +++
>  lib/librte_eal/common/include/rte_vfio.h   | 75
> +-
>  lib/librte_eal/linuxapp/eal/eal_vfio.c | 38 ++---
>  lib/librte_eal/linuxapp/eal/eal_vfio.h | 21 
>  lib/librte_eal/linuxapp/eal/eal_vfio_mp_sync.c |  4 +-
>  lib/librte_eal/rte_eal_version.map |  3 ++
>  10 files changed, 122 insertions(+), 50 deletions(-)
> 
> diff --git a/drivers/bus/fslmc/Makefile b/drivers/bus/fslmc/Makefile
> index 93870ba..3aa34e2 100644
> --- a/drivers/bus/fslmc/Makefile
> +++ b/drivers/bus/fslmc/Makefile
> @@ -16,7 +16,6 @@ CFLAGS += $(WERROR_FLAGS)
>  CFLAGS += -I$(RTE_SDK)/drivers/bus/fslmc
>  CFLAGS += -I$(RTE_SDK)/drivers/bus/fslmc/mc
>  CFLAGS += -I$(RTE_SDK)/drivers/bus/fslmc/qbman/include
> -CFLAGS += -I$(RTE_SDK)/lib/librte_eal/linuxapp/eal
>  CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common
>  LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
>  LDLIBS += -lrte_ethdev
> diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
> index 62499de..f3b2960 100644
> --- a/drivers/bus/fslmc/fslmc_vfio.c
> +++ b/drivers/bus/fslmc/fslmc_vfio.c
> @@ -91,7 +91,8 @@ fslmc_get_container_group(int *groupid)
>   }
> 
>   /* get group number */
> - ret = vfio_get_group_no(SYSFS_FSL_MC_DEVICES, g_container,
> groupid);
> + ret = rte_vfio_get_group_num(SYSFS_FSL_MC_DEVICES,
> +  g_container, groupid);
>   if (ret <= 0) {
>   DPAA2_BUS_ERR("Unable to find %s IOMMU group",
> g_container);
>   return -1;
> @@ -124,7 +125,7 @@ vfio_connect_container(void)
>   }
> 
>   /* Opens main vfio file descriptor which represents the "container" */
> - fd = vfio_get_container_fd();
> + fd = rte_vfio_get_container_fd();
>   if (fd < 0) {
>   DPAA2_BUS_ERR("Failed to open VFIO container");
>   return -errno;
> @@ -620,7 +621,7 @@ fslmc_vfio_setup_group(void)
>   }
> 
>   /* Get the actual group fd */
> - ret = vfio_get_group_fd(groupid);
> + ret = rte_vfio_get_group_fd(groupid);
>   if (ret < 0)
>   return ret;
>   vfio_group.fd = ret;
> diff --git a/drivers/bus/fslmc/fslmc_vfio.h b/drivers/bus/fslmc/fslmc_vfio.h
> index e8fb344..9e2c4fe 100644
> --- a/drivers/bus/fslmc/fslmc_vfio.h
> +++ b/drivers/bus/fslmc/fslmc_vfio.h
> @@ -10,8 +10,6 @@
> 
>  #include 
> 
> -#include "eal_vfio.h"
> -
>  #define DPAA2_MC_DPNI_DEVID  7
>  #define DPAA2_MC_DPSECI_DEVID3
>  #define DPAA2_MC_DPCON_DEVID 5
> diff --git a/drivers/bus/fslmc/meson.build b/drivers/bus/fslmc/meson.build
> index e94340e..78f9d92 100644
> --- a/drivers/bus/fslmc/meson.build
> +++ b/drivers/bus/fslmc/meson.build
> @@ -22,6 +22,5 @@ sources = files('fslmc_bus.c',
> 
>  allow_experimental_apis = true
> 
> -includes += include_directories('../../../lib/librte_eal/linuxapp/eal')
>  includes += include_directories('mc', 'qbman/include', 'portal')
>  cflags += ['-D_GNU_SOURCE']
> diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
> index 4eafcb5..ac23db5 100644
> --- a/lib/librte_eal/bsdapp/eal/eal.c
> +++ b/lib/librte_eal/bsdapp/eal/eal.c
> @@ -781,3 +781,23 @@ int rte_vfio_clear_group(__rte_unused int
> vfio_group_fd)
>  {
>   return 0;
>  }
> +
> +int __rte_experimental
> +rte_vfio_get_group_num(__rte_unused const char *sysfs_base,
> +__rte_unused const char *dev_addr,
> +__rte_unused int *iommu_group_num)
> +{
> + return -1;
> +}
> +
> +int  __rte_experimental
> +rte_vfio_get_container_fd(void)
> +{
> + return -1;
> +}
> +
> +int  __rte_experimental
> +rte_vfio_get_group_fd(__rte_unused int iommu_group_num)
> +{
> + return -1;
> +}

There are no function declarations for the three global APIs above, so I guess
the build will fail on BSD.
You could include rte_vfio.h in this file and remove the dummy prototypes.
My previous patch "eal/vfio: add support for multiple container" does this too.
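
A small sketch of what I mean, assuming the build treats -Wmissing-prototypes as an error: a non-static definition needs a declaration in scope, and getting it from the shared header avoids keeping a second, hand-written copy in the .c file. The vfio_stub_* names are hypothetical and only mirror the shape of the three stubs; in the real tree the declarations would come from including rte_vfio.h.

```c
/*
 * Declarations: in a real tree these would come from the shared public
 * header rather than being repeated in the .c file.
 */
int vfio_stub_get_group_num(const char *sysfs_base, const char *dev_addr,
			    int *iommu_group_num);
int vfio_stub_get_container_fd(void);
int vfio_stub_get_group_fd(int iommu_group_num);

/* Stub definitions for a platform without VFIO support. */
int
vfio_stub_get_group_num(const char *sysfs_base, const char *dev_addr,
			int *iommu_group_num)
{
	(void)sysfs_base;       /* unused on this platform */
	(void)dev_addr;
	(void)iommu_group_num;
	return -1;
}

int
vfio_stub_get_container_fd(void)
{
	return -1;
}

int
vfio_stub_get_group_fd(int iommu_group_num)
{
	(void)iommu_group_num;  /* unused on this platform */
	return -1;
}
```

Removing the three declarations at the top makes a compile with -Wmissing-prototypes warn on each definition, which is the failure I would expect on BSD.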

BRs,
Xiao



Re: [dpdk-dev] [PATCH v5 2/2] eal/vfio: export internal vfio functions

2018-04-05 Thread Wang, Xiao W
Hi Hemant,

> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Hemant Agrawal
> Sent: Wednesday, April 4, 2018 3:49 PM
> To: dev@dpdk.org
> Cc: Burakov, Anatoly ; tho...@monjalon.net
> Subject: [dpdk-dev] [PATCH v5 2/2] eal/vfio: export internal vfio functions
> 
> This patch moves some of the internal vfio functions from
> eal_vfio.h to rte_vfio.h for common uses with "rte_" prefix.
> 
> This patch also changes the FSLMC bus usage from the internal
> VFIO functions to external ones with "rte_" prefix
> 
> Signed-off-by: Hemant Agrawal 
> Acked-by: Anatoly Burakov 
> ---
> v5: fix the bsd compilation
> 
>  drivers/bus/fslmc/Makefile |  1 -
>  drivers/bus/fslmc/fslmc_vfio.c |  7 +--
>  drivers/bus/fslmc/fslmc_vfio.h |  2 -
>  drivers/bus/fslmc/meson.build  |  1 -
>  lib/librte_eal/bsdapp/eal/eal.c| 24 +
>  lib/librte_eal/common/include/rte_vfio.h   | 75
> +-
>  lib/librte_eal/linuxapp/eal/eal_vfio.c | 39 +++---
>  lib/librte_eal/linuxapp/eal/eal_vfio.h | 21 
>  lib/librte_eal/linuxapp/eal/eal_vfio_mp_sync.c |  4 +-
>  lib/librte_eal/rte_eal_version.map |  3 ++
>  10 files changed, 127 insertions(+), 50 deletions(-)
> 
> diff --git a/drivers/bus/fslmc/Makefile b/drivers/bus/fslmc/Makefile
> index 93870ba..3aa34e2 100644
> --- a/drivers/bus/fslmc/Makefile
> +++ b/drivers/bus/fslmc/Makefile
> @@ -16,7 +16,6 @@ CFLAGS += $(WERROR_FLAGS)
>  CFLAGS += -I$(RTE_SDK)/drivers/bus/fslmc
>  CFLAGS += -I$(RTE_SDK)/drivers/bus/fslmc/mc
>  CFLAGS += -I$(RTE_SDK)/drivers/bus/fslmc/qbman/include
> -CFLAGS += -I$(RTE_SDK)/lib/librte_eal/linuxapp/eal
>  CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common
>  LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
>  LDLIBS += -lrte_ethdev
> diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
> index 62499de..f3b2960 100644
> --- a/drivers/bus/fslmc/fslmc_vfio.c
> +++ b/drivers/bus/fslmc/fslmc_vfio.c
> @@ -91,7 +91,8 @@ fslmc_get_container_group(int *groupid)
>   }
> 
>   /* get group number */
> - ret = vfio_get_group_no(SYSFS_FSL_MC_DEVICES, g_container,
> groupid);
> + ret = rte_vfio_get_group_num(SYSFS_FSL_MC_DEVICES,
> +  g_container, groupid);
>   if (ret <= 0) {
>   DPAA2_BUS_ERR("Unable to find %s IOMMU group",
> g_container);
>   return -1;
> @@ -124,7 +125,7 @@ vfio_connect_container(void)
>   }
> 
>   /* Opens main vfio file descriptor which represents the "container" */
> - fd = vfio_get_container_fd();
> + fd = rte_vfio_get_container_fd();
>   if (fd < 0) {
>   DPAA2_BUS_ERR("Failed to open VFIO container");
>   return -errno;
> @@ -620,7 +621,7 @@ fslmc_vfio_setup_group(void)
>   }
> 
>   /* Get the actual group fd */
> - ret = vfio_get_group_fd(groupid);
> + ret = rte_vfio_get_group_fd(groupid);
>   if (ret < 0)
>   return ret;
>   vfio_group.fd = ret;
> diff --git a/drivers/bus/fslmc/fslmc_vfio.h b/drivers/bus/fslmc/fslmc_vfio.h
> index e8fb344..9e2c4fe 100644
> --- a/drivers/bus/fslmc/fslmc_vfio.h
> +++ b/drivers/bus/fslmc/fslmc_vfio.h
> @@ -10,8 +10,6 @@
> 
>  #include 
> 
> -#include "eal_vfio.h"
> -
>  #define DPAA2_MC_DPNI_DEVID  7
>  #define DPAA2_MC_DPSECI_DEVID3
>  #define DPAA2_MC_DPCON_DEVID 5
> diff --git a/drivers/bus/fslmc/meson.build b/drivers/bus/fslmc/meson.build
> index e94340e..78f9d92 100644
> --- a/drivers/bus/fslmc/meson.build
> +++ b/drivers/bus/fslmc/meson.build
> @@ -22,6 +22,5 @@ sources = files('fslmc_bus.c',
> 
>  allow_experimental_apis = true
> 
> -includes += include_directories('../../../lib/librte_eal/linuxapp/eal')
>  includes += include_directories('mc', 'qbman/include', 'portal')
>  cflags += ['-D_GNU_SOURCE']
> diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
> index 4eafcb5..e2f2df1 100644
> --- a/lib/librte_eal/bsdapp/eal/eal.c
> +++ b/lib/librte_eal/bsdapp/eal/eal.c
> @@ -746,6 +746,10 @@ int rte_vfio_enable(const char *modname);
>  int rte_vfio_is_enabled(const char *modname);
>  int rte_vfio_noiommu_is_enabled(void);
>  int rte_vfio_clear_group(int vfio_group_fd);
> +int rte_vfio_get_group_num(const char *sysfs_base, const char *dev_addr,
> +int *iommu_group_num);
> +int rte_vfio_get_container_fd(void);
> +int rte_vfio_get_group_fd(int iommu_group_num);

Considering the "group_no" field defined in eal_vfio.h, will "iommu_group_num"
cause a naming inconsistency?
 
/*
 * we don't need to store device fd's anywhere since they can be obtained from
 * the group fd via an ioctl() call.
 */
struct vfio_group {
int group_no;
int fd;
int devices;
};

BRs,
Xiao


Re: [dpdk-dev] [PATCH v5 2/2] eal/vfio: export internal vfio functions

2018-04-05 Thread Wang, Xiao W
Yes, it's private. We could do that if really needed.

BRs,
Xiao
> -Original Message-
> From: Thomas Monjalon [mailto:tho...@monjalon.net]
> Sent: Thursday, April 5, 2018 6:23 PM
> To: Wang, Xiao W 
> Cc: Hemant Agrawal ; dev@dpdk.org; Burakov,
> Anatoly 
> Subject: Re: [dpdk-dev] [PATCH v5 2/2] eal/vfio: export internal vfio 
> functions
> 
> 05/04/2018 11:03, Wang, Xiao W:
> 
> > > +int rte_vfio_get_group_num(const char *sysfs_base, const char
> *dev_addr,
> > > +int *iommu_group_num);
> > > +int rte_vfio_get_container_fd(void);
> > > +int rte_vfio_get_group_fd(int iommu_group_num);
> >
> > Considering the "group_no" field defined in eal_vfio.h, will
> "iommu_group_num" cause inconsistency
> > In naming?
> 
> I asked to change the function name to "num" because it is more meaningful.
> "group_no" field is private? Can it be renamed?
> 



Re: [dpdk-dev] [PATCH v5 0/4] add ifcvf vdpa driver

2018-04-11 Thread Wang, Xiao W
Hi,

> -Original Message-
> From: Yigit, Ferruh
> Sent: Thursday, April 12, 2018 2:59 AM
> To: Wang, Xiao W 
> Cc: maxime.coque...@redhat.com; dev@dpdk.org; Wang, Zhihong
> ; Tan, Jianfeng ; Bie,
> Tiwei ; Liang, Cunming ;
> Daly, Dan ; tho...@monjalon.net;
> gaetan.ri...@6wind.com; Burakov, Anatoly ;
> hemant.agra...@nxp.com
> Subject: Re: [PATCH v5 0/4] add ifcvf vdpa driver
> 
> On 4/5/2018 7:06 PM, Xiao Wang wrote:
> > This patch set has dependency on
> http://dpdk.org/dev/patchwork/patch/36772/
> > (vhost: support selective datapath).
> >
> > IFCVF driver
> > 
> > The IFCVF vDPA (vhost data path acceleration) driver provides support for
> the
> > Intel FPGA 100G VF (IFCVF). IFCVF's datapath is virtio ring compatible, it
> > works as a HW vhost backend which can send/receive packets to/from virtio
> > directly by DMA. Besides, it supports dirty page logging and device state
> > report/restore. This driver enables its vDPA functionality with live 
> > migration
> > feature.
> >
> > vDPA mode
> > =
> > IFCVF's vendor ID and device ID are same as that of virtio net pci device,
> > with its specific subsystem vendor ID and device ID. To let the device be
> > probed by IFCVF driver, adding "vdpa=1" parameter helps to specify that this
> > device is to be used in vDPA mode, rather than polling mode, virtio pmd will
> > skip when it detects this message.
> >
> > Container per device
> > 
> > vDPA needs to create different containers for different devices, thus this
> > patch set adds some APIs in eal/vfio to support multiple container, e.g.
> > - rte_vfio_create_container
> > - rte_vfio_destroy_container
> > - rte_vfio_bind_group
> > - rte_vfio_unbind_group
> >
> > By this extension, a device can be put into a new specific container, rather
> > than the previous default container.
> >
> > IFCVF vDPA details
> > ==
> > Key vDPA driver ops implemented:
> > - ifcvf_dev_config:
> >   Enable VF data path with virtio information provided by vhost lib, 
> > including
> >   IOMMU programming to enable VF DMA to VM's memory, VFIO interrupt
> setup to
> >   route HW interrupt to virtio driver, create notify relay thread to 
> > translate
> >   virtio driver's kick to a MMIO write onto HW, HW queues configuration.
> >
> >   This function gets called to set up HW data path backend when virtio 
> > driver
> >   in VM gets ready.
> >
> > - ifcvf_dev_close:
> >   Revoke all the setup in ifcvf_dev_config.
> >
> >   This function gets called when virtio driver stops device in VM.
> >
> > Change log
> > ==
> > v5:
> > - Fix compilation in BSD, remove the rte_vfio.h including in BSD.
> >
> > v4:
> > - Rebase on Zhihong's latest vDPA lib patch, with vDPA ops names change.
> > - Remove API "rte_vfio_get_group_fd", "rte_vfio_bind_group" will return the
> fd.
> > - Align the vfio_cfg search internal APIs naming.
> >
> > v3:
> > - Add doc and release note for the new driver.
> > - Remove the vdev concept, make the driver as a PCI driver, it will get 
> > probed
> >   by PCI bus driver.
> > - Rebase on the v4 vDPA lib patch, register a vDPA device instead of a 
> > engine.
> > - Remove the PCI API exposure accordingly.
> > - Move the MAX_VFIO_CONTAINERS definition to config file.
> > - Let virtio pmd skips when a virtio device needs to work in vDPA mode.
> >
> > v2:
> > - Rename function pci_get_kernel_driver_by_path to
> rte_pci_device_kdriver_name
> >   to make the API generic cross Linux and BSD, make it as EXPERIMENTAL.
> > - Rebase on Zhihong's vDPA v3 patch set.
> > - Minor code cleanup on vfio extension.
> >
> >
> > Xiao Wang (4):
> >   eal/vfio: add multiple container support
> >   net/virtio: skip device probe in vdpa mode
> >   net/ifcvf: add ifcvf vdpa driver
> >   doc: add ifcvf driver document and release note
> 
> Hi Xiao,
> 
> Current patch doesn't apply cleanly after latest updates, can you please 
> rebase
> it onto latest next-net, also there are a few minor comments I put into
> individual patches can you please check them?
> 
> After above changes done, please add for series:
> Reviewed-by: Ferruh Yigit 

Thanks, will update according to that.

BRs,
Xiao


Re: [dpdk-dev] [PATCH v6 1/4] eal/vfio: add multiple container support

2018-04-12 Thread Wang, Xiao W
Hi Anatoly,

> -Original Message-
> From: Burakov, Anatoly
> Sent: Thursday, April 12, 2018 10:04 PM
> To: Wang, Xiao W ; Yigit, Ferruh
> 
> Cc: dev@dpdk.org; maxime.coque...@redhat.com; Wang, Zhihong
> ; Bie, Tiwei ; Tan, Jianfeng
> ; Liang, Cunming ; Daly,
> Dan ; tho...@monjalon.net; gaetan.ri...@6wind.com;
> hemant.agra...@nxp.com; Chen, Junjie J 
> Subject: Re: [PATCH v6 1/4] eal/vfio: add multiple container support
> 
> On 12-Apr-18 8:19 AM, Xiao Wang wrote:
> > Currently eal vfio framework binds vfio group fd to the default
> > container fd during rte_vfio_setup_device, while in some cases,
> > e.g. vDPA (vhost data path acceleration), we want to put vfio group
> > to a separate container and program IOMMU via this container.
> >
> > This patch adds some APIs to support container creating and device
> > binding with a container.
> >
> > A driver could use "rte_vfio_create_container" helper to create a
> > new container from eal, use "rte_vfio_bind_group" to bind a device
> > to the newly created container.
> >
> > During rte_vfio_setup_device, the container bound with the device
> > will be used for IOMMU setup.
> >
> > Signed-off-by: Junjie Chen 
> > Signed-off-by: Xiao Wang 
> > Reviewed-by: Maxime Coquelin 
> > Reviewed-by: Ferruh Yigit 
> > ---
> 
> Apologies for late review. Some comments below.
> 
> <...>
> 
> >
> > +struct rte_memseg;
> > +
> >   /**
> >* Setup vfio_cfg for the device identified by its address.
> >* It discovers the configured I/O MMU groups or sets a new one for the
> device.
> > @@ -131,6 +133,117 @@ rte_vfio_clear_group(int vfio_group_fd);
> >   }
> >   #endif
> >
> 
> <...>
> 
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change, or be removed, without prior
> notice
> > + *
> > + * Perform dma mapping for devices in a conainer.
> > + *
> > + * @param container_fd
> > + *   the specified container fd
> > + *
> > + * @param dma_type
> > + *   the dma map type
> > + *
> > + * @param ms
> > + *   the dma address region to map
> > + *
> > + * @return
> > + *0 if successful
> > + *   <0 if failed
> > + */
> > +int __rte_experimental
> > +rte_vfio_dma_map(int container_fd, int dma_type, const struct
> rte_memseg *ms);
> > +
> 
> First of all, why memseg, instead of va/iova/len? This seems like
> unnecessary attachment to internals of DPDK memory representation. Not
> all memory comes in memsegs, this makes the API unnecessarily specific
> to DPDK memory.

Agree, will use va/iova/len.
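
As a rough sketch of that direction (stand-in types only, not the actual EAL
code; sketch_region_table, sketch_dma_map and sketch_dma_unmap are hypothetical
names), a mapping can be described by plain va/iova/len values so that memory
which does not come from a memseg can be handled the same way:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_MAPS 4 /* hypothetical small limit for this sketch */

/* One mapping described by plain addresses, not by a DPDK memseg. */
struct sketch_mem_map {
	uint64_t addr; /* virtual address */
	uint64_t iova; /* IO virtual address seen by the device */
	uint64_t len;  /* length in bytes */
};

/* Hypothetical per-container bookkeeping of mapped regions. */
struct sketch_region_table {
	struct sketch_mem_map maps[SKETCH_MAX_MAPS];
	int n_maps;
};

/* Record a mapping; 0 on success, -1 on bad input or a full table.
 * Real code would also issue the VFIO DMA map request here. */
int
sketch_dma_map(struct sketch_region_table *t, uint64_t vaddr, uint64_t iova,
	       uint64_t len)
{
	assert(t != NULL);
	if (len == 0 || t->n_maps >= SKETCH_MAX_MAPS)
		return -1;
	t->maps[t->n_maps].addr = vaddr;
	t->maps[t->n_maps].iova = iova;
	t->maps[t->n_maps].len = len;
	t->n_maps++;
	return 0;
}

/* Drop an exactly matching mapping; 0 if found, -1 otherwise. */
int
sketch_dma_unmap(struct sketch_region_table *t, uint64_t vaddr, uint64_t iova,
		 uint64_t len)
{
	int i;

	assert(t != NULL);
	for (i = 0; i < t->n_maps; i++) {
		if (t->maps[i].addr == vaddr && t->maps[i].iova == iova &&
		    t->maps[i].len == len) {
			/* Fill the hole with the last entry. */
			t->maps[i] = t->maps[t->n_maps - 1];
			t->n_maps--;
			return 0;
		}
	}
	return -1;
}
```

The caller only needs three integers per region, so nothing in the interface
depends on how DPDK represents its own memory internally.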

> 
> Also, why providing DMA type? There's already a VFIO type pointer in
> vfio_config - you can set this pointer for every new created container,
> so the user wouldn't have to care about IOMMU type. Is it not possible
> to figure out DMA type from within EAL VFIO? If not, maybe provide an
> API to do so, e.g. rte_vfio_container_set_dma_type()?

It's possible; EAL VFIO should be able to figure out a container's DMA type.

> 
> This will also need to be rebased on top of latest HEAD because there
> already is a similar DMA map/unmap API added, only without the container
> parameter. Perhaps rename these new functions to
> rte_vfio_container_(create|destroy|dma_map|dma_unmap)?

OK, will check the latest HEAD and rebase on that.

> 
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change, or be removed, without prior
> notice
> > + *
> > + * Perform dma unmapping for devices in a conainer.
> > + *
> > + * @param container_fd
> > + *   the specified container fd
> > + *
> > + * @param dma_type
> > + *the dma map type
> > + *
> > + * @param ms
> > + *   the dma address region to unmap
> > + *
> > + * @return
> > + *0 if successful
> > + *   <0 if failed
> > + */
> > +int __rte_experimental
> > +rte_vfio_dma_unmap(int container_fd, int dma_type, const struct
> rte_memseg *ms);
> > +
> >   #endif /* VFIO_PRESENT */
> >
> 
> <...>
> 
> > @@ -75,8 +53,8 @@ vfio_get_group_fd(int iommu_group_no)
> > if (vfio_group_fd < 0) {
> > /* if file not found, it's not an error */
> > if (errno != ENOENT) {
> > -   RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
> filename,
> > -   strerror(errno

Re: [dpdk-dev] [PATCH v6 1/4] eal/vfio: add multiple container support

2018-04-13 Thread Wang, Xiao W


> -Original Message-
> From: Burakov, Anatoly
> Sent: Friday, April 13, 2018 12:24 AM
> To: Wang, Xiao W ; Yigit, Ferruh
> 
> Cc: dev@dpdk.org; maxime.coque...@redhat.com; Wang, Zhihong
> ; Bie, Tiwei ; Tan, Jianfeng
> ; Liang, Cunming ; Daly,
> Dan ; tho...@monjalon.net; gaetan.ri...@6wind.com;
> hemant.agra...@nxp.com; Chen, Junjie J 
> Subject: Re: [PATCH v6 1/4] eal/vfio: add multiple container support
> 
> On 12-Apr-18 5:07 PM, Wang, Xiao W wrote:
> > Hi Anatoly,
> >
> 
> <...>
> 
> >>
> >> Also, why providing DMA type? There's already a VFIO type pointer in
> >> vfio_config - you can set this pointer for every new created container,
> >> so the user wouldn't have to care about IOMMU type. Is it not possible
> >> to figure out DMA type from within EAL VFIO? If not, maybe provide an
> >> API to do so, e.g. rte_vfio_container_set_dma_type()?
> >
> > It's possible, EAL VFIO should be able to figure out a container's DMA type.
> 
> You probably won't be able to do it until you add a group into the
> container, so probably best place to do it would be on group_bind?

Yes, the IOMMU type pointer could be set at group bind time.
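
A minimal sketch of that idea, assuming stand-in types (sketch_iommu_type,
sketch_typed_container and the fixed probe result are all hypothetical, not
the real EAL structures): the type pointer stays NULL until the first group
is bound, and DMA mapping is only allowed afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for an IOMMU type descriptor. */
struct sketch_iommu_type {
	int type_id;
	const char *name;
};

/* Stand-in for a container's configuration. */
struct sketch_typed_container {
	const struct sketch_iommu_type *iommu; /* NULL until a group is bound */
	int n_groups;
};

static const struct sketch_iommu_type sketch_type1 = { 1, "Type 1" };

/* Hypothetical probe: real code would query the container once a group
 * is attached. This sketch always reports the same fixed type. */
const struct sketch_iommu_type *
sketch_probe_iommu(void)
{
	return &sketch_type1;
}

/* Bind a group and, on the first bind, remember the IOMMU type. */
int
sketch_group_bind(struct sketch_typed_container *c)
{
	assert(c != NULL);
	c->n_groups++;
	if (c->iommu == NULL)
		c->iommu = sketch_probe_iommu();
	return 0;
}

/* DMA mapping only makes sense once the type is known. */
int
sketch_dma_map_allowed(const struct sketch_typed_container *c)
{
	assert(c != NULL);
	return c->iommu != NULL;
}
```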

BRs,
Xiao

> 
> --
> Thanks,
> Anatoly


Re: [dpdk-dev] [PATCH v7 1/5] vfio: extend data structure for multi container

2018-04-16 Thread Wang, Xiao W
Hi Anatoly,

> -Original Message-
> From: Burakov, Anatoly
> Sent: Monday, April 16, 2018 6:03 PM
> To: Wang, Xiao W ; Yigit, Ferruh
> 
> Cc: dev@dpdk.org; maxime.coque...@redhat.com; Wang, Zhihong
> ; Bie, Tiwei ; Tan, Jianfeng
> ; Liang, Cunming ; Daly,
> Dan ; tho...@monjalon.net; Chen, Junjie J
> 
> Subject: Re: [PATCH v7 1/5] vfio: extend data structure for multi container
> 
> On 15-Apr-18 4:33 PM, Xiao Wang wrote:
> > Currently eal vfio framework binds vfio group fd to the default
> > container fd during rte_vfio_setup_device, while in some cases,
> > e.g. vDPA (vhost data path acceleration), we want to put vfio group
> > to a separate container and program IOMMU via this container.
> >
> > This patch extends the vfio_config structure to contain per-container
> > user_mem_maps and defines an array of vfio_config. The next patch will
> > base on this to add container API.
> >
> > Signed-off-by: Junjie Chen 
> > Signed-off-by: Xiao Wang 
> > Reviewed-by: Maxime Coquelin 
> > Reviewed-by: Ferruh Yigit 
> > ---
> >   config/common_base |   1 +
> >   lib/librte_eal/linuxapp/eal/eal_vfio.c | 407 ++---
> 
> >   lib/librte_eal/linuxapp/eal/eal_vfio.h |  19 +-
> >   3 files changed, 275 insertions(+), 152 deletions(-)
> >
> > diff --git a/config/common_base b/config/common_base
> > index c4236fd1f..4a76d2f14 100644
> > --- a/config/common_base
> > +++ b/config/common_base
> > @@ -87,6 +87,7 @@ CONFIG_RTE_EAL_ALWAYS_PANIC_ON_ERROR=n
> >   CONFIG_RTE_EAL_IGB_UIO=n
> >   CONFIG_RTE_EAL_VFIO=n
> >   CONFIG_RTE_MAX_VFIO_GROUPS=64
> > +CONFIG_RTE_MAX_VFIO_CONTAINERS=64
> >   CONFIG_RTE_MALLOC_DEBUG=n
> >   CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES=n
> >   CONFIG_RTE_USE_LIBBSD=n
> > diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > index 589d7d478..46fba2d8d 100644
> > --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > @@ -22,8 +22,46 @@
> >
> >   #define VFIO_MEM_EVENT_CLB_NAME "vfio_mem_event_clb"
> >
> > +/*
> > + * we don't need to store device fd's anywhere since they can be obtained
> from
> > + * the group fd via an ioctl() call.
> > + */
> > +struct vfio_group {
> > +   int group_no;
> > +   int fd;
> > +   int devices;
> > +};
> 
> What is the purpose of moving this into .c file? Seems like an
> unnecessary change.

Yes, we can let vfio_group stay in the .h file and move vfio_config into the .c file.

> 
> > +
> > +/* hot plug/unplug of VFIO groups may cause all DMA maps to be dropped.
> we can
> > + * recreate the mappings for DPDK segments, but we cannot do so for
> memory that
> > + * was registered by the user themselves, so we need to store the user
> mappings
> > + * somewhere, to recreate them later.
> > + */
> > +#define VFIO_MAX_USER_MEM_MAPS 256
> > +struct user_mem_map {
> > +   uint64_t addr;
> > +   uint64_t iova;
> > +   uint64_t len;
> > +};
> > +
> 
> <...>
> 
> > +static struct vfio_config *
> > +get_vfio_cfg_by_group_no(int iommu_group_no)
> > +{
> > +   struct vfio_config *vfio_cfg;
> > +   int i, j;
> > +
> > +   for (i = 0; i < VFIO_MAX_CONTAINERS; i++) {
> > +   vfio_cfg = &vfio_cfgs[i];
> > +   for (j = 0; j < VFIO_MAX_GROUPS; j++) {
> > +   if (vfio_cfg->vfio_groups[j].group_no ==
> > +   iommu_group_no)
> > +   return vfio_cfg;
> > +   }
> > +   }
> > +
> > +   return default_vfio_cfg;
> 
> Here and in other places: i'm not sure returning default vfio config if
> group not found is such a good idea. It would be better if calling code
> explicitly handled case of group not existing yet.

Agree. The calling code will handle the missing-group case explicitly.
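
For illustration, a lookup along those lines could report "not found" instead
of falling back to the default config. This is only a sketch with stand-in
types; sketch_cfg, sketch_cfgs and the two helpers are hypothetical names and
the array sizes are arbitrary:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_CONTAINERS 2
#define SKETCH_MAX_GROUPS 4

struct sketch_group {
	int group_no;
	int fd;
	int devices;
};

struct sketch_cfg {
	struct sketch_group groups[SKETCH_MAX_GROUPS];
};

struct sketch_cfg sketch_cfgs[SKETCH_MAX_CONTAINERS];

/* Mark every group slot as unused. */
void
sketch_cfgs_init(void)
{
	int i, j;

	for (i = 0; i < SKETCH_MAX_CONTAINERS; i++) {
		for (j = 0; j < SKETCH_MAX_GROUPS; j++) {
			sketch_cfgs[i].groups[j].group_no = -1;
			sketch_cfgs[i].groups[j].fd = -1;
			sketch_cfgs[i].groups[j].devices = 0;
		}
	}
}

/* Return the config holding group_no, or NULL if no container has it.
 * The caller decides what to do when the group does not exist yet. */
struct sketch_cfg *
sketch_cfg_by_group_no(int group_no)
{
	int i, j;

	assert(group_no >= 0);
	for (i = 0; i < SKETCH_MAX_CONTAINERS; i++) {
		for (j = 0; j < SKETCH_MAX_GROUPS; j++) {
			if (sketch_cfgs[i].groups[j].group_no == group_no)
				return &sketch_cfgs[i];
		}
	}
	return NULL;
}
```

A caller that wants the old behaviour can still pick the default config itself
when NULL comes back, which keeps that decision visible at the call site.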

> 
> > +}
> > +
> > +static struct vfio_config *
> > +get_vfio_cfg_by_group_fd(int vfio_group_fd)
> > +{
> > +   struct vfio_config *vfio_cfg;
> > +   int i, j;
> > +
> > +   for (i = 0; i < VFIO_MAX_CONTAINERS; i++) {
> > +   vfio_cfg = &vfio_cfgs[i];
> > +   for (j = 0; j < VFIO_MAX_GROUPS; j++)
> > +   if (vfio_cfg->vfio_groups[j].fd == vfio_group_fd)
> > +   return vfio_cfg;
> > +   }
> >
> 
> <...>
> 
> > -   for (i = 0; i < VFIO_MAX_GROUPS; i++) {
> > -   v

Re: [dpdk-dev] [PATCH v7 2/5] vfio: add multi container support

2018-04-16 Thread Wang, Xiao W
Hi Anatoly,

> -Original Message-
> From: Burakov, Anatoly
> Sent: Monday, April 16, 2018 6:03 PM
> To: Wang, Xiao W ; Yigit, Ferruh
> 
> Cc: dev@dpdk.org; maxime.coque...@redhat.com; Wang, Zhihong
> ; Bie, Tiwei ; Tan, Jianfeng
> ; Liang, Cunming ; Daly,
> Dan ; tho...@monjalon.net; Chen, Junjie J
> 
> Subject: Re: [PATCH v7 2/5] vfio: add multi container support
> 
> On 15-Apr-18 4:33 PM, Xiao Wang wrote:
> > This patch adds APIs to support container create/destroy and device
> > bind/unbind with a container. It also provides API for IOMMU programing
> > on a specified container.
> >
> > A driver could use "rte_vfio_create_container" helper to create a
> 
> ^^ wrong API name in commit message :)

Thanks for the catch. Will fix it.

> 
> > new container from eal, use "rte_vfio_bind_group" to bind a device
> > to the newly created container. During rte_vfio_setup_device the
> > container bound with the device will be used for IOMMU setup.
> >
> > Signed-off-by: Junjie Chen 
> > Signed-off-by: Xiao Wang 
> > Reviewed-by: Maxime Coquelin 
> > Reviewed-by: Ferruh Yigit 
> > ---
> >   lib/librte_eal/bsdapp/eal/eal.c  |  52 +
> >   lib/librte_eal/common/include/rte_vfio.h | 119 
> >   lib/librte_eal/linuxapp/eal/eal_vfio.c   | 316
> +++
> >   lib/librte_eal/rte_eal_version.map   |   6 +
> >   4 files changed, 493 insertions(+)
> >
> > diff --git a/lib/librte_eal/bsdapp/eal/eal.c 
> > b/lib/librte_eal/bsdapp/eal/eal.c
> > index 727adc5d2..c5106d0d6 100644
> > --- a/lib/librte_eal/bsdapp/eal/eal.c
> > +++ b/lib/librte_eal/bsdapp/eal/eal.c
> > @@ -769,6 +769,14 @@ int rte_vfio_noiommu_is_enabled(void);
> >   int rte_vfio_clear_group(int vfio_group_fd);
> >   int rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len);
> >   int rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len);
> > +int rte_vfio_container_create(void);
> > +int rte_vfio_container_destroy(int container_fd);
> > +int rte_vfio_bind_group(int container_fd, int iommu_group_no);
> > +int rte_vfio_unbind_group(int container_fd, int iommu_group_no);
> 
> Maybe have these under "container" too? e.g.
> rte_vfio_container_group_bind/unbind? Seems like it would be more
> consistent that way - anything to do with custom containers would be
> under rte_vfio_container_* namespace.

Agree.

> 
> > +int rte_vfio_container_dma_map(int container_fd, uint64_t vaddr,
> > +   uint64_t iova, uint64_t len);
> > +int rte_vfio_container_dma_unmap(int container_fd, uint64_t vaddr,
> > +   uint64_t iova, uint64_t len);
> >
> >   int rte_vfio_setup_device(__rte_unused const char *sysfs_base,
> >   __rte_unused const char *dev_addr,
> > @@ -818,3 +826,47 @@ rte_vfio_dma_unmap(uint64_t __rte_unused vaddr,
> uint64_t __rte_unused iova,
> >   {
> > return -1;
> >   }
> > +
> 
> <...>
> 
> > diff --git a/lib/librte_eal/common/include/rte_vfio.h
> b/lib/librte_eal/common/include/rte_vfio.h
> > index d26ab01cb..0c1509b29 100644
> > --- a/lib/librte_eal/common/include/rte_vfio.h
> > +++ b/lib/librte_eal/common/include/rte_vfio.h
> > @@ -168,6 +168,125 @@ rte_vfio_dma_map(uint64_t vaddr, uint64_t iova,
> uint64_t len);
> >   int __rte_experimental
> >   rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len);
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change, or be removed, without prior
> notice
> > + *
> > + * Create a new container for device binding.
> 
> I would add a note that any newly allocated DPDK memory will not be
> mapped into these containers by default.

Will add it.

> 
> > + *
> > + * @return
> > + *   the container fd if successful
> > + *   <0 if failed
> > + */
> > +int __rte_experimental
> > +rte_vfio_container_create(void);
> > +
> 
> <...>
> 
> > + *0 if successful
> > + *   <0 if failed
> > + */
> > +int __rte_experimental
> > +rte_vfio_unbind_group(int container_fd, int iommu_group_no);
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change, or be removed, without prior
> notice
> > + *
> > + * Perform dma mapping for devices in a conainer.
> 
> Here and in other places: "dma" should be DMA, and typo: "conainer" :)
> 
> I think you should also add a note to the original API (not this one,
> but the old one) that DMA m

Re: [dpdk-dev] [PATCH v7 3/3] net/virtio: support GUEST ANNOUNCE

2018-01-09 Thread Wang, Xiao W
Hi,

> -Original Message-
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Tuesday, January 9, 2018 4:50 PM
> To: Wang, Xiao W ; y...@fridaylinux.org
> Cc: Bie, Tiwei ; dev@dpdk.org;
> step...@networkplumber.org
> Subject: Re: [dpdk-dev] [PATCH v7 3/3] net/virtio: support GUEST ANNOUNCE
> 
> 
> 
> On 01/09/2018 03:26 PM, Xiao Wang wrote:
> > When live migration is done, for the backup VM, either the virtio
> > frontend or the vhost backend needs to send out gratuitous RARP packet
> > to announce its new network location.
> >
> > This patch enables VIRTIO_NET_F_GUEST_ANNOUNCE feature to support
> live
> > migration scenario where the vhost backend doesn't have the ability to
> > generate RARP packet.
> >
> > Brief introduction of the work flow:
> > 1. QEMU finishes live migration, pokes the backup VM with an interrupt.
> > 2. Virtio interrupt handler reads out the interrupt status value, and
> > realizes it needs to send out RARP packet to announce its location.
> > 3. Pause device to stop worker thread touching the queues.
> > 4. Inject a RARP packet into a Tx Queue.
> > 5. Ack the interrupt via control queue.
> > 6. Resume device to continue packet processing.
> >
> > Signed-off-by: Xiao Wang 
> > ---
> >   drivers/net/virtio/virtio_ethdev.c | 95
> +-
> >   drivers/net/virtio/virtio_ethdev.h |  1 +
> >   drivers/net/virtio/virtqueue.h | 11 +
> >   3 files changed, 105 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/virtio/virtio_ethdev.c
> b/drivers/net/virtio/virtio_ethdev.c
> > index e8ff1e449..9606df514 100644
> > --- a/drivers/net/virtio/virtio_ethdev.c
> > +++ b/drivers/net/virtio/virtio_ethdev.c
> > @@ -19,6 +19,8 @@
> >   #include 
> >   #include 
> >   #include 
> > +#include 
> > +#include 
> >   #include 
> >   #include 
> >   #include 
> > @@ -78,6 +80,11 @@ static int virtio_dev_queue_stats_mapping_set(
> > uint8_t stat_idx,
> > uint8_t is_rx);
> >
> > +static int make_rarp_packet(struct rte_mbuf *rarp_mbuf,
> > +   const struct ether_addr *mac);
> > +static void virtio_notify_peers(struct rte_eth_dev *dev);
> > +static void virtio_ack_link_announce(struct rte_eth_dev *dev);
> > +
> >   /*
> >* The set of PCI devices this driver supports
> >*/
> > @@ -1272,9 +1279,89 @@ virtio_inject_pkts(struct rte_eth_dev *dev, struct
> rte_mbuf **tx_pkts,
> > return ret;
> >   }
> >
> > +#define RARP_PKT_SIZE  64
> > +static int
> > +make_rarp_packet(struct rte_mbuf *rarp_mbuf, const struct ether_addr
> *mac)
> > +{
> > +   struct ether_hdr *eth_hdr;
> > +   struct arp_hdr *rarp;
> > +
> > +   if (rarp_mbuf->buf_len < RARP_PKT_SIZE) {
> > +   PMD_DRV_LOG(ERR, "mbuf size too small %u (< %d)",
> > +   rarp_mbuf->buf_len, RARP_PKT_SIZE);
> > +   return -1;
> > +   }
> > +
> > +   /* Ethernet header. */
> > +   eth_hdr = rte_pktmbuf_mtod(rarp_mbuf, struct ether_hdr *);
> > +   memset(eth_hdr->d_addr.addr_bytes, 0xff, ETHER_ADDR_LEN);
> > +   ether_addr_copy(mac, &eth_hdr->s_addr);
> > +   eth_hdr->ether_type = htons(ETHER_TYPE_RARP);
> > +
> > +   /* RARP header. */
> > +   rarp = (struct arp_hdr *)(eth_hdr + 1);
> > +   rarp->arp_hrd = htons(ARP_HRD_ETHER);
> > +   rarp->arp_pro = htons(ETHER_TYPE_IPv4);
> > +   rarp->arp_hln = ETHER_ADDR_LEN;
> > +   rarp->arp_pln = 4;
> > +   rarp->arp_op  = htons(ARP_OP_REVREQUEST);
> > +
> > +   ether_addr_copy(mac, &rarp->arp_data.arp_sha);
> > +   ether_addr_copy(mac, &rarp->arp_data.arp_tha);
> > +   memset(&rarp->arp_data.arp_sip, 0x00, 4);
> > +   memset(&rarp->arp_data.arp_tip, 0x00, 4);
> > +
> > +   rarp_mbuf->data_len = RARP_PKT_SIZE;
> > +   rarp_mbuf->pkt_len = RARP_PKT_SIZE;
> > +
> > +   return 0;
> > +}
> 
> Do you think it could make sense to have this function in a lib, as
> vhost user lib does exactly the same?
> 
> I don't know if it could be useful to others than vhost/virtio though.
> 
> Thanks,
> Maxime

Hi Thomas,

Do you think it's worth adding a new helper for ARP in lib/librte_net/?
Currently we just need a helper to build a RARP packet (the make_rarp_packet()
above).

BRs,
Xiao


Re: [dpdk-dev] [PATCH v7 3/3] net/virtio: support GUEST ANNOUNCE

2018-01-09 Thread Wang, Xiao W
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> On 01/09/2018 03:26 PM, Xiao Wang wrote:
> > When live migration is done, for the backup VM, either the virtio
> > frontend or the vhost backend needs to send out gratuitous RARP packet
> > to announce its new network location.
> >
> > This patch enables VIRTIO_NET_F_GUEST_ANNOUNCE feature to support
> live
> > migration scenario where the vhost backend doesn't have the ability to
> > generate RARP packet.
> >
> > Brief introduction of the work flow:
> > 1. QEMU finishes live migration, pokes the backup VM with an interrupt.
> > 2. Virtio interrupt handler reads out the interrupt status value, and
> > realizes it needs to send out RARP packet to announce its location.
> > 3. Pause device to stop worker thread touching the queues.
> > 4. Inject a RARP packet into a Tx Queue.
> > 5. Ack the interrupt via control queue.
> > 6. Resume device to continue packet processing.
> >
> > Signed-off-by: Xiao Wang 
> > ---
> >   drivers/net/virtio/virtio_ethdev.c | 95
> +-
> >   drivers/net/virtio/virtio_ethdev.h |  1 +
> >   drivers/net/virtio/virtqueue.h | 11 +
> >   3 files changed, 105 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/virtio/virtio_ethdev.c
> b/drivers/net/virtio/virtio_ethdev.c
> > index e8ff1e449..9606df514 100644
> > --- a/drivers/net/virtio/virtio_ethdev.c
> > +++ b/drivers/net/virtio/virtio_ethdev.c
> > @@ -19,6 +19,8 @@
> >   #include 
> >   #include 
> >   #include 
> > +#include 
> > +#include 
> >   #include 
> >   #include 
> >   #include 
> > @@ -78,6 +80,11 @@ static int virtio_dev_queue_stats_mapping_set(
> > uint8_t stat_idx,
> > uint8_t is_rx);
> >
> > +static int make_rarp_packet(struct rte_mbuf *rarp_mbuf,
> > +   const struct ether_addr *mac);
> > +static void virtio_notify_peers(struct rte_eth_dev *dev);
> > +static void virtio_ack_link_announce(struct rte_eth_dev *dev);
> > +
> >   /*
> >* The set of PCI devices this driver supports
> >*/
> > @@ -1272,9 +1279,89 @@ virtio_inject_pkts(struct rte_eth_dev *dev, struct
> rte_mbuf **tx_pkts,
> > return ret;
> >   }
> >
> > +#define RARP_PKT_SIZE  64
> > +static int
> > +make_rarp_packet(struct rte_mbuf *rarp_mbuf, const struct ether_addr
> *mac)
> > +{
> > +   struct ether_hdr *eth_hdr;
> > +   struct arp_hdr *rarp;
> > +
> > +   if (rarp_mbuf->buf_len < RARP_PKT_SIZE) {
> > +   PMD_DRV_LOG(ERR, "mbuf size too small %u (< %d)",
> > +   rarp_mbuf->buf_len, RARP_PKT_SIZE);
> > +   return -1;
> > +   }
> > +
> > +   /* Ethernet header. */
> > +   eth_hdr = rte_pktmbuf_mtod(rarp_mbuf, struct ether_hdr *);
> > +   memset(eth_hdr->d_addr.addr_bytes, 0xff, ETHER_ADDR_LEN);
> > +   ether_addr_copy(mac, &eth_hdr->s_addr);
> > +   eth_hdr->ether_type = htons(ETHER_TYPE_RARP);
> > +
> > +   /* RARP header. */
> > +   rarp = (struct arp_hdr *)(eth_hdr + 1);
> > +   rarp->arp_hrd = htons(ARP_HRD_ETHER);
> > +   rarp->arp_pro = htons(ETHER_TYPE_IPv4);
> > +   rarp->arp_hln = ETHER_ADDR_LEN;
> > +   rarp->arp_pln = 4;
> > +   rarp->arp_op  = htons(ARP_OP_REVREQUEST);
> > +
> > +   ether_addr_copy(mac, &rarp->arp_data.arp_sha);
> > +   ether_addr_copy(mac, &rarp->arp_data.arp_tha);
> > +   memset(&rarp->arp_data.arp_sip, 0x00, 4);
> > +   memset(&rarp->arp_data.arp_tip, 0x00, 4);
> > +
> > +   rarp_mbuf->data_len = RARP_PKT_SIZE;
> > +   rarp_mbuf->pkt_len = RARP_PKT_SIZE;
> > +
> > +   return 0;
> > +}
> 
> Do you think it could make sense to have this function in a lib, as
> vhost user lib does exactly the same?
> 
> I don't know if it could be useful to others than vhost/virtio though.

Hi Thomas,

Do you think it's worth adding a new helper for ARP in lib/librte_net/?
Currently we just need a helper to build a RARP packet (the make_rarp_packet()
above).

BRs,
Xiao


> 
> Thanks,
> Maxime


Re: [dpdk-dev] [PATCH v8 3/5] net: add a helper for making RARP packet

2018-01-09 Thread Wang, Xiao W
Hi,

> -Original Message-
> From: Thomas Monjalon [mailto:tho...@monjalon.net]
> Sent: Tuesday, January 9, 2018 9:49 PM
> To: Wang, Xiao W 
> Cc: y...@fridaylinux.org; Bie, Tiwei ; dev@dpdk.org;
> step...@networkplumber.org
> Subject: Re: [PATCH v8 3/5] net: add a helper for making RARP packet
> 
> 09/01/2018 14:26, Xiao Wang:
> > +/**
> > + * Make a RARP packet based on MAC addr.
> > + *
> > + * @param mbuf
> > + *   Pointer to the rte_mbuf structure
> > + * @param mac
> > + *   Pointer to the MAC addr
> > + *
> > + * @return
> > + *   - 0 on success, negative on error
> > + */
> > +int
> > +rte_net_make_rarp_packet(struct rte_mbuf *mbuf, const struct ether_addr
> *mac);
> 
> I think we should apply the new policy of introducting functions
> with the experimental state.

OK, will change it soon.

Thanks,
Xiao


Re: [dpdk-dev] [PATCH v10 3/5] net: add a helper for making RARP packet

2018-01-16 Thread Wang, Xiao W
Hi Olivier,

> -Original Message-
> From: Olivier Matz [mailto:olivier.m...@6wind.com]
> Sent: Tuesday, January 16, 2018 5:01 PM
> To: Wang, Xiao W 
> Cc: y...@fridaylinux.org; tho...@monjalon.net; Bie, Tiwei
> ; dev@dpdk.org; step...@networkplumber.org;
> maxime.coque...@redhat.com
> Subject: Re: [dpdk-dev] [PATCH v10 3/5] net: add a helper for making RARP
> packet
> 
> Hi Xiao,
> 
> Please find few comments below.
> 
> On Wed, Jan 10, 2018 at 09:23:54AM +0800, Xiao Wang wrote:
> > Suggested-by: Maxime Coquelin 
> > Signed-off-by: Xiao Wang 
> > Reviewed-by: Maxime Coquelin 
> > ---
> >  lib/librte_net/Makefile|  1 +
> >  lib/librte_net/rte_arp.c   | 42
> ++
> >  lib/librte_net/rte_arp.h   | 17 +++
> >  lib/librte_net/rte_net_version.map |  6 ++
> >  4 files changed, 66 insertions(+)
> >  create mode 100644 lib/librte_net/rte_arp.c
> >
> > diff --git a/lib/librte_net/Makefile b/lib/librte_net/Makefile
> > index 5e8a76b68..ab290c382 100644
> > --- a/lib/librte_net/Makefile
> > +++ b/lib/librte_net/Makefile
> > @@ -13,6 +13,7 @@ LIBABIVER := 1
> >
> >  SRCS-$(CONFIG_RTE_LIBRTE_NET) := rte_net.c
> >  SRCS-$(CONFIG_RTE_LIBRTE_NET) += rte_net_crc.c
> > +SRCS-$(CONFIG_RTE_LIBRTE_NET) += rte_arp.c
> >
> >  # install includes
> >  SYMLINK-$(CONFIG_RTE_LIBRTE_NET)-include := rte_ip.h rte_tcp.h
> rte_udp.h rte_esp.h
> > diff --git a/lib/librte_net/rte_arp.c b/lib/librte_net/rte_arp.c
> > new file mode 100644
> > index 0..d7223b044
> > --- /dev/null
> > +++ b/lib/librte_net/rte_arp.c
> > @@ -0,0 +1,42 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2018 Intel Corporation
> > + */
> > +
> > +#include 
> > +
> > +#include 
> > +
> > +#define RARP_PKT_SIZE  64
> > +int
> > +rte_net_make_rarp_packet(struct rte_mbuf *mbuf, const struct ether_addr
> *mac)
> > +{
> > +   struct ether_hdr *eth_hdr;
> > +   struct arp_hdr *rarp;
> > +
> > +   if (mbuf->buf_len < RARP_PKT_SIZE)
> > +   return -1;
> > +
> > +   /* Ethernet header. */
> > +   eth_hdr = rte_pktmbuf_mtod(mbuf, struct ether_hdr *);
> > +   memset(eth_hdr->d_addr.addr_bytes, 0xff, ETHER_ADDR_LEN);
> > > +   ether_addr_copy(mac, &eth_hdr->s_addr);
> > +   eth_hdr->ether_type = htons(ETHER_TYPE_RARP);
> > +
> > +   /* RARP header. */
> > +   rarp = (struct arp_hdr *)(eth_hdr + 1);
> > +   rarp->arp_hrd = htons(ARP_HRD_ETHER);
> > +   rarp->arp_pro = htons(ETHER_TYPE_IPv4);
> > +   rarp->arp_hln = ETHER_ADDR_LEN;
> > +   rarp->arp_pln = 4;
> > +   rarp->arp_op  = htons(ARP_OP_REVREQUEST);
> > +
> > +   ether_addr_copy(mac, &rarp->arp_data.arp_sha);
> > +   ether_addr_copy(mac, &rarp->arp_data.arp_tha);
> > +   memset(&rarp->arp_data.arp_sip, 0x00, 4);
> > +   memset(&rarp->arp_data.arp_tip, 0x00, 4);
> > +
> > +   mbuf->data_len = RARP_PKT_SIZE;
> > +   mbuf->pkt_len = RARP_PKT_SIZE;
> > +
> > +   return 0;
> > +}
> 
> You don't check that there is enough tailroom to write the packet data.

Yes, tailroom can be used.

> Also, nothing verifies that the mbuf passed to the function is empty.
> I suggest to do the allocation in this function, what do you think?
>

I agree to allocate in this function and let it do all the checks.
 
> You can also use rte_pktmbuf_append() to check for the tailroom and
> update data_len/pkt_len:
> 
>   m = rte_pktmbuf_alloc();
>   if (m == NULL)
>   return NULL;
>   eth_hdr = rte_pktmbuf_append(m, RARP_PKT_SIZE);

When data_len is not enough, we need to call
rte_pktmbuf_append(m, RARP_PKT_SIZE - m->data_len);

>   if (eth_hdr == NULL) {
>   m_freem(m);
>   return NULL;
>   }
>   eth_hdr->... = ...;
>   ...
>   rarp = (struct arp_hdr *)(eth_hdr + 1);
>   rarp->... = ...;
>   ...
> 
>   return m;
> 

Will change it in next version, thanks for the comments.
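
To make the tailroom point concrete, here is a self-contained sketch that uses
a flat stand-in buffer instead of a real rte_mbuf. sketch_buf, sketch_append
and sketch_make_rarp are hypothetical names, and unlike the patch the sketch
zeroes the whole 64-byte frame, including the padding. The append-style helper
reserves the bytes or fails, so the builder cannot write past the buffer, and
a non-empty buffer is rejected:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_RARP_PKT_SIZE 64
#define SKETCH_ETH_ALEN 6
#define SKETCH_ETHERTYPE_IPV4 0x0800
#define SKETCH_ETHERTYPE_RARP 0x8035
#define SKETCH_ARP_HRD_ETHER 1
#define SKETCH_ARP_OP_REVREQUEST 3

/* Stand-in for an mbuf: a flat buffer with a fill level. */
struct sketch_buf {
	uint8_t data[128];
	uint16_t data_len;
};

/* Reserve len bytes at the tail; NULL if there is not enough room. */
uint8_t *
sketch_append(struct sketch_buf *b, uint16_t len)
{
	uint8_t *p;

	if (len > sizeof(b->data) - b->data_len)
		return NULL;
	p = b->data + b->data_len;
	b->data_len = (uint16_t)(b->data_len + len);
	return p;
}

/* Store a 16-bit value in network byte order. */
static void
sketch_put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

/* Build a broadcast RARP request for mac; 0 on success, -1 on error. */
int
sketch_make_rarp(struct sketch_buf *b, const uint8_t mac[SKETCH_ETH_ALEN])
{
	uint8_t *p, *arp;

	assert(b != NULL && mac != NULL);
	if (b->data_len != 0) /* refuse a buffer that already holds data */
		return -1;
	p = sketch_append(b, SKETCH_RARP_PKT_SIZE);
	if (p == NULL)
		return -1;
	memset(p, 0, SKETCH_RARP_PKT_SIZE);

	/* Ethernet header: broadcast destination, our MAC as source. */
	memset(p, 0xff, SKETCH_ETH_ALEN);
	memcpy(p + 6, mac, SKETCH_ETH_ALEN);
	sketch_put_be16(p + 12, SKETCH_ETHERTYPE_RARP);

	/* ARP body: hrd, pro, hln, pln, op, then sha/sip/tha/tip. */
	arp = p + 14;
	sketch_put_be16(arp, SKETCH_ARP_HRD_ETHER);
	sketch_put_be16(arp + 2, SKETCH_ETHERTYPE_IPV4);
	arp[4] = SKETCH_ETH_ALEN;
	arp[5] = 4;
	sketch_put_be16(arp + 6, SKETCH_ARP_OP_REVREQUEST);
	memcpy(arp + 8, mac, SKETCH_ETH_ALEN);  /* sender MAC */
	memcpy(arp + 18, mac, SKETCH_ETH_ALEN); /* target MAC */
	/* Sender and target IP stay zero from the memset above. */
	return 0;
}
```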

BRs,
Xiao


Re: [dpdk-dev] [PATCH v10 3/5] net: add a helper for making RARP packet

2018-01-16 Thread Wang, Xiao W
Hi Olivier,

> -Original Message-
> From: Olivier Matz [mailto:olivier.m...@6wind.com]
> Sent: Tuesday, January 16, 2018 6:43 PM
> To: Wang, Xiao W 
> Cc: y...@fridaylinux.org; tho...@monjalon.net; Bie, Tiwei
> ; dev@dpdk.org; step...@networkplumber.org;
> maxime.coque...@redhat.com
> Subject: Re: [dpdk-dev] [PATCH v10 3/5] net: add a helper for making RARP
> packet
> 
> Hi Xiao,
> 
> On Tue, Jan 16, 2018 at 09:43:43AM +, Wang, Xiao W wrote:
> > Hi Olivier,
> > > You can also use rte_pktmbuf_append() to check for the tailroom and
> > > update data_len/pkt_len:
> > >
> > >   m = rte_pktmbuf_alloc();

I just realized that if we let this function allocate the mbuf, it may restrict 
this API's applicability.
E.g. the caller just has an mbuf, without a mempool.
What do you think?

> > >   if (m == NULL)
> > >   return NULL;
> > >   eth_hdr = rte_pktmbuf_append(m, RARP_PKT_SIZE);
> >
> > When data_len is not enough, we need to rte_pktmbuf_append(m,
> RARP_PKT_SIZE - m->data_len);
> 
> Sorry, I don't get your point here.

I mean we just need to extend the data_len by "RARP_PKT_SIZE - m->data_len" 
when the room is not big enough.

BRs,
Xiao


Re: [dpdk-dev] [PATCH v10 3/5] net: add a helper for making RARP packet

2018-01-16 Thread Wang, Xiao W


> -Original Message-
> From: Wang, Xiao W
> Sent: Tuesday, January 16, 2018 7:03 PM
> To: 'Olivier Matz' 
> Cc: y...@fridaylinux.org; tho...@monjalon.net; Bie, Tiwei
> ; dev@dpdk.org; step...@networkplumber.org;
> maxime.coque...@redhat.com
> Subject: RE: [dpdk-dev] [PATCH v10 3/5] net: add a helper for making RARP
> packet
> 
> Hi Olivier,
> 
> > -Original Message-
> > From: Olivier Matz [mailto:olivier.m...@6wind.com]
> > Sent: Tuesday, January 16, 2018 6:43 PM
> > To: Wang, Xiao W 
> > Cc: y...@fridaylinux.org; tho...@monjalon.net; Bie, Tiwei
> > ; dev@dpdk.org; step...@networkplumber.org;
> > maxime.coque...@redhat.com
> > Subject: Re: [dpdk-dev] [PATCH v10 3/5] net: add a helper for making RARP
> > packet
> >
> > Hi Xiao,
> >
> > On Tue, Jan 16, 2018 at 09:43:43AM +, Wang, Xiao W wrote:
> > > Hi Olivier,
> > > > You can also use rte_pktmbuf_append() to check for the tailroom and
> > > > update data_len/pkt_len:
> > > >
> > > > m = rte_pktmbuf_alloc();
> 
> I just realized that if we let this function allocate the mbuf, it may
> restrict this API's applicability.
> E.g. the caller just has an mbuf, without a mempool.
> What do you think?
> 
> > > > if (m == NULL)
> > > > return NULL;
> > > > eth_hdr = rte_pktmbuf_append(m, RARP_PKT_SIZE);
> > >
> > > When data_len is not enough, we need to rte_pktmbuf_append(m,
> > RARP_PKT_SIZE - m->data_len);
> >
> > Sorry, I don't get your point here.
> 
> I mean we just need to extend the data_len by "RARP_PKT_SIZE - m-
> >data_len" when the room is not big enough.

OK, in your sample code the mbuf comes from rte_pktmbuf_alloc(), so it is 
already reset and we just append RARP_PKT_SIZE. Got it.

For the mbuf allocation, we can let this function do the allocation and the 
content filling. If the app has special needs, e.g. a chained mbuf,
then let the app fill it by itself.

> 
> BRs,
> Xiao


Re: [dpdk-dev] [PATCH 2/2] net: fix build error

2018-01-17 Thread Wang, Xiao W


> -Original Message-
> From: Thomas Monjalon [mailto:tho...@monjalon.net]
> Sent: Thursday, January 18, 2018 3:39 PM
> To: Yuanhan Liu 
> Cc: dev@dpdk.org; Wang, Xiao W ; Yigit, Ferruh
> ; Olivier Matz 
> Subject: Re: [PATCH 2/2] net: fix build error
> 
> 18/01/2018 04:14, Yuanhan Liu:
> > Fix build error when shared lib is enabled:
> >
> >   LD librte_net.so.1.1
> > rte_arp.o: In function `rte_net_make_rarp_packet':
> > rte_arp.c:(.text+0x1f0): undefined reference to `rte_mempool_ops_table'
> > rte_arp.c:(.text+0x21d): undefined reference to `rte_mempool_ops_table'
> > rte_arp.c:(.text+0x2d5): undefined reference to `rte_mempool_ops_table'
> > rte_arp.c:(.text+0x384): undefined reference to `rte_mempool_ops_table'
> > rte_arp.c:(.text+0x4b7): undefined reference to `rte_mempool_ops_table'
> 
> This is very strange, I do not see this error on my machine.

I could see this error on mine with:
+CONFIG_RTE_BUILD_SHARED_LIB=y

And this fix helps.

Best Regards,
Xiao


Re: [dpdk-dev] [PATCH 1/2] net: fixup RARP generation

2018-01-18 Thread Wang, Xiao W


> -Original Message-
> From: Yuanhan Liu [mailto:y...@fridaylinux.org]
> Sent: Thursday, January 18, 2018 4:51 PM
> To: Thomas Monjalon 
> Cc: dev@dpdk.org; Wang, Xiao W ; Yigit, Ferruh
> ; Olivier Matz 
> Subject: Re: [dpdk-dev] [PATCH 1/2] net: fixup RARP generation
> 
> On Thu, Jan 18, 2018 at 09:38:39AM +0100, Thomas Monjalon wrote:
> > 18/01/2018 04:14, Yuanhan Liu:
> > > Due to a mistake operation from me, older version (v10) was merged to
> > > master branch. It's the v11 should be applied. However, the master branch
> > > is not rebase-able. Thus, this patch is made, from the diff between v10
> > > and v11.
> >
> > Understood it is a mistake.
> > However, you can briefly describes what does this change.
> > Is there a changelog in v11 patch?
> 
> Yes, there is:
> 
> v11:
> - Add check for parameter and tailroom in rte_net_make_rarp_packet.
> - Allocate mbuf in rte_net_make_rarp_packet.
> 
> > >
> > > Code is from Xiao Wang.
> >
> > You may add his Signed-off.
> 
> I have no objection. Xiao, okay to you? I will also set the author
> to you.
> 
>   --yliu
> 
> > > Signed-off-by: Yuanhan Liu 

OK for me.

BRs,
Xiao


Re: [dpdk-dev] [PATCH v11 5/5] net/virtio: support GUEST ANNOUNCE

2018-01-20 Thread Wang, Xiao W


> -Original Message-
> From: Yigit, Ferruh
> Sent: Saturday, January 20, 2018 10:31 PM
> To: Wang, Xiao W ; y...@fridaylinux.org;
> olivier.m...@6wind.com; maxime.coque...@redhat.com; Thomas Monjalon
> 
> Cc: dev@dpdk.org; Bie, Tiwei ;
> step...@networkplumber.org
> Subject: Re: [dpdk-dev] [PATCH v11 5/5] net/virtio: support GUEST ANNOUNCE
> 
> On 1/19/2018 5:33 PM, Ferruh Yigit wrote:
> > On 1/16/2018 9:41 PM, Xiao Wang wrote:
> >> When live migration is done, for the backup VM, either the virtio
> >> frontend or the vhost backend needs to send out gratuitous RARP packet
> >> to announce its new network location.
> >>
> >> This patch enables VIRTIO_NET_F_GUEST_ANNOUNCE feature to support
> live
> >> migration scenario where the vhost backend doesn't have the ability to
> >> generate RARP packet.
> >>
> >> Brief introduction of the work flow:
> >> 1. QEMU finishes live migration, pokes the backup VM with an interrupt.
> >> 2. Virtio interrupt handler reads out the interrupt status value, and
> >>realizes it needs to send out RARP packet to announce its location.
> >> 3. Pause device to stop worker thread touching the queues.
> >> 4. Inject a RARP packet into a Tx Queue.
> >> 5. Ack the interrupt via control queue.
> >> 6. Resume device to continue packet processing.
> >>
> >> Signed-off-by: Xiao Wang 
> >> Reviewed-by: Maxime Coquelin 
> >
> >
> > Hi Yuanhan,
> >
> > This commit breaks the build!
> 
> I switched two patches and problem gone, like:
> first: net: fixup RARP generation
> second: net/virtio: support GUEST ANNOUNCE
> 
> From my point of view nothing more needs to be done, but can you please
> double
> check the patches.

The 2 patches are OK.
Thanks!

BRs,
Xiao
> 
> Thanks,
> ferruh
> 
> >
> > As far as I understand you send a fix but merged into other patch, which
> leaves
> > this commit still broken.
> >
> > What do you think sending a fix that can be mergable to this one, so I can
> > squash it on next-net?
> >
> > Thanks,
> > ferruh
> >



Re: [dpdk-dev] [PATCH 1/3] eal/vfio: add support for multiple container

2018-03-15 Thread Wang, Xiao W
Hi Anatoly,

> -Original Message-
> From: Burakov, Anatoly
> Sent: Wednesday, March 14, 2018 8:08 PM
> To: Wang, Xiao W ; dev@dpdk.org
> Cc: Wang, Zhihong ;
> maxime.coque...@redhat.com; y...@fridaylinux.org; Liang, Cunming
> ; Xu, Rosen ; Chen, Junjie J
> ; Daly, Dan 
> Subject: Re: [dpdk-dev] [PATCH 1/3] eal/vfio: add support for multiple
> container
> 
> On 09-Mar-18 11:08 PM, Xiao Wang wrote:
> > From: Junjie Chen 
> >
> > Currently eal vfio framework binds vfio group fd to the default
> > container fd, while in some cases, e.g. vDPA (vhost data path
> > acceleration), we want to set vfio group to a new container and
> > program DMA mapping via this new container, so this patch adds
> > APIs to support multiple container.
> >
> > Signed-off-by: Junjie Chen 
> > Signed-off-by: Xiao Wang 
> > ---
> 
> I'm not going to get into virtual vs. real device debate, but i do have
> some issues with VFIO side of things.
> 
> I'm not completely convinced this change is needed in the first place.
> If the device driver manages its own groups anyway, it knows which VFIO
> groups belong to it, so it can add/remove them without putting them into
> separate containers. What is the purpose of keeping them in a separate
> container as opposed to just keeping track of group id's?

The device driver needs a separate container to program the IOMMU
for the device with the VM's address translation table. So the driver needs
the devices to be put into new containers, rather than the default one.

> 
> <...>
> 
> 
> > +   vfio_cfg->vfio_container_fd = vfio_get_container_fd();
> > +
> > +   if (vfio_cfg->vfio_container_fd < 0)
> > +   return -1;
> > +
> > +   return vfio_cfg->vfio_container_fd;
> > +}
> 
> Please correct me if i'm wrong, but this patch appears to be mistitled.
> You're not really creating multiple containers, you're just partitioning
> existing one. Do we really need to open/store/close container fd's
> separately, if all we have is a single container anyway?

This driver creates new containers for the devices: each device needs
its own container, so that we can dma_map/unmap for the device
via its associated container.
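
To illustrate why one container per device matters, here is a toy model, not the eal/vfio code. Each container owns one translation table, so the same IOVA (a guest physical address in the vDPA case) can map to different host memory for different VMs. struct toy_container, toy_dma_map() and toy_translate() are hypothetical names made up for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_MAPS 8

struct toy_dma_map {
	uint64_t iova;	/* address the device uses */
	uint64_t vaddr;	/* host virtual address backing it */
	uint64_t len;
};

/* A container owns one IOMMU address space, i.e. one translation table. */
struct toy_container {
	struct toy_dma_map maps[TOY_MAX_MAPS];
	size_t nb_maps;
};

/* Record a mapping in this container only; 0 on success, -1 on error. */
static int
toy_dma_map(struct toy_container *c, uint64_t vaddr, uint64_t iova,
	    uint64_t len)
{
	if (c == NULL || len == 0 || c->nb_maps == TOY_MAX_MAPS)
		return -1;
	c->maps[c->nb_maps].iova = iova;
	c->maps[c->nb_maps].vaddr = vaddr;
	c->maps[c->nb_maps].len = len;
	c->nb_maps++;
	return 0;
}

/* Translate an IOVA through one container; returns 0 when unmapped. */
static uint64_t
toy_translate(const struct toy_container *c, uint64_t iova)
{
	size_t i;

	for (i = 0; i < c->nb_maps; i++) {
		const struct toy_dma_map *m = &c->maps[i];

		if (iova >= m->iova && iova - m->iova < m->len)
			return m->vaddr + (iova - m->iova);
	}
	return 0;
}
```

Two devices serving two VMs can both map IOVA 0x1000, each in its own container, and the translations stay independent. With a single shared container the second mapping would collide with the first.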

BRs,
Xiao

> 
> The semantics of this are also weird in multiprocess. When secondary
> process requests a container, we always create a new one, send it over
> IPC and close it afterwards. It seems to be oblivious that you may have
> several container fd's, and does not know which one you are asking for.
> We know it's all the same container, but that's clearly not what the
> code appears to be doing.
> 
> --
> Thanks,
> Anatoly


Re: [dpdk-dev] [PATCH 0/3] add ifcvf driver

2018-03-15 Thread Wang, Xiao W
Hi Maxime,

> -Original Message-
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Sunday, March 11, 2018 2:24 AM
> To: Wang, Xiao W ; dev@dpdk.org
> Cc: Wang, Zhihong ; y...@fridaylinux.org; Liang,
> Cunming ; Xu, Rosen ; Chen,
> Junjie J ; Daly, Dan 
> Subject: Re: [PATCH 0/3] add ifcvf driver
> 
> Hi Xiao,
> 
> On 03/10/2018 12:08 AM, Xiao Wang wrote:
> > This patch set has dependency on
> http://dpdk.org/dev/patchwork/patch/35635/
> > (vhost: support selective datapath);
> >
> > ifc VF is compatible with virtio vring operations, this driver implements
> > vDPA driver ops which configures ifc VF to be a vhost data path accelerator.
> >
> > ifcvf driver uses vdev as a control domain to manage ifc VFs that belong
> > to it. It registers vDPA device ops to vhost lib to enable these VFs to be
> > used as vhost data path accelerator.
> >
> > Live migration feature is supported by ifc VF and this driver enables
> > it based on vhost lib.
> >
> > vDPA needs to create different containers for different devices, thus this
> > patch set adds APIs in eal/vfio to support multiple container.
> Thanks for this! That will avoid having to duplicate these functions
> for every new offload driver.
> 
> 
> >
> > Junjie Chen (1):
> >eal/vfio: add support for multiple container
> >
> > Xiao Wang (2):
> >bus/pci: expose sysfs parsing API
> 
> Still, I'm not convinced the offload device should be a virtual device.
> It is a real PCI device, why not having a new device type for offload
> devices, and let the device to be probed automatically by the existing
> device model?

IFC VFs are generated from SRIOV, with the PF driven by kernel driver.
In DPDK we need to have something to represent PF, to register itself as
a vDPA engine, so a virtual device is used for this purpose.

The VFs are used for vhost net offload, and we could implement the exception
traffic Rx/Tx functions on the VFs in future via the port-representor
mechanism. So this patch keeps the device type as net.

BRs,
Xiao

> 
> Thanks,
> Maxime
> 
> 
> >net/ifcvf: add ifcvf driver
> >
> >   config/common_base   |6 +
> >   config/common_linuxapp   |1 +
> >   drivers/bus/pci/linux/pci.c  |9 +-
> >   drivers/bus/pci/linux/pci_init.h |8 +
> >   drivers/bus/pci/rte_bus_pci_version.map  |8 +
> >   drivers/net/Makefile |1 +
> >   drivers/net/ifcvf/Makefile   |   40 +
> >   drivers/net/ifcvf/base/ifcvf.c   |  329 
> >   drivers/net/ifcvf/base/ifcvf.h   |  156 
> >   drivers/net/ifcvf/base/ifcvf_osdep.h |   52 ++
> >   drivers/net/ifcvf/ifcvf_ethdev.c | 1241
> ++
> >   drivers/net/ifcvf/rte_ifcvf_version.map  |4 +
> >   lib/librte_eal/bsdapp/eal/eal.c  |   51 +-
> >   lib/librte_eal/common/include/rte_vfio.h |  117 ++-
> >   lib/librte_eal/linuxapp/eal/eal_vfio.c   |  553 ++---
> >   lib/librte_eal/linuxapp/eal/eal_vfio.h   |2 +
> >   lib/librte_eal/rte_eal_version.map   |7 +
> >   mk/rte.app.mk|1 +
> >   18 files changed, 2480 insertions(+), 106 deletions(-)
> >   create mode 100644 drivers/net/ifcvf/Makefile
> >   create mode 100644 drivers/net/ifcvf/base/ifcvf.c
> >   create mode 100644 drivers/net/ifcvf/base/ifcvf.h
> >   create mode 100644 drivers/net/ifcvf/base/ifcvf_osdep.h
> >   create mode 100644 drivers/net/ifcvf/ifcvf_ethdev.c
> >   create mode 100644 drivers/net/ifcvf/rte_ifcvf_version.map
> >


Re: [dpdk-dev] [PATCH 2/3] bus/pci: expose sysfs parsing API

2018-03-15 Thread Wang, Xiao W
Hi Rivet,

> -Original Message-
> From: Gaëtan Rivet [mailto:gaetan.ri...@6wind.com]
> Sent: Wednesday, March 14, 2018 9:31 PM
> To: Burakov, Anatoly 
> Cc: Wang, Xiao W ; dev@dpdk.org; Wang, Zhihong
> ; maxime.coque...@redhat.com;
> y...@fridaylinux.org; Liang, Cunming ; Xu, Rosen
> ; Chen, Junjie J ; Daly, Dan
> 
> Subject: Re: [dpdk-dev] [PATCH 2/3] bus/pci: expose sysfs parsing API
> 
> Hi,
> 
> On Wed, Mar 14, 2018 at 11:19:31AM +, Burakov, Anatoly wrote:
> > On 09-Mar-18 11:08 PM, Xiao Wang wrote:
> > > Some existing sysfs parsing functions are helpful for the later vDPA
> > > driver, this patch make them global and expose them to shared lib.
> > >
> > > Signed-off-by: Xiao Wang 
> > > ---
> > >   drivers/bus/pci/linux/pci.c | 9 -
> > >   drivers/bus/pci/linux/pci_init.h| 8 
> > >   drivers/bus/pci/rte_bus_pci_version.map | 8 
> > >   3 files changed, 20 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
> > > index abde64119..81e5e5650 100644
> > > --- a/drivers/bus/pci/linux/pci.c
> > > +++ b/drivers/bus/pci/linux/pci.c
> > > @@ -32,7 +32,7 @@
> > >   extern struct rte_pci_bus rte_pci_bus;
> > > -static int
> > > +int
> > >   pci_get_kernel_driver_by_path(const char *filename, char *dri_name)
> >
> > Here and in other places - shouldn't this too be prefixed with rte_?
> >
> 
> A public PCI function should be prefixed by rte_pci_ yes.

OK, will add this prefix.

> 
> Additionally, if this function was to be exposed, then there should be a
> BSD implementation as well (shared map file).
> 
> I don't know how BSD works, I'm not sure parsing the filesystem is the
> way to get a PCI driver name. If so, maybe the function should be called
> another, generic, way, that would work for both linux and BSD (and
> ideally, having a real BSD implementation).

BSD does not parse the filesystem; it uses the PCIOCGETCONF ioctl to retrieve
PCI device information.
This function is quite Linux-specific, especially its API name. I'm afraid we
can only return an error on BSD for this API.

BRs,
Xiao

> 
> >
> > --
> > Thanks,
> > Anatoly
> 
> --
> Gaëtan Rivet
> 6WIND


Re: [dpdk-dev] [PATCH 2/3] bus/pci: expose sysfs parsing API

2018-03-18 Thread Wang, Xiao W
Hi Rivet,

> -Original Message-
> From: Gaëtan Rivet [mailto:gaetan.ri...@6wind.com]
> Sent: Friday, March 16, 2018 1:19 AM
> To: Wang, Xiao W 
> Cc: Burakov, Anatoly ; dev@dpdk.org; Wang,
> Zhihong ; maxime.coque...@redhat.com;
> y...@fridaylinux.org; Liang, Cunming ; Xu, Rosen
> ; Chen, Junjie J ; Daly, Dan
> 
> Subject: Re: [dpdk-dev] [PATCH 2/3] bus/pci: expose sysfs parsing API
> 
> On Thu, Mar 15, 2018 at 04:49:41PM +, Wang, Xiao W wrote:
> > Hi Rivet,
> >
> > > -Original Message-
> > > From: Gaëtan Rivet [mailto:gaetan.ri...@6wind.com]
> > > Sent: Wednesday, March 14, 2018 9:31 PM
> > > To: Burakov, Anatoly 
> > > Cc: Wang, Xiao W ; dev@dpdk.org; Wang,
> Zhihong
> > > ; maxime.coque...@redhat.com;
> > > y...@fridaylinux.org; Liang, Cunming ; Xu,
> Rosen
> > > ; Chen, Junjie J ; Daly, Dan
> > > 
> > > Subject: Re: [dpdk-dev] [PATCH 2/3] bus/pci: expose sysfs parsing API
> > >
> > > Hi,
> > >
> > > On Wed, Mar 14, 2018 at 11:19:31AM +, Burakov, Anatoly wrote:
> > > > On 09-Mar-18 11:08 PM, Xiao Wang wrote:
> > > > > Some existing sysfs parsing functions are helpful for the later vDPA
> > > > > driver, this patch make them global and expose them to shared lib.
> > > > >
> > > > > Signed-off-by: Xiao Wang 
> > > > > ---
> > > > >   drivers/bus/pci/linux/pci.c | 9 -
> > > > >   drivers/bus/pci/linux/pci_init.h| 8 
> > > > >   drivers/bus/pci/rte_bus_pci_version.map | 8 
> > > > >   3 files changed, 20 insertions(+), 5 deletions(-)
> > > > >
> > > > > diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
> > > > > index abde64119..81e5e5650 100644
> > > > > --- a/drivers/bus/pci/linux/pci.c
> > > > > +++ b/drivers/bus/pci/linux/pci.c
> > > > > @@ -32,7 +32,7 @@
> > > > >   extern struct rte_pci_bus rte_pci_bus;
> > > > > -static int
> > > > > +int
> > > > >   pci_get_kernel_driver_by_path(const char *filename, char *dri_name)
> > > >
> > > > Here and in other places - shouldn't this too be prefixed with rte_?
> > > >
> > >
> > > A public PCI function should be prefixed by rte_pci_ yes.
> >
> > OK, will add this prefix.
> >
> > >
> > > Additionally, if this function was to be exposed, then there should be a
> > > BSD implementation as well (shared map file).
> > >
> > > I don't know how BSD works, I'm not sure parsing the filesystem is the
> > > way to get a PCI driver name. If so, maybe the function should be called
> > > another, generic, way, that would work for both linux and BSD (and
> > > ideally, having a real BSD implementation).
> >
> > BSD is not parsing the filesystem, it uses PCIOCGETCONF ioctl to retrieve
> > PCI device information.
> > This function is quite linux, especially for the API name. I'm afraid we can
> > only return err on BSD for this API.
> 
> How about renaming the function to something like
> rte_pci_device_kdriver_name();
> 
> and allowing for a sensible BSD implementation to happen if someone
> needs it?

Yes, it looks more generic, and allows a BSD implementation to happen.
I will rename it as below in next version.
rte_pci_device_kdriver_name(const struct rte_pci_addr *addr, char *dri_name)
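
A rough sketch of what the Linux side could look like behind such a generic name, assuming the usual sysfs layout where /sys/bus/pci/devices/<bdf>/driver is a symlink to the bound driver. toy_driver_from_link() and toy_kdriver_name() are hypothetical names, the BDF is passed as a string to keep the sketch self-contained, and the 0 / -1 / 1 return convention is an assumption. A BSD variant would have to be built on PCIOCGETCONF instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Copy the last path component of a symlink target into out.
 * Returns 0 on success, -1 on bad input or if the name does not fit.
 */
static int
toy_driver_from_link(const char *target, char *out, size_t out_len)
{
	const char *name;
	size_t len;

	if (target == NULL || out == NULL || out_len == 0)
		return -1;
	name = strrchr(target, '/');
	name = (name != NULL) ? name + 1 : target;
	len = strlen(name);
	if (len == 0 || len >= out_len)
		return -1;
	memcpy(out, name, len + 1);
	return 0;
}

/*
 * Linux only: resolve the "driver" symlink of a PCI device given as a
 * "dddd:bb:dd.f" string. Returns 0 on success, 1 if no driver is bound
 * (the link does not exist), -1 on any other error.
 */
static int
toy_kdriver_name(const char *bdf, char *out, size_t out_len)
{
	char path[4096];
	char target[4096];
	ssize_t n;
	int ret;

	if (bdf == NULL)
		return -1;
	ret = snprintf(path, sizeof(path),
		       "/sys/bus/pci/devices/%s/driver", bdf);
	if (ret < 0 || (size_t)ret >= sizeof(path))
		return -1;

	n = readlink(path, target, sizeof(target) - 1);
	if (n < 0)
		return (errno == ENOENT) ? 1 : -1;
	target[n] = '\0';	/* readlink() does not terminate the string */
	return toy_driver_from_link(target, out, out_len);
}
```

Keeping the string parsing separate from the readlink() call means only the thin wrapper is OS specific.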

BRs,
Xiao

> 
> --
> Gaëtan Rivet
> 6WIND


Re: [dpdk-dev] [PATCH v2 2/3] bus/pci: expose sysfs parsing API

2018-03-21 Thread Wang, Xiao W
Hi Thomas,

> -Original Message-
> From: Thomas Monjalon [mailto:tho...@monjalon.net]
> Sent: Thursday, March 22, 2018 4:45 AM
> To: Wang, Xiao W 
> Cc: dev@dpdk.org; maxime.coque...@redhat.com; y...@fridaylinux.org; Wang,
> Zhihong ; Bie, Tiwei ; Chen,
> Junjie J ; Xu, Rosen ; Daly,
> Dan ; Liang, Cunming ;
> Burakov, Anatoly ; gaetan.ri...@6wind.com
> Subject: Re: [dpdk-dev] [PATCH v2 2/3] bus/pci: expose sysfs parsing API
> 
> 21/03/2018 14:21, Xiao Wang:
> > Some existing sysfs parsing functions are helpful for the later vDPA
> > driver, this patch make them global and expose them to shared lib.
> >
> > Signed-off-by: Xiao Wang 
> > ---
> > /* parse driver */
> > snprintf(filename, sizeof(filename), "%s/driver", dirname);
> > -   ret = pci_get_kernel_driver_by_path(filename, driver);
> > +   ret = rte_pci_device_kdriver_name(addr, driver);
> 
> I guess the snprintf above becomes useless.

Will remove it.
> 
> > + * @param dri_name
> > + *   Output buffer pointer.
> 
> Parameter name and comment can be improved here:
> "kdrv_name" would be more meaningful.
> As a comment, "Output buffer for kernel driver name"

Thanks for the suggestion. Will improve it.

> 
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> > + *
> > + * Parse the "resource" sysfs file.
> > + *
> > + * @param filename
> > + *   The PCI resource file path.
> > + * @dev
> > + *   Pointer of rte_pci_device object, into which the parse result is recorded.
> > + * @return
> > + *   0 on success, -1 on error, 1 on no driver found.
> > + */
> > +int __rte_experimental
> > +rte_pci_parse_sysfs_resource(const char *filename, struct rte_pci_device *dev);
> 
> This is a Linux specific API.
> Maybe remove "sysfs" and replace "filename" by "resource"?

Yes, "sysfs" makes it Linux specific. Will change it.
Thanks for the above comments.

BRs,
Xiao


Re: [dpdk-dev] [PATCH v2 1/3] eal/vfio: add support for multiple container

2018-03-21 Thread Wang, Xiao W
Hi Thomas, Rivet,

> -Original Message-
> From: Gaëtan Rivet [mailto:gaetan.ri...@6wind.com]
> Sent: Thursday, March 22, 2018 5:38 AM
> To: Thomas Monjalon 
> Cc: Wang, Xiao W ; Chen, Junjie J
> ; dev@dpdk.org; maxime.coque...@redhat.com;
> y...@fridaylinux.org; Wang, Zhihong ; Bie, Tiwei
> ; Xu, Rosen ; Daly, Dan
> ; Liang, Cunming ; Burakov,
> Anatoly 
> Subject: Re: [dpdk-dev] [PATCH v2 1/3] eal/vfio: add support for multiple
> container
> 
> On Wed, Mar 21, 2018 at 09:32:18PM +0100, Thomas Monjalon wrote:
> > Hi,
> >
> > 21/03/2018 14:21, Xiao Wang:
> > > +#endif /* VFIO_PRESENT */
> > >  #endif /* _RTE_VFIO_H_ */
> >
> > Please keep the empty line which was present between endif.

OK.
> >
> > > + rte_vfio_create_container;
> > > + rte_vfio_destroy_container;
> > > + rte_vfio_bind_group_no;
> > > + rte_vfio_unbind_group_no;
> > > + rte_vfio_dma_map;
> > > + rte_vfio_dma_unmap;
> > > + rte_vfio_get_group_fd;
> >
> > Please keep alphabetical order.

OK. Will do.
> >
> > About the naming, I see "no" and "idx" are used.
> > Other APIs in DPDK are using "num" and "id". Any strong opinion?
> 
> {bind,unbind}_group is sufficient as a name.
> _no is redundant as implicit from the parameter type.

{bind,unbind}_group looks very neat. Will remove "_no".
For the eal_vfio.c internal API with the "_idx" suffix, the return value is an 
index into an array, so I think "idx" is appropriate. Before this patch, we 
already have the function get_vfio_group_idx.

Thanks for the comments.
-Xiao
> 
> --
> Gaëtan Rivet
> 6WIND


Re: [dpdk-dev] [PATCH v2 3/3] net/ifcvf: add ifcvf driver

2018-03-22 Thread Wang, Xiao W
Hi Ferruh,

> -Original Message-
> From: Yigit, Ferruh
> Sent: Thursday, March 22, 2018 4:51 PM
> To: Wang, Xiao W ; maxime.coque...@redhat.com;
> y...@fridaylinux.org
> Cc: dev@dpdk.org; Wang, Zhihong ; Bie, Tiwei
> ; Chen, Junjie J ; Xu, Rosen
> ; Daly, Dan ; Liang, Cunming
> ; Burakov, Anatoly ;
> gaetan.ri...@6wind.com
> Subject: Re: [dpdk-dev] [PATCH v2 3/3] net/ifcvf: add ifcvf driver
> 
> On 3/21/2018 1:21 PM, Xiao Wang wrote:
> > ifcvf driver uses vdev as a control domain to manage ifc VFs that belong
> > to it. It registers vDPA device ops to vhost lib to enable these VFs to be
> > used as vhost data path accelerator.
> >
> > Live migration feature is supported by ifc VF and this driver enables
> > it based on vhost lib.
> >
> > Because vDPA driver needs to set up MSI-X vector to interrupt the guest,
> > only vfio-pci is supported currently.
> >
> > Signed-off-by: Xiao Wang 
> > Signed-off-by: Rosen Xu 
> > ---
> > v2:
> > - Rebase on Zhihong's vDPA v3 patch set.
> > ---
> >  config/common_base  |6 +
> >  config/common_linuxapp  |1 +
> >  drivers/net/Makefile|1 +
> >  drivers/net/ifcvf/Makefile  |   40 +
> >  drivers/net/ifcvf/base/ifcvf.c  |  329 
> >  drivers/net/ifcvf/base/ifcvf.h  |  156 
> >  drivers/net/ifcvf/base/ifcvf_osdep.h|   52 ++
> >  drivers/net/ifcvf/ifcvf_ethdev.c| 1240
> +++
> >  drivers/net/ifcvf/rte_ifcvf_version.map |4 +
> >  mk/rte.app.mk   |1 +
> 
> Need .ini file to represent driver features.
> Also it is good to add driver documentation and a note into release note to
> announce new driver.

Will do.

> 
> >  10 files changed, 1830 insertions(+)
> >  create mode 100644 drivers/net/ifcvf/Makefile
> >  create mode 100644 drivers/net/ifcvf/base/ifcvf.c
> >  create mode 100644 drivers/net/ifcvf/base/ifcvf.h
> >  create mode 100644 drivers/net/ifcvf/base/ifcvf_osdep.h
> >  create mode 100644 drivers/net/ifcvf/ifcvf_ethdev.c
> >  create mode 100644 drivers/net/ifcvf/rte_ifcvf_version.map
> >
> > diff --git a/config/common_base b/config/common_base
> > index ad03cf433..06fce1ebf 100644
> > --- a/config/common_base
> > +++ b/config/common_base
> > @@ -791,6 +791,12 @@ CONFIG_RTE_LIBRTE_VHOST_DEBUG=n
> >  #
> >  CONFIG_RTE_LIBRTE_PMD_VHOST=n
> >
> > +#
> > +# Compile IFCVF driver
> > +# To compile, CONFIG_RTE_LIBRTE_VHOST should be enabled.
> > +#
> > +CONFIG_RTE_LIBRTE_IFCVF=n
> > +
> >  #
> >  # Compile the test application
> >  #
> > diff --git a/config/common_linuxapp b/config/common_linuxapp
> > index ff98f2355..358d00468 100644
> > --- a/config/common_linuxapp
> > +++ b/config/common_linuxapp
> > @@ -15,6 +15,7 @@ CONFIG_RTE_LIBRTE_PMD_KNI=y
> >  CONFIG_RTE_LIBRTE_VHOST=y
> >  CONFIG_RTE_LIBRTE_VHOST_NUMA=y
> >  CONFIG_RTE_LIBRTE_PMD_VHOST=y
> > +CONFIG_RTE_LIBRTE_IFCVF=y
> 
> Current syntax for PMD config options:
> Virtual ones: CONFIG_RTE_LIBRTE_PMD_XXX
> Physical ones: CONFIG_RTE_LIBRTE_XXX_PMD
> 
> Virtual / Physical difference most probably not done intentionally but that is
> what it is right now.
> 
> Is "PMD" not added intentionally to the config option?

I think the vDPA driver is not a poll mode driver, so I didn't put "PMD" here. 
Do you think CONFIG_RTE_LIBRTE_VDPA_IFCVF is better?

> 
> And what is the config time dependency of the driver, I assume VHOST is one
> of
> them but are there more?

This dependency is described in drivers/net/Makefile, CONFIG_RTE_EAL_VFIO is 
another one, will add it.

> 
> >  CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
> >  CONFIG_RTE_LIBRTE_PMD_TAP=y
> >  CONFIG_RTE_LIBRTE_AVP_PMD=y
> > diff --git a/drivers/net/Makefile b/drivers/net/Makefile
> > index e1127326b..496acf2d2 100644
> > --- a/drivers/net/Makefile
> > +++ b/drivers/net/Makefile
> > @@ -53,6 +53,7 @@ endif # $(CONFIG_RTE_LIBRTE_SCHED)
> >
> >  ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y)
> >  DIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += vhost
> > +DIRS-$(CONFIG_RTE_LIBRTE_IFCVF) += ifcvf
> 
> Since this is mainly a vdpa driver, does it make sense to put it under
> drivers/net/virtio/vdpa/ifcvf
> 
> When there are more vdpa drivers they can go into drivers/net/virtio/vdpa/*

vDPA is for vhost offloading/acceleration, the device can be quite different 
from virtio,
they just need to be virtio ring compatible, and the usage model is quite 
different from virt

Re: [dpdk-dev] [PATCH 0/3] add ifcvf driver

2018-03-23 Thread Wang, Xiao W
Hi Maxime,

> -Original Message-
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Thursday, March 22, 2018 4:48 AM
> To: Wang, Xiao W ; dev@dpdk.org
> Cc: Wang, Zhihong ; y...@fridaylinux.org; Liang,
> Cunming ; Xu, Rosen ; Chen,
> Junjie J ; Daly, Dan 
> Subject: Re: [PATCH 0/3] add ifcvf driver
> 
> Hi Xiao,
> 
> On 03/15/2018 05:49 PM, Wang, Xiao W wrote:
> > Hi Maxime,
> >
> >> -Original Message-
> >> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> >> Sent: Sunday, March 11, 2018 2:24 AM
> >> To: Wang, Xiao W ; dev@dpdk.org
> >> Cc: Wang, Zhihong ; y...@fridaylinux.org; Liang,
> >> Cunming ; Xu, Rosen ;
> Chen,
> >> Junjie J ; Daly, Dan 
> >> Subject: Re: [PATCH 0/3] add ifcvf driver
> >>
> >> Hi Xiao,
> >>
> >> On 03/10/2018 12:08 AM, Xiao Wang wrote:
> >>> This patch set has dependency on
> >> http://dpdk.org/dev/patchwork/patch/35635/
> >>> (vhost: support selective datapath);
> >>>
> >>> ifc VF is compatible with virtio vring operations, this driver implements
> >>> vDPA driver ops which configures ifc VF to be a vhost data path 
> >>> accelerator.
> >>>
> >>> ifcvf driver uses vdev as a control domain to manage ifc VFs that belong
> >>> to it. It registers vDPA device ops to vhost lib to enable these VFs to be
> >>> used as vhost data path accelerator.
> >>>
> >>> Live migration feature is supported by ifc VF and this driver enables
> >>> it based on vhost lib.
> >>>
> >>> vDPA needs to create different containers for different devices, thus this
> >>> patch set adds APIs in eal/vfio to support multiple container.
> >> Thanks for this! That will avoid having to duplicate these functions
> >> for every new offload driver.
> >>
> >>
> >>>
> >>> Junjie Chen (1):
> >>> eal/vfio: add support for multiple container
> >>>
> >>> Xiao Wang (2):
> >>> bus/pci: expose sysfs parsing API
> >>
> >> Still, I'm not convinced the offload device should be a virtual device.
> >> It is a real PCI device, why not having a new device type for offload
> >> devices, and let the device to be probed automatically by the existing
> >> device model?
> >
> > IFC VFs are generated from SRIOV, with the PF driven by kernel driver.
> > In DPDK we need to have something to represent PF, to register itself as
> > a vDPA engine, so a virtual device is used for this purpose.
> I went through the code, and something is not clear to me.
> 
> Why do we need to have a representation of the PF in DPDK?
> Why cannot we just bind at VF level?

1. With the vdev representation we could talk to the PF kernel driver to do 
flow configuration; we can implement the flow API on the vdev in future for 
this purpose. Using a vdev allows introducing this kind of control plane 
feature.

2. When port representor is ready, we would integrate it into the ifcvf driver, 
and then each VF will have a representor port. For now we don't have port 
representor, so this patch set manages VF resources internally.

BRs,
Xiao


Re: [dpdk-dev] [PATCH v2 3/3] net/ifcvf: add ifcvf driver

2018-03-23 Thread Wang, Xiao W
Hi Maxime,

> -Original Message-
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Thursday, March 22, 2018 4:58 AM
> To: Wang, Xiao W ; y...@fridaylinux.org
> Cc: dev@dpdk.org; Wang, Zhihong ; Bie, Tiwei
> ; Chen, Junjie J ; Xu, Rosen
> ; Daly, Dan ; Liang, Cunming
> ; Burakov, Anatoly ;
> gaetan.ri...@6wind.com
> Subject: Re: [PATCH v2 3/3] net/ifcvf: add ifcvf driver
> 
> 
> 
> On 03/21/2018 02:21 PM, Xiao Wang wrote:
> > ifcvf driver uses vdev as a control domain to manage ifc VFs that belong
> > to it. It registers vDPA device ops to vhost lib to enable these VFs to be
> > used as vhost data path accelerator.
> >
> > Live migration feature is supported by ifc VF and this driver enables
> > it based on vhost lib.
> >
> > Because vDPA driver needs to set up MSI-X vector to interrupt the guest,
> > only vfio-pci is supported currently.
> >
> > Signed-off-by: Xiao Wang 
> > Signed-off-by: Rosen Xu 
> > ---
> > v2:
> > - Rebase on Zhihong's vDPA v3 patch set.
> > ---
> >   config/common_base  |6 +
> >   config/common_linuxapp  |1 +
> >   drivers/net/Makefile|1 +
> >   drivers/net/ifcvf/Makefile  |   40 +
> >   drivers/net/ifcvf/base/ifcvf.c  |  329 
> >   drivers/net/ifcvf/base/ifcvf.h  |  156 
> >   drivers/net/ifcvf/base/ifcvf_osdep.h|   52 ++
> >   drivers/net/ifcvf/ifcvf_ethdev.c| 1240
> +++
> >   drivers/net/ifcvf/rte_ifcvf_version.map |4 +
> >   mk/rte.app.mk   |1 +
> >   10 files changed, 1830 insertions(+)
> >   create mode 100644 drivers/net/ifcvf/Makefile
> >   create mode 100644 drivers/net/ifcvf/base/ifcvf.c
> >   create mode 100644 drivers/net/ifcvf/base/ifcvf.h
> >   create mode 100644 drivers/net/ifcvf/base/ifcvf_osdep.h
> >   create mode 100644 drivers/net/ifcvf/ifcvf_ethdev.c
> >   create mode 100644 drivers/net/ifcvf/rte_ifcvf_version.map
> >
> 
> ...
> 
> > +static int
> > +eth_dev_ifcvf_create(struct rte_vdev_device *dev,
> > +       struct rte_pci_addr *pci_addr, int devices)
> > +{
> > +   const char *name = rte_vdev_device_name(dev);
> > +   struct rte_eth_dev *eth_dev = NULL;
> > +   struct ether_addr *eth_addr = NULL;
> > +   struct ifcvf_internal *internal = NULL;
> > +   struct internal_list *list = NULL;
> > +   struct rte_eth_dev_data *data = NULL;
> > +   struct rte_pci_addr pf_addr = *pci_addr;
> > +   int i;
> > +
> > +   list = rte_zmalloc_socket(name, sizeof(*list), 0,
> > +       dev->device.numa_node);
> > +   if (list == NULL)
> > +       goto error;
> > +
> > +   /* reserve an ethdev entry */
> > +   eth_dev = rte_eth_vdev_allocate(dev, sizeof(*internal));
> > +   if (eth_dev == NULL)
> > +       goto error;
> > +
> > +   eth_addr = rte_zmalloc_socket(name, sizeof(*eth_addr), 0,
> > +       dev->device.numa_node);
> > +   if (eth_addr == NULL)
> > +       goto error;
> > +
> > +   *eth_addr = base_eth_addr;
> > +   eth_addr->addr_bytes[5] = eth_dev->data->port_id;
> > +
> > +   internal = eth_dev->data->dev_private;
> > +   internal->dev_name = strdup(name);
> > +   if (internal->dev_name == NULL)
> > +       goto error;
> > +
> > +   internal->eng_addr.pci_addr = *pci_addr;
> > +   for (i = 0; i < devices; i++) {
> > +       pf_addr.domain = pci_addr->domain;
> > +       pf_addr.bus = pci_addr->bus;
> > +       pf_addr.devid = pci_addr->devid + (i + 1) / 8;
> > +       pf_addr.function = pci_addr->function + (i + 1) % 8;
> > +       internal->vf_info[i].pdev.addr = pf_addr;
> > +       rte_spinlock_init(&internal->vf_info[i].lock);
> > +   }
> > +   internal->max_devices = devices;
> > +
> > +   list->eth_dev = eth_dev;
> > +   pthread_mutex_lock(&internal_list_lock);
> > +   TAILQ_INSERT_TAIL(&internal_list, list, next);
> > +   pthread_mutex_unlock(&internal_list_lock);
> > +
> > +   data = eth_dev->data;
> > +   data->nb_rx_queues = IFCVF_MAX_QUEUES;
> > +   data->nb_tx_queues = IFCVF_MAX_QUEUES;
> > +   data->dev_link = vdpa_link;
> > +   data->mac_addrs = eth_addr;
> 
> We might want one ethernet device per VF, as for example you set
> dev_link.link_status to UP as soon as a VF is configured,

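For readers following the quoted loop, the VF address arithmetic can be tried in isolation. The sketch below is only an illustration, not code from the patch: struct pci_addr_sketch, vf_addr_from_base() and vf_addr_is_valid() are hypothetical names, and the struct is a stand-in that carries only the fields the loop touches. The validity check relies on the standard PCI encoding, where the device number is 5 bits (0 to 31) and the function number is 3 bits (0 to 7).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_pci_addr; only the fields used here. */
struct pci_addr_sketch {
	uint32_t domain;
	uint8_t bus;
	uint8_t devid;
	uint8_t function;
};

/*
 * Derive the address of the i-th VF from the base address, using the same
 * arithmetic as the quoted loop: VF i sits (i + 1) functions after the base,
 * with every 8 functions rolling over into the next device number.
 */
static struct pci_addr_sketch
vf_addr_from_base(struct pci_addr_sketch base, int i)
{
	struct pci_addr_sketch vf = base;

	assert(i >= 0);
	vf.devid = (uint8_t)(base.devid + (i + 1) / 8);
	vf.function = (uint8_t)(base.function + (i + 1) % 8);
	return vf;
}

/* A PCI function number must fit in 3 bits and a device number in 5 bits. */
static bool
vf_addr_is_valid(struct pci_addr_sketch addr)
{
	return addr.function <= 7 && addr.devid <= 31;
}
```

With a base address whose function is 0, VF 0 lands on function 1 of the same device and VF 7 on function 0 of the next device. With a non-zero base function the plain sum can exceed 7, which is what vf_addr_is_valid() is meant to catch in this sketch.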
Re: [dpdk-dev] [PATCH v2 3/3] net/ifcvf: add ifcvf driver

2018-03-23 Thread Wang, Xiao W
Hi Thomas,

> -Original Message-
> From: Thomas Monjalon [mailto:tho...@monjalon.net]
> Sent: Thursday, March 22, 2018 4:52 AM
> To: Wang, Xiao W ; Xu, Rosen 
> Cc: dev@dpdk.org; maxime.coque...@redhat.com; y...@fridaylinux.org; Wang,
> Zhihong ; Bie, Tiwei ; Chen,
> Junjie J ; Daly, Dan ; Liang,
> Cunming ; Burakov, Anatoly
> ; gaetan.ri...@6wind.com
> Subject: Re: [dpdk-dev] [PATCH v2 3/3] net/ifcvf: add ifcvf driver
> 
> 21/03/2018 14:21, Xiao Wang:
> > ifcvf driver uses vdev as a control domain to manage ifc VFs that belong
> > to it. It registers vDPA device ops to vhost lib to enable these VFs to be
> > used as vhost data path accelerator.
> 
> Not everybody works at Intel.
> Please explain what ifcvf means and what a control domain is.

OK, and I will add a document.
> 
> > Live migration feature is supported by ifc VF and this driver enables
> > it based on vhost lib.
> >
> > Because vDPA driver needs to set up MSI-X vector to interrupt the guest,
> > only vfio-pci is supported currently.
> >
> > Signed-off-by: Xiao Wang 
> > Signed-off-by: Rosen Xu 
> > ---
> > v2:
> > - Rebase on Zhihong's vDPA v3 patch set.
> > ---
> >  config/common_base  |6 +
> >  config/common_linuxapp  |1 +
> >  drivers/net/Makefile|1 +
> >  drivers/net/ifcvf/Makefile  |   40 +
> >  drivers/net/ifcvf/base/ifcvf.c  |  329 
> >  drivers/net/ifcvf/base/ifcvf.h  |  156 
> >  drivers/net/ifcvf/base/ifcvf_osdep.h|   52 ++
> >  drivers/net/ifcvf/ifcvf_ethdev.c| 1240 +++
> >  drivers/net/ifcvf/rte_ifcvf_version.map |4 +
> >  mk/rte.app.mk   |1 +
> 
> This feature needs to be explained and documented.
> It will be helpful to understand the mechanism and to have a good review.
> Please do not merge it until there is good documentation.
> 

Will add a doc with more details.

BRs,
Xiao




