[dpdk-dev] [PATCH v6 2/2] vhost: Add VHOST PMD

2016-02-03 Thread Tetsuya Mukawa
On 2016/02/03 18:24, Ferruh Yigit wrote:
> On Wed, Feb 03, 2016 at 04:48:17PM +0900, Tetsuya Mukawa wrote:
>> On 2016/02/03 8:43, Ferruh Yigit wrote:
>>> On Tue, Feb 02, 2016 at 08:18:42PM +0900, Tetsuya Mukawa wrote:
 +
 +  /* find an ethdev entry */
 +  eth_dev = rte_eth_dev_allocated(name);
 +  if (eth_dev == NULL)
 +  return -ENODEV;
 +
 +  internal = eth_dev->data->dev_private;
 +
 +  rte_free(vring_states[internal->port_id]);
 +  vring_states[internal->port_id] = NULL;
 +
 +  pthread_mutex_lock(&internal_list_lock);
 +  TAILQ_REMOVE(&internals_list, internal, next);
 +  pthread_mutex_unlock(&internal_list_lock);
 +
 +  eth_dev_stop(eth_dev);
 +
 +  if ((internal) && (internal->dev_name))
>>> if "internal" can be NULL, above internal->port_id reference will crash, if 
>>> can't be NULL no need to check here.
>>>
>>>
>> Hi Ferruh,
> Hi Tetsuya,
>
>> I guess if internal is NULL, "internal->dev_name" will not be accessed.
> Sure.
>
>> So it may be ok to stay above code.
>>
> But I mean 8,9 lines above there is an access to internal->port_id, either 
> internal NULL check should be before that access or removed completely.

I've got your point.
Thanks!

Tetsuya


[dpdk-dev] [PATCH v6 2/2] vhost: Add VHOST PMD

2016-02-03 Thread Tetsuya Mukawa
On 2016/02/03 8:43, Ferruh Yigit wrote:
> On Tue, Feb 02, 2016 at 08:18:42PM +0900, Tetsuya Mukawa wrote:
>> +
>> +/* find an ethdev entry */
>> +eth_dev = rte_eth_dev_allocated(name);
>> +if (eth_dev == NULL)
>> +return -ENODEV;
>> +
>> +internal = eth_dev->data->dev_private;
>> +
>> +rte_free(vring_states[internal->port_id]);
>> +vring_states[internal->port_id] = NULL;
>> +
>> +pthread_mutex_lock(&internal_list_lock);
>> +TAILQ_REMOVE(&internals_list, internal, next);
>> +pthread_mutex_unlock(&internal_list_lock);
>> +
>> +eth_dev_stop(eth_dev);
>> +
>> +if ((internal) && (internal->dev_name))
> if "internal" can be NULL, above internal->port_id reference will crash, if 
> can't be NULL no need to check here.
>
>

Hi Ferruh,

I guess if internal is NULL, "internal->dev_name" will not be accessed.
So it may be ok to stay above code.

Tetsuya


[dpdk-dev] [PATCH v6 2/2] vhost: Add VHOST PMD

2016-02-03 Thread Tetsuya Mukawa
On 2016/02/03 8:43, Ferruh Yigit wrote:
> On Tue, Feb 02, 2016 at 08:18:42PM +0900, Tetsuya Mukawa wrote:
>> diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
>> index 33c9cea..5819cdb 100644
>> --- a/doc/guides/nics/index.rst
>> +++ b/doc/guides/nics/index.rst
>> @@ -47,6 +47,7 @@ Network Interface Controller Drivers
>>  nfp
>>  szedata2
>>  virtio
>> +vhost
>>  vmxnet3
>>  pcap_ring
>>  
>> diff --git a/doc/guides/rel_notes/release_2_2.rst 
>> b/doc/guides/rel_notes/release_2_2.rst
> Should this be 2.3 release notes?

Hi Ferruh,

Thanks for your checking.
Yes, it should be. I will fix it.

>> +
>> +struct pmd_internal {
>> +TAILQ_ENTRY(pmd_internal) next;
>> +char *dev_name;
>> +char *iface_name;
>> +unsigned nb_rx_queues;
>> +unsigned nb_tx_queues;
>> +uint8_t port_id;
> nb_rx_queuues, nb_tx_queues and port_id are duplicated in rte_eth_dev_data, 
> there is no reason to not use them but create new ones.
> But you may need to keep list of eth devices instead of internals_list for 
> this update, not sure.
> please check: http://dpdk.org/dev/patchwork/patch/10284/

It seems I wrongly understand how to use queues in virtual PMD, then
duplicates some values.
Sure, I will follow above patch, and fix all same issues in my patch.
Also will check Null PMD fixing.

>> +for (i = 0; i < internal->nb_rx_queues; i++) {
>> +vq = internal->rx_vhost_queues[i];
>> +if (vq == NULL)
>> +continue;
>> +vq->device = dev;
>> +vq->internal = internal;
>> +rte_vhost_enable_guest_notification(
>> +dev, vq->virtqueue_id, 0);
> syntax: no line wrap required here

Will fix.

>
>> +}
>> +for (i = 0; i < internal->nb_tx_queues; i++) {
>> +vq = internal->tx_vhost_queues[i];
>> +if (vq == NULL)
>> +continue;
>> +vq->device = dev;
>> +vq->internal = internal;
>> +rte_vhost_enable_guest_notification(
>> +dev, vq->virtqueue_id, 0);
> syntax: no line wrap required here

Will fix it also.

>> +
>> +static int
>> +vring_state_changed(struct virtio_net *dev, uint16_t vring, int enable)
>> +{
>> +struct rte_vhost_vring_state *state;
>> +struct pmd_internal *internal;
>> +#ifdef RTE_LIBRTE_VHOST_NUMA
>> +int newnode, ret;
>> +#endif
>> +
>> +if (dev == NULL) {
>> +RTE_LOG(ERR, PMD, "Invalid argument\n");
>> +return -1;
>> +}
>> +
>> +internal = find_internal_resource(dev->ifname);
> Can find_internal_resource() return NULL here?

Will add null pointer checking here.

>> +state = vring_states[internal->port_id];
>> +if (!state) {
>> +RTE_LOG(ERR, PMD, "Unused port\n");
>> +return -1;
>> +}
>> +
>> +#ifdef RTE_LIBRTE_VHOST_NUMA
>> +ret  = get_mempolicy(&newnode, NULL, 0, dev,
>> +MPOL_F_NODE | MPOL_F_ADDR);
>> +if (ret < 0) {
>> +RTE_LOG(ERR, PMD, "Unknow numa node\n");
>> +return -1;
>> +}
>> +
>> +rte_eth_devices[internal->port_id].data->numa_node = newnode;
> Isn't dev->priv already has eth_dev, do we need to access to eth_dev as 
> rte_eth_devices[...]
> valid for above, instead of find_internal_resource() can't we use 
> dev->priv->data->dev_private

'dev->priv' will be filled in new_device(). And this event will be come
before calling new_device().
Because of this, the function accesses rte_eth_devices[].

>
>> +static int
>> +eth_rx_queue_setup(struct rte_eth_dev *dev, uint16_t rx_queue_id,
>> +   uint16_t nb_rx_desc __rte_unused,
>> +   unsigned int socket_id,
>> +   const struct rte_eth_rxconf *rx_conf __rte_unused,
>> +   struct rte_mempool *mb_pool)
>> +{
>> +struct pmd_internal *internal = dev->data->dev_private;
>> +struct vhost_queue *vq;
>> +
>> +rte_free(internal->rx_vhost_queues[rx_queue_id]);
> Why free here, initially this value already should be NULL?
> If possible to call queue_setup multiple times, does make sense to free in 
> rx/tx_queue_release() functions?

Yes, you are right.
I forgot to delete debug code.
Will remove it.

>> +
>> +vq = rte_zmalloc_socket(NULL, sizeof(struct vhost_queue),
>> +RTE_CACHE_LINE_SIZE, socket_id);
>> +if (vq == NULL) {
>> +RTE_LOG(ERR, PMD, "Failed to allocate memory for rx queue\n");
>> +return -ENOMEM;
>> +}
> Other vPMDs use static memory for queues in internals struct to escape from 
> allocate/free with a cost of memory consumption
> Just another option if you prefer.

This is because queues may be in different numa node in vhost PMD case.

>> +dev_info->min_rx_bufsize = 0;
>> +}
>> +
>> +static void
>> +eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *igb_stats)
> why name is igb_stats, historical?

Yeah, it's came from copy and

[dpdk-dev] [PATCH v6 2/2] vhost: Add VHOST PMD

2016-02-03 Thread Ferruh Yigit
On Wed, Feb 03, 2016 at 04:48:17PM +0900, Tetsuya Mukawa wrote:
> On 2016/02/03 8:43, Ferruh Yigit wrote:
> > On Tue, Feb 02, 2016 at 08:18:42PM +0900, Tetsuya Mukawa wrote:
> >> +
> >> +  /* find an ethdev entry */
> >> +  eth_dev = rte_eth_dev_allocated(name);
> >> +  if (eth_dev == NULL)
> >> +  return -ENODEV;
> >> +
> >> +  internal = eth_dev->data->dev_private;
> >> +
> >> +  rte_free(vring_states[internal->port_id]);
> >> +  vring_states[internal->port_id] = NULL;
> >> +
> >> +  pthread_mutex_lock(&internal_list_lock);
> >> +  TAILQ_REMOVE(&internals_list, internal, next);
> >> +  pthread_mutex_unlock(&internal_list_lock);
> >> +
> >> +  eth_dev_stop(eth_dev);
> >> +
> >> +  if ((internal) && (internal->dev_name))
> > if "internal" can be NULL, above internal->port_id reference will crash, if 
> > can't be NULL no need to check here.
> >
> >
> 
> Hi Ferruh,
Hi Tetsuya,

> 
> I guess if internal is NULL, "internal->dev_name" will not be accessed.
Sure.

> So it may be ok to stay above code.
> 
But I mean 8,9 lines above there is an access to internal->port_id, either 
internal NULL check should be before that access or removed completely.

Thanks,
ferruh


[dpdk-dev] [PATCH v6 2/2] vhost: Add VHOST PMD

2016-02-02 Thread Ferruh Yigit
On Tue, Feb 02, 2016 at 08:18:42PM +0900, Tetsuya Mukawa wrote:
> The patch introduces a new PMD. This PMD is implemented as thin wrapper
> of librte_vhost. It means librte_vhost is also needed to compile the PMD.
> The vhost messages will be handled only when a port is started. So start
> a port first, then invoke QEMU.
> 
> The PMD has 2 parameters.
>  - iface:  The parameter is used to specify a path to connect to a
>virtio-net device.
>  - queues: The parameter is used to specify the number of the queues
>virtio-net device has.
>(Default: 1)
> 
> Here is an example.
> $ ./testpmd -c f -n 4 --vdev 'eth_vhost0,iface=/tmp/sock0,queues=1' -- -i
> 
> To connect above testpmd, here is qemu command example.
> 
> $ qemu-system-x86_64 \
> 
> -chardev socket,id=chr0,path=/tmp/sock0 \
> -netdev vhost-user,id=net0,chardev=chr0,vhostforce,queues=1 \
> -device virtio-net-pci,netdev=net0,mq=on
> 
> Signed-off-by: Tetsuya Mukawa 
> ---
>  config/common_linuxapp  |   6 +
>  doc/guides/nics/index.rst   |   1 +
>  doc/guides/rel_notes/release_2_2.rst|   2 +
>  drivers/net/Makefile|   4 +
>  drivers/net/vhost/Makefile  |  62 ++
>  drivers/net/vhost/rte_eth_vhost.c   | 920 
> 
>  drivers/net/vhost/rte_eth_vhost.h   |  73 +++
>  drivers/net/vhost/rte_pmd_vhost_version.map |  11 +
>  mk/rte.app.mk   |   8 +-
>  9 files changed, 1086 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/net/vhost/Makefile
>  create mode 100644 drivers/net/vhost/rte_eth_vhost.c
>  create mode 100644 drivers/net/vhost/rte_eth_vhost.h
>  create mode 100644 drivers/net/vhost/rte_pmd_vhost_version.map
> 
> diff --git a/config/common_linuxapp b/config/common_linuxapp
> index 74bc515..357b557 100644
> --- a/config/common_linuxapp
> +++ b/config/common_linuxapp
> @@ -514,6 +514,12 @@ CONFIG_RTE_LIBRTE_VHOST_NUMA=n
>  CONFIG_RTE_LIBRTE_VHOST_DEBUG=n
>  
>  #
> +# Compile vhost PMD
> +# To compile, CONFIG_RTE_LIBRTE_VHOST should be enabled.
> +#
> +CONFIG_RTE_LIBRTE_PMD_VHOST=y
> +
> +#
>  #Compile Xen domain0 support
>  #
>  CONFIG_RTE_LIBRTE_XEN_DOM0=n
> diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
> index 33c9cea..5819cdb 100644
> --- a/doc/guides/nics/index.rst
> +++ b/doc/guides/nics/index.rst
> @@ -47,6 +47,7 @@ Network Interface Controller Drivers
>  nfp
>  szedata2
>  virtio
> +vhost
>  vmxnet3
>  pcap_ring
>  
> diff --git a/doc/guides/rel_notes/release_2_2.rst 
> b/doc/guides/rel_notes/release_2_2.rst

Should this be 2.3 release notes?

> index bb7d15a..6390b44 100644
> --- a/doc/guides/rel_notes/release_2_2.rst
> +++ b/doc/guides/rel_notes/release_2_2.rst
> @@ -168,6 +168,8 @@ New Features
>  
>  * **Added vhost-user multiple queue support.**
>  
> +* **Added vhost PMD.**
> +
>  * **Added port hotplug support to vmxnet3.**
>  
>  * **Added port hotplug support to xenvirt.**
> diff --git a/drivers/net/Makefile b/drivers/net/Makefile
> index 6e4497e..4300b93 100644
> --- a/drivers/net/Makefile
> +++ b/drivers/net/Makefile
> @@ -52,5 +52,9 @@ DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio
>  DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += vmxnet3
>  DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += xenvirt
>  
> +ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y)
> +DIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += vhost
> +endif # $(CONFIG_RTE_LIBRTE_VHOST)
> +

Not directly related to the your patch, but since I saw here:
I think we should have resolve dependencies in our "config tool"/"make system" 
instead of having them in makefiles, for long term.
But there is nothing you can do right now, since there is nothing available to 
resolve configuration dependencies.

>  include $(RTE_SDK)/mk/rte.sharelib.mk
>  include $(RTE_SDK)/mk/rte.subdir.mk
> diff --git a/drivers/net/vhost/Makefile b/drivers/net/vhost/Makefile
> new file mode 100644
> index 000..f49a69b
> --- /dev/null
> +++ b/drivers/net/vhost/Makefile
> @@ -0,0 +1,62 @@
> +#   BSD LICENSE
> +#
> +#   Copyright (c) 2010-2016 Intel Corporation.
> +#   All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +# * Redistributions of source code must retain the above copyright
> +#   notice, this list of conditions and the following disclaimer.
> +# * Redistributions in binary form must reproduce the above copyright
> +#   notice, this list of conditions and the following disclaimer in
> +#   the documentation and/or other materials provided with the
> +#   distribution.
> +# * Neither the name of Intel corporation nor the names of its
> +#   contributors may be used to endorse or promote products derived
> +#   from this software without specific prior written permission.
> +#
> +#   T

[dpdk-dev] [PATCH v6 2/2] vhost: Add VHOST PMD

2016-02-02 Thread Tetsuya Mukawa
The patch introduces a new PMD. This PMD is implemented as thin wrapper
of librte_vhost. It means librte_vhost is also needed to compile the PMD.
The vhost messages will be handled only when a port is started. So start
a port first, then invoke QEMU.

The PMD has 2 parameters.
 - iface:  The parameter is used to specify a path to connect to a
   virtio-net device.
 - queues: The parameter is used to specify the number of the queues
   virtio-net device has.
   (Default: 1)

Here is an example.
$ ./testpmd -c f -n 4 --vdev 'eth_vhost0,iface=/tmp/sock0,queues=1' -- -i

To connect above testpmd, here is qemu command example.

$ qemu-system-x86_64 \

-chardev socket,id=chr0,path=/tmp/sock0 \
-netdev vhost-user,id=net0,chardev=chr0,vhostforce,queues=1 \
-device virtio-net-pci,netdev=net0,mq=on

Signed-off-by: Tetsuya Mukawa 
---
 config/common_linuxapp  |   6 +
 doc/guides/nics/index.rst   |   1 +
 doc/guides/rel_notes/release_2_2.rst|   2 +
 drivers/net/Makefile|   4 +
 drivers/net/vhost/Makefile  |  62 ++
 drivers/net/vhost/rte_eth_vhost.c   | 920 
 drivers/net/vhost/rte_eth_vhost.h   |  73 +++
 drivers/net/vhost/rte_pmd_vhost_version.map |  11 +
 mk/rte.app.mk   |   8 +-
 9 files changed, 1086 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/vhost/Makefile
 create mode 100644 drivers/net/vhost/rte_eth_vhost.c
 create mode 100644 drivers/net/vhost/rte_eth_vhost.h
 create mode 100644 drivers/net/vhost/rte_pmd_vhost_version.map

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 74bc515..357b557 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -514,6 +514,12 @@ CONFIG_RTE_LIBRTE_VHOST_NUMA=n
 CONFIG_RTE_LIBRTE_VHOST_DEBUG=n

 #
+# Compile vhost PMD
+# To compile, CONFIG_RTE_LIBRTE_VHOST should be enabled.
+#
+CONFIG_RTE_LIBRTE_PMD_VHOST=y
+
+#
 #Compile Xen domain0 support
 #
 CONFIG_RTE_LIBRTE_XEN_DOM0=n
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 33c9cea..5819cdb 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -47,6 +47,7 @@ Network Interface Controller Drivers
 nfp
 szedata2
 virtio
+vhost
 vmxnet3
 pcap_ring

diff --git a/doc/guides/rel_notes/release_2_2.rst 
b/doc/guides/rel_notes/release_2_2.rst
index bb7d15a..6390b44 100644
--- a/doc/guides/rel_notes/release_2_2.rst
+++ b/doc/guides/rel_notes/release_2_2.rst
@@ -168,6 +168,8 @@ New Features

 * **Added vhost-user multiple queue support.**

+* **Added vhost PMD.**
+
 * **Added port hotplug support to vmxnet3.**

 * **Added port hotplug support to xenvirt.**
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 6e4497e..4300b93 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -52,5 +52,9 @@ DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio
 DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += vmxnet3
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += xenvirt

+ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y)
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += vhost
+endif # $(CONFIG_RTE_LIBRTE_VHOST)
+
 include $(RTE_SDK)/mk/rte.sharelib.mk
 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/drivers/net/vhost/Makefile b/drivers/net/vhost/Makefile
new file mode 100644
index 000..f49a69b
--- /dev/null
+++ b/drivers/net/vhost/Makefile
@@ -0,0 +1,62 @@
+#   BSD LICENSE
+#
+#   Copyright (c) 2010-2016 Intel Corporation.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUD