[dpdk-dev] [PATCH v6 2/2] vhost: Add VHOST PMD
On 2016/02/03 18:24, Ferruh Yigit wrote: > On Wed, Feb 03, 2016 at 04:48:17PM +0900, Tetsuya Mukawa wrote: >> On 2016/02/03 8:43, Ferruh Yigit wrote: >>> On Tue, Feb 02, 2016 at 08:18:42PM +0900, Tetsuya Mukawa wrote: + + /* find an ethdev entry */ + eth_dev = rte_eth_dev_allocated(name); + if (eth_dev == NULL) + return -ENODEV; + + internal = eth_dev->data->dev_private; + + rte_free(vring_states[internal->port_id]); + vring_states[internal->port_id] = NULL; + + pthread_mutex_lock(&internal_list_lock); + TAILQ_REMOVE(&internals_list, internal, next); + pthread_mutex_unlock(&internal_list_lock); + + eth_dev_stop(eth_dev); + + if ((internal) && (internal->dev_name)) >>> if "internal" can be NULL, above internal->port_id reference will crash, if >>> can't be NULL no need to check here. >>> >>> >> Hi Ferruh, > Hi Tetsuya, > >> I guess if internal is NULL, "internal->dev_name" will not be accessed. > Sure. > >> So it may be ok to stay above code. >> > But I mean 8,9 lines above there is an access to internal->port_id, either > internal NULL check should be before that access or removed completely. I've got your point. Thanks! Tetsuya
[dpdk-dev] [PATCH v6 2/2] vhost: Add VHOST PMD
On 2016/02/03 8:43, Ferruh Yigit wrote: > On Tue, Feb 02, 2016 at 08:18:42PM +0900, Tetsuya Mukawa wrote: >> + >> +/* find an ethdev entry */ >> +eth_dev = rte_eth_dev_allocated(name); >> +if (eth_dev == NULL) >> +return -ENODEV; >> + >> +internal = eth_dev->data->dev_private; >> + >> +rte_free(vring_states[internal->port_id]); >> +vring_states[internal->port_id] = NULL; >> + >> +pthread_mutex_lock(&internal_list_lock); >> +TAILQ_REMOVE(&internals_list, internal, next); >> +pthread_mutex_unlock(&internal_list_lock); >> + >> +eth_dev_stop(eth_dev); >> + >> +if ((internal) && (internal->dev_name)) > if "internal" can be NULL, above internal->port_id reference will crash, if > can't be NULL no need to check here. > > Hi Ferruh, I guess if internal is NULL, "internal->dev_name" will not be accessed. So it may be ok to stay above code. Tetsuya
[dpdk-dev] [PATCH v6 2/2] vhost: Add VHOST PMD
On 2016/02/03 8:43, Ferruh Yigit wrote: > On Tue, Feb 02, 2016 at 08:18:42PM +0900, Tetsuya Mukawa wrote: >> diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst >> index 33c9cea..5819cdb 100644 >> --- a/doc/guides/nics/index.rst >> +++ b/doc/guides/nics/index.rst >> @@ -47,6 +47,7 @@ Network Interface Controller Drivers >> nfp >> szedata2 >> virtio >> +vhost >> vmxnet3 >> pcap_ring >> >> diff --git a/doc/guides/rel_notes/release_2_2.rst >> b/doc/guides/rel_notes/release_2_2.rst > Should this be 2.3 release notes? Hi Ferruh, Thanks for your checking. Yes, it should be. I will fix it. >> + >> +struct pmd_internal { >> +TAILQ_ENTRY(pmd_internal) next; >> +char *dev_name; >> +char *iface_name; >> +unsigned nb_rx_queues; >> +unsigned nb_tx_queues; >> +uint8_t port_id; > nb_rx_queuues, nb_tx_queues and port_id are duplicated in rte_eth_dev_data, > there is no reason to not use them but create new ones. > But you may need to keep list of eth devices instead of internals_list for > this update, not sure. > please check: http://dpdk.org/dev/patchwork/patch/10284/ It seems I wrongly understand how to use queues in virtual PMD, then duplicates some values. Sure, I will follow above patch, and fix all same issues in my patch. Also will check Null PMD fixing. >> +for (i = 0; i < internal->nb_rx_queues; i++) { >> +vq = internal->rx_vhost_queues[i]; >> +if (vq == NULL) >> +continue; >> +vq->device = dev; >> +vq->internal = internal; >> +rte_vhost_enable_guest_notification( >> +dev, vq->virtqueue_id, 0); > syntax: no line wrap required here Will fix. > >> +} >> +for (i = 0; i < internal->nb_tx_queues; i++) { >> +vq = internal->tx_vhost_queues[i]; >> +if (vq == NULL) >> +continue; >> +vq->device = dev; >> +vq->internal = internal; >> +rte_vhost_enable_guest_notification( >> +dev, vq->virtqueue_id, 0); > syntax: no line wrap required here Will fix it also. >> + >> +static int >> +vring_state_changed(struct virtio_net *dev, uint16_t vring, int enable) >> +{ >> +struct rte_vhost_vring_state *state; >> +struct pmd_internal *internal; >> +#ifdef RTE_LIBRTE_VHOST_NUMA >> +int newnode, ret; >> +#endif >> + >> +if (dev == NULL) { >> +RTE_LOG(ERR, PMD, "Invalid argument\n"); >> +return -1; >> +} >> + >> +internal = find_internal_resource(dev->ifname); > Can find_internal_resource() return NULL here? Will add null pointer checking here. >> +state = vring_states[internal->port_id]; >> +if (!state) { >> +RTE_LOG(ERR, PMD, "Unused port\n"); >> +return -1; >> +} >> + >> +#ifdef RTE_LIBRTE_VHOST_NUMA >> +ret = get_mempolicy(&newnode, NULL, 0, dev, >> +MPOL_F_NODE | MPOL_F_ADDR); >> +if (ret < 0) { >> +RTE_LOG(ERR, PMD, "Unknow numa node\n"); >> +return -1; >> +} >> + >> +rte_eth_devices[internal->port_id].data->numa_node = newnode; > Isn't dev->priv already has eth_dev, do we need to access to eth_dev as > rte_eth_devices[...] > valid for above, instead of find_internal_resource() can't we use > dev->priv->data->dev_private 'dev->priv' will be filled in new_device(). And this event will be come before calling new_device(). Because of this, the function accesses rte_eth_devices[]. > >> +static int >> +eth_rx_queue_setup(struct rte_eth_dev *dev, uint16_t rx_queue_id, >> + uint16_t nb_rx_desc __rte_unused, >> + unsigned int socket_id, >> + const struct rte_eth_rxconf *rx_conf __rte_unused, >> + struct rte_mempool *mb_pool) >> +{ >> +struct pmd_internal *internal = dev->data->dev_private; >> +struct vhost_queue *vq; >> + >> +rte_free(internal->rx_vhost_queues[rx_queue_id]); > Why free here, initially this value already should be NULL? > If possible to call queue_setup multiple times, does make sense to free in > rx/tx_queue_release() functions? Yes, you are right. I forgot to delete debug code. Will remove it. >> + >> +vq = rte_zmalloc_socket(NULL, sizeof(struct vhost_queue), >> +RTE_CACHE_LINE_SIZE, socket_id); >> +if (vq == NULL) { >> +RTE_LOG(ERR, PMD, "Failed to allocate memory for rx queue\n"); >> +return -ENOMEM; >> +} > Other vPMDs use static memory for queues in internals struct to escape from > allocate/free with a cost of memory consumption > Just another option if you prefer. This is because queues may be in different numa node in vhost PMD case. >> +dev_info->min_rx_bufsize = 0; >> +} >> + >> +static void >> +eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *igb_stats) > why name is igb_stats, historical? Yeah, it's came from copy and
[dpdk-dev] [PATCH v6 2/2] vhost: Add VHOST PMD
On Wed, Feb 03, 2016 at 04:48:17PM +0900, Tetsuya Mukawa wrote: > On 2016/02/03 8:43, Ferruh Yigit wrote: > > On Tue, Feb 02, 2016 at 08:18:42PM +0900, Tetsuya Mukawa wrote: > >> + > >> + /* find an ethdev entry */ > >> + eth_dev = rte_eth_dev_allocated(name); > >> + if (eth_dev == NULL) > >> + return -ENODEV; > >> + > >> + internal = eth_dev->data->dev_private; > >> + > >> + rte_free(vring_states[internal->port_id]); > >> + vring_states[internal->port_id] = NULL; > >> + > >> + pthread_mutex_lock(&internal_list_lock); > >> + TAILQ_REMOVE(&internals_list, internal, next); > >> + pthread_mutex_unlock(&internal_list_lock); > >> + > >> + eth_dev_stop(eth_dev); > >> + > >> + if ((internal) && (internal->dev_name)) > > if "internal" can be NULL, above internal->port_id reference will crash, if > > can't be NULL no need to check here. > > > > > > Hi Ferruh, Hi Tetsuya, > > I guess if internal is NULL, "internal->dev_name" will not be accessed. Sure. > So it may be ok to stay above code. > But I mean 8,9 lines above there is an access to internal->port_id, either internal NULL check should be before that access or removed completely. Thanks, ferruh
[dpdk-dev] [PATCH v6 2/2] vhost: Add VHOST PMD
On Tue, Feb 02, 2016 at 08:18:42PM +0900, Tetsuya Mukawa wrote: > The patch introduces a new PMD. This PMD is implemented as thin wrapper > of librte_vhost. It means librte_vhost is also needed to compile the PMD. > The vhost messages will be handled only when a port is started. So start > a port first, then invoke QEMU. > > The PMD has 2 parameters. > - iface: The parameter is used to specify a path to connect to a >virtio-net device. > - queues: The parameter is used to specify the number of the queues >virtio-net device has. >(Default: 1) > > Here is an example. > $ ./testpmd -c f -n 4 --vdev 'eth_vhost0,iface=/tmp/sock0,queues=1' -- -i > > To connect above testpmd, here is qemu command example. > > $ qemu-system-x86_64 \ > > -chardev socket,id=chr0,path=/tmp/sock0 \ > -netdev vhost-user,id=net0,chardev=chr0,vhostforce,queues=1 \ > -device virtio-net-pci,netdev=net0,mq=on > > Signed-off-by: Tetsuya Mukawa > --- > config/common_linuxapp | 6 + > doc/guides/nics/index.rst | 1 + > doc/guides/rel_notes/release_2_2.rst| 2 + > drivers/net/Makefile| 4 + > drivers/net/vhost/Makefile | 62 ++ > drivers/net/vhost/rte_eth_vhost.c | 920 > > drivers/net/vhost/rte_eth_vhost.h | 73 +++ > drivers/net/vhost/rte_pmd_vhost_version.map | 11 + > mk/rte.app.mk | 8 +- > 9 files changed, 1086 insertions(+), 1 deletion(-) > create mode 100644 drivers/net/vhost/Makefile > create mode 100644 drivers/net/vhost/rte_eth_vhost.c > create mode 100644 drivers/net/vhost/rte_eth_vhost.h > create mode 100644 drivers/net/vhost/rte_pmd_vhost_version.map > > diff --git a/config/common_linuxapp b/config/common_linuxapp > index 74bc515..357b557 100644 > --- a/config/common_linuxapp > +++ b/config/common_linuxapp > @@ -514,6 +514,12 @@ CONFIG_RTE_LIBRTE_VHOST_NUMA=n > CONFIG_RTE_LIBRTE_VHOST_DEBUG=n > > # > +# Compile vhost PMD > +# To compile, CONFIG_RTE_LIBRTE_VHOST should be enabled. > +# > +CONFIG_RTE_LIBRTE_PMD_VHOST=y > + > +# > #Compile Xen domain0 support > # > CONFIG_RTE_LIBRTE_XEN_DOM0=n > diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst > index 33c9cea..5819cdb 100644 > --- a/doc/guides/nics/index.rst > +++ b/doc/guides/nics/index.rst > @@ -47,6 +47,7 @@ Network Interface Controller Drivers > nfp > szedata2 > virtio > +vhost > vmxnet3 > pcap_ring > > diff --git a/doc/guides/rel_notes/release_2_2.rst > b/doc/guides/rel_notes/release_2_2.rst Should this be 2.3 release notes? > index bb7d15a..6390b44 100644 > --- a/doc/guides/rel_notes/release_2_2.rst > +++ b/doc/guides/rel_notes/release_2_2.rst > @@ -168,6 +168,8 @@ New Features > > * **Added vhost-user multiple queue support.** > > +* **Added vhost PMD.** > + > * **Added port hotplug support to vmxnet3.** > > * **Added port hotplug support to xenvirt.** > diff --git a/drivers/net/Makefile b/drivers/net/Makefile > index 6e4497e..4300b93 100644 > --- a/drivers/net/Makefile > +++ b/drivers/net/Makefile > @@ -52,5 +52,9 @@ DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio > DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += vmxnet3 > DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += xenvirt > > +ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y) > +DIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += vhost > +endif # $(CONFIG_RTE_LIBRTE_VHOST) > + Not directly related to the your patch, but since I saw here: I think we should have resolve dependencies in our "config tool"/"make system" instead of having them in makefiles, for long term. But there is nothing you can do right now, since there is nothing available to resolve configuration dependencies. > include $(RTE_SDK)/mk/rte.sharelib.mk > include $(RTE_SDK)/mk/rte.subdir.mk > diff --git a/drivers/net/vhost/Makefile b/drivers/net/vhost/Makefile > new file mode 100644 > index 000..f49a69b > --- /dev/null > +++ b/drivers/net/vhost/Makefile > @@ -0,0 +1,62 @@ > +# BSD LICENSE > +# > +# Copyright (c) 2010-2016 Intel Corporation. > +# All rights reserved. > +# > +# Redistribution and use in source and binary forms, with or without > +# modification, are permitted provided that the following conditions > +# are met: > +# > +# * Redistributions of source code must retain the above copyright > +# notice, this list of conditions and the following disclaimer. > +# * Redistributions in binary form must reproduce the above copyright > +# notice, this list of conditions and the following disclaimer in > +# the documentation and/or other materials provided with the > +# distribution. > +# * Neither the name of Intel corporation nor the names of its > +# contributors may be used to endorse or promote products derived > +# from this software without specific prior written permission. > +# > +# T
[dpdk-dev] [PATCH v6 2/2] vhost: Add VHOST PMD
The patch introduces a new PMD. This PMD is implemented as thin wrapper of librte_vhost. It means librte_vhost is also needed to compile the PMD. The vhost messages will be handled only when a port is started. So start a port first, then invoke QEMU. The PMD has 2 parameters. - iface: The parameter is used to specify a path to connect to a virtio-net device. - queues: The parameter is used to specify the number of the queues virtio-net device has. (Default: 1) Here is an example. $ ./testpmd -c f -n 4 --vdev 'eth_vhost0,iface=/tmp/sock0,queues=1' -- -i To connect above testpmd, here is qemu command example. $ qemu-system-x86_64 \ -chardev socket,id=chr0,path=/tmp/sock0 \ -netdev vhost-user,id=net0,chardev=chr0,vhostforce,queues=1 \ -device virtio-net-pci,netdev=net0,mq=on Signed-off-by: Tetsuya Mukawa --- config/common_linuxapp | 6 + doc/guides/nics/index.rst | 1 + doc/guides/rel_notes/release_2_2.rst| 2 + drivers/net/Makefile| 4 + drivers/net/vhost/Makefile | 62 ++ drivers/net/vhost/rte_eth_vhost.c | 920 drivers/net/vhost/rte_eth_vhost.h | 73 +++ drivers/net/vhost/rte_pmd_vhost_version.map | 11 + mk/rte.app.mk | 8 +- 9 files changed, 1086 insertions(+), 1 deletion(-) create mode 100644 drivers/net/vhost/Makefile create mode 100644 drivers/net/vhost/rte_eth_vhost.c create mode 100644 drivers/net/vhost/rte_eth_vhost.h create mode 100644 drivers/net/vhost/rte_pmd_vhost_version.map diff --git a/config/common_linuxapp b/config/common_linuxapp index 74bc515..357b557 100644 --- a/config/common_linuxapp +++ b/config/common_linuxapp @@ -514,6 +514,12 @@ CONFIG_RTE_LIBRTE_VHOST_NUMA=n CONFIG_RTE_LIBRTE_VHOST_DEBUG=n # +# Compile vhost PMD +# To compile, CONFIG_RTE_LIBRTE_VHOST should be enabled. +# +CONFIG_RTE_LIBRTE_PMD_VHOST=y + +# #Compile Xen domain0 support # CONFIG_RTE_LIBRTE_XEN_DOM0=n diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst index 33c9cea..5819cdb 100644 --- a/doc/guides/nics/index.rst +++ b/doc/guides/nics/index.rst @@ -47,6 +47,7 @@ Network Interface Controller Drivers nfp szedata2 virtio +vhost vmxnet3 pcap_ring diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst index bb7d15a..6390b44 100644 --- a/doc/guides/rel_notes/release_2_2.rst +++ b/doc/guides/rel_notes/release_2_2.rst @@ -168,6 +168,8 @@ New Features * **Added vhost-user multiple queue support.** +* **Added vhost PMD.** + * **Added port hotplug support to vmxnet3.** * **Added port hotplug support to xenvirt.** diff --git a/drivers/net/Makefile b/drivers/net/Makefile index 6e4497e..4300b93 100644 --- a/drivers/net/Makefile +++ b/drivers/net/Makefile @@ -52,5 +52,9 @@ DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += vmxnet3 DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += xenvirt +ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y) +DIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += vhost +endif # $(CONFIG_RTE_LIBRTE_VHOST) + include $(RTE_SDK)/mk/rte.sharelib.mk include $(RTE_SDK)/mk/rte.subdir.mk diff --git a/drivers/net/vhost/Makefile b/drivers/net/vhost/Makefile new file mode 100644 index 000..f49a69b --- /dev/null +++ b/drivers/net/vhost/Makefile @@ -0,0 +1,62 @@ +# BSD LICENSE +# +# Copyright (c) 2010-2016 Intel Corporation. +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Intel corporation nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUD