> > On 05.01.2018 19:13, Ciara Loftus wrote: > > Zero copy is disabled by default. To enable it, set the 'dq-zero-copy' > > option to 'true' when configuring the Interface: > > > > ovs-vsctl set Interface dpdkvhostuserclient0 > > options:vhost-server-path=/tmp/dpdkvhostuserclient0 > > options:dq-zero-copy=true > > > > When packets from a vHost device with zero copy enabled are destined for > > a single 'dpdk' port, the number of tx descriptors on that 'dpdk' port > > must be set to a smaller value. 128 is recommended. This can be achieved > > like so: > > > > ovs-vsctl set Interface dpdkport options:n_txq_desc=128 > > > > Note: The sum of the tx descriptors of all 'dpdk' ports the VM will send > > to should not exceed 128. Due to this requirement, the feature is > > considered 'experimental'. > > > > Testing of the patch showed a 15% improvement when switching 512B > > packets between vHost devices on different VMs on the same host when > > zero copy was enabled on the transmitting device. > > > > Signed-off-by: Ciara Loftus <ciara.lof...@intel.com> > > --- > > v8: > > * Disallow configurability after vHost device has been registered & > > update docs accordingly. > > * Give performance datapoint in commit message. > > > > Documentation/intro/install/dpdk.rst | 2 + > > Documentation/topics/dpdk/vhost-user.rst | 72 > ++++++++++++++++++++++++++++++++ > > NEWS | 1 + > > lib/netdev-dpdk.c | 9 +++- > > vswitchd/vswitch.xml | 11 +++++ > > 5 files changed, 94 insertions(+), 1 deletion(-) > > > > diff --git a/Documentation/intro/install/dpdk.rst > b/Documentation/intro/install/dpdk.rst > > index 3fecb5c..087eb88 100644 > > --- a/Documentation/intro/install/dpdk.rst > > +++ b/Documentation/intro/install/dpdk.rst > > @@ -518,6 +518,8 @@ The above command sets the number of rx queues > for DPDK physical interface. > > The rx queues are assigned to pmd threads on the same NUMA node in a > > round-robin fashion. > > > > +.. _dpdk-queues-sizes: > > + > > DPDK Physical Port Queue Sizes > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > > diff --git a/Documentation/topics/dpdk/vhost-user.rst > b/Documentation/topics/dpdk/vhost-user.rst > > index 8447e2d..1a6c6d0 100644 > > --- a/Documentation/topics/dpdk/vhost-user.rst > > +++ b/Documentation/topics/dpdk/vhost-user.rst > > @@ -458,3 +458,75 @@ Sample XML > > </domain> > > > > .. _QEMU documentation: http://git.qemu- > project.org/?p=qemu.git;a=blob;f=docs/specs/vhost- > user.txt;h=7890d7169;hb=HEAD > > + > > +vhost-user Dequeue Zero Copy (experimental) > > +------------------------------------------- > > + > > +Normally when dequeuing a packet from a vHost User device, a memcpy > operation > > +must be used to copy that packet from guest address space to host > address > > +space. This memcpy can be removed by enabling dequeue zero-copy like > so:: > > + > > + $ ovs-vsctl add-port br0 dpdkvhostuserclient0 -- set Interface \ > > + dpdkvhostuserclient0 type=dpdkvhostuserclient \ > > + options:vhost-server-path=/tmp/dpdkvhostclient0 \ > > + options:dq-zero-copy=true > > + > > +With this feature enabled, a reference (pointer) to the packet is passed to > > +the host, instead of a copy of the packet. Removing this memcpy can give > a > > +performance improvement for some use cases, for example switching > large packets > > +between different VMs. However additional packet loss may be > observed. > > + > > +Note that the feature is disabled by default and must be explicitly enabled > > +by setting the 'dq-zero-copy' option to 'true' while specifying the > > +'vhost-server-path' option as above. If you wish to split out the command > into > > +multiple commands as below, ensure 'dq-zero-copy' is set before > > +'vhost-server-path':: > > + > > + $ ovs-vsctl set Interface dpdkvhostuserclient0 options:dq-zero- > copy=true > > + $ ovs-vsctl set Interface dpdkvhostuserclient0 \ > > + options:vhost-server-path=/tmp/dpdkvhostclient0 > > + > > +The feature is only available to dpdkvhostuserclient port types. > > + > > +A limitation exists whereby if packets from a vHost port with dq-zero- > copy=true > > +are destined for a 'dpdk' type port, the number of tx descriptors > (n_txq_desc) > > +for that port must be reduced to a smaller number, 128 being the > recommended > > +value. This can be achieved by issuing the following command:: > > + > > + $ ovs-vsctl set Interface dpdkport options:n_txq_desc=128 > > + > > +Note: The sum of the tx descriptors of all 'dpdk' ports the VM will send to > > +should not exceed 128. For example, in case of a bond over two physical > ports > > +in balance-tcp mode, one must divide 128 by the number of links in the > bond. > > + > > +Refer to :ref:`<dpdk-queue-sizes>` for more information. > > + > > +The reason for this limitation is due to how the zero copy functionality is > > +implemented. The vHost device's 'tx used vring', a virtio structure used > > for > > +tracking used ie. sent descriptors, will only be updated when the NIC > frees > > +the corresponding mbuf. If we don't free the mbufs frequently enough, > that > > +vring will be starved and packets will no longer be processed. One way to > > +ensure we don't encounter this scenario, is to configure n_txq_desc to a > small > > +enough number such that the 'mbuf free threshold' for the NIC will be hit > more > > +often and thus free mbufs more frequently. The value of 128 is > suggested, but > > +values of 64 and 256 have been tested and verified to work too, with > differing > > +performance characteristics. A value of 512 can be used too, if the virtio > > +queue size in the guest is increased to 1024 (available to configure in > QEMU > > +versions v2.10 and greater). This value can be set like so:: > > + > > + $ qemu-system-x86_64 ... -chardev > socket,id=char1,path=<sockpath>,server > > + -netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce > > + -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1, > > + tx_queue_size=1024 > > + > > +Because of this limitation, this feature is condsidered 'experimental'. > > + > > +The feature currently does not fully work with QEMU >= v2.7 due to a bug > in > > +DPDK which will be addressed in an upcoming release. The patch to fix this > > +issue can be found on > > +`Patchwork > > +<http://dpdk.org/dev/patchwork/patch/32198/>`__ > > + > > +Further information can be found in the > > +`DPDK documentation > > +<http://dpdk.readthedocs.io/en/v17.05/prog_guide/vhost_lib.html>`__ > > There are many keywords like 'vhost-server-path', 'dq-zero-copy', 'true' and > so on in above text. IMHO, they should be marked like ``something`` to be > highlighted. > > > diff --git a/NEWS b/NEWS > > index 752e98f..1eaa32c 100644 > > --- a/NEWS > > +++ b/NEWS > > @@ -24,6 +24,7 @@ Post-v2.8.0 > > - DPDK: > > * Add support for DPDK v17.11 > > * Add support for vHost IOMMU > > + * Add support for vHost dequeue zero copy (experimental) > > > > v2.8.0 - 31 Aug 2017 > > -------------------- > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c > > index 8f22264..d100b31 100644 > > --- a/lib/netdev-dpdk.c > > +++ b/lib/netdev-dpdk.c > > @@ -1384,6 +1384,10 @@ netdev_dpdk_vhost_client_set_config(struct > netdev *netdev, > > path = smap_get(args, "vhost-server-path"); > > if (path && strcmp(path, dev->vhost_id)) { > > strcpy(dev->vhost_id, path); > > + /* check zero copy configuration */ > > + if (smap_get_bool(args, "dq-zero-copy", false)) { > > + dev->vhost_driver_flags |= > RTE_VHOST_USER_DEQUEUE_ZERO_COPY; > > In case of rte_vhost_driver_register() failure while reconfiguration we'll not > be able to disable "dq-zero-copy" for that port, becose we're only anble to > enable it. > For example, it could happen in case of typo in vhost_id.
Hi Ilya, Can you clarify what you want here. Previously we had the ability to enable/disable zc but I removed it due to your comments. Do you want to be able to enable/disable zc before successful driver register but not after? What is the benefit? Thanks, Ciara > > > + } > > netdev_request_reconfigure(netdev); > > } > > } > > @@ -3278,7 +3282,7 @@ netdev_dpdk_vhost_client_reconfigure(struct > netdev *netdev) > > { > > struct netdev_dpdk *dev = netdev_dpdk_cast(netdev); > > int err; > > - uint64_t vhost_flags = 0; > > + uint64_t vhost_flags = dev->vhost_driver_flags; > > This should be under the mutex. > > > > > ovs_mutex_lock(&dev->mutex); > > > > @@ -3307,6 +3311,9 @@ netdev_dpdk_vhost_client_reconfigure(struct > netdev *netdev) > > VLOG_INFO("vHost User device '%s' created in 'client' mode, " > > "using client socket '%s'", > > dev->up.name, dev->vhost_id); > > + if (dev->vhost_driver_flags & > RTE_VHOST_USER_DEQUEUE_ZERO_COPY) { > > + VLOG_INFO("Zero copy enabled for vHost port %s", dev- > >up.name); > > + } > > } > > > > err = rte_vhost_driver_callback_register(dev->vhost_id, > > diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml > > index 018d644..16b3c3e 100644 > > --- a/vswitchd/vswitch.xml > > +++ b/vswitchd/vswitch.xml > > @@ -2669,6 +2669,17 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 > type=patch options:peer=p1 \ > > </p> > > </column> > > > > + <column name="options" key="dq-zero-copy" > > + type='{"type": "boolean"}'> > > + <p> > > + The value specifies whether or not to enable dequeue zero copy on > > + the given interface. > > + Must be set before vhost-server-path is specified. > > + Only supported by dpdkvhostuserclient interfaces. > > + The feature is considered experimental. > > + </p> > > + </column> > > + > > <column name="options" key="n_rxq_desc" > > type='{"type": "integer", "minInteger": 1, "maxInteger": > > 4096}'> > > <p> > > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev