On Thu, Apr 6, 2023 at 4:17 PM Maxime Coquelin
<maxime.coque...@redhat.com> wrote:
>
> Hi Yongji,
>
> On 4/6/23 05:44, Yongji Xie wrote:
> > Hi Maxime,
> >
> > On Fri, Mar 31, 2023 at 11:43 PM Maxime Coquelin
> > <maxime.coque...@redhat.com> wrote:
> >>
> >> This series introduces a new type of backend, VDUSE,
> >> to the Vhost library.
> >>
> >> VDUSE stands for vDPA Device in Userspace; it enables
> >> implementing a Virtio device in userspace and attaching
> >> it to the Kernel vDPA bus.
> >>
> >> Once attached to the vDPA bus, it can be used by Kernel
> >> Virtio drivers, like virtio-net in our case, via the
> >> virtio-vdpa driver. In that case, the device is visible
> >> to the Kernel networking stack and is exposed to userspace
> >> as a regular netdev.
> >>
> >> It can also be exposed to userspace thanks to the
> >> vhost-vdpa driver, via a vhost-vdpa chardev that can be
> >> passed to QEMU or Virtio-user PMD.
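> >>
> >> For example (illustrative command lines, assuming the VDUSE device
> >> ends up bound to /dev/vhost-vdpa-0; exact syntax depends on the QEMU
> >> and DPDK versions in use):
> >>
> >> # qemu-system-x86_64 ... \
> >>     -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vdpa0 \
> >>     -device virtio-net-pci,netdev=vdpa0
> >>
> >> or with the Virtio-user PMD:
> >>
> >> # ./build/app/dpdk-testpmd --no-pci \
> >>     --vdev=virtio_user0,path=/dev/vhost-vdpa-0 -- -i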
> >>
> >> While VDUSE support is already available in the upstream
> >> Kernel, a couple of patches are required to support the
> >> network device type:
> >>
> >> https://gitlab.com/mcoquelin/linux/-/tree/vduse_networking_poc
> >>
> >> In order to attach the created VDUSE device to the vDPA
> >> bus, a recent iproute2 version containing the vdpa tool is
> >> required.
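> >>
> >> As a quick sanity check (illustrative), the tool can be queried for
> >> the available management devices; the "vduse" entry should show up
> >> once the vduse module is loaded:
> >> # vdpa mgmtdev show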
> >>
> >> Usage:
> >> ======
> >>
> >> 1. Probe required Kernel modules
> >> # modprobe vdpa
> >> # modprobe vduse
> >> # modprobe virtio-vdpa
> >>
> >> 2. Build (requires VDUSE kernel headers to be available)
> >> # meson build
> >> # ninja -C build
> >>
> >> 3. Create a VDUSE device (vduse0) using Vhost PMD with
> >> testpmd (with 4 queue pairs in this example)
> >> # ./build/app/dpdk-testpmd --no-pci 
> >> --vdev=net_vhost0,iface=/dev/vduse/vduse0,queues=4 --log-level=*:9  -- -i 
> >> --txq=4 --rxq=4
> >>
> >> 4. Attach the VDUSE device to the vDPA bus
> >> # vdpa dev add name vduse0 mgmtdev vduse
> >> => The virtio-net netdev shows up (eth0 here)
> >> # ip l show eth0
> >> 21: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP 
> >> mode DEFAULT group default qlen 1000
> >>      link/ether c2:73:ea:a7:68:6d brd ff:ff:ff:ff:ff:ff
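> >>
> >> Optionally, traffic can be generated before step 5 by bringing the
> >> netdev up and assigning it an address (illustrative addressing):
> >> # ip link set eth0 up
> >> # ip addr add 192.168.100.1/24 dev eth0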
> >>
> >> 5. Start/stop traffic in testpmd
> >> testpmd> start
> >> testpmd> show port stats 0
> >>    ######################## NIC statistics for port 0  
> >> ########################
> >>    RX-packets: 11         RX-missed: 0          RX-bytes:  1482
> >>    RX-errors: 0
> >>    RX-nombuf:  0
> >>    TX-packets: 1          TX-errors: 0          TX-bytes:  62
> >>
> >>    Throughput (since last show)
> >>    Rx-pps:            0          Rx-bps:            0
> >>    Tx-pps:            0          Tx-bps:            0
> >>    
> >> ############################################################################
> >> testpmd> stop
> >>
> >> 6. Detach the VDUSE device from the vDPA bus
> >> # vdpa dev del vduse0
> >>
> >> 7. Quit testpmd
> >> testpmd> quit
> >>
> >> Known issues & remaining work:
> >> ==============================
> >> - Fix issue in FD manager (still polling after the FD has been removed)
> >> - Add Netlink support in Vhost library
> >> - Support device reconnection
> >> - Support packed ring
> >> - Enable & test more Virtio features
> >> - Provide performance benchmark results
> >>
> >
> > Nice work! Thanks for bringing VDUSE to the networking area. I wonder
> > if you have any plans to support userspace memory registration [1]? I
> > think this feature could benefit performance, since an extra data
> > copy could be eliminated in our case.
>
> I plan to have a closer look later, once VDUSE support is added.
> I think it will be difficult to support it in the DPDK networking
> case:
>
>   - For the dequeue path, it would basically re-introduce the dequeue
> zero-copy support that we removed some time ago. It was a hack where we
> replaced the regular mbuf buffer with the descriptor one, increased the
> reference counter, and on subsequent dequeue API calls checked whether
> the former mbuf's ref counter was back to 1 before restoring the mbuf.
> The issue is that physical NIC drivers usually release sent mbufs in
> batches, once a certain threshold is met. So it can starve the
> virtqueue, as the descriptors are not written back into the used ring
> for quite some time, depending on the NIC/traffic/...
>

OK, I see. Could this issue be mitigated by releasing sent mbufs one by
one as soon as they are transmitted, or simply by increasing the
virtqueue size?

> - For the enqueue path, I don't think this is possible with virtual
> switches by design, as when an mbuf is received on a physical port, we
> don't know to which Vhost/VDUSE port it will be switched. And for
> VM-to-VM communication, should it use the source VM's buffer or the
> destination VM's one?
>

Yes, I agree that it's hard to achieve that in the enqueue path.

> The only case where it could work is a simple forwarder between a VDUSE
> device and a physical port, but I don't think there is much interest in
> such a use case.
>

OK, I get it.

Thanks,
Yongji
