On Tue, Oct 13, 2020 at 02:14:22AM +0300, Nikos Dragazis wrote:
> On 12/10/20 10:22 p.m., Cosmin Chenaru wrote:
> > I am currently running multiple VMs, connected to each other by the
> > DPDK vhost-switch. A VM can start, reboot, or shut down, so much of
> > this is dynamic, and the vhost-switch handles all of it. These VMs
> > are a kind of "endpoint" (I could not find a better name).
> >
> > The code which runs on the VM endpoints is somewhat tied to the
> > vhost-switch code, and if I change something on the VM that breaks
> > compatibility, I need to recompile and restart the vhost-switch. The
> > problem is that most of the time I forget to update the vhost-switch
> > and run into other problems.
> >
> > If I could use a VM as a vhost-switch instead of the DPDK app, then I
> > hope that in my endpoint code which runs on the VM, I can add
> > functionality to make it also run as a switch and forward the packets
> > between the other VMs like the current DPDK switch does. Doing so
> > would let me catch this out-of-sync between the VM endpoint code and
> > the switch code at compile time, since they would be part of the same
> > app.
> >
> > This would be a two-phase process: first, run the DPDK vhost-switch
> > inside a guest VM, and then move the tx-rx part into my app.
> >
> > Both QEMU and the DPDK app use "vhost-user". I was happy to see that
> > I can start QEMU in vhost-user server mode:
> >
> >   <interface type='vhostuser'>
> >     <mac address='52:54:00:9c:3a:e3'/>
> >     <source type='unix' path='/home/cosmin/vsocket.server' mode='server'/>
> >     <model type='virtio'/>
> >     <driver queues='2'>
> >       <host mrg_rxbuf='on'/>
> >     </driver>
> >     <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
> >   </interface>
> >
> > This translates to these QEMU arguments:
> >
> >   -chardev socket,id=charnet1,path=/home/cosmin/vsocket.server,server
> >   -netdev type=vhost-user,id=hostnet1,chardev=charnet1,queues=2
> >   -device virtio-net-pci,mrg_rxbuf=on,mq=on,vectors=6,netdev=hostnet1,id=net1,mac=52:54:00:9c:3a:e3,bus=pci.0,addr=0x4
> >
> > But at this point QEMU will not boot the VM until a vhost-user client
> > connects to it. I even tried adding the "nowait" argument, but QEMU
> > still waits. This will not work in my case, as the endpoint VMs could
> > start and stop at any time, and I don't even know how many network
> > interfaces the endpoint VMs will have.

The "server" mode simply creates a listening socket instead of
connecting. It does not mean that QEMU acts as the vhost-user device
backend; QEMU is still the frontend. The UNIX domain socket "client" and
"server" relationship is independent of the vhost-user protocol frontend
(previously called "master") and device backend (previously called
"slave") relationship.
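To make the distinction concrete, here is a minimal sketch of the two
roles (the socket path is illustrative, and depending on the DPDK version
the binary may be testpmd rather than dpdk-testpmd). QEMU owns the
listening socket, while the DPDK vhost PMD, connecting as the UNIX socket
client via its client=1 option, is the side that implements the device:

  # QEMU creates the listening socket ("server") but remains the
  # vhost-user frontend that consumes the device:
  -chardev socket,id=charnet1,path=/tmp/vhost.sock,server
  -netdev type=vhost-user,id=hostnet1,chardev=charnet1
  -device virtio-net-pci,netdev=hostnet1,id=net1

  # DPDK connects as the UNIX socket client but acts as the vhost-user
  # device backend:
  dpdk-testpmd -l 0-1 --vdev 'net_vhost0,iface=/tmp/vhost.sock,client=1' -- -i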
> > I then found the virtio-vhost-user transport protocol [2], and was
> > thinking that I could at least start the packet-switching VM and then
> > let the DPDK app inside it forward the packets. But from what I
> > understand, this creates a single network interface inside the VM on
> > which the DPDK app can bind. The limitation here is that if another
> > VM wants to connect to the packet-switching VM, I need to manually
> > add another virtio-vhost-user-pci device (and a new vhost-user.sock)
> > before this packet-switching VM starts, so this is not dynamic.

Yes, each switch port needs its own virtio-vhost-user device because it
is the partner to a VM's virtio-net device. It is possible to write a
guest application that:

1. Opens multiple virtio-vhost-user devices and handles the connection
   lifecycle, so that devices may be disconnected some of the time.

and/or

2. Reacts to virtio-vhost-user hotplug (i.e. udev/uevents) to dynamically
   add/remove ports (see the sketch below).
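For example, the port-management side of such a guest application could
be built on libudev (a minimal sketch: the match on the "virtio"
subsystem is an assumption, since the subsystem and device names that
virtio-vhost-user devices appear under depend on the guest driver in
use):

  /* Watch uevents so switch ports can be added/removed at runtime.
   * Build with: gcc -o portwatch portwatch.c -ludev */
  #include <stdio.h>
  #include <string.h>
  #include <poll.h>
  #include <libudev.h>

  int main(void)
  {
      struct udev *udev = udev_new();
      struct udev_monitor *mon = udev_monitor_new_from_netlink(udev, "udev");

      /* Assumption: virtio-vhost-user devices show up in the "virtio"
       * subsystem; adjust the filter for the actual guest driver. */
      udev_monitor_filter_add_match_subsystem_devtype(mon, "virtio", NULL);
      udev_monitor_enable_receiving(mon);

      struct pollfd pfd = {
          .fd = udev_monitor_get_fd(mon),
          .events = POLLIN,
      };

      for (;;) {
          if (poll(&pfd, 1, -1) <= 0)
              continue;

          struct udev_device *dev = udev_monitor_receive_device(mon);
          if (!dev)
              continue;

          const char *action = udev_device_get_action(dev);
          const char *name = udev_device_get_sysname(dev);

          /* This is where the switch would open the new device and add
           * a port, or tear the port down on removal. */
          if (action && strcmp(action, "add") == 0)
              printf("add port for %s\n", name);
          else if (action && strcmp(action, "remove") == 0)
              printf("remove port for %s\n", name);

          udev_device_unref(dev);
      }
  }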
> > The second approach for me would be to use vhost-pci [3]. I could not
> > fully understand how it works, but I think it presents a network
> > interface to the guest kernel after another VM connects to the first
> > one.

This also requires multiple vhost-pci device instances if you want
multiple switch ports.

> > I realize I have made a big story and probably don't make too much
> > sense, but one more thing. The ideal solution for me would be a
> > combination of the vhost-user socket and the vhost-pci socket. QEMU
> > would start the VM, and the socket would wait in the background for
> > vhost-user connections. When a new connection is established, QEMU
> > should create a hot-pluggable PCI network card, and either the guest
> > kernel or the DPDK app inside would handle the vhost-user messages.

vhost-pci and virtio-vhost-user don't present a network card to the
switch. Instead, the switch acts as the device emulator for the
virtio-net NICs that the other VMs are using. This has performance
advantages: no data copy or extra packet queuing is necessary when there
is just 1 NIC instead of 2 point-to-point NICs.

Stefan