Hi all,

Here is something I've been tinkering with for the past few weeks. It is now in a state where the basic idea makes sense and works, and it could use some feedback from the community.

This is what I've been calling QDES, or QEMU Distributed Ethernet Switch. I first had the idea when I was playing with the udp and mcast socket network backends while exploring how to build a VM infrastructure. I liked the idea of using the socket backends because they don't require escalated permissions to configure and run, and because they can talk over IP networks. But the built-in socket backends allow either only 2 guests talking directly, or multiple guests where all traffic is sent to all of them. So one can either have just two guests talking or waste bandwidth with multiple guests. There wasn't anything that could talk to multiple guests and still use unicast traffic.

So I made a backend that can do this. It takes the basics of how the udp and mcast socket backends work and combines them with some switching based on the Ethernet packets. The result is that multiple guests can talk to each other without wasting bandwidth by delivering unicast traffic to all guests. The backend also adds some header data to each packet. This header includes a network identifier, so multiple logical networks can be created using the same multicast configuration while still keeping the guests separated. It's kind of like VXLAN, or like NVGRE with the GRE tunnels replaced by UDP packets.

There are a couple of advantages that I see to this. It allows multiple guests in multiple locations to talk to each other while keeping unicast traffic between just two hosts. It doesn't require root permissions to run. It can operate over non-Ethernet networks (like IPoIB). It doesn't require changing the network configuration on the host. It allows a ton of logical networks to be created (currently 65536 per multicast address and port combination).

There are a few disadvantages as well. It does add some more processing to the QEMU process, but not much (I saw it go as fast as the socket backends).
It also encapsulates an Ethernet frame inside a UDP packet, so there is the overhead of the IP and UDP headers as well as the transport medium headers (most likely Ethernet again). Because of the additional header data, the MTU of the guest could be limited, depending on the host's ability to send larger multicast packets. (I haven't really looked closely at this last one.) Nothing besides QEMU processes can communicate using this yet, though I hope to build a utility to work with a tap device.

Overall, I think this is something that's pretty cool. I don't know how much thought people give to the socket backends for real-world use, so I don't know if this would be of much use to anyone. I am looking for feedback on what the community thinks and for comments about the code. It's only my second time writing more than 20 lines of C, so I'm sure I did some stupid things. I have only tested on 64-bit x86 Linux systems so far.

Hopefully you all have good things to say. :)

mike
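
P.S. To make the switching and header idea described above a bit more concrete, here is a minimal standalone sketch. It is not the actual backend code. The header layout (a bare 16-bit network id, chosen only because it matches the 65536 logical networks mentioned above), the table size, the integer "peer" handles, and all function names are assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN      6
#define QDES_HDR_LEN  2     /* hypothetical: header is just a 16-bit network id */
#define FDB_SIZE      256   /* hypothetical: slots in the MAC learning table */
#define PEER_FLOOD    (-1)  /* "send to the multicast group" */

/* One learned entry: the peer a source MAC was last seen from.
 * A peer is a plain int here; a real backend would store a UDP address. */
struct fdb_entry {
    uint8_t mac[ETH_ALEN];
    int peer;
    int used;
};

static struct fdb_entry fdb[FDB_SIZE];

static unsigned fdb_hash(const uint8_t *mac)
{
    unsigned h = 0;
    for (int i = 0; i < ETH_ALEN; i++) {
        h = h * 31 + mac[i];
    }
    return h % FDB_SIZE;
}

/* Learn on receive: frames from src_mac come from 'peer'.
 * A hash collision simply overwrites the older entry. */
void fdb_learn(const uint8_t *src_mac, int peer)
{
    assert(src_mac != NULL);
    struct fdb_entry *e = &fdb[fdb_hash(src_mac)];
    memcpy(e->mac, src_mac, ETH_ALEN);
    e->peer = peer;
    e->used = 1;
}

/* Look up on send: a known unicast destination goes to one peer,
 * everything else (broadcast, multicast, unknown) is flooded. */
int fdb_lookup(const uint8_t *dst_mac)
{
    if (dst_mac[0] & 1) {               /* group bit set */
        return PEER_FLOOD;
    }
    const struct fdb_entry *e = &fdb[fdb_hash(dst_mac)];
    if (e->used && memcmp(e->mac, dst_mac, ETH_ALEN) == 0) {
        return e->peer;
    }
    return PEER_FLOOD;
}

/* Prepend the network id (big-endian) to an Ethernet frame.
 * Returns the packet length, or 0 if 'out' is too small. */
size_t qdes_encap(uint8_t *out, size_t out_len, uint16_t net_id,
                  const uint8_t *frame, size_t frame_len)
{
    if (out_len < QDES_HDR_LEN || out_len - QDES_HDR_LEN < frame_len) {
        return 0;
    }
    out[0] = (uint8_t)(net_id >> 8);
    out[1] = (uint8_t)(net_id & 0xff);
    memcpy(out + QDES_HDR_LEN, frame, frame_len);
    return QDES_HDR_LEN + frame_len;
}

/* Return the inner frame if the packet belongs to my_net, else NULL.
 * This check is what keeps logical networks on one group separated. */
const uint8_t *qdes_decap(const uint8_t *pkt, size_t len, uint16_t my_net,
                          size_t *frame_len)
{
    if (len < QDES_HDR_LEN) {
        return NULL;
    }
    uint16_t net_id = (uint16_t)((pkt[0] << 8) | pkt[1]);
    if (net_id != my_net) {
        return NULL;
    }
    *frame_len = len - QDES_HDR_LEN;
    return pkt + QDES_HDR_LEN;
}
```

The intended flow in this sketch is: on receive, call qdes_decap() and drop the packet if it returns NULL, otherwise fdb_learn() the frame's source MAC against the sender; on send, call fdb_lookup() on the destination MAC and either send the encapsulated frame to that one peer or to the multicast group. With this made-up 2-byte header over IPv4, the added overhead per frame would be 20 (IP) + 8 (UDP) + 2 = 30 bytes before the outer link-layer header, which is where the guest MTU concern above comes from.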