I am attempting to enhance libvirt's virDomainUpdateDeviceFlags() API to support changing "just about anything" about the host side of a PCI network device without actually detaching the PCI device from the guest. Here is a patch I sent to the libvirt mailing list that I had thought would accomplish this task:
https://www.redhat.com/archives/libvir-list/2012-October/msg00546.html

I am using qemu-kvm-1.2-0.1.20120806git3e430569.fc17.x86_64 on Fedora 17 for my testing.

Since the host side and guest side are created (and deleted) with separate monitor commands ("netdev_(add|del)" vs. "device_(add|del)"), we had thought that it would be possible to use netdev_del to disconnect everything from the host side, [*not disconnect the guest side*], then create a new tap device and connect it with netdev_add. And, actually, the netdev_del+netdev_add sequence does complete without error; unfortunately, no traffic is visible on the tap device (looking from the host with tcpdump).

When I modify the patch above to also include the device_del and device_add monitor calls (with a 3 second delay in between to allow for the guest's PCI detach to complete), then the device does work properly. Of course in this case (1) the guest sees the device completely disappear for a period, then reappear, which is more disruption than I want, and (2) because qemu has no asynchronous event to notify libvirt when the guest's PCI detach has actually completed, I have to stick in an arbitrary call to sleep(), which is generally *way* too long, but may be too short in some cases of extremely high load.

The only comment I got from IRC on Friday afternoon (I know - not a good time to be looking for people) was that they would be "surprised if it did work".

So, I have the following questions:

1) Should this work?

If it's supposed to work now:

2) Can you give hints (aside from watching the qemu monitor commands and responses with stap) on what I might need to change, or how to further debug my problem within qemu? (I'm pretty well convinced that the libvirt code is doing the tap device creation/etc correctly.)

3) Alternatively, can you verify that this is a known bug? Is fixing it on anyone's todo list?

If it's not supposed to work now:

4) Does it sound like a reasonable thing for qemu to support?

5) Is there some other formal way to request addition of this functionality (aside from figuring it out myself and posting a patch)?

********************************************

For reference, here is the sequence of qemu monitor commands sent by libvirt to fully detach, then fully reattach a network device. Note that fd is a newly opened TAP device. Also note the 3 second interval between the netdev_del and the next command:

96.671 > 0x7f8e20000c90 {"execute":"device_del","arguments":{"id":"net0"},"id":"libvirt-25"}
96.673 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-25"}
96.674 > 0x7f8e20000c90 {"execute":"netdev_del","arguments":{"id":"hostnet0"},"id":"libvirt-26"}
96.695 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-26"}
99.777 > 0x7f8e20000c90 {"execute":"getfd","arguments":{"fdname":"fd-net0"},"id":"libvirt-27"} (fd=27)
99.777 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-27"}
99.778 > 0x7f8e20000c90 {"execute":"netdev_add","arguments":{"type":"tap","fd":"fd-net0","id":"hostnet0"},"id":"libvirt-28"}
99.778 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-28"}
99.779 > 0x7f8e20000c90 {"execute":"device_add","arguments":{"driver":"virtio-net-pci","netdev":"hostnet0","id":"net0","mac":"52:54:00:d8:bd:b9","bus":"pci.0","addr":"0x4"},"id":"libvirt-29"}
99.780 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-29"}

After this sequence is done, the guest network device is fully functioning.

Here is the sequence sent to disconnect only the host side, then reconnect it with a new tap device.
(Although the fd number is the same, this is only because the old tap device had already been closed, so the number is just being reused - the same thing happens when doing sequential full detach/attach cycles, and they all work with no problems):

168.750 > 0x7f8e20000c90 {"execute":"netdev_del","arguments":{"id":"hostnet0"},"id":"libvirt-30"}
168.762 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-30"}
168.800 > 0x7f8e20000c90 {"execute":"getfd","arguments":{"fdname":"fd-net0"},"id":"libvirt-31"} (fd=27)
168.801 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-31"}
168.801 > 0x7f8e20000c90 {"execute":"netdev_add","arguments":{"type":"tap","fd":"fd-net0","id":"hostnet0"},"id":"libvirt-32"}
168.802 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-32"}
168.802 > 0x7f8e20000c90 {"execute":"set_link","arguments":{"name":"net0","up":true},"id":"libvirt-33"}
168.803 < 0x7f8e20000c90 {"return": {}, "id": "libvirt-33"}

After this sequence is done, everything about the network device *appears* normal on both the guest and host (at least the things I know to look at), but no traffic from the host shows up in a tcpdump of the interface on the guest, and no traffic from the guest shows up in a tcpdump of the tap device on the host.

Oh - the extra "set_link" command at the end is because I noticed that the flags shown in ifconfig in the guest switched from:

   UP BROADCAST RUNNING MULTICAST

to

   UP BROADCAST MULTICAST

when reconnecting in this way, so I was hoping that forcing the interface up would solve my problems. It didn't :-/

(Another note: I also tried adding a delay after the netdev_del, and that also did nothing.)
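
In case it helps anyone reproduce the host-side-only sequence without libvirt in the picture, here is a minimal Python sketch that replays the same four commands over a QMP socket. This is not what libvirt does internally. It assumes Python 3.9+ (for socket.send_fds), a separate QMP monitor socket at a path you supply, and a tap fd that has already been opened and configured. The helper names (host_side_swap_commands, qmp_send, swap_host_side) are hypothetical, and the default ids are simply the ones from the trace above.

```python
import json
import socket


def host_side_swap_commands(netdev_id, device_id, fdname):
    """Return the QMP commands for a host-side-only swap, in order."""
    return [
        {"execute": "netdev_del", "arguments": {"id": netdev_id}},
        # getfd must travel together with the tap fd (SCM_RIGHTS data).
        {"execute": "getfd", "arguments": {"fdname": fdname}},
        {"execute": "netdev_add",
         "arguments": {"type": "tap", "fd": fdname, "id": netdev_id}},
        {"execute": "set_link", "arguments": {"name": device_id, "up": True}},
    ]


def qmp_send(sock, reader, cmd, fds=()):
    """Send one command (optionally with fds attached) and return its reply.

    Asynchronous event messages that arrive before the reply are skipped.
    """
    data = (json.dumps(cmd) + "\r\n").encode()
    if fds:
        # The commands here are small, so a single sendmsg is assumed to
        # deliver the whole buffer.
        socket.send_fds(sock, [data], list(fds))
    else:
        sock.sendall(data)
    while True:
        line = reader.readline()
        if not line:
            raise ConnectionError("QMP socket closed")
        reply = json.loads(line)
        if "event" not in reply:
            return reply


def swap_host_side(qmp_path, tap_fd, netdev_id="hostnet0",
                   device_id="net0", fdname="fd-net0"):
    """Replay netdev_del, getfd, netdev_add, set_link on one QMP connection."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(qmp_path)
        reader = sock.makefile("r")
        json.loads(reader.readline())  # discard the QMP greeting
        qmp_send(sock, reader, {"execute": "qmp_capabilities"})
        for cmd in host_side_swap_commands(netdev_id, device_id, fdname):
            fds = (tap_fd,) if cmd["execute"] == "getfd" else ()
            reply = qmp_send(sock, reader, cmd, fds)
            if "error" in reply:
                raise RuntimeError(
                    "%s failed: %s" % (cmd["execute"], reply["error"]))
```

Something like swap_host_side("/path/to/qmp.sock", tap_fd) would then run the sequence; the socket path is a placeholder. All four commands go over the same connection because the fd registered by getfd is looked up by name from the monitor that received it.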