Re: [Qemu-devel] [libvirt] Problems using netdev_del+netdev_add w/o corresponding device_del+device_add
On Mon, Oct 15, 2012 at 10:25:58AM +0100, Daniel P. Berrange wrote:
> On Mon, Oct 15, 2012 at 10:30:07AM +0200, Stefan Hajnoczi wrote:
> > On Sat, Oct 13, 2012 at 04:47:14PM -0400, Laine Stump wrote:
> > > Here is the sequence sent to disconnect only the host side, then reconnect it with a new tap device (although the fd is the same, this is because the old tap device had already been closed, so the number is just being reused - the same thing happens when doing sequential full detach/attach cycles, and they all work with no problems):
> > >
> > > 168.750 0x7f8e2c90 {execute:netdev_del,arguments:{id:hostnet0},id:libvirt-30}
> > > 168.762 0x7f8e2c90 {return: {}, id: libvirt-30}
> > > 168.800 0x7f8e2c90 {execute:getfd,arguments:{fdname:fd-net0},id:libvirt-31} (fd=27)
> > > 168.801 0x7f8e2c90 {return: {}, id: libvirt-31}
> > > 168.801 0x7f8e2c90 {execute:netdev_add,arguments:{type:tap,fd:fd-net0,id:hostnet0},id:libvirt-32}
> > > 168.802 0x7f8e2c90 {return: {}, id: libvirt-32}
> > > 168.802 0x7f8e2c90 {execute:set_link,arguments:{name:net0,up:true},id:libvirt-33}
> > > 168.803 0x7f8e2c90 {return: {}, id: libvirt-33}
> > >
> > > After this sequence is done, everything about the network device *appears* normal on both the guest and host (at least the things I know to look at), but no traffic from the host shows up in a tcpdump of the interface on the guest, and no traffic from the guest shows up in a tcpdump of the tap device on the host.
> >
> > What you are trying to do isn't possible today. The device associates with the netdev during initialization only - there is no command to associate at a later point in time. That is why your example works only when the device is deleted together with the netdev.
> >
> > It is certainly possible to implement a command to switch netdevs but I'm curious what the use case is. Is this necessary just because QEMU doesn't provide a way to modify the existing netdev or because you really want to switch to a completely different netdev?
>
> We have end users who want to be able to dynamically change the guest's networking attachment, without restarting/hotplugging devices in the guest[1]. If it is just a case of changing from one bridge to another bridge, we can do that just by moving the TAP device from one to the other. This doesn't work if we want to support more general changes in config, e.g. from a macvtap setup to a TAP setup, or vice versa.
>
> Another requirement is to be able to start a guest with a null backend (akin to not plugging in the ethernet cable on a physical host), and then attach it to a bridge/macvtap device on the fly later on (akin to then plugging in the ethernet cable once running).

virtio-net presents a challenge because checksum offload and other advanced features are announced in the virtio feature bits. Virtio feature bits don't change during the lifetime of the device, and there's no way to notify existing guests to re-negotiate them besides taking down the device.

The offload feature bits are tied to the netdev in QEMU, especially the tap driver's vnet_hdr feature, which allows the guest to pass through offload flags to the host network stack. QEMU does not emulate these today and only enables them when the netdev supports vnet_hdr.

In other words, virtio-net is tied to its netdev. Changing from a -netdev tap,vhost=on setup to -netdev user is difficult.

Two possibilities:

1. Add offload emulation code to QEMU.
2. Place sufficient checks in QEMU and libvirt so that only safe netdev changes can be made.

#1 is unattractive because this code path will rarely be used but is complex (a bunch of buffer munging and memory management).

Are you trying to change netdev without involving the guest? In that case the link must stay up and libvirt needs to ensure that the new netdev will have a compatible network configuration (subnet, gateway, IP address details).

Stefan
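As a rough illustration of option 2 above, here is a minimal Python sketch of the kind of check a management layer could run before allowing a backend change. Everything in it is hypothetical: the feature names, the `Backend` class and `is_safe_switch()` are invented for illustration and are not QEMU or libvirt APIs. It assumes, as described above, that offloads are only available when the backend supports vnet_hdr.

```python
from dataclasses import dataclass

# Hypothetical offload names, loosely modelled on virtio-net feature bits.
OFFLOAD_FEATURES = frozenset({"csum", "guest_csum", "guest_tso4", "host_tso4"})


@dataclass(frozen=True)
class Backend:
    """Illustrative stand-in for a host-side netdev (not a QEMU type)."""
    name: str
    vnet_hdr: bool  # whether the backend can carry offload metadata

    def supported_offloads(self) -> frozenset:
        # Assumption: without vnet_hdr, no offload can reach the host stack.
        return OFFLOAD_FEATURES if self.vnet_hdr else frozenset()


def is_safe_switch(negotiated: frozenset, new_backend: Backend) -> bool:
    """Return True if every offload the guest negotiated stays available.

    Feature bits cannot be renegotiated while the device is up, so the
    new backend must cover everything the guest already relies on.
    """
    return (negotiated & OFFLOAD_FEATURES) <= new_backend.supported_offloads()
```

With this model, a guest that negotiated checksum offload could move between two vnet_hdr-capable backends, but not to one without vnet_hdr; a guest that negotiated no offloads could move anywhere.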
Re: [Qemu-devel] [libvirt] Problems using netdev_del+netdev_add w/o corresponding device_del+device_add
Laine Stump la...@redhat.com writes:
> On 10/15/2012 05:25 AM, Daniel P. Berrange wrote:
> > On Mon, Oct 15, 2012 at 10:30:07AM +0200, Stefan Hajnoczi wrote:
> > > On Sat, Oct 13, 2012 at 04:47:14PM -0400, Laine Stump wrote:
> > > > Here is the sequence sent to disconnect only the host side, then reconnect it with a new tap device (although the fd is the same, this is because the old tap device had already been closed, so the number is just being reused - the same thing happens when doing sequential full detach/attach cycles, and they all work with no problems):
> > > >
> > > > 168.750 0x7f8e2c90 {execute:netdev_del,arguments:{id:hostnet0},id:libvirt-30}
> > > > 168.762 0x7f8e2c90 {return: {}, id: libvirt-30}

This deletes the backend, and leaves the frontend NIC without a backend. Such a NIC behaves / should behave like it's not connected to anything (link down).

> > > > 168.800 0x7f8e2c90 {execute:getfd,arguments:{fdname:fd-net0},id:libvirt-31} (fd=27)
> > > > 168.801 0x7f8e2c90 {return: {}, id: libvirt-31}
> > > > 168.801 0x7f8e2c90 {execute:netdev_add,arguments:{type:tap,fd:fd-net0,id:hostnet0},id:libvirt-32}
> > > > 168.802 0x7f8e2c90 {return: {}, id: libvirt-32}

This creates a new backend, not connected to any frontend. The fact that it has the same ID as some deleted backend is completely immaterial.

> > > > 168.802 0x7f8e2c90 {execute:set_link,arguments:{name:net0,up:true},id:libvirt-33}
> > > > 168.803 0x7f8e2c90 {return: {}, id: libvirt-33}

This orders the NIC to change the link status to up. Can't work, because it's still not connected to anything. It succeeds anyway, which could be regarded as a bug.

> > > > After this sequence is done, everything about the network device *appears* normal on both the guest and host (at least the things I know to look at), but no traffic from the host shows up in a tcpdump of the interface on the guest, and no traffic from the guest shows up in a tcpdump of the tap device on the host.
> > >
> > > What you are trying to do isn't possible today.
>
> Well, at least it's good to know that I should stop trying to make it work :-)
>
> Actually, it's a bit disconcerting that 1) the act of creating a guest device is split into two commands, implying that they don't necessarily have a hardwired a--b relationship although that is the case, and that

It isn't really the case. Network frontend and backend are really separate things, but...

> 2) netdev_add even returns success when you use it in this way. Although hindsight is 20/20 and all that, if both a and b are required, and must always be in the same order, wouldn't it have made more sense for the two steps to be a single command? I suppose this is a byproduct of the monitor commands being a direct reflection of the command-line options. (At the very least, though, I think netdev_add should report an error if the device name alias it uses is already in use by a device.)
>
> > > The device associates with the netdev during initialization only - there

... the connection between the two can only be made during frontend initialization. Not because of design limitations, just because more dynamic connecting hasn't been implemented.

> > > is no command to associate at a later point in time. That is why your example works only when the device is deleted together with the netdev.
> > >
> > > It is certainly possible to implement a command to switch netdevs
>
> At this point yes, it would be better to have a new command rather than to make netdev_add work in the way I've attempted - this way there would be a new command whose presence libvirt could use to decide whether or not to support this functionality.

Besides, I'd oppose ID magic like making netdev_add behave differently when the ID matches some previously used ID.

> > > but I'm curious what the use case is. Is this necessary just because QEMU doesn't provide a way to modify the existing netdev or because you really want to switch to a completely different netdev?
> >
> > We have end users who want to be able to dynamically change the guest's networking attachment, without restarting/hotplugging devices in the guest[1]. If it is just a case of changing from one bridge to another bridge, we can do that just by moving the TAP device from one to the other. This doesn't work if we want to support more general changes in config, e.g. from a macvtap setup to a TAP setup, or vice versa.
>
> Beyond that, I haven't determined it conclusively yet, but it so far looks to me like a macvtap device can only be linked to a physdev when it is created - there is no netlink message to re-link it to a different physdev (this is based on my naive examination of the relevant kernel source). So if you want to change the attach point for a macvtap-type connection, you again need to discard the old macvtap device and create a new one, implying that you need to do a new netdev_add.

Wanting to connect a frontend NIC to a different backend seems entirely fair to me. Patches welcome :)
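The step-by-step commentary above can be condensed into a toy model. The following Python sketch is not QEMU code; the `Monitor`, `Nic` and `Backend` classes are hypothetical and only mimic the behaviour described in this message: the frontend picks up its peer once, at creation, a deleted backend leaves the NIC peerless, a new backend with a recycled ID starts out unconnected, and set_link reports success either way.

```python
class Backend:
    """Toy host-side netdev; `peer` is the NIC attached to it, if any."""
    def __init__(self, backend_id):
        self.id = backend_id
        self.peer = None


class Nic:
    """Toy guest-side frontend; it binds to a backend only at creation."""
    def __init__(self, name, backend):
        self.name = name
        self.link_up = True
        self.peer = backend
        backend.peer = self


class Monitor:
    """Hypothetical stand-in for the monitor commands discussed above."""
    def __init__(self):
        self.backends = {}
        self.nics = {}

    def netdev_add(self, backend_id):
        # A new backend is never connected, whatever ID it reuses.
        self.backends[backend_id] = Backend(backend_id)

    def device_add(self, name, backend_id):
        # The only point where frontend and backend get associated.
        self.nics[name] = Nic(name, self.backends[backend_id])

    def netdev_del(self, backend_id):
        backend = self.backends.pop(backend_id)
        if backend.peer is not None:
            backend.peer.peer = None  # the NIC is left without a backend

    def set_link(self, name, up):
        # Succeeds even when the NIC has no peer, as observed in the log.
        self.nics[name].link_up = up

    def can_pass_traffic(self, name):
        nic = self.nics[name]
        return nic.link_up and nic.peer is not None
```

Replaying the logged sequence against this model (netdev_del, netdev_add with the same ID, set_link up) leaves `can_pass_traffic("net0")` false, which matches the silent tcpdump on both sides.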
Re: [Qemu-devel] [libvirt] Problems using netdev_del+netdev_add w/o corresponding device_del+device_add
On Mon, Oct 15, 2012 at 11:15:30AM -0400, Laine Stump wrote:
> On 10/15/2012 05:25 AM, Daniel P. Berrange wrote:
> > On Mon, Oct 15, 2012 at 10:30:07AM +0200, Stefan Hajnoczi wrote:
> > > On Sat, Oct 13, 2012 at 04:47:14PM -0400, Laine Stump wrote:
> > > > Here is the sequence sent to disconnect only the host side, then reconnect it with a new tap device (although the fd is the same, this is because the old tap device had already been closed, so the number is just being reused - the same thing happens when doing sequential full detach/attach cycles, and they all work with no problems):
> > > >
> > > > 168.750 0x7f8e2c90 {execute:netdev_del,arguments:{id:hostnet0},id:libvirt-30}
> > > > 168.762 0x7f8e2c90 {return: {}, id: libvirt-30}
> > > > 168.800 0x7f8e2c90 {execute:getfd,arguments:{fdname:fd-net0},id:libvirt-31} (fd=27)
> > > > 168.801 0x7f8e2c90 {return: {}, id: libvirt-31}
> > > > 168.801 0x7f8e2c90 {execute:netdev_add,arguments:{type:tap,fd:fd-net0,id:hostnet0},id:libvirt-32}
> > > > 168.802 0x7f8e2c90 {return: {}, id: libvirt-32}
> > > > 168.802 0x7f8e2c90 {execute:set_link,arguments:{name:net0,up:true},id:libvirt-33}
> > > > 168.803 0x7f8e2c90 {return: {}, id: libvirt-33}
> > > >
> > > > After this sequence is done, everything about the network device *appears* normal on both the guest and host (at least the things I know to look at), but no traffic from the host shows up in a tcpdump of the interface on the guest, and no traffic from the guest shows up in a tcpdump of the tap device on the host.
> > >
> > > What you are trying to do isn't possible today.
>
> Well, at least it's good to know that I should stop trying to make it work :-)
>
> Actually, it's a bit disconcerting that 1) the act of creating a guest device is split into two commands, implying that they don't necessarily have a hardwired a--b relationship although that is the case, and that 2) netdev_add even returns success when you use it in this way. Although hindsight is 20/20 and all that, if both a and b are required, and must always be in the same order, wouldn't it have made more sense for the two steps to be a single command? I suppose this is a byproduct of the monitor commands being a direct reflection of the command-line options. (At the very least, though, I think netdev_add should report an error if the device name alias it uses is already in use by a device.)

The commands are historic (at least to me) and we have to make the best of them.

> > > but I'm curious what the use case is. Is this necessary just because QEMU doesn't provide a way to modify the existing netdev or because you really want to switch to a completely different netdev?
> >
> > We have end users who want to be able to dynamically change the guest's networking attachment, without restarting/hotplugging devices in the guest[1]. If it is just a case of changing from one bridge to another bridge, we can do that just by moving the TAP device from one to the other. This doesn't work if we want to support more general changes in config, e.g. from a macvtap setup to a TAP setup, or vice versa.
>
> Beyond that, I haven't determined it conclusively yet, but it so far looks to me like a macvtap device can only be linked to a physdev when it is created - there is no netlink message to re-link it to a different physdev (this is based on my naive examination of the relevant kernel source). So if you want to change the attach point for a macvtap-type connection, you again need to discard the old macvtap device and create a new one, implying that you need to do a new netdev_add.

Yep, I just checked too. macvlan_dev->lowerdev is only set in macvlan_common_newlink(). There is no way to change it once the link has been created.

Stefan
Re: [Qemu-devel] [libvirt] Problems using netdev_del+netdev_add w/o corresponding device_del+device_add
On Tue, Oct 16, 2012 at 10:08:21AM +0200, Stefan Hajnoczi wrote:
> On Mon, Oct 15, 2012 at 10:25:58AM +0100, Daniel P. Berrange wrote:
> > On Mon, Oct 15, 2012 at 10:30:07AM +0200, Stefan Hajnoczi wrote:
> > > On Sat, Oct 13, 2012 at 04:47:14PM -0400, Laine Stump wrote:
> > > > Here is the sequence sent to disconnect only the host side, then reconnect it with a new tap device (although the fd is the same, this is because the old tap device had already been closed, so the number is just being reused - the same thing happens when doing sequential full detach/attach cycles, and they all work with no problems):
> > > >
> > > > 168.750 0x7f8e2c90 {execute:netdev_del,arguments:{id:hostnet0},id:libvirt-30}
> > > > 168.762 0x7f8e2c90 {return: {}, id: libvirt-30}
> > > > 168.800 0x7f8e2c90 {execute:getfd,arguments:{fdname:fd-net0},id:libvirt-31} (fd=27)
> > > > 168.801 0x7f8e2c90 {return: {}, id: libvirt-31}
> > > > 168.801 0x7f8e2c90 {execute:netdev_add,arguments:{type:tap,fd:fd-net0,id:hostnet0},id:libvirt-32}
> > > > 168.802 0x7f8e2c90 {return: {}, id: libvirt-32}
> > > > 168.802 0x7f8e2c90 {execute:set_link,arguments:{name:net0,up:true},id:libvirt-33}
> > > > 168.803 0x7f8e2c90 {return: {}, id: libvirt-33}
> > > >
> > > > After this sequence is done, everything about the network device *appears* normal on both the guest and host (at least the things I know to look at), but no traffic from the host shows up in a tcpdump of the interface on the guest, and no traffic from the guest shows up in a tcpdump of the tap device on the host.
> > >
> > > What you are trying to do isn't possible today. The device associates with the netdev during initialization only - there is no command to associate at a later point in time. That is why your example works only when the device is deleted together with the netdev.
> > >
> > > It is certainly possible to implement a command to switch netdevs but I'm curious what the use case is. Is this necessary just because QEMU doesn't provide a way to modify the existing netdev or because you really want to switch to a completely different netdev?
> >
> > We have end users who want to be able to dynamically change the guest's networking attachment, without restarting/hotplugging devices in the guest[1]. If it is just a case of changing from one bridge to another bridge, we can do that just by moving the TAP device from one to the other. This doesn't work if we want to support more general changes in config, e.g. from a macvtap setup to a TAP setup, or vice versa.
> >
> > Another requirement is to be able to start a guest with a null backend (akin to not plugging in the ethernet cable on a physical host), and then attach it to a bridge/macvtap device on the fly later on (akin to then plugging in the ethernet cable once running).
>
> virtio-net presents a challenge because checksum offload and other advanced features are announced in the virtio feature bits. Virtio feature bits don't change during the lifetime of the device, and there's no way to notify existing guests to re-negotiate them besides taking down the device.
>
> The offload feature bits are tied to the netdev in QEMU, especially the tap driver's vnet_hdr feature, which allows the guest to pass through offload flags to the host network stack. QEMU does not emulate these today and only enables them when the netdev supports vnet_hdr.
>
> In other words, virtio-net is tied to its netdev. Changing from a -netdev tap,vhost=on setup to -netdev user is difficult.

Urgh, so much for there being a clean separation between frontend and backend :-(

> Two possibilities:
>
> 1. Add offload emulation code to QEMU.
> 2. Place sufficient checks in QEMU and libvirt so that only safe netdev changes can be made.
>
> #1 is unattractive because this code path will rarely be used but is complex (a bunch of buffer munging and memory management).

Agreed, sounds like 2 is the only practical option.

> Are you trying to change netdev without involving the guest? In that case the link must stay up and libvirt needs to ensure that the new netdev will have a compatible network configuration (subnet, gateway, IP address details).

We'd leave the decision about that up to the management tool using these APIs. In some cases the subnet/IP/gateway/etc. might remain the same; in other cases, the mgmt tool may want to set the link down, change the backend, then set the link online again, to make NetworkManager (or whatever) redo DHCP in the guest.

Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [Qemu-devel] [libvirt] Problems using netdev_del+netdev_add w/o corresponding device_del+device_add
On Mon, Oct 15, 2012 at 10:30:07AM +0200, Stefan Hajnoczi wrote:
> On Sat, Oct 13, 2012 at 04:47:14PM -0400, Laine Stump wrote:
> > Here is the sequence sent to disconnect only the host side, then reconnect it with a new tap device (although the fd is the same, this is because the old tap device had already been closed, so the number is just being reused - the same thing happens when doing sequential full detach/attach cycles, and they all work with no problems):
> >
> > 168.750 0x7f8e2c90 {execute:netdev_del,arguments:{id:hostnet0},id:libvirt-30}
> > 168.762 0x7f8e2c90 {return: {}, id: libvirt-30}
> > 168.800 0x7f8e2c90 {execute:getfd,arguments:{fdname:fd-net0},id:libvirt-31} (fd=27)
> > 168.801 0x7f8e2c90 {return: {}, id: libvirt-31}
> > 168.801 0x7f8e2c90 {execute:netdev_add,arguments:{type:tap,fd:fd-net0,id:hostnet0},id:libvirt-32}
> > 168.802 0x7f8e2c90 {return: {}, id: libvirt-32}
> > 168.802 0x7f8e2c90 {execute:set_link,arguments:{name:net0,up:true},id:libvirt-33}
> > 168.803 0x7f8e2c90 {return: {}, id: libvirt-33}
> >
> > After this sequence is done, everything about the network device *appears* normal on both the guest and host (at least the things I know to look at), but no traffic from the host shows up in a tcpdump of the interface on the guest, and no traffic from the guest shows up in a tcpdump of the tap device on the host.
>
> What you are trying to do isn't possible today. The device associates with the netdev during initialization only - there is no command to associate at a later point in time. That is why your example works only when the device is deleted together with the netdev.
>
> It is certainly possible to implement a command to switch netdevs but I'm curious what the use case is. Is this necessary just because QEMU doesn't provide a way to modify the existing netdev or because you really want to switch to a completely different netdev?

We have end users who want to be able to dynamically change the guest's networking attachment, without restarting/hotplugging devices in the guest[1]. If it is just a case of changing from one bridge to another bridge, we can do that just by moving the TAP device from one to the other. This doesn't work if we want to support more general changes in config, e.g. from a macvtap setup to a TAP setup, or vice versa.

Another requirement is to be able to start a guest with a null backend (akin to not plugging in the ethernet cable on a physical host), and then attach it to a bridge/macvtap device on the fly later on (akin to then plugging in the ethernet cable once running).

Regards,
Daniel

[1] Obviously the guest might need to reconfigure its IP or re-run DHCP though.
Re: [Qemu-devel] [libvirt] Problems using netdev_del+netdev_add w/o corresponding device_del+device_add
On 10/15/2012 05:25 AM, Daniel P. Berrange wrote:
> On Mon, Oct 15, 2012 at 10:30:07AM +0200, Stefan Hajnoczi wrote:
> > On Sat, Oct 13, 2012 at 04:47:14PM -0400, Laine Stump wrote:
> > > Here is the sequence sent to disconnect only the host side, then reconnect it with a new tap device (although the fd is the same, this is because the old tap device had already been closed, so the number is just being reused - the same thing happens when doing sequential full detach/attach cycles, and they all work with no problems):
> > >
> > > 168.750 0x7f8e2c90 {execute:netdev_del,arguments:{id:hostnet0},id:libvirt-30}
> > > 168.762 0x7f8e2c90 {return: {}, id: libvirt-30}
> > > 168.800 0x7f8e2c90 {execute:getfd,arguments:{fdname:fd-net0},id:libvirt-31} (fd=27)
> > > 168.801 0x7f8e2c90 {return: {}, id: libvirt-31}
> > > 168.801 0x7f8e2c90 {execute:netdev_add,arguments:{type:tap,fd:fd-net0,id:hostnet0},id:libvirt-32}
> > > 168.802 0x7f8e2c90 {return: {}, id: libvirt-32}
> > > 168.802 0x7f8e2c90 {execute:set_link,arguments:{name:net0,up:true},id:libvirt-33}
> > > 168.803 0x7f8e2c90 {return: {}, id: libvirt-33}
> > >
> > > After this sequence is done, everything about the network device *appears* normal on both the guest and host (at least the things I know to look at), but no traffic from the host shows up in a tcpdump of the interface on the guest, and no traffic from the guest shows up in a tcpdump of the tap device on the host.
> >
> > What you are trying to do isn't possible today.

Well, at least it's good to know that I should stop trying to make it work :-)

Actually, it's a bit disconcerting that 1) the act of creating a guest device is split into two commands, implying that they don't necessarily have a hardwired a--b relationship although that is the case, and that 2) netdev_add even returns success when you use it in this way. Although hindsight is 20/20 and all that, if both a and b are required, and must always be in the same order, wouldn't it have made more sense for the two steps to be a single command? I suppose this is a byproduct of the monitor commands being a direct reflection of the command-line options. (At the very least, though, I think netdev_add should report an error if the device name alias it uses is already in use by a device.)

> > The device associates with the netdev during initialization only - there is no command to associate at a later point in time. That is why your example works only when the device is deleted together with the netdev.
> >
> > It is certainly possible to implement a command to switch netdevs

At this point yes, it would be better to have a new command rather than to make netdev_add work in the way I've attempted - this way there would be a new command whose presence libvirt could use to decide whether or not to support this functionality.

> > but I'm curious what the use case is. Is this necessary just because QEMU doesn't provide a way to modify the existing netdev or because you really want to switch to a completely different netdev?
>
> We have end users who want to be able to dynamically change the guest's networking attachment, without restarting/hotplugging devices in the guest[1]. If it is just a case of changing from one bridge to another bridge, we can do that just by moving the TAP device from one to the other. This doesn't work if we want to support more general changes in config, e.g. from a macvtap setup to a TAP setup, or vice versa.

Beyond that, I haven't determined it conclusively yet, but it so far looks to me like a macvtap device can only be linked to a physdev when it is created - there is no netlink message to re-link it to a different physdev (this is based on my naive examination of the relevant kernel source). So if you want to change the attach point for a macvtap-type connection, you again need to discard the old macvtap device and create a new one, implying that you need to do a new netdev_add.
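For comparison with the failing sequence, the full detach/attach cycle that the thread reports as working could be generated roughly as follows. This is only a sketch: `full_reattach_commands()` is a hypothetical helper, the `virtio-net-pci` driver name is an assumption (the thread does not say which NIC model is in use), and the getfd step that passes the new tap fd over the monitor socket is left out. A real client would also have to wait for the guest to release the NIC after device_del before continuing.

```python
import json


def full_reattach_commands(nic_id, netdev_id, fd_name, driver="virtio-net-pci"):
    """Build the monitor messages for a full detach/attach cycle.

    The frontend is removed together with its backend and re-created
    afterwards, because the frontend only binds to a backend when it is
    initialized. The `driver` default is an assumption for illustration.
    """
    steps = [
        ("device_del", {"id": nic_id}),
        ("netdev_del", {"id": netdev_id}),
        ("netdev_add", {"type": "tap", "fd": fd_name, "id": netdev_id}),
        ("device_add", {"driver": driver, "netdev": netdev_id, "id": nic_id}),
    ]
    return [json.dumps({"execute": cmd, "arguments": args}) for cmd, args in steps]


if __name__ == "__main__":
    for message in full_reattach_commands("net0", "hostnet0", "fd-net0"):
        print(message)
```

The ordering is the point here: the backend exists before device_add runs, so the new frontend has something to bind to at initialization time.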