Re: [virtio-dev] [RFC PATCH net-next v2 1/2] virtio_net: Introduce VIRTIO_NET_F_BACKUP feature bit
On 1/22/2018 7:36 PM, Alexander Duyck wrote: On Mon, Jan 22, 2018 at 6:04 PM, Michael S. Tsirkinwrote: On Mon, Jan 22, 2018 at 05:34:37PM -0800, Samudrala, Sridhar wrote: On 1/22/2018 4:05 PM, Michael S. Tsirkin wrote: On Mon, Jan 22, 2018 at 03:27:40PM -0800, Samudrala, Sridhar wrote: You could probably even handle the Tx queue selection via a simple eBPF program and map since the input for whatever is used to select Tx should be pretty simple, destination MAC, source NUMA node, etc, and the data-set shouldn't be too large. That sounds interesting. A separate device might make this kind of setup a bit easier. Sridhar, did you look into creating a separate device for the virtual bond device at all? It does not have to be in a separate module, that kind of refactoring can come later, but once we commit to using the same single device as virtio, we can't change that. No. I haven't looked into creating a separate device. If we are going to create a new device, i guess it has to be of a new device type with its own driver. Well not necessarily - just a separate netdev ops. Kind of like e.g. vlans share a driver with the main driver. Not sure what you meant by vlans sharing a driver with the main driver. IIUC, vlans are supported via 802.1q driver and creates a netdev of type 'vlan' with vlan_netdev_ops But nothing prevents a single module from registering multiple ops. Just to clarify, it seems like what you are suggesting is just adding the "master" as a separate set of netdev ops or netdevice and to have virtio spawn two network devices, one slave and one master, if the BACKUP bit is set. Do I have that right? I am good with the code still living in the virtio driver and consolidation with other similar implementations and further improvement could probably happen later as part of some refactor. Here are the netdev_ops that are currently supported by virtio_net static const struct net_device_ops virtnet_netdev = { .ndo_open = virtnet_open, .ndo_stop = virtnet_close, .ndo_start_xmit = start_xmit, .ndo_validate_addr = eth_validate_addr, .ndo_set_mac_address = virtnet_set_mac_address, .ndo_set_rx_mode = virtnet_set_rx_mode, .ndo_get_stats64 = virtnet_stats, .ndo_vlan_rx_add_vid = virtnet_vlan_rx_add_vid, .ndo_vlan_rx_kill_vid = virtnet_vlan_rx_kill_vid, #ifdef CONFIG_NET_POLL_CONTROLLER .ndo_poll_controller = virtnet_netpoll, #endif .ndo_bpf = virtnet_xdp, .ndo_xdp_xmit = virtnet_xdp_xmit, .ndo_xdp_flush = virtnet_xdp_flush, .ndo_features_check = passthru_features_check, }; Now if we want to create 2 netdevs associated with a single virtio_device, do we want these ops supported by master or slave netdev? I guess these should be supported by the slave netdev. So do we want the netdev_ops of master simply call the slave netdev_ops for most the the cases except for the ndo_start_xmit() which will decide between virtio or vf datapath? what about ethtool_ops? we need to sync up link state between master and slave netdevs and the userspace needs to be aware of 2 type of virtio_net devices. Is this complexity really required to support a feature that can only be supported/useful for simple guest network configurations that can be controlled by hypervisor. virtio_net device that could be accelerated via a passthru device and support live migration. If a guest admin needs to configure any complex network configurations in the guest, i think those scenarios can be supported via bond/bridge/team setups and live migration may not be a requirement for such usecases. Thanks Sridhar ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Ping Re: [PATCH] virtio: make VIRTIO a menuconfig to ease disabling it all
Vincent Legollwrites: > No need to get into the submenu to disable all VIRTIO-related > config entries. > > This makes it easier to disable all VIRTIO config options > without entering the submenu. It will also enable one > to see that en/dis-abled state from the outside menu. > > This is only intended to change menuconfig UI, not change > the config dependencies. > > v2: Added "default y" to avoid breaking existing configs > v3: Fixed wrong indentation, added *-by from Randy > > Signed-off-by: Vincent Legoll > Reviewed-by: Randy Dunlap > Tested-by: Randy Dunlap # works for me > --- > drivers/virtio/Kconfig | 8 ++-- > 1 file changed, 6 insertions(+), 2 deletions(-) This has been broken in linux-next for ~6 weeks now, can we please merge this and get it fixed. cheers ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [virtio-dev] [RFC PATCH net-next v2 1/2] virtio_net: Introduce VIRTIO_NET_F_BACKUP feature bit
On Mon, Jan 22, 2018 at 6:04 PM, Michael S. Tsirkinwrote: > On Mon, Jan 22, 2018 at 05:34:37PM -0800, Samudrala, Sridhar wrote: >> On 1/22/2018 4:05 PM, Michael S. Tsirkin wrote: >> > On Mon, Jan 22, 2018 at 03:27:40PM -0800, Samudrala, Sridhar wrote: >> > > > > You could probably >> > > > > even handle the Tx queue selection via a simple eBPF program and map >> > > > > since the input for whatever is used to select Tx should be pretty >> > > > > simple, destination MAC, source NUMA node, etc, and the data-set >> > > > > shouldn't be too large. >> > > > That sounds interesting. A separate device might make this kind of >> > > > setup >> > > > a bit easier. Sridhar, did you look into creating a separate device >> > > > for >> > > > the virtual bond device at all? It does not have to be in a separate >> > > > module, that kind of refactoring can come later, but once we commit to >> > > > using the same single device as virtio, we can't change that. >> > > No. I haven't looked into creating a separate device. If we are going to >> > > create a new >> > > device, i guess it has to be of a new device type with its own driver. >> > Well not necessarily - just a separate netdev ops. >> > Kind of like e.g. vlans share a driver with the main driver. >> >> Not sure what you meant by vlans sharing a driver with the main driver. >> IIUC, vlans are supported via 802.1q driver and creates a netdev of type >> 'vlan' >> with vlan_netdev_ops > > But nothing prevents a single module from registering > multiple ops. Just to clarify, it seems like what you are suggesting is just adding the "master" as a separate set of netdev ops or netdevice and to have virtio spawn two network devices, one slave and one master, if the BACKUP bit is set. Do I have that right? I am good with the code still living in the virtio driver and consolidation with other similar implementations and further improvement could probably happen later as part of some refactor. Thanks. - Alex ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [virtio-dev] [RFC PATCH net-next v2 1/2] virtio_net: Introduce VIRTIO_NET_F_BACKUP feature bit
On Mon, Jan 22, 2018 at 05:34:37PM -0800, Samudrala, Sridhar wrote: > On 1/22/2018 4:05 PM, Michael S. Tsirkin wrote: > > On Mon, Jan 22, 2018 at 03:27:40PM -0800, Samudrala, Sridhar wrote: > > > > > You could probably > > > > > even handle the Tx queue selection via a simple eBPF program and map > > > > > since the input for whatever is used to select Tx should be pretty > > > > > simple, destination MAC, source NUMA node, etc, and the data-set > > > > > shouldn't be too large. > > > > That sounds interesting. A separate device might make this kind of setup > > > > a bit easier. Sridhar, did you look into creating a separate device for > > > > the virtual bond device at all? It does not have to be in a separate > > > > module, that kind of refactoring can come later, but once we commit to > > > > using the same single device as virtio, we can't change that. > > > No. I haven't looked into creating a separate device. If we are going to > > > create a new > > > device, i guess it has to be of a new device type with its own driver. > > Well not necessarily - just a separate netdev ops. > > Kind of like e.g. vlans share a driver with the main driver. > > Not sure what you meant by vlans sharing a driver with the main driver. > IIUC, vlans are supported via 802.1q driver and creates a netdev of type > 'vlan' > with vlan_netdev_ops But nothing prevents a single module from registering multiple ops. > > > > > As we are using virtio_net to control and manage the VF data path, it is > > > not > > > clear to me > > > what is the advantage of creating a new device rather than extending > > > virtio_net to manage > > > the VF datapath via transparent bond mechanism. > > > > > > Thanks > > > Sridhar > > So that XDP redirect actions can differentiate between virtio, PT > > device and the bond. Without it there's no way to redirect > > to virtio specifically. > I guess this usecase is for a guest admin to add bpf programs to VF netdev > and > redirect frames to virtio. No - this doesn't make much sense IMHO. The usecase would be VM bridging where we process packets incoming on one virtio interface, and transmit them on another one. It was pointed out using a separate master netdev and making both virtio and PT its slaves would allow redirect switch to force virtio, without it it won't be possible to force redirect to virtio. How important that use-case is, I am not sure. > How does bond enable this feature? > > > Thanks > Sridhar As it's a userspace configuration, I guess for starters we can punt this to userspace, too. -- MST ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [virtio-dev] [RFC PATCH net-next v2 1/2] virtio_net: Introduce VIRTIO_NET_F_BACKUP feature bit
On 1/22/2018 4:02 PM, Stephen Hemminger wrote: In the case of SwitchDev it should be possible for the port representors and the switch to provide data on which interfaces are bonded on the host side and which aren't. With that data it would be pretty easy to just put together a list of addresses that would prefer to go the para-virtual route instead of being transmitted through physical hardware. In addition a bridge implies much more overhead since normally a bridge can receive a packet in on one interface and transmit it on another. We don't really need that. This is more of a VEPA type setup and doesn't need to be anything all that complex. You could probably even handle the Tx queue selection via a simple eBPF program and map since the input for whatever is used to select Tx should be pretty simple, destination MAC, source NUMA node, etc, and the data-set shouldn't be too large. That sounds interesting. A separate device might make this kind of setup a bit easier. Sridhar, did you look into creating a separate device for the virtual bond device at all? It does not have to be in a separate module, that kind of refactoring can come later, but once we commit to using the same single device as virtio, we can't change that. No. I haven't looked into creating a separate device. If we are going to create a new device, i guess it has to be of a new device type with its own driver. As we are using virtio_net to control and manage the VF data path, it is not clear to me what is the advantage of creating a new device rather than extending virtio_net to manage the VF datapath via transparent bond mechanism. Thanks Sridhar The requirement with Azure accelerated network was that a stock distribution image from the store must be able to run unmodified and get accelerated networking. Not sure if other environments need to work the same, but it would be nice. That meant no additional setup scripts (aka no bonding) and also it must work transparently with hot-plug. Also there are diverse set of environments: openstack, cloudinit, network manager and systemd. The solution had to not depend on any one of them, but also not break any of them. Yes. Cloud Service Providers using KVM as hypervisor have a similar requirement to provide accelerated networking with VM images that support virtio_net. Thanks Sridhar ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [virtio-dev] [RFC PATCH net-next v2 1/2] virtio_net: Introduce VIRTIO_NET_F_BACKUP feature bit
On 1/22/2018 4:05 PM, Michael S. Tsirkin wrote: On Mon, Jan 22, 2018 at 03:27:40PM -0800, Samudrala, Sridhar wrote: You could probably even handle the Tx queue selection via a simple eBPF program and map since the input for whatever is used to select Tx should be pretty simple, destination MAC, source NUMA node, etc, and the data-set shouldn't be too large. That sounds interesting. A separate device might make this kind of setup a bit easier. Sridhar, did you look into creating a separate device for the virtual bond device at all? It does not have to be in a separate module, that kind of refactoring can come later, but once we commit to using the same single device as virtio, we can't change that. No. I haven't looked into creating a separate device. If we are going to create a new device, i guess it has to be of a new device type with its own driver. Well not necessarily - just a separate netdev ops. Kind of like e.g. vlans share a driver with the main driver. Not sure what you meant by vlans sharing a driver with the main driver. IIUC, vlans are supported via 802.1q driver and creates a netdev of type 'vlan' with vlan_netdev_ops As we are using virtio_net to control and manage the VF data path, it is not clear to me what is the advantage of creating a new device rather than extending virtio_net to manage the VF datapath via transparent bond mechanism. Thanks Sridhar So that XDP redirect actions can differentiate between virtio, PT device and the bond. Without it there's no way to redirect to virtio specifically. I guess this usecase is for a guest admin to add bpf programs to VF netdev and redirect frames to virtio. How does bond enable this feature? Thanks Sridhar ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [virtio-dev] [RFC PATCH net-next v2 1/2] virtio_net: Introduce VIRTIO_NET_F_BACKUP feature bit
On Mon, Jan 22, 2018 at 05:13:01PM -0800, Jakub Kicinski wrote: > On Tue, 23 Jan 2018 02:47:57 +0200, Michael S. Tsirkin wrote: > > On Mon, Jan 22, 2018 at 04:16:23PM -0800, Jakub Kicinski wrote: > > > On Tue, 23 Jan 2018 02:05:48 +0200, Michael S. Tsirkin wrote: > > > > > As we are using virtio_net to control and manage the VF data path, it > > > > > is not > > > > > clear to me > > > > > what is the advantage of creating a new device rather than extending > > > > > virtio_net to manage > > > > > the VF datapath via transparent bond mechanism. > > > > > > > > So that XDP redirect actions can differentiate between virtio, PT > > > > device and the bond. Without it there's no way to redirect > > > > to virtio specifically. > > > > > > Let's make a list :P > > > > > > separate netdev: > > > 1. virtio device being a bond master is confusing/unexpected. > > > 2. virtio device being both a master and a slave is confusing. > > > > vlans are like this too, aren't they? > > Perhaps a bad wording. Both master and member would be better. > > > > 3. configuration of a master may have different semantics than > > >configuration of a slave. > > > 4. two para-virt devices may create a loop (or rather will be bound > > >to each other indeterministically, depending on which spawns first). > > > > For 2 virtio devices, we can disable the bond to make it deterministic. > > Do you mean the hypervisor can or is there a knob in virtio_net to mask > off features? Hypervisor can do it too. And it really should: specifying 2 devices as backup and giving them same mac seems like a misconfiguration. But it's easy to do in virtio without knobs - check that the potential slave is also a virtio device with the backup flag, and don't make it a slave. > Would that require re-probe of the virtio device? Probably not. > > > 5. there is no user configuration AFAIR in existing patch, VM admin > > >won't be able to prevent the bond. Separate netdev we can make > > >removable even if it's spawned automatically. > > > > That's more or less a feature. If you want configurability, just use > > any of the existing generic solutions (team,bond,bridge,...). > > The use case in mind is that VM admin wants to troubleshoot a problem > and temporarily disable the auto-bond without touching the hypervisor > (and either member preferably). I don't think it's possible to support this unconditionally. E.g. think of a config where these actually share a backend, with virtio becoming active when PT access to the backend is disabled. So you will need some device specific extension for that. > > > 6. XDP redirect use-case (or any explicit use of the virtio slave) > > >(from MST) > > > > > > independent driver: > > > 7. code reuse. > > > > With netvsc? That precludes a separate device though because of > > compatibility. > > Hopefully with one of the established bonding drivers (fingers > crossed). There is very little similarity. Calling this device a bond just confuses people. > We may see proliferation of special bonds (see Achiad's > presentation from last netdev about NIC-NUMA-node-bonds). I'll take a look, but this isn't like a bond at all, no more than a vlan is a bond. E.g. if PT link goes down then link is down period and you do not want to switch to virtio. > > > separate device: > > > > I'm not sure I understand how "separate device" is different from > > "separate netdev". Do you advocate for a special device who's job is > > just to tell the guest "bind these two devices together"? > > > > Yea, sure, that works. However for sure it's more work to > > implement and manage at all levels. Further > > > > - it doesn't answer the question > > - a feature bit in a virtio device is cheap enough that > > I wouldn't worry too much about this feature > > going unused later. > > Right, I think we are referring to different things as device. I mean > a bus device/struct device, but no strong preference on that one. I'll > be happy as long as there is a separate netdev, really :) > > > > 8. bond any netdev with any netdev. > > > 9. reuse well-known device driver model. > > > a. natural anchor for hypervisor configuration (switchdev etc.) > > > > saparate netdev has the same property > > > > > b. next-gen silicon may be able to disguise as virtio device, and the > > >loop check in virtio driver will prevent the legitimate bond it such > > >case. AFAIU that's one of the goals of next-gen virtio spec as well. > > > > In fact we have a virtio feature bit for the fallback. > > So this part does not depend on how software in guest works > > and does not need software solutions. > > You mean in the new spec? Nice. Still I think people will try to > implement the old one too given sufficiently capable HW. Existing HW won't have the BACKUP feature so the new functionality won't be activated. So no problem I think. -- MST
Re: [virtio-dev] [RFC PATCH net-next v2 1/2] virtio_net: Introduce VIRTIO_NET_F_BACKUP feature bit
On Tue, 23 Jan 2018 02:47:57 +0200, Michael S. Tsirkin wrote: > On Mon, Jan 22, 2018 at 04:16:23PM -0800, Jakub Kicinski wrote: > > On Tue, 23 Jan 2018 02:05:48 +0200, Michael S. Tsirkin wrote: > > > > As we are using virtio_net to control and manage the VF data path, it > > > > is not > > > > clear to me > > > > what is the advantage of creating a new device rather than extending > > > > virtio_net to manage > > > > the VF datapath via transparent bond mechanism. > > > > > > So that XDP redirect actions can differentiate between virtio, PT > > > device and the bond. Without it there's no way to redirect > > > to virtio specifically. > > > > Let's make a list :P > > > > separate netdev: > > 1. virtio device being a bond master is confusing/unexpected. > > 2. virtio device being both a master and a slave is confusing. > > vlans are like this too, aren't they? Perhaps a bad wording. Both master and member would be better. > > 3. configuration of a master may have different semantics than > >configuration of a slave. > > 4. two para-virt devices may create a loop (or rather will be bound > >to each other indeterministically, depending on which spawns first). > > For 2 virtio devices, we can disable the bond to make it deterministic. Do you mean the hypervisor can or is there a knob in virtio_net to mask off features? Would that require re-probe of the virtio device? > > 5. there is no user configuration AFAIR in existing patch, VM admin > >won't be able to prevent the bond. Separate netdev we can make > >removable even if it's spawned automatically. > > That's more or less a feature. If you want configurability, just use > any of the existing generic solutions (team,bond,bridge,...). The use case in mind is that VM admin wants to troubleshoot a problem and temporarily disable the auto-bond without touching the hypervisor (and either member preferably). > > 6. XDP redirect use-case (or any explicit use of the virtio slave) > >(from MST) > > > > independent driver: > > 7. code reuse. > > With netvsc? That precludes a separate device though because of > compatibility. Hopefully with one of the established bonding drivers (fingers crossed). We may see proliferation of special bonds (see Achiad's presentation from last netdev about NIC-NUMA-node-bonds). > > separate device: > > I'm not sure I understand how "separate device" is different from > "separate netdev". Do you advocate for a special device who's job is > just to tell the guest "bind these two devices together"? > > Yea, sure, that works. However for sure it's more work to > implement and manage at all levels. Further > > - it doesn't answer the question > - a feature bit in a virtio device is cheap enough that > I wouldn't worry too much about this feature > going unused later. Right, I think we are referring to different things as device. I mean a bus device/struct device, but no strong preference on that one. I'll be happy as long as there is a separate netdev, really :) > > 8. bond any netdev with any netdev. > > 9. reuse well-known device driver model. > > a. natural anchor for hypervisor configuration (switchdev etc.) > > saparate netdev has the same property > > > b. next-gen silicon may be able to disguise as virtio device, and the > >loop check in virtio driver will prevent the legitimate bond it such > >case. AFAIU that's one of the goals of next-gen virtio spec as well. > > In fact we have a virtio feature bit for the fallback. > So this part does not depend on how software in guest works > and does not need software solutions. You mean in the new spec? Nice. Still I think people will try to implement the old one too given sufficiently capable HW. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [virtio-dev] [RFC PATCH net-next v2 1/2] virtio_net: Introduce VIRTIO_NET_F_BACKUP feature bit
On Mon, Jan 22, 2018 at 04:16:23PM -0800, Jakub Kicinski wrote: > On Tue, 23 Jan 2018 02:05:48 +0200, Michael S. Tsirkin wrote: > > > As we are using virtio_net to control and manage the VF data path, it is > > > not > > > clear to me > > > what is the advantage of creating a new device rather than extending > > > virtio_net to manage > > > the VF datapath via transparent bond mechanism. > > > > So that XDP redirect actions can differentiate between virtio, PT > > device and the bond. Without it there's no way to redirect > > to virtio specifically. > > Let's make a list :P > > separate netdev: > 1. virtio device being a bond master is confusing/unexpected. > 2. virtio device being both a master and a slave is confusing. vlans are like this too, aren't they? > 3. configuration of a master may have different semantics than >configuration of a slave. > 4. two para-virt devices may create a loop (or rather will be bound >to each other indeterministically, depending on which spawns first). For 2 virtio devices, we can disable the bond to make it deterministic. > 5. there is no user configuration AFAIR in existing patch, VM admin >won't be able to prevent the bond. Separate netdev we can make >removable even if it's spawned automatically. That's more or less a feature. If you want configurability, just use any of the existing generic solutions (team,bond,bridge,...). > 6. XDP redirect use-case (or any explicit use of the virtio slave) >(from MST) > > independent driver: > 7. code reuse. With netvsc? That precludes a separate device though because of compatibility. > > separate device: I'm not sure I understand how "separate device" is different from "separate netdev". Do you advocate for a special device who's job is just to tell the guest "bind these two devices together"? Yea, sure, that works. However for sure it's more work to implement and manage at all levels. Further - it doesn't answer the question - a feature bit in a virtio device is cheap enough that I wouldn't worry too much about this feature going unused later. > 8. bond any netdev with any netdev. > 9. reuse well-known device driver model. > a. natural anchor for hypervisor configuration (switchdev etc.) saparate netdev has the same property > b. next-gen silicon may be able to disguise as virtio device, and the >loop check in virtio driver will prevent the legitimate bond it such >case. AFAIU that's one of the goals of next-gen virtio spec as well. In fact we have a virtio feature bit for the fallback. So this part does not depend on how software in guest works and does not need software solutions. -- MST ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [virtio-dev] [RFC PATCH net-next v2 1/2] virtio_net: Introduce VIRTIO_NET_F_BACKUP feature bit
On Tue, 23 Jan 2018 02:05:48 +0200, Michael S. Tsirkin wrote: > > As we are using virtio_net to control and manage the VF data path, it is not > > clear to me > > what is the advantage of creating a new device rather than extending > > virtio_net to manage > > the VF datapath via transparent bond mechanism. > > So that XDP redirect actions can differentiate between virtio, PT > device and the bond. Without it there's no way to redirect > to virtio specifically. Let's make a list :P separate netdev: 1. virtio device being a bond master is confusing/unexpected. 2. virtio device being both a master and a slave is confusing. 3. configuration of a master may have different semantics than configuration of a slave. 4. two para-virt devices may create a loop (or rather will be bound to each other indeterministically, depending on which spawns first). 5. there is no user configuration AFAIR in existing patch, VM admin won't be able to prevent the bond. Separate netdev we can make removable even if it's spawned automatically. 6. XDP redirect use-case (or any explicit use of the virtio slave) (from MST) independent driver: 7. code reuse. separate device: 8. bond any netdev with any netdev. 9. reuse well-known device driver model. a. natural anchor for hypervisor configuration (switchdev etc.) b. next-gen silicon may be able to disguise as virtio device, and the loop check in virtio driver will prevent the legitimate bond it such case. AFAIU that's one of the goals of next-gen virtio spec as well. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [virtio-dev] [RFC PATCH net-next v2 1/2] virtio_net: Introduce VIRTIO_NET_F_BACKUP feature bit
On Mon, Jan 22, 2018 at 03:27:40PM -0800, Samudrala, Sridhar wrote: > > > You could probably > > > even handle the Tx queue selection via a simple eBPF program and map > > > since the input for whatever is used to select Tx should be pretty > > > simple, destination MAC, source NUMA node, etc, and the data-set > > > shouldn't be too large. > > That sounds interesting. A separate device might make this kind of setup > > a bit easier. Sridhar, did you look into creating a separate device for > > the virtual bond device at all? It does not have to be in a separate > > module, that kind of refactoring can come later, but once we commit to > > using the same single device as virtio, we can't change that. > > No. I haven't looked into creating a separate device. If we are going to > create a new > device, i guess it has to be of a new device type with its own driver. Well not necessarily - just a separate netdev ops. Kind of like e.g. vlans share a driver with the main driver. > As we are using virtio_net to control and manage the VF data path, it is not > clear to me > what is the advantage of creating a new device rather than extending > virtio_net to manage > the VF datapath via transparent bond mechanism. > > Thanks > Sridhar So that XDP redirect actions can differentiate between virtio, PT device and the bond. Without it there's no way to redirect to virtio specifically. -- MST ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [virtio-dev] [RFC PATCH net-next v2 1/2] virtio_net: Introduce VIRTIO_NET_F_BACKUP feature bit
On Mon, 22 Jan 2018 15:27:40 -0800 "Samudrala, Sridhar"wrote: > On 1/22/2018 1:31 PM, Michael S. Tsirkin wrote: > > On Wed, Jan 17, 2018 at 01:49:58PM -0800, Alexander Duyck wrote: > >> On Wed, Jan 17, 2018 at 11:57 AM, Michael S. Tsirkin > >> wrote: > >>> On Wed, Jan 17, 2018 at 11:25:41AM -0800, Samudrala, Sridhar wrote: > > On 1/17/2018 11:02 AM, Michael S. Tsirkin wrote: > > On Wed, Jan 17, 2018 at 10:15:52AM -0800, Alexander Duyck wrote: > >> On Thu, Jan 11, 2018 at 9:58 PM, Sridhar Samudrala > >> wrote: > >>> This feature bit can be used by hypervisor to indicate virtio_net > >>> device to > >>> act as a backup for another device with the same MAC address. > >>> > >>> Signed-off-by: Sridhar Samudrala > >>> --- > >>>drivers/net/virtio_net.c| 2 +- > >>>include/uapi/linux/virtio_net.h | 3 +++ > >>>2 files changed, 4 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > >>> index 12dfc5fee58e..f149a160a8c5 100644 > >>> --- a/drivers/net/virtio_net.c > >>> +++ b/drivers/net/virtio_net.c > >>> @@ -2829,7 +2829,7 @@ static struct virtio_device_id id_table[] = { > >>> VIRTIO_NET_F_GUEST_ANNOUNCE, VIRTIO_NET_F_MQ, \ > >>> VIRTIO_NET_F_CTRL_MAC_ADDR, \ > >>> VIRTIO_NET_F_MTU, VIRTIO_NET_F_CTRL_GUEST_OFFLOADS, \ > >>> - VIRTIO_NET_F_SPEED_DUPLEX > >>> + VIRTIO_NET_F_SPEED_DUPLEX, VIRTIO_NET_F_BACKUP > >>> > >>>static unsigned int features[] = { > >>> VIRTNET_FEATURES, > >>> diff --git a/include/uapi/linux/virtio_net.h > >>> b/include/uapi/linux/virtio_net.h > >>> index 5de6ed37695b..c7c35fd1a5ed 100644 > >>> --- a/include/uapi/linux/virtio_net.h > >>> +++ b/include/uapi/linux/virtio_net.h > >>> @@ -57,6 +57,9 @@ > >>>* Steering */ > >>>#define VIRTIO_NET_F_CTRL_MAC_ADDR 23 /* Set MAC address */ > >>> > >>> +#define VIRTIO_NET_F_BACKUP 62/* Act as backup for another > >>> device > >>> +* with the same MAC. > >>> +*/ > >>>#define VIRTIO_NET_F_SPEED_DUPLEX 63 /* Device set linkspeed and > >>> duplex */ > >>> > >>>#ifndef VIRTIO_NET_NO_LEGACY > >> I'm not a huge fan of the name "backup" since that implies that the > >> Virtio interface is only used if the VF is not present, and there are > >> multiple instances such as dealing with east/west or > >> broadcast/multicast traffic where it may be desirable to use the > >> para-virtual interface rather then deal with PCI overhead/bottleneck > >> to send the packet. > > Right now hypervisors mostly expect that yes, only one at a time is > > used. E.g. if you try to do multicast sending packets on both VF and > > virtio then you will end up with two copies of each packet. > I think we want to use only 1 interface to send out any packet. In case > of > broadcast/multicasts it would be an optimization to send them via virtio > and > this patch series adds that optimization. > >>> Right that's what I think we should rather avoid for now. > >>> > >>> It's *not* an optimization if there's a single VM on this host, > >>> or if a specific multicast group does not have any VMs on same > >>> host. > >> Agreed. In my mind this is something that is controlled by the > >> pass-thru interface once it is enslaved. > > It would be pretty tricky to control through the PT > > interface since a PT interface pretends to be a physical > > device, which has no concept of VMs. > > > >>> I'd rather we just sent everything out on the PT if that's > >>> there. The reason we have virtio in the picture is just so > >>> we can migrate without downtime. > >> I wasn't saying we do that in all cases. That would be something that > >> would have to be decided by the pass-thru interface. Ideally the > >> virtio would provide just enough information to get itself into the > >> bond and I see this being the mechanism for it to do so. From there > >> the complexity mostly lies in the pass-thru interface to configure the > >> correct transmit modes if for example you have multiple pass-thru > >> interfaces or a more complex traffic setup due to things like > >> SwitchDev. > >> > >> In my mind we go the bonding route and there are few use cases for all > >> of this. First is the backup case that is being addressed here. That > >> becomes your basic "copy netvsc" approach for this which would be > >> default. It is how we would handle basic pass-thru back-up paths. If > >> the host decides to send multicast/broadcast traffic from the host up > >> through
Re: [virtio-dev] [RFC PATCH net-next v2 1/2] virtio_net: Introduce VIRTIO_NET_F_BACKUP feature bit
On 1/22/2018 1:31 PM, Michael S. Tsirkin wrote: On Wed, Jan 17, 2018 at 01:49:58PM -0800, Alexander Duyck wrote: On Wed, Jan 17, 2018 at 11:57 AM, Michael S. Tsirkinwrote: On Wed, Jan 17, 2018 at 11:25:41AM -0800, Samudrala, Sridhar wrote: On 1/17/2018 11:02 AM, Michael S. Tsirkin wrote: On Wed, Jan 17, 2018 at 10:15:52AM -0800, Alexander Duyck wrote: On Thu, Jan 11, 2018 at 9:58 PM, Sridhar Samudrala wrote: This feature bit can be used by hypervisor to indicate virtio_net device to act as a backup for another device with the same MAC address. Signed-off-by: Sridhar Samudrala --- drivers/net/virtio_net.c| 2 +- include/uapi/linux/virtio_net.h | 3 +++ 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 12dfc5fee58e..f149a160a8c5 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -2829,7 +2829,7 @@ static struct virtio_device_id id_table[] = { VIRTIO_NET_F_GUEST_ANNOUNCE, VIRTIO_NET_F_MQ, \ VIRTIO_NET_F_CTRL_MAC_ADDR, \ VIRTIO_NET_F_MTU, VIRTIO_NET_F_CTRL_GUEST_OFFLOADS, \ - VIRTIO_NET_F_SPEED_DUPLEX + VIRTIO_NET_F_SPEED_DUPLEX, VIRTIO_NET_F_BACKUP static unsigned int features[] = { VIRTNET_FEATURES, diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h index 5de6ed37695b..c7c35fd1a5ed 100644 --- a/include/uapi/linux/virtio_net.h +++ b/include/uapi/linux/virtio_net.h @@ -57,6 +57,9 @@ * Steering */ #define VIRTIO_NET_F_CTRL_MAC_ADDR 23 /* Set MAC address */ +#define VIRTIO_NET_F_BACKUP 62/* Act as backup for another device +* with the same MAC. +*/ #define VIRTIO_NET_F_SPEED_DUPLEX 63 /* Device set linkspeed and duplex */ #ifndef VIRTIO_NET_NO_LEGACY I'm not a huge fan of the name "backup" since that implies that the Virtio interface is only used if the VF is not present, and there are multiple instances such as dealing with east/west or broadcast/multicast traffic where it may be desirable to use the para-virtual interface rather then deal with PCI overhead/bottleneck to send the packet. Right now hypervisors mostly expect that yes, only one at a time is used. E.g. if you try to do multicast sending packets on both VF and virtio then you will end up with two copies of each packet. I think we want to use only 1 interface to send out any packet. In case of broadcast/multicasts it would be an optimization to send them via virtio and this patch series adds that optimization. Right that's what I think we should rather avoid for now. It's *not* an optimization if there's a single VM on this host, or if a specific multicast group does not have any VMs on same host. Agreed. In my mind this is something that is controlled by the pass-thru interface once it is enslaved. It would be pretty tricky to control through the PT interface since a PT interface pretends to be a physical device, which has no concept of VMs. I'd rather we just sent everything out on the PT if that's there. The reason we have virtio in the picture is just so we can migrate without downtime. I wasn't saying we do that in all cases. That would be something that would have to be decided by the pass-thru interface. Ideally the virtio would provide just enough information to get itself into the bond and I see this being the mechanism for it to do so. From there the complexity mostly lies in the pass-thru interface to configure the correct transmit modes if for example you have multiple pass-thru interfaces or a more complex traffic setup due to things like SwitchDev. In my mind we go the bonding route and there are few use cases for all of this. First is the backup case that is being addressed here. That becomes your basic "copy netvsc" approach for this which would be default. It is how we would handle basic pass-thru back-up paths. If the host decides to send multicast/broadcast traffic from the host up through it that is a host side decision. I am okay with our default transmit behavior from the guest being to send everything through the pass-thru interface. All the virtio would be doing is advertising that it is a side channel for some sort of bond with another interface. By calling it a side channel it implies that it isn't the direct channel for the interface, but it is an alternative communications channel for things like backup, and other future uses. There end up being a few different "phase 2" concepts I have floating around which I want to avoid limiting. Solving the east/west problem is an example. In my mind that just becomes a bonding Tx mode that is controlled via the pass-thru interface. The VF could black-list the local addresses so that they instead fall into the virtio interface. In addition I seem to
Re: [RFC PATCH net-next v2 2/2] virtio_net: Extend virtio to use VF datapath when available
On Mon, Jan 22, 2018 at 12:27:14PM -0800, Siwei Liu wrote: > First off, as mentioned in another thread, the model of stacking up > virt-bond functionality over virtio seems a wrong direction to me. > Essentially the migration process would need to carry over all guest > side configurations previously done on the VF/PT and get them moved to > the new device being it virtio or VF/PT. I might be wrong but I don't see why we should worry about this usecase. Whoever has a bond configured already has working config for migration. We are trying to help people who don't, not convert existig users. > Without the help of a new > upper layer bond driver that enslaves virtio and VF/PT devices > underneath, virtio will be overloaded with too much specifics being a > VF/PT backup in the future. So this paragraph already includes at least two conflicting proposals. On the one hand you want a separate device for the virtual bond, on the other you are saying a separate driver. Further, the reason to have a separate *driver* was that some people wanted to share code with netvsc - and that one does not create a separate device, which you can't change without breaking existing configs. So some people want a fully userspace-configurable switchdev, and that already exists at some level, and maybe it makes sense to add more features for performance. But the point was that some host configurations are very simple, and it probably makes sense to pass this information to the guest and have guest act on it directly. Let's not conflate the two. -- MST ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [virtio-dev] [RFC PATCH net-next v2 1/2] virtio_net: Introduce VIRTIO_NET_F_BACKUP feature bit
On Wed, Jan 17, 2018 at 01:49:58PM -0800, Alexander Duyck wrote: > On Wed, Jan 17, 2018 at 11:57 AM, Michael S. Tsirkinwrote: > > On Wed, Jan 17, 2018 at 11:25:41AM -0800, Samudrala, Sridhar wrote: > >> > >> > >> On 1/17/2018 11:02 AM, Michael S. Tsirkin wrote: > >> > On Wed, Jan 17, 2018 at 10:15:52AM -0800, Alexander Duyck wrote: > >> > > On Thu, Jan 11, 2018 at 9:58 PM, Sridhar Samudrala > >> > > wrote: > >> > > > This feature bit can be used by hypervisor to indicate virtio_net > >> > > > device to > >> > > > act as a backup for another device with the same MAC address. > >> > > > > >> > > > Signed-off-by: Sridhar Samudrala > >> > > > --- > >> > > > drivers/net/virtio_net.c| 2 +- > >> > > > include/uapi/linux/virtio_net.h | 3 +++ > >> > > > 2 files changed, 4 insertions(+), 1 deletion(-) > >> > > > > >> > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > >> > > > index 12dfc5fee58e..f149a160a8c5 100644 > >> > > > --- a/drivers/net/virtio_net.c > >> > > > +++ b/drivers/net/virtio_net.c > >> > > > @@ -2829,7 +2829,7 @@ static struct virtio_device_id id_table[] = { > >> > > > VIRTIO_NET_F_GUEST_ANNOUNCE, VIRTIO_NET_F_MQ, \ > >> > > > VIRTIO_NET_F_CTRL_MAC_ADDR, \ > >> > > > VIRTIO_NET_F_MTU, VIRTIO_NET_F_CTRL_GUEST_OFFLOADS, \ > >> > > > - VIRTIO_NET_F_SPEED_DUPLEX > >> > > > + VIRTIO_NET_F_SPEED_DUPLEX, VIRTIO_NET_F_BACKUP > >> > > > > >> > > > static unsigned int features[] = { > >> > > > VIRTNET_FEATURES, > >> > > > diff --git a/include/uapi/linux/virtio_net.h > >> > > > b/include/uapi/linux/virtio_net.h > >> > > > index 5de6ed37695b..c7c35fd1a5ed 100644 > >> > > > --- a/include/uapi/linux/virtio_net.h > >> > > > +++ b/include/uapi/linux/virtio_net.h > >> > > > @@ -57,6 +57,9 @@ > >> > > > * Steering */ > >> > > > #define VIRTIO_NET_F_CTRL_MAC_ADDR 23 /* Set MAC address */ > >> > > > > >> > > > +#define VIRTIO_NET_F_BACKUP 62/* Act as backup for another > >> > > > device > >> > > > +* with the same MAC. > >> > > > +*/ > >> > > > #define VIRTIO_NET_F_SPEED_DUPLEX 63 /* Device set linkspeed and > >> > > > duplex */ > >> > > > > >> > > > #ifndef VIRTIO_NET_NO_LEGACY > >> > > I'm not a huge fan of the name "backup" since that implies that the > >> > > Virtio interface is only used if the VF is not present, and there are > >> > > multiple instances such as dealing with east/west or > >> > > broadcast/multicast traffic where it may be desirable to use the > >> > > para-virtual interface rather then deal with PCI overhead/bottleneck > >> > > to send the packet. > >> > Right now hypervisors mostly expect that yes, only one at a time is > >> > used. E.g. if you try to do multicast sending packets on both VF and > >> > virtio then you will end up with two copies of each packet. > >> > >> I think we want to use only 1 interface to send out any packet. In case of > >> broadcast/multicasts it would be an optimization to send them via virtio > >> and > >> this patch series adds that optimization. > > > > Right that's what I think we should rather avoid for now. > > > > It's *not* an optimization if there's a single VM on this host, > > or if a specific multicast group does not have any VMs on same > > host. > > Agreed. In my mind this is something that is controlled by the > pass-thru interface once it is enslaved. It would be pretty tricky to control through the PT interface since a PT interface pretends to be a physical device, which has no concept of VMs. > > I'd rather we just sent everything out on the PT if that's > > there. The reason we have virtio in the picture is just so > > we can migrate without downtime. > > I wasn't saying we do that in all cases. That would be something that > would have to be decided by the pass-thru interface. Ideally the > virtio would provide just enough information to get itself into the > bond and I see this being the mechanism for it to do so. From there > the complexity mostly lies in the pass-thru interface to configure the > correct transmit modes if for example you have multiple pass-thru > interfaces or a more complex traffic setup due to things like > SwitchDev. > > In my mind we go the bonding route and there are few use cases for all > of this. First is the backup case that is being addressed here. That > becomes your basic "copy netvsc" approach for this which would be > default. It is how we would handle basic pass-thru back-up paths. If > the host decides to send multicast/broadcast traffic from the host up > through it that is a host side decision. I am okay with our default > transmit behavior from the guest being to send everything through the > pass-thru interface. All the virtio would be doing is advertising that > it is a side channel for some sort
Re: [RFC PATCH net-next v2 2/2] virtio_net: Extend virtio to use VF datapath when available
On 1/22/2018 12:27 PM, Siwei Liu wrote: First off, as mentioned in another thread, the model of stacking up virt-bond functionality over virtio seems a wrong direction to me. Essentially the migration process would need to carry over all guest side configurations previously done on the VF/PT and get them moved to the new device being it virtio or VF/PT. Without the help of a new upper layer bond driver that enslaves virtio and VF/PT devices underneath, virtio will be overloaded with too much specifics being a VF/PT backup in the future. I hope you're already aware of the issue in longer term and move to that model as soon as possible. See more inline. The idea behind this design is to provide a low latency datapath to virtio_net while preserving live migration feature without the need for the guest admin to configure a bond between VF and virtio_net. As this feature is enabled and configured via virtio_net which has a back channel to the hypervisor, adding this functionality to virtio_net looks like a reasonable option. Adding a new driver and a new device requires defining a new interface and a channel between the hypervisor and the VM and if required we may implement that in future. On Thu, Jan 11, 2018 at 9:58 PM, Sridhar Samudralawrote: This patch enables virtio_net to switch over to a VF datapath when a VF netdev is present with the same MAC address. The VF datapath is only used for unicast traffic. Broadcasts/multicasts go via virtio datapath so that east-west broadcasts don't use the PCI bandwidth. Why not making an this an option/optimization rather than being the only means? The problem of east-west broadcast eating PCI bandwidth depends on specifics of the (virtual) network setup, while some users won't want to lose VF's merits such as latency. Why restricting broadcast/multicast xmit to virtio only which potentially regresses the performance against raw VF? I am planning to remove this option when i resubmit the patches. It allows live migration of a VM with a direct attached VF without the need to setup a bond/team between a VF and virtio net device in the guest. The hypervisor needs to unplug the VF device from the guest on the source host and reset the MAC filter of the VF to initiate failover of datapath to virtio before starting the migration. After the migration is completed, the destination hypervisor sets the MAC filter on the VF and plugs it back to the guest to switch over to VF datapath. Is there a host side patch (planned) for this MAC filter switching process? As said in another thread, that simple script won't work for macvtap backend. The host side patch to enable qemu to configure this feature is included in this patch series. I have been testing this feature using a shell script, but i hope someone in the libvirt community will extend 'virsh' to handle live migration when this feature is supported. Thanks Sridhar ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [RFC PATCH net-next v2 2/2] virtio_net: Extend virtio to use VF datapath when available
First off, as mentioned in another thread, the model of stacking up virt-bond functionality over virtio seems a wrong direction to me. Essentially the migration process would need to carry over all guest side configurations previously done on the VF/PT and get them moved to the new device being it virtio or VF/PT. Without the help of a new upper layer bond driver that enslaves virtio and VF/PT devices underneath, virtio will be overloaded with too much specifics being a VF/PT backup in the future. I hope you're already aware of the issue in longer term and move to that model as soon as possible. See more inline. On Thu, Jan 11, 2018 at 9:58 PM, Sridhar Samudralawrote: > This patch enables virtio_net to switch over to a VF datapath when a VF > netdev is present with the same MAC address. The VF datapath is only used > for unicast traffic. Broadcasts/multicasts go via virtio datapath so that > east-west broadcasts don't use the PCI bandwidth. Why not making an this an option/optimization rather than being the only means? The problem of east-west broadcast eating PCI bandwidth depends on specifics of the (virtual) network setup, while some users won't want to lose VF's merits such as latency. Why restricting broadcast/multicast xmit to virtio only which potentially regresses the performance against raw VF? > It allows live migration > of a VM with a direct attached VF without the need to setup a bond/team > between a VF and virtio net device in the guest. > > The hypervisor needs to unplug the VF device from the guest on the source > host and reset the MAC filter of the VF to initiate failover of datapath to > virtio before starting the migration. After the migration is completed, the > destination hypervisor sets the MAC filter on the VF and plugs it back to > the guest to switch over to VF datapath. Is there a host side patch (planned) for this MAC filter switching process? As said in another thread, that simple script won't work for macvtap backend. Thanks, -Siwei > > This patch is based on the discussion initiated by Jesse on this thread. > https://marc.info/?l=linux-virtualization=151189725224231=2 > > Signed-off-by: Sridhar Samudrala > Reviewed-by: Jesse Brandeburg > --- > drivers/net/virtio_net.c | 307 > ++- > 1 file changed, 305 insertions(+), 2 deletions(-) > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > index f149a160a8c5..0e58d364fde9 100644 > --- a/drivers/net/virtio_net.c > +++ b/drivers/net/virtio_net.c > @@ -30,6 +30,8 @@ > #include > #include > #include > +#include > +#include > #include > #include > > @@ -120,6 +122,15 @@ struct receive_queue { > struct xdp_rxq_info xdp_rxq; > }; > > +struct virtnet_vf_pcpu_stats { > + u64 rx_packets; > + u64 rx_bytes; > + u64 tx_packets; > + u64 tx_bytes; > + struct u64_stats_sync syncp; > + u32 tx_dropped; > +}; > + > struct virtnet_info { > struct virtio_device *vdev; > struct virtqueue *cvq; > @@ -182,6 +193,10 @@ struct virtnet_info { > u32 speed; > > unsigned long guest_offloads; > + > + /* State to manage the associated VF interface. */ > + struct net_device __rcu *vf_netdev; > + struct virtnet_vf_pcpu_stats __percpu *vf_stats; > }; > > struct padded_vnet_hdr { > @@ -1314,16 +1329,53 @@ static int xmit_skb(struct send_queue *sq, struct > sk_buff *skb) > return virtqueue_add_outbuf(sq->vq, sq->sg, num_sg, skb, GFP_ATOMIC); > } > > +/* Send skb on the slave VF device. */ > +static int virtnet_vf_xmit(struct net_device *dev, struct net_device > *vf_netdev, > + struct sk_buff *skb) > +{ > + struct virtnet_info *vi = netdev_priv(dev); > + unsigned int len = skb->len; > + int rc; > + > + skb->dev = vf_netdev; > + skb->queue_mapping = qdisc_skb_cb(skb)->slave_dev_queue_mapping; > + > + rc = dev_queue_xmit(skb); > + if (likely(rc == NET_XMIT_SUCCESS || rc == NET_XMIT_CN)) { > + struct virtnet_vf_pcpu_stats *pcpu_stats > + = this_cpu_ptr(vi->vf_stats); > + > + u64_stats_update_begin(_stats->syncp); > + pcpu_stats->tx_packets++; > + pcpu_stats->tx_bytes += len; > + u64_stats_update_end(_stats->syncp); > + } else { > + this_cpu_inc(vi->vf_stats->tx_dropped); > + } > + > + return rc; > +} > + > static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) > { > struct virtnet_info *vi = netdev_priv(dev); > int qnum = skb_get_queue_mapping(skb); > struct send_queue *sq = >sq[qnum]; > + struct net_device *vf_netdev; > int err; > struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum); > bool kick =
Re: [PATCH v2 net-next] virtio_net: Add ethtool stats
From: Toshiaki MakitaDate: Wed, 17 Jan 2018 15:38:25 +0900 > The main purpose of this patch is adding a way of checking per-queue stats. > It's useful to debug performance problems on multiqueue environment. > > $ ethtool -S ens10 > NIC statistics: > rx_queue_0_packets: 2090408 > rx_queue_0_bytes: 3164825094 > rx_queue_1_packets: 2082531 > rx_queue_1_bytes: 3152932314 > tx_queue_0_packets: 2770841 > tx_queue_0_bytes: 4194955474 > tx_queue_1_packets: 3084697 > tx_queue_1_bytes: 4670196372 > > This change converts existing per-cpu stats structure into per-queue one. > This should not impact on performance since each queue counter is not > updated concurrently by multiple cpus. > > Performance numbers: > - Guest has 2 vcpus and 2 queues > - Guest runs netserver > - Host runs 100-flow super_netperf > > Before After Diff > UDP_STREAM 18byte86.22 87.00 +0.90% > UDP_STREAM 1472byte4055.27 4042.18 -0.32% > TCP_STREAM16956.3216890.63 -0.39% > UDP_RR 178667.11 185862.70 +4.03% > TCP_RR 128473.04 124985.81 -2.71% > > Signed-off-by: Toshiaki Makita > --- > v2: > - Removed redundant counters which can be obtained from dev_get_stats. > - Made queue counter structure different for tx and rx so they can be > easily extended separately, as some additional counters are expected > like XDP related ones and VM-Exit event. > - Added performance numbers in commitlog. Applied, thanks. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH net-next 0/2] Enable virtio to act as a master for a passthru device
Apologies I didn't notice that the discussion was mistakenly taken offline. Post it back. -Siwei On Sat, Jan 13, 2018 at 7:25 AM, Siwei Liuwrote: > On Thu, Jan 11, 2018 at 12:32 PM, Samudrala, Sridhar > wrote: >> On 1/8/2018 9:22 AM, Siwei Liu wrote: >>> >>> On Sat, Jan 6, 2018 at 2:33 AM, Samudrala, Sridhar >>> wrote: On 1/5/2018 9:07 AM, Siwei Liu wrote: > > On Thu, Jan 4, 2018 at 8:22 AM, Samudrala, Sridhar > wrote: >> >> On 1/3/2018 10:28 AM, Alexander Duyck wrote: >>> >>> On Wed, Jan 3, 2018 at 10:14 AM, Samudrala, Sridhar >>> wrote: On 1/3/2018 8:59 AM, Alexander Duyck wrote: > > On Tue, Jan 2, 2018 at 6:16 PM, Jakub Kicinski > wrote: >> >> On Tue, 2 Jan 2018 16:35:36 -0800, Sridhar Samudrala wrote: >>> >>> This patch series enables virtio to switch over to a VF datapath >>> when >>> a >>> VF >>> netdev is present with the same MAC address. It allows live >>> migration >>> of >>> a VM >>> with a direct attached VF without the need to setup a bond/team >>> between >>> a >>> VF and virtio net device in the guest. >>> >>> The hypervisor needs to unplug the VF device from the guest on the >>> source >>> host and reset the MAC filter of the VF to initiate failover of >>> datapath >>> to >>> virtio before starting the migration. After the migration is >>> completed, >>> the >>> destination hypervisor sets the MAC filter on the VF and plugs it >>> back >>> to >>> the guest to switch over to VF datapath. >>> >>> It is based on netvsc implementation and it may be possible to >>> make >>> this >>> code >>> generic and move it to a common location that can be shared by >>> netvsc >>> and virtio. >>> >>> This patch series is based on the discussion initiated by Jesse on >>> this >>> thread. >>> https://marc.info/?l=linux-virtualization=151189725224231=2 >> >> How does the notion of a device which is both a bond and a leg of a >> bond fit with Alex's recent discussions about feature propagation? >> Which propagation rules will apply to VirtIO master? Meaning of >> the >> flags on a software upper device may be different. Why muddy the >> architecture like this and not introduce a synthetic bond device? > > It doesn't really fit with the notion I had. I think there may have > been a bit of a disconnect as I have been out for the last week or > so > for the holidays. > > My thought on this was that the feature bit should be spawning a new > para-virtual bond device and that bond should have the virto and the > VF as slaves. Also I thought there was some discussion about trying > to > reuse as much of the netvsc code as possible for this so that we > could > avoid duplication of effort and have the two drivers use the same > approach. It seems like it should be pretty straight forward since > you > would have the feature bit in the case of virto, and netvsc just > does > this sort of thing by default if I am not mistaken. This patch is mostly based on netvsc implementation. The only change is avoiding the explicit dev_open() call of the VF netdev after a delay. I am assuming that the guest userspace will bring up the VF netdev and the hypervisor will update the MAC filters to switch to the right data path. We could commonize the code and make it shared between netvsc and virtio. Do we want to do this right away or later? If so, what would be a good location for these shared functions? Is it net/core/dev.c? >>> >>> No, I would think about starting a new driver file in "/drivers/net/". >>> The idea is this driver would be utilized to create a bond >>> automatically and set the appropriate registration hooks. If nothing >>> else you could probably just call it something generic like virt-bond >>> or vbond or whatever. >> >> >> We are trying to avoid creating another driver or a device. Can we >> look >> into >> consolidation of the 2 implementations(virtio & netvsc) as a later >> patch? >> Also, if we want to go with a solution that creates a bond device, do we want virtio_net/netvsc
Re: [virtio-dev] Re: [PATCH v22 2/3] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_VQ
On 01/19/2018 08:39 PM, Michael S. Tsirkin wrote: On Fri, Jan 19, 2018 at 11:44:21AM +0800, Wei Wang wrote: On 01/18/2018 12:44 AM, Michael S. Tsirkin wrote: On Wed, Jan 17, 2018 at 01:10:11PM +0800, Wei Wang wrote: + vb->start_cmd_id = cmd_id; + queue_work(vb->balloon_wq, >report_free_page_work); It seems that if a command was already queued (with a different id), this will result in new command id being sent to host twice, which will likely confuse the host. I think that case won't happen, because - the host sends a cmd id to the guest via the config, while the guest acks back the received cmd id via the virtqueue; - the guest ack back a cmd id only when a new cmd id is received from the host, that is the above check: if (cmd_id != vb->start_cmd_id) { --> the driver only queues the reporting work only when a new cmd id is received /* * Host requests to start the reporting by sending a * new cmd id. */ WRITE_ONCE(vb->report_free_page, true); vb->start_cmd_id = cmd_id; queue_work(vb->balloon_wq, >report_free_page_work); } So the same cmd id wouldn't queue the reporting work twice. Like this: vb->start_cmd_id = cmd_id; queue_work(vb->balloon_wq, >report_free_page_work); command id changes vb->start_cmd_id = cmd_id; work executes queue_work(vb->balloon_wq, >report_free_page_work); work executes again If we think about the whole working flow, I think this case couldn't happen: 1) device send cmd_id=1 to driver; 2) driver receives cmd_id=1 in the config and acks cmd_id=1 to the device via the vq; 3) device revives cmd_id=1; 4) device wants to stop the reporting by sending cmd_id=STOP; 5) driver receives cmd_id=STOP from the config, and acks cmd_id=STOP to the device via the vq; 6) device sends cmd_id=2 to driver; ... cmd_id=2 won't come after cmd_id=1, there will be a STOP cmd in between them (STOP won't queue the work). How about defining the correct device behavior in the spec: The device Should NOT send a second cmd id to the driver until a STOP cmd ack for the previous cmd id has been received from the guest. Best, Wei ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization