Re: [summary] virtio network device failover writeup

2019-03-19 Thread Michael S. Tsirkin
On Tue, Mar 19, 2019 at 08:46:47AM -0700, Stephen Hemminger wrote:
> On Tue, 19 Mar 2019 14:38:06 +0200
> Liran Alon  wrote:
> 
> > b.3) cloud-init: If configured to perform network configuration, it
> > attempts to configure all available netdevs. However, it should avoid
> > doing so on net-failover slaves.
> > (Microsoft has handled this by adding a mechanism in cloud-init to
> > blacklist a netdev from being configured if it is owned by a specific
> > PCI driver. Specifically, they blacklist the Mellanox VF driver. However,
> > this technique doesn’t work for the net-failover mechanism because both
> > the net-failover netdev and the virtio-net netdev are owned by the
> > virtio-net PCI driver.)
> 
> Cloud-init should really just ignore all devices that have a master device.
> That would have been more general, and safer for other use cases.

Given that lots of userspace doesn't do this, I wonder whether it would be
safer to just somehow pretend to userspace that the slave links are down,
and add a special attribute for the actual link state.
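
A purely illustrative sketch of what userspace would then see, assuming
sysfs paths; "real_operstate" is an invented name for the proposed special
attribute and does not exist in current kernels:

def link_state(netdev: str) -> str:
    # Existing attribute; under this proposal, failover slaves would
    # always report "down" here.
    with open(f"/sys/class/net/{netdev}/operstate") as f:
        return f.read().strip()

def actual_link_state(netdev: str) -> str:
    # "real_operstate" is the hypothetical special attribute carrying
    # the true link state of the slave.
    with open(f"/sys/class/net/{netdev}/real_operstate") as f:
        return f.read().strip()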

-- 
MST

Re: [summary] virtio network device failover writeup

2019-03-19 Thread Michael S. Tsirkin
On Tue, Mar 19, 2019 at 02:38:06PM +0200, Liran Alon wrote:
> Hi Michael,
> 
> Great blog post which summarises everything very well!
> 
> Some comments I have:

Thanks!
I'll try to update everything in the post when I'm not so jet-lagged.

> 1) I think that when we use the term “1-netdev model” in community
> discussions, we tend to refer to what you have defined in the blog post as
> the “3-device model with hidden slaves”.
> Therefore, I would suggest just removing the “1-netdev model” section and
> renaming the “3-device model with hidden slaves” section to “1-netdev model”.
> 
> 2) The userspace issues result from using both the “2-netdev model” and the
> “3-netdev model”. However, the blog post describes them as if they only
> exist in the “3-netdev model”.
> The reason these issues are not seen in the Azure environment is that they
> were partially handled by Microsoft for their specific 2-netdev model,
> which leads me to the next comment.
> 
> 3) I suggest that the blog post also elaborate on what exactly the
> userspace issues are that arise in models other than the “1-netdev model”.
> The issues I’m aware of are (please tell me if you are aware of others!):
> (a) udev rename race-condition: When a net-failover device is opened, it
> also opens its slaves. However, the order of KOBJ_ADD events to udev is
> first the net-failover netdev and only then the virtio-net netdev. This
> means that if userspace responds to the first event by opening the
> net-failover device, any attempt by userspace to rename the virtio-net
> netdev in response to the second event will fail, because the virtio-net
> netdev is already open.
> Also note that this udev rename rule is useful because we would like to add
> rules that rename the virtio-net netdev to clearly signal that it’s used as
> the standby interface of another net-failover netdev.
> Microsoft worked around this problem in NetVSC by delaying the open of the
> slave VF relative to the open of the NetVSC netdev. However, this is still
> racy and thus a hacky solution. It was accepted by the community only
> because it’s internal to the NetVSC driver; a similar solution was rejected
> by the community for the net-failover driver.
> The solution we currently proposed to address this (patch by Si-Wei) is to
> change the kernel’s rename handling to allow a net-failover slave to be
> renamed even if it is already open. The patch is still not accepted.
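
A minimal sketch of the failing operation, assuming an interface named
eth1 that is already up (SIOCSIFNAME is taken from <linux/sockios.h>):

import fcntl
import socket
import struct

SIOCSIFNAME = 0x8923  # ioctl to rename a network interface

def rename_netdev(old: str, new: str) -> None:
    # struct ifreq: 16 bytes of current name, 16 bytes of new name.
    ifr = struct.pack("16s16s", old.encode(), new.encode())
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        fcntl.ioctl(s.fileno(), SIOCSIFNAME, ifr)

# rename_netdev("eth1", "standby0") raises OSError(EBUSY) while eth1 is
# up, which is what udev hits once the failover master has already opened
# its slave.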
> (b) Issues caused by various userspace components attempting DHCP on the
> net-failover slaves: DHCP should of course only be done on the net-failover
> netdev. Attempting DHCP on the net-failover slaves as well will cause
> networking issues. Therefore, userspace components should be taught to
> avoid doing DHCP on the net-failover slaves (see the sketch after this
> list). The various userspace components include:
> b.1) dhclient: If run without parameters, by default it just enumerates
> all netdevs and attempts to DHCP them all.
> (I don’t think Microsoft has handled this.)
> b.2) initramfs / dracut: In order to mount the root file-system from iSCSI,
> these components need networking and therefore DHCP on all netdevs.
> (Microsoft hasn’t handled (b.2) because they don’t have images which
> perform iSCSI boot in their Azure setup. Still an open issue.)
> b.3) cloud-init: If configured to perform network configuration, it
> attempts to configure all available netdevs. However, it should avoid
> doing so on net-failover slaves.
> (Microsoft has handled this by adding a mechanism in cloud-init to
> blacklist a netdev from being configured if it is owned by a specific
> PCI driver. Specifically, they blacklist the Mellanox VF driver. However,
> this technique doesn’t work for the net-failover mechanism because both
> the net-failover netdev and the virtio-net netdev are owned by the
> virtio-net PCI driver.)
> b.4) The network managers of various distros need to be updated to avoid
> DHCP on net-failover slaves? (Not sure. Asking...)
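
A minimal sketch of the shared check such components could use, along the
lines of Stephen's suggestion elsewhere in the thread: skip any netdev
that is enslaved to another device. This assumes the slave is visible in
sysfs through a "master" symlink (as bonding slaves are) or an "upper_*"
link:

import os

SYSFS_NET = "/sys/class/net"

def is_slave(dev: str) -> bool:
    # True if dev has a master/upper device and should not be configured.
    path = os.path.join(SYSFS_NET, dev)
    if os.path.islink(os.path.join(path, "master")):
        return True
    return any(n.startswith("upper_") for n in os.listdir(path))

def configurable_netdevs():
    # Netdevs that a DHCP client or cloud-init could safely touch.
    for dev in os.listdir(SYSFS_NET):
        if dev != "lo" and not is_slave(dev):
            yield dev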
> 
> 4) Another interesting use-case where the net-failover mechanism is useful
> is handling NIC firmware failures or NIC firmware live-upgrade.
> In both cases, there is a need to perform a full PCIe reset of the NIC,
> which loses all the NIC eSwitch configuration of the various VFs.

In this setup, how does the VF keep going? If it doesn't keep going, why is
it helpful?

> To handle these cases gracefully, one could just hot-unplug all VFs from
> the guests running on the host (which will make all guests use the
> virtio-net netdev, which is backed by a netdev that eventually sits on top
> of the PF). Networking will therefore be restored to the guests once the
> PCIe reset is completed and the PF is functional again. To re-accelerate
> the guests’ network, the hypervisor can just hot-plug new VFs back into
> the guests (see the sketch below).
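
A sketch of that unplug/replug cycle driven over QEMU's QMP socket; the
socket path, device id and VF host address below are illustrative
assumptions:

import json
import socket

def qmp_cmd(sock, execute, arguments=None):
    msg = {"execute": execute}
    if arguments:
        msg["arguments"] = arguments
    sock.sendall(json.dumps(msg).encode() + b"\n")
    return json.loads(sock.recv(65536))

with socket.socket(socket.AF_UNIX) as s:
    s.connect("/var/run/qemu-guest1.qmp")  # assumed QMP socket path
    s.recv(65536)                          # consume the QMP greeting
    qmp_cmd(s, "qmp_capabilities")
    # Before the PCIe reset: detach the VF; the guest's traffic fails
    # over to the virtio-net standby netdev.
    qmp_cmd(s, "device_del", {"id": "vf0"})
    # ... host performs the NIC firmware upgrade / full PCIe reset ...
    # Afterwards: hot-plug a fresh VF to re-accelerate the guest.
    qmp_cmd(s, "device_add",
            {"driver": "vfio-pci", "host": "0000:3b:00.2", "id": "vf0"})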
> 
> P.S:
> I would very much appreciate this forum’s help in closing on the pending
> items written in (3), which currently prevent using this net-failover
> mechanism in real production use-cases.
> 

Re: [summary] virtio network device failover writeup

2019-03-19 Thread Stephen Hemminger
On Tue, 19 Mar 2019 14:38:06 +0200
Liran Alon  wrote:

> b.3) cloud-init: If configured to perform network configuration, it
> attempts to configure all available netdevs. However, it should avoid
> doing so on net-failover slaves.
> (Microsoft has handled this by adding a mechanism in cloud-init to
> blacklist a netdev from being configured if it is owned by a specific
> PCI driver. Specifically, they blacklist the Mellanox VF driver. However,
> this technique doesn’t work for the net-failover mechanism because both
> the net-failover netdev and the virtio-net netdev are owned by the
> virtio-net PCI driver.)

Cloud-init should really just ignore all devices that have a master device.
That would have been more general, and safer for other use cases.

[PATCH] virtio_console: initialize vtermno value for ports

2019-03-19 Thread Pankaj Gupta
For regular serial ports we do not initialize the value of the vtermno
variable, so a garbage value is assigned for non-console ports. The value
can be observed as a random integer with [1].

[1] vim /sys/kernel/debug/virtio-ports/vport*p*

This patch initializes the value of vtermno for console serial ports
to '1'; regular serial ports are initialized to '0'.

Reported-by: si...@redhat.com
Signed-off-by: Pankaj Gupta 
---
 drivers/char/virtio_console.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index fbeb71953526..05dbfdb9f4af 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -75,7 +75,7 @@ struct ports_driver_data {
/* All the console devices handled by this driver */
struct list_head consoles;
 };
-static struct ports_driver_data pdrvdata;
+static struct ports_driver_data pdrvdata = { .next_vtermno = 1};
 
 static DEFINE_SPINLOCK(pdrvdata_lock);
 static DECLARE_COMPLETION(early_console_added);
@@ -1394,6 +1394,7 @@ static int add_port(struct ports_device *portdev, u32 id)
port->async_queue = NULL;
 
port->cons.ws.ws_row = port->cons.ws.ws_col = 0;
+   port->cons.vtermno = 0;
 
port->host_connected = port->guest_connected = false;
port->stats = (struct port_stats) { 0 };
-- 
2.20.1
