On Thu, 7 Jun 2018 00:30:21 +0300
"Michael S. Tsirkin" <m...@redhat.com> wrote:

> On Wed, Jun 06, 2018 at 02:16:20PM -0700, Stephen Hemminger wrote:
> > On Tue, 5 Jun 2018 23:11:37 -0700
> > "Samudrala, Sridhar" <sridhar.samudr...@intel.com> wrote:
> >   
> > > On 6/5/2018 11:00 PM, Stephen Hemminger wrote:  
> > > > On Tue, 5 Jun 2018 22:39:12 -0700
> > > > "Samudrala, Sridhar" <sridhar.samudr...@intel.com> wrote:
> > > >    
> > > >> On 6/5/2018 8:51 PM, Stephen Hemminger wrote:    
> > > >>> On Tue, 5 Jun 2018 16:52:22 -0700
> > > >>> "Samudrala, Sridhar" <sridhar.samudr...@intel.com> wrote:
> > > >>>       
> > > >>>> On 6/5/2018 2:52 PM, Stephen Hemminger wrote:    
> > > >>>>> On Tue, 5 Jun 2018 22:38:43 +0300
> > > >>>>> "Michael S. Tsirkin" <m...@redhat.com> wrote:
> > > >>>>>          
> > > >>>>>>> See:
> > > >>>>>>>       https://patchwork.ozlabs.org/patch/851711/    
> > > >>>>>> Let me try to summarize that:
> > > >>>>>>
> > > >>>>>>    You wanted to speed up the delayed link up.  You had an idea to
> > > >>>>>>    additionally take link up when userspace renames the interface 
> > > >>>>>> (standby
> > > >>>>>>    one which is also the failover for netvsc).
> > > >>>>>>
> > > >>>>>>    But userspace might not do any renames, in which case there will
> > > >>>>>>    still be the delay, and so this never got applied.
> > > >>>>>>
> > > >>>>>>    Is this a good summary?
> > > >>>>>>
> > > >>>>>> Davem said delay should go away completely as it's not robust, and 
> > > >>>>>> I
> > > >>>>>> think I agree.  So I don't think we should make all failover users 
> > > >>>>>> use
> > > >>>>>> delay. IIUC failover kept a delay option especially for netvsc to
> > > >>>>>> minimize the surprise factor. Hopefully we can come up with
> > > >>>>>> something more robust and drop that option completely.    
> > > >>>>> The timeout was the original solution to how to complete setup after
> > > >>>>> userspace has had a chance to rename the device. Unfortunately, the 
> > > >>>>> whole network
> > > >>>>> device initialization (cooperation with udev and userspace) is a a 
> > > >>>>> mess because
> > > >>>>> there is no well defined specification, and there are multiple ways 
> > > >>>>> userspace
> > > >>>>> does this in old and new distributions.  The timeout has its own 
> > > >>>>> issues
> > > >>>>> (how long, handling errors during that window, what if userspace 
> > > >>>>> modifies other
> > > >>>>> device state); and open to finding a better solution.
> > > >>>>>
> > > >>>>> My point was that if name change can not be relied on (or used) by 
> > > >>>>> netvsc,
> > > >>>>> then we can't allow it for net_failover either.    
> > > >>>> I think the push back was with the usage of the delay, not bringing 
> > > >>>> up the primary/standby
> > > >>>> device in the name change event handler.
> > > >>>> Can't netvsc use this mechanism instead of depending on the delay?
> > > >>>>
> > > >>>>       
> > > >>> The patch that was rejected for netvsc was about using name change.
> > > >>> Also, you can't depend on name change; you still need a timer. Not 
> > > >>> all distributions
> > > >>> change name of devices. Or user has blocked that by udev rules.    
> > > >> In the net_failover_slave_register() we do a dev_open() and ignore any 
> > > >> failure due to
> > > >> EBUSY and do another dev_open() in the name change event handler.
> > > >> If the name is not expected to change, i would think the dev_open() at 
> > > >> the time of
> > > >> register will succeed.    
> > > > The problem is your first dev_open will bring device up and lockout
> > > > udev from changing the network device name.    
> > > 
> > > I have tried with/without udev and didn't see any issue with the naming 
> > > of the primary/standby
> > > devices in my testing.
> > > 
> > > With the 3-netdev failover model, we are only interested in setting the 
> > > right name for the failover
> > > netdev and that is the reason we do SET_NETDEV_DEV on that netdev. Does 
> > > it really matter if udev fails
> > > to rename the lower primary/standby netdevs, i don't think it will 
> > > matter? The user is not expected
> > > to touch the lower netdevs.  
> > 
> > Renaming matters to some users and the udev maintainers. They want the VF 
> > to be named enpXXX
> > The primary in virtio case needs to be ens3 with some cloud platforms.  
> 
> Confused. VF can't be a standby, of it's used in a failover it's a
> primary, you can't call it both enpXXX amd ens3. Could you describe the
> use case in a bit more detail?

Sorry, got things backwards.  The primary is VF and it should be possible
to have it renamed based on PCI info.
The standby PV (in 3 dev model) would get renamed by udev to ens3.
So the failover device would just be ethX right?

> > 
> > I think you need to get rid of triggering off of the name change.  
> 
> Worth considering down the road since it's a bit of a hack but it's one
> we won't have trouble supporting, unlike the delayed uplink.

You can't depend on userspace doing rename.

> 
> > Long term, let's open the discussion about allowing network devices to 
> > change name when up?
> > Maybe with NETIF_LIVENAME_CHANGE flag?  
> 
> That's probably the cleanest approach assuming it can be made to work
> without races. I suspect we can just live with what we have until then.
> 
> 

Reply via email to