From: Jacob Keller <jacob.e.kel...@intel.com>
Date: Mon, 7 Aug 2017 15:24:21 -0700
> Fix an issue with relying on netif_running() which could be true during
> when dev->open() handler is being called, even if it would exit with
> a failure. This ensures the state does not get set and removed with
> a narrow race for other callers to read it as open when infact it never
> finished opening.
>
> Signed-off-by: Jacob Keller <jacob.e.kel...@intel.com>
> ---
> I found this as a result of debugging a race condition in the i40evf
> driver, in which we assumed that netif_running() would not be true until
> after dev->open() had been called and succeeded. Unfortunately we can't
> hold the rtnl_lock() while checking netif_running() because it would
> cause a deadlock between our reset task and our ndo_open handler.
>
> I am wondering whether the proposed change is acceptable here, or
> whether some ndo_open handlers rely on __LINK_STATE_START being true
> prior to their being called?

I think this has the potential to break a bunch of drivers, but I
cannot prove this.

A lot of drivers have several pieces of state setup when they bring
the device up. And these routines are also invoked from other code
paths like suspend/resume, PCI-E error recovery, etc. and they
probably do netif_running() calls here and there.

This behavior has been this way for a very long time, so the risk
is quite high I think.
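
To make the trade-off concrete, here is a minimal userspace sketch in C.
It is a model only, not kernel code: struct model_dev, MODEL_STATE_START,
open_flag_first(), open_flag_last() and the two example handlers are all
hypothetical names invented for illustration. It assumes the current
ordering is "set the start flag, call the open handler, clear the flag on
failure", and that the proposal is "set the flag only after the handler
succeeds". The shared bring-up helper stands in for the kind of driver
routine that might check netif_running() from several code paths.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for __LINK_STATE_START; not the kernel definition. */
#define MODEL_STATE_START (1UL << 0)

struct model_dev {
    unsigned long state;
    int (*open)(struct model_dev *dev); /* stands in for ndo_open */
    bool seen_running_in_open;          /* what an observer saw during open */
    bool hw_configured;                 /* set by the shared bring-up helper */
};

/* Stand-in for netif_running(): just tests the start flag. */
static bool model_running(const struct model_dev *dev)
{
    return (dev->state & MODEL_STATE_START) != 0;
}

/* Assumed current ordering: flag set before open, cleared on failure. */
static int open_flag_first(struct model_dev *dev)
{
    assert(dev && dev->open);
    dev->state |= MODEL_STATE_START;
    int ret = dev->open(dev);
    if (ret)
        dev->state &= ~MODEL_STATE_START;
    return ret;
}

/* Assumed proposed ordering: flag set only once open has succeeded. */
static int open_flag_last(struct model_dev *dev)
{
    assert(dev && dev->open);
    int ret = dev->open(dev);
    if (!ret)
        dev->state |= MODEL_STATE_START;
    return ret;
}

/* An open handler that fails; records whether the device looked "running". */
static int failing_open(struct model_dev *dev)
{
    dev->seen_running_in_open = model_running(dev);
    return -1;
}

/* A bring-up helper shared by open, resume, error recovery and so on.
 * It skips the hardware setup when the device does not look running. */
static int bringup_if_running(struct model_dev *dev)
{
    if (!model_running(dev))
        return 0;
    dev->hw_configured = true;
    return 0;
}

/* An open handler that simply reuses the shared helper. */
static int open_via_shared_bringup(struct model_dev *dev)
{
    return bringup_if_running(dev);
}
```

With failing_open(), the flag-first ordering lets an observer see the
device as running while the handler is executing, which is the window the
patch description refers to. With open_via_shared_bringup(), the flag-last
ordering returns success but leaves hw_configured false, which is the sort
of silent breakage the reply is worried about. Whether any real driver
behaves like the second handler is not something this sketch can show.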