On Wed, 2014-04-16 at 15:08 +0800, Li, Zhen-Hua wrote: > From: "Li, Zhen-Hua" <zhen-h...@hp.com> > > As netif_running is called in netif_device_attach/detach. There should be > rtnl_lock/unlock called, to avoid dev stat change during netif_device_attach > and detach being called. > I checked NIC some drivers, some of them have netif_device_attach/detach > called between rtnl_lock/unlock, while some drivers do not. > > This patch is tring to find a generic way to fix this for all NIC drivers.
I don't think you can generically use the RTNL lock for this. > Signed-off-by: Li, Zhen-Hua <zhen-h...@hp.com> > --- > net/core/dev.c | 18 ++++++++++++++++++ > 1 file changed, 18 insertions(+) > > diff --git a/net/core/dev.c b/net/core/dev.c > index 5b3042e..795bbc5 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -2190,10 +2190,19 @@ EXPORT_SYMBOL(__dev_kfree_skb_any); > */ > void netif_device_detach(struct net_device *dev) > { > + /** > + * As netif_running is called , rtnl_lock and unlock are needed to > + * avoid __LINK_STATE_START bit changes during this function call. > + */ > + int need_unlock; > + > + need_unlock = rtnl_trylock(); It is never correct to use trylock and then continue even if it fails. I think you're trying to simulate a reentrant mutex but this will fail if *any* task already has the mutex. Furthermore it is currently allowed and useful to call these functions from atomic context (transmit or completion path) where it is not possible to hold the mutex. > if (test_and_clear_bit(__LINK_STATE_PRESENT, &dev->state) && > netif_running(dev)) { > netif_tx_stop_all_queues(dev); > } > + if (need_unlock) > + rtnl_unlock(); > } > EXPORT_SYMBOL(netif_device_detach); For netif_device_detach(), I wonder whether it is necessary to check netif_running(). What are we trying to avoid? > @@ -2205,11 +2214,20 @@ EXPORT_SYMBOL(netif_device_detach); > */ > void netif_device_attach(struct net_device *dev) > { > + /** > + * As netif_running is called , rtnl_lock and unlock are needed to > + * avoid __LINK_STATE_START bit changes during this function call. > + */ > + int need_unlock; > + > + need_unlock = rtnl_trylock(); > if (!test_and_set_bit(__LINK_STATE_PRESENT, &dev->state) && > netif_running(dev)) { > netif_tx_wake_all_queues(dev); > __netdev_watchdog_up(dev); > } > + if (need_unlock) > + rtnl_unlock(); > } > EXPORT_SYMBOL(netif_device_attach); I do see a problem if netif_device_detach() races with dev_deactivate_many(), which is being mitigated but not avoided by the test of netif_running(). I think a proper solution is going to involve changing dev_deactivate_many() as well, removing the use of netif_running(), and possible using cmpxchg() to atomically manipulate multiple bits of dev->state. Ben. -- Ben Hutchings Beware of programmers who carry screwdrivers. - Leonard Brandwein
signature.asc
Description: This is a digitally signed message part