Andrew,
> I don't get it. If some code does
>
> rtnl_lock();
> flush_scheduled_work();
>
> and there's some work scheduled which does rtnl_lock() then it'll deadlock.
>
> But it'll deadlock whether or not the caller of flush_scheduled_work() is
> keventd.
>
> Calling flush_scheduled_work() under locks is generally a bad idea.
Indeed -- this is why I avoid it.
> I'd have thought that was still deadlockable. Could you describe the
> deadlock more completely please?
The simplest sequence of calls that prevents races here is as follows:
unregister_netdev();
rtnl_lock();
unregister_netdevice();
dev_close();
sbmac_close();
phy_stop();
phy_disconnect();
phy_stop_interrupts();
phy_disable_interrupts();
flush_scheduled_work();
free_irq();
phy_detach();
mdiobus_unregister();
rtnl_unlock();
We want to call flush_scheduled_work() from phy_stop_interrupts(), because
there may still be calls to phy_change() waiting for the keventd to
process and mdiobus_unregister() frees structures needed by phy_change().
This may deadlock because of the call to rtnl_lock() though.
So the modified sequence I have implemented is as follows:
unregister_netdev();
rtnl_lock();
unregister_netdevice();
dev_close();
sbmac_close();
phy_stop();
schedule_work(); [sbmac_phy_disconnect()]
rtnl_unlock();
and then:
sbmac_phy_disconnect();
phy_disconnect();
phy_stop_interrupts();
phy_disable_interrupts();
free_irq();
phy_detach();
mdiobus_unregister();
This guarantees calls to phy_change() for this PHY unit will not be run
after mdiobus_unregister(), because any such calls have been placed in the
queue before sbmac_phy_disconnect() (phy_stop() prevents the interrupt
handler from issuing new calls to phy_change()).
We still need flush_scheduled_work() to be called from
phy_stop_interrupts() though, in case some other MAC driver calls
phy_disconnect() (or phy_stop_interrupts(), depending on the abstraction
layer of phylib used) directly rather than using keventd. This is
possible if phy_disconnect() is called from the driver's module_exit()
call, which may make sense for devices that are known not to have their
MII interface available as an external connector. Hence the:
if (!current_is_keventd())
flush_scheduled_work();
sequence in phy_stop_interrupts(). Of course, we can force all drivers
using phylib (in the interrupt mode) to call phy_disconnect() through
keventd instead.
Does it sound clearer?
Maciej
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html