> a) Ban the calling of flush_scheduled_work() from under rtnl_lock(). 
 >    Sounds hard.

Unfortunate if this is happening a lot.  It seems like the most
sensible fix -- flush_scheduled_work() is in effect calling into
an unknown set of functions that may change in the future (since it
waits for them to finish), and it seems error-prone to hold a lock
across such a call.

 >    This will almost work, as long as it's done in workqueue.c with
 >    appropriate locking.  The bug occurs when some other CPU is running
 >    phy_change() right now - we'll end up freeing data which that CPU is
 >    presently playing with.
 > 
 >    But perhaps we can take care of this within workqueue.c.  We need a
 >    cancel function which will cancel the work and, if its callback is
 >    presently executing it will block until that execution has completed.

I may be misunderstanding you, but this seems to deadlock in exactly
the same way: if someone calls this cancel routine while holding
rtnl_lock, and a work function that also takes rtnl_lock has just
started, the caller waits for the work function to finish while the
work function waits for rtnl_lock, so neither makes progress.
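
To make the scenario concrete, here is a minimal userspace sketch of
it using pthreads.  Everything in it is a stand-in I made up for
illustration: fake_rtnl plays the role of rtnl_lock, work_fn() is a
hypothetical work function, and cancel_sync_would_deadlock() models a
cancel routine that waits for a running callback.  The work function
uses trylock so that the sketch reports the problem instead of
actually hanging.

```c
#include <assert.h>
#include <pthread.h>

/* Userspace stand-in for rtnl_lock; not the kernel's lock. */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;

/*
 * Hypothetical work function.  Like the one under discussion, it needs
 * the lock.  trylock is used only so the demo can report the outcome;
 * a plain lock call here would sleep forever in the bad case.
 */
static void *work_fn(void *arg)
{
	int *blocked = arg;

	if (pthread_mutex_trylock(&fake_rtnl) == 0) {
		*blocked = 0;			/* got the lock, work can run */
		pthread_mutex_unlock(&fake_rtnl);
	} else {
		*blocked = 1;			/* lock is held by the canceller */
	}
	return NULL;
}

/*
 * Models "cancel the work and wait for a running callback": the work
 * has already started, and the caller waits for it with pthread_join().
 * Returns 1 if the work function could not take the lock, i.e. the
 * case where a real blocking lock would deadlock.
 */
static int cancel_sync_would_deadlock(int caller_holds_lock)
{
	pthread_t worker;
	int blocked = 0;
	int rc;

	if (caller_holds_lock)
		pthread_mutex_lock(&fake_rtnl);

	rc = pthread_create(&worker, NULL, work_fn, &blocked);
	assert(rc == 0);
	pthread_join(worker, NULL);		/* wait for the callback */

	if (caller_holds_lock)
		pthread_mutex_unlock(&fake_rtnl);

	return blocked;
}
```

Calling cancel_sync_would_deadlock(1) models the caller holding the
lock across the wait, and cancel_sync_would_deadlock(0) models
dropping it first; only the first case leaves the work function
unable to take the lock.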

 - R.