> a) Ban the calling of flush_scheduled_work() from under rtnl_lock().
> Sounds hard.

Unfortunate if this is happening a lot.  It seems like the most sensible fix -- flush_scheduled_work() is in effect calling into an unknown set of functions that may change in the future (since it waits for them to finish), and it seems error-prone to hold a lock across such a call.

> This will almost work, as long as it's done in workqueue.c with
> appropriate locking.  The bug occurs when some other CPU is running
> phy_change() right now - we'll end up freeing data which that CPU is
> presently playing with.
>
> But perhaps we can take care of this within workqueue.c.  We need a
> cancel function which will cancel the work and, if its callback is
> presently executing it will block until that execution has completed.

I may be misunderstanding you, but this seems to deadlock in exactly the same way: if someone calls this cancel routine holding rtnl_lock, and the work function that will also take rtnl_lock has just started, the cancel routine will get stuck when the work function tries to take rtnl_lock.

 - R.
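To make the scenario above concrete, here is a rough userspace sketch of the wait-for cycle I have in mind. It is not kernel code: the task names, struct model, and every function in it are hypothetical stand-ins I made up for illustration, and "the lock" merely plays the role of rtnl_lock. It only models who waits on whom and checks whether the chain loops back to the caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: two tasks, each blocked on at most one other task. */
enum { CANCELLER = 0, WORKER = 1, NTASKS = 2, NOBODY = -1 };

struct model {
    int lock_owner;         /* holder of the rtnl_lock stand-in, or NOBODY */
    int waits_for[NTASKS];  /* task that each task is blocked on, or NOBODY */
};

/* The work callback reaches its lock acquisition: it either takes the
 * free lock or blocks on the current owner. */
static void work_takes_lock(struct model *m)
{
    if (m->lock_owner == NOBODY)
        m->lock_owner = WORKER;
    else if (m->lock_owner != WORKER)
        m->waits_for[WORKER] = m->lock_owner;
}

/* The proposed cancel routine: block until the running callback is done. */
static void cancel_and_wait(struct model *m)
{
    m->waits_for[CANCELLER] = WORKER;
}

/* Follow wait-for edges from 'start'; a path back to 'start' is a deadlock. */
static bool deadlocked(const struct model *m, int start)
{
    assert(start >= 0 && start < NTASKS);
    int t = m->waits_for[start];
    for (int steps = 0; t != NOBODY && steps < NTASKS; steps++) {
        if (t == start)
            return true;
        t = m->waits_for[t];
    }
    return false;
}

/* Does a blocking cancel deadlock, given whether the caller holds the lock? */
bool cancel_deadlocks(bool caller_holds_lock)
{
    struct model m = {
        .lock_owner = caller_holds_lock ? CANCELLER : NOBODY,
        .waits_for  = { NOBODY, NOBODY },
    };
    work_takes_lock(&m);   /* callback has just started and wants the lock */
    cancel_and_wait(&m);   /* caller now waits for that callback to finish */
    return deadlocked(&m, CANCELLER);
}
```

In this model cancel_deadlocks(true) reports a cycle (the caller waits on the worker, the worker waits on the caller's lock), while cancel_deadlocks(false) does not, which is why moving the wait into workqueue.c does not by itself seem to help callers that hold the lock.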