Hi Tejun,

On 05/01/2014 11:09 PM, Tejun Heo wrote:
> On Thu, May 01, 2014 at 05:02:42PM -0400, Tejun Heo wrote:
>> Hello, Jiri.
>>
>> On Thu, May 01, 2014 at 10:17:44PM +0200, Jiri Kosina wrote:
>>> I agree that this expectation might really somewhat implicit and is not
>>> probably properly documented anywhere. The basic observation is "whenever
>>> kthread_should_stop() is being called, all data structures are in a
>>> consistent state and don't need any further updates in order to achieve
>>> consistency, because we can exit the loop immediately here", as
>>> kthread_should_stop() is the very last thing every freezable kernel thread
>>
>> But kthread_should_stop() doesn't necessarily imply that "we can exit
>> the loop *immediately*" at all. It just indicates that it should
>> terminate in finite amount of time. I don't think it'd be too
>
> Just a bit of addition. Please note that kthread_should_stop(), along
> with the freezer test, is actually trickier than it seems. It's very
> easy to write code which works most of the time but misses wake up
> from kill when the timing is just right (or wrong). It should be
> interlocked with set_current_state() and other related queueing data
> structure accesses. This was several years ago but when I audited
> most kthread users in kernel, especially in combination with the
> freezer test which also has similar requirement, surprising percentage
> of users (at least several tens of pct) were getting it slightly
> wrong, so kthread_should_stop() really isn't used as "we can exit
> *immediately*". It just isn't that simple.

I see the worst-case scenario. (For curious readers, it is, for example,
this kthread body:

    while (1) {
        some_paired_call(); /* invokes pre-patched code */
        if (kthread_should_stop()) { /* kgraft switches to the new code */
            its_paired_function(); /* invokes patched code (wrong) */
            break;
        }
        its_paired_function(); /* the same (wrong) */
    }
)

What to do with that now? We have come up with a couple of possibilities.
Would you consider try_to_freeze() a good state-defining function? As it
is called when a kthread expects that weird things can happen, it should
be safe to switch to the patched version there, in our opinion.

The other possibility is to patch every kthread loop (~300) and insert
kgr_task_safe() semi-manually at some proper place.

Or, if you have any other suggestions, we would appreciate them.

thanks,
-- 
js
suse labs
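
P.S. To make the worst case above concrete, here is a toy user-space model of it. It is only a sketch and not kernel or kGraft code: the names count_mismatched_pairs(), checkpoint() and struct model_thread are made up for illustration. checkpoint() stands in for whatever call the thread is switched at, and the two "paired" calls are modelled simply by recording which code version was active when each one ran.

```c
#include <stdbool.h>

/* Hypothetical model: which version of the code a thread currently runs. */
enum code_version { PRE_PATCH, PATCHED };

struct model_thread {
    enum code_version version;
    bool patch_pending;   /* a patch has been requested but not applied yet */
};

/* Stand-in for the switch point: move the thread to the patched code. */
static void checkpoint(struct model_thread *t)
{
    if (t->patch_pending) {
        t->version = PATCHED;
        t->patch_pending = false;
    }
}

/*
 * Run `iterations` loop passes and request the patch at pass `patch_at`.
 * Returns how many times the two paired calls ran in different versions.
 */
int count_mismatched_pairs(bool checkpoint_inside_pair, int iterations,
                           int patch_at)
{
    struct model_thread t = { PRE_PATCH, false };
    int mismatches = 0;

    for (int i = 0; i < iterations; i++) {
        if (i == patch_at)
            t.patch_pending = true;

        /* models some_paired_call() */
        enum code_version first = t.version;

        /* switch point sits between the paired calls (the problem case) */
        if (checkpoint_inside_pair)
            checkpoint(&t);

        /* models its_paired_function() */
        enum code_version second = t.version;
        if (first != second)
            mismatches++;

        /* switch point sits after the pair has completed */
        if (!checkpoint_inside_pair)
            checkpoint(&t);
    }
    return mismatches;
}
```

With the switch point between the paired calls, the pass in which the patch arrives runs one half of the pair in each version. With the switch point after the pair, no pass is split. That is the property we would need from whichever function ends up defining the safe state.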