On 10/03, Linus Torvalds wrote: > > On Fri, Oct 3, 2014 at 4:26 PM, Oleg Nesterov <o...@redhat.com> wrote: > > > > And I _think_ that preempt_schedule_context() should be fixed anyway, > > although I am not sure there is no something else. It does: > > > > > > preempt_disable_notrace(); > > prev_ctx = exception_enter(); > > preempt_enable_no_resched_notrace(); > > > > preempt_schedule(); > > > > preempt_disable_notrace(); > > exception_exit(prev_ctx); > > preempt_enable_notrace(); > > > > but exception_exit() is heavy, it is quite possible that TIF_NEED_RESCHED > > and thus set_preempt_need_resched() can be set again when we call > > preempt_enable_notrace(). And in this case preempt_schedule_context() > > will be called recursively. > > Why the hell is it using "preempt_enable_notrace()" in the first > place? Shouldn't it use "preempt_enable_no_resched_notrace()",
Yes, this this is the main problem. > > Frederic, how about the patch below? > > Why do it multiple times? The whole concept is fundamentally racy > anyway, in it doesn't guarantee that any *new* "need_resched()" would > be reacted to (since they could happen *after* the test), But in this case we rely on scheduler_ipi() and return-from-irq path? > The real fix would appear to be to use > "preempt_enable_no_resched_notrace()", which your patch did, but > without the loop. Not sure... preempt_schedule() does the same and afaics for good reason. But perhaps you are right. I am already sleeping, will try to recheck tomorrow. And in fact I got lost in preempt.h files... I can't even understand why __preempt_schedule_context() is only called by preempt_enable_notrace(). Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/