On 10/03, Sasha Levin wrote:
>
> On 09/24/2014 11:02 AM, tip-bot for Oleg Nesterov wrote:
> > Commit-ID:  0ad6e3c5199be12c9745da8f8b9e3c9f8066c235
> > Gitweb:     http://git.kernel.org/tip/0ad6e3c5199be12c9745da8f8b9e3c9f8066c235
> > Author:     Oleg Nesterov <o...@redhat.com>
> > AuthorDate: Sun, 21 Sep 2014 20:41:53 +0200
> > Committer:  Ingo Molnar <mi...@kernel.org>
> > CommitDate: Wed, 24 Sep 2014 15:15:38 +0200
> >
> > x86: Speed up ___preempt_schedule*() by using THUNK helpers
> >
> > ___preempt_schedule() does SAVE_ALL/RESTORE_ALL but this is
> > suboptimal, we do not need to save/restore the callee-saved
> > register. And we already have arch/x86/lib/thunk_*.S which
> > implements the similar asm wrappers, so it makes sense to
> > redefine ___preempt_schedule() as "THUNK ..." and remove
> > preempt.S altogether.
> >
> > Signed-off-by: Oleg Nesterov <o...@redhat.com>
> > Reviewed-by: Andy Lutomirski <l...@amacapital.net>
> > Cc: Denys Vlasenko <dvlas...@redhat.com>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Linus Torvalds <torva...@linux-foundation.org>
> > Link: http://lkml.kernel.org/r/20140921184153.ga23...@redhat.com
> > Signed-off-by: Ingo Molnar <mi...@kernel.org>
> > ---
>
> Hi Oleg,
>
> I *think* that this patch is causing the following trace
> (arch/x86/lib/thunk_64.S:44 is new code introduced by this patch):

So far I still do not think (at least I do not understand how) this patch
could introduce the problem. I can be wrong of course...

Let's look at this trace again,

> [ 921.908530] kernel BUG at kernel/sched/core.c:2702!

OK, let's assume this is BUG_ON(unlikely(task_stack_end_corrupted(prev)))
in schedule_debug().

> [ 921.909159] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> [ 921.910084] Dumping ftrace buffer:
> [ 921.910626] (ftrace buffer empty)
> [ 921.911178] Modules linked in:
> [ 921.915690] CPU: 18 PID: 9489 Comm: trinity-c195 Not tainted 3.17.0-rc7-next-20141002-sasha-00031-gbdb4244 #1273
> [ 921.917016] task: ffff8802bd748000 ti: ffff8802bda3c000 task.ti: ffff8802bda3c000
> [ 921.917752] RIP: __schedule (kernel/sched/core.c:2702 kernel/sched/core.c:2808)
> [ 921.917752] RSP: 0018:ffff8802bda3c360 EFLAGS: 00010297
> [ 921.917752] RAX: ffff8802bda3c000 RBX: ffff8808501e2a00 RCX: 0000000000000001
> [ 921.917752] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000286
> [ 921.917752] RBP: ffff8802bda3c3c0 R08: 000000000001aa50 R09: 0000000000000000
> [ 921.917752] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000012
> [ 921.917752] R13: ffff8808501e2a00 R14: 0000000000000002 R15: ffff8802bda3c428
> [ 921.917752] FS: 00007f5475cc2700(0000) GS:ffff880850000000(0000) knlGS:0000000000000000
> [ 921.917752] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 921.917752] CR2: 00007f5475abe60c CR3: 00000002bebab000 CR4: 00000000000006a0
> [ 921.917752] DR0: 00000000006f0000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 921.917752] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
> [ 921.917752] Stack:
> [ 921.917752] 000000000001aa50 ffff8802bd748000 ffff8802bda3ffd8 00000000001e2a00
> [ 921.917752] 00000000001e2a00 ffff8802bd748000 ffff8802bda3c3a0 00000000001e2a00
> [ 921.917752] ffff8802bd748000 000000000001a9ea 0000000000000002 ffff8802bda3c428
> [ 921.917752] Call Trace:
> [ 921.917752] schedule_user (kernel/sched/core.c:2894 include/linux/jump_label.h:114 include/linux/context_tracking_state.h:27 include/linux/context_tracking.h:20 kernel/sched/core.c:2909)
> [ 921.917752] int_careful (arch/x86/kernel/entry_64.S:560)
> [ 921.917752] ? retint_careful (arch/x86/kernel/entry_64.S:889)
> [ 921.917752] ? preempt_schedule (./arch/x86/include/asm/preempt.h:80 (discriminator 1) kernel/sched/core.c:2943 (discriminator 1))

...

> [ 921.917752] ? ___preempt_schedule_context (arch/x86/lib/thunk_64.S:44)
> [ 921.917752] ? preempt_schedule_context (kernel/context_tracking.c:145)
> [ 921.917752] ? ___preempt_schedule_context (arch/x86/lib/thunk_64.S:44)
> [ 921.917752] ? preempt_schedule_context (kernel/context_tracking.c:145)
> [ 921.917752] ? ___preempt_schedule_context (arch/x86/lib/thunk_64.S:44)
> [ 921.917752] ? preempt_schedule_context (kernel/context_tracking.c:145)
> [ 921.917752] ? ___preempt_schedule_context (arch/x86/lib/thunk_64.S:44)
> [ 921.917752] ? preempt_schedule_context (kernel/context_tracking.c:145)

...

A LOT of repeats of the above, so we can run out of stack, and in this
case it is clear why task_stack_end_corrupted() triggers.

> [ 921.917752] ? __schedule (kernel/sched/core.c:2900)
> [ 921.917752] ? ___preempt_schedule_context (arch/x86/lib/thunk_64.S:44)
> [ 921.917752] ? ftrace_ops_control_func (kernel/trace/ftrace.c:4780)
> [ 921.917752] ? ftrace_call (arch/x86/kernel/mcount_64.S:56)
> [ 921.917752] ? retint_careful (arch/x86/kernel/entry_64.S:886)
> [ 921.917752] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
> [ 921.917752] ? schedule_user (kernel/sched/core.c:2900)
> [ 921.917752] ? schedule_user (kernel/sched/core.c:2900)
> [ 921.917752] ? retint_careful (arch/x86/kernel/entry_64.S:889)

And I _think_ that preempt_schedule_context() should be fixed anyway,
although I am not sure there is nothing else.

It does:

	preempt_disable_notrace();
	prev_ctx = exception_enter();
	preempt_enable_no_resched_notrace();

	preempt_schedule();

	preempt_disable_notrace();
	exception_exit(prev_ctx);
	preempt_enable_notrace();

but exception_exit() is heavy, so it is quite possible that
TIF_NEED_RESCHED is set again, and thus set_preempt_need_resched() is
called, before we call preempt_enable_notrace(). And in this case
preempt_schedule_context() will be called recursively.

Frederic, how about the patch below?

In _theory_ this can explain this OOPS unless I am totally confused.

Oleg.

--- x/kernel/context_tracking.c
+++ x/kernel/context_tracking.c
@@ -134,15 +134,17 @@ asmlinkage __visible void __sched notrac
 	 * and the tracer calls preempt_enable_notrace() causing
 	 * an infinite recursion.
 	 */
-	preempt_disable_notrace();
-	prev_ctx = exception_enter();
-	preempt_enable_no_resched_notrace();
-
-	preempt_schedule();
-
-	preempt_disable_notrace();
-	exception_exit(prev_ctx);
-	preempt_enable_notrace();
+	do {
+		preempt_disable_notrace();
+		prev_ctx = exception_enter();
+		preempt_enable_no_resched_notrace();
+
+		preempt_schedule();
+
+		preempt_disable_notrace();
+		exception_exit(prev_ctx);
+		preempt_enable_no_resched_notrace();
+	} while (need_resched());
 }
 EXPORT_SYMBOL_GPL(preempt_schedule_context);
 #endif /* CONFIG_PREEMPT */
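
To illustrate the recursion argument, here is a toy user-space model. It is
only a sketch and not kernel code: every toy_* name, the counters and
run_model() are made up for the illustration, and the "reschedule requested
again during exception_exit()" behaviour is an assumption taken from the
reasoning above. It compares how deep the call chain gets when the final
preempt_enable re-enters the function versus when the function loops on the
need-resched condition.

```c
#include <assert.h>

/* Toy state, all hypothetical: not the kernel's data structures. */
static int pending_resched;	/* how many more times a resched gets requested */
static int need_resched_flag;	/* stands in for TIF_NEED_RESCHED */
static int depth, max_depth;	/* current and deepest nesting seen */

/* Models a slow exception_exit() during which a resched is requested again. */
static void toy_exception_exit(void)
{
	if (pending_resched > 0) {
		pending_resched--;
		need_resched_flag = 1;
	}
}

/* Models preempt_schedule(): the pending request is consumed. */
static void toy_schedule(void)
{
	need_resched_flag = 0;
}

/* Old shape: the trailing preempt_enable re-enters the function. */
static void toy_ctx_recursive(void)
{
	if (++depth > max_depth)
		max_depth = depth;
	toy_schedule();
	toy_exception_exit();
	if (need_resched_flag)	/* models preempt_enable_notrace() */
		toy_ctx_recursive();
	depth--;
}

/* Patched shape: loop while a resched is still pending. */
static void toy_ctx_loop(void)
{
	if (++depth > max_depth)
		max_depth = depth;
	do {
		toy_schedule();
		toy_exception_exit();
	} while (need_resched_flag);
	depth--;
}

/* Runs one variant and returns the deepest nesting it reached. */
int run_model(int resets, int use_loop)
{
	assert(resets >= 0);
	pending_resched = resets;
	need_resched_flag = 1;
	depth = 0;
	max_depth = 0;
	if (use_loop)
		toy_ctx_loop();
	else
		toy_ctx_recursive();
	return max_depth;
}
```

With 1000 back-to-back requests, run_model(1000, 0) reaches a nesting depth
of 1001 while run_model(1000, 1) stays at depth 1. In the model this is the
difference between stack usage that grows with every new request and stack
usage that stays constant.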