On Fri, 14 Nov 2014, Linus Torvalds wrote:
> On Fri, Nov 14, 2014 at 1:31 PM, Dave Jones <da...@redhat.com> wrote:
> > I'm not sure how long this goes back (3.17 was fine afair) but I'm
> > seeing these several times a day lately..
>
> Plus, judging by the fact that there's a stale "leave_mm+0x210/0x210"
> (wouldn't that be the *next* function, namely do_flush_tlb_all())
> pointer on the stack, I suspect that whole range-flushing doesn't even
> trigger, and we are flushing everything.

This stale entry is not relevant here because the thing is stuck in
generic_exec_single().

> > NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [trinity-c129:25570]
> > RIP: 0010:[<ffffffff9c11e98a>] [<ffffffff9c11e98a>]
> > generic_exec_single+0xea/0x1d0
> > Call Trace:
> > [<ffffffff9c048b20>] ? leave_mm+0x210/0x210
> > [<ffffffff9c048b20>] ? leave_mm+0x210/0x210
> > [<ffffffff9c11ead6>] smp_call_function_single+0x66/0x110
> > [<ffffffff9c048b20>] ? leave_mm+0x210/0x210
> > [<ffffffff9c11f021>] smp_call_function_many+0x2f1/0x390
> > [<ffffffff9c049300>] flush_tlb_mm_range+0xe0/0x370

flush_tlb_mm_range()
.....
out:
        if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
                flush_tlb_others(mm_cpumask(mm), mm, start, end);

which calls smp_call_function_many() via native_flush_tlb_others(),
which is either inlined or not on the stack because the invocation of
smp_call_function_many() is a tail call.

So from smp_call_function_many() we end up via
smp_call_function_single() in generic_exec_single().

So the only ways to get stuck there are:

        csd_lock(csd);
and
        csd_lock_wait(csd);

The called function is flush_tlb_func() and I really can't see why
that would get stuck at all.

So this looks more like a smp function call fuckup.

I assume Dave is running that stuff on KVM. So it might be worthwhile
to look at the IPI magic there.

Thanks,

        tglx
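
To make the two possible stall points concrete, here is a minimal userspace sketch of the lock-bit handshake that csd_lock() and csd_lock_wait() rely on. It is an assumption-laden illustration, not the kernel source: struct csd_sketch, the *_bounded helpers and the max_spins parameter are hypothetical names, and the real kernel loops spin with no bound, which is why a target CPU that never handles the IPI shows up as a soft lockup on the sender.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define CSD_FLAG_LOCK 0x01u

/* Hypothetical stand-in for the kernel's per-call descriptor. */
struct csd_sketch {
    atomic_uint flags;
};

/*
 * Poll the lock bit up to max_spins times. Returns true once the bit is
 * clear, false if we gave up. The bound exists only so this sketch can
 * terminate; the kernel's wait loop has no such bound.
 */
static bool csd_wait_bounded(struct csd_sketch *csd, unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; i++) {
        unsigned int f = atomic_load_explicit(&csd->flags,
                                              memory_order_acquire);
        if (!(f & CSD_FLAG_LOCK))
            return true;
    }
    return false;
}

/* Sender side: wait for the previous user to finish, then mark busy. */
static bool csd_lock_bounded(struct csd_sketch *csd, unsigned long max_spins)
{
    if (!csd_wait_bounded(csd, max_spins))
        return false;
    atomic_fetch_or_explicit(&csd->flags, CSD_FLAG_LOCK,
                             memory_order_seq_cst);
    return true;
}

/* Target side: release the descriptor after the callback has run. */
static void csd_unlock_sketch(struct csd_sketch *csd)
{
    assert(atomic_load(&csd->flags) & CSD_FLAG_LOCK);
    atomic_fetch_and_explicit(&csd->flags, ~CSD_FLAG_LOCK,
                              memory_order_release);
}
```

In this model the sender can only make progress after the target side calls csd_unlock_sketch(). If the IPI is lost or never handled, the wait never returns true, no matter how trivial the called function is.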