On Tue, Sep 04, 2018 at 10:12:13AM -0700, Linus Torvalds wrote:
> On Mon, Sep 3, 2018 at 11:39 AM Holger Hoffstätte
> <hol...@applied-asynchrony.com> wrote:
> >
> > Sep 3 20:19:38 ragnarok kernel: tlb_flush_mmu_tlbonly+0x76/0xc0
> > Sep 3 20:19:38 ragnarok kernel: tlb_table_flush.part.13+0xe/0x30
> > Sep 3 20:19:38 ragnarok kernel: tlb_flush_mmu_tlbonly+0x54/0xc0
> > ..a few hundred times..
> > Sep 3 20:19:38 ragnarok kernel: tlb_table_flush.part.13+0xe/0x30
> > Sep 3 20:19:38 ragnarok kernel: tlb_flush_mmu_tlbonly+0x54/0xc0
> > Sep 3 20:19:38 ragnarok kernel: arch_tlb_finish_mmu+0x3a/0x70
> > Sep 3 20:19:38 ragnarok kernel: tlb_finish_mmu+0x1f/0x30
>
> Yeah, so what seems to have happened is that commit db7ddef30112 ("mm:
> move tlb_table_flush to tlb_flush_mmu_free") wasn't applied to the
> stable tree (because it wasn't an obvious dependency).
>
> And without that, the backport of d86564a2f085 ("mm/tlb, x86/mm:
> Support invalidating TLB caches for RCU_TABLE_FREE") ends up with
> recursion from tlb_flush_mmu_tlbonly() calling tlb_table_flush(),
> which in turn calls tlb_table_invalidate(), which calls back to
> tlb_flush_mmu_tlbonly().
>
> So you have endless recursion - at least until you run out of stack.
> Then, if you have VMAP_STACK enabled (x86-64 without KASAN), you get a
> nice clean kernel stack overflow message like you did.
>
> Or if you have KASAN enabled and no VMAP stack, you just end up with
> random hangs and huge memory corruption as the recursion stomps all
> over your memory.
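
For anyone following along, the call cycle described above can be
modelled in a few lines of user-space C. This is only a sketch and not
the kernel code: the model_* functions, run_model() and the MAX_DEPTH
cap are hypothetical names made up for illustration, and the cap exists
only so the model terminates instead of overflowing the stack. The
flag stands in for whether tlb_table_flush() is still called from
tlb_flush_mmu_tlbonly() (the stable tree without db7ddef30112) or from
tlb_flush_mmu_free() (with it).

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_DEPTH 1000  /* artificial cap so the model terminates */

static int depth;                   /* current nesting of the tlbonly model */
static int max_depth_seen;          /* deepest nesting reached in one run */
static bool table_flush_in_tlbonly; /* true: models the missing dependency */

static void model_tlb_flush_mmu_tlbonly(void);

/* Models tlb_table_invalidate(): calls back into the tlbonly flush. */
static void model_tlb_table_invalidate(void)
{
    model_tlb_flush_mmu_tlbonly();
}

/* Models tlb_table_flush(): invalidates before freeing the tables. */
static void model_tlb_table_flush(void)
{
    model_tlb_table_invalidate();
}

static void model_tlb_flush_mmu_tlbonly(void)
{
    if (depth >= MAX_DEPTH)  /* the real stack has no such guard */
        return;
    depth++;
    if (depth > max_depth_seen)
        max_depth_seen = depth;
    if (table_flush_in_tlbonly)
        model_tlb_table_flush();  /* closes the cycle */
    depth--;
}

/* Models tlb_flush_mmu_free(): the table flush lives here once moved. */
static void model_tlb_flush_mmu_free(void)
{
    if (!table_flush_in_tlbonly)
        model_tlb_table_flush();
}

/* Runs one flush sequence and returns the deepest nesting reached. */
static int run_model(bool missing_dependency)
{
    depth = 0;
    max_depth_seen = 0;
    table_flush_in_tlbonly = missing_dependency;
    model_tlb_flush_mmu_tlbonly();
    model_tlb_flush_mmu_free();
    assert(depth == 0);  /* every nested call has unwound */
    return max_depth_seen;
}
```

In this model, run_model(true) recurses until it hits the artificial
cap, while run_model(false) never nests deeper than one level, because
the table flush is no longer reachable from inside the tlbonly path.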

Ok, I will go queue this patch up now. It was in my very long "to-apply"
queue, but I didn't catch the dependency here.

thanks,

greg k-h