* Linus Torvalds <torva...@linux-foundation.org> wrote:

> On Wed, Aug 5, 2020 at 4:03 AM Jason A. Donenfeld <ja...@zx2c4.com> wrote:
> >
> > The commit 8bb9bf242d1f ("x86/mm/64: Do not sync vmalloc/ioremap
> > mappings") causes the OOPS below, in Linus' tree and in linux-next,
> > unearthed by my CI on <https://www.wireguard.com/build-status/>.
> > Bisecting reveals 8bb9bf242d1f, and reverting this makes the OOPS go
> > away.
>
> The oops happens early in the function, and the "Code:" line actually
> gets almost the whole function prologue in it (missing first two bytes
> are probably "push %rbp"):
>
>    0:   41 56                   push   %r14
>    2:   41 55                   push   %r13
>    4:   41 54                   push   %r12
>    6:   55                      push   %rbp
>    7:   48 89 f5                mov    %rsi,%rbp
>    a:   53                      push   %rbx
>    b:   48 89 fb                mov    %rdi,%rbx
>    e:   48 83 ec 08             sub    $0x8,%rsp
>   12:   48 8b 06                mov    (%rsi),%rax
>   15:   4c 8b 67 40             mov    0x40(%rdi),%r12
>   19:   49 89 c6                mov    %rax,%r14
>   1c:   45 30 f6                xor    %r14b,%r14b
>   1f:   a8 04                   test   $0x4,%al
>   21:   b8 00 00 00 00          mov    $0x0,%eax
>   26:   4c 0f 44 f0             cmove  %rax,%r14
>   2a:*  49 8b 46 08             mov    0x8(%r14),%rax   <-- trapping instruction
>
> > BUG: unable to handle page fault for address: ffffe8ffffd00608
> > #PF: supervisor read access in kernel mode
> > #PF: error_code(0x0000) - not-present page
> > PGD 0 P4D 0
>
> Yeah, missing page table because it wasn't copied.
>
> Presumably because that kthread is using the active_mm of some random
> user space process that didn't get sync'ed.
>
> And the sync_global_pgds() may have ended up being sufficient
> synchronization with whoever allocated things, even if it wasn't about
> the TLB contents themselves.
>
> So apparently the "the page-table pages are all pre-allocated now" is
> simply not true. Joerg?
>
> Unless somebody can figure this out fairly quickly, I think it should
> just be reverted.

Agreed. Joerg?

Thanks,

        Ingo