On Thu, Dec 14, 2017 at 10:42 AM, Linus Torvalds
<torva...@linux-foundation.org> wrote:
> On Thu, Dec 14, 2017 at 9:12 AM, Thomas Gleixner <t...@linutronix.de> wrote:
>> On Sun, 3 Dec 2017, syzbot wrote:
>>> BUG: unable to handle kernel paging request at fffffffffffffff8
>>> Oops: 0002 [#1] SMP KASAN
>
> System write of a non-existent page.
>
>>> RIP: 0010:switch_fpu_prepare arch/x86/include/asm/fpu/internal.h:535
>>> [inline]
>>> RIP: 0010:__switch_to+0x95b/0x1330 arch/x86/kernel/process_64.c:407
>
> This says it's
>
>         old_fpu->last_cpu = cpu;
>
> and the code disassembly ends up looking something like this:
>
>    0:   48 c1 ea 03             shr    $0x3,%rdx
>    4:   0f b6 04 02             movzbl (%rdx,%rax,1),%eax
>    8:   84 c0                   test   %al,%al
>    a:   74 08                   je     0x14
>    c:   3c 03                   cmp    $0x3,%al
>    e:   0f 8e d5 06 00 00       jle    0x6e9
>   14:   8b 85 70 fe ff ff       mov    -0x190(%rbp),%eax
>   1a:   41 89 84 24 c0 15 00    mov    %eax,0x15c0(%r12)
>   21:   00
>   22:*  cc                      int3       <-- trapping instruction
>
> where that preceding two "mov" instructions look like it might indeed be that
>
>         old_fpu->last_cpu = cpu;
>
> thing, and the register state doesn't look insane for this.
>
> So I think the RIP->line encoding is slightly off, and that "int3" is
> almost certainly due to the very next thing after the write:
>
>         trace_x86_fpu_regs_deactivated(old_fpu);
>
> and that actually makes sense if the test robot is doing some tracing,
> particularly if it's just about to _start_ tracing, and it has
> replaced the first byte of the instruction with 'int3' and is in the
> process of doing the rewrite.
>
> The fact that it then takes a system write fault is because some GDT
> or IDT setup is screwed up. Or possibly the stack is screwed up and
> started out as 0, and then the push to the stack would decrement the
> stack pointer and try to push the error state or something.
>
>> That's the second report I'm staring at today which has CR2
>> fffffffffffffffx and points to a faulting instruction which does not make
>> any sense at all.
>
> That actually does make sense - see above. It just requires that race
> with the instruction rewriting.
>
> *Normally* we never actually take the "int3" exception, because
> normally we'll have completed the rewrite before another CPU actually
> executes the instruction that is being rewritten.
>
> So I'm assuming this is with the page table isolation, and some
> unusual case in exception handling got screwed up.

SDM time.  Assuming the CPU actually decoded int3 and tried to execute
it, I can see a couple possible outcomes:

1. Something's wrong with the IDT and it can't read the vector.  I
think this would end up triple-faulting, though.

2. It actually tries to handle the breakpoint.  A breakpoint is a
benign exception, so any exception encountered while delivering it
would result in serial delivery.  I've never thought that serial
delivery made any sense -- presumably it just cancels the breakpoint
and delivers the other exception.

So this *could* be a page fault hit during delivery of the int3
exception.  I don't believe it's a GDT problem, though, because that
would also likely lead to a triple fault.

What I *would* believe is that the IST table got messed up and we're
seeing the result of trying to push to the stack with the initial
RSP=0 so the fault hits at address -8.  I have no idea how that would
happen, though.  Especially since int3 from userspace would have
exactly the same problem, and we exercise that code in the selftests.
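
For reference, the two numbers in the report line up with that theory.
The following is a minimal user-space C sketch, not kernel code: the
helper names pf_describe() and first_push_addr() are made up for
illustration, and only the three low error-code bits (present, write,
user) are decoded. It shows that "Oops: 0002" reads as a supervisor
write to a not-present page, and that an 8-byte push starting from
RSP=0 lands on fffffffffffffff8, the address in the BUG line.

```c
#include <stdint.h>
#include <stdio.h>

/* Low bits of the x86 page-fault error code. */
#define PF_PROT  (1u << 0)  /* 0: page not present, 1: protection violation */
#define PF_WRITE (1u << 1)  /* 0: read access,      1: write access */
#define PF_USER  (1u << 2)  /* 0: supervisor mode,  1: user mode */

/* Hypothetical helper: render the low three error-code bits as text. */
static void pf_describe(unsigned int err, char *buf, size_t len)
{
	snprintf(buf, len, "%s %s of a %s page",
		 (err & PF_USER)  ? "user"    : "supervisor",
		 (err & PF_WRITE) ? "write"   : "read",
		 (err & PF_PROT)  ? "present" : "not-present");
}

/*
 * Hypothetical helper: address written by the first 8-byte push when
 * the stack pointer starts at 'rsp'.  Unsigned arithmetic wraps
 * modulo 2^64, so rsp == 0 gives 0xfffffffffffffff8, i.e. -8.
 */
static uint64_t first_push_addr(uint64_t rsp)
{
	return rsp - 8;
}
```

Calling pf_describe(0x0002, buf, sizeof(buf)) fills buf with
"supervisor write of a not-present page", and first_push_addr(0)
returns 0xfffffffffffffff8. That only shows the report is consistent
with a push from a zeroed stack pointer; it says nothing about how
RSP would have ended up as 0.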