Re: BUG: unable to handle kernel paging request in __switch_to

Andy Lutomirski Thu, 14 Dec 2017 10:55:23 -0800

On Thu, Dec 14, 2017 at 10:42 AM, Linus Torvalds
<[email protected]> wrote:
> On Thu, Dec 14, 2017 at 9:12 AM, Thomas Gleixner <[email protected]> wrote:
>> On Sun, 3 Dec 2017, syzbot wrote:
>>> BUG: unable to handle kernel paging request at fffffffffffffff8
>>> Oops: 0002 [#1] SMP KASAN
>
> System write of a non-existent page.
>
>>> RIP: 0010:switch_fpu_prepare arch/x86/include/asm/fpu/internal.h:535 [inline]
>>> RIP: 0010:__switch_to+0x95b/0x1330 arch/x86/kernel/process_64.c:407
>
> This says it's
>
>      old_fpu->last_cpu = cpu;
>
> and the code disassembly ends up looking something like this:
>
>    0: 48 c1 ea 03          shr    $0x3,%rdx
>    4: 0f b6 04 02          movzbl (%rdx,%rax,1),%eax
>    8: 84 c0                test   %al,%al
>    a: 74 08                je     0x14
>    c: 3c 03                cmp    $0x3,%al
>    e: 0f 8e d5 06 00 00    jle    0x6e9
>   14: 8b 85 70 fe ff ff    mov    -0x190(%rbp),%eax
>   1a: 41 89 84 24 c0 15 00 mov    %eax,0x15c0(%r12)
>   21: 00
>   22:* cc                    int3    <-- trapping instruction
>
> where those preceding two "mov" instructions look like they might indeed be that
>
>      old_fpu->last_cpu = cpu;
>
> thing, and the register state doesn't look insane for this.
>
> So I think the RIP->line encoding is slightly off, and that "int3" is
> almost certainly due to the very next thing after the write:
>
>                 trace_x86_fpu_regs_deactivated(old_fpu);
>
> and that actually makes sense if the test robot is doing some tracing,
> particularly if it's just about to _start_ tracing, and it has
> replaced the first byte of the instruction with 'int3' and is in the
> process of doing the rewrite.
>
> The fact that it then takes a system write fault is because some GDT
> or IDT setup is screwed up. Or possibly the stack is screwed up and
> started out as 0, and then the push to the stack would decrement the
> stack pointer and try to push the error state or something.
>
>> That's the second report I'm staring at today which has CR2
>> fffffffffffffffx and points to a faulting instruction which does not make
>> any sense at all.
>
> That actually does make sense - see above.  It just requires that race
> with the instruction rewriting.
>
> *Normally* we never actually take the "int3" exception, because
> normally we'll have completed the rewrite before another CPU actually
> executes the instruction that is being rewritten.
>
> So I'm assuming this is with the page table isolation, and some
> unusual case in exception handling got screwed up.


SDM time.  Assuming the CPU actually decoded int3 and tried to execute
it, I can see a couple of possible outcomes:

1. Something's wrong with the IDT and it can't read the vector.  I
think this would end up triple-faulting, though.

2. It actually tries to handle the breakpoint.  A breakpoint is a
benign exception, so any exception encountered while delivering it
would result in serial delivery.  I've never thought that serial
delivery made any sense -- presumably it just cancels the breakpoint
and delivers the other exception.  So this *could* be a page fault hit
during delivery of the int3 exception.  I don't believe it's a GDT
problem, though, because that would also likely lead to a triple
fault.  What I *would* believe is that the IST got messed up and
we're seeing the result of trying to push to the stack with the
initial RSP=0 so the fault hits at address -8.

I have no idea how that would happen, though.  Especially since int3
from userspace would have exactly the same problem, and we exercise
that code in the selftests.
