On Thu, Aug 6, 2020 at 12:23 PM Joerg Roedel <jroe...@suse.de> wrote:
>
> Yes, that's the best for now. My gut feeling is that the fault Jason is
> seeing didn't happen on a vmalloc address, but I can't prove that yet.

No, it's definitely fairly high in the vmalloc space. Look at the
faulting address:

   BUG: unable to handle page fault for address: ffffe8ffffd00608

and the code sequence is this:

>   12: 48 8b 06              mov    (%rsi),%rax
>   15: 4c 8b 67 40           mov    0x40(%rdi),%r12
>   19: 49 89 c6              mov    %rax,%r14
>   1c: 45 30 f6              xor    %r14b,%r14b
>   1f: a8 04                 test   $0x4,%al
>   21: b8 00 00 00 00        mov    $0x0,%eax
>   26: 4c 0f 44 f0           cmove  %rax,%r14

that admittedly odd sequence is get_work_pwq(work)
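
For anyone who doesn't read cmove for fun, here is a rough C model of what
those instructions compute. It is a sketch read straight off the
disassembly, not the kernel source: the function name and the MODEL_*
constants are made up for illustration, and the 0x4 / 0xff values are simply
what the 'test' and 'xor' operands show in this particular build.

```c
#include <assert.h>
#include <stdint.h>

/* Values read off the disassembly above, not copied from kernel headers. */
#define MODEL_PWQ_FLAG  UINT64_C(0x4)   /* test $0x4,%al */
#define MODEL_FLAG_MASK UINT64_C(0xff)  /* xor %r14b,%r14b clears the low byte */

/* The tested flag has to sit inside the byte that gets masked off. */
static_assert((MODEL_PWQ_FLAG & MODEL_FLAG_MASK) == MODEL_PWQ_FLAG,
	      "flag bit must be within the masked low byte");

/* Hypothetical stand-in for get_work_pwq(), operating on the raw data word. */
static uint64_t model_get_work_pwq(uint64_t data)
{
	/* mov %rax,%r14 ; xor %r14b,%r14b */
	uint64_t pwq = data & ~MODEL_FLAG_MASK;

	/* test $0x4,%al ; mov $0x0,%eax ; cmove %rax,%r14 */
	if (!(data & MODEL_PWQ_FLAG))
		pwq = 0;

	return pwq;
}
```

So %r14 ends up as either the data word with its low flag byte stripped, or
NULL if the flag bit isn't set.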

And then the faulting instruction is:

>   2a:* 49 8b 46 08          mov    0x8(%r14),%rax <-- trapping instruction

and this is the "->wq" dereference.

So it's the pwq->wq that traps, with 'pwq' being the trapping base
pointer, and clearly being in the vmalloc space.
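
To make the "in the vmalloc space" part checkable, here is a small sketch
that tests an address against the documented x86-64 vmalloc/ioremap range.
The bounds are an assumption: they are the 4-level-paging values from the
kernel's x86_64 memory map document with the base not randomized, and the
macro and function names are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bounds: x86-64, 4-level paging, vmalloc base not randomized. */
#define MODEL_VMALLOC_START UINT64_C(0xffffc90000000000)
#define MODEL_VMALLOC_END   UINT64_C(0xffffe8ffffffffff)  /* inclusive */

static_assert(MODEL_VMALLOC_START < MODEL_VMALLOC_END,
	      "assumed vmalloc bounds must form a non-empty range");

/* Hypothetical helper: is addr inside the assumed vmalloc/ioremap range? */
static bool model_in_vmalloc_range(uint64_t addr)
{
	return addr >= MODEL_VMALLOC_START && addr <= MODEL_VMALLOC_END;
}
```

Under those assumptions, model_in_vmalloc_range(0xffffe8ffffd00608) is true,
and the address sits only a few megabytes below the top of the range.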

I think pwq may be a percpu allocation, so not _directly_ vmalloc().
Adding Tejun to the cc in case he can clarify ("No, silly Linus, it's
allocated here..").

                Linus
