> On Jan 10, 2019, at 4:56 PM, Andy Lutomirski <l...@kernel.org> wrote:
> 
> On Thu, Jan 10, 2019 at 3:02 PM Linus Torvalds
> <torva...@linux-foundation.org> wrote:
>> On Thu, Jan 10, 2019 at 12:52 PM Josh Poimboeuf <jpoim...@redhat.com> wrote:
>>> Right, emulating a call instruction from the #BP handler is ugly,
>>> because you have to somehow grow the stack to make room for the return
>>> address.  Personally I liked the idea of shifting the iret frame by 16
>>> bytes in the #DB entry code, but others hated it.
>> 
>> Yeah, I hated it.
>> 
>> But I'm starting to think it's the simplest solution.
>> 
>> So still not loving it, but all the other models have had huge issues too.
> 
> Putting my maintainer hat on:
> 
> I'm okay-ish with shifting the stack by 16 bytes.  If this is done, I
> want an assertion in do_int3() or wherever the fixup happens that the
> write isn't overlapping pt_regs (which is easy to implement because
> that code has the relevant pt_regs pointer).  And I want some code
> that explicitly triggers the fixup when a CONFIG_DEBUG_ENTRY=y or
> similar kernel is built so that this whole mess actually gets
> exercised.  Because the fixup only happens when a
> really-quite-improbable race gets hit, and the issues depend on stack
> alignment, which is presumably why Josh was able to submit a buggy
> series without noticing.
> 
> BUT: this is going to be utterly gross whenever anyone tries to
> implement shadow stacks for the kernel, and we might need to switch to
> a longjmp-like approach if that happens.

Here is an alternative idea (although similar to Steven’s and my code).

Assume that we always clobber R10 and R11 explicitly on static calls, as the
calling convention should already imply (and a gcc plugin should allow us to
enforce it). Also assume that we hold a table of all source RIPs and their
matching targets.

Now, in the int3 handler, you can take the faulting RIP and look it up in
the “static-calls” table, writing RIP+5 (i.e., the return address) into R10
and the target into R11. The int3 handler then diverts code execution by
changing pt_regs->rip to point to a new function that does:

        push R10
        jmp __x86_indirect_thunk_r11

And then you are done. No?
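
To make the lookup step concrete, here is a rough userspace model in C of what
the handler would do. This is only a sketch under assumptions: struct
regs_model, struct static_call_entry, static_call_fixup() and the trampoline
argument are names made up for illustration, not existing kernel interfaces,
and the model assumes regs->ip already holds the address of the patched call
site (the real handler would have to account for the int3 byte first).

```c
#include <stddef.h>
#include <stdint.h>

#define CALL_INSN_SIZE 5	/* length of a direct call: opcode + rel32 */

/* Hypothetical stand-in for pt_regs; only the fields touched here. */
struct regs_model {
	uint64_t ip;	/* assumed: address of the patched call site */
	uint64_t r10;
	uint64_t r11;
};

/* Hypothetical table entry: one per static-call site. */
struct static_call_entry {
	uint64_t site;		/* source RIP of the call instruction */
	uint64_t target;	/* function the call should reach */
};

/*
 * Look up the trapping RIP. On a hit, put the return address in R10 and
 * the target in R11, then redirect execution to the trampoline that does
 * "push R10; jmp __x86_indirect_thunk_r11". Returns 1 on a hit, 0 if the
 * RIP is not a static-call site (registers are left untouched).
 */
static int static_call_fixup(struct regs_model *regs,
			     const struct static_call_entry *tbl, size_t n,
			     uint64_t trampoline)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].site != regs->ip)
			continue;
		regs->r10 = tbl[i].site + CALL_INSN_SIZE;
		regs->r11 = tbl[i].target;
		regs->ip = trampoline;
		return 1;
	}
	return 0;
}
```

The linear scan is only for brevity; a sorted table with a binary search would
presumably be the obvious choice if the number of sites is large.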
