On Tue, May 17, 2016 at 09:31:12AM -0700, Linus Torvalds wrote: > On Tue, May 17, 2016 at 7:43 AM, Josh Poimboeuf <jpoim...@redhat.com> wrote: > > > > index 98df1fa..dae7ca0 100644 > > --- a/arch/x86/entry/thunk_64.S > > +++ b/arch/x86/entry/thunk_64.S > > @@ -15,9 +15,10 @@ > > .globl \name > > .type \name, @function > > \name: > > + /* push 1 register if frame pointers are enabled */ > > FRAME_BEGIN > > > > - /* this one pushes 9 elems, the next one would be %rIP */ > > + /* push 9 registers */ > > I don't hate this patch, but quite frankly, as with the other case, > I'd just make the frame pointer be unconditional in this case. > > If we push nine other registers, the frame pointer setup code is *not* > going to matter. > > The reason to avoid frame pointers in code generation is two-fold: > > 1) for small leaf functions, it often ends up dominating > > 2) it removes a register that is otherwise usable, which can be > particularly bad on 32-bit x86 due to the much more limited number of > registers (and was apparently really noticeable on the older on-order > atom cores) > > and in this case neither of them is really an issue. > > So I would suggest that any case that actually depends on a frame > access just make the frame pointer not just unconditional, but > _explicit_. > > So not just avoiding the macro because it's conditional, but write out > the sequence to actually set up the frame, and then use > > - movq 9*8(%rsp), %rdi > + movq 8(%rbp), %rdi # return address > > to entirely avoid all kind of "how many registers have we pushed" math. > > Considering that we got this wrong in two places, it's clearly too > subtle for our little brains as-is.
Makes sense, thanks. Here's v2: --- From: Josh Poimboeuf <jpoim...@redhat.com> Subject: [PATCH v2] x86/asm/entry: fix stack return address retrieval in thunk With CONFIG_FRAME_POINTER enabled, a thunk can pass a bad return address value to the called function. '9*8(%rsp)' actually gets the frame pointer, not the return address. The only users of the 'put_ret_addr_in_rdi' option are two functions which trace the enabling and disabling of interrupts, so this bug can result in bad debug or tracing information with CONFIG_IRQSOFF_TRACER or CONFIG_PROVE_LOCKING. Fixes: 058fb73274f9 ("x86/asm/entry: Create stack frames in thunk functions") Reported-by: Matt Fleming <m...@codeblueprint.co.uk> Signed-off-by: Josh Poimboeuf <jpoim...@redhat.com> --- arch/x86/entry/thunk_64.S | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/arch/x86/entry/thunk_64.S b/arch/x86/entry/thunk_64.S index 98df1fa..027aec4 100644 --- a/arch/x86/entry/thunk_64.S +++ b/arch/x86/entry/thunk_64.S @@ -8,16 +8,15 @@ #include <linux/linkage.h> #include "calling.h" #include <asm/asm.h> -#include <asm/frame.h> /* rdi: arg1 ... normal C conventions. rax is saved/restored. */ .macro THUNK name, func, put_ret_addr_in_rdi=0 .globl \name .type \name, @function \name: - FRAME_BEGIN + pushq %rbp + movq %rsp, %rbp - /* this one pushes 9 elems, the next one would be %rIP */ pushq %rdi pushq %rsi pushq %rdx @@ -29,8 +28,8 @@ pushq %r11 .if \put_ret_addr_in_rdi - /* 9*8(%rsp) is return addr on stack */ - movq 9*8(%rsp), %rdi + /* 8(%rbp) is return addr on stack */ + movq 8(%rbp), %rdi .endif call \func @@ -65,7 +64,7 @@ restore: popq %rdx popq %rsi popq %rdi - FRAME_END + popq %rbp ret _ASM_NOKPROBE(restore) #endif -- 2.4.11