https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87414

            Bug ID: 87414
           Summary: -mindirect-branch=thunk produces thunk with incorrect
                    CFI on x86_64
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: fw at gcc dot gnu.org
                CC: hjl.tools at gmail dot com
  Target Milestone: ---
            Target: x86_64

GCC 9.0.0 (20180924) generates these thunks on x86-64:

__x86_indirect_thunk_rdi:
.LFB1:
        .cfi_startproc
        call    .LIND1
.LIND0:
        pause
        lfence
        jmp     .LIND0
.LIND1:
        mov     %rdi, (%rsp)
        ret
        .cfi_endproc
.LFE1:

I don't think the CFI is correct.  At the ret instruction, the CFI
indicates that the return address is at the top of the stack, but that
slot now holds the branch target written by the mov, that is, the start
address of the called function.  The unwinder will use this value as the
return address and subtract one because it's a non-signal handler frame.
The resulting address is located before the start of the called function,
so the unwinder will locate an incorrect FDE based on it.

Indeed I see this when si-stepping through the execution with GDB:

(gdb) disas
Dump of assembler code for function __x86_indirect_thunk_rdi:
   0x00000000004004a5 <+0>:     callq  0x4004b1 <__x86_indirect_thunk_rdi+12>
   0x00000000004004aa <+5>:     pause  
   0x00000000004004ac <+7>:     lfence 
   0x00000000004004af <+10>:    jmp    0x4004aa <__x86_indirect_thunk_rdi+5>
   0x00000000004004b1 <+12>:    mov    %rdi,(%rsp)
=> 0x00000000004004b5 <+16>:    retq   
   0x00000000004004b6 <+17>:    nopw   %cs:0x0(%rax,%rax,1)
End of assembler dump.
(gdb) bt
#0  0x00000000004004b5 in __x86_indirect_thunk_rdi ()
#1  0x0000000000400490 in frame_dummy () at /tmp/cfi.c:16
#2  0x000000000040038e in main () at /tmp/cfi.c:16
(gdb) print f2
$1 = {int (void)} 0x400490 <f2>

Note the “frame_dummy” instead of “f2” in the backtrace.
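
To illustrate the address arithmetic, here is a minimal sketch of a
range lookup that subtracts one from a return address, as described
above.  It is a toy model, not libgcc or GDB code.  The names fde_range,
find_fde, find_caller_fde and check_lookup are made up for this example.
The start of f2 and the bounds of the thunk come from the GDB session
above; the bounds of frame_dummy and the end of f2 are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A PC range covered by one FDE: [start, end).  */
struct fde_range
{
  const char *name;
  uintptr_t start;
  uintptr_t end;
};

/* The start of f2 and the thunk bounds are taken from the GDB session.
   The bounds of frame_dummy and the end of f2 are hypothetical.  */
const struct fde_range ranges[] = {
  { "frame_dummy", 0x400460, 0x400490 },
  { "f2", 0x400490, 0x4004a0 },
  { "__x86_indirect_thunk_rdi", 0x4004a5, 0x4004b6 },
};

/* Return the range containing PC, or NULL if no range covers it.  */
const struct fde_range *
find_fde (uintptr_t pc)
{
  for (size_t i = 0; i < sizeof ranges / sizeof ranges[0]; i++)
    if (pc >= ranges[i].start && pc < ranges[i].end)
      return &ranges[i];
  return NULL;
}

/* Lookup for a non-signal caller frame: the return address points after
   the call instruction, so the lookup uses return_address - 1.  */
const struct fde_range *
find_caller_fde (uintptr_t return_address)
{
  return find_fde (return_address - 1);
}

/* The value at the top of the stack at the ret is 0x400490, the start
   of f2.  A plain lookup finds f2; the decremented lookup lands in the
   preceding range.  */
void
check_lookup (void)
{
  assert (find_fde (0x400490) == &ranges[1]);
  assert (find_caller_fde (0x400490) == &ranges[0]);
}
```

Under these assumptions the decremented lookup selects the range just
before f2, which matches the frame_dummy frame in the backtrace.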

Test program:

__attribute__ ((weak))
int
f1 (int (*f2) (void))
{
  return f2 ();
}

int
f2 (void)
{
}

int
main (void)
{
  f1 (f2);
}

We had a bit of an internal debate whether it's actually possible to produce
correct CFI for this.  I think we can reflect the stack pointer adjustment
after the thunk-internal call in the CFI, so that the unwinder continues to see
the original caller of the thunk.  Due to the address decrement, this needs to
happen for the jmp instruction, not after the .LIND1 label.
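
The effect of such an adjustment can be sketched with a toy model of
the two stack slots involved.  This is only an illustration of the
arithmetic, not GCC or unwinder code, and the names thunk_stack,
stack_at_ret and unwound_return_address are hypothetical.  It assumes
the usual x86-64 convention that the return address is stored at
CFA - 8, with the CFA at %rsp + 8 on function entry.  It does not model
which instructions the CFI change has to cover.

```c
#include <assert.h>
#include <stdint.h>

/* Two 8-byte stack slots; slot[0] is the top of the stack (at %rsp),
   slot[1] is the slot at %rsp + 8.  */
struct thunk_stack
{
  uint64_t slot[2];
};

/* Stack contents at the thunk's ret instruction.  */
struct thunk_stack
stack_at_ret (uint64_t caller_return, uint64_t branch_target)
{
  struct thunk_stack s;
  /* Return address that was on top of the stack on entry to the thunk.  */
  s.slot[1] = caller_return;
  /* Slot pushed by "call .LIND1", then overwritten by the mov.  */
  s.slot[0] = branch_target;
  return s;
}

/* Return address the unwinder would read when the CFA is
   %rsp + cfa_offset: the value stored at CFA - 8.  Only the offsets 8
   and 16 are modeled here.  */
uint64_t
unwound_return_address (const struct thunk_stack *s, unsigned cfa_offset)
{
  assert (cfa_offset == 8 || cfa_offset == 16);
  return s->slot[(cfa_offset - 8) / 8];
}
```

With the values from the GDB session (0x40038e in main, 0x400490 for
f2), an offset of 8 yields the branch target 0x400490, while an offset
of 16 yields 0x40038e, the original caller of the thunk.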

As an alternative, it would be possible to error out when
-mindirect-branch=thunk is used with -fasynchronous-unwind-tables, but since
the latter is the default, this would be a bit harsh.
