I tried this:

struct C {
  virtual ~C();
  virtual void f();
};

void
f (C *p)
{
  p->f();
  p->f();
}

with r256939 and -mindirect-branch=thunk -O2 on x86-64 GNU/Linux, and got this:

_Z1fP1C:
.LFB0:
        .cfi_startproc
        pushq   %rbx
        .cfi_def_cfa_offset 16
        .cfi_offset 3, -16
        movq    (%rdi), %rax
        movq    %rdi, %rbx
        jmp     .LIND1
.LIND0:
        pushq   16(%rax)
        jmp     __x86_indirect_thunk
.LIND1:
        call    .LIND0
        movq    (%rbx), %rax
        movq    %rbx, %rdi
        popq    %rbx
        .cfi_def_cfa_offset 8
        movq    16(%rax), %rax
        jmp     __x86_indirect_thunk_rax
        .cfi_endproc

This doesn't look quite right. x86-64 is supposed to have asynchronous unwind tables by default, but there is nothing that reflects the change in the (relative) frame address after .LIND0. I think that region really has to be moved outside of the .cfi_startproc/.cfi_endproc bracket.
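To make the mismatch concrete, here is a minimal sketch that just counts stack bytes along the path into .LIND0. It is a toy model under the usual x86-64 assumption that pushq and call each move %rsp down by 8; the names CfaModel, kRecordedOffset and the helper functions are made up for illustration and are not part of GCC or any unwinder.

```cpp
#include <cassert>

// Hypothetical model: distance in bytes from %rsp up to the CFA.
struct CfaModel {
  int offset = 8;               // at function entry only the return address is on the stack
  void push() { offset += 8; }  // pushq moves %rsp down by 8
  void call() { offset += 8; }  // call pushes an 8-byte return address
};

// What the emitted CFI still claims in the .LIND0 region: the last
// directive before it is ".cfi_def_cfa_offset 16".
const int kRecordedOffset = 16;

// Real offset on arrival at .LIND0 (reached only through "call .LIND0").
int actual_offset_at_lind0() {
  CfaModel m;
  m.push();  // pushq %rbx
  m.call();  // call .LIND0
  return m.offset;
}

// Real offset at "jmp __x86_indirect_thunk", after "pushq 16(%rax)".
int actual_offset_at_thunk_jump() {
  CfaModel m;
  m.push();  // pushq %rbx
  m.call();  // call .LIND0
  m.push();  // pushq 16(%rax)
  return m.offset;
}
```

In this model the table is off by 8 bytes at .LIND0 and by 16 bytes at the jmp, so an unwinder stopping at either instruction would compute the wrong CFA.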

There is a different issue with the thunk itself.

__x86_indirect_thunk_rax:
.LFB2:
        .cfi_startproc
        call    .LIND5
.LIND4:
        pause
        lfence
        jmp     .LIND4
.LIND5:
        mov     %rax, (%rsp)
        ret
        .cfi_endproc

If a signal is delivered after the mov has executed, the unwinder will eventually unwind through the signal frame and hit __x86_indirect_thunk_rax. It does not treat it as a signal frame, so the return address on the stack is decremented by one, in an attempt to obtain a program counter value which is within the call instruction. However, in this scenario, the return address is actually the start of the function, and subtracting one moves the program counter out of the unwind region for that function.
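The off-by-one can be sketched as follows. This is a simplified model of the lookup rule described above, not the actual unwinder code; UnwindRegion, lookup_pc and lookup_succeeds are hypothetical names, and the addresses in the usage note are invented.

```cpp
#include <cstdint>

// Hypothetical unwind-table entry covering the half-open range [start, end).
struct UnwindRegion {
  std::uintptr_t start;
  std::uintptr_t end;
};

// True if pc falls inside the region the entry describes.
bool covers(const UnwindRegion &region, std::uintptr_t pc) {
  return pc >= region.start && pc < region.end;
}

// Address used for the table lookup: for an ordinary frame the return
// address is decremented so that it points into the call instruction;
// for a signal frame it is used unchanged.
std::uintptr_t lookup_pc(std::uintptr_t return_address, bool is_signal_frame) {
  return is_signal_frame ? return_address : return_address - 1;
}

// Does the lookup land in the region of the function the address belongs to?
bool lookup_succeeds(const UnwindRegion &region, std::uintptr_t return_address,
                     bool is_signal_frame) {
  return covers(region, lookup_pc(return_address, is_signal_frame));
}
```

With a target function occupying, say, [0x1000, 0x1040), a "return address" of 0x1000 (what the mov stores into the stack slot) is looked up as 0xfff in an ordinary frame and misses the region, while a genuine return address in the middle of the function is found as expected.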

Both issues are visible in GDB if you set breakpoints in the proper places because the frame information used for debugging is incorrect as well.

This probably does not concern the kernel that much, but it is definitely a problem for userspace.

Thanks,
Florian
