Hello Peter!
On 1/6/2026 10:18 PM, H. Peter Anvin wrote:
> The vdso32 sigreturn.S contains open-coded DWARF bytecode, which
> includes a hack for gdb to not try to step back to a previous call
> instruction when backtracing from a signal handler.
>
> Neither of those are necessary anymore: the backtracing issue is
> handled by ".cfi_entry simple" and ".cfi_signal_frame", both of which
> have been supported for a very long time now, which allows the
> remaining frame to be built using regular .cfi annotations.
Hopefully the glibc developers will do something similar for the x86-64
__restore_rt() in sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c.
>
> Add a few more register offsets to the signal frame just for good
> measure.
>
> Replace the nop on fallthrough of the system call (which should never,
> ever happen) with a ud2a trap.
> diff --git a/arch/x86/entry/vdso/vdso32/sigreturn.S
> b/arch/x86/entry/vdso/vdso32/sigreturn.S
> .text
> .globl __kernel_sigreturn
> .type __kernel_sigreturn,@function
> - nop /* this guy is needed for .LSTARTFDEDLSI1 below (watch for HACK) */
> ALIGN
> __kernel_sigreturn:
> -.LSTART_sigreturn:
> - popl %eax /* XXX does this mean it needs unwind info? */
> + STARTPROC_SIGNAL_FRAME IA32_SIGFRAME_sigcontext
> + popl %eax
> + CFI_ADJUST_CFA_OFFSET -4
> movl $__NR_sigreturn, %eax
> int $0x80
> -.LEND_sigreturn:
...
> .globl __kernel_rt_sigreturn
> .type __kernel_rt_sigreturn,@function
> ALIGN
> __kernel_rt_sigreturn:
> -.LSTART_rt_sigreturn:
> + STARTPROC_SIGNAL_FRAME IA32_RT_SIGFRAME_sigcontext
> movl $__NR_rt_sigreturn, %eax
> int $0x80
> -.LEND_rt_sigreturn:
...
> - .section .eh_frame,"a",@progbits
> -.LSTARTFRAMEDLSI1:
> - .long .LENDCIEDLSI1-.LSTARTCIEDLSI1
> -.LSTARTCIEDLSI1:
> - .long 0 /* CIE ID */
> - .byte 1 /* Version number */
> - .string "zRS" /* NUL-terminated augmentation string */
Note that the "S" in "zRS" is the signal frame indication.
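
To spell out what an unwinder gets from that string, here is a minimal
sketch in C of decoding the augmentation letters. The struct and function
names are made up for this illustration and are not taken from any
particular unwinder; only 'z', 'R' and 'S' are handled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a CIE augmentation string (illustration only). */
struct cie_aug {
    bool has_data;      /* leading 'z': augmentation data length and data follow */
    bool has_fde_enc;   /* 'R': augmentation data holds the FDE pointer encoding */
    bool signal_frame;  /* 'S': FDEs using this CIE describe signal frames */
};

static struct cie_aug parse_augmentation(const char *s)
{
    struct cie_aug a = { false, false, false };

    /* 'z' is only meaningful as the first character. */
    if (s[0] == 'z')
        a.has_data = true;

    for (size_t i = 0; s[i] != '\0'; i++) {
        switch (s[i]) {
        case 'R':
            a.has_fde_enc = true;
            break;
        case 'S':
            a.signal_frame = true;
            break;
        default:
            /* 'z' handled above; 'L', 'P' and others are ignored here. */
            break;
        }
    }
    return a;
}
```

For "zRS" all three flags end up set; for "zR" the signal_frame flag stays
clear.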
> - .uleb128 1 /* Code alignment factor */
> - .sleb128 -4 /* Data alignment factor */
> - .byte 8 /* Return address register column */
> - .uleb128 1 /* Augmentation value length */
> - .byte 0x1b /* DW_EH_PE_pcrel|DW_EH_PE_sdata4. */
> - .byte 0 /* DW_CFA_nop */
> - .align 4
> -.LENDCIEDLSI1:
> - .long .LENDFDEDLSI1-.LSTARTFDEDLSI1 /* Length FDE */
> -.LSTARTFDEDLSI1:
> - .long .LSTARTFDEDLSI1-.LSTARTFRAMEDLSI1 /* CIE pointer */
> - /* HACK: The dwarf2 unwind routines will subtract 1 from the
> - return address to get an address in the middle of the
> - presumed call instruction. Since we didn't get here via
> - a call, we need to include the nop before the real start
> - to make up for it. */
> - .long .LSTART_sigreturn-1-. /* PC-relative start address */
Your version no longer has this nop, nor does the FDE start one byte
earlier. Is that no longer required for unwinders? See the excerpts
from the dumped DWARF and disassembly for __kernel_sigreturn() below.
> - .long .LEND_sigreturn-.LSTART_sigreturn+1
> - .uleb128 0 /* Augmentation */
...
> - .align 4
> -.LENDFDEDLSI1:
> -
> - .long .LENDFDEDLSI2-.LSTARTFDEDLSI2 /* Length FDE */
> -.LSTARTFDEDLSI2:
> - .long .LSTARTFDEDLSI2-.LSTARTFRAMEDLSI1 /* CIE pointer */
> - /* HACK: See above wrt unwind library assumptions. */
> - .long .LSTART_rt_sigreturn-1-. /* PC-relative start address */
Ditto.
> - .long .LEND_rt_sigreturn-.LSTART_rt_sigreturn+1
> - .uleb128 0 /* Augmentation */
Excerpt from dump of DWARF and disassembly with your patch:
$ objdump -d -Wf arch/x86/entry/vdso/vdso32/vdso32.so.dbg
...
000001cc 0000003c 00000000 CIE <-- CIE for __kernel_sigreturn
Version: 1
Augmentation: "zRS"
Code alignment factor: 1
Data alignment factor: -4
Return address column: 8
Augmentation data: 1b
DW_CFA_def_cfa: r4 (esp) ofs 4
DW_CFA_offset_extended_sf: r8 (eip) at cfa+56
DW_CFA_offset_extended_sf: r0 (eax) at cfa+44
DW_CFA_offset_extended_sf: r3 (ebx) at cfa+32
DW_CFA_offset_extended_sf: r1 (ecx) at cfa+40
DW_CFA_offset_extended_sf: r2 (edx) at cfa+36
DW_CFA_offset_extended_sf: r4 (esp) at cfa+28
DW_CFA_offset_extended_sf: r5 (ebp) at cfa+24
DW_CFA_offset_extended_sf: r6 (esi) at cfa+20
DW_CFA_offset_extended_sf: r7 (edi) at cfa+16
DW_CFA_offset_extended_sf: r40 (es) at cfa+8
DW_CFA_offset_extended_sf: r41 (cs) at cfa+60
DW_CFA_offset_extended_sf: r42 (ss) at cfa+72
DW_CFA_offset_extended_sf: r43 (ds) at cfa+12
DW_CFA_offset_extended_sf: r9 (eflags) at cfa+64
DW_CFA_nop
0000020c 00000010 00000044 FDE cie=000001cc pc=00001a40..00001a4a <-- FDE for __kernel_sigreturn
DW_CFA_advance_loc: 1 to 00001a41
DW_CFA_def_cfa_offset: 0
[ The FDE covers the range [1a40..1a4a[. Previously it would have
started one byte earlier (at the nop), so that the range would have
been [1a3f..1a4a[. This is (or was) required for unwinders that always
subtract one from the unwound return address, so that it points into
the instruction that invoked the function (e.g. call) instead of behind
it. That matters when the call is the last instruction of its function,
e.g. a call to a non-returning function. Such an unwinder would now
look up IP=1a3f as belonging to int80_landing_pad (and use the DWARF
rules applicable to its last instruction) instead of
__kernel_sigreturn (and its rules). Likewise for __kernel_rt_sigreturn. ]
...
00001a3c <int80_landing_pad>:
1a3c: 5d pop %ebp
1a3d: 5a pop %edx
1a3e: 59 pop %ecx
1a3f: c3 ret
00001a40 <__kernel_sigreturn>:
1a40: 58 pop %eax
1a41: b8 77 00 00 00 mov $0x77,%eax
1a46: cd 80 int $0x80
00001a48 <vdso32_sigreturn_landing_pad>:
1a48: 0f 0b ud2
1a4a: 8d b6 00 00 00 00 lea 0x0(%esi),%esi
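
To make the concern concrete, here is a minimal sketch in C of the lookup
described in the bracketed note above, using the addresses from the dump
with your patch. The table and function names are hypothetical, and the
start address of the int80_landing_pad entry is an assumption: the dump
only shows the symbol, not the FDE that covers it.

```c
#include <stddef.h>
#include <stdint.h>

struct fde_range {
    uint32_t start;     /* first address covered */
    uint32_t end;       /* first address NOT covered */
    const char *name;
};

/* Layout with the patch applied.  The int80_landing_pad start is assumed. */
static const struct fde_range patched[] = {
    { 0x1a3c, 0x1a40, "int80_landing_pad" },
    { 0x1a40, 0x1a4a, "__kernel_sigreturn" },
};

/* Return the name of the entry covering ip, or NULL if none does. */
static const char *find_fde(const struct fde_range *t, size_t n, uint32_t ip)
{
    for (size_t i = 0; i < n; i++)
        if (ip >= t[i].start && ip < t[i].end)
            return t[i].name;
    return NULL;
}

/*
 * Model of the unwinder policy: with subtract_one set, the lookup uses
 * ra - 1 (the "middle of the presumed call instruction"); otherwise it
 * uses the return address as is.
 */
static const char *lookup_return_address(uint32_t ra, int subtract_one)
{
    return find_fde(patched, sizeof(patched) / sizeof(patched[0]),
                    subtract_one ? ra - 1 : ra);
}
```

With ra = 0x1a40 (the handler's return address, i.e. the start of
__kernel_sigreturn), the subtract-one lookup lands in int80_landing_pad,
while the exact lookup finds __kernel_sigreturn.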
Excerpt without your patch:
$ objdump -d -Wf arch/x86/entry/vdso/vdso32/vdso32.so.dbg
...
000001cc 00000010 00000000 CIE <-- CIE for __kernel_sigreturn and __kernel_rt_sigreturn
Version: 1
Augmentation: "zRS"
Code alignment factor: 1
Data alignment factor: -4
Return address column: 8
Augmentation data: 1b
DW_CFA_nop
DW_CFA_nop
000001e0 00000068 00000018 FDE cie=000001cc pc=00001a6f..00001a78 <-- FDE for __kernel_sigreturn
DW_CFA_def_cfa_expression (DW_OP_breg4 (esp): 32; DW_OP_deref)
DW_CFA_expression: r0 (eax) (DW_OP_breg4 (esp): 48)
DW_CFA_expression: r1 (ecx) (DW_OP_breg4 (esp): 44)
DW_CFA_expression: r2 (edx) (DW_OP_breg4 (esp): 40)
DW_CFA_expression: r3 (ebx) (DW_OP_breg4 (esp): 36)
DW_CFA_expression: r5 (ebp) (DW_OP_breg4 (esp): 28)
DW_CFA_expression: r6 (esi) (DW_OP_breg4 (esp): 24)
DW_CFA_expression: r7 (edi) (DW_OP_breg4 (esp): 20)
DW_CFA_expression: r8 (eip) (DW_OP_breg4 (esp): 60)
DW_CFA_advance_loc: 2 to 00001a71
DW_CFA_def_cfa_expression (DW_OP_breg4 (esp): 28; DW_OP_deref)
DW_CFA_expression: r0 (eax) (DW_OP_breg4 (esp): 44)
DW_CFA_expression: r1 (ecx) (DW_OP_breg4 (esp): 40)
DW_CFA_expression: r2 (edx) (DW_OP_breg4 (esp): 36)
DW_CFA_expression: r3 (ebx) (DW_OP_breg4 (esp): 32)
DW_CFA_expression: r5 (ebp) (DW_OP_breg4 (esp): 24)
DW_CFA_expression: r6 (esi) (DW_OP_breg4 (esp): 20)
DW_CFA_expression: r7 (edi) (DW_OP_breg4 (esp): 16)
DW_CFA_expression: r8 (eip) (DW_OP_breg4 (esp): 56)
[ See how the FDE for __kernel_sigreturn covers the range [1a6f..1a78[.
An unwinder that always subtracts one from the return address would
look up IP=1a6f as belonging to __kernel_sigreturn (and use the DWARF
rules applicable to the nop preceding its symbol). Likewise for
__kernel_rt_sigreturn. Or is that no longer true? ]
...
00001a5c <int80_landing_pad>:
1a5c: 5d pop %ebp
1a5d: 5a pop %edx
1a5e: 59 pop %ecx
1a5f: c3 ret
1a60: 90 nop
1a61: 8d b4 26 00 00 00 00 lea 0x0(%esi,%eiz,1),%esi
1a68: 2e 8d b4 26 00 00 00 lea %cs:0x0(%esi,%eiz,1),%esi
1a6f: 00
00001a70 <__kernel_sigreturn>:
1a70: 58 pop %eax
1a71: b8 77 00 00 00 mov $0x77,%eax
1a76: cd 80 int $0x80
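
As a side check, the register save slots in the two dumps can be compared
mechanically. The sketch below (C, hypothetical names) copies the offsets
from the dumps above and only covers the registers present in both; the
segment registers and eflags exist only in the patched CIE. SAVED_ESP pairs
the slot that the old DW_CFA_def_cfa_expression dereferences with the
patched r4 (esp) rule, which is an interpretation on my side.

```c
enum { EAX, ECX, EDX, EBX, EBP, ESI, EDI, EIP, SAVED_ESP, NSLOTS };

/* Without the patch: offsets from %esp (DW_OP_breg4). */
static const int old_at_entry[NSLOTS]  = { 48, 44, 40, 36, 28, 24, 20, 60, 32 };
static const int old_after_pop[NSLOTS] = { 44, 40, 36, 32, 24, 20, 16, 56, 28 };

/* With the patch: offsets from the CFA (DW_CFA_offset_extended_sf). */
static const int new_from_cfa[NSLOTS]  = { 44, 40, 36, 32, 24, 20, 16, 56, 28 };

/*
 * cfa_minus_esp is the patched CFA offset: 4 at function entry
 * (DW_CFA_def_cfa: esp ofs 4), 0 after the pop (DW_CFA_def_cfa_offset: 0).
 * Returns 1 if every slot resolves to the same %esp-relative address.
 */
static int same_slots(const int *old_from_esp, int cfa_minus_esp)
{
    for (int i = 0; i < NSLOTS; i++)
        if (old_from_esp[i] != new_from_cfa[i] + cfa_minus_esp)
            return 0;
    return 1;
}
```

Under these assumptions both same_slots(old_at_entry, 4) and
same_slots(old_after_pop, 0) hold, so the difference between the two
versions would be the FDE start address, not the save slots.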
Thanks and regards,
Jens
--
Jens Remus
Linux on Z Development (D3303)
IBM Deutschland Research & Development GmbH; Chairman of the Supervisory
Board: Wolfgang Wendt; Management: David Faller; Registered office:
Ehningen; Court of registration: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/