This series adds s390 support for unwinding of user space using SFrame
V3. It is based on Josh's, Steven's, and my work (see prerequisites
below). The generic unwind user (sframe) frameworks are extended to
support a few s390 particularities (see patches 5 and 6), including
unwinding of user space using back chain (see patches 10-12).

Changes in v4:
- Rebase on my unwind user sframe series v13, which provides SFrame V3
  support.

Changes in RFC v3:
- Rebase on and include my unwind user cleanup series v4, which includes
  a simplification of unwind_user_word_size() on x86. (Linus)
- Implement unwinding of user space using s390 back chain using unwind
  user fp instead of introducing a new unwind user backchain. (Josh)

Changes in RFC v2:
- Rebased on latest "unwind user" enhancements from Peter Zijlstra and
  my latest "unwind user sframe" series v12.
- Incorporated RFC v1 review feedback.
- No new config options (except for unwind user backchain).


Motivation:

On s390 unwinding using frame pointer (FP) is unsupported, because the
s390 64-bit (s390x) ABI specification and the compilers lack proper
support for it. The ABI only specifies a "preferred" FP register. Both
GCC and Clang, regardless of compiler option -fno-omit-frame-pointer,
set up the preferred FP register as late as possible, which usually is
after static stack allocation, so that the CFA cannot be deduced from
the FP without further data, such as that provided by DWARF CFI or
SFrame.

In theory there is an s390-specific alternative of unwinding using back
chain (compiler option -mbackchain), but this has its own limitations.
Ubuntu is currently the only distribution that builds user space with
back chain.

As a consequence the kernel stack tracer cannot unwind user space
(except if it is built with back chain). Recording call graphs of user
space using perf is limited to stack dump sampling (i.e. perf record
--call-graph dwarf), which generates a fairly large amount of data and
has limitations.

Initial testing of recording call graphs with perf, using the s390
support for SFrame provided by RFC v1 of this series, shows that the
perf record data size is greatly reduced (smaller perf.data):

SFrame (--call-graph fp):

  # perf record -F 9999 --call-graph fp objdump -wdWF objdump
  [ perf record: Woken up 9 times to write data ]
  [ perf record: Captured and wrote 2.498 MB perf.data (10891 samples) ]

Stack sampling (--call-graph dwarf) with a default stack size of 8192:

  # perf record -F 9999 --call-graph dwarf objdump -wdWF objdump
  [ perf record: Woken up 270 times to write data ]
  [ perf record: Captured and wrote 67.467 MB perf.data (8241 samples) ]


Prerequisites:

This series applies on top of the latest unwind user sframe series
"[PATCH v13 00/18] unwind_deferred: Implement sframe handling":
https://lore.kernel.org/all/[email protected]/

Like the above series, it depends on the upcoming binutils 2.46 release
to build executables and libraries (e.g. vDSO) with SFrame V3 on s390
(using the assembler option --gsframe-3).

The unwind user sframe series depends on a Glibc patch from Josh, which
adds support for the prctls introduced in the kernel:
https://lore.kernel.org/all/20250122023517.lmztuocecdjqzfhc@jpoimboe/

Note that Josh's Glibc patch needs to be adjusted for the updated prctl
numbers from "[PATCH v13 18/18] unwind_user/sframe: Add prctl()
interface for registering .sframe sections":
https://lore.kernel.org/all/[email protected]/


Overview:

Patch 1 aligns asm/dwarf.h to x86 asm/dwarf2.h. [*]

Patch 2 replicates Josh's x86 patch "x86/asm: Avoid emitting DWARF CFI
for non-VDSO" for s390. [*]

Patch 3 changes the build of the vDSO on s390 to keep the function
symbols for stack tracing purposes. [*]

Patch 4 replicates Josh's patch "x86/vdso: Enable sframe generation in
VDSO" for s390. It enables generation of SFrame V3 stack trace
information (.sframe section) for the vDSO if the assembler supports it.

Patches 5 and 6 enable the generic unwind user (sframe) frameworks to
support the following s390 particularities:
- Patch 5 adds support for architectures that define their CFA as SP at
  callsite + offset.
- Patch 6 adds support for architectures that store the CFA offset from
  the CFA base register (e.g. SP or FP) in encoded form in SFrame. For
  instance on s390 the CFA offset is stored adjusted by -160 and then
  scaled down by 8, which makes better use of signed 8-bit SFrame
  offsets (i.e. CFA, RA, and FP offset).

Patch 7 converts several s390 ptrace function macros to inline
functions. [*]

Patch 8 introduces frame_pointer() in ptrace on s390, which is a
prerequisite for enabling unwind user.

Patch 9 adds support for unwinding of user space using SFrame on s390.
It leverages the extensions of the generic unwind user (sframe)
frameworks from patches 5 and 6, as well as the support for SFrame V3
flexible FDEs from the unwind user sframe series.

Patches 10 and 11 enable unwind user (fp) to support the following s390
back chain particularities:
- Patch 10 introduces FP/RA restore rule ZERO, which enables s390 back
  chain unwinding, which cannot unwind FP.
- Patch 11 enables sophisticated architecture-specific initialization
  of the FP frame, which enables s390 back chain unwinding to provide
  dynamic information.

Patch 12 adds support for unwinding of user space using back chain on
s390. The main reasons to support back chain on s390 are:
- With Ubuntu there is a major distribution that builds user space with
  back chain.
- Java JREs, such as OpenJDK, maintain the back chain in jitted code.


Limitations:

Unwinding of user space using back chain cannot, by design, restore the
FP. Therefore unwinding of subsequent frames using e.g. SFrame may fail
if the FP is the CFA base register.
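
To illustrate the CFA offset encoding described for patch 6, the
following is a minimal user-space sketch in C. It is not code from this
series. The macro and helper names are hypothetical; only the
adjustment by -160 and the scale factor of 8 are taken from the
description above. The alignment check is an assumption made so that
the division is exact.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the values come from the patch 6 description. */
#define S390_CFA_OFFSET_ADJUSTMENT	(-160)
#define S390_CFA_OFFSET_SCALE		8

/* Encode: adjust the CFA offset by -160, then scale it down by 8. */
static inline int32_t s390_cfa_offset_encode(int32_t offset)
{
	int32_t adjusted = offset + S390_CFA_OFFSET_ADJUSTMENT;

	/* Assumption: adjusted offsets are multiples of the scale. */
	assert(adjusted % S390_CFA_OFFSET_SCALE == 0);
	return adjusted / S390_CFA_OFFSET_SCALE;
}

/* Decode: scale the stored value up by 8, then undo the adjustment. */
static inline int32_t s390_cfa_offset_decode(int32_t encoded)
{
	return encoded * S390_CFA_OFFSET_SCALE - S390_CFA_OFFSET_ADJUSTMENT;
}
```

Under these assumptions a CFA offset of 160 encodes as 0, and a signed
8-bit encoded value spans CFA offsets from -864 to 1176 in steps of 8.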

Thanks and regards,
Jens

Jens Remus (12):
  s390: asm/dwarf.h should only be included in assembly files
  s390/vdso: Avoid emitting DWARF CFI for non-vDSO
  s390/vdso: Keep function symbols in vDSO
  s390/vdso: Enable SFrame V3 generation in vDSO
  unwind_user: Enable archs that define CFA = SP_callsite + offset
  unwind_user/sframe: Enable archs with encoded SFrame CFA offsets
  s390/ptrace: Convert function macros to inline functions
  s390/ptrace: Provide frame_pointer()
  s390/unwind_user/sframe: Enable sframe unwinding on s390
  unwind_user: Introduce FP/RA recovery rule unknown
  unwind_user/fp: Use arch-specific helper to initialize FP frame
  s390/unwind_user/fp: Enable back chain unwinding of user space

 arch/Kconfig                               |   7 ++
 arch/s390/Kconfig                          |   2 +
 arch/s390/include/asm/dwarf.h              |  53 ++++++---
 arch/s390/include/asm/ptrace.h             |  43 +++++--
 arch/s390/include/asm/unwind_user.h        | 132 +++++++++++++++++++++
 arch/s390/include/asm/unwind_user_sframe.h |  21 ++++
 arch/s390/kernel/vdso/Makefile             |   9 +-
 arch/s390/kernel/vdso/vdso.lds.S           |   9 ++
 arch/x86/include/asm/unwind_user.h         |  21 +++-
 arch/x86/include/asm/unwind_user_sframe.h  |   2 +
 include/asm-generic/Kbuild                 |   1 +
 include/asm-generic/unwind_user_sframe.h   |  20 ++++
 include/linux/unwind_user.h                |  19 +--
 include/linux/unwind_user_types.h          |   2 +
 kernel/unwind/sframe.c                     |   5 +-
 kernel/unwind/sframe.h                     |  11 ++
 kernel/unwind/user.c                       |  31 +++--
 17 files changed, 323 insertions(+), 65 deletions(-)
 create mode 100644 arch/s390/include/asm/unwind_user.h
 create mode 100644 arch/s390/include/asm/unwind_user_sframe.h
 create mode 100644 include/asm-generic/unwind_user_sframe.h

--
2.51.0
