Hello,

On Tue, Jan 27, 2026 at 04:05:35PM +0100, Jens Remus wrote:
> This is the implementation of parsing the SFrame V3 stack trace information
> from an .sframe section in an ELF file. It's a continuation of Josh's and
> Steve's work that can be found here:
>
>   https://lore.kernel.org/all/[email protected]/
>   https://lore.kernel.org/all/[email protected]/
>
> Currently the only way to get a user space stack trace from a stack
> walk (and not just copying large amounts of user stack into the kernel
> ring buffer) is to use frame pointers. This has a few issues. The biggest
> one is that compiling frame pointers into every application and library
> has been shown to cause performance overhead.
>
> Another issue is that the format of the frames may not always be consistent
> between different compilers, and some architectures (s390) have no defined
> format to do a reliable stack walk. The only way to perform user space
> profiling on these architectures is to copy the user stack into the kernel
> buffer.
>
> SFrame [1] is now supported in binutils (x86-64, ARM64, and s390). There are
> discussions going on about supporting SFrame in LLVM. SFrame acts more like
> ORC, and lives in the ELF executable file as its own section. Like ORC it
> has two tables, where the first table is sorted by instruction pointer (IP).
> Looking up the current IP in the first table takes you to the second table,
> which tells you where the return address of the current function is located.
> You can then look that return address up in the first table to find the
> return address of the calling function, and so on. This performs a user
> space stack walk.
>
> Now because the .sframe section lives in the ELF file it needs to be faulted
> into memory when it is used. This means that walking the user space stack
> requires being in a faultable context. As profilers like perf request a stack
> trace in interrupt or NMI context, the walk cannot be done when it is
> requested. Instead it must be deferred until it is safe to fault in user
> space. One place this is known to be safe is when the task is about to return
> back to user space.
>
> This series makes the deferred unwind user code implement SFrame format V3
> and enables it on x86-64.
>
> [1]: https://sourceware.org/binutils/wiki/sframe
>
>
> This series applies on top of the tip perf/core branch:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git perf/core
>
> The user space programs (and libraries) to be stack-traced need to be
> built with the recent SFrame stack trace information format V3, as
> generated by the upcoming binutils 2.46 with assembler option --gsframe.
> It can be built from source from the binutils-2_46-branch branch:
>
>   git://sourceware.org/git/binutils-gdb.git binutils-2_46-branch
>
> Namhyung Kim's related perf tools deferred callchain support can be used
> for testing ("perf record --call-graph fp,defer" and "perf report/script").

Is it possible for users to choose the unwinder (frame pointer or SFrame)
at runtime? I feel like the option should be "--call-graph sframe,defer",
or just "--call-graph sframe" if it always uses deferred unwinding.

Thanks,
Namhyung

>
>
> Changes since v12 (see patch notes for details):
> - Rebase on tip perf/core branch (d55c571e4333).
> - Add support for SFrame V3, including its new flexible FDEs. SFrame V2
>   is not supported.
>
> Changes since v11 (see patch notes for details):
> - Rebase on tip master branch (f8fdee44bf2f) with Namhyung Kim's
>   perf/defer-callchain-v4 branch merged on top.
> - Adjust to Peter's latest unwind user enhancements.
> - Simplify logic by using an internal SFrame FDE representation, whose
>   FDE function start address field is an address instead of a PC-relative
>   offset (from FDE).
> - Rename struct sframe_fre to sframe_fre_internal to align with
>   struct sframe_fde_internal.
> - Remove unused pt_regs from unwind_user_next_common() and its
>   callers. (Peter)
> - Simplify unwind_user_next_sframe(). (Peter)
> - Fix a few checkpatch errors and warnings.
> - Minor cleanups (e.g. move includes, fix indentation).
>
> Changes since v10:
> - Support for SFrame V2 PC-relative FDE function start address.
> - Support for SFrame V2 representing RA undefined as indication for
>   outermost frames.
>
>
> Patches 1, 4, 11, and 17 have been updated to exclusively support the
> latest SFrame V3 stack trace information format, which is generated by
> the upcoming binutils 2.46 release. Old SFrame V2 sections get rejected
> with dynamic debug message "bad/unsupported sframe header".
>
> Patches 7 and 8 add support to unwind user (sframe) for outermost frames.
>
> Patches 12-15 add support to unwind user (sframe) for the new SFrame V3
> flexible FDEs.
>
> Patch 16 improves the performance of searching the SFrame FRE for an IP.
>
> Regards,
> Jens
>
>
> Jens Remus (7):
>   unwind_user: Stop when reaching an outermost frame
>   unwind_user/sframe: Add support for outermost frame indication
>   unwind_user: Enable archs that pass RA in a register
>   unwind_user: Flexible FP/RA recovery rules
>   unwind_user: Flexible CFA recovery rules
>   unwind_user/sframe: Add support for SFrame V3 flexible FDEs
>   unwind_user/sframe: Separate reading of FRE from reading of FRE data
>     words
>
> Josh Poimboeuf (11):
>   unwind_user/sframe: Add support for reading .sframe headers
>   unwind_user/sframe: Store .sframe section data in per-mm maple tree
>   x86/uaccess: Add unsafe_copy_from_user() implementation
>   unwind_user/sframe: Add support for reading .sframe contents
>   unwind_user/sframe: Detect .sframe sections in executables
>   unwind_user/sframe: Wire up unwind_user to sframe
>   unwind_user/sframe: Remove .sframe section on detected corruption
>   unwind_user/sframe: Show file name in debug output
>   unwind_user/sframe: Add .sframe validation option
>   unwind_user/sframe/x86: Enable sframe unwinding on x86
>   unwind_user/sframe: Add prctl() interface for registering .sframe
>     sections
>
>  MAINTAINERS                               |   1 +
>  arch/Kconfig                              |  23 +
>  arch/x86/Kconfig                          |   1 +
>  arch/x86/include/asm/mmu.h                |   2 +-
>  arch/x86/include/asm/uaccess.h            |  39 +-
>  arch/x86/include/asm/unwind_user.h        |  69 +-
>  arch/x86/include/asm/unwind_user_sframe.h |  12 +
>  fs/binfmt_elf.c                           |  48 +-
>  include/linux/mm_types.h                  |   3 +
>  include/linux/sframe.h                    |  60 ++
>  include/linux/unwind_user.h               |  18 +
>  include/linux/unwind_user_types.h         |  46 +-
>  include/uapi/linux/elf.h                  |   1 +
>  include/uapi/linux/prctl.h                |   6 +-
>  kernel/fork.c                             |  10 +
>  kernel/sys.c                              |   9 +
>  kernel/unwind/Makefile                    |   3 +-
>  kernel/unwind/sframe.c                    | 840 ++++++++++++++++++++++
>  kernel/unwind/sframe.h                    |  87 +++
>  kernel/unwind/sframe_debug.h              |  68 ++
>  kernel/unwind/user.c                      | 105 ++-
>  mm/init-mm.c                              |   2 +
>  22 files changed, 1414 insertions(+), 39 deletions(-)
>  create mode 100644 arch/x86/include/asm/unwind_user_sframe.h
>  create mode 100644 include/linux/sframe.h
>  create mode 100644 kernel/unwind/sframe.c
>  create mode 100644 kernel/unwind/sframe.h
>  create mode 100644 kernel/unwind/sframe_debug.h
>
> --
> 2.51.0
>
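
For anyone following the thread, the two-table walk described in the quoted
cover letter can be pictured roughly as below. This is only a toy sketch in
user space C, not the code from the series and not the SFrame on-disk format:
the names toy_fde, toy_fre, find_fde(), find_fre() and toy_unwind() are all
made up for illustration, the stack is modeled as an array of 64-bit slots,
and each second-table row is assumed to hold just a CFA offset from SP and an
RA offset from the CFA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT the real SFrame layout. */
struct toy_fre {		/* second table: rows sorted by start_off */
	uint32_t start_off;	/* offset from function start where the row applies */
	int32_t cfa_off;	/* CFA = SP + cfa_off (in stack slots) */
	int32_t ra_off;		/* return address is stored at CFA + ra_off */
};

struct toy_fde {		/* first table: entries sorted by func_start */
	uint64_t func_start;
	uint64_t func_size;
	const struct toy_fre *fres;
	size_t nr_fres;
};

/* Binary search the sorted first table for the function containing ip. */
static const struct toy_fde *find_fde(const struct toy_fde *t, size_t n,
				      uint64_t ip)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ip < t[mid].func_start)
			hi = mid;
		else if (ip >= t[mid].func_start + t[mid].func_size)
			lo = mid + 1;
		else
			return &t[mid];
	}
	return NULL;
}

/* Pick the last row whose start offset is <= the offset of ip. */
static const struct toy_fre *find_fre(const struct toy_fde *fde, uint64_t ip)
{
	const struct toy_fre *best = NULL;
	uint64_t off = ip - fde->func_start;

	for (size_t i = 0; i < fde->nr_fres; i++)
		if (fde->fres[i].start_off <= off)
			best = &fde->fres[i];
	return best;
}

/*
 * Walk the toy stack: record ip, look it up in both tables, load the
 * return address, and repeat with the caller's ip and sp. Stops when an
 * ip has no table entry or a slot falls outside the stack array.
 * Returns the number of frames recorded in out[].
 */
size_t toy_unwind(const struct toy_fde *t, size_t n,
		  const uint64_t *stack, size_t nr_slots,
		  uint64_t ip, size_t sp, uint64_t *out, size_t max)
{
	size_t depth = 0;

	assert(max > 0);
	while (depth < max) {
		const struct toy_fde *fde = find_fde(t, n, ip);
		const struct toy_fre *fre = fde ? find_fre(fde, ip) : NULL;
		int64_t cfa, ra_slot;

		if (!fre)
			break;
		out[depth++] = ip;

		cfa = (int64_t)sp + fre->cfa_off;
		ra_slot = cfa + fre->ra_off;
		if (cfa < 0 || ra_slot < 0 || (size_t)ra_slot >= nr_slots)
			break;

		ip = stack[ra_slot];	/* caller's IP */
		sp = (size_t)cfa;	/* caller's SP */
	}
	return depth;
}
```

In the real series the table reads go through user memory and may fault,
which is why the walk has to be deferred to a faultable context; the toy
above skips all of that by reading from a plain array.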
