From: Pengfei Li <[email protected]>

Hi Steven, all,

This series adds stack trace deduplication to ftrace, reducing ring
buffer usage by ~80% when stacktrace is enabled.

Problem:
When the stacktrace option is enabled, each trace event stores a full
kernel stack (typically 10-20 frames x 8 bytes = 80-160 bytes). On
production devices with 4-8MB trace buffers, this fills the buffer in
seconds, limiting the usefulness of boot-time tracing and always-on
performance monitoring.

Solution:
A lock-free hash map (modeled after tracing_map.c as suggested by
Steven [1]) that deduplicates stack traces. The ring buffer stores
only a 4-byte stack_id; full stacks are exported via tracefs.

Design (following tracing_map.c pattern):
- Lock-free insert via cmpxchg (NMI/IRQ/any context safe)
- Pre-allocated element pool (zero allocation on hot path)
- Linear probing with 2x over-provisioned table
- Per-trace_array instance support

We adopted the same lock-free algorithm as tracing_map but with a
purpose-built data structure, because tracing_map's API is designed
for histogram aggregation with fixed-size keys and sum/var fields,
while our use case requires variable-length stack traces with
reference counting.

Test results (ARM64, Qualcomm SM8850, kernel 6.12):
- kmem_cache_alloc events, 1 second capture:
  774 unique stacks, 8264 hits, 0 drops, 100% hit rate
  Ring buffer savings: 795KB -> 176KB (78% reduction)
- Function tracer, 3 seconds:
  3632 unique stacks, 25466 hits, 0 drops
  Ring buffer savings: 2.5MB -> 653KB (74% reduction)

Note: An earlier prototype using rhashtable crashed in IRQ context
(BUG at rhashtable.h:912), which led us to adopt the tracing_map
cmpxchg-based approach.

Usage:
  echo 1 > /sys/kernel/debug/tracing/options/stackmap
  echo 1 > /sys/kernel/debug/tracing/options/stacktrace
  # trace output: <stack_id 42>
  # resolve:      cat /sys/kernel/debug/tracing/stack_map

[1] https://lore.kernel.org/all/20260513085145.30dd23e0@fedora/

Pengfei Li (3):
  trace: add lock-free stackmap for stack trace deduplication
  trace: integrate stackmap into ftrace stack recording path
  trace: add documentation, selftest and tooling for stackmap

 Documentation/trace/ftrace-stackmap.rst       | 111 ++++
 kernel/trace/Kconfig                          |  21 +
 kernel/trace/Makefile                         |   1 +
 kernel/trace/trace.c                          |  46 ++
 kernel/trace/trace.h                          |  16 +
 kernel/trace/trace_entries.h                  |  15 +
 kernel/trace/trace_output.c                   |  23 +
 kernel/trace/trace_stackmap.c                 | 569 ++++++++++++++++++
 kernel/trace/trace_stackmap.h                 |  54 ++
 .../ftrace/test.d/ftrace/stackmap-basic.tc    |  74 +++
 tools/tracing/stackmap_dump.py                | 120 ++++
 11 files changed, 1050 insertions(+)
 create mode 100644 Documentation/trace/ftrace-stackmap.rst
 create mode 100644 kernel/trace/trace_stackmap.c
 create mode 100644 kernel/trace/trace_stackmap.h
 create mode 100755 
tools/testing/selftests/ftrace/test.d/ftrace/stackmap-basic.tc
 create mode 100755 tools/tracing/stackmap_dump.py

-- 
2.34.1


Reply via email to