On Fri, Apr 24, 2026 at 08:03:36AM -0700, Breno Leitao wrote:
> In kmemleak's verbose mode, every unreferenced object found during a
> scan is logged with its full header, hex dump and 16-frame backtrace.
> Workloads that leak many objects from a single allocation site flood
> dmesg with byte-for-byte identical backtraces, drowning out distinct
> leaks and other kernel messages.
> 
> Dedupe within each scan using stackdepot's trace_handle as the key: for
> every leaked object with a recorded stack trace, look up the
> representative kmemleak_object in a per-scan xarray keyed by
> trace_handle. The first sighting stores the object pointer (with a
> get_object() reference) and sets object->dup_count to 1; later
> sightings just bump dup_count on the representative. After the scan,
> walk the xarray once and emit each unique backtrace, followed by a
> single summary line when more than one object shares it.
> 
> Leaks whose trace_handle is 0 (early-boot allocations tracked before
> kmemleak_init() set up object_cache, or stack_depot_save() failures
> under memory pressure) cannot be deduped, so they are still printed
> inline via the same locked OBJECT_ALLOCATED-checked helper. The
> contents of /sys/kernel/debug/kmemleak are unchanged - only the
> verbose console output is collapsed.
> 
> Safety notes:
> 
>  - The xarray store happens outside object->lock: object->lock is a
>    raw spinlock, while xa_store() may grab xa_node slab locks at a
>    higher wait-context level which lockdep flags as invalid.
>    trace_handle is captured under object->lock (which serialises with
>    kmemleak_update_trace()'s writer), so it is safe to use after
>    dropping the lock.
> 
>  - get_object() pins the kmemleak_object metadata across
>    rcu_read_unlock(), but the underlying tracked allocation can still
>    be freed concurrently. The deferred print path therefore re-acquires
>    object->lock and re-checks OBJECT_ALLOCATED via print_leak_locked()
>    before touching object->pointer; __delete_object() clears that flag
>    under the same lock before the user memory goes away. The same
>    helper is used by the trace_handle == 0 and xa_store() failure
>    fallbacks, so every printer in the new path has identical safety
>    guarantees.
> 
>  - If get_object() fails after we set OBJECT_REPORTED, the object is
>    already being torn down (use_count hit zero); the leak count is
>    still accurate but the verbose line is dropped, which is correct:
>    the memory was freed concurrently and is no longer a leak.
> 
>  - If xa_store() fails to allocate an xa_node under memory pressure,
>    we fall back to printing inline via print_leak_locked() instead of
>    silently dropping the leak.
> 
>  - The hex dump is skipped for coalesced entries (dup_count > 1):
>    bytes would differ across objects sharing a backtrace anyway, and
>    skipping it removes the only remaining read of object->pointer's
>    contents in the deferred path. The representative's reported size
>    may also differ from the coalesced objects' sizes; the printed
>    trace_handle reflects the representative's current value rather
>    than the value used as the dedup key; the two are normally, though
>    not strictly, identical.
> 
> Signed-off-by: Breno Leitao <[email protected]>

Reviewed-by: Catalin Marinas <[email protected]>
