Hello Andrew,

On Fri, Apr 24, 2026 at 06:53:25AM -0700, Andrew Morton wrote:
> On Tue, 21 Apr 2026 06:45:03 -0700 Breno Leitao <[email protected]> wrote:
> 
> > I am starting to run kmemleak with verbose mode enabled on some "probe
> > points" across my employer's fleet so that suspected leaks land in
> > dmesg without needing a separate read of /sys/kernel/debug/kmemleak.
> > 
> > The downside is that workloads which leak many objects from a single
> > allocation site flood the console with byte-for-byte identical
> > backtraces. Hundreds of duplicates per scan are common, drowning out
> > distinct leaks and unrelated kernel messages, while adding no signal
> > beyond the first occurrence.
> > 
> > This series collapses those duplicates inside kmemleak itself. Each
> > unique stackdepot trace_handle prints once per scan, followed by a
> > short summary line when more than one object shares it:
> 
> AI review:
>       
> https://sashiko.dev/#/patchset/[email protected]

v2 will address them. Here are answers to some of the questions raised
by Sashiko.

> Can print_unreferenced() access freed memory here and in the fallback
> path above? Since the lock is dropped and reacquired, do we need to
> re-check object->flags & OBJECT_ALLOCATED before printing?

v2 introduces print_leak_locked(), which re-acquires object->lock and gates the
hex dump on OBJECT_ALLOCATED:

        static void print_leak_locked(struct kmemleak_object *object, bool hex_dump)
        {
                raw_spin_lock_irq(&object->lock);
                __print_unreferenced(NULL, object,
                                     hex_dump && (object->flags & OBJECT_ALLOCATED));
                raw_spin_unlock_irq(&object->lock);
        }

hex_dump_object() is the only path that reads object->pointer's user memory;
the rest of the report (backtrace, comm/pid/jiffies, checksum) lives in the
kmemleak_object metadata, which get_object() keeps alive. __delete_object()
clears OBJECT_ALLOCATED under object->lock before the user memory goes away, so
the recheck is sufficient.
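
To spell out the ordering that makes the recheck safe (simplified sketch;
the exact v2 code may differ in details):

        /* deletion side, simplified from __delete_object() */
        raw_spin_lock_irqsave(&object->lock, flags);
        object->flags &= ~OBJECT_ALLOCATED;
        raw_spin_unlock_irqrestore(&object->lock, flags);
        /* only after this may the memory behind object->pointer go away */

        /* printing side, print_leak_locked() */
        raw_spin_lock_irq(&object->lock);
        if (object->flags & OBJECT_ALLOCATED)
                /* hex dump touches object->pointer */
                __print_unreferenced(NULL, object, true);
        raw_spin_unlock_irq(&object->lock);

Both sides serialize on object->lock, so the hex dump cannot race with the
flag being cleared.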

> If get_object(object) failed, it means the object's reference count is
> already 0 and it is actively being deleted. Unconditionally locking and
> dumping it there seems like it will read freed memory.

Fixed in v2 by reordering: get_object() is now attempted before xa_store(),
and on failure we simply skip the object. The leak count was already
incremented, and since the memory has been freed concurrently it is no
longer a leak.
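
Roughly, the reordered path looks like this (a sketch only; the xarray name
and surrounding error handling are illustrative, not the exact v2 code):

        /* take a reference before publishing the object in the xarray */
        if (!get_object(object))
                return; /* being deleted concurrently; nothing to report */

        old = xa_store(&dedup_xa, trace_handle, object, GFP_ATOMIC);

This way the xarray never holds a pointer to an object whose refcount we
failed to raise.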

> What happens to valid memory leaks that failed to record a stack trace (e.g.
> due to memory pressure or context limits)? Will these leaks also be
> permanently ignored in all future scans?

Also fixed in v2. dedup_record() now starts with:

        if (!trace_handle) {
                print_leak_locked(object, true);
                return;
        }

so leaks with trace_handle == 0 (early-boot allocations tracked before
kmemleak_init() set up object_cache, or stack_depot_save() failures under
memory pressure) are printed inline through the same locked helper.
