On Fri, Jul 03, 2026 at 04:24:09AM -0700, Breno Leitao wrote:
> On Thu, Jul 02, 2026 at 07:55:42AM -0700, Breno Leitao wrote:
> > On Thu, Jul 02, 2026 at 09:41:14AM +0100, Catalin Marinas wrote:
> > > On Fri, Jun 26, 2026 at 08:52:03AM -0700, Breno Leitao wrote:
> > > > +pass "min_unref_scans=1 immediate; =2 gated to 2nd scan (counts
> > > > $first/$s1/$s2); param read-back ok"
> > >
> > > Are these off by one?
> >
> > They seem to be OK, and I've tested it multiple times.
> >
> > > Kmemleak has a mechanism to detect live objects
> > > via the checksum. A side effect is that on allocation, the checksum is 0
> > > and only after the first scan the checksum is changed.
> >
> > I got the impression that checksum continues to be zero for these
> > objects during the whole life time? (weird).
>
> I've investigated this a bit more and found something interesting in
> our per-CPU checksum. The code in update_checksum() is:
>
> for_each_possible_cpu(cpu) {
> void *ptr = per_cpu_ptr((void __percpu *)object->pointer, cpu);
>
> object->checksum ^= crc32(0, kasan_reset_tag((void *)ptr),
> object->size);
> }
>
> From my naive view, this has two concerns:
>
> 1) In the kernel, crc32(0, <64 zero bytes>, 64) is zero, and the samples' test
> I am using (kmemleak-test.c) has:
>
> pr_info("__alloc_percpu(64, 4) = 0x%px\n", __alloc_percpu(64, 4));
>
> alloc_percpu returns zeroed memory, so we are checksumming zero content.
> Because we are using 0 as the seed, that returns zero.
>
> object->checksum is a bunch of 0 XOR 0 XOR 0 and so forth.

Ah, yes, you are right. Irrespective of the per-cpu xor, I think we
should seed the checksum with something other than 0 (say -1 or some
random clock value).

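To illustrate the seed point, here is a minimal userspace sketch, not
kernel code. It assumes the kernel's crc32() behaves like a bitwise
little-endian CRC32 with no pre/post inversion; sketch_crc32_le() and
sketch_zero_area_crc() are hypothetical names used only for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise little-endian CRC32 with no pre/post inversion.
 * Assumption: this stands in for the kernel's crc32(); hypothetical name.
 */
static uint32_t sketch_crc32_le(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ ((crc & 1) ? 0xedb88320u : 0);
	}
	return crc;
}

/* Checksum of 64 zero bytes, like a freshly zeroed 64-byte per-cpu area. */
static uint32_t sketch_zero_area_crc(uint32_t seed)
{
	unsigned char area[64] = { 0 };

	return sketch_crc32_le(seed, area, sizeof(area));
}
```

With a seed of 0 the CRC state never leaves 0 on all-zero input, so the
result is 0. With a seed of ~0 the result is non-zero, which is what
makes a non-zero initial value distinguishable from "never scanned".
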
> 2) That XOR above seems very weird. Basically, we want to detect whether
> any of those per-cpu areas changed, but the checksum goes to zero if two
> per-cpu areas have identical content.
>
> Let me give you a simple example. We have SMP=2, and both objects have crc32 =
> 0x42. At the end of that function, object->checksum will be zero, given 0x42
> XOR 0x42 is zero.
>
> If both objects change their content to the same new value at the same time,
> object->checksum will continue to be zero (although the content (and each
> per-cpu crc32) HAS changed).
>
> I understand we want to detect any change in any of these per-cpu fields and
> catch it independently of the CPU. I am inclined toward something like this:
>
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -1409,8 +1409,9 @@ static bool update_checksum(struct kmemleak_object *object)
> object->checksum = 0;
> for_each_possible_cpu(cpu) {
> void *ptr = per_cpu_ptr((void __percpu *)object->pointer, cpu);
> + u32 seed = object->checksum + cpu;
>
> - object->checksum ^= crc32(0, kasan_reset_tag((void *)ptr), object->size);
> + object->checksum ^= crc32(seed, kasan_reset_tag((void *)ptr), object->size);

Yeah, the xor wasn't a great idea. What about initialising the checksum
value on object allocation to ~0 (for the two-scans idea) and for
per-cpu, just build the crc on top of the previous crc, something like:

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 7c7ba17ce7af..e196f53f9b46 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -687,7 +687,7 @@ static struct kmemleak_object *__alloc_object(gfp_t gfp)
atomic_set(&object->use_count, 1);
object->excess_ref = 0;
object->count = 0; /* white color initially */
- object->checksum = 0;
+ object->checksum = ~0;
object->del_state = 0;
/* task information */
@@ -981,7 +981,7 @@ static void reset_checksum(unsigned long ptr)
}
raw_spin_lock_irqsave(&object->lock, flags);
- object->checksum = 0;
+ object->checksum = ~0;
raw_spin_unlock_irqrestore(&object->lock, flags);
put_object(object);
}
@@ -1410,7 +1410,8 @@ static bool update_checksum(struct kmemleak_object *object)
for_each_possible_cpu(cpu) {
void *ptr = per_cpu_ptr((void __percpu *)object->pointer, cpu);
- object->checksum ^= crc32(0, kasan_reset_tag((void *)ptr), object->size);
+ object->checksum = crc32(object->checksum,
+ kasan_reset_tag((void *)ptr), object->size);
}
} else {
object->checksum = crc32(0, kasan_reset_tag((void *)object->pointer), object->size);
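
As a rough userspace illustration of why chaining helps (a sketch under
assumptions, not the kernel implementation): two per-cpu areas with
identical content cancel under xor, while feeding the previous crc in as
the seed does not. The SKETCH_* constants and sketch_* helpers are
hypothetical, and the bitwise CRC is assumed to match the kernel's crc32().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NR_CPUS	2	/* assumed CPU count for the example */
#define SKETCH_OBJ_SIZE	2	/* assumed per-cpu object size in bytes */

/* Bitwise little-endian CRC32, no pre/post inversion (assumed kernel-like). */
static uint32_t sketch_crc32_le(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ ((crc & 1) ? 0xedb88320u : 0);
	}
	return crc;
}

/* Old scheme: xor the independent per-cpu CRCs together. */
static uint32_t sketch_xor_checksum(unsigned char areas[SKETCH_NR_CPUS][SKETCH_OBJ_SIZE])
{
	uint32_t sum = 0;

	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		sum ^= sketch_crc32_le(0, areas[cpu], SKETCH_OBJ_SIZE);
	return sum;
}

/* Chained scheme: each per-cpu CRC is seeded with the previous result. */
static uint32_t sketch_chained_checksum(uint32_t seed,
					unsigned char areas[SKETCH_NR_CPUS][SKETCH_OBJ_SIZE])
{
	uint32_t sum = seed;

	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		sum = sketch_crc32_le(sum, areas[cpu], SKETCH_OBJ_SIZE);
	return sum;
}

/* Give every per-cpu area the same content. */
static void sketch_fill_areas(unsigned char areas[SKETCH_NR_CPUS][SKETCH_OBJ_SIZE],
			      unsigned char value)
{
	memset(areas, value, SKETCH_NR_CPUS * SKETCH_OBJ_SIZE);
}
```

Filling both areas with 0x42 and then with 0x43 leaves the xor variant at
0 both times, whereas the chained variant returns different values for
the two contents.
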
--
Catalin