On Wed, May 20, 2026 at 9:34 PM Muhammad Bilal <[email protected]> wrote:
>
> commit fa747e9f843b ("selftests/bpf: Fix cold_lru producing zero
> batch_hash in XDP LB benchmark") claims the addition ensures the
> multiplier input is "always >= 1". This invariant does not hold after
> wraparound.
> batch_gen is __u32. After 2^32 increments it wraps to 0. On CPU 0,
> bpf_get_smp_processor_id() returns 0:
>
>   batch_gen = 0 (after u32 wraparound)
>   batch_hash = (0 + 0) * KNUTH_HASH_MULT = 0
>   *saddr ^= 0 -> no-op, cold_lru miss counter stays 0
>
> Setting bit 0 before multiplying guarantees a non-zero odd result for
> all possible values of batch_gen and cpu_id, including after wraparound:
>
>   (any_value | 1) >= 1 always, since bit 0 is always set

You say:

> batch_gen is __u32. After 2^32 increments it wraps to 0

A single batch runs for 10ms, and batch_gen is incremented once per
batch, so for it to wrap the benchmark would have to run continuously
for about 2^32 * 10ms, roughly 497 days, with a single producer.
Nobody runs this benchmark for that long.

And even if we wanted to run it that long and fix the wraparound,
"| 1" is not the correct way to do it, because on CPU 0, consecutive
batches with "| 1" give:

  batch_gen=2: (2 + 0) | 1 = 3, batch_hash = 3 * KNUTH
  batch_gen=3: (3 + 0) | 1 = 3, batch_hash = 3 * KNUTH
  batch_gen=4: (4 + 0) | 1 = 5, batch_hash = 5 * KNUTH
  batch_gen=5: (5 + 0) | 1 = 5, batch_hash = 5 * KNUTH

Each even/odd pair of batch_gen values collapses to the same
batch_hash, so half the batches reuse the previous batch's cold
address, which is already warm in the LRU.
