On Tue, 2026-05-12 at 06:41 +0000, [email protected] wrote:
[...]
> When a BPF program holds an owning or refcount-acquired reference to
> one of these nodes (node X), which is structurally supported because
> __bpf_obj_drop_impl() uses refcount_dec_and_test() and only frees at
> refcount 0, a concurrent push to a DIFFERENT bpf_list_head becomes a
> corruption:
>
>   CPU 0 (bpf_list_head_free, lock released)   CPU 1 (BPF prog, refcount X)
>   -----------------------------------------   ----------------------------
>   (owner of X == NULL, X linked in drain)
>                                               bpf_list_push_back(other, X)
>                                                 __bpf_list_add: spin_lock()
>                                                 cmpxchg(X->owner, NULL,
>                                                         POISON) -> OK
>                                                 list_add_tail(&X->list_head,
>                                                               other_head)
>                                                 -> overwrites X->next,
>                                                    X->prev, corrupts
>                                                    other_head's chain
>                                                    because X is still
>                                                    stitched into drain
>   pos = drain.next;   (may be X or neighbor using X's stale next)
>   list_del_init(pos); reads X->next/prev now pointing into other_head,
>                       corrupts other_head's list and/or drain

Kaitao, this scenario seems plausible. Could you please comment on it?

[...]
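
For reference, the pointer arithmetic behind the diagram can be replayed in userspace. The sketch below is only a single-threaded model of one possible interleaving, not kernel code: struct list_head and the model_* helpers are simplified stand-ins for the kernel list primitives, and the nodes a, b, c, the result struct, and simulate_interleaving() are hypothetical names made up for illustration. It assumes drain holds A, X, B and the other list holds C when the push happens.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void model_init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Same pointer updates as a plain tail insert, without debug checks. */
static void model_list_add_tail(struct list_head *n, struct list_head *head)
{
	struct list_head *prev = head->prev;

	n->next = head;
	n->prev = prev;
	prev->next = n;
	head->prev = n;
}

/* Unlink an entry through its own next/prev, then reinitialize it. */
static void model_list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	model_init_list_head(e);
}

/* True if walking forward returns to head with matching back pointers. */
static bool model_list_consistent(const struct list_head *head, int max_steps)
{
	const struct list_head *p = head;

	for (int i = 0; i <= max_steps; i++) {
		if (p->next->prev != p)
			return false;
		p = p->next;
		if (p == head)
			return true;
	}
	return false;
}

/* Hypothetical result record for the replayed interleaving. */
struct race_result {
	bool drain_ok_after_push;      /* drain intact after CPU 1's push? */
	bool other_ok_after_drain;     /* other list intact after CPU 0 resumes? */
	bool drain_next_is_other_head; /* drain now points at the other head? */
};

static struct race_result simulate_interleaving(void)
{
	struct list_head drain, other, a, x, b, c;
	struct race_result r;

	model_init_list_head(&drain);
	model_init_list_head(&other);
	model_list_add_tail(&a, &drain);
	model_list_add_tail(&x, &drain);
	model_list_add_tail(&b, &drain);
	model_list_add_tail(&c, &other);

	/* "CPU 1": push X onto the other list while it is still on drain. */
	model_list_add_tail(&x, &other);
	r.drain_ok_after_push = model_list_consistent(&drain, 8);

	/* "CPU 0": the drain loop resumes and pops two entries. */
	model_list_del_init(drain.next); /* removes A, rewrites X->prev */
	model_list_del_init(drain.next); /* removes X via hijacked pointers */
	r.other_ok_after_drain = model_list_consistent(&other, 8);
	r.drain_next_is_other_head = (drain.next == &other);

	return r;
}
```

In this model, drain is already inconsistent right after the push, and once the drain loop has removed A and X, the other list is broken as well and drain.next points at the other list's head. Whether the real code can reach this interleaving depends on the locking and owner checks discussed above, which the model does not represent.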

