On Mon, Jan 19, 2026 at 5:50 PM Leon Hwang <[email protected]> wrote: > > > > On 20/1/26 03:47, Daniel Borkmann wrote: > > On 1/19/26 3:21 PM, Leon Hwang wrote: > >> Switch the free-node pop paths to raw_spin_trylock*() to avoid blocking > >> on contended LRU locks. > >> > >> If the global or per-CPU LRU lock is unavailable, refuse to refill the > >> local free list and return NULL instead. This allows callers to back > >> off safely rather than blocking or re-entering the same lock context. > >> > >> This change avoids lockdep warnings and potential deadlocks caused by > >> re-entrant LRU lock acquisition from NMI context, as shown below: > >> > >> [ 418.260323] bpf_testmod: oh no, recursing into test_1, > >> recursion_misses 1 > >> [ 424.982207] ================================ > >> [ 424.982216] WARNING: inconsistent lock state > >> [ 424.982223] inconsistent {INITIAL USE} -> {IN-NMI} usage. > >> [ 424.982314] *** DEADLOCK *** > >> [...] > >> > >> Signed-off-by: Leon Hwang <[email protected]> > >> --- > >> kernel/bpf/bpf_lru_list.c | 17 ++++++++++------- > >> 1 file changed, 10 insertions(+), 7 deletions(-) > > > > Documentation/bpf/map_lru_hash_update.dot needs update? > > > > Yes, it needs update. > > >> diff --git a/kernel/bpf/bpf_lru_list.c b/kernel/bpf/bpf_lru_list.c > >> index c091f3232cc5..03d37f72731a 100644 > >> --- a/kernel/bpf/bpf_lru_list.c > >> +++ b/kernel/bpf/bpf_lru_list.c > >> @@ -312,14 +312,15 @@ static void bpf_lru_list_push_free(struct > >> bpf_lru_list *l, > >> raw_spin_unlock_irqrestore(&l->lock, flags); > >> } > >> -static void bpf_lru_list_pop_free_to_local(struct bpf_lru *lru, > >> +static bool bpf_lru_list_pop_free_to_local(struct bpf_lru *lru, > >> struct bpf_lru_locallist *loc_l) > >> { > >> struct bpf_lru_list *l = &lru->common_lru.lru_list; > >> struct bpf_lru_node *node, *tmp_node; > >> unsigned int nfree = 0; > >> - raw_spin_lock(&l->lock); > >> + if (!raw_spin_trylock(&l->lock)) > >> + return false; > >> > > > > Could you provide some more analysis, and the effect this has on real-world > > programs? Presumably they'll unexpectedly encounter a lot more frequent > > -ENOMEM as an error on bpf_map_update_elem even though memory might be > > available just that locks are contended? > > > > Also, have you considered rqspinlock as a potential candidate to discover > > deadlocks? > > Thanks for the questions. > > While I haven’t encountered this issue in production systems myself, the > deadlock has been observed repeatedly in practice, including the cases > shown in the cover letter. It can also be reproduced reliably when > running the LRU tests locally, so this is a real and recurring problem. > > I agree that returning -ENOMEM when locks are contended is not ideal. > Using -EBUSY would better reflect the situation where memory is > available but forward progress is temporarily blocked by lock > contention. I can update the patch accordingly. > > Regarding rqspinlock: as mentioned in the cover letter, Menglong > previously explored using rqspinlock to address these deadlocks but was > unable to arrive at a complete solution. After further off-list > discussion, we agreed that using trylock is a more practical approach > here. In most observed cases, the lock contention leading to deadlock > occurs in bpf_common_lru_pop_free(), and trylock allows callers to back > off safely rather than risking re-entrancy and deadlock.
Sorry, trylock is not an option here. We are not going to sacrifice LRU map reliability for the sake of syzbot.

