On Fri, Feb 06, 2026 at 06:34:09PM +0900, Harry Yoo wrote:
> Currently, kfree_rcu() cannot be called in an NMI context.
> In such a context, even calling call_rcu() is not legal,
> forcing users to implement deferred freeing.
>
> Make users' lives easier by introducing kfree_rcu_nolock() variant.
> Unlike kfree_rcu(), kfree_rcu_nolock() only supports a 2-argument
> variant, because, in the worst case where memory allocation fails,
> the caller cannot synchronously wait for the grace period to finish.
>
> Similar to kfree_nolock() implementation, try to acquire kfree_rcu_cpu
> spinlock, and if that fails, insert the object to per-cpu lockless list
> and delay freeing using irq_work that calls kvfree_call_rcu() later.
> In case kmemleak or debugobjects is enabled, always defer freeing as
> those debug features don't support NMI contexts.
>
> When trylock succeeds, avoid consuming bnode and run_page_cache_worker()
> altogether. Instead, insert objects into struct kfree_rcu_cpu.head
> without consuming additional memory.
>
> For now, the sheaves layer is bypassed if spinning is not allowed.
>
> Scheduling delayed monitor work in an NMI context is tricky; use
> irq_work to schedule, but use lazy irq_work to avoid raising self-IPIs.
> That means scheduling delayed monitor work can be delayed up to the
> length of a time slice.
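For readers following along, a rough sketch of the slow path described
above (object goes onto a per-cpu lockless list, a lazy irq_work later
hands it to kvfree_call_rcu()) could look roughly like the code below.
The names deferred_rcu_free, drain_deferred_free() and
try_queue_on_kfree_rcu_cpu() are made up for illustration and are not
what the patch uses, and the sketch assumes the object's rcu_head sits
at offset 0 so its storage can double as the llist_node while the free
is deferred:

#include <linux/container_of.h>
#include <linux/irq_work.h>
#include <linux/llist.h>
#include <linux/percpu.h>
#include <linux/rcupdate.h>

/*
 * Illustrative per-cpu state, not the patch's actual data structures:
 * a lockless list of objects whose freeing had to be deferred (e.g.
 * because we are in NMI context and could not take the kfree_rcu_cpu
 * lock), plus a lazy irq_work that drains the list later.
 */
struct deferred_rcu_free {
	struct llist_head objects;
	struct irq_work   work;
};

static void drain_deferred_free(struct irq_work *work);

static DEFINE_PER_CPU(struct deferred_rcu_free, deferred_rcu_free) = {
	.objects = LLIST_HEAD_INIT(deferred_rcu_free.objects),
	/* Lazy: queueing never raises a self-IPI; runs from a later tick. */
	.work    = IRQ_WORK_INIT_LAZY(drain_deferred_free),
};

/*
 * Stand-in for the patch's fast path (trylock the per-cpu kfree_rcu_cpu
 * lock and queue onto kfree_rcu_cpu.head). Always "fails" here so the
 * sketch stays self-contained.
 */
static bool try_queue_on_kfree_rcu_cpu(void *obj)
{
	return false;
}

/* Runs from hard-IRQ context once the lazy irq_work fires. */
static void drain_deferred_free(struct irq_work *work)
{
	struct deferred_rcu_free *df =
		container_of(work, struct deferred_rcu_free, work);
	struct llist_node *pos, *next;

	/* Pick up everything that was queued from NMI context... */
	llist_for_each_safe(pos, next, llist_del_all(&df->objects)) {
		/*
		 * ...and hand it to the regular kvfree_call_rcu() path,
		 * which is fine now that we are no longer in NMI.
		 * Simplification: the rcu_head is assumed to live at
		 * offset 0 of the object, so head and ptr coincide.
		 */
		kvfree_call_rcu((struct rcu_head *)pos, (void *)pos);
	}
}

/* NMI-safe free: fast path if the lock is uncontended, otherwise defer. */
static void kfree_rcu_nolock_sketch(void *obj)
{
	struct deferred_rcu_free *df = this_cpu_ptr(&deferred_rcu_free);

	if (try_queue_on_kfree_rcu_cpu(obj))
		return;

	/* llist_add() is lockless and irq_work_queue() is NMI-safe. */
	llist_add((struct llist_node *)obj, &df->objects);
	irq_work_queue(&df->work);
}

The point of IRQ_WORK_INIT_LAZY here is that queueing from NMI does not
raise a self-IPI; the work simply runs from a later timer tick, which is
why, as the changelog notes, the deferred processing can be delayed by
roughly a time slice.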
By the way, this last part (scheduling the delayed monitor work via lazy
irq_work) is still not optimal. Unfortunately we can't use workqueues in
NMI context, so we need a trick that avoids irq_work when possible
without forgetting to drain the batches later.

--
Cheers,
Harry / Hyeonggon
