On Fri, Feb 06, 2026 at 06:34:03PM +0900, Harry Yoo wrote:
> These are a few improvements for k[v]free_rcu() API, which were suggested
> by Alexei Starovoitov.
> 
> [ To kmemleak folks: I'm going to teach delete_object_full() and
>   paint_ptr() to ignore cases when the object does not exist.
>   Could you please let me know if the way it's done in patch 3
>   looks good? Only part 2 is relevant to you. ]

On what commit should I apply this series?  I get conflicts on top of -rcu
(no surprise there) and build errors on top of next-20260205.

                                                        Thanx, Paul

> Although I've put some effort into providing a decent quality
> implementation, I'd like you to consider this as a proof-of-concept
> and let's discuss how best we could tackle those problems:
> 
>   1) Allow an 8-byte field to be used as an alternative to
>      struct rcu_head (16-byte) for 2-argument kvfree_rcu()
>   2) kmalloc_nolock() -> kfree[_rcu]() support
>   3) Add kfree_rcu_nolock() for NMI context
> 
> # Part 1. Allow an 8-byte field to be used as an alternative to
>   struct rcu_head for 2-argument kvfree_rcu()
>   
>   Technically, objects that are freed with k[v]free_rcu() need
>   only one pointer to link objects, because we already know that
>   the callback function is always kvfree(). For this purpose,
>   struct rcu_head is unnecessarily large (16 bytes on 64-bit).
> 
>   Allow a smaller, 8-byte field (of struct rcu_ptr type) to be used
>   with k[v]free_rcu(). Let's save one pointer per slab object.
>   
>   I have to admit that my naming skill isn't great; hopefully
>   we'll come up with a better name than `struct rcu_ptr`.
> 
>   With this feature, either a struct rcu_ptr or rcu_head field
>   can be used as the second argument of the k[v]free_rcu() API.
> 
>   Users that only use k[v]free_rcu() are highly encouraged to use
>   struct rcu_ptr; otherwise you're wasting memory. However, some users,
>   such as maple tree, may use call_rcu() or k[v]free_rcu() depending on
>   the situation for objects of the same type. For such users,
>   struct rcu_head remains the only option.
> 
>   Patch 1 implements this feature, and patch 2 adds a few users in mm/.
> 
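
For readers following along, here is a minimal userspace sketch of the
size argument in Part 1. It is not the code from patch 1: the demo_*
names are hypothetical, free() stands in for kvfree(), grace periods
are not modelled, and passing the field offset at drain time is an
assumption made only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rcu_head: a link plus a callback. */
struct demo_rcu_head {
	struct demo_rcu_head *next;
	void (*func)(struct demo_rcu_head *head);
};

/* Hypothetical stand-in for the proposed struct rcu_ptr: a link only. */
struct demo_rcu_ptr {
	struct demo_rcu_ptr *next;
};

/* An object that embeds the one-pointer field. */
struct demo_obj {
	int payload;
	struct demo_rcu_ptr rcu;
};

/* A batch of objects waiting to be freed. */
struct demo_batch {
	struct demo_rcu_ptr *head;
};

/* Link the object into the batch using only its one-pointer field. */
static void demo_defer_free(struct demo_batch *b, struct demo_rcu_ptr *p)
{
	assert(p != NULL);
	p->next = b->head;
	b->head = p;
}

/*
 * Free every queued object. No per-object callback is stored because the
 * "callback" is always free(); the start of each object is recovered
 * from the field address and the field offset (an assumption here).
 */
static size_t demo_drain(struct demo_batch *b, size_t offset)
{
	struct demo_rcu_ptr *p = b->head;
	size_t freed = 0;

	b->head = NULL;
	while (p) {
		struct demo_rcu_ptr *next = p->next;

		free((char *)p - offset);
		freed++;
		p = next;
	}
	return freed;
}
```

On a typical 64-bit build the first struct is two pointers wide and the
second is one pointer wide, which is where the saved pointer per object
comes from.
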
> # Part 2. kmalloc_nolock() -> kfree() or kfree_rcu() path support
>   
>   Allow objects allocated with kmalloc_nolock() to be freed with
>   kfree[_rcu](). Without this support, users are forced to call
>   call_rcu() with kfree_nolock() to free objects after a grace period.
>   This is inefficient and can create more grace periods than necessary
>   by bypassing the kfree_rcu batching layer.
> 
>   This was not supported before because some alloc hooks are not
>   called in kmalloc_nolock(), while all free hooks are called in
>   kfree().
> 
>   Patch 3 adds support for this by teaching kmemleak to ignore cases
>   when free hooks are called without prior alloc hooks. Patch 4 frees
>   a bit in enum objexts_flags, since we no longer have to remember
>   whether the array was allocated using kmalloc_nolock() or kmalloc().
> 
>   Note that the free hooks fall into these categories:
> 
>   - Its alloc hook is called in kmalloc_nolock(), no problem!
>     (kmsan_slab_alloc(), kasan_slab_alloc(),
>      memcg_slab_post_alloc_hook(), alloc_tagging_slab_alloc_hook())
> 
>   - Its alloc hook isn't called in kmalloc_nolock(); free hooks
>     must handle asymmetric hook calls. (kfence_free(),
>     kmemleak_free_recursive())
> 
>   - There is no matching alloc hook for the free hook; it's safe to
>     call. (debug_check_no_{locks,obj}_freed, __kcsan_check_access())
> 
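
The second category above (a free hook that must tolerate a missing
alloc hook) can be illustrated with a small userspace sketch. This is
not kmemleak code: the tracker, its fixed-size table, and the "strict"
switch are all hypothetical and only show the behavioural difference
between warning on an unknown pointer and silently ignoring it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_TRACKED 16

/* Hypothetical object tracker; a zero-initialized struct is empty. */
struct demo_tracker {
	const void *objs[DEMO_MAX_TRACKED];
	bool strict;            /* true: untracked frees count as warnings */
	unsigned int warnings;
};

/* "Alloc hook": remember the pointer. Returns false if the table is full. */
static bool demo_track_alloc(struct demo_tracker *t, const void *p)
{
	assert(p != NULL);
	for (size_t i = 0; i < DEMO_MAX_TRACKED; i++) {
		if (!t->objs[i]) {
			t->objs[i] = p;
			return true;
		}
	}
	return false;
}

/*
 * "Free hook": forget the pointer. Returns true if it was tracked.
 * In non-strict mode an untracked pointer is ignored, which is what an
 * allocation path that skipped the alloc hook needs.
 */
static bool demo_track_free(struct demo_tracker *t, const void *p)
{
	if (!p)
		return false;
	for (size_t i = 0; i < DEMO_MAX_TRACKED; i++) {
		if (t->objs[i] == p) {
			t->objs[i] = NULL;
			return true;
		}
	}
	if (t->strict)
		t->warnings++;
	return false;
}
```

With strict left false, freeing a pointer that never went through
demo_track_alloc() is a no-op rather than a warning.
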
>   Note that kmalloc() -> kfree_nolock() or kfree_rcu_nolock() is still
>   not supported! That's much trickier :)
> 
> # Part 3. Add kfree_rcu_nolock() for NMI context
> 
>   Add a new 2-argument kfree_rcu_nolock() variant that is safe to call
>   in NMI context. In NMI context, calling kfree_rcu() or
>   call_rcu() is not legal, and thus users are forced to implement some
>   sort of deferred freeing. Let's make users' lives easier with the new
>   variant.
> 
>   Note that 1-argument kfree_rcu_nolock() is not supported, since there
>   is not much we can do when trylock & memory allocation fail.
>   (You can't call synchronize_rcu() in NMI context!)
> 
>   When spinning on a lock is not allowed, trylock the spinlock instead.
>   If the lock is acquired, do one of the following:
> 
>   1) Use the rcu sheaf to free the object. Note that call_rcu() cannot
>      be called in NMI context! If freeing the object would fill the rcu
>      sheaf, the object cannot be freed to the sheaf and the code has to
>      fall back.
>   
>   2) Use struct rcu_ptr field to link objects. Consuming a bnode
>      (of struct kvfree_rcu_bulk_data) and queueing work to maintain
>      a number of cached bnodes is avoided in NMI context.
> 
>   Note that scheduling delayed monitor work to drain objects after
>   KFREE_DRAIN_JIFFIES is done using a lazy irq_work to avoid raising
>   self-IPIs. That means scheduling delayed monitor work can be delayed
>   up to the length of a time slice.
> 
>   In rare cases where trylock fails, a non-lazy irq_work is used to
>   defer calling kvfree_rcu_call().
> 
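
The trylock-then-defer flow described above can be sketched in
userspace C11. This is not the code from patch 6: the demo_* names are
hypothetical, an atomic_flag stands in for the spinlock, and a lock-free
push list stands in for the irq_work deferral. It only shows that the
caller never spins.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_node {
	struct demo_node *next;
};

struct demo_queue {
	atomic_flag lock;                     /* stands in for the spinlock */
	struct demo_node *pending;            /* protected by lock */
	_Atomic(struct demo_node *) deferred; /* lock-free fallback list */
};

static void demo_queue_init(struct demo_queue *q)
{
	atomic_flag_clear(&q->lock);
	q->pending = NULL;
	atomic_init(&q->deferred, (struct demo_node *)NULL);
}

/*
 * Queue a node without ever spinning. Returns true if the node went
 * straight onto the pending list, false if the lock was busy and the
 * node was pushed onto the deferred list instead.
 */
static bool demo_free_nolock(struct demo_queue *q, struct demo_node *n)
{
	struct demo_node *old;

	assert(n != NULL);
	if (!atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire)) {
		n->next = q->pending;
		q->pending = n;
		atomic_flag_clear_explicit(&q->lock, memory_order_release);
		return true;
	}

	/* Lock is busy: push with compare-and-swap instead of waiting. */
	old = atomic_load_explicit(&q->deferred, memory_order_relaxed);
	do {
		n->next = old;
	} while (!atomic_compare_exchange_weak_explicit(&q->deferred, &old, n,
							memory_order_release,
							memory_order_relaxed));
	return false;
}

/*
 * Runs later, in a context where spinning is allowed (the analogue of
 * the deferred work). Moves deferred nodes onto the pending list and
 * returns how many were moved.
 */
static size_t demo_flush_deferred(struct demo_queue *q)
{
	struct demo_node *n = atomic_exchange_explicit(&q->deferred,
						       (struct demo_node *)NULL,
						       memory_order_acquire);
	size_t moved = 0;

	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		; /* spinning is fine here, unlike in demo_free_nolock() */
	while (n) {
		struct demo_node *next = n->next;

		n->next = q->pending;
		q->pending = n;
		n = next;
		moved++;
	}
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
	return moved;
}
```

The fast path and the fallback both complete in a bounded number of
steps on the caller's side; only the later flush is allowed to wait for
the lock.
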
>   When certain debug features (kmemleak, debugobjects) are enabled,
>   freeing in NMI context is always deferred because they use spinlocks.
> 
>   Patch 6 implements kfree_rcu_nolock() support, patch 7 adds sheaves
>   support for the new API.
> 
> Harry Yoo (7):
>   mm/slab: introduce k[v]free_rcu() with struct rcu_ptr
>   mm: use rcu_ptr instead of rcu_head
>   mm/slab: allow freeing kmalloc_nolock()'d objects using kfree[_rcu]()
>   mm/slab: free a bit in enum objexts_flags
>   mm/slab: move kfree_rcu_cpu[_work] definitions
>   mm/slab: introduce kfree_rcu_nolock()
>   mm/slab: make kfree_rcu_nolock() work with sheaves
> 
>  include/linux/list_lru.h   |   2 +-
>  include/linux/memcontrol.h |   3 +-
>  include/linux/rcupdate.h   |  68 +++++---
>  include/linux/shrinker.h   |   2 +-
>  include/linux/types.h      |   9 ++
>  mm/kmemleak.c              |  11 +-
>  mm/slab.h                  |   2 +-
>  mm/slab_common.c           | 309 +++++++++++++++++++++++++------------
>  mm/slub.c                  |  47 ++++--
>  mm/vmalloc.c               |   4 +-
>  10 files changed, 310 insertions(+), 147 deletions(-)
> 
> -- 
> 2.43.0
> 
