On 12/2/25 11:16, Harry Yoo wrote:
> Currently, kvfree_rcu_barrier() flushes RCU sheaves across all slab
> caches when a cache is destroyed. This is unnecessary; only the RCU
> sheaves belonging to the cache being destroyed need to be flushed.
>
> As suggested by Vlastimil Babka, introduce a weaker form of
> kvfree_rcu_barrier() that operates on a specific slab cache.
>
> Factor out flush_rcu_sheaves_on_cache() from flush_all_rcu_sheaves() and
> call it from flush_all_rcu_sheaves() and kvfree_rcu_barrier_on_cache().
>
> Call kvfree_rcu_barrier_on_cache() instead of kvfree_rcu_barrier() on
> cache destruction.
>
> The performance benefit is evaluated on a 12 core 24 threads AMD Ryzen
> 5900X machine (1 socket), by loading slub_kunit module.
>
> Before:
>   Total calls: 19
>   Average latency (us): 18127
>   Total time (us): 344414
>
> After:
>   Total calls: 19
>   Average latency (us): 10066
>   Total time (us): 191264
>
> Two performance regression have been reported:
>   - stress module loader test's runtime increases by 50-60% (Daniel)
>   - internal graphics test's runtime on Tegra23 increases by 35% (Jon)
>
> They are fixed by this change.
>
> Suggested-by: Vlastimil Babka <[email protected]>
> Fixes: ec66e0d59952 ("slab: add sheaf support for batching kfree_rcu()
> operations")
> Cc: <[email protected]>
> Link:
> https://lore.kernel.org/linux-mm/[email protected]
> Reported-and-tested-by: Daniel Gomez <[email protected]>
> Closes:
> https://lore.kernel.org/linux-mm/[email protected]
> Reported-and-tested-by: Jon Hunter <[email protected]>
> Closes:
> https://lore.kernel.org/linux-mm/[email protected]
> Signed-off-by: Harry Yoo <[email protected]>

Thanks a lot! Added to slab/for-next-fixes

