On Mon, 2 Feb 2026 10:54:14 +0100, Peter Zijlstra wrote:
> On Mon, Feb 02, 2026 at 03:45:54PM +0800, Lance Yang wrote:
> > When freeing or unsharing page tables we send an IPI to synchronize with
> > concurrent lockless page table walkers (e.g. GUP-fast). Today we broadcast
> > that IPI to all CPUs, which is costly on large machines and hurts RT
> > workloads[1].
> >
> > This series makes those IPIs targeted. We track which CPUs are currently
> > doing a lockless page table walk for a given mm (per-CPU
> > active_lockless_pt_walk_mm). When we need to sync, we only IPI those CPUs.
> > GUP-fast and perf_get_page_size() set/clear the tracker around their walk;
> > tlb_remove_table_sync_mm() uses it and replaces the previous broadcast in
> > the free/unshare paths.
>
> I'm confused. This only happens when !PT_RECLAIM, because if PT_RECLAIM
> __tlb_remove_table_one() actually uses RCU.
>
> So why are you making things more expensive for no reason?

You're right that when CONFIG_PT_RECLAIM is set, __tlb_remove_table_one()
uses call_rcu() and we never call any sync there; this series doesn't
touch that path.

In the !PT_RECLAIM table-free path (the same __tlb_remove_table_one()
branch that calls tlb_remove_table_sync_mm(tlb->mm) before
__tlb_remove_table()), we're not adding any new sync; we're replacing the
existing broadcast IPI (tlb_remove_table_sync_one()) with targeted IPIs
(tlb_remove_table_sync_mm()).

One thing I just realized: when CONFIG_MMU_GATHER_RCU_TABLE_FREE is not
set, the sync path isn't used at all (tlb_remove_table_sync_one() and
friends aren't even compiled), so we don't need the tracker in that
config.

Thanks for raising this!
Lance
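
For anyone following along, here is a rough userspace model of the
broadcast-vs-targeted difference described above. It is only a sketch and
not the code from the series: every model_* name, struct mm_model and
MODEL_NR_CPUS are made up for illustration, and it ignores the memory
ordering and actual IPI delivery that the kernel code has to deal with.
It only shows how a per-CPU "which mm am I walking" slot lets the sync
side pick its targets.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_NR_CPUS 8            /* hypothetical CPU count for the model */

struct mm_model { int id; };       /* stand-in for struct mm_struct */

/* Models the per-CPU active_lockless_pt_walk_mm slot; NULL means idle. */
static const struct mm_model *model_active_walk_mm[MODEL_NR_CPUS];

/* A lockless walker (e.g. GUP-fast) publishes the mm before walking... */
static void model_walk_begin(int cpu, const struct mm_model *mm)
{
    model_active_walk_mm[cpu] = mm;
}

/* ...and clears the slot once the walk is done. */
static void model_walk_end(int cpu)
{
    model_active_walk_mm[cpu] = NULL;
}

/* Broadcast behaviour: every CPU is an IPI target. */
static uint32_t model_sync_one_targets(void)
{
    return (1u << MODEL_NR_CPUS) - 1;
}

/* Targeted behaviour: only CPUs currently walking this mm are targets. */
static uint32_t model_sync_mm_targets(const struct mm_model *mm)
{
    uint32_t mask = 0;

    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        if (model_active_walk_mm[cpu] == mm)
            mask |= 1u << cpu;
    return mask;
}
```

In this model, if no CPU is walking the mm being freed, the targeted mask
is empty and no IPI is needed at all, whereas the broadcast variant
always hits every CPU.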

