On Mon, Feb 02, 2026 at 10:37:39PM +0800, Lance Yang wrote:
>
>
> On 2026/2/2 21:37, Peter Zijlstra wrote:
> > On Mon, Feb 02, 2026 at 09:07:10PM +0800, Lance Yang wrote:
> >
> > > > > Right, but if we can use full RCU for PT_RECLAIM, why can't we do so
> > > > > unconditionally and not add overhead?
> > > >
> > > > The sync (IPI) is mainly needed for unshare (e.g. hugetlb) and collapse
> > > > (khugepaged) paths, regardless of whether table free uses RCU, IIUC.
> > >
> > > In addition: We need the sync when we modify page tables (e.g. unshare,
> > > collapse), not only when we free them. RCU can defer freeing but does
> > > not prevent lockless walkers from seeing concurrent in-place
> > > modifications, so we need the IPI to synchronize with those walkers
> > > first.
> >
> > Currently PT_RECLAIM=y has no IPI; are you saying that is broken? If
> > not, then why do we need this at all?
>
> PT_RECLAIM=y does have IPI for unshare/collapse - those paths call
> tlb_flush_unshared_tables() (for hugetlb unshare) and collapse_huge_page()
> (in khugepaged collapse), which already send IPIs today (broadcast to all
> CPUs via tlb_remove_table_sync_one()).
>
> What PT_RECLAIM=y doesn't need IPI for is table freeing
> (__tlb_remove_table_one() uses call_rcu() instead). But table modification
> (unshare, collapse) still needs IPI to synchronize with lockless walkers,
> regardless of PT_RECLAIM.
>
> So PT_RECLAIM=y is not broken; it already has IPI where needed. This series
> just makes those IPIs targeted instead of broadcast. Does that clarify?

Oh bah, reading is hard. I had missed they had more table_sync_one()
calls, rather than remove_table_one().

So you *can* replace table_sync_one() with rcu_sync(); that will provide
the same guarantees. It's just a 'little' bit slower on the update side,
but does not incur the read side cost.

I really think anything here needs to better explain the various
requirements. Because now everybody gets to pay the price for hugetlb
shared crud, while 'nobody' will actually use that.

