Re: [PATCH v4 1/3] mm: use targeted IPIs for TLB sync with lockless page table walkers

Dave Hansen Mon, 02 Feb 2026 08:25:34 -0800

On 2/2/26 04:14, Lance Yang wrote:
>>> Note that the tracking adds ~3% latency to GUP-fast, as measured on a
>>> 64-core system.
>>
>> What architecture, and that is acceptable?
> 
> x86-64.
> 
> I ran ./gup_bench which spawns 60 threads, each doing 500k GUP-fast
> operations (pinning 8 pages per call) via the gup_test ioctl.
> 
> Results for pin pages:
> - Before: avg 1.489s (10 runs)
> - After:  avg 1.533s (10 runs)
> 
> Given we avoid broadcast IPIs on large systems, I think this is a
> reasonable trade-off 🙂


I thought the big databases were really sensitive to GUP-fast latency.
They like big systems, too. Won't they howl when this finally hits their
testing?

Also, two of the "write" sides here are:

 * collapse_huge_page() (khugepaged)
 * tlb_remove_table() (in an "-ENOMEM" path)

Those are quite slow paths, right? Shouldn't the design here favor
keeping gup-fast as fast as possible as opposed to impacting those?
