From: Nadav Amit <[email protected]>

This is a respin of a rebased version of an old series, which I did not
follow up on, as I was preoccupied with personal issues (sorry).

This series improves TLB shootdown by flushing the local TLB concurrently
with the remote TLBs, overlapping the IPI delivery time with the local
flush. Performance numbers can be found in the previous version [1].

The patches are essentially the same, but rebasing on the latest version
required some changes. I left the reviewed-by tags; if anyone considers
this inappropriate, please let me know (and you have my apology).

[1] https://lore.kernel.org/lkml/[email protected]/

v4 -> v5:
* Rebase on 5.11
* Move concurrent smp logic to smp_call_function_many_cond()
* Remove SGI-UV patch which is not needed anymore

v3 -> v4:
* Merge flush_tlb_func_local() and flush_tlb_func_remote() [Peter]
* Prevent preemption in on_each_cpu(). It is not needed, but it prevents
  concerns. [Peter/tglx]
* Adding acked-by, reviewed-by tags

v2 -> v3:
* Open-code the remote/local-flush decision code [Andy]
* Fix hyper-v, Xen implementations [Andrew]
* Fix redundant TLB flushes.

v1 -> v2:
* Removing the patches that Thomas took [tglx]
* Adding hyper-v, Xen compile-tested implementations [Dave]
* Removing UV [Andy]
* Adding lazy optimization, removing inline keyword [Dave]
* Restructuring patch-set

RFCv2 -> v1:
* Fix comment on flush_tlb_multi [Juergen]
* Removing async invalidation optimizations [Andy]
* Adding KVM support [Paolo]

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Haiyang Zhang <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: "K. Y. Srinivasan" <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]

Nadav Amit (8):
  smp: Run functions concurrently in smp_call_function_many_cond()
  x86/mm/tlb: Unify flush_tlb_func_local() and flush_tlb_func_remote()
  x86/mm/tlb: Open-code on_each_cpu_cond_mask() for tlb_is_not_lazy()
  x86/mm/tlb: Flush remote and local TLBs concurrently
  x86/mm/tlb: Privatize cpu_tlbstate
  x86/mm/tlb: Do not make is_lazy dirty for no reason
  cpumask: Mark functions as pure
  x86/mm/tlb: Remove unnecessary uses of the inline keyword

 arch/x86/hyperv/mmu.c                 |  10 +-
 arch/x86/include/asm/paravirt.h       |   6 +-
 arch/x86/include/asm/paravirt_types.h |   4 +-
 arch/x86/include/asm/tlbflush.h       |  48 +++----
 arch/x86/include/asm/trace/hyperv.h   |   2 +-
 arch/x86/kernel/alternative.c         |   2 +-
 arch/x86/kernel/kvm.c                 |  11 +-
 arch/x86/mm/init.c                    |   2 +-
 arch/x86/mm/tlb.c                     | 177 +++++++++++++++-----------
 arch/x86/xen/mmu_pv.c                 |  11 +-
 include/linux/cpumask.h               |   6 +-
 include/trace/events/xen.h            |   2 +-
 kernel/smp.c                          | 148 +++++++++++----------
 13 files changed, 242 insertions(+), 187 deletions(-)

-- 
2.25.1
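
For readers who want some intuition for why overlapping the local flush
with IPI delivery helps, here is a rough userspace cost model. It is only
an illustrative sketch and not code from the series: the struct, the
function names and the idea of three fixed per-step costs are all
assumptions made for this example, and it ignores the cost of issuing the
IPIs themselves and any contention between CPUs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical cost model (nanoseconds). None of these names exist in
 * the kernel; they are made up for illustration only.
 */
struct shootdown_cost {
	uint64_t ipi_delivery_ns;	/* time for the IPIs to reach the remote CPUs */
	uint64_t remote_flush_ns;	/* time a remote CPU spends flushing its TLB */
	uint64_t local_flush_ns;	/* time the initiating CPU spends flushing */
};

static uint64_t max_u64(uint64_t a, uint64_t b)
{
	return a > b ? a : b;
}

/*
 * Serialized shootdown: the local flush and the remote round trip
 * (IPI delivery plus remote flush) happen one after the other, so
 * their costs add up.
 */
uint64_t shootdown_serial_ns(const struct shootdown_cost *c)
{
	assert(c != NULL);
	return c->local_flush_ns + c->ipi_delivery_ns + c->remote_flush_ns;
}

/*
 * Overlapped shootdown: the initiator sends the IPIs, flushes locally
 * while they are in flight, and then waits for the remote CPUs. The
 * total is bounded by the slower of the two paths.
 */
uint64_t shootdown_concurrent_ns(const struct shootdown_cost *c)
{
	assert(c != NULL);
	return max_u64(c->local_flush_ns,
		       c->ipi_delivery_ns + c->remote_flush_ns);
}
```

In this model the saving is at most the local flush time, which is why
the gain depends on how IPI latency compares with the flush cost. The
measured numbers are in the previous version [1].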

