Sorry, I made a mistake and included the wrong patches. I will send RFC v2 in a few minutes.
> On Aug 23, 2019, at 3:46 PM, Nadav Amit <na...@vmware.com> wrote:
>
> INVPCID is considerably slower than INVLPG of a single PTE, but it is
> currently used to flush PTEs in the user page-table when PTI is used.
>
> Instead, it is possible to defer TLB flushes until after the user
> page-tables are loaded. Preventing speculation over the TLB flushes
> should keep the whole thing safe. In some cases, deferring TLB flushes
> in such a way can result in more full TLB flushes, but arguably this
> behavior is oftentimes beneficial.
>
> These patches are based and evaluated on top of the concurrent
> TLB-flushes v4 patch-set.
>
> I will provide more results later, but it might be easier to look at the
> time an isolated TLB flush takes. These numbers are from skylake,
> showing the number of cycles that running madvise(DONTNEED) which
> results in local TLB flushes takes:
>
> n_pages  concurrent  +deferred-pti  change
> -------  ----------  -------------  ------
>  1       2119        1986           -6.7%
>  10      6791        5417           -20%
>
> Please let me know if I missed something that affects security or
> performance.
>
> [ Yes, I know there is another pending RFC for async TLB flushes, but I
>   think it might be easier to merge this one first ]
>
> Nadav Amit (3):
>   x86/mm/tlb: Defer PTI flushes
>   x86/mm/tlb: Avoid deferring PTI flushes on shootdown
>   x86/mm/tlb: Use lockdep irq assertions
>
>  arch/x86/entry/calling.h        | 52 +++++++++++++++++++--
>  arch/x86/include/asm/tlbflush.h | 31 ++++++++++--
>  arch/x86/kernel/asm-offsets.c   |  3 ++
>  arch/x86/mm/tlb.c               | 83 +++++++++++++++++++++++++++++++--
>  4 files changed, 158 insertions(+), 11 deletions(-)
>
> --
> 2.17.1