On Wed, Aug 29, 2018 at 08:46:04AM -0700, Andy Lutomirski wrote:
> On Wed, Aug 29, 2018 at 2:28 AM, Peter Zijlstra <pet...@infradead.org> wrote:
> > On Wed, Aug 29, 2018 at 01:11:46AM -0700, Nadav Amit wrote:
> 
> >> +     pte_clear(poking_mm, poking_addr, ptep);
> >> +
> >> +     /*
> >> +      * __flush_tlb_one_user() performs a redundant TLB flush when PTI is on,
> >> +      * as it also flushes the corresponding "user" address spaces, which
> >> +      * does not exist.
> >> +      *
> >> +      * Poking, however, is already very inefficient since it does not try to
> >> +      * batch updates, so we ignore this problem for the time being.
> >> +      *
> >> +      * Since the PTEs do not exist in other kernel address-spaces, we do
> >> +      * not use __flush_tlb_one_kernel(), which when PTI is on would cause
> >> +      * more unwarranted TLB flushes.
> >> +      */
> >
> > yuck :-), but yeah.
> 
> I'm sure we covered this ad nauseam when PTI was being developed, but
> we were kind of in a rush, so:
> 
> Why do we do INVPCID at all?  The fallback path for non-INVPCID
> systems uses invalidate_user_asid(), which should be faster than the
> invpcid path.  And doesn't do a redundant flush in this case.

I don't remember; and you forgot to (re)add dhansen.

Logically INVPCID_SINGLE should be faster since it pokes out a single
translation in another PCID instead of killing all user translations.

Is it just a matter of (current) chips implementing INVPCID_SINGLE
inefficiently, or something else?
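
To make the two options concrete, here is a toy user-space model (not
kernel code). It only shows which translations survive each kind of
invalidation; it says nothing about cycle cost, which is the open
question above. All names (toy_tlb, toy_invpcid_single,
toy_flush_pcid_all, TLB_SLOTS) are hypothetical and exist only for this
sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TLB_SLOTS 8	/* arbitrary size for the toy model */

/* One cached translation, tagged with the PCID it was loaded under. */
struct tlb_entry {
	unsigned int pcid;
	unsigned long vaddr;
	bool valid;
};

struct toy_tlb {
	struct tlb_entry slot[TLB_SLOTS];
};

/* Install a translation in the first free slot; false if the TLB is full. */
bool toy_tlb_fill(struct toy_tlb *tlb, unsigned int pcid, unsigned long vaddr)
{
	assert(tlb != NULL);
	for (size_t i = 0; i < TLB_SLOTS; i++) {
		if (!tlb->slot[i].valid) {
			tlb->slot[i].pcid = pcid;
			tlb->slot[i].vaddr = vaddr;
			tlb->slot[i].valid = true;
			return true;
		}
	}
	return false;
}

/* Single-address invalidation: drop only the matching (pcid, vaddr) pair. */
void toy_invpcid_single(struct toy_tlb *tlb, unsigned int pcid,
			unsigned long vaddr)
{
	for (size_t i = 0; i < TLB_SLOTS; i++) {
		struct tlb_entry *e = &tlb->slot[i];

		if (e->valid && e->pcid == pcid && e->vaddr == vaddr)
			e->valid = false;
	}
}

/* Whole-PCID invalidation: drop every translation tagged with this PCID. */
void toy_flush_pcid_all(struct toy_tlb *tlb, unsigned int pcid)
{
	for (size_t i = 0; i < TLB_SLOTS; i++) {
		if (tlb->slot[i].valid && tlb->slot[i].pcid == pcid)
			tlb->slot[i].valid = false;
	}
}

/* Count the valid translations still cached for a PCID. */
size_t toy_tlb_count(const struct toy_tlb *tlb, unsigned int pcid)
{
	size_t n = 0;

	for (size_t i = 0; i < TLB_SLOTS; i++) {
		if (tlb->slot[i].valid && tlb->slot[i].pcid == pcid)
			n++;
	}
	return n;
}
```

With four translations filled under one PCID, toy_invpcid_single() on
one of them leaves three cached, while toy_flush_pcid_all() leaves
none. That is the "logically faster" argument; whether the hardware
instruction is actually cheaper than throwing everything away is a
separate matter the model cannot answer.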
