On Thu, 2019-08-22 at 03:46 +0200, Christoph Hellwig wrote:
> On Wed, Aug 21, 2019 at 05:46:42PM -0700, Atish Patra wrote:
> > In RISC-V, TLB flush happens via SBI which is expensive. If the local
> > cpu is the only cpu in cpumask, there is no need to invoke an SBI call.
> > 
> > Just do a local flush and return.
> > 
> > Signed-off-by: Atish Patra <atish.pa...@wdc.com>
> > ---
> >  arch/riscv/mm/tlbflush.c | 15 +++++++++++++++
> >  1 file changed, 15 insertions(+)
> > 
> > diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
> > index df93b26f1b9d..36430ee3bed9 100644
> > --- a/arch/riscv/mm/tlbflush.c
> > +++ b/arch/riscv/mm/tlbflush.c
> > @@ -2,6 +2,7 @@
> >  
> >  #include <linux/mm.h>
> >  #include <linux/smp.h>
> > +#include <linux/sched.h>
> >  #include <asm/sbi.h>
> >  
> >  void flush_tlb_all(void)
> > @@ -13,9 +14,23 @@ static void __sbi_tlb_flush_range(struct cpumask *cmask, unsigned long start,
> >             unsigned long size)
> >  {
> >     struct cpumask hmask;
> > +   unsigned int cpuid = get_cpu();
> >  
> > +   if (!cmask) {
> > +           riscv_cpuid_to_hartid_mask(cpu_online_mask, &hmask);
> > +           goto issue_sfence;
> > +   }
> > +
> > +   if (cpumask_test_cpu(cpuid, cmask) && cpumask_weight(cmask) == 1) {
> > +           local_flush_tlb_all();
> > +           goto done;
> > +   }
> 
> I think a single core on an SMP kernel is a valid enough use case given
> how few distros still have UP kernels.  So maybe this should rather be:
> 
>       if (!cmask)
>               cmask = cpu_online_mask;
> 
>       if (cpumask_test_cpu(cpuid, cmask) && cpumask_weight(cmask) == 1) {
>               local_flush_tlb_all();
>       } else {
>               riscv_cpuid_to_hartid_mask(cmask, &hmask);
>               sbi_remote_sfence_vma(hmask.bits, start, size);
>       }

The downside of this is that every !cmask call on a true SMP system
(probably the more common case) executes two extra cpumask operations:
cpumask_test_cpu() and cpumask_weight(). Since TLB flush is on a
performance-critical path, I think we should favor the more common case
(SMP with more than one core).
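
To make the cost concrete, here is a rough userspace model of the two
variants. It is only a sketch, not kernel code: toy_mask, mask_ops,
toy_test_cpu(), toy_weight() and the decide_*() helpers are made-up
stand-ins for struct cpumask and the real cpumask helpers, and the model
only counts mask operations rather than measuring actual cost.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for struct cpumask: bit n set means CPU n is in the mask. */
typedef uint64_t toy_mask;

enum flush_kind { FLUSH_LOCAL, FLUSH_REMOTE_SBI };

/* Counts how many mask test/weight operations a decision performed. */
static unsigned int mask_ops;

/* Hypothetical analogue of cpumask_test_cpu(). */
static int toy_test_cpu(unsigned int cpu, toy_mask m)
{
	assert(cpu < 64);
	mask_ops++;
	return (int)((m >> cpu) & 1u);
}

/* Hypothetical analogue of cpumask_weight(): number of set bits. */
static unsigned int toy_weight(toy_mask m)
{
	unsigned int n = 0;

	mask_ops++;
	for (; m; m &= m - 1)
		n++;
	return n;
}

/* Variant from the patch: a NULL mask skips the single-CPU check. */
static enum flush_kind decide_patch(unsigned int cpu, const toy_mask *cmask)
{
	if (!cmask)
		return FLUSH_REMOTE_SBI;
	if (toy_test_cpu(cpu, *cmask) && toy_weight(*cmask) == 1)
		return FLUSH_LOCAL;
	return FLUSH_REMOTE_SBI;
}

/* Variant from the review: a NULL mask becomes the online mask first. */
static enum flush_kind decide_review(unsigned int cpu, const toy_mask *cmask,
				     toy_mask online)
{
	toy_mask m = cmask ? *cmask : online;

	if (toy_test_cpu(cpu, m) && toy_weight(m) == 1)
		return FLUSH_LOCAL;
	return FLUSH_REMOTE_SBI;
}
```

In this model, with four CPUs online and a NULL mask, decide_patch() does
no mask operations while decide_review() does two. With a single CPU
online and a NULL mask, decide_review() picks the local flush, whereas
decide_patch() still goes through the remote path.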

Thoughts?

-- 
Regards,
Atish
