On Tue, Mar 16, 2021 at 01:28:01PM +0100, Peter Zijlstra wrote:
> On Thu, Mar 11, 2021 at 01:37:00PM +0100, Frederic Weisbecker wrote:
> > Further optimize the check for a local full dynticks CPU. Testing directly
> > tick_nohz_full_cpu(smp_processor_id()) is suboptimal because the
> > compiler first fetches the CPU number and only then processes the
> > static key.
> > 
> > It's best to evaluate the static branch before anything.
> 
> Or you do tricky things like this ;-)

Good point!

I'll check the asm diff to see if that really does what we want.
I expect it will.

Thanks.

> 
> diff --git a/include/linux/tick.h b/include/linux/tick.h
> index 7340613c7eff..bd4a6b055b80 100644
> --- a/include/linux/tick.h
> +++ b/include/linux/tick.h
> @@ -185,13 +185,12 @@ static inline bool tick_nohz_full_enabled(void)
>       return tick_nohz_full_running;
>  }
>  
> -static inline bool tick_nohz_full_cpu(int cpu)
> -{
> -     if (!tick_nohz_full_enabled())
> -             return false;
> -
> -     return cpumask_test_cpu(cpu, tick_nohz_full_mask);
> -}
> +#define tick_nohz_full_cpu(_cpu) ({                                  \
> +     bool __ret = false;                                             \
> +     if (tick_nohz_full_enabled())                                   \
> +             __ret = cpumask_test_cpu((_cpu), tick_nohz_full_mask);  \
> +     __ret;                                                          \
> +})
>  
>  static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
>  {
> 
> 
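
For illustration, the difference between the function form and the macro
form can be seen in a minimal userspace sketch. Every name below
(feature_enabled, fake_cpu_id, mask_test, check_fn, check_macro) is a
hypothetical stand-in, not a kernel API, and the lookup is given a visible
side effect so that its evaluation can be counted. It assumes a compiler
that supports GNU statement expressions (GCC or Clang).

```c
/* Userspace sketch only; all names are hypothetical stand-ins. */
#include <assert.h>
#include <stdbool.h>

static bool feature_enabled;	/* stands in for the static key */
static int cpu_lookups;		/* counts calls to fake_cpu_id() */

/* Stands in for smp_processor_id(); the counter makes evaluation visible. */
static int fake_cpu_id(void)
{
	cpu_lookups++;
	return 0;
}

/* Stands in for the cpumask test; only "CPU 0" is in the mask here. */
static bool mask_test(int cpu)
{
	return cpu == 0;
}

/* Function form: the argument is evaluated before the body runs. */
static inline bool check_fn(int cpu)
{
	if (!feature_enabled)
		return false;
	return mask_test(cpu);
}

/* Macro form: _cpu is evaluated only when the feature is enabled. */
#define check_macro(_cpu) ({			\
	bool ret_ = false;			\
	if (feature_enabled)			\
		ret_ = mask_test((_cpu));	\
	ret_;					\
})

/* Number of lookups performed by one function-form check, feature off. */
static int lookups_fn_disabled(void)
{
	feature_enabled = false;
	cpu_lookups = 0;
	(void)check_fn(fake_cpu_id());
	return cpu_lookups;
}

/* Number of lookups performed by one macro-form check, feature off. */
static int lookups_macro_disabled(void)
{
	feature_enabled = false;
	cpu_lookups = 0;
	(void)check_macro(fake_cpu_id());
	return cpu_lookups;
}

/* With the feature off, only the function form pays for the lookup. */
static void demo(void)
{
	assert(lookups_fn_disabled() == 1);
	assert(lookups_macro_disabled() == 0);
}
```

Calling demo() from a main() exercises both forms. In the kernel the CPU
number fetch has no such counter, so whether the generated code actually
changes still has to be confirmed from the asm diff.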
