merged.

Bruce

In message: [linux-yocto][v5.10/standard/base][PATCH] timers: Fix 
get_next_timer_interrupt() with no timers pending
on 12/07/2021 He Zhe wrote:

> From: Frederic Weisbecker <frede...@kernel.org>
> 
> 31cd0e119d50 ("timers: Recalculate next timer interrupt only when
> necessary") subtly altered get_next_timer_interrupt()'s behaviour. The
> function no longer consistently returns KTIME_MAX with no timers
> pending.
> 
> In order to decide if there are any timers pending we check whether the
> next expiry will happen NEXT_TIMER_MAX_DELTA jiffies from now.
> Unfortunately, the next expiry time and the timer base clock are no
> longer updated in unison. The former changes upon certain timer
> operations (enqueue, expire, detach), whereas the latter keeps track of
> jiffies as they move forward. Ultimately breaking the logic above.
> 
> A simplified example:
> 
> - Upon entering get_next_timer_interrupt() with:
> 
>       jiffies = 1
>       base->clk = 0;
>       base->next_expiry = NEXT_TIMER_MAX_DELTA;
> 
>   'base->next_expiry == base->clk + NEXT_TIMER_MAX_DELTA', the function
>   returns KTIME_MAX.
> 
> - 'base->clk' is updated to the jiffies value.
> 
> - The next time we enter get_next_timer_interrupt(), taking into account
>   no timer operations happened:
> 
>       base->clk = 1;
>       base->next_expiry = NEXT_TIMER_MAX_DELTA;
> 
>   'base->next_expiry != base->clk + NEXT_TIMER_MAX_DELTA', the function
>   returns a valid expire time, which is incorrect.
> 
> This ultimately might unnecessarily rearm sched's timer on nohz_full
> setups, and add latency to the system[1].
> 
> So, introduce 'base->timers_pending'[2], update it every time
> 'base->next_expiry' changes, and use it in get_next_timer_interrupt().
> 
> [1] See tick_nohz_stop_tick().
> [2] A quick pahole check on x86_64 and arm64 shows it doesn't make
>     'struct timer_base' any bigger.
> 
> Fixes: 31cd0e119d50 ("timers: Recalculate next timer interrupt only when 
> necessary")
> Signed-off-by: Nicolas Saenz Julienne <nsaen...@redhat.com>
> Acked-by: Frederic Weisbecker <frede...@kernel.org>
> 
> Link: https://lore.kernel.org/lkml/20210710005243.GA23956@lothringen/
> Signed-off-by: He Zhe <zhe...@windriver.com>
> ---
> This fixes the bug that arch_timer interrupt counts keep increasing on 
> isolated cores for all arm64
> This has been acked by the original problematic commit author and I have 
> validated it.
> 
>  kernel/time/timer.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/time/timer.c b/kernel/time/timer.c
> index d111adf4a0cb..99b97ccefdbd 100644
> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c
> @@ -207,6 +207,7 @@ struct timer_base {
>       unsigned int            cpu;
>       bool                    next_expiry_recalc;
>       bool                    is_idle;
> +     bool                    timers_pending;
>       DECLARE_BITMAP(pending_map, WHEEL_SIZE);
>       struct hlist_head       vectors[WHEEL_SIZE];
>  } ____cacheline_aligned;
> @@ -595,6 +596,7 @@ static void enqueue_timer(struct timer_base *base, struct 
> timer_list *timer,
>                * can reevaluate the wheel:
>                */
>               base->next_expiry = bucket_expiry;
> +             base->timers_pending = true;
>               base->next_expiry_recalc = false;
>               trigger_dyntick_cpu(base, timer);
>       }
> @@ -1596,6 +1598,7 @@ static unsigned long __next_timer_interrupt(struct 
> timer_base *base)
>       }
>  
>       base->next_expiry_recalc = false;
> +     base->timers_pending = !(next == base->clk + NEXT_TIMER_MAX_DELTA);
>  
>       return next;
>  }
> @@ -1647,7 +1650,6 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 
> basem)
>       struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]);
>       u64 expires = KTIME_MAX;
>       unsigned long nextevt;
> -     bool is_max_delta;
>  
>       /*
>        * Pretend that there is no timer pending if the cpu is offline.
> @@ -1660,7 +1662,6 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 
> basem)
>       if (base->next_expiry_recalc)
>               base->next_expiry = __next_timer_interrupt(base);
>       nextevt = base->next_expiry;
> -     is_max_delta = (nextevt == base->clk + NEXT_TIMER_MAX_DELTA);
>  
>       /*
>        * We have a fresh next event. Check whether we can forward the
> @@ -1678,7 +1679,7 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 
> basem)
>               expires = basem;
>               base->is_idle = false;
>       } else {
> -             if (!is_max_delta)
> +             if (base->timers_pending)
>                       expires = basem + (u64)(nextevt - basej) * TICK_NSEC;
>               /*
>                * If we expect to sleep more than a tick, mark the base idle.
> @@ -1961,6 +1962,7 @@ int timers_prepare_cpu(unsigned int cpu)
>               base = per_cpu_ptr(&timer_bases[b], cpu);
>               base->clk = jiffies;
>               base->next_expiry = base->clk + NEXT_TIMER_MAX_DELTA;
> +             base->timers_pending = false;
>               base->is_idle = false;
>       }
>       return 0;
> -- 
> 2.29.2
> 
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#10111): 
https://lists.yoctoproject.org/g/linux-yocto/message/10111
Mute This Topic: https://lists.yoctoproject.org/mt/84147461/21656
Group Owner: linux-yocto+ow...@lists.yoctoproject.org
Unsubscribe: https://lists.yoctoproject.org/g/linux-yocto/unsub 
[arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to