On Fri, Oct 20, 2023 at 01:35:42AM +0200, Frederic Weisbecker wrote:
> The commit:
> 
>       cff9b2332ab7 ("kernel/sched: Modify initial boot task idle setup")
> 
> fixed an issue where rcutiny would request a quiescent state by
> setting TIF_NEED_RESCHED in early boot, while init/0 has the PF_IDLE
> flag set but interrupts aren't enabled yet. A subsequent call to
> cond_resched() would then enable IRQs too early.
> 
> When callbacks are enqueued in idle, RCU currently performs the
> following:
> 
> 1) Call resched_cpu() to trigger exit from idle and go through the
>    scheduler to call rcu_note_context_switch() -> rcu_qs()
> 
> 2) rcu_qs() notes the quiescent state and raises RCU_SOFTIRQ if there
>    is a callback, waking up ksoftirqd since it isn't called from an
>    interrupt.
> 
> However, the call to resched_cpu() can opportunistically be replaced,
> and optimized, by raising RCU_SOFTIRQ and forcing a ksoftirqd wakeup
> instead.
> 
> It's worth noting that RCU grace-period polling while idle then becomes
> suboptimal, but such a use case can be considered very rare, if not
> non-existent.
> 
> The advantage of this optimization is that it also works if PF_IDLE is
> set early, because ksoftirqd is created well after IRQs are enabled on
> boot and can't be woken up before its creation. If
> raise_ksoftirqd_irqoff() is called after the first scheduling point
> but before ksoftirqd is created, nearby voluntary schedule calls are
> expected to provide the desired quiescent state, and in the worst case
> the first run of ksoftirqd happens soon enough, within the first
> initcalls.
> 
> Fixes: cff9b2332ab7 ("kernel/sched: Modify initial boot task idle setup")
> Cc: Liam R. Howlett <[email protected]>
> Cc: Peter Zijlstra (Intel) <[email protected]>
> Cc: Sebastian Siewior <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Signed-off-by: Frederic Weisbecker <[email protected]>

Reviewed-by: Paul E. McKenney <[email protected]>

I did a touch-test of the series, and have started overnight testing.

                                                        Thanx, Paul

> ---
>  kernel/rcu/tiny.c | 21 ++++++++++++++++-----
>  1 file changed, 16 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
> index fec804b79080..9460e4e9d84c 100644
> --- a/kernel/rcu/tiny.c
> +++ b/kernel/rcu/tiny.c
> @@ -190,12 +190,15 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
>       local_irq_save(flags);
>       *rcu_ctrlblk.curtail = head;
>       rcu_ctrlblk.curtail = &head->next;
> -     local_irq_restore(flags);
>  
>       if (unlikely(is_idle_task(current))) {
> -             /* force scheduling for rcu_qs() */
> -             resched_cpu(0);
> +             /*
> +              * Force resched to trigger a QS and handle callbacks right after.
> +              * This also takes care of avoiding too early rescheduling on boot.
> +              */
> +             raise_ksoftirqd_irqoff(RCU_SOFTIRQ);
>       }
> +     local_irq_restore(flags);
>  }
>  EXPORT_SYMBOL_GPL(call_rcu);
>  
> @@ -228,8 +231,16 @@ unsigned long start_poll_synchronize_rcu(void)
>       unsigned long gp_seq = get_state_synchronize_rcu();
>  
>       if (unlikely(is_idle_task(current))) {
> -             /* force scheduling for rcu_qs() */
> -             resched_cpu(0);
> +             unsigned long flags;
> +
> +             /*
> +              * Force resched to trigger a QS. This also takes care of avoiding
> +              * too early rescheduling on boot. It's suboptimal, but GP
> +              * polling while idle isn't expected to be a common use case.
> +              */
> +             local_irq_save(flags);
> +             raise_ksoftirqd_irqoff(RCU_SOFTIRQ);
> +             local_irq_restore(flags);
>       }
>       return gp_seq;
>  }
> -- 
> 2.34.1
> 