On Fri, Sep 22, 2017 at 05:28:04PM +0200, Sebastian Andrzej Siewior wrote:
> The current check via srcu_online is slightly racy because after looking
> at srcu_online there could be an interrupt that interrupted us long
> enough until the CPU we checked against went offline.

But in that case, wouldn't the interrupt block the synchronize_sched()
later in the offline sequence?

More to the point, are you actually seeing this failure, or is this
a theoretical bug?

> An alternative would be to hold the hotplug rwsem (so the CPUs don't
> change their state) and then check based on cpu_online() if we queue it
> on a specific CPU or not. queue_work_on() itself can handle if something
> is enqueued on an offline CPU but a timer which is enqueued on an offline
> CPU won't fire until the CPU is back online.
> 
> I am not sure if the removal in rcu_init() is okay or not. I assume that
> SRCU won't enqueue a work item before SRCU is up and ready.

Another alternative would be to disable preemption across the check and
the call to queue_delayed_work_on().

Yet another alternative would be to have an SRCU-specific per-CPU lock
that is acquired across the setting and clearing of srcu_online,
and also across the check and the call to queue_delayed_work_on().
This last would be more consistent with a desire to remove the
synchronize_sched() from the offline sequence.

Or am I missing something here?

                                                        Thanx, Paul

> Signed-off-by: Sebastian Andrzej Siewior <bige...@linutronix.de>
> ---
>  kernel/rcu/srcutree.c | 22 ++++------------------
>  kernel/rcu/tree.c     |  6 ------
>  2 files changed, 4 insertions(+), 24 deletions(-)
> 
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index 729a8706751d..d190af0e56f8 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -36,6 +36,7 @@
>  #include <linux/delay.h>
>  #include <linux/module.h>
>  #include <linux/srcu.h>
> +#include <linux/cpu.h>
>  
>  #include "rcu.h"
>  #include "rcu_segcblist.h"
> @@ -424,21 +425,6 @@ static void srcu_gp_start(struct srcu_struct *sp)
>       WARN_ON_ONCE(state != SRCU_STATE_SCAN1);
>  }
>  
> -/*
> - * Track online CPUs to guide callback workqueue placement.
> - */
> -DEFINE_PER_CPU(bool, srcu_online);
> -
> -void srcu_online_cpu(unsigned int cpu)
> -{
> -     WRITE_ONCE(per_cpu(srcu_online, cpu), true);
> -}
> -
> -void srcu_offline_cpu(unsigned int cpu)
> -{
> -     WRITE_ONCE(per_cpu(srcu_online, cpu), false);
> -}
> -
>  /*
>   * Place the workqueue handler on the specified CPU if online, otherwise
>   * just run it whereever.  This is useful for placing workqueue handlers
> @@ -450,12 +436,12 @@ static bool srcu_queue_delayed_work_on(int cpu, struct 
> workqueue_struct *wq,
>  {
>       bool ret;
>  
> -     preempt_disable();
> -     if (READ_ONCE(per_cpu(srcu_online, cpu)))
> +     cpus_read_lock();
> +     if (cpu_online(cpu))
>               ret = queue_delayed_work_on(cpu, wq, dwork, delay);
>       else
>               ret = queue_delayed_work(wq, dwork, delay);
> -     preempt_enable();
> +     cpus_read_unlock();
>       return ret;
>  }
>  
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 1250e4bd4b85..a3cb562955c9 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3729,8 +3729,6 @@ int rcutree_online_cpu(unsigned int cpu)
>  {
>       sync_sched_exp_online_cleanup(cpu);
>       rcutree_affinity_setting(cpu, -1);
> -     if (IS_ENABLED(CONFIG_TREE_SRCU))
> -             srcu_online_cpu(cpu);
>       return 0;
>  }
>  
> @@ -3741,8 +3739,6 @@ int rcutree_online_cpu(unsigned int cpu)
>  int rcutree_offline_cpu(unsigned int cpu)
>  {
>       rcutree_affinity_setting(cpu, cpu);
> -     if (IS_ENABLED(CONFIG_TREE_SRCU))
> -             srcu_offline_cpu(cpu);
>       return 0;
>  }
>  
> @@ -4188,8 +4184,6 @@ void __init rcu_init(void)
>       for_each_online_cpu(cpu) {
>               rcutree_prepare_cpu(cpu);
>               rcu_cpu_starting(cpu);
> -             if (IS_ENABLED(CONFIG_TREE_SRCU))
> -                     srcu_online_cpu(cpu);
>       }
>  }
>  
> -- 
> 2.14.1
> 

Reply via email to