On Tue, Jun 23, 2015 at 11:26:26AM -0700, Paul E. McKenney wrote:
> On Tue, Jun 23, 2015 at 08:04:11PM +0200, Peter Zijlstra wrote:
> > On Tue, Jun 23, 2015 at 10:30:38AM -0700, Paul E. McKenney wrote:
> > > Good, you don't need this because you can check for dynticks later.
> > > You will need to check for offline CPUs.
> > 
> > get_online_cpus()
> > for_each_online_cpu() {
> >  ...
> > }
> > 
> > is what the new code does.
> 
> Ah, I missed that this was not deleted.

But get_online_cpus() will re-introduce a deadlock.

                                                        Thanx, Paul

> > > > -       /*
> > > > -        * Each pass through the following loop attempts to force a
> > > > -        * context switch on each CPU.
> > > > -        */
> > > > -       while (try_stop_cpus(cma ? cm : cpu_online_mask,
> > > > -                            synchronize_sched_expedited_cpu_stop,
> > > > -                            NULL) == -EAGAIN) {
> > > > -               put_online_cpus();
> > > > -               atomic_long_inc(&rsp->expedited_tryfail);
> > > > -
> > > > -               /* Check to see if someone else did our work for us. */
> > > > -               s = atomic_long_read(&rsp->expedited_done);
> > > > -               if (ULONG_CMP_GE((ulong)s, (ulong)firstsnap)) {
> > > > -                       /* ensure test happens before caller kfree */
> > > > -                       smp_mb__before_atomic(); /* ^^^ */
> > > > -                       atomic_long_inc(&rsp->expedited_workdone1);
> > > > -                       free_cpumask_var(cm);
> > > > -                       return;
> > > 
> > > Here you lose batching.  Yeah, I know that synchronize_sched_expedited()
> > > is -supposed- to be used sparingly, but it is not cool for the kernel
> > > to melt down just because some creative user found a way to heat up a
> > > code path.  Need a mutex_trylock() with a counter and checking for
> > > others having already done the needed work.
> > 
> > I really think you're making that expedited nonsense far too accessible.
> 
> This has nothing to do with accessibility and everything to do with
> robustness.  And with me not becoming the triage center for too many
> non-RCU bugs.
> 
> > But it was exactly that trylock I was trying to get rid of.
> 
> OK.  Why, exactly?
> 
> > > And we still need to be able to drop back to synchronize_sched()
> > > (AKA wait_rcu_gp(call_rcu_sched) in this case) in case we have both a
> > > creative user and a long-running RCU-sched read-side critical section.
> > 
> > No, a long-running RCU-sched read-side is a bug and we should fix that,
> > it's called preemption latency, we don't like those.
> 
> Yes, we should fix them.  No, they absolutely must not result in a
> meltdown of some unrelated portion of the kernel (like RCU), particularly
> if this situation occurs on some system running a production workload
> that doesn't happen to care about preemption latency.
> 
> > > > +       for_each_online_cpu(cpu) {
> > > > +               struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu);
> > > > 
> > > > -               /* Recheck to see if someone else did our work for us. */
> > > > -               s = atomic_long_read(&rsp->expedited_done);
> > > > -               if (ULONG_CMP_GE((ulong)s, (ulong)firstsnap)) {
> > > > -                       /* ensure test happens before caller kfree */
> > > > -                       smp_mb__before_atomic(); /* ^^^ */
> > > > -                       atomic_long_inc(&rsp->expedited_workdone2);
> > > > -                       free_cpumask_var(cm);
> > > > -                       return;
> > > > -               }
> > > > +               /* Offline CPUs, idle CPUs, and any CPU we run on are quiescent. */
> > > > +               if (!(atomic_add_return(0, &rdtp->dynticks) & 0x1))
> > > > +                       continue;
> > > 
> > > Let's see...  This does work for idle CPUs and for nohz_full CPUs running
> > > in userspace.
> > > 
> > > It does not work for the current CPU, so the check needs an additional
> > > check against raw_smp_processor_id(), which is easy enough to add.
> > 
> > Right, I realized that after I sent it out, but it _should_ work for
> > the current cpu too. It is just pointless to do it.
> 
> OK, and easily fixed up in any case.
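
For anyone following along, here is a minimal userspace sketch of the
check under discussion: skip the CPU we are running on, and skip any CPU
whose dynticks counter is even (idle).  Everything named sketch_* is
hypothetical, and a plain array of C11 atomics stands in for the per-CPU
rcu_dynticks structure.  It is not the kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 4	/* hypothetical CPU count for the sketch */

/* Hypothetical stand-in for per-CPU rdtp->dynticks: even means idle. */
static atomic_int sketch_dynticks[SKETCH_NR_CPUS];

/* Does this CPU still need to be forced through a context switch? */
static bool sketch_cpu_needs_poke(int cpu, int self)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);

	/* The CPU running this code is already in a quiescent state. */
	if (cpu == self)
		return false;

	/*
	 * Fully ordered read of the counter, playing the role that
	 * atomic_add_return(0, ...) plays in the patch.  An odd value
	 * means the CPU is not idle.
	 */
	return (atomic_fetch_add(&sketch_dynticks[cpu], 0) & 0x1) != 0;
}

/* Collect the CPUs that need a poke into out[]; return how many. */
static size_t sketch_collect(int self, int out[SKETCH_NR_CPUS])
{
	size_t n = 0;

	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		if (sketch_cpu_needs_poke(cpu, self))
			out[n++] = cpu;
	return n;
}
```

The sketch only decides which CPUs to poke.  How to poke them
(stop_one_cpu() versus smp_call_function_single()) is the separate
question discussed further down.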
> 
> > > There always has been a race window involving CPU hotplug.
> > 
> > There is no hotplug race, the entire thing has get_online_cpus() held
> > across it.
> 
> Which I would like to get rid of, but not urgent.
> 
> > > > +               stop_one_cpu(cpu, synchronize_sched_expedited_cpu_stop, 
> > > > NULL);
> > > 
> > > My thought was to use smp_call_function_single(), and to have the function
> > > called recheck dyntick-idle state, avoiding doing a set_tsk_need_resched()
> > > if so.
> > 
> > set_tsk_need_resched() is buggy and should not be used.
> 
> OK, what API is used for this purpose?
> 
> > > This would result in a single pass through schedule() instead
> > > of stop_one_cpu()'s double context switch.  It would likely also require
> > > some rework of rcu_note_context_switch(), which stop_one_cpu() avoids
> > > the need for.
> > 
> > _IF_ you're going to touch rcu_note_context_switch(), you might as well
> > use a completion, set it for the number of CPUs that need a resched,
> > spray resched-IPI and have rcu_note_context_switch() do a complete().
> > 
> > But I would really like to avoid adding code to
> > rcu_note_context_switch(), because we run that on _every_ single context
> > switch.
> 
> I believe that I can rework the current code to get the effect without
> increased overhead, given that I have no intention of adding the
> complete().  Adding the complete() -would- add overhead to that fastpath.
> 
>                                                       Thanx, Paul
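
As a rough illustration of the "mutex_trylock() with a counter and
checking for others having already done the needed work" idea quoted
above, here is a minimal userspace sketch using pthreads and C11 atomics.
All sketch_* names are hypothetical, and the start/done counter pair is
an assumption loosely modeled on the expedited_start/expedited_done
fields in the removed code.  It leaves out the fallback to
synchronize_sched() after too many failed attempts.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state; none of these names are kernel API. */
static pthread_mutex_t sketch_exp_mutex = PTHREAD_MUTEX_INITIALIZER;
static atomic_ulong sketch_exp_start;	/* requests issued so far */
static atomic_ulong sketch_exp_done;	/* highest request already covered */

/* Wrap-safe "a >= b" for free-running counters, in the style of ULONG_CMP_GE(). */
static bool sketch_counter_ge(unsigned long a, unsigned long b)
{
	return (a - b) <= (~0UL >> 1);
}

/*
 * Returns true if this caller ran the expensive pass itself, false if a
 * pass that started after our request was already completed by someone else.
 */
static bool sketch_expedited(void (*do_pass)(void))
{
	unsigned long snap, covered;

	assert(do_pass != NULL);
	snap = atomic_fetch_add(&sketch_exp_start, 1) + 1;

	/* Do not block on the mutex: keep checking whether we were covered. */
	while (pthread_mutex_trylock(&sketch_exp_mutex) != 0) {
		if (sketch_counter_ge(atomic_load(&sketch_exp_done), snap))
			return false;
		sched_yield();
	}

	/* Recheck under the lock before doing any work. */
	if (sketch_counter_ge(atomic_load(&sketch_exp_done), snap)) {
		pthread_mutex_unlock(&sketch_exp_mutex);
		return false;
	}

	/* Every request issued before this point is covered by our pass. */
	covered = atomic_load(&sketch_exp_start);
	do_pass();
	atomic_store(&sketch_exp_done, covered);
	pthread_mutex_unlock(&sketch_exp_mutex);
	return true;
}

/* Hypothetical stand-in for the expensive pass; it only counts calls. */
static unsigned long sketch_passes;
static void sketch_count_pass(void)
{
	sketch_passes++;
}
```

The batching comes from recording the request count before the pass
starts: one pass then satisfies every caller that queued up behind the
mutex holder, so a burst of callers costs far fewer passes than callers.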
