On Tue, Jun 23, 2015 at 11:26:26AM -0700, Paul E. McKenney wrote:
> On Tue, Jun 23, 2015 at 08:04:11PM +0200, Peter Zijlstra wrote:
> > On Tue, Jun 23, 2015 at 10:30:38AM -0700, Paul E. McKenney wrote:
> > > Good, you don't need this because you can check for dynticks later.
> > > You will need to check for offline CPUs.
> >
> > get_online_cpus()
> > for_each_online_cpus() {
> >  ...
> > }
> >
> > is what the new code does.
>
> Ah, I missed that this was not deleted.

But get_online_cpus() will re-introduce a deadlock.

							Thanx, Paul

> > > > -	/*
> > > > -	 * Each pass through the following loop attempts to force a
> > > > -	 * context switch on each CPU.
> > > > -	 */
> > > > -	while (try_stop_cpus(cma ? cm : cpu_online_mask,
> > > > -			     synchronize_sched_expedited_cpu_stop,
> > > > -			     NULL) == -EAGAIN) {
> > > > -		put_online_cpus();
> > > > -		atomic_long_inc(&rsp->expedited_tryfail);
> > > > -
> > > > -		/* Check to see if someone else did our work for us. */
> > > > -		s = atomic_long_read(&rsp->expedited_done);
> > > > -		if (ULONG_CMP_GE((ulong)s, (ulong)firstsnap)) {
> > > > -			/* ensure test happens before caller kfree */
> > > > -			smp_mb__before_atomic(); /* ^^^ */
> > > > -			atomic_long_inc(&rsp->expedited_workdone1);
> > > > -			free_cpumask_var(cm);
> > > > -			return;
> > >
> > > Here you lose batching. Yeah, I know that synchronize_sched_expedited()
> > > is -supposed- to be used sparingly, but it is not cool for the kernel
> > > to melt down just because some creative user found a way to heat up a
> > > code path. Need a mutex_trylock() with a counter and checking for
> > > others having already done the needed work.
> >
> > I really think you're making that expedited nonsense far too accessible.
>
> This has nothing to do with accessibility and everything to do with
> robustness. And with me not becoming the triage center for too many
> non-RCU bugs.
>
> > But it was exactly that trylock I was trying to get rid of.
>
> OK. Why, exactly?
>
> > > And we still need to be able to drop back to synchronize_sched()
> > > (AKA wait_rcu_gp(call_rcu_sched) in this case) in case we have both a
> > > creative user and a long-running RCU-sched read-side critical section.
> >
> > No, a long-running RCU-sched read-side is a bug and we should fix that,
> > its called a preemption-latency, we don't like those.
>
> Yes, we should fix them.
> No, they absolutely must not result in a
> meltdown of some unrelated portion of the kernel (like RCU), particularly
> if this situation occurs on some system running a production workload
> that doesn't happen to care about preemption latency.
>
> > > > +	for_each_online_cpu(cpu) {
> > > > +		struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu);
> > > >
> > > > -		/* Recheck to see if someone else did our work for us. */
> > > > -		s = atomic_long_read(&rsp->expedited_done);
> > > > -		if (ULONG_CMP_GE((ulong)s, (ulong)firstsnap)) {
> > > > -			/* ensure test happens before caller kfree */
> > > > -			smp_mb__before_atomic(); /* ^^^ */
> > > > -			atomic_long_inc(&rsp->expedited_workdone2);
> > > > -			free_cpumask_var(cm);
> > > > -			return;
> > > > -		}
> > > > +		/* Offline CPUs, idle CPUs, and any CPU we run on are quiescent. */
> > > > +		if (!(atomic_add_return(0, &rdtp->dynticks) & 0x1))
> > > > +			continue;
> > >
> > > Let's see... This does work for idle CPUs and for nohz_full CPUs running
> > > in userspace.
> > >
> > > It does not work for the current CPU, so the check needs an additional
> > > check against raw_smp_processor_id(), which is easy enough to add.
> >
> > Right, realized after I send it out, but it _should_ work for the
> > current cpu too. Just pointless doing it.
>
> OK, and easily fixed up in any case.
>
> > > There always has been a race window involving CPU hotplug.
> >
> > There is no hotplug race, the entire thing has get_online_cpus() held
> > across it.
>
> Which I would like to get rid of, but not urgent.
>
> > > > +		stop_one_cpu(cpu, synchronize_sched_expedited_cpu_stop, NULL);
> > >
> > > My thought was to use smp_call_function_single(), and to have the function
> > > called recheck dyntick-idle state, avoiding doing a set_tsk_need_resched()
> > > if so.
> >
> > set_tsk_need_resched() is buggy and should not be used.
>
> OK, what API is used for this purpose?
>
> > > This would result in a single pass through schedule() instead
> > > of stop_one_cpu()'s double context switch. It would likely also require
> > > some rework of rcu_note_context_switch(), which stop_one_cpu() avoids
> > > the need for.
> >
> > _IF_ you're going to touch rcu_note_context_switch(), you might as well
> > use a completion, set it for the number of CPUs that need a resched,
> > spray resched-IPI and have rcu_note_context_switch() do a complete().
> >
> > But I would really like to avoid adding code to
> > rcu_note_context_switch(), because we run that on _every_ single context
> > switch.
>
> I believe that I can rework the current code to get the effect without
> increased overhead, given that I have no intention of adding the
> complete(). Adding the complete -would- add overhead to that fastpath.
>
> 							Thanx, Paul
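
For reference, the batching requested in the quoted text ("a mutex_trylock() with a counter and checking for others having already done the needed work", plus a fallback to a normal grace period) can be modelled in userspace roughly as below. This is only a sketch under stated assumptions, not the kernel implementation: the exp_* names, MAX_TRIES, and force_quiescent_states() are hypothetical, a C11 atomic_flag stands in for mutex_trylock(), and the memory-ordering details of the real code are not reproduced. ULONG_CMP_GE() follows the same wraparound-safe idea as the kernel macro of that name.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Wraparound-safe "a >= b" on unsigned long counters. */
#define ULONG_CMP_GE(a, b) \
	(ULONG_MAX / 2 >= (unsigned long)(a) - (unsigned long)(b))

#define MAX_TRIES 10	/* hypothetical retry budget before falling back */

enum exp_result { EXP_DID_WORK, EXP_PIGGYBACKED, EXP_FALL_BACK };

static atomic_ulong exp_start;	/* tickets handed out so far */
static atomic_ulong exp_done;	/* newest ticket covered by a finished pass */
static atomic_flag exp_busy = ATOMIC_FLAG_INIT;	/* stand-in for mutex_trylock() */
static unsigned long exp_passes;	/* how many expensive passes actually ran */

/* Hypothetical stand-in for forcing a context switch on every CPU. */
static void force_quiescent_states(void)
{
	exp_passes++;
}

/* Step 1: take a ticket. Any pass that starts after this satisfies us. */
static unsigned long exp_snapshot(void)
{
	return atomic_fetch_add(&exp_start, 1) + 1;
}

/* Step 2: piggyback on someone else's pass, run a pass, or give up. */
static enum exp_result exp_finish(unsigned long snap)
{
	for (int tries = 0; tries < MAX_TRIES; tries++) {
		unsigned long s;

		/* Did someone else already do our work for us? */
		if (ULONG_CMP_GE(atomic_load(&exp_done), snap))
			return EXP_PIGGYBACKED;

		/* Trylock failed: go around and recheck exp_done. */
		if (atomic_flag_test_and_set(&exp_busy))
			continue;

		/* We hold the lock; recheck before paying for a pass. */
		if (ULONG_CMP_GE(atomic_load(&exp_done), snap)) {
			atomic_flag_clear(&exp_busy);
			return EXP_PIGGYBACKED;
		}

		/* Every ticket issued up to now is covered by this pass. */
		s = atomic_load(&exp_start);
		assert(ULONG_CMP_GE(s, snap));
		force_quiescent_states();
		atomic_store(&exp_done, s);
		atomic_flag_clear(&exp_busy);
		return EXP_DID_WORK;
	}

	/* Too contended: caller should use a normal grace period instead. */
	return EXP_FALL_BACK;
}
```

In this model, two callers that both take tickets before either runs a pass need only one call to force_quiescent_states(): whichever gets the lock first covers both tickets, and the other returns EXP_PIGGYBACKED. A caller that cannot get the lock within MAX_TRIES attempts gets EXP_FALL_BACK, which corresponds to dropping back to synchronize_sched() rather than spinning.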