On Tue, Jul 21, 2015 at 09:22:47PM +0200, Oleg Nesterov wrote:

> +static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work *work1,
> +                                 int cpu2, struct cpu_stop_work *work2)
> +{
> +     struct cpu_stopper *stopper1 = per_cpu_ptr(&cpu_stopper, cpu1);
> +     struct cpu_stopper *stopper2 = per_cpu_ptr(&cpu_stopper, cpu2);
> +     int err;
> +retry:
> +     spin_lock_irq(&stopper1->lock);
> +     spin_lock_nested(&stopper2->lock, SINGLE_DEPTH_NESTING);
> +     /*
> +      * If we observe both CPUs active we know _cpu_down() cannot yet have
> +      * queued its stop_machine works and therefore ours will get executed
> +      * first. Or it's not either one of our CPUs that's getting unplugged,
> +      * in which case we don't care.
> +      */
> +     err = -ENOENT;
> +     if (!cpu_active(cpu1) || !cpu_active(cpu2))
> +             goto unlock;
> +
> +     WARN_ON(!stopper1->enabled || !stopper2->enabled);
> +     /*
> +      * Ensure that if we race with stop_cpus() the stoppers won't
> +      * get queued up in reverse order, leading to system deadlock.
> +      */
> +     err = -EDEADLK;
> +     if (stop_work_pending(stopper1) != stop_work_pending(stopper2))
> +             goto unlock;

You could DoS / false-positive this by running stop_one_cpu() in a loop,
and thereby 'always' having work pending on one stopper but not the other.

(Doing so is obviously daft for other reasons.)
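To make that concrete, here's a sketch of the kind of loop that keeps the
pending check asymmetric; the no-op callback and the kthread wrapper are
hypothetical, not anything from the patch:

	#include <linux/kthread.h>
	#include <linux/stop_machine.h>

	static int nop_stop_fn(void *arg)
	{
		return 0;		/* do nothing, just occupy the stopper */
	}

	/*
	 * Hammering one CPU's stopper keeps its queue transiently non-empty
	 * while the other CPU's queue stays empty, so
	 *	stop_work_pending(stopper1) != stop_work_pending(stopper2)
	 * keeps tripping and stop_two_cpus() spins in its -EDEADLK retry loop.
	 */
	static int hammer_thread(void *unused)
	{
		while (!kthread_should_stop())
			stop_one_cpu(0, nop_stop_fn, NULL);	/* always target CPU 0 */
		return 0;
	}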

> +
> +     err = 0;
> +     __cpu_stop_queue_work(stopper1, work1);
> +     __cpu_stop_queue_work(stopper2, work2);
> +unlock:
> +     spin_unlock(&stopper2->lock);
> +     spin_unlock_irq(&stopper1->lock);
> +
> +     if (unlikely(err == -EDEADLK)) {
> +             cond_resched();
> +             goto retry;

And this just gives me -rt nightmares.

> +     }
> +     return err;
> +}

As it is, -rt does horrible things to stop_machine, and I would very
much like to make it such that we don't need to do that.

Now, obviously, stop_cpus() is _BAD_ for -rt, and we try real hard to
make sure that doesn't happen, but stop_one_cpu() and stop_two_cpus()
should not be a problem.

Exclusion between stop_{one,two}_cpu{,s}() and stop_cpus() makes this
trivially go away.
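Purely to illustrate the shape I mean (the rwsem and its name are
hypothetical, not a concrete proposal): the one/two-CPU paths take a read
side, which never excludes them from each other, while stop_cpus() takes
the write side:

	#include <linux/rwsem.h>
	#include <linux/stop_machine.h>

	static DECLARE_RWSEM(stop_cpus_excl);		/* hypothetical */

	int stop_two_cpus(unsigned int cpu1, unsigned int cpu2,
			  cpu_stop_fn_t fn, void *arg)
	{
		down_read(&stop_cpus_excl);	/* cheap paths run concurrently */
		/* ... queue the two works and wait for completion, as in the patch ... */
		up_read(&stop_cpus_excl);
		return 0;
	}

	static int __stop_cpus(const struct cpumask *cpumask,
			       cpu_stop_fn_t fn, void *arg)
	{
		down_write(&stop_cpus_excl);	/* excludes all one/two-CPU queueing */
		/* ... queue the mask-wide works and wait, as today ... */
		up_write(&stop_cpus_excl);
		return 0;
	}

With something like that in place, the stop_work_pending() ordering check
and the -EDEADLK retry above shouldn't be needed at all, since stop_cpus()
can no longer interleave its queueing with the two-CPU path.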

Paul's RCU branch already kills try_stop_cpus() dead, so that wart is
also gone. But we're still stuck with stop_machine_from_inactive_cpu()
which does a spin-wait for exclusive state. So I suppose we'll have to
keep stop_cpus_mutex :/
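(For reference, the spin-wait in question looks roughly like this in
stop_machine_from_inactive_cpu(); paraphrased:)

	/* No proper task context and we must not sleep here,
	 * so busy-wait for exclusive access. */
	while (!mutex_trylock(&stop_cpus_mutex))
		cpu_relax();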