Re: [PATCH v2 6/6] stop_machine: kill stop_cpus_lock and lg_double_lock/unlock()

2015-08-03 Thread Oleg Nesterov
On 07/30, Peter Zijlstra wrote:
>
> On Tue, Jul 21, 2015 at 09:22:47PM +0200, Oleg Nesterov wrote:
>
> > +	err = -EDEADLK;
> > +	if (stop_work_pending(stopper1) != stop_work_pending(stopper2))
> > +		goto unlock;
>
> You could DoS/false positive this by running stop_one_cpu() in a loop,
> and thereby 'always' having work pending on one but not the other.

As we already discussed, this is not a problem.

> > +	if (unlikely(err == -EDEADLK)) {
> > +		cond_resched();
> > +		goto retry;
>
> And this just gives me -rt nightmares.

Why?

> As it is, -rt does horrible things to stop_machine, and I would very
> much like to make it such that we don't need to do that.
>
> Now, obviously, stop_cpus() is _BAD_ for -rt, and we try real hard to
> make sure that doesn't happen,

Yes. stop_cpus() is already bad, so I am not sure I understand why this
change makes things really worse.

stop_two_cpus() needs to spin/retry if it races with the main loop in
queue_stop_cpus_work(),

	preempt_disable();
	for_each_cpu(cpu, cpumask) {
		work = &per_cpu(cpu_stopper.stop_work, cpu);
		work->fn = fn;
		work->arg = arg;
		work->done = done;
		cpu_stop_queue_work(cpu, work);
	}
	preempt_enable();

and IIRC preempt_disable() means "disable preemption" even in -rt, but
I am not sure. So "goto retry" should be really, really unlikely.
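
To illustrate, here is my sketch of the only interleaving (as I understand
it) in which stop_two_cpus(X, Y) can hit the retry path, and why the
pending check matters at all:

	/*
	 * Hypothetical interleaving (X < Y in locking order):
	 *
	 *   queue_stop_cpus_work()            stop_two_cpus(X, Y)
	 *   ----------------------            -------------------
	 *   preempt_disable();
	 *   queue ->stop_work on X;
	 *                                     lock stopper(X) + stopper(Y)
	 *                                     pending(X) != pending(Y)
	 *                                     -> -EDEADLK: unlock, retry
	 *   queue ->stop_work on Y;
	 *   preempt_enable();
	 *                                     retry sees pending(X) == pending(Y),
	 *                                     queues work1/work2 and succeeds
	 *
	 * Without the check, work1/work2 and the two ->stop_work items could
	 * end up queued in reverse order on X and Y, and the two
	 * multi_cpu_stop() instances would each wait for the other forever.
	 */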

Besides, whatever we do, stop_two_cpus(X, Y) will wait anyway if ->stop_work
was queued on X or Y. And with your patch in the next email it will
spin too (yes, yes, -rt differs).


Another case where stop_two_cpus(X, Y) needs to retry is when ->stop_work
was already dequeued on CPU X but not on CPU Y (and this is why it needs
cond_resched() for CONFIG_PREEMPT=n: it can run on CPU Y). This does not
look really bad either: the migration/Y thread is already activated and it
has the highest priority.


So I still think that, at least correctness-wise, this patch is fine. Have
I missed something else?


> Paul's RCU branch already kills try_stop_cpus() dead, so that wart is
> also gone. But we're still stuck with stop_machine_from_inactive_cpu()
> which does a spin-wait for exclusive state. So I suppose we'll have to
> keep stop_cpus_mutex :/

Yes, we still need stop_cpus_mutex. Even if we remove try_stop_cpus() and
stop_machine_from_inactive_cpu(). But this is another issue.

Oleg.



Re: [PATCH v2 6/6] stop_machine: kill stop_cpus_lock and lg_double_lock/unlock()

2015-08-03 Thread Oleg Nesterov
On 07/31, Peter Zijlstra wrote:
>
> +	for_each_cpu(cpu, cpumask)
> +		arch_spin_lock((arch_spinlock_t *)&per_cpu(cpu_stopper.lock, cpu));
> +
> 	for_each_cpu(cpu, cpumask) {
> 		work = &per_cpu(cpu_stopper.stop_work, cpu);
> 		work->fn = fn;
> 		work->arg = arg;
> 		work->done = done;
> -		cpu_stop_queue_work(cpu, work);
> +		__cpu_stop_queue_work(cpu, work);
> 	}
> -	lg_global_unlock(&stop_cpus_lock);
> +
> +	for_each_cpu(cpu, cpumask)
> +		arch_spin_unlock((arch_spinlock_t *)&per_cpu(cpu_stopper.lock, cpu));

Of course, we discussed this before and I think this should work too.
However, to me this looks more ugly (although better than the current
code), and this is what I tried to avoid.

But! Of course "more ugly" is very much subjective, so I won't really
argue if you prefer this change. That said, let me write another email
in reply to your initial review.

Oleg.



Re: [PATCH v2 6/6] stop_machine: kill stop_cpus_lock and lg_double_lock/unlock()

2015-08-01 Thread Peter Zijlstra
On Sat, Aug 01, 2015 at 12:57:18PM +0200, Oleg Nesterov wrote:

> > > +	if (stop_work_pending(stopper1) != stop_work_pending(stopper2))
> > > +		goto unlock;
> >
> > You could DoS/false positive this by running stop_one_cpu() in a loop,
> > and thereby 'always' having work pending on one but not the other.
> 
> IIRC no. I am pretty sure stop_one_cpu() doesn't use stopper->stop_work,
> only stop_machine() does.

Urgh, I missed that you were testing the cpu_stopper::stop_work, not the
cpu_stopper::works list.
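
For reference, both live side by side in the per-cpu stopper state;
roughly (field layout as in kernel/stop_machine.c around this series,
comments mine):

	struct cpu_stopper {
		struct task_struct	*thread;	/* the migration/N kthread */
		spinlock_t		lock;		/* protects the queue below */
		bool			enabled;	/* is this stopper enabled? */
		struct list_head	works;		/* all queued cpu_stop_work items */
		struct cpu_stop_work	stop_work;	/* static slot, used only for stop_cpus() */
	};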




Re: [PATCH v2 6/6] stop_machine: kill stop_cpus_lock and lg_double_lock/unlock()

2015-08-01 Thread Oleg Nesterov
Hi Peter,

Thanks for looking. I'll try to reply on Monday, just one note...

On 07/30, Peter Zijlstra wrote:
>
> On Tue, Jul 21, 2015 at 09:22:47PM +0200, Oleg Nesterov wrote:
>
> > +static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work *work1,
> > +				    int cpu2, struct cpu_stop_work *work2)
> > +{
> > +	struct cpu_stopper *stopper1 = per_cpu_ptr(&cpu_stopper, cpu1);
> > +	struct cpu_stopper *stopper2 = per_cpu_ptr(&cpu_stopper, cpu2);
> > +	int err;
> > +retry:
> > +	spin_lock_irq(&stopper1->lock);
> > +	spin_lock_nested(&stopper2->lock, SINGLE_DEPTH_NESTING);
> > +	/*
> > +	 * If we observe both CPUs active we know _cpu_down() cannot yet have
> > +	 * queued its stop_machine works and therefore ours will get executed
> > +	 * first. Or its not either one of our CPUs that's getting unplugged,
> > +	 * in which case we don't care.
> > +	 */
> > +	err = -ENOENT;
> > +	if (!cpu_active(cpu1) || !cpu_active(cpu2))
> > +		goto unlock;
> > +
> > +	WARN_ON(!stopper1->enabled || !stopper2->enabled);
> > +	/*
> > +	 * Ensure that if we race with stop_cpus() the stoppers won't
> > +	 * get queued up in reverse order, leading to system deadlock.
> > +	 */
> > +	err = -EDEADLK;
> > +	if (stop_work_pending(stopper1) != stop_work_pending(stopper2))
> > +		goto unlock;
>
> You could DoS/false positive this by running stop_one_cpu() in a loop,
> and thereby 'always' having work pending on one but not the other.

IIRC no. I am pretty sure stop_one_cpu() doesn't use stopper->stop_work,
only stop_machine() does.
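
Roughly (a simplified sketch of kernel/stop_machine.c as of this series),
stop_one_cpu() queues an on-stack work, so ->stop_work stays untouched:

	int stop_one_cpu(unsigned int cpu, cpu_stop_fn_t fn, void *arg)
	{
		struct cpu_stop_done done;
		struct cpu_stop_work work = { .fn = fn, .arg = arg, .done = &done };

		cpu_stop_init_done(&done, 1);
		/* queues the caller's on-stack work, not this CPU's ->stop_work */
		cpu_stop_queue_work(cpu, &work);
		wait_for_completion(&done.completion);
		return done.executed ? done.ret : -ENOENT;
	}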

Oleg.


Re: [PATCH v2 6/6] stop_machine: kill stop_cpus_lock and lg_double_lock/unlock()

2015-07-31 Thread Peter Zijlstra
On Fri, Jul 31, 2015 at 01:12:46PM +0200, Peter Zijlstra wrote:
> On Thu, Jul 30, 2015 at 11:55:27PM +0200, Peter Zijlstra wrote:
> > 
> > Exclusion between stop_{one,two}_cpu{,s}() and stop_cpus() makes this
> > trivially go away.
> > 
> > Paul's RCU branch already kills try_stop_cpus() dead, so that wart is
> > also gone. But we're still stuck with stop_machine_from_inactive_cpu()
> > which does a spin-wait for exclusive state. So I suppose we'll have to
> > keep stop_cpus_mutex :/
> 
> *groan* we really need to kill that from_inactive_cpu() shite too. Lemme
> go have a look at how this MTRR crap works.

*sigh* we can't get rid of it :/ The hardware is 'broken', see:
d0af9eed5aa9 ("x86, pat/mtrr: Rendezvous all the cpus for MTRR/PAT init")

And that means we must not ever block and the global primitive must be a
spinner.

The below is the best we can do.. At least it localizes the ugly.

--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -20,7 +20,6 @@
 #include <linux/kallsyms.h>
 #include <linux/smpboot.h>
 #include <linux/atomic.h>
-#include <linux/lglock.h>
 
 /*
  * Structure to determine completion condition and record errors.  May
@@ -47,14 +46,6 @@ struct cpu_stopper {
 static DEFINE_PER_CPU(struct cpu_stopper, cpu_stopper);
 static bool stop_machine_initialized = false;
 
-/*
- * Avoids a race between stop_two_cpus and global stop_cpus, where
- * the stoppers could get queued up in reverse order, leading to
- * system deadlock. Using an lglock means stop_two_cpus remains
- * relatively cheap.
- */
-DEFINE_STATIC_LGLOCK(stop_cpus_lock);
-
 static void cpu_stop_init_done(struct cpu_stop_done *done, unsigned int nr_todo)
 {
 	memset(done, 0, sizeof(*done));
@@ -74,20 +65,25 @@ static void cpu_stop_signal_done(struct
 }
 
 /* queue @work to @stopper.  if offline, @work is completed immediately */
-static void cpu_stop_queue_work(unsigned int cpu, struct cpu_stop_work *work)
+static void __cpu_stop_queue_work(unsigned int cpu, struct cpu_stop_work *work)
 {
 	struct cpu_stopper *stopper = &per_cpu(cpu_stopper, cpu);
 
-	unsigned long flags;
-
-	spin_lock_irqsave(&stopper->lock, flags);
-
 	if (stopper->enabled) {
 		list_add_tail(&work->list, &stopper->works);
 		wake_up_process(stopper->thread);
-	} else
+	} else {
 		cpu_stop_signal_done(work->done, false);
+	}
+}
 
+static void cpu_stop_queue_work(unsigned int cpu, struct cpu_stop_work *work)
+{
+	struct cpu_stopper *stopper = &per_cpu(cpu_stopper, cpu);
+	unsigned long flags;
+
+	spin_lock_irqsave(&stopper->lock, flags);
+	__cpu_stop_queue_work(cpu, work);
 	spin_unlock_irqrestore(&stopper->lock, flags);
 }
 
@@ -226,9 +222,14 @@ static int multi_cpu_stop(void *data)
  */
 int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, cpu_stop_fn_t fn, void *arg)
 {
-	struct cpu_stop_done done;
+	struct cpu_stopper *stopper1, *stopper2;
 	struct cpu_stop_work work1, work2;
 	struct multi_stop_data msdata;
+	struct cpu_stop_done done;
+	unsigned long flags;
+
+	if (cpu2 < cpu1)
+		swap(cpu1, cpu2);
 
 	preempt_disable();
 	msdata = (struct multi_stop_data){
@@ -260,10 +261,17 @@ int stop_two_cpus(unsigned int cpu1, uns
 		return -ENOENT;
 	}
 
-	lg_double_lock(&stop_cpus_lock, cpu1, cpu2);
-	cpu_stop_queue_work(cpu1, &work1);
-	cpu_stop_queue_work(cpu2, &work2);
-	lg_double_unlock(&stop_cpus_lock, cpu1, cpu2);
+	stopper1 = per_cpu_ptr(&cpu_stopper, cpu1);
+	stopper2 = per_cpu_ptr(&cpu_stopper, cpu2);
+
+	spin_lock_irqsave(&stopper1->lock, flags);
+	spin_lock_nested(&stopper2->lock, SINGLE_DEPTH_NESTING);
+
+	__cpu_stop_queue_work(cpu1, &work1);
+	__cpu_stop_queue_work(cpu2, &work2);
+
+	spin_unlock(&stopper2->lock);
+	spin_unlock_irqrestore(&stopper1->lock, flags);
 
 	preempt_enable();
 
@@ -304,19 +312,24 @@ static void queue_stop_cpus_work(const s
 	unsigned int cpu;
 
 	/*
-	 * Disable preemption while queueing to avoid getting
-	 * preempted by a stopper which might wait for other stoppers
-	 * to enter @fn which can lead to deadlock.
+	 * Disgusting, but take all relevant per-cpu spinlocks to serialize
+	 * against stop_{one,two}_cpu{,s}().
 	 */
-	lg_global_lock(&stop_cpus_lock);
+	preempt_disable();
+	for_each_cpu(cpu, cpumask)
+		arch_spin_lock((arch_spinlock_t *)&per_cpu(cpu_stopper.lock, cpu));
+
 	for_each_cpu(cpu, cpumask) {
 		work = &per_cpu(cpu_stopper.stop_work, cpu);
 		work->fn = fn;
 		work->arg = arg;
 		work->done = done;
-		cpu_stop_queue_work(cpu, work);
+		__cpu_stop_queue_work(cpu, work);
 	}
-	lg_global_unlock(&stop_cpus_lock);
+
+	for_each_cpu(cpu, cpumask)
+		arch_spin_unlock((arch_spinlock_t *)&per_cpu(cpu_stopper.lock, cpu));
+	preempt_enable();
 }
 
 static int __stop_cpus(const struct cpumask *cpumask,

Re: [PATCH v2 6/6] stop_machine: kill stop_cpus_lock and lg_double_lock/unlock()

2015-07-31 Thread Peter Zijlstra
On Thu, Jul 30, 2015 at 11:55:27PM +0200, Peter Zijlstra wrote:
> 
> Exclusion between stop_{one,two}_cpu{,s}() and stop_cpus() makes this
> trivially go away.
> 
> Paul's RCU branch already kills try_stop_cpus() dead, so that wart is
> also gone. But we're still stuck with stop_machine_from_inactive_cpu()
> which does a spin-wait for exclusive state. So I suppose we'll have to
> keep stop_cpus_mutex :/

*groan* we really need to kill that from_inactive_cpu() shite too. Lemme
go have a look at how this MTRR crap works.


Re: [PATCH v2 6/6] stop_machine: kill stop_cpus_lock and lg_double_lock/unlock()

2015-07-30 Thread Peter Zijlstra
On Tue, Jul 21, 2015 at 09:22:47PM +0200, Oleg Nesterov wrote:

> +static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work *work1,
> +				    int cpu2, struct cpu_stop_work *work2)
> +{
> +	struct cpu_stopper *stopper1 = per_cpu_ptr(&cpu_stopper, cpu1);
> +	struct cpu_stopper *stopper2 = per_cpu_ptr(&cpu_stopper, cpu2);
> +	int err;
> +retry:
> +	spin_lock_irq(&stopper1->lock);
> +	spin_lock_nested(&stopper2->lock, SINGLE_DEPTH_NESTING);
> +	/*
> +	 * If we observe both CPUs active we know _cpu_down() cannot yet have
> +	 * queued its stop_machine works and therefore ours will get executed
> +	 * first. Or its not either one of our CPUs that's getting unplugged,
> +	 * in which case we don't care.
> +	 */
> +	err = -ENOENT;
> +	if (!cpu_active(cpu1) || !cpu_active(cpu2))
> +		goto unlock;
> +
> +	WARN_ON(!stopper1->enabled || !stopper2->enabled);
> +	/*
> +	 * Ensure that if we race with stop_cpus() the stoppers won't
> +	 * get queued up in reverse order, leading to system deadlock.
> +	 */
> +	err = -EDEADLK;
> +	if (stop_work_pending(stopper1) != stop_work_pending(stopper2))
> +		goto unlock;

You could DoS/false positive this by running stop_one_cpu() in a loop,
and thereby 'always' having work pending on one but not the other.

(doing so is obviously daft for other reasons)

> +
> +	err = 0;
> +	__cpu_stop_queue_work(stopper1, work1);
> +	__cpu_stop_queue_work(stopper2, work2);
> +unlock:
> +	spin_unlock(&stopper2->lock);
> +	spin_unlock_irq(&stopper1->lock);
> +
> +	if (unlikely(err == -EDEADLK)) {
> +		cond_resched();
> +		goto retry;

And this just gives me -rt nightmares.

> +	}
> +	return err;
> +}

As it is, -rt does horrible things to stop_machine, and I would very
much like to make it such that we don't need to do that.

Now, obviously, stop_cpus() is _BAD_ for -rt, and we try real hard to
make sure that doesn't happen, but stop_one_cpu() and stop_two_cpus()
should not be a problem.

Exclusion between stop_{one,two}_cpu{,s}() and stop_cpus() makes this
trivially go away.

Paul's RCU branch already kills try_stop_cpus() dead, so that wart is
also gone. But we're still stuck with stop_machine_from_inactive_cpu()
which does a spin-wait for exclusive state. So I suppose we'll have to
keep stop_cpus_mutex :/
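
For reference, that mutex just serializes users of the static per-cpu
->stop_work slots; roughly (as in kernel/stop_machine.c today):

	static DEFINE_MUTEX(stop_cpus_mutex);

	int stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg)
	{
		int ret;

		/* the static ->stop_work slots allow only one request at a time */
		mutex_lock(&stop_cpus_mutex);
		ret = __stop_cpus(cpumask, fn, arg);
		mutex_unlock(&stop_cpus_mutex);
		return ret;
	}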