Re: [PATCH RT v3 3/5] sched: migrate_dis/enable: Use rt_invol_sleep

2019-10-04 Thread Sebastian Andrzej Siewior
On 2019-09-24 11:35:16 [-0500], Scott Wood wrote:
> 
> OK, sounds like stop_one_cpu_nowait() is the way to go then.

so I applied the last three patches from the migrate-disable() series
and it looks good. Nothing blew up in my testing.
There were no objections to the stop_one_cpu_nowait() suggestion, and I
think it is important not to lose the task state at the end.
Could you please repost the patches?

> -Scott

Sebastian


Re: [PATCH RT v3 3/5] sched: migrate_dis/enable: Use rt_invol_sleep

2019-09-24 Thread Scott Wood
On Tue, 2019-09-24 at 18:05 +0200, Sebastian Andrzej Siewior wrote:
> On 2019-09-24 10:47:36 [-0500], Scott Wood wrote:
> > When the stop machine finishes it will do a wake_up_process() via
> > complete().  Since this does not pass WF_LOCK_SLEEPER, saved_state
> > will be cleared, and you'll have TASK_RUNNING when you get to
> > other_func() and schedule(), regardless of whether CPU1 sends
> > wake_up() -- so this change doesn't actually accomplish anything.
> 
> True, I completely missed that part.
> 
> > While as noted in the other thread I don't think these spurious
> > wakeups are a huge problem, we could avoid them by doing
> > stop_one_cpu_nowait() and then schedule() without messing with task
> > state.  Since we're stopping our own cpu, it should be guaranteed
> > that the stopper has finished by the time we exit schedule().
> 
> I remember that losing a state can be a problem. Let's say it is not
> "just" TASK_UNINTERRUPTIBLE -> TASK_RUNNING, which sounds harmless, but
> __TASK_TRACED, and you lose it as part of unlocking siglock.

OK, sounds like stop_one_cpu_nowait() is the way to go then.

-Scott




Re: [PATCH RT v3 3/5] sched: migrate_dis/enable: Use rt_invol_sleep

2019-09-24 Thread Sebastian Andrzej Siewior
On 2019-09-24 10:47:36 [-0500], Scott Wood wrote:
> When the stop machine finishes it will do a wake_up_process() via
> complete().  Since this does not pass WF_LOCK_SLEEPER, saved_state will be
> cleared, and you'll have TASK_RUNNING when you get to other_func() and
> schedule(), regardless of whether CPU1 sends wake_up() -- so this change
> doesn't actually accomplish anything.

True, I completely missed that part.

> While as noted in the other thread I don't think these spurious wakeups are
> a huge problem, we could avoid them by doing stop_one_cpu_nowait() and then
> schedule() without messing with task state.  Since we're stopping our own
> cpu, it should be guaranteed that the stopper has finished by the time we
> exit schedule().

I remember that losing a state can be a problem. Let's say it is not
"just" TASK_UNINTERRUPTIBLE -> TASK_RUNNING, which sounds harmless, but
__TASK_TRACED, and you lose it as part of unlocking siglock.

> -Scott

Sebastian


Re: [PATCH RT v3 3/5] sched: migrate_dis/enable: Use rt_invol_sleep

2019-09-24 Thread Scott Wood
On Tue, 2019-09-24 at 17:25 +0200, Sebastian Andrzej Siewior wrote:
> On 2019-09-24 08:53:43 [-0500], Scott Wood wrote:
> > As I pointed out in the "[PATCH RT 6/8] sched: migrate_enable: Set
> > state to TASK_RUNNING" discussion, we can get here inside the rtmutex
> > code (e.g. from debug_rt_mutex_print_deadlock) where saved_state is
> > already holding something -- plus, the waker won't have
> > WF_LOCK_SLEEPER and therefore saved_state will get cleared anyway.
> 
> So let me drop the saved_state pieces and get back to it once I get to
> the other thread (which you replied to, and I didn't realise until now).
> 
> Regarding the WF_LOCK_SLEEPER part: I think this works as expected.
> Imagine:
> 
> CPU0                                    CPU1
> spin_lock();
> set_current_state(TASK_UNINTERRUPTIBLE);
> …
> spin_unlock()
>  -> migrate_enable();
>    -> stop_one_cpu();                    <-- A)
> other_func();                            <-- B)
> schedule();
> 
> So: with only CPU0, we enter schedule() with TASK_UNINTERRUPTIBLE
> because the state gets preserved by the change I added (which is
> expected). If CPU1 sends a wake_up() at A), then saved_state gets
> overwritten and we enter schedule() with TASK_RUNNING. The same happens
> if it is sent at point B), which is outside of any migrate/spinlock
> related code.
> 
> Was this clear or did I miss the point?

When the stop machine finishes it will do a wake_up_process() via
complete().  Since this does not pass WF_LOCK_SLEEPER, saved_state will be
cleared, and you'll have TASK_RUNNING when you get to other_func() and
schedule(), regardless of whether CPU1 sends wake_up() -- so this change
doesn't actually accomplish anything.

While as noted in the other thread I don't think these spurious wakeups are
a huge problem, we could avoid them by doing stop_one_cpu_nowait() and then
schedule() without messing with task state.  Since we're stopping our own
cpu, it should be guaranteed that the stopper has finished by the time we
exit schedule().
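
Roughly, an untested sketch of that approach (it reuses the existing
migration_arg/migration_cpu_stop machinery; the wrapper name here is
made up):

        static void migrate_enable_push_sketch(struct migration_arg *arg)
        {
                struct cpu_stop_work work;  /* on-stack is fine: we block below */

                /* Queue the stopper on this CPU instead of sleeping on a
                 * completion the way stop_one_cpu() does. */
                stop_one_cpu_nowait(smp_processor_id(), migration_cpu_stop,
                                    arg, &work);

                /* We are runnable here, so schedule() simply lets the
                 * higher-priority stopper run; it migrates us and we resume
                 * on the destination CPU.  No wake_up_process() is involved,
                 * so saved_state is never touched. */
                schedule();
        }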

-Scott




Re: [PATCH RT v3 3/5] sched: migrate_dis/enable: Use rt_invol_sleep

2019-09-24 Thread Sebastian Andrzej Siewior
On 2019-09-24 08:53:43 [-0500], Scott Wood wrote:
> As I pointed out in the "[PATCH RT 6/8] sched: migrate_enable: Set state to
> TASK_RUNNING" discussion, we can get here inside the rtmutex code (e.g. from
> debug_rt_mutex_print_deadlock) where saved_state is already holding
> something -- plus, the waker won't have WF_LOCK_SLEEPER and therefore
> saved_state will get cleared anyway.

So let me drop the saved_state pieces and get back to it once I get to
the other thread (which you replied to, and I didn't realise until now).

Regarding the WF_LOCK_SLEEPER part: I think this works as expected.
Imagine:

CPU0                                    CPU1
spin_lock();
set_current_state(TASK_UNINTERRUPTIBLE);
…
spin_unlock()
 -> migrate_enable();
   -> stop_one_cpu();                    <-- A)
other_func();                            <-- B)
schedule();

So: with only CPU0, we enter schedule() with TASK_UNINTERRUPTIBLE
because the state gets preserved by the change I added (which is
expected). If CPU1 sends a wake_up() at A), then saved_state gets
overwritten and we enter schedule() with TASK_RUNNING. The same happens
if it is sent at point B), which is outside of any migrate/spinlock
related code.

Was this clear or did I miss the point?

> -Scott

Sebastian


Re: [PATCH RT v3 3/5] sched: migrate_dis/enable: Use rt_invol_sleep

2019-09-24 Thread Scott Wood
On Tue, 2019-09-24 at 13:21 +0200, Sebastian Andrzej Siewior wrote:
> On 2019-09-23 19:52:33 [+0200], To Scott Wood wrote:
> 
> I made this:
> 
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 885a195dfbe02..25afa2bb1a2cf 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -308,7 +308,9 @@ void pin_current_cpu(void)
>          preempt_lazy_enable();
>          preempt_enable();
>  
> +        sleeping_lock_inc();
>          __read_rt_lock(cpuhp_pin);
> +        sleeping_lock_dec();
>  
>          preempt_disable();
>          preempt_lazy_disable();
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index e1bdd7f9be054..63a6420d01053 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7388,6 +7388,7 @@ void migrate_enable(void)
>  
>          WARN_ON(smp_processor_id() != task_cpu(p));
>          if (!cpumask_test_cpu(task_cpu(p), &p->cpus_mask)) {
> +                struct task_struct *self = current;
>                  const struct cpumask *cpu_valid_mask = cpu_active_mask;
>                  struct migration_arg arg;
>                  unsigned int dest_cpu;
> @@ -7405,7 +7406,21 @@ void migrate_enable(void)
>                  unpin_current_cpu();
>                  preempt_lazy_enable();
>                  preempt_enable();
> +                rt_invol_sleep_inc();
> +
> +                raw_spin_lock_irq(&self->pi_lock);
> +                self->saved_state = self->state;
> +                __set_current_state_no_track(TASK_RUNNING);
> +                raw_spin_unlock_irq(&self->pi_lock);
> +
>                  stop_one_cpu(task_cpu(p), migration_cpu_stop, &arg);
> +
> +                raw_spin_lock_irq(&self->pi_lock);
> +                __set_current_state_no_track(self->saved_state);
> +                self->saved_state = TASK_RUNNING;
> +                raw_spin_unlock_irq(&self->pi_lock);
> +
> +                rt_invol_sleep_dec();
>                  return;
>          }
>  }
> 
> I think we need to preserve the current state; otherwise we will lose
> anything != TASK_RUNNING here. So the spin_lock() would preserve it
> while waiting, but migrate_enable() will lose it if it needs to change
> the CPU at the end.
> I will try to prepare all commits for the next release before I release
> it, so you can have a look first and yell if needed.

As I pointed out in the "[PATCH RT 6/8] sched: migrate_enable: Set state to
TASK_RUNNING" discussion, we can get here inside the rtmutex code (e.g. from
debug_rt_mutex_print_deadlock) where saved_state is already holding
something -- plus, the waker won't have WF_LOCK_SLEEPER and therefore
saved_state will get cleared anyway.
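
For reference, the saved_state handling in the -rt try_to_wake_up() looks
roughly like this (paraphrased from the patchset, not an exact quote):

        /* Wakee's state doesn't match; maybe it's a sleeping-lock sleeper. */
        if (!(p->state & state)) {
                if (p->saved_state & state) {
                        wake_flags |= WF_LOCK_SLEEPER;
                        success = 1;
                }
                goto out;
        }

        /* Any regular wakeup -- such as the complete() issued when the
         * stopper finishes -- lacks WF_LOCK_SLEEPER and therefore wipes a
         * lock sleeper's stashed state. */
        if (!(wake_flags & WF_LOCK_SLEEPER))
                p->saved_state = TASK_RUNNING;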

-Scott




Re: [PATCH RT v3 3/5] sched: migrate_dis/enable: Use rt_invol_sleep

2019-09-24 Thread Sebastian Andrzej Siewior
On 2019-09-23 19:52:33 [+0200], To Scott Wood wrote:

I made this:

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 885a195dfbe02..25afa2bb1a2cf 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -308,7 +308,9 @@ void pin_current_cpu(void)
         preempt_lazy_enable();
         preempt_enable();
 
+        sleeping_lock_inc();
         __read_rt_lock(cpuhp_pin);
+        sleeping_lock_dec();
 
         preempt_disable();
         preempt_lazy_disable();
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e1bdd7f9be054..63a6420d01053 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7388,6 +7388,7 @@ void migrate_enable(void)
 
         WARN_ON(smp_processor_id() != task_cpu(p));
         if (!cpumask_test_cpu(task_cpu(p), &p->cpus_mask)) {
+                struct task_struct *self = current;
                 const struct cpumask *cpu_valid_mask = cpu_active_mask;
                 struct migration_arg arg;
                 unsigned int dest_cpu;
@@ -7405,7 +7406,21 @@ void migrate_enable(void)
                 unpin_current_cpu();
                 preempt_lazy_enable();
                 preempt_enable();
+                rt_invol_sleep_inc();
+
+                raw_spin_lock_irq(&self->pi_lock);
+                self->saved_state = self->state;
+                __set_current_state_no_track(TASK_RUNNING);
+                raw_spin_unlock_irq(&self->pi_lock);
+
                 stop_one_cpu(task_cpu(p), migration_cpu_stop, &arg);
+
+                raw_spin_lock_irq(&self->pi_lock);
+                __set_current_state_no_track(self->saved_state);
+                self->saved_state = TASK_RUNNING;
+                raw_spin_unlock_irq(&self->pi_lock);
+
+                rt_invol_sleep_dec();
                 return;
         }
 }

I think we need to preserve the current state; otherwise we will lose
anything != TASK_RUNNING here. So the spin_lock() would preserve it
while waiting, but migrate_enable() will lose it if it needs to change
the CPU at the end.
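
For comparison, this is the pattern the sleeping-lock slowpath already
uses while waiting -- condensed from rt_spin_lock_slowlock_locked(), with
the details elided:

        raw_spin_lock(&self->pi_lock);
        self->saved_state = self->state;  /* stash e.g. TASK_UNINTERRUPTIBLE */
        __set_current_state_no_track(TASK_UNINTERRUPTIBLE);
        raw_spin_unlock(&self->pi_lock);

        /* ... enqueue the waiter, schedule() until the lock is acquired ... */

        raw_spin_lock(&self->pi_lock);
        __set_current_state_no_track(self->saved_state);  /* restore */
        self->saved_state = TASK_RUNNING;
        raw_spin_unlock(&self->pi_lock);
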
I will try to prepare all commits for the next release before I release
it, so you can have a look first and yell if needed.

> > -Scott
 
Sebastian


Re: [PATCH RT v3 3/5] sched: migrate_dis/enable: Use rt_invol_sleep

2019-09-23 Thread Sebastian Andrzej Siewior
On 2019-09-23 11:59:23 [-0500], Scott Wood wrote:
> On Tue, 2019-09-17 at 09:06 -0500, Scott Wood wrote:
> > On Tue, 2019-09-17 at 09:59 +0200, Sebastian Andrzej Siewior wrote:
> > > On 2019-09-11 17:57:27 [+0100], Scott Wood wrote:
> > > > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > > > index 885a195dfbe0..32c6175b63b6 100644
> > > > --- a/kernel/cpu.c
> > > > +++ b/kernel/cpu.c
> > > > @@ -308,7 +308,9 @@ void pin_current_cpu(void)
> > > >          preempt_lazy_enable();
> > > >          preempt_enable();
> > > >  
> > > > +        rt_invol_sleep_inc();
> > > >          __read_rt_lock(cpuhp_pin);
> > > > +        rt_invol_sleep_dec();
> > > >  
> > > >          preempt_disable();
> > > >          preempt_lazy_disable();
> > > 
> > > I understand the other one. But now looking at it, both end up in
> > > rt_spin_lock_slowlock_locked(), which would be the proper place to
> > > do that annotation. Okay.
> > 
> > FWIW, if my lazy migrate patchset is accepted, then there will be no users
> > of __read_rt_lock() outside rwlock-rt.c and it'll be moot.
> 
> I missed the "both" -- which is the "other one" that ends up in
> rt_spin_lock_slowlock_locked()?  stop_one_cpu() doesn't...

The one used here:
  __read_rt_lock()
    -> rt_spin_lock_slowlock_locked()

The official one (including the write part):
  rt_read_lock() (annotation)
    -> do_read_rt_lock()
      -> __read_rt_lock()
        -> rt_spin_lock_slowlock_locked()

and the only one missing from the party of sleeping locks is:
  rt_spin_lock() (annotation)
    -> rt_spin_lock_slowlock()
      -> rt_spin_lock_slowlock_locked()
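
A sketch of what that shared annotation could look like (untested; the
exact placement and signature are assumptions):

        void __sched rt_spin_lock_slowlock_locked(struct rt_mutex *lock,
                                                  struct rt_mutex_waiter *waiter,
                                                  unsigned long flags)
        {
                sleeping_lock_inc();  /* rt_invol_sleep_inc() in this series */

                /* ... existing slowpath: save the task state, block until
                 * the lock is acquired, restore the task state ... */

                sleeping_lock_dec();
        }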

> -Scott

Sebastian


Re: [PATCH RT v3 3/5] sched: migrate_dis/enable: Use rt_invol_sleep

2019-09-23 Thread Scott Wood
On Tue, 2019-09-17 at 09:06 -0500, Scott Wood wrote:
> On Tue, 2019-09-17 at 09:59 +0200, Sebastian Andrzej Siewior wrote:
> > On 2019-09-11 17:57:27 [+0100], Scott Wood wrote:
> > > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > > index 885a195dfbe0..32c6175b63b6 100644
> > > --- a/kernel/cpu.c
> > > +++ b/kernel/cpu.c
> > > @@ -308,7 +308,9 @@ void pin_current_cpu(void)
> > >          preempt_lazy_enable();
> > >          preempt_enable();
> > >  
> > > +        rt_invol_sleep_inc();
> > >          __read_rt_lock(cpuhp_pin);
> > > +        rt_invol_sleep_dec();
> > >  
> > >          preempt_disable();
> > >          preempt_lazy_disable();
> > 
> > I understand the other one. But now looking at it, both end up in
> > rt_spin_lock_slowlock_locked(), which would be the proper place to do
> > that annotation. Okay.
> 
> FWIW, if my lazy migrate patchset is accepted, then there will be no users
> of __read_rt_lock() outside rwlock-rt.c and it'll be moot.

I missed the "both" -- which is the "other one" that ends up in
rt_spin_lock_slowlock_locked()?  stop_one_cpu() doesn't...

-Scott




Re: [PATCH RT v3 3/5] sched: migrate_dis/enable: Use rt_invol_sleep

2019-09-17 Thread Scott Wood
On Tue, 2019-09-17 at 09:59 +0200, Sebastian Andrzej Siewior wrote:
> On 2019-09-11 17:57:27 [+0100], Scott Wood wrote:
> > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > index 885a195dfbe0..32c6175b63b6 100644
> > --- a/kernel/cpu.c
> > +++ b/kernel/cpu.c
> > @@ -308,7 +308,9 @@ void pin_current_cpu(void)
> >          preempt_lazy_enable();
> >          preempt_enable();
> >  
> > +        rt_invol_sleep_inc();
> >          __read_rt_lock(cpuhp_pin);
> > +        rt_invol_sleep_dec();
> >  
> >          preempt_disable();
> >          preempt_lazy_disable();
> 
> I understand the other one. But now looking at it, both end up in
> rt_spin_lock_slowlock_locked(), which would be the proper place to do
> that annotation. Okay.

FWIW, if my lazy migrate patchset is accepted, then there will be no users
of __read_rt_lock() outside rwlock-rt.c and it'll be moot.

-Scott




Re: [PATCH RT v3 3/5] sched: migrate_dis/enable: Use rt_invol_sleep

2019-09-17 Thread Sebastian Andrzej Siewior
On 2019-09-11 17:57:27 [+0100], Scott Wood wrote:
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 885a195dfbe0..32c6175b63b6 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -308,7 +308,9 @@ void pin_current_cpu(void)
>          preempt_lazy_enable();
>          preempt_enable();
>  
> +        rt_invol_sleep_inc();
>          __read_rt_lock(cpuhp_pin);
> +        rt_invol_sleep_dec();
>  
>          preempt_disable();
>          preempt_lazy_disable();

I understand the other one. But now looking at it, both end up in
rt_spin_lock_slowlock_locked(), which would be the proper place to do
that annotation. Okay.

Sebastian