Re: [PATCH RT v3 3/5] sched: migrate_dis/enable: Use rt_invol_sleep
On 2019-09-24 11:35:16 [-0500], Scott Wood wrote:
> OK, sounds like stop_one_cpu_nowait() is the way to go then.

So I applied the last three patches from the migrate-disable() series
and it looks good. Nothing blew up in my testing. There were no
objections to the stop_one_cpu_nowait() suggestion and I think that it
is important not to lose the state at the end. Could you please repost
the patches?

> -Scott

Sebastian
Re: [PATCH RT v3 3/5] sched: migrate_dis/enable: Use rt_invol_sleep
On Tue, 2019-09-24 at 18:05 +0200, Sebastian Andrzej Siewior wrote:
> On 2019-09-24 10:47:36 [-0500], Scott Wood wrote:
> > When the stop machine finishes it will do a wake_up_process() via
> > complete(). Since this does not pass WF_LOCK_SLEEPER, saved_state
> > will be cleared, and you'll have TASK_RUNNING when you get to
> > other_func() and schedule(), regardless of whether CPU1 sends
> > wake_up() -- so this change doesn't actually accomplish anything.
>
> True, I completely missed that part.
>
> > While as noted in the other thread I don't think these spurious
> > wakeups are a huge problem, we could avoid them by doing
> > stop_one_cpu_nowait() and then schedule() without messing with task
> > state. Since we're stopping our own cpu, it should be guaranteed
> > that the stopper has finished by the time we exit schedule().
>
> I remember losing a state can be a problem. Let's say it is not "just"
> TASK_UNINTERRUPTIBLE -> TASK_RUNNING, which sounds harmless, but it is
> __TASK_TRACED and you lose it as part of unlocking siglock.

OK, sounds like stop_one_cpu_nowait() is the way to go then.

-Scott
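For reference, a minimal sketch of what that direction could look like,
assuming the mainline stop_machine API (the placement inside
migrate_enable() and the on-stack work item are assumptions here, not
the posted patch):

        struct cpu_stop_work work;

        /* Queue the migration stopper on our own CPU without sleeping
         * on a completion, so no wake_up_process() ever touches
         * saved_state; the cpu_stop_work must stay valid until the
         * stopper has run. */
        stop_one_cpu_nowait(task_cpu(p), migration_cpu_stop, &arg, &work);

        /* The stopper preempts this task; by the time schedule()
         * returns we have been migrated off this CPU, so the stopper
         * is guaranteed to have finished. Task state is untouched. */
        schedule();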
Re: [PATCH RT v3 3/5] sched: migrate_dis/enable: Use rt_invol_sleep
On 2019-09-24 10:47:36 [-0500], Scott Wood wrote:
> When the stop machine finishes it will do a wake_up_process() via
> complete(). Since this does not pass WF_LOCK_SLEEPER, saved_state
> will be cleared, and you'll have TASK_RUNNING when you get to
> other_func() and schedule(), regardless of whether CPU1 sends
> wake_up() -- so this change doesn't actually accomplish anything.

True, I completely missed that part.

> While as noted in the other thread I don't think these spurious
> wakeups are a huge problem, we could avoid them by doing
> stop_one_cpu_nowait() and then schedule() without messing with task
> state. Since we're stopping our own cpu, it should be guaranteed
> that the stopper has finished by the time we exit schedule().

I remember losing a state can be a problem. Let's say it is not "just"
TASK_UNINTERRUPTIBLE -> TASK_RUNNING, which sounds harmless, but it is
__TASK_TRACED and you lose it as part of unlocking siglock.

> -Scott

Sebastian
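The __TASK_TRACED example comes from ptrace_stop() in kernel/signal.c;
paraphrased rather than quoted verbatim, the pattern at risk is:

        /* Paraphrased from ptrace_stop(): the tracee parks itself in
         * a special state that must survive until the tracer resumes
         * it. */
        set_special_state(TASK_TRACED);

        spin_unlock_irq(&current->sighand->siglock);

        /* On RT, siglock is a sleeping lock, so the unlock path runs
         * migrate_enable(); if that clobbers saved_state, the
         * __TASK_TRACED bit is gone before the task parks itself. */
        freezable_schedule();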
Re: [PATCH RT v3 3/5] sched: migrate_dis/enable: Use rt_invol_sleep
On Tue, 2019-09-24 at 17:25 +0200, Sebastian Andrzej Siewior wrote:
> On 2019-09-24 08:53:43 [-0500], Scott Wood wrote:
> > As I pointed out in the "[PATCH RT 6/8] sched: migrate_enable: Set
> > state to TASK_RUNNING" discussion, we can get here inside the
> > rtmutex code (e.g. from debug_rt_mutex_print_deadlock) where
> > saved_state is already holding something -- plus, the waker won't
> > have WF_LOCK_SLEEPER and therefore saved_state will get cleared
> > anyway.
>
> So let me drop the saved_state pieces and get back to it once I get
> to the other thread (which you replied to and I didn't realise until
> now).
>
> Regarding the WF_LOCK_SLEEPER part. I think this works as expected.
> Imagine:
>
>  CPU0                                          CPU1
>  spin_lock();
>  set_current_state(TASK_UNINTERRUPTIBLE);
>  …
>  spin_unlock()
>   -> migrate_enable();
>      -> stop_one_cpu();                        <-- A)
>  other_func();                                 <-- B)
>  schedule();
>
> So. With only CPU0 we enter schedule() with TASK_UNINTERRUPTIBLE
> because the state gets preserved with the change I added (which is
> expected). If CPU1 sends a wake_up() at A) then the saved_state gets
> overwritten and we enter schedule() with TASK_RUNNING. Same happens
> if it is sent at B), which is outside of any migrate/spinlock related
> code.
>
> Was this clear or did I miss the point?

When the stop machine finishes it will do a wake_up_process() via
complete(). Since this does not pass WF_LOCK_SLEEPER, saved_state will
be cleared, and you'll have TASK_RUNNING when you get to other_func()
and schedule(), regardless of whether CPU1 sends wake_up() -- so this
change doesn't actually accomplish anything.

While as noted in the other thread I don't think these spurious
wakeups are a huge problem, we could avoid them by doing
stop_one_cpu_nowait() and then schedule() without messing with task
state. Since we're stopping our own cpu, it should be guaranteed that
the stopper has finished by the time we exit schedule().

-Scott
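Schematically, the behaviour Scott describes comes from the RT wakeup
path (paraphrased from the RT patchset's try_to_wake_up(); the exact
code varies by version):

        /* Under p->pi_lock in try_to_wake_up(): a regular wakeup --
         * and the complete() issued when stop_one_cpu()'s stopper
         * finishes is one -- carries no WF_LOCK_SLEEPER, so it
         * unconditionally discards whatever a lock sleeper stashed
         * in saved_state. */
        if (!(wake_flags & WF_LOCK_SLEEPER))
                p->saved_state = TASK_RUNNING;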
Re: [PATCH RT v3 3/5] sched: migrate_dis/enable: Use rt_invol_sleep
On 2019-09-24 08:53:43 [-0500], Scott Wood wrote:
> As I pointed out in the "[PATCH RT 6/8] sched: migrate_enable: Set
> state to TASK_RUNNING" discussion, we can get here inside the rtmutex
> code (e.g. from debug_rt_mutex_print_deadlock) where saved_state is
> already holding something -- plus, the waker won't have
> WF_LOCK_SLEEPER and therefore saved_state will get cleared anyway.

So let me drop the saved_state pieces and get back to it once I get to
the other thread (which you replied to and I didn't realise until now).

Regarding the WF_LOCK_SLEEPER part. I think this works as expected.
Imagine:

 CPU0                                          CPU1
 spin_lock();
 set_current_state(TASK_UNINTERRUPTIBLE);
 …
 spin_unlock()
  -> migrate_enable();
     -> stop_one_cpu();                        <-- A)
 other_func();                                 <-- B)
 schedule();

So. With only CPU0 we enter schedule() with TASK_UNINTERRUPTIBLE
because the state gets preserved with the change I added (which is
expected). If CPU1 sends a wake_up() at A) then the saved_state gets
overwritten and we enter schedule() with TASK_RUNNING. Same happens if
it is sent at B), which is outside of any migrate/spinlock related
code.

Was this clear or did I miss the point?

> -Scott

Sebastian
Re: [PATCH RT v3 3/5] sched: migrate_dis/enable: Use rt_invol_sleep
On Tue, 2019-09-24 at 13:21 +0200, Sebastian Andrzej Siewior wrote:
> On 2019-09-23 19:52:33 [+0200], To Scott Wood wrote:
>
> I made this:
>
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 885a195dfbe02..25afa2bb1a2cf 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -308,7 +308,9 @@ void pin_current_cpu(void)
>          preempt_lazy_enable();
>          preempt_enable();
>
> +        sleeping_lock_inc();
>          __read_rt_lock(cpuhp_pin);
> +        sleeping_lock_dec();
>
>          preempt_disable();
>          preempt_lazy_disable();
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index e1bdd7f9be054..63a6420d01053 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7388,6 +7388,7 @@ void migrate_enable(void)
>
>          WARN_ON(smp_processor_id() != task_cpu(p));
>          if (!cpumask_test_cpu(task_cpu(p), &p->cpus_mask)) {
> +                struct task_struct *self = current;
>                  const struct cpumask *cpu_valid_mask = cpu_active_mask;
>                  struct migration_arg arg;
>                  unsigned int dest_cpu;
> @@ -7405,7 +7406,21 @@ void migrate_enable(void)
>                  unpin_current_cpu();
>                  preempt_lazy_enable();
>                  preempt_enable();
> +                rt_invol_sleep_inc();
> +
> +                raw_spin_lock_irq(&self->pi_lock);
> +                self->saved_state = self->state;
> +                __set_current_state_no_track(TASK_RUNNING);
> +                raw_spin_unlock_irq(&self->pi_lock);
> +
>                  stop_one_cpu(task_cpu(p), migration_cpu_stop, &arg);
> +
> +                raw_spin_lock_irq(&self->pi_lock);
> +                __set_current_state_no_track(self->saved_state);
> +                self->saved_state = TASK_RUNNING;
> +                raw_spin_unlock_irq(&self->pi_lock);
> +
> +                rt_invol_sleep_dec();
>                  return;
>          }
>  }
>
> I think we need to preserve the current state, otherwise we will lose
> anything != TASK_RUNNING here. So the spin_lock() would preserve it
> while waiting but the migrate_enable() will lose it if it needs to
> change the CPU at the end.
> I will try to prepare all commits for the next release before I
> release so you can have a look first and yell if needed.

As I pointed out in the "[PATCH RT 6/8] sched: migrate_enable: Set
state to TASK_RUNNING" discussion, we can get here inside the rtmutex
code (e.g. from debug_rt_mutex_print_deadlock) where saved_state is
already holding something -- plus, the waker won't have WF_LOCK_SLEEPER
and therefore saved_state will get cleared anyway.

-Scott
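The nesting Scott refers to, drawn as a schematic call chain (the
intermediate steps are illustrative, not an exact trace):

  rt_spin_lock_slowlock_locked()
      saved_state = state;                <-- first save
      debug_rt_mutex_print_deadlock()
          ... another sleeping lock is taken and released ...
              migrate_enable()
                  saved_state = state;    <-- clobbers the first save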
Re: [PATCH RT v3 3/5] sched: migrate_dis/enable: Use rt_invol_sleep
On 2019-09-23 19:52:33 [+0200], To Scott Wood wrote:

I made this:

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 885a195dfbe02..25afa2bb1a2cf 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -308,7 +308,9 @@ void pin_current_cpu(void)
         preempt_lazy_enable();
         preempt_enable();
 
+        sleeping_lock_inc();
         __read_rt_lock(cpuhp_pin);
+        sleeping_lock_dec();
 
         preempt_disable();
         preempt_lazy_disable();
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e1bdd7f9be054..63a6420d01053 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7388,6 +7388,7 @@ void migrate_enable(void)
 
         WARN_ON(smp_processor_id() != task_cpu(p));
         if (!cpumask_test_cpu(task_cpu(p), &p->cpus_mask)) {
+                struct task_struct *self = current;
                 const struct cpumask *cpu_valid_mask = cpu_active_mask;
                 struct migration_arg arg;
                 unsigned int dest_cpu;
@@ -7405,7 +7406,21 @@ void migrate_enable(void)
                 unpin_current_cpu();
                 preempt_lazy_enable();
                 preempt_enable();
+                rt_invol_sleep_inc();
+
+                raw_spin_lock_irq(&self->pi_lock);
+                self->saved_state = self->state;
+                __set_current_state_no_track(TASK_RUNNING);
+                raw_spin_unlock_irq(&self->pi_lock);
+
                 stop_one_cpu(task_cpu(p), migration_cpu_stop, &arg);
+
+                raw_spin_lock_irq(&self->pi_lock);
+                __set_current_state_no_track(self->saved_state);
+                self->saved_state = TASK_RUNNING;
+                raw_spin_unlock_irq(&self->pi_lock);
+
+                rt_invol_sleep_dec();
                 return;
         }
 }

I think we need to preserve the current state, otherwise we will lose
anything != TASK_RUNNING here. So the spin_lock() would preserve it
while waiting but the migrate_enable() will lose it if it needs to
change the CPU at the end.
I will try to prepare all commits for the next release before I
release so you can have a look first and yell if needed.

> -Scott

Sebastian
Re: [PATCH RT v3 3/5] sched: migrate_dis/enable: Use rt_invol_sleep
On 2019-09-23 11:59:23 [-0500], Scott Wood wrote:
> On Tue, 2019-09-17 at 09:06 -0500, Scott Wood wrote:
> > On Tue, 2019-09-17 at 09:59 +0200, Sebastian Andrzej Siewior wrote:
> > > On 2019-09-11 17:57:27 [+0100], Scott Wood wrote:
> > > > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > > > index 885a195dfbe0..32c6175b63b6 100644
> > > > --- a/kernel/cpu.c
> > > > +++ b/kernel/cpu.c
> > > > @@ -308,7 +308,9 @@ void pin_current_cpu(void)
> > > >          preempt_lazy_enable();
> > > >          preempt_enable();
> > > >
> > > > +        rt_invol_sleep_inc();
> > > >          __read_rt_lock(cpuhp_pin);
> > > > +        rt_invol_sleep_dec();
> > > >
> > > >          preempt_disable();
> > > >          preempt_lazy_disable();
> > >
> > > I understand the other one. But now looking at it, both end up in
> > > rt_spin_lock_slowlock_locked() which would be the proper place to
> > > do that annotation. Okay.
> >
> > FWIW, if my lazy migrate patchset is accepted, then there will be
> > no users of __read_rt_lock() outside rwlock-rt.c and it'll be moot.
>
> I missed the "both" -- which is the "other one" that ends up in
> rt_spin_lock_slowlock_locked()? stop_one_cpu() doesn't...

The one used here:

  __read_rt_lock() -> rt_spin_lock_slowlock_locked()

The official one (including the write part):

  rt_read_lock() (annotation) -> do_read_rt_lock() -> __read_rt_lock()
    -> rt_spin_lock_slowlock_locked()

and the only one missing from the party of sleeping locks is:

  rt_spin_lock() (annotation) -> rt_spin_lock_slowlock()
    -> rt_spin_lock_slowlock_locked()

> -Scott

Sebastian
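A minimal sketch of what moving the annotation into the shared slow
path might look like (the placement and the counter name are
assumptions; the thread uses both rt_invol_sleep_*() and
sleeping_lock_*() for this):

        /* Inside rt_spin_lock_slowlock_locked(): annotate the one
         * common blocking point, so rt_spin_lock(), rt_read_lock()
         * and the bare __read_rt_lock() all inherit it instead of
         * each caller wrapping itself. */
        sleeping_lock_inc();
        schedule();
        sleeping_lock_dec();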
Re: [PATCH RT v3 3/5] sched: migrate_dis/enable: Use rt_invol_sleep
On Tue, 2019-09-17 at 09:06 -0500, Scott Wood wrote:
> On Tue, 2019-09-17 at 09:59 +0200, Sebastian Andrzej Siewior wrote:
> > On 2019-09-11 17:57:27 [+0100], Scott Wood wrote:
> > > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > > index 885a195dfbe0..32c6175b63b6 100644
> > > --- a/kernel/cpu.c
> > > +++ b/kernel/cpu.c
> > > @@ -308,7 +308,9 @@ void pin_current_cpu(void)
> > >          preempt_lazy_enable();
> > >          preempt_enable();
> > >
> > > +        rt_invol_sleep_inc();
> > >          __read_rt_lock(cpuhp_pin);
> > > +        rt_invol_sleep_dec();
> > >
> > >          preempt_disable();
> > >          preempt_lazy_disable();
> >
> > I understand the other one. But now looking at it, both end up in
> > rt_spin_lock_slowlock_locked() which would be the proper place to
> > do that annotation. Okay.
>
> FWIW, if my lazy migrate patchset is accepted, then there will be no
> users of __read_rt_lock() outside rwlock-rt.c and it'll be moot.

I missed the "both" -- which is the "other one" that ends up in
rt_spin_lock_slowlock_locked()? stop_one_cpu() doesn't...

-Scott
Re: [PATCH RT v3 3/5] sched: migrate_dis/enable: Use rt_invol_sleep
On Tue, 2019-09-17 at 09:59 +0200, Sebastian Andrzej Siewior wrote:
> On 2019-09-11 17:57:27 [+0100], Scott Wood wrote:
> > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > index 885a195dfbe0..32c6175b63b6 100644
> > --- a/kernel/cpu.c
> > +++ b/kernel/cpu.c
> > @@ -308,7 +308,9 @@ void pin_current_cpu(void)
> >          preempt_lazy_enable();
> >          preempt_enable();
> >
> > +        rt_invol_sleep_inc();
> >          __read_rt_lock(cpuhp_pin);
> > +        rt_invol_sleep_dec();
> >
> >          preempt_disable();
> >          preempt_lazy_disable();
>
> I understand the other one. But now looking at it, both end up in
> rt_spin_lock_slowlock_locked() which would be the proper place to do
> that annotation. Okay.

FWIW, if my lazy migrate patchset is accepted, then there will be no
users of __read_rt_lock() outside rwlock-rt.c and it'll be moot.

-Scott
Re: [PATCH RT v3 3/5] sched: migrate_dis/enable: Use rt_invol_sleep
On 2019-09-11 17:57:27 [+0100], Scott Wood wrote:
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 885a195dfbe0..32c6175b63b6 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -308,7 +308,9 @@ void pin_current_cpu(void)
>          preempt_lazy_enable();
>          preempt_enable();
>
> +        rt_invol_sleep_inc();
>          __read_rt_lock(cpuhp_pin);
> +        rt_invol_sleep_dec();
>
>          preempt_disable();
>          preempt_lazy_disable();

I understand the other one. But now looking at it, both end up in
rt_spin_lock_slowlock_locked() which would be the proper place to do
that annotation. Okay.

Sebastian