Re: [PATCH v4 11/19] sched/core: Make migrate disable and CPU hotplug cooperative

2020-12-09 Thread Valentin Schneider
On 08/12/20 13:46, Qian Cai wrote:
> On Mon, 2020-12-07 at 19:27 +0000, Valentin Schneider wrote:
>> Ok, can reproduce this on a TX2 on next-20201207. I didn't use your config,
>> I oldconfig'd my distro config and only modified it to CONFIG_PREEMPT_NONE.
>> Interestingly the BUG happens on

Re: [PATCH v4 11/19] sched/core: Make migrate disable and CPU hotplug cooperative

2020-12-08 Thread Qian Cai
On Mon, 2020-12-07 at 19:27 +0000, Valentin Schneider wrote:
> Ok, can reproduce this on a TX2 on next-20201207. I didn't use your config,
> I oldconfig'd my distro config and only modified it to CONFIG_PREEMPT_NONE.
> Interestingly the BUG happens on CPU127 here too... I think that number is

Re: [PATCH v4 11/19] sched/core: Make migrate disable and CPU hotplug cooperative

2020-12-07 Thread Valentin Schneider
On 04/12/20 21:19, Qian Cai wrote:
> On Tue, 2020-11-17 at 19:28 +0000, Valentin Schneider wrote:
>> We did have some breakage in that area, but all the holes I was aware of
>> have been plugged. What would help here is to see which tasks are still
>> queued on that outgoing CPU, and their

Re: [PATCH v4 11/19] sched/core: Make migrate disable and CPU hotplug cooperative

2020-12-05 Thread Qian Cai
On Sat, 2020-12-05 at 18:37 +0000, Valentin Schneider wrote:
> From there I see:
>
> [20798.166987][ T650] CPU127 nr_running=2
> [20798.171185][ T650] p=migration/127
> [20798.175161][ T650] p=kworker/127:1
>
> so this might be another workqueue hurdle. This should be prevented by:
>
>
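The quoted output lists the CPU, its nr_running count, and one p=<name> line per task still queued. A rough userspace sketch of producing a dump in that shape is given here; it is not the debug patch used in the thread (which is not shown in this preview), and `struct toy_rq_dump` and `toy_dump_rq()` are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical snapshot of a runqueue for illustration only; the kernel's
 * struct rq looks nothing like this.
 */
struct toy_rq_dump {
	int cpu;                   /* CPU number the runqueue belongs to */
	const char *const *comms;  /* names of the tasks still queued */
	size_t nr_running;         /* number of entries in comms */
};

/*
 * Format "CPU<n> nr_running=<k>" followed by one "p=<comm>" line per task.
 * Returns the number of bytes written (excluding the NUL terminator),
 * or -1 if the buffer is too small.
 */
static int toy_dump_rq(const struct toy_rq_dump *rq, char *buf, size_t len)
{
	size_t off;
	int n;

	n = snprintf(buf, len, "CPU%d nr_running=%zu\n", rq->cpu, rq->nr_running);
	if (n < 0 || (size_t)n >= len)
		return -1;
	off = (size_t)n;

	for (size_t i = 0; i < rq->nr_running; i++) {
		n = snprintf(buf + off, len - off, "p=%s\n", rq->comms[i]);
		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += (size_t)n;
	}
	return (int)off;
}
```

Filling the snapshot with cpu 127 and the two task names from the quoted log yields the same three lines, minus the kernel timestamp prefixes.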

Re: [PATCH v4 11/19] sched/core: Make migrate disable and CPU hotplug cooperative

2020-12-05 Thread Valentin Schneider
On 04/12/20 21:19, Qian Cai wrote:
> On Tue, 2020-11-17 at 19:28 +0000, Valentin Schneider wrote:
>> We did have some breakage in that area, but all the holes I was aware of
>> have been plugged. What would help here is to see which tasks are still
>> queued on that outgoing CPU, and their

Re: [PATCH v4 11/19] sched/core: Make migrate disable and CPU hotplug cooperative

2020-12-04 Thread Qian Cai
On Tue, 2020-11-17 at 19:28 +0000, Valentin Schneider wrote:
> We did have some breakage in that area, but all the holes I was aware of
> have been plugged. What would help here is to see which tasks are still
> queued on that outgoing CPU, and their recent activity.
>
> Something like
> -

Re: [PATCH v4 11/19] sched/core: Make migrate disable and CPU hotplug cooperative

2020-12-03 Thread Qian Cai
FYI, it did crash on arm64 (Thunder X2) as well, so I'll re-run to gather more information too.
.config: https://cailca.coding.net/public/linux/mm/git/files/master/arm64.config
[20370.682747][T77637] psci: CPU123 killed (polled 0 ms)
[20370.823651][ T635] IRQ 43: no longer affine to CPU124

Re: [PATCH v4 11/19] sched/core: Make migrate disable and CPU hotplug cooperative

2020-12-03 Thread Qian Cai
On Mon, 2020-11-23 at 19:13 +0100, Sebastian Andrzej Siewior wrote:
> On 2020-11-18 09:44:34 [-0500], Qian Cai wrote:
> > On Tue, 2020-11-17 at 19:28 +0000, Valentin Schneider wrote:
> > > We did have some breakage in that area, but all the holes I was aware of
> > > have been plugged. What would

Re: [PATCH v4 11/19] sched/core: Make migrate disable and CPU hotplug cooperative

2020-12-02 Thread Qian Cai
On Mon, 2020-11-23 at 19:13 +0100, Sebastian Andrzej Siewior wrote:
> On 2020-11-18 09:44:34 [-0500], Qian Cai wrote:
> > On Tue, 2020-11-17 at 19:28 +0000, Valentin Schneider wrote:
> > > We did have some breakage in that area, but all the holes I was aware of
> > > have been plugged. What would

Re: [PATCH v4 11/19] sched/core: Make migrate disable and CPU hotplug cooperative

2020-11-23 Thread Sebastian Andrzej Siewior
On 2020-11-18 09:44:34 [-0500], Qian Cai wrote:
> On Tue, 2020-11-17 at 19:28 +0000, Valentin Schneider wrote:
> > We did have some breakage in that area, but all the holes I was aware of
> > have been plugged. What would help here is to see which tasks are still
> > queued on that outgoing CPU,

Re: [PATCH v4 11/19] sched/core: Make migrate disable and CPU hotplug cooperative

2020-11-18 Thread Qian Cai
On Tue, 2020-11-17 at 19:28 +0000, Valentin Schneider wrote:
> We did have some breakage in that area, but all the holes I was aware of
> have been plugged. What would help here is to see which tasks are still
> queued on that outgoing CPU, and their recent activity.
>
> Something like
> -

Re: [PATCH v4 11/19] sched/core: Make migrate disable and CPU hotplug cooperative

2020-11-17 Thread Valentin Schneider
On 13/11/20 15:06, Qian Cai wrote:
> On Fri, 2020-10-23 at 12:12 +0200, Peter Zijlstra wrote:
[...]
>> @@ -7310,7 +7334,7 @@ int sched_cpu_dying(unsigned int cpu)
>>  	sched_tick_stop(cpu);
>>
>>  	rq_lock_irqsave(rq, &rf);
>> -	BUG_ON(rq->nr_running != 1);
>> +	BUG_ON(rq->nr_running

Re: [PATCH v4 11/19] sched/core: Make migrate disable and CPU hotplug cooperative

2020-11-13 Thread Qian Cai
On Fri, 2020-10-23 at 12:12 +0200, Peter Zijlstra wrote:
> From: Thomas Gleixner
>
> On CPU unplug tasks which are in a migrate disabled region cannot be pushed
> to a different CPU until they returned to migrateable state.
>
> Account the number of tasks on a runqueue which are in a migrate

Re: [PATCH v4 11/19] sched/core: Make migrate disable and CPU hotplug cooperative

2020-10-29 Thread Valentin Schneider
On 29/10/20 17:34, Peter Zijlstra wrote:
> On Thu, Oct 29, 2020 at 04:27:09PM +0000, Valentin Schneider wrote:
[...]
> Can do I suppose, although I'm not sure what, if anything, that helps,
> because then we need yet another comment explaining things.
>
> I ended up with the below. Is that an

Re: [PATCH v4 11/19] sched/core: Make migrate disable and CPU hotplug cooperative

2020-10-29 Thread Peter Zijlstra
On Thu, Oct 29, 2020 at 04:27:09PM +0000, Valentin Schneider wrote:
>
> On 23/10/20 11:12, Peter Zijlstra wrote:
> > @@ -7006,15 +7024,20 @@ static bool balance_push(struct rq *rq)
> >  	 * Both the cpu-hotplug and stop task are in this case and are
> >  	 * required to complete the

Re: [PATCH v4 11/19] sched/core: Make migrate disable and CPU hotplug cooperative

2020-10-29 Thread Valentin Schneider
On 23/10/20 11:12, Peter Zijlstra wrote:
> @@ -7006,15 +7024,20 @@ static bool balance_push(struct rq *rq)
>  	 * Both the cpu-hotplug and stop task are in this case and are
>  	 * required to complete the hotplug process.
>  	 */
> -	if (is_per_cpu_kthread(push_task)) {
> +

[PATCH v4 11/19] sched/core: Make migrate disable and CPU hotplug cooperative

2020-10-23 Thread Peter Zijlstra
From: Thomas Gleixner

On CPU unplug tasks which are in a migrate disabled region cannot be pushed
to a different CPU until they returned to migrateable state.

Account the number of tasks on a runqueue which are in a migrate disabled
section and make the hotplug wait mechanism respect that.
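
The accounting idea in the description above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the patch itself: `struct toy_rq`, `struct toy_task`, the `toy_*` helpers, and the exact form of the wait condition are hypothetical stand-ins chosen for illustration. The model assumes that migrate-disable sections may nest, and that the outgoing CPU may only proceed once a single task remains runnable and no queued task is pinned.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-CPU runqueue; not the kernel's struct rq. */
struct toy_rq {
	unsigned int nr_running; /* runnable tasks queued on this CPU */
	unsigned int nr_pinned;  /* tasks currently inside a migrate-disabled section */
};

/* Hypothetical stand-in for a task queued on a runqueue. */
struct toy_task {
	struct toy_rq *rq;
	unsigned int migrate_disabled; /* nesting depth of migrate-disable sections */
};

/* Entering the outermost migrate-disabled section pins the task to its CPU. */
static void toy_migrate_disable(struct toy_task *p)
{
	if (p->migrate_disabled++ == 0)
		p->rq->nr_pinned++;
}

/* Leaving the outermost section makes the task migratable again. */
static void toy_migrate_enable(struct toy_task *p)
{
	assert(p->migrate_disabled > 0);
	if (--p->migrate_disabled == 0)
		p->rq->nr_pinned--;
}

/*
 * Assumed wait condition for the unplug path in this model: only one
 * runnable task is left and nothing on the runqueue is pinned.
 */
static bool toy_rq_can_go_down(const struct toy_rq *rq)
{
	return rq->nr_running == 1 && rq->nr_pinned == 0;
}
```

In this model a task that calls toy_migrate_disable() twice keeps the runqueue blocked until the matching second toy_migrate_enable(), which is the behaviour the description asks the hotplug wait mechanism to respect.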