On 08/12/20 13:46, Qian Cai wrote:
> On Mon, 2020-12-07 at 19:27 +0000, Valentin Schneider wrote:
>> Ok, can reproduce this on a TX2 on next-20201207. I didn't use your config,
>> I oldconfig'd my distro config and only modified it to CONFIG_PREEMPT_NONE.
>> Interestingly the BUG happens on
On Mon, 2020-12-07 at 19:27 +0000, Valentin Schneider wrote:
> Ok, can reproduce this on a TX2 on next-20201207. I didn't use your config,
> I oldconfig'd my distro config and only modified it to CONFIG_PREEMPT_NONE.
> Interestingly the BUG happens on CPU127 here too...
I think that number is
On 04/12/20 21:19, Qian Cai wrote:
> On Tue, 2020-11-17 at 19:28 +0000, Valentin Schneider wrote:
>> We did have some breakage in that area, but all the holes I was aware of
>> have been plugged. What would help here is to see which tasks are still
>> queued on that outgoing CPU, and their
On Sat, 2020-12-05 at 18:37 +0000, Valentin Schneider wrote:
> From there I see:
>
> [20798.166987][ T650] CPU127 nr_running=2
> [20798.171185][ T650] p=migration/127
> [20798.175161][ T650] p=kworker/127:1
>
> so this might be another workqueue hurdle. This should be prevented by:
>
>
On 04/12/20 21:19, Qian Cai wrote:
> On Tue, 2020-11-17 at 19:28 +0000, Valentin Schneider wrote:
>> We did have some breakage in that area, but all the holes I was aware of
>> have been plugged. What would help here is to see which tasks are still
>> queued on that outgoing CPU, and their
On Tue, 2020-11-17 at 19:28 +0000, Valentin Schneider wrote:
> We did have some breakage in that area, but all the holes I was aware of
> have been plugged. What would help here is to see which tasks are still
> queued on that outgoing CPU, and their recent activity.
>
> Something like
> -
FYI, it did crash on arm64 (Thunder X2) as well, so I'll re-run to gather more
information too.
.config: https://cailca.coding.net/public/linux/mm/git/files/master/arm64.config
[20370.682747][T77637] psci: CPU123 killed (polled 0 ms)
[20370.823651][ T635] IRQ 43: no longer affine to CPU124
On Mon, 2020-11-23 at 19:13 +0100, Sebastian Andrzej Siewior wrote:
> On 2020-11-18 09:44:34 [-0500], Qian Cai wrote:
> > On Tue, 2020-11-17 at 19:28 +0000, Valentin Schneider wrote:
> > > We did have some breakage in that area, but all the holes I was aware of
> > > have been plugged. What would
On Mon, 2020-11-23 at 19:13 +0100, Sebastian Andrzej Siewior wrote:
> On 2020-11-18 09:44:34 [-0500], Qian Cai wrote:
> > On Tue, 2020-11-17 at 19:28 +0000, Valentin Schneider wrote:
> > > We did have some breakage in that area, but all the holes I was aware of
> > > have been plugged. What would
On 2020-11-18 09:44:34 [-0500], Qian Cai wrote:
> On Tue, 2020-11-17 at 19:28 +0000, Valentin Schneider wrote:
> > We did have some breakage in that area, but all the holes I was aware of
> > have been plugged. What would help here is to see which tasks are still
> > queued on that outgoing CPU,
On Tue, 2020-11-17 at 19:28 +0000, Valentin Schneider wrote:
> We did have some breakage in that area, but all the holes I was aware of
> have been plugged. What would help here is to see which tasks are still
> queued on that outgoing CPU, and their recent activity.
>
> Something like
> -
On 13/11/20 15:06, Qian Cai wrote:
> On Fri, 2020-10-23 at 12:12 +0200, Peter Zijlstra wrote:
[...]
>> @@ -7310,7 +7334,7 @@ int sched_cpu_dying(unsigned int cpu)
>> sched_tick_stop(cpu);
>>
>> rq_lock_irqsave(rq, &rf);
>> -BUG_ON(rq->nr_running != 1);
>> +BUG_ON(rq->nr_running
On Fri, 2020-10-23 at 12:12 +0200, Peter Zijlstra wrote:
> From: Thomas Gleixner
>
> On CPU unplug tasks which are in a migrate disabled region cannot be pushed
> to a different CPU until they returned to migrateable state.
>
> Account the number of tasks on a runqueue which are in a migrate
On 29/10/20 17:34, Peter Zijlstra wrote:
> On Thu, Oct 29, 2020 at 04:27:09PM +0000, Valentin Schneider wrote:
[...]
> Can do I suppose, although I'm not sure what, if anything, that helps,
> because then we need yet another comment explaining things.
>
> I ended up with the below. Is that an
On Thu, Oct 29, 2020 at 04:27:09PM +0000, Valentin Schneider wrote:
>
> On 23/10/20 11:12, Peter Zijlstra wrote:
> > @@ -7006,15 +7024,20 @@ static bool balance_push(struct rq *rq)
> >  * Both the cpu-hotplug and stop task are in this case and are
> >  * required to complete the
On 23/10/20 11:12, Peter Zijlstra wrote:
> @@ -7006,15 +7024,20 @@ static bool balance_push(struct rq *rq)
>  * Both the cpu-hotplug and stop task are in this case and are
>  * required to complete the hotplug process.
>  */
> - if (is_per_cpu_kthread(push_task)) {
> +
From: Thomas Gleixner
On CPU unplug tasks which are in a migrate disabled region cannot be pushed
to a different CPU until they returned to migrateable state.
Account the number of tasks on a runqueue which are in a migrate disabled
section and make the hotplug wait mechanism respect that.
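The accounting described above can be illustrated with a small user-space sketch. It is not the kernel patch: the names toy_rq, toy_task, nr_pinned and the toy_* helpers are hypothetical stand-ins chosen for illustration, and locking is left out entirely. It only shows the idea of counting tasks that sit inside a migrate-disabled section and refusing to let the unplug proceed while that count is non-zero, assuming the outgoing CPU is expected to end up with exactly one runnable task.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a per-CPU runqueue. */
struct toy_rq {
	unsigned int nr_running; /* tasks currently queued on this CPU */
	unsigned int nr_pinned;  /* queued tasks inside a migrate-disabled section */
};

/* Hypothetical, simplified stand-in for a task. */
struct toy_task {
	struct toy_rq *rq;
	unsigned int migration_disabled; /* nesting depth of disable calls */
};

/* Entering the outermost migrate-disabled section pins the task to its CPU. */
static void toy_migrate_disable(struct toy_task *p)
{
	if (p->migration_disabled++ == 0)
		p->rq->nr_pinned++;
}

/* Leaving the outermost section makes the task migratable again. */
static void toy_migrate_enable(struct toy_task *p)
{
	assert(p->migration_disabled > 0);
	if (--p->migration_disabled == 0)
		p->rq->nr_pinned--;
}

static bool toy_rq_has_pinned_tasks(const struct toy_rq *rq)
{
	return rq->nr_pinned > 0;
}

/*
 * Condition the unplug path would wait for in this sketch: only one
 * runnable task is left and nothing is pinned to the outgoing CPU.
 */
static bool toy_hotplug_may_proceed(const struct toy_rq *rq)
{
	return rq->nr_running == 1 && !toy_rq_has_pinned_tasks(rq);
}
```

In this sketch nested disable calls count the task only once, so the pinned count drops back to zero only when the outermost section is left.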