On Tue, Jul 07, 2020 at 10:17:19AM +0200, Peter Zijlstra wrote:
> Anyway, let me now endeavour to write a coherent Changelog for this mess
I'll go stick this in sched/urgent and update that other documentation
patch (again)..
---
Subject: sched: Fix loadavg accounting race
From: Peter Zijlstra
On 07/07/20 09:17, Peter Zijlstra wrote:
> On Tue, Jul 07, 2020 at 12:56:04AM +0100, Valentin Schneider wrote:
>
>> > @@ -2605,8 +2596,20 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
>> >  *
>> >  * Pairs with the LOCK+smp_mb__after_spinlock() on
On Tue, Jul 07, 2020 at 10:20:05AM +0100, Qais Yousef wrote:
> On 07/06/20 16:59, Peter Zijlstra wrote:
> > + if (!preempt && prev_state && prev_state == prev->state) {
>
> I think the compiler won't optimize `prev_state == prev->state` out because of
> the smp_mb__after_spinlock() which
On 07/06/20 16:59, Peter Zijlstra wrote:
[...]
> @@ -4104,12 +4108,19 @@ static void __sched notrace __schedule(bool preempt)
> local_irq_disable();
> rcu_note_context_switch(preempt);
>
> + prev_state = prev->state;
> +
> /*
> - * Make sure that
On Tue, Jul 07, 2020 at 12:56:04AM +0100, Valentin Schneider wrote:
> > @@ -2605,8 +2596,20 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
> >  *
> >  * Pairs with the LOCK+smp_mb__after_spinlock() on rq->lock in
> >  * __schedule(). See the
On Mon, Jul 06, 2020 at 05:20:57PM -0400, Dave Jones wrote:
> On Mon, Jul 06, 2020 at 04:59:52PM +0200, Peter Zijlstra wrote:
> > On Fri, Jul 03, 2020 at 04:51:53PM -0400, Dave Jones wrote:
> > > On Fri, Jul 03, 2020 at 12:40:33PM +0200, Peter Zijlstra wrote:
> > >
> > > looked promising the
On 06/07/20 15:59, Peter Zijlstra wrote:
> OK, lots of cursing later, I now have the below...
>
> The TL;DR is that while schedule() doesn't change p->state once it
> starts, it does read it quite a bit, and ttwu() will actually change it
> to TASK_WAKING. So if ttwu() changes it to WAKING
On Mon, Jul 06, 2020 at 04:59:52PM +0200, Peter Zijlstra wrote:
> On Fri, Jul 03, 2020 at 04:51:53PM -0400, Dave Jones wrote:
> > On Fri, Jul 03, 2020 at 12:40:33PM +0200, Peter Zijlstra wrote:
> >
> > looked promising the first few hours, but as soon as it hit four hours
> > of uptime,
On Fri, Jul 03, 2020 at 04:51:53PM -0400, Dave Jones wrote:
> On Fri, Jul 03, 2020 at 12:40:33PM +0200, Peter Zijlstra wrote:
>
> > So ARM/Power/etc.. can speculate the load such that the
> > task_contributes_to_load() value is from before ->on_rq.
> >
> > The compiler might similarly
On Fri, Jul 03, 2020 at 12:40:33PM +0200, Peter Zijlstra wrote:
> So ARM/Power/etc.. can speculate the load such that the
> task_contributes_to_load() value is from before ->on_rq.
>
> The compiler might similarly re-order things -- although I've not found it
> doing so with the few builds I
[Re: weird loadavg on idle machine post 5.7] On 02/07/2020 (Thu 17:15) Paul Gortmaker wrote:
> [weird loadavg on idle machine post 5.7] On 02/07/2020 (Thu 13:15) Dave Jones wrote:
[...]
> > both implicated this commit:
> >
> > commit c6e7bd7afaeb3af55ffac12282803
On Fri, Jul 03, 2020 at 11:02:26AM +0200, Peter Zijlstra wrote:
> On Thu, Jul 02, 2020 at 10:36:27PM +0100, Mel Gorman wrote:
>
> > > commit c6e7bd7afaeb3af55ffac122828035f1c01d1d7b (refs/bisect/bad)
> > > Author: Peter Zijlstra
>
> > Peter, I'm not supremely confident about this but could it
On Thu, Jul 02, 2020 at 10:36:27PM +0100, Mel Gorman wrote:
> > commit c6e7bd7afaeb3af55ffac122828035f1c01d1d7b (refs/bisect/bad)
> > Author: Peter Zijlstra
> Peter, I'm not supremely confident about this but could it be because
> "p->sched_contributes_to_load = !!task_contributes_to_load(p)"
On Thu, Jul 02, 2020 at 10:36:27PM +0100, Mel Gorman wrote:
> I'm thinking that the !!task_contributes_to_load(p) should still happen
> after smp_cond_load_acquire() when on_cpu is stable and the pi_lock is
> held to stabilised p->state against a parallel wakeup or updating the
> task rq. I
On Thu, Jul 02, 2020 at 10:36:27PM +0100, Mel Gorman wrote:
>
> It builds, not booted, it's for discussion but maybe Dave is feeling brave!
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index ca5db40392d4..52c73598b18a 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
On Thu, Jul 02, 2020 at 01:15:48PM -0400, Dave Jones wrote:
> When I upgraded my firewall to 5.7-rc2 I noticed that on a mostly
> idle machine (that usually sees loadavg hover in the 0.xx range)
> that it was consistently above 1.00 even when there was nothing running.
> All that perf showed was
[weird loadavg on idle machine post 5.7] On 02/07/2020 (Thu 13:15) Dave Jones wrote:
> When I upgraded my firewall to 5.7-rc2 I noticed that on a mostly
> idle machine (that usually sees loadavg hover in the 0.xx range)
> that it was consistently above 1.00 even when there was nothin
On Thu, Jul 02, 2020 at 01:15:48PM -0400, Dave Jones wrote:
> When I upgraded my firewall to 5.7-rc2 I noticed that on a mostly
> idle machine (that usually sees loadavg hover in the 0.xx range)
> that it was consistently above 1.00 even when there was nothing running.
> All that perf showed
When I upgraded my firewall to 5.7-rc2 I noticed that on a mostly
idle machine (that usually sees loadavg hover in the 0.xx range)
that it was consistently above 1.00 even when there was nothing running.
All that perf showed was the kernel was spending time in the idle loop
(and running perf).