Re: [PATCH] sched: use rq_lock/unlock in online_fair_sched_group

2019-08-15 Thread Phil Auld
On Fri, Aug 09, 2019 at 06:43:09PM +0100 Valentin Schneider wrote:
> On 09/08/2019 14:33, Phil Auld wrote:
> > On Tue, Aug 06, 2019 at 03:03:34PM +0200 Peter Zijlstra wrote:
> >> On Thu, Aug 01, 2019 at 09:37:49AM -0400, Phil Auld wrote:
> >>> Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes
> >>
> >> ISTR there were more issues; but it sure is good to start picking them
> >> off.
> >>
> > 
> > Following up on this I hit another in rt.c which looks like:
> > 
> > [  156.348854] Call Trace:
> > [  156.351301]  
> > [  156.353322]  sched_rt_period_timer+0x124/0x350
> > [  156.357766]  ? sched_rt_rq_enqueue+0x90/0x90
> > [  156.362037]  __hrtimer_run_queues+0xfb/0x270
> > [  156.366303]  hrtimer_interrupt+0x122/0x270
> > [  156.370403]  smp_apic_timer_interrupt+0x6a/0x140
> > [  156.375022]  apic_timer_interrupt+0xf/0x20
> > [  156.379119]  
> > 
> > It looks like the same issue of not using the rq_lock* wrappers and
> > hence not using the pinning. From looking at the code there is at 
> > least one potential hit in deadline.c in the push_dl_task path with 
> > find_lock_later_rq but I have not hit that in practice.
> > 
> > This commit, which introduced the warning, seems to imply that the use
> > of the rq_lock* wrappers is required, at least for any sections that will
> > call update_rq_clock:
> > 
> > commit 26ae58d23b94a075ae724fd18783a3773131cfbc
> > Author: Peter Zijlstra 
> > Date:   Mon Oct 3 16:53:49 2016 +0200
> > 
> > sched/core: Add WARNING for multiple update_rq_clock() calls
> > 
> > Now that we have no missing calls, add a warning to find multiple
> > calls.
> > 
> > By having only a single update_rq_clock() call per rq-lock section,
> > the section appears 'atomic' wrt time.
> > 
> > 
> > Is that the case? Otherwise we have these false positives.
> > 
> 
> Looks like it - only rq_pin_lock() clears RQCF_UPDATED, so any
> update_rq_clock() that isn't preceded by that function will still have
> RQCF_UPDATED set the second time it's executed and will trigger the warn.
> 
> Seeing as the wrappers boil down to raw_spin_*() when the debug bits are
> disabled, I don't see why we wouldn't want to convert these callsites.
> 

The one above is easy enough. After that I hit one related to the
double_rq_lock paths. Now I see why that was not cleaned up already.
That's going to be a bit messier and will require some study.

I'll post this trivial one anyway.

Cheers,
Phil



Re: [PATCH] sched: use rq_lock/unlock in online_fair_sched_group

2019-08-09 Thread Valentin Schneider
On 09/08/2019 14:33, Phil Auld wrote:
> On Tue, Aug 06, 2019 at 03:03:34PM +0200 Peter Zijlstra wrote:
>> On Thu, Aug 01, 2019 at 09:37:49AM -0400, Phil Auld wrote:
>>> Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes
>>
>> ISTR there were more issues; but it sure is good to start picking them
>> off.
>>
> 
> Following up on this I hit another in rt.c which looks like:
> 
> [  156.348854] Call Trace:
> [  156.351301]  
> [  156.353322]  sched_rt_period_timer+0x124/0x350
> [  156.357766]  ? sched_rt_rq_enqueue+0x90/0x90
> [  156.362037]  __hrtimer_run_queues+0xfb/0x270
> [  156.366303]  hrtimer_interrupt+0x122/0x270
> [  156.370403]  smp_apic_timer_interrupt+0x6a/0x140
> [  156.375022]  apic_timer_interrupt+0xf/0x20
> [  156.379119]  
> 
> It looks like the same issue of not using the rq_lock* wrappers and
> hence not using the pinning. From looking at the code there is at 
> least one potential hit in deadline.c in the push_dl_task path with 
> find_lock_later_rq but I have not hit that in practice.
> 
> This commit, which introduced the warning, seems to imply that the use
> of the rq_lock* wrappers is required, at least for any sections that will
> call update_rq_clock:
> 
> commit 26ae58d23b94a075ae724fd18783a3773131cfbc
> Author: Peter Zijlstra 
> Date:   Mon Oct 3 16:53:49 2016 +0200
> 
> sched/core: Add WARNING for multiple update_rq_clock() calls
> 
> Now that we have no missing calls, add a warning to find multiple
> calls.
> 
> By having only a single update_rq_clock() call per rq-lock section,
> the section appears 'atomic' wrt time.
> 
> 
> Is that the case? Otherwise we have these false positives.
> 

Looks like it - only rq_pin_lock() clears RQCF_UPDATED, so any
update_rq_clock() that isn't preceded by that function will still have
RQCF_UPDATED set the second time it's executed and will trigger the warn.

Seeing as the wrappers boil down to raw_spin_*() when the debug bits are
disabled, I don't see why we wouldn't want to convert these callsites.

> I can spin up patches if so. 
> 
> 
> Thanks,
> Phil
> 
> 


Re: [PATCH] sched: use rq_lock/unlock in online_fair_sched_group

2019-08-09 Thread Phil Auld
On Tue, Aug 06, 2019 at 03:03:34PM +0200 Peter Zijlstra wrote:
> On Thu, Aug 01, 2019 at 09:37:49AM -0400, Phil Auld wrote:
> > Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes
> 
> ISTR there were more issues; but it sure is good to start picking them
> off.
> 

Following up on this I hit another in rt.c which looks like:

[  156.348854] Call Trace:
[  156.351301]  
[  156.353322]  sched_rt_period_timer+0x124/0x350
[  156.357766]  ? sched_rt_rq_enqueue+0x90/0x90
[  156.362037]  __hrtimer_run_queues+0xfb/0x270
[  156.366303]  hrtimer_interrupt+0x122/0x270
[  156.370403]  smp_apic_timer_interrupt+0x6a/0x140
[  156.375022]  apic_timer_interrupt+0xf/0x20
[  156.379119]  

It looks like the same issue of not using the rq_lock* wrappers and
hence not using the pinning. From looking at the code there is at 
least one potential hit in deadline.c in the push_dl_task path with 
find_lock_later_rq but I have not hit that in practice.

This commit, which introduced the warning, seems to imply that the use
of the rq_lock* wrappers is required, at least for any sections that will
call update_rq_clock:

commit 26ae58d23b94a075ae724fd18783a3773131cfbc
Author: Peter Zijlstra 
Date:   Mon Oct 3 16:53:49 2016 +0200

sched/core: Add WARNING for multiple update_rq_clock() calls

Now that we have no missing calls, add a warning to find multiple
calls.

By having only a single update_rq_clock() call per rq-lock section,
the section appears 'atomic' wrt time.


Is that the case? Otherwise we have these false positives.

I can spin up patches if so. 


Thanks,
Phil




Re: [PATCH] sched: use rq_lock/unlock in online_fair_sched_group

2019-08-06 Thread Phil Auld
On Tue, Aug 06, 2019 at 03:03:34PM +0200 Peter Zijlstra wrote:
> On Thu, Aug 01, 2019 at 09:37:49AM -0400, Phil Auld wrote:
> > Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes
> 
> ISTR there were more issues; but it sure is good to start picking them
> off.

I haven't hit any others, but if/when I do I'll try to dig into them.

> 
> > warning to fire in update_rq_clock. This seems to be caused by onlining
> > a new fair sched group not using the rq lock wrappers.
> > 
> > [472978.683085] rq->clock_update_flags & RQCF_UPDATED
> > [472978.683100] WARNING: CPU: 5 PID: 54385 at kernel/sched/core.c:210 update_rq_clock+0xec/0x150
> 
> > Using the wrappers in online_fair_sched_group instead of the raw locking 
> > removes this warning. 
> 
> Yeah, that seems sane. Thanks!

Thanks,
Phil



Re: [PATCH] sched: use rq_lock/unlock in online_fair_sched_group

2019-08-06 Thread Peter Zijlstra
On Thu, Aug 01, 2019 at 09:37:49AM -0400, Phil Auld wrote:
> Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes

ISTR there were more issues; but it sure is good to start picking them
off.

> warning to fire in update_rq_clock. This seems to be caused by onlining
> a new fair sched group not using the rq lock wrappers.
> 
> [472978.683085] rq->clock_update_flags & RQCF_UPDATED
> [472978.683100] WARNING: CPU: 5 PID: 54385 at kernel/sched/core.c:210 update_rq_clock+0xec/0x150

> Using the wrappers in online_fair_sched_group instead of the raw locking 
> removes this warning. 

Yeah, that seems sane. Thanks!


Re: [PATCH] sched: use rq_lock/unlock in online_fair_sched_group

2019-08-06 Thread Phil Auld
On Tue, Aug 06, 2019 at 02:04:16PM +0800 Hillf Danton wrote:
> 
> On Mon, 5 Aug 2019 22:07:05 +0800 Phil Auld wrote:
> >
> > If we're to clear that flag right there, outside of the lock pinning code,
> > then I think we might as well just remove the flag and all associated
> > comments etc, no?
> 
> A diff may tell the Peter folks more about your thoughts?
> 

I provided a diff with my thoughts of how to remove this warning in
the original post :)

This comment was about your patch which, to my mind, makes the flag
meaningless, so we could just remove the whole thing. I was not
proposing to actually do that. I assumed it was there because it was
thought to be useful. Although, if that is what people want, I could
certainly spin up a patch to that effect.


Cheers,
Phil

> Hillf
> 



Re: [PATCH] sched: use rq_lock/unlock in online_fair_sched_group

2019-08-05 Thread Phil Auld
On Fri, Aug 02, 2019 at 05:20:38PM +0800 Hillf Danton wrote:
> 
> On Thu,  1 Aug 2019 09:37:49 -0400 Phil Auld wrote:
> >
> > Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes
> > warning to fire in update_rq_clock. This seems to be caused by onlining
> > a new fair sched group not using the rq lock wrappers.
> > 
> > [472978.683085] rq->clock_update_flags & RQCF_UPDATED
> > [472978.683100] WARNING: CPU: 5 PID: 54385 at kernel/sched/core.c:210 update_rq_clock+0xec/0x150
> 
> Another option, perhaps, but only if the wrappers are not mandatory.
> 
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -212,10 +212,14 @@ void update_rq_clock(struct rq *rq)
>  #endif
>  
>  	delta = sched_clock_cpu(cpu_of(rq)) - rq->clock;
> -	if (delta < 0)
> -		return;
> -	rq->clock += delta;
> -	update_rq_clock_task(rq, delta);
> +	if (delta >= 0) {
> +		rq->clock += delta;
> +		update_rq_clock_task(rq, delta);
> +	}
> +
> +#ifdef CONFIG_SCHED_DEBUG
> +	rq->clock_update_flags &= ~RQCF_UPDATED;
> +#endif
> +#endif
>  }
>  
>  
> --
> 

I think that would silence the warning, but...

If we're to clear that flag right there, outside of the lock pinning code, 
then I think we might as well just remove the flag and all associated 
comments etc, no?


Cheers,
Phil
