Re: [PATCH] sched: use rq_lock/unlock in online_fair_sched_group
On Fri, Aug 09, 2019 at 06:43:09PM +0100 Valentin Schneider wrote:
> On 09/08/2019 14:33, Phil Auld wrote:
> > On Tue, Aug 06, 2019 at 03:03:34PM +0200 Peter Zijlstra wrote:
> >> On Thu, Aug 01, 2019 at 09:37:49AM -0400, Phil Auld wrote:
> >>> Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes
> >>
> >> ISTR there were more issues; but it sure is good to start picking them
> >> off.
> >>
> >
> > Following up on this I hit another in rt.c which looks like:
> >
> > [ 156.348854] Call Trace:
> > [ 156.351301]
> > [ 156.353322] sched_rt_period_timer+0x124/0x350
> > [ 156.357766] ? sched_rt_rq_enqueue+0x90/0x90
> > [ 156.362037] __hrtimer_run_queues+0xfb/0x270
> > [ 156.366303] hrtimer_interrupt+0x122/0x270
> > [ 156.370403] smp_apic_timer_interrupt+0x6a/0x140
> > [ 156.375022] apic_timer_interrupt+0xf/0x20
> > [ 156.379119]
> >
> > It looks like the same issue of not using the rq_lock* wrappers and
> > hence not using the pinning. From looking at the code there is at
> > least one potential hit in deadline.c in the push_dl_task path with
> > find_lock_later_rq but I have not hit that in practice.
> >
> > This commit, which introduced the warning, seems to imply that the use
> > of the rq_lock* wrappers is required, at least for any sections that will
> > call update_rq_clock:
> >
> > commit 26ae58d23b94a075ae724fd18783a3773131cfbc
> > Author: Peter Zijlstra
> > Date: Mon Oct 3 16:53:49 2016 +0200
> >
> >     sched/core: Add WARNING for multiple update_rq_clock() calls
> >
> >     Now that we have no missing calls, add a warning to find multiple
> >     calls.
> >
> >     By having only a single update_rq_clock() call per rq-lock section,
> >     the section appears 'atomic' wrt time.
> >
> >
> > Is that the case? Otherwise we have these false positives.
> >
>
> Looks like it - only rq_pin_lock() clears RQCF_UPDATED, so any
> update_rq_clock() that isn't preceded by that function will still have
> RQCF_UPDATED set the second time it's executed and will trigger the warn.
>
> Seeing as the wrappers boil down to raw_spin_*() when the debug bits are
> disabled, I don't see why we wouldn't want to convert these callsites.
>

The one above is easy enough. After that I hit one related to the
double_rq_lock paths. Now I see why that was not cleaned up already.

That's going to be a bit messier and will require some study.

I'll post this trivial anyway.

Cheers,
Phil
--
Re: [PATCH] sched: use rq_lock/unlock in online_fair_sched_group
On 09/08/2019 14:33, Phil Auld wrote:
> On Tue, Aug 06, 2019 at 03:03:34PM +0200 Peter Zijlstra wrote:
>> On Thu, Aug 01, 2019 at 09:37:49AM -0400, Phil Auld wrote:
>>> Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes
>>
>> ISTR there were more issues; but it sure is good to start picking them
>> off.
>>
>
> Following up on this I hit another in rt.c which looks like:
>
> [ 156.348854] Call Trace:
> [ 156.351301]
> [ 156.353322] sched_rt_period_timer+0x124/0x350
> [ 156.357766] ? sched_rt_rq_enqueue+0x90/0x90
> [ 156.362037] __hrtimer_run_queues+0xfb/0x270
> [ 156.366303] hrtimer_interrupt+0x122/0x270
> [ 156.370403] smp_apic_timer_interrupt+0x6a/0x140
> [ 156.375022] apic_timer_interrupt+0xf/0x20
> [ 156.379119]
>
> It looks like the same issue of not using the rq_lock* wrappers and
> hence not using the pinning. From looking at the code there is at
> least one potential hit in deadline.c in the push_dl_task path with
> find_lock_later_rq but I have not hit that in practice.
>
> This commit, which introduced the warning, seems to imply that the use
> of the rq_lock* wrappers is required, at least for any sections that will
> call update_rq_clock:
>
> commit 26ae58d23b94a075ae724fd18783a3773131cfbc
> Author: Peter Zijlstra
> Date: Mon Oct 3 16:53:49 2016 +0200
>
>     sched/core: Add WARNING for multiple update_rq_clock() calls
>
>     Now that we have no missing calls, add a warning to find multiple
>     calls.
>
>     By having only a single update_rq_clock() call per rq-lock section,
>     the section appears 'atomic' wrt time.
>
>
> Is that the case? Otherwise we have these false positives.
>

Looks like it - only rq_pin_lock() clears RQCF_UPDATED, so any
update_rq_clock() that isn't preceded by that function will still have
RQCF_UPDATED set the second time it's executed and will trigger the warn.

Seeing as the wrappers boil down to raw_spin_*() when the debug bits are
disabled, I don't see why we wouldn't want to convert these callsites.

> I can spin up patches if so.
>
>
> Thanks,
> Phil
>
>
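The mechanism described in the reply above can be sketched in plain userspace C. This is only a toy model, not kernel code: the flag names come from the thread, but the numeric values, the struct, and every model_* function are assumptions made up for illustration. It counts how often a WARN_DOUBLE_CLOCK-style check would fire when two lock sections are entered with raw locking versus through a wrapper that pins the lock.

```c
#include <assert.h>

/*
 * Userspace model only, not kernel code. The flag names mirror the ones
 * discussed in the thread; the numeric values and every model_* name are
 * assumptions made for this illustration.
 */
#define RQCF_REQ_SKIP 0x01
#define RQCF_ACT_SKIP 0x02
#define RQCF_UPDATED  0x04

struct model_rq {
	unsigned int clock_update_flags;
	int warnings;	/* counts would-be WARN_DOUBLE_CLOCK hits */
};

/* Stand-in for the pinning done by the rq_lock* wrappers: drops UPDATED. */
static void model_rq_pin_lock(struct model_rq *rq)
{
	rq->clock_update_flags &= (RQCF_REQ_SKIP | RQCF_ACT_SKIP);
}

/* Stand-in for update_rq_clock(): warn if already updated, then mark. */
static void model_update_rq_clock(struct model_rq *rq)
{
	if (rq->clock_update_flags & RQCF_UPDATED)
		rq->warnings++;
	rq->clock_update_flags |= RQCF_UPDATED;
}

/* Two back-to-back lock sections taken with raw locking (no pinning). */
static int raw_sections_warnings(void)
{
	struct model_rq rq = { 0, 0 };

	model_update_rq_clock(&rq);	/* section 1 sets UPDATED */
	model_update_rq_clock(&rq);	/* section 2 sees the stale flag */
	return rq.warnings;
}

/* The same two sections, each entered through the pinning stand-in. */
static int wrapped_sections_warnings(void)
{
	struct model_rq rq = { 0, 0 };

	model_rq_pin_lock(&rq);
	model_update_rq_clock(&rq);
	assert(rq.warnings == 0);

	model_rq_pin_lock(&rq);		/* clears UPDATED for section 2 */
	model_update_rq_clock(&rq);
	return rq.warnings;
}
```

In this model the raw variant reports one warning for the second section even though each section updates the clock only once, which is the false positive discussed above; the wrapped variant reports none.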
Re: [PATCH] sched: use rq_lock/unlock in online_fair_sched_group
On Tue, Aug 06, 2019 at 03:03:34PM +0200 Peter Zijlstra wrote:
> On Thu, Aug 01, 2019 at 09:37:49AM -0400, Phil Auld wrote:
> > Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes
>
> ISTR there were more issues; but it sure is good to start picking them
> off.
>

Following up on this I hit another in rt.c which looks like:

[ 156.348854] Call Trace:
[ 156.351301]
[ 156.353322] sched_rt_period_timer+0x124/0x350
[ 156.357766] ? sched_rt_rq_enqueue+0x90/0x90
[ 156.362037] __hrtimer_run_queues+0xfb/0x270
[ 156.366303] hrtimer_interrupt+0x122/0x270
[ 156.370403] smp_apic_timer_interrupt+0x6a/0x140
[ 156.375022] apic_timer_interrupt+0xf/0x20
[ 156.379119]

It looks like the same issue of not using the rq_lock* wrappers and
hence not using the pinning. From looking at the code there is at
least one potential hit in deadline.c in the push_dl_task path with
find_lock_later_rq but I have not hit that in practice.

This commit, which introduced the warning, seems to imply that the use
of the rq_lock* wrappers is required, at least for any sections that will
call update_rq_clock:

commit 26ae58d23b94a075ae724fd18783a3773131cfbc
Author: Peter Zijlstra
Date: Mon Oct 3 16:53:49 2016 +0200

    sched/core: Add WARNING for multiple update_rq_clock() calls

    Now that we have no missing calls, add a warning to find multiple
    calls.

    By having only a single update_rq_clock() call per rq-lock section,
    the section appears 'atomic' wrt time.


Is that the case? Otherwise we have these false positives.

I can spin up patches if so.


Thanks,
Phil
--
Re: [PATCH] sched: use rq_lock/unlock in online_fair_sched_group
On Tue, Aug 06, 2019 at 03:03:34PM +0200 Peter Zijlstra wrote:
> On Thu, Aug 01, 2019 at 09:37:49AM -0400, Phil Auld wrote:
> > Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes
>
> ISTR there were more issues; but it sure is good to start picking them
> off.

I haven't hit any others, but if/when I do I'll try to dig into them.

>
> > warning to fire in update_rq_clock. This seems to be caused by onlining
> > a new fair sched group not using the rq lock wrappers.
> >
> > [472978.683085] rq->clock_update_flags & RQCF_UPDATED
> > [472978.683100] WARNING: CPU: 5 PID: 54385 at kernel/sched/core.c:210
> > update_rq_clock+0xec/0x150
>
> > Using the wrappers in online_fair_sched_group instead of the raw locking
> > removes this warning.
>
> Yeah, that seems sane. Thanks!

Thanks,
Phil
--
Re: [PATCH] sched: use rq_lock/unlock in online_fair_sched_group
On Thu, Aug 01, 2019 at 09:37:49AM -0400, Phil Auld wrote:
> Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes

ISTR there were more issues; but it sure is good to start picking them
off.

> warning to fire in update_rq_clock. This seems to be caused by onlining
> a new fair sched group not using the rq lock wrappers.
>
> [472978.683085] rq->clock_update_flags & RQCF_UPDATED
> [472978.683100] WARNING: CPU: 5 PID: 54385 at kernel/sched/core.c:210
> update_rq_clock+0xec/0x150

> Using the wrappers in online_fair_sched_group instead of the raw locking
> removes this warning.

Yeah, that seems sane. Thanks!
Re: [PATCH] sched: use rq_lock/unlock in online_fair_sched_group
On Tue, Aug 06, 2019 at 02:04:16PM +0800 Hillf Danton wrote:
>
> On Mon, 5 Aug 2019 22:07:05 +0800 Phil Auld wrote:
> >
> > If we're to clear that flag right there, outside of the lock pinning code,
> > then I think we might as well just remove the flag and all associated
> > comments etc, no?
>
> A diff may tell the Peter folks more about your thoughts?
>

I provided a diff with my thoughts on how to remove this warning in the
original post :)

This comment was about your patch which, to my mind, makes the flag
meaningless, so one could just remove the whole thing. I was not proposing
to actually do that. I assumed it was there because it was thought to be
useful.

Although, if that is what people want, I could certainly spin up a patch to
that effect.

Cheers,
Phil

> Hillf
>
--
Re: [PATCH] sched: use rq_lock/unlock in online_fair_sched_group
On Fri, Aug 02, 2019 at 05:20:38PM +0800 Hillf Danton wrote:
>
> On Thu, 1 Aug 2019 09:37:49 -0400 Phil Auld wrote:
> >
> > Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes
> > warning to fire in update_rq_clock. This seems to be caused by onlining
> > a new fair sched group not using the rq lock wrappers.
> >
> > [472978.683085] rq->clock_update_flags & RQCF_UPDATED
> > [472978.683100] WARNING: CPU: 5 PID: 54385 at kernel/sched/core.c:210
> > update_rq_clock+0xec/0x150
>
> Another option perhaps only if that wrappers are not mandatory.
>
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -212,10 +212,14 @@ void update_rq_clock(struct rq *rq)
>  #endif
>
>  	delta = sched_clock_cpu(cpu_of(rq)) - rq->clock;
> -	if (delta < 0)
> -		return;
> -	rq->clock += delta;
> -	update_rq_clock_task(rq, delta);
> +	if (delta >= 0) {
> +		rq->clock += delta;
> +		update_rq_clock_task(rq, delta);
> +	}
> +
> +#ifdef CONFIG_SCHED_DEBUG
> +	rq->clock_update_flags &= ~RQCF_UPDATED;
> +#endif
>  }
>
>
> --
>

I think that would silence the warning, but...

If we're to clear that flag right there, outside of the lock pinning code,
then I think we might as well just remove the flag and all associated
comments etc, no?

Cheers,
Phil
--
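The objection in this reply can be illustrated with a small userspace model. It is a sketch under stated assumptions, not kernel code: the flag values, the struct, and all model_* names are hypothetical, and the clear_on_exit switch merely imitates the effect of the quoted diff (clearing RQCF_UPDATED at the end of the update). The model performs a genuine double update inside a single pinned section and counts how often the check would fire.

```c
#include <assert.h>

/*
 * Userspace model only, not kernel code. Flag values and all model_*
 * names are assumptions made for this illustration.
 */
#define RQCF_REQ_SKIP 0x01
#define RQCF_ACT_SKIP 0x02
#define RQCF_UPDATED  0x04

struct model_rq {
	unsigned int clock_update_flags;
	int warnings;	/* counts would-be WARN_DOUBLE_CLOCK hits */
};

/* Stand-in for the pinning done by the rq_lock* wrappers. */
static void model_rq_pin_lock(struct model_rq *rq)
{
	rq->clock_update_flags &= (RQCF_REQ_SKIP | RQCF_ACT_SKIP);
}

/*
 * Stand-in for update_rq_clock(). A non-zero clear_on_exit imitates the
 * alternative diff quoted above, which clears UPDATED before returning.
 */
static void model_update_rq_clock(struct model_rq *rq, int clear_on_exit)
{
	if (rq->clock_update_flags & RQCF_UPDATED)
		rq->warnings++;
	rq->clock_update_flags |= RQCF_UPDATED;

	if (clear_on_exit)
		rq->clock_update_flags &= ~(unsigned int)RQCF_UPDATED;
}

/* One pinned section that really does update the clock twice. */
static int double_update_warnings(int clear_on_exit)
{
	struct model_rq rq = { 0, 0 };

	model_rq_pin_lock(&rq);
	assert(!(rq.clock_update_flags & RQCF_UPDATED));

	model_update_rq_clock(&rq, clear_on_exit);
	model_update_rq_clock(&rq, clear_on_exit);	/* the real double call */
	return rq.warnings;
}
```

With clear_on_exit set to 0 the model flags the second call; with it set to 1 the flag is never observed as set, so the check can no longer catch anything, which matches the point that the flag would become meaningless.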