Re: [RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()
On Wed, May 18, 2016 at 09:41:20AM +0100, Matt Fleming wrote: > On Wed, 18 May, at 03:01:27AM, Yuyang Du wrote: > > On Tue, May 17, 2016 at 01:24:15PM +0100, Matt Fleming wrote: > > > So, if the code looks like the following, either now or in the future, > > > > > > static void __schedule(bool preempt) > > > { > > > ... > > > /* Clear RQCF_ACT_SKIP */ > > > rq->clock_update_flags = 0; > > > ... > > > delta = rq_clock(); > > > } > > > > Sigh, you even said "Clear RQCF_ACT_SKIP", but you not only clear it, > > you clear everything. > > That was sloppy on my part but intentional because that's what the > code looks like in tip/sched/core today. Sure, rq->clock_update_flags = 0 is itself all right, just say what you do. > It was purely meant to demonstrate that setting RQCF_UPDATE just > because RQCF_ACT_SKIP is set does not make sense. You can replace the > clearing line with the correct bit masking operation. I don't understand how you demonstrated that does not make sense. Anways, you sort it out. > But I get it, the pseudo-code was confusing. I'll send out a v2. > > > And if you clear the RQCF_UPDATE also (maybe you shouldn't, but > > actually it does not matter), of course you will get a warning... > > Right, I wouldn't actually clear RQCF_UPDATE in v2 of this patch. > > > In addition, it looks like multiple skips are possible, so: > > I'm not sure what you mean, could you elaborate? > > > update_rq_clock() { > > rq->clock_update_flags |= RQCF_UPDATE; > > > > ... > > } > > > > instead of clearing the skip flag there. > > Huh? RQCF_*_SKIP are not cleared in update_rq_clock(). Yeah, I previously cleared the SKIP bit, which is not right, so I corrected.
Re: [RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()
On Wed, 18 May, at 03:01:27AM, Yuyang Du wrote: > On Tue, May 17, 2016 at 01:24:15PM +0100, Matt Fleming wrote: > > So, if the code looks like the following, either now or in the future, > > > > static void __schedule(bool preempt) > > { > > ... > > /* Clear RQCF_ACT_SKIP */ > > rq->clock_update_flags = 0; > > ... > > delta = rq_clock(); > > } > > Sigh, you even said "Clear RQCF_ACT_SKIP", but you not only clear it, > you clear everything. That was sloppy on my part but intentional because that's what the code looks like in tip/sched/core today. It was purely meant to demonstrate that setting RQCF_UPDATE just because RQCF_ACT_SKIP is set does not make sense. You can replace the clearing line with the correct bit masking operation. But I get it, the pseudo-code was confusing. I'll send out a v2. > And if you clear the RQCF_UPDATE also (maybe you shouldn't, but > actually it does not matter), of course you will get a warning... Right, I wouldn't actually clear RQCF_UPDATE in v2 of this patch. > In addition, it looks like multiple skips are possible, so: I'm not sure what you mean, could you elaborate? > update_rq_clock() { > rq->clock_update_flags |= RQCF_UPDATE; > > ... > } > > instead of clearing the skip flag there. Huh? RQCF_*_SKIP are not cleared in update_rq_clock().
Re: [RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()
On Tue, May 17, 2016 at 01:24:15PM +0100, Matt Fleming wrote: > So, if the code looks like the following, either now or in the future, > > static void __schedule(bool preempt) > { > ... > /* Clear RQCF_ACT_SKIP */ > rq->clock_update_flags = 0; > ... > delta = rq_clock(); > } Sigh, you even said "Clear RQCF_ACT_SKIP", but you not only clear it, you clear everything. And if you clear the RQCF_UPDATE also (maybe you shouldn't, but actually it does not matter), of course you will get a warning... In addition, it looks like multiple skips are possible, so: update_rq_clock() { rq->clock_update_flags |= RQCF_UPDATE; ... } instead of clearing the skip flag there.
Re: [RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()
On Tue, 17 May, at 04:11:09AM, Yuyang Du wrote: > On Mon, May 16, 2016 at 10:46:38AM +0100, Matt Fleming wrote: > > > > No because if someone calls rq_clock() immediately after __schedule(), > > or even immediately after we clear RQCF_ACT_SKIP in __schedule(), we > > should trigger a warning since the clock has not actually been > > updated. > > Well, I don't know how concurrent it can be, but aren't both update > and read synchronized by rq->lock? So I don't understand the latter > case, and the former should be addressed (missing its own update?). I'm not talking about concurrency; when I said "someone" above, I was referring to code. So, if the code looks like the following, either now or in the future, static void __schedule(bool preempt) { ... /* Clear RQCF_ACT_SKIP */ rq->clock_update_flags = 0; ... delta = rq_clock(); } I would expect to see a warning triggered, because we've read the rq clock outside of the code area where we know it's safe to do so without a clock update. The solution for that bug may be as simple as rearranging the code, delta = rq_clock(); ... rq->clock_update_flags = 0; but we definitely want to catch such bugs in the first instance.
Re: [RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()
On Mon, May 16, 2016 at 10:46:38AM +0100, Matt Fleming wrote: > > > > > > - rq->clock_skip_update = 0; > > > + /* Clear ACT, preserve everything else */ > > > + rq->clock_update_flags ^= RQCF_ACT_SKIP; > > > > The comment says "Clear ACT", but this is really xor, and I am not sure > > this is even what you want. > > Urgh, you're right. I'm not sure what I was thinking when I wrote > that. It happens, ;) > > In addition, would it be simpler to do this? > > > > update_rq_clock() > > if (flags & RQCF_ACT_SKIP) > > flags <<= 1; /* effective skip is an update */ > > return; > > > > flags = RQCF_UPDATED; > > No because if someone calls rq_clock() immediately after __schedule(), > or even immediately after we clear RQCF_ACT_SKIP in __schedule(), we > should trigger a warning since the clock has not actually been > updated. Well, I don't know how concurrent it can be, but aren't both update and read synchronized by rq->lock? So I don't understand the latter case, and the former should be addressed (missing its own update?).
Re: [RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()
On Sun, 15 May, at 10:14:39AM, Yuyang Du wrote: > Hi Matt, > > Thanks for Ccing me. > > I am indeed interested, because I recently encountered an rq clock > issue, which is that the clock jumps about 200ms when I was > experimenting the "flat util hierarchy" patches, which really annoyed > me, and I had to stop to figure out what is wrong (but haven't yet > figured out ;)) > > First, this patchset does not solve my problem, but never mind, by > reviewing your patches, I have some comments: Thanks for the review. One gap that this patch series doesn't address is that some callers of update_rq_clock() do not pin rq->lock, which makes the diagnostic checks useless in that case. I plan on handling that next, but I wanted to get this series out as soon as possible for review. > On Thu, May 12, 2016 at 08:49:53PM +0100, Matt Fleming wrote: > > > > - rq->clock_skip_update = 0; > > + /* Clear ACT, preserve everything else */ > > + rq->clock_update_flags ^= RQCF_ACT_SKIP; > > The comment says "Clear ACT", but this is really xor, and I am not sure > this is even what you want. Urgh, you're right. I'm not sure what I was thinking when I wrote that. > In addition, would it be simpler to do this? > > update_rq_clock() > if (flags & RQCF_ACT_SKIP) > flags <<= 1; /* effective skip is an update */ > return; > > flags = RQCF_UPDATED; No because if someone calls rq_clock() immediately after __schedule(), or even immediately after we clear RQCF_ACT_SKIP in __schedule(), we should trigger a warning since the clock has not actually been updated.
Re: [RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()
Hi Matt, Thanks for Ccing me. I am indeed interested, because I recently encountered an rq clock issue, which is that the clock jumps about 200ms when I was experimenting the "flat util hierarchy" patches, which really annoyed me, and I had to stop to figure out what is wrong (but haven't yet figured out ;)) First, this patchset does not solve my problem, but never mind, by reviewing your patches, I have some comments: On Thu, May 12, 2016 at 08:49:53PM +0100, Matt Fleming wrote: > > - rq->clock_skip_update = 0; > + /* Clear ACT, preserve everything else */ > + rq->clock_update_flags ^= RQCF_ACT_SKIP; The comment says "Clear ACT", but this is really xor, and I am not sure this is even what you want. In addition, would it be simpler to do this? update_rq_clock() if (flags & RQCF_ACT_SKIP) flags <<= 1; /* effective skip is an update */ return; flags = RQCF_UPDATED; Thanks, Yuyang
[RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()
There's no diagnostic checks for figuring out when we've accidentally missed update_rq_clock() calls. Let's add some by piggybacking on the rq_*pin_lock() wrappers. The idea behind the diagnostic checks is that upon pining rq lock the rq clock should be updated, via update_rq_clock(), before anybody reads the clock with rq_clock(). The exception to this rule is when updates have explicitly been disabled with the rq_clock_skip_update() optimisation. There are some functions that only unpin the rq lock in order to grab some other lock and avoid deadlock. In that case we don't need to update the clock again and the previous diagnostic state can be carried over in rq_repin_lock() by saving the state in the rq_flags context. Since this patch adds a new clock update flag and some already exist in rq::clock_skip_update, that field has now been renamed. An attempt has been made to keep the flag manipulation code small and fast since it's used in the heart of the __schedule() fast path. For the !CONFIG_SCHED_DEBUG case the only object code change (other than addresses) is the following change to the two lines to reset RQCF_ACT_SKIP inside of __schedule(), - 41 c7 84 24 f0 08 00movl$0x0,0x8f0(%r12) - 00 00 00 00 00 + 41 83 b4 24 f0 08 00xorl$0x2,0x8f0(%r12) + 00 02 Suggested-by: Peter Zijlstra Cc: Ingo Molnar Cc: Mike Galbraith Cc: Mel Gorman Cc: Thomas Gleixner Cc: Frederic Weisbecker Cc: Yuyang Du Cc: "Rafael J. Wysocki" Signed-off-by: Matt Fleming --- kernel/sched/core.c | 13 +++--- kernel/sched/sched.h | 70 +++- 2 files changed, 73 insertions(+), 10 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index d2112c268fd1..0999b3f23dde 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -101,9 +101,12 @@ void update_rq_clock(struct rq *rq) lockdep_assert_held(&rq->lock); - if (rq->clock_skip_update & RQCF_ACT_SKIP) + if (rq->clock_update_flags & RQCF_ACT_SKIP) return; +#ifdef CONFIG_SCHED_DEBUG + rq->clock_update_flags |= RQCF_UPDATED; +#endif delta = sched_clock_cpu(cpu_of(rq)) - rq->clock; if (delta < 0) return; @@ -2825,7 +2828,8 @@ context_switch(struct rq *rq, struct task_struct *prev, rq->prev_mm = oldmm; } - rq->clock_skip_update = 0; + /* Clear ACT, preserve everything else */ + rq->clock_update_flags ^= RQCF_ACT_SKIP; /* * Since the runqueue lock will be released by the next @@ -3287,7 +3291,7 @@ static void __sched notrace __schedule(bool preempt) raw_spin_lock(&rq->lock); rq_pin_lock(rq, &rf); - rq->clock_skip_update <<= 1; /* promote REQ to ACT */ + rq->clock_update_flags <<= 1; /* promote REQ to ACT */ switch_count = &prev->nivcsw; if (!preempt && prev->state) { @@ -3328,7 +3332,8 @@ static void __sched notrace __schedule(bool preempt) trace_sched_switch(preempt, prev, next); rq = context_switch(rq, prev, next, &rf); /* unlocks the rq */ } else { - rq->clock_skip_update = 0; + /* Clear ACT, preserve everything else */ + rq->clock_update_flags ^= RQCF_ACT_SKIP; rq_unpin_lock(rq, &rf); raw_spin_unlock_irq(&rq->lock); } diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index cbeb830c7ac6..072c020cd8e3 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -628,7 +628,7 @@ struct rq { unsigned long next_balance; struct mm_struct *prev_mm; - unsigned int clock_skip_update; + unsigned int clock_update_flags; u64 clock; u64 clock_task; @@ -735,9 +735,45 @@ static inline u64 __rq_clock_broken(struct rq *rq) return READ_ONCE(rq->clock); } +/* + * rq::clock_update_flags bits + * + * %RQCF_REQ_SKIP - will request skipping of clock update on the next + * call to __schedule(). This is an optimisation to avoid + * neighbouring rq clock updates. + * + * %RQCF_ACT_SKIP - is set from inside of __schedule() when skipping is + * in effect and calls to update_rq_clock() are being ignored. + * + * %RQCF_UPDATED - is a debug flag that indicates whether a call has been + * made to update_rq_clock() since the last time rq::lock was pinned. + * + * If inside of __schedule(), clock_update_flags will have been + * shifted left (a left shift is a cheap operation for the fast path + * to promote %RQCF_REQ_SKIP to %RQCF_ACT_SKIP), so you must use, + * + * if (rq-clock_update_flags >= RQCF_UPDATED) + * + * to check if %RQCF_UPADTED is set. It'll never be shifted more than + * one position though, because the next rq_unpin_lock() will shift it + * back. + */ +#define RQCF_REQ_SKIP 0x01 +#define RQCF_ACT_SKIP 0x02 +#define RQCF_UPDATED 0x04 + static inline u64 rq_clock(struct rq *rq) { loc