Re: [RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()

2016-05-18 Thread Yuyang Du
On Wed, May 18, 2016 at 09:41:20AM +0100, Matt Fleming wrote:
> On Wed, 18 May, at 03:01:27AM, Yuyang Du wrote:
> > On Tue, May 17, 2016 at 01:24:15PM +0100, Matt Fleming wrote:
> > > So, if the code looks like the following, either now or in the future,
> > > 
> > > static void __schedule(bool preempt)
> > > {
> > >   ...
> > >   /* Clear RQCF_ACT_SKIP */
> > >   rq->clock_update_flags = 0;
> > >   ...
> > >   delta = rq_clock();
> > > }
> >  
> > Sigh, you even said "Clear RQCF_ACT_SKIP", but you not only clear it,
> > you clear everything.
> 
> That was sloppy on my part but intentional because that's what the
> code looks like in tip/sched/core today.
 
Sure, rq->clock_update_flags = 0 is itself all right, just say what you do.

> It was purely meant to demonstrate that setting RQCF_UPDATE just
> because RQCF_ACT_SKIP is set does not make sense. You can replace the
> clearing line with the correct bit masking operation.
 
I don't understand how you demonstrated that does not make sense. Anways,
you sort it out.

> But I get it, the pseudo-code was confusing. I'll send out a v2.
>
> > And if you clear the RQCF_UPDATE also (maybe you shouldn't, but
> > actually it does not matter), of course you will get a warning...
> 
> Right, I wouldn't actually clear RQCF_UPDATE in v2 of this patch.
> 
> > In addition, it looks like multiple skips are possible, so:
>  
> I'm not sure what you mean, could you elaborate?
> 
> > update_rq_clock() {
> > rq->clock_update_flags |= RQCF_UPDATE;
> > 
> > ...
> > }
> > 
> > instead of clearing the skip flag there.
> 
> Huh? RQCF_*_SKIP are not cleared in update_rq_clock().

Yeah, I previously cleared the SKIP bit, which is not right, so I corrected.


Re: [RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()

2016-05-18 Thread Matt Fleming
On Wed, 18 May, at 03:01:27AM, Yuyang Du wrote:
> On Tue, May 17, 2016 at 01:24:15PM +0100, Matt Fleming wrote:
> > So, if the code looks like the following, either now or in the future,
> > 
> > static void __schedule(bool preempt)
> > {
> > ...
> > /* Clear RQCF_ACT_SKIP */
> > rq->clock_update_flags = 0;
> > ...
> > delta = rq_clock();
> > }
>  
> Sigh, you even said "Clear RQCF_ACT_SKIP", but you not only clear it,
> you clear everything.

That was sloppy on my part but intentional because that's what the
code looks like in tip/sched/core today.

It was purely meant to demonstrate that setting RQCF_UPDATE just
because RQCF_ACT_SKIP is set does not make sense. You can replace the
clearing line with the correct bit masking operation.

But I get it, the pseudo-code was confusing. I'll send out a v2.

> And if you clear the RQCF_UPDATE also (maybe you shouldn't, but
> actually it does not matter), of course you will get a warning...

Right, I wouldn't actually clear RQCF_UPDATE in v2 of this patch.

> In addition, it looks like multiple skips are possible, so:
 
I'm not sure what you mean, could you elaborate?

> update_rq_clock() {
> rq->clock_update_flags |= RQCF_UPDATE;
> 
> ...
> }
> 
> instead of clearing the skip flag there.

Huh? RQCF_*_SKIP are not cleared in update_rq_clock().


Re: [RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()

2016-05-17 Thread Yuyang Du
On Tue, May 17, 2016 at 01:24:15PM +0100, Matt Fleming wrote:
> So, if the code looks like the following, either now or in the future,
> 
> static void __schedule(bool preempt)
> {
>   ...
>   /* Clear RQCF_ACT_SKIP */
>   rq->clock_update_flags = 0;
>   ...
>   delta = rq_clock();
> }
 
Sigh, you even said "Clear RQCF_ACT_SKIP", but you not only clear it,
you clear everything. And if you clear the RQCF_UPDATE also (maybe you
shouldn't, but actually it does not matter), of course you will get
a warning...

In addition, it looks like multiple skips are possible, so:

update_rq_clock() {
rq->clock_update_flags |= RQCF_UPDATE;

...
}

instead of clearing the skip flag there.


Re: [RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()

2016-05-17 Thread Matt Fleming
On Tue, 17 May, at 04:11:09AM, Yuyang Du wrote:
> On Mon, May 16, 2016 at 10:46:38AM +0100, Matt Fleming wrote:
> > 
> > No because if someone calls rq_clock() immediately after __schedule(),
> > or even immediately after we clear RQCF_ACT_SKIP in __schedule(), we
> > should trigger a warning since the clock has not actually been
> > updated.
> 
> Well, I don't know how concurrent it can be, but aren't both update
> and read synchronized by rq->lock? So I don't understand the latter
> case, and the former should be addressed (missing its own update?).

I'm not talking about concurrency; when I said "someone" above, I was
referring to code.

So, if the code looks like the following, either now or in the future,

static void __schedule(bool preempt)
{
...
/* Clear RQCF_ACT_SKIP */
rq->clock_update_flags = 0;
...
delta = rq_clock();
}

I would expect to see a warning triggered, because we've read the rq
clock outside of the code area where we know it's safe to do so
without a clock update.

The solution for that bug may be as simple as rearranging the code,

delta = rq_clock();
...
rq->clock_update_flags = 0;

but we definitely want to catch such bugs in the first instance.


Re: [RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()

2016-05-16 Thread Yuyang Du
On Mon, May 16, 2016 at 10:46:38AM +0100, Matt Fleming wrote:
> > >  
> > > - rq->clock_skip_update = 0;
> > > + /* Clear ACT, preserve everything else */
> > > + rq->clock_update_flags ^= RQCF_ACT_SKIP;
> > 
> > The comment says "Clear ACT", but this is really xor, and I am not sure
> > this is even what you want.
>  
> Urgh, you're right. I'm not sure what I was thinking when I wrote
> that.

It happens, ;)
 
> > In addition, would it be simpler to do this?
> > 
> > update_rq_clock()
> > if (flags & RQCF_ACT_SKIP)
> > flags <<= 1; /* effective skip is an update */
> > return;
> > 
> > flags = RQCF_UPDATED;
> 
> No because if someone calls rq_clock() immediately after __schedule(),
> or even immediately after we clear RQCF_ACT_SKIP in __schedule(), we
> should trigger a warning since the clock has not actually been
> updated.

Well, I don't know how concurrent it can be, but aren't both update
and read synchronized by rq->lock? So I don't understand the latter
case, and the former should be addressed (missing its own update?).


Re: [RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()

2016-05-16 Thread Matt Fleming
On Sun, 15 May, at 10:14:39AM, Yuyang Du wrote:
> Hi Matt,
> 
> Thanks for Ccing me.
> 
> I am indeed interested, because I recently encountered an rq clock
> issue, which is that the clock jumps about 200ms when I was
> experimenting the "flat util hierarchy" patches, which really annoyed
> me, and I had to stop to figure out what is wrong (but haven't yet
> figured out ;))
> 
> First, this patchset does not solve my problem, but never mind, by
> reviewing your patches, I have some comments:
 
Thanks for the review. One gap that this patch series doesn't address
is that some callers of update_rq_clock() do not pin rq->lock, which
makes the diagnostic checks useless in that case.

I plan on handling that next, but I wanted to get this series out as
soon as possible for review.

> On Thu, May 12, 2016 at 08:49:53PM +0100, Matt Fleming wrote:
> >  
> > -   rq->clock_skip_update = 0;
> > +   /* Clear ACT, preserve everything else */
> > +   rq->clock_update_flags ^= RQCF_ACT_SKIP;
> 
> The comment says "Clear ACT", but this is really xor, and I am not sure
> this is even what you want.
 
Urgh, you're right. I'm not sure what I was thinking when I wrote
that.

> In addition, would it be simpler to do this?
> 
> update_rq_clock()
>   if (flags & RQCF_ACT_SKIP)
>   flags <<= 1; /* effective skip is an update */
>   return;
> 
>   flags = RQCF_UPDATED;

No because if someone calls rq_clock() immediately after __schedule(),
or even immediately after we clear RQCF_ACT_SKIP in __schedule(), we
should trigger a warning since the clock has not actually been
updated.


Re: [RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()

2016-05-15 Thread Yuyang Du
Hi Matt,

Thanks for Ccing me.

I am indeed interested, because I recently encountered an rq clock
issue, which is that the clock jumps about 200ms when I was
experimenting the "flat util hierarchy" patches, which really annoyed
me, and I had to stop to figure out what is wrong (but haven't yet
figured out ;))

First, this patchset does not solve my problem, but never mind, by
reviewing your patches, I have some comments:

On Thu, May 12, 2016 at 08:49:53PM +0100, Matt Fleming wrote:
>  
> - rq->clock_skip_update = 0;
> + /* Clear ACT, preserve everything else */
> + rq->clock_update_flags ^= RQCF_ACT_SKIP;

The comment says "Clear ACT", but this is really xor, and I am not sure
this is even what you want.

In addition, would it be simpler to do this?

update_rq_clock()
if (flags & RQCF_ACT_SKIP)
flags <<= 1; /* effective skip is an update */
return;

flags = RQCF_UPDATED;

Thanks,
Yuyang


[RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()

2016-05-12 Thread Matt Fleming
There's no diagnostic checks for figuring out when we've accidentally
missed update_rq_clock() calls. Let's add some by piggybacking on the
rq_*pin_lock() wrappers.

The idea behind the diagnostic checks is that upon pining rq lock the
rq clock should be updated, via update_rq_clock(), before anybody
reads the clock with rq_clock(). The exception to this rule is when
updates have explicitly been disabled with the rq_clock_skip_update()
optimisation.

There are some functions that only unpin the rq lock in order to grab
some other lock and avoid deadlock. In that case we don't need to
update the clock again and the previous diagnostic state can be
carried over in rq_repin_lock() by saving the state in the rq_flags
context.

Since this patch adds a new clock update flag and some already exist
in rq::clock_skip_update, that field has now been renamed. An attempt
has been made to keep the flag manipulation code small and fast since
it's used in the heart of the __schedule() fast path.

For the !CONFIG_SCHED_DEBUG case the only object code change (other
than addresses) is the following change to the two lines to reset
RQCF_ACT_SKIP inside of __schedule(),

  -   41 c7 84 24 f0 08 00movl$0x0,0x8f0(%r12)
  -   00 00 00 00 00
  +   41 83 b4 24 f0 08 00xorl$0x2,0x8f0(%r12)
  +   00 02

Suggested-by: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Mike Galbraith 
Cc: Mel Gorman 
Cc: Thomas Gleixner 
Cc: Frederic Weisbecker 
Cc: Yuyang Du 
Cc: "Rafael J. Wysocki" 
Signed-off-by: Matt Fleming 
---
 kernel/sched/core.c  | 13 +++---
 kernel/sched/sched.h | 70 +++-
 2 files changed, 73 insertions(+), 10 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d2112c268fd1..0999b3f23dde 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -101,9 +101,12 @@ void update_rq_clock(struct rq *rq)
 
lockdep_assert_held(&rq->lock);
 
-   if (rq->clock_skip_update & RQCF_ACT_SKIP)
+   if (rq->clock_update_flags & RQCF_ACT_SKIP)
return;
 
+#ifdef CONFIG_SCHED_DEBUG
+   rq->clock_update_flags |= RQCF_UPDATED;
+#endif
delta = sched_clock_cpu(cpu_of(rq)) - rq->clock;
if (delta < 0)
return;
@@ -2825,7 +2828,8 @@ context_switch(struct rq *rq, struct task_struct *prev,
rq->prev_mm = oldmm;
}
 
-   rq->clock_skip_update = 0;
+   /* Clear ACT, preserve everything else */
+   rq->clock_update_flags ^= RQCF_ACT_SKIP;
 
/*
 * Since the runqueue lock will be released by the next
@@ -3287,7 +3291,7 @@ static void __sched notrace __schedule(bool preempt)
raw_spin_lock(&rq->lock);
rq_pin_lock(rq, &rf);
 
-   rq->clock_skip_update <<= 1; /* promote REQ to ACT */
+   rq->clock_update_flags <<= 1; /* promote REQ to ACT */
 
switch_count = &prev->nivcsw;
if (!preempt && prev->state) {
@@ -3328,7 +3332,8 @@ static void __sched notrace __schedule(bool preempt)
trace_sched_switch(preempt, prev, next);
rq = context_switch(rq, prev, next, &rf); /* unlocks the rq */
} else {
-   rq->clock_skip_update = 0;
+   /* Clear ACT, preserve everything else */
+   rq->clock_update_flags ^= RQCF_ACT_SKIP;
rq_unpin_lock(rq, &rf);
raw_spin_unlock_irq(&rq->lock);
}
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index cbeb830c7ac6..072c020cd8e3 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -628,7 +628,7 @@ struct rq {
unsigned long next_balance;
struct mm_struct *prev_mm;
 
-   unsigned int clock_skip_update;
+   unsigned int clock_update_flags;
u64 clock;
u64 clock_task;
 
@@ -735,9 +735,45 @@ static inline u64 __rq_clock_broken(struct rq *rq)
return READ_ONCE(rq->clock);
 }
 
+/*
+ * rq::clock_update_flags bits
+ *
+ * %RQCF_REQ_SKIP - will request skipping of clock update on the next
+ *  call to __schedule(). This is an optimisation to avoid
+ *  neighbouring rq clock updates.
+ *
+ * %RQCF_ACT_SKIP - is set from inside of __schedule() when skipping is
+ *  in effect and calls to update_rq_clock() are being ignored.
+ *
+ * %RQCF_UPDATED - is a debug flag that indicates whether a call has been
+ *  made to update_rq_clock() since the last time rq::lock was pinned.
+ *
+ * If inside of __schedule(), clock_update_flags will have been
+ * shifted left (a left shift is a cheap operation for the fast path
+ * to promote %RQCF_REQ_SKIP to %RQCF_ACT_SKIP), so you must use,
+ *
+ * if (rq-clock_update_flags >= RQCF_UPDATED)
+ *
+ * to check if %RQCF_UPADTED is set. It'll never be shifted more than
+ * one position though, because the next rq_unpin_lock() will shift it
+ * back.
+ */
+#define RQCF_REQ_SKIP  0x01
+#define RQCF_ACT_SKIP  0x02
+#define RQCF_UPDATED   0x04
+
 static inline u64 rq_clock(struct rq *rq)
 {
loc