Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

2017-02-22 Thread Wanpeng Li
2017-02-02 23:55 GMT+08:00 Peter Zijlstra: > On Tue, Jan 31, 2017 at 10:22:47AM -0700, Ross Zwisler wrote: >> On Tue, Jan 31, 2017 at 4:48 AM, Mike Galbraith wrote: >> > On Tue, 2017-01-31 at 16:30 +0530, Sachin Sant wrote: > > > Could some of you test this? It seems to cure things in my (very) >

Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

2017-02-06 Thread Paul E. McKenney
On Mon, Feb 06, 2017 at 07:10:48AM -0800, Paul E. McKenney wrote: > On Mon, Feb 06, 2017 at 11:53:10AM +0530, Sachin Sant wrote: > > > > >>> I've seen it on tip. It looks like hot unplug goes really slow when > > >>> there's running tasks on the CPU being taken down. > > >>> > > >>> What I did wa

Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

2017-02-06 Thread Paul E. McKenney
On Mon, Feb 06, 2017 at 11:53:10AM +0530, Sachin Sant wrote: > > >>> I've seen it on tip. It looks like hot unplug goes really slow when > >>> there's running tasks on the CPU being taken down. > >>> > >>> What I did was something like: > >>> > >>> taskset -p $((1<<1)) $$ > >>> for ((i=0; i<20

Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

2017-02-05 Thread Sachin Sant
>>> I've seen it on tip. It looks like hot unplug goes really slow when >>> there's running tasks on the CPU being taken down. >>> >>> What I did was something like: >>> >>> taskset -p $((1<<1)) $$ >>> for ((i=0; i<20; i++)) do while :; do :; done & done >>> >>> taskset -p $((1<<0)) $$ >>>

Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

2017-02-03 Thread Paul E. McKenney
On Fri, Feb 03, 2017 at 07:44:57AM -0800, Paul E. McKenney wrote: > On Fri, Feb 03, 2017 at 02:37:48PM +0100, Peter Zijlstra wrote: > > On Fri, Feb 03, 2017 at 01:59:34PM +0100, Mike Galbraith wrote: > > > On Fri, 2017-02-03 at 09:53 +0100, Peter Zijlstra wrote: > > > > On Fri, Feb 03, 2017 at 10:0

Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

2017-02-03 Thread Paul E. McKenney
On Fri, Feb 03, 2017 at 02:37:48PM +0100, Peter Zijlstra wrote: > On Fri, Feb 03, 2017 at 01:59:34PM +0100, Mike Galbraith wrote: > > On Fri, 2017-02-03 at 09:53 +0100, Peter Zijlstra wrote: > > > On Fri, Feb 03, 2017 at 10:03:14AM +0530, Sachin Sant wrote: > > > > > > I ran few cycles of cpu hot(

Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

2017-02-03 Thread Sachin Sant
[  173.493453] INFO: rcu_sched detected stalls on CPUs/tasks: [  173.493473] 8-...: (2 GPs behind) idle=006/140/0 softirq=0/0 fqs=2996 [  173.493476] (detected by 0, t=6002 jiffies, g=885, c=884, q=6350) Right, I actually saw that too, but I don't think that would be related to my patch.

Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

2017-02-03 Thread Mike Galbraith
On Fri, 2017-02-03 at 14:37 +0100, Peter Zijlstra wrote: > On Fri, Feb 03, 2017 at 01:59:34PM +0100, Mike Galbraith wrote: > > FWIW, I'm not seeing stalls/hangs while beating hotplug up in tip. (so > > next grew a wart?) > > I've seen it on tip. It looks like hot unplug goes really slow when > th

Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

2017-02-03 Thread Peter Zijlstra
On Fri, Feb 03, 2017 at 01:59:34PM +0100, Mike Galbraith wrote: > On Fri, 2017-02-03 at 09:53 +0100, Peter Zijlstra wrote: > > On Fri, Feb 03, 2017 at 10:03:14AM +0530, Sachin Sant wrote: > > > > I ran few cycles of cpu hot(un)plug tests. In most cases it works except > > > one > > > where I ran

Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

2017-02-03 Thread Borislav Petkov
On Thu, Feb 02, 2017 at 04:55:06PM +0100, Peter Zijlstra wrote: > On Tue, Jan 31, 2017 at 10:22:47AM -0700, Ross Zwisler wrote: > > On Tue, Jan 31, 2017 at 4:48 AM, Mike Galbraith wrote: > > > On Tue, 2017-01-31 at 16:30 +0530, Sachin Sant wrote: > > > Could some of you test this? It seems to cu

Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

2017-02-03 Thread Mike Galbraith
On Fri, 2017-02-03 at 09:53 +0100, Peter Zijlstra wrote: > On Fri, Feb 03, 2017 at 10:03:14AM +0530, Sachin Sant wrote: > > I ran few cycles of cpu hot(un)plug tests. In most cases it works except one > > where I ran into rcu stall: > > > > [ 173.493453] INFO: rcu_sched detected stalls on CPUs/t

Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

2017-02-03 Thread Peter Zijlstra
On Fri, Feb 03, 2017 at 10:03:14AM +0530, Sachin Sant wrote: > > > On 02-Feb-2017, at 9:25 PM, Peter Zijlstra wrote: > > > > On Tue, Jan 31, 2017 at 10:22:47AM -0700, Ross Zwisler wrote: > >> On Tue, Jan 31, 2017 at 4:48 AM, Mike Galbraith wrote: > >>> On Tue, 2017-01-31 at 16:30 +0530, Sachin

Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

2017-02-02 Thread Sachin Sant
> On 02-Feb-2017, at 9:25 PM, Peter Zijlstra wrote: > > On Tue, Jan 31, 2017 at 10:22:47AM -0700, Ross Zwisler wrote: >> On Tue, Jan 31, 2017 at 4:48 AM, Mike Galbraith wrote: >>> On Tue, 2017-01-31 at 16:30 +0530, Sachin Sant wrote: > > > Could some of you test this? It seems to cure things

Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

2017-02-02 Thread Mike Galbraith
On Thu, 2017-02-02 at 16:55 +0100, Peter Zijlstra wrote: > On Tue, Jan 31, 2017 at 10:22:47AM -0700, Ross Zwisler wrote: > > On Tue, Jan 31, 2017 at 4:48 AM, Mike Galbraith > > wrote: > > > On Tue, 2017-01-31 at 16:30 +0530, Sachin Sant wrote: > > > Could some of you test this? It seems to cure

Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

2017-02-02 Thread Matt Fleming
On Thu, 02 Feb, at 04:55:06PM, Peter Zijlstra wrote: > On Tue, Jan 31, 2017 at 10:22:47AM -0700, Ross Zwisler wrote: > > On Tue, Jan 31, 2017 at 4:48 AM, Mike Galbraith wrote: > > > On Tue, 2017-01-31 at 16:30 +0530, Sachin Sant wrote: > > > Could some of you test this? It seems to cure things i

Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

2017-02-02 Thread Peter Zijlstra
On Tue, Jan 31, 2017 at 10:22:47AM -0700, Ross Zwisler wrote: > On Tue, Jan 31, 2017 at 4:48 AM, Mike Galbraith wrote: > > On Tue, 2017-01-31 at 16:30 +0530, Sachin Sant wrote: Could some of you test this? It seems to cure things in my (very) limited testing. --- diff --git a/kernel/sched/core.

Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

2017-01-31 Thread Ross Zwisler
On Tue, Jan 31, 2017 at 4:48 AM, Mike Galbraith wrote: > On Tue, 2017-01-31 at 16:30 +0530, Sachin Sant wrote: >> Trimming the cc list. >> >> > > I assume I should be worried? >> > >> > Thanks for the report. No need to worry, the bug has existed for a >> > while, this patch just turns on the warn

Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

2017-01-31 Thread Mike Galbraith
On Tue, 2017-01-31 at 16:30 +0530, Sachin Sant wrote: > Trimming the cc list. > > > > I assume I should be worried? > > > > Thanks for the report. No need to worry, the bug has existed for a > > while, this patch just turns on the warning ;-) > > > > The following commit queued up in tip/sched/c

Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

2017-01-31 Thread Sachin Sant
Trimming the cc list. >> I assume I should be worried? > > Thanks for the report. No need to worry, the bug has existed for a > while, this patch just turns on the warning ;-) > > The following commit queued up in tip/sched/core should fix your > issues (assuming you see the same callstack on al

Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

2017-01-31 Thread Michael Ellerman
Matt Fleming writes: > On Tue, 31 Jan, at 08:24:53AM, Michael Ellerman wrote: >> >> I'm hitting this on multiple powerpc systems: >> >> [ 38.339126] rq->clock_update_flags < RQCF_ACT_SKIP >> [ 38.339134] [ cut here ] >> [ 38.339142] WARNING: CPU: 2 PID: 1 at kernel

Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

2017-01-30 Thread Matt Fleming
On Tue, 31 Jan, at 08:24:53AM, Michael Ellerman wrote: > > I'm hitting this on multiple powerpc systems: > > [ 38.339126] rq->clock_update_flags < RQCF_ACT_SKIP > [ 38.339134] [ cut here ] > [ 38.339142] WARNING: CPU: 2 PID: 1 at kernel/sched/sched.h:804 > detach_ta

Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

2017-01-30 Thread Michael Ellerman
tip-bot for Matt Fleming writes: > diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h > index 98e7eee..6eeae7e 100644 > --- a/kernel/sched/sched.h > +++ b/kernel/sched/sched.h > @@ -768,48 +768,110 @@ static inline u64 __rq_clock_broken(struct rq *rq) > return READ_ONCE(rq->clock); >