Re: [BUG nohz]: wrong user and system time accounting

2017-06-29 Thread Frederic Weisbecker
On Mon, May 15, 2017 at 04:17:10PM +0800, Wanpeng Li wrote:
> Ping,
Sorry for the late answer, I was focused on some other bugs. Since my ideas weren't even clear on that issue yet, I took your patch and enhanced the code around it. I just posted a new series with it, please have a look.

Re: [BUG nohz]: wrong user and system time accounting

2017-05-15 Thread Wanpeng Li
Ping, 2017-05-02 18:01 GMT+08:00 Wanpeng Li : > Cc Paolo, > 2017-04-13 21:32 GMT+08:00 Frederic Weisbecker : >> On Thu, Apr 13, 2017 at 12:31:12PM +0800, Wanpeng Li wrote: >>> 2017-04-12 22:57 GMT+08:00 Thomas Gleixner : >>> > On Wed, 12 Apr 2017, Frederic Weisbecker wrote: >>> >> On Tue, Apr 11,

Re: [BUG nohz]: wrong user and system time accounting

2017-05-02 Thread Wanpeng Li
Cc Paolo, 2017-04-13 21:32 GMT+08:00 Frederic Weisbecker : > On Thu, Apr 13, 2017 at 12:31:12PM +0800, Wanpeng Li wrote: >> 2017-04-12 22:57 GMT+08:00 Thomas Gleixner : >> > On Wed, 12 Apr 2017, Frederic Weisbecker wrote: >> >> On Tue, Apr 11, 2017 at 04:22:48PM +0200, Thomas Gleixner wrote: >> >>

Re: [BUG nohz]: wrong user and system time accounting

2017-04-13 Thread Frederic Weisbecker
On Thu, Apr 13, 2017 at 12:31:12PM +0800, Wanpeng Li wrote: > 2017-04-12 22:57 GMT+08:00 Thomas Gleixner : > > On Wed, 12 Apr 2017, Frederic Weisbecker wrote: > >> On Tue, Apr 11, 2017 at 04:22:48PM +0200, Thomas Gleixner wrote: > >> > It's not different from the current jiffies based stuff at

Re: [BUG nohz]: wrong user and system time accounting

2017-04-12 Thread Wanpeng Li
2017-04-12 22:57 GMT+08:00 Thomas Gleixner : > On Wed, 12 Apr 2017, Frederic Weisbecker wrote: >> On Tue, Apr 11, 2017 at 04:22:48PM +0200, Thomas Gleixner wrote: >> > It's not different from the current jiffies based stuff at all. Same >> > failure mode. >> >> Yes you're right, I got confused

Re: [BUG nohz]: wrong user and system time accounting

2017-04-12 Thread Frederic Weisbecker
On Wed, Apr 12, 2017 at 04:57:58PM +0200, Thomas Gleixner wrote: > On Wed, 12 Apr 2017, Frederic Weisbecker wrote: > > On Tue, Apr 11, 2017 at 04:22:48PM +0200, Thomas Gleixner wrote: > > > It's not different from the current jiffies based stuff at all. Same > > > failure mode. > > > > Yes you're

Re: [BUG nohz]: wrong user and system time accounting

2017-04-12 Thread Thomas Gleixner
On Wed, 12 Apr 2017, Frederic Weisbecker wrote: > On Tue, Apr 11, 2017 at 04:22:48PM +0200, Thomas Gleixner wrote: > > It's not different from the current jiffies based stuff at all. Same > > failure mode. > > Yes you're right, I got confused again. So to fix this we could do our > snapshots >

Re: [BUG nohz]: wrong user and system time accounting

2017-04-12 Thread Frederic Weisbecker
On Tue, Apr 11, 2017 at 04:22:48PM +0200, Thomas Gleixner wrote: > On Thu, 30 Mar 2017, Wanpeng Li wrote: > > diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c > > index f3778e2b..f1ee393 100644 > > --- a/kernel/sched/cputime.c > > +++ b/kernel/sched/cputime.c > > @@ -676,18 +676,21 @@

Re: [BUG nohz]: wrong user and system time accounting

2017-04-11 Thread Thomas Gleixner
On Thu, 30 Mar 2017, Wanpeng Li wrote: > diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c > index f3778e2b..f1ee393 100644 > --- a/kernel/sched/cputime.c > +++ b/kernel/sched/cputime.c > @@ -676,18 +676,21 @@ void thread_group_cputime_adjusted(struct > task_struct *p, u64 *ut, u64 *st)

Re: [BUG nohz]: wrong user and system time accounting

2017-04-11 Thread Wanpeng Li
2017-04-11 19:36 GMT+08:00 Peter Zijlstra : > On Tue, Apr 11, 2017 at 07:03:17PM +0800, Wanpeng Li wrote: >> 2017-03-30 21:38 GMT+08:00 Frederic Weisbecker : >> > On Thu, Mar 30, 2017 at 02:47:11PM +0800, Wanpeng Li wrote: >> >> [...] >> >> > >> >> >> >>

Re: [BUG nohz]: wrong user and system time accounting

2017-04-11 Thread Peter Zijlstra
On Tue, Apr 11, 2017 at 07:03:17PM +0800, Wanpeng Li wrote: > 2017-03-30 21:38 GMT+08:00 Frederic Weisbecker : > > On Thu, Mar 30, 2017 at 02:47:11PM +0800, Wanpeng Li wrote: > > [...] > > > > >> > >> -->8- >

Re: [BUG nohz]: wrong user and system time accounting

2017-04-11 Thread Wanpeng Li
2017-03-30 21:38 GMT+08:00 Frederic Weisbecker : > On Thu, Mar 30, 2017 at 02:47:11PM +0800, Wanpeng Li wrote: [...] > >> >> -->8- >> >> use nanosecond granularity to check deltas but only perform an actual

Re: [BUG nohz]: wrong user and system time accounting

2017-04-05 Thread Rik van Riel
On Tue, 2017-04-04 at 13:36 -0400, Luiz Capitulino wrote: >  > On further debugging this, I realized that I had overlooked > something: > the timer interrupt in this trace is not the tick, but cyclictest's > timer > (remember that the test-case consists of pinning cyclictest and a > task > hogging

Re: [BUG nohz]: wrong user and system time accounting

2017-04-04 Thread Luiz Capitulino
On Mon, 3 Apr 2017 15:06:13 -0400 Luiz Capitulino wrote: > On Mon, 3 Apr 2017 17:23:17 +0200 > Frederic Weisbecker wrote: > > > Do you observe aligned ticks with trace events (hrtimer_expire_entry)? > > > > You might want to enforce the global clock to trace that: > > > > echo "global" >

Re: [BUG nohz]: wrong user and system time accounting

2017-04-04 Thread Mike Galbraith
On Mon, 2017-04-03 at 16:40 +0200, Frederic Weisbecker wrote: > On Thu, Mar 30, 2017 at 03:35:22PM +0200, Mike Galbraith wrote: > Nohz_full is already bad for powersavings anyway. CPU 0 always ticks :-) OTOH, if a nohz_full set is doing what it was born to do, CPU0 tick spikes won't be

Re: [BUG nohz]: wrong user and system time accounting

2017-04-03 Thread Luiz Capitulino
On Mon, 3 Apr 2017 17:23:17 +0200 Frederic Weisbecker wrote: > Do you observe aligned ticks with trace events (hrtimer_expire_entry)? > > You might want to enforce the global clock to trace that: > > echo "global" > /sys/kernel/debug/tracing/trace_clock I've used the same trace points &

Re: [BUG nohz]: wrong user and system time accounting

2017-04-03 Thread Frederic Weisbecker
On Fri, Mar 31, 2017 at 11:11:19PM -0400, Luiz Capitulino wrote: > On Sat, 1 Apr 2017 01:24:54 +0200 > Frederic Weisbecker wrote: > > > On Fri, Mar 31, 2017 at 04:09:10PM -0400, Luiz Capitulino wrote: > > > On Thu, 30 Mar 2017 17:25:46 -0400 > > > Luiz Capitulino wrote: > > > > > > > On Thu,

Re: [BUG nohz]: wrong user and system time accounting

2017-04-03 Thread Frederic Weisbecker
On Thu, Mar 30, 2017 at 03:35:22PM +0200, Mike Galbraith wrote: > On Thu, 2017-03-30 at 09:02 -0400, Rik van Riel wrote: > > On Thu, 2017-03-30 at 14:51 +0200, Frederic Weisbecker wrote: > > > > Also, why does it raise power consumption issues? > > > > On a system without either nohz_full or

Re: [BUG nohz]: wrong user and system time accounting

2017-03-31 Thread Luiz Capitulino
On Sat, 1 Apr 2017 01:24:54 +0200 Frederic Weisbecker wrote: > On Fri, Mar 31, 2017 at 04:09:10PM -0400, Luiz Capitulino wrote: > > On Thu, 30 Mar 2017 17:25:46 -0400 > > Luiz Capitulino wrote: > > > > > On Thu, 30 Mar 2017 16:18:17 +0200 > > > Frederic Weisbecker wrote: > > > > > > > On

Re: [BUG nohz]: wrong user and system time accounting

2017-03-31 Thread Frederic Weisbecker
On Fri, Mar 31, 2017 at 04:09:10PM -0400, Luiz Capitulino wrote: > On Thu, 30 Mar 2017 17:25:46 -0400 > Luiz Capitulino wrote: > > > On Thu, 30 Mar 2017 16:18:17 +0200 > > Frederic Weisbecker wrote: > > > > > On Thu, Mar 30, 2017 at 09:59:54PM +0800, Wanpeng Li wrote: > > > > 2017-03-30

Re: [BUG nohz]: wrong user and system time accounting

2017-03-31 Thread Luiz Capitulino
On Thu, 30 Mar 2017 17:25:46 -0400 Luiz Capitulino wrote: > On Thu, 30 Mar 2017 16:18:17 +0200 > Frederic Weisbecker wrote: > > > On Thu, Mar 30, 2017 at 09:59:54PM +0800, Wanpeng Li wrote: > > > 2017-03-30 21:38 GMT+08:00 Frederic Weisbecker : > > > > If it works, we may want to take

Re: [BUG nohz]: wrong user and system time accounting

2017-03-30 Thread Luiz Capitulino
On Thu, 30 Mar 2017 16:18:17 +0200 Frederic Weisbecker wrote: > On Thu, Mar 30, 2017 at 09:59:54PM +0800, Wanpeng Li wrote: > > 2017-03-30 21:38 GMT+08:00 Frederic Weisbecker : > > > If it works, we may want to take that solution, likely less performance > > > sensitive > > > than using

Re: [BUG nohz]: wrong user and system time accounting

2017-03-30 Thread Frederic Weisbecker
On Thu, Mar 30, 2017 at 09:59:54PM +0800, Wanpeng Li wrote: > 2017-03-30 21:38 GMT+08:00 Frederic Weisbecker : > > If it works, we may want to take that solution, likely less performance > > sensitive > > than using sched_clock(). In fact sched_clock() is fast, especially as we > > require it to

Re: [BUG nohz]: wrong user and system time accounting

2017-03-30 Thread Wanpeng Li
2017-03-30 21:38 GMT+08:00 Frederic Weisbecker : > On Thu, Mar 30, 2017 at 02:47:11PM +0800, Wanpeng Li wrote: >> Cc Peterz, Thomas, >> 2017-03-30 12:27 GMT+08:00 Mike Galbraith : >> > On Wed, 2017-03-29 at 16:08 -0400, Rik van Riel wrote: >> > >> >> In other words, the tick on cpu0 is aligned >>

Re: [BUG nohz]: wrong user and system time accounting

2017-03-30 Thread Frederic Weisbecker
On Thu, Mar 30, 2017 at 09:02:31AM -0400, Rik van Riel wrote: > On Thu, 2017-03-30 at 14:51 +0200, Frederic Weisbecker wrote: > > On Thu, Mar 30, 2017 at 06:27:31AM +0200, Mike Galbraith wrote: > > > On Wed, 2017-03-29 at 16:08 -0400, Rik van Riel wrote: > > > > > > > A random offset, or better

Re: [BUG nohz]: wrong user and system time accounting

2017-03-30 Thread Frederic Weisbecker
On Thu, Mar 30, 2017 at 02:47:11PM +0800, Wanpeng Li wrote: > Cc Peterz, Thomas, > 2017-03-30 12:27 GMT+08:00 Mike Galbraith : > > On Wed, 2017-03-29 at 16:08 -0400, Rik van Riel wrote: > > > >> In other words, the tick on cpu0 is aligned > >> with the tick on the nohz_full cpus, and > >> jiffies

Re: [BUG nohz]: wrong user and system time accounting

2017-03-30 Thread Mike Galbraith
On Thu, 2017-03-30 at 09:02 -0400, Rik van Riel wrote: > On Thu, 2017-03-30 at 14:51 +0200, Frederic Weisbecker wrote: > > Also, why does it raise power consumption issues? > > On a system without either nohz_full or nohz idle > mode, skewed ticks result in CPU cores waking up > at different

Re: [BUG nohz]: wrong user and system time accounting

2017-03-30 Thread Mike Galbraith
On Thu, 2017-03-30 at 14:40 +0200, Frederic Weisbecker wrote: > On Thu, Mar 30, 2017 at 09:58:44AM +0800, Wanpeng Li wrote: > > There is such a feature skew_tick currently, refer to commit > > 5307c9556bc (tick: add tick skew boot option), w/ skew_tick=1 boot > > parameter, the bug disappear,
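For readers unfamiliar with the skew_tick option mentioned above: as far as I recall, it staggers each CPU's tick within half a tick period so that ticks do not all fire at the same instant. The sketch below only illustrates that idea. The formula, the function name skewed_tick_offset_ns and the HZ=1000 default are assumptions on my part and should be checked against the kernel's tick setup code before relying on them.

```python
# Hypothetical illustration of per-CPU tick skewing (not kernel code).
# Assumption: each CPU's tick is delayed by cpu * (half a tick / number of CPUs).

def skewed_tick_offset_ns(cpu, n_cpus, tick_ns=1_000_000):
    """Return the assumed tick offset in nanoseconds for the given CPU index."""
    half_tick = tick_ns // 2
    return (half_tick // n_cpus) * cpu


if __name__ == "__main__":
    for cpu in range(4):
        print(cpu, skewed_tick_offset_ns(cpu, 4))
```

With such an offset, the tick on a nohz_full CPU would no longer coincide with the tick on cpu0, which would be consistent with the report that the bug disappears with skew_tick=1.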

Re: [BUG nohz]: wrong user and system time accounting

2017-03-30 Thread Rik van Riel
On Thu, 2017-03-30 at 14:51 +0200, Frederic Weisbecker wrote: > On Thu, Mar 30, 2017 at 06:27:31AM +0200, Mike Galbraith wrote: > > On Wed, 2017-03-29 at 16:08 -0400, Rik van Riel wrote: > > > > > A random offset, or better yet a somewhat randomized > > > tick length to make sure that

Re: [BUG nohz]: wrong user and system time accounting

2017-03-30 Thread Rik van Riel
On Thu, 2017-03-30 at 00:54 +0200, Frederic Weisbecker wrote: > (Adding Thomas in Cc) > > On Wed, Mar 29, 2017 at 04:08:45PM -0400, Rik van Riel wrote: > >  > > Frederic, can you think of any reason why > > the tick on nohz_full CPUs would end up aligned > > with the tick on cpu0, instead of

Re: [BUG nohz]: wrong user and system time accounting

2017-03-30 Thread Frederic Weisbecker
On Thu, Mar 30, 2017 at 06:27:31AM +0200, Mike Galbraith wrote: > On Wed, 2017-03-29 at 16:08 -0400, Rik van Riel wrote: > > > A random offset, or better yet a somewhat randomized > > tick length to make sure that simultaneous ticks are > > fairly rare and the vtime sampling does not end up > >

Re: [BUG nohz]: wrong user and system time accounting

2017-03-30 Thread Frederic Weisbecker
On Thu, Mar 30, 2017 at 09:58:44AM +0800, Wanpeng Li wrote: > 2017-03-30 4:08 GMT+08:00 Rik van Riel : > > > > In other words, the tick on cpu0 is aligned > > with the tick on the nohz_full cpus, and > > jiffies is advanced while the nohz_full cpus > > with an active tick happen to be in kernel >

Re: [BUG nohz]: wrong user and system time accounting

2017-03-30 Thread Mike Galbraith
On Thu, 2017-03-30 at 19:52 +0800, Wanpeng Li wrote:
> If we should just add random offset to the cpu in the nohz_full mode?
Up to you, whatever works best. I left the regular skew alone, just added some noise to scheduler_tick_max_deferment().
-Mike

Re: [BUG nohz]: wrong user and system time accounting

2017-03-30 Thread Wanpeng Li
2017-03-30 10:14 GMT+08:00 Luiz Capitulino : > On Thu, 30 Mar 2017 06:46:30 +0800 > Wanpeng Li wrote: > >> > So! Now we need to find a proper fix :o) >> > >> > Hmm, how bad would it be to revert to sched_clock() instead of jiffies in >> > vtime_delta()? >> > We could use nanosecond granularity

Re: [BUG nohz]: wrong user and system time accounting

2017-03-30 Thread Wanpeng Li
2017-03-30 14:47 GMT+08:00 Wanpeng Li : > Cc Peterz, Thomas, > 2017-03-30 12:27 GMT+08:00 Mike Galbraith : >> On Wed, 2017-03-29 at 16:08 -0400, Rik van Riel wrote: >> >>> In other words, the tick on cpu0 is aligned >>> with the tick on the nohz_full cpus, and >>> jiffies is advanced while the

Re: [BUG nohz]: wrong user and system time accounting

2017-03-30 Thread Wanpeng Li
Cc Peterz, Thomas, 2017-03-30 12:27 GMT+08:00 Mike Galbraith : > On Wed, 2017-03-29 at 16:08 -0400, Rik van Riel wrote: > >> In other words, the tick on cpu0 is aligned >> with the tick on the nohz_full cpus, and >> jiffies is advanced while the nohz_full cpus >> with an active tick happen to be

Re: [BUG nohz]: wrong user and system time accounting

2017-03-29 Thread Mike Galbraith
On Wed, 2017-03-29 at 16:08 -0400, Rik van Riel wrote: > In other words, the tick on cpu0 is aligned > with the tick on the nohz_full cpus, and > jiffies is advanced while the nohz_full cpus > with an active tick happen to be in kernel > mode? You really want skew_tick=1, especially on big
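The aligned-tick theory quoted above can be illustrated with a toy model. This is only a sketch, not kernel code. It assumes HZ=1000, that cpu0 advances jiffies a few microseconds after its tick fires, and that the nohz_full CPU reads jiffies on every user/kernel transition and charges the elapsed jiffies to the mode it is leaving. The names jiffies_at and account, and all constants, are made up for illustration.

```python
# Toy model of jiffies-based vtime accounting on a nohz_full CPU.
# All constants are illustrative assumptions, not measured values.

TICK_US = 1000        # tick period in microseconds (HZ=1000 assumed)
JIFFIES_LAG_US = 5    # assumed delay before cpu0's tick handler bumps jiffies
HANDLER_US = 10       # assumed time the local CPU spends in its own tick handler


def jiffies_at(t_us):
    """Global jiffies value visible at time t_us (advanced by cpu0)."""
    return (t_us - JIFFIES_LAG_US) // TICK_US


def account(skew_us, n_ticks=1000):
    """Return (user, system) jiffies charged to a task that runs in user
    space except while the local tick handler executes.

    skew_us is the offset of the local tick relative to cpu0's tick."""
    user = system = 0
    snapshot = jiffies_at(skew_us + HANDLER_US)  # task starts right after a tick
    for k in range(1, n_ticks + 1):
        entry = k * TICK_US + skew_us            # local tick fires: user -> kernel
        now = jiffies_at(entry)
        user += now - snapshot                   # elapsed jiffies charged to user
        snapshot = now
        exit_ = entry + HANDLER_US               # handler returns: kernel -> user
        now = jiffies_at(exit_)
        system += now - snapshot                 # elapsed jiffies charged to system
        snapshot = now
    return user, system


if __name__ == "__main__":
    print("aligned:", account(0))
    print("skewed: ", account(500))
```

Under these assumptions, the aligned case charges every jiffy to system time even though the task spends about 99% of each tick in user space, because jiffies only ever changes while the CPU is inside its tick handler. With the local tick skewed by half a period, jiffies changes while the task is in user space and every jiffy is charged to user time.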

Re: [BUG nohz]: wrong user and system time accounting

2017-03-29 Thread Luiz Capitulino
On Thu, 30 Mar 2017 06:46:30 +0800 Wanpeng Li wrote: > > So! Now we need to find a proper fix :o) > > > > Hmm, how bad would it be to revert to sched_clock() instead of jiffies in > > vtime_delta()? > > We could use nanosecond granularity to check deltas but only perform an > > actual cputime
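To see why nanosecond-granularity deltas might avoid the misattribution, here is a toy model of the same kind of workload with a nanosecond clock standing in for sched_clock(). It is a sketch under assumptions (HZ=1000, a 10 us tick handler, the hypothetical name account_ns), not the proposed patch. The quoted snippet is cut off before it says when the actual cputime update would be performed, so that part is deliberately not modeled here.

```python
# Toy model: deltas are taken from a nanosecond clock instead of jiffies.
# Constants are illustrative assumptions.

TICK_NS = 1_000_000   # tick period in nanoseconds (HZ=1000 assumed)
HANDLER_NS = 10_000   # assumed time spent in the local tick handler


def account_ns(skew_ns, n_ticks=1000):
    """Return (user_ns, system_ns) for a task that runs in user space
    except while the local tick handler executes."""
    user = system = 0
    snapshot = skew_ns + HANDLER_NS              # task starts right after a tick
    for k in range(1, n_ticks + 1):
        entry = k * TICK_NS + skew_ns            # user -> kernel
        user += entry - snapshot                 # time since last exit is user time
        snapshot = entry
        exit_ = entry + HANDLER_NS               # kernel -> user
        system += exit_ - snapshot               # time inside the handler is system time
        snapshot = exit_
    return user, system


if __name__ == "__main__":
    print(account_ns(0))
    print(account_ns(500_000))
```

In this model the split no longer depends on how the local tick lines up with cpu0's tick, since each transition measures the real elapsed time rather than whether a jiffy boundary happened to be crossed.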

Re: [BUG nohz]: wrong user and system time accounting

2017-03-29 Thread Wanpeng Li
2017-03-30 4:08 GMT+08:00 Rik van Riel : > On Wed, 2017-03-29 at 13:16 -0400, Luiz Capitulino wrote: >> On Tue, 28 Mar 2017 13:24:06 -0400 >> Luiz Capitulino wrote: >> >> > 1. In my tracing I'm seeing that sometimes (always?) the >> > time interval between two timer interrupts is less than

Re: [BUG nohz]: wrong user and system time accounting

2017-03-29 Thread Luiz Capitulino
On Wed, 29 Mar 2017 23:12:00 +0200 Frederic Weisbecker wrote: > On Wed, Mar 29, 2017 at 09:23:57AM -0400, Luiz Capitulino wrote: > > > > There are various reproducers actually. I started off with the simple > > loop above, then wrote the attach program and then wrote the one > > you're

Re: [BUG nohz]: wrong user and system time accounting

2017-03-29 Thread Frederic Weisbecker
(Adding Thomas in Cc) On Wed, Mar 29, 2017 at 04:08:45PM -0400, Rik van Riel wrote: > On Wed, 2017-03-29 at 13:16 -0400, Luiz Capitulino wrote: > > On Tue, 28 Mar 2017 13:24:06 -0400 > > Luiz Capitulino wrote: > > > > >  1. In my tracing I'm seeing that sometimes (always?) the > > > time

Re: [BUG nohz]: wrong user and system time accounting

2017-03-29 Thread Wanpeng Li
2017-03-30 6:17 GMT+08:00 Frederic Weisbecker : > On Wed, Mar 29, 2017 at 01:16:56PM -0400, Luiz Capitulino wrote: >> On Tue, 28 Mar 2017 13:24:06 -0400 >> Luiz Capitulino wrote: >> >> > 1. In my tracing I'm seeing that sometimes (always?) the >> > time interval between two timer interrupts

Re: [BUG nohz]: wrong user and system time accounting

2017-03-29 Thread Frederic Weisbecker
On Wed, Mar 29, 2017 at 09:23:57AM -0400, Luiz Capitulino wrote: > > There are various reproducers actually. I started off with the simple > loop above, then wrote the attach program and then wrote the one > you're mentioning: > > http://people.redhat.com/~lcapitul/real-time/acct-bug.c > > All

Re: [BUG nohz]: wrong user and system time accounting

2017-03-29 Thread Rik van Riel
On Wed, 2017-03-29 at 13:16 -0400, Luiz Capitulino wrote: > On Tue, 28 Mar 2017 13:24:06 -0400 > Luiz Capitulino wrote: > > >  1. In my tracing I'm seeing that sometimes (always?) the > > time interval between two timer interrupts is less than 1ms > > I think that's the root cause. >  > In

Re: [BUG nohz]: wrong user and system time accounting

2017-03-29 Thread Luiz Capitulino
On Wed, 29 Mar 2017 09:14:32 -0400 Rik van Riel wrote: > > I failed to reproduce with your config. I'm still getting 99% > > userspace > > cputime. So I'm wondering if the hogging style plays a role. > > > > I run pure user loops: > > > > int main(int argc, char **argv) > > { > >   

Re: [BUG nohz]: wrong user and system time accounting

2017-03-29 Thread Rik van Riel
On Wed, 2017-03-29 at 15:04 +0200, Frederic Weisbecker wrote: > On Thu, Mar 23, 2017 at 04:55:12PM -0400, Luiz Capitulino wrote: > > > > When there are two or more tasks executing in user-space and > > taking 100% of a nohz_full CPU, top reports 70% system time > > and 30% user time utilization.

Re: [BUG nohz]: wrong user and system time accounting

2017-03-29 Thread Frederic Weisbecker
On Thu, Mar 23, 2017 at 04:55:12PM -0400, Luiz Capitulino wrote: > > When there are two or more tasks executing in user-space and > taking 100% of a nohz_full CPU, top reports 70% system time > and 30% user time utilization. Sometimes I'm even able to get > 100% system time and 0% user time. > >

Re: [BUG nohz]: wrong user and system time accounting

2017-03-29 Thread Frederic Weisbecker
On Wed, Mar 29, 2017 at 05:56:30PM +0800, Wanpeng Li wrote: > 2017-03-29 5:26 GMT+08:00 Luiz Capitulino : > > On Tue, 28 Mar 2017 17:01:52 -0400 > > Rik van Riel wrote: > > > >> On Tue, 2017-03-28 at 16:14 -0400, Luiz Capitulino wrote: > >> > On Tue, 28 Mar 2017 13:24:06 -0400 > >> > Luiz

Re: [BUG nohz]: wrong user and system time accounting

2017-03-29 Thread Wanpeng Li
2017-03-29 5:26 GMT+08:00 Luiz Capitulino : > On Tue, 28 Mar 2017 17:01:52 -0400 > Rik van Riel wrote: > >> On Tue, 2017-03-28 at 16:14 -0400, Luiz Capitulino wrote: >> > On Tue, 28 Mar 2017 13:24:06 -0400 >> > Luiz Capitulino wrote: >> > > I'm starting to suspect that the nohz code may be

Re: [BUG nohz]: wrong user and system time accounting

2017-03-28 Thread Luiz Capitulino
On Tue, 28 Mar 2017 17:24:11 -0400 Rik van Riel wrote: > On Tue, 2017-03-28 at 16:14 -0400, Luiz Capitulino wrote: > > > And I think I was right, it looks like the nohz code is programming > > the tick period incorrectly when restarting the tick. The patch below > > fixes things for me, but I
