On Sun, Mar 04, 2018 at 11:28:56PM +0100, Rafael J. Wysocki wrote: > Index: linux-pm/kernel/sched/idle.c > =================================================================== > --- linux-pm.orig/kernel/sched/idle.c > +++ linux-pm/kernel/sched/idle.c > @@ -188,13 +188,14 @@ static void cpuidle_idle_call(void) > } else { > unsigned int duration_us; > > - tick_nohz_idle_go_idle(true); > - rcu_idle_enter(); > - > /* > * Ask the cpuidle framework to choose a convenient idle state. > */ > next_state = cpuidle_select(drv, dev, &duration_us); > + > + tick_nohz_idle_go_idle(duration_us > USEC_PER_SEC / HZ);
(FWIW we have TICK_USEC for this) > + rcu_idle_enter(); > + > entered_state = call_cpuidle(drv, dev, next_state); > /* > * Give the governor an opportunity to reflect on the outcome Also, I think that at this point you've introduced a problem; by not disabling the tick unconditionally, we'll have extra wakeups due to the (now still running) tick, which will bias the estimation, as per reflect(), downwards. We should effectively discard tick wakeups when we could have entered nohz but didn't, accumulating the idle period in reflect and only commit once we get a !tick wakeup. Of course, for that to work we need to somehow divine what woke us, which is going to be tricky.