On 2017/10/14 8:35, Rafael J. Wysocki wrote: > On Saturday, September 30, 2017 9:20:28 AM CEST Aubrey Li wrote: >> Record the overhead of idle entry in micro-second >> > > What is this needed for?
We need to figure out how long of a idle is a short idle and recording the overhead is for this purpose. The short idle threshold is based on this overhead. > >> +void cpuidle_entry_end(void) >> +{ >> + struct cpuidle_device *dev = cpuidle_get_device(); >> + u64 overhead; >> + s64 diff; >> + >> + if (dev) { >> + dev->idle_stat.entry_end = local_clock(); >> + overhead = div_u64(dev->idle_stat.entry_end - >> + dev->idle_stat.entry_start, NSEC_PER_USEC); > > Is the conversion really necessary? > > If so, then why? We can choose nano-second and micro-second. Given that workload results in the short idle pattern, I think micro-second is good enough for the real workload. Another reason is that prediction from idle governor is micro-second, so I convert it for comparing purpose. > > And if there is a good reason, what about using right shift to do > an approximate conversion to avoid the extra division here? Sure >> 10 works for me as I don't think here precision is a big deal. > >> + diff = overhead - dev->idle_stat.overhead; >> + dev->idle_stat.overhead += diff >> 3; > > Can you please explain what happens in the two lines above? Online average computing algorithm, stolen from update_avg() @ kernel/sched/core.c. > >> + /* >> + * limit overhead to 1us >> + */ >> + if (dev->idle_stat.overhead == 0) >> + dev->idle_stat.overhead = 1; >> + } >> +} >> + >> /** >> * cpuidle_install_idle_handler - installs the cpuidle idle loop handler >> */ >> diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h >> index fc1e5d7..cad9b71 100644 >> --- a/include/linux/cpuidle.h >> +++ b/include/linux/cpuidle.h >> @@ -72,6 +72,15 @@ struct cpuidle_device_kobj; >> struct cpuidle_state_kobj; >> struct cpuidle_driver_kobj; >> >> +struct cpuidle_stat { >> + u64 entry_start; /* nanosecond */ >> + u64 entry_end; /* nanosecond */ >> + u64 overhead; /* nanosecond */ >> + unsigned int predicted_us; /* microsecond */ >> + bool predicted; /* ever predicted? */ >> + bool fast_idle; /* fast idle? */ > > Some of the fields here are never used in the code after this patch. > > Also it would be good to add a comment describing the meaning of the > fields. > okay, will add in the next version. Thanks, -Aubrey