On Friday, August 26, 2016 11:40:48 AM Steve Muckle wrote:
> A policy of going to fmax on any RT activity will be detrimental
> for power on many platforms. Often RT accounts for only a small amount
> of CPU activity so sending the CPU frequency to fmax is overkill. Worse
> still, some platforms may not be able to even complete the CPU frequency
> change before the RT activity has already completed.
>
> Cpufreq governors have not treated RT activity this way in the past so
> it is not part of the expected semantics of the RT scheduling class. The
> DL class offers guarantees about task completion and could be used for
> this purpose.
>
> Modify the schedutil algorithm to instead use rt_avg as an estimate of
> RT utilization of the CPU.
>
> Based on previous work by Vincent Guittot <vincent.guit...@linaro.org>.

If we do it for RT, why not do a similar thing for DL? As in the
original patch from Peter, for example?

> Signed-off-by: Steve Muckle <smuc...@linaro.org>
> ---
>  kernel/sched/cpufreq_schedutil.c | 26 +++++++++++++++++---------
>  1 file changed, 17 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/sched/cpufreq_schedutil.c
> b/kernel/sched/cpufreq_schedutil.c
> index cb8a77b1ef1b..89094a466250 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -146,13 +146,21 @@ static unsigned int get_next_freq(struct sugov_cpu
> *sg_cpu, unsigned long util,
>
>  static void sugov_get_util(unsigned long *util, unsigned long *max)
>  {
> -	struct rq *rq = this_rq();
> -	unsigned long cfs_max;
> +	int cpu = smp_processor_id();
> +	struct rq *rq = cpu_rq(cpu);
> +	unsigned long max_cap, rt;
> +	s64 delta;
>
> -	cfs_max = arch_scale_cpu_capacity(NULL, smp_processor_id());
> +	max_cap = arch_scale_cpu_capacity(NULL, cpu);
>
> -	*util = min(rq->cfs.avg.util_avg, cfs_max);
> -	*max = cfs_max;
> +	delta = rq_clock(rq) - rq->age_stamp;
> +	if (unlikely(delta < 0))
> +		delta = 0;
> +	rt = div64_u64(rq->rt_avg, sched_avg_period() + delta);
> +	rt = (rt * max_cap) >> SCHED_CAPACITY_SHIFT;

These computations are rather heavy, so I wonder if they can be avoided
based on the flags, for example?

Plus, is SCHED_CAPACITY_SHIFT actually defined for all architectures?

Another ugly thing is the direct use of rq_clock(rq) here, whereas we
pass it around as the 'time' argument elsewhere.

> +
> +	*util = min(rq->cfs.avg.util_avg + rt, max_cap);
> +	*max = max_cap;
>  }

Thanks,
Rafael
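
For reference, the arithmetic in the quoted hunk can be tried out in user
space. The following is only a rough sketch, not kernel code: struct
rq_sample, rt_util() and get_util() are hypothetical names, CAPACITY_SHIFT
is assumed to be 10 to mirror SCHED_CAPACITY_SHIFT, and rt_avg is assumed
to hold RT running time already weighted by a 1024-based scale factor. The
averaging period is passed in as a plain argument instead of coming from
sched_avg_period().

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: mirrors SCHED_CAPACITY_SHIFT (10) from the kernel. */
#define CAPACITY_SHIFT 10

/* Hypothetical stand-in for the runqueue fields the patch reads. */
struct rq_sample {
	uint64_t rt_avg;        /* RT time in ns, assumed pre-scaled by 1024 */
	uint64_t age_stamp;     /* start of the current averaging window, ns */
	uint64_t clock;         /* current runqueue clock, ns */
	unsigned long cfs_util; /* CFS utilization (util_avg) */
};

/* RT utilization estimate, scaled to the capacity of this CPU. */
static unsigned long rt_util(const struct rq_sample *s, uint64_t period,
			     unsigned long max_cap)
{
	int64_t delta = (int64_t)(s->clock - s->age_stamp);
	uint64_t rt;

	/* The clock may lag age_stamp; treat that as no elapsed time. */
	if (delta < 0)
		delta = 0;

	assert(period + (uint64_t)delta > 0);

	/* Average over the window: yields a value in the 0..1024 range. */
	rt = s->rt_avg / (period + (uint64_t)delta);

	/* Rescale from the 1024-based range to this CPU's capacity. */
	return (unsigned long)((rt * max_cap) >> CAPACITY_SHIFT);
}

/* Combined CFS + RT utilization, clamped to the CPU capacity. */
static void get_util(const struct rq_sample *s, uint64_t period,
		     unsigned long max_cap,
		     unsigned long *util, unsigned long *max)
{
	unsigned long sum = s->cfs_util + rt_util(s, period, max_cap);

	*util = sum < max_cap ? sum : max_cap;
	*max = max_cap;
}
```

The sketch does one 64-bit division, one multiplication and one shift per
call, which is the cost being questioned above. The clamp in get_util()
matters on CPUs whose capacity is below 1024, where the sum of the CFS and
RT terms can exceed max_cap.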