Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-04-01 Thread Steve Muckle
On 04/01/2016 12:14 PM, Rafael J. Wysocki wrote:
> On Fri, Apr 1, 2016 at 7:49 PM, Steve Muckle  wrote:
>> On 03/29/2016 07:00 PM, Rafael J. Wysocki wrote:
>> ...
>>> +config CPU_FREQ_GOV_SCHEDUTIL
>>> + tristate "'schedutil' cpufreq policy governor"
>>> + depends on CPU_FREQ
>>> + select CPU_FREQ_GOV_ATTR_SET
>>> + select IRQ_WORK
>>> + help
>>> +   This governor makes decisions based on the utilization data provided
>>> +   by the scheduler.  It sets the CPU frequency to be proportional to
>>> +   the utilization/capacity ratio coming from the scheduler.  If the
>>> +   utilization is frequency-invariant, the new frequency is also
>>> +   proportional to the maximum available frequency.  If that is not the
>>> +   case, it is proportional to the current frequency of the CPU with 
>>> the
>>> +   tipping point at utilization/capacity equal to 80%.
>>
>> This help text implies that the tipping point of 80% applies only to
>> non-frequency invariant configurations, rather than both. Possible to
>> rephrase?
> 
> Sure.
> 
> What about:
> 
> "If that is not the case, it is proportional to the current frequency
> of the CPU.  The frequency tipping point is at utilization/capacity
> equal to 80% in both cases."

LGTM

thanks,
Steve


Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-04-01 Thread Steve Muckle
On 04/01/2016 12:14 PM, Rafael J. Wysocki wrote:
> On Fri, Apr 1, 2016 at 7:49 PM, Steve Muckle  wrote:
>> On 03/29/2016 07:00 PM, Rafael J. Wysocki wrote:
>> ...
>>> +config CPU_FREQ_GOV_SCHEDUTIL
>>> + tristate "'schedutil' cpufreq policy governor"
>>> + depends on CPU_FREQ
>>> + select CPU_FREQ_GOV_ATTR_SET
>>> + select IRQ_WORK
>>> + help
>>> +   This governor makes decisions based on the utilization data provided
>>> +   by the scheduler.  It sets the CPU frequency to be proportional to
>>> +   the utilization/capacity ratio coming from the scheduler.  If the
>>> +   utilization is frequency-invariant, the new frequency is also
>>> +   proportional to the maximum available frequency.  If that is not the
>>> +   case, it is proportional to the current frequency of the CPU with 
>>> the
>>> +   tipping point at utilization/capacity equal to 80%.
>>
>> This help text implies that the tipping point of 80% applies only to
>> non-frequency invariant configurations, rather than both. Possible to
>> rephrase?
> 
> Sure.
> 
> What about:
> 
> "If that is not the case, it is proportional to the current frequency
> of the CPU.  The frequency tipping point is at utilization/capacity
> equal to 80% in both cases."

LGTM

thanks,
Steve


Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-04-01 Thread Rafael J. Wysocki
On Fri, Apr 1, 2016 at 7:49 PM, Steve Muckle  wrote:
> On 03/29/2016 07:00 PM, Rafael J. Wysocki wrote:
> ...
>> +config CPU_FREQ_GOV_SCHEDUTIL
>> + tristate "'schedutil' cpufreq policy governor"
>> + depends on CPU_FREQ
>> + select CPU_FREQ_GOV_ATTR_SET
>> + select IRQ_WORK
>> + help
>> +   This governor makes decisions based on the utilization data provided
>> +   by the scheduler.  It sets the CPU frequency to be proportional to
>> +   the utilization/capacity ratio coming from the scheduler.  If the
>> +   utilization is frequency-invariant, the new frequency is also
>> +   proportional to the maximum available frequency.  If that is not the
>> +   case, it is proportional to the current frequency of the CPU with the
>> +   tipping point at utilization/capacity equal to 80%.
>
> This help text implies that the tipping point of 80% applies only to
> non-frequency invariant configurations, rather than both. Possible to
> rephrase?

Sure.

What about:

"If that is not the case, it is proportional to the current frequency
of the CPU.  The frequency tipping point is at utilization/capacity
equal to 80% in both cases."

> ...
>> +static unsigned int sugov_next_freq_shared(struct sugov_policy *sg_policy,
>> +unsigned long util, unsigned long 
>> max)
>> +{
>> + struct cpufreq_policy *policy = sg_policy->policy;
>> + unsigned int max_f = policy->cpuinfo.max_freq;
>> + u64 last_freq_update_time = sg_policy->last_freq_update_time;
>> + unsigned int j;
>> +
>> + if (util == ULONG_MAX)
>> + return max_f;
>> +
>> + for_each_cpu(j, policy->cpus) {
>> + struct sugov_cpu *j_sg_cpu;
>> + unsigned long j_util, j_max;
>> + u64 delta_ns;
>> +
>> + if (j == smp_processor_id())
>> + continue;
>> +
>> + j_sg_cpu = _cpu(sugov_cpu, j);
>> + /*
>> +  * If the CPU utilization was last updated before the previous
>> +  * frequency update and the time elapsed between the last 
>> update
>> +  * of the CPU utilization and the last frequency update is long
>> +  * enough, don't take the CPU into account as it probably is
>> +  * idle now.
>> +  */
>> + delta_ns = last_freq_update_time - j_sg_cpu->last_update;
>> + if ((s64)delta_ns > TICK_NSEC)
>
>>> Why not declare delta_ns as an s64 (also in suguv_should_update_freq)
>>> and avoid the cast?
>>
>> I took this from __update_load_avg(), but it shouldn't matter here.
>
> Did you mean to keep these casts?

Not really.  I'll fix that up shortly.


Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-04-01 Thread Rafael J. Wysocki
On Fri, Apr 1, 2016 at 7:49 PM, Steve Muckle  wrote:
> On 03/29/2016 07:00 PM, Rafael J. Wysocki wrote:
> ...
>> +config CPU_FREQ_GOV_SCHEDUTIL
>> + tristate "'schedutil' cpufreq policy governor"
>> + depends on CPU_FREQ
>> + select CPU_FREQ_GOV_ATTR_SET
>> + select IRQ_WORK
>> + help
>> +   This governor makes decisions based on the utilization data provided
>> +   by the scheduler.  It sets the CPU frequency to be proportional to
>> +   the utilization/capacity ratio coming from the scheduler.  If the
>> +   utilization is frequency-invariant, the new frequency is also
>> +   proportional to the maximum available frequency.  If that is not the
>> +   case, it is proportional to the current frequency of the CPU with the
>> +   tipping point at utilization/capacity equal to 80%.
>
> This help text implies that the tipping point of 80% applies only to
> non-frequency invariant configurations, rather than both. Possible to
> rephrase?

Sure.

What about:

"If that is not the case, it is proportional to the current frequency
of the CPU.  The frequency tipping point is at utilization/capacity
equal to 80% in both cases."

> ...
>> +static unsigned int sugov_next_freq_shared(struct sugov_policy *sg_policy,
>> +unsigned long util, unsigned long 
>> max)
>> +{
>> + struct cpufreq_policy *policy = sg_policy->policy;
>> + unsigned int max_f = policy->cpuinfo.max_freq;
>> + u64 last_freq_update_time = sg_policy->last_freq_update_time;
>> + unsigned int j;
>> +
>> + if (util == ULONG_MAX)
>> + return max_f;
>> +
>> + for_each_cpu(j, policy->cpus) {
>> + struct sugov_cpu *j_sg_cpu;
>> + unsigned long j_util, j_max;
>> + u64 delta_ns;
>> +
>> + if (j == smp_processor_id())
>> + continue;
>> +
>> + j_sg_cpu = _cpu(sugov_cpu, j);
>> + /*
>> +  * If the CPU utilization was last updated before the previous
>> +  * frequency update and the time elapsed between the last 
>> update
>> +  * of the CPU utilization and the last frequency update is long
>> +  * enough, don't take the CPU into account as it probably is
>> +  * idle now.
>> +  */
>> + delta_ns = last_freq_update_time - j_sg_cpu->last_update;
>> + if ((s64)delta_ns > TICK_NSEC)
>
>>> Why not declare delta_ns as an s64 (also in suguv_should_update_freq)
>>> and avoid the cast?
>>
>> I took this from __update_load_avg(), but it shouldn't matter here.
>
> Did you mean to keep these casts?

Not really.  I'll fix that up shortly.


Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-04-01 Thread Steve Muckle
On 03/29/2016 07:00 PM, Rafael J. Wysocki wrote:
...
> +config CPU_FREQ_GOV_SCHEDUTIL
> + tristate "'schedutil' cpufreq policy governor"
> + depends on CPU_FREQ
> + select CPU_FREQ_GOV_ATTR_SET
> + select IRQ_WORK
> + help
> +   This governor makes decisions based on the utilization data provided
> +   by the scheduler.  It sets the CPU frequency to be proportional to
> +   the utilization/capacity ratio coming from the scheduler.  If the
> +   utilization is frequency-invariant, the new frequency is also
> +   proportional to the maximum available frequency.  If that is not the
> +   case, it is proportional to the current frequency of the CPU with the
> +   tipping point at utilization/capacity equal to 80%.

This help text implies that the tipping point of 80% applies only to
non-frequency invariant configurations, rather than both. Possible to
rephrase?

...
> +static unsigned int sugov_next_freq_shared(struct sugov_policy *sg_policy,
> +unsigned long util, unsigned long 
> max)
> +{
> + struct cpufreq_policy *policy = sg_policy->policy;
> + unsigned int max_f = policy->cpuinfo.max_freq;
> + u64 last_freq_update_time = sg_policy->last_freq_update_time;
> + unsigned int j;
> +
> + if (util == ULONG_MAX)
> + return max_f;
> +
> + for_each_cpu(j, policy->cpus) {
> + struct sugov_cpu *j_sg_cpu;
> + unsigned long j_util, j_max;
> + u64 delta_ns;
> +
> + if (j == smp_processor_id())
> + continue;
> +
> + j_sg_cpu = _cpu(sugov_cpu, j);
> + /*
> +  * If the CPU utilization was last updated before the previous
> +  * frequency update and the time elapsed between the last update
> +  * of the CPU utilization and the last frequency update is long
> +  * enough, don't take the CPU into account as it probably is
> +  * idle now.
> +  */
> + delta_ns = last_freq_update_time - j_sg_cpu->last_update;
> + if ((s64)delta_ns > TICK_NSEC)

>> Why not declare delta_ns as an s64 (also in suguv_should_update_freq)
>> and avoid the cast?
>
> I took this from __update_load_avg(), but it shouldn't matter here.

Did you mean to keep these casts?

thanks,
Steve



Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-04-01 Thread Steve Muckle
On 03/29/2016 07:00 PM, Rafael J. Wysocki wrote:
...
> +config CPU_FREQ_GOV_SCHEDUTIL
> + tristate "'schedutil' cpufreq policy governor"
> + depends on CPU_FREQ
> + select CPU_FREQ_GOV_ATTR_SET
> + select IRQ_WORK
> + help
> +   This governor makes decisions based on the utilization data provided
> +   by the scheduler.  It sets the CPU frequency to be proportional to
> +   the utilization/capacity ratio coming from the scheduler.  If the
> +   utilization is frequency-invariant, the new frequency is also
> +   proportional to the maximum available frequency.  If that is not the
> +   case, it is proportional to the current frequency of the CPU with the
> +   tipping point at utilization/capacity equal to 80%.

This help text implies that the tipping point of 80% applies only to
non-frequency invariant configurations, rather than both. Possible to
rephrase?

...
> +static unsigned int sugov_next_freq_shared(struct sugov_policy *sg_policy,
> +unsigned long util, unsigned long 
> max)
> +{
> + struct cpufreq_policy *policy = sg_policy->policy;
> + unsigned int max_f = policy->cpuinfo.max_freq;
> + u64 last_freq_update_time = sg_policy->last_freq_update_time;
> + unsigned int j;
> +
> + if (util == ULONG_MAX)
> + return max_f;
> +
> + for_each_cpu(j, policy->cpus) {
> + struct sugov_cpu *j_sg_cpu;
> + unsigned long j_util, j_max;
> + u64 delta_ns;
> +
> + if (j == smp_processor_id())
> + continue;
> +
> + j_sg_cpu = _cpu(sugov_cpu, j);
> + /*
> +  * If the CPU utilization was last updated before the previous
> +  * frequency update and the time elapsed between the last update
> +  * of the CPU utilization and the last frequency update is long
> +  * enough, don't take the CPU into account as it probably is
> +  * idle now.
> +  */
> + delta_ns = last_freq_update_time - j_sg_cpu->last_update;
> + if ((s64)delta_ns > TICK_NSEC)

>> Why not declare delta_ns as an s64 (also in suguv_should_update_freq)
>> and avoid the cast?
>
> I took this from __update_load_avg(), but it shouldn't matter here.

Did you mean to keep these casts?

thanks,
Steve



Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-03-31 Thread Peter Zijlstra
On Wed, Mar 30, 2016 at 04:00:24AM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki 
> 
> Add a new cpufreq scaling governor, called "schedutil", that uses
> scheduler-provided CPU utilization information as input for making
> its decisions.
> 
> Doing that is possible after commit 34e2c555f3e1 (cpufreq: Add
> mechanism for registering utilization update callbacks) that
> introduced cpufreq_update_util() called by the scheduler on
> utilization changes (from CFS) and RT/DL task status updates.
> In particular, CPU frequency scaling decisions may be based on
> the the utilization data passed to cpufreq_update_util() by CFS.
> 
> The new governor is relatively simple.
> 
> The frequency selection formula used by it depends on whether or not
> the utilization is frequency-invariant.  In the frequency-invariant
> case the new CPU frequency is given by
> 
>   next_freq = 1.25 * max_freq * util / max
> 
> where util and max are the last two arguments of cpufreq_update_util().
> In turn, if util is not frequency-invariant, the maximum frequency in
> the above formula is replaced with the current frequency of the CPU:
> 
>   next_freq = 1.25 * curr_freq * util / max
> 
> The coefficient 1.25 corresponds to the frequency tipping point at
> (util / max) = 0.8.
> 
> All of the computations are carried out in the utilization update
> handlers provided by the new governor.  One of those handlers is
> used for cpufreq policies shared between multiple CPUs and the other
> one is for policies with one CPU only (and therefore it doesn't need
> to use any extra synchronization means).
> 
> The governor supports fast frequency switching if that is supported
> by the cpufreq driver in use and possible for the given policy.
> In the fast switching case, all operations of the governor take
> place in its utilization update handlers.  If fast switching cannot
> be used, the frequency switch operations are carried out with the
> help of a work item which only calls __cpufreq_driver_target()
> (under a mutex) to trigger a frequency update (to a value already
> computed beforehand in one of the utilization update handlers).
> 
> Currently, the governor treats all of the RT and DL tasks as
> "unknown utilization" and sets the frequency to the allowed
> maximum when updated from the RT or DL sched classes.  That
> heavy-handed approach should be replaced with something more
> subtle and specifically targeted at RT and DL tasks.
> 
> The governor shares some tunables management code with the
> "ondemand" and "conservative" governors and uses some common
> definitions from cpufreq_governor.h, but apart from that it
> is stand-alone.
> 
> Signed-off-by: Rafael J. Wysocki 
> ---
>  drivers/cpufreq/Kconfig  |   29 ++
>  kernel/sched/Makefile|1 
>  kernel/sched/cpufreq_schedutil.c |  528 
> +++
>  kernel/sched/sched.h |8 
>  4 files changed, 566 insertions(+)

I think this is a good first step and we can definitely work from here;
afaict there are no (big) disagreements on the general approach, so

Acked-by: Peter Zijlstra (Intel) 



Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-03-31 Thread Peter Zijlstra
On Wed, Mar 30, 2016 at 04:00:24AM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki 
> 
> Add a new cpufreq scaling governor, called "schedutil", that uses
> scheduler-provided CPU utilization information as input for making
> its decisions.
> 
> Doing that is possible after commit 34e2c555f3e1 (cpufreq: Add
> mechanism for registering utilization update callbacks) that
> introduced cpufreq_update_util() called by the scheduler on
> utilization changes (from CFS) and RT/DL task status updates.
> In particular, CPU frequency scaling decisions may be based on
> the the utilization data passed to cpufreq_update_util() by CFS.
> 
> The new governor is relatively simple.
> 
> The frequency selection formula used by it depends on whether or not
> the utilization is frequency-invariant.  In the frequency-invariant
> case the new CPU frequency is given by
> 
>   next_freq = 1.25 * max_freq * util / max
> 
> where util and max are the last two arguments of cpufreq_update_util().
> In turn, if util is not frequency-invariant, the maximum frequency in
> the above formula is replaced with the current frequency of the CPU:
> 
>   next_freq = 1.25 * curr_freq * util / max
> 
> The coefficient 1.25 corresponds to the frequency tipping point at
> (util / max) = 0.8.
> 
> All of the computations are carried out in the utilization update
> handlers provided by the new governor.  One of those handlers is
> used for cpufreq policies shared between multiple CPUs and the other
> one is for policies with one CPU only (and therefore it doesn't need
> to use any extra synchronization means).
> 
> The governor supports fast frequency switching if that is supported
> by the cpufreq driver in use and possible for the given policy.
> In the fast switching case, all operations of the governor take
> place in its utilization update handlers.  If fast switching cannot
> be used, the frequency switch operations are carried out with the
> help of a work item which only calls __cpufreq_driver_target()
> (under a mutex) to trigger a frequency update (to a value already
> computed beforehand in one of the utilization update handlers).
> 
> Currently, the governor treats all of the RT and DL tasks as
> "unknown utilization" and sets the frequency to the allowed
> maximum when updated from the RT or DL sched classes.  That
> heavy-handed approach should be replaced with something more
> subtle and specifically targeted at RT and DL tasks.
> 
> The governor shares some tunables management code with the
> "ondemand" and "conservative" governors and uses some common
> definitions from cpufreq_governor.h, but apart from that it
> is stand-alone.
> 
> Signed-off-by: Rafael J. Wysocki 
> ---
>  drivers/cpufreq/Kconfig  |   29 ++
>  kernel/sched/Makefile|1 
>  kernel/sched/cpufreq_schedutil.c |  528 
> +++
>  kernel/sched/sched.h |8 
>  4 files changed, 566 insertions(+)

I think this is a good first step and we can definitely work from here;
afaict there are no (big) disagreements on the general approach, so

Acked-by: Peter Zijlstra (Intel) 



Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-03-31 Thread Peter Zijlstra
On Thu, Mar 31, 2016 at 02:18:33PM +0200, Rafael J. Wysocki wrote:
> On Thu, Mar 31, 2016 at 2:12 PM, Peter Zijlstra  wrote:
> >
> > Ingo reminded me that the schedutil governor is part of the scheduler
> > proper and can access scheduler data because of that.
> >
> > This allows us to remove the util and max arguments since only the
> > schedutil governor will use those, which leads to some further text
> > reduction:
> >
> >   435951226  24   44845af2d 
> > defconfig-build/kernel/sched/fair.o.pre
> >   429071226  24   44157ac7d 
> > defconfig-build/kernel/sched/fair.o.post
> >
> > Of course, we get more text in schedutil in return, but the below also
> > shows how we can benefit from not being tied to those two parameters by
> > doing a very coarse deadline reservation.
> 
> OK
> 
> Do you want this to go into the series or be folded into the schedutil
> patch or on top  of it?

We can do it on top; no need to rush this.


Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-03-31 Thread Peter Zijlstra
On Thu, Mar 31, 2016 at 02:18:33PM +0200, Rafael J. Wysocki wrote:
> On Thu, Mar 31, 2016 at 2:12 PM, Peter Zijlstra  wrote:
> >
> > Ingo reminded me that the schedutil governor is part of the scheduler
> > proper and can access scheduler data because of that.
> >
> > This allows us to remove the util and max arguments since only the
> > schedutil governor will use those, which leads to some further text
> > reduction:
> >
> >   435951226  24   44845af2d 
> > defconfig-build/kernel/sched/fair.o.pre
> >   429071226  24   44157ac7d 
> > defconfig-build/kernel/sched/fair.o.post
> >
> > Of course, we get more text in schedutil in return, but the below also
> > shows how we can benefit from not being tied to those two parameters by
> > doing a very coarse deadline reservation.
> 
> OK
> 
> Do you want this to go into the series or be folded into the schedutil
> patch or on top  of it?

We can do it on top; no need to rush this.


Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-03-31 Thread Rafael J. Wysocki
On Thu, Mar 31, 2016 at 2:12 PM, Peter Zijlstra  wrote:
>
> Ingo reminded me that the schedutil governor is part of the scheduler
> proper and can access scheduler data because of that.
>
> This allows us to remove the util and max arguments since only the
> schedutil governor will use those, which leads to some further text
> reduction:
>
>   435951226  24   44845af2d 
> defconfig-build/kernel/sched/fair.o.pre
>   429071226  24   44157ac7d 
> defconfig-build/kernel/sched/fair.o.post
>
> Of course, we get more text in schedutil in return, but the below also
> shows how we can benefit from not being tied to those two parameters by
> doing a very coarse deadline reservation.

OK

Do you want this to go into the series or be folded into the schedutil
patch or on top  of it?


Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-03-31 Thread Rafael J. Wysocki
On Thu, Mar 31, 2016 at 2:12 PM, Peter Zijlstra  wrote:
>
> Ingo reminded me that the schedutil governor is part of the scheduler
> proper and can access scheduler data because of that.
>
> This allows us to remove the util and max arguments since only the
> schedutil governor will use those, which leads to some further text
> reduction:
>
>   435951226  24   44845af2d 
> defconfig-build/kernel/sched/fair.o.pre
>   429071226  24   44157ac7d 
> defconfig-build/kernel/sched/fair.o.post
>
> Of course, we get more text in schedutil in return, but the below also
> shows how we can benefit from not being tied to those two parameters by
> doing a very coarse deadline reservation.

OK

Do you want this to go into the series or be folded into the schedutil
patch or on top  of it?


Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-03-31 Thread Peter Zijlstra

Ingo reminded me that the schedutil governor is part of the scheduler
proper and can access scheduler data because of that.

This allows us to remove the util and max arguments since only the
schedutil governor will use those, which leads to some further text
reduction:

  435951226  24   44845af2d defconfig-build/kernel/sched/fair.o.pre
  429071226  24   44157ac7d defconfig-build/kernel/sched/fair.o.post

Of course, we get more text in schedutil in return, but the below also
shows how we can benefit from not being tied to those two parameters by
doing a very coarse deadline reservation.

---
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -248,8 +248,7 @@ static void dbs_irq_work(struct irq_work
schedule_work_on(smp_processor_id(), _dbs->work);
 }
 
-static void dbs_update_util_handler(struct update_util_data *data, u64 time,
-   unsigned long util, unsigned long max)
+static void dbs_update_util_handler(struct update_util_data *data, u64 time)
 {
struct cpu_dbs_info *cdbs = container_of(data, struct cpu_dbs_info, 
update_util);
struct policy_dbs_info *policy_dbs = cdbs->policy_dbs;
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -1032,8 +1032,7 @@ static inline void intel_pstate_adjust_b
get_avg_frequency(cpu));
 }
 
-static void intel_pstate_update_util(struct update_util_data *data, u64 time,
-unsigned long util, unsigned long max)
+static void intel_pstate_update_util(struct update_util_data *data, u64 time)
 {
struct cpudata *cpu = container_of(data, struct cpudata, update_util);
u64 delta_ns = time - cpu->sample.time;
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -3236,13 +3236,11 @@ static inline unsigned long rlimit_max(u
 
 #ifdef CONFIG_CPU_FREQ
 struct update_util_data {
-   void (*func)(struct update_util_data *data,
-u64 time, unsigned long util, unsigned long max);
+   void (*func)(struct update_util_data *data, u64 time);
 };
 
 void cpufreq_add_update_util_hook(int cpu, struct update_util_data *data,
-   void (*func)(struct update_util_data *data, u64 time,
-unsigned long util, unsigned long max));
+   void (*func)(struct update_util_data *data, u64 time));
 void cpufreq_remove_update_util_hook(int cpu);
 #endif /* CONFIG_CPU_FREQ */
 
--- a/kernel/sched/cpufreq.c
+++ b/kernel/sched/cpufreq.c
@@ -32,8 +32,7 @@ DEFINE_PER_CPU(struct update_util_data *
  * called or it will WARN() and return with no effect.
  */
 void cpufreq_add_update_util_hook(int cpu, struct update_util_data *data,
-   void (*func)(struct update_util_data *data, u64 time,
-unsigned long util, unsigned long max))
+   void (*func)(struct update_util_data *data, u64 time))
 {
if (WARN_ON(!data || !func))
return;
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -129,19 +129,55 @@ static unsigned int get_next_freq(struct
return (freq + (freq >> 2)) * util / max;
 }
 
-static void sugov_update_single(struct update_util_data *hook, u64 time,
-   unsigned long util, unsigned long max)
+static void sugov_get_util(unsigned long *util, unsigned long *max)
+{
+   unsigned long dl_util, dl_max;
+   unsigned long cfs_util, cfs_max;
+   int cpu = smp_processor_id();
+   struct dl_bw *dl_bw = dl_bw_of(cpu);
+   struct rq *rq = this_rq();
+
+   if (rt_prio(current->prio)) {
+   /*
+* Punt for now; maybe do something based on sysctl_sched_rt_*.
+*/
+   *util = ULONG_MAX;
+   return;
+   }
+
+   dl_max = dl_bw_cpus(cpu) << 20;
+   dl_util = dl_bw->total_bw;
+
+   cfs_max = rq->cpu_capacity_orig;
+   cfs_util = min(rq->cfs.avg.util_avg, cfs_max);
+
+   if (cfs_util * dl_max > dl_util * cfs_max) {
+   *util = cfs_util;
+   *max  = cfs_max;
+   } else {
+   *util = dl_util;
+   *max  = dl_max;
+   }
+}
+
+static void sugov_update_single(struct update_util_data *hook, u64 time)
 {
struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, 
update_util);
struct sugov_policy *sg_policy = sg_cpu->sg_policy;
struct cpufreq_policy *policy = sg_policy->policy;
+   unsigned long util, max;
unsigned int next_f;
 
if (!sugov_should_update_freq(sg_policy, time))
return;
 
-   next_f = util == ULONG_MAX ? policy->cpuinfo.max_freq :
-   get_next_freq(policy, util, max);
+   sugov_get_util(, );
+
+   if (util == ULONG_MAX)
+   next_f = policy->cpuinfo.max_freq;
+   else
+   

Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-03-31 Thread Peter Zijlstra

Ingo reminded me that the schedutil governor is part of the scheduler
proper and can access scheduler data because of that.

This allows us to remove the util and max arguments since only the
schedutil governor will use those, which leads to some further text
reduction:

  435951226  24   44845af2d defconfig-build/kernel/sched/fair.o.pre
  429071226  24   44157ac7d defconfig-build/kernel/sched/fair.o.post

Of course, we get more text in schedutil in return, but the below also
shows how we can benefit from not being tied to those two parameters by
doing a very coarse deadline reservation.

---
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -248,8 +248,7 @@ static void dbs_irq_work(struct irq_work
schedule_work_on(smp_processor_id(), _dbs->work);
 }
 
-static void dbs_update_util_handler(struct update_util_data *data, u64 time,
-   unsigned long util, unsigned long max)
+static void dbs_update_util_handler(struct update_util_data *data, u64 time)
 {
struct cpu_dbs_info *cdbs = container_of(data, struct cpu_dbs_info, 
update_util);
struct policy_dbs_info *policy_dbs = cdbs->policy_dbs;
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -1032,8 +1032,7 @@ static inline void intel_pstate_adjust_b
get_avg_frequency(cpu));
 }
 
-static void intel_pstate_update_util(struct update_util_data *data, u64 time,
-unsigned long util, unsigned long max)
+static void intel_pstate_update_util(struct update_util_data *data, u64 time)
 {
struct cpudata *cpu = container_of(data, struct cpudata, update_util);
u64 delta_ns = time - cpu->sample.time;
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -3236,13 +3236,11 @@ static inline unsigned long rlimit_max(u
 
 #ifdef CONFIG_CPU_FREQ
 struct update_util_data {
-   void (*func)(struct update_util_data *data,
-u64 time, unsigned long util, unsigned long max);
+   void (*func)(struct update_util_data *data, u64 time);
 };
 
 void cpufreq_add_update_util_hook(int cpu, struct update_util_data *data,
-   void (*func)(struct update_util_data *data, u64 time,
-unsigned long util, unsigned long max));
+   void (*func)(struct update_util_data *data, u64 time));
 void cpufreq_remove_update_util_hook(int cpu);
 #endif /* CONFIG_CPU_FREQ */
 
--- a/kernel/sched/cpufreq.c
+++ b/kernel/sched/cpufreq.c
@@ -32,8 +32,7 @@ DEFINE_PER_CPU(struct update_util_data *
  * called or it will WARN() and return with no effect.
  */
 void cpufreq_add_update_util_hook(int cpu, struct update_util_data *data,
-   void (*func)(struct update_util_data *data, u64 time,
-unsigned long util, unsigned long max))
+   void (*func)(struct update_util_data *data, u64 time))
 {
if (WARN_ON(!data || !func))
return;
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -129,19 +129,55 @@ static unsigned int get_next_freq(struct
return (freq + (freq >> 2)) * util / max;
 }
 
-static void sugov_update_single(struct update_util_data *hook, u64 time,
-   unsigned long util, unsigned long max)
+static void sugov_get_util(unsigned long *util, unsigned long *max)
+{
+   unsigned long dl_util, dl_max;
+   unsigned long cfs_util, cfs_max;
+   int cpu = smp_processor_id();
+   struct dl_bw *dl_bw = dl_bw_of(cpu);
+   struct rq *rq = this_rq();
+
+   if (rt_prio(current->prio)) {
+   /*
+* Punt for now; maybe do something based on sysctl_sched_rt_*.
+*/
+   *util = ULONG_MAX;
+   return;
+   }
+
+   dl_max = dl_bw_cpus(cpu) << 20;
+   dl_util = dl_bw->total_bw;
+
+   cfs_max = rq->cpu_capacity_orig;
+   cfs_util = min(rq->cfs.avg.util_avg, cfs_max);
+
+   if (cfs_util * dl_max > dl_util * cfs_max) {
+   *util = cfs_util;
+   *max  = cfs_max;
+   } else {
+   *util = dl_util;
+   *max  = dl_max;
+   }
+}
+
+static void sugov_update_single(struct update_util_data *hook, u64 time)
 {
struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, 
update_util);
struct sugov_policy *sg_policy = sg_cpu->sg_policy;
struct cpufreq_policy *policy = sg_policy->policy;
+   unsigned long util, max;
unsigned int next_f;
 
if (!sugov_should_update_freq(sg_policy, time))
return;
 
-   next_f = util == ULONG_MAX ? policy->cpuinfo.max_freq :
-   get_next_freq(policy, util, max);
+   sugov_get_util(, );
+
+   if (util == ULONG_MAX)
+   next_f = policy->cpuinfo.max_freq;
+   else
+   

Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-03-30 Thread Steve Muckle
On 03/30/2016 10:24 AM, Rafael J. Wysocki wrote:
> On Wed, Mar 30, 2016 at 7:05 PM, Steve Muckle  wrote:
>> On 03/30/2016 04:31 AM, Rafael J. Wysocki wrote:
>>> +static int sugov_limits(struct cpufreq_policy *policy)
>>> +{
>>> + struct sugov_policy *sg_policy = policy->governor_data;
>>> +
>>> + if (!policy->fast_switch_enabled) {
>>> + mutex_lock(_policy->work_lock);
>>> +
>>> + if (policy->max < policy->cur)
>>> + __cpufreq_driver_target(policy, policy->max,
>>> + CPUFREQ_RELATION_H);
>>> + else if (policy->min > policy->cur)
>>> + __cpufreq_driver_target(policy, policy->min,
>>> + CPUFREQ_RELATION_L);
>>> +
>>> + mutex_unlock(_policy->work_lock);
>>> + }
>>> +
>>> + sg_policy->need_freq_update = true;
>
> I am wondering why we need to do this for !fast_switch_enabled case?
>>>
>>> That will cause the rate limit to be ignored in the utilization update
>>> handler which may be necessary if it is set to a relatively large
>>> value (like 1 s).
>>
>> But why is that necessary for !fast_switch_enabled? In that case the
>> frequency has been adjusted to satisfy the new limits here, so ignoring
>> the rate limit shouldn't be necessary. In other words why not
>>
>> } else {
>> sg_policy->need_freq_update = true;
>> }
> 
> My thinking here was that the governor might decide to use something
> different from the limit enforced here, so it would be good to make it
> do so as soon as possible.  In particular in the
> non-frequency-invariant utilization case in which new frequency
> depends on the current one.
> 
> That said i'm not particularly opposed to making that change if that's
> preferred.

Ah ok fair enough. No strong opinion from me...

thanks,
Steve



Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-03-30 Thread Steve Muckle
On 03/30/2016 10:24 AM, Rafael J. Wysocki wrote:
> On Wed, Mar 30, 2016 at 7:05 PM, Steve Muckle  wrote:
>> On 03/30/2016 04:31 AM, Rafael J. Wysocki wrote:
>>> +static int sugov_limits(struct cpufreq_policy *policy)
>>> +{
>>> + struct sugov_policy *sg_policy = policy->governor_data;
>>> +
>>> + if (!policy->fast_switch_enabled) {
>>> + mutex_lock(_policy->work_lock);
>>> +
>>> + if (policy->max < policy->cur)
>>> + __cpufreq_driver_target(policy, policy->max,
>>> + CPUFREQ_RELATION_H);
>>> + else if (policy->min > policy->cur)
>>> + __cpufreq_driver_target(policy, policy->min,
>>> + CPUFREQ_RELATION_L);
>>> +
>>> + mutex_unlock(_policy->work_lock);
>>> + }
>>> +
>>> + sg_policy->need_freq_update = true;
>
> I am wondering why we need to do this for !fast_switch_enabled case?
>>>
>>> That will cause the rate limit to be ignored in the utilization update
>>> handler which may be necessary if it is set to a relatively large
>>> value (like 1 s).
>>
>> But why is that necessary for !fast_switch_enabled? In that case the
>> frequency has been adjusted to satisfy the new limits here, so ignoring
>> the rate limit shouldn't be necessary. In other words why not
>>
>> } else {
>> sg_policy->need_freq_update = true;
>> }
> 
> My thinking here was that the governor might decide to use something
> different from the limit enforced here, so it would be good to make it
> do so as soon as possible.  In particular in the
> non-frequency-invariant utilization case in which new frequency
> depends on the current one.
> 
> That said i'm not particularly opposed to making that change if that's
> preferred.

Ah ok fair enough. No strong opinion from me...

thanks,
Steve



Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-03-30 Thread Rafael J. Wysocki
On Wed, Mar 30, 2016 at 7:05 PM, Steve Muckle  wrote:
> On 03/30/2016 04:31 AM, Rafael J. Wysocki wrote:
 >> +static int sugov_limits(struct cpufreq_policy *policy)
 >> +{
 >> + struct sugov_policy *sg_policy = policy->governor_data;
 >> +
 >> + if (!policy->fast_switch_enabled) {
 >> + mutex_lock(_policy->work_lock);
 >> +
 >> + if (policy->max < policy->cur)
 >> + __cpufreq_driver_target(policy, policy->max,
 >> + CPUFREQ_RELATION_H);
 >> + else if (policy->min > policy->cur)
 >> + __cpufreq_driver_target(policy, policy->min,
 >> + CPUFREQ_RELATION_L);
 >> +
 >> + mutex_unlock(_policy->work_lock);
 >> + }
 >> +
 >> + sg_policy->need_freq_update = true;
>>> >
>>> > I am wondering why we need to do this for !fast_switch_enabled case?
>>
>> That will cause the rate limit to be ignored in the utilization update
>> handler which may be necessary if it is set to a relatively large
>> value (like 1 s).
>
> But why is that necessary for !fast_switch_enabled? In that case the
> frequency has been adjusted to satisfy the new limits here, so ignoring
> the rate limit shouldn't be necessary. In other words why not
>
> } else {
> sg_policy->need_freq_update = true;
> }

My thinking here was that the governor might decide to use something
different from the limit enforced here, so it would be good to make it
do so as soon as possible.  In particular in the
non-frequency-invariant utilization case in which new frequency
depends on the current one.

That said i'm not particularly opposed to making that change if that's
preferred.

Thanks,
Rafael


Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-03-30 Thread Rafael J. Wysocki
On Wed, Mar 30, 2016 at 7:05 PM, Steve Muckle  wrote:
> On 03/30/2016 04:31 AM, Rafael J. Wysocki wrote:
 >> +static int sugov_limits(struct cpufreq_policy *policy)
 >> +{
 >> + struct sugov_policy *sg_policy = policy->governor_data;
 >> +
 >> + if (!policy->fast_switch_enabled) {
 >> + mutex_lock(_policy->work_lock);
 >> +
 >> + if (policy->max < policy->cur)
 >> + __cpufreq_driver_target(policy, policy->max,
 >> + CPUFREQ_RELATION_H);
 >> + else if (policy->min > policy->cur)
 >> + __cpufreq_driver_target(policy, policy->min,
 >> + CPUFREQ_RELATION_L);
 >> +
 >> + mutex_unlock(_policy->work_lock);
 >> + }
 >> +
 >> + sg_policy->need_freq_update = true;
>>> >
>>> > I am wondering why we need to do this for !fast_switch_enabled case?
>>
>> That will cause the rate limit to be ignored in the utilization update
>> handler which may be necessary if it is set to a relatively large
>> value (like 1 s).
>
> But why is that necessary for !fast_switch_enabled? In that case the
> frequency has been adjusted to satisfy the new limits here, so ignoring
> the rate limit shouldn't be necessary. In other words why not
>
> } else {
> sg_policy->need_freq_update = true;
> }

My thinking here was that the governor might decide to use something
different from the limit enforced here, so it would be good to make it
do so as soon as possible.  In particular in the
non-frequency-invariant utilization case in which new frequency
depends on the current one.

That said i'm not particularly opposed to making that change if that's
preferred.

Thanks,
Rafael


Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-03-30 Thread Steve Muckle
On 03/30/2016 04:31 AM, Rafael J. Wysocki wrote:
>>> >> +static int sugov_limits(struct cpufreq_policy *policy)
>>> >> +{
>>> >> + struct sugov_policy *sg_policy = policy->governor_data;
>>> >> +
>>> >> + if (!policy->fast_switch_enabled) {
>>> >> + mutex_lock(_policy->work_lock);
>>> >> +
>>> >> + if (policy->max < policy->cur)
>>> >> + __cpufreq_driver_target(policy, policy->max,
>>> >> + CPUFREQ_RELATION_H);
>>> >> + else if (policy->min > policy->cur)
>>> >> + __cpufreq_driver_target(policy, policy->min,
>>> >> + CPUFREQ_RELATION_L);
>>> >> +
>>> >> + mutex_unlock(_policy->work_lock);
>>> >> + }
>>> >> +
>>> >> + sg_policy->need_freq_update = true;
>> >
>> > I am wondering why we need to do this for !fast_switch_enabled case?
>
> That will cause the rate limit to be ignored in the utilization update
> handler which may be necessary if it is set to a relatively large
> value (like 1 s).

But why is that necessary for !fast_switch_enabled? In that case the
frequency has been adjusted to satisfy the new limits here, so ignoring
the rate limit shouldn't be necessary. In other words why not

} else {
sg_policy->need_freq_update = true;
}


Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-03-30 Thread Steve Muckle
On 03/30/2016 04:31 AM, Rafael J. Wysocki wrote:
>>> >> +static int sugov_limits(struct cpufreq_policy *policy)
>>> >> +{
>>> >> + struct sugov_policy *sg_policy = policy->governor_data;
>>> >> +
>>> >> + if (!policy->fast_switch_enabled) {
>>> >> + mutex_lock(_policy->work_lock);
>>> >> +
>>> >> + if (policy->max < policy->cur)
>>> >> + __cpufreq_driver_target(policy, policy->max,
>>> >> + CPUFREQ_RELATION_H);
>>> >> + else if (policy->min > policy->cur)
>>> >> + __cpufreq_driver_target(policy, policy->min,
>>> >> + CPUFREQ_RELATION_L);
>>> >> +
>>> >> + mutex_unlock(_policy->work_lock);
>>> >> + }
>>> >> +
>>> >> + sg_policy->need_freq_update = true;
>> >
>> > I am wondering why we need to do this for !fast_switch_enabled case?
>
> That will cause the rate limit to be ignored in the utilization update
> handler which may be necessary if it is set to a relatively large
> value (like 1 s).

But why is that necessary for !fast_switch_enabled? In that case the
frequency has been adjusted to satisfy the new limits here, so ignoring
the rate limit shouldn't be necessary. In other words why not

} else {
sg_policy->need_freq_update = true;
}


Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-03-30 Thread Rafael J. Wysocki
[cut]

> The current version of this looks good to me and takes care of all the issues 
> I
> raised earlier. Thanks.
>
>> +static int sugov_limits(struct cpufreq_policy *policy)
>> +{
>> + struct sugov_policy *sg_policy = policy->governor_data;
>> +
>> + if (!policy->fast_switch_enabled) {
>> + mutex_lock(_policy->work_lock);
>> +
>> + if (policy->max < policy->cur)
>> + __cpufreq_driver_target(policy, policy->max,
>> + CPUFREQ_RELATION_H);
>> + else if (policy->min > policy->cur)
>> + __cpufreq_driver_target(policy, policy->min,
>> + CPUFREQ_RELATION_L);
>> +
>> + mutex_unlock(_policy->work_lock);
>> + }
>> +
>> + sg_policy->need_freq_update = true;
>
> I am wondering why we need to do this for !fast_switch_enabled case?

That will cause the rate limit to be ignored in the utilization update
handler which may be necessary if it is set to a relatively large
value (like 1 s).

>> + return 0;
>> +}
>
> Apart from that:
>
> Acked-by: Viresh Kumar 

Thanks,
Rafael


Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-03-30 Thread Rafael J. Wysocki
[cut]

> The current version of this looks good to me and takes care of all the issues 
> I
> raised earlier. Thanks.
>
>> +static int sugov_limits(struct cpufreq_policy *policy)
>> +{
>> + struct sugov_policy *sg_policy = policy->governor_data;
>> +
>> + if (!policy->fast_switch_enabled) {
>> + mutex_lock(_policy->work_lock);
>> +
>> + if (policy->max < policy->cur)
>> + __cpufreq_driver_target(policy, policy->max,
>> + CPUFREQ_RELATION_H);
>> + else if (policy->min > policy->cur)
>> + __cpufreq_driver_target(policy, policy->min,
>> + CPUFREQ_RELATION_L);
>> +
>> + mutex_unlock(_policy->work_lock);
>> + }
>> +
>> + sg_policy->need_freq_update = true;
>
> I am wondering why we need to do this for !fast_switch_enabled case?

That will cause the rate limit to be ignored in the utilization update
handler which may be necessary if it is set to a relatively large
value (like 1 s).

>> + return 0;
>> +}
>
> Apart from that:
>
> Acked-by: Viresh Kumar 

Thanks,
Rafael


Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-03-29 Thread Viresh Kumar
On 30-03-16, 04:00, Rafael J. Wysocki wrote:
> +static int sugov_init(struct cpufreq_policy *policy)
> +{
> + struct sugov_policy *sg_policy;
> + struct sugov_tunables *tunables;
> + unsigned int lat;
> + int ret = 0;
> +
> + /* State should be equivalent to EXIT */
> + if (policy->governor_data)
> + return -EBUSY;
> +
> + sg_policy = sugov_policy_alloc(policy);
> + if (!sg_policy)
> + return -ENOMEM;
> +
> + mutex_lock(_tunables_lock);
> +
> + if (global_tunables) {
> + if (WARN_ON(have_governor_per_policy())) {
> + ret = -EINVAL;
> + goto free_sg_policy;
> + }
> + policy->governor_data = sg_policy;
> + sg_policy->tunables = global_tunables;
> +
> + gov_attr_set_get(_tunables->attr_set, 
> _policy->tunables_hook);
> + goto out;
> + }
> +
> + tunables = sugov_tunables_alloc(sg_policy);
> + if (!tunables) {
> + ret = -ENOMEM;
> + goto free_sg_policy;
> + }
> +
> + tunables->rate_limit_us = LATENCY_MULTIPLIER;
> + lat = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
> + if (lat)
> + tunables->rate_limit_us *= lat;
> +
> + policy->governor_data = sg_policy;
> + sg_policy->tunables = tunables;
> +
> + ret = kobject_init_and_add(>attr_set.kobj, 
> _tunables_ktype,
> +get_governor_parent_kobj(policy), "%s",
> +schedutil_gov.name);
> + if (ret)
> + goto fail;
> +
> + out:
> + mutex_unlock(_tunables_lock);
> +
> + cpufreq_enable_fast_switch(policy);
> + return 0;
> +
> + fail:
> + policy->governor_data = NULL;
> + sugov_tunables_free(tunables);
> +
> + free_sg_policy:
> + mutex_unlock(_tunables_lock);
> +
> + sugov_policy_free(sg_policy);
> + pr_err("cpufreq: schedutil governor initialization failed (error 
> %d)\n", ret);
> + return ret;
> +}

The current version of this looks good to me and takes care of all the issues I
raised earlier. Thanks.

> +static int sugov_limits(struct cpufreq_policy *policy)
> +{
> + struct sugov_policy *sg_policy = policy->governor_data;
> +
> + if (!policy->fast_switch_enabled) {
> + mutex_lock(_policy->work_lock);
> +
> + if (policy->max < policy->cur)
> + __cpufreq_driver_target(policy, policy->max,
> + CPUFREQ_RELATION_H);
> + else if (policy->min > policy->cur)
> + __cpufreq_driver_target(policy, policy->min,
> + CPUFREQ_RELATION_L);
> +
> + mutex_unlock(_policy->work_lock);
> + }
> +
> + sg_policy->need_freq_update = true;

I am wondering why we need to do this for !fast_switch_enabled case?

> + return 0;
> +}

Apart from that:

Acked-by: Viresh Kumar 

-- 
viresh


Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

2016-03-29 Thread Viresh Kumar
On 30-03-16, 04:00, Rafael J. Wysocki wrote:
> +static int sugov_init(struct cpufreq_policy *policy)
> +{
> + struct sugov_policy *sg_policy;
> + struct sugov_tunables *tunables;
> + unsigned int lat;
> + int ret = 0;
> +
> + /* State should be equivalent to EXIT */
> + if (policy->governor_data)
> + return -EBUSY;
> +
> + sg_policy = sugov_policy_alloc(policy);
> + if (!sg_policy)
> + return -ENOMEM;
> +
> + mutex_lock(_tunables_lock);
> +
> + if (global_tunables) {
> + if (WARN_ON(have_governor_per_policy())) {
> + ret = -EINVAL;
> + goto free_sg_policy;
> + }
> + policy->governor_data = sg_policy;
> + sg_policy->tunables = global_tunables;
> +
> + gov_attr_set_get(_tunables->attr_set, 
> _policy->tunables_hook);
> + goto out;
> + }
> +
> + tunables = sugov_tunables_alloc(sg_policy);
> + if (!tunables) {
> + ret = -ENOMEM;
> + goto free_sg_policy;
> + }
> +
> + tunables->rate_limit_us = LATENCY_MULTIPLIER;
> + lat = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
> + if (lat)
> + tunables->rate_limit_us *= lat;
> +
> + policy->governor_data = sg_policy;
> + sg_policy->tunables = tunables;
> +
> + ret = kobject_init_and_add(>attr_set.kobj, 
> _tunables_ktype,
> +get_governor_parent_kobj(policy), "%s",
> +schedutil_gov.name);
> + if (ret)
> + goto fail;
> +
> + out:
> + mutex_unlock(_tunables_lock);
> +
> + cpufreq_enable_fast_switch(policy);
> + return 0;
> +
> + fail:
> + policy->governor_data = NULL;
> + sugov_tunables_free(tunables);
> +
> + free_sg_policy:
> + mutex_unlock(_tunables_lock);
> +
> + sugov_policy_free(sg_policy);
> + pr_err("cpufreq: schedutil governor initialization failed (error 
> %d)\n", ret);
> + return ret;
> +}

The current version of this looks good to me and takes care of all the issues I
raised earlier. Thanks.

> +static int sugov_limits(struct cpufreq_policy *policy)
> +{
> + struct sugov_policy *sg_policy = policy->governor_data;
> +
> + if (!policy->fast_switch_enabled) {
> + mutex_lock(_policy->work_lock);
> +
> + if (policy->max < policy->cur)
> + __cpufreq_driver_target(policy, policy->max,
> + CPUFREQ_RELATION_H);
> + else if (policy->min > policy->cur)
> + __cpufreq_driver_target(policy, policy->min,
> + CPUFREQ_RELATION_L);
> +
> + mutex_unlock(_policy->work_lock);
> + }
> +
> + sg_policy->need_freq_update = true;

I am wondering why we need to do this for !fast_switch_enabled case?

> + return 0;
> +}

Apart from that:

Acked-by: Viresh Kumar 

-- 
viresh