RE: [PATCH v7] cpufreq: intel_pstate: Implement passive mode with HWP enabled

2020-09-06 Thread Doug Smythies
Hi Rafael,

On 2020.08.17 14:06 Doug Smythies wrote:
> On 2020.08.06 05:04 Rafael J. Wysocki wrote:
> 
> > Allow intel_pstate to work in the passive mode with HWP enabled and
> > make it set the HWP minimum performance limit (HWP floor) to the
> > P-state value given by the target frequency supplied by the cpufreq
> > governor, so as to prevent the HWP algorithm and the CPU scheduler
> > from working against each other, at least when the schedutil governor
> > is in use, and update the intel_pstate documentation accordingly.
> 
...
> 
> powersave governor:
> acpi-cpufreq: good
> intel_cpufreq hwp: bad
> intel_cpufreq no hwp: good

It occurs to me that my expectations as to what 
is meant by "powersave" might not agree with yours. 

For the powersave governor, this is what we have now:

intel_cpufreq hwp == intel_pstate hwp
intel_cpufreq no hwp == acpi-cpufreq == always minimum freq
intel_pstate no hwp ~= acpi-cpufreq/ondemand

Is that your understanding/intention?

My expectation was/is:

intel_cpufreq hwp == intel_cpufreq no hwp == acpi-cpufreq == always minimum freq
intel_pstate no hwp ~= acpi-cpufreq/ondemand
intel_pstate hwp == Unique. Say, extremely course version of ondemand.

... Doug




RE: [PATCH v7] cpufreq: intel_pstate: Implement passive mode with HWP enabled

2020-08-17 Thread Doug Smythies
On 2020.08.06 05:04 Rafael J. Wysocki wrote:

> Allow intel_pstate to work in the passive mode with HWP enabled and
> make it set the HWP minimum performance limit (HWP floor) to the
> P-state value given by the target frequency supplied by the cpufreq
> governor, so as to prevent the HWP algorithm and the CPU scheduler
> from working against each other, at least when the schedutil governor
> is in use, and update the intel_pstate documentation accordingly.

...

Hi Rafael,

You may or may not recall, I mentioned my further feedback would be
delayed, as I wanted to work on reducing the labour content of my
most basic CPU frequency scaler test.

I have tested kernel 5.9-rc1 for pretty much every intel_pstate
variant and governor, and also the acpi-cpufreq driver.

Other than changing governors, changes were only made via
grub command line options and re-boot. EPP or EPB were never
modified, they were always whatever default.

performance governor: (left mostly blank, on purpose.)
acpi-cpufreq:
intel_cpufreq hwp: good
intel_cpufreq no hwp:
intel_pstate hwp:
intel_pstate no hwp:

ondemand governor:
acpi-cpufreq: good
intel_cpufreq hwp: bad
intel_cpufreq no hwp: good

conservative governor:
acpi-cpufreq: good
intel_cpufreq hwp: good
intel_cpufreq no hwp: good

schedutil governor:
acpi-cpufreq: good
intel_cpufreq hwp: bad
intel_cpufreq no hwp: good

powersave governor:
acpi-cpufreq: good
intel_cpufreq hwp: bad
intel_cpufreq no hwp: good

active-powersave governor:
intel_pstate hwp: ? not smooth, suffers from the broken HWP issue.
intel_pstate no hwp: good.
Intel_pstate hwp, idle state 2 disabled: Better but still worse for power.

Now, we don't actually care about CPU frequency, we care about power:

ondemand governor:

periodic workflow at 347 hertz.
~58% load at 4.60 GHz (where hwp operates)
~76% load at 3.5 GHz (where no hwp operates)

intel_cpufreq hwp: 14.3 processor package watts. 51.5 watts on the mains to the 
computer.
intel_cpufreq no hwp: 9.1 processor package watts. 45.5 watts on the mains to 
the computer. 

schedutil governor:

periodic workflow at 347 hertz.
~36% load at 4.60 GHz (where hwp operates)
~55% load at 3.2 GHz (where no hwp operates)

intel_cpufreq hwp: 9.6 processor package watts. 45.8 watts on the mains to the 
computer.
intel_cpufreq no hwp: ~6 processor package watts. ~41 watts on the mains to the 
computer. (noisy)

powersave governor:

periodic workflow at 347 hertz.
~39.8% load at 2.00 GHz (where hwp operates)
~92.5% load at 0.8 GHz (where no hwp operates)

intel_cpufreq hwp: 2.6 processor package watts. 38 watts on the mains to the 
computer.
intel_cpufreq no hwp: 1.9 processor package watts. 36 watts on the mains to the 
computer.

active-powersave governor:

periodic workflow at 347 hertz.
~58% load at 4.60 GHz (where hwp operates)
~72% load at 3.88 GHz (where no hwp operates) 

intel_pstate hwp: 14.2 processor package watts. 52 watts on the mains to the 
computer.
intel_pstate no hwp: 10.1 processor package watts. 48 watts on the mains to the 
computer.

Link to web page with much of this same content which, in turn, links to 
various graphs.
Coded, to avoid the barrage of bots:

double u double u double u dot smythies dot com /~doug/linux/s18/hwp/v7/
 
... Doug




Re: [PATCH v7] cpufreq: intel_pstate: Implement passive mode with HWP enabled

2020-08-11 Thread Rafael J. Wysocki
On Tuesday, August 11, 2020 2:51:41 AM CEST Francisco Jerez wrote:
> 
> --==-=-=
> Content-Type: multipart/mixed; boundary="=-=-="
> 
> --=-=-=
> Content-Type: text/plain; charset=utf-8
> Content-Disposition: inline
> Content-Transfer-Encoding: quoted-printable
> 
> "Rafael J. Wysocki"  writes:
> 
> > From: Rafael J. Wysocki 
> >
> > Allow intel_pstate to work in the passive mode with HWP enabled and
> > make it set the HWP minimum performance limit (HWP floor) to the
> > P-state value given by the target frequency supplied by the cpufreq
> > governor, so as to prevent the HWP algorithm and the CPU scheduler
> > from working against each other, at least when the schedutil governor
> > is in use, and update the intel_pstate documentation accordingly.
> >
> > Among other things, this allows utilization clamps to be taken
> > into account, at least to a certain extent, when intel_pstate is
> > in use and makes it more likely that sufficient capacity for
> > deadline tasks will be provided.
> >
> > After this change, the resulting behavior of an HWP system with
> > intel_pstate in the passive mode should be close to the behavior
> > of the analogous non-HWP system with intel_pstate in the passive
> > mode, except that in the frequency range below the base frequency
> > (ie. the frequency retured by the base_frequency cpufreq attribute
> > in sysfs on HWP systems) the HWP algorithm is allowed to make the
> > CPU run at a frequency above the floor P-state set by intel_pstate,
> > with or without hardware coordination of P-states among CPUs in the
> > same package.
> >
> 
> The "frequency range below the base frequency" part of the paragraph
> above seems somewhat misleading, since AFAICT the same thing will happen
> in the P-state range above the base frequency. 

Fair enough.  I rephrased the changelog when applying the patch.

> Another minor comment below, other than that LGTM:

And this one has been fixed too.

> Reviewed-by: Francisco Jerez 

Thanks!





Re: [PATCH v7] cpufreq: intel_pstate: Implement passive mode with HWP enabled

2020-08-10 Thread Francisco Jerez
"Rafael J. Wysocki"  writes:

> From: Rafael J. Wysocki 
>
> Allow intel_pstate to work in the passive mode with HWP enabled and
> make it set the HWP minimum performance limit (HWP floor) to the
> P-state value given by the target frequency supplied by the cpufreq
> governor, so as to prevent the HWP algorithm and the CPU scheduler
> from working against each other, at least when the schedutil governor
> is in use, and update the intel_pstate documentation accordingly.
>
> Among other things, this allows utilization clamps to be taken
> into account, at least to a certain extent, when intel_pstate is
> in use and makes it more likely that sufficient capacity for
> deadline tasks will be provided.
>
> After this change, the resulting behavior of an HWP system with
> intel_pstate in the passive mode should be close to the behavior
> of the analogous non-HWP system with intel_pstate in the passive
> mode, except that in the frequency range below the base frequency
> (ie. the frequency retured by the base_frequency cpufreq attribute
> in sysfs on HWP systems) the HWP algorithm is allowed to make the
> CPU run at a frequency above the floor P-state set by intel_pstate,
> with or without hardware coordination of P-states among CPUs in the
> same package.
>

The "frequency range below the base frequency" part of the paragraph
above seems somewhat misleading, since AFAICT the same thing will happen
in the P-state range above the base frequency.   Another minor comment
below, other than that LGTM:

Reviewed-by: Francisco Jerez 

> [If P-states of the CPUs in the same package are coordinated at the
>  hardware level, a non-HWP processor may choose a P-state above the
>  target one like a processor with HWP enabled may choose a P-state
>  above the HWP floor, so the HWP behavior is analogous to the non-HWP
>  one in that case.
>
>  Also note that the HWP floor may not be taken into account by
>  the processor in the range of P-states above the base frequency,
>  referred to as the turbo range, where the processor has a license to
>  choose any P-state, either below or above the HWP floor, just like a
>  non-HWP processor in the case when the target P-state falls into the
>  turbo range.]
>
> With this change applied, intel_pstate in the passive mode
> assumes complete control over the HWP request MSR and concurrent
> changes of that MSR (eg. via the direct MSR access interface) are
> overridden by it.
>
> Signed-off-by: Rafael J. Wysocki 
> ---
>
> Sending the right patch this time, sorry for the confusion.
>
> This is based on the current mainline.
>
> v1 -> v2:
>* Avoid a race condition when updating the HWP request register while
>  setting a new EPP value via sysfs.
>
> v2 -> v3:
>* Rebase.
>
> v3 -> v4:
>* Avoid exposing the hwp_dynamic_boost sysfs switch in the passive mode.
>
> v4 -> v5:
>* Do not acquire intel_pstate_driver_lock in
>  store_energy_performance_preference(), because it runs under
>  policy->rwsem, so intel_pstate_driver cannot change while it is running.
>* Rearrange the changelog a bit to avoid confusion.
>
> v5 -> v6:
>* Fix the problem with the EPP setting via sysfs not working with the
>  performance and powersave governors by stopping and restarting the
>  governor around the sysfs-based EPP updates in the passive mode.
>* Because of that, use the epp_cached field just for avoiding the above
>  if the new EPP value for the given CPU is the same as the old one.
>* Export cpufreq_start/stop_governor() from the core (for the above).
>
> v6 -> v7:
>* Cosmetic changes in store_energy_performance_prefernce() to reduce the
>  LoC number and make it a bit easier to read.  No intentional functional
>  impact.
>
> ---
>  Documentation/admin-guide/pm/intel_pstate.rst |   89 -
>  drivers/cpufreq/cpufreq.c |6 
>  drivers/cpufreq/intel_pstate.c|  245 
> +++---
>  include/linux/cpufreq.h   |2 
>  4 files changed, 229 insertions(+), 113 deletions(-)
>
> Index: linux-pm/drivers/cpufreq/intel_pstate.c
> ===
> --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
> +++ linux-pm/drivers/cpufreq/intel_pstate.c
> @@ -36,6 +36,7 @@
>  #define INTEL_PSTATE_SAMPLING_INTERVAL   (10 * NSEC_PER_MSEC)
>  
>  #define INTEL_CPUFREQ_TRANSITION_LATENCY 2
> +#define INTEL_CPUFREQ_TRANSITION_DELAY_HWP   5000
>  #define INTEL_CPUFREQ_TRANSITION_DELAY   500
>  
>  #ifdef CONFIG_ACPI
> @@ -220,6 +221,7 @@ struct global_params {
>   *   preference/bias
>   * @epp_saved:   Saved EPP/EPB during system suspend or CPU 
> offline
>   *   operation
> + * @epp_cached   Cached HWP energy-performance preference value
>   * @hwp_req_cached:  Cached value of the last HWP Request MSR
>   * @hwp_cap_cached:  Cached value of the last HWP 

Re: [PATCH v7] cpufreq: intel_pstate: Implement passive mode with HWP enabled

2020-08-09 Thread Srinivas Pandruvada
On Thu, 2020-08-06 at 14:03 +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki 
> 
> Allow intel_pstate to work in the passive mode with HWP enabled and
> make it set the HWP minimum performance limit (HWP floor) to the
> P-state value given by the target frequency supplied by the cpufreq
> governor, so as to prevent the HWP algorithm and the CPU scheduler
> from working against each other, at least when the schedutil governor
> is in use, and update the intel_pstate documentation accordingly.
> 
> Among other things, this allows utilization clamps to be taken
> into account, at least to a certain extent, when intel_pstate is
> in use and makes it more likely that sufficient capacity for
> deadline tasks will be provided.
> 
> After this change, the resulting behavior of an HWP system with
> intel_pstate in the passive mode should be close to the behavior
> of the analogous non-HWP system with intel_pstate in the passive
> mode, except that in the frequency range below the base frequency
> (ie. the frequency retured by the base_frequency cpufreq attribute
> in sysfs on HWP systems) the HWP algorithm is allowed to make the
> CPU run at a frequency above the floor P-state set by intel_pstate,
> with or without hardware coordination of P-states among CPUs in the
> same package.
> 
> [If P-states of the CPUs in the same package are coordinated at the
>  hardware level, a non-HWP processor may choose a P-state above the
>  target one like a processor with HWP enabled may choose a P-state
>  above the HWP floor, so the HWP behavior is analogous to the non-HWP
>  one in that case.
> 
>  Also note that the HWP floor may not be taken into account by
>  the processor in the range of P-states above the base frequency,
>  referred to as the turbo range, where the processor has a license to
>  choose any P-state, either below or above the HWP floor, just like a
>  non-HWP processor in the case when the target P-state falls into the
>  turbo range.]
> 
> With this change applied, intel_pstate in the passive mode
> assumes complete control over the HWP request MSR and concurrent
> changes of that MSR (eg. via the direct MSR access interface) are
> overridden by it.
> 
> Signed-off-by: Rafael J. Wysocki 
Acked-by: Srinivas Pandruvada 

> ---
> 
> Sending the right patch this time, sorry for the confusion.
> 
> This is based on the current mainline.
> 
> v1 -> v2:
>* Avoid a race condition when updating the HWP request register
> while
>  setting a new EPP value via sysfs.
> 
> v2 -> v3:
>* Rebase.
> 
> v3 -> v4:
>* Avoid exposing the hwp_dynamic_boost sysfs switch in the passive
> mode.
> 
> v4 -> v5:
>* Do not acquire intel_pstate_driver_lock in
>  store_energy_performance_preference(), because it runs under
>  policy->rwsem, so intel_pstate_driver cannot change while it is
> running.
>* Rearrange the changelog a bit to avoid confusion.
> 
> v5 -> v6:
>* Fix the problem with the EPP setting via sysfs not working with
> the
>  performance and powersave governors by stopping and restarting
> the
>  governor around the sysfs-based EPP updates in the passive mode.
>* Because of that, use the epp_cached field just for avoiding the
> above
>  if the new EPP value for the given CPU is the same as the old
> one.
>* Export cpufreq_start/stop_governor() from the core (for the
> above).
> 
> v6 -> v7:
>* Cosmetic changes in store_energy_performance_prefernce() to
> reduce the
>  LoC number and make it a bit easier to read.  No intentional
> functional
>  impact.
> 
> ---
>  Documentation/admin-guide/pm/intel_pstate.rst |   89 -
>  drivers/cpufreq/cpufreq.c |6 
>  drivers/cpufreq/intel_pstate.c|  245
> +++---
>  include/linux/cpufreq.h   |2 
>  4 files changed, 229 insertions(+), 113 deletions(-)
> 
> Index: linux-pm/drivers/cpufreq/intel_pstate.c
> ===
> --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
> +++ linux-pm/drivers/cpufreq/intel_pstate.c
> @@ -36,6 +36,7 @@
>  #define INTEL_PSTATE_SAMPLING_INTERVAL   (10 * NSEC_PER_MSEC)
>  
>  #define INTEL_CPUFREQ_TRANSITION_LATENCY 2
> +#define INTEL_CPUFREQ_TRANSITION_DELAY_HWP   5000
>  #define INTEL_CPUFREQ_TRANSITION_DELAY   500
>  
>  #ifdef CONFIG_ACPI
> @@ -220,6 +221,7 @@ struct global_params {
>   *   preference/bias
>   * @epp_saved:   Saved EPP/EPB during system suspend or
> CPU offline
>   *   operation
> + * @epp_cached   Cached HWP energy-performance
> preference value
>   * @hwp_req_cached:  Cached value of the last HWP Request MSR
>   * @hwp_cap_cached:  Cached value of the last HWP Capabilities MSR
>   * @last_io_update:  Last time when IO wake flag was set
> @@ -257,6 +259,7 @@ struct cpudata {
>   s16 epp_policy;
>   s16 epp_default;
>   s16 

Re: [PATCH v7] cpufreq: intel_pstate: Implement passive mode with HWP enabled

2020-08-06 Thread Rafael J. Wysocki
On Thu, Aug 6, 2020 at 1:04 PM Doug Smythies  wrote:
>
> On 2020.08.05 09:56 Rafael J. Wysocki wrote:
>
> > v6 -> v7:
> >* Cosmetic changes in store_energy_performance_prefernce() to reduce the
> >  LoC number and make it a bit easier to read.  No intentional functional
> >  impact.
>
> ??
> V7 is identical to V6.
>
> Diff:

Oh.

It looks like I sent the v6 instead of it, then.

I'll send the right one shortly, sorry for the confusion.


RE: [PATCH v7] cpufreq: intel_pstate: Implement passive mode with HWP enabled

2020-08-05 Thread Doug Smythies
On 2020.08.05 09:56 Rafael J. Wysocki wrote:

> v6 -> v7:
>* Cosmetic changes in store_energy_performance_prefernce() to reduce the
>  LoC number and make it a bit easier to read.  No intentional functional
>  impact.

??
V7 is identical to V6.

Diff:

$ diff hwppassive-v6-2-2.patch hwppassive-v7-2-2.patch
2c2
< Sent: August 4, 2020 8:11 AM
---
> Sent: August 5, 2020 9:56 AM
5c5
< Subject: [PATCH v6] cpufreq: intel_pstate: Implement passive mode with HWP 
enabled
---
> Subject: [PATCH v7] cpufreq: intel_pstate: Implement passive mode with HWP 
> enabled
76a77,81
>
> v6 -> v7:
>* Cosmetic changes in store_energy_performance_prefernce() to reduce the
>  LoC number and make it a bit easier to read.  No intentional functional
>  impact.

... Doug