RE: [PATCH v7] cpufreq: intel_pstate: Implement passive mode with HWP enabled
Hi Rafael, On 2020.08.17 14:06 Doug Smythies wrote: > On 2020.08.06 05:04 Rafael J. Wysocki wrote: > > > Allow intel_pstate to work in the passive mode with HWP enabled and > > make it set the HWP minimum performance limit (HWP floor) to the > > P-state value given by the target frequency supplied by the cpufreq > > governor, so as to prevent the HWP algorithm and the CPU scheduler > > from working against each other, at least when the schedutil governor > > is in use, and update the intel_pstate documentation accordingly. > ... > > powersave governor: > acpi-cpufreq: good > intel_cpufreq hwp: bad > intel_cpufreq no hwp: good It occurs to me that my expectations as to what is meant by "powersave" might not agree with yours. For the powersave governor, this is what we have now: intel_cpufreq hwp == intel_pstate hwp intel_cpufreq no hwp == acpi-cpufreq == always minimum freq intel_pstate no hwp ~= acpi-cpufreq/ondemand Is that your understanding/intention? My expectation was/is: intel_cpufreq hwp == intel_cpufreq no hwp == acpi-cpufreq == always minimum freq intel_pstate no hwp ~= acpi-cpufreq/ondemand intel_pstate hwp == Unique. Say, extremely course version of ondemand. ... Doug
RE: [PATCH v7] cpufreq: intel_pstate: Implement passive mode with HWP enabled
On 2020.08.06 05:04 Rafael J. Wysocki wrote: > Allow intel_pstate to work in the passive mode with HWP enabled and > make it set the HWP minimum performance limit (HWP floor) to the > P-state value given by the target frequency supplied by the cpufreq > governor, so as to prevent the HWP algorithm and the CPU scheduler > from working against each other, at least when the schedutil governor > is in use, and update the intel_pstate documentation accordingly. ... Hi Rafael, You may or may not recall, I mentioned my further feedback would be delayed, as I wanted to work on reducing the labour content of my most basic CPU frequency scaler test. I have tested kernel 5.9-rc1 for pretty much every intel_pstate variant and governor, and also the acpi-cpufreq driver. Other than changing governors, changes were only made via grub command line options and re-boot. EPP or EPB were never modified, they were always whatever default. performance governor: (left mostly blank, on purpose.) acpi-cpufreq: intel_cpufreq hwp: good intel_cpufreq no hwp: intel_pstate hwp: intel_pstate no hwp: ondemand governor: acpi-cpufreq: good intel_cpufreq hwp: bad intel_cpufreq no hwp: good conservative governor: acpi-cpufreq: good intel_cpufreq hwp: good intel_cpufreq no hwp: good schedutil governor: acpi-cpufreq: good intel_cpufreq hwp: bad intel_cpufreq no hwp: good powersave governor: acpi-cpufreq: good intel_cpufreq hwp: bad intel_cpufreq no hwp: good active-powersave governor: intel_pstate hwp: ? not smooth, suffers from the broken HWP issue. intel_pstate no hwp: good. Intel_pstate hwp, idle state 2 disabled: Better but still worse for power. Now, we don't actually care about CPU frequency, we care about power: ondemand governor: periodic workflow at 347 hertz. ~58% load at 4.60 GHz (where hwp operates) ~76% load at 3.5 GHz (where no hwp operates) intel_cpufreq hwp: 14.3 processor package watts. 51.5 watts on the mains to the computer. intel_cpufreq no hwp: 9.1 processor package watts. 45.5 watts on the mains to the computer. schedutil governor: periodic workflow at 347 hertz. ~36% load at 4.60 GHz (where hwp operates) ~55% load at 3.2 GHz (where no hwp operates) intel_cpufreq hwp: 9.6 processor package watts. 45.8 watts on the mains to the computer. intel_cpufreq no hwp: ~6 processor package watts. ~41 watts on the mains to the computer. (noisy) powersave governor: periodic workflow at 347 hertz. ~39.8% load at 2.00 GHz (where hwp operates) ~92.5% load at 0.8 GHz (where no hwp operates) intel_cpufreq hwp: 2.6 processor package watts. 38 watts on the mains to the computer. intel_cpufreq no hwp: 1.9 processor package watts. 36 watts on the mains to the computer. active-powersave governor: periodic workflow at 347 hertz. ~58% load at 4.60 GHz (where hwp operates) ~72% load at 3.88 GHz (where no hwp operates) intel_pstate hwp: 14.2 processor package watts. 52 watts on the mains to the computer. intel_pstate no hwp: 10.1 processor package watts. 48 watts on the mains to the computer. Link to web page with much of this same content which, in turn, links to various graphs. Coded, to avoid the barrage of bots: double u double u double u dot smythies dot com /~doug/linux/s18/hwp/v7/ ... Doug
Re: [PATCH v7] cpufreq: intel_pstate: Implement passive mode with HWP enabled
On Tuesday, August 11, 2020 2:51:41 AM CEST Francisco Jerez wrote: > > --==-=-= > Content-Type: multipart/mixed; boundary="=-=-=" > > --=-=-= > Content-Type: text/plain; charset=utf-8 > Content-Disposition: inline > Content-Transfer-Encoding: quoted-printable > > "Rafael J. Wysocki" writes: > > > From: Rafael J. Wysocki > > > > Allow intel_pstate to work in the passive mode with HWP enabled and > > make it set the HWP minimum performance limit (HWP floor) to the > > P-state value given by the target frequency supplied by the cpufreq > > governor, so as to prevent the HWP algorithm and the CPU scheduler > > from working against each other, at least when the schedutil governor > > is in use, and update the intel_pstate documentation accordingly. > > > > Among other things, this allows utilization clamps to be taken > > into account, at least to a certain extent, when intel_pstate is > > in use and makes it more likely that sufficient capacity for > > deadline tasks will be provided. > > > > After this change, the resulting behavior of an HWP system with > > intel_pstate in the passive mode should be close to the behavior > > of the analogous non-HWP system with intel_pstate in the passive > > mode, except that in the frequency range below the base frequency > > (ie. the frequency retured by the base_frequency cpufreq attribute > > in sysfs on HWP systems) the HWP algorithm is allowed to make the > > CPU run at a frequency above the floor P-state set by intel_pstate, > > with or without hardware coordination of P-states among CPUs in the > > same package. > > > > The "frequency range below the base frequency" part of the paragraph > above seems somewhat misleading, since AFAICT the same thing will happen > in the P-state range above the base frequency. Fair enough. I rephrased the changelog when applying the patch. > Another minor comment below, other than that LGTM: And this one has been fixed too. > Reviewed-by: Francisco Jerez Thanks!
Re: [PATCH v7] cpufreq: intel_pstate: Implement passive mode with HWP enabled
"Rafael J. Wysocki" writes: > From: Rafael J. Wysocki > > Allow intel_pstate to work in the passive mode with HWP enabled and > make it set the HWP minimum performance limit (HWP floor) to the > P-state value given by the target frequency supplied by the cpufreq > governor, so as to prevent the HWP algorithm and the CPU scheduler > from working against each other, at least when the schedutil governor > is in use, and update the intel_pstate documentation accordingly. > > Among other things, this allows utilization clamps to be taken > into account, at least to a certain extent, when intel_pstate is > in use and makes it more likely that sufficient capacity for > deadline tasks will be provided. > > After this change, the resulting behavior of an HWP system with > intel_pstate in the passive mode should be close to the behavior > of the analogous non-HWP system with intel_pstate in the passive > mode, except that in the frequency range below the base frequency > (ie. the frequency retured by the base_frequency cpufreq attribute > in sysfs on HWP systems) the HWP algorithm is allowed to make the > CPU run at a frequency above the floor P-state set by intel_pstate, > with or without hardware coordination of P-states among CPUs in the > same package. > The "frequency range below the base frequency" part of the paragraph above seems somewhat misleading, since AFAICT the same thing will happen in the P-state range above the base frequency. Another minor comment below, other than that LGTM: Reviewed-by: Francisco Jerez > [If P-states of the CPUs in the same package are coordinated at the > hardware level, a non-HWP processor may choose a P-state above the > target one like a processor with HWP enabled may choose a P-state > above the HWP floor, so the HWP behavior is analogous to the non-HWP > one in that case. > > Also note that the HWP floor may not be taken into account by > the processor in the range of P-states above the base frequency, > referred to as the turbo range, where the processor has a license to > choose any P-state, either below or above the HWP floor, just like a > non-HWP processor in the case when the target P-state falls into the > turbo range.] > > With this change applied, intel_pstate in the passive mode > assumes complete control over the HWP request MSR and concurrent > changes of that MSR (eg. via the direct MSR access interface) are > overridden by it. > > Signed-off-by: Rafael J. Wysocki > --- > > Sending the right patch this time, sorry for the confusion. > > This is based on the current mainline. > > v1 -> v2: >* Avoid a race condition when updating the HWP request register while > setting a new EPP value via sysfs. > > v2 -> v3: >* Rebase. > > v3 -> v4: >* Avoid exposing the hwp_dynamic_boost sysfs switch in the passive mode. > > v4 -> v5: >* Do not acquire intel_pstate_driver_lock in > store_energy_performance_preference(), because it runs under > policy->rwsem, so intel_pstate_driver cannot change while it is running. >* Rearrange the changelog a bit to avoid confusion. > > v5 -> v6: >* Fix the problem with the EPP setting via sysfs not working with the > performance and powersave governors by stopping and restarting the > governor around the sysfs-based EPP updates in the passive mode. >* Because of that, use the epp_cached field just for avoiding the above > if the new EPP value for the given CPU is the same as the old one. >* Export cpufreq_start/stop_governor() from the core (for the above). > > v6 -> v7: >* Cosmetic changes in store_energy_performance_prefernce() to reduce the > LoC number and make it a bit easier to read. No intentional functional > impact. > > --- > Documentation/admin-guide/pm/intel_pstate.rst | 89 - > drivers/cpufreq/cpufreq.c |6 > drivers/cpufreq/intel_pstate.c| 245 > +++--- > include/linux/cpufreq.h |2 > 4 files changed, 229 insertions(+), 113 deletions(-) > > Index: linux-pm/drivers/cpufreq/intel_pstate.c > === > --- linux-pm.orig/drivers/cpufreq/intel_pstate.c > +++ linux-pm/drivers/cpufreq/intel_pstate.c > @@ -36,6 +36,7 @@ > #define INTEL_PSTATE_SAMPLING_INTERVAL (10 * NSEC_PER_MSEC) > > #define INTEL_CPUFREQ_TRANSITION_LATENCY 2 > +#define INTEL_CPUFREQ_TRANSITION_DELAY_HWP 5000 > #define INTEL_CPUFREQ_TRANSITION_DELAY 500 > > #ifdef CONFIG_ACPI > @@ -220,6 +221,7 @@ struct global_params { > * preference/bias > * @epp_saved: Saved EPP/EPB during system suspend or CPU > offline > * operation > + * @epp_cached Cached HWP energy-performance preference value > * @hwp_req_cached: Cached value of the last HWP Request MSR > * @hwp_cap_cached: Cached value of the last HWP Capabilit
Re: [PATCH v7] cpufreq: intel_pstate: Implement passive mode with HWP enabled
On Thu, 2020-08-06 at 14:03 +0200, Rafael J. Wysocki wrote: > From: Rafael J. Wysocki > > Allow intel_pstate to work in the passive mode with HWP enabled and > make it set the HWP minimum performance limit (HWP floor) to the > P-state value given by the target frequency supplied by the cpufreq > governor, so as to prevent the HWP algorithm and the CPU scheduler > from working against each other, at least when the schedutil governor > is in use, and update the intel_pstate documentation accordingly. > > Among other things, this allows utilization clamps to be taken > into account, at least to a certain extent, when intel_pstate is > in use and makes it more likely that sufficient capacity for > deadline tasks will be provided. > > After this change, the resulting behavior of an HWP system with > intel_pstate in the passive mode should be close to the behavior > of the analogous non-HWP system with intel_pstate in the passive > mode, except that in the frequency range below the base frequency > (ie. the frequency retured by the base_frequency cpufreq attribute > in sysfs on HWP systems) the HWP algorithm is allowed to make the > CPU run at a frequency above the floor P-state set by intel_pstate, > with or without hardware coordination of P-states among CPUs in the > same package. > > [If P-states of the CPUs in the same package are coordinated at the > hardware level, a non-HWP processor may choose a P-state above the > target one like a processor with HWP enabled may choose a P-state > above the HWP floor, so the HWP behavior is analogous to the non-HWP > one in that case. > > Also note that the HWP floor may not be taken into account by > the processor in the range of P-states above the base frequency, > referred to as the turbo range, where the processor has a license to > choose any P-state, either below or above the HWP floor, just like a > non-HWP processor in the case when the target P-state falls into the > turbo range.] > > With this change applied, intel_pstate in the passive mode > assumes complete control over the HWP request MSR and concurrent > changes of that MSR (eg. via the direct MSR access interface) are > overridden by it. > > Signed-off-by: Rafael J. Wysocki Acked-by: Srinivas Pandruvada > --- > > Sending the right patch this time, sorry for the confusion. > > This is based on the current mainline. > > v1 -> v2: >* Avoid a race condition when updating the HWP request register > while > setting a new EPP value via sysfs. > > v2 -> v3: >* Rebase. > > v3 -> v4: >* Avoid exposing the hwp_dynamic_boost sysfs switch in the passive > mode. > > v4 -> v5: >* Do not acquire intel_pstate_driver_lock in > store_energy_performance_preference(), because it runs under > policy->rwsem, so intel_pstate_driver cannot change while it is > running. >* Rearrange the changelog a bit to avoid confusion. > > v5 -> v6: >* Fix the problem with the EPP setting via sysfs not working with > the > performance and powersave governors by stopping and restarting > the > governor around the sysfs-based EPP updates in the passive mode. >* Because of that, use the epp_cached field just for avoiding the > above > if the new EPP value for the given CPU is the same as the old > one. >* Export cpufreq_start/stop_governor() from the core (for the > above). > > v6 -> v7: >* Cosmetic changes in store_energy_performance_prefernce() to > reduce the > LoC number and make it a bit easier to read. No intentional > functional > impact. > > --- > Documentation/admin-guide/pm/intel_pstate.rst | 89 - > drivers/cpufreq/cpufreq.c |6 > drivers/cpufreq/intel_pstate.c| 245 > +++--- > include/linux/cpufreq.h |2 > 4 files changed, 229 insertions(+), 113 deletions(-) > > Index: linux-pm/drivers/cpufreq/intel_pstate.c > === > --- linux-pm.orig/drivers/cpufreq/intel_pstate.c > +++ linux-pm/drivers/cpufreq/intel_pstate.c > @@ -36,6 +36,7 @@ > #define INTEL_PSTATE_SAMPLING_INTERVAL (10 * NSEC_PER_MSEC) > > #define INTEL_CPUFREQ_TRANSITION_LATENCY 2 > +#define INTEL_CPUFREQ_TRANSITION_DELAY_HWP 5000 > #define INTEL_CPUFREQ_TRANSITION_DELAY 500 > > #ifdef CONFIG_ACPI > @@ -220,6 +221,7 @@ struct global_params { > * preference/bias > * @epp_saved: Saved EPP/EPB during system suspend or > CPU offline > * operation > + * @epp_cached Cached HWP energy-performance > preference value > * @hwp_req_cached: Cached value of the last HWP Request MSR > * @hwp_cap_cached: Cached value of the last HWP Capabilities MSR > * @last_io_update: Last time when IO wake flag was set > @@ -257,6 +259,7 @@ struct cpudata { > s16 epp_policy; > s16 epp_default; > s16 epp_save
Re: [PATCH v7] cpufreq: intel_pstate: Implement passive mode with HWP enabled
On Thu, Aug 6, 2020 at 1:04 PM Doug Smythies wrote: > > On 2020.08.05 09:56 Rafael J. Wysocki wrote: > > > v6 -> v7: > >* Cosmetic changes in store_energy_performance_prefernce() to reduce the > > LoC number and make it a bit easier to read. No intentional functional > > impact. > > ?? > V7 is identical to V6. > > Diff: Oh. It looks like I sent the v6 instead of it, then. I'll send the right one shortly, sorry for the confusion.
RE: [PATCH v7] cpufreq: intel_pstate: Implement passive mode with HWP enabled
On 2020.08.05 09:56 Rafael J. Wysocki wrote: > v6 -> v7: >* Cosmetic changes in store_energy_performance_prefernce() to reduce the > LoC number and make it a bit easier to read. No intentional functional > impact. ?? V7 is identical to V6. Diff: $ diff hwppassive-v6-2-2.patch hwppassive-v7-2-2.patch 2c2 < Sent: August 4, 2020 8:11 AM --- > Sent: August 5, 2020 9:56 AM 5c5 < Subject: [PATCH v6] cpufreq: intel_pstate: Implement passive mode with HWP enabled --- > Subject: [PATCH v7] cpufreq: intel_pstate: Implement passive mode with HWP > enabled 76a77,81 > > v6 -> v7: >* Cosmetic changes in store_energy_performance_prefernce() to reduce the > LoC number and make it a bit easier to read. No intentional functional > impact. ... Doug