Re: [PATCH 0/4] cpufreq: intel_pstate: Address some HWP-related oddities
On Saturday, August 22, 2020 2:47:05 AM CEST Doug Smythies wrote: > Hi Rafael, > > On 2020.08.20 09:35 Rafael J. Wysocki wrote: > > > > The purpose of this series is to address some peculiarities related to > > taking CPUs offline/online and switching between different operation > > modes with HWP enabled that have become visible after allowing the > > driver to work in the passive mode with HWP enabled in 5.9-rc1 (and > > one that was there earlier, but can be addressed easily after the > > changes madein 5.9-rc1). > > > > Please refer to the patch changelogs for details. > > > > For easier testing/review, the series is available from the git branch at: > > > > git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \ > > intel_pstate-testing > > Thanks for that. > > There still seems to be a problem with EPP getting messed up. > I have not been able to find the exact spot in the code. > > One problem is that EPP can end up as 0, and thereafter stays > at 0. In sysfs terms, it ends up as "performance" and thereafter > stays as "performance". Meanwhile I never modified it, and it started > as "balance_performance". > > It happens when changing from active to passive if the governor is > performance. > If the governor is not performance things work as expected. One change is missing from the patches in the $subject series and IMO it doesn't belong to any of them, so please find it appended below (on top of the $subject series). With it applied, this particular issue should go away. > Another problem is that EPP will end up as 128 when changing from passive > to active. I don't seem to be able to reproduce this (at least not without involving system-wide suspend/resume). Cheers! --- From: Rafael J. Wysocki Subject: [PATCH] cpufreq: intel_pstate: Restore cached EPP value during offline Because hwp_req_cached contains the effective EPP value (0) when the "performance" scaling algorithm is used in the active mode, replace it with the cached EPP value during CPU offline to prevent it from being used (unexpectedly) after switching over from the active mode to the passive mode. Also rename intel_pstate_hwp_force_min_perf() because it will do more than just forcing the minimum performance now. Signed-off-by: Rafael J. Wysocki --- drivers/cpufreq/intel_pstate.c | 19 +++ 1 file changed, 15 insertions(+), 4 deletions(-) Index: linux-pm/drivers/cpufreq/intel_pstate.c === --- linux-pm.orig/drivers/cpufreq/intel_pstate.c +++ linux-pm/drivers/cpufreq/intel_pstate.c @@ -904,12 +904,23 @@ skip_epp: wrmsrl_on_cpu(cpu, MSR_HWP_REQUEST, value); } -static void intel_pstate_hwp_force_min_perf(int cpu) +static void intel_pstate_hwp_offline(int cpu) { - u64 value; + struct cpudata *cpudata = all_cpu_data[cpu]; + u64 value = READ_ONCE(cpudata->hwp_req_cached); int min_perf; - value = all_cpu_data[cpu]->hwp_req_cached; + if (boot_cpu_has(X86_FEATURE_HWP_EPP)) { + /* +* In case the EPP has been set to "performance" by the +* active mode "performance" scaling algorithm, replace that +* temporary value with the cached EPP one. +*/ + value &= ~GENMASK_ULL(31, 24); + value |= HWP_ENERGY_PERF_PREFERENCE(cpudata->epp_cached); + WRITE_ONCE(cpudata->hwp_req_cached, value); + } + value &= ~GENMASK_ULL(31, 0); min_perf = HWP_LOWEST_PERF(all_cpu_data[cpu]->hwp_cap_cached); @@ -2312,7 +2323,7 @@ static int intel_pstate_cpu_offline(stru * performance on CPU offline to prevent that from happening. */ if (hwp_active) - intel_pstate_hwp_force_min_perf(policy->cpu); + intel_pstate_hwp_offline(policy->cpu); else intel_pstate_set_min_pstate(all_cpu_data[policy->cpu]);
RE: [PATCH 0/4] cpufreq: intel_pstate: Address some HWP-related oddities
Hi Rafael, On 2020.08.20 09:35 Rafael J. Wysocki wrote: > > The purpose of this series is to address some peculiarities related to > taking CPUs offline/online and switching between different operation > modes with HWP enabled that have become visible after allowing the > driver to work in the passive mode with HWP enabled in 5.9-rc1 (and > one that was there earlier, but can be addressed easily after the > changes madein 5.9-rc1). > > Please refer to the patch changelogs for details. > > For easier testing/review, the series is available from the git branch at: > > git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \ > intel_pstate-testing Thanks for that. There still seems to be a problem with EPP getting messed up. I have not been able to find the exact spot in the code. One problem is that EPP can end up as 0, and thereafter stays at 0. In sysfs terms, it ends up as "performance" and thereafter stays as "performance". Meanwhile I never modified it, and it started as "balance_performance". It happens when changing from active to passive if the governor is performance. If the governor is not performance things work as expected. Another problem is that EPP will end up as 128 when changing from passive to active. This erroneous condition is cleared by changing the governor to powersave and back to performance. It also doesn't occur the first time after boot, when booting to intel_cpufreq/performance/HWP. (confused yet?) The sysfs value is O.K. during this. Supporting data: Example 1: Grub: GRUB_CMDLINE_LINUX_DEFAULT="ipv6.disable=1 consoleblank=450 msr.allow_writes=on cpuidle.governor=teo" So I boot to intel_pstate/performance/HWP: # /home/doug/c/msr-decoder (always edited for only relevant parts) 6.) 0x774: IA32_HWP_REQUEST:CPU 0-5 : raw: 2E2E : 2E2E : 2E2E : 2E2E : 2E2E : 2E2E : epp:0 :0 :0 :0 :0 :0 : # cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_driver intel_pstate # cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor performance # cat /sys/devices/system/cpu/cpufreq/policy3/energy_performance_preference balance_performance # echo passive > /sys/devices/system/cpu/intel_pstate/status Note: the following results are incorrect: # cat /sys/devices/system/cpu/cpufreq/policy3/energy_performance_preference performance # echo "ondemand" | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor ondemand # cat /sys/devices/system/cpu/cpufreq/policy3/energy_performance_preference performance # /home/doug/c/msr-decoder 6.) 0x774: IA32_HWP_REQUEST:CPU 0-5 : raw: 2E08 : 2E08 : 2E08 : 2E0B : 2E13 : 2E08 : epp:0 :0 :0 :0 :0 :0 : # echo active > /sys/devices/system/cpu/intel_pstate/status # cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor performance # cat /sys/devices/system/cpu/cpufreq/policy3/energy_performance_preference performance # echo "powersave" | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor powersave # cat /sys/devices/system/cpu/cpufreq/policy3/energy_performance_preference performance # cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor powersave # cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_driver intel_pstate Example 2: Grub: GRUB_CMDLINE_LINUX_DEFAULT="ipv6.disable=1 consoleblank=450 intel_pstate=passive msr.allow_writes=on cpuidle.governor=teo" So I boot to intel_cpufreq/performance/HWP: # /home/doug/c/msr-decoder 6.) 0x774: IA32_HWP_REQUEST:CPU 0-5 : raw: 80002E2E : 80002E2E : 80002E2E : 80002E2E : 80002E2E : 80002E2E : epp: 128 : 128 : 128 : 128 : 128 : 128 : # cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_driver intel_cpufreq # cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor performance # cat /sys/devices/system/cpu/cpufreq/policy3/energy_performance_preference balance_performance # echo active > /sys/devices/system/cpu/intel_pstate/status # cat /sys/devices/system/cpu/cpufreq/policy3/energy_performance_preference balance_performance # cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor performance # cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_driver intel_pstate # /home/doug/c/msr-decoder 6.) 0x774: IA32_HWP_REQUEST:CPU 0-5 : raw: 2E2E : 2E2E : 2E2E : 2E2E : 2E2E : 2E2E : epp:0 :0 :0 :0 :0 :0 : # echo passive > /sys/devices/system/cpu/intel_pstate/status # /home/doug/c/msr-decoder 6.) 0x774: IA32_HWP_REQUEST:CPU 0-5 : raw: 80002E2E : 80002E2E : 80002E2E : 80002E2E : 80002E2E : 80002E2E : epp: 128 : 128 : 128 : 128 : 128 : 128 : # echo active > /sys/devices/system/cpu/intel_pstate/status Note: the following results are incorrect: # /home/doug/c/msr-decoder 6.) 0x774: IA32_HWP_REQUEST:CPU 0-5 : raw: 80002E2E : 80002E2E : 80002E2E : 80002E2E :