On 03-12-20, 12:54, Dietmar Eggemann wrote:
> On 24/11/2020 07:26, Viresh Kumar wrote:
> > Several parts of the kernel are already using the effective CPU
> > utilization (as seen by the scheduler) to get the current load on the
> > CPU, do the same here instead of depending on the idle time of the CPU,
> > which isn't that accurate comparatively.
> >
> > This is also the right thing to do as it makes the cpufreq governor
> > (schedutil) align better with the cpufreq_cooling driver, as the power
> > requested by cpufreq_cooling governor will exactly match the next
> > frequency requested by the schedutil governor since they are both using
> > the same metric to calculate load.
> >
> > This was tested on ARM Hikey6220 platform with hackbench, sysbench and
> > schbench. None of them showed any regression or significant
> > improvements. Schbench is the most important ones out of these as it
> > creates the scenario where the utilization numbers provide a better
> > estimate of the future.
> >
> > Scenario 1: The CPUs were mostly idle in the previous polling window of
> > the IPA governor as the tasks were sleeping and here are the details
> > from traces (load is in %):
> >
> >  Old: thermal_power_cpu_get_power: cpus=00000000,000000ff freq=1200000
> >       total_load=203 load={{0x35,0x1,0x0,0x31,0x0,0x0,0x64,0x0}}
> >       dynamic_power=1339
> >  New: thermal_power_cpu_get_power: cpus=00000000,000000ff freq=1200000
> >       total_load=600 load={{0x60,0x46,0x45,0x45,0x48,0x3b,0x61,0x44}}
> >       dynamic_power=3960
>
> When I ran schbench (-t 16 -r 5) on hikey960 I get multiple (~50)
> instances of ~80ms task activity phase and then ~20ms idle phase on all
> CPUs.
>
> So I assume that scenario 1 is at the beginning (but you mentioned the
> task were sleeping?)

I am not able to find the exact values I used, but I did something
like this to create a scenario where the old computation would find
the CPUs idle in the last IPA window:

- schbench -m 2 -t 4 -s 25000 -c 20000 -r 60
- IPA sampling rate set to 10 ms

With this, IPA wakes up many times and finds the CPUs to have been
idle in the last IPA window (i.e. 10 ms).

> and scenario 2 is somewhere in the middle of the
> testrun?

This also happens all the time, as there will be cases when the IPA
runs and finds the CPUs to have been running for the whole of the
last 10 ms.

> IMHO, the util-based approach delivers really better results at the
> beginning and at the end of the entire testrun.
> During the testrun, the util-based and the idle-based approach deliver
> similar results.
>
> It's a little bit tricky to compare test results since the IPA sampling
> rate is 100ms and the load values you get depend on how the workload
> pattern and the IPA sampling align.

Right.

> > Here, the "Old" line gives the load and requested_power (dynamic_power
> > here) numbers calculated using the idle time based implementation, while
> > "New" is based on the CPU utilization from scheduler.
> >
> > As can be clearly seen, the load and requested_power numbers are simply
> > incorrect in the idle time based approach and the numbers collected from
> > CPU's utilization are much closer to the reality.
>
> I assume the IPA sampling is done after ~50ms of the first task activity
> phase.
>
> > Scenario 2: The CPUs were busy in the previous polling window of the IPA
> > governor:
> >
> >  Old: thermal_power_cpu_get_power: cpus=00000000,000000ff freq=1200000
> >       total_load=800 load={{0x64,0x64,0x64,0x64,0x64,0x64,0x64,0x64}}
> >       dynamic_power=5280
> >  New: thermal_power_cpu_get_power: cpus=00000000,000000ff freq=1200000
> >       total_load=708 load={{0x4d,0x5c,0x5c,0x5b,0x5c,0x5c,0x51,0x5b}}
> >       dynamic_power=4672
> >
> > As can be seen, the idle time based load is 100% for all the CPUs as it
> > took only the last window into account, but in reality the CPUs aren't
> > that loaded as shown by the utilization numbers.
>
> Is this an IPA sampling at the end of the ~20ms idle phase?

This is during the phase where the CPUs were fully busy for the last
period.

-- 
viresh
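
As a reading aid for the traces quoted above, here is a minimal Python
sketch (not kernel code) that decodes the per-CPU load array, which is
printed in hex, and checks it against total_load and dynamic_power. The
full-load figure of 660 is an assumption inferred from the numbers
themselves (5280 * 100 / 800), and the linear scaling with integer
division is likewise assumed because it reproduces all four
dynamic_power values. The helper names are made up for this example.

```python
import re

# The four trace lines quoted in this thread, trimmed to the fields used here.
TRACES = {
    "scenario1_old": "total_load=203 load={{0x35,0x1,0x0,0x31,0x0,0x0,0x64,0x0}} dynamic_power=1339",
    "scenario1_new": "total_load=600 load={{0x60,0x46,0x45,0x45,0x48,0x3b,0x61,0x44}} dynamic_power=3960",
    "scenario2_old": "total_load=800 load={{0x64,0x64,0x64,0x64,0x64,0x64,0x64,0x64}} dynamic_power=5280",
    "scenario2_new": "total_load=708 load={{0x4d,0x5c,0x5c,0x5b,0x5c,0x5c,0x51,0x5b}} dynamic_power=4672",
}

# Assumption: inferred from the traces (5280 * 100 / 800), not taken
# from any platform data.
ASSUMED_FULL_LOAD_POWER = 660


def parse_trace(line):
    """Return (total_load, per-CPU loads in percent, dynamic_power)."""
    total = int(re.search(r"total_load=(\d+)", line).group(1))
    raw = re.search(r"load=\{\{([^}]*)\}\}", line).group(1)
    loads = [int(value, 16) for value in raw.split(",")]  # hex to percent
    power = int(re.search(r"dynamic_power=(\d+)", line).group(1))
    return total, loads, power


def requested_power(total_load, full_load_power=ASSUMED_FULL_LOAD_POWER):
    """Assumed scaling: power at 100% load times summed load, over 100."""
    return full_load_power * total_load // 100


if __name__ == "__main__":
    for name, line in TRACES.items():
        total, loads, power = parse_trace(line)
        print(name, loads, sum(loads) == total, requested_power(total) == power)
```

Decoded, the "Old" line of scenario 1 reads 53, 1, 0, 49, 0, 0, 100
and 0 percent for the eight CPUs, which sums to the reported 203.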