On 03/23/2017 06:39 PM, Rafael J. Wysocki wrote:
> On Thu, Mar 23, 2017 at 8:26 PM, Sai Gurrappadi <[email protected]> 
> wrote:
>> Hi Rafael,
> 
> Hi,
> 
>> On 03/21/2017 04:08 PM, Rafael J. Wysocki wrote:
>>> From: Rafael J. Wysocki <[email protected]>
>>
>> <snip>
>>
>>>
>>> That has been attributed to CPU utilization metric updates on task
>>> migration that cause the total utilization value for the CPU to be
>>> reduced by the utilization of the migrated task.  If that happens,
>>> the schedutil governor may see a CPU utilization reduction and will
>>> attempt to reduce the CPU frequency accordingly right away.  That
>>> may be premature, though, for example if the system is generally
>>> busy and there are other runnable tasks waiting to be run on that
>>> CPU already.
>>>
>>> This is unlikely to be an issue on systems where cpufreq policies are
>>> shared between multiple CPUs, because in those cases the policy
>>> utilization is computed as the maximum of the CPU utilization values
>>> over the whole policy and if that turns out to be low, reducing the
>>> frequency for the policy most likely is a good idea anyway.  On
>>
>> I have observed this issue even in the shared policy case (one clock domain 
>> for many CPUs). On migrate, the actual load update is split into two updates:
>>
>> 1. Add to removed_load on src_cpu (cpu_util(src_cpu) not updated yet)
>> 2. Do wakeup on dst_cpu, add load to dst_cpu
>>
>> Now if src_cpu manages to do a PELT update before 2. happens, ex: say a 
>> small periodic task woke up on src_cpu, it'll end up subtracting the 
>> removed_load from its utilization and issue a frequency update before 2. 
>> happens.
>>
>> This causes a premature dip in frequency which doesn't get corrected until 
>> the next util update that fires after rate_limit_us. The dst_cpu freq. 
>> update from step 2. above gets rate limited in this scenario.
> 
> Interesting, and this seems to be related to last_freq_update_time
> being per-policy (which it has to be, because frequency updates are
> per-policy too and that's what we need to rate-limit).
> 

Correct.
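
To make the sequence concrete, here is a toy user-space model of it. This is only a sketch and not kernel code: the toy_* names, the two-CPU policy and the utilization numbers are all made up for illustration. It mimics just two things, the deferred subtraction of removed load on src_cpu and the per-policy rate limit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define TOY_NR_CPUS 2

/* Hypothetical model of one shared policy; not the kernel's data layout. */
struct toy_policy {
	uint64_t last_freq_update_time_us;      /* one timestamp for the whole policy */
	uint64_t rate_limit_us;
	unsigned int util[TOY_NR_CPUS];         /* per-CPU utilization */
	unsigned int removed_util[TOY_NR_CPUS]; /* load queued for removal */
	unsigned int cur_freq_util;             /* util the current frequency was picked for */
};

/* Step 1: the task leaves src_cpu, its load is only queued for removal. */
static void toy_migrate_out(struct toy_policy *p, int cpu, unsigned int util)
{
	p->removed_util[cpu] += util;
}

/* A later PELT-style update on that CPU finally subtracts the queued load. */
static void toy_pelt_update(struct toy_policy *p, int cpu)
{
	assert(p->util[cpu] >= p->removed_util[cpu]);
	p->util[cpu] -= p->removed_util[cpu];
	p->removed_util[cpu] = 0;
}

/* Step 2: the task wakes up on dst_cpu and its load is added there. */
static void toy_wakeup(struct toy_policy *p, int cpu, unsigned int util)
{
	p->util[cpu] += util;
}

/* Frequency request: max util over the policy, rate limited per policy.
 * Returns 1 if the request went through, 0 if it was rate limited. */
static int toy_freq_update(struct toy_policy *p, uint64_t now_us)
{
	unsigned int max = 0;
	int cpu;

	if (now_us - p->last_freq_update_time_us < p->rate_limit_us)
		return 0;

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (p->util[cpu] > max)
			max = p->util[cpu];

	p->cur_freq_util = max;
	p->last_freq_update_time_us = now_us;
	return 1;
}

/* Replays the problematic ordering: src_cpu updates before dst_cpu wakeup. */
void toy_demo(void)
{
	struct toy_policy p = {
		.rate_limit_us = 10000,
		.util = { 800, 50 },
		.cur_freq_util = 800,
	};
	int done;

	toy_migrate_out(&p, 0, 800);        /* 1. add to removed load on src_cpu */
	toy_pelt_update(&p, 0);             /* small task wakes on src_cpu first */
	done = toy_freq_update(&p, 20100);  /* premature dip */
	printf("src update: done=%d util=%u\n", done, p.cur_freq_util);

	toy_wakeup(&p, 1, 800);             /* 2. wakeup on dst_cpu */
	done = toy_freq_update(&p, 20200);  /* inside the rate limit window */
	printf("dst update: done=%d util=%u\n", done, p.cur_freq_util);

	done = toy_freq_update(&p, 30100);  /* first update after rate_limit_us */
	printf("next update: done=%d util=%u\n", done, p.cur_freq_util);
}
```

In this model the dst_cpu request is dropped because the src_cpu request has just consumed the policy-wide rate limit, so the frequency stays low until the first update after rate_limit_us.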

> Does this happen often enough to be a real concern in practice on
> those configurations, though?
> 
> The other CPUs in the policy need to be either idle (so schedutil
> doesn't take them into account at all) or lightly utilized for that to
> happen, so that would affect workloads with one CPU hog type of task
> that is migrated from one CPU to another within a policy and that
> doesn't happen too often AFAICS.

It is possible, even likely in some cases, for a CPU-heavy task to migrate on
wakeup between the policy->cpus via select_idle_sibling() if the prev_cpu it
was on was !idle at wakeup.

This pattern of one heavy thread plus lots of light work is common on Android
(games, browsing, etc.) given how Android does its threading for IPC (Binder)
and its rendering/audio pipelines.

Unfortunately, I don't have any numbers at the moment, though.

-Sai
