2016-08-18 21:45 GMT+08:00 Morten Rasmussen <morten.rasmus...@arm.com>:
> On Thu, Aug 18, 2016 at 07:46:44PM +0800, Wanpeng Li wrote:
>> 2016-08-18 18:24 GMT+08:00 Morten Rasmussen <morten.rasmus...@arm.com>:
>> > On Thu, Aug 18, 2016 at 09:40:55AM +0100, Morten Rasmussen wrote:
>> >> On Mon, Aug 15, 2016 at 04:42:37PM +0100, Morten Rasmussen wrote:
>> >> > On Mon, Aug 15, 2016 at 04:23:42PM +0200, Peter Zijlstra wrote:
>> >> > > But unlike that function, it doesn't actually use __update_load_avg().
>> >> > > Why not?
>> >> >
>> >> > Fair question :)
>> >> >
>> >> > We currently exploit the fact that the task utilization is _not_ updated
>> >> > in wake-up balancing to make sure we don't under-estimate the capacity
>> >> > requirements for tasks that have slept for a while. If we update it, we
>> >> > lose the non-decayed 'peak' utilization, but I guess we could just
>> >> > store it somewhere when we do the wake-up decay.
>> >> >
>> >> > I thought there was a better reason when I wrote the patch, but I don't
>> >> > recall right now. I will look into it again and see if we can use
>> >> > __update_load_avg() to do a proper update instead of doing things twice.
>> >>
>> >> AFAICT, we should be able to synchronize the task utilization to the
>> >> previous rq utilization using __update_load_avg() as you suggest. The
>> >> patch below should work as a replacement without any changes to
>> >> subsequent patches. It doesn't solve the under-estimation issue, but I
>> >> have another patch for that.
>> >
>> > And here is a possible solution to the under-estimation issue. The patch
>> > would have to go at the end of this set.
>> >
>> > ---8<---
>> >
>> > From 5bc918995c6c589b833ba1f189a8b92fa22202ae Mon Sep 17 00:00:00 2001
>> > From: Morten Rasmussen <morten.rasmus...@arm.com>
>> > Date: Wed, 17 Aug 2016 15:30:43 +0100
>> > Subject: [PATCH] sched/fair: Track peak per-entity utilization
>> >
>> > When using PELT (per-entity load tracking), placing tasks at wake-up
>> > based on the decayed utilization (decayed due to sleep) leads to
>> > under-estimation of the true utilization of the task. This could mean
>> > putting the task on a cpu with less available capacity than is actually
>> > needed. This issue can be mitigated by using 'peak' utilization instead
>> > of the decayed utilization for placement decisions, e.g. at task
>> > wake-up.
>> >
>> > The 'peak' utilization metric, util_peak, tracks util_avg when the task
>> > is running and retains its previous value while the task is
>> > blocked/waiting on the rq. It is instantly updated to track util_avg
>> > again as soon as the task is running again.
>>
>> Maybe this will lead to disabling wake affine due to a spiked peak
>> value for a task with a low average load.
>
> I assume you are referring to using task_util_peak() instead of
> task_util() in wake_cap()?
Yes.

> The peak value should never exceed the util_avg accumulated by the task
> last time it ran. So any spike has to be caused by the task accumulating
> more utilization last time it ran. We don't know if it is a spike or a
> more permanent change in behaviour, so we have to guess.

I see.

> So a spike on an asymmetric system could cause us to disable wake
> affine in some circumstances (either prev_cpu or the waker cpu has to
> have low compute capacity) for the following wake-up.
>
> SMP should be unaffected as we should bail out on the previous
> condition.

Why capacity_orig instead of capacity? The check runs at each wakeup,
and the rt class/interrupts may already have consumed a lot of the
cpu's capacity.

> The counter-example is a task with a fairly long busy period and a much
> longer period (cycle). Its util_avg might have decayed away since the
> last activation, so it appears very small at wake-up and we end up
> putting it on a low capacity cpu every time, even though it keeps the
> cpu busy for a long time every time it wakes up.

Agreed, that's the reason for the under-estimation concern.

> Did that answer your question?

Yeah, thanks for the clarification.

Regards,
Wanpeng Li