On 16/02/2021 17:39, vincent.donnef...@arm.com wrote:
> From: Vincent Donnefort <vincent.donnef...@arm.com>
>
> Being called for each dequeue, util_est reduces the number of its updates
> by filtering out when the EWMA signal is different from the task util_avg
> by less than 1%. It is a problem for a sudden util_avg ramp-up. Due to the
> decay from a previous high util_avg, EWMA might now be close enough to
> the new util_avg. No update would then happen while it would leave
> ue.enqueued with an out-of-date value.

(1) enqueued[x-1] < ewma[x-1]

(2) diff(enqueued[x], ewma[x]) < 1024/100 && enqueued[x] < ewma[x] (*)

with ewma[x-1] == ewma[x]

(*) enqueued[x] must still be less than ewma[x] w/ default
UTIL_EST_FASTUP. Otherwise we would already 'goto done' (write the new
util_est) via the previous if condition.

>
> Taking into consideration the two util_est members, EWMA and enqueued for
> the filtering, ensures, for both, an up-to-date value.
>
> This is for now an issue only for the trace probe that might return the
> stale value. Functional-wise, it isn't (yet) a problem, as the value is
> always accessed through max(enqueued, ewma).

Yeah, I remember that the ue.enqueued plots looked weird in these
sections with stale ue.enqueued values.

> This problem has been observed using LISA's UtilConvergence:test_means on
> the sd845c board.

I ran the test a couple of times on my juno board and I never hit this
path (util_est_within_margin(last_ewma_diff) &&
!util_est_within_margin(last_enqueued_diff)) for a test task.

I can't see how this issue can be board specific. Does it happen
reliably on sd845c, or does it only happen very, very occasionally?

I saw it a couple of times, but always with (non-test) tasks migrating
from one CPU to another.

> Signed-off-by: Vincent Donnefort <vincent.donnef...@arm.com>

Reviewed-by: Dietmar Eggemann <dietmar.eggem...@arm.com>

[...]
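
For reference, conditions (1) and (2) above can be reproduced with a small user-space sketch. This is not the kernel code: the struct and the skip_update_old()/skip_update_new() helpers are hypothetical names for illustration, the UTIL_EST_FASTUP shortcut is left out, and the margin is assumed to be 1024/100 as in (2).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define SCHED_CAPACITY_SCALE 1024
#define UTIL_EST_MARGIN (SCHED_CAPACITY_SCALE / 100) /* ~1% of capacity */

/* Hypothetical stand-in for the two util_est members discussed above. */
struct util_est_sketch {
	unsigned int enqueued;
	unsigned int ewma;
};

/* True when |value| is strictly smaller than margin. */
bool within_margin(int value, int margin)
{
	assert(margin > 0);
	return abs(value) < margin;
}

/* Filter as described in the changelog: only the EWMA distance is checked. */
bool skip_update_old(struct util_est_sketch ue, unsigned int task_util)
{
	int last_ewma_diff = (int)task_util - (int)ue.ewma;

	return within_margin(last_ewma_diff, UTIL_EST_MARGIN);
}

/* Proposed filter: skip only if both EWMA and enqueued are already close. */
bool skip_update_new(struct util_est_sketch ue, unsigned int task_util)
{
	int last_ewma_diff = (int)task_util - (int)ue.ewma;
	int last_enqueued_diff = (int)task_util - (int)ue.enqueued;

	return within_margin(last_ewma_diff, UTIL_EST_MARGIN) &&
	       within_margin(last_enqueued_diff, UTIL_EST_MARGIN);
}
```

With made-up values enqueued[x-1] = 100, ewma[x-1] = 400 and a new task util of 395, both (1) and (2) hold: skip_update_old() returns true and enqueued would stay at 100, whereas skip_update_new() returns false, so the new util_est would be written.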