On 01/09/17 14:21, Peter Zijlstra wrote:
> The PELT _sum values are a saw-tooth function, dropping on the decay
> edge and then growing back up again during the window.
>
> When these window-edges are not aligned between cfs_rq and se, we can
> have the situation where, for example, on dequeue, the se decays
> first.
>
> Its _sum values will be small(er), while the cfs_rq _sum values will
> still be on their way up. Because of this, the subtraction:
> cfs_rq->avg._sum -= se->avg._sum will result in a positive value. This
> will then, once the cfs_rq reaches an edge, translate into its _avg
> value jumping up.
>
> This is especially visible with the runnable_load bits, since they get
> added/subtracted a lot.
>
> Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
> ---
>  kernel/sched/fair.c | 45 +++++++++++++++++++++++++++++++--------------
>  1 file changed, 31 insertions(+), 14 deletions(-)
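To make the saw-tooth mismatch from the changelog concrete, here is a
tiny userspace sketch (all values hypothetical, decay approximated as
sum *= 0.9785 per 1024us window so that y^32 ~= 0.5; not the real
kernel PELT code):

	#include <stdio.h>

	struct avg {
		double sum;		/* the saw-tooth _sum */
		unsigned int contrib;	/* period_contrib: us into window */
	};

	/* Accumulate @us microseconds of 100% running, decay at edges. */
	static void update(struct avg *a, unsigned int us)
	{
		while (us--) {
			a->sum += 1.0;
			if (++a->contrib == 1024) {
				a->sum *= 0.9785; /* decay edge */
				a->contrib = 0;
			}
		}
	}

	int main(void)
	{
		/* the se's window edge comes 512us before the cfs_rq's */
		struct avg cfs_rq = { 20000.0, 0 };
		struct avg se     = { 20000.0, 512 };

		update(&cfs_rq, 1000);	/* run both up to 'now' == dequeue */
		update(&se, 1000);

		/* dequeue: cfs_rq->avg._sum -= se->avg._sum */
		printf("cfs_rq %.0f - se %.0f = residue %.0f\n",
		       cfs_rq.sum, se.sum, cfs_rq.sum - se.sum);
		return 0;
	}

Both accumulators saw the same 1000us of running time, yet the
subtraction leaves a ~441 residue in cfs_rq->avg._sum because the se
crossed its decay edge first; once the cfs_rq reaches its own edge,
that residue turns into the _avg jump described above.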
[...]

> @@ -3644,7 +3634,34 @@ update_cfs_rq_load_avg(u64 now, struct c
>   */
>  static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
>  {
> +	u32 divider = LOAD_AVG_MAX - 1024 + cfs_rq->avg.period_contrib;
> +
> +	/*
> +	 * When we attach the @se to the @cfs_rq, we must align the decay
> +	 * window because without that, really weird and wonderful things can
> +	 * happen.
> +	 *
> +	 * XXX illustrate
> +	 */
>  	se->avg.last_update_time = cfs_rq->avg.last_update_time;
> +	se->avg.period_contrib = cfs_rq->avg.period_contrib;
> +
> +	/*
> +	 * Hell(o) Nasty stuff.. we need to recompute _sum based on the new
> +	 * period_contrib. This isn't strictly correct, but since we're
> +	 * entirely outside of the PELT hierarchy, nobody cares if we truncate
> +	 * _sum a little.
> +	 */
> +	se->avg.util_sum = se->avg.util_avg * divider;
> +
> +	se->avg.load_sum = divider;
> +	if (se_weight(se)) {
> +		se->avg.load_sum =
> +			div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se));
> +	}

Can scale_load_down(se->load.weight) ever become 0 here? (Small sketch
of the concern below.)

[...]
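For reference, what I have in mind (assuming the CONFIG_64BIT
definition where scale_load_down() is a plain right shift by
SCHED_FIXEDPOINT_SHIFT; the weight value is hypothetical):

	#include <stdio.h>

	#define SCHED_FIXEDPOINT_SHIFT	10
	#define scale_load_down(w)	((unsigned long)(w) >> SCHED_FIXEDPOINT_SHIFT)

	int main(void)
	{
		/* hypothetical group se whose weight fell below 1024 */
		unsigned long weight = 512;

		/* prints 0: se_weight() would read 0, so the div_u64()
		 * above needs the guard; the fallback then leaves
		 * load_sum = divider */
		printf("se_weight = %lu\n", scale_load_down(weight));
		return 0;
	}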