On Wed, Aug 24, 2016 at 09:54:35AM +0100, Morten Rasmussen wrote:
> As Dietmar mentioned already, the 'disconnect' is a feature of the PELT
> rewrite. Paul and Ben's original implementation had full propagation up
> and down the hierarchy. IIRC, one of the key points of the rewrite was
> more 'stable' signals, which we would loose by re-introducing immediate
> updates throughout hierarchy.

As I mentioned earlier, there is no essential change! One feature,
perhaps, is that the rewrite takes the runnable ratio into account.

E.g., take a group with one task of share 1024, where the task sticks to
one CPU and is runnable 50% of the time. With the old implementation,
the group_entity_load_avg is 1024; with the rewritten implementation,
the group_entity_load_avg is 512. Isn't this good?

If the task migrates, the old implementation will still give 1024 on the
new CPU, but the rewritten implementation will transition to 512, albeit
taking 0.1+ second, which is what we are now addressing. Isn't this good?

> It is a significant change to group scheduling, so I'm a bit surprised
> that nobody has observed any problems post the rewrite. But maybe most
> users don't care about the load-balance being slightly off when tasks
> have migrated or new tasks are added to a group.

I don't understand what you are saying.

> If we want to re-introduce propagation of both load and utilization I
> would suggest that we just look at the original implementation. It
> seemed to work.
>
> Handling utilization and load differently will inevitably result in more
> code. The 'flat hierarchy' approach seems slightly less complicated, but
> it prevents us from using group utilization later should we wish to do
> so. It might for example become useful for the schedutil cpufreq
> governor should it ever consider selecting frequencies differently based
> on whether the current task is in a (specific) group or not.

I understand that group util may have some use should you attempt that,
but I'm not sure how realistic it is. Nothing prevents you from knowing
whether the current task is in a (specific) group or not.
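
To put rough numbers on the 1024-vs-512 example above, here is a minimal
Python sketch (not kernel code). It assumes the usual PELT parameters, a
1024 us period and a decay factor y with y^32 = 0.5, and it idealizes the
50%-runnable task as a constant input, so a signal starting from 0 after
migration follows final * (1 - y^n). The function names and the 90%
threshold are made up for illustration.

```python
import math

# PELT parameters: a contribution halves every 32 periods of 1024 us.
HALF_LIFE_PERIODS = 32
Y = 0.5 ** (1.0 / HALF_LIFE_PERIODS)
PERIOD_MS = 1.024


def old_group_entity_load(share):
    """Old behaviour as described in the mail: load equals the share."""
    return share


def new_group_entity_load(share, runnable_ratio):
    """Rewritten behaviour as described in the mail: share scaled by the
    fraction of time the group is runnable (steady-state value)."""
    return share * runnable_ratio


def ramp_value(final, periods):
    """Idealized value after `periods`, starting from 0 with constant input."""
    return final * (1.0 - Y ** periods)


def periods_to_reach(fraction):
    """Periods needed to reach `fraction` (0 < fraction < 1) of the final
    value in the idealized ramp above."""
    return math.ceil(math.log(1.0 - fraction) / math.log(Y))


if __name__ == "__main__":
    share, ratio = 1024, 0.5
    print("old:", old_group_entity_load(share))
    print("new:", new_group_entity_load(share, ratio))
    n = periods_to_reach(0.9)  # 0.9 is an arbitrary illustrative threshold
    print("periods to 90%:", n, "ms:", n * PERIOD_MS)
```

Under these assumptions, reaching 90% of the final 512 takes 107 periods,
i.e. roughly 110 ms, which is in line with the "0.1+ second" mentioned
above.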