On 13-Apr 13:36, Peter Zijlstra wrote:
> On Fri, Apr 13, 2018 at 12:15:10PM +0100, Patrick Bellasi wrote:
> > On 13-Apr 10:43, Peter Zijlstra wrote:
> > > On Mon, Apr 09, 2018 at 05:56:09PM +0100, Patrick Bellasi wrote:
> > > > +static inline void uclamp_task_update(struct rq *rq, struct task_struct *p)
> > > > +{
> > > > +	int cpu = cpu_of(rq);
> > > > +	int clamp_id;
> > > > +
> > > > +	/* The idle task does not affect CPU's clamps */
> > > > +	if (unlikely(p->sched_class == &idle_sched_class))
> > > > +		return;
> > > > +	/* DEADLINE tasks do not affect CPU's clamps */
> > > > +	if (unlikely(p->sched_class == &dl_sched_class))
> > > > +		return;
> > > > +
> > > > +	for (clamp_id = 0; clamp_id < UCLAMP_CNT; ++clamp_id) {
> > > > +		if (uclamp_task_affects(p, clamp_id))
> > > > +			uclamp_cpu_put(p, cpu, clamp_id);
> > > > +		else
> > > > +			uclamp_cpu_get(p, cpu, clamp_id);
> > > > +	}
> > > > +}
> > >
> > > Is that uclamp_task_affects() thing there to fix up the fact you failed
> > > to propagate the calling context (enqueue/dequeue) ?
> >
> > Not really, it's intended by design: we back-annotate the clamp_group
> > a task has been refcounted in.
> >
> > uclamp_task_affects() tells us whether the task is refcounted right
> > now, and the back-annotation tells us which refcounter we need to
> > remove it from.
> >
> > I found this solution much less racy and more effective at avoiding
> > refcounter corruption when we look at a task at dequeue/migration
> > time, since these operations can overlap with the slow path, i.e.
> > when we change the task-specific clamp_group either via syscall or
> > via cgroup attributes.
> >
> > IOW, the back-annotation allows us to decouple refcounting from
> > clamp_group configuration in a lockless way.
>
> But it adds extra state and logic, to a fastpath, for no reason.
>
> I suspect you messed up the cgroup side; because the syscall should
> already have done task_rq_lock() and hold both p->pi_lock and rq->lock
> and have dequeued the task when changing the attribute.

Yes, actually I'm using task_rq_lock() from the cgroup callback to
update each task already queued.
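
For reference, the pattern I have in mind on the cgroup side is roughly
the following. This is just a sketch to show the locking, not the code
from the series: uclamp_update_active() and uclamp_update_tasks() are
made-up names here, while uclamp_cpu_{get,put}() are the helpers from
the hunk quoted above.

/*
 * Sketch: refresh a task's clamp group refcounting while holding
 * task_rq_lock(), which takes both p->pi_lock and rq->lock and
 * thus serializes against the enqueue/dequeue fast path.
 */
static void uclamp_update_active(struct task_struct *p, int clamp_id)
{
	struct rq_flags rf;
	struct rq *rq;

	rq = task_rq_lock(p, &rf);

	/*
	 * A task currently refcounted on its CPU releases the old
	 * clamp group and gets the new one; a dequeued task simply
	 * picks up the new value at its next enqueue.
	 */
	if (task_on_rq_queued(p)) {
		uclamp_cpu_put(p, cpu_of(rq), clamp_id);
		uclamp_cpu_get(p, cpu_of(rq), clamp_id);
	}

	task_rq_unlock(rq, p, &rf);
}

/* Cgroup attribute write path: update every task in the css */
static void uclamp_update_tasks(struct cgroup_subsys_state *css,
				int clamp_id)
{
	struct css_task_iter it;
	struct task_struct *p;

	css_task_iter_start(css, 0, &it);
	while ((p = css_task_iter_next(&it)))
		uclamp_update_active(p, clamp_id);
	css_task_iter_end(&it);
}

Since task_rq_lock() is taken for each task, this path is serialized
against enqueue/dequeue in the same way as the sched_setattr() one.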
And I do the same from the sched_setattr syscall...

> It is actually really hard to make the syscall do it wrong.

... thus, I'll look into this more carefully; I'm not sure anymore
whether there was some other corner case. I remember some funny dance
in the cgroup callbacks in the past, when a task was terminating (e.g.
being moved to the root rq just before exiting). But, as you say, if
we always hold task_rq_lock() we should be safe.

-- 
#include <best/regards.h>
Patrick Bellasi