Chris Redpath <chris.redp...@arm.com> writes:

> If we migrate a sleeping task away from a CPU which has the
> tick stopped, then both the clock_task and decay_counter will
> be out of date for that CPU and we will not decay load correctly
> regardless of how often we update the blocked load.
>
> This is only an issue for tasks which are not on a runqueue
> (because otherwise that CPU would be awake) and simultaneously
> the CPU the task previously ran on has had the tick stopped.
>
> Signed-off-by: Chris Redpath <chris.redp...@arm.com>
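If I'm reading the load tracking right, the mechanism here is: blocked
load decays geometrically against the old CPU's clock_task, with a
32-period half-life (y^32 == 1/2), so a clock frozen by nohz means the
decay_counter reports too few elapsed periods and the contribution we
migrate away stays inflated. A rough userspace sketch of that
arithmetic (just an illustration - the kernel uses fixed-point
runnable_avg tables rather than pow(), and "periods" stands in for the
~1ms windows):

	#include <stdio.h>
	#include <math.h>

	/* Decay by a factor y per period, y chosen so y^32 == 1/2. */
	static double decay_load(double load, unsigned int periods)
	{
		return load * pow(pow(0.5, 1.0 / 32.0), periods);
	}

	int main(void)
	{
		double contrib = 1024.0;	/* full-weight contribution */

		/* Task slept 64 periods; a live clock halves it twice. */
		printf("fresh clock: %.1f\n", decay_load(contrib, 64));
		/* Tick stopped: stale clock reports 0 elapsed periods. */
		printf("stale clock: %.1f\n", decay_load(contrib, 0));
		return 0;
	}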
This looks like it is basically correct, but it seems unfortunate to
take any rq lock for these ttwus. I don't know enough about the nohz
machinery to know if that's at all avoidable.

> ---
>  kernel/sched/fair.c | 30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index b7e5945..0af1dc2 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4324,6 +4324,7 @@ unlock:
>  	return new_cpu;
>  }
>
> +static int nohz_test_cpu(int cpu);
>  /*
>   * Called immediately before a task is migrated to a new cpu; task_cpu(p) and
>   * cfs_rq_of(p) references at time of call are still valid and identify the
> @@ -4343,6 +4344,25 @@ migrate_task_rq_fair(struct task_struct *p, int next_cpu)
>  	 * be negative here since on-rq tasks have decay-count == 0.
>  	 */
>  	if (se->avg.decay_count) {
> +		/*
> +		 * If we migrate a sleeping task away from a CPU
> +		 * which has the tick stopped, then both the clock_task
> +		 * and decay_counter will be out of date for that CPU
> +		 * and we will not decay load correctly.
> +		 */
> +		if (!se->on_rq && nohz_test_cpu(task_cpu(p))) {

p->on_rq - se->on_rq must be false to call set_task_cpu at all. That
said, barring bugs like the one you fixed in patch 1 I think
decay_count != 0 should also imply !p->on_rq.

> +			struct rq *rq = cpu_rq(task_cpu(p));
> +			unsigned long flags;
> +			/*
> +			 * Current CPU cannot be holding rq->lock in this
> +			 * circumstance, but another might be. We must hold
> +			 * rq->lock before we go poking around in its clocks
> +			 */
> +			raw_spin_lock_irqsave(&rq->lock, flags);
> +			update_rq_clock(rq);
> +			update_cfs_rq_blocked_load(cfs_rq, 0);
> +			raw_spin_unlock_irqrestore(&rq->lock, flags);
> +		}
>  		se->avg.decay_count = -__synchronize_entity_decay(se);
>  		atomic_long_add(se->avg.load_avg_contrib,
>  				&cfs_rq->removed_load);
> @@ -6507,6 +6527,11 @@ static struct {
>  	unsigned long next_balance;	/* in jiffy units */
>  } nohz ____cacheline_aligned;
>
> +static int nohz_test_cpu(int cpu)
> +{
> +	return cpumask_test_cpu(cpu, nohz.idle_cpus_mask);
> +}
> +
>  static inline int find_new_ilb(int call_cpu)
>  {
>  	int ilb = cpumask_first(nohz.idle_cpus_mask);
> @@ -6619,6 +6644,11 @@ static int sched_ilb_notifier(struct notifier_block *nfb,
>  		return NOTIFY_DONE;
>  	}
>  }
> +#else
> +static int nohz_test_cpu(int cpu)
> +{
> +	return 0;
> +}
>  #endif
>
>  static DEFINE_SPINLOCK(balancing);
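Going back to the on_rq point above: if decay_count != 0 really does
imply !p->on_rq here, then the sleeping-task half of the test is
redundant and the new branch could presumably key purely off the nohz
state. Untested sketch of what I mean, locking unchanged from your
patch:

	/*
	 * decay_count != 0 already implies the task is not on any
	 * runqueue, so only the tick-stopped state of its old CPU
	 * decides whether that CPU's clocks can be trusted.
	 */
	if (nohz_test_cpu(task_cpu(p))) {
		struct rq *rq = cpu_rq(task_cpu(p));
		unsigned long flags;

		raw_spin_lock_irqsave(&rq->lock, flags);
		update_rq_clock(rq);
		update_cfs_rq_blocked_load(cfs_rq, 0);
		raw_spin_unlock_irqrestore(&rq->lock, flags);
	}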