On 4/4/2016 3:12 PM, Rik van Riel wrote:
> On Fri, 2016-04-01 at 15:42 -0400, Chris Metcalf wrote:
>> On arm64, when calling enqueue_task_fair() from migration_cpu_stop(),
>> we find the nr_running value updated by add_nr_running(), but the
>> cfs.nr_running value has not always yet been updated.  Accordingly,
>> sched_can_stop_tick() falsely returns true when we are migrating a
>> second task onto a core.
> I don't get it.
>
> Looking at enqueue_task_fair(), I see this:
>
>         for_each_sched_entity(se) {
>                 cfs_rq = cfs_rq_of(se);
>                 cfs_rq->h_nr_running++;
>                 ...
>         }
>
>         if (!se)
>                 add_nr_running(rq, 1);
>
> What is the difference between cfs_rq->h_nr_running
> and rq->cfs.nr_running?
>
> Why do we have two?
> Are we simply testing against the wrong one in
> sched_can_stop_tick()?

It seems that using the non-CFS one is what we want.  I don't know whether
using a different CFS count instead might be more correct.

Since I'm not sure what causes the difference I see between tile (correct)
and arm64 (incorrect) it's hard for me to speculate.

>> Correct this by using rq->nr_running instead of rq->cfs.nr_running.
>> This should always be more conservative, and reverts the test to the
>> form it had before commit 76d92ac305f2 ("sched: Migrate sched to use
>> new tick dependency mask model").
> That would cause us to run the timer tick while running
> a single SCHED_RR real time task, with a single
> SCHED_OTHER task sitting in the background (which will
> not get run until the SCHED_RR task is done).

No, because in sched_can_stop_tick(), we first handle the special
cases of RR or FIFO tasks present.  For example, RR:

        if (rq->rt.rr_nr_running) {
                if (rq->rt.rr_nr_running == 1)
                        return true;
                else
                        return false;
        }

Once we see there are any RR tasks running, the return value
ignores any possible SCHED_OTHER tasks.  Only after the code
concludes there are no RR/FIFO tasks do we even look at
the overall nr_running value.
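
To make the ordering concrete, here is a small userspace model of that
decision order.  It is only a sketch, not the kernel function: struct
rq_counts, can_stop_tick_model() and check_model() are hypothetical names
invented for illustration, the deadline class is left out, and the FIFO
branch is assumed to allow stopping the tick.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-runqueue counters. */
struct rq_counts {
	unsigned int rr_nr_running;   /* runnable SCHED_RR tasks */
	unsigned int fifo_nr_running; /* runnable SCHED_FIFO tasks */
	unsigned int nr_running;      /* all runnable tasks, any class */
};

/* Models the order of the checks only; this is not the kernel code. */
static bool can_stop_tick_model(const struct rq_counts *rq)
{
	/* RR: the tick is needed only to rotate between several RR tasks. */
	if (rq->rr_nr_running)
		return rq->rr_nr_running == 1;

	/* FIFO: assumed here never to need the tick for preemption. */
	if (rq->fifo_nr_running)
		return true;

	/* No RR/FIFO tasks: only now does the overall count matter. */
	return rq->nr_running <= 1;
}

/* Self-check covering the cases discussed in this thread. */
static void check_model(void)
{
	/* One SCHED_RR task plus one SCHED_OTHER task in the background. */
	struct rq_counts rr_plus_other = { .rr_nr_running = 1, .nr_running = 2 };
	struct rq_counts two_other = { .nr_running = 2 };
	struct rq_counts one_other = { .nr_running = 1 };
	struct rq_counts two_rr = { .rr_nr_running = 2, .nr_running = 2 };

	assert(can_stop_tick_model(&rr_plus_other));  /* tick may stop */
	assert(!can_stop_tick_model(&two_other));     /* tick must run */
	assert(can_stop_tick_model(&one_other));
	assert(!can_stop_tick_model(&two_rr));
}
```

In this model the first case (one RR task plus one SCHED_OTHER task) still
lets the tick stop even though nr_running is 2, because the RR branch
returns before the overall count is ever consulted.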

--
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com
