Re: [RFC PATCH 00/12 v1] A new CPU load metric for power-efficient scheduler: CPU ConCurrency

2014-05-18 Thread Yuyang Du
> So I should have just deleted all patches, for none of them has a
> changelog.
> 

My bad for not including changelogs in the patches. The v2 has them, and I
should have included them from the start.

> So all this cc crap only hooks into and modifies fair.c behaviour. There
> is absolutely no reason it should live anywhere other than fair.c.
> 
> Secondly, the very last thing we need is more CONFIG_ goo, and you
> sprinkle #ifdef around like it was gold dust.
> 

Agreed. I will change these.

> Thirdly, wth is wrong with the current per-task runtime accounting and
> why can't you extend/adapt that instead of duplicating the lot.
> 

Sure. As you and Vincent suggested, CC will ride on the existing tracking code
instead of duplicating it.

> Fourthly, I'm _never_ going to merge anything that hijacks the load
> balancer and does some random other thing. There's going to be a single
> load-balancer full stop.
> 
> Many people have expressed interest in a packing balancer (vs the
> spreading we currently default to). Some have even done patches.
> At the same time it seems very difficult to agree on _when_ packing
> makes sense. That said, when we do packing we should do it driven by the
> topology and policy, not by some compile time option.
>

I will make "Workload Consolidation" driven by topology and policy.
Essentially it already is, but admittedly the code is not completely clean in
that regard.

> Lastly, if you'd done your homework and actually read some of the
> threads on the subject from say the past two years, you'd know pretty
> much all that already.
> 
> I'm not here to endlessly repeat myself and waste time staring at
> unchangelogged patches.
> 

This will not happen again.

> Anyway, there might or might not be useful ideas in there... but it's very
> hard to tell one way or another.

I think the above is mostly about how well the patches fit into the scheduler
code. Apparently, I am not doing it right. I will send another version to make
it easier to tell. Thanks for your time.

Yuyang

Re: [RFC PATCH 00/12 v1] A new CPU load metric for power-efficient scheduler: CPU ConCurrency

2014-05-15 Thread Peter Zijlstra
On Wed, May 07, 2014 at 02:46:37AM +0800, Yuyang Du wrote:
> > The general code structure is an immediate no-go. We're not going to
> > bolt on anything like this.
> 
> Could you please go into a little more detail about the general code structure?

So I should have just deleted all patches, for none of them has a
changelog.

So all this cc crap only hooks into and modifies fair.c behaviour. There
is absolutely no reason it should live anywhere other than fair.c.

Secondly, the very last thing we need is more CONFIG_ goo, and you
sprinkle #ifdef around like it was gold dust.

Thirdly, wth is wrong with the current per-task runtime accounting and
why can't you extend/adapt that instead of duplicating the lot.

Fourthly, I'm _never_ going to merge anything that hijacks the load
balancer and does some random other thing. There's going to be a single
load-balancer full stop.

Many people have expressed interest in a packing balancer (vs the
spreading we currently default to). Some have even done patches.
At the same time it seems very difficult to agree on _when_ packing
makes sense. That said, when we do packing we should do it driven by the
topology and policy, not by some compile time option.

Lastly, if you'd done your homework and actually read some of the
threads on the subject from say the past two years, you'd know pretty
much all that already.

I'm not here to endlessly repeat myself and waste time staring at
unchangelogged patches.

Anyway, there might or might not be useful ideas in there... but it's very
hard to tell one way or another.



Re: [RFC PATCH 00/12 v1] A new CPU load metric for power-efficient scheduler: CPU ConCurrency

2014-05-06 Thread Yuyang Du
> The general code structure is an immediate no-go. We're not going to
> bolt on anything like this.

Could you please go into a little more detail about the general code structure?

Thank you all the same,
Yuyang


Re: [RFC PATCH 00/12 v1] A new CPU load metric for power-efficient scheduler: CPU ConCurrency

2014-05-05 Thread Peter Zijlstra
On Mon, May 05, 2014 at 08:02:40AM +0800, Yuyang Du wrote:
> Hi Ingo, PeterZ, Rafael, and others,

The general code structure is an immediate no-go. We're not going to
bolt on anything like this.

I've yet to look at the content.

[RFC PATCH 00/12 v1] A new CPU load metric for power-efficient scheduler: CPU ConCurrency

2014-05-05 Thread Yuyang Du
Hi Ingo, PeterZ, Rafael, and others,

The current scheduler's load balancing is completely work-conserving. In some
workloads, generally with low CPU utilization but interspersed with bursts of
transient tasks, migrating tasks to engage all available CPUs for
work-conserving can lead to significant overhead: cache locality loss,
idle/active HW state transition latency and power, shallower idle states, etc.
This is inefficient in both power and performance, especially for today's
low-power mobile processors.

This RFC introduces a sense of idleness-conserving into the work-conserving
scheduler (by all means, we really don't want to swing to only one extreme).
But to what extent should idleness be conserved, bearing in mind that we don't
want to sacrifice performance? We first need a load/idleness indicator to that
end.

Thanks to CFS's "model an ideal, precise multi-tasking CPU", tasks can be seen
as running concurrently (the tasks in the runqueue). So it is natural to use
task concurrency as a load indicator. To that end, we do two things:

1) Divide continuous time into periods, and average the task concurrency
within each period, to tolerate transient bursts:

a = sum(concurrency * time) / period

2) Exponentially decay past periods, and synthesize them all, for hysteresis
against load drops and resilience against load rises (let f be the decay
factor, and a_x the x-th period's average since period 0):

s = a_n + f^1 * a_(n-1) + f^2 * a_(n-2) + ... + f^(n-1) * a_1 + f^n * a_0
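
To make the arithmetic concrete, here is a minimal user-space sketch of the
two steps (all names, the fixed-point scale, and the decay value are ours for
illustration only, not the patches' actual interfaces; f = 0.5 matches the
1-period half-life mentioned below):

#include <stdint.h>

#define CC_PERIOD_NS	(64ULL * 1000 * 1000)	/* 64ms period, as used below */
#define CC_DECAY	512			/* f = 512/1024 = 0.5 */

struct cc_state {
	uint64_t contrib;	/* sum(concurrency * time) in the open period */
	uint64_t cc;		/* decayed sum s of closed periods, 1/1024 units */
};

/* Account 'delta' ns during which 'nr_running' tasks were runnable. */
static void cc_account(struct cc_state *s, uint64_t delta,
		       unsigned int nr_running)
{
	s->contrib += delta * nr_running;
}

/* At a period boundary: a_n = contrib / period, then s = a_n + f * s_old. */
static void cc_close_period(struct cc_state *s)
{
	uint64_t a = (s->contrib << 10) / CC_PERIOD_NS;	/* a_n, 1/1024 units */

	s->cc = a + ((s->cc * CC_DECAY) >> 10);
	s->contrib = 0;
}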

We name this load indicator CPU ConCurrency (CC): task concurrency determines
how many CPUs need to be running concurrently.

Two more ways to interpret CC:

1) The current work-conserving load balancer also uses CC, but instantaneous
CC.

2) CC vs. CPU utilization: CC is runqueue-length-weighted CPU utilization. If
we change "a = sum(concurrency * time) / period" to "a' = sum(1 * time) /
period", then a' is just the CPU utilization. And the way we weight the
runqueue length is the simplest one (excluding the exponential decays; you may
have other ways).
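
As a toy example (numbers ours): if two tasks are runnable together for 16ms
of a 64ms period and the CPU is otherwise idle, then a = (2 * 16) / 64 = 0.5
while a' = (1 * 16) / 64 = 0.25. CC doubles the utilization figure because the
runqueue length was 2 whenever the CPU was busy.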

To track CC, we hook into the scheduler at 1) enqueue, 2) dequeue, 3) the
scheduler tick, and 4) idle entry/exit.
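
Continuing the sketch above (again, structure and names are our illustration,
not the patch code), each of those four hooks would funnel into one update
routine that charges the time since the previous hook at the concurrency then
in effect, closing any period boundaries it crosses:

struct cc_clock {
	struct cc_state st;
	uint64_t last_update;	/* ns timestamp of the previous hook */
	uint64_t period_start;	/* ns timestamp when the open period began */
};

/* Hypothetically called from enqueue, dequeue, the tick, and idle
 * entry/exit, with 'nr_running' being the runqueue length in effect
 * since the last call. */
static void update_cpu_concurrency(struct cc_clock *c, uint64_t now,
				   unsigned int nr_running)
{
	/* Close every period boundary crossed since the last update. */
	while (now - c->period_start >= CC_PERIOD_NS) {
		uint64_t end = c->period_start + CC_PERIOD_NS;

		cc_account(&c->st, end - c->last_update, nr_running);
		cc_close_period(&c->st);
		c->last_update = end;
		c->period_start = end;
	}
	cc_account(&c->st, now - c->last_update, nr_running);
	c->last_update = now;
}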

On top of CC, the consolidation part 1) attaches to the CPU topology, to be
adaptive beyond our experimental platforms, and 2) intercepts the current load
balancer to contain load and load balancing.

Currently, CC is per CPU. The consolidation formula is based on a heuristic.
Suppose we have 2 CPUs whose task concurrency over time looks like this ('-'
means no task, 'x' means tasks present):

1)
CPU0: ----xxxx-------- (CC[0])
CPU1: --------xxxx---- (CC[1])

2)
CPU0: ----xxxx-------- (CC[0])
CPU1: ----xxxx-------- (CC[1])

If we consolidate CPU0 and CPU1, the consolidated CC will be CC' = CC[0] +
CC[1] for case 1, and CC'' = (CC[0] + CC[1]) * 2 for case 2. For the cases in
between 1 and 2 in terms of how the xxx runs overlap, the consolidated CC
should fall between CC' and CC''. So we uniformly use this condition for
consolidation (suppose we consolidate m CPUs onto n CPUs, m > n):

(CC[0] + CC[1] + ... + CC[m-2] + CC[m-1]) * (n + log(m-n)) >=? (1 * n) * n *
consolidate_coefficient

The consolidate_coefficient could be, say, 100%, or more or less.
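
For illustration, a hedged sketch of evaluating that test (names are ours; we
read log() as an integer log2 and use a fixed-point CC_ONE = 1024 for a
concurrency of 1.0, both assumptions on our part):

#include <stdbool.h>
#include <stdint.h>

#define CC_ONE	1024	/* fixed-point 1.0: one fully busy CPU (assumed scale) */

/* Evaluate the ">=?" test above for consolidating m CPUs onto n (m > n);
 * cc[] holds per-CPU CC in 1/1024 units and 'coeff' is the
 * consolidate_coefficient in percent (e.g. 100). Whether a true result
 * permits or forbids consolidation is a policy choice left open here. */
static bool cc_consolidation_test(const uint64_t cc[], unsigned int m,
				  unsigned int n, unsigned int coeff)
{
	uint64_t sum = 0;
	unsigned int i, lg = 0, d = m - n;

	for (i = 0; i < m; i++)
		sum += cc[i];

	while (d >>= 1)		/* integer log2(m - n); base is our assumption */
		lg++;

	/* sum * (n + log(m-n)) >= (1 * n) * n * consolidate_coefficient ? */
	return sum * (n + lg) * 100 >= (uint64_t)CC_ONE * n * n * coeff;
}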

Using CC, we implemented a Workload Consolidation patch on two Intel mobile
platforms (a quad-core composed of two dual-core modules): load and load
balancing are contained in the first dual-core module when the aggregated CC
is low, and span the full quad-core otherwise. Results show power savings and
no substantial performance regression (even gains for some workloads). The
workloads we used to evaluate Workload Consolidation include 1) 50+ perf/ux
benchmarks (almost all of the magazine ones), and 2) ~10 power workloads; of
course, these are the easiest ones, such as browsing, audio, video, recording,
imaging, etc. The current half-life is 1 period; the period was 32ms, and is
now 64ms for more aggressive consolidation.

Yuyang Du (12):
  CONFIG for CPU ConCurrency
  Init for CPU ConCurrency
  CPU ConCurrency calculation
  CPU ConCurrency collecting in:
  CONFIG for Workload Consolidation
  Attach CPU topology
  CPU ConCurrency API for Workload Consolidation
  Intercept wakeup/fork/exec balance
  Intercept idle balance
  Intercept periodic nohz idle balance
  Intercept periodic load balance
  Intercept RT provocatively

 arch/x86/Kconfig |   21 +
 include/linux/sched.h|   13 +
 include/linux/sched/sysctl.h |8 +
 include/linux/topology.h |   16 +
 kernel/sched/Makefile|1 +
 kernel/sched/concurrency.c   |  928 ++
 kernel/sched/core.c  |   49 +++
 kernel/sched/fair.c  |  131 +-
 kernel/sched/rt.c|   25 ++
 kernel/sched/sched.h |   36 ++
 kernel/sysctl.c  |   16 +
 11 files changed, 1235 insertions(+), 9 deletions(-)
 create mode 100644 kernel/sched/concurrency.c

-- 
1.7.9.5

