Re: [RFC PATCH 00/12 v1] A new CPU load metric for power-efficient scheduler: CPU ConCurrency
> So I should have just deleted all patches, for none of them has a
> changelog.

It was my mistake not to include changelogs in the patches. The v2 has them,
but I should have written them from the start.

> So all this cc crap only hooks into and modifies fair.c behaviour. There
> is absolutely no reason it should live anywhere else except fair.c
>
> Secondly, the very last thing we need is more CONFIG_ goo, and you
> sprinkle #ifdef around like it was gold dust.

Agreed. I will change these.

> Thirdly, wth is wrong with the current per-task runtime accounting and
> why can't you extend/adapt that instead of duplicating the lot.

Sure. As you and Vincent said, CC will build on the existing tracking code
instead of duplicating it.

> Fourthly, I'm _never_ going to merge anything that hijacks the load
> balancer and does some random other thing. There's going to be a single
> load-balancer full stop.
>
> Many people have expressed interest in a packing balancer (vs the
> spreading we currently default to). Some have even done patches.
>
> At the same time it seems very difficult to agree on _when_ packing
> makes sense. That said, when we do packing we should do it driven by the
> topology and policy, not by some compile time option.

I will make "Workload Consolidation" driven by topology and policy.
Essentially it already is, but the code is admittedly not clean in that
regard.

> Lastly, if you'd done your homework and actually read some of the
> threads on the subject from say the past two years, you'd know pretty
> much all that already.
>
> I'm not here to endlessly repeat myself and waste time staring at
> unchangelogged patches.

This will not happen again.

> Anyway, there might or might not be useful ideas in there.. but it's very
> hard to tell one way or another.

I think the above is mostly about "amenability" to the scheduler code.
Apparently, I am not doing it right. I will send another version to make
that less hard to tell.

Thanks for your time.
Yuyang
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 00/12 v1] A new CPU load metric for power-efficient scheduler: CPU ConCurrency
On Wed, May 07, 2014 at 02:46:37AM +0800, Yuyang Du wrote:
> > The general code structure is an immediate no go. We're not going to
> > bolt on anything like this.
>
> Could you please elaborate a little on the general code structure?

So I should have just deleted all patches, for none of them has a
changelog.

So all this cc crap only hooks into and modifies fair.c behaviour. There
is absolutely no reason it should live anywhere else except fair.c

Secondly, the very last thing we need is more CONFIG_ goo, and you
sprinkle #ifdef around like it was gold dust.

Thirdly, wth is wrong with the current per-task runtime accounting and
why can't you extend/adapt that instead of duplicating the lot.

Fourthly, I'm _never_ going to merge anything that hijacks the load
balancer and does some random other thing. There's going to be a single
load-balancer full stop.

Many people have expressed interest in a packing balancer (vs the
spreading we currently default to). Some have even done patches.

At the same time it seems very difficult to agree on _when_ packing
makes sense. That said, when we do packing we should do it driven by the
topology and policy, not by some compile time option.

Lastly, if you'd done your homework and actually read some of the
threads on the subject from say the past two years, you'd know pretty
much all that already.

I'm not here to endlessly repeat myself and waste time staring at
unchangelogged patches.

Anyway, there might or might not be useful ideas in there.. but it's very
hard to tell one way or another.
Re: [RFC PATCH 00/12 v1] A new CPU load metric for power-efficient scheduler: CPU ConCurrency
> The general code structure is an immediate no go. We're not going to
> bolt on anything like this.

Could you please elaborate a little on the general code structure?

Thank you all the same,
Yuyang
Re: [RFC PATCH 00/12 v1] A new CPU load metric for power-efficient scheduler: CPU ConCurrency
On Mon, May 05, 2014 at 08:02:40AM +0800, Yuyang Du wrote:
> Hi Ingo, PeterZ, Rafael, and others,

The general code structure is an immediate no go. We're not going to
bolt on anything like this.

I've yet to look at the content.
[RFC PATCH 00/12 v1] A new CPU load metric for power-efficient scheduler: CPU ConCurrency
Hi Ingo, PeterZ, Rafael, and others,

The current scheduler’s load balancing is completely work-conserving. For
some workloads, those with generally low CPU utilization but interspersed
with CPU bursts of transient tasks, migrating tasks to engage all available
CPUs for the sake of work conservation can lead to significant overhead: loss
of cache locality, latency and power cost of idle/active HW state
transitions, shallower idle states, etc. These are inefficient in both power
and performance, especially for today’s low-power mobile processors.

This RFC introduces a sense of idleness-conserving into work-conserving (we
really don’t want to be overwhelming in only one direction). But how far
should idleness-conserving go, bearing in mind that we don’t want to
sacrifice performance? We first need a load/idleness indicator to that end.

Thanks to CFS’s “model an ideal, precise multi-tasking CPU”, the tasks in the
runqueue can be seen as running concurrently. So it is natural to use task
concurrency as the load indicator. With that, we do two things:

1) Divide continuous time into periods, and average the task concurrency
   within each period, to tolerate transient bursts:

   a = sum(concurrency * time) / period

2) Exponentially decay past periods and sum them all, for hysteresis to load
   drops or resilience to load rises (let f be the decay factor, and a_x the
   average of the xth period since period 0):

   s = a_n + f^1 * a_(n-1) + f^2 * a_(n-2) + ... + f^(n-1) * a_1 + f^n * a_0

We name this load indicator CPU ConCurrency (CC): task concurrency determines
how many CPUs need to be running concurrently.

Two other ways to interpret CC:

1) The current work-conserving load balance also uses CC, but instantaneous
   CC.

2) CC vs. CPU utilization: CC is runqueue-length-weighted CPU utilization.
   If we change "a = sum(concurrency * time) / period" to
   "a' = sum(1 * time) / period", then a' is just the CPU utilization.
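The two steps above (per-period average, then exponential decay) can be sketched in C as follows. This is only an illustration under stated assumptions, not code from the patch set: the names cc_segment, cc_period_average, cc_period_utilization and cc_decay_update are hypothetical, a period is modelled as a list of segments with constant runqueue length, and double is used for readability where kernel code would normally use fixed-point arithmetic. The utilization helper assumes that a' counts only the time during which at least one task is runnable.

```c
#include <stddef.h>

/* Hypothetical type: a stretch of time with a constant runqueue length. */
struct cc_segment {
	unsigned int concurrency;	/* number of runnable tasks */
	double time;			/* duration, same unit as the period */
};

/* Step 1: a = sum(concurrency * time) / period */
double cc_period_average(const struct cc_segment *seg, size_t nr, double period)
{
	double sum = 0.0;
	size_t i;

	for (i = 0; i < nr; i++)
		sum += seg[i].concurrency * seg[i].time;
	return sum / period;
}

/*
 * a' = sum(1 * time) / period. Assumption: only time with at least one
 * runnable task is counted, which makes a' the CPU utilization.
 */
double cc_period_utilization(const struct cc_segment *seg, size_t nr, double period)
{
	double sum = 0.0;
	size_t i;

	for (i = 0; i < nr; i++)
		if (seg[i].concurrency > 0)
			sum += seg[i].time;
	return sum / period;
}

/*
 * Step 2: s = a_n + f * a_(n-1) + ... + f^n * a_0, computed incrementally
 * as s_n = a_n + f * s_(n-1), starting from s = 0 before period 0.
 */
double cc_decay_update(double prev_sum, double avg, double f)
{
	return avg + f * prev_sum;
}
```

Calling cc_decay_update() once per period boundary keeps only one running value per CPU, so the full history of period averages never has to be stored.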
The way we weight by runqueue length is the simplest one (excluding the
exponential decay; other weightings are possible).

To track CC, we intercept the scheduler at 1) enqueue, 2) dequeue, 3) the
scheduler tick, and 4) idle entry/exit.

On top of CC, in the consolidation part, we 1) attach the CPU topology, so
that the approach adapts beyond our experimental platforms, and 2) intercept
the current load balance to contain load and load balancing.

Currently, CC is per CPU. To consolidate, the formula is based on a
heuristic. Suppose we have 2 CPUs, and their task concurrency over time is
('-' means no task, 'x' means having tasks):

1)
CPU0: ---xxxx---------- (CC[0])
CPU1: ---------xxxx---- (CC[1])

2)
CPU0: ---xxxx---------- (CC[0])
CPU1: ---xxxx---------- (CC[1])

If we consolidate CPU0 and CPU1, the consolidated CC will be:
CC' = CC[0] + CC[1] for case 1, and CC'' = (CC[0] + CC[1]) * 2 for case 2.
For the cases in between case 1 and case 2, in terms of how the x runs
overlap, the CC should be between CC' and CC''. So we uniformly use this
condition for consolidation (suppose we consolidate m CPUs to n CPUs, m > n):

(CC[0] + CC[1] + ... + CC[m-2] + CC[m-1]) * (n + log(m-n)) >=<? (1 * n) * n *
consolidate_coefficient

The consolidate_coefficient could be around 100%, or more or less.

Using CC, we implemented a Workload Consolidation patch on two Intel mobile
platforms (a quad-core composed of two dual-core modules): load and load
balancing are contained in the first dual-core module when the aggregated CC
is low, and otherwise spread across the full quad-core. Results show power
savings and no substantial performance regression (even gains for some
workloads).

The workloads we used to evaluate Workload Consolidation include 1) 50+
perf/ux benchmarks (almost all of the magazine ones), and 2) ~10 power
workloads, the easiest ones of course, such as browsing, audio, video,
recording, imaging, etc.

The current half-life is 1 period. The period was 32ms, and is now 64ms for
more aggressive consolidation.
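The consolidation condition described above can be sketched in C as follows. This is a rough illustration, not code from the patch set, and it rests on several assumptions that the cover letter does not spell out: the helper names cc_log2 and cc_can_consolidate are hypothetical, the logarithm is taken as an integer base-2 log, consolidate_coefficient is passed as a percentage, and consolidation is taken to be allowed when the left-hand side does not exceed the right-hand side (in line with containing load when the aggregated CC is low).

```c
/* Integer floor(log2(x)) for x >= 1. The base of the log is an assumption. */
static unsigned int cc_log2(unsigned int x)
{
	unsigned int r = 0;

	while (x > 1) {
		x >>= 1;
		r++;
	}
	return r;
}

/*
 * Evaluate
 *   (CC[0] + ... + CC[m-1]) * (n + log(m-n))
 * against
 *   (1 * n) * n * consolidate_coefficient
 * for consolidating m CPUs onto n CPUs (m > n).
 *
 * coeff_pct is consolidate_coefficient as a percentage (100 means 100%).
 * Returns 1 when the left-hand side is <= the right-hand side (assumed to
 * mean "low enough to consolidate"), 0 otherwise or on invalid m/n.
 */
int cc_can_consolidate(const double *cc, unsigned int m, unsigned int n,
		       unsigned int coeff_pct)
{
	double sum = 0.0, lhs, rhs;
	unsigned int i;

	if (n == 0 || m <= n)
		return 0;

	for (i = 0; i < m; i++)
		sum += cc[i];

	lhs = sum * (n + cc_log2(m - n));
	rhs = (1.0 * n) * n * coeff_pct / 100.0;
	return lhs <= rhs;
}
```

For the quad-core example, cc_can_consolidate(cc, 4, 2, 100) would ask whether the summed CC of all four CPUs is low enough to stay within one dual-core module; raising coeff_pct makes consolidation more aggressive.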
Yuyang Du (12):
  CONFIG for CPU ConCurrency
  Init for CPU ConCurrency
  CPU ConCurrency calculation
  CPU ConCurrency collecting in:
  CONFIG for Workload Consolidation
  Attach CPU topology
  CPU ConCurrency API for Workload Consolidation
  Intercept wakeup/fork/exec balance
  Intercept idle balance
  Intercept periodic nohz idle balance
  Intercept periodic load balance
  Intercept RT provocatively

 arch/x86/Kconfig             |   21 +
 include/linux/sched.h        |   13 +
 include/linux/sched/sysctl.h |    8 +
 include/linux/topology.h     |   16 +
 kernel/sched/Makefile        |    1 +
 kernel/sched/concurrency.c   |  928 ++
 kernel/sched/core.c          |   49 +++
 kernel/sched/fair.c          |  131 +-
 kernel/sched/rt.c            |   25 ++
 kernel/sched/sched.h         |   36 ++
 kernel/sysctl.c              |   16 +
 11 files changed, 1235 insertions(+), 9 deletions(-)
 create mode 100644 kernel/sched/concurrency.c

--
1.7.9.5