Re: [RFC PATCH 0/4] sched: remove cpu_load decay.

2013-11-26 Thread Alex Shi
On 11/22/2013 02:37 PM, Alex Shi wrote:
>   latest kernel 527d1511310a89  / + this patchset
> hackbench -T -g 10 -f 40
>   23.25"  21.7"
>   23.16"  19.99"
>   24.24"  21.53"
> hackbench -p -g 10 -f 40
>   26.52"  22.48"
>   23.89"  24.00"
>   25.65"  23.06"
> hackbench -P -g 10 -f 40
>   20.14"  19.37"
>   19.96"  19.76"
>   21.76"  21.54"
> 
> The git tree for this patchset at:
>  g...@github.com:alexshi/power-scheduling.git no-load-idx 

Fengguang,

Did your kernel testing find anything unusual with these 3 patches?

-- 
Thanks
Alex
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH 0/4] sched: remove cpu_load decay.

2013-11-26 Thread Alex Shi
On 11/26/2013 09:01 PM, Daniel Lezcano wrote:
> 
> Ok, bad copy-paste, the third test run results with the patchset is wrong.
> 
> hackbench -P -s 4096 -l 1000 -g 10 -f 40
> 38.938  39.585   
> 39.363 39.008
> 39.340 38.954
> 38.909 39.273
> 39.095 38.755
> 38.869 39.003
> 39.041 38.945
> 38.939 38.005
> 38.992 38.994
> 38.947 38.855

Oops.
Anyway, at least there is no harm on the hackbench process testing. :)

-- 
Thanks
Alex


Re: [RFC PATCH 0/4] sched: remove cpu_load decay.

2013-11-26 Thread Daniel Lezcano

On 11/26/2013 01:52 PM, Alex Shi wrote:
> On 11/26/2013 08:35 PM, Daniel Lezcano wrote:
>>
>> Here are the new results with your patchset + patch #5
>>
>> I have some issues with perf for the moment, so I will fix it up and
>> send the result after.
> 
> Thanks a lot, Daniel!
> The result is pretty good! Thread/pipe performance drops slightly,
> but process performance increases by about 25%!
> 
>> 527d1511310a  / + patchset + #5
>>
>> hackbench -T -s 4096 -l 1000 -g 10 -f 40
>> 26.677  30.308
>> 27.914  28.497
>> 28.390  30.360
>> 28.048  28.587
>> 26.344  29.513
>> 27.848  28.706
>> 28.315  30.152
>> 28.232  29.721
>> 26.549  28.766
>> 30.340  38.801
>> hackbench -p -s 4096 -l 1000 -g 10 -f 40
>> 34.522  35.469
>> 34.545  34.966
>> 34.469  35.342
>> 34.115  35.286
>> 34.457  35.592
>> 34.561  35.314
>> 34.459  35.316
>> 34.054  35.629
>> 34.532  35.149
>> 34.459  34.876
>> hackbench -P -s 4096 -l 1000 -g 10 -f 40
>> 38.938  30.308
>> 39.363  28.497
>> 39.340  30.360
>> 38.909  28.587
>> 39.095  29.513
>> 38.869  28.706
>> 39.041  30.152
>> 38.939  29.721
>> 38.992  28.766
>> 38.947  38.801

Ok, bad copy-paste: the third test's results with the patchset are wrong.

hackbench -P -s 4096 -l 1000 -g 10 -f 40
38.938  39.585
39.363  39.008
39.340  38.954
38.909  39.273
39.095  38.755
38.869  39.003
39.041  38.945
38.939  38.005
38.992  38.994
38.947  38.855



--
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog



Re: [RFC PATCH 0/4] sched: remove cpu_load decay.

2013-11-26 Thread Daniel Lezcano

On 11/26/2013 01:52 PM, Alex Shi wrote:
> On 11/26/2013 08:35 PM, Daniel Lezcano wrote:
>>
>> Here are the new results with your patchset + patch #5
>>
>> I have some issues with perf for the moment, so I will fix it up and
>> send the result after.
> 
> Thanks a lot, Daniel!
> The result is pretty good! Thread/pipe performance drops slightly,
> but process performance increases by about 25%!

Mmh, wait. Let me double-check the results; it sounds weird that we get
so much of a performance increase.

>> 527d1511310a  / + patchset + #5
>>
>> hackbench -T -s 4096 -l 1000 -g 10 -f 40
>> 26.677  30.308
>> 27.914  28.497
>> 28.390  30.360
>> 28.048  28.587
>> 26.344  29.513
>> 27.848  28.706
>> 28.315  30.152
>> 28.232  29.721
>> 26.549  28.766
>> 30.340  38.801
>> hackbench -p -s 4096 -l 1000 -g 10 -f 40
>> 34.522  35.469
>> 34.545  34.966
>> 34.469  35.342
>> 34.115  35.286
>> 34.457  35.592
>> 34.561  35.314
>> 34.459  35.316
>> 34.054  35.629
>> 34.532  35.149
>> 34.459  34.876
>> hackbench -P -s 4096 -l 1000 -g 10 -f 40
>> 38.938  30.308
>> 39.363  28.497
>> 39.340  30.360
>> 38.909  28.587
>> 39.095  29.513
>> 38.869  28.706
>> 39.041  30.152
>> 38.939  29.721
>> 38.992  28.766
>> 38.947  38.801








Re: [RFC PATCH 0/4] sched: remove cpu_load decay.

2013-11-26 Thread Alex Shi
On 11/26/2013 08:35 PM, Daniel Lezcano wrote:
> 
> 
> Here are the new results with your patchset + patch #5
> 
> I have some issues with perf for the moment, so I will fix it up and
> send the result after.

Thanks a lot, Daniel!
The result is pretty good! Thread/pipe performance drops slightly,
but process performance increases by about 25%!


> 
> 
> 527d1511310a  / + patchset + #5
> 
> hackbench -T -s 4096 -l 1000 -g 10 -f 40
> 26.677  30.308
> 27.914 28.497
> 28.390 30.360
> 28.048 28.587
> 26.344 29.513
> 27.848 28.706
> 28.315 30.152
> 28.232 29.721
> 26.549 28.766
> 30.340 38.801
> hackbench -p -s 4096 -l 1000 -g 10 -f 40
> 34.522  35.469
> 34.545 34.966
> 34.469 35.342
> 34.115 35.286
> 34.457 35.592
> 34.561 35.314
> 34.459 35.316
> 34.054 35.629
> 34.532 35.149
> 34.459 34.876
> hackbench -P -s 4096 -l 1000 -g 10 -f 40
> 38.938  30.308
> 39.363 28.497
> 39.340 30.360
> 38.909 28.587
> 39.095 29.513
> 38.869 28.706
> 39.041 30.152
> 38.939 29.721
> 38.992 28.766
> 38.947 38.801


-- 
Thanks
Alex


Re: [RFC PATCH 0/4] sched: remove cpu_load decay.

2013-11-26 Thread Daniel Lezcano

On 11/24/2013 06:29 AM, Alex Shi wrote:
> On 11/22/2013 08:13 PM, Daniel Lezcano wrote:
>>
>> Hi Alex,
>>
>> I tried on my Xeon server (2 x 4 cores) your patchset and got the
>> following result:
>>
>> kernel a5d6e63323fe7799eb0e6  / + patchset
>>
>> hackbench -T -s 4096 -l 1000 -g 10 -f 40
>>    27.604  38.556
> 
> Wondering if the following patch is helpful on your Xeon server?
> 
> Btw, you can run vmstat as a background tool or use 'perf sched'
> to get the scheduler statistics change for this patchset.
> 
> The following are the results of the original kernel and all 5 patches
> on a pandaboard ES.
> 
>  latest kernel 527d1511310a89  / + this patchset
> hackbench -T -g 10 -f 40
>  23.25"  20.79"
>  23.16"  20.4"
>  24.24"  20.29"
> hackbench -p -g 10 -f 40
>  26.52"  21.2"
>  23.89"  24.07"
>  25.65"  20.30"
> hackbench -P -g 10 -f 40
>  20.14"  19.53"
>  19.96"  20.37"
>  21.76"  20.39"




Here are the new results with your patchset + patch #5

I have some issues with perf for the moment, so I will fix it up and 
send the result after.



527d1511310a  / + patchset + #5

hackbench -T -s 4096 -l 1000 -g 10 -f 40
26.677   30.308
27.914   28.497
28.390   30.360
28.048   28.587
26.344   29.513
27.848   28.706
28.315   30.152
28.232   29.721
26.549   28.766
30.340   38.801
hackbench -p -s 4096 -l 1000 -g 10 -f 40
34.522   35.469
34.545   34.966
34.469   35.342
34.115   35.286
34.457   35.592
34.561   35.314
34.459   35.316
34.054   35.629
34.532   35.149
34.459   34.876
hackbench -P -s 4096 -l 1000 -g 10 -f 40
38.938   30.308
39.363   28.497
39.340   30.360
38.909   28.587
39.095   29.513
38.869   28.706
39.041   30.152
38.939   29.721
38.992   28.766
38.947   38.801



> --
> From 4f5efd6c2b1e7293410ad57c3db24dcf3394c4a3 Mon Sep 17 00:00:00 2001
> From: Alex Shi 
> Date: Sat, 23 Nov 2013 23:18:09 +0800
> Subject: [PATCH] sched: aggravate target cpu load to reduce task moving
> 
> Task migration happens when the target is just a bit less loaded than
> the source cpu. To reduce how often that happens, aggravate the target
> cpu load with sd->imbalance_pct.
> 
> Signed-off-by: Alex Shi 
> ---
>  kernel/sched/fair.c | 18 ++++++++++++------
>  1 file changed, 12 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index bccdd89..c49b7ba 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -978,7 +978,7 @@ static inline unsigned long group_weight(struct task_struct *p, int nid)
> 
>  static unsigned long weighted_cpuload(const int cpu);
>  static unsigned long source_load(int cpu);
> -static unsigned long target_load(int cpu);
> +static unsigned long target_load(int cpu, int imbalance_pct);
>  static unsigned long power_of(int cpu);
>  static long effective_load(struct task_group *tg, int cpu, long wl, long wg);
> 
> @@ -3809,11 +3809,17 @@ static unsigned long source_load(int cpu)
>   * Return a high guess at the load of a migration-target cpu weighted
>   * according to the scheduling class and "nice" value.
>   */
> -static unsigned long target_load(int cpu)
> +static unsigned long target_load(int cpu, int imbalance_pct)
>  {
> 	struct rq *rq = cpu_rq(cpu);
> 	unsigned long total = weighted_cpuload(cpu);
> 
> +	/*
> +	 * without cpu_load decay, in most of time cpu_load is same as total
> +	 * so we need to make target a bit heavier to reduce task migration
> +	 */
> +	total = total * imbalance_pct / 100;
> +
> 	if (!sched_feat(LB_BIAS))
> 		return total;
> 
> @@ -4033,7 +4039,7 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p, int sync)
> 	this_cpu  = smp_processor_id();
> 	prev_cpu  = task_cpu(p);
> 	load	  = source_load(prev_cpu);
> -	this_load = target_load(this_cpu);
> +	this_load = target_load(this_cpu, 100);
> 
> 	/*
> 	 * If sync wakeup then subtract the (maximum possible)
> @@ -4089,7 +4095,7 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p, int sync)
> 
> 	if (balanced ||
> 	    (this_load <= load &&
> -	     this_load + target_load(prev_cpu) <= tl_per_task)) {
> +	     this_load + target_load(prev_cpu, 100) <= tl_per_task)) {
> 		/*
> 		 * This domain has SD_WAKE_AFFINE and
> 		 * p is cache cold in this domain, and
> @@ -4135,7 +4141,7 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p, int this_cpu)
> 		if (local_group)
> 			load = source_load(i);
> 		else
> -			load = target_load(i);
> +			load = target_load(i, sd->imbalance_pct);
> 
> 		avg_load += load;
> 	}
> @@ -5478,7 +5484,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
> 
> 		/* Bias balancing toward cpus of our domain */
> 		if (local_group)
> -			load = target_load(i);
> +			load = target_load(i, env->sd->imbalance_pct);
> 		else
> 			load = source_load(i);
> 
> -- 
> 1.8.1.2



Re: [RFC PATCH 0/4] sched: remove cpu_load decay.

2013-11-25 Thread Alex Shi
On 11/25/2013 04:36 PM, Daniel Lezcano wrote:
> On 11/25/2013 01:58 AM, Alex Shi wrote:
>> On 11/22/2013 08:13 PM, Daniel Lezcano wrote:
>>>
>>> Hi Alex,
>>>
>>> I tried on my Xeon server (2 x 4 cores) your patchset and got the
>>> following result:
>>>
>>> kernel a5d6e63323fe7799eb0e6  / + patchset
>>>
>>> hackbench -T -s 4096 -l 1000 -g 10 -f 40
>>>27.604  38.556
>>
>> Hi Daniel, would you like to give the detailed server info? 2 sockets * 4
>> cores sounds like it isn't a modern machine.
> 
> Well, it is several years old now, that's true, but it is still
> competing with some recent processors :)
> 
> Bi-Xeon E5345 2.33GHz / 8MB L2 cache / 7GB FB-DIMM memory at 667 MHz /
> 300GB SSD 3Gb/s
> 
> 


It is a Core 2 CPU, quite old.
Fengguang, do you include a similar box in your test systems?

-- 
Thanks
Alex


Re: [RFC PATCH 0/4] sched: remove cpu_load decay.

2013-11-25 Thread Daniel Lezcano

On 11/25/2013 01:58 AM, Alex Shi wrote:
> On 11/22/2013 08:13 PM, Daniel Lezcano wrote:
>>
>> Hi Alex,
>>
>> I tried on my Xeon server (2 x 4 cores) your patchset and got the
>> following result:
>>
>> kernel a5d6e63323fe7799eb0e6  / + patchset
>>
>> hackbench -T -s 4096 -l 1000 -g 10 -f 40
>>    27.604  38.556
> 
> Hi Daniel, would you like to give the detailed server info? 2 sockets * 4
> cores sounds like it isn't a modern machine.

Well, it is several years old now, that's true, but it is still
competing with some recent processors :)

Bi-Xeon E5345 2.33GHz / 8MB L2 cache / 7GB FB-DIMM memory at 667 MHz /
300GB SSD 3Gb/s







Re: [RFC PATCH 0/4] sched: remove cpu_load decay.

2013-11-24 Thread Alex Shi
On 11/22/2013 08:13 PM, Daniel Lezcano wrote:
> 
> Hi Alex,
> 
> I tried on my Xeon server (2 x 4 cores) your patchset and got the
> following result:
> 
> kernel a5d6e63323fe7799eb0e6  / + patchset
> 
> hackbench -T -s 4096 -l 1000 -g 10 -f 40
>   27.604  38.556

Hi Daniel, would you like to give the detailed server info? 2 sockets * 4
cores sounds like it isn't a modern machine.

-- 
Thanks
Alex




Re: [RFC PATCH 0/4] sched: remove cpu_load decay.

2013-11-23 Thread Alex Shi
On 11/22/2013 08:13 PM, Daniel Lezcano wrote:
> 
> Hi Alex,
> 
> I tried on my Xeon server (2 x 4 cores) your patchset and got the
> following result:
> 
> kernel a5d6e63323fe7799eb0e6  / + patchset
> 
> hackbench -T -s 4096 -l 1000 -g 10 -f 40
>   27.604  38.556

Wondering if the following patch is helpful on your Xeon server?

Btw, you can run vmstat as a background tool or use 'perf sched'
to get the scheduler statistics change for this patchset.

The following are the results of the original kernel and all 5 patches
on a pandaboard ES.

latest kernel 527d1511310a89  / + this patchset
hackbench -T -g 10 -f 40
23.25"  20.79"
23.16"  20.4"
24.24"  20.29"
hackbench -p -g 10 -f 40
26.52"  21.2"
23.89"  24.07"
25.65"  20.30"
hackbench -P -g 10 -f 40
20.14"  19.53"
19.96"  20.37"
21.76"  20.39"

--
From 4f5efd6c2b1e7293410ad57c3db24dcf3394c4a3 Mon Sep 17 00:00:00 2001
From: Alex Shi 
Date: Sat, 23 Nov 2013 23:18:09 +0800
Subject: [PATCH] sched: aggravate target cpu load to reduce task moving

Task migration happens when the target is just a bit less loaded than
the source cpu. To reduce how often that happens, aggravate the target
cpu load with sd->imbalance_pct.

Signed-off-by: Alex Shi 
---
 kernel/sched/fair.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index bccdd89..c49b7ba 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -978,7 +978,7 @@ static inline unsigned long group_weight(struct task_struct *p, int nid)
 
 static unsigned long weighted_cpuload(const int cpu);
 static unsigned long source_load(int cpu);
-static unsigned long target_load(int cpu);
+static unsigned long target_load(int cpu, int imbalance_pct);
 static unsigned long power_of(int cpu);
 static long effective_load(struct task_group *tg, int cpu, long wl, long wg);
 
@@ -3809,11 +3809,17 @@ static unsigned long source_load(int cpu)
  * Return a high guess at the load of a migration-target cpu weighted
  * according to the scheduling class and "nice" value.
  */
-static unsigned long target_load(int cpu)
+static unsigned long target_load(int cpu, int imbalance_pct)
 {
struct rq *rq = cpu_rq(cpu);
unsigned long total = weighted_cpuload(cpu);
 
+   /*
+* without cpu_load decay, in most of time cpu_load is same as total
+* so we need to make target a bit heavier to reduce task migration
+*/
+   total = total * imbalance_pct / 100;
+
if (!sched_feat(LB_BIAS))
return total;
 
@@ -4033,7 +4039,7 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p, int sync)
this_cpu  = smp_processor_id();
prev_cpu  = task_cpu(p);
load  = source_load(prev_cpu);
-   this_load = target_load(this_cpu);
+   this_load = target_load(this_cpu, 100);
 
/*
 * If sync wakeup then subtract the (maximum possible)
@@ -4089,7 +4095,7 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p, int sync)
 
if (balanced ||
(this_load <= load &&
-this_load + target_load(prev_cpu) <= tl_per_task)) {
+this_load + target_load(prev_cpu, 100) <= tl_per_task)) {
/*
 * This domain has SD_WAKE_AFFINE and
 * p is cache cold in this domain, and
@@ -4135,7 +4141,7 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p, int this_cpu)
if (local_group)
load = source_load(i);
else
-   load = target_load(i);
+   load = target_load(i, sd->imbalance_pct);
 
avg_load += load;
}
@@ -5478,7 +5484,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 
/* Bias balancing toward cpus of our domain */
if (local_group)
-   load = target_load(i);
+   load = target_load(i, env->sd->imbalance_pct);
else
load = source_load(i);
 
-- 
1.8.1.2



Re: [RFC PATCH 0/4] sched: remove cpu_load decay.

2013-11-23 Thread Alex Shi
On 11/22/2013 08:13 PM, Daniel Lezcano wrote:
>>
>> The git tree for this patchset at:
>>   g...@github.com:alexshi/power-scheduling.git no-load-idx
>> Since Fengguang has included this tree in his kernel testing system
>> and I haven't got a regression report until now, I suppose it is fine
>> for x86 systems.
>>
>> But anyway, since the scheduler change will affect all archs, and
>> hackbench is the only benchmark I have found for this patchset, I'd
>> like to see more testing and discussion on this patchset.
> 
> Hi Alex,
> 
> I tried on my Xeon server (2 x 4 cores) your patchset and got the
> following result:
> 
> kernel a5d6e63323fe7799eb0e6  / + patchset
> 
> hackbench -T -s 4096 -l 1000 -g 10 -f 40
>   27.604  38.556


Thanks for your testing, Daniel!

Fengguang, how about your kernel results for this patchset?

-- 
Thanks
Alex


Re: [RFC PATCH 0/4] sched: remove cpu_load decay.

2013-11-23 Thread Alex Shi
On 11/22/2013 08:13 PM, Daniel Lezcano wrote:
 
 Hi Alex,
 
 I tried on my Xeon server (2 x 4 cores) your patchset and got the
 following result:
 
 kernel a5d6e63323fe7799eb0e6  / + patchset
 
 hackbench -T -s 4096 -l 1000 -g 10 -f 40
   27.604  38.556

Wondering if the following patch is helpful on your Xeon server?

Btw, you can run vmstat as a background tool, or use 'perf sched', to see how
the scheduler statistics change with this patchset.

The following are the results for the original kernel and for all 5 patches
on a pandaboard ES.

latest kernel 527d1511310a89    + this patchset
hackbench -T -g 10 -f 40
23.25  20.79
23.16  20.4
24.24  20.29
hackbench -p -g 10 -f 40
26.52  21.2
23.89  24.07
25.65  20.30
hackbench -P -g 10 -f 40
20.14  19.53
19.96  20.37
21.76  20.39

--
From 4f5efd6c2b1e7293410ad57c3db24dcf3394c4a3 Mon Sep 17 00:00:00 2001
From: Alex Shi alex@linaro.org
Date: Sat, 23 Nov 2013 23:18:09 +0800
Subject: [PATCH] sched: aggravate target cpu load to reduce task moving

Task migration happens when the target cpu load is only a bit less than the
source cpu load. To reduce how often that happens, aggravate the target cpu
load with sd->imbalance_pct.

Signed-off-by: Alex Shi alex@linaro.org
---
 kernel/sched/fair.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index bccdd89..c49b7ba 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -978,7 +978,7 @@ static inline unsigned long group_weight(struct task_struct *p, int nid)
 
 static unsigned long weighted_cpuload(const int cpu);
 static unsigned long source_load(int cpu);
-static unsigned long target_load(int cpu);
+static unsigned long target_load(int cpu, int imbalance_pct);
 static unsigned long power_of(int cpu);
 static long effective_load(struct task_group *tg, int cpu, long wl, long wg);
 
@@ -3809,11 +3809,17 @@ static unsigned long source_load(int cpu)
  * Return a high guess at the load of a migration-target cpu weighted
  * according to the scheduling class and nice value.
  */
-static unsigned long target_load(int cpu)
+static unsigned long target_load(int cpu, int imbalance_pct)
 {
struct rq *rq = cpu_rq(cpu);
unsigned long total = weighted_cpuload(cpu);
 
+   /*
+    * Without cpu_load decay, cpu_load is the same as total most of the
+    * time, so make the target look a bit heavier to reduce task migration.
+    */
+   total = total * imbalance_pct / 100;
+
if (!sched_feat(LB_BIAS))
return total;
 
@@ -4033,7 +4039,7 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p, int sync)
this_cpu  = smp_processor_id();
prev_cpu  = task_cpu(p);
load  = source_load(prev_cpu);
-   this_load = target_load(this_cpu);
+   this_load = target_load(this_cpu, 100);
 
/*
 * If sync wakeup then subtract the (maximum possible)
@@ -4089,7 +4095,7 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p, int sync)
 
if (balanced ||
    (this_load <= load &&
-    this_load + target_load(prev_cpu) <= tl_per_task)) {
+    this_load + target_load(prev_cpu, 100) <= tl_per_task)) {
/*
 * This domain has SD_WAKE_AFFINE and
 * p is cache cold in this domain, and
@@ -4135,7 +4141,7 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p, int this_cpu)
if (local_group)
load = source_load(i);
else
-   load = target_load(i);
+   load = target_load(i, sd->imbalance_pct);
 
avg_load += load;
}
@@ -5478,7 +5484,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 
/* Bias balancing toward cpus of our domain */
if (local_group)
-   load = target_load(i);
+   load = target_load(i, env->sd->imbalance_pct);
else
load = source_load(i);
 
-- 
1.8.1.2



Re: [RFC PATCH 0/4] sched: remove cpu_load decay.

2013-11-22 Thread Daniel Lezcano

On 11/22/2013 07:37 AM, Alex Shi wrote:

The cpu_load decays over time according to the rq's past cpu load, while the new
sched_avg decays based on each task's load over time. So we now have two kinds
of decay for cpu_load, which is redundant and adds overhead in sched_tick etc.

This patchset tries to remove the cpu_load decay, and fixes a nohz_full bug
along the way.

There are 5 load_idx values used for cpu_load in sched_domain. busy_idx and
idle_idx are usually non-zero, but newidle_idx, wake_idx and forkexec_idx are
all zero on every arch. The first patch takes a shortcut to remove the cpu_load
decay; it is just a one-line change. :)

I have tested the patchset on my pandaES board (2 cores, ARM Cortex A9).
hackbench thread/pipe performance increased by nearly 10% with this patchset!
That did surprise me!

latest kernel 527d1511310a89+ this patchset
hackbench -T -g 10 -f 40
23.25" 21.7"
23.16" 19.99"
24.24" 21.53"
hackbench -p -g 10 -f 40
26.52" 22.48"
23.89" 24.00"
25.65" 23.06"
hackbench -P -g 10 -f 40
20.14" 19.37"
19.96" 19.76"
21.76" 21.54"

The git tree for this patchset is at:
  g...@github.com:alexshi/power-scheduling.git no-load-idx
Since Fengguang has included this tree in his kernel testing system and I
haven't received a regression report so far, I suppose it is fine for x86
systems.

But anyway, since the scheduler change affects all archs, and hackbench is the
only benchmark I have found for this patchset so far, I'd like to see more
testing and discussion on it.


Hi Alex,

I tried your patchset on my Xeon server (2 x 4 cores) and got the
following results:


kernel a5d6e63323fe7799eb0e6  / + patchset

hackbench -T -s 4096 -l 1000 -g 10 -f 40
  27.604 38.556
  27.397 38.694
  26.695 38.647
  25.975 38.528
  29.586 38.553
  25.956 38.331
  27.895 38.472
  26.874 38.608
  26.836 38.341
  28.064 38.626
hackbench -p -s 4096 -l 1000 -g 10 -f 40
  34.502 35.489
  34.551 35.389
  34.027 35.664
  34.343 35.418
  34.570 35.423
  34.386 35.466
  34.387 35.486
  33.869 35.212
  34.600 35.465
  34.155 35.235
hackbench -P -s 4096 -l 1000 -g 10 -f 40
  39.170 38.794
  39.108 38.662
  39.056 38.946
  39.120 38.668
  38.896 38.865
  39.109 38.803
  39.020 38.946
  39.099 38.844
  38.820 38.872
  38.923 39.337



--
  Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog



[RFC PATCH 0/4] sched: remove cpu_load decay.

2013-11-21 Thread Alex Shi
The cpu_load decays over time according to the rq's past cpu load, while the new
sched_avg decays based on each task's load over time. So we now have two kinds
of decay for cpu_load, which is redundant and adds overhead in sched_tick etc.

This patchset tries to remove the cpu_load decay, and fixes a nohz_full bug
along the way.

There are 5 load_idx values used for cpu_load in sched_domain. busy_idx and
idle_idx are usually non-zero, but newidle_idx, wake_idx and forkexec_idx are
all zero on every arch. The first patch takes a shortcut to remove the cpu_load
decay; it is just a one-line change. :)

I have tested the patchset on my pandaES board (2 cores, ARM Cortex A9).
hackbench thread/pipe performance increased by nearly 10% with this patchset!
That did surprise me!

latest kernel 527d1511310a89+ this patchset
hackbench -T -g 10 -f 40
23.25"  21.7"
23.16"  19.99"
24.24"  21.53"
hackbench -p -g 10 -f 40
26.52"  22.48"
23.89"  24.00"
25.65"  23.06"
hackbench -P -g 10 -f 40
20.14"  19.37"
19.96"  19.76"
21.76"  21.54"

The git tree for this patchset is at:
 g...@github.com:alexshi/power-scheduling.git no-load-idx
Since Fengguang has included this tree in his kernel testing system and I
haven't received a regression report so far, I suppose it is fine for x86
systems.

But anyway, since the scheduler change affects all archs, and hackbench is the
only benchmark I have found for this patchset so far, I'd like to see more
testing and discussion on it.

Regards
Alex


