Re: High priority tasks break SMP balancer?

2007-11-09 Thread Cyrus Massoumi

Hi Micah


On my machine (2-way Opteron with a vanilla 2.6.23.1 kernel) this test
program will reliably put the scheduler into a state where one CPU has
both of the busy-looping processes in its runqueue, and the other CPU
is usually idle. The usually-idle CPU will have a very high cpu_load,
as reported by /proc/sched_debug.


I tried your program on my machine (C2D, 2.6.17, O(1) scheduler).

Both CPUs are 100% busy all the time. Each busy-looping thread is
running on its own CPU. I've been watching top output for 10 minutes;
the distribution is stable and the threads don't bounce at all.
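
For anyone who wants to check the per-CPU spread without watching top, here is a minimal sketch that computes the busy fraction of each CPU from two /proc/stat snapshots. It assumes the usual per-CPU line layout (user, nice, system, idle, iowait, irq, softirq, steal); the helper names are hypothetical, and this is not necessarily how top derives its numbers.

```python
import time


def parse_proc_stat(text):
    """Return {cpu_name: (busy_jiffies, total_jiffies)} for each per-CPU line."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep only "cpu0", "cpu1", ...; skip the aggregate "cpu" line and the rest.
        if not fields or not fields[0].startswith("cpu") or not fields[0][3:].isdigit():
            continue
        # Assumed layout: user nice system idle iowait irq softirq steal.
        # Later fields (guest time) are ignored to avoid double counting.
        values = [int(v) for v in fields[1:9]]
        idle = values[3] + (values[4] if len(values) > 4 else 0)  # idle + iowait
        total = sum(values)
        result[fields[0]] = (total - idle, total)
    return result


def busy_fraction(before, after):
    """Return {cpu_name: busy share in [0, 1]} between two parsed snapshots."""
    out = {}
    for cpu, (busy1, total1) in after.items():
        busy0, total0 = before.get(cpu, (0, 0))
        elapsed = total1 - total0
        out[cpu] = (busy1 - busy0) / elapsed if elapsed > 0 else 0.0
    return out


def sample_busy(interval=1.0, path="/proc/stat"):
    """Take two snapshots `interval` seconds apart and return per-CPU busy share."""
    with open(path) as f:
        before = parse_proc_stat(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_proc_stat(f.read())
    return busy_fraction(before, after)
```

With two busy loops spread over two CPUs, sample_busy() should report values close to 1.0 for both cpu0 and cpu1; in the state Micah describes, one of them would stay near 0.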



greetings
Cyrus


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: aim7 -30% regression in 2.6.24-rc1

2007-11-05 Thread Cyrus Massoumi

Zhang, Yanmin wrote:

On Thu, 2007-11-01 at 11:02 +0100, Cyrus Massoumi wrote:

Zhang, Yanmin wrote:

On Wed, 2007-10-31 at 17:57 +0800, Zhang, Yanmin wrote:

On Tue, 2007-10-30 at 16:36 +0800, Zhang, Yanmin wrote:

On Tue, 2007-10-30 at 08:26 +0100, Ingo Molnar wrote:

* Zhang, Yanmin <[EMAIL PROTECTED]> wrote:

Sub-bisecting showed that patch
38ad464d410dadceda1563f36bdb0be7fe4c8938 (sched: uniform tunings)
caused a 20% regression in aim7.


The remaining 10% should also be related to sched parameters, such as
sysctl_sched_min_granularity.

ah, interesting. Since you have CONFIG_SCHED_DEBUG enabled, could you
please try to figure out what the best values for
/proc/sys/kernel_sched_latency, /proc/sys/kernel_sched_nr_latency and
/proc/sys/kernel_sched_min_granularity are?


there's a tuning constraint for kernel_sched_nr_latency: 

- kernel_sched_nr_latency should always be set to 
  kernel_sched_latency/kernel_sched_min_granularity. (it's not a free 
  tunable)


i suspect a good approach would be to double the value of
kernel_sched_latency and kernel_sched_nr_latency in each tuning
iteration, while keeping kernel_sched_min_granularity unchanged. That
will exercise the tuning values of the 2.6.23 kernel as well.
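
A minimal sketch of the sweep suggested above, for illustration only. The starting values (20 ms latency, 4 ms minimum granularity) are placeholders and not numbers from this thread, and the function name `tuning_sweep` is hypothetical. Each step doubles the latency and derives nr_latency from the constraint latency / min_granularity, so nr_latency doubles as well while min_granularity stays fixed.

```python
def tuning_sweep(latency_ns, min_granularity_ns, steps):
    """Yield (latency_ns, nr_latency, min_granularity_ns) for each iteration.

    Latency doubles on every step; min_granularity is held constant, so
    nr_latency = latency / min_granularity doubles along with it.
    """
    if latency_ns <= 0 or min_granularity_ns <= 0:
        raise ValueError("latency and min_granularity must be positive")
    if latency_ns % min_granularity_ns:
        raise ValueError("latency must be a multiple of min_granularity")
    for _ in range(steps):
        yield latency_ns, latency_ns // min_granularity_ns, min_granularity_ns
        latency_ns *= 2


if __name__ == "__main__":
    # Placeholder starting point, chosen only to show the shape of the sweep.
    for latency, nr_latency, gran in tuning_sweep(20_000_000, 4_000_000, 4):
        print(f"sched_latency={latency} sched_nr_latency={nr_latency} "
              f"sched_min_granularity={gran}")
```

Each printed triple is one candidate setting to benchmark; writing the values into the actual sysctl files is left out on purpose, since the exact file names depend on the kernel version.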

I followed your idea to test 2.6.24-rc1. The improvement is slow.
When sched_nr_latency=2560 and sched_latency_ns=64000, the performance
is still about 15% less than 2.6.23.

I got the aim7 30% regression on my newly upgraded stoakley machine. I found
this machine is slower than the old one. Maybe the BIOS has issues, or the
memory is slow (it might not be dual-channel?). So I retested on the old
stoakley machine and found the regression there is about 6%, quite similar
to the regression on the tigerton machine.

With sched_nr_latency=640 and sched_latency_ns=64000 on the old stoakley
machine, the regression drops to about 2%. Other latency values show a
larger regression.

On my tulsa machine, with sched_nr_latency=640 and sched_latency_ns=64000,
the regression drops to less than 1% (the original regression was about 20%).

I reran SPECjbb with sched_nr_latency=640 and sched_latency_ns=64000. On
tigerton, the regression is still more than 40%. On the stoakley machine, it
becomes worse (26%; originally 9%). I will investigate further to make sure
the SPECjbb regression is also caused by the bad default values.

We need a smarter method to calculate the best default values for the key tuning
parameters.

One interesting thing is that sysbench+mysql (read-only) got the same result
as on 2.6.22 (no regression). Good job!

Do you mean you couldn't reproduce the regression which was reported
with 2.6.23 (http://lkml.org/lkml/2007/10/30/53) with 2.6.24-rc1?

It looks like you missed my emails.


Yeah :(


Firstly, I reproduced (or just found the same myself :) ) the issue with
kernels 2.6.22, 2.6.23-rc and 2.6.23.

Ingo wrote a big patch to fix it and the new patch is in 2.6.24-rc1 now.


That's nice, could you please point me to the commit?


Then I retested it with 2.6.24-rc1 on a couple of x86_64 machines. The issue
disappeared. You could test it with 2.6.24-rc1.


Will do!

It would be nice if you could provide some numbers for 2.6.22, 2.6.23 and
2.6.24-rc1.

Sorry. Intel policy doesn't allow me to publish the numbers because only
specific departments in Intel can do that. But I can talk about the
regression percentages.


Fair enough :)


-yanmin


greetings
Cyrus



Re: aim7 -30% regression in 2.6.24-rc1

2007-11-01 Thread Cyrus Massoumi

Zhang, Yanmin wrote:

On Wed, 2007-10-31 at 17:57 +0800, Zhang, Yanmin wrote:

On Tue, 2007-10-30 at 16:36 +0800, Zhang, Yanmin wrote:

On Tue, 2007-10-30 at 08:26 +0100, Ingo Molnar wrote:

* Zhang, Yanmin <[EMAIL PROTECTED]> wrote:

Sub-bisecting showed that patch
38ad464d410dadceda1563f36bdb0be7fe4c8938 (sched: uniform tunings)
caused a 20% regression in aim7.


The remaining 10% should also be related to sched parameters, such as
sysctl_sched_min_granularity.

ah, interesting. Since you have CONFIG_SCHED_DEBUG enabled, could you
please try to figure out what the best values for
/proc/sys/kernel_sched_latency, /proc/sys/kernel_sched_nr_latency and
/proc/sys/kernel_sched_min_granularity are?


there's a tuning constraint for kernel_sched_nr_latency: 

- kernel_sched_nr_latency should always be set to 
  kernel_sched_latency/kernel_sched_min_granularity. (it's not a free 
  tunable)


i suspect a good approach would be to double the value of
kernel_sched_latency and kernel_sched_nr_latency in each tuning
iteration, while keeping kernel_sched_min_granularity unchanged. That
will exercise the tuning values of the 2.6.23 kernel as well.

I followed your idea to test 2.6.24-rc1. The improvement is slow.
When sched_nr_latency=2560 and sched_latency_ns=64000, the performance
is still about 15% less than 2.6.23.

I got the aim7 30% regression on my newly upgraded stoakley machine. I found
this machine is slower than the old one. Maybe the BIOS has issues, or the
memory is slow (it might not be dual-channel?). So I retested on the old
stoakley machine and found the regression there is about 6%, quite similar
to the regression on the tigerton machine.

With sched_nr_latency=640 and sched_latency_ns=64000 on the old stoakley
machine, the regression drops to about 2%. Other latency values show a
larger regression.

On my tulsa machine, with sched_nr_latency=640 and sched_latency_ns=64000,
the regression drops to less than 1% (the original regression was about 20%).

I reran SPECjbb with sched_nr_latency=640 and sched_latency_ns=64000. On
tigerton, the regression is still more than 40%. On the stoakley machine, it
becomes worse (26%; originally 9%). I will investigate further to make sure
the SPECjbb regression is also caused by the bad default values.

We need a smarter method to calculate the best default values for the key tuning
parameters.

One interesting thing is that sysbench+mysql (read-only) got the same result
as on 2.6.22 (no regression). Good job!


Do you mean you couldn't reproduce the regression which was reported 
with 2.6.23 (http://lkml.org/lkml/2007/10/30/53) with 2.6.24-rc1? It 
would be nice if you could provide some numbers for 2.6.22, 2.6.23 and 
2.6.24-rc1.



-yanmin


greetings
Cyrus
