Re: [patch] CFS scheduler, -v5
* Srivatsa Vaddagiri <[EMAIL PROTECTED]> wrote:

> +#define __NR_yield_to 280
> +__SYSCALL(__NR_move_pages, sys_sched_yield_to)
>
> s/__NR_move_pages/__NR_yield_to in the above line?

yeah, thanks.

	Ingo
Re: [patch] CFS scheduler, -v5
On Mon, Apr 23, 2007 at 08:21:16AM +0200, Ingo Molnar wrote:
> > Changing sys_yield_to to sys_sched_yield_to in
> > include/asm-x86_64/unistd.h fixes the problem.
>
> thanks. I edited the -v5 patch so new downloads should have the fix.
> (i also test-booted x86_64 with this patch)

I downloaded -v5 and noticed this:

--- linux.orig/include/asm-x86_64/unistd.h
+++ linux/include/asm-x86_64/unistd.h
@@ -619,8 +619,10 @@ __SYSCALL(__NR_sync_file_range, sys_sync
 __SYSCALL(__NR_vmsplice, sys_vmsplice)
 #define __NR_move_pages 279
 __SYSCALL(__NR_move_pages, sys_move_pages)
+#define __NR_yield_to 280
+__SYSCALL(__NR_move_pages, sys_sched_yield_to)

s/__NR_move_pages/__NR_yield_to in the above line?

--
Regards,
vatsa
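[ For reference: the corrected table entry implied by vatsa's
  substitution would read as below. This is a sketch of the fix, not a
  quote from the repaired -v5 patch itself. ]

	/* corrected x86_64 entry, per the s/__NR_move_pages/__NR_yield_to/ above */
	#define __NR_yield_to		280
	__SYSCALL(__NR_yield_to, sys_sched_yield_to)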
Re: [patch] CFS scheduler, -v5
* Christian Hesse <[EMAIL PROTECTED]> wrote:

> > Or do you mean that the default placement of single tasks starts at
> > CPU#0, while with mainline they were alternating?
>
> That was not your fault. I updated suspend2 to 2.2.9.13 and everything
> works as expected again. Sorry for the noise.

ok, great!

	Ingo
Re: [patch] CFS scheduler, -v5
On Wednesday 25 April 2007, Ingo Molnar wrote:
> * Christian Hesse <[EMAIL PROTECTED]> wrote:
> > On Monday 23 April 2007, Ingo Molnar wrote:
> > > i'm pleased to announce release -v5 of the CFS scheduler patchset.
> >
> > Hi Ingo,
> >
> > I just noticed that with cfs all processes (except some kernel
> > threads) run on cpu 0. I don't think this is expected cpu affinity
> > for an smp system? I remember about half of the processes running
> > on each core with mainline.
>
> i've got several SMP systems with CFS and all distribute the load
> properly to all CPUs, so it would be nice if you could tell me more
> about how the problem manifests itself on your system.
>
> for example, if you start two infinite loops:
>
>	for (( N=0; N < 2; N++ )); do ( while :; do :; done ) & done
>
> do they end up on the same CPU?
>
> Or do you mean that the default placement of single tasks starts at
> CPU#0, while with mainline they were alternating?

That was not your fault. I updated suspend2 to 2.2.9.13 and everything
works as expected again. Sorry for the noise.

--
Regards,
Chris
Re: [patch] CFS scheduler, -v5
* Christian Hesse <[EMAIL PROTECTED]> wrote:

> On Monday 23 April 2007, Ingo Molnar wrote:
> > i'm pleased to announce release -v5 of the CFS scheduler patchset.
>
> Hi Ingo,
>
> I just noticed that with cfs all processes (except some kernel
> threads) run on cpu 0. I don't think this is expected cpu affinity
> for an smp system? I remember about half of the processes running on
> each core with mainline.

i've got several SMP systems with CFS and all distribute the load
properly to all CPUs, so it would be nice if you could tell me more
about how the problem manifests itself on your system.

for example, if you start two infinite loops:

	for (( N=0; N < 2; N++ )); do ( while :; do :; done ) & done

do they end up on the same CPU?

Or do you mean that the default placement of single tasks starts at
CPU#0, while with mainline they were alternating?

	Ingo
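[ A quick way to answer the "do they end up on the same CPU?" question
  is to have each busy loop report the CPU it runs on. A minimal C
  sketch, assuming glibc's sched_getcpu() is available; the program
  itself is ours, not part of the patchset. Run two instances in
  parallel and compare the printed CPU numbers: ]

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	unsigned long i = 0;

	/* busy-loop forever, periodically printing our current CPU */
	for (;;) {
		if (++i % 100000000UL == 0)
			printf("pid %d on cpu %d\n",
			       (int)getpid(), sched_getcpu());
	}
	return 0;
}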
Re: [patch] CFS scheduler, -v5
On Monday 23 April 2007, Ingo Molnar wrote:
> i'm pleased to announce release -v5 of the CFS scheduler patchset.

Hi Ingo,

I just noticed that with cfs all processes (except some kernel threads)
run on cpu 0. I don't think this is expected cpu affinity for an smp
system? I remember about half of the processes running on each core
with mainline.

--
Regards,
Chris
Re: [patch] CFS scheduler, -v5
* Guillaume Chazarain <[EMAIL PROTECTED]> wrote:

> 2007/4/23, Ingo Molnar <[EMAIL PROTECTED]>:
>
> Index: linux/kernel/sched.c
> ===================================================================
> --- linux.orig/kernel/sched.c
> +++ linux/kernel/sched.c
> +#include "sched_stats.h"
> +#include "sched_rt.c"
> +#include "sched_fair.c"
> +#include "sched_debug.c"
>
> Index: linux/kernel/sched_stats.h
> ===================================================================
> --- /dev/null
> +++ linux/kernel/sched_stats.h
>
> These look unnatural if this were to be included in mainline.

agreed - these will likely be separate modules - i just wanted to have
an easy way of sharing infrastructure between sched.c and these.

	Ingo
Re: [patch] CFS scheduler, -v5
2007/4/23, Ingo Molnar <[EMAIL PROTECTED]>:

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
+#include "sched_stats.h"
+#include "sched_rt.c"
+#include "sched_fair.c"
+#include "sched_debug.c"

Index: linux/kernel/sched_stats.h
===================================================================
--- /dev/null
+++ linux/kernel/sched_stats.h

These look unnatural if this were to be included in mainline.

WBR.
--
Guillaume
Re: [patch] CFS scheduler, -v5 (build problem - make headers_check fails)
* Zach Carter <[EMAIL PROTECTED]> wrote:

> FYI, make headers_check seems to fail on this:
>
> [EMAIL PROTECTED] linux-2.6]$ make headers_check
> make[2]: *** No rule to make target
> `/src/linux-2.6/usr/include/linux/.check.sched.h', needed by
> `__headerscheck'.  Stop.
> make[1]: *** [linux] Error 2
> make: *** [headers_check] Error 2
> [EMAIL PROTECTED] linux-2.6]$
>
> This also fails if I have CONFIG_HEADERS_CHECK=y in my .config

ah, indeed - the patch below should fix this. It will be in -v6.

	Ingo

Index: linux/include/linux/sched.h
===================================================================
--- linux.orig/include/linux/sched.h
+++ linux/include/linux/sched.h
@@ -2,7 +2,6 @@
 #define _LINUX_SCHED_H

 #include <linux/auxvec.h>	/* For AT_VECTOR_SIZE */
-#include <linux/rbtree.h>	/* For run_node */

 /*
  * cloning flags:
  */
@@ -37,6 +36,8 @@

 #ifdef __KERNEL__

+#include <linux/rbtree.h>	/* For run_node */
+
 struct sched_param {
 	int sched_priority;
 };
Re: [patch] CFS scheduler, -v5
* Nick Piggin <[EMAIL PROTECTED]> wrote:

> > yeah - but they'll all be quad core, so the SMP timeslice
> > multiplicator should do the trick. Most of the CFS testers use
> > single-CPU systems.
>
> But desktop users could have quad thread and even 8 thread CPUs
> soon, [...]

SMT is indeed an issue, so i think what should be used to scale
timeslices isnt num_online_cpus(), but the sum of all CPUs'
->cpu_power value (scaled down by SCHED_LOAD_SCALE). That way if the
thread is not a 'full CPU', then the scaling will be proportionally
smaller. Can you see any hole in that?

	Ingo
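[ A minimal sketch of the scaling rule Ingo describes above. The
  function name and the array-based interface are ours; SCHED_LOAD_SCALE
  was 128 in 2.6-era schedulers, and an SMT sibling's ->cpu_power sits
  below that value, so it counts as less than a full CPU: ]

#define SCHED_LOAD_SCALE 128UL

/* scale a base granularity by the sum of all CPUs' cpu_power,
 * normalized by SCHED_LOAD_SCALE: two full cores double the base
 * value, while two SMT siblings contribute noticeably less */
unsigned long scale_granularity(unsigned long base_nsecs,
				const unsigned long *cpu_power, int nr_cpus)
{
	unsigned long total_power = 0;
	int i;

	for (i = 0; i < nr_cpus; i++)
		total_power += cpu_power[i];

	return base_nsecs * total_power / SCHED_LOAD_SCALE;
}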
Re: [patch] CFS scheduler, -v5
* Nick Piggin <[EMAIL PROTECTED]> wrote:

> > do need reinforcement and test results on the basic part: _can_ this
> > design be interactive enough on the desktop? So far the feedback has
> > been affirmative, but more testing is needed.
>
> It seems to be fairly easy to make a scheduler interactive if the
> timeslice is as low as that (not that I've released one for wider
> testing, but just by my own observations). [...]

ok, i'll bite: please release such a scheduler that does that with
5-8-10msec range timeslices :-)

	Ingo
Re: [patch] CFS scheduler, -v5
On Mon, Apr 23, 2007 at 09:10:50AM +0200, Ingo Molnar wrote:
> * Nick Piggin <[EMAIL PROTECTED]> wrote:
> > > yeah - but they'll all be quad core, so the SMP timeslice
> > > multiplicator should do the trick. Most of the CFS testers use
> > > single-CPU systems.
> >
> > But desktop users could have quad thread and even 8 thread CPUs
> > soon, so if the number doesn't work for both then you're in trouble.
> > It just smells like a hack to scale with CPU numbers.
>
> hm, i still like Con's approach in this case because it makes
> independent sense: in essence we calculate the "human visible"
> effective latency of a physical resource: more CPUs/threads means
> more parallelism and less visible choppiness of whatever basic
> chunking of workloads there might be, hence larger size chunking can
> be done.

If there were no penalty, you would like the timeslice as small as
possible. There is a penalty, which is why we want larger timeslices.
This penalty is still almost as significant on multiprocessor systems
as it is on single processor systems (remote memory / coherency traffic
make it slightly more on some multiprocessors, but nothing like the
basic cache<->RAM order of magnitude problem).

> > > it doesnt in any test i do, but again, i'm erring on the side of
> > > it being more interactive.
> >
> > I'd start by erring on the side of trying to ensure no obvious
> > performance regressions like this because that's the easy part.
> > Suppose everybody finds your scheduler wonderfully interactive, but
> > you can't make it so with a larger timeslice?
>
> look at CFS's design and you'll see that it can easily take larger
> timeslices :) I really dont need any reinforcement on that part. But i

By default, I mean.

> do need reinforcement and test results on the basic part: _can_ this
> design be interactive enough on the desktop? So far the feedback has
> been affirmative, but more testing is needed.

It seems to be fairly easy to make a scheduler interactive if the
timeslice is as low as that (not that I've released one for wider
testing, but just by my own observations). So I don't think we'd need
to go to rbtree based scheduling just for that.

> server scheduling, while obviously of prime importance to us, is
> really 'easy' in comparison technically, because it has alot less
> human factors and is thus a much more deterministic task.

But there are lots of shades of grey (CPU efficiency on desktops is
often important, and sometimes servers need to do interactive sorts of
things). It would be much better if a single scheduler with default
settings would be reasonable for all.

> > For _real_ desktop systems, sure, erring on the side of being more
> > interactive is fine. For RFC patches for testing, I really think
> > you could be taking advantage of the fact that people will give you
> > feedback on the issue.
>
> 90% of the testers are using CFS on desktops. 80% of the scheduler
> complaints come regarding the human (latency/behavior/consistency)
> aspect of the upstream scheduler. (Sure, we dont want to turn that
> around into '80% of the complaints come due to performance' - so i
> increased the granularity based on your kbuild feedback to near that
> of SD's, to show that mini-timeslices are not a necessity in CFS, but
> i really think that server scheduling is the easier part.)

So why not solve that (or at least not introduce obvious regressions),
and then focus on the hard part?
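[ Nick's context-switch penalty argument can be made concrete with the
  classic pipe ping-pong microbenchmark, in the spirit of lmbench's
  lat_ctx. The sketch below is ours: two processes bounce one byte
  across a pair of pipes, forcing a context switch per hop, so the
  per-round time approximates switch cost plus cache refill -- the
  penalty that argues for longer timeslices: ]

#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define ROUNDS 100000

int main(void)
{
	int ab[2], ba[2];
	char c = 0;
	struct timespec t0, t1;

	if (pipe(ab) || pipe(ba))
		return 1;

	if (fork() == 0) {
		/* child: echo each byte back, forcing a switch per hop */
		for (;;) {
			if (read(ab[0], &c, 1) != 1)
				_exit(0);
			write(ba[1], &c, 1);
		}
	}

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (int i = 0; i < ROUNDS; i++) {
		write(ab[1], &c, 1);
		read(ba[0], &c, 1);
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	printf("%.0f ns per round trip\n",
	       ((t1.tv_sec - t0.tv_sec) * 1e9 +
		(t1.tv_nsec - t0.tv_nsec)) / ROUNDS);
	return 0;
}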
Re: [patch] CFS scheduler, -v5
* Nick Piggin <[EMAIL PROTECTED]> wrote:

> > yeah - but they'll all be quad core, so the SMP timeslice
> > multiplicator should do the trick. Most of the CFS testers use
> > single-CPU systems.
>
> But desktop users could have quad thread and even 8 thread CPUs
> soon, so if the number doesn't work for both then you're in trouble.
> It just smells like a hack to scale with CPU numbers.

hm, i still like Con's approach in this case because it makes
independent sense: in essence we calculate the "human visible"
effective latency of a physical resource: more CPUs/threads means more
parallelism and less visible choppiness of whatever basic chunking of
workloads there might be, hence larger size chunking can be done.

> > it doesnt in any test i do, but again, i'm erring on the side of it
> > being more interactive.
>
> I'd start by erring on the side of trying to ensure no obvious
> performance regressions like this because that's the easy part.
> Suppose everybody finds your scheduler wonderfully interactive, but
> you can't make it so with a larger timeslice?

look at CFS's design and you'll see that it can easily take larger
timeslices :) I really dont need any reinforcement on that part. But i
do need reinforcement and test results on the basic part: _can_ this
design be interactive enough on the desktop? So far the feedback has
been affirmative, but more testing is needed.

server scheduling, while obviously of prime importance to us, is
really 'easy' in comparison technically, because it has alot less
human factors and is thus a much more deterministic task.

> For _real_ desktop systems, sure, erring on the side of being more
> interactive is fine. For RFC patches for testing, I really think you
> could be taking advantage of the fact that people will give you
> feedback on the issue.

90% of the testers are using CFS on desktops. 80% of the scheduler
complaints come regarding the human (latency/behavior/consistency)
aspect of the upstream scheduler. (Sure, we dont want to turn that
around into '80% of the complaints come due to performance' - so i
increased the granularity based on your kbuild feedback to near that
of SD's, to show that mini-timeslices are not a necessity in CFS, but
i really think that server scheduling is the easier part.)

	Ingo
Re: [patch] CFS scheduler, -v5
* Markus Trippelsdorf <[EMAIL PROTECTED]> wrote:

> > The new version does not link here (amd64,smp):
> >
> >   LD      .tmp_vmlinux1
> > arch/x86_64/kernel/built-in.o:(.rodata+0x1dd8): undefined reference
> > to `sys_yield_to'
>
> Changing sys_yield_to to sys_sched_yield_to in
> include/asm-x86_64/unistd.h fixes the problem.

thanks. I edited the -v5 patch so new downloads should have the fix. (i
also test-booted x86_64 with this patch)

	Ingo
Re: [patch] CFS scheduler, -v5
On Mon, Apr 23, 2007 at 03:12:29AM +0200, Ingo Molnar wrote:
>
> i'm pleased to announce release -v5 of the CFS scheduler patchset. The
> patch against v2.6.21-rc7 and v2.6.20.7 can be downloaded from:
...
> - feature: add initial sys_sched_yield_to() implementation. Not hooked
>   into the futex code yet, but testers are encouraged to give the
>   syscalls a try, on i686 the new syscall is __NR_yield_to==320, on
>   x86_64 it's __NR_yield_to==280. The prototype is
>   sys_sched_yield_to(pid_t), as suggested by Ulrich Drepper.

The new version does not link here (amd64,smp):

  LD      .tmp_vmlinux1
arch/x86_64/kernel/built-in.o:(.rodata+0x1dd8): undefined reference to
`sys_yield_to'

--
Markus
Re: [patch] CFS scheduler, -v5
On Mon, Apr 23, 2007 at 07:16:59AM +0200, Markus Trippelsdorf wrote:
> On Mon, Apr 23, 2007 at 03:12:29AM +0200, Ingo Molnar wrote:
> >
> > i'm pleased to announce release -v5 of the CFS scheduler patchset.
> > The patch against v2.6.21-rc7 and v2.6.20.7 can be downloaded from:
> ...
> > - feature: add initial sys_sched_yield_to() implementation. Not
> >   hooked into the futex code yet, but testers are encouraged to give
> >   the syscalls a try, on i686 the new syscall is __NR_yield_to==320,
> >   on x86_64 it's __NR_yield_to==280. The prototype is
> >   sys_sched_yield_to(pid_t), as suggested by Ulrich Drepper.
>
> The new version does not link here (amd64,smp):
>
>   LD      .tmp_vmlinux1
> arch/x86_64/kernel/built-in.o:(.rodata+0x1dd8): undefined reference to
> `sys_yield_to'

Changing sys_yield_to to sys_sched_yield_to in
include/asm-x86_64/unistd.h fixes the problem.

--
Markus
Re: [patch] CFS scheduler, -v5
On Mon, Apr 23, 2007 at 05:43:10AM +0200, Ingo Molnar wrote:
> * Nick Piggin <[EMAIL PROTECTED]> wrote:
> > > note that CFS's "granularity" value is not directly comparable to
> > > "timeslice length":
> >
> > Right, but it does introduce the kbuild regression, [...]
>
> Note that i increased the granularity from 1msec to 5msecs after your
> kbuild report, could you perhaps retest kbuild with the default
> settings of -v5?

I'm looking at mysql again today, but I will try eventually. It was
just a simple kbuild.

> > [...] and as we discussed, this will be only worse on newer CPUs
> > with bigger caches or less naturally context switchy workloads.
>
> yeah - but they'll all be quad core, so the SMP timeslice
> multiplicator should do the trick. Most of the CFS testers use
> single-CPU systems.

But desktop users could have quad thread and even 8 thread CPUs soon,
so if the number doesn't work for both then you're in trouble. It just
smells like a hack to scale with CPU numbers.

> > > (in -v6 i'll scale the granularity up a bit with the number of
> > > CPUs, like SD does. That should get the right result on larger
> > > SMP boxes too.)
> >
> > I don't really like the scaling with SMP thing. The cache effects
> > are still going to be significant on small systems, and there are
> > lots of non-desktop users of those (eg. clusters).
>
> CFS using clusters will want to tune the granularity up drastically
> anyway, to 1 second or more, to maximize throughput. I think a small
> default with a scale-up-on-SMP rule is pretty sane. We'll gather some
> more kbuild data and see what happens, ok?
>
> > > while i agree it's a tad too finegrained still, I agree with Con's
> > > choice: rather err on the side of being too finegrained and lose
> > > some small amount of throughput on cache-intense workloads like
> > > compile jobs, than err on the side of being visibly too choppy for
> > > users on the desktop.
> >
> > So cfs gets too choppy if you make the effective timeslice
> > comparable to mainline?
>
> it doesnt in any test i do, but again, i'm erring on the side of it
> being more interactive.

I'd start by erring on the side of trying to ensure no obvious
performance regressions like this because that's the easy part.
Suppose everybody finds your scheduler wonderfully interactive, but
you can't make it so with a larger timeslice?

For _real_ desktop systems, sure, erring on the side of being more
interactive is fine. For RFC patches for testing, I really think you
could be taking advantage of the fact that people will give you
feedback on the issue.
Re: [patch] CFS scheduler, -v5
* Nick Piggin <[EMAIL PROTECTED]> wrote:

> > note that CFS's "granularity" value is not directly comparable to
> > "timeslice length":
>
> Right, but it does introduce the kbuild regression, [...]

Note that i increased the granularity from 1msec to 5msecs after your
kbuild report, could you perhaps retest kbuild with the default
settings of -v5?

> [...] and as we discussed, this will be only worse on newer CPUs with
> bigger caches or less naturally context switchy workloads.

yeah - but they'll all be quad core, so the SMP timeslice multiplicator
should do the trick. Most of the CFS testers use single-CPU systems.

> > (in -v6 i'll scale the granularity up a bit with the number of
> > CPUs, like SD does. That should get the right result on larger SMP
> > boxes too.)
>
> I don't really like the scaling with SMP thing. The cache effects are
> still going to be significant on small systems, and there are lots of
> non-desktop users of those (eg. clusters).

CFS using clusters will want to tune the granularity up drastically
anyway, to 1 second or more, to maximize throughput. I think a small
default with a scale-up-on-SMP rule is pretty sane. We'll gather some
more kbuild data and see what happens, ok?

> > while i agree it's a tad too finegrained still, I agree with Con's
> > choice: rather err on the side of being too finegrained and lose
> > some small amount of throughput on cache-intense workloads like
> > compile jobs, than err on the side of being visibly too choppy for
> > users on the desktop.
>
> So cfs gets too choppy if you make the effective timeslice comparable
> to mainline?

it doesnt in any test i do, but again, i'm erring on the side of it
being more interactive.

	Ingo
Re: [patch] CFS scheduler, -v5
On Mon, Apr 23, 2007 at 04:55:53AM +0200, Ingo Molnar wrote:
> * Nick Piggin <[EMAIL PROTECTED]> wrote:
> > > the biggest user-visible change in -v5 are various interactivity
> > > improvements (especially under higher load) to fix reported
> > > regressions, and an improved way of handling nice levels. There's
> > > also a new sys_sched_yield_to() syscall implementation for i686
> > > and x86_64.
> > >
> > > All known regressions have been fixed. (knock on wood)
> >
> > I think the granularity is still much too low. Why not increase it
> > to something more reasonable as a default?
>
> note that CFS's "granularity" value is not directly comparable to
> "timeslice length":

Right, but it does introduce the kbuild regression, and as we
discussed, this will be only worse on newer CPUs with bigger caches or
less naturally context switchy workloads.

> [ Note: while CFS's default preemption granularity is currently set
>   to 5 msecs, this value does not directly transform into timeslices:
>   for example two CPU-intense tasks will have effective timeslices of
>   10 msecs with this setting. ]
>
> also, i just checked SD: 0.46 defaults to 8 msecs rr_interval (on 1
> CPU systems), which is lower than the 10 msecs effective timeslice
> length CFS-v5 achieves on two CPU-bound tasks.

This is about an order of magnitude more than the current scheduler,
so I still think it is too small.

> (in -v6 i'll scale the granularity up a bit with the number of CPUs,
> like SD does. That should get the right result on larger SMP boxes
> too.)

I don't really like the scaling with SMP thing. The cache effects are
still going to be significant on small systems, and there are lots of
non-desktop users of those (eg. clusters).

> while i agree it's a tad too finegrained still, I agree with Con's
> choice: rather err on the side of being too finegrained and lose some
> small amount of throughput on cache-intense workloads like compile
> jobs, than err on the side of being visibly too choppy for users on
> the desktop.

So cfs gets too choppy if you make the effective timeslice comparable
to mainline?

My approach is completely the opposite. For testing, I prefer to make
the timeslice as large as possible so any problems or regressions are
really noticeable and will be reported; it can be scaled back to be
smaller once those kinks are ironed out.
Re: [patch] CFS scheduler, -v5 (build problem - make headers_check fails)
Ingo Molnar wrote:
> i'm pleased to announce release -v5 of the CFS scheduler patchset. The
> patch against v2.6.21-rc7 and v2.6.20.7 can be downloaded from:

FYI, make headers_check seems to fail on this:

[EMAIL PROTECTED] linux-2.6]$ make headers_check
[snip]
  CHECK   include/linux/usb/cdc.h
  CHECK   include/linux/usb/audio.h
make[2]: *** No rule to make target
`/src/linux-2.6/usr/include/linux/.check.sched.h', needed by
`__headerscheck'.  Stop.
make[1]: *** [linux] Error 2
make: *** [headers_check] Error 2
[EMAIL PROTECTED] linux-2.6]$

This also fails if I have CONFIG_HEADERS_CHECK=y in my .config

unset CONFIG_HEADERS_CHECK and it builds just fine.

-Zach
Re: [patch] CFS scheduler, -v5
* Gene Heskett <[EMAIL PROTECTED]> wrote:

> I haven't approached that yet, but I just noticed, having been booted
> to this for all of 5 minutes, that although I told it not to renice X
> when my script ran 'make oldconfig' (I answered n), there it is,
> sitting at -19 according to htop.
>
> The .config says otherwise:
> [EMAIL PROTECTED] linux-2.6.21-rc7-CFS-v5]# grep RENICE .config
> # CONFIG_RENICE_X is not set
>
> So v5 reniced X in spite of the 'no' setting.

Hmm, apparently your X uses ioperm() while mine uses iopl(), and i only
turned off the renicing for iopl. (I fixed this in my tree and it will
show up in -v6.)

> Although I hadn't noticed it, one way or the other, I just set it (X)
> back to the default -1 so that I'm comparing the same apples when I
> do compare.

note that CFS handles negative nice levels differently from other
schedulers, so the disadvantages of aggressively reniced X (lost
throughput due to overscheduling, worse interactivity) do _not_ apply
to CFS. I think the 'fair' setting would be whatever the scheduler
writer recommends: for SD, X probably performs better at around nice 0
(i'll let Con correct me if his experience is different). On CFS, nice
-10 is perfectly fine too, and you'll have a zippier desktop under
higher loads. (on servers this might be unnecessary/disadvantageous so
there this can be turned off.)

(also, in my tree i've changed the default from -19 to -10 to make it
less scary to people and to leave more levels to the sysadmin, this
change too will show up in -v6.)

	Ingo
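[ A hypothetical sketch of the mechanism under discussion -- not the
  actual CFS -v5 code. Per this exchange, the renicing is triggered
  from the raw-I/O syscalls, and -v5 honored the config knob only on
  the iopl() path, which is why an ioperm()-using X server was still
  reniced. The helper name below is ours: ]

#ifdef CONFIG_RENICE_X
/* the caller of a raw-I/O syscall is assumed to be the X server;
 * to match the described behavior, both the ioperm() and iopl()
 * paths would need to call this -- -v6 reportedly covers both and
 * renices to -10 rather than -19 */
static inline void renice_x_server(void)
{
	set_user_nice(current, -10);
}
#else
static inline void renice_x_server(void) { }
#endif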
Re: [patch] CFS scheduler, -v5
* Nick Piggin <[EMAIL PROTECTED]> wrote:

> > the biggest user-visible change in -v5 are various interactivity
> > improvements (especially under higher load) to fix reported
> > regressions, and an improved way of handling nice levels. There's
> > also a new sys_sched_yield_to() syscall implementation for i686 and
> > x86_64.
> >
> > All known regressions have been fixed. (knock on wood)
>
> I think the granularity is still much too low. Why not increase it to
> something more reasonable as a default?

note that CFS's "granularity" value is not directly comparable to
"timeslice length":

> [ Note: while CFS's default preemption granularity is currently set
>   to 5 msecs, this value does not directly transform into timeslices:
>   for example two CPU-intense tasks will have effective timeslices of
>   10 msecs with this setting. ]

also, i just checked SD: 0.46 defaults to 8 msecs rr_interval (on 1 CPU
systems), which is lower than the 10 msecs effective timeslice length
CFS-v5 achieves on two CPU-bound tasks.

(in -v6 i'll scale the granularity up a bit with the number of CPUs,
like SD does. That should get the right result on larger SMP boxes
too.)

while i agree it's a tad too finegrained still, I agree with Con's
choice: rather err on the side of being too finegrained and lose some
small amount of throughput on cache-intense workloads like compile
jobs, than err on the side of being visibly too choppy for users on the
desktop.

	Ingo
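[ The granularity-to-timeslice relation implied by Ingo's note, written
  out as a sketch. The linear dependence on the number of runnable
  tasks is our reading of the "5 msecs granularity, two tasks, 10 msecs
  slice" example above, not a quote from the CFS source: ]

/* with granularity_ms = 5 and nr_running = 2 this yields the
 * 10 msecs effective slice from the example above */
unsigned int effective_timeslice_ms(unsigned int granularity_ms,
				    unsigned int nr_running)
{
	return granularity_ms * nr_running;
}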
Re: [patch] CFS scheduler, -v5
On Sunday 22 April 2007, Nick Piggin wrote:
> On Mon, Apr 23, 2007 at 03:12:29AM +0200, Ingo Molnar wrote:
> > i'm pleased to announce release -v5 of the CFS scheduler patchset.
> > The patch against v2.6.21-rc7 and v2.6.20.7 can be downloaded from:
> >
> >	http://redhat.com/~mingo/cfs-scheduler/
> >
> > this CFS release mainly fixes regressions and improves
> > interactivity:
> >
> >	13 files changed, 211 insertions(+), 199 deletions(-)
> >
> > the biggest user-visible change in -v5 are various interactivity
> > improvements (especially under higher load) to fix reported
> > regressions, and an improved way of handling nice levels. There's
> > also a new sys_sched_yield_to() syscall implementation for i686 and
> > x86_64.
> >
> > All known regressions have been fixed. (knock on wood)
>
> I think the granularity is still much too low. Why not increase it to
> something more reasonable as a default?

I haven't approached that yet, but I just noticed, having been booted
to this for all of 5 minutes, that although I told it not to renice X
when my script ran 'make oldconfig' (I answered n), there it is,
sitting at -19 according to htop.

The .config says otherwise:
[EMAIL PROTECTED] linux-2.6.21-rc7-CFS-v5]# grep RENICE .config
# CONFIG_RENICE_X is not set

So v5 reniced X in spite of the 'no' setting.

Although I hadn't noticed it, one way or the other, I just set it (X)
back to the default -1 so that I'm comparing the same apples when I do
compare.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Fortune finishes the great quotations, #2
If at first you don't succeed, think how many people you've made happy.
Re: [patch] CFS scheduler, -v5
On Mon, Apr 23, 2007 at 03:12:29AM +0200, Ingo Molnar wrote:
>
> i'm pleased to announce release -v5 of the CFS scheduler patchset. The
> patch against v2.6.21-rc7 and v2.6.20.7 can be downloaded from:
>
>	http://redhat.com/~mingo/cfs-scheduler/
>
> this CFS release mainly fixes regressions and improves interactivity:
>
>	13 files changed, 211 insertions(+), 199 deletions(-)
>
> the biggest user-visible change in -v5 are various interactivity
> improvements (especially under higher load) to fix reported
> regressions, and an improved way of handling nice levels. There's also
> a new sys_sched_yield_to() syscall implementation for i686 and x86_64.
>
> All known regressions have been fixed. (knock on wood)

I think the granularity is still much too low. Why not increase it to
something more reasonable as a default?
Re: [patch] CFS scheduler, -v5
On Mon, Apr 23, 2007 at 03:12:29AM +0200, Ingo Molnar wrote: i'm pleased to announce release -v5 of the CFS scheduler patchset. The patch against v2.6.21-rc7 and v2.6.20.7 can be downloaded from: http://redhat.com/~mingo/cfs-scheduler/ this CFS release mainly fixes regressions and improves interactivity: 13 files changed, 211 insertions(+), 199 deletions(-) the biggest user-visible change in -v5 are various interactivity improvements (especially under higher load) to fix reported regressions, and an improved way of handling nice levels. There's also a new sys_sched_yield_to() syscall implementation for i686 and x86_64. All known regressions have been fixed. (knock on wood) I think the granularity is still much too low. Why not increase it to something more reasonable as a default? - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch] CFS scheduler, -v5
On Sunday 22 April 2007, Nick Piggin wrote: On Mon, Apr 23, 2007 at 03:12:29AM +0200, Ingo Molnar wrote: i'm pleased to announce release -v5 of the CFS scheduler patchset. The patch against v2.6.21-rc7 and v2.6.20.7 can be downloaded from: http://redhat.com/~mingo/cfs-scheduler/ this CFS release mainly fixes regressions and improves interactivity: 13 files changed, 211 insertions(+), 199 deletions(-) the biggest user-visible change in -v5 are various interactivity improvements (especially under higher load) to fix reported regressions, and an improved way of handling nice levels. There's also a new sys_sched_yield_to() syscall implementation for i686 and x86_64. All known regressions have been fixed. (knock on wood) I think the granularity is still much too low. Why not increase it to something more reasonable as a default? I haven't approached that yet, but I just noticed, having been booted to this for all of 5 minutes, that although I told it not to renice x when my script ran 'make oldconfig', and I answered n, but there it is, sitting at -19 according to htop. The .config says otherwise: [EMAIL PROTECTED] linux-2.6.21-rc7-CFS-v5]# grep RENICE .config # CONFIG_RENICE_X is not set So v5 reniced X in spite of the 'no' setting. Although I hadn't noticed it, one way or the other, I just set it (X) back to the default -1 so that I'm comparing the same apples when I do compare. -- Cheers, Gene There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order. -Ed Howdershelt (Author) Fortune finishes the great quotations, #2 If at first you don't succeed, think how many people you've made happy. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch] CFS scheduler, -v5
* Nick Piggin [EMAIL PROTECTED] wrote: the biggest user-visible change in -v5 are various interactivity improvements (especially under higher load) to fix reported regressions, and an improved way of handling nice levels. There's also a new sys_sched_yield_to() syscall implementation for i686 and x86_64. All known regressions have been fixed. (knock on wood) I think the granularity is still much too low. Why not increase it to something more reasonable as a default? note that CFS's granularity value is not directly comparable to timeslice length: [ Note: while CFS's default preemption granularity is currently set to 5 msecs, this value does not directly transform into timeslices: for example two CPU-intense tasks will have effective timeslices of 10 msecs with this setting. ] also, i just checked SD: 0.46 defaults to 8 msecs rr_interval (on 1 CPU systems), which is lower than the 10 msecs effective timeslice length CVS-v5 achieves on two CPU-bound tasks. (in -v6 i'll scale the granularity up a bit with the number of CPUs, like SD does. That should get the right result on larger SMP boxes too.) while i agree it's a tad too finegrained still, I agree with Con's choice: rather err on the side of being too finegrained and lose some small amount of throughput on cache-intense workloads like compile jobs, than err on the side of being visibly too choppy for users on the desktop. Ingo - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch] CFS scheduler, -v5
* Gene Heskett [EMAIL PROTECTED] wrote: I haven't approached that yet, but I just noticed, having been booted to this for all of 5 minutes, that although I told it not to renice x when my script ran 'make oldconfig', and I answered n, but there it is, sitting at -19 according to htop. The .config says otherwise: [EMAIL PROTECTED] linux-2.6.21-rc7-CFS-v5]# grep RENICE .config # CONFIG_RENICE_X is not set So v5 reniced X in spite of the 'no' setting. Hmm, apparently your X uses ioperm() while mine uses iopl(), and i only turned off the renicing for iopl. (I fixed this in my tree and it will show up in -v6.) Although I hadn't noticed it, one way or the other, I just set it (X) back to the default -1 so that I'm comparing the same apples when I do compare. note that CFS handles negative nice levels differently from other schedulers, so the disadvantages of agressively reniced X (lost throughput due to overscheduling, worse interactivity) do _not_ apply to CFS. I think the 'fair' setting would be whatever the scheduler writer recommends: for SD, X probably performs better at around nice 0 (i'll let Con correct me if his experience is different). On CFS, nice -10 is perfectly fine too, and you'll have a zippier desktop under higher loads. (on servers this might be unnecessary/disadvantegous so there this can be turned off.) (also, in my tree i've changed the default from -19 to -10 to make it less scary to people and to leave more levels to the sysadmin, this change too will show up in -v6.) Ingo - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch] CFS scheduler, -v5 (build problem - make headers_check fails)
Ingo Molnar wrote: i'm pleased to announce release -v5 of the CFS scheduler patchset. The patch against v2.6.21-rc7 and v2.6.20.7 can be downloaded from: FYI, make headers_check seems to fail on this: [EMAIL PROTECTED] linux-2.6]$ make headers_check [snip] CHECK include/linux/usb/cdc.h CHECK include/linux/usb/audio.h make[2]: *** No rule to make target `/src/linux-2.6/usr/include/linux/.check.sched.h', needed by `__headerscheck'. Stop. make[1]: *** [linux] Error 2 make: *** [headers_check] Error 2 [EMAIL PROTECTED] linux-2.6]$ This also fails if I have CONFIG_HEADERS_CHECK=y in my .config unset CONFIG_HEADERS_CHECK and it builds just fine. -Zach - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch] CFS scheduler, -v5
On Mon, Apr 23, 2007 at 04:55:53AM +0200, Ingo Molnar wrote: * Nick Piggin [EMAIL PROTECTED] wrote: the biggest user-visible change in -v5 are various interactivity improvements (especially under higher load) to fix reported regressions, and an improved way of handling nice levels. There's also a new sys_sched_yield_to() syscall implementation for i686 and x86_64. All known regressions have been fixed. (knock on wood) I think the granularity is still much too low. Why not increase it to something more reasonable as a default? note that CFS's granularity value is not directly comparable to timeslice length: Right, but it does introduce the kbuild regression, and as we discussed, this will be only worse on newer CPUs with bigger caches or less naturally context switchy workloads. [ Note: while CFS's default preemption granularity is currently set to 5 msecs, this value does not directly transform into timeslices: for example two CPU-intense tasks will have effective timeslices of 10 msecs with this setting. ] also, i just checked SD: 0.46 defaults to 8 msecs rr_interval (on 1 CPU systems), which is lower than the 10 msecs effective timeslice length CVS-v5 achieves on two CPU-bound tasks. This is about an order of magnitude more than the current scheduler, so I still think it is too small. (in -v6 i'll scale the granularity up a bit with the number of CPUs, like SD does. That should get the right result on larger SMP boxes too.) I don't really like the scaling with SMP thing. The cache effects are still going to be significant on small systems, and there are lots of non-desktop users of those (eg. clusters). while i agree it's a tad too finegrained still, I agree with Con's choice: rather err on the side of being too finegrained and lose some small amount of throughput on cache-intense workloads like compile jobs, than err on the side of being visibly too choppy for users on the desktop. So cfs gets too choppy if you make the effective timeslice comparable to mainline? My approach is completely the opposite. For testing, I prefer to make the timeslice as large as possible so any problems or regressions are really noticable and will be reported; it can be scaled back to be smaller once those kinks are ironed out. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch] CFS scheduler, -v5
* Nick Piggin <[EMAIL PROTECTED]> wrote:

> > note that CFS's granularity value is not directly comparable to 
> > timeslice length:
>
> Right, but it does introduce the kbuild regression, [...]

Note that i increased the granularity from 1 msec to 5 msecs after your 
kbuild report - could you perhaps retest kbuild with the default 
settings of -v5?

> [...] and as we discussed, this will only be worse on newer CPUs with 
> bigger caches or less naturally context-switchy workloads.

yeah - but they'll all be quad core, so the SMP timeslice multiplicator 
should do the trick. Most of the CFS testers use single-CPU systems.

> > (in -v6 i'll scale the granularity up a bit with the number of 
> > CPUs, like SD does. That should get the right result on larger SMP 
> > boxes too.)
>
> I don't really like the scaling-with-SMP thing. The cache effects are 
> still going to be significant on small systems, and there are lots of 
> non-desktop users of those (eg. clusters).

CFS-using clusters will want to tune the granularity up drastically 
anyway, to 1 second or more, to maximize throughput. I think a small 
default with a scale-up-on-SMP rule is pretty sane. We'll gather some 
more kbuild data and see what happens, ok?

> > while i agree it's a tad too finegrained still, I agree with Con's 
> > choice: rather err on the side of being too finegrained and lose 
> > some small amount of throughput on cache-intense workloads like 
> > compile jobs, than err on the side of being visibly too choppy for 
> > users on the desktop.
>
> So cfs gets too choppy if you make the effective timeslice comparable 
> to mainline?

it doesn't in any test i do, but again, i'm erring on the side of it 
being more interactive.

	Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
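(The scale-up-on-SMP rule itself is never spelled out in this thread - 
only that -v6 will scale the granularity "up a bit" with the number of 
CPUs, as SD does. The multiplier below is therefore purely a 
hypothetical illustration of such a rule, not the -v6 formula:)

#include <stdio.h>

/* hypothetical rule: add one granularity unit per doubling of CPUs,
 * i.e. 1x on UP, 2x on 2-way, 3x on 4-way, 4x on 8-way */
static unsigned int scaled_granularity_ms(unsigned int base_ms,
					  unsigned int nr_cpus)
{
	unsigned int mult = 1;

	while (nr_cpus > 1) {
		nr_cpus >>= 1;
		mult++;
	}
	return base_ms * mult;
}

int main(void)
{
	unsigned int cpus;

	for (cpus = 1; cpus <= 8; cpus <<= 1)
		printf("%u CPU(s) -> %u msecs granularity\n",
		       cpus, scaled_granularity_ms(5, cpus));
	return 0;
}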
Re: [patch] CFS scheduler, -v5
On Mon, Apr 23, 2007 at 05:43:10AM +0200, Ingo Molnar wrote:
> * Nick Piggin <[EMAIL PROTECTED]> wrote:
>
> > > note that CFS's granularity value is not directly comparable to 
> > > timeslice length:
> >
> > Right, but it does introduce the kbuild regression, [...]
>
> Note that i increased the granularity from 1 msec to 5 msecs after 
> your kbuild report - could you perhaps retest kbuild with the default 
> settings of -v5?

I'm looking at mysql again today, but I will try eventually. It was 
just a simple kbuild.

> > [...] and as we discussed, this will only be worse on newer CPUs 
> > with bigger caches or less naturally context-switchy workloads.
>
> yeah - but they'll all be quad core, so the SMP timeslice 
> multiplicator should do the trick. Most of the CFS testers use 
> single-CPU systems.

But desktop users could have quad-thread and even 8-thread CPUs soon, 
so if the number doesn't work for both then you're in trouble. It just 
smells like a hack to scale with CPU numbers.

> > > (in -v6 i'll scale the granularity up a bit with the number of 
> > > CPUs, like SD does. That should get the right result on larger 
> > > SMP boxes too.)
> >
> > I don't really like the scaling-with-SMP thing. The cache effects 
> > are still going to be significant on small systems, and there are 
> > lots of non-desktop users of those (eg. clusters).
>
> CFS-using clusters will want to tune the granularity up drastically 
> anyway, to 1 second or more, to maximize throughput. I think a small 
> default with a scale-up-on-SMP rule is pretty sane. We'll gather some 
> more kbuild data and see what happens, ok?
>
> > > while i agree it's a tad too finegrained still, I agree with 
> > > Con's choice: rather err on the side of being too finegrained and 
> > > lose some small amount of throughput on cache-intense workloads 
> > > like compile jobs, than err on the side of being visibly too 
> > > choppy for users on the desktop.
> >
> > So cfs gets too choppy if you make the effective timeslice 
> > comparable to mainline?
>
> it doesn't in any test i do, but again, i'm erring on the side of it 
> being more interactive.

I'd start by erring on the side of trying to ensure no obvious 
performance regressions like this, because that's the easy part. 
Suppose everybody finds your scheduler wonderfully interactive, but you 
can't make it so with a larger timeslice?

For _real_ desktop systems, sure, erring on the side of being more 
interactive is fine. For RFC patches for testing, I really think you 
could be taking advantage of the fact that people will give you 
feedback on the issue.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Re: [patch] CFS scheduler, -v5
On Mon, Apr 23, 2007 at 07:16:59AM +0200, Markus Trippelsdorf wrote:
> On Mon, Apr 23, 2007 at 03:12:29AM +0200, Ingo Molnar wrote:
> >
> > i'm pleased to announce release -v5 of the CFS scheduler patchset. 
> > The patch against v2.6.21-rc7 and v2.6.20.7 can be downloaded from:
> ...
> > - feature: add initial sys_sched_yield_to() implementation. Not 
> >   hooked into the futex code yet, but testers are encouraged to 
> >   give the syscalls a try, on i686 the new syscall is 
> >   __NR_yield_to==320, on x86_64 it's __NR_yield_to==280. The 
> >   prototype is sys_sched_yield_to(pid_t), as suggested by Ulrich 
> >   Drepper.
>
> The new version does not link here (amd64, smp):
>
>   LD      .tmp_vmlinux1
> arch/x86_64/kernel/built-in.o:(.rodata+0x1dd8): undefined reference 
> to `sys_yield_to'

Changing sys_yield_to to sys_sched_yield_to in 
include/asm-x86_64/unistd.h fixes the problem.
--
Markus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Re: [patch] CFS scheduler, -v5
On Mon, Apr 23, 2007 at 03:12:29AM +0200, Ingo Molnar wrote:
>
> i'm pleased to announce release -v5 of the CFS scheduler patchset. 
> The patch against v2.6.21-rc7 and v2.6.20.7 can be downloaded from:
...
> - feature: add initial sys_sched_yield_to() implementation. Not 
>   hooked into the futex code yet, but testers are encouraged to give 
>   the syscalls a try, on i686 the new syscall is __NR_yield_to==320, 
>   on x86_64 it's __NR_yield_to==280. The prototype is 
>   sys_sched_yield_to(pid_t), as suggested by Ulrich Drepper.

The new version does not link here (amd64, smp):

  LD      .tmp_vmlinux1
arch/x86_64/kernel/built-in.o:(.rodata+0x1dd8): undefined reference to 
`sys_yield_to'
--
Markus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
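(Once the link error above is fixed, the new syscall can be poked at 
from userspace. Here is a minimal test probe, using only the syscall 
numbers and the sys_sched_yield_to(pid_t) prototype quoted in the 
announcement above; on an unpatched kernel it simply fails with 
ENOSYS:)

#include <stdio.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* syscall numbers from the -v5 announcement quoted above */
#ifdef __x86_64__
#define __NR_yield_to 280
#else
#define __NR_yield_to 320	/* i686 */
#endif

int main(int argc, char **argv)
{
	/* yield to the given pid, or harmlessly to ourselves by default */
	pid_t target = (argc > 1) ? (pid_t)atoi(argv[1]) : getpid();
	long ret = syscall(__NR_yield_to, target);

	if (ret < 0) {
		perror("sched_yield_to");
		return 1;
	}
	printf("yielded to pid %d\n", (int)target);
	return 0;
}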