Re: [test-PATCH] Re: [QUESTION] 2.4.x nice level
Rik van Riel wrote:
>
> On Thu, 12 Apr 2001, Pavel Machek wrote:
> > > One rule of optimization is to move any code you can outside the loop.
> > > Why isn't the nice_to_ticks calculation done when nice is changed
> > > instead of EVERY recalc.? I guess another way to ask this is, who needs
> >
> > This way change is localized very nicely, and it is "obviously right".
>
> Except for two obvious things:
>
> 1. we need to load the nice level anyway
> 2. a shift takes less cycles than a load on most CPUs

Gosh, what am I missing here?  I think "top" and "ps" want to see the
"nice" value, so it needs to be available, and since the NICE_TO_TICKS()
function loses information (i.e. is not reversible) we cannot compute it
back from the ticks.

Still, yes, we need to load something, but does it have to be nice?  Why
not the result of NICE_TO_TICKS()?  A shift and a subtract are fast, yes,
but this loop runs over ALL tasks (not just the run list).  This loop can
put a real dent in preemption times, AND the notion of turning on
interrupts while it is done can run into some interesting race
conditions.  (This is why the MontaVista scheduler does the loop without
dropping the lock, AFTER optimizing the h... out of it.)

What am I missing?

George
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Re: [test-PATCH] Re: [QUESTION] 2.4.x nice level
On Thu, 12 Apr 2001, Pavel Machek wrote:
> > One rule of optimization is to move any code you can outside the loop.
> > Why isn't the nice_to_ticks calculation done when nice is changed
> > instead of EVERY recalc.? I guess another way to ask this is, who needs
>
> This way change is localized very nicely, and it is "obviously right".

Except for two obvious things:

1. we need to load the nice level anyway
2. a shift takes less cycles than a load on most CPUs

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/		http://distro.conectiva.com.br/
Re: [test-PATCH] Re: [QUESTION] 2.4.x nice level
Hi!

> One rule of optimization is to move any code you can outside the loop.
> Why isn't the nice_to_ticks calculation done when nice is changed
> instead of EVERY recalc.? I guess another way to ask this is, who needs

This way change is localized very nicely, and it is "obviously right".
--
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.
Re: [test-PATCH] Re: [QUESTION] 2.4.x nice level
On Wed, Apr 11, 2001 at 12:53:16PM -0300, Rik van Riel wrote:
> On Wed, 11 Apr 2001, Rik van Riel wrote:
>
> > OK, here it is. It's nothing like montavista's singing-dancing
> > scheduler patch that does all, just a really minimal change that
> > should stretch the nice levels to yield the following CPU usage:
> >
> > Nice    0    5   10   15   19
> > %CPU  100   56   25    6    1
>
>   PID USER  PRI  NI  SIZE SWAP  RSS SHARE STAT %CPU %MEM  TIME COMMAND
>   980 riel   17   0  2960  296  240       R    54.1  0.5 54:19 loop
>  1005 riel   16   5  2960  296  240       R N  27.0  0.5  0:34 loop
>  1006 riel   17  10  2960  296  240       R N  13.5  0.5  0:16 loop
>  1007 riel   18  15  2960  296  240       R N   4.5  0.5  0:05 loop
>   987 riel   20  19  2960  296  240       R N   0.4  0.5  0:25 loop

How does this scale to negative nice levels?  Afaik it should, in some
way.  (I don't mean that it's wrong in this state, I'm just asking.)

regards,
Balazs.
Re: [test-PATCH] Re: [QUESTION] 2.4.x nice level
One rule of optimization is to move any code you can outside the loop.
Why isn't the nice_to_ticks calculation done when nice is changed
instead of EVERY recalc.?  I guess another way to ask this is, who needs
to see the original nice?  Would it be worth another task_struct entry
to move this calculation out of the loop?

George

Rik van Riel wrote:
>
> On Tue, 10 Apr 2001, Rik van Riel wrote:
>
> > I'll try to come up with a recalculation change that will make
> > this thing behave better, while still retaining the short time
> > slices for multiple normal-priority tasks and the cache footprint
> > schedule() and friends currently have...
>
> OK, here it is. It's nothing like montavista's singing-dancing
> scheduler patch that does all, just a really minimal change that
> should stretch the nice levels to yield the following CPU usage:
>
> Nice    0    5   10   15   19
> %CPU  100   56   25    6    1
>
> Note that the code doesn't change the actual scheduling code,
> just the recalculation. Care has also been taken to not increase
> the cache footprint of the scheduling and recalculation code.
>
> I'd love to hear some test results from people who are interested
> in wider nice levels. How does this run on your system? Can you
> trigger bad behaviour in any way?
>
> regards,
>
> Rik
> --
> Virtual memory is like a game you can't win;
> However, without VM there's truly nothing to lose...
>
> http://www.surriel.com/
> http://www.conectiva.com/	http://distro.conectiva.com.br/
>
> --- linux-2.4.3-ac4/kernel/sched.c.orig	Tue Apr 10 21:04:06 2001
> +++ linux-2.4.3-ac4/kernel/sched.c	Wed Apr 11 06:18:46 2001
> @@ -686,8 +686,26 @@
>  		struct task_struct *p;
>  		spin_unlock_irq(&runqueue_lock);
>  		read_lock(&tasklist_lock);
> -		for_each_task(p)
> +		for_each_task(p) {
> +			if (p->nice <= 0) {
> +				/* The normal case... */
>  			p->counter = (p->counter >> 1) + NICE_TO_TICKS(p->nice);
> +			} else {
> +				/*
> +				 * Niced tasks get less CPU less often, leading to
> +				 * the following distribution of CPU time:
> +				 *
> +				 * Nice    0    5   10   15   19
> +				 * %CPU  100   56   25    6    1
> +				 */
> +				short prio = 20 - p->nice;
> +				p->nice_calc += prio;
> +				if (p->nice_calc >= 20) {
> +					p->nice_calc -= 20;
> +					p->counter = (p->counter >> 1) + NICE_TO_TICKS(p->nice);
> +				}
> +			}
> +		}
>  		read_unlock(&tasklist_lock);
>  		spin_lock_irq(&runqueue_lock);
>  	}
> --- linux-2.4.3-ac4/include/linux/sched.h.orig	Tue Apr 10 21:04:13 2001
> +++ linux-2.4.3-ac4/include/linux/sched.h	Wed Apr 11 06:26:47 2001
> @@ -303,7 +303,8 @@
>  	 * the goodness() loop in schedule().
>  	 */
>  	long counter;
> -	long nice;
> +	short nice_calc;
> +	short nice;
>  	unsigned long policy;
>  	struct mm_struct *mm;
>  	int has_cpu, processor;
Re: [test-PATCH] Re: [QUESTION] 2.4.x nice level
On Wed, 11 Apr 2001, Rik van Riel wrote:

> OK, here it is. It's nothing like montavista's singing-dancing
> scheduler patch that does all, just a really minimal change that
> should stretch the nice levels to yield the following CPU usage:
>
> Nice    0    5   10   15   19
> %CPU  100   56   25    6    1

  PID USER  PRI  NI  SIZE SWAP  RSS SHARE STAT %CPU %MEM  TIME COMMAND
  980 riel   17   0  2960  296  240       R    54.1  0.5 54:19 loop
 1005 riel   16   5  2960  296  240       R N  27.0  0.5  0:34 loop
 1006 riel   17  10  2960  296  240       R N  13.5  0.5  0:16 loop
 1007 riel   18  15  2960  296  240       R N   4.5  0.5  0:05 loop
  987 riel   20  19  2960  296  240       R N   0.4  0.5  0:25 loop

... is what I got when testing it here.  It seems that nice levels
REALLY mean something with the patch applied ;)

You can get it at http://www.surriel.com/patches/2.4/2.4.3ac4-largenice

Since there seems to be quite a bit of demand for this feature, please
test it and try to make it break.  If it doesn't break we can try to
put it in the kernel...

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/		http://distro.conectiva.com.br/
[test-PATCH] Re: [QUESTION] 2.4.x nice level
On Tue, 10 Apr 2001, Rik van Riel wrote:

> I'll try to come up with a recalculation change that will make
> this thing behave better, while still retaining the short time
> slices for multiple normal-priority tasks and the cache footprint
> schedule() and friends currently have...

OK, here it is.  It's nothing like montavista's singing-dancing
scheduler patch that does all, just a really minimal change that
should stretch the nice levels to yield the following CPU usage:

Nice    0    5   10   15   19
%CPU  100   56   25    6    1

Note that the code doesn't change the actual scheduling code,
just the recalculation.  Care has also been taken to not increase
the cache footprint of the scheduling and recalculation code.

I'd love to hear some test results from people who are interested
in wider nice levels.  How does this run on your system?  Can you
trigger bad behaviour in any way?

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/		http://distro.conectiva.com.br/

--- linux-2.4.3-ac4/kernel/sched.c.orig	Tue Apr 10 21:04:06 2001
+++ linux-2.4.3-ac4/kernel/sched.c	Wed Apr 11 06:18:46 2001
@@ -686,8 +686,26 @@
 		struct task_struct *p;
 		spin_unlock_irq(&runqueue_lock);
 		read_lock(&tasklist_lock);
-		for_each_task(p)
+		for_each_task(p) {
+			if (p->nice <= 0) {
+				/* The normal case... */
 			p->counter = (p->counter >> 1) + NICE_TO_TICKS(p->nice);
+			} else {
+				/*
+				 * Niced tasks get less CPU less often, leading to
+				 * the following distribution of CPU time:
+				 *
+				 * Nice    0    5   10   15   19
+				 * %CPU  100   56   25    6    1
+				 */
+				short prio = 20 - p->nice;
+				p->nice_calc += prio;
+				if (p->nice_calc >= 20) {
+					p->nice_calc -= 20;
+					p->counter = (p->counter >> 1) + NICE_TO_TICKS(p->nice);
+				}
+			}
+		}
 		read_unlock(&tasklist_lock);
 		spin_lock_irq(&runqueue_lock);
 	}
--- linux-2.4.3-ac4/include/linux/sched.h.orig	Tue Apr 10 21:04:13 2001
+++ linux-2.4.3-ac4/include/linux/sched.h	Wed Apr 11 06:26:47 2001
@@ -303,7 +303,8 @@
 	 * the goodness() loop in schedule().
 	 */
 	long counter;
-	long nice;
+	short nice_calc;
+	short nice;
 	unsigned long policy;
 	struct mm_struct *mm;
 	int has_cpu, processor;