Hi,
the nice levels are too close to each other in the standard kernel.
When you run e.g. some CPU-consuming tasks at +19 (the lowest
priority), you can still feel the performance degradation in processes
at 0 (the default) priority.

A simple and working solution is to apply
http://www.surriel.com/patches/2.4/2.4.3ac4-largenice
(it still applies to the current 2.4), which makes nice levels
actually mean something without changing the scheduler
policy/algorithms, and keeps low-priority tasks from eating so much
CPU time when there are higher-priority tasks to run.

I have not seen a single problem with it.

I'm attaching the LKML posts about it.

Have a nice day

P.S. A better solution could be to incorporate Mingo's new O(1)
scheduler (which is in 2.5 now), but that is a bigger, newer change,
and I understand that the MDK kernel maintainers could fear for its
stability. Though I'm really happy with this scheduler myself, it's
not ready for production use yet: it's still under heavy development,
but stabilizing rapidly.

-- 
         Martin Mačok                 http://underground.cz/
   [EMAIL PROTECTED]        http://Xtrmntr.org/ORBman/

Date:   Mon, 09 Apr 2001 20:37:10 -0700
From: george anzinger <[EMAIL PROTECTED]>
To: SodaPop <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED]
Subject: Re: [QUESTION] 2.4.x nice level

SodaPop wrote:
> 
> I too have noticed that nicing processes does not work nearly as
> effectively as I'd like it to.  I run on an underpowered machine,
> and have had to stop running things such as seti because it steals too
> much cpu time, even when maximally niced.
> 
> As an example, I can run mpg123 and a kernel build concurrently without
> trouble; but if I add a single maximally niced seti process, mpg123 runs
> out of gas and will start to skip while decoding.
> 
> Is there any way we can make nice levels stronger than they currently are
> in 2.4?  Or is this perhaps a timeslice problem, where once seti gets cpu
> time it runs longer than it should since it makes relatively few system
> calls?
> 
In kernel/sched.c, for HZ < 200, the nice-to-tick adjustment is set up
as nice>>2 (i.e. nice/4).  This gives the ratio of nice to time slice.
An adjustment is made so that the MOST niced task gets 1 jiffy, so
using this scale, and remembering that nice ranges from -20 to 19, the
LEAST niced task gets 40/4 or 10 ticks.  This implies that if only two
tasks are running, one niced to the maximum and the other to the
minimum, one will get 1/11 of the processor and the other 10/11 (about
10% and 90%).  If one is maximally niced and the other is not niced at
all, the time slices are 1 and 5, i.e. 1/6 and 5/6 (17% and 83%).
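
For concreteness, here is a small user-space sketch of that
arithmetic (an approximation of 2.4's NICE_TO_TICKS() scaling for
HZ < 200, not a verbatim copy of the macro).  The "+ 1" is the
adjustment that guarantees the most niced task one jiffy; because the
sketch applies it to every task, it prints 8.3%/91.7% and 14.3%/85.7%,
a whisker off the rounded figures above.

    #include <stdio.h>

    /* Approximation of 2.4's nice-to-slice scaling for HZ < 200: the
     * translated nice (20 - nice) is divided by 4, plus 1 tick so
     * that even nice +19 gets a slice. */
    static int nice_to_ticks(int nice)
    {
        return ((20 - nice) >> 2) + 1;
    }

    int main(void)
    {
        int most  = nice_to_ticks(19);   /* most niced:   1 tick  */
        int least = nice_to_ticks(-20);  /* least niced: 11 ticks */
        int norm  = nice_to_ticks(0);    /* normal:       6 ticks */

        /* Two CPU-bound tasks share the CPU in proportion to their
         * time slices. */
        printf("+19 vs -20: %4.1f%% / %4.1f%%\n",
               100.0 * most / (most + least),
               100.0 * least / (most + least));
        printf("+19 vs   0: %4.1f%% / %4.1f%%\n",
               100.0 * most / (most + norm),
               100.0 * norm / (most + norm));
        return 0;
    }

The same arithmetic with 2.2's one-to-one mapping (slices of 1, 20 and
40 ticks) reproduces the 2.5%/97.5% and 4.7%/95.3% figures below.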

In 2.2.x the full range of nice was mapped one-to-one onto time
slices, giving 1 and 39 (or 40) ticks, i.e. 2.5% and 97.5%, for the
most niced task against the least niced.  For the most niced task
against a normal one you would get 1 and 20 ticks, or 4.7% and 95.3%.

The comments say the objective is to come up with a time slice of 50
ms, presumably for the normal nice value of zero.  After translating
the range this is a value of 20 and, yep, 20/4 gives 5 jiffies or 50
ms at HZ=100.  It sure puts a crimp in the min-to-max range, however.

George

Date:   Wed, 11 Apr 2001 07:34:59 -0300 (BRST)
From: Rik van Riel <[EMAIL PROTECTED]>
To: george anzinger <[EMAIL PROTECTED]>
Cc: SodaPop <[EMAIL PROTECTED]>, [EMAIL PROTECTED],
        [EMAIL PROTECTED]
Subject: [test-PATCH] Re: [QUESTION] 2.4.x nice level

On Tue, 10 Apr 2001, Rik van Riel wrote:

> I'll try to come up with a recalculation change that will make
> this thing behave better, while still retaining the short time
> slices for multiple normal-priority tasks and the cache footprint
> schedule() and friends currently have...

OK, here it is. It's nothing like MontaVista's singing-and-dancing
scheduler patch that does everything, just a really minimal change
that should stretch the nice levels to yield the following CPU usage:

Nice    0    5   10   15   19
%CPU  100   56   25    6    1

Note that the patch doesn't change the actual scheduling code,
just the recalculation. Care has also been taken not to increase
the cache footprint of the scheduling and recalculation code.

I'd love to hear some test results from people who are interested
in wider nice levels. How does this run on your system?  Can you
trigger bad behaviour in any way?

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

                http://www.surriel.com/
http://www.conectiva.com/       http://distro.conectiva.com.br/



--- linux-2.4.3-ac4/kernel/sched.c.orig Tue Apr 10 21:04:06 2001
+++ linux-2.4.3-ac4/kernel/sched.c      Wed Apr 11 06:18:46 2001
@@ -686,8 +686,26 @@
                struct task_struct *p;
                spin_unlock_irq(&runqueue_lock);
                read_lock(&tasklist_lock);
-               for_each_task(p)
+               for_each_task(p) {
+                   if (p->nice <= 0) {
+                       /* The normal case... */
                        p->counter = (p->counter >> 1) + NICE_TO_TICKS(p->nice);
+                   } else {
+                       /*
+                        * Niced tasks get less CPU less often, leading to
+                        * the following distribution of CPU time:
+                        *
+                        * Nice    0    5   10   15   19
+                        * %CPU  100   56   25    6    1        
+                        */
+                       short prio = 20 - p->nice;
+                       p->nice_calc += prio;
+                       if (p->nice_calc >= 20) {
+                           p->nice_calc -= 20;
+                           p->counter = (p->counter >> 1) + NICE_TO_TICKS(p->nice);
+                       }
+                   }
+               }
                read_unlock(&tasklist_lock);
                spin_lock_irq(&runqueue_lock);
        }
--- linux-2.4.3-ac4/include/linux/sched.h.orig  Tue Apr 10 21:04:13 2001
+++ linux-2.4.3-ac4/include/linux/sched.h       Wed Apr 11 06:26:47 2001
@@ -303,7 +303,8 @@
  * the goodness() loop in schedule().
  */
        long counter;
-       long nice;
+       short nice_calc;
+       short nice;
        unsigned long policy;
        struct mm_struct *mm;
        int has_cpu, processor;
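
To see how the accumulator above skews the distribution, here is a
minimal user-space simulation of the patched recalculation loop (a
sketch, not Rik's code; it assumes every task is CPU-bound and fully
drains its counter between recalculations, so it lands near, though
not exactly on, the table above):

    #include <stdio.h>

    /* Approximation of 2.4's NICE_TO_TICKS() for HZ < 200. */
    #define NICE_TO_TICKS(nice) (((20 - (nice)) >> 2) + 1)

    struct task {
        int nice;
        short nice_calc;   /* the accumulator the patch adds */
        long ticks;        /* total CPU ticks granted        */
    };

    int main(void)
    {
        struct task tasks[] = {
            { 0, 0, 0 }, { 5, 0, 0 }, { 10, 0, 0 },
            { 15, 0, 0 }, { 19, 0, 0 },
        };
        int i, n = sizeof(tasks) / sizeof(tasks[0]);
        long round;

        for (round = 0; round < 100000; round++) {
            for (i = 0; i < n; i++) {
                struct task *p = &tasks[i];
                if (p->nice <= 0) {
                    /* The normal case: refilled every round. */
                    p->ticks += NICE_TO_TICKS(p->nice);
                } else {
                    /* Niced tasks are refilled only when the
                     * accumulator wraps past 20: nice +19 adds 1
                     * per round, so 1 refill in 20 rounds. */
                    short prio = 20 - p->nice;
                    p->nice_calc += prio;
                    if (p->nice_calc >= 20) {
                        p->nice_calc -= 20;
                        p->ticks += NICE_TO_TICKS(p->nice);
                    }
                }
            }
        }
        for (i = 0; i < n; i++)  /* %CPU relative to the nice 0 task */
            printf("nice %2d: %5.1f%%\n", tasks[i].nice,
                   100.0 * tasks[i].ticks / tasks[0].ticks);
        return 0;
    }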


Date:   Wed, 11 Apr 2001 12:53:16 -0300 (BRST)
From: Rik van Riel <[EMAIL PROTECTED]>
To: george anzinger <[EMAIL PROTECTED]>
Cc: SodaPop <[EMAIL PROTECTED]>, [EMAIL PROTECTED],
        [EMAIL PROTECTED]
Subject: Re: [test-PATCH] Re: [QUESTION] 2.4.x nice level

On Wed, 11 Apr 2001, Rik van Riel wrote:

> OK, here it is. It's nothing like montavista's singing-dancing
> scheduler patch that does all, just a really minimal change that
> should stretch the nice levels to yield the following CPU usage:
> 
> Nice    0    5   10   15   19
> %CPU  100   56   25    6    1

  PID USER     PRI  NI  SIZE SWAP  RSS SHARE STAT %CPU %MEM   TIME COMMAND
  980 riel      17   0   296    0  296   240 R    54.1  0.5  54:19 loop
 1005 riel      16   5   296    0  296   240 R N  27.0  0.5   0:34 loop
 1006 riel      17  10   296    0  296   240 R N  13.5  0.5   0:16 loop
 1007 riel      18  15   296    0  296   240 R N   4.5  0.5   0:05 loop
  987 riel      20  19   296    0  296   240 R N   0.4  0.5   0:25 loop

... is what I got when testing it here. It seems that nice levels
REALLY mean something with the patch applied ;)

You can get it at http://www.surriel.com/patches/2.4/2.4.3ac4-largenice

Since there seems to be quite a bit of demand for this feature,
please test it and try to make it break. If it doesn't break, we
can try to put it in the kernel...

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

                http://www.surriel.com/
http://www.conectiva.com/       http://distro.conectiva.com.br/



----- End forwarded message -----

-- 
         Martin Mačok                 http://underground.cz/
   [EMAIL PROTECTED]        http://Xtrmntr.org/ORBman/
