On Fri, 25 Feb 2011, Bruce Evans wrote:

On Thu, 24 Feb 2011, John Baldwin wrote:

On Thursday, February 24, 2011 2:03:33 pm Remko Lodder wrote:

[context restored:
+A priority of 19 or 20 will prevent a process from taking any cycles from
+others at nice 0 or better.]

On Feb 24, 2011, at 7:47 PM, John Baldwin wrote:

Are you sure that this statement applies to both ULE and 4BSD?  The two
schedulers treat nice values a bit differently.

No, I am not sure that the statement applies; given your response I understand
that the two schedulers work differently.  Can you or David tell me what the
difference is so that I can document it properly?  I thought that the tool
does the same thing for all schedulers, but that the backend might treat it
differently.

I'm sure that testing would show that it doesn't apply in FreeBSD.  It is
supposed to apply only approximately in FreeBSD, but niceness handling in
FreeBSD is quite broken, so it doesn't apply at all.  Also, the magic numbers
of 19 and 20 probably don't apply in FreeBSD.  Those values arose because
nicenesses that are the same mod 2 (maybe after adding 1) have the same
effect: priorities that are the same mod RQ_PPQ = 4 have the same effect,
and the niceness space was scaled to the priority space by multiplying by
NICE_WEIGHT = 2.  But NICE_WEIGHT has been broken to be 1 in FreeBSD with
SCHED_4BSD and doesn't apply with SCHED_ULE.  With SCHED_4BSD, there are
therefore 4 (not 2) nice values near 20 that give the same behaviour.
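
As a minimal sketch of that scaling (RQ_PPQ and the NICE_WEIGHT values are
real kernel constants; the estcpu contribution, the PRIO_MIN offset, and the
function itself are simplifications of mine):

/*
 * Sketch only: which run queue a purely nice-driven user priority
 * lands in.  Priorities in the same queue behave identically.
 */
#include <stdio.h>

#define	RQ_PPQ			4	/* priorities per run queue */
#define	NICE_WEIGHT_44BSD	2	/* historical 4.4BSD scaling */
#define	NICE_WEIGHT_FREEBSD	1	/* current SCHED_4BSD scaling */

static int
runq_of(int nice, int weight)
{
	return (weight * nice / RQ_PPQ);
}

int
main(void)
{
	int nice;

	for (nice = 14; nice <= 20; nice++)
		printf("nice %2d: queue %2d (weight 2), queue %2d (weight 1)\n",
		    nice, runq_of(nice, NICE_WEIGHT_44BSD),
		    runq_of(nice, NICE_WEIGHT_FREEBSD));
	return (0);
}

With weight 2 the nicenesses collapse in pairs (18 and 19 share a queue but
20 does not), giving the 19-or-20 folklore; with weight 1, nice 16 through
19 all land in one queue, which is where the "4 (not 2)" above comes from.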

Strictly, it applies only to broken schedulers.  Preventing a process
from taking *any* cycles gives priority-inversion livelock (e.g., when the
niced process holds a resource that a nice-0 process is waiting on).
FreeBSD has priority propagation to prevent this.
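
A toy illustration of the propagation (this is nothing like the kernel's
turnstile code; every name in it is mine):

/*
 * Toy priority propagation: a high-priority waiter blocked on a lock
 * lends its priority to the lock's low-priority owner, so the owner
 * cannot be starved while the waiter depends on it.  Smaller numbers
 * are more urgent, as in the kernel.
 */
#include <stdio.h>

struct thread {
	const char	*name;
	int		 base_pri;	/* static/assigned priority */
	int		 lent_pri;	/* best priority lent by waiters */
};

static int
effective_pri(const struct thread *td)
{
	return (td->lent_pri < td->base_pri ? td->lent_pri : td->base_pri);
}

/* 'waiter' blocks on a lock owned by 'owner': lend it our priority. */
static void
propagate_priority(struct thread *owner, const struct thread *waiter)
{
	if (effective_pri(waiter) < owner->lent_pri)
		owner->lent_pri = effective_pri(waiter);
}

int
main(void)
{
	struct thread owner  = { "nice +20 owner", 140, 255 };
	struct thread waiter = { "nice 0 waiter",  120, 255 };

	printf("%s runs at %d\n", owner.name, effective_pri(&owner));
	propagate_priority(&owner, &waiter);	/* waiter blocks on the lock */
	printf("%s runs at %d\n", owner.name, effective_pri(&owner));
	return (0);
}

Since the owner runs at the waiter's priority until it releases the lock,
a scheduler that literally never ran nice 19/20 processes would only stall
the nice-0 waiter.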

Just tried it with SCHED_4BSD.  This was on a multi-CPU system (ref9-i386),
but I think I used cpuset correctly to emulate 1 CPU.

% last pid: 85392;  load averages:  1.71,  0.86,  0.38   up 94+01:00:36  21:55:59
% 66 processes:  3 running, 63 sleeping
% CPU:  6.9% user,  3.7% nice,  2.0% system,  0.0% interrupt, 87.3% idle
% Mem: 268M Active, 4969M Inact, 310M Wired, 50M Cache, 112M Buf, 2413M Free
% Swap: 8192M Total, 580K Used, 8191M Free
%
%   PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
% [... system is not nearly idle, but plenty of CPUs to spare]
% 85368 bde           1 111    0  9892K  1312K RUN     1   1:07 65.67% sh
% 85369 bde           1 123   20  9892K  1312K CPU1    1   0:35 37.89% sh

This shows the bogus 1:2 ratio (65.67% versus 37.89% WCPU) even for a
niceness difference of 20.  I've seen too much of this ratio.  IIRC, before
FreeBSD-4 was fixed, the various nonlinearities caused by not even clamping,
combined with the broken scaling, gave a ratio of about this; then FreeBSD-5
restored a similarly bogus ratio.  Apparently, the algorithm for decaying
p_estcpu in SCHED_4BSD tends to generate this ratio.  SCHED_ULE uses a
completely different algorithm and I think it has more control over the
scaling, so it is surprising that it duplicates this brokenness so perfectly.
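
A rough sketch of the decay behaviour (schedcpu() in the kernel uses fixed
point and real tick accounting; the load and tick numbers here are made up
to roughly match the test above):

/*
 * SCHED_4BSD decays p_estcpu once per second by (2*load)/(2*load + 1).
 * A thread earning t ticks/second therefore converges to about
 * (2*load + 1) * t, i.e. estcpu saturates at a level proportional to
 * the thread's actual CPU share, bounding how far apart competing
 * threads' priorities can drift.
 */
#include <stdio.h>

int
main(void)
{
	double load = 11.0;	/* roughly the load average in the test */
	double decay = (2.0 * load) / (2.0 * load + 1.0);
	double ticks = 10.0;	/* ticks/sec for one thread: made up */
	double estcpu = 0.0;
	int sec;

	for (sec = 0; sec < 120; sec++)
		estcpu = decay * estcpu + ticks;
	printf("steady-state estcpu ~= %.1f (limit %.1f)\n",
	    estcpu, (2.0 * load + 1.0) * ticks);
	return (0);
}

Because the saturation level tracks CPU share rather than niceness, big nice
differences end up producing only modest priority differences once the
system is loaded, which is consistent with the compressed ratios above.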

And here is what it does with more nice values: this was generated by:

% for i in 0 2 4 6 8 10 12 14 16 18 20
% do
%       cpuset -l 1 nice -$i sh -c "while :; do echo -n;done" &
% done
% top -o time

% last pid: 85649;  load averages: 10.99,  9.06,  5.35  up 94+01:19:33  22:14:56
% 74 processes:  12 running, 62 sleeping
%
% Mem: 270M Active, 4969M Inact, 310M Wired, 50M Cache, 112M Buf, 2411M Free
% Swap: 8192M Total, 580K Used, 8191M Free
%
%   PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
% 85581 bde           1  98    0  9892K  1312K RUN     1   0:48 11.47% sh
% 85582 bde           1 100    2  9892K  1312K RUN     1   0:45 10.69% sh
% 85583 bde           1 102    4  9892K  1312K RUN     1   0:42 10.35% sh
% 85584 bde           1 104    6  9892K  1312K CPU1    1   0:40  9.47% sh
% 85585 bde           1 106    8  9892K  1312K RUN     1   0:38  8.79% sh
% 85586 bde           1 108   10  9892K  1312K RUN     1   0:36  8.06% sh
% 85587 bde           1 110   12  9892K  1312K RUN     1   0:34  8.40% sh
% 85588 bde           1 111   14  9892K  1312K RUN     1   0:33  8.50% sh
% 85589 bde           1 113   16  9892K  1312K RUN     1   0:31  7.67% sh
% 85590 bde           1 115   18  9892K  1312K RUN     1   0:30  7.28% sh
% 85591 bde           1 117   20  9892K  1312K RUN     1   0:29  6.69% sh

This is OK except for the far-too-small dynamic range of 29:48 (0:29 of
accumulated time at nice 20 versus 0:48 at nice 0, i.e., only about 1:1.7,
even worse than 1:2).

My version spaces out things nicely according to its table:

% last pid:  1374;  load averages: 11.02,  8.74,  4.93    up 0+02:26:12  09:16:47
% 43 processes:  12 running, 31 sleeping
% CPU: 14.0% user, 85.7% nice,  0.0% system,  0.4% interrupt,  0.0% idle
% Mem: 35M Active, 23M Inact, 67M Wired, 24K Cache, 61M Buf, 876M Free
% Swap:
%
%   PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
%  1325 root        1 120    0   856K   572K RUN      2:18 28.52% sh
%  1326 root        1 120    2   856K   572K RUN      1:39 19.97% sh
%  1327 root        1 120    4   856K   572K RUN      1:10 13.96% sh
%  1328 root        1 120    6   856K   572K RUN      0:50  9.72% sh
%  1329 root        1 123    8   856K   572K RUN      0:36  7.18% sh
%  1330 root        1 123   10   856K   572K RUN      0:25  5.03% sh
%  1331 root        1 124   12   856K   572K RUN      0:18  2.93% sh
%  1332 root        1 124   14   856K   572K RUN      0:13  1.86% sh
%  1333 root        1 124   16   856K   572K RUN      0:09  0.98% sh
%  1334 root        1 124   18   856K   572K RUN      0:06  1.07% sh
%  1335 root        1 123   20   856K   572K RUN      0:05  0.15% sh

The dynamic range here is 5:138 (0:05 at nice 20 versus 2:18, i.e., 138
seconds, at nice 0, roughly 1:28).  Not as close to the table's 1:32 as
I would like.
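
As a back-of-the-envelope check of what that target means (the table itself
is not reproduced here; I am only deriving the per-step factor from the
1:32 endpoints):

/*
 * A 1:32 dynamic range across nice 0..20 in steps of 2 implies each
 * +2 of nice should scale the CPU share by 32^(-1/10) ~= 0.707.
 */
#include <math.h>
#include <stdio.h>

int
main(void)
{
	double step = pow(32.0, -1.0 / 10.0);
	double share = 1.0;
	int nice;

	for (nice = 0; nice <= 20; nice += 2) {
		printf("nice %2d: relative share %.4f\n", nice, share);
		share *= step;
	}
	return (0);
}

The measured 5:138 is about 1:27.6, so the tail (nice 16..20) is getting a
bit more CPU than the geometric spacing would give it.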

Bruce