On Thu, 1 Feb 2001, Cejka Rudolf wrote:
> Sheldon Hearn wrote (2001/02/01):
> > What I remember of the discussions that surrounded this one, your
> > summary is correct. The only thing is that nice isn't so much _broken_
> > as it just isn't doing what you'd expect it to. :-)
>
> Ok, scheduler in -current is not broken. But I'm afraid that in -stable
Nah, it is broken.
> it is - can a (niced) process cause a lockup of the machine? Currently, we
> have a dual-processor box with 4.2-STABLE and it silently locks up too often.
> With current scheduler in mind, it is hard to say if I should search in
> HW or SW for a potential fix...
I think only processes with niceness less than -5 can cause lockups.
This is because they get assigned priorities below some kernel priorities,
so priority inversion is possible. I think priority inversions between
processes running in user mode are impossible, because low-priority
processes (those with higher priority numbers) can't block in user mode.
> > I don't think any of the FreeBSD manual pages suggest that nice 20
> > processes aren't supposed to get _any_ CPU time.
>
> Maybe. But there are some conventions, and a two-process sensitivity of
> 2.5:1 is not very big (it is low). Solaris and Linux have a much bigger
> (sufficient) ratio. So why does FreeBSD have to be so different?
Just a bug. The relative weights of the priorities are mostly determined
accidentally by the following formula in resetpriority():
newpriority = PUSER + p->p_estcpu / INVERSE_ESTCPU_WEIGHT +
NICE_WEIGHT * (p->p_nice - PRIO_MIN);
For process mixes consisting only of long-running cpu hog processes, the
priorities tend to a common limit. This is fairly obvious: lower-priority
processes get run more, which increases their priority until it is no
longer lower, so the priorities of all the processes become similar.
Also, correct design of the algorithm that decays p_estcpu should ensure
that the common priority doesn't hunt.
For the simple case of 2 long-running cpu hog processes with nicenesses
n0 and n1, the priority for both tends towards a limit of approximately
104 (see ps output). This corresponds to the (p_estcpu / weight) term
for the process with the lowest nice value tending towards its maximum
value of (PRIO_MAX - PRIO_MIN - PPQ == 36) (the maximum is nearly reached
because of correct design of the decay algorithm, and actually reached
because of bugs). The p_estcpu's for the two processes can be calculated
from this:
104 = 48 + p->p_estcpu / 8 + 1 * (p->p_nice + 20)
p->p_estcpu = (36 - p->p_nice) * 8
In the limit, the runtimes are to a first approximation proportional to
p_estcpu (higher terms depend on the decay algorithm), so for processes
with niceness n0 and n1, the relative runtimes are approximately:
(36 - n0) : (36 - n1)
This formula is neither simple nor good. For n0 = 0 and n1 = 20, it gives
the ratio of 36:16, which is close to the 2.5:1 ratio observed by Cejka.
I observe a ratio closer to 3:1 for the runtimes, and the formula gives
almost the exact ratio for the p_estcpu's. The formula breaks down near
(n0 = -16, n1 = 20), since the p_estcpu term can never raise the priority
by more than 36, so the ratio is 1:0 for n0 < -16 and n1 = 20. This is
a bug.
> Insensitivity of nice was a problem in the past and it is coming back.
For RELENG_4 as it was before I imported the fixes from NetBSD, the
calculations are:
127 = 50 + p->p_estcpu / 4 + 2 * p->p_nice
p->p_estcpu = (38.5 - p->p_nice) * 8
ratio = (38.5 - n0) : (38.5 - n1)
ratio for nicenesses (0, 20) = 38.5:18.5
so -current is almost perfectly bug for bug compatible in this area with
RELENG_4! :-(
In RELENG_4, the corresponding calculations are:
86 = 50 + p->p_estcpu / 8 + 2 * p->p_nice
p->p_estcpu = (18 - p->p_nice) * 16
ratio = (18 - n0) : (18 - n1)
ratio for nicenesses (0, 20) = 18:-2 (formula invalid, actually 1:0)
ratio for nicenesses (0, 17) = 18:1
Quick fix:
---
Index: kern/kern_synch.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_synch.c,v
retrieving revision 1.124
diff -c -2 -r1.124 kern_synch.c
*** kern/kern_synch.c 2001/01/31 04:29:51 1.124
--- kern/kern_synch.c 2001/02/03 15:16:44
***************
*** 1039,1044 ****
mtx_enter(&sched_lock, MTX_SPIN);
if (p->p_rtprio.type == RTP_PRIO_NORMAL) {
! newpriority = PUSER + p->p_estcpu / INVERSE_ESTCPU_WEIGHT +
! NICE_WEIGHT * (p->p_nice - PRIO_MIN);
newpriority = min(newpriority, MAXPRI);
p->p_usrpri = newpriority;
--- 1039,1046 ----
mtx_enter(&sched_lock, MTX_SPIN);
if (p->p_rtprio.type == RTP_PRIO_NORMAL) {
! newpriority = PUSER +
! (PRIO_MAX - PRIO_MIN + 1) *
! (p->p_estcpu / INVERSE_ESTCPU_WEIGHT) /
! (PRIO_MAX - p->p_nice + 1);
newpriority = min(newpriority, MAXPRI);
p->p_usrpri = newpriority;
Index: sys/proc.h
===================================================================
RCS file: /home/ncvs/src/sys/sys/proc.h,v
retrieving revision 1.145
diff -c -2 -r1.145 proc.h
*** sys/proc.h 2001/01/31 04:29:52 1.145
--- sys/proc.h 2001/02/03 15:56:49
***************
*** 493,505 ****
* XXX macros for scheduler. Shouldn't be here, but currently needed for
* bounding the dubious p_estcpu inheritance in wait1().
* INVERSE_ESTCPU_WEIGHT is only suitable for statclock() frequencies in
* the range 100-256 Hz (approximately).
*/
#define ESTCPULIM(e) \
! min((e), INVERSE_ESTCPU_WEIGHT * (NICE_WEIGHT * (PRIO_MAX - PRIO_MIN) - \
! PPQ) + INVERSE_ESTCPU_WEIGHT - 1)
! #define INVERSE_ESTCPU_WEIGHT 8 /* 1 / (priorities per estcpu level).
*/
! #define NICE_WEIGHT 1 /* Priorities per nice level. */
! #define PPQ (128 / NQS) /* Priorities per queue. */
struct mtx;
--- 493,505 ----
* XXX macros for scheduler. Shouldn't be here, but currently needed for
* bounding the dubious p_estcpu inheritance in wait1().
+ * XXX PPQ no longer needed for that.
* INVERSE_ESTCPU_WEIGHT is only suitable for statclock() frequencies in
* the range 100-256 Hz (approximately).
*/
#define ESTCPULIM(e) \
! min((e), (INVERSE_ESTCPU_WEIGHT * (MAXPRI + 1 - PUSER) - 1) * \
! (PRIO_MAX + 1) / (PRIO_MAX - PRIO_MIN + 1))
! #define INVERSE_ESTCPU_WEIGHT 16 /* Resol. of estcpu for priorities. */
! #define PPQ ((MAXPRI + 1) / NQS) /* Priorities per queue. */
struct mtx;
---
This works by making the steady-state value of p_estcpu proportional to
(PRIO_MAX - p->p_nice + 1) = (21 - p->p_nice). This gives interestingly
different ramping up of the priorities: the initial priority is now
independent of p_nice, so niced processes aren't completely locked out
by short-lived unniced processes.
I have a better fix that uses a table of multipliers instead of the
divisor (PRIO_MAX - p->p_nice + 1) in the above. It is unfinished in
various ways. Correctly limiting p_estcpu and/or scaling is a
problem in all versions.
Bruce
To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message