Re: [patch] CFS scheduler, v3

Peter Williams Sat, 21 Apr 2007 05:24:22 -0700

Peter Williams wrote:

Ingo Molnar wrote:
* Peter Williams <[EMAIL PROTECTED]> wrote:
I retract this suggestion as it's a very bad idea. It introduces the possibility of starvation via the poor sods at the bottom of the queue having their "on CPU" forever postponed and we all know that even the smallest possibility of starvation will eventually cause problems.
I think there should be a rule: Once a task is on the queue its "on CPU" time is immutable.
Yeah, fully agreed. Currently i'm using the simple method of p->nice_offset, which plainly just moves the per nice level areas of the tree far enough apart (by a constant offset) so that lower nice levels rarely interact with higher nice levels. Lower nice levels never truly starve because rq->fair_clock increases deterministically and currently the fair_key values are indeed 'immutable' as you suggest.
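A minimal sketch of the keying scheme described above, not the patch's actual code. It assumes the key is the fair clock at enqueue time minus the task's wait_runtime, shifted by a constant per-nice-level offset. The struct layouts, the NICE_OFFSET_GRANULARITY constant and the helper names are hypothetical; only fair_clock, fair_key, wait_runtime and nice_offset come from the discussion.

```c
#include <stdint.h>

/* Hypothetical spacing between adjacent nice levels, in fair-clock units. */
#define NICE_OFFSET_GRANULARITY 1000000LL

struct rq_sketch {
    int64_t fair_clock;   /* assumed to only ever move forward */
};

struct task_sketch {
    int nice;             /* -20 .. 19 */
    int64_t wait_runtime; /* CPU time the task is still owed */
    int64_t nice_offset;  /* constant shift derived from nice */
    int64_t fair_key;     /* position in the tree, fixed at enqueue */
};

/* Assumed mapping: each nice level is shifted by a constant amount. */
static int64_t nice_to_offset(int nice)
{
    return (int64_t)nice * NICE_OFFSET_GRANULARITY;
}

/* Compute the key once at enqueue; it is not modified while queued. */
static void enqueue_key(const struct rq_sketch *rq, struct task_sketch *p)
{
    p->nice_offset = nice_to_offset(p->nice);
    p->fair_key = rq->fair_clock - p->wait_runtime + p->nice_offset;
}
```

Under these assumptions, a queued nice 19 task keeps its key while fair_clock keeps growing, so tasks enqueued later at nice 0 eventually receive larger keys than it has. That is the sense in which the scheme would be starvation-free in theory.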
In practice they can starve a bit when one renices thousands of tasks, so i was thinking about the following special-case: to at least make them easily killable: if a nice 0 task sends a SIGKILL to a nice 19 task then we could 'share' its p->wait_runtime with that nice 19 task and copy the signal sender's nice_offset. This would in essence pass the right to execute over to the killed task, so that it can tear itself down.
This cannot be used to gain an 'unfair advantage' because the signal sender spends its own 'right to execute on the CPU', and because the target task cannot execute any user code anymore when it gets a SIGKILL.
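The SIGKILL special case could be sketched roughly as follows. This is only an illustration under assumptions: the mail says the sender's wait_runtime is 'shared' without saying how, so the sketch assumes the sender hands over all of its positive wait_runtime. The struct and the function name donate_on_sigkill are hypothetical.

```c
#include <stdint.h>

struct task_sketch {
    int64_t wait_runtime; /* CPU time the task is still owed */
    int64_t nice_offset;  /* constant shift derived from nice */
};

/*
 * Assumed policy: the sender gives up all of its positive wait_runtime.
 * The exact split is a guess; the mail only says the value is 'shared'.
 */
static void donate_on_sigkill(struct task_sketch *sender,
                              struct task_sketch *target)
{
    if (sender->wait_runtime > 0) {
        target->wait_runtime += sender->wait_runtime;
        sender->wait_runtime = 0;
    }
    /* Key the dying task as if it ran at the sender's nice level. */
    target->nice_offset = sender->nice_offset;
}
```

The total wait_runtime across both tasks is unchanged by the transfer, which matches the argument that the sender only spends its own right to execute.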
In any case, it is clear that rq->raw_cpu_load should be used instead of rq->nr_running, when calculating the fair clock, but i begin to like the nice_offset solution too in addition to this: it's effective in practice and starvation-free in theory, and most importantly, it's very simple. We could even make the nice offset granularity tunable, just in case anyone wants to weaken (or strengthen) the effectivity of nice levels. What do you think, can you see any obvious (or less obvious) showstoppers with this approach?
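To illustrate the difference between driving the fair clock from rq->nr_running and from rq->raw_cpu_load, here is a rough sketch. The formulas are assumptions, not taken from the patch: the first divides elapsed time by the task count, the second scales it by a weighted load. NICE_0_LOAD_SKETCH and both function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical load contribution of a single nice 0 task. */
#define NICE_0_LOAD_SKETCH 1024

/* Fair clock step when every runnable task counts equally. */
static int64_t fair_delta_by_count(int64_t delta_exec, unsigned int nr_running)
{
    if (nr_running == 0)
        return delta_exec;
    return delta_exec / (int64_t)nr_running;
}

/* Fair clock step when tasks count in proportion to their load weight. */
static int64_t fair_delta_by_load(int64_t delta_exec, uint64_t raw_cpu_load)
{
    if (raw_cpu_load == 0)
        return delta_exec;
    return delta_exec * NICE_0_LOAD_SKETCH / (int64_t)raw_cpu_load;
}
```

With only nice 0 tasks the two steps agree. When a lightly weighted task joins the queue, the load-based step shrinks far less than the count-based one, so the fair clock is not slowed as if a full-weight task had arrived.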
I haven't had a close look at it but from the above description it sounds an order of magnitude more complex than I thought it would be. The idea of different nice levels sounds like a recipe for starvation to me (if it works the way it sounds like it works).
I guess I'll have to spend more time reading the code because I don't seem to be able to make sense of the above description in any way that doesn't say "starvation here we come".

I'm finding it hard to figure out what the underlying principle is for the way you're queuing things by reading the code (that's the trouble with reading the code: it just tells you what's being done, not why -- and sometimes it's even hard to figure out what's being done when there's a lot of indirection). sched-design-CFS.txt isn't much help in this area either.


Any chance of a brief description of how it's supposed to work?

Key questions are:

How do you decide the key value for a task's position in the queue?

Is it an absolute time or an offset from the current time?

How do you decide when to boot the current task off the queue? Both at wake up of another task and in general play?


Peter
PS I think that you're trying to do too much in one patch.
--
Peter Williams                                   [EMAIL PROTECTED]

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce
