Re: [patch] CFS scheduler, -v12

Peter Williams Mon, 21 May 2007 21:48:28 -0700

Peter Williams wrote:

Dmitry Adamushko wrote:
On 18/05/07, Peter Williams <[EMAIL PROTECTED]> wrote:
[...]
One thing that might work is to jitter the load balancing interval a
bit.  The reason I say this is that one of the characteristics of top
and gkrellm is that they run at a more or less constant interval (and,
in this case, X would also be following this pattern as it's doing
screen updates for top and gkrellm) and this means that it's possible
for the load balancing interval to synchronize with their intervals
which in turn causes the observed problem.
Hum.. I guess, a 0/4 scenario wouldn't fit well in this explanation..
No, and I haven't seen one.
all 4 spinners "tend" to be on CPU0 (and as I understand each gets
~25% approx.?), so there must be plenty of moments for
*idle_balance()* to be called on CPU1 - as gkrellm, top and X consume
together just a few % of CPU. Hence, we should not be that dependent
on the load balancing interval here..
The split that I see is 3/1 and neither CPU seems to be favoured with
respect to getting the majority. However, top, gkrellm and X seem to
always be on the CPU with the single spinner. The CPU% reported by top
is approx. 33%, 33%, 33% and 100% for the spinners.
If I renice the spinners to -10 (so that their load weights dominate
the run queue load calculations) the problem goes away: the spinner to
CPU allocation is 2/2 and top reports them all getting approx. 50% each.
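
The percentages top reports are what equal sharing among equal-weight
spinners on one CPU would give. A minimal Python sketch of that
arithmetic (the function name is made up for illustration, and the few
percent consumed by top, gkrellm and X are ignored):

```python
def spinner_cpu_percent(split):
    """Per-spinner CPU% on each CPU, given the number of equal-weight
    spinners on each CPU, e.g. (3, 1) for a 3/1 split.
    Simplification: the spinners are assumed to be the only load."""
    return [[round(100.0 / n) for _ in range(n)] if n else [] for n in split]

print(spinner_cpu_percent((3, 1)))  # 3/1 split seen at nice == 0
print(spinner_cpu_percent((2, 2)))  # 2/2 split seen at nice == -10
```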

For no good reason other than curiosity, I tried a variation of this
experiment where I reniced the spinners to 10 instead of -10 and, to my
surprise, they were allocated 2/2 to the CPUs on average. I say on
average because the allocations were a little more volatile and
occasionally 0/4 splits would occur but these would last for less than
one top cycle before the 2/2 was re-established. The quickness of these
recoveries would indicate that it was most likely the idle balance
mechanism that restored the balance.

This may point the finger at the tick based load balance mechanism being
too conservative when it decides whether tasks need to be moved. In
the case where the spinners are at nice == 0, the idle balance mechanism
never comes into play as the 0/4 split is never seen, so only the tick
based mechanism is in force and this is where the anomalies are seen.

The tick based mechanism is also the only one in play in the nice == -10
case, but there the high load weights of the spinners overcome its
conservatism, e.g. the difference in queue loads for a 1/3 split in this
case is equivalent to the difference that would be generated by an
imbalance of about 18 nice == 0 spinners, i.e. too big to be ignored.
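
As a rough check of the "about 18" figure, here is a small Python
sketch. The weights are an assumption: they are the values from the
mainline kernel's prio_to_weight table (nice -10 = 9548, nice 0 = 1024),
which may not be exactly what -v12 uses, and the function name is
hypothetical.

```python
# Assumed load weights (mainline prio_to_weight values; -v12 may differ).
NICE_TO_WEIGHT = {-10: 9548, 0: 1024}

def queue_load_difference(split, nice):
    """Difference in run queue load between two CPUs when `split`
    spinners of the given nice level sit on each (other tasks ignored)."""
    a, b = split
    return abs(a - b) * NICE_TO_WEIGHT[nice]

diff = queue_load_difference((1, 3), -10)
# Express the imbalance as a number of nice == 0 tasks.
print(diff, diff / NICE_TO_WEIGHT[0])
```

Under these assumed weights the 1/3 split at nice == -10 is worth
between 18 and 19 nice == 0 tasks, against just 2 for the same split at
nice == 0.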

The evidence seems to indicate that IF a rebalance operation gets
initiated then the right amount of load will get moved.

This new evidence weakens (but does not totally destroy) my
synchronization (a.k.a. conspiracy) theory.


Peter

PS As the total load weight for 4 nice == 10 tasks is only about 40% of
the load weight of a single nice == 0 task, the occasional 0/4 split in
the spinners at nice == 10 case is not unexpected as it would be the
desirable allocation if there were exactly one other running task at
nice == 0.
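
The 40% figure can be sanity-checked the same way. This Python sketch
assumes the mainline prio_to_weight values (nice 10 = 110, nice 0 =
1024), which may not match -v12 exactly; the variable names are
illustrative only.

```python
# Assumed load weights (mainline prio_to_weight values; -v12 may differ).
WEIGHT_NICE_10 = 110
WEIGHT_NICE_0 = 1024

# Combined weight of the four nice == 10 spinners, as a fraction of
# the weight of a single nice == 0 task.
ratio = 4 * WEIGHT_NICE_10 / WEIGHT_NICE_0
print(f"{ratio:.0%}")
```

With these values the four spinners together weigh a little over 40% of
one nice == 0 task, so putting all four on one CPU opposite a single
nice == 0 task is the closest to balanced that any split can get.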

--
Peter Williams                                   [EMAIL PROTECTED]

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce