On 2016.01.21 07:29 Peter Zijlstra wrote:
> On Thu, Jan 21, 2016 at 10:23:25AM +0100, Vik Heyndrickx wrote:
>> Systems show a minimal load average of 0.00, 0.01, 0.05 even when they have
>> no load at all.
>> ---
>> Subject: sched: Fix non-zero idle loadavg
>> From: Vik Heyndrickx <[email protected]>
>> Date: Thu, 21 Jan 2016 10:23:25 +0100
>> Systems show a minimal load average of 0.00, 0.01, 0.05 even when they
>> have no load at all.

>> By removing the single code line that performed a rounding on the
>> internally kept load value, effectively returning this function
>> calc_load to its state it had before, the visualization problem is
>> completely fixed.

Yes, but it introduces a systematic error, rather than the current
balanced error. Thus it doubles the maximum error due to the finite
number of bits used in the math.

>> Once the (old) load becomes 93 or higher, it mathematically can never
>> get lower than 93, even when the active (load) remains 0 forever.
>> This results in the strange 0.00, 0.01, 0.05 uptime values on idle
>> systems. Note: 93/2048 = 0.0454..., which rounds up to 0.05.

As I mentioned on the bug report [1], this is a consequence of carrying
a finite number of bits with such a strong IIR (Infinite Impulse
Response) filter coefficient.

>> It is not correct to add a 0.5 rounding (=1024/2048) here, since the
>> result from this function is fed back into the next iteration again,
>> so the result of that +0.5 rounding value then gets multiplied by
>> (2048-2037), and then rounded again, so there is a virtual "ghost"
>> load created, next to the old and active load terms.

If you do not round, then you get a doubling of the problem on the
load-increasing side of things.

Consider an old load value of 1862 (90.92%), regardless of how it got
there, and a new load value of 2048 (100%) from here onwards. With this
proposed change, the 15 minute math becomes:

  new = (old * 2037 + load * (2048 - 2037)) / 2048
  new = (1862 * 2037 + 2048 * (2048 - 2037)) / 2048
  new = 1862

So, the 100% load will always be shown as 91% (double the old limit).

I have been running this proposed code with 100% load on CPU 7 for a
couple of hours now, and the 15 minute load average is stuck at 0.91.

Myself, I would not take out the rounding, but I defer to Peter.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=45001
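
For anyone who wants to check the arithmetic above without rebuilding a kernel, here is a minimal Python sketch of the 15 minute update in both variants. It is not the kernel code: it only reuses the two constants from the math above (2048 and 2037), the names FIXED_1 and EXP_15 are just labels for them here, and calc_load and settle are hypothetical helpers written for this illustration. It assumes non-negative integer loads and truncating division.

```python
FIXED_1 = 2048   # fixed-point representation of 1.0 (the 2048 used above)
EXP_15 = 2037    # 15 minute decay coefficient (the 2037 used above)


def calc_load(old, active, rounding):
    """One update step: (old * EXP_15 + active * (FIXED_1 - EXP_15)) / FIXED_1.

    With rounding=True, half of FIXED_1 is added before the division,
    which is the +0.5 rounding being discussed.
    """
    new = old * EXP_15 + active * (FIXED_1 - EXP_15)
    if rounding:
        new += FIXED_1 // 2
    return new // FIXED_1  # truncating division, all values non-negative


def settle(start, active, rounding):
    """Repeat the update with a constant active load until it stops changing."""
    load = start
    while True:
        nxt = calc_load(load, active, rounding)
        if nxt == load:
            return load
        load = nxt


if __name__ == "__main__":
    for rounding in (True, False):
        idle = settle(FIXED_1, 0, rounding)   # 100% load, then idle forever
        busy = settle(0, FIXED_1, rounding)   # idle, then 100% load forever
        print(f"rounding={rounding}: "
              f"idle settles at {idle} ({idle / FIXED_1:.4f}), "
              f"busy settles at {busy} ({busy / FIXED_1:.4f})")
```

With rounding, the idle case settles at 93 and the busy case at 1955, so the error is 93 counts on either side. Without rounding, the idle case reaches 0 but the busy case settles at 1862, which is 186 counts short of 2048: the same total error, moved entirely to the load-increasing side.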

