On Sat, 2007-04-07 at 20:08 +0200, Ingo Molnar wrote:
>> not many - and i dont think Mike tested any of these - Mike tested
>> pretty low make -j values (Mike, can you confirm?).
On Sat, Apr 07, 2007 at 09:14:21PM +0200, Mike Galbraith wrote:
> Yes. I don't test anything more than make -j5 when looking at
> interactivity, and make -j nr_cpus+1 is my must have yardstick.

I strongly suggest assembling a battery of cleanly and properly
written, configurable testcases, and scripting a series of regression
tests, as opposed to just randomly running kernel compiles and relying
on Braille. For instance, such a battery should include a program that
spawns a set of tasks with some spectrum of interactive vs.
noninteractive behaviors (and maybe priorities, too) according to
command-line flags, then measures and reports the distribution of CPU
bandwidth between them, with some notion of success or failure, and of
performance within the realm of success. Different sorts of
cooperating processes attempting to defeat whatever guarantees the
scheduler is intended to provide would also make good testcases,
particularly if they're arranged to automatically report success or
failure in their attempts to defeat the scheduler (which even
irman2.c, while quite good otherwise, fails to do).

IMHO the failure of these threads to converge on some clear conclusion
is due in part to the lack of an agreed-upon set of standards for what
the scheduler should achieve, and to overreliance on subjective
criteria. The testcase code going around is also somewhat
embarrassing. From the point of view of someone wondering what these
schedulers solve, how any of this is to be demonstrated, and what the
status of various pathological cases is, these threads are a nightmare
of subjective squishiness: a tug-of-war between testcases that are
only ever considered one at a time, need Lindent to be readable, and
furthermore have all their parameters hardcoded. Scripting edits and
recompiles is awkward.
Just finding the testcases is also awkward: con has a collection of a
few, but they've got the aforementioned flaws, and others that can
only be dredged up from mailing list archive searches also go around.
There's nothing like LTP where they can all be run from a script with
pass/fail reports and/or performance metrics for each. One patch goes
through for one testcase, and regressions against the others remain
open questions.

Scheduling does have a strong subjective component, but this is too
disorganized to be allowed to pass without comment. Some minimum bar
must be set for schedulers to pass before they're considered correct.
Some method of regression testing must be arranged. And the code to do
such testing should not be complete crap with hardcoded parameters.


-- wli