so, i've done a little bit more work characterizing the performance of the scheduler correctness changes, and i now have some understanding of why e.g. ping times are a bit slower.
the old code essentially let processor 0 spin in runproc, while the other processors called halt. the new code uses monmwait to wait for a change on all processors. this has some significant impacts on performance and power use. for example, on my test box with 4c/8t:

          spin/halt   monmwait   spin/monmwait
ping      8µs         14µs       8µs             # ip/ping -n10 $sysname
mk        6.26s       3.98s      3.80s           # make nix kernel
fans      audible     silent     audible
δpower    -           -24w       0               # resolution = .1A = 12w @ 120v

this seems to indicate that the latency is all in runproc(), and that not waiting for things to be ready, but assuming they will be, gives a big performance boost. (the third column, spin on mach 0 plus monmwait on the others, was done to tell whether monmwait itself has high latency.)

i'd really be interested to see what this does on 24c/48t machines. something tells me the performance impacts would be huge, and different.

- erik

---
ps. hzsched in the distribution is 10% off for HZ=100, since schedticks = m->ticks + HZ/10, and delaysched tests for > not the expected >=.
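
a minimal sketch of the off-by-one in the ps, not the kernel source. it assumes the deadline is set to the current tick plus HZ/10 and that the test runs once per clock tick. ticksuntil and its strict flag are made-up names for illustration only.

```c
enum { HZ = 100 };	/* clock ticks per second, as in the ps */

/*
 * hypothetical model: count how many ticks go by before the
 * deadline test first succeeds.
 * strict != 0 models the ">" test, strict == 0 the ">=" test.
 */
int
ticksuntil(int strict)
{
	unsigned long ticks, schedticks;
	int n;

	ticks = 0;			/* stands in for m->ticks */
	schedticks = ticks + HZ/10;	/* deadline 10 ticks out */
	for(n = 0; ; n++){
		if(strict ? ticks > schedticks : ticks >= schedticks)
			return n;
		ticks++;		/* one clock tick elapses */
	}
}
```

under those assumptions ticksuntil(1) comes out to 11 ticks and ticksuntil(0) to 10, so with HZ=100 the > test stretches the interval by one tick in ten, which would account for the 10%.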