If I run "crafty" with mt=4 (4 active threads), interactive feel is
somewhat worse; i.e., the old problem of moving the mouse around and
having it jump inches at a time to catch up is even more noticeable.
I assume such tasks now have to really wedge their way in to get on a
CPU. Overall performance is actually slightly better if you only
count "work done", however, which is what I was interested in testing
this against. I was pretty well convinced that the harder you stick a
process to a CPU, the worse interactive performance is going to be.
One of the cases I have watched is arguably unimportant: a single
compute-bound process does tend to jump around between CPUs on the
stock scheduler, and it does suffer a small amount from cache misses
after each CPU change. But on a quad with one process, is this really
important? Probably not. And I think our "APIC" device is the main
guilty party.
E.g., on the old Sequent we had, the interrupt controller was aware of
what each CPU was doing (the priority of whatever process was running,
on a per-CPU basis) and would channel interrupts to idle processors.
I suppose we could dynamically reprogram the APIC to direct interrupts
away from a busy CPU, so long as another is idle. But the question is:
is this worth anything beyond purely theoretical interest?
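The dynamic-redirection idea did eventually grow a user-visible knob
in Linux; as a hedged sketch only (the /proc/irq/<n>/smp_affinity file
is a later-kernel interface that does not exist in the 2.1 kernels
under discussion, and the IRQ number and mask below are made-up
examples):

```shell
# Hedged sketch of steering an interrupt away from busy CPUs using a
# later-kernel interface; IRQ 19 and the bitmask are hypothetical.
# Restrict IRQ 19 to CPUs 2-3 (bitmask 0xc), keeping cpu0-1 clear:
echo c > /proc/irq/19/smp_affinity

# Watch where interrupts actually land, per CPU:
cat /proc/interrupts
```

This requires root, and absent any such mechanism the io-apic's default
distribution decides which CPU fields each interrupt.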
However, I'm willing to try most anything here, particularly since I
can now build a kernel so damned fast. :)
What I am really looking for is for a single compute-bound process to
"stick" on a single cpu and keep the cache alive and well. And when I
run 4 processes, I'd like them to "stick" as well. Right now I see a
lot of cases where cpu0 takes an interrupt, and while it is handling
it, cpu1 switches to the displaced process, which starts this
merry-go-round and blows the cache out.
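The "stick a compute-bound process to one CPU" experiment later became
possible from user space; a minimal sketch, assuming a much newer
kernel plus util-linux (taskset(1) and the underlying
sched_setaffinity(2) call do not exist in the kernels discussed here):

```shell
# Hypothetical modern equivalent of what Bob wants the scheduler to do
# on its own: hard-pin a process so its cache stays warm on one CPU.
taskset -pc 0 $$        # restrict this shell to cpu0
taskset -p $$           # report the new affinity mask (1 == cpu0 only)
```

Pinning by hand like this trades away exactly what Linus warns about
below: a pinned task cannot migrate to an idle CPU, so latency can
suffer even while throughput improves.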
Certainly non-critical, but interesting to look at. And I'm willing to
look at a lot. :)
Bob
Robert Hyatt Computer and Information Sciences
[EMAIL PROTECTED] University of Alabama at Birmingham
(205) 934-2213 115A Campbell Hall, UAB Station
(205) 934-5473 FAX Birmingham, AL 35294-1170
On Mon, 21 Dec 1998, Linus Torvalds wrote:
>
>
> On Mon, 21 Dec 1998, Robert M. Hyatt wrote:
> >
> > I am testing this on my quad xeon, and it does look better. IE a compute
> > bound process seems to stick on one cpu for long periods of time. It will
> > occasionally move, when the process does an I/O, but it is far better than
> > it was, in that running xosview would show a single process bouncing
> > around quite frequently...
>
> Umm.. What about interactive feel?
>
> PLEASE PLEASE PLEASE don't think that "stick to one CPU" is automatically
> a good thing. It isn't. It has absolutely no meaning what-so-ever aside
> from cache issues, and can be an extremely _bad_ thing for other reasons.
> One of the other reasons is interactive performance and scheduling latency
> under load.
>
> Any patches that are developed using xosview and looking at the load meter
> are very very suspect. PLEASE don't do that, it is a completely bogus
> metric.
>
> The only thing that matters is:
> - absolute performance (ie NUMBERS, not "xosview says it sticks to a
> CPU")
> - latency and responsiveness.
>
> And note that the second one is MORE important - I'd much rather have a
> machine that feels good than one that benchmarks 5% better.
>
> If the only criterion is how xosview looks, then I don't want to see the
> patches, quite frankly. Nice "sticks to one CPU" behaviour on xosview
> does NOT automatically mean that performance is actually better, and
> it can easily mean that interactive response is pure crap.
>
> Note that if you have a quad PII, interactive response is usually fine -
> and the cross-CPU scheduling stuff doesn't matter unless you have a
> CPU-bound load noticeably over four. Be very very careful.
>
> Linus
>
>
-
Linux SMP list: FIRST see FAQ at http://www.irisa.fr/prive/mentre/smp-faq/
To Unsubscribe: send "unsubscribe linux-smp" to [EMAIL PROTECTED]