On Mon, 14 Dec 2009 21:18:28 -0500, dsimcha <dsim...@yahoo.com> wrote:
> == Quote from Dan (dsstruth...@yahoo.com)'s article
>> I have a question regarding a performance issue I am seeing on
>> multicore Windows systems. I am creating many threads to do parallel
>> tasks, and on multicore Windows systems the performance is abysmal.
>> If I use Task Manager to set the processor affinity to a single CPU,
>> the program runs as I would expect. Without that, it takes about 10
>> times as long to complete.
>>
>> Am I doing something wrong? I have tried DMD 2.037 and DMD 1.053 with
>> the same results, running the binary on both a dual-core P4 and a
>> newer Core 2 Duo.
>>
>> Any help is greatly appreciated!
>
> I've seen this happen before. Without knowing the details of your
> code, my best guess is that you're getting a lot of contention for the
> GC lock. (It could also be some other lock, but if it were, there's a
> good chance you'd already know it because it wouldn't be hidden the
> way the GC lock is.) The current GC design isn't very
> multithreading-friendly yet. It requires a lock on every allocation.
> Furthermore, the array append operator (~=) currently takes the GC
> lock on **every append** to query the GC for info about the memory
> block that the array points to. There's been plenty of talk about what
> should be done to eliminate this, but nothing has been implemented so
> far.

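To make the append point above concrete, here is a minimal sketch (not taken from the original poster's code) contrasting element-by-element `~=` with a single up-front allocation. The function names `buildWithAppend` and `buildPreallocated` are hypothetical, and newer D runtimes may handle appends differently from the 2009 behaviour described in the quote, so the difference may be smaller on a current compiler.

```d
// Grows the array one element at a time. Each ~= goes through the
// runtime, which per the description above took the GC lock every time.
int[] buildWithAppend(size_t n)
{
    int[] result;
    foreach (i; 0 .. n)
        result ~= cast(int) i;
    return result;
}

// Allocates the whole array once, then fills it by index, so the loop
// itself performs no GC allocation and no append.
int[] buildPreallocated(size_t n)
{
    auto result = new int[](n);
    foreach (i; 0 .. n)
        result[i] = cast(int) i;
    return result;
}
```

Both functions return the same contents; only the number of trips into the runtime differs, which is what matters when several threads run them at once.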
I would suspect something else. I would actually expect that, even in an
allocation-heavy design, running on multiple cores would be at *least* as
fast as running on a single core. He also has only 2 cores. For
splitting the parallel tasks across 2 cores to take 10x longer is very
alarming. I would suspect application design before the GC in this case.

If it's a fundamental D issue, then we need to fix it ASAP, especially
since D2 is supposed to be (among other things) an upgrade for multi-core.

Maybe I'm wrong. Is there a good test case to prove it is worse on
multiple cores?
-Steve
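
One possible shape for such a test case, offered as a minimal sketch and not as anything posted in the thread: each thread appends to its own private array, so the threads share no data and any slowdown with more threads would have to come from the runtime. It assumes a recent D2 compiler where `core.thread` and `std.datetime.stopwatch` are available (the 2009-era libraries differ). The names `worker`, `timeWorkers`, and `appendsPerThread` are hypothetical.

```d
import core.thread : Thread;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

enum appendsPerThread = 1_000_000;

// Appends to a thread-private array; no user-level locks or shared data.
void worker()
{
    int[] data;
    foreach (i; 0 .. appendsPerThread)
        data ~= i;
}

// Runs `threadCount` workers concurrently and returns the wall-clock
// time in milliseconds until all of them have finished.
long timeWorkers(size_t threadCount)
{
    auto sw = StopWatch(AutoStart.yes);
    Thread[] threads;
    foreach (_; 0 .. threadCount)
        threads ~= new Thread(&worker);
    foreach (t; threads)
        t.start();
    foreach (t; threads)
        t.join();
    return sw.peek.total!"msecs";
}

void main()
{
    // Total work grows with the thread count, so with enough cores and
    // no contention the times should stay roughly flat.
    foreach (n; [1, 2, 4])
        writefln("%s thread(s): %s ms", n, timeWorkers(n));
}
```

Running it once normally and once with the process pinned to a single CPU (as the original poster did through Task Manager) would show whether the multicore run is really the slower one.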