On Mon, 14 Dec 2009 21:18:28 -0500, dsimcha <dsim...@yahoo.com> wrote:

== Quote from Dan (dsstruth...@yahoo.com)'s article
I have a question regarding a performance issue I am seeing on multicore Windows
systems. I am creating many threads to do parallel tasks, and on multicore Windows systems the performance is abysmal. If I use Task Manager to set the processor affinity to a single CPU, the program runs as I would expect. Without
that, it takes about 10 times as long to complete.
Am I doing something wrong? I have tried DMD 2.037 and DMD 1.053 with the
same results, running the binary on both a dual-core P4 and a newer Core 2 Duo.
Any help is greatly appreciated!

I've seen this happen before. Without knowing the details of your code, my best guess is that you're getting a lot of contention for the GC lock. (It could also be some other lock, but if it were, there's a good chance you'd already know it
because it wouldn't be hidden.)  The current GC design isn't very
multithreading-friendly yet.  It requires a lock on every allocation.
Furthermore, the array append operator (~=) currently takes the GC lock on *every append* to query the GC for info about the memory block that the array points to. There's been plenty of talk about what should be done to eliminate this, but
nothing has been implemented so far.
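
To make the append pattern described above concrete, here is a minimal sketch in D. The function names `buildWithAppend` and `buildPreallocated` are hypothetical and appear nowhere in the original code, which was not posted. The first function grows an array one element at a time with ~=, which is the pattern said to hit the GC on every append. The second allocates the whole block once and fills it in place, so the GC is consulted a single time. Whether this matters in practice depends on the runtime version in use.

```d
import std.stdio : writeln;

// Grows the array one element at a time. Each ~= may consult the GC
// about the underlying memory block (illustrative pattern only).
int[] buildWithAppend(size_t n)
{
    int[] a;
    foreach (i; 0 .. n)
        a ~= cast(int) i;
    return a;
}

// Allocates the full block once, then fills it in place, so the
// loop body itself performs no allocation and no append.
int[] buildPreallocated(size_t n)
{
    auto a = new int[n];
    foreach (i, ref x; a)
        x = cast(int) i;
    return a;
}

void main()
{
    // Both approaches produce the same contents.
    assert(buildWithAppend(1000) == buildPreallocated(1000));
    writeln("elements built: ", buildPreallocated(1000).length);
}
```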

I would suspect something else. I would actually expect that, even in an allocation-heavy design, running on multiple cores should be at *least* as fast as running on a single core. He also has only 2 cores. For splitting the parallel tasks across 2 cores to take 10x longer is very alarming. I would suspect the application design before the GC in this case. If it's a fundamental D issue, then we need to fix it ASAP, especially since D2 is supposed to be (among other things) an upgrade for multi-core.

Maybe I'm wrong. Is there a good test case that shows it is worse on multiple cores?
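
One possible shape for such a test case is sketched below. It is not the original poster's program, and the names `worker`, `timeRun`, and `appendsPerThread` are made up for illustration. Each thread appends to its own private array, so the threads share no user data and any slowdown has to come from shared runtime state such as the GC lock. Every thread does the same amount of work, so with enough cores and no contention the reported times should stay roughly flat as the thread count rises. Results will vary with the compiler, the runtime version, and the machine; the sketch assumes a current druntime and Phobos (`core.thread`, `std.datetime.stopwatch`).

```d
import core.thread : Thread;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

// Hypothetical workload size; tune it for the machine under test.
enum appendsPerThread = 1_000_000;

// Each worker builds its own array with ~=, so the threads share
// no user-level data, only the runtime and its allocator.
void worker()
{
    int[] arr;
    foreach (i; 0 .. appendsPerThread)
        arr ~= i;
}

// Starts nThreads workers, waits for all of them, and returns the
// elapsed wall-clock time in milliseconds.
long timeRun(int nThreads)
{
    auto sw = StopWatch(AutoStart.yes);
    auto threads = new Thread[nThreads];
    foreach (ref t; threads)
    {
        t = new Thread(&worker);
        t.start();
    }
    foreach (t; threads)
        t.join();
    return sw.peek.total!"msecs";
}

void main()
{
    foreach (n; [1, 2, 4])
        writefln("%s thread(s): %s ms", n, timeRun(n));
}
```

Running the same binary with processor affinity pinned to one CPU, then unpinned, would reproduce the comparison described in the original question.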

-Steve
