On Sunday, 4 May 2014 at 17:01:23 UTC, safety0ff wrote:
On Saturday, 3 May 2014 at 22:46:03 UTC, Andrei Alexandrescu wrote:
On 5/3/14, 2:42 PM, Atila Neves wrote:
gdc gave _very_ different results. I had to use different modules
because at some point tests started failing, but with gdc the
threaded version runs ~3x faster.
On my own unit-threaded benchmarks, running the UTs for Cerealed
over and over again was only slightly slower with threads than
without. With dmd the threaded version was nearly 3x slower.
Sounds like a severe bug in dmd or dependents. -- Andrei
This reminds me of when I was parallelizing a Project Euler
solution: atomic access was so much slower on DMD that it made
performance worse than the single-threaded version for one
stage of the program.
I know that std.parallelism does make use of core.atomic under
the hood, so this may be a factor when using DMD.
Funny you should say that, a friend of mine tried porting a
lock-free algorithm of his from Java to D a few weeks ago. The D
version ran 3 orders of magnitude slower. Then I tried gdc and
ldc on his code. ldc produced code running at around 80% of the
speed of the Java version, gdc was around 30%. But dmd...