On Tuesday, 18 August 2015 at 10:45:49 UTC, Walter Bright wrote:
Martin ran some benchmarks recently that showed that ddmd
compiled with dmd was about 30% slower than when compiled with
gdc/ldc. This seems to be fairly typical.
I'm interested in ways to reduce that gap.
There are 3 broad kinds of optimizations that compilers do:
1. source translations like rewriting x*2 into x<<1, and
function inlining
2. instruction selection patterns, such as whether to generate:
SETC AL
MOVZX EAX,AL
or:
SBB EAX,EAX
NEG EAX
3. data flow analysis optimizations like constant propagation,
dead code elimination, register allocation, loop invariants,
etc.
Modern compilers (including dmd) do all three.
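To make the first two kinds concrete, here is a rough sketch in C (not D, and not taken from any compiler's source). The function names are made up for illustration. The carry functions only model what the two instruction sequences compute, namely a 0/1 value from the carry flag; which instructions a given compiler actually emits for them is not guaranteed.

```c
#include <stdint.h>

/* Kind 1: a source-level rewrite. For unsigned values,
   multiplying by 2 and shifting left by 1 give the same result. */
uint32_t times2_mul(uint32_t x)   { return x * 2u; }
uint32_t times2_shift(uint32_t x) { return x << 1; }

/* Kind 2: two ways to turn "did a + b carry?" into 0 or 1. */

/* Models SETC / zero-extend: the comparison result is already 0 or 1. */
uint32_t carry_setc(uint32_t a, uint32_t b)
{
    uint32_t sum = (uint32_t)(a + b);   /* wraps on overflow */
    return (uint32_t)(sum < a);         /* 1 if the add carried */
}

/* Models SBB reg,reg then NEG reg: first build 0 or 0xFFFFFFFF,
   then negate it to get 0 or 1. */
uint32_t carry_sbb_neg(uint32_t a, uint32_t b)
{
    uint32_t sum  = (uint32_t)(a + b);
    uint32_t mask = 0u - (uint32_t)(sum < a);  /* 0 or all ones */
    return 0u - mask;                          /* 0 or 1 */
}
```

Both carry functions return the same value for every input; the instruction selection question is only which encoding is faster on a given CPU.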
So if you're comparing code generated by dmd/gdc/ldc, and
notice something that dmd could do better at (1, 2 or 3),
please let me know. Often this sort of thing is low hanging
fruit that is fairly easily inserted into the back end.
For example, recently I improved the usage of the SETcc
instructions.
https://github.com/D-Programming-Language/dmd/pull/4901
https://github.com/D-Programming-Language/dmd/pull/4904
A while back I improved usage of BT instructions, the way
switch statements were implemented, and fixed integer divide by
a constant with multiply by its reciprocal.
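For anyone unfamiliar with the divide-by-reciprocal trick, a minimal sketch in C, assuming unsigned 32-bit operands and a divisor of 10. The function name div10 is hypothetical and this is not dmd's implementation; the constant 0xCCCCCCCD with a shift of 35 is the commonly used pair for this divisor.

```c
#include <stdint.h>

/* Unsigned 32-bit divide by 10 without a DIV instruction.
   0xCCCCCCCD is ceil(2^35 / 10), so (x * 0xCCCCCCCD) >> 35
   equals x / 10 for every 32-bit x. The product is computed
   in 64 bits so it cannot overflow. */
uint32_t div10(uint32_t x)
{
    uint64_t product = (uint64_t)x * 0xCCCCCCCDu;
    return (uint32_t)(product >> 35);
}
```

The multiply plus shift is typically much cheaper than a hardware divide, which is why compilers apply this whenever the divisor is known at compile time.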
I've often looked at the assembly output of ICC.
One thing that was striking to me is that by and large it
doesn't use PUSH, POP, or SETcc. Actually I don't remember any
of these instructions being emitted by it.
And indeed using PUSH/POP/SETcc in assembly was often slower
than the alternative. Which is _way_ different from the old x86,
where each of these things would gain speed.
Instead of PUSH/POP it would spill all registers to an RBP-based
location (perhaps taking advantage of the register renamer?).
---------------
That said: I entirely agree with Vladimir about the codegen risk.
DMD will always be used anyway because it compiles faster.