I've created a simple 4x4 matrix struct and run some tests, with
surprising results (although I've heard of similar results in C++):
struct mat4 {
    float[4][4] m;
    mat4 opMul(in mat4 _m);
}
mat4 mul(in mat4 m1, in mat4 m2);
I've tested these two multiplication functions (the overloaded operator and the free function).
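For context, the benchmark has roughly the following shape. This is only a sketch, not the exact code behind the numbers: the triple-loop bodies, the identity() helper, the explicit a.opMul(b) call and the timing via std.datetime.stopwatch are all assumptions filled in for illustration.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

struct mat4 {
    float[4][4] m;

    // Member version (assumed body: plain triple loop).
    mat4 opMul(in mat4 _m) const {
        mat4 r;
        foreach (i; 0 .. 4)
            foreach (j; 0 .. 4) {
                float sum = 0;
                foreach (k; 0 .. 4)
                    sum += m[i][k] * _m.m[k][j];
                r.m[i][j] = sum;
            }
        return r;
    }
}

// Free version (assumed to perform the same loop).
mat4 mul(in mat4 m1, in mat4 m2) {
    mat4 r;
    foreach (i; 0 .. 4)
        foreach (j; 0 .. 4) {
            float sum = 0;
            foreach (k; 0 .. 4)
                sum += m1.m[i][k] * m2.m[k][j];
            r.m[i][j] = sum;
        }
    return r;
}

// Hypothetical helper: builds an identity matrix as test input.
mat4 identity() {
    mat4 r;
    foreach (i; 0 .. 4)
        foreach (j; 0 .. 4)
            r.m[i][j] = (i == j) ? 1.0f : 0.0f;
    return r;
}

void main() {
    enum runs = 100_000;
    auto a = identity();
    auto b = identity();

    // Time the member function.
    auto sw = StopWatch(AutoStart.yes);
    foreach (n; 0 .. runs)
        a = a.opMul(b);
    immutable tMember = sw.peek.total!"msecs";

    // Time the free function (reset keeps the stopwatch running).
    sw.reset();
    foreach (n; 0 .. runs)
        b = mul(b, a);
    immutable tFree = sw.peek.total!"msecs";

    writefln("opMul: %s msecs", tMember);
    writefln("mul  : %s msecs", tFree);
}
```

Feeding each result back into the next iteration is meant to keep the optimizer from dropping the loops entirely.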
Results (100_000 iterations, Windows 7 64-bit, 32-bit build):
DMD: (dmd -O -inline -release)
opMul: 20 msecs
mul : 1355 msecs
GDC: (gdc -m32 -O3 -frelease)
opMul: 10 msecs
mul : 1481 msecs
Why is there such a huge difference in performance?