Okay, here we go...

1) Don't use upper case letters in module names (http://dlang.org/module.html#ModuleDeclaration)
2) As has already been suggested, if you're targeting raw performance, don't use the GC. You can always malloc and free your storage. Using the C heap has certain implications, but they're not important right now.
3) Ditch the extra constructors, they're completely unnecessary. For a matrix you only need your non-trivial ctor, postblit and lvalue assignment operator.
4) This:

    ref Matrix transpose() const {
        return Matrix(this).transposeAssign();
    }

just doesn't make any sense. You're returning a reference to a temporary, which is already gone by the time the caller gets to use it. Return the Matrix by value instead.
5) Use scoped imports, they're good :)
6) Use writefln for formatted output, it's good :)
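
To make points 2-6 concrete, here is a minimal sketch. It is not the code from the paste; the field names, the row-major layout and the allocate helper are assumptions made up for illustration only:

    import core.stdc.stdlib : malloc, free;
    import core.exception : onOutOfMemoryError;

    struct Matrix
    {
        private double* data;       // C-heap storage, row-major (assumed layout)
        private size_t rows, cols;

        // Hypothetical helper: malloc with a null check.
        private static double* allocate(size_t n)
        {
            if (n == 0) return null;
            auto p = cast(double*) malloc(n * double.sizeof);
            if (p is null) onOutOfMemoryError();
            return p;
        }

        // The one non-trivial ctor.
        this(size_t rows, size_t cols)
        {
            this.rows = rows;
            this.cols = cols;
            data = allocate(rows * cols);
            data[0 .. rows * cols] = 0.0;
        }

        // Postblit: after the bitwise copy, data still points at the
        // source's buffer, so give this copy its own.
        this(this)
        {
            auto src = data;
            data = allocate(rows * cols);
            data[0 .. rows * cols] = src[0 .. rows * cols];
        }

        ~this() { free(data); }     // free(null) is a no-op

        // Lvalue assignment: reuse the buffer when the sizes match.
        ref Matrix opAssign(ref const Matrix rhs)
        {
            if (&rhs is &this) return this;
            if (rows * cols != rhs.rows * rhs.cols)
            {
                free(data);
                data = allocate(rhs.rows * rhs.cols);
            }
            rows = rhs.rows;
            cols = rhs.cols;
            data[0 .. rows * cols] = rhs.data[0 .. rows * cols];
            return this;
        }

        ref inout(double) opIndex(size_t r, size_t c) inout
        {
            return data[r * cols + c];
        }

        // Returns by value, not by ref: no reference to a temporary.
        Matrix transpose() const
        {
            auto result = Matrix(cols, rows);
            foreach (r; 0 .. rows)
                foreach (c; 0 .. cols)
                    result.data[c * rows + r] = data[r * cols + c];
            return result;
        }
    }

    void main()
    {
        import std.stdio : writefln;    // scoped import

        auto m = Matrix(2, 3);
        m[0, 1] = 5.0;
        auto t = m.transpose();
        writefln("t is %sx%s, t[1, 0] = %.1f", t.rows, t.cols, t[1, 0]);
    }

Note that with only the ref overload of opAssign, this sketch assigns from lvalues only; whether you also want a by-value overload for temporaries is a separate design choice.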

With the above suggestions your code transforms into this:

http://dpaste.dzfl.pl/9d7feeab59f6

And here are the timings on my machine:

$ rdmd -O -release -inline -noboundscheck main.d
allocationTest ...
        Time required: 1 sec, 112 ms, 827 μs, and 3 hnsecs
multiplicationTest ...
        Time required: 1 sec, 234 ms, 417 μs, and 8 hnsecs
toStringTest ...
        Time required: 998 ms, 16 μs, and 2 hnsecs
transposeTest ...
        Time required: 813 ms, 947 μs, and 3 hnsecs
scalarMultiplicationTest ...
        Time required: 105 ms, 828 μs, and 5 hnsecs
matrixAddSubTest ...
        Time required: 240 ms and 384 μs
matrixEqualsTest ...
        Time required: 244 ms, 249 μs, and 8 hnsecs
identityMatrixTest ...
        Time required: 249 ms, 897 μs, and 4 hnsecs

LDC yields roughly the same times.
