Hiho,

thank you for your code improvements and suggestions.

I really like the foreach loop in D, as well as the slight (but measurable) performance boost it gives over conventional for loops. =)
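
Just so it is clear what I mean, the pattern I switched to is the range foreach, e.g.:

        foreach (immutable i; 0 .. n) {
                sum += data[i];
        }

instead of the conventional

        for (size_t i = 0; i < n; ++i) {
                sum += data[i];
        }

(n, data and sum are just placeholders here, not actual names from my code.)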

Another success: with the changes I made I managed to further improve the matrix multiplication performance from 3.6 seconds for two 1000x1000 matrices down to 1.9 seconds, which is already very close to Java and C++ at about 1.3 - 1.5 seconds.

The key to victory was pointer arithmetic, as I noticed that I had used it in the C++ implementation, too. xD
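
The hot loop now looks roughly like the sketch below. This is only a simplified, self-contained illustration with made-up names (Mat, multiply); the real implementation is in the paste linked at the end:

// Stand-in struct just for this sketch; the real Matrix!T is in the paste.
struct Mat {
        double[] data;
        size_t rows, cols;
        this(size_t r, size_t c) { rows = r; cols = c; data = new double[r * c]; }
}

// assumes a.cols == b.rows
Mat multiply(const ref Mat a, const ref Mat b) {
        // transpose b once so both operands are traversed row-wise (cache friendly)
        auto t = Mat(b.cols, b.rows);
        foreach (immutable i; 0 .. b.rows)
                foreach (immutable j; 0 .. b.cols)
                        t.data[j * b.rows + i] = b.data[i * b.cols + j];

        auto result = Mat(a.rows, b.cols);
        foreach (immutable i; 0 .. a.rows) {
                const(double)* rowA = &a.data[i * a.cols];
                foreach (immutable j; 0 .. b.cols) {
                        const(double)* rowB = &t.data[j * a.cols];
                        double sum = 0;
                        foreach (immutable k; 0 .. a.cols)
                                sum += rowA[k] * rowB[k];       // raw pointer indexing in the inner loop
                        result.data[i * b.cols + j] = sum;
                }
        }
        return result;
}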

The toString implementation also got slightly faster thanks to the changes you mentioned above: 1.37 secs -> 1.29 secs.

I have also adjusted all operator overloads to the "new style" - I simply hadn't known about that "new style" until now - thanks!
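
By "new style" I mean the template-based operators (opBinary and friends) instead of the old named ones like opAdd. As a reduced sketch (simplified, the exact versions are in the paste), the addition operator now looks something like:

        Matrix opBinary(string op : "+")(const ref Matrix other) const {
                Matrix result = Matrix(other);  // copy, then add in place
                result.data[] += this.data[];   // array operation on the data slice
                return result;
        }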

I will just post the whole code again so that you can see what I have changed.

Keep in mind that I am still using DMD as the compiler, so performance may improve further once I switch to another compiler!

All in all I am very happy with the code analysis and its improvements! However, there were a few strange things that left me quite confused ...

void allocationTest() {
        writeln("allocationTest ...");
        sw.start();     // sw and printBenchmarks() are declared elsewhere in the full code
        auto m1 = Matrix!double(10000, 10000);
        { auto m2 = Matrix!double(10000, 10000); }
        { auto m2 = Matrix!double(10000, 10000); }
        { auto m2 = Matrix!double(10000, 10000); }
        //{ auto m2 = Matrix!double(10000, 10000); }
        sw.stop();
        printBenchmarks();
}

This is the most confusing code snippet. I changed all the allocations of m1 and m2 from new Matrix!double (on the heap) to Matrix!double (on the stack), and the performance dropped significantly - the benchmarked time rose from 2.3 seconds to over 25 seconds!! Now look at the code above. When I leave it as it is now, the code needs about 2.9 seconds to run, but when I enable the currently commented-out line, it takes 14 to 25 seconds longer! mind blown ... 0.o

This is extremely confusing to me: I allocate these matrices on the stack, and since each m2 lives in its own scoped block, I expected its memory to be released again immediately, so that no more than two matrices consume memory at the same time. That just wasn't the case as far as my tests show.
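
To show what I mean by "on the stack" vs. "on the heap", the change was essentially the following (a reduced sketch with an assumed constructor - the real struct is in the paste):

        // before: the matrix object itself was allocated on the GC heap
        //auto m2 = new Matrix!double(10000, 10000);

        // now: the struct value lives on the stack, but its element storage is
        // still a slice created with new inside the constructor, roughly:
        struct Matrix(T) {
                T[] data;
                // ...
                this(size_t rows, size_t cols) {
                        data = new T[rows * cols];      // ~800 MB for 10000x10000 doubles
                }
        }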

Another strange thing was that this new opEquals implementation:

        bool opEquals(const ref Matrix other) const pure nothrow {
                if (this.dim != other.dim) {
                        return false;
                }
                foreach (immutable i; 0 .. this.dim.size) {
                        if (this.data[i] != other.data[i]) return false;
                }
                return true;
        }

is actually about 20% faster than the one you suggested, i.e. the single line "return this.dim == other.dim && this.data[] == other.data[];".
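
Written out as a full method, the variant I benchmarked against was (as I understood your suggestion; the attribute set may need adjusting depending on the compiler version):

        bool opEquals(const ref Matrix other) const {
                // single expression: compare dimensions, then both data slices at once
                return this.dim == other.dim && this.data[] == other.data[];
        }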

The last thing I haven't quite understood yet: I tried to replace

auto t = Matrix(other).transposeAssign();

in the matrix multiplication algorithm with its shorter and clearer form

auto t = other.transpose(); // sorry for the nasty '()', but I like them! :/

This, however, gave me wonderful segmentation faults at runtime when using the matrix multiplication ...

And here is the complete and improved code:
http://dpaste.dzfl.pl/7f8610efa82b

Thanks in advance for helping me! =)

Robin
