Re: Matrix mul

2008-11-23 Thread BCS
Reply to Bill, Exactly. That's why I haven't spent too much time benchmarking it. It would be quite surprising if something I wrote in D outperformed the ATLAS SSE3 optimized BLAS implementation. (Though it would not be so surprising if something Don wrote managed to outperform it. :-) ) --bb

Re: Matrix mul

2008-11-23 Thread bearophile
Bill Baxter: > I do find all your benchmark postings interesting. Most of them are buggy in the beginning... The main problem is that they often do nothing real, so there's no practical result to test. In the future I'll add better tests to avoid my silliest errors. > Anyway I gave

Re: Matrix mul

2008-11-23 Thread Andrei Alexandrescu
Tom S wrote: Andrei Alexandrescu wrote: Tom S wrote: Don wrote: bearophile wrote: While writing code that works on matrices I have found something curious... Here's what I think is going on. AFAIK, D hasn't got any special code for initializing jagged arrays. So auto A = new doubl

Re: Matrix mul

2008-11-23 Thread Bill Baxter
On Sun, Nov 23, 2008 at 6:15 PM, bearophile <[EMAIL PROTECTED]> wrote: > Bill Baxter: >> Exactly. That's why I haven't spent too much time benchmarking it. >> It would be quite surprising if something I wrote in D outperformed >> the ATLAS SSE3 optimized BLAS implementation. > > Performing many be

Re: Matrix mul

2008-11-23 Thread Tom S
Andrei Alexandrescu wrote: Tom S wrote: Don wrote: bearophile wrote: While writing code that works on matrices I have found something curious... Here's what I think is going on. AFAIK, D hasn't got any special code for initializing jagged arrays. So auto A = new double[][](N, N); i

Re: Matrix mul

2008-11-23 Thread Sergey Gromov
Sun, 23 Nov 2008 07:33:16 -0600, Andrei Alexandrescu wrote: > Sergey Gromov wrote: >> The really weird part is, if I comment out the "init mats randomly" loop >> in #1, it becomes twice as slow, i.e. 20s against the original 10s. I >> don't get it. > > I think what happens is that the speed of F

Re: Matrix mul

2008-11-23 Thread Andrei Alexandrescu
bearophile wrote: Andrei Alexandrescu: My guess is that if you turn that off, the differences won't be as large (or even detectable for certain ranges of N). The array bounds aren't checked; the code is compiled with -O -release -inline. Do you see array bounds checks in the asm code at t

Re: Matrix mul

2008-11-23 Thread Andrei Alexandrescu
Tom S wrote: Don wrote: bearophile wrote: While writing code that works on matrices I have found something curious... Here's what I think is going on. AFAIK, D hasn't got any special code for initializing jagged arrays. So auto A = new double[][](N, N); involves N+1 memory allocatio

Re: Matrix mul

2008-11-23 Thread Andrei Alexandrescu
Sergey Gromov wrote: Sat, 22 Nov 2008 20:14:37 -0500, bearophile wrote: Can you or someone else run that little D code, so you can tell me if my timings are right? Tested your code using DMD 2.019. You are right. The #1 is about 30 times slower than #2: 10s against 0.3s on my laptop. I've

Re: Matrix mul

2008-11-23 Thread bearophile
Bill Baxter: > Exactly. That's why I haven't spent too much time benchmarking it. > It would be quite surprising if something I wrote in D outperformed > the ATLAS SSE3 optimized BLAS implementation. Performing many benchmarks teaches you that it's better not to assume too many things. Nature and c

Re: Matrix mul

2008-11-23 Thread bearophile
Tom S: > Nah, it's about NaNs :) > Version #1 initializes C to NaN, Version #2 initializes it to 0. The > 'init mats randomly' loop doesn't touch C at all, thus all the latter > additions leave C at NaN, causing lots of FP exceptions. You are right, and I'm stupid. Most of my benchmarks have s

Re: Matrix mul

2008-11-23 Thread Bill Baxter
On Sun, Nov 23, 2008 at 12:25 PM, BCS <[EMAIL PROTECTED]> wrote: > Reply to bearophile, > >> Bill Baxter Wrote: >> >>> I haven't done any benchmarks, though. :-) >>> >> I see. But using something because it's supposed to be faster without >> performing actual performance comparisons looks a little

Re: Matrix mul

2008-11-23 Thread Bill Baxter
Doh! The NaNs strike again. --bb On Sun, Nov 23, 2008 at 3:40 PM, Tom S <[EMAIL PROTECTED]> wrote: > Don wrote: >> >> bearophile wrote: >>> >>> While writing code that works on matrices I have found something >>> curious... >> >> Here's what I think is going on. AFAIK, D hasn't got any special c

Re: Matrix mul

2008-11-22 Thread Tom S
Don wrote: bearophile wrote: While writing code that works on matrices I have found something curious... Here's what I think is going on. AFAIK, D hasn't got any special code for initializing jagged arrays. So auto A = new double[][](N, N); involves N+1 memory allocations. As well as

Re: Matrix mul

2008-11-22 Thread Don
bearophile wrote: While writing code that works on matrices I have found something curious... Here's what I think is going on. AFAIK, D hasn't got any special code for initializing jagged arrays. So auto A = new double[][](N, N); involves N+1 memory allocations. As well as being slow,

Re: Matrix mul

2008-11-22 Thread Sergey Gromov
Sat, 22 Nov 2008 20:14:37 -0500, bearophile wrote: > Can you or someone else run that little D code, so you can tell me if > my timings are right? Tested your code using DMD 2.019. You are right. The #1 is about 30 times slower than #2: 10s against 0.3s on my laptop. I've also tried to replace

Re: Matrix mul

2008-11-22 Thread BCS
Reply to bearophile, Bill Baxter Wrote: I haven't done any benchmarks, though. :-) I see. But using something because it's supposed to be faster without performing actual performance comparisons looks a little strange to me :-) BLAS may well be the most tested, optimized and benchmarked c

Re: Matrix mul

2008-11-22 Thread bearophile
Bill Baxter Wrote: > I haven't done any benchmarks, though. :-) I see. But using something because it's supposed to be faster without performing actual performance comparisons looks a little strange to me :-) > Might be interesting to try out my MinGW-compiled ATLAS BLAS matrix > mult against

Re: Matrix mul

2008-11-22 Thread Bill Baxter
On Sun, Nov 23, 2008 at 8:29 AM, bearophile <[EMAIL PROTECTED]> wrote: > Andrei Alexandrescu: >> My guess is that if you turn that off, the differences won't be as large >> (or even detectable for certain ranges of N). > > The array bounds aren't checked; the code is compiled with -O -release >

Re: Matrix mul

2008-11-22 Thread bearophile
Andrei Alexandrescu: > My guess is that if you turn that off, the differences won't be as large > (or even detectable for certain ranges of N). The array bounds aren't checked; the code is compiled with -O -release -inline. Do you see array bounds checks in the asm code at the bottom of my po

Re: Matrix mul

2008-11-22 Thread Andrei Alexandrescu
bearophile wrote: While writing code that works on matrices I have found something curious, so I have written the following little benchmark. As usual keep your eyes open for possible bugs and mistakes of mine: [snip] This is yet another proof that bounds checking can cost a lot. Although the loopi

Matrix mul

2008-11-22 Thread bearophile
While writing code that works on matrices I have found something curious, so I have written the following little benchmark. As usual keep your eyes open for possible bugs and mistakes of mine: import std.conv: toInt; import std.c.stdlib: rand, malloc; const int RAND_MAX = short.max; // RAND_MAX seem