[julia-users] Re: Matlab bench in Julia
And Elliot Saba writes:

> The first thing you should do is run your code once to warm up the JIT, and then run it again to measure the actual run time, rather than compile time + run time.

To be fair, he seems to be timing MATLAB the same way, so he's comparing the systems appropriately at that level. It's just the tuned BLAS+LAPACK+FFTW vs. the default ones. This is one reason why MATLAB bundles so much. (Another reason being that differences in numerical results cause support calls. It took a long time before MATLAB gave in to per-platform-tuned libraries.)
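The warm-up pattern Elliot describes can be sketched like this (a minimal sketch using the Julia 0.3-era tic()/toc() API seen elsewhere in this thread; the matrix size and the choice of lufact as the workload are illustrative assumptions, not from the original benchmark):

```julia
A = rand(1000, 1000)

# First call triggers JIT compilation; discard its timing.
lufact(A)

# Subsequent calls measure only run time.
tic()
for i = 1:10
    lufact(A)
end
println(toc() / 10, " seconds per factorization")
```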
Re: [julia-users] Re: Matlab bench in Julia
In addition, our lu calculates a partially pivoted LU and returns the L and U matrices and the vector of permutations. To get something comparable in MATLAB you'd have to write

    [L,U,p] = lu(A,'vector')

On my old Mac, where Julia is compiled with OpenBLAS, the timings are

MATLAB:

    tic(); for i = 1:10 [L,U,p] = qr(A, 'vector'); end; toc()/10
    ans = 3.4801

Julia:

    julia> tic(); for i = 1:10 qr(A); end; toc()/10
    elapsed time: 14.758491472 seconds
    1.4758491472

Kind regards,

Andreas Noack
Re: [julia-users] Re: Matlab bench in Julia
I'm slightly confused – does that mean Julia is 2.4x faster in this case?
Re: [julia-users] Re: Matlab bench in Julia
Yes. It appears so on my Mac. I just redid the timings with the same result.

Kind regards,

Andreas Noack
Re: [julia-users] Re: Matlab bench in Julia
Nice :-)
Re: [julia-users] Re: Matlab bench in Julia
I knew something was not right. I typed qr, not lu. Hence, in that case MATLAB did pivoting and Julia didn't. Sorry for that. Here are the right timings for lu, which are as expected: MKL is slightly faster than OpenBLAS.

MATLAB:

    tic(); for i = 1:10 [L,U,p] = lu(A, 'vector'); end; toc()/10
    ans = 0.2314

Julia:

    julia> tic(); for i = 1:10 lu(A); end; toc()/10
    elapsed time: 3.147632455 seconds
    0.3147632455

Kind regards,

Andreas Noack
[julia-users] Re: Matlab bench in Julia
Thanks for the tips. I have now compiled Julia on my laptop, and the results are:

    julia> versioninfo()
    Julia Version 0.3.0+6
    Commit 7681878* (2014-08-20 20:43 UTC)
    Platform Info:
      System: Linux (x86_64-redhat-linux)
      CPU: Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz
      WORD_SIZE: 64
      BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
      LAPACK: libopenblas
      LIBM: libopenlibm
      LLVM: libLLVM-3.3

    julia> include("code/julia/bench.jl")
    LU decomposition, elapsed time: 0.123349203 seconds
    FFT, elapsed time: 0.20440579 seconds

MATLAB R2014a, with [L,U,P] = lu(A); instead of just lu(A);:

    LU decomposition, elapsed time: 0.0586 seconds
    FFT elapsed time: 0.0809 seconds

So a great improvement, but Julia (that is, its underlying linear algebra libraries) still seems 2-3 times slower than MATLAB on these two very limited benchmarks. Perhaps MathWorks found a way to speed up their linear algebra recently? The Fedora precompiled OpenBLAS was already installed at the first test (and presumably used by Julia), but, as Andreas has also pointed out, it seems to be significantly slower than an OpenBLAS library compiled as part of the Julia installation.
[julia-users] Re: Matlab bench in Julia
I have found that I get better performance from some OpenBLAS routines by setting the number of BLAS threads to the number of physical CPU cores (half the number returned by CPU_CORES when hyperthreading is enabled):

    Base.blas_set_num_threads(div(CPU_CORES,2))

--Peter
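Peter's suggestion can be sketched as a before/after comparison (a sketch using the Julia 0.3-era API; the matrix size and the use of matrix multiply as the BLAS workload are arbitrary choices for illustration, and the halving assumes hyperthreading with two threads per core):

```julia
A = rand(2000, 2000)
B = rand(2000, 2000)
A * B  # warm-up run

# All logical cores, as reported by CPU_CORES.
Base.blas_set_num_threads(CPU_CORES)
tic(); A * B; t_logical = toq()

# Physical cores only (assumes hyperthreading is enabled).
Base.blas_set_num_threads(div(CPU_CORES, 2))
tic(); A * B; t_physical = toq()

println("logical: $t_logical s, physical: $t_physical s")
```

Whether the second timing actually comes out faster depends on the machine and the routine.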
Re: [julia-users] Re: Matlab bench in Julia
As Douglas Bates wrote, these benchmarks mainly measure the speed of the underlying libraries. MATLAB uses MKL from Intel, which is often the fastest library. However, the speed of OpenBLAS can be very different on different architectures, and sometimes it can be faster than MKL. I just tried the benchmarks on a Linux server where that is the case. Milan, unfortunately I don't remember which distribution it was. I think it was a couple of months ago, but I'm not sure.

Kind regards,

Andreas Noack