Sure, I just made a very quick benchmark, and it should not be taken too seriously. I just thought we should measure what Matlab does rather than speculate too much about it.
On Thursday, 22 May 2014 15:38:15 UTC+2, gael....@gmail.com wrote:
>
> On Thursday, 22 May 2014 09:51:44 UTC+2, Tobias Knopp wrote:
>>
>> To give this discussion some facts, I have done some benchmarking on
>> my own.
>>
>> Matlab R2013a:
>>
>> function [ y ] = perf( )
>>     N = 10000000;
>>     x = rand(N,1);
>>     y = x + x .* x + x .* x;
>> end
>>
>> tic;y=perf();toc;
>> Elapsed time is 0.177664 seconds.
>>
>> Julia 0.3 prerelease:
>>
>> function perf()
>>     N = 10000000
>>     x = rand(N)
>>     y = x + x .* x + x .* x
>> end
>>
>> julia> @time perf()
>> elapsed time: 0.232852894 seconds (400002808 bytes allocated)
>>
>> Using Devectorize.jl:
>>
>> function perf_devec()
>>     N = 10000000
>>     x = rand(N)
>>     @devec y = x + x .* x + x .* x
>> end
>>
>> julia> @time perf_devec()
>> elapsed time: 0.084605794 seconds (160000664 bytes allocated)
>>
>> So it all seems pretty consistent to me. Matlab is a little better at
>> vectorized code, as they presumably have better memory caching. But
>> explicit devectorization using the @devec macro still performs best.
>> So using vectorized code in Julia is fine and "reasonably fast". If
>> someone wants to do performance tweaking, I don't see the issue with
>> telling him about devectorization.
>
> Ahah!!! I was sure of it: we are not talking about the same thing. To
> me,
>     @devec y = x + x .* x + x .* x
> is actually *vectorized* code :). When I talk about devectorizing
> code, I only mean explicit loops. It's a shame that I only paid
> attention to Devectorize.jl last night. This thing is awesome and it
> should be a great place to contribute to.
>
> This should be the very first answer to "this part of my code is too
> slow".
>
> Regarding the benchmarks you've done, thanks. Without evidence, no
> discussion. I agree.
>
> But there are two problems with your benchmarks. Firstly, you've not
> repeated them and therefore can't associate an uncertainty with them.
> Maybe the Matlab code is not actually faster.
> Secondly, what if Matlab or Julia actually spends most of its time
> generating the random vector?
>
> I'd recommend that you repeat your measurements and directly compare
> the estimated distribution functions. I've done just that for your
> simple code. I just put x = rand(...) outside of the function each
> time. I also created devectorized versions of your code (I mean, with
> explicit loops written by myself), once with ".*" as the
> multiplication operator and once with "*". The resulting kernel
> densities can be found attached.
>
> As you can see, the resulting functions are not even close to a
> Gaussian. Normality tests failed for each of those distributions. How
> to explain that? Easy: a typical desktop computer does nothing most of
> the time. Once you launch something, it spends its time on that, but
> every now and then it needs to spend some CPU time on something else.
> Therefore, there is a mode (the most probable execution time) and a
> tail that is longer on the side of increasing times.
>
> I just got the time using tic(), toc(), so for each of the thousand
> repetitions of the exact same calculation, I could follow the
> execution time in real time. The fact that "Explicit loop (*)" has a
> bigger and stranger tail is directly related to the fact that I used
> my mouse quite intensively during that run. What does that mean?
>
> 1) One must repeat calculations for benchmarking.
> 2) Calculating the mean of the repetitions is useless because it is
>    not a good estimator of the mode of the distribution.
>
> Point 1 is obvious, but point 2 is not: please consider the second
> plot attached, which is a zoomed-in part of the first one. As you can
> see, the mode of the "Explicit loop (*)" curve is positioned slightly
> to the left of the "Explicit loop (.*)" curve. This certainly means
> that for scalars, "*" has less overhead than ".*" (maybe just a couple
> of extra "if"s or an extra function call).
> However, because I shook my mouse (on purpose) during that run (it
> was so funny to see the execution time jump on the terminal), it has
> a bigger asymmetric tail.
>
> The result? The ".*" version is on "average" 3% faster than the "*"
> version.
>
> The problem? This is not a function benchmark, this is a system load
> benchmark.
>
> Oh my, I've done it again, sorry for the long post.
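For anyone who wants to try the "repeat it and look at the distribution" approach from the quoted message, here is a minimal sketch. It assumes current Julia syntax (not the 0.3 prerelease used above) and a smaller vector than in the original benchmark so that it runs quickly. The names kernel and sample_times and the repetition count are made up for illustration. The minimum and the median are only cheap stand-ins for the mode; this is not the kernel density estimate described in the quoted message.

```julia
using Statistics  # standard library: mean, median

# Same expression as in the quoted benchmark. The input is passed in,
# so generating the random vector is not part of the timed region.
kernel(x) = x + x .* x + x .* x

# Hypothetical helper: call f(x) `reps` times and return each
# wall-clock time in seconds.
function sample_times(f, x; reps = 100)
    f(x)  # warm-up call so that JIT compilation is not measured
    times = Vector{Float64}(undef, reps)
    for i in 1:reps
        times[i] = @elapsed f(x)
    end
    return times
end

x = rand(1_000_000)   # smaller than the N = 10000000 used in the thread
t = sample_times(kernel, x)

# The mean is pulled upward by the slow tail (other processes, GC),
# while the minimum and the median are less sensitive to it.
println("min    = ", minimum(t), " s")
println("median = ", median(t), " s")
println("mean   = ", mean(t), " s")
```

Comparing the three printed numbers across a quiet run and a run with other activity on the machine should give a rough feel for how much the tail moves the mean.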