Hey Ferran,

You should be suspicious whenever an apparent speed-up exceeds the level of parallelism available on your CPU. It looks like your two codes don't actually compute the same thing.
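A quick way to see this — a minimal sketch, assuming Julia 0.4-era syntax and that you start Julia with worker processes (e.g. `julia -p 2`; the process count is just an example):

```julia
# Each worker receives its own copy of z; mutating it inside the
# @parallel body never changes the master's z. @sync makes us wait
# for the workers so the comparison below is fair.
A = [1.0 1.0001; 1.0002 1.0003]
z = A
@sync @parallel for i in 1:10
    z *= A        # updates the worker-local copy only
end
println(z == A)   # the master's z was never touched
```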
I'm assuming you're trying to compute the matrix power A^1000000000 by repeatedly multiplying by A. In your parallel code, each process gets its own local copy of `z` and updates that copy, so each process computes something like A^(1000000000 / number of processes) — and none of those copies is ever sent back to the master process. See the section of the documentation on parallel map and loops to see what I mean: <http://docs.julialang.org/en/release-0.4/manual/parallel-computing/#parallel-map-and-loops>

That said, the copying alone doesn't completely explain your speed-up: `@parallel` used without a reduction also returns immediately, before the workers have finished, so your `toc()` is measuring almost nothing. You should also make sure that each part of your script is wrapped in a function, and that you 'warm up' each function by running it once before comparing timings, so JIT compilation doesn't distort the results.

Cheers,
Nathan

On Thursday, 21 July 2016 12:00:47 UTC-4, Ferran Mazzanti wrote:
>
> Hi,
>
> mostly showing my astonishment, but I can't even understand the figures in
> this stupid parallelization code
>
> A = [[1.0 1.0001];[1.0002 1.0003]]
> z = A
> tic()
> for i in 1:1000000000
>     z *= A
> end
> toc()
> A
>
> produces
>
> elapsed time: 105.458639263 seconds
>
> 2x2 Array{Float64,2}:
>  1.0     1.0001
>  1.0002  1.0003
>
> But then add @parallel in the for loop
>
> A = [[1.0 1.0001];[1.0002 1.0003]]
> z = A
> tic()
> @parallel for i in 1:1000000000
>     z *= A
> end
> toc()
> A
>
> and get
>
> elapsed time: 0.008912282 seconds
>
> 2x2 Array{Float64,2}:
>  1.0     1.0001
>  1.0002  1.0003
>
> Look at the elapsed time differences! And I'm running this on my Xeon
> desktop, not even a cluster.
> Of course A-B reports
>
> 2x2 Array{Float64,2}:
>  0.0  0.0
>  0.0  0.0
>
> So is this what one should expect from this kind of simple
> parallelizations? If so, I'm definitely *in love* with Julia :):):)
>
> Best,
>
> Ferran.
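P.S. For what it's worth, here is a sketch of both fixes in one place (Julia 0.4-era syntax; a smaller exponent so the entries stay finite; start Julia with workers, e.g. `julia -p 4` — the process count is just an example):

```julia
A = [1.0 1.0001; 1.0002 1.0003]

# The reduction form of @parallel combines each worker's partial
# product with *, and it blocks until the full result is ready:
z_par = @parallel (*) for i in 1:1000
    A
end
# z_par is A^1000, up to rounding from the different order in which
# the partial products are associated across workers.

# For timing, wrap the work in a function and call it once first so
# the measurement isn't dominated by JIT compilation:
function power_serial(A, n)
    z = eye(A)    # identity matrix of the same size (Julia 0.4 API)
    for i in 1:n
        z *= A
    end
    z
end
power_serial(A, 10)           # warm-up run
@time power_serial(A, 1000)   # now the timing is meaningful
```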