To be clear, you need to compare the final 'z', not the final 'A', to check 
whether your calculations are consistent. The matrix A does not change 
throughout this calculation, but the matrix z does.
Also, there is no parallelism with the @parallel loop unless you start 
Julia with 'julia -p N', where N is the number of worker processes you'd 
like to use.
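
If you do start Julia with worker processes, the idiomatic way to get a 
real product back from the loop is @parallel with a (*) reducer, which 
combines each worker's partial product. A minimal sketch (assuming 
Julia 0.4-era syntax and a much smaller exponent, just to illustrate):

# start Julia with: julia -p 4    (4 worker processes)
A = [1.0 1.0001; 1.0002 1.0003]

# the (*) reducer multiplies the workers' partial products together,
# and matrix multiplication is associative, so z really is A^1000
z = @parallel (*) for i in 1:1000
    A
end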

On Thursday, 21 July 2016 12:45:17 UTC-4, Ferran Mazzanti wrote:
>
> Hi Nathan,
>
> I posted the codes, so you can check if they do the same thing or not. 
> These went to separate cells in Jupyter, nothing more and nothing less.
> There's not a single line I didn't post. And yes, I understand your line 
> of reasoning; that's why I was astonished too.
> But I can't see what is making this huge difference, and I'd like to know :)
>
> Best,
>
> Ferran.
>
> On Thursday, July 21, 2016 at 6:31:57 PM UTC+2, Nathan Smith wrote:
>>
>> Hey Ferran, 
>>
>> You should be suspicious when your apparent speed-up surpasses the level 
>> of parallelism available on your CPU. It looks like your codes don't 
>> actually compute the same thing.
>>
>> I'm assuming you're trying to compute the matrix power of A 
>> (A^1000000000) by repeatedly multiplying by A. In your parallel code, 
>> each process gets a local copy of 'z' and uses that, so each process 
>> computes something like A^(1000000000 / # of procs). Check out this 
>> <http://docs.julialang.org/en/release-0.4/manual/parallel-computing/#parallel-map-and-loops>
>> section of the documentation on parallel map and loops to see what I mean.
>>
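>> A quick way to see the local-copy behaviour (a sketch, assuming 
>> Julia 0.4-era semantics):
>>
>> z = copy(A)
>> @parallel for i in 1:1000
>>     z *= A    # rebinds a worker-local copy of z, not the master's z
>> end
>> z == A        # true on the master: the loop never touched its z
>>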
>> That said, that doesn't completely explain your speed-up. You should 
>> also make sure that each part of your script is wrapped in a function, 
>> and that you 'warm up' each function by running it once before comparing 
>> timings.
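>>
>> For instance (a hypothetical sketch; 'matpow' is just an illustrative 
>> name):
>>
>> function matpow(A, n)
>>     z = A
>>     for i in 1:n
>>         z *= A
>>     end
>>     return z
>> end
>>
>> matpow(A, 10)            # first call compiles ("warms up") the function
>> @time matpow(A, 100000)  # the second, timed call excludes compilation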
>>
>> Cheers, 
>> Nathan
>>
>> On Thursday, 21 July 2016 12:00:47 UTC-4, Ferran Mazzanti wrote:
>>>
>>> Hi,
>>>
>>> mostly showing my astonishment, but I can't even understand the figures 
>>> in this stupid parallelization code
>>>
>>> A = [[1.0 1.0001];[1.0002 1.0003]]
>>> z = A
>>> tic()
>>> for i in 1:1000000000
>>>     z *= A
>>> end
>>> toc()
>>> A
>>>
>>> produces
>>>
>>> elapsed time: 105.458639263 seconds
>>>
>>> 2x2 Array{Float64,2}:
>>>  1.0     1.0001
>>>  1.0002  1.0003
>>>
>>>
>>>
>>> But then add @parallel in the for loop
>>>
>>> A = [[1.0 1.0001];[1.0002 1.0003]]
>>> z = A
>>> tic()
>>> @parallel for i in 1:1000000000
>>>     z *= A
>>> end
>>> toc()
>>> A
>>>
>>> and get 
>>>
>>> elapsed time: 0.008912282 seconds
>>>
>>> 2x2 Array{Float64,2}:
>>>  1.0     1.0001
>>>  1.0002  1.0003
>>>
>>>
>>> look at the elapsed time difference! And I'm running this on my Xeon 
>>> desktop, not even a cluster.
>>> Of course A-B reports
>>>
>>> 2x2 Array{Float64,2}:
>>>  0.0  0.0
>>>  0.0  0.0
>>>
>>>
>>> So is this what one should expect from this kind of simple 
>>> parallelization? If so, I'm definitely *in love* with Julia :):):)
>>>
>>> Best,
>>>
>>> Ferran.
