Hi Amit: yes, the idea is to have just two DArrays, one each for the
previous and current iterations. I had some trouble assigning values
directly to a DArray (a setindex! error) and so had to write it like this.
Do you know any means around this?

Btw, the parallel code runs slower than the serial version even for just
one iteration.

On Sun, Jan 4, 2015 at 10:27 PM, Amit Murthy <amit.mur...@gmail.com> wrote:

> As written, this is creating a 1000 DArrays. I think you intended to have
> only 2 of them and swap values in each iteration?
>
>
> On Sunday, 4 January 2015 11:07:47 UTC+5:30, Amuthan A. Ramabathiran wrote:
>>
>> Hello: I recently started exploring the parallel capabilities of Julia
>> and I need some help in understanding and improving the performance a very
>> elementary parallel code using DArrays (I use Julia
>> version 0.4.0-dev+2431). The code pasted below (based essentially on
>> plife.jl) solves u''(x) = 0, x \in [0,1] with u(0) and u(1) specified,
>> using the 2nd order central difference approximation. The parallel version
>> of the code runs significantly slower than the serial version. It would be
>> nice if someone could point out ways to improve this and/or suggest an
>> alternative efficient version.
>>
>> function laplace_1D_serial(u::Array{Float64})
>>    N = length(u) - 2
>>    u_new = zeros(N)
>>
>>    for i = 1:N
>>       u_new[i] = 0.5(u[i] + u[i + 2])
>>    end
>>
>>    u_new
>> end
>>
>> function serial_iterate(u::Array{Float64})
>>    u_new = laplace_1D_serial(u)
>>
>>    for i = 1:length(u_new)
>>       u[i + 1] = u_new[i]
>>    end
>> end
>>
>> function parallel_iterate(u::DArray)
>>    DArray(size(u), procs(u)) do I
>>       J = I[1]
>>
>>       if myid() == 2
>>          local_array = zeros(length(J) + 1)
>>          for i = J[1] : J[end] + 1
>>             local_array[i - J[1] + 1] = u[i]
>>          end
>>          append!([float(u[1])], laplace_1D_serial(local_array))
>>
>>       elseif myid() == length(procs(u)) + 1
>>          local_array = zeros(length(J) + 1)
>>          for i = J[1] - 1 : J[end]
>>             local_array[i - J[1] + 2] = u[i]
>>          end
>>          append!(laplace_1D_serial(local_array), [float(u[end])])
>>
>>       else
>>          local_array = zeros(length(J) + 2)
>>          for i = J[1] - 1 : J[end] + 1
>>             local_array[i - J[1] + 2] = u[i]
>>          end
>>          laplace_1D_serial(local_array)
>>
>>       end
>>    end
>> end
>>
>> A sample run on my laptop with 4 processors:
>> julia> u = zeros(1000); u[end] = 1.0; u_distributed = distribute(u);
>>
>> julia> @time for i = 1:1000
>>          serial_iterate(u)
>>        end
>> elapsed time: 0.011452192 seconds (8300112 bytes allocated)
>>
>> julia> @time for i = 1:1000
>>          u_distributed = parallel_iterate(u_distributed)
>>        end
>> elapsed time: 4.461922218 seconds (190565036 bytes allocated, 10.17% gc
>> time)
>>
>> Thanks for your help!
>>
>> Cheers,
>> Amuthan
>>
>>
>>

Reply via email to