Re: [julia-users] Julia Parallel Computing Optimization

2014-02-06 Thread David Salamon
You're welcome, glad to hear I was of use :) On Thu, Feb 6, 2014 at 12:23 PM, Alex C wrote: > David, > Thanks for your help and input. Greatly appreciated for a newbie like me. > > Alex > > On Wednesday, February 5, 2014 6:11:15 PM UTC-5, David Salamon wrote: > >> Hey Alex, >> >> Great catch on

Re: [julia-users] Julia Parallel Computing Optimization

2014-02-06 Thread Alex C
David, Thanks for your help and input. Greatly appreciated for a newbie like me. Alex On Wednesday, February 5, 2014 6:11:15 PM UTC-5, David Salamon wrote: > > Hey Alex, > > Great catch on #5 -- that was dumb on my part :) > > Re #4: turns out it's a known bug involving parallel code specifically

Re: [julia-users] Julia Parallel Computing Optimization

2014-02-05 Thread David Salamon
Hey Alex, Great catch on #5 -- that was dumb on my part :) Re #4: turns out it's a known bug involving parallel code specifically. I've updated https://github.com/JuliaLang/julia/issues/2669 to let them know it's no longer a theoretical discussion. Re: breaking the matrix into pieces -- splitting
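
The splitting idea sketched generically (this is an illustration only, not the code from the message; process_piece, the sizes, and the column split are all hypothetical):

    addprocs(2)                                          # start two workers
    @everywhere process_piece(piece) = sum(abs2(piece))  # placeholder per-piece work

    S = complex(rand(100, 1000), rand(100, 1000))        # stand-in data
    n = size(S, 2)
    nw = nworkers()
    ranges = [(1 + div((k-1)*n, nw)):div(k*n, nw) for k = 1:nw]
    partials = pmap(r -> process_piece(S[:, r]), ranges) # one column block per worker
    total = sum(partials)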

Re: [julia-users] Julia Parallel Computing Optimization

2014-02-05 Thread Alex C
1. Done 2. I am not sure what you mean by "running the parallel computation over D and E directly, then scaling it down afterwards", but I do the scaling element-wise now instead of at the end 3. I will consider it in the future 4. Done, although it won't allow me to name the output (A,B,C). It gives
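
One workaround consistent with item 4, sketched with placeholder loop terms: bind the reduction to a single variable, then destructure it (add_two_tuple as defined elsewhere in the thread).

    @everywhere add_two_tuple(x, y) = (x[1]+y[1], x[2]+y[2], x[3]+y[3])

    res = @parallel (add_two_tuple) for k = 1:100
        (1.0, float64(k), float64(k)^2)   # placeholder per-iteration 3-tuple
    end
    A, B, C = res                         # the three outputs, now named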

Re: [julia-users] Julia Parallel Computing Optimization

2014-02-05 Thread Alex C
Good catch! I thought I had checked that the operation was okay before I wrote the line of code. Guess not. On Tuesday, February 4, 2014 5:16:47 PM UTC-5, David Salamon wrote: > > Woah, also: > A = B = zeros(Float64,Ly,Lx); > > is almost surely not what you intended. > > julia> A = B = [1 2] > 1

Re: [julia-users] Julia Parallel Computing Optimization

2014-02-04 Thread David Salamon
Woah, also:

A = B = zeros(Float64,Ly,Lx);

is almost surely not what you intended.

julia> A = B = [1 2]
1x2 Array{Int64,2}:
 1  2

julia> A[1] = 10
10

julia> B
1x2 Array{Int64,2}:
 10  2

On Tue, Feb 4, 2014 at 2:15 PM, David Salamon wrote: > 1. You should change: > C = complex(zeros(Float64, Ly
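
The fix is two independent allocations; a minimal sketch (stand-in sizes):

    Ly, Lx = 4, 4
    A = zeros(Float64, Ly, Lx)   # one array
    B = zeros(Float64, Ly, Lx)   # a second, separate allocation
    A[1] = 10.0                  # B is unaffected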

Re: [julia-users] Julia Parallel Computing Optimization

2014-02-04 Thread David Salamon
1. You should change:

C = complex(zeros(Float64, Ly, Lx))

to:

C = zeros(Complex{Float64}, Ly, Lx)

[the way you are doing it there creates a float version, then a complex version, then trashes the float version] 2. The algorithm after the above change allocates 3 * limit * (limit/2) * samples * 16 bytes in the
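
Side by side, the allocation difference from item 1 (stand-in sizes for Ly and Lx):

    Ly, Lx = 4, 4
    C1 = complex(zeros(Float64, Ly, Lx))   # allocates a Float64 matrix, then a second
                                           # Complex matrix, discarding the first
    C2 = zeros(Complex{Float64}, Ly, Lx)   # allocates the Complex matrix directly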

Re: [julia-users] Julia Parallel Computing Optimization

2014-02-04 Thread Alex C
Sorry for the confusion. I was trying to get a simple example to work so I wouldn't get distracted by details. The @everywhere did the trick. This is the fastest parallel version of the code that I was able to get working. However, I easily run into memory limitations (8GB RAM) as I increase the

Re: [julia-users] Julia Parallel Computing Optimization

2014-02-04 Thread David Salamon
huh. maybe @everywhere in front of the function definition? I'm not sure On Tue, Feb 4, 2014 at 10:53 AM, Alex C wrote: > Thanks for the hint. Getting rid of 'mx' and 'my' definitely helps. > > I couldn't figure out how to implement the parallel version of tuple > adding. This is what I've got
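
That suggestion as code, using the tuple adder from the message below (without @everywhere the function exists only on process 1, and the remote workers fail when @parallel calls it):

    @everywhere add_two_tuple(x, y) = (x[1]+y[1], x[2]+y[2], x[3]+y[3])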

Re: [julia-users] Julia Parallel Computing Optimization

2014-02-04 Thread Alex C
Thanks for the hint. Getting rid of 'mx' and 'my' definitely helps. I couldn't figure out how to implement the parallel version of tuple adding. This is what I've got. It crashes my Julia Studio console when I try to run it. What am I missing?

add_two_tuple(x,y) = (x[1]+y[1], x[2]+y[2], x[3]+y[3])

Re: [julia-users] Julia Parallel Computing Optimization

2014-02-04 Thread David Salamon
Thanks Amit -- I think you just saved future me a lot of frustration :) On Mon, Feb 3, 2014 at 7:27 PM, Amit Murthy wrote: > Would like to mention that the non-reducer version of @parallel is > asynchronous. Before you can use Ans1 and Ans2, you should wait for > completion. > > For example, if

Re: [julia-users] Julia Parallel Computing Optimization

2014-02-03 Thread Amit Murthy
Would like to mention that the non-reducer version of @parallel is asynchronous. Before you can use Ans1 and Ans2, you should wait for completion. For example, if you need to time it, you can wrap it in a @sync block like this:

@time @sync begin
    @parallel for ...
    end
end

On Mon,
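
That pattern, runnable (array and loop body are placeholders):

    A = SharedArray(Float64, (100,))
    @time @sync begin
        @parallel for i = 1:100
            A[i] = i^2    # placeholder work; each index written exactly once
        end
    end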

Re: [julia-users] Julia Parallel Computing Optimization

2014-02-03 Thread David Salamon
I have no experience with it, but it looks like you could also just do:

Ans1 = SharedArray(Float64, (limit, int64(limit/2)))
Ans2 = SharedArray(Float64, (limit, int64(limit/2)))

@parallel for sample=1:samples, i=1:limit, j=1:int64(limit/2)
    Sx = S[i, sample]
    Sy = S[j, sample]
    Sxy = S[i+j, sample]
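
A sketch filling in the truncated loop from context (the Ans1/Ans2 updates are hypothetical, and the loop is reordered so each worker owns distinct rows i and no two workers write the same cell):

    limit, samples = 16, 8                       # stand-in sizes
    S = complex(rand(2*limit, samples), rand(2*limit, samples))

    Ans1 = SharedArray(Float64, (limit, int64(limit/2)))
    Ans2 = SharedArray(Float64, (limit, int64(limit/2)))

    @sync @parallel for i = 1:limit
        for sample = 1:samples, j = 1:int64(limit/2)
            Sx  = S[i, sample]
            Sy  = S[j, sample]
            Sxy = S[i+j, sample]
            Ans1[i, j] += abs2(Sx*Sy*conj(Sxy))  # hypothetical accumulation
            Ans2[i, j] += real(Sx*Sy*conj(Sxy))  # hypothetical accumulation
        end
    end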

Re: [julia-users] Julia Parallel Computing Optimization

2014-02-03 Thread David Salamon
Also S[:,1] is allocating. It should look something like:

for sample=1:samples, i=1:limit, j=1:int64(limit/2)
    Sx = S[i, sample]
    Sy = S[j, sample]
    Sxy = S[i+j, sample]
    ...
end

On Mon, Feb 3, 2014 at 8:45 AM, David Salamon wrote: > You're not out of the no-slicing woods yet. Looks li
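
The difference, concretely (copying slice vs. direct indexing, stand-in data):

    S = complex(rand(32, 8), rand(32, 8))

    col = S[:, 1]    # allocates: copies the whole column
    Sx  = col[3]

    Sx  = S[3, 1]    # no copy: indexes the matrix directly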

Re: [julia-users] Julia Parallel Computing Optimization

2014-02-03 Thread David Salamon
You're not out of the no-slicing woods yet. Looks like you can get rid of `mx` and `my`:

for i=1:limit, j=1:int64(limit/2)
    ...
end

As far as parallelizing, you could define:

three_tup_add(a, b, c) = (a[1] + b[1] + c[1], a[2] + b[2] + c[2], a[3] + b[3] + c[3])

and then do a @parallel (three_tup_add)
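
One note on arity: @parallel folds its reducer pairwise, so the two-argument form (like the add_two_tuple used elsewhere in the thread) is what the reducer actually receives. A minimal sketch with a placeholder loop body:

    @everywhere tup_add(a, b) = (a[1]+b[1], a[2]+b[2], a[3]+b[3])

    totals = @parallel (tup_add) for k = 1:1000
        (1.0, float64(k), float64(k)^2)   # placeholder per-iteration 3-tuple
    end
    # totals == (1000.0, sum of k, sum of k^2)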

Re: [julia-users] Julia Parallel Computing Optimization

2014-02-03 Thread Alex C
Thanks. I've re-written the function to minimize the amount of copying (i.e. slicing) that is required. But now, I'm befuddled as to how to parallelize this function using Julia. Any suggestions? Alex

function expensive_hat(S::Array{Complex{Float64},2}, mx::Array{Int64,2}, my::Array{Int64,2})

Re: [julia-users] Julia Parallel Computing Optimization

2014-02-03 Thread John Myles White
Just to be clear: in the future, Julia will not make copies during array slicing. But it does now, which can be costly. — John On Feb 3, 2014, at 7:01 AM, David Salamon wrote: > I agree with John about the insane amount of copying going on. However, I > added some @times to your code and it
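
Until that change lands, a view can stand in for the copy; a sketch using sub() from Base (range form for compatibility with the Julia of the time, stand-in data):

    S = rand(100, 10)

    col_copy = S[:, 1]                  # allocates a fresh copy of the column
    col_view = sub(S, 1:size(S, 1), 1)  # SubArray view: no copy, reads through to S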

Re: [julia-users] Julia Parallel Computing Optimization

2014-02-03 Thread David Salamon
I agree with John about the insane amount of copying going on. However, I added some @times to your code and it looks like most of the time is spent in conj. You probably want to precompute that for both B's and C's calculations.

function expensive_hat(S::Array{Complex{Float64},2}, mx::Array{Int64,2}
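
The precomputation being suggested, sketched (sizes and loop body are placeholders):

    limit, samples = 16, 8
    S = complex(rand(2*limit, samples), rand(2*limit, samples))
    conjS = conj(S)                 # one vectorized pass up front

    for sample = 1:samples, i = 1:limit, j = 1:int64(limit/2)
        Sxy_c = conjS[i+j, sample]  # cheap lookup instead of conj() in the hot loop
        # ... use Sxy_c in both B's and C's updates ...
    end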

Re: [julia-users] Julia Parallel Computing Optimization

2014-02-03 Thread John Myles White
One potential performance issue here is that the array indexing steps like S[:,i][my] currently produce copies, not references, which would slow things down. Someone with more expertise in parallel programming might have better suggestions than that. Have you tried profiling your code? http://
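
The profiler being suggested, in brief (placeholder workload; the sampling profiler ships with Julia):

    f(n) = sum(rand(n) .^ 2)   # placeholder workload
    @profile f(10^7)           # collect samples while f runs
    Profile.print()            # print the call tree with sample counts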

[julia-users] Julia Parallel Computing Optimization

2014-02-03 Thread Alex C
Hi, I am trying to port some Matlab code into Julia in order to improve performance. The Julia parallel code currently takes about 2-3x as long as my Matlab implementation. I am at wit's end as to how to improve the performance. Any suggestions? I tried using pmap but couldn't figure out how t
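
For the pmap attempt mentioned above, the basic shape is as follows (placeholder task and data; pmap fits a handful of large tasks, while @parallel fits many tiny iterations):

    addprocs(2)                             # start two worker processes
    @everywhere slow_task(x) = sum(x .^ 2)  # placeholder per-task work

    inputs  = [rand(1000) for k = 1:8]      # one chunk of work per element
    results = pmap(slow_task, inputs)       # chunks farmed out to the workers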