Re: [julia-users] efficient use of shared arrays and @parallel for
Hi. I am running into the same problems and have had no luck with the {Any} inference and SharedArrays. I am also seeing the same crashes, so I have to restart Julia after running any parallel code. I am looking forward to seeing Tim's comments about those type instabilities. Thanks.

On Friday, July 31, 2015 at 9:47:52 AM UTC-3, thr wrote:
> Yes, sure.
>
> [...]
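Regarding the {Any} inference when the array is constructed inside the function: whatever the root cause turns out to be, a general workaround is a function barrier. The outer function builds the arrays, and the hot loop lives in a separate kernel that Julia compiles for the concrete argument types it receives. The sketch below is only an illustration under assumptions: it uses the current `SharedArray{Float64}(dims)` constructor rather than the 2015 `SharedArray(Float64, dims)` form, the names `advect_kernel!` and `run_advection` are made up, and `q + u` stands in for `timestep`.

```julia
using SharedArrays

# Inner kernel (hypothetical name). It is compiled for the concrete types of
# q and u, even if the caller could not infer those types.
function advect_kernel!(q::AbstractArray{Float64,3}, u::AbstractArray{Float64,3})
    m, n, T = size(q)
    for t in 1:T-1, j in 3:n-2, i in 3:m-2
        # q + u is a stand-in for timestep(q, u) from the original post
        q[i, j, t+1] = q[i, j, t] + u[i, j, t]
    end
    return q
end

# Outer function (hypothetical name). It constructs the arrays and passes
# them through the function barrier to the kernel.
function run_advection(m, n, T)
    q = SharedArray{Float64}((m, n, T))
    u = SharedArray{Float64}((m, n, T))
    fill!(q, 1.0)   # illustrative initial values
    fill!(u, 0.5)
    return advect_kernel!(q, u)
end

r = run_advection(6, 6, 4)
```

Running `@code_warntype advect_kernel!(q, u)` on the kernel, rather than on the outer function, shows whether the loop itself is type-stable.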
Re: [julia-users] efficient use of shared arrays and @parallel for
Yes, sure.

On Friday, July 31, 2015 at 2:02:25 PM UTC+2, Tim Holy wrote:
> I have a demo to answer your question, but I'd like to simply add it to the
> documentation on SharedArrays. May I use some of your code in writing up the
> demo? (MIT license, see
> https://github.com/JuliaLang/julia/blob/master/CONTRIBUTING.md#improving-documentation)
>
> --Tim
>
> [...]
Re: [julia-users] efficient use of shared arrays and @parallel for
I have a demo to answer your question, but I'd like to simply add it to the documentation on SharedArrays. May I use some of your code in writing up the demo? (MIT license, see https://github.com/JuliaLang/julia/blob/master/CONTRIBUTING.md#improving-documentation)

--Tim

On Thursday, July 30, 2015 05:51:24 PM thr wrote:
> [...]
[julia-users] efficient use of shared arrays and @parallel for
Hi all,

I'm implementing a basic explicit advection algorithm of the form:

    for t = 1:T-1
        for j = 3:n-2
            for i = 3:m-2
                q[i,j,t+1] = timestep(q[i,j,t], u[i,j,t])
            end
        end
    end

where q is a quantity and u a velocity field.

I'd like to parallelize this by using shared arrays and @parallel for. I tried the following:

    const n = 500
    const m = 500
    const T = 500

    @everywhere function timestep(x,y)
        #return x+y
        return x+y +x+y +x+y +x+y +x+y +x+y +x+y
    end

    function advection_ser(q, u)
        println("==serial=$n x $m x $T")
        for t = 1:T-1
            for j = 3:n-2
                for i = 3:m-2
                    q[i,j,t+1] = timestep(q[i,j,t], u[i,j,t])
                end
            end
        end
        return q
    end

    function advection_par(q, u)
        println("==parallel=$n x $m x $T")
        for t = 1:T-1
            @sync @parallel for j = 3:n-2
                for i = 3:m-2
                    q[i,j,t+1] = timestep(q[i,j,t], u[i,j,t])
                end
            end
        end
        return q
    end

    q = SharedArray(Float64, (m,n,T), init=false)
    u = SharedArray(Float64, (m,n,T), init=false)

    @time qs = advection_ser(q,u)
    @time qp = advection_par(q,u)

But this yields only a very moderate speed gain: the parallel version is about 1/3 faster than the serial version for m,n,T=500,500,500 and -p 4. Is there a way I can improve on this?

I have also seen some weird behaviour regarding shared arrays, and I'd like to verify that I'm not just doing it wrong before opening issues:

1. When I construct q inside of the advection function, @code_warntype tells me that it's handled as an 'Any' and the code is much slower. However, typeof(q) tells me it's of type SharedArray{Float64,3}, as it should be.

2. I'm pretty sure there's a memory leak associated with SharedArrays, because when I run the above program over and over, I eventually get a bus error and julia crashes. Do I have to somehow release the shared memory from the workers?

Thanks in advance,
Johannes
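One possible way to cut the overhead, offered as a sketch rather than a definitive answer: the parallel version above starts and synchronizes a new @parallel loop for every one of the T-1 time steps. Because q[i,j,t+1] depends only on q[i,j,t] and u[i,j,t] at the same (i,j), each process can instead own a contiguous block of j values and sweep through all time steps without any intermediate synchronization. This follows the chunked pattern described in the shared-array section of the Julia manual. The code assumes current Julia (the `Distributed` and `SharedArrays` standard libraries and the `SharedArray{Float64}(dims)` constructor) rather than the 2015 syntax; the names `my_jrange`, `advection_chunk!`, `advection_local!` and `advection_shared!` are illustrative, and `timestep` is simplified to `x + y`.

```julia
using Distributed, SharedArrays
# Start Julia with `-p 4` (or call addprocs(4) here) to get worker processes.
# Without workers the code still runs, on the master process only.
@everywhere using SharedArrays

# Simplified stand-in for the timestep function in the original post.
@everywhere timestep(x, y) = x + y

# Block of interior j indices owned by the calling process.
@everywhere function my_jrange(q::SharedArray)
    idx = indexpids(q)              # position of this process in procs(q)
    idx == 0 && return 1:0          # this process does not map q
    nchunks = length(procs(q))
    js = 3:size(q, 2)-2             # interior columns, as in the post
    splits = [round(Int, s) for s in range(0, stop=length(js), length=nchunks+1)]
    return js[splits[idx]+1:splits[idx+1]]
end

# Advance all time steps for one block of columns.
@everywhere function advection_chunk!(q, u, jrange)
    m, T = size(q, 1), size(q, 3)
    for t in 1:T-1, j in jrange, i in 3:m-2
        q[i, j, t+1] = timestep(q[i, j, t], u[i, j, t])
    end
    return nothing
end

# Each process works out its own block and updates it.
@everywhere advection_local!(q, u) = advection_chunk!(q, u, my_jrange(q))

# Driver: one remote call per process, one synchronization at the end.
function advection_shared!(q, u)
    @sync for p in procs(q)
        @async remotecall_wait(advection_local!, p, q, u)
    end
    return q
end

# Small usage example with illustrative sizes and values.
m, n, T = 8, 8, 4
q = SharedArray{Float64}((m, n, T)); fill!(q, 1.0)
u = SharedArray{Float64}((m, n, T)); fill!(u, 0.5)
advection_shared!(q, u)
```

Splitting along j keeps each process writing to its own contiguous columns for the whole run, so the only synchronization is the single `@sync` in the driver. Whether this actually beats the serial loop for a cheap `timestep` still depends on memory bandwidth and the number of workers.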