Issue - https://github.com/JuliaLang/julia/issues/9219
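The thread below turns on one point: a `global` assignment inside a closure that was *defined* in a module lands in that module's namespace, not in `Main`, even though `remotecall` and `@everywhere` evaluate code under `Main`. A minimal single-process sketch of that binding behavior (the module name `Demo` is invented for illustration; this is not code from the thread):

```julia
# A closure created inside a module assigns globals into that module,
# not into Main -- which is why the workers later have to look the
# values up as ParallelStuff.X rather than plain X.
module Demo
make_setter() = x -> (global X = x; nothing)
end

setter = Demo.make_setter()
setter(42)

# The global landed in module Demo, and Main was untouched:
Demo.X                  # 42
isdefined(Main, :X)     # false
```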
On Tue, Dec 2, 2014 at 10:04 AM, Amit Murthy <amit.mur...@gmail.com> wrote:

> From the documentation - "Modules in Julia are separate global variable
> workspaces."
>
> So what is happening is that the anonymous function in
> "remotecall(i, x->(global const X=x; nothing), localX)" creates X as a
> module global.
>
> The following works:
>
> module ParallelStuff
> export doparallelstuff
>
> function doparallelstuff(m = 10, n = 20)
>     # initialize variables
>     localX = Base.shmem_rand(m; pids=procs())
>     localY = Base.shmem_rand(n; pids=procs())
>     localf = [x->i+sum(x) for i=1:m]
>     localg = [x->i+sum(x) for i=1:n]
>
>     # broadcast variables to all worker processes (thanks to Amit Murthy
>     # for suggesting this syntax)
>     @sync begin
>         for i in procs(localX)
>             remotecall(i, x->(global X=x; nothing), localX)
>             remotecall(i, x->(global Y=x; nothing), localY)
>             remotecall(i, x->(global f=x; nothing), localf)
>             remotecall(i, x->(global g=x; nothing), localg)
>         end
>     end
>
>     # compute
>     for iteration=1:1
>         @everywhere begin
>             X=ParallelStuff.X
>             Y=ParallelStuff.Y
>             f=ParallelStuff.f
>             g=ParallelStuff.g
>             for i=localindexes(X)
>                 X[i] = f[i](Y)
>             end
>             for j=localindexes(Y)
>                 Y[j] = g[j](X)
>             end
>         end
>     end
> end
>
> end #module
>
> While remotecall, @everywhere, etc. run under Main, the fact that the
> closure variable refers to module ParallelStuff is pretty confusing.
> I think we need a better way to handle this.
>
> On Tue, Dec 2, 2014 at 4:58 AM, Madeleine Udell
> <madeleine.ud...@gmail.com> wrote:
>
>> Thanks to Blake and Amit for some excellent suggestions! Both strategies
>> work fine when embedded in functions, but not when those functions are
>> embedded in modules.
>> For example, the following throws an error:
>>
>> @everywhere include("ParallelStuff.jl")
>> @everywhere using ParallelStuff
>> doparallelstuff()
>>
>> when ParallelStuff.jl contains the following code:
>>
>> module ParallelStuff
>> export doparallelstuff
>>
>> function doparallelstuff(m = 10, n = 20)
>>     # initialize variables
>>     localX = Base.shmem_rand(m; pids=procs())
>>     localY = Base.shmem_rand(n; pids=procs())
>>     localf = [x->i+sum(x) for i=1:m]
>>     localg = [x->i+sum(x) for i=1:n]
>>
>>     # broadcast variables to all worker processes (thanks to Amit Murthy
>>     # for suggesting this syntax)
>>     @sync begin
>>         for i in procs(localX)
>>             remotecall(i, x->(global const X=x; nothing), localX)
>>             remotecall(i, x->(global const Y=x; nothing), localY)
>>             remotecall(i, x->(global const f=x; nothing), localf)
>>             remotecall(i, x->(global const g=x; nothing), localg)
>>         end
>>     end
>>
>>     # compute
>>     for iteration=1:1
>>         @everywhere for i=localindexes(X)
>>             X[i] = f[i](Y)
>>         end
>>         @everywhere for j=localindexes(Y)
>>             Y[j] = g[j](X)
>>         end
>>     end
>> end
>>
>> end #module
>>
>> On 3 processes (julia -p 3), the error is as follows; essentially the
>> same "X not defined" trace appears interleaved on every process, followed
>> by a matching set of "Y not defined" traces:
>>
>> exception on 1: ERROR: X not defined
>>  in anonymous at no file
>>  in eval at /Users/vagrant/tmp/julia-packaging/osx10.7+/julia-master/base/sysimg.jl:7
>>  in anonymous at multi.jl:1310
>>  in run_work_thunk at multi.jl:621
>>  in run_work_thunk at multi.jl:630
>>  in anonymous at task.jl:6
>>
>> For comparison, the non-modularized version works:
>>
>> function doparallelstuff(m = 10, n = 20)
>>     # initialize variables
>>     localX = Base.shmem_rand(m; pids=procs())
>>     localY = Base.shmem_rand(n; pids=procs())
>>     localf = [x->i+sum(x) for i=1:m]
>>     localg = [x->i+sum(x) for i=1:n]
>>
>>     # broadcast variables to all worker processes (thanks to Amit Murthy
>>     # for suggesting this syntax)
>>     @sync begin
>>         for i in procs(localX)
>>             remotecall(i, x->(global const X=x; nothing), localX)
>>             remotecall(i, x->(global const Y=x; nothing), localY)
>>             remotecall(i, x->(global const f=x; nothing), localf)
>>             remotecall(i, x->(global const g=x; nothing), localg)
>>         end
>>     end
>>
>>     # compute
>>     for iteration=1:1
>>         @everywhere for i=localindexes(X)
>>             X[i] = f[i](Y)
>>         end
>>         @everywhere for j=localindexes(Y)
>>             Y[j] = g[j](X)
>>         end
>>     end
>> end
>>
>> doparallelstuff()
>>
>> On Mon, Nov 24, 2014 at 11:24 AM, Blake Johnson
>> <blakejohnso...@gmail.com> wrote:
>>> I use this macro to send variables to remote processes:
>>>
>>> macro sendvar(proc, x)
>>>     quote
>>>         rr = RemoteRef()
>>>         put!(rr, $x)
>>>         remotecall($proc, (rr)->begin
>>>             global $(esc(x))
>>>             $(esc(x)) = fetch(rr)
>>>         end, rr)
>>>     end
>>> end
>>>
>>> Though the solution above looks a little simpler.
>>>
>>> --Blake
>>>
>>> On Sunday, November 23, 2014 1:30:49 AM UTC-5, Amit Murthy wrote:
>>>>
>>>> From the description of Base.localize_vars - 'wrap an expression in
>>>> "let a=a,b=b,..." for each var it references'
>>>>
>>>> Though that does not seem to be the only(?) issue here....
>>>>
>>>> On Sun, Nov 23, 2014 at 11:52 AM, Madeleine Udell
>>>> <madelei...@gmail.com> wrote:
>>>>
>>>>> Thanks! This is extremely helpful.
>>>>>
>>>>> Can you tell me more about what localize_vars does?
>>>>>
>>>>> On Sat, Nov 22, 2014 at 9:11 PM, Amit Murthy <amit....@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> This works:
>>>>>>
>>>>>> function doparallelstuff(m = 10, n = 20)
>>>>>>     # initialize variables
>>>>>>     localX = Base.shmem_rand(m; pids=procs())
>>>>>>     localY = Base.shmem_rand(n; pids=procs())
>>>>>>     localf = [x->i+sum(x) for i=1:m]
>>>>>>     localg = [x->i+sum(x) for i=1:n]
>>>>>>
>>>>>>     # broadcast variables to all worker processes
>>>>>>     @sync begin
>>>>>>         for i in procs(localX)
>>>>>>             remotecall(i, x->(global X; X=x; nothing), localX)
>>>>>>             remotecall(i, x->(global Y; Y=x; nothing), localY)
>>>>>>             remotecall(i, x->(global f; f=x; nothing), localf)
>>>>>>             remotecall(i, x->(global g; g=x; nothing), localg)
>>>>>>         end
>>>>>>     end
>>>>>>
>>>>>>     # compute
>>>>>>     for iteration=1:1
>>>>>>         @everywhere for i=localindexes(X)
>>>>>>             X[i] = f[i](Y)
>>>>>>         end
>>>>>>         @everywhere for j=localindexes(Y)
>>>>>>             Y[j] = g[j](X)
>>>>>>         end
>>>>>>     end
>>>>>> end
>>>>>>
>>>>>> doparallelstuff()
>>>>>>
>>>>>> Though I would have expected broadcast of variables to be possible
>>>>>> with just
>>>>>>     @everywhere X=localX
>>>>>> and so on ....
>>>>>> Looks like @everywhere does not call localize_vars. I don't know if
>>>>>> this is by design or just an oversight; I would have expected it to
>>>>>> do so. Will file an issue on github.
>>>>>>
>>>>>> On Sun, Nov 23, 2014 at 8:24 AM, Madeleine Udell
>>>>>> <madelei...@gmail.com> wrote:
>>>>>>
>>>>>>> The code block I posted before works, but throws an error when
>>>>>>> embedded in a function: "ERROR: X not defined" (in the first line of
>>>>>>> @parallel). Why am I getting this error when I'm *assigning to* X?
>>>>>>>
>>>>>>> function doparallelstuff(m = 10, n = 20)
>>>>>>>     # initialize variables
>>>>>>>     localX = Base.shmem_rand(m)
>>>>>>>     localY = Base.shmem_rand(n)
>>>>>>>     localf = [x->i+sum(x) for i=1:m]
>>>>>>>     localg = [x->i+sum(x) for i=1:n]
>>>>>>>
>>>>>>>     # broadcast variables to all worker processes
>>>>>>>     @parallel for i=workers()
>>>>>>>         global X = localX
>>>>>>>         global Y = localY
>>>>>>>         global f = localf
>>>>>>>         global g = localg
>>>>>>>     end
>>>>>>>     # give variables same name on master
>>>>>>>     X,Y,f,g = localX,localY,localf,localg
>>>>>>>
>>>>>>>     # compute
>>>>>>>     for iteration=1:1
>>>>>>>         @everywhere for i=localindexes(X)
>>>>>>>             X[i] = f[i](Y)
>>>>>>>         end
>>>>>>>         @everywhere for j=localindexes(Y)
>>>>>>>             Y[j] = g[j](X)
>>>>>>>         end
>>>>>>>     end
>>>>>>> end
>>>>>>>
>>>>>>> doparallelstuff()
>>>>>>>
>>>>>>> On Fri, Nov 21, 2014 at 5:13 PM, Madeleine Udell
>>>>>>> <madelei...@gmail.com> wrote:
>>>>>>>
>>>>>>>> My experiments with parallelism also occur in focused blocks; I
>>>>>>>> think that's a sign that it's not yet as user friendly as it could
>>>>>>>> be.
>>>>>>>>
>>>>>>>> Here's a solution to the problem I posed that's simple to use:
>>>>>>>> @parallel + global can be used to broadcast a variable, while
>>>>>>>> @everywhere can be used to do a computation on local data (i.e.,
>>>>>>>> without resending the data). I'm not sure how to do the variable
>>>>>>>> renaming programmatically, though.
>>>>>>>> # initialize variables
>>>>>>>> m,n = 10,20
>>>>>>>> localX = Base.shmem_rand(m)
>>>>>>>> localY = Base.shmem_rand(n)
>>>>>>>> localf = [x->i+sum(x) for i=1:m]
>>>>>>>> localg = [x->i+sum(x) for i=1:n]
>>>>>>>>
>>>>>>>> # broadcast variables to all worker processes
>>>>>>>> @parallel for i=workers()
>>>>>>>>     global X = localX
>>>>>>>>     global Y = localY
>>>>>>>>     global f = localf
>>>>>>>>     global g = localg
>>>>>>>> end
>>>>>>>> # give variables same name on master
>>>>>>>> X,Y,f,g = localX,localY,localf,localg
>>>>>>>>
>>>>>>>> # compute
>>>>>>>> for iteration=1:10
>>>>>>>>     @everywhere for i=localindexes(X)
>>>>>>>>         X[i] = f[i](Y)
>>>>>>>>     end
>>>>>>>>     @everywhere for j=localindexes(Y)
>>>>>>>>         Y[j] = g[j](X)
>>>>>>>>     end
>>>>>>>> end
>>>>>>>>
>>>>>>>> On Fri, Nov 21, 2014 at 11:14 AM, Tim Holy <tim....@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> My experiments with parallelism tend to occur in focused blocks,
>>>>>>>>> and I haven't done it in quite a while, so I doubt I can help
>>>>>>>>> much. But in general I suspect you're encountering these problems
>>>>>>>>> because much of the IPC goes through thunks, and so a lot of stuff
>>>>>>>>> gets reclaimed when execution is done.
>>>>>>>>>
>>>>>>>>> If I were experimenting, I'd start by trying to create
>>>>>>>>> RemoteRef()s and put!()ing my variables into them. Then perhaps
>>>>>>>>> you might be able to fetch them from other processes. Not sure
>>>>>>>>> that will work, but it seems to be worth a try.
>>>>>>>>>
>>>>>>>>> HTH,
>>>>>>>>> --Tim
>>>>>>>>>
>>>>>>>>> On Thursday, November 20, 2014 08:20:19 PM Madeleine Udell wrote:
>>>>>>>>> > I'm trying to use parallelism in julia for a task with a
>>>>>>>>> > structure that I think is quite pervasive.
>>>>>>>>> > It looks like this:
>>>>>>>>> >
>>>>>>>>> > # broadcast lists of functions f and g to all processes so
>>>>>>>>> > # they're available everywhere
>>>>>>>>> > # create shared arrays X,Y on all processes so they're
>>>>>>>>> > # available everywhere
>>>>>>>>> > for iteration=1:1000
>>>>>>>>> >     @parallel for i=1:size(X)
>>>>>>>>> >         X[i] = f[i](Y)
>>>>>>>>> >     end
>>>>>>>>> >     @parallel for j=1:size(Y)
>>>>>>>>> >         Y[j] = g[j](X)
>>>>>>>>> >     end
>>>>>>>>> > end
>>>>>>>>> >
>>>>>>>>> > I'm having trouble making this work, and I'm not sure where to
>>>>>>>>> > dig around to find a solution. Here are the difficulties I've
>>>>>>>>> > encountered:
>>>>>>>>> >
>>>>>>>>> > * @parallel doesn't allow me to create persistent variables on
>>>>>>>>> > each process; i.e., the following results in an error.
>>>>>>>>> >
>>>>>>>>> > s = Base.shmem_rand(12,3)
>>>>>>>>> > @parallel for i=1:nprocs() m,n = size(s) end
>>>>>>>>> > @parallel for i=1:nprocs() println(m) end
>>>>>>>>> >
>>>>>>>>> > * @everywhere does allow me to create persistent variables on
>>>>>>>>> > each process, but doesn't send any data at all, including the
>>>>>>>>> > variables I need in order to define new variables. E.g. the
>>>>>>>>> > following is an error: s is a shared array, but the variable
>>>>>>>>> > (i.e. pointer to) s is apparently not shared.
>>>>>>>>> >
>>>>>>>>> > s = Base.shmem_rand(12,3)
>>>>>>>>> > @everywhere m,n = size(s)
>>>>>>>>> >
>>>>>>>>> > Here are the kinds of questions I'd like to see protocode for:
>>>>>>>>> > * How can I broadcast a variable so that it is available and
>>>>>>>>> > persistent on every process?
>>>>>>>>> > * How can I create a reference to the same shared array "s"
>>>>>>>>> > that is accessible from every process?
>>>>>>>>> > * How can I send a command to be performed in parallel,
>>>>>>>>> > specifying which variables should be sent to the relevant
>>>>>>>>> > processes and which should be looked up in the local namespace?
>>>>>>>>> >
>>>>>>>>> > Note that everything I ask above is not specific to shared
>>>>>>>>> > arrays; the same constructs would also be extremely useful in
>>>>>>>>> > the distributed case.
>>>>>>>>> >
>>>>>>>>> > ----------------------
>>>>>>>>> >
>>>>>>>>> > An interesting partial solution is the following:
>>>>>>>>> >
>>>>>>>>> > funcs! = Function[x->x[:] = x+k for k=1:3]
>>>>>>>>> > d = drand(3,12)
>>>>>>>>> > let funcs! = funcs!
>>>>>>>>> >     @sync @parallel for k in 1:3
>>>>>>>>> >         funcs![myid()-1](localpart(d))
>>>>>>>>> >     end
>>>>>>>>> > end
>>>>>>>>> >
>>>>>>>>> > Here, I'm not sure why the let statement is necessary to send
>>>>>>>>> > funcs!, since d is sent automatically.
>>>>>>>>> >
>>>>>>>>> > ---------------------
>>>>>>>>> >
>>>>>>>>> > Thanks!
>>>>>>>>> > Madeleine
>>>>>>>>
>>>>>>>> --
>>>>>>>> Madeleine Udell
>>>>>>>> PhD Candidate in Computational and Mathematical Engineering
>>>>>>>> Stanford University
>>>>>>>> www.stanford.edu/~udell
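A note on the last open question in the thread (why `let funcs! = funcs!` is needed): per Amit's quote of the `Base.localize_vars` description, `@parallel` wraps its expression in `let a=a,b=b,...`, turning free variables into local captures whose *values* are serialized to the workers; a bare global reference would instead have to be resolved by name on the remote process, where it may not exist. A single-process sketch of the difference between the two kinds of capture (variable names invented for illustration):

```julia
# A closure over a global reads the global's current value at call time;
# a closure created under `let x = x` captures a fresh local binding
# whose value is snapshotted where the closure is made.
x = 1
f = () -> x            # captures the global binding x
g = let x = x          # new local binding; value fixed here
    () -> x
end
x = 2

f()   # 2 -- sees the updated global
g()   # 1 -- sees the snapshotted value
```

It is the snapshotted, self-contained form that can be shipped to another process without relying on that process having a matching global.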