"global X; X=x" should probably be "global const X=x" and so on....

On Sun, Nov 23, 2014 at 1:33 PM, Amit Murthy <amit.mur...@gmail.com> wrote:

> I mentioned  localize_vars() since it is one of the differences between
> the implementations of @everywhere and @spawnat. But, there is something
> else also going on that I don't understand.
>
> On Sun, Nov 23, 2014 at 12:13 PM, Madeleine Udell <
> madeleine.ud...@gmail.com> wrote:
>
>> Yes, I read the code, but I'm not sure I understand what the let
>> statement is doing. Is it trying to redefine the scope of the variable,
>> or to create a new variable with the same value but in a different
>> scope? How does the let statement interact with the namespaces of the
>> various processes?
>>
>> On Sat, Nov 22, 2014 at 10:30 PM, Amit Murthy <amit.mur...@gmail.com>
>> wrote:
>>
>>> From the description of Base.localize_vars - 'wrap an expression in "let
>>> a=a,b=b,..." for each var it references'
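A serial sketch of what that `let` wrapping changes (variable names here are illustrative, not from the thread): a closure over a global refers to it by name and sees later rebindings, while a `let`-bound copy freezes the value at capture time, which is what you want when the expression is serialized to a worker.

```julia
g = 1
h1 = () -> g        # refers to the global g by name
h2 = let g = g      # fresh local binding holding g's current value
    () -> g         # this closure captures the let-bound g
end
g = 2
h1()   # → 2, follows the rebound global
h2()   # → 1, kept the value captured at let time
```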
>>>
>>> Though that does not seem to be the only(?) issue here....
>>>
>>> On Sun, Nov 23, 2014 at 11:52 AM, Madeleine Udell <
>>> madeleine.ud...@gmail.com> wrote:
>>>
>>>> Thanks! This is extremely helpful.
>>>>
>>>> Can you tell me more about what localize_vars does?
>>>>
>>>> On Sat, Nov 22, 2014 at 9:11 PM, Amit Murthy <amit.mur...@gmail.com>
>>>> wrote:
>>>>
>>>>> This works:
>>>>>
>>>>> function doparallelstuff(m = 10, n = 20)
>>>>>     # initialize variables
>>>>>     localX = Base.shmem_rand(m; pids=procs())
>>>>>     localY = Base.shmem_rand(n; pids=procs())
>>>>>     localf = [x->i+sum(x) for i=1:m]
>>>>>     localg = [x->i+sum(x) for i=1:n]
>>>>>
>>>>>     # broadcast variables to all worker processes
>>>>>     @sync begin
>>>>>         for i in procs(localX)
>>>>>             remotecall(i, x->(global X; X=x; nothing), localX)
>>>>>             remotecall(i, x->(global Y; Y=x; nothing), localY)
>>>>>             remotecall(i, x->(global f; f=x; nothing), localf)
>>>>>             remotecall(i, x->(global g; g=x; nothing), localg)
>>>>>         end
>>>>>     end
>>>>>
>>>>>     # compute
>>>>>     for iteration=1:1
>>>>>         @everywhere for i=localindexes(X)
>>>>>             X[i] = f[i](Y)
>>>>>         end
>>>>>         @everywhere for j=localindexes(Y)
>>>>>             Y[j] = g[j](X)
>>>>>         end
>>>>>     end
>>>>> end
>>>>>
>>>>> doparallelstuff()
>>>>>
>>>>> Though I would have expected broadcast of variables to be possible
>>>>> with just
>>>>> @everywhere X=localX
>>>>> and so on ....
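For what it's worth, later Julia versions added exactly this: `@everywhere` accepts `$`-interpolation, which splices a local value into the expression by value. A sketch on a current Julia (not the 0.3-era API used in the thread):

```julia
using Distributed
addprocs(1)

localX = rand(4)

# $-interpolation sends the *value* of localX along with the expression,
# so each worker ends up with a global X bound to a copy of that value.
@everywhere X = $localX

w = workers()[1]
```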
>>>>>
>>>>>
>>>>> Looks like @everywhere does not call localize_vars. I don't know if
>>>>> this is by design or just an oversight. I would have expected it to
>>>>> do so. Will file an issue on GitHub.
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Nov 23, 2014 at 8:24 AM, Madeleine Udell <
>>>>> madeleine.ud...@gmail.com> wrote:
>>>>>
>>>>>> The code block I posted before works, but throws an error when
>>>>>> embedded in a function: "ERROR: X not defined" (in first line of
>>>>>> @parallel). Why am I getting this error when I'm *assigning to* X?
>>>>>>
>>>>>> function doparallelstuff(m = 10, n = 20)
>>>>>>     # initialize variables
>>>>>>     localX = Base.shmem_rand(m)
>>>>>>     localY = Base.shmem_rand(n)
>>>>>>     localf = [x->i+sum(x) for i=1:m]
>>>>>>     localg = [x->i+sum(x) for i=1:n]
>>>>>>
>>>>>>     # broadcast variables to all worker processes
>>>>>>     @parallel for i=workers()
>>>>>>         global X = localX
>>>>>>         global Y = localY
>>>>>>         global f = localf
>>>>>>         global g = localg
>>>>>>     end
>>>>>>     # give variables same name on master
>>>>>>     X,Y,f,g = localX,localY,localf,localg
>>>>>>
>>>>>>     # compute
>>>>>>     for iteration=1:1
>>>>>>         @everywhere for i=localindexes(X)
>>>>>>             X[i] = f[i](Y)
>>>>>>         end
>>>>>>         @everywhere for j=localindexes(Y)
>>>>>>             Y[j] = g[j](X)
>>>>>>         end
>>>>>>     end
>>>>>> end
>>>>>>
>>>>>> doparallelstuff()
>>>>>>
>>>>>> On Fri, Nov 21, 2014 at 5:13 PM, Madeleine Udell <
>>>>>> madeleine.ud...@gmail.com> wrote:
>>>>>>
>>>>>>> My experiments with parallelism also occur in focused blocks; I
>>>>>>> think that's a sign that it's not yet as user friendly as it could be.
>>>>>>>
>>>>>>> Here's a solution to the problem I posed that's simple to use:
>>>>>>> @parallel + global can be used to broadcast a variable, while 
>>>>>>> @everywhere
>>>>>>> can be used to do a computation on local data (ie, without resending the
>>>>>>> data). I'm not sure how to do the variable renaming programmatically,
>>>>>>> though.
>>>>>>>
>>>>>>> # initialize variables
>>>>>>> m,n = 10,20
>>>>>>> localX = Base.shmem_rand(m)
>>>>>>> localY = Base.shmem_rand(n)
>>>>>>> localf = [x->i+sum(x) for i=1:m]
>>>>>>> localg = [x->i+sum(x) for i=1:n]
>>>>>>>
>>>>>>> # broadcast variables to all worker processes
>>>>>>> @parallel for i=workers()
>>>>>>>     global X = localX
>>>>>>>     global Y = localY
>>>>>>>     global f = localf
>>>>>>>     global g = localg
>>>>>>> end
>>>>>>> # give variables same name on master
>>>>>>> X,Y,f,g = localX,localY,localf,localg
>>>>>>>
>>>>>>> # compute
>>>>>>> for iteration=1:10
>>>>>>>     @everywhere for i=localindexes(X)
>>>>>>>         X[i] = f[i](Y)
>>>>>>>     end
>>>>>>>     @everywhere for j=localindexes(Y)
>>>>>>>         Y[j] = g[j](X)
>>>>>>>     end
>>>>>>> end
>>>>>>>
>>>>>>> On Fri, Nov 21, 2014 at 11:14 AM, Tim Holy <tim.h...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> My experiments with parallelism tend to occur in focused blocks,
>>>>>>>> and I haven't done it in quite a while. So I doubt I can help
>>>>>>>> much. But in general I suspect you're encountering these problems
>>>>>>>> because much of the IPC goes through thunks, and so a lot of stuff
>>>>>>>> gets reclaimed when execution is done.
>>>>>>>>
>>>>>>>> If I were experimenting, I'd start by trying to create
>>>>>>>> RemoteRef()s and put!()ing my variables into them. Then perhaps
>>>>>>>> you might be able to fetch them from other processes. Not sure
>>>>>>>> that will work, but it seems to be worth a try.
>>>>>>>>
>>>>>>>> HTH,
>>>>>>>> --Tim
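A sketch of that suggestion. On current Julia the `RemoteRef` machinery is spelled `RemoteChannel`, so the names below are the later API, not what the thread used:

```julia
using Distributed
addprocs(1)

# Process 1 owns the reference; put! stores a value in it.
rr = RemoteChannel(() -> Channel{Vector{Float64}}(1), 1)
put!(rr, rand(3))

# Any process can fetch (non-destructively) the stored value.
total = remotecall_fetch(r -> sum(fetch(r)), workers()[1], rr)
```

Because the reference, not the data, is what gets captured in the remote closure, the value survives beyond the lifetime of any one thunk.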
>>>>>>>>
>>>>>>>> On Thursday, November 20, 2014 08:20:19 PM Madeleine Udell wrote:
>>>>>>>> > I'm trying to use parallelism in Julia for a task with a
>>>>>>>> > structure that I think is quite pervasive. It looks like this:
>>>>>>>> >
>>>>>>>> > # broadcast lists of functions f and g to all processes so
>>>>>>>> > # they're available everywhere
>>>>>>>> > # create shared arrays X,Y on all processes so they're
>>>>>>>> > # available everywhere
>>>>>>>> > for iteration=1:1000
>>>>>>>> >     @parallel for i=1:length(X)
>>>>>>>> >         X[i] = f[i](Y)
>>>>>>>> >     end
>>>>>>>> >     @parallel for j=1:length(Y)
>>>>>>>> >         Y[j] = g[j](X)
>>>>>>>> >     end
>>>>>>>> > end
>>>>>>>> >
>>>>>>>> > I'm having trouble making this work, and I'm not sure where to
>>>>>>>> > dig around to find a solution. Here are the difficulties I've
>>>>>>>> > encountered:
>>>>>>>> >
>>>>>>>> > * @parallel doesn't allow me to create persistent variables on
>>>>>>>> > each process; ie, the following results in an error.
>>>>>>>> >
>>>>>>>> > s = Base.shmem_rand(12,3)
>>>>>>>> > @parallel for i=1:nprocs() m,n = size(s) end
>>>>>>>> > @parallel for i=1:nprocs() println(m) end
>>>>>>>> >
>>>>>>>> > * @everywhere does allow me to create persistent variables on
>>>>>>>> > each process, but doesn't send any data at all, including the
>>>>>>>> > variables I need in order to define new variables. Eg the
>>>>>>>> > following is an error: s is a shared array, but the variable s
>>>>>>>> > (ie the pointer to it) is apparently not shared.
>>>>>>>> >
>>>>>>>> > s = Base.shmem_rand(12,3)
>>>>>>>> > @everywhere m,n = size(s)
>>>>>>>> >
>>>>>>>> > Here are the kinds of questions I'd like to see prototype code
>>>>>>>> > for:
>>>>>>>> > * How can I broadcast a variable so that it is available and
>>>>>>>> > persistent on every process?
>>>>>>>> > * How can I create a reference to the same shared array "s"
>>>>>>>> > that is accessible from every process?
>>>>>>>> > * How can I send a command to be performed in parallel,
>>>>>>>> > specifying which variables should be sent to the relevant
>>>>>>>> > processes and which should be looked up in the local namespace?
>>>>>>>> >
>>>>>>>> > Note that everything I ask above is not specific to shared
>>>>>>>> > arrays; the same constructs would also be extremely useful in
>>>>>>>> > the distributed case.
>>>>>>>> >
>>>>>>>> > ----------------------
>>>>>>>> >
>>>>>>>> > An interesting partial solution is the following:
>>>>>>>> > funcs! = Function[x->x[:] = x+k for k=1:3]
>>>>>>>> > d = drand(3,12)
>>>>>>>> > let funcs! = funcs!
>>>>>>>> >   @sync @parallel for k in 1:3
>>>>>>>> >     funcs![myid()-1](localpart(d))
>>>>>>>> >   end
>>>>>>>> > end
>>>>>>>> >
>>>>>>>> > Here, I'm not sure why the let statement is necessary to send
>>>>>>>> > funcs!, since d is sent automatically.
>>>>>>>> >
>>>>>>>> > ---------------------
>>>>>>>> >
>>>>>>>> > Thanks!
>>>>>>>> > Madeleine
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Madeleine Udell
>>>>>>> PhD Candidate in Computational and Mathematical Engineering
>>>>>>> Stanford University
>>>>>>> www.stanford.edu/~udell
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>
>
