I'm thinking of a different design:

1. Master Python process builds and compiles all Theano functions as usual 
(for GPU), and pickles them.
2. Worker processes initialize on other GPUs and unpickle all the functions.
3. User calls wrapped Theano functions in the master process, which signals 
the workers.
4. Workers run an infinite loop, waiting for a signal of what to do (some 
switch statement; see the sketch after this list), e.g.:
    a. call some function (can take inputs from multiprocessing shared 
variables) and communicate result
    b. copy multiprocessing shared variables to update local theano GPU 
shared variables
    c. do collective GPU comms.
    d. etc.
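
To make step 4 concrete, here is a rough CPU-only sketch of the worker loop. 
Queue-based signaling and the command tags ("call"/"quit") are just 
placeholders for illustration, and per-GPU device selection (each worker 
would need its own theano device configuration) is omitted:

import pickle
import multiprocessing as mp

import numpy as np
import theano
import theano.tensor as T


def worker_loop(fn_bytes, cmd_queue, result_queue):
    # The worker is "dumb": it unpickles the already-compiled functions
    # and then just dispatches on command tags forever.
    fns = pickle.loads(fn_bytes)
    while True:
        tag, payload = cmd_queue.get()
        if tag == "call":            # (a) call a function, send back the result
            name, args = payload
            result_queue.put(fns[name](*args))
        elif tag == "quit":
            break
        # (b) copying multiprocessing shared memory into theano shared
        # variables, and (c) collective GPU comms, would be further branches.


if __name__ == "__main__":
    # Master builds and compiles everything as usual, then pickles it.
    x = T.vector("x")
    fns = {"double": theano.function([x], 2 * x)}
    fn_bytes = pickle.dumps(fns)

    cmd_q, res_q = mp.Queue(), mp.Queue()
    p = mp.Process(target=worker_loop, args=(fn_bytes, cmd_q, res_q))
    p.start()
    cmd_q.put(("call", ("double", [np.arange(3, dtype=theano.config.floatX)])))
    print(res_q.get())               # -> [ 0.  2.  4.]
    cmd_q.put(("quit", None))
    p.join()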

The workers are "dumb" and never have to bother with any graphs.  It's a 
bit of a pain to set up the multiprocessing shared variables (you have to 
declare the data sizes ahead of time), but not so bad.
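
For the record, the shared-memory setup I have in mind looks something like 
the following, with sizes fixed up front and np.frombuffer giving each 
process a zero-copy numpy view over the same buffer (no locking shown; 
RawArray is unsynchronized):

import multiprocessing as mp

import numpy as np

SHAPE = (4, 5)                                # must be declared ahead of time


def as_array(raw_buf, shape):
    # Zero-copy float32 view over the shared buffer.
    return np.frombuffer(raw_buf, dtype=np.float32).reshape(shape)


def worker(raw_buf):
    arr = as_array(raw_buf, SHAPE)
    arr += 1.0                                # visible to the master too


if __name__ == "__main__":
    raw = mp.RawArray("f", int(np.prod(SHAPE)))
    master_view = as_array(raw, SHAPE)
    master_view[:] = 0.0
    p = mp.Process(target=worker, args=(raw,))
    p.start()
    p.join()
    print(master_view[0, 0])                  # -> 1.0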

What I'm running into trouble with now is the Theano shared variables. 
They get unpickled under each function's input_storage, but each function 
ends up with a separate set of objects there.  I can manipulate them 
individually, but *is there a way to get multiple unpickled functions to 
refer to the same memory for corresponding shared variables?*  (Simply 
setting the input_storage entries to another function's does not work.)
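
Here is a minimal CPU-only repro of what I mean (I'm relying on the 
implicit shared input's container sitting at the end of input_storage, 
which is what I see for functions like these):

import pickle

import numpy as np
import theano
import theano.tensor as T

w = theano.shared(np.zeros(3, dtype=theano.config.floatX), name="w")
x = T.vector("x")
f = theano.function([x], x + w)
g = theano.function([x], x * w)

# Pickle and unpickle each function separately, as a worker would receive them.
f2 = pickle.loads(pickle.dumps(f))
g2 = pickle.loads(pickle.dumps(g))

# The two functions no longer share a container for w, nor its data.
print(f2.input_storage[-1] is g2.input_storage[-1])    # False
f2.input_storage[-1].data = np.ones(3, dtype=theano.config.floatX)
print(g2.input_storage[-1].data)                       # still all zeros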
