matteosal commented on issue #20109:
URL: 
https://github.com/apache/incubator-mxnet/issues/20109#issuecomment-812614000


   OK, I looked into how the Python wrappers 
([symbol.py](https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/symbol.py)
 and 
[executor.py](https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/executor.py))
 were adapted to work with the cached op API, and I'm starting to get a picture 
of the current design. Some questions:
   
   * One major difference is that while the old C executor required arrays 
at construction time, the cached op requires them at evaluation time 
(`MXInvokeCachedOp`). The cached op also has a `static_alloc` flag. This 
suggests that when the flag is set, the graph is actually "compiled" and 
shipped to the execution device by `MXInvokeCachedOp`, similarly to what 
`MXExecutorBind` used to do. Is this reasoning correct? 
   * Following on from the previous point, calling `MXInvokeCachedOp` 
twice in a row with a CPU and a GPU device (and `static_alloc = true`) will 
allocate the graph in both system memory and GPU memory. Is there a way to 
selectively free the CPU or GPU copy?
   * Along the same lines, what happens if `MXInvokeCachedOp` is called 
multiple times with NDArrays of different shapes? Will the arrays share memory?
   * What is the purpose of the `forward_bulk_size` and `backward_bulk_size` flags?
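
   For reference, the Gluon-level entry point where these cached-op flags surface is `HybridBlock.hybridize()`, which forwards keyword arguments such as `static_alloc` to the underlying `CachedOp`. A minimal sketch, assuming an MXNet 1.x installation:

   ```python
   import mxnet as mx
   from mxnet.gluon import nn

   net = nn.Dense(4)
   net.initialize()
   # static_alloc=True asks the cached op to plan and reuse memory for
   # intermediates up front instead of re-allocating on every invocation;
   # static_shape=True additionally freezes the input shapes.
   net.hybridize(static_alloc=True, static_shape=True)

   x = mx.nd.ones((2, 8))
   y = net(x)  # the first call builds the cached graph; later calls reuse it
   print(y.shape)
   ```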


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
