Thank you @szha and @asmushetzel for looking through the RFC.

> Can you elaborate a bit more about specific use cases that this enables or 
> simplifies? Is there something that can't be done today that this would 
> enable? Are there major pain points that this would address compared to 
> hybrid-blocks? Etc..

The RFC is not so much about extending what is possible as about improving the 
user experience. A major issue of the existing API is that `mx.nd` and `mx.sym` 
are distinct and partially incompatible. The existing `HybridBlock` partially 
addresses the distinctness, but at the cost of making the incompatibility even 
more severe. Some of this is tracked in [[Bug] Inconsistency between 
HybridBlock and 
Block](https://github.com/apache/incubator-mxnet/issues/16279).
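To illustrate the split, here is a minimal sketch using plain-Python stand-ins (the class names `EagerArray` and `SymbolPlaceholder` are hypothetical, not MXNet classes): code that inspects concrete values runs imperatively but breaks on symbolic placeholders, which carry no data.

```python
# Hypothetical stand-ins illustrating the nd/sym split; not MXNet classes.

class EagerArray:
    """Imperative array: holds concrete data, analogous to mx.nd.NDArray."""
    def __init__(self, data):
        self.data = list(data)

    @property
    def shape(self):
        return (len(self.data),)

    def sum(self):
        return sum(self.data)


class SymbolPlaceholder:
    """Symbolic placeholder: no data, analogous to mx.sym.Symbol."""
    def __init__(self, name):
        self.name = name
    # No .shape, no .sum() -- value-dependent code cannot run on it.


def forward(x):
    # Fine imperatively, but breaks when x is only a symbolic placeholder:
    if x.sum() > 0:          # needs a concrete value
        return x.shape       # needs a concrete shape
    return None


print(forward(EagerArray([1, 2, 3])))    # works: (3,)
try:
    forward(SymbolPlaceholder("x"))
except AttributeError as e:
    print("symbolic mode fails:", e)
```

Any block written this way works under `Block` but not under hybridization, which is the kind of inconsistency the linked bug report collects.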

Unifying symbolic and imperative mode with deferred compute also works towards 
[[RFC] Introducing NumPy-compatible coding experience into 
MXNet](https://github.com/apache/incubator-mxnet/issues/14253). While with 
deferred compute we only trace a computational graph (as with current symbolic 
API), a logical next step is to support parsing the AST of user-provided 
implementations and hybridizing them directly, without tracing. You can find 
some more discussion on this in #14253. AST transformation also benefits from a 
unified interface, as separate imperative and symbolic frontends would be 
meaningless there.
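For intuition, deferred compute amounts to tracing: operator calls on arrays record graph nodes that can later be optimized and executed. A toy sketch of the idea (the `TracedArray` class and `topo_ops` helper are illustrative only, not MXNet internals):

```python
# Toy tracing sketch (not MXNet's implementation): operator calls on a
# traced array record graph nodes instead of computing immediately.

class TracedArray:
    def __init__(self, op, inputs=()):
        self.op = op          # name of the operator that produced this node
        self.inputs = inputs  # upstream TracedArray nodes

    def __add__(self, other):
        return TracedArray("add", (self, other))

    def __mul__(self, other):
        return TracedArray("mul", (self, other))


def topo_ops(node):
    """Return the operator names of the traced graph in post-order."""
    ops = []
    for inp in node.inputs:
        ops.extend(topo_ops(inp))
    ops.append(node.op)
    return ops


x = TracedArray("input")
y = TracedArray("input")
z = (x + y) * x       # the user writes ordinary imperative code
print(topo_ops(z))    # recorded trace: ['input', 'input', 'add', 'input', 'mul']
```

The limitation of tracing is that it only records the path actually taken; AST transformation, as discussed in #14253, would capture control flow as well.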

> First, should we restrict this mode to only apply to the new numpy arrays?

It may be feasible to also support the normal ndarray interface. That said, I 
suggest considering such support a bonus: providing backwards compatibility 
adds complexity for the existing ndarray interface that doesn't apply to the 
new numpy arrays. The final decision could be taken later.

> Since the deferred compute mode won't support reverse shape inference, new 
> blocks that implement the forward interface will not work without 
> implementing the parameter shape inference logic in infer_shape. This also 
> applies when migrating the existing Gluon blocks in our API. Since we have 
> plan to adopt numpy array in Gluon, the two changes can potentially happen at 
> the same time.

Agreed that both changes should happen at the same time.
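As a sketch of what block authors would then write, here is the `infer_shape` pattern with stub classes (the `Parameter` and `DenseLike` names are illustrative; the real Gluon API differs in details): a parameter whose shape is unknown at construction gets filled in from the first input's shape.

```python
# Sketch of the infer_shape contract using stub classes; the actual Gluon
# Parameter/Block API differs in details.

class Parameter:
    def __init__(self, shape):
        self.shape = shape    # may contain 0 for "unknown" dimensions


class DenseLike:
    """Dense-style block: the weight shape depends on the input's last dim."""
    def __init__(self, units):
        self.units = units
        self.weight = Parameter(shape=(units, 0))  # in-features unknown

    def infer_shape(self, x_shape):
        # Without backward shape inference, the block author states how
        # parameter shapes follow from input shapes.
        self.weight.shape = (self.units, x_shape[-1])


block = DenseLike(units=16)
block.infer_shape((32, 100))    # first input: batch 32, 100 features
print(block.weight.shape)       # (16, 100)
```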


> could you elaborate on what the changes are to the `infer_shape`, especially 
> on how and when it's invoked during deferred initialization?
 
No conceptual change to the existing `infer_shape` API is required. 
The current implementation, when `forward` is called imperatively, works as 
follows:

https://github.com/apache/incubator-mxnet/blob/4940ec0e7408fad2443f921131cf1ada72724c38/python/mxnet/gluon/block.py#L1084-L1097

where `_deferred_infer_shape` calls `infer_shape`.
Exactly the same logic applies in the proposed deferred compute mode. In line 
1091 a `DeferredInitializationError` will be caught, which is then handled by 
the user's implementation of `infer_shape`. If the user did not implement 
`infer_shape`, we raise a warning explaining that `infer_shape` must be 
implemented, given the lack of general backward shape inference support.
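The control flow just described can be sketched as follows (stub code, not the actual `gluon/block.py`; class and method names are simplified stand-ins):

```python
# Sketch of the deferred-initialization control flow (stub code, not the
# actual gluon/block.py): catch DeferredInitializationError, run the
# user's infer_shape, initialize parameters, then retry the forward call.

class DeferredInitializationError(Exception):
    pass


class Block:
    def __init__(self):
        self.weight_shape = None     # unknown until the first input is seen
        self.initialized = False

    def infer_shape(self, x):
        # User-implemented: derive the parameter shape from the input.
        self.weight_shape = (4, len(x))

    def _forward(self, x):
        if not self.initialized:
            raise DeferredInitializationError("parameters not initialized")
        return ("matmul", self.weight_shape, tuple(x))

    def __call__(self, x):
        try:
            return self._forward(x)
        except DeferredInitializationError:
            # Same handling as today's deferred initialization: infer
            # parameter shapes from the input, initialize, and retry.
            self.infer_shape(x)
            self.initialized = True
            return self._forward(x)


out = Block()([1.0, 2.0, 3.0])
print(out)    # ('matmul', (4, 3), (1.0, 2.0, 3.0))
```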

https://github.com/apache/incubator-mxnet/issues/16376#issuecomment-539227866
