ptrendx opened a new pull request #16836: Fix InferAttr/InferShapeAttr not calling inference for all nodes in a graph
URL: https://github.com/apache/incubator-mxnet/pull/16836

## Description ##
Currently, an operator's type/shape inference is only called while its input and output attributes are not all known. This is a bug: some operators may never see the values of those input/output attributes during the InferAttr pass, and so never validate them.

As an example, this code snippet runs without throwing any exception in current MXNet:
```
import mxnet as mx

v = mx.sym.Variable('V')
s = mx.sym.transpose(v)   # output shape must be the reverse of V's shape
x = mx.sym.Variable('x')
s2 = x + v                # _Plus forces V to take x's shape, (2, 3)
s3 = s + s2               # _Plus forces s to be (2, 3) too, which transpose cannot produce
e = s3.simple_bind(ctx=mx.cpu(), x=(2,3), grad_req='null')
```
and it errors out only when calling `e.forward`. That is the behavior in 1.5.1; in 1.6.0 the new 2D transpose implementation removed the shape check inside the operator's forward function, so it does not error out even then.

This happens because the `transpose` node comes early in the topologically sorted graph and runs its shape inference (on unknown input and output shapes) before the `_Plus` operators do. The `_Plus` operators then set all of the shapes by themselves, which prevents `transpose` from running its shape inference again and noticing that those shapes are not valid for this operator.

In this PR, I track which operators have actually finished inferring attributes. Finishing includes validating the attributes an operator was given when all of its inputs/outputs are already known at the point its attribute inference runs.

## Checklist ##
### Essentials ###
Please feel free to remove inapplicable items for your PR.
- [x] Changes are complete (i.e. I finished coding on this PR)
- [x] All changes have test coverage:
  - Unit tests are added for small changes to verify correctness (e.g. adding a new operator)

## Comments ##
- Currently I treat an operator as finished inferring attributes if all of its input and output attributes are set after the call to the operator's InferShape/InferType. This check should not be necessary, since the value returned by those functions should be equivalent. However, because that return value was never actually checked, a lot of operators lie and return `true` even if not everything was inferred correctly. This should be tackled in a separate PR.
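
For readers who want to see the ordering problem from the Description in isolation, here is a minimal pure-Python sketch. It is a toy model, not MXNet's C++ InferAttr pass: `Node`, `infer_transpose`, `infer_elemwise`, `build_graph`, `run_inference` and the `track_finished` flag are all hypothetical names made up for illustration, and the loop simply repeats forward sweeps rather than reproducing the real pass's traversal. With `track_finished=False` a node is skipped once all of its shapes are known (the old behavior described above); with `track_finished=True` a node is skipped only after its own inference has run with everything known.

```python
from typing import Callable, Dict, List, Optional, Tuple

Shape = Optional[Tuple[int, ...]]
Shapes = Dict[str, Shape]


class Node:
    """Hypothetical stand-in for a graph node; not an MXNet class."""

    def __init__(self, name: str, ins: List[str], outs: List[str],
                 infer: Callable[[Shapes, List[str], List[str]], None]) -> None:
        self.name, self.ins, self.outs, self.infer = name, ins, outs, infer


def infer_transpose(shapes: Shapes, ins: List[str], outs: List[str]) -> None:
    src, dst = ins[0], outs[0]
    if shapes[src] is not None and shapes[dst] is not None:
        # Both sides known: this is the validation step that can get skipped.
        if shapes[dst] != shapes[src][::-1]:
            raise ValueError(f"transpose: {shapes[src]} -> {shapes[dst]} is invalid")
    elif shapes[src] is not None:
        shapes[dst] = shapes[src][::-1]
    elif shapes[dst] is not None:
        shapes[src] = shapes[dst][::-1]


def infer_elemwise(shapes: Shapes, ins: List[str], outs: List[str]) -> None:
    names = ins + outs
    known = {shapes[n] for n in names if shapes[n] is not None}
    if len(known) > 1:
        raise ValueError(f"elementwise op: mismatched shapes {known}")
    if known:
        shape = known.pop()
        for n in names:  # propagate the one known shape to every input/output
            shapes[n] = shape


def build_graph() -> Tuple[List[Node], Shapes]:
    # Mirrors the snippet in the Description: s = transpose(V); s2 = x + V; s3 = s + s2
    shapes: Shapes = {"V": None, "x": (2, 3), "s": None, "s2": None, "s3": None}
    nodes = [  # already in topological order
        Node("transpose", ["V"], ["s"], infer_transpose),
        Node("plus0", ["x", "V"], ["s2"], infer_elemwise),
        Node("plus1", ["s", "s2"], ["s3"], infer_elemwise),
    ]
    return nodes, shapes


def run_inference(nodes: List[Node], shapes: Shapes, track_finished: bool) -> Shapes:
    finished = [False] * len(nodes)

    def all_known(node: Node) -> bool:
        return all(shapes[n] is not None for n in node.ins + node.outs)

    while True:
        before = (dict(shapes), list(finished))
        for i, node in enumerate(nodes):
            # Old rule: skip when every attribute is already known.
            # New rule: skip only when this node's own inference has finished.
            skip = finished[i] if track_finished else all_known(node)
            if skip:
                continue
            node.infer(shapes, node.ins, node.outs)
            finished[i] = all_known(node)
        if (dict(shapes), finished) == before:  # nothing changed in this sweep
            return shapes
```

In this toy, calling `run_inference(*build_graph(), track_finished=False)` returns quietly with `s` set to `(2, 3)`, while `track_finished=True` makes the second sweep revisit `transpose` and raise `ValueError`. Note that `finished[i]` is derived from the attributes rather than from a return value of `infer`, which mirrors the choice explained in the Comments bullet above.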