## Description

The current design of DeepNumpy's random module follows native NumPy in its interpretation of the `size` parameter: `size` specifies the final output shape of the sampling operation, and parameter tensors narrower or smaller than `size` are automatically broadcast to the output's shape. However, this mechanism makes i.i.d. sampling somewhat tricky. For example:

```
loc = loc_net(x)
scale = scale_net(x)
N = 10
# Generate N samples from the network-parameterized Gaussian
np.random.normal(loc, scale, (N,) + loc.shape)
```

A problem arises in symbolic mode, as the shapes of `loc` and `scale` cannot be obtained in the frontend.
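To make the constraint concrete, here is a minimal sketch in plain NumPy (eager mode, where shapes are known). Note that `size` must spell out the *full* output shape, which is exactly the information that is unavailable in the symbolic frontend:

```python
import numpy as np

loc = np.zeros((3, 3))
scale = np.ones((3, 3))
N = 10

# Native NumPy semantics: `size` is the complete output shape, so the
# sample dimension must be prepended to the (known) parameter shape.
samples = np.random.normal(loc, scale, size=(N,) + loc.shape)
print(samples.shape)  # (10, 3, 3)
```

In eager mode this works because `loc.shape` is concrete; symbolically, `(N,) + loc.shape` cannot be computed, which is the gap the proposal below closes.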
## Solution

The following `InferShape` function resolves this issue (modified from https://github.com/apache/incubator-mxnet/blob/master/src/operator/numpy/random/dist_common.h#L143):

```
template <typename DistParam>
inline bool TwoparamsDistOpConcatShape(const nnvm::NodeAttrs &attrs,
                                       std::vector<TShape> *in_attrs,
                                       std::vector<TShape> *out_attrs) {
  const DistParam &param = nnvm::get<DistParam>(attrs.parsed);
  mxnet::TShape &low = (*in_attrs)[0];
  mxnet::TShape &high = (*in_attrs)[1];
  mxnet::TShape out(std::max(low.ndim(), high.ndim()), -1);
  if (in_attrs->size() == 2U) {
    // Both params from ndarray.
    InferBroadcastShape(low, high, &out);
  } else if (in_attrs->size() == 1U) {
    // One param from ndarray.
    out = in_attrs->at(0);
  } else if (in_attrs->size() == 0) {
    // Two-scalar case.
    out = TShape(0, -1);
  }
  if (param.size.has_value()) {
    // Size declared.
    std::vector<dim_t> oshape_vec;
    const mxnet::Tuple<int> &size = param.size.value();
    for (int i = 0; i < size.ndim(); ++i) {
      oshape_vec.emplace_back(size[i]);
    }
    for (int i = 0; i < out.ndim(); ++i) {
      oshape_vec.emplace_back(out[i]);
    }
    SHAPE_ASSIGN_CHECK(*out_attrs, 0, TShape(oshape_vec));
    for (size_t input_idx = 0; input_idx < in_attrs->size(); input_idx++) {
      CheckBroadcastable((*in_attrs)[input_idx], (*out_attrs)[0]);
    }
  } else {
    SHAPE_ASSIGN_CHECK(*out_attrs, 0, out);
  }
  return out_attrs->at(0).ndim() != 0U;
}
```

Note that the `FCompute` function can stay the same. The modified sampling method now produces the following result:

```
>>> loc = np.zeros((3,3))
>>> scale = np.ones((4,3,3))
>>> npx.random.normal_N(loc, scale, (2,)).shape
(2, 4, 3, 3)
```
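The shape rule implemented above can be summarized in a few lines of Python: broadcast the parameter shapes against each other, then *prepend* `size` to the result instead of treating `size` as the full output shape. The helper name `infer_concat_shape` below is hypothetical, used only to illustrate the rule:

```python
import numpy as np

def infer_concat_shape(size, *param_shapes):
    # Hypothetical sketch of TwoparamsDistOpConcatShape's shape rule:
    # broadcast all parameter shapes, then prepend `size` to the result.
    # Zero parameter shapes corresponds to the two-scalar case.
    out = np.broadcast_shapes(*param_shapes) if param_shapes else ()
    return tuple(size) + tuple(out)

# Mirrors the normal_N example above: size (2,) prepended to the
# broadcast of (3,3) and (4,3,3), i.e. (4,3,3).
print(infer_concat_shape((2,), (3, 3), (4, 3, 3)))  # (2, 4, 3, 3)
```

The final `CheckBroadcastable` loop in the C++ code has no counterpart here because `np.broadcast_shapes` already raises on incompatible parameter shapes.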