## Description
The current design of DeepNumpy's random module follows native NumPy in its 
interpretation of the parameter `size`.
More specifically, `size` indicates the final output shape of the sampling 
operation. Parameter tensors that are narrower or smaller than `size` are 
automatically broadcast to the output's shape.
However, this mechanism makes i.i.d. sampling a little tricky. For example:
```
loc = loc_net(x)
scale = scale_net(x)
N = 10
# Generate N samples from the network-parameterized gaussian
np.random.normal(loc, scale, (N,) + loc.shape)
```
A problem arises in symbolic models, as the shapes of `loc` and `scale` cannot 
be obtained in the frontend.
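To make the issue concrete, here is a small sketch in plain NumPy (not DeepNumpy) showing why the caller must already know the parameter shapes under the current `size` semantics:

```python
import numpy as np

loc = np.zeros((3, 3))
scale = np.ones((3, 3))

# In native NumPy, `size` must already be the full output shape,
# so the frontend has to know `loc.shape` to prepend the sample axis.
samples = np.random.normal(loc, scale, (10,) + loc.shape)

# Passing only the sample count fails, because (10,) does not
# broadcast against the (3, 3) parameter shapes.
try:
    np.random.normal(loc, scale, (10,))
    size_only_ok = True
except ValueError:
    size_only_ok = False
```

In imperative mode `loc.shape` is always available, but in a symbolic graph it is not, which is exactly where the `(N,) + loc.shape` idiom breaks down.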

## Solution

The following `InferShape` function could resolve this issue (modified from 
https://github.com/apache/incubator-mxnet/blob/master/src/operator/numpy/random/dist_common.h#L143):

```
template <typename DistParam>
inline bool TwoparamsDistOpConcatShape(const nnvm::NodeAttrs &attrs,
                                       std::vector<TShape> *in_attrs,
                                       std::vector<TShape> *out_attrs) {
  const DistParam &param = nnvm::get<DistParam>(attrs.parsed);
  // Broadcast shape of the distribution parameters (e.g. low/high or loc/scale).
  mxnet::TShape out(-1, -1);
  if (in_attrs->size() == 2U) {
    // Both params from ndarray; only dereference them in this branch
    // to avoid out-of-bounds access when fewer inputs are given.
    const mxnet::TShape &low = (*in_attrs)[0];
    const mxnet::TShape &high = (*in_attrs)[1];
    out = mxnet::TShape(std::max(low.ndim(), high.ndim()), -1);
    InferBroadcastShape(low, high, &out);
  } else if (in_attrs->size() == 1U) {
    // One param from ndarray.
    out = in_attrs->at(0);
  } else if (in_attrs->size() == 0) {
    // Both params are scalars.
    out = TShape(0, -1);
  }
  if (param.size.has_value()) {
    // `size` declared: prepend it to the broadcast parameter shape.
    std::vector<dim_t> oshape_vec;
    const mxnet::Tuple<int> &size = param.size.value();
    for (int i = 0; i < size.ndim(); ++i) {
      oshape_vec.emplace_back(size[i]);
    }
    for (int i = 0; i < out.ndim(); ++i) {
      oshape_vec.emplace_back(out[i]);
    }
    SHAPE_ASSIGN_CHECK(*out_attrs, 0, TShape(oshape_vec));
    for (size_t input_idx = 0; input_idx < in_attrs->size(); input_idx++) {
      CheckBroadcastable((*in_attrs)[input_idx], (*out_attrs)[0]);
    }
  } else {
    SHAPE_ASSIGN_CHECK(*out_attrs, 0, out);
  }
  return out_attrs->at(0).ndim() != 0U;
}
```

Note that the `FCompute` function can stay the same.
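The proposed rule can also be summarized in pure Python (a sketch only; `concat_infer_shape` is a hypothetical helper, not part of any MXNet API): the output shape is `size` concatenated with the broadcast of the parameter shapes.

```python
import numpy as np

def concat_infer_shape(size, *param_shapes):
    """Sketch of the proposed rule: output = size ++ broadcast(params).

    size: tuple of ints, or None when `size` is not declared.
    param_shapes: shapes of the ndarray distribution parameters.
    """
    # Broadcast the parameter shapes; scalars contribute an empty shape.
    bcast = np.broadcast_shapes(*param_shapes) if param_shapes else ()
    prefix = tuple(size) if size is not None else ()
    return prefix + bcast

# Matches the example below: size (2,) prepended to broadcast((3,3), (4,3,3)).
out = concat_infer_shape((2,), (3, 3), (4, 3, 3))
```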

The modified sampling method is now able to produce the following result:
```
>>> loc = np.zeros((3,3))
>>> scale = np.ones((4,3,3))
>>> npx.random.normal_N(loc, scale, (2,)).shape
(2, 4, 3, 3)
```



https://github.com/apache/incubator-mxnet/issues/16793