[GitHub] [incubator-mxnet] sxjscience commented on a change in pull request #14779: Fully connected, higher order grad

2019-07-26 Thread GitBox
sxjscience commented on a change in pull request #14779: Fully connected, 
higher order grad
URL: https://github.com/apache/incubator-mxnet/pull/14779#discussion_r307903135
 
 

 ##
 File path: src/operator/nn/fully_connected-inl.h
 ##
 @@ -249,6 +285,114 @@ void FullyConnectedGradCompute(const nnvm::NodeAttrs& attrs,
   }
 }
 
+
+
+///
+// Inputs are:
+// o_x_grad : head gradient for x_grad
+// o_w_grad : head gradient for w_grad
+// o_b_grad : if param.no_bias is false
+// o_y : head gradient of y
+//
+// outputs are:
+// o_y_grad : gradient of o_y
+// x_grad_grad : o_y *  o_w_grad
+// w_grad_grad : o_y.T * o_x_grad
+// b_grad_grad: if param.no_bias is false
+//
+// For implementation details see this PR: https://github.com/apache/incubator-mxnet/pull/14779
+
+/**
+ * Second order gradient for Fully Connected
+ * x_grad_grad = o_y * o_w_grad
+ * w_grad_grad = o_y.T * o_x_grad
+ *
+ * @tparam xpu
+ * @tparam DType
+ * @param attrs
+ * @param ctx
+ * @param inputs
+ * @param req
+ * @param outputs
+ */
+template<typename xpu, typename DType>
+void FullyConnectedGradGradCompute(const nnvm::NodeAttrs& attrs,
+                                   const OpContext& ctx,
+                                   const std::vector<TBlob>& inputs,
+                                   const std::vector<OpReqType>& req,
+                                   const std::vector<TBlob>& outputs) {
+  using namespace std;
+  using namespace fullc;
+  Stream<xpu> *stream = ctx.get_stream<xpu>();
+  const FullyConnectedParam& param = nnvm::get<FullyConnectedParam>(attrs.parsed);
+  const size_t num_inputs = param.no_bias ? 3U : 4U;
+  // inputs are: o_x_grad, o_w_grad, o_y  ||  o_x_grad, o_w_grad, o_b_grad, o_y
+  const size_t num_outputs = 3U;
+  CHECK_EQ(inputs.size(), num_inputs);
+  CHECK_EQ(outputs.size(), num_outputs);
+  CHECK_EQ(req.size(), num_outputs);
+
+  // inputs
+  Tensor<xpu, 2, DType> o_x_grad;
+  Tensor<xpu, 2, DType> o_w_grad;
+  Tensor<xpu, 2, DType> o_y;
+  // unused
+  // Tensor<xpu, 1, DType> o_b_grad;
+
+  // outputs
+  Tensor<xpu, 2, DType> o_y_grad;
+  TBlob o_y_grad_blob = outputs[kOyGrad];
+  Tensor<xpu, 2, DType> x_grad_grad;
+  Tensor<xpu, 2, DType> w_grad_grad;
+  Tensor<xpu, 1, DType> b_grad_grad;
+  size_t o_y_idx = std::numeric_limits<size_t>::max();
+  if (param.no_bias)
+    o_y_idx = kOy;
+  else
+    o_y_idx = kOyBias;
+  if (!param.flatten) {
+    o_x_grad = FlattenAs2DHead<xpu, DType>(inputs[kOxGrad], ctx);
+    o_w_grad = inputs[kOwGrad].get<xpu, 2, DType>(stream);
+    o_y = FlattenAs2DHead<xpu, DType>(inputs[o_y_idx], ctx);
+    x_grad_grad = FlattenAs2DHead<xpu, DType>(outputs[kXGradGrad], ctx);
+    w_grad_grad = FlattenAs2DHead<xpu, DType>(outputs[kWGradGrad], ctx);
+  } else {
+    o_x_grad = FlattenAs2DTail<xpu, DType>(inputs[kOxGrad], ctx);
+    o_w_grad = FlattenAs2DTail<xpu, DType>(inputs[kOwGrad], ctx);
+    o_y = inputs[o_y_idx].get<xpu, 2, DType>(stream);
+    x_grad_grad = FlattenAs2DTail<xpu, DType>(outputs[kXGradGrad], ctx);
+    w_grad_grad = FlattenAs2DTail<xpu, DType>(outputs[kWGradGrad], ctx);
+  }
+  linalg_gemm(o_y, o_w_grad, x_grad_grad, false, false, stream);
+  linalg_gemm(o_y, o_x_grad, w_grad_grad, true, false, stream);
 
 Review comment:
   It seems that we need to pass the `req` flag to `linalg_gemm`, because computing the gradient may trigger the `addTo` optimization.
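
A minimal sketch of that change, assuming the `linalg_gemm` overload that takes a trailing `OpReqType` argument (declared in `src/operator/linalg.h`); it reuses the tensors and output indices from the quoted hunk:

```cpp
// Sketch only: forward the request type so that kAddTo accumulates into the
// destination instead of overwriting it, and kNullOp skips the write.
linalg_gemm(o_y, o_w_grad, x_grad_grad, false, false, stream, req[kXGradGrad]);
linalg_gemm(o_y, o_x_grad, w_grad_grad, true, false, stream, req[kWGradGrad]);
```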

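For context, a sketch of where the two formulas documented in the quoted hunk come from, assuming the usual fully connected forward pass y = x W^T + b with first-order gradients x_grad = o_y W and w_grad = o_y^T x (notation as in the comment block above):

```latex
% Sketch: differentiate the first-order gradients against the head gradients
% o_x_grad (for x_grad) and o_w_grad (for w_grad), assuming y = x W^\top + b.
\[
\begin{aligned}
\mathrm{x\_grad\_grad} &= \frac{\partial}{\partial x}\,
  \bigl\langle \mathrm{o\_w\_grad},\; o_y^{\top} x \bigr\rangle
  = o_y \,\mathrm{o\_w\_grad}, \\
\mathrm{w\_grad\_grad} &= \frac{\partial}{\partial W}\,
  \bigl\langle \mathrm{o\_x\_grad},\; o_y\, W \bigr\rangle
  = o_y^{\top}\,\mathrm{o\_x\_grad}, \\
\mathrm{o\_y\_grad} &= \mathrm{o\_x\_grad}\, W^{\top} + x\,\mathrm{o\_w\_grad}^{\top}.
\end{aligned}
\]
```

The first two lines match the transpose flags of the two `linalg_gemm` calls in the hunk (`false, false` and `true, false`); the `o_y_grad` expression is not part of the quoted hunk and is only what the same derivation would suggest for that output.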



[GitHub] [incubator-mxnet] sxjscience commented on a change in pull request #14779: Fully connected, higher order grad

2019-06-26 Thread GitBox
sxjscience commented on a change in pull request #14779: Fully connected, 
higher order grad
URL: https://github.com/apache/incubator-mxnet/pull/14779#discussion_r297779647
 
 

 ##
 File path: tests/python/unittest/test_higher_order_grad.py
 ##
 @@ -129,6 +131,44 @@ def check_second_order_unary(x, op, grad_grad_op):
 # Validate the gradients.
 assert_almost_equal(expected_grad_grad, x.grad.asnumpy())
 
+class RandomShapes(object):
+    def __init__(self, dim):
+        self.dim = dim
+        self.curdim = 1
+
+    def __iter__(self):
+        return self
+
+    def next(self):
+        return self.__next__()
+
+    def __next__(self):
+        if self.curdim > self.dim:
+            raise StopIteration
+        shape = rand_shape_nd(self.curdim)
+        print(shape)
+        x = nd.random.normal(shape=shape)
+        self.curdim += 1
+        return x
+
+
+@with_seed()
+def test_dense_backward():
+    import mxnet.autograd as ag
+    import mxnet.ndarray as nd
+    for x in RandomShapes(5):
+        net = gluon.nn.Sequential()
+        with net.name_scope():
+            #net.add(gluon.nn.Dense(1, in_units=x.shape[1]))
+            net.add(gluon.nn.Dense(1))
+        net.initialize(mxnet.initializer.Constant(.5))
+        x.attach_grad()
+        with ag.record():
+            y = net.forward(x)
+            x_grad = ag.grad(y, x, create_graph=True, retain_graph=True)[0]
 
 Review comment:
   We'd better do something like the following to check the correctness.
   ```python
   with ag.record():
       y = net.forward(x)
       x_grad = ag.grad(y, x, create_graph=True, retain_graph=True)[0]
       z = (random_multiplier * x_grad).sum()
   z.backward()
   ```
   Then we check whether `x.grad` is correct.
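
To spell out what that checks: assuming `ag.grad` uses the default all-ones head gradient, `x_grad` is the gradient of `sum(y)`, so with `c` standing for `random_multiplier` the quantity being differentiated is

```latex
% Relation verified by the suggested check (c = random_multiplier; the
% all-ones default head gradient makes x_grad the gradient of sum(y)).
\[
  z = \sum_{i,j} c_{ij}\,
      \Bigl(\frac{\partial \sum y}{\partial x}\Bigr)_{ij},
  \qquad
  \texttt{x.grad} = \frac{\partial z}{\partial x},
\]
```

and `x.grad` can then be compared against a value worked out by hand for the constant-initialized Dense layer.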



