[GitHub] [incubator-mxnet] sxjscience commented on issue #18078: A possible bug when computing a gradient vector
sxjscience commented on issue #18078: A possible bug when computing a gradient vector
URL: https://github.com/apache/incubator-mxnet/issues/18078#issuecomment-614406630

@zleyk22 Yes, it's a bug and it also exists in numpy.

```python
import mxnet as mx

mx.npx.set_np()
a = mx.np.array([1, 0])
a.attach_grad()
with mx.autograd.record():
    b = mx.np.prod(a)
b.backward()
print(a.grad)
```

Output:
```
[ 0. nan]
```

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
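For reference, the analytical gradient of `prod` with respect to `a_i` is the product of all the *other* elements, so the gradient at `[1, 0]` should be `[0. 1.]`, not `[0. nan]`. A minimal sketch of that expected value in plain NumPy (the function name is mine, and this deliberately avoids the buggy backward pass):

```python
import numpy as np

def expected_prod_grad(a):
    # d/da_i prod(a) = product of all elements except a_i
    return np.array([np.prod(np.delete(a, i)) for i in range(len(a))])

print(expected_prod_grad(np.array([1.0, 0.0])))  # [0. 1.]
```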
sxjscience commented on issue #18078: A possible bug when computing a gradient vector
URL: https://github.com/apache/incubator-mxnet/issues/18078#issuecomment-614766460

@zleyk22 I don't think anyone is currently looking at this issue. Would you like to try solving it? Thanks for pointing out the problem!
sxjscience commented on issue #18078: A possible bug when computing a gradient vector
URL: https://github.com/apache/incubator-mxnet/issues/18078#issuecomment-614921875

@zleyk22 I can think of two possible ways to solve this problem:

1) Use two cumsums:
   - a prefix cumsum `[\log a_0, \log a_0 + \log a_1, ..., \log a_0 + \log a_1 + ... + \log a_{n-3}]`,
   - a suffix cumsum `[\log a_2 + \log a_3 + ... + \log a_{n-1}, \log a_3 + ... + \log a_{n-1}, ..., \log a_{n-1}]`.

   Summing the k-th entries of the two cumsums gives the log of the product of all elements except `a_{k+1}` (the two boundary positions need only one of the cumsums). Here, I think we should take the log-sum approach to avoid the overflow/underflow problem of multiplying lots of numbers. (This is also the algorithm used to solve https://leetcode.com/problems/product-of-array-except-self/)

2) Detect 0s and give them special treatment. We may detect the positions of the zeros and update the gradient at these positions with the correct value.
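The two ideas above could be sketched in plain NumPy roughly as follows (function names are mine, and this is only an illustration, not the actual MXNet fix). Note one caveat for the first approach: `log(0)` is undefined, so the sketch uses prefix/suffix cumulative *products*, which have the same two-scan structure; with strictly positive inputs the identical code works on logs.

```python
import numpy as np

def prod_grad_two_scans(a):
    # Approach 1: combine a prefix scan and a suffix scan so that entry i
    # ends up holding the product of every element except a_i.
    a = np.asarray(a, dtype=np.float64)
    prefix = np.concatenate(([1.0], np.cumprod(a[:-1])))               # a_0 * ... * a_{i-1}
    suffix = np.concatenate((np.cumprod(a[::-1][:-1])[::-1], [1.0]))   # a_{i+1} * ... * a_{n-1}
    return prefix * suffix

def prod_grad_zero_aware(a):
    # Approach 2: use prod(a) / a_i in the common case, and give zeros
    # special treatment instead of letting 0/0 produce nan.
    a = np.asarray(a, dtype=np.float64)
    zeros = np.flatnonzero(a == 0)
    if zeros.size == 0:
        return np.prod(a) / a
    grad = np.zeros_like(a)
    if zeros.size == 1:                       # exactly one zero: only its slot
        i = zeros[0]                          # gets the product of the rest
        grad[i] = np.prod(np.delete(a, i))
    return grad                               # two or more zeros: all-zero gradient

for f in (prod_grad_two_scans, prod_grad_zero_aware):
    print(f([1.0, 0.0]))  # both print [0. 1.]
```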