DickJC123 commented on issue #21111: URL: https://github.com/apache/incubator-mxnet/issues/21111#issuecomment-1204238899
Let me suggest a few things that may be involved in these results:

- The BatchNorm implementations may not update the moving mean and variance at the same time. Some might do it during training-forward, others during training-backward. This is OK in my mind and shouldn't affect the defined use case, where training-backward always follows training-forward.
- The beta and gamma are learned parameters, right? So they will change with each training iteration and affect subsequent inference outputs.
- Regarding the nan test: I wasn't aware that a 2D input `[1, 6]` was even supported. But if it is indeed supported, is it equivalent to `[1, 6, 1]`? A BatchNorm performed over 1 element might be problematic. The cuDNN moving variance is unbiased, which means it has had an *m* / (*m* − 1) factor applied to the population variance. For example, see: https://github.com/apache/incubator-mxnet/blob/master/tests/python/unittest/test_numpy_op.py#L1877-L1880
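
As a sanity check on the unbiased-variance point, here is a minimal NumPy sketch (my own illustration, not the actual cuDNN or MXNet code) showing the *m* / (*m* − 1) relationship and why normalizing over a single element is degenerate:

```python
import numpy as np

# The unbiased variance equals the population (biased) variance
# scaled by m / (m - 1), where m is the number of elements reduced over.
x = np.array([1.0, 2.0, 4.0, 7.0])
m = x.size

pop_var = x.var()             # population variance (ddof=0)
unbiased_var = x.var(ddof=1)  # unbiased variance (ddof=1)

# The two differ by exactly the m / (m - 1) factor.
assert np.isclose(unbiased_var, pop_var * m / (m - 1))

# With m == 1 the correction divides by zero, which is why a BatchNorm
# reduced over a single element (e.g. a [1, 6, 1] input normalized
# per channel) can poison an unbiased moving variance with nan.
single = np.array([3.0])
print(np.isnan(single.var(ddof=1)))  # True (0/0, with a RuntimeWarning)
```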
