gilbertfrancois commented on issue #18751: URL: https://github.com/apache/incubator-mxnet/issues/18751#issuecomment-662756717
I suspect that the behaviour is corrected when the update of moving_mean and moving_var on GPU is done in the backward pass, like it is on CPU. It will solve the NaN values and will most likely make the quantitative output between CPU and GPU more consistent. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org