Neutron3529 edited a comment on issue #19649: URL: https://github.com/apache/incubator-mxnet/issues/19649#issuecomment-758323589
> After more tests, I found that the result also varies on RTX2080Ti on both MXNet 1.9.0 and MXNet 2.0.0.
> ~The result have 0.005 difference in the shallow layer. I think it will have more difference as the layer grows.~
>
> ```python
> import os
> # os.environ['MXNET_CUDNN_AUTOTUNE_DEFAULT'] = '0'
> import mxnet as mx
> import numpy as np
> from mxnet.gluon.model_zoo.vision.resnet import resnet18_v1
>
> def testrestnet():
>     ctx = mx.gpu(0)
>     mx_model = resnet18_v1(pretrained=True,ctx=ctx)
>     mx_model.hybridize()
>
>     x_mx = mx.nd.ones(shape=(1,3,224,224), ctx=ctx)
>
>     y_mx = mx_model.features[0:6](x_mx)
>
>     # the res is always 13064.977 on CPU
>     # the res varies on RTX2080Ti/RTX3090 on both MXNet 1.9.0 and 2.0.0 without
>     # MXNET_CUDNN_AUTOTUNE_DEFAULT=0: 13064.971, 13064.976
>     res = y_mx.asnumpy().sum()
>
>     print(res)
>
> if __name__ == '__main__':
>     testrestnet()
> ```

Have you tried running it with `NVIDIA_TF32_OVERRIDE=0 python`? The 3090 uses TF32 by default to accelerate training and testing, and setting `NVIDIA_TF32_OVERRIDE=0` disables it.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@mxnet.apache.org
For additional commands, e-mail: issues-h...@mxnet.apache.org
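
For anyone who wants to script the `NVIDIA_TF32_OVERRIDE=0` suggestion rather than type it in the shell, here is a minimal sketch. It assumes the variable has to be in the environment before the NVIDIA libraries load, so it launches the test in a child process instead of setting `os.environ` after import. The helper name `run_without_tf32` is made up for illustration and is not part of MXNet. Also note that TF32 only exists on Ampere and newer GPUs, so this should only matter for the RTX 3090 runs, not the RTX 2080 Ti ones.

```python
import os
import subprocess
import sys


def run_without_tf32(cmd):
    """Run `cmd` in a child process with NVIDIA_TF32_OVERRIDE=0 set.

    Hypothetical helper: copies the current environment, adds the
    override, and returns the finished CompletedProcess.
    """
    env = dict(os.environ)
    env["NVIDIA_TF32_OVERRIDE"] = "0"  # ask NVIDIA libraries not to use TF32
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)


if __name__ == "__main__":
    # Sanity check: the child process should see the override.
    check = [sys.executable, "-c",
             "import os; print(os.environ.get('NVIDIA_TF32_OVERRIDE'))"]
    print(run_without_tf32(check).stdout.strip())
```

To use it on the reproduction above, pass the path of the saved script instead of the sanity check, e.g. `run_without_tf32([sys.executable, "repro.py"])` (`repro.py` is a placeholder name), and compare the printed sum across a few runs with and without the override.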