[GitHub] TaoLv commented on issue #10580: inference results unstable in mxnet_mkl-1.2.0b20180416
TaoLv commented on issue #10580: inference results unstable in mxnet_mkl-1.2.0b20180416
URL: https://github.com/apache/incubator-mxnet/issues/10580#issuecomment-383007445

Please ignore that. @zheng-da has submitted the final fix in #10624. You can try the nightly build after that PR is merged. Thanks.

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards, Apache Git Services
[GitHub] TaoLv commented on issue #10580: inference results unstable in mxnet_mkl-1.2.0b20180416
TaoLv commented on issue #10580: inference results unstable in mxnet_mkl-1.2.0b20180416
URL: https://github.com/apache/incubator-mxnet/issues/10580#issuecomment-382780304

Update: @dwSun, could you try this branch and check whether the issue is still there?
https://github.com/TaoLv/incubator-mxnet/tree/fix-SetMKLMem

Thanks.
[GitHub] TaoLv commented on issue #10580: inference results unstable in mxnet_mkl-1.2.0b20180416
TaoLv commented on issue #10580: inference results unstable in mxnet_mkl-1.2.0b20180416
URL: https://github.com/apache/incubator-mxnet/issues/10580#issuecomment-382236136

On the latest master, the issue with @dwSun's script can be resolved by removing the line below from mkldnn_convolution.cc:
https://github.com/apache/incubator-mxnet/blob/master/src/operator/nn/mkldnn/mkldnn_convolution.cc#L286

```
weight.MKLDNNDataReorderAsync(fwd.fwd_pd.weights_primitive_desc());
```

However, this change only works for inference, so we still need a more comprehensive solution. I feel that we can't push an async operation to the execution engine from an operator body: it may cause a deadlock or a data race. @zheng-da @pengzhao-intel, please take a look.
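The concern about asynchronous work started from inside an operator can be illustrated with a small, self-contained sketch. This is not MXNet's execution engine or the MKL-DNN reorder: `reorder_in_place`, `layout_sensitive_sum`, and the `gate` event are hypothetical stand-ins, and the gate exists only to make the timing deterministic for the demonstration. In a real engine the timing is not under the reader's control, which is what would make the result unstable.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def reorder_in_place(buf, gate):
    """Hypothetical stand-in for an async layout change; runs once the gate opens."""
    gate.wait()
    buf.reverse()


def layout_sensitive_sum(buf):
    """Hypothetical stand-in for a kernel whose result depends on the buffer layout."""
    return sum(i * v for i, v in enumerate(buf))


def read_without_waiting():
    """Read the buffer once before and once after the pending task has run."""
    weights = [1.0, 2.0, 3.0]
    gate = threading.Event()
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(reorder_in_place, weights, gate)
        early = layout_sensitive_sum(weights)  # task is still blocked on the gate
        gate.set()                             # let the task run
        pending.result()                       # wait until it has finished
        late = layout_sensitive_sum(weights)   # buffer now has the new layout
    return early, late


def read_after_waiting():
    """Wait for the pending task before reading, so the layout is always settled."""
    weights = [1.0, 2.0, 3.0]
    gate = threading.Event()
    gate.set()
    with ThreadPoolExecutor(max_workers=1) as pool:
        pool.submit(reorder_in_place, weights, gate).result()
    return layout_sensitive_sum(weights)


early, late = read_without_waiting()
print("read before task ran:", early)
print("read after task ran: ", late)
print("read after waiting:  ", read_after_waiting())
```

The two values returned by `read_without_waiting` differ only because of when the read happened relative to the pending task. Waiting on the task before reading, as in `read_after_waiting`, removes that dependence in this toy setting; whether an equivalent wait is safe inside an operator body is exactly the open question raised above.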
[GitHub] TaoLv commented on issue #10580: inference results unstable in mxnet_mkl-1.2.0b20180416
TaoLv commented on issue #10580: inference results unstable in mxnet_mkl-1.2.0b20180416
URL: https://github.com/apache/incubator-mxnet/issues/10580#issuecomment-382069203

I guess yes, but I'm not sure. I enabled MXNET_MKLDNN_DEBUG but got no complaints. We still need a minimal case to reproduce it.
[GitHub] TaoLv commented on issue #10580: inference results unstable in mxnet_mkl-1.2.0b20180416
TaoLv commented on issue #10580: inference results unstable in mxnet_mkl-1.2.0b20180416
URL: https://github.com/apache/incubator-mxnet/issues/10580#issuecomment-381926332

Is it possible for you to provide a model that reproduces this issue reliably? Thanks.
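A repeatable case for an "unstable inference" report usually comes down to running the same input through the same model several times and comparing the outputs. The sketch below shows one way such a check might look. It uses only the Python standard library; `check_repeatable`, `stable_predict`, and `make_unstable_predict` are hypothetical names, and the two `predict` functions are toy stand-ins for a real model's forward pass, not MXNet calls.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length sequences."""
    return max(abs(x - y) for x, y in zip(a, b))


def check_repeatable(predict, sample, runs=10, tol=0.0):
    """Call predict(sample) `runs` times and compare each result with the first.

    Returns (is_stable, worst_diff), where worst_diff is the largest deviation
    from the first result seen across all later runs.
    """
    reference = list(predict(sample))
    worst = 0.0
    for _ in range(runs - 1):
        worst = max(worst, max_abs_diff(reference, list(predict(sample))))
    return worst <= tol, worst


def stable_predict(x):
    """Toy forward pass that always returns the same output for the same input."""
    return [2.0 * v for v in x]


def make_unstable_predict():
    """Toy forward pass that perturbs its output on every third call."""
    calls = {"n": 0}

    def predict(x):
        calls["n"] += 1
        jitter = 0.5 if calls["n"] % 3 == 0 else 0.0
        return [2.0 * v + jitter for v in x]

    return predict


sample = [1.0, 2.0, 3.0]
print("stable:  ", check_repeatable(stable_predict, sample))
print("unstable:", check_repeatable(make_unstable_predict(), sample))
```

To apply this to a real model, `predict` would wrap the model's forward call and return the output as a flat sequence of floats. A small nonzero `tol` may be appropriate when bit-exact results are not expected.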