Zha0q1 opened a new issue #20265:
URL: https://github.com/apache/incubator-mxnet/issues/20265


   I was trying to build MXNet 1.x with MKLDNN and ACL (Arm Compute Library) 
integration on an Arm instance. I used this [cmake config 
file](https://github.com/apache/incubator-mxnet/blob/v1.x/config/distribution/linux_aarch64_cpu.cmake)
 to integrate MKLDNN with ACL. The build performed very well and should 
benefit MXNet users considerably: I got a 3-4X speedup with (16/64, 3, 512, 512) 
on Resnet compared to MKLDNN without ACL integration. However, two operator 
unit tests failed on this build:
   ```
   test_operator.test_convolution_grouping 
   test_operator.test_convolution_independent_gradients
   ```
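
For context, here is a minimal sketch of the property that, as far as I understand, `test_convolution_grouping` checks: a convolution with `num_group=g` should give the same output as slicing the input and weights into `g` channel groups, convolving each group independently, and concatenating the results. The sketch is pure Python, 1-D only, and omits the bias; the helper names (`conv1d`, `grouped_conv1d`, `sliced_conv1d`, `grouping_matches`) are hypothetical and are not MXNet APIs. The default shapes are assumptions chosen to be small.

```python
import random

def conv1d(x, w):
    # Plain "valid" cross-correlation.
    # x: [C_in][L], w: [C_out][C_in][K] -> out: [C_out][L - K + 1]
    c_in, length, k = len(x), len(x[0]), len(w[0][0])
    out = []
    for filt in w:
        row = []
        for pos in range(length - k + 1):
            row.append(sum(filt[c][j] * x[c][pos + j]
                           for c in range(c_in) for j in range(k)))
        out.append(row)
    return out

def grouped_conv1d(x, w, num_group):
    # Grouped convolution computed directly.
    # w: [C_out][C_in // num_group][K]; each filter sees only its group's channels.
    in_per = len(x) // num_group
    out_per = len(w) // num_group
    length, k = len(x[0]), len(w[0][0])
    out = []
    for o, filt in enumerate(w):
        base = (o // out_per) * in_per  # first input channel of this filter's group
        row = []
        for pos in range(length - k + 1):
            row.append(sum(filt[c][j] * x[base + c][pos + j]
                           for c in range(in_per) for j in range(k)))
        out.append(row)
    return out

def sliced_conv1d(x, w, num_group):
    # Reference path: slice channels, convolve each group, concatenate outputs.
    in_per = len(x) // num_group
    out_per = len(w) // num_group
    out = []
    for g in range(num_group):
        out.extend(conv1d(x[g * in_per:(g + 1) * in_per],
                          w[g * out_per:(g + 1) * out_per]))
    return out

def grouping_matches(num_group, channels=4, num_filter=4, length=9, kernel=3, seed=0):
    # Assumed shapes for illustration only.
    rng = random.Random(seed)
    x = [[rng.uniform(-1, 1) for _ in range(length)] for _ in range(channels)]
    w = [[[rng.uniform(-1, 1) for _ in range(kernel)]
          for _ in range(channels // num_group)]
         for _ in range(num_filter)]
    direct = grouped_conv1d(x, w, num_group)
    sliced = sliced_conv1d(x, w, num_group)
    return all(abs(a - b) < 1e-9
               for ra, rb in zip(direct, sliced)
               for a, b in zip(ra, rb))
```

Calling `grouping_matches(2)` compares the two paths for two groups. A backend whose grouped convolution diverges from the sliced reference would break this equality, which is the kind of mismatch the failing test would surface.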
   I tried multiple MKLDNN versions (v1.x currently points to MKLDNN release 
2.0 beta 10; I also tried releases 2.1 and 2.2) and ACL versions (20.08, from 
Aug 2020, and 21.2, the latest), and the failures persisted.
   
   This makes me suspect an integration issue between MXNet and MKLDNN (more 
likely) or between MKLDNN and ACL. Could someone from the Intel team share 
some insights on this?
   
   Would you help triage? @szha @leezu 
   
   CC @josephevans @mseth10 @waytrue17 @sandeep-krishnamurthy 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



