Thanks, Alex, for bringing up this proposal. As far as I know, with the MKL-DNN backend applied, MXNet is now the most performant framework on the CPU side, and the recent subgraph fusion feature in particular boosts performance considerably. Thus, I think it's worth making it the default and letting more users leverage its benefits.
Regarding MKL-DNN integration, it is joint work that took a lot of effort from Amazon and Intel engineers, including Da, Jun, Haibin, Junyuan, Sheng, Marco, Chris (AWS) and Patric, Tao, Wenting, Rong, Jin, Shufan, Ashok (Intel). We also got many great suggestions from the MXNet community and learned much from those discussions. Here I personally want to thank Da Zheng for his great efforts in this project. As the main contributor, he has played an important role in the project, from the initial co-design and implementation to the recent advanced subgraph feature, and he ultimately made these good things happen. I would also like to thank Alex for stabilizing the MKL-DNN backend by adding more tests for it, along with environment variables so that users can switch between the original flow and the MKL-DNN flow easily. His efforts are really helpful in pushing the MKL-DNN backend from experimental toward GA. The MXNet community is one of the best groups, and there are many intelligent people here. Thank you all for the strong support.

--Patric

> -----Original Message-----
> From: Jun Wu [mailto:wujun....@gmail.com]
> Sent: Thursday, October 18, 2018 6:29 AM
> To: dev@mxnet.incubator.apache.org
> Cc: d...@mxnet.apache.org; aza...@gmail.com
> Subject: Re: Include MKLDNN into default mxnet pip package
>
> If my understanding is correct about the context, it should be acknowledged
> that the significant performance improvement comes from the Intel
> MKLDNN team's contribution in this PR:
> https://github.com/apache/incubator-mxnet/pull/12530.
>
> On Wed, Oct 17, 2018 at 3:12 PM kellen sunderland <
> kellen.sunderl...@gmail.com> wrote:
>
> > First of all, thanks to Intel for these improvements; really a great effort.
> >
> > What would the compatibility story look like for users that don't have
> > these AVX instructions? Would there be any negative effect for AMD users?
> >
> > Regarding TensorRT: it's a possibility, but not planned in the short
> > term. A few considerations would be the limits on PyPI package sizes
> > and the bloat incurred with TRT, the requirement that TRT be installed
> > on the user side, and the TRT engine build times, which are
> > non-trivial. We can work toward fixing or working around these
> > issues in the future if default TRT is something the user community
> > would like to see for CUDA packages. While the feature is
> > experimental, we'll likely continue to use 'mxnet-tensorrt-cu92' and
> > 'mxnet-tensorrt-cu90'.
> >
> > On Wed, Oct 17, 2018 at 2:12 PM Alfredo Luque
> > <alfredo.lu...@airbnb.com.invalid> wrote:
> >
> > > This is huge. Thanks for working on this. Is there a similar plan
> > > with, e.g., TensorRT support being ported into the main cuda-9.x
> > > packages?
> > >
> > > On October 17, 2018 at 2:10:20 PM, Alex Zai (aza...@gmail.com) wrote:
> > >
> > > Hey all,
> > > We have been working hard these past few months to integrate and
> > > stabilize Intel's MKLDNN deep learning CPU accelerator into MXNet
> > > and have made incredible progress. On CPUs with AVX512 instructions
> > > (such as c5.18x) we have seen performance increases of up to 12x,
> > > and on other platforms (Macs, AVX2) we have seen speedups of 1.5x+.
> > > A full list of benchmarks can be found here (
> > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95650764
> > > and https://github.com/apache/incubator-mxnet/pull/12591).
> > >
> > > Currently, using this accelerator requires the developer to either
> > > pip install the mxnet-mkl version of MXNet or to build it themselves
> > > from source. Given that we should try to provide the best
> > > performance "out of the box" with MXNet, we should include this in
> > > the default build. The mkldnn library is included within the pip
> > > package build, so it does not require an external dependency.
> > >
> > > There were concerns that MKLDNN could cause regressions on certain
> > > platforms (as it did with the TensorFlow version a while back), but
> > > we added an env flag (MXNET_MKLDNN_ENABLED) that allows users to
> > > turn off this feature during runtime. Please bring up any other
> > > concerns you may have and your thoughts on including this
> > > accelerator in the default build.
> > >
> > > Best,
> > > Alex
> > >
> > > —
> > > Alfredo Luque
> > > Software Engineer
> > > Machine Learning Infrastructure
> > > Airbnb
> > > San Francisco, CA
> > >
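For anyone skimming the archive, the opt-in and opt-out paths discussed above can be sketched as follows. This is a sketch only: the package name (mxnet-mkl) and the runtime flag (MXNET_MKLDNN_ENABLED) are taken verbatim from the messages in this thread, not verified against any later MXNet release.

```shell
# Today the MKL-DNN build is opt-in via a separate pip package
# (commented out here so the sketch runs without network access):
#   pip install mxnet-mkl

# If MKL-DNN ships in the default package, a user hitting a platform
# regression could fall back to the original operator flow at runtime
# by setting the flag before starting the process that imports mxnet:
export MXNET_MKLDNN_ENABLED=0
echo "MXNET_MKLDNN_ENABLED=$MXNET_MKLDNN_ENABLED"
```

Note that, per Alex's description, the flag is read at runtime, so it must be set in the environment of the process before MXNet is loaded.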