Here are my answers for the questions from Kellen and Naveen about MKL-DNN. It 
doesn't mean that I'm supportive for making MKL-DNN default here.

@Kellen,

FYI, here is a list for those platforms which are officially supported by 
MKL-DNN.
https://github.com/intel/mkl-dnn#system-requirements 

Most of computation intensive kernels in MKL-DNN are JITed. So they are 
supposed to generate code according to the platform during runtime. For non-JIT 
code in MKL-DNN, same as other code in MXNet, it will generate instructions 
according to the options/flags of compiler. We can set -DARCH_OPT_FLAGS when 
build MKL-DNN to avoid optimization for compiling machine. That's exactly what 
we are doing for MKL-DNN build in MXNet. Even without MKL-DNN, I noticed there 
were issues about illegal instructions of MXNet when users import the pip 
package on a lower end machine which probably only supports SSE.

@Naveen,

The LSTM issue has already been identified as a regression from the recent 
version of MKL-DNN. Hopefully it will be fixed soon with a new update of 
MKL-DNN.

MXNet has many submodule dependencies under the 3rd party folder. Seems we 
don't require release versions for most of these dependencies. The release 
period of MKL-DNN and MXNet are not matched very well. I think it would be a 
risk for MXNet release if it hardly depends on the release of a submodule, no 
need to say depends on the releases of all submodules.

-tao

-----Original Message-----
From: Naveen Swamy [mailto:mnnav...@gmail.com] 
Sent: Thursday, November 22, 2018 9:08 AM
To: dev@mxnet.incubator.apache.org
Cc: d...@mxnet.apache.org
Subject: Re: Include MKLDNN into default mxnet pip package

Hi Alex,

Thanks for promptly running the numbers on AMD and reporting here.

Can you please update the AMD numbers here for posterity 
https://cwiki.apache.org/confluence/display/MXNET/MXNet+with+Intel+MKL-DNN+-+Performance+Benchmarking
?

are there any outstanding issues when MKLDNN is enabled? from my offline 
conversation I am briefly aware performance issues with LSTM, is there an 
GitHub issue for it?

MKLDNN is a submodule dependency, are we pulling the latest commit or releases  
? If not we should move to releases before we make it a default.
Ideally we should use platform specific distributions (-dev packages) at least 
we should rely on well tested releases.


Thanks, Naveen

On Wed, Nov 21, 2018 at 4:55 PM Zai, Alexander <alex...@amazon.com.invalid>
wrote:

> AMD benchmarks have been published. We are seeing a x15.8 speedup with
> Resnet50 (batch size 32) on AWS's new m5a.24xlarge machine. With a 
> smaller network (Mobilenet - batch size 32) the speedup is more 
> significant at x38.7. Let's have a vote to see if the PR to have 
> MKLDNN enabled by default
> (https://github.com/apache/incubator-mxnet/pull/12591) can be merged 
> before 1.4.0 release.
>
> On 10/19/18, 9:17 AM, "Pedro Larroy" <pedro.larroy.li...@gmail.com>
> wrote:
>
>     I did  pip install mxnet-mkl==1.3.1b20181018 on an AMD Ryzen 1950X 
> and unit
>     tests are passing.
>
>     Is this build using AVX512?  in /proc/cpuinfo I see only "avx" flag.
>     There's no "avx2" like on recent intel cpus.
>
>     Pedro.
>
>     On Fri, Oct 19, 2018 at 5:12 PM Hagay Lupesko <lupe...@gmail.com>
> wrote:
>
>     > Awesome collaborative effort across many contributors and companies!
>     >
>     > The boost is impressive and for MXNet users to get this boost 
> "out of the
>     > box" is a great benefit and makes MXNet an even better choice.
>     >
>     > Alex - can you clarify whether there are any down sides with 
> regards to
>     > noon AVX-512 architectures, AMD CPUs, etc? Will it gracefully 
> fallback?
>     >
>     > Hagay
>     >
>     >
>     > On Fri, Oct 19, 2018, 15:46 Sergio Fernández <wik...@apache.org>
> wrote:
>     >
>     > > If there is no downside on platforms not supporting AVX512 
> instructions,
>     > > then +1
>     > >
>     > >
>     > > On Wed, Oct 17, 2018, 14:10 Alex Zai <aza...@gmail.com> wrote:
>     > >
>     > > > Hey all,
>     > > > We have been working hard these past few months to integrate and
>     > > stabilize
>     > > > Intel’s MKLDNN deep learning CPU accelerator into Mxnet and 
> have made
>     > > > incredible progress. On CPUs with AVX512 instructions (such 
> as
> c5.18x)
>     > we
>     > > > have seen performance increase up to 12x and on other 
> platforms (Macs,
>     > > > AVX2) we seen a speedup of 1.5+. Full list of benchmarks can 
> be found
>     > > here
>     > > > (
>     > > >
>     > >
>     >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95650764
>     > > >  and https://github.com/apache/incubator-mxnet/pull/12591).
>     > > >
>     > > > Currently, using this accelerator requires the developer to 
> either pip
>     > > > install the mxnet-mkl version of mxnet or to build it 
> themselves from
>     > > > source. Given that we should try to provide the best 
> performance "out
>     > of
>     > > > the box” with mxnet we should include this in the default build.
> The
>     > > mkldnn
>     > > > library is included with in the pip package build so it does not
>     > require
>     > > an
>     > > > external dependency.
>     > > >
>     > > > There were concerns that MKLDNN could cause regressions on 
> certain
>     > > > platforms (as it did with the tensorflow version a while 
> back); but we
>     > > > added a env flag (MXNET_MKLDNN_ENABLED) that allows users to 
> turn of
>     > this
>     > > > feature during runtime. Please bring up any other concerns 
> you may have
>     > > and
>     > > > your thoughts on including this accelerator in the default build.
>     > > >
>     > > > Best,
>     > > > Alex
>     > > >
>     > >
>     >
>
>
>

Reply via email to