Re: Include MKLDNN into default mxnet pip package

kellen sunderland Wed, 21 Nov 2018 18:55:56 -0800

Agree with your point about other repos also not being based on versioning
Tao.  I would point out that I've given some that I've worked with similar
feedback: https://github.com/onnx/onnx-tensorrt/issues/68


On Wed, Nov 21, 2018 at 6:48 PM Naveen Swamy <mnnav...@gmail.com> wrote:

> Tao,
>
> You are right there are many submodules in 3rd party. We have to start
> somewhere and I believe this one is a good candidate to start with. This is
> not to cater to release of MXNet or to tie them with the releases of the
> submodules but instead to pick only stable releases and not to pick up
> bleeding edge commits from the tip of the master, this gives us confidence
> in the submodule that MXNet users are depending on that especially if we
> make MKLDNN the default.
>
> Good to know it is known already as a regression.Alex has created this
> issue https://github.com/apache/incubator-mxnet/issues/13369, please add
> details and link the corresponding issue in MKLDNN(I couldn't find).
>
> -Naveen
>
> On Wed, Nov 21, 2018 at 6:04 PM Lv, Tao A <tao.a...@intel.com> wrote:
>
> > Here are my answers for the questions from Kellen and Naveen about
> > MKL-DNN. It doesn't mean that I'm supportive for making MKL-DNN default
> > here.
> >
> > @Kellen,
> >
> > FYI, here is a list for those platforms which are officially supported by
> > MKL-DNN.
> > https://github.com/intel/mkl-dnn#system-requirements
> >
> > Most of computation intensive kernels in MKL-DNN are JITed. So they are
> > supposed to generate code according to the platform during runtime. For
> > non-JIT code in MKL-DNN, same as other code in MXNet, it will generate
> > instructions according to the options/flags of compiler. We can set
> > -DARCH_OPT_FLAGS when build MKL-DNN to avoid optimization for compiling
> > machine. That's exactly what we are doing for MKL-DNN build in MXNet.
> Even
> > without MKL-DNN, I noticed there were issues about illegal instructions
> of
> > MXNet when users import the pip package on a lower end machine which
> > probably only supports SSE.
> >
> > @Naveen,
> >
> > The LSTM issue has already been identified as a regression from the
> recent
> > version of MKL-DNN. Hopefully it will be fixed soon with a new update of
> > MKL-DNN.
> >
> > MXNet has many submodule dependencies under the 3rd party folder. Seems
> we
> > don't require release versions for most of these dependencies. The
> release
> > period of MKL-DNN and MXNet are not matched very well. I think it would
> be
> > a risk for MXNet release if it hardly depends on the release of a
> > submodule, no need to say depends on the releases of all submodules.
> >
> > -tao
> >
> > -----Original Message-----
> > From: Naveen Swamy [mailto:mnnav...@gmail.com]
> > Sent: Thursday, November 22, 2018 9:08 AM
> > To: dev@mxnet.incubator.apache.org
> > Cc: d...@mxnet.apache.org
> > Subject: Re: Include MKLDNN into default mxnet pip package
> >
> > Hi Alex,
> >
> > Thanks for promptly running the numbers on AMD and reporting here.
> >
> > Can you please update the AMD numbers here for posterity
> >
> https://cwiki.apache.org/confluence/display/MXNET/MXNet+with+Intel+MKL-DNN+-+Performance+Benchmarking
> > ?
> >
> > are there any outstanding issues when MKLDNN is enabled? from my offline
> > conversation I am briefly aware performance issues with LSTM, is there an
> > GitHub issue for it?
> >
> > MKLDNN is a submodule dependency, are we pulling the latest commit or
> > releases  ? If not we should move to releases before we make it a
> default.
> > Ideally we should use platform specific distributions (-dev packages) at
> > least we should rely on well tested releases.
> >
> >
> > Thanks, Naveen
> >
> > On Wed, Nov 21, 2018 at 4:55 PM Zai, Alexander
> <alex...@amazon.com.invalid
> > >
> > wrote:
> >
> > > AMD benchmarks have been published. We are seeing a x15.8 speedup with
> > > Resnet50 (batch size 32) on AWS's new m5a.24xlarge machine. With a
> > > smaller network (Mobilenet - batch size 32) the speedup is more
> > > significant at x38.7. Let's have a vote to see if the PR to have
> > > MKLDNN enabled by default
> > > (https://github.com/apache/incubator-mxnet/pull/12591) can be merged
> > > before 1.4.0 release.
> > >
> > > On 10/19/18, 9:17 AM, "Pedro Larroy" <pedro.larroy.li...@gmail.com>
> > > wrote:
> > >
> > >     I did  pip install mxnet-mkl==1.3.1b20181018 on an AMD Ryzen 1950X
> > > and unit
> > >     tests are passing.
> > >
> > >     Is this build using AVX512?  in /proc/cpuinfo I see only "avx"
> flag.
> > >     There's no "avx2" like on recent intel cpus.
> > >
> > >     Pedro.
> > >
> > >     On Fri, Oct 19, 2018 at 5:12 PM Hagay Lupesko <lupe...@gmail.com>
> > > wrote:
> > >
> > >     > Awesome collaborative effort across many contributors and
> > companies!
> > >     >
> > >     > The boost is impressive and for MXNet users to get this boost
> > > "out of the
> > >     > box" is a great benefit and makes MXNet an even better choice.
> > >     >
> > >     > Alex - can you clarify whether there are any down sides with
> > > regards to
> > >     > noon AVX-512 architectures, AMD CPUs, etc? Will it gracefully
> > > fallback?
> > >     >
> > >     > Hagay
> > >     >
> > >     >
> > >     > On Fri, Oct 19, 2018, 15:46 Sergio Fernández <wik...@apache.org>
> > > wrote:
> > >     >
> > >     > > If there is no downside on platforms not supporting AVX512
> > > instructions,
> > >     > > then +1
> > >     > >
> > >     > >
> > >     > > On Wed, Oct 17, 2018, 14:10 Alex Zai <aza...@gmail.com> wrote:
> > >     > >
> > >     > > > Hey all,
> > >     > > > We have been working hard these past few months to integrate
> > and
> > >     > > stabilize
> > >     > > > Intel’s MKLDNN deep learning CPU accelerator into Mxnet and
> > > have made
> > >     > > > incredible progress. On CPUs with AVX512 instructions (such
> > > as
> > > c5.18x)
> > >     > we
> > >     > > > have seen performance increase up to 12x and on other
> > > platforms (Macs,
> > >     > > > AVX2) we seen a speedup of 1.5+. Full list of benchmarks can
> > > be found
> > >     > > here
> > >     > > > (
> > >     > > >
> > >     > >
> > >     >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95650764
> > >     > > >  and https://github.com/apache/incubator-mxnet/pull/12591).
> > >     > > >
> > >     > > > Currently, using this accelerator requires the developer to
> > > either pip
> > >     > > > install the mxnet-mkl version of mxnet or to build it
> > > themselves from
> > >     > > > source. Given that we should try to provide the best
> > > performance "out
> > >     > of
> > >     > > > the box” with mxnet we should include this in the default
> > build.
> > > The
> > >     > > mkldnn
> > >     > > > library is included with in the pip package build so it does
> > not
> > >     > require
> > >     > > an
> > >     > > > external dependency.
> > >     > > >
> > >     > > > There were concerns that MKLDNN could cause regressions on
> > > certain
> > >     > > > platforms (as it did with the tensorflow version a while
> > > back); but we
> > >     > > > added a env flag (MXNET_MKLDNN_ENABLED) that allows users to
> > > turn of
> > >     > this
> > >     > > > feature during runtime. Please bring up any other concerns
> > > you may have
> > >     > > and
> > >     > > > your thoughts on including this accelerator in the default
> > build.
> > >     > > >
> > >     > > > Best,
> > >     > > > Alex
> > >     > > >
> > >     > >
> > >     >
> > >
> > >
> > >
> >
>

Re: Include MKLDNN into default mxnet pip package

Reply via email to