If my understanding of the context is correct, it should be acknowledged
that the significant performance improvement comes from the Intel MKLDNN
team's contribution in this PR:
https://github.com/apache/incubator-mxnet/pull/12530.

On Wed, Oct 17, 2018 at 3:12 PM kellen sunderland <
kellen.sunderl...@gmail.com> wrote:

> First of all thanks to Intel for these improvements, really a great effort.
>
> What would the compatibility story look like for users that don't have
> these AVX instructions? Would there be any negative effect for AMD users?
>
> Regarding TensorRT: It's a possibility but not planned in the short term. A
> few considerations would be the limits on PyPi package sizes and the bloat
> incurred with TRT, the requirements of TRT to be installed on the user
> side, and the TRT engine build times which are non-trivial.  We can work
> towards fixing or working around these issues in the future if default TRT
> is something the user community would like to see for Cuda packages.  While
> the feature is experimental we'll likely continue to use
> 'mxnet-tensorrt-cu92' and 'mxnet-tensorrt-cu90'.
>
> On Wed, Oct 17, 2018 at 2:12 PM Alfredo Luque
> <alfredo.lu...@airbnb.com.invalid> wrote:
>
> > This is huge. Thanks for working on this. Is there a similar plan with,
> > e.g., TensorRT support being ported into the main cuda-9.x packages?
> >
> > On October 17, 2018 at 2:10:20 PM, Alex Zai (aza...@gmail.com) wrote:
> >
> > Hey all,
> > We have been working hard these past few months to integrate and
> > stabilize Intel's MKLDNN deep learning CPU accelerator into MXNet and
> > have made incredible progress. On CPUs with AVX512 instructions (such as
> > c5.18x) we have seen performance increases of up to 12x, and on other
> > platforms (Macs, AVX2) we have seen speedups of 1.5x or more. The full
> > list of benchmarks can be found here (
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95650764
> > and https://github.com/apache/incubator-mxnet/pull/12591).
> >
> > Currently, using this accelerator requires the developer to either pip
> > install the mxnet-mkl version of MXNet or to build it themselves from
> > source. Given that we should try to provide the best performance "out of
> > the box" with MXNet, we should include this in the default build. The
> > mkldnn library is included in the pip package build, so it does not
> > require an external dependency.
> >
> > There were concerns that MKLDNN could cause regressions on certain
> > platforms (as it did with the TensorFlow version a while back), but we
> > added an env flag (MXNET_MKLDNN_ENABLED) that allows users to turn off
> > this feature at runtime. Please bring up any other concerns you may have
> > and your thoughts on including this accelerator in the default build.
> >
> > Best,
> > Alex
> >
> > —
> > Alfredo Luque
> > Software Engineer
> > Machine Learning Infrastructure
> > Airbnb
> > San Francisco, CA
> >
>
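The MXNET_MKLDNN_ENABLED flag mentioned in the thread can be exercised as
follows (a minimal sketch; only the flag name comes from the thread, and the
exact values MXNet accepts are assumed here):

```shell
# Disable the MKLDNN accelerator at runtime if a regression appears;
# any MXNet process launched afterward will see this environment variable.
export MXNET_MKLDNN_ENABLED=0
echo "MXNET_MKLDNN_ENABLED=$MXNET_MKLDNN_ENABLED"
```

Because this is an environment variable rather than a build flag, users can
flip it per process without reinstalling or rebuilding the package.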
