Re: Proposal to make MKLDNN as default CPU backend

Lin Yuan Tue, 19 Nov 2019 13:40:19 -0800

Also per Sam's suggestion, we could still release a build without MKLDNN
(name it mxnet-nomkldnn?) and track the usage/download for one or two
releases. If there is no usage, we could drop that build in the future.


Best,

Lin

On Tue, Nov 19, 2019 at 1:23 PM Lin Yuan <apefor...@gmail.com> wrote:

> Just to summarize base on the concerns Marco raised and discussed abvove:
>
> - AMD CPU (it should work with MKLDNN:
> https://cwiki.apache.org/confluence/display/MXNET/MXNet+with+Intel+MKL-DNN+-+Performance+Benchmarking
> )
> - ARM CPU (we don't have it today w/o MKLDNN either)
> - Windows (Windows support is there regardless of MKLDNN or not)
> - GPU and MKLDNN enabled (already supported)
> - Fully reproducible results (medical and financial sector requested that
> and we have some flags for cuda) (The nondeterminism exists even today w/o
> MKLDNN. We should address it regardless of MLKDNN)
>
> Marco, please let us know if your concerns are properly addressed?
>
> Given that MKLDNN gives significant performance speed up in CPU, I am
> inclined to make it default in pip build.
>
> Best,
>
> Lin
>
> On Tue, Nov 19, 2019 at 8:08 AM Chris Olivier <cjolivie...@gmail.com>
> wrote:
>
>> Thanks, Patric. I was just trying to point out that there was currently no
>> guarantee of deterministic results without MKL, so there’s not necessarily
>> an expectation of determinism with MKL (ie requirement isn’t relaxed).
>>
>> On Mon, Nov 18, 2019 at 9:38 PM Zhao, Patric <patric.z...@intel.com>
>> wrote:
>>
>> > It may be a concern but little noise can't affect the final results if
>> the
>> > algorithm is stable in numerical.
>> > The MKLDNN backend with mxnet-mkl has been used for 2 years and we
>> didn't
>> > see the coverage issue caused by multiple threading.
>> > In other words, GPU programming mode works well on training where the
>> > non-deterministic also exists from multiple threads.
>> >
>> > Parts of training accuracy was pasted in the first PR when MKLDNN is
>> > integrated.
>> >
>> https://github.com/apache/incubator-mxnet/pull/8302#issuecomment-359674818
>> >
>> > In conclusion, it may happen with very little probability. I believe we
>> > can get a solution in case it happens someday.
>> >
>> > Thanks,
>> >
>> > --Patric
>> >
>> >
>> > > -----Original Message-----
>> > > From: Chris Olivier <cjolivie...@gmail.com>
>> > > Sent: Tuesday, November 19, 2019 11:51 AM
>> > > To: dev@mxnet.incubator.apache.org
>> > > Cc: Tao Lv <mutou...@gmail.com>
>> > > Subject: Re: Proposal to make MKLDNN as default CPU backend
>> > >
>> > > (for non mkl dropout, for instance)
>> > >
>> > > On Mon, Nov 18, 2019 at 7:50 PM Chris Olivier <cjolivie...@gmail.com>
>> > > wrote:
>> > >
>> > > > To address the deterministic item, I know for a fact that training
>> > > > will not be deterministic in some cases where the “parallel random”
>> > > > class is utilized in parallel threads, such as OMP, if the number of
>> > > > cores is different, even with the same seed, because threads are
>> > > > seeded independently and different number of threads will end up
>> > > > generating different random number sequences. Dropout operator being
>> > > an example.
>> > > >
>> > > > On Mon, Nov 18, 2019 at 6:39 PM Alfredo Luque
>> > > > <alfredo.lu...@airbnb.com.invalid> wrote:
>> > > >
>> > > >> For AMD CPUs, you’d want to perform validation because now MKL-DNN
>> > > >> would be enabled by default. Historically, other intel libraries
>> > > >> (along with the ICC
>> > > >> compiler) have had performance issues on AMD CPUs. It’s just worth
>> > > >> double checking to make sure that’s not the case here. Perhaps some
>> > > >> MKL-DNN authors can chime in though. It’s not sufficient to double
>> > > >> check that an
>> > > >> AVX2 package passes tests.
>> > > >>
>> > > >> Agreed in the case we’re not releasing ARM binaries.
>> > > >>
>> > > >> The reproducibility argument is around the results being
>> numerically
>> > > >> reproducible. That is, eg; if I train a model with some fixed set
>> of
>> > > >> data, some random seed, etc. and then run inference on it do I get
>> > > >> the exact same floating point values for the weights and results?
>> > > >> Does MxNet already offer this without MKL-DNN?
>> > > >>
>> > > >> On November 18, 2019 at 6:32:07 PM, Tao Lv (mutou...@gmail.com)
>> > > wrote:
>> > > >>
>> > > >> Regarding the cases listed by Marco:
>> > > >> - AMD CPU
>> > > >> From my architecture knowledge, what works on C4 instances (with
>> AVX2
>> > > >> support) should also work well on m5a, right? I think mxnet-mkl and
>> > > >> mxnet-cuxxmkl packages have been fully validated on AVX2 machines.
>> > > >> Also, we didn't perform any validation on AMD CPU before, why we
>> need
>> > > >> do that for this time?
>> > > >>
>> > > >> - ARM CPU
>> > > >> I don't know we're releasing any convenience binaries for ARM CPU.
>> > > >> This proposal mainly targets those pypi packages.
>> > > >>
>> > > >> - Windows
>> > > >> Already validated by CI. We're also releasing mxnet-mkl packages
>> for
>> > Win.
>> > > >>
>> > > >> - GPU and MKLDNN enabled
>> > > >> Already validated by CI and mxnet-cuxxmkl packages have been
>> released
>> > > >> for several versions.
>> > > >>
>> > > >> - Fully reproducible results (medical and financial sector
>> requested
>> > > >> that and we have some flags for cuda) Not sure I understand this
>> > > >> case. We already have MKL-DNN backend for a while. Functionality
>> and
>> > > >> correctness of it have been verified by MXNet users.
>> > > >>
>> > > >> -tao
>> > > >>
>> > > >> On Tue, Nov 19, 2019 at 4:41 AM Marco de Abreu
>> > > >> <marco.g.ab...@gmail.com>
>> > > >> wrote:
>> > > >>
>> > > >> > Sorry, my intent with the "non-standard" phrase was not about
>> > > >> > general
>> > > >> MXNet
>> > > >> > but rather from MKLDNNs point of view, considering that it's
>> being
>> > > >> > developed by Intel, I assumed that MKLDNN might consider
>> non-intel
>> > > >> > use-cases non standard.
>> > > >> >
>> > > >> > -Marco
>> > > >> >
>> > > >> > Skalicky, Sam <sska...@amazon.com.invalid> schrieb am Mo., 18.
>> Nov.
>> > > >> 2019,
>> > > >> > 21:34:
>> > > >> >
>> > > >> > > Thanks Alfredo, if you can create a GitHub issue with
>> notes/steps
>> > > >> > > we
>> > > >> can
>> > > >> > > add this to the todo list for integrating with the MXNet CI to
>> > > >> > > test on
>> > > >> > m5a
>> > > >> > > instances too. Then we can start tracking this on a regular
>> > > >> > > basis. It
>> > > >> > would
>> > > >> > > be great to actually test on ARM instances now that AWS has A1
>> > > >> instances
>> > > >> > > too…..ill add it to the wish list ;-D
>> > > >> > >
>> > > >> > > Sam
>> > > >> > >
>> > > >> > > > On Nov 18, 2019, at 12:32 PM, Alfredo Luque <
>> > > >> alfredo.lu...@airbnb.com
>> > > >> > .INVALID>
>> > > >> > > wrote:
>> > > >> > > >
>> > > >> > > > Happy to run some benchmarks on an AWS m5a instance (Epyc)
>> and
>> > > >> > > > first generation AMD Threadripper Gen 1 if someone has
>> > > >> > > > something easy to
>> > > >> run
>> > > >> > > and
>> > > >> > > > representative.
>> > > >> > > >
>> > > >> > > > On November 18, 2019 at 12:29:31 PM, Skalicky, Sam (
>> > > >> > > > sska...@amazon.com.invalid) wrote:
>> > > >> > > >
>> > > >> > > > Thanks a good idea Alfredo, are you able to help test on AMD
>> > CPUs?
>> > > >> Or
>> > > >> > is
>> > > >> > > > there someone else in the mxnet dev@ community who can help?
>> > > >> > > >
>> > > >> > > > Sam
>> > > >> > > >
>> > > >> > > >> On Nov 18, 2019, at 12:27 PM, Alfredo Luque
>> > > >> > > > <alfredo.lu...@airbnb.com.INVALID> wrote:
>> > > >> > > >>
>> > > >> > > >> Verifying that there isn’t a slowdown on AMD CPUs (eg;
>> Ryzen /
>> > > >> Epyc)
>> > > >> > > > would
>> > > >> > > >> definitely make sense as a requirement. It seems odd to
>> > > >> > > >> classify
>> > > >> that
>> > > >> > as
>> > > >> > > > a
>> > > >> > > >> “nonstandard” use case.
>> > > >> > > >>
>> > > >> > > >> On November 18, 2019 at 12:20:33 PM, Skalicky, Sam (
>> > > >> > > >> sska...@amazon.com.invalid) wrote:
>> > > >> > > >>
>> > > >> > > >> Thanks Patric & team for your work over the years to make
>> > > >> > > >> MXNet
>> > > >> fast
>> > > >> > > with
>> > > >> > > >> MKLDNN!
>> > > >> > > >>
>> > > >> > > >> I think it would be great to make MKLDNN enabled by default.
>> > > >> > > >> We
>> > > >> will
>> > > >> > > need
>> > > >> > > >> to continue producing variants without MKLDNN for those who
>> > > >> > > >> don’t
>> > > >> want
>> > > >> > > it
>> > > >> > > >> (Marco enumerated some use cases). How do you propose to
>> > > >> > > >> identify
>> > > >> the
>> > > >> > > pip
>> > > >> > > >> wheels with/without MKLDNN? Previously we had: mxnet-mkl and
>> > > >> > > > mxnet-cu101mkl
>> > > >> > > >> with MKLDNN. If the plain “mxnet” pip wheel now contains
>> > > >> > > >> MKLDNN
>> > > >> what
>> > > >> > do
>> > > >> > > > you
>> > > >> > > >> propose we call the build without MKLDNN? mxnet-nomkl?
>> > > >> > > >>
>> > > >> > > >> Thanks!
>> > > >> > > >> Sam
>> > > >> > > >>
>> > > >> > > >>> On Nov 18, 2019, at 11:08 AM, Marco de Abreu <
>> > > >> > marco.g.ab...@gmail.com>
>> > > >> > > >> wrote:
>> > > >> > > >>>
>> > > >> > > >>> Hi Patric,
>> > > >> > > >>>
>> > > >> > > >>> First of all, thanks a lot to you and your team for all the
>> > > >> > > >>> effort
>> > > >> on
>> > > >> > > >> MXNet
>> > > >> > > >>> and mkldnn!
>> > > >> > > >>>
>> > > >> > > >>> Generally I'm inclined towards your proposal, but I'm
>> > > >> > > >>> thinking
>> > > >> about
>> > > >> > > the
>> > > >> > > >>> non-standard use cases:
>> > > >> > > >>> - AMD CPU
>> > > >> > > >>> - ARM CPU
>> > > >> > > >>> - Windows
>> > > >> > > >>> - GPU and MKLDNN enabled
>> > > >> > > >>> - Fully reproducible results (medical and financial sector
>> > > >> requested
>> > > >> > > > that
>> > > >> > > >>> and we have some flags for cuda)
>> > > >> > > >>>
>> > > >> > > >>> Is mkldnn fully compatible with these use cases? If not,
>> what
>> > > >> would
>> > > >> > > >> happen?
>> > > >> > > >>> If yes, do we have performance numbers?
>> > > >> > > >>>
>> > > >> > > >>> Best regards,
>> > > >> > > >>> Marco
>> > > >> > > >>>
>> > > >> > > >>> Zhao, Patric <patric.z...@intel.com> schrieb am Mo., 18.
>> Nov.
>> > > >> 2019,
>> > > >> > > >> 14:00:
>> > > >> > > >>>
>> > > >> > > >>>> Hi MXNet community,
>> > > >> > > >>>>
>> > > >> > > >>>> From the first MKLDNN backend integrated in release 1.2,
>> the
>> > > >> > community
>> > > >> > > >> is
>> > > >> > > >>>> continuously improving the quality and performance of
>> MKLDNN
>> > > >> > > >>>> CPU
>> > > >> > > >> backend.
>> > > >> > > >>>> Nowadays, the MKLDNN backend is widely used for the
>> > > >> > > >>>> inference,
>> > > >> > > >> especially
>> > > >> > > >>>> for INT8 inference, and we got lots of very positive
>> > > >> > > >>>> feedbacks
>> > > >> from
>> > > >> > > >> MXNet
>> > > >> > > >>>> users.
>> > > >> > > >>>>
>> > > >> > > >>>> Achieved milestones as below:
>> > > >> > > >>>>
>> > > >> > > >>>> - MKLDNN integrated into Apache MXNet from release 1.2,
>> Feb,
>> > > >> > > >>>> 2018
>> > > >> > [1]
>> > > >> > > >>>> - MKLDNN backend as default CPU backend from source
>> > > >> > > >>>> building,
>> > > >> Jan,
>> > > >> > > 2019
>> > > >> > > >> [2]
>> > > >> > > >>>> - MKLDNN subgraph optimization as default for the
>> inference,
>> > > >> > > >>>> Jul,
>> > > >> > 2019
>> > > >> > > >> [3]
>> > > >> > > >>>> - MKLDNN major version upgrade in release 1.6, Oct, 2019
>> [4]
>> > > >> > > >>>>
>> > > >> > > >>>> To make more successful and technical leadership for
>> Apache
>> > > >> > > >>>> MXNet
>> > > >> in
>> > > >> > > > the
>> > > >> > > >>>> industry, I propose to make MKLDNN as default CPU backend
>> in
>> > > >> > > >>>> all
>> > > >> > > binary
>> > > >> > > >>>> distribution from the next release.
>> > > >> > > >>>> The new milestone includes:
>> > > >> > > >>>>
>> > > >> > > >>>> - Static link MKLDNN library in the binary avoiding the
>> > > >> > > >>>> mismatch
>> > > >> > > > version
>> > > >> > > >>>> in the runtime [5]
>> > > >> > > >>>> - Make nightly build with MKLDNN default from master pre
>> 1.7
>> > > >> release
>> > > >> > > >>>> - Binary distribution with MKLDNN default from 1.7
>> release.
>> > > >> > > >>>>
>> > > >> > > >>>> What will be changed:
>> > > >> > > >>>>
>> > > >> > > >>>> - mxnet and mxnet-cuXX binary will be built with MKLDNN=1
>> > > >> > > >>>> - mxnet-mkl and mxnet-cuXXmkl will be not changed in the
>> > > >> > > >>>> minor
>> > > >> > release
>> > > >> > > >>>> (1.x) and plan to remove in next major release (2.0)
>> > > >> > > >>>>
>> > > >> > > >>>> Suggestions and comments are highly appreciated.
>> > > >> > > >>>>
>> > > >> > > >>>> Thanks,
>> > > >> > > >>>>
>> > > >> > > >>>> --Patric
>> > > >> > > >>>>
>> > > >> > > >>>>
>> > > >> > > >>>> [1] https://github.com/apache/incubator-mxnet/pull/9677
>> > > >> > > >>>> [2]
>> > > >> > > >>>>
>> > > >> > > >>
>> > > >> > > >
>> > > >> > >
>> > > >> >
>> > > >>
>> > > >>
>> > > https://lists.apache.org/thread.html/bfeae6ee46374112eb4dff1470c26295
>> > > >> 9101e4bffb19930926963535@%3Cdev.mxnet.apache.org%3E
>> > > >> > > >>>> [3] https://github.com/apache/incubator-mxnet/pull/15518
>> > > >> > > >>>> [4]
>> > > >> > > >>>>
>> > > >> > > >>
>> > > >> > > >
>> > > >> > >
>> > > >> >
>> > > >>
>> > > >>
>> > > https://lists.apache.org/thread.html/f46ab920f18795496eafe713e6e9e561
>> > > >> c684e06189085cec17b401dc@%3Cdev.mxnet.apache.org%3E
>> > > >> > > >>>> [5] https://github.com/apache/incubator-mxnet/pull/16731
>> > > >> > > >>>>
>> > > >> > > >>
>> > > >> > > >> —
>> > > >> > > >> Alfredo Luque
>> > > >> > > >> Software Engineer
>> > > >> > > >> Machine Learning Infrastructure Airbnb San Francisco, CA
>> > > >> > > >
>> > > >> > > > —
>> > > >> > > > Alfredo Luque
>> > > >> > > > Software Engineer
>> > > >> > > > Machine Learning Infrastructure Airbnb San Francisco, CA
>> > > >> > >
>> > > >> > >
>> > > >> >
>> > > >>
>> > > >> —
>> > > >> Alfredo Luque
>> > > >> Software Engineer
>> > > >> Machine Learning Infrastructure
>> > > >> Airbnb
>> > > >> San Francisco, CA
>> > > >>
>> > > >
>> >
>>
>

Re: Proposal to make MKLDNN as default CPU backend

Reply via email to