Thanks for all the great suggestions. Regarding the binary releases, including the build w/o MKLDNN, I have summarized a table (check the attachment).
Major changes in the Python packages (see the attached table):

- Switch on MKLDNN for the binaries without the "mkl" suffix in release 1.7 (red check mark in the table)
- Add a new mxnet-native build w/o MKLDNN and cuDNN (yellow background in the table); track the usage/downloads for 1-2 releases and then decide whether we need it long term
- Drop all "mkl"-suffix binaries in the next major release, v2.x

Thanks,

--Patric

> -----Original Message-----
> From: Lin Yuan <apefor...@gmail.com>
> Sent: Wednesday, November 20, 2019 5:40 AM
> To: dev@mxnet.incubator.apache.org
> Cc: Tao Lv <mutou...@gmail.com>
> Subject: Re: Proposal to make MKLDNN as default CPU backend
>
> Also, per Sam's suggestion, we could still release a build without MKLDNN
> (name it mxnet-nomkldnn?) and track the usage/downloads for one or two
> releases. If there is no usage, we could drop that build in the future.
>
> Best,
>
> Lin
>
> On Tue, Nov 19, 2019 at 1:23 PM Lin Yuan <apefor...@gmail.com> wrote:
>
> > Just to summarize, based on the concerns Marco raised and discussed
> > above:
> >
> > - AMD CPU (it should work with MKLDNN:
> > https://cwiki.apache.org/confluence/display/MXNET/MXNet+with+Intel+MKL-DNN+-+Performance+Benchmarking
> > )
> > - ARM CPU (we don't have it today w/o MKLDNN either)
> > - Windows (Windows support is there regardless of MKLDNN or not)
> > - GPU and MKLDNN enabled (already supported)
> > - Fully reproducible results (the medical and financial sectors requested
> > that, and we have some flags for CUDA) (The nondeterminism exists even
> > today w/o MKLDNN. We should address it regardless of MKLDNN.)
> >
> > Marco, please let us know if your concerns are properly addressed.
> >
> > Given that MKLDNN gives a significant performance speedup on CPU, I am
> > inclined to make it the default in the pip build.
> >
> > Best,
> >
> > Lin
> >
> > On Tue, Nov 19, 2019 at 8:08 AM Chris Olivier <cjolivie...@gmail.com>
> > wrote:
> >
> >> Thanks, Patric. I was just trying to point out that there is
> >> currently no guarantee of deterministic results without MKL, so
> >> there's not necessarily an expectation of determinism with MKL
> >> (i.e., the requirement isn't relaxed).
> >>
> >> On Mon, Nov 18, 2019 at 9:38 PM Zhao, Patric <patric.z...@intel.com>
> >> wrote:
> >>
> >> > It may be a concern, but a little noise can't affect the final
> >> > results if the algorithm is numerically stable.
> >> > The MKLDNN backend with mxnet-mkl has been used for 2 years, and we
> >> > didn't see any convergence issue caused by multithreading.
> >> > In other words, the GPU programming model works well for training,
> >> > where nondeterminism from multiple threads also exists.
> >> >
> >> > Some training accuracy results were posted in the first PR when
> >> > MKLDNN was integrated:
> >> > https://github.com/apache/incubator-mxnet/pull/8302#issuecomment-359674818
> >> >
> >> > In conclusion, it may happen with very low probability. I believe
> >> > we can find a solution in case it happens someday.
> >> >
> >> > Thanks,
> >> >
> >> > --Patric
> >> >
> >> > > -----Original Message-----
> >> > > From: Chris Olivier <cjolivie...@gmail.com>
> >> > > Sent: Tuesday, November 19, 2019 11:51 AM
> >> > > To: dev@mxnet.incubator.apache.org
> >> > > Cc: Tao Lv <mutou...@gmail.com>
> >> > > Subject: Re: Proposal to make MKLDNN as default CPU backend
> >> > >
> >> > > (for non-MKL dropout, for instance)
> >> > >
> >> > > On Mon, Nov 18, 2019 at 7:50 PM Chris Olivier
> >> > > <cjolivie...@gmail.com> wrote:
> >> > >
> >> > > > To address the determinism item: I know for a fact that
> >> > > > training will not be deterministic in some cases where the
> >> > > > "parallel random" class is utilized in parallel threads, such
> >> > > > as OMP, if the number of cores is different, even with the same
> >> > > > seed, because threads are seeded independently, and a different
> >> > > > number of threads will end up generating different random
> >> > > > number sequences. The Dropout operator is an example.
> >> > > >
> >> > > > On Mon, Nov 18, 2019 at 6:39 PM Alfredo Luque
> >> > > > <alfredo.lu...@airbnb.com.invalid> wrote:
> >> > > >
> >> > > >> For AMD CPUs, you'd want to perform validation because now
> >> > > >> MKL-DNN would be enabled by default. Historically, other Intel
> >> > > >> libraries (along with the ICC compiler) have had performance
> >> > > >> issues on AMD CPUs. It's just worth double-checking to make
> >> > > >> sure that's not the case here. Perhaps some MKL-DNN authors
> >> > > >> can chime in, though. It's not sufficient to double-check that
> >> > > >> an AVX2 package passes tests.
> >> > > >>
> >> > > >> Agreed, in the case we're not releasing ARM binaries.
> >> > > >>
> >> > > >> The reproducibility argument is about the results being
> >> > > >> numerically reproducible. That is, e.g., if I train a model
> >> > > >> with some fixed set of data, some random seed, etc., and then
> >> > > >> run inference on it, do I get the exact same floating-point
> >> > > >> values for the weights and results?
> >> > > >> Does MXNet already offer this without MKL-DNN?
> >> > > >>
> >> > > >> On November 18, 2019 at 6:32:07 PM, Tao Lv (mutou...@gmail.com)
> >> > > >> wrote:
> >> > > >>
> >> > > >> Regarding the cases listed by Marco:
> >> > > >>
> >> > > >> - AMD CPU
> >> > > >> From my architecture knowledge, what works on C4 instances
> >> > > >> (with AVX2 support) should also work well on m5a, right? I
> >> > > >> think the mxnet-mkl and mxnet-cuxxmkl packages have been fully
> >> > > >> validated on AVX2 machines.
> >> > > >> Also, we didn't perform any validation on AMD CPUs before; why
> >> > > >> do we need to do that this time?
> >> > > >>
> >> > > >> - ARM CPU
> >> > > >> I don't know of any convenience binaries we're releasing for
> >> > > >> ARM CPUs. This proposal mainly targets the PyPI packages.
> >> > > >>
> >> > > >> - Windows
> >> > > >> Already validated by CI. We're also releasing mxnet-mkl
> >> > > >> packages for Win.
> >> > > >>
> >> > > >> - GPU and MKLDNN enabled
> >> > > >> Already validated by CI, and mxnet-cuxxmkl packages have been
> >> > > >> released for several versions.
> >> > > >>
> >> > > >> - Fully reproducible results (medical and financial sector
> >> > > >> requested that and we have some flags for cuda)
> >> > > >> Not sure I understand this case. We have had the MKL-DNN
> >> > > >> backend for a while; its functionality and correctness have
> >> > > >> been verified by MXNet users.
> >> > > >>
> >> > > >> -tao
> >> > > >>
> >> > > >> On Tue, Nov 19, 2019 at 4:41 AM Marco de Abreu
> >> > > >> <marco.g.ab...@gmail.com> wrote:
> >> > > >>
> >> > > >> > Sorry, my intent with the "non-standard" phrase was not
> >> > > >> > about general MXNet but rather from MKLDNN's point of view:
> >> > > >> > considering that it's being developed by Intel, I assumed
> >> > > >> > that MKLDNN might consider non-Intel use cases non-standard.
> >> > > >> >
> >> > > >> > -Marco
> >> > > >> >
> >> > > >> > Skalicky, Sam <sska...@amazon.com.invalid> schrieb am Mo.,
> >> > > >> > 18. Nov. 2019, 21:34:
> >> > > >> >
> >> > > >> > > Thanks Alfredo, if you can create a GitHub issue with
> >> > > >> > > notes/steps, we can add this to the to-do list for
> >> > > >> > > integrating with the MXNet CI to test on m5a instances
> >> > > >> > > too. Then we can start tracking this on a regular basis.
> >> > > >> > > It would be great to actually test on ARM instances now
> >> > > >> > > that AWS has A1 instances too…..I'll add it to the wish
> >> > > >> > > list ;-D
> >> > > >> > >
> >> > > >> > > Sam
> >> > > >> > >
> >> > > >> > > > On Nov 18, 2019, at 12:32 PM, Alfredo Luque
> >> > > >> > > > <alfredo.lu...@airbnb.com.INVALID> wrote:
> >> > > >> > > >
> >> > > >> > > > Happy to run some benchmarks on an AWS m5a instance
> >> > > >> > > > (Epyc) and a first-generation AMD Threadripper if
> >> > > >> > > > someone has something easy to run and representative.
> >> > > >> > > >
> >> > > >> > > > On November 18, 2019 at 12:29:31 PM, Skalicky, Sam (
> >> > > >> > > > sska...@amazon.com.invalid) wrote:
> >> > > >> > > >
> >> > > >> > > > That's a good idea Alfredo; are you able to help test on
> >> > > >> > > > AMD CPUs? Or is there someone else in the mxnet dev@
> >> > > >> > > > community who can help?
> >> > > >> > > >
> >> > > >> > > > Sam
> >> > > >> > > >
> >> > > >> > > >> On Nov 18, 2019, at 12:27 PM, Alfredo Luque
> >> > > >> > > >> <alfredo.lu...@airbnb.com.INVALID> wrote:
> >> > > >> > > >>
> >> > > >> > > >> Verifying that there isn't a slowdown on AMD CPUs
> >> > > >> > > >> (e.g., Ryzen / Epyc) would definitely make sense as a
> >> > > >> > > >> requirement. It seems odd to classify that as a
> >> > > >> > > >> "non-standard" use case.
> >> > > >> > > >>
> >> > > >> > > >> On November 18, 2019 at 12:20:33 PM, Skalicky, Sam (
> >> > > >> > > >> sska...@amazon.com.invalid) wrote:
> >> > > >> > > >>
> >> > > >> > > >> Thanks Patric & team for your work over the years to
> >> > > >> > > >> make MXNet fast with MKLDNN!
> >> > > >> > > >>
> >> > > >> > > >> I think it would be great to make MKLDNN enabled by
> >> > > >> > > >> default. We will need to continue producing variants
> >> > > >> > > >> without MKLDNN for those who don't want it (Marco
> >> > > >> > > >> enumerated some use cases). How do you propose to
> >> > > >> > > >> identify the pip wheels with/without MKLDNN? Previously
> >> > > >> > > >> we had mxnet-mkl and mxnet-cu101mkl with MKLDNN. If the
> >> > > >> > > >> plain "mxnet" pip wheel now contains MKLDNN, what do
> >> > > >> > > >> you propose we call the build without MKLDNN?
> >> > > >> > > >> mxnet-nomkl?
> >> > > >> > > >>
> >> > > >> > > >> Thanks!
> >> > > >> > > >> Sam
> >> > > >> > > >>
> >> > > >> > > >>> On Nov 18, 2019, at 11:08 AM, Marco de Abreu
> >> > > >> > > >>> <marco.g.ab...@gmail.com> wrote:
> >> > > >> > > >>>
> >> > > >> > > >>> Hi Patric,
> >> > > >> > > >>>
> >> > > >> > > >>> First of all, thanks a lot to you and your team for
> >> > > >> > > >>> all the effort on MXNet and mkldnn!
> >> > > >> > > >>>
> >> > > >> > > >>> Generally I'm inclined towards your proposal, but I'm
> >> > > >> > > >>> thinking about the non-standard use cases:
> >> > > >> > > >>> - AMD CPU
> >> > > >> > > >>> - ARM CPU
> >> > > >> > > >>> - Windows
> >> > > >> > > >>> - GPU and MKLDNN enabled
> >> > > >> > > >>> - Fully reproducible results (medical and financial
> >> > > >> > > >>> sector requested that and we have some flags for cuda)
> >> > > >> > > >>>
> >> > > >> > > >>> Is mkldnn fully compatible with these use cases? If
> >> > > >> > > >>> not, what would happen? If yes, do we have performance
> >> > > >> > > >>> numbers?
> >> > > >> > > >>>
> >> > > >> > > >>> Best regards,
> >> > > >> > > >>> Marco
> >> > > >> > > >>>
> >> > > >> > > >>> Zhao, Patric <patric.z...@intel.com> schrieb am Mo.,
> >> > > >> > > >>> 18. Nov. 2019, 14:00:
> >> > > >> > > >>>
> >> > > >> > > >>>> Hi MXNet community,
> >> > > >> > > >>>>
> >> > > >> > > >>>> Since the first MKLDNN backend was integrated in
> >> > > >> > > >>>> release 1.2, the community has been continuously
> >> > > >> > > >>>> improving the quality and performance of the MKLDNN
> >> > > >> > > >>>> CPU backend.
> >> > > >> > > >>>> Nowadays, the MKLDNN backend is widely used for
> >> > > >> > > >>>> inference, especially INT8 inference, and we have
> >> > > >> > > >>>> received lots of very positive feedback from MXNet
> >> > > >> > > >>>> users.
> >> > > >> > > >>>>
> >> > > >> > > >>>> Milestones achieved so far:
> >> > > >> > > >>>>
> >> > > >> > > >>>> - MKLDNN integrated into Apache MXNet in release 1.2, Feb 2018 [1]
> >> > > >> > > >>>> - MKLDNN as the default CPU backend when building from source, Jan 2019 [2]
> >> > > >> > > >>>> - MKLDNN subgraph optimization on by default for inference, Jul 2019 [3]
> >> > > >> > > >>>> - MKLDNN major version upgrade in release 1.6, Oct 2019 [4]
> >> > > >> > > >>>>
> >> > > >> > > >>>> To strengthen Apache MXNet's technical leadership in
> >> > > >> > > >>>> the industry, I propose to make MKLDNN the default
> >> > > >> > > >>>> CPU backend in all binary distributions from the next
> >> > > >> > > >>>> release.
> >> > > >> > > >>>> The new milestones include:
> >> > > >> > > >>>>
> >> > > >> > > >>>> - Statically link the MKLDNN library into the binary, avoiding version mismatches at runtime [5]
> >> > > >> > > >>>> - Make the nightly build from master default to MKLDNN before the 1.7 release
> >> > > >> > > >>>> - Binary distribution with MKLDNN default from the 1.7 release
> >> > > >> > > >>>>
> >> > > >> > > >>>> What will be changed:
> >> > > >> > > >>>>
> >> > > >> > > >>>> - The mxnet and mxnet-cuXX binaries will be built with MKLDNN=1
> >> > > >> > > >>>> - mxnet-mkl and mxnet-cuXXmkl will not be changed in the minor releases (1.x); we plan to remove them in the next major release (2.0)
> >> > > >> > > >>>>
> >> > > >> > > >>>> Suggestions and comments are highly appreciated.
> >> > > >> > > >>>>
> >> > > >> > > >>>> Thanks,
> >> > > >> > > >>>>
> >> > > >> > > >>>> --Patric
> >> > > >> > > >>>>
> >> > > >> > > >>>> [1] https://github.com/apache/incubator-mxnet/pull/9677
> >> > > >> > > >>>> [2] https://lists.apache.org/thread.html/bfeae6ee46374112eb4dff1470c262959101e4bffb19930926963535@%3Cdev.mxnet.apache.org%3E
> >> > > >> > > >>>> [3] https://github.com/apache/incubator-mxnet/pull/15518
> >> > > >> > > >>>> [4] https://lists.apache.org/thread.html/f46ab920f18795496eafe713e6e9e561c684e06189085cec17b401dc@%3Cdev.mxnet.apache.org%3E
> >> > > >> > > >>>> [5] https://github.com/apache/incubator-mxnet/pull/16731
> >> > > >> > > >>>>
> >> > > >> > > >>
> >> > > >> > > >> —
> >> > > >> > > >> Alfredo Luque
> >> > > >> > > >> Software Engineer
> >> > > >> > > >> Machine Learning Infrastructure, Airbnb, San Francisco, CA
> >> > > >> > > >
> >> > > >> > > > —
> >> > > >> > > > Alfredo Luque
> >> > > >> > > > Software Engineer
> >> > > >> > > > Machine Learning Infrastructure, Airbnb, San Francisco, CA
> >> > > >>
> >> > > >> —
> >> > > >> Alfredo Luque
> >> > > >> Software Engineer
> >> > > >> Machine Learning Infrastructure, Airbnb, San Francisco, CA
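Chris's point about per-thread seeding can be illustrated with a small Python sketch. The seeding scheme below (base seed plus thread id) is a hypothetical stand-in for how an OMP-style pool might seed its workers, not MXNet's actual implementation; the point it demonstrates is that with the same base seed and the same total number of draws, the aggregate random sequence still depends on the thread count.

```python
import random

def parallel_draws(seed, num_threads, draws_per_thread):
    """Simulate an OMP-style pool: each 'thread' owns a generator seeded
    from the base seed plus its thread id (hypothetical scheme)."""
    draws = []
    for tid in range(num_threads):
        rng = random.Random(seed + tid)  # per-thread independent seeding
        draws.extend(rng.random() for _ in range(draws_per_thread))
    return draws

# Same base seed, same total of 12 draws, different thread counts:
two_threads = parallel_draws(42, num_threads=2, draws_per_thread=6)
four_threads = parallel_draws(42, num_threads=4, draws_per_thread=3)
print(two_threads == four_threads)  # False: the sequences diverge
```

The first three draws agree (both come from the generator seeded with 42), but the fourth draw already differs, which is why an operator like Dropout can produce different masks on machines with different core counts even under a fixed seed.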
Attachment: MKLDNN to default.xlsx
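On Alfredo's reproducibility question, a minimal framework-agnostic harness could look like the sketch below. `train` here is a toy stand-in for a real single-threaded training run, not an MXNet API; the point is the comparison: bitwise reproducibility means two same-seed runs agree exactly, not just approximately.

```python
import random

def train(seed, steps=1000):
    """Toy stand-in for a training run: the returned 'weights' depend
    only on the seed when execution is single-threaded."""
    rng = random.Random(seed)
    weights = [0.0] * 8
    for _ in range(steps):
        i = rng.randrange(len(weights))
        weights[i] += rng.uniform(-0.01, 0.01)  # toy update step
    return weights

# Bitwise check: repr() exposes every bit of a float, so this catches
# even last-ulp differences that an approximate comparison would hide.
run_a, run_b = train(seed=7), train(seed=7)
print(all(repr(x) == repr(y) for x, y in zip(run_a, run_b)))
```

With a real framework, the same idea applies: train twice with identical seeds and thread settings, then compare the serialized weight arrays byte for byte.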
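Related to Sam's question about telling the wheel variants apart: once the plain `mxnet` wheel ships with MKLDNN, users can check at runtime which features their installed build was compiled with. This sketch assumes the `mxnet.runtime.Features` API (present since MXNet 1.5) and falls back gracefully when mxnet is not installed.

```python
def mkldnn_enabled():
    """Return True/False for the installed MXNet build, or None when
    mxnet is not importable in this environment.
    Assumes the mxnet.runtime.Features API (MXNet 1.5+)."""
    try:
        from mxnet.runtime import Features
    except ImportError:
        return None  # mxnet not installed here
    return bool(Features().is_enabled("MKLDNN"))

print(mkldnn_enabled())
```

A check like this could also back a quick release-validation step: install each published wheel and assert that the MKLDNN flag matches what the package name promises.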