Thanks Alfredo, if you can create a GitHub issue with notes/steps, we can add it to the to-do list for integrating with the MXNet CI to test on m5a instances too. Then we can start tracking this on a regular basis. It would be great to actually test on ARM instances now that AWS has A1 instances too... I'll add it to the wish list ;-D
Sam

> On Nov 18, 2019, at 12:32 PM, Alfredo Luque <alfredo.lu...@airbnb.com.INVALID> wrote:
>
> Happy to run some benchmarks on an AWS m5a instance (Epyc) and a first-generation
> AMD Threadripper if someone has something easy to run and representative.
>
> On November 18, 2019 at 12:29:31 PM, Skalicky, Sam (sska...@amazon.com.invalid) wrote:
>
> That's a good idea, Alfredo. Are you able to help test on AMD CPUs? Or is
> there someone else in the mxnet dev@ community who can help?
>
> Sam
>
>> On Nov 18, 2019, at 12:27 PM, Alfredo Luque <alfredo.lu...@airbnb.com.INVALID> wrote:
>>
>> Verifying that there isn't a slowdown on AMD CPUs (e.g. Ryzen / Epyc) would
>> definitely make sense as a requirement. It seems odd to classify that as a
>> "nonstandard" use case.
>>
>> On November 18, 2019 at 12:20:33 PM, Skalicky, Sam (sska...@amazon.com.invalid) wrote:
>>
>> Thanks Patric & team for your work over the years to make MXNet fast with
>> MKLDNN!
>>
>> I think it would be great to make MKLDNN enabled by default. We will need
>> to continue producing variants without MKLDNN for those who don't want it
>> (Marco enumerated some use cases). How do you propose to identify the pip
>> wheels with/without MKLDNN? Previously we had mxnet-mkl and mxnet-cu101mkl
>> with MKLDNN. If the plain "mxnet" pip wheel now contains MKLDNN, what do you
>> propose we call the build without MKLDNN? mxnet-nomkl?
>>
>> Thanks!
>> Sam
>>
>>> On Nov 18, 2019, at 11:08 AM, Marco de Abreu <marco.g.ab...@gmail.com> wrote:
>>>
>>> Hi Patric,
>>>
>>> First of all, thanks a lot to you and your team for all the effort on MXNet
>>> and MKLDNN!
>>>
>>> Generally I'm inclined towards your proposal, but I'm thinking about the
>>> non-standard use cases:
>>> - AMD CPU
>>> - ARM CPU
>>> - Windows
>>> - GPU and MKLDNN enabled
>>> - Fully reproducible results (the medical and financial sectors requested
>>>   that, and we have some flags for CUDA)
>>>
>>> Is MKLDNN fully compatible with these use cases? If not, what would happen?
>>> If yes, do we have performance numbers?
>>>
>>> Best regards,
>>> Marco
>>>
>>> Zhao, Patric <patric.z...@intel.com> wrote on Mon., Nov 18, 2019, 14:00:
>>>
>>>> Hi MXNet community,
>>>>
>>>> Since the MKLDNN backend was first integrated in release 1.2, the community
>>>> has been continuously improving the quality and performance of the MKLDNN
>>>> CPU backend. Nowadays, the MKLDNN backend is widely used for inference,
>>>> especially INT8 inference, and we have received a lot of very positive
>>>> feedback from MXNet users.
>>>>
>>>> Milestones achieved so far:
>>>>
>>>> - MKLDNN integrated into Apache MXNet in release 1.2, Feb 2018 [1]
>>>> - MKLDNN as the default CPU backend when building from source, Jan 2019 [2]
>>>> - MKLDNN subgraph optimization enabled by default for inference, Jul 2019 [3]
>>>> - MKLDNN major version upgrade in release 1.6, Oct 2019 [4]
>>>>
>>>> To strengthen Apache MXNet's technical leadership in the industry, I propose
>>>> making MKLDNN the default CPU backend in all binary distributions starting
>>>> from the next release.
>>>> The new milestone includes:
>>>>
>>>> - Statically link the MKLDNN library into the binary to avoid version
>>>>   mismatches at runtime [5]
>>>> - Make MKLDNN the default in nightly builds from master before the 1.7 release
>>>> - Make MKLDNN the default in binary distributions from the 1.7 release.
>>>>
>>>> What will be changed:
>>>>
>>>> - The mxnet and mxnet-cuXX binaries will be built with MKLDNN=1
>>>> - mxnet-mkl and mxnet-cuXXmkl will not be changed in the minor releases (1.x)
>>>>   and are planned for removal in the next major release (2.0)
>>>>
>>>> Suggestions and comments are highly appreciated.
>>>>
>>>> Thanks,
>>>>
>>>> --Patric
>>>>
>>>>
>>>> [1] https://github.com/apache/incubator-mxnet/pull/9677
>>>> [2] https://lists.apache.org/thread.html/bfeae6ee46374112eb4dff1470c262959101e4bffb19930926963535@%3Cdev.mxnet.apache.org%3E
>>>> [3] https://github.com/apache/incubator-mxnet/pull/15518
>>>> [4] https://lists.apache.org/thread.html/f46ab920f18795496eafe713e6e9e561c684e06189085cec17b401dc@%3Cdev.mxnet.apache.org%3E
>>>> [5] https://github.com/apache/incubator-mxnet/pull/16731
>>
>> --
>> Alfredo Luque
>> Software Engineer
>> Machine Learning Infrastructure
>> Airbnb
>> San Francisco, CA
>
> --
> Alfredo Luque
> Software Engineer
> Machine Learning Infrastructure
> Airbnb
> San Francisco, CA
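
A minimal sketch of the kind of "easy to run and representative" check discussed above, assuming MXNet 1.5+ (for the mxnet.runtime.Features API) and the Gluon model zoo; it is an illustrative script, not an official MXNet benchmark. Comparing its output between the plain mxnet wheel and the mxnet-mkl wheel on an m5a (EPYC) or A1 (ARM) instance would give a first-order answer to the AMD/ARM questions raised in the thread:

import time

import mxnet as mx
from mxnet.runtime import Features

# Report whether this binary was compiled with the MKLDNN backend.
print("MKLDNN enabled:", Features().is_enabled("MKLDNN"))

# Randomly initialized ResNet-50; weights do not matter for a speed check.
net = mx.gluon.model_zoo.vision.resnet50_v1(pretrained=False)
net.initialize(ctx=mx.cpu())
net.hybridize(static_alloc=True, static_shape=True)

batch = mx.nd.random.uniform(shape=(16, 3, 224, 224), ctx=mx.cpu())

# Warm up so graph construction and MKLDNN primitive creation are excluded.
for _ in range(5):
    net(batch).wait_to_read()

runs = 20
start = time.time()
for _ in range(runs):
    net(batch).wait_to_read()
elapsed = time.time() - start
print("Average latency per batch: %.1f ms" % (1000.0 * elapsed / runs))

MXNet also documents an MXNET_MKLDNN_ENABLED environment variable for disabling the accelerator at runtime in MKLDNN builds, which may be handy for A/B comparisons within a single wheel.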