Regarding the cases listed by Marco:
- AMD CPU
From my understanding of the architecture, what works on C4 instances (with AVX2
support) should also work well on m5a, right? I believe the mxnet-mkl and
mxnet-cuXXmkl packages have been fully validated on AVX2 machines.
Also, we didn't perform any validation on AMD CPUs before, so why do we need to
do that this time?
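
If it helps, a quick sanity check on an m5a instance could look like the snippet
below (a hypothetical example, not an official validation script): it confirms the
installed wheel was built with MKL-DNN and uses MKLDNN_VERBOSE to show which JIT
ISA the kernels dispatch to at runtime (e.g. avx2 vs avx512).

# Hypothetical sanity check: confirm MKL-DNN is built in and see which ISA it uses.
import os

# Must be set before the first primitive runs; MKL-DNN then prints one line per
# primitive execution, including the JIT implementation (e.g. "jit:avx2").
os.environ["MKLDNN_VERBOSE"] = "1"

import mxnet as mx
from mxnet.runtime import Features

print("MKLDNN enabled:", Features().is_enabled("MKLDNN"))

# A tiny convolution is enough to trigger the MKL-DNN dispatcher.
x = mx.nd.random.uniform(shape=(1, 3, 224, 224))
w = mx.nd.random.uniform(shape=(16, 3, 3, 3))
y = mx.nd.Convolution(data=x, weight=w, kernel=(3, 3), num_filter=16, no_bias=True)
y.wait_to_read()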

- ARM CPU
I'm not aware that we're releasing any convenience binaries for ARM CPUs. This
proposal mainly targets the PyPI packages.

- Windows
Already validated by CI. We're also releasing mxnet-mkl packages for Windows.

- GPU and MKLDNN enabled
Already validated by CI, and mxnet-cuXXmkl packages have been released for
several versions.

- Fully reproducible results (medical and financial sector requested that
and we have some flags for cuda)
I'm not sure I understand this case. We have had the MKL-DNN backend for a
while now, and its functionality and correctness have been verified by MXNet
users.
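
If the concern is run-to-run reproducibility, the "flags for cuda" Marco mentions
are presumably things like disabling cuDNN autotune. A minimal sketch of such a
setup (hypothetical, not an official recipe; the environment variable has to be
set before mxnet is imported) would be:

# Hypothetical reproducibility setup, not an official recipe.
import os

# Pick the same cuDNN convolution algorithm on every run instead of autotuning.
os.environ["MXNET_CUDNN_AUTOTUNE_DEFAULT"] = "0"

import numpy as np
import mxnet as mx

# Fix the seeds so weight initialization and data shuffling repeat exactly.
mx.random.seed(42)
np.random.seed(42)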

-tao

On Tue, Nov 19, 2019 at 4:41 AM Marco de Abreu <marco.g.ab...@gmail.com>
wrote:

> Sorry, my intent with the "non-standard" phrase was not about MXNet in general
> but rather from MKLDNN's point of view: considering that it's being
> developed by Intel, I assumed that MKLDNN might consider non-Intel
> use cases non-standard.
>
> -Marco
>
> Skalicky, Sam <sska...@amazon.com.invalid> wrote on Mon., 18 Nov 2019,
> 21:34:
>
> > Thanks Alfredo. If you can create a GitHub issue with notes/steps, we can
> > add this to the to-do list for integrating with the MXNet CI to test on m5a
> > instances too. Then we can start tracking this on a regular basis. It would
> > be great to actually test on ARM instances now that AWS has A1 instances
> > too… I'll add it to the wish list ;-D
> >
> > Sam
> >
> > > On Nov 18, 2019, at 12:32 PM, Alfredo Luque <alfredo.lu...@airbnb.com.INVALID>
> > > wrote:
> > >
> > > Happy to run some benchmarks on an AWS m5a instance (Epyc) and a
> > > first-generation AMD Threadripper if someone has something easy to run and
> > > representative.
> > >
> > > On November 18, 2019 at 12:29:31 PM, Skalicky, Sam (
> > > sska...@amazon.com.invalid) wrote:
> > >
> > > Thanks, that's a good idea Alfredo. Are you able to help test on AMD CPUs? Or is
> > > there someone else in the mxnet dev@ community who can help?
> > >
> > > Sam
> > >
> > >> On Nov 18, 2019, at 12:27 PM, Alfredo Luque <alfredo.lu...@airbnb.com.INVALID> wrote:
> > >>
> > >> Verifying that there isn’t a slowdown on AMD CPUs (e.g. Ryzen / Epyc) would
> > >> definitely make sense as a requirement. It seems odd to classify that as a
> > >> “nonstandard” use case.
> > >>
> > >> On November 18, 2019 at 12:20:33 PM, Skalicky, Sam (
> > >> sska...@amazon.com.invalid) wrote:
> > >>
> > >> Thanks Patric & team for your work over the years to make MXNet fast with
> > >> MKLDNN!
> > >>
> > >> I think it would be great to make MKLDNN enabled by default. We will need
> > >> to continue producing variants without MKLDNN for those who don’t want it
> > >> (Marco enumerated some use cases). How do you propose to identify the pip
> > >> wheels with/without MKLDNN? Previously we had mxnet-mkl and mxnet-cu101mkl
> > >> with MKLDNN. If the plain “mxnet” pip wheel now contains MKLDNN, what do you
> > >> propose we call the build without MKLDNN? mxnet-nomkl?
> > >>
> > >> Thanks!
> > >> Sam
> > >>
> > >>> On Nov 18, 2019, at 11:08 AM, Marco de Abreu <marco.g.ab...@gmail.com>
> > >>> wrote:
> > >>>
> > >>> Hi Patric,
> > >>>
> > >>> First of all, thanks a lot to you and your team for all the effort on
> > >>> MXNet and mkldnn!
> > >>>
> > >>> Generally I'm inclined towards your proposal, but I'm thinking about the
> > >>> non-standard use cases:
> > >>> - AMD CPU
> > >>> - ARM CPU
> > >>> - Windows
> > >>> - GPU and MKLDNN enabled
> > >>> - Fully reproducible results (medical and financial sector requested that
> > >>> and we have some flags for cuda)
> > >>>
> > >>> Is mkldnn fully compatible with these use cases? If not, what would
> > >>> happen?
> > >>> If yes, do we have performance numbers?
> > >>>
> > >>> Best regards,
> > >>> Marco
> > >>>
> > >>> Zhao, Patric <patric.z...@intel.com> wrote on Mon., 18 Nov 2019,
> > >>> 14:00:
> > >>>
> > >>>> Hi MXNet community,
> > >>>>
> > >>>> Since the MKLDNN backend was first integrated in release 1.2, the community
> > >>>> has been continuously improving the quality and performance of the MKLDNN CPU
> > >>>> backend. Nowadays, the MKLDNN backend is widely used for inference, especially
> > >>>> INT8 inference, and we have received lots of very positive feedback from MXNet
> > >>>> users.
> > >>>>
> > >>>> Milestones achieved so far:
> > >>>>
> > >>>> - MKLDNN integrated into Apache MXNet in release 1.2, Feb 2018 [1]
> > >>>> - MKLDNN backend made the default CPU backend for source builds, Jan 2019 [2]
> > >>>> - MKLDNN subgraph optimization enabled by default for inference, Jul 2019 [3]
> > >>>> - MKLDNN major version upgrade in release 1.6, Oct 2019 [4]
> > >>>>
> > >>>> To build more success and technical leadership for Apache MXNet in the
> > >>>> industry, I propose to make MKLDNN the default CPU backend in all binary
> > >>>> distributions from the next release.
> > >>>> The new milestones include:
> > >>>>
> > >>>> - Statically link the MKLDNN library into the binary to avoid version
> > >>>> mismatches at runtime [5]
> > >>>> - Make the nightly builds from master default to MKLDNN before the 1.7 release
> > >>>> - Distribute binaries with MKLDNN by default starting from the 1.7 release.
> > >>>>
> > >>>> What will be changed:
> > >>>>
> > >>>> - mxnet and mxnet-cuXX binaries will be built with MKLDNN=1
> > >>>> - mxnet-mkl and mxnet-cuXXmkl will not be changed in the minor releases
> > >>>> (1.x) and are planned for removal in the next major release (2.0)
> > >>>>
> > >>>> Suggestions and comments are highly appreciated.
> > >>>>
> > >>>> Thanks,
> > >>>>
> > >>>> --Patric
> > >>>>
> > >>>>
> > >>>> [1] https://github.com/apache/incubator-mxnet/pull/9677
> > >>>> [2] https://lists.apache.org/thread.html/bfeae6ee46374112eb4dff1470c262959101e4bffb19930926963535@%3Cdev.mxnet.apache.org%3E
> > >>>> [3] https://github.com/apache/incubator-mxnet/pull/15518
> > >>>> [4] https://lists.apache.org/thread.html/f46ab920f18795496eafe713e6e9e561c684e06189085cec17b401dc@%3Cdev.mxnet.apache.org%3E
> > >>>> [5] https://github.com/apache/incubator-mxnet/pull/16731
> > >>>>
> > >>
> > >> —
> > >> Alfredo Luque
> > >> Software Engineer
> > >> Machine Learning Infrastructure
> > >> Airbnb
> > >> San Francisco, CA
> > >
> > > —
> > > Alfredo Luque
> > > Software Engineer
> > > Machine Learning Infrastructure
> > > Airbnb
> > > San Francisco, CA
> >
> >
>
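
P.S. For Alfredo's offer above to benchmark on m5a / Threadripper, something easy
to run and reasonably representative could be a small Gluon inference loop like the
sketch below (hypothetical; it assumes the model zoo can download resnet50_v2 and
is only a rough latency check, not an official benchmark):

# Hypothetical quick CPU inference benchmark, not an official MXNet benchmark script.
import time
import mxnet as mx
from mxnet.gluon.model_zoo import vision

net = vision.resnet50_v2(pretrained=True)  # downloads weights on first use
net.hybridize(static_alloc=True, static_shape=True)

x = mx.nd.random.uniform(shape=(1, 3, 224, 224))

# Warm up so graph construction and MKL-DNN primitive creation are excluded.
for _ in range(10):
    net(x).wait_to_read()

runs = 100
start = time.time()
for _ in range(runs):
    net(x).wait_to_read()
print("average latency: %.2f ms" % ((time.time() - start) / runs * 1000.0))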
