Great job, well done everyone!!

Lv, Tao A <tao.a...@intel.com> wrote on Fri., Nov 1, 2019, 03:50:
> Hi dev,
>
> The feature branch mkldnn-v1.0 has been merged to master. Really appreciate your support for this task.
>
> Branch: https://github.com/apache/incubator-mxnet/tree/mkldnn-v1.0
> Project: https://github.com/apache/incubator-mxnet/projects/16
> PR: https://github.com/apache/incubator-mxnet/pull/16555
>
> If possible, could downstream projects please help verify the latest master branch, and feel free to report any issues.
>
> Thanks,
> -tao
>
> -----Original Message-----
> From: Lv, Tao A <tao.a...@intel.com>
> Sent: Sunday, July 28, 2019 11:55 PM
> To: dev@mxnet.incubator.apache.org
> Cc: Zhao, Patric <patric.z...@intel.com>; Ye, Jason Y <jason.y...@intel.com>
> Subject: RE: [Discuss] Upgrade MKL-DNN submodule to its v1.0 release
>
> Update:
>
> I just cut the feature branch for the MKL-DNN 1.0 integration:
> https://github.com/apache/incubator-mxnet/tree/mkldnn-v1.0
>
> Thanks,
> -tao
>
> -----Original Message-----
> From: Lv, Tao A <tao.a...@intel.com>
> Sent: Friday, July 26, 2019 10:21 PM
> To: dev@mxnet.incubator.apache.org
> Cc: Zhao, Patric <patric.z...@intel.com>; Ye, Jason Y <jason.y...@intel.com>
> Subject: RE: [Discuss] Upgrade MKL-DNN submodule to its v1.0 release
>
> It seems we don't have any objections. I will try to cut the feature branch in the following days.
>
> Thanks,
> -tao
>
> -----Original Message-----
> From: Lv, Tao A <tao.a...@intel.com>
> Sent: Saturday, July 20, 2019 11:06 PM
> To: dev@mxnet.incubator.apache.org
> Cc: Zhao, Patric <patric.z...@intel.com>; Ye, Jason Y <jason.y...@intel.com>
> Subject: [Discuss] Upgrade MKL-DNN submodule to its v1.0 release
>
> Hi dev,
>
> MKL-DNN just published its first major release this month: https://github.com/intel/mkl-dnn/releases/tag/v1.0. Here I would like to start a discussion about upgrading the MKL-DNN integration from the current v0.20 to v1.0.
>
> Motivation
>
> To improve the general look and feel of the library and to solve a few important design issues, the v1.0 major release changes some of the data structures, the primitive APIs and the execution model, and compatibility with v0.x versions is broken accordingly. The changes in MKL-DNN v1.0 are mostly covered in the RFC for v1.0 <https://github.com/intel/mkl-dnn/tree/rfc-api-changes-v1.0/doc/rfc/api-v1.0>. The major changes are listed below:
> * Support large tensors with int64_t dimension sizes.
> * Expose the scratchpad to support stateless primitives and better memory management, and hence thread safety.
> * Pass memory and stream to primitives at execution time.
> * Rework the MKL-DNN memory descriptor.
> * Split LSTM/GRU/RNN into different primitives.
> * Remove the MKLML dependency and stop releasing MKLML and iomp packages in the MKL-DNN repository.
> * Support Intel integrated graphics.
>
> With these changes, we can resolve or mitigate several existing issues in MXNet, e.g. #15576 for thread safety, #15544 for the MKLML/iomp5 license issue, and the int64 tensor size limit of the MKL-DNN backend. Besides that, all new features will go to v1.x and will not be back-ported to v0.x. MXNet needs to update its MKL-DNN dependency to v1.0 to better leverage the new features and performance improvements.
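>
> To give a feel for the new programming model, here is a minimal standalone sketch against the v1.0 C++ API (the ReLU primitive, shapes and variable names are purely illustrative and are not the code we will land in MXNet). It touches three of the changes listed above: int64_t dimensions, memory and stream passed at execution time, and the user-managed scratchpad:
>
>   #include <unordered_map>
>   #include "mkldnn.hpp"
>
>   int main() {
>       using namespace mkldnn;
>
>       // Engine and stream are explicit objects in v1.0; the stream is
>       // handed to the primitive at execution time.
>       engine eng(engine::kind::cpu, 0);
>       stream strm(eng);
>
>       // Dimensions are int64_t now, so large tensors are representable.
>       memory::dims shape = {8, 96, 55, 55};
>       memory::desc md(shape, memory::data_type::f32,
>                       memory::format_tag::nchw);
>       memory src(md, eng), dst(md, eng);
>
>       // Let the caller own the scratchpad, so the primitive itself
>       // stays stateless (this is what helps with thread safety).
>       primitive_attr attr;
>       attr.set_scratchpad_mode(scratchpad_mode::user);
>
>       auto relu_d = eltwise_forward::desc(prop_kind::forward_inference,
>                                           algorithm::eltwise_relu, md, 0.f);
>       auto relu_pd = eltwise_forward::primitive_desc(relu_d, attr, eng);
>       memory scratchpad(relu_pd.scratchpad_desc(), eng);
>       eltwise_forward relu(relu_pd);
>
>       // Memory objects are passed as execution arguments instead of
>       // being baked into the primitive at construction as in v0.x.
>       relu.execute(strm, {{MKLDNN_ARG_SRC, src},
>                           {MKLDNN_ARG_DST, dst},
>                           {MKLDNN_ARG_SCRATCHPAD, scratchpad}});
>       strm.wait();
>       return 0;
>   }
>
> For contrast, in the v0.x model the same primitive would be constructed with the memory primitives bound up front and submitted to a global eager stream, which is part of why v0.x primitives are stateful.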
>
> Development
>
> Basically, we will follow the same integration methodology we used for the v0.x integration, including operator implementation, registration, NDArray modification and graph partitioning. For better collaboration within the community, we will have a feature branch for the development and validation of the MKL-DNN 1.0 integration. All PRs to the feature branch should pass code review and CI and finally get committers' approval. The development can be divided into 3 parts, and all the work will be done before Q3'19 ends. During the development, the feature branch will be synced with the master branch periodically.
> * P1: make/cmake build with MKL-DNN v1.0, and integration of all FP32 CNN operators (in src/operator/nn/mkldnn/). We can do FP32 training and inference for CNN models after P1 is done.
> * P2: quantization pass and INT8 operator integration (in src/operator/quantization/mkldnn). We can do INT8 quantization and INT8 inference after P2 is done.
> * P3: RNN operator integration.
>
> If needed, documents will be revised accordingly during the development.
>
> Validation:
> * Use the feature branch for development - all PRs should pass MXNet CI.
> * Disable MKL-DNN related tests at the beginning of development and re-enable them incrementally as development proceeds.
> * Intel internal validation: mainly focused on performance and convergence validation on CPU, with models from the MXNet examples, Gluon-CV and Gluon-NLP.
>
> Criteria for development done:
> * MXNet CI: pass all existing unit tests and nightly tests.
> * Accuracy: pass training convergence and inference accuracy validation.
> * Performance: achieve FP32/INT8 performance similar to the v0.x integration.
>
> Upstreaming to master branch:
>
> After development is done, we will start to upstream the feature branch to the master branch. Since we cannot have two MKL-DNN libraries in MXNet simultaneously, the upstreaming should be done in a single PR. That PR will probably be large, so I hope the community can take the time to review and comment during the development of the feature branch.
>
> We need to do our best to make this happen before the 1.6.0 release so that we can address the license issue raised in the 1.5.0 vote.
>
> Please let me know what you think about this plan. If you think something should be fixed or improved in this integration, please let me know as well.
>
> Thanks,
> -tao (on behalf of the Intel MXNet team)