Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

Anton Chernov Tue, 06 Nov 2018 07:18:32 -0800

The following PRs have been created so far:

Infer dtype in SymbolBlock import from input symbol (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13117


[MXNET-953] Fix oob memory read (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13118

[MXNET-969] Fix buffer overflow in RNNOp (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13119

[MXNET-922] Fix memleak in profiler (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13120

Set correct update on kvstore flag in dist_device_sync mode (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13121

update mshadow (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13122

CudnnFind() usage improvements (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13123

Fix lazy record io when used with dataloader and multi_worker > 0 (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/13124


As stated previously, I would be rather opposed to having the following PRs in
the patch release:

Gluon LSTM Projection and Clipping Support (#13055) v1.3.x
https://github.com/apache/incubator-mxnet/pull/13129

sample_like operators (#13034) v1.3.x
https://github.com/apache/incubator-mxnet/pull/13130


Best
Anton

On Tue, 6 Nov 2018 at 16:06, Anton Chernov <mecher...@gmail.com> wrote:

> Hi Haibin,
>
> I have a few comments regarding the proposed performance improvement
> changes.
>
> CUDNN support for LSTM with projection & clipping
> https://github.com/apache/incubator-mxnet/pull/13056
>
> There is no doubt that this change brings value, but I don't see it as a
> critical bug fix. I would rather leave it for the next major release.
>
> sample_like operators
> https://github.com/apache/incubator-mxnet/pull/13034
>
> Even if it's related to performance, this is an addition of functionality
> and I would also push this to be in the next major release only.
>
>
> Best
> Anton
>
>
> On Tue, 6 Nov 2018 at 15:55, Anton Chernov <mecher...@gmail.com> wrote:
>
>> Hi Patric,
>>
>> This change was listed in the 'PR candidates suggested for consideration
>> for v1.3.1 patch release' section [1].
>>
>> You are right, I also think that this is not a critical hotfix change
>> that should be included in the 1.3.1 patch release.
>>
>> Thus I'm not making any further efforts to bring it in.
>>
>> Best
>> Anton
>>
>> [1]
>> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release#PR_candidates
>>
>>
>> On Tue, 6 Nov 2018 at 1:14, Zhao, Patric <patric.z...@intel.com> wrote:
>>
>>> Hi Anton,
>>>
>>> Thanks for looking into the MKL-DNN PR.
>>>
>>> As I understand from the cwiki (
>>> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release
>>> ),
>>> these features will go into 1.4 rather than the 1.3.1 patch release.
>>>
>>> Feel free to correct me :)
>>>
>>> Thanks,
>>>
>>> --Patric
>>>
>>> > -----Original Message-----
>>> > From: Anton Chernov [mailto:mecher...@gmail.com]
>>> > Sent: Tuesday, November 6, 2018 3:11 AM
>>> > To: d...@mxnet.apache.org
>>> > Subject: Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch
>>> > release
>>> >
>>> > It seems that there is a problem porting the following changes to the
>>> > v1.3.x release branch:
>>> >
>>> > Implement mkldnn convolution fusion and quantization
>>> > https://github.com/apache/incubator-mxnet/pull/12530
>>> >
>>> > MKL-DNN Quantization Examples and README
>>> > https://github.com/apache/incubator-mxnet/pull/12808
>>> >
>>> > The bases are different.
>>> >
>>> > I would need help from the authors of these changes to make a backport PR.
>>> >
>>> > @ZhennanQin, @xinyu-intel would you be able to assist me and create the
>>> > corresponding PRs?
>>> >
>>> > Without proper history and domain knowledge I would not be able to create
>>> > them on my own in a reasonable amount of time, I'm afraid.
>>> >
>>> > Best regards,
>>> > Anton
>>> >
>>> > On Mon, 5 Nov 2018 at 19:45, Anton Chernov <mecher...@gmail.com> wrote:
>>> >
>>> > >
>>> > > As part of:
>>> > >
>>> > > Implement mkldnn convolution fusion and quantization
>>> > > https://github.com/apache/incubator-mxnet/pull/12530
>>> > >
>>> > > I propose to add the examples and documentation PR as well:
>>> > >
>>> > > MKL-DNN Quantization Examples and README
>>> > > https://github.com/apache/incubator-mxnet/pull/12808
>>> > >
>>> > >
>>> > > Best regards,
>>> > > Anton
>>> > >
>>> > > On Mon, 5 Nov 2018 at 19:02, Anton Chernov <mecher...@gmail.com> wrote:
>>> > >
>>> > >> Dear MXNet community,
>>> > >>
>>> > >> I will be the release manager for the upcoming 1.3.1 patch release.
>>> > >> Naveen will be co-managing the release and providing help from the
>>> > >> committers' side.
>>> > >>
>>> > >> The following dates have been set:
>>> > >>
>>> > >> Code Freeze: 31st October 2018
>>> > >> Release published: 13th November 2018
>>> > >>
>>> > >> Release notes have been drafted here [1].
>>> > >>
>>> > >>
>>> > >> * Known issues
>>> > >>
>>> > >> Update MKL-DNN dependency
>>> > >> https://github.com/apache/incubator-mxnet/pull/12953
>>> > >>
>>> > >> This PR hasn't been merged even to master yet. Requires additional
>>> > >> discussion and merge.
>>> > >>
>>> > >> distributed kvstore bug in MXNet
>>> > >> https://github.com/apache/incubator-mxnet/issues/12713
>>> > >>
>>> > >> > When distributed kvstore is used, by default gluon.Trainer doesn't
>>> > >> > work with mx.optimizer.LRScheduler if a worker has more than 1 GPU. To
>>> > >> > be more specific, the trainer updates once per GPU, the LRScheduler
>>> > >> > object is shared across GPUs and gets a wrong update count.
>>> > >>
>>> > >> This needs to be fixed. [6]
>>> > >>
>>> > >>
>>> > >> * Changes
>>> > >>
>>> > >> The following changes will be ported to the release branch, per [2]:
>>> > >>
>>> > >> Infer dtype in SymbolBlock import from input symbol [3]
>>> > >> https://github.com/apache/incubator-mxnet/pull/12412
>>> > >>
>>> > >> [MXNET-953] Fix oob memory read
>>> > >> https://github.com/apache/incubator-mxnet/pull/12631
>>> > >>
>>> > >> [MXNET-969] Fix buffer overflow in RNNOp
>>> > >> https://github.com/apache/incubator-mxnet/pull/12603
>>> > >>
>>> > >> [MXNET-922] Fix memleak in profiler
>>> > >> https://github.com/apache/incubator-mxnet/pull/12499
>>> > >>
>>> > >> Implement mkldnn convolution fusion and quantization (MXNet Graph
>>> > >> Optimization and Quantization based on subgraph and MKL-DNN proposal
>>> > >> [4])
>>> > >> https://github.com/apache/incubator-mxnet/pull/12530
>>> > >>
>>> > >> The following items (test cases) should already be part of 1.3.0:
>>> > >>
>>> > >> [MXNET-486] Create CPP test for concat MKLDNN operator
>>> > >> https://github.com/apache/incubator-mxnet/pull/11371
>>> > >>
>>> > >> [MXNET-489] MKLDNN Pool test
>>> > >> https://github.com/apache/incubator-mxnet/pull/11608
>>> > >>
>>> > >> [MXNET-484] MKLDNN C++ test for LRN operator
>>> > >> https://github.com/apache/incubator-mxnet/pull/11831
>>> > >>
>>> > >> [MXNET-546] Add unit test for MKLDNNSum
>>> > >> https://github.com/apache/incubator-mxnet/pull/11272
>>> > >>
>>> > >> [MXNET-498] Test MKLDNN backward operators
>>> > >> https://github.com/apache/incubator-mxnet/pull/11232
>>> > >>
>>> > >> [MXNET-500] Test cases improvement for MKLDNN on Gluon
>>> > >> https://github.com/apache/incubator-mxnet/pull/10921
>>> > >>
>>> > >> Set correct update on kvstore flag in dist_device_sync mode (as part
>>> > >> of fixing [5])
>>> > >> https://github.com/apache/incubator-mxnet/pull/12786
>>> > >>
>>> > >> upgrade mshadow version
>>> > >> https://github.com/apache/incubator-mxnet/pull/12692
>>> > >> But another PR will be used instead:
>>> > >> update mshadow
>>> > >> https://github.com/apache/incubator-mxnet/pull/12674
>>> > >>
>>> > >> CudnnFind() usage improvements
>>> > >> https://github.com/apache/incubator-mxnet/pull/12804
>>> > >> A critical CUDNN fix that reduces GPU memory consumption and
>>> > >> addresses this memory leak issue. This is an important fix to include
>>> > >> in 1.3.1.
>>> > >>
>>> > >>
>>> > >> From the discussion about gluon toolkits:
>>> > >>
>>> > >> disable opencv threading for forked process
>>> > >> https://github.com/apache/incubator-mxnet/pull/12025
>>> > >>
>>> > >> Fix lazy record io when used with dataloader and multi_worker > 0
>>> > >> https://github.com/apache/incubator-mxnet/pull/12554
>>> > >>
>>> > >> fix potential floating number overflow, enable float16
>>> > >> https://github.com/apache/incubator-mxnet/pull/12118
>>> > >>
>>> > >>
>>> > >>
>>> > >> * Resolved issues
>>> > >>
>>> > >> MxNet 1.2.1–module get_outputs()
>>> > >> https://discuss.mxnet.io/t/mxnet-1-2-1-module-get-outputs/1882
>>> > >>
>>> > >> As far as I can see from the comments, the issue has been resolved and no
>>> > >> actions need to be taken for this release. [7] is mentioned in this
>>> > >> regard, but I don't see any action points here either.
>>> > >>
>>> > >>
>>> > >> With Naveen's help, I will start porting the mentioned PRs to the 1.3.x
>>> > >> branch.
>>> > >>
>>> > >>
>>> > >> Best regards,
>>> > >> Anton
>>> > >>
>>> > >> [1] https://cwiki.apache.org/confluence/x/eZGzBQ
>>> > >> [2] https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release
>>> > >> [3] https://github.com/apache/incubator-mxnet/issues/11849
>>> > >> [4] https://cwiki.apache.org/confluence/display/MXNET/MXNet+Graph+Optimization+and+Quantization+based+on+subgraph+and+MKL-DNN
>>> > >> [5] https://github.com/apache/incubator-mxnet/issues/12713
>>> > >> [6] https://github.com/apache/incubator-mxnet/issues/12713#issuecomment-435773777
>>> > >> [7] https://github.com/apache/incubator-mxnet/pull/11005
>>> > >>
>>> > >>
>>>
>>
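
Illustrative note on the distributed kvstore issue quoted in the original announcement (the trainer updates once per GPU while one LRScheduler object is shared across GPUs): the effect on the update count can be shown with a toy model. This is only a sketch in plain Python. The class and function names below are hypothetical and are not MXNet APIs, and the step-decay rule is an assumption chosen to make the miscount visible.

```python
# Toy model only: StepDecayScheduler and lr_after are hypothetical names,
# not MXNet classes or functions.

class StepDecayScheduler:
    """Halves the learning rate every `step` updates it is told about."""

    def __init__(self, base_lr, step):
        self.base_lr = base_lr
        self.step = step

    def __call__(self, num_update):
        return self.base_lr * (0.5 ** (num_update // self.step))


def lr_after(iterations, num_gpus, scheduler, count_per_gpu):
    """Return the learning rate the scheduler reports after `iterations`."""
    num_update = 0
    for _ in range(iterations):
        # Assumed faulty behaviour: the shared count advances once per GPU.
        # Assumed intended behaviour: it advances once per iteration.
        num_update += num_gpus if count_per_gpu else 1
    return scheduler(num_update)


scheduler = StepDecayScheduler(base_lr=0.1, step=100)
intended = lr_after(100, num_gpus=4, scheduler=scheduler, count_per_gpu=False)
observed = lr_after(100, num_gpus=4, scheduler=scheduler, count_per_gpu=True)
print(intended, observed)
```

In this toy setup, with 4 GPUs the shared count reaches 400 after 100 iterations instead of 100, so the learning rate decays four times where one decay was intended.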
