Similar to the two PRs that Haibin suggested, 12992 introduces a new interface for controlling determinism, which is better suited for a minor release.
I think that, other than the lack of a release manager to drive the 1.4.0 release, there’s no reason we cannot do two releases (1.4.0 & 1.3.1) at the same time. I’m willing to help with the 1.4.0 release to make these new features available one month sooner, if there are no other concerns.

-sz

> On Nov 6, 2018, at 3:30 PM, Lin Yuan <apefor...@gmail.com> wrote:
>
> Hi Anton,
>
> Thanks for helping the release.
> The following PRs are needed by customers who want to use deterministic
> CUDNN convolution algorithms:
>
> https://github.com/apache/incubator-mxnet/pull/12992
> https://github.com/apache/incubator-mxnet/pull/13049
>
> Thanks!
>
> Lin
>
>
> On Tue, Nov 6, 2018 at 1:51 PM Aaron Markham <aaron.s.mark...@gmail.com>
> wrote:
>
>> Hi Anton,
>> I have the following suggestions for fixes to include in 1.3.1. These each
>> have updates to files that will impact docs generation for the 1.3.x
>> version of the website's Python API docs:
>>
>> https://github.com/apache/incubator-mxnet/pull/12879
>> https://github.com/apache/incubator-mxnet/pull/12871
>> https://github.com/apache/incubator-mxnet/pull/12856
>>
>> Thanks,
>> Aaron
>>
>>> On Tue, Nov 6, 2018 at 1:29 PM Lai Wei <roywei...@gmail.com> wrote:
>>>
>>> Hi Anton,
>>>
>>> Thanks for driving this, I would like to include the following fix in
>>> 1.3.1:
>>> Allow infer shape partial on foreach operator:
>>> https://github.com/apache/incubator-mxnet/pull/12471
>>>
>>> Keras-MXNet needs this functionality to infer shape partially
>>> on foreach operator. (Used in RNN operators)
>>>
>>> Thanks a lot!
>>>
>>>
>>> Best Regards
>>> Lai Wei
>>>
>>>
>>>
>>> On Tue, Nov 6, 2018 at 10:44 AM Haibin Lin <haibin.lin....@gmail.com>
>>> wrote:
>>>
>>>> Hi Naveen and Anton,
>>>>
>>>> Thanks for pointing that out. You are right that these are not critical
>>>> fixes. Putting them in 1.4.0 is more appropriate. PRs are closed.
>>>>
>>>> Best,
>>>> Haibin
>>>>
>>>> On Tue, Nov 6, 2018 at 7:35 AM Naveen Swamy <mnnav...@gmail.com> wrote:
>>>>
>>>>> Please note that this is a patch release(1.3.1) to address critical bugs!,
>>>>> For everything else please wait for 1.4.0 which is planned very shortly
>>>>> after 1.3.1
>>>>>
>>>>>> On Nov 6, 2018, at 7:17 AM, Anton Chernov <mecher...@gmail.com> wrote:
>>>>>>
>>>>>> The following PR's have been created so far:
>>>>>>
>>>>>> Infer dtype in SymbolBlock import from input symbol (v1.3.x)
>>>>>> https://github.com/apache/incubator-mxnet/pull/13117
>>>>>>
>>>>>> [MXNET-953] Fix oob memory read (v1.3.x)
>>>>>> https://github.com/apache/incubator-mxnet/pull/13118
>>>>>>
>>>>>> [MXNET-969] Fix buffer overflow in RNNOp (v1.3.x)
>>>>>> https://github.com/apache/incubator-mxnet/pull/13119
>>>>>>
>>>>>> [MXNET-922] Fix memleak in profiler (v1.3.x)
>>>>>> https://github.com/apache/incubator-mxnet/pull/13120
>>>>>>
>>>>>> Set correct update on kvstore flag in dist_device_sync mode (v1.3.x)
>>>>>> https://github.com/apache/incubator-mxnet/pull/13121
>>>>>>
>>>>>> update mshadow (v1.3.x)
>>>>>> https://github.com/apache/incubator-mxnet/pull/13122
>>>>>>
>>>>>> CudnnFind() usage improvements (v1.3.x)
>>>>>> https://github.com/apache/incubator-mxnet/pull/13123
>>>>>>
>>>>>> Fix lazy record io when used with dataloader and multi_worker > 0 (v1.3.x)
>>>>>> https://github.com/apache/incubator-mxnet/pull/13124
>>>>>>
>>>>>>
>>>>>> As stated previously I would be rather opposed to have following PR's it in
>>>>>> the patch release:
>>>>>>
>>>>>> Gluon LSTM Projection and Clipping Support (#13055) v1.3.x
>>>>>> https://github.com/apache/incubator-mxnet/pull/13129
>>>>>>
>>>>>> sample_like operators (#13034) v1.3.x
>>>>>> https://github.com/apache/incubator-mxnet/pull/13130
>>>>>>
>>>>>>
>>>>>> Best
>>>>>> Anton
>>>>>>
>>>>>> Tue, Nov 6, 2018 at 16:06, Anton Chernov <mecher...@gmail.com>:
>>>>>>
>>>>>>> Hi Haibin,
>>>>>>>
>>>>>>> I have a few comments regarding the proposed performance improvement
>>>>>>> changes.
>>>>>>>
>>>>>>> CUDNN support for LSTM with projection & clipping
>>>>>>> https://github.com/apache/incubator-mxnet/pull/13056
>>>>>>>
>>>>>>> There is no doubt that this change brings value, but I don't see it as a
>>>>>>> critical bug fix. I would rather leave it for the next major release.
>>>>>>>
>>>>>>> sample_like operators
>>>>>>> https://github.com/apache/incubator-mxnet/pull/13034
>>>>>>>
>>>>>>> Even if it's related to performance, this is an addition of functionality
>>>>>>> and I would also push this to be in the next major release only.
>>>>>>>
>>>>>>>
>>>>>>> Best
>>>>>>> Anton
>>>>>>>
>>>>>>>
>>>>>>> Tue, Nov 6, 2018 at 15:55, Anton Chernov <mecher...@gmail.com>:
>>>>>>>
>>>>>>>> Hi Patric,
>>>>>>>>
>>>>>>>> This change was listed in the 'PR candidates suggested for consideration
>>>>>>>> for v1.3.1 patch release' section [1].
>>>>>>>>
>>>>>>>> You are right, I also think that this is not a critical hotfix change
>>>>>>>> that should be included into the 1.3.1 patch release.
>>>>>>>>
>>>>>>>> Thus I'm not making any further efforts to bring it in.
>>>>>>>>
>>>>>>>> Best
>>>>>>>> Anton
>>>>>>>>
>>>>>>>> [1] https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release#PR_candidates
>>>>>>>>
>>>>>>>>
>>>>>>>> Tue, Nov 6, 2018 at 1:14, Zhao, Patric <patric.z...@intel.com>:
>>>>>>>>
>>>>>>>>> Hi Anton,
>>>>>>>>>
>>>>>>>>> Thanks for looking into the MKL-DNN PR.
>>>>>>>>>
>>>>>>>>> As my understanding of cwiki (
>>>>>>>>> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release
>>>>>>>>> ),
>>>>>>>>> these features will go into 1.4 rather than patch release of 1.3.1.
>>>>>>>>>
>>>>>>>>> Feel free to correct me :)
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> --Patric
>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Anton Chernov [mailto:mecher...@gmail.com]
>>>>>>>>>> Sent: Tuesday, November 6, 2018 3:11 AM
>>>>>>>>>> To: d...@mxnet.apache.org
>>>>>>>>>> Subject: Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch
>>>>>>>>>> release
>>>>>>>>>>
>>>>>>>>>> It seems that there is a problem porting following changes to the v1.3.x
>>>>>>>>>> release branch:
>>>>>>>>>>
>>>>>>>>>> Implement mkldnn convolution fusion and quantization
>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12530
>>>>>>>>>>
>>>>>>>>>> MKL-DNN Quantization Examples and README
>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12808
>>>>>>>>>>
>>>>>>>>>> The bases are different.
>>>>>>>>>>
>>>>>>>>>> I would need help from authors of these changes to make a backport PR.
>>>>>>>>>>
>>>>>>>>>> @ZhennanQin, @xinyu-intel would you be able to assist me and create the
>>>>>>>>>> corresponding PR's?
>>>>>>>>>>
>>>>>>>>>> Without proper history and domain knowledge I would not be able to create
>>>>>>>>>> them by my own in reasonable amount of time, I'm afraid.
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>> Anton
>>>>>>>>>>
>>>>>>>>>> Mon, Nov 5, 2018 at 19:45, Anton Chernov <mecher...@gmail.com>:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> As part of:
>>>>>>>>>>>
>>>>>>>>>>> Implement mkldnn convolution fusion and quantization
>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12530
>>>>>>>>>>>
>>>>>>>>>>> I propose to add the examples and documentation PR as well:
>>>>>>>>>>>
>>>>>>>>>>> MKL-DNN Quantization Examples and README
>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12808
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Anton
>>>>>>>>>>>
>>>>>>>>>>> Mon, Nov 5, 2018 at 19:02, Anton Chernov <mecher...@gmail.com>:
>>>>>>>>>>>
>>>>>>>>>>>> Dear MXNet community,
>>>>>>>>>>>>
>>>>>>>>>>>> I will be the release manager for the upcoming 1.3.1 patch release.
>>>>>>>>>>>> Naveen will be co-managing the release and providing help from the
>>>>>>>>>>>> committers side.
>>>>>>>>>>>>
>>>>>>>>>>>> The following dates have been set:
>>>>>>>>>>>>
>>>>>>>>>>>> Code Freeze: 31st October 2018
>>>>>>>>>>>> Release published: 13th November 2018
>>>>>>>>>>>>
>>>>>>>>>>>> Release notes have been drafted here [1].
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> * Known issues
>>>>>>>>>>>>
>>>>>>>>>>>> Update MKL-DNN dependency
>>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12953
>>>>>>>>>>>>
>>>>>>>>>>>> This PR hasn't been merged even to master yet. Requires additional
>>>>>>>>>>>> discussion and merge.
>>>>>>>>>>>>
>>>>>>>>>>>> distributed kvstore bug in MXNet
>>>>>>>>>>>> https://github.com/apache/incubator-mxnet/issues/12713
>>>>>>>>>>>>
>>>>>>>>>>>>> When distributed kvstore is used, by default gluon.Trainer doesn't work
>>>>>>>>>>>>> with mx.optimizer.LRScheduler if a worker has more than 1 GPU. To be
>>>>>>>>>>>>> more specific, the trainer updates once per GPU, the LRScheduler
>>>>>>>>>>>>> object is shared across GPUs and get a wrong update count.
>>>>>>>>>>>>
>>>>>>>>>>>> This needs to be fixed. [6]
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> * Changes
>>>>>>>>>>>>
>>>>>>>>>>>> The following changes will be ported to the release branch, per [2]:
>>>>>>>>>>>>
>>>>>>>>>>>> Infer dtype in SymbolBlock import from input symbol [3]
>>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12412
>>>>>>>>>>>>
>>>>>>>>>>>> [MXNET-953] Fix oob memory read
>>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12631
>>>>>>>>>>>>
>>>>>>>>>>>> [MXNET-969] Fix buffer overflow in RNNOp
>>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12603
>>>>>>>>>>>>
>>>>>>>>>>>> [MXNET-922] Fix memleak in profiler
>>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12499
>>>>>>>>>>>>
>>>>>>>>>>>> Implement mkldnn convolution fusion and quantization (MXNet Graph
>>>>>>>>>>>> Optimization and Quantization based on subgraph and MKL-DNN proposal
>>>>>>>>>>>> [4])
>>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12530
>>>>>>>>>>>>
>>>>>>>>>>>> Following items (test cases) should be already part of 1.3.0:
>>>>>>>>>>>>
>>>>>>>>>>>> [MXNET-486] Create CPP test for concat MKLDNN operator
>>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/11371
>>>>>>>>>>>>
>>>>>>>>>>>> [MXNET-489] MKLDNN Pool test
>>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/11608
>>>>>>>>>>>>
>>>>>>>>>>>> [MXNET-484] MKLDNN C++ test for LRN operator
>>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/11831
>>>>>>>>>>>>
>>>>>>>>>>>> [MXNET-546] Add unit test for MKLDNNSum
>>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/11272
>>>>>>>>>>>>
>>>>>>>>>>>> [MXNET-498] Test MKLDNN backward operators
>>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/11232
>>>>>>>>>>>>
>>>>>>>>>>>> [MXNET-500] Test cases improvement for MKLDNN on Gluon
>>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/10921
>>>>>>>>>>>>
>>>>>>>>>>>> Set correct update on kvstore flag in dist_device_sync mode (as part
>>>>>>>>>>>> of fixing [5])
>>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12786
>>>>>>>>>>>>
>>>>>>>>>>>> upgrade mshadow version
>>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12692
>>>>>>>>>>>> But another PR will be used instead:
>>>>>>>>>>>> update mshadow
>>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12674
>>>>>>>>>>>>
>>>>>>>>>>>> CudnnFind() usage improvements
>>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12804
>>>>>>>>>>>> A critical CUDNN fix that reduces GPU memory consumption and
>>>>>>>>>>>> addresses this memory leak issue. This is an important fix to include
>>>>>>>>>>>> in 1.3.1
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> From discussion about gluon toolkits:
>>>>>>>>>>>>
>>>>>>>>>>>> disable opencv threading for forked process
>>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12025
>>>>>>>>>>>>
>>>>>>>>>>>> Fix lazy record io when used with dataloader and multi_worker > 0
>>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12554
>>>>>>>>>>>>
>>>>>>>>>>>> fix potential floating number overflow, enable float16
>>>>>>>>>>>> https://github.com/apache/incubator-mxnet/pull/12118
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> * Resolved issues
>>>>>>>>>>>>
>>>>>>>>>>>> MxNet 1.2.1–module get_outputs()
>>>>>>>>>>>> https://discuss.mxnet.io/t/mxnet-1-2-1-module-get-outputs/1882
>>>>>>>>>>>>
>>>>>>>>>>>> As far as I can see from the comments the issue has been resolved, no
>>>>>>>>>>>> actions need to be taken for this release. [7] is mentioned in this
>>>>>>>>>>>> regards, but I don't see any action points here either.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I will start with help of Naveen port the mentioned PR's to the 1.3.x
>>>>>>>>>>>> branch.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>> Anton
>>>>>>>>>>>>
>>>>>>>>>>>> [1] https://cwiki.apache.org/confluence/x/eZGzBQ
>>>>>>>>>>>> [2] https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release
>>>>>>>>>>>> [3] https://github.com/apache/incubator-mxnet/issues/11849
>>>>>>>>>>>> [4] https://cwiki.apache.org/confluence/display/MXNET/MXNet+Graph+Optimization+and+Quantization+based+on+subgraph+and+MKL-DNN
>>>>>>>>>>>> [5] https://github.com/apache/incubator-mxnet/issues/12713
>>>>>>>>>>>> [6] https://github.com/apache/incubator-mxnet/issues/12713#issuecomment-435773777
>>>>>>>>>>>> [7] https://github.com/apache/incubator-mxnet/pull/11005