Hi Steffen,

Can we add the following PR to the 1.4.0 release:
https://github.com/apache/incubator-mxnet/pull/13452

It's just a Python API returning the header path, so it should not cause any
regression issues. But it is required for Horovod to integrate MXNet. It's
better to have this in a minor release than in a patch release.

Thanks,
Lin

On Thu, Nov 29, 2018 at 6:46 PM Steffen Rochel <steffenroc...@gmail.com> wrote:

> Hi Zhi - thanks for the improvement, which we should consider for 1.4.0.
> However, I don't see any tests with the PR and think it is too risky to add
> changes without tests. I will add your PR to the tracking list, but would
> like to ask you to add functional tests before completing the PR to master
> and v1.4.x branch.
>
> Steffen
>
> On Thu, Nov 29, 2018 at 5:01 PM Joshua Z. Zhang <cheungc...@gmail.com>
> wrote:
>
> > Hi, I would like to bring a critical performance and stability patch of
> > the existing gluon dataloader to 1.4.0:
> > https://github.com/apache/incubator-mxnet/pull/13447
> >
> > This PR is finished, waiting for CI to pass.
> >
> > Steffen, could you help me add that to the tracked list?
> >
> > Best,
> > Zhi
> >
> > > On Nov 29, 2018, at 4:25 PM, Naveen Swamy <mnnav...@gmail.com> wrote:
> > >
> > > the tests are randomly failing in different stages
> > > http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-13105/
> > > This PR has failed 8 times so far
> > >
> > > On Thu, Nov 29, 2018 at 3:43 PM Steffen Rochel <steffenroc...@gmail.com>
> > > wrote:
> > >
> > >> Pedro - ok. Please add PR to v1.4.x branch after merge to master and
> > >> please update tracking page
> > >> <https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28incubating%29+1.4.0+Release+Plan+and+Status#ApacheMXNet(incubating)1.4.0ReleasePlanandStatus-OpenPRstotrack>.
> > >> Steffen
> > >>
> > >> On Thu, Nov 29, 2018 at 3:00 PM Pedro Larroy <pedro.larroy.li...@gmail.com>
> > >> wrote:
> > >>
> > >>> PR is ready from my side and passes the tests; unless somebody raises
> > >>> any concerns, it's good to go.
> > >>>
> > >>> On Thu, Nov 29, 2018 at 9:50 PM Steffen Rochel <steffenroc...@gmail.com>
> > >>> wrote:
> > >>>>
> > >>>> Pedro - added to 1.4.0 tracking list
> > >>>> <https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28incubating%29+1.4.0+Release+Plan+and+Status#ApacheMXNet(incubating)1.4.0ReleasePlanandStatus-OpenPRstotrack>
> > >>>>
> > >>>> Do you already have an ETA?
> > >>>> Steffen
> > >>>>
> > >>>> On Thu, Nov 29, 2018 at 6:13 AM Pedro Larroy <pedro.larroy.li...@gmail.com>
> > >>>> wrote:
> > >>>>
> > >>>>> Hi all.
> > >>>>>
> > >>>>> There are two important issues / fixes on my radar that should go in
> > >>>>> the next release:
> > >>>>>
> > >>>>> 1) https://github.com/apache/incubator-mxnet/pull/13409/files
> > >>>>> There is a bug in shape inference on CPU when not using MKL; also, we
> > >>>>> are running activation on CPU via MKL when we compile CUDNN+MKLDNN.
> > >>>>> I'm finishing a fix for these issues in the above PR.
> > >>>>>
> > >>>>> 2) https://github.com/apache/incubator-mxnet/issues/13438
> > >>>>> We are seeing crashes due to unsafe setenv in multithreaded code.
> > >>>>> Setenv / getenv from multiple threads is not safe and is causing
> > >>>>> segfaults. This piece of code (the handlers in pthread_atfork) already
> > >>>>> caused a very difficult to diagnose hang in a previous release, where
> > >>>>> a fork inside cudnn would deadlock the engine.
> > >>>>>
> > >>>>> I would remove setenv from 2) as a mitigation, but we would need to
> > >>>>> check for regressions as we could be creating additional threads
> > >>>>> inside the engine.
> > >>>>>
> > >>>>> I would suggest that we address these two major issues before the
> > >>>>> next release.
> > >>>>>
> > >>>>> Pedro
> > >>>>>
> > >>>>> On Sun, Nov 25, 2018 at 11:41 PM Steffen Rochel <steffenroc...@gmail.com>
> > >>>>> wrote:
> > >>>>>>
> > >>>>>> Dear MXNet community,
> > >>>>>>
> > >>>>>> I will be the release manager for the upcoming Apache MXNet 1.4.0
> > >>>>>> release. Sergey Kolychev will be co-managing the release and providing
> > >>>>>> help from the committers side.
> > >>>>>> A release candidate will be cut on November 29, 2018 and voting will
> > >>>>>> start December 7, 2018. Release notes have been drafted here [1]. If
> > >>>>>> you have any additional features in progress and would like to include
> > >>>>>> them in this release, please ensure they have been merged by
> > >>>>>> November 27, 2018. Release schedule is available here [2].
> > >>>>>>
> > >>>>>> Feel free to add any other comments/suggestions. Please help to review
> > >>>>>> and merge outstanding PRs and resolve issues impacting the quality of
> > >>>>>> the 1.4.0 release.
> > >>>>>>
> > >>>>>> Regards,
> > >>>>>>
> > >>>>>> Steffen
> > >>>>>>
> > >>>>>> [1]
> > >>>>>> https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28incubating%29+1.4.0+Release+Notes
> > >>>>>>
> > >>>>>> [2]
> > >>>>>> https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28incubating%29+1.4.0+Release+Plan+and+Status
> > >>>>>>
> > >>>>>> On Tue, Nov 20, 2018 at 7:15 PM kellen sunderland
> > >>>>>> <kellen.sunderl...@gmail.com> wrote:
> > >>>>>>
> > >>>>>>> Spoke too soon[1], looks like others have been adding Turing support
> > >>>>>>> as well (thanks to those helping with this). I believe there are
> > >>>>>>> still a few changes we'd have to make to claim support though
> > >>>>>>> (mshadow CMake changes, PyPI package creation tweaks).
> > >>>>>>>
> > >>>>>>> 1:
> > >>>>>>> https://github.com/apache/incubator-mxnet/commit/2c3357443ec3d49a11e93c89f278264ce10c2f08
> > >>>>>>>
> > >>>>>>> On Tue, Nov 20, 2018 at 7:00 PM kellen sunderland
> > >>>>>>> <kellen.sunderl...@gmail.com> wrote:
> > >>>>>>>
> > >>>>>>>> Hey Steffen, I'd like to be able to merge this PR for version 1.4:
> > >>>>>>>> https://github.com/apache/incubator-mxnet/pull/13310 . It fixes a
> > >>>>>>>> regression in master which causes incorrect feature vectors to be
> > >>>>>>>> output when using the TensorRT feature. (Thanks to Nathalie for
> > >>>>>>>> helping me track down the root cause of the issue). I'm currently
> > >>>>>>>> blocked on a CI issue I haven't seen before, but hope to have it
> > >>>>>>>> resolved by EOW.
> > >>>>>>>>
> > >>>>>>>> One call-out I would make is that we currently don't support Turing
> > >>>>>>>> architecture (sm_75). I've been slowly trying to add support, but I
> > >>>>>>>> don't think I'd have capacity to get this done by EOW. Does anyone
> > >>>>>>>> feel strongly we need this in the 1.4 release? From my perspective
> > >>>>>>>> this will already be a strong release without it.
> > >>>>>>>>
> > >>>>>>>> On Tue, Nov 20, 2018 at 6:42 PM Steffen Rochel <steffenroc...@gmail.com>
> > >>>>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> Thanks Patrick, let's target to get the PRs merged this week.
> > >>>>>>>>>
> > >>>>>>>>> Call for contributions from the community: Right now we have 10 PRs
> > >>>>>>>>> awaiting merge
> > >>>>>>>>> <https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Aopen+label%3Apr-awaiting-merge+>
> > >>>>>>>>> and we have 61 open PRs awaiting review
> > >>>>>>>>> <https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Aopen+label%3Apr-awaiting-review>.
> > >>>>>>>>> I would appreciate it if you all can help to review the open PRs and
> > >>>>>>>>> the committers can drive the merge before code freeze for 1.4.0.
> > >>>>>>>>>
> > >>>>>>>>> The contributors on the Java API are making progress, but not all
> > >>>>>>>>> performance issues are resolved. With some luck it should be possible
> > >>>>>>>>> to code freeze towards the end of this week.
> > >>>>>>>>>
> > >>>>>>>>> Are there other critical features/bugs/PRs you think need to be
> > >>>>>>>>> included in 1.4.0? If so, please communicate as soon as possible.
> > >>>>>>>>>
> > >>>>>>>>> Regards,
> > >>>>>>>>> Steffen
> > >>>>>>>>>
> > >>>>>>>>> On Mon, Nov 19, 2018 at 8:26 PM Zhao, Patric <patric.z...@intel.com>
> > >>>>>>>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Thanks, Steffen. I think there is NO open issue blocking MKLDNN
> > >>>>>>>>>> from going GA now.
> > >>>>>>>>>>
> > >>>>>>>>>> BTW, several quantization related PRs (#13297, #13260) are under
> > >>>>>>>>>> review and I think they can be merged this week.
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks,
> > >>>>>>>>>>
> > >>>>>>>>>> --Patric
> > >>>>>>>>>>
> > >>>>>>>>>>> -----Original Message-----
> > >>>>>>>>>>> From: Steffen Rochel [mailto:steffenroc...@gmail.com]
> > >>>>>>>>>>> Sent: Tuesday, November 20, 2018 2:57 AM
> > >>>>>>>>>>> To: dev@mxnet.incubator.apache.org
> > >>>>>>>>>>> Subject: Re: [Announce] Upcoming Apache MXNet (incubating) 1.4.0
> > >>>>>>>>>>> release
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Friday the contributors working on Java API discovered a
> > >>>>>>>>>>> potential performance problem with inference using Java API vs.
> > >>>>>>>>>>> Python. Investigation is ongoing.
> > >>>>>>>>>>> As the Java API is one of the main features for the upcoming
> > >>>>>>>>>>> release, I suggest postponing the code freeze towards the end of
> > >>>>>>>>>>> this week.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Please provide feedback and concerns about the change in dates for
> > >>>>>>>>>>> code freeze and 1.4.0 release. I will provide updates on progress
> > >>>>>>>>>>> resolving the potential performance problem.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Patrick - do you think it is possible to resolve the remaining
> > >>>>>>>>>>> issues on MKL-DNN this week, so we can consider GA for MKL-DNN with
> > >>>>>>>>>>> 1.4.0?
> > >>>>>>>>>>>
> > >>>>>>>>>>> Regards,
> > >>>>>>>>>>> Steffen
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Thu, Nov 15, 2018 at 5:26 AM Anton Chernov <mecher...@gmail.com>
> > >>>>>>>>>>> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> I'd like to remind everyone that 'code freeze' would mean cutting
> > >>>>>>>>>>>> a v1.4.x release branch and all following fixes would need to be
> > >>>>>>>>>>>> backported. Development on master can be continued as usual.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Best
> > >>>>>>>>>>>> Anton
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Wed, Nov 14, 2018 at 6:04, Steffen Rochel
> > >>>>>>>>>>>> <steffenroc...@gmail.com> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Dear MXNet community,
> > >>>>>>>>>>>>> the agreed plan was to establish code freeze for the 1.4.0
> > >>>>>>>>>>>>> release today. As the 1.3.1 patch release is still ongoing, I
> > >>>>>>>>>>>>> suggest postponing the code freeze to Friday 16th November 2018.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Sergey Kolychev has agreed to act as co-release manager for all
> > >>>>>>>>>>>>> tasks which require committer privileges. If anybody is
> > >>>>>>>>>>>>> interested to volunteer as release manager - now is the time to
> > >>>>>>>>>>>>> speak up. Otherwise I will manage the release.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>> Steffen