Tracked down the issue referred to above and it's not a bug. I'll update the ticket.
Changing to +1.

On Mon, Sep 10, 2018 at 3:00 PM kellen sunderland <kellen.sunderl...@gmail.com> wrote:

> -0.1
>
> There's one test failure I've run into (details below). Following Indhu's logic I don't think this should block the release, as it does not relate to a feature introduced in this version.
>
> I'm trying to use the cpp-package examples as reference code for how to run MXNet models from a native context. I'd like to run them with ASAN as a sanity check for memory leaks and pointer errors. I was continually running into segfaults and crashes w/ and w/o ASAN. A little googling shows me that this issue has already been reported, and is related to running tests on CPU, not to any changes I made:
> https://github.com/apache/incubator-mxnet/issues/9814
> Having what are effectively our reference examples crash is not a good practice IMO.
>
> I also share some concerns around the fp16 failures. I know developers who are currently porting their models to Gluon who use fp16. They'll be disappointed with the error.
>
> In general though, release looks good. Big thanks to Sheng and Roshani for putting it together (and sorry for the late testing).
>
> -Kellen
>
>
> On Fri, Sep 7, 2018 at 4:31 AM Anirudh <anirudh2...@gmail.com> wrote:
>
>> -1 Considering that using fp16 with gluon is much easier than the alternative where you need access to the model code, this fix is really useful. I understand the pain of doing an mxnet release and appreciate Roshani and Sheng's efforts, but this seems like something we should fix.
>>
>> On Thu, Sep 6, 2018, 4:57 PM Haibin Lin <haibin.lin....@gmail.com> wrote:
>>
>> > +1 built from source and passes dist_sync_kvstore test on Ubuntu.
>> >
>> > Best,
>> > Haibin
>> >
>> > On Thu, Sep 6, 2018 at 1:32 PM Indhu <indhubhara...@gmail.com> wrote:
>> >
>> > > +1
>> > >
>> > > The release candidate looks good. I'm able to build and run basic models.
>> > >
>> > > On the FP16 issue:
>> > >
>> > > Like others have pointed out, releases are expensive in terms of time and effort. There needs to be a high and more objective bar on what qualifies as a release blocker to make sure we are not setting a precedent for a lot of release blockers in future.
>> > >
>> > > I think a release blocker is justified only if there is a serious bug discovered in one of the features included in the release or if there is a regression. Given FP16 support is not a new feature claimed in this release and this is not a regression in this release candidate, I'm inclined to release this candidate and include the FP16 fix in a subsequent release.
>> > >
>> > > Thanks,
>> > > Indu
>> > >
>> > > On Wed, Sep 5, 2018 at 10:21 AM Aaron Markham <aaron.s.mark...@gmail.com> wrote:
>> > >
>> > > > 0 (non-binding) If we have a problem that blocks users, and a solution in hand... then we should fix it, but not at the expense of starting the release cycle again just for one fix. Users can cherry pick or build from master if they want the fix right away, right? I'd change my mind to -1 if this wasn't the case, with good reason, and if the user impact was critical to adoption or risks abandonment.
>> > > >
>> > > >
>> > > > On Wed, Sep 5, 2018 at 9:57 AM Roshani Nagmote <roshaninagmo...@gmail.com> wrote:
>> > > >
>> > > > > I believe everyone here is working hard to make MXNet a better framework for users. It's completely okay to have different opinions; we can decide together if this issue is a blocker or not after voting time is over.
>> > > > >
>> > > > > As I mentioned before, voting will end at 7 pm today. So there is still time to test the release.
>> > > > > If there are any other issues anyone finds, I will be happy to start the process again and work on RC1. For now, I want to encourage everyone to utilize this time and vote. :)
>> > > > >
>> > > > > Thanks,
>> > > > > Roshani
>> > > > >
>> > > > > On Tue, Sep 4, 2018 at 10:35 PM sandeep krishnamurthy <sandeep.krishn...@gmail.com> wrote:
>> > > > >
>> > > > > > 1. As an Apache MXNet community member, I raised the concern of broken functionality for the user. I explained and provided the data points on the issue, the workaround and why I think it is important. If after all this, you think my vote is biased by my employer just because a user I quoted is from Amazon, this is more concerning to me on my voting abilities.
>> > > > > > 2. My -1 in no way undermines the huge amount of effort that goes on behind the scenes for a release to happen. Great respect and recognition for everyone involved in all the releases of MXNet in the past and this one. I voted on my judgement of what may be good for the users of MXNet.
>> > > > > > 3. As pointed out by Naveen & Chris, a -1 is NOT a veto. Feel free to decide and progress on the release as we already have >3 +1 in this thread.
>> > > > > >
>> > > > > > Best,
>> > > > > >
>> > > > > > Sandeep
>> > > > > >
>> > > > > > On Tue, Sep 4, 2018 at 8:29 PM Chris Olivier <cjolivie...@gmail.com> wrote:
>> > > > > >
>> > > > > > > btw, there are no vetoes on package releases:
>> > > > > > >
>> > > > > > > VOTES ON PACKAGE RELEASES
>> > > > > > > <https://www.apache.org/foundation/voting.html#ReleaseVotes>
>> > > > > > >
>> > > > > > > Votes on whether a package is ready to be released use majority approval <https://www.apache.org/foundation/glossary.html#MajorityApproval> -- i.e. at least three PMC members must vote affirmatively for release, and there must be more positive than negative votes. Releases may not be vetoed. Generally the community will cancel the release vote if anyone identifies serious problems, but in most cases the ultimate decision lies with the individual serving as release manager. The specifics of the process may vary from project to project, but the 'minimum quorum of three +1 votes' rule is universal.
>> > > > > > >
>> > > > > > > On Tue, Sep 4, 2018 at 7:12 PM Sheng Zha <szha....@gmail.com> wrote:
>> > > > > > >
>> > > > > > > > Thanks for sharing your opinions, Thomas. Your recognition and respect of people's efforts on preparing the release candidate are certainly appreciated.
>> > > > > > > >
>> > > > > > > > Now that the vote is set to fail thanks to the veto, there will be plenty of opportunities to include those bug fixes, including the one Zhi mentioned [1], which was already merged into master and yet he chose not to block this release with it [2]. I will be happy to work with Roshani to prepare another release candidate once ready.
>> > > > > > > >
>> > > > > > > > -sz
>> > > > > > > >
>> > > > > > > > [1] https://lists.apache.org/thread.html/f02e952bec22c82cb00a6741390a78f55373311c97464997bb455a6c@%3Cdev.mxnet.apache.org%3E
>> > > > > > > > [2] https://lists.apache.org/thread.html/85d3fcabb3437ba7f1af455cf69aa13eb3afd1ea1d1f6f891e9c339c@%3Cdev.mxnet.apache.org%3E
>> > > > > > > >
>> > > > > > > > On Tue, Sep 4, 2018 at 6:02 PM Thomas DELTEIL <thomas.delte...@gmail.com> wrote:
>> > > > > > > >
>> > > > > > > > > -0
>> > > > > > > > > (non-binding)
>> > > > > > > > >
>> > > > > > > > > If I may add some nuance plus a personal data point as one of the users commenting in the bug report in question:
>> > > > > > > > >
>> > > > > > > > > - Performance vs. Basic functionality => I don't think high performance use-cases and basic functionality are two obviously opposed concepts and see no contradiction in Hagay's and Sandeep's statements.
>> > > > > > > > > Float16 support is a feature of MXNet that provides more than twice the performance of Float32 on supported platforms, hence the high performance use-case. The bug is that the basic functionality of reloading a saved float16 model is currently broken.
>> > > > > > > > >
>> > > > > > > > > - This bug vs Other bugs => Contrary to the vast majority of the 140 open bugs that are mentioned above, I would put to Sandeep's credit that this one bug has a PR open that provides a fix for it. This would make it a better candidate to get included in this release than a bug that has no fix ready for it.
>> > > > > > > > >
>> > > > > > > > > - Personal datapoint: I recently did some experimentation with float16 [1] and actually coincidentally just published a video on optimizing performance for Gluon. Float16 conversion is one of the most, if not the most, effective ways to get performance out of MXNet [2]. I believe there is a lot of value in publicizing its use more and hence making sure at least the basic support for normal use-cases is present.
>> > > > > > > > >
>> > > > > > > > > Of course this needs to be balanced with the overhead of preparing a new release candidate once the fix is reviewed and merged, which seems to be a lengthy and complex process in its own right, and the delay in providing the other features present in 1.3 for users that are not running off the nightly builds.
>> > > > > > > > >
>> > > > > > > > > All the best,
>> > > > > > > > >
>> > > > > > > > > Thomas
>> > > > > > > > >
>> > > > > > > > > [1] https://github.com/ThomasDelteil/PerformanceTricksMXNetGluon
>> > > > > > > > > [2] https://www.youtube.com/watch?v=Cqo7FPftNyo&t=0s&list=PLkEvNnRk8uVk6U515Pj-jHQUxFC4eDi3m
>> > > > > > > > >
>> > > > > > > > > On Tue, Sep 4, 2018 at 5:11 PM Sheng Zha <szha....@gmail.com> wrote:
>> > > > > > > > >
>> > > > > > > > > > Sandeep,
>> > > > > > > > > >
>> > > > > > > > > > Thanks for explaining your veto. We have open bugs that impacted a lot more than just 3 customers, just by referring to the number of commenters on the issue [1].
>> > > > > > > > > >
>> > > > > > > > > > You said that this is for "high performance use cases", which contradicts Hagay's assessment that this is "basic functionality broken".
>> > > > > > > > > > Given that this is for advanced use cases of using half-precision training, why is it so much more important than any other open bug report that, for this specific bug fix, we have to delay the access of regular users to the new MXNet 1.3 release by at least another week?
>> > > > > > > > > >
>> > > > > > > > > > Honestly, I'm concerned that your vote is biased by Amazon involvement, given that you quoted Amazon Rekognition.
>> > > > > > > > > >
>> > > > > > > > > > -sz
>> > > > > > > > > >
>> > > > > > > > > > [1] https://github.com/apache/incubator-mxnet/issues?q=is%3Aissue+is%3Aopen+label%3ABug+sort%3Acomments-desc
>> > > > > > > > > >
>> > > > > > > > > > On Tue, Sep 4, 2018 at 4:51 PM sandeep krishnamurthy <sandeep.krishn...@gmail.com> wrote:
>> > > > > > > > > >
>> > > > > > > > > > > My initial vote of “-0” was due to a lack of info from a user who had said he overcame this issue for an FP16 model.
>> > > > > > > > > > >
>> > > > > > > > > > > However, the suggested workaround [1] for the issue is not straightforward or generally usable for all users. Also, the issue is not simple and isolated enough to be listed in the Release Notes as a known issue with a workaround.
>> > > > > > > > > > >
>> > > > > > > > > > > Changing my vote to: "-1 (binding)" owing to the user impact [3]
>> > > > > > > > > > >
>> > > > > > > > > > > @Sheng:
>> > > > > > > > > > >
>> > > > > > > > > > > 1. Agreed, the bug has existed for a long time. However, FP16 and such optimizations were added later on, followed by users [2] using this feature for high performance use cases. It is not OK to measure the severity of the bug based on its past existence; rather, we should look at who is impacted now and whether it affects a small subset with a simple workaround or is a large user-impacting issue.
>> > > > > > > > > > >
>> > > > > > > > > > > 2. Agreed, the bug was reported on 7/21. However, I became aware of this issue on 08/29 and submitted the fix on 08/30. Also, I did bring this to the notice of the community, you and the 1.3 release manager (Roshani) on the RC0 proposal thread. Also, I would focus on the issue and user impact rather than on who identified and who is fixing the issue.
>> > > > > > > > > > >
>> > > > > > > > > > > Based on my discussion with 2 users, I think it is an important feature for them to see in Apache MXNet v1.3.0.
>> > > > > > > > > > >
>> > > > > > > > > > > Best,
>> > > > > > > > > > >
>> > > > > > > > > > > Sandeep
>> > > > > > > > > > >
>> > > > > > > > > > > [1] Workaround used by the user.
>> > > > > > > > > > >
>> > > > > > > > > > > net_fp16 = mx.gluon.SymbolBlock.imports('resnet34_fp16-symbol.json', ['data'])
>> > > > > > > > > > > params_fp16 = mx.nd.load('resnet34_fp16-0000.params')
>> > > > > > > > > > >
>> > > > > > > > > > > for k, v in params_fp16.items():
>> > > > > > > > > > >     new_key = k.split(':')[1]
>> > > > > > > > > > >     net_fp16.collect_params()[new_key].cast(v.dtype)
>> > > > > > > > > > >
>> > > > > > > > > > > net_fp16.collect_params().load('resnet34_fp16-0000.params', ctx)
>> > > > > > > > > > >
>> > > > > > > > > > > [2] Amazon Rekognition
>> > > > > > > > > > >
>> > > > > > > > > > > [3] User story: Train a model -> Cast it to FP16 -> Save the model -> Load the model back: the last step does not work. They have to cast every parameter with the workaround mentioned above [1].
>> > > > > > > > > > >
>> > > > > > > > > > > On Tue, Sep 4, 2018 at 4:14 PM Hagay Lupesko <lupe...@gmail.com> wrote:
>> > > > > > > > > > >
>> > > > > > > > > > > > Hi Sheng,
>> > > > > > > > > > > >
>> > > > > > > > > > > > Addressing your questions:
>> > > > > > > > > > > >
>> > > > > > > > > > > > - "why this specific bug is more important than all the other known bugs, that this becomes a release blocker"
>> > > > > > > > > > > > I do not consider it to be more or less important than other fixes. It can be fixed and included in the release alongside the rest of the release content, right?
>> > > > > > > > > > > > From the description of the issue it seems important since it is blocking users from loading models that were previously trained and saved. There is nothing stopping the community from including this fix in 1.3.0, alongside the rest of the features and fixes.
>> > > > > > > > > > > >
>> > > > > > > > > > > > - "The bug exists since SymbolBlock was introduced a year ago and has survived at least three releases, so this is not a regression."
>> > > > > > > > > > > > I do not think I said it is a regression. However, the fact that a bug existed before does not mean it is OK to release it rather than fix it.
>> > > > > > > > > > > >
>> > > > > > > > > > > > - "Timeline-wise, this bug was reported on 7/21, but was not reported as release-blocker in the release discussion thread until 8/31 [1]. Neither its reporting as release-blocker nor its fix made it for the 8/3 code freeze."
>> > > > > > > > > > > > You are right, it would have been better to have this identified and fixed earlier and included before code freeze.
>> > > > > > > > > > > >
>> > > > > > > > > > > > - "The PR is still not ready yet as it doesn't have approval."
>> > > > > > > > > > > > I think it is waiting for your review.
>> > > > > > > > > > > >
>> > > > > > > > > > > > - "it would be great if you could provide some additional reasoning besides "X mentions the issue" or "fix was done by X""
>> > > > > > > > > > > > I have. Repeating what I wrote in my previous email for clarity: Basic functionality broken: loading a model (albeit one that was saved as non FP32)
>> > > > > > > > > > > >
>> > > > > > > > > > > > So, yes - this issue seems to have been out there for a while, somehow went under the radar... but I think the key question is whether this blocks a basic functionality in MXNet. I believe so, hence my -1 vote.
>> > > > > > > > > > > >
>> > > > > > > > > > > > Hagay
>> > > > > > > > > > > >
>> > > > > > > > > > > > On Tue, Sep 4, 2018 at 1:19 PM Sheng Zha <szha....@gmail.com> wrote:
>> > > > > > > > > > > >
>> > > > > > > > > > > > > Hi Hagay and Sandeep,
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Could you help us understand why this specific bug is more important than all the other known bugs, that this becomes a release blocker?
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Some facts to consider:
>> > > > > > > > > > > > > - The bug exists since SymbolBlock was introduced a year ago and has survived at least three releases, so this is not a regression.
>> > > > > > > > > > > > > - Timeline-wise, this bug was reported on 7/21, but was not reported as release-blocker in the release discussion thread until 8/31 [1]. Neither its reporting as release-blocker nor its fix made it for the 8/3 code freeze.
>> > > > > > > > > > > > > - The PR is still not ready yet as it doesn't have approval.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Hagay, it would be great if you could provide some additional reasoning besides "X mentions the issue" or "fix was done by X". Thanks.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > -sz
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > [1] https://lists.apache.org/thread.html/d1ed611f98c20d5d85c294b0c07c8bdebca13a209cf66a3872c9123e@%3Cdev.mxnet.apache.org%3E
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > On Tue, Sep 4, 2018 at 12:39 PM Hagay Lupesko <lupe...@gmail.com> wrote:
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > > Sandeep mentions the issue of an error when a user tries to load model params trained/saved as FP16.
>> > > > > > > > > > > > > > https://github.com/apache/incubator-mxnet/issues/11849
>> > > > > > > > > > > > > > The fix was done by Sandeep:
>> > > > > > > > > > > > > > https://github.com/apache/incubator-mxnet/pull/12412 and is ready to be cherry picked into the release branch.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > This seems like a release blocker to me:
>> > > > > > > > > > > > > > - Basic functionality broken: loading a model (albeit one that was saved as non FP32)
>> > > > > > > > > > > > > > - Reported by 3 users (wgchang@, nicklhy@ and ThomasDelteil@)
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > -1 (non binding)
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > Hagay
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > On Tue, Sep 4, 2018 at 12:01 PM sandeep krishnamurthy <sandeep.krishn...@gmail.com> wrote:
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > "- 0"
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > I believe the bug #11849 <https://github.com/apache/incubator-mxnet/issues/11849>, unable to import non-fp32 models into Gluon, fixed in PR #12412 <https://github.com/apache/incubator-mxnet/pull/12412>, is important for the users. I would rather pick this fix in this release than plan a minor release later.
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Best,
>> > > > > > > > > > > > > > > Sandeep
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > On Mon, Sep 3, 2018 at 2:34 PM Philip Cho <chohy...@cs.washington.edu> wrote:
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > Actually, the command "git clone --recursive https://github.com/apache/incubator-mxnet -b 1.3.0.rc0" works fine now, never mind.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > On Mon, Sep 3, 2018 at 1:45 PM Philip Cho <chohy...@cs.washington.edu> wrote:
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > Unfortunately, MXNet was depending on a branch of TVM that is now deleted. We will have to merge #12448 <https://github.com/apache/incubator-mxnet/pull/12448> before the release.
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > Background: See dmlc/tvm#1394 <https://github.com/dmlc/tvm/issues/1394>.
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > Philip.
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > On Mon, Sep 3, 2018 at 7:26 AM Carin Meier <carinme...@gmail.com> wrote:
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > Checked out the tag, built and tested the Clojure package. +1
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > On Fri, Aug 31, 2018 at 10:59 PM Roshani Nagmote <roshaninagmo...@gmail.com> wrote:
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > Hi all,
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > I would like to propose a vote to release Apache MXNet (incubating) version 1.3.0.RC0. Voting will start now (Friday, Aug 31st) and end at 7:00 PM PDT, Wednesday, Sept 5th.
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > Link to release notes:
>> > > > > > > > > > > > > > > > > > > https://github.com/apache/incubator-mxnet/releases
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > Link to release candidate 1.3.0.rc0:
>> > > > > > > > > > > > > > > > > > > https://github.com/apache/incubator-mxnet/releases/tag/1.3.0.rc0
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > View this page, click on "Build from Source", and use the source code obtained from 1.3.0.rc0 tag:
>> > > > > > > > > > > > > > > > > > > https://mxnet.incubator.apache.org/install/index.html
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > Please remember to TEST first before voting accordingly:
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > +1 = approve
>> > > > > > > > > > > > > > > > > > > +0 = no opinion
>> > > > > > > > > > > > > > > > > > > -1 = disapprove (provide reason)
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > Thanks,
>> > > > > > > > > > > > > > > > > > > Roshani
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > --
>> > > > > > > > > > > > > > > Sandeep Krishnamurthy
>> > > > > > > > > > > --
>> > > > > > > > > > > Sandeep Krishnamurthy
>> > > > > >
>> > > > > > --
>> > > > > > Sandeep Krishnamurthy
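
For anyone following up on the FP16 discussion: the workaround quoted as [1] in Sandeep's message can be written as a small helper. This is only a sketch built from the calls shown in that message (SymbolBlock.imports, mx.nd.load, collect_params, cast, load). The function names, the default of mx.cpu() for ctx (the quoted snippet never defines ctx), and the tolerance for keys without a prefix are assumptions added here for illustration, not part of the reported workaround or of the fix in PR #12412.

```python
def strip_param_prefix(key):
    """Drop a leading 'arg:' / 'aux:' style prefix from a saved parameter name.

    Hypothetical helper: the quoted workaround uses k.split(':')[1] directly;
    this version also passes through keys that carry no prefix.
    """
    return key.split(':', 1)[1] if ':' in key else key


def load_fp16_symbol_block(symbol_file, params_file, input_names=('data',), ctx=None):
    """Import a symbol file, cast each parameter to its saved dtype, then load.

    Hypothetical wrapper around the steps in the quoted workaround.
    """
    import mxnet as mx  # imported lazily so strip_param_prefix works without MXNet

    # Assumption: fall back to CPU when the caller does not pass a context.
    ctx = ctx if ctx is not None else mx.cpu()

    net = mx.gluon.SymbolBlock.imports(symbol_file, list(input_names))
    saved = mx.nd.load(params_file)
    params = net.collect_params()

    # Cast every parameter to the dtype stored on disk (float16 here) so the
    # subsequent load does not hit a dtype mismatch.
    for key, value in saved.items():
        params[strip_param_prefix(key)].cast(value.dtype)

    params.load(params_file, ctx)
    return net
```

With the file names from the thread, the call would look like load_fp16_symbol_block('resnet34_fp16-symbol.json', 'resnet34_fp16-0000.params').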