...and please don't get me wrong, my English is horrendously poor. On Thu, Oct 19, 2017 at 3:53 PM, Suneel Marthi <smar...@apache.org> wrote:
> I guess the whole discussion here is about - Who is 'We' in your email ? > > 'We thought there is a better way of doing this' > > It may just be misinterpretation or misunderstanding amongst folks here > due to language barrier. > > > On Thu, Oct 19, 2017 at 3:48 PM, Tianqi Chen <tqc...@cs.washington.edu> > wrote: > >> We thought there is a better way of doing this, by proposing nnvm as part >> of apache deep learning stack along with Mxnet. This is the reason why we >> did not simply move the repo over now >> >> Tianqi >> On Thu, Oct 19, 2017 at 12:43 PM Chris Olivier <cjolivie...@gmail.com> >> wrote: >> >> > Why don't we just move all of these dmlc modules into the Apache >> repository >> > right now and have the correct discussions on dev? What's the argument >> > against this? IMHO, I thought that's what was going to be done >> originally. >> > >> > On Thu, Oct 19, 2017 at 12:14 PM, Tianqi Chen <tqc...@cs.washington.edu >> > >> > wrote: >> > >> > > Hi Hen: >> > > >> > > It is sad to think DMLC adversarially in this matter. DMLC projects >> > adopt >> > > apache way of doing things and we are planning moving more modules >> into >> > > Apache. >> > > >> > > All the discussion so far happens under the Apache manner and I do >> think >> > > that healthy discussion on critical design issues is important. It is >> > > unfair to say something is rotten just when there is a debate going >> on in >> > > terms of technical issues. >> > > >> > > They are merely based on our technical assessment of what is better >> for >> > the >> > > project in general, instead of being political or chanting the >> detailed >> > > credits or ownership of the code. >> > > >> > > >> > > Tianqi >> > > >> > > On Thu, Oct 19, 2017 at 12:03 PM, Hen <bay...@apache.org> wrote: >> > > >> > > > What I think I'm seeing here is that: >> > > > >> > > > * MXNet moved to Apache. >> > > > * Some of the code it relied on (50% per the last release thread, >> but >> > > that >> > > > may have been bombastic) remained at DMLC. >> > > > * The MXNet community thinks one thing. >> > > > * The DMLC community (which is a subset of the MXNet community that >> > runs >> > > > under different community rules) thinks another. >> > > > >> > > > Something is rotten. >> > > > >> > > > One solution: The MXNet community forks the DMLC code it relies on >> into >> > > the >> > > > MXNet codebase and moves on without being tied down by the decisions >> > of a >> > > > non-compatible community. >> > > > >> > > > Hen >> > > > >> > > > >> > > > >> > > > On Thu, Oct 19, 2017 at 11:59 AM, Tianqi Chen < >> > tqc...@cs.washington.edu> >> > > > wrote: >> > > > >> > > > > Here are the detailed points(sorry for resenting it over again) >> > > > > >> > > > > Technical Reasoning: >> > > > > >> > > > > - Model exchange format like CoreML and ONNX are not lossless and >> > > > > complete. They are designed to an contain a core set of the >> > > > > minimum operators to support necessary inference tasks like >> ResNet, >> > > etc. >> > > > > So you cannot rely on a bi-directional serialization with this >> format >> > > for >> > > > > all MXNet models. As a simple example, broadcast add/mul is >> simply >> > not >> > > > > supported in onnx. >> > > > > >> > > > > - Same problem goes for compilation and in-memory IR, a core set >> of >> > > most >> > > > > interesting primitives are effectively supported. >> > > > > >> > > > > - Either in the case of supporting exchange format, or in-memory >> IR, >> > we >> > > > > need to make the decision on what core set of operators are we >> > > interested >> > > > > in support. We cannot simply say let us support everything from >> the >> > > > > beginning due to the limitations of the exchange format. >> > > > > >> > > > > - It is crucial for us articulate what is the core set of >> operators >> > we >> > > > care >> > > > > about in MXNet. Either in terms of providing guidelines to the >> > > community, >> > > > > or influence the design of model exchange format them-selfs to >> move >> > in >> > > > > favor of MXNet. >> > > > > >> > > > > - nnvm/top is that initial core set of operators for both compiler >> > > > support >> > > > > and exchange purposes. It is modeled under numpy and gluon, under >> the >> > > > > supervision of Eric, Me and Mu. It can be bi-directionally >> exchanged >> > > > with >> > > > > a current mxnet operator without loss of information. >> > > > > >> > > > > The Effort of Engineering: >> > > > > >> > > > > - Because nnvm/top is modeled with numpy and gluon, mxnet<-> >> nnvm/top >> > > is >> > > > > quite easy, and we already have one direction done. I would be >> very >> > > happy >> > > > > to answer any questions on another. No information loss will >> happen >> > > with >> > > > > this path. >> > > > > >> > > > > - mxnet/symbol or nnvm/symbol(they are essentially the same thing >> > with >> > > a >> > > > > bit different op defs) <- onnx is harder. There has been already >> > enough >> > > > > effort to support onnx 0.1 as Roshani mentioned. Which is >> contributed >> > > by >> > > > > Zhi Zhang, another Apache MXNet committer. Zhi already provided >> code >> > to >> > > > > alleviate this process. Built code on the existing effort would >> > > actually >> > > > > make the problem easier. >> > > > > >> > > > > On Thu, Oct 19, 2017 at 11:55 AM, Tianqi Chen < >> > > tqc...@cs.washington.edu> >> > > > > wrote: >> > > > > >> > > > > > As for where the code should sit, we have seen onnx's support >> for >> > > > caffe2 >> > > > > > sitting on a separate repo. My suggestion would be put code >> under >> > > > > nnvm/top >> > > > > > and migrate into mxnet eventually when the top components get >> into >> > > > MXNet, >> > > > > > hopefully by end of next month. >> > > > > > >> > > > > > I have elaborated my point in the last email thread. This (going >> > > > through >> > > > > > nnvm/top) is an important design decision both technically >> > > > (compilation, >> > > > > > more hardware) and strategically (articulate our core set of >> > > operators >> > > > > and >> > > > > > influence the model exchange format). >> > > > > > >> > > > > > I am glad to see the discussion happening and surely there is >> > doubt, >> > > as >> > > > > > with every big step of changes. But with the rapidly changing >> pace >> > > of >> > > > > deep >> > > > > > learning systems, this is the direction that we thought is most >> > > > > promising. >> > > > > > We can call for a vote if necessary among the committers for the >> > > design >> > > > > > decision if there is still debate on this issue. Or we can keep >> the >> > > > > > discussion open and start some effort around nnvm/top to see >> how it >> > > > goes >> > > > > > >> > > > > > Tianqi >> > > > > > >> > > > > > On Thu, Oct 19, 2017 at 11:15 AM, Lupesko, Hagay < >> > lupe...@gmail.com> >> > > > > > wrote: >> > > > > > >> > > > > >> Mu, >> > > > > >> >> > > > > >> You’re mentioning plans for a new model format and compiler, >> but I >> > > > don’t >> > > > > >> recall seeing it shared/discussed on the dev list. Can you >> share >> > > > these, >> > > > > so >> > > > > >> it is more accessible to folks to understand the plan and >> vision? >> > > > > >> >> > > > > >> Personally, I think it will be a shame to add ONNX support to >> > MXNet, >> > > > and >> > > > > >> have it implemented outside of MXNet. At the end of the day, it >> > > makes >> > > > > >> things difficult for MXNet users. >> > > > > >> >> > > > > >> Hagay >> > > > > >> >> > > > > >> On 10/19/17, 10:01, "Mu Li" <limu...@gmail.com on behalf of >> > > > > >> muli....@gmail.com> wrote: >> > > > > >> >> > > > > >> I'm speaking under my "MXNet contributor" hat. >> > > > > >> >> > > > > >> It will be sad that our new model format and compiler is >> not >> > > > > >> supported by >> > > > > >> our own contributors. It puts us in a bad position to reach >> > out >> > > to >> > > > > >> outside >> > > > > >> to ask for support. >> > > > > >> >> > > > > >> If you really what to do it with the onnx <-> mxnet way, I >> > > suggest >> > > > > >> putting >> > > > > >> the codes under https://github.com/aws. >> > > > > >> >> > > > > >> Best >> > > > > >> Mu >> > > > > >> >> > > > > >> On Thu, Oct 19, 2017 at 9:51 AM, Lupesko, Hagay < >> > > > lupe...@gmail.com> >> > > > > >> wrote: >> > > > > >> >> > > > > >> > Since there seems to be a difficulty to reach a consensus >> > > here, >> > > > > and >> > > > > >> this >> > > > > >> > is a new area, maybe a good compromise would be to >> > contribute >> > > > this >> > > > > >> under >> > > > > >> > /contrib as experimental, with whatever way Roshani >> thinks >> > > makes >> > > > > >> sense. >> > > > > >> > Once there is code in place, and MXNet users and >> > contributors >> > > > are >> > > > > >> able to >> > > > > >> > check it out, we can consider future steps. >> > > > > >> > >> > > > > >> > Does this proposal make sense to folks? >> > > > > >> > >> > > > > >> > On 10/18/17, 23:01, "Tianqi Chen" <workc...@gmail.com on >> > > behalf >> > > > > of >> > > > > >> > tqc...@cs.washington.edu> wrote: >> > > > > >> > >> > > > > >> > I want to offer one last thing in terms of technical >> > > > details. >> > > > > I >> > > > > >> > mentioned >> > > > > >> > two trends in the deep learning systems. There is one >> > last >> > > > > >> thing that >> > > > > >> > is >> > > > > >> > omitted. How should we build a good deploy end for >> deep >> > > > > learning >> > > > > >> > models. >> > > > > >> > >> > > > > >> > There is always a paradox to this problem: >> > > > > >> > >> > > > > >> > - On one hand, the deployment end needs to be >> > lightweight >> > > > and >> > > > > >> portable. >> > > > > >> > - We want a lot of optimizations (memory layout >> compute) >> > > and >> > > > > >> feature >> > > > > >> > support, this makes the project big. >> > > > > >> > >> > > > > >> > All the existing systems suffer from this problem. >> The >> > > > > solution >> > > > > >> is >> > > > > >> > simple, >> > > > > >> > separates the optimization part from the actual >> runtime >> > > and >> > > > > >> compiles >> > > > > >> > the >> > > > > >> > things down to a bare metal module. And this is the >> > > solution >> > > > > >> nnvm/top >> > > > > >> > compiler pipeline offer, which I believe will become >> a >> > > > > standard >> > > > > >> > practice of >> > > > > >> > deployment and where all systems go to >> > > > > >> > >> > > > > >> > Tianqi >> > > > > >> > >> > > > > >> > On Wed, Oct 18, 2017 at 10:03 PM, Tianqi Chen < >> > > > > >> > tqc...@cs.washington.edu> >> > > > > >> > wrote: >> > > > > >> > >> > > > > >> > > OK, there is some miscommunication in here I guess. >> > We >> > > > only >> > > > > >> need to >> > > > > >> > do a >> > > > > >> > > "canonization" step in python API that goes a >> symbol >> > to >> > > > > symbol >> > > > > >> > translation >> > > > > >> > > layer. It can be done in purely in python, and >> there >> > is >> > > no >> > > > > >> need for >> > > > > >> > going >> > > > > >> > > "down" into c++ to do this. >> > > > > >> > > >> > > > > >> > > For example, the current nnvm.from_mxnet API takes >> > > Module >> > > > or >> > > > > >> Gluon >> > > > > >> > module >> > > > > >> > > and get you back nnvm/top graph in python. >> > > > > >> > > >> > > > > >> > > All we are asking for is to decomposing it into >> > > > > >> > > >> > > > > >> > > def mxnet_to_onnx(module): >> > > > > >> > > nnvm_graph, params = nnvm_from_mxnet(module) >> > > > > >> > > onnx = nnvm_to_onnx(nnvm_graph, params) >> > > > > >> > > return onnx >> > > > > >> > > >> > > > > >> > > This allows nnvm_from_mxnet to be reused for other >> > > > purposes, >> > > > > >> like >> > > > > >> > > compiling API to deployable modules >> > > > > >> > > >> > > > > >> > > Tianqi >> > > > > >> > > >> > > > > >> > > On Wed, Oct 18, 2017 at 9:55 PM, Lupesko, Hagay < >> > > > > >> lupe...@gmail.com> >> > > > > >> > wrote: >> > > > > >> > > >> > > > > >> > >> Tianqi: >> > > > > >> > >> Thanks for detailing the trends. I fully agree >> that >> > > ONNX >> > > > is >> > > > > >> just a >> > > > > >> > graph >> > > > > >> > >> serialization format – nothing more, nothing >> less. I >> > > also >> > > > > >> think we >> > > > > >> > all >> > > > > >> > >> agree that this simple mechanism holds lots of >> value >> > to >> > > > DL >> > > > > >> users >> > > > > >> > since it >> > > > > >> > >> allows them to move between frameworks easily >> (e.g. >> > > train >> > > > > >> with >> > > > > >> > MXNet, >> > > > > >> > >> deploy on a mobile device with Caffe2, or the >> other >> > way >> > > > > >> around). >> > > > > >> > >> As you said, In Memory IR is different than >> > > serialization >> > > > > >> formats >> > > > > >> > such as >> > > > > >> > >> ONNX. They are designed to make the runtime >> execution >> > > as >> > > > > >> efficient >> > > > > >> > as >> > > > > >> > >> possible, leveraging software and hardware >> > > optimizations. >> > > > > >> They are >> > > > > >> > indeed >> > > > > >> > >> complex, and where the “meat” is. >> > > > > >> > >> (BTW ONNX regards itself as an “IR” format, but >> not >> > in >> > > > the >> > > > > >> same >> > > > > >> > sense as >> > > > > >> > >> NNVM). >> > > > > >> > >> >> > > > > >> > >> At the end of the day, Roshani is aiming to >> deliver a >> > > > > simple >> > > > > >> > >> functionality to MXNet users: (1) take an ONNX >> file, >> > > and >> > > > > >> load it >> > > > > >> > into MXNet >> > > > > >> > >> so you get a graph+weights you can work with (2) >> > Given >> > > a >> > > > > >> trained >> > > > > >> > model, >> > > > > >> > >> save it as an ONNX file. Since MXNet users do not >> > > > interact >> > > > > >> with NNVM >> > > > > >> > >> directly, but rather interact with MXNet API >> (MXNet >> > > > > Module), >> > > > > >> isn’t >> > > > > >> > the >> > > > > >> > >> simplest thing to do is just to construct the >> Module >> > > “on >> > > > > the >> > > > > >> fly” >> > > > > >> > using >> > > > > >> > >> MXNet API? Taking the other approach, we will go >> from >> > > the >> > > > > >> top level >> > > > > >> > MXNet >> > > > > >> > >> “load” API, go “down” to NNVM to construct the >> graph, >> > > go >> > > > > >> back up to >> > > > > >> > MXNet >> > > > > >> > >> to expose it as a Module. This seems to complex >> and >> > > does >> > > > > not >> > > > > >> add any >> > > > > >> > >> benefit. In whatever way we construct the MXNet >> > Module >> > > > > >> object, NNVM >> > > > > >> > will >> > > > > >> > >> always be the underlying in memory IR that is >> being >> > > > > >> executed, so >> > > > > >> > why not >> > > > > >> > >> take the simpler route? >> > > > > >> > >> >> > > > > >> > >> Hagay >> > > > > >> > >> >> > > > > >> > >> On 10/18/17, 19:42, "Tianqi Chen" < >> > workc...@gmail.com >> > > on >> > > > > >> behalf of >> > > > > >> > >> tqc...@cs.washington.edu> wrote: >> > > > > >> > >> >> > > > > >> > >> Hi Chris: >> > > > > >> > >> >> > > > > >> > >> There is no intention to move things away from >> > > mxnet. >> > > > > The >> > > > > >> > reduction of >> > > > > >> > >> lines of code by having a better design in >> > general, >> > > > and >> > > > > >> > usually, you >> > > > > >> > >> write >> > > > > >> > >> less redundant code by benefiting from better >> > > design. >> > > > > As >> > > > > >> I may >> > > > > >> > quote: >> > > > > >> > >> "the >> > > > > >> > >> best design is not achieved not when there is >> > > nothing >> > > > > to >> > > > > >> add, >> > > > > >> > but when >> > > > > >> > >> there is nothing to be taken away." >> > > > > >> > >> >> > > > > >> > >> MXNet has always benefited from this >> philosophy >> > and >> > > > > >> improves >> > > > > >> > with the >> > > > > >> > >> new >> > > > > >> > >> designs and proper modularization. For >> example, >> > we >> > > > see >> > > > > >> such >> > > > > >> > reduction >> > > > > >> > >> and >> > > > > >> > >> convenience happening when migrating from >> MXNet's >> > > > > legacy >> > > > > >> op to >> > > > > >> > the >> > > > > >> > >> NNVM's mechanism. The new mechanism now >> enables >> > > > things >> > > > > >> like >> > > > > >> > sparse >> > > > > >> > >> aware >> > > > > >> > >> support and other stuff which would be much >> > harder >> > > to >> > > > > >> support. >> > > > > >> > >> >> > > > > >> > >> The nnvm/tvm stack comes brings the same >> > benefit(if >> > > > not >> > > > > >> more) >> > > > > >> > and it >> > > > > >> > >> will >> > > > > >> > >> only add more features to MXNet itself. >> Offering >> > > more >> > > > > >> hardware >> > > > > >> > >> backends and >> > > > > >> > >> optimization, allowing us to write less code >> and >> > > > spent >> > > > > >> less >> > > > > >> > time to >> > > > > >> > >> optimize for each backend by going through TVM >> > > > > >> > >> >> > > > > >> > >> Tianqi >> > > > > >> > >> >> > > > > >> > >> On Wed, Oct 18, 2017 at 7:15 PM, Chris >> Olivier < >> > > > > >> > cjolivie...@gmail.com >> > > > > >> > >> > >> > > > > >> > >> wrote: >> > > > > >> > >> >> > > > > >> > >> > Reduce code base of mxnet? By increasing >> scope >> > of >> > > > the >> > > > > >> dmlc >> > > > > >> > modules? >> > > > > >> > >> Is the >> > > > > >> > >> > intent to make mxnet a thin language wrapper >> > > > around a >> > > > > >> group >> > > > > >> > of dmlc >> > > > > >> > >> > modules? >> > > > > >> > >> > >> > > > > >> > >> > >> > > > > >> > >> > On Wed, Oct 18, 2017 at 6:58 PM Tianqi Chen >> < >> > > > > >> > >> tqc...@cs.washington.edu> >> > > > > >> > >> > wrote: >> > > > > >> > >> > >> > > > > >> > >> > > To better answer Hagay's question, I would >> > like >> > > > to >> > > > > >> dive >> > > > > >> > down a >> > > > > >> > >> bit deeper >> > > > > >> > >> > > on the relation between MXNet, NNVM and >> model >> > > > > >> exchange >> > > > > >> > format >> > > > > >> > >> like ONNX. >> > > > > >> > >> > > >> > > > > >> > >> > > There are two major trends in deep >> learning >> > > > systems >> > > > > >> now: >> > > > > >> > >> > > >> > > > > >> > >> > > - Common serializable formats, like ONNX >> and >> > > > > CoreML, >> > > > > >> that >> > > > > >> > defines >> > > > > >> > >> the >> > > > > >> > >> > model >> > > > > >> > >> > > exchange format. >> > > > > >> > >> > > - The in-memory graph IR for quick >> > optimization >> > > > and >> > > > > >> JIT. >> > > > > >> > NNVM, >> > > > > >> > >> > Tensorflow's >> > > > > >> > >> > > XLA falls into this category. >> > > > > >> > >> > > >> > > > > >> > >> > > The exchange formats are great, it only >> > poses a >> > > > > >> layer of >> > > > > >> > >> conversion, >> > > > > >> > >> > which >> > > > > >> > >> > > is good for exchange. The real meat still >> > comes >> > > > > from >> > > > > >> the >> > > > > >> > >> compilation and >> > > > > >> > >> > > JIT pipeline you have to offer. For that, >> we >> > > will >> > > > > >> need an >> > > > > >> > >> in-memory IR, >> > > > > >> > >> > > because of the cost of constructing, >> > serialize >> > > > > could >> > > > > >> be >> > > > > >> > high for >> > > > > >> > >> the >> > > > > >> > >> > > exchange formats like protobuf. And >> usually, >> > > the >> > > > > >> exchange >> > > > > >> > >> formats are >> > > > > >> > >> > > designed in a minimalistic fashion, >> making it >> > > > less >> > > > > >> easy to >> > > > > >> > extend >> > > > > >> > >> more >> > > > > >> > >> > > information to support in-depth >> optimization >> > > like >> > > > > >> automatic >> > > > > >> > >> quantization, >> > > > > >> > >> > > accelerator support. >> > > > > >> > >> > > >> > > > > >> > >> > > The current MXNet relies on NNVM for >> > in-memory >> > > IR >> > > > > >> > manipulation >> > > > > >> > >> but does >> > > > > >> > >> > not >> > > > > >> > >> > > contain a compilation component that >> compiles >> > > to >> > > > > the >> > > > > >> > hardware >> > > > > >> > >> backends. >> > > > > >> > >> > > Doing export to an exchange format and >> then >> > > back >> > > > > >> into NNVM >> > > > > >> > run the >> > > > > >> > >> > > compilation poses too much burden that JIT >> > > > compiler >> > > > > >> could >> > > > > >> > pay. >> > > > > >> > >> Using the >> > > > > >> > >> > > same in-memory graph IR as the compilation >> > > stack >> > > > > >> give much >> > > > > >> > more >> > > > > >> > >> advantage >> > > > > >> > >> > > in terms of this. >> > > > > >> > >> > > >> > > > > >> > >> > > The newly introduces nnvm/top and compiler >> > > offers >> > > > > >> in-memory >> > > > > >> > graph >> > > > > >> > >> > > optimization and compilation and offers >> more >> > > > > hardware >> > > > > >> > backend >> > > > > >> > >> directly >> > > > > >> > >> > via >> > > > > >> > >> > > TVM. We already see promising results in >> edge >> > > > > >> deployments >> > > > > >> > with a >> > > > > >> > >> much >> > > > > >> > >> > lower >> > > > > >> > >> > > overhead of runtime. We will further >> benefit >> > > > > quickly >> > > > > >> from >> > > > > >> > more >> > > > > >> > >> graph >> > > > > >> > >> > > optimizations that it has to offer. >> > > > > >> > >> > > >> > > > > >> > >> > > Building support around this new paradigm >> > > offers >> > > > us >> > > > > >> > advantage of >> > > > > >> > >> being >> > > > > >> > >> > > future compatible and takes full benefit >> of >> > the >> > > > > >> points I >> > > > > >> > >> mentioned above >> > > > > >> > >> > > >> > > > > >> > >> > > Tianqi >> > > > > >> > >> > > >> > > > > >> > >> > > >> > > > > >> > >> > > >> > > > > >> > >> > > On Wed, Oct 18, 2017 at 4:57 PM, Lupesko, >> > > Hagay < >> > > > > >> > >> lupe...@gmail.com> >> > > > > >> > >> > wrote: >> > > > > >> > >> > > >> > > > > >> > >> > > > Roshani – this is an exciting >> initiative, >> > > ONNX >> > > > > >> support on >> > > > > >> > MXNet >> > > > > >> > >> will >> > > > > >> > >> > > > enable more users to ramp up on MXNet, >> > which >> > > is >> > > > > >> great. >> > > > > >> > >> > > > >> > > > > >> > >> > > > Tianqi – a few questions and thoughts >> about >> > > > your >> > > > > >> note: >> > > > > >> > >> > > > - “More hardware backends to mxnet” – >> MXNet >> > > > users >> > > > > >> get the >> > > > > >> > same >> > > > > >> > >> benefit >> > > > > >> > >> > of >> > > > > >> > >> > > > HW support implementing ONNX import on >> top >> > of >> > > > > MXNet >> > > > > >> > symbolic, >> > > > > >> > >> right? >> > > > > >> > >> > > > - “NNVM Compiler now received >> contributions >> > > > from >> > > > > >> AWS, UW >> > > > > >> > and >> > > > > >> > >> many other >> > > > > >> > >> > > > folks in MXNet community.” – agreed it >> is >> > > > ramping >> > > > > >> up, but >> > > > > >> > when >> > > > > >> > >> you look >> > > > > >> > >> > > at >> > > > > >> > >> > > > the data, it is clear that it is very >> early >> > > on >> > > > > for >> > > > > >> NNVM. >> > > > > >> > >> Looking at the >> > > > > >> > >> > > > repo, it has overall 223 commits, 0 >> > releases. >> > > > > >> Compare it >> > > > > >> > to >> > > > > >> > >> MXNet with >> > > > > >> > >> > > 6136 >> > > > > >> > >> > > > commits and 32 releases. It seems to be >> > still >> > > > > >> early on for >> > > > > >> > >> NNVM, and >> > > > > >> > >> > for >> > > > > >> > >> > > a >> > > > > >> > >> > > > more reliable initial implementation >> > building >> > > > the >> > > > > >> import >> > > > > >> > on top >> > > > > >> > >> of >> > > > > >> > >> > MXNet >> > > > > >> > >> > > is >> > > > > >> > >> > > > easier, faster and safer. MXNet has >> lots of >> > > > users >> > > > > >> already >> > > > > >> > using >> > > > > >> > >> the >> > > > > >> > >> > > > Symbolic API which hopefully mean that >> is a >> > > > > mature >> > > > > >> API >> > > > > >> > that is >> > > > > >> > >> not >> > > > > >> > >> > likely >> > > > > >> > >> > > > to have breaking changes or major >> issues. >> > > > > >> > >> > > > >> > > > > >> > >> > > > I’m supportive option 1 proposed by >> Roshani >> > > > > >> (building >> > > > > >> > serde on >> > > > > >> > >> top of >> > > > > >> > >> > > > MXNet symbolic), but to do it as an >> > > > encapsulated >> > > > > >> > implementation >> > > > > >> > >> detail, >> > > > > >> > >> > > so >> > > > > >> > >> > > > the implementation can be migrated to >> NNVM >> > or >> > > > > >> another >> > > > > >> > >> implementation in >> > > > > >> > >> > > the >> > > > > >> > >> > > > future, if at that point it seems like >> the >> > > > right >> > > > > >> thing to >> > > > > >> > do. >> > > > > >> > >> > > > >> > > > > >> > >> > > > Interested in hearing other opinions >> > though… >> > > > > >> > >> > > > >> > > > > >> > >> > > > Hagay >> > > > > >> > >> > > > >> > > > > >> > >> > > > On 10/18/17, 14:13, "Tianqi Chen" < >> > > > > >> workc...@gmail.com on >> > > > > >> > >> behalf of >> > > > > >> > >> > > > tqc...@cs.washington.edu> wrote: >> > > > > >> > >> > > > >> > > > > >> > >> > > > I am strongly recommending going >> > through >> > > > the >> > > > > >> > nnvm/top. One >> > > > > >> > >> major >> > > > > >> > >> > > > reason in >> > > > > >> > >> > > > here is that the support of nnvm/top >> > > layer >> > > > > NOT >> > > > > >> ONLY >> > > > > >> > mean >> > > > > >> > >> > > compatibility >> > > > > >> > >> > > > of >> > > > > >> > >> > > > model format with onnx. These are >> the >> > > major >> > > > > >> benefits: >> > > > > >> > >> > > > >> > > > > >> > >> > > > >> > > > > >> > >> > > > - More hardware backends to mxnet, >> > > > including >> > > > > >> opencl, >> > > > > >> > metal, >> > > > > >> > >> > Raspberry >> > > > > >> > >> > > > Pi, >> > > > > >> > >> > > > web browser. These things are >> > > automatically >> > > > > >> enabled >> > > > > >> > by going >> > > > > >> > >> > through >> > > > > >> > >> > > > this >> > > > > >> > >> > > > layer. In general, we design >> nnvm/tvm >> > > stack >> > > > > to >> > > > > >> > resolve the >> > > > > >> > >> > challenge >> > > > > >> > >> > > of >> > > > > >> > >> > > > current mxnet's weakness in terms >> > > deploying >> > > > > to >> > > > > >> more >> > > > > >> > hardware >> > > > > >> > >> > > backends. >> > > > > >> > >> > > > >> > > > > >> > >> > > > - More frontend capabilities, nnvm's >> > > gluon >> > > > > >> style IR >> > > > > >> > ingests >> > > > > >> > >> now >> > > > > >> > >> > from >> > > > > >> > >> > > > CoreML, ONNX and in future keras. >> > > > Supporting >> > > > > >> those >> > > > > >> > will >> > > > > >> > >> reduce the >> > > > > >> > >> > > > amount >> > > > > >> > >> > > > of engineering effort needed. >> > > > > >> > >> > > > >> > > > > >> > >> > > > - Future compatibility. We all agree >> > that >> > > > the >> > > > > >> future >> > > > > >> > being >> > > > > >> > >> migrated >> > > > > >> > >> > > to >> > > > > >> > >> > > > gluon's API. NNVM/top tries to look >> > ahead >> > > > by >> > > > > >> directly >> > > > > >> > >> adopting the >> > > > > >> > >> > > > symbolic >> > > > > >> > >> > > > API to be gluon. >> > > > > >> > >> > > > >> > > > > >> > >> > > > >> > > > > >> > >> > > > I would also like to correct some of >> > the >> > > > > >> mentioned >> > > > > >> > facts >> > > > > >> > >> with >> > > > > >> > >> > regard >> > > > > >> > >> > > to >> > > > > >> > >> > > > nnvm/tvm stack >> > > > > >> > >> > > > >> > > > > >> > >> > > > 1. Nascent project with few >> > > contributors >> > > > > >> > >> > > > >> > > > > >> > >> > > > NNVM Compiler now received >> > contributions >> > > > from >> > > > > >> AWS, UW >> > > > > >> > and >> > > > > >> > >> many >> > > > > >> > >> > other >> > > > > >> > >> > > > folks >> > > > > >> > >> > > > in MXNet community. NNVM itself is >> > > already >> > > > > >> being used >> > > > > >> > by >> > > > > >> > >> MXNet. >> > > > > >> > >> > > > MXNet's internal IR is migrating >> toward >> > > > > gluon, >> > > > > >> and its >> > > > > >> > >> final form >> > > > > >> > >> > > being >> > > > > >> > >> > > > nnvm/top >> > > > > >> > >> > > > >> > > > > >> > >> > > > 3. Does not support all operators >> > that >> > > > > exist >> > > > > >> in >> > > > > >> > MXNet >> > > > > >> > >> Symbolic >> > > > > >> > >> > API >> > > > > >> > >> > > > >> > > > > >> > >> > > > Neither NNVM/top or onnx support all >> > > > > operators >> > > > > >> that >> > > > > >> > exist >> > > > > >> > >> in mxnet >> > > > > >> > >> > > > symbolic >> > > > > >> > >> > > > API. The end goal here is mainly to >> > make >> > > > > >> nnvm/top onnx >> > > > > >> > >> compatible, >> > > > > >> > >> > > > which is >> > > > > >> > >> > > > a more reasonable goal. >> > > > > >> > >> > > > >> > > > > >> > >> > > > 4. No CI Pipeline and testcases >> > > > > >> > >> > > > >> > > > > >> > >> > > > NNVM already contains a compiler >> > contains >> > > > > >> unittests >> > > > > >> > and ci >> > > > > >> > >> tested >> > > > > >> > >> > > with >> > > > > >> > >> > > > integration >> > > https://github.com/dmlc/nnvm, >> > > > > >> with a CI >> > > > > >> > >> pipline that >> > > > > >> > >> > is >> > > > > >> > >> > > > well >> > > > > >> > >> > > > tested on CPU and GPU cases for >> > > front-ends. >> > > > > >> > >> > > > >> > > > > >> > >> > > > Tianqi >> > > > > >> > >> > > > >> > > > > >> > >> > > > >> > > > > >> > >> > > > On Wed, Oct 18, 2017 at 1:41 PM, >> > Roshani >> > > > > >> Nagmote < >> > > > > >> > >> > > > roshaninagmo...@gmail.com> >> > > > > >> > >> > > > wrote: >> > > > > >> > >> > > > >> > > > > >> > >> > > > > Hi guys, >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > I am working on supporting ONNX < >> > > > > >> > >> https://github.com/onnx/onnx> >> > > > > >> > >> > > > pre-trained >> > > > > >> > >> > > > > models in Apache MXNet and would >> like >> > > to >> > > > > >> seek your >> > > > > >> > >> opinion on the >> > > > > >> > >> > > > choice of >> > > > > >> > >> > > > > implementation. I also have >> created a >> > > > > GitHub >> > > > > >> issue >> > > > > >> > >> > > > > <https://github.com/apache/ >> > > > > >> > incubator-mxnet/issues/8319>. >> > > > > >> > >> > > Supporting >> > > > > >> > >> > > > ONNX >> > > > > >> > >> > > > > in >> > > > > >> > >> > > > > MXNet will enable users to move >> > between >> > > > > >> frameworks >> > > > > >> > with >> > > > > >> > >> their >> > > > > >> > >> > > > models, this >> > > > > >> > >> > > > > will also enable MXNet project to >> be >> > a >> > > > part >> > > > > >> of the >> > > > > >> > ONNX >> > > > > >> > >> open >> > > > > >> > >> > > > standard and >> > > > > >> > >> > > > > steer the direction of ONNX. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > For those who don’t know ONNX, >> ONNX >> > is >> > > an >> > > > > >> open >> > > > > >> > source >> > > > > >> > >> format for >> > > > > >> > >> > AI >> > > > > >> > >> > > > models >> > > > > >> > >> > > > > which enables models to be >> > transferred >> > > > > >> between >> > > > > >> > >> frameworks. Refer >> > > > > >> > >> > to >> > > > > >> > >> > > > > https://github.com/onnx/onnx for >> > more >> > > > > >> details. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > To implement the import/export >> > > > > functionality >> > > > > >> in >> > > > > >> > MXNet, I >> > > > > >> > >> propose >> > > > > >> > >> > to >> > > > > >> > >> > > > expose >> > > > > >> > >> > > > > a MXNet python module “serde”(name >> > > taken >> > > > > from >> > > > > >> > Apache Hive >> > > > > >> > >> > project) >> > > > > >> > >> > > > with the >> > > > > >> > >> > > > > following methods supporting >> > different >> > > > > >> formats: >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > sym, params = >> > > > > mxnet.serde.import(other_forma >> > > > > >> t_file, >> > > > > >> > >> > > > other_format=‘onnx’) >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > other_format_file = >> > > > > >> mxnet.serde.export(mxnet_sym, >> > > > > >> > >> mxnet_params, >> > > > > >> > >> > > > ‘onnx’) >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > The implementation under the hood >> can >> > > be >> > > > > >> done in >> > > > > >> > two ways: >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > 1) Implement at the MXNet layer by >> > > > parsing >> > > > > >> the ONNX >> > > > > >> > >> model(in >> > > > > >> > >> > > protobuf >> > > > > >> > >> > > > > format) and turn into MXNet >> Symbolic >> > > > > >> operators and >> > > > > >> > build >> > > > > >> > >> MXNet >> > > > > >> > >> > > model >> > > > > >> > >> > > > > directly. Similarly, I can convert >> > the >> > > > > MXNet >> > > > > >> model >> > > > > >> > to >> > > > > >> > >> ONNX format >> > > > > >> > >> > > at >> > > > > >> > >> > > > this >> > > > > >> > >> > > > > layer. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > 2) The DMLC community has released >> > the >> > > > > >> nnvm/tvm >> > > > > >> > complier >> > > > > >> > >> and an >> > > > > >> > >> > > > > intermediate representation of the >> > > > models, >> > > > > >> refer: >> > > > > >> > >> > > > > http://www.tvmlang.org/2017/ >> > > > > >> > 10/06/nnvm/tvm-compiler- >> > > > > >> > >> > > > announcement.html >> > > > > >> > >> > > > > <http://www.tvmlang.org/2017/1 >> > > > > >> 0/06/nnvm-compiler- >> > > > > >> > >> > announcement.html >> > > > > >> > >> > > > >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > Based on the conversation on the >> > GitHub >> > > > > issue >> > > > > >> > >> > > > > <https://github.com/apache/ >> > > > > >> > incubator-mxnet/issues/8319> I >> > > > > >> > >> > opened, >> > > > > >> > >> > > Mu >> > > > > >> > >> > > > > mentioned that MXNet would use >> > nnvm/tvm >> > > > as >> > > > > >> the >> > > > > >> > backend in >> > > > > >> > >> the >> > > > > >> > >> > > future. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > We could hook into this layer to >> > > > implement >> > > > > >> the >> > > > > >> > >> import/export >> > > > > >> > >> > > > functionality. >> > > > > >> > >> > > > > nnvm/tvm has ONNX 0.1 version >> import >> > > > > >> implemented. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > For import, >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > 1. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > I will need to enhance >> nnvm/tvm’s >> > > > > >> importer to >> > > > > >> > support >> > > > > >> > >> ONNX 0.2 >> > > > > >> > >> > > > > 2. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > Implement nnvm/tvm->mxnet >> symbolic >> > > > > >> operators. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > For export: >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > 1. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > mxnet->nnvm/tvm ( nnvm/tvm >> > provides >> > > > this >> > > > > >> > implementation >> > > > > >> > >> > already) >> > > > > >> > >> > > > > 2. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > I will need to Implement >> > > > nnvm/tvm>onnx. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > These are the pros and cons I see >> in >> > > the >> > > > > >> above >> > > > > >> > approaches: >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > 1. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > Import/export at mxnet layer >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > Pros: >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > 1. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > Stable APIs currently used by >> > users. >> > > > > >> > >> > > > > 2. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > Larger Apache MXNet community >> of >> > > > > >> contributors. >> > > > > >> > >> > > > > 3. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > CI pipeline to catch bugs. >> > > > > >> > >> > > > > 4. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > Comparatively less time to >> > implement >> > > > and >> > > > > >> put it >> > > > > >> > in the >> > > > > >> > >> hands >> > > > > >> > >> > of >> > > > > >> > >> > > > the >> > > > > >> > >> > > > > users. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > Cons: >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > 1. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > In the future we may have to >> > > > reimplement >> > > > > >> at the >> > > > > >> > >> nnvm/tvm >> > > > > >> > >> > layer, >> > > > > >> > >> > > > in case >> > > > > >> > >> > > > > MXNet moves to the nnvm/tvm >> > > > > >> backend(assuming it >> > > > > >> > will >> > > > > >> > >> move). >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > 1. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > Import/export at nnvm/tvm layer >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > Pros: >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > 1. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > Less engineering work in case >> > mxnet >> > > > > moves >> > > > > >> to >> > > > > >> > nnvm/tvm >> > > > > >> > >> > > > > 2. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > nnvm/tvm would become a hub to >> > > convert >> > > > > to >> > > > > >> > different >> > > > > >> > >> formats. >> > > > > >> > >> > > > > 3. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > nnvm operators are more in >> parity >> > > with >> > > > > >> mxnet’s >> > > > > >> > gluon >> > > > > >> > >> APIs this >> > > > > >> > >> > > > could be >> > > > > >> > >> > > > > useful in case Gluon becomes >> the >> > > only >> > > > > >> standard >> > > > > >> > that >> > > > > >> > >> MXNet will >> > > > > >> > >> > > > support. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > Cons: >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > 1. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > Nascent project with few >> > > contributors >> > > > > >> > >> > > > > 2. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > Does not support all operators >> > that >> > > > > exist >> > > > > >> in >> > > > > >> > MXNet >> > > > > >> > >> Symbolic >> > > > > >> > >> > API >> > > > > >> > >> > > > > 3. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > No CI Pipeline >> > > > > >> > >> > > > > 4. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > Current Apache MXNet project >> does >> > > not >> > > > > use >> > > > > >> > nnvm/tvm >> > > > > >> > >> backend >> > > > > >> > >> > > > > 5. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > mxnet->nnvm/tvm backend needs >> more >> > > > > >> testing and >> > > > > >> > user >> > > > > >> > >> feedback. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > Any suggestions on both of these >> > > > > approaches? >> > > > > >> From >> > > > > >> > user's >> > > > > >> > >> > > > perspective, this >> > > > > >> > >> > > > > will be an implementation detail >> that >> > > is >> > > > > not >> > > > > >> > exposed. >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > Thanks, >> > > > > >> > >> > > > > >> > > > > >> > >> > > > > Roshani >> > > > > >> > >> > > > > >> > > > > >> > >> > > > >> > > > > >> > >> > > > >> > > > > >> > >> > > > >> > > > > >> > >> > > > >> > > > > >> > >> > > >> > > > > >> > >> > >> > > > > >> > >> >> > > > > >> > >> >> > > > > >> > >> >> > > > > >> > >> >> > > > > >> > > >> > > > > >> > >> > > > > >> > >> > > > > >> > >> > > > > >> > >> > > > > >> >> > > > > >> >> > > > > >> >> > > > > >> >> > > > > > >> > > > > >> > > > >> > > >> > >> > >