The core of the proposal is to turn the graph into a much more strongly
typed template class.
I think this is mainly a matter of engineering taste, and both sides have
pros and cons. Let me list them before I share my thoughts on this issue:

- Typed fields certainly enjoy more compile-time type checking; on the
other hand, it is hard to expose an explosive number of template
   instantiations to frontend languages.
- More type-erased fields provide runtime flexibility to store polymorphic
types as well as extensible attributes for graph optimization
  - It is hard to use a virtual class to expose every possible attribute
that an operator might have, such as inlining, storage pattern, gradients,
etc.
  - Supporting a growing set of operator attributes inherently requires a
type-erased attrs field.
- In contrast to your argument (that typing is a blocker to features),
typed and type-erased code can both reach the same features, except that
  typed code surfaces more errors at compile time while type-erased code
surfaces some of them at runtime.
- Templatized data structures will likely introduce additional mental
burden for developers and are not really suitable as a core data structure
   - Because they imply an explosive number of possible data structures,
while the core data structure should be a single one.

Now my view (as an MXNet PMC member) on typed vs type-erased style: if
MXNet were a pure C++ project, I might lean more toward the typed approach.
However, MXNet itself is a project that embraces Python/Scala/Clojure and
other frontend languages.
Introducing more typing may not align with that original goal, for the
tradeoff reasons I listed above.

This proposal is really a drastic change to what NNVM does, as well as to
the optimization passes. Given the scope, it is, in your analogy, "a new
vehicle to solve all the problems" rather than a minor patch. It will take
a lot of engineering effort to bring in new features and adapt the
existing ones.
Because of that, it does merit a discussion about how we should think
about the future MXNet 2.0.

Technically, Relay is a serious candidate. Of course Relay, as well as its
core, is in C++, but it maintains the multi-language-first principle; that
is why the example code was in Python.
See more related discussion comparing NNVMv1 and relay:
https://discuss.tvm.ai/t/any-materials-of-relay-for-beginners/2392/5

I think the ideal graph data structure candidate for MXNet 2.0 should have
natural support for:
- Functions, modules, and recursion
- Control flow
- Interoperation with multi-language frontends, e.g. being able to
prototype graph optimizations in Python/Scala/Clojure if needed.

Adding this support requires significant engineering effort, and I do hope
we only have to do it once. While I don't want to force any conclusion
here, I do think Relay is one such candidate.

Tianqi


On Tue, May 14, 2019 at 5:58 PM Pedro Larroy <pedro.larroy.li...@gmail.com>
wrote:

> Hi Tianqi
>
> Thanks for the quick response.
>
> Could you point to examples where graph.h is being exposed that would
> not be possible with what I propose? I don't think my proposal has
> any impact on language bindings, and as I describe it, it doesn't
> affect whether we have higher-level language bindings. Please
> elaborate so I can understand your concern.  Maybe code examples where
> the graph attributes are being changed from Python?  I don't think we
> have this in MXNet. This is such a core foundation for MXNet that I
> don't think we should compromise on it because other projects not
> directly related to MXNet might want to expose some untyped graph and
> node attributes.  The current state makes maintaining the code very
> painful and is also preventing desired features, such as higher-order
> gradients, from being developed. I have heard from you many times how
> speed is critical for us to innovate in this quickly changing field.
>
> My proposal is limited to the graph and wouldn't change, for example,
> the way operators are registered and how their arguments are
> processed.
>
>
> Regarding the second point, the Relay documentation I found on the
> web, for example:
>
> https://docs.tvm.ai/dev/relay_add_op.html#
>
> Is somebody working on making Imperative::Backward use this API? This
> would be a big change which I'm not aware of. And using an IR is of
> much bigger scope than the change I'm proposing here, for example.
>
> I think I'm having difficulty understanding what the arguments are
> here. I'm saying I need to change one piece of my car, and what you
> are selling me is a new vehicle?  Or is your suggestion that we use
> Relay for the graph passes in MXNet?
>
> I would like to see C++ code examples; Python examples are not
> sufficient when we talk about the core of MXNet.
>
> Pedro.
>
>
>
>
>
>
> On Tue, May 14, 2019 at 5:39 PM Tianqi Chen <tqc...@cs.washington.edu>
> wrote:
> >
> > Thanks for the proposal. Let me share some of my thoughts:
> >
> > Specific comments on the proposal
> > -----------------------------------------------
> > The heavy use of generics in the Graph type is a huge departure from
> > the type-erased data structure presented in the previous design.
> > While we understand the advantages of typed languages (more compile-time
> > checking) and type-erased types (more dynamism), the heavy use of
> > templates will actually make the project solely C++ focused, making it
> > hard to expose intermediate (templatized) data structures to
> > other languages like Python/Scala/Clojure.
> >
> > While I fully understand some of the lessons taught in programming
> > C++ (reduce shared_ptr, more typing, etc.),
> > we need to think about the context of the MXNet project and **the need
> > to support multi-language as a first-class concern**.
> > Some of the type-erased types are design trade-offs made to support
> > these features, and we need to think more
> > carefully instead of just applying "rules for C++", which may bring
> > problems.
> >
> > Future of NNVM
> > ----------------------
> > Given that this thread touched upon what we should do for better
> > computational graph handling, I would recommend also taking a look at
> > NNVMv2 -- Relay.
> >
> > Relay already addresses many of the wish-list items in the proposal,
> > such as operator fusion, higher-order gradients, offload to hardware,
> > isolated compilation, and deployment on edge devices and accelerators.
> > Relay also addresses problems not yet mentioned in the proposal,
> > including control flow and a dynamic runtime, automatic layout
> > optimization, etc.
> >
> > Tianqi
> >
> > On Tue, May 14, 2019 at 5:06 PM Sheng Zha <zhash...@apache.org> wrote:
> >
> > > Hi Pedro,
> > >
> > > Thanks for taking the initiative. Skimming through the design doc, I
> > > didn't see a comparison with existing solutions such as Relay in TVM,
> > > which is already a dependency of MXNet. Could you elaborate on the
> > > comparison with existing solutions in the design doc too?
> > >
> > > -sz
> > >
> > > On 2019/05/14 23:49:30, Pedro Larroy <pedro.larroy.li...@gmail.com>
> > > wrote:
> > > > Hi dev@
> > > >
> > > > As a result of my deep dives on the graph machinery I have created a
> > > > new proposal to improve the operator graph in MXNet.
> > > >
> > > > This would mean superseding the use of NNVM Graph in MXNet and having
> > > > a new implementation that we can use to simplify a lot of code and do
> > > > powerful graph manipulation and passes such as operator fusion and
> > > > other optimizations.
> > > >
> > > > As it would be a change with big impact and ramifications, your
> > > > thoughts and feedback on the document would be highly appreciated so
> > > > we can take into account potential future use cases:
> > > >
> > > > https://cwiki.apache.org/confluence/display/MXNET/MXVM%3A+Operator+graph+2.0
> > > >
> > > > Pedro.
> > > >
> > >
>
