Thanks for the good discussion.

I actually wasn't referring particularly to our conversations in
github with respect of the refactors, but it's nice from you to bring
them up. And it's ok to disagree in small things, hopefully we can
align in the big ones.

I understand that for TVM you might have different constraints with
how dynamic you want to be for mutating the graph and doing quick
experimentation and research but please try to understand my
perspectives coming from a software engineering background and helping
maintain MXNet for thousands of users and teams using it in
production. Let's also consider how many issues we have open and our
bandwidth to deal with additional complexity.

To your pros and cons I would like to add and emphasize that currently
the heavy use of dynamic attributes in the graph using dmlc::any has
two very negative consequences, at least for MXNet:

1 - Makes the data structures using dmlc::any almost impossible to
debug, as they are just binary.
2 - Makes the code more difficult to understand because there's no
declaration in a data structure of the data fields it uses and its
responsibilities. We are basically shoving all kinds of stuff using
dmlc::any.
3 - You get no help from the IDE to navigate and refactor as another
consequence.

I would really like you to give me solutions to these problems or at
least acknowledge them and tell me why do we have to pay those
tradeoffs instead of just dismissing them as engineering taste.

The more I work with MXNet the more I wish debugging was easier, and
reading and refactoring the code, and those fields would be declared
and typed in their corresponding data structures, for MXNet I don't
think this would affect anything in regards the python bindings since
they go through the typed C API anyway.

Maybe we can get some inspiration from LLVM as they have bindings for
many languages to work with the AST and have very clean APIs for the
compilation steps. It's also OK to have an initial dynamic codebase
for research and experimentation and then "cure" them into a solid
maintainable one with more types and more robust...

Pedro.





On Fri, May 31, 2019 at 9:31 AM Tianqi Chen <tqc...@cs.washington.edu> wrote:
>
> A good infrastructure design has a long way to go and has a profound impact 
> on the project itself. That is why we always want to rethink if the interface 
> can be better done, and think about the next possible infrastructure to make 
> things better, Refactoring is certainly part of it.
>
> There are usually two types of refactoring we refers to :
> 1) The major design change, in terms of class relations, data structures 
> (e.g. numpy support, adding compilation to new hardware)
> 2) The specific choice of API, programming style(more types or type-erased 
> program)
>
> (1) affects the long term support of the project, introduces new features if 
> necessary and need a lot of thoughts into that. I believe the general IR, 
> compilation and numpy support belongs to that category.
>
> I would particularly like to talk about (2).
> Because there is no unified correct answer in software engineering, different 
> developers may prefer different views on a certain problem.
> Some of them have things to do with the taste developers. The change could 
> favor certain aspect of the project, but not necessarily another part.
> Refactoring wrt these sometimes does require a more thoughtful conversation 
> and make a reasonable compromise.
>
> For example, we have a recent discussion about whether to introduce more 
> typing into the code base, to the extent that the base data structure could 
> be templatized.
> - The Pros of this approach
>     - It introduces more typing and compile-time error message(instead of 
> runtime checking), could help developers to find problem earlier.
> - The Cons of the approach:
>    - Having a template in the base data structure causes ABI problem(which 
> code generated by DLL A vs DLL B) and will have potential future issues.
>    - Template sometimes confuses some developers.
>    - For serialization, it is hard to anticipate all kinds of classes and it 
> is easier to have one class(any) that handles polymorphism.
>    - Because of most frontends(python) are dynamic, it is easier to interface 
> them with a type-erased API.
>
> As we can see there are pros and cons of bringing in more typing to the 
> change, and there is no unified answer.
> One good example of a nice infrastructure design trade-off is DLPack 
> https://github.com/dmlc/dlpack/blob/master/include/dlpack/dlpack.h
> This is a base data structure adopted by MXNet, Pytorch, Chainer, and many 
> other frameworks unanimously.
> It is a type-erased data structure that erases the data type, and memory 
> allocator from the data structure and is designed to exchange tensor(coming 
> from different memory allocators) across DLL boundaries.
> As you can see this is a good example of type-erased data structures.
>
> When we are having this kind of questions. It is important to have a good 
> conversation. Sometimes we have to make tradeoffs rather than bend 
> everyone-else to our will. This is what open source is about.
> I would also like to give some examples of conversations and how design 
> decisions are resolved. It comes from the TVM community's recent discussion 
> about VM design.
> I directly paste the github issue conversation here for the sake of 
> clarity(note that all the conversations are also mirrored to dev@tvm).
> The background is that the community want to bring a virtual machine that can 
> execute dynamic operations more effectively.
>
> - The initial proposal, made by one of the committers gave a detailed design 
> based on Stack VM https://github.com/dmlc/tvm/issues/2810
>    - As you can see that there are quite some discussions about whether we 
> want to use a different set of design, in this case, a register-based version.
>    - The conversation evolves, and while the community members disagree on 
> some cases, also agrees with each other on the particular tradeoffs.
> - After some discussions, the committers bring a tradeoff design that tries 
> to consolidate the needs of both sides and this is the final solution being 
> adopted  https://github.com/dmlc/tvm/issues/2915
> I would like to particularly highlight the fact that: 1) there are 
> disagreements in the development process. 2) developers work together to 
> understand each others' needs and then make consensus on a perhaps better 
> design.
>
> There are two other particular conversations between Pedro and myself, which 
> are during his contributions.
> - https://github.com/dmlc/tvm/pull/3037 In this case, I raised the concern 
> about API consistency, and Pedro brings up a reason why he thinks it is a 
> better idea, I agreed and we merged the PR
> - https://github.com/dmlc/tvm/pull/3108 In this other case, there are 
> technical reasons for going both sides for the case of MXNet, we have listed 
> pros/cons about both sides and have a constructive conversation. Eventually, 
> I decided to not merge the PR after weighing in all the cases.
>
> I believe both are useful conversations, and while Pedro and I disagree 
> sometimes, we do agree on many other cases. The most crucial part is about 
> having a constructive conversation.
> To summarize, I do refactoring and making things better is certainly 
> important to make the project better. And I do believe it is crucial to think 
> about all the technical consequences and make good tradeoff decisions.
> Sometimes the decision may not make every developer mostly happy, but a good 
> technical compromise could move the project forward and help the community in 
> general.
>
> Tianqi
>
> On Fri, May 31, 2019 at 6:26 AM Isabel Drost-Fromm <isa...@apache.org> wrote:
>>
>>
>>
>> Am 31. Mai 2019 14:13:30 MESZ schrieb Pedro Larroy 
>> <pedro.larroy.li...@gmail.com>:
>> > I think Martin does a very good job explaining why
>> >refactoring,
>> >reducing developer frustration and internal improvement is a crucial
>> >productivity multiplier which includes lower cost to ship features,
>> >less
>> >bugs and time spent debugging.
>>
>> There's one aspect that's special for open source projects: if a project 
>> wants to survive long term, it should make it easy for people to get started 
>> working on the project. In my experience, refactoring and cleanup play an 
>> important role in that. So thanks also for making recruiting of new 
>> contributers better.
>>
>> Isabel
>> --
>> This message was sent with K-9 from a mobile device with swipe to type 
>> enabled. I'm sorry for any embarrassing typos that slipped through.

Reply via email to