Sheng, I will provide more detail on the GitHub issue.
The "API Change" labeling for PRs sounds like a good solution to keep consistent API design across MXNet. I guess we can close the discussion on this topic now. Best, Lin On Tue, Jan 14, 2020 at 6:02 PM Sheng Zha <zhash...@apache.org> wrote: > > 2) Regarding issue #17292, it was not broken by 4ed14e2 but an C API > > change in in https://github.com/apache/incubator-mxnet/pull/17128. The > > later commit 4ed14e2 was trying to fix this API change but it did not > seem > > to work yet. > > None of the existing C API was changed in #17128. #17128 had an > unnecessary addition of a C API which was removed in 4ed14e2. Neither > change should have broken third party integration if it's not making > assumptions on where to find the implementation. > > As I don't see any further discussion on this in the issue #17292, let's > make sure the related details are added there please. > > -sz > > On 2020/01/14 18:24:13, Lin Yuan <apefor...@gmail.com> wrote: > > Hi Sheng, > > > > Thanks for your reply. > > > > 1) Adding a "API Change" label is a good way to flag PRs with API change. > > It would be great if we could make this labeling automatic with some hook > > in API related modules, so we don't miss them in PRs. > > > > 2) Regarding issue #17292, it was not broken by 4ed14e2 but an C API > > change in in https://github.com/apache/incubator-mxnet/pull/17128. The > > later commit 4ed14e2 was trying to fix this API change but it did not > seem > > to work yet. > > > > Horovod integration does not call any inline function from MXNet, it > > includes an exported header c_api_error.h from mxnet to throw and catch > > mxnet exception. Same header is included in other project, such as BytePS > > https://github.com/bytedance/byteps/blob/master/byteps/mxnet/ops.h#L22. > > > > I agree with you we need a better design to allow other third party > > libraries to build on top of MXNet. E.g. TensorFlow provides customer > > operators for Horovod to push their allreduce actions to TensorFlow as a > > Custom Operator instead of low-level C API calls. It seems our Custom > > Dynamic Operator > > < > https://cwiki.apache.org/confluence/display/MXNET/Dynamic+CustomOp+Support > > > > project may enable this feature in MXNet 2.0 and I am looking forward to > it > > :) > > > > Cheers, > > > > Lin > > > > > > > > > > On Mon, Jan 13, 2020 at 7:24 PM Sheng Zha <zhash...@apache.org> wrote: > > > > > Hi Lin, > > > > > > Thanks for the suggestions. > > > > > > With respect to your proposal: > > > > > > > (2) Any PR that contains API change should clearly state this in PR > > > title. > > > > Otherwise, committer can reject the PR > > > > > > I agree that PRs with API changes should be made more prominent. > Another > > > mechanism that has already been used is to tag the PRs with the "API > > > change" label [1]. > > > > > > On the other hand, relying on the community to call out the PRs with > API > > > changes may not be reliable. Oftentimes, people didn't realize that a > > > change constitutes an API change. If a committer identifies such a > change, > > > a more friendly response would be to just label the PR and call out > where > > > the API change happens in a comment. > > > > > > > (1) Any PR involving change of APIs should be approved by at least > one of > > > > the committers from a "API Committee" before it can be merged. I will > > > > explain how to form this committee at the end of email > > > > > > I'm not convinced that more hierarchy should be created among > committers. > > > All committers are entrusted by the PPMC to use their judgement to the > best > > > interest of this project, and additional qualification seems > > > counter-productive. > > > > > > With respect to your linked issue #17292, as @stephenrawls pointed > out, it > > > comes from 4ed14e2 where the inline definition of MXAPIHandleException > is > > > moved to a .cc file, and I'm the one that actually made this change to > > > unblock the PR. I want to call out that: > > > - This is not an API change in that there's no change in the function > > > signature or visibility in the symbol table of libmxnet.so. > > > - It should not be the responsibility of MXNet to maintain the > assumption > > > that downstream projects like horovod make in their building logic. > > > > > > A more pressing issue should have been the way that a third-party > > > communication library like horovod integrates with MXNet. So far the > > > horovod integration seemed brittle and there have been many issues > [2]. For > > > this specific issue, to me, it doesn't seem like a good decision on the > > > horovod side to assume any function would be defined inline on the > MXNet > > > side. > > > > > > With the development of MXNet 2.0, it's a good time to rethink how > horovod > > > integration should work with MXNet. I'm hoping that MXNet 2.0 item 4.11 > > > AbstractKVStore interface (See #17115) could help simplify and > alleviate > > > the coupling in the current way of integration. > > > > > > -sz > > > > > > [1] > > > > https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93&q=is%3Apr+label%3A%22API+change%22+ > > > [2] > > > > https://github.com/apache/incubator-mxnet/issues?utf8=%E2%9C%93&q=is%3Aissue+horovod > > > > > > On 2020/01/14 00:37:55, Lin Yuan <apefor...@gmail.com> wrote: > > > > Dear Community, > > > > > > > > Recently, there were some changes to C APIs that broke another > downstream > > > > project Horovod: > https://github.com/apache/incubator-mxnet/issues/17292. > > > > Since we do not have integration tests for downstream project, it > becomes > > > > critical for us to update APIs with extreme caution. > > > > > > > > I would like to propose the following mechanism for us to maintain a > > > clean > > > > and robust APIs: including both C API and Python API at the moment. > > > > > > > > (1) Any PR involving change of APIs should be approved by at least > one of > > > > the committers from a "API Committee" before it can be merged. I will > > > > explain how to form this committee at the end of email > > > > > > > > (2) Any PR that contains API change should clearly state this in PR > > > title. > > > > Otherwise, committer can reject the PR > > > > > > > > API Committee: > > > > - This committee should consist of both seasoned MXNet developers and > > > users. > > > > - Committee members should have a comprehensive view of MXNet APIs to > > > make > > > > sure their usage are consistent across stack. > > > > - Committee members review PRs that involve API change with extra > > > caution. > > > > - Committee members are required to attend the roadmap discussion for > > > each > > > > new release. > > > > - For API breaking changes, committe members should reach consensus > > > before > > > > the change is made. > > > > > > > > Any other suggestion is welcome here. > > > > > > > > Best, > > > > > > > > Lin > > > > > > > > > >