Thanks for the proposal, Felix. On the one hand, I agree that richer workloads from the ecosystem help find issues in MXNet early. On the other hand, I'm concerned about tightly coupling the development of these projects.
Monitoring the upstream library and addressing problems that arise when upgrading a dependency should be the concern of the downstream projects. These projects own the effort of properly testing any changes they need, including version upgrades. Having these projects in MXNet CI means the responsibility of maintaining them partly transfers to MXNet's contributors, which doesn't seem right. It blurs the line of who's responsible for debugging, isolating the problem, making a minimal reproducible example, and posting the fix.

That said, I think there's much opportunity for reusing the current code for MXNet CI. Projects in MXNet's ecosystem would likely benefit from MXNet's CI solution so that each individual community project can identify issues early. (And from offline chats with Chance and his team members, I think this is what's already on their minds.)

-sz

On 2019/02/11 16:46:06, "Zhao, Patric" <patric.z...@intel.com> wrote:
> Agree to track the 3rd party packages which make MXNet more prosperous :)
>
> Before building the CI, I suggest to create the related labels, like sockeye,
> gluonCV, gluonNLP, etc, in the GitHub and give the high priority for these
> issues/PR.
> So the issue/PR can be fixed quickly and these important applications would
> not be blocked again.
>
> We can help for the performance/backend/operator related issues as well :)
>
> Thanks,
>
> --Patric
>
>
> > -----Original Message-----
> > From: Chance Bair [mailto:chanceb...@gmail.com]
> > Sent: Monday, February 11, 2019 11:28 PM
> > To: dev@mxnet.incubator.apache.org
> > Cc: d...@mxnet.apache.org
> > Subject: Re: Third-party package tests for MXNet nightly builds
> >
> > Hi Felix,
> >
> > Thank you for the request! The CI team is currently working on improving
> > our benchmarking platform and will evaluate this request carefully.
> >
> > Chance Bair
> >
> >
> > On Mon, Feb 11, 2019 at 3:59 PM Carin Meier <carinme...@gmail.com>
> > wrote:
> >
> > > Can't speak for the CI team, but in general I think that it is good idea.
> > >
> > > On a separate note, I've been playing around with Sockeye recently and
> > > it's great! Awesome work and glad to see MXNet used for such cutting
> > > edge use cases.
> > > I'd love to see closer collaboration with the Sockeye team and MXNet
> > > for innovation, cross pollination, and evangelization of what MXNet
> > > can do.
> > >
> > > Best,
> > > Carin
> > >
> > > On Mon, Feb 11, 2019 at 6:01 AM Felix Hieber <felix.hie...@gmail.com>
> > > wrote:
> > >
> > > > Hello dev@,
> > > >
> > > > I would like to ask around whether there is interest in the
> > > > community to test nightly builds of MXNet with third-party packages
> > > > that depend on MXNet and act as early adopters. The goal is to catch
> > > > regressions in MXNet early, allowing time for bug fixes before a new
> > > > release is cut.
> > > >
> > > > For example, Sockeye <https://github.com/awslabs/sockeye> is a
> > > > customer of new MXNet releases and aims to upgrade to latest MXNet
> > > > as soon as possible. Typically, we update our dependency on MXNet
> > > > once a new release becomes available (through pip). However, there
> > > > have been cases where new releases of MXNet introduced regressions
> > > > undetected by MXNet tests (hence passing the release process): the
> > > > latest example is this issue
> > > > <https://github.com/apache/incubator-mxnet/issues/13862>, which may
> > > > have been introduced already back in October, but, due to infrequent
> > > > MXNet releases, has only surfaced recently and will most likely
> > > > force us to wait for a post or 1.4.1 release.
> > > > In this particular example, Sockeye’s tests would have detected
> > > > this, and the issue could have been created already in October,
> > > > potentially avoiding its presence in the 1.4.0 release.
> > > >
> > > > More generally, I think there are several third-party packages with
> > > > valuable test suites (e.g. gluon-nlp) that can contribute to
> > > > catching MXNet regressions or incompatibilities early. Running these
> > > > test suites for each and every PR or commit on the MXNet main repo
> > > > would be too much overhead. My proposal would be to trigger these
> > > > tests with the nightly builds (pip releases) of MXNet in a separate
> > > > CI pipeline that is able to notify the 3p maintainers in a case of
> > > > failure, but does not block MXNet development (or nightly build
> > > > releases) in any way.
> > > >
> > > > Roughly it would do the following:
> > > >
> > > > - pip install mxnet--<date>
> > > > - for each 3p package that is part of the pipeline:
> > > >   - clone/setup up package
> > > >   - run unit/integration tests of package with some timeout
> > > >   - in case of failure, notify package owner
> > > >
> > > > I am not familiar with the current CI pipelines, their requirements
> > > > and resources. It would be great if someone from the CI team could
> > > > chime in and evaluate whether such a proposal seems doable and
> > > > worthwhile.
> > > >
> > > > Best,
> > > >
> > > > Felix
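
For concreteness, the steps in Felix's quoted proposal could be sketched roughly as below. This is only an illustration under stated assumptions, not an existing MXNet CI job: the Package fields, run_command, run_pipeline, and the notify callback are all hypothetical names, and the exact pip command for the nightly build is left to the caller because the proposal does not spell it out.

```python
import subprocess
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class Package:
    """Hypothetical description of one third-party package in the pipeline."""
    name: str
    repo_url: str
    test_command: Sequence[str]
    owner: str


def run_command(command: Sequence[str], cwd: Optional[str] = None,
                timeout: Optional[float] = None) -> bool:
    """Run a command; a non-zero exit, a timeout, or a missing binary is a failure."""
    try:
        result = subprocess.run(command, cwd=cwd, timeout=timeout)
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def run_pipeline(packages: Sequence[Package],
                 install_command: Sequence[str],
                 notify: Callable[[str, str], None],
                 run: Callable[..., bool] = run_command,
                 timeout: float = 3600.0) -> List[str]:
    """Install the nightly build, test each package, and notify owners on failure.

    Returns the names of the packages that failed. Nothing here feeds back
    into MXNet's own CI, so a failure cannot block MXNet development.
    """
    if not run(install_command, None, timeout):
        raise RuntimeError("could not install the MXNet nightly build")

    failed = []
    for package in packages:
        checkout_dir = package.name  # assumption: clone into ./<package name>
        ok = run(["git", "clone", "--depth", "1", package.repo_url, checkout_dir],
                 None, timeout)
        # Only run the tests if the clone succeeded.
        ok = ok and run(package.test_command, checkout_dir, timeout)
        if not ok:
            notify(package.owner,
                   f"{package.name} failed against the MXNet nightly build")
            failed.append(package.name)
    return failed
```

Passing run and notify in as parameters keeps the ownership question visible: the job only reports to the package owner, and each downstream project decides what its test command is and who gets notified.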