Hi Sheng,

Thank you for your efforts and this proposal to improve the tests. Here are
my thoughts.

Shouldn’t the focus be to _engineer_ each test to be reliable, rather than
compromising and debating the relative trade-offs of re-enabling flaky
tests? And is the per-run failure probability really only 10%?
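
For concreteness, here is a minimal sketch of the retry mechanism under
discussion, assuming a simple decorator-based approach (the helper name
and its integration with the test runner are hypothetical, not MXNet's
actual harness):

    import functools

    def retry(n=3):
        """Hypothetical helper: re-run a flaky test up to n times.

        With independent runs and per-run failure probability p, all n
        attempts fail with probability p**n (0.1**3 = 0.001 for p = 0.1),
        while a genuine breakage still fails on every attempt.
        """
        def decorator(test_fn):
            @functools.wraps(test_fn)
            def wrapper(*args, **kwargs):
                for attempt in range(n):
                    try:
                        return test_fn(*args, **kwargs)
                    except AssertionError:
                        if attempt == n - 1:
                            raise  # all attempts failed: report failure
            return wrapper
        return decorator

Note that the 0.1% figure assumes the failures are independent; a timing-
or load-dependent flake can fail in correlated bursts, in which case
retries help far less than the arithmetic suggests.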

As you rightly mention, the experience gained in making these tests
reliable can then serve as the standard for adding new tests, rather than
continuing to chase reliability as an elusive goal.

Hence, my non-binding vote is:
-1 for proposal #1, re-enabling flaky tests.
+1 for proposal #2, setting the standard for adding reliable tests.

I suggest we NOT compromise on the quality and reliability of the tests,
in keeping with the high bar maintained for the MXNet source code.
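
As one illustration of what engineering a test for reliability can look
like, here is a minimal sketch of pinning the random seeds so that any
failure is exactly reproducible (the seed value and the operator under
test are hypothetical placeholders):

    import numpy as np
    import mxnet as mx

    def test_softmax_deterministic():
        # Fix all relevant seeds so a failure can be reproduced exactly,
        # instead of surfacing as an intermittent "flaky" failure.
        seed = 42  # hypothetical fixed seed; could also be logged per run
        np.random.seed(seed)
        mx.random.seed(seed)

        data = mx.nd.random.uniform(shape=(3, 4))
        out = mx.nd.softmax(data)

        # Compare against a NumPy reference with explicit tolerances,
        # rather than exact equality, so numerical noise cannot flake.
        x = data.asnumpy()
        expected = np.exp(x) / np.exp(x).sum(axis=1, keepdims=True)
        np.testing.assert_allclose(out.asnumpy(), expected,
                                   rtol=1e-5, atol=1e-7)

A per-run seed that is logged on failure gives the same reproducibility
while still exploring the input space across runs.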

If the final vote is to re-enable the flaky tests, then I propose that we
enable them immediately AFTER the next MXNet release, rather than during
the upcoming release cycle.

Bhavin Thaker.

On Sat, Jan 13, 2018 at 2:20 PM, Marco de Abreu <
marco.g.ab...@googlemail.com> wrote:

> Hello Sheng,
>
> thanks a lot for leading this task!
>
> +1 for both points. Additionally, I'd propose adding a requirement to
> specify a reason whenever a new test takes more than X seconds (say, 10)
> or adds an external dependency.
>
> Looking forward to getting these tests fixed :)
>
> Best regards,
> Marco
>
> On Sat, Jan 13, 2018 at 11:14 PM, Sheng Zha <zhash...@apache.org> wrote:
>
> > Hi MXNet community,
> >
> > Thanks to the efforts of several community members, we have identified
> > many flaky tests. These tests are currently disabled to ensure the
> > smooth execution of continuous integration (CI). As a result, we have
> > lost coverage of those features. They need to be fixed and re-enabled
> > to ensure the quality of our releases. I'd like to propose the
> > following:
> >
> > 1. Re-enable flaky Python tests with retries, where feasible
> > Although these tests are unstable, they would still be able to catch
> > breaking changes. For example, if a test fails randomly with 10%
> > probability, the probability of three consecutive failed attempts is
> > only 0.1%, whereas a genuinely breaking change would fail 100% of the
> > time. Although this could increase the testing time, it's a compromise
> > that can help avoid bigger problems.
> >
> > 2. Set a standard for new tests
> > I think having criteria that new tests must follow can improve not
> > only the quality of the tests but also the quality of the code. I
> > propose the following standard for tests:
> > - Pass reliably with good coverage
> > - Avoid randomness unless necessary
> > - Avoid external dependencies unless necessary (e.g., due to licensing)
> > - Avoid being resource-intensive unless necessary (e.g., scaling tests)
> >
> > In addition, I'd like to call for volunteers to help fix these tests.
> > New members are especially welcome, as it's a good opportunity to
> > become familiar with MXNet. I'd also like to request that members who
> > wrote a given feature or test help, either by fixing it themselves or
> > by helping others understand the issues.
> >
> > The effort on fixing the tests is tracked at:
> > https://github.com/apache/incubator-mxnet/issues/9412
> >
> > Best regards,
> > Sheng
> >
>
