Hi
Thanks for your insightful comments. I see the concerns about moving PR
checks to nightly, and it worries me too. We should all agree on a tradeoff
of some sort. Flaky tests not only fail to help validate the code, they are
also detrimental to our progress. I agree that disabling them is suboptimal,
but we need to get serious about reliably fixing and re-enabling the
tests.
Mapping tests to folders might be a good enough approximation for the mxnet
repo. In general, you would have to trace code-to-test dependencies.
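As a rough illustration of the folder-based approximation, here is a minimal sketch. The folder prefixes and suite names are hypothetical and not taken from the actual mxnet CI configuration; anything that does not match a known folder falls back to running every suite.

```python
# Hypothetical folder-to-suite mapping; prefixes and suite names are
# illustrative only, not the real mxnet CI layout.
FOLDER_TO_SUITES = {
    "python/": {"python-unit"},
    "scala-package/": {"scala-unit"},
    # Assumption: core sources affect every language binding.
    "src/": {"python-unit", "scala-unit", "cpp-unit"},
}
ALL_SUITES = set().union(*FOLDER_TO_SUITES.values())


def suites_for_change(changed_files):
    """Return the set of test suites to run for the given changed paths."""
    selected = set()
    for path in changed_files:
        matches = [
            suites
            for prefix, suites in FOLDER_TO_SUITES.items()
            if path.startswith(prefix)
        ]
        if not matches:
            # Unknown location (e.g. build scripts): be conservative
            # and run everything.
            return set(ALL_SUITES)
        for suites in matches:
            selected |= suites
    return selected
```

The list of changed paths could come from something like `git diff --name-only` against the target branch. Tracing real code-to-test dependencies would need more than this prefix match.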
Steffen
On Thu, Jun 7, 2018 at 6:48 PM Marco de Abreu wrote:
> We already
We already have GitHub issues for most of the flaky tests. Even for some
that have been disabled for almost half a year - they have never been
re-enabled and thus we still lack coverage.
I think I have an idea how to do it. I will check if it actually works and
then provide a small POC. Basically
I support creating GitHub/Jira issues for flaky tests and disabling them for
now. However, we need to get serious and prioritize fixing the disabled tests.
Making PR checks smart, so that they test only the code impacted by a change,
is a good idea. Does anybody have experience with tools enabling smart
validation?
I'm concerned abo
Sorry, I missed reading that Pedro was asking to move the tests that run
training. I agree with that.
Additionally we should make the CI smart as I mentioned above.
-Naveen
On Thu, Jun 7, 2018 at 3:59 PM, Naveen Swamy wrote:
> -1 for moving to nightly. I think that would be detrimental.
>
> W
-1 for moving to nightly. I think that would be detrimental.
We have to make our CI a little smarter so that it builds only the required
components, not all of them, to reduce the cost and the time it takes to run
CI. A Scala build need not build everything and run tests related to Python,
etc.
Th
Thanks a lot for your input, Thomas! You are right, 3h is only hit if
somebody makes changes in their Dockerfiles and thus every node has to
rebuild its containers - but this is expected and inevitable.
So far there have not been any big attempts to reduce the number of flaky
tests. We had a fe
Thanks for bringing up the issue of CI stability!
However I disagree with some points in this thread:
- "We are at approximately 3h for a full successful run."
=> Looking at Jenkins I see the last successful runs oscillating between
1h53 and 2h42 with a mean that seems to be at 2h20. Or are you talk
Yeah, I think we are at the point at which we have to disable tests.
If a test fails in nightly, the commit would not be reverted since it's
hard to pin a failure to a specific PR. We will have reporting for failures
on nightly (they have proven to be stable, so we can enable it right from
the be
I'd like to disable flaky tests until they're fixed.
What would the process be for fixing a failure if the tests are done
nightly? Would the commit be reverted? Won't we end up in the same
situation with so many flaky tests?
I'd like to see if we can separate the test pipelines based on the conten
Hi Team
The time to validate a PR is growing, due to the number of platforms we
support and the increased time spent testing and running models. We are
at approximately 3h for a full successful run.
This is compounded by a build failure rate of more than 50% due to flaky
tests, which is a b