We were in the same position a while back, with the same effects. We solved it by creating JIRAs for every failing test and cracking down hard on them; I don't think there's any other way to address this. However, to truly solve this one must also look at the original cause, to prevent new flaky tests from being added. From what I remember, many of our tests were flaky because they relied on timings (e.g. Thread.sleep for X ms and assume Y has happened) or had similar race conditions, and committers nowadays are rather watchful for these issues in review.
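To make that concrete, here is a minimal sketch (hypothetical code, not taken from our test suite) contrasting the sleep-based pattern with the polling alternative we push for in review:

    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.AtomicInteger;

    public class FlakinessSketch {
        private final AtomicInteger counter = new AtomicInteger();

        // Hypothetical async operation standing in for the code under test.
        void asyncIncrement() {
            CompletableFuture.runAsync(counter::incrementAndGet);
        }

        // Flaky: sleep a fixed 100 ms and assume the async work finished.
        // On a loaded CI machine the task may not have run yet.
        void flakyTest() throws InterruptedException {
            asyncIncrement();
            Thread.sleep(100);
            if (counter.get() != 1) {
                throw new AssertionError("increment not visible after sleep");
            }
        }

        // Stable: poll for the condition with a generous deadline and
        // succeed as soon as it holds, instead of hard-coding a delay.
        void stableTest() throws InterruptedException {
            asyncIncrement();
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(30);
            while (counter.get() != 1) {
                if (System.nanoTime() > deadline) {
                    throw new AssertionError("condition not met within 30s");
                }
                Thread.sleep(10);
            }
        }
    }

(Kafka's TestUtils has a waitForCondition helper along these lines, if I remember correctly.) The point is simply that the assertion should wait for the condition, not for the clock.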

By now the majority of our builds succeed. We don't do anything like running the builds multiple times before a merge. I know some committers always run a PR at least once against master, but this certainly doesn't apply to everyone. There are still tests that fail from time to time, but my impression is that people still check which tests are failing to ensure they are unrelated, and track them regardless.

On 26.02.2019 17:28, Stanislav Kozlovski wrote:
Hey there Flink community,

I work on a fellow open-source project - Apache Kafka - and there we have been 
fighting flaky tests a lot. We run Java 8 and Java 11 builds on every Pull 
Request and due to test flakiness, almost all of them turn out red with 1 or 2 
tests (completely unrelated to the change in the PR) failing. This has resulted 
in committers either ignoring them and merging the changes or in the worst case 
rerunning the hour-long build until it becomes green.
This test flakiness has also slowed down our releases significantly.

In general, I was just curious to understand if this is a problem that your
project faces as well. Does your project have a lot of intermittently failing
tests, and do you have any active process for addressing such tests (during
the initial review, after realizing a test is flaky, etc.)? Any pointers will
be greatly appreciated!

Thanks,
Stanislav
