@Till, Robert: +1. That would be helpful.
On Thu, Feb 28, 2019 at 4:08 PM Till Rohrmann <trohrm...@apache.org> wrote:
>
> @Ufuk my understanding, though never written down, was to mark test
> stability issues as critical and to add the test-stability label. Maybe
> we should state this somewhere more explicitly.
>
> On Thu, Feb 28, 2019 at 1:59 PM Ufuk Celebi <u...@ververica.com> wrote:
>
> > I fully agree with Aljoscha and Chesnay (although my recent PR
> > experience was still close to what Stanislav describes).
> >
> > @Robert: Do we have standard labels that we apply to tickets that
> > report a flaky test? I think this would be helpful to make sure that
> > we have a good overview of the state of flaky tests.
> >
> > Best,
> >
> > Ufuk
> >
> > On Wed, Feb 27, 2019 at 3:04 PM Aljoscha Krettek <aljos...@apache.org>
> > wrote:
> > >
> > > I agree with Chesnay, and I would like to add that the most important
> > > step towards fixing flakiness is awareness and willingness. As soon
> > > as you accept flakiness and start working around it (as you
> > > mentioned), more flakiness will creep in, making it harder to get rid
> > > of in the future.
> > >
> > > Aljoscha
> > >
> > > > On 27. Feb 2019, at 12:04, Chesnay Schepler <ches...@apache.org>
> > > > wrote:
> > > >
> > > > We've been in the same position a while back, with the same
> > > > effects. We solved it by creating JIRAs for every failing test and
> > > > cracking down hard on them; I don't think there's any other way to
> > > > address this.
> > > > However, to truly solve this, one must look at the original cause
> > > > to prevent new flaky tests from being added.
> > > > From what I remember, many of our tests were flaky because they
> > > > relied on timings (e.g. Thread.sleep for X and assume Y has
> > > > happened) or had similar race conditions, and committers nowadays
> > > > are rather observant of these issues.
> > > >
> > > > By now the majority of our builds succeed.
> > > > We don't do anything like running the builds multiple times before
> > > > a merge. I know some committers always run a PR at least once
> > > > against master, but this certainly doesn't apply to everyone.
> > > > There are still tests that fail from time to time, but my
> > > > impression is that people still check which tests are failing to
> > > > ensure they are unrelated, and track them regardless.
> > > >
> > > > On 26.02.2019 17:28, Stanislav Kozlovski wrote:
> > > >> Hey there Flink community,
> > > >>
> > > >> I work on a fellow open-source project - Apache Kafka - and there
> > > >> we have been fighting flaky tests a lot. We run Java 8 and Java 11
> > > >> builds on every pull request, and due to test flakiness almost all
> > > >> of them turn out red, with 1 or 2 tests (completely unrelated to
> > > >> the change in the PR) failing. This has resulted in committers
> > > >> either ignoring the failures and merging the changes or, in the
> > > >> worst case, rerunning the hour-long build until it becomes green.
> > > >> This test flakiness has also slowed down our releases
> > > >> significantly.
> > > >>
> > > >> In general, I was just curious to understand whether this is a
> > > >> problem that your project faces as well. Does your project have a
> > > >> lot of intermittently failing tests? Do you have any active
> > > >> process for addressing such tests (during the initial review,
> > > >> after realizing a test is flaky, etc.)? Any pointers will be
> > > >> greatly appreciated!
> > > >>
> > > >> Thanks,
> > > >> Stanislav
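
For readers not familiar with the sleep-and-assume anti-pattern Chesnay
mentions, here is a minimal Java sketch of it alongside a common fix
(polling the condition up to a deadline). All class and method names
below are hypothetical illustrations, not code from the Flink or Kafka
code bases:

    import java.time.Duration;
    import java.util.function.BooleanSupplier;

    public class FlakyTimingExample {

        // Anti-pattern: sleep for a fixed interval and assume the async
        // work has finished. On a loaded CI machine the work may take
        // longer than one second, so this fails intermittently.
        static void flakyWait(BooleanSupplier jobFinished) throws InterruptedException {
            Thread.sleep(1000);
            if (!jobFinished.getAsBoolean()) {
                throw new AssertionError("job did not finish");
            }
        }

        // More robust: poll the condition at a short interval up to a
        // generous deadline. The test proceeds as soon as the condition
        // holds and only fails after the full timeout has elapsed.
        static void awaitCondition(BooleanSupplier condition, Duration timeout)
                throws InterruptedException {
            long deadline = System.nanoTime() + timeout.toNanos();
            while (!condition.getAsBoolean()) {
                // overflow-safe nanoTime comparison
                if (System.nanoTime() - deadline > 0) {
                    throw new AssertionError("condition not met within " + timeout);
                }
                Thread.sleep(50); // poll interval
            }
        }
    }

A test would then wait on the observable effect rather than on wall-clock
time, e.g. (cluster and jobIsRunning being placeholders):

    awaitCondition(() -> cluster.jobIsRunning(jobId), Duration.ofSeconds(30));

The generous timeout only costs time when something is actually wrong,
whereas a fixed sleep costs its full duration on every run and still
fails whenever the machine is slower than expected.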