“In my opinion, not all flakies are equal. Some fail every 10 runs, some
fail 1 in 1000 runs.”
Agreed, for everything that is not a new test/regression and is also not
infra related.

“We can start by putting the bar at a lower level and then raise it over
time once most of the flakies that we hit are above that level.”
My only concern is who will track that, and how.
Also, I guess that metric should only cover non-infra issues.

“At the same time we should make sure that we do not introduce new
flakies. One simple approach that has been mentioned several times is to run
the new tests added by a given patch in a loop using one of the CircleCI
tasks.”
+1, I personally find this very valuable and more efficient than bisecting
and going back to work that was, in some cases, done months ago.
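
For what it's worth, here is a minimal sketch of the kind of loop I have in
mind. Purely illustrative: the script name repeat_test.py, the RUNS count and
the "ant testsome" invocation are my assumptions; the real CircleCI task
would plug in whatever command we already use to run a single test.

    #!/usr/bin/env python3
    # Hypothetical sketch: repeat one test N times and report its failure rate.
    # The test command is a placeholder for the project's real single-test runner.
    import subprocess
    import sys

    RUNS = 100  # assumed iteration count; tune it to whatever bar we agree on
    TEST_CMD = ["ant", "testsome", "-Dtest.name=" + sys.argv[1]]  # placeholder

    failures = 0
    for i in range(RUNS):
        result = subprocess.run(TEST_CMD, capture_output=True)
        if result.returncode != 0:
            failures += 1
            print(f"run {i + 1}: FAILED")

    print(f"{failures}/{RUNS} runs failed "
          f"(~{failures / RUNS:.1%} observed flakiness)")
    sys.exit(1 if failures else 0)

Something like "repeat_test.py org.apache.cassandra.SomeNewTest" (a
hypothetical test name) could then run as part of the pre-commit job and
fail the build if any iteration fails.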


“We should also probably revert newly committed patches if we detect that
they introduced flakies.”
+1, not that I like my patches to be reverted, but it seems like the fairest
way to stick to our stated goals. But I think the last time we talked about
reverting, we discussed it only for trunk? Or am I remembering it wrong?



On Tue, 9 Aug 2022 at 7:58, Benjamin Lerer <ble...@apache.org> wrote:

> At this point it is clear that we will probably never be able to remove
> some level of flakiness from our tests. For me the questions are: 1) Where
> do we draw the line for a release? and 2) How do we maintain that line
> over time?
>
> In my opinion, not all flakies are equal. Some fail every 10 runs, some
> fail 1 in 1000 runs. I would personally draw the line based on that
> metric. With the CircleCI tasks that Andres has added we can easily get
> that information for a given test.
> We can start by putting the bar at a lower level and then raise it over
> time once most of the flakies that we hit are above that level.
>
> At the same time we should make sure that we do not introduce new flakies.
> One simple approach that has been mentioned several times is to run the new
> tests added by a given patch in a loop using one of the CircleCI tasks.
> That would allow us to minimize the risk of introducing flaky tests. We
> should also probably revert newly committed patches if we detect that they
> introduced flakies.
>
> What do you think?
>
>
>
>
>
> On Sun, 7 Aug 2022 at 12:24, Mick Semb Wever <m...@apache.org> wrote:
>
>>
>>
>>> With that said, I guess we can just revise on a regular basis what
>>> exactly the remaining flakies are, rather than the raw numbers, which can
>>> also swing quickly up and down with the first change in the infra.
>>>
>>
>>
>> +1, I am in favour of taking a pragmatic approach.
>>
>> If flakies are identified and triaged enough that, with correlation from
>> both CI systems, we are confident that no legit bugs are behind them, I'm
>> in favour of going beta.
>>
>> I remain in favour of somehow incentivising reducing other flakies as
>> well. Flakies that expose poor/limited CI infra, and/or tests that are
>> not as resilient as they could be, are still noise that indirectly reduces
>> our QA (and increases the effort to find and tackle those legit runtime
>> problems). I am interested in hearing input from others here who have been
>> spending a lot of time on this front.
>>
>> Could it work if we say: all flakies must be ticketed, and test/infra
>> related flakies do not block a beta release so long as there are fewer of
>> them than in the previous release? The intent here is to be pragmatic while
>> keeping us on a "keep the campground cleaner" trajectory…
>>
>>
