Hi David,

I guess you meant to say

"This does not mean that we should NOT continue our effort to reduce the number of flaky tests."

I totally agree with what you wrote. I am also +1 on considering all failures for unit tests.

Best,
Bruno

On 2/12/24 9:11 AM, David Jacot wrote:
Hi folks,

I have been playing with the `reports.junitXml.mergeReruns` setting in Gradle
[1]. From the Gradle documentation:

When mergeReruns is enabled, if a test fails but is then retried and
succeeds, its failures will be recorded as <flakyFailure> instead of
<failure>, within one <testcase>. This is effectively the reporting
produced by the surefire plugin of Apache Maven™ when enabling reruns. If
your CI server understands this format, it will indicate that the test was
flaky. If it does not, it will indicate that the test succeeded as it will
ignore the <flakyFailure> information. If the test does not succeed (i.e.
it fails for every retry), it will be indicated as having failed whether
your tool understands this format or not.
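
To make the quoted behaviour concrete, here is a minimal sketch of how a tool that understands this format might classify each <testcase>. It is only an illustration, not code from the PR: the sample XML, the test names, and the `classify` / `summarize` helpers are all made up, and it assumes the surefire-style element names (<failure>, <error>, <flakyFailure>, <flakyError>).

```python
import xml.etree.ElementTree as ET

# Hypothetical report fragment, invented for illustration only.
SAMPLE = """<testsuite name="ExampleTest" tests="3">
  <testcase name="alwaysPasses" classname="ExampleTest"/>
  <testcase name="passesOnRetry" classname="ExampleTest">
    <flakyFailure message="timed out" type="java.lang.AssertionError">trace</flakyFailure>
  </testcase>
  <testcase name="alwaysFails" classname="ExampleTest">
    <failure message="boom" type="java.lang.AssertionError">trace</failure>
  </testcase>
</testsuite>"""


def classify(testcase):
    """Return 'failed', 'flaky', or 'passed' for one <testcase> element."""
    # A <failure> or <error> child means the test never succeeded.
    if testcase.find("failure") is not None or testcase.find("error") is not None:
        return "failed"
    # Only flaky markers: the test failed at least once, then passed on a retry.
    if testcase.find("flakyFailure") is not None or testcase.find("flakyError") is not None:
        return "flaky"
    # A tool that ignores the flaky markers would also end up here.
    return "passed"


def summarize(xml_text):
    """Map each test name in the report to its classification."""
    root = ET.fromstring(xml_text)
    return {tc.get("name"): classify(tc) for tc in root.iter("testcase")}


if __name__ == "__main__":
    for name, status in summarize(SAMPLE).items():
        print(f"{name}: {status}")
```

A CI server that does not know about <flakyFailure> would simply skip the second branch and report `passesOnRetry` as passed, which matches the fallback described above.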

With this, we get really close to having green builds [2] all the time.
There are only a few tests which are too flaky. We should address or
disable those.

I think that this would help us a lot because it would reduce the noise
that we get in pull requests. At the moment, there are just too many failed
tests reported so it is really hard to know whether a pull request is
actually fine or not.

[1] applies it to both unit and integration tests. Following the discussion
in the `github build queue` thread, it may be better to only apply it to
the integration tests. Being stricter with unit tests would make sense.

This does not mean that we should continue our effort to reduce the number
of flaky tests. For this, I propose to keep using Gradle Enterprise. It
provides a nice report for them that we can leverage.

Thoughts?

Best,
David

[1] https://github.com/apache/kafka/pull/14862
[2] https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-14862/19/tests
