Hey Greg,

Thanks for sharing this idea!

The idea of building and testing only the relevant subset of code for a
change certainly seems worth exploring.
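
Even with the current Gradle setup, a rough sketch of the approach you
describe might look something like the following (untested; it assumes
top-level directories map one-to-one to Gradle modules, which is only an
approximation):

    # Collect the top-level directories touched by this branch and run
    # only those modules' tests.
    modules=$(git diff --name-only origin/trunk...HEAD | cut -d/ -f1 | sort -u)
    for m in $modules; do
      ./gradlew ":${m}:test"
    done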

Perhaps this is a good fit for Bazel [1], where target-determinator [2]
can be used to find the subset of targets affected by the changes
between two commits.
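
A minimal sketch of what that could look like (untested; see [2] for the
exact flags and invocation):

    # Print targets affected between trunk's merge-base and the current
    # checkout, then build and test just those.
    target-determinator "$(git merge-base origin/trunk HEAD)" > affected-targets
    bazel test $(cat affected-targets)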

Even without [2], Bazel builds can benefit immensely from being
distributed across a set of remote nodes [3], with support for caching
previously built targets [4].
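
Wiring that up is mostly configuration, e.g. something along these lines
in .bazelrc (the endpoints below are placeholders, not real services):

    # Use a shared remote cache and remote executors, and fetch only the
    # outputs needed locally.
    build --remote_cache=grpcs://cache.example.com
    build --remote_executor=grpcs://rbe.example.com
    build --remote_download_minimal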

We've seen a few other ASF projects adopt Bazel as well:

* https://github.com/apache/rocketmq
* https://github.com/apache/brpc
* https://github.com/apache/trafficserver
* https://github.com/apache/ws-axiom

I wonder how the Kafka community feels about experimenting with Bazel and
exploring whether it can give us faster build times without compromising
the correctness of the set of targets that need to be built and tested.

Thanks,
Gaurav

[1]: https://bazel.build
[2]: https://github.com/bazel-contrib/target-determinator
[3]: https://bazel.build/remote/rbe
[4]: https://bazel.build/remote/caching

On 2023/06/05 17:47:07 Greg Harris wrote:
> Hey all,
> 
> I've been working on test flakiness recently, and I've been trying to
> come up with ways to tackle the issue top-down as well as bottom-up,
> and I'm interested to hear your thoughts on an idea.
> 
> In addition to the current full-suite runs, can we in parallel trigger
> a smaller test run which has only a relevant subset of tests? For
> example, if someone is working on one sub-module, the CI would only
> run tests in that module.
> 
> I think this would be more likely to pass than the full suite, since
> running fewer tests means fewer probabilistic failures, and it would
> improve the signal-to-noise ratio of the summary pass/fail marker on
> GitHub. It should also be quicker to execute than the full suite,
> allowing for a faster cycle time than the current setup encourages.
> 
> This would also strengthen the incentive for contributors specializing
> in a module to de-flake tests, as they are rewarded with a tangible
> improvement within their area of the project. Currently, even the
> modules with the most reliable tests see consistent CI failures caused
> by other, less reliable modules.
> 
> I believe this is possible, even if there isn't an off-the-shelf
> solution for it. We can identify the changed files via a git diff, map
> them to the modules containing those files, and then execute the tests
> for just those modules with Gradle. GitHub also permits showing
> multiple "checks" so that we can emit both the full-suite and partial
> test results.
> 
> Thanks,
> Greg
>
