Hi all,

I wanted to start a discussion about getting finer-grained test execution, focused on particular artifacts/modules. In particular, I want to gather the downsides and impossibilities, so I will make a proposal that people can disagree with easily.
Context: job_PreCommit_Java is a monolithic job that...

- takes 40-50 minutes
- runs tests from maybe a bit under 100 modules
- executes over 10k tests
- runs on any change to model/, sdks/java/, runners/, examples/java/, examples/kotlin/, release/ (the only exception is SQL)
- is pretty flaky (because it conflates so many independent test flakes, mostly from runners and IOs)

See a scan at https://scans.gradle.com/s/dnuo4o245d2fw/timeline?sort=longest

Proposal: Eliminate the monolithic job and break it into finer-grained jobs that operate on two principles:

1. A test run should be focused on validating one artifact or a specific integration of other artifacts.
2. A test run should trigger only on changes that could affect the validity of that artifact.

For example, a starting point is to separate:

- core SDK
- runner helper libs
- each runner
- each extension
- each IO

Benefits:

- changing an IO or runner would not trigger the 20 minutes of core SDK tests
- changing a runner would not trigger the long IO local integration tests
- changing the core SDK could potentially run fewer tests in presubmit; but even if it ran just as many, the results would be reported separately, with a clear flakiness signal

There are 72 build.gradle files under sdks/java/ and 30 under runners/. They don't all require a separate job, but there are enough that it is worth automating. Does anyone know what options we might have? It does not even have to be in Jenkins. We could have one "test the things" Jenkins job if the underlying tool (Gradle) could resolve what needs to be run; there is a rough sketch of that idea at the end of this mail. Caching is not sufficient in my experience.

(There are other quick-fix alternatives for shrinking this time, but I want to focus on the bigger picture.)

Kenn
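
P.S. For concreteness, here is a very rough sketch of what a single "test the things" job could do: diff against the base branch, map changed paths to Gradle projects, and run only those projects' checks. Everything in it is hypothetical, the path-to-project mapping in particular; a real version would want to derive that mapping from the Gradle project layout rather than hard-code it, and would have to account for cross-project dependencies (a core SDK change should still trigger the runners and IOs that depend on it).

#!/usr/bin/env python3
# Hypothetical sketch only: map changed files to Gradle projects and run
# just those projects' checks inside one job.
import subprocess
import sys

# Illustrative mapping from path prefixes to Gradle project paths.
# A real version would generate this from the project layout.
PATH_TO_PROJECT = {
    "sdks/java/core/": ":sdks:java:core",
    "runners/direct-java/": ":runners:direct-java",
    "runners/flink/": ":runners:flink",
    "sdks/java/io/kafka/": ":sdks:java:io:kafka",
}

def changed_files(base="origin/master"):
    """Files touched by this change, according to git."""
    out = subprocess.check_output(
        ["git", "diff", "--name-only", base], text=True)
    return [line for line in out.splitlines() if line]

def projects_for(files):
    """Gradle projects whose path prefix matches a changed file."""
    projects = set()
    for f in files:
        for prefix, project in PATH_TO_PROJECT.items():
            if f.startswith(prefix):
                projects.add(project)
    return sorted(projects)

def main():
    projects = projects_for(changed_files())
    if not projects:
        print("No mapped projects changed; nothing to test.")
        return 0
    tasks = [p + ":check" for p in projects]
    print("Running:", " ".join(tasks))
    return subprocess.call(["./gradlew"] + tasks)

if __name__ == "__main__":
    sys.exit(main())

The point is only that the dispatch logic does not have to live in Jenkins: a small script like this, or Gradle itself if it can resolve the affected projects, could do the routing inside one job.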