Hello everyone,

*TL;DR: *I have a proposal that could significantly decrease the time
needed to run our CI tests. All that at the expense of a small - I
think - effort: contributors will need to mark tests as "db tests",
enforced by our CI (in most cases that won't even be needed).

*A bit more context:*

I have started to focus on the AIP-44 work (internal API), and part of this
work is to make it possible to run a subset of our tests (related to
untrusted components and providers) in a db-isolation mode to see if we
have not forgotten anything (and to keep db-isolation mode working in the
future). Part of this task is to see how many tests we have that actually
use the DB. I had some interesting findings, and I thought I would explore
the idea of splitting our tests into DB and non-DB tests.

After some initial exploration, I believe we can get at least a 30-50%
reduction in our CI test execution time (and build test time) at the
expense of having to mark tests as "db_tests" when they need the DB.

This topic has been brought up occasionally in the past: why do we run our
tests on multiple versions of multiple databases? The problem is that
many of our tests use the DB, and we need to test those on all of them.
But it also turns out we have quite a lot of tests that do not use the DB.
This - for example - prevents us from freely using "pytest-xdist" to run
the tests in parallel; we have to use the "custom" docker-based
parallelism in CI.

So the idea is that if we could separate DB from non-DB tests, we would
only run the DB tests on multiple databases, while all the non-DB tests
would be run just once. It would only make sense, though, if we have a
significant number of non-DB tests, if we can clearly separate them, and
if it is easy in the future to keep them separated (so tests are only
marked as db_tests when they actually need the DB).

*Findings:*

I tried it out and found that:

a) We have quite a significant number of tests that are non-DB tests -
likely between 30% and 50% by my estimate, both in core and providers.
They are scattered among our code, but mostly whole "modules" are either
DB or non-DB tests.

b) It is rather easy to separate DB from non-DB tests without any
structural change to our test layout, just by using a custom
@pytest.mark.db_test marker at the test, class, or even module level. So
we can easily handle cases where a module is mostly non-DB and only a
few test cases are DB - there are some cases like that, and the custom
markers handle them nicely.
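To make b) concrete, here is a minimal sketch of what marking at the three
levels would look like (the test names are made up for illustration; only
the marker name comes from the proposal):

```python
import pytest

# Module level: declaring every test in this file a DB test.
pytestmark = pytest.mark.db_test


# Single-test level: only this test needs the DB.
@pytest.mark.db_test
def test_dag_run_is_persisted():
    ...


# Class level: all test methods in the class need the DB.
@pytest.mark.db_test
class TestSchedulerJob:
    def test_scheduling(self):
        ...
```

In practice a file would use only one of these forms; they are shown
together here just to illustrate the granularity.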

c) It should be easy to keep them separated in the future with very
little overhead. Basically, when your test starts failing in CI and tells
you "mark your test with @pytest.mark.db_test", you will have to do it -
but in most cases you won't even need to, because your test will be added
to a module that is already marked as a db_test module.

d) We can use pytest-xdist for parallel execution of non-DB tests, which
will make the test suites of non-DB tests much more straightforward -
basically `pytest tests --skip-db-tests -n auto` should do the trick. And
you can do the same locally as well.

*Impact on contributors: *

There is no impact on running the tests locally: non-DB tests run fine
when a DB is there. For regular contributors, in the vast majority of
cases nothing changes. Only occasionally (for example, when a new test
module is added) will their tests fail in CI, with very clear
instructions from the CI to mark the tests with "@pytest.mark.db_test" or
to mark the whole module as a db_test module via
`pytestmark = pytest.mark.db_test`. This will happen when their tests are
treated as non-DB tests and run as such in CI.

*Summarising:*

I think we can achieve some 30%-50% reduction in the time/CPU needed if we
implement it. Of course, it means complicating our CI builds quite a bit
again, but with those gains I think it's totally worth it.

I have a draft PR (it still needs quite some work) showing how it could
look: https://github.com/apache/airflow/pull/35160. I think in 1-2 days I
should have a ready-to-merge change with all the tests passing and with a
30%-50% reduction in the resources and time needed.

Would love your comments - is it worth it? Should I proceed with it? Any
other comments?

J.
