Hi Jarek,

> So maybe it's a good time to improve it and come up with a convention that
we can apply also to shared folders? Such change will be very little
disruptive and can be largely automated and applied in a single PR

Yeah, I think it's a good time and idea to get such a thing done.

*> from unit.amazon import s3_operator_test*
I never paid close attention to such import patterns and I do not like
them. Its very
confusing and ambiguous, and it's very hard to tell "what package is this"
to any of those
imports.

While writing the Task SDK integration tests I did keep a lot of those
things in mind and if you
see the internal imports I have, they are of the pattern:
*from task_sdk_tests.constants import AIRFLOW_ROOT_PATH *which is far less
ambiguous and
confusing.

> Note that this does not yet tackle the other "special" test types we have
-> kubernetes-tests, docker-tests, task-sdk-tests (right now being merged
by Amogh), especially the last one seem to be clashing with
`task_sdk_tests`. But I think (and this would be separate discussion) -
this also adds an opportunity (next step) to move those "special" test
types we have. Eventually we could move those tests to be "yet another test
type" under "tests":

Agreed, we can consider moving those to the right places where applicable,
like:
*task_sdk_tests/*

*├── unit/└── integration/      # Moved from task-sdk-tests/*

We can start small here and build up to fix the issues one package at a
time, and if
we can come up with a good AI prompt, we can get some work multiplexed by
the community
too.


Thanks & Regards,
Amogh Desai


On Thu, Jul 24, 2025 at 8:38 PM Jarek Potiuk <ja...@potiuk.com> wrote:

> Hello here,
>
> Since we have now started even more isolation and separation of the code
> and libraries, I think it might be a good time to maybe improve a bit the
> way our tests "package" hierarchy is done - in providers and elsewhere.
>
> This question came up when i started to add pre-commit checks after our
> "shared code" PR have been merged - in
> https://github.com/apache/airflow/pull/53697
>
> One problem with having tests in monorepo, is that if all "tests" folders
> are added to PYTHONPATH, we cannot really put tests in the "top" level of
> the tests folder (that is pretty natural whe you do not have multiple of
> "tests" folders. There are two reasons for that:
>
> * it's likely module names or package names will have the same names in
> separate distributions
> * quite often - if we add tests in a sub-package of tests directly they
> might clash with the same names - for example from stdlib or another
> package we installed. Good examples of it (and something that happened in
> the past were `tests/smtp/test_*` and `tests/kubernetes/test_*`. Depending
> on how your IDE and env was configured you could have ended up in "import
> kubernetes" not behaving as you expected.
>
> So far in providers we have the "{unit/system/integration}/<PROVIDER>/"
> convention that solved the issue. But even me as author of it, I must admit
> `from unit.amazon` when you want to cross-import between different amazon
> or different providers looks rather stupid and baffling. Even recently Ash
> commented in one PR "that must be wrong" - and yeah, it does look wrong....
>
> So maybe it's a good time to improve it and come up with a convention that
> we can apply also to shared folders? Such change will be very little
> disruptive and can be largely automated and applied in a single PR - and
> modern git and IDE integration will follow such changes nicely - so that
> you can see history of changes and it resolves most of conflicts
> automatically - so it could be done with very little disruption (only
> currently opened PRs will have to be rebased).
>
> My proposal would be to add a "parent" folder in the "tests" directory to
> indicate where the test is coming from. That might sound like a
> duplication, but I think it's a natural consequence of having code in
> monorepo, and a nice way to improve the organisation of tests. Also it has
> an immediate notion or where the imports come from.
>
> airflow-core\
>              src\
>                  airflow\
>              tests\
>                    airflow_core_tests\
>                                       unit\
>                                       system\
>                                       integration\
> airflow-task-sdk\
>                  src\
>                      airflow\
>                              sdk
>                  tests\
>                        task_sdk_tests\
>                                      # here all tests are unit so no need
> for sub-packages
>
> providers/amazon/tests\
>                        provider_tests\
>                                       unit\
>                                            amazon
>                                       integration\
>                                                  amazon
>
>
> providers/cncf/kubernetes/tests\
>                                 provider_tests\
>                                               unit\
>                                                   cncf\kubernetes\
>                                               integration\
>                                                           cncf\kubernetes\
>
>
> I also considered swapping "unit" / "provider" (provider_tests\amazon\unit)
> - but that would make it a bit more complex when we choose type of tests to
> run or examples to have and I think it's somewhat better to have the
> unit/system/integration distinction right after provider_tests because
> essentially those test types are VERY different.
>
> Again - that sounds like a lot of duplication in the path, but I think it's
> worth it (and our pre-commits already make sure that there are no typos and
> keep it in order and they can be updated to keep this new structure).
>
> The main benefit of it hat when you import (wherever) between different
> distributions - the imports will **look** better and be more obvious:
>
> from provider_tests.unit.amazon import some_amazon_test_code
> from task_sdk_tests import apis
>
> Note that this does not yet tackle the other "special" test types we have
> -> kubernetes-tests, docker-tests, task-sdk-tests (right now being merged
> by Amogh), especially the last one seem to be clashing with
> `task_sdk_tests`. But I think (and this would be separate discussion) -
> this also adds an opportunity (next step) to move those "special" test
> types we have. Eventually we could move those tests to be "yet another test
> type" under "tests":
>
> airflow_core_tests\
>                    unit
>                    system
>                    integration
>                    docker
>                    kubernetes
>
> task_sdk_tests\
>               unit
>               integration
>
> But I would approach it as the next step after we reorganize imports for
> the "regular" tests we have.
>
> WDYT? Good idea? Other proposals?
>
> J.
>

Reply via email to