Hi Jarek, > So maybe it's a good time to improve it and come up with a convention that we can apply also to shared folders? Such change will be very little disruptive and can be largely automated and applied in a single PR
Yeah, I think it's a good time and idea to get such a thing done. *> from unit.amazon import s3_operator_test* I never paid close attention to such import patterns and I do not like them. Its very confusing and ambiguous, and it's very hard to tell "what package is this" to any of those imports. While writing the Task SDK integration tests I did keep a lot of those things in mind and if you see the internal imports I have, they are of the pattern: *from task_sdk_tests.constants import AIRFLOW_ROOT_PATH *which is far less ambiguous and confusing. > Note that this does not yet tackle the other "special" test types we have -> kubernetes-tests, docker-tests, task-sdk-tests (right now being merged by Amogh), especially the last one seem to be clashing with `task_sdk_tests`. But I think (and this would be separate discussion) - this also adds an opportunity (next step) to move those "special" test types we have. Eventually we could move those tests to be "yet another test type" under "tests": Agreed, we can consider moving those to the right places where applicable, like: *task_sdk_tests/* *├── unit/└── integration/ # Moved from task-sdk-tests/* We can start small here and build up to fix the issues one package at a time, and if we can come up with a good AI prompt, we can get some work multiplexed by the community too. Thanks & Regards, Amogh Desai On Thu, Jul 24, 2025 at 8:38 PM Jarek Potiuk <ja...@potiuk.com> wrote: > Hello here, > > Since we have now started even more isolation and separation of the code > and libraries, I think it might be a good time to maybe improve a bit the > way our tests "package" hierarchy is done - in providers and elsewhere. > > This question came up when i started to add pre-commit checks after our > "shared code" PR have been merged - in > https://github.com/apache/airflow/pull/53697 > > One problem with having tests in monorepo, is that if all "tests" folders > are added to PYTHONPATH, we cannot really put tests in the "top" level of > the tests folder (that is pretty natural whe you do not have multiple of > "tests" folders. There are two reasons for that: > > * it's likely module names or package names will have the same names in > separate distributions > * quite often - if we add tests in a sub-package of tests directly they > might clash with the same names - for example from stdlib or another > package we installed. Good examples of it (and something that happened in > the past were `tests/smtp/test_*` and `tests/kubernetes/test_*`. Depending > on how your IDE and env was configured you could have ended up in "import > kubernetes" not behaving as you expected. > > So far in providers we have the "{unit/system/integration}/<PROVIDER>/" > convention that solved the issue. But even me as author of it, I must admit > `from unit.amazon` when you want to cross-import between different amazon > or different providers looks rather stupid and baffling. Even recently Ash > commented in one PR "that must be wrong" - and yeah, it does look wrong.... > > So maybe it's a good time to improve it and come up with a convention that > we can apply also to shared folders? Such change will be very little > disruptive and can be largely automated and applied in a single PR - and > modern git and IDE integration will follow such changes nicely - so that > you can see history of changes and it resolves most of conflicts > automatically - so it could be done with very little disruption (only > currently opened PRs will have to be rebased). > > My proposal would be to add a "parent" folder in the "tests" directory to > indicate where the test is coming from. That might sound like a > duplication, but I think it's a natural consequence of having code in > monorepo, and a nice way to improve the organisation of tests. Also it has > an immediate notion or where the imports come from. > > airflow-core\ > src\ > airflow\ > tests\ > airflow_core_tests\ > unit\ > system\ > integration\ > airflow-task-sdk\ > src\ > airflow\ > sdk > tests\ > task_sdk_tests\ > # here all tests are unit so no need > for sub-packages > > providers/amazon/tests\ > provider_tests\ > unit\ > amazon > integration\ > amazon > > > providers/cncf/kubernetes/tests\ > provider_tests\ > unit\ > cncf\kubernetes\ > integration\ > cncf\kubernetes\ > > > I also considered swapping "unit" / "provider" (provider_tests\amazon\unit) > - but that would make it a bit more complex when we choose type of tests to > run or examples to have and I think it's somewhat better to have the > unit/system/integration distinction right after provider_tests because > essentially those test types are VERY different. > > Again - that sounds like a lot of duplication in the path, but I think it's > worth it (and our pre-commits already make sure that there are no typos and > keep it in order and they can be updated to keep this new structure). > > The main benefit of it hat when you import (wherever) between different > distributions - the imports will **look** better and be more obvious: > > from provider_tests.unit.amazon import some_amazon_test_code > from task_sdk_tests import apis > > Note that this does not yet tackle the other "special" test types we have > -> kubernetes-tests, docker-tests, task-sdk-tests (right now being merged > by Amogh), especially the last one seem to be clashing with > `task_sdk_tests`. But I think (and this would be separate discussion) - > this also adds an opportunity (next step) to move those "special" test > types we have. Eventually we could move those tests to be "yet another test > type" under "tests": > > airflow_core_tests\ > unit > system > integration > docker > kubernetes > > task_sdk_tests\ > unit > integration > > But I would approach it as the next step after we reorganize imports for > the "regular" tests we have. > > WDYT? Good idea? Other proposals? > > J. >