I will try it today on the train while getting back from my hometown ;) On Tue, Jul 29, 2025 at 7:39 AM Amogh Desai <amoghde...@apache.org> wrote:
> Nice to hear that opinion Ash, it's an interesting angle. > > If someone works on a POC for that, would be interested to know how it > spans out. > > Thanks & Regards, > Amogh Desai > > > On Mon, Jul 28, 2025 at 7:23 PM Ash Berlin-Taylor <a...@apache.org> wrote: > > > Yeah, I wasn’t sure if it would work for us. Let us know how it goes. > > > > > On 28 Jul 2025, at 14:12, Jarek Potiuk <ja...@potiuk.com> wrote: > > > > > > That is an interesting alternative and I quite like it. > > > > > > We tried to use importlib before and we failed (but we did not try > > "hard") > > > - Andrey tried it one day and it generated a lot of errors - but we've > > > changed a LOT since the (including separating of common test classes to > > > "devel_common". > > > > > > It might be worth doing a POC for that - this adds very nice isolation > > > where all tests are generally fully "standalone". I might attempt to > > make a > > > POC on that. > > > > > > I would not asy "only fixtures" is the right approach - but I think it > > > should generally apply to trying to import common code between "tests". > > > There is a bit of imports that are pretty "valid" in some way and > easier > > to > > > use than fixtures I think - i.e. "devel-common" classes, but they are > > not > > > "tests" classes - they are regular "src" classes in "tests_common" > > package > > > in "apache-airflow-devel-common" distribution, so this is pretty good > > idea > > > to import them. > > > > > > But I do agree that importing one test from other classes is "fishy" > and > > if > > > importlib might help us with this, then it is an easy way to enforce > it. > > I > > > will try it first. That's a very good idea :) > > > > > > J. > > > > > > On Mon, Jul 28, 2025 at 10:24 AM Ash Berlin-Taylor <a...@apache.org> > > wrote: > > > > > >> Did you look at changing the pytest import option instead? > > >> https://docs.pytest.org/en/stable/explanation/pythonpath.html - and > > >> specifically the `importlib` option > > >> > > >> To quote their documentation: > > >> > > >>> importlib: this mode uses more fine control mechanisms provided by > > >> importlib < > > >> https://docs.python.org/3/library/importlib.html#module-importlib> to > > >> import test modules, without changing sys.path < > > >> https://docs.python.org/3/library/sys.html#sys.path>. > > >>> > > >>> Advantages of this mode: > > >>> > > >>> pytest will not change sys.path < > > >> https://docs.python.org/3/library/sys.html#sys.path> at all. > > >>> > > >>> Test module names do not need to be unique – pytest will generate a > > >> unique name automatically based on the rootdir. > > >>> > > >>> Disadvantages: > > >>> > > >>> Test modules can’t import each other. > > >>> > > >>> Testing utility modules in the tests directories (for example a > > >> tests.helpers module containing test-related functions/classes) are > not > > >> importable. The recommendation in this case it to place testing > utility > > >> modules together with the application/library code, for example > > >> app.testing.helpers. > > >>> > > >>> Important: by “test utility modules”, we mean functions/classes which > > >> are imported by other tests directly; this does not include fixtures, > > which > > >> should be placed in conftest.py files, along with the test modules, > and > > are > > >> discovered automatically by pytest. > > >>> > > >> > > >> I think with pytest fixtures we almost shouldn’t need to import tests > > >> directly at all, so if they are a package or not should not be an > issue. > > >> Doing this (or working towards) it means we don’t need to worry about > > the > > >> name at all, and we already have the `tests_common` etc as a place to > > put > > >> helpers. > > >> > > >> That way we don’t need any special rules or naming convention or > > otherwise > > >> “duplicate” folder names. > > >> > > >> I.e. instead of adding a name to make things unique, we forbid imports > > of > > >> helper functions and make them use `pytest` fixtures instead. > > >> > > >> Thoughts? > > >> > > >>> On 27 Jul 2025, at 17:55, Jens Scheffler <j_scheff...@gmx.de.INVALID > > > > >> wrote: > > >>> > > >>> Hi Jarek, > > >>> > > >>> I like the general idea of the structure as you propose. > > >>> > > >>> I would not "swap" as you described afterwards but propose to use the > > >> first proposal you made. > > >>> > > >>> For docker and k8s tests I think it would be better to keep them > > outside > > >> of "airflow-core" as they have a rather integrative/system scope and > are > > >> only indirectly related to core. For the moment I'd keep then and once > > we > > >> re-structure the packages still we can align later. > > >>> > > >>> Jens > > >>> > > >>> On 25.07.25 11:57, Amogh Desai wrote: > > >>>> Hi Jarek, > > >>>> > > >>>>> So maybe it's a good time to improve it and come up with a > convention > > >> that > > >>>> we can apply also to shared folders? Such change will be very little > > >>>> disruptive and can be largely automated and applied in a single PR > > >>>> > > >>>> Yeah, I think it's a good time and idea to get such a thing done. > > >>>> > > >>>> *> from unit.amazon import s3_operator_test* > > >>>> I never paid close attention to such import patterns and I do not > like > > >>>> them. Its very > > >>>> confusing and ambiguous, and it's very hard to tell "what package is > > >> this" > > >>>> to any of those > > >>>> imports. > > >>>> > > >>>> While writing the Task SDK integration tests I did keep a lot of > those > > >>>> things in mind and if you > > >>>> see the internal imports I have, they are of the pattern: > > >>>> *from task_sdk_tests.constants import AIRFLOW_ROOT_PATH *which is > far > > >> less > > >>>> ambiguous and > > >>>> confusing. > > >>>> > > >>>>> Note that this does not yet tackle the other "special" test types > we > > >> have > > >>>> -> kubernetes-tests, docker-tests, task-sdk-tests (right now being > > >> merged > > >>>> by Amogh), especially the last one seem to be clashing with > > >>>> `task_sdk_tests`. But I think (and this would be separate > discussion) > > - > > >>>> this also adds an opportunity (next step) to move those "special" > test > > >>>> types we have. Eventually we could move those tests to be "yet > another > > >> test > > >>>> type" under "tests": > > >>>> > > >>>> Agreed, we can consider moving those to the right places where > > >> applicable, > > >>>> like: > > >>>> *task_sdk_tests/* > > >>>> > > >>>> *├── unit/└── integration/ # Moved from task-sdk-tests/* > > >>>> > > >>>> We can start small here and build up to fix the issues one package > at > > a > > >>>> time, and if > > >>>> we can come up with a good AI prompt, we can get some work > multiplexed > > >> by > > >>>> the community > > >>>> too. > > >>>> > > >>>> > > >>>> Thanks & Regards, > > >>>> Amogh Desai > > >>>> > > >>>> > > >>>> On Thu, Jul 24, 2025 at 8:38 PM Jarek Potiuk <ja...@potiuk.com> > > wrote: > > >>>> > > >>>>> Hello here, > > >>>>> > > >>>>> Since we have now started even more isolation and separation of the > > >> code > > >>>>> and libraries, I think it might be a good time to maybe improve a > bit > > >> the > > >>>>> way our tests "package" hierarchy is done - in providers and > > elsewhere. > > >>>>> > > >>>>> This question came up when i started to add pre-commit checks after > > our > > >>>>> "shared code" PR have been merged - in > > >>>>> https://github.com/apache/airflow/pull/53697 > > >>>>> > > >>>>> One problem with having tests in monorepo, is that if all "tests" > > >> folders > > >>>>> are added to PYTHONPATH, we cannot really put tests in the "top" > > level > > >> of > > >>>>> the tests folder (that is pretty natural whe you do not have > multiple > > >> of > > >>>>> "tests" folders. There are two reasons for that: > > >>>>> > > >>>>> * it's likely module names or package names will have the same > names > > in > > >>>>> separate distributions > > >>>>> * quite often - if we add tests in a sub-package of tests directly > > they > > >>>>> might clash with the same names - for example from stdlib or > another > > >>>>> package we installed. Good examples of it (and something that > > happened > > >> in > > >>>>> the past were `tests/smtp/test_*` and `tests/kubernetes/test_*`. > > >> Depending > > >>>>> on how your IDE and env was configured you could have ended up in > > >> "import > > >>>>> kubernetes" not behaving as you expected. > > >>>>> > > >>>>> So far in providers we have the > > "{unit/system/integration}/<PROVIDER>/" > > >>>>> convention that solved the issue. But even me as author of it, I > must > > >> admit > > >>>>> `from unit.amazon` when you want to cross-import between different > > >> amazon > > >>>>> or different providers looks rather stupid and baffling. Even > > recently > > >> Ash > > >>>>> commented in one PR "that must be wrong" - and yeah, it does look > > >> wrong.... > > >>>>> > > >>>>> So maybe it's a good time to improve it and come up with a > convention > > >> that > > >>>>> we can apply also to shared folders? Such change will be very > little > > >>>>> disruptive and can be largely automated and applied in a single PR > - > > >> and > > >>>>> modern git and IDE integration will follow such changes nicely - so > > >> that > > >>>>> you can see history of changes and it resolves most of conflicts > > >>>>> automatically - so it could be done with very little disruption > (only > > >>>>> currently opened PRs will have to be rebased). > > >>>>> > > >>>>> My proposal would be to add a "parent" folder in the "tests" > > directory > > >> to > > >>>>> indicate where the test is coming from. That might sound like a > > >>>>> duplication, but I think it's a natural consequence of having code > in > > >>>>> monorepo, and a nice way to improve the organisation of tests. Also > > it > > >> has > > >>>>> an immediate notion or where the imports come from. > > >>>>> > > >>>>> airflow-core\ > > >>>>> src\ > > >>>>> airflow\ > > >>>>> tests\ > > >>>>> airflow_core_tests\ > > >>>>> unit\ > > >>>>> system\ > > >>>>> integration\ > > >>>>> airflow-task-sdk\ > > >>>>> src\ > > >>>>> airflow\ > > >>>>> sdk > > >>>>> tests\ > > >>>>> task_sdk_tests\ > > >>>>> # here all tests are unit so no > > >> need > > >>>>> for sub-packages > > >>>>> > > >>>>> providers/amazon/tests\ > > >>>>> provider_tests\ > > >>>>> unit\ > > >>>>> amazon > > >>>>> integration\ > > >>>>> amazon > > >>>>> > > >>>>> > > >>>>> providers/cncf/kubernetes/tests\ > > >>>>> provider_tests\ > > >>>>> unit\ > > >>>>> cncf\kubernetes\ > > >>>>> integration\ > > >>>>> > > >> cncf\kubernetes\ > > >>>>> > > >>>>> > > >>>>> I also considered swapping "unit" / "provider" > > >> (provider_tests\amazon\unit) > > >>>>> - but that would make it a bit more complex when we choose type of > > >> tests to > > >>>>> run or examples to have and I think it's somewhat better to have > the > > >>>>> unit/system/integration distinction right after provider_tests > > because > > >>>>> essentially those test types are VERY different. > > >>>>> > > >>>>> Again - that sounds like a lot of duplication in the path, but I > > think > > >> it's > > >>>>> worth it (and our pre-commits already make sure that there are no > > >> typos and > > >>>>> keep it in order and they can be updated to keep this new > structure). > > >>>>> > > >>>>> The main benefit of it hat when you import (wherever) between > > different > > >>>>> distributions - the imports will **look** better and be more > obvious: > > >>>>> > > >>>>> from provider_tests.unit.amazon import some_amazon_test_code > > >>>>> from task_sdk_tests import apis > > >>>>> > > >>>>> Note that this does not yet tackle the other "special" test types > we > > >> have > > >>>>> -> kubernetes-tests, docker-tests, task-sdk-tests (right now being > > >> merged > > >>>>> by Amogh), especially the last one seem to be clashing with > > >>>>> `task_sdk_tests`. But I think (and this would be separate > > discussion) - > > >>>>> this also adds an opportunity (next step) to move those "special" > > test > > >>>>> types we have. Eventually we could move those tests to be "yet > > another > > >> test > > >>>>> type" under "tests": > > >>>>> > > >>>>> airflow_core_tests\ > > >>>>> unit > > >>>>> system > > >>>>> integration > > >>>>> docker > > >>>>> kubernetes > > >>>>> > > >>>>> task_sdk_tests\ > > >>>>> unit > > >>>>> integration > > >>>>> > > >>>>> But I would approach it as the next step after we reorganize > imports > > >> for > > >>>>> the "regular" tests we have. > > >>>>> > > >>>>> WDYT? Good idea? Other proposals? > > >>>>> > > >>>>> J. > > >>>>> > > >>> > > >>> --------------------------------------------------------------------- > > >>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org > > >>> For additional commands, e-mail: dev-h...@airflow.apache.org > > >>> > > >> > > >> > > > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org > > For additional commands, e-mail: dev-h...@airflow.apache.org > > > > >