I will try it today on the train while getting back from my hometown ;)

On Tue, Jul 29, 2025 at 7:39 AM Amogh Desai <amoghde...@apache.org> wrote:

> Nice to hear that opinion Ash, it's an interesting angle.
>
> If someone works on a POC for that, would be interested to know how it
> spans out.
>
> Thanks & Regards,
> Amogh Desai
>
>
> On Mon, Jul 28, 2025 at 7:23 PM Ash Berlin-Taylor <a...@apache.org> wrote:
>
> > Yeah, I wasn’t sure if it would work for us. Let us know how it goes.
> >
> > > On 28 Jul 2025, at 14:12, Jarek Potiuk <ja...@potiuk.com> wrote:
> > >
> > > That is an interesting alternative and I quite like it.
> > >
> > > We tried to use importlib before and we failed (but we did not try
> > "hard")
> > > - Andrey tried it one day and it generated a lot of errors - but we've
> > > changed a LOT since the (including separating of common test classes to
> > > "devel_common".
> > >
> > > It might be worth doing a POC for that - this adds very nice isolation
> > > where all tests are generally fully "standalone". I might attempt to
> > make a
> > > POC on that.
> > >
> > > I would not asy "only fixtures" is the right approach - but I think it
> > > should generally apply to trying to import common code between "tests".
> > > There is a bit of imports that are pretty "valid" in some way and
> easier
> > to
> > > use than fixtures I think  - i.e. "devel-common" classes, but they are
> > not
> > > "tests" classes - they are regular "src" classes in "tests_common"
> > package
> > > in "apache-airflow-devel-common" distribution, so this is pretty good
> > idea
> > > to import them.
> > >
> > > But I do agree that importing one test from other classes is "fishy"
> and
> > if
> > > importlib might help us with this, then it is an easy way to enforce
> it.
> > I
> > > will try it first. That's a very good idea :)
> > >
> > > J.
> > >
> > > On Mon, Jul 28, 2025 at 10:24 AM Ash Berlin-Taylor <a...@apache.org>
> > wrote:
> > >
> > >> Did you look at changing the pytest import option instead?
> > >> https://docs.pytest.org/en/stable/explanation/pythonpath.html - and
> > >> specifically the `importlib` option
> > >>
> > >> To quote their documentation:
> > >>
> > >>> importlib: this mode uses more fine control mechanisms provided by
> > >> importlib <
> > >> https://docs.python.org/3/library/importlib.html#module-importlib> to
> > >> import test modules, without changing sys.path <
> > >> https://docs.python.org/3/library/sys.html#sys.path>.
> > >>>
> > >>> Advantages of this mode:
> > >>>
> > >>> pytest will not change sys.path <
> > >> https://docs.python.org/3/library/sys.html#sys.path> at all.
> > >>>
> > >>> Test module names do not need to be unique – pytest will generate a
> > >> unique name automatically based on the rootdir.
> > >>>
> > >>> Disadvantages:
> > >>>
> > >>> Test modules can’t import each other.
> > >>>
> > >>> Testing utility modules in the tests directories (for example a
> > >> tests.helpers module containing test-related functions/classes) are
> not
> > >> importable. The recommendation in this case it to place testing
> utility
> > >> modules together with the application/library code, for example
> > >> app.testing.helpers.
> > >>>
> > >>> Important: by “test utility modules”, we mean functions/classes which
> > >> are imported by other tests directly; this does not include fixtures,
> > which
> > >> should be placed in conftest.py files, along with the test modules,
> and
> > are
> > >> discovered automatically by pytest.
> > >>>
> > >>
> > >> I think with pytest fixtures we almost shouldn’t need to import tests
> > >> directly at all, so if they are a package or not should not be an
> issue.
> > >> Doing this (or working towards) it means we don’t need to worry about
> > the
> > >> name at all, and we already have the `tests_common` etc as a place to
> > put
> > >> helpers.
> > >>
> > >> That way we don’t need any special rules or naming convention or
> > otherwise
> > >> “duplicate” folder names.
> > >>
> > >> I.e. instead of adding a name to make things unique, we forbid imports
> > of
> > >> helper functions and make them use `pytest` fixtures instead.
> > >>
> > >> Thoughts?
> > >>
> > >>> On 27 Jul 2025, at 17:55, Jens Scheffler <j_scheff...@gmx.de.INVALID
> >
> > >> wrote:
> > >>>
> > >>> Hi Jarek,
> > >>>
> > >>> I like the general idea of the structure as you propose.
> > >>>
> > >>> I would not "swap" as you described afterwards but propose to use the
> > >> first proposal you made.
> > >>>
> > >>> For docker and k8s tests I think it would be better to keep them
> > outside
> > >> of "airflow-core" as they have a rather integrative/system scope and
> are
> > >> only indirectly related to core. For the moment I'd keep then and once
> > we
> > >> re-structure the packages still we can align later.
> > >>>
> > >>> Jens
> > >>>
> > >>> On 25.07.25 11:57, Amogh Desai wrote:
> > >>>> Hi Jarek,
> > >>>>
> > >>>>> So maybe it's a good time to improve it and come up with a
> convention
> > >> that
> > >>>> we can apply also to shared folders? Such change will be very little
> > >>>> disruptive and can be largely automated and applied in a single PR
> > >>>>
> > >>>> Yeah, I think it's a good time and idea to get such a thing done.
> > >>>>
> > >>>> *> from unit.amazon import s3_operator_test*
> > >>>> I never paid close attention to such import patterns and I do not
> like
> > >>>> them. Its very
> > >>>> confusing and ambiguous, and it's very hard to tell "what package is
> > >> this"
> > >>>> to any of those
> > >>>> imports.
> > >>>>
> > >>>> While writing the Task SDK integration tests I did keep a lot of
> those
> > >>>> things in mind and if you
> > >>>> see the internal imports I have, they are of the pattern:
> > >>>> *from task_sdk_tests.constants import AIRFLOW_ROOT_PATH *which is
> far
> > >> less
> > >>>> ambiguous and
> > >>>> confusing.
> > >>>>
> > >>>>> Note that this does not yet tackle the other "special" test types
> we
> > >> have
> > >>>> -> kubernetes-tests, docker-tests, task-sdk-tests (right now being
> > >> merged
> > >>>> by Amogh), especially the last one seem to be clashing with
> > >>>> `task_sdk_tests`. But I think (and this would be separate
> discussion)
> > -
> > >>>> this also adds an opportunity (next step) to move those "special"
> test
> > >>>> types we have. Eventually we could move those tests to be "yet
> another
> > >> test
> > >>>> type" under "tests":
> > >>>>
> > >>>> Agreed, we can consider moving those to the right places where
> > >> applicable,
> > >>>> like:
> > >>>> *task_sdk_tests/*
> > >>>>
> > >>>> *├── unit/└── integration/      # Moved from task-sdk-tests/*
> > >>>>
> > >>>> We can start small here and build up to fix the issues one package
> at
> > a
> > >>>> time, and if
> > >>>> we can come up with a good AI prompt, we can get some work
> multiplexed
> > >> by
> > >>>> the community
> > >>>> too.
> > >>>>
> > >>>>
> > >>>> Thanks & Regards,
> > >>>> Amogh Desai
> > >>>>
> > >>>>
> > >>>> On Thu, Jul 24, 2025 at 8:38 PM Jarek Potiuk <ja...@potiuk.com>
> > wrote:
> > >>>>
> > >>>>> Hello here,
> > >>>>>
> > >>>>> Since we have now started even more isolation and separation of the
> > >> code
> > >>>>> and libraries, I think it might be a good time to maybe improve a
> bit
> > >> the
> > >>>>> way our tests "package" hierarchy is done - in providers and
> > elsewhere.
> > >>>>>
> > >>>>> This question came up when i started to add pre-commit checks after
> > our
> > >>>>> "shared code" PR have been merged - in
> > >>>>> https://github.com/apache/airflow/pull/53697
> > >>>>>
> > >>>>> One problem with having tests in monorepo, is that if all "tests"
> > >> folders
> > >>>>> are added to PYTHONPATH, we cannot really put tests in the "top"
> > level
> > >> of
> > >>>>> the tests folder (that is pretty natural whe you do not have
> multiple
> > >> of
> > >>>>> "tests" folders. There are two reasons for that:
> > >>>>>
> > >>>>> * it's likely module names or package names will have the same
> names
> > in
> > >>>>> separate distributions
> > >>>>> * quite often - if we add tests in a sub-package of tests directly
> > they
> > >>>>> might clash with the same names - for example from stdlib or
> another
> > >>>>> package we installed. Good examples of it (and something that
> > happened
> > >> in
> > >>>>> the past were `tests/smtp/test_*` and `tests/kubernetes/test_*`.
> > >> Depending
> > >>>>> on how your IDE and env was configured you could have ended up in
> > >> "import
> > >>>>> kubernetes" not behaving as you expected.
> > >>>>>
> > >>>>> So far in providers we have the
> > "{unit/system/integration}/<PROVIDER>/"
> > >>>>> convention that solved the issue. But even me as author of it, I
> must
> > >> admit
> > >>>>> `from unit.amazon` when you want to cross-import between different
> > >> amazon
> > >>>>> or different providers looks rather stupid and baffling. Even
> > recently
> > >> Ash
> > >>>>> commented in one PR "that must be wrong" - and yeah, it does look
> > >> wrong....
> > >>>>>
> > >>>>> So maybe it's a good time to improve it and come up with a
> convention
> > >> that
> > >>>>> we can apply also to shared folders? Such change will be very
> little
> > >>>>> disruptive and can be largely automated and applied in a single PR
> -
> > >> and
> > >>>>> modern git and IDE integration will follow such changes nicely - so
> > >> that
> > >>>>> you can see history of changes and it resolves most of conflicts
> > >>>>> automatically - so it could be done with very little disruption
> (only
> > >>>>> currently opened PRs will have to be rebased).
> > >>>>>
> > >>>>> My proposal would be to add a "parent" folder in the "tests"
> > directory
> > >> to
> > >>>>> indicate where the test is coming from. That might sound like a
> > >>>>> duplication, but I think it's a natural consequence of having code
> in
> > >>>>> monorepo, and a nice way to improve the organisation of tests. Also
> > it
> > >> has
> > >>>>> an immediate notion or where the imports come from.
> > >>>>>
> > >>>>> airflow-core\
> > >>>>>             src\
> > >>>>>                 airflow\
> > >>>>>             tests\
> > >>>>>                   airflow_core_tests\
> > >>>>>                                      unit\
> > >>>>>                                      system\
> > >>>>>                                      integration\
> > >>>>> airflow-task-sdk\
> > >>>>>                 src\
> > >>>>>                     airflow\
> > >>>>>                             sdk
> > >>>>>                 tests\
> > >>>>>                       task_sdk_tests\
> > >>>>>                                     # here all tests are unit so no
> > >> need
> > >>>>> for sub-packages
> > >>>>>
> > >>>>> providers/amazon/tests\
> > >>>>>                       provider_tests\
> > >>>>>                                      unit\
> > >>>>>                                           amazon
> > >>>>>                                      integration\
> > >>>>>                                                 amazon
> > >>>>>
> > >>>>>
> > >>>>> providers/cncf/kubernetes/tests\
> > >>>>>                                provider_tests\
> > >>>>>                                              unit\
> > >>>>>                                                  cncf\kubernetes\
> > >>>>>                                              integration\
> > >>>>>
> > >> cncf\kubernetes\
> > >>>>>
> > >>>>>
> > >>>>> I also considered swapping "unit" / "provider"
> > >> (provider_tests\amazon\unit)
> > >>>>> - but that would make it a bit more complex when we choose type of
> > >> tests to
> > >>>>> run or examples to have and I think it's somewhat better to have
> the
> > >>>>> unit/system/integration distinction right after provider_tests
> > because
> > >>>>> essentially those test types are VERY different.
> > >>>>>
> > >>>>> Again - that sounds like a lot of duplication in the path, but I
> > think
> > >> it's
> > >>>>> worth it (and our pre-commits already make sure that there are no
> > >> typos and
> > >>>>> keep it in order and they can be updated to keep this new
> structure).
> > >>>>>
> > >>>>> The main benefit of it hat when you import (wherever) between
> > different
> > >>>>> distributions - the imports will **look** better and be more
> obvious:
> > >>>>>
> > >>>>> from provider_tests.unit.amazon import some_amazon_test_code
> > >>>>> from task_sdk_tests import apis
> > >>>>>
> > >>>>> Note that this does not yet tackle the other "special" test types
> we
> > >> have
> > >>>>> -> kubernetes-tests, docker-tests, task-sdk-tests (right now being
> > >> merged
> > >>>>> by Amogh), especially the last one seem to be clashing with
> > >>>>> `task_sdk_tests`. But I think (and this would be separate
> > discussion) -
> > >>>>> this also adds an opportunity (next step) to move those "special"
> > test
> > >>>>> types we have. Eventually we could move those tests to be "yet
> > another
> > >> test
> > >>>>> type" under "tests":
> > >>>>>
> > >>>>> airflow_core_tests\
> > >>>>>                   unit
> > >>>>>                   system
> > >>>>>                   integration
> > >>>>>                   docker
> > >>>>>                   kubernetes
> > >>>>>
> > >>>>> task_sdk_tests\
> > >>>>>              unit
> > >>>>>              integration
> > >>>>>
> > >>>>> But I would approach it as the next step after we reorganize
> imports
> > >> for
> > >>>>> the "regular" tests we have.
> > >>>>>
> > >>>>> WDYT? Good idea? Other proposals?
> > >>>>>
> > >>>>> J.
> > >>>>>
> > >>>
> > >>> ---------------------------------------------------------------------
> > >>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > >>> For additional commands, e-mail: dev-h...@airflow.apache.org
> > >>>
> > >>
> > >>
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > For additional commands, e-mail: dev-h...@airflow.apache.org
> >
> >
>

Reply via email to