Huge +1 (I know its been few months :) ), I am in favor of the proposed structure .
On Sat, 17 Dec 2022 at 13:05, Andrey Anshin <[email protected]> wrote: > +1 > > I just want to add additional thoughts about our tests. Right now a lot of > stuff (fixtures, context managers, etc) sits in tests/conftest.py and > tests/test_utils so if we just move providers we need to do something > with this. > > I would suggest create separate package into > {project_root}/dev/some_awesome_name and design it as pytest plugin ( > https://docs.pytest.org/en/7.1.x/how-to/writing_plugins.html#writing-your-own-plugin) > and move generic stuff related to tests into this package > > If we correctly register all our fixtures we do not need to change > anything related to them in our code. > In case of utilities we need to change imports from from tests.test_utils > import awesome_stuff to from some_awesome_name.utils import awesome_stuff > > > ---- > Best Wishes > *Andrey Anshin* > > > > On Wed, 14 Dec 2022 at 15:46, Jarek Potiuk <[email protected]> wrote: > >> Hey Denis, >> >> > tests/providers/airbyte/hooks >> > tests/providers/airbyte/operators >> > tests/providers/airbyte/sensors >> > tests/providers/airbyte/system >> > tests/providers/airbyte/transfers >> >> Yes. It could be done like that - but it would require merging some >> packages (there are for example already "system" packages for amazon and >> google to keep "utils" there. >> >> I might attempt to see if this is possible to merge, but this might also >> be problematic. I want to be able to get the automation in place to make >> the migration fully automated - so that there is no manual interventions. >> This will allow people to review the changes, and we can test development >> environment integration and then re-apply the changes automatically while >> rebasing. >> >> This change is disruptive enough and if we try to "manipulate" with the >> files moving around too much, it might be too problematic to keep PR merged >> while reviewing it and making sure the CI is fixed and development >> environment works - this will take weeks to get right or even months I am >> afraid, with some experimentation and trying different approaches. As you >> mentioned - this is an early POC and there is still plenty of work to do on >> that one. If we find that we cannot do it in this big change, then we can >> always split it to a separate steps. While we will have all the providers >> still in the same repository, we can still make "moving system tests" a >> separate follow-up PR which will be far less disruptive. >> >> Hey Everyone. >> >> I also think that one deserves an AIP - just the lenght of my email >> indicates it. I would turn my long email into one if you think it is a good >> idea, this way we can keep it up-to-date and explain whatever we can come >> up with. I have a feeling we will get a few "u-turns" while we try >> different aspects of such separation. >> >> WDYT? >> >> J. >> >> >> >> On Tue, Dec 13, 2022 at 3:54 AM Ferruzzi, Dennis >> <[email protected]> wrote: >> >>> > Somehow what you wanted to add here was lost so I am not sure what >>> the proposal is :). >>> >>> Gah, I knew I should have explained it as well. I was basically >>> suggesting move the system tests for a given provider into that provider's >>> tests module, alongside hooks, operators, sensors, etc: >>> >>> tests/providers/airbyte/hooks >>> tests/providers/airbyte/operators >>> tests/providers/airbyte/sensors >>> tests/providers/airbyte/system >>> tests/providers/airbyte/transfers >>> >>> >>> >>> >>> ------------------------------ >>> *From:* Jarek Potiuk <[email protected]> >>> *Sent:* Monday, December 12, 2022 5:48 PM >>> *To:* [email protected] >>> *Subject:* RE: [EXTERNAL][PROPOSAL] (Internal) move of provider >>> packages to isolated "providers" sub-folders >>> >>> >>> *CAUTION*: This email originated from outside of the organization. Do >>> not click links or open attachments unless you can confirm the sender and >>> know the content is safe. >>> >>> And just to clarify - there is a little too much redundancy in your >>> example: >>> >>> `airflow/providers/airbyte/hooks/airbyte.py` becomes `airflow >>> /providers/airbyte/src/airflow/providers/airbyte/hooks/airbyte.py` >>> >>> The leading "airflow" is not there in the target path: >>> >>> `airflow/providers/airbyte/hooks/airbyte.py` becomes >>> `providers/airbyte/src/airflow/providers/airbyte/hooks/airbyte.py` >>> >>> And the redundancy comes from the fact that we have to somehow name the >>> directory where the entire provider is moved. Let me rewrite it differently: >>> >>> providers/airbyte <- This is where the airbyte provider is moved >>> | >>> src/airflow/providers/airbyte/hooks/airbyte.py <- this >>> is actual path "inside" the provider >>> >>> >>> So there is far less redundancy than you think. >>> >>> >>> On Tue, Dec 13, 2022 at 2:40 AM Jarek Potiuk <[email protected]> wrote: >>> >>>> >>>>> I have two immediate thoughts on it. First, can this be used as an >>>>> opportunity to reorganize the system tests tree into the provider's test >>>>> tree? Instead of having `tests/system/providers/{foo}`, maybe we can move >>>>> those system tests alongside the other provider tests like this: >>>>> >>>> >>>> Somehow what you wanted to add here was lost so I am not sure what the >>>> proposal is :). >>>> >>>> >>>>> My other thought is that I think I am missing something here as far as >>>>> the new organization goes. I haven't spent the time that you have spent >>>>> looking at how to make this work, and I also realize this is early PoC >>>>> discussion so maybe this isn't the end target, but the new paths are very >>>>> long and redundant. Is that for automation reasons? For example in the >>>>> image you shared: >>>>> >>>> >>>> >>>>> `airflow/providers/airbyte/hooks/airbyte.py` becomes >>>>> `airflow/providers/airbyte/src/airflow/providers/airbyte/hooks/airbyte.py` >>>>> `tests/providers/airbyte/hooks/test_airbyte.py` becomes >>>>> `airflow/providers/airbyte/ >>>>> tests/airflow/providers/airbyte/hooks/test_airbyte.py. >>>>> >>>> >>>> For sources, I am quite sure we cannot do it - as this would end up >>>> with a non-standard package and we would have to make some non-standard >>>> manipulations with import paths and it would confuse multiple IDEs like >>>> crazy. >>>> Our providers are in "airflow.providers.<provider>" package. The "src" >>>> is the root of sources and we need to have "airflow"/"providers" as the >>>> package in. >>>> >>>> Also this choice was largely based on the official Python packaging >>>> tutorial: >>>> https://packaging.python.org/en/latest/tutorials/packaging-projects/ >>>> (this is where I took the "src" from - it's not been very common in Python >>>> Projects - including Airflow - but it is now recommended in multiple places >>>> including official Pypa tutorial: >>>> >>>> packaging_tutorial/ >>>> └── src/ >>>> └── example_package_YOUR_USERNAME_HERE/ >>>> ├── __init__.py >>>> └── example.py >>>> >>>> Regarding tests - yes, that could be removed. I even had it like that >>>> initially. And I am on the fence on this one. >>>> >>>> Putting tests at the top level has some drawbacks. One drawback is that >>>> if we put tests at the top, we miss "namespace". Having a "namespace" makes >>>> tests unique even if they are defined in different providers and are both >>>> put on PYTHONPATH >>>> >>>> Imagine provider_a defining "tests/hooks/test_utils.py" and provider_b >>>> defining "tests/utils/test_utils.py". When you import >>>> "tests.utils.test_utils" - which one do you import if they are all in the >>>> PYTHONPATH? This will become a problem when we will make all of them added >>>> to PYTHONPATH in your IDE - I think. I believe many IDEs (including PyCharm >>>> and VSCode) will not work with multiple, separate python modules and they >>>> will get confused when seeing multiple repeated packages with the same >>>> modules on PYTHONPATH. >>>> <package_name> gives us "namespace" and guarantees that each module >>>> will be different. >>>> >>>> Also keeping them in packages is pretty consistent with some guidelines >>>> I saw that "tests/<package>" is recommended. I would love however to hear >>>> more from others experiences (especially TP) as maybe there are other ways >>>> I have not thought about. I would also prefer "tests/test_mmm.py" rather >>>> than "tests/airflow/providers/providers/test_mmm.py" - but I am afraid it's >>>> going to be difficult due to the "repeated modules" problem. >>>> >>>> >>>> >>>>>
