Huge +1 (I know its been few months :) ), I am in favor of the proposed
structure .

On Sat, 17 Dec 2022 at 13:05, Andrey Anshin <[email protected]>
wrote:

> +1
>
> I just want to add additional thoughts about our tests. Right now a lot of
> stuff (fixtures, context managers, etc) sits in tests/conftest.py and
> tests/test_utils so if we just move providers we need to do something
> with this.
>
> I would suggest create separate package into
> {project_root}/dev/some_awesome_name and design it as pytest plugin (
> https://docs.pytest.org/en/7.1.x/how-to/writing_plugins.html#writing-your-own-plugin)
> and move generic stuff related to tests into this package
>
> If we correctly register all our fixtures we do not need to change
> anything related to them in our code.
> In case of utilities we need to change imports from from tests.test_utils
> import awesome_stuff to from some_awesome_name.utils import awesome_stuff
>
>
> ----
> Best Wishes
> *Andrey Anshin*
>
>
>
> On Wed, 14 Dec 2022 at 15:46, Jarek Potiuk <[email protected]> wrote:
>
>> Hey Denis,
>>
>> > tests/providers/airbyte/hooks
>> > tests/providers/airbyte/operators
>> > tests/providers/airbyte/sensors
>> > tests/providers/airbyte/system
>> > tests/providers/airbyte/transfers
>>
>> Yes. It could be done like that - but it would require merging some
>> packages (there are for example already "system" packages for amazon and
>> google to keep "utils" there.
>>
>> I might attempt to see if this is possible to merge, but this might also
>> be problematic. I want to be able to get the automation in place to make
>> the migration fully automated - so that there is no manual interventions.
>> This will allow people to review the changes, and we can test development
>> environment integration and then re-apply the changes automatically while
>> rebasing.
>>
>> This change is disruptive enough and if we try to "manipulate" with the
>> files moving around too much, it might be too problematic to keep PR merged
>> while reviewing it and making sure the CI is fixed and development
>> environment works - this will take weeks to get right or even months I am
>> afraid, with some experimentation and trying different approaches. As you
>> mentioned - this is an early POC and there is still plenty of work to do on
>> that one. If we find that we cannot do it in this big change, then we can
>> always split it to a separate steps. While we will have all the providers
>> still in the same repository, we can still make "moving system tests" a
>> separate follow-up PR which will be far less disruptive.
>>
>> Hey Everyone.
>>
>> I also think that one deserves an AIP - just the lenght of my email
>> indicates it. I would turn my long email into one if you think it is a good
>> idea, this way we can keep it up-to-date and explain whatever we can come
>> up with. I have a feeling we will get a few "u-turns" while we try
>> different aspects of such separation.
>>
>> WDYT?
>>
>> J.
>>
>>
>>
>> On Tue, Dec 13, 2022 at 3:54 AM Ferruzzi, Dennis
>> <[email protected]> wrote:
>>
>>> >  Somehow what you wanted to add here was lost so I am not sure what
>>> the proposal is :).
>>>
>>> Gah,  I knew I should have explained it as well.  I was basically
>>> suggesting move the system tests for a given provider into that provider's
>>> tests module, alongside hooks, operators, sensors, etc:
>>>
>>> tests/providers/airbyte/hooks
>>> tests/providers/airbyte/operators
>>> tests/providers/airbyte/sensors
>>> tests/providers/airbyte/system
>>> tests/providers/airbyte/transfers
>>>
>>>
>>>
>>>
>>> ------------------------------
>>> *From:* Jarek Potiuk <[email protected]>
>>> *Sent:* Monday, December 12, 2022 5:48 PM
>>> *To:* [email protected]
>>> *Subject:* RE: [EXTERNAL][PROPOSAL] (Internal) move of provider
>>> packages to isolated "providers" sub-folders
>>>
>>>
>>> *CAUTION*: This email originated from outside of the organization. Do
>>> not click links or open attachments unless you can confirm the sender and
>>> know the content is safe.
>>>
>>> And just to clarify - there is a little too much redundancy in your
>>> example:
>>>
>>> `airflow/providers/airbyte/hooks/airbyte.py` becomes `airflow
>>> /providers/airbyte/src/airflow/providers/airbyte/hooks/airbyte.py`
>>>
>>> The leading "airflow" is not there in the target path:
>>>
>>> `airflow/providers/airbyte/hooks/airbyte.py` becomes
>>> `providers/airbyte/src/airflow/providers/airbyte/hooks/airbyte.py`
>>>
>>> And the redundancy comes from the fact that we have to somehow name the
>>> directory where the entire provider is moved. Let me rewrite it differently:
>>>
>>> providers/airbyte <- This is where the airbyte provider is moved
>>>                 |
>>>                 src/airflow/providers/airbyte/hooks/airbyte.py <- this
>>> is actual path "inside" the provider
>>>
>>>
>>> So there is far less redundancy than you think.
>>>
>>>
>>> On Tue, Dec 13, 2022 at 2:40 AM Jarek Potiuk <[email protected]> wrote:
>>>
>>>>
>>>>> I have two immediate thoughts on it.   First, can this be used as an
>>>>> opportunity to reorganize the system tests tree into the provider's test
>>>>> tree?  Instead of having `tests/system/providers/{foo}`, maybe we can move
>>>>> those system tests alongside the other provider tests like this:
>>>>>
>>>>
>>>> Somehow what you wanted to add here was lost so I am not sure what the
>>>> proposal is :).
>>>>
>>>>
>>>>> My other thought is that I think I am missing something here as far as
>>>>> the new organization goes.  I haven't spent the time that you have spent
>>>>> looking at how to make this work, and I also realize this is early PoC
>>>>> discussion so maybe this isn't the end target, but the new paths are very
>>>>> long and redundant.  Is that for automation reasons?  For example in the
>>>>> image you shared:
>>>>>
>>>>
>>>>
>>>>> `airflow/providers/airbyte/hooks/airbyte.py` becomes
>>>>> `airflow/providers/airbyte/src/airflow/providers/airbyte/hooks/airbyte.py`
>>>>> `tests/providers/airbyte/hooks/test_airbyte.py` becomes
>>>>> `airflow/providers/airbyte/
>>>>> tests/airflow/providers/airbyte/hooks/test_airbyte.py.
>>>>>
>>>>
>>>> For sources, I am quite sure we cannot do it - as this would end up
>>>> with a non-standard package and we would have to make some non-standard
>>>> manipulations with import paths and it would confuse multiple IDEs like
>>>> crazy.
>>>> Our providers are in "airflow.providers.<provider>" package. The "src"
>>>> is the root of sources and we need to have "airflow"/"providers" as the
>>>> package in.
>>>>
>>>> Also this choice was largely based on the official Python packaging
>>>> tutorial:
>>>> https://packaging.python.org/en/latest/tutorials/packaging-projects/
>>>> (this is where I took the "src" from - it's not been very common in Python
>>>> Projects - including Airflow - but it is now recommended in multiple places
>>>> including official Pypa tutorial:
>>>>
>>>> packaging_tutorial/
>>>> └── src/
>>>>     └── example_package_YOUR_USERNAME_HERE/
>>>>         ├── __init__.py
>>>>         └── example.py
>>>>
>>>> Regarding tests - yes, that could be removed. I even had it like that
>>>> initially. And I am on the fence on this one.
>>>>
>>>> Putting tests at the top level has some drawbacks. One drawback is that
>>>> if we put tests at the top, we miss "namespace". Having a "namespace" makes
>>>> tests unique even if they are defined in different providers and are both
>>>> put on PYTHONPATH
>>>>
>>>> Imagine provider_a defining "tests/hooks/test_utils.py" and provider_b
>>>> defining "tests/utils/test_utils.py". When you import
>>>> "tests.utils.test_utils" - which one do you import if they are all in the
>>>> PYTHONPATH? This will become a problem when we will make all of them added
>>>> to PYTHONPATH in your IDE - I think. I believe many IDEs (including PyCharm
>>>> and VSCode) will not work with multiple, separate python modules and they
>>>> will get confused when seeing multiple repeated packages with the same
>>>> modules on PYTHONPATH.
>>>> <package_name> gives us "namespace" and guarantees that each module
>>>> will be different.
>>>>
>>>> Also keeping them in packages is pretty consistent with some guidelines
>>>> I saw that "tests/<package>" is recommended. I would love however to hear
>>>> more from others experiences (especially TP) as maybe there are other ways
>>>> I have not thought about. I would also prefer "tests/test_mmm.py" rather
>>>> than "tests/airflow/providers/providers/test_mmm.py" - but I am afraid it's
>>>> going to be difficult due to the "repeated modules" problem.
>>>>
>>>>
>>>>
>>>>>

Reply via email to