>>> As of additional dependency complexity between providers actually the
>>> additional dependency I think creates more problems than the benefit… would be
>>> cool if there would be an option to „inline“ common code from a single place
>>> but keep individual providers fully independent…

>Well, we already do a lot of inlining, so if we think we should do more, we 
>have mechanisms for that. We have pre-commits and release commands that do a 
>lot of that. Pre-commits inline scripts in Dockerfiles and shorten the PyPI 
>readme. The providers' __init__.py files, changelogs and index 
>documentation .rst (partially) are generated at release documentation 
>preparation time, pyproject.toml files for providers are generated from common 
>templates at package building time, and so on and so on :). So we can do more 
>of that and generate common code, it's just a matter of adding pre-commits or 
>breeze scripts. But again "can't have and eat cake" - this has the drawback 
>that there are extra steps involved and even if it's automated it does add 
>friction when you have to regenerate the code every time you change it and 
>when you change it in another place than where you use it.

Yes, I also thought about pre-commit for a moment. I'd be okay if we really 
inline the code and have a pre-commit hook that keeps the redundant Python 
copies in the provider folders aligned. It might need to be opt-in: if only 
10 of 85 providers use the common code, a change to a common line probably 
does not need to affect all providers.

As long as no Windows users are trying to develop for Airflow (do we need to 
consider them?), would symlinks also work? Git can cope with them, but I don't 
know whether the Python toolchain dereferences them and packages a copy 
rather than the symlink itself. Would be worth a test... it would save the 
pre-commit and we could even selectively include common bits into providers 
as needed :-D
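
A rough sketch of the kind of test meant here, using only the standard 
library. It shows that tarfile can archive the target's content instead of 
the symlink when dereference=True is passed. It does not show what the build 
backend used for the provider packages does, which would need its own test. 
The folder and file names are made up for illustration, and creating symlinks 
may need extra privileges on Windows.

```python
import os
import tarfile
import tempfile
from pathlib import Path


def symlink_is_dereferenced(dereference: bool) -> bool:
    """Archive a folder containing a symlink; True if the archive holds a regular file."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        common = root / "common" / "helpers.py"
        common.parent.mkdir()
        common.write_text("VALUE = 42\n")

        provider = root / "provider"
        provider.mkdir()
        # Relative symlink, the way it would be committed to Git.
        os.symlink(os.path.join("..", "common", "helpers.py"), provider / "helpers.py")

        archive = root / "provider.tar"
        with tarfile.open(archive, "w", dereference=dereference) as tar:
            tar.add(provider, arcname="provider")
        with tarfile.open(archive) as tar:
            return tar.getmember("provider/helpers.py").isfile()
```

Calling it with dereference=False keeps the entry as a symlink in the archive, 
which is the case that would break an installed package.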

Best regards

Jens Scheffler

Alliance: Enabler - Tech Lead (XC-AS/EAE-ADA-T)
Robert Bosch GmbH | Hessbruehlstraße 21 | 70565 Stuttgart-Vaihingen | GERMANY | 
www.bosch.com
Tel. +49 711 811-91508 | Mobil +49 160 90417410 | jens.scheff...@de.bosch.com


-----Original Message-----
From: Jarek Potiuk <ja...@potiuk.com> 
Sent: Wednesday, 21 February 2024 21:18
To: dev@airflow.apache.org
Subject: Re: [DISCUSS] Common.util provider?

> if we have a common piece then we are locking all depending providers
> (potentially) together if common code changes

Yes, coupling is the drawback of this solution. You can't have your cake and 
eat it too: in this case you trade DRY for coupling.

> As of additional dependency complexity between providers actually the
> additional dependency I think creates more problems than the benefit… would be
> cool if there would be an option to „inline“ common code from a single place
> but keep individual providers fully independent…

Well, we already do a lot of inlining, so if we think we should do more, we 
have mechanisms for that. We have pre-commits and release commands that do a 
lot of that. Pre-commits inline scripts in Dockerfiles and shorten the PyPI 
readme. The providers' __init__.py files, changelogs and index documentation 
.rst (partially) are generated at release documentation preparation time, 
pyproject.toml files for providers are generated from common templates at 
package building time, and so on and so on :). So we can do more of that and 
generate common code, it's just a matter of adding pre-commits or breeze 
scripts. But again "can't have and eat cake" - this has the drawback that 
there are extra steps involved and even if it's automated it does add friction 
when you have to regenerate the code every time you change it and when you 
change it in another place than where you use it.
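
For illustration, the kind of small, self-contained helper that could be 
generated into providers this way is the "common versioning check with cache" 
from the original proposal quoted below. This is only a sketch: the function 
names are hypothetical, and the version parsing is deliberately simplified to 
a plain numeric major version.

```python
from functools import lru_cache
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


@lru_cache(maxsize=None)
def installed_major_version(distribution: str) -> Optional[int]:
    """Return the major version of an installed distribution, or None.

    None is returned when the distribution is not installed or when its
    version does not start with a plain number. The result is cached, so
    the package metadata is looked up once per distribution name.
    """
    try:
        raw = version(distribution)
    except PackageNotFoundError:
        return None
    head = raw.split(".", 1)[0]
    return int(head) if head.isdigit() else None


def has_pendulum_2() -> bool:
    """Hypothetical example check: is pendulum 2.x installed?"""
    return installed_major_version("pendulum") == 2
```

Because it depends only on the standard library, a helper like this can be 
copied into a provider without adding a dependency on another provider.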

J.

On Wed, Feb 21, 2024 at 9:02 PM Scheffler Jens (XC-AS/EAE-ADA-T) 
<jens.scheff...@de.bosch.com.invalid> wrote:

> Hi Jarek,
>
> While reviewing the PR from uranusjr for AIP-60 I also had the feeling 
> that a lot of very similar code is repeated in all the providers. But 
> during review yesterday I dropped the idea because if we have a common 
> piece then we are locking all depending providers (potentially) 
> together if common code changes.
> As of additional dependency complexity between providers actually the 
> additional dependency I think creates more problems than the benefit… 
> would be cool if there would be an option to „inline“ common code 
> from a single place but keep individual providers fully independent…
>
> Jens
>
> ________________________________
> From: Jarek Potiuk <ja...@potiuk.com>
> Sent: Wednesday, February 21, 2024 5:42:20 PM
> To: dev@airflow.apache.org <dev@airflow.apache.org>
> Subject: [DISCUSS] Common.util provider?
>
> Hello everyone,
>
> How do we feel about introducing a common.util provider?
>
> I know it's not been the original idea behind providers, but - after 
> testing common.sql and now also having common.io, seems like more and 
> more we would like to extract some common code that we would like 
> providers to use, but we refrain from it, because it will only be 
> actually usable 6 months after we introduce some common code.
>
> However, if we introduce common.util, this problem is generally gone - 
> at the expense of more complex maintenance and cross-provider dependencies.
> We should be able to add a common util method and use it in a provider 
> at the same time.
>
> Think Amazon provider using a new feature released in common.util 
> >=1.2.0 and google provider >= 1.1.0. All manageable and we do it 
> already for common.sql. We know how to do it, we know what to avoid, 
> we know we cannot introduce backwards-incompatible changes, so we have 
> to be very clear what is and what is not a public API there. We could 
> rather easily add tests to prevent such backwards-incompatible 
> changes. We even have a solution for chicken-egg providers where we 
> need to release two providers at the same time if they depend on each 
> other. Generally speaking it's quite workable but adds a bit of overhead.
>
> Examples that we could implement as "common.util":
>
> - common versioning check with cache - where multiple providers could 
> reuse "do we have pendulum 2"
> - more complex - some date management features (we have a few like 
> date_ranges/round_time). But there are many more.
>
> I generally do not love the common "util" approach. It has a tendency 
> to become a bag of everything over time, but if we limit it to a set 
> of small, fully decoupled modules where each module is independent - 
> it's OK. And we already have it in "airflow.util" and we seem to be doing 
> well.
>
> WDYT? Is it worth it?
>
> J.
>
