As the author of https://github.com/apache/airflow/pull/39530 I love the
idea.

> * when providers get >= airflow 2.10 - we change them to import from
> `airflow.openlineage` rather than from "airflow.providers.common.compat".
>
What's the reasoning behind that? How would an Airflow core release impact
provider dependencies?
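
For reference, the import switch in the quoted step could be sketched roughly as below. This is only an illustration: the helper name `import_first_available` is hypothetical and not part of Airflow, and the only real names are the two module paths mentioned in this thread. It tries the core location first and falls back to the next candidate.

```python
import importlib
from types import ModuleType


def import_first_available(*module_names: str) -> ModuleType:
    """Return the first module from module_names that can be imported.

    Hypothetical helper for illustration only; not an Airflow API.
    """
    last_error = None
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError as err:  # also covers ModuleNotFoundError
            last_error = err
    raise ImportError(f"none of {module_names} could be imported") from last_error


# Intended use in a provider (module paths taken from the thread):
#   lineage = import_first_available(
#       "airflow.openlineage",              # available once core ships it
#       "airflow.providers.common.compat",  # fallback for older Airflow
#   )
```

Once a provider's minimum Airflow version contains the core module, the fallback entry (and the dependency on the compat provider) could be dropped.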

On Mon, 10 Jun 2024 at 11:08, Maciej Obuchowski <obuchowski.mac...@gmail.com> wrote:

> I think it's a good solution.
> The only known problem with that idea is that the common code has to live
> "forever" - as long as someone can use the older providers (or older
> Airflow version).
> The solution would be to introduce some explicit deprecation or versioning
> for provider dependencies - but that's not really possible due to lack of
> constraints
> for optional dependencies.
>
> On Sat, 8 Jun 2024 at 22:00, Jarek Potiuk <ja...@potiuk.com> wrote:
>
> > I have an idea about that one, and probably that one will fulfill the
> > "polyfill" approach discussed earlier.
> >
> > I think we should not name the provider "common.util" but
> "common.compat" -
> > because all the code that we need to put there is really about keeping
> > compatibility.
> >
> > For example look here https://github.com/apache/airflow/pull/39530
> >
> > We need a place for "compatibility" code that a number of providers
> > could use whenever we want to keep some backwards compatibility.
> >
> > So having a "common.compat" provider would likely fulfill the polyfill
> > approach nicely - it should only contain the code that we need in order
> > to keep backwards compatibility.
> >
> > Example for https://github.com/apache/airflow/pull/39530
> >
> > * we add the complex compatibility code (see
> > https://github.com/apache/airflow/pull/39530#issuecomment-2145670785) in
> > the "common.compat" provider - and to airflow.openlineage in this case
> > * we import it from there in all providers that need it (this will
> > automatically add dependency)
> > * when providers get >= airflow 2.10 - we change them to import from
> > `airflow.openlineage` rather than from "airflow.providers.common.compat".
> >
> > We could apply similar approach for other "compatibility" code
> >
> > J.
> >
> >
> >
> >
> > On Thu, Apr 11, 2024 at 10:22 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> >
> > > Any other ideas or suggestions here? Can someone explain what the
> > > "polypill" approach would look like, maybe? How do we imagine this
> > > working?
> > >
> > > Just to continue this discussion - another example.
> > >
> > > Small thing that David wanted to add for changes in some sql providers:
> > >
> > > import traceback
> > > import warnings
> > > from contextlib import contextmanager
> > >
> > > @contextmanager
> > > def suppress_and_warn(*exceptions: type[BaseException]):
> > >     """Context manager that suppresses the given exceptions and logs a warning message."""
> > >     try:
> > >         yield
> > >     except exceptions as e:
> > >         # Report the suppressed exception, with its traceback, as a UserWarning.
> > >         warnings.warn(f"Exception suppressed: {e}\n{traceback.format_exc()}", category=UserWarning)
> > >
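
A self-contained sketch of how the helper quoted above behaves when used. The helper is repeated here so the snippet runs on its own; the `demo` function is purely illustrative and not from the PR. Only the listed exception types are turned into warnings, anything else still propagates.

```python
from __future__ import annotations

import traceback
import warnings
from contextlib import contextmanager


@contextmanager
def suppress_and_warn(*exceptions: type[BaseException]):
    """Suppress the given exceptions and emit a UserWarning instead."""
    try:
        yield
    except exceptions as e:
        warnings.warn(
            f"Exception suppressed: {e}\n{traceback.format_exc()}",
            category=UserWarning,
        )


def demo() -> list[str]:
    """Run a failing lookup under the helper and collect the warning headlines."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        with suppress_and_warn(KeyError):
            {}["missing"]  # raises KeyError, which is suppressed and reported
        # Keep only the first line of each message (the rest is the traceback).
        return [str(w.message).splitlines()[0] for w in caught]
```

Execution continues after the `with` block, which is the whole point of the helper: the caller gets a warning with the traceback instead of a failure.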
> > >
> > >
> > >
> >
> https://github.com/apache/airflow/pull/38707/files#diff-6e1b2f961cb951d05d66d2d814ef5f6d8f8bf8f43c40fb5d40e27a031fed8dd7R115
> > >
> > > This is a small thing - but adding it in `airflow` is problematic -
> > > because it will only be released in 2.10, so we cannot use it in
> > > providers if we do.
> > > Currently - since it is used in sql providers, I suggested using
> > > `common.sql` for that code (and add >= 1.12 for common-sql-providers
> for
> > > those providers that use it). And I will write a separate email about a
> > > proposed versioning approach there.
> > >
> > > Do we have a good proposal on how we can solve similar things in the
> > > future?
> > > Do we want it at all? It has some challenges - yes it DRY's the code
> but
> > > it also introduces coupling.
> > >
> > > J.
> > >
> > >
> > > On Sun, Mar 10, 2024 at 6:21 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> > >
> > >> Coming back to it - what about the "polypill" :)? What's different vs
> > the
> > >> "common.sql" way of doing it ? How do we think it can work ?
> > >>
> > >> On Thu, Feb 22, 2024 at 1:58 PM Jarek Potiuk <ja...@potiuk.com>
> wrote:
> > >>
> > >>> > The symbolic link approach seems to disregard all the external
> > >>> providers, unless I misunderstand it.
> > >>>
> > >>> Not really. It just does not make it easy for external providers to
> > >>> use it "fast". They can still manually copy those utils from the
> > >>> latest version of Airflow if they want to use them. Almost by
> > >>> definition, those will be small, independent modules that can simply
> > >>> be copied as needed by whoever releases external providers - and they
> > >>> are also free to copy any older version if they want. That is a nice
> > >>> feature that makes them fully decoupled from the version of Airflow
> > >>> they are installed in (same as community providers). Or - if they
> > >>> want - they can just import them from "airflow.provider_utils", but
> > >>> then they have to add >= Airflow 2.9 if that util appeared in Airflow
> > >>> 2.9 (which is the main reason we want to use symbolic links - because
> > >>> due to our policies and promises, we do not want community providers
> > >>> to depend on the latest version of Airflow in the vast majority of
> > >>> cases).
> > >>>
> > >>> So this approach is also fully usable by external providers, but it
> > >>> requires some manual effort to copy the modules to their providers.
> > >>>
> > >>> > I like the polypill idea. A backport provider that brings new
> > >>> interfaces to providers without the actual functionalities.
> > >>>
> > >>> I would love to hear more about this, I think the "common.util" was
> > >>> exactly the kind of polyfill approach (with its own versioning
> > >>> complexities) but maybe I do not understand how such a polyfill
> > provider
> > >>> would work. Say we want to add a new "urlparse" method usable for all
> > >>> providers. Could you explain how it would work - say:
> > >>>
> > >>> * we add "urlparse" in Airflow 2.9
> > >>> * some provider wants to use it in Airflow 2.7
> > >>>
> > >>> What providers, with what code/interfaces we would have to release in
> > >>> this case and what dependencies such providers that want to use it
> > (both
> > >>> community and Airflow should have)? I **think** that would mean
> > exactly the
> > >>> "common.<something>" approach we already have with "io" and "sql",
> but
> > >>> maybe I do not understand it :)
> > >>>
> > >>> On Thu, Feb 22, 2024 at 1:45 PM Tzu-ping Chung
> > <t...@astronomer.io.invalid>
> > >>> wrote:
> > >>>
> > >>>> I like the polypill idea. A backport provider that brings new
> > >>>> interfaces to providers without the actual functionalities.
> > >>>>
> > >>>>
> > >>>> > On 22 Feb 2024, at 20:41, Maciej Obuchowski <
> mobuchow...@apache.org
> > >
> > >>>> wrote:
> > >>>> >
> > >>>> >> That's why I generally do
> > >>>> > not like the "util" approach because common packaging introduces
> > >>>> > unnecessary coupling (you have to upgrade independent utils
> > together).
> > >>>> >
> > >>>> > From my experience with releasing OpenLineage where we do things
> > >>>> similarly:
> > >>>> > I think that's the advantage of it, but only _if_ you can release
> > >>>> those
> > >>>> > together.
> > >>>> > With "built-in" providers it makes sense, but it could be
> > >>>> > burdensome if "external" ones depended on that functionality.
> > >>>> >
> > >>>> >> I know it's not been the original idea behind providers, but -
> > after
> > >>>> > testing common.sql and now also having common.io, seems like more
> > >>>> and more
> > >>>> > we would like to extract some common code that we would like
> > >>>> providers to
> > >>>> > use, but we refrain from it, because it will only be actually
> > usable 6
> > >>>> > months after we introduce some common code.
> > >>>> >
> > >>>> > So, maybe better approach would be to introduce the functionality
> > into
> > >>>> > core,
> > >>>> > and use common.X provider as "polyfill" (to borrow JS
> nomenclature)
> > >>>> > to make sure providers could use that functionality now, where
> > >>>> external
> > >>>> > ones could depend on the Airflow ones?
> > >>>> >
> > >>>> > The symbolic link approach seems to disregard all the external
> > >>>> providers,
> > >>>> > unless
> > >>>> > I misunderstand it.
> > >>>> >
> > >>>> > On Thu, 22 Feb 2024 at 13:28, Jarek Potiuk <ja...@potiuk.com> wrote:
> > >>>> >
> > >>>> >>> Ideally utilities for each purpose (parsing URI, reading Object
> > >>>> Storage,
> > >>>> >> reading SQL, etc.) should each have its own utility package, so
> > they
> > >>>> can be
> > >>>> >> released independently without dependency problems popping up if
> we
> > >>>> need to
> > >>>> >> break compatibility to one purpose. But more providers are
> > >>>> exponentially
> > >>>> >> more difficult to maintain, so I’d settle for one utility
> provider
> > >>>> for now
> > >>>> >> and split further if needed in the future.
> > >>>> >>
> > >>>> >> Very much agree with this general statement. That's why I
> generally
> > >>>> do
> > >>>> >> not like the "util" approach because common packaging introduces
> > >>>> >> unnecessary coupling (you have to upgrade independent utils
> > >>>> together). And
> > >>>> >> when we have a common set of things that seem to make sense to be
> > >>>> >> released together when upgraded we should package them together in
> > >>>> >> "common.<something concrete>" (like we have with common.io and
> > >>>> >> common.sql).
> > >>>> >>
> > >>>> >> However - in this case, I think what Jens proposed (and I am
> happy
> > >>>> to try
> > >>>> >> as well) is to attempt to use symbolic links - i.e. add the code
> in
> > >>>> >> `airflow.util` but then create a symbolic link in the provider.
> I
> > >>>> tested
> > >>>> >> it yesterday and it works as expected - i.e. such symbolic link
> is
> > >>>> >> dereferenced and the provider package contains the python file,
> not
> > >>>> >> symbolic link. That seems like a much more lightweight approach
> > that
> > >>>> will
> > >>>> >> serve the purpose of "common.util" much better. The only thing we
> > >>>> >> will have to take care of (and we can add it once the POC is
> > >>>> >> successful) is to add some pre-commit protection ensuring that those
> > >>>> >> symbolically linked util modules are imported in providers from
> > >>>> >> inside those providers, not from airflow, and that they are
> > >>>> >> "standalone" (i.e. - as you mentioned - they do not depend on
> > >>>> >> anything in airflow code). We could create a new package for that in
> > >>>> >> airflow - "airflow.provider_utils" for example - and make sure (as a
> > >>>> >> next step) that anything from that package is never directly
> > >>>> >> imported by any provider, and that whenever a provider uses it, it
> > >>>> >> is a symbolic link inside that provider. That's all automatable and
> > >>>> >> we can prevent mistakes via pre-commit.
> > >>>> >>
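
The pre-commit guard described above could start from something like this minimal sketch. It assumes the package name "airflow.provider_utils" floated in this thread (nothing here confirms such a package exists), and the function name `find_direct_imports` is made up for illustration. It reports the line numbers where provider code imports the shared package directly instead of going through its own symlinked copy.

```python
from __future__ import annotations

import ast

# Hypothetical package name, taken from the proposal in this thread.
FORBIDDEN_PACKAGE = "airflow.provider_utils"


def find_direct_imports(source: str, forbidden: str = FORBIDDEN_PACKAGE) -> list[int]:
    """Return sorted line numbers of absolute imports of the forbidden package."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Covers both "from pkg.mod import x" and "from pkg import mod".
            names = [node.module]
            names += [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue  # relative imports (the symlinked copy) are allowed
        if any(n == forbidden or n.startswith(forbidden + ".") for n in names):
            hits.append(node.lineno)
    return sorted(hits)
```

A real hook would walk the provider source files and fail when the returned list is non-empty; checking that the shared modules themselves are "standalone" would be a separate pass.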
> > >>>> >> I think that might lead to a very lightweight approach where we
> > >>>> introduce
> > >>>> >> new common functionality which is immediately reusable in
> providers
> > >>>> without
> > >>>> >> the hassle of taking care about backwards compatibility, and
> > >>>> managing the
> > >>>> >> "common.util" provider. At the expense of a bit complex
> pre-commit
> > >>>> that
> > >>>> >> will guard the usage of it.
> > >>>> >>
> > >>>> >> Seems that it might be the "Eat cake and have it too" way that
> > we've
> > >>>> been
> > >>>> >> looking for.
> > >>>> >>
> > >>>> >> J.
> > >>>> >>
> > >>>> >> On Thu, Feb 22, 2024 at 6:14 AM Tzu-ping Chung
> > >>>> <t...@astronomer.io.invalid>
> > >>>> >> wrote:
> > >>>> >>
> > >>>> >>> It would help in the sense mentioned in previous posts, yes. But
> > one
> > >>>> >> thing
> > >>>> >>> I want to point out is, for the provider to actually be helpful,
> > it
> > >>>> must
> > >>>> >> be
> > >>>> >>> treated a bit differently from normal providers, but more like a
> > >>>> separate
> > >>>> >>> third-party dependency. Specifically, the provider should not have
> > >>>> >>> a dependency on core Airflow, so it can be released and depended on
> > >>>> >>> more flexibly.
> > >>>> >>>
> > >>>> >>> Ideally utilities for each purpose (parsing URI, reading Object
> > >>>> Storage,
> > >>>> >>> reading SQL, etc.) should each have its own utility package, so
> > >>>> they can
> > >>>> >> be
> > >>>> >>> released independently without dependency problems popping up if
> > we
> > >>>> need
> > >>>> >> to
> > >>>> >>> break compatibility to one purpose. But more providers are
> > >>>> exponentially
> > >>>> >>> more difficult to maintain, so I’d settle for one utility
> provider
> > >>>> for
> > >>>> >> now
> > >>>> >>> and split further if needed in the future.
> > >>>> >>>
> > >>>> >>> TP
> > >>>> >>>
> > >>>> >>>
> > >>>> >>>> On 22 Feb 2024, at 10:10, Scheffler Jens (XC-AS/EAE-ADA-T) <
> > >>>> >>> jens.scheff...@de.bosch.com.INVALID> wrote:
> > >>>> >>>>
> > >>>> >>>> @Uranusjr would this help as a pilot in your AIP-60 code to
> parse
> > >>>> and
> > >>>> >>> validate URIs for datasets?
> > >>>> >>>>
> > >>>> >>>> Mit freundlichen Grüßen / Best regards
> > >>>> >>>>
> > >>>> >>>> Jens Scheffler
> > >>>> >>>>
> > >>>> >>>> Alliance: Enabler - Tech Lead (XC-AS/EAE-ADA-T)
> > >>>> >>>> Robert Bosch GmbH | Hessbruehlstraße 21 | 70565
> > >>>> Stuttgart-Vaihingen |
> > >>>> >>> GERMANY | www.bosch.com
> > >>>> >>>> Tel. +49 711 811-91508 | Mobil +49 160 90417410 |
> > >>>> >>> jens.scheff...@de.bosch.com
> > >>>> >>>>
> > >>>> >>>> Sitz: Stuttgart, Registergericht: Amtsgericht Stuttgart, HRB
> > 14000;
> > >>>> >>>> Aufsichtsratsvorsitzender: Prof. Dr. Stefan Asenkerschbaumer;
> > >>>> >>>> Geschäftsführung: Dr. Stefan Hartung, Dr. Christian Fischer,
> Dr.
> > >>>> Markus
> > >>>> >>> Forschner,
> > >>>> >>>> Stefan Grosch, Dr. Markus Heyn, Dr. Frank Meyer, Dr. Tanja
> > Rückert
> > >>>> >>>>
> > >>>> >>>> -----Original Message-----
> > >>>> >>>> From: Jarek Potiuk <ja...@potiuk.com>
> > >>>> >>>> Sent: Thursday, 22 February 2024 00:53
> > >>>> >>>> To: dev@airflow.apache.org
> > >>>> >>>> Subject: Re: [DISCUSS] Common.util provider?
> > >>>> >>>>
> > >>>> >>>> Yep. It could work with symbolic links. I tested it with flit - in
> > >>>> >>>> both the wheel and the sdist, such a symbolically linked file is
> > >>>> >>>> dereferenced and a copy of the file is added there. It could be a
> > >>>> >>>> nice way of doing it.
> > >>>> >>>>
> > >>>> >>>> Maybe then worth trying next time if someone has a need?
> > >>>> >>>>
> > >>>> >>>> J
> > >>>> >>>>
> > >>>> >>>> On Thu, Feb 22, 2024 at 12:39 AM Scheffler Jens
> > (XC-AS/EAE-ADA-T) <
> > >>>> >>> jens.scheff...@de.bosch.com.invalid> wrote:
> > >>>> >>>>
> > >>>> >>>>>>>> As of additional dependency complexity between providers
> > >>>> actually
> > >>>> >>>>>>>> the
> > >>>> >>>>> additional dependency I think creates more problems than the
> > >>>> benefit…
> > >>>> >>>>> would be cool if there would be an option to „inline“ common
> > code
> > >>>> from
> > >>>> >>>>> a single place but keep individual providers fully
> independent…
> > >>>> >>>>>
> > >>>> >>>>>> Well, we already  do a lot of inlining, so if we think we
> > should
> > >>>> do
> > >>>> >>>>>> more,
> > >>>> >>>>> we have mechanisms for that. We have  pre-commits and release
> > >>>> commands
> > >>>> >>>>> that do a lot of that. Pre commits are inlining scripts in
> > >>>> >>>>> Dockerfiles, shortening PyPI readme . The providers
> __init__.py
> > >>>> files
> > >>>> >>>>> and changelogs and index documentation .rst (partially) are
> > >>>> generated
> > >>>> >>>>> at release documentation preparation time, pyproject.toml for
> > >>>> >>>>> providers are generated from common templates at package
> > building
> > >>>> time
> > >>>> >>>>> and so on and so on :). So we can do more of that and generate
> > >>>> common
> > >>>> >>>>> code, it's just a matter of adding pre-commits or breeze
> > scripts.
> > >>>> But
> > >>>> >>>>> again "can't have and eat cake" - this has the drawback that
> > >>>> there are
> > >>>> >>>>> extra steps involved and even if it's automated it does add
> > >>>> friction
> > >>>> >>>>> when you have to regenerate the code every time you change it
> > and
> > >>>> when
> > >>>> >>>>> you change it in another place than where you use it.
> > >>>> >>>>>
> > >>>> >>>>> Yes, I also thought about pre-commit for a moment. I'd be okay if
> > >>>> >>>>> we really inline and have a pre-commit that keeps the redundant
> > >>>> >>>>> Python copies in the folders aligned. It might need to be opt-in
> > >>>> >>>>> if only 10 of 85 providers are using common stuff, and if we
> > >>>> >>>>> change a common line we probably do not need to affect all
> > >>>> >>>>> providers.
> > >>>> >>>>>
> > >>>> >>>>> As long as no Windows users are trying to code for Airflow (do we
> > >>>> >>>>> need to consider that?), would it also work to use symlinks? Git
> > >>>> >>>>> can cope with this; I don't know whether the Python toolchain
> > >>>> >>>>> dereferences the link and packages a copy rather than a symlink.
> > >>>> >>>>> It would be worth a test... it would save the pre-commit and we
> > >>>> >>>>> could even selectively include common bla into providers as
> > >>>> >>>>> needed :-D
> > >>>> >>>>>
> > >>>> >>>>> Mit freundlichen Grüßen / Best regards
> > >>>> >>>>>
> > >>>> >>>>> Jens Scheffler
> > >>>> >>>>>
> > >>>> >>>>> -----Original Message-----
> > >>>> >>>>> From: Jarek Potiuk <ja...@potiuk.com>
> > >>>> >>>>> Sent: Wednesday, 21 February 2024 21:18
> > >>>> >>>>> To: dev@airflow.apache.org
> > >>>> >>>>> Subject: Re: [DISCUSS] Common.util provider?
> > >>>> >>>>>
> > >>>> >>>>>> if we have a common piece then we are locking all depending
> > >>>> >>>>>> providers
> > >>>> >>>>> (potentially) together if common code changes
> > >>>> >>>>>
> > >>>> >>>>> Yes, coupling in this case is the drawback of this solution. You
> > >>>> >>>>> can't have cake and eat it too, and in this case you trade DRY for
> > >>>> >>>>> coupling.
> > >>>> >>>>>
> > >>>> >>>>>> As of additional dependency complexity between providers
> > actually
> > >>>> >>>>>> the
> > >>>> >>>>> additional dependency I think creates more problems than the
> > >>>> benefit…
> > >>>> >>>>> would be cool if there would be an option to „inline“ common
> > code
> > >>>> from
> > >>>> >>>>> a single place but keep individual providers fully
> independent…
> > >>>> >>>>>
> > >>>> >>>>> Well, we already do a lot of inlining, so if we think we should do
> > >>>> >>>>> more, we have mechanisms for that. We have pre-commits and release
> > >>>> >>>>> commands that do a lot of that. Pre-commits inline scripts in
> > >>>> >>>>> Dockerfiles and shorten the PyPI readme. The providers'
> > >>>> >>>>> __init__.py files, changelogs and index documentation .rst are
> > >>>> >>>>> (partially) generated at release documentation preparation time,
> > >>>> >>>>> pyproject.toml files for providers are generated from common
> > >>>> >>>>> templates at package building time, and so on and so on :). So we
> > >>>> >>>>> can do more of that and generate common code; it's just a matter
> > >>>> >>>>> of adding pre-commits or breeze scripts. But again "can't have and
> > >>>> >>>>> eat cake" - this has the drawback that there are extra steps
> > >>>> >>>>> involved, and even if it's automated it does add friction when you
> > >>>> >>>>> have to regenerate the code every time you change it and when you
> > >>>> >>>>> change it in another place than where you use it.
> > >>>> >>>>>
> > >>>> >>>>> J.
> > >>>> >>>>>
> > >>>> >>>>> On Wed, Feb 21, 2024 at 9:02 PM Scheffler Jens
> > (XC-AS/EAE-ADA-T) <
> > >>>> >>>>> jens.scheff...@de.bosch.com.invalid> wrote:
> > >>>> >>>>>
> > >>>> >>>>>> Hi Jarek,
> > >>>> >>>>>>
> > >>>> >>>>>> While reviewing the PR from uranusjr for AIP-60 I also had the
> > >>>> >>>>>> feeling that a lot of very similar code is repeated in all the
> > >>>> >>>>>> providers. But during review yesterday I dropped the idea, because
> > >>>> >>>>>> if we have a common piece then we are (potentially) locking all
> > >>>> >>>>>> depending providers together if the common code changes.
> > >>>> >>>>>> As for the additional dependency complexity between providers, I
> > >>>> >>>>>> think the additional dependency actually creates more problems
> > >>>> >>>>>> than benefits… it would be cool if there were an option to
> > >>>> >>>>>> „inline“ common code from a single place but keep individual
> > >>>> >>>>>> providers fully independent…
> > >>>> >>>>>>
> > >>>> >>>>>> Jens
> > >>>> >>>>>>
> > >>>> >>>>>> ________________________________
> > >>>> >>>>>> From: Jarek Potiuk <ja...@potiuk.com>
> > >>>> >>>>>> Sent: Wednesday, February 21, 2024 5:42:20 PM
> > >>>> >>>>>> To: dev@airflow.apache.org <dev@airflow.apache.org>
> > >>>> >>>>>> Subject: [DISCUSS] Common.util provider?
> > >>>> >>>>>>
> > >>>> >>>>>> Hello everyone,
> > >>>> >>>>>>
> > >>>> >>>>>> How do we feel about introducing a common.util provider?
> > >>>> >>>>>>
> > >>>> >>>>>> I know it's not been the original idea behind providers, but
> -
> > >>>> after
> > >>>> >>>>>> testing common.sql and now also having common.io, seems like
> > >>>> more
> > >>>> >>>>>> and more we would like to extract some common code that we
> > would
> > >>>> >>>>>> like providers to use, but we refrain from it, because it
> will
> > >>>> only
> > >>>> >>>>>> be actually usable 6 months after we introduce some common
> > code.
> > >>>> >>>>>>
> > >>>> >>>>>> However, if we introduce common.util, this problem is
> generally
> > >>>> gone
> > >>>> >>>>>> - at the expense of more complex maintenance and
> cross-provider
> > >>>> >>>>> dependencies.
> > >>>> >>>>>> We should be able to add a common util method and use it in a
> > >>>> >>>>>> provider at the same time.
> > >>>> >>>>>>
> > >>>> >>>>>> Think of the Amazon provider using a new feature released in
> > >>>> >>>>>> common.util >=1.2.0 and the google provider >= 1.1.0. All
> > >>>> >>>>>> manageable, and we do it already for common.sql. We know how to do
> > >>>> >>>>>> it, we know what to avoid, we know we cannot introduce
> > >>>> >>>>>> backwards-incompatible changes, so we have to be very clear what is
> > >>>> >>>>>> and what is not a public API there. We could rather easily add
> > >>>> >>>>>> tests to prevent such backwards-incompatible changes. We even have
> > >>>> >>>>>> a solution for chicken-egg providers where we need to release two
> > >>>> >>>>>> providers at the same time if they depend on each other. Generally
> > >>>> >>>>>> speaking it's quite workable but adds a bit of overhead.
> > >>>> >>>>>>
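
To make the coupling in the paragraph above concrete, here is a small sketch of how the per-provider lower bounds combine. The function names are invented for illustration, and versions are assumed to be plain dotted integers (real tooling would rely on the installer's own version handling). The installed common.util has to satisfy the highest minimum that any installed provider asks for.

```python
from __future__ import annotations


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '1.2.0' into (1, 2, 0); assumes purely numeric dotted versions."""
    return tuple(int(part) for part in text.split("."))


def effective_minimum(minimums: dict[str, str]) -> str:
    """Return the strictest lower bound requested by any provider."""
    return max(minimums.values(), key=parse_version)


def is_satisfied(installed: str, minimums: dict[str, str]) -> bool:
    """Check whether one installed version satisfies every provider's minimum."""
    return parse_version(installed) >= parse_version(effective_minimum(minimums))


# Lower bounds from the example in the thread (provider -> common.util minimum).
requested = {"amazon": "1.2.0", "google": "1.1.0"}
```

With these bounds, installing both providers effectively requires common.util 1.2.0 or newer, even though google alone would accept 1.1.0.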
> > >>>> >>>>>> Examples that we could implement as "common.util":
> > >>>> >>>>>>
> > >>>> >>>>>> - common versioning check with cache - where multiple
> providers
> > >>>> >>>>>> could reuse "do we have pendulum 2"
> > >>>> >>>>>> - more complex - some date management features (we have a few
> > >>>> like
> > >>>> >>>>>> date_ranges/round_time). But there are many more.
> > >>>> >>>>>>
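
The first example in the list above ("do we have pendulum 2", checked once and cached) might look roughly like this. It is a sketch using only the standard library; the function name `major_version` is hypothetical and not an existing Airflow utility.

```python
from __future__ import annotations

from functools import lru_cache
from importlib.metadata import PackageNotFoundError, version


@lru_cache(maxsize=None)
def major_version(distribution: str) -> int | None:
    """Return the installed major version of a distribution, or None.

    The result is cached, so repeated checks from several providers only
    query the package metadata once per distribution name.
    """
    try:
        return int(version(distribution).split(".")[0])
    except (PackageNotFoundError, ValueError):
        # Not installed, or the version does not start with a plain integer.
        return None


# Intended use in a provider:
#   if major_version("pendulum") == 2: ...
```

Because the cache lives in the module that defines the function, all providers importing it from one shared place would share the cached result.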
> > >>>> >>>>>> I generally do not love the common "util" approach. It has a
> > >>>> >>>>>> tendency to become a bag of everything over time, but if we limit
> > >>>> >>>>>> it to a set of small, fully decoupled modules where each module is
> > >>>> >>>>>> independent - it's OK. And we already have it in "airflow.util"
> > >>>> >>>>>> and we seem to be doing well.
> > >>>> >>>>>>
> > >>>> >>>>>> WDYT? Is it worth it ?
> > >>>> >>>>>>
> > >>>> >>>>>> J.
> > >>>> >>>>>>
> > >>>> >>>>>
> > >>>> >>>>
> > >>>> >>>>
> > >>>>
> ---------------------------------------------------------------------
> > >>>> >>>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > >>>> >>>> For additional commands, e-mail: dev-h...@airflow.apache.org
> > >>>> >>>
> > >>>> >>>
> > >>>> >>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> >
>
