Hi Jarek,

While reviewing the PR from uranusjr for AIP-60 I also had the feeling that a
lot of very similar code is repeated in all the providers. But during review
yesterday I dropped the idea, because if we have a common piece then we are
(potentially) locking all dependent providers together whenever the common
code changes. As for the additional dependency complexity between providers,
I actually think the additional dependency creates more problems than
benefits… It would be cool if there were an option to "inline" common code
from a single place but keep individual providers fully independent…

Jens

________________________________
From: Jarek Potiuk <ja...@potiuk.com>
Sent: Wednesday, February 21, 2024 5:42:20 PM
To: dev@airflow.apache.org <dev@airflow.apache.org>
Subject: [DISCUSS] Common.util provider?

Hello everyone,

How do we feel about introducing a common.util provider?

I know it's not been the original idea behind providers, but - after
testing common.sql and now also having common.io, it seems like more and more
we would like to extract some common code that we would like providers to
use, but we refrain from it, because it will only be actually usable 6
months after we introduce some common code.

However, if we introduce common.util, this problem is generally gone - at
the expense of more complex maintenance and cross-provider dependencies.
We should be able to add a common util method and use it in a provider at
the same time.

Think Amazon provider using a new feature released in common.util >=1.2.0
and Google provider requiring >=1.1.0. All manageable and we do it already for
common.sql. We know how to do it, we know what to avoid, we know we cannot
introduce backwards-incompatible changes, so we have to be very clear what
is and what is not a public API there. We could rather easily add tests to
prevent such backwards-incompatible changes. We even have a solution for
chicken-egg providers where we need to release two providers at the same
time if they depend on each other. Generally speaking it's quite workable
but adds a bit of overhead.
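
To make the "tests to prevent backwards-incompatible changes" idea concrete,
here is a minimal sketch, not how common.sql is actually guarded. It records
the signatures of the public functions and fails when one of them is removed
or changed. The helper round_down, the PUBLIC_API mapping and
check_public_api are all hypothetical names made up for this illustration.

```python
import inspect


def round_down(value: int, step: int = 10) -> int:
    """Hypothetical stand-in for a helper that common.util might expose."""
    return value - value % step


# Hypothetical baseline, recorded when the public API was last released.
PUBLIC_API = {
    "round_down": "(value: int, step: int = 10) -> int",
}


def check_public_api(namespace: dict, expected: dict) -> list:
    """Return a list of problems; an empty list means the API is unchanged.

    This is a deliberately strict comparison: any signature difference is
    reported, even one that would be backwards compatible in practice.
    """
    problems = []
    for name, recorded in expected.items():
        obj = namespace.get(name)
        if obj is None:
            problems.append(f"{name}: removed from the public API")
            continue
        current = str(inspect.signature(obj))
        if current != recorded:
            problems.append(f"{name}: signature changed to {current}")
    return problems
```

A test in the provider's suite could then assert that
check_public_api(...) returns an empty list, so an accidental signature
change fails CI before release.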

Examples that we could implement as "common.util":

- common versioning check with cache - where multiple providers could reuse
"do we have pendulum 2"
- more complex - some date management features (we have a few like
date_ranges/round_time). But there are many more.
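
As a rough illustration of the first bullet, a cached version check could
look something like the sketch below. It only uses the standard library; the
function names (parse_major, installed_major_version, is_pendulum_2) are
hypothetical and not an existing Airflow API.

```python
import re
from functools import lru_cache
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def parse_major(raw: str) -> Optional[int]:
    """Extract the leading numeric component of a version string."""
    match = re.match(r"\d+", raw)
    return int(match.group()) if match else None


@lru_cache(maxsize=None)
def installed_major_version(distribution: str) -> Optional[int]:
    """Return the major version of an installed distribution.

    Returns None when the distribution is not installed. The result is
    cached, so the metadata lookup runs once per distribution name.
    """
    try:
        return parse_major(version(distribution))
    except PackageNotFoundError:
        return None


def is_pendulum_2() -> bool:
    """Hypothetical shared answer to "do we have pendulum 2"."""
    return installed_major_version("pendulum") == 2
```

Every provider that needs the check would call is_pendulum_2() instead of
carrying its own copy of the lookup.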

I generally do not love the common "util" approach. It has a tendency to
become a bag of everything over time, but if we limit it to a set of small,
fully decoupled modules where each module is independent - it's OK. And we
already have it in "airflow.utils" and we seem to be doing well.

WDYT? Is it worth it?

J.
