Can someone explain to me how having multiple packages will work in practice?

How will we ensure that core changes don't break any hooks/operators?

How do we support the logging backends for s3/azure/gcp?

What would the release process be for the "sub"-packages?

There is nothing stopping someone *currently* creating their own operators package. There is nothing whatsoever special about the `airflow.operators` package namespace, and for example Google could choose to release an airflow-gcp-operators package now and tell people to `from gcp.airflow.operators import SomeNewOperator`.
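To illustrate the point: a third party could ship such a package today with ordinary setuptools packaging. Everything below (the distribution name, the module path, the version) is hypothetical, made up for illustration; it is not a real published package.

```python
# setup.py for a hypothetical third-party "airflow-gcp-operators"
# distribution. Nothing in Airflow blesses this name or import path;
# any vendor can publish a package like this right now.
from setuptools import setup, find_packages

setup(
    name="airflow-gcp-operators",    # hypothetical PyPI name
    version="0.1.0",
    packages=find_packages(),        # e.g. gcp/airflow/operators/
    install_requires=["apache-airflow"],
)
```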

My view on this is currently -1, as I don't see it solving any problem other than test speed (which is a big one, yes). It doesn't reduce the workload on the committers; rather, it increases it, by requiring a more complex release process (each sub-project would still have to follow the normal Apache voting process) and by giving us 24 repos to check for PRs rather than just 1.

Am I missing something?

("Core" vs "contrib" made sense when Airflow was still under Airbnb; we should probably just move everything from contrib into core before 2.0.0.)

-ash

airflowuser wrote on 08/01/2019 15:44:
I think operators should be placed by source system.
If it's MySQLToHiveOperator, then it would go in the MySQL package.


The BIG question here is whether this brings an actual improvement, like
faster delivery of hook/operator bug fixes to Airflow users (faster than a
full Airflow release), or whether it is merely a cosmetic change.

I assume this also covers removing the unnecessary separation of core and contrib.



‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Monday, January 7, 2019 10:16 PM, Maxime Beauchemin 
<maximebeauche...@gmail.com> wrote:

Something to think about is how data transfer operators like the
MysqlToHiveOperator usually rely on 2 hooks. With a package-specific
approach that may mean something like an `airflow-hive`, `airflow-mysql`
and `airflow-mysql-hive` packages, where the `airflow-mysql-hive` package
depends on the two other packages.
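The dependency Max describes can be sketched as follows. This is a minimal, hypothetical illustration of the pattern, not the real Airflow API: the two hook classes stand in for what the `airflow-mysql` and `airflow-hive` packages would export, and the operator stands in for what `airflow-mysql-hive` would ship, depending on both.

```python
# Hypothetical sketch: a transfer operator composing one hook from each
# of two single-system packages. Class names and methods are illustrative.

class MySqlHook:
    """Stand-in for a hook from the hypothetical airflow-mysql package."""
    def get_records(self, sql):
        # A real hook would run the query; here we return canned rows.
        return [("alice",), ("bob",)]

class HiveHook:
    """Stand-in for a hook from the hypothetical airflow-hive package."""
    def load_rows(self, rows, table):
        # A real hook would load into Hive; here we report what happened.
        return f"loaded {len(rows)} rows into {table}"

class MySqlToHiveOperator:
    """Stand-in for the airflow-mysql-hive package: needs BOTH hooks."""
    def __init__(self, sql, hive_table):
        self.sql = sql
        self.hive_table = hive_table

    def execute(self):
        rows = MySqlHook().get_records(self.sql)
        return HiveHook().load_rows(rows, self.hive_table)
```

So at the packaging level, `airflow-mysql-hive` would declare `install_requires` on both `airflow-mysql` and `airflow-hive`, which is exactly the extra coordination cost being debated.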

It's just a matter of having a clear strategy, good naming conventions, and
a central place in the docs listing the approved packages.

Max

On Mon, Jan 7, 2019 at 9:05 AM Tim Swast <sw...@google.com.invalid> wrote:

I've created AIP-8:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=100827303
as a follow-up to the discussion about splitting hooks/operators out of the
core Airflow package at
http://mail-archives.apache.org/mod_mbox/airflow-dev/201809.mbox/<308670db-bd2a-4738-81b1-3f6fb312c...@apache.org>
I propose packaging based on the target system, informed by the existing
hooks in both core and contrib. This will allow those with the relevant
expertise in each target system to respond to contributions / issues
without having to follow the flood of everything Airflow-related. It will
also decrease the surface area of the core package, helping with
testability and long-term maintenance.

Tim Swast
Software Friendliness Engineer
Google Cloud Developer Relations
Seattle, WA, USA

