One more day to go. I would love to see some opinions on this AIP-21 update
:).

Executive summary:

* We will be moving a number of integrations to sub-packages of airflow.
* They will be backportable to 1.10.*. There will be
'apache-airflow-[package]-backport' PyPI packages, installable with Python 3,
that will make Airflow 2.0 operators/hooks etc. available alongside the
1.10.* operators.
* The current proposal for sub-packages is "protocols/software/providers/"
(but if you think merging protocols and software makes sense, please
express your opinion).
* We are not moving "fundamental" operators/hooks etc.
* Airflow 2.0 is still going to be installed as a single package with all
operators (so we are not yet implementing AIP-8).
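
To make the move less abstract: the proposal quoted below keeps deprecated
import aliases in the old places. Here is a minimal sketch of one way such an
alias could behave. It is not how the POC does it, and GcsHook and both paths
are made-up stand-ins, not real Airflow names.

```python
import warnings


class GcsHook:
    """Stand-in for a hook that now lives in its new sub-package."""

    def __init__(self, conn_id="default"):
        self.conn_id = conn_id


def deprecated_alias(new_cls, old_path, new_path):
    """Return a subclass of new_cls that warns whenever the old name is used."""

    class Alias(new_cls):
        def __init__(self, *args, **kwargs):
            warnings.warn(
                "{} is deprecated; use {} instead".format(old_path, new_path),
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller's line
            )
            super().__init__(*args, **kwargs)

    Alias.__name__ = new_cls.__name__
    return Alias


# What a module left behind at the old location might export.
OldGcsHook = deprecated_alias(
    GcsHook,
    old_path="old_location.gcs_hook.GcsHook",        # hypothetical old path
    new_path="providers.google.cloud.gcs.GcsHook",   # hypothetical new path
)
```

Existing DAGs that import the old name would keep working and only see a
DeprecationWarning, which is the behaviour we want during the transition.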

J.

On Wed, Nov 6, 2019 at 10:07 AM Jarek Potiuk <[email protected]>
wrote:

> I think all these cases are valid, but maybe I was not super clear. It's
> only the transfer operators that we need to decide where to put, not
> hooks.
> Usually the complexity of communication with particular storages is (or at
> least should be) in the Hooks rather than Operators.
>
> Operators should be just thin wrappers over the logic in the hooks.
> Hooks are going to stay where they belong - S3 Hooks in amazon, GCS Hooks
> in google.cloud, GoogleSheet Hooks in google.gsuite.
>
> Since we actually have a mono-repo, it will be no problem (and no
> cross-dependency problem) to have an S3 -> GCS operator in google that
> uses hooks from both google and amazon.
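
A rough sketch of what "thin wrapper" means here, with in-memory stand-ins
instead of real hooks. InMemoryStorageHook and TransferOperator are
hypothetical names used only for illustration; the real hooks would hold the
storage-specific logic.

```python
class InMemoryStorageHook:
    """Hypothetical hook: owns all the logic for talking to one storage."""

    def __init__(self):
        self._objects = {}

    def upload(self, key, data):
        self._objects[key] = data

    def download(self, key):
        return self._objects[key]

    def list_keys(self, prefix=""):
        return sorted(k for k in self._objects if k.startswith(prefix))


class TransferOperator:
    """Hypothetical thin operator: it only wires a source hook to a target hook."""

    def __init__(self, source_hook, target_hook, prefix=""):
        self.source_hook = source_hook
        self.target_hook = target_hook
        self.prefix = prefix

    def execute(self):
        copied = []
        for key in self.source_hook.list_keys(self.prefix):
            # All storage-specific work happens inside the hooks.
            self.target_hook.upload(key, self.source_hook.download(key))
            copied.append(key)
        return copied


s3_hook = InMemoryStorageHook()   # stands in for a hook from the amazon package
gcs_hook = InMemoryStorageHook()  # stands in for a hook from the google package
s3_hook.upload("data/a.csv", b"1,2")
s3_hook.upload("logs/x.txt", b"skip")
copied = TransferOperator(s3_hook, gcs_hook, prefix="data/").execute()
```

The operator has no storage logic of its own, so which package it lives in
does not affect where the complexity is maintained.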
>
> I hope this alleviates your concern, Daniel?
>
> J.
>
>
>> What about GoogleSheetsToS3?  GoogleSheetsToGCS?  These you would put in
>> the target, i.e. the storage?  But GoogleSheetsToSftp would be in google
>> sheets operators file?  The complexity, and the shared code, are in the
>> gsheet component -- not in the storage destination.
>>
>>
>
>
>
>> On Tue, Nov 5, 2019 at 5:46 PM Jarek Potiuk <[email protected]>
>> wrote:
>>
>> > Hello Airflow Community,
>> >
>> > This email calls for a vote to update AIP-21 Changes in import paths
>> > <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-21%3A+Changes+in+import+paths>
>> > with the changes described below. The vote will last till Saturday 8th 2am
>> > CEST (72 hours). Committers have a binding vote but everyone from the
>> > community is encouraged to cast an advisory vote.
>> >
>> > *Summary*:
>> >
>> > The proposal is to update AIP-21 to move all non-core
>> > operators/hooks/sensors (and related files) to sub-packages within
>> > airflow (protocols/software/providers) or (software/providers).
>> > I am also happy to merge protocols+software, so if you have a strong
>> > opinion on it, please state it with your vote and we can decide based
>> > on the majority.
>> >
>> > Those packages will be released separately (schedule/process TBD) and
>> > will be backportable to the 1.10.* Airflow series, so that users can
>> > install them and start using the new Airflow 2.0 operators in their
>> > Python 3 Airflow 1.10 environments (only Python 3.5+ is supported).
>> >
>> > We will proceed with migrating the providers package to the already
>> > agreed paths without waiting for the final vote (following the current
>> > version of AIP-21). Since we have a working POC, we know the agreed paths
>> > will work for us.
>> >
>> > *Previous discussions:*
>> >
>> >    - https://lists.apache.org/thread.html/b07a93c9114e3d3c55d4ee514955bac79bc012c7a00db627c6b4c55f@%3Cdev.airflow.apache.org%3E
>> >    - https://lists.apache.org/thread.html/e25ddc546e367a4af3e594fecbd4431959bd5a89045e748e4206e7ff@%3Cdev.airflow.apache.org%3E
>> >
>> > *More Details*:
>> >
>> > 1) We are going in the direction of AIP-8 but not yet reaching it; the
>> > focus is on separating out backportable packages installable in Airflow
>> > 1.10.* releases. Airflow 2.0 will still be installed as a whole and all
>> > the source will be kept in one repo, but we now have a way to build
>> > backportable packages for groups of operators. POC available here:
>> > https://github.com/apache/airflow/pull/6507 (based on Ash's
>> > https://github.com/ashb/airflow-submodule-test)
>> >
>> > 2) We move all integrations to new packages (keeping deprecated import
>> > aliases in the old places). The split is as follows (according to
>> > "stewardship" over the integrations):
>> >
>> >    - *fundamentals* - core of Airflow - they are really part of Apache
>> >    Airflow. Stewards - core Airflow team. Not backportable/separated out.
>> >    - *protocols* - are not owned by anyone, they are public and the
>> >    implementation is fully "open". There are no particular stewards (no
>> >    need). Users of particular protocols should mainly maintain those and
>> >    add support for different versions of the protocols.
>> >    - *software* - both API and software are controlled by someone outside
>> >    of Airflow (commercial or open-source project), but the deployment of
>> >    that software is "owned" by the user installing Airflow. The
>> >    "stewardship" might also fall to the users, but the controlling party
>> >    (Oracle for example) might be interested in maintaining those operators
>> >    as well.
>> >    - *providers* - API/software/deployments are fully controlled by a 3rd
>> >    party. Here most likely the "provider" will be interested in
>> >    maintaining the operators (and, for example, like Google, provide
>> >    integration guidelines
>> >    <https://docs.google.com/document/d/1_rTdJSLCt0eyrAylmmgYc3yZr-_h51fVlnvMmWqhCkY/edit?usp=drive_web&ouid=112320280470690058978>
>> >    for their hooks/operators/sensors).
>> >
>> >
>> > 3) Between-providers transfer operators should be kept at the "target"
>> > rather than the "source".
>> > For example S3 -> GCS should be in the "google" provider, but GCS -> S3
>> > should be in "amazon".
>> >
>> > 4) One-side provider transfer operators should be kept at the "provider"
>> > regardless of whether it is the target or the source.
>> > For example GCS -> SFTP or SFTP -> GCS should be in the "google" provider.
>> >
>> > 5) If in doubt we will discuss individual cases separately.
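
Points 3) to 5) can be read as a small decision rule. The sketch below is
only an illustration of that rule; the function name and the PROVIDERS set
are assumptions, and the set is an illustrative subset, not a proposed list.

```python
def transfer_operator_home(source, target, providers):
    """Pick the package for a source -> target transfer operator.

    providers is the set of names treated as providers; anything else
    (for example a protocol such as SFTP) is not a provider.
    """
    source_is_provider = source in providers
    target_is_provider = target in providers
    if source_is_provider and target_is_provider:
        return target  # point 3: between providers, keep it at the target
    if source_is_provider:
        return source  # point 4: one-sided, keep it at the provider
    if target_is_provider:
        return target  # point 4: one-sided, the provider is the target
    return None        # point 5: no rule applies, discuss the case


PROVIDERS = {"google", "amazon"}  # illustrative subset only

s3_to_gcs = transfer_operator_home("amazon", "google", PROVIDERS)
gcs_to_sftp = transfer_operator_home("google", "sftp", PROVIDERS)
```

A None result means the case falls under point 5) and needs a separate
discussion.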
>> >
>> > J.
>> >
>> > --
>> >
>> > Jarek Potiuk
>> > Polidea <https://www.polidea.com/> | Principal Software Engineer
>> >
>> > M: +48 660 796 129 <+48660796129>
>> > [image: Polidea] <https://www.polidea.com/>
>> >
>>
>
>
>
>

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>
