I think all these cases are valid, but maybe I was not super clear. It's only the transfer operators that we need to decide where to put - not the hooks. Usually the complexity of communicating with a particular storage is (or at least should be) in the hooks rather than in the operators.

Operators should be just thin wrappers over the logic in the hooks. Hooks are going to stay where they belong - S3 hooks in amazon, GCS hooks in google.cloud, Google Sheets hooks in google.gsuite. Since we actually have a mono-repo, it will be no problem (and there will be no cross-dependency problem) to have the S3 -> GCS operator in google and use hooks from both google and amazon. I hope this alleviates your concern, Daniel?
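For illustration, such a transfer operator could be a really thin wrapper - roughly the sketch below (all module paths and hook method names here are only illustrative of the proposed layout, not final decisions):

# Illustrative path: airflow/providers/google/cloud/operators/s3_to_gcs.py
from airflow.models import BaseOperator

# The hooks stay in their own providers; the mono-repo lets the operator
# import across provider packages without any cross-dependency problem.
from airflow.providers.amazon.aws.hooks.s3 import S3Hook      # illustrative path
from airflow.providers.google.cloud.hooks.gcs import GCSHook  # illustrative path


class S3ToGCSOperator(BaseOperator):
    """Copy a single object from S3 to GCS; the storage logic lives in the hooks."""

    template_fields = ("s3_key", "gcs_object")

    def __init__(self, s3_bucket, s3_key, gcs_bucket, gcs_object,
                 aws_conn_id="aws_default", gcp_conn_id="google_cloud_default",
                 **kwargs):
        super().__init__(**kwargs)
        self.s3_bucket = s3_bucket
        self.s3_key = s3_key
        self.gcs_bucket = gcs_bucket
        self.gcs_object = gcs_object
        self.aws_conn_id = aws_conn_id
        self.gcp_conn_id = gcp_conn_id

    def execute(self, context):
        # The operator only orchestrates - S3Hook and GCSHook do the real work.
        data = S3Hook(aws_conn_id=self.aws_conn_id).read_key(
            key=self.s3_key, bucket_name=self.s3_bucket
        )
        GCSHook(gcp_conn_id=self.gcp_conn_id).upload(
            bucket_name=self.gcs_bucket, object_name=self.gcs_object, data=data
        )

Following point 3 of the proposal quoted below, this S3 -> GCS operator would sit in the "google" (target) provider, while the reverse GCS -> S3 one would sit in "amazon" - in both cases importing the same two hooks.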
J.

> What about GoogleSheetsToS3? GoogleSheetsToGCS? These you would put in the target, i.e. the storage? But GoogleSheetsToSftp would be in the google sheets operators file? The complexity, and the shared code, are in the gsheet component -- not in the storage destination.
>
> On Tue, Nov 5, 2019 at 5:46 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>
> > Hello Airflow Community,
> >
> > This email calls for a vote to update AIP-21 Changes in import paths <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-21%3A+Changes+in+import+paths> with the changes described below. The vote will last until Saturday the 8th, 2 am CEST (72 hours). Committers have a binding vote, but everyone from the community is encouraged to cast an advisory vote.
> >
> > *Summary*:
> >
> > The proposal is to update AIP-21 to move all non-core operators/hooks/sensors (and related files) to sub-packages within airflow (protocols/software/providers) or (software/providers). I am also happy to merge protocols+software, so if you have a strong opinion on it - please state it with your vote and we can decide based on the majority.
> >
> > Those packages will be separately released (schedule/process TBD) and will be backportable to the Airflow 1.10.* series, so that users can install them and start using new Airflow 2.0 operators in their Python 3 Airflow 1.10 environments (only Python 3.5+ is supported).
> >
> > We will proceed with migrating the providers package to the already agreed paths without waiting for the final vote (following the current version of AIP-21). Since we have a working POC, we know the agreed paths will work for us.
> >
> > *Previous discussions*:
> >
> > - https://lists.apache.org/thread.html/b07a93c9114e3d3c55d4ee514955bac79bc012c7a00db627c6b4c55f@%3Cdev.airflow.apache.org%3E
> > - https://lists.apache.org/thread.html/e25ddc546e367a4af3e594fecbd4431959bd5a89045e748e4206e7ff@%3Cdev.airflow.apache.org%3E
> >
> > *More Details*:
> >
> > 1) We are going in the direction of AIP-8 but not yet reaching it - focusing on separating out backportable packages installable in Airflow 1.10.* releases. Airflow 2.0 will still be installed as a whole and all the source will be kept in one repo, but we now have a way to build backportable packages for groups of operators. POC available here: https://github.com/apache/airflow/pull/6507 (based on Ash's https://github.com/ashb/airflow-submodule-test)
> >
> > 2) We move all integrations to new packages (keeping deprecated import aliases in the old places), with the following split (according to "stewardship" over the integrations):
> >
> > - *fundamentals* - the core of Airflow - they are really part of Apache Airflow. Stewards - the core Airflow team. Not backportable/separated out.
> > - *protocols* - not owned by anyone; they are public and the implementation is fully "open". There are no particular stewards (no need). Users of particular protocols should mainly maintain those and add support for different versions of the protocols.
> > - *software* - both the API and the software are controlled by someone outside of Airflow (a commercial or open-source project), but the deployment of that software is "owned" by the user installing Airflow. The stewards might also be the users, but the controlling party (Oracle, for example) might be interested in maintaining those operators as well.
> > - *providers* - API/software/deployments are fully controlled by a 3rd party. Here most likely the "provider" will be interested in maintaining the operators (and, like Google for example, provide integration guidelines <https://docs.google.com/document/d/1_rTdJSLCt0eyrAylmmgYc3yZr-_h51fVlnvMmWqhCkY/edit?usp=drive_web&ouid=112320280470690058978> for their hooks/operators/sensors).
> >
> > 3) Between-providers transfer operators should be kept at the "target" rather than the "source". For example, S3 -> GCS should be in the "google" provider, but GCS -> S3 should be in "amazon".
> >
> > 4) One-side provider transfer operators should be kept at the "provider" regardless of whether it is the target or the source. For example, GCS -> SFTP or SFTP -> GCS should be in the "google" provider.
> >
> > 5) If in doubt, we will discuss individual cases separately.
> >
> > J.
> >
> > --
> >
> > Jarek Potiuk
> > Polidea <https://www.polidea.com/> | Principal Software Engineer
> >
> > M: +48 660 796 129 <+48660796129>

--

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
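As an illustration of the "deprecated import aliases in the old places" mentioned in point 2 of the quoted proposal, such an alias module could be as small as the sketch below (the old and new paths are just examples, not final decisions):

# Hypothetical alias module left at the old location, e.g.
# airflow/operators/s3_to_gcs_operator.py, so existing DAGs keep importing
# from the old path while the real code lives in the new provider package.
import warnings

from airflow.providers.google.cloud.operators.s3_to_gcs import S3ToGCSOperator  # noqa: F401

warnings.warn(
    "This module is deprecated. Please import from "
    "airflow.providers.google.cloud.operators.s3_to_gcs instead.",
    DeprecationWarning,
    stacklevel=2,
)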