Do we need to include `-backport`? What was the thinking behind that? I think software and protocol should be merged. I would also say _everything_ is a provider, so `airflow.providers.ssh.SSHOperator`, for instance, is what I would prefer.
-a

On 8 November 2019 08:32:42 GMT, Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>One more day to go. I would love to see some opinions on this AIP-21
>update :).
>
>Executive summary:
>
>* we will be moving a number of integrations to sub-packages of airflow.
>* they will be backportable to 1.10.*. There will be
>'apache-airflow-[package]-backport' pypi installable with python 3 that
>will make Airflow 2.0 operators/hooks etc. available with 1.10*
>operators.
>* the current proposal for sub-packages is
>"protocols/software/providers/" (but if you think merging protocols and
>software makes sense - please express your opinion)
>* we are not moving "fundamental" operators/hooks etc.
>* Airflow 2.0 is still going to be installed as a single package with
>all operators (so we are not yet implementing AIP-8)
>
>J.
>
>On Wed, Nov 6, 2019 at 10:07 AM Jarek Potiuk <jarek.pot...@polidea.com>
>wrote:
>
>> I think all these cases are valid but maybe I was not super-clear.
>> It's only the transfer operators that we need to decide where to put -
>> not hooks.
>> Usually the complexity of communication with particular storages is
>> (or at least should be) in the Hooks rather than Operators.
>>
>> Operators should be just thin wrappers over the logic in the hooks.
>> Hooks are going to stay where they belong - S3 Hooks in amazon, GCS
>> Hooks in google.cloud, GoogleSheet Hooks in google.gsuite.
>>
>> Since we actually have a mono-repo - it will be no problem (and no
>> cross-dependencies problem) to have the S3 -> GCS operator in google
>> and use hooks from both google/amazon.
>>
>> I hope this alleviates your concern, Daniel?
>>
>> J.
>>
>>
>>> What about GoogleSheetsToS3? GoogleSheetsToGCS? These you would put
>>> in the target, i.e. the storage? But GoogleSheetsToSftp would be in
>>> the google sheets operators file? The complexity, and the shared
>>> code, are in the gsheet component -- not in the storage destination.
>>>
>>>
>>> On Tue, Nov 5, 2019 at 5:46 PM Jarek Potiuk <jarek.pot...@polidea.com>
>>> wrote:
>>>
>>> > Hello Airflow Community,
>>> >
>>> > This email calls for a vote to update AIP-21 Changes in import paths
>>> > <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-21%3A+Changes+in+import+paths>
>>> > with the changes described below. The vote will last till Saturday
>>> > 8th 2am CEST (72 hours). Committers have a binding vote but everyone
>>> > from the community is encouraged to cast an advisory vote.
>>> >
>>> > *Summary*:
>>> >
>>> > The proposal is to update AIP-21 to move all non-core
>>> > operators/hooks/sensors (and related files) to sub-packages within
>>> > airflow (protocols/software/providers) or (software/providers).
>>> > I am also happy to merge protocols+software, so if you have a strong
>>> > opinion on it - please state it with your vote and we can decide
>>> > based on majority.
>>> >
>>> > Those packages will be separately released (schedule/process TBD)
>>> > and will be backportable to the 1.10.* airflow series, so that users
>>> > can install them and start using new Airflow 2.0 operators in their
>>> > Python 3 Airflow 1.10 environments (only Python 3.5+ is supported).
>>> >
>>> > We will proceed with migrating the providers package to already
>>> > agreed paths without waiting for the final vote (following the
>>> > current version of AIP-21). Since we have a working POC - we know
>>> > the agreed paths will work for us.
>>> >
>>> > *Previous discussions:*
>>> >
>>> >    - https://lists.apache.org/thread.html/b07a93c9114e3d3c55d4ee514955bac79bc012c7a00db627c6b4c55f@%3Cdev.airflow.apache.org%3E
>>> >    - https://lists.apache.org/thread.html/e25ddc546e367a4af3e594fecbd4431959bd5a89045e748e4206e7ff@%3Cdev.airflow.apache.org%3E
>>> >
>>> > *More Details*:
>>> >
>>> > 1) Information that we are going in the direction of AIP-8 but not
>>> > yet reaching it - focusing on separating out backportable packages
>>> > installable in Airflow releases 1.10.*. Airflow 2.0 will still be
>>> > installed as a whole and all the source will be kept in one repo,
>>> > but we now have a way to build backportable packages for groups of
>>> > operators. POC available here:
>>> > https://github.com/apache/airflow/pull/6507 (based on Ash's
>>> > https://github.com/ashb/airflow-submodule-test)
>>> >
>>> > 2) We move all integrations to new packages (keeping deprecated
>>> > import aliases in the old places). The following split (according to
>>> > "stewardship" over the integrations):
>>> >
>>> >    - *fundamentals* - core of airflow - they are really part of
>>> >    Apache Airflow. Stewards - core Airflow team. Not
>>> >    backportable/separated out.
>>> >    - *protocols* - are not owned by anyone, they are public and the
>>> >    implementation is fully "open". There are no particular stewards
>>> >    (no need). Users of particular protocols should mainly maintain
>>> >    those and add support for different versions of the protocols.
>>> >    - *software* - both API and software are controlled by someone
>>> >    outside of Airflow (commercial or open-source project), but the
>>> >    deployment of that software is "owned" by the user installing
>>> >    Airflow. The "stewards" might also be the users, but the
>>> >    controlling party (Oracle for example) might be interested in
>>> >    maintaining those operators as well.
>>> >    - *providers* - API/software/deployments are fully controlled by
>>> >    a 3rd party. Here most likely the "provider" will be interested in
>>> >    maintaining the operators (and for example like Google - provide
>>> >    integration guidelines
>>> >    <https://docs.google.com/document/d/1_rTdJSLCt0eyrAylmmgYc3yZr-_h51fVlnvMmWqhCkY/edit?usp=drive_web&ouid=112320280470690058978>
>>> >    for their hooks/operators/sensors)
>>> >
>>> >
>>> > 3) Between-providers transfer operators should be kept at the
>>> > "target" rather than "source".
>>> > For example S3 -> GCS should be in "google" provider, but GCS -> S3
>>> > should be in "amazon".
>>> >
>>> > 4) One-side provider transfer operators should be kept at the
>>> > "provider" regardless if they are target or source.
>>> > For example GCS -> SFTP or SFTP -> GCS should be in "google"
>>> > provider.
>>> >
>>> > 5) If in doubt we will discuss individual cases separately.
>>> >
>>> > J.
>>> >
>>> > --
>>> >
>>> > Jarek Potiuk
>>> > Polidea <https://www.polidea.com/> | Principal Software Engineer
>>> >
>>> > M: +48 660 796 129
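
For reference, the placement rules for transfer operators in points 3) and 4) of the quoted proposal can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `transfer_operator_home` and the `PROVIDERS` set are hypothetical and not part of Airflow, and the names "google", "amazon" and "sftp" are taken from the examples in the proposal rather than from any final package layout.

```python
# Hypothetical set of integrations that count as "providers" in the
# proposal's sense. The names come from the examples in the thread only.
PROVIDERS = {"google", "amazon"}


def transfer_operator_home(source: str, target: str) -> str:
    """Return the package that would host a source -> target transfer operator.

    `source` and `target` name the integration on each side of the
    transfer, e.g. "amazon" for S3, "google" for GCS, "sftp" for SFTP.
    """
    # Rule 3: between two providers, the operator lives with the target.
    # Rule 4 (target side): if only the target is a provider, it also
    # lives with the target, so both cases reduce to the same check.
    if target in PROVIDERS:
        return target
    # Rule 4 (source side): if only the source is a provider, the
    # operator lives with that provider.
    if source in PROVIDERS:
        return source
    # Rule 5: anything else is left for case-by-case discussion.
    raise ValueError(
        f"no provider on either side of {source} -> {target}; discuss separately"
    )
```

With these assumptions, `transfer_operator_home("amazon", "google")` places S3 -> GCS in "google", while both `transfer_operator_home("google", "sftp")` and `transfer_operator_home("sftp", "google")` resolve to "google". Hooks are not covered by this sketch, since the thread says they stay with the integration they talk to.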