I updated the spreadsheet and moved the Bash and Python operators into fundamentals. Apache projects are now also treated the same way as "proprietary" providers.
I will restart the vote then :).

J.

On Mon, Nov 11, 2019 at 7:21 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:

> Ok. Happy to move it back then :). No problem with that.
>
> According to the rules of AIP-21 it should actually be: "*from
> airflow.providers.kubernetes.operators.pod import KubernetesPodOperator*"
> (Case 2A (drop _operator in module name) + Case 5B (keep Operator in
> class name)). We can have more than just a Pod operator for Kubernetes
> (KubernetesPod, KubernetesVolume, KubernetesIstio, and many more), so
> keeping KubernetesPod in the class name and having a separate module for
> pod operator(s?) makes sense IMHO.
>
> It's similar to *from airflow.providers.google.cloud.operators.pubsub
> import PubSubTopicCreateOperator*, for example.
>
> Re: remote log storage - indeed. That should be part of AIP-8.
>
> J,
>
> On Mon, Nov 11, 2019 at 6:36 PM Ash Berlin-Taylor <a...@apache.org> wrote:
>
>> +1 for Python and Bash being in the stock install -- they are just _so_
>> commonly used that I think it makes sense to keep them in the base install.
>> (And the virtualenv module is not an onerous dep, and has not caused us any
>> problems. Yet.)
>>
>> Kubernetes is also a slightly funny one since the deps for that will be
>> in "core" anyway thanks to the Kube executor, but I think it probably makes
>> sense to have `from airflow.providers.kubernetes.operators import
>> KubernetesOperator`. Is that the pattern we are going with for the
>> "one-level" providers, or will it be `from
>> airflow.providers.kubernetes.operators.pod_operator import
>> KubernetesOperator`?
>>
>> Possibly more an AIP-8 question: with moving Azure Blob/S3/GCS to
>> separate packages we might have to look at how we enable remote log storage.
>> >> -a >> >> >> > On 11 Nov 2019, at 15:53, Jarek Potiuk <jarek.pot...@polidea.com> >> wrote: >> > >> > On Mon, Nov 11, 2019 at 4:22 PM Kamil Breguła < >> kamil.breg...@polidea.com <mailto:kamil.breg...@polidea.com>> >> > wrote: >> > >> >> One more question. Are you sure you want to move Python and Bash from >> >> core? These are the elements that are installed in every environment >> >> because they are required by Airflow, so moving them to a separate >> >> installed package is pointless in my opinion. >> >> >> >> I have no problem with moving them to "fundamentals", but I am not >> sure if >> > they are really required ? I looked through the code and other than few >> > examples and tests, they are not really "required". Maybe that's >> enough to >> > keep them in fundamentals, >> > Also Python operator has some dependencies - virtualenv - which is only >> > required for this operator so maybe it's worth to keep it separate from >> > "fundamentals". >> > >> > >> >> On Mon, Nov 11, 2019 at 3:07 PM Kaxil Naik <kaxiln...@gmail.com> >> wrote: >> >>> >> >>> I am fine with this list +1 >> >>> >> >>> On Mon, Nov 11, 2019 at 1:27 PM Jarek Potiuk < >> jarek.pot...@polidea.com> >> >>> wrote: >> >>> >> >>>> I am all for it Kamil! >> >>>> >> >>>> Super happy to treat Apache projects in the same way as "proprietary" >> >>>> providers :). Anyone else has some other comments ? >> >>>> >> >>>> J. >> >>>> >> >>>> On Mon, Nov 11, 2019 at 2:17 PM Kamil Breguła < >> >> kamil.breg...@polidea.com> >> >>>> wrote: >> >>>> >> >>>>> I looked at this list and I'm only worried about two operators. >> >>>>> >> >>>>> airflow.contrib.operators.vertica_to_hive >> >>>>> airflow.contrib.operators.s3_to_hive >> >>>>> >> >>>>> If we want the operators to be grouped according to destination, >> then >> >>>>> this operator should be in apache package. It is the members of the >> >>>>> Apache community who will care most about this operator being of >> high >> >>>>> quality. 
Apache can be treated equally with other large cloud >> >>>>> providers, such as GCP, AWS. I can imagine that a new Apache product >> >>>>> will appear and it will want to promote the same way as products of >> >>>>> cloud providers are promoted. By creating a large number of >> >>>>> integrations that allow you to copy data to its operating range. >> >>>>> There's another cases - building a strong Apache community. As a >> >>>>> member of the Apache community, we should promote Apache products to >> >>>>> ensure that the development of the community is correct, and >> >> therefore >> >>>>> also for integration into our products with other products. >> >>>>> >> >>>>> On Mon, Nov 11, 2019 at 12:28 AM Jarek Potiuk < >> >> jarek.pot...@polidea.com> >> >>>>> wrote: >> >>>>>> >> >>>>>> Just to select the "packages" for this update. Anyone has >> >> objections >> >>>> for >> >>>>>> this structure (details including transfer operators in >> >>>>>> >> >>>>>> https://docs.google.com/spreadsheets/d/17zA5t2JVxnDdg5Cs1Cg_ >> >>>>>> Mb1GXvGctmesfg2L089QSOk/edit#gid=0? 
>>>>>> *Fundamentals (no change)*
>>>>>>
>>>>>> providers
>>>>>>   google
>>>>>>     cloud
>>>>>>     gsuite
>>>>>>     marketing_platform
>>>>>>   amazon
>>>>>>     aws
>>>>>>   microsoft
>>>>>>     azure
>>>>>>   apache
>>>>>>     cassandra
>>>>>>     druid
>>>>>>     hadoop
>>>>>>     hive
>>>>>>     pig
>>>>>>     pinot
>>>>>>     spark
>>>>>>     sqoop
>>>>>>   mysql
>>>>>>   jira
>>>>>>   databricks
>>>>>>   datadog
>>>>>>   dingding
>>>>>>   discord
>>>>>>   cloudant
>>>>>>   jenkins
>>>>>>   opsgenie
>>>>>>   qubole
>>>>>>   salesforce
>>>>>>   segment
>>>>>>   slack
>>>>>>   snowflake
>>>>>>   vertica
>>>>>>   zendesk
>>>>>>   celery
>>>>>>   docker
>>>>>>   bash
>>>>>>   kubernetes
>>>>>>   mssql
>>>>>>   mongodb
>>>>>>   mysql
>>>>>>   openfaas
>>>>>>   oracle
>>>>>>   papermill
>>>>>>   postgres
>>>>>>   presto
>>>>>>   python
>>>>>>   redis
>>>>>>   samba
>>>>>>   sqlite
>>>>>>   imap
>>>>>>   ssh
>>>>>>   filesystem
>>>>>>   sftp
>>>>>>   ftp
>>>>>>   http
>>>>>>   grpc
>>>>>>   smtp
>>>>>>   jdbc
>>>>>>   winrm
>>>>>>
>>>>>> On Fri, Nov 8, 2019 at 5:47 PM Jarek Potiuk <jarek.pot...@polidea.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Let me then cancel this vote and I will restart it next week.
>>>>>>>
>>>>>>> Yeah. It's a bit like re-opening the Pandora's box but now that we
>>>>>>> know that we can do it, and we are unblocked in moving to google
>>>>>>> (which is now the biggest move in-progress), we can spend more time
>>>>>>> on getting better (and more final) consensus.
>>>>>>> I decided to go through the list from the docs (once again Kamil -
>>>>>>> great that you did it) and prepared this spreadsheet showing the
>>>>>>> structure. I went through ALL the operators and put them in the right
>>>>>>> place where our current rules place them.
>>>>>>>
>>>>>>> After this exercise, I think that makes sense:
>>>>>>> - put all the stuff except fundamentals in *"providers"* (everything
>>>>>>> in "providers" will be potentially backportable).
>> >>>>>>> - grouping apache projects under *"apache"* - similar to >> >>>>>>> google/amazon/microsoft (different kind of ownership but still >> >> it is >> >>>> an >> >>>>>>> ownership) >> >>>>>>> - for the rest I think what we can do is really to put the >> >> operators >> >>>> in >> >>>>>>> folders per "service/company" (without sub-packages). That >> >> includes >> >>>>>>> sftp/ssh/ftp etc (should we group [ftp and sftp] or [ssh and >> >> sftp] >> >>>> ??). >> >>>>>>> there is no "ownership" there and no reason to group them. That >> >> will >> >>>>> put >> >>>>>>> "operators/hooks/sensors" at different levels in the directory >> >> tree >> >>>>> but we >> >>>>>>> already have that for fundamentals and I am not too worried about >> >>>>> that. We >> >>>>>>> do not have to have everything at the same level. >> >>>>>>> - I put transfer operators according to the rule where "to" side >> >> is >> >>>>> more >> >>>>>>> important unless the other side is a public protocol (so sftp -> >> >> gcs >> >>>>> and >> >>>>>>> gcs -> sftp both go to google/gcp). I did not have any doubt >> >> where to >> >>>>> put >> >>>>>>> which transfer operator, so this is a good sign: >> >>>>>>> >> >>>>>>> >> >>>>>>> >> >>>>> >> >>>> >> >> >> https://docs.google.com/spreadsheets/d/17zA5t2JVxnDdg5Cs1Cg_Mb1GXvGctmesfg2L089QSOk/edit#gid=0 >> >>>>>>> >> >>>>>>> Can you please take a look and express your opinions here so >> >> that we >> >>>>> can >> >>>>>>> have final voting next week (for those who are not yet tired >> >> with the >> >>>>>>> discussion ;)). >> >>>>>>> >> >>>>>>> J. >> >>>>>>> >> >>>>>>> On Fri, Nov 8, 2019 at 4:38 PM Kaxil Naik <kaxiln...@gmail.com> >> >>>> wrote: >> >>>>>>> >> >>>>>>>> Yes, that makes sense. >> >>>>>>>> >> >>>>>>>> On Fri, Nov 8, 2019 at 3:22 PM Kamil Breguła < >> >>>>> kamil.breg...@polidea.com> >> >>>>>>>> wrote: >> >>>>>>>> >> >>>>>>>>> In the case of Hadoop, it is published by Apache, so it can >> >> be in >> >>>>> the >> >>>>>>>>> apache directory. 
This will mimic the grouping presented in >> >> the >> >>>>>>>>> documentation. >> >>>>>>>>> >> >>>>>>>> >> >>>>> >> >>>> >> >> >> https://airflow.readthedocs.io/en/latest/operators-and-hooks-ref.html#software-operators-and-hooks >> >>>>>>>>> >> >>>>>>>>> On Fri, Nov 8, 2019 at 3:47 PM Kaxil Naik < >> >> kaxiln...@gmail.com> >> >>>>> wrote: >> >>>>>>>>>> >> >>>>>>>>>> I think we should keep the vote open at least until mid next >> >>>> week >> >>>>> to >> >>>>>>>> have >> >>>>>>>>>> more thought and inputs on this one. >> >>>>>>>>>> >> >>>>>>>>>> In general, I am happy with the approach but >> >> operators/hooks and >> >>>>>>>> sensors >> >>>>>>>>>> shouldn't be a provider. "hadoop" can be its provider and >> >> hdfs >> >>>>> can be >> >>>>>>>> a >> >>>>>>>>>> part of it. >> >>>>>>>>>> >> >>>>>>>>>> providers/ >> >>>>>>>>>> google >> >>>>>>>>>> cloud >> >>>>>>>>>> operators >> >>>>>>>>>> hooks >> >>>>>>>>>> sensors >> >>>>>>>>>> gsuite >> >>>>>>>>>> operators >> >>>>>>>>>> ... >> >>>>>>>>>> amazon >> >>>>>>>>>> aws >> >>>>>>>>>> operators >> >>>>>>>>>> ... >> >>>>>>>>>> microsoft >> >>>>>>>>>> azure >> >>>>>>>>>> operators >> >>>>>>>>>> ... >> >>>>>>>>>> hadoop >> >>>>>>>>>> hdfs >> >>>>>>>>>> operators >> >>>>>>>>>> ... >> >>>>>>>>>> >> >>>>>>>>>> We can also define what is a "provider" so we know what to >> >> add >> >>>> in >> >>>>> it >> >>>>>>>> in >> >>>>>>>>> the >> >>>>>>>>>> future. SSH/FTP/SFTP belongs to the same family group. Do we >> >>>> want >> >>>>> to >> >>>>>>>> have >> >>>>>>>>>> separate providers for each one of them ??? >> >>>>>>>>>> >> >>>>>>>>>> Regards, >> >>>>>>>>>> Kaxil >> >>>>>>>>>> >> >>>>>>>>>> On Fri, Nov 8, 2019 at 9:08 AM Jarek Potiuk < >> >>>>> jarek.pot...@polidea.com >> >>>>>>>>> >> >>>>>>>>>> wrote: >> >>>>>>>>>> >> >>>>>>>>>>> I really like to make everything a provider. That's a >> >> great >> >>>>> idea ! >> >>>>>>>>> This way >> >>>>>>>>>>> everything "backportable" will have to be in "providers" >> >>>>> package. 
>> >>>>>>>>> Really >> >>>>>>>>>>> nice and clean separation (and less mess in "airflow"). >> >> And we >> >>>>> will >> >>>>>>>> not >> >>>>>>>>>>> have to have any artificial grouping (we can still group >> >> them >> >>>>> at the >> >>>>>>>>>>> documentation level). >> >>>>>>>>>>> >> >>>>>>>>>>> We do not need backport in name. And I think it's more of >> >>>>> technical >> >>>>>>>>> detail >> >>>>>>>>>>> on naming the package which we can work out while >> >> reviewing >> >>>> PRs >> >>>>> and >> >>>>>>>> we >> >>>>>>>>> can >> >>>>>>>>>>> agree final naming of the released packaged on PMC level >> >> (PMCs >> >>>>> will >> >>>>>>>>> have to >> >>>>>>>>>>> vote on releasing those). >> >>>>>>>>>>> >> >>>>>>>>>>> The thinking is that it's intention is really to be only >> >>>>> backported >> >>>>>>>> to >> >>>>>>>>> 1.10 >> >>>>>>>>>>> - we are not going (yet) to use the packages in Airflow >> >> 2.*. >> >>>> so >> >>>>> I >> >>>>>>>>> thought >> >>>>>>>>>>> by naming them backport we can express that intent more >> >>>> clearly. >> >>>>>>>>>>> >> >>>>>>>>>>> So let me clarify the structure of folders we are going to >> >>>> have >> >>>>> if >> >>>>>>>> we >> >>>>>>>>>>> follow it (i just added some examples) including the >> >> already >> >>>>> agreed >> >>>>>>>>> changes >> >>>>>>>>>>> from AIP-21: >> >>>>>>>>>>> >> >>>>>>>>>>> providers/ >> >>>>>>>>>>> google >> >>>>>>>>>>> cloud >> >>>>>>>>>>> operators >> >>>>>>>>>>> hooks >> >>>>>>>>>>> sensors >> >>>>>>>>>>> gsuite >> >>>>>>>>>>> operators >> >>>>>>>>>>> ... >> >>>>>>>>>>> amazon >> >>>>>>>>>>> aws >> >>>>>>>>>>> operators >> >>>>>>>>>>> ... >> >>>>>>>>>>> microsoft >> >>>>>>>>>>> azure >> >>>>>>>>>>> operators >> >>>>>>>>>>> ... 
>> >>>>>>>>>>> operators >> >>>>>>>>>>> sqlite.py >> >>>>>>>>>>> oracle.py >> >>>>>>>>>>> docker.py >> >>>>>>>>>>> hooks >> >>>>>>>>>>> hdfs.py >> >>>>>>>>>>> sqlite.py >> >>>>>>>>>>> sensors >> >>>>>>>>>>> http.py >> >>>>>>>>>>> sql.py >> >>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>>> J. >> >>>>>>>>>>> >> >>>>>>>>>>> On Fri, Nov 8, 2019 at 9:43 AM Ash Berlin-Taylor < >> >>>>> a...@apache.org> >> >>>>>>>>> wrote: >> >>>>>>>>>>> >> >>>>>>>>>>>> Do we need to include `-backport,`? What was the >> >> thinking >> >>>>> behind >> >>>>>>>>> that? >> >>>>>>>>>>>> >> >>>>>>>>>>>> I think software and protocol should be merged. I would >> >> also >> >>>>> say >> >>>>>>>>>>>> _everything_ is a provider, so >> >>>>> airflow.providers.ssh.SSHOperator >> >>>>>>>> for >> >>>>>>>>>>>> instance is what I would prefer >> >>>>>>>>>>>> >> >>>>>>>>>>>> -a >> >>>>>>>>>>>> >> >>>>>>>>>>>> On 8 November 2019 08:32:42 GMT, Jarek Potiuk < >> >>>>>>>>> jarek.pot...@polidea.com> >> >>>>>>>>>>>> wrote: >> >>>>>>>>>>>>> One more day to go. I would love to see some opinions >> >> on >> >>>> this >> >>>>>>>> AIP-21 >> >>>>>>>>>>>>> update >> >>>>>>>>>>>>> :). >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> Executive summary: >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> * we will be moving a number of integrations to >> >>>> sub-packages >> >>>>> of >> >>>>>>>>>>>>> airflow. >> >>>>>>>>>>>>> * they will be backportable to 1.10.*. There will be >> >>>>>>>>>>>>> 'apache-airflow-[package]-backport' pypi installable >> >> with >> >>>>> python >> >>>>>>>> 3 >> >>>>>>>>> that >> >>>>>>>>>>>>> will make Airflow 2.0 operators/hooks etc. available >> >> with >> >>>>> 1.10* >> >>>>>>>>>>>>> operators. >> >>>>>>>>>>>>> * the current proposal for sub-packages is >> >>>>>>>>>>>>> "protocols/software/providers/" >> >>>>>>>>>>>>> (but if you think merging protocols and software makes >> >>>> sense >> >>>>> - >> >>>>>>>>> please >> >>>>>>>>>>>>> express your opinion >> >>>>>>>>>>>>> * we are not moving "fundamental" operators/hooks etc.. 
>> >>>>>>>>>>>>> * Airflow 2.0 is still going to be installed as a >> >> single >> >>>>> package >> >>>>>>>>> with >> >>>>>>>>>>>>> all >> >>>>>>>>>>>>> operators (so we are not yet implementing AIP-8) >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> J. >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> On Wed, Nov 6, 2019 at 10:07 AM Jarek Potiuk < >> >>>>>>>>> jarek.pot...@polidea.com> >> >>>>>>>>>>>>> wrote: >> >>>>>>>>>>>>> >> >>>>>>>>>>>>>> I think all this cases are valid but maybe I was not >> >>>>>>>> super-clear. >> >>>>>>>>>>>>> It's >> >>>>>>>>>>>>>> only the transfer operators that we need to decide >> >> where >> >>>> to >> >>>>>>>> put - >> >>>>>>>>> not >> >>>>>>>>>>>>>> hooks. >> >>>>>>>>>>>>>> Usually the complexity of communication with >> >> particular >> >>>>>>>> storages >> >>>>>>>>> is >> >>>>>>>>>>>>> (or at >> >>>>>>>>>>>>>> least should be) in the Hooks rather than Operators. >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> Operators should be just thin wrappers over the >> >> logic in >> >>>>> the >> >>>>>>>>> hooks. >> >>>>>>>>>>>>>> Hooks are going to stay where they belong - S3 Hooks >> >> in >> >>>>> amazon, >> >>>>>>>>> GCS >> >>>>>>>>>>>>> Hooks >> >>>>>>>>>>>>>> in google.cloud, GoogleSheet Hooks in google.gsuite. >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> Since we actually have mono-repo - this will be no >> >>>> problem >> >>>>>>>> (and no >> >>>>>>>>>>>>> cross >> >>>>>>>>>>>>>> dependencies problem) to have S3 -> GCS operator in >> >>>>> google and >> >>>>>>>>> use >> >>>>>>>>>>>>> hooks >> >>>>>>>>>>>>>> from both google/amazon. >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> I hope this alleviates your concern Daniel ? >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> J. >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> What about GoogleSheetsToS3? GoogleSheetsToGCS? >> >> These >> >>>>> you >> >>>>>>>> would >> >>>>>>>>>>>>> put in >> >>>>>>>>>>>>>>> the target, i.e. the storage? 
But >> >> GoogleSheetsToSftp >> >>>>> would >> >>>>>>>> be in >> >>>>>>>>>>>>> google >> >>>>>>>>>>>>>>> sheets operators file? The complexity, and the >> >> shared >> >>>>> code, >> >>>>>>>> are >> >>>>>>>>> in >> >>>>>>>>>>>>> the >> >>>>>>>>>>>>>>> gsheet component -- not into the storage >> >> destination. >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> On Tue, Nov 5, 2019 at 5:46 PM Jarek Potiuk >> >>>>>>>>>>>>> <jarek.pot...@polidea.com> >> >>>>>>>>>>>>>>> wrote: >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> Hello Airflow Community, >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> The email calls for a vote to update AIP-21 >> >> Changes in >> >>>>>>>> import >> >>>>>>>>>>>>> paths >> >>>>>>>>>>>>>>>> < >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>> >> >>>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>> >> >>>>>>>> >> >>>>> >> >>>> >> >> >> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-21%3A+Changes+in+import+paths >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> with >> >>>>>>>>>>>>>>>> the changes described below. The vote will last >> >> till >> >>>>>>>> Saturday >> >>>>>>>>> 8th >> >>>>>>>>>>>>> 2am >> >>>>>>>>>>>>>>> CEST >> >>>>>>>>>>>>>>>> (72 hours). Committers have a binding vote but >> >>>> everyone >> >>>>> from >> >>>>>>>>> the >> >>>>>>>>>>>>>>> community >> >>>>>>>>>>>>>>>> is encouraged to cast an advisory vote. >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> *Summary*: >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> The proposal is to update AIP-21 to move all >> >> non-core >> >>>>>>>>>>>>>>>> operators/hooks/sensor (and related files) to >> >>>>> sub-packages >> >>>>>>>>> within >> >>>>>>>>>>>>>>> airflow >> >>>>>>>>>>>>>>>> (protocols/software/providers) or >> >>>> (software/providers). 
>> >>>>>>>>>>>>>>>> I am also happy to merge protocols+software, so >> >> if you >> >>>>> have >> >>>>>>>> a >> >>>>>>>>>>>>> strong >> >>>>>>>>>>>>>>>> opinion on it - please state it with your vote >> >> and we >> >>>>> can >> >>>>>>>>> decide >> >>>>>>>>>>>>> based >> >>>>>>>>>>>>>>> on >> >>>>>>>>>>>>>>>> majority. >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> Those packages will be separately released >> >>>>> (schedule/process >> >>>>>>>>> TBD) >> >>>>>>>>>>>>> and >> >>>>>>>>>>>>>>> will >> >>>>>>>>>>>>>>>> be backportable to 1.10.* airflow series, so that >> >>>> users >> >>>>> can >> >>>>>>>>>>>>> install it >> >>>>>>>>>>>>>>> and >> >>>>>>>>>>>>>>>> start using new Airflow2.0 operators in their >> >> Python 3 >> >>>>>>>> Airflow >> >>>>>>>>>>>>> 1.10 >> >>>>>>>>>>>>>>>> environments (only Python 3.5+ is supported). >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> We will proceed with migrating the providers >> >> package >> >>>> to >> >>>>>>>> already >> >>>>>>>>>>>>> agreed >> >>>>>>>>>>>>>>>> paths without waiting for the final vote >> >> (following >> >>>>> current >> >>>>>>>>>>>>> version of >> >>>>>>>>>>>>>>>> AIP-21). Since we have working POC - we know the >> >>>> agreed >> >>>>>>>> paths >> >>>>>>>>> will >> >>>>>>>>>>>>> work >> >>>>>>>>>>>>>>> for >> >>>>>>>>>>>>>>>> us. 
>> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> *Previous discussions: * >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> - >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>> >> >>>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>> >> >>>>>>>> >> >>>>> >> >>>> >> >> >> https://lists.apache.org/thread.html/b07a93c9114e3d3c55d4ee514955bac79bc012c7a00db627c6b4c55f@%3Cdev.airflow.apache.org%3E >> >>>>>>>>>>>>>>>> - >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>> >> >>>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>> >> >>>>>>>> >> >>>>> >> >>>> >> >> >> https://lists.apache.org/thread.html/e25ddc546e367a4af3e594fecbd4431959bd5a89045e748e4206e7ff@%3Cdev.airflow.apache.org%3E >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> *More Details*: >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> 1) Information that we are going in the direction >> >> of >> >>>>> AIP-8 >> >>>>>>>> but >> >>>>>>>>> not >> >>>>>>>>>>>>> yet >> >>>>>>>>>>>>>>>> reaching it - focusing on separating out >> >> backportable >> >>>>>>>> packages >> >>>>>>>>>>>>>>> installable >> >>>>>>>>>>>>>>>> in Airflow releases 1.10.* . Airflow 2.0 will >> >> still be >> >>>>>>>>> installed >> >>>>>>>>>>>>> as a >> >>>>>>>>>>>>>>> whole >> >>>>>>>>>>>>>>>> and all the source will be kept in one repo, but >> >> we >> >>>> now >> >>>>>>>> have a >> >>>>>>>>> way >> >>>>>>>>>>>>> to >> >>>>>>>>>>>>>>> build >> >>>>>>>>>>>>>>>> backportable packages for groups of operators. POC >> >>>>> available >> >>>>>>>>> here: >> >>>>>>>>>>>>>>>> https://github.com/apache/airflow/pull/6507 >> >> (based on >> >>>>> Ash's >> >>>>>>>>>>>>>>>> https://github.com/ashb/airflow-submodule-test) >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> 2) We move all integrations to new packages >> >> (keeping >> >>>>>>>> deprecated >> >>>>>>>>>>>>> import >> >>>>>>>>>>>>>>>> aliases in the old places). 
The following split >> >>>>> (according >> >>>>>>>> to >> >>>>>>>>>>>>>>> "stewardship" >> >>>>>>>>>>>>>>>> over the integrations): >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> - *fundamentals* - core of ariflow - they are >> >>>> really >> >>>>>>>> part of >> >>>>>>>>>>>>> Apache >> >>>>>>>>>>>>>>>> Airflow. Stewards - core Airflow team. Not >> >>>>>>>>>>>>> backportable/separated >> >>>>>>>>>>>>>>> out. >> >>>>>>>>>>>>>>>> - *protocols* - are not owned by anyone, they >> >> are >> >>>>> public >> >>>>>>>> and >> >>>>>>>>>>>>> the >> >>>>>>>>>>>>>>>> implementation is fully "open". There are no >> >>>>> particular >> >>>>>>>>>>>>> stewards (no >> >>>>>>>>>>>>>>>> need). >> >>>>>>>>>>>>>>>> Users of particular protocols should mainly >> >>>> maintain >> >>>>>>>> those >> >>>>>>>>> and >> >>>>>>>>>>>>> add >> >>>>>>>>>>>>>>>> support >> >>>>>>>>>>>>>>>> for different versions of the protocols. >> >>>>>>>>>>>>>>>> - *software* - both API and software are >> >> controlled >> >>>>> by >> >>>>>>>>> someone >> >>>>>>>>>>>>>>> outside >> >>>>>>>>>>>>>>>> of Airflow (commercial or open-source >> >> project), but >> >>>>> the >> >>>>>>>>>>>>> deployment of >> >>>>>>>>>>>>>>>> that >> >>>>>>>>>>>>>>>> software is "owned" by the user installing >> >> Airflow. >> >>>>> The >> >>>>>>>>>>>>> "stewardship" >> >>>>>>>>>>>>>>>> might >> >>>>>>>>>>>>>>>> be also the users but the controlling party >> >> (Oracle >> >>>>> for >> >>>>>>>>>>>>> example) >> >>>>>>>>>>>>>>> might >> >>>>>>>>>>>>>>>> be >> >>>>>>>>>>>>>>>> interested in maintaining those operators as >> >> well. >> >>>>>>>>>>>>>>>> - *providers* - API/software/deployments are >> >> fully >> >>>>>>>>> controlled >> >>>>>>>>>>>>> by a >> >>>>>>>>>>>>>>> 3rd >> >>>>>>>>>>>>>>>> party. 
Here most likely "provider" will be >> >>>>> interested in >> >>>>>>>>>>>>> maintaining >> >>>>>>>>>>>>>>> the >> >>>>>>>>>>>>>>>> operators (and for example like Google - >> >> provide >> >>>>>>>> integration >> >>>>>>>>>>>>>>> guidelines >> >>>>>>>>>>>>>>>> < >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>> >> >>>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>> >> >>>>>>>> >> >>>>> >> >>>> >> >> >> https://docs.google.com/document/d/1_rTdJSLCt0eyrAylmmgYc3yZr-_h51fVlnvMmWqhCkY/edit?usp=drive_web&ouid=112320280470690058978 >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> for >> >>>>>>>>>>>>>>>> their hooks/operators/sensors) >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> 3) Between-providers transfer operators should be >> >> kept >> >>>>> at >> >>>>>>>> the >> >>>>>>>>>>>>> "target" >> >>>>>>>>>>>>>>>> rather than "source" >> >>>>>>>>>>>>>>>> For example S3 -> GCS should be in "google" >> >> provider, >> >>>>> but >> >>>>>>>>> GCS-> S3 >> >>>>>>>>>>>>>>> should >> >>>>>>>>>>>>>>>> be in "amazon". >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> 4) One-side provider transfer operators should be >> >> kept >> >>>>> at >> >>>>>>>> the >> >>>>>>>>>>>>> "provider" >> >>>>>>>>>>>>>>>> regardless if they are target or source. >> >>>>>>>>>>>>>>>> For example GCS-> SFTP or SFTP -> GCS should be in >> >>>>> "google" >> >>>>>>>>>>>>> provider. >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> 5) If in doubt we will discuss individual cases >> >>>>> separately. >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>> J. 
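The placement rules 3) and 4) for transfer operators can be read as a small decision function. The following is only a sketch of that reading, not code from Airflow: the `PROVIDER_OF` mapping and the `transfer_operator_home` helper are hypothetical names, and treating any service missing from the mapping as a public protocol (such as sftp) is an assumption made for illustration.

```python
from typing import Optional

# Hypothetical mapping of service -> owning provider package.
# Anything not listed is assumed to be a public protocol (e.g. sftp, ftp).
PROVIDER_OF = {
    "s3": "amazon",
    "gcs": "google",
}


def transfer_operator_home(source: str, target: str) -> Optional[str]:
    """Return the provider package that should hold a source -> target operator."""
    src_provider = PROVIDER_OF.get(source.lower())
    dst_provider = PROVIDER_OF.get(target.lower())
    if dst_provider is not None:
        # Rule 3: between providers, the "target" side wins.
        # Rule 4: if only the target is a provider, it wins as well.
        return dst_provider
    if src_provider is not None:
        # Rule 4: only the source is a provider, so the operator lives there.
        return src_provider
    # Rule 5: neither side is a provider; discuss the case separately.
    return None


print(transfer_operator_home("s3", "gcs"))
print(transfer_operator_home("gcs", "s3"))
print(transfer_operator_home("gcs", "sftp"))
print(transfer_operator_home("sftp", "gcs"))
```

With the mapping above, the four calls reproduce the examples from the proposal: S3 -> GCS lands in "google", GCS -> S3 in "amazon", and both GCS -> SFTP and SFTP -> GCS in "google".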
--

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129
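For readers following the import-path discussion in this thread: the AIP-21 naming rule cited for the Kubernetes case (Case 2A, drop `_operator` from the module name; Case 5B, keep `Operator` in the class name) can be sketched as a string transformation. This is a minimal illustration only; `new_module_name` and `provider_import` are hypothetical helpers, not part of Airflow, and the sketch covers only the `_operator` suffix mentioned in the thread.

```python
def new_module_name(old_module: str) -> str:
    """Case 2A: drop a trailing '_operator' from the module name."""
    suffix = "_operator"
    if old_module.endswith(suffix):
        return old_module[: -len(suffix)]
    return old_module


def provider_import(provider: str, old_module: str, class_name: str) -> str:
    """Build the import line; Case 5B: the class name is left untouched."""
    module = new_module_name(old_module)
    return f"from airflow.providers.{provider}.operators.{module} import {class_name}"


print(provider_import("kubernetes", "pod_operator", "KubernetesPodOperator"))
print(provider_import("google.cloud", "pubsub", "PubSubTopicCreateOperator"))
```

The two calls print the same import lines quoted in the thread for `KubernetesPodOperator` and `PubSubTopicCreateOperator`.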