I updated the spreadsheet and put the Bash + Python operators into fundamentals.
I also treated Apache the same way as "proprietary" providers.

I will restart the vote then :).

J.


On Mon, Nov 11, 2019 at 7:21 PM Jarek Potiuk <jarek.pot...@polidea.com>
wrote:

> Ok. Happy to move it back then :). No problem with that.
>
> According to the rules of AIP-21 it should actually be: "*from
> airflow.providers.kubernetes.operators.pod import KubernetesPodOperator*"
> (Case 2A (drop _operator in module name) + Case 5B (keep Operator in
> class name)). We can have more than just a Pod operator for Kubernetes
> (KubernetesPod, KubernetesVolume, KubernetesIstio, and many more), so
> keeping KubernetesPod in the class name and having a separate module for
> pod operator(s?) makes sense IMHO.
>
> It's similar to *from airflow.providers.google.cloud.operators.pubsub
> import PubSubTopicCreateOperator* for example.
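
For illustration, the module-naming rule quoted above (Case 2A) could be sketched roughly as follows. The helper name `new_import_path` is made up for this example and is not part of Airflow or AIP-21, and the old module name `pod_operator` is only the hypothetical one from Ash's question. Class names are not touched, in line with Case 5B.

```python
def new_import_path(provider: str, old_module: str, kind: str = "operators") -> str:
    """Build a providers import path, dropping a redundant module suffix.

    Hypothetical helper: `kind` is "operators", "hooks" or "sensors", and a
    trailing "_operator" / "_hook" / "_sensor" is removed from the module name.
    """
    suffix = "_" + kind[:-1]  # "operators" -> "_operator"
    if old_module.endswith(suffix):
        module = old_module[: -len(suffix)]
    else:
        module = old_module
    return "airflow.providers.{}.{}.{}".format(provider, kind, module)


print(new_import_path("kubernetes", "pod_operator"))
print(new_import_path("google.cloud", "pubsub"))
```

A module that already has no suffix, such as `pubsub`, passes through unchanged.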
>
> Re - remote log storage - indeed. That should be part of AIP-8.
>
> J,
>
> On Mon, Nov 11, 2019 at 6:36 PM Ash Berlin-Taylor <a...@apache.org> wrote:
>
>> +1 for Python and Bash being in the stock install -- they are just _so_
>> commonly used that I think it makes sense to keep them in the base install.
>> (And the virtualenv module is not an onerous dep; it has not caused us any
>> problems. Yet.)
>>
>> Kubernetes is also a slightly funny one since the deps for that will be
>> in "core" anyway thanks to the Kube executor, but I think it probably makes
>> sense to have `from airflow.providers.kubernetes.operators import
>> KubernetesOperator`. Is that the pattern we are going with for the
>> "one-level" providers, or will it be `from
>> airflow.providers.kubernetes.operators.pod_operator import
>> KubernetesOperator`?
>>
>> Possibly more an AIP-8 question: with moving Azure Blob/S3/GCS to
>> separate packages we might have to look at how we enable remote log storage.
>>
>> -a
>>
>>
>> > On 11 Nov 2019, at 15:53, Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>> >
>> > On Mon, Nov 11, 2019 at 4:22 PM Kamil Breguła <kamil.breg...@polidea.com>
>> > wrote:
>> >
>> >> One more question. Are you sure you want to move Python and Bash from
>> >> core?  These are the elements that are installed in every environment
>> >> because they are required by Airflow, so moving them to a separate
>> >> installed package is pointless in my opinion.
>> >
>> > I have no problem with moving them to "fundamentals", but I am not sure if
>> > they are really required? I looked through the code and, other than a few
>> > examples and tests, they are not really "required". Maybe that's enough to
>> > keep them in fundamentals.
>> > Also, the Python operator has some dependencies - virtualenv - which is only
>> > required for this operator, so maybe it's worth keeping it separate from
>> > "fundamentals".
>> >
>> >
>> >> On Mon, Nov 11, 2019 at 3:07 PM Kaxil Naik <kaxiln...@gmail.com> wrote:
>> >>>
>> >>> I am fine with this list +1
>> >>>
>> >>> On Mon, Nov 11, 2019 at 1:27 PM Jarek Potiuk <jarek.pot...@polidea.com>
>> >>> wrote:
>> >>>
>> >>>> I am all for it Kamil!
>> >>>>
>> >>>> Super happy to treat Apache projects in the same way as "proprietary"
>> >>>> providers :). Does anyone else have other comments?
>> >>>>
>> >>>> J.
>> >>>>
>> >>>> On Mon, Nov 11, 2019 at 2:17 PM Kamil Breguła <kamil.breg...@polidea.com>
>> >>>> wrote:
>> >>>>
>> >>>>> I looked at this list and I'm only worried about two operators.
>> >>>>>
>> >>>>> airflow.contrib.operators.vertica_to_hive
>> >>>>> airflow.contrib.operators.s3_to_hive
>> >>>>>
>> >>>>> If we want the operators to be grouped according to destination, then
>> >>>>> these operators should be in the apache package. It is the members of
>> >>>>> the Apache community who will care most about these operators being of
>> >>>>> high quality. Apache can be treated equally with other large cloud
>> >>>>> providers, such as GCP and AWS. I can imagine that a new Apache product
>> >>>>> will appear and will want to promote itself the same way as products of
>> >>>>> cloud providers are promoted: by creating a large number of
>> >>>>> integrations that allow you to copy data into its operating range.
>> >>>>> There's another case - building a strong Apache community. As
>> >>>>> members of the Apache community, we should promote Apache products to
>> >>>>> ensure that the community develops well, and therefore also support
>> >>>>> the integration of our product with other products.
>> >>>>>
>> >>>>> On Mon, Nov 11, 2019 at 12:28 AM Jarek Potiuk <jarek.pot...@polidea.com>
>> >>>>> wrote:
>> >>>>>>
>> >>>>>> Just to select the "packages" for this update. Does anyone have
>> >>>>>> objections to this structure (details, including transfer operators, in
>> >>>>>> https://docs.google.com/spreadsheets/d/17zA5t2JVxnDdg5Cs1Cg_Mb1GXvGctmesfg2L089QSOk/edit#gid=0)?
>> >>>>>>
>> >>>>>> *Fundamentals (no change)*
>> >>>>>>
>> >>>>>> providers/
>> >>>>>>    google
>> >>>>>>         cloud
>> >>>>>>         gsuite
>> >>>>>>         marketing_platform
>> >>>>>>    amazon
>> >>>>>>         aws
>> >>>>>>    microsoft
>> >>>>>>         azure
>> >>>>>>    apache
>> >>>>>>         cassandra
>> >>>>>>         druid
>> >>>>>>         hadoop
>> >>>>>>         hive
>> >>>>>>         pig
>> >>>>>>         pinot
>> >>>>>>         spark
>> >>>>>>         sqoop
>> >>>>>>    mysql
>> >>>>>>    jira
>> >>>>>>    databricks
>> >>>>>>    datadog
>> >>>>>>    dingding
>> >>>>>>    discord
>> >>>>>>    cloudant
>> >>>>>>    jenkins
>> >>>>>>    opsgenie
>> >>>>>>    qubole
>> >>>>>>    salesforce
>> >>>>>>    segment
>> >>>>>>    slack
>> >>>>>>    snowflake
>> >>>>>>    vertica
>> >>>>>>    zendesk
>> >>>>>>    celery
>> >>>>>>    docker
>> >>>>>>    bash
>> >>>>>>    kubernetes
>> >>>>>>    mssql
>> >>>>>>    mongodb
>> >>>>>>    openfaas
>> >>>>>>    oracle
>> >>>>>>    papermill
>> >>>>>>    postgres
>> >>>>>>    presto
>> >>>>>>    python
>> >>>>>>    redis
>> >>>>>>    samba
>> >>>>>>    sqlite
>> >>>>>>    imap
>> >>>>>>    ssh
>> >>>>>>    filesystem
>> >>>>>>    sftp
>> >>>>>>    ftp
>> >>>>>>    http
>> >>>>>>    grpc
>> >>>>>>    smtp
>> >>>>>>    jdbc
>> >>>>>>    winrm
>> >>>>>>
>> >>>>>> On Fri, Nov 8, 2019 at 5:47 PM Jarek Potiuk <jarek.pot...@polidea.com>
>> >>>>>> wrote:
>> >>>>>>
>> >>>>>>> Let me then cancel this vote and I will restart it next week.
>> >>>>>>>
>> >>>>>>> Yeah. It's a bit like re-opening Pandora's box, but now that we know
>> >>>>>>> that we can do it, and we are unblocked in moving to google (which is
>> >>>>>>> now the biggest move in progress), we can spend more time on getting a
>> >>>>>>> better (and more final) consensus.
>> >>>>>>> I decided to go through the list from the docs (once again Kamil -
>> >>>>>>> great that you did it) and prepared this spreadsheet showing the
>> >>>>>>> structure. I went through ALL the operators and put them in the right
>> >>>>>>> place where our current rules place them.
>> >>>>>>>
>> >>>>>>> After this exercise, I think the following makes sense:
>> >>>>>>> - put all the stuff except fundamentals in *"providers"* (everything
>> >>>>>>> in "providers" will be potentially backportable).
>> >>>>>>> - group Apache projects under *"apache"* - similar to
>> >>>>>>> google/amazon/microsoft (a different kind of ownership, but still
>> >>>>>>> ownership).
>> >>>>>>> - for the rest, I think what we can do is put the operators in
>> >>>>>>> folders per "service/company" (without sub-packages). That includes
>> >>>>>>> sftp/ssh/ftp etc. (should we group [ftp and sftp] or [ssh and sftp]?):
>> >>>>>>> there is no "ownership" there and no reason to group them. That will
>> >>>>>>> put "operators/hooks/sensors" at different levels in the directory
>> >>>>>>> tree, but we already have that for fundamentals and I am not too
>> >>>>>>> worried about that. We do not have to have everything at the same level.
>> >>>>>>> - I put transfer operators according to the rule where the "to" side is
>> >>>>>>> more important unless the other side is a public protocol (so sftp ->
>> >>>>>>> gcs and gcs -> sftp both go to google/gcp). I did not have any doubt
>> >>>>>>> where to put which transfer operator, so this is a good sign:
>> >>>>>>>
>> >>>>>>> https://docs.google.com/spreadsheets/d/17zA5t2JVxnDdg5Cs1Cg_Mb1GXvGctmesfg2L089QSOk/edit#gid=0
>> >>>>>>>
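
As a rough sketch only, the transfer-operator placement rule described above could be written down like this. The lookup tables and the function name `transfer_operator_home` are invented for the example (they are not in the spreadsheet or in Airflow), and the tables list only services mentioned in this thread.

```python
# Hypothetical lookup tables for this sketch; not taken from Airflow itself.
PROVIDER_OF = {"gcs": "google", "s3": "amazon"}
PUBLIC_PROTOCOLS = {"sftp", "ftp", "ssh"}


def transfer_operator_home(source: str, target: str) -> str:
    """Return the provider package that should hold a source -> target operator."""
    source_provider = PROVIDER_OF.get(source)
    target_provider = PROVIDER_OF.get(target)
    if target_provider is not None:
        # The "to" side wins whenever it belongs to a provider.
        return target_provider
    if source_provider is not None and target in PUBLIC_PROTOCOLS:
        # The target is a public protocol, so the operator stays with the provider.
        return source_provider
    # Anything else is left for case-by-case discussion.
    raise ValueError("no rule for {} -> {}".format(source, target))


for pair in [("s3", "gcs"), ("gcs", "s3"), ("sftp", "gcs"), ("gcs", "sftp")]:
    print(pair, "->", transfer_operator_home(*pair))
```

Raising an error for unmatched pairs is a design choice of the sketch: it keeps the "discuss individually" cases visible instead of guessing a home for them.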
>> >>>>>>> Can you please take a look and express your opinions here so that we
>> >>>>>>> can have final voting next week (for those who are not yet tired of
>> >>>>>>> the discussion ;)).
>> >>>>>>>
>> >>>>>>> J.
>> >>>>>>>
>> >>>>>>> On Fri, Nov 8, 2019 at 4:38 PM Kaxil Naik <kaxiln...@gmail.com> wrote:
>> >>>>>>>
>> >>>>>>>> Yes, that makes sense.
>> >>>>>>>>
>> >>>>>>>> On Fri, Nov 8, 2019 at 3:22 PM Kamil Breguła <kamil.breg...@polidea.com>
>> >>>>>>>> wrote:
>> >>>>>>>>>
>> >>>>>>>>> In the case of Hadoop, it is published by Apache, so it can be in
>> >>>>>>>>> the apache directory. This will mimic the grouping presented in the
>> >>>>>>>>> documentation.
>> >>>>>>>>>
>> >>>>>>>>> https://airflow.readthedocs.io/en/latest/operators-and-hooks-ref.html#software-operators-and-hooks
>> >>>>>>>>>
>> >>>>>>>>> On Fri, Nov 8, 2019 at 3:47 PM Kaxil Naik <kaxiln...@gmail.com> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>> I think we should keep the vote open at least until mid next week
>> >>>>>>>>>> to have more thought and input on this one.
>> >>>>>>>>>>
>> >>>>>>>>>> In general, I am happy with the approach, but operators/hooks and
>> >>>>>>>>>> sensors shouldn't be a provider. "hadoop" can be its own provider and
>> >>>>>>>>>> hdfs can be a part of it.
>> >>>>>>>>>>
>> >>>>>>>>>> providers/
>> >>>>>>>>>>    google
>> >>>>>>>>>>         cloud
>> >>>>>>>>>>             operators
>> >>>>>>>>>>             hooks
>> >>>>>>>>>>             sensors
>> >>>>>>>>>>         gsuite
>> >>>>>>>>>>             operators
>> >>>>>>>>>>             ...
>> >>>>>>>>>>    amazon
>> >>>>>>>>>>         aws
>> >>>>>>>>>>             operators
>> >>>>>>>>>>             ...
>> >>>>>>>>>>    microsoft
>> >>>>>>>>>>         azure
>> >>>>>>>>>>             operators
>> >>>>>>>>>>             ...
>> >>>>>>>>>>    hadoop
>> >>>>>>>>>>        hdfs
>> >>>>>>>>>>             operators
>> >>>>>>>>>>             ...
>> >>>>>>>>>>
>> >>>>>>>>>> We can also define what a "provider" is so we know what to add to
>> >>>>>>>>>> it in the future. SSH/FTP/SFTP belong to the same family. Do we want
>> >>>>>>>>>> to have separate providers for each one of them?
>> >>>>>>>>>>
>> >>>>>>>>>> Regards,
>> >>>>>>>>>> Kaxil
>> >>>>>>>>>>
>> >>>>>>>>>> On Fri, Nov 8, 2019 at 9:08 AM Jarek Potiuk <jarek.pot...@polidea.com>
>> >>>>>>>>>> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>> I really like the idea of making everything a provider. That's a
>> >>>>>>>>>>> great idea! This way everything "backportable" will have to be in the
>> >>>>>>>>>>> "providers" package. Really nice and clean separation (and less mess
>> >>>>>>>>>>> in "airflow"). And we will not have to have any artificial grouping
>> >>>>>>>>>>> (we can still group them at the documentation level).
>> >>>>>>>>>>>
>> >>>>>>>>>>> We do not need backport in the name. And I think it's more of a
>> >>>>>>>>>>> technical detail on naming the package, which we can work out while
>> >>>>>>>>>>> reviewing PRs, and we can agree on the final naming of the released
>> >>>>>>>>>>> packages at the PMC level (PMCs will have to vote on releasing those).
>> >>>>>>>>>>>
>> >>>>>>>>>>> The thinking is that its intention is really to be only backported to
>> >>>>>>>>>>> 1.10 - we are not going (yet) to use the packages in Airflow 2.*, so I
>> >>>>>>>>>>> thought by naming them backport we can express that intent more
>> >>>>>>>>>>> clearly.
>> >>>>>>>>>>>
>> >>>>>>>>>>> So let me clarify the structure of folders we are going to have if we
>> >>>>>>>>>>> follow it (I just added some examples), including the already agreed
>> >>>>>>>>>>> changes from AIP-21:
>> >>>>>>>>>>>
>> >>>>>>>>>>> providers/
>> >>>>>>>>>>>    google
>> >>>>>>>>>>>         cloud
>> >>>>>>>>>>>             operators
>> >>>>>>>>>>>             hooks
>> >>>>>>>>>>>             sensors
>> >>>>>>>>>>>         gsuite
>> >>>>>>>>>>>             operators
>> >>>>>>>>>>>             ...
>> >>>>>>>>>>>    amazon
>> >>>>>>>>>>>         aws
>> >>>>>>>>>>>             operators
>> >>>>>>>>>>>             ...
>> >>>>>>>>>>>    microsoft
>> >>>>>>>>>>>         azure
>> >>>>>>>>>>>             operators
>> >>>>>>>>>>>             ...
>> >>>>>>>>>>>    operators
>> >>>>>>>>>>>         sqlite.py
>> >>>>>>>>>>>         oracle.py
>> >>>>>>>>>>>         docker.py
>> >>>>>>>>>>>    hooks
>> >>>>>>>>>>>         hdfs.py
>> >>>>>>>>>>>         sqlite.py
>> >>>>>>>>>>>    sensors
>> >>>>>>>>>>>         http.py
>> >>>>>>>>>>>         sql.py
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> J.
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Fri, Nov 8, 2019 at 9:43 AM Ash Berlin-Taylor <a...@apache.org>
>> >>>>>>>>>>> wrote:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Do we need to include `-backport`? What was the thinking behind
>> >>>>>>>>>>>> that?
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> I think software and protocol should be merged. I would also say
>> >>>>>>>>>>>> _everything_ is a provider, so airflow.providers.ssh.SSHOperator,
>> >>>>>>>>>>>> for instance, is what I would prefer.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> -a
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> On 8 November 2019 08:32:42 GMT, Jarek Potiuk <jarek.pot...@polidea.com>
>> >>>>>>>>>>>> wrote:
>> >>>>>>>>>>>>> One more day to go. I would love to see some opinions on this AIP-21
>> >>>>>>>>>>>>> update :).
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Executive summary:
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> * we will be moving a number of integrations to sub-packages of
>> >>>>>>>>>>>>> airflow.
>> >>>>>>>>>>>>> * they will be backportable to 1.10.*. There will be an
>> >>>>>>>>>>>>> 'apache-airflow-[package]-backport' PyPI package, installable with
>> >>>>>>>>>>>>> Python 3, that will make Airflow 2.0 operators/hooks etc. available
>> >>>>>>>>>>>>> with 1.10.* operators.
>> >>>>>>>>>>>>> * the current proposal for sub-packages is
>> >>>>>>>>>>>>> "protocols/software/providers/" (but if you think merging protocols
>> >>>>>>>>>>>>> and software makes sense - please express your opinion).
>> >>>>>>>>>>>>> * we are not moving "fundamental" operators/hooks etc.
>> >>>>>>>>>>>>> * Airflow 2.0 is still going to be installed as a single package with
>> >>>>>>>>>>>>> all operators (so we are not yet implementing AIP-8).
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> J.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> On Wed, Nov 6, 2019 at 10:07 AM Jarek Potiuk <jarek.pot...@polidea.com>
>> >>>>>>>>>>>>> wrote:
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> I think all these cases are valid but maybe I was not super-clear.
>> >>>>>>>>>>>>>> It's only the transfer operators that we need to decide where to
>> >>>>>>>>>>>>>> put - not hooks.
>> >>>>>>>>>>>>>> Usually the complexity of communication with particular storages is
>> >>>>>>>>>>>>>> (or at least should be) in the Hooks rather than Operators.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Operators should be just thin wrappers over the logic in the hooks.
>> >>>>>>>>>>>>>> Hooks are going to stay where they belong - S3 Hooks in amazon, GCS
>> >>>>>>>>>>>>>> Hooks in google.cloud, GoogleSheet Hooks in google.gsuite.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Since we actually have a mono-repo, it will be no problem (and no
>> >>>>>>>>>>>>>> cross-dependency problem) to have the S3 -> GCS operator in google
>> >>>>>>>>>>>>>> and use hooks from both google/amazon.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> I hope this alleviates your concern, Daniel?
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> J.
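
To make the "thin wrapper" point concrete, here is a minimal sketch with in-memory stand-ins. `FakeS3Hook`, `FakeGCSHook` and `S3ToGCSTransfer` are invented for this example and do not mirror the real Airflow hook or operator APIs; the only point is that the operator delegates all storage logic to hooks owned by two different providers.

```python
class FakeS3Hook:
    """Stand-in for a hook that would live under the amazon provider."""

    def __init__(self, objects):
        self._objects = dict(objects)

    def read(self, key):
        return self._objects[key]


class FakeGCSHook:
    """Stand-in for a hook that would live under the google provider."""

    def __init__(self):
        self.uploaded = {}

    def upload(self, name, data):
        self.uploaded[name] = data


class S3ToGCSTransfer:
    """Thin transfer operator: it only wires the two hooks together."""

    def __init__(self, source_hook, target_hook, key):
        self.source_hook = source_hook
        self.target_hook = target_hook
        self.key = key

    def execute(self):
        data = self.source_hook.read(self.key)
        self.target_hook.upload(self.key, data)


s3 = FakeS3Hook({"report.csv": b"a,b\n1,2\n"})
gcs = FakeGCSHook()
S3ToGCSTransfer(s3, gcs, "report.csv").execute()
print(gcs.uploaded)
```

Because the operator holds no storage logic of its own, placing it under the target provider does not move any of the S3-specific code out of the amazon package.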
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> What about GoogleSheetsToS3?  GoogleSheetsToGCS?  These you would
>> >>>>>>>>>>>>>>> put in the target, i.e. the storage?  But GoogleSheetsToSftp would
>> >>>>>>>>>>>>>>> be in the google sheets operators file?  The complexity, and the
>> >>>>>>>>>>>>>>> shared code, are in the gsheet component -- not in the storage
>> >>>>>>>>>>>>>>> destination.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> On Tue, Nov 5, 2019 at 5:46 PM Jarek Potiuk <jarek.pot...@polidea.com>
>> >>>>>>>>>>>>>>> wrote:
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> Hello Airflow Community,
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> This email calls for a vote to update AIP-21 Changes in import paths
>> >>>>>>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-21%3A+Changes+in+import+paths>
>> >>>>>>>>>>>>>>>> with the changes described below. The vote will last till Saturday
>> >>>>>>>>>>>>>>>> 8th 2am CEST (72 hours). Committers have a binding vote but everyone
>> >>>>>>>>>>>>>>>> from the community is encouraged to cast an advisory vote.
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> *Summary*:
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> The proposal is to update AIP-21 to move all non-core
>> >>>>>>>>>>>>>>>> operators/hooks/sensors (and related files) to sub-packages within
>> >>>>>>>>>>>>>>>> airflow (protocols/software/providers) or (software/providers).
>> >>>>>>>>>>>>>>>> I am also happy to merge protocols+software, so if you have a strong
>> >>>>>>>>>>>>>>>> opinion on it - please state it with your vote and we can decide
>> >>>>>>>>>>>>>>>> based on the majority.
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> Those packages will be released separately (schedule/process TBD)
>> >>>>>>>>>>>>>>>> and will be backportable to the 1.10.* Airflow series, so that users
>> >>>>>>>>>>>>>>>> can install them and start using new Airflow 2.0 operators in their
>> >>>>>>>>>>>>>>>> Python 3 Airflow 1.10 environments (only Python 3.5+ is supported).
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> We will proceed with migrating the providers package to the already
>> >>>>>>>>>>>>>>>> agreed paths without waiting for the final vote (following the
>> >>>>>>>>>>>>>>>> current version of AIP-21). Since we have a working POC, we know the
>> >>>>>>>>>>>>>>>> agreed paths will work for us.
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> *Previous discussions:*
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>   - https://lists.apache.org/thread.html/b07a93c9114e3d3c55d4ee514955bac79bc012c7a00db627c6b4c55f@%3Cdev.airflow.apache.org%3E
>> >>>>>>>>>>>>>>>>   - https://lists.apache.org/thread.html/e25ddc546e367a4af3e594fecbd4431959bd5a89045e748e4206e7ff@%3Cdev.airflow.apache.org%3E
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> *More Details*:
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> 1) Information that we are going in the direction of AIP-8 but not
>> >>>>>>>>>>>>>>>> yet reaching it - focusing on separating out backportable packages
>> >>>>>>>>>>>>>>>> installable in Airflow releases 1.10.*. Airflow 2.0 will still be
>> >>>>>>>>>>>>>>>> installed as a whole and all the source will be kept in one repo,
>> >>>>>>>>>>>>>>>> but we now have a way to build backportable packages for groups of
>> >>>>>>>>>>>>>>>> operators. POC available here:
>> >>>>>>>>>>>>>>>> https://github.com/apache/airflow/pull/6507 (based on Ash's
>> >>>>>>>>>>>>>>>> https://github.com/ashb/airflow-submodule-test)
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> 2) We move all integrations to new packages (keeping deprecated
>> >>>>>>>>>>>>>>>> import aliases in the old places). The following split (according to
>> >>>>>>>>>>>>>>>> "stewardship" over the integrations):
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>   - *fundamentals* - core of Airflow - they are really part of Apache
>> >>>>>>>>>>>>>>>>   Airflow. Stewards - core Airflow team. Not backportable/separated
>> >>>>>>>>>>>>>>>>   out.
>> >>>>>>>>>>>>>>>>   - *protocols* - are not owned by anyone, they are public and the
>> >>>>>>>>>>>>>>>>   implementation is fully "open". There are no particular stewards (no
>> >>>>>>>>>>>>>>>>   need). Users of particular protocols should mainly maintain those
>> >>>>>>>>>>>>>>>>   and add support for different versions of the protocols.
>> >>>>>>>>>>>>>>>>   - *software* - both API and software are controlled by someone
>> >>>>>>>>>>>>>>>>   outside of Airflow (commercial or open-source project), but the
>> >>>>>>>>>>>>>>>>   deployment of that software is "owned" by the user installing
>> >>>>>>>>>>>>>>>>   Airflow. The "stewardship" might also be the users, but the
>> >>>>>>>>>>>>>>>>   controlling party (Oracle for example) might be interested in
>> >>>>>>>>>>>>>>>>   maintaining those operators as well.
>> >>>>>>>>>>>>>>>>   - *providers* - API/software/deployments are fully controlled by a
>> >>>>>>>>>>>>>>>>   3rd party. Here most likely the "provider" will be interested in
>> >>>>>>>>>>>>>>>>   maintaining the operators (and, for example, like Google - provide
>> >>>>>>>>>>>>>>>>   integration guidelines
>> >>>>>>>>>>>>>>>>   <https://docs.google.com/document/d/1_rTdJSLCt0eyrAylmmgYc3yZr-_h51fVlnvMmWqhCkY/edit?usp=drive_web&ouid=112320280470690058978>
>> >>>>>>>>>>>>>>>>   for their hooks/operators/sensors).
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> 3) Between-providers transfer operators should be kept at the
>> >>>>>>>>>>>>>>>> "target" rather than the "source".
>> >>>>>>>>>>>>>>>> For example S3 -> GCS should be in the "google" provider, but
>> >>>>>>>>>>>>>>>> GCS -> S3 should be in "amazon".
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> 4) One-side provider transfer operators should be kept at the
>> >>>>>>>>>>>>>>>> "provider" regardless of whether they are target or source.
>> >>>>>>>>>>>>>>>> For example GCS -> SFTP or SFTP -> GCS should be in the "google"
>> >>>>>>>>>>>>>>>> provider.
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> 5) If in doubt we will discuss individual cases separately.
>> >>>>>>>>>>>>>>>>
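
Point 2) above mentions keeping deprecated import aliases in the old places. One possible shape for such an alias is sketched below, purely as an illustration: the `deprecated_alias` helper is hypothetical, the class is a stand-in rather than the real operator, and the old module path shown is an assumption made for the example.

```python
import warnings


class KubernetesPodOperator:
    """Stand-in for the class at its new location (not the real operator)."""

    def __init__(self, name):
        self.name = name


def deprecated_alias(new_cls, old_path, new_path):
    """Return a subclass that emits a DeprecationWarning when instantiated."""

    class _Alias(new_cls):
        def __init__(self, *args, **kwargs):
            warnings.warn(
                "{} is deprecated; import from {} instead".format(old_path, new_path),
                DeprecationWarning,
                stacklevel=2,
            )
            super().__init__(*args, **kwargs)

    _Alias.__name__ = new_cls.__name__
    return _Alias


# What a module left behind at the old path could export (paths are assumptions).
OldKubernetesPodOperator = deprecated_alias(
    KubernetesPodOperator,
    "airflow.contrib.operators.kubernetes_pod_operator",
    "airflow.providers.kubernetes.operators.pod",
)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    op = OldKubernetesPodOperator("demo")

print(op.name, [str(w.message) for w in caught])
```

Subclassing keeps `isinstance` checks against the new class working for code that still imports from the old path.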
>> >>>>>>>>>>>>>>>> J.
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> Jarek Potiuk
>> >>>>>>>>>>>>>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> M: +48 660 796 129 <+48660796129>
>> >>>>>>>>>>>>>>>> [image: Polidea] <https://www.polidea.com/>
>> >>>>>>>>>>>>>>>>

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>
