+1 for Python and Bash being in the stock install -- they are just _so_ 
commonly used that I think it makes sense to keep them in the base install. 
(And the virtualenv module is not an onerous dep, and it has not caused us any 
problems. Yet.)

Kubernetes is also a slightly funny one since the deps for that will be in 
"core" anyway thanks to the Kube executor, but I think it probably makes sense 
to have `from airflow.providers.kubernetes.operators import 
KubernetesOperator`. Is that the pattern we are going with for the "one-level" 
providers, or will it be `from 
airflow.providers.kubernetes.operators.pod_operator import KubernetesOperator`?
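
For what it's worth, the two spellings need not be mutually exclusive. Here is 
a minimal sketch, assuming a made-up package name (`demo_providers`, which is 
not real Airflow code) and a stand-in class: if `operators/__init__.py` 
re-exports the class defined in `pod_operator.py`, both import paths resolve to 
the same object.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package name, chosen so it cannot clash with a real Airflow install.
ROOT_PKG = "demo_providers"


def build_layout(base: Path) -> None:
    """Create demo_providers/kubernetes/operators/{__init__,pod_operator}.py."""
    ops = base / ROOT_PKG / "kubernetes" / "operators"
    ops.mkdir(parents=True)
    (base / ROOT_PKG / "__init__.py").write_text("")
    (base / ROOT_PKG / "kubernetes" / "__init__.py").write_text("")
    # The class lives in its own module (the longer import path)...
    (ops / "pod_operator.py").write_text(
        "class KubernetesOperator:\n"
        "    pass  # stand-in, not the real Airflow operator\n"
    )
    # ...and the package re-exports it (the shorter import path).
    (ops / "__init__.py").write_text(
        "from .pod_operator import KubernetesOperator\n"
        "__all__ = ['KubernetesOperator']\n"
    )


def load_both_paths():
    """Import the stand-in class through the short and the long path."""
    tmp = tempfile.mkdtemp()
    build_layout(Path(tmp))
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    short_mod = importlib.import_module(f"{ROOT_PKG}.kubernetes.operators")
    long_mod = importlib.import_module(
        f"{ROOT_PKG}.kubernetes.operators.pod_operator"
    )
    return short_mod.KubernetesOperator, long_mod.KubernetesOperator


if __name__ == "__main__":
    short_cls, long_cls = load_both_paths()
    print(short_cls is long_cls)
```

The re-export costs one line per class, at the price of importing 
`pod_operator` whenever the `operators` package is imported.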

Possibly more an AIP-8 question: with moving Azure Blob/S3/GCS to separate 
packages we might have to look at how we enable remote log storage.

-a


> On 11 Nov 2019, at 15:53, Jarek Potiuk <jarek.pot...@polidea.com> wrote:
> 
> On Mon, Nov 11, 2019 at 4:22 PM Kamil Breguła <kamil.breg...@polidea.com>
> wrote:
> 
>> One more question. Are you sure you want to move Python and Bash from
>> core?  These are the elements that are installed in every environment
>> because they are required by Airflow, so moving them to a separate
>> installed package is pointless in my opinion.
>> 
> I have no problem with moving them to "fundamentals", but I am not sure if
> they are really required? I looked through the code and, other than a few
> examples and tests, they are not really "required". Maybe that's enough to
> keep them in fundamentals.
> Also, the Python operator has some dependencies - virtualenv - which is only
> required for this operator, so maybe it's worth keeping it separate from
> "fundamentals".
> 
> 
>> On Mon, Nov 11, 2019 at 3:07 PM Kaxil Naik <kaxiln...@gmail.com> wrote:
>>> 
>>> I am fine with this list +1
>>> 
>>> On Mon, Nov 11, 2019 at 1:27 PM Jarek Potiuk <jarek.pot...@polidea.com>
>>> wrote:
>>> 
>>>> I am all for it Kamil!
>>>> 
>>>> Super happy to treat Apache projects in the same way as "proprietary"
>>>> providers :). Does anyone else have other comments?
>>>> 
>>>> J.
>>>> 
>>>> On Mon, Nov 11, 2019 at 2:17 PM Kamil Breguła <kamil.breg...@polidea.com>
>>>> wrote:
>>>> 
>>>>> I looked at this list and I'm only worried about two operators.
>>>>> 
>>>>> airflow.contrib.operators.vertica_to_hive
>>>>> airflow.contrib.operators.s3_to_hive
>>>>> 
>>>>> If we want the operators to be grouped according to destination, then
>>>>> these operators should be in the apache package. It is the members of
>>>>> the Apache community who will care most about these operators being of
>>>>> high quality. Apache can be treated equally with other large cloud
>>>>> providers, such as GCP and AWS. I can imagine that a new Apache product
>>>>> will appear and will want to promote itself the same way products of
>>>>> cloud providers are promoted: by creating a large number of
>>>>> integrations that allow you to copy data into it.
>>>>> There's another case - building a strong Apache community. As members
>>>>> of the Apache community, we should promote Apache products to ensure
>>>>> that the community develops well, and therefore also promote the
>>>>> integration of our products with other products.
>>>>> 
>>>>> On Mon, Nov 11, 2019 at 12:28 AM Jarek Potiuk <jarek.pot...@polidea.com>
>>>>> wrote:
>>>>>> 
>>>>>> Just to select the "packages" for this update. Does anyone have
>>>>>> objections to this structure (details, including transfer operators, in
>>>>>> https://docs.google.com/spreadsheets/d/17zA5t2JVxnDdg5Cs1Cg_Mb1GXvGctmesfg2L089QSOk/edit#gid=0)?
>>>>>> 
>>>>>> *Fundamentals (no change)*
>>>>>> 
>>>>>> providers
>>>>>>     google
>>>>>>         cloud
>>>>>>         gsuite
>>>>>>         marketing_platform
>>>>>>     amazon
>>>>>>         aws
>>>>>>     microsoft
>>>>>>         azure
>>>>>>     apache
>>>>>>         cassandra
>>>>>>         druid
>>>>>>         hadoop
>>>>>>         hive
>>>>>>         pig
>>>>>>         pinot
>>>>>>         spark
>>>>>>         sqoop
>>>>>>     mysql
>>>>>>     jira
>>>>>>     databricks
>>>>>>     datadog
>>>>>>     dingding
>>>>>>     discord
>>>>>>     cloudant
>>>>>>     jenkins
>>>>>>     opsgenie
>>>>>>     qubole
>>>>>>     salesforce
>>>>>>     segment
>>>>>>     slack
>>>>>>     snowflake
>>>>>>     vertica
>>>>>>     zendesk
>>>>>>     celery
>>>>>>     docker
>>>>>>     bash
>>>>>>     kubernetes
>>>>>>     mssql
>>>>>>     mongodb
>>>>>>     mysql
>>>>>>     openfaas
>>>>>>     oracle
>>>>>>     papermill
>>>>>>     postgres
>>>>>>     presto
>>>>>>     python
>>>>>>     redis
>>>>>>     samba
>>>>>>     sqlite
>>>>>>     imap
>>>>>>     ssh
>>>>>>     filesystem
>>>>>>     sftp
>>>>>>     ftp
>>>>>>     http
>>>>>>     grpc
>>>>>>     smtp
>>>>>>     jdbc
>>>>>>     winrm
>>>>>> 
>>>>>> On Fri, Nov 8, 2019 at 5:47 PM Jarek Potiuk <jarek.pot...@polidea.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> Let me then cancel this vote and I will restart it next week.
>>>>>>> 
>>>>>>> Yeah. It's a bit like re-opening Pandora's box, but now that we know
>>>>>>> that we can do it, and we are unblocked in moving to google (which is
>>>>>>> now the biggest move in progress), we can spend more time on getting a
>>>>>>> better (and more final) consensus.
>>>>>>> I decided to go through the list from the docs (once again Kamil -
>>>>>>> great that you did it) and prepared this spreadsheet showing the
>>>>>>> structure. I went through ALL the operators and put them in the right
>>>>>>> place where our current rules place them.
>>>>>>> 
>>>>>>> After this exercise, I think the following makes sense:
>>>>>>> - put all the stuff except fundamentals in *"providers"* (everything
>>>>>>> in "providers" will be potentially backportable).
>>>>>>> - group apache projects under *"apache"* - similar to
>>>>>>> google/amazon/microsoft (a different kind of ownership, but it is
>>>>>>> still ownership).
>>>>>>> - for the rest, I think what we can do is put the operators in
>>>>>>> folders per "service/company" (without sub-packages). That includes
>>>>>>> sftp/ssh/ftp etc. (should we group [ftp and sftp] or [ssh and sftp]
>>>>>>> ??). There is no "ownership" there and no reason to group them. That
>>>>>>> will put "operators/hooks/sensors" at different levels in the
>>>>>>> directory tree, but we already have that for fundamentals and I am not
>>>>>>> too worried about that. We do not have to have everything at the same
>>>>>>> level.
>>>>>>> - I put transfer operators according to the rule where the "to" side
>>>>>>> is more important unless the other side is a public protocol (so
>>>>>>> sftp -> gcs and gcs -> sftp both go to google/gcp). I did not have any
>>>>>>> doubt where to put which transfer operator, so this is a good sign:
>>>>>>> 
>>>>>>> https://docs.google.com/spreadsheets/d/17zA5t2JVxnDdg5Cs1Cg_Mb1GXvGctmesfg2L089QSOk/edit#gid=0
>>>>>>> 
>>>>>>> Can you please take a look and express your opinions here so that we
>>>>>>> can have final voting next week (for those who are not yet tired of
>>>>>>> the discussion ;)).
>>>>>>> 
>>>>>>> J.
>>>>>>> 
>>>>>>> On Fri, Nov 8, 2019 at 4:38 PM Kaxil Naik <kaxiln...@gmail.com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Yes, that makes sense.
>>>>>>>> 
>>>>>>>> On Fri, Nov 8, 2019 at 3:22 PM Kamil Breguła <kamil.breg...@polidea.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> In the case of Hadoop, it is published by Apache, so it can be in the
>>>>>>>>> apache directory.  This will mimic the grouping presented in the
>>>>>>>>> documentation.
>>>>>>>>> 
>>>>>>>>> https://airflow.readthedocs.io/en/latest/operators-and-hooks-ref.html#software-operators-and-hooks
>>>>>>>>> 
>>>>>>>>> On Fri, Nov 8, 2019 at 3:47 PM Kaxil Naik <kaxiln...@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> I think we should keep the vote open at least until mid next week to
>>>>>>>>>> have more thought and input on this one.
>>>>>>>>>> 
>>>>>>>>>> In general, I am happy with the approach, but operators/hooks and
>>>>>>>>>> sensors shouldn't be a provider. "hadoop" can be its own provider and
>>>>>>>>>> hdfs can be a part of it.
>>>>>>>>>> 
>>>>>>>>>> providers/
>>>>>>>>>>    google
>>>>>>>>>>         cloud
>>>>>>>>>>             operators
>>>>>>>>>>             hooks
>>>>>>>>>>             sensors
>>>>>>>>>>         gsuite
>>>>>>>>>>             operators
>>>>>>>>>>             ...
>>>>>>>>>>    amazon
>>>>>>>>>>         aws
>>>>>>>>>>             operators
>>>>>>>>>>             ...
>>>>>>>>>>    microsoft
>>>>>>>>>>         azure
>>>>>>>>>>             operators
>>>>>>>>>>             ...
>>>>>>>>>>    hadoop
>>>>>>>>>>        hdfs
>>>>>>>>>>             operators
>>>>>>>>>>             ...
>>>>>>>>>> 
>>>>>>>>>> We can also define what a "provider" is so we know what to add to it
>>>>>>>>>> in the future. SSH/FTP/SFTP belong to the same family group. Do we
>>>>>>>>>> want to have separate providers for each one of them ???
>>>>>>>>>> 
>>>>>>>>>> Regards,
>>>>>>>>>> Kaxil
>>>>>>>>>> 
>>>>>>>>>> On Fri, Nov 8, 2019 at 9:08 AM Jarek Potiuk <jarek.pot...@polidea.com>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> I really like making everything a provider. That's a great idea!
>>>>>>>>>>> This way everything "backportable" will have to be in the
>>>>>>>>>>> "providers" package. Really nice and clean separation (and less
>>>>>>>>>>> mess in "airflow"). And we will not have to have any artificial
>>>>>>>>>>> grouping (we can still group them at the documentation level).
>>>>>>>>>>> 
>>>>>>>>>>> We do not need backport in the name. And I think it's more of a
>>>>>>>>>>> technical detail of naming the package, which we can work out while
>>>>>>>>>>> reviewing PRs, and we can agree on the final naming of the released
>>>>>>>>>>> packages at the PMC level (PMCs will have to vote on releasing
>>>>>>>>>>> those).
>>>>>>>>>>> 
>>>>>>>>>>> The thinking is that the intention is really for them to be
>>>>>>>>>>> backported only to 1.10 - we are not going (yet) to use the
>>>>>>>>>>> packages in Airflow 2.*, so I thought that by naming them backport
>>>>>>>>>>> we can express that intent more clearly.
>>>>>>>>>>> 
>>>>>>>>>>> So let me clarify the structure of folders we are going to have if
>>>>>>>>>>> we follow it (I just added some examples), including the already
>>>>>>>>>>> agreed changes from AIP-21:
>>>>>>>>>>> 
>>>>>>>>>>> providers/
>>>>>>>>>>>    google
>>>>>>>>>>>         cloud
>>>>>>>>>>>             operators
>>>>>>>>>>>             hooks
>>>>>>>>>>>             sensors
>>>>>>>>>>>         gsuite
>>>>>>>>>>>             operators
>>>>>>>>>>>             ...
>>>>>>>>>>>    amazon
>>>>>>>>>>>         aws
>>>>>>>>>>>             operators
>>>>>>>>>>>             ...
>>>>>>>>>>>    microsoft
>>>>>>>>>>>         azure
>>>>>>>>>>>             operators
>>>>>>>>>>>             ...
>>>>>>>>>>>    operators
>>>>>>>>>>>         sqlite.py
>>>>>>>>>>>         oracle.py
>>>>>>>>>>>         docker.py
>>>>>>>>>>>    hooks
>>>>>>>>>>>         hdfs.py
>>>>>>>>>>>         sqlite.py
>>>>>>>>>>>    sensors
>>>>>>>>>>>         http.py
>>>>>>>>>>>         sql.py
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> J.
>>>>>>>>>>> 
>>>>>>>>>>> On Fri, Nov 8, 2019 at 9:43 AM Ash Berlin-Taylor <a...@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Do we need to include `-backport`? What was the thinking behind
>>>>>>>>>>>> that?
>>>>>>>>>>>> 
>>>>>>>>>>>> I think software and protocol should be merged. I would also say
>>>>>>>>>>>> _everything_ is a provider, so airflow.providers.ssh.SSHOperator,
>>>>>>>>>>>> for instance, is what I would prefer.
>>>>>>>>>>>> 
>>>>>>>>>>>> -a
>>>>>>>>>>>> 
>>>>>>>>>>>> On 8 November 2019 08:32:42 GMT, Jarek Potiuk
>>>>>>>>>>>> <jarek.pot...@polidea.com> wrote:
>>>>>>>>>>>>> One more day to go. I would love to see some opinions on this
>>>>>>>>>>>>> AIP-21 update :).
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Executive summary:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> * we will be moving a number of integrations to sub-packages of
>>>>>>>>>>>>> airflow.
>>>>>>>>>>>>> * they will be backportable to 1.10.*.  There will be an
>>>>>>>>>>>>> 'apache-airflow-[package]-backport' pypi package, installable
>>>>>>>>>>>>> with python 3, that will make Airflow 2.0 operators/hooks etc.
>>>>>>>>>>>>> available with 1.10.* operators.
>>>>>>>>>>>>> * the current proposal for sub-packages is
>>>>>>>>>>>>> "protocols/software/providers/" (but if you think merging
>>>>>>>>>>>>> protocols and software makes sense - please express your
>>>>>>>>>>>>> opinion).
>>>>>>>>>>>>> * we are not moving "fundamental" operators/hooks etc.
>>>>>>>>>>>>> * Airflow 2.0 is still going to be installed as a single package
>>>>>>>>>>>>> with all operators (so we are not yet implementing AIP-8).
>>>>>>>>>>>>> 
>>>>>>>>>>>>> J.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Wed, Nov 6, 2019 at 10:07 AM Jarek Potiuk
>>>>>>>>>>>>> <jarek.pot...@polidea.com> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I think all these cases are valid but maybe I was not super-clear.
>>>>>>>>>>>>>> It's only the transfer operators that we need to decide where to
>>>>>>>>>>>>>> put - not hooks.
>>>>>>>>>>>>>> Usually the complexity of communication with particular storages
>>>>>>>>>>>>>> is (or at least should be) in the Hooks rather than Operators.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Operators should be just thin wrappers over the logic in the
>>>>>>>>>>>>>> hooks. Hooks are going to stay where they belong - S3 Hooks in
>>>>>>>>>>>>>> amazon, GCS Hooks in google.cloud, GoogleSheet Hooks in
>>>>>>>>>>>>>> google.gsuite.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Since we actually have a mono-repo, it will be no problem (and
>>>>>>>>>>>>>> no cross-dependency problem) to have the S3 -> GCS operator in
>>>>>>>>>>>>>> google and use hooks from both google/amazon.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I hope this alleviates your concern, Daniel?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> J.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> What about GoogleSheetsToS3?  GoogleSheetsToGCS?  These you would
>>>>>>>>>>>>>>> put in the target, i.e. the storage?  But GoogleSheetsToSftp
>>>>>>>>>>>>>>> would be in the google sheets operators file?  The complexity,
>>>>>>>>>>>>>>> and the shared code, are in the gsheet component -- not in the
>>>>>>>>>>>>>>> storage destination.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Tue, Nov 5, 2019 at 5:46 PM Jarek Potiuk
>>>>>>>>>>>>>>> <jarek.pot...@polidea.com> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Hello Airflow Community,
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> This email calls for a vote to update AIP-21 Changes in import
>>>>>>>>>>>>>>>> paths
>>>>>>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-21%3A+Changes+in+import+paths>
>>>>>>>>>>>>>>>> with the changes described below. The vote will last till
>>>>>>>>>>>>>>>> Saturday 8th 2am CEST (72 hours). Committers have a binding
>>>>>>>>>>>>>>>> vote but everyone from the community is encouraged to cast an
>>>>>>>>>>>>>>>> advisory vote.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> *Summary*:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> The proposal is to update AIP-21 to move all non-core
>>>>>>>>>>>>>>>> operators/hooks/sensors (and related files) to sub-packages
>>>>>>>>>>>>>>>> within airflow (protocols/software/providers) or
>>>>>>>>>>>>>>>> (software/providers).
>>>>>>>>>>>>>>>> I am also happy to merge protocols+software, so if you have a
>>>>>>>>>>>>>>>> strong opinion on it - please state it with your vote and we
>>>>>>>>>>>>>>>> can decide based on majority.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Those packages will be separately released (schedule/process
>>>>>>>>>>>>>>>> TBD) and will be backportable to the 1.10.* airflow series, so
>>>>>>>>>>>>>>>> that users can install them and start using new Airflow 2.0
>>>>>>>>>>>>>>>> operators in their Python 3 Airflow 1.10 environments (only
>>>>>>>>>>>>>>>> Python 3.5+ is supported).
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> We will proceed with migrating the providers package to the
>>>>>>>>>>>>>>>> already agreed paths without waiting for the final vote
>>>>>>>>>>>>>>>> (following the current version of AIP-21). Since we have a
>>>>>>>>>>>>>>>> working POC - we know the agreed paths will work for us.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> *Previous discussions:*
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>   - https://lists.apache.org/thread.html/b07a93c9114e3d3c55d4ee514955bac79bc012c7a00db627c6b4c55f@%3Cdev.airflow.apache.org%3E
>>>>>>>>>>>>>>>>   - https://lists.apache.org/thread.html/e25ddc546e367a4af3e594fecbd4431959bd5a89045e748e4206e7ff@%3Cdev.airflow.apache.org%3E
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> *More Details*:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 1) Information that we are going in the direction of AIP-8 but
>>>>>>>>>>>>>>>> not yet reaching it - focusing on separating out backportable
>>>>>>>>>>>>>>>> packages installable in Airflow releases 1.10.*. Airflow 2.0
>>>>>>>>>>>>>>>> will still be installed as a whole and all the source will be
>>>>>>>>>>>>>>>> kept in one repo, but we now have a way to build backportable
>>>>>>>>>>>>>>>> packages for groups of operators. POC available here:
>>>>>>>>>>>>>>>> https://github.com/apache/airflow/pull/6507 (based on Ash's
>>>>>>>>>>>>>>>> https://github.com/ashb/airflow-submodule-test)
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 2) We move all integrations to new packages (keeping deprecated
>>>>>>>>>>>>>>>> import aliases in the old places). The following split
>>>>>>>>>>>>>>>> (according to "stewardship" over the integrations):
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>   - *fundamentals* - core of airflow - they are really part of
>>>>>>>>>>>>>>>>   Apache Airflow. Stewards - core Airflow team. Not
>>>>>>>>>>>>>>>>   backportable/separated out.
>>>>>>>>>>>>>>>>   - *protocols* - are not owned by anyone, they are public and
>>>>>>>>>>>>>>>>   the implementation is fully "open". There are no particular
>>>>>>>>>>>>>>>>   stewards (no need). Users of particular protocols should
>>>>>>>>>>>>>>>>   mainly maintain those and add support for different versions
>>>>>>>>>>>>>>>>   of the protocols.
>>>>>>>>>>>>>>>>   - *software* - both API and software are controlled by someone
>>>>>>>>>>>>>>>>   outside of Airflow (commercial or open-source project), but
>>>>>>>>>>>>>>>>   the deployment of that software is "owned" by the user
>>>>>>>>>>>>>>>>   installing Airflow. The "stewards" might also be the users,
>>>>>>>>>>>>>>>>   but the controlling party (Oracle for example) might be
>>>>>>>>>>>>>>>>   interested in maintaining those operators as well.
>>>>>>>>>>>>>>>>   - *providers* - API/software/deployments are fully controlled
>>>>>>>>>>>>>>>>   by a 3rd party. Here most likely the "provider" will be
>>>>>>>>>>>>>>>>   interested in maintaining the operators (and for example like
>>>>>>>>>>>>>>>>   Google - provide integration guidelines
>>>>>>>>>>>>>>>>   <https://docs.google.com/document/d/1_rTdJSLCt0eyrAylmmgYc3yZr-_h51fVlnvMmWqhCkY/edit?usp=drive_web&ouid=112320280470690058978>
>>>>>>>>>>>>>>>>   for their hooks/operators/sensors)
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 3) Between-providers transfer operators should be kept at the
>>>>>>>>>>>>>>>> "target" rather than the "source".
>>>>>>>>>>>>>>>> For example S3 -> GCS should be in the "google" provider, but
>>>>>>>>>>>>>>>> GCS -> S3 should be in "amazon".
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 4) One-side provider transfer operators should be kept at the
>>>>>>>>>>>>>>>> "provider" regardless of whether they are target or source.
>>>>>>>>>>>>>>>> For example GCS -> SFTP or SFTP -> GCS should be in the "google"
>>>>>>>>>>>>>>>> provider.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 5) If in doubt we will discuss individual cases separately.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> J.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Jarek Potiuk
>>>>>>>>>>>>>>>> Polidea <https://www.polidea.com/> | Principal
>>>> Software
>>>>>>>>> Engineer
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> M: +48 660 796 129 <+48660796129>
>>>>>>>>>>>>>>>> [image: Polidea] <https://www.polidea.com/>
>>>>>>>>>>>>>>>> 