> Another question is operators like SlackWebHookOperator, which depend on
> SimpleHTTPOperator ! Will this cause dependency issues, or with proper
> versioning should this be OK ?

Very good question, Kaxil! This is one of the reasons we do not yet want to
implement AIP-8 in full. There will be dependencies between the packages
(including pip dependencies) that will make it difficult to manage them
fully independently. In this version of the AIP-21 proposal, all the
operators in 2.0 will still be released together with main Airflow (as was
done for Airflow 1.10) from one repository.

We have not yet discussed a versioning scheme for the backport packages. I
think we can decide later, separately, how exactly we name those versions.
But I think we will make a single "snapshot" of all packages moved to
"providers" (for backporting purposes) and release them together. They will
have a single version, plus cross-dependencies between the packages if we
find they are needed. For example, we could add a "slack" -> "http"
dependency while we build the package.

In this AIP-21 backporting scenario, we only have to worry about matching
the full set of pip dependencies between the backport releases and the few
latest 1.10.* releases. This should be doable and testable by installing the
backport packages with recent Airflow releases. We can automate this :).

The best thing is that this whole backporting exercise will help us learn
about all such dependencies (and also about pip dependencies). In the POC
https://github.com/apache/airflow/pull/6507 you can see that for every
package we can define a separate dependency set (for example, the google
package depends on the 'gcp' extra). We can even have a different set of
constraints if we find that certain backport packages need additional
limits on pip versions. Once we do the exercise, have backport releases and
learn from them, we can make much better decisions that might eventually
lead to AIP-8.

J.

> On Mon, Nov 11, 2019 at 3:22 PM Kamil Breguła <kamil.breg...@polidea.com> wrote:
>
> > One more question. Are you sure you want to move Python and Bash from core?
> > These are the elements that are installed in every environment
> > because they are required by Airflow, so moving them to a separately
> > installed package is pointless in my opinion.
> >
> > On Mon, Nov 11, 2019 at 3:07 PM Kaxil Naik <kaxiln...@gmail.com> wrote:
> >
> > > I am fine with this list +1
> > >
> > > On Mon, Nov 11, 2019 at 1:27 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
> > >
> > > > I am all for it Kamil!
> > > >
> > > > Super happy to treat Apache projects in the same way as "proprietary"
> > > > providers :). Does anyone else have other comments ?
> > > >
> > > > J.
> > > >
> > > > On Mon, Nov 11, 2019 at 2:17 PM Kamil Breguła <kamil.breg...@polidea.com> wrote:
> > > >
> > > > > I looked at this list and I'm only worried about two operators.
> > > > >
> > > > > airflow.contrib.operators.vertica_to_hive
> > > > > airflow.contrib.operators.s3_to_hive
> > > > >
> > > > > If we want the operators to be grouped according to destination, then
> > > > > these operators should be in the apache package. It is the members of
> > > > > the Apache community who will care most about these operators being of
> > > > > high quality. Apache can be treated equally with other large cloud
> > > > > providers, such as GCP and AWS. I can imagine that a new Apache product
> > > > > will appear and will want to promote itself the same way cloud
> > > > > providers promote their products: by creating a large number of
> > > > > integrations that allow you to copy data into it.
> > > > > There is another reason - building a strong Apache community. As
> > > > > members of the Apache community, we should promote Apache products to
> > > > > keep the community developing well, and that includes integrating our
> > > > > products with other products.
> > > > >
> > > > > On Mon, Nov 11, 2019 at 12:28 AM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
> > > > >
> > > > > > Just to select the "packages" for this update. Anyone has objections for
> > > > > > this structure (details including transfer operators in
> > > > > > https://docs.google.com/spreadsheets/d/17zA5t2JVxnDdg5Cs1Cg_Mb1GXvGctmesfg2L089QSOk/edit#gid=0)?
> > > > > >
> > > > > > *Fundamentals (no change)*
> > > > > >
> > > > > > providers
> > > > > >   google
> > > > > >     cloud
> > > > > >     gsuite
> > > > > >     marketing_platform
> > > > > >   amazon
> > > > > >     aws
> > > > > >   microsoft
> > > > > >     azure
> > > > > >   apache
> > > > > >     cassandra
> > > > > >     druid
> > > > > >     hadoop
> > > > > >     hive
> > > > > >     pig
> > > > > >     pinot
> > > > > >     spark
> > > > > >     sqoop
> > > > > >   mysql
> > > > > >   jira
> > > > > >   databricks
> > > > > >   datadog
> > > > > >   dingding
> > > > > >   discord
> > > > > >   cloudant
> > > > > >   jenkins
> > > > > >   opsgenie
> > > > > >   qubole
> > > > > >   salesforce
> > > > > >   segment
> > > > > >   slack
> > > > > >   snowflake
> > > > > >   vertica
> > > > > >   zendesk
> > > > > >   celery
> > > > > >   docker
> > > > > >   bash
> > > > > >   kubernetes
> > > > > >   mssql
> > > > > >   mongodb
> > > > > >   mysql
> > > > > >   openfaas
> > > > > >   oracle
> > > > > >   papermill
> > > > > >   postgres
> > > > > >   presto
> > > > > >   python
> > > > > >   redis
> > > > > >   samba
> > > > > >   sqlite
> > > > > >   imap
> > > > > >   ssh
> > > > > >   filesystem
> > > > > >   sftp
> > > > > >   ftp
> > > > > >   http
> > > > > >   grpc
> > > > > >   smtp
> > > > > >   jdbc
> > > > > >   winrm
> > > > > >
> > > > > > On Fri, Nov 8, 2019 at 5:47 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
> > > > > >
> > > > > > > Let me then cancel this
> > > > > > > vote and I will restart it next week.
> > > > > > >
> > > > > > > Yeah. It's a bit like re-opening Pandora's box, but now that we know
> > > > > > > that we can do it, and we are unblocked in moving to google (which is
> > > > > > > now the biggest move in progress), we can spend more time on getting a
> > > > > > > better (and more final) consensus.
> > > > > > > I decided to go through the list from the docs (once again Kamil -
> > > > > > > great that you did it) and prepared this spreadsheet showing the
> > > > > > > structure. I went through ALL the operators and put them in the place
> > > > > > > where our current rules put them.
> > > > > > >
> > > > > > > After this exercise, I think the following makes sense:
> > > > > > > - put all the stuff except fundamentals in *"providers"* (everything
> > > > > > > in "providers" will be potentially backportable).
> > > > > > > - group apache projects under *"apache"* - similar to
> > > > > > > google/amazon/microsoft (a different kind of ownership, but it is
> > > > > > > still ownership).
> > > > > > > - for the rest, I think what we can do is put the operators in
> > > > > > > folders per "service/company" (without sub-packages). That includes
> > > > > > > sftp/ssh/ftp etc. (should we group [ftp and sftp] or [ssh and sftp]
> > > > > > > ??). There is no "ownership" there and no reason to group them. That
> > > > > > > will put "operators/hooks/sensors" at different levels in the
> > > > > > > directory tree, but we already have that for fundamentals and I am not
> > > > > > > too worried about that. We do not have to have everything at the same
> > > > > > > level.
> > > > > > > - I put transfer operators according to the rule where "to" side is
> > > > > > > more important unless the other side is a public protocol (so sftp ->
> > > > > > > gcs and gcs -> sftp both go to google/gcp). I did not have any doubt
> > > > > > > where to put which transfer operator, so this is a good sign:
> > > > > > >
> > > > > > > https://docs.google.com/spreadsheets/d/17zA5t2JVxnDdg5Cs1Cg_Mb1GXvGctmesfg2L089QSOk/edit#gid=0
> > > > > > >
> > > > > > > Can you please take a look and express your opinions here so that we
> > > > > > > can have final voting next week (for those who are not yet tired with
> > > > > > > the discussion ;)).
> > > > > > >
> > > > > > > J.
> > > > > > >
> > > > > > > On Fri, Nov 8, 2019 at 4:38 PM Kaxil Naik <kaxiln...@gmail.com> wrote:
> > > > > > >
> > > > > > >> Yes, that makes sense.
> > > > > > >>
> > > > > > >> On Fri, Nov 8, 2019 at 3:22 PM Kamil Breguła <kamil.breg...@polidea.com> wrote:
> > > > > > >>
> > > > > > >> > In the case of Hadoop, it is published by Apache, so it can be in
> > > > > > >> > the apache directory. This will mimic the grouping presented in the
> > > > > > >> > documentation.
> > > > > > >> >
> > > > > > >> > https://airflow.readthedocs.io/en/latest/operators-and-hooks-ref.html#software-operators-and-hooks
> > > > > > >> >
> > > > > > >> > On Fri, Nov 8, 2019 at 3:47 PM Kaxil Naik <kaxiln...@gmail.com> wrote:
> > > > > > >> >
> > > > > > >> > > I think we should keep the vote open at least until mid next week
> > > > > > >> > > to have more thought and inputs on this one.
> > > > > > >> > >
> > > > > > >> > > In general, I am happy with the approach but operators/hooks and
> > > > > > >> > > sensors shouldn't be a provider. "hadoop" can be its provider and
> > > > > > >> > > hdfs can be a part of it.
> > > > > > >> > >
> > > > > > >> > > providers/
> > > > > > >> > >     google
> > > > > > >> > >         cloud
> > > > > > >> > >             operators
> > > > > > >> > >             hooks
> > > > > > >> > >             sensors
> > > > > > >> > >         gsuite
> > > > > > >> > >             operators
> > > > > > >> > >             ...
> > > > > > >> > >     amazon
> > > > > > >> > >         aws
> > > > > > >> > >             operators
> > > > > > >> > >             ...
> > > > > > >> > >     microsoft
> > > > > > >> > >         azure
> > > > > > >> > >             operators
> > > > > > >> > >             ...
> > > > > > >> > >     hadoop
> > > > > > >> > >         hdfs
> > > > > > >> > >             operators
> > > > > > >> > >             ...
> > > > > > >> > >
> > > > > > >> > > We can also define what is a "provider" so we know what to add in
> > > > > > >> > > it in the future. SSH/FTP/SFTP belongs to the same family group. Do
> > > > > > >> > > we want to have separate providers for each one of them ???
> > > > > > >> > >
> > > > > > >> > > Regards,
> > > > > > >> > > Kaxil
> > > > > > >> > >
> > > > > > >> > > On Fri, Nov 8, 2019 at 9:08 AM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
> > > > > > >> > >
> > > > > > >> > > > I really like to make everything a provider. That's a great idea !
> > > > > > >> > > > This way everything "backportable" will have to be in "providers"
> > > > > > >> > > > package. Really nice and clean separation (and less mess in
> > > > > > >> > > > "airflow"). And we will not have to have any artificial grouping
> > > > > > >> > > > (we can still group them at the documentation level).
> > > > > > >> > > >
> > > > > > >> > > > We do not need backport in name.
> > > > > > >> > > > And I think naming the package is more of a technical detail,
> > > > > > >> > > > which we can work out while reviewing PRs, and we can agree on the
> > > > > > >> > > > final naming of the released packages at the PMC level (the PMC
> > > > > > >> > > > will have to vote on releasing those).
> > > > > > >> > > >
> > > > > > >> > > > The thinking is that the intention is really for them to be only
> > > > > > >> > > > backported to 1.10 - we are not going (yet) to use the packages in
> > > > > > >> > > > Airflow 2.*, so I thought that by naming them backport we could
> > > > > > >> > > > express that intent more clearly.
> > > > > > >> > > >
> > > > > > >> > > > So let me clarify the structure of folders we are going to have if
> > > > > > >> > > > we follow it (I just added some examples), including the already
> > > > > > >> > > > agreed changes from AIP-21:
> > > > > > >> > > >
> > > > > > >> > > > providers/
> > > > > > >> > > >     google
> > > > > > >> > > >         cloud
> > > > > > >> > > >             operators
> > > > > > >> > > >             hooks
> > > > > > >> > > >             sensors
> > > > > > >> > > >         gsuite
> > > > > > >> > > >             operators
> > > > > > >> > > >             ...
> > > > > > >> > > >     amazon
> > > > > > >> > > >         aws
> > > > > > >> > > >             operators
> > > > > > >> > > >             ...
> > > > > > >> > > >     microsoft
> > > > > > >> > > >         azure
> > > > > > >> > > >             operators
> > > > > > >> > > >             ...
> > > > > > >> > > >     operators
> > > > > > >> > > >         sqlite.py
> > > > > > >> > > >         oracle.py
> > > > > > >> > > >         docker.py
> > > > > > >> > > >     hooks
> > > > > > >> > > >         hdfs.py
> > > > > > >> > > >         sqlite.py
> > > > > > >> > > >     sensors
> > > > > > >> > > >         http.py
> > > > > > >> > > >         sql.py
> > > > > > >> > > >
> > > > > > >> > > > J.
> > > > > > >> > > >
> > > > > > >> > > > > > > > > > >> > > > On Fri, Nov 8, 2019 at 9:43 AM Ash Berlin-Taylor < > > > > > a...@apache.org> > > > > > > >> > wrote: > > > > > > >> > > > > > > > > > >> > > > > Do we need to include `-backport,`? What was the > > thinking > > > > > behind > > > > > > >> > that? > > > > > > >> > > > > > > > > > > >> > > > > I think software and protocol should be merged. I > would > > also > > > > > say > > > > > > >> > > > > _everything_ is a provider, so > > > > > airflow.providers.ssh.SSHOperator > > > > > > >> for > > > > > > >> > > > > instance is what I would prefer > > > > > > >> > > > > > > > > > > >> > > > > -a > > > > > > >> > > > > > > > > > > >> > > > > On 8 November 2019 08:32:42 GMT, Jarek Potiuk < > > > > > > >> > jarek.pot...@polidea.com> > > > > > > >> > > > > wrote: > > > > > > >> > > > > >One more day to go. I would love to see some opinions > > on > > > > this > > > > > > >> AIP-21 > > > > > > >> > > > > >update > > > > > > >> > > > > >:). > > > > > > >> > > > > > > > > > > > >> > > > > >Executive summary: > > > > > > >> > > > > > > > > > > > >> > > > > >* we will be moving a number of integrations to > > > > sub-packages > > > > > of > > > > > > >> > > > > >airflow. > > > > > > >> > > > > >* they will be backportable to 1.10.*. There will be > > > > > > >> > > > > >'apache-airflow-[package]-backport' pypi installable > > with > > > > > python > > > > > > >> 3 > > > > > > >> > that > > > > > > >> > > > > >will make Airflow 2.0 operators/hooks etc. available > > with > > > > > 1.10* > > > > > > >> > > > > >operators. > > > > > > >> > > > > >* the current proposal for sub-packages is > > > > > > >> > > > > >"protocols/software/providers/" > > > > > > >> > > > > >(but if you think merging protocols and software > makes > > > > sense > > > > > - > > > > > > >> > please > > > > > > >> > > > > >express your opinion > > > > > > >> > > > > >* we are not moving "fundamental" operators/hooks > etc.. 
> > > > > > >> > > > > >* Airflow 2.0 is still going to be installed as a > > single > > > > > package > > > > > > >> > with > > > > > > >> > > > > >all > > > > > > >> > > > > >operators (so we are not yet implementing AIP-8) > > > > > > >> > > > > > > > > > > > >> > > > > >J. > > > > > > >> > > > > > > > > > > > >> > > > > >On Wed, Nov 6, 2019 at 10:07 AM Jarek Potiuk < > > > > > > >> > jarek.pot...@polidea.com> > > > > > > >> > > > > >wrote: > > > > > > >> > > > > > > > > > > > >> > > > > >> I think all this cases are valid but maybe I was > not > > > > > > >> super-clear. > > > > > > >> > > > > >It's > > > > > > >> > > > > >> only the transfer operators that we need to decide > > where > > > > to > > > > > > >> put - > > > > > > >> > not > > > > > > >> > > > > >> hooks. > > > > > > >> > > > > >> Usually the complexity of communication with > > particular > > > > > > >> storages > > > > > > >> > is > > > > > > >> > > > > >(or at > > > > > > >> > > > > >> least should be) in the Hooks rather than > Operators. > > > > > > >> > > > > >> > > > > > > >> > > > > >> Operators should be just thin wrappers over the > > logic in > > > > > the > > > > > > >> > hooks. > > > > > > >> > > > > >> Hooks are going to stay where they belong - S3 > Hooks > > in > > > > > amazon, > > > > > > >> > GCS > > > > > > >> > > > > >Hooks > > > > > > >> > > > > >> in google.cloud, GoogleSheet Hooks in > google.gsuite. > > > > > > >> > > > > >> > > > > > > >> > > > > >> Since we actually have mono-repo - this will be no > > > > problem > > > > > > >> (and no > > > > > > >> > > > > >cross > > > > > > >> > > > > >> dependencies problem) to have S3 -> GCS operator > in > > > > > google and > > > > > > >> > use > > > > > > >> > > > > >hooks > > > > > > >> > > > > >> from both google/amazon. > > > > > > >> > > > > >> > > > > > > >> > > > > >> I hope this alleviates your concern Daniel ? > > > > > > >> > > > > >> > > > > > > >> > > > > >> J. 
> > > > > > >> > > > > >> > > > > > > >> > > > > >> > > > > > > >> > > > > >>> What about GoogleSheetsToS3? GoogleSheetsToGCS? > > These > > > > > you > > > > > > >> would > > > > > > >> > > > > >put in > > > > > > >> > > > > >>> the target, i.e. the storage? But > > GoogleSheetsToSftp > > > > > would > > > > > > >> be in > > > > > > >> > > > > >google > > > > > > >> > > > > >>> sheets operators file? The complexity, and the > > shared > > > > > code, > > > > > > >> are > > > > > > >> > in > > > > > > >> > > > > >the > > > > > > >> > > > > >>> gsheet component -- not into the storage > > destination. > > > > > > >> > > > > >>> > > > > > > >> > > > > >>> > > > > > > >> > > > > >> > > > > > > >> > > > > >> > > > > > > >> > > > > >> > > > > > > >> > > > > >>> On Tue, Nov 5, 2019 at 5:46 PM Jarek Potiuk > > > > > > >> > > > > ><jarek.pot...@polidea.com> > > > > > > >> > > > > >>> wrote: > > > > > > >> > > > > >>> > > > > > > >> > > > > >>> > Hello Airflow Community, > > > > > > >> > > > > >>> > > > > > > > >> > > > > >>> > The email calls for a vote to update AIP-21 > > Changes in > > > > > > >> import > > > > > > >> > > > > >paths > > > > > > >> > > > > >>> > < > > > > > > >> > > > > >>> > > > > > > > >> > > > > >>> > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-21%3A+Changes+in+import+paths > > > > > > >> > > > > >>> > > > > > > > > >> > > > > >>> > with > > > > > > >> > > > > >>> > the changes described below. The vote will last > > till > > > > > > >> Saturday > > > > > > >> > 8th > > > > > > >> > > > > >2am > > > > > > >> > > > > >>> CEST > > > > > > >> > > > > >>> > (72 hours). Committers have a binding vote but > > > > everyone > > > > > from > > > > > > >> > the > > > > > > >> > > > > >>> community > > > > > > >> > > > > >>> > is encouraged to cast an advisory vote. 
> > > > > > >> > > > > >>> > > > > > > > >> > > > > >>> > *Summary*: > > > > > > >> > > > > >>> > > > > > > > >> > > > > >>> > The proposal is to update AIP-21 to move all > > non-core > > > > > > >> > > > > >>> > operators/hooks/sensor (and related files) to > > > > > sub-packages > > > > > > >> > within > > > > > > >> > > > > >>> airflow > > > > > > >> > > > > >>> > (protocols/software/providers) or > > > > (software/providers). > > > > > > >> > > > > >>> > I am also happy to merge protocols+software, so > > if you > > > > > have > > > > > > >> a > > > > > > >> > > > > >strong > > > > > > >> > > > > >>> > opinion on it - please state it with your vote > > and we > > > > > can > > > > > > >> > decide > > > > > > >> > > > > >based > > > > > > >> > > > > >>> on > > > > > > >> > > > > >>> > majority. > > > > > > >> > > > > >>> > > > > > > > >> > > > > >>> > Those packages will be separately released > > > > > (schedule/process > > > > > > >> > TBD) > > > > > > >> > > > > >and > > > > > > >> > > > > >>> will > > > > > > >> > > > > >>> > be backportable to 1.10.* airflow series, so > that > > > > users > > > > > can > > > > > > >> > > > > >install it > > > > > > >> > > > > >>> and > > > > > > >> > > > > >>> > start using new Airflow2.0 operators in their > > Python 3 > > > > > > >> Airflow > > > > > > >> > > > > >1.10 > > > > > > >> > > > > >>> > environments (only Python 3.5+ is supported). > > > > > > >> > > > > >>> > > > > > > > >> > > > > >>> > We will proceed with migrating the providers > > package > > > > to > > > > > > >> already > > > > > > >> > > > > >agreed > > > > > > >> > > > > >>> > paths without waiting for the final vote > > (following > > > > > current > > > > > > >> > > > > >version of > > > > > > >> > > > > >>> > AIP-21). Since we have working POC - we know the > > > > agreed > > > > > > >> paths > > > > > > >> > will > > > > > > >> > > > > >work > > > > > > >> > > > > >>> for > > > > > > >> > > > > >>> > us. 
> > > > > > >> > > > > >>> > > > > > > > >> > > > > >>> > *Previous discussions: * > > > > > > >> > > > > >>> > > > > > > > >> > > > > >>> > - > > > > > > >> > > > > >>> > > > > > > > >> > > > > >>> > > > > > > > >> > > > > >>> > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > > > > > > > https://lists.apache.org/thread.html/b07a93c9114e3d3c55d4ee514955bac79bc012c7a00db627c6b4c55f@%3Cdev.airflow.apache.org%3E > > > > > > >> > > > > >>> > - > > > > > > >> > > > > >>> > > > > > > > >> > > > > >>> > > > > > > > >> > > > > >>> > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > > > > > > > https://lists.apache.org/thread.html/e25ddc546e367a4af3e594fecbd4431959bd5a89045e748e4206e7ff@%3Cdev.airflow.apache.org%3E > > > > > > >> > > > > >>> > > > > > > > >> > > > > >>> > *More Details*: > > > > > > >> > > > > >>> > > > > > > > >> > > > > >>> > 1) Information that we are going in the > direction > > of > > > > > AIP-8 > > > > > > >> but > > > > > > >> > not > > > > > > >> > > > > >yet > > > > > > >> > > > > >>> > reaching it - focusing on separating out > > backportable > > > > > > >> packages > > > > > > >> > > > > >>> installable > > > > > > >> > > > > >>> > in Airflow releases 1.10.* . Airflow 2.0 will > > still be > > > > > > >> > installed > > > > > > >> > > > > >as a > > > > > > >> > > > > >>> whole > > > > > > >> > > > > >>> > and all the source will be kept in one repo, but > > we > > > > now > > > > > > >> have a > > > > > > >> > way > > > > > > >> > > > > >to > > > > > > >> > > > > >>> build > > > > > > >> > > > > >>> > backportable packages for groups of operators. 
> POC > > > > > available > > > > > > >> > here: > > > > > > >> > > > > >>> > https://github.com/apache/airflow/pull/6507 > > (based on > > > > > Ash's > > > > > > >> > > > > >>> > https://github.com/ashb/airflow-submodule-test) > > > > > > >> > > > > >>> > > > > > > > >> > > > > >>> > 2) We move all integrations to new packages > > (keeping > > > > > > >> deprecated > > > > > > >> > > > > >import > > > > > > >> > > > > >>> > aliases in the old places). The following split > > > > > (according > > > > > > >> to > > > > > > >> > > > > >>> "stewardship" > > > > > > >> > > > > >>> > over the integrations): > > > > > > >> > > > > >>> > > > > > > > >> > > > > >>> > - *fundamentals* - core of ariflow - they are > > > > really > > > > > > >> part of > > > > > > >> > > > > >Apache > > > > > > >> > > > > >>> > Airflow. Stewards - core Airflow team. Not > > > > > > >> > > > > >backportable/separated > > > > > > >> > > > > >>> out. > > > > > > >> > > > > >>> > - *protocols* - are not owned by anyone, they > > are > > > > > public > > > > > > >> and > > > > > > >> > > > > >the > > > > > > >> > > > > >>> > implementation is fully "open". There are no > > > > > particular > > > > > > >> > > > > >stewards (no > > > > > > >> > > > > >>> > need). > > > > > > >> > > > > >>> > Users of particular protocols should mainly > > > > maintain > > > > > > >> those > > > > > > >> > and > > > > > > >> > > > > >add > > > > > > >> > > > > >>> > support > > > > > > >> > > > > >>> > for different versions of the protocols. > > > > > > >> > > > > >>> > - *software* - both API and software are > > controlled > > > > > by > > > > > > >> > someone > > > > > > >> > > > > >>> outside > > > > > > >> > > > > >>> > of Airflow (commercial or open-source > > project), but > > > > > the > > > > > > >> > > > > >deployment of > > > > > > >> > > > > >>> > that > > > > > > >> > > > > >>> > software is "owned" by the user installing > > Airflow. 
> > > > > The > > > > > > >> > > > > >"stewardship" > > > > > > >> > > > > >>> > might > > > > > > >> > > > > >>> > be also the users but the controlling party > > (Oracle > > > > > for > > > > > > >> > > > > >example) > > > > > > >> > > > > >>> might > > > > > > >> > > > > >>> > be > > > > > > >> > > > > >>> > interested in maintaining those operators as > > well. > > > > > > >> > > > > >>> > - *providers* - API/software/deployments are > > fully > > > > > > >> > controlled > > > > > > >> > > > > >by a > > > > > > >> > > > > >>> 3rd > > > > > > >> > > > > >>> > party. Here most likely "provider" will be > > > > > interested in > > > > > > >> > > > > >maintaining > > > > > > >> > > > > >>> the > > > > > > >> > > > > >>> > operators (and for example like Google - > > provide > > > > > > >> integration > > > > > > >> > > > > >>> guidelines > > > > > > >> > > > > >>> > < > > > > > > >> > > > > >>> > > > > > > > >> > > > > >>> > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > > > > > > > https://docs.google.com/document/d/1_rTdJSLCt0eyrAylmmgYc3yZr-_h51fVlnvMmWqhCkY/edit?usp=drive_web&ouid=112320280470690058978 > > > > > > >> > > > > >>> > > > > > > > > >> > > > > >>> > for > > > > > > >> > > > > >>> > their hooks/operators/sensors) > > > > > > >> > > > > >>> > > > > > > > >> > > > > >>> > > > > > > > >> > > > > >>> > 3) Between-providers transfer operators should > be > > kept > > > > > at > > > > > > >> the > > > > > > >> > > > > >"target" > > > > > > >> > > > > >>> > rather than "source" > > > > > > >> > > > > >>> > For example S3 -> GCS should be in "google" > > provider, > > > > > but > > > > > > >> > GCS-> S3 > > > > > > >> > > > > >>> should > > > > > > >> > > > > >>> > be in "amazon". 
> > > > > > >> > > > > >>> >
> > > > > > >> > > > > >>> > 4) One-side provider transfer operators should be kept at the
> > > > > > >> > > > > >>> > "provider" regardless if they are target or source.
> > > > > > >> > > > > >>> > For example GCS-> SFTP or SFTP -> GCS should be in "google"
> > > > > > >> > > > > >>> > provider.
> > > > > > >> > > > > >>> >
> > > > > > >> > > > > >>> > 5) If in doubt we will discuss individual cases separately.
> > > > > > >> > > > > >>> >
> > > > > > >> > > > > >>> > J.
> > > > > > >> > > > > >>> >
> > > > > > >> > > > > >>> > --
> > > > > > >> > > > > >>> >
> > > > > > >> > > > > >>> > Jarek Potiuk
> > > > > > >> > > > > >>> > Polidea <https://www.polidea.com/> | Principal Software Engineer
> > > > > > >> > > > > >>> >
> > > > > > >> > > > > >>> > M: +48 660 796 129 <+48660796129>
> > > > > > >> > > > > >>> > [image: Polidea] <https://www.polidea.com/>

--
Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
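
For reference, the transfer-operator placement rules from the vote email quoted above (points 3 to 5) can be sketched as a small Python function. This is only an illustration under stated assumptions: the `PROVIDER_OF` mapping and the `transfer_operator_home` name are hypothetical and are not part of Airflow or of AIP-21, and any service missing from the mapping is simply treated as a public protocol with no owning provider.

```python
from typing import Optional

# Hypothetical mapping from a service to the provider that "owns" it.
# Anything not listed here is treated as a public protocol (sftp, ftp, ...).
PROVIDER_OF = {
    "gcs": "google",
    "s3": "amazon",
}


def transfer_operator_home(source: str, target: str) -> Optional[str]:
    """Return the provider package for a `source -> target` transfer operator.

    Returns None when neither side has a provider, which corresponds to
    "if in doubt we will discuss individual cases separately".
    """
    source_owner = PROVIDER_OF.get(source)
    target_owner = PROVIDER_OF.get(target)
    if target_owner is not None:
        # Point 3: between two providers the target side wins.
        # This also covers protocol -> provider (e.g. sftp -> gcs).
        return target_owner
    if source_owner is not None:
        # Point 4: only one side has a provider, so it goes there
        # even though that provider is the source (e.g. gcs -> sftp).
        return source_owner
    # Point 5: no provider on either side, decide case by case.
    return None


if __name__ == "__main__":
    for src, dst in [("s3", "gcs"), ("gcs", "s3"), ("gcs", "sftp"), ("sftp", "gcs")]:
        print(f"{src} -> {dst}: {transfer_operator_home(src, dst)}")
```

With this mapping, S3 -> GCS lands in "google", GCS -> S3 in "amazon", and both GCS -> SFTP and SFTP -> GCS in "google", matching the examples given in the vote email.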