Also, in this case you are free to decide how you publish or produce the documentation and release the software (and whether you test it or not) - this is precisely the freedom that Ash mentioned you get when you are not bound by the community rules.
BTW, adding a link to the ecosystem page is very easy - just click "Suggest a change on this page" and you will get a PR opened, where you will be able to link to your provider.

J.

On Tue, Dec 13, 2022 at 4:21 PM Jarek Potiuk <[email protected]> wrote:

> The documentation index is only for the community and only for providers managed by the community.
>
> This is a very strict requirement by The Apache Software Foundation. According to the Apache Software Foundation rules, we cannot suggest or even hint that software released by a 3rd party is "Apache Software Foundation" software. If we add a link to 3rd-party software, we have to explicitly state that it is not ASF software.
> For such cases, we have a dedicated "Ecosystem" page https://airflow.apache.org/ecosystem/ which explicitly states:
>
> "These resources and services are not maintained, nor endorsed by the Apache Airflow Community and Apache Airflow project (maintained by the Committers and the Airflow PMC). Use them at your sole discretion. The community does not verify the licences nor validity of those tools, so it’s your responsibility to verify them."
>
> And there is a section for 3rd-party plugins and providers: https://airflow.apache.org/ecosystem/#third-party-airflow-plugins-and-providers
>
> J.
>
> On Tue, Dec 13, 2022 at 4:01 PM Philippe Lanoe <[email protected]> wrote:
>
>> Hello Airflow community,
>>
>> We decided internally that we cannot take on the maintenance of this CI environment on our side right now. However, as Ash mentioned and suggested, we would like to be part of the documentation index <https://airflow.apache.org/docs/#providers-packagesdocsapache-airflow-providersindexhtml>. I assume that the documentation of our provider should be handled by us (for instance in our Git repository)? Or should the documentation of the provider be part of the Airflow repository, though the code lives outside?
>>
>> In any case, what is the process for getting into the documentation index? Do we need to raise a PR or is this request in this email enough? Please let us know what inputs are required from our side.
>>
>> Thanks,
>> Philippe
>>
>> On Fri, Dec 9, 2022 at 9:17 PM Oliveira, Niko <[email protected]> wrote:
>>
>>> This has yet to be published by Google and Amazon - I know they are progressing a lot on making the automation and publishing regular results of the System tests from main in the way that we can verify that all tests pass - all that is done outside of the community resources and maintenance (i.e. this is entirely on the Amazon and Google teams to run and publish those tests).
>>>
>>> Just as an update: we're still working hard on this, I promise :) It has taken MUCH longer than expected to get all the requisite internal approvals and agreement on how to share the results with the community. But we're zeroing in on an approach that everyone agrees on for publication. Please bear with us on this one!
>>>
>>> In this case - anyone with any "Cloudera" account should be able to run it locally when contributing. But the idea of AIP-47 was to off-load regular execution of those tests and provide public "status" of those to those teams of those service providers that want to make sure that their provider still runs.
>>>
>>> Agreed, the tests are written in a way that anyone can run them (with mechanisms to provide any pre-existing resources some tests require). But to expect the community to have the resources to regularly run all the system tests for all providers is unreasonable; collaboration is really required here.
>>>
>>> Cheers,
>>> Niko
>>>
>>> ------------------------------
>>> *From:* Pierre Jeambrun <[email protected]>
>>> *Sent:* Friday, December 9, 2022 11:43:21 AM
>>> *To:* [email protected]
>>> *Subject:* RE: [EXTERNAL][VOTE] New Provider: Cloudera
>>>
>>> Thanks for taking time to give more details Jarek. This puts things in perspective.
>>>
>>> On Fri, Dec 9, 2022 at 18:48, Collin McNulty <[email protected]> wrote:
>>>
>>>> I concur with the concerns raised by Ash. Cloudera seems like an organization quite well suited to releasing its own provider. If such an organization is not expected to release outside the Apache process, who is? Maybe I'm misunderstanding, but I thought that the idea was that providers going forward would be mostly third party which allows for a larger and more vibrant ecosystem.
>>>>
>>>> Collin McNulty
>>>>
>>>> On Fri, Dec 9, 2022 at 6:04 AM Pierre Jeambrun <[email protected]> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I am really excited about a public official Cloudera provider for Airflow. This would be a great addition to the Airflow ecosystem.
>>>>>
>>>>> System tests would be an additional layer that would be great for the CI and release process, but would individual contributors be able to run these system tests locally? From what I understand, such credentials would be stored in the CI, and only people with their own credentials would be able to test the code locally and therefore realistically help in maintaining the provider. (Iterating on CI failure wouldn't be great :p)
>>>>>
>>>>> Ash's point resonates with me; I remember when I had to work on a specific provider where free accounts/quotas were not available.
>>>>> It was basically a shot in the dark, making code changes based on documentation and API specs without being able to actually test the code. Maybe this was the reason the issues stayed open for more than a year without being picked up.
>>>>>
>>>>> Will the community really be able to contribute and support the provider, while most of us don't have a paid account? Or is it 'stakeholders' maintained and 'community' released at most? (Even reviewing code for release would be tricky without an account.)
>>>>>
>>>>> Maybe I misunderstood something and apologize in advance.
>>>>>
>>>>> Best regards,
>>>>> Pierre
>>>>>
>>>>> On Fri, Dec 9, 2022 at 12:24, Jarek Potiuk <[email protected]> wrote:
>>>>>
>>>>>> > My concern about how we will actually test it works given we'd need a cloudera account/install/instance would be good to comment on though.
>>>>>>
>>>>>> This is a very good point Ash and I love you've made it as I think we have a very good solution at hand.
>>>>>>
>>>>>> This simply calls for Cloudera's commitment to work on AIP-47 style tests and providing a test bed for that.
>>>>>>
>>>>>> This has yet to be published by Google and Amazon - I know they are progressing a lot on making the automation and publishing regular results of the System tests from main in the way that we can verify that all tests pass - all that is done outside of the community resources and maintenance (i.e. this is entirely on the Amazon and Google teams to run and publish those tests).
>>>>>>
>>>>>> So I have a PROPOSAL (I can send a formal vote on that shortly)
>>>>>>
>>>>>> For the future (starting from Cloudera) we should make it a requirement that any provider accepted by the community MUST have AIP-47 style System Tests and the service provider in question MUST provide their own System Test environment with public access of status for the community and commit to maintaining those for as long as the Provider is released by the community.
>>>>>>
>>>>>> I think this is a very reasonable ask for Cloudera (and anyone else in the future) and a very, very good compromise (win-win for both sides while also requiring both sides to commit to a long term cooperation). This way we make sure we have to cooperate with the service provider rather than letting the Service provider "throw the code over the fence" and put all the burden of maintenance on the community.
>>>>>>
>>>>>> * with AIP-47 we provided a very solid foundation for fully-automated system testing of precisely this kind of external service providers
>>>>>> * we (community) take on our shoulders the burden of reviewing and releasing the code, and at the same time the service gets community recognition and becomes part of the "Airflow Community supported"
>>>>>> * similarly the Service Provider takes on their shoulders the burden of running and keeping in check the System Tests Bed for their system tests submitted to the community and making sure they succeed before the release happens
>>>>>> * whenever we release such a service provider - we hold on with the release for that provider until the system tests for such provider are green (and it's on the service provider to fix the problems with those before we release).
>>>>>> * I know both Google and Amazon are committed to doing so, I also know Databricks is looking into it and in the future we might decide to apply it to all "external service providers".
>>>>>>
>>>>>> Philippe - what do you think about such an arrangement? Is that something that Cloudera will be able to commit to?
>>>>>>
>>>>>> J.
>>>>>>
>>>>>> On Fri, Dec 9, 2022 at 12:00 PM Ash Berlin-Taylor <[email protected]> wrote:
>>>>>>
>>>>>>> As per the original vote email:
>>>>>>>
>>>>>>> Please note that this vote is about the fact to add this new provider not about the code itself, which will be reviewed as part of the PR
>>>>>>>
>>>>>>> So it's not a veto (as vetoes can only apply to code).
>>>>>>>
>>>>>>> My concern about how we will actually test it works given we'd need a cloudera account/install/instance would be good to comment on though.
>>>>>>>
>>>>>>> -ash
>>>>>>>
>>>>>>> On Dec 7 2022, at 1:43 pm, Jarek Potiuk <[email protected]> wrote:
>>>>>>>
>>>>>>> Yeah. I would really want to understand that (and maybe others have an opinion here):
>>>>>>>
>>>>>>> https://www.apache.org/foundation/voting.html
>>>>>>>
>>>>>>> * Is this a "code modification" - where -1 is veto
>>>>>>> * or is it a "procedural issue" - where -1 is just a vote and majority rules
>>>>>>>
>>>>>>> I personally think that "code modification" is really on "PR review" level - when we see that the code submitted is not good. But this case seems to be more of a procedural issue than code modification. For me this is more "are we ok to accept a provider from cloudera?" rather than "do we accept this code".
>>>>>>>
>>>>>>> Ash - how do you treat your -1?
>>>>>>>
>>>>>>> And others - what do you think of that?
>>>>>>>
>>>>>>> I think the next course of action depends on whether we have consensus on how we treat the issue of "adding a new provider".
>>>>>>>
>>>>>>> J.
>>>>>>>
>>>>>>> On Wed, Dec 7, 2022 at 1:45 PM Philippe Lanoe <[email protected]> wrote:
>>>>>>>
>>>>>>> Hello Airflow community,
>>>>>>>
>>>>>>> Following up on this -1. I'm assuming that's a veto?
>>>>>>>
>>>>>>> If it is, would it be possible to decouple the provider sustainability discussion from this proposal (Cloudera provider addition request)?
>>>>>>>
>>>>>>> I do think sustainability discussions make full sense but I feel that this new provider is following the current rules that the community has established so far. The original thread [1] in which we discussed Cloudera provider addition (we were not ready with the PR at that time) led to the new provider discussion [2] and finally the lazy consensus [3] on mixed governance model. The outcome was a new mixed governance rule which was introduced [4], with an aim to (a) reduce the maintenance burden for the community and (b) allow more providers in since point (a) became acceptable.
>>>>>>>
>>>>>>> Let me know if it is acceptable to break up these two discussions and have this vote move forward.
>>>>>>>
>>>>>>> Thank you,
>>>>>>> Regards.
>>>>>>> Philippe
>>>>>>>
>>>>>>> [1] https://lists.apache.org/thread/2z0lvgj466ksxxrbvofx41qvn03jrwwb
>>>>>>> [2] https://lists.apache.org/thread/nvfc75kj2w1tywvvkw8ho5wkx1dcvgrn
>>>>>>> [3] https://lists.apache.org/thread/gq9vym17x0o8j8s9clkbmdz2nt38nnbt
>>>>>>> [4] https://github.com/apache/airflow/pull/24680
>>>>>>>
>>>>>>> On Mon, Dec 5, 2022 at 1:54 PM Ash Berlin-Taylor <[email protected]> wrote:
>>>>>>>
>>>>>>> Just to break with the consensus: -1
>>>>>>>
>>>>>>> Not because I don't think the provider would be useful or popular enough, precisely the opposite, and I'd like to see more companies maintain and manage their own providers and see an ecosystem of providers start to grow.
>>>>>>>
>>>>>>> Cloudera def has the means and resources to maintain their own provider, and the communication channels to let their users/customers know about its existence. And I have no problem with linking to the provider from our docs index.
>>>>>>>
>>>>>>> In general I am slightly worried about the workload we as maintainers are letting ourselves in for in the long run with an ever growing number of providers. Particularly one that needs paid-for accounts that we don't have access to!
>>>>>>>
>>>>>>> -ash
>>>>>>>
>>>>>>> On Dec 4 2022, at 11:59 pm, Kaxil Naik <[email protected]> wrote:
>>>>>>>
>>>>>>> +1 binding
>>>>>>>
>>>>>>> On Sat, 3 Dec 2022 at 15:14, Holden Karau <[email protected]> wrote:
>>>>>>>
>>>>>>> non-binding +1
>>>>>>>
>>>>>>> On Sat, Dec 3, 2022 at 3:55 AM Jarek Potiuk <[email protected]> wrote:
>>>>>>>
>>>>>>> I think Cloudera is an important player in our ecosystem and as long as it passes all the bars (i.e. 2.3.0+ compatibility and good non-conflicting dependencies, passing all the tests), I am +1.
>>>>>>>
>>>>>>> On Sat, Dec 3, 2022 at 12:51 PM Philippe Lanoe <[email protected]> wrote:
>>>>>>> >
>>>>>>> > Hello,
>>>>>>> >
>>>>>>> > Correction: since it is a vote on code modification, all committers' votes count. I was mistaken in my previous email (which mentioned only PMC votes are binding), as I am quite new to this process.
>>>>>>> > Please let me know if a discussion thread is preferred.
>>>>>>> >
>>>>>>> > Thanks,
>>>>>>> > Regards,
>>>>>>> > Philippe
>>>>>>> >
>>>>>>> > On Wed, Nov 30, 2022 at 5:34 PM Philippe Lanoe <[email protected]> wrote:
>>>>>>> >>
>>>>>>> >> Hello Airflow community!
>>>>>>> >>
>>>>>>> >> As requested in our PR, I would like to start a vote for adding a new provider (Cloudera). Please note that this vote is about the fact to add this new provider not about the code itself, which will be reviewed as part of the PR.
>>>>>>> >>
>>>>>>> >> We would like to contribute the Cloudera provider to allow data practitioners out-of-the-box interactions with a multi-function analytics and hybrid platform.
>>>>>>> >>
>>>>>>> >> Our first two Operators are CdeRunJobOperator, to run a CDE job (Spark or Airflow within the Cloudera Data Engineering service) and CdwExecuteQueryOperator, to execute a query on a managed CDW cluster (Hive / Impala within the Cloudera Data Warehousing service). It also comes with a Sensor for CDW, in order to wait on a Hive partition.
>>>>>>> >> We are also planning to contribute more in the future, as we develop operators for other Cloudera services in Cloudera Data Platform (CDP), like Cloudera Machine Learning and others, to cover the various needs of data practitioners across the entire data lifecycle.
>>>>>>> >>
>>>>>>> >> Our code has been already used for quite some time internally and we would like to contribute it to Airflow, to give a better experience for the users as it would be another system that users can reach seamlessly in their pipelines.
>>>>>>> >>
>>>>>>> >> Another important note: Cloudera already filed a CCLA as mentioned in this thread, so I think we are OK on the Legal side.
>>>>>>> >>
>>>>>>> >> You can find the PR here:
>>>>>>> >> https://github.com/apache/airflow/pull/27866
>>>>>>> >>
>>>>>>> >> The voting will last for 6 days (until 6th of December 2022, 6pm UTC), and until at least 3 binding votes have been cast. I am not sure about the timeframe which is needed for providers actually, please let me know if it is adequate.
>>>>>>> >>
>>>>>>> >> Please vote accordingly:
>>>>>>> >>
>>>>>>> >> [ ] + 1 approve
>>>>>>> >> [ ] + 0 no opinion
>>>>>>> >> [ ] - 1 disapprove with the reason
>>>>>>> >>
>>>>>>> >> Only votes from PMC members and committers are binding, but other members of the community are encouraged to check the AIP and vote with "(non-binding)".
>>>>>>> >>
>>>>>>> >> Thanks!
>>>>>>> >>
>>>>>>> >> Regards,
>>>>>>> >> Philippe
>>>>>>> >>
>>>>>>>
>>>>>>> --
>>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>>> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9 <https://amzn.to/2MaRAG9>
>>>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>>>
