Also in this case you are free to decide how you publish or produce the
documentation and release the software (and whether you test it or not) -
this is precisely the freedom that Ash mentioned you get when you are not
bound by the community rules.

BTW, adding a link to the ecosystem page is very easy - just click "Suggest
a change on this page" and you will get a PR opened, where you will be able
to link to your provider.

J.


On Tue, Dec 13, 2022 at 4:21 PM Jarek Potiuk <[email protected]> wrote:

> The documentation index is only for the community and only for providers
> managed by the community.
>
> This is a very strict requirement by The Apache Software Foundation.
> According to the Apache Software Foundation rules, we cannot suggest or
> even hint that software released by a 3rd-party is "Apache Software
> Foundation" software. If we add a link to 3rd-party software, we have to
> explicitly state that it is not ASF software.
> For such cases, we have a dedicated "Ecosystem" page
> https://airflow.apache.org/ecosystem/ which explicitly states:
>
> "These resources and services are not maintained, nor endorsed by the
> Apache Airflow Community and Apache Airflow project (maintained by the
> Committers and the Airflow PMC). Use them at your sole discretion. The
> community does not verify the licences nor validity of those tools, so it’s
> your responsibility to verify them."
>
> And there is a section for 3rd-party plugins and providers:
> https://airflow.apache.org/ecosystem/#third-party-airflow-plugins-and-providers
>
> J.
>
>
>
>
> On Tue, Dec 13, 2022 at 4:01 PM Philippe Lanoe <[email protected]>
> wrote:
>
>> Hello Airflow community,
>>
>> We decided internally that we cannot take on the maintenance of this CI
>> environment on our side right now. However, as Ash mentioned and suggested,
>> we would like to be part of the documentation index
>> <https://airflow.apache.org/docs/#providers-packagesdocsapache-airflow-providersindexhtml>.
>> I assume that the documentation of our provider should be handled by us
>> (for instance in our Git repository)? Or should the documentation of the
>> provider be part of the Airflow repository, though the code lives outside?
>>
>> In any case, what is the process for getting into the documentation index?
>> Do we need to raise a PR or is this request in this email enough? Please
>> let us know what inputs are required from our side.
>>
>> Thanks,
>> Philippe
>>
>>
>> On Fri, Dec 9, 2022 at 9:17 PM Oliveira, Niko <[email protected]>
>> wrote:
>>
>>> This has yet to be published by Google and Amazon - I know they are
>>> progressing a lot on automating and regularly publishing the results of
>>> the System tests from main in a way that lets us verify that all tests
>>> pass - all that is done outside of the community resources and maintenance
>>> (i.e. this is entirely on the Amazon and Google teams to run and publish
>>> those tests).
>>>
>>> Just as an update: we're still working hard on this, I promise :) It has
>>> taken MUCH longer than expected to get all the requisite internal approvals
>>> and agreement on how to share the results with the community. But we're
>>> zeroing in on an approach that everyone agrees on for publication. Please
>>> bear with us on this one!
>>>
>>>
>>>
>>> In this case - anyone with any "Cloudera" account should be able to run
>>> it locally when contributing. But the idea of AIP-47 was to off-load
>>> regular execution of those tests and provide public "status" of those to
>>> those teams of those service providers that want to make sure that their
>>> provider still runs.
>>>
>>> Agreed, the tests are written in a way that anyone can run them (with
>>> mechanisms to provide any pre-existing resources some tests require). But
>>> to expect the community to have the resources to regularly run all the
>>> system tests for all providers is unreasonable; collaboration is really
>>> required here.
>>>
>>> Cheers,
>>> Niko
>>>
>>>
>>>
>>> ------------------------------
>>> *From:* Pierre Jeambrun <[email protected]>
>>> *Sent:* Friday, December 9, 2022 11:43:21 AM
>>> *To:* [email protected]
>>> *Subject:* RE: [EXTERNAL][VOTE] New Provider: Cloudera
>>>
>>>
>>> Thanks for taking time to give more details Jarek. This puts things
>>> in perspective.
>>>
>>> On Fri, Dec 9, 2022 at 6:48 PM Collin McNulty <[email protected]>
>>> wrote:
>>>
>>>> I concur with the concerns raised by Ash. Cloudera seems like an
>>>> organization quite well suited to releasing its own provider. If such an
>>>> organization is not expected to release outside the Apache process, who is?
>>>> Maybe I'm misunderstanding, but I thought that the idea was that providers
>>>> going forward would be mostly third party which allows for a larger and
>>>> more vibrant ecosystem.
>>>>
>>>> Collin McNulty
>>>>
>>>> On Fri, Dec 9, 2022 at 6:04 AM Pierre Jeambrun <[email protected]>
>>>> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I am really excited about a public official cloudera provider for
>>>>> airflow. This would be a great addition to the airflow ecosystem.
>>>>>
>>>>> System tests would be an additional layer that would be great for the
>>>>> CI and release process, but would individual contributors be able to run
>>>>> these system tests locally? From what I understand, such credentials
>>>>> would be stored in the CI, and only people with their own credentials
>>>>> would be able to test the code locally and therefore realistically help
>>>>> in maintaining the provider. (Iterating on CI failure wouldn't be great :p)
>>>>>
>>>>> Ash's point resonates with me: I remember having to work on a
>>>>> specific provider where free accounts/quotas were not available. It was
>>>>> basically a shot in the dark, making code changes based on documentation
>>>>> and API specs without being able to actually test the code. Maybe this was
>>>>> the reason the issues stayed open for more than a year without being
>>>>> picked up.
>>>>>
>>>>> Will the community really be able to contribute to and support
>>>>> the provider, while most of us don't have a paid account? Or is it
>>>>> 'stakeholders' maintained and 'community' released at most? (Even
>>>>> reviewing code for release would be tricky without an account.)
>>>>>
>>>>> Maybe I misunderstood something and apologize in advance.
>>>>>
>>>>> Best regards,
>>>>> Pierre
>>>>>
>>>>> On Fri, Dec 9, 2022 at 12:24 PM Jarek Potiuk <[email protected]> wrote:
>>>>>
>>>>>> > My concern about how we will actually test it works given we'd need
>>>>>> a cloudera account/install/instance would be good to comment on though.
>>>>>>
>>>>>> This is a very good point, Ash, and I love that you've made it, as I
>>>>>> think we have a very good solution at hand.
>>>>>>
>>>>>> This simply calls for Cloudera's commitment to work on AIP-47 style
>>>>>> tests and providing a test bed for that.
>>>>>>
>>>>>> This has yet to be published by Google and Amazon - I know they are
>>>>>> progressing a lot on automating and regularly publishing the results
>>>>>> of the System tests from main in a way that lets us verify that all tests
>>>>>> pass - all that is done outside of the community resources and
>>>>>> maintenance (i.e. this is entirely on the Amazon and Google teams to run
>>>>>> and publish those tests).
>>>>>>
>>>>>> So I have a PROPOSAL (I can send a formal vote on that shortly)
>>>>>>
>>>>>> For all future providers (starting from Cloudera) we should make it a
>>>>>> requirement that any provider accepted by the community MUST have
>>>>>> AIP-47 style System Tests, and the service provider in question MUST
>>>>>> provide their own System Test environment with public access to its
>>>>>> status for the community and commit to maintaining it for as long as the
>>>>>> Provider is released by the community.
>>>>>>
>>>>>> I think this is a very reasonable ask for Cloudera (and anyone else
>>>>>> in the future) and a very, very good compromise (win-win for both sides
>>>>>> while also requiring both sides to commit to a long term cooperation).
>>>>>> This way we make sure we have to cooperate with the service provider
>>>>>> rather than letting the Service provider "throw the code over the fence"
>>>>>> and put all the burden of maintenance on the community.
>>>>>>
>>>>>> * with AIP-47 we provided a very solid foundation for fully-automated
>>>>>> system testing of precisely this kind of external service provider
>>>>>> * we (community) take on our shoulders the burden of reviewing and
>>>>>> releasing the code, and at the same time the service gets community
>>>>>> recognition and becomes part of the "Airflow Community supported"
>>>>>> providers
>>>>>> * similarly the Service Provider takes on their shoulders the burden
>>>>>> of running and keeping in check the System Test Bed for the system
>>>>>> tests they submitted to the community, and makes sure they succeed before
>>>>>> the release happens
>>>>>> * whenever we release such a provider - we hold the release for that
>>>>>> provider until its system tests are green (and it's on the service
>>>>>> provider to fix the problems with those before we release).
>>>>>> * I know both Google and Amazon are committed to doing so, I also know
>>>>>> Databricks is looking into it and in the future we might decide to apply
>>>>>> it to all "external service providers".
>>>>>>
>>>>>> Philippe - what do you think about such an arrangement? Is that
>>>>>> something that Cloudera will be able to commit to?
>>>>>>
>>>>>> J.
>>>>>>
>>>>>>
>>>>>> On Fri, Dec 9, 2022 at 12:00 PM Ash Berlin-Taylor <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> As per the original vote email:
>>>>>>>
>>>>>>> Please note that this vote is about whether to add this new
>>>>>>> provider, not about the code itself, which will be reviewed as part of
>>>>>>> the PR
>>>>>>>
>>>>>>>
>>>>>>> So it's not a veto (as vetoes can only apply to code).
>>>>>>>
>>>>>>> My concern about how we will actually test it works given we'd need
>>>>>>> a cloudera account/install/instance would be good to comment on though.
>>>>>>>
>>>>>>> -ash
>>>>>>>
>>>>>>> On Dec 7 2022, at 1:43 pm, Jarek Potiuk <[email protected]> wrote:
>>>>>>>
>>>>>>> Yeah. I would really want to understand that (and maybe others have
>>>>>>> an opinion here):
>>>>>>>
>>>>>>> https://www.apache.org/foundation/voting.html
>>>>>>>
>>>>>>> * Is this a "code modification" - where -1 is veto
>>>>>>> * or is it a "procedural issue" - where -1 is just a vote and
>>>>>>> majority rules
>>>>>>>
>>>>>>> I personally think that "code modification" is really on "PR review"
>>>>>>> level - when we see that the code submitted is not good. But this case
>>>>>>> seems to be more of a procedural issue than code modification. For me
>>>>>>> this is more "are we ok to accept a provider from cloudera?" rather than
>>>>>>> "do we accept this code".
>>>>>>>
>>>>>>> Ash - how do you treat your -1?
>>>>>>>
>>>>>>> And others - what do you think of that?
>>>>>>>
>>>>>>> I think the next course of action depends on whether we have consensus
>>>>>>> on how we treat the issue of "adding a new provider".
>>>>>>>
>>>>>>> J.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Dec 7, 2022 at 1:45 PM Philippe Lanoe
>>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> Hello Airflow community,
>>>>>>>
>>>>>>> Following up on this -1. I'm assuming that's a veto?
>>>>>>>
>>>>>>> If it is, would it be possible to decouple the provider
>>>>>>> sustainability discussion from this proposal (Cloudera provider addition
>>>>>>> request)?
>>>>>>>
>>>>>>> I do think sustainability discussions make full sense but I feel
>>>>>>> that this new provider is following the current rules that the community
>>>>>>> has established so far. The original thread [1] in which we discussed
>>>>>>> Cloudera provider addition (we were not ready with the PR at that time)
>>>>>>> led to the new provider discussion [2] and finally the lazy consensus [3]
>>>>>>> on mixed governance model. The outcome was a new mixed governance rule
>>>>>>> which was introduced [4], with an aim to (a) reduce the maintenance
>>>>>>> burden for the community and (b) allow more providers in since point (a)
>>>>>>> became acceptable.
>>>>>>>
>>>>>>> Let me know if it is acceptable to break up these two discussions
>>>>>>> and have this vote move forward.
>>>>>>>
>>>>>>> Thank you,
>>>>>>> Regards.
>>>>>>> Philippe
>>>>>>>
>>>>>>> [1] https://lists.apache.org/thread/2z0lvgj466ksxxrbvofx41qvn03jrwwb
>>>>>>> [2] https://lists.apache.org/thread/nvfc75kj2w1tywvvkw8ho5wkx1dcvgrn
>>>>>>> [3] https://lists.apache.org/thread/gq9vym17x0o8j8s9clkbmdz2nt38nnbt
>>>>>>> [4] https://github.com/apache/airflow/pull/24680
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Dec 5, 2022 at 1:54 PM Ash Berlin-Taylor <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Just to break with the consensus: -1
>>>>>>>
>>>>>>> Not because I don't think the provider would be useful or popular
>>>>>>> enough, precisely the opposite, and I'd like to see more companies
>>>>>>> maintain and manage their own providers and see an ecosystem of
>>>>>>> providers start to grow.
>>>>>>>
>>>>>>> Cloudera def has the means and resources to maintain their own
>>>>>>> provider, and the communication channels to let their users/customers
>>>>>>> know about its existence. And I have no problem with linking to the
>>>>>>> provider from our docs index.
>>>>>>>
>>>>>>> In general I am slightly worried about the workload we as
>>>>>>> maintainers are letting ourselves in for in the long run with an ever
>>>>>>> growing number of providers. Particularly one that needs paid-for
>>>>>>> accounts that we don't have access to!
>>>>>>>
>>>>>>> -ash
>>>>>>>
>>>>>>> On Dec 4 2022, at 11:59 pm, Kaxil Naik <[email protected]> wrote:
>>>>>>>
>>>>>>> +1 binding
>>>>>>>
>>>>>>> On Sat, 3 Dec 2022 at 15:14, Holden Karau <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> non-binding +1
>>>>>>>
>>>>>>> On Sat, Dec 3, 2022 at 3:55 AM Jarek Potiuk <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> I think Cloudera is an important player in our ecosystem and as long as
>>>>>>> it passes all the bars (i.e. 2.3.0+ compatibility, good
>>>>>>> non-conflicting dependencies, passing all the tests), I am +1.
>>>>>>>
>>>>>>> On Sat, Dec 3, 2022 at 12:51 PM Philippe Lanoe
>>>>>>> <[email protected]> wrote:
>>>>>>> >
>>>>>>> > Hello,
>>>>>>> >
>>>>>>> > Correction: since it is a vote on code modification, all
>>>>>>> committers' votes count, I was mistaken in my previous email (which
>>>>>>> mentioned only PMC votes are binding), quite new in this process.
>>>>>>> > Please let me know if a discussion thread is preferred.
>>>>>>> >
>>>>>>> > Thanks,
>>>>>>> > Regards,
>>>>>>> > Philippe
>>>>>>> >
>>>>>>> > On Wed, Nov 30, 2022 at 5:34 PM Philippe Lanoe <
>>>>>>> [email protected]> wrote:
>>>>>>> >>
>>>>>>> >> Hello Airflow community!
>>>>>>> >>
>>>>>>> >> As requested in our PR, I would like to start a vote for adding a
>>>>>>> new provider (Cloudera). Please note that this vote is about whether to
>>>>>>> add this new provider, not about the code itself, which will be reviewed
>>>>>>> as part of the PR.
>>>>>>> >>
>>>>>>> >> We would like to contribute the Cloudera provider to allow data
>>>>>>> practitioners out-of-the-box interactions with a multi-function
>>>>>>> analytics and hybrid platform.
>>>>>>> >>
>>>>>>> >> Our first two Operators are CdeRunJobOperator, to run a CDE job
>>>>>>> (Spark or Airflow within the Cloudera Data Engineering service) and
>>>>>>> CdwExecuteQueryOperator, to execute a query on a managed CDW cluster
>>>>>>> (Hive / Impala within the Cloudera Data Warehousing service). It also
>>>>>>> comes with a Sensor for CDW, in order to wait on a Hive partition.
>>>>>>> >> We are also planning to contribute more in the future, as we
>>>>>>> develop operators for other Cloudera services in Cloudera Data Platform
>>>>>>> (CDP), like Cloudera Machine Learning and others, to cover the various
>>>>>>> needs of data practitioners across the entire data lifecycle.
>>>>>>> >>
>>>>>>> >> Our code has already been used for quite some time internally and
>>>>>>> we would like to contribute it to Airflow, to give a better experience
>>>>>>> for the users as it would be another system that users can reach
>>>>>>> seamlessly in their pipelines.
>>>>>>> >>
>>>>>>> >> Another important Note: Cloudera already filed a CCLA as
>>>>>>> mentioned in this thread, so I think we are OK on the Legal side.
>>>>>>> >>
>>>>>>> >> You can find the PR here:
>>>>>>> >> https://github.com/apache/airflow/pull/27866
>>>>>>> >>
>>>>>>> >> The voting will last for 6 days (until 6th of December 2022, 6pm
>>>>>>> UTC), and until at least 3 binding votes have been cast. I am not sure
>>>>>>> about the timeframe which is needed for providers actually, please let
>>>>>>> me know if it is adequate.
>>>>>>> >>
>>>>>>> >> Please vote accordingly:
>>>>>>> >>
>>>>>>> >> [ ] + 1 approve
>>>>>>> >> [ ] + 0 no opinion
>>>>>>> >> [ ] - 1 disapprove with the reason
>>>>>>> >>
>>>>>>> >> Only votes from PMC members and committers are binding, but other
>>>>>>> members of the community are encouraged to check the AIP and vote with
>>>>>>> "(non-binding)".
>>>>>>> >>
>>>>>>> >> Thanks!
>>>>>>> >>
>>>>>>> >> Regards,
>>>>>>> >> Philippe
>>>>>>> >>
>>>>>>> >>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>>>>> https://amzn.to/2MaRAG9
>>>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>>>
>>>>>>>
