Sorry for the late reply.

To be clear, even if you are not looking for maintainer help:
- You will (most likely) have to go through an IP clearance process.
- You will not be able to push code without a committer or PMC.
- You will not be able to release code without a PMC vote.

Is that really desirable if you are just looking to 'park' the code somewhere? 
I would think this only makes sense if you want to actually ship it as part of 
the Arrow Java libraries, and participate in Java development. So could you 
clarify what you are looking for in a "logical location for the project"? 

We can add the "powered-by" part quite easily. A PR to powered_by.md [1] would 
be sufficient.

[1]: https://github.com/apache/arrow-site/blob/master/powered_by.md

On Tue, Nov 1, 2022, at 09:56, Kyle Brooks wrote:
> Hi David and Weston,
>
> With your and other relevant maintainers' permission, we’d like to take 
> one of the following actions in order of preference:
>
> - Transfer ownership of the flight-spark-connector repo to the Apache 
> Github Org
> - Make a new repo in Apache and push the code
> - Keep the repo where it is and work to get it added into the “used-by” 
> or “powered-by” list on the Arrow website.
>
> For the first two options we are not looking for help maintaining or 
> developing.  We want a logical location for the project in the short / 
> medium term.
>
> Would you be able to help us implement one of these options?  I 
> like Weston's idea of listing the project in the “powered-by” website 
> regardless of the option chosen.
>
> Thanks,
> Kyle
>
> On 2022/10/21 15:58:06 Weston Pace wrote:
>> > Maybe to take a step back - why do we want this in the Arrow 
>> > repositories/under Arrow governance?
>> 
>> I think this is the important question.  What is the goal here?
>> 
>> If the goal is to help spread awareness then we can link to a repo
>> somewhere (e.g. a "projects that use Arrow" section or something).  For
>> example, I could eventually see something like [1] for ADBC.
>> 
>> If the goal is to share some kind of CI infrastructure burden (e.g.
>> ensure a library runs everywhere that Arrow can run) then the contrib
>> repo might be more useful than a repo-per-project but I think we'll
>> need some more general discussion on how to make this happen.
>> 
>> If the goal is to share maintenance / development cost or find new
>> developers then I don't think any approach works.  Most Arrow
>> developers are quite adept at ignoring the parts of the repo they
>> don't need to interact with.
>> 
>> [1] https://jwt.io/libraries
>> 
>> On Fri, Oct 21, 2022 at 8:48 AM Antoine Pitrou <an...@python.org> wrote:
>> >
>> >
>> > On 21/10/2022 at 17:35, David Li wrote:
>> > > Maybe to take a step back - why do we want this in the Arrow 
>> > > repositories/under Arrow governance?
>> > >
>> > > I'm excited to see more integrations and use cases for Flight and Flight 
>> > > SQL in the wild, but I think it would be good to see a true ecosystem 
>> > > around this, and so I don't think -every- integration needs to end up in 
>> > > the Arrow repos. And there is a cost to set up CI, releases, etc. (ADBC 
>> > > is still getting set up there, and my hope at least is that most 
>> > > integrations will eventually be provided by the database systems, not by 
>> > > Arrow.)
>> > >
>> > > That said I'm not necessarily opposed. We've discussed similar 'contrib' 
>> > > things in the past [1][2]. It may be worth reviewing the discussions 
>> > > there and discussing how this project would address the criteria 
>> > > proposed.
>> >
>> > The problem is that Arrow is so broad nowadays that a "contrib" repo
>> > would end up hosting a hodgepodge of entirely disparate subprojects with
>> > no common maintenance/release policies, and disjoint development and
>> > user communities.
>> >
>> > A separate Apache repo for each subproject is probably better, even
>> > though there might be a small setup overhead.
>> >
>> > Regards
>> >
>> > Antoine.
>> >
>> > >
>> > > [1]: https://lists.apache.org/thread/nfr3tq1tb5tvr34zg5z7on8xglfsj79t
>> > > [2]: https://lists.apache.org/thread/yshp4b3g34kxovzvf6x48pzj0894qbw5 
>> > > (though you may have to dig to find the responses - the UI didn't link 
>> > > them up)
>> > >
>> > > On Fri, Oct 21, 2022, at 11:08, Kyle Brooks wrote:
>> > >> Hi David and Antoine,
>> > >>
>> > >> Long-term I completely agree that this should belong in Apache Spark.
>> > >> I also agree that Flight SQL or ADBC would be a good enhancement for
>> > >> users.  We are planning on implementing Flight SQL support soon.  ADBC
>> > >> doesn't look mature enough right now for this use case.  We will keep
>> > >> an eye on it.
>> > >>
>> > >> Short-term, I'd like to propose either creating an Arrow contrib repo
>> > >> or making a separate Apache repo just for the Flight Spark Connector.
>> > >>
>> > >> We would need help facilitating this within Apache / Arrow.
>> > >>
>> > >> Thank you,
>> > >> Kyle
>> > >>
>> > >> On 2022/10/18 23:44:49 David Li wrote:
>> > >>> Given the probable need for IP clearance, getting it into Arrow would 
>> > >>> also be a Process(TM) unfortunately. We also don't really have a great 
>> > >>> place for "not quite in tree" projects; there have been discussions of 
>> > >>> a 'contrib' repo in the past, but nothing has materialized.
>> > >>>
>> > >>> That said - have you shown this to Spark users? I'd guess there'd be 
>> > >>> more enthusiasm there, especially if there are particular data 
>> > >>> source(s) you anticipate this would make available to them. (Though 
>> > >>> again, Flight SQL or ADBC over plain Flight RPC might be a more 
>> > >>> attractive target for such a Spark plugin.)
>> > >>>
>> > >>> -David
>> > >>>
>> > >>> On Tue, Oct 18, 2022, at 16:50, Matt Phelps wrote:
>> > >>>> Hi David and Antoine,
>> > >>>>
>> > >>>> Thanks for your input. Based on past experience talking to some other
>> > >>>> Arrow / Spark developers, we anticipate that it would take a long time
>> > >>>> to get into Spark. Our plan was to build up a user base in the Arrow
>> > >>>> community before submitting for inclusion to Spark. Is there a place
>> > >>>> the code can live in the meantime?
>> > >>>>
>> > >>>> Matt Phelps
>> > >>>>
>> > >>>>
>> > >>>> From: Antoine Pitrou <an...@python.org>
>> > >>>> Date: Monday, October 17, 2022 at 2:48 PM
>> > >>>> To: dev@arrow.apache.org <de...@arrow.apache.org>
>> > >>>> Subject: Re: [DISCUSS] Integrate existing Spark connector for Flight
>> > >>>>
>> > >>>> On 17/10/2022 at 21:27, David Li wrote:
>> > >>>>> Hey Matt,
>> > >>>>>
>> > >>>>> This is cool to see. To be clear, this is an implementation of Spark 
>> > >>>>> DataSourceV2 using Arrow Flight?
>> > >>>>>
>> > >>>>> I think the questions I have are:
>> > >>>>>
>> > >>>>> - Does this belong under Arrow, or under Spark - I lean towards it 
>> > >>>>> being closer to Spark than Arrow;
>> > >>>>
>> > >>>> FWIW, that is my feeling as well.
>> > >>>>
>> > >>>> Regards
>> > >>>>
>> > >>>> Antoine.
>> > >>>
>>
