+1 on starting this with a separate repo.
Which new clients can be placed in the main repo should be discussed after they 
are mature enough.

Yang Jie

From: Denny Lee <denny.g....@gmail.com>
Date: Wednesday, May 24, 2023 21:31
To: Hyukjin Kwon <gurwls...@apache.org>
Cc: Maciej <mszymkiew...@gmail.com>, "dev@spark.apache.org" 
<dev@spark.apache.org>
Subject: Re: [CONNECT] New Clients for Go and Rust

+1 on separate repo allowing different APIs to run at different speeds and 
ensuring they get community support.

On Wed, May 24, 2023 at 00:37 Hyukjin Kwon <gurwls...@apache.org> wrote:
I think we can just start this with a separate repo.
I am fine with the second option too but in this case we would have to triage 
which language to add into the main repo.

On Fri, 19 May 2023 at 22:28, Maciej <mszymkiew...@gmail.com> wrote:
Hi,

Personally, I'm strongly against the second option and have some preference 
towards the third one (or maybe a mix of the first one and the third one).

The project is already pretty large as-is and, with an extremely conservative 
approach towards removal of APIs, it only tends to grow over time. Making it 
even larger is not going to make things more maintainable and is likely to 
create an entry barrier for new contributors (that's similar to Jia's 
arguments).

Moreover, we've seen quite a few different language clients over the years, 
and only one or two have survived, none of which is particularly active, as far 
as I'm aware. Taking responsibility for more clients, without being sure that 
we have the resources to maintain them and that there is enough community 
around them to make such an effort worthwhile, doesn't seem like a good idea.


--

Best regards,

Maciej Szymkiewicz



Web: 
https://zero323.net

PGP: A30CEF0C31A501EC


On 5/19/23 14:57, Jia Fan wrote:
Hi,

Thanks for the contribution!
I prefer (1), for the following reasons:

1. Separate repositories can maintain independent versions and release 
schedules, and can ship bug fix releases faster.

2. Different languages have different build tools. Putting them in one 
repository will make the main repository more and more complicated, and it will 
become extremely difficult to perform a complete build in the main repository.

3. Separate repositories will make CI configuration and execution easier, and 
the PR and commit lists will be clearer.

4. Other projects also govern their clients in separate repositories. 
ClickHouse, for example, uses different repositories for JDBC, ODBC, and C++. 
Please refer to:
https://github.com/ClickHouse/clickhouse-java
https://github.com/ClickHouse/clickhouse-odbc
https://github.com/ClickHouse/clickhouse-cpp

PS: I'm looking forward to the javascript connect client!

Thanks Regards
Jia Fan

On Fri, May 19, 2023 at 20:03, Martin Grund <mgr...@apache.org> wrote:
Hi folks,

When Bo (thanks for the time and contribution) started the work on 
https://github.com/apache/spark/pull/41036, he started the Go client directly 
in the Spark repository. In the meantime, I was approached by other engineers 
who are willing to contribute to a Rust client for Spark Connect.

Now one of the key questions is where these connectors should live and how we 
can manage expectations most effectively.

At a high level, there are three approaches:

(1) "3rd party" (non-JVM / Python) clients should live in separate repositories 
owned and governed by the Apache Spark community.

(2) All clients should live in the main Apache Spark repository in the 
`connector/connect/client` directory.

(3) Non-native Spark Connect clients (anything other than Python and JVM) 
should not be part of the Apache Spark repository and governance rules.

Before we iron out exactly how we mark these clients as experimental and how 
we align their release process etc. with Spark, my suggestion would be to get 
consensus on this first question.

Personally, I'm fine with (1) and (2) with a preference for (2).

Would love to get feedback from other members of the community!

Thanks
Martin



