+1 on a separate repo, which allows different APIs to move at different speeds and ensures they get community support.

On Wed, May 24, 2023 at 00:37 Hyukjin Kwon <gurwls...@apache.org> wrote:

> I think we can just start this with a separate repo.
> I am fine with the second option too, but in this case we would have to
> triage which language to add into the main repo.
>
> On Fri, 19 May 2023 at 22:28, Maciej <mszymkiew...@gmail.com> wrote:
>
>> Hi,
>>
>> Personally, I'm strongly against the second option and have some
>> preference towards the third one (or maybe a mix of the first one and the
>> third one).
>>
>> The project is already pretty large as-is and, with an extremely
>> conservative approach towards removal of APIs, it only tends to grow over
>> time. Making it even larger is not going to make things more maintainable
>> and is likely to create an entry barrier for new contributors (that's
>> similar to Jia's arguments).
>>
>> Moreover, we've seen quite a few different language clients over the
>> years and all but one or two survived while none is particularly active, as
>> far as I'm aware. Taking responsibility for more clients, without being
>> sure that we have resources to maintain them and there is enough community
>> around them to make such effort worthwhile, doesn't seem like a good idea.
>>
>> --
>> Best regards,
>> Maciej Szymkiewicz
>>
>> Web: https://zero323.net
>> PGP: A30CEF0C31A501EC
>>
>> On 5/19/23 14:57, Jia Fan wrote:
>>
>> Hi,
>>
>> Thanks for the contribution!
>> I prefer (1), for several reasons:
>>
>> 1. Separate repositories can maintain independent versions, different
>> release schedules, and faster bug-fix releases.
>>
>> 2. Different languages have different build tools. Putting them in one
>> repository will make the main repository more and more complicated, and it
>> will become extremely difficult to perform a complete build in the main
>> repository.
>>
>> 3. Separate repositories will make CI configuration and execution easier,
>> and the PR and commit lists will be clearer.
>>
>> 4. Other projects also govern their clients in separate repositories, for
>> example ClickHouse, which uses different repositories for JDBC, ODBC, and
>> C++. Please refer to:
>> https://github.com/ClickHouse/clickhouse-java
>> https://github.com/ClickHouse/clickhouse-odbc
>> https://github.com/ClickHouse/clickhouse-cpp
>>
>> PS: I'm looking forward to the JavaScript Connect client!
>>
>> Thanks and regards,
>> Jia Fan
>>
>> Martin Grund <mgr...@apache.org> wrote on Fri, May 19, 2023 at 20:03:
>>
>>> Hi folks,
>>>
>>> When Bo (thanks for the time and contribution) started the work on
>>> https://github.com/apache/spark/pull/41036 he started the Go client
>>> directly in the Spark repository. In the meantime, I was approached by
>>> other engineers who are willing to contribute to working on a Rust client
>>> for Spark Connect.
>>>
>>> Now one of the key questions is where these connectors should live and
>>> how we manage expectations most effectively.
>>>
>>> At a high level, there are three approaches:
>>>
>>> (1) "3rd party" (non-JVM / Python) clients should live in separate
>>> repositories owned and governed by the Apache Spark community.
>>>
>>> (2) All clients should live in the main Apache Spark repository in the
>>> `connector/connect/client` directory.
>>>
>>> (3) Non-native (Python, JVM) Spark Connect clients should not be part of
>>> the Apache Spark repository and governance rules.
>>>
>>> Before we iron out how exactly we mark these clients as experimental
>>> and how we align their release process etc. with Spark, my suggestion would
>>> be to get a consensus on this first question.
>>>
>>> Personally, I'm fine with (1) and (2) with a preference for (2).
>>>
>>> Would love to get feedback from other members of the community!
>>>
>>> Thanks
>>> Martin