It might be a good idea to discuss how new Spark Connect clients fit into our overall process. In particular:

 * Under what conditions do we consider adding a new language to the
   official channels?  What process do we follow?
 * What guarantees do we offer with respect to these clients? Is adding a
   new client the same type of commitment as for the core API? In other
   words, do we commit to maintaining such clients "forever" or do we
   separate the "official" and "contrib" clients, with the latter being
   governed by the ASF, but not guaranteed to be maintained in the future?
 * Do we follow the same release schedule as for the core project, or
   rather release each client separately, after the main release is
   completed?

Also, the elephant in the room is the future of the current API in Spark 4 and onwards. As useful as Connect is, it is not exactly a replacement for many existing deployments. Furthermore, it doesn't make extending Spark much easier and the current ecosystem is, subjectively speaking, a bit brittle.

--
Best regards,
Maciej


On 5/26/23 07:26, Martin Grund wrote:
Thanks everyone for your feedback! I will work on figuring out what it takes to get started with a repo for the Go client.

On Thu 25. May 2023 at 21:51 Chao Sun <sunc...@apache.org> wrote:

    +1 on separate repo too

    On Thu, May 25, 2023 at 12:43 PM Dongjoon Hyun
    <dongjoon.h...@gmail.com> wrote:
    >
    > +1 for starting on a separate repo.
    >
    > Dongjoon.
    >
    > On Thu, May 25, 2023 at 9:53 AM yangjie01 <yangji...@baidu.com>
    wrote:
    >>
    >> +1 on starting this with a separate repo.
    >>
    >> Which new clients can be placed in the main repo should be
    discussed after they are mature enough.
    >>
    >>
    >>
    >> Yang Jie
    >>
    >>
    >>
    >> From: Denny Lee <denny.g....@gmail.com>
    >> Date: Wednesday, May 24, 2023, 21:31
    >> To: Hyukjin Kwon <gurwls...@apache.org>
    >> Cc: Maciej <mszymkiew...@gmail.com>, "dev@spark.apache.org"
    <dev@spark.apache.org>
    >> Subject: Re: [CONNECT] New Clients for Go and Rust
    >>
    >>
    >>
    >> +1 on separate repo allowing different APIs to run at different
    speeds and ensuring they get community support.
    >>
    >>
    >>
    >> On Wed, May 24, 2023 at 00:37 Hyukjin Kwon
    <gurwls...@apache.org> wrote:
    >>
    >> I think we can just start this with a separate repo.
    >> I am fine with the second option too but in this case we would
    have to triage which language to add into the main repo.
    >>
    >>
    >>
    >> On Fri, 19 May 2023 at 22:28, Maciej <mszymkiew...@gmail.com>
    wrote:
    >>
    >> Hi,
    >>
    >>
    >>
    >> Personally, I'm strongly against the second option and have
    some preference towards the third one (or maybe a mix of the first
    one and the third one).
    >>
    >>
    >>
    >> The project is already pretty large as-is and, with an
    extremely conservative approach towards removal of APIs, it only
    tends to grow over time. Making it even larger is not going to
    make things more maintainable and is likely to create an entry
    barrier for new contributors (that's similar to Jia's arguments).
    >>
    >>
    >>
    >> Moreover, we've seen quite a few different language clients
    over the years and only one or two have survived, none of them
    particularly active, as far as I'm aware.  Taking responsibility
    for more clients, without being sure that we have resources to
    maintain them and there is enough community around them to make
    such effort worthwhile, doesn't seem like a good idea.
    >>
    >>
    >>
    >> --
    >>
    >> Best regards,
    >>
    >> Maciej Szymkiewicz
    >>
    >>
    >>
    >> Web: https://zero323.net
    >>
    >> PGP: A30CEF0C31A501EC
    >>
    >>
    >>
    >>
    >>
    >> On 5/19/23 14:57, Jia Fan wrote:
    >>
    >> Hi,
    >>
    >>
    >>
    >> Thanks for the contribution!
    >>
    >> I prefer (1). There are several reasons:
    >>
    >>
    >>
    >> 1. Separate repositories can maintain independent versions,
    different release schedules, and faster bug-fix releases.
    >>
    >>
    >>
    >> 2. Different languages have different build tools. Putting them
    in one repository will make the main repository more and more
    complicated, and it will become extremely difficult to perform a
    complete build in the main repository.
    >>
    >>
    >>
    >> 3. Separate repositories will make CI configuration and
    execution easier, and the PR and commit lists will be clearer.
    >>
    >>
    >>
    >> 4. Other projects also govern their clients in separate
    repositories. ClickHouse, for example, uses separate repositories
    for JDBC, ODBC, and C++.
    Please refer to:
    >>
    >> https://github.com/ClickHouse/clickhouse-java
    >>
    >> https://github.com/ClickHouse/clickhouse-odbc
    >>
    >> https://github.com/ClickHouse/clickhouse-cpp
    >>
    >>
    >>
    >> PS: I'm looking forward to the JavaScript Connect client!
    >>
    >>
    >>
    >> Thanks and regards,
    >>
    >> Jia Fan
    >>
    >>
    >>
    >> Martin Grund <mgr...@apache.org> wrote on Fri, May 19, 2023 at 20:03:
    >>
    >> Hi folks,
    >>
    >>
    >>
    >> When Bo (thanks for the time and contribution) started the work
    on https://github.com/apache/spark/pull/41036 he started the Go
    client directly in the Spark repository. In the meantime, I was
    approached by other engineers who are willing to contribute to
    working on a Rust client for Spark Connect.
    >>
    >>
    >>
    >> Now one of the key questions is where these connectors should
    live and how we manage expectations most effectively.
    >>
    >>
    >>
    >> At a high level, there are three approaches:
    >>
    >>
    >>
    >> (1) "3rd party" (non-JVM / Python) clients should live in
    separate repositories owned and governed by the Apache Spark
    community.
    >>
    >>
    >>
    >> (2) All clients should live in the main Apache Spark repository
    in the `connector/connect/client` directory.
    >>
    >>
    >>
    >> (3) Non-native (i.e., other than Python and JVM) Spark Connect
    clients should not be part of the Apache Spark repository and
    governance rules.
    >>
    >>
    >>
    >> Before we iron out how exactly we mark these clients as
    experimental and how we align their release process etc. with
    Spark, my suggestion would be to get a consensus on this first
    question.
    >>
    >>
    >>
    >> Personally, I'm fine with (1) and (2) with a preference for (2).
    >>
    >>
    >>
    >> Would love to get feedback from other members of the community!
    >>
    >>
    >>
    >> Thanks
    >>
    >> Martin
    >>
    >>
    >>
    >>
    >>
    >>
    >>
    >>


