+1 binding, thank you so much for working through the concerns on the earlier
version, I’m so excited for this to get built!

Twitter: https://twitter.com/holdenkarau
Fight Health Insurance: https://www.fighthealthinsurance.com/
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
Pronouns: she/her


On Wed, Mar 18, 2026 at 12:32 PM L. C. Hsieh <[email protected]> wrote:

> +1
>
> On Wed, Mar 18, 2026 at 2:33 AM Haiyang Sun via dev <[email protected]>
> wrote:
>
>> Hi Spark devs,
>>
>> I would like to call for *a new vote, following the previous attempt,* on
>> the *SPIP: Language-Agnostic UDF Execution Protocol for Spark*, after
>> addressing comments and providing a supplementary design document for
>> the worker specification.
>>
>> The SPIP proposes a structured, language-agnostic framework for running
>> user-defined functions (UDFs) in Spark across multiple programming languages.
>>
>> Today, Spark Connect allows users to write queries from multiple
>> languages, but support for user-defined functions remains incomplete. In
>> practice, only Scala, Java, and Python have working support, and this relies on
>> language-specific mechanisms that do not generalize well to other languages
>> such as Go <https://github.com/apache/spark-connect-go> / Rust
>> <https://github.com/apache/spark-connect-rust> / Swift
>> <https://github.com/apache/spark-connect-swift> / TypeScript
>> <https://github.com/BaldrVivaldelli/ts-spark-connector>, where UDF
>> support is currently unavailable. In addition, there are legacy limitations
>> in the existing PySpark worker implementation that make it difficult to
>> evolve the system or extend it to new languages.
>>
>> The proposal introduces two related components:
>>
>>
>>    1. *A unified UDF execution protocol*
>>
>>    The proposal defines a structured API and execution protocol for
>>    running UDFs outside the Spark executor process and communicating with
>>    Spark via inter-process communication (IPC). This protocol enables
>>    Spark to interact with external UDF workers in a consistent and
>>    extensible way, regardless of the implementation language.
>>
>>    2. *A worker specification for provisioning and lifecycle management.*
>>
>>    To support multi-language execution environments, the proposal also
>>    introduces a worker specification describing how UDF workers can be
>>    installed, started, connected to, and terminated. This document
>>    complements the SPIP by outlining how workers can be provisioned and
>>    managed in a consistent way.
>>
>> Note that this SPIP can enable UDF support in languages that currently
>> lack it. For languages that already have UDF
>> implementations (especially Python), the goal is not to replace existing
>> implementations immediately, but to provide a framework that may allow them
>> to gradually evolve toward more language-agnostic abstractions over time.
>>
>> More details can be found in the SPIP document and the supplementary
>> design for worker specification:
>>
>> SPIP:
>> https://docs.google.com/document/d/19Whzq127QxVt2Luk0EClgaDtcpBsFUp67NcVdKKyPF8
>>
>> Worker specification design document:
>> https://docs.google.com/document/d/1Dx9NqHRNuUpatH9DYoFF9cmvUl2fqHT4Rjbyw4EGLHs
>>
>> Discussion Thread:
>> https://lists.apache.org/thread/9t4svsnd71j7sb4r4scf2xhh8dvp3b43
>>
>> Previous vote and discussion thread:
>> https://lists.apache.org/thread/81xghrfwvopp274rgyxfthsstb2xmkz1
>>
>> *Please vote on adopting this proposal.*
>>
>> [ ] +1: Accept the proposal as an official SPIP
>>
>> [ ] +0: No opinion
>>
>> [ ] -1: Disapprove (please explain why)
>>
>> The vote will remain open for *at least 72 hours*.
>>
>> Thanks to everyone who participated in the discussion and provided
>> valuable feedback!
>>
>>
>> Best regards,
>>
>> Haiyang
>>
>