+1 binding. Thank you for working through the concerns on the earlier version — I'm excited for this to get built!
Twitter: https://twitter.com/holdenkarau
Fight Health Insurance: https://www.fighthealthinsurance.com/
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
Pronouns: she/her

On Wed, Mar 18, 2026 at 12:32 PM L. C. Hsieh <[email protected]> wrote:
> +1
>
> On Wed, Mar 18, 2026 at 2:33 AM Haiyang Sun via dev <[email protected]> wrote:
>
>> Hi Spark devs,
>>
>> I would like to call for *a new vote following the previous attempt* for
>> the *SPIP: Language-Agnostic UDF Execution Protocol for Spark*, after
>> addressing comments and providing a supplementary design document for
>> the worker specification.
>>
>> The SPIP proposes a structured, language-agnostic framework for running
>> user-defined functions (UDFs) in Spark across multiple programming
>> languages.
>>
>> Today, Spark Connect allows users to write queries from multiple
>> languages, but support for user-defined functions remains incomplete. In
>> practice, only Scala, Java, and Python have working support, and this
>> relies on language-specific mechanisms that do not generalize well to
>> other languages such as Go <https://github.com/apache/spark-connect-go>,
>> Rust <https://github.com/apache/spark-connect-rust>, Swift
>> <https://github.com/apache/spark-connect-swift>, or TypeScript
>> <https://github.com/BaldrVivaldelli/ts-spark-connector>, where UDF
>> support is currently unavailable. In addition, legacy limitations in the
>> existing PySpark worker implementation make it difficult to evolve the
>> system or extend it to new languages.
>>
>> The proposal introduces two related components:
>>
>> 1. *A unified UDF execution protocol*
>>
>> The proposal defines a structured API and execution protocol for running
>> UDFs outside the Spark executor process and communicating with Spark via
>> inter-process communication (IPC).
>> This protocol enables Spark to interact with external UDF workers in a
>> consistent and extensible way, regardless of the implementation language.
>>
>> 2. *A worker specification for provisioning and lifecycle management*
>>
>> To support multi-language execution environments, the proposal also
>> introduces a worker specification describing how UDF workers can be
>> installed, started, connected to, and terminated. This document
>> complements the SPIP by outlining how workers can be provisioned and
>> managed in a consistent way.
>>
>> Note that this SPIP can help enable UDF support for languages that
>> currently lack it. For languages that already have UDF implementations
>> (especially Python), the goal is not to replace existing implementations
>> immediately, but to provide a framework that may allow them to gradually
>> evolve toward more language-agnostic abstractions over time.
>>
>> More details can be found in the SPIP document and the supplementary
>> design document for the worker specification:
>>
>> SPIP:
>> https://docs.google.com/document/d/19Whzq127QxVt2Luk0EClgaDtcpBsFUp67NcVdKKyPF8
>>
>> Worker specification design document:
>> https://docs.google.com/document/d/1Dx9NqHRNuUpatH9DYoFF9cmvUl2fqHT4Rjbyw4EGLHs
>>
>> Discussion thread:
>> https://lists.apache.org/thread/9t4svsnd71j7sb4r4scf2xhh8dvp3b43
>>
>> Previous vote and discussion thread:
>> https://lists.apache.org/thread/81xghrfwvopp274rgyxfthsstb2xmkz1
>>
>> *Please vote on adopting this proposal.*
>>
>> [ ] +1: Accept the proposal as an official SPIP
>> [ ] +0: No opinion
>> [ ] -1: Disapprove (please explain why)
>>
>> The vote will remain open for *at least 72 hours.*
>>
>> Thanks to everyone who participated in the discussion and provided
>> valuable feedback!
>>
>> Best regards,
>>
>> Haiyang
>
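For readers skimming the thread, the "external UDF worker over IPC" idea in the quoted SPIP can be pictured with a toy sketch. Everything below is hypothetical for illustration only — the message framing, function names, and the socketpair transport are my own stand-ins, not the protocol defined in the linked design documents (which deal with real serialization, Arrow batches, and worker lifecycle):

```python
# Toy sketch: a "driver" exchanging length-prefixed messages with a
# hypothetical out-of-process UDF worker. Not the SPIP's actual protocol.
import socket
import struct
import threading

def send_msg(sock: socket.socket, payload: bytes) -> None:
    # Frame each message with a 4-byte big-endian length prefix.
    sock.sendall(struct.pack(">I", len(payload)) + payload)

def recv_exact(sock: socket.socket, n: int) -> bytes:
    # Read exactly n bytes, looping because recv() may return fewer.
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf

def recv_msg(sock: socket.socket) -> bytes:
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return recv_exact(sock, length)

def toy_worker(sock: socket.socket) -> None:
    # A stand-in UDF: upper-case the incoming bytes and reply.
    send_msg(sock, recv_msg(sock).upper())

# socketpair() stands in for whatever IPC channel the worker spec mandates.
driver_side, worker_side = socket.socketpair()
t = threading.Thread(target=toy_worker, args=(worker_side,))
t.start()
send_msg(driver_side, b"hello spark")
result = recv_msg(driver_side)
t.join()
print(result)  # b'HELLO SPARK'
```

The point of the sketch is only the shape of the interaction: because the worker is a separate process speaking a framed byte protocol, it could be implemented in Go, Rust, Swift, or TypeScript just as well as Python, which is the language-agnostic property the SPIP is after.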
