+1

On Wed, Mar 18, 2026 at 2:33 AM Haiyang Sun via dev <[email protected]> wrote:
> Hi Spark devs,
>
> I would like to call for *a new vote following the previous attempt* for the
> *SPIP: Language-Agnostic UDF Execution Protocol for Spark* after
> addressing comments and providing a supplementary design document for the
> worker specification.
>
> The SPIP proposes a structured, language-agnostic framework for running
> user-defined functions (UDFs) in Spark across multiple programming languages.
>
> Today, Spark Connect allows users to write queries from multiple
> languages, but support for user-defined functions remains incomplete. In
> practice, only Scala, Java, and Python have working support, and this relies on
> language-specific mechanisms that do not generalize well to other languages
> such as Go <https://github.com/apache/spark-connect-go> / Rust
> <https://github.com/apache/spark-connect-rust> / Swift
> <https://github.com/apache/spark-connect-swift> / TypeScript
> <https://github.com/BaldrVivaldelli/ts-spark-connector>, where UDF
> support is currently unavailable. In addition, there are legacy limitations
> in the existing PySpark worker implementation that make it difficult to
> evolve the system or extend it to new languages.
>
> The proposal introduces two related components:
>
> 1. *A unified UDF execution protocol*
>
>    The proposal defines a structured API and execution protocol for
>    running UDFs outside the Spark executor process and communicating with
>    Spark via inter-process communication (IPC). This protocol enables Spark to
>    interact with external UDF workers in a consistent and extensible way,
>    regardless of the implementation language.
>
> 2. *A worker specification for provisioning and lifecycle management*
>
>    To support multi-language execution environments, the proposal also
>    introduces a worker specification describing how UDF workers can be
>    installed, started, connected to, and terminated. This document complements
>    the SPIP by outlining how workers can be provisioned and managed in a
>    consistent way.
>
> Note that this SPIP can help enable UDF support for languages that
> currently do not support UDFs. For languages that already have UDF
> implementations (especially Python), the goal is not to replace existing
> implementations immediately, but to provide a framework that may allow them
> to gradually evolve toward more language-agnostic abstractions over time.
>
> More details can be found in the SPIP document and the supplementary
> design for worker specification:
>
> SPIP:
> https://docs.google.com/document/d/19Whzq127QxVt2Luk0EClgaDtcpBsFUp67NcVdKKyPF8
>
> Worker specification design document:
> https://docs.google.com/document/d/1Dx9NqHRNuUpatH9DYoFF9cmvUl2fqHT4Rjbyw4EGLHs
>
> Discussion Thread:
> https://lists.apache.org/thread/9t4svsnd71j7sb4r4scf2xhh8dvp3b43
>
> Previous vote and discussion thread:
> https://lists.apache.org/thread/81xghrfwvopp274rgyxfthsstb2xmkz1
>
> *Please vote on adopting this proposal.*
>
> [ ] +1: Accept the proposal as an official SPIP
>
> [ ] +0: No opinion
>
> [ ] -1: Disapprove (please explain why)
>
> The vote will remain open for *at least 72 hours.*
>
> Thanks to everyone who participated in the discussion and provided
> valuable feedback!
>
>
> Best regards,
>
> Haiyang
