[PR] feat: custom Rust UDFs via arrow-ffi (alternative to #4283) [experimental] [skip ci] [datafusion-comet]

via GitHub Wed, 27 May 2026 14:30:47 -0700


andygrove opened a new pull request, #4459:
URL: https://github.com/apache/datafusion-comet/pull/4459


   ## Which issue does this PR close?
   
   Companion / alternative to #4283. Both PRs are drafts intended for 
comparison, not for merging.
   
   ## Rationale for this change
   
   This PR adds custom Rust scalar UDF support using only arrow-stable FFI 
surfaces, as suggested in feedback on #4283 by @paleolimbot and @timsaucer. The 
goal is to compare two ABI strategies side-by-side:
   
   - #4283 — bespoke C ABI with custom type tags + IPC bytes + 
`comet_udf_describe`
   - this PR — arrow-only FFI, with two flavors offered side-by-side:
   
     1. **C ABI (sedona-style)** — pure C-callable struct of function pointers, 
parameterized only by Arrow's C Data Interface (`FFI_ArrowSchema` / 
`FFI_ArrowArray`). No DataFusion types appear in the FFI surface, so user 
libraries are decoupled from datafusion versions and the ABI is portable to 
C/C++. Modeled on 
[apache/sedona-db](https://github.com/apache/sedona-db/blob/6fe541cade2fcd7132feffb17091f83ac1411777/c/sedona-extension/src/sedona_extension.h#L114-L204)'s
 `SedonaCScalarKernel`.
   
     2. **datafusion-ffi (`FFI_ScalarUDF`)** — wraps the user's `ScalarUDFImpl` 
as `FFI_ScalarUDF`. The user's library inherits the full `ScalarUDFImpl` 
surface (variadic signatures, type coercion, metadata-aware return types) for 
free, at the cost of a major-version pin against `datafusion-ffi`.
   
   A single library may export either or both flavors; the host loader walks 
both discovery functions. The test cdylib exposes `add_one_c` via the C ABI and 
`add_one_df` via datafusion-ffi, and the end-to-end Spark suite drives both.
   
   The scope is **intentionally minimal** vs #4283 — scalar-only, happy-path 
e2e tests only — to keep the diff focused on the ABI strategy comparison. JVM 
API surface mirrors #4283 exactly so the JVM side is apples-to-apples.
   
   ## What changes are included in this PR?
   
   - **`native/comet-udf-sdk`** — public SDK with both ABI flavors behind cargo 
features (`c-abi`, `df-abi`, both default-on). Provides `CometCScalarUdf` trait 
+ `comet_c_udf_export!` macro for the C flavor; `comet_df_udf_export!` for 
datafusion-ffi.
   - **`native/comet-test-udfs`** — test cdylib exposing one UDF per ABI.
   - **`native/core/src/execution/rust_udf`** — `loader` (libloading + ABI 
version probe + dual-discovery), process-wide `cache`, `ImportedCScalarUdf` 
adapter wrapping the C ABI as `ScalarUDFImpl`. The datafusion-ffi side needs no 
adapter — `Arc<dyn ScalarUDFImpl>::from(&FFI_ScalarUDF)` is built-in.
   - **`RustUdfCall` proto** in `expr.proto` (field 71) + planner branch in 
`create_expr` resolving (library_path, name) against the cache and dispatching 
through whichever ABI registered the kernel.
   - **JNI bridge** (`CometRustUdfBridge` / `comet_rust_udf_bridge.rs`) for 
driver-side `validateLibrary` / `listUdfs`.
   - **Scala API** (`CometRustUDF.register`, `CometRustUdfRegistry`, typed 
exception classes). The `QueryPlanSerde` path: `CometScalaUDF.convert` first 
checks the registry — if the udfName is registered, it emits `RustUdfCall`; 
otherwise it falls back to the existing JVM codegen dispatcher.
   
   The Scala/Java files live under `spark/.../udf/` (rather than 
`common/.../udf/` as in #4283) to avoid the `NativeBase`-relocation churn. This 
is the only intentional structural divergence from #4283.
   
   ## How are these changes tested?
   
   - **SDK unit tests** — 3 tests covering ABI version, C ABI adapter roundtrip 
(export → import via FFI), and datafusion-ffi list build + round-trip via 
`From<&FFI_ScalarUDF>`.
   - **Native loader tests** — 5 tests covering library load, missing-path 
error, and verifying both ABI flavors are discovered correctly.
   - **End-to-end Spark suite** (`CometRustUdfSuite`) — 2 tests, one per ABI: 
`add_one_c` (C ABI) and `add_one_df` (datafusion-ffi). Both pass, gated on 
`-Dcomet.test.udfs.lib=<path>`.
   
   ```
   $ DYLD_LIBRARY_PATH=$JAVA_HOME/lib/server cargo test -p datafusion-comet 
--lib rust_udf
   running 5 tests
   test execution::rust_udf::loader::tests::missing_path_errors_open ... ok
   test execution::rust_udf::loader::tests::load_test_udfs_succeeds ... ok
   test execution::rust_udf::loader::tests::c_and_df_abi_have_expected_flavors 
... ok
   test execution::rust_udf::cache::tests::same_path_returns_same_arc ... ok
   test execution::rust_udf::cache::tests::missing_path_propagates_error ... ok
   test result: ok. 5 passed; 0 failed
   
   $ ./mvnw -q -Pspark-3.5 -pl spark test 
-DwildcardSuites=org.apache.comet.CometRustUdfSuite \
       -Dtest=none -Denforcer.skip=true \
       -Dcomet.test.udfs.lib=$PWD/native/target/debug/libcomet_test_udfs.dylib
   - C ABI: add_one_c returns id + 1 for a range (6 seconds, 465 milliseconds)
   - datafusion-ffi: add_one_df returns id + 1 for a range (69 milliseconds)
   Run completed in 9 seconds, 43 milliseconds.
   All tests passed.
   ```
   
   ## What this PR is *not*
   
   - Not a full replacement for #4283 — half the test coverage, no user guide, 
no struct-typed/registerAll/error/panic paths. The point is to make the ABI 
surfaces comparable, not to ship a finished feature. Once a direction is 
picked, the loser of the comparison can be deleted and the winner extended.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[PR] feat: custom Rust UDFs via arrow-ffi (alternative to #4283) [experimental] [skip ci] [datafusion-comet]

Reply via email to