RaghunandanKumar opened a new pull request, #55906:
URL: https://github.com/apache/spark/pull/55906

   ### What changes were proposed in this pull request?
   
   This PR tightens `DirectWorkerDispatcher` validation for the new 
language-agnostic UDF worker specification.
   
   Specifically, it now rejects direct worker specs that:
   - omit worker capabilities entirely
   - provide `UNSPECIFIED` enum values in supported data formats or communication patterns
   - omit required protocol support for `ARROW`
   - omit required protocol support for `BIDIRECTIONAL_STREAMING`
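
The rejection rules listed above can be sketched as follows. This is a minimal, language-neutral illustration in Python, not the PR's Scala implementation: the class names, field names, enum numbering, error messages, and the `validate_direct_worker_spec` function are all hypothetical stand-ins, and only the four rules themselves come from the description.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


# Hypothetical stand-ins for the protobuf enums; the numeric values are assumed.
class DataFormat(Enum):
    UNSPECIFIED = 0
    ARROW = 1


class CommunicationPattern(Enum):
    UNSPECIFIED = 0
    BIDIRECTIONAL_STREAMING = 1


# Hypothetical shape of the capabilities message; field names are assumed.
@dataclass(frozen=True)
class WorkerCapabilities:
    supported_data_formats: Sequence[DataFormat]
    supported_communication_patterns: Sequence[CommunicationPattern]


@dataclass(frozen=True)
class DirectWorkerSpec:
    # None models a spec that omits capabilities entirely.
    capabilities: Optional[WorkerCapabilities] = None


def validate_direct_worker_spec(spec: DirectWorkerSpec) -> None:
    """Raise ValueError on the first violated rule; return None if the spec passes."""
    caps = spec.capabilities
    # Rule 1: capabilities must be present.
    if caps is None:
        raise ValueError("worker capabilities are missing")
    # Rule 2: UNSPECIFIED is never an acceptable declared value.
    if DataFormat.UNSPECIFIED in caps.supported_data_formats:
        raise ValueError("UNSPECIFIED is not a valid data format")
    if CommunicationPattern.UNSPECIFIED in caps.supported_communication_patterns:
        raise ValueError("UNSPECIFIED is not a valid communication pattern")
    # Rule 3: ARROW support is required.
    if DataFormat.ARROW not in caps.supported_data_formats:
        raise ValueError("ARROW data format support is required")
    # Rule 4: BIDIRECTIONAL_STREAMING support is required.
    if CommunicationPattern.BIDIRECTIONAL_STREAMING not in caps.supported_communication_patterns:
        raise ValueError("BIDIRECTIONAL_STREAMING support is required")
```

In this sketch a spec passes only when capabilities are present, contain no `UNSPECIFIED` entries, and include both `ARROW` and `BIDIRECTIONAL_STREAMING`. The order of the checks is an arbitrary choice here; the actual dispatcher may check and report the conditions differently.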
   
   The existing `DirectWorkerDispatcherSuite` fixtures that previously constructed minimal specs now include explicit capabilities, and new focused tests cover the added validation paths.
   
   ### Why are the changes needed?
   
   `SPARK-56468` is an audit task for the UDF gRPC / worker protocol before the 
Spark 4.2 protocol boundary hardens.
   
   The protobuf comments already describe these capability fields as required, but the direct dispatcher previously accepted incomplete specs, so the inconsistency surfaced only in later runtime paths. Failing closed at validation keeps the worker spec self-contained and prevents invalid protocol definitions from being silently treated as usable.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No. The UDF worker framework is still experimental and not yet wired into 
user-facing execution paths.
   
   ### How was this patch tested?
   
   Locally on macOS with Homebrew OpenJDK 17:
   
   ```bash
   export JAVA_HOME=/opt/homebrew/opt/openjdk@17/libexec/openjdk.jdk/Contents/Home
   export PATH="$JAVA_HOME/bin:$PATH"
   build/sbt "udf-worker-core/testOnly org.apache.spark.udf.worker.core.DirectWorkerDispatcherSuite"
   ```
   
   Passed: `DirectWorkerDispatcherSuite` (33 tests)
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Yes.
   


