minni31 opened a new pull request, #12099: URL: https://github.com/apache/gluten/pull/12099
## Context

Spark's `Concat` expression supports `StringType`, `BinaryType`, and `ArrayType` inputs. Currently in Gluten, `Concat` falls through to the generic expression transformer without any type-specific handling. This PR adds explicit routing with proper edge-case handling for the Velox backend.

## What

Add `genConcatTransformer` to `SparkPlanExecApi` and override it in `VeloxSparkPlanExecApi` with:

- **StringType / BinaryType**: Offloaded to Velox. Velox's `concat` uses `defaultNullBehavior=true` (returns NULL when any input is NULL), matching Spark's null-in-null-out semantics.
- **ArrayType**: Falls back to Spark. Velox returns NULL if ANY input is NULL, but Spark 3.4+ (SPARK-41296) skips NULL array inputs and only returns NULL when ALL inputs are NULL.
- **Zero arguments**: Falls back to Spark (Velox's `concat` does not accept an empty argument list).
- **Single argument**: Returns the child expression directly (identity optimization; Velox requires at least 2 arguments for `concat`).

## Changes

- `SparkPlanExecApi.scala`: Add `genConcatTransformer` with a default `GenericExpressionTransformer` implementation
- `ExpressionConverter.scala`: Add explicit `case c: Concat =>` routing to the backend API
- `VeloxSparkPlanExecApi.scala`: Override with type checks and edge-case handling
- `VeloxStringFunctionsSuite.scala`: Enhanced tests for null handling, single-arg identity, zero-arg fallback, ArrayType fallback, and BinaryType support
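
The routing rules described under "What" can be sketched as a small decision function. This is a simplified, self-contained model and not the code in this PR: `ColType`, `Route`, and `ConcatRouting.route` are hypothetical names standing in for Spark's data types and Gluten's transformer and fallback machinery, and the order in which the checks run (zero arguments, then `ArrayType`, then single argument) is an assumption.

```scala
// Hypothetical, simplified model of the routing rules described above.
// These are NOT Spark's or Gluten's real classes.
sealed trait ColType
case object StringT extends ColType
case object BinaryT extends ColType
case object ArrayT extends ColType

// Possible outcomes for a Concat expression on the Velox backend.
sealed trait Route
case object OffloadToVelox extends Route // build a native concat call
case object ReturnChild extends Route    // identity: concat(x) == x
final case class FallBackToSpark(reason: String) extends Route

object ConcatRouting {
  // Decide how to handle Concat given the types of its children.
  // The ordering of these checks is an assumption of this sketch.
  def route(childTypes: Seq[ColType]): Route =
    if (childTypes.isEmpty) FallBackToSpark("zero arguments")
    else if (childTypes.contains(ArrayT)) FallBackToSpark("ArrayType null semantics differ")
    else if (childTypes.size == 1) ReturnChild
    else OffloadToVelox
}
```

Under these assumptions, `ConcatRouting.route(Seq(StringT, StringT))` yields `OffloadToVelox`, a single `StringT` child yields `ReturnChild`, and any input containing `ArrayT` yields a `FallBackToSpark` value.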
