andygrove opened a new issue, #4471:
URL: https://github.com/apache/datafusion-comet/issues/4471

   ## Describe the bug
   
   Spark's `concat(...)` accepts `StringType`, `BinaryType`, and `ArrayType` 
arguments (`Concat.allowedTypes = Seq(StringType, BinaryType, ArrayType)` in 
`collectionOperations.scala`, widened to `StringTypeWithCollation` in Spark 
4.0+). Comet's `CometConcat` only natively supports `StringType` children; for 
`BinaryType` or `ArrayType` it falls back to Spark.
   
   Surfaced by the collection-expressions audit in 
apache/datafusion-comet#4471. The audit relabels the `getSupportLevel` branch 
from `Incompatible` to `Unsupported` (the fallback is a genuine "Comet does not 
support" case, not a wrong-result case), but the underlying coverage gap 
remains.
   
   ## Steps to reproduce
   
   ```sql
   -- BinaryType
   SELECT concat(unhex('CAFE'), unhex('BEEF'));
   
   -- ArrayType
   CREATE TABLE t(a array<int>, b array<int>) USING parquet;
   INSERT INTO t VALUES (array(1, 2), array(3, 4));
   SELECT concat(a, b) FROM t;
   ```
   
   Both queries currently fall back to Spark.
   
   ## Expected behavior
   
   Native support for `concat` over `BinaryType` (concatenate byte arrays) and 
`ArrayType` (concatenate arrays, equivalent to `array_concat`).
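
As a rough illustration of the row-level behavior being asked for, the sketch below concatenates either bytes or array elements with one generic function. It is not Comet or DataFusion code: `concat_row` is a hypothetical name, values are modeled as plain slices rather than Arrow arrays, and it assumes Spark's `concat` returns NULL as soon as any argument is NULL.

```rust
/// Row-level sketch of a null-propagating concat (hypothetical helper).
/// `T = u8` models a BinaryType value; `T = i32` models an array<int> value.
/// `None` stands in for SQL NULL.
pub fn concat_row<T: Clone>(args: &[Option<&[T]>]) -> Option<Vec<T>> {
    let mut out = Vec::new();
    for arg in args {
        // `?` returns None early: one NULL argument makes the result NULL
        // (assumed Spark behavior, see the note above).
        out.extend_from_slice((*arg)?);
    }
    Some(out)
}
```

Under these assumptions, the two reproduction queries would correspond to `concat_row(&[Some(&[0xCA, 0xFE][..]), Some(&[0xBE, 0xEF][..])])` and `concat_row(&[Some(&[1, 2][..]), Some(&[3, 4][..])])`. A real implementation would operate on Arrow arrays column by column instead of on single rows.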
   
   ## Additional context
   
   - Serde: `CometConcat` in 
`spark/src/main/scala/org/apache/comet/serde/strings.scala`
   - DataFusion has `array_concat` for the array case; Comet already wires it 
for the `array_concat` function. The work is to route `Concat(<array>...)` 
through the same native path.
   - For `BinaryType`, DataFusion's `concat` UDF is Utf8-only, so a Comet-side 
helper or upstream patch would be needed.
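
The routing described in the bullets above can be sketched as a small type-based dispatch. The actual serde is Scala (`CometConcat`); this Rust sketch only models the decision. `ArgType`, `NativePath`, and `choose_path` are hypothetical names invented for illustration, and the binary path refers to a helper that does not exist yet.

```rust
/// Simplified stand-in for the argument types Spark's `concat` accepts
/// (hypothetical; not a Comet or DataFusion type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArgType {
    Utf8,
    Binary,
    List,
}

/// Hypothetical labels for the native implementation to route to.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NativePath {
    /// The string-only `concat` path that is supported today.
    StringConcat,
    /// DataFusion's `array_concat`, which Comet already wires up.
    ArrayConcat,
    /// A Comet-side helper or upstream patch (assumed, not yet available).
    BinaryConcat,
}

/// Picks a native path when every argument has the same supported type.
/// Returns `None` (meaning: fall back to Spark) for empty or mixed inputs.
pub fn choose_path(args: &[ArgType]) -> Option<NativePath> {
    let first = *args.first()?;
    if args.iter().any(|t| *t != first) {
        return None;
    }
    Some(match first {
        ArgType::Utf8 => NativePath::StringConcat,
        ArgType::List => NativePath::ArrayConcat,
        ArgType::Binary => NativePath::BinaryConcat,
    })
}
```

Returning `None` for anything unexpected keeps the fallback to Spark as the default, so new type coverage can be added one branch at a time.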
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

