andygrove opened a new issue, #4464:
URL: https://github.com/apache/datafusion-comet/issues/4464
## Describe the bug
`bit_length` and `octet_length` are wired as plain
`CometScalarFunction("bit_length")` / `CometScalarFunction("octet_length")` in
`QueryPlanSerde.scala` with no `BinaryType` guard, so both report
`Compatible(None)` for `BinaryType` input. However, DataFusion's `BitLengthFunc`
and `OctetLengthFunc` use a `Signature::coercible(... logical_string() ...)`
and reject Binary at execution time. The net effect: `bit_length(<binary>)` and
`octet_length(<binary>)` plan successfully under Comet, then surface as a
native execution error rather than falling back cleanly to Spark.
For contrast, `length` (also handled by Comet) explicitly guards
`BinaryType` in `CometLength.getSupportLevel` and falls back to Spark.
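The shape of that guard can be modeled in a few lines. This is only a sketch of the decision logic, written in Rust to match the native side. The real guard lives in Scala, and the `InputType` and `SupportLevel` types and the `support_level` function below are hypothetical stand-ins, not Comet's actual API.

```rust
/// Hypothetical stand-in for the Spark input type of the expression's child.
#[derive(Debug, PartialEq)]
enum InputType {
    String,
    Binary,
}

/// Hypothetical stand-in for the planner's support decision.
#[derive(Debug, PartialEq)]
enum SupportLevel {
    /// Safe to run natively.
    Compatible,
    /// Must fall back to Spark; carries a human-readable reason.
    Unsupported(String),
}

/// Decide at planning time, so Binary input falls back to Spark
/// instead of failing later inside native execution.
fn support_level(child: &InputType) -> SupportLevel {
    match child {
        InputType::Binary => {
            SupportLevel::Unsupported("native function rejects Binary input".to_string())
        }
        InputType::String => SupportLevel::Compatible,
    }
}
```

The point of the sketch is the ordering: the type check happens before the plan is handed to native code, which is the step `bit_length` and `octet_length` currently skip.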
Surfaced by the string-expressions audit in apache/datafusion-comet#4461.
## Steps to reproduce
```sql
CREATE TABLE t(b binary) USING parquet;
INSERT INTO t VALUES (X'48656c6c6f');
SELECT bit_length(b) FROM t;
SELECT octet_length(b) FROM t;
```
Spark: returns `40` and `5`.
Comet: native execution error from DataFusion's `BitLengthFunc` /
`OctetLengthFunc` signature check.
## Expected behavior
Either guard `BinaryType` in dedicated `CometBitLength` / `CometOctetLength`
serdes (mirroring `CometLength`), or wire to a Comet-side UDF that supports
both `Utf8` and `Binary` (the underlying `arrow::compute::bit_length` /
`length` kernels do support Binary natively).
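For the second option, here is a minimal sketch of the semantics a dual-type UDF would need, using only the Rust standard library. The `Value` enum and both functions are hypothetical illustrations of per-row behavior. They are not the Arrow kernels or any existing Comet code, and a real implementation would operate on Arrow arrays and handle nulls.

```rust
/// Hypothetical stand-in for one non-null row of a Utf8 or Binary column.
enum Value<'a> {
    Utf8(&'a str),
    Binary(&'a [u8]),
}

/// Byte count of the value. For strings this is the UTF-8 encoded
/// length in bytes, not the number of characters.
fn octet_length(v: &Value) -> i32 {
    match v {
        Value::Utf8(s) => s.len() as i32,
        Value::Binary(b) => b.len() as i32,
    }
}

/// Bit count of the value: eight bits per byte.
fn bit_length(v: &Value) -> i32 {
    octet_length(v) * 8
}
```

Applied to the reproduction above, `Value::Binary(&[0x48, 0x65, 0x6c, 0x6c, 0x6f])` gives an `octet_length` of 5 and a `bit_length` of 40, matching the Spark results.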
## Additional context
- Wiring: `QueryPlanSerde.scala` lines 176 (`bit_length`) and 187
(`octet_length`).
- Existing guard pattern: `CometLength` in
`spark/src/main/scala/org/apache/comet/serde/strings.scala`.
- Spark accepts `(StringType|BinaryType) -> IntegerType` for both
expressions across 3.4.3, 3.5.8, and 4.0.1.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]