andygrove opened a new issue, #4123:
URL: https://github.com/apache/datafusion-comet/issues/4123
## Describe the bug
When the sort key is a struct containing a Map, Comet's native sort fails
with:
```
org.apache.comet.CometNativeException
Not yet implemented: Row format support not yet implemented for: [SortField {
  options: SortOptions { descending: false, nulls_first: true },
  data_type: Struct([Field {
    name: "data",
    data_type: Map(Field { name: "entries", data_type: Struct([
      Field { name: "key", data_type: Utf8 },
      Field { name: "value", data_type: Utf8 }
    ]) }, false)
  }])
}]
```
This surfaces in Spark 4.1.1's new
`having-and-order-by-recursive-type-name-resolution.sql` at query #38:
```sql
SELECT col1.data['key']
FROM VALUES (NAMED_STRUCT('data', MAP('key', 'value', 'num', '42'))) t (col1)
GROUP BY col1
HAVING col1.data['num'] IS NOT NULL
ORDER BY col1.data['key'];
```
## Expected behavior
Comet should fall back to Spark when the sort key includes types not
supported by the Arrow row format (Struct/Map combinations are a known gap
upstream).
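
A rough sketch of the kind of recursive check that could drive the fallback decision. Everything here is hypothetical: `Ty` is a simplified stand-in for Arrow's `DataType`, and `row_format_supported` / `can_sort_natively` are illustrative names, not existing Comet or arrow-row APIs. It also assumes Map is the only unsupported type, which is all the error above tells us.

```rust
/// Hypothetical, simplified stand-in for Arrow's `DataType`.
/// Only the variants needed to illustrate this issue are modelled.
#[allow(dead_code)]
#[derive(Debug, Clone)]
enum Ty {
    Utf8,
    Int64,
    List(Box<Ty>),
    Struct(Vec<(String, Ty)>),
    Map(Box<Ty>, Box<Ty>),
}

/// Returns true when `ty` contains no Map at any nesting depth.
/// Assumption: Map is the only type the row format rejects in this model.
fn row_format_supported(ty: &Ty) -> bool {
    match ty {
        Ty::Utf8 | Ty::Int64 => true,
        Ty::List(elem) => row_format_supported(elem),
        // A struct is only as sortable as its least-supported field.
        Ty::Struct(fields) => fields.iter().all(|(_, t)| row_format_supported(t)),
        Ty::Map(_, _) => false,
    }
}

/// Hypothetical planner decision: run the sort natively only if every
/// sort key passes the check; otherwise the plan should fall back to Spark.
fn can_sort_natively(sort_keys: &[Ty]) -> bool {
    sort_keys.iter().all(row_format_supported)
}

/// The sort key type from the error message: struct<data: map<string, string>>.
fn failing_sort_key() -> Ty {
    Ty::Struct(vec![(
        "data".to_string(),
        Ty::Map(Box::new(Ty::Utf8), Box::new(Ty::Utf8)),
    )])
}
```

With this model, `can_sort_natively(&[failing_sort_key()])` is `false`, so the sort would stay on the Spark side instead of reaching the native `Not yet implemented` error. The important part is that the check recurses into struct fields, since the Map here is nested one level down.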
## Workaround
The test file currently runs with Comet disabled: `dev/diffs/4.1.1.diff` adds
`--SET spark.comet.enabled = false` at the top of the file.
## Additional context
PR #4093 enables Spark 4.1.1 in the `Spark SQL Tests` workflow. The
underlying limitation lives in `arrow-row` in DataFusion / Arrow.