Copilot commented on code in PR #2990:
URL: https://github.com/apache/datafusion-comet/pull/2990#discussion_r2648897457

##########
docs/source/user-guide/latest/compatibility.md:
##########
@@ -192,3 +192,84 @@ or strings containing null bytes (e.g \\u0000) |
 Any cast not listed in the previous tables is currently unsupported. We are working on adding more. See the [tracking issue](https://github.com/apache/datafusion-comet/issues/286) for more details.
+
+### Complex Type Casts
+
+Comet provides native support for a limited set of complex type casts.
+All other complex casts fall back to Spark.
+
+#### Struct Type Casting
+
+- **`STRUCT` → `STRING`**
+  Casting a struct to a string is supported.
+  This includes:
+  - Named structs
+  - Nested structs
+  - Structs containing primitive, decimal, date, and timestamp fields
+
+  Example:
+
+  ```sql
+  SELECT CAST(named_struct('a', 1, 'b', 'x') AS STRING);
+  ```
+
+- **`STRUCT` → `STRUCT`**
+  Casting between struct types is supported when the number of fields matches.
+  Fields are matched by position, not by name, consistent with Spark behavior.
+
+  Example:
+
+  ```sql
+  SELECT CAST(s AS struct<field1:string, field2:string>)
+  FROM (SELECT named_struct('a', '1', 'b', '2') AS s);
+  ```
+
+#### Array Type Casting
+
+- **`ARRAY<T>` → `STRING`**
+  Casting arrays to strings is supported and produces a string representation
+  of the array contents.

Review Comment:
The description states that casting arrays to strings "produces a string representation of the array contents," which is vague. Consider being more specific about the format of the string representation (e.g., whether it's a JSON-like format with square brackets and commas like "[1, 2, 3]" or some other format) to help users understand the expected output.

```suggestion
  Casting arrays to strings is supported and produces a JSON-like string representation of the array contents (for example, `[1, 2, 3]`, with elements separated by commas and enclosed in square brackets).
```

##########
docs/source/user-guide/latest/compatibility.md:
##########
@@ -192,3 +192,84 @@ or strings containing null bytes (e.g \\u0000) |
 Any cast not listed in the previous tables is currently unsupported. We are working on adding more. See the [tracking issue](https://github.com/apache/datafusion-comet/issues/286) for more details.
+
+### Complex Type Casts
+
+Comet provides native support for a limited set of complex type casts.
+All other complex casts fall back to Spark.
+
+#### Struct Type Casting
+
+- **`STRUCT` → `STRING`**
+  Casting a struct to a string is supported.
+  This includes:
+  - Named structs
+  - Nested structs
+  - Structs containing primitive, decimal, date, and timestamp fields
+
+  Example:
+
+  ```sql
+  SELECT CAST(named_struct('a', 1, 'b', 'x') AS STRING);
+  ```
+
+- **`STRUCT` → `STRUCT`**
+  Casting between struct types is supported when the number of fields matches.
+  Fields are matched by position, not by name, consistent with Spark behavior.
+
+  Example:
+
+  ```sql
+  SELECT CAST(s AS struct<field1:string, field2:string>)
+  FROM (SELECT named_struct('a', '1', 'b', '2') AS s);
+  ```
+
+#### Array Type Casting
+
+- **`ARRAY<T>` → `STRING`**
+  Casting arrays to strings is supported and produces a string representation
+  of the array contents.
+
+  Supported element types include:
+  - `boolean`
+  - `byte`
+  - `short`
+  - `integer`
+  - `long`
+  - `string`
+  - `decimal`
+
+  Support depends on the scan implementation.
+  Arrays with unsupported element types may fall back to Spark.
+

Review Comment:
The statement "Support depends on the scan implementation" is somewhat ambiguous. Consider clarifying what this means for end users, such as explaining which scan implementations are referred to (e.g., native scan vs. other implementations) or providing guidance on how users can determine if their query will be supported.

```suggestion
  Support for these casts is implemented for queries that use the engine’s native/optimized scan for the underlying data source. When a query is planned using a different scan implementation (for example, a connector-specific or Spark-provided scan), the cast may not be pushed down and Spark will evaluate it instead. Arrays whose element types are not in the list above are always handled by Spark and are not evaluated natively by the plugin.
```
