chenhao-db opened a new pull request, #45934: URL: https://github.com/apache/spark/pull/45934
### What changes were proposed in this pull request?

This PR adds a new `schema_of_variant_agg` expression. It returns the merged schema of a variant column in SQL format. Compared to `schema_of_variant`, which is a scalar expression and returns one schema per row, the `schema_of_variant_agg` expression merges the schemas of all rows into one.

Usage examples:

```
> SELECT schema_of_variant_agg(parse_json(j)) FROM VALUES ('1'), ('2'), ('3') AS tab(j);
 BIGINT
> SELECT schema_of_variant_agg(parse_json(j)) FROM VALUES ('{"a": 1}'), ('{"b": true}'), ('{"c": 1.23}') AS tab(j);
 STRUCT<a: BIGINT, b: BOOLEAN, c: DECIMAL(3,2)>
```

### Why are the changes needed?

This expression helps users explore the content of variant values.

### Does this PR introduce _any_ user-facing change?

Yes. A new SQL expression is added.

### How was this patch tested?

Unit tests.

### Was this patch authored or co-authored using generative AI tooling?

No.
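
For readers who want an intuition for what "merging the schema of all rows" means, the following is a rough Python sketch of the idea. It is not Spark's implementation: the helper names (`infer`, `merge`, `render`, `schema_of_agg`) are hypothetical, the fallback to `VARIANT` on conflicting types is an assumption, field names are sorted as an assumption, and arrays, nulls, and integer/decimal widening are left out. It only aims to reproduce the two usage examples from the PR description.

```python
import json
from decimal import Decimal

# Hypothetical helpers for illustration only; not Spark's implementation.

def infer(value):
    """Map one parsed JSON value to a simplified schema representation."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "BOOLEAN"
    if isinstance(value, int):
        return "BIGINT"
    if isinstance(value, Decimal):
        _, digits, exponent = value.as_tuple()
        scale = max(-exponent, 0)
        precision = max(len(digits) + max(exponent, 0), scale)
        return ("DECIMAL", precision, scale)
    if isinstance(value, str):
        return "STRING"
    if isinstance(value, dict):
        return {key: infer(val) for key, val in value.items()}
    return "VARIANT"  # assumption: everything else is out of scope here

def merge(a, b):
    """Merge two schemas: union struct fields, widen decimals, else fall back."""
    if a == b:
        return a
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)
        for key, typ in b.items():
            merged[key] = merge(merged[key], typ) if key in merged else typ
        return merged
    if isinstance(a, tuple) and isinstance(b, tuple):
        scale = max(a[2], b[2])
        int_digits = max(a[1] - a[2], b[1] - b[2])
        return ("DECIMAL", int_digits + scale, scale)
    return "VARIANT"  # assumption: conflicting types collapse to VARIANT

def render(schema):
    """Render the simplified schema as a SQL-style type string."""
    if isinstance(schema, dict):
        fields = ", ".join(f"{k}: {render(v)}" for k, v in sorted(schema.items()))
        return f"STRUCT<{fields}>"
    if isinstance(schema, tuple):
        return f"DECIMAL({schema[1]},{schema[2]})"
    return schema

def schema_of_agg(rows):
    """Fold the per-row schemas of JSON strings into one merged schema."""
    merged = None
    for row in rows:
        current = infer(json.loads(row, parse_float=Decimal))
        merged = current if merged is None else merge(merged, current)
    if merged is None:
        raise ValueError("at least one row is required")
    return render(merged)

print(schema_of_agg(['1', '2', '3']))
print(schema_of_agg(['{"a": 1}', '{"b": true}', '{"c": 1.23}']))
```

The key design point the sketch tries to convey is that the aggregate is a fold over per-row schemas, so struct fields seen in different rows accumulate into a single `STRUCT`.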