GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/18412
Fix wrong results of insertion of Array of Struct

### What changes were proposed in this pull request?

```SQL
CREATE TABLE `tab1`
(`custom_fields` ARRAY<STRUCT<`id`: BIGINT, `value`: STRING>>)
USING parquet;

INSERT INTO `tab1`
SELECT ARRAY(named_struct('id', 1, 'value', 'a'), named_struct('id', 2, 'value', 'b'));

SELECT custom_fields.id, custom_fields.value
FROM tab1;
```

The above query always returns the last struct of the array, because the rule `SimplifyCasts` incorrectly rewrites the query. The underlying cause is that we always reuse the same `GenericInternalRow` object when doing the cast.

### How was this patch tested?

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark castStruct

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18412.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18412

----
commit 3be3475d3da7e281f7c1a6599988a621c4d6b0f5
Author: gatorsmile <gatorsm...@gmail.com>
Date:   2017-06-24T03:29:38Z

    fix.

----
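
The underlying cause named in the description, reusing one mutable row object for every element of the array, can be illustrated with a small standalone sketch. This is plain Python, not Spark code: the list `row` is only a stand-in for a reused `GenericInternalRow`, and the function names `cast_structs_shared_row` and `cast_structs_fresh_row` are hypothetical. The second function shows one possible remedy (allocating a row per element); the message above does not say whether the patch takes exactly this approach.

```python
# Illustrative only: plain Python stand-ins, not Spark's classes.

def cast_structs_shared_row(structs):
    """Buggy pattern: a single mutable row is reused for every element."""
    row = [None, None]            # stand-in for one reused row object
    out = []
    for s in structs:
        row[0] = int(s["id"])     # overwrite the shared row in place
        row[1] = str(s["value"])
        out.append(row)           # stores a reference, not a copy
    return out


def cast_structs_fresh_row(structs):
    """One possible remedy: build a new row for each element."""
    out = []
    for s in structs:
        out.append([int(s["id"]), str(s["value"])])
    return out


data = [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}]
buggy = cast_structs_shared_row(data)
fixed = cast_structs_fresh_row(data)

print(buggy)  # [[2, 'b'], [2, 'b']]  (every entry is the last struct)
print(fixed)  # [[1, 'a'], [2, 'b']]
```

Because every entry of `buggy` refers to the same object, each one reflects the final write, which matches the symptom reported above: the query returns the last struct of the array for every element.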