GitHub user gatorsmile opened a pull request:

    https://github.com/apache/spark/pull/18412

    Fix wrong results of insertion of Array of Struct

    ### What changes were proposed in this pull request?
    ```SQL
    CREATE TABLE `tab1`
    (`custom_fields` ARRAY<STRUCT<`id`: BIGINT, `value`: STRING>>)
    USING parquet
    
    INSERT INTO `tab1`
    SELECT ARRAY(named_struct('id', 1, 'value', 'a'), named_struct('id', 2, 'value', 'b'))
    
    SELECT custom_fields.id, custom_fields.value FROM tab1
    ```
    
    The above query always returns the last struct of the array, because the
    rule `SimplifyCasts` incorrectly rewrites the query. The underlying cause
    is that we always reuse the same `GenericInternalRow` object when doing
    the cast.
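    
    The aliasing problem can be sketched outside Spark. The snippet below is
    only an illustration under stated assumptions: `RowReuseSketch`,
    `castReusingBuffer`, and `castWithFreshRows` are hypothetical names, and a
    plain `Object[]` stands in for `GenericInternalRow`. It is not the code
    touched by this patch. It shows why writing every casted element into one
    shared row object makes all array slots show the last struct.
    
    ```java
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    // Hypothetical stand-in for the behavior described above; not Spark code.
    class RowReuseSketch {

        // Casts each (id, value) struct by writing into ONE shared buffer.
        // Every slot of the result references that same buffer, so all
        // elements end up showing whatever was written last.
        static List<Object[]> castReusingBuffer(List<Object[]> structs) {
            Object[] shared = new Object[2];
            List<Object[]> out = new ArrayList<>();
            for (Object[] s : structs) {
                shared[0] = ((Integer) s[0]).longValue(); // INT -> BIGINT
                shared[1] = s[1];
                out.add(shared); // same object added on every iteration
            }
            return out;
        }

        // Same cast, but allocates a new row per element, so each slot
        // keeps its own values.
        static List<Object[]> castWithFreshRows(List<Object[]> structs) {
            List<Object[]> out = new ArrayList<>();
            for (Object[] s : structs) {
                Object[] row = new Object[2];
                row[0] = ((Integer) s[0]).longValue(); // INT -> BIGINT
                row[1] = s[1];
                out.add(row);
            }
            return out;
        }

        // Renders rows as concatenated "[id, value]" strings for comparison.
        static String render(List<Object[]> rows) {
            StringBuilder sb = new StringBuilder();
            for (Object[] r : rows) {
                sb.append(Arrays.toString(r));
            }
            return sb.toString();
        }

        public static void main(String[] args) {
            List<Object[]> input = Arrays.asList(
                new Object[] {1, "a"},
                new Object[] {2, "b"});
            System.out.println(render(castReusingBuffer(input))); // [2, b][2, b]
            System.out.println(render(castWithFreshRows(input))); // [1, a][2, b]
        }
    }
    ```
    
    In this sketch the shared-buffer variant reproduces the symptom from the
    query above (only the last struct is visible), while allocating a row per
    element preserves both structs.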
    
    ### How was this patch tested?

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark castStruct

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18412.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18412
    
----
commit 3be3475d3da7e281f7c1a6599988a621c4d6b0f5
Author: gatorsmile <gatorsm...@gmail.com>
Date:   2017-06-24T03:29:38Z

    fix.

----

