GitHub user vofque opened a pull request:

    https://github.com/apache/spark/pull/22745

    [SPARK-21402][SQL][FOLLOWUP] Fix java map of structs deserialization

    When values of MapType with struct keys or values are deserialized into
    Java beans, the fields of those structs get mixed up.
    I suggest using the struct data types taken from the resolved input data
    instead of inferring them from the Java beans.
    
    ## What changes were proposed in this pull request?
    
    The "keyArray" and "valueArray" functions are invoked to extract the
    arrays of keys and values. The struct type of the keys or values is
    currently inferred from the Java bean structure, so it can end up with a
    mixed-up field order.
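
    The mix-up described above can be illustrated with a minimal, Spark-free
    sketch. It assumes a hypothetical struct stored as (name, id) and assumes,
    purely for illustration, that bean inference yields the order (id, name);
    the class and method names are not part of Spark.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StructFieldOrderSketch {

    // Reads struct values by position, trusting the given field order.
    static Map<String, Object> readByPosition(List<String> assumedOrder, Object[] row) {
        Map<String, Object> fields = new LinkedHashMap<>();
        for (int i = 0; i < assumedOrder.size(); i++) {
            fields.put(assumedOrder.get(i), row[i]);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical struct as stored in the input data: (name, id).
        List<String> dataOrder = Arrays.asList("name", "id");
        Object[] row = {"alice", 1};

        // Assumed order produced by bean inference (alphabetical in this sketch).
        List<String> beanOrder = Arrays.asList("id", "name");

        // Positional read with the bean order assigns values to the wrong fields.
        System.out.println("bean-inferred order: " + readByPosition(beanOrder, row));
        // Positional read with the order taken from the data is correct.
        System.out.println("resolved data order: " + readByPosition(dataOrder, row));
    }
}
```

    With the bean-inferred order the sketch prints {id=alice, name=1}; with
    the order taken from the data it prints {name=alice, id=1}.
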
    I created a new UnresolvedInvoke expression as a temporary substitute for
    the Invoke expression while no actual data is available. It allows the
    resulting data type to be determined during analysis from the resolved
    input data rather than from the Java bean (similar to
    UnresolvedMapObjects).
    
    The key and value arrays are then fed to a MapObjects expression, which I
    replaced with UnresolvedMapObjects, just as in the case of ArrayType.
    
    Finally, I added resolution of UnresolvedInvoke expressions to the
    Analyzer.resolveExpression method as an additional pattern-matching case.
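
    The placeholder-then-resolve idea can be sketched outside Spark as
    follows. This is a toy model under stated assumptions, not the code in
    this pull request: Expr, Column, ToyInvoke, ToyUnresolvedInvoke and
    resolve are hypothetical names, and data types are plain strings.

```java
import java.util.function.Function;

public class UnresolvedPlaceholderSketch {

    // Toy expression node; not Spark's Expression class.
    interface Expr {
        boolean resolved();
        String dataType();
    }

    // A resolved input column whose type comes from the actual data.
    static final class Column implements Expr {
        private final String type;
        Column(String type) { this.type = type; }
        public boolean resolved() { return true; }
        public String dataType() { return type; }
    }

    // Concrete invocation with a known result type.
    static final class ToyInvoke implements Expr {
        final Expr target;
        final String method;
        private final String type;
        ToyInvoke(Expr target, String method, String type) {
            this.target = target;
            this.method = method;
            this.type = type;
        }
        public boolean resolved() { return true; }
        public String dataType() { return type; }
    }

    // Placeholder: the result type is computed later from the input's type.
    static final class ToyUnresolvedInvoke implements Expr {
        final Expr target;
        final String method;
        final Function<String, String> typeFromInput;
        ToyUnresolvedInvoke(Expr target, String method, Function<String, String> typeFromInput) {
            this.target = target;
            this.method = method;
            this.typeFromInput = typeFromInput;
        }
        public boolean resolved() { return false; }
        public String dataType() { throw new IllegalStateException("unresolved"); }
    }

    // Analysis step: swap the placeholder for a concrete node once the input is resolved.
    static Expr resolve(Expr e) {
        if (e instanceof ToyUnresolvedInvoke) {
            ToyUnresolvedInvoke u = (ToyUnresolvedInvoke) e;
            if (u.target.resolved()) {
                String type = u.typeFromInput.apply(u.target.dataType());
                return new ToyInvoke(u.target, u.method, type);
            }
        }
        return e;
    }

    // Builds a placeholder for "valueArray" on a map column and resolves it.
    // The string slicing only handles map types whose key type contains no comma.
    static String resolvedValueArrayType(String mapType) {
        Expr column = new Column(mapType);
        Expr placeholder = new ToyUnresolvedInvoke(column, "valueArray",
                t -> "array<" + t.substring(t.indexOf(',') + 1, t.length() - 1) + ">");
        return resolve(placeholder).dataType();
    }

    public static void main(String[] args) {
        System.out.println(resolvedValueArrayType("map<int,struct<name:string,id:int>>"));
    }
}
```

    In this sketch the value struct keeps the field order found in the input
    type, (name, id), because the result type is derived from the resolved
    column rather than from a bean class.
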
    
    ## How was this patch tested?
    
    Added a test case.
    Built the complete project on Travis.
    
    @viirya @kiszk @cloud-fan @michalsenkyr @marmbrus @liancheng

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vofque/spark SPARK-21402-FOLLOWUP

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22745.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22745
    
----
commit d3578bb3776a79b97f2713f752127849e7910368
Author: Vladimir Kuriatkov <vofque@...>
Date:   2018-10-12T10:49:18Z

    Merge pull request #2 from apache/master
    
    Synch with apache:master

commit bac0ff9e45cb91c81bfaff510c0ccf2e0cc4064b
Author: Vladimir Kuriatkov <vofque@...>
Date:   2018-10-16T07:50:00Z

    Merge pull request #4 from apache/master
    
    Synch with apache:master

commit ac5d0bee7cec3c6dba9c19baba47aa43448e18df
Author: Vladimir Kuriatkov <vladimir_kuriatkov@...>
Date:   2018-10-11T12:56:15Z

    Java map of structs deserialization fixed

----


