[ 
https://issues.apache.org/jira/browse/HIVE-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-5320:
------------------------------

    Attachment: HIVE-5320.patch
    
> Querying a table with nested struct type over JSON data results in errors
> -------------------------------------------------------------------------
>
>                 Key: HIVE-5320
>                 URL: https://issues.apache.org/jira/browse/HIVE-5320
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.9.0
>            Reporter: Chaoyu Tang
>         Attachments: HIVE-5320.patch
>
>
> Querying a table with a nested struct column type, such as
> ==
> create table nest_struct_tbl (col1 string, col2 array<struct<a1:string, 
> a2:array<struct<b1:int, b2:string, b3:string>>>>) ROW FORMAT SERDE 
> 'org.openx.data.jsonserde.JsonSerDe'; 
> ==
> over JSON data causes errors, including java.lang.IndexOutOfBoundsException, 
> or returns corrupted data. 
> The JsonSerDe used is 
> json-serde-1.1.4.jar/json-serde-1.1.4-jar-dependencies.jar.
> The cause is that the method
> public List<Object> getStructFieldsDataAsList(Object o) 
> in JsonStructObjectInspector.java 
> returns a reference to a static ArrayList, "values".
> As a result, the local variable 'list' in the serialize method of Hive's 
> LazySimpleSerDe class refers to the same list in every recursive call, and 
> its elements keep being overwritten in the STRUCT case.
> Solutions:
> 1. Fix JsonSerDe by changing the field 'values' in 
> java.org.openx.data.jsonserde.objectinspector.JsonStructObjectInspector.java
> from static to instance scope.
> A ticket has been filed against JsonSerDe 
> (https://github.com/rcongiu/Hive-JSON-Serde/issues/31).
> 2. Ideally, the serialize method of class LazySimpleSerDe should 
> defensively copy the list returned by 
> soi.getStructFieldsDataAsList(obj), where soi is in this case an instance of 
> JsonStructObjectInspector, so that the recursive calls of serialize work 
> properly regardless of how an external SerDe is implemented.
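
The aliasing problem described above can be reproduced in isolation. The following is a minimal sketch, not the actual Hive or JsonSerDe code: the class SharedListDemo, the method getFieldsShared, and the static VALUES buffer are hypothetical stand-ins for JsonStructObjectInspector.getStructFieldsDataAsList and its static "values" list, and serialize only imitates the recursive STRUCT handling. It assumes the field count is known up front, as it would be from the table schema.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SharedListDemo {
    // Hypothetical stand-in for the static "values" buffer in the inspector.
    static final List<Object> VALUES = new ArrayList<>();

    // Returns the SAME list instance on every call, refilled with new content.
    static List<Object> getFieldsShared(List<?> struct) {
        VALUES.clear();
        VALUES.addAll(struct);
        return VALUES;
    }

    // Recursive serializer imitating the STRUCT case: nested lists are structs.
    static String serialize(Object o, boolean defensiveCopy) {
        if (!(o instanceof List)) {
            return String.valueOf(o);
        }
        List<?> struct = (List<?>) o;
        int fieldCount = struct.size(); // assumed to come from the schema
        List<Object> list = getFieldsShared(struct);
        if (defensiveCopy) {
            // Solution 2: snapshot the list before recursing.
            list = new ArrayList<>(list);
        }
        StringBuilder sb = new StringBuilder("{");
        for (int i = 0; i < fieldCount; i++) {
            if (i > 0) {
                sb.append(",");
            }
            // Without the copy, this recursive call refills 'list' in place.
            sb.append(serialize(list.get(i), defensiveCopy));
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        List<?> row = Arrays.asList("x", Arrays.asList("a", "b", "c"), "y");
        System.out.println(serialize(row, true));  // {x,{a,b,c},y}
        System.out.println(serialize(row, false)); // {x,{a,b,c},c} (corrupted)
        try {
            // A nested struct shorter than its parent shrinks the shared list.
            serialize(Arrays.asList("x", Arrays.asList("a"), "y"), false);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("IndexOutOfBoundsException");
        }
    }
}
```

In this sketch, making VALUES an instance field would not help on its own, because a single inspector instance is still reused for every call. Here only the defensive copy, or a fresh list per call, keeps the outer loop's view of its fields stable.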

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
