[ 
https://issues.apache.org/jira/browse/HIVE-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15112805#comment-15112805
 ] 

Sergio Peña commented on HIVE-12619:
------------------------------------

Thanks [~kamrul] for the patch. Here are my comments:

- Could you add more tests for deeper nesting levels, such as 
array<struct<f1:int,f2:array<struct<e1:int>>>>? Going 3 levels deep should 
be enough to verify that everything works correctly.
- I read the comment on {{getListType}} that says we support only 
3 levels. That is true only when writing Parquet files; when reading, we 
should support all the different kinds of Parquet files. Take a look at 
{{TestArrayCompatibility.java}} for the different schemas to test.
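
To illustrate the second point, assuming the "3 levels" remark refers to the 3-level list layout: the backward-compatibility rules in the Parquet format spec for finding a list's element type can be sketched roughly as below. This is plain Python for illustration only, not Hive or Parquet code. The dict-based schema model and the {{list_element_type}} helper are made up for this example.

```python
# Hypothetical schema model (not a Parquet API): each field is a dict with a
# "name", an optional "repeated" flag, and either "primitive" or "fields".

def is_group(field):
    return "fields" in field

def list_element_type(list_group):
    """Return the element field of a LIST-annotated group, following the
    backward-compatibility rules in the Parquet format spec."""
    children = list_group["fields"]
    if len(children) != 1 or not children[0].get("repeated"):
        raise ValueError("a LIST group must contain exactly one repeated field")
    repeated = children[0]

    # Rule 1: a repeated primitive is itself the element (2-level list).
    if not is_group(repeated):
        return repeated
    # Rule 2: a repeated group with several fields is the element (a struct).
    if len(repeated["fields"]) > 1:
        return repeated
    # Rule 3: legacy writers name a single-field element group "array"
    # or "<list name>_tuple".
    if repeated["name"] in ("array", list_group["name"] + "_tuple"):
        return repeated
    # Otherwise this is the standard 3-level layout: the repeated group is
    # only a wrapper, and its single child is the element.
    return repeated["fields"][0]

struct_fields = [{"name": "f1", "primitive": "binary"},
                 {"name": "f2", "primitive": "int32"}]

# Standard 3-level layout: msg (LIST) > repeated group list > element
three_level = {"name": "msg", "fields": [
    {"name": "list", "repeated": True, "fields": [
        {"name": "element", "fields": struct_fields}]}]}

# Legacy 2-level layout: msg (LIST) > repeated group array {f1, f2}
two_level = {"name": "msg", "fields": [
    {"name": "array", "repeated": True, "fields": struct_fields}]}

# Legacy 2-level layout with a primitive element
primitive_list = {"name": "ids", "fields": [
    {"name": "array", "repeated": True, "primitive": "int32"}]}
```

With these inputs, the element is the {{element}} group for the 3-level layout and the repeated {{array}} field itself for both legacy layouts. A reader that handles only the first shape would misplace the struct fields in the other two.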

The rest of the code looks fine. I think we only need to change the 
{{getListType}} method. I'm not sure how to do it yet, but I'll try to figure 
out a good solution for this and help you.

Also, could you add the next patch to Review Board and paste the link here? 
That would make it easier to leave comments.

> Switching the field order within an array of structs causes the query to fail
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-12619
>                 URL: https://issues.apache.org/jira/browse/HIVE-12619
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.1.0
>            Reporter: Ang Zhang
>            Assignee: Mohammad Kamrul Islam
>            Priority: Minor
>         Attachments: HIVE-12619.2.patch
>
>
> Switching the field order within an array of structs causes the query to fail 
> or return the wrong data for the fields, but switching the field order within 
> just a struct works.
> How to reproduce:
> Case 1: if the two fields have the same type, the query returns wrong data 
> for the fields
> drop table if exists schema_test;
> create table schema_test (msg array<struct<f1: string, f2: string>>) stored 
> as parquet;
> insert into table schema_test select stack(2, array(named_struct('f1', 'abc', 
> 'f2', 'abc2')), array(named_struct('f1', 'efg', 'f2', 'efg2'))) from one 
> limit 2;
> select * from schema_test;
> --returns
> --[{"f1":"efg","f2":"efg2"}]
> --[{"f1":"abc","f2":"abc2"}]
> alter table schema_test change msg msg array<struct<f2: string, f1: string>>;
> select * from schema_test;
> --returns
> --[{"f2":"efg","f1":"efg2"}]
> --[{"f2":"abc","f1":"abc2"}]
> Case 2: if the two fields have different types, the query fails
> drop table if exists schema_test;
> create table schema_test (msg array<struct<f1: string, f2: int>>) stored as 
> parquet;
> insert into table schema_test select stack(2, array(named_struct('f1', 'abc', 
> 'f2', 1)), array(named_struct('f1', 'efg', 'f2', 2))) from one limit 2;
> select * from schema_test;
> --returns
> --[{"f1":"efg","f2":2}]
> --[{"f1":"abc","f2":1}]
> alter table schema_test change msg msg array<struct<f2: int, f1: string>>;
> select * from schema_test;
> Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to 
> org.apache.hadoop.io.IntWritable
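
Both symptoms in the description are consistent with struct fields inside the array being matched by position rather than by name. The Python sketch below is only a model of that behavior, not Hive code and not the approach taken in the patch. The helper names and the tuple-based type encoding are invented for the illustration.

```python
def read_positional(stored_values, table_field_names):
    """Pair stored values with the table's field names purely by position."""
    return dict(zip(table_field_names, stored_values))

def read_by_name(stored, table_type):
    """Recursively project a stored value onto the table type by field name.

    table_type is ("array", element_type), ("struct", [(name, type), ...]),
    or a primitive type name such as "int". This encoding is hypothetical.
    """
    if stored is None:
        return None
    if isinstance(table_type, tuple) and table_type[0] == "array":
        return [read_by_name(item, table_type[1]) for item in stored]
    if isinstance(table_type, tuple) and table_type[0] == "struct":
        return {name: read_by_name(stored.get(name), field_type)
                for name, field_type in table_type[1]}
    return stored  # primitive: pass the value through unchanged

# Case 2: the file stores f1 (string) then f2 (int); the table now declares
# array<struct<f2:int, f1:string>>.
stored_struct = {"f1": "abc", "f2": 1}
positional = read_positional(list(stored_struct.values()), ["f2", "f1"])
# positional == {"f2": "abc", "f1": 1}: a string lands where an int is expected

altered_type = ("array", ("struct", [("f2", "int"), ("f1", "string")]))
by_name = read_by_name([stored_struct], altered_type)
# by_name == [{"f2": 1, "f1": "abc"}]

# The same recursion handles a deeper type with the outer fields swapped:
# array<struct<f2:array<struct<e1:int>>, f1:int>>
deep_type = ("array", ("struct", [
    ("f2", ("array", ("struct", [("e1", "int")]))),
    ("f1", "int"),
]))
deep = read_by_name([{"f1": 1, "f2": [{"e1": 10}]}], deep_type)
# deep == [{"f2": [{"e1": 10}], "f1": 1}]
```

In the positional read, the string value ends up under the int field, which mirrors the Text-to-IntWritable cast failure in Case 2. With two string fields, the same mismatch would silently swap the values, as in Case 1.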



