[jira] [Commented] (HIVE-12619) Switching the field order within an array of structs causes the query to fail
[ https://issues.apache.org/jira/browse/HIVE-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195071#comment-15195071 ]

Hive QA commented on HIVE-12619:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12792594/HIVE-12619.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 9778 tests executed

*Failed tests:*
{noformat}
TestMiniTezCliDriver-cte_4.q-orc_merge5.q-vectorization_limit.q-and-12-more - did not produce a TEST-*.xml file
TestMiniTezCliDriver-dynpart_sort_optimization2.q-cte_mat_1.q-tez_bmj_schema_evolution.q-and-12-more - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_coalesce.q-auto_sortmerge_join_7.q-dynamic_partition_pruning.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7271/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7271/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7271/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12792594 - PreCommit-HIVE-TRUNK-Build

> Switching the field order within an array of structs causes the query to fail
> -----------------------------------------------------------------------------
>
>              Key: HIVE-12619
>              URL: https://issues.apache.org/jira/browse/HIVE-12619
>          Project: Hive
>       Issue Type: Bug
> Affects Versions: 1.1.0
>         Reporter: Ang Zhang
>         Assignee: Mohammad Kamrul Islam
>         Priority: Minor
>      Attachments: HIVE-12619.2.patch, HIVE-12619.3.patch
>
>
> Switching the field order within an array of structs causes the query to fail
> or return the wrong data for the fields, but switching the field order within
> just a struct works.
> How to reproduce:
> Case1 if the two fields have the same type, query will return wrong data for
> the fields
> drop table if exists schema_test;
> create table schema_test (msg array<struct<f1: string, f2: string>>) stored
> as parquet;
> insert into table schema_test select stack(2, array(named_struct('f1', 'abc',
> 'f2', 'abc2')), array(named_struct('f1', 'efg', 'f2', 'efg2'))) from one
> limit 2;
> select * from schema_test;
> --returns
> --[{"f1":"efg","f2":"efg2"}]
> --[{"f1":"abc","f2":"abc2"}]
> alter table schema_test change msg msg array<struct<f2: string, f1: string>>;
> select * from schema_test;
> --returns
> --[{"f2":"efg","f1":"efg2"}]
> --[{"f2":"abc","f1":"abc2"}]
> Case2: if the two fields have different type, the query will fail
> drop table if exists schema_test;
> create table schema_test (msg array<struct<f1: string, f2: int>>) stored as
> parquet;
> insert into table schema_test select stack(2, array(named_struct('f1', 'abc',
> 'f2', 1)), array(named_struct('f1', 'efg', 'f2', 2))) from one limit 2;
> select * from schema_test;
> --returns
> --[{"f1":"efg","f2":2}]
> --[{"f1":"abc","f2":1}]
> alter table schema_test change msg msg array<struct<f2: int, f1: string>>;
> select * from schema_test;
> Failed with exception
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException:
> java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to
> org.apache.hadoop.io.IntWritable

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
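
The two cases in the report are consistent with struct fields inside the array being matched by position rather than by name after the column type is altered. The following is a minimal Python sketch of that idea, not Hive code: the helpers `read_by_position`, `read_by_name`, and `check_types` are hypothetical names introduced only for illustration, and the stored row is modeled as an ordered list of (name, value) pairs written under the original f1, f2 layout.

```python
# Hypothetical model, not Hive's reader: a stored struct is an ordered list
# of (field_name, value) pairs in the order the file was written (f1, f2).

def read_by_position(stored_fields, table_fields):
    """Pair each table field with whatever value sits at the same index."""
    return {name: stored_fields[i][1] for i, (name, _) in enumerate(table_fields)}

def read_by_name(stored_fields, table_fields):
    """Look each table field up by its name in the stored data."""
    by_name = dict(stored_fields)
    return {name: by_name.get(name) for name, _ in table_fields}

def check_types(row, table_fields):
    """Raise if a value does not match the declared type of its field."""
    for name, expected in table_fields:
        if not isinstance(row[name], expected):
            raise TypeError(
                f"{name}: {type(row[name]).__name__} cannot be cast to {expected.__name__}"
            )
    return row

# Case 1: both fields are strings, table altered to struct<f2, f1>.
same_type_row = [("f1", "abc"), ("f2", "abc2")]
altered_same = [("f2", str), ("f1", str)]
print(read_by_position(same_type_row, altered_same))  # values swapped silently
print(read_by_name(same_type_row, altered_same))      # values follow their names

# Case 2: string and int fields, table altered to struct<f2: int, f1: string>.
mixed_row = [("f1", "abc"), ("f2", 1)]
altered_mixed = [("f2", int), ("f1", str)]
try:
    check_types(read_by_position(mixed_row, altered_mixed), altered_mixed)
except TypeError as err:
    print("positional read fails:", err)
print(check_types(read_by_name(mixed_row, altered_mixed), altered_mixed))
```

In this model the positional read reproduces the swapped values of Case 1 and a type error analogous to the ClassCastException of Case 2, while the name-based read returns each value under the field it was written with.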
[jira] [Commented] (HIVE-12619) Switching the field order within an array of structs causes the query to fail
[ https://issues.apache.org/jira/browse/HIVE-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196198#comment-15196198 ]

Xuefu Zhang commented on HIVE-12619:

Patch #3 seems simpler and fixes the field ordering issue. Looking good on my side. +1.

[~spena], it would be good if you could take a look too.
[jira] [Commented] (HIVE-12619) Switching the field order within an array of structs causes the query to fail
[ https://issues.apache.org/jira/browse/HIVE-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197938#comment-15197938 ]

Sergio Peña commented on HIVE-12619:

Thanks [~jxiang]. The code looks better. Just a couple of requests.
- Could you create a method that returns the new 'List selectedFields' and wraps up the new code? Just to avoid making the init() method larger.
- Could you add more tests for deeper levels? Like array>>>? Just to make sure everything will work correctly.
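
The request for deeper-level tests concerns structs nested inside arrays inside structs, where field matching has to be repeated at every level. The sketch below illustrates that kind of recursive, name-based projection in Python. It is not the code from any of the attached patches; the function `project` and the schema encoding (a dict for a struct, a one-element list for an array, a string for a primitive) are assumptions made purely for illustration.

```python
# Hypothetical schema encoding, for illustration only:
#   struct    -> dict mapping field name to field type (insertion-ordered)
#   array     -> one-element list holding the element type
#   primitive -> a type name such as "string" or "int"
# Struct values are plain lists laid out in the file's field order.

def project(value, file_type, table_type):
    """Reorder `value` from the file layout to the table layout,
    matching struct fields by name at every nesting level."""
    if value is None:
        return None
    if isinstance(table_type, dict):  # struct
        file_names = list(file_type)
        out = []
        for name, sub_type in table_type.items():
            if name in file_type:
                idx = file_names.index(name)
                out.append(project(value[idx], file_type[name], sub_type))
            else:
                out.append(None)  # field missing from the file
        return out
    if isinstance(table_type, list):  # array
        return [project(v, file_type[0], table_type[0]) for v in value]
    return value  # primitive

# File written as array<struct<f1:string, f2:array<struct<a:int, b:string>>>>
file_type = [{"f1": "string", "f2": [{"a": "int", "b": "string"}]}]
# Table altered to array<struct<f2:array<struct<b:string, a:int>>, f1:string>>
table_type = [{"f2": [{"b": "string", "a": "int"}], "f1": "string"}]

row = [["abc", [[1, "x"], [2, "y"]]]]
print(project(row, file_type, table_type))
```

Because the function recurses through both arrays and structs, a reordering at any depth is resolved by name, which is the property a deeper-level test would need to exercise.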
[jira] [Commented] (HIVE-12619) Switching the field order within an array of structs causes the query to fail
[ https://issues.apache.org/jira/browse/HIVE-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200348#comment-15200348 ]

Jimmy Xiang commented on HIVE-12619:

Thanks a lot for the review. Attached v4 that has a testcase to cover deeper levels.
[jira] [Commented] (HIVE-12619) Switching the field order within an array of structs causes the query to fail
[ https://issues.apache.org/jira/browse/HIVE-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200356#comment-15200356 ]

Jimmy Xiang commented on HIVE-12619:

[~spena], v3 doesn't work with deeper levels. So I enhanced v2 a little, modified the testcase, and got v4.
[jira] [Commented] (HIVE-12619) Switching the field order within an array of structs causes the query to fail
[ https://issues.apache.org/jira/browse/HIVE-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15203033#comment-15203033 ]

Hive QA commented on HIVE-12619:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12794371/HIVE-12619.5.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 9837 tests executed

*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_multi_field_struct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_unannotated_groups
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_unannotated_primitives
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_avro_array_of_single_field_struct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_thrift_array_of_single_field_struct
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7315/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7315/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7315/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12794371 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-12619) Switching the field order within an array of structs causes the query to fail
[ https://issues.apache.org/jira/browse/HIVE-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15204797#comment-15204797 ]

Sergio Peña commented on HIVE-12619:

[~jxiang] could you upload the patch to the review board? I have some comments I think it would be easier to leave there.
[jira] [Commented] (HIVE-12619) Switching the field order within an array of structs causes the query to fail
[ https://issues.apache.org/jira/browse/HIVE-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15204960#comment-15204960 ]

Jimmy Xiang commented on HIVE-12619:

Sure. I will upload the next patch to RB.
[jira] [Commented] (HIVE-12619) Switching the field order within an array of structs causes the query to fail
[ https://issues.apache.org/jira/browse/HIVE-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205252#comment-15205252 ]

Jimmy Xiang commented on HIVE-12619:

Patch v6 is uploaded to RB: https://reviews.apache.org/r/45128/
[jira] [Commented] (HIVE-12619) Switching the field order within an array of structs causes the query to fail
[ https://issues.apache.org/jira/browse/HIVE-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15209099#comment-15209099 ]

Hive QA commented on HIVE-12619:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12794618/HIVE-12619.6.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 9836 tests executed

*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_llap_udf
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7346/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7346/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7346/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12794618 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-12619) Switching the field order within an array of structs causes the query to fail
[ https://issues.apache.org/jira/browse/HIVE-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216754#comment-15216754 ]

Jimmy Xiang commented on HIVE-12619:

[~spena], could you take a look at patch v6 when you get a chance? Thanks.
[jira] [Commented] (HIVE-12619) Switching the field order within an array of structs causes the query to fail
[ https://issues.apache.org/jira/browse/HIVE-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216807#comment-15216807 ]

Sergio Peña commented on HIVE-12619:
------------------------------------

Thanks [~jxiang] for the patch. The code looks good.
+1

> Switching the field order within an array of structs causes the query to fail
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-12619
>                 URL: https://issues.apache.org/jira/browse/HIVE-12619
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.1.0
>            Reporter: Ang Zhang
>            Assignee: Mohammad Kamrul Islam
>            Priority: Minor
>         Attachments: HIVE-12619.2.patch, HIVE-12619.3.patch,
> HIVE-12619.4.patch, HIVE-12619.5.patch, HIVE-12619.6.patch
>
>
> Switching the field order within an array of structs causes the query to fail
> or return the wrong data for the fields, but switching the field order within
> just a struct works.
>
> How to reproduce:
>
> Case 1: if the two fields have the same type, the query returns wrong data for
> the fields
> drop table if exists schema_test;
> create table schema_test (msg array<struct<f1:string,f2:string>>) stored as parquet;
> insert into table schema_test select stack(2, array(named_struct('f1', 'abc', 'f2', 'abc2')), array(named_struct('f1', 'efg', 'f2', 'efg2'))) from one limit 2;
> select * from schema_test;
> --returns
> --[{"f1":"efg","f2":"efg2"}]
> --[{"f1":"abc","f2":"abc2"}]
> alter table schema_test change msg msg array<struct<f2:string,f1:string>>;
> select * from schema_test;
> --returns
> --[{"f2":"efg","f1":"efg2"}]
> --[{"f2":"abc","f1":"abc2"}]
>
> Case 2: if the two fields have different types, the query fails
> drop table if exists schema_test;
> create table schema_test (msg array<struct<f1:string,f2:int>>) stored as parquet;
> insert into table schema_test select stack(2, array(named_struct('f1', 'abc', 'f2', 1)), array(named_struct('f1', 'efg', 'f2', 2))) from one limit 2;
> select * from schema_test;
> --returns
> --[{"f1":"efg","f2":2}]
> --[{"f1":"abc","f2":1}]
> alter table schema_test change msg msg array<struct<f2:int,f1:string>>;
> select * from schema_test;
> Failed with exception
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException:
> java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to
> org.apache.hadoop.io.IntWritable

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
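Both cases in the report are consistent with the fields of a struct inside an array being matched by position rather than by name once the table schema is reordered. The following is a minimal Python sketch of that idea only. It is not Hive's Parquet reader; the tuple-based storage model and the names FILE_SCHEMA, STORED_MSG, check, read_by_position and read_by_name are all hypothetical and exist only for illustration.

```python
# Hypothetical model: each stored struct keeps its values in write order,
# and the file remembers the schema it was written with.
FILE_SCHEMA = [("f1", str), ("f2", int)]
STORED_MSG = [("abc", 1)]  # one array-of-struct value, fields in write order


def check(value, expected_type, name):
    """Stand-in for the reader's cast; raises on a type mismatch."""
    if not isinstance(value, expected_type):
        raise TypeError(
            f"field {name}: {type(value).__name__} cannot be cast to "
            f"{expected_type.__name__}"
        )
    return value


def read_by_position(msg, table_schema):
    """Pair the i-th stored value with the i-th field of the table schema."""
    return [
        {name: check(value, typ, name)
         for (name, typ), value in zip(table_schema, struct)}
        for struct in msg
    ]


def read_by_name(msg, file_schema, table_schema):
    """Resolve each table field by name against the file's own schema."""
    position = {name: i for i, (name, _) in enumerate(file_schema)}
    return [
        {name: check(struct[position[name]], typ, name)
         for name, typ in table_schema}
        for struct in msg
    ]


# Table schema after the "alter table ... change msg msg ..." step: f2 first.
REORDERED = [("f2", int), ("f1", str)]

# Case 1 analogue: same types, so the positional read silently swaps values.
print(read_by_position([("abc", "abc2")], [("f2", str), ("f1", str)]))

# Case 2 analogue: different types, so the positional read fails on the cast.
try:
    read_by_position(STORED_MSG, REORDERED)
except TypeError as err:
    print("positional read failed:", err)

# A name-based read returns the right values in the new field order.
print(read_by_name(STORED_MSG, FILE_SCHEMA, REORDERED))
```

In this toy model the positional read reproduces both symptoms (swapped values for same-typed fields, a cast failure for differently typed ones), while the name-based read is unaffected by the reordering.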
[jira] [Commented] (HIVE-12619) Switching the field order within an array of structs causes the query to fail
[ https://issues.apache.org/jira/browse/HIVE-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15063788#comment-15063788 ]

Mohammad Kamrul Islam commented on HIVE-12619:
----------------------------------------------

RB: https://reviews.apache.org/r/41541/
[~spena] please review it.
[jira] [Commented] (HIVE-12619) Switching the field order within an array of structs causes the query to fail
[ https://issues.apache.org/jira/browse/HIVE-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065097#comment-15065097 ]

Hive QA commented on HIVE-12619:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12778465/HIVE-12619.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 26 failed/errored test(s), 9965 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_stats2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_filemetadata
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_stats_filemetadata
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_unionDistinct_2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_columnstats_partlvl_multiple_part_clause
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_stats5
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hadoop.hive.ql.security.TestMetastoreAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.authorization.plugin.TestHiveOperationType.checkHiveOperationTypeMatch
org.apache.hive.jdbc.TestJdbcDriver2.testShowGrant
org.apache.hive.jdbc.TestJdbcDriver2.testShowRoleGrant
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6401/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6401/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6401/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 26 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12778465 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-12619) Switching the field order within an array of structs causes the query to fail
[ https://issues.apache.org/jira/browse/HIVE-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15112805#comment-15112805 ]

Sergio Peña commented on HIVE-12619:
------------------------------------

Thanks [~kamrul] for the patch. Here are my comments:

- Could you add more tests for deeper levels, like array>>>? Going 3 levels deep should be enough to verify that everything works correctly.
- I read the comment on {{getListType}} that says we support only 3 levels. That is true only when writing Parquet files; when reading, we should support all the different Parquet files. Take a look at {{TestArrayCompatibility.java}} for the different schemas to test.

The rest of the code looks fine. I think we only need to change the {{getListType}} method. I'm not sure how to do it yet, but I'll try to figure out a good solution for this and help you.

Also, could you add the next patch to Review Board and paste the link here? That makes it easier to leave comments there.
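To make the request for deeper-level tests concrete, here is a small Python sketch of a name-based projection that recurses through nested arrays and structs, so field reordering is exercised at every level. It is an illustration under stated assumptions, not the {{getListType}} logic and not Hive code: the tuple-based type model ("array", element) / ("struct", fields) and the function name project are hypothetical.

```python
def project(value, file_type, table_type):
    """Rebuild `value`, laid out per file_type, in table_type's field order.

    Types are modeled as: a primitive name (str), ("array", element_type),
    or ("struct", [(field_name, field_type), ...]). Struct values are tuples
    in file order; the result uses dicts keyed by field name.
    """
    if isinstance(table_type, str):
        return value  # primitive: nothing to reorder
    kind = table_type[0]
    if kind == "array":
        return [project(item, file_type[1], table_type[1]) for item in value]
    if kind == "struct":
        file_fields = file_type[1]
        position = {name: i for i, (name, _) in enumerate(file_fields)}
        out = {}
        for name, sub_table_type in table_type[1]:
            if name not in position:
                out[name] = None  # field absent from the file
                continue
            i = position[name]
            out[name] = project(value[i], file_fields[i][1], sub_table_type)
        return out
    raise ValueError(f"unknown type kind: {kind!r}")


# File layout: an array of structs whose second field is again an array of structs.
inner_file = ("array", ("struct", [("a", "int"), ("b", "string")]))
file_type = ("array", ("struct", [("f1", "string"), ("inner", inner_file)]))

# Table layout: the same fields, reordered at both struct levels.
inner_table = ("array", ("struct", [("b", "string"), ("a", "int")]))
table_type = ("array", ("struct", [("inner", inner_table), ("f1", "string")]))

stored = [("abc", [(1, "x"), (2, "y")])]
print(project(stored, file_type, table_type))
```

A test along these lines checks that values stay attached to their field names when the outer struct and the nested struct are both reordered.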