[
https://issues.apache.org/jira/browse/HIVE-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110482#comment-14110482
]
Zhichun Wu commented on HIVE-7847:
----------------------------------
According to the test report,
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
has failed 43 times, so I think the failure is unrelated to this patch.
> query orc partitioned table fail when table column type change
> --------------------------------------------------------------
>
> Key: HIVE-7847
> URL: https://issues.apache.org/jira/browse/HIVE-7847
> Project: Hive
> Issue Type: Bug
> Components: File Formats
> Affects Versions: 0.11.0, 0.12.0, 0.13.0
> Reporter: Zhichun Wu
> Assignee: Zhichun Wu
> Fix For: 0.14.0
>
> Attachments: HIVE-7847.1.patch
>
>
> I use the following script to test orc column type change with partitioned
> table on branch-0.13:
> {code}
> use test;
> DROP TABLE if exists orc_change_type_staging;
> DROP TABLE if exists orc_change_type;
> CREATE TABLE orc_change_type_staging (
>   id int
> );
> CREATE TABLE orc_change_type (
>   id int
> ) PARTITIONED BY (`dt` string)
> stored as orc;
> -- load staging table
> LOAD DATA LOCAL INPATH '../hive/examples/files/int.txt' OVERWRITE INTO TABLE orc_change_type_staging;
> -- populate orc hive table
> INSERT OVERWRITE TABLE orc_change_type partition(dt='20140718') select * FROM orc_change_type_staging limit 1;
> -- change column id from int to bigint
> ALTER TABLE orc_change_type CHANGE id id bigint;
> INSERT OVERWRITE TABLE orc_change_type partition(dt='20140719') select * FROM orc_change_type_staging limit 1;
> SELECT id FROM orc_change_type where dt between '20140718' and '20140719';
> {code}
> It fails in the last query "SELECT id FROM orc_change_type where dt between
> '20140718' and '20140719';" with the following exception:
> {code}
> Error: java.io.IOException: java.io.IOException: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.LongWritable
>     at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>     at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>     at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:256)
>     at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:171)
>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:197)
>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:183)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
> Caused by: java.io.IOException: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.LongWritable
>     at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>     at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:344)
>     at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
>     at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
>     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:122)
>     at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:254)
>     ... 11 more
> Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.LongWritable
>     at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$LongTreeReader.next(RecordReaderImpl.java:717)
>     at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1788)
>     at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2997)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:153)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:127)
>     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:339)
>     ... 15 more
> {code}
> The value object is reused for every row we deserialize, so reading fails
> when we move on to the next path, which has a different schema. Resetting
> the value object each time we finish reading one path solves this problem.
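For anyone skimming the trace, here is a minimal standalone sketch of the failure mode and of where the per-path reset described above would go. This is not the actual Hive code path or the HIVE-7847 patch; ValueReuseSketch, cachedValue and valueFor are made-up names used only for illustration.
{code}
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Writable;

public class ValueReuseSketch {
    // Stand-in for the value object a record reader allocates once and then
    // refills for every row (hypothetical name, not a real Hive field).
    static Writable cachedValue;

    static Writable valueFor(String columnType) {
        if (cachedValue == null) {
            cachedValue = "bigint".equals(columnType)
                    ? new LongWritable()
                    : new IntWritable();
        }
        return cachedValue;   // reused across rows -- and across paths
    }

    public static void main(String[] args) {
        // Path 1: partition dt='20140718', written while 'id' was int.
        IntWritable v1 = (IntWritable) valueFor("int");      // ok

        // cachedValue = null;  // the proposed fix: reset between paths

        // Path 2: partition dt='20140719', written after the ALTER to bigint.
        // The stale IntWritable is handed back, and the cast the ORC reader
        // performs (LongTreeReader.next in the trace above) fails:
        LongWritable v2 = (LongWritable) valueFor("bigint"); // ClassCastException
    }
}
{code}
Running main as written throws the same ClassCastException as the stack trace; uncommenting the reset line lets each path allocate a value object matching its own schema, which is the idea behind the fix.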