Zhichun Wu created HIVE-7847:
--------------------------------

             Summary: Query on ORC partitioned table fails when table column type changes
                 Key: HIVE-7847
                 URL: https://issues.apache.org/jira/browse/HIVE-7847
             Project: Hive
          Issue Type: Bug
          Components: File Formats
    Affects Versions: 0.13.0, 0.12.0, 0.11.0
            Reporter: Zhichun Wu
            Assignee: Zhichun Wu
             Fix For: 0.14.0


I used the following script to test an ORC column type change on a partitioned table on branch-0.13:

{code}
use test;
DROP TABLE if exists orc_change_type_staging;
DROP TABLE if exists orc_change_type;
CREATE TABLE orc_change_type_staging (
    id int
);
CREATE TABLE orc_change_type (
    id int
) PARTITIONED BY (`dt` string)
stored as orc;
--- load staging table
LOAD DATA LOCAL INPATH '../hive/examples/files/int.txt' OVERWRITE INTO TABLE orc_change_type_staging;
--- populate orc hive table
INSERT OVERWRITE TABLE orc_change_type partition(dt='20140718') select * FROM orc_change_type_staging limit 1;
--- change column id from int to bigint
ALTER TABLE orc_change_type CHANGE id id bigint;
INSERT OVERWRITE TABLE orc_change_type partition(dt='20140719') select * FROM orc_change_type_staging limit 1;
SELECT id FROM orc_change_type where dt between '20140718' and '20140719';
{code}

It fails in the last query, "SELECT id FROM orc_change_type where dt between '20140718' and '20140719';", with the following exception:
{code}
Error: java.io.IOException: java.io.IOException: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.LongWritable
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:256)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:171)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:197)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:183)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: java.io.IOException: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.LongWritable
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
        at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:344)
        at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
        at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
        at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:122)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:254)
        ... 11 more
Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.LongWritable
        at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$LongTreeReader.next(RecordReaderImpl.java:717)
        at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1788)
        at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2997)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:153)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:127)
        at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:339)
        ... 15 more
{code}

The value object is reused each time we deserialize a row, so the read fails when we start to process the next path, which has a different schema. Resetting the value each time we finish reading one path would solve this problem.
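
To make the failure mode concrete, here is a minimal, self-contained sketch (not Hive's actual reader classes; PathReader and the two factory methods are hypothetical stand-ins) of how reusing one Writable value object across paths with different schemas produces the ClassCastException above, and how recreating the value per path avoids it:

{code}
// Illustrative sketch only, not Hive code: one value object reused across
// paths whose schemas differ, versus a fresh value per path.
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Writable;

public class ReuseAcrossPathsSketch {

    // Stand-in for a per-path record reader; a real reader creates the value
    // object based on the file's own schema.
    interface PathReader {
        Writable createValue();
        void next(Writable value);   // fills the caller-supplied value object
    }

    // Old partition, written while `id` was int: expects IntWritable.
    static PathReader intPathReader() {
        return new PathReader() {
            public Writable createValue() { return new IntWritable(); }
            public void next(Writable value) { ((IntWritable) value).set(1); }
        };
    }

    // New partition, written after `ALTER TABLE ... CHANGE id id bigint`: expects LongWritable.
    static PathReader longPathReader() {
        return new PathReader() {
            public Writable createValue() { return new LongWritable(); }
            public void next(Writable value) { ((LongWritable) value).set(1L); }
        };
    }

    public static void main(String[] args) {
        PathReader[] paths = { intPathReader(), longPathReader() };

        // Buggy behaviour: the value is created once and reused for every path.
        Writable reused = paths[0].createValue();
        try {
            for (PathReader reader : paths) {
                reader.next(reused);  // second path casts IntWritable to LongWritable
            }
        } catch (ClassCastException e) {
            System.out.println("reproduced: " + e);
        }

        // Idea behind the fix: reset (recreate) the value whenever we move on
        // to the next path, so each reader gets an object matching its schema.
        for (PathReader reader : paths) {
            Writable value = reader.createValue();
            reader.next(value);
        }
        System.out.println("per-path value reset works");
    }
}
{code}

This matches the proposal above: clearing the value once a path is finished lets the reader for the next path allocate an object of the type its own schema requires.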




