[ https://issues.apache.org/jira/browse/HIVE-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241835#comment-14241835 ]
Slava Markeyev commented on HIVE-7217:
--------------------------------------
[~muthunivas] Are you using CDH5 with MR1? Here is what Cloudera did to fix
the problem:
https://github.com/cloudera/hive/commit/7af9de2e45ba503a6b1b962e720e13c3a413f1ee
That fix still leaves part of the issue open, though, because their isMR2 check
relies on whether a class is loaded:
https://github.com/cloudera/hive/blob/cdh5-0.13.1_5.2.1/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java#L102
That isn't quite correct, because by default the YARN jars are loaded but not
used.
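
To illustrate the distinction, here is a minimal sketch. It is not the Cloudera
or Hive shim code. The class Mr2CheckSketch and both helper methods are
hypothetical, a plain Map stands in for the Hadoop job configuration, and
java.util.ArrayList stands in for a YARN class that is on the classpath but
unused. It also assumes the cluster sets the Hadoop property
mapreduce.framework.name ("local", "classic" or "yarn"), which an MR1 setup
may not do.

```java
import java.util.HashMap;
import java.util.Map;

public class Mr2CheckSketch {

    // Class-presence check: returns true whenever the class can be found on
    // the classpath, whether or not the job actually runs on that framework.
    static boolean isClassPresent(String className) {
        try {
            Class.forName(className, false, Mr2CheckSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Configuration-based check: asks which framework the job is configured
    // to use. The Map is a stand-in for the real job configuration object.
    static boolean isYarnConfigured(Map<String, String> conf) {
        return "yarn".equals(conf.get("mapreduce.framework.name"));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("mapreduce.framework.name", "classic"); // MR1-style setup

        // The stand-in class is loadable, so a presence check reports true...
        System.out.println(isClassPresent("java.util.ArrayList")); // true
        // ...even though the configuration says the job is not using YARN.
        System.out.println(isYarnConfigured(conf)); // false
    }
}
```

With jars present but unused, the two checks disagree, which is the situation
described above.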
> Inner join query fails in the reducer when join key file is spilled to tmp by
> RowContainer
> ------------------------------------------------------------------------------------------
>
> Key: HIVE-7217
> URL: https://issues.apache.org/jira/browse/HIVE-7217
> Project: Hive
> Issue Type: Bug
> Affects Versions: 0.13.0, 0.13.1
> Reporter: Muthu
> Attachments: reducer.log
>
>
> {code}
> SELECT T1.userid, T2.video_title FROM videoview T1 JOIN video T2 ON
> T1.video_id = T2.video_id WHERE T1.hourid=389567
> hive> show create table video;
> OK
> CREATE TABLE `video`(
> `video_id` int,
> `video_title` string,
> )
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\t'
> LINES TERMINATED BY '\n'
> STORED AS INPUTFORMAT
> 'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
> 'hdfs://elsharpynn001.prod.hulu.com:8020/hive/warehouse/video'
> TBLPROPERTIES (
> 'numPartitions'='0',
> 'numFiles'='1',
> 'last_modified_by'='hadoop',
> 'last_modified_time'='1336446601',
> 'COLUMN_STATS_ACCURATE'='true',
> 'transient_lastDdlTime'='1402514051',
> 'numRows'='0',
> 'totalSize'='586773666',
> 'rawDataSize'='0')
> Time taken: 0.249 seconds, Fetched: 98 row(s)
> {code}
> The reducer fails with the following exception:
> {code}
> 2014-06-11 12:32:39,051 INFO org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 16000 rows for join key [663184]
> 2014-06-11 12:32:39,061 INFO org.apache.hadoop.hive.ql.exec.persistence.RowContainer: RowContainer created temp file /mnt/volume2/mapred/local/taskTracker/muthu.nivas/jobcache/job_201405301214_170634/attempt_201405301214_170634_r_000000_0/work/tmp/hive-rowcontainer413460656723947992/RowContainer1053550561043043830.tmp
> 2014-06-11 12:32:39,237 INFO org.apache.hadoop.mapred.FileInputFormat: Total input paths to process : 2
> 2014-06-11 12:32:39,299 WARN org.apache.hadoop.mapred.Child: Error running child
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: hdfs://elsharpynn001.prod.hulu.com:8020/hive/warehouse/video/video_20140611071209 not a SequenceFile
> at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:283)
> at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: hdfs://elsharpynn001.prod.hulu.com:8020/hive/warehouse/video/video_20140611071209 not a SequenceFile
> at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:237)
> at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:74)
> at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:644)
> at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:758)
> at org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:256)
> at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:216)
> ... 7 more
> Caused by: java.io.IOException: hdfs://elsharpynn001.prod.hulu.com:8020/hive/warehouse/video/video_20140611071209 not a SequenceFile
> at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1805)
> at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1765)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1714)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1728)
> at org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:43)
> at org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:59)
> at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:226)
> ... 12 more
> {code}