HIVE 13 : Simple Join Throwing java.io.IOException

Siddhartha Gunda Wed, 11 Jun 2014 12:13:39 -0700

Hi,

I am trying to run a simple join query on Hive 13.


Both tables are stored in text format. Both are read in the mappers, but the 
error is thrown in the reducer. I don't understand why the reducer is reading 
a table that the mappers have already read, or why it assumes that the video 
file is in SequenceFile format.

Below are the query, the query plan, and the error. Any help is greatly 
appreciated.

Thanks,
Sid

Hadoop Version: 2.0.0-mr1


Query:
SELECT computerguid
FROM revenue_start_adeffx_v2
JOIN video
ON revenue_start_adeffx_v2.video_id = video.video_id
WHERE hourid = '389567';
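
One way to double-check that both tables really are registered as text is to 
inspect the storage format Hive has recorded for them. This is only a sketch: 
the table names are taken from the query above, and the expected InputFormat 
value is an assumption based on the tables being plain text.

```sql
-- Print the table metadata, including the InputFormat, OutputFormat and SerDe.
-- For a plain text table the InputFormat is expected to be
-- org.apache.hadoop.mapred.TextInputFormat (assumption: no custom format was set).
DESCRIBE FORMATTED video;
DESCRIBE FORMATTED revenue_start_adeffx_v2;
```

If both report TextInputFormat, the SequenceFile reader in the reducer is not 
coming from the table definitions themselves.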



Query Plan:
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: revenue_start_adeffx_v2
            Statistics: Num rows: 3175840 Data size: 330287403 Basic stats: COMPLETE Column stats: NONE
            Reduce Output Operator
              key expressions: video_id (type: int)
              sort order: +
              Map-reduce partition columns: video_id (type: int)
              Statistics: Num rows: 3175840 Data size: 330287403 Basic stats: COMPLETE Column stats: NONE
              value expressions: computerguid (type: string)
          TableScan
            alias: video
            Statistics: Num rows: 146679792 Data size: 586719168 Basic stats: COMPLETE Column stats: NONE
            Reduce Output Operator
              key expressions: video_id (type: int)
              sort order: +
              Map-reduce partition columns: video_id (type: int)
              Statistics: Num rows: 146679792 Data size: 586719168 Basic stats: COMPLETE Column stats: NONE
      Reduce Operator Tree:
        Join Operator
          condition map:
               Inner Join 0 to 1
          condition expressions:
            0 {VALUE._col0}
            1
          outputColumnNames: _col0
          Statistics: Num rows: 161347776 Data size: 645391104 Basic stats: COMPLETE Column stats: NONE
          Select Operator
            expressions: _col0 (type: string)
            outputColumnNames: _col0
            Statistics: Num rows: 161347776 Data size: 645391104 Basic stats: COMPLETE Column stats: NONE
            File Output Operator
              compressed: false
              Statistics: Num rows: 161347776 Data size: 645391104 Basic stats: COMPLETE Column stats: NONE
              table:
                  input format: org.apache.hadoop.mapred.TextInputFormat
                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1


Error:

2014-06-11 10:18:34,818 FATAL ExecReducer: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: hdfs://<NN><Path>/video/video_20140611051139 not a SequenceFile
        at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:237)
        at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:74)
        at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:644)
        at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:758)
        at org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:256)
        at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:216)
        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
        at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.io.IOException: hdfs:/<NN><Path>/hive/warehouse/video/video_20140611051139 not a SequenceFile
        at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1805)
        at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1765)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1714)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1728)
        at org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:43)
        at org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:59)
        at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:226)
        ... 12 more

2014-06-11 10:18:34,822 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2014-06-11 10:18:34,824 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: hdfs://<NN><Path>/video_20140611051139 not a SequenceFile
        at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:283)
        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
        at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: hdfs://<NN><Path>/video/video_20140611051139 not a SequenceFile
        at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:237)
        at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:74)
        at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:644)
        at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:758)
        at org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:256)
        at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:216)
        ... 7 more
Caused by: java.io.IOException: hdfs://<NN><Path>/video/video_20140611051139 not a SequenceFile
        at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1805)
        at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1765)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1714)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1728)
        at org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:43)
        at org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:59)
        at org.apache.
