Also, I believe you are comparing the Tez code for IFile (which holds 
intermediate data) with the code for SequenceFile (which holds the final 
output, or the initial input, on stable storage like HDFS). So the two may 
not be related.

-----Original Message-----
From: Gopal Vijayaraghavan [mailto:[email protected]] On Behalf Of Gopal 
Vijayaraghavan
Sent: Monday, July 27, 2015 9:20 PM
To: [email protected]; [email protected]
Cc: Jim Green <[email protected]>
Subject: Re: Hive on Tez query failed with "wrong key class"




> From the java code which creates the sequence file, it has set the key 
>class to NullWritable.class:
> job.setOutputKeyClass(org.apache.hadoop.io.NullWritable.class);
...
> I think that caused the mismatch:
> wrong key class: org.apache.hadoop.io.BytesWritable is not class 
>org.apache.hadoop.io.NullWritable

In all probability, the exception you're hitting originates from here:

https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/SequenceFile.java#L2328
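
Judging from the message text, the check at the linked line appears to compare the class of the key object handed to the reader against the key class recorded in the file. A minimal stand-alone sketch of that kind of check follows. KeyClassCheck and requireKeyClass are hypothetical names, not Hadoop's, and plain JDK types stand in for the Writable classes:

```java
import java.io.IOException;

// Hypothetical stand-in for a reader-side key class check; not Hadoop's actual code.
class KeyClassCheck {
    /** Throws if the caller's key object is not exactly the class the file declares. */
    static void requireKeyClass(Class<?> declaredInHeader, Object callerKey) throws IOException {
        if (callerKey.getClass() != declaredInHeader) {
            // Concatenating a Class object yields "class <name>", which matches
            // the "is not class ..." wording in the reported error.
            throw new IOException("wrong key class: " + callerKey.getClass().getName()
                    + " is not " + declaredInHeader);
        }
    }

    public static void main(String[] args) {
        try {
            // Integer plays the role of NullWritable, String the role of BytesWritable.
            requireKeyClass(Integer.class, "some key");
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under that reading, the class after "is not" is the one declared in the file, and the class before it is the one the reading side supplied.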


> Anyone knows why Tez will check the key and value class when doing 
>sort stuff?

As I said in my earlier mail, if you can check the SequenceFile headers and 
they look like my pasted pair, then we know it's the same as the known issue.
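
If it helps, here is a rough sketch for reading the key and value class names out of a SequenceFile header with nothing but the JDK. It assumes the header layout used by recent format versions: the bytes 'S', 'E', 'Q', one version byte, then the two class names as length-prefixed UTF-8 strings, with the length stored as a Hadoop variable-length int. SeqHeaderPeek and its methods are hypothetical names, and the input is assumed to be a local copy of the start of the file:

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper; assumes the header layout described above.
class SeqHeaderPeek {
    /** Decodes a Hadoop-style variable-length int. */
    static int readVInt(DataInputStream in) throws IOException {
        byte first = in.readByte();
        if (first >= -112) {
            return first; // small values are stored in a single byte
        }
        boolean negative = first < -120;
        int extraBytes = negative ? -(first + 120) : -(first + 112);
        long value = 0;
        for (int i = 0; i < extraBytes; i++) {
            value = (value << 8) | (in.readByte() & 0xff);
        }
        return (int) (negative ? ~value : value);
    }

    /** Reads one length-prefixed UTF-8 string. */
    static String readString(DataInputStream in) throws IOException {
        byte[] bytes = new byte[readVInt(in)];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Returns {keyClassName, valueClassName} from the start of the stream. */
    static String[] readKeyValueClassNames(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        byte[] magic = new byte[3];
        in.readFully(magic);
        if (magic[0] != 'S' || magic[1] != 'E' || magic[2] != 'Q') {
            throw new IOException("not a SequenceFile: missing SEQ magic");
        }
        in.readByte(); // format version, not interpreted in this sketch
        String keyClass = readString(in);
        String valueClass = readString(in);
        return new String[] {keyClass, valueClass};
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            String[] names = readKeyValueClassNames(in);
            System.out.println("key class:   " + names[0]);
            System.out.println("value class: " + names[1]);
        }
    }
}
```

Run it against a local copy of the file (or just its first few hundred bytes) and compare the two printed names with the classes the job that wrote the file was configured with.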

Cheers,
Gopal


