Hi Team,

Some clues:
In the Java code that creates the sequence file, the output key class is set
to NullWritable.class:
job.setOutputKeyClass(org.apache.hadoop.io.NullWritable.class);

However, according to the Hive source code, the key class for the sequence
file writer should be BytesWritable.
HiveSequenceFileOutputFormat.java:
final SequenceFile.Writer outStream = Utilities.createSequenceWriter(jc,
fs, finalOutPath, BytesWritable.class, valueClass, isCompressed);

I think that caused the mismatch:
wrong key class: org.apache.hadoop.io.BytesWritable is not class
org.apache.hadoop.io.NullWritable
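
To illustrate how a message of this shape can arise, here is a minimal sketch
of a strict class-equality check on the key. The class and method names
(KeyClassMismatchSketch, checkKeyClass) and the two empty stand-in types are
hypothetical and only mimic the Hadoop classes named in the error; this is not
the actual Hadoop or Hive code.

```java
import java.io.IOException;

final class KeyClassMismatchSketch {
    // Hypothetical stand-ins for the Hadoop Writable types named in the error.
    static final class NullWritable {}
    static final class BytesWritable {}

    /**
     * Throws if the runtime class of the supplied key differs from the key
     * class declared for the file. The comparison is exact: subclasses are
     * rejected as well.
     */
    static void checkKeyClass(Object key, Class<?> declaredKeyClass) throws IOException {
        if (key.getClass() != declaredKeyClass) {
            throw new IOException("wrong key class: " + key.getClass().getName()
                    + " is not " + declaredKeyClass);
        }
    }

    public static void main(String[] args) {
        try {
            // Assumed scenario: the file declares NullWritable keys,
            // but the caller supplies a BytesWritable key object.
            checkKeyClass(new BytesWritable(), NullWritable.class);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the first class in the message is the key object that was
supplied, and the second is the class declared for the file.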

Then I looked into the Tez source code and found that the reason is in:
tez-runtime-library/src/main/java/org/apache/tez/runtime/library/common/sort/impl/IFile.java
/**
 * Send key/value to be appended to IFile. To represent same key as previous
 * one, send IFile.REPEAT_KEY as key parameter. Should not call this method with
 * IFile.REPEAT_KEY as the first key.
 *
 * @param key
 * @param value
 * @throws IOException
 */
public void append(Object key, Object value) throws IOException {
  checkArgument((key == REPEAT_KEY || key.getClass() == keyClass),
      WRONG_KEY_CLASS,
      key.getClass(), keyClass);

The IFile class above should be specific to Tez. Hive does not have code that
checks the key class and value class.
Does anyone know why Tez checks the key and value classes when sorting?
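
For reference, the guard quoted from IFile.append above can be sketched in
isolation as follows. This is a simplified illustration, not the Tez
implementation: the class AppendKeyCheckSketch, its counter, and the error
message text are assumptions, and plain JDK types stand in for Writable keys.

```java
final class AppendKeyCheckSketch {
    // Sentinel meaning "same key as the previous record"; compared by identity.
    static final Object REPEAT_KEY = new Object();

    private final Class<?> keyClass;
    private int appended = 0;

    AppendKeyCheckSketch(Class<?> keyClass) {
        this.keyClass = keyClass;
    }

    /** Accepts the sentinel or a key whose runtime class is exactly keyClass. */
    void append(Object key, Object value) {
        if (key != REPEAT_KEY && key.getClass() != keyClass) {
            // Message text is illustrative only.
            throw new IllegalArgumentException(
                    "wrong key class: " + key.getClass() + " is not " + keyClass);
        }
        appended++;
    }

    int appended() {
        return appended;
    }

    public static void main(String[] args) {
        AppendKeyCheckSketch writer = new AppendKeyCheckSketch(String.class);
        writer.append("a", 1);         // accepted: exact class match
        writer.append(REPEAT_KEY, 2);  // accepted: sentinel bypasses the class check
        try {
            writer.append(42, 3);      // rejected: Integer is not String
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the comparison uses getClass() equality rather than an instanceof
test, a key of any other class, including a subclass, is rejected.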

Thanks.



On Tue, Jul 21, 2015 at 5:26 PM, Jim Green <openkbi...@gmail.com> wrote:

>
> Sample stacktrace is :
> [Error: Failure while running task:java.lang.RuntimeException:
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException:
> java.io.IOException: wrong key class: org.apache.hadoop.io.BytesWritable is
> not class org.apache.hadoop.io.NullWritable
>         at
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
>         at
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
>         at
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
>         at
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
>         at
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566)
>         at
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
>         at
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException:
> java.io.IOException: java.io.IOException: wrong key class:
> org.apache.hadoop.io.BytesWritable is not class
> org.apache.hadoop.io.NullWritable
>         at
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:71)
>         at
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:294)
>         at
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:163)
>         ... 13 more
> Caused by: java.io.IOException: java.io.IOException: wrong key class:
> org.apache.hadoop.io.BytesWritable is not class
> org.apache.hadoop.io.NullWritable
>         at
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>         at
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>         at
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:363)
>         at
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
>         at
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
>         at
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>         at
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:126)
>         at
> org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:113)
>         at
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:61)
>         ... 15 more
> Caused by: java.io.IOException: wrong key class:
> org.apache.hadoop.io.BytesWritable is not class
> org.apache.hadoop.io.NullWritable
>         at
> org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2495)
>         at
> org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:82)
>         at
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:358)
>         ... 21 more
> ],
>
>
>
> On Tue, Jul 21, 2015 at 11:26 AM, Bikas Saha <bi...@hortonworks.com>
> wrote:
>
>>  A full stack trace would help determine if this is a Tez issue or a Hive
>> issue.
>>
>>
>>
>> *From:* Jim Green [mailto:openkbi...@gmail.com]
>> *Sent:* Tuesday, July 21, 2015 11:12 AM
>> *To:* u...@tez.apache.org; user@hive.apache.org
>> *Subject:* Hive on Tez query failed with "wrong key class"
>>
>>
>>
>> Hi Team,
>>
>>
>>
>> Env: Hive 1.0 on Tez 0.5.3
>>
>> The query is a simple group-by on top of a sequence file table.
>>
>>
>>
>> It fails with the error below in Tez mode:
>>
>> *java.lang.RuntimeException:
>> org.apache.hadoop.hive.ql.metadata.HiveException: *
>>
>> *java.io.IOException: java.io.IOException: wrong key class:
>> org.apache.hadoop.io.BytesWritable is not class
>> org.apache.hadoop.io.NullWritable *
>>
>>
>>
>> And it works fine in MR mode.
>>
>> Has anyone run into this issue before?
>>
>>
>>
>> --
>>
>> Thanks,
>>
>> www.openkb.info
>>
>> (Open KnowledgeBase for Hadoop/Database/OS/Network/Tool)
>>
>
>
>
> --
> Thanks,
> www.openkb.info
> (Open KnowledgeBase for Hadoop/Database/OS/Network/Tool)
>



-- 
Thanks,
www.openkb.info
(Open KnowledgeBase for Hadoop/Database/OS/Network/Tool)
