The following change is correct. However, this MRHelpers methods is soon
going to disappear. We recommend you switch to MRInput.createConfigurer()
and MROutput.createConfigurer() methods. Also, switch to the
*EdgeConfigurer methods e.g. OrderedPartitionedEdgeConfigurer for edge
configuration. Please look at WordCount or OrderedWordCount for example
code.



byte[] mapInputPayload =
MRHelpers.createMRInputPayloadWithGrouping(mapPayload,
JobContextInputFormat.class.getName());



To



byte[] mapInputPayload =
MRHelpers.createMRInputPayloadWithGrouping(mapPayload);





*From:* Thaddeus Diamond [mailto:[email protected]]
*Sent:* Saturday, August 09, 2014 4:10 PM
*To:* [email protected]
*Subject:* TezGroupedSplit ClassCastException



After changeset b0c87d9 my Tez DAG is now broken.  In the logs I'm seeing:



TaskAttempt 3 failed, info=[Error: Failure while running
task:java.lang.ClassCastException:
com.hadapt.de.storage.hadoop.StorageJobInputSplit cannot be cast to
org.apache.hadoop.mapreduce.split.TezGroupedSplit

     at
org.apache.hadoop.mapreduce.split.TezGroupedSplitsInputFormat.createRecordReader(TezGroupedSplitsInputFormat.java:111)

     at
org.apache.tez.mapreduce.lib.MRReaderMapReduce.setupNewRecordReader(MRReaderMapReduce.java:148)

     at
org.apache.tez.mapreduce.lib.MRReaderMapReduce.<init>(MRReaderMapReduce.java:78)

     at
org.apache.tez.mapreduce.input.MRInput.initializeInternal(MRInput.java:327)

     at
org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:109)

     at
org.apache.tez.mapreduce.processor.map.MapProcessor.run(MapProcessor.java:103)

     at
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)

     at
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180)

     at
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)

     at java.security.AccessController.doPrivileged(Native Method)

     at javax.security.auth.Subject.doAs(Subject.java:415)

     at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)

     at
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:172)

     at
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:167)

     at java.util.concurrent.FutureTask.run(FutureTask.java:262)

     at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

     at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

     at java.lang.Thread.run(Thread.java:744)

]], Vertex failed as one or more tasks failed. failedTasks:1]



In order to get my DAG to compile I had to change the following line:



byte[] mapInputPayload =
MRHelpers.createMRInputPayloadWithGrouping(mapPayload,
JobContextInputFormat.class.getName());



To



byte[] mapInputPayload =
MRHelpers.createMRInputPayloadWithGrouping(mapPayload);



I think the relevant change in the Tez project's commit is:



   private static byte[] createMRInputPayload(ByteString bytes,

-      MRSplitsProto mrSplitsProto, String inputFormatName) throws
IOException {

+      MRSplitsProto mrSplitsProto, boolean isGrouped) throws IOException {

     MRInputUserPayloadProto.Builder userPayloadBuilder =
MRInputUserPayloadProto

         .newBuilder();

     userPayloadBuilder.setConfigurationBytes(bytes);

     if (mrSplitsProto != null) {

       userPayloadBuilder.setSplits(mrSplitsProto);

     }

-    if (inputFormatName!=null) {

-      userPayloadBuilder.setInputFormatName(inputFormatName);

-    }



+    userPayloadBuilder.setGroupingEnabled(isGrouped);


     // TODO Should this be a ByteBuffer or a byte array ? A ByteBuffer
would be

     // more efficient.

     return userPayloadBuilder.build().toByteArray();

   }

-

+



Please advise.



Thanks,

Thad

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Reply via email to