Hi, 

I am using Cloudera and running a MapReduce job written in Pig Latin. I hit
the following exception in a map task:
014-04-15 11:30:39,532 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.pig.data.DataBag
        at org.apache.pig.builtin.Distinct.getDistinctFromNestedBags(Distinct.java:140)
        at org.apache.pig.builtin.Distinct.access$100(Distinct.java:39)
        at org.apache.pig.builtin.Distinct$Intermediate.exec(Distinct.java:101)
        at org.apache.pig.builtin.Distinct$Intermediate.exec(Distinct.java:94)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:337)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:376)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:354)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:372)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:297)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:308)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:263)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODemux.runPipeline(PODemux.java:220)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODemux.getNext(PODemux.java:210)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.processOnePackageOutput(PigCombiner.java:185)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:163)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:51)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:164)
        at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1477)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1587)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1199)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:609)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:675)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
        at org.apache.hadoop.mapred.Child.main(Child.java:262)

Looking at the stack trace, I think the exception is thrown here:
http://grepcode.com/file/repository.cloudera.com/content/repositories/releases/org.apache.pig/pig/0.11.0-cdh4.3.1/org/apache/pig/builtin/Distinct.java
(line 140)
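
To illustrate the kind of failure the message describes, here is a minimal sketch in plain Java. It is not Pig's actual code: the class CastSketch, the method describe, and the use of java.util.List as a stand-in for DataBag are all hypothetical. It only shows that a cast to a bag-like type works when the field really holds a bag and throws ClassCastException when the field holds a String.

```java
import java.util.ArrayList;
import java.util.List;

public class CastSketch {
    // Hypothetical stand-in for reading one tuple field and treating it as a bag.
    // java.util.List plays the role of DataBag here; this is an assumption for
    // illustration, not the real Pig type.
    static String describe(Object field) {
        try {
            List<?> bag = (List<?>) field; // fails if field is a String
            return "bag of " + bag.size();
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        List<String> bag = new ArrayList<>();
        bag.add("a");
        bag.add("b");
        System.out.println(describe(bag));   // field holds a bag
        System.out.println(describe("abc")); // field holds a String
    }
}
```

So the open question is why the field would hold a String in one attempt and a bag in the retry.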

However, the second attempt of this map task succeeded, using exactly the
same data and the same code. This really confuses me.

Any insight about this?

Thanks,
Lei
 


leiwang...@gmail.com
