[ 
https://issues.apache.org/jira/browse/PIG-4036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040450#comment-14040450
 ] 

Rohini Palaniswamy commented on PIG-4036:
-----------------------------------------

The BigData_4 stack trace is below.

{code}
2014-06-23 04:54:16,640 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2219)
        at java.util.ArrayList.grow(ArrayList.java:213)
        at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:187)
        at java.util.ArrayList.add(ArrayList.java:411)
        at org.apache.pig.data.InternalCachedBag.add(InternalCachedBag.java:82)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.CombinerPackager.getNext(CombinerPackager.java:128)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage.getNextTuple(POPackage.java:271)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.processOnePackageOutput(PigCombiner.java:177)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:168)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:51)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170)
        at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1631)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1568)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1417)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:664)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:731)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1300)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)
{code}

The problem is with the protected DataBag[] bags; field in Packager.java. In
the older version, POCombinerPackage iterated through the tuples directly. The
current code in POPackage iterates through the tuples, puts them in a bag (in
BigData_4 that bag grows to 600+ MB), and passes the bag to POCombinerPackage.
Adding to the bag there causes the OOM, because memory is already occupied by
the first bag. We need to do some rework on POPackage and Packager.
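
To illustrate the difference in memory behavior, here is a minimal,
self-contained sketch. It is not Pig code: the class
CombinerBufferingSketch and its methods are hypothetical, and plain
java.util lists stand in for Pig's bags. It only shows why buffering the
values into one collection and then copying them into a second one holds
roughly twice the data, while a streaming pass holds none of it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; these names do not exist in Pig.
public class CombinerBufferingSketch {

    // Streaming style: consume each value as it arrives and keep only
    // the running aggregate, so no copy of the input is held.
    static long combineStreaming(Iterator<Long> values) {
        long sum = 0;
        while (values.hasNext()) {
            sum += values.next();
        }
        return sum;
    }

    // Buffering style: the first stage materializes every value into a
    // list, then the second stage copies them into its own list while
    // the first list is still reachable. peakHeld[0] receives the total
    // number of elements held across both lists at the end.
    static long combineBuffered(Iterator<Long> values, long[] peakHeld) {
        List<Long> firstBag = new ArrayList<>();
        while (values.hasNext()) {
            firstBag.add(values.next());
        }

        List<Long> secondBag = new ArrayList<>();
        long sum = 0;
        for (Long v : firstBag) {
            secondBag.add(v); // firstBag cannot be collected yet
            sum += v;
        }
        peakHeld[0] = (long) firstBag.size() + secondBag.size();
        return sum;
    }

    public static void main(String[] args) {
        List<Long> data = Arrays.asList(1L, 2L, 3L, 4L);
        long[] peakHeld = new long[1];

        System.out.println("streaming sum: " + combineStreaming(data.iterator()));
        System.out.println("buffered sum:  " + combineBuffered(data.iterator(), peakHeld));
        System.out.println("elements held by buffered path: " + peakHeld[0]);
    }
}
```

Both paths compute the same result; only the buffered one keeps two
copies of the input alive at once, which is the pattern described above.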

> Fix e2e failures - JobManagement_3, CmdErrors_3 and BigData_4
> -------------------------------------------------------------
>
>                 Key: PIG-4036
>                 URL: https://issues.apache.org/jira/browse/PIG-4036
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.13.0
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)
