[ https://issues.apache.org/jira/browse/PIG-4036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040450#comment-14040450 ]
Rohini Palaniswamy commented on PIG-4036:
-----------------------------------------
BigData_4 stacktrace is below.
{code}
2014-06-23 04:54:16,640 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2219)
at java.util.ArrayList.grow(ArrayList.java:213)
at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:187)
at java.util.ArrayList.add(ArrayList.java:411)
at org.apache.pig.data.InternalCachedBag.add(InternalCachedBag.java:82)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.CombinerPackager.getNext(CombinerPackager.java:128)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage.getNextTuple(POPackage.java:271)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.processOnePackageOutput(PigCombiner.java:177)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:168)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:51)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170)
at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1631)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1568)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1417)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:664)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:731)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1300)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)
{code}
The problem is with protected DataBag[] bags; in Packager.java. In the older
version, POCombinerPackage iterated through the tuples itself. The current code
in POPackage iterates through the tuples, puts them in a bag (600+ MB in
BigData_4), and passes that bag to POCombinerPackage. The OOM occurs when
POCombinerPackage tries to add to its own bag, because memory is already
occupied by the first bag.
We need to do some rework on POPackage and Packager.
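To illustrate the memory pattern described above, here is a minimal standalone
sketch. It does not use any Pig classes: PackagerSketch, Result,
materializeThenCopy and streamOnce are hypothetical names, and plain ArrayLists
stand in for the bags. It only shows why buffering the input once and then
copying it into a second buffer roughly doubles the number of elements held in
memory compared with consuming the iterator directly.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.stream.LongStream;

// Hypothetical illustration only; not the Pig implementation.
class PackagerSketch {

    /** Sum of the values plus the largest number of elements buffered at once. */
    static final class Result {
        final long sum;
        final int peakBuffered;

        Result(long sum, int peakBuffered) {
            this.sum = sum;
            this.peakBuffered = peakBuffered;
        }
    }

    /** Buffers every value in one list, then copies them into a second list. */
    static Result materializeThenCopy(Iterator<Long> values) {
        List<Long> first = new ArrayList<>();
        while (values.hasNext()) {
            first.add(values.next());
        }
        List<Long> second = new ArrayList<>();
        int peak = first.size();
        long sum = 0;
        for (Long v : first) {
            second.add(v);
            sum += v;
            // Both lists are alive at the same time here.
            peak = Math.max(peak, first.size() + second.size());
        }
        return new Result(sum, peak);
    }

    /** Consumes the iterator directly into a single list. */
    static Result streamOnce(Iterator<Long> values) {
        List<Long> only = new ArrayList<>();
        long sum = 0;
        while (values.hasNext()) {
            Long v = values.next();
            only.add(v);
            sum += v;
        }
        return new Result(sum, only.size());
    }

    public static void main(String[] args) {
        int n = 1000;
        Result copied = materializeThenCopy(LongStream.range(0, n).boxed().iterator());
        Result streamed = streamOnce(LongStream.range(0, n).boxed().iterator());
        System.out.println("copy:   sum=" + copied.sum + " peak=" + copied.peakBuffered);
        System.out.println("stream: sum=" + streamed.sum + " peak=" + streamed.peakBuffered);
    }
}
```

Both methods compute the same sum, but the first holds about twice as many
elements at its peak, which matches the situation where the second bag cannot
grow because the first one already fills the heap.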
> Fix e2e failures - JobManagement_3, CmdErrors_3 and BigData_4
> -------------------------------------------------------------
>
> Key: PIG-4036
> URL: https://issues.apache.org/jira/browse/PIG-4036
> Project: Pig
> Issue Type: Bug
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Fix For: 0.13.0
>
>