[ https://issues.apache.org/jira/browse/MAPREDUCE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465215#comment-13465215 ]
Sandy Ryza commented on MAPREDUCE-4655:
---------------------------------------

Looked at a heap dump, and it appears that the problem was caused by Avro holding on to a reference after it was done with it. Filed AVRO-1175.

> MergeManager.reserve can OutOfMemoryError if more than 10% of max memory is
> used on non-MapOutputs
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4655
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4655
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.1-alpha
>            Reporter: Sandy Ryza
>
> The MergeManager does a memory check, using a limit that defaults to 90% of
> Runtime.getRuntime().maxMemory(). Allocations that would bring the total
> memory allocated by the MergeManager over this limit are asked to wait until
> memory frees up. Disk is used for single allocations that would be over 25%
> of the memory limit.
>
> If some other part of the reducer were to be using more than 10% of the
> memory, the current check wouldn't stop an OutOfMemoryError.
>
> Before creating an in-memory MapOutput, a check can be done using
> Runtime.getRuntime().freeMemory(), waiting until memory is freed up if it
> fails.
> 12/08/17 10:36:29 INFO mapreduce.Job: Task Id : attempt_1342723342632_0010_r_000005_0, Status : FAILED
> Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#6
> 	at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:123)
> 	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:371)
> 	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:416)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
> 	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
> Caused by: java.lang.OutOfMemoryError: Java heap space
> 	at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:58)
> 	at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:45)
> 	at org.apache.hadoop.mapreduce.task.reduce.MapOutput.<init>(MapOutput.java:97)
> 	at org.apache.hadoop.mapreduce.task.reduce.MergeManager.unconditionalReserve(MergeManager.java:286)
> 	at org.apache.hadoop.mapreduce.task.reduce.MergeManager.reserve(MergeManager.java:276)
> 	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:327)
> 	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:273)
> 	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:153)
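The reserve logic described in the issue, and the proposed freeMemory() guard, can be sketched roughly as follows. This is a simplified illustration, not the actual MergeManager code: the class, field names, and the Decision enum are hypothetical, and the JVM free-memory value is passed in as a parameter so the behavior is deterministic.

```java
// Hypothetical sketch of MergeManager.reserve's decision logic (MAPREDUCE-4655).
// Not the real Hadoop implementation; names and structure are illustrative.
public class ReserveSketch {
    enum Decision { DISK, WAIT, MEMORY }

    final long memoryLimit;           // defaults to 90% of Runtime.maxMemory()
    final long maxSingleShuffleLimit; // 25% of memoryLimit; larger goes to disk
    long usedMemory = 0;              // only tracks MergeManager's own allocations

    ReserveSketch(long maxMemory) {
        this.memoryLimit = (long) (maxMemory * 0.90);
        this.maxSingleShuffleLimit = (long) (memoryLimit * 0.25);
    }

    // The current check: blind to heap used elsewhere in the reducer,
    // which is how the OutOfMemoryError in the stack trace above can occur.
    Decision reserve(long requestedSize) {
        if (requestedSize > maxSingleShuffleLimit) {
            return Decision.DISK;  // too large for a single in-memory MapOutput
        }
        if (usedMemory + requestedSize > memoryLimit) {
            return Decision.WAIT;  // wait until in-memory outputs are merged
        }
        usedMemory += requestedSize;
        return Decision.MEMORY;
    }

    // The proposed additional guard: also consult the JVM's actual free memory
    // (in real code, Runtime.getRuntime().freeMemory()) before allocating, so
    // memory held by other parts of the reducer also forces a wait.
    Decision reserveWithFreeMemoryCheck(long requestedSize, long jvmFreeMemory) {
        if (requestedSize > maxSingleShuffleLimit) {
            return Decision.DISK;
        }
        if (usedMemory + requestedSize > memoryLimit
                || requestedSize > jvmFreeMemory) {
            return Decision.WAIT;
        }
        usedMemory += requestedSize;
        return Decision.MEMORY;
    }

    public static void main(String[] args) {
        ReserveSketch m = new ReserveSketch(1000);
        System.out.println(m.reserve(300)); // DISK: over 25% of the 900 limit
        System.out.println(m.reserve(200)); // MEMORY: within both limits
        // Old check still admits this even if the rest of the JVM heap is
        // nearly exhausted; the new check would make the same request wait:
        System.out.println(m.reserveWithFreeMemoryCheck(200, 100)); // WAIT
    }
}
```

With a 1000-byte heap, the limit is 900 and the single-allocation cap is 225, so a 300-byte output spills to disk, a 200-byte one is kept in memory, and under the proposed guard the same 200-byte request waits when only 100 bytes of heap are actually free.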