[ https://issues.apache.org/jira/browse/MAPREDUCE-1571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Douglas resolved MAPREDUCE-1571.
--------------------------------------
    Resolution: Duplicate

This is a duplicate of MAPREDUCE-1182. {{MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION}} governs the maximum size of a map output segment that will be stored in memory. The aggregate limit is enforced by a separate mechanism.

> OutOfMemoryError during shuffle
> -------------------------------
>
>                 Key: MAPREDUCE-1571
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1571
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.1, 0.20.2
>            Reporter: Jacob Rideout
>
> An OutOfMemoryError can occur when determining whether the shuffle can be accomplished in memory:
>
> 2010-03-06 07:54:49,621 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 4191933 bytes (435311 raw bytes) into RAM from attempt_201003060739_0002_m_000061_0
> 2010-03-06 07:54:50,222 INFO org.apache.hadoop.mapred.ReduceTask: Task attempt_201003060739_0002_r_000000_0: Failed fetch #1 from attempt_201003060739_0002_m_000202_0
> 2010-03-06 07:54:50,223 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201003060739_0002_r_000000_0 adding host hd37.dfs.returnpath.net to penalty box, next contact in 4 seconds
> 2010-03-06 07:54:50,223 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201003060739_0002_r_000000_0: Got 1 map-outputs from previous failures
> 2010-03-06 07:54:50,223 FATAL org.apache.hadoop.mapred.TaskRunner: attempt_201003060739_0002_r_000000_0 : Map output copy failure : java.lang.OutOfMemoryError: Java heap space
> 	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
> 	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
> 	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
> 	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)
>
> Ted Yu identified the following potential solution:
>
> I think there is a mismatch (in ReduceTask.java) between:
>   this.numCopiers = conf.getInt("mapred.reduce.parallel.copies", 5);
> and:
>   maxSingleShuffleLimit = (long)(maxSize * MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION);
> where MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION is 0.25f, because
>   copiers = new ArrayList<MapOutputCopier>(numCopiers);
> so the total memory allocated for in-mem shuffle is 1.25 * maxSize.
> A JIRA should be filed to correlate the constant 5 above and MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
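The arithmetic behind Ted Yu's observation can be sketched as follows. This is not Hadoop's actual code, just a minimal illustration of the worst case: each of the `numCopiers` copier threads (default 5, from `mapred.reduce.parallel.copies`) may independently hold an in-memory segment of up to `maxSize * MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION` bytes, so the concurrent total can reach 5 * 0.25 = 1.25 times the intended `maxSize` budget.

```java
// Hypothetical sketch, assuming the constants quoted in the comment above;
// it only demonstrates the worst-case memory arithmetic, not the shuffle logic.
public class ShuffleMemorySketch {
    // Fraction quoted from ReduceTask.java in the comment above.
    static final float MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION = 0.25f;

    // Worst-case bytes held in memory if every copier thread simultaneously
    // holds a segment of the maximum permitted size.
    static long worstCaseInMemory(long maxSize, int numCopiers) {
        long maxSingleShuffleLimit =
                (long) (maxSize * MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION);
        return (long) numCopiers * maxSingleShuffleLimit;
    }

    public static void main(String[] args) {
        long maxSize = 100L * 1024 * 1024; // example 100 MB in-memory budget
        int numCopiers = 5;                // default mapred.reduce.parallel.copies
        long worst = worstCaseInMemory(maxSize, numCopiers);
        System.out.println("worst-case in-memory: " + worst
                + " bytes, budget: " + maxSize + " bytes");
        // 5 copiers * 0.25 * maxSize = 1.25 * maxSize, so the worst case
        // exceeds the budget and can trigger the OutOfMemoryError reported here.
    }
}
```

With these example numbers, the worst case is 131,072,000 bytes against a 104,857,600-byte budget, which matches the 1.25 factor in the comment.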