I had the same problem yesterday, and that issue does look to be dead on. I found another forum discussion on AWS that suggested adding more memory as a stop-gap, or applying the patch. I checked the code in Hadoop 1.0.3 (the version on AWS) and it doesn't have the fix, so it looks like the fix is only in the newer builds. I also have an AWS ticket open to see whether their engineers can offer any guidance.
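To make the "more memory" stop-gap a bit more concrete, here is a rough sketch of how I believe the 1.x reducer sizes its in-memory shuffle buffer and decides whether a map output is copied into memory or spilled to disk. The property name mapred.job.shuffle.input.buffer.percent, its 0.70 default, and the 0.25 per-segment fraction are from my reading of ReduceTask.java and should be treated as assumptions; the class and method names (ShuffleBudget and friends) are made up for illustration and are not Hadoop APIs.

```java
public class ShuffleBudget {
    // Assumption: a single map output may take at most this fraction of the
    // in-memory shuffle buffer before it is shuffled to disk instead.
    static final double MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION = 0.25;

    /** In-memory shuffle buffer size in bytes, capped at Integer.MAX_VALUE. */
    static long shuffleBufferBytes(long maxHeapBytes, double inputBufferPercent) {
        return (long) Math.min(maxHeapBytes * inputBufferPercent,
                               (double) Integer.MAX_VALUE);
    }

    /** Largest single map output (in bytes) that may be held in memory. */
    static long maxSingleShuffleBytes(long maxHeapBytes, double inputBufferPercent) {
        return (long) (shuffleBufferBytes(maxHeapBytes, inputBufferPercent)
                       * MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION);
    }

    /** True if a map output of this size would be copied into memory. */
    static boolean fitsInMemory(long mapOutputBytes, long maxHeapBytes,
                                double inputBufferPercent) {
        return mapOutputBytes < maxSingleShuffleBytes(maxHeapBytes, inputBufferPercent);
    }

    public static void main(String[] args) {
        // Heap of the current JVM; for a reduce task this would come from the
        // -Xmx value in mapred.child.java.opts.
        long heap = Runtime.getRuntime().maxMemory();
        double percent = 0.70; // assumed default of mapred.job.shuffle.input.buffer.percent

        System.out.println("max heap bytes:               " + heap);
        System.out.println("shuffle buffer bytes:         " + shuffleBufferBytes(heap, percent));
        System.out.println("largest in-memory map output: " + maxSingleShuffleBytes(heap, percent));
    }
}
```

If this is right, raising -Xmx in mapred.child.java.opts gives the copier more headroom, while lowering mapred.job.shuffle.input.buffer.percent shrinks the in-memory buffer so that more map outputs go to disk. Either one only works around the problem; it does not replace the patch.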
My understanding is that it should be doing the shuffle on disk in this case. The patch appeared to be just a small fix (a few lines) to src/mapred/org/apache/hadoop/mapred/ReduceTask.java.

Dave

From: Manoj Babu [mailto:manoj...@gmail.com]
Sent: Monday, December 10, 2012 8:09 PM
To: user@hadoop.apache.org
Subject: Reg: Map output copy failure

Hi All,

I got the exception below. Is the issue related to https://issues.apache.org/jira/browse/MAPREDUCE-1182 ?

I am using CDH3U1.

2012-12-10 06:22:39,688 FATAL org.apache.hadoop.mapred.Task: attempt_201211120903_9197_r_000024_0 : Map output copy failure : java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1593)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1453)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1302)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1234)

Cheers!
Manoj.