Hey Ted, any further insights on this? We're encountering a similar issue (on CD2). I'll be applying MAPREDUCE-1182 to see if that resolves our case, but it sounds like that JIRA didn't completely eliminate the problem for some folks.
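For anyone else hitting this while the patch question gets sorted out, the knobs discussed downthread can be set per job from the driver, roughly like this. This is just a sketch against the 0.20 JobConf API; the class and method names are made up for illustration, and the specific values are placeholders rather than recommendations:

    import org.apache.hadoop.mapred.JobConf;

    // Hypothetical helper that applies the shuffle-memory workarounds
    // discussed in this thread to a job configuration before submission.
    public class ShuffleMemoryWorkaround {
      public static void apply(JobConf conf) {
        // Jacob reported that dropping this from the 0.70 default to 0.20
        // avoided the OOME for his job.
        conf.set("mapred.job.shuffle.input.buffer.percent", "0.20");
        // Chris suggested simply giving the reduce tasks a larger heap
        // (placeholder value; Andy's cluster was at -Xmx640m).
        conf.set("mapred.child.java.opts", "-Xmx1024m");
        // Andy saw the failure even with this at 1, so don't expect much
        // from it, but it is cheap to vary while testing.
        conf.setInt("mapred.reduce.parallel.copies", 2);
      }
    }

Of the three, lowering mapred.job.shuffle.input.buffer.percent is the only change reported in this thread to have actually avoided the error.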
On Wed, Mar 10, 2010 at 11:54 PM, Ted Yu <yuzhih...@gmail.com> wrote:

I pressed the send key a bit early.

I will have to dig a bit deeper. Hopefully someone can find a reader.close() call, after which I will look for another possible root cause :-)

On Wed, Mar 10, 2010 at 7:48 PM, Ted Yu <yuzhih...@gmail.com> wrote:

Thanks to Andy for the log he provided.

You can see from the log below that size increased steadily from 341535057 to 408181692, approaching maxSize. Then OOME:

2010-03-10 18:38:32,936 INFO org.apache.hadoop.mapred.ReduceTask: reserve: pos=start requestedSize=3893000 size=341535057 numPendingRequests=0 maxSize=417601952
2010-03-10 18:38:32,936 INFO org.apache.hadoop.mapred.ReduceTask: reserve: pos=end requestedSize=3893000 size=345428057 numPendingRequests=0 maxSize=417601952
...
2010-03-10 18:38:35,950 INFO org.apache.hadoop.mapred.ReduceTask: reserve: pos=end requestedSize=635753 size=408181692 numPendingRequests=0 maxSize=417601952
2010-03-10 18:38:36,603 INFO org.apache.hadoop.mapred.ReduceTask: Task attempt_201003101826_0001_r_000004_0: Failed fetch #1 from attempt_201003101826_0001_m_000875_0
2010-03-10 18:38:36,603 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201003101826_0001_r_000004_0 adding host hd17.dfs.returnpath.net to penalty box, next contact in 4 seconds
2010-03-10 18:38:36,604 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201003101826_0001_r_000004_0: Got 1 map-outputs from previous failures
2010-03-10 18:38:36,605 FATAL org.apache.hadoop.mapred.TaskRunner: attempt_201003101826_0001_r_000004_0 : Map output copy failure : java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1513)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1413)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1266)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1200)

Looking at the calls to unreserve() in ReduceTask, two were for IOException and the other was for a sanity check (line 1557), meaning they wouldn't be called in the normal execution path.

I see one call in the IFile.InMemoryReader close() method:

    // Inform the RamManager
    ramManager.unreserve(bufferSize);

And InMemoryReader is used in createInMemorySegments():

    Reader<K, V> reader =
        new InMemoryReader<K, V>(ramManager, mo.mapAttemptId,
                                 mo.data, 0, mo.data.length);

But I don't see reader.close() in the ReduceTask file.

On Wed, Mar 10, 2010 at 3:34 PM, Chris Douglas <chri...@yahoo-inc.com> wrote:

I don't think this OOM is a framework bug per se, and given the rewrite/refactoring of the shuffle in MAPREDUCE-318 (in 0.21), tuning the 0.20 shuffle semantics is likely not worthwhile (though data informing improvements to trunk would be excellent). Most likely (and tautologically), ReduceTask simply requires more memory than is available, and the job failure can be avoided by either 0) increasing the heap size or 1) lowering mapred.shuffle.input.buffer.percent. Most of the tasks we run have a heap of 1GB. For a reduce fetching >200k map outputs, that's a reasonable, even stingy amount of space. -C

On Mar 10, 2010, at 5:26 AM, Ted Yu wrote:

I verified that size and maxSize are long.
This means MR-1182 didn't resolve Andy's issue.

According to Andy:
At the beginning of the job there are 209,754 pending map tasks and 32 pending reduce tasks.

My guess is that GC wasn't reclaiming memory fast enough, leading to OOME because of the large number of in-memory shuffle candidates.

My suggestion for Andy would be to:
1. add -verbose:gc as a JVM parameter
2. modify reserve() slightly to calculate the maximum outstanding numPendingRequests and print the maximum.

Based on the output from the above two items, we can discuss a solution. My intuition is to place an upper bound on numPendingRequests beyond which canFitInMemory() returns false.

My two cents.

On Tue, Mar 9, 2010 at 11:51 PM, Christopher Douglas <chri...@yahoo-inc.com> wrote:

That section of code is unmodified in MR-1182. See the patches/svn log. -C

Sent from my iPhone

On Mar 9, 2010, at 7:44 PM, "Ted Yu" <yuzhih...@gmail.com> wrote:

I just downloaded the hadoop-0.20.2 tar ball from the Cloudera mirror.
This is what I see in ReduceTask (line 999):

    public synchronized boolean reserve(int requestedSize, InputStream in)
        throws InterruptedException {
      // Wait till the request can be fulfilled...
      while ((size + requestedSize) > maxSize) {

I don't see the fix from MR-1182.

That's why I suggested to Andy that he manually apply MR-1182.

Cheers

On Tue, Mar 9, 2010 at 5:01 PM, Andy Sautins <andy.saut...@returnpath.net> wrote:

Thanks Christopher.

The heap size for reduce tasks is configured to be 640M (mapred.child.java.opts set to -Xmx640m).

Andy

-----Original Message-----
From: Christopher Douglas [mailto:chri...@yahoo-inc.com]
Sent: Tuesday, March 09, 2010 5:19 PM
To: common-user@hadoop.apache.org
Subject: Re: Shuffle In Memory OutOfMemoryError

No, MR-1182 is included in 0.20.2.

What heap size have you set for your reduce tasks? -C

Sent from my iPhone

On Mar 9, 2010, at 2:34 PM, "Ted Yu" <yuzhih...@gmail.com> wrote:

Andy:

You need to manually apply the patch.

Cheers

On Tue, Mar 9, 2010 at 2:23 PM, Andy Sautins <andy.saut...@returnpath.net> wrote:

Thanks Ted. My understanding is that MAPREDUCE-1182 is included in the 0.20.2 release. We upgraded our cluster to 0.20.2 this weekend and re-ran the same job scenarios, running with mapred.reduce.parallel.copies set to 1, and we continue to have the same Java heap space error.
-----Original Message-----
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Tuesday, March 09, 2010 12:56 PM
To: common-user@hadoop.apache.org
Subject: Re: Shuffle In Memory OutOfMemoryError

This issue has been resolved in http://issues.apache.org/jira/browse/MAPREDUCE-1182

Please apply the patch M1182-1v20.patch <http://issues.apache.org/jira/secure/attachment/12424116/M1182-1v20.patch>

On Sun, Mar 7, 2010 at 3:57 PM, Andy Sautins <andy.saut...@returnpath.net> wrote:

Thanks Ted. Very helpful. You are correct that I misunderstood the code at ReduceTask.java:1535. I missed the fact that it's in an IOException catch block. My mistake. That's what I get for being in a rush.

For what it's worth, I did re-run the job with mapred.reduce.parallel.copies set with values from 5 all the way down to 1. All failed with the same error:

Error: java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)

So from that it does seem like something else might be going on, yes? I need to do some more research.

I appreciate your insights.

Andy

-----Original Message-----
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Sunday, March 07, 2010 3:38 PM
To: common-user@hadoop.apache.org
Subject: Re: Shuffle In Memory OutOfMemoryError

My observation is based on this call chain:
MapOutputCopier.run() calling copyOutput() calling getMapOutput() calling ramManager.canFitInMemory(decompressedLength)

Basically, ramManager.canFitInMemory() makes its decision without considering the number of MapOutputCopiers that are running. Thus 1.25 * 0.7 of the total heap may be used in shuffling if the default parameters were used.
Of course, you should check the value for mapred.reduce.parallel.copies to see if it is 5. If it is 4 or lower, my reasoning wouldn't apply.

About the ramManager.unreserve() call: ReduceTask.java from hadoop 0.20.2 only has 2731 lines.
So I have to guess the location of the code snippet you provided.
I found this around line 1535:

    } catch (IOException ioe) {
      LOG.info("Failed to shuffle from " + mapOutputLoc.getTaskAttemptId(), ioe);

      // Inform the ram-manager
      ramManager.closeInMemoryFile(mapOutputLength);
      ramManager.unreserve(mapOutputLength);

      // Discard the map-output
      try {
        mapOutput.discard();
      } catch (IOException ignored) {
        LOG.info("Failed to discard map-output from " + mapOutputLoc.getTaskAttemptId(), ignored);
      }

Please confirm the line number.

If we're looking at the same code, I am afraid I don't see how we can improve it. First, I assume IOException shouldn't happen that often. Second, mapOutput.discard() just sets:

    data = null;

for the in-memory case. Even if we call mapOutput.discard() before ramManager.unreserve(), we don't know when GC would kick in and make more memory available.
Of course, given the large number of map outputs in your system, it became more likely that the root cause from my reasoning made OOME happen sooner.

Thanks

On Sun, Mar 7, 2010 at 1:03 PM, Andy Sautins <andy.saut...@returnpath.net> wrote:

Ted,

I'm trying to follow the logic in your mail and I'm not sure I'm following. If you wouldn't mind helping me understand, I would appreciate it.

Looking at the code, maxSingleShuffleLimit is only used in determining if the copy _can_ fit into memory:

    boolean canFitInMemory(long requestedSize) {
      return (requestedSize < Integer.MAX_VALUE &&
              requestedSize < maxSingleShuffleLimit);
    }

It also looks like the RamManager.reserve should wait until memory is available, so it should hit a memory limit for that reason.

What does seem a little strange to me is the following (ReduceTask.java starting at 2730):

    // Inform the ram-manager
    ramManager.closeInMemoryFile(mapOutputLength);
    ramManager.unreserve(mapOutputLength);

    // Discard the map-output
    try {
      mapOutput.discard();
    } catch (IOException ignored) {
      LOG.info("Failed to discard map-output from " + mapOutputLoc.getTaskAttemptId(), ignored);
    }
    mapOutput = null;

So to me that looks like the ramManager unreserves the memory before the mapOutput is discarded. Shouldn't the mapOutput be discarded _before_ the ramManager unreserves the memory?
If the memory is unreserved before the actual underlying data references are removed, then it seems like another thread can try to allocate memory (ReduceTask.java:2730) before the previous memory is disposed of (mapOutput.discard()).

Not sure that makes sense. One thing to note is that the particular job that is failing does have a good number (200k+) of map outputs. The large number of small map outputs may be why we are triggering a problem.

Thanks again for your thoughts.

Andy

-----Original Message-----
From: Jacob R Rideout [mailto:apa...@jacobrideout.net]
Sent: Sunday, March 07, 2010 1:21 PM
To: common-user@hadoop.apache.org
Cc: Andy Sautins; Ted Yu
Subject: Re: Shuffle In Memory OutOfMemoryError

Ted,

Thank you. I filed MAPREDUCE-1571 to cover this issue. I might have some time to write a patch later this week.

Jacob Rideout

On Sat, Mar 6, 2010 at 11:37 PM, Ted Yu <yuzhih...@gmail.com> wrote:

I think there is a mismatch (in ReduceTask.java) between:

    this.numCopiers = conf.getInt("mapred.reduce.parallel.copies", 5);

and:

    maxSingleShuffleLimit = (long)(maxSize * MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION);

where MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION is 0.25f, because

    copiers = new ArrayList<MapOutputCopier>(numCopiers);

so the total memory allocated for the in-mem shuffle is 1.25 * maxSize.

A JIRA should be filed to correlate the constant 5 above and MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION.
Cheers

On Sat, Mar 6, 2010 at 8:31 AM, Jacob R Rideout <apa...@jacobrideout.net> wrote:

Hi all,

We are seeing the following error in our reducers of a particular job:

Error: java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)

After enough reducers fail, the entire job fails. This error occurs regardless of whether mapred.compress.map.output is true. We were able to avoid the issue by reducing mapred.job.shuffle.input.buffer.percent to 20%. Shouldn't the framework, via ShuffleRamManager.canFitInMemory and ShuffleRamManager.reserve, correctly detect the memory available for allocation? I would think that with poor configuration settings (and default settings in particular) the job may not be as efficient, but wouldn't die.
Here is some more context in the logs; I have attached the full reducer log here: http://gist.github.com/323746

2010-03-06 07:54:49,621 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 4191933 bytes (435311 raw bytes) into RAM from attempt_201003060739_0002_m_000061_0
2010-03-06 07:54:50,222 INFO org.apache.hadoop.mapred.ReduceTask: Task attempt_201003060739_0002_r_000000_0: Failed fetch #1 from attempt_201003060739_0002_m_000202_0
2010-03-06 07:54:50,223 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201003060739_0002_r_000000_0 adding host hd37.dfs.returnpath.net to penalty box, next contact in 4 seconds
2010-03-06 07:54:50,223 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201003060739_0002_r_000000_0: Got 1 map-outputs from previous failures
2010-03-06 07:54:50,223 FATAL org.apache.hadoop.mapred.TaskRunner: attempt_201003060739_0002_r_000000_0 : Map output copy failure : java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)

We tried this in both 0.20.1 and 0.20.2. We had hoped MAPREDUCE-1182 would address the issue in 0.20.2, but it did not. Does anyone have any comments or suggestions? Is this a bug I should file a JIRA for?

Jacob Rideout
Return Path
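A small footnote on the numbers discussed above: below is a toy Java calculation of where Ted's "1.25 * maxSize" (i.e. "1.25 * 0.7 of total heap") figure comes from. It only re-derives the arithmetic from values mentioned in this thread (the maxSize in Andy's log, the 0.25f single-shuffle fraction, and the default of 5 parallel copies); it is not code from ReduceTask, and it doesn't claim the accounting is actually exceeded, just shows the worst-case budget if every copier holds a segment at the per-segment limit.

    // Toy re-derivation of the shuffle-memory arithmetic discussed in this
    // thread. Values are taken from the thread, not from a live cluster.
    public class ShuffleBudgetSketch {
      public static void main(String[] args) {
        long maxSize = 417601952L;       // maxSize reported in Andy's reserve: log lines
        double segmentFraction = 0.25;   // MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION in 0.20
        int numCopiers = 5;              // mapred.reduce.parallel.copies default

        long maxSingleShuffleLimit = (long) (maxSize * segmentFraction);
        long worstCaseInFlight = (long) numCopiers * maxSingleShuffleLimit;

        System.out.println("maxSingleShuffleLimit = " + maxSingleShuffleLimit);    // ~104 MB
        System.out.println("worst case, all copiers at limit = " + worstCaseInFlight);
        System.out.println("ratio to maxSize = " + (double) worstCaseInFlight / maxSize);
        // 5 * 0.25 = 1.25, which is the 1.25 * maxSize figure Ted mentions upthread.
      }
    }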