Hey Ted, any further insights on this? We're encountering a similar issue (on CD2). I'll be applying MAPREDUCE-1182 to see if that resolves our case, but it sounds like that JIRA didn't completely eliminate the problem for some folks.
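For anyone else hitting this while the patch question gets sorted out, the knobs discussed downthread can be set per job from the driver, roughly like this. This is just a sketch against the 0.20 JobConf API; the class and method names are made up for illustration, and the specific values are placeholders rather than recommendations:

    import org.apache.hadoop.mapred.JobConf;

    // Hypothetical helper that applies the shuffle-memory workarounds
    // discussed in this thread to a job configuration before submission.
    public class ShuffleMemoryWorkaround {
      public static void apply(JobConf conf) {
        // Jacob reported that dropping this from the 0.70 default to 0.20
        // avoided the OOME for his job.
        conf.set("mapred.job.shuffle.input.buffer.percent", "0.20");
        // Chris suggested simply giving the reduce tasks a larger heap
        // (placeholder value; Andy's cluster was at -Xmx640m).
        conf.set("mapred.child.java.opts", "-Xmx1024m");
        // Andy saw the failure even with this at 1, so don't expect much
        // from it, but it is cheap to vary while testing.
        conf.setInt("mapred.reduce.parallel.copies", 2);
      }
    }

Of the three, lowering mapred.job.shuffle.input.buffer.percent is the only change reported in this thread to have actually avoided the error.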
On Wed, Mar 10, 2010 at 11:54 PM, Ted Yu <yuzhih...@gmail.com> wrote:

I pressed the send key a bit early.

I will have to dig a bit deeper. Hopefully someone can find a reader.close() call, after which I will look for another possible root cause :-)

On Wed, Mar 10, 2010 at 7:48 PM, Ted Yu <yuzhih...@gmail.com> wrote:

Thanks to Andy for the log he provided.

You can see from the log below that size increased steadily from 341535057 to 408181692, approaching maxSize. Then OOME:

2010-03-10 18:38:32,936 INFO org.apache.hadoop.mapred.ReduceTask: reserve: pos=start requestedSize=3893000 size=341535057 numPendingRequests=0 maxSize=417601952
2010-03-10 18:38:32,936 INFO org.apache.hadoop.mapred.ReduceTask: reserve: pos=end requestedSize=3893000 size=345428057 numPendingRequests=0 maxSize=417601952
...
2010-03-10 18:38:35,950 INFO org.apache.hadoop.mapred.ReduceTask: reserve: pos=end requestedSize=635753 size=408181692 numPendingRequests=0 maxSize=417601952
2010-03-10 18:38:36,603 INFO org.apache.hadoop.mapred.ReduceTask: Task attempt_201003101826_0001_r_000004_0: Failed fetch #1 from attempt_201003101826_0001_m_000875_0
2010-03-10 18:38:36,603 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201003101826_0001_r_000004_0 adding host hd17.dfs.returnpath.net to penalty box, next contact in 4 seconds
2010-03-10 18:38:36,604 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201003101826_0001_r_000004_0: Got 1 map-outputs from previous failures
2010-03-10 18:38:36,605 FATAL org.apache.hadoop.mapred.TaskRunner: attempt_201003101826_0001_r_000004_0 : Map output copy failure : java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1513)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1413)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1266)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1200)

Looking at the calls to unreserve() in ReduceTask, two were for IOException and the other was for a sanity check (line 1557), meaning they wouldn't be called in the normal execution path.

I see one call in the IFile.InMemoryReader close() method:

    // Inform the RamManager
    ramManager.unreserve(bufferSize);

And InMemoryReader is used in createInMemorySegments():

    Reader<K, V> reader =
        new InMemoryReader<K, V>(ramManager, mo.mapAttemptId,
                                 mo.data, 0, mo.data.length);

But I don't see reader.close() in the ReduceTask file.

On Wed, Mar 10, 2010 at 3:34 PM, Chris Douglas <chri...@yahoo-inc.com> wrote:

I don't think this OOM is a framework bug per se, and given the rewrite/refactoring of the shuffle in MAPREDUCE-318 (in 0.21), tuning the 0.20 shuffle semantics is likely not worthwhile (though data informing improvements to trunk would be excellent). Most likely (and tautologically), ReduceTask simply requires more memory than is available, and the job failure can be avoided by either 0) increasing the heap size or 1) lowering mapred.shuffle.input.buffer.percent. Most of the tasks we run have a heap of 1GB. For a reduce fetching >200k map outputs, that's a reasonable, even stingy amount of space. -C

On Mar 10, 2010, at 5:26 AM, Ted Yu wrote:

I verified that size and maxSize are long.
This means MR-1182 didn't resolve Andy's issue.

According to Andy:
At the beginning of the job there are 209,754 pending map tasks and 32 pending reduce tasks.

My guess is that GC wasn't reclaiming memory fast enough, leading to OOME because of the large number of in-memory shuffle candidates.

My suggestion for Andy would be to:
1. add -verbose:gc as a JVM parameter
2. modify reserve() slightly to calculate the maximum outstanding numPendingRequests and print the maximum.

Based on the output from the above two items, we can discuss a solution. My intuition is to place an upper bound on numPendingRequests beyond which canFitInMemory() returns false.

My two cents.

On Tue, Mar 9, 2010 at 11:51 PM, Christopher Douglas <chri...@yahoo-inc.com> wrote:

That section of code is unmodified in MR-1182. See the patches/svn log. -C

Sent from my iPhone

On Mar 9, 2010, at 7:44 PM, "Ted Yu" <yuzhih...@gmail.com> wrote:

I just downloaded the hadoop-0.20.2 tar ball from the Cloudera mirror.
This is what I see in ReduceTask (line 999):

    public synchronized boolean reserve(int requestedSize, InputStream in)
        throws InterruptedException {
      // Wait till the request can be fulfilled...
      while ((size + requestedSize) > maxSize) {

I don't see the fix from MR-1182.

That's why I suggested to Andy that he manually apply MR-1182.

Cheers

On Tue, Mar 9, 2010 at 5:01 PM, Andy Sautins <andy.saut...@returnpath.net> wrote:

Thanks Christopher.

The heap size for reduce tasks is configured to be 640M (mapred.child.java.opts set to -Xmx640m).

Andy

-----Original Message-----
From: Christopher Douglas [mailto:chri...@yahoo-inc.com]
Sent: Tuesday, March 09, 2010 5:19 PM
To: common-user@hadoop.apache.org
Subject: Re: Shuffle In Memory OutOfMemoryError

No, MR-1182 is included in 0.20.2.

What heap size have you set for your reduce tasks? -C

Sent from my iPhone

On Mar 9, 2010, at 2:34 PM, "Ted Yu" <yuzhih...@gmail.com> wrote:

Andy:

You need to manually apply the patch.

Cheers

On Tue, Mar 9, 2010 at 2:23 PM, Andy Sautins <andy.saut...@returnpath.net> wrote:

Thanks Ted. My understanding is that MAPREDUCE-1182 is included in the 0.20.2 release. We upgraded our cluster to 0.20.2 this weekend and re-ran the same job scenarios, running with mapred.reduce.parallel.copies set to 1, and we continue to have the same Java heap space error.
-----Original Message-----
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Tuesday, March 09, 2010 12:56 PM
To: common-user@hadoop.apache.org
Subject: Re: Shuffle In Memory OutOfMemoryError

This issue has been resolved in http://issues.apache.org/jira/browse/MAPREDUCE-1182

Please apply the patch M1182-1v20.patch <http://issues.apache.org/jira/secure/attachment/12424116/M1182-1v20.patch>

On Sun, Mar 7, 2010 at 3:57 PM, Andy Sautins <andy.saut...@returnpath.net> wrote:

Thanks Ted. Very helpful. You are correct that I misunderstood the code at ReduceTask.java:1535. I missed the fact that it's in an IOException catch block. My mistake. That's what I get for being in a rush.

For what it's worth, I did re-run the job with mapred.reduce.parallel.copies set with values from 5 all the way down to 1. All failed with the same error:

Error: java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)

So from that it does seem like something else might be going on, yes? I need to do some more research.

I appreciate your insights.

Andy

-----Original Message-----
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Sunday, March 07, 2010 3:38 PM
To: common-user@hadoop.apache.org
Subject: Re: Shuffle In Memory OutOfMemoryError

My observation is based on this call chain:
MapOutputCopier.run() calling copyOutput() calling getMapOutput() calling ramManager.canFitInMemory(decompressedLength)

Basically, ramManager.canFitInMemory() makes its decision without considering the number of MapOutputCopiers that are running. Thus 1.25 * 0.7 of the total heap may be used in shuffling if the default parameters were used.
Of course, you should check the value for mapred.reduce.parallel.copies to see if it is 5. If it is 4 or lower, my reasoning wouldn't apply.

About the ramManager.unreserve() call: ReduceTask.java from hadoop 0.20.2 only has 2731 lines.
So I have to guess the location of the code snippet you provided.
I found this around line 1535:

    } catch (IOException ioe) {
      LOG.info("Failed to shuffle from " + mapOutputLoc.getTaskAttemptId(), ioe);

      // Inform the ram-manager
      ramManager.closeInMemoryFile(mapOutputLength);
      ramManager.unreserve(mapOutputLength);

      // Discard the map-output
      try {
        mapOutput.discard();
      } catch (IOException ignored) {
        LOG.info("Failed to discard map-output from " + mapOutputLoc.getTaskAttemptId(), ignored);
      }

Please confirm the line number.

If we're looking at the same code, I am afraid I don't see how we can improve it. First, I assume IOException shouldn't happen that often. Second, mapOutput.discard() just sets:

    data = null;

for the in-memory case. Even if we call mapOutput.discard() before ramManager.unreserve(), we don't know when GC would kick in and make more memory available.
Of course, given the large number of map outputs in your system, it became more likely that the root cause from my reasoning made OOME happen sooner.

Thanks

On Sun, Mar 7, 2010 at 1:03 PM, Andy Sautins <andy.saut...@returnpath.net> wrote:

Ted,

I'm trying to follow the logic in your mail and I'm not sure I'm following. If you wouldn't mind helping me understand, I would appreciate it.

Looking at the code, maxSingleShuffleLimit is only used in determining if the copy _can_ fit into memory:

    boolean canFitInMemory(long requestedSize) {
      return (requestedSize < Integer.MAX_VALUE &&
              requestedSize < maxSingleShuffleLimit);
    }

It also looks like the RamManager.reserve should wait until memory is available, so it should hit a memory limit for that reason.

What does seem a little strange to me is the following (ReduceTask.java starting at 2730):

    // Inform the ram-manager
    ramManager.closeInMemoryFile(mapOutputLength);
    ramManager.unreserve(mapOutputLength);

    // Discard the map-output
    try {
      mapOutput.discard();
    } catch (IOException ignored) {
      LOG.info("Failed to discard map-output from " + mapOutputLoc.getTaskAttemptId(), ignored);
    }
    mapOutput = null;

So to me that looks like the ramManager unreserves the memory before the mapOutput is discarded. Shouldn't the mapOutput be discarded _before_ the ramManager unreserves the memory?
If the memory is unreserved before the actual underlying data references are removed, then it seems like another thread can try to allocate memory (ReduceTask.java:2730) before the previous memory is disposed of (mapOutput.discard()).

Not sure that makes sense. One thing to note is that the particular job that is failing does have a good number (200k+) of map outputs. The large number of small map outputs may be why we are triggering a problem.

Thanks again for your thoughts.

Andy

-----Original Message-----
From: Jacob R Rideout [mailto:apa...@jacobrideout.net]
Sent: Sunday, March 07, 2010 1:21 PM
To: common-user@hadoop.apache.org
Cc: Andy Sautins; Ted Yu
Subject: Re: Shuffle In Memory OutOfMemoryError

Ted,

Thank you. I filed MAPREDUCE-1571 to cover this issue. I might have some time to write a patch later this week.

Jacob Rideout

On Sat, Mar 6, 2010 at 11:37 PM, Ted Yu <yuzhih...@gmail.com> wrote:

I think there is a mismatch (in ReduceTask.java) between:

    this.numCopiers = conf.getInt("mapred.reduce.parallel.copies", 5);

and:

    maxSingleShuffleLimit = (long)(maxSize * MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION);

where MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION is 0.25f, because

    copiers = new ArrayList<MapOutputCopier>(numCopiers);

so the total memory allocated for the in-mem shuffle is 1.25 * maxSize.

A JIRA should be filed to correlate the constant 5 above and MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION.
Cheers

On Sat, Mar 6, 2010 at 8:31 AM, Jacob R Rideout <apa...@jacobrideout.net> wrote:

Hi all,

We are seeing the following error in our reducers of a particular job:

Error: java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)

After enough reducers fail, the entire job fails. This error occurs regardless of whether mapred.compress.map.output is true. We were able to avoid the issue by reducing mapred.job.shuffle.input.buffer.percent to 20%. Shouldn't the framework, via ShuffleRamManager.canFitInMemory and ShuffleRamManager.reserve, correctly detect the memory available for allocation? I would think that with poor configuration settings (and default settings in particular) the job may not be as efficient, but wouldn't die.
Here is some more context in the logs; I have attached the full reducer log here: http://gist.github.com/323746

2010-03-06 07:54:49,621 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 4191933 bytes (435311 raw bytes) into RAM from attempt_201003060739_0002_m_000061_0
2010-03-06 07:54:50,222 INFO org.apache.hadoop.mapred.ReduceTask: Task attempt_201003060739_0002_r_000000_0: Failed fetch #1 from attempt_201003060739_0002_m_000202_0
2010-03-06 07:54:50,223 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201003060739_0002_r_000000_0 adding host hd37.dfs.returnpath.net to penalty box, next contact in 4 seconds
2010-03-06 07:54:50,223 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201003060739_0002_r_000000_0: Got 1 map-outputs from previous failures
2010-03-06 07:54:50,223 FATAL org.apache.hadoop.mapred.TaskRunner: attempt_201003060739_0002_r_000000_0 : Map output copy failure : java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)

We tried this in both 0.20.1 and 0.20.2. We had hoped MAPREDUCE-1182 would address the issue in 0.20.2, but it did not. Does anyone have any comments or suggestions? Is this a bug I should file a JIRA for?

Jacob Rideout
Return Path
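A small footnote on the numbers discussed above: below is a toy Java calculation of where Ted's "1.25 * maxSize" (i.e. "1.25 * 0.7 of total heap") figure comes from. It only re-derives the arithmetic from values mentioned in this thread (the maxSize in Andy's log, the 0.25f single-shuffle fraction, and the default of 5 parallel copies); it is not code from ReduceTask, and it doesn't claim the accounting is actually exceeded, just shows the worst-case budget if every copier holds a segment at the per-segment limit.

    // Toy re-derivation of the shuffle-memory arithmetic discussed in this
    // thread. Values are taken from the thread, not from a live cluster.
    public class ShuffleBudgetSketch {
      public static void main(String[] args) {
        long maxSize = 417601952L;       // maxSize reported in Andy's reserve: log lines
        double segmentFraction = 0.25;   // MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION in 0.20
        int numCopiers = 5;              // mapred.reduce.parallel.copies default

        long maxSingleShuffleLimit = (long) (maxSize * segmentFraction);
        long worstCaseInFlight = (long) numCopiers * maxSingleShuffleLimit;

        System.out.println("maxSingleShuffleLimit = " + maxSingleShuffleLimit);    // ~104 MB
        System.out.println("worst case, all copiers at limit = " + worstCaseInFlight);
        System.out.println("ratio to maxSize = " + (double) worstCaseInFlight / maxSize);
        // 5 * 0.25 = 1.25, which is the 1.25 * maxSize figure Ted mentions upthread.
      }
    }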