[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650237#comment-13650237
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-5168:
----------------------------------------------------

bq.  If enough of these on-disk outputs are queued up waiting to be merged, it 
can cause the reducer to OOM during the shuffle phase. 
Actually, thinking about it more, it doesn't look like your patch will solve 
this at all, because the streams are opened in the constructor, which the 
patch doesn't change. I am now surprised that the patch helped. The MapOutput 
objects are almost always GC'ed immediately after abort/commit. Maybe I am 
missing something.

OTOH, delaying the creation of the output stream until the shuffle to disk is 
actually about to happen will fix the issue.
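
To make the suggestion concrete, here is a rough sketch of the idea, not the 
actual Hadoop code. LazyDiskOutput, holdsStream() and shuffle() are 
hypothetical names made up for illustration, and the 128K buffer size is just 
the figure from the description. The constructor only records the target 
path. The stream and its buffer exist only while the copy is running, and the 
reference is dropped afterwards.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for an on-disk map output; not the real Hadoop class.
class LazyDiskOutput {
    // Assumed buffer size, taken from the 128K figure in the description.
    static final int BUFFER_SIZE = 128 * 1024;

    private final Path tmpPath;
    private OutputStream disk; // stays null until shuffle() actually runs

    LazyDiskOutput(Path tmpPath) {
        // Only remember where the data will go: no stream, no buffer yet.
        this.tmpPath = tmpPath;
    }

    // True only while a stream (and its buffer) is referenced by this object.
    boolean holdsStream() {
        return disk != null;
    }

    // Copies the fetched bytes to disk and returns the number of bytes written.
    long shuffle(InputStream input) throws IOException {
        // The stream is created here, right before it is needed.
        disk = new BufferedOutputStream(Files.newOutputStream(tmpPath), BUFFER_SIZE);
        long total = 0;
        try {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = input.read(chunk)) != -1) {
                disk.write(chunk, 0, n);
                total += n;
            }
        } finally {
            OutputStream toClose = disk;
            disk = null; // drop the reference so the buffer can be collected
            toClose.close(); // flushes buffered bytes and releases the file handle
        }
        return total;
    }
}
```

With this shape, an output that is queued up waiting to be merged holds only 
a path, so the number of queued on-disk outputs no longer multiplies the 
buffer size.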
                
> Reducer can OOM during shuffle because on-disk output stream not released
> -------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5168
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5168
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.7, 2.0.5-beta
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Critical
>         Attachments: MAPREDUCE-5168-branch-0.23.patch, MAPREDUCE-5168.patch
>
>
> If a reducer needs to shuffle a map output to disk, it opens an output stream 
> and writes the data to disk.  However it does not release the reference to 
> the output stream within the MapOutput, and the output stream can have a 128K 
> buffer attached to it.  If enough of these on-disk outputs are queued up 
> waiting to be merged, it can cause the reducer to OOM during the shuffle 
> phase.  In one case I saw there were 1200 on-disk outputs queued up to be 
> merged, leading to an extra 150MB of pressure on the heap due to the output 
> stream buffers that were no longer necessary.
