I don't think this OOM is a framework bug per se, and given the
rewrite/refactoring of the shuffle in MAPREDUCE-318 (in 0.21), tuning
the 0.20 shuffle semantics is likely not worthwhile (though data
informing improvements to trunk would be excellent). Most likely (and
tautologically), ReduceTask simply requires more memory than is
available and the job failure can be avoided by either 0) increasing
the heap size or 1) lowering mapred.shuffle.input.buffer.percent. Most
of the tasks we run have a heap of 1GB. For a reduce fetching >200k
map outputs, that's a reasonable, even stingy amount of space. -C
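[Editor's sketch] Concretely, the two knobs Chris mentions live in the job conf / mapred-site.xml. Values below are illustrative only, and note the exact 0.20 property spelling is, to my recollection, mapred.job.shuffle.input.buffer.percent (verify against your mapred-default.xml):

```xml
<!-- Illustrative values, not recommendations: raise the reduce heap, or
     shrink the fraction of it the shuffle may fill with map outputs. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
<property>
  <name>mapred.job.shuffle.input.buffer.percent</name>
  <value>0.50</value> <!-- default is 0.70 -->
</property>
```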
On Mar 10, 2010, at 5:26 AM, Ted Yu wrote:
I verified that size and maxSize are long. This means MR-1182 didn't resolve Andy's issue.
According to Andy:
At the beginning of the job there are 209,754 pending map tasks and 32 pending reduce tasks
My guess is that GC wasn't reclaiming memory fast enough, leading to OOME because of the large number of in-memory shuffle candidates.
My suggestion for Andy would be to:
1. add -verbose:gc as a JVM parameter
2. modify reserve() slightly to calculate the maximum outstanding numPendingRequests and print the maximum.
Based on the output from the above two items, we can discuss a solution. My intuition is to place an upper bound on numPendingRequests beyond which canFitInMemory() returns false.
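[Editor's sketch] The two suggestions above could look roughly like this. This is a hypothetical sketch modeled loosely on 0.20's ShuffleRamManager, not actual Hadoop code; MAX_PENDING is an invented knob, not a real configuration property:

```java
// Sketch of Ted's two suggestions: track the high-water mark of
// outstanding reservations, and refuse the in-memory shuffle path once
// too many requests are pending. All names here are illustrative.
class BoundedRamManager {
    private static final int MAX_PENDING = 10;  // invented upper bound
    private final long maxSize;                 // e.g. 0.70 * -Xmx
    private long size = 0;                      // bytes currently reserved
    private int numPendingRequests = 0;         // reservations not yet released
    private int maxObservedPending = 0;         // high-water mark to print

    BoundedRamManager(long maxSize) { this.maxSize = maxSize; }

    synchronized boolean canFitInMemory(long requestedSize) {
        // Original size test, plus the proposed bound on pending requests.
        return requestedSize < maxSize && numPendingRequests < MAX_PENDING;
    }

    synchronized void reserve(long requestedSize) throws InterruptedException {
        numPendingRequests++;
        maxObservedPending = Math.max(maxObservedPending, numPendingRequests);
        while (size + requestedSize > maxSize) {
            wait();  // same wait loop shape as 0.20's reserve()
        }
        size += requestedSize;
    }

    synchronized void unreserve(long requestedSize) {
        size -= requestedSize;
        numPendingRequests--;
        notifyAll();
    }

    synchronized int maxObservedPending() { return maxObservedPending; }
}
```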
My two cents.
On Tue, Mar 9, 2010 at 11:51 PM, Christopher Douglas wrote:
That section of code is unmodified in MR-1182. See the patches/svn
log. -C
Sent from my iPhone
On Mar 9, 2010, at 7:44 PM, "Ted Yu" wrote:
I just downloaded hadoop-0.20.2 tar ball from cloudera mirror.
This is what I see in ReduceTask (line 999):
public synchronized boolean reserve(int requestedSize, InputStream in)
    throws InterruptedException {
  // Wait till the request can be fulfilled...
  while ((size + requestedSize) > maxSize) {
I don't see the fix from MR-1182.
That's why I suggested to Andy that he manually apply MR-1182.
Cheers
On Tue, Mar 9, 2010 at 5:01 PM, Andy Sautins wrote:
Thanks Christopher.
The heap size for reduce tasks is configured to be 640M (mapred.child.java.opts set to -Xmx640m).
Andy
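[Editor's sketch] Given Chris's point about buffer sizing, Andy's 640M heap bounds the in-memory shuffle buffer fairly tightly. A back-of-the-envelope calculation, assuming 0.20's default shuffle buffer fraction of 0.70 (an assumption; check the cluster's actual setting):

```java
// Rough arithmetic only: the shuffle buffer is a fixed fraction of the
// reduce task heap. 0.70 is 0.20's documented default for
// mapred.job.shuffle.input.buffer.percent (verify for your cluster).
public class ShuffleBudget {
    static long shuffleBuffer(long heapBytes, double bufferPercent) {
        return (long) (heapBytes * bufferPercent);
    }

    public static void main(String[] args) {
        long heap = 640L * 1024 * 1024;              // -Xmx640m
        long buffer = shuffleBuffer(heap, 0.70);
        // ~448 MB for in-memory map outputs before reserve() blocks.
        System.out.println(buffer / (1024 * 1024) + " MB");  // prints "448 MB"
    }
}
```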
-----Original Message-----
From: Christopher Douglas [mailto:chri...@yahoo-inc.com]
Sent: Tuesday, March 09, 2010 5:19 PM
To: common-user@hadoop.apache.org
Subject: Re: Shuffle In Memory OutOfMemoryError
No, MR-1182 is included in 0.20.2
What heap size have you set for your reduce tasks? -C
Sent from my iPhone
On Mar 9, 2010, at 2:34 PM, "Ted Yu" wrote:
Andy:
You need to manually apply the patch.
Cheers
On Tue, Mar 9, 2010 at 2:23 PM, Andy Sautins <andy.saut...@returnpath.net> wrote:
Thanks Ted. My understanding is that MAPREDUCE-1182 is included in the 0.20.2 release. We upgraded our cluster to 0.20.2 this weekend and re-ran the same job scenarios. We ran with mapred.reduce.parallel.copies set to 1 and continue to see the same Java heap space error.
-----Original Message-----
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Tuesday, March 09, 2010 12:56 PM
To: common-user@hadoop.apache.org
Subject: Re: Shuffle In Memory OutOfMemoryError
This issue has been resolved in
http://issues.apache.org/jira/browse/MAPREDUCE-1182
Please apply the patch M1182-1v20.patch:
http://issues.apache.org/jira/secure/attachment/12424116/M1182-1v20.patch
On Sun, Mar 7, 2010 at 3:57 PM, Andy Sautins <andy.saut...@returnpath.net> wrote:
Thanks Ted. Very helpful. You are correct that I misunderstood the code at ReduceTask.java:1535. I missed the fact that it's in an IOException catch block. My mistake. That's what I get for being in a rush.
For what it's worth I did re-run the job with mapred.reduce.parallel.copies set with values from 5 all the way down to 1. All failed with the same error:
Error: java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)
So from that it does seem like something else might be going on, yes? I need to do some more research.
I appreciate your insights.
Andy
-----Original Message-----
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Sunday, March 07, 2010 3:38 PM
To: common-user@hadoop.apache.org
Subject: Re: Shuffle In Memory OutOfMemoryError
My observation is based on this call chain: MapOutputCopier.run() calling copyOutput() calling getMapOutput() calling ramManager.canFitInMemory(decompressedLength).
Basically ramManager.canFitInMemory() makes its decision without considering the number of MapOutputCopiers that are running. Thus 1.25 * 0.7 o
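[Editor's sketch] The hazard Ted is pointing at can be shown in miniature. This is a toy, with all names invented, and it exaggerates the effect (the real reserve() does block once the buffer is full); it only illustrates that a per-request check against a static limit, with no awareness of other copiers, lets several copiers pass before any reservation lands:

```java
// Toy illustration of the check-then-act gap: every copier passes the
// same per-request limit check, so together they can commit far more
// than the budget. All names are invented for this sketch.
class ShuffleRace {
    static final long MAX_SIZE = 100;  // stands in for 0.70 * heap

    // Per-request check: no awareness of other running copiers.
    static boolean canFitInMemory(long requested) {
        return requested < MAX_SIZE;
    }

    // Simulate N copiers that each pass the check and then allocate.
    static long simulate(int copiers, long request) {
        long committed = 0;
        for (int i = 0; i < copiers; i++) {
            if (canFitInMemory(request)) {
                committed += request;
            }
        }
        return committed;
    }

    public static void main(String[] args) {
        // 5 copiers x 60 bytes each "fit" into a 100-byte budget.
        System.out.println(simulate(5, 60));  // prints 300
    }
}
```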