[jira] [Created] (MAPREDUCE-6105) Inconsistent configuration in property mapreduce.reduce.shuffle.merge.percent

2014-09-23 Thread Dongwook Kwon (JIRA)
Dongwook Kwon created MAPREDUCE-6105:


 Summary: Inconsistent configuration in property 
mapreduce.reduce.shuffle.merge.percent
 Key: MAPREDUCE-6105
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6105
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.4.1, 2.4.0
Reporter: Dongwook Kwon
Priority: Trivial


Similar to MAPREDUCE-5906: in MergeManagerImpl.java, the default value of 
MRJobConfig.SHUFFLE_MERGE_PERCENT (mapreduce.reduce.shuffle.merge.percent) 
should be 0.66, according to the official documentation:
https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml

{code}
this.mergeThreshold = (long)(this.memoryLimit *
    jobConf.getFloat(MRJobConfig.SHUFFLE_MERGE_PERCENT,
                     0.90f));
{code}
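The effect of the mismatch can be shown with a small standalone sketch; the memoryLimit value below is made up for illustration (the real one is derived from the heap size and mapreduce.reduce.shuffle.input.buffer.percent), and the helper method just mirrors the multiplication above outside of MergeManagerImpl:

```java
public class MergeThresholdDemo {
    // Stands in for the computation in MergeManagerImpl, with plain values
    // replacing jobConf.getFloat(...) and this.memoryLimit.
    static long mergeThreshold(long memoryLimit, float mergePercent) {
        return (long) (memoryLimit * mergePercent);
    }

    public static void main(String[] args) {
        long memoryLimit = 1L << 30; // 1 GiB, illustrative only

        // Fallback currently hard-coded in MergeManagerImpl:
        long codeDefault = mergeThreshold(memoryLimit, 0.90f);
        // Default documented in mapred-default.xml:
        long docDefault = mergeThreshold(memoryLimit, 0.66f);

        // A job relying on the documented default gets a much later merge
        // trigger than expected.
        System.out.println(codeDefault > docDefault); // prints "true"
    }
}
```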



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MAPREDUCE-6108) ShuffleError OOM while reserving by MergeManagerImpl

2014-09-24 Thread Dongwook Kwon (JIRA)
Dongwook Kwon created MAPREDUCE-6108:


 Summary: ShuffleError OOM while reserving by MergeManagerImpl
 Key: MAPREDUCE-6108
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6108
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.5.1, 2.4.1, 2.5.0, 2.4.0
Reporter: Dongwook Kwon
Priority: Minor


Shuffle hits an OOM issue from time to time, as reported in this email thread:
http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-dev/201408.mbox/%3ccabwxxjnk-on0xtrmurijd8sdgjjtamsvqw2czpm3oekj3ym...@mail.gmail.com%3E

{code}
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#14
	at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:377)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:56)
	at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:46)
	at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.<init>(InMemoryMapOutput.java:63)
	at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.unconditionalReserve(MergeManagerImpl.java:297)
	at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.reserve(MergeManagerImpl.java:287)
	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:411)
	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:341)
	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)
{code}

Lowering the mapreduce.reduce.shuffle.input.buffer.percent value mitigates 
the issue. However, depending on the data and the memory the system has, the 
issue comes back.
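For reference, the mitigation would be set in mapred-site.xml like this (the value 0.5 is only an example; the default is 0.70):

```xml
<!-- mapred-site.xml: example mitigation only; 0.5 is an illustrative value -->
<property>
  <name>mapreduce.reduce.shuffle.input.buffer.percent</name>
  <value>0.5</value>
</property>
```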

From my tests, when it happens the issue is very consistent: the memory 
footprint and the point at which the OOM happens were the same, regardless of 
the value of mapreduce.reduce.shuffle.input.buffer.percent (my test used the 
default, 0.7).


Here is what I found.

According to MergeManagerImpl as implemented by 
https://issues.apache.org/jira/browse/MAPREDUCE-4808, the reserve method 
deliberately allows just one thread (fetcher) to go over "memoryLimit" by 
checking the condition (usedMemory > memoryLimit) instead of (usedMemory + 
requestedSize > memoryLimit), to prevent the all-fetchers-stalling issue the 
comment describes. This seems to work well most of the time. However, when 
that one fetcher tries to reserve such that usedMemory + requestedSize exceeds 
memoryLimit (Runtime.getRuntime().maxMemory()), I think there is an OOM issue.

{code}
  @Override
  public synchronized MapOutput<K, V> reserve(TaskAttemptID mapId,
                                              long requestedSize,
                                              int fetcher
                                              ) throws IOException {
    if (!canShuffleToMemory(requestedSize)) {
      LOG.info(mapId + ": Shuffling to disk since " + requestedSize +
               " is greater than maxSingleShuffleLimit (" +
               maxSingleShuffleLimit + ")");
      return new OnDiskMapOutput<K, V>(mapId, reduceId, this, requestedSize,
                                       jobConf, mapOutputFile, fetcher, true);
    }

    // Stall shuffle if we are above the memory limit
    // It is possible that all threads could just be stalling and not make
    // progress at all. This could happen when:
    //
    // requested size is causing the used memory to go above limit &&
    // requested size < singleShuffleLimit &&
    // current used size < mergeThreshold (merge will not get triggered)
    //
    // To avoid this from happening, we allow exactly one thread to go past
    // the memory limit. We check (usedMemory > memoryLimit) and not
    // (usedMemory + requestedSize > memoryLimit). When this thread is done
    // fetching, this will automatically trigger a merge thereby unlocking
    // all the stalled threads
    if (usedMemory > memoryLimit) {
      LOG.debug(mapId + ": Stalling shuffle since usedMemory (" + usedMemory
          + ") is greater than memoryLimit (" + memoryLimit + ")." +
          " CommitMemory is (" + commitMemory + ")");
      return null;
    }

    // Allow the in-memory shuffle to progress
    LOG.debug(mapId + ": Proceeding with shuffle since usedMemory ("
        + usedMemory + ") is lesser than memoryLimit (" + memoryLimit + ")."
        + "CommitMemory is (" + commitMemory + ")");
    return unconditionalReserve(mapId, requestedSize, true);
  }
{code}
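To make the failure mode concrete, here is a small standalone sketch of the checks above with made-up numbers (the heap size, buffer percent, and request size are illustrative, not taken from the report): the (usedMemory > memoryLimit) test can let a reservation through even when usedMemory + requestedSize exceeds the whole JVM heap.

```java
public class ReserveCheckDemo {
    public static void main(String[] args) {
        // All values are illustrative "memory units", not from the report.
        long maxMemory = 1000;            // Runtime.getRuntime().maxMemory()
        long memoryLimit = 900;           // maxMemory * input.buffer.percent (0.90 here)
        long maxSingleShuffleLimit = 225; // memoryLimit * memory.limit.percent (0.25)
        long usedMemory = 890;            // just under memoryLimit
        long requestedSize = 200;         // one large in-memory map output

        // canShuffleToMemory: the request is small enough to stay in memory.
        boolean toMemory = requestedSize < maxSingleShuffleLimit;

        // reserve() looks only at usedMemory, ignoring the request size,
        // so the reservation is allowed to proceed.
        boolean stalled = usedMemory > memoryLimit;

        // But fulfilling it pushes total usage past the entire heap.
        boolean exceedsHeap = usedMemory + requestedSize > maxMemory;

        System.out.println(toMemory + " " + stalled + " " + exceedsHeap);
        // prints "true false true"
    }
}
```

With these numbers the in-memory path is taken, the stall check passes, and the subsequent BoundedByteArrayOutputStream allocation of requestedSize bytes is exactly the allocation that shows up at the top of the stack trace above.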

https://github.com/apache/hadoop-common/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManagerImpl.java#L256

When the one fetcher tries to reserve (usedMemory + r