[
https://issues.apache.org/jira/browse/HADOOP-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12549264
]
Christian Kunz commented on HADOOP-2119:
----------------------------------------
To be precise:
To find out why reduces get stuck waiting for map output from a certain
number of mappers, we rolled back not only Srikanth's patch but also the
trivial-looking patch suggested by Devaraj:
Index: src/java/org/apache/hadoop/mapred/TaskInProgress.java
===================================================================
--- src/java/org/apache/hadoop/mapred/TaskInProgress.java (revision 598581)
+++ src/java/org/apache/hadoop/mapred/TaskInProgress.java (working copy)
@@ -663,7 +663,7 @@
    * Return whether this TIP still needs to run
    */
   boolean isRunnable() {
-    return !failed && (completes == 0);
+    return !isOnlyCommitPending() && !failed && (completes == 0);
   }
   /**
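
For anyone who wants to see what that one-line hunk changes, here is a rough
standalone sketch. It is not the real TaskInProgress class: TipSketch, the
commitPending field, and the two isRunnable...() method names are made up for
illustration, and commitPending only stands in for whatever state the real
isOnlyCommitPending() inspects.

```java
// Standalone sketch; NOT org.apache.hadoop.mapred.TaskInProgress.
class TipSketch {
    boolean failed = false;
    int completes = 0;
    // Hypothetical flag standing in for the state behind isOnlyCommitPending().
    boolean commitPending = false;

    boolean isOnlyCommitPending() {
        return commitPending;
    }

    // Condition without Devaraj's patch (the "-" line of the hunk).
    boolean isRunnableWithoutPatch() {
        return !failed && (completes == 0);
    }

    // Condition with Devaraj's patch applied (the "+" line of the hunk).
    boolean isRunnableWithPatch() {
        return !isOnlyCommitPending() && !failed && (completes == 0);
    }
}
```

In this sketch the two conditions differ only for a TIP that has not failed,
has no completed attempt yet, and is waiting for commit: without the patch it
still counts as runnable, with the patch it does not.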
> JobTracker becomes non-responsive if the task trackers finish task too fast
> ---------------------------------------------------------------------------
>
> Key: HADOOP-2119
> URL: https://issues.apache.org/jira/browse/HADOOP-2119
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.16.0
> Reporter: Runping Qi
> Priority: Blocker
> Fix For: 0.15.2
>
> Attachments: hadoop-2119.patch, hadoop-jobtracker-thread-dump.txt
>
>
> I ran a job with 0 reducers on a cluster with 390 nodes.
> The mappers ran very fast.
> The job tracker lagged behind on committing completed mapper tasks.
> The number of running mappers displayed on the web UI kept growing.
> The job tracker eventually stopped responding to the web UI.
> No progress was reported afterwards.
> The job tracker was running on a separate node.
> The job tracker process consumed 100% CPU, with a VM size of 1.01g (reaching
> the heap space limit).