[ https://issues.apache.org/jira/browse/HADOOP-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12549264 ]
Christian Kunz commented on HADOOP-2119:
----------------------------------------

To be precise:
To find out why reduces get stuck, unable to fetch map output from a certain number of mappers, we rolled back not only Srikanth's patch but also the trivial-looking patch suggested by Devaraj:

Index: src/java/org/apache/hadoop/mapred/TaskInProgress.java
===================================================================
--- src/java/org/apache/hadoop/mapred/TaskInProgress.java (revision 598581)
+++ src/java/org/apache/hadoop/mapred/TaskInProgress.java (working copy)
@@ -663,7 +663,7 @@
    * Return whether this TIP still needs to run
    */
   boolean isRunnable() {
-    return !failed && (completes == 0);
+    return !isOnlyCommitPending() && !failed && (completes == 0);
   }
   /**

> JobTracker becomes non-responsive if the task trackers finish task too fast
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-2119
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2119
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.0
>            Reporter: Runping Qi
>            Priority: Blocker
>             Fix For: 0.15.2
>
>         Attachments: hadoop-2119.patch, hadoop-jobtracker-thread-dump.txt
>
>
> I ran a job with 0 reducers on a cluster with 390 nodes.
> The mappers ran very fast.
> The job tracker lagged behind in committing completed mapper tasks.
> The number of running mappers displayed on the web UI kept growing.
> The job tracker eventually stopped responding to the web UI.
> No progress was reported afterwards.
> The job tracker runs on a separate node.
> The job tracker process consumed 100% CPU, with a VM size of 1.01 GB
> (reaching the heap space limit).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
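
For readers following the thread, here is a minimal sketch of how the one-line change to isRunnable() quoted in the comment alters the predicate. TipSketch and its fields are hypothetical stand-ins, not the real TaskInProgress class. The onlyCommitPending flag assumes that isOnlyCommitPending() reports a task that has finished running but whose output is not yet committed; the thread does not confirm that meaning.

```java
// Hypothetical stand-in for the state read by the predicate in the diff.
// This is NOT the real org.apache.hadoop.mapred.TaskInProgress.
final class TipSketch {
    final boolean failed;
    final int completes;
    // Assumption: true when the task has run but its commit is still outstanding.
    final boolean onlyCommitPending;

    TipSketch(boolean failed, int completes, boolean onlyCommitPending) {
        this.failed = failed;
        this.completes = completes;
        this.onlyCommitPending = onlyCommitPending;
    }

    // Predicate as on the "-" line of the diff (patch rolled back).
    boolean isRunnableWithoutPatch() {
        return !failed && (completes == 0);
    }

    // Predicate as on the "+" line of the diff (patch applied).
    boolean isRunnableWithPatch() {
        return !onlyCommitPending && !failed && (completes == 0);
    }

    public static void main(String[] args) {
        // A task that is not failed, not yet counted as complete,
        // and (by assumption) waiting only for its commit.
        TipSketch pending = new TipSketch(false, 0, true);
        System.out.println(pending.isRunnableWithoutPatch()); // prints true
        System.out.println(pending.isRunnableWithPatch());    // prints false
    }
}
```

The two variants differ only for a task in the assumed commit-pending state: without the patch it still counts as runnable, with the patch it does not. For every other combination of failed and completes the two predicates agree.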