[ https://issues.apache.org/jira/browse/HADOOP-3321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592908#action_12592908 ]
Yiping Han commented on HADOOP-3321:
------------------------------------
One thing worth mentioning: the mapper appears to be running on the same
node as the reducer.
> getMapOutput() keeps failing too many times before the tasktracker fails
> ------------------------------------------------------------------------
>
> Key: HADOOP-3321
> URL: https://issues.apache.org/jira/browse/HADOOP-3321
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.16.1
> Reporter: Yiping Han
> Priority: Critical
>
> We are running a big job on our cluster with about 400 reducers. About
> 361 reducers finished successfully, while the last batch of 39 reducers all
> failed at roughly the same time. After examining the log files, we found the
> following error 858 times for a single tasktracker:
> 2008-04-21 02:42:45,368 WARN org.apache.hadoop.mapred.TaskTracker: getMapOutput(task_200804101742_0001_m_032077_2,396) failed :
> 2008-04-21 02:42:49,468 WARN org.apache.hadoop.mapred.TaskTracker: getMapOutput(task_200804101742_0001_m_032077_2,396) failed :
> 2008-04-21 02:43:03,717 WARN org.apache.hadoop.mapred.TaskTracker: getMapOutput(task_200804101742_0001_m_032077_2,396) failed :
> Shouldn't the task tracker fail earlier instead of retrying so many times?