[ https://issues.apache.org/jira/browse/HADOOP-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582047#action_12582047 ]
Devaraj Das commented on HADOOP-2175:
-------------------------------------
Come to think of it, lostTaskTracker may not be the best way to go, since it
can affect tasks from multiple jobs. We would probably need to make
lostTaskTracker take a JobInProgress argument and call failedTask only for
tasks of that job. The other option is to implement APIs that give the TIPs
and taskIds corresponding to a job and tasktracker combination, and then
invoke failedTask in the JobInProgress for each TIP/taskId. The second
approach seems cleaner in general, but most of the necessary infrastructure
for the first approach is already there. Thoughts?
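
To make the two options easier to compare, here is a rough sketch in plain
Java. It is not the actual JobTracker or JobInProgress code: TaskAttempt,
TrackerSketch, attemptsFor and the string job/tracker ids are hypothetical
stand-ins invented for illustration, and only the names lostTaskTracker and
failedTask are taken from the comment above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one task attempt; not a real Hadoop class.
final class TaskAttempt {
    final String jobId;
    final String taskId;
    final String trackerName;
    boolean failed = false;

    TaskAttempt(String jobId, String taskId, String trackerName) {
        this.jobId = jobId;
        this.taskId = taskId;
        this.trackerName = trackerName;
    }
}

// Hypothetical bookkeeping that maps each tasktracker to the attempts it ran.
final class TrackerSketch {
    private final Map<String, List<TaskAttempt>> attemptsByTracker = new HashMap<>();

    void register(TaskAttempt attempt) {
        attemptsByTracker
            .computeIfAbsent(attempt.trackerName, k -> new ArrayList<>())
            .add(attempt);
    }

    // Stand-in for failedTask: here it only flags the attempt as failed.
    static void failedTask(TaskAttempt attempt) {
        attempt.failed = true;
    }

    // Option 1: lostTaskTracker takes a job argument and fails only the
    // attempts of that job on the given tracker. Returns how many it failed.
    int lostTaskTracker(String trackerName, String jobId) {
        int count = 0;
        List<TaskAttempt> attempts =
            attemptsByTracker.getOrDefault(trackerName, Collections.emptyList());
        for (TaskAttempt attempt : attempts) {
            if (attempt.jobId.equals(jobId) && !attempt.failed) {
                failedTask(attempt);
                count++;
            }
        }
        return count;
    }

    // Option 2: a query API that returns the attempts for a job and tracker
    // combination; the caller then invokes failedTask on each one itself.
    List<TaskAttempt> attemptsFor(String jobId, String trackerName) {
        List<TaskAttempt> result = new ArrayList<>();
        List<TaskAttempt> attempts =
            attemptsByTracker.getOrDefault(trackerName, Collections.emptyList());
        for (TaskAttempt attempt : attempts) {
            if (attempt.jobId.equals(jobId)) {
                result.add(attempt);
            }
        }
        return result;
    }
}
```

With option 2 the caller would loop over attemptsFor(jobId, trackerName) and
call failedTask on each entry, which keeps the failure logic on the job side.
In both cases attempts that other jobs ran on the same tracker are left alone.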
> Blacklisted hosts may not be able to serve map outputs
> ------------------------------------------------------
>
> Key: HADOOP-2175
> URL: https://issues.apache.org/jira/browse/HADOOP-2175
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Reporter: Runping Qi
> Assignee: Amar Kamat
> Fix For: 0.17.0
>
> Attachments: HADOOP-2175-v1.1.patch, HADOOP-2175-v1.patch,
> HADOOP-2175-v2.patch, HADOOP-2175-v2.patch
>
>
> After a node fails 4 mappers (tasks), it is added to the blacklist, so it
> will no longer accept tasks.
> But it will continue to serve the map outputs of any mappers that ran
> successfully there.
> However, the node may not be able to serve the map outputs either.
> This will cause the reducers to mark the corresponding map outputs as coming
> from slow hosts,
> but to keep trying to get the map outputs from that node.
> This may lead to waiting forever.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.