[ https://issues.apache.org/jira/browse/HADOOP-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12584419#action_12584419 ]
Arun C Murthy commented on HADOOP-3140:
---------------------------------------
{quote}
1) Task.done() method checks if the task has data to be promoted and passes
this info to the TaskTracker via the TaskTracker.done() api.
2) If there is no data to promote, the TaskTracker sets the task status as
SUCCEEDED or FAILED depending on whether the task succeeds or fails.
{quote}
+1
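
A rough sketch of the two quoted steps, only to make the control flow
concrete. Everything here is hypothetical: PromotionSketch, TrackerSketch,
State and the three-argument done() are illustration names, not the real
TaskTracker API, and routing every task that has output onto a
COMMIT_PENDING list is an assumption about the existing path.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are the real Hadoop classes.
final class PromotionSketch {
    enum State { COMMIT_PENDING, SUCCEEDED, FAILED }

    // Stand-in for the TaskTracker side of the proposed done() call.
    static final class TrackerSketch {
        // Tasks waiting for the JobTracker to promote their output.
        final List<String> commitPending = new ArrayList<>();

        // hasOutputToPromote is the flag the task computes in step 1.
        State done(String taskId, boolean succeeded, boolean hasOutputToPromote) {
            if (!hasOutputToPromote) {
                // Step 2: nothing to promote, so the final status is set directly.
                return succeeded ? State.SUCCEEDED : State.FAILED;
            }
            // Assumption: tasks with output stay on the existing promotion path.
            commitPending.add(taskId);
            return State.COMMIT_PENDING;
        }
    }

    public static void main(String[] args) {
        TrackerSketch tt = new TrackerSketch();
        System.out.println(tt.done("map_0", true, false));    // SUCCEEDED
        System.out.println(tt.done("map_1", false, false));   // FAILED
        System.out.println(tt.done("reduce_0", true, true));  // COMMIT_PENDING
        System.out.println(tt.commitPending);                 // [reduce_0]
    }
}
```

With this shape, a map task that writes nothing to DFS never enters the
pending list, so the JobTracker has nothing to promote for it.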
In addition, if feasible, we should discard the outputs of failed tasks in
the 'finally' clause of TaskTracker.Child.main. Then we could just set the
status to 'FAILED/KILLED' and remove the need to discard outputs later in a
lot of cases. We could go further and do the same in the TT too, to ensure
that the JT only needs to promote outputs of successful tasks... clearly it
needs some careful thought.
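
To illustrate the 'finally' idea, here is a minimal sketch under loud
assumptions: ChildCleanupSketch, TaskBody, runTask and deleteRecursively are
made-up names, and a local temporary directory stands in for the task's
output on DFS. It is not the actual TaskTracker.Child.main code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical sketch: a local directory stands in for task output on DFS.
final class ChildCleanupSketch {
    enum State { SUCCEEDED, FAILED }

    // Stand-in for the work a child task does against its output directory.
    interface TaskBody { void run(Path outputDir) throws Exception; }

    static State runTask(TaskBody body, Path tempOutput) {
        boolean succeeded = false;
        try {
            body.run(tempOutput);
            succeeded = true;
        } catch (Exception e) {
            // The failure is recorded by 'succeeded' staying false.
        } finally {
            // Discard partial output here so nothing downstream has to.
            if (!succeeded) {
                deleteRecursively(tempOutput);
            }
        }
        return succeeded ? State.SUCCEEDED : State.FAILED;
    }

    static void deleteRecursively(Path root) {
        if (!Files.exists(root)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(root)) {
            // Reverse order deletes children before their parent directory.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempDirectory("task-output");
        State s = runTask(dir -> {
            Files.write(dir.resolve("part-0"), new byte[] {1});
            throw new IllegalStateException("simulated task failure");
        }, out);
        System.out.println(s + ", output still present: " + Files.exists(out));
    }
}
```

Because the cleanup sits in 'finally', it also runs when the task body throws
an Error rather than an Exception; the status can then be reported as failed
with no output left behind to discard.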
> JobTracker should not try to promote a (map) task if it does not write to DFS
> at all
> ------------------------------------------------------------------------------------
>
> Key: HADOOP-3140
> URL: https://issues.apache.org/jira/browse/HADOOP-3140
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Reporter: Runping Qi
>
> In most cases, map tasks do not write to DFS.
> Thus, when they complete, they should not be put into the commit_pending
> queue at all.
> This will improve task promotion significantly.
>
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.