[ 
https://issues.apache.org/jira/browse/HADOOP-5643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12701438#action_12701438
 ] 

Amar Kamat commented on HADOOP-5643:
------------------------------------

I think calling this blacklisting will lead to more confusion. As Owen 
suggested, we can call it *decommissioning/recommissioning* of trackers, which 
would essentially mean that irrespective of what state the tracker is in, the 
jobtracker is asked to decommission (rerun + ignore) or recommission (add back) 
it. So the commands would be

_bin/hadoop jobtracker -decommission tracker1,tracker2...._ and _bin/hadoop 
jobtracker -recommission tracker1,tracker2...._. 

All the running tasks (and completed maps) that were launched on that machine 
will be killed and rerun. We can reuse the lost-tracker code for doing this. 
Maybe a thread should be started on demand (similar to the cleanup queue thread) 
for a decommissioning request. Also, these trackers will be added to the ignore 
list (i.e. a 'shutdown' is issued upon contact). So a decommission request is 
equivalent to lost-tracker + add-to-ignore-list. 

Upon a recommission, the trackers will be removed from the ignore list. This 
can be done inline.
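
The decommission/recommission semantics described above could be sketched 
roughly as follows. This is only an illustration under stated assumptions: the 
class and method names (DecommissionSketch, lostTracker, shouldShutDown, 
rerunRequests) are hypothetical and not taken from the Hadoop code base, the 
lost-tracker handling is reduced to a stand-in that just records the request, 
and skipping trackers that are already on the ignore list is a design choice 
of the sketch rather than something decided in this issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; none of these names come from the Hadoop code base.
class DecommissionSketch {
    // Trackers that should be told to shut down when they next make contact.
    private final Set<String> ignoreList = new TreeSet<>();
    // Trackers handed to the stand-in lost-tracker path (tasks killed and rerun).
    private final List<String> rerunRequests = new ArrayList<>();

    // Stand-in for the existing lost-tracker handling; here it only records the call.
    private void lostTracker(String tracker) {
        rerunRequests.add(tracker);
    }

    // decommission = lost-tracker + add-to-ignore-list
    synchronized void decommission(List<String> trackers) {
        for (String tracker : trackers) {
            // Assumption: a tracker that is already decommissioned is not processed again.
            if (ignoreList.add(tracker)) {
                lostTracker(tracker);
            }
        }
    }

    // recommission = remove from the ignore list; cheap enough to do inline.
    synchronized void recommission(List<String> trackers) {
        ignoreList.removeAll(trackers);
    }

    // True if the jobtracker should answer this tracker with a 'shutdown'.
    synchronized boolean shouldShutDown(String tracker) {
        return ignoreList.contains(tracker);
    }

    // Copy of the trackers whose tasks were sent to the lost-tracker path.
    synchronized List<String> rerunRequests() {
        return new ArrayList<>(rerunRequests);
    }
}
```

In a real jobtracker the lostTracker step would presumably run on the 
on-demand thread mentioned above, since killing and rescheduling tasks is the 
expensive part, while the ignore-list update stays inline.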

From the web UI, a checkbox can be provided against each tracker along with 
an action named 'Decommission' (similar to the actions for jobs on 
jobtracker.jsp). On the trackers page, we can provide another section for 
decommissioned trackers, with a checkbox there for recommissioning them.

Notes:
1) ACL checks should be done before decommissioning and recommissioning.
2) This info needs to be persisted. Upon every decommission/recommission, 
persist this info to system.dir/jobtracker.info.
3) Upon restart, the ignore list will also be recovered and loaded (i.e. invoke 
jobtracker.decommission(recovered-list) from the recovery manager).
4) These new APIs can be added to the TaskTrackerManager interface, as these 
really are tasktracker-level actions. 
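
Notes 2 and 3 (persist on every change, recover on restart) might look 
something like the sketch below. It is a minimal illustration, not a proposed 
patch: IgnoreListStore, persist and recover are hypothetical names, a local 
directory stands in for system.dir, plain java.nio file calls stand in for 
Hadoop's own filesystem layer, and the sketch assumes the file holds nothing 
but the ignore list, one tracker per line.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; a local directory stands in for system.dir.
class IgnoreListStore {
    private final Path infoFile;

    IgnoreListStore(Path systemDir) {
        // Assumption: the ignore list is the only content of this file.
        this.infoFile = systemDir.resolve("jobtracker.info");
    }

    // Called after every decommission/recommission: rewrite the whole list.
    void persist(Set<String> ignoreList) throws IOException {
        Files.write(infoFile, new ArrayList<>(new TreeSet<>(ignoreList)),
                StandardCharsets.UTF_8);
    }

    // Called on restart; the result would be passed to decommission(...).
    Set<String> recover() throws IOException {
        Set<String> recovered = new TreeSet<>();
        if (Files.exists(infoFile)) {
            for (String line : Files.readAllLines(infoFile, StandardCharsets.UTF_8)) {
                String tracker = line.trim();
                if (!tracker.isEmpty()) {
                    recovered.add(tracker);
                }
            }
        }
        return recovered;
    }
}
```

Rewriting the whole list on each change keeps recovery trivial; if the file 
has to carry other jobtracker state as well, the format would need a section 
or key for the ignore list instead.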
----
Thoughts?

> Ability to blacklist tasktracker
> --------------------------------
>
>                 Key: HADOOP-5643
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5643
>             Project: Hadoop Core
>          Issue Type: New Feature
>    Affects Versions: 0.20.0
>            Reporter: Rajiv Chittajallu
>            Assignee: Amar Kamat
>
> It's not always possible to shut down the tasktracker to stop scheduling tasks 
> on the node (e.g. you can't log in to the node but the TT is up). 
> This can be done via 
>   * mapred.exclude, which should be refreshed without restarting the tasktracker
>   * hadoop job -fail-tracker <tracker id>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
