Timothy Chklovski wrote:
We are interested in whether other users have developed approaches to get
machines to join (and leave) both the DFS and Tasktracker pools.
It does not seem very complicated, but we are wondering if the brute-force
approach ignores some arcana about such issues as, e.g., whether refreshes
should be called on the namenode and the jobtracker.
Brute force should be effective for tasktrackers, but one should be more
careful with datanodes to avoid data loss.
To decommission datanodes, use the dfs.hosts and dfs.hosts.exclude
configuration parameters. List the nodes to be decommissioned in the
exclude file, then run 'bin/hadoop dfsadmin -refreshNodes' to cause the
namenode to re-read these files. Finally, wait until 'bin/hadoop dfsadmin
-report' shows that the requested nodes are decommissioned before
killing their datanode processes.
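For concreteness, the steps above might look roughly like the following. The exclude-file path and the hostname are illustrative, not prescribed; the file is whatever your dfs.hosts.exclude parameter points at:

```shell
# Assumes the site configuration already contains something like
# (illustrative path):
#   <property>
#     <name>dfs.hosts.exclude</name>
#     <value>/path/to/conf/dfs.exclude</value>
#   </property>

# 1. Add the datanode to be decommissioned to the exclude file
#    (hypothetical hostname).
echo "datanode03.example.com" >> /path/to/conf/dfs.exclude

# 2. Tell the namenode to re-read dfs.hosts / dfs.hosts.exclude.
bin/hadoop dfsadmin -refreshNodes

# 3. Re-run until the node is reported as decommissioned; only then
#    is it safe to kill its datanode process.
bin/hadoop dfsadmin -report
```

Note that step 3 is what prevents data loss: the namenode finishes re-replicating the node's blocks elsewhere before marking it decommissioned.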
Doug