Timothy Chklovski wrote:
We are interested in whether other users have developed approaches to get
machines to join (and leave) both the DFS and Tasktracker pools.
It does not seem very complicated, but we are wondering if the brute-force
approach ignores some arcana about such issues as, e.g., whether refreshes
should be called on the namenode and the jobtracker.
Brute force should be effective for tasktrackers, but one should be more
careful with datanodes to avoid data loss.
To decommission datanodes, use the dfs.hosts and dfs.hosts.exclude
configuration parameters. List the nodes to be decommissioned in the
exclude file, then run 'bin/hadoop dfsadmin -refreshNodes' to cause the
namenode to re-read these files. Finally, wait until 'bin/hadoop dfsadmin
-report' shows that the requested nodes are decommissioned before
killing their datanode processes.
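For concreteness, the steps above might look roughly like the following. The exclude-file path and the hostname are illustrative, not prescribed; the file is whatever your dfs.hosts.exclude parameter points at:

```shell
# Assumes the site configuration already contains something like
# (illustrative path):
#   <property>
#     <name>dfs.hosts.exclude</name>
#     <value>/path/to/conf/dfs.exclude</value>
#   </property>

# 1. Add the datanode to be decommissioned to the exclude file
#    (hypothetical hostname).
echo "datanode03.example.com" >> /path/to/conf/dfs.exclude

# 2. Tell the namenode to re-read dfs.hosts / dfs.hosts.exclude.
bin/hadoop dfsadmin -refreshNodes

# 3. Re-run until the node is reported as decommissioned; only then
#    is it safe to kill its datanode process.
bin/hadoop dfsadmin -report
```

Note that step 3 is what prevents data loss: the namenode finishes re-replicating the node's blocks elsewhere before marking it decommissioned.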
Doug