Hello everyone,

I have a question and was hoping someone on the mailing list could offer some 
pointers. I'm working on a project with another student and for part of this 
project we are trying to create something that will allow nodes to be added and 
removed from the hadoop cluster at will.  The goal is to have the nodes run a 
program that gives the user the freedom to add or remove themselves from the 
cluster to take advantage of a workstation when the user leaves (or if they'd 
like it running anyway when they're at the PC).  This would be on computers 
running various versions of Windows.

From what we can find, hadoop does not already support this feature, but it 
does seem to support dynamically adding nodes and removing nodes in other 
ways.  For example, to add a node, one would have to make sure hadoop is set 
up on the PC along with cygwin, Java, and ssh, but after that initial setup 
it's just a matter of adding the PC to the conf/slaves file, making sure the 
node is not listed in the exclude file, and running the start datanode and 
start tasktracker commands from the node you are adding (basically described 
in FAQ item 25).  To remove a node, it seems to be just a matter of adding it 
to dfs.hosts.exclude and refreshing the list of nodes (described in hadoop FAQ 
17).
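
To make the steps above concrete, here is a rough sketch of what we have in 
mind, not a tested tool.  The helper names (read_hosts, write_hosts, add_node, 
remove_node) are our own, the file paths are whatever the caller passes in, and 
the exclude file is assumed to be the one named by dfs.hosts.exclude.  The 
commands are only returned as lists, not executed, and their script paths 
assume the bin/ layout of the hadoop install we are using.

```python
from pathlib import Path


def read_hosts(path):
    """Return the hostnames in a one-host-per-line file (empty list if missing)."""
    p = Path(path)
    if not p.exists():
        return []
    return [line.strip() for line in p.read_text().splitlines() if line.strip()]


def write_hosts(path, hosts):
    """Overwrite the file with one hostname per line."""
    Path(path).write_text("".join(h + "\n" for h in hosts))


def add_node(host, slaves_file, exclude_file):
    """List the host in the slaves file and drop it from the exclude file.

    Returns the commands that would then be run on the node being added.
    """
    slaves = read_hosts(slaves_file)
    if host not in slaves:
        write_hosts(slaves_file, slaves + [host])
    write_hosts(exclude_file, [h for h in read_hosts(exclude_file) if h != host])
    # Assumed script locations, relative to the hadoop install directory.
    return [
        ["bin/hadoop-daemon.sh", "start", "datanode"],
        ["bin/hadoop-daemon.sh", "start", "tasktracker"],
    ]


def remove_node(host, exclude_file):
    """Add the host to the exclude file.

    Returns the command that would then be run to refresh the node list.
    """
    excluded = read_hosts(exclude_file)
    if host not in excluded:
        write_hosts(exclude_file, excluded + [host])
    return [["bin/hadoop", "dfsadmin", "-refreshNodes"]]
```

The program on each workstation would call add_node when the user opts in and 
remove_node when they opt out, then run the returned commands (for example with 
subprocess.run) on the appropriate machine.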

Our question is whether or not a simple interface for this already exists, and 
whether or not anyone sees any potential flaws with how we are planning to 
accomplish these tasks.  From our research we were not able to find anything 
that already exists for this purpose, but we find it surprising that an 
interface for this would not already exist.  We welcome any comments, 
recommendations, and insights anyone might have for accomplishing this task.

Thank you,
Alyssa Hargraves
Patrick Crane
WPI Class of 2009
