Hello,

We have been experimenting with Hadoop on a largish but shared cluster. That means we can allocate various nodes, but we would also like to release nodes for other users, so we never hold a node permanently (a situation somewhat like EC2). We are interested in whether other users have developed approaches for having machines join (and leave) both the DFS and TaskTracker pools.
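For concreteness, the brute-force join we have in mind is just starting the daemons on the new node and letting them register themselves. A minimal sketch, assuming the node already has the Hadoop install and a conf/ directory pointing at the right namenode and jobtracker:

    # on the joining node: start the datanode and tasktracker daemons;
    # they register with the namenode/jobtracker named in the node's conf
    $HADOOP_HOME/bin/hadoop-daemon.sh start datanode
    $HADOOP_HOME/bin/hadoop-daemon.sh start tasktracker

For leaving, the closest mechanism we have found is HDFS decommissioning via the exclude file. A sketch, assuming dfs.hosts.exclude is already set in hadoop-site.xml (the hostname and exclude-file path below are placeholders):

    # on the namenode: mark the departing node for decommission so the
    # namenode re-replicates its blocks before the node goes away
    echo leaving-node.example.com >> /path/to/excludes
    $HADOOP_HOME/bin/hadoop dfsadmin -refreshNodes

    # once the node shows as "Decommissioned" in the namenode web UI,
    # stop the daemons on the node itself
    $HADOOP_HOME/bin/hadoop-daemon.sh stop tasktracker
    $HADOOP_HOME/bin/hadoop-daemon.sh stop datanode

We are not sure whether there is an analogous graceful drain for the TaskTracker pool, which is part of the question below.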
It does not seem very complicated, but we are wondering whether the brute-force approach ignores some arcana, e.g., whether refreshes should be called on the namenode and the jobtracker. Also, if we know a node will leave the pool, is there something we can tell the namenode and the jobtracker in advance to make the departure less disruptive (e.g., stop accepting new large jobs, or even go into safe mode)?

If people have developed approaches to automating how machines join and leave the pools, we'd love to know.

Furthermore, if it makes sense, please consider this a feature request: automate/wrap this in scripts that can ship with a Hadoop distribution (or, if everything already works, extend the documentation on how to accomplish this correctly).

Thanks much for Hadoop & continued work on it!

-- Tim

--
Timothy Chklovski
Senior Research Scientist
USC Information Sciences Institute
[EMAIL PROTECTED]
310.448.8763
