While configuring and using the Hadoop framework, it seems that a DNS
server must be used for hostname resolution (even if I configure IP
addresses rather than hostnames in the config/slaves and config/masters
files). Because we don't have a DNS server on our local Ethernet, I
have to add the hostname-to-IP mappings to the /etc/hosts file.
Yeah... annoying, isn't it? :)
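For illustration only, here is a minimal Python sketch of how those /etc/hosts lines could be generated from a single mapping instead of being typed by hand on every node. The hostnames and addresses are made up, and render_hosts_entries is a hypothetical helper, not anything that ships with Hadoop.

```python
def render_hosts_entries(nodes):
    """Return /etc/hosts lines ("IP<TAB>hostname") for a dict of hostname -> IP."""
    lines = []
    # Sort by hostname so the output is stable between runs.
    for hostname, ip in sorted(nodes.items()):
        lines.append(f"{ip}\t{hostname}")
    return "\n".join(lines) + "\n"


# Hypothetical cluster nodes; replace with the real names and addresses.
NODES = {
    "hadoop-master": "192.168.0.10",
    "hadoop-slave1": "192.168.0.11",
    "hadoop-slave2": "192.168.0.12",
}

if __name__ == "__main__":
    # Print the entries; appending them to /etc/hosts is left to the admin.
    print(render_hosts_entries(NODES), end="")
```

The output would still have to be appended to /etc/hosts on each machine, so this only removes the typing, not the distribution step.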
I have two questions about the hostname configuration:
1) Can we configure Hadoop to avoid hostname resolution and use IP
addresses directly?
We tried, failed, and gave up. That said, that was quite some time ago
(0.13?). I know some fixes went in, but...
2) If I add a new machine to the cluster, it seems that I have to add
the new machine's hostname or IP address to each node's config/slaves
file. If the cluster is very large, this could become impossible to
maintain. Is there a simple way to add a node dynamically without
modifying all the other cluster nodes?
Good question! Would love to see a somewhat more dynamic discovery as
well. That said, for a big cluster you will probably have central
configuration management anyway. So for us it's just changing one file,
and Puppet rolls it out to the nodes.
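As a rough sketch of the "change one file" step (this is not our actual Puppet setup; add_slave and the node names are hypothetical), the central copy of the slaves file, which lists one host per line, could be updated like this before the configuration management tool distributes it:

```python
def add_slave(slaves_text, new_host):
    """Return the slaves file content with new_host appended if it is missing."""
    # The slaves file lists one hostname or IP address per line.
    hosts = [line.strip() for line in slaves_text.splitlines() if line.strip()]
    if new_host not in hosts:
        hosts.append(new_host)
    return "\n".join(hosts) + "\n"


if __name__ == "__main__":
    # Hypothetical current content of the centrally managed slaves file.
    current = "hadoop-slave1\nhadoop-slave2\n"
    print(add_slave(current, "hadoop-slave3"), end="")
```

Adding a host that is already listed leaves the content unchanged, so the step can be rerun safely.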
cheers
--
Torsten