While configuring and using the Hadoop framework, it seems that a DNS server must be used for hostname resolution (even if I configure IP addresses rather than hostnames in the config/slaves and config/masters files). Because we don't have a DNS server on our local Ethernet, I have to add the hostname-to-IP mappings to the /etc/hosts file.

Yeah... annoying, isn't it? :)

I have two questions about the hostname configuration:
1) Can we configure Hadoop to avoid hostname resolution and use IP addresses directly?

We tried, failed, and gave up. That said, it was quite some time ago (0.13?).

I know some fixes went in but...

2) If I add a new machine to the cluster, it seems that I have to add the new machine's hostname or IP address to each node's config/slaves file. If the cluster is very large, this could become impossible to maintain. Is there a simple way to add a node dynamically without modifying all the other cluster nodes?

Good question! I would love to see somewhat more dynamic discovery as well.

That said, for a big cluster you will probably have central configuration management anyway. So for us it's just a matter of changing one file, and Puppet rolls it out to the nodes.
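
To make the "one file" idea a bit more concrete, here is a rough Python sketch. It is not our actual Puppet setup, just an illustration: the node names, the addresses, and the convention that worker hostnames start with "slave" are all invented for the example. It derives both the /etc/hosts entries and the slaves file contents from a single inventory, so adding a node means editing one place and redistributing the generated files.

```python
# Hypothetical inventory: the single place you would edit by hand.
# Hostnames and addresses are made-up examples, not a real cluster.
NODES = {
    "master01": "192.168.0.10",
    "slave01": "192.168.0.11",
    "slave02": "192.168.0.12",
}


def render_hosts(nodes):
    """Return /etc/hosts-style lines: one 'IP<TAB>hostname' pair per node."""
    lines = [f"{ip}\t{name}" for name, ip in sorted(nodes.items())]
    return "\n".join(lines) + "\n"


def render_slaves(nodes, prefix="slave"):
    """Return slaves-file contents: one worker hostname per line.

    Assumes (for this sketch only) that workers are identified by a
    hostname prefix.
    """
    workers = sorted(name for name in nodes if name.startswith(prefix))
    return "\n".join(workers) + "\n"


if __name__ == "__main__":
    print(render_hosts(NODES), end="")
    print(render_slaves(NODES), end="")
```

How the generated text reaches the nodes (Puppet, rsync, whatever you already use) is a separate question; the point is only that both files come from the same source.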

cheers
--
Torsten
