Make EC2 cluster nodes more independent of each other
-----------------------------------------------------

                 Key: HADOOP-2410
                 URL: https://issues.apache.org/jira/browse/HADOOP-2410
             Project: Hadoop
          Issue Type: Improvement
            Reporter: Tom White


The cluster startup scripts currently wait for every node to start before 
appointing a master (to run the namenode and jobtracker on), copying 
private keys to all the nodes, and writing the private IP address of the master 
to the hadoop-site.xml file (which is then copied to the slaves via rsync). 
Only once all of this is done is Hadoop started on the cluster (from the master). 
This can fail if any of the nodes fails to come up, which can happen because EC2 
doesn't guarantee that you get a cluster of the size you ask for (I've seen 
this happen).

The process would be more robust if each node were told the address of the 
master as user metadata and then started its own daemons. This is complicated 
by the fact that the public DNS alias of the master resolves to a public IP 
address, so it cannot be used by EC2 nodes (see 
http://docs.amazonwebservices.com/AWSEC2/2007-08-29/DeveloperGuide/instance-addressing.html).
Instead we need to use a trick 
(http://developer.amazonwebservices.com/connect/message.jspa?messageID=71126#71126)
to find the private IP. What's more, we need to attempt to resolve the 
private IP in a loop until it is available, since the DNS will only be set up 
after the master has started.
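
To illustrate the resolve-in-a-loop step, here is a minimal sketch in Python. 
It is not taken from the existing scripts: the function name 
wait_for_master_ip, its parameters, and the default retry interval are all 
hypothetical, and it assumes the node already knows a hostname for the master 
(for example, one passed in as user metadata) that will resolve to the private 
IP once the master is up.

```python
import socket
import time


def wait_for_master_ip(hostname, resolve=socket.gethostbyname,
                       interval=5.0, max_attempts=60, sleep=time.sleep):
    """Retry resolution of the master's hostname until it succeeds.

    Hypothetical helper: `resolve` and `sleep` are injectable so the
    loop can be exercised without real DNS or real delays.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            # socket.gethostbyname returns an IPv4 address string.
            return resolve(hostname)
        except OSError:
            # socket.gaierror (name not resolvable yet) is an OSError.
            if attempt == max_attempts:
                raise  # give up and surface the last failure
            sleep(interval)
```

A node could call wait_for_master_ip(master_hostname) before starting its own 
daemons. Capping the number of attempts means a node whose master never comes 
up fails with an error rather than waiting forever.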

This change will also mean the private key doesn't need to be copied to each 
node, which can be slow and is of dubious security. Configuration can be handled 
using the mechanism described in HADOOP-2409.

