[ https://issues.apache.org/jira/browse/HADOOP-1638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12514425 ]

Tom White commented on HADOOP-1638:
-----------------------------------

This problem was caused by a change in Amazon EC2 addressing: previously, 
instances were directly addressed (given a single routable IP address), and 
now they are NAT-addressed (by default, for later tool versions). The key 
point is that NAT-addressed instances can't reach other NAT-addressed 
instances using the public address. Direct addressing is going to be phased 
out. See 
http://developer.amazonwebservices.com/connect/entry.jspa?externalID=682&categoryID=100
for more details.
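
As a rough illustration (not part of the attached patch), the Java sketch 
below checks whether a host name or IP resolves to an address assigned to a 
local network interface. A server socket can only bind to such an address 
(or to the wildcard address); anything else fails with "Cannot assign 
requested address". On a NAT-addressed instance the public address is not 
assigned to any local interface, which is presumably why the bind fails. 
The class and method names here are made up for the example.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.NetworkInterface;

public class LocalAddressCheck {

    /**
     * Returns true if hostOrIp resolves to the wildcard address or to an
     * address assigned to a network interface on this machine. Lookup
     * failures are treated as "not local".
     */
    public static boolean isLocalAddress(String hostOrIp) {
        try {
            InetAddress addr = InetAddress.getByName(hostOrIp);
            // getByInetAddress returns null when no local interface has addr.
            return addr.isAnyLocalAddress()
                || NetworkInterface.getByInetAddress(addr) != null;
        } catch (IOException e) {
            // Covers UnknownHostException and SocketException.
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical usage: pass the host name from fs.default.name.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        System.out.println(host + (isLocalAddress(host)
            ? " is local, so a bind should be possible"
            : " is not local, so a bind would fail"));
    }
}
```

Running this on the master with the host name from the generated 
hadoop-site.xml should show whether that name resolves to an address the 
namenode is able to bind to.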

Tool versions ec2-api-tools-1.2-9739 and later use NAT addressing. I have 
been using ec2-api-tools-1.2-7546 (although I thought I had been using 
ec2-api-tools-1.2-9739), which still uses direct addressing.

I don't think HADOOP-1202 will make this a non-issue, since EC2 NAT instances 
cannot route to the public address of other instances. So even if the namenode 
and job tracker could bind to the public address, that would not be much help 
to the slaves, since they have to connect to the internal address. This patch 
would therefore still be needed.

Stu, I agree that it would be nice to fix this problem more thoroughly, but 
until we have a better solution I think this approach is fine.

I've tested with the last three versions of ec2-api-tools and have successfully 
run the grep example on small multi-node clusters. When NAT addressing is used, 
however, the web servers on datanodes and task trackers are not accessible, 
since non-routable addresses are used. Apart from this limitation (which can be 
worked around by logging in to the relevant machine to browse logs), jobs ran OK.

So I vote to commit this (along with HADOOP-1635 and HADOOP-1634). I'll have 
some time to do this tomorrow.

> Master node unable to bind to DNS hostname
> ------------------------------------------
>
>                 Key: HADOOP-1638
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1638
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/ec2
>    Affects Versions: 0.13.0, 0.13.1, 0.14.0, 0.15.0
>            Reporter: Stu Hood
>            Priority: Minor
>             Fix For: 0.13.1, 0.14.0, 0.15.0
>
>         Attachments: hadoop-1638.patch
>
>
> With a release package of Hadoop 0.13.0 or with latest SVN, the Hadoop 
> contrib/ec2 scripts fail to start Hadoop correctly. After working around 
> issues HADOOP-1634 and HADOOP-1635, and setting up a DynDNS address pointing 
> to the master's IP, the ec2/bin/start-hadoop script completes.
> But the cluster is unusable because the namenode and tasktracker have not 
> started successfully. Looking at the namenode log on the master reveals the 
> following error:
> {quote}
> 2007-07-19 16:54:53,156 ERROR org.apache.hadoop.dfs.NameNode: java.net.BindException: Cannot assign requested address
>         at sun.nio.ch.Net.bind(Native Method)
>         at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>         at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:186)
>         at org.apache.hadoop.ipc.Server.<init>(Server.java:631)
>         at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:325)
>         at org.apache.hadoop.ipc.RPC.getServer(RPC.java:295)
>         at org.apache.hadoop.dfs.NameNode.init(NameNode.java:164)
>         at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:211)
>         at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:803)
>         at org.apache.hadoop.dfs.NameNode.main(NameNode.java:811)
> {quote}
> The master node refuses to bind to the DynDNS hostname in the generated 
> hadoop-site.xml. Here is the relevant part of the generated file:
> {quote}
> <property>
>   <name>fs.default.name</name>
>   <value>blah-ec2.gotdns.org:50001</value>
> </property>
> <property>
>   <name>mapred.job.tracker</name>
>   <value>blah-ec2.gotdns.org:50002</value>
> </property>
> {quote}
> I'll attach a patch against hadoop-trunk that fixes the issue for me, but I'm 
> not sure if this issue is something that someone can fix more thoroughly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
