I still need to go through the whole thread, but we feel your pain.
First, please try setting fs.default.name to the NameNode's internal IP
on the datanodes. This should make the NN associate the internal IPs
with the datanodes (assuming your routing is correct). The NameNode web
UI should then list internal IPs for the datanodes. You might have to
temporarily change the NameNode code to listen on 0.0.0.0.
That said, the issues you are facing are pretty unfortunate. As Steve
mentioned, Hadoop is all confused about hostname/IP, and there is
unnecessary reliance on hostnames and reverse DNS lookups in many, many
places.
At the very least, fairly straightforward setups with multiple NICs
should be handled well.
dfs.datanode.dns.interface should work the way you expected (though I am
not very surprised that it didn't).
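
While debugging this, it may help to check which IPv4 addresses the JVM
itself reports for a given interface name. The following is only a rough
diagnostic sketch using the standard java.net API; it is not Hadoop's own
lookup code, and the class and method names (InterfaceAddresses,
ipv4AddressesOf) are made up for illustration. The default name "en0" is
the adapter mentioned in this thread.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Diagnostic sketch only; this is not Hadoop's own lookup code.
final class InterfaceAddresses {

    // Returns the IPv4 addresses bound to the named interface, or an
    // empty list if the interface does not exist or cannot be queried.
    static List<String> ipv4AddressesOf(String interfaceName) {
        List<String> result = new ArrayList<>();
        try {
            NetworkInterface nic = NetworkInterface.getByName(interfaceName);
            if (nic == null) {
                return result; // no interface with that name
            }
            for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                if (addr instanceof Inet4Address) {
                    result.add(addr.getHostAddress());
                }
            }
        } catch (SocketException e) {
            // Treat an unreadable interface the same as a missing one.
        }
        return result;
    }

    public static void main(String[] args) {
        // Pass another interface name as the first argument if needed.
        String name = args.length > 0 ? args[0] : "en0";
        System.out.println(name + " -> " + ipv4AddressesOf(name));
    }
}
```

If the list printed for the internal adapter does not contain the
internal address you expect, the problem is below Hadoop (interface
naming or OS configuration) rather than in the Hadoop settings.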
Another thing you could try is setting dfs.datanode.address to the
internal IP address (this might already have been discussed in the
thread). This should at least get all the bulk data transfers to happen
over the internal NICs. One way to make sure is to hover over the
datanode on the NameNode web UI; it shows the IP address.
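
At the socket level, the difference between leaving an address at
0.0.0.0 and setting it to one specific IP is roughly the following. This
is a minimal sketch with plain java.net sockets, not Hadoop code; the
class and method names (BindAddressDemo, listensOnAllInterfaces) are
hypothetical, and 127.0.0.1 only stands in for a specific address such
as your internal IP.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Sketch only: contrasts a wildcard bind with a bind to one address.
final class BindAddressDemo {

    // Opens a listener on an ephemeral port bound to the given host and
    // reports whether it is listening on the wildcard address, that is,
    // on every local interface.
    static boolean listensOnAllInterfaces(String host) {
        try (ServerSocket server =
                 new ServerSocket(0, 50, InetAddress.getByName(host))) {
            return server.getInetAddress().isAnyLocalAddress();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("0.0.0.0   -> all interfaces? "
                + listensOnAllInterfaces("0.0.0.0"));
        System.out.println("127.0.0.1 -> all interfaces? "
                + listensOnAllInterfaces("127.0.0.1"));
    }
}
```

A listener bound to a single address only accepts connections that
arrive on that address, which is why binding to the internal IP should
keep traffic off the external NIC.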
Good luck.
It might be better to document your pains and findings in a Jira (with
most of the details in one or more comments rather than in the
description).
Raghu.
John Martyniak wrote:
So I changed all of the 0.0.0.0 on one machine to point to the
192.168.1.102 address.
And still it picks up the hostname and ip address of the external network.
I am kind of at my wits' end with this, as I am not seeing a solution
yet, except to take the machines off of the external network and leave
them on the internal network, which isn't an option.
Has anybody had this problem before? What was the solution?
-John
On Jun 9, 2009, at 10:17 AM, Steve Loughran wrote:
One thing to consider is that some of the various Hadoop services are
bound to 0.0.0.0, which means every IPv4 address. You really want to
bring up everything, including the Jetty services, on the en0 network
adapter, by binding them to 192.168.1.102. This will cause anyone
trying to talk to them over the other network to fail, which at least
finds the problem sooner rather than later.