The address of the JobTracker (NameNode) is specified using *mapred.job.tracker* (*fs.default.name*) in the configuration. When the JobTracker (NameNode) starts, it will listen on the address specified by *mapred.job.tracker* (*fs.default.name*); and when a TaskTracker (DataNode) starts, it will talk to the address specified by *mapred.job.tracker* (*fs.default.name*) through RPC. So there is no confusion (about the communication between TaskTracker and JobTracker, as well as between DataNode and NameNode) even for multi-homed nodes, as long as those two addresses are correctly specified.
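To make that concrete, here is a minimal sketch (not from the original mail; the class name and the master.example.com:9001 value are made up) showing that the JobTracker and every TaskTracker read the same key, so they agree on one address even when the master has several NICs:

// Sketch only: one configuration key pins the JobTracker address for
// both sides. "master.example.com:9001" is a made-up example value;
// hadoop-site.xml would normally supply it.
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.net.NetUtils;

public class JobTrackerAddressCheck {
  public static void main(String[] args) {
    Configuration conf = new Configuration();  // loads hadoop-site.xml from the classpath
    conf.set("mapred.job.tracker", "master.example.com:9001");

    // The JobTracker binds to this address; every TaskTracker connects to it.
    InetSocketAddress addr =
        NetUtils.createSocketAddr(conf.get("mapred.job.tracker"));
    System.out.println(addr);
  }
}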
On the other hand, when a TaskTracker (DataNode) starts, it will also listen on its own service addresses, which are usually specified in the configuration as *0.0.0.0* (e.g., *mapred.task.tracker.http.address* and *dfs.datanode.address*); that is, it will accept connections from all the NICs in the node. In addition, the TaskTracker (DataNode) will regularly send status messages to the JobTracker (NameNode), and those messages contain its hostname. Consequently, when a Map or Reduce task obtains the addresses of the TaskTrackers (DataNodes) from the JobTracker (NameNode), e.g., for copying the Map output or reading an HDFS block, it will get the hostnames specified in the status messages and talk to the TaskTrackers (DataNodes) using those hostnames. The hostname put in the status messages is determined something like below (as of Hadoop 0.19.1), which can be a little tricky for multi-homed nodes.

String hostname = conf.get("slave.host.name");
if (hostname == null) {
  String strInterface = conf.get("mapred.tasktracker.dns.interface", "default");
  String nameserver = conf.get("mapred.tasktracker.dns.nameserver", "default");
  if (strInterface.equals("default")) {
    hostname = InetAddress.getLocalHost().getCanonicalHostName();
  } else {
    // reverse-resolve each IP on the named interface
    String[] ips = getIPs(strInterface);
    Vector<String> hosts = new Vector<String>();
    for (int i = 0; i < ips.length; i++) {
      hosts.add(reverseDns(InetAddress.getByName(ips[i]), nameserver));
    }
    if (hosts.size() == 0) {
      hostname = InetAddress.getLocalHost().getCanonicalHostName();
    } else {
      hostname = hosts.get(0);  // the first reverse-DNS result wins
    }
  }
}

I think the easiest way for multiple NICs is probably to start each TaskTracker (DataNode) with an appropriate *slave.host.name* specified on its command line, which can be done in bin/slave.sh.

On Thu, Jun 11, 2009 at 11:35 AM, John Martyniak <j...@beforedawnsolutions.com> wrote:

> So it turns out the reason that I was getting the duey.local. was because
> that is what was in the reverse DNS on the nameserver from a previous test.
> So that is fixed, and now the machine says duey.local.xxx.com.
>
> The only remaining issue is the trailing "." (period) that is required by
> DNS to make the name fully qualified.
>
> So I am not sure if this is a bug in how Hadoop uses this information or
> some other issue.
>
> If anybody has run across this issue before, any help would be greatly
> appreciated.
>
> Thank you,
>
> -John
>
> On Jun 10, 2009, at 9:21 PM, Matt Massie wrote:
>
>> If you look at the documentation for the getCanonicalHostName() function
>> (thanks, Steve)...
>>
>> http://java.sun.com/javase/6/docs/api/java/net/InetAddress.html#getCanonicalHostName()
>>
>> you'll see two Java security properties (networkaddress.cache.ttl and
>> networkaddress.cache.negative.ttl).
>>
>> You might take a look at your /etc/nsswitch.conf configuration as well to
>> learn how hosts are resolved on your machine, e.g.:
>>
>> $ grep hosts /etc/nsswitch.conf
>> hosts: files dns
>>
>> and lastly, you may want to check if you are running nscd (the NameService
>> cache daemon). If you are, take a look at /etc/nscd.conf for the caching
>> policy it's using.
>>
>> Good luck.
>>
>> -Matt
>>
>> On Jun 10, 2009, at 1:09 PM, John Martyniak wrote:
>>
>>> That is what I thought also, that it needs to keep that information
>>> somewhere, because it needs to be able to communicate with all of the
>>> servers.
>>>
>>> So I deleted the /tmp/had* and /tmp/hs* directories, removed the log
>>> files, and grepped for the duey name in all files in config. And the
>>> problem still exists.
>>> Originally I thought that it might have had something to do with
>>> multiple entries in the .ssh/authorized_keys file, but I removed
>>> everything there. And the problem still existed.
>>>
>>> So I think that I am going to grab a new install of hadoop 0.19.1,
>>> delete the existing one, and start out fresh to see if that changes
>>> anything.
>>>
>>> Wish me luck :)
>>>
>>> -John
>>>
>>> On Jun 10, 2009, at 12:30 PM, Steve Loughran wrote:
>>>
>>>> John Martyniak wrote:
>>>>
>>>>> Does hadoop "cache" the server names anywhere? Because I changed to
>>>>> using DNS for name resolution, but when I go to the nodes view, it is
>>>>> trying to view with the old name. And I changed the hadoop-site.xml
>>>>> file so that it no longer has any of those values.
>>>>
>>>> In SVN head, we try and get Java to tell us what is going on:
>>>>
>>>> http://svn.apache.org/viewvc/hadoop/core/trunk/src/core/org/apache/hadoop/net/DNS.java
>>>>
>>>> This uses InetAddress.getLocalHost().getCanonicalHostName() to get the
>>>> value, which is cached for the life of the process. I don't know of
>>>> anything else, but wouldn't be surprised; the Namenode has to remember
>>>> the machines where stuff was stored.
>>>
>>> John Martyniak
>>> President/CEO
>>> Before Dawn Solutions, Inc.
>>> 9457 S. University Blvd #266
>>> Highlands Ranch, CO 80126
>>> o: 877-499-1562
>>> c: 303-522-1756
>>> e: j...@beforedawnsoutions.com
>>> w: http://www.beforedawnsolutions.com
>
> John Martyniak
> President/CEO
> Before Dawn Solutions, Inc.
> 9457 S. University Blvd #266
> Highlands Ranch, CO 80126
> o: 877-499-1562
> c: 303-522-1756
> e: j...@beforedawnsoutions.com
> w: http://www.beforedawnsolutions.com
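For anyone chasing the same caching question on a node, here is a small sketch (standard JDK only; the class name is made up, and it is not part of the original thread) that prints the name the "default" lookup path would pick plus the two JVM DNS-cache properties Matt mentioned:

// Sketch only: show what the JVM resolves and how it caches DNS answers.
import java.net.InetAddress;
import java.security.Security;

public class DnsCheck {
  public static void main(String[] args) throws Exception {
    // The same call DNS.java falls back to when dns.interface is "default".
    System.out.println("canonical hostname: "
        + InetAddress.getLocalHost().getCanonicalHostName());

    // null means the JVM default is in effect (cache forever when a security
    // manager is installed, otherwise a short implementation-specific TTL).
    System.out.println("networkaddress.cache.ttl = "
        + Security.getProperty("networkaddress.cache.ttl"));
    System.out.println("networkaddress.cache.negative.ttl = "
        + Security.getProperty("networkaddress.cache.negative.ttl"));
  }
}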