The address of the JobTracker (NameNode) is specified using *mapred.job.tracker* (*fs.default.name*) in the configuration. When the JobTracker (NameNode) starts, it will listen on the address specified by *mapred.job.tracker* (*fs.default.name*); and when a TaskTracker (DataNode) starts, it will talk to the address specified by *mapred.job.tracker* (*fs.default.name*) through RPC. So there is no confusion (about the communication between TaskTracker and JobTracker, as well as between DataNode and NameNode) even for multi-homed nodes, as long as those two addresses are correctly specified.
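To make that concrete, here is a minimal sketch (not from the original mail; the class name and the master.example.com:9001 value are made up) showing that the JobTracker and every TaskTracker read the same key, so they agree on one address even when the master has several NICs:

// Sketch only: one configuration key pins the JobTracker address for
// both sides. "master.example.com:9001" is a made-up example value;
// hadoop-site.xml would normally supply it.
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.net.NetUtils;

public class JobTrackerAddressCheck {
  public static void main(String[] args) {
    Configuration conf = new Configuration();  // loads hadoop-site.xml from the classpath
    conf.set("mapred.job.tracker", "master.example.com:9001");

    // The JobTracker binds to this address; every TaskTracker connects to it.
    InetSocketAddress addr =
        NetUtils.createSocketAddr(conf.get("mapred.job.tracker"));
    System.out.println(addr);
  }
}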
On the other hand, when a TaskTracker (DataNode) starts, it will also listen on its own service addresses, which are usually specified in the configuration as *0.0.0.0* (e.g., *mapred.task.tracker.http.address* and *dfs.datanode.address*); that is, it will accept connections from all the NICs in the node. In addition, the TaskTracker (DataNode) will regularly send status messages to the JobTracker (NameNode), and those messages contain its hostname. Consequently, when a Map or Reduce task obtains the addresses of the TaskTrackers (DataNodes) from the JobTracker (NameNode), e.g., for copying the Map output or reading an HDFS block, it will get the hostnames specified in the status messages and talk to the TaskTrackers (DataNodes) using those hostnames. The hostname put in the status messages is determined something like below (as of Hadoop 0.19.1), which can be a little tricky for multi-homed nodes.

String hostname = conf.get("slave.host.name");
if (hostname == null) {
  String strInterface = conf.get("mapred.tasktracker.dns.interface", "default");
  String nameserver = conf.get("mapred.tasktracker.dns.nameserver", "default");
  if (strInterface.equals("default")) {
    hostname = InetAddress.getLocalHost().getCanonicalHostName();
  } else {
    // reverse-resolve each IP on the named interface
    String[] ips = getIPs(strInterface);
    Vector<String> hosts = new Vector<String>();
    for (int i = 0; i < ips.length; i++) {
      hosts.add(reverseDns(InetAddress.getByName(ips[i]), nameserver));
    }
    if (hosts.size() == 0) {
      hostname = InetAddress.getLocalHost().getCanonicalHostName();
    } else {
      hostname = hosts.get(0);  // the first reverse-DNS result wins
    }
  }
}

I think the easiest way for multiple NICs is probably to start each TaskTracker (DataNode) with an appropriate *slave.host.name* specified on its command line, which can be done in bin/slave.sh.

On Thu, Jun 11, 2009 at 11:35 AM, John Martyniak <j...@beforedawnsolutions.com> wrote:

> So it turns out the reason that I was getting the duey.local. was because
> that is what was in the reverse DNS on the nameserver from a previous test.
> So that is fixed, and now the machine says duey.local.xxx.com.
>
> The only remaining issue is the trailing "." (period) that is required by
> DNS to make the name fully qualified.
>
> So I am not sure if this is a bug in how Hadoop uses this information or
> some other issue.
>
> If anybody has run across this issue before, any help would be greatly
> appreciated.
>
> Thank you,
>
> -John
>
> On Jun 10, 2009, at 9:21 PM, Matt Massie wrote:
>
>> If you look at the documentation for the getCanonicalHostName() function
>> (thanks, Steve)...
>>
>> http://java.sun.com/javase/6/docs/api/java/net/InetAddress.html#getCanonicalHostName()
>>
>> you'll see two Java security properties (networkaddress.cache.ttl and
>> networkaddress.cache.negative.ttl).
>>
>> You might take a look at your /etc/nsswitch.conf configuration as well to
>> learn how hosts are resolved on your machine, e.g.:
>>
>> $ grep hosts /etc/nsswitch.conf
>> hosts: files dns
>>
>> and lastly, you may want to check if you are running nscd (the NameService
>> cache daemon). If you are, take a look at /etc/nscd.conf for the caching
>> policy it's using.
>>
>> Good luck.
>>
>> -Matt
>>
>> On Jun 10, 2009, at 1:09 PM, John Martyniak wrote:
>>
>>> That is what I thought also, that it needs to keep that information
>>> somewhere, because it needs to be able to communicate with all of the
>>> servers.
>>>
>>> So I deleted the /tmp/had* and /tmp/hs* directories, removed the log
>>> files, and grepped for the duey name in all files in config. And the
>>> problem still exists.
>>> Originally I thought that it might have had something to do with
>>> multiple entries in the .ssh/authorized_keys file, but I removed
>>> everything there. And the problem still existed.
>>>
>>> So I think that I am going to grab a new install of hadoop 0.19.1,
>>> delete the existing one, and start out fresh to see if that changes
>>> anything.
>>>
>>> Wish me luck :)
>>>
>>> -John
>>>
>>> On Jun 10, 2009, at 12:30 PM, Steve Loughran wrote:
>>>
>>>> John Martyniak wrote:
>>>>
>>>>> Does hadoop "cache" the server names anywhere? Because I changed to
>>>>> using DNS for name resolution, but when I go to the nodes view, it is
>>>>> trying to view with the old name. And I changed the hadoop-site.xml
>>>>> file so that it no longer has any of those values.
>>>>
>>>> In SVN head, we try and get Java to tell us what is going on:
>>>>
>>>> http://svn.apache.org/viewvc/hadoop/core/trunk/src/core/org/apache/hadoop/net/DNS.java
>>>>
>>>> This uses InetAddress.getLocalHost().getCanonicalHostName() to get the
>>>> value, which is cached for the life of the process. I don't know of
>>>> anything else, but wouldn't be surprised; the Namenode has to remember
>>>> the machines where stuff was stored.
>>>
>>> John Martyniak
>>> President/CEO
>>> Before Dawn Solutions, Inc.
>>> 9457 S. University Blvd #266
>>> Highlands Ranch, CO 80126
>>> o: 877-499-1562
>>> c: 303-522-1756
>>> e: j...@beforedawnsoutions.com
>>> w: http://www.beforedawnsolutions.com
>
> John Martyniak
> President/CEO
> Before Dawn Solutions, Inc.
> 9457 S. University Blvd #266
> Highlands Ranch, CO 80126
> o: 877-499-1562
> c: 303-522-1756
> e: j...@beforedawnsoutions.com
> w: http://www.beforedawnsolutions.com
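For anyone chasing the same caching question on a node, here is a small sketch (standard JDK only; the class name is made up, and it is not part of the original thread) that prints the name the "default" lookup path would pick plus the two JVM DNS-cache properties Matt mentioned:

// Sketch only: show what the JVM resolves and how it caches DNS answers.
import java.net.InetAddress;
import java.security.Security;

public class DnsCheck {
  public static void main(String[] args) throws Exception {
    // The same call DNS.java falls back to when dns.interface is "default".
    System.out.println("canonical hostname: "
        + InetAddress.getLocalHost().getCanonicalHostName());

    // null means the JVM default is in effect (cache forever when a security
    // manager is installed, otherwise a short implementation-specific TTL).
    System.out.println("networkaddress.cache.ttl = "
        + Security.getProperty("networkaddress.cache.ttl"));
    System.out.println("networkaddress.cache.negative.ttl = "
        + Security.getProperty("networkaddress.cache.negative.ttl"));
  }
}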