I'm not sure if Java is using the system's libc resolver, but assuming it is, 
you cannot test with utilities like nslookup or dig because they use their own 
resolver.  Ping usually uses the libc resolver.  If you are on Linux, you can 
use "getent hosts $hostname" to definitively test the libc resolver.
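To make the comparison concrete, here is a quick sketch of checking the same name both ways (the hostname is a placeholder; substitute one of your own nodes, e.g. s06.xxx.local):

```shell
# Hypothetical hostname for illustration -- replace with a node from your grid.
HOST=localhost

# What libc (and most likely the JVM) sees: goes through /etc/nsswitch.conf,
# so it consults /etc/hosts, nss_mdns, etc. in the configured order.
getent hosts "$HOST"

# What a pure DNS tool sees: dig bypasses nsswitch.conf entirely and talks
# straight to the DNS server, so it can succeed while getent (and Java) fail.
dig +short "$HOST"
```

If getent fails for a name that dig resolves, the problem is in the libc/nsswitch side (hosts file, nss_mdns, nscd), not in the DNS server itself.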

If you really do want to use mDNS hosts (i.e., names ending in ".local"), then 
you must have nss_mdns installed on your system and configure /etc/nsswitch.conf 
to use it.  You may also want to consider using nscd to cache DNS lookups.  
Although if you are using mDNS, due to its dynamic nature, you may not want to 
cache entries (especially negative lookups) for very long unless the host is 
assigned a static IP.
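For reference, a typical nss_mdns setup looks something like the fragment below (the exact module name and ordering depend on your distribution and which nss_mdns variant you install):

```
# /etc/nsswitch.conf -- example "hosts" line with mdns_minimal ahead of dns.
# mdns_minimal only handles .local names; [NOTFOUND=return] stops the lookup
# there instead of leaking .local queries to the DNS servers.
hosts: files mdns_minimal [NOTFOUND=return] dns
```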

I hope this helps.

Daryn


On Jan 4, 2012, at 10:53 AM, Alexander Lorenz wrote:

> Hi,
> 
> Please ping the host you want to reach, and check your hosts file and your 
> /etc/resolv.conf
> 
> - Alex
> 
> Alexander Lorenz
> http://mapredit.blogspot.com
> 
> On Jan 4, 2012, at 7:28 AM, Oren <or...@infolinks.com> wrote:
> 
>> so it seems, but doing a dig from the terminal command line returns the 
>> results correctly.
>> the same settings have been running on production servers (not hadoop) for 
>> months without problems.
>> 
>> clarification - i changed the server names in the logs; the domain isn't 
>> xxx.local originally.
>> 
>> 
>> On 01/04/2012 05:19 PM, Harsh J wrote:
>>> Looks like your caching DNS servers aren't really functioning as you'd
>>> expect them to?
>>> 
>>>> org.apache.hadoop.hbase.ZooKeeperConnectionException:
>>>> java.net.UnknownHostException: s06.xxx.local
>>> (That .local also worries me, you probably have a misconfiguration in
>>> resolution somewhere.)
>>> 
>>> On Wed, Jan 4, 2012 at 8:38 PM, Oren<or...@infolinks.com>  wrote:
>>>> hi.
>>>> i have a small hadoop grid connected with a 1G network.
>>>> when servers are configured to use the local dns server, the jobs run
>>>> without a problem and copy speed during reduce is tens of MB.
>>>> once i change the servers to work with a cache-only named server on each
>>>> node, i start to get failed tasks with timeout errors.
>>>> also, copy speed drops to under 1MB.
>>>> 
>>>> there is NO degradation in the network; copying files between servers is
>>>> still tens of MB.
>>>> resolving works ok and at the same speed (give or take) with both
>>>> configurations.
>>>> 
>>>> any idea of what happens during the map/reduce process that causes this
>>>> behavior?
>>>> this is an example of the exceptions i get during map:
>>>> Too many fetch-failures
>>>> 
>>>> and during reduce:
>>>> java.lang.RuntimeException:
>>>> org.apache.hadoop.hbase.ZooKeeperConnectionException:
>>>> java.net.UnknownHostException: s06.xxx.local
>>>>   at org.apache.hadoop.hbase.client.HTableFactory.createHTableInterface(HTableFactory.java:38)
>>>>   at org.apache.hadoop.hbase.client.HTablePool.createHTable(HTablePool.java:129)
>>>>   at org.apache.hadoop.hbase.client.HTablePool.getTable(HTablePool.java:89)
>>>>   at com.infolinks.hadoop.commons.hbase.HBaseOperations.getTable(HBaseOperations.java:118)
>>>>   at com.infolinks.hadoop.framework.HBaseReducer.setup(HBaseReducer.java:71)
>>>>   at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
>>>>   at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
>>>>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
>>>>   at org.apache.hadoop.mapred.Child.main(Child.java:170)
>>>> Caused by: org.apache.hadoop.hbase.ZooKeeperConnectionException:
>>>> java.net.UnknownHostException: s06.xxx.local
>>>>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:1000)
>>>>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:303)
>>>>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:294)
>>>>   at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:156)
>>>>   at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:167)
>>>>   at org.apache.hadoop.hbase.client.HTableFactory.createHTableInterface(HTableFactory.java:36)
>>>>   ... 8 more
>>>> Caused by: java.net.UnknownHostException: s06.xxx.local
>>>>   at java.net.InetAddress.getAllByName0(InetAddress.java:1158)
>>>>   at java.net.InetAddress.getAllByName(InetAddress.java:1084)
>>>>   at java.net.InetAddress.getAllByName(InetAddress.java:1020)
>>>>   at org.apache.zookeeper.ClientCnxn.<init>(ClientCnxn.java:386)
>>>>   at org.apache.zookeeper.ClientCnxn.<init>(ClientCnxn.java:331)
>>>>   at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:377)
>>>>   at org.apache.hadoop.hbase.zookeeper.ZKUtil.connect(ZKUtil.java:97)
>>>>   at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:119)
>>>>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:998)
>>>>   ... 13 more
>>>> 
>>>> thank you,
>>>> Oren.
>>>> 
>>> 
>>> 
>> 
