Hi,
I have a small Hadoop grid connected with a 1 Gbit network.
When the servers are configured to use the local DNS server, jobs run without problems and the copy speed during reduce is in the tens of MB/s. Once I switch the servers to use a caching-only named server on each node, I start to get failed tasks with timeout errors.
The copy speed also drops to under 1 MB/s.

There is NO degradation in the network itself: copying files between servers still runs at tens of MB/s. Name resolution works and takes about the same time (give or take) with both configurations.

Any idea what happens during the map/reduce process that would cause this behavior?
Here are examples of the exceptions I get. During map:
Too many fetch-failures

and during reduce:
java.lang.RuntimeException: org.apache.hadoop.hbase.ZooKeeperConnectionException: java.net.UnknownHostException: s06.xxx.local
    at org.apache.hadoop.hbase.client.HTableFactory.createHTableInterface(HTableFactory.java:38)
    at org.apache.hadoop.hbase.client.HTablePool.createHTable(HTablePool.java:129)
    at org.apache.hadoop.hbase.client.HTablePool.getTable(HTablePool.java:89)
    at com.infolinks.hadoop.commons.hbase.HBaseOperations.getTable(HBaseOperations.java:118)
    at com.infolinks.hadoop.framework.HBaseReducer.setup(HBaseReducer.java:71)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: org.apache.hadoop.hbase.ZooKeeperConnectionException: java.net.UnknownHostException: s06.xxx.local
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:1000)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:303)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:294)
    at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:156)
    at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:167)
    at org.apache.hadoop.hbase.client.HTableFactory.createHTableInterface(HTableFactory.java:36)
    ... 8 more
Caused by: java.net.UnknownHostException: s06.xxx.local
    at java.net.InetAddress.getAllByName0(InetAddress.java:1158)
    at java.net.InetAddress.getAllByName(InetAddress.java:1084)
    at java.net.InetAddress.getAllByName(InetAddress.java:1020)
    at org.apache.zookeeper.ClientCnxn.<init>(ClientCnxn.java:386)
    at org.apache.zookeeper.ClientCnxn.<init>(ClientCnxn.java:331)
    at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:377)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.connect(ZKUtil.java:97)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:119)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:998)
    ... 13 more
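
The innermost cause in the trace is a forward lookup through java.net.InetAddress.getAllByName, so it may help to time that same call from inside a JVM on each node rather than relying only on command-line resolver tools. The following is a minimal sketch under that assumption, not part of the failing job: the class name DnsCheck, the Result holder and the default host "localhost" are hypothetical, and it also performs a reverse lookup for each returned address. The JVM caches lookup results, so each measurement is best taken in a freshly started process.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical diagnostic: times forward and reverse lookups inside the JVM. */
public class DnsCheck {

    /** Outcome of one forward lookup plus a reverse lookup per returned address. */
    static final class Result {
        final InetAddress[] addresses;
        final String[] canonicalNames;
        final long forwardNanos;
        final long reverseNanos;

        Result(InetAddress[] addresses, String[] canonicalNames,
               long forwardNanos, long reverseNanos) {
            this.addresses = addresses;
            this.canonicalNames = canonicalNames;
            this.forwardNanos = forwardNanos;
            this.reverseNanos = reverseNanos;
        }
    }

    static Result check(String host) throws UnknownHostException {
        long t0 = System.nanoTime();
        // Same call that throws UnknownHostException in the trace above.
        InetAddress[] addrs = InetAddress.getAllByName(host);
        long t1 = System.nanoTime();

        String[] names = new String[addrs.length];
        for (int i = 0; i < addrs.length; i++) {
            // Reverse lookup; returns the textual IP if no name can be found.
            names[i] = addrs[i].getCanonicalHostName();
        }
        long t2 = System.nanoTime();

        return new Result(addrs, names, t1 - t0, t2 - t1);
    }

    public static void main(String[] args) {
        // Pass the cluster hostnames as arguments, e.g. the node names of the grid.
        String[] hosts = args.length > 0 ? args : new String[] {"localhost"};
        for (String host : hosts) {
            try {
                Result r = check(host);
                for (int i = 0; i < r.addresses.length; i++) {
                    System.out.println(host + " -> " + r.addresses[i].getHostAddress()
                            + " -> " + r.canonicalNames[i]);
                }
                System.out.println(String.format("  forward: %.2f ms, reverse: %.2f ms",
                        r.forwardNanos / 1e6, r.reverseNanos / 1e6));
            } catch (UnknownHostException e) {
                System.out.println(host + " -> UnknownHostException: " + e.getMessage());
            }
        }
    }
}
```

Running it on the same node under both DNS configurations, with every node name as an argument, would show whether a particular name fails or is slow only when the per-node caching server is in use.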

Thank you,
Oren.
