Hi,
I am trying to set up a two-node cluster using Hadoop 0.19.0, with one
master (which should also work as a slave) and one slave node. But when I
run bin/start-dfs.sh, the datanode does not start on the slave. I have read
the previous mails on the list, but nothing suggested there seems to work
in this case. I get the following error in the hadoop-root-datanode-slave
log file while running bin/start-dfs.sh:
2009-02-03 13:00:27,516 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = slave/172.16.0.32
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.19.0
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r 713890; compiled by 'ndaley' on Fri Nov 14 03:12:29 UTC 2008
************************************************************/
2009-02-03 13:00:28,725 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 0 time(s).
2009-02-03 13:00:29,726 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 1 time(s).
2009-02-03 13:00:30,727 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 2 time(s).
2009-02-03 13:00:31,728 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 3 time(s).
2009-02-03 13:00:32,729 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 4 time(s).
2009-02-03 13:00:33,730 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 5 time(s).
2009-02-03 13:00:34,731 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 6 time(s).
2009-02-03 13:00:35,732 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 7 time(s).
2009-02-03 13:00:36,733 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 8 time(s).
2009-02-03 13:00:37,734 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 9 time(s).
2009-02-03 13:00:37,738 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call to master/172.16.0.46:54310 failed on local exception: No route to host
        at org.apache.hadoop.ipc.Client.call(Client.java:699)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
        at $Proxy4.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:306)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:343)
        at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:288)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:258)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:205)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1199)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1154)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1162)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1284)
Caused by: java.net.NoRouteToHostException: No route to host
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
        at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:299)
        at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:772)
        at org.apache.hadoop.ipc.Client.call(Client.java:685)
        ... 12 more
2009-02-03 13:00:37,739 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at slave/172.16.0.32
************************************************************/
Also, pseudo-distributed operation works on both machines, and I am able to
ssh from 'master to master' and 'master to slave' via password-less ssh
login. I do not think there is a problem with the network, because
cross-pinging between the machines works fine.
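For what it is worth, ping only exercises ICMP, while the datanode needs a
TCP connection to port 54310 on the master; a host firewall (iptables is on
by default on Fedora) could still reject the TCP connection and produce
exactly this "No route to host" error. A rough sketch of a TCP-level check
one could run from the slave (hypothetical commands, assuming bash; "master"
and 54310 come from the fs.default.name setting below):

```shell
# Sketch of a TCP-level reachability check, run from the slave.
# Ping succeeding only proves ICMP works; "No route to host" on a TCP
# connect often means an iptables rule on the master is rejecting it.
NN_HOST=master   # NameNode host from fs.default.name (assumption)
NN_PORT=54310    # NameNode RPC port from fs.default.name (assumption)

# Use bash's /dev/tcp pseudo-device so no extra tools are needed.
if timeout 5 bash -c "exec 3<>/dev/tcp/$NN_HOST/$NN_PORT" 2>/dev/null; then
    STATUS=open
else
    STATUS=blocked
fi
echo "TCP $NN_HOST:$NN_PORT is $STATUS"
```

If this reports "blocked" while ping works, the firewall on the master is
the first thing I would look at.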
Both machines run Linux (Fedora 8).
The following is the configuration I am using.
On master and slave, <HADOOP_INSTALL>/conf/masters looks like this:
master
On master and slave, <HADOOP_INSTALL>/conf/slaves looks like this:
master
slave
On both machines, conf/hadoop-site.xml looks like this:
<property>
  <name>fs.default.name</name>
  <value>hdfs://master:54310</value>
  <description>The name of the default file system. A URI whose scheme and
  authority determine the FileSystem implementation. The uri's scheme
  determines the config property (fs.SCHEME.impl) naming the FileSystem
  implementation class. The uri's authority is used to determine the host,
  port, etc. for a filesystem.</description>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>master:54311</value>
  <description>The host and port that the MapReduce job tracker runs at.
  If "local", then jobs are run in-process as a single map and reduce
  task.</description>
</property>
<property>
  <name>dfs.replication</name>
  <value>2</value>
  <description>Default block replication. The actual number of replications
  can be specified when the file is created. The default is used if
  replication is not specified in create time.</description>
</property>
The namenode was formatted successfully by running
"bin/hadoop namenode -format"
on the master node.
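One more check I could do on the master after bin/start-dfs.sh (hypothetical
command, assuming netstat is available): confirm that the NameNode is
actually listening on 54310 on an address the slave can reach, and not only
on 127.0.0.1 (which can happen if /etc/hosts maps "master" to a loopback
address):

```shell
# On the master, after bin/start-dfs.sh: show what is listening on the
# NameNode RPC port. A 127.0.0.1:54310 binding would explain why the
# slave cannot connect even though the NameNode process is up.
LISTEN=$(netstat -tln 2>/dev/null | grep 54310 || echo "nothing listening on 54310")
echo "$LISTEN"
```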
I am new to Hadoop and do not know what is going wrong.
Any help would be appreciated.
Thank you in advance,
Shefali Pawar
Pune, India