First, make sure that all the relevant ports (9000 and 9001 in your logs) are actually open on the master. I would also check that your conf files are OK. Looking at the log below, the DataNode reports that /work (your dfs.data.dir) is not writable, so it seems to have a permissions problem.
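As a rough sketch of those two checks (this is not anything Hadoop ships), something like the following Python could be run on a slave to see whether ports 9000 and 9001 on the master accept TCP connections, and whether the data directory is writable by the user running the DataNode. The function names are made up for illustration; the host, ports and /work path are the ones from the log below, so adjust them to whatever your conf files actually say.

```python
import os
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name-resolution failures.
        return False


def dir_is_writable(path):
    """Return True if path is an existing directory the current user can write to."""
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


if __name__ == "__main__":
    # Values assumed from the log quoted in this thread.
    master = "hermes.cse.sc.edu"
    for port in (9000, 9001):
        state = "open" if port_is_open(master, port) else "NOT reachable"
        print(f"{master}:{port} is {state}")
    print("/work writable:", dir_is_writable("/work"))
```

Run it as the same user that starts the DataNode, since the writability check depends on who is asking.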
(Note that telnet has nothing to do with Hadoop as far as I'm aware -- a better test would be ssh.)

Miles

2008/7/22 Jose Vidal <[EMAIL PROTECTED]>:
> I'm trying to install hadoop on our linux machine but after
> start-all.sh none of the slaves can connect:
>
> 2008-07-22 16:35:27,534 INFO org.apache.hadoop.dfs.DataNode: STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting DataNode
> STARTUP_MSG:   host = thetis/127.0.0.1
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 0.16.4
> STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.16 -r 652614; compiled by 'hadoopqa' on Fri May 2 00:18:12 UTC 2008
> ************************************************************/
> 2008-07-22 16:35:27,643 WARN org.apache.hadoop.dfs.DataNode: Invalid directory in dfs.data.dir: directory is not writable: /work
> 2008-07-22 16:35:27,699 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hermes.cse.sc.edu/129.252.130.148:9000. Already tried 1 time(s).
> 2008-07-22 16:35:28,700 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hermes.cse.sc.edu/129.252.130.148:9000. Already tried 2 time(s).
> 2008-07-22 16:35:29,700 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hermes.cse.sc.edu/129.252.130.148:9000. Already tried 3 time(s).
> 2008-07-22 16:35:30,701 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hermes.cse.sc.edu/129.252.130.148:9000. Already tried 4 time(s).
> 2008-07-22 16:35:31,702 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hermes.cse.sc.edu/129.252.130.148:9000. Already tried 5 time(s).
> 2008-07-22 16:35:32,702 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hermes.cse.sc.edu/129.252.130.148:9000. Already tried 6 time(s).
>
> same for the tasktrackers (port 9001).
>
> I think the problem has something to do with name resolution. Check these
> out:
>
> [EMAIL PROTECTED]:~/hadoop-0.16.4> telnet hermes.cse.sc.edu 9000
> Trying 127.0.0.1...
> Connected to hermes.cse.sc.edu (127.0.0.1).
> Escape character is '^]'.
> bye
> Connection closed by foreign host.
>
> [EMAIL PROTECTED]:~/hadoop-0.16.4> host hermes.cse.sc.edu
> hermes.cse.sc.edu has address 129.252.130.148
>
> [EMAIL PROTECTED]:~/hadoop-0.16.4> telnet 129.252.130.148 9000
> Trying 129.252.130.148...
> telnet: connect to address 129.252.130.148: Connection refused
> telnet: Unable to connect to remote host: Connection refused
>
> So, the first one connects but not the second one, but they both go to
> the same machine:port. My guess is that the hadoop server is closing
> the connection, but why?
>
> Thanks,
> Jose
>
> --
> Jose M. Vidal <[EMAIL PROTECTED]> http://jmvidal.cse.sc.edu
> University of South Carolina http://www.multiagent.com
>

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.