As a follow-up to what Jeff posted: go ahead and ignore the message you got on the NN for now.
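For reference, the NN URI lives in conf/core-site.xml (Hadoop 0.20.x uses the fs.default.name property) and must point at the master's LAN-resolvable hostname, not localhost, on every node. A minimal sketch, where `master-node` is a hypothetical placeholder and not a hostname from this thread:

```xml
<!-- conf/core-site.xml, identical on both machines.
     'master-node' is a hypothetical hostname; substitute the NN's real
     DNS name or LAN IP (never 127.0.0.1 / localhost). -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master-node:54310</value>
  </property>
</configuration>
```

After editing, bounce DFS (stop-dfs.sh, then start-dfs.sh) so both daemons pick up the change.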
If you look at the address that the DN log shows, it is 127.0.0.1, and the ip:port it is trying to connect to for the NN is 127.0.0.1:54310 ---> it is trying to bind to itself as if it were still in single-machine mode. Make sure that you have correctly pushed the URI for the NN into the config files on both machines and then bounce DFS.

Matt

-----Original Message-----
From: jeff.schm...@shell.com [mailto:jeff.schm...@shell.com]
Sent: Monday, June 27, 2011 4:08 PM
To: common-user@hadoop.apache.org
Subject: RE: Why I cannot see live nodes in a LAN-based cluster setup?

http://www.mentby.com/tim-robertson/error-register-getprotocolversion.html

-----Original Message-----
From: Jingwei Lu [mailto:j...@ucsd.edu]
Sent: Monday, June 27, 2011 3:58 PM
To: common-user@hadoop.apache.org
Subject: Re: Why I cannot see live nodes in a LAN-based cluster setup?

Hi,

I just manually modified the masters & slaves files on both machines. I found something wrong in the log files, as shown below:

-- Master: namenode.log:
****************************************
2011-06-27 13:44:47,055 INFO org.mortbay.log: jetty-6.1.14
2011-06-27 13:44:47,394 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50070
2011-06-27 13:44:47,395 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Web-server up at: 0.0.0.0:50070
2011-06-27 13:44:47,395 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2011-06-27 13:44:47,395 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54310: starting
2011-06-27 13:44:47,396 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54310: starting
2011-06-27 13:44:47,397 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54310: starting
2011-06-27 13:44:47,397 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54310: starting
2011-06-27 13:44:47,397 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54310: starting
2011-06-27 13:44:47,402 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 54310: starting
2011-06-27 13:44:47,404 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 54310: starting
2011-06-27 13:44:47,406 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 54310: starting
2011-06-27 13:44:47,406 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 54310: starting
2011-06-27 13:44:47,406 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 54310: starting
2011-06-27 13:44:47,408 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 54310: starting
2011-06-27 13:44:47,500 INFO org.apache.hadoop.ipc.Server: Error register getProtocolVersion
java.lang.IllegalArgumentException: Duplicate metricsName:getProtocolVersion
        at org.apache.hadoop.metrics.util.MetricsRegistry.add(MetricsRegistry.java:53)
        at org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:89)
        at org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:99)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:416)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
2011-06-27 13:45:02,572 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 127.0.0.1:50010 storage DS-87816363-127.0.0.1-50010-1309207502566
****************************************

-- Slave: datanode.log:
****************************************
2011-06-27 13:45:00,335 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = hdl.ucsd.edu/127.0.0.1
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
2011-06-27 13:45:02,476 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 0 time(s).
2011-06-27 13:45:03,549 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 1 time(s).
2011-06-27 13:45:04,552 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 2 time(s).
2011-06-27 13:45:05,609 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 3 time(s).
2011-06-27 13:45:06,640 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 4 time(s).
2011-06-27 13:45:07,643 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 5 time(s).
2011-06-27 13:45:08,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 6 time(s).
2011-06-27 13:45:09,661 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 7 time(s).
2011-06-27 13:45:10,664 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 8 time(s).
2011-06-27 13:45:11,678 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 9 time(s).
2011-06-27 13:45:11,679 INFO org.apache.hadoop.ipc.RPC: Server at hdl.ucsd.edu/127.0.0.1:54310 not available yet, Zzzzz...
****************************************

(Just a guess: is this due to some port problem?) Any comments will be greatly appreciated!
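One detail worth noting from the DN startup banner: it reports host = hdl.ucsd.edu/127.0.0.1, i.e. the slave's own hostname resolves to loopback, which would also make it register with the NN as 127.0.0.1. A minimal sketch of that failure mode, using a hypothetical /etc/hosts entry rather than anything taken from this thread:

```shell
# Hypothetical bad /etc/hosts line of the kind that makes a machine
# resolve its own name to loopback (inspect the real file on each node).
entry="127.0.0.1 hdl.ucsd.edu"

# The first field is the address the hostname resolves to.
ip=$(echo "$entry" | awk '{print $1}')

case "$ip" in
  127.*) echo "BAD: hostname maps to loopback ($ip)" ;;
  *)     echo "OK: hostname maps to $ip" ;;
esac
```

If the mapping is to loopback, remote daemons can never reach the advertised address, no matter how the Hadoop configs are set.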
Best Regards
Yours Sincerely

Jingwei Lu

On Mon, Jun 27, 2011 at 1:28 PM, GOEKE, MATTHEW (AG/1000) <matthew.go...@monsanto.com> wrote:

> Did you make sure to define the datanode/tasktracker in the slaves file in
> your conf directory and push that to both machines? Also, have you checked
> the logs on either to see if there are any errors?
>
> Matt
>
> -----Original Message-----
> From: Jingwei Lu [mailto:j...@ucsd.edu]
> Sent: Monday, June 27, 2011 3:24 PM
> To: HADOOP MLIST
> Subject: Why I cannot see live nodes in a LAN-based cluster setup?
>
> Hi Everyone:
>
> I am quite new to Hadoop. I am attempting to set up Hadoop locally on two
> machines connected by LAN. Both of them pass the single-node test.
> However, I failed in the two-node cluster setup, in both of the two cases below:
>
> 1) set one as a dedicated namenode and the other as a dedicated datanode
> 2) set one as both name- and data-node, and the other as just a datanode
>
> I launch *start-dfs.sh* on the namenode. Since I have all the *ssh* issues
> cleared, I can always observe the startup of the daemon on every datanode.
> However, the web page at *http://(URI of namenode):50070* shows only 0 live
> nodes for (1) and 1 live node for (2), which is the same as the output of the
> command-line *hadoop dfsadmin -report*.
>
> Generally it appears that from the namenode you cannot observe the remote
> datanodes alive, let alone a normal across-node MapReduce execution.
>
> Could anyone give some hints / instructions at this point? I really
> appreciate it!
>
> Thanks.
>
> Best Regards
> Yours Sincerely
>
> Jingwei Lu