Re: JobTracker startup failure when starting hadoop-0.20.0 cluster on Amazon EC2 with contrib/ec2 scripts
Hi Jeyendran,

I have exactly the same problem as you when setting up Hadoop 0.20 on EC2; I found your post through Google. I was wondering whether you've found a solution yet, or whether anyone else has one.

Thanks,
Yuanyuan

From: "Jeyendran Balakrishnan"
Date: 09/03/2009 04:18 PM
Subject: JobTracker startup failure when starting hadoop-0.20.0 cluster on Amazon EC2 with contrib/ec2 scripts

I downloaded Hadoop 0.20.0 and used the src/contrib/ec2/bin scripts to launch a Hadoop cluster on Amazon EC2, after building a new Hadoop 0.20.0 AMI.

I launched an instance with my new Hadoop 0.20.0 AMI, then logged in and ran the following to launch a new cluster:

root(/vol/hadoop-0.20.0)> bin/launch-hadoop-cluster hadoop-test 2

After the usual EC2 wait, one master and two slave instances were launched on EC2, as expected. When I ssh'ed into the instances, here is what I found:

Slaves: DataNode and NameNode are running
Master: Only NameNode is running

I could use HDFS commands (using the $HADOOP_HOME/bin/hadoop scripts) without any problems, from both master and slaves. However, since the JobTracker is not running, I cannot run map-reduce jobs.

I checked the JobTracker logs in /vol/hadoop-0.20.0/logs, reproduced below:

2009-09-03 18:55:38,486 WARN org.apache.hadoop.conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
2009-09-03 18:55:38,520 INFO org.apache.hadoop.mapred.JobTracker: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting JobTracker
STARTUP_MSG:   host = domU-12-31-39-06-44-E3.compute-1.internal/10.208.75.17
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.0
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.20 -r 763504; compiled by 'ndaley' on Thu Apr 9 05:18:40 UTC 2009
************************************************************/
2009-09-03 18:55:38,652 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=JobTracker, port=50002
2009-09-03 18:55:38,703 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2009-09-03 18:55:38,827 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50030
2009-09-03 18:55:38,827 INFO org.mortbay.log: jetty-6.1.14
2009-09-03 18:55:48,425 INFO org.mortbay.log: Started selectchannelconnec...@0.0.0.0:50030
2009-09-03 18:55:48,427 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
2009-09-03 18:55:48,432 INFO org.apache.hadoop.mapred.JobTracker: JobTracker up at: 50002
2009-09-03 18:55:48,432 INFO org.apache.hadoop.mapred.JobTracker: JobTracker webserver: 50030
2009-09-03 18:55:48,541 INFO org.apache.hadoop.mapred.JobTracker: Cleaning up the system directory
2009-09-03 18:55:48,628 INFO org.apache.hadoop.hdfs
Re: JobTracker startup failure when starting hadoop-0.20.0 cluster on Amazon EC2 with contrib/ec2 scripts
毛宏 wrote:
> I downloaded Hadoop 0.20.0 and used the src/contrib/ec2/bin scripts to launch a Hadoop cluster on Amazon EC2, after building a new Hadoop 0.20.0 AMI.
>
> I launched an instance with my new Hadoop 0.20.0 AMI, then logged in and ran the following to launch a new cluster:
>
> root(/vol/hadoop-0.20.0)> bin/launch-hadoop-cluster hadoop-test 2
>
> After the usual EC2 wait, one master and two slave instances were launched on EC2, as expected. When I ssh'ed into the instances, here is what I found:
>
> Slaves: DataNode and NameNode are running
> Master: Only NameNode is running
>
> I could use HDFS commands (using the $HADOOP_HOME/bin/hadoop scripts) without any problems, from both master and slaves. However, since the JobTracker is not running, I cannot run map-reduce jobs.
>
> 2009-09-03 18:55:48,628 INFO org.apache.hadoop.hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /mnt/hadoop/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1 at
> 2009-09-03 18:55:48,628 WARN org.apache.hadoop.hdfs.DFSClient: NotReplicatedYetException sleeping /mnt/hadoop/mapred/system/jobtracker.info retries left 4
> 2009-09-03 18:55:49,030 INFO org.apache.hadoop.hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /mnt/hadoop/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1

The JT isn't up because the datanodes aren't taking data; the JT spins waiting for files to be writable so it can save its state. I cheat in my clusters by running a (small) datanode in the root VM, so the JT comes up without needing any more nodes. Check the DN/HDFS status in more detail; that looks like the first problem.
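A concrete way to act on this advice: `bin/hadoop dfsadmin -report` on the master prints how many datanodes the namenode can actually see. Below is a minimal sketch of the check; the canned report text is an assumption modeled on 0.20-era output (so the snippet runs anywhere), and the file name is arbitrary.

```shell
# On a live cluster you would run, on the master:
#   $HADOOP_HOME/bin/hadoop dfsadmin -report > dfs-report.txt
# Here a canned sample (hypothetical) stands in for that output.
cat <<'EOF' > dfs-report.txt
Configured Capacity: 0 (0 KB)
Datanodes available: 0 (0 total, 0 dead)
EOF

# If the namenode sees zero live datanodes, every HDFS write -- including
# the JobTracker saving jobtracker.info -- fails with "replicated to 0 nodes".
avail=$(grep -o 'Datanodes available: [0-9]*' dfs-report.txt | grep -o '[0-9]*$')
if [ "$avail" -eq 0 ]; then
  echo "No live datanodes: HDFS writes will fail with 'replicated to 0 nodes'"
fi
```

If the count is zero even though DataNode processes are running on the slaves, the datanodes are failing to register with the namenode, which points at connectivity or configuration rather than the JobTracker itself.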
JobTracker startup failure when starting hadoop-0.20.0 cluster on Amazon EC2 with contrib/ec2 scripts
I downloaded Hadoop 0.20.0 and used the src/contrib/ec2/bin scripts to launch a Hadoop cluster on Amazon EC2, after building a new Hadoop 0.20.0 AMI.

I launched an instance with my new Hadoop 0.20.0 AMI, then logged in and ran the following to launch a new cluster:

root(/vol/hadoop-0.20.0)> bin/launch-hadoop-cluster hadoop-test 2

After the usual EC2 wait, one master and two slave instances were launched on EC2, as expected. When I ssh'ed into the instances, here is what I found:

Slaves: DataNode and NameNode are running
Master: Only NameNode is running

I could use HDFS commands (using the $HADOOP_HOME/bin/hadoop scripts) without any problems, from both master and slaves. However, since the JobTracker is not running, I cannot run map-reduce jobs.

I checked the JobTracker logs in /vol/hadoop-0.20.0/logs, reproduced below:

2009-09-03 18:55:38,486 WARN org.apache.hadoop.conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
2009-09-03 18:55:38,520 INFO org.apache.hadoop.mapred.JobTracker: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting JobTracker
STARTUP_MSG:   host = domU-12-31-39-06-44-E3.compute-1.internal/10.208.75.17
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.0
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.20 -r 763504; compiled by 'ndaley' on Thu Apr 9 05:18:40 UTC 2009
************************************************************/
2009-09-03 18:55:38,652 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=JobTracker, port=50002
2009-09-03 18:55:38,703 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2009-09-03 18:55:38,827 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50030
2009-09-03 18:55:38,827 INFO org.mortbay.log: jetty-6.1.14
2009-09-03 18:55:48,425 INFO org.mortbay.log: Started selectchannelconnec...@0.0.0.0:50030
2009-09-03 18:55:48,427 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
2009-09-03 18:55:48,432 INFO org.apache.hadoop.mapred.JobTracker: JobTracker up at: 50002
2009-09-03 18:55:48,432 INFO org.apache.hadoop.mapred.JobTracker: JobTracker webserver: 50030
2009-09-03 18:55:48,541 INFO org.apache.hadoop.mapred.JobTracker: Cleaning up the system directory
2009-09-03 18:55:48,628 INFO org.apache.hadoop.hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /mnt/hadoop/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

at org.apache.hadoop.ipc.Client.call(Client.java:739)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
at $Proxy4.addBlock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
at $Proxy4.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2873)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2755)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2046)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2232)
2009-09-03 18:55:48,628
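For anyone triaging a log like the one above: the startup messages are all normal, and the useful signal is the first remote exception near the end. A quick grep pulls it out; the sketch below runs over a two-line sample of this log (the sample file name is arbitrary, standing in for the real JobTracker log under /vol/hadoop-0.20.0/logs).

```shell
# Two lines taken from the JobTracker log above, as a stand-in for the
# real log file. On each node, `jps` (ships with the JDK) separately
# confirms which daemons are actually up.
cat <<'EOF' > jt-log-sample.txt
2009-09-03 18:55:48,541 INFO org.apache.hadoop.mapred.JobTracker: Cleaning up the system directory
2009-09-03 18:55:48,628 INFO org.apache.hadoop.hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /mnt/hadoop/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
EOF

# The first exception line shows the JobTracker is blocked writing to HDFS,
# not failing on its own configuration.
grep -m1 'Exception' jt-log-sample.txt
```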
Re: JobTracker startup failure when starting hadoop-0.20.0 cluster on Amazon EC2 with contrib/ec2 scripts
Hi Jeyendran,

Were there any errors reported in the datanode logs? There could be a problem with the datanodes contacting the namenode, caused by firewall configuration problems (EC2 security groups).

Cheers,
Tom

On Fri, Sep 4, 2009 at 12:17 AM, Jeyendran Balakrishnan wrote:
> I downloaded Hadoop 0.20.0 and used the src/contrib/ec2/bin scripts to
> launch a Hadoop cluster on Amazon EC2, after building a new Hadoop
> 0.20.0 AMI.
>
> I launched an instance with my new Hadoop 0.20.0 AMI, then logged in and
> ran the following to launch a new cluster:
> root(/vol/hadoop-0.20.0)> bin/launch-hadoop-cluster hadoop-test 2
>
> After the usual EC2 wait, one master and two slave instances were
> launched on EC2, as expected. When I ssh'ed into the instances, here is
> what I found:
>
> Slaves: DataNode and NameNode are running
> Master: Only NameNode is running
>
> I could use HDFS commands (using $HADOOP_HOME/bin/hadoop scripts)
> without any problems, from both master and slaves. However, since
> JobTracker is not running, I cannot run map-reduce jobs.
>
> I checked the logs from /vol/hadoop-0.20.0/logs for the JobTracker,
> reproduced below:
>
> 2009-09-03 18:55:38,486 WARN org.apache.hadoop.conf.Configuration:
> DEPRECATED: hadoop-site.xml found in the classpath. Usage of
> hadoop-site.xml is deprecated. Instead use core-site.xml,
> mapred-site.xml and hdfs-site.xml to override properties of
> core-default.xml, mapred-default.xml and hdfs-default.xml respectively
> 2009-09-03 18:55:38,520 INFO org.apache.hadoop.mapred.JobTracker:
> STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting JobTracker
> STARTUP_MSG:   host = domU-12-31-39-06-44-E3.compute-1.internal/10.208.75.17
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 0.20.0
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.20 -r
> 763504; compiled by 'ndaley' on Thu Apr 9 05:18:40 UTC 2009
> ************************************************************/
> 2009-09-03 18:55:38,652 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
> Initializing RPC Metrics with hostName=JobTracker, port=50002
> 2009-09-03 18:55:38,703 INFO org.mortbay.log: Logging to
> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
> org.mortbay.log.Slf4jLog
> 2009-09-03 18:55:38,827 INFO org.apache.hadoop.http.HttpServer: Jetty
> bound to port 50030
> 2009-09-03 18:55:38,827 INFO org.mortbay.log: jetty-6.1.14
> 2009-09-03 18:55:48,425 INFO org.mortbay.log: Started
> selectchannelconnec...@0.0.0.0:50030
> 2009-09-03 18:55:48,427 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> Initializing JVM Metrics with processName=JobTracker, sessionId=
> 2009-09-03 18:55:48,432 INFO org.apache.hadoop.mapred.JobTracker:
> JobTracker up at: 50002
> 2009-09-03 18:55:48,432 INFO org.apache.hadoop.mapred.JobTracker:
> JobTracker webserver: 50030
> 2009-09-03 18:55:48,541 INFO org.apache.hadoop.mapred.JobTracker:
> Cleaning up the system directory
> 2009-09-03 18:55:48,628 INFO org.apache.hadoop.hdfs.DFSClient:
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
> /mnt/hadoop/mapred/system/jobtracker.info could only be replicated to 0
> nodes, instead of 1
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
>
> at org.apache.hadoop.ipc.Client.call(Client.java:739)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
> at $Proxy4.addBlock(Unknown Source)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
> at $Proxy4.
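The security-group theory can be checked from a slave by probing the master's ports directly. This is only a sketch: the host name below is the one from the log, and the port list is an assumption (take the real values from fs.default.name and mapred.job.tracker in your hadoop-site.xml; the log shows 50002 for the JobTracker RPC port, and 50010 is the usual DataNode transfer port). It uses bash's /dev/tcp redirection and the coreutils `timeout` command.

```shell
# Hypothetical master host (from the log); substitute your cluster's master.
MASTER=domU-12-31-39-06-44-E3.compute-1.internal

# Probe each assumed port; /dev/tcp/<host>/<port> fails if the port is
# unreachable (closed, firewalled, or the host does not resolve).
out=""
for port in 50001 50002 50010; do
  if timeout 2 bash -c "echo > /dev/tcp/$MASTER/$port" 2>/dev/null; then
    out="${out}port $port open\n"
  else
    out="${out}port $port unreachable\n"
  fi
done
printf "%b" "$out"

# If inter-instance ports are blocked, inspect the security groups the
# launch scripts created, e.g. with the EC2 API tools' ec2-describe-group.
```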