Hadoop Namenode not starting up
-------------------------------
Key: HDFS-1864
URL: https://issues.apache.org/jira/browse/HDFS-1864
Project: Hadoop HDFS
Issue Type: Task
Reporter: Ronak Shah
1. Checked whether Hadoop was running properly. Learned that we are supposed to run 'jps' and confirm that a NameNode process exists.

2. The documentation says that if the NameNode process does not exist, run:

   /etc/init.d/hadoop-0.20-namenode start
   /etc/init.d/hadoop-0.20-namenode status

   The status check reported the NameNode process as failed:

   EQX hdfs@hadoop-master:/usr/lib/hadoop/bin$ /etc/init.d/hadoop-0.20-namenode status
   namenode dead but pid file exists

3. Searched for the pid files and deleted them.

4. RYStats fell over at 4:45. As a direct result, we looked at the process list, found a stalled NameNode process, and killed it with kill -9:

   EQX root@hadoop-master:/etc/init.d# ps aux | grep namenode
   hdfs 5038 0.2 1.0 3617440 526704 ? Sl Mar31 74:02 /usr/java/default/bin/java -Dproc_namenode -Xmx3000m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.log.dir=/usr/lib/hadoop/logs -Dhadoop.log.file=hadoop-hdfs-namenode-hadoop-master.rockyou.com.log -Dhadoop.home.dir=/usr/lib/hadoop -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,DRFA -Djava.library.path=/usr/lib/hadoop/lib/native/Linux-amd64-64 -Dhadoop.policy.file=hadoop-policy.xml -classpath 
/usr/lib/hadoop/conf:/usr/java/default/lib/tools.jar:/usr/lib/hadoop:/usr/lib/hadoop/hadoop-core-0.20.2+737.jar:/usr/lib/hadoop/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/usr/lib/hadoop/lib/commons-codec-1.4.jar:/usr/lib/hadoop/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop/lib/commons-el-1.0.jar:/usr/lib/hadoop/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop/lib/commons-net-1.4.1.jar:/usr/lib/hadoop/lib/core-3.1.1.jar:/usr/lib/hadoop/lib/hadoop-fairscheduler-0.20.2+737.jar:/usr/lib/hadoop/lib/hadoop-lzo-0.4.8.jar:/usr/lib/hadoop/lib/hadoop-lzo.jar:/usr/lib/hadoop/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop/lib/hue-plugins-1.1.0.jar:/usr/lib/hadoop/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop/lib/jets3t-0.6.1.jar:/usr/lib/hadoop/lib/jetty-6.1.14.jar:/usr/lib/hadoop/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop/lib/junit-4.5.jar:/usr/lib/hadoop/lib/kfs-0.2.2.jar:/usr/lib/hadoop/lib/log4j-1.2.15.jar:/usr/lib/hadoop/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop/lib/oro-2.0.8.jar:/usr/lib/hadoop/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop/lib/xmlenc-0.52.jar:/usr/lib/hadoop/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop/lib/jsp-2.1/jsp-api-2.1.jar::/usr/local/lib/mysql-connector-java-5.1.7-bin.jar:/usr/local/lib/mail.jar:/usr/local/lib/mysql-connector-java-5.1.7-bin.jar:/usr/local/lib/mail.jar:/usr/local/lib/mysql-connector-java-5.1.7-bin.jar:/usr/local/lib/mail.jar:/usr/local/lib/mysql-connector-java-5.1.7-bin.jar:/usr/local/lib/mail.jar:/usr/local/lib/mysql-connector-java-5.1.7-bin.jar:/usr/local/lib/mail.jar 
   org.apache.hadoop.hdfs.server.namenode.NameNode
   root 16449 0.0 0.0 61136 744 pts/4 S+ 16:29 0:00 grep namenode
   EQX root@hadoop-master:/etc/init.d# kill -9 5038

5. We then started looking at the log output and discovered that the NameNode startup process is throwing a NullPointerException:

STARTUP_MSG: build = -r 98c55c28258aa6f42250569bd7fa431ac657bdbd; compiled by 'root' on Mon Oct 11 13:14:05 EDT 2010
************************************************************/
2011-04-25 21:16:47,841 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
2011-04-25 21:16:47,949 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.ganglia.GangliaContext31
2011-04-25 21:16:47,982 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hdfs
2011-04-25 21:16:47,982 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=root
2011-04-25 21:16:47,982 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
2011-04-25 21:16:47,987 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
2011-04-25 21:16:48,301 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.ganglia.GangliaContext31
2011-04-25 21:16:48,302 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
2011-04-25 21:16:48,328 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 237791
2011-04-25 21:16:51,699 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 0
2011-04-25 21:16:51,699 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 42758182 loaded in 3 seconds.
2011-04-25 21:16:51,701 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.NullPointerException
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1088)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1100)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedMkdir(FSDirectory.java:987)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedMkdir(FSDirectory.java:974)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:718)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1034)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:845)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:379)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:99)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:343)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:317)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:214)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:394)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1148)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1157)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
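Since the NullPointerException is thrown from FSEditLog.loadFSEdits while replaying the edit log, the fsimage itself loads fine ("Image file of size 42758182 loaded in 3 seconds") and the damage is in the edits file. A common manual fallback on 0.20 is to restore the secondary NameNode's last checkpoint into the name directory (or to use the NameNode's -importCheckpoint startup option). The sketch below only illustrates that sequence: the pid file path, directories, and files are hypothetical /tmp stand-ins for the real pid file, dfs.name.dir, and fs.checkpoint.dir from hdfs-site.xml. It is not a supported procedure, and it discards any namespace changes made after the checkpoint — back everything up first.

```shell
# Hypothetical paths -- substitute your init script's pid file and the
# values of dfs.name.dir / fs.checkpoint.dir from hdfs-site.xml.
PID_FILE=/tmp/hadoop-hdfs-namenode.pid
NAME_DIR=/tmp/demo/dfs/name                  # stands in for dfs.name.dir
CHECKPOINT_DIR=/tmp/demo/dfs/namesecondary   # stands in for fs.checkpoint.dir

# Demo-only setup so the sketch runs end to end (fabricated files).
mkdir -p "$NAME_DIR/current" "$CHECKPOINT_DIR/current"
echo 999999 > "$PID_FILE"                              # stale pid, no such process
echo fsimage-data > "$CHECKPOINT_DIR/current/fsimage"  # last good checkpoint

# 1. "namenode dead but pid file exists": remove the stale pid file so the
#    init script's status/start logic is not confused by a dead pid.
if [ -f "$PID_FILE" ] && ! kill -0 "$(cat "$PID_FILE")" 2>/dev/null; then
  rm -f "$PID_FILE"
fi

# 2. Preserve the damaged metadata before touching anything.
cp -a "$NAME_DIR" "${NAME_DIR}.bak"

# 3. Replace the name directory contents with the secondary's last
#    checkpoint, discarding the edits file whose replay triggers the NPE.
rm -rf "$NAME_DIR/current"
cp -a "$CHECKPOINT_DIR/current" "$NAME_DIR/current"
```

After step 3 the NameNode would be started again via the init script; if it comes up, run a fsck to assess what was lost between the checkpoint and the crash.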