It looks like this happens during the HDFS recovery phase of the cluster start. Perhaps a tmp cleaner has removed some of the files, and this part of the restart now fails because of it.
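To make the hypothesis above concrete, here is a toy Python model of the check that appears to be failing. It is only a sketch, not Hadoop's code: the function names and the example paths are made up for illustration, and the only thing taken from the log is the wording of the error message. The assumption is that every file under construction (every lease) must still have an entry in the namespace when the image is saved.

```python
# Toy model only: this is NOT Hadoop's code. It mimics the consistency
# check implied by the "saveLeases found path ... but no matching entry
# in namespace" error in the NameNode log. All names are hypothetical.


def find_orphaned_leases(namespace_paths, leased_paths):
    """Return the leased paths that have no entry in the namespace, in order."""
    known = set(namespace_paths)
    return [path for path in leased_paths if path not in known]


def save_files_under_construction(namespace_paths, leased_paths):
    """Fail on the first leased path that is missing from the namespace.

    Returns the number of leases checked when everything is consistent.
    """
    orphans = find_orphaned_leases(namespace_paths, leased_paths)
    if orphans:
        raise IOError(
            "saveLeases found path %s but no matching entry in namespace."
            % orphans[0]
        )
    return len(leased_paths)


if __name__ == "__main__":
    # Hypothetical paths: one file survived, one was deleted while it
    # was still under construction (still leased).
    namespace = ["/user/hadoop/out/part-00000"]
    leases = ["/user/hadoop/out/part-00000", "/tmp/job/_temporary/part-00002"]
    try:
        save_files_under_construction(namespace, leases)
    except IOError as err:
        print(err)
```

If something removed a file from the namespace while a reduce attempt still held a lease on it, a check of this shape would fail on every restart until the stale lease is gone, which would match what the log shows.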
I am not terribly familiar with the job recovery code.

On Wed, Apr 22, 2009 at 11:44 AM, Tamir Kamara <tamirkam...@gmail.com> wrote:

> Hey,
>
> hadoop-site.xml from the name node is attached.
> I performed a cluster restart and then it wouldn't come up.
>
> Thanks in advance,
> Tamir
>
>
> On Wed, Apr 22, 2009 at 9:03 PM, Alex Loddengaard <a...@cloudera.com> wrote:
>
>> Can you post your hadoop-site.xml? Also, what prompted this problem? Did
>> you bounce the cluster?
>>
>> Alex
>>
>> On Wed, Apr 22, 2009 at 8:16 AM, Tamir Kamara <tamirkam...@gmail.com>
>> wrote:
>>
>> > Hi,
>> >
>> > After a while working with hadoop I'm now faced with a situation where the
>> > namenode won't start up. I'm working with a patched up version of 0.19.1
>> > with ganglia patches (3422, 4675) and with 5269 which is supposed to deal with
>> > killed_unclean task status and the massive "serious problem" lines in the JT
>> > logs.
>> > The latest NN logs are below.
>> >
>> > Can you help me figure out what is going on?
>> >
>> > Thanks,
>> > Tamir
>> >
>> >
>> >
>> > 2009-04-22 18:12:36,966 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
>> > /************************************************************
>> > STARTUP_MSG: Starting NameNode
>> > STARTUP_MSG: host = lb-emu-3/192.168.14.11
>> > STARTUP_MSG: args = []
>> > STARTUP_MSG: version = 0.19.2-dev
>> > STARTUP_MSG: build = -r ; compiled by 'tkamara' on Tue Apr 21 12:03:50 IDT 2009
>> > ************************************************************/
>> > 2009-04-22 18:12:37,448 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=54310
>> > 2009-04-22 18:12:37,456 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: lb-emu-3.israel.verisign.com/192.168.14.11:54310
>> > 2009-04-22 18:12:37,467 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
>> > 2009-04-22 18:12:37,474 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
>> > 2009-04-22 18:12:37,627 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hadoop,hadoop
>> > 2009-04-22 18:12:37,628 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
>> > 2009-04-22 18:12:37,628 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
>> > 2009-04-22 18:12:37,649 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
>> > 2009-04-22 18:12:37,651 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
>> > 2009-04-22 18:12:37,814 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 3427
>> > 2009-04-22 18:12:38,486 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 28
>> > 2009-04-22 18:12:38,511 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 488333 loaded in 0 seconds.
>> > 2009-04-22 18:12:38,634 INFO org.apache.hadoop.hdfs.server.common.Storage: Edits file /usr/local/hadoop-datastore/hadoop/dfs/name/current/edits of size 82110 edits # 477 loaded in 0 seconds.
>> > 2009-04-22 18:12:40,893 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Invalid opcode, reached end of edit log Number of transactions found 36635
>> > 2009-04-22 18:12:40,893 INFO org.apache.hadoop.hdfs.server.common.Storage: Edits file /usr/local/hadoop-datastore/hadoop/dfs/name/current/edits.new of size 5229334 edits # 36635 loaded in 2 seconds.
>> > 2009-04-22 18:12:41,024 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
>> > java.io.IOException: saveLeases found path /tmp/temp623789763/tmp659456056/_temporary/_attempt_200904211331_0010_r_000002_0/part-00002 but no matching entry in namespace.
>> >     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:4608)
>> >     at org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImage(FSImage.java:1010)
>> >     at org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImage(FSImage.java:1031)
>> >     at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:88)
>> >     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
>> >     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
>> >     at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>> >     at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>> >     at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>> >     at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>> >     at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>> > 2009-04-22 18:12:41,038 INFO org.apache.hadoop.ipc.Server: Stopping server on 54310
>> > 2009-04-22 18:12:41,038 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.IOException: saveLeases found path /tmp/temp623789763/tmp659456056/_temporary/_attempt_200904211331_0010_r_000002_0/part-00002 but no matching entry in namespace.
>> >     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:4608)
>> >     at org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImage(FSImage.java:1010)
>> >     at org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImage(FSImage.java:1031)
>> >     at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:88)
>> >     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
>> >     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
>> >     at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>> >     at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>> >     at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>> >     at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>> >     at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>> >
>> > 2009-04-22 18:12:41,039 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
>> > /************************************************************
>> > SHUTDOWN_MSG: Shutting down NameNode at lb-emu-3/192.168.14.11
>> > ************************************************************/
>> >
>

--
Alpha Chapters of my book on Hadoop are available
http://www.apress.com/book/view/9781430219422