[ http://issues.apache.org/jira/browse/HADOOP-820?page=all ]
Bryan Pendleton updated HADOOP-820:
-----------------------------------

    Attachment: fixNameNodeStartup.patch

This is a trivial workaround that anyone else stuck with a truncated log can use. It's not a good solution: it needs better logging, probably a preference that defaults to "don't go on", etc. However, since my log filled up while recording replication changes, the workaround results in no data loss in my case.

> NameNode startup fails if edit log terminates prematurely
> ---------------------------------------------------------
>
>                 Key: HADOOP-820
>                 URL: http://issues.apache.org/jira/browse/HADOOP-820
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>         Environment: ~50 node cluster
>            Reporter: Bryan Pendleton
>         Attachments: fixNameNodeStartup.patch
>
>
> I ran out of space on the device that stores the edit log, resulting in an
> edit log that is truncated mid-transaction.
> Ideally, the NameNode should start up, in SafeMode or the like, whenever this
> happens. Right now, you get this stack trace:
> 2006-12-12 15:33:57,212 ERROR org.apache.hadoop.dfs.NameNode: java.io.EOFException
>         at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:310)
>         at org.apache.hadoop.io.UTF8.readFields(UTF8.java:104)
>         at org.apache.hadoop.dfs.FSEditLog.loadFSEdits(FSEditLog.java:227)
>         at org.apache.hadoop.dfs.FSImage.loadFSImage(FSImage.java:191)
>         at org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:320)
>         at org.apache.hadoop.dfs.FSNamesystem.<init>(FSNamesystem.java:226)
>         at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:146)
>         at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:138)
>         at org.apache.hadoop.dfs.NameNode.main(NameNode.java:589)
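
For readers who want to see the general idea behind tolerating a truncated log, here is a minimal, self-contained sketch. It is not the attached patch and does not use Hadoop's FSEditLog format: the class name TruncatedLogDemo, the Result type, and the record strings are all hypothetical, and records are simply written with DataOutputStream.writeUTF. The reader keeps every complete record, stops at the first record that ends early, and reports the truncation so the caller can decide whether to go on.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; not part of Hadoop.
class TruncatedLogDemo {

    /** Complete records read so far, plus a flag set when the log ended mid-record. */
    static final class Result {
        final List<String> records = new ArrayList<>();
        boolean truncated = false;
    }

    /** Reads length-prefixed UTF records until a clean end of stream or a partial record. */
    static Result readTolerantly(InputStream raw) throws IOException {
        BufferedInputStream buffered = new BufferedInputStream(raw);
        DataInputStream in = new DataInputStream(buffered);
        Result result = new Result();
        while (true) {
            // Peek one byte: end of stream on a record boundary is a clean end.
            buffered.mark(1);
            if (buffered.read() == -1) {
                return result;
            }
            buffered.reset();
            try {
                result.records.add(in.readUTF());
            } catch (EOFException e) {
                // The stream ended inside a record: keep what was read, flag the rest.
                result.truncated = true;
                return result;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeUTF("setReplication /a 3"); // made-up record contents
        out.writeUTF("setReplication /b 3");
        out.flush();

        // Simulate a full disk by cutting the last record short.
        byte[] full = bytes.toByteArray();
        byte[] cut = Arrays.copyOf(full, full.length - 5);

        Result r = readTolerantly(new ByteArrayInputStream(cut));
        System.out.println(r.records + " truncated=" + r.truncated);
    }
}
```

Returning a flag instead of silently continuing leaves the policy to the caller, which fits the suggestion above that the default should be "don't go on" unless an operator opts in.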