[ 
https://issues.apache.org/jira/browse/HDFS-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006515#comment-13006515
 ] 

Todd Lipcon commented on HDFS-1594:
-----------------------------------

Dhruba: I haven't dug into it yet, but I've also seen this before when the 
drive hosting the log4j logs fills up. I think there are some cases in which we 
do a System.exit(1) which can corrupt the logs -- the issue being that, even 
though we fsync after every edit, there's no guarantee that we don't get half 
of the preceding edit. Hairong's proposal to checksum each edit should help 
here.

> When the disk becomes full Namenode is getting shutdown and not able to 
> recover
> -------------------------------------------------------------------------------
>
>                 Key: HDFS-1594
>                 URL: https://issues.apache.org/jira/browse/HDFS-1594
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.21.0, 0.21.1, 0.22.0
>         Environment: Linux linux124 2.6.27.19-5-default #1 SMP 2009-02-28 
> 04:40:21 +0100 x86_64 x86_64 x86_64 GNU/Linux
>            Reporter: Devaraj K
>             Fix For: 0.23.0
>
>         Attachments: HDFS-1594.patch, HDFS-1594.patch, HDFS-1594.patch, 
> hadoop-root-namenode-linux124.log
>
>
> When the disk becomes full name node is shutting down and if we try to start 
> after making the space available It is not starting and throwing the below 
> exception.
> {code:xml} 
> 2011-01-24 23:23:33,727 ERROR 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem 
> initialization failed.
> java.io.EOFException
>       at java.io.DataInputStream.readFully(DataInputStream.java:180)
>       at org.apache.hadoop.io.UTF8.readFields(UTF8.java:117)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSImageSerialization.readString(FSImageSerialization.java:201)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:185)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:93)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:60)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1089)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:1041)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:487)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:149)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:306)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:284)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:328)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:356)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:577)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:570)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1529)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1538)
> 2011-01-24 23:23:33,729 ERROR 
> org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.EOFException
>       at java.io.DataInputStream.readFully(DataInputStream.java:180)
>       at org.apache.hadoop.io.UTF8.readFields(UTF8.java:117)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSImageSerialization.readString(FSImageSerialization.java:201)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:185)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:93)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:60)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1089)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:1041)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:487)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:149)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:306)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:284)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:328)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:356)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:577)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:570)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1529)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1538)
> 2011-01-24 23:23:33,730 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
> SHUTDOWN_MSG: 
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at linux124/10.18.52.124
> ************************************************************/
> {code} 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to