[ 
https://issues.apache.org/jira/browse/HDFS-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881425#action_12881425
 ] 

Konstantin Shvachko commented on HDFS-1221:
-------------------------------------------

If I understand correctly, this is about name-node startup failure when the 
image or edits files become corrupted: the name-node detects the corruption and 
fails even though other directories hold good images and edits, right? I think 
this works as designed. We want admins to know that something went wrong 
with the directories in bad condition rather than have the name-node start 
silently. Admins may choose to change the configuration manually, replace 
drives, or do something else, and then restart the name-node.
Did you check HDFS-955, which I believe fixed similar issues? Since you do not 
provide test cases, it is really hard to understand exactly which failure 
condition you are talking about. Are you planning to contribute your Failure 
Testing Service framework?

> NameNode unable to start due to stale edits log after a crash
> -------------------------------------------------------------
>
>                 Key: HDFS-1221
>                 URL: https://issues.apache.org/jira/browse/HDFS-1221
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.20.1
>            Reporter: Thanh Do
>
> - Summary: 
> If a crash happens during FSEditLog.createEditLogFile(), the
> edits log file on disk may be stale. On the next restart, the NameNode 
> gets an exception when parsing the edits file because of the stale data, 
> and the restart fails.
> Note: This is just one example. Since the edits log (and fsimage)
> do not have checksums, they are vulnerable to corruption too.
>  
> - Details:
> The steps to create a new edits log (which we infer from the HDFS code) are:
> 1) truncate the file to zero size
> 2) write FSConstants.LAYOUT_VERSION to the buffer
> 3) append the end-of-file marker OP_INVALID to the end of the buffer
> 4) preallocate 1MB of data and fill it with zeros
> 5) flush the buffer to disk
>  
> Note that the data on disk actually changes only in steps 1, 4, and 5.
> Now, suppose a crash happens after step 4 but before step 5.
> On the next restart, the NameNode reads this edits log file (which contains
> all zeros). The first thing parsed is the LAYOUT_VERSION, which is 0. This is OK,
> because the NameNode has code to handle that case
> (although we would expect LAYOUT_VERSION to be -18). 
> Next it parses the operation code, which happens to be 0. Unfortunately, since 0
> is the value for OP_ADD, the NameNode expects the parameters corresponding 
> to that operation. The NameNode then calls readString to read the path, which 
> throws an exception, leading to a failed restart.
> This bug was found by our Failure Testing Service framework:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
> For questions, please email us: Thanh Do (than...@cs.wisc.edu) and 
> Haryadi Gunawi (hary...@eecs.berkeley.edu)
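
The crash window described in the quoted report can be illustrated with a small, self-contained Java sketch. It is not the HDFS code: the class and method names are hypothetical, the "file" is an in-memory byte array, the preallocation is shrunk from 1MB to 1KB, and the value -1 for OP_INVALID is an assumption (the report only gives -18 for LAYOUT_VERSION and 0 for OP_ADD). The sketch only shows how a zero-filled file is read back as layout version 0 followed by an OP_ADD record, instead of as an empty log; it does not model the later parameter parsing that, per the report, throws.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical toy model of the edits-log header; not the real FSEditLog.
class StaleEditsSketch {
    static final int LAYOUT_VERSION = -18;  // value quoted in the report
    static final byte OP_INVALID = -1;      // assumed end-of-file marker value
    static final byte OP_ADD = 0;           // value quoted in the report
    static final int PREALLOC_BYTES = 1024; // stands in for the 1MB preallocation

    // File contents once all five steps have completed:
    // version header, end-of-file marker, then zero padding.
    static byte[] completeFile() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(LAYOUT_VERSION);
        out.writeByte(OP_INVALID);
        out.write(new byte[PREALLOC_BYTES]);
        out.flush();
        return bytes.toByteArray();
    }

    // File contents after a crash between steps 4 and 5:
    // only the zero-filled preallocation reached the disk.
    static byte[] crashedFile() {
        return new byte[PREALLOC_BYTES];
    }

    // Reads the version and the first opcode the way a log loader would.
    static String firstRecord(byte[] file) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(file));
        int version = in.readInt();
        byte opcode = in.readByte();
        if (opcode == OP_INVALID) {
            return "version=" + version + " end-of-log";
        }
        if (opcode == OP_ADD) {
            return "version=" + version + " OP_ADD";
        }
        return "version=" + version + " opcode=" + opcode;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(firstRecord(completeFile())); // version=-18 end-of-log
        System.out.println(firstRecord(crashedFile()));  // version=0 OP_ADD
    }
}
```

In this toy model the loader cannot tell preallocated zeros from a real OP_ADD record, which is the ambiguity the report points at.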

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
