[ 
https://issues.apache.org/jira/browse/HDFS-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207718#comment-13207718
 ] 

Hudson commented on HDFS-2815:
------------------------------

Integrated in Hadoop-Mapreduce-trunk #990 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/990/])
    HDFS-2815. Namenode sometimes does not come out of safemode during NN crash 
+ restart. Contributed by Uma Maheswara Rao. (Revision 1243673)

     Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1243673
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java

                
> Namenode is not coming out of safemode when we perform (NN crash + restart). 
> Also FSCK report shows blocks missed.
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-2815
>                 URL: https://issues.apache.org/jira/browse/HDFS-2815
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.22.0, 0.24.0, 0.23.1, 1.0.0, 1.1.0
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
>            Priority: Critical
>             Fix For: 0.24.0, 0.23.2
>
>         Attachments: HDFS-2815.patch, HDFS-2815.patch
>
>
> While testing HA (internal) with continuous switching at roughly 5-minute 
> intervals, I found some *blocks missing*, and the namenode went into safemode 
> after the next switch.
>    
>    After analysis, I found that these files had already been deleted by 
> clients, but I don't see any delete command logs in the namenode log files. 
> The namenode had nevertheless added those blocks to invalidateSets, and the 
> DNs deleted the blocks.
>    When the namenode restarted, it went into safemode and expected more 
> blocks before it would come out of safemode.
>    The reason could be that the file was deleted in memory and its blocks 
> were added to invalidates, and only after this did the namenode try to sync 
> the edits to the editlog file. By that time the NN had already asked the DNs 
> to delete those blocks. The namenode then shut down before persisting to the 
> editlog (log behind).
>    For this reason we may not get the INFO logs about the delete, and when we 
> restart the Namenode (in my scenario it is again a switch), the Namenode 
> still expects these deleted blocks, as the delete request was never persisted 
> to the editlog.
>    I reproduced this scenario with debug points. *I feel we should not add 
> the blocks to invalidates before persisting to the Editlog*. 
>     Note: for the switch, we used kill -9 (force kill).
>   I am currently on version 0.20.2. The same was verified on 0.23 as well, in 
> a normal crash + restart scenario.
>  
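
The ordering problem described in the report above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the committed patch and not real HDFS code: `DeleteOrderingSketch`, `ToyNameNode`, and all of their methods are hypothetical names, each file is assumed to own a single block named after the file, and the `crashBeforeSync` flag stands in for the kill -9 mentioned in the report.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model; none of these names are real HDFS classes.
final class DeleteOrderingSketch {

    /** Toy namenode: in-memory namespace, durable edit log, and invalidated blocks. */
    static final class ToyNameNode {
        final Set<String> namespace = new HashSet<>();          // in-memory state, lost on crash
        final List<String> durableEditLog = new ArrayList<>();  // survives a crash
        final Set<String> invalidated = new HashSet<>();        // blocks the DNs were told to delete

        void create(String file) {
            namespace.add(file);
            durableEditLog.add("ADD " + file);
        }

        /** Order described in the report: invalidate first, persist the delete afterwards. */
        void deleteInvalidateFirst(String file, boolean crashBeforeSync) {
            namespace.remove(file);
            invalidated.add(file);               // DNs may delete the block from here on
            if (crashBeforeSync) {
                return;                          // simulated kill -9: delete never reaches the log
            }
            durableEditLog.add("DELETE " + file);
        }

        /** Order suggested in the report: persist the delete, then invalidate. */
        void deleteSyncFirst(String file, boolean crashBeforeSync) {
            namespace.remove(file);
            if (crashBeforeSync) {
                return;                          // simulated kill -9: nothing was invalidated yet
            }
            durableEditLog.add("DELETE " + file);
            invalidated.add(file);
        }

        /** Files a restarted namenode expects blocks for, rebuilt from the edit log only. */
        Set<String> expectedAfterRestart() {
            Set<String> files = new HashSet<>();
            for (String op : durableEditLog) {
                String[] parts = op.split(" ", 2);
                if (parts[0].equals("ADD")) {
                    files.add(parts[1]);
                } else {
                    files.remove(parts[1]);
                }
            }
            return files;
        }

        /** Expected files whose blocks the datanodes have already deleted. */
        Set<String> missingAfterRestart() {
            Set<String> missing = expectedAfterRestart();
            missing.retainAll(invalidated);
            return missing;
        }
    }
}
```

In this model, a crash during `deleteInvalidateFirst` leaves the file in the replayed namespace while its block is already gone, which corresponds to the missing blocks and the stuck safemode in the report. A crash during `deleteSyncFirst` leaves the file and its block both intact, so the restarted toy namenode has nothing missing.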

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
