[ 
https://issues.apache.org/jira/browse/HDFS-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202644#comment-13202644
 ] 

Bikas Saha commented on HDFS-2909:
----------------------------------

Repro steps:
1) Start 2 NNs in an active/standby configuration
2) Remove write permissions from the shared edits dir
3) Upon a log roll triggered by the standby, the active gets an error when 
finalizing the edit logs
4) The exception is caught far up the stack and the error does not get 
reported against the bad shared edits dir

This happens because error reporting happens when FSImage.rollEditLogs() calls 
storage.writeTransactionIdFileToStorage(), which is called after 
FSEdit.rollEditLogs(). The error in FSEdit.rollEditLogs() raises an exception 
that is not handled in FSImage.rollEditLogs(), so 
storage.writeTransactionIdFileToStorage() never gets called and no error is 
reported. The bad directory remains in FSImage.storage.
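
To make the control flow concrete, here is a minimal standalone sketch of the 
ordering problem. It is not the Hadoop code: the class RollSketch, its methods, 
and the directory names are all hypothetical stand-ins, and the assumption that 
the reporting step is what drops failed directories is a simplification of the 
behaviour described above.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class RollSketch {
    // Hypothetical stand-in for the directory list held in FSImage.storage.
    static final List<String> storageDirs = new ArrayList<>();
    // Hypothetical record of directories that failed during the roll.
    static final List<String> failedDirs = new ArrayList<>();

    // Stand-in for the edit log roll: fails on the unwritable shared dir.
    static void rollEditLog() throws IOException {
        for (String dir : storageDirs) {
            if (dir.startsWith("shared")) {
                failedDirs.add(dir);
                throw new IOException("cannot finalize edit log in " + dir);
            }
        }
    }

    // Stand-in for the step that reports errors; assumed here to be the
    // only place where failed directories are dropped from storage.
    static void writeTransactionIdAndReportErrors() {
        storageDirs.removeAll(failedDirs);
        failedDirs.clear();
    }

    // Mirrors the described ordering: the roll throws, so the reporting
    // step on the next line is never reached.
    static void rollAsDescribed() throws IOException {
        rollEditLog();
        writeTransactionIdAndReportErrors();
    }

    // Runs the scenario and returns the directories still in storage.
    static List<String> run() {
        storageDirs.clear();
        failedDirs.clear();
        storageDirs.add("local-edits");
        storageDirs.add("shared-edits");
        try {
            rollAsDescribed();
        } catch (IOException e) {
            // Caught by a caller far up the stack; nothing is reported
            // against the failed directory at this point.
        }
        return new ArrayList<>(storageDirs);
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [local-edits, shared-edits]
    }
}
```

In this sketch the failed directory is still listed after the roll, because 
the exception skips the only step that would have removed it.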
                
> HA: Inaccessible shared edits dir not getting removed from FSImage storage 
> dirs upon error
> ------------------------------------------------------------------------------------------
>
>                 Key: HDFS-2909
>                 URL: https://issues.apache.org/jira/browse/HDFS-2909
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
