[ https://issues.apache.org/jira/browse/HDFS-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202644#comment-13202644 ]
Bikas Saha commented on HDFS-2909:
----------------------------------

Repro steps:
1) Start 2 NNs in active/standby mode.
2) Remove write permissions from the shared edits dir.
3) Upon a log roll triggered by the standby, the active gets an error when finalizing the edit logs.
4) The exception is caught way up the stack, and the error does not get reported against the bad shared edits dir.

This happens because error reporting occurs when FSImage.rollEditLogs() calls storage.writeTransactionIdFileToStorage(), which is called after FSEdit.rollEditLogs(). The error in FSEdit.rollEditLogs() raises an exception that is not handled in FSImage.rollEditLogs(), so storage.writeTransactionIdFileToStorage() never gets called and no error is reported. The bad directory therefore remains in FSImage.storage.

> HA: Inaccessible shared edits dir not getting removed from FSImage storage dirs upon error
> ------------------------------------------------------------------------------------------
>
>                 Key: HDFS-2909
>                 URL: https://issues.apache.org/jira/browse/HDFS-2909
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
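
The call ordering described in the comment can be modeled with a small, self-contained sketch. This is not the Hadoop code: RollEditLogSketch, Storage, rollEditLog, rollAsReported, runScenario and the directory paths are all hypothetical stand-ins, chosen only to show how an exception thrown by the first call skips the second call, which is the one that would have reported and dropped the bad directory.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model only: none of these types are the real Hadoop classes.
class RollEditLogSketch {

    /** Stand-in for FSImage.storage: tracks directories and which ones were reported bad. */
    static class Storage {
        final List<String> dirs = new ArrayList<>();
        final List<String> reportedBad = new ArrayList<>();

        /** Stand-in for writeTransactionIdFileToStorage(): the step that notices and drops bad dirs. */
        void writeTransactionIdFile(List<String> unwritable) {
            // Iterate over a copy so that removing from dirs is safe.
            for (String dir : new ArrayList<>(dirs)) {
                if (unwritable.contains(dir)) {
                    reportedBad.add(dir);
                    dirs.remove(dir);
                }
            }
        }
    }

    /** Stand-in for the edit log roll: fails outright on the first unwritable dir. */
    static void rollEditLog(Storage storage, List<String> unwritable) throws IOException {
        for (String dir : storage.dirs) {
            if (unwritable.contains(dir)) {
                throw new IOException("cannot finalize edit log in " + dir);
            }
        }
    }

    /** Call order described in the comment: the roll first, error reporting second. */
    static void rollAsReported(Storage storage, List<String> unwritable) throws IOException {
        rollEditLog(storage, unwritable);            // throws for the bad shared dir
        storage.writeTransactionIdFile(unwritable);  // skipped, so nothing is reported
    }

    /** Runs the scenario and returns the storage state afterwards. */
    static Storage runScenario() {
        Storage storage = new Storage();
        storage.dirs.add("/local/name");    // hypothetical healthy dir
        storage.dirs.add("/shared/edits");  // hypothetical shared edits dir without write permission
        List<String> unwritable = Collections.singletonList("/shared/edits");
        try {
            rollAsReported(storage, unwritable);
        } catch (IOException e) {
            // Models the exception being caught far up the stack.
        }
        return storage;
    }

    public static void main(String[] args) {
        Storage after = runScenario();
        System.out.println("dirs still tracked: " + after.dirs);
        System.out.println("dirs reported bad:  " + after.reportedBad);
    }
}
```

In this model the unwritable directory is still tracked after the failed roll and the list of reported directories is empty, which mirrors the symptom in the issue title.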