[ https://issues.apache.org/jira/browse/HDFS-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13792996#comment-13792996 ]
Tsz Wo (Nicholas), SZE commented on HDFS-5283:
----------------------------------------------

Vinay, thanks for working on this. Some comments:

The new method added to Namesystem would be better if it
# took a BlockInfoUnderConstruction as its parameter,
# was named isInSnapshot, and
# did not throw IOException.

i.e.
{code}
//Namesystem.java
  public boolean isInSnapshot(BlockInfoUnderConstruction block);
{code}
The implementation in FSNamesystem should try-catch the UnresolvedLinkException and log it as an error, since the full path obtained from a file should not contain an unresolved link.

Second question: why add DFSTestUtil.abortStream(..)? It does not look very useful.

> NN not coming out of startup safemode due to under construction blocks only inside snapshots also counted in safemode threshhold
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-5283
>                 URL: https://issues.apache.org/jira/browse/HDFS-5283
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: snapshots
>    Affects Versions: 3.0.0, 2.1.1-beta
>            Reporter: Vinay
>            Assignee: Vinay
>            Priority: Blocker
>         Attachments: HDFS-5283.000.patch, HDFS-5283.patch, HDFS-5283.patch, HDFS-5283.patch
>
>
> This was observed in one of our environments:
> 1. An MR job was running; it had created some temporary files and was writing to them.
> 2. A snapshot was taken.
> 3. The job was killed and the temporary files were deleted.
> 4. The Namenode was restarted.
> 5. After the restart, the Namenode stayed in safemode, waiting for blocks.
> Analysis
> ---------
> 1. The snapshot also included the temporary files that were still open; the original files were deleted later.
> 2. The under-construction (UC) block count was taken from leases, so UC blocks that exist only inside snapshots were not considered.
> 3. As a result, the safemode threshold count was too high and the NN did not come out of safemode.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
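
For illustration only, the analysis in the quoted description can be modeled in a few lines of Java. This is a simplified sketch, not the HDFS-5283 patch: the class SafeModeThresholdSketch, its methods, the 0.999 threshold fraction, and the sample block counts are all hypothetical, and the real safemode accounting in FSNamesystem is more involved. It assumes that safemode waits for a fraction of the blocks it expects to be complete, and that UC blocks are meant to be subtracted from that expectation.

```java
public class SafeModeThresholdSketch {

    /** Number of reported blocks safemode waits for (hypothetical helper). */
    static long blockThreshold(long expectedBlocks, double thresholdPct) {
        return (long) (expectedBlocks * thresholdPct);
    }

    /**
     * Returns true if enough blocks have been reported to leave safemode.
     *
     * @param totalBlocks           all blocks known to the namespace, snapshots included
     * @param ucWithLease           UC blocks found through leases
     * @param ucOnlyInSnapshot      UC blocks of deleted files kept alive only by a snapshot
     * @param reportedBlocks        complete blocks reported so far
     * @param thresholdPct          assumed threshold fraction, e.g. 0.999
     * @param excludeSnapshotOnlyUc false models the reported bug, true models the intended behavior
     */
    static boolean canLeaveSafeMode(long totalBlocks, long ucWithLease, long ucOnlyInSnapshot,
                                    long reportedBlocks, double thresholdPct,
                                    boolean excludeSnapshotOnlyUc) {
        // UC blocks with a lease are always subtracted from the expectation.
        long expected = totalBlocks - ucWithLease;
        if (excludeSnapshotOnlyUc) {
            // Snapshot-only UC blocks have no lease, so they must be subtracted explicitly.
            expected -= ucOnlyInSnapshot;
        }
        return reportedBlocks >= blockThreshold(expected, thresholdPct);
    }

    public static void main(String[] args) {
        // Sample numbers (made up): 100 blocks, 2 UC with a lease, 3 UC only in a snapshot,
        // so at most 95 complete blocks can ever be reported.
        System.out.println("snapshot-only UC counted:  "
                + canLeaveSafeMode(100, 2, 3, 95, 0.999, false));
        System.out.println("snapshot-only UC excluded: "
                + canLeaveSafeMode(100, 2, 3, 95, 0.999, true));
    }
}
```

With these sample numbers, counting the snapshot-only UC blocks gives a threshold of 97 that the 95 reportable blocks can never reach, while excluding them gives a threshold of 94.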