[ https://issues.apache.org/jira/browse/HDFS-1263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881970#action_12881970 ]
Todd Lipcon commented on HDFS-1263:
-----------------------------------

I think I actually fixed this with a patch in HDFS-1260 - the problems turned out to be the same. Can you take a look at that patch and let me know what you think?

> 0.20: in tryUpdateBlock, the meta file is renamed away before genstamp validation is done
> -----------------------------------------------------------------------------------------
>
>                 Key: HDFS-1263
>                 URL: https://issues.apache.org/jira/browse/HDFS-1263
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.20-append
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.20-append
>
>
> Saw an issue where multiple datanodes are trying to recover at the same time,
> and all of them failed. I think the issue is in FSDataset.tryUpdateBlock, we
> do the rename of blk_B_OldGS to blk_B_OldGS_tmpNewGS and *then* check that
> the generation stamp is moving upwards. Because of this, invalid update block
> calls are blocked, but they then cause future updateBlock calls to fail with
> "Meta file not found" errors.
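
To make the ordering problem in the description concrete, here is a minimal sketch of the "validate first, rename second" order. It is not the HDFS-1260 patch and not the real FSDataset code: the class name, the method signatures, and the exact meta file naming (blk_<id>_<genstamp>.meta) are assumptions made for illustration only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for the datanode logic; not the real FSDataset.
class UpdateBlockSketch {

    // Assumed naming scheme for a block's meta file.
    static Path metaFile(Path dir, long blockId, long genStamp) {
        return dir.resolve("blk_" + blockId + "_" + genStamp + ".meta");
    }

    // Returns true if the meta file was moved to the new generation stamp,
    // false if the update was rejected. A rejected call leaves the old meta
    // file in place, so a later valid call can still find it.
    static boolean tryUpdateBlock(Path dir, long blockId, long oldGS, long newGS)
            throws IOException {
        Path oldMeta = metaFile(dir, blockId, oldGS);
        if (!Files.exists(oldMeta)) {
            throw new IOException("Meta file not found: " + oldMeta);
        }

        // Check the generation stamp BEFORE touching any file.
        if (newGS <= oldGS) {
            return false;
        }

        // Only now rename: old meta -> temporary name -> final name.
        Path tmpMeta = dir.resolve(oldMeta.getFileName() + "_tmp" + newGS);
        Files.move(oldMeta, tmpMeta);
        Files.move(tmpMeta, metaFile(dir, blockId, newGS));
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("blocks");
        Files.createFile(metaFile(dir, 1L, 5L));

        // Invalid call: genstamp moves backwards, nothing is renamed.
        System.out.println("stale update accepted: " + tryUpdateBlock(dir, 1L, 5L, 4L));
        // Valid call afterwards still finds the meta file.
        System.out.println("valid update accepted: " + tryUpdateBlock(dir, 1L, 5L, 6L));
    }
}
```

With the check placed after the first rename instead, the rejected call would leave the meta file under its temporary name, which matches the "Meta file not found" failures reported for the later updateBlock calls.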