[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15386922#comment-15386922 ]
Vinitha Reddy Gankidi commented on HDFS-10301:
----------------------------------------------

Thanks for the review [~liuml07]. I have attached a new patch (012) that addresses your comments.

> FSImage#isUpgradeFinalized is not volatile and
> nn.getFSImage().isUpgradeFinalized() is not holding the read lock in
> NameNodeRpcServer#blockReport(). Is this a problem? This is not very related
> to this issue though.

My patch does not make any changes to the isUpgradeFinalized method. If this is a problem, we should open another JIRA to address it.

> If you're going to process exceptions thrown by the task, I think we don't
> need to return it explicitly, as Callable.call() is permitted to throw
> checked exceptions.

Thanks for the good suggestion! I have modified Callable.call() to return a DatanodeCommand and throw IOException. I don't explicitly catch the exception since JUnit will take care of it. A minimal sketch of this pattern is at the end of this message.

> I think we need to interpret the return value of future.get()?

future.get() returns a DatanodeCommand, which we don't care about and don't need to interpret.

> Do you mean Assert.assertArrayEquals(storageInfos,
> dnDescriptor.getStorageInfos());

Yes, thanks for that! I have made the change.

> We should add javadoc for STORAGE_REPORT as it's not that straightforwardly
> defined in the BlockListAsLongs abstract class.

Added the doc.

> assert (blockList.getNumberOfBlocks() == -1); I believe we don't need to use
> an assert statement along with Assert.assertEquals()?

I changed the assert to Assert.assertEquals. However, the existing test does use assert as well: {{assert(numBlocksReported >= expectedTotalBlockCount);}}

> Always use slf4j placeholders in the code as you are doing in the latest
> patch.

Thanks for the tip! I noticed that placeholders were not used consistently. I tried to maintain the logging style that was already used in that particular file. I have modified all the log messages in my patch to use placeholders wherever possible. slf4j was not used in some places, for instance in TestNameNodePrunesMissingStorages. A short placeholder example is included at the end of this message.

> I see unnecessary blank lines in the v11 patch. I see unaddressed long-line
> checkstyle warnings in BlockManager.

I noticed two blank lines in TestNameNodePrunesMissingStorages in the v11 patch. I removed them. I do not see any checkstyle warnings.

> if (nn.getFSImage().isUpgradeFinalized() &&
>     context.getTotalRpcs() == context.getCurRpc() + 1) {
>   Set<String> storageIDsInBlockReport = new HashSet<>();

Combined as suggested. A sketch of the combined check is at the end of this message.

> BPServiceActor.java: Let's make cmd final.

Since cmd was not final previously, I have left it unchanged.


> BlockReport retransmissions may lead to storages falsely being declared
> zombie if storage report processing happens out of order
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-10301
>                 URL: https://issues.apache.org/jira/browse/HDFS-10301
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.6.1
>            Reporter: Konstantin Shvachko
>            Assignee: Vinitha Reddy Gankidi
>            Priority: Critical
>         Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch,
> HDFS-10301.004.patch, HDFS-10301.005.patch, HDFS-10301.006.patch,
> HDFS-10301.007.patch, HDFS-10301.008.patch, HDFS-10301.009.patch,
> HDFS-10301.01.patch, HDFS-10301.010.patch, HDFS-10301.011.patch,
> HDFS-10301.sample.patch, zombieStorageLogs.rtf
>
>
> When the NameNode is busy, a DataNode can time out sending a block report. Then it
> sends the block report again.
> The NameNode, while processing these two reports at the same time, can
> interleave processing of storages from different reports. This screws up the
> blockReportId field, which makes the NameNode think that some storages are
> zombie. Replicas from zombie storages are immediately removed, causing
> missing blocks.
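As referenced above, here is a minimal sketch of the Callable pattern discussed in the review: call() declares the checked IOException instead of catching it, returns the DatanodeCommand, and future.get() is only used to wait for completion and surface failures. The sendBlockReport() helper is a hypothetical stand-in, not code from the patch.

{code:java}
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import org.apache.hadoop.hdfs.server.protocol.DatanodeCommand;

public class BlockReportCallableSketch {

  // Hypothetical stand-in for whatever the real test invokes to send the report.
  static DatanodeCommand sendBlockReport() throws IOException {
    return null;
  }

  public static void main(String[] args) throws Exception {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    try {
      // Callable.call() is allowed to throw checked exceptions, so IOException
      // does not need to be caught and returned explicitly; it simply
      // propagates and fails the test when future.get() is called.
      Callable<DatanodeCommand> reportTask = new Callable<DatanodeCommand>() {
        @Override
        public DatanodeCommand call() throws IOException {
          return sendBlockReport();
        }
      };
      Future<DatanodeCommand> future = executor.submit(reportTask);
      // The returned DatanodeCommand itself is not interpreted; get() only
      // waits for the task and rethrows any exception it raised.
      future.get();
    } finally {
      executor.shutdown();
    }
  }
}
{code}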
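On the slf4j point, the placeholder style mentioned above looks like the following; the logger name, message text, and variables are illustrative only, not copied from the patch.

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class PlaceholderLoggingSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(PlaceholderLoggingSketch.class);

  public static void main(String[] args) {
    String storageId = "DS-example";
    int numBlocks = 42;
    // Placeholders defer argument formatting until the message is actually logged.
    LOG.info("Processed report for storage {} containing {} blocks",
        storageId, numBlocks);
    // Equivalent concatenation style that the placeholders replace:
    // LOG.info("Processed report for storage " + storageId + " containing "
    //     + numBlocks + " blocks");
  }
}
{code}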
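Finally, a rough sketch of the combined check suggested in the review. As I understand the patch, the storage IDs seen in a block report are only collected on the last RPC of the report and only when the upgrade is finalized. The boolean and counters below are hypothetical stand-ins for nn.getFSImage().isUpgradeFinalized() and the block-report context fields; the body is reduced to a comment.

{code:java}
import java.util.HashSet;
import java.util.Set;

public class LastRpcCheckSketch {
  public static void main(String[] args) {
    // Stand-ins for the NameNode state and block-report context in the real code.
    boolean upgradeFinalized = true;
    int totalRpcs = 3;
    int curRpc = 2;

    // Both conditions are combined into a single check, as suggested:
    // only act on the last RPC of the report, and only if the upgrade is finalized.
    if (upgradeFinalized && totalRpcs == curRpc + 1) {
      Set<String> storageIDsInBlockReport = new HashSet<>();
      // ... collect the storage IDs reported in this block report and compare
      // them against the DataNode's known storages (omitted in this sketch).
      System.out.println("Last RPC of the report; storages seen: "
          + storageIDsInBlockReport.size());
    }
  }
}
{code}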