[ https://issues.apache.org/jira/browse/HDFS-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135556#comment-13135556 ]

Hadoop QA commented on HDFS-2477:
---------------------------------

-1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12500747/reportDiff.patch-4
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                  org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS
                  org.apache.hadoop.hdfs.server.namenode.TestParallelImageWrite
                  org.apache.hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks
                  org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement
                  org.apache.hadoop.hdfs.server.namenode.TestSaveNamespace
                  org.apache.hadoop.hdfs.server.datanode.TestBlockReport
                  org.apache.hadoop.hdfs.server.namenode.TestEditLogJournalFailures
                  org.apache.hadoop.hdfs.TestFileCreationClient
                  org.apache.hadoop.hdfs.TestSetrepIncreasing
                  org.apache.hadoop.hdfs.TestLeaseRecovery2
                  org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure
                  org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
                  org.apache.hadoop.hdfs.TestDistributedFileSystem
                  org.apache.hadoop.hdfs.TestBlocksScheduledCounter
                  org.apache.hadoop.hdfs.TestWriteConfigurationToDFS
                  org.apache.hadoop.hdfs.server.namenode.TestEditLog

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/1442//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1442//console

This message is automatically generated.

> Optimize computing the diff between a block report and the namenode state.
> --------------------------------------------------------------------------
>
>                 Key: HDFS-2477
>                 URL: https://issues.apache.org/jira/browse/HDFS-2477
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>            Reporter: Tomasz Nykiel
>            Assignee: Tomasz Nykiel
>         Attachments: reportDiff.patch, reportDiff.patch-2, reportDiff.patch-3, reportDiff.patch-4
>
>
> When a block report is processed at the NN, BlockManager.reportDiff
> traverses all blocks contained in the report. Each block that is also
> present in the corresponding datanode descriptor is moved to the head of
> the list of blocks in that datanode descriptor.
> With HDFS-395, the vast majority of the blocks in the report are also present
> in the datanode descriptor, which means that almost every block in the report
> has to be moved to the head of the list.
> Currently this operation is performed by DatanodeDescriptor.moveBlockToHead,
> which removes a block from the list and then re-inserts it. In this process, we
> call findDatanode several times (as far as I recall, 6 times for each
> moveBlockToHead call). findDatanode is relatively expensive, since it scans the
> triplets linearly to locate the given datanode.
> With this patch, we memoize some findDatanode results, which saves 2
> findDatanode calls. Our experiments show that this can speed up reportDiff
> (which is executed under the write lock) by around 15%. Currently, with HDFS-395,
> reportDiff is responsible for almost 100% of the block report processing time.
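
For readers who want to see the idea in the issue description in isolation, here is a minimal sketch of memoizing the slot lookup during a move-to-head. It is not the attached patch: the names MoveToHeadSketch, Node, Block, findNode, moveToHeadNaive and moveToHeadMemo are hypothetical, and the three parallel arrays are a simplified stand-in for the triplets that findDatanode scans. The scan counts it produces belong to this toy model only and should not be read as measurements of HDFS.

```java
import java.util.ArrayList;
import java.util.List;

public class MoveToHeadSketch {
    public static int scans = 0; // number of linear slot scans performed so far

    /** Hypothetical stand-in for a datanode descriptor; it only tracks the list head. */
    public static final class Node {
        public Block head;
    }

    /** Hypothetical stand-in for a block with one (node, prev, next) slot per replica. */
    public static final class Block {
        public final long id;
        final Node[] nodes;
        final Block[] prev;
        final Block[] next;

        public Block(long id, int replicas) {
            this.id = id;
            nodes = new Node[replicas];
            prev = new Block[replicas];
            next = new Block[replicas];
        }

        /** Linear scan over the slots; this is the lookup the sketch tries to avoid repeating. */
        public int findNode(Node n) {
            scans++;
            for (int i = 0; i < nodes.length; i++) {
                if (nodes[i] == n) return i;
            }
            return -1;
        }
    }

    /** Inserts b at the head of n's list, using b's first free slot (assumes one exists). */
    public static void addToHead(Node n, Block b) {
        int i = b.findNode(null);
        b.nodes[i] = n;
        b.prev[i] = null;
        b.next[i] = n.head;
        if (n.head != null) n.head.prev[n.head.findNode(n)] = b;
        n.head = b;
    }

    /** Naive move: every pointer access re-scans for the slot index. */
    public static void moveToHeadNaive(Node n, Block b) {
        if (n.head == b) return;
        Block p = b.prev[b.findNode(n)]; // non-null because b is not the head
        Block x = b.next[b.findNode(n)];
        p.next[p.findNode(n)] = x;
        if (x != null) x.prev[x.findNode(n)] = p;
        b.prev[b.findNode(n)] = null;
        b.next[b.findNode(n)] = n.head;
        n.head.prev[n.head.findNode(n)] = b;
        n.head = b;
    }

    /** Memoized move: the caller passes b's slot index, so b is never re-scanned. */
    public static void moveToHeadMemo(Node n, Block b, int bIdx) {
        if (n.head == b) return;
        Block p = b.prev[bIdx];
        Block x = b.next[bIdx];
        p.next[p.findNode(n)] = x;
        if (x != null) x.prev[x.findNode(n)] = p;
        b.prev[bIdx] = null;
        b.next[bIdx] = n.head;
        n.head.prev[n.head.findNode(n)] = b;
        n.head = b;
    }

    /** Returns the block ids in list order (this walk also performs scans). */
    public static List<Long> order(Node n) {
        List<Long> ids = new ArrayList<>();
        for (Block b = n.head; b != null; b = b.next[b.findNode(n)]) ids.add(b.id);
        return ids;
    }

    public static void main(String[] args) {
        Node dn = new Node();
        Block[] blocks = new Block[3];
        for (int i = 0; i < 3; i++) {
            blocks[i] = new Block(i, 3);
            addToHead(dn, blocks[i]);
        }
        scans = 0;
        moveToHeadNaive(dn, blocks[0]);
        int naive = scans;
        int idx = blocks[1].findNode(dn); // lookup the caller is assumed to need anyway
        scans = 0;
        moveToHeadMemo(dn, blocks[1], idx);
        System.out.println(order(dn) + " naive scans=" + naive + " memo scans=" + scans);
    }
}
```

Both variants leave the list in the same state; the only difference is how often the linear lookup runs. The sketch assumes the caller already knows the block's slot index at the point where it decides to move the block, which is what makes passing it down cheaper than looking it up again.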