[ https://issues.apache.org/jira/browse/HDFS-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14935973#comment-14935973 ]
Jing Zhao edited comment on HDFS-1172 at 9/29/15 9:56 PM:
----------------------------------------------------------

Thanks for updating the patch, [~iwasakims]. Comments on the latest patch:
# It is not necessary to call {{numNodes}} again in the following code. We can directly use {{numNodes}}.
{code}
int numNodes = curBlock.numNodes();
......
+      DatanodeStorageInfo[] expectedStorages =
+          curBlock.getUnderConstructionFeature().getExpectedStorageLocations();
+      if (curBlock.numNodes() < expectedStorages.length) {
{code}
# We'd better place the new "adding block to pending replica queue" logic only in {{checkReplication}}. Several reasons for this:
#* {{completeBlock}} is also called by {{forceCompleteBlock}}, which is invoked when loading edits. At this time we should not update pending replication queue since the NN is just being started.
#* {{completeBlock}} can often be called when NN has only received 1 block_received msg, updating pending replication queue at this time means later when further IBRs (incremental block reports) come we need to remove these DN from pending queue again.
#* Semantically updating pending queue is more closely coupled with updating neededReplication queue.
# Instead of making changes to {{PendingBlockInfo}}'s constructor, when updating the pending replication queue, you can prepare all the corresponding {{DatanodeDescriptor}} in an array first, and call {{pendingReplications.increment}} only once.
# Do we need to call {{computeAllPendingWork}} in {{TestReplication#pendingReplicationCount}}?
# Let's add a maximum retry count or total waiting time for {{waitForNoPendingReplication}}.

was (Author: jingzhao):
# It is not necessary to call {{numNodes}} again. We can directly use {{numNodes}}.
{code}
int numNodes = curBlock.numNodes();
......
+      DatanodeStorageInfo[] expectedStorages =
+          curBlock.getUnderConstructionFeature().getExpectedStorageLocations();
+      if (curBlock.numNodes() < expectedStorages.length) {
{code}
# We'd better place the new "adding block to pending replica queue" logic only in {{checkReplication}}. Several reasons for this:
#* {{completeBlock}} is also called by {{forceCompleteBlock}}, which is invoked when loading edits. At this time we should not update pending replication queue since the NN is just being started.
#* {{completeBlock}} can often be called when NN has only received 1 block_received msg, updating pending replication queue at this time means later when further IBRs (incremental block reports) come we need to remove these DN from pending queue again.
#* Semantically updating pending queue is more closely coupled with updating neededReplication queue.
# Instead of making changes to {{PendingBlockInfo}}'s constructor, when updating the pending replication queue, you can prepare all the corresponding {{DatanodeDescriptor}} in an array first, and call {{pendingReplications.increment}} only once.
# Do we need to call {{computeAllPendingWork}} in {{TestReplication#pendingReplicationCount}}?
# Let's add a maximum retry count or total waiting time for {{waitForNoPendingReplication}}.
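
The last review point, bounding {{waitForNoPendingReplication}} by a total waiting time, could look roughly like the sketch below. This is only an illustration under stated assumptions: the class {{BoundedWait}}, the method {{waitForZero}}, and its parameters are hypothetical names that do not appear in the patch, and the pending count is modelled as a plain {{LongSupplier}} rather than any real NameNode accessor.

```java
import java.util.function.LongSupplier;

/** Hypothetical test helper; not part of the HDFS-1172 patch. */
final class BoundedWait {
  private BoundedWait() {}

  /**
   * Polls pendingCount until it returns 0. Fails with an AssertionError
   * if the count is still non-zero once timeoutMs has elapsed, so a test
   * using it cannot hang forever.
   */
  static void waitForZero(LongSupplier pendingCount, long timeoutMs, long pollMs) {
    final long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      long pending = pendingCount.getAsLong();
      if (pending == 0) {
        return;
      }
      // Compare via subtraction so nanoTime wrap-around is handled correctly.
      if (System.nanoTime() - deadline >= 0) {
        throw new AssertionError(
            "still " + pending + " pending after " + timeoutMs + " ms");
      }
      try {
        Thread.sleep(pollMs);
      } catch (InterruptedException e) {
        // Restore the interrupt flag before failing the wait.
        Thread.currentThread().interrupt();
        throw new AssertionError("interrupted while waiting", e);
      }
    }
  }
}
```

In a test, the supplier would wrap whatever the test already uses to read the pending replication count; a retry-count bound would work the same way, with a loop counter in place of the deadline.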

> Blocks in newly completed files are considered under-replicated too quickly
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-1172
>                 URL: https://issues.apache.org/jira/browse/HDFS-1172
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 0.21.0
>            Reporter: Todd Lipcon
>            Assignee: Masatake Iwasaki
>         Attachments: HDFS-1172-150907.patch, HDFS-1172.008.patch,
> HDFS-1172.009.patch, HDFS-1172.010.patch, HDFS-1172.patch, hdfs-1172.txt,
> hdfs-1172.txt, replicateBlocksFUC.patch, replicateBlocksFUC1.patch,
> replicateBlocksFUC1.patch
>
>
> I've seen this for a long time, and imagine it's a known issue, but couldn't
> find an existing JIRA. It often happens that we see the NN schedule
> replication on the last block of files very quickly after they're completed,
> before the other DNs in the pipeline have a chance to report the new block.
> This results in a lot of extra replication work on the cluster, as we
> replicate the block and then end up with multiple excess replicas which are
> very quickly deleted.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)