[ https://issues.apache.org/jira/browse/HDFS-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730813#comment-13730813 ]
Hudson commented on HDFS-4366: ------------------------------ FAILURE: Integrated in Hadoop-Mapreduce-trunk #1510 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1510/]) Fix CHANGES.txt. Move HDFS-4366 from 2.3.0 to under 3.0.0. (kihwal: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1510727) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt HDFS-4366. Block Replication Policy Implementation May Skip Higher-Priority Blocks for Lower-Priority Blocks. Contributed by Derek Dagit. (kihwal: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1510724) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightLinkedSet.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestLightWeightLinkedSet.java > Block Replication Policy Implementation May Skip Higher-Priority Blocks for > Lower-Priority Blocks > ------------------------------------------------------------------------------------------------- > > Key: HDFS-4366 > URL: https://issues.apache.org/jira/browse/HDFS-4366 > Project: Hadoop HDFS > Issue Type: Bug > Affects Versions: 3.0.0 > Reporter: Derek Dagit > Assignee: Derek Dagit > Fix For: 3.0.0 > > Attachments: HDFS-4366.patch, HDFS-4366.patch, HDFS-4366.patch, > HDFS-4366.patch, HDFS-4366.patch, HDFS-4366.patch, hdfs-4366-unittest.patch > > > In certain cases, higher-priority under-replicated blocks can be skipped by > the replication policy implementation. The current implementation maintains, > for each priority level, an index into a list of blocks that are > under-replicated. Together, the lists compose a priority queue (see note > later about branch-0.23). In some cases when blocks are removed from a list, > the caller (BlockManager) properly handles the index into the list from which > it removed a block. In some other cases, the index remains stationary while > the list changes. Whenever this happens, and the removed block happened to > be at or before the index, the implementation will skip over a block when > selecting blocks for replication work. > In situations when entire racks are decommissioned, leading to many > under-replicated blocks, loss of blocks can occur. > Background: HDFS-1765 > This patch to trunk greatly improved the state of the replication policy > implementation. Prior to the patch, the following details were true: > * The block "priority queue" was no such thing: It was really set of > trees that held blocks in natural ordering, that being by the blocks ID, > which resulted in iterator walks over the blocks in pseudo-random order. > * There was only a single index into an iteration over all of the > blocks... > * ... meaning the implementation was only successful in respecting > priority levels on the first pass. Overall, the behavior was a > round-robin-type scheduling of blocks. > After the patch > * A proper priority queue is implemented, preserving log n operations > while iterating over blocks in the order added. > * A separate index for each priority is key is kept... > * ... allowing for processing of the highest priority blocks first > regardless of which priority had last been processed. > The change was suggested for branch-0.23 as well as trunk, but it does not > appear to have been pulled in. > The problem: > Although the indices are now tracked in a better way, there is a > synchronization issue since the indices are managed outside of methods to > modify the contents of the queue. > Removal of a block from a priority level without adjusting the index can mean > that the index then points to the block after the block it originally pointed > to. In the next round of scheduling for that priority level, the block > originally pointed to by the index is skipped. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira