[ https://issues.apache.org/jira/browse/HDFS-2486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181450#comment-14181450 ]
Hudson commented on HDFS-2486:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1935 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1935/])
Move HDFS-2486 down to 2.7.0 in CHANGES.txt (wang: rev 08457e9e57e4fa3c83217fd0a092e926ba7eb135)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

> Review issues with UnderReplicatedBlocks
> ----------------------------------------
>
>                 Key: HDFS-2486
>                 URL: https://issues.apache.org/jira/browse/HDFS-2486
>             Project: Hadoop HDFS
>          Issue Type: Task
>          Components: namenode
>    Affects Versions: 0.23.0
>            Reporter: Steve Loughran
>            Assignee: Uma Maheswara Rao G
>            Priority: Minor
>             Fix For: 2.7.0
>
>         Attachments: HDFS-2486.patch
>
>
> Here are some things I've noted in the UnderReplicatedBlocks class that someone else should review to decide whether the code is correct. If it is not, the issues are easy to fix.
>
> remove(Block block, int priLevel) is not synchronized, and as the inner classes are not either, there is a risk of race conditions there.
>
> Some of the code assumes that getPriority can return the value LEVEL and, if so, does not attempt to queue the blocks. As this return value is not currently possible, those checks can be removed.
>
> The queue gives priority to blocks whose replication count is less than a third of their expected count over those that are "normally under-replicated". While this is good for ensuring that files scheduled for large replication are replicated quickly, it may not be the best strategy for maintaining data integrity. For that it may be better to give blocks that have only two replicas priority over blocks that may, for example, already have 3 out of 10 copies in the filesystem.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
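The priority tradeoff described in the report can be sketched as follows. This is a hypothetical illustration with made-up names (`ReplicationPriority`, `getPriorityByAbsoluteCount`), not the actual Hadoop UnderReplicatedBlocks code:

```java
// Hypothetical sketch of the two priority policies discussed in HDFS-2486.
// Lower return value = more urgent. Not the real Hadoop implementation.
public class ReplicationPriority {

    // Current policy (as described in the issue): a block whose replica
    // count is below a third of its expected count is treated as "very
    // under-replicated" and outranks "normally under-replicated" blocks.
    static int getPriority(int curReplicas, int expectedReplicas) {
        if (curReplicas <= 0) {
            return 0; // no live replicas at all: most urgent
        } else if (curReplicas * 3 < expectedReplicas) {
            return 1; // less than a third of expected copies
        } else {
            return 2; // normally under-replicated
        }
    }

    // Alternative suggested in the issue: rank by absolute replica count,
    // so a block with only 2 copies outranks one that already has
    // 3 of its expected 10.
    static int getPriorityByAbsoluteCount(int curReplicas, int expectedReplicas) {
        if (curReplicas <= 0) {
            return 0;
        } else if (curReplicas <= 2) {
            return 1; // dangerously few physical copies
        } else {
            return 2;
        }
    }
}
```

Under the current policy a block with 3 of 10 expected replicas lands in the urgent queue (3 < 10/3 is false, but 3*3 < 10 holds), while a block with 2 of 3 replicas does not; the alternative policy reverses that ordering, which is the data-integrity argument made above.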