[ https://issues.apache.org/jira/browse/HDFS-13284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474990#comment-16474990 ]
Chris Douglas edited comment on HDFS-13284 at 5/14/18 10:49 PM:
----------------------------------------------------------------

Can you be more explicit about the conditions where these heuristics are (in)correct? The description documents the current behavior and the proposed change, but not what it resolves.

If the problem is that blocks with only two replicas should be assigned higher priority than those with three (e.g., because you're seeing avoidable high-priority replications), would {{((curReplicas * 3) < expectedReplicas || curReplicas == 2)}} resolve this? Items added to the distributed cache with high replication may be over-prioritized, if we change the heuristic from 1:3 to 1:2.

The patch should also update the javadoc, which documents the priority assignment.


was (Author: chris.douglas):

Can you be more explicit about the conditions where these heuristics are (in)correct? The description documents the current behavior and the proposed change, but not what it resolves.

If the problem is that files with only two replicas should be assigned higher priority than those with three (e.g., because you're seeing avoidable high-priority replications), would {{((curReplicas * 3) < expectedReplicas || curReplicas == 2)}} resolve this? Items added to the distributed cache with high replication may be over-prioritized, if we change the heuristic from 1:3 to 1:2.

The patch should also update the javadoc, which documents the priority assignment.

> Adjust criteria for LowRedundancyBlocks.QUEUE_VERY_LOW_REDUNDANCY
> -----------------------------------------------------------------
>
>                 Key: HDFS-13284
>                 URL: https://issues.apache.org/jira/browse/HDFS-13284
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs, namenode
>            Reporter: Lukas Majercak
>            Assignee: Lukas Majercak
>            Priority: Major
>         Attachments: HDFS-13284.000.patch, HDFS-13284.001.patch
>
>
> LowRedundancyBlocks currently has 5 priority queues:
> QUEUE_HIGHEST_PRIORITY = 0 - reserved for last replica blocks
> QUEUE_VERY_LOW_REDUNDANCY = 1 - *if ((curReplicas * 3) < expectedReplicas)*
> QUEUE_LOW_REDUNDANCY = 2 - the rest
> QUEUE_REPLICAS_BADLY_DISTRIBUTED = 3
> QUEUE_WITH_CORRUPT_BLOCKS = 4
> The problem lies in QUEUE_VERY_LOW_REDUNDANCY. Currently, a block that has
> curReplicas=2 and expectedReplicas=4 is treated the same as a block with
> curReplicas=3 and expectedReplicas=4. A block with 2/3 replicas is also put
> into QUEUE_LOW_REDUNDANCY.
> The proposal is to change the *{{if ((curReplicas * 3) < expectedReplicas)}}*
> check to *{{if ((curReplicas * 2) <= expectedReplicas || curReplicas == 2)}}*
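
For comparison, the three conditions discussed in this thread can be put side by side. This is a minimal sketch, not the Hadoop implementation: the class name {{VeryLowRedundancyCheck}} and the method names are hypothetical, and only the single very-low-redundancy condition is modeled. Any other handling in LowRedundancyBlocks (for example the last-replica queue) is deliberately left out.

```java
public class VeryLowRedundancyCheck {

  // Current check, as quoted in the issue description.
  static boolean current(int curReplicas, int expectedReplicas) {
    return (curReplicas * 3) < expectedReplicas;
  }

  // Check proposed in the issue description.
  static boolean proposed(int curReplicas, int expectedReplicas) {
    return (curReplicas * 2) <= expectedReplicas || curReplicas == 2;
  }

  // Alternative suggested in the comment: keep the 1:3 ratio,
  // but always treat blocks with exactly two replicas as very low.
  static boolean alternative(int curReplicas, int expectedReplicas) {
    return (curReplicas * 3) < expectedReplicas || curReplicas == 2;
  }

  public static void main(String[] args) {
    // {curReplicas, expectedReplicas} pairs; the values are illustrative only.
    int[][] cases = {{2, 3}, {2, 4}, {3, 4}, {3, 10}, {5, 10}};
    System.out.println("cur/expected  current  proposed  alternative");
    for (int[] c : cases) {
      System.out.printf("%d/%d  %b  %b  %b%n",
          c[0], c[1],
          current(c[0], c[1]),
          proposed(c[0], c[1]),
          alternative(c[0], c[1]));
    }
  }
}
```

Under these definitions, a block at 2/3 or 2/4 is classified as very low by both the proposed and the alternative check, while a block at 3/4 is not under any of them. The checks differ at 5/10: only the proposed 1:2 check classifies it as very low, which matches the concern about items with high replication being over-prioritized.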