[ https://issues.apache.org/jira/browse/HDFS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14525440#comment-14525440 ]
Hadoop QA commented on HDFS-4253:
---------------------------------

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12729962/HDFS-4253.06.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 6ae2a0d |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10721/console |

This message was automatically generated.

> block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-4253
>                 URL: https://issues.apache.org/jira/browse/HDFS-4253
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 2.0.2-alpha
>            Reporter: Andy Isaacson
>            Assignee: Andy Isaacson
>         Attachments: HDFS-4253.06.patch, hdfs4253-1.txt, hdfs4253-2.txt,
> hdfs4253-3.txt, hdfs4253-4.txt, hdfs4253-5.txt, hdfs4253-6.txt, hdfs4253.txt
>
>
> When many nodes (10) read from the same block simultaneously, we get
> asymmetric distribution of read load. This can result in slow block reads
> when one replica is serving most of the readers and the other replicas are
> idle. The busy DN bottlenecks on its network link.
> This is especially visible with large block sizes and high replica counts (I
> reproduced the problem with {{-Ddfs.block.size=4294967296}} and replication
> 5), but the same behavior happens on a small scale with normal-sized blocks
> and replication=3.
> The root of the problem is in {{NetworkTopology#pseudoSortByDistance}} which
> explicitly does not try to spread traffic among replicas in a given rack --
> it only randomizes usage for off-rack replicas.
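
The issue description points at missing randomization among replicas at the same network distance. As an illustration only, here is a minimal Java sketch of one way to spread readers across equally distant replicas: shuffle the candidates, then apply a stable sort by distance. This is not the code in {{NetworkTopology}} and not the contents of the attached patch; the class name {{ReplicaOrderSketch}}, the method {{sortByDistanceWithShuffle}}, and the caller-supplied distance function are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;
import java.util.function.ToIntFunction;

// Hypothetical helper, not part of Hadoop: orders replicas by distance
// from the reader while randomizing the order of equally distant replicas.
final class ReplicaOrderSketch {
    private ReplicaOrderSketch() {}

    /**
     * Returns a new list sorted by ascending distance. Replicas that share
     * a distance appear in random relative order, so concurrent readers
     * do not all pick the same one first.
     */
    static <T> List<T> sortByDistanceWithShuffle(
            List<T> replicas, ToIntFunction<T> distance, Random rng) {
        List<T> ordered = new ArrayList<>(replicas);
        // Shuffle first so ties have no fixed order.
        Collections.shuffle(ordered, rng);
        // List.sort is stable, so the shuffled order survives within each
        // group of replicas that report the same distance.
        ordered.sort(Comparator.comparingInt(distance));
        return ordered;
    }
}
```

With this approach the closest replicas are still preferred, and only the choice among ties changes. Whether a reader should draw from a per-reader seed or a shared random source is left open here.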