[ 
https://issues.apache.org/jira/browse/HDFS-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-6268:
------------------------------

    Attachment: hdfs-6268-1.patch

Here's a patch that adds some randomization to pseudoSortByDistance when
no local replica is found. We seed the RNG with the block ID so the choice is
deterministic per block, which should hopefully give better page cache
behavior; this is similar to how the first rack-local replica was previously
always chosen.
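
As a rough illustration of the idea above (this is not the code from
hdfs-6268-1.patch; the class name ReplicaOrderSketch, the method
randomizeFirst, and the datanode names are made up for the example), seeding
java.util.Random with the block ID picks the same "first" replica every time
for a given block, while different blocks tend to pick different replicas:

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch only; the real change lives in
// NetworkTopology#pseudoSortByDistance and may look quite different.
class ReplicaOrderSketch {

    /**
     * Returns a copy of {@code nodes} with one pseudo-randomly chosen entry
     * swapped to the front. The RNG is seeded with the block ID, so the same
     * block always yields the same ordering.
     */
    static String[] randomizeFirst(String[] nodes, long blockId) {
        String[] sorted = Arrays.copyOf(nodes, nodes.length);
        if (sorted.length > 1) {
            Random rng = new Random(blockId);      // deterministic per block
            int pick = rng.nextInt(sorted.length); // index in [0, length)
            String tmp = sorted[0];
            sorted[0] = sorted[pick];
            sorted[pick] = tmp;
        }
        return sorted;
    }

    public static void main(String[] args) {
        // Assumed replica list for a block with no local node.
        String[] replicas = {"dn1", "dn2", "dn3"};
        for (long blockId = 1000; blockId < 1005; blockId++) {
            System.out.println(blockId + " -> "
                + Arrays.toString(randomizeFirst(replicas, blockId)));
        }
    }
}
```

Because the seed is the block ID rather than, say, the current time, repeated
reads of one block keep going to the same datanode, which is what should help
the page cache, while reads across many blocks are spread over the replicas.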

I didn't make the same change for NetworkTopologyWithNodeGroup; I can do that
in a follow-on if desired. I also took the opportunity to remove what looked
like dead code in FSNamesystem#getBlockLocations.

> Better sorting in NetworkTopology#pseudoSortByDistance when no local node is 
> found
> ----------------------------------------------------------------------------------
>
>                 Key: HDFS-6268
>                 URL: https://issues.apache.org/jira/browse/HDFS-6268
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.4.0
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>            Priority: Minor
>         Attachments: hdfs-6268-1.patch
>
>
> In NetworkTopology#pseudoSortByDistance, if no local node is found, the
> first rack-local node in the list is always placed in front.
> This became an issue when a dataset was loaded from a single datanode. This
> datanode ended up being the first replica for all the blocks in the dataset.
> When running an Impala query, the non-local reads issued when reading past a
> block boundary all hit this node, causing massive load skew.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
