[ https://issues.apache.org/jira/browse/HADOOP-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12473815 ]
dhruba borthakur commented on HADOOP-972:
-----------------------------------------
+1, looks good.
1. It may be possible to further optimize getLeave() by making it
non-recursive. However, the network topology map is currently only two
levels deep, so this optimization is unlikely to give us any immediate
performance gain.
2. In this implementation, if we have a large number of racks, the time that
chooseRandom() takes to pick a node grows as the selected node index moves
towards the end of the range of datanode indices, because more racks have to
be scanned before the node is found. Again, this will probably have a
material impact only when the topology tree is deep and there are thousands
of racks.
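
To make point 2 concrete, here is a minimal sketch in Java, assuming a
two-level topology reduced to an array of per-rack datanode counts. The class
and method names (RackIndexSketch, findRackLinear, prefixSums, findRackBinary)
are hypothetical and do not come from the patch. The linear scan mirrors the
behaviour described above: the number of racks examined grows with the
selected index. The prefix-sum variant is only one possible alternative, not
something the patch does.

```java
// Hypothetical sketch, not code from rack_performance.patch.
class RackIndexSketch {
    // Linear scan: walks racks in order, so a leaf index near the end of the
    // range examines almost every rack before finding its owner.
    static int findRackLinear(int[] leavesPerRack, int leafIndex) {
        if (leafIndex < 0) {
            throw new IndexOutOfBoundsException("negative leaf index");
        }
        for (int rack = 0; rack < leavesPerRack.length; rack++) {
            if (leafIndex < leavesPerRack[rack]) {
                return rack;
            }
            leafIndex -= leavesPerRack[rack];
        }
        throw new IndexOutOfBoundsException("leaf index out of range");
    }

    // prefix[i] holds the total number of leaves in racks 0..i inclusive.
    static int[] prefixSums(int[] leavesPerRack) {
        int[] prefix = new int[leavesPerRack.length];
        int sum = 0;
        for (int i = 0; i < leavesPerRack.length; i++) {
            sum += leavesPerRack[i];
            prefix[i] = sum;
        }
        return prefix;
    }

    // Binary search for the first rack whose cumulative count exceeds
    // leafIndex; takes O(log racks) steps wherever the index falls.
    static int findRackBinary(int[] prefix, int leafIndex) {
        if (prefix.length == 0 || leafIndex < 0
                || leafIndex >= prefix[prefix.length - 1]) {
            throw new IndexOutOfBoundsException("leaf index out of range");
        }
        int lo = 0;
        int hi = prefix.length - 1;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (prefix[mid] > leafIndex) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return lo;
    }
}
```

The trade-off is that the prefix sums must be rebuilt whenever a datanode
joins or leaves, which is probably not worthwhile at the current cluster
sizes.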
> Improve the rack-aware replica placement performance
> ----------------------------------------------------
>
> Key: HADOOP-972
> URL: https://issues.apache.org/jira/browse/HADOOP-972
> Project: Hadoop
> Issue Type: Improvement
> Components: dfs
> Affects Versions: 0.11.0
> Reporter: Hairong Kuang
> Assigned To: Hairong Kuang
> Fix For: 0.12.0
>
> Attachments: rack_performance.patch
>
>
> This issue aims to improve the rack-aware replica placement performance. A
> major idea is to avoid constructing lists of possible targets for random
> selection in chooseTarget, which currently requires iterating over all
> DatanodeDescriptors. I plan to change the NetworkTopology data structure as
> follows:
> 1. each InnerNode stores its children as a list;
> 2. each InnerNode adds a new field, numberOfLeaves, holding the total number
> of leaves (i.e. data nodes) in its subtree.
> NetworkTopology will support two new methods:
> 1. DatanodeDescriptor chooseRandom(String scope): it randomly chooses one
> leaf from scope.
> 2. DatanodeDescriptor chooseRandomExclude(String excludedScope): it randomly
> chooses one leaf from ~scope, i.e. from outside excludedScope.
> In addition, Issue 971 will also help improve the performance of the
> rack-aware DFS patch.
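
The proposed structure can be sketched roughly as follows. This is an
illustration under stated assumptions, not the contents of
rack_performance.patch: scopes are passed as node references rather than path
strings, a plain Node stands in for DatanodeDescriptor, and TopologySketch,
Node, add, contains and getLeaf are hypothetical names. Only numberOfLeaves,
chooseRandom and chooseRandomExclude are taken from the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of the proposed NetworkTopology changes.
class TopologySketch {
    /** A tree node; a plain Node is a leaf standing in for a datanode. */
    static class Node {
        final String name;
        InnerNode parent;
        Node(String name) { this.name = name; }
        int numberOfLeaves() { return 1; }
    }

    /** An inner node keeps its children as a list plus a cached leaf count. */
    static class InnerNode extends Node {
        final List<Node> children = new ArrayList<>();
        private int numberOfLeaves = 0;
        InnerNode(String name) { super(name); }
        @Override int numberOfLeaves() { return numberOfLeaves; }

        /** Adds a child and updates the leaf count of every ancestor. */
        void add(Node child) {
            child.parent = this;
            children.add(child);
            int added = child.numberOfLeaves();
            for (InnerNode n = this; n != null; n = n.parent) {
                n.numberOfLeaves += added;
            }
        }
    }

    /** True if node equals ancestor or lies somewhere in its subtree. */
    static boolean contains(Node ancestor, Node node) {
        for (Node n = node; n != null; n = n.parent) {
            if (n == ancestor) return true;
        }
        return false;
    }

    /** Returns the index-th leaf under scope, skipping the excluded subtree. */
    static Node getLeaf(Node scope, int index, Node excluded) {
        Node cur = scope;
        while (cur instanceof InnerNode) {
            Node next = null;
            for (Node child : ((InnerNode) cur).children) {
                if (child == excluded) continue;
                int count = child.numberOfLeaves();
                if (excluded != null && contains(child, excluded)) {
                    count -= excluded.numberOfLeaves();
                }
                if (index < count) { next = child; break; }
                index -= count;
            }
            if (next == null) {
                throw new IndexOutOfBoundsException("leaf index out of range");
            }
            cur = next;
        }
        return cur;
    }

    /** Picks a uniformly random leaf under scope (scope must have leaves). */
    static Node chooseRandom(Node scope, Random rnd) {
        return getLeaf(scope, rnd.nextInt(scope.numberOfLeaves()), null);
    }

    /** Picks a random leaf under root that is not under excluded; assumes
     *  excluded is a proper descendant of root and other leaves exist. */
    static Node chooseRandomExclude(Node root, Node excluded, Random rnd) {
        int candidates = root.numberOfLeaves() - excluded.numberOfLeaves();
        return getLeaf(root, rnd.nextInt(candidates), excluded);
    }
}
```

Because each InnerNode caches its leaf count, a random pick only needs one
random index and a walk down the tree, with no list of candidate datanodes
built per call.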