[ https://issues.apache.org/jira/browse/HADOOP-1155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hairong Kuang updated HADOOP-1155:
----------------------------------
Attachment: rackMap2.patch
This patch makes 4 changes:
1. Add a map from rack name to rack node in NetworkTopology to speed up getNode.
2. Optimize sortedByDistance by adopting Sameer's suggestion from HADOOP-1073:
> Do we need to sort datanodes by distance? Why not just do a linear scan for
> the on node and on rack instances, put them at the front of the pipeline and
> leave the rest in random order?
This suggestion allows us to reduce memory allocation and the number of calls
to getDistance.
3. Add a test case for sortedByDistance.
4. Change chooseRandom to return a list instead of an array, which saves one
memory allocation.
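
To illustrate change 2, here is a minimal sketch of the linear-scan idea from Sameer's comment. It is not the code in rackMap2.patch: the Node class, its host and rack fields, and the pseudoSortByDistance name are hypothetical stand-ins, and the remaining nodes are simply left in their existing order rather than shuffled.

```java
// Hypothetical stand-in for a datanode descriptor; not the Hadoop class.
final class Node {
    final String host;
    final String rack;

    Node(String host, String rack) {
        this.host = host;
        this.rack = rack;
    }
}

final class PseudoSort {
    /**
     * One pass over the array: find the first node on the reader's host and
     * the first other node on the reader's rack, then swap them to the front.
     * No full sort, no extra array, no pairwise distance computation.
     */
    static void pseudoSortByDistance(Node reader, Node[] nodes) {
        int localIdx = -1; // index of a node on the reader's host
        int rackIdx = -1;  // index of a node on the reader's rack
        for (int i = 0; i < nodes.length; i++) {
            if (localIdx < 0 && nodes[i].host.equals(reader.host)) {
                localIdx = i;
            } else if (rackIdx < 0 && nodes[i].rack.equals(reader.rack)) {
                rackIdx = i;
            }
            if (localIdx >= 0 && rackIdx >= 0) {
                break; // both found, stop scanning
            }
        }

        int front = 0;
        if (localIdx >= 0) {
            swap(nodes, front, localIdx);
            // If the rack-local node was sitting at the front, the swap moved it.
            if (rackIdx == front) {
                rackIdx = localIdx;
            }
            front++;
        }
        if (rackIdx >= 0) {
            swap(nodes, front, rackIdx);
        }
    }

    private static void swap(Node[] nodes, int i, int j) {
        Node tmp = nodes[i];
        nodes[i] = nodes[j];
        nodes[j] = tmp;
    }
}
```

The scan does two string comparisons per node at most and allocates nothing, which is where the savings over a full distance sort would come from in a sketch like this.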
> Additional performance improvement to chooseTarget
> --------------------------------------------------
>
> Key: HADOOP-1155
> URL: https://issues.apache.org/jira/browse/HADOOP-1155
> Project: Hadoop
> Issue Type: Improvement
> Components: dfs
> Affects Versions: 0.12.2
> Reporter: Hairong Kuang
> Assigned To: Hairong Kuang
> Fix For: 0.13.0
>
> Attachments: rackMap.patch, rackMap1.patch, rackMap2.patch
>
>
> A few additional thoughts to improve the performance of chooseTarget:
> 1. Reduce the # of calls to getDistance in sortedByDistance
> 2. Improve the performance of getNode by adding a rack name to rack node map
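
As a rough illustration of the second point, a rack lookup can be turned into a single hash lookup by keeping an index keyed by rack name. The sketch below assumes Java 8 or later; RackIndex, Rack, and their methods are hypothetical names for illustration and do not reflect the actual NetworkTopology code.

```java
import java.util.HashMap;
import java.util.Map;

final class RackIndex {
    // Hypothetical rack record; the real topology node carries more state.
    static final class Rack {
        final String name;

        Rack(String name) {
            this.name = name;
        }
    }

    // Rack name (for example "/dc1/rack1") mapped to its rack node.
    private final Map<String, Rack> racksByName = new HashMap<>();

    /** Returns the rack registered under this name, creating it if absent. */
    Rack addRack(String name) {
        return racksByName.computeIfAbsent(name, Rack::new);
    }

    /** Expected constant-time lookup; returns null for an unknown rack. */
    Rack getRack(String name) {
        return racksByName.get(name);
    }

    /** Must be called when a rack is removed so the index stays consistent. */
    void removeRack(String name) {
        racksByName.remove(name);
    }
}
```

The trade-off in a design like this is that the index has to be updated on every rack addition and removal, in exchange for avoiding a walk of the topology on each lookup.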