[ 
https://issues.apache.org/jira/browse/HADOOP-1155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HADOOP-1155:
----------------------------------

    Attachment: rackMap2.patch

This patch makes 4 changes:

1. add a map from rack name to rack node in NetworkTopology to speed up getNode
2. optimize sortedByDistance by taking Sameer's suggestion in HADOOP-1073:
>   Do we need to sort datanodes by distance? Why not just do a linear scan for 
> the on node and on rack instances, put them at the front of the pipeline and 
> leave the rest in random order?
    This suggestion allows us to reduce memory allocation and the number of 
calls to getDistance.
3. add a test case to test sortedByDistance
4. change chooseRandom to return a list instead of an array. This saves one 
memory allocation.
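
For readers without the patch at hand, the linear scan described in item 2 
might look roughly like the sketch below. It is only an illustration under 
stated assumptions, not the code in rackMap2.patch: the Node class, its host 
and rack fields, and the method name moveNearestToFront are hypothetical 
stand-ins, and the real NetworkTopology types may differ. The node-local 
replica (if any) is swapped to the front, every rack-local replica follows 
it, and the remaining nodes keep whatever order they already had. No new 
array is allocated and getDistance is never called.

```java
class PipelineOrderSketch {
    /** Hypothetical stand-in for a datanode: a host name plus a rack path. */
    static final class Node {
        final String host;
        final String rack;

        Node(String host, String rack) {
            this.host = host;
            this.rack = rack;
        }
    }

    /**
     * Reorders nodes in place: the node on the reader's host (if present)
     * goes first, then all nodes on the reader's rack, then everything else
     * in its existing order. Two linear scans, no allocation, no sorting.
     */
    static void moveNearestToFront(Node reader, Node[] nodes) {
        int front = 0;

        // Scan 1: find the node-local replica and swap it to index 0.
        for (int i = 0; i < nodes.length; i++) {
            if (nodes[i].host.equals(reader.host)) {
                swap(nodes, front++, i);
                break;
            }
        }

        // Scan 2: move every remaining rack-local replica forward.
        for (int i = front; i < nodes.length; i++) {
            if (nodes[i].rack.equals(reader.rack)) {
                swap(nodes, front++, i);
            }
        }
    }

    private static void swap(Node[] nodes, int a, int b) {
        Node tmp = nodes[a];
        nodes[a] = nodes[b];
        nodes[b] = tmp;
    }
}
```

Compared with a full sort, this touches each node at most twice and compares 
plain strings instead of computing tree distances, which is where the saving 
described above would come from.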

> Additional performance improvement to chooseTarget
> --------------------------------------------------
>
>                 Key: HADOOP-1155
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1155
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.12.2
>            Reporter: Hairong Kuang
>         Assigned To: Hairong Kuang
>             Fix For: 0.13.0
>
>         Attachments: rackMap.patch, rackMap1.patch, rackMap2.patch
>
>
> A few additional thoughts to improve the performance of chooseTarget:
> 1. Reduce the # of calls to getDistance in sortedByDistance
> 2. Improve the performance of getNode by adding a rack name to rack node map

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
