[ 
https://issues.apache.org/jira/browse/HADOOP-5381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12682960#action_12682960
 ] 

Jothi Padmanabhan commented on HADOOP-5381:
-------------------------------------------

# The client uses the NetworkTopology class only to build a tree of nodes 
for itself from the topology information provided by 
fs.getBlockLocations(). HADOOP-4567 made this modification. No resolution of 
nodes and racks happens at the client itself.
# True, the hosts returned by a split are treated as a map. However, returning 
the hosts that contribute the most to the split is still more beneficial than 
returning others. For example, say Host A has 100 bytes, B 80 bytes, C 60 
bytes, D 40 bytes and E 20 bytes for a particular split. It would still be 
beneficial to return A, B and C. While it is true that A, B and C have an 
equal chance of getting picked, running on any of them is still better than 
executing on D or E. No?
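
To make the A through E example concrete, here is a minimal sketch of the selection step. It is not the code in the attached patch or in HADOOP-3293; the class name TopSplitHosts, the method topHosts and the limit of 3 are assumptions made up for illustration. It only shows the idea of ranking hosts by the bytes they contribute to a split and returning the top few.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TopSplitHosts {

    // Hypothetical helper (not a Hadoop API): returns up to `limit` hosts,
    // ordered by how many bytes of the split each host holds, largest first.
    static List<String> topHosts(Map<String, Long> bytesPerHost, int limit) {
        List<Map.Entry<String, Long>> entries =
                new ArrayList<>(bytesPerHost.entrySet());
        // Sort in descending order of byte contribution.
        entries.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));

        List<String> hosts = new ArrayList<>();
        for (int i = 0; i < Math.min(limit, entries.size()); i++) {
            hosts.add(entries.get(i).getKey());
        }
        return hosts;
    }

    public static void main(String[] args) {
        // Byte counts from the example above, inserted out of order on purpose.
        Map<String, Long> bytesPerHost = new LinkedHashMap<>();
        bytesPerHost.put("D", 40L);
        bytesPerHost.put("A", 100L);
        bytesPerHost.put("E", 20L);
        bytesPerHost.put("C", 60L);
        bytesPerHost.put("B", 80L);

        System.out.println(topHosts(bytesPerHost, 3)); // prints [A, B, C]
    }
}
```

Even if the scheduler then treats A, B and C as equally good, D and E never show up as candidates, which is the point of the example.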

> Extend HADOOP-3293 to MapReduce package also
> --------------------------------------------
>
>                 Key: HADOOP-5381
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5381
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Jothi Padmanabhan
>            Assignee: Jothi Padmanabhan
>             Fix For: 0.21.0
>
>         Attachments: hadoop-5381.patch
>
>
> HADOOP-3293 made changes to FileInputFormat to identify split locations that 
> contribute most to the split. This functionality has to be added to the 
> MapReduce.FileInputFormat too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
