[
https://issues.apache.org/jira/browse/HADOOP-5381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12682960#action_12682960
]
Jothi Padmanabhan commented on HADOOP-5381:
-------------------------------------------
# The client uses the NetworkTopology class only to build a tree of nodes
for itself, based on the topology information provided by
fs.getBlockLocations(). HADOOP-4567 made this modification. No resolution of
nodes and racks happens at the client itself.
# True, the hosts returned by a split are treated as a map. However, returning
the hosts that contribute the most to the split is still more beneficial than
returning others. For example, say Host A has 100 bytes, B 80 bytes, C 60 bytes,
D 40 bytes and E 20 bytes for a particular split. It would still be beneficial
to return A, B and C. While it is true that A, B and C have an equal chance of
getting picked, that is still better than executing on D or E. No?
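To make the A to E example concrete, here is a minimal sketch of ranking hosts by the bytes they contribute to a split and keeping the top few. It is not the code from the attached patch: the class SplitHostRanking, the method topHosts and the per-host byte map are hypothetical names used only for illustration, and the sketch ignores rack information entirely.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: ranks hosts by the number of
// bytes of a split that each host stores locally.
final class SplitHostRanking {

    // Returns up to maxHosts host names, ordered by descending byte contribution.
    static List<String> topHosts(Map<String, Long> bytesPerHost, int maxHosts) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(bytesPerHost.entrySet());
        // Sort so that the host holding the most bytes of the split comes first.
        entries.sort(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder()));
        List<String> hosts = new ArrayList<>();
        for (Map.Entry<String, Long> entry : entries) {
            if (hosts.size() == maxHosts) {
                break;
            }
            hosts.add(entry.getKey());
        }
        return hosts;
    }

    public static void main(String[] args) {
        // Byte counts taken from the example in the comment above.
        Map<String, Long> bytesPerHost = new LinkedHashMap<>();
        bytesPerHost.put("D", 40L);
        bytesPerHost.put("A", 100L);
        bytesPerHost.put("E", 20L);
        bytesPerHost.put("C", 60L);
        bytesPerHost.put("B", 80L);
        System.out.println(topHosts(bytesPerHost, 3)); // prints [A, B, C]
    }
}
```

Even if the scheduler then treats A, B and C as equally good, each of them holds at least 60 of the split's bytes locally, whereas D and E hold at most 40.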
> Extend HADOOP-3293 to MapReduce package also
> --------------------------------------------
>
> Key: HADOOP-5381
> URL: https://issues.apache.org/jira/browse/HADOOP-5381
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Jothi Padmanabhan
> Assignee: Jothi Padmanabhan
> Fix For: 0.21.0
>
> Attachments: hadoop-5381.patch
>
>
> HADOOP-3293 made changes to FileInputFormat to identify split locations that
> contribute most to the split. This functionality has to be added to the
> MapReduce.FileInputFormat too.