[
https://issues.apache.org/jira/browse/HADOOP-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647781#action_12647781
]
Jothi Padmanabhan commented on HADOOP-3293:
-------------------------------------------
Sorry, I should have said "Patch for review"; the patch was tested locally.
I also ran a test to demonstrate the performance improvement from the patch. I
allocated a 440-node cluster and ran randomwriter with 40 maps, each map writing
25G of output. I then killed the task trackers on the nodes that ran the maps, and
ran a modified sort (no map output, no reduces) with a minimum input split of 10G.
I found that, averaged over three runs, the patch was about 17 seconds faster
than trunk (175 secs as opposed to 192 secs).
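
For readers who want the idea behind the issue in concrete form, here is a
rough sketch of ranking hosts by how many bytes of a split they hold. It is
not the code in hadoop-3293.patch; the class SplitHostSketch, the Block type
and the method hostsByBytes are hypothetical names made up for illustration,
and the tie-breaking by host name is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SplitHostSketch {

    /** Hypothetical stand-in for a file block: offset, length, replica hosts. */
    static final class Block {
        final long offset;
        final long length;
        final String[] hosts;

        Block(long offset, long length, String... hosts) {
            this.offset = offset;
            this.length = length;
            this.hosts = hosts;
        }
    }

    /**
     * Returns the hosts that hold any part of the split, ordered by the
     * number of split bytes stored on each (most bytes first). Ties are
     * broken by host name, which is an arbitrary choice for this sketch.
     */
    static List<String> hostsByBytes(List<Block> blocks, long splitStart, long splitLength) {
        long splitEnd = splitStart + splitLength;
        Map<String, Long> bytesPerHost = new HashMap<>();

        for (Block b : blocks) {
            // Portion of this block that falls inside the split.
            long overlapStart = Math.max(splitStart, b.offset);
            long overlapEnd = Math.min(splitEnd, b.offset + b.length);
            long overlap = overlapEnd - overlapStart;
            if (overlap <= 0) {
                continue; // block lies entirely outside the split
            }
            // Every replica host of the block holds those bytes.
            for (String host : b.hosts) {
                bytesPerHost.merge(host, overlap, Long::sum);
            }
        }

        List<String> ranked = new ArrayList<>(bytesPerHost.keySet());
        ranked.sort((x, y) -> {
            int byBytes = Long.compare(bytesPerHost.get(y), bytesPerHost.get(x));
            return byBytes != 0 ? byBytes : x.compareTo(y);
        });
        return ranked;
    }

    public static void main(String[] args) {
        // Three 128-byte blocks; the split covers bytes [100, 300).
        List<Block> blocks = Arrays.asList(
                new Block(0, 128, "A", "B"),
                new Block(128, 128, "B", "C"),
                new Block(256, 128, "C", "D"));
        // A holds 28 bytes, B 156, C 172, D 44.
        System.out.println(hostsByBytes(blocks, 100, 200)); // prints [C, B, D, A]
    }
}
```

In this toy case the split starts in the first block, but the first entry of
the ranking is C, because C holds the most bytes of the split across the
second and third blocks.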
> When an input split spans cross block boundary, the split location should be
> the host having most of bytes on it.
> ------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-3293
> URL: https://issues.apache.org/jira/browse/HADOOP-3293
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Reporter: Runping Qi
> Assignee: Jothi Padmanabhan
> Attachments: hadoop-3293.patch
>
>