[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840435#action_12840435
 ] 

Luke Lu commented on MAPREDUCE-1484:
------------------------------------

There are a few approaches to this issue, which is mostly about preserving the 
task ids in the original order of the input splits. We want to avoid sorting 
the splits, which could number up to a million for large scans, on the 
JobTracker side, since that would be done per job and the JT is already 
overloaded in large clusters.

One approach is to sort on the client side and introduce an additional id in 
the input splits' meta info fields (new API only) to preserve the original 
order of the splits. One issue that came up with this approach: do we want to 
validate the ids on the JT side? If so (it seems prudent), the validation could 
be expensive as well: checking whether the ids are unique takes at least O(N) 
time, and I'm not sure we can do it in O(1) space; O(N) space and time with a 
BitSet is the most straightforward approach (see the sketch below).
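
To make that concrete, here is a minimal sketch of what I have in mind; the 
class and method names (SplitMetaInfo, sortBySize, idsAreValid) are 
hypothetical, not the actual MapReduce split meta info classes. It tags each 
split with its original index on the client, sorts by size, and then checks on 
the JT that the ids form a permutation of 0..n-1 in O(N) time and O(N) bits of 
space with a BitSet:

{code}
import java.util.Arrays;
import java.util.BitSet;
import java.util.Comparator;

// Hypothetical split meta info carrying the original split index.
class SplitMetaInfo {
  final int originalIndex;   // id preserving the pre-sort (input) order
  final long length;         // split size in bytes

  SplitMetaInfo(int originalIndex, long length) {
    this.originalIndex = originalIndex;
    this.length = length;
  }
}

class SplitOrderUtil {
  // Client side: tag each split with its original index, then sort by size
  // (largest first) so the JT can still recover the input order from the ids.
  static SplitMetaInfo[] sortBySize(long[] splitLengths) {
    SplitMetaInfo[] metas = new SplitMetaInfo[splitLengths.length];
    for (int i = 0; i < splitLengths.length; i++) {
      metas[i] = new SplitMetaInfo(i, splitLengths[i]);
    }
    Arrays.sort(metas,
        Comparator.comparingLong((SplitMetaInfo m) -> m.length).reversed());
    return metas;
  }

  // JT side: validate that the ids are a permutation of 0..n-1.
  // O(N) time, O(N) bits of space.
  static boolean idsAreValid(SplitMetaInfo[] metas) {
    int n = metas.length;
    BitSet seen = new BitSet(n);
    for (SplitMetaInfo m : metas) {
      int id = m.originalIndex;
      if (id < 0 || id >= n || seen.get(id)) {
        return false;  // out of range or duplicate id
      }
      seen.set(id);
    }
    return true;
  }

  public static void main(String[] args) {
    // e.g. two full 128 MB blocks plus a shorter tail block
    long[] lengths = {134217728L, 134217728L, 71303168L};
    SplitMetaInfo[] sorted = sortBySize(lengths);
    System.out.println("ids valid: " + idsAreValid(sorted));
  }
}
{code}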

Another approach is to punt on the sorting altogether. Observing that most 
splits come from file input formats, where every block has the same default 
block size except the last one, maybe we don't need to sort at all. There are 
cases where block sizes differ within a job, but that's relatively rare(?)

I prefer the latter approach, as it's simpler and easier to implement (just 
remove the sorting code.)  It has cleaner semantics as well, as split sizes 
should be an input to a scheduler that cares about them (in a two-level 
scheduler world, sorting at one of the job manager nodes is scalable as well.)

Thoughts?

> Framework should not sort the input splits
> ------------------------------------------
>
>                 Key: MAPREDUCE-1484
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1484
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Owen O'Malley
>
> Currently the framework sorts the input splits by size before the job is 
> submitted. This makes it very difficult to run map only jobs that transform 
> the input because the assignment of input names to output names isn't 
> obvious. We fixed this once in HADOOP-1440, but the fix was broken so it was 
> rolled back.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
