[jira] [Commented] (MAPREDUCE-4502) Multi-level aggregation with combining the result of maps per node/rack

Chris Douglas (JIRA) Wed, 12 Sep 2012 15:09:11 -0700

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454424#comment-13454424
 ]


Chris Douglas commented on MAPREDUCE-4502:
------------------------------------------

{quote}
>> Will there be one LocalAggregator and one ShuffleHandler per each 
>> reducer? Or, is it a single LocalAggregator/ShuffleHandler daemon
>> with relevant thread(pool)s per container?
Latter.
Ideally we would minimize code modifications and maximize performance. In the 
current MR implementation, a ShuffleHandler is launched per container. Keeping 
it that way avoids further code changes. 
{quote}

{{ShuffleHandler}} is an auxiliary service loaded in the NodeManager. It's 
shared across all containers. Unfortunately, this makes the merge/combine 
difficult to implement as a daemon, because it requires loading user 
comparators (which shouldn't happen in the NM process).
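
To illustrate why user code is unavoidable here, the following is a minimal sketch of a node-level merge/combine over already-sorted map outputs. It is not Hadoop code: {{NodeLevelCombineSketch}}, {{KV}}, and {{mergeAndCombine}} are hypothetical names, and plain {{java.util}} types stand in for the job's comparator and combiner. Both the merge order and the grouping of equal keys depend on the user-supplied comparator, which is the part that shouldn't be loaded into the NM.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.BinaryOperator;

public class NodeLevelCombineSketch {

    /** One key/value pair from a sorted map-output run (hypothetical type). */
    public static final class KV<K, V> {
        public final K key;
        public final V value;
        public KV(K key, V value) { this.key = key; this.value = value; }
    }

    /** Read position within one sorted run. */
    private static final class Cursor<K, V> {
        final List<KV<K, V>> run;
        int pos = 0;
        Cursor(List<KV<K, V>> run) { this.run = run; }
        KV<K, V> head() { return run.get(pos); }
        boolean advance() { return ++pos < run.size(); }
    }

    /**
     * K-way merge of sorted runs, combining values whose keys compare equal.
     * Both arguments after the runs are user code: the comparator decides
     * merge order and key equality, the combiner folds values together.
     */
    public static <K, V> List<KV<K, V>> mergeAndCombine(
            List<List<KV<K, V>>> sortedRuns,
            Comparator<K> userComparator,
            BinaryOperator<V> userCombiner) {
        PriorityQueue<Cursor<K, V>> heap = new PriorityQueue<>(
                (a, b) -> userComparator.compare(a.head().key, b.head().key));
        for (List<KV<K, V>> run : sortedRuns) {
            if (!run.isEmpty()) {
                heap.add(new Cursor<>(run));
            }
        }
        List<KV<K, V>> out = new ArrayList<>();
        K currentKey = null;
        V currentValue = null;
        boolean haveCurrent = false;
        while (!heap.isEmpty()) {
            Cursor<K, V> cursor = heap.poll();
            KV<K, V> record = cursor.head();
            if (haveCurrent && userComparator.compare(currentKey, record.key) == 0) {
                // Same key group: fold the value in with the user combiner.
                currentValue = userCombiner.apply(currentValue, record.value);
            } else {
                // New key group: emit the finished one, start the next.
                if (haveCurrent) {
                    out.add(new KV<>(currentKey, currentValue));
                }
                currentKey = record.key;
                currentValue = record.value;
                haveCurrent = true;
            }
            if (cursor.advance()) {
                heap.add(cursor); // re-insert with its new head
            }
        }
        if (haveCurrent) {
            out.add(new KV<>(currentKey, currentValue));
        }
        return out;
    }

    public static void main(String[] args) {
        // Two sorted outputs from maps on the same node (word-count style).
        List<KV<String, Integer>> run1 = Arrays.asList(
                new KV<String, Integer>("a", 1), new KV<String, Integer>("b", 2));
        List<KV<String, Integer>> run2 = Arrays.asList(
                new KV<String, Integer>("a", 3), new KV<String, Integer>("c", 1));
        List<KV<String, Integer>> merged = mergeAndCombine(
                Arrays.asList(run1, run2), Comparator.<String>naturalOrder(), Integer::sum);
        for (KV<String, Integer> kv : merged) {
            System.out.println(kv.key + "=" + kv.value); // a=4, b=2, c=1
        }
    }
}
```

Run as a task (at the end of a map or as a ReduceTask), the same logic executes in a container that already loads the job's classes, so the NM never has to.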

Would it make sense to do this either at the end of a map (coordinated by the 
AM) or, as you suggest, as a ReduceTask rather than a daemon?

bq. To make this design generic enough to support rack-level aggregation, a 
special Reducer-like task is necessary: one that fetches outputs and reduces 
them, but writes its output to local disk rather than HDFS. Such a task could 
be used for rack-level aggregation by extending the new APIs between mappers 
and reducers to launch special tasks and delegate the aggregation to them.

[~curino] and I experimented with this, but (a) saw only slight improvements in 
job performance and (b) the changes to the AM to accommodate a new task type 
were extensive. We're currently experimenting with different heuristics for 
ReduceTasks to exit at the end of the shuffle stage based on skew (from 
sampling) or in response to resource scarcity (based on RM feedback). With 
logic to manage skew, we're hoping that scheduling an aggressive range can have 
a similar effect to combiner tasks, without introducing the new task type.
                
> Multi-level aggregation with combining the result of maps per node/rack
> -----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4502
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4502
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: applicationmaster, mrv2
>            Reporter: Tsuyoshi OZAWA
>            Assignee: Tsuyoshi OZAWA
>         Attachments: speculative_draft.pdf
>
>
> Shuffle is expensive in Hadoop despite the existence of the combiner, 
> because the scope of combining is limited to a single MapTask. 
> To solve this problem, the results of maps can be aggregated per 
> node/rack by launching a combiner at that level.
> This JIRA is to implement the multi-level aggregation infrastructure, 
> including combining per container (MAPREDUCE-3902 is related) and coordinating 
> containers via the application master without breaking fault tolerance of jobs.

