[jira] [Updated] (MAPREDUCE-5643) DynamicMR: A Dynamic Slot Utilization Optimization Framework for Hadoop MRv1

tang shanjiang (JIRA) Thu, 12 Jun 2014 23:53:24 -0700

     [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


tang shanjiang updated MAPREDUCE-5643:
--------------------------------------

    Attachment: DynamicMR_TCC_SupplementalMaterial.pdf
                DynamicMR A Dynamic Slot Allocation Optimization Framework for MapReduce Clusters.pdf

A technical report on DynamicMR

> DynamicMR: A Dynamic Slot Utilization Optimization Framework for Hadoop MRv1
> ----------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5643
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5643
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/fair-share
>    Affects Versions: 1.2.1
>            Reporter: tang shanjiang
>            Assignee: tang shanjiang
>              Labels: performance
>         Attachments: DynamicMR A Dynamic Slot Allocation Optimization 
> Framework for MapReduce Clusters.pdf, DynamicMR-0.1.1-patch, 
> DynamicMR_TCC_SupplementalMaterial.pdf, README
>
>
> Hadoop MRv1 uses a slot-based resource model with a static configuration of 
> map/reduce slots. There is a strict utilization constraint: map tasks can 
> only run on map slots, and reduce tasks can only run on reduce slots. Because 
> of the rigid execution order between map and reduce tasks in a MapReduce 
> environment, slots can be severely under-utilized, which significantly 
> degrades performance. 
> In contrast to YARN, which gives up the slot-based resource model and 
> proposes a container-based model that maximizes resource utilization by 
> being unaware of map/reduce task types, we keep the slot-based model and 
> propose a dynamic slot utilization optimization system called DynamicMR. It 
> improves the performance of Hadoop by maximizing slot utilization as well as 
> slot utilization efficiency while guaranteeing fairness across pools. It 
> consists of three types of scheduling components, namely, Dynamic Hadoop Fair 
> Scheduler (DHFS), Dynamic Speculative Task Scheduler (DSTS), and Data 
> Locality Maximization Scheduler (DLMS).
> Our tests show that DynamicMR outperforms YARN for MapReduce workloads with 
> multiple jobs, especially when the number of jobs is large. The explanation 
> is that, for a fixed amount of resources, controlling the ratio of 
> concurrently running map and reduce tasks performs better than leaving it 
> uncontrolled: without control, too many reduce tasks can easily run at once, 
> making the network a serious bottleneck. In YARN, both map and reduce tasks 
> can run on any idle container, and there is no mechanism to control the ratio 
> of resource allocation between map and reduce tasks. As a result, when there 
> are pending reduce tasks, an idle container will most likely be taken by 
> them. DynamicMR, in contrast, follows the traditional slot-based model. 
> Instead of the 'hard' constraint of slot allocation, under which map slots 
> must be allocated to map tasks and reduce slots to reduce tasks, DynamicMR 
> obeys a 'soft' constraint that allows a map slot to be allocated to a reduce 
> task and vice versa. However, whenever there are pending map tasks, a map 
> slot should be given to map tasks first, and the rule is similar for reduce 
> slots. This means that the traditional static map/reduce slot configuration 
> still works in DynamicMR as a way to control the ratio of running map/reduce 
> tasks. Whereas YARN only maximizes resource utilization, DynamicMR maximizes 
> slot utilization while also dynamically controlling the ratio of running 
> map/reduce tasks through the map/reduce slot configuration.
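
The 'soft' constraint described above can be illustrated with a minimal sketch. This is not the DynamicMR patch or the DHFS implementation: the names `Assignment` and `assign_idle_slots` are hypothetical, and the sketch ignores pools, fairness, speculative tasks, and data locality. It only shows the stated rule that each slot type serves its own task type first and lends leftover idle slots to the other type.

```python
from typing import NamedTuple


class Assignment(NamedTuple):
    """How many tasks of each type are placed on each slot type (hypothetical)."""
    maps_on_map_slots: int
    reduces_on_reduce_slots: int
    reduces_on_map_slots: int
    maps_on_reduce_slots: int


def assign_idle_slots(idle_map_slots: int, idle_reduce_slots: int,
                      pending_maps: int, pending_reduces: int) -> Assignment:
    """Sketch of the 'soft' slot constraint; an assumption, not the real scheduler."""
    # Step 1: each slot type serves its own task type first.
    maps_on_map = min(idle_map_slots, pending_maps)
    reduces_on_reduce = min(idle_reduce_slots, pending_reduces)

    # Step 2: slots still idle after step 1 may be lent to the other task type.
    spare_map_slots = idle_map_slots - maps_on_map
    spare_reduce_slots = idle_reduce_slots - reduces_on_reduce
    reduces_on_map = min(spare_map_slots, pending_reduces - reduces_on_reduce)
    maps_on_reduce = min(spare_reduce_slots, pending_maps - maps_on_map)

    return Assignment(maps_on_map, reduces_on_reduce,
                      reduces_on_map, maps_on_reduce)
```

With 4 idle map slots, 4 idle reduce slots, 10 pending map tasks, and no pending reduce tasks, the sketch places 4 map tasks on map slots and 4 more on the otherwise idle reduce slots. When both task types have enough pending work, no slot is lent, so the configured map/reduce slot ratio is preserved.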



--
This message was sent by Atlassian JIRA
(v6.2#6252)
