[ https://issues.apache.org/jira/browse/MAPREDUCE-5643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892219#comment-13892219 ]
Chen He commented on MAPREDUCE-5643:
------------------------------------

This is interesting. I would suggest you upload your design documents, including those for DHFS, DSTS, and DLMS. I have the following questions about your scheduler:

1) If map and reduce slots can be exchanged, it is possible that some small jobs cannot finish in time.
2) Is there any load-balancing feature in your scheduling for the map and reduce stages?
3) If reduce tasks steal map slots, some local map tasks will become non-local tasks because of the shortage of map slots.

> DynamicMR: A Dynamic Slot Utilization Optimization Framework for Hadoop MRv1
> ----------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5643
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5643
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/fair-share
>    Affects Versions: 1.2.1
>            Reporter: tang shanjiang
>            Assignee: tang shanjiang
>              Labels: performance
>         Attachments: DynamicMR-0.1.1-patch, README
>
>
> Hadoop MRv1 uses a slot-based resource model with a static configuration
> of map/reduce slots. There is a strict utilization constraint: map tasks can
> only run on map slots, and reduce tasks can only use reduce slots. Because of
> the rigid execution order between map and reduce tasks in a MapReduce
> environment, slots can be severely under-utilized, which significantly
> degrades performance.
> In contrast to YARN, which gives up the slot-based resource model and proposes
> a container-based model that maximizes resource utilization by ignoring the
> type (map or reduce) of each task, we keep the slot-based model and propose a
> dynamic slot utilization optimization system called DynamicMR. It aims to
> improve the performance of Hadoop by maximizing slot utilization as well as
> slot utilization efficiency, while guaranteeing fairness across pools. It
> consists of three types of scheduling components, namely, Dynamic Hadoop Fair
> Scheduler (DHFS), Dynamic Speculative Task Scheduler (DSTS), and Data
> Locality Maximization Scheduler (DLMS).
> Our tests show that DynamicMR outperforms YARN for MapReduce workloads with
> multiple jobs, especially when the number of jobs is large. The explanation
> is that, for a given amount of resources, performance with ratio control over
> concurrently running map and reduce tasks is better than without such
> control. Without control, it easily happens that too many reduce tasks run at
> once, making the network a serious bottleneck. For YARN, both map and reduce
> tasks can run on any idle container. There is no control mechanism for the
> ratio of resource allocation between map and reduce tasks. This means that
> when there are pending reduce tasks, an idle container will most likely be
> taken by them. In contrast, DynamicMR follows the traditional slot-based
> model. Instead of the 'hard' constraint of slot allocation, under which map
> slots have to be allocated to map tasks and reduce slots to reduce tasks,
> DynamicMR obeys a 'soft' constraint that allows a map slot to be allocated to
> a reduce task and vice versa. But whenever there are pending map tasks, a map
> slot should be given to map tasks first, and the rule is similar for reduce
> slots and reduce tasks. This means that the traditional way of controlling
> the ratio of running map/reduce tasks through static map/reduce slot
> configuration still works for DynamicMR. In comparison to YARN, which only
> maximizes resource utilization, DynamicMR can maximize slot resource
> utilization and at the same time dynamically control the ratio of running
> map/reduce tasks via the map/reduce slot configuration.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
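
For discussion, the 'soft' slot allocation rule described in the quoted issue can be sketched roughly as follows. This is only an illustration of the stated rule (each slot type serves its own task type first, and only leftover slots are lent to the other type). The function name, its parameters, and the returned keys are hypothetical and are not taken from the DynamicMR patch, and the sketch ignores pools, fairness, speculation, and data locality.

```python
def assign_slots(free_map_slots, free_reduce_slots, pending_maps, pending_reduces):
    """Hypothetical sketch of a 'soft' slot constraint (not the patch's code).

    Phase 1: each slot type serves pending tasks of its own type first.
    Phase 2: slots still idle are lent to pending tasks of the other type.
    """
    # Phase 1: own-type allocation (this is all the 'hard' constraint allows).
    maps_on_map_slots = min(free_map_slots, pending_maps)
    reduces_on_reduce_slots = min(free_reduce_slots, pending_reduces)

    # What is left over after own-type allocation.
    spare_map_slots = free_map_slots - maps_on_map_slots
    spare_reduce_slots = free_reduce_slots - reduces_on_reduce_slots
    leftover_maps = pending_maps - maps_on_map_slots
    leftover_reduces = pending_reduces - reduces_on_reduce_slots

    # Phase 2: borrow idle slots of the other type.
    reduces_on_map_slots = min(spare_map_slots, leftover_reduces)
    maps_on_reduce_slots = min(spare_reduce_slots, leftover_maps)

    return {
        "maps_on_map_slots": maps_on_map_slots,
        "reduces_on_reduce_slots": reduces_on_reduce_slots,
        "reduces_on_map_slots": reduces_on_map_slots,
        "maps_on_reduce_slots": maps_on_reduce_slots,
    }


if __name__ == "__main__":
    # 4 idle map slots, 2 idle reduce slots, 1 pending map, 5 pending reduces.
    print(assign_slots(4, 2, 1, 5))
```

In this sketch borrowing happens only in the second phase, so when both task types have enough pending work the result is the same as under the static 'hard' configuration. That is the sense in which the configured map/reduce slot ratio would still act as a control.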