[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated MAPREDUCE-279:
------------------------------------

    Comment: was deleted

(was: h5. Proposal 

The fundamental idea of the re-factor is to divide the two major functions of 
the JobTracker, resource management and job scheduling/monitoring, into 
separate components: a generic resource scheduler and a per-job, user-defined 
component that manages the application execution. 

The new ResourceManager manages the global assignment of compute resources to 
applications and the per-application ApplicationMaster manages the 
application's scheduling and coordination. An application is either a single 
job in the classic MapReduce jobs or a DAG of such jobs. The ResourceManager 
and per-machine NodeManager server, which manages the user processes on that 
machine, form the computation fabric. The per-application ApplicationMaster is, 
in effect, a framework specific library and is tasked with negotiating 
resources from the ResourceManager and working with the NodeManager(s) to 
execute and monitor the tasks.

The ResourceManager is a pure scheduler in the sense that it performs no 
monitoring or tracking of status for the application. Also, it offers no 
guarantees on restarting failed tasks either due to application failure or 
hardware failures.

The ResourceManager performs its scheduling function based the resource 
requirements of the applications; each application has multiple resource 
request types that represent the resources required for containers. The 
resource requests include memory, CPU, disk, network etc. Note that this is a 
significant change from the current model of fixed-type slots in Hadoop 
MapReduce, which leads to significant negative impact on cluster utilization. 
The ResourceManager has a scheduler policy plug-in, which is responsible for 
partitioning the cluster resources among various queues, applications etc. 
Scheduler plug-ins can be based, for e.g., on the current CapacityScheduler and 
FairScheduler.

The NodeManager is the per-machine framework agent who is responsible for 
launching the applications' containers, monitoring their resource usage (cpu, 
memory, disk, network) and reporting the same to the Scheduler.

The per-application ApplicationMaster has the responsibility of negotiating 
appropriate resource containers from the Scheduler, launching tasks, tracking 
their status & monitoring for progress, handling task-failures and recovering 
from saved state on an ResourceManager fail-over.

Since downtime is more expensive at scale high-availability is built-in from 
the beginning via Apache ZooKeeper for the ResourceManager and HDFS checkpoint 
for the MapReduce ApplicationMaster. Security and multi-tenancy support is 
critical to support many users on the larger clusters. The new architecture 
will also increase innovation and agility by allowing for user-defined versions 
of MapReduce runtime. Support for generic resource requests will increase 
cluster utilization by removing artificial bottlenecks such as 
hard-partitioning of resources into map and reduce slots.

----

We have a *prototype* we'd like to commit to a branch soon, where we look 
forward to feedback. From there on, we would love to collaborate to get it 
committed to trunk.

)

> Map-Reduce 2.0
> --------------
>
>                 Key: MAPREDUCE-279
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker, tasktracker
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>             Fix For: 0.23.0
>
>
> Re-factor MapReduce into a generic resource scheduler and a per-job, 
> user-defined component that manages the application execution. 

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to