[ 
https://issues.apache.org/jira/browse/HADOOP-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Yamijala updated HADOOP-1301:
-------------------------------------

    Status: Patch Available  (was: In Progress)

> resource management provisioning for Hadoop
> -------------------------------------------
>
>                 Key: HADOOP-1301
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1301
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.16.0
>            Reporter: Pete Wyckoff
>            Assignee: Hemanth Yamijala
>             Fix For: 0.16.0
>
>         Attachments: hod-hadoop.patch, hod-hadoop.v2.patch, 
> hod-hadoop.v3.patch, hod-hadoop.v4.patch, hod-open-4.tar.gz, hod.0.2.2.tar.gz
>
>
> The Hadoop On Demand (HOD) project addresses the provisioning and management of 
> MapReduce instances on cluster resources. With HOD, the MapReduce user 
> interacts with the cluster solely through a self-service interface and the 
> JT, TT info ports. The user never needs to log into the cluster or even have 
> an account on the cluster for that matter. HOD allocates nodes, provisions 
> MapReduce (and optionally HDFS) on the cluster and when the user is done with 
> MapReduce jobs, cleanly shuts down MapReduce and de-allocates the nodes 
> (i.e., returns them to the pool of available resources in the cluster).
> Using HOD, a cluster can be shared among different users in a fair and 
> efficient manner. HOD is not a replacement or re-implementation of a 
> traditional resource manager. HOD is implemented using the resource manager 
> paradigm and is currently expected to support Torque and Condor out of the 
> box. It also supports "static" resources, i.e., a dedicated set of resources 
> not using a resource manager.
> HOD is also self-provisioning and, thus, can be used on systems such as EC2 
> or a campus cluster not already running MapReduce software or a resource 
> manager. Figure 1 depicts a cluster using HOD. As the figure shows, the user 
> never logs into the cluster itself. The user's jobs run as the 'hod' user (a 
> configurable unix id).
> The user interacts with MapReduce and the cluster using the hod shell, hodsh. 
> Once in the hodsh, the user can allocate/de-allocate nodes and automatically 
> run JT, TTs, NN, DNs on those nodes without knowing which nodes are running 
> which daemons or logging into any of those boxes. HOD transparently 
> masks failures by allocating nodes to replace failed nodes. Once the user has 
> allocated nodes, she can run /bin/MapReduce my1.jar and then /bin/MapReduce 
> my2.jar ... from within the hod shell, which automatically generates the 
> configuration file for the MapReduce script. When done, the user will exit 
> the shell.
> The hod shell has an automatic timeout so that users cannot hog resources 
> they aren't using. The timeout applies only when there is no MapReduce job 
> running. In addition, hod also has the option of tracking and enforcing 
> user/group resource limits.
> Optionally, HOD can run dedicated log and directory services in the cluster. 
> The log services are a central repository for collecting and retrieving 
> Hadoop logs for any given job. The directory service provides an easy way to 
> inspect what's running in the cluster and gives end users an HTML interface 
> for reaching their JT and TT info ports. 
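
The idle-timeout rule in the description (the hod shell times out only while no MapReduce job is running) could be sketched roughly as below. This is an illustration only, not HOD's actual code: the class name IdleTimeout, its methods, and the injectable clock are all hypothetical.

```python
import time


class IdleTimeout:
    """Hypothetical helper: a session expires only while no job is running."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock              # injectable so the logic can be tested
        self.running_jobs = 0
        self.idle_since = clock()       # a new session starts out idle

    def job_started(self):
        # Any running job suspends the idle countdown.
        self.running_jobs += 1
        self.idle_since = None

    def job_finished(self):
        self.running_jobs = max(0, self.running_jobs - 1)
        if self.running_jobs == 0:
            # The countdown restarts when the last job ends.
            self.idle_since = self.clock()

    def expired(self):
        if self.running_jobs > 0 or self.idle_since is None:
            return False
        return self.clock() - self.idle_since >= self.timeout_seconds
```

A shell loop would poll expired() and de-allocate the nodes once it returns True; time spent running a job never counts toward the timeout.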

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.