[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427111#comment-13427111
 ] 

eric baldeschwieler commented on MAPREDUCE-4495:
------------------------------------------------

+1 Arun, Chris - YARN is about opening up Hadoop to allow a lot more innovation 
on open APIs.  The whole point of open APIs is to let lots of different people 
try lots of different things in parallel without having to get those things 
added to the Hadoop core.  So go build something excellent, share it and build 
a community.  Use GitHub, SourceForge, Apache Extras or start an incubator 
project.  You don't need our approval, and it's not fair or scalable to ask the 
YARN folks to get involved in supporting everyone's ideas of what might be 
interesting to build on top of YARN.  People building Tomcat apps don't want 
or get to check them into the Tomcat project.

Another idea is to include this in Oozie if it fits well there...  If that 
would let us break Oozie down into a scheduler and a workflow library (two 
sub-projects), that would be a very cool refactoring of Oozie.

Look at the OpenMPI work...  They are proposing to add their AM to MPI.
                
> Workflow Application Master in YARN
> -----------------------------------
>
>                 Key: MAPREDUCE-4495
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>    Affects Versions: 2.0.0-alpha
>            Reporter: Bo Wang
>            Assignee: Bo Wang
>
> It is useful to have a workflow application master, which will be capable of 
> running a DAG of jobs. The workflow client submits a DAG request to the AM 
> and then the AM will manage the life cycle of this application in terms of 
> requesting the needed resources from the RM, and starting, monitoring and 
> retrying the application's individual tasks.
> Compared to running Oozie with the current MapReduce Application Master, 
> these are some of the advantages:
>  - Fewer consumed resources, since only one application master will be 
> spawned for the whole workflow.
>  - Reuse of resources, since the same resources can be used by multiple 
> consecutive jobs in the workflow (no need to request/wait for resources for 
> every individual job from the central RM).
>  - More optimization opportunities in terms of collective resource requests.
>  - Optimization opportunities in terms of rewriting and composing jobs in the 
> workflow (e.g. pushing down Mappers).
>  - This Application Master can be reused/extended by higher-level systems 
> like Pig and Hive to provide an optimized way of running their workflows.
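
To make the proposal above concrete, here is a minimal sketch of the control loop a single workflow AM might run: take a DAG of jobs, start each job once its dependencies have finished, and retry failures. It is an illustration only, not the design from the issue and not YARN API code. The names run_workflow, run_job and max_retries, and the dict-based DAG format, are all assumptions made for this example; a real AM would request containers from the RM where this sketch simply calls run_job.

```python
from collections import deque


def run_workflow(dag, run_job, max_retries=2):
    """Run jobs in dependency order from one driver loop.

    dag: dict mapping job name -> list of job names it depends on
         (hypothetical format; every dependency must also be a key).
    run_job: callable taking a job name and returning True on success
             (stand-in for launching the job on cluster resources).
    Returns the list of job names in the order they completed.
    """
    # Number of unfinished dependencies per job, plus reverse edges.
    pending = {job: len(deps) for job, deps in dag.items()}
    dependents = {job: [] for job in dag}
    for job, deps in dag.items():
        for dep in deps:
            dependents[dep].append(job)

    # Jobs with no dependencies can start immediately.
    ready = deque(sorted(job for job, count in pending.items() if count == 0))
    completed = []

    while ready:
        job = ready.popleft()
        # Retry the job up to max_retries times after the first attempt.
        for _ in range(max_retries + 1):
            if run_job(job):
                break
        else:
            raise RuntimeError(
                f"job {job!r} failed after {max_retries + 1} attempts")
        completed.append(job)
        # Finishing this job may unblock its dependents.
        for child in dependents[job]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)

    if len(completed) != len(dag):
        raise ValueError("workflow contains a cycle")
    return completed


if __name__ == "__main__":
    example = {
        "extract": [],
        "clean": ["extract"],
        "join": ["extract"],
        "report": ["clean", "join"],
    }
    print(run_workflow(example, lambda job: True))
```

Because one loop owns the whole DAG, it is the natural place for the resource reuse and collective resource requests listed in the description; the sketch leaves those out and runs jobs one at a time.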

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
