[ 
https://issues.apache.org/jira/browse/OOZIE-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561353#comment-13561353
 ] 

Tianyou Li commented on OOZIE-1178:
-----------------------------------

Hello,

This is Tianyou Li from Intel. This JIRA, and the YAPP proposal specifically, 
were recently brought to our attention. We think YAPP is very interesting 
and would be eager to participate in the development of the software and 
the creation of a viable developer and user community. We have been working on 
hosting Hive execution plans in a specialized AM that can run MR jobs 
internally according to a DAG supplied by the Hive front end. Initial 
performance tests show attractive numbers that seem to bear out the approach. 
Our current plan is to finish a production-ready specialized Hive AM for 
executing plans (job DAGs), and then work on managing reuse of a scalable pool 
of persistent containers for the executors, and also reuse of the specialized 
Hive AM so the AM does not need to be instantiated for every query.

However, rather than focusing exclusively on a specialized Hive AM, it would be 
great if we could contribute our efforts to something useful to Hive, Pig, 
Oozie, and the many other new efforts that could benefit. We hope this is a 
suitable time to consider collaboration, before we make any further progress. 
We would like to contribute our work in some form and, more importantly, our 
ongoing efforts. My colleague Yi Liu (hitli...@gmail.com) and I (Tianyou Li, 
tianyou...@gmail.com) have been doing the work described above internally and 
would like to volunteer as additional initial committers on the YAPP proposal, 
with the backing of our employer, Intel. Others on our team are Apache 
committers and PMC members, so we are aware of the responsibilities and are 
committed to fulfilling them.

Thank you for your kind consideration.

                
> Workflow Application Master in YARN
> -----------------------------------
>
>                 Key: OOZIE-1178
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1178
>             Project: Oozie
>          Issue Type: New Feature
>            Reporter: Bo Wang
>         Attachments: MAPREDUCE-4495-v1.1.patch, MAPREDUCE-4495-v1.patch, 
> MapReduceWorkflowAM.pdf, yapp_proposal.txt
>
>
> It is useful to have a workflow application master, which will be capable of 
> running a DAG of jobs. The workflow client submits a DAG request to the AM 
> and then the AM will manage the life cycle of this application in terms of 
> requesting the needed resources from the RM, and starting, monitoring and 
> retrying the application's individual tasks.
> Compared to running Oozie with the current MapReduce Application Master, 
> these are some of the advantages:
>  - Fewer consumed resources, since only one application master will be 
> spawned for the whole workflow.
>  - Reuse of resources, since the same resources can be used by multiple 
> consecutive jobs in the workflow (no need to request/wait for resources for 
> every individual job from the central RM).
>  - More optimization opportunities in terms of collective resource requests.
>  - Optimization opportunities in terms of rewriting and composing jobs in the 
> workflow (e.g. pushing down Mappers).
>  - This Application Master can be reused/extended by higher-level systems 
> like Pig and Hive to provide an optimized way of running their workflows.
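
The issue description above says the workflow AM runs a DAG of jobs and starts, 
monitors and retries the individual pieces. As a rough illustration only, the 
control loop could look like the sketch below. It is not taken from the 
attached patches or from yapp_proposal.txt: the names run_workflow, run_job and 
max_retries are hypothetical, each job is modelled as a plain callback instead 
of YARN containers, and jobs run one at a time for simplicity.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(dag, run_job, max_retries=2):
    """Run every job in dependency order and return the completion order.

    dag maps each job name to the set of jobs it depends on.
    run_job(name) is a hypothetical callback that returns True on success.
    """
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    completed = []
    while sorter.is_active():
        # get_ready() yields the jobs whose dependencies have all finished.
        for job in sorter.get_ready():
            for _attempt in range(max_retries + 1):
                if run_job(job):
                    break
            else:
                raise RuntimeError(
                    f"job {job!r} failed after {max_retries + 1} attempts"
                )
            completed.append(job)
            sorter.done(job)  # unblocks the jobs that depend on this one
    return completed


# Usage with made-up job names: two extract jobs feed a join, then a report.
example_dag = {"join": {"extract_a", "extract_b"}, "report": {"join"}}
order = run_workflow(example_dag, lambda name: True)
```

A real AM would launch the ready jobs concurrently and keep their resources for 
the next stage, which is where the resource-reuse advantages listed in the 
description would come from; the sketch only shows the ordering and retry logic.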

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
