[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473727#comment-13473727
 ] 

Mayank Bansal commented on MAPREDUCE-4495:
------------------------------------------

From the beginning I have been in favor of adding a WF DAG to the AM, and I have
already shown interest in contributing to that. Earlier I was in favor of adding
this work to MR and YARN, but Bobby correctly pointed out that it would be very
difficult for us to keep Oozie and WFAM in sync.

I don't think we should make this WFAM part of Oozie either, because other
projects like Pig and Hive will want to use it, and tying it to Oozie would be a
problem for them and for any future project that wants to use WF functionality.

Ideally it should be a library/project that other projects like Oozie, Pig and
Hive can depend on. One can argue that it should then be part of MR and YARN so
that every other project can depend on it. But if we do that, then every new
piece of functionality that implements the AM interface has to be part of YARN,
which does not seem right to me at this point. I think the basic idea behind
the AM is that any new application/project can implement it and use the YARN
framework. That does not necessarily mean it should be part of the YARN
framework itself.

Based on my understanding, I think we should create a new project and move
forward. I am very much willing to contribute to that. It would be easier for us
to innovate and move forward in a separate project than as part of YARN.

That's just my understanding.

Thoughts?


                
> Workflow Application Master in YARN
> -----------------------------------
>
>                 Key: MAPREDUCE-4495
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>    Affects Versions: 2.0.0-alpha
>            Reporter: Bo Wang
>            Assignee: Bo Wang
>         Attachments: MAPREDUCE-4495-v1.1.patch, MAPREDUCE-4495-v1.patch, 
> MapReduceWorkflowAM.pdf
>
>
> It is useful to have a workflow application master, which will be capable of 
> running a DAG of jobs. The workflow client submits a DAG request to the AM 
> and then the AM will manage the life cycle of this application in terms of 
> requesting the needed resources from the RM, and starting, monitoring and 
> retrying the application's individual tasks.
> Compared to running Oozie with the current MapReduce Application Master, 
> these are some of the advantages:
>  - Fewer consumed resources, since only one application master will
> be spawned for the whole workflow.
>  - Reuse of resources, since the same resources can be used by multiple 
> consecutive jobs in the workflow (no need to request/wait for resources for 
> every individual job from the central RM).
>  - More optimization opportunities in terms of collective resource requests.
>  - Optimization opportunities in terms of rewriting and composing jobs in the 
> workflow (e.g. pushing down Mappers).
>  - This Application Master can be reused/extended by higher-level systems
> like Pig and Hive to provide an optimized way of running their workflows.

