[ https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427585#comment-13427585 ]
Chris A. Mattmann commented on MAPREDUCE-4495:
----------------------------------------------

Hi Josh:

bq. @Chris, thanks for clarifying your meaning. I think the line that threw me for a curve in your original comment was "Putting code into Apache Hadoop is the same as putting code in yet-to-be-named-Apache-Incubator-project," since for that to be true, we would need for the incubating project to have, at the very least, a source code repository created by infrastructure.

And again, having the repo created isn't anything big. Most mentors can do this themselves on day 1 of the project being accepted? It's a simple svn mkdir https://svn.apache.org/repos/asf/...

Besides that, regarding your statistics, I wouldn't put much trust in their data quality, since:

* JIRA issues aren't always closed the minute/second/millisecond the work is done. I've been involved in lots of projects where an issue is finished, and then the issue is closed days, weeks, even months later (e.g., "oh, I forgot to close that issue...")
* Not all work in INFRA is a JIRA issue.
* The JIRA SVN plugin requires that the person tag the SVN commit with the JIRA issue ID. Not everything is tracked in Subversion.

Thus, unfortunately, garbage in, garbage out.

> Workflow Application Master in YARN
> -----------------------------------
>
>                 Key: MAPREDUCE-4495
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>    Affects Versions: 2.0.0-alpha
>            Reporter: Bo Wang
>            Assignee: Bo Wang
>
> It is useful to have a workflow application master, which will be capable of
> running a DAG of jobs. The workflow client submits a DAG request to the AM
> and then the AM will manage the life cycle of this application in terms of
> requesting the needed resources from the RM, and starting, monitoring and
> retrying the application's individual tasks.
> Compared to running Oozie with the current MapReduce Application Master,
> these are some of the advantages:
> - Fewer resources consumed, since only one application master will
> be spawned for the whole workflow.
> - Reuse of resources, since the same resources can be used by multiple
> consecutive jobs in the workflow (no need to request/wait for resources for
> every individual job from the central RM).
> - More optimization opportunities in terms of collective resource requests.
> - Optimization opportunities in terms of rewriting and composing jobs in the
> workflow (e.g. pushing down Mappers).
> - This Application Master can be reused/extended by higher systems like Pig
> and Hive to provide an optimized way of running their workflows.