[ https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427484#comment-13427484 ]
Josh Wills commented on MAPREDUCE-4495:
---------------------------------------

@Chris, thanks for clarifying your meaning. I think the line that threw me for a curve in your original comment was "Putting code into Apache Hadoop is the same as putting code in yet-to-be-named-Apache-Incubator-project," since for that to be true, we would need the incubating project to have, at the very least, a source code repository created by infrastructure.

And of course, we shouldn't rely on anyone's anecdotal experience, especially when we have tools that can show us resolution times for infrastructure issues tagged with Subversion going all the way back to 2005:

Quarter   Closed   Days   Days/Closed
Q1/2005        1     17            17
Q2/2005       18    459            25
Q3/2005       24    403            16
Q4/2005       12     98             8
Q1/2006        8    104            13
Q2/2006        8    149            18
Q3/2006        8    483            60
Q4/2006       10    745            74
Q1/2007        5    733           146
Q2/2007        4     27             6
Q3/2007        1      9             9
Q4/2007        1      0             0
Q1/2008       13    328            25
Q2/2008        7     90            12
Q3/2008        3     65            21
Q4/2008        7    519            74
Q1/2009        9   1393           154
Q2/2009        4     92            23
Q3/2009        8    409            51
Q4/2009        9    934           103
Q1/2010        9     42             4
Q2/2010       13    749            57
Q3/2010        7     92            13
Q4/2010       17   1086            63
Q1/2011       11    102             9
Q2/2011       11     82             7
Q3/2011       17     96             5
Q4/2011       10     72             7
Q1/2012       20     72             3
Q2/2012       13    129             9
Q3/2012        2     10             5

Or for git since it was added in 2009:

Quarter   Closed   Days   Days/Closed
Q1/2009        2      6             3
Q2/2009       22    158             7
Q3/2009       17    249            14
Q4/2009        4    114            28
Q1/2010       12    266            22
Q2/2010       11     80             7
Q3/2010       15    271            18
Q4/2010       17    112             6
Q1/2011       15    379            25
Q2/2011       14      7             0
Q3/2011       33   2281            69
Q4/2011       33    632            19
Q1/2012       23    403            17
Q2/2012       28    491            17
Q3/2012       18   1074            59

Sigh. The median would be so much more useful, right?

> Workflow Application Master in YARN
> -----------------------------------
>
>                 Key: MAPREDUCE-4495
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>    Affects Versions: 2.0.0-alpha
>            Reporter: Bo Wang
>            Assignee: Bo Wang
>
> It is useful to have a workflow application master, which will be capable of
> running a DAG of jobs.
> The workflow client submits a DAG request to the AM
> and then the AM will manage the life cycle of this application in terms of
> requesting the needed resources from the RM, and starting, monitoring and
> retrying the application's individual tasks.
> Compared to running Oozie with the current MapReduce Application Master,
> these are some of the advantages:
> - Fewer consumed resources, since only one application master will
> be spawned for the whole workflow.
> - Reuse of resources, since the same resources can be used by multiple
> consecutive jobs in the workflow (no need to request/wait for resources for
> every individual job from the central RM).
> - More optimization opportunities in terms of collective resource requests.
> - Optimization opportunities in terms of rewriting and composing jobs in the
> workflow (e.g. pushing down Mappers).
> - This Application Master can be reused/extended by higher systems like Pig
> and Hive to provide an optimized way of running their workflows.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
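
A note on the resolution-time tables in the comment: the third column appears to be total days divided by issues closed, truncated to an integer, i.e. a per-quarter mean. The sketch below checks that reading against a few rows copied from the Subversion table, then shows why the comment wishes for a median. The per-issue durations are not in the source; the list HYPOTHETICAL_DAYS is invented purely for illustration and is only chosen to add up to the Q1/2009 totals.

```python
import statistics

# (quarter, issues closed, total days, third column as printed)
# Rows copied from the Subversion table in the comment.
ROWS = [
    ("Q2/2005", 18, 459, 25),
    ("Q1/2007", 5, 733, 146),
    ("Q4/2007", 1, 0, 0),
    ("Q1/2009", 9, 1393, 154),
    ("Q1/2012", 20, 72, 3),
]


def mean_days(closed, days):
    """Mean days per closed issue, truncated the way the table appears to be."""
    return days // closed


# The printed third column matches integer division of days by issues closed.
for quarter, closed, days, printed in ROWS:
    assert mean_days(closed, days) == printed, quarter

# HYPOTHETICAL per-issue durations (not from the source): nine issues summing
# to 1393 days, like Q1/2009, with one long-running outlier.
HYPOTHETICAL_DAYS = [1, 1, 2, 2, 3, 3, 4, 5, 1372]

print("mean:  ", mean_days(len(HYPOTHETICAL_DAYS), sum(HYPOTHETICAL_DAYS)))
print("median:", statistics.median(HYPOTHETICAL_DAYS))
```

Under that made-up distribution a single stale issue drives the mean to 154 days while the median is 3, which is the kind of distortion a per-quarter mean cannot rule out.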