[ https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428257#comment-13428257 ]
Arun C Murthy commented on MAPREDUCE-4495:
------------------------------------------

Alejandro, making MR AM thread-safe is a good goal. We can do that independently of the new AM. I have opened MAPREDUCE-4513 for the same.

I don't know which other 'private' classes you need - frankly that concerns me. It means you are adding new requirements on MR-AM which isn't an 'api' of that nature.

Also, if we are going that route I strongly suggest we do not import code from Oozie and merely take JobControl api and support it. That should be a trivial exercise without adding any new 'interfaces' to MapReduce.

So, I see two options:
# Enhance JobControl api to work in AM by making MR-AM, specifically MRAppMaster, thread-safe. This will allow for multiple objects of MRAppMaster to be created. This means there are no new interfaces to MapReduce.
# Go the full distance, make it generic, import code from Oozie, come up with a new set of interfaces etc. and do it in a separate Incubator project.

As I indicated previously, my preference is option #2 and I have already offered help to deal with the specifics so you and Bo can concentrate on getting the code out.

Thoughts?

> Workflow Application Master in YARN
> -----------------------------------
>
>                 Key: MAPREDUCE-4495
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>    Affects Versions: 2.0.0-alpha
>            Reporter: Bo Wang
>            Assignee: Bo Wang
>
> It is useful to have a workflow application master, which will be capable of
> running a DAG of jobs. The workflow client submits a DAG request to the AM
> and then the AM will manage the life cycle of this application in terms of
> requesting the needed resources from the RM, and starting, monitoring and
> retrying the application's individual tasks.
> Compared to running Oozie with the current MapReduce Application Master,
> these are some of the advantages:
> - Fewer resources consumed, since only one application master will
> be spawned for the whole workflow.
> - Reuse of resources, since the same resources can be used by multiple
> consecutive jobs in the workflow (no need to request/wait for resources for
> every individual job from the central RM).
> - More optimization opportunities in terms of collective resource requests.
> - Optimization opportunities in terms of rewriting and composing jobs in the
> workflow (e.g. pushing down Mappers).
> - This Application Master can be reused/extended by higher-level systems like Pig
> and Hive to provide an optimized way of running their workflows.
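
For readers unfamiliar with the "DAG of jobs" idea in the issue description, here is a minimal, hypothetical sketch in plain Java of the ordering problem a workflow application master would have to solve: given each job's prerequisites, produce an order in which every job runs only after the jobs it depends on. The class name `WorkflowSketch`, the method `executionOrder`, and the job names are illustrative assumptions. None of this is taken from Hadoop, Oozie, or the patch under discussion, and it leaves out resource requests, monitoring, and retries entirely.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: orders the jobs of a workflow DAG so prerequisites run first. */
final class WorkflowSketch {

    /**
     * Computes a dependency-respecting order using Kahn's topological sort.
     *
     * @param prerequisites maps each job name to the jobs it depends on; every job
     *                      must appear as a key (use an empty list for root jobs)
     * @return job names in an order where each job follows all of its prerequisites
     * @throws IllegalArgumentException if a prerequisite is unknown or the graph has a cycle
     */
    static List<String> executionOrder(Map<String, List<String>> prerequisites) {
        Map<String, Integer> pending = new LinkedHashMap<>();          // unmet prerequisite count
        Map<String, List<String>> dependents = new LinkedHashMap<>();  // reverse edges
        for (String job : prerequisites.keySet()) {
            pending.put(job, 0);
            dependents.put(job, new ArrayList<>());
        }
        for (Map.Entry<String, List<String>> entry : prerequisites.entrySet()) {
            for (String dep : entry.getValue()) {
                if (!dependents.containsKey(dep)) {
                    throw new IllegalArgumentException("Unknown job: " + dep);
                }
                dependents.get(dep).add(entry.getKey());
                pending.merge(entry.getKey(), 1, Integer::sum);
            }
        }

        // Jobs with no unmet prerequisites are ready to run.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : pending.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String job = ready.remove();
            order.add(job); // a real application master would launch and monitor the job here
            for (String next : dependents.get(job)) {
                if (pending.merge(next, -1, Integer::sum) == 0) {
                    ready.add(next);
                }
            }
        }
        if (order.size() != prerequisites.size()) {
            throw new IllegalArgumentException("Workflow contains a dependency cycle");
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> dag = new LinkedHashMap<>();
        dag.put("extract", List.of());
        dag.put("clean", List.of("extract"));
        dag.put("join", List.of("extract"));
        dag.put("report", List.of("clean", "join"));
        System.out.println(executionOrder(dag));
    }
}
```

In this toy example "clean" and "join" both become ready as soon as "extract" finishes, which is the point at which a single application master could reuse already-allocated resources instead of requesting them again for each job.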