[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428257#comment-13428257
 ] 

Arun C Murthy commented on MAPREDUCE-4495:
------------------------------------------

Alejandro, making MR AM thread-safe is a good goal. We can do that 
independently of the new AM. I have opened MAPREDUCE-4513 for the same.

I don't know which other 'private' classes you need - frankly, that concerns me. 
It means you are adding new requirements on MR-AM, which isn't an 'api' of that 
nature.

Also, if we are going that route I strongly suggest we do not import code from 
Oozie and merely take JobControl api and support it. That should be a trivial 
exercise without adding any new 'interfaces' to MapReduce.

So, I see two options:
# Enhance JobControl api to work in AM by making MR-AM, specifically MRAppMaster, 
thread-safe. This will allow for multiple objects of MRAppMaster to be created. 
This means there are no new interfaces to MapReduce.
# Go the full distance, make it generic, import code from Oozie, come up with a 
new set of interfaces etc. etc. and do it in a separate Incubator project.

As I indicated previously, my preference is option #2 and I have already 
offered help to deal with the specifics so you and Bo can concentrate on 
getting the code out.

Thoughts?
                
> Workflow Application Master in YARN
> -----------------------------------
>
>                 Key: MAPREDUCE-4495
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>    Affects Versions: 2.0.0-alpha
>            Reporter: Bo Wang
>            Assignee: Bo Wang
>
> It is useful to have a workflow application master, which will be capable of 
> running a DAG of jobs. The workflow client submits a DAG request to the AM 
> and then the AM will manage the life cycle of this application in terms of 
> requesting the needed resources from the RM, and starting, monitoring and 
> retrying the application's individual tasks.
> Compared to running Oozie with the current MapReduce Application Master, 
> these are some of the advantages:
>  - Fewer consumed resources, since only one application master will 
> be spawned for the whole workflow.
>  - Reuse of resources, since the same resources can be used by multiple 
> consecutive jobs in the workflow (no need to request/wait for resources for 
> every individual job from the central RM).
>  - More optimization opportunities in terms of collective resource requests.
>  - Optimization opportunities in terms of rewriting and composing jobs in the 
> workflow (e.g. pushing down Mappers).
>  - This Application Master can be reused/extended by higher-level systems like 
> Pig and Hive to provide an optimized way of running their workflows.
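
To make the "DAG of jobs" idea in the description above concrete, here is a 
minimal sketch of dependency-ordered job scheduling inside a single process. 
It is an illustration only: WorkflowSketch, addJob and executionOrder are 
hypothetical names invented for this example, not part of Hadoop, the 
JobControl api, or the proposed workflow AM, and the jobs are plain strings 
rather than real MapReduce jobs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: orders a DAG of named jobs so dependencies run first. */
public class WorkflowSketch {
    // Job name -> names of the jobs it depends on (insertion order preserved).
    private final Map<String, List<String>> dependencies = new LinkedHashMap<>();

    /** Registers a job and the jobs that must finish before it starts. */
    public void addJob(String name, String... dependsOn) {
        dependencies.put(name, Arrays.asList(dependsOn));
    }

    /** Returns an order in which every job appears after its dependencies. */
    public List<String> executionOrder() {
        List<String> order = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        for (String job : dependencies.keySet()) {
            visit(job, done, inProgress, order);
        }
        return order;
    }

    // Depth-first traversal: schedule all dependencies, then the job itself.
    private void visit(String job, Set<String> done, Set<String> inProgress,
                       List<String> order) {
        if (done.contains(job)) {
            return;
        }
        if (!dependencies.containsKey(job)) {
            throw new IllegalArgumentException("Unknown job: " + job);
        }
        if (!inProgress.add(job)) {
            throw new IllegalStateException("Cycle detected at job: " + job);
        }
        for (String dep : dependencies.get(job)) {
            visit(dep, done, inProgress, order);
        }
        inProgress.remove(job);
        done.add(job);
        order.add(job);
    }

    public static void main(String[] args) {
        WorkflowSketch workflow = new WorkflowSketch();
        workflow.addJob("load", "transform");
        workflow.addJob("extract");
        workflow.addJob("transform", "extract");
        System.out.println(workflow.executionOrder());
    }
}
```

A real workflow AM would additionally have to request containers from the RM, 
monitor tasks and handle retries; the sketch covers only the ordering step.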

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
