[jira] [Commented] (OOZIE-1178) Workflow Application Master in YARN

2013-01-23 Thread Bo Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561429#comment-13561429
 ] 

Bo Wang commented on OOZIE-1178:


Hi Tianyou, glad to hear from you! I am a third-year PhD student at Stanford 
and have been working on this JIRA since my internship at Cloudera last summer. 
I look forward to collaborating on this project.

Hi Arun, I'd definitely love to contribute to this and get it into 
production.

 Workflow Application Master in YARN
 ---

 Key: OOZIE-1178
 URL: https://issues.apache.org/jira/browse/OOZIE-1178
 Project: Oozie
  Issue Type: New Feature
Reporter: Bo Wang
 Attachments: MAPREDUCE-4495-v1.1.patch, MAPREDUCE-4495-v1.patch, 
 MapReduceWorkflowAM.pdf, yapp_proposal.txt


 It is useful to have a workflow application master, which will be capable of 
 running a DAG of jobs. The workflow client submits a DAG request to the AM 
 and then the AM will manage the life cycle of this application in terms of 
 requesting the needed resources from the RM, and starting, monitoring and 
 retrying the application's individual tasks.
 Compared to running Oozie with the current MapReduce Application Master, 
 these are some of the advantages:
  - Fewer consumed resources, since only one application master will 
 be spawned for the whole workflow.
  - Reuse of resources, since the same resources can be used by multiple 
 consecutive jobs in the workflow (no need to request/wait for resources for 
 every individual job from the central RM).
  - More optimization opportunities in terms of collective resource requests.
  - Optimization opportunities in terms of rewriting and composing jobs in the 
 workflow (e.g. pushing down Mappers).
  - This Application Master can be reused/extended by higher-level systems 
 like Pig and Hive to provide an optimized way of running their workflows.
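
As a rough illustration of the resource-reuse point in the list above, here is 
a minimal sketch, not code from the attached patches. It assumes a workflow is 
a dict mapping each job to the jobs it depends on, and that a single workflow 
AM keeps its containers between consecutive jobs. The names run_workflow, 
containers_needed and run_job are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(dag, containers_needed, run_job):
    """Run the jobs of a DAG in dependency order under one workflow AM.

    dag maps each job to the set of jobs it depends on.
    containers_needed maps each job to the number of containers it uses.
    Returns (execution order, containers requested from the RM in total).
    """
    pool = 0        # containers currently held by the workflow AM
    requested = 0   # containers requested from the RM so far
    order = []
    for job in TopologicalSorter(dag).static_order():
        need = containers_needed[job]
        if need > pool:
            # Only the shortfall is requested; held containers are reused.
            requested += need - pool
            pool = need
        run_job(job)
        order.append(job)
    return order, requested
```

For a chain a -> b -> c needing 2, 4 and 3 containers, this sketch requests 4 
containers in total, whereas requesting per job would ask for 2 + 4 + 3 = 9.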

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (OOZIE-1178) Workflow Application Master in YARN

2013-01-21 Thread Bo Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559186#comment-13559186
 ] 

Bo Wang commented on OOZIE-1178:


Hi Andrew,

bq. Would it be possible to refresh the patch on this JIRA?

The code is in an internal repo at Cloudera, but I am now back at school and 
no longer have access to it.

{quote}
But for V2 and V3 when an AM is launched by the WF AM and not directly by the 
RM the WF AM must take over some responsibilities of the RM. I am curious how 
many of those responsibilities it will take over. I am also curious about what 
modifications will be required to other AMs so that they can interact with both 
the WF AM and also the RM directly.

bq. Would it be possible this could be handled by a RM-AM delegation API, 
with consideration for when the RM can kill a delegate not responding 
sufficiently to its responsibilities?
{quote}

This is a good question. The WfAM takes over responsibilities such as 
monitoring child AMs and killing/restarting them in case of failure. One of the 
design principles is to allow AMs to run under the WfAM without modification. 
In other words, AMs should simply treat the WfAM as the RM: all resource 
requests/releases are sent to the WfAM instead, and the WfAM then decides how 
to serve them (either locally or by forwarding them to the RM).
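
To make the "treat WfAM as the RM" idea concrete, here is a minimal sketch, 
not the code from the patch. FakeRM and WorkflowAM are hypothetical names, and 
containers are modelled as plain counts rather than real YARN objects. The 
WfAM exposes the same allocate call as the RM, serves a request from its idle 
containers when it can, and forwards only the shortfall to the RM.

```python
class FakeRM:
    """Stand-in for the ResourceManager; counts containers it hands out."""

    def __init__(self):
        self.granted = 0

    def allocate(self, n):
        self.granted += n
        return n


class WorkflowAM:
    """Hypothetical WfAM that child AMs talk to as if it were the RM."""

    def __init__(self, rm):
        self.rm = rm
        self.idle = 0  # containers held by the WfAM but currently unused

    def allocate(self, n):
        # Serve as much as possible locally, forward the rest to the RM.
        local = min(n, self.idle)
        self.idle -= local
        if n > local:
            self.rm.allocate(n - local)
        return n

    def release(self, n):
        # Released containers stay with the WfAM for later requests.
        self.idle += n
```

Because a child AM only ever calls allocate and release, it does not need to 
know whether it is talking to the RM or to the WfAM.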

When a WfAM stops responding (for a period long enough to warrant a restart), 
the RM should kill the WfAM together with all the containers allocated to it. 
These containers include child AMs and workers. When a child AM stops 
responding, the WfAM can kill and restart it.
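
The two levels of failure handling described above could be sketched as 
follows. This is only an illustration under assumed names: heartbeats are 
plain timestamps, and rm_check, wfam_check and the dict layout are 
hypothetical, not part of YARN or of the proposal.

```python
def expired(last_heartbeat, now, timeout):
    """True if no heartbeat has been seen within the timeout."""
    return now - last_heartbeat > timeout


def rm_check(wfam, now, timeout):
    """RM side: if the WfAM is unresponsive, kill it and everything under it.

    Returns the ids of the containers to kill (empty if the WfAM is alive).
    """
    if expired(wfam["last_heartbeat"], now, timeout):
        return [wfam["id"]] + wfam["child_ams"] + wfam["workers"]
    return []


def wfam_check(child_heartbeats, now, timeout):
    """WfAM side: return the child AMs that should be killed and restarted."""
    return sorted(
        am for am, ts in child_heartbeats.items() if expired(ts, now, timeout)
    )
```

The point of the split is that the RM only watches the WfAM, while the WfAM 
watches its own child AMs.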

bq. Finally, it would be interesting and useful if something like the WFAM 
proposed on this issue could maintain a persistent pool of workers...

Yes, maintaining a (relatively) persistent pool of workers can reduce the 
scheduling cost. This is a great benefit of the WfAM. Your comment reminds me 
of a discussion about an RM-WfAM protocol in one of the early design meetings. 
This protocol allows the RM to distinguish WfAMs from other AMs, so each WfAM 
can report to the RM the idle resources it retains (for possible 
reallocation). Then, when there is a global shortage of resources, the RM can 
ask WfAMs to release the resources they are holding. This protocol is not 
included in the proposal because of the potentially large changes it would 
require in the RM.
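
Since this protocol was never specified, the following is only a sketch of 
what the RM side might look like. The function name reclaim, the report format 
and the "largest idle pool first" policy are all assumptions made for 
illustration.

```python
def reclaim(idle_reports, shortage):
    """Decide how many idle containers each WfAM should release.

    idle_reports maps a WfAM id to the number of idle containers it reported.
    shortage is the number of containers the RM is missing globally.
    Returns a dict mapping WfAM id to the number of containers to release.
    """
    plan = {}
    remaining = shortage
    # Assumed policy: ask the WfAMs holding the most idle containers first.
    for wfam_id, idle in sorted(idle_reports.items(), key=lambda kv: -kv[1]):
        if remaining <= 0:
            break
        take = min(idle, remaining)
        if take > 0:
            plan[wfam_id] = take
            remaining -= take
    return plan
```

For example, with reports of 3 and 5 idle containers and a shortage of 6, this 
sketch asks the second WfAM for 5 and the first for 1.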




[jira] [Commented] (OOZIE-1178) Workflow Application Master in YARN

2013-01-21 Thread Bo Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559188#comment-13559188
 ] 

Bo Wang commented on OOZIE-1178:


Hi Arun, I'd love to keep working on WfAM, but I think the discussion about 
where to put it has not been resolved yet.
