[ 
https://issues.apache.org/jira/browse/YARN-8080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16470766#comment-16470766
 ] 

Billie Rinaldi commented on YARN-8080:
--------------------------------------

Hi [~suma.shivaprasad], thanks for the new patch. I like the addition of the 
restart policy handlers. It looks like the patch is missing a couple of the new 
classes. 

Thanks also for adding the dependency checking. I think it would be better to 
add an isReady method to Component, so instead of saying 
!restartPolicy.isDependentComponentReady(dependentComponent), you could have 
!dependentComponent.isReady(), and the dependentComponent could use its own 
restartPolicy to check its readiness. Maybe the 
restartPolicy.isDependentComponentReady method should be renamed 
isComponentReady. A comment would be helpful in the isReady method saying that 
it means the component is ready for other components that depend on it to 
start. (If isReady doesn't seem specific enough, we could use something like 
readyForDownstream or readyForDependentsToStart.)
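
A minimal sketch of the suggested shape, for illustration only. The classes below are simplified, hypothetical stand-ins rather than the actual YARN service classes, and the readiness rule in AlwaysRestartPolicyHandler (all desired instances are ready) is an assumption, not what the patch necessarily does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins; not the real YARN service classes.
interface RestartPolicyHandler {
  // Renamed from isDependentComponentReady, as suggested above.
  boolean isComponentReady(Component component);
}

class Component {
  private final String name;
  private final RestartPolicyHandler restartPolicy;
  private final List<Component> dependencies = new ArrayList<>();
  private final int desiredInstances;
  private int readyInstances;

  Component(String name, RestartPolicyHandler restartPolicy,
      int desiredInstances) {
    this.name = name;
    this.restartPolicy = restartPolicy;
    this.desiredInstances = desiredInstances;
  }

  String getName() { return name; }
  int getDesiredInstances() { return desiredInstances; }
  int getReadyInstances() { return readyInstances; }
  void setReadyInstances(int readyInstances) {
    this.readyInstances = readyInstances;
  }
  void addDependency(Component dependency) { dependencies.add(dependency); }

  /**
   * Returns true when this component is ready for the components that
   * depend on it to start. The check is delegated to this component's
   * own restart policy.
   */
  boolean isReady() {
    return restartPolicy.isComponentReady(this);
  }

  /** Callers ask each dependency directly instead of asking a policy. */
  boolean areDependenciesReady() {
    for (Component dependency : dependencies) {
      if (!dependency.isReady()) {
        return false;
      }
    }
    return true;
  }
}

// Assumed rule for illustration: ready once all desired instances are ready.
class AlwaysRestartPolicyHandler implements RestartPolicyHandler {
  @Override
  public boolean isComponentReady(Component component) {
    return component.getReadyInstances() >= component.getDesiredInstances();
  }
}
```

With this shape, the caller only needs !dependentComponent.isReady() and never has to know which restart policy the dependency uses.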

There seem to be some indentation changes in FlexComponentTransition that 
aren't needed, and the patch still needs the suceeded -> succeeded typo fixes.

> YARN native service should support component restart policy
> -----------------------------------------------------------
>
>                 Key: YARN-8080
>                 URL: https://issues.apache.org/jira/browse/YARN-8080
>             Project: Hadoop YARN
>          Issue Type: Task
>            Reporter: Wangda Tan
>            Assignee: Suma Shivaprasad
>            Priority: Critical
>         Attachments: YARN-8080.001.patch, YARN-8080.002.patch, 
> YARN-8080.003.patch, YARN-8080.005.patch, YARN-8080.006.patch, 
> YARN-8080.007.patch, YARN-8080.009.patch
>
>
> The existing native service assumes the service is long running and never 
> finishes. Containers will be restarted even if exit code == 0. 
> To support broader use cases, we need to allow users to specify a restart 
> policy per component. Propose to have the following policies:
> 1) Always: containers are always restarted by the framework regardless of 
> container exit status. This is the existing/default behavior.
> 2) Never: Do not restart containers in any case after the container 
> finishes. This supports job-like workloads (for example, a Tensorflow 
> training job). If a task exits with code == 0, we should not restart the 
> task. This can also be used by services that are not restartable or 
> recoverable.
> 3) On-failure: Similar to above, but only restart tasks with exit code != 0. 
>
> Behaviors after a component *instance* is finalized (Succeeded or Failed 
> when restart_policy != ALWAYS): 
> 1) For a single component with a single instance: complete the service.
> 2) For a single component with multiple instances: other running instances 
> from the same component won't be affected by the finalized component 
> instance. The service will be terminated once all instances are finalized. 
> 3) For multiple components: the service will be terminated once all 
> components are finalized.
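
The policies and termination rules quoted above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch's implementation: the RestartPolicy enum, the shouldRestart method, and the ServiceCompletion helper are hypothetical names introduced here.

```java
import java.util.Collection;

// Hypothetical enum mirroring the three proposed policies.
enum RestartPolicy {
  ALWAYS, NEVER, ON_FAILURE;

  /** Decides whether a finished container should be restarted. */
  boolean shouldRestart(int exitCode) {
    switch (this) {
      case ALWAYS:
        return true;            // restart regardless of exit status
      case NEVER:
        return false;           // never restart after the container finishes
      case ON_FAILURE:
        return exitCode != 0;   // restart only on non-zero exit code
      default:
        throw new IllegalStateException("Unknown policy: " + this);
    }
  }
}

// Hypothetical helper: the service is complete once every instance of
// every component is finalized (Succeeded or Failed without restart).
final class ServiceCompletion {
  private ServiceCompletion() { }

  /**
   * @param finalizedFlagsPerComponent one collection per component, holding
   *        one flag per instance; true means the instance is finalized
   */
  static boolean isServiceComplete(
      Collection<? extends Collection<Boolean>> finalizedFlagsPerComponent) {
    for (Collection<Boolean> instanceFlags : finalizedFlagsPerComponent) {
      for (boolean finalized : instanceFlags) {
        if (!finalized) {
          return false;  // a running instance keeps the service alive
        }
      }
    }
    return true;
  }
}
```

Under ALWAYS an instance is restarted on every exit and so never becomes finalized, which is consistent with the description restricting the termination rules to restart_policy != ALWAYS.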



