Wangda Tan created YARN-8080:
--------------------------------

             Summary: YARN native service should support component restart policy
                 Key: YARN-8080
                 URL: https://issues.apache.org/jira/browse/YARN-8080
             Project: Hadoop YARN
          Issue Type: Task
            Reporter: Wangda Tan
            Assignee: Wangda Tan
         Attachments: YARN-8080.001.patch

The existing native service framework assumes that a service is long-running 
and never finishes: containers are restarted even if they exit with code 0. 

To support broader use cases, we need to let users specify a restart policy per 
component. I propose the following policies:
1) Always: containers are always restarted by the framework, regardless of the 
container exit status. This is the existing/default behavior.
2) Never: containers are not restarted in any case after they finish. This 
supports job-like workloads (for example, a TensorFlow training job): if a task 
exits with code 0, we should not restart it. It can also be used by services 
that cannot be restarted or recovered.
3) On-failure: similar to Never, except that a task is restarted when its exit 
code != 0. 
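
As a rough illustration of the three policies above, the restart decision can 
be sketched as below. This is Python rather than the Java used by YARN, and the 
names RestartPolicy and should_restart are hypothetical, chosen only for this 
sketch. They are not taken from the attached patch.

```python
from enum import Enum


class RestartPolicy(Enum):
    """Hypothetical names for the three proposed policies."""
    ALWAYS = "ALWAYS"
    NEVER = "NEVER"
    ON_FAILURE = "ON_FAILURE"


def should_restart(policy: RestartPolicy, exit_code: int) -> bool:
    """Decide whether a finished container should be restarted."""
    if policy is RestartPolicy.ALWAYS:
        # Existing/default behavior: restart regardless of exit status.
        return True
    if policy is RestartPolicy.NEVER:
        # Job-like workloads: never restart, whatever the exit code.
        return False
    # ON_FAILURE: restart only when the container exited with a non-zero code.
    return exit_code != 0
```

For example, should_restart(RestartPolicy.ON_FAILURE, 0) is False, while the 
same call with exit code 1 is True.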

Behavior after a component *instance* is finalized (Succeeded or Failed, when 
restart_policy != ALWAYS): 
1) Single component, single instance: the service completes.
2) Single component, multiple instances: other running instances of the same 
component are not affected by the finalized instance. The service is terminated 
once all instances are finalized. 
3) Multiple components: the service is terminated once all components are 
finalized.
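
The three cases above reduce to one rule: the service terminates once every 
instance of every component has been finalized. A minimal Python sketch of that 
check follows. The state strings and function names are assumptions made for 
illustration, not identifiers from the YARN code base.

```python
# Hypothetical terminal states for a component instance.
FINAL_STATES = {"SUCCEEDED", "FAILED"}


def component_finalized(instance_states):
    """A component is finalized once all of its instances are in a final state."""
    return all(state in FINAL_STATES for state in instance_states)


def service_should_terminate(components):
    """components maps a component name to the list of its instance states.

    The service terminates only when every component is finalized. A service
    with no components is treated as not terminable in this sketch.
    """
    if not components:
        return False
    return all(component_finalized(states) for states in components.values())
```

With a single component whose instances are ["SUCCEEDED", "RUNNING"], the 
service keeps running. It terminates only after the remaining instance also 
reaches Succeeded or Failed.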



