[ 
https://issues.apache.org/jira/browse/TWILL-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15844119#comment-15844119
 ] 

ASF GitHub Bot commented on TWILL-181:
--------------------------------------

Github user serranom commented on a diff in the pull request:

    https://github.com/apache/twill/pull/23#discussion_r98334746
  
    --- Diff: twill-yarn/src/main/java/org/apache/twill/internal/appmaster/RunningContainers.java ---
    @@ -113,9 +117,11 @@ public Integer apply(BitSet input) {
       private final Location applicationLocation;
       private final Set<String> runnableNames;
       private final Map<String, Map<String, String>> logLevels;
    +  private final Map<String, Integer> maxRetries;
    --- End diff --
    
    But on second thought, the additional complexity of tracking retries by
    instance id would handle the following case better:
    
    - Initially start with `x` instances, all of which succeed.
    - Add 1 more instance, which fails.
    
    When tracking by instance id, the new instance gets `maxRetries` retries.
    Without tracking by instance id, the new instance would get
    `maxRetries*(x+1)` retries. So I will add the complexity of keeping track
    by id. It seems worth it.
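
The difference between the two approaches can be illustrated with a rough sketch. This is not the code from the pull request: the class `RetryTrackingSketch`, both tracker classes, and the helper methods are hypothetical names, and the sketch assumes that a runnable-level budget would be computed as `maxRetries` times the number of instances, which is one reading of the `maxRetries*(x+1)` figure above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch, not Twill code: compares two ways of counting retries. */
final class RetryTrackingSketch {

  /** Assumed variant: one shared failure counter for the whole runnable. */
  static final class PerRunnableTracker {
    private final int budget;
    private int failures;

    PerRunnableTracker(int maxRetries, int instanceCount) {
      // Assumption: the shared budget scales with the number of instances.
      this.budget = maxRetries * instanceCount;
    }

    /** Records one failure and reports whether a retry is still allowed. */
    boolean shouldRetry() {
      failures++;
      return failures <= budget;
    }
  }

  /** Variant described in the comment: one failure counter per instance id. */
  static final class PerInstanceTracker {
    private final int maxRetries;
    private final Map<Integer, Integer> failuresByInstance = new HashMap<>();

    PerInstanceTracker(int maxRetries) {
      this.maxRetries = maxRetries;
    }

    /** Records one failure for the instance and reports whether it may retry. */
    boolean shouldRetry(int instanceId) {
      int failures = failuresByInstance.merge(instanceId, 1, Integer::sum);
      return failures <= maxRetries;
    }
  }

  /** Retries granted to a single failing instance under the shared counter. */
  static int retriesGrantedPerRunnable(int maxRetries, int instanceCount) {
    PerRunnableTracker tracker = new PerRunnableTracker(maxRetries, instanceCount);
    int granted = 0;
    while (tracker.shouldRetry()) {
      granted++;
    }
    return granted;
  }

  /** Retries granted to a single failing instance under per-instance tracking. */
  static int retriesGrantedPerInstance(int maxRetries, int failingInstanceId) {
    PerInstanceTracker tracker = new PerInstanceTracker(maxRetries);
    int granted = 0;
    while (tracker.shouldRetry(failingInstanceId)) {
      granted++;
    }
    return granted;
  }

  public static void main(String[] args) {
    int maxRetries = 3;
    int x = 4; // healthy instances; the added instance (id 4) is the only one failing
    System.out.println("shared counter: " + retriesGrantedPerRunnable(maxRetries, x + 1));
    System.out.println("per instance:   " + retriesGrantedPerInstance(maxRetries, x));
  }
}
```

With `maxRetries = 3` and `x = 4`, the shared counter lets the single failing instance consume 15 retries, while per-instance tracking stops it after 3.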


> Control the maximum number of retries for failed application starts
> -------------------------------------------------------------------
>
>                 Key: TWILL-181
>                 URL: https://issues.apache.org/jira/browse/TWILL-181
>             Project: Apache Twill
>          Issue Type: Improvement
>          Components: yarn
>    Affects Versions: 0.7.0-incubating
>            Reporter: Martin Serrano
>            Assignee: Martin Serrano
>             Fix For: 0.10.0
>
>
> If an application consistently exits with a non-zero code, Twill will
> attempt to restart it indefinitely. I ran into this issue, and a list search
> also reveals [others|http://markmail.org/message/dehx7r6tpqgcmjh4].
> There should be a mechanism to specify the maximum number of retries before
> the application fails. Ideally, the default would be a finite maximum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
