[ 
https://issues.apache.org/jira/browse/TWILL-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15850979#comment-15850979
 ] 

ASF GitHub Bot commented on TWILL-181:
--------------------------------------

Github user serranom commented on the issue:

    https://github.com/apache/twill/pull/23
  
    I found that there is still a race condition in
ApplicationMasterService.launchRunnable.  If the number of instances is
increased right after the original request is fulfilled, the current logic can
leave the original request unpolled, which causes future requests to hang.
This seems fairly unlikely in the real world, but I'll file a JIRA for it.
For now, I've reworked the test to wait until launchRunnable has polled the
original request.  This PR is now complete.


> Control the maximum number of retries for failed application starts
> -------------------------------------------------------------------
>
>                 Key: TWILL-181
>                 URL: https://issues.apache.org/jira/browse/TWILL-181
>             Project: Apache Twill
>          Issue Type: Improvement
>          Components: yarn
>    Affects Versions: 0.7.0-incubating
>            Reporter: Martin Serrano
>            Assignee: Martin Serrano
>             Fix For: 0.10.0
>
>
> If an application consistently exits with a non-zero code, Twill will
> attempt to restart it indefinitely.  I ran into this issue, and a list search
> also reveals [others|http://markmail.org/message/dehx7r6tpqgcmjh4].
> There should be a mechanism to specify the maximum number of retries before
> the application is considered failed.  Ideally, the default would be a
> finite maximum.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
