[ 
https://issues.apache.org/jira/browse/TWILL-145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Saputra updated TWILL-145:
--------------------------------
    Description: 
This issue was found thanks to the careful eyes of [~chtyim].

When a restart-instance command is sent to all instances of a particular 
TwillRunnable, a race condition can occur: the heartbeat thread may run right 
after all containers have been released, which makes the following check pass:

{code}
      // Looks for containers requests.
      if (provisioning.isEmpty() && runnableContainerRequests.isEmpty() && runningContainers.isEmpty()) {
        LOG.info("All containers completed. Shutting down application master.");
        break;
      }
{code}
This can happen when runningContainers is already empty and the new 
runnableContainerRequests have not yet been added, so the application master 
shuts down instead of restarting the instances.

  was:
Found this issue from careful eyes of [~chtyim]

When sending restart instance to all for a particular TwillRunnable, it could 
have race condition where the heartbeat thread run right after all containers 
have been released which make the check:

     // Looks for containers requests.
      if (provisioning.isEmpty() && runnableContainerRequests.isEmpty() && 
runningContainers.isEmpty()) {
        LOG.info("All containers completed. Shutting down application master.");
        break;
      }

This could happen when all running containers are empty and new 
runnableContainerRequests has not been added.


> Potential race condition when restart all is called for a Twill runnable
> ------------------------------------------------------------------------
>
>                 Key: TWILL-145
>                 URL: https://issues.apache.org/jira/browse/TWILL-145
>             Project: Apache Twill
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 0.6.0-incubating
>            Reporter: Henry Saputra
>
> This issue was found thanks to the careful eyes of [~chtyim].
> When a restart-instance command is sent to all instances of a particular 
> TwillRunnable, a race condition can occur: the heartbeat thread may run right 
> after all containers have been released, which makes the following check pass:
> {code}
>       // Looks for containers requests.
>       if (provisioning.isEmpty() && runnableContainerRequests.isEmpty() && runningContainers.isEmpty()) {
>         LOG.info("All containers completed. Shutting down application master.");
>         break;
>       }
> {code}
> This can happen when runningContainers is already empty and the new 
> runnableContainerRequests have not yet been added, so the application master 
> shuts down instead of restarting the instances.
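
The interleaving described above can be replayed deterministically in a small, 
single-threaded sketch. This is not Twill's code and not the fix that was 
applied: the class RestartRaceSketch, the pendingRestarts counter, and the 
methods releaseAll and requestReplacement are hypothetical names introduced 
only for illustration. Only the three collection names and the emptiness 
check come from the issue.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

public class RestartRaceSketch {
    // Stand-ins for the three collections named in the check from the issue.
    public final Set<String> provisioning = new HashSet<>();
    public final Queue<String> runnableContainerRequests = new ArrayDeque<>();
    public final Set<String> runningContainers = new HashSet<>();

    // Hypothetical guard (an assumption, not part of Twill): counts restarts
    // that have released containers but not yet queued replacement requests.
    public final AtomicInteger pendingRestarts = new AtomicInteger();

    /** The emptiness check as quoted in the issue. */
    public boolean originalCheck() {
        return provisioning.isEmpty()
            && runnableContainerRequests.isEmpty()
            && runningContainers.isEmpty();
    }

    /** Same check, but never true while a restart is still in flight. */
    public boolean guardedCheck() {
        return pendingRestarts.get() == 0 && originalCheck();
    }

    /** First half of a restart-all: mark the restart, then release everything. */
    public void releaseAll() {
        pendingRestarts.incrementAndGet();
        runningContainers.clear();
    }

    /** Second half: queue the replacement request, then clear the mark. */
    public void requestReplacement(String runnableName) {
        runnableContainerRequests.add(runnableName);
        pendingRestarts.decrementAndGet();
    }

    public static void main(String[] args) {
        RestartRaceSketch am = new RestartRaceSketch();
        am.runningContainers.add("container-1");

        am.releaseAll();
        // A heartbeat that runs at this point sees all three collections empty.
        System.out.println("original check: " + am.originalCheck()); // true
        System.out.println("guarded check:  " + am.guardedCheck());  // false

        am.requestReplacement("myRunnable");
        System.out.println("after re-request: " + am.guardedCheck()); // false
    }
}
```

The sketch only shows why the window exists and one way it could be closed. 
In a real application master the release, the re-request, and the heartbeat 
check would also need to be synchronized with each other, which this 
single-threaded replay does not attempt.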



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
