[ 
https://issues.apache.org/jira/browse/YARN-292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706432#comment-13706432
 ] 

Zhijie Shen commented on YARN-292:
----------------------------------

{code}
      // Acquire the AM container from the scheduler.
      Allocation amContainerAllocation = appAttempt.scheduler.allocate(
          appAttempt.applicationAttemptId, EMPTY_CONTAINER_REQUEST_LIST,
          EMPTY_CONTAINER_RELEASE_LIST, null, null);
{code}
The above code will eventually pull the newly allocated containers in 
newlyAllocatedContainers.

Logically, AMContainerAllocatedTransition happens after RMAppAttempt receives 
CONTAINER_ALLOCATED. CONTAINER_ALLOCATED is sent during 
ContainerStartedTransition, when RMContainer is moving from NEW to ALLOCATED. 
Therefore, pulling newlyAllocatedContainers happens when RMContainer is at 
ALLOCATED. In contrast, RMContainer is added to newlyAllocatedContainers when 
it is still at NEW. In conclusion, one container in the allocation is expected 
in AMContainerAllocatedTransition.

Hinted by [~nemon], the problem may happen at
{code}
    FiCaSchedulerApp application = getApplication(applicationAttemptId);
    if (application == null) {
      LOG.error("Calling allocate on removed " +
          "or non existant application " + applicationAttemptId);
      return EMPTY_ALLOCATION;
    }
{code}
EMPTY_ALLOCATION has 0 container. Another observation is that there seems to be 
inconsistent synchronization on accessing the application map.

Suddenly be aware that [~djp] has started working on this problem. Please feel 
free to take it over. Thanks! 
                
> ResourceManager throws ArrayIndexOutOfBoundsException while handling 
> CONTAINER_ALLOCATED for application attempt
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-292
>                 URL: https://issues.apache.org/jira/browse/YARN-292
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 2.0.1-alpha
>            Reporter: Devaraj K
>            Assignee: Zhijie Shen
>
> {code:xml}
> 2012-12-26 08:41:15,030 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: 
> Calling allocate on removed or non existant application 
> appattempt_1356385141279_49525_000001
> 2012-12-26 08:41:15,031 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
> handling event type CONTAINER_ALLOCATED for applicationAttempt 
> application_1356385141279_49525
> java.lang.ArrayIndexOutOfBoundsException: 0
>       at java.util.Arrays$ArrayList.get(Arrays.java:3381)
>       at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:655)
>       at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:644)
>       at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:357)
>       at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298)
>       at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
>       at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
>       at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:490)
>       at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:80)
>       at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:433)
>       at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:414)
>       at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
>       at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
>       at java.lang.Thread.run(Thread.java:662)
>  {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to