[ 
https://issues.apache.org/jira/browse/YARN-301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13541592#comment-13541592
 ] 

shenhong commented on YARN-301:
-------------------------------

  The cause is that when the SchedulerEventDispatcher thread runs 
assignContainer, it gets the priorities from AppSchedulingInfo, where the code is: 
  synchronized public Collection<Priority> getPriorities() {
    return priorities;
  }
However, this only returns a reference to the live priorities collection, so 
the lock is released before AppSchedulable#assignContainer iterates over it:
    // (not scheduled) in order to promote better locality.
    for (Priority priority : app.getPriorities()) {
      app.addSchedulingOpportunity(priority); 
    ...

  Meanwhile, when the RM processes a request from the AM, it updates the same 
priorities collection in AppSchedulingInfo#updateResourceRequests:
      if (asks == null) {
        asks = new HashMap<String, ResourceRequest>();
        this.requests.put(priority, asks);
        this.priorities.add(priority);
      } else if (updatePendingResources) {

Adding to the collection while the other thread is still iterating over it 
results in the ConcurrentModificationException.
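
To make the failure mode concrete, here is a minimal single-threaded sketch. It is not the actual YARN code and not necessarily the fix that went into the project. PrioritySnapshotDemo is a hypothetical stand-in for AppSchedulingInfo, Integer stands in for Priority, and getPrioritiesSnapshot() and addPriority() are names made up for illustration. The TreeSet is assumed from the TreeMap$KeyIterator frames in the stack trace. The add inside the loop simulates the update that the other thread performs in updateResourceRequests.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.ConcurrentModificationException;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for AppSchedulingInfo; Integer replaces Priority.
class PrioritySnapshotDemo {
    // Assumed to be a TreeSet, based on the TreeMap iterator in the stack trace.
    private final Set<Integer> priorities = new TreeSet<>();

    // Mirrors the quoted getter: the lock is released on return, so the
    // caller iterates the live set without any protection.
    synchronized Collection<Integer> getPriorities() {
        return priorities;
    }

    // Assumed alternative (illustration only): copy while holding the lock,
    // so the caller iterates a private snapshot.
    synchronized Collection<Integer> getPrioritiesSnapshot() {
        return new ArrayList<>(priorities);
    }

    // Stand-in for this.priorities.add(priority) in updateResourceRequests.
    synchronized void addPriority(int priority) {
        priorities.add(priority);
    }

    // Builds an app with three priorities already registered.
    static PrioritySnapshotDemo newApp() {
        PrioritySnapshotDemo app = new PrioritySnapshotDemo();
        app.addPriority(1);
        app.addPriority(2);
        app.addPriority(3);
        return app;
    }

    // Returns true if iterating `view` while a priority is added to `app`
    // throws ConcurrentModificationException.
    static boolean failsWhenModified(PrioritySnapshotDemo app,
                                     Collection<Integer> view) {
        try {
            for (Integer p : view) {
                app.addPriority(p + 100); // simulates the AM request update
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        PrioritySnapshotDemo live = newApp();
        System.out.println("live view fails: "
            + failsWhenModified(live, live.getPriorities()));

        PrioritySnapshotDemo copied = newApp();
        System.out.println("snapshot fails: "
            + failsWhenModified(copied, copied.getPrioritiesSnapshot()));
    }
}
```

Iterating the live view fails on the second call to next(), while iterating the snapshot completes. The trade-off of the snapshot is that the scheduler may work from a slightly stale list of priorities for one pass. Other options would be holding the same lock around the whole loop or using a concurrent sorted set such as java.util.concurrent.ConcurrentSkipListSet.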

                
> Fairscheduler appear to concurrentModificationException and RM crash
> --------------------------------------------------------------------
>
>                 Key: YARN-301
>                 URL: https://issues.apache.org/jira/browse/YARN-301
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager, scheduler
>            Reporter: shenhong
>             Fix For: 2.0.3-alpha
>
>
> In my test cluster, the FairScheduler hit a ConcurrentModificationException 
> and the RM crashed. Here is the message:
> 2012-12-30 17:14:17,171 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler
> java.util.ConcurrentModificationException
>         at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100)
>         at java.util.TreeMap$KeyIterator.next(TreeMap.java:1154)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable.assignContainer(AppSchedulable.java:297)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:181)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:780)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:842)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:98)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:340)
>         at java.lang.Thread.run(Thread.java:662)

