[ 
https://issues.apache.org/jira/browse/YARN-11147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated YARN-11147:
------------------------------
     Target Version/s: 3.4.0
    Affects Version/s: 3.4.0

> ResourceUsage and QueueCapacities classes provide node label iterators that 
> are not thread safe
> -----------------------------------------------------------------------------------------------
>
>                 Key: YARN-11147
>                 URL: https://issues.apache.org/jira/browse/YARN-11147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>    Affects Versions: 3.4.0
>            Reporter: András Győri
>            Assignee: András Győri
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> AbstractResourceUsage#getNodePartitionsSet and 
> QueueCapacities#getNodePartitionsSet return keySet, a live view of the 
> underlying HashMap's keys that changes whenever the map changes. If another 
> thread modifies the map while the view is being iterated, the iterator throws 
> a ConcurrentModificationException, as the following stack trace shows:
> {code:java}
> 2022-04-28 13:21:53,692 FATAL org.apache.hadoop.yarn.event.EventDispatcher: Error in handling event type NODE_LABELS_UPDATE to the Event Dispatcher
> java.util.ConcurrentModificationException
>     at java.util.HashMap$HashIterator.nextNode(HashMap.java:1445)
>     at java.util.HashMap$KeyIterator.next(HashMap.java:1469)
>     at com.google.common.collect.Sets$1$1.computeNext(Sets.java:758)
>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:236)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:1281)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:2115)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1900)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:169)
>     at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
>     at java.lang.Thread.run(Thread.java:748)
>  {code}
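
The failure mode described above can be reproduced without threads, because a HashMap iterator fails fast on any structural modification made outside the iterator. The sketch below is only an illustration: PartitionUsage, NodePartitionsDemo and their methods are hypothetical stand-ins, not the actual YARN classes, and returning a copied set under a lock is one possible remedy, not necessarily the change made in the patch for this issue.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a class that keeps per-partition data in a HashMap.
class PartitionUsage {
  private final Map<String, Integer> usages = new HashMap<>();

  synchronized void setUsage(String label, int value) {
    usages.put(label, value);
  }

  // Live view: backed by the map, so later puts invalidate running iterators.
  synchronized Set<String> getLiveNodePartitions() {
    return usages.keySet();
  }

  // Snapshot: an independent copy taken while holding the lock.
  synchronized Set<String> getNodePartitionsSnapshot() {
    return new HashSet<>(usages.keySet());
  }
}

class NodePartitionsDemo {
  // Builds a usage object with two partitions so iteration has a second step.
  static PartitionUsage newUsage() {
    PartitionUsage usage = new PartitionUsage();
    usage.setUsage("", 1);
    usage.setUsage("gpu", 1);
    return usage;
  }

  // Adds a new partition on every iteration step, standing in for a
  // concurrent writer. Returns true if the iteration throws.
  static boolean iterationFails(PartitionUsage usage, Set<String> labels) {
    try {
      int i = 0;
      for (String label : labels) {
        usage.setUsage("added-" + i++, 1);
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    PartitionUsage live = newUsage();
    System.out.println("live view fails: "
        + iterationFails(live, live.getLiveNodePartitions()));

    PartitionUsage copied = newUsage();
    System.out.println("snapshot fails: "
        + iterationFails(copied, copied.getNodePartitionsSnapshot()));
  }
}
```

The snapshot costs one allocation per call and can be stale by the time it is iterated, which is usually an acceptable trade for callers that only need a consistent set of labels to loop over.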



