[ 
https://issues.apache.org/jira/browse/YARN-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16971642#comment-16971642
 ] 

Wilfred Spiegelenburg commented on YARN-8373:
---------------------------------------------

The one inside the scheduler prevents updates of nodes to flow through as much 
as possible. Node changes are linked to allocating or finishing containers, 
scheduling actions. These are prevented as much as possible with this lock.
The one in the node tracker prevents adding or deleting nodes while it creates 
the list. However a node can still be removed after sorting and that is handled 
later while iterating over the list of nodes that is returned

The two locks do different things.

> RM  Received RMFatalEvent of type CRITICAL_THREAD_CRASH
> -------------------------------------------------------
>
>                 Key: YARN-8373
>                 URL: https://issues.apache.org/jira/browse/YARN-8373
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler, resourcemanager
>    Affects Versions: 2.9.0
>            Reporter: Girish Bhat
>            Assignee: Wilfred Spiegelenburg
>            Priority: Major
>              Labels: newbie
>         Attachments: YARN-8373.001.patch, YARN-8373.002.patch, 
> YARN-8373.003.patch
>
>
>  
>  
> {noformat}
> sudo -u yarn /usr/local/hadoop/latest/bin/yarn version Hadoop 2.9.0 
> Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 
> 756ebc8394e473ac25feac05fa493f6d612e6c50 Compiled by arsuresh on 
> 2017-11-13T23:15Z Compiled with protoc 2.5.0 From source with checksum 
> 0a76a9a32a5257331741f8d5932f183 This command was run using 
> /usr/local/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar{noformat}
> This is for version 2.9.0 
>  
> {noformat}
> 2018-05-25 05:53:12,742 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received 
> RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, Fai
> rSchedulerContinuousScheduling, that exited unexpectedly: 
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,743 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down 
> the resource manager.
> 2018-05-25 05:53:12,749 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1: a critical thread, FairSchedulerContinuousScheduling, that exited 
> unexpectedly: java.lang.IllegalArgumentException: Comparison method violates 
> its general contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,772 ERROR 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager:
>  ExpiredTokenRemover received java.lang.InterruptedException: sleep 
> interrupted{noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to