[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on
[ https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15393844#comment-15393844 ]

Alex Rovner commented on YARN-1724:
-----------------------------------

We are running CDH 5.4.0 and are hitting the same issue. Any advice would be appreciated.

hadoop version
Hadoop 2.6.0-cdh5.4.0
Subversion http://github.com/cloudera/hadoop -r c788a14a5de9ecd968d1e2666e8765c5f018c271
Compiled by jenkins on 2015-04-21T19:18Z
Compiled with protoc 2.5.0
From source with checksum cd78f139c66c13ab5cee96e15a629025
This command was run using /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.4.0.jar

{noformat}
Exception:
2016-07-26 06:48:05,082 ERROR org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception.
java.lang.IllegalArgumentException: Comparison method violates its general contract!
        at java.util.TimSort.mergeLo(TimSort.java:747)
        at java.util.TimSort.mergeAt(TimSort.java:483)
        at java.util.TimSort.mergeForceCollapse(TimSort.java:426)
        at java.util.TimSort.sort(TimSort.java:223)
        at java.util.TimSort.sort(TimSort.java:173)
        at java.util.Arrays.sort(Arrays.java:659)
        at java.util.Collections.sort(Collections.java:217)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:991)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:280)
{noformat}

> Race condition in Fair Scheduler when continuous scheduling is turned on
> ------------------------------------------------------------------------
>
>                 Key: YARN-1724
>                 URL: https://issues.apache.org/jira/browse/YARN-1724
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: scheduler
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
>            Priority: Critical
>             Fix For: 2.4.0
>
>         Attachments: YARN-1724-1.patch, YARN-1724.patch
>
>
> If nodes' resource allocations change during
>     Collections.sort(nodeIdList, nodeAvailableResourceComparator);
> we'll hit:
>     java.lang.IllegalArgumentException: Comparison method violates its general contract!

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
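
The exception in the issue description arises because the comparator reads values that another thread can change while the sort is still running. One way to rule that out, independent of any lock, is to copy each node's value once and sort on the copies. The following is a minimal sketch of that idea and not the change attached to this issue: `SnapshotSort`, `liveAvailableMb`, and `sortedBySnapshot` are hypothetical names, plain integers stand in for YARN's resource objects, and Java 8 or later is assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the scheduler's node bookkeeping; not YARN code.
final class SnapshotSort {
    // Live "available memory" per node id, which other threads may update.
    static final Map<String, AtomicInteger> liveAvailableMb = new ConcurrentHashMap<>();

    // Orders node ids by most available memory first, comparing only values
    // captured once before the sort, so the comparator's answers cannot
    // change while the sort is in progress.
    static List<String> sortedBySnapshot(List<String> nodeIds) {
        Map<String, Integer> snapshot = new HashMap<>();
        for (String id : nodeIds) {
            snapshot.put(id, liveAvailableMb.get(id).get());
        }
        List<String> sorted = new ArrayList<>(nodeIds);
        sorted.sort(Comparator.comparing((String id) -> snapshot.get(id)).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        liveAvailableMb.put("node-a", new AtomicInteger(2048));
        liveAvailableMb.put("node-b", new AtomicInteger(8192));
        liveAvailableMb.put("node-c", new AtomicInteger(4096));
        System.out.println(sortedBySnapshot(Arrays.asList("node-a", "node-b", "node-c")));
    }
}
```

The trade-off in this sketch is that the resulting order can be slightly stale by the time it is used, in exchange for not blocking updates during the sort.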
[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on
[ https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905094#comment-13905094 ]

Hudson commented on YARN-1724:
------------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #486 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/486/])
YARN-1724. Race condition in Fair Scheduler when continuous scheduling is turned on (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1569447)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on
[ https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905441#comment-13905441 ]

Hudson commented on YARN-1724:
------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk #1678 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1678/])
YARN-1724. Race condition in Fair Scheduler when continuous scheduling is turned on (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1569447)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on
[ https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905515#comment-13905515 ]

Hudson commented on YARN-1724:
------------------------------

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1703 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1703/])
YARN-1724. Race condition in Fair Scheduler when continuous scheduling is turned on (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1569447)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on
[ https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904321#comment-13904321 ]

Sandy Ryza commented on YARN-1724:
----------------------------------

The rationale for adding the sort is in YARN-1290. Wei presented experimental results there showing that it distributed work more evenly in the cluster. This JIRA is just for fixing the race; if you think we should remove the sort, we can discuss that on YARN-1290 or in a new JIRA.
[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on
[ https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904330#comment-13904330 ]

Hudson commented on YARN-1724:
------------------------------

SUCCESS: Integrated in Hadoop-trunk-Commit #5183 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5183/])
YARN-1724. Race condition in Fair Scheduler when continuous scheduling is turned on (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1569447)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on
[ https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905039#comment-13905039 ]

Junping Du commented on YARN-1724:
----------------------------------

I may have missed that JIRA and its discussion. My thought here is: given that container requests arrive randomly and continuous scheduling loops forever, it doesn't give nodes with more resources a better chance of being assigned containers. The only difference for the node at the beginning of the loop is that it gets a 5 ms sleep window (from the last iteration). If this is the real cause of the imbalance, maybe we should try removing the sleep to achieve more balanced scheduling? Locking the whole scheduler seems expensive to me.
[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on
[ https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903699#comment-13903699 ]

Junping Du commented on YARN-1724:
----------------------------------

I would prefer removing the sort to adding a lock. Sorted or not, the loop iterates over all nodes and checks whether it can call attemptScheduling() on each, which makes the sort seem unnecessary. Sorting only makes sense if we skip some nodes with fewer resources, but then an additional lock may be needed. Thoughts?
[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on
[ https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903769#comment-13903769 ]

Karthik Kambatla commented on YARN-1724:
----------------------------------------

Sorting here helps with balancing load better between nodes. Given that not all containers run for the same duration, round-robin alone wouldn't lead to balanced load.
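
The point about container durations can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not scheduler code: `PlacementDemo` and `peakLoad` are hypothetical names, one container arrives per tick and occupies one slot until its duration elapses, and "least loaded first" stands in for sorting nodes by available resources. Java 8 or later is assumed.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, not YARN code.
final class PlacementDemo {
    // Returns the highest number of containers running on any single node at
    // any tick, for round-robin or least-loaded-first placement.
    static int peakLoad(int[] durations, int nodeCount, boolean leastLoadedFirst) {
        // endTimes.get(n) holds the finish tick of each container on node n.
        List<List<Integer>> endTimes = new ArrayList<>();
        for (int n = 0; n < nodeCount; n++) {
            endTimes.add(new ArrayList<Integer>());
        }
        int peak = 0;
        for (int tick = 0; tick < durations.length; tick++) {
            final int now = tick;
            // Drop containers that have finished by this tick.
            for (List<Integer> node : endTimes) {
                node.removeIf(end -> end <= now);
            }
            int target = tick % nodeCount; // round-robin choice
            if (leastLoadedFirst) {
                // Pick the node with the fewest running containers; ties go
                // to the lowest index.
                target = 0;
                for (int n = 1; n < nodeCount; n++) {
                    if (endTimes.get(n).size() < endTimes.get(target).size()) {
                        target = n;
                    }
                }
            }
            endTimes.get(target).add(tick + durations[tick]);
            peak = Math.max(peak, endTimes.get(target).size());
        }
        return peak;
    }

    public static void main(String[] args) {
        // Long and short containers alternate, so round-robin sends every
        // long container to the same node.
        int[] durations = {10, 1, 10, 1, 10, 1};
        System.out.println("round-robin peak:  " + peakLoad(durations, 2, false)); // 3
        System.out.println("least-loaded peak: " + peakLoad(durations, 2, true));  // 2
    }
}
```

With this particular arrival pattern, round-robin stacks all three long containers on one node, while choosing the less loaded node keeps the peak at two.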
[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on
[ https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902540#comment-13902540 ]

Hadoop QA commented on YARN-1724:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12629236/YARN-1724-1.patch
  against trunk revision .

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

    {color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3111//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3111//console

This message is automatically generated.
[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on
[ https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902569#comment-13902569 ]

Karthik Kambatla commented on YARN-1724:
----------------------------------------

+1
[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on
[ https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902242#comment-13902242 ]

Karthik Kambatla commented on YARN-1724:
----------------------------------------

Not a huge fan of adding synchronization around a comparator, but it looks like that would be the least invasive change in this case. We should at least add a very detailed comment explaining the reason for this synchronization.
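
The synchronization being discussed can be pictured roughly as follows. The patch itself is not reproduced in this thread, so this is only a sketch under assumptions: `LockedSortScheduler`, `setAvailable`, and `nodesByAvailability` are hypothetical names, and a plain map of integers stands in for the scheduler's node state. The idea shown is that the sort runs under the same monitor that guards every update, so no value can change between two comparisons.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical miniature scheduler; the names are illustrative, not YARN's.
final class LockedSortScheduler {
    private final Map<String, Integer> availableMb = new HashMap<>();

    // Orders by most available memory first. It reads mutable state, so it
    // must only be used while holding this object's monitor.
    private final Comparator<String> byAvailableDesc = new Comparator<String>() {
        @Override
        public int compare(String a, String b) {
            return Integer.compare(availableMb.get(b), availableMb.get(a));
        }
    };

    // Every mutation of node state takes the scheduler's monitor.
    synchronized void setAvailable(String nodeId, int mb) {
        availableMb.put(nodeId, mb);
    }

    List<String> nodesByAvailability() {
        List<String> ids;
        // The sort holds the same monitor as setAvailable(), so allocations
        // cannot change while the sort is comparing elements. Without this,
        // a concurrent update could make the comparator inconsistent and
        // trigger "Comparison method violates its general contract!".
        synchronized (this) {
            ids = new ArrayList<>(availableMb.keySet());
            Collections.sort(ids, byAvailableDesc);
        }
        return ids;
    }

    public static void main(String[] args) {
        LockedSortScheduler scheduler = new LockedSortScheduler();
        scheduler.setAvailable("node-a", 1024);
        scheduler.setAvailable("node-b", 4096);
        scheduler.setAvailable("node-c", 2048);
        System.out.println(scheduler.nodesByAvailability());
    }
}
```

The cost of this design is that updates wait for the duration of the sort, which is the expense raised elsewhere in the discussion.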
[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on
[ https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900057#comment-13900057 ]

Hadoop QA commented on YARN-1724:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12628663/YARN-1724.patch
  against trunk revision .

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

    {color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3092//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3092//console

This message is automatically generated.