[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on

2016-07-26 Thread Alex Rovner (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15393844#comment-15393844
 ] 

Alex Rovner commented on YARN-1724:
---

We are running CDH 5.4.0 and are hitting the same issue. Any advice would be 
appreciated.

hadoop version
Hadoop 2.6.0-cdh5.4.0
Subversion http://github.com/cloudera/hadoop -r c788a14a5de9ecd968d1e2666e8765c5f018c271
Compiled by jenkins on 2015-04-21T19:18Z
Compiled with protoc 2.5.0
From source with checksum cd78f139c66c13ab5cee96e15a629025
This command was run using /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.4.0.jar

{noformat}
Exception:
2016-07-26 06:48:05,082 ERROR org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception.
java.lang.IllegalArgumentException: Comparison method violates its general contract!
    at java.util.TimSort.mergeLo(TimSort.java:747)
    at java.util.TimSort.mergeAt(TimSort.java:483)
    at java.util.TimSort.mergeForceCollapse(TimSort.java:426)
    at java.util.TimSort.sort(TimSort.java:223)
    at java.util.TimSort.sort(TimSort.java:173)
    at java.util.Arrays.sort(Arrays.java:659)
    at java.util.Collections.sort(Collections.java:217)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:991)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:280)
{noformat}
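
A trace like this generally means the values the comparator reads changed while TimSort was still running. As a minimal sketch of one way to keep the comparison stable (this is not the YARN-1724 patch and not FairScheduler code), the values can be copied into a snapshot before sorting. The names SnapshotSort, liveAvailableMb, and sortBySnapshot are hypothetical, and memory in MB stands in for the real resource type:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SnapshotSort {
    // Hypothetical stand-in for per-node available memory that other threads may update.
    static final Map<String, Integer> liveAvailableMb = new ConcurrentHashMap<>();

    /** Returns node ids ordered by most available memory first, based on a one-time snapshot. */
    static List<String> sortBySnapshot(List<String> nodeIds) {
        // Copy the mutable values once; the comparator only ever reads this copy,
        // so concurrent updates to liveAvailableMb cannot change answers mid-sort.
        Map<String, Integer> snapshot = new HashMap<>();
        for (String id : nodeIds) {
            snapshot.put(id, liveAvailableMb.getOrDefault(id, 0));
        }
        List<String> sorted = new ArrayList<>(nodeIds);
        sorted.sort(Comparator.comparingInt((String id) -> snapshot.get(id)).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        liveAvailableMb.put("a", 1024);
        liveAvailableMb.put("b", 4096);
        liveAvailableMb.put("c", 2048);
        System.out.println(sortBySnapshot(Arrays.asList("a", "b", "c"))); // prints [b, c, a]
    }
}
```

The trade-off is that the resulting order reflects the moment of the snapshot, so it can be slightly stale by the time it is used.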

> Race condition in Fair Scheduler when continuous scheduling is turned on
> -
>
> Key: YARN-1724
> URL: https://issues.apache.org/jira/browse/YARN-1724
> Project: Hadoop YARN
> Issue Type: Bug
> Components: scheduler
> Reporter: Sandy Ryza
> Assignee: Sandy Ryza
> Priority: Critical
> Fix For: 2.4.0
>
> Attachments: YARN-1724-1.patch, YARN-1724.patch
>
>
> If nodes' resource allocations change during
> Collections.sort(nodeIdList, nodeAvailableResourceComparator);
> we'll hit:
> java.lang.IllegalArgumentException: Comparison method violates its general contract!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on

2014-02-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905094#comment-13905094
 ] 

Hudson commented on YARN-1724:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #486 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/486/])
YARN-1724. Race condition in Fair Scheduler when continuous scheduling is 
turned on (Sandy Ryza) (sandy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1569447)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java





--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on

2014-02-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905441#comment-13905441
 ] 

Hudson commented on YARN-1724:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1678 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1678/])
YARN-1724. Race condition in Fair Scheduler when continuous scheduling is 
turned on (Sandy Ryza) (sandy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1569447)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java




[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on

2014-02-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905515#comment-13905515
 ] 

Hudson commented on YARN-1724:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1703 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1703/])
YARN-1724. Race condition in Fair Scheduler when continuous scheduling is 
turned on (Sandy Ryza) (sandy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1569447)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java




[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on

2014-02-18 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904321#comment-13904321
 ] 

Sandy Ryza commented on YARN-1724:
--

The rationale for adding the sort is in YARN-1290. Wei presented experimental 
results there showing that it distributed work more evenly in the cluster.  
This JIRA is just for fixing the race - if you think we should remove it, we 
can discuss that on YARN-1290 or in a new JIRA.



[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on

2014-02-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904330#comment-13904330
 ] 

Hudson commented on YARN-1724:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5183 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5183/])
YARN-1724. Race condition in Fair Scheduler when continuous scheduling is 
turned on (Sandy Ryza) (sandy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1569447)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java




[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on

2014-02-18 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905039#comment-13905039
 ] 

Junping Du commented on YARN-1724:
--

I may have missed that JIRA and its discussion. My thought here is: given that 
container requests arrive randomly and continuous scheduling loops forever, it 
doesn't give nodes with more resources a greater chance of being assigned 
containers. The only difference for the node at the start of the loop is the 
5 ms sleep window (from the last iteration). If this is the real cause of the 
imbalance, maybe we should try removing the sleep to achieve more balanced 
scheduling? Locking the whole scheduler seems expensive to me.



[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on

2014-02-17 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903699#comment-13903699
 ] 

Junping Du commented on YARN-1724:
--

I would prefer removing the sort to adding a lock. Sorted or not, the loop 
iterates over all nodes and checks whether it can call attemptScheduling() on 
each, which makes the sort seem unnecessary. Sorting only makes sense if we skip 
some nodes with fewer resources, but then an additional lock may be needed. 
Thoughts?



[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on

2014-02-17 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903769#comment-13903769
 ] 

Karthik Kambatla commented on YARN-1724:


Sorting here helps with balancing load better between nodes. Given not all 
containers run for the same duration, round-robin alone wouldn't lead to 
balanced load. 
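
A toy illustration of the balancing argument, assuming a simple first-fit loop over nodes. It is not the Fair Scheduler's logic, and PlacementOrderSketch and its methods are hypothetical. Visiting nodes in list order places a request on the first node that fits, while visiting the most-free node first places it on the least loaded one:

```java
import java.util.Arrays;

class PlacementOrderSketch {
    /** Returns the index of the first node, in visiting order, that fits the request, or -1. */
    static int firstFit(int[] availableMb, Integer[] visitOrder, int requestMb) {
        for (int idx : visitOrder) {
            if (availableMb[idx] >= requestMb) {
                return idx;
            }
        }
        return -1;
    }

    /** Returns node indices ordered by available memory, most free first. */
    static Integer[] mostFreeFirst(int[] availableMb) {
        Integer[] order = new Integer[availableMb.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // availableMb is not modified while sorting, so the comparator stays consistent.
        Arrays.sort(order, (a, b) -> Integer.compare(availableMb[b], availableMb[a]));
        return order;
    }

    public static void main(String[] args) {
        int[] availableMb = {1024, 4096, 2048};
        Integer[] listOrder = {0, 1, 2};
        System.out.println(firstFit(availableMb, listOrder, 1024));                  // prints 0
        System.out.println(firstFit(availableMb, mostFreeFirst(availableMb), 1024)); // prints 1
    }
}
```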



[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on

2014-02-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902540#comment-13902540
 ] 

Hadoop QA commented on YARN-1724:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12629236/YARN-1724-1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3111//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3111//console

This message is automatically generated.



[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on

2014-02-15 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902569#comment-13902569
 ] 

Karthik Kambatla commented on YARN-1724:


+1



[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on

2014-02-14 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902242#comment-13902242
 ] 

Karthik Kambatla commented on YARN-1724:


Not a huge fan of adding synchronization around a comparator, but it looks like 
that would be the least invasive change in this case.

We should at least add a very detailed comment explaining the reason for this 
synchronization.
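
To make the synchronization idea concrete, here is a minimal sketch, assuming a single monitor guards both the resource updates and the sort. LockedSortSketch, updateNode, and sortedNodeIds are hypothetical names; this is not the code in the attached patch:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LockedSortSketch {
    // Hypothetical per-node available memory, guarded by this object's monitor.
    private final Map<String, Integer> availableMb = new HashMap<>();

    /** Called by heartbeat-style updates; takes the same lock as the sort. */
    synchronized void updateNode(String nodeId, int mb) {
        availableMb.put(nodeId, mb);
    }

    /** Returns node ids ordered by available memory, most free first. */
    List<String> sortedNodeIds() {
        List<String> ids;
        // Hold the lock that guards updates for the whole sort. Otherwise an update
        // could change a value mid-sort, the comparator would give inconsistent
        // answers, and TimSort may throw
        // "Comparison method violates its general contract!".
        synchronized (this) {
            ids = new ArrayList<>(availableMb.keySet());
            ids.sort(Comparator.comparingInt((String id) -> availableMb.get(id)).reversed());
        }
        return ids;
    }
}
```

Updates block for the duration of the sort, which is the cost this approach accepts in exchange for a small change.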



[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on

2014-02-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900057#comment-13900057
 ] 

Hadoop QA commented on YARN-1724:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12628663/YARN-1724.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3092//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3092//console

This message is automatically generated.
