[jira] [Updated] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'

2019-11-05 Thread kailiu_dev (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kailiu_dev updated YARN-9940:
-
Attachment: (was: YARN-9940-branch-2.7.2.001.patch)

> avoid continuous scheduling thread crashes while sorting nodes get 
> 'Comparison method violates its general contract'
> 
>
> Key: YARN-9940
> URL: https://issues.apache.org/jira/browse/YARN-9940
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: kailiu_dev
>Priority: Major
> Fix For: 2.7.2
>
> Attachments: YARN-9940-branch-2.7.2.001.patch
>
>
> 2019-10-16 09:14:51,215 ERROR 
> org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread 
> Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception.
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
>     at java.util.TimSort.mergeHi(TimSort.java:868)
>     at java.util.TimSort.mergeAt(TimSort.java:485)
>     at java.util.TimSort.mergeForceCollapse(TimSort.java:426)
>     at java.util.TimSort.sort(TimSort.java:223)
>     at java.util.TimSort.sort(TimSort.java:173)
>     at java.util.Arrays.sort(Arrays.java:659)
>     at java.util.Collections.sort(Collections.java:217)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'

2019-11-05 Thread kailiu_dev (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16966386#comment-16966386
 ] 

kailiu_dev edited comment on YARN-9940 at 11/5/19 8:24 AM:
---

h5.     [~zxu] [~snemeth]  can you please help me review this code?  this  is a 
fixed code to avoid continuous scheduling thread crashes while sorting nodes 
get 'Comparison method violates its general contract'


was (Author: kailiu_dev):
    [~zxu] [~snemeth]   can you please help me review this code? 

> avoid continuous scheduling thread crashes while sorting nodes get 
> 'Comparison method violates its general contract'
> 
>
> Key: YARN-9940
> URL: https://issues.apache.org/jira/browse/YARN-9940
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: kailiu_dev
>Priority: Major
> Fix For: 2.7.2
>
> Attachments: YARN-9940-branch-2.7.2.001.patch
>
>
> 2019-10-16 09:14:51,215 ERROR 
> org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread 
> Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception.
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
>     at java.util.TimSort.mergeHi(TimSort.java:868)
>     at java.util.TimSort.mergeAt(TimSort.java:485)
>     at java.util.TimSort.mergeForceCollapse(TimSort.java:426)
>     at java.util.TimSort.sort(TimSort.java:223)
>     at java.util.TimSort.sort(TimSort.java:173)
>     at java.util.Arrays.sort(Arrays.java:659)
>     at java.util.Collections.sort(Collections.java:217)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'

2019-11-05 Thread kailiu_dev (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16966386#comment-16966386
 ] 

kailiu_dev edited comment on YARN-9940 at 11/5/19 8:25 AM:
---

h5.    Dear  [~zxu]     [~snemeth]   [~bibinchundatt]  can you please help me 
review this code?  this  is a fixed code to avoid continuous scheduling thread 
crashes while sorting nodes get 'Comparison method violates its general 
contract'  in  hadoop version-2.7.2


was (Author: kailiu_dev):
h5.     [~zxu] [~snemeth]  can you please help me review this code?  this  is a 
fixed code to avoid continuous scheduling thread crashes while sorting nodes 
get 'Comparison method violates its general contract'

> avoid continuous scheduling thread crashes while sorting nodes get 
> 'Comparison method violates its general contract'
> 
>
> Key: YARN-9940
> URL: https://issues.apache.org/jira/browse/YARN-9940
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: kailiu_dev
>Priority: Major
> Fix For: 2.7.2
>
> Attachments: YARN-9940-branch-2.7.2.001.patch
>
>
> 2019-10-16 09:14:51,215 ERROR 
> org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread 
> Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception.
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
>     at java.util.TimSort.mergeHi(TimSort.java:868)
>     at java.util.TimSort.mergeAt(TimSort.java:485)
>     at java.util.TimSort.mergeForceCollapse(TimSort.java:426)
>     at java.util.TimSort.sort(TimSort.java:223)
>     at java.util.TimSort.sort(TimSort.java:173)
>     at java.util.Arrays.sort(Arrays.java:659)
>     at java.util.Collections.sort(Collections.java:217)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'

2019-11-05 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967326#comment-16967326
 ] 

Hadoop QA commented on YARN-9940:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.7.2 Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  2m 
10s{color} | {color:red} root in branch-2.7.2 failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
37s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-2.7.2 
failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} branch-2.7.2 passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
14s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-2.7.2 
failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m  
9s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-2.7.2 
failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
10s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-2.7.2 
failed. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m  
9s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
10s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch 
failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 10s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 25s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 9 new + 614 unchanged - 0 fixed = 623 total (was 614) 
{color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
11s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch 
failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
11s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch 
failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
10s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch 
failed. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 11s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}  6m 53s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:date2019-11-05 |
| JIRA Issue | YARN-9940 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12984900/YARN-9940-branch-2.7.2.001.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 7db85f5e17e6 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2.7.2 / b165c4f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-YARN-Build/25100/artifact/out/branch-mvninstall-root.txt
 |
| compile | 
https://builds.apache.org/job/PreCommit-YARN-Build/25100/artifact/out/branch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| mvnsite | 
https://builds.apache.org/job/PreCommit-YARN-Build/25100/artifact/out/branch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-se

[jira] [Commented] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'

2019-11-05 Thread kailiu_dev (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967331#comment-16967331
 ] 

kailiu_dev commented on YARN-9940:
--

h5. one the above report, wen you  see  "{color:#FF}Failed to execute goal 
org.apache.hadoop:hadoop-maven-plugins:2.7.2:protoc (compile-protoc) on project 
hadoop-yarn-server-resourcemanager: 
org.apache.maven.plugin.MojoExecutionException: 'protoc --version' did not 
return a version -> [Help 1]{color}",  you don'nt need to care, this is not my 
problem,  it is beause of  this test enviroment of the Jira server is not  
*suitable*

> avoid continuous scheduling thread crashes while sorting nodes get 
> 'Comparison method violates its general contract'
> 
>
> Key: YARN-9940
> URL: https://issues.apache.org/jira/browse/YARN-9940
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: kailiu_dev
>Priority: Major
> Fix For: 2.7.2
>
> Attachments: YARN-9940-branch-2.7.2.001.patch
>
>
> 2019-10-16 09:14:51,215 ERROR 
> org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread 
> Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception.
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
>     at java.util.TimSort.mergeHi(TimSort.java:868)
>     at java.util.TimSort.mergeAt(TimSort.java:485)
>     at java.util.TimSort.mergeForceCollapse(TimSort.java:426)
>     at java.util.TimSort.sort(TimSort.java:223)
>     at java.util.TimSort.sort(TimSort.java:173)
>     at java.util.Arrays.sort(Arrays.java:659)
>     at java.util.Collections.sort(Collections.java:217)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9927) RM multi-thread event processing mechanism

2019-11-05 Thread hcarrot (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hcarrot updated YARN-9927:
--
Attachment: YARN-9927.001.patch

> RM multi-thread event processing mechanism
> --
>
> Key: YARN-9927
> URL: https://issues.apache.org/jira/browse/YARN-9927
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.0.0, 2.9.2
>Reporter: hcarrot
>Priority: Major
> Attachments: RM multi-thread event processing mechanism.pdf, 
> YARN-9927-addMultiEventDispatcher.patch, YARN-9927.001.patch
>
>
> Recently, we have observed serious event blocking in RM event dispatcher 
> queue. After analysis of RM event monitoring data and RM event processing 
> logic, we found that
> 1) environment: a cluster with thousands of nodes
> 2) RMNodeStatusEvent dominates 90% time consumption of RM event scheduler
> 3) Meanwhile, RM event processing is in a single-thread mode, and It results 
> in the low headroom of RM event scheduler, thus performance of RM.
> So we proposed a RM multi-thread event processing mechanism to improve RM 
> performance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9927) RM multi-thread event processing mechanism

2019-11-05 Thread hcarrot (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hcarrot updated YARN-9927:
--
Attachment: (was: YARN-9927-addMultiEventDispatcher.patch)

> RM multi-thread event processing mechanism
> --
>
> Key: YARN-9927
> URL: https://issues.apache.org/jira/browse/YARN-9927
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.0.0, 2.9.2
>Reporter: hcarrot
>Priority: Major
> Attachments: RM multi-thread event processing mechanism.pdf, 
> YARN-9927.001.patch
>
>
> Recently, we have observed serious event blocking in RM event dispatcher 
> queue. After analysis of RM event monitoring data and RM event processing 
> logic, we found that
> 1) environment: a cluster with thousands of nodes
> 2) RMNodeStatusEvent dominates 90% time consumption of RM event scheduler
> 3) Meanwhile, RM event processing is in a single-thread mode, and It results 
> in the low headroom of RM event scheduler, thus performance of RM.
> So we proposed a RM multi-thread event processing mechanism to improve RM 
> performance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6448) Continuous scheduling thread crashes while sorting nodes

2019-11-05 Thread kailiu_dev (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967334#comment-16967334
 ] 

kailiu_dev commented on YARN-6448:
--

Dear  [~yufeigu]
h5. can you please help me review this code?  this  is a fixed code to avoid 
continuous scheduling thread crashes while sorting nodes get 'Comparison method 
violates its general contract'  in  hadoop version-2.7.2

,   

> Continuous scheduling thread crashes while sorting nodes
> 
>
> Key: YARN-6448
> URL: https://issues.apache.org/jira/browse/YARN-6448
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Yufei Gu
>Assignee: Yufei Gu
>Priority: Major
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: YARN-6448.001.patch, YARN-6448.002.patch, 
> YARN-6448.003.patch, YARN-6448.004.patch
>
>
> YARN-4719 remove the lock in continuous scheduling while sorting nodes. It 
> breaks the order in comparison if nodes changes while sorting.
> {code}
> 2017-04-04 23:42:26,123 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.RMCriticalThreadUncaughtExceptionHandler:
>  Critical thread FairSchedulerContinuousScheduling crashed!
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:306)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:884)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:316)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9927) RM multi-thread event processing mechanism

2019-11-05 Thread hcarrot (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hcarrot updated YARN-9927:
--
Attachment: (was: YARN-9927.001.patch)

> RM multi-thread event processing mechanism
> --
>
> Key: YARN-9927
> URL: https://issues.apache.org/jira/browse/YARN-9927
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.0.0, 2.9.2
>Reporter: hcarrot
>Priority: Major
> Attachments: RM multi-thread event processing mechanism.pdf, 
> YARN-9927.001.patch
>
>
> Recently, we have observed serious event blocking in RM event dispatcher 
> queue. After analysis of RM event monitoring data and RM event processing 
> logic, we found that
> 1) environment: a cluster with thousands of nodes
> 2) RMNodeStatusEvent dominates 90% time consumption of RM event scheduler
> 3) Meanwhile, RM event processing is in a single-thread mode, and It results 
> in the low headroom of RM event scheduler, thus performance of RM.
> So we proposed a RM multi-thread event processing mechanism to improve RM 
> performance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9927) RM multi-thread event processing mechanism

2019-11-05 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967336#comment-16967336
 ] 

Hadoop QA commented on YARN-9927:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  7s{color} 
| {color:red} YARN-9927 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-9927 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12984902/YARN-9927.001.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/25101/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> RM multi-thread event processing mechanism
> --
>
> Key: YARN-9927
> URL: https://issues.apache.org/jira/browse/YARN-9927
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.0.0, 2.9.2
>Reporter: hcarrot
>Priority: Major
> Attachments: RM multi-thread event processing mechanism.pdf, 
> YARN-9927.001.patch
>
>
> Recently, we have observed serious event blocking in RM event dispatcher 
> queue. After analysis of RM event monitoring data and RM event processing 
> logic, we found that
> 1) environment: a cluster with thousands of nodes
> 2) RMNodeStatusEvent dominates 90% time consumption of RM event scheduler
> 3) Meanwhile, RM event processing is in a single-thread mode, and It results 
> in the low headroom of RM event scheduler, thus performance of RM.
> So we proposed a RM multi-thread event processing mechanism to improve RM 
> performance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-6448) Continuous scheduling thread crashes while sorting nodes

2019-11-05 Thread kailiu_dev (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967334#comment-16967334
 ] 

kailiu_dev edited comment on YARN-6448 at 11/5/19 8:44 AM:
---

Dear  [~yufeigu]
h5. can you please help me review this code?  this  is a fixed code to avoid 
continuous scheduling thread crashes while sorting nodes get 'Comparison method 
violates its general contract'  in  hadoop version-2.7.2,

https://issues.apache.org/jira/browse/YARN-9940


was (Author: kailiu_dev):
Dear  [~yufeigu]
h5. can you please help me review this code?  this  is a fixed code to avoid 
continuous scheduling thread crashes while sorting nodes get 'Comparison method 
violates its general contract'  in  hadoop version-2.7.2

,   

> Continuous scheduling thread crashes while sorting nodes
> 
>
> Key: YARN-6448
> URL: https://issues.apache.org/jira/browse/YARN-6448
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Yufei Gu
>Assignee: Yufei Gu
>Priority: Major
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: YARN-6448.001.patch, YARN-6448.002.patch, 
> YARN-6448.003.patch, YARN-6448.004.patch
>
>
> YARN-4719 remove the lock in continuous scheduling while sorting nodes. It 
> breaks the order in comparison if nodes changes while sorting.
> {code}
> 2017-04-04 23:42:26,123 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.RMCriticalThreadUncaughtExceptionHandler:
>  Critical thread FairSchedulerContinuousScheduling crashed!
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:306)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:884)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:316)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8436) FSParentQueue: Comparison method violates its general contract

2019-11-05 Thread kailiu_dev (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967339#comment-16967339
 ] 

kailiu_dev commented on YARN-8436:
--

Dear [~wilfreds] 
can you please help me review this code? this is a fixed code to avoid 
continuous scheduling thread crashes while sorting nodes get 'Comparison method 
violates its general contract' in hadoop version-2.7.2
https://issues.apache.org/jira/browse/YARN-9940

> FSParentQueue: Comparison method violates its general contract
> --
>
> Key: YARN-8436
> URL: https://issues.apache.org/jira/browse/YARN-8436
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 3.1.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Minor
> Fix For: 3.2.0
>
> Attachments: YARN-8436.001.patch, YARN-8436.002.patch, 
> YARN-8436.003.patch
>
>
> The ResourceManager can fail while sorting queues if an update comes in:
> {code:java}
> FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
> handling event type NODE_UPDATE to the scheduler
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
>   at java.util.TimSort.mergeLo(TimSort.java:777)
>   at java.util.TimSort.mergeAt(TimSort.java:514)
> ...
>   at java.util.Collections.sort(Collections.java:175)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:223){code}
> The reason it breaks is a change in the sorted object itself. 
> This is why it fails:
>  * an update from a node comes in as a heartbeat.
>  * the update triggers a check to see if we can assign a container on the 
> node.
>  * walk over the queue hierarchy to find a queue to assign a container to: 
> top down.
>  * for each parent queue we sort the child queues in {{assignContainer}} to 
> decide which queue to descent into.
>  * we lock the parent queue when sort to prevent changes, but we do not lock 
> the child queues that we are sorting.
> If during this sorting a different node update changes a child queue then we 
> allow that. This means that the objects that we are trying to sort now might 
> be out of order. That causes the issue with the comparator. The comparator 
> itself is not broken.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9948) Remove attempts that are beyond max-attempt limit from RMAppImpl

2019-11-05 Thread Hu Ziqian (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967341#comment-16967341
 ] 

Hu Ziqian commented on YARN-9948:
-

[~hex108],  [~jianhe] ,  this issue is based on YARN-3480, could you review it 
and give some adivce?

> Remove attempts that are beyond max-attempt limit from RMAppImpl
> 
>
> Key: YARN-9948
> URL: https://issues.apache.org/jira/browse/YARN-9948
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.1.3
>Reporter: Hu Ziqian
>Priority: Major
> Attachments: YARN-9948.001.patch
>
>
> RM will store app attempt in both state store and RMAppImpl. YARN-3480 
> removes attempts that are beyond max-attempt limit from state store.  In this 
> issue we delete those attempts in RMAppImpl the reduce decrease memory usage 
> of RM.
> We introduce flag yarn.resourcemanager.am.delete-old-attempts.enabled to 
> enable this logic, default value is false.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'

2019-11-05 Thread kailiu_dev (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967347#comment-16967347
 ] 

kailiu_dev commented on YARN-9940:
--

exception is same ,but my probleam is ablout continuous scheduling, that is 
about FSParentQueue

> avoid continuous scheduling thread crashes while sorting nodes get 
> 'Comparison method violates its general contract'
> 
>
> Key: YARN-9940
> URL: https://issues.apache.org/jira/browse/YARN-9940
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: kailiu_dev
>Priority: Major
> Fix For: 2.7.2
>
> Attachments: YARN-9940-branch-2.7.2.001.patch
>
>
> 2019-10-16 09:14:51,215 ERROR 
> org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread 
> Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception.
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
>     at java.util.TimSort.mergeHi(TimSort.java:868)
>     at java.util.TimSort.mergeAt(TimSort.java:485)
>     at java.util.TimSort.mergeForceCollapse(TimSort.java:426)
>     at java.util.TimSort.sort(TimSort.java:223)
>     at java.util.TimSort.sort(TimSort.java:173)
>     at java.util.Arrays.sort(Arrays.java:659)
>     at java.util.Collections.sort(Collections.java:217)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'

2019-11-05 Thread kailiu_dev (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967348#comment-16967348
 ] 

kailiu_dev commented on YARN-9940:
--

{color:#FF}exception is same ,but my probleam is ablout continuous 
scheduling, that is about FSParentQueue{color}

> avoid continuous scheduling thread crashes while sorting nodes get 
> 'Comparison method violates its general contract'
> 
>
> Key: YARN-9940
> URL: https://issues.apache.org/jira/browse/YARN-9940
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: kailiu_dev
>Priority: Major
> Fix For: 2.7.2
>
> Attachments: YARN-9940-branch-2.7.2.001.patch
>
>
> 2019-10-16 09:14:51,215 ERROR 
> org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread 
> Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception.
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
>     at java.util.TimSort.mergeHi(TimSort.java:868)
>     at java.util.TimSort.mergeAt(TimSort.java:485)
>     at java.util.TimSort.mergeForceCollapse(TimSort.java:426)
>     at java.util.TimSort.sort(TimSort.java:223)
>     at java.util.TimSort.sort(TimSort.java:173)
>     at java.util.Arrays.sort(Arrays.java:659)
>     at java.util.Collections.sort(Collections.java:217)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'

2019-11-05 Thread kailiu_dev (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kailiu_dev updated YARN-9940:
-
Comment: was deleted

(was: exception is same ,but my probleam is ablout continuous scheduling, that 
is about FSParentQueue)

> avoid continuous scheduling thread crashes while sorting nodes get 
> 'Comparison method violates its general contract'
> 
>
> Key: YARN-9940
> URL: https://issues.apache.org/jira/browse/YARN-9940
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: kailiu_dev
>Priority: Major
> Fix For: 2.7.2
>
> Attachments: YARN-9940-branch-2.7.2.001.patch
>
>
> 2019-10-16 09:14:51,215 ERROR 
> org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread 
> Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception.
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
>     at java.util.TimSort.mergeHi(TimSort.java:868)
>     at java.util.TimSort.mergeAt(TimSort.java:485)
>     at java.util.TimSort.mergeForceCollapse(TimSort.java:426)
>     at java.util.TimSort.sort(TimSort.java:223)
>     at java.util.TimSort.sort(TimSort.java:173)
>     at java.util.Arrays.sort(Arrays.java:659)
>     at java.util.Collections.sort(Collections.java:217)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'

2019-11-05 Thread kailiu_dev (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967348#comment-16967348
 ] 

kailiu_dev edited comment on YARN-9940 at 11/5/19 9:20 AM:
---

{color:#ff}YARN-8436  && YARN-9940  exception is same ,but my probleam is 
ablout continuous scheduling, that is about FSParentQueue{color}


was (Author: kailiu_dev):
{color:#FF}exception is same ,but my probleam is ablout continuous 
scheduling, that is about FSParentQueue{color}

> avoid continuous scheduling thread crashes while sorting nodes get 
> 'Comparison method violates its general contract'
> 
>
> Key: YARN-9940
> URL: https://issues.apache.org/jira/browse/YARN-9940
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: kailiu_dev
>Priority: Major
> Fix For: 2.7.2
>
> Attachments: YARN-9940-branch-2.7.2.001.patch
>
>
> 2019-10-16 09:14:51,215 ERROR 
> org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread 
> Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception.
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
>     at java.util.TimSort.mergeHi(TimSort.java:868)
>     at java.util.TimSort.mergeAt(TimSort.java:485)
>     at java.util.TimSort.mergeForceCollapse(TimSort.java:426)
>     at java.util.TimSort.sort(TimSort.java:223)
>     at java.util.TimSort.sort(TimSort.java:173)
>     at java.util.Arrays.sort(Arrays.java:659)
>     at java.util.Collections.sort(Collections.java:217)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9948) Remove attempts that are beyond max-attempt limit from RMAppImpl

2019-11-05 Thread Jun Gong (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967374#comment-16967374
 ] 

Jun Gong commented on YARN-9948:


[~ziqian hu] Thanks for the patch. How much memory does it consume? If not 
much, it will be better to keep them, the reasons could be found at comment 1 
and comment 2.

> Remove attempts that are beyond max-attempt limit from RMAppImpl
> 
>
> Key: YARN-9948
> URL: https://issues.apache.org/jira/browse/YARN-9948
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.1.3
>Reporter: Hu Ziqian
>Priority: Major
> Attachments: YARN-9948.001.patch
>
>
> RM will store app attempt in both state store and RMAppImpl. YARN-3480 
> removes attempts that are beyond max-attempt limit from state store.  In this 
> issue we delete those attempts in RMAppImpl the reduce decrease memory usage 
> of RM.
> We introduce flag yarn.resourcemanager.am.delete-old-attempts.enabled to 
> enable this logic, default value is false.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9923) Detect missing Docker binary or not running Docker daemon

2019-11-05 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967375#comment-16967375
 ] 

Szilard Nemeth commented on YARN-9923:
--

Hi [~adam.antal] / [~ebadger]!

If I understood Adam's latest comment correctly, if a user set the flag to 
NONE, it would have the current behaviour.

If it's set to RUNTIME, the NM health check script would run, as suggested by 
[~ebadger].

The STARTUP option is clear, it just checks the daemon at startup. I would vote 
to implement this option as well, for better flexibility.

What do you guys think?  

> Detect missing Docker binary or not running Docker daemon
> -
>
> Key: YARN-9923
> URL: https://issues.apache.org/jira/browse/YARN-9923
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, yarn
>Affects Versions: 3.2.1
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
>
> Currently if a NodeManager is enabled to allocate Docker containers, but the 
> specified binary (docker.binary in the container-executor.cfg) is missing the 
> container allocation fails with the following error message:
> {noformat}
> Container launch fails
> Exit code: 29
> Exception message: Launch container failed
> Shell error output: sh: : No 
> such file or directory
> Could not inspect docker network to get type /usr/bin/docker network inspect 
> host --format='{{.Driver}}'.
> Error constructing docker command, docker error code=-1, error 
> message='Unknown error'
> {noformat}
> I suggest to add a property say "yarn.nodemanager.runtime.linux.docker.check" 
> to have the following options:
> - STARTUP: setting this option the NodeManager would not start if Docker 
> binaries are missing or the Docker daemon is not running (the exception is 
> considered FATAL during startup)
> - RUNTIME: would give a more detailed/user-friendly exception in 
> NodeManager's side (NM logs) if Docker binaries are missing or the daemon is 
> not working. This would also prevent further Docker container allocation as 
> long as the binaries do not exist and the docker daemon is not running.
> - NONE (default): preserving the current behaviour, throwing exception during 
> container allocation, carrying on using the default retry procedure.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-9948) Remove attempts that are beyond max-attempt limit from RMAppImpl

2019-11-05 Thread Jun Gong (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967374#comment-16967374
 ] 

Jun Gong edited comment on YARN-9948 at 11/5/19 9:31 AM:
-

[~ziqian hu] Thanks for the patch. How much memory does it consume? If not 
much, it will be better to keep them, the reasons could be found at [comment 
1|https://issues.apache.org/jira/browse/YARN-3480?focusedCommentId=15059193&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15059193]
 and [comment 
2|https://issues.apache.org/jira/browse/YARN-3480?focusedCommentId=15059399&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15059399].


was (Author: hex108):
[~ziqian hu] Thanks for the patch. How much memory does it consume? If not 
much, it will be better to keep them, the reasons could be found at comment 1 
and comment 2.

> Remove attempts that are beyond max-attempt limit from RMAppImpl
> 
>
> Key: YARN-9948
> URL: https://issues.apache.org/jira/browse/YARN-9948
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.1.3
>Reporter: Hu Ziqian
>Priority: Major
> Attachments: YARN-9948.001.patch
>
>
> RM will store app attempt in both state store and RMAppImpl. YARN-3480 
> removes attempts that are beyond max-attempt limit from state store.  In this 
> issue we delete those attempts in RMAppImpl the reduce decrease memory usage 
> of RM.
> We introduce flag yarn.resourcemanager.am.delete-old-attempts.enabled to 
> enable this logic, default value is false.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9937) Add missing queue configs in RMWebService#CapacitySchedulerQueueInfo

2019-11-05 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9937:

Attachment: YARN-9937-branch-3.2.001.patch

> Add missing queue configs in RMWebService#CapacitySchedulerQueueInfo
> 
>
> Key: YARN-9937
> URL: https://issues.apache.org/jira/browse/YARN-9937
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: Screen Shot 2019-10-28 at 8.54.53 PM.png, 
> YARN-9937-001.patch, YARN-9937-002.patch, YARN-9937-003.patch, 
> YARN-9937-004.patch, YARN-9937-branch-3.2.001.patch
>
>
> Below are the missing queue configs which are not part of RMWebServices 
> scheduler endpoint. 
> 1. Maximum Allocation
> 2. Queue ACLs
> 3. Queue Priority
> 4. Application Lifetime



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Reopened] (YARN-9937) Add missing queue configs in RMWebService#CapacitySchedulerQueueInfo

2019-11-05 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph reopened YARN-9937:
-

> Add missing queue configs in RMWebService#CapacitySchedulerQueueInfo
> 
>
> Key: YARN-9937
> URL: https://issues.apache.org/jira/browse/YARN-9937
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: Screen Shot 2019-10-28 at 8.54.53 PM.png, 
> YARN-9937-001.patch, YARN-9937-002.patch, YARN-9937-003.patch, 
> YARN-9937-004.patch, YARN-9937-branch-3.2.001.patch
>
>
> Below are the missing queue configs which are not part of RMWebServices 
> scheduler endpoint. 
> 1. Maximum Allocation
> 2. Queue ACLs
> 3. Queue Priority
> 4. Application Lifetime



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9937) Add missing queue configs in RMWebService#CapacitySchedulerQueueInfo

2019-11-05 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967407#comment-16967407
 ] 

Prabhu Joseph commented on YARN-9937:
-

Reopened to submit branch-3.2 patch.

> Add missing queue configs in RMWebService#CapacitySchedulerQueueInfo
> 
>
> Key: YARN-9937
> URL: https://issues.apache.org/jira/browse/YARN-9937
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: Screen Shot 2019-10-28 at 8.54.53 PM.png, 
> YARN-9937-001.patch, YARN-9937-002.patch, YARN-9937-003.patch, 
> YARN-9937-004.patch, YARN-9937-branch-3.2.001.patch
>
>
> Below are the missing queue configs which are not part of RMWebServices 
> scheduler endpoint. 
> 1. Maximum Allocation
> 2. Queue ACLs
> 3. Queue Priority
> 4. Application Lifetime



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9949) Add missing queue configs for root queue in RMWebService#CapacitySchedulerInfo

2019-11-05 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967408#comment-16967408
 ] 

Prabhu Joseph commented on YARN-9949:
-

Thanks [~ferhui] for finding the issue. 

This patch requires YARN-9937 as well. Have bakcported YARN-9937 to branch-3.2. 

> Add missing queue configs for root queue in 
> RMWebService#CapacitySchedulerInfo 
> ---
>
> Key: YARN-9949
> URL: https://issues.apache.org/jira/browse/YARN-9949
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Fix For: 3.3.0, 3.2.2
>
> Attachments: YARN-9949-001.patch, YARN-9949-002.patch
>
>
> YARN-9937 has added below missing queue configs but missed to add for root 
> queue.
> 1. Maximum Allocation
> 2. Queue ACLs
> 3. Queue Priority
> 4. Application Lifetime
> 5. Ordering Policy.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9537) Add configuration to disable AM preemption

2019-11-05 Thread zhoukang (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhoukang updated YARN-9537:
---
Attachment: YARN-9537.003.patch

> Add configuration to disable AM preemption
> --
>
> Key: YARN-9537
> URL: https://issues.apache.org/jira/browse/YARN-9537
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.2.0, 3.1.2
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
> Attachments: YARN-9537-002.patch, YARN-9537.001.patch, 
> YARN-9537.003.patch
>
>
> In this issue, i will add a configuration to support disable AM preemption.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9605) Add ZkConfiguredFailoverProxyProvider for RM HA

2019-11-05 Thread zhoukang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967436#comment-16967436
 ] 

zhoukang commented on YARN-9605:


ping [~prabhujoseph] [~subru] [~tangzhankun]  could you help review or retest 
this? thanks!

I have run the related unit test locally

> Add ZkConfiguredFailoverProxyProvider for RM HA
> ---
>
> Key: YARN-9605
> URL: https://issues.apache.org/jira/browse/YARN-9605
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
> Fix For: 3.2.0, 3.1.2
>
> Attachments: YARN-9605.001.patch, YARN-9605.002.patch
>
>
> In this issue, i will track a new feature to support 
> ZkConfiguredFailoverProxyProvider for RM HA



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (YARN-9605) Add ZkConfiguredFailoverProxyProvider for RM HA

2019-11-05 Thread zhoukang (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhoukang updated YARN-9605:
---
Comment: was deleted

(was: ping [~prabhujoseph] [~subru] [~tangzhankun]  could you help review or 
retest this? thanks!

I have run the related unit test locally)

> Add ZkConfiguredFailoverProxyProvider for RM HA
> ---
>
> Key: YARN-9605
> URL: https://issues.apache.org/jira/browse/YARN-9605
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
> Fix For: 3.2.0, 3.1.2
>
> Attachments: YARN-9605.001.patch, YARN-9605.002.patch
>
>
> In this issue, i will track a new feature to support 
> ZkConfiguredFailoverProxyProvider for RM HA



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9937) Add missing queue configs in RMWebService#CapacitySchedulerQueueInfo

2019-11-05 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967472#comment-16967472
 ] 

Hadoop QA commented on YARN-9937:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m  
7s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.2 Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  6m 
32s{color} | {color:red} root in branch-3.2 failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
21s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-3.2 
failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} branch-3.2 passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
22s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-3.2 
failed. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  4m 
19s{color} | {color:red} branch has errors when building and testing our client 
artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
22s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-3.2 
failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} branch-3.2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 35s{color} 
| {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 15 new + 2 unchanged - 10 fixed = 17 total (was 12) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 29s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 39 new + 166 unchanged - 1 fixed = 205 total (was 167) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 18s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 0 new + 4 unchanged - 2 fixed = 4 total (was 6) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 35s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}115m 30s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisherForV2 |
|   | 
hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:63396beab41 |
| JIRA Issue | YARN-9937 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12984910/YARN-9937-branch-3.2.001.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux eb53e1e659af 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/p

[jira] [Updated] (YARN-9899) Migration tool that help to generate CS config based on FS config [Phase 2]

2019-11-05 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-9899:
---
Attachment: YARN-9899-005.patch

> Migration tool that help to generate CS config based on FS config [Phase 2] 
> 
>
> Key: YARN-9899
> URL: https://issues.apache.org/jira/browse/YARN-9899
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9899-001.patch, YARN-9899-002.patch, 
> YARN-9899-003.patch, YARN-9899-004.patch, YARN-9899-005.patch
>
>
> YARN-9699 laid down the groundworks of a converter from FS to CS config.
> During the development of the converter, we came up with the following things 
> to fix. 
> 1. If we don't specify a mandatory option, we have this stacktrace for 
> example:
>  
> {code:java}
> org.apache.commons.cli.MissingOptionException: Missing required option: o
>  at org.apache.commons.cli.Parser.checkRequiredOptions(Parser.java:299)
>  at org.apache.commons.cli.Parser.parse(Parser.java:231)
>  at org.apache.commons.cli.Parser.parse(Parser.java:85)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.converter.FSConfigToCSConfigArgumentHandler.parseAndConvert(FSConfigToCSConfigArgumentHandler.java:100)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1572){code}
>  
> We should provide a more concise and meaningful error message (without 
> stacktrace on the CLI, but we should log the exception with stacktrace to the 
> RM log).
> An explanation of the missing option is also required.
> 2. We may think about how to handle exceptions from commons CLI: 
> MissingArgumentException vs. MissingOptionException
> 3. We need to provide a -h / --help option for the CLI that prints all the 
> possible options / arguments.
> 4. Last but not least: We should move the CLI command to a more reasonable 
> place:
> As YARN-9699 implemented it, the command can be invoked like: 
> {code:java}
> /opt/hadoop/bin/yarn resourcemanager -convert-fs-configuration -y 
> /opt/hadoop/etc/hadoop/yarn-site.xml -f 
> /opt/hadoop/etc/hadoop/fair-scheduler.xml -r 
> ~systest/sample-rules-config.properties -o /tmp/fs-cs-output
> {code}
> This is problematic, as if YARN RM is already running, we need to stop it in 
> order to start the RM again with the conversion switch.
> 5. Add unit test coverage for {{QueuePlacementConverter}}
> 6. Close some feature gaps.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9899) Migration tool that help to generate CS config based on FS config [Phase 2]

2019-11-05 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967513#comment-16967513
 ] 

Peter Bacsko commented on YARN-9899:


Fixed a couple of minor things in v5. Also, if args.length == 0, then we 
display a help instead of throwing an exception.

[~snemeth] could you check out patch v5 when you have some time?

> Migration tool that help to generate CS config based on FS config [Phase 2] 
> 
>
> Key: YARN-9899
> URL: https://issues.apache.org/jira/browse/YARN-9899
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9899-001.patch, YARN-9899-002.patch, 
> YARN-9899-003.patch, YARN-9899-004.patch, YARN-9899-005.patch
>
>
> YARN-9699 laid down the groundworks of a converter from FS to CS config.
> During the development of the converter, we came up with the following things 
> to fix. 
> 1. If we don't specify a mandatory option, we have this stacktrace for 
> example:
>  
> {code:java}
> org.apache.commons.cli.MissingOptionException: Missing required option: o
>  at org.apache.commons.cli.Parser.checkRequiredOptions(Parser.java:299)
>  at org.apache.commons.cli.Parser.parse(Parser.java:231)
>  at org.apache.commons.cli.Parser.parse(Parser.java:85)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.converter.FSConfigToCSConfigArgumentHandler.parseAndConvert(FSConfigToCSConfigArgumentHandler.java:100)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1572){code}
>  
> We should provide a more concise and meaningful error message (without 
> stacktrace on the CLI, but we should log the exception with stacktrace to the 
> RM log).
> An explanation of the missing option is also required.
> 2. We may think about how to handle exceptions from commons CLI: 
> MissingArgumentException vs. MissingOptionException
> 3. We need to provide a -h / --help option for the CLI that prints all the 
> possible options / arguments.
> 4. Last but not least: We should move the CLI command to a more reasonable 
> place:
> As YARN-9699 implemented it, the command can be invoked like: 
> {code:java}
> /opt/hadoop/bin/yarn resourcemanager -convert-fs-configuration -y 
> /opt/hadoop/etc/hadoop/yarn-site.xml -f 
> /opt/hadoop/etc/hadoop/fair-scheduler.xml -r 
> ~systest/sample-rules-config.properties -o /tmp/fs-cs-output
> {code}
> This is problematic, as if YARN RM is already running, we need to stop it in 
> order to start the RM again with the conversion switch.
> 5. Add unit test coverage for {{QueuePlacementConverter}}
> 6. Close some feature gaps.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9537) Add configuration to disable AM preemption

2019-11-05 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967533#comment-16967533
 ] 

Hadoop QA commented on YARN-9537:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
36s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 26s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 38s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 85m 
27s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}143m 14s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-9537 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12984918/YARN-9537.003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux ffcd33807bf3 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d17ba85 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/25103/testReport/ |
| Max. process+thread count | 816 (vs. ulimit of 5500) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/25103/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Add configuration to disable AM preemp

[jira] [Commented] (YARN-9537) Add configuration to disable AM preemption

2019-11-05 Thread zhoukang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967534#comment-16967534
 ] 

zhoukang commented on YARN-9537:


[~snemeth][~adam.antal] the style problem has been fixed. any more suggestion? 

> Add configuration to disable AM preemption
> --
>
> Key: YARN-9537
> URL: https://issues.apache.org/jira/browse/YARN-9537
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.2.0, 3.1.2
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
> Attachments: YARN-9537-002.patch, YARN-9537.001.patch, 
> YARN-9537.003.patch
>
>
> In this issue, i will add a configuration to support disable AM preemption.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-9537) Add configuration to disable AM preemption

2019-11-05 Thread zhoukang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967534#comment-16967534
 ] 

zhoukang edited comment on YARN-9537 at 11/5/19 1:39 PM:
-

[~snemeth][~adam.antal] the style problem has been fixed. any more suggestion? 
thanks


was (Author: cane):
[~snemeth][~adam.antal] the style problem has been fixed. any more suggestion? 

> Add configuration to disable AM preemption
> --
>
> Key: YARN-9537
> URL: https://issues.apache.org/jira/browse/YARN-9537
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.2.0, 3.1.2
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
> Attachments: YARN-9537-002.patch, YARN-9537.001.patch, 
> YARN-9537.003.patch
>
>
> In this issue, i will add a configuration to support disable AM preemption.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9362) Code cleanup in TestNMLeveldbStateStoreService

2019-11-05 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967545#comment-16967545
 ] 

Peter Bacsko commented on YARN-9362:


Thanks for the patch [~denes.gerencser]. I don't have too much context, but 
having smaller tests instead of a big one is certainly much better. I just have 
two comments right now:

1) Would be good to see some nice assertion messages. However, it's missing 
everywhere and I don't think it's worth the hassle, so let's keep it that way.
2) I'm wondering if using {{System.currentTimeMillis()}} is necessary:
{noformat}
long containerStartTime = System.currentTimeMillis();
{noformat}

can't we just use a constant?



> Code cleanup in TestNMLeveldbStateStoreService
> --
>
> Key: YARN-9362
> URL: https://issues.apache.org/jira/browse/YARN-9362
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Denes Gerencser
>Priority: Minor
> Attachments: YARN-9362.001.patch
>
>
> There are many ways to improve TestNMLeveldbStateStoreService: 
> 1. RecoveredContainerState fields are asserted many times repeatedly. Some 
> simple method extractions would definitely make this more readable.
> 2. The tests are very long and hard to read in general: Again, finding how 
> methods could be extracted to avoid code repetition could help. 
> 3. You name it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9677) Make FpgaDevice and GpuDevice classes more similar to each other

2019-11-05 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967556#comment-16967556
 ] 

Peter Bacsko commented on YARN-9677:


Thanks for the patch [~pingsutw]. Looks like a straightforward refactor. 
+1 (non-binding) 

> Make FpgaDevice and GpuDevice classes more similar to each other
> 
>
> Key: YARN-9677
> URL: https://issues.apache.org/jira/browse/YARN-9677
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: kevin su
>Priority: Major
>  Labels: newbie, newbie++
> Attachments: YARN-9677.001.patch
>
>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.fpga.FpgaResourceAllocator.FpgaDevice
>  is an inner class of FpgaResourceAllocator.
> It is not only being used from its parent class but from other classes as 
> well so we are losing the purpose of the inner class, it does not really make 
> sense.
> We also have 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.GpuDevice
>  which is a similar class, but for GPU devices.
> What we could do here is to make FpgaDevice a single class and harmonize the 
> packages of these 2 classes, meaning they should be "closer" to each other in 
> terms of packaging.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8004) Add unit tests for inter queue preemption for dominant resource calculator

2019-11-05 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967606#comment-16967606
 ] 

Eric Payne commented on YARN-8004:
--

{quote}
bq. could i backport this to branch-2.9/2.8 as well?
Sunil G, sure that would be fine. Thanks for the reviews and commits.
{quote} 
It doesn't look like this happened. I backports cleanly to branch-2. Unless 
there are objections, I would like to backport this to branch-2.

> Add unit tests for inter queue preemption for dominant resource calculator
> --
>
> Key: YARN-8004
> URL: https://issues.apache.org/jira/browse/YARN-8004
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Zian Chen
>Priority: Critical
> Fix For: 3.2.0, 3.1.1, 3.0.3
>
> Attachments: YARN-8004.001.patch, YARN-8004.002.patch, 
> YARN-8004.003.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8004) Add unit tests for inter queue preemption for dominant resource calculator

2019-11-05 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967606#comment-16967606
 ] 

Eric Payne edited comment on YARN-8004 at 11/5/19 4:16 PM:
---

{quote}
bq. could i backport this to branch-2.9/2.8 as well?
Sunil G, sure that would be fine. Thanks for the reviews and commits.
{quote} 
It doesn't look like this happened. It backports cleanly to branch-2. Unless 
there are objections, I will like to backport this to branch-2.


was (Author: eepayne):
{quote}
bq. could i backport this to branch-2.9/2.8 as well?
Sunil G, sure that would be fine. Thanks for the reviews and commits.
{quote} 
It doesn't look like this happened. I backports cleanly to branch-2. Unless 
there are objections, I would like to backport this to branch-2.

> Add unit tests for inter queue preemption for dominant resource calculator
> --
>
> Key: YARN-8004
> URL: https://issues.apache.org/jira/browse/YARN-8004
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Zian Chen
>Priority: Critical
> Fix For: 3.2.0, 3.1.1, 3.0.3
>
> Attachments: YARN-8004.001.patch, YARN-8004.002.patch, 
> YARN-8004.003.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9768) RM Renew Delegation token thread should timeout and retry

2019-11-05 Thread Manikandan R (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967650#comment-16967650
 ] 

Manikandan R commented on YARN-9768:


Attached .007.patch. Please review.

> RM Renew Delegation token thread should timeout and retry
> -
>
> Key: YARN-9768
> URL: https://issues.apache.org/jira/browse/YARN-9768
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: CR Hota
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-9768.001.patch, YARN-9768.002.patch, 
> YARN-9768.003.patch, YARN-9768.004.patch, YARN-9768.005.patch, 
> YARN-9768.006.patch, YARN-9768.007.patch
>
>
> Delegation token renewer thread in RM (DelegationTokenRenewer.java) renews 
> HDFS tokens received to check for validity and expiration time.
> This call is made to an underlying HDFS NN or Router Node (which has exact 
> APIs as HDFS NN). If one of the nodes is bad and the renew call is stuck the 
> thread remains stuck indefinitely. The thread should ideally timeout the 
> renewToken and retry from the client's perspective.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9768) RM Renew Delegation token thread should timeout and retry

2019-11-05 Thread Manikandan R (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikandan R updated YARN-9768:
---
Attachment: YARN-9768.007.patch

> RM Renew Delegation token thread should timeout and retry
> -
>
> Key: YARN-9768
> URL: https://issues.apache.org/jira/browse/YARN-9768
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: CR Hota
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-9768.001.patch, YARN-9768.002.patch, 
> YARN-9768.003.patch, YARN-9768.004.patch, YARN-9768.005.patch, 
> YARN-9768.006.patch, YARN-9768.007.patch
>
>
> Delegation token renewer thread in RM (DelegationTokenRenewer.java) renews 
> HDFS tokens received to check for validity and expiration time.
> This call is made to an underlying HDFS NN or Router Node (which has exact 
> APIs as HDFS NN). If one of the nodes is bad and the renew call is stuck the 
> thread remains stuck indefinitely. The thread should ideally timeout the 
> renewToken and retry from the client's perspective.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8004) Add unit tests for inter queue preemption for dominant resource calculator

2019-11-05 Thread Eric Payne (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated YARN-8004:
-
Fix Version/s: 2.11.0
   2.10.1

> Add unit tests for inter queue preemption for dominant resource calculator
> --
>
> Key: YARN-8004
> URL: https://issues.apache.org/jira/browse/YARN-8004
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Zian Chen
>Priority: Critical
> Fix For: 3.2.0, 3.1.1, 3.0.3, 2.10.1, 2.11.0
>
> Attachments: YARN-8004.001.patch, YARN-8004.002.patch, 
> YARN-8004.003.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8004) Add unit tests for inter queue preemption for dominant resource calculator

2019-11-05 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967660#comment-16967660
 ] 

Eric Payne commented on YARN-8004:
--

Backport completed to branch-2 and branch-2.10.

> Add unit tests for inter queue preemption for dominant resource calculator
> --
>
> Key: YARN-8004
> URL: https://issues.apache.org/jira/browse/YARN-8004
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Zian Chen
>Priority: Critical
> Fix For: 3.2.0, 3.1.1, 3.0.3, 2.10.1, 2.11.0
>
> Attachments: YARN-8004.001.patch, YARN-8004.002.patch, 
> YARN-8004.003.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Reopened] (YARN-8292) Fix the dominant resource preemption cannot happen when some of the resource vector becomes negative

2019-11-05 Thread Eric Payne (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne reopened YARN-8292:
--

Sorry for reopening, but this issue also exists in branch-2 and branch-2.10 
(see the VOTE discussion thread for Hadoop 2.10.0 RC1). Unfortunately, the 
patch does not backport cleanly to branch-2/2.10, so I would like to reopen 
this JIRA and put up a patch for branch-2/2.10. I am preparing a patch now.

> Fix the dominant resource preemption cannot happen when some of the resource 
> vector becomes negative
> 
>
> Key: YARN-8292
> URL: https://issues.apache.org/jira/browse/YARN-8292
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Wangda Tan
>Priority: Critical
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8292.001.patch, YARN-8292.002.patch, 
> YARN-8292.003.patch, YARN-8292.004.patch, YARN-8292.005.patch, 
> YARN-8292.006.patch, YARN-8292.007.patch, YARN-8292.008.patch, 
> YARN-8292.009.patch
>
>
> This is an example of the problem: 
>   
> {code}
> //   guaranteed,  max,used,   pending
> "root(=[30:18:6  30:18:6 12:12:6 1:1:1]);" + //root
> "-a(=[10:6:2 10:6:2  6:6:3   0:0:0]);" + // a
> "-b(=[10:6:2 10:6:2  6:6:3   0:0:0]);" + // b
> "-c(=[10:6:2 10:6:2  0:0:0   1:1:1])"; // c
> {code}
> There're 3 resource types. Total resource of the cluster is 30:18:6
> For both of a/b, there're 3 containers running, each of container is 2:2:1.
> Queue c uses 0 resource, and have 1:1:1 pending resource.
> Under existing logic, preemption cannot happen.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9949) Add missing queue configs for root queue in RMWebService#CapacitySchedulerInfo

2019-11-05 Thread Siyao Meng (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967684#comment-16967684
 ] 

Siyao Meng commented on YARN-9949:
--

[~prabhujoseph] [~sunilg] Yep, this commit breaks the compilation of branch-3.2 
due to YARN-9937 hasn't landed in branch-3.2 yet:
{code:mvn clean install -Pdist -DskipTests -e -Dmaven.javadoc.skip=true}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on 
project hadoop-yarn-server-resourcemanager: Compilation failure: Compilation 
failure:
[ERROR] 
/Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[53,13]
 cannot find symbol
[ERROR]   symbol:   class QueueAclsInfo
[ERROR]   location: class 
org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo
[ERROR] 
/Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[127,10]
 cannot find symbol
[ERROR]   symbol:   class QueueAclsInfo
[ERROR]   location: class 
org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo
[ERROR] 
/Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[76,48]
 cannot find symbol
[ERROR]   symbol:   method getMaximumAllocation()
[ERROR]   location: variable parent of type 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueue
[ERROR] 
/Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[79,21]
 cannot find symbol
[ERROR]   symbol:   class QueueAclsInfo
[ERROR]   location: class 
org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo
[ERROR] 
/Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[82,7]
 cannot find symbol
[ERROR]   symbol:   class QueueAclInfo
[ERROR]   location: class 
org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo
[ERROR] 
/Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[82,35]
 cannot find symbol
[ERROR]   symbol:   class QueueAclInfo
[ERROR]   location: class 
org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo
[ERROR] 
/Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[92,5]
 cannot find symbol
[ERROR]   symbol:   class QueueAclInfo
[ERROR]   location: class 
org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo
[ERROR] 
/Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[92,33]
 cannot find symbol
[ERROR]   symbol:   class QueueAclInfo
[ERROR]   location: class 
org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo
[ERROR] -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on 
project hadoop-yarn-server-resourcemanager: Compilation failure
at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
(MojoExecutor.java:215)
{code}

YARN-9937 doesn't seem to be a clean backport. If that's the case, please 
revert this on branch-3.2 so that it can be compiled. We can get this back in 
3.2 after YARN-9937 is in. Thanks!

> Add missing queue configs for root queue in 
> RMWebService#CapacitySchedulerInfo 
> ---
>
> Key: YARN-9949
> URL: https://issues.apache.org/jira/browse/YARN-9949
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Fix For: 3.3.0, 3.2.2
>
> Attachments: YARN-9949-001.patch, YARN-9949-002.patch
>
>
> YARN-9937 has a

[jira] [Comment Edited] (YARN-9949) Add missing queue configs for root queue in RMWebService#CapacitySchedulerInfo

2019-11-05 Thread Siyao Meng (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967684#comment-16967684
 ] 

Siyao Meng edited comment on YARN-9949 at 11/5/19 5:16 PM:
---

[~prabhujoseph] [~sunilg] Yep, this commit breaks the compilation of branch-3.2 
due to YARN-9937 hasn't landed in branch-3.2 yet:
{code:title=mvn clean install -Pdist -DskipTests -e -Dmaven.javadoc.skip=true}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on 
project hadoop-yarn-server-resourcemanager: Compilation failure: Compilation 
failure:
[ERROR] 
/Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[53,13]
 cannot find symbol
[ERROR]   symbol:   class QueueAclsInfo
[ERROR]   location: class 
org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo
[ERROR] 
/Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[127,10]
 cannot find symbol
[ERROR]   symbol:   class QueueAclsInfo
[ERROR]   location: class 
org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo
[ERROR] 
/Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[76,48]
 cannot find symbol
[ERROR]   symbol:   method getMaximumAllocation()
[ERROR]   location: variable parent of type 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueue
[ERROR] 
/Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[79,21]
 cannot find symbol
[ERROR]   symbol:   class QueueAclsInfo
[ERROR]   location: class 
org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo
[ERROR] 
/Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[82,7]
 cannot find symbol
[ERROR]   symbol:   class QueueAclInfo
[ERROR]   location: class 
org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo
[ERROR] 
/Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[82,35]
 cannot find symbol
[ERROR]   symbol:   class QueueAclInfo
[ERROR]   location: class 
org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo
[ERROR] 
/Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[92,5]
 cannot find symbol
[ERROR]   symbol:   class QueueAclInfo
[ERROR]   location: class 
org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo
[ERROR] 
/Users/smeng/repo/branch-3.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java:[92,33]
 cannot find symbol
[ERROR]   symbol:   class QueueAclInfo
[ERROR]   location: class 
org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo
[ERROR] -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on 
project hadoop-yarn-server-resourcemanager: Compilation failure
at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
(MojoExecutor.java:215)
{code}

I was working on another jira (backporting HADOOP-16152 to 3.2) and found this 
to be a potential issue for precommits.

YARN-9937 doesn't seem to be a clean backport. If that's the case, please 
revert this on branch-3.2 so that it can be compiled. We can get this back in 
3.2 after YARN-9937 is in. Thanks!


was (Author: smeng):
[~prabhujoseph] [~sunilg] Yep, this commit breaks the compilation of branch-3.2 
due to YARN-9937 hasn't landed in branch-3.2 yet:
{code:mvn clean install -Pdist -DskipTests -e -Dmaven.javadoc.skip=true}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on 
project hadoop-yarn-server-resourcemanager: Compilation failure: Compilation 
failure:
[ERROR] 
/Users/smeng/repo/bra

[jira] [Commented] (YARN-9768) RM Renew Delegation token thread should timeout and retry

2019-11-05 Thread Jira


[ 
https://issues.apache.org/jira/browse/YARN-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967695#comment-16967695
 ] 

Íñigo Goiri commented on YARN-9768:
---

[^YARN-9768.007.patch] looks good.
Let's see what Yetus says.

> RM Renew Delegation token thread should timeout and retry
> -
>
> Key: YARN-9768
> URL: https://issues.apache.org/jira/browse/YARN-9768
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: CR Hota
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-9768.001.patch, YARN-9768.002.patch, 
> YARN-9768.003.patch, YARN-9768.004.patch, YARN-9768.005.patch, 
> YARN-9768.006.patch, YARN-9768.007.patch
>
>
> Delegation token renewer thread in RM (DelegationTokenRenewer.java) renews 
> HDFS tokens received to check for validity and expiration time.
> This call is made to an underlying HDFS NN or Router Node (which has exact 
> APIs as HDFS NN). If one of the nodes is bad and the renew call is stuck the 
> thread remains stuck indefinitely. The thread should ideally timeout the 
> renewToken and retry from the client's perspective.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9899) Migration tool that help to generate CS config based on FS config [Phase 2]

2019-11-05 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967707#comment-16967707
 ] 

Hadoop QA commented on YARN-9899:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
39s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  5m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 54s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
38s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
18s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
22s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 63 unchanged - 1 fixed = 63 total (was 64) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  4m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m 
25s{color} | {color:green} There were no new shellcheck issues. {color} |
| {color:green}+1{color} | {color:green} shelldocs {color} | {color:green}  0m 
17s{color} | {color:green} There were no new shelldocs issues. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 24s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
36s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 83m 32s{color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 90m 
45s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
42s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}265m 17s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.webproxy.TestWebAppProxyServlet 

[jira] [Assigned] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'

2019-11-05 Thread Yufei Gu (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu reassigned YARN-9940:
--

Assignee: kailiu_dev

> avoid continuous scheduling thread crashes while sorting nodes get 
> 'Comparison method violates its general contract'
> 
>
> Key: YARN-9940
> URL: https://issues.apache.org/jira/browse/YARN-9940
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: kailiu_dev
>Assignee: kailiu_dev
>Priority: Major
> Fix For: 2.7.2
>
> Attachments: YARN-9940-branch-2.7.2.001.patch
>
>
> 2019-10-16 09:14:51,215 ERROR 
> org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread 
> Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception.
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
>     at java.util.TimSort.mergeHi(TimSort.java:868)
>     at java.util.TimSort.mergeAt(TimSort.java:485)
>     at java.util.TimSort.mergeForceCollapse(TimSort.java:426)
>     at java.util.TimSort.sort(TimSort.java:223)
>     at java.util.TimSort.sort(TimSort.java:173)
>     at java.util.Arrays.sort(Arrays.java:659)
>     at java.util.Collections.sort(Collections.java:217)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'

2019-11-05 Thread Yufei Gu (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967729#comment-16967729
 ] 

Yufei Gu commented on YARN-9940:


Hi [~kailiu_dev], added you to the contributor role, and assign this to you. I 
will try to review this later.

> avoid continuous scheduling thread crashes while sorting nodes get 
> 'Comparison method violates its general contract'
> 
>
> Key: YARN-9940
> URL: https://issues.apache.org/jira/browse/YARN-9940
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: kailiu_dev
>Assignee: kailiu_dev
>Priority: Major
> Fix For: 2.7.2
>
> Attachments: YARN-9940-branch-2.7.2.001.patch
>
>
> 2019-10-16 09:14:51,215 ERROR 
> org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread 
> Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception.
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
>     at java.util.TimSort.mergeHi(TimSort.java:868)
>     at java.util.TimSort.mergeAt(TimSort.java:485)
>     at java.util.TimSort.mergeForceCollapse(TimSort.java:426)
>     at java.util.TimSort.sort(TimSort.java:223)
>     at java.util.TimSort.sort(TimSort.java:173)
>     at java.util.Arrays.sort(Arrays.java:659)
>     at java.util.Collections.sort(Collections.java:217)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9937) Add missing queue configs in RMWebService#CapacitySchedulerQueueInfo

2019-11-05 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9937:

Attachment: YARN-9937-branch-3.2.002.patch

> Add missing queue configs in RMWebService#CapacitySchedulerQueueInfo
> 
>
> Key: YARN-9937
> URL: https://issues.apache.org/jira/browse/YARN-9937
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: Screen Shot 2019-10-28 at 8.54.53 PM.png, 
> YARN-9937-001.patch, YARN-9937-002.patch, YARN-9937-003.patch, 
> YARN-9937-004.patch, YARN-9937-branch-3.2.001.patch, 
> YARN-9937-branch-3.2.002.patch
>
>
> Below are the missing queue configs which are not part of RMWebServices 
> scheduler endpoint. 
> 1. Maximum Allocation
> 2. Queue ACLs
> 3. Queue Priority
> 4. Application Lifetime



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9677) Make FpgaDevice and GpuDevice classes more similar to each other

2019-11-05 Thread kevin su (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967808#comment-16967808
 ] 

kevin su commented on YARN-9677:


[~pbacsko] Thanks for the review

> Make FpgaDevice and GpuDevice classes more similar to each other
> 
>
> Key: YARN-9677
> URL: https://issues.apache.org/jira/browse/YARN-9677
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: kevin su
>Priority: Major
>  Labels: newbie, newbie++
> Attachments: YARN-9677.001.patch
>
>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.fpga.FpgaResourceAllocator.FpgaDevice
>  is an inner class of FpgaResourceAllocator.
> It is not only being used from its parent class but from other classes as 
> well so we are losing the purpose of the inner class, it does not really make 
> sense.
> We also have 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.GpuDevice
>  which is a similar class, but for GPU devices.
> What we could do here is to make FpgaDevice a single class and harmonize the 
> packages of these 2 classes, meaning they should be "closer" to each other in 
> terms of packaging.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9768) RM Renew Delegation token thread should timeout and retry

2019-11-05 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967810#comment-16967810
 ] 

Hadoop QA commented on YARN-9768:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
37s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m  2s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
0s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
39s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 19s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 2 new + 308 unchanged - 0 fixed = 310 total (was 308) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 21s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
52s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
45s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 87m 18s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
40s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}183m 24s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-9768 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12984971/YARN-9768.007.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shade

[jira] [Commented] (YARN-9937) Add missing queue configs in RMWebService#CapacitySchedulerQueueInfo

2019-11-05 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967827#comment-16967827
 ] 

Hadoop QA commented on YARN-9937:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
41s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.2 Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  6m 
39s{color} | {color:red} root in branch-3.2 failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
22s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-3.2 
failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} branch-3.2 passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
23s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-3.2 
failed. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  4m 
19s{color} | {color:red} branch has errors when building and testing our client 
artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
22s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-3.2 
failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} branch-3.2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 35s{color} 
| {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 15 new + 2 unchanged - 10 fixed = 17 total (was 12) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 30s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 12 new + 166 unchanged - 1 fixed = 178 total (was 167) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 53s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 0 new + 4 unchanged - 2 fixed = 4 total (was 6) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 73m 36s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}105m 58s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisherForV2 |
|   | 
hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:63396beab41 |
| JIRA Issue | YARN-9937 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12984986/YARN-9937-branch-3.2.002.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux d134ce98fe38 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/p

[jira] [Commented] (YARN-9953) YARN Service dependency should be configurable for each app

2019-11-05 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967829#comment-16967829
 ] 

Eric Yang commented on YARN-9953:
-

The programmable API for YARN service is to maintain backward compatibility at 
yarnfile level instead of at private Java API calls.  YARN service depends on 
API server, which is part of Resource Manager process.  There is fair bit of 
dependencies between YARN framework and the YARN service application.  By 
exposing the YARN service as configurable version, it will be harder to manage 
upgrades and create more obstacles for future version of YARN framework because 
older version of YARN service uses internal YARN API which may not work in 
future version of YARN.  Given those reasons, we can't move forward with this 
patch.  Sorry.

> YARN Service dependency should be configurable for each app
> ---
>
> Key: YARN-9953
> URL: https://issues.apache.org/jira/browse/YARN-9953
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.2
>Reporter: kyungwan nam
>Assignee: kyungwan nam
>Priority: Major
> Attachments: YARN-9953.001.patch
>
>
> Currently, YARN Service dependency can be set as yarn.service.framework.path.
> But, It works only as configured in RM.
> This makes it impossible for the user to choose their YARN Service dependency.
> It should be configurable for each app.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9949) Add missing queue configs for root queue in RMWebService#CapacitySchedulerInfo

2019-11-05 Thread Jonathan Turner Eagles (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967878#comment-16967878
 ] 

Jonathan Turner Eagles commented on YARN-9949:
--

Helping to get branch-3.2 compiling again to unblock other commits. I've 
reverted commit 11c763c22055fea367b19b338a3d8067f9386ba4 in branch-3.2. It 
seems like there is more work to be done for the back-port so this seems the 
cleaned way until it's resolved. Thanks for understanding the reason for the 
revert.

> Add missing queue configs for root queue in 
> RMWebService#CapacitySchedulerInfo 
> ---
>
> Key: YARN-9949
> URL: https://issues.apache.org/jira/browse/YARN-9949
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Fix For: 3.3.0, 3.2.2
>
> Attachments: YARN-9949-001.patch, YARN-9949-002.patch
>
>
> YARN-9937 has added below missing queue configs but missed to add for root 
> queue.
> 1. Maximum Allocation
> 2. Queue ACLs
> 3. Queue Priority
> 4. Application Lifetime
> 5. Ordering Policy.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-9949) Add missing queue configs for root queue in RMWebService#CapacitySchedulerInfo

2019-11-05 Thread Jonathan Turner Eagles (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967878#comment-16967878
 ] 

Jonathan Turner Eagles edited comment on YARN-9949 at 11/5/19 9:18 PM:
---

Helping to get branch-3.2 compiling again to unblock other committers. I've 
reverted commit 11c763c22055fea367b19b338a3d8067f9386ba4 in branch-3.2. It 
seems like there is more work to be done for the back-port so this seems the 
cleanest way until it's resolved. Thanks for understanding the reason for the 
revert.


was (Author: jeagles):
Helping to get branch-3.2 compiling again to unblock other commits. I've 
reverted commit 11c763c22055fea367b19b338a3d8067f9386ba4 in branch-3.2. It 
seems like there is more work to be done for the back-port so this seems the 
cleaned way until it's resolved. Thanks for understanding the reason for the 
revert.

> Add missing queue configs for root queue in 
> RMWebService#CapacitySchedulerInfo 
> ---
>
> Key: YARN-9949
> URL: https://issues.apache.org/jira/browse/YARN-9949
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Fix For: 3.3.0, 3.2.2
>
> Attachments: YARN-9949-001.patch, YARN-9949-002.patch
>
>
> YARN-9937 has added below missing queue configs but missed to add for root 
> queue.
> 1. Maximum Allocation
> 2. Queue ACLs
> 3. Queue Priority
> 4. Application Lifetime
> 5. Ordering Policy.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9930) Support max running app logic for CapacityScheduler

2019-11-05 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967930#comment-16967930
 ] 

Eric Payne commented on YARN-9930:
--

bq. When we migrate from FS to CS, this difference will make users be confused.
[~cane] I see.

So, if I understand correctly, you are saying the following:
- in FS, submitting more than {{maxRunningApps}} per user will just leave the 
apps waiting in the submitted state and will run them once other apps from that 
user have completed.
- The CS will refuse to submitt more than {{Max Applications Per User}}.

Is that correct?

If it is a requirement to change this behavior in the CS, I would at least like 
to see this change in behavior surrounded by a config property, with the 
default being the old CS behavior.

> Support max running app logic for CapacityScheduler
> ---
>
> Key: YARN-9930
> URL: https://issues.apache.org/jira/browse/YARN-9930
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 3.1.0, 3.1.1
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
>
> In FairScheduler, there has limitation for max running which will let 
> application pending.
> But in CapacityScheduler there has no feature like max running app.Only got 
> max app,and jobs will be rejected directly on client.
> This jira i want to implement this semantic for CapacityScheduler.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9562) Add Java changes for the new RuncContainerRuntime

2019-11-05 Thread Shane Kumpf (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967937#comment-16967937
 ] 

Shane Kumpf commented on YARN-9562:
---

Hey [~ebadger]. Thanks for your (and everyone elses) hard work here. Overall 
this looks to be coming together nicely.

I've taken a look at the code and have a couple of items, but nothing blocking. 
However, I'm having a bit of trouble getting runC containers working so far. 
I'm out of time to continue troubleshooting right now, but this is what I'm 
seeing, both dshell and MR pi do the same. Docker MR jobs are working. I am 
running all containers as the nobody user in this case.
{code:java}
2019-11-05 22:40:14,225 WARN 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor:
 Shell execution returned exit code: 35. Privileged Execution Operation Stderr:
Bad/Missing runC int
Could not create container dirs
Could not create local files and directories
Nonzero exit code=35, error message='Could not create work dirs'
Stdout: Can't create directory 
/tmp/hadoop-yarn/nm-local-dir/usercache/hadoopuser/appcache/application_1572993484434_0003/container_e04_1572993484434_0003_01_02
 - Permission denied
Full command array for failed execution:
[/usr/local/hadoop/bin/container-executor, --run-runc-container, 
/tmp/hadoop-yarn/nm-local-dir/nmPrivate/application_1572993484434_0003/container_e04_1572993484434_0003_01_02/runc-config.json]{code}
 

Here are some questions/nits on the patch. None of these are blockers IMO.

Questions/Comments:

1) Why is the keystore and truststore needed within RuncContainerExecutorConfig?

2) I'm not a big fan of hard coded mounts like this. This would also be 
problematic for systemd based containers where systemd expects /tmp to be a 
tmpfs.
{code:java}
addRuncMountLocation(mounts, containerWorkDir.toString() +
"/private_slash_tmp", "/tmp", true, true);
addRuncMountLocation(mounts, containerWorkDir.toString() +
"/private_var_slash_tmp", "/var/tmp", true, true);
{code}
3) It would be great to track these disabled features for future implementation.
{code:java}
  public String getExposedPorts(Container container) {
return null;
  }

  public String[] getIpAndHost(Container container) {
return null;
  }

  public IOStreamPair execContainer(ContainerExecContext ctx)
  throws ContainerExecutionException {
return null;
  }

  public void reapContainer(ContainerRuntimeContext ctx)
  throws ContainerExecutionException {
  }

  public void relaunchContainer(ContainerRuntimeContext ctx)
  throws ContainerExecutionException {
  }
{code}
Nits:

1) clean up the whitespace around Container#getContainerRuntimeData

2) RuncContainerExecutorConfig typo in class javadoc

3) YarnConfiguration DEFAULT_NM_RUNC_ALLOWED_CONTAINER_NETWORKS and 
DEFAULT_NM_RUNC_ALLOWED_CONTAINER_RUNTIMES - copy and paste error on the javadoc

4) Many of the tests create tmpDirs but don't appear to clean them up. 
TestRuncContainerRuntime creates two temp dirs, once via mkdirs and the other 
via a Rule.
{code:java}
TestDockerContainerRuntime mkdirs for tmpDir
TestHdfsManifestToResouvesPlugin creates a tmpDir but doesn't clean it up
TestRuncContainerRuntime has both a tmpDir and TempDir created by a @Rule
{code}
5) Docs
 * Overview: "if created", newline after runC in second paragraph.
 * Docker to squash section: first paragraph "Getting" newline.
 * I'm fine with leaving reference to the patch to docker_to_squash.py for now 
until we have a better story, but I did need to do a few steps to get that tool 
working. 1) Create the hdfs runc-root as root 2) install skopeo, 
squashfs-tools, and attr.

> Add Java changes for the new RuncContainerRuntime
> -
>
> Key: YARN-9562
> URL: https://issues.apache.org/jira/browse/YARN-9562
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-9562.001.patch, YARN-9562.002.patch, 
> YARN-9562.003.patch, YARN-9562.004.patch, YARN-9562.005.patch, 
> YARN-9562.006.patch, YARN-9562.007.patch, YARN-9562.008.patch, 
> YARN-9562.009.patch, YARN-9562.010.patch, YARN-9562.011.patch, 
> YARN-9562.012.patch, YARN-9562.013.patch
>
>
> This JIRA will be used to add the Java changes for the new 
> RuncContainerRuntime. This will work off of YARN-9560 to use much of the 
> existing DockerLinuxContainerRuntime code once it is moved up into an 
> abstract class that can be extended. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Created] (YARN-9954) Configurable max application tags and max tag length

2019-11-05 Thread Jonathan Hung (Jira)
Jonathan Hung created YARN-9954:
---

 Summary: Configurable max application tags and max tag length
 Key: YARN-9954
 URL: https://issues.apache.org/jira/browse/YARN-9954
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jonathan Hung


Currently max tags and max tag length is hardcoded, it should be configurable
{noformat}
@Evolving
public static final int APPLICATION_MAX_TAGS = 10;

@Evolving
public static final int APPLICATION_MAX_TAG_LENGTH = 100; {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9768) RM Renew Delegation token thread should timeout and retry

2019-11-05 Thread Jira


[ 
https://issues.apache.org/jira/browse/YARN-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968002#comment-16968002
 ] 

Íñigo Goiri commented on YARN-9768:
---

We can fix the checkstyles.
And as we are changing that, let's also do:
{code}
60s
{code}

> RM Renew Delegation token thread should timeout and retry
> -
>
> Key: YARN-9768
> URL: https://issues.apache.org/jira/browse/YARN-9768
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: CR Hota
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-9768.001.patch, YARN-9768.002.patch, 
> YARN-9768.003.patch, YARN-9768.004.patch, YARN-9768.005.patch, 
> YARN-9768.006.patch, YARN-9768.007.patch
>
>
> Delegation token renewer thread in RM (DelegationTokenRenewer.java) renews 
> HDFS tokens received to check for validity and expiration time.
> This call is made to an underlying HDFS NN or Router Node (which has exact 
> APIs as HDFS NN). If one of the nodes is bad and the renew call is stuck the 
> thread remains stuck indefinitely. The thread should ideally timeout the 
> renewToken and retry from the client's perspective.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9605) Add ZkConfiguredFailoverProxyProvider for RM HA

2019-11-05 Thread Zhankun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968014#comment-16968014
 ] 

Zhankun Tang commented on YARN-9605:


[~cane], I triggered a new build and let's see.

> Add ZkConfiguredFailoverProxyProvider for RM HA
> ---
>
> Key: YARN-9605
> URL: https://issues.apache.org/jira/browse/YARN-9605
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
> Fix For: 3.2.0, 3.1.2
>
> Attachments: YARN-9605.001.patch, YARN-9605.002.patch
>
>
> In this issue, i will track a new feature to support 
> ZkConfiguredFailoverProxyProvider for RM HA



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9940) avoid continuous scheduling thread crashes while sorting nodes get 'Comparison method violates its general contract'

2019-11-05 Thread kailiu_dev (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968027#comment-16968027
 ] 

kailiu_dev commented on YARN-9940:
--

Dear [~yufeigu] , thanks for your replay! :).  I am Looking forward to 
receiving your {color:#172b4d}review {color}and replay! 

> avoid continuous scheduling thread crashes while sorting nodes get 
> 'Comparison method violates its general contract'
> 
>
> Key: YARN-9940
> URL: https://issues.apache.org/jira/browse/YARN-9940
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: kailiu_dev
>Assignee: kailiu_dev
>Priority: Major
> Fix For: 2.7.2
>
> Attachments: YARN-9940-branch-2.7.2.001.patch
>
>
> 2019-10-16 09:14:51,215 ERROR 
> org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread 
> Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception.
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
>     at java.util.TimSort.mergeHi(TimSort.java:868)
>     at java.util.TimSort.mergeAt(TimSort.java:485)
>     at java.util.TimSort.mergeForceCollapse(TimSort.java:426)
>     at java.util.TimSort.sort(TimSort.java:223)
>     at java.util.TimSort.sort(TimSort.java:173)
>     at java.util.Arrays.sort(Arrays.java:659)
>     at java.util.Collections.sort(Collections.java:217)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117)
>     at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9955) LogAggregationService Thread OOM

2019-11-05 Thread wangxiangchun (Jira)
wangxiangchun created YARN-9955:
---

 Summary: LogAggregationService Thread OOM
 Key: YARN-9955
 URL: https://issues.apache.org/jira/browse/YARN-9955
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.3, 2.7.2
Reporter: wangxiangchun


Because of some IPC problem,we found that if our recover directory stores too 
much container information.When we restart nodemanager , it appears to error:

_java.lang.OutOfMemoryError: Unable to create new native thread_ 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9955) LogAggregationService Thread OOM

2019-11-05 Thread wangxiangchun (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangxiangchun updated YARN-9955:

Attachment: e04cffec-d7d9-4817-a483-4b0c6d8001f5-1092898.jpg
9f6ef9ac-7b25-4aa0-a6db-f03b0bf003e0-1092898.jpg

> LogAggregationService Thread OOM
> 
>
> Key: YARN-9955
> URL: https://issues.apache.org/jira/browse/YARN-9955
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.2, 2.7.3
>Reporter: wangxiangchun
>Priority: Major
> Attachments: 9f6ef9ac-7b25-4aa0-a6db-f03b0bf003e0-1092898.jpg, 
> e04cffec-d7d9-4817-a483-4b0c6d8001f5-1092898.jpg
>
>
> Because of some IPC problem,we found that if our recover directory stores too 
> much container information.When we restart nodemanager , it appears to error:
> _java.lang.OutOfMemoryError: Unable to create new native thread_ 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9955) LogAggregationService Thread OOM

2019-11-05 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968075#comment-16968075
 ] 

Hadoop QA commented on YARN-9955:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  8s{color} 
| {color:red} YARN-9955 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-9955 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12985019/9955.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/25108/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> LogAggregationService Thread OOM
> 
>
> Key: YARN-9955
> URL: https://issues.apache.org/jira/browse/YARN-9955
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.2, 2.7.3
>Reporter: wangxiangchun
>Priority: Major
> Attachments: 9955.patch, 
> 9f6ef9ac-7b25-4aa0-a6db-f03b0bf003e0-1092898.jpg, 
> e04cffec-d7d9-4817-a483-4b0c6d8001f5-1092898.jpg
>
>
> Because of some IPC problem,we found that if our recover directory stores too 
> much container information.When we restart nodemanager , it appears to error:
> _java.lang.OutOfMemoryError: Unable to create new native thread_ 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9605) Add ZkConfiguredFailoverProxyProvider for RM HA

2019-11-05 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968088#comment-16968088
 ] 

Hadoop QA commented on YARN-9605:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
39s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
2s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
19m 59s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
18s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 15m 38s{color} | 
{color:red} root generated 3 new + 23 unchanged - 3 fixed = 26 total (was 26) 
{color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
38s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 35s{color} | {color:orange} root: The patch generated 22 new + 22 unchanged 
- 0 fixed = 44 total (was 22) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 48s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m  
0s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
55s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
48s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 86m  
0s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
47s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}218m  9s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-9605 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12982939/YARN-9605.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  uni

[jira] [Updated] (YARN-9605) Add ZkConfiguredFailoverProxyProvider for RM HA

2019-11-05 Thread zhoukang (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhoukang updated YARN-9605:
---
Attachment: YARN-9605.003.patch

> Add ZkConfiguredFailoverProxyProvider for RM HA
> ---
>
> Key: YARN-9605
> URL: https://issues.apache.org/jira/browse/YARN-9605
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
> Fix For: 3.2.0, 3.1.2
>
> Attachments: YARN-9605.001.patch, YARN-9605.002.patch, 
> YARN-9605.003.patch
>
>
> In this issue, i will track a new feature to support 
> ZkConfiguredFailoverProxyProvider for RM HA



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9605) Add ZkConfiguredFailoverProxyProvider for RM HA

2019-11-05 Thread zhoukang (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhoukang updated YARN-9605:
---
Attachment: (was: YARN-9605.003.patch)

> Add ZkConfiguredFailoverProxyProvider for RM HA
> ---
>
> Key: YARN-9605
> URL: https://issues.apache.org/jira/browse/YARN-9605
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
> Fix For: 3.2.0, 3.1.2
>
> Attachments: YARN-9605.001.patch, YARN-9605.002.patch, 
> YARN-9605.003.patch
>
>
> In this issue, i will track a new feature to support 
> ZkConfiguredFailoverProxyProvider for RM HA



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9605) Add ZkConfiguredFailoverProxyProvider for RM HA

2019-11-05 Thread zhoukang (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhoukang updated YARN-9605:
---
Attachment: YARN-9605.003.patch

> Add ZkConfiguredFailoverProxyProvider for RM HA
> ---
>
> Key: YARN-9605
> URL: https://issues.apache.org/jira/browse/YARN-9605
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
> Fix For: 3.2.0, 3.1.2
>
> Attachments: YARN-9605.001.patch, YARN-9605.002.patch, 
> YARN-9605.003.patch
>
>
> In this issue, i will track a new feature to support 
> ZkConfiguredFailoverProxyProvider for RM HA



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9605) Add ZkConfiguredFailoverProxyProvider for RM HA

2019-11-05 Thread zhoukang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968138#comment-16968138
 ] 

zhoukang commented on YARN-9605:


new patch has pushed but error below i can not figure out the cause
{code:java}
[WARNING] 
/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/target/main/native/libhdfspp/lib/proto/acl.pb.cc:533:13:
 warning: 'dynamic_init_dummy_acl_2eproto' defined but not used 
[-Wunused-variable]
[WARNING] 
/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/target/main/native/libhdfspp/lib/proto/inotify.pb.cc:467:13:
 warning: 'dynamic_init_dummy_inotify_2eproto' defined but not used 
[-Wunused-variable]
[WARNING] 
/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/target/main/native/libhdfspp/lib/proto/HAServiceProtocol.pb.cc:404:13:
 warning: 'dynamic_init_dummy_HAServiceProtocol_2eproto' defined but not used 
[-Wunused-variable]
{code}

> Add ZkConfiguredFailoverProxyProvider for RM HA
> ---
>
> Key: YARN-9605
> URL: https://issues.apache.org/jira/browse/YARN-9605
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
> Fix For: 3.2.0, 3.1.2
>
> Attachments: YARN-9605.001.patch, YARN-9605.002.patch, 
> YARN-9605.003.patch
>
>
> In this issue, i will track a new feature to support 
> ZkConfiguredFailoverProxyProvider for RM HA



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org