[jira] [Commented] (YARN-9646) Yarn miniYarn cluster tests failed to bind to a local host name

2019-07-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885863#comment-16885863
 ] 

Hadoop QA commented on YARN-9646:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
28s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
57s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 46s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
{color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
18s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
15s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  0s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 6 new + 95 unchanged - 3 fixed = 101 total (was 98) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  3s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
{color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
7s{color} | {color:green} hadoop-yarn-server-tests in the patch passed. {color} 
|
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 
15s{color} | {color:green} hadoop-yarn-applications-distributedshell in the 
patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 93m 50s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9646 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12974748/YARN-9646.00.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  

[jira] [Commented] (YARN-9646) Yarn miniYarn cluster tests failed to bind to a local host name

2019-07-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885857#comment-16885857
 ] 

Hadoop QA commented on YARN-9646:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
47s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
43s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 25s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
{color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
51s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 16s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 6 new + 95 unchanged - 3 fixed = 101 total (was 98) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 48s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
{color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  3m 14s{color} 
| {color:red} hadoop-yarn-server-tests in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 
32s{color} | {color:green} hadoop-yarn-applications-distributedshell in the 
patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
42s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}100m 38s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | YARN-9646 |
| JIRA Patch URL | 

[jira] [Commented] (YARN-8199) Logging fileSize of log files under NM Local Dir

2019-07-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885624#comment-16885624
 ] 

Hadoop QA commented on YARN-8199:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
40s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  5m 
36s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 23s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
51s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  9m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  4s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
9s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m  
0s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
49s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m 
35s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
45s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}120m 20s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-8199 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12974750/YARN-8199-004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux dcb9c7997eff 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 

[jira] [Commented] (YARN-9671) Improve Locality Scheduling when cluster is busy

2019-07-15 Thread Muhammad Samir Khan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885468#comment-16885468
 ] 

Muhammad Samir Khan commented on YARN-9671:
---

Edited description: will handle priority inversion metrics separately in 
another jira.

> Improve Locality Scheduling when cluster is busy
> 
>
> Key: YARN-9671
> URL: https://issues.apache.org/jira/browse/YARN-9671
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Muhammad Samir Khan
>Assignee: Muhammad Samir Khan
>Priority: Major
>
> When a cluster is very busy, scheduling opportunities are few and far 
> between. Scheduling opportunities are how an application knows when to give 
> up looking for decent locality.
> It doesn't make sense to work hard waiting for locality when the odds of it 
> coming are very small, and it may take a very long time to actually give up.
> This causes the priority of queues to be violated, which is the last thing we 
> want to do when the cluster is full.
> Add a mode to disable skipping locality when cluster is busy.
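
For readers skimming the thread, a hedged sketch of the kind of check this
proposal implies. All names below (shouldKeepWaitingForLocality,
disableSkipWhenBusy, and the busy signal itself) are illustrative assumptions,
not identifiers from the actual patch:

{code:java}
// Illustrative sketch only -- not the actual YARN-9671 change.
public class LocalityWaitSketch {
  /**
   * Decide whether an app should keep skipping scheduling opportunities
   * while it waits for a node- or rack-local container.
   */
  static boolean shouldKeepWaitingForLocality(int missedOpportunities,
      int localityDelay, boolean clusterBusy, boolean disableSkipWhenBusy) {
    if (clusterBusy && disableSkipWhenBusy) {
      // On a full cluster, opportunities are rare, so counting them makes
      // the app wait far too long and lets queue priority be violated.
      // The proposed mode gives up on locality immediately in that case.
      return false;
    }
    // Normal behavior: keep waiting until enough opportunities are missed.
    return missedOpportunities < localityDelay;
  }
}
{code}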






[jira] [Updated] (YARN-9671) Improve Locality Scheduling when cluster is busy

2019-07-15 Thread Muhammad Samir Khan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Muhammad Samir Khan updated YARN-9671:
--
Description: 
When a cluster is very busy, scheduling opportunities are few and far between. 
Scheduling opportunities are how an application knows when to give up looking 
for decent locality.

It doesn't make sense to work hard waiting for locality when the odds of it 
coming are very small, and it may take a very long time to actually give up.

This causes the priority of queues to be violated, which is the last thing we 
want to do when the cluster is full.

Add a mode to disable skipping locality when cluster is busy.

  was:
When a cluster is very busy, scheduling opportunities are few and far between. 
Scheduling opportunities are how an application knows when to give up looking 
for decent locality.

It doesn't make sense to work hard waiting for locality when the odds of it 
coming are very small, and it may take a very long time to actually give up.

This causes the priority of queues to be violated, which is the last thing we 
want to do when the cluster is full.
 * Add metrics for queue priority inversions.
 * Add mode to disable skipping locality when cluster is busy.


> Improve Locality Scheduling when cluster is busy
> 
>
> Key: YARN-9671
> URL: https://issues.apache.org/jira/browse/YARN-9671
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Muhammad Samir Khan
>Assignee: Muhammad Samir Khan
>Priority: Major
>
> When a cluster is very busy, scheduling opportunities are few and far 
> between. Scheduling opportunities are how an application knows when to give 
> up looking for decent locality.
> It doesn't make sense to work hard waiting for locality when the odds of it 
> coming are very small, and it may take a very long time to actually give up.
> This causes the priority of queues to be violated, which is the last thing we 
> want to do when the cluster is full.
> Add a mode to disable skipping locality when cluster is busy.






[jira] [Commented] (YARN-9668) UGI conf doesn't read user overridden configurations on RM and NM startup

2019-07-15 Thread Jonathan Hung (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885443#comment-16885443
 ] 

Jonathan Hung commented on YARN-9668:
-

The TestCapacityOverTimePolicy test failure looks related to YARN-9450.

> UGI conf doesn't read user overridden configurations on RM and NM startup
> -
>
> Key: YARN-9668
> URL: https://issues.apache.org/jira/browse/YARN-9668
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.10.0
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9668-branch-2.001.patch, 
> YARN-9668-branch-2.002.patch, YARN-9668-branch-3.2.001.patch, 
> YARN-9668.001.patch, YARN-9668.002.patch, YARN-9668.003.patch
>
>
> Similar to HADOOP-15150. Configs overridden through e.g. -D or -conf are not 
> passed to the UGI conf on RM or NM startup.
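
For context, a minimal sketch of the HADOOP-15150-style fix this issue points
at: push the fully resolved service Configuration into UGI during daemon
startup. UserGroupInformation.setConfiguration is an existing Hadoop API; the
startup wiring and the simulated override are assumptions for illustration:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class UgiConfSketch {
  public static void main(String[] args) {
    // 'conf' carries the user's overrides (a -D/-conf override, simulated).
    Configuration conf = new YarnConfiguration();
    conf.set("hadoop.security.auth_to_local", "DEFAULT");
    // Without this call, UGI lazily builds its own Configuration and never
    // sees the overrides -- the behavior this issue describes.
    UserGroupInformation.setConfiguration(conf);
  }
}
{code}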






[jira] [Commented] (YARN-8199) Logging fileSize of log files under NM Local Dir

2019-07-15 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885389#comment-16885389
 ] 

Prabhu Joseph commented on YARN-8199:
-

Thanks [~adam.antal] for the review. I have addressed 1, 2 & 4 in 
[^YARN-8199-004.patch].

Regarding 3 - the file path is the container log directory, which has both the 
appId and the containerId.

{code}
2019-07-15 16:16:09,490 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl:
 Log File 
/hadoop/yarn/log/application_1559302665385_0012/container_e01_1559302665385_0012_01_01/syslog
 size is 105906176
 bytes
{code}

bq. Just out of curiosity: how did you end up with 100Mb as the limit?

I picked 100Mb while debugging a customer case, but there is no strong reason 
behind it; the right value will vary for each user.
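
As a rough illustration of the configurable threshold under discussion (the
key name follows the rename proposed in the review, and the 100Mb default is
just the value mentioned above; both are assumptions about the final patch):

{code:java}
import org.apache.hadoop.conf.Configuration;

public class ThresholdSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Key name per review point 4; 100Mb default per the discussion above.
    long threshold = conf.getLong("yarn.log-aggregation.debug.filesize",
        100L * 1024 * 1024);
    System.out.println("log-size warning threshold: " + threshold + " bytes");
  }
}
{code}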

> Logging fileSize of log files under NM Local Dir
> 
>
> Key: YARN-8199
> URL: https://issues.apache.org/jira/browse/YARN-8199
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: log-aggregation
>Affects Versions: 2.7.3
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>  Labels: supportability
> Attachments: 0001-YARN-8199.patch, 0002-YARN-8199.patch, 
> YARN-8199-003.patch, YARN-8199-004.patch
>
>
> Logging the fileSize of log files like syslog, stderr, stdout under the NM 
> Local Dir by the NodeManager before the cleanup will help to find 
> applications that have written overly verbose logs.






[jira] [Updated] (YARN-8199) Logging fileSize of log files under NM Local Dir

2019-07-15 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-8199:

Attachment: YARN-8199-004.patch

> Logging fileSize of log files under NM Local Dir
> 
>
> Key: YARN-8199
> URL: https://issues.apache.org/jira/browse/YARN-8199
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: log-aggregation
>Affects Versions: 2.7.3
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>  Labels: supportability
> Attachments: 0001-YARN-8199.patch, 0002-YARN-8199.patch, 
> YARN-8199-003.patch, YARN-8199-004.patch
>
>
> Logging the fileSize of log files like syslog, stderr, stdout under the NM 
> Local Dir by the NodeManager before the cleanup will help to find 
> applications that have written overly verbose logs.






[jira] [Commented] (YARN-9646) Yarn miniYarn cluster tests failed to bind to a local host name

2019-07-15 Thread Haibo Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885353#comment-16885353
 ] 

Haibo Chen commented on YARN-9646:
--

Thanks [~ste...@apache.org] for the clarification. Agreed that the 
MiniYARNCluster is fussy, as I have seen other issues with it in the past. I 
believe this change will be an improvement at least. +1 on the change pending 
the Jenkins report, given it has been a few weeks since it was submitted. 

I have attached the patch from the git pull request to trigger the Jenkins 
build.
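
The patch contents are not shown in this thread, but for anyone hitting the
same BindException, here is a test-side sketch of one plausible workaround:
forcing the mini cluster services onto the wildcard address. The bind-host
keys are existing YarnConfiguration constants; whether the actual patch takes
this route is an assumption:

{code:java}
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.server.MiniYARNCluster;

public class MiniClusterBindSketch {
  public static void main(String[] args) {
    YarnConfiguration conf = new YarnConfiguration();
    // Bind RM/NM listeners to the wildcard address instead of the machine's
    // resolved host name, which may not be assignable (the error below).
    conf.set(YarnConfiguration.RM_BIND_HOST, "0.0.0.0");
    conf.set(YarnConfiguration.NM_BIND_HOST, "0.0.0.0");
    MiniYARNCluster cluster = new MiniYARNCluster("bind-sketch", 1, 1, 1);
    cluster.init(conf);
    cluster.start();
    cluster.stop();
  }
}
{code}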

> Yarn miniYarn cluster tests failed to bind to a local host name
> ---
>
> Key: YARN-9646
> URL: https://issues.apache.org/jira/browse/YARN-9646
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.4
>Reporter: Ray Yang
>Assignee: Ray Yang
>Priority: Major
> Attachments: YARN-9646.00.patch
>
>
> When running the integration test 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell#testDSShellWithoutDomain
> at home, the following error happened:
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [ruyang-mn3.linkedin.biz:0] 
> java.net.BindException: Can't assign requested address; For more details see: 
>  [http://wiki.apache.org/hadoop/BindException]
>  
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:327)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.access$400(MiniYARNCluster.java:99)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:447)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.serviceStart(MiniYARNCluster.java:278)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.setupInternal(TestDistributedShell.java:91)
> at 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.setup(TestDistributedShell.java:71)
> …
> Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [ruyang-mn3.linkedin.biz:0] 
> java.net.BindException: Can't assign requested address; For more details see: 
>  [http://wiki.apache.org/hadoop/BindException]
> at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:139)
> at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
> at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.*ResourceTrackerService.serviceStart*(ResourceTrackerService.java:163)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:588)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:976)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1017)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1013)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1013)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1053)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:319)
> ... 31 more
> Caused by: java.net.BindException: Problem binding to 
> [ruyang-mn3.linkedin.biz:0]java.net.BindException: Can't assign requested 
> address; For more details see:  [http://wiki.apache.org/hadoop/BindException]
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> 

[jira] [Updated] (YARN-9646) Yarn miniYarn cluster tests failed to bind to a local host name

2019-07-15 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-9646:
-
Attachment: YARN-9646.00.patch

> Yarn miniYarn cluster tests failed to bind to a local host name
> ---
>
> Key: YARN-9646
> URL: https://issues.apache.org/jira/browse/YARN-9646
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.4
>Reporter: Ray Yang
>Assignee: Ray Yang
>Priority: Major
> Attachments: YARN-9646.00.patch
>
>
> When running the integration test 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell#testDSShellWithoutDomain
> at home, the following error happened:
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [ruyang-mn3.linkedin.biz:0] 
> java.net.BindException: Can't assign requested address; For more details see: 
>  [http://wiki.apache.org/hadoop/BindException]
>  
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:327)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.access$400(MiniYARNCluster.java:99)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:447)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.serviceStart(MiniYARNCluster.java:278)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.setupInternal(TestDistributedShell.java:91)
> at 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.setup(TestDistributedShell.java:71)
> …
> Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [ruyang-mn3.linkedin.biz:0] 
> java.net.BindException: Can't assign requested address; For more details see: 
>  [http://wiki.apache.org/hadoop/BindException]
> at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:139)
> at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
> at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.*ResourceTrackerService.serviceStart*(ResourceTrackerService.java:163)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:588)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:976)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1017)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1013)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1013)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1053)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:319)
> ... 31 more
> Caused by: java.net.BindException: Problem binding to 
> [ruyang-mn3.linkedin.biz:0]java.net.BindException: Can't assign requested 
> address; For more details see:  [http://wiki.apache.org/hadoop/BindException]
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:721)
> at org.apache.hadoop.ipc.Server.bind(Server.java:494)
> at org.apache.hadoop.ipc.Server$Listener.(Server.java:715)
> at 

[jira] [Commented] (YARN-9451) AggregatedLogsBlock shows wrong NM http port

2019-07-15 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885322#comment-16885322
 ] 

Prabhu Joseph commented on YARN-9451:
-

Thanks [~snemeth] for reviewing. Can we backport this to branch-3.2 and branch-3.1?

> AggregatedLogsBlock shows wrong NM http port
> 
>
> Key: YARN-9451
> URL: https://issues.apache.org/jira/browse/YARN-9451
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: Screen Shot 2019-06-20 at 7.49.46 PM.png, 
> YARN-9451-001.patch, YARN-9451-002.patch, YARN-9451-003.patch
>
>
> AggregatedLogsBlock shows the wrong NM HTTP port when the aggregated file is 
> not available. It shows [http://yarn-ats-3:45454|http://yarn-ats-3:45454/] - 
> the NM RPC port - instead of the HTTP port.
> {code:java}
> Logs not available for job_1554476304275_0003. Aggregation may not be 
> complete, Check back later or try the nodemanager at yarn-ats-3:45454
> Or see application log at 
> http://yarn-ats-3:45454/node/application/application_1554476304275_0003
> {code}
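
To make the two addresses concrete, a small sketch reading both config keys
(the constants are existing YarnConfiguration fields; that the fix swaps the
former for the latter when rendering the link is an inference from the
description above):

{code:java}
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class NmPortSketch {
  public static void main(String[] args) {
    YarnConfiguration conf = new YarnConfiguration();
    // RPC (IPC) address -- what the broken link above was built from.
    String rpcAddress = conf.get(YarnConfiguration.NM_ADDRESS);
    // Web (HTTP) address -- what a browser link should use (default 8042).
    String httpAddress = conf.get(YarnConfiguration.NM_WEBAPP_ADDRESS,
        YarnConfiguration.DEFAULT_NM_WEBAPP_ADDRESS);
    System.out.println("rpc=" + rpcAddress + ", http=" + httpAddress);
  }
}
{code}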






[jira] [Commented] (YARN-6492) Generate queue metrics for each partition

2019-07-15 Thread Eric Payne (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885321#comment-16885321
 ] 

Eric Payne commented on YARN-6492:
--

[~maniraj...@gmail.com], the current patches don't apply anymore. Do you have a 
plan for updating them?

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Jonathan Hung
>Assignee: Manikandan R
>Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt, 
> PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, 
> YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, 
> partition_metrics.txt
>
>
> We are interested in having queue metrics for all partitions. Right now each 
> queue has one QueueMetrics object which captures metrics either in the 
> default partition or across all partitions. (After YARN-6467 it will be in 
> the default partition.)
> But having the per-partition metrics would be very useful.






[jira] [Commented] (YARN-9667) Container-executor.c duplicates messages to stdout

2019-07-15 Thread Adam Antal (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885318#comment-16885318
 ] 

Adam Antal commented on YARN-9667:
--

The error message reproduces intermittently in our downstream setup, but I 
will try to provide some more information soon.

The suggestions you mentioned look good to me, [~pbacsko].

> Container-executor.c duplicates messages to stdout
> --
>
> Key: YARN-9667
> URL: https://issues.apache.org/jira/browse/YARN-9667
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, yarn
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Priority: Major
>
> When a container is killed by its AM, we get an error message like this:
> {noformat}
> 2019-06-30 12:09:04,412 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor:
>  Shell execution returned exit code: 143. Privileged Execution Operation 
> Stderr:
> Stdout: main : command provided 1
> main : run as user is systest
> main : requested yarn user is systest
> Getting exit code file...
> Creating script paths...
> Writing pid file...
> Writing to tmp file 
> /yarn/nm/nmPrivate/application_1561921629886_0001/container_e84_1561921629886_0001_01_19/container_e84_1561921629886_0001_01_19.pid.tmp
> Writing to cgroup task files...
> Creating local dirs...
> Launching container...
> Getting exit code file...
> Creating script paths...
> {noformat}
> In container-executor.c the fork point is right after the "Creating script 
> paths..." part, though in the Stdout log we can clearly see it has been 
> written there twice. After consulting with [~pbacsko] it seems like there's a 
> missing flush in container-executor.c before the fork and that causes the 
> duplication.
> I suggest adding a flush there so that it won't be duplicated: it's a bit 
> misleading that the child process writes out "Getting exit code file" and 
> "Creating script paths" even though it is clearly not doing that.
> A more appealing solution could be to revisit the fprintf-fflush pairs in the 
> code and change them into a single call, so that the fflush calls cannot be 
> accidentally forgotten. (This can cause problems in every place where fork() 
> is used.)
> Note: this issue probably affects every occurrence of fork(), not just the 
> one from {{launch_container_as_user}} in {{main.c}}.






[jira] [Commented] (YARN-9573) DistributedShell cannot specify LogAggregationContext

2019-07-15 Thread Adam Antal (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885311#comment-16885311
 ] 

Adam Antal commented on YARN-9573:
--

Thanks [~snemeth]!

> DistributedShell cannot specify LogAggregationContext
> -
>
> Key: YARN-9573
> URL: https://issues.apache.org/jira/browse/YARN-9573
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: distributed-shell, log-aggregation, yarn
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Fix For: 3.3.0, 3.2.1, 3.1.3
>
> Attachments: YARN-9573.001.branch-3.1.patch, 
> YARN-9573.001.branch-3.2.patch, YARN-9573.001.patch, YARN-9573.002.patch, 
> YARN-9573.002.patch, YARN-9573.003.patch
>
>
> When DShell sends the application request object to the RM, it doesn't 
> specify the LogAggregationContext object - thus it is not possible to run 
> DShell with various log-aggregation configurations, e.g. rolling log 
> aggregation.






[jira] [Commented] (YARN-9337) GPU auto-discovery script runs even when the resource is given by hand

2019-07-15 Thread Adam Antal (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885307#comment-16885307
 ] 

Adam Antal commented on YARN-9337:
--

Thanks for the commit and the reviews!

> GPU auto-discovery script runs even when the resource is given by hand
> --
>
> Key: YARN-9337
> URL: https://issues.apache.org/jira/browse/YARN-9337
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Fix For: 3.3.0, 3.2.1, 3.1.3
>
> Attachments: YARN-9337.001.patch, YARN-9337.002.patch, 
> YARN-9337.003.patch, YARN-9337.003.patch, YARN-9337.branch-3.1.001.patch, 
> YARN-9337.branch-3.2.001.patch, YARN-9337_addendum.branch-3.1.001.patch, 
> YARN-9337_addendum.branch-3.2.001.patch
>
>
> The nvidia-smi script is called even when the GPU configs are given by hand 
> (so there's no need for GPU auto-discovery).
> We should avoid calling that script, since it has no effect. (The configs 
> written by the user are not overwritten by the result of the auto-discovery 
> script.)






[jira] [Commented] (YARN-9326) Fair Scheduler configuration defaults are not documented in case of min and maxResources

2019-07-15 Thread Adam Antal (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885305#comment-16885305
 ] 

Adam Antal commented on YARN-9326:
--

Thanks for the commit and the reviewers!

> Fair Scheduler configuration defaults are not documented in case of min and 
> maxResources
> 
>
> Key: YARN-9326
> URL: https://issues.apache.org/jira/browse/YARN-9326
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: docs, documentation, fairscheduler, yarn
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Fix For: 3.3.0, 3.2.1
>
> Attachments: YARN-9326.001.patch, YARN-9326.002.patch, 
> YARN-9326.003.patch, YARN-9326.004.patch, YARN-9326.005.patch
>
>
> The FairScheduler's configuration has the following defaults (from the code: 
> javadoc):
> {noformat}
> In new style resources, any resource that is not specified will be set to 
> missing or 0%, as appropriate. Also, in the new style resources, units are 
> not allowed. Units are assumed from the resource manager's settings for the 
> resources when the value isn't a percentage. The missing parameter is only 
> used in the case of new style resources without percentages. With new style 
> resources with percentages, any missing resources will be assumed to be 100% 
> because percentages are only used with maximum resource limits.
> {noformat}
> This is not documented on the Hadoop YARN site (FairScheduler.html). It is 
> quite intuitive, but it still needs to be documented.






[jira] [Comment Edited] (YARN-9326) Fair Scheduler configuration defaults are not documented in case of min and maxResources

2019-07-15 Thread Adam Antal (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885305#comment-16885305
 ] 

Adam Antal edited comment on YARN-9326 at 7/15/19 2:53 PM:
---

Thanks for the commit and the reviews!


was (Author: adam.antal):
Thanks for the commit and the reviewers!

> Fair Scheduler configuration defaults are not documented in case of min and 
> maxResources
> 
>
> Key: YARN-9326
> URL: https://issues.apache.org/jira/browse/YARN-9326
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: docs, documentation, fairscheduler, yarn
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Fix For: 3.3.0, 3.2.1
>
> Attachments: YARN-9326.001.patch, YARN-9326.002.patch, 
> YARN-9326.003.patch, YARN-9326.004.patch, YARN-9326.005.patch
>
>
> The FairScheduler's configuration has the following defaults (from the code: 
> javadoc):
> {noformat}
> In new style resources, any resource that is not specified will be set to 
> missing or 0%, as appropriate. Also, in the new style resources, units are 
> not allowed. Units are assumed from the resource manager's settings for the 
> resources when the value isn't a percentage. The missing parameter is only 
> used in the case of new style resources without percentages. With new style 
> resources with percentages, any missing resources will be assumed to be 100% 
> because percentages are only used with maximum resource limits.
> {noformat}
> This is not documented on the Hadoop YARN site (FairScheduler.html). It is 
> quite intuitive, but it still needs to be documented.






[jira] [Assigned] (YARN-9679) Regular code cleanup in TestResourcePluginManager

2019-07-15 Thread Adam Antal (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Antal reassigned YARN-9679:


Assignee: Adam Antal

> Regular code cleanup in TestResourcePluginManager
> -
>
> Key: YARN-9679
> URL: https://issues.apache.org/jira/browse/YARN-9679
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Adam Antal
>Priority: Major
>  Labels: newbie
>
> There are several things that could be cleaned up in this class: 
> 1. stubResourcePluginmanager should be private.
> 2. In tearDown, the result of dest.delete() should be checked.
> 3. In class CustomizedResourceHandler, there are several methods where 
> exception declarations are unnecessary.
> 4. Class MyMockNM should be renamed to some more meaningful name.
> 5. There are some dangling javadoc comments, for example: 
> {code:java}
> /*
>* Make sure ResourcePluginManager is initialized during NM start up.
>*/
> {code}
> 6. There are some exceptions unnecessarily declared on test methods but they 
> are never thrown; an example: 
> testLinuxContainerExecutorWithResourcePluginsEnabled
> 7. Assert.assertTrue(false); expressions should be replaced with Assert.fail().
> 8. A handful of usages of Mockito's spy method. This method is not preferred, 
> so we should think about replacing it with mocks, somehow (items 7 and 8 are 
> illustrated in the sketch below).
> The rest can be figured out by whoever takes this jira :) 
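
To make items 7 and 8 concrete, a tiny hypothetical test fragment (the
ResourcePlugin interface and the stubbing are invented for illustration):

{code:java}
import static org.junit.Assert.fail;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

public class CleanupExamples {
  interface ResourcePlugin { String name(); }  // hypothetical collaborator

  void item7() {
    // Before: Assert.assertTrue(false);  -- hides the intent of the failure.
    fail("plugin initialization should have thrown");  // After: explicit.
  }

  void item8() {
    // Prefer a plain mock with stubbed behavior over spying a real object.
    ResourcePlugin plugin = mock(ResourcePlugin.class);
    when(plugin.name()).thenReturn("yarn.io/gpu");
  }
}
{code}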






[jira] [Created] (YARN-9680) Code cleanup in ResourcePluginManager init methods

2019-07-15 Thread Szilard Nemeth (JIRA)
Szilard Nemeth created YARN-9680:


 Summary: Code cleanup in ResourcePluginManager init methods
 Key: YARN-9680
 URL: https://issues.apache.org/jira/browse/YARN-9680
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Szilard Nemeth


We have 2 initializer methods in this class: 
1. 
org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.ResourcePluginManager#initialize
2. 
org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.ResourcePluginManager#initializePluggableDevicePlugins

Both are overly complex and contain nested conditions / loops.
A simple refactor could be performed (along with tests) on these methods, for 
example by splitting them into smaller methods, as a start.
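
As a hedged sketch of the direction (all names and the string-based plugin
check are made up; the real methods deal with plugin objects and device
framework wiring):

{code:java}
import java.util.List;

public class ResourcePluginManagerSketch {
  // Before: one long initialize() full of nested conditions and loops.
  // After: a flat loop delegating to small, intention-revealing helpers.
  void initialize(List<String> enabledPlugins) {
    for (String plugin : enabledPlugins) {
      if (isSupported(plugin)) {
        initializePlugin(plugin);
      }
    }
  }

  private boolean isSupported(String plugin) {
    return "yarn.io/gpu".equals(plugin) || "yarn.io/fpga".equals(plugin);
  }

  private void initializePlugin(String plugin) {
    // One plugin's setup, extracted from the original nested block.
  }
}
{code}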






[jira] [Created] (YARN-9679) Regular code cleanup in TestResourcePluginManager

2019-07-15 Thread Szilard Nemeth (JIRA)
Szilard Nemeth created YARN-9679:


 Summary: Regular code cleanup in TestResourcePluginManager
 Key: YARN-9679
 URL: https://issues.apache.org/jira/browse/YARN-9679
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Szilard Nemeth


There are several things that could be cleaned up in this class: 
1. stubResourcePluginmanager should be private.
2. In tearDown, the result of dest.delete() should be checked.
3. In class CustomizedResourceHandler, there are several methods where 
exception declarations are unnecessary.
4. Class MyMockNM should be renamed to some more meaningful name.
5. There are some dangling javadoc comments, for example: 

{code:java}
/*
   * Make sure ResourcePluginManager is initialized during NM start up.
   */
{code}

6. There are some exceptions unnecessarily declared on test methods but they 
are never thrown, an example: 
testLinuxContainerExecutorWithResourcePluginsEnabled

7. Assert.assertTrue(false); expressions should be replaced with Assert.fail().
8. A handful of usages of Mockito's spy method. This method is not preferred, 
so we should think about replacing it with mocks, somehow.

The rest can be figured out by whoever takes this jira :) 






[jira] [Created] (YARN-9678) TestGpuResourceHandler / TestFpgaResourceHandler should be renamed

2019-07-15 Thread Szilard Nemeth (JIRA)
Szilard Nemeth created YARN-9678:


 Summary: TestGpuResourceHandler / TestFpgaResourceHandler should 
be renamed
 Key: YARN-9678
 URL: https://issues.apache.org/jira/browse/YARN-9678
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Szilard Nemeth


Their respective production classes are GpuResourceHandlerImpl and 
FpgaResourceHandlerImpl, so the "Impl" suffix is missing from the test case 
class names.






[jira] [Created] (YARN-9677) Make FpgaDevice and GpuDevice classes more similar to each other

2019-07-15 Thread Szilard Nemeth (JIRA)
Szilard Nemeth created YARN-9677:


 Summary: Make FpgaDevice and GpuDevice classes more similar to 
each other
 Key: YARN-9677
 URL: https://issues.apache.org/jira/browse/YARN-9677
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Szilard Nemeth


org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.fpga.FpgaResourceAllocator.FpgaDevice
 is an inner class of FpgaResourceAllocator.
It is used not only from its parent class but from other classes as well, so 
the purpose of the inner class is lost; it does not really make sense anymore.

We also have 
org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.GpuDevice
 which is a similar class, but for GPU devices.

What we could do here is make FpgaDevice a standalone class and harmonize the 
packages of these 2 classes, meaning they should be "closer" to each other in 
terms of packaging.






[jira] [Commented] (YARN-9326) Fair Scheduler configuration defaults are not documented in case of min and maxResources

2019-07-15 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885160#comment-16885160
 ] 

Hudson commented on YARN-9326:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16916 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16916/])
YARN-9326. Fair Scheduler configuration defaults are not documented in 
(snemeth: rev 5446308360f57cb98c54c416231788ba9ae332f8)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/FairScheduler.md


> Fair Scheduler configuration defaults are not documented in case of min and 
> maxResources
> 
>
> Key: YARN-9326
> URL: https://issues.apache.org/jira/browse/YARN-9326
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: docs, documentation, fairscheduler, yarn
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Fix For: 3.3.0, 3.2.1
>
> Attachments: YARN-9326.001.patch, YARN-9326.002.patch, 
> YARN-9326.003.patch, YARN-9326.004.patch, YARN-9326.005.patch
>
>
> The FairScheduler's configuration has the following defaults (from the code: 
> javadoc):
> {noformat}
> In new style resources, any resource that is not specified will be set to 
> missing or 0%, as appropriate. Also, in the new style resources, units are 
> not allowed. Units are assumed from the resource manager's settings for the 
> resources when the value isn't a percentage. The missing parameter is only 
> used in the case of new style resources without percentages. With new style 
> resources with percentages, any missing resources will be assumed to be 100% 
> because percentages are only used with maximum resource limits.
> {noformat}
> This is not documented on the Hadoop YARN site (FairScheduler.html). It is 
> quite intuitive, but it still needs to be documented.






[jira] [Commented] (YARN-8199) Logging fileSize of log files under NM Local Dir

2019-07-15 Thread Adam Antal (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885150#comment-16885150
 ] 

Adam Antal commented on YARN-8199:
--

Thanks for the patch [~Prabhu Joseph],

1) When one of the {{getFileStatus}} calls fails, none of the other files get 
processed, since the catch block is outside the loop. That would not be a 
problem for HDFS, but if a filesystem where {{getFileStatus}} is not reliable 
is given as the remote-app-folder, it can produce unexpected messages (S3!). 
It would also make this debugging unreliable: if a file is not mentioned in 
the logs, we could not deduce that the log has actually not exceeded the limit 
given in the configuration (see the sketch after this list).
2) I agree with putting this in a debug block, given the {{getFileStatus}} 
reasoning above. However, I'd recommend mentioning this in the description: it 
only takes effect if the log level is at least debug (even though the 
configuration name itself implicitly contains "debug").
3) Could you please add the container id (and maybe the application id as 
well) when displaying the message about the size of the file? If one app has 
lots of containers, some of them with a syslog exceeding the size, then the 
error message cannot be bound to a container.
4) I'd rename yarn.log-aggregation.debug.filesize.bytes to 
yarn.log-aggregation.debug.filesize, since you already mention the unit in the 
description.

Just out of curiosity: how did you end up with 100Mb as the limit?
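
A minimal sketch of what point 1 suggests, with the catch moved inside the
loop so one failing {{getFileStatus}} does not abort the scan (the method
shape and the reporting style are illustrative, not the patch itself):

{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class PerFileStatusSketch {
  static void report(FileSystem fs, Iterable<Path> logFiles, long threshold) {
    for (Path file : logFiles) {
      try {
        FileStatus status = fs.getFileStatus(file);
        if (status.getLen() >= threshold) {
          System.out.println(file + " size is " + status.getLen() + " bytes");
        }
      } catch (IOException e) {
        // Log and keep going instead of aborting the whole scan.
        System.err.println("Could not stat " + file + ": " + e);
      }
    }
  }
}
{code}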

> Logging fileSize of log files under NM Local Dir
> 
>
> Key: YARN-8199
> URL: https://issues.apache.org/jira/browse/YARN-8199
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: log-aggregation
>Affects Versions: 2.7.3
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>  Labels: supportability
> Attachments: 0001-YARN-8199.patch, 0002-YARN-8199.patch, 
> YARN-8199-003.patch
>
>
> Logging the fileSize of log files like syslog, stderr, stdout under the NM 
> Local Dir by the NodeManager before the cleanup will help to find 
> applications that have written overly verbose logs.






[jira] [Commented] (YARN-8480) Add boolean option for resources

2019-07-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885134#comment-16885134
 ] 

Hadoop QA commented on YARN-8480:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  6s{color} 
| {color:red} YARN-8480 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-8480 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12930979/YARN-8480.002.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/24395/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Add boolean option for resources
> 
>
> Key: YARN-8480
> URL: https://issues.apache.org/jira/browse/YARN-8480
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Daniel Templeton
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: YARN-8480.001.patch, YARN-8480.002.patch
>
>
> Make it possible to define a resource with a boolean value.






[jira] [Commented] (YARN-8480) Add boolean option for resources

2019-07-15 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885128#comment-16885128
 ] 

Szilard Nemeth commented on YARN-8480:
--

Hi [~leftnoteasy], [~templedf], [~sunilg]!
We had a useful discussion here, but the opinions tend to point to node 
attributes as the preferred approach over boolean resources, so I'm fine with 
closing this jira.
What do you think?

Thanks!

> Add boolean option for resources
> 
>
> Key: YARN-8480
> URL: https://issues.apache.org/jira/browse/YARN-8480
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Daniel Templeton
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: YARN-8480.001.patch, YARN-8480.002.patch
>
>
> Make it possible to define a resource with a boolean value.






[jira] [Commented] (YARN-8586) Extract log aggregation related fields and methods from RMAppImpl

2019-07-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885118#comment-16885118
 ] 

Hadoop QA commented on YARN-8586:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  7s{color} 
| {color:red} YARN-8586 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-8586 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12948865/YARN-8586.002.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/24394/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Extract log aggregation related fields and methods from RMAppImpl
> -
>
> Key: YARN-8586
> URL: https://issues.apache.org/jira/browse/YARN-8586
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-8586.001.patch, YARN-8586.002.patch, 
> YARN-8586.002.patch
>
>
> Given that RMAppImpl is already above 2000 lines and very complex, a simple 
> and straightforward step would be to extract all log aggregation related 
> fields and methods into a new class.
> Clients of RMAppImpl would keep accessing the same methods, and RMAppImpl 
> would delegate all those calls to the newly introduced class.
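
A minimal sketch of the proposed shape (class and method names are 
assumptions, not an actual patch):

{code:java}
import org.apache.hadoop.yarn.api.records.LogAggregationStatus;

// Hypothetical helper holding the extracted log aggregation state.
class RMAppLogAggregation {
  private volatile LogAggregationStatus status = LogAggregationStatus.NOT_START;

  LogAggregationStatus getStatus() {
    return status;
  }

  void markSucceeded() {
    status = LogAggregationStatus.SUCCEEDED;
  }
}

// RMAppImpl would keep its public surface and delegate to the helper,
// so existing clients do not need to change.
class RMAppImplSketch {
  private final RMAppLogAggregation logAggregation = new RMAppLogAggregation();

  public LogAggregationStatus getLogAggregationStatus() {
    return logAggregation.getStatus();
  }
}
{code}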






[jira] [Commented] (YARN-8224) LogAggregation status TIME_OUT for absent container misleading

2019-07-15 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885115#comment-16885115
 ] 

Szilard Nemeth commented on YARN-8224:
--

Hi [~pbacsko]!
Can you check this one?

Thanks!

> LogAggregation status TIME_OUT for absent container misleading
> --
>
> Key: YARN-8224
> URL: https://issues.apache.org/jira/browse/YARN-8224
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Affects Versions: 2.7.3
>Reporter: Prabhu Joseph
>Assignee: Peter Bacsko
>Priority: Major
>
> When a container is not launched on the NM and is therefore absent, the RM 
> still tries to get the Log Aggregation Status and reports it as TIME_OUT in 
> the RM UI. 
> {code}
> 2018-04-26 12:47:38,403 WARN  containermanager.ContainerManagerImpl 
> (ContainerManagerImpl.java:handle(1070)) - Event EventType: KILL_CONTAINER 
> sent to absent container container_e361_1524687599273_2110_01_000770
> 2018-04-26 12:49:31,743 WARN  containermanager.ContainerManagerImpl 
> (ContainerManagerImpl.java:handle(1086)) - Event EventType: 
> FINISH_APPLICATION sent to absent application application_1524687599273_2110
> {code}






[jira] [Assigned] (YARN-8586) Extract log aggregation related fields and methods from RMAppImpl

2019-07-15 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth reassigned YARN-8586:


Assignee: Peter Bacsko  (was: Szilard Nemeth)

> Extract log aggregation related fields and methods from RMAppImpl
> -
>
> Key: YARN-8586
> URL: https://issues.apache.org/jira/browse/YARN-8586
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-8586.001.patch, YARN-8586.002.patch, 
> YARN-8586.002.patch
>
>
> Given that RMAppImpl is already above 2000 lines and very complex, a simple 
> and straightforward step would be to extract all log aggregation related 
> fields and methods into a new class.
> Clients of RMAppImpl would keep accessing the same methods, and RMAppImpl 
> would delegate all those calls to the newly introduced class.






[jira] [Assigned] (YARN-8224) LogAggregation status TIME_OUT for absent container misleading

2019-07-15 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth reassigned YARN-8224:


Assignee: Peter Bacsko  (was: Szilard Nemeth)

> LogAggregation status TIME_OUT for absent container misleading
> --
>
> Key: YARN-8224
> URL: https://issues.apache.org/jira/browse/YARN-8224
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Affects Versions: 2.7.3
>Reporter: Prabhu Joseph
>Assignee: Peter Bacsko
>Priority: Major
>
> When a container is not launched on the NM and is therefore absent, the RM 
> still tries to get the Log Aggregation Status and reports it as TIME_OUT in 
> the RM UI. 
> {code}
> 2018-04-26 12:47:38,403 WARN  containermanager.ContainerManagerImpl 
> (ContainerManagerImpl.java:handle(1070)) - Event EventType: KILL_CONTAINER 
> sent to absent container container_e361_1524687599273_2110_01_000770
> 2018-04-26 12:49:31,743 WARN  containermanager.ContainerManagerImpl 
> (ContainerManagerImpl.java:handle(1086)) - Event EventType: 
> FINISH_APPLICATION sent to absent application application_1524687599273_2110
> {code}






[jira] [Commented] (YARN-9451) AggregatedLogsBlock shows wrong NM http port

2019-07-15 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885111#comment-16885111
 ] 

Szilard Nemeth commented on YARN-9451:
--

Hi [~cheersyang]!
The patch looks good to me. Do you have any objections to pushing this in?
[~Prabhu Joseph], [~cheersyang]: if I push this to trunk, to which versions 
shall I backport this patch?

> AggregatedLogsBlock shows wrong NM http port
> 
>
> Key: YARN-9451
> URL: https://issues.apache.org/jira/browse/YARN-9451
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: Screen Shot 2019-06-20 at 7.49.46 PM.png, 
> YARN-9451-001.patch, YARN-9451-002.patch, YARN-9451-003.patch
>
>
> AggregatedLogsBlock shows the wrong NM http port when the aggregated file is 
> not available. It shows [http://yarn-ats-3:45454|http://yarn-ats-3:45454/] - 
> the NM RPC port - instead of the http port.
> {code:java}
> Logs not available for job_1554476304275_0003. Aggregation may not be 
> complete, Check back later or try the nodemanager at yarn-ats-3:45454
> Or see application log at 
> http://yarn-ats-3:45454/node/application/application_1554476304275_0003
> {code}
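
For illustration, a minimal sketch of the distinction (names assumed, not the 
actual fix): the link should be built from the node's HTTP address rather than 
from the NodeId, whose string form carries the RPC port.

{code:java}
import org.apache.hadoop.yarn.api.records.NodeId;

class NodeLinkSketch {
  // NodeId.toString() yields host:rpcPort (e.g. yarn-ats-3:45454),
  // which is the wrong thing to render as a web link.
  static String applicationLogUrl(NodeId nodeId, String nodeHttpAddress,
      String appId) {
    // wrong: "http://" + nodeId + "/node/application/" + appId
    return "http://" + nodeHttpAddress + "/node/application/" + appId;
  }
}
{code}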






[jira] [Commented] (YARN-9326) Fair Scheduler configuration defaults are not documented in case of min and maxResources

2019-07-15 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885104#comment-16885104
 ] 

Szilard Nemeth commented on YARN-9326:
--

Committed to trunk / branch-3.2.
Thanks [~adam.antal] for the contribution and [~wilfreds], [~templedf] for the 
useful reviews!

> Fair Scheduler configuration defaults are not documented in case of min and 
> maxResources
> 
>
> Key: YARN-9326
> URL: https://issues.apache.org/jira/browse/YARN-9326
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: docs, documentation, fairscheduler, yarn
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: YARN-9326.001.patch, YARN-9326.002.patch, 
> YARN-9326.003.patch, YARN-9326.004.patch, YARN-9326.005.patch
>
>
> The FairScheduler's configuration has the following defaults (from the code: 
> javadoc):
> {noformat}
> In new style resources, any resource that is not specified will be set to 
> missing or 0%, as appropriate. Also, in the new style resources, units are 
> not allowed. Units are assumed from the resource manager's settings for the 
> resources when the value isn't a percentage. The missing parameter is only 
> used in the case of new style resources without percentages. With new style 
> resources with percentages, any missing resources will be assumed to be 100% 
> because percentages are only used with maximum resource limits.
> {noformat}
> This is not documented in the hadoop yarn site FairScheduler.html. It is 
> quite intuitive, but it still needs to be documented.
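
For illustration, a hypothetical allocation-file fragment (not from the 
patch) showing the defaults described in the javadoc above:

{noformat}
<queue name="sample">
  <!-- new style without percentages: unspecified resources take the
       "missing" default -->
  <minResources>memory-mb=8192, vcores=4</minResources>
  <!-- new style with percentages: unspecified resources are assumed
       to be 100% -->
  <maxResources>vcores=50%</maxResources>
</queue>
{noformat}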






[jira] [Commented] (YARN-9326) Fair Scheduler configuration defaults are not documented in case of min and maxResources

2019-07-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885103#comment-16885103
 ] 

Hadoop QA commented on YARN-9326:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  7s{color} 
| {color:red} YARN-9326 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-9326 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12961406/YARN-9326.005.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/24393/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Fair Scheduler configuration defaults are not documented in case of min and 
> maxResources
> 
>
> Key: YARN-9326
> URL: https://issues.apache.org/jira/browse/YARN-9326
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: docs, documentation, fairscheduler, yarn
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: YARN-9326.001.patch, YARN-9326.002.patch, 
> YARN-9326.003.patch, YARN-9326.004.patch, YARN-9326.005.patch
>
>
> The FairScheduler's configuration has the following defaults (from the code: 
> javadoc):
> {noformat}
> In new style resources, any resource that is not specified will be set to 
> missing or 0%, as appropriate. Also, in the new style resources, units are 
> not allowed. Units are assumed from the resource manager's settings for the 
> resources when the value isn't a percentage. The missing parameter is only 
> used in the case of new style resources without percentages. With new style 
> resources with percentages, any missing resources will be assumed to be 100% 
> because percentages are only used with maximum resource limits.
> {noformat}
> This is not documented in the hadoop yarn site FairScheduler.html. It is 
> quite intuitive, but it still needs to be documented.






[jira] [Commented] (YARN-9326) Fair Scheduler configuration defaults are not documented in case of min and maxResources

2019-07-15 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885102#comment-16885102
 ] 

Szilard Nemeth commented on YARN-9326:
--

Hi [~adam.antal]!
+1 for the latest patch! Committing this to trunk and branch-3.2.

> Fair Scheduler configuration defaults are not documented in case of min and 
> maxResources
> 
>
> Key: YARN-9326
> URL: https://issues.apache.org/jira/browse/YARN-9326
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: docs, documentation, fairscheduler, yarn
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: YARN-9326.001.patch, YARN-9326.002.patch, 
> YARN-9326.003.patch, YARN-9326.004.patch, YARN-9326.005.patch
>
>
> The FairScheduler's configuration has the following defaults (from the code: 
> javadoc):
> {noformat}
> In new style resources, any resource that is not specified will be set to 
> missing or 0%, as appropriate. Also, in the new style resources, units are 
> not allowed. Units are assumed from the resource manager's settings for the 
> resources when the value isn't a percentage. The missing parameter is only 
> used in the case of new style resources without percentages. With new style 
> resources with percentages, any missing resources will be assumed to be 100% 
> because percentages are only used with maximum resource limits.
> {noformat}
> This is not documented in the hadoop yarn site FairScheduler.html. It is 
> quite intuitive, but it still needs to be documented.






[jira] [Comment Edited] (YARN-9127) Create more tests to verify GpuDeviceInformationParser

2019-07-15 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885083#comment-16885083
 ] 

Szilard Nemeth edited comment on YARN-9127 at 7/15/19 10:54 AM:


Committed to trunk / 3.2 / 3.1!
Thanks [~pbacsko] for your contribution!



was (Author: snemeth):
Committed to trunk!
Thanks [~pbacsko] for your contribution!


> Create more tests to verify GpuDeviceInformationParser
> --
>
> Key: YARN-9127
> URL: https://issues.apache.org/jira/browse/YARN-9127
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 3.3.0, 3.2.1, 3.1.3
>
> Attachments: YARN-9127.001.patch, YARN-9127.002.patch, 
> YARN-9127.003.patch, YARN-9127.004.patch, YARN-9127.005.patch, 
> YARN-9127.branch-3.1.001.patch, YARN-9127.branch-3.2.001.patch
>
>







[jira] [Commented] (YARN-9127) Create more tests to verify GpuDeviceInformationParser

2019-07-15 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885083#comment-16885083
 ] 

Szilard Nemeth commented on YARN-9127:
--

Committed to trunk!
Thanks [~pbacsko] for your contribution!


> Create more tests to verify GpuDeviceInformationParser
> --
>
> Key: YARN-9127
> URL: https://issues.apache.org/jira/browse/YARN-9127
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9127.001.patch, YARN-9127.002.patch, 
> YARN-9127.003.patch, YARN-9127.004.patch, YARN-9127.005.patch, 
> YARN-9127.branch-3.1.001.patch, YARN-9127.branch-3.2.001.patch
>
>







[jira] [Updated] (YARN-9127) Create more tests to verify GpuDeviceInformationParser

2019-07-15 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-9127:
-
Attachment: YARN-9127.branch-3.1.001.patch

> Create more tests to verify GpuDeviceInformationParser
> --
>
> Key: YARN-9127
> URL: https://issues.apache.org/jira/browse/YARN-9127
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9127.001.patch, YARN-9127.002.patch, 
> YARN-9127.003.patch, YARN-9127.004.patch, YARN-9127.005.patch, 
> YARN-9127.branch-3.1.001.patch, YARN-9127.branch-3.2.001.patch
>
>







[jira] [Commented] (YARN-9127) Create more tests to verify GpuDeviceInformationParser

2019-07-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885041#comment-16885041
 ] 

Hadoop QA commented on YARN-9127:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  8s{color} 
| {color:red} YARN-9127 does not apply to branch-3.2. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-9127 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12974711/YARN-9127.branch-3.2.001.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/24392/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Create more tests to verify GpuDeviceInformationParser
> --
>
> Key: YARN-9127
> URL: https://issues.apache.org/jira/browse/YARN-9127
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9127.001.patch, YARN-9127.002.patch, 
> YARN-9127.003.patch, YARN-9127.004.patch, YARN-9127.005.patch, 
> YARN-9127.branch-3.2.001.patch
>
>







[jira] [Updated] (YARN-9127) Create more tests to verify GpuDeviceInformationParser

2019-07-15 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-9127:
-
Attachment: YARN-9127.branch-3.2.001.patch

> Create more tests to verify GpuDeviceInformationParser
> --
>
> Key: YARN-9127
> URL: https://issues.apache.org/jira/browse/YARN-9127
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9127.001.patch, YARN-9127.002.patch, 
> YARN-9127.003.patch, YARN-9127.004.patch, YARN-9127.005.patch, 
> YARN-9127.branch-3.2.001.patch
>
>







[jira] [Assigned] (YARN-9676) Add DEBUG and TRACE level messages to AppLogAggregatorImpl and connected classes

2019-07-15 Thread Adam Antal (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Antal reassigned YARN-9676:


Assignee: Adam Antal

> Add DEBUG and TRACE level messages to AppLogAggregatorImpl and connected 
> classes
> 
>
> Key: YARN-9676
> URL: https://issues.apache.org/jira/browse/YARN-9676
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
>
> During the development of the last items of YARN-6875, it was typically 
> difficult to extract information about the internal state of some log 
> aggregation related classes (e.g. {{AppLogAggregatorImpl}} and 
> {{LogAggregationFileController}}). 
> On my fork I added a few more messages to those classes, such as:
> - displaying the number of log aggregation cycles
> - displaying the names of the files currently considered for log aggregation 
> by containers
> - immediately displaying any exception caught (and sent to the RM in the 
> diagnostic messages) during the log aggregation process.
> Those messages were quite useful for debugging when an issue occurred, but 
> otherwise they flooded the NM log file with output that is usually not 
> needed. I suggest adding (some of) these messages at DEBUG or TRACE level.






[jira] [Created] (YARN-9676) Add DEBUG and TRACE level messages to AppLogAggregatorImpl and connected classes

2019-07-15 Thread Adam Antal (JIRA)
Adam Antal created YARN-9676:


 Summary: Add DEBUG and TRACE level messages to 
AppLogAggregatorImpl and connected classes
 Key: YARN-9676
 URL: https://issues.apache.org/jira/browse/YARN-9676
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Adam Antal


During the development of the last items of YARN-6875, it was typically 
difficult to extract information about the internal state of some log 
aggregation related classes (e.g. {{AppLogAggregatorImpl}} and 
{{LogAggregationFileController}}). 

On my fork I added a few more messages to those classes, such as:
- displaying the number of log aggregation cycles
- displaying the names of the files currently considered for log aggregation by 
containers
- immediately displaying any exception caught (and sent to the RM in the 
diagnostic messages) during the log aggregation process.

Those messages were quite useful for debugging when an issue occurred, but 
otherwise they flooded the NM log file with output that is usually not needed. 
I suggest adding (some of) these messages at DEBUG or TRACE level.
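
A minimal sketch of the kind of guarded messages meant here (class name and 
message wording are assumptions):

{code:java}
import java.util.Set;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical sketch; kept at DEBUG/TRACE so the messages do not
// flood the NM log in normal operation.
class AppLogAggregatorSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(AppLogAggregatorSketch.class);
  private long cycleCount;

  void onAggregationCycle(String appId, Set<String> pendingFiles) {
    cycleCount++;
    LOG.debug("Log aggregation cycle {} for {}", cycleCount, appId);
    LOG.trace("Files considered for upload: {}", pendingFiles);
  }

  void onUploadError(String appId, Exception e) {
    // the exception is also sent to the RM as a diagnostic message;
    // display it immediately here as well
    LOG.debug("Log aggregation error for " + appId, e);
  }
}
{code}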






[jira] [Commented] (YARN-9127) Create more tests to verify GpuDeviceInformationParser

2019-07-15 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885029#comment-16885029
 ] 

Hudson commented on YARN-9127:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16915 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16915/])
YARN-9127. Create more tests to verify GpuDeviceInformationParser. (snemeth: 
rev 18ee1092b471c5337f05809f8f01dae415e51a3a)
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/resources/resource-types/resource-types-error-redefine-fpga-unit.xml
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/dao/gpu/GpuDeviceInformation.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/resources/resource-types/resource-types-error-redefine-gpu-unit.xml
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/dao/gpu/PerGpuMemoryUsage.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/resourceplugin/gpu/GpuDiscoverer.java
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-xml-output
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/dao/gpu/PerGpuDeviceInformation.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/dao/gpu/GpuDeviceInformationParser.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/dao/gpu/TestGpuDeviceInformationParser.java
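
For context, the resource files added above feed parser tests of roughly this 
shape; a hypothetical sketch (the parser's constructor, method and getter 
names are assumptions, not the actual test code):

{code:java}
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import org.apache.hadoop.yarn.server.nodemanager.webapp.dao.gpu.GpuDeviceInformation;
import org.apache.hadoop.yarn.server.nodemanager.webapp.dao.gpu.GpuDeviceInformationParser;
import org.junit.Assert;
import org.junit.Test;

public class TestGpuParserSketch {
  @Test
  public void testParseSampleOutput() throws Exception {
    // read a recorded nvidia-smi XML dump from the test resources
    String xml = new String(Files.readAllBytes(new File(
        "src/test/resources/nvidia-smi-sample-output.xml").toPath()),
        StandardCharsets.UTF_8);
    GpuDeviceInformationParser parser = new GpuDeviceInformationParser();
    GpuDeviceInformation info = parser.parseXml(xml); // assumed signature
    Assert.assertNotNull(info.getGpus()); // assumed getter
  }
}
{code}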


> Create more tests to verify GpuDeviceInformationParser
> --
>
> Key: YARN-9127
> URL: https://issues.apache.org/jira/browse/YARN-9127
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9127.001.patch, YARN-9127.002.patch, 
> YARN-9127.003.patch, YARN-9127.004.patch, YARN-9127.005.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9127) Create more tests to verify GpuDeviceInformationParser

2019-07-15 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885026#comment-16885026
 ] 

Szilard Nemeth commented on YARN-9127:
--

Thanks [~pbacsko]!
+1 for the latest patch, committing to trunk and 3.2 / 3.1 soon.

> Create more tests to verify GpuDeviceInformationParser
> --
>
> Key: YARN-9127
> URL: https://issues.apache.org/jira/browse/YARN-9127
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9127.001.patch, YARN-9127.002.patch, 
> YARN-9127.003.patch, YARN-9127.004.patch, YARN-9127.005.patch
>
>







[jira] [Assigned] (YARN-9675) Expose log aggregation diagnostic messages through RM API

2019-07-15 Thread Adam Antal (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Antal reassigned YARN-9675:


Assignee: Adam Antal

> Expose log aggregation diagnostic messages through RM API
> -
>
> Key: YARN-9675
> URL: https://issues.apache.org/jira/browse/YARN-9675
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api, log-aggregation, resourcemanager
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
>
> The ResourceManager collects the log aggregation status reports from the 
> NodeManagers. Currently these reports are collected, but when the app info 
> API or a similar high-level REST endpoint is called, only an overall status 
> is displayed (RUNNING, RUNNING_WITH_FAILURES, FAILED, etc.). 
> The diagnostic messages are only available through the old RM web UI, so our 
> internal tool currently crawls that page and extracts the log aggregation 
> diagnostic and error messages from the raw HTML. This is not good practice, 
> and a more elegant API call would be preferable. It may be useful for others 
> as well, since log aggregation related failures are usually hard to debug 
> given the lack of trace/debug messages.






[jira] [Created] (YARN-9675) Expose log aggregation diagnostic messages through RM API

2019-07-15 Thread Adam Antal (JIRA)
Adam Antal created YARN-9675:


 Summary: Expose log aggregation diagnostic messages through RM API
 Key: YARN-9675
 URL: https://issues.apache.org/jira/browse/YARN-9675
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api, log-aggregation, resourcemanager
Affects Versions: 3.2.0
Reporter: Adam Antal


The ResourceManager collects the log aggregation status reports from the 
NodeManagers. Currently these reports are collected, but when the app info API 
or a similar high-level REST endpoint is called, only an overall status is 
displayed (RUNNING, RUNNING_WITH_FAILURES, FAILED, etc.). 

The diagnostic messages are only available through the old RM web UI, so our 
internal tool currently crawls that page and extracts the log aggregation 
diagnostic and error messages from the raw HTML. This is not good practice, 
and a more elegant API call would be preferable. It may be useful for others 
as well, since log aggregation related failures are usually hard to debug 
given the lack of trace/debug messages.






[jira] [Commented] (YARN-9360) Do not expose innards of QueueMetrics object into FSLeafQueue#computeMaxAMResource

2019-07-15 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884965#comment-16884965
 ] 

Szilard Nemeth commented on YARN-9360:
--

Hi [~pbacsko]!
Thanks for the contribution, committed to trunk!


> Do not expose innards of QueueMetrics object into 
> FSLeafQueue#computeMaxAMResource
> --
>
> Key: YARN-9360
> URL: https://issues.apache.org/jira/browse/YARN-9360
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9360.001.patch, YARN-9360.002.patch, 
> YARN-9360.003.patch, YARN-9360.003.patch
>
>
> This is a follow-up for YARN-9323, covering required changes as discussed 
> with [~templedf] earlier.
> After YARN-9323, 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue#computeMaxAMResource
>  gets the QueueMetricsForCustomResources object from 
> scheduler.getRootQueueMetrics().
> Instead, we should use a "fill-in" method in QueueMetrics that receives a 
> Resource and fills in custom resource values if they are non-zero.
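
A minimal sketch of the "fill-in" idea (method and field names are 
assumptions, not the committed code):

{code:java}
import java.util.Map;
import org.apache.hadoop.yarn.api.records.Resource;

// Hypothetical sketch: QueueMetrics copies non-zero custom resource
// values into a caller-supplied Resource instead of exposing its
// internal QueueMetricsForCustomResources object.
class QueueMetricsSketch {
  private Map<String, Long> customResourceValues; // assumed internal state

  void fillInCustomResources(Resource target) {
    if (customResourceValues == null) {
      return; // no custom resource types configured
    }
    for (Map.Entry<String, Long> entry : customResourceValues.entrySet()) {
      if (entry.getValue() != 0) {
        target.setResourceValue(entry.getKey(), entry.getValue());
      }
    }
  }
}
{code}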






[jira] [Commented] (YARN-9360) Do not expose innards of QueueMetrics object into FSLeafQueue#computeMaxAMResource

2019-07-15 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884964#comment-16884964
 ] 

Hudson commented on YARN-9360:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16913 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16913/])
YARN-9360. Do not expose innards of QueueMetrics object into (snemeth: rev 
91ce09e7065bacd7b4f09696fff35b789c52bcd7)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/QueueMetrics.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java


> Do not expose innards of QueueMetrics object into 
> FSLeafQueue#computeMaxAMResource
> --
>
> Key: YARN-9360
> URL: https://issues.apache.org/jira/browse/YARN-9360
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9360.001.patch, YARN-9360.002.patch, 
> YARN-9360.003.patch, YARN-9360.003.patch
>
>
> This is a follow-up for YARN-9323, covering required changes as discussed 
> with [~templedf] earlier.
> After YARN-9323, 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue#computeMaxAMResource
>  gets the QueueMetricsForCustomResources object from 
> scheduler.getRootQueueMetrics().
> Instead, we should use a "fill-in" method in QueueMetrics that receives a 
> Resource and fills in custom resource values if they are non-zero.






[jira] [Comment Edited] (YARN-9360) Do not expose innards of QueueMetrics object into FSLeafQueue#computeMaxAMResource

2019-07-15 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884931#comment-16884931
 ] 

Szilard Nemeth edited comment on YARN-9360 at 7/15/19 8:49 AM:
---

Hi [~pbacsko]!
+1 for the latest patch, committing this soon to trunk. Just trunk: this 
patch is a follow-up to YARN-9323, which was not backported to 3.2 / 3.1, so 
we do not need to backport this one either.


was (Author: snemeth):
Hi [~pbacsko]!
+1 for the latest patch, committing this soon!

> Do not expose innards of QueueMetrics object into 
> FSLeafQueue#computeMaxAMResource
> --
>
> Key: YARN-9360
> URL: https://issues.apache.org/jira/browse/YARN-9360
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9360.001.patch, YARN-9360.002.patch, 
> YARN-9360.003.patch, YARN-9360.003.patch
>
>
> This is a follow-up for YARN-9323, covering required changes as discussed 
> with [~templedf] earlier.
> After YARN-9323, 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue#computeMaxAMResource
>  gets the QueueMetricsForCustomResources object from 
> scheduler.getRootQueueMetrics().
> Instead, we should use a "fill-in" method in QueueMetrics that receives a 
> Resource and fills in custom resource values if they are non-zero.






[jira] [Commented] (YARN-9360) Do not expose innards of QueueMetrics object into FSLeafQueue#computeMaxAMResource

2019-07-15 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884931#comment-16884931
 ] 

Szilard Nemeth commented on YARN-9360:
--

Hi [~pbacsko]!
+1 for the latest patch, committing this soon!

> Do not expose innards of QueueMetrics object into 
> FSLeafQueue#computeMaxAMResource
> --
>
> Key: YARN-9360
> URL: https://issues.apache.org/jira/browse/YARN-9360
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9360.001.patch, YARN-9360.002.patch, 
> YARN-9360.003.patch, YARN-9360.003.patch
>
>
> This is a follow-up for YARN-9323, covering required changes as discussed 
> with [~templedf] earlier.
> After YARN-9323, 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue#computeMaxAMResource
>  gets the QueueMetricsForCustomResources object from 
> scheduler.getRootQueueMetrics().
> Instead, we should use a "fill-in" method in QueueMetrics that receives a 
> Resource and fills in custom resource values if they are non-zero.






[jira] [Commented] (YARN-9674) Max AM Resource calculation is wrong

2019-07-15 Thread ANANDA G B (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884880#comment-16884880
 ] 

ANANDA G B commented on YARN-9674:
--

@[~sunilg] Can you please check this?

@[~sunilg] One more issue: after running a job on a particular queue (say 
Queue1 of partition1), it resets the Max AM Resource of the default 
partition's Queue1. Can I raise a separate Jira for it?

> Max AM Resource calculation is wrong
> 
>
> Key: YARN-9674
> URL: https://issues.apache.org/jira/browse/YARN-9674
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.1.2
>Reporter: ANANDA G B
>Priority: Major
> Attachments: RM_Issue.png
>
>
> 'Max AM Resource' is calculated for the default partition using 'Effective 
> Max Capacity', while for other partitions it uses 'Effective Capacity'.
> Which one is the correct implementation?
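
For illustration of how far the two formulas can diverge (numbers invented):

{noformat}
yarn.scheduler.capacity.maximum-am-resource-percent = 0.1
Queue1 Effective Capacity     = 102400 MB
Queue1 Effective Max Capacity = 409600 MB

Using Effective Max Capacity: Max AM Resource = 0.1 * 409600 = 40960 MB
Using Effective Capacity:     Max AM Resource = 0.1 * 102400 = 10240 MB
{noformat}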






[jira] [Updated] (YARN-9674) AM Resource calculation is wrong

2019-07-15 Thread ANANDA G B (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ANANDA G B updated YARN-9674:
-
Summary: AM Resource calculation is wrong  (was: 'Max AM Resource' 
calculated for default partition using 'Effective Max Capacity' and other 
partitions it using 'Effective Capacity')

> AM Resource calculation is wrong
> 
>
> Key: YARN-9674
> URL: https://issues.apache.org/jira/browse/YARN-9674
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.1.2
>Reporter: ANANDA G B
>Priority: Major
> Attachments: RM_Issue.png
>
>
> 'Max AM Resource' is calculated for the default partition using 'Effective 
> Max Capacity', while for other partitions it uses 'Effective Capacity'.
> Which one is the correct implementation?






[jira] [Updated] (YARN-9674) Max AM Resource calculation is wrong

2019-07-15 Thread ANANDA G B (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ANANDA G B updated YARN-9674:
-
Summary: Max AM Resource calculation is wrong  (was: AM Resource 
calculation is wrong)

> Max AM Resource calculation is wrong
> 
>
> Key: YARN-9674
> URL: https://issues.apache.org/jira/browse/YARN-9674
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.1.2
>Reporter: ANANDA G B
>Priority: Major
> Attachments: RM_Issue.png
>
>
> 'Max AM Resource' is calculated for the default partition using 'Effective 
> Max Capacity', while for other partitions it uses 'Effective Capacity'.
> Which one is the correct implementation?


