[jira] [Commented] (YARN-10391) --module-gpu functionality is broken in container-executor

2020-08-17 Thread Eric Badger (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179152#comment-17179152
 ] 

Eric Badger commented on YARN-10391:


Thanks, [~Jim_Brennan]!

> --module-gpu functionality is broken in container-executor
> --
>
> Key: YARN-10391
> URL: https://issues.apache.org/jira/browse/YARN-10391
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Fix For: 3.4.0, 3.3.1
>
> Attachments: YARN-10391.001.patch
>
>
> {{--module-gpu}} doesn't set the {{operation}} variable, and so the 
> {{main()}} function's switch statement on {{operation}} falls through to the 
> default case. This causes it to report a failure, even though it succeeded. 
> {noformat}
>   default:
>     fprintf(ERRORFILE, "Unexpected operation code: %d\n", operation);
>     exit_code = INVALID_COMMAND_PROVIDED;
>     break;
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-4783) Log aggregation failure for application when Nodemanager is restarted

2020-08-17 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179146#comment-17179146
 ] 

Bilwa S T edited comment on YARN-4783 at 8/17/20, 5:45 PM:
---

Thanks [~gandras] for the patch.

This may not work in a multiple-nameservice setup if you are requesting a new 
token. Currently the client handles getting delegation tokens from all 
nameservices, whereas in your patch I see that you only get the token for the 
current nameservice. Correct me if I am wrong.

I am talking about a case where yarn.nodemanager.remote-app-log-dir is set to a 
namespace that is not the default-fs.


was (Author: bilwast):
Thanks [~gandras] for the patch.

This may not work in a multiple-nameservice setup if you are requesting a new 
token. Currently the client handles getting delegation tokens from all 
nameservices, whereas in your patch I see that you only get the token for the 
current nameservice. Correct me if I am wrong.

> Log aggregation failure for application when Nodemanager is restarted 
> --
>
> Key: YARN-4783
> URL: https://issues.apache.org/jira/browse/YARN-4783
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>Assignee: Andras Gyori
>Priority: Major
> Attachments: YARN-4783.001.patch, YARN-4783.002.patch, 
> YARN-4783.003.patch
>
>
> Scenario:
> =
> 1. Start the NM with user dsperf:hadoop
> 2. Configure the linux-execute user as dsperf
> 3. Submit an application as the yarn user
> 4. Wait until a few containers are allocated to NM 1
> 5. Stop NodeManager 1 (wait for expiry)
> 6. Start the NodeManager after the application has completed
> 7. Check whether log aggregation happens for the container logs in the NM 
> local directory
> Expected Output:
> ===
> Log aggregation should be successful
> Actual Output:
> ===
> Log aggregation is not successful






[jira] [Commented] (YARN-4783) Log aggregation failure for application when Nodemanager is restarted

2020-08-17 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179146#comment-17179146
 ] 

Bilwa S T commented on YARN-4783:
-

Thanks [~gandras] for the patch.

This may not work in a multiple-nameservice setup if you are requesting a new 
token. Currently the client handles getting delegation tokens from all 
nameservices, whereas in your patch I see that you only get the token for the 
current nameservice. Correct me if I am wrong.

> Log aggregation failure for application when Nodemanager is restarted 
> --
>
> Key: YARN-4783
> URL: https://issues.apache.org/jira/browse/YARN-4783
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>Assignee: Andras Gyori
>Priority: Major
> Attachments: YARN-4783.001.patch, YARN-4783.002.patch, 
> YARN-4783.003.patch
>
>
> Scenario:
> =
> 1. Start the NM with user dsperf:hadoop
> 2. Configure the linux-execute user as dsperf
> 3. Submit an application as the yarn user
> 4. Wait until a few containers are allocated to NM 1
> 5. Stop NodeManager 1 (wait for expiry)
> 6. Start the NodeManager after the application has completed
> 7. Check whether log aggregation happens for the container logs in the NM 
> local directory
> Expected Output:
> ===
> Log aggregation should be successful
> Actual Output:
> ===
> Log aggregation is not successful






[jira] [Commented] (YARN-10368) Log aggregation reset to NOT_START after RM restart.

2020-08-17 Thread Amithsha (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179135#comment-17179135
 ] 

Amithsha commented on YARN-10368:
-

[~leftnoteasy] FYI

> Log aggregation reset to NOT_START after RM restart.
> 
>
> Key: YARN-10368
> URL: https://issues.apache.org/jira/browse/YARN-10368
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager, resourcemanager, yarn
>Affects Versions: 3.2.1
>Reporter: Anuj
>Priority: Major
> Attachments: Screenshot 2020-07-27 at 2.35.15 PM.png
>
>
> When an attempt is recovered after an RM restart, the log aggregation status 
> is not preserved and is reset to NOT_START.
> From NOT_START it never moves to TIMED_OUT, so the RM app is never cleaned up 
> from memory. As a result the max-completed-app in-memory limit is hit and the 
> RM stops accepting new apps.
> https://issues.apache.org/jira/browse/YARN-7952






[jira] [Commented] (YARN-10368) Log aggregation reset to NOT_START after RM restart.

2020-08-17 Thread Amithsha (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179134#comment-17179134
 ] 

Amithsha commented on YARN-10368:
-

+1 

> Log aggregation reset to NOT_START after RM restart.
> 
>
> Key: YARN-10368
> URL: https://issues.apache.org/jira/browse/YARN-10368
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager, resourcemanager, yarn
>Affects Versions: 3.2.1
>Reporter: Anuj
>Priority: Major
> Attachments: Screenshot 2020-07-27 at 2.35.15 PM.png
>
>
> When an attempt is recovered after an RM restart, the log aggregation status 
> is not preserved and is reset to NOT_START.
> From NOT_START it never moves to TIMED_OUT, so the RM app is never cleaned up 
> from memory. As a result the max-completed-app in-memory limit is hit and the 
> RM stops accepting new apps.
> https://issues.apache.org/jira/browse/YARN-7952






[jira] [Updated] (YARN-10391) --module-gpu functionality is broken in container-executor

2020-08-17 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated YARN-10391:
---
Fix Version/s: 3.3.1
   3.4.0

> --module-gpu functionality is broken in container-executor
> --
>
> Key: YARN-10391
> URL: https://issues.apache.org/jira/browse/YARN-10391
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Fix For: 3.4.0, 3.3.1
>
> Attachments: YARN-10391.001.patch
>
>
> {{--module-gpu}} doesn't set the {{operation}} variable, and so the 
> {{main()}} function's switch statement on {{operation}} falls through to the 
> default case. This causes it to report a failure, even though it succeeded. 
> {noformat}
>   default:
>     fprintf(ERRORFILE, "Unexpected operation code: %d\n", operation);
>     exit_code = INVALID_COMMAND_PROVIDED;
>     break;
> {noformat}






[jira] [Commented] (YARN-10391) --module-gpu functionality is broken in container-executor

2020-08-17 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179119#comment-17179119
 ] 

Jim Brennan commented on YARN-10391:


Thanks [~ebadger]!  I have committed this to trunk and branch-3.3.

> --module-gpu functionality is broken in container-executor
> --
>
> Key: YARN-10391
> URL: https://issues.apache.org/jira/browse/YARN-10391
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-10391.001.patch
>
>
> {{--module-gpu}} doesn't set the {{operation}} variable, and so the 
> {{main()}} function's switch statement on {{operation}} falls through to the 
> default case. This causes it to report a failure, even though it succeeded. 
> {noformat}
>   default:
>     fprintf(ERRORFILE, "Unexpected operation code: %d\n", operation);
>     exit_code = INVALID_COMMAND_PROVIDED;
>     break;
> {noformat}






[jira] [Updated] (YARN-10400) Build the new version of hadoop on Mac os system with bug

2020-08-17 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated YARN-10400:
-
Description: !image-2020-08-18-00-23-48-730.png|width=1141,height=449!  
(was: !image-2020-08-18-00-23-48-730.png!)

> Build the new version of hadoop on Mac os system with bug
> -
>
> Key: YARN-10400
> URL: https://issues.apache.org/jira/browse/YARN-10400
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: zhuqi
>Priority: Major
> Attachments: image-2020-08-18-00-23-48-730.png
>
>
> !image-2020-08-18-00-23-48-730.png|width=1141,height=449!






[jira] [Created] (YARN-10400) Build the new version of hadoop on Mac os system with bug

2020-08-17 Thread zhuqi (Jira)
zhuqi created YARN-10400:


 Summary: Build the new version of hadoop on Mac os system with bug
 Key: YARN-10400
 URL: https://issues.apache.org/jira/browse/YARN-10400
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.3.0
Reporter: zhuqi
 Attachments: image-2020-08-18-00-23-48-730.png

!image-2020-08-18-00-23-48-730.png!






[jira] [Commented] (YARN-10106) Yarn logs CLI filtering by application attempt

2020-08-17 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179041#comment-17179041
 ] 

Hadoop QA commented on YARN-10106:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
39s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
1s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 29m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 11s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  0m 
57s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 19s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client: The patch generated 7 new + 
126 unchanged - 0 fixed = 133 total (was 126) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 52s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 27m 
56s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} 

[jira] [Commented] (YARN-10391) --module-gpu functionality is broken in container-executor

2020-08-17 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179019#comment-17179019
 ] 

Jim Brennan commented on YARN-10391:


The unit test failure is unrelated, and I don't think we need a new test for 
this.
Technically, these --module options shouldn't be doing the work in the 
validate_arguments() function; they should set an operation code and call the 
function that does the work in main(), like the others.

But I am ok with just fixing this bug as you have done here.  The comment makes 
it clear what is going on.

I'm +1 on this.  Will commit later today.
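
For readers following along, the alternative structure described above (argument 
parsing only records an operation code, and the switch in main does the work) 
can be sketched roughly as below. This is an illustration under stated 
assumptions, not the container-executor source and not the committed patch: 
the enum names, the helper functions, and the --hypothetical-op flag are all 
made up for the example. Only --module-gpu and the "Unexpected operation code" 
message come from the issue description.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical operation codes; the real enum in container-executor differs. */
enum operation {
  OP_NONE = 0,
  OP_HYPOTHETICAL,  /* stands in for the existing, non-module operations */
  OP_MODULE_GPU
};

/* Hypothetical exit codes for the sketch. */
enum { EXIT_OK = 0, EXIT_INVALID_COMMAND = 1 };

/* Parse only: record which operation was requested and do no work here. */
static enum operation parse_operation(int argc, const char *const argv[]) {
  assert(argv != NULL);
  if (argc < 2) {
    return OP_NONE;
  }
  if (strcmp(argv[1], "--module-gpu") == 0) {
    return OP_MODULE_GPU;
  }
  if (strcmp(argv[1], "--hypothetical-op") == 0) {  /* made-up flag */
    return OP_HYPOTHETICAL;
  }
  return OP_NONE;
}

/* Stand-ins for the functions that would do the real work. */
static int handle_gpu_request(void) { return EXIT_OK; }
static int handle_hypothetical_op(void) { return EXIT_OK; }

/* Dispatch on the recorded code. Because every option sets a code,
 * the default case is reached only for genuinely unknown commands. */
static int dispatch(enum operation op) {
  switch (op) {
    case OP_MODULE_GPU:
      return handle_gpu_request();
    case OP_HYPOTHETICAL:
      return handle_hypothetical_op();
    default:
      fprintf(stderr, "Unexpected operation code: %d\n", (int)op);
      return EXIT_INVALID_COMMAND;
  }
}
```

A main() built on this would return dispatch(parse_operation(argc, argv)). The 
point of the split is that a successful --module-gpu run can no longer land in 
the default case, because the option always records its own operation code.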

 

> --module-gpu functionality is broken in container-executor
> --
>
> Key: YARN-10391
> URL: https://issues.apache.org/jira/browse/YARN-10391
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-10391.001.patch
>
>
> {{--module-gpu}} doesn't set the {{operation}} variable, and so the 
> {{main()}} function's switch statement on {{operation}} falls through to the 
> default case. This causes it to report a failure, even though it succeeded. 
> {noformat}
>   default:
>     fprintf(ERRORFILE, "Unexpected operation code: %d\n", operation);
>     exit_code = INVALID_COMMAND_PROVIDED;
>     break;
> {noformat}






[jira] [Updated] (YARN-10106) Yarn logs CLI filtering by application attempt

2020-08-17 Thread Hudáky Márton Gyula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hudáky Márton Gyula updated YARN-10106:
---
Attachment: YARN-10106.011.patch

> Yarn logs CLI filtering by application attempt
> --
>
> Key: YARN-10106
> URL: https://issues.apache.org/jira/browse/YARN-10106
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Adam Antal
>Assignee: Hudáky Márton Gyula
>Priority: Trivial
> Attachments: YARN-10106.001.patch, YARN-10106.002.patch, 
> YARN-10106.003.patch, YARN-10106.004.patch, YARN-10106.005.patch, 
> YARN-10106.006.patch, YARN-10106.007.patch, YARN-10106.008.patch, 
> YARN-10106.009.patch, YARN-10106.010.patch, YARN-10106.011.patch
>
>
> {{ContainerLogsRequest}} got a new parameter in YARN-10101, which is the 
> {{applicationAttempt}} - we can use this new parameter in Yarn logs CLI as 
> well to filter by application attempt.


