[jira] [Commented] (YARN-10391) --module-gpu functionality is broken in container-executor
[ https://issues.apache.org/jira/browse/YARN-10391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179152#comment-17179152 ]

Eric Badger commented on YARN-10391:
------------------------------------

Thanks, [~Jim_Brennan]!

> --module-gpu functionality is broken in container-executor
> ----------------------------------------------------------
>
>                 Key: YARN-10391
>                 URL: https://issues.apache.org/jira/browse/YARN-10391
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.3.0
>            Reporter: Eric Badger
>            Assignee: Eric Badger
>            Priority: Major
>             Fix For: 3.4.0, 3.3.1
>
>         Attachments: YARN-10391.001.patch
>
>
> {{--module-gpu}} doesn't set the {{operation}} variable, and so the
> {{main()}} function's switch statement on {{operation}} falls through to the
> default case. This causes it to report a failure, even though it succeeded.
> {noformat}
> default:
>   fprintf(ERRORFILE, "Unexpected operation code: %d\n", operation);
>   exit_code = INVALID_COMMAND_PROVIDED;
>   break;
> {noformat}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-4783) Log aggregation failure for application when Nodemanager is restarted
[ https://issues.apache.org/jira/browse/YARN-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179146#comment-17179146 ]

Bilwa S T edited comment on YARN-4783 at 8/17/20, 5:45 PM:
-----------------------------------------------------------

Thanks [~gandras] for the patch. This may not work in a multi-nameservice setup when a new token is requested. Currently the client handles getting delegation tokens from all nameservices, whereas the patch appears to fetch only the current nameservice's token; correct me if I am wrong. I am talking about the case where yarn.nodemanager.remote-app-log-dir is set to a namespace that is not the default fs.

was (Author: bilwast):
Thanks [~gandras] for the patch. This may not work in a multi-nameservice setup when a new token is requested. Currently the client handles getting delegation tokens from all nameservices, whereas the patch appears to fetch only the current nameservice's token; correct me if I am wrong.

> Log aggregation failure for application when Nodemanager is restarted
> ---------------------------------------------------------------------
>
>                 Key: YARN-4783
>                 URL: https://issues.apache.org/jira/browse/YARN-4783
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.7.1
>            Reporter: Surendra Singh Lilhore
>            Assignee: Andras Gyori
>            Priority: Major
>         Attachments: YARN-4783.001.patch, YARN-4783.002.patch,
>                      YARN-4783.003.patch
>
>
> Scenario:
> =========
> 1. Start NM with user dsperf:hadoop
> 2. Configure linux-execute user as dsperf
> 3. Submit an application with the yarn user
> 4. Wait until a few containers are allocated to NM 1
> 5. Stop Nodemanager 1 (wait for expiry)
> 6. Start the node manager after the application is completed
> 7. Check that log aggregation happens for the container logs in the NM local
> directory
> Expected Output:
> ================
> Log aggregation should be successful
> Actual Output:
> ==============
> Log aggregation not successful
[jira] [Commented] (YARN-4783) Log aggregation failure for application when Nodemanager is restarted
[ https://issues.apache.org/jira/browse/YARN-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179146#comment-17179146 ]

Bilwa S T commented on YARN-4783:
---------------------------------

Thanks [~gandras] for the patch. This may not work in a multi-nameservice setup when a new token is requested. Currently the client handles getting delegation tokens from all nameservices, whereas the patch appears to fetch only the current nameservice's token; correct me if I am wrong.

> Log aggregation failure for application when Nodemanager is restarted
> ---------------------------------------------------------------------
>
>                 Key: YARN-4783
>                 URL: https://issues.apache.org/jira/browse/YARN-4783
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.7.1
>            Reporter: Surendra Singh Lilhore
>            Assignee: Andras Gyori
>            Priority: Major
>         Attachments: YARN-4783.001.patch, YARN-4783.002.patch,
>                      YARN-4783.003.patch
>
>
> Scenario:
> =========
> 1. Start NM with user dsperf:hadoop
> 2. Configure linux-execute user as dsperf
> 3. Submit an application with the yarn user
> 4. Wait until a few containers are allocated to NM 1
> 5. Stop Nodemanager 1 (wait for expiry)
> 6. Start the node manager after the application is completed
> 7. Check that log aggregation happens for the container logs in the NM local
> directory
> Expected Output:
> ================
> Log aggregation should be successful
> Actual Output:
> ==============
> Log aggregation not successful
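[Editor's note] The distinction Bilwa raises — collecting delegation tokens for every configured nameservice rather than only the default fs — can be sketched in plain Java. This is an illustrative sketch, not the patch's code: the `Map` stands in for a Hadoop `Configuration`, and only the key names `dfs.nameservices` / `fs.defaultFS` follow real HDFS conventions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class NameserviceTokens {
    /**
     * Returns the filesystem URIs that delegation tokens should be fetched
     * for. Fetching only fs.defaultFS would miss a
     * yarn.nodemanager.remote-app-log-dir that lives on a non-default
     * nameservice.
     */
    public static List<String> tokenTargets(Map<String, String> conf) {
        List<String> targets = new ArrayList<>();
        String nameservices = conf.get("dfs.nameservices");
        if (nameservices == null || nameservices.isEmpty()) {
            // Single-namenode setup: the default fs is the only target.
            targets.add(conf.get("fs.defaultFS"));
        } else {
            // Federated/HA setup: one token target per configured nameservice.
            for (String ns : nameservices.split(",")) {
                targets.add("hdfs://" + ns.trim());
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        Map<String, String> conf =
            Map.of("dfs.nameservices", "ns1,ns2", "fs.defaultFS", "hdfs://ns1");
        System.out.println(tokenTargets(conf));
    }
}
```

Under this sketch, a patch that only consults `fs.defaultFS` would return a single URI even when `dfs.nameservices` lists several, which is the gap described in the comment above.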
[jira] [Commented] (YARN-10368) Log aggregation reset to NOT_START after RM restart.
[ https://issues.apache.org/jira/browse/YARN-10368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179135#comment-17179135 ]

Amithsha commented on YARN-10368:
---------------------------------

[~leftnoteasy] FYI

> Log aggregation reset to NOT_START after RM restart.
> ----------------------------------------------------
>
>                 Key: YARN-10368
>                 URL: https://issues.apache.org/jira/browse/YARN-10368
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager, resourcemanager, yarn
>    Affects Versions: 3.2.1
>            Reporter: Anuj
>            Priority: Major
>         Attachments: Screenshot 2020-07-27 at 2.35.15 PM.png
>
>
> For an attempt recovered after an RM restart, the log aggregation status is
> not preserved and it resets to NOT_START.
> From NOT_START it never moves to TIMED_OUT, so the RM app is never cleaned up
> from memory; the max-completed-app in-memory limit is hit and the RM stops
> accepting new apps.
> https://issues.apache.org/jira/browse/YARN-7952
[jira] [Commented] (YARN-10368) Log aggregation reset to NOT_START after RM restart.
[ https://issues.apache.org/jira/browse/YARN-10368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179134#comment-17179134 ]

Amithsha commented on YARN-10368:
---------------------------------

+1

> Log aggregation reset to NOT_START after RM restart.
> ----------------------------------------------------
>
>                 Key: YARN-10368
>                 URL: https://issues.apache.org/jira/browse/YARN-10368
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager, resourcemanager, yarn
>    Affects Versions: 3.2.1
>            Reporter: Anuj
>            Priority: Major
>         Attachments: Screenshot 2020-07-27 at 2.35.15 PM.png
>
>
> For an attempt recovered after an RM restart, the log aggregation status is
> not preserved and it resets to NOT_START.
> From NOT_START it never moves to TIMED_OUT, so the RM app is never cleaned up
> from memory; the max-completed-app in-memory limit is hit and the RM stops
> accepting new apps.
> https://issues.apache.org/jira/browse/YARN-7952
[jira] [Updated] (YARN-10391) --module-gpu functionality is broken in container-executor
[ https://issues.apache.org/jira/browse/YARN-10391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Brennan updated YARN-10391:
-------------------------------
    Fix Version/s: 3.3.1
                   3.4.0

> --module-gpu functionality is broken in container-executor
> ----------------------------------------------------------
>
>                 Key: YARN-10391
>                 URL: https://issues.apache.org/jira/browse/YARN-10391
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.3.0
>            Reporter: Eric Badger
>            Assignee: Eric Badger
>            Priority: Major
>             Fix For: 3.4.0, 3.3.1
>
>         Attachments: YARN-10391.001.patch
>
>
> {{--module-gpu}} doesn't set the {{operation}} variable, and so the
> {{main()}} function's switch statement on {{operation}} falls through to the
> default case. This causes it to report a failure, even though it succeeded.
> {noformat}
> default:
>   fprintf(ERRORFILE, "Unexpected operation code: %d\n", operation);
>   exit_code = INVALID_COMMAND_PROVIDED;
>   break;
> {noformat}
[jira] [Commented] (YARN-10391) --module-gpu functionality is broken in container-executor
[ https://issues.apache.org/jira/browse/YARN-10391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179119#comment-17179119 ]

Jim Brennan commented on YARN-10391:
------------------------------------

Thanks [~ebadger]! I have committed this to trunk and branch-3.3.

> --module-gpu functionality is broken in container-executor
> ----------------------------------------------------------
>
>                 Key: YARN-10391
>                 URL: https://issues.apache.org/jira/browse/YARN-10391
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.3.0
>            Reporter: Eric Badger
>            Assignee: Eric Badger
>            Priority: Major
>         Attachments: YARN-10391.001.patch
>
>
> {{--module-gpu}} doesn't set the {{operation}} variable, and so the
> {{main()}} function's switch statement on {{operation}} falls through to the
> default case. This causes it to report a failure, even though it succeeded.
> {noformat}
> default:
>   fprintf(ERRORFILE, "Unexpected operation code: %d\n", operation);
>   exit_code = INVALID_COMMAND_PROVIDED;
>   break;
> {noformat}
[jira] [Updated] (YARN-10400) Build the new version of hadoop on Mac os system with bug
[ https://issues.apache.org/jira/browse/YARN-10400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhuqi updated YARN-10400:
-------------------------
    Description: !image-2020-08-18-00-23-48-730.png|width=1141,height=449!
    (was: !image-2020-08-18-00-23-48-730.png!)

> Build the new version of hadoop on Mac os system with bug
> ---------------------------------------------------------
>
>                 Key: YARN-10400
>                 URL: https://issues.apache.org/jira/browse/YARN-10400
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.3.0
>            Reporter: zhuqi
>            Priority: Major
>         Attachments: image-2020-08-18-00-23-48-730.png
>
>
> !image-2020-08-18-00-23-48-730.png|width=1141,height=449!
[jira] [Created] (YARN-10400) Build the new version of hadoop on Mac os system with bug
zhuqi created YARN-10400:
-------------------------

             Summary: Build the new version of hadoop on Mac os system with bug
                 Key: YARN-10400
                 URL: https://issues.apache.org/jira/browse/YARN-10400
             Project: Hadoop YARN
          Issue Type: Bug
    Affects Versions: 3.3.0
            Reporter: zhuqi
         Attachments: image-2020-08-18-00-23-48-730.png

!image-2020-08-18-00-23-48-730.png!
[jira] [Commented] (YARN-10106) Yarn logs CLI filtering by application attempt
[ https://issues.apache.org/jira/browse/YARN-10106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179041#comment-17179041 ]

Hadoop QA commented on YARN-10106:
----------------------------------

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 39s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 1s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 29m 27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s{color} | {color:green} trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green} trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 11s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 0m 57s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s{color} | {color:green} the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s{color} | {color:green} the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 19s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client: The patch generated 7 new + 126 unchanged - 0 fixed = 133 total (was 126) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 52s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 27m 56s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green}
[jira] [Commented] (YARN-10391) --module-gpu functionality is broken in container-executor
[ https://issues.apache.org/jira/browse/YARN-10391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179019#comment-17179019 ]

Jim Brennan commented on YARN-10391:
------------------------------------

The unit test failure is unrelated, and I don't think we need a new test for this. Technically, these --module options shouldn't be doing the work in the validate_arguments() function; they should set an operation code and call the function that does the work in main like the others. But I am ok with just fixing this bug as you have done here. The comment makes it clear what is going on. I'm +1 on this. Will commit later today.

> --module-gpu functionality is broken in container-executor
> ----------------------------------------------------------
>
>                 Key: YARN-10391
>                 URL: https://issues.apache.org/jira/browse/YARN-10391
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.3.0
>            Reporter: Eric Badger
>            Assignee: Eric Badger
>            Priority: Major
>         Attachments: YARN-10391.001.patch
>
>
> {{--module-gpu}} doesn't set the {{operation}} variable, and so the
> {{main()}} function's switch statement on {{operation}} falls through to the
> default case. This causes it to report a failure, even though it succeeded.
> {noformat}
> default:
>   fprintf(ERRORFILE, "Unexpected operation code: %d\n", operation);
>   exit_code = INVALID_COMMAND_PROVIDED;
>   break;
> {noformat}
[jira] [Updated] (YARN-10106) Yarn logs CLI filtering by application attempt
[ https://issues.apache.org/jira/browse/YARN-10106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hudáky Márton Gyula updated YARN-10106:
---------------------------------------
    Attachment: YARN-10106.011.patch

> Yarn logs CLI filtering by application attempt
> ----------------------------------------------
>
>                 Key: YARN-10106
>                 URL: https://issues.apache.org/jira/browse/YARN-10106
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn
>            Reporter: Adam Antal
>            Assignee: Hudáky Márton Gyula
>            Priority: Trivial
>         Attachments: YARN-10106.001.patch, YARN-10106.002.patch,
>                      YARN-10106.003.patch, YARN-10106.004.patch,
>                      YARN-10106.005.patch, YARN-10106.006.patch,
>                      YARN-10106.007.patch, YARN-10106.008.patch,
>                      YARN-10106.009.patch, YARN-10106.010.patch,
>                      YARN-10106.011.patch
>
>
> {{ContainerLogsRequest}} got a new parameter in YARN-10101, which is the
> {{applicationAttempt}} - we can use this new parameter in Yarn logs CLI as
> well to filter by application attempt.
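[Editor's note] What "filter by application attempt" means for the logs CLI can be pictured with the plain-Java sketch below. The LogEntry type and its fields are hypothetical stand-ins, not the actual ContainerLogsRequest API; only the idea — keep a container's logs only if its attempt matches the requested applicationAttempt — comes from the issue.

```java
import java.util.List;
import java.util.stream.Collectors;

public class AttemptLogFilter {

    /** Hypothetical stand-in for one container's log metadata. */
    public static final class LogEntry {
        final String containerId;
        final int appAttempt;

        public LogEntry(String containerId, int appAttempt) {
            this.containerId = containerId;
            this.appAttempt = appAttempt;
        }
    }

    /** Keep only the logs belonging to the requested attempt, mirroring
     *  what a logs-CLI filter on applicationAttempt would do. */
    public static List<LogEntry> byAttempt(List<LogEntry> logs, int attempt) {
        return logs.stream()
                   .filter(e -> e.appAttempt == attempt)
                   .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<LogEntry> logs = List.of(
            new LogEntry("container_1_000001", 1),
            new LogEntry("container_1_000002", 2));
        System.out.println(byAttempt(logs, 2).size());
    }
}
```

Without such a filter, the CLI returns logs for every attempt of the application; with it, a user debugging a single failed attempt sees only that attempt's containers.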