[jira] [Commented] (YARN-9584) Should put initializeProcessTrees method call before get pid
[ https://issues.apache.org/jira/browse/YARN-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16849356#comment-16849356 ] Wanqiang Ji commented on YARN-9584: --- Hi [~tangzhankun], sure, it will cost much effort to do that. But I can create a new JIRA to do the refactoring. > Should put initializeProcessTrees method call before get pid > > > Key: YARN-9584 > URL: https://issues.apache.org/jira/browse/YARN-9584 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.2.0, 3.0.3, 3.1.2 >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Critical > Attachments: YARN-9584.001.patch > > > The ContainerMonitorImpl#MonitoringThread.run method has a logical error: it > gets the pid first and then initializes the uninitialized process trees. > {code:java} > String pId = ptInfo.getPID(); > // Initialize uninitialized process trees > initializeProcessTrees(entry); > if (pId == null || !isResourceCalculatorAvailable()) { > continue; // processTree cannot be tracked > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
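The ordering bug discussed in YARN-9584 can be illustrated with a minimal, self-contained sketch. The classes below are simplified stand-ins, not the actual ContainersMonitorImpl API: the point is only that reading the pid before initializing the process tree makes a freshly registered container miss one monitoring cycle.

```java
/**
 * Sketch of the YARN-9584 ordering fix (hypothetical simplified classes,
 * not the real NodeManager code).
 */
public class MonitorOrder {
    static class ProcessTreeInfo {
        private String pid; // null until the process tree is initialized
        String getPID() { return pid; }
        void initialize(String realPid) { this.pid = realPid; }
    }

    /** Buggy order: the pid is read before initialization, so a freshly
     *  registered container is skipped for one monitoring cycle. */
    static boolean trackedBuggy(ProcessTreeInfo info, String realPid) {
        String pId = info.getPID();   // still null here
        info.initialize(realPid);     // initialization happens too late
        return pId != null;           // container not tracked this cycle
    }

    /** Fixed order: initialize uninitialized process trees first, then read the pid. */
    static boolean trackedFixed(ProcessTreeInfo info, String realPid) {
        info.initialize(realPid);
        String pId = info.getPID();
        return pId != null;           // container tracked immediately
    }
}
```

With the fixed ordering the same monitoring iteration that initializes the tree can also start tracking it, which is what moving the initializeProcessTrees call earlier achieves.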
[jira] [Commented] (YARN-9583) Failed job which is submitted unknown queue is showed all users
[ https://issues.apache.org/jira/browse/YARN-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16849350#comment-16849350 ] KWON BYUNGCHANG commented on YARN-9583: --- I attached a patch with test code. Please review it. > Failed job which is submitted unknown queue is showed all users > --- > > Key: YARN-9583 > URL: https://issues.apache.org/jira/browse/YARN-9583 > Project: Hadoop YARN > Issue Type: Bug > Components: security >Affects Versions: 3.1.2 >Reporter: KWON BYUNGCHANG >Priority: Minor > Attachments: YARN-9583-screenshot.png, YARN-9583.001.patch, > YARN-9583.002.patch, YARN-9583.003.patch > > > In secure mode, a failed job which was submitted to an unknown queue is shown > to all users. > I attached an RM UI screenshot. > reproduction scenario >1. user foo submits a job to an unknown queue without a view-acl, and the job > fails immediately. >2. user bar can access the job of user foo which previously failed. > According to the comments in QueueACLsManager.java that caused the problem, > this situation can happen when the RM is restarted after deleting a queue. > I think showing the app of a non-existing queue to all users after an RM > restart is the problem. > It becomes a security hole. > I fixed it a little bit. > After the fix, both the owner of the job and the yarn admin can access a job > which was submitted to an unknown queue.
[jira] [Updated] (YARN-9583) Failed job which is submitted unknown queue is showed all users
[ https://issues.apache.org/jira/browse/YARN-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KWON BYUNGCHANG updated YARN-9583: -- Attachment: YARN-9583.003.patch > Failed job which is submitted unknown queue is showed all users > --- > > Key: YARN-9583 > URL: https://issues.apache.org/jira/browse/YARN-9583 > Project: Hadoop YARN > Issue Type: Bug > Components: security >Affects Versions: 3.1.2 >Reporter: KWON BYUNGCHANG >Priority: Minor > Attachments: YARN-9583-screenshot.png, YARN-9583.001.patch, > YARN-9583.002.patch, YARN-9583.003.patch > > > In secure mode, Failed job which is submitted unknown queue is showed all > users. > I attached RM UI screen shot. > reproduction senario >1. user foo submit job to unknown queue without view-acl and job will fail > immediately. >2. user bar can access the job of user foo which previously failed. > According to comments in QueueACLsManager .java that caused the problem, > This situation can happen when RM is restarted after deleting queue. > I think showing app of non existing queue to all users is the problem after > RM start. > It will become a security hole. > I fixed it a little bit. > After fixing it, Both owner of job and admin of yarn can access job which is > submitted unknown queue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
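The access rule the YARN-9583 patch aims for can be sketched as a small helper. This is a hypothetical illustration, not the actual QueueACLsManager API: when an app's queue no longer exists, the check falls back to owner-or-admin instead of allowing everyone.

```java
import java.util.Set;

/** Sketch of the YARN-9583 fallback rule (illustrative names, not the real API). */
public class UnknownQueueAcl {
    /** View-ACL check for an app whose queue no longer exists
     *  (e.g. the RM restarted after the queue was deleted). */
    static boolean canViewApp(String caller, String appOwner, Set<String> yarnAdmins) {
        // Without a queue ACL to consult, only the job owner and YARN admins may view.
        return caller.equals(appOwner) || yarnAdmins.contains(caller);
    }
}
```

Under this rule, in the reproduction scenario above user bar can no longer see user foo's failed app, while foo and the cluster admins still can.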
[jira] [Updated] (YARN-9301) Too many InvalidStateTransitionException with SLS
[ https://issues.apache.org/jira/browse/YARN-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-9301: Attachment: YARN-9301-001.patch > Too many InvalidStateTransitionException with SLS > - > > Key: YARN-9301 > URL: https://issues.apache.org/jira/browse/YARN-9301 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bilwa S T >Priority: Major > Labels: simulator > Attachments: YARN-9301-001.patch > > > Too many InvalidStateTransistionExcetion > {noformat} > 19/02/13 17:44:43 ERROR rmcontainer.RMContainerImpl: Can't handle this event > at current state > org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: > LAUNCHED at RUNNING > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487) > at > org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:483) > at > org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:65) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.containerLaunchedOnNode(SchedulerApplicationAttempt.java:655) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.containerLaunchedOnNode(AbstractYarnScheduler.java:359) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.updateNewContainerInfo(AbstractYarnScheduler.java:1010) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.nodeUpdate(AbstractYarnScheduler.java:1112) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:1295) > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1752) > at > org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:205) > at > org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:60) > at > org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66) > at java.lang.Thread.run(Thread.java:745) > 19/02/13 17:44:43 ERROR rmcontainer.RMContainerImpl: Invalid event LAUNCHED > on container container_1550059705491_0067_01_01 > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9301) Too many InvalidStateTransitionException with SLS
[ https://issues.apache.org/jira/browse/YARN-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16849338#comment-16849338 ] Bilwa S T commented on YARN-9301: - We need to add a CLEANUP event for the AM so that its container is cleaned up, so I added a CLEANUP event in MockAMLauncher which removes it from the NM map. > Too many InvalidStateTransitionException with SLS > - > > Key: YARN-9301 > URL: https://issues.apache.org/jira/browse/YARN-9301 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bilwa S T >Priority: Major > Labels: simulator > Attachments: YARN-9301-001.patch > > > Too many InvalidStateTransitionExceptions > {noformat} > 19/02/13 17:44:43 ERROR rmcontainer.RMContainerImpl: Can't handle this event > at current state > org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: > LAUNCHED at RUNNING > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487) > at > org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:483) > at > org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:65) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.containerLaunchedOnNode(SchedulerApplicationAttempt.java:655) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.containerLaunchedOnNode(AbstractYarnScheduler.java:359) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.updateNewContainerInfo(AbstractYarnScheduler.java:1010) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.nodeUpdate(AbstractYarnScheduler.java:1112) > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:1295) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1752) > at > org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:205) > at > org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:60) > at > org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66) > at java.lang.Thread.run(Thread.java:745) > 19/02/13 17:44:43 ERROR rmcontainer.RMContainerImpl: Invalid event LAUNCHED > on container container_1550059705491_0067_01_01 > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
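The cleanup described in the YARN-9301 comment can be sketched with a toy node-to-containers map. Names here are illustrative, not the MockAMLauncher code: the idea is that removing the AM container from the NM map on a CLEANUP event stops the simulator from replaying LAUNCHED for a container that is already RUNNING.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of the SLS NM container map (hypothetical names). */
public class MockNodeMap {
    private final Map<String, Set<String>> containersByNode = new HashMap<>();

    /** Record a container as launched on a node; a heartbeat would replay
     *  LAUNCHED for every container still present in this map. */
    void launched(String node, String containerId) {
        containersByNode.computeIfAbsent(node, n -> new HashSet<>()).add(containerId);
    }

    /** On an AM CLEANUP event, drop the container so later node heartbeats
     *  no longer deliver LAUNCHED for it (the InvalidStateTransitionException). */
    void cleanup(String node, String containerId) {
        Set<String> running = containersByNode.get(node);
        if (running != null) {
            running.remove(containerId);
            if (running.isEmpty()) {
                containersByNode.remove(node);
            }
        }
    }

    boolean isRunning(String node, String containerId) {
        return containersByNode.getOrDefault(node, Set.of()).contains(containerId);
    }
}
```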
[jira] [Commented] (YARN-9565) RMAppImpl#ranNodes not cleared on FinalTransition
[ https://issues.apache.org/jira/browse/YARN-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16849337#comment-16849337 ] Hadoop QA commented on YARN-9565: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 7s{color} | {color:red} YARN-9565 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-9565 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12969962/YARN-9565-001.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24157/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > RMAppImpl#ranNodes not cleared on FinalTransition > - > > Key: YARN-9565 > URL: https://issues.apache.org/jira/browse/YARN-9565 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-9565-001.patch > > > RMAppImpl holds the list of nodes on which containers ran which is never > cleared. > This could cause memory leak -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9557) Application fails in diskchecker when ReadWriteDiskValidator is configured.
[ https://issues.apache.org/jira/browse/YARN-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16849335#comment-16849335 ] Bilwa S T commented on YARN-9557: - ContainerLocalizer checks a path which is not yet created, so we should check the parent path instead. > Application fails in diskchecker when ReadWriteDiskValidator is configured. > --- > > Key: YARN-9557 > URL: https://issues.apache.org/jira/browse/YARN-9557 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.1.1 > Environment: Configure: > > yarn.nodemanager.disk-validator > read-write > >Reporter: Anuruddh Nayak >Assignee: Bilwa S T >Priority: Critical > Attachments: YARN-9557-001.patch > > > Application fails to execute successfully when ReadWriteDiskValidator is > configured. > {code} > > yarn.nodemanager.disk-validator > read-write > > {code} > {noformat} > Exception thrown while starting Container: > java.io.IOException: org.apache.hadoop.util.DiskChecker$DiskErrorException: > Disk Check failed! > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:200) > at > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:180) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1233) > Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check > failed! 
> at > org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:82) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:255) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:312) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:198) > ... 2 more > Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: > /opt/HA/AN0805/nmlocal/usercache/dsperf/appcache/application_1557736108162_0009/filecache/11 > is not a directory! > at > org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:50) > {noformat} > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
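The "is not a directory!" failure above happens because the validator is pointed at a download destination that the localizer has not created yet. A minimal sketch of the parent-path idea from the comment (an assumption about the fix's shape, not the ReadWriteDiskValidator API):

```java
import java.io.File;
import java.io.IOException;

/** Sketch of validating the parent directory when the destination
 *  does not exist yet (hypothetical helper, not the real validator). */
public class ParentDirCheck {
    static void checkDownloadDir(File destination) throws IOException {
        // The localizer validates the destination before downloading into it,
        // so fall back to the parent directory when it does not exist yet.
        File dir = destination.exists() ? destination : destination.getParentFile();
        if (dir == null || !dir.isDirectory()) {
            throw new IOException(dir + " is not a directory!");
        }
    }
}
```

Checking the parent still catches a genuinely bad disk (the parent is missing or unwritable) without failing every first-time download.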
[jira] [Updated] (YARN-9557) Application fails in diskchecker when ReadWriteDiskValidator is configured.
[ https://issues.apache.org/jira/browse/YARN-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-9557: Attachment: YARN-9557-001.patch > Application fails in diskchecker when ReadWriteDiskValidator is configured. > --- > > Key: YARN-9557 > URL: https://issues.apache.org/jira/browse/YARN-9557 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.1.1 > Environment: Configure: > > yarn.nodemanager.disk-validator > read-write > >Reporter: Anuruddh Nayak >Assignee: Bilwa S T >Priority: Critical > Attachments: YARN-9557-001.patch > > > Application fails to execute successfully when ReadWriteDiskValidator is > configured. > {code} > > yarn.nodemanager.disk-validator > read-write > > {code} > {noformat} > Exception thrown while starting Container: > java.io.IOException: org.apache.hadoop.util.DiskChecker$DiskErrorException: > Disk Check failed! > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:200) > at > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:180) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1233) > Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check > failed! > at > org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:82) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:255) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:312) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:198) > ... 
2 more > Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: > /opt/HA/AN0805/nmlocal/usercache/dsperf/appcache/application_1557736108162_0009/filecache/11 > is not a directory! > at > org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:50) > {noformat} > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9565) RMAppImpl#ranNodes not cleared on FinalTransition
[ https://issues.apache.org/jira/browse/YARN-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-9565: Attachment: YARN-9565-001.patch > RMAppImpl#ranNodes not cleared on FinalTransition > - > > Key: YARN-9565 > URL: https://issues.apache.org/jira/browse/YARN-9565 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-9565-001.patch > > > RMAppImpl holds the list of nodes on which containers ran which is never > cleared. > This could cause memory leak -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
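The YARN-9565 leak pattern is simple enough to sketch: a per-app set of node IDs that only ever grows unless it is cleared when the app reaches its final state. The class below uses hypothetical simplified names, not the real RMAppImpl.

```java
import java.util.HashSet;
import java.util.Set;

/** Simplified model of RMAppImpl#ranNodes (illustrative names only). */
public class AppNodeTracker {
    // Nodes on which this app's containers ran; retained as long as the
    // app object lives unless explicitly cleared.
    private final Set<String> ranNodes = new HashSet<>();

    void containerRanOn(String nodeId) {
        ranNodes.add(nodeId);
    }

    /** The fix: clear the set in FinalTransition so completed apps do not
     *  keep node references alive (the reported memory leak). */
    void onFinalTransition() {
        ranNodes.clear();
    }

    int trackedNodeCount() {
        return ranNodes.size();
    }
}
```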
[jira] [Updated] (YARN-9583) Failed job which is submitted unknown queue is showed all users
[ https://issues.apache.org/jira/browse/YARN-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KWON BYUNGCHANG updated YARN-9583: -- Attachment: YARN-9583.002.patch > Failed job which is submitted unknown queue is showed all users > --- > > Key: YARN-9583 > URL: https://issues.apache.org/jira/browse/YARN-9583 > Project: Hadoop YARN > Issue Type: Bug > Components: security >Affects Versions: 3.1.2 >Reporter: KWON BYUNGCHANG >Priority: Minor > Attachments: YARN-9583-screenshot.png, YARN-9583.001.patch, > YARN-9583.002.patch > > > In secure mode, Failed job which is submitted unknown queue is showed all > users. > I attached RM UI screen shot. > reproduction senario >1. user foo submit job to unknown queue without view-acl and job will fail > immediately. >2. user bar can access the job of user foo which previously failed. > According to comments in QueueACLsManager .java that caused the problem, > This situation can happen when RM is restarted after deleting queue. > I think showing app of non existing queue to all users is the problem after > RM start. > It will become a security hole. > I fixed it a little bit. > After fixing it, Both owner of job and admin of yarn can access job which is > submitted unknown queue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-6260) Findbugs warning in YARN-5355 branch
[ https://issues.apache.org/jira/browse/YARN-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka resolved YARN-6260. - Resolution: Invalid The warning no longer exists. Closing. > Findbugs warning in YARN-5355 branch > > > Key: YARN-6260 > URL: https://issues.apache.org/jira/browse/YARN-6260 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Saxena >Priority: Minor > > {noformat} > Bug type SE_BAD_FIELD > In class > org.apache.hadoop.yarn.server.timelineservice.storage.entity.EntityColumnPrefix > Field > org.apache.hadoop.yarn.server.timelineservice.storage.entity.EntityColumnPrefix.column > Actual type > org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnHelper > In EntityColumnPrefix.java > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9584) Should put initializeProcessTrees method call before get pid
[ https://issues.apache.org/jira/browse/YARN-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16849295#comment-16849295 ] Zhankun Tang commented on YARN-9584: [~jiwq], does it cost much effort to do refactoring for that method? if too much, I'll get this in without test case. > Should put initializeProcessTrees method call before get pid > > > Key: YARN-9584 > URL: https://issues.apache.org/jira/browse/YARN-9584 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.2.0, 3.0.3, 3.1.2 >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Critical > Attachments: YARN-9584.001.patch > > > In ContainerMonitorImpl#MonitoringThread.run method had a logical error that > get pid first then initialize uninitialized process trees. > {code:java} > String pId = ptInfo.getPID(); > // Initialize uninitialized process trees > initializeProcessTrees(entry); > if (pId == null || !isResourceCalculatorAvailable()) { > continue; // processTree cannot be tracked > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9584) Should put initializeProcessTrees method call before get pid
[ https://issues.apache.org/jira/browse/YARN-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wanqiang Ji updated YARN-9584: -- Description: In ContainerMonitorImpl#MonitoringThread.run method had a logical error that get pid first then initialize uninitialized process trees. {code:java} String pId = ptInfo.getPID(); // Initialize uninitialized process trees initializeProcessTrees(entry); if (pId == null || !isResourceCalculatorAvailable()) { continue; // processTree cannot be tracked } {code} was: In ContainerMonitorImpl#MonitoringThread.run method had a logic error that get pid first then initialize uninitialized process trees. {code:java} String pId = ptInfo.getPID(); // Initialize uninitialized process trees initializeProcessTrees(entry); if (pId == null || !isResourceCalculatorAvailable()) { continue; // processTree cannot be tracked } {code} > Should put initializeProcessTrees method call before get pid > > > Key: YARN-9584 > URL: https://issues.apache.org/jira/browse/YARN-9584 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.2.0, 3.0.3, 3.1.2 >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Critical > Attachments: YARN-9584.001.patch > > > In ContainerMonitorImpl#MonitoringThread.run method had a logical error that > get pid first then initialize uninitialized process trees. > {code:java} > String pId = ptInfo.getPID(); > // Initialize uninitialized process trees > initializeProcessTrees(entry); > if (pId == null || !isResourceCalculatorAvailable()) { > continue; // processTree cannot be tracked > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9584) Should put initializeProcessTrees method call before get pid
[ https://issues.apache.org/jira/browse/YARN-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16849245#comment-16849245 ] Wanqiang Ji commented on YARN-9584: --- Thanks to [~tangzhankun] for reviewing this. If we want to add a new UT for this, I think we should refactor the *initializeProcessTrees* method. Any thoughts? > Should put initializeProcessTrees method call before get pid > > > Key: YARN-9584 > URL: https://issues.apache.org/jira/browse/YARN-9584 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.2.0, 3.0.3, 3.1.2 >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Critical > Attachments: YARN-9584.001.patch > > > The ContainerMonitorImpl#MonitoringThread.run method has a logic error: it > gets the pid first and then initializes the uninitialized process trees. > {code:java} > String pId = ptInfo.getPID(); > // Initialize uninitialized process trees > initializeProcessTrees(entry); > if (pId == null || !isResourceCalculatorAvailable()) { > continue; // processTree cannot be tracked > } > {code}
[jira] [Commented] (YARN-9482) DistributedShell job with localization fails in unsecure cluster
[ https://issues.apache.org/jira/browse/YARN-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848944#comment-16848944 ] Prabhu Joseph commented on YARN-9482: - [~sunilg] YARN-9008 added localization in DistributedShell, which is only available from 3.3.0. This fix is on top of that. > DistributedShell job with localization fails in unsecure cluster > > > Key: YARN-9482 > URL: https://issues.apache.org/jira/browse/YARN-9482 > Project: Hadoop YARN > Issue Type: Bug > Components: distributed-shell >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9482-001.patch, YARN-9482-002.patch, > YARN-9482-003.patch, YARN-9482-004.patch > > > A DistributedShell job with localization fails in an unsecure cluster. The > client localizes the input files to the home directory of the job user, whereas > the AM, which runs as the yarn user, reads from its own home directory. > *Command:* > {code} > yarn jar > /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar > -shell_command ls -shell_args / -jar > /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar > -localize_files /tmp/prabhu > {code} > {code} > Exception in thread "Thread-4" java.io.UncheckedIOException: Error during > localization setup > at > org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.lambda$run$0(ApplicationMaster.java:1495) > at > java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382) > at > java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580) > at > org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.run(ApplicationMaster.java:1481) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.io.FileNotFoundException: File does not exist: > hdfs://yarn-ats-1:8020/user/yarn/DistributedShell/application_1554817981283_0003/prabhu > at > 
org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1586) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1579) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1594) > at > org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.lambda$run$0(ApplicationMaster.java:1487) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9482) DistributedShell job with localization fails in unsecure cluster
[ https://issues.apache.org/jira/browse/YARN-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848940#comment-16848940 ] Sunil Govindan commented on YARN-9482: -- Thanks [~giovanni.fumarola], [~Prabhu Joseph] and [~pbacsko]. Could we get this into branch-3.2/3.1 as well? [~Prabhu Joseph], is this good to backport? Thanks. > DistributedShell job with localization fails in unsecure cluster > > > Key: YARN-9482 > URL: https://issues.apache.org/jira/browse/YARN-9482 > Project: Hadoop YARN > Issue Type: Bug > Components: distributed-shell >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9482-001.patch, YARN-9482-002.patch, > YARN-9482-003.patch, YARN-9482-004.patch > > > A DistributedShell job with localization fails in an unsecure cluster. The > client localizes the input files to the home directory of the job user, whereas > the AM, which runs as the yarn user, reads from its own home directory. > *Command:* > {code} > yarn jar > /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar > -shell_command ls -shell_args / -jar > /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar > -localize_files /tmp/prabhu > {code} > {code} > Exception in thread "Thread-4" java.io.UncheckedIOException: Error during > localization setup > at > org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.lambda$run$0(ApplicationMaster.java:1495) > at > java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382) > at > java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580) > at > org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.run(ApplicationMaster.java:1481) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.io.FileNotFoundException: File does not exist: > 
hdfs://yarn-ats-1:8020/user/yarn/DistributedShell/application_1554817981283_0003/prabhu > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1586) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1579) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1594) > at > org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.lambda$run$0(ApplicationMaster.java:1487) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9543) UI2 should handle missing ATSv2 gracefully
[ https://issues.apache.org/jira/browse/YARN-9543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848936#comment-16848936 ] Sunil Govindan commented on YARN-9543: -- Getting this in now. Thanks [~zsiegl] and [~akhilpb] > UI2 should handle missing ATSv2 gracefully > -- > > Key: YARN-9543 > URL: https://issues.apache.org/jira/browse/YARN-9543 > Project: Hadoop YARN > Issue Type: Improvement > Components: ATSv2, yarn-ui-v2 >Affects Versions: 3.1.2 >Reporter: Zoltan Siegl >Assignee: Zoltan Siegl >Priority: Major > Attachments: YARN-9543.001.patch, YARN-9543.002.patch > > > Resource manager UI2 is throwing some console errors and an error page on the > flows page. > Suggested improvements: > * Disable or remove the flows tab if ATSv2 is not available or not installed > * Handle all connection errors to ATSv2 gracefully -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9545) Create healthcheck REST endpoint for ATSv2
[ https://issues.apache.org/jira/browse/YARN-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848935#comment-16848935 ] Sunil Govindan commented on YARN-9545: -- Thanks [~zsiegl], in general the DAO class looks good and forward-looking, and the tests are good, covering the basic cases. I have one small concern on {{isConnectionAlive}}. In this method, a simple boolean is returned, and based on that a decision is taken on whether the Timeline Reader is up or down. Since this method is a new interface, I suggest we return an enum from it and rename it to {{getConnectionStatus}}. With this, we can return statuses like RUNNING, CONNECTION_FAILURE, etc. And in {{TimelineReaderWebServices}}, we can reimplement the {{health}} method with a switch statement covering the two basic scenarios to start with, which will be pretty forward-looking as well. Thoughts? > Create healthcheck REST endpoint for ATSv2 > -- > > Key: YARN-9545 > URL: https://issues.apache.org/jira/browse/YARN-9545 > Project: Hadoop YARN > Issue Type: Improvement > Components: ATSv2 >Affects Versions: 3.1.2 >Reporter: Zoltan Siegl >Assignee: Zoltan Siegl >Priority: Major > Attachments: YARN-9545.001.patch, YARN-9545.002.patch, > YARN-9545.003.patch > > > RM UI2 and CM need a health check URL for the ATSv2 service. > Create a /health rest endpoint > * must respond with 200 \{health: ok} if all ok > * must respond with non 200 if any problem occurs > * could check reader/writer connection -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
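The suggestion in the comment above can be sketched roughly as follows. All names here are hypothetical plain-Java stand-ins, not the actual patch or TimelineReaderWebServices code (the patch under review exposes {{isConnectionAlive}}): an enum-returning {{getConnectionStatus}}, plus a switch in the {{health}} handler mapping the status to the endpoint contract (200 when ok, non-200 otherwise).

```java
public class TimelineHealthSketch {
    // Suggested replacement for a boolean isConnectionAlive(): an enum that
    // can grow more states later (hypothetical names).
    enum ConnectionStatus { RUNNING, CONNECTION_FAILURE }

    static ConnectionStatus getConnectionStatus(boolean readerReachable) {
        return readerReachable ? ConnectionStatus.RUNNING
                               : ConnectionStatus.CONNECTION_FAILURE;
    }

    // Sketch of how a /health handler could map the status to an HTTP code.
    static int healthHttpCode(ConnectionStatus status) {
        switch (status) {
            case RUNNING:
                return 200;   // would carry {health: ok} in the body
            default:
                return 503;   // reader connection failed
        }
    }

    public static void main(String[] args) {
        System.out.println("reachable  -> " + healthHttpCode(getConnectionStatus(true)));
        System.out.println("unreachable -> " + healthHttpCode(getConnectionStatus(false)));
    }
}
```

The benefit of the enum over the boolean is that adding a third state (e.g. a degraded writer) later only extends the switch instead of changing a method signature.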
[jira] [Commented] (YARN-9584) Should put initializeProcessTrees method call before get pid
[ https://issues.apache.org/jira/browse/YARN-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848860#comment-16848860 ] Hadoop QA commented on YARN-9584: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 21s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 55s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 54s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 23s{color} | {color:red} The patch generated 17 ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 70m 43s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e | | JIRA Issue | YARN-9584 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12969880/YARN-9584.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux b511b6af41b6 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a3745c5 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_212 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24154/testReport/ | | asflicense | https://builds.apache.org/job/PreCommit-YARN-Build/24154/artifact/out/patch-asflicense-problems.txt | | Max. process+thread count | 447 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output |
[jira] [Commented] (YARN-9584) Should put initializeProcessTrees method call before get pid
[ https://issues.apache.org/jira/browse/YARN-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848838#comment-16848838 ] Zhankun Tang commented on YARN-9584: [~jiwq], thanks for reporting this. It looks good to me, and it would be much better if we could write a test case for it. :) > Should put initializeProcessTrees method call before get pid > > > Key: YARN-9584 > URL: https://issues.apache.org/jira/browse/YARN-9584 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.2.0, 3.0.3, 3.1.2 >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Critical > Attachments: YARN-9584.001.patch > > > The ContainerMonitorImpl#MonitoringThread.run method has a logic error: it > gets the pid first and then initializes the uninitialized process trees. > {code:java} > String pId = ptInfo.getPID(); > // Initialize uninitialized process trees > initializeProcessTrees(entry); > if (pId == null || !isResourceCalculatorAvailable()) { > continue; // processTree cannot be tracked > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9584) Should put initializeProcessTrees method call before get pid
[ https://issues.apache.org/jira/browse/YARN-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wanqiang Ji updated YARN-9584: -- Affects Version/s: 3.2.0 3.0.3 3.1.2 > Should put initializeProcessTrees method call before get pid > > > Key: YARN-9584 > URL: https://issues.apache.org/jira/browse/YARN-9584 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.2.0, 3.0.3, 3.1.2 >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Critical > Attachments: YARN-9584.001.patch > > > The ContainerMonitorImpl#MonitoringThread.run method has a logic error: it > gets the pid first and then initializes the uninitialized process trees. > {code:java} > String pId = ptInfo.getPID(); > // Initialize uninitialized process trees > initializeProcessTrees(entry); > if (pId == null || !isResourceCalculatorAvailable()) { > continue; // processTree cannot be tracked > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9584) Should put initializeProcessTrees method call before get pid
[ https://issues.apache.org/jira/browse/YARN-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wanqiang Ji updated YARN-9584: -- Attachment: YARN-9584.001.patch > Should put initializeProcessTrees method call before get pid > > > Key: YARN-9584 > URL: https://issues.apache.org/jira/browse/YARN-9584 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Critical > Attachments: YARN-9584.001.patch > > > The ContainerMonitorImpl#MonitoringThread.run method has a logic error: it > gets the pid first and then initializes the uninitialized process trees. > {code:java} > String pId = ptInfo.getPID(); > // Initialize uninitialized process trees > initializeProcessTrees(entry); > if (pId == null || !isResourceCalculatorAvailable()) { > continue; // processTree cannot be tracked > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-9584) Should put initializeProcessTrees method call before get pid
Wanqiang Ji created YARN-9584: - Summary: Should put initializeProcessTrees method call before get pid Key: YARN-9584 URL: https://issues.apache.org/jira/browse/YARN-9584 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Wanqiang Ji Assignee: Wanqiang Ji The ContainerMonitorImpl#MonitoringThread.run method has a logic error: it gets the pid first and then initializes the uninitialized process trees. {code:java} String pId = ptInfo.getPID(); // Initialize uninitialized process trees initializeProcessTrees(entry); if (pId == null || !isResourceCalculatorAvailable()) { continue; // processTree cannot be tracked } {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
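The effect of the ordering bug reported above can be shown with a self-contained sketch. The class below is a hypothetical stand-in, not the real nodemanager code: it models a ProcessTreeInfo whose pid is only populated during initialization. With the reported order (read pid, then initialize), such a tree is wrongly skipped for one monitoring cycle; initializing first avoids that.

```java
public class MonitorOrderSketch {
    // Hypothetical stand-in for ProcessTreeInfo: the pid only becomes
    // available once the tree has been initialized.
    static class ProcessTreeInfo {
        private String pid;
        String getPID() { return pid; }
        void initialize(String discoveredPid) { this.pid = discoveredPid; }
    }

    // Buggy order from the report: read the pid, then initialize.
    static boolean trackedBuggy(ProcessTreeInfo ptInfo) {
        String pId = ptInfo.getPID();   // still null on the first iteration
        ptInfo.initialize("4242");      // too late for this iteration
        return pId != null;             // false: tree skipped for one cycle
    }

    // Fixed order: initialize uninitialized trees first, then read the pid.
    static boolean trackedFixed(ProcessTreeInfo ptInfo) {
        ptInfo.initialize("4242");
        String pId = ptInfo.getPID();
        return pId != null;
    }

    public static void main(String[] args) {
        System.out.println("buggy order tracked on first cycle: " + trackedBuggy(new ProcessTreeInfo()));
        System.out.println("fixed order tracked on first cycle: " + trackedFixed(new ProcessTreeInfo()));
    }
}
```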
[jira] [Updated] (YARN-9583) Failed job which is submitted unknown queue is showed all users
[ https://issues.apache.org/jira/browse/YARN-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KWON BYUNGCHANG updated YARN-9583: -- Description: In secure mode, Failed job which is submitted unknown queue is showed all users. I attached RM UI screen shot. reproduction senario 1. user foo submit job to unknown queue without view-acl and job will fail immediately. 2. user bar can access the job of user foo which previously failed. According to comments in QueueACLsManager .java that caused the problem, This situation can happen when RM is restarted after deleting queue. I think showing app of non existing queue to all users is the problem after RM start. It will become a security hole. I fixed it a little bit. After fixing it, Both owner of job and admin of yarn can access job which is submitted unknown queue. was: In secure mode, Failed job which is submitted unknown queue is showed all users. I attached RM UI screen shot. reproduction senario 1. user foo submit job to unknown queue without view-acl and job will fail immediately. 2. user bar can access the job of user foo which previously failed. According to comments in QueueACLsManager .java that caused the problem, This situation can happen when RM is restarted after deletion queue. I think showing app of non existing queue to all users is the problem after RM start. It will become a security hole. I fixed it a little bit. After fixing it, Both owner of job and admin of yarn can access job which is submitted unknown queue. > Failed job which is submitted unknown queue is showed all users > --- > > Key: YARN-9583 > URL: https://issues.apache.org/jira/browse/YARN-9583 > Project: Hadoop YARN > Issue Type: Bug > Components: security >Affects Versions: 3.1.2 >Reporter: KWON BYUNGCHANG >Priority: Minor > Attachments: YARN-9583-screenshot.png, YARN-9583.001.patch > > > In secure mode, Failed job which is submitted unknown queue is showed all > users. > I attached RM UI screen shot. 
> reproduction senario >1. user foo submit job to unknown queue without view-acl and job will fail > immediately. >2. user bar can access the job of user foo which previously failed. > According to comments in QueueACLsManager .java that caused the problem, > This situation can happen when RM is restarted after deleting queue. > I think showing app of non existing queue to all users is the problem after > RM start. > It will become a security hole. > I fixed it a little bit. > After fixing it, Both owner of job and admin of yarn can access job which is > submitted unknown queue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9583) Failed job which is submitted unknown queue is showed all users
[ https://issues.apache.org/jira/browse/YARN-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KWON BYUNGCHANG updated YARN-9583: -- Description: In secure mode, Failed job which is submitted unknown queue is showed all users. I attached RM UI screen shot. reproduction senario 1. user foo submit job to unknown queue without view-acl and job will fail immediately. 2. user bar can access the job of user foo which previously failed. According to comments in QueueACLsManager .java that caused the problem, This situation can happen when RM is restarted after deletion queue. I think showing app of non existing queue to all users is the problem after RM start. It will become a security hole. I fixed it a little bit. After fixing it, Both owner of job and admin of yarn can access job which is submitted unknown queue. was: Failed job which is submitted unknown queue is showed all users. I attached RM UI screen shot. reproduction senario 1. user foo submit job to unknown queue without view-acl and job will fail immediately. 2. user bar can access the job of user foo which previously failed. According to comments in QueueACLsManager .java that caused the problem, This situation can happen when RM is restarted after deletion queue. I think showing app of non existing queue to all users is the problem after RM start. It will become a security hole. I fixed it a little bit. After fixing it, Both owner of job and admin of yarn can access job which is submitted unknown queue. > Failed job which is submitted unknown queue is showed all users > --- > > Key: YARN-9583 > URL: https://issues.apache.org/jira/browse/YARN-9583 > Project: Hadoop YARN > Issue Type: Bug > Components: security >Affects Versions: 3.1.2 >Reporter: KWON BYUNGCHANG >Priority: Minor > Attachments: YARN-9583-screenshot.png, YARN-9583.001.patch > > > In secure mode, Failed job which is submitted unknown queue is showed all > users. > I attached RM UI screen shot. > reproduction senario >1. 
user foo submit job to unknown queue without view-acl and job will fail > immediately. >2. user bar can access the job of user foo which previously failed. > According to comments in QueueACLsManager .java that caused the problem, > This situation can happen when RM is restarted after deletion queue. > I think showing app of non existing queue to all users is the problem after > RM start. > It will become a security hole. > I fixed it a little bit. > After fixing it, Both owner of job and admin of yarn can access job which is > submitted unknown queue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9583) Failed job which is submitted unknown queue is showed all users
[ https://issues.apache.org/jira/browse/YARN-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KWON BYUNGCHANG updated YARN-9583: -- Description: Failed job which is submitted unknown queue is showed all users. I attached RM UI screen shot. reproduction senario 1. user foo submit job to unknown queue without view-acl and job will fail immediately. 2. user bar can access the job of user foo which previously failed. According to comments in QueueACLsManager .java that caused the problem, This situation can happen when RM is restarted after deletion queue. I think showing app of non existing queue to all users is the problem after RM start. It will become a security hole. I fixed it a little bit. After fixing it, Both owner of job and admin of yarn can access job which is submitted unknown queue. was: Failed job which is submitted unknown queue is showed all users. I attached RM UI screen shot. reproduction senario 1. user foo submit job to unknown queue without view-acl and failed job. 2. user bar can access job of user foo. According to comments in QueueACLsManager .java that caused the problem. This situation can happen when RM is restarted after deletion queue. I think showing app of non existing queue to all users is the problem after RM start. It will become a security hole. I fixed it a little bit. After fixing it, Both owner of job and admin of yarn can access job which is submitted unknown queue. > Failed job which is submitted unknown queue is showed all users > --- > > Key: YARN-9583 > URL: https://issues.apache.org/jira/browse/YARN-9583 > Project: Hadoop YARN > Issue Type: Bug > Components: security >Affects Versions: 3.1.2 >Reporter: KWON BYUNGCHANG >Priority: Minor > Attachments: YARN-9583-screenshot.png, YARN-9583.001.patch > > > Failed job which is submitted unknown queue is showed all users. > I attached RM UI screen shot. > reproduction senario >1. 
user foo submit job to unknown queue without view-acl and job will fail > immediately. >2. user bar can access the job of user foo which previously failed. > According to comments in QueueACLsManager .java that caused the problem, > This situation can happen when RM is restarted after deletion queue. > I think showing app of non existing queue to all users is the problem after > RM start. > It will become a security hole. > I fixed it a little bit. > After fixing it, Both owner of job and admin of yarn can access job which is > submitted unknown queue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
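The fix described in the report above ("both owner of job and admin of yarn can access job which is submitted unknown queue") can be sketched as a plain-Java stand-in. The signature below is hypothetical, not the actual QueueACLsManager code: for an app whose queue no longer exists, fall back to an owner-or-admin check instead of allowing every user.

```java
public class UnknownQueueAclSketch {
    // Hypothetical fallback check for an app whose queue no longer exists
    // (e.g. after an RM restart that removed the queue): instead of showing
    // the app to every user, restrict visibility to the owner and admins.
    static boolean canViewAppInUnknownQueue(String caller, String appOwner,
                                            boolean callerIsYarnAdmin) {
        return callerIsYarnAdmin || caller.equals(appOwner);
    }

    public static void main(String[] args) {
        // The reproduction scenario: foo owns the failed app, bar does not.
        System.out.println("foo (owner): " + canViewAppInUnknownQueue("foo", "foo", false));
        System.out.println("bar (other): " + canViewAppInUnknownQueue("bar", "foo", false));
    }
}
```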
[jira] [Commented] (YARN-9583) Failed job which is submitted unknown queue is showed all users
[ https://issues.apache.org/jira/browse/YARN-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848661#comment-16848661 ] Hadoop QA commented on YARN-9583: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 20s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 3s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 82m 22s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 34s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}130m 29s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e | | JIRA Issue | YARN-9583 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12969834/YARN-9583.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 2c03af2c869d 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 9f056d9 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_212 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24153/testReport/ | | Max. process+thread count | 886 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24153/console | | Powered by | Apache Yetus 0.8.0
[jira] [Commented] (YARN-8693) Add signalToContainer REST API for RMWebServices
[ https://issues.apache.org/jira/browse/YARN-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848650#comment-16848650 ] Hadoop QA commented on YARN-8693: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 29s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 52s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 56s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 81m 16s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 41s{color} | {color:green} hadoop-yarn-server-router in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}139m 16s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e | | JIRA Issue | YARN-8693 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12969832/YARN-8693.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux cf8d5f1588c5 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 9f056d9 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_212 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24152/testReport/ | | Max. process+thread count | 899 (vs. ulimit of