[jira] [Commented] (YARN-9584) Should put initializeProcessTrees method call before getting pid

2019-05-27 Thread Wanqiang Ji (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16849356#comment-16849356
 ] 

Wanqiang Ji commented on YARN-9584:
---

Hi [~tangzhankun], sure it will cost much effort to do that. But I can create a 
new JIRA to do the refactoring.

> Should put initializeProcessTrees method call before getting pid
> 
>
> Key: YARN-9584
> URL: https://issues.apache.org/jira/browse/YARN-9584
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0, 3.0.3, 3.1.2
>Reporter: Wanqiang Ji
>Assignee: Wanqiang Ji
>Priority: Critical
> Attachments: YARN-9584.001.patch
>
>
> The ContainerMonitorImpl#MonitoringThread.run method has a logical error: it 
> gets the pid first and only then initializes the uninitialized process trees. 
> {code:java}
> String pId = ptInfo.getPID();
> // Initialize uninitialized process trees
> initializeProcessTrees(entry);
> if (pId == null || !isResourceCalculatorAvailable()) {
>   continue; // processTree cannot be tracked
> }
> {code}
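For clarity, here is a minimal sketch of the corrected ordering the summary 
describes (same names as the snippet above; the enclosing monitoring loop is 
assumed):
{code:java}
// Initialize uninitialized process trees first, so a freshly registered
// container gets a process tree before its pid is read.
initializeProcessTrees(entry);

String pId = ptInfo.getPID();
if (pId == null || !isResourceCalculatorAvailable()) {
  continue; // processTree cannot be tracked
}
{code}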



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9583) Failed job which is submitted to unknown queue is shown to all users

2019-05-27 Thread KWON BYUNGCHANG (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16849350#comment-16849350
 ] 

KWON BYUNGCHANG commented on YARN-9583:
---

I attached a patch with test code. Please review it.

> Failed job which is submitted to unknown queue is shown to all users
> ---
>
> Key: YARN-9583
> URL: https://issues.apache.org/jira/browse/YARN-9583
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.1.2
>Reporter: KWON BYUNGCHANG
>Priority: Minor
> Attachments: YARN-9583-screenshot.png, YARN-9583.001.patch, 
> YARN-9583.002.patch, YARN-9583.003.patch
>
>
> In secure mode, a failed job which is submitted to an unknown queue is shown 
> to all users.
> I attached an RM UI screenshot.
> Reproduction scenario:
>1. user foo submits a job to an unknown queue without view-acl and the job 
> fails immediately. 
>2. user bar can access the job of user foo which previously failed.
> According to the comments in QueueACLsManager.java that caused the problem,
> this situation can happen when the RM is restarted after deleting a queue.
> I think showing the app of a non-existing queue to all users after RM start 
> is the problem. 
> It will become a security hole.
> I fixed it a little bit.  
> After the fix, both the owner of the job and the admin of YARN can access a 
> job which is submitted to an unknown queue. 
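A small self-contained sketch of the access rule described above (all names are 
hypothetical; the actual patch against QueueACLsManager may differ):
{code:java}
// Hypothetical sketch: when an app's queue no longer exists (e.g. the RM was
// restarted after the queue was deleted), fall back to "owner or YARN admin"
// instead of showing the app to every user.
public final class UnknownQueueViewAcl {
  static boolean canView(String caller, String appOwner, boolean queueExists,
                         boolean callerIsYarnAdmin, boolean queueAclAllows) {
    if (queueExists) {
      return queueAclAllows;  // normal path: defer to the queue's view-acl
    }
    return caller.equals(appOwner) || callerIsYarnAdmin;
  }

  public static void main(String[] args) {
    // user bar must not see the failed app that user foo submitted to a
    // now-unknown queue; foo (the owner) still can.
    System.out.println(canView("bar", "foo", false, false, false)); // false
    System.out.println(canView("foo", "foo", false, false, false)); // true
  }
}
{code}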



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9583) Failed job which is submitted to unknown queue is shown to all users

2019-05-27 Thread KWON BYUNGCHANG (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

KWON BYUNGCHANG updated YARN-9583:
--
Attachment: YARN-9583.003.patch

> Failed job which is submitted to unknown queue is shown to all users
> ---
>
> Key: YARN-9583
> URL: https://issues.apache.org/jira/browse/YARN-9583
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.1.2
>Reporter: KWON BYUNGCHANG
>Priority: Minor
> Attachments: YARN-9583-screenshot.png, YARN-9583.001.patch, 
> YARN-9583.002.patch, YARN-9583.003.patch
>
>
> In secure mode, a failed job which is submitted to an unknown queue is shown 
> to all users.
> I attached an RM UI screenshot.
> Reproduction scenario:
>1. user foo submits a job to an unknown queue without view-acl and the job 
> fails immediately. 
>2. user bar can access the job of user foo which previously failed.
> According to the comments in QueueACLsManager.java that caused the problem,
> this situation can happen when the RM is restarted after deleting a queue.
> I think showing the app of a non-existing queue to all users after RM start 
> is the problem. 
> It will become a security hole.
> I fixed it a little bit.  
> After the fix, both the owner of the job and the admin of YARN can access a 
> job which is submitted to an unknown queue. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9301) Too many InvalidStateTransitionException with SLS

2019-05-27 Thread Bilwa S T (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-9301:

Attachment: YARN-9301-001.patch

> Too many InvalidStateTransitionException with SLS
> -
>
> Key: YARN-9301
> URL: https://issues.apache.org/jira/browse/YARN-9301
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bilwa S T
>Priority: Major
>  Labels: simulator
> Attachments: YARN-9301-001.patch
>
>
> Too many InvalidStateTransitionException
> {noformat}
> 19/02/13 17:44:43 ERROR rmcontainer.RMContainerImpl: Can't handle this event 
> at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: 
> LAUNCHED at RUNNING
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:483)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:65)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.containerLaunchedOnNode(SchedulerApplicationAttempt.java:655)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.containerLaunchedOnNode(AbstractYarnScheduler.java:359)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.updateNewContainerInfo(AbstractYarnScheduler.java:1010)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.nodeUpdate(AbstractYarnScheduler.java:1112)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:1295)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1752)
> at 
> org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:205)
> at 
> org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:60)
> at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
> at java.lang.Thread.run(Thread.java:745)
> 19/02/13 17:44:43 ERROR rmcontainer.RMContainerImpl: Invalid event LAUNCHED 
> on container container_1550059705491_0067_01_01
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9301) Too many InvalidStateTransitionException with SLS

2019-05-27 Thread Bilwa S T (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16849338#comment-16849338
 ] 

Bilwa S T commented on YARN-9301:
-

We need to add a CLEANUP event for the AM so that its container is cleaned up, 
so I added a CLEANUP event in MockAMLauncher which removes the container from 
the NM map.
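A rough sketch of that idea (class and map names are assumptions about the SLS 
MockAMLauncher, not the actual patch):
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: when the AM container finishes, a CLEANUP event removes it from the
// simulated NM's container map, so the node stops reporting it as launched
// (the source of "Invalid event: LAUNCHED at RUNNING").
public final class AmContainerCleanupSketch {
  private final Map<String, String> nmContainers = new ConcurrentHashMap<>();

  void onLaunched(String containerId, String node) {
    nmContainers.put(containerId, node);
  }

  // Handler for the CLEANUP event sent when the AM finishes.
  void onCleanup(String containerId) {
    nmContainers.remove(containerId);
  }
}
{code}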

> Too many InvalidStateTransitionException with SLS
> -
>
> Key: YARN-9301
> URL: https://issues.apache.org/jira/browse/YARN-9301
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bilwa S T
>Priority: Major
>  Labels: simulator
> Attachments: YARN-9301-001.patch
>
>
> Too many InvalidStateTransitionException
> {noformat}
> 19/02/13 17:44:43 ERROR rmcontainer.RMContainerImpl: Can't handle this event 
> at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: 
> LAUNCHED at RUNNING
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:483)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:65)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.containerLaunchedOnNode(SchedulerApplicationAttempt.java:655)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.containerLaunchedOnNode(AbstractYarnScheduler.java:359)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.updateNewContainerInfo(AbstractYarnScheduler.java:1010)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.nodeUpdate(AbstractYarnScheduler.java:1112)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:1295)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1752)
> at 
> org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:205)
> at 
> org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:60)
> at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
> at java.lang.Thread.run(Thread.java:745)
> 19/02/13 17:44:43 ERROR rmcontainer.RMContainerImpl: Invalid event LAUNCHED 
> on container container_1550059705491_0067_01_01
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9565) RMAppImpl#ranNodes not cleared on FinalTransition

2019-05-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16849337#comment-16849337
 ] 

Hadoop QA commented on YARN-9565:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  7s{color} 
| {color:red} YARN-9565 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-9565 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12969962/YARN-9565-001.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/24157/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> RMAppImpl#ranNodes not cleared on FinalTransition
> -
>
> Key: YARN-9565
> URL: https://issues.apache.org/jira/browse/YARN-9565
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9565-001.patch
>
>
> RMAppImpl holds the list of nodes on which its containers ran, and this list 
> is never cleared.
> This could cause a memory leak.
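A small sketch of the kind of cleanup implied (field and method names are 
assumptions based on the description, not the actual patch):
{code:java}
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: clear the set of nodes the app ran on once the app reaches a
// terminal state, so completed applications stop pinning node references.
public final class RanNodesCleanupSketch {
  private final Set<String> ranNodes = ConcurrentHashMap.newKeySet();

  void containerRanOn(String nodeId) {
    ranNodes.add(nodeId);
  }

  // Invoked from the app state machine's FinalTransition.
  void onFinalTransition() {
    ranNodes.clear();
  }
}
{code}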



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9557) Application fails in diskchecker when ReadWriteDiskValidator is configured.

2019-05-27 Thread Bilwa S T (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16849335#comment-16849335
 ] 

Bilwa S T commented on YARN-9557:
-

ContainerLocalizer checks a path which is not yet created, so we should check 
the parent path instead.
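A minimal sketch of that check (the helper is hypothetical; the validator would 
then be pointed at the returned directory):
{code:java}
import java.io.File;

// Sketch: the localization target (e.g. .../filecache/11) does not exist yet
// when the disk check runs, so walk up to the nearest existing parent and
// validate that directory instead.
public final class ParentPathCheckSketch {
  static File nearestExistingParent(File path) {
    File p = path;
    while (p != null && !p.exists()) {
      p = p.getParentFile();
    }
    return p;
  }

  public static void main(String[] args) {
    File target = new File("/tmp/nmlocal/usercache/app/filecache/11");
    System.out.println(nearestExistingParent(target)); // e.g. /tmp or /
  }
}
{code}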

> Application fails in diskchecker when ReadWriteDiskValidator is configured.
> ---
>
> Key: YARN-9557
> URL: https://issues.apache.org/jira/browse/YARN-9557
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.1.1
> Environment: Configure:
> <property>
>  <name>yarn.nodemanager.disk-validator</name>
>  <value>read-write</value>
> </property>
>Reporter: Anuruddh Nayak
>Assignee: Bilwa S T
>Priority: Critical
> Attachments: YARN-9557-001.patch
>
>
> Application fails to execute successfully when ReadWriteDiskValidator is 
> configured.
> {code:xml}
> <property>
>   <name>yarn.nodemanager.disk-validator</name>
>   <value>read-write</value>
> </property>
> {code}
> {noformat}
> Exception thrown while starting Container:
> java.io.IOException: org.apache.hadoop.util.DiskChecker$DiskErrorException: 
> Disk Check failed!
>  at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:200)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:180)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1233)
>  Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check 
> failed!
>  at 
> org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:82)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:255)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:312)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:198)
>  ... 2 more
>  Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: 
> /opt/HA/AN0805/nmlocal/usercache/dsperf/appcache/application_1557736108162_0009/filecache/11
>  is not a directory!
>  at 
> org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:50)
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9557) Application fails in diskchecker when ReadWriteDiskValidator is configured.

2019-05-27 Thread Bilwa S T (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-9557:

Attachment: YARN-9557-001.patch

> Application fails in diskchecker when ReadWriteDiskValidator is configured.
> ---
>
> Key: YARN-9557
> URL: https://issues.apache.org/jira/browse/YARN-9557
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.1.1
> Environment: Configure:
> <property>
>  <name>yarn.nodemanager.disk-validator</name>
>  <value>read-write</value>
> </property>
>Reporter: Anuruddh Nayak
>Assignee: Bilwa S T
>Priority: Critical
> Attachments: YARN-9557-001.patch
>
>
> Application fails to execute successfully when ReadWriteDiskValidator is 
> configured.
> {code:xml}
> <property>
>   <name>yarn.nodemanager.disk-validator</name>
>   <value>read-write</value>
> </property>
> {code}
> {noformat}
> Exception thrown while starting Container:
> java.io.IOException: org.apache.hadoop.util.DiskChecker$DiskErrorException: 
> Disk Check failed!
>  at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:200)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:180)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1233)
>  Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check 
> failed!
>  at 
> org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:82)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:255)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:312)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:198)
>  ... 2 more
>  Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: 
> /opt/HA/AN0805/nmlocal/usercache/dsperf/appcache/application_1557736108162_0009/filecache/11
>  is not a directory!
>  at 
> org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:50)
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9565) RMAppImpl#ranNodes not cleared on FinalTransition

2019-05-27 Thread Bilwa S T (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-9565:

Attachment: YARN-9565-001.patch

> RMAppImpl#ranNodes not cleared on FinalTransition
> -
>
> Key: YARN-9565
> URL: https://issues.apache.org/jira/browse/YARN-9565
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9565-001.patch
>
>
> RMAppImpl holds the list of nodes on which its containers ran, and this list 
> is never cleared.
> This could cause a memory leak.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9583) Failed job which is submitted to unknown queue is shown to all users

2019-05-27 Thread KWON BYUNGCHANG (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

KWON BYUNGCHANG updated YARN-9583:
--
Attachment: YARN-9583.002.patch

> Failed job which is submitted to unknown queue is shown to all users
> ---
>
> Key: YARN-9583
> URL: https://issues.apache.org/jira/browse/YARN-9583
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.1.2
>Reporter: KWON BYUNGCHANG
>Priority: Minor
> Attachments: YARN-9583-screenshot.png, YARN-9583.001.patch, 
> YARN-9583.002.patch
>
>
> In secure mode, a failed job which is submitted to an unknown queue is shown 
> to all users.
> I attached an RM UI screenshot.
> Reproduction scenario:
>1. user foo submits a job to an unknown queue without view-acl and the job 
> fails immediately. 
>2. user bar can access the job of user foo which previously failed.
> According to the comments in QueueACLsManager.java that caused the problem,
> this situation can happen when the RM is restarted after deleting a queue.
> I think showing the app of a non-existing queue to all users after RM start 
> is the problem. 
> It will become a security hole.
> I fixed it a little bit.  
> After the fix, both the owner of the job and the admin of YARN can access a 
> job which is submitted to an unknown queue. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-6260) Findbugs warning in YARN-5355 branch

2019-05-27 Thread Akira Ajisaka (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved YARN-6260.
-
Resolution: Invalid

The warning no longer exists. Closing.

> Findbugs warning in YARN-5355 branch
> 
>
> Key: YARN-6260
> URL: https://issues.apache.org/jira/browse/YARN-6260
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Varun Saxena
>Priority: Minor
>
> {noformat}
> Bug type SE_BAD_FIELD 
> In class 
> org.apache.hadoop.yarn.server.timelineservice.storage.entity.EntityColumnPrefix
> Field 
> org.apache.hadoop.yarn.server.timelineservice.storage.entity.EntityColumnPrefix.column
> Actual type 
> org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnHelper
> In EntityColumnPrefix.java
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9584) Should put initializeProcessTrees method call before getting pid

2019-05-27 Thread Zhankun Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16849295#comment-16849295
 ] 

Zhankun Tang commented on YARN-9584:


[~jiwq], does it cost much effort to refactor that method? If it takes too 
much, I'll get this in without a test case.

> Should put initializeProcessTrees method call before getting pid
> 
>
> Key: YARN-9584
> URL: https://issues.apache.org/jira/browse/YARN-9584
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0, 3.0.3, 3.1.2
>Reporter: Wanqiang Ji
>Assignee: Wanqiang Ji
>Priority: Critical
> Attachments: YARN-9584.001.patch
>
>
> The ContainerMonitorImpl#MonitoringThread.run method has a logical error: it 
> gets the pid first and only then initializes the uninitialized process trees. 
> {code:java}
> String pId = ptInfo.getPID();
> // Initialize uninitialized process trees
> initializeProcessTrees(entry);
> if (pId == null || !isResourceCalculatorAvailable()) {
>   continue; // processTree cannot be tracked
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9584) Should put initializeProcessTrees method call before getting pid

2019-05-27 Thread Wanqiang Ji (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wanqiang Ji updated YARN-9584:
--
Description: 
The ContainerMonitorImpl#MonitoringThread.run method has a logical error: it 
gets the pid first and only then initializes the uninitialized process trees. 
{code:java}
String pId = ptInfo.getPID();

// Initialize uninitialized process trees
initializeProcessTrees(entry);
if (pId == null || !isResourceCalculatorAvailable()) {
  continue; // processTree cannot be tracked
}
{code}

  was:
The ContainerMonitorImpl#MonitoringThread.run method has a logic error: it 
gets the pid first and only then initializes the uninitialized process trees. 
{code:java}
String pId = ptInfo.getPID();

// Initialize uninitialized process trees
initializeProcessTrees(entry);
if (pId == null || !isResourceCalculatorAvailable()) {
  continue; // processTree cannot be tracked
}
{code}




> Should put initializeProcessTrees method call before getting pid
> 
>
> Key: YARN-9584
> URL: https://issues.apache.org/jira/browse/YARN-9584
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0, 3.0.3, 3.1.2
>Reporter: Wanqiang Ji
>Assignee: Wanqiang Ji
>Priority: Critical
> Attachments: YARN-9584.001.patch
>
>
> The ContainerMonitorImpl#MonitoringThread.run method has a logical error: it 
> gets the pid first and only then initializes the uninitialized process trees. 
> {code:java}
> String pId = ptInfo.getPID();
> // Initialize uninitialized process trees
> initializeProcessTrees(entry);
> if (pId == null || !isResourceCalculatorAvailable()) {
>   continue; // processTree cannot be tracked
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9584) Should put initializeProcessTrees method call before getting pid

2019-05-27 Thread Wanqiang Ji (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16849245#comment-16849245
 ] 

Wanqiang Ji commented on YARN-9584:
---

Thanks to [~tangzhankun] for reviewing this. If we want to add a new UT for 
this, I think we should refactor the *initializeProcessTrees* method. Any 
thoughts?

> Should put initializeProcessTrees method call before getting pid
> 
>
> Key: YARN-9584
> URL: https://issues.apache.org/jira/browse/YARN-9584
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0, 3.0.3, 3.1.2
>Reporter: Wanqiang Ji
>Assignee: Wanqiang Ji
>Priority: Critical
> Attachments: YARN-9584.001.patch
>
>
> The ContainerMonitorImpl#MonitoringThread.run method has a logic error: it 
> gets the pid first and only then initializes the uninitialized process trees. 
> {code:java}
> String pId = ptInfo.getPID();
> // Initialize uninitialized process trees
> initializeProcessTrees(entry);
> if (pId == null || !isResourceCalculatorAvailable()) {
>   continue; // processTree cannot be tracked
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9482) DistributedShell job with localization fails in unsecure cluster

2019-05-27 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16848944#comment-16848944
 ] 

Prabhu Joseph commented on YARN-9482:
-

[~sunilg] YARN-9008 added localization in DistributedShell, which is only 
available from 3.3.0. This fix is on top of that.

> DistributedShell job with localization fails in unsecure cluster
> 
>
> Key: YARN-9482
> URL: https://issues.apache.org/jira/browse/YARN-9482
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9482-001.patch, YARN-9482-002.patch, 
> YARN-9482-003.patch, YARN-9482-004.patch
>
>
> DistributedShell job with localization fails in an unsecure cluster. The 
> client localizes the input files to the home directory of the job user, 
> whereas the AM runs as the yarn user and reads from its own home directory.
> *Command:*
> {code}
> yarn jar 
> /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar
>  -shell_command ls  -shell_args / -jar  
> /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar
>  -localize_files /tmp/prabhu
> {code}
> {code}
> Exception in thread "Thread-4" java.io.UncheckedIOException: Error during 
> localization setup
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.lambda$run$0(ApplicationMaster.java:1495)
>   at 
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
>   at 
> java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.run(ApplicationMaster.java:1481)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: File does not exist: 
> hdfs://yarn-ats-1:8020/user/yarn/DistributedShell/application_1554817981283_0003/prabhu
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1586)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1579)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1594)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.lambda$run$0(ApplicationMaster.java:1487)
> {code}
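To make the mismatch concrete, a tiny illustrative sketch (paths mirror the 
stack trace above; the helper is hypothetical, not the actual patch):
{code:java}
// Sketch: the client uploads under the *submitting* user's home directory, so
// the AM must resolve that same base instead of the home directory of the
// user it runs as (yarn in an unsecure cluster).
public final class LocalizePathSketch {
  static String localizationBase(String user, String appId) {
    return "/user/" + user + "/DistributedShell/" + appId;
  }

  public static void main(String[] args) {
    String appId = "application_1554817981283_0003";
    // Buggy behaviour: resolved against the AM's own user.
    System.out.println(localizationBase("yarn", appId));
    // Expected: resolved against the submitting user.
    System.out.println(localizationBase("prabhu", appId));
  }
}
{code}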



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9482) DistributedShell job with localization fails in unsecure cluster

2019-05-27 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16848940#comment-16848940
 ] 

Sunil Govindan commented on YARN-9482:
--

Thanks [~giovanni.fumarola], [~Prabhu Joseph] and [~pbacsko].

Could we get this into branch-3.2/3.1 as well? [~Prabhu Joseph], is this good to 
backport? Thanks.

> DistributedShell job with localization fails in unsecure cluster
> 
>
> Key: YARN-9482
> URL: https://issues.apache.org/jira/browse/YARN-9482
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9482-001.patch, YARN-9482-002.patch, 
> YARN-9482-003.patch, YARN-9482-004.patch
>
>
> DistributedShell job with localization fails in an unsecure cluster. The 
> client localizes the input files to the home directory of the job user, 
> whereas the AM runs as the yarn user and reads from its own home directory.
> *Command:*
> {code}
> yarn jar 
> /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar
>  -shell_command ls  -shell_args / -jar  
> /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar
>  -localize_files /tmp/prabhu
> {code}
> {code}
> Exception in thread "Thread-4" java.io.UncheckedIOException: Error during 
> localization setup
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.lambda$run$0(ApplicationMaster.java:1495)
>   at 
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
>   at 
> java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.run(ApplicationMaster.java:1481)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: File does not exist: 
> hdfs://yarn-ats-1:8020/user/yarn/DistributedShell/application_1554817981283_0003/prabhu
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1586)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1579)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1594)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.lambda$run$0(ApplicationMaster.java:1487)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9543) UI2 should handle missing ATSv2 gracefully

2019-05-27 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16848936#comment-16848936
 ] 

Sunil Govindan commented on YARN-9543:
--

Getting this in now. Thanks [~zsiegl] and [~akhilpb]

 

> UI2 should handle missing ATSv2 gracefully
> --
>
> Key: YARN-9543
> URL: https://issues.apache.org/jira/browse/YARN-9543
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: ATSv2, yarn-ui-v2
>Affects Versions: 3.1.2
>Reporter: Zoltan Siegl
>Assignee: Zoltan Siegl
>Priority: Major
> Attachments: YARN-9543.001.patch, YARN-9543.002.patch
>
>
> Resource manager UI2 is throwing some console errors and an error page on the 
> flows page.
> Suggested improvements:
>  * Disable or remove the flows tab if ATSv2 is not available or not installed
>  * Handle all connection errors to ATSv2 gracefully



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9545) Create healthcheck REST endpoint for ATSv2

2019-05-27 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16848935#comment-16848935
 ] 

Sunil Govindan commented on YARN-9545:
--

Thanks [~zsiegl], in general the DAO class looks good and forward-looking, and 
the tests are also good and cover the basic cases. I have one small concern 
about {{isConnectionAlive}}: it returns a simple bool, and based on that a 
decision is taken on whether to consider the Timeline Reader up or down.

Since this method is a new interface, I suggest we return an ENUM from it and 
rename it to {{getConnectionStatus}}. With this, we can return statuses like 
RUNNING, CONNECTION_FAILURE, etc. And in {{TimelineReaderWebServices}}, we can 
reimplement the {{health}} method with a switch-case statement that covers the 
2 basic scenarios to start with, which will be pretty forward-looking as well.

Thoughts?
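A minimal sketch of that suggestion (the enum values and method names follow 
the comment; the surrounding types are stand-ins, not the real ATSv2 classes):
{code:java}
// Sketch: an enum-returning getConnectionStatus() plus a switch-based health
// check that maps each status to an HTTP response code.
public final class TimelineHealthSketch {
  enum ConnectionStatus { RUNNING, CONNECTION_FAILURE }

  interface TimelineReader {
    ConnectionStatus getConnectionStatus();
  }

  // Returns 200 when healthy, 503 otherwise (the /health endpoint contract).
  static int health(TimelineReader reader) {
    switch (reader.getConnectionStatus()) {
      case RUNNING:
        return 200;
      case CONNECTION_FAILURE:
      default:
        return 503;
    }
  }

  public static void main(String[] args) {
    System.out.println(health(() -> ConnectionStatus.RUNNING)); // 200
  }
}
{code}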

> Create healthcheck REST endpoint for ATSv2
> --
>
> Key: YARN-9545
> URL: https://issues.apache.org/jira/browse/YARN-9545
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: ATSv2
>Affects Versions: 3.1.2
>Reporter: Zoltan Siegl
>Assignee: Zoltan Siegl
>Priority: Major
> Attachments: YARN-9545.001.patch, YARN-9545.002.patch, 
> YARN-9545.003.patch
>
>
> RM UI2 and CM need a health check URL for the ATSv2 service.
> Create a /health REST endpoint that:
>  * must respond with 200 \{health: ok} if all is ok
>  * must respond with non-200 if any problem occurs
>  * could check the reader/writer connection



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9584) Should put initializeProcessTrees method call before getting pid

2019-05-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16848860#comment-16848860
 ] 

Hadoop QA commented on YARN-9584:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 21s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 55s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 
54s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
23s{color} | {color:red} The patch generated 17 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 70m 43s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9584 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12969880/YARN-9584.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux b511b6af41b6 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a3745c5 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24154/testReport/ |
| asflicense | 
https://builds.apache.org/job/PreCommit-YARN-Build/24154/artifact/out/patch-asflicense-problems.txt
 |
| Max. process+thread count | 447 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/24154/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |

[jira] [Commented] (YARN-9584) Should put initializeProcessTrees method call before getting pid

2019-05-27 Thread Zhankun Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16848838#comment-16848838
 ] 

Zhankun Tang commented on YARN-9584:


[~jiwq], thanks for reporting this. It looks good to me, and it would be much 
better if we could write a test case for it. :)

> Should put initializeProcessTrees method call before getting pid
> 
>
> Key: YARN-9584
> URL: https://issues.apache.org/jira/browse/YARN-9584
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0, 3.0.3, 3.1.2
>Reporter: Wanqiang Ji
>Assignee: Wanqiang Ji
>Priority: Critical
> Attachments: YARN-9584.001.patch
>
>
> The ContainerMonitorImpl#MonitoringThread.run method has a logic error: it 
> gets the pid first and only then initializes the uninitialized process trees. 
> {code:java}
> String pId = ptInfo.getPID();
> // Initialize uninitialized process trees
> initializeProcessTrees(entry);
> if (pId == null || !isResourceCalculatorAvailable()) {
>   continue; // processTree cannot be tracked
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9584) Should put initializeProcessTrees method call before getting pid

2019-05-27 Thread Wanqiang Ji (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wanqiang Ji updated YARN-9584:
--
Affects Version/s: 3.2.0
   3.0.3
   3.1.2

> Should put initializeProcessTrees method call before getting pid
> 
>
> Key: YARN-9584
> URL: https://issues.apache.org/jira/browse/YARN-9584
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0, 3.0.3, 3.1.2
>Reporter: Wanqiang Ji
>Assignee: Wanqiang Ji
>Priority: Critical
> Attachments: YARN-9584.001.patch
>
>
> The ContainerMonitorImpl#MonitoringThread.run method has a logic error: it 
> gets the pid first and only then initializes the uninitialized process trees. 
> {code:java}
> String pId = ptInfo.getPID();
> // Initialize uninitialized process trees
> initializeProcessTrees(entry);
> if (pId == null || !isResourceCalculatorAvailable()) {
>   continue; // processTree cannot be tracked
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9584) Should put initializeProcessTrees method call before getting pid

2019-05-27 Thread Wanqiang Ji (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wanqiang Ji updated YARN-9584:
--
Attachment: YARN-9584.001.patch

> Should put initializeProcessTrees method call before getting pid
> 
>
> Key: YARN-9584
> URL: https://issues.apache.org/jira/browse/YARN-9584
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Wanqiang Ji
>Assignee: Wanqiang Ji
>Priority: Critical
> Attachments: YARN-9584.001.patch
>
>
> The ContainerMonitorImpl#MonitoringThread.run method has a logic error: it 
> gets the pid first and only then initializes the uninitialized process trees. 
> {code:java}
> String pId = ptInfo.getPID();
> // Initialize uninitialized process trees
> initializeProcessTrees(entry);
> if (pId == null || !isResourceCalculatorAvailable()) {
>   continue; // processTree cannot be tracked
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9584) Should put initializeProcessTrees method call before getting pid

2019-05-27 Thread Wanqiang Ji (JIRA)
Wanqiang Ji created YARN-9584:
-

 Summary: Should put initializeProcessTrees method call before getting pid
 Key: YARN-9584
 URL: https://issues.apache.org/jira/browse/YARN-9584
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Wanqiang Ji
Assignee: Wanqiang Ji


The ContainerMonitorImpl#MonitoringThread.run method has a logic error: it 
gets the pid first and only then initializes the uninitialized process trees. 
{code:java}
String pId = ptInfo.getPID();

// Initialize uninitialized process trees
initializeProcessTrees(entry);
if (pId == null || !isResourceCalculatorAvailable()) {
  continue; // processTree cannot be tracked
}
{code}





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9583) Failed job which is submitted to unknown queue is shown to all users

2019-05-27 Thread KWON BYUNGCHANG (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

KWON BYUNGCHANG updated YARN-9583:
--
Description: 
In secure mode, a failed job which is submitted to an unknown queue is shown to 
all users.
I attached an RM UI screenshot.

Reproduction scenario:
   1. user foo submits a job to an unknown queue without view-acl and the job 
fails immediately. 
   2. user bar can access the job of user foo which previously failed.

According to the comments in QueueACLsManager.java that caused the problem,
this situation can happen when the RM is restarted after deleting a queue.

I think showing the app of a non-existing queue to all users after RM start is 
the problem. 
It will become a security hole.

I fixed it a little bit.  
After the fix, both the owner of the job and the admin of YARN can access a job 
which is submitted to an unknown queue. 


  was:
In secure mode, a failed job which is submitted to an unknown queue is shown to 
all users.
I attached an RM UI screenshot.

Reproduction scenario:
   1. user foo submits a job to an unknown queue without view-acl and the job 
fails immediately. 
   2. user bar can access the job of user foo which previously failed.

According to the comments in QueueACLsManager.java that caused the problem,
this situation can happen when the RM is restarted after queue deletion.

I think showing the app of a non-existing queue to all users after RM start is 
the problem. 
It will become a security hole.

I fixed it a little bit.  
After the fix, both the owner of the job and the admin of YARN can access a job 
which is submitted to an unknown queue. 



> Failed job which is submitted to unknown queue is shown to all users
> ---
>
> Key: YARN-9583
> URL: https://issues.apache.org/jira/browse/YARN-9583
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.1.2
>Reporter: KWON BYUNGCHANG
>Priority: Minor
> Attachments: YARN-9583-screenshot.png, YARN-9583.001.patch
>
>
> In secure mode, a failed job which is submitted to an unknown queue is shown 
> to all users.
> I attached an RM UI screenshot.
> Reproduction scenario:
>1. user foo submits a job to an unknown queue without view-acl and the job 
> fails immediately. 
>2. user bar can access the job of user foo which previously failed.
> According to the comments in QueueACLsManager.java that caused the problem,
> this situation can happen when the RM is restarted after deleting a queue.
> I think showing the app of a non-existing queue to all users after RM start 
> is the problem. 
> It will become a security hole.
> I fixed it a little bit.  
> After the fix, both the owner of the job and the admin of YARN can access a 
> job which is submitted to an unknown queue. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9583) Failed job which is submitted to unknown queue is shown to all users

2019-05-27 Thread KWON BYUNGCHANG (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

KWON BYUNGCHANG updated YARN-9583:
--
Description: 
In secure mode, a failed job which is submitted to an unknown queue is shown to 
all users.
I attached an RM UI screenshot.

Reproduction scenario:
   1. user foo submits a job to an unknown queue without view-acl and the job 
fails immediately. 
   2. user bar can access the job of user foo which previously failed.

According to the comments in QueueACLsManager.java that caused the problem,
this situation can happen when the RM is restarted after queue deletion.

I think showing the app of a non-existing queue to all users after RM start is 
the problem. 
It will become a security hole.

I fixed it a little bit.  
After the fix, both the owner of the job and the admin of YARN can access a job 
which is submitted to an unknown queue. 


  was:
A failed job which is submitted to an unknown queue is shown to all users.
I attached an RM UI screenshot.

Reproduction scenario:
   1. user foo submits a job to an unknown queue without view-acl and the job 
fails immediately. 
   2. user bar can access the job of user foo which previously failed.

According to the comments in QueueACLsManager.java that caused the problem,
this situation can happen when the RM is restarted after queue deletion.

I think showing the app of a non-existing queue to all users after RM start is 
the problem. 
It will become a security hole.

I fixed it a little bit.  
After the fix, both the owner of the job and the admin of YARN can access a job 
which is submitted to an unknown queue. 



> Failed job which is submitted to unknown queue is shown to all users
> ---
>
> Key: YARN-9583
> URL: https://issues.apache.org/jira/browse/YARN-9583
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.1.2
>Reporter: KWON BYUNGCHANG
>Priority: Minor
> Attachments: YARN-9583-screenshot.png, YARN-9583.001.patch
>
>
> In secure mode, a failed job which is submitted to an unknown queue is shown 
> to all users.
> I attached an RM UI screenshot.
> Reproduction scenario:
>1. user foo submits a job to an unknown queue without view-acl and the job 
> fails immediately. 
>2. user bar can access the job of user foo which previously failed.
> According to the comments in QueueACLsManager.java that caused the problem,
> this situation can happen when the RM is restarted after queue deletion.
> I think showing the app of a non-existing queue to all users after RM start 
> is the problem. 
> It will become a security hole.
> I fixed it a little bit.  
> After the fix, both the owner of the job and the admin of YARN can access a 
> job which is submitted to an unknown queue. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9583) Failed job which is submitted to unknown queue is shown to all users

2019-05-27 Thread KWON BYUNGCHANG (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

KWON BYUNGCHANG updated YARN-9583:
--
Description: 
A failed job which is submitted to an unknown queue is shown to all users.
I attached an RM UI screenshot.

Reproduction scenario:
   1. user foo submits a job to an unknown queue without view-acl and the job 
fails immediately. 
   2. user bar can access the job of user foo which previously failed.

According to the comments in QueueACLsManager.java that caused the problem,
this situation can happen when the RM is restarted after queue deletion.

I think showing the app of a non-existing queue to all users after RM start is 
the problem. 
It will become a security hole.

I fixed it a little bit.  
After the fix, both the owner of the job and the admin of YARN can access a job 
which is submitted to an unknown queue. 


  was:
A failed job which is submitted to an unknown queue is shown to all users.
I attached an RM UI screenshot.

Reproduction scenario:
   1. user foo submits a job to an unknown queue without view-acl and the job 
fails. 
   2. user bar can access the job of user foo.  

According to the comments in QueueACLsManager.java that caused the problem.
This situation can happen when the RM is restarted after queue deletion.

I think showing the app of a non-existing queue to all users after RM start is 
the problem. 
It will become a security hole.

I fixed it a little bit.  
After the fix, both the owner of the job and the admin of YARN can access a job 
which is submitted to an unknown queue. 



> Failed job which is submitted to unknown queue is shown to all users
> ---
>
> Key: YARN-9583
> URL: https://issues.apache.org/jira/browse/YARN-9583
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.1.2
>Reporter: KWON BYUNGCHANG
>Priority: Minor
> Attachments: YARN-9583-screenshot.png, YARN-9583.001.patch
>
>
> A failed job which is submitted to an unknown queue is shown to all users.
> I attached an RM UI screenshot.
> Reproduction scenario:
>1. user foo submits a job to an unknown queue without view-acl and the job 
> fails immediately. 
>2. user bar can access the job of user foo which previously failed.
> According to the comments in QueueACLsManager.java that caused the problem,
> this situation can happen when the RM is restarted after queue deletion.
> I think showing the app of a non-existing queue to all users after RM start 
> is the problem. 
> It will become a security hole.
> I fixed it a little bit.  
> After the fix, both the owner of the job and the admin of YARN can access a 
> job which is submitted to an unknown queue. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9583) Failed job which is submitted to unknown queue is shown to all users

2019-05-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16848661#comment-16848661
 ] 

Hadoop QA commented on YARN-9583:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 20s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 27s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m  3s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 82m 22s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 34s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}130m 29s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9583 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12969834/YARN-9583.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 2c03af2c869d 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9f056d9 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24153/testReport/ |
| Max. process+thread count | 886 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24153/console |
| Powered by | Apache Yetus 0.8.0 |

[jira] [Commented] (YARN-8693) Add signalToContainer REST API for RMWebServices

2019-05-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848650#comment-16848650
 ] 

Hadoop QA commented on YARN-8693:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 29s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m  9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 52s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 46s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m  3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 56s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 41s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 81m 16s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 41s{color} | {color:green} hadoop-yarn-server-router in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}139m 16s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-8693 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12969832/YARN-8693.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux cf8d5f1588c5 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9f056d9 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24152/testReport/ |
| Max. process+thread count | 899 (vs. ulimit of