[jira] [Commented] (YARN-7601) Incorrect container states recovered as LevelDB uses alphabetical order

2018-10-31 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671162#comment-16671162
 ] 

Hadoop QA commented on YARN-7601:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
 6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 39s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
32s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
58s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 58s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
32s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  4m 
21s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
24s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 58s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 48m 54s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-7601 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12903697/YARN-7601.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 3eaa12055ebd 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b13c567 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-YARN-Build/22395/artifact/out/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
| compile | 
https://builds.apache.org/job/PreCommit-YARN-Build/22395/artifact/out/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
| javac | 
https://builds.apache.org/job/PreCommit-YARN-Build/22395/artifact/out/pa

[jira] [Commented] (YARN-7901) Adding profile capability in resourceReq in LocalityMulticastAMRMProxyPolicy

2018-10-31 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671144#comment-16671144
 ] 

Hadoop QA commented on YARN-7901:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} 
| {color:red} YARN-7901 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-7901 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12909626/YARN-7901_trunk.001.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/22396/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Adding profile capability in resourceReq in LocalityMulticastAMRMProxyPolicy
> 
>
> Key: YARN-7901
> URL: https://issues.apache.org/jira/browse/YARN-7901
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: lovekesh bansal
>Assignee: lovekesh bansal
>Priority: Minor
> Attachments: YARN-7901_trunk.001.patch
>
>
> In the splitIndividualAny method, while creating the resourceRequest, we are 
> not setting the profile capability:
> ResourceRequest.newInstance(originalResourceRequest.getPriority(),
>  originalResourceRequest.getResourceName(),
>  originalResourceRequest.getCapability(),
>  originalResourceRequest.getNumContainers(),
>  originalResourceRequest.getRelaxLocality(),
>  originalResourceRequest.getNodeLabelExpression(),
>  originalResourceRequest.getExecutionTypeRequest());
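A minimal sketch of the proposed direction, assuming the resource-profiles accessors (getProfileCapability/setProfileCapability, which were @Evolving at the time and later removed) are still present on the target branch; this is not the committed patch:

{code:java}
// Sketch only: copy the profile capability onto the split request so it is
// not silently dropped. Assumes ResourceRequest still exposes the
// resource-profiles accessors on this branch.
ResourceRequest split = ResourceRequest.newInstance(
    originalResourceRequest.getPriority(),
    originalResourceRequest.getResourceName(),
    originalResourceRequest.getCapability(),
    originalResourceRequest.getNumContainers(),
    originalResourceRequest.getRelaxLocality(),
    originalResourceRequest.getNodeLabelExpression(),
    originalResourceRequest.getExecutionTypeRequest());
split.setProfileCapability(originalResourceRequest.getProfileCapability());
{code}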



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4858) start-yarn and stop-yarn scripts to support timeline and sharedcachemanager

2018-10-31 Thread Akira Ajisaka (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated YARN-4858:

Target Version/s: 2.10.0  (was: 2.9.2)

> start-yarn and stop-yarn scripts to support timeline and sharedcachemanager
> ---
>
> Key: YARN-4858
> URL: https://issues.apache.org/jira/browse/YARN-4858
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scripts
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>  Labels: oct16-easy
> Attachments: YARN-4858-001.patch, YARN-4858-branch-2.001.patch
>
>
> The start-yarn and stop-yarn scripts don't have any (even commented-out) 
> support for the timeline and sharedcachemanager services.
> Proposed:
> * bash and cmd start-yarn scripts gain commented-out start actions
> * stop-yarn scripts stop the servers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7601) Incorrect container states recovered as LevelDB uses alphabetical order

2018-10-31 Thread Akira Ajisaka (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated YARN-7601:

Target Version/s: 2.9.3  (was: 2.9.2)

> Incorrect container states recovered as LevelDB uses alphabetical order
> ---
>
> Key: YARN-7601
> URL: https://issues.apache.org/jira/browse/YARN-7601
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sampada Dehankar
>Assignee: Sampada Dehankar
>Priority: Major
> Attachments: YARN-7601.001.patch, YARN-7601.002.patch
>
>
> LevelDB stores key-value pairs in alphabetical order. The container id 
> concatenated with its state is used as the key. So, regardless of which states 
> a container goes through in its life cycle, the order of the state records 
> retrieved from LevelDB is always:
> LAUNCHED
> PAUSED
> QUEUED
> For example, if a container is LAUNCHED, then PAUSED, and then LAUNCHED again, 
> the recovered container state is currently PAUSED instead of LAUNCHED.
> We propose to store the timestamp as the value while making calls to 
>   
>   storeContainerLaunched
>   storeContainerPaused
>   storeContainerQueued
>   
> so that the correct container state is recovered based on timestamps.
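A minimal, self-contained sketch of the recovery side of this proposal, using a hypothetical helper rather than the actual NM state-store API: once a timestamp is stored as the value, recovery can pick the most recently written state instead of relying on LevelDB's alphabetical key order.

{code:java}
import java.util.Map;

/** Hypothetical helper: choose the state with the latest stored timestamp. */
final class LatestContainerState {
  static String resolve(Map<String, Long> stateToStoreTime) {
    String latest = null;
    long latestTime = Long.MIN_VALUE;
    for (Map.Entry<String, Long> e : stateToStoreTime.entrySet()) {
      if (e.getValue() > latestTime) {   // keep the most recent record
        latestTime = e.getValue();
        latest = e.getKey();             // e.g. "LAUNCHED", "PAUSED", "QUEUED"
      }
    }
    return latest;
  }
}
{code}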



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6918) Remove acls after queue delete to avoid memory leak

2018-10-31 Thread Akira Ajisaka (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated YARN-6918:

Target Version/s: 3.0.4, 2.9.3  (was: 3.0.2, 2.9.2)

> Remove acls after queue delete to avoid memory leak
> ---
>
> Key: YARN-6918
> URL: https://issues.apache.org/jira/browse/YARN-6918
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Major
> Attachments: YARN-6918.001.patch, YARN-6918.002.patch
>
>
> The ACLs for a deleted queue need to be removed from allAcls to avoid a leak 
> (Priority, YarnAuthorizer).
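A minimal sketch of the leak and the proposed cleanup, using a hypothetical map rather than the actual YarnAuthorizationProvider fields:

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: without the remove call, entries for deleted queues
 *  stay in allAcls forever and the map grows across queue add/delete cycles. */
final class QueueAclStore {
  private final Map<String, Object> allAcls = new ConcurrentHashMap<>();

  void setAcls(String queuePath, Object acls) {
    allAcls.put(queuePath, acls);
  }

  void onQueueDeleted(String queuePath) {
    allAcls.remove(queuePath);   // proposed cleanup on queue delete
  }
}
{code}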



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7450) ATS Client should retry on intermittent Kerberos issues.

2018-10-31 Thread Akira Ajisaka (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated YARN-7450:

Target Version/s: 2.9.3  (was: 2.9.2)

> ATS Client should retry on intermittent Kerberos issues.
> 
>
> Key: YARN-7450
> URL: https://issues.apache.org/jira/browse/YARN-7450
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: ATSv2
>Affects Versions: 2.7.3
> Environment: Hadoop-2.7.3
>Reporter: Ravi Prakash
>Priority: Major
>
> We saw a stack trace (posted in the first comment) in the ResourceManager 
> logs showing that the TimelineClientImpl was not able to re-login from the 
> keytab.
> I'm guessing an intermittent issue caused the Kerberos re-login from the keytab 
> to fail. However, I'm assuming this was *not* retried, because I only saw one 
> instance of this stack trace. I propose that this operation should have been 
> retried.
> It seems this caused events at the ResourceManager to queue up, and the RM 
> eventually stopped responding even to basic {{yarn application -list}} commands.
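A minimal sketch of the proposed retry behaviour, using only the public UserGroupInformation#reloginFromKeytab() call; this is not the TimelineClientImpl code.

{code:java}
import java.io.IOException;
import org.apache.hadoop.security.UserGroupInformation;

/** Sketch: retry a failed keytab re-login a few times with a simple backoff
 *  instead of giving up after a single intermittent failure. */
final class KerberosRelogin {
  static void reloginWithRetry(UserGroupInformation ugi, int maxAttempts)
      throws IOException, InterruptedException {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        ugi.reloginFromKeytab();         // may fail transiently
        return;
      } catch (IOException e) {
        if (attempt == maxAttempts) {
          throw e;                       // out of attempts, give up
        }
        Thread.sleep(1000L * attempt);   // linear backoff between attempts
      }
    }
  }
}
{code}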



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7560) Resourcemanager hangs when resourceUsedWithWeightToResourceRatio return a overflow value

2018-10-31 Thread Akira Ajisaka (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated YARN-7560:

Fix Version/s: (was: 3.0.3)

> Resourcemanager hangs when  resourceUsedWithWeightToResourceRatio return a 
> overflow value 
> --
>
> Key: YARN-7560
> URL: https://issues.apache.org/jira/browse/YARN-7560
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, resourcemanager
>Affects Versions: 3.0.0
>Reporter: zhengchenyu
>Assignee: zhengchenyu
>Priority: Major
> Attachments: YARN-7560.000.patch, YARN-7560.001.patch
>
>
> In our cluster, we changed the configuration and then called refreshQueues, 
> and the ResourceManager hung. The ResourceManager also can't restart 
> successfully. The jstack output always shows the following:
> {code}
> "main" #1 prio=5 os_prio=0 tid=0x7f98e8017000 nid=0x2f5 runnable 
> [0x7f98eed9a000]
>java.lang.Thread.State: RUNNABLE
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.resourceUsedWithWeightToResourceRatio(ComputeFairShares.java:182)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.computeSharesInternal(ComputeFairShares.java:140)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.computeSteadyShares(ComputeFairShares.java:66)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.FairSharePolicy.computeSteadyShares(FairSharePolicy.java:148)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.recomputeSteadyShares(FSParentQueue.java:102)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getQueue(QueueManager.java:148)
> - locked <0x7f8c4a8177a0> (a java.util.HashMap)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getLeafQueue(QueueManager.java:101)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.updateAllocationConfiguration(QueueManager.java:387)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$AllocationReloadListener.onReload(FairScheduler.java:1728)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:422)
> - locked <0x7f8c4a7eb2e0> (a 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1597)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1621)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> - locked <0x7f8c4a76ac48> (a java.lang.Object)
> at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:569)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> - locked <0x7f8c49254268> (a java.lang.Object)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:997)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:257)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> - locked <0x7f8c467495e0> (a java.lang.Object)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1220)
> {code}
> When we debugged the cluster, we found that 
> resourceUsedWithWeightToResourceRatio returns a negative value, so the loop 
> can't exit. In our cluster, the sum of all minRes exceeds Integer.MAX_VALUE, 
> which makes resourceUsedWithWeightToResourceRatio return a negative value.
> Below is the loop. totalResource is a long, so it is always positive, but 
> resourceUsedWithWeightToResourceRatio returns an int. Our cluster is so big 
> that resourceUsedWithWeightToResourceRatio overflows and returns a negative 
> value, so the loop never breaks.
> {code}
> while (resourceUsedWithWeightToResourceRatio(rMax, schedulables, type)
> < totalResource) {
>   rMax *= 2.0;
> }
> {code}
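A self-contained sketch of the arithmetic (a simplification, not the actual ComputeFairShares code): once the helper accumulates into a long and returns a long, the doubling loop can terminate even when the summed shares exceed Integer.MAX_VALUE.

{code:java}
/** Simplified sketch of the overflow and its fix. */
final class FairShareOverflowSketch {
  // Was: an int return type, which wraps negative on very large clusters and
  // then stays below totalResource forever.
  static long resourceUsedWithRatio(double w2rRatio, long[] minShares) {
    long total = 0L;                     // long accumulator, no wrap
    for (long minShare : minShares) {
      total += (long) (w2rRatio * minShare);
    }
    return total;
  }

  static double findUpperBound(long totalResource, long[] minShares) {
    double rMax = 1.0;
    // With a long sum, the total grows as rMax doubles and the loop exits
    // once it reaches totalResource (assuming at least one positive share).
    while (resourceUsedWithRatio(rMax, minShares) < totalResource) {
      rMax *= 2.0;
    }
    return rMax;
  }
}
{code}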



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubs

[jira] [Updated] (YARN-7560) Resourcemanager hangs when resourceUsedWithWeightToResourceRatio return a overflow value

2018-10-31 Thread Akira Ajisaka (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated YARN-7560:

Target Version/s: 3.0.4  (was: 3.0.0)

> Resourcemanager hangs when  resourceUsedWithWeightToResourceRatio return a 
> overflow value 
> --
>
> Key: YARN-7560
> URL: https://issues.apache.org/jira/browse/YARN-7560
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, resourcemanager
>Affects Versions: 3.0.0
>Reporter: zhengchenyu
>Assignee: zhengchenyu
>Priority: Major
> Attachments: YARN-7560.000.patch, YARN-7560.001.patch
>
>
> In our cluster, we changed the configuration and then called refreshQueues, 
> and the ResourceManager hung. The ResourceManager also can't restart 
> successfully. The jstack output always shows the following:
> {code}
> "main" #1 prio=5 os_prio=0 tid=0x7f98e8017000 nid=0x2f5 runnable 
> [0x7f98eed9a000]
>java.lang.Thread.State: RUNNABLE
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.resourceUsedWithWeightToResourceRatio(ComputeFairShares.java:182)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.computeSharesInternal(ComputeFairShares.java:140)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.computeSteadyShares(ComputeFairShares.java:66)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.FairSharePolicy.computeSteadyShares(FairSharePolicy.java:148)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.recomputeSteadyShares(FSParentQueue.java:102)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getQueue(QueueManager.java:148)
> - locked <0x7f8c4a8177a0> (a java.util.HashMap)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getLeafQueue(QueueManager.java:101)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.updateAllocationConfiguration(QueueManager.java:387)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$AllocationReloadListener.onReload(FairScheduler.java:1728)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:422)
> - locked <0x7f8c4a7eb2e0> (a 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1597)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1621)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> - locked <0x7f8c4a76ac48> (a java.lang.Object)
> at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:569)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> - locked <0x7f8c49254268> (a java.lang.Object)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:997)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:257)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> - locked <0x7f8c467495e0> (a java.lang.Object)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1220)
> {code}
> When we debugged the cluster, we found that 
> resourceUsedWithWeightToResourceRatio returns a negative value, so the loop 
> can't exit. In our cluster, the sum of all minRes exceeds Integer.MAX_VALUE, 
> which makes resourceUsedWithWeightToResourceRatio return a negative value.
> Below is the loop. totalResource is a long, so it is always positive, but 
> resourceUsedWithWeightToResourceRatio returns an int. Our cluster is so big 
> that resourceUsedWithWeightToResourceRatio overflows and returns a negative 
> value, so the loop never breaks.
> {code}
> while (resourceUsedWithWeightToResourceRatio(rMax, schedulables, type)
> < totalResource) {
>   rMax *= 2.0;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To 

[jira] [Updated] (YARN-7901) Adding profile capability in resourceReq in LocalityMulticastAMRMProxyPolicy

2018-10-31 Thread Akira Ajisaka (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated YARN-7901:

Fix Version/s: (was: 3.0.3)

> Adding profile capability in resourceReq in LocalityMulticastAMRMProxyPolicy
> 
>
> Key: YARN-7901
> URL: https://issues.apache.org/jira/browse/YARN-7901
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: lovekesh bansal
>Assignee: lovekesh bansal
>Priority: Minor
> Attachments: YARN-7901_trunk.001.patch
>
>
> In the splitIndividualAny method, while creating the resourceRequest, we are 
> not setting the profile capability:
> ResourceRequest.newInstance(originalResourceRequest.getPriority(),
>  originalResourceRequest.getResourceName(),
>  originalResourceRequest.getCapability(),
>  originalResourceRequest.getNumContainers(),
>  originalResourceRequest.getRelaxLocality(),
>  originalResourceRequest.getNodeLabelExpression(),
>  originalResourceRequest.getExecutionTypeRequest());



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7649) RMContainer state transition exception after container update

2018-10-31 Thread Akira Ajisaka (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated YARN-7649:

Target Version/s: 3.0.4, 2.9.3  (was: 3.0.2, 2.9.2)

> RMContainer state transition exception after container update
> -
>
> Key: YARN-7649
> URL: https://issues.apache.org/jira/browse/YARN-7649
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.9.0
>Reporter: Weiwei Yang
>Assignee: Arun Suresh
>Priority: Major
>
> I've been seen this in a cluster deployment as well as in UT, run 
> {{TestAMRMClient#testAMRMClientWithContainerPromotion}} could reproduce this, 
>  it doesn't fail the test case but following error message is shown up in the 
> log
> {noformat}
> 2017-12-13 19:41:31,817 ERROR rmcontainer.RMContainerImpl 
> (RMContainerImpl.java:handle(480)) - Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: 
> RELEASED at ALLOCATED
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:478)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:65)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.completedContainer(AbstractYarnScheduler.java:675)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1586)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:155)
>   at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
>   at java.lang.Thread.run(Thread.java:748)
> 2017-12-13 19:41:31,817 ERROR rmcontainer.RMContainerImpl 
> (RMContainerImpl.java:handle(481)) - Invalid event RELEASED on container 
> container_1513165290804_0001_01_03
> {noformat}
> this seems to be related to YARN-6251.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8965) Revisit delay scheduling for cloud environment

2018-10-31 Thread Weiwei Yang (JIRA)
Weiwei Yang created YARN-8965:
-

 Summary: Revisit delay scheduling for cloud environment
 Key: YARN-8965
 URL: https://issues.apache.org/jira/browse/YARN-8965
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Weiwei Yang
Assignee: Weiwei Yang


Delay scheduling was introduced to honor task locality on a best-effort basis, 
using node- and rack-level delays. However, in cloud environments the storage is 
usually remote to the workload cluster, which makes the locality constraint 
largely irrelevant. Let's revisit this and create a model for cloud environments 
that does not require much setup.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8948) PlacementRule interface should be for all YarnSchedulers

2018-10-31 Thread Bibin A Chundatt (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-8948:
---
Priority: Critical  (was: Major)

> PlacementRule interface should be for all YarnSchedulers
> 
>
> Key: YARN-8948
> URL: https://issues.apache.org/jira/browse/YARN-8948
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Priority: Critical
> Attachments: YARN-8948.001.patch, YARN-8948.002.patch
>
>
> *Issue 1:*
> The intention of YARN-3635 was to add a PlacementRule interface common to all 
> YarnSchedulers.
> {code}
> public abstract boolean initialize(
>     CapacitySchedulerContext schedulerContext) throws IOException;
> {code}
> PlacementRule initialization is done using CapacitySchedulerContext, binding it 
> to CapacityScheduler.
> *Issue 2:*
> {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity 
> Scheduler
> {quote}
> * **Queue Mapping Interface based on Default or User Defined Placement 
> Rules** - This feature allows users to map a job to a specific queue based on 
> some default placement rule. For instance based on user & group, or 
> application name. User can also define their own placement rule.
> {quote}
> As per the current code, UserGroupMapping is always added to the placement 
> rules in {{CapacityScheduler#updatePlacementRules}}:
> {code}
> // Initialize placement rules
> Collection<String> placementRuleStrs = conf.getStringCollection(
>     YarnConfiguration.QUEUE_PLACEMENT_RULES);
> List<PlacementRule> placementRules = new ArrayList<>();
> ...
> // add UserGroupMappingPlacementRule if absent
> distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE);
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8948) PlacementRule interface should be for all YarnSchedulers

2018-10-31 Thread Bibin A Chundatt (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-8948:
---
Description: 
*Issue 1:*

The intention of YARN-3635 was to add a PlacementRule interface common to all 
YarnSchedulers.
{code}
public abstract boolean initialize(
    CapacitySchedulerContext schedulerContext) throws IOException;
{code}
PlacementRule initialization is done using CapacitySchedulerContext, binding it to 
CapacityScheduler.

*Issue 2:*

{{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity 
Scheduler
{quote}
* **Queue Mapping Interface based on Default or User Defined Placement Rules** 
- This feature allows users to map a job to a specific queue based on some 
default placement rule. For instance based on user & group, or application 
name. User can also define their own placement rule.
{quote}
As per the current code, UserGroupMapping is always added to the placement rules 
in {{CapacityScheduler#updatePlacementRules}}:
{code}
// Initialize placement rules
Collection<String> placementRuleStrs = conf.getStringCollection(
    YarnConfiguration.QUEUE_PLACEMENT_RULES);
List<PlacementRule> placementRules = new ArrayList<>();
...

// add UserGroupMappingPlacementRule if absent
distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE);
{code}

  was:
The intention of YARN-3635 was to add a PlacementRule interface common to all 
YarnSchedulers.
{code}
public abstract boolean initialize(
    CapacitySchedulerContext schedulerContext) throws IOException;
{code}
PlacementRule initialization is done using CapacitySchedulerContext, binding it to 
CapacityScheduler.
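A hypothetical sketch of what a scheduler-agnostic hook could look like, using the common ResourceScheduler interface; the signature below is illustrative only and is not the committed change:

{code:java}
import java.io.IOException;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ResourceScheduler;

/** Illustrative only: initialize against the common scheduler interface
 *  instead of CapacitySchedulerContext, so FairScheduler and others could
 *  also supply placement rules. */
public abstract class SchedulerAgnosticPlacementRule {
  public abstract boolean initialize(ResourceScheduler scheduler)
      throws IOException;
}
{code}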


> PlacementRule interface should be for all YarnSchedulers
> 
>
> Key: YARN-8948
> URL: https://issues.apache.org/jira/browse/YARN-8948
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Priority: Major
> Attachments: YARN-8948.001.patch, YARN-8948.002.patch
>
>
> *Issue 1:*
> The intention of YARN-3635 was to add a PlacementRule interface common to all 
> YarnSchedulers.
> {code}
> public abstract boolean initialize(
>     CapacitySchedulerContext schedulerContext) throws IOException;
> {code}
> PlacementRule initialization is done using CapacitySchedulerContext, binding it 
> to CapacityScheduler.
> *Issue 2:*
> {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity 
> Scheduler
> {quote}
> * **Queue Mapping Interface based on Default or User Defined Placement 
> Rules** - This feature allows users to map a job to a specific queue based on 
> some default placement rule. For instance based on user & group, or 
> application name. User can also define their own placement rule.
> {quote}
> As per the current code, UserGroupMapping is always added to the placement 
> rules in {{CapacityScheduler#updatePlacementRules}}:
> {code}
> // Initialize placement rules
> Collection<String> placementRuleStrs = conf.getStringCollection(
>     YarnConfiguration.QUEUE_PLACEMENT_RULES);
> List<PlacementRule> placementRules = new ArrayList<>();
> ...
> // add UserGroupMappingPlacementRule if absent
> distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE);
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8394) Improve data locality documentation for Capacity Scheduler

2018-10-31 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671119#comment-16671119
 ] 

Weiwei Yang commented on YARN-8394:
---

Hi [~yufeigu]

Apologies, I missed your last comment.

Are you suggesting that when "yarn.scheduler.capacity.node-locality-delay" is set 
to "-1", we should automatically disable "rack-locality-additional-delay" too? I 
think that makes sense.

We need a JIRA to track this and to revisit the locality code in the context of 
cloud environments. Let me open one to track it.

Thanks!

> Improve data locality documentation for Capacity Scheduler
> --
>
> Key: YARN-8394
> URL: https://issues.apache.org/jira/browse/YARN-8394
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Fix For: 3.2.0, 3.1.1, 3.0.4
>
> Attachments: YARN-8394.001.patch, YARN-8394.002.patch
>
>
> YARN-6344 introduces a new parameter 
> {{yarn.scheduler.capacity.rack-locality-additional-delay}} in 
> capacity-scheduler.xml; we need to add some documentation to 
> {{CapacityScheduler.md}} accordingly.
> Moreover, we are seeing more and more clusters separating storage and 
> computation, where the file system is always remote. In such cases we need to 
> explain how to relax data locality in CS, otherwise MR jobs suffer.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8404) Timeline event publish need to be async to avoid Dispatcher thread leak in case ATS is down

2018-10-31 Thread Akira Ajisaka (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated YARN-8404:

Fix Version/s: 2.9.2

Thanks! Backported to branch-2.9.

> Timeline event publish need to be async to avoid Dispatcher thread leak in 
> case ATS is down
> ---
>
> Key: YARN-8404
> URL: https://issues.apache.org/jira/browse/YARN-8404
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.2
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Blocker
> Fix For: 2.10.0, 3.2.0, 3.1.1, 2.9.2, 3.0.4
>
> Attachments: YARN-8404.01.patch
>
>
> It is observed that if the ATS 1/1.5 daemon is not running, RM recovery is 
> delayed while the timeline client times out for each application. By default, 
> the timeout takes around 5 minutes. If there are many completed applications, 
> the amount of time the RM will wait is *(number of completed applications in 
> the cluster * 5 minutes)*, so the RM effectively hangs.
> The primary reason for this behavior is YARN-3044/YARN-4129, which refactored 
> the existing system metrics publisher. This refactoring made the appFinished 
> event synchronous, whereas it was asynchronous earlier.
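A minimal sketch of the intended behaviour (a hypothetical class, not the actual SystemMetricsPublisher code): hand the publish call to its own executor so a slow or unreachable ATS cannot block the caller.

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch: queue timeline publishes instead of running them
 *  synchronously on the recovery path. */
final class AsyncTimelinePublisher {
  private final ExecutorService pool = Executors.newSingleThreadExecutor();

  void publishAppFinished(Runnable publishCall) {
    pool.submit(publishCall);   // caller returns immediately
  }

  void stop() {
    pool.shutdown();            // lets queued publishes drain on shutdown
  }
}
{code}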



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7765) [Atsv2] GSSException: No valid credentials provided - Failed to find any Kerberos tgt thrown by Timelinev2Client & HBaseClient in NM

2018-10-31 Thread Akira Ajisaka (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated YARN-7765:

Target Version/s: 3.0.1, 3.1.0, 2.10.0  (was: 3.1.0, 2.10.0, 3.0.1)
   Fix Version/s: 2.9.2

Backported to branch-2.9. Thanks [~rohithsharma].

> [Atsv2] GSSException: No valid credentials provided - Failed to find any 
> Kerberos tgt thrown by Timelinev2Client & HBaseClient in NM
> 
>
> Key: YARN-7765
> URL: https://issues.apache.org/jira/browse/YARN-7765
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Sumana Sathish
>Assignee: Rohith Sharma K S
>Priority: Blocker
> Fix For: 3.1.0, 2.10.0, 3.0.1, 2.9.2
>
> Attachments: YARN-7765.01.patch, YARN-7765.02.patch
>
>
> A secure cluster is deployed and all YARN services start successfully. When an 
> application is submitted, the app collectors started as an aux-service throw 
> the exception below. This exception is *NOT* observed from the RM 
> TimelineCollector. 
> The cluster is deployed with Hadoop 3.0 and HBase 1.2.6 in secure mode. All the 
> YARN and HBase services start and work perfectly fine. After 24 hours, i.e. 
> when the token lifetime expires, the HBaseClient in the NM and the HDFSClient 
> in the HMaster and HRegionServer start getting this error. After some time, the 
> HBase daemons shut down. In the NM, the JVM didn't shut down, but none of the 
> events got published.
> {noformat}
> 2018-01-17 11:04:48,017 FATAL ipc.RpcClientImpl (RpcClientImpl.java:run(684)) 
> - SASL authentication failed. The most likely cause is missing or invalid 
> credentials. Consider 'kinit'.
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> {noformat}
> cc :/ [~vrushalic] [~varun_saxena] 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7277) Container Launch expand environment needs to consider bracket matching

2018-10-31 Thread Zhankun Tang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhankun Tang updated YARN-7277:
---
Attachment: YARN-7277-trunk.004.patch

> Container Launch expand environment needs to consider bracket matching
> --
>
> Key: YARN-7277
> URL: https://issues.apache.org/jira/browse/YARN-7277
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: balloons
>Assignee: Zhankun Tang
>Priority: Critical
> Attachments: YARN-7277-trunk.001.patch, YARN-7277-trunk.002.patch, 
> YARN-7277-trunk.003.patch, YARN-7277-trunk.004.patch
>
>
> The Spark application I submitted always failed, and I finally found that the 
> commands I specified to launch the AM container were changed by the NM.
> *The following is an excerpt from the command I submitted to the RM:*
> {code:java}
> *'{\"handler\":\"FILLER\",\"inputTable\":\"engine_arch.adult_train\",\"outputTable\":[\"ether_features_filler_\$experimentId_\$taskId_out0\"],\"params\":{\"age\":{\"param\":[\"0\"]}}}'*
> {code}
> *The following is the corresponding excerpt from the command the NM actually 
> used to launch the container:*
> {code:java}
> *'{\"handler\":\"FILLER\",\"inputTable\":\"engine_arch.adult_train\",\"outputTable\":[\"ether_features_filler_\$experimentId_\$taskId_out0\"],\"params\":{\"age\":{\"param\":[\"0\"]}*
> {code}
> Finally, I found that the NM applies the following transformation when 
> launching the container, which leads to this situation:
> {code:java}
> @VisibleForTesting
>   public static String expandEnvironment(String var,
>   Path containerLogDir) {
> var = var.replace(ApplicationConstants.LOG_DIR_EXPANSION_VAR,
>   containerLogDir.toString());
> var =  var.replace(ApplicationConstants.CLASS_PATH_SEPARATOR,
>   File.pathSeparator);
> // replace parameter expansion marker. e.g. {{VAR}} on Windows is replaced
> // as %VAR% and on Linux replaced as "$VAR"
> if (Shell.WINDOWS) {
>   var = var.replaceAll("(\\{\\{)|(\\}\\})", "%");
> } else {
>   var = var.replace(ApplicationConstants.PARAMETER_EXPANSION_LEFT, "$");
>   *var = var.replace(ApplicationConstants.PARAMETER_EXPANSION_RIGHT, "");*
> }
> return var;
>   }
> {code}
> I think this is a bug: the substitution does not consider the pairing of 
> "*PARAMETER_EXPANSION_LEFT*" and "*PARAMETER_EXPANSION_RIGHT*" when 
> substituting; it simply replaces the markers blindly.
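A self-contained sketch of a bracket-matched expansion (not the committed patch): replace each whole {{VAR}} token in one step, so user payloads that happen to contain stray braces or "$" are left untouched.

{code:java}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: expand only matched {{VAR}} pairs instead of replacing the left
 *  and right markers independently. */
final class MatchedEnvExpansion {
  private static final Pattern MARKER = Pattern.compile("\\{\\{(\\w+)\\}\\}");

  static String expand(String value, boolean onWindows) {
    Matcher m = MARKER.matcher(value);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      String expanded = onWindows ? "%" + m.group(1) + "%" : "$" + m.group(1);
      m.appendReplacement(out, Matcher.quoteReplacement(expanded));
    }
    m.appendTail(out);
    return out.toString();
  }
}
{code}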



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app

2018-10-31 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671044#comment-16671044
 ] 

Weiwei Yang commented on YARN-8958:
---

Hi [~Tao Yang], no worries. I have seen it several times; it should not be caused 
by this patch.

The fix makes sense to me. I'll take one more look today. Thanks

> Schedulable entities leak in fair ordering policy when recovering containers 
> between remove app attempt and remove app
> --
>
> Key: YARN-8958
> URL: https://issues.apache.org/jira/browse/YARN-8958
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.2.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8958.001.patch, YARN-8958.002.patch
>
>
> We found an NPE in ClientRMService#getApplications when querying apps with a 
> specified queue. The cause is that one app can no longer be found by calling 
> RMContextImpl#getRMApps (it is finished and swapped out of memory) but can 
> still be queried from the fair ordering policy.
> To reproduce the schedulable entities leak in fair ordering policy:
> (1) create app1 and launch container1 on node1
> (2) restart RM
> (3) remove the app1 attempt; app1 is removed from the schedulable entities.
> (4) recover container1 after node1 reconnects to the RM; the state of 
> container1 changes to COMPLETED, app1 is brought back into entitiesToReorder 
> after the container is released, and app1 is then added back into the 
> schedulable entities when the scheduler calls 
> FairOrderingPolicy#getAssignmentIterator.
> (5) remove app1
> To solve this problem, we should make sure schedulableEntities can only be 
> affected by adding or removing an app attempt; a new entity should not be 
> added into schedulableEntities by the reordering process.
> {code:java}
>   protected void reorderSchedulableEntity(S schedulableEntity) {
> //remove, update comparable data, and reinsert to update position in order
> schedulableEntities.remove(schedulableEntity);
> updateSchedulingResourceUsage(
>   schedulableEntity.getSchedulingResourceUsage());
> schedulableEntities.add(schedulableEntity);
>   }
> {code}
> The code above can be improved as follows to make sure only an existing 
> entity can be re-added into schedulableEntities.
> {code:java}
>   protected void reorderSchedulableEntity(S schedulableEntity) {
> //remove, update comparable data, and reinsert to update position in order
> boolean exists = schedulableEntities.remove(schedulableEntity);
> updateSchedulingResourceUsage(
>   schedulableEntity.getSchedulingResourceUsage());
> if (exists) {
>   schedulableEntities.add(schedulableEntity);
> } else {
>   LOG.info("Skip reordering non-existent schedulable entity: "
>   + schedulableEntity.getId());
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app

2018-10-31 Thread Tao Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671030#comment-16671030
 ] 

Tao Yang commented on YARN-8958:


There is no UT failure, but Hadoop QA still gave -1 for unit. 
[~cheersyang], can you help take a look at what happened?

> Schedulable entities leak in fair ordering policy when recovering containers 
> between remove app attempt and remove app
> --
>
> Key: YARN-8958
> URL: https://issues.apache.org/jira/browse/YARN-8958
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.2.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8958.001.patch, YARN-8958.002.patch
>
>
> We found an NPE in ClientRMService#getApplications when querying apps with a 
> specified queue. The cause is that one app can no longer be found by calling 
> RMContextImpl#getRMApps (it is finished and swapped out of memory) but can 
> still be queried from the fair ordering policy.
> To reproduce the schedulable entities leak in fair ordering policy:
> (1) create app1 and launch container1 on node1
> (2) restart RM
> (3) remove the app1 attempt; app1 is removed from the schedulable entities.
> (4) recover container1 after node1 reconnects to the RM; the state of 
> container1 changes to COMPLETED, app1 is brought back into entitiesToReorder 
> after the container is released, and app1 is then added back into the 
> schedulable entities when the scheduler calls 
> FairOrderingPolicy#getAssignmentIterator.
> (5) remove app1
> To solve this problem, we should make sure schedulableEntities can only be 
> affected by adding or removing an app attempt; a new entity should not be 
> added into schedulableEntities by the reordering process.
> {code:java}
>   protected void reorderSchedulableEntity(S schedulableEntity) {
> //remove, update comparable data, and reinsert to update position in order
> schedulableEntities.remove(schedulableEntity);
> updateSchedulingResourceUsage(
>   schedulableEntity.getSchedulingResourceUsage());
> schedulableEntities.add(schedulableEntity);
>   }
> {code}
> The code above can be improved as follows to make sure only an existing 
> entity can be re-added into schedulableEntities.
> {code:java}
>   protected void reorderSchedulableEntity(S schedulableEntity) {
> //remove, update comparable data, and reinsert to update position in order
> boolean exists = schedulableEntities.remove(schedulableEntity);
> updateSchedulingResourceUsage(
>   schedulableEntity.getSchedulingResourceUsage());
> if (exists) {
>   schedulableEntities.add(schedulableEntity);
> } else {
>   LOG.info("Skip reordering non-existent schedulable entity: "
>   + schedulableEntity.getId());
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app

2018-10-31 Thread Tao Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671024#comment-16671024
 ] 

Tao Yang edited comment on YARN-8958 at 11/1/18 2:09 AM:
-

Thanks [~cheersyang] for the review.
{quote}
In testSchedulableEntitiesLeak, why the app attempt is finished, but then you 
try to recover a container for this app? I suppose by then all containers of 
this app attempt are done correct?
{quote}
This can happen after an RM restart, which is step 2 of the reproduce process. 
Removing the app attempt (step 3) may happen before the NM reconnects to the RM 
and recovers containers (step 4), so not all containers are done when the app 
attempt finishes. 
I clarified step 4 as {{(4) recover container1 after node1 reconnected to RM}}.


was (Author: tao yang):
Thanks [~cheersyang] for the review.
{quote}
In testSchedulableEntitiesLeak, why the app attempt is finished, but then you 
try to recover a container for this app? I suppose by then all containers of 
this app attempt are done correct?
{quote}
This can happen after an RM restart, which is step 2 of the reproduce process. 
Removing the app attempt (step 3) may happen before the NM reconnects to the RM 
and recovers containers (step 4), so not all containers are done when the app 
attempt finishes.

> Schedulable entities leak in fair ordering policy when recovering containers 
> between remove app attempt and remove app
> --
>
> Key: YARN-8958
> URL: https://issues.apache.org/jira/browse/YARN-8958
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.2.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8958.001.patch, YARN-8958.002.patch
>
>
> We found an NPE in ClientRMService#getApplications when querying apps with a 
> specified queue. The cause is that one app can no longer be found by calling 
> RMContextImpl#getRMApps (it is finished and swapped out of memory) but can 
> still be queried from the fair ordering policy.
> To reproduce the schedulable entities leak in fair ordering policy:
> (1) create app1 and launch container1 on node1
> (2) restart RM
> (3) remove the app1 attempt; app1 is removed from the schedulable entities.
> (4) recover container1 after node1 reconnects to the RM; the state of 
> container1 changes to COMPLETED, app1 is brought back into entitiesToReorder 
> after the container is released, and app1 is then added back into the 
> schedulable entities when the scheduler calls 
> FairOrderingPolicy#getAssignmentIterator.
> (5) remove app1
> To solve this problem, we should make sure schedulableEntities can only be 
> affected by adding or removing an app attempt; a new entity should not be 
> added into schedulableEntities by the reordering process.
> {code:java}
>   protected void reorderSchedulableEntity(S schedulableEntity) {
> //remove, update comparable data, and reinsert to update position in order
> schedulableEntities.remove(schedulableEntity);
> updateSchedulingResourceUsage(
>   schedulableEntity.getSchedulingResourceUsage());
> schedulableEntities.add(schedulableEntity);
>   }
> {code}
> The code above can be improved as follows to make sure only an existing 
> entity can be re-added into schedulableEntities.
> {code:java}
>   protected void reorderSchedulableEntity(S schedulableEntity) {
> //remove, update comparable data, and reinsert to update position in order
> boolean exists = schedulableEntities.remove(schedulableEntity);
> updateSchedulingResourceUsage(
>   schedulableEntity.getSchedulingResourceUsage());
> if (exists) {
>   schedulableEntities.add(schedulableEntity);
> } else {
>   LOG.info("Skip reordering non-existent schedulable entity: "
>   + schedulableEntity.getId());
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app

2018-10-31 Thread Tao Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-8958:
---
Description: 
We found an NPE in ClientRMService#getApplications when querying apps with a 
specified queue. The cause is that one app can no longer be found by calling 
RMContextImpl#getRMApps (it is finished and swapped out of memory) but can still 
be queried from the fair ordering policy.

To reproduce the schedulable entities leak in fair ordering policy:
(1) create app1 and launch container1 on node1
(2) restart RM
(3) remove the app1 attempt; app1 is removed from the schedulable entities.
(4) recover container1 after node1 reconnects to the RM; the state of container1 
changes to COMPLETED, app1 is brought back into entitiesToReorder after the 
container is released, and app1 is then added back into the schedulable entities 
when the scheduler calls FairOrderingPolicy#getAssignmentIterator.
(5) remove app1

To solve this problem, we should make sure schedulableEntities can only be 
affected by adding or removing an app attempt; a new entity should not be added 
into schedulableEntities by the reordering process.
{code:java}
  protected void reorderSchedulableEntity(S schedulableEntity) {
//remove, update comparable data, and reinsert to update position in order
schedulableEntities.remove(schedulableEntity);
updateSchedulingResourceUsage(
  schedulableEntity.getSchedulingResourceUsage());
schedulableEntities.add(schedulableEntity);
  }
{code}
The code above can be improved as follows to make sure only an existing entity 
can be re-added into schedulableEntities.
{code:java}
  protected void reorderSchedulableEntity(S schedulableEntity) {
//remove, update comparable data, and reinsert to update position in order
boolean exists = schedulableEntities.remove(schedulableEntity);
updateSchedulingResourceUsage(
  schedulableEntity.getSchedulingResourceUsage());
if (exists) {
  schedulableEntities.add(schedulableEntity);
} else {
  LOG.info("Skip reordering non-existent schedulable entity: "
  + schedulableEntity.getId());
}
  }
{code}

  was:
We found an NPE in ClientRMService#getApplications when querying apps with a 
specified queue. The cause is that one app can no longer be found by calling 
RMContextImpl#getRMApps (it is finished and swapped out of memory) but can still 
be queried from the fair ordering policy.

To reproduce the schedulable entities leak in fair ordering policy:
(1) create app1 and launch container1 on node1
(2) restart RM
(3) remove the app1 attempt; app1 is removed from the schedulable entities.
(4) recover container1; the state of container1 changes to COMPLETED, app1 is 
brought back into entitiesToReorder after the container is released, and app1 is 
then added back into the schedulable entities when the scheduler calls 
FairOrderingPolicy#getAssignmentIterator.
(5) remove app1

To solve this problem, we should make sure schedulableEntities can only be 
affected by adding or removing an app attempt; a new entity should not be added 
into schedulableEntities by the reordering process.
{code:java}
  protected void reorderSchedulableEntity(S schedulableEntity) {
//remove, update comparable data, and reinsert to update position in order
schedulableEntities.remove(schedulableEntity);
updateSchedulingResourceUsage(
  schedulableEntity.getSchedulingResourceUsage());
schedulableEntities.add(schedulableEntity);
  }
{code}
The code above can be improved as follows to make sure only an existing entity 
can be re-added into schedulableEntities.
{code:java}
  protected void reorderSchedulableEntity(S schedulableEntity) {
//remove, update comparable data, and reinsert to update position in order
boolean exists = schedulableEntities.remove(schedulableEntity);
updateSchedulingResourceUsage(
  schedulableEntity.getSchedulingResourceUsage());
if (exists) {
  schedulableEntities.add(schedulableEntity);
} else {
  LOG.info("Skip reordering non-existent schedulable entity: "
  + schedulableEntity.getId());
}
  }
{code}


> Schedulable entities leak in fair ordering policy when recovering containers 
> between remove app attempt and remove app
> --
>
> Key: YARN-8958
> URL: https://issues.apache.org/jira/browse/YARN-8958
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.2.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8958.001.patch, YARN-8958.002.patch
>
>
> We found an NPE in ClientRMService#getApplications when querying apps with a 
> specified queue. The cause is that one app can no longer be found by 
> calling RMContextImpl#getRMApps (it is finished and

[jira] [Commented] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app

2018-10-31 Thread Tao Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671024#comment-16671024
 ] 

Tao Yang commented on YARN-8958:


Thanks [~cheersyang] for the review.
{quote}
In testSchedulableEntitiesLeak, why the app attempt is finished, but then you 
try to recover a container for this app? I suppose by then all containers of 
this app attempt are done correct?
{quote}
This can happen after an RM restart, which is step 2 of the reproduce process. 
Removing the app attempt (step 3) may happen before the NM reconnects to the RM 
and recovers containers (step 4), so not all containers are done when the app 
attempt finishes.

> Schedulable entities leak in fair ordering policy when recovering containers 
> between remove app attempt and remove app
> --
>
> Key: YARN-8958
> URL: https://issues.apache.org/jira/browse/YARN-8958
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.2.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8958.001.patch, YARN-8958.002.patch
>
>
> We found an NPE in ClientRMService#getApplications when querying apps with a 
> specified queue. The cause is that there is one app which can't be found by 
> calling RMContextImpl#getRMApps (it is finished and swapped out of memory) but 
> can still be queried from the fair ordering policy.
> To reproduce the schedulable entities leak in the fair ordering policy:
> (1) create app1 and launch container1 on node1
> (2) restart RM
> (3) remove app1 attempt, app1 is removed from the schedulable entities.
> (4) recover container1, then the state of container1 is changed to COMPLETED, 
> app1 is brought back to entitiesToReorder after the container is released, then app1 
> will be added back into the schedulable entities when the scheduler calls 
> FairOrderingPolicy#getAssignmentIterator.
> (5) remove app1
> To solve this problem, we should make sure schedulableEntities can only be 
> modified by adding or removing an app attempt; a new entity should not be added into 
> schedulableEntities by the reordering process.
> {code:java}
>   protected void reorderSchedulableEntity(S schedulableEntity) {
>     // remove, update comparable data, and reinsert to update position in order
>     schedulableEntities.remove(schedulableEntity);
>     updateSchedulingResourceUsage(
>         schedulableEntity.getSchedulingResourceUsage());
>     schedulableEntities.add(schedulableEntity);
>   }
> {code}
> The code above can be improved as follows to make sure only an existing 
> entity can be re-added into schedulableEntities.
> {code:java}
>   protected void reorderSchedulableEntity(S schedulableEntity) {
>     // remove, update comparable data, and reinsert to update position in order
>     boolean exists = schedulableEntities.remove(schedulableEntity);
>     updateSchedulingResourceUsage(
>         schedulableEntity.getSchedulingResourceUsage());
>     if (exists) {
>       schedulableEntities.add(schedulableEntity);
>     } else {
>       LOG.info("Skip reordering non-existent schedulable entity: "
>           + schedulableEntity.getId());
>     }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8932) ResourceUtilization cpu is misused in oversubscription as a percentage

2018-10-31 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671022#comment-16671022
 ] 

Hadoop QA commented on YARN-8932:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  9m  
3s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} YARN-1011 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
56s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
13s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
6s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
10s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
59s{color} | {color:green} YARN-1011 passed {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 14m 
39s{color} | {color:red} branch has errors when building and testing our client 
artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
50s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} YARN-1011 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
3s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  3m  3s{color} 
| {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server generated 16 
new + 86 unchanged - 0 fixed = 102 total (was 86) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  0s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 1 new + 
203 unchanged - 0 fixed = 204 total (was 203) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 12m 
18s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  1m 15s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  1m 20s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 84m 26s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8932 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12946290/YARN-8932-YARN-1011.02.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 77cdecc1e6f3 4.4.0-134-generic #160~14.04.1-Ubuntu SMP Fri Aug 
17 11:07:07 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | YARN-1011 / f3d08c7 |
|

[jira] [Commented] (YARN-8404) Timeline event publish need to be async to avoid Dispatcher thread leak in case ATS is down

2018-10-31 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671017#comment-16671017
 ] 

Rohith Sharma K S commented on YARN-8404:
-

Yes, this needs to be backported. 

> Timeline event publish need to be async to avoid Dispatcher thread leak in 
> case ATS is down
> ---
>
> Key: YARN-8404
> URL: https://issues.apache.org/jira/browse/YARN-8404
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.2
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Blocker
> Fix For: 2.10.0, 3.2.0, 3.1.1, 3.0.4
>
> Attachments: YARN-8404.01.patch
>
>
> It is observed that if the ATS1/1.5 daemon is not running, RM recovery is delayed 
> until the timeline client times out for each application. By default, the timeout 
> takes around 5 minutes. If there are many completed applications, the amount of 
> time the RM will wait is *(number of completed applications in the cluster * 5 
> minutes)*, which effectively hangs the RM. 
> The primary reason for this behavior is YARN-3044 and YARN-4129, which refactored 
> the existing system metrics publisher. This refactoring made the appFinished event 
> synchronous, whereas it was asynchronous earlier.
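
For illustration only, here is a minimal, self-contained sketch of the synchronous vs. asynchronous publishing pattern described above (hypothetical names, not the actual SystemMetricsPublisher or TimelineClient API): a blocking publish stalls the caller for the full client timeout per completed application, while handing the event off to a dispatcher thread returns immediately.
{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AppFinishedPublisherSketch {
  // single dispatcher thread that absorbs slow or timed-out publish calls
  private final ExecutorService dispatcher = Executors.newSingleThreadExecutor();

  // synchronous variant: RM recovery blocks here once per completed application
  void appFinishedSync(String appId) {
    publishToTimelineService(appId);
  }

  // asynchronous variant: hand off and return immediately, so recovery time no
  // longer scales with (number of completed applications * client timeout)
  void appFinishedAsync(String appId) {
    dispatcher.submit(() -> publishToTimelineService(appId));
  }

  // placeholder for the timeline client call that can block until it times out
  private void publishToTimelineService(String appId) {
    System.out.println("publishing appFinished for " + appId);
  }

  void stop() throws InterruptedException {
    dispatcher.shutdown();
    dispatcher.awaitTermination(30, TimeUnit.SECONDS);
  }
}
{code}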



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7765) [Atsv2] GSSException: No valid credentials provided - Failed to find any Kerberos tgt thrown by Timelinev2Client & HBaseClient in NM

2018-10-31 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671016#comment-16671016
 ] 

Rohith Sharma K S commented on YARN-7765:
-

Yes, this needs to be backported to branch-2.9 because Kerberos support for 
ATSv2 exists in branch-2.9.

> [Atsv2] GSSException: No valid credentials provided - Failed to find any 
> Kerberos tgt thrown by Timelinev2Client & HBaseClient in NM
> 
>
> Key: YARN-7765
> URL: https://issues.apache.org/jira/browse/YARN-7765
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Sumana Sathish
>Assignee: Rohith Sharma K S
>Priority: Blocker
> Fix For: 3.1.0, 2.10.0, 3.0.1
>
> Attachments: YARN-7765.01.patch, YARN-7765.02.patch
>
>
> A secure cluster is deployed and all YARN services are started successfully. 
> When an application is submitted, the app collectors, which are started as an 
> aux-service, throw the exception below. This exception is *NOT* observed from the 
> RM TimelineCollector. 
> The cluster is deployed as a secure cluster with Hadoop-3.0 and HBase-1.2.6. All 
> the YARN and HBase services are started and working fine. After 24 hours, i.e. 
> when the token lifetime has expired, the HBaseClient in the NM and the HDFSClient 
> in HMaster and HRegionServer start getting this error. After some time, the HBase 
> daemons shut down. In the NM, the JVM didn't shut down, but none of the events got 
> published.
> {noformat}
> 2018-01-17 11:04:48,017 FATAL ipc.RpcClientImpl (RpcClientImpl.java:run(684)) 
> - SASL authentication failed. The most likely cause is missing or invalid 
> credentials. Consider 'kinit'.
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> {noformat}
> cc :/ [~vrushalic] [~varun_saxena] 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8914) Add xtermjs to YARN UI2

2018-10-31 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671007#comment-16671007
 ] 

Hadoop QA commented on YARN-8914:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
11s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 18m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 14m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
31m 15s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui 
hadoop-client-modules/hadoop-client-minicluster . {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m 
55s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
57s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
3m 21s{color} | {color:orange} root: The patch generated 1 new + 3 unchanged - 
0 fixed = 4 total (was 3) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 13m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
5s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 34s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui 
hadoop-client-modules/hadoop-client-minicluster . {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  6m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}156m 43s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
54s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}320m 38s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.TestContainerManager |
|   | 
hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |

[jira] [Commented] (YARN-8778) Add Command Line interface to invoke interactive docker shell

2018-10-31 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671005#comment-16671005
 ] 

Hadoop QA commented on YARN-8778:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
25s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 14m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
30m  7s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project hadoop-client-modules/hadoop-client-minicluster . {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m 
32s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
42s{color} | {color:green} root: The patch generated 0 new + 322 unchanged - 1 
fixed = 322 total (was 323) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 13m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
4s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 27s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project hadoop-client-modules/hadoop-client-minicluster . {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}166m  1s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
48s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}331m 33s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
|   | hadoop.hdfs.TestDFSInotifyEventInputStreamKerberized |
|   | 
hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8778 

[jira] [Commented] (YARN-7225) Add queue and partition info to RM audit log

2018-10-31 Thread Jonathan Hung (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670962#comment-16670962
 ] 

Jonathan Hung commented on YARN-7225:
-

BTW the trunk patch looks good to me, but I found that neither this one nor the 
branch-2.8 one applies to branch-2/branch-2.9. Do you mind uploading a patch for 
these?

> Add queue and partition info to RM audit log
> 
>
> Key: YARN-7225
> URL: https://issues.apache.org/jira/browse/YARN-7225
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.9.1, 2.8.4, 3.0.2, 3.1.1
>Reporter: Jonathan Hung
>Assignee: Eric Payne
>Priority: Major
> Attachments: YARN-7225.001.patch, YARN-7225.002.patch, 
> YARN-7225.003.patch, YARN-7225.004.patch, YARN-7225.005.patch, 
> YARN-7225.branch-2.8.001.patch
>
>
> Right now the RM audit log has fields such as user, ip, resource, etc. Having 
> queue and partition is useful for resource tracking.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-7225) Add queue and partition info to RM audit log

2018-10-31 Thread Jonathan Hung (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670926#comment-16670926
 ] 

Jonathan Hung edited comment on YARN-7225 at 11/1/18 12:22 AM:
---

Thanks [~eepayne]. IMO #2 makes the most sense; at least for our use case this 
information will be used for resource tracking, so it seems most useful if it 
tracks the partition the container actually ran on - not sure if you have a similar 
use case. Furthermore, I think #1 introduces other issues since the AM partition is 
often not the same as the partition of the non-AM containers, whether that is a 
non-AM container's requested partition or the partition it ran on.


was (Author: jhung):
Thanks [~eepayne], IMO #2 makes the most sense, at least for our use case this 
information will be used for resource tracking so it seems most useful if it 
tracks what partition it actually ran on. Furthermore I think #1 introduces 
other issues since the AM partition is often not the same as the partition of 
the non-AM containers, whether it be this non-AM container's requested 
partition or the partition it ran on.

> Add queue and partition info to RM audit log
> 
>
> Key: YARN-7225
> URL: https://issues.apache.org/jira/browse/YARN-7225
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.9.1, 2.8.4, 3.0.2, 3.1.1
>Reporter: Jonathan Hung
>Assignee: Eric Payne
>Priority: Major
> Attachments: YARN-7225.001.patch, YARN-7225.002.patch, 
> YARN-7225.003.patch, YARN-7225.004.patch, YARN-7225.005.patch, 
> YARN-7225.branch-2.8.001.patch
>
>
> Right now the RM audit log has fields such as user, ip, resource, etc. Having 
> queue and partition is useful for resource tracking.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-7225) Add queue and partition info to RM audit log

2018-10-31 Thread Jonathan Hung (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670926#comment-16670926
 ] 

Jonathan Hung edited comment on YARN-7225 at 11/1/18 12:21 AM:
---

Thanks [~eepayne], IMO #2 makes the most sense, at least for our use case this 
information will be used for resource tracking so it seems most useful if it 
tracks what partition it actually ran on. Furthermore I think #1 introduces 
other issues since the AM partition is often not the same as the partition of 
the non-AM containers, whether it be this non-AM container's requested 
partition or the partition it ran on.


was (Author: jhung):
Thanks [~eepayne], IMO #2 makes the most sense, at least for our use case this 
information will be used for resource tracking so it seems most useful if it 
tracks what partition it actually ran on. Furthermore I think #1 introduces 
other issues since the AM partition is often not the same as the partition of 
the non-AM containers.

> Add queue and partition info to RM audit log
> 
>
> Key: YARN-7225
> URL: https://issues.apache.org/jira/browse/YARN-7225
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.9.1, 2.8.4, 3.0.2, 3.1.1
>Reporter: Jonathan Hung
>Assignee: Eric Payne
>Priority: Major
> Attachments: YARN-7225.001.patch, YARN-7225.002.patch, 
> YARN-7225.003.patch, YARN-7225.004.patch, YARN-7225.005.patch, 
> YARN-7225.branch-2.8.001.patch
>
>
> Right now the RM audit log has fields such as user, ip, resource, etc. Having 
> queue and partition is useful for resource tracking.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7225) Add queue and partition info to RM audit log

2018-10-31 Thread Jonathan Hung (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670926#comment-16670926
 ] 

Jonathan Hung commented on YARN-7225:
-

Thanks [~eepayne], IMO #2 makes the most sense, at least for our use case this 
information will be used for resource tracking so it seems most useful if it 
tracks what partition it actually ran on. Furthermore I think #1 introduces 
other issues since the AM partition is often not the same as the partition of 
the non-AM containers.

> Add queue and partition info to RM audit log
> 
>
> Key: YARN-7225
> URL: https://issues.apache.org/jira/browse/YARN-7225
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.9.1, 2.8.4, 3.0.2, 3.1.1
>Reporter: Jonathan Hung
>Assignee: Eric Payne
>Priority: Major
> Attachments: YARN-7225.001.patch, YARN-7225.002.patch, 
> YARN-7225.003.patch, YARN-7225.004.patch, YARN-7225.005.patch, 
> YARN-7225.branch-2.8.001.patch
>
>
> Right now the RM audit log has fields such as user, ip, resource, etc. Having 
> queue and partition is useful for resource tracking.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app

2018-10-31 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670851#comment-16670851
 ] 

Hadoop QA commented on YARN-8958:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
28s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 28s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  9s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}105m 51s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}164m 21s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8958 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12946245/YARN-8958.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 31032a523279 3.13.0-144-generic #193-Ubuntu SMP Thu Mar 15 
17:03:53 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 6668c19 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/22388/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/22388/testReport/ |
| Max. process+thread count | 944 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreComm

[jira] [Commented] (YARN-7225) Add queue and partition info to RM audit log

2018-10-31 Thread Eric Payne (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670827#comment-16670827
 ] 

Eric Payne commented on YARN-7225:
--

Thanks [~jhung]. The behavior is slightly different depending on whether we are 
using {{appAMNodePartitionName}} vs. { {{partition}}, 
{{schedulerContainer.getSchedulerNode().getPartition()}}, 
{{node.getPartition()}} }:
1) {{appAMNodePartitionName}} will always be the partition of the application.
2) If the other methods are used to get the partition AND non-exclusive labels are 
in use, it will be the label of the node on which the container ran.

I think #2 is technically the correct behavior, but I also think that it may be 
confusing if the application was submitted to a queue that does not explicitly 
allow that label. Thoughts?
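
For illustration only, a minimal sketch of what option #2 could look like when assembling an audit line (hypothetical helper and field names, not the actual RMAuditLogger API): the partition recorded is the label of the node the container actually ran on, next to the existing user/ip/resource/queue fields.
{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

public class AuditLineSketch {
  // builds a tab-separated audit line; field names here are illustrative only
  static String buildAuditLine(String user, String ip, String resource,
                               String queue, String nodePartition) {
    Map<String, String> fields = new LinkedHashMap<>();
    fields.put("USER", user);
    fields.put("IP", ip);
    fields.put("RESOURCE", resource);
    fields.put("QUEUE", queue);
    // option #2: the label of the node the container ran on, which can differ
    // from the application's AM partition when non-exclusive labels are in play
    fields.put("PARTITION", nodePartition);

    StringBuilder sb = new StringBuilder();
    fields.forEach((k, v) -> sb.append(k).append('=').append(v).append('\t'));
    return sb.toString().trim();
  }

  public static void main(String[] args) {
    System.out.println(buildAuditLine(
        "alice", "10.0.0.1", "<memory:2048, vCores:1>", "root.analytics", "gpu"));
  }
}
{code}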

> Add queue and partition info to RM audit log
> 
>
> Key: YARN-7225
> URL: https://issues.apache.org/jira/browse/YARN-7225
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.9.1, 2.8.4, 3.0.2, 3.1.1
>Reporter: Jonathan Hung
>Assignee: Eric Payne
>Priority: Major
> Attachments: YARN-7225.001.patch, YARN-7225.002.patch, 
> YARN-7225.003.patch, YARN-7225.004.patch, YARN-7225.005.patch, 
> YARN-7225.branch-2.8.001.patch
>
>
> Right now the RM audit log has fields such as user, ip, resource, etc. Having 
> queue and partition is useful for resource tracking.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8854) Upgrade jquery datatable version references to v1.10.19

2018-10-31 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670743#comment-16670743
 ] 

Hudson commented on YARN-8854:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15340 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/15340/])
YARN-8854. Upgrade jquery datatable version references to v1.10.19. (sunilg: 
rev d36012b69f01c9ddfd2e95545d1f5e1fbc1c3236)
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/css/custom_datatable.css
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/images/favicon.ico
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/images/sort_desc_disabled.png
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/images/sort_asc_disabled.png
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/css/demo_table.css
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/images/back_enabled.jpg
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/images/sort_desc.png
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/view/JQueryUI.java
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/images/back_enabled.jpg
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/images/sort_both.png
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/images/forward_disabled.jpg
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/images/forward_enabled.jpg
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/images/Sorting
 icons.psd
* (edit) LICENSE.txt
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/images/forward_disabled.jpg
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/images/sort_desc.png
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/css/demo_page.css
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/js/jquery.dataTables.min.js
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/images/sort_asc.png
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/images/sort_desc_disabled.png
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/images/sort_asc_disabled.png
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/images/forward_enabled.jpg
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/images/sort_both.png
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/pom.xml
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/images/sort_asc.png
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/css/jui-dt.css
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/css/jui-dt.css
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/css/demo_page.css
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/images/back_disabled.jpg
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/images/Sorting
 icons.psd
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/images/favicon.ico
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/css/demo_table.css
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/css/jquery.dataTables.css
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/images/back_disabled.jpg
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/js/jquery.dataTables.min.js


> Upgrade jquery datatable version references to v1.10.19
> ---
>
> Key: YARN-8854

[jira] [Commented] (YARN-6729) Clarify documentation on how to enable cgroup support

2018-10-31 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670744#comment-16670744
 ] 

Hudson commented on YARN-6729:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15340 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/15340/])
YARN-6729. Clarify documentation on how to enable cgroup support. (skumpf: rev 
277a3d8d9fe1127c75452d083ff7859c603e686d)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManagerCgroups.md


> Clarify documentation on how to enable cgroup support
> -
>
> Key: YARN-6729
> URL: https://issues.apache.org/jira/browse/YARN-6729
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Yufei Gu
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-6729-trunk.001.patch
>
>
> NM percentage-physical-cpu-limit is not honored in 
> DefaultLCEResourcesHandler, which may cause container CPU usage calculation 
> issues, e.g. container vcore usage can potentially exceed 100% if 
> percentage-physical-cpu-limit is set to a value less than 100. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8776) Container Executor change to create stdin/stdout pipeline

2018-10-31 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670740#comment-16670740
 ] 

Hadoop QA commented on YARN-8776:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
26s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 51s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 0 new + 81 unchanged - 3 fixed = 81 total (was 84) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 14s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 43s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 79m  5s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.TestContainerManager |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8776 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12946237/YARN-8776.006.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux 7a807e9fc42b 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 6668c19 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/22389/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results

[jira] [Commented] (YARN-8932) ResourceUtilization cpu is misused in oversubscription as a percentage

2018-10-31 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670737#comment-16670737
 ] 

Hadoop QA commented on YARN-8932:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m  
4s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} YARN-1011 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
52s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 
29s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
55s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
24s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
5s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 24s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
6s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} YARN-1011 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
36s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  3m 36s{color} 
| {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server generated 16 
new + 86 unchanged - 0 fixed = 102 total (was 86) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 11s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 1 new + 
203 unchanged - 0 fixed = 204 total (was 203) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 36s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 21m 32s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 72m 
15s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}179m 41s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.TestNMProxy |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8932 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12946290/YARN-8932-YARN-1011.02.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 0a371aeadbcb 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |

[jira] [Commented] (YARN-8902) Add volume manager that manages CSI volume lifecycle

2018-10-31 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670720#comment-16670720
 ] 

Hadoop QA commented on YARN-8902:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
35s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 16s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
38s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 57s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 4 new + 
57 unchanged - 0 fixed = 61 total (was 57) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 41s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
22s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}106m 23s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}179m 43s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestOpportunisticContainerAllocatorAMService 
|
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8902 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12946327/YARN-8902.007.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux d4f8734eb0f3 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | 

[jira] [Commented] (YARN-8761) Service AM support for decommissioning component instances

2018-10-31 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670722#comment-16670722
 ] 

Hadoop QA commented on YARN-8761:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
40s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
4s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 11m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
33m 35s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 
44s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  8m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
43s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 34s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 16 new + 392 unchanged - 1 fixed = 408 total (was 393) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 10s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
26s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 25m 16s{color} 
| {color:red} hadoop-yarn-client in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 15m 
19s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
54s{color} | {color:green} hadoop-yarn-services-api in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
22s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {col

[jira] [Commented] (YARN-8932) ResourceUtilization cpu is misused in oversubscription as a percentage

2018-10-31 Thread Robert Kanter (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670670#comment-16670670
 ] 

Robert Kanter commented on YARN-8932:
-

+1 pending Jenkins

> ResourceUtilization cpu is misused in oversubscription as a percentage
> --
>
> Key: YARN-8932
> URL: https://issues.apache.org/jira/browse/YARN-8932
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: YARN-1011
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Major
> Attachments: YARN-8932-YARN-1011.00.patch, 
> YARN-8932-YARN-1011.01.patch, YARN-8932-YARN-1011.02.patch
>
>
> The ResourceUtilization javadoc mistakenly documents the cpu as a percentage 
> represented by a float number in [0, 1.0f]; however, in reality it is used as 
> the number of vcores used.
> See javadoc and discussion in YARN-8911.
>   /**
>    * Get CPU utilization.
>    *
>    * @return CPU utilization normalized to 1 CPU
>    */
>   @Public
>   @Unstable
>   public abstract float getCPU();
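
A minimal, self-contained illustration of why the two readings diverge on a 
multi-core node. The class name and sample numbers below are hypothetical and 
only mirror the javadoc text quoted above; this is not code from the patch.
{code:java}
// Hypothetical demo, not from the patch: contrast the javadoc's reading of
// getCPU() (a fraction of one CPU in [0, 1.0f]) with its actual use (vcores).
public class CpuUtilizationReadingDemo {

  public static void main(String[] args) {
    int nodeVcores = 8;       // assumed node capacity
    float reportedCpu = 6.0f; // what the value actually carries today: vcores used

    // Javadoc reading: a fraction of a single CPU, capped at 1.0f.
    float asFraction = Math.min(reportedCpu, 1.0f);

    // Actual usage: a vcore count, so node-level utilization is count / capacity.
    float nodeUtilization = reportedCpu / nodeVcores;

    System.out.printf("javadoc reading: %.2f of one CPU%n", asFraction);
    System.out.printf("actual reading:  %.1f vcores used, %.0f%% of the node%n",
        reportedCpu, 100.0f * nodeUtilization);
  }
}
{code}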



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8916) Define a constant "docker" string in "ContainerRuntimeConstants.java" for better maintainability

2018-10-31 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-8916:
--
Labels: Docker  (was: )

> Define a constant "docker" string in "ContainerRuntimeConstants.java" for 
> better maintainability
> 
>
> Key: YARN-8916
> URL: https://issues.apache.org/jira/browse/YARN-8916
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Minor
>  Labels: Docker
> Fix For: 3.3.0, 3.2.1
>
> Attachments: YARN-8916-trunk.001.patch
>
>
> There are several hard-coded "docker" strings in the code. It is better to use 
> a constant string in "ContainerRuntimeConstants" to make this container type 
> easier to reference and maintain.
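
A minimal sketch of what the proposed constant could look like. The field name 
and its exact placement are assumptions here; the real change is in the 
attached patch.
{code:java}
// Hypothetical sketch only; the real ContainerRuntimeConstants class already
// exists in the NM code base and contains other constants as well.
public final class ContainerRuntimeConstants {

  /** Canonical name of the Docker container runtime type. */
  public static final String DOCKER_RUNTIME_TYPE = "docker";

  private ContainerRuntimeConstants() {
    // constants holder, never instantiated
  }
}

// Call sites would then compare against the constant instead of a literal:
//   if (ContainerRuntimeConstants.DOCKER_RUNTIME_TYPE.equals(runtimeType)) { ... }
{code}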



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8927) Better handling of "docker.trusted.registries" in container-executor's "trusted_image_check" function

2018-10-31 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-8927:
--
Labels: Docker  (was: )

> Better handling of "docker.trusted.registries" in container-executor's 
> "trusted_image_check" function
> -
>
> Key: YARN-8927
> URL: https://issues.apache.org/jira/browse/YARN-8927
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
>  Labels: Docker
>
> There are some missing cases that we need to catch when handling 
> "docker.trusted.registries".
> The container-executor.cfg configuration is as follows:
> {code:java}
> docker.trusted.registries=tangzhankun,ubuntu,centos{code}
> It works if we run DistributedShell with "tangzhankun/tensorflow":
> {code:java}
> "yarn ... -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker -shell_env 
> YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=tangzhankun/tensorflow
> {code}
> But running a DistributedShell job with "centos", "centos[:tagName]", "ubuntu", 
> or "ubuntu[:tagName]" fails. The error message is like:
> {code:java}
> "image: centos is not trusted"
> {code}
> We need to handle the above cases better.
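
The real "trusted_image_check" logic lives in container-executor, which is C 
code; the Java sketch below is only an illustration of the kind of matching 
being asked for here (bare image names and "name:tag" forms should still match 
a trusted entry). The class and method names are made up.
{code:java}
import java.util.Arrays;
import java.util.List;

// Illustrative only: the real check lives in container-executor (C), not Java.
public class TrustedImageCheckSketch {

  static boolean isTrusted(String image, List<String> trustedRegistries) {
    // Strip an optional ":tag" suffix, e.g. "centos:7" -> "centos".
    String withoutTag = image.contains(":")
        ? image.substring(0, image.indexOf(':')) : image;

    if (!withoutTag.contains("/")) {
      // No registry/namespace prefix ("centos", "ubuntu"): look up the bare
      // name itself, which is the case this report says currently fails.
      return trustedRegistries.contains(withoutTag);
    }
    // Otherwise compare the prefix before the first '/' ("tangzhankun/tensorflow").
    String prefix = withoutTag.substring(0, withoutTag.indexOf('/'));
    return trustedRegistries.contains(prefix);
  }

  public static void main(String[] args) {
    List<String> trusted = Arrays.asList("tangzhankun", "ubuntu", "centos");
    System.out.println(isTrusted("tangzhankun/tensorflow", trusted)); // true
    System.out.println(isTrusted("centos:7", trusted));               // should be true
  }
}
{code}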



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8949) Document usage of "library" in "docker.trusted.repositories" to trust local image

2018-10-31 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-8949:
--
Labels: Docker  (was: )

> Document usage of "library" in "docker.trusted.repositories" to trust local 
> image
> -
>
> Key: YARN-8949
> URL: https://issues.apache.org/jira/browse/YARN-8949
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
>  Labels: Docker
>
> As we have reached a solution on how the code changes (YARN-8927) will improve 
> this trusted repo feature, the usage of "library" in 
> "docker.trusted.repositories" in the current implementation should be documented.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8867) Retrieve the status of resource localization

2018-10-31 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670503#comment-16670503
 ] 

Chandni Singh commented on YARN-8867:
-

{quote}
Is this status going to show up in Yarn Service JSON or some other mechanism to 
surface to the end user?
{quote}
I don't have any plans yet to expose this in the Yarn Service JSON. It will be 
used by the Service AM to find out when the localization is complete. We can 
expose it in the Yarn Service JSON when the need arises.

{quote}
The status definition may also include a state for not yet started, like 
PENDING.
{quote}
Right now I am using the information in the {{ResourceSet}} class. All the 
resources in {{pendingResources}} have their state marked as {{IN_PROGRESS}}. In 
order to differentiate further between resources for which localization has 
started and the ones which are still queued, we will have to extract the 
information from {{ResourceLocalizationService}}. 
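
For illustration, a hedged sketch of the kind of per-resource status model 
being discussed, with an explicit PENDING value. The type and field names below 
are assumptions, not taken from the WIP patch.
{code:java}
// Hypothetical sketch of the status model being discussed; names are not from the patch.
public class LocalizationStatus {

  public enum State {
    PENDING,      // queued, not started yet (needs info from ResourceLocalizationService)
    IN_PROGRESS,  // localization has started
    COMPLETED,    // resource is available on the node
    FAILED        // localization failed
  }

  private final String resourceKey;
  private final State state;
  private final String diagnostics;

  public LocalizationStatus(String resourceKey, State state, String diagnostics) {
    this.resourceKey = resourceKey;
    this.state = state;
    this.diagnostics = diagnostics;
  }

  public String getResourceKey() { return resourceKey; }
  public State getState() { return state; }
  public String getDiagnostics() { return diagnostics; }
}
{code}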

> Retrieve the status of resource localization
> 
>
> Key: YARN-8867
> URL: https://issues.apache.org/jira/browse/YARN-8867
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-8867.wip.patch
>
>
> Refer YARN-3854.
> Currently NM does not have an API to retrieve the status of localization. 
> Unless the client can know when the localization of a resource is complete 
> irrespective of the type of the resource, it cannot take any appropriate 
> action. 
> We need an API in {{ContainerManagementProtocol}} to retrieve the status on 
> the localization. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8960) Can't get submarine service status using the command of "yarn app -status" under security environment

2018-10-31 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670418#comment-16670418
 ] 

Hadoop QA commented on YARN-8960:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 39s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 15s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine: 
The patch generated 7 new + 41 unchanged - 0 fixed = 48 total (was 41) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 23s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
32s{color} | {color:green} hadoop-yarn-submarine in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 59m 45s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8960 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12946367/YARN-8960.003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 9e28c47e065c 3.13.0-144-generic #193-Ubuntu SMP Thu Mar 15 
17:03:53 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 478b2cb |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/22384/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-submarine.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/22384/testReport/ |
| Max. process+thread count | 306 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine 
U: 
hadoop-yarn-

[jira] [Updated] (YARN-8838) Add security check for container user is same as websocket user

2018-10-31 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-8838:

Attachment: YARN-8838.003.patch

> Add security check for container user is same as websocket user
> ---
>
> Key: YARN-8838
> URL: https://issues.apache.org/jira/browse/YARN-8838
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: docker
> Attachments: YARN-8838.001.patch, YARN-8838.002.patch, 
> YARN-8838.003.patch
>
>
> When the user is authenticated via the SPNEGO entry point, the node manager must 
> verify that the remote user is the same as the container user before starting the 
> web socket session. One possible solution is to verify that the web request user 
> matches the yarn container local directory owner during onWebSocketConnect.
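
A minimal sketch, assuming the check is expressed as "authenticated remote user 
equals the owner of the container's local directory". The class, method, and 
the exact hook point are assumptions, not taken from the attached patches.
{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative only; the real check would live in the NM web socket handler.
public class ContainerUserCheckSketch {

  /**
   * Returns true when the SPNEGO-authenticated remote user owns the
   * container's local directory, which is the condition described above.
   */
  static boolean isSameAsContainerUser(String remoteUser, String containerLocalDir)
      throws IOException {
    Path dir = Paths.get(containerLocalDir);
    String dirOwner = Files.getOwner(dir).getName();
    return dirOwner.equals(remoteUser);
  }
}
{code}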



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app

2018-10-31 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670258#comment-16670258
 ] 

Weiwei Yang commented on YARN-8958:
---

Hi [~Tao Yang]

Thanks for creating the issue and the fix. I am trying to understand this 
issue and have a question about the UT.

In {{testSchedulableEntitiesLeak}}, why is the app attempt finished, but you then 
try to recover a container for this app? I suppose by then all containers 
of this app attempt are done, correct?

> Schedulable entities leak in fair ordering policy when recovering containers 
> between remove app attempt and remove app
> --
>
> Key: YARN-8958
> URL: https://issues.apache.org/jira/browse/YARN-8958
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.2.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8958.001.patch, YARN-8958.002.patch
>
>
> We found an NPE in ClientRMService#getApplications when querying apps with a 
> specified queue. The cause is that there is one app which can't be found by 
> calling RMContextImpl#getRMApps (it is finished and swapped out of memory) but 
> can still be queried from the fair ordering policy.
> To reproduce the schedulable entities leak in the fair ordering policy:
> (1) create app1 and launch container1 on node1
> (2) restart RM
> (3) remove app1 attempt, app1 is removed from the schedulable entities.
> (4) recover container1, then the state of container1 is changed to COMPLETED; 
> app1 is brought back into entitiesToReorder after the container is released, 
> and then app1 will be added back into the schedulable entities when the 
> scheduler calls FairOrderingPolicy#getAssignmentIterator.
> (5) remove app1
> To solve this problem, we should make sure schedulableEntities can only be 
> affected by adding or removing an app attempt; a new entity should not be added 
> into schedulableEntities by the reordering process.
> {code:java}
>   protected void reorderSchedulableEntity(S schedulableEntity) {
> //remove, update comparable data, and reinsert to update position in order
> schedulableEntities.remove(schedulableEntity);
> updateSchedulingResourceUsage(
>   schedulableEntity.getSchedulingResourceUsage());
> schedulableEntities.add(schedulableEntity);
>   }
> {code}
> The related code above can be improved as follows to make sure only an existing 
> entity can be re-added into schedulableEntities.
> {code:java}
>   protected void reorderSchedulableEntity(S schedulableEntity) {
> //remove, update comparable data, and reinsert to update position in order
> boolean exists = schedulableEntities.remove(schedulableEntity);
> updateSchedulingResourceUsage(
>   schedulableEntity.getSchedulingResourceUsage());
> if (exists) {
>   schedulableEntities.add(schedulableEntity);
> } else {
>   LOG.info("Skip reordering non-existent schedulable entity: "
>   + schedulableEntity.getId());
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8960) Can't get submarine service status using the command of "yarn app -status" under security environment

2018-10-31 Thread Zac Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zac Zhou updated YARN-8960:
---
Attachment: YARN-8960.003.patch

> Can't get submarine service status using the command of "yarn app -status" 
> under security environment
> -
>
> Key: YARN-8960
> URL: https://issues.apache.org/jira/browse/YARN-8960
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zac Zhou
>Assignee: Zac Zhou
>Priority: Major
> Attachments: YARN-8960.001.patch, YARN-8960.002.patch, 
> YARN-8960.003.patch
>
>
> After submitting a submarine job, we tried to get service status using the 
> following command:
> yarn app -status ${service_name}
> But we got the following error:
> HTTP error code : 500
>  
> The stack trace in the resourcemanager log is:
> ERROR org.apache.hadoop.yarn.service.webapp.ApiServer: Get service failed: {}
> java.lang.reflect.UndeclaredThrowableException
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1748)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.getServiceFromClient(ApiServer.java:800)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.getService(ApiServer.java:186)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at 
> com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
>  at 
> com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker
> ._dispatch(AbstractResourceMethodDispatchProvider.java:205)
>  at 
> com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodD
> ispatcher.java:75)
>  at 
> com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302)
>  at 
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
>  at 
> com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
>  at 
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
>  at 
> com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
>  at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542)
>  at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473)
>  at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419)
>  at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409)
>  at 
> com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409)
>  at 
> com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558)
>  at 
> com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>  at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
>  at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
>  at 
> com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:89)
>  at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941)
>  at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebAppFilter.doFilter(RMWebAppFilter.java:179)
>  at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829)
>  at 
> com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
>  at 
> com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119)
>  at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133)
>  at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130)
>  at com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203)
>  at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
>  at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>  at 
> org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57)
>  at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>  at 
> org.apache.hadoop.security.authentication.server.AuthenticationFi

[jira] [Commented] (YARN-8954) Reservations list field in ReservationListInfo is not accessible

2018-10-31 Thread Oleksandr Shevchenko (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669832#comment-16669832
 ] 

Oleksandr Shevchenko commented on YARN-8954:


The failed tests are not related to the changes. All tests in TestRMAppTransitions, 
TestCapacitySchedulerMetrics, and TestFairScheduler passed successfully locally.

Could someone review the patch?
Thanks!

> Reservations list field in ReservationListInfo is not accessible
> 
>
> Key: YARN-8954
> URL: https://issues.apache.org/jira/browse/YARN-8954
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager, restapi
>Reporter: Oleksandr Shevchenko
>Priority: Minor
> Attachments: YARN-8954.001.patch
>
>
> We need to add a getter for the reservations list field since the field cannot 
> be accessed after unmarshalling. A similar problem is described in YARN-2280.
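
A hedged sketch of the shape of such a change: expose the JAXB-mapped list 
through a getter so it can be read after unmarshalling. Everything beyond the 
class name in the summary (field name, annotations, the {{ReservationInfo}} 
element type) is an assumption here, not taken from the attached patch.
{code:java}
import java.util.ArrayList;
import java.util.List;
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;

// Hypothetical shape only; the real class lives in the RM webapp DAO package.
@XmlRootElement(name = "reservations")
@XmlAccessorType(XmlAccessType.FIELD)
public class ReservationListInfo {

  @XmlElement(name = "reservations")
  private List<ReservationInfo> reservations = new ArrayList<>();

  // The missing piece this issue asks for: without a getter the unmarshalled
  // list cannot be read by client code.
  public List<ReservationInfo> getReservations() {
    return reservations;
  }
}

// Placeholder for the per-reservation DAO referenced above.
class ReservationInfo { }
{code}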



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7277) Container Launch expand environment needs to consider bracket matching

2018-10-31 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669692#comment-16669692
 ] 

Weiwei Yang commented on YARN-7277:
---

Hi [~tangzhankun]

Thanks for creating this issue [~balloons] and thanks for the patch 
[~tangzhankun].

The v3 patch doesn't seem to work on Windows, can you please check?

Thanks

> Container Launch expand environment needs to consider bracket matching
> --
>
> Key: YARN-7277
> URL: https://issues.apache.org/jira/browse/YARN-7277
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: balloons
>Assignee: Zhankun Tang
>Priority: Critical
> Attachments: YARN-7277-trunk.001.patch, YARN-7277-trunk.002.patch, 
> YARN-7277-trunk.003.patch
>
>
> The Spark application I submitted always failed, and I finally found that the 
> commands I specified to launch the AM container were changed by the NM.
> *The following is an excerpt of the command I submitted to the RM:*
> {code:java}
> *'{\"handler\":\"FILLER\",\"inputTable\":\"engine_arch.adult_train\",\"outputTable\":[\"ether_features_filler_\$experimentId_\$taskId_out0\"],\"params\":{\"age\":{\"param\":[\"0\"]}}}'*
> {code}
> *The following is an excerpt from the corresponding command used when I 
> observed the NM launch the container:*
> {code:java}
> *'{\"handler\":\"FILLER\",\"inputTable\":\"engine_arch.adult_train\",\"outputTable\":[\"ether_features_filler_\$experimentId_\$taskId_out0\"],\"params\":{\"age\":{\"param\":[\"0\"]}*
> {code}
> Finally, I found that the NM performs the following transformation when 
> launching the container, which leads to this situation:
> {code:java}
> @VisibleForTesting
>   public static String expandEnvironment(String var,
>   Path containerLogDir) {
> var = var.replace(ApplicationConstants.LOG_DIR_EXPANSION_VAR,
>   containerLogDir.toString());
> var =  var.replace(ApplicationConstants.CLASS_PATH_SEPARATOR,
>   File.pathSeparator);
> // replace parameter expansion marker. e.g. {{VAR}} on Windows is replaced
> // as %VAR% and on Linux replaced as "$VAR"
> if (Shell.WINDOWS) {
>   var = var.replaceAll("(\\{\\{)|(\\}\\})", "%");
> } else {
>   var = var.replace(ApplicationConstants.PARAMETER_EXPANSION_LEFT, "$");
>   *var = var.replace(ApplicationConstants.PARAMETER_EXPANSION_RIGHT, "");*
> }
> return var;
>   }
> {code}
> I think this is a bug: the substitution does not consider the pairing of 
> "*PARAMETER_EXPANSION_LEFT*" and "*PARAMETER_EXPANSION_RIGHT*", but simply 
> performs a blind replacement of each marker.
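
For comparison, a hedged sketch of a pairing-aware expansion on the Linux path: 
only matched double-brace pairs are rewritten to $VAR, so stray braces inside 
user data (like the JSON above) are left alone. This is only an illustration of 
the idea, not the attached patch, and the class name is made up.
{code:java}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only, not the actual patch: expand only matched {{...}} pairs
// so unpaired braces inside user data (e.g. JSON) are left untouched.
public class PairedExpansionSketch {

  private static final Pattern PAIRED = Pattern.compile("\\{\\{([^{}]+)\\}\\}");

  static String expandLinux(String var) {
    Matcher m = PAIRED.matcher(var);
    StringBuffer sb = new StringBuffer();
    while (m.find()) {
      // Rewrite {{VAR}} as $VAR; quoteReplacement keeps any '$' in VAR literal.
      m.appendReplacement(sb, Matcher.quoteReplacement("$" + m.group(1)));
    }
    m.appendTail(sb);
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(expandLinux("{{LOG_DIR}}/stdout"));          // $LOG_DIR/stdout
    System.out.println(expandLinux("{\"params\":{\"age\":[0]}}}")); // unchanged
  }
}
{code}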



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8961) [UI2] Flow Run End Time shows 'Invalid date'

2018-10-31 Thread Charan Hebri (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669690#comment-16669690
 ] 

Charan Hebri commented on YARN-8961:


[~akhilpb] below is the response for flow runs,
{noformat}
[
 {
 "metrics": [],
 "events": [],
 "createdtime": 1540969639076,
 "idprefix": 0,
 "id": "hrt_qa@test_flow/1540969639076",
 "type": "YARN_FLOW_RUN",
 "info": {
 "UID": "yarn-cluster!hrt_qa!test_flow!1540969639076",
 "SYSTEM_INFO_FLOW_NAME": "test_flow",
 "SYSTEM_INFO_FLOW_RUN_ID": 1540969639076,
 "SYSTEM_INFO_USER": "hrt_qa",
 "FROM_ID": "yarn-cluster!hrt_qa!test_flow!1540969639076"
 },
 "isrelatedto": {},
 "relatesto": {}
 },
 {
 "metrics": [],
 "events": [],
 "createdtime": 1540969221139,
 "idprefix": 0,
 "id": "hrt_qa@test_flow/1540969221139",
 "type": "YARN_FLOW_RUN",
 "info": {
 "UID": "yarn-cluster!hrt_qa!test_flow!1540969221139",
 "SYSTEM_INFO_FLOW_RUN_END_TIME": 1540969587649,
 "SYSTEM_INFO_FLOW_NAME": "test_flow",
 "SYSTEM_INFO_FLOW_RUN_ID": 1540969221139,
 "SYSTEM_INFO_USER": "hrt_qa",
 "FROM_ID": "yarn-cluster!hrt_qa!test_flow!1540969221139"
 },
 "isrelatedto": {},
 "relatesto": {}
 }
]{noformat}
You can see that for a flow that hasn't completed yet, 
SYSTEM_INFO_FLOW_RUN_END_TIME is not available in the response. 

 

> [UI2] Flow Run End Time shows 'Invalid date'
> 
>
> Key: YARN-8961
> URL: https://issues.apache.org/jira/browse/YARN-8961
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Charan Hebri
>Assignee: Akhil PB
>Priority: Major
> Attachments: Invalid_Date.png
>
>
> End Time for Flow Runs is shown as *Invalid date* for runs that are in 
> progress. This should be shown as *N/A*, just as it is for 'CPU 
> VCores' and 'Memory Used'. The relevant screenshot is attached.
> cc [~akhilpb]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org