[jira] [Commented] (YARN-10265) Upgrade Netty-all dependency to latest version 4.1.50 to fix ARM support issue

2020-05-13 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106928#comment-17106928
 ] 

Hadoop QA commented on YARN-10265:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
44s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 18m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
59m 28s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
28s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 17m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
5s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 43s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
24s{color} | {color:green} hadoop-project in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m  
9s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}110m 57s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
51s{color} | {color:green} hadoop-yarn-csi in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
53s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}223m 22s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-YARN-Build/26030/artifact/out/Dockerfile
 |
| JIRA Issue | YARN-10265 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13002874/YARN-10265.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient xml |
| uname | Linux 4b3fc8e53de9 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git 

[jira] [Commented] (YARN-9849) Leaf queues not inheriting parent queue status after adding status as “RUNNING” and thereafter, commenting the same in capacity-scheduler.xml

2020-05-13 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106901#comment-17106901
 ] 

Bilwa S T commented on YARN-9849:
-

cc [~inigoiri]

> Leaf queues not inheriting parent queue status after adding status as 
> “RUNNING” and thereafter, commenting the same in capacity-scheduler.xml 
> --
>
> Key: YARN-9849
> URL: https://issues.apache.org/jira/browse/YARN-9849
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Sushanta Sen
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9849.001.patch, YARN-9849.002.patch, 
> YARN-9849.003.patch
>
>
> 【Precondition】:
> 1. Install the cluster
> 2. Configure queues, e.g. 2 parent queues [default, q1] & 4 leaf queues [q2, q3]
> 3. Cluster should be up and running
> 【Test step】:
> 1. By default, leaf queues inherit the parent queue's status
> 2. Change the leaf queues' status to "RUNNING" explicitly
> 3. Run the refresh command; the leaf queues' status is shown as "RUNNING" in CLI/UI
> 4. Thereafter, change the leaf queues' status to "STOPPED"
> 5. Run the refresh command; the leaf queues' status is shown as "STOPPED" in CLI/UI
> 6. Now comment out the leaf queues' status and run refresh queues
> 7. Observe
> 【Expect Output】:
> The leaf queues' status should be displayed as "RUNNING", inherited from the 
> parent queue.
> 【Actual Output】:
> The leaf queues' status is still displayed as "STOPPED" rather than being 
> inherited from the parent queue, which is RUNNING.
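
For reference, a minimal capacity-scheduler.xml sketch of the scenario described above (queue names and values are illustrative assumptions; the relevant key is yarn.scheduler.capacity.<queue-path>.state):

{code:xml}
<!-- Parent queue left at its default RUNNING state -->
<property>
  <name>yarn.scheduler.capacity.root.q1.state</name>
  <value>RUNNING</value>
</property>

<!-- Step 4: explicitly stop a leaf queue, then run "yarn rmadmin -refreshQueues" -->
<property>
  <name>yarn.scheduler.capacity.root.q1.q2.state</name>
  <value>STOPPED</value>
</property>

<!-- Step 6: commenting the leaf-queue property out and refreshing again should
     make q2 fall back to the parent's RUNNING state; this issue reports that
     it stays STOPPED instead -->
{code}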



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10265) Upgrade Netty-all dependency to latest version 4.1.50 to fix ARM support issue

2020-05-13 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106890#comment-17106890
 ] 

Surendra Singh Lilhore commented on YARN-10265:
---

LGTM, will wait for build report...

> Upgrade Netty-all dependency to latest version 4.1.50 to fix ARM support issue
> --
>
> Key: YARN-10265
> URL: https://issues.apache.org/jira/browse/YARN-10265
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: liusheng
>Priority: Major
> Attachments: YARN-10265.001.patch
>
>
> In YARN-9898 (https://issues.apache.org/jira/browse/YARN-9898) we added a 
> workaround using a non-officially released netty-4.1.48 to fix the ARM support 
> issue. A few hours ago, Netty released version 4.1.50, which officially 
> supports the ARM platform; please see: 
> [https://github.com/netty/netty/pull/9804]
>  
> netty-4.1.50.Final release: 
> [https://github.com/netty/netty/releases/tag/netty-4.1.50.Final]
> commits from netty-4.1.48 to netty-4.1.50: 
> [https://github.com/netty/netty/compare/netty-4.1.48.Final...netty-4.1.50.Final]
> So it is better to upgrade Hadoop's Netty dependency to version 4.1.50.
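
For context, a minimal sketch of what such a version bump typically looks like in hadoop-project/pom.xml (the exact property name used by Hadoop's build is an assumption here, not taken from the actual patch):

{code:xml}
<!-- Sketch only: the property name is assumed for illustration -->
<properties>
  <netty.all.version>4.1.50.Final</netty.all.version>
</properties>

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>io.netty</groupId>
      <artifactId>netty-all</artifactId>
      <version>${netty.all.version}</version>
    </dependency>
  </dependencies>
</dependencyManagement>
{code}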



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10248) when config allowed-gpu-devices , excluded GPUs still be visible to containers

2020-05-13 Thread zhao yufei (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106852#comment-17106852
 ] 

zhao yufei commented on YARN-10248:
---

[~tangzhankun] yes. Also, this test class supports FakeGpuDiscoveryBinary, but 
lookUpAutoDiscoveryBinary in class GpuDiscoverer does not support it.


> when config allowed-gpu-devices , excluded GPUs still be visible to containers
> --
>
> Key: YARN-10248
> URL: https://issues.apache.org/jira/browse/YARN-10248
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.1
>Reporter: zhao yufei
>Assignee: zhao yufei
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.2.1
>
> Attachments: YARN-10248-branch-3.2.001.path, 
> YARN-10248-branch-3.2.001.path
>
>
> I have a server with two GPUs, and I want to use only one of them within the 
> yarn cluster.
> According to the hadoop documentation, I set these configs:
> {code:java}
> <property>
>   <name>yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices</name>
>   <value>0:1</value>
> </property>
> <property>
>   <name>yarn.nodemanager.resource-plugins.gpu.path-to-discovery-executables</name>
>   <value>/etc/alternatives/x86_64-linux-gnu_nvidia_smi</value>
> </property>
> {code}
> then I ran the following command to test:
> {code:java}
> yarn jar 
> ./share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.1.jar \
>  -jar 
> ./share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.1.jar  
> -shell_command ' nvidia-smi & sleep 3  ' \
>  -container_resources memory-mb=3072,vcores=1,yarn.io/gpu=1  \
>  -num_containers 1 -queue yufei -node_label_expression slaves
> {code}
> I expected the gpu with minor number 0 would not be visible to the container, 
> but in the launched container, nvidia-smi prints information for both gpus.
> I checked the related source code and found it is a bug.
> The problem is:
> when you specify allowed-gpu-devices, GpuDiscoverer populates the usable gpus 
> from it;
> then, when some of the gpus are assigned to a container, it sets the denied 
> gpus for that container,
> but it never considers the excluded gpus of the host.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-10248) when config allowed-gpu-devices , excluded GPUs still be visible to containers

2020-05-13 Thread zhao yufei (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106852#comment-17106852
 ] 

zhao yufei edited comment on YARN-10248 at 5/14/20, 3:47 AM:
-

[~tangzhankun] no, I found another test issue:
 this test class supports FakeGpuDiscoveryBinary, but lookUpAutoDiscoveryBinary 
in class GpuDiscoverer does not support it.



was (Author: jasstionzyf):
[~tangzhankun] yes. Also, this test class supports FakeGpuDiscoveryBinary, but 
lookUpAutoDiscoveryBinary in class GpuDiscoverer does not support it.


> when config allowed-gpu-devices , excluded GPUs still be visible to containers
> --
>
> Key: YARN-10248
> URL: https://issues.apache.org/jira/browse/YARN-10248
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.1
>Reporter: zhao yufei
>Assignee: zhao yufei
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.2.1
>
> Attachments: YARN-10248-branch-3.2.001.path, 
> YARN-10248-branch-3.2.001.path
>
>
> I have a server with two GPUs, and I want to use only one of them within the 
> yarn cluster.
> According to the hadoop documentation, I set these configs:
> {code:java}
> <property>
>   <name>yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices</name>
>   <value>0:1</value>
> </property>
> <property>
>   <name>yarn.nodemanager.resource-plugins.gpu.path-to-discovery-executables</name>
>   <value>/etc/alternatives/x86_64-linux-gnu_nvidia_smi</value>
> </property>
> {code}
> then I ran the following command to test:
> {code:java}
> yarn jar 
> ./share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.1.jar \
>  -jar 
> ./share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.1.jar  
> -shell_command ' nvidia-smi & sleep 3  ' \
>  -container_resources memory-mb=3072,vcores=1,yarn.io/gpu=1  \
>  -num_containers 1 -queue yufei -node_label_expression slaves
> {code}
> I expected the gpu with minor number 0 would not be visible to the container, 
> but in the launched container, nvidia-smi prints information for both gpus.
> I checked the related source code and found it is a bug.
> The problem is:
> when you specify allowed-gpu-devices, GpuDiscoverer populates the usable gpus 
> from it;
> then, when some of the gpus are assigned to a container, it sets the denied 
> gpus for that container,
> but it never considers the excluded gpus of the host.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10265) Upgrade Netty-all dependency to latest version 4.1.50 to fix ARM support issue

2020-05-13 Thread liusheng (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liusheng updated YARN-10265:

Description: 
In YARN-9898 (https://issues.apache.org/jira/browse/YARN-9898) we added a 
workaround using a non-officially released netty-4.1.48 to fix the ARM support 
issue. A few hours ago, Netty released version 4.1.50, which officially 
supports the ARM platform; please see: 
[https://github.com/netty/netty/pull/9804]

netty-4.1.50.Final release: 
[https://github.com/netty/netty/releases/tag/netty-4.1.50.Final]

commits from netty-4.1.48 to netty-4.1.50: 
[https://github.com/netty/netty/compare/netty-4.1.48.Final...netty-4.1.50.Final]

So it is better to upgrade Hadoop's Netty dependency to version 4.1.50.

  was:
In YARN-9898 (https://issues.apache.org/jira/browse/YARN-9898) we added a 
workaround using a non-officially released netty-4.1.48 to fix the ARM support 
issue. A few hours ago, Netty released version 4.1.50, which officially 
supports the ARM platform; please see: 
[https://github.com/netty/netty/pull/9804]

netty-4.1.50.Final release: 
[https://github.com/netty/netty/releases/tag/netty-4.1.50.Final]

commits from netty-4.1.48 to netty-4.1.50: 
[https://github.com/netty/netty/compare/netty-4.1.48.Final...netty-4.1.50.Final|https://github.com/netty/netty/compare/netty-4.1.49.Final...netty-4.1.50.Final]

So it is better to upgrade Hadoop's Netty dependency to version 4.1.50.


> Upgrade Netty-all dependency to latest version 4.1.50 to fix ARM support issue
> --
>
> Key: YARN-10265
> URL: https://issues.apache.org/jira/browse/YARN-10265
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: liusheng
>Priority: Major
>
> In YARN-9898 (https://issues.apache.org/jira/browse/YARN-9898) we added a 
> workaround using a non-officially released netty-4.1.48 to fix the ARM support 
> issue. A few hours ago, Netty released version 4.1.50, which officially 
> supports the ARM platform; please see: 
> [https://github.com/netty/netty/pull/9804]
>  
> netty-4.1.50.Final release: 
> [https://github.com/netty/netty/releases/tag/netty-4.1.50.Final]
> commits from netty-4.1.48 to netty-4.1.50: 
> [https://github.com/netty/netty/compare/netty-4.1.48.Final...netty-4.1.50.Final]
> So it is better to upgrade Hadoop's Netty dependency to version 4.1.50.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10265) Upgrade Netty-all dependency to latest version 4.1.50 to fix ARM support issue

2020-05-13 Thread liusheng (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liusheng updated YARN-10265:

Description: 
In YARN-9898 (https://issues.apache.org/jira/browse/YARN-9898) we added a 
workaround using a non-officially released netty-4.1.48 to fix the ARM support 
issue. A few hours ago, Netty released version 4.1.50, which officially 
supports the ARM platform; please see: 
[https://github.com/netty/netty/pull/9804]

netty-4.1.50.Final release: 
[https://github.com/netty/netty/releases/tag/netty-4.1.50.Final]

commits from netty-4.1.48 to netty-4.1.50: 
[https://github.com/netty/netty/compare/netty-4.1.48.Final...netty-4.1.50.Final|https://github.com/netty/netty/compare/netty-4.1.49.Final...netty-4.1.50.Final]

So it is better to upgrade Hadoop's Netty dependency to version 4.1.50.

  was:
In YARN-9898 (https://issues.apache.org/jira/browse/YARN-9898) we added a 
workaround using a non-officially released netty-4.1.49 to fix the ARM support 
issue. A few hours ago, Netty released version 4.1.50, which officially 
supports the ARM platform; please see: 
[https://github.com/netty/netty/pull/9804]

netty-4.1.50.Final release: 
[https://github.com/netty/netty/releases/tag/netty-4.1.50.Final]

commits from netty-4.1.49 to netty-4.1.50: 
[https://github.com/netty/netty/compare/netty-4.1.49.Final...netty-4.1.50.Final]

So it is better to upgrade Hadoop's Netty dependency to version 4.1.50.


> Upgrade Netty-all dependency to latest version 4.1.50 to fix ARM support issue
> --
>
> Key: YARN-10265
> URL: https://issues.apache.org/jira/browse/YARN-10265
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: liusheng
>Priority: Major
>
> In YARN-9898 (https://issues.apache.org/jira/browse/YARN-9898) we added a 
> workaround using a non-officially released netty-4.1.48 to fix the ARM support 
> issue. A few hours ago, Netty released version 4.1.50, which officially 
> supports the ARM platform; please see: 
> [https://github.com/netty/netty/pull/9804]
>  
> netty-4.1.50.Final release: 
> [https://github.com/netty/netty/releases/tag/netty-4.1.50.Final]
> commits from netty-4.1.48 to netty-4.1.50: 
> [https://github.com/netty/netty/compare/netty-4.1.48.Final...netty-4.1.50.Final|https://github.com/netty/netty/compare/netty-4.1.49.Final...netty-4.1.50.Final]
> So it is better to upgrade Hadoop's Netty dependency to version 4.1.50.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10265) Upgrade Netty-all dependency to latest version 4.1.50 to fix ARM support issue

2020-05-13 Thread liusheng (Jira)
liusheng created YARN-10265:
---

 Summary: Upgrade Netty-all dependency to latest version 4.1.50 to 
fix ARM support issue
 Key: YARN-10265
 URL: https://issues.apache.org/jira/browse/YARN-10265
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: liusheng


In YARN-9898 (https://issues.apache.org/jira/browse/YARN-9898) we added a 
workaround using a non-officially released netty-4.1.49 to fix the ARM support 
issue. A few hours ago, Netty released version 4.1.50, which officially 
supports the ARM platform; please see: 
[https://github.com/netty/netty/pull/9804]

netty-4.1.50.Final release: 
[https://github.com/netty/netty/releases/tag/netty-4.1.50.Final]

commits from netty-4.1.49 to netty-4.1.50: 
[https://github.com/netty/netty/compare/netty-4.1.49.Final...netty-4.1.50.Final]

So it is better to upgrade Hadoop's Netty dependency to version 4.1.50.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9898) Dependency netty-all-4.1.27.Final doesn't support ARM platform

2020-05-13 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106614#comment-17106614
 ] 

Hudson commented on YARN-9898:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18254 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18254/])
YARN-9898. Dependency netty-all-4.1.27.Final doesn't support ARM (ayushsaxena: 
rev 0918433b4da1affbe380988b8f63fca39bc0850b)
* (edit) hadoop-project/pom.xml
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-csi/pom.xml
* (edit) hadoop-hdfs-project/hadoop-hdfs-client/pom.xml
* (edit) hadoop-hdfs-project/hadoop-hdfs/pom.xml


> Dependency netty-all-4.1.27.Final doesn't support ARM platform
> --
>
> Key: YARN-9898
> URL: https://issues.apache.org/jira/browse/YARN-9898
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: liusheng
>Assignee: liusheng
>Priority: Major
> Fix For: 3.3.0, 3.4.0
>
> Attachments: YARN-9898.001.patch, YARN-9898.002.patch, 
> YARN-9898.003.patch, YARN-9898.004.patch
>
>
> Hadoop depends on the Netty package, but *netty-all-4.1.27.Final* from the 
> io.netty maven repo does not support the ARM platform. 
> When running the test *TestCsiClient.testIdentityService* on an ARM server, it 
> raises an error like the following:
> {code:java}
> Caused by: java.io.FileNotFoundException: 
> META-INF/native/libnetty_transport_native_epoll_aarch_64.so
> at 
> io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:161)
> ... 45 more
> Suppressed: java.lang.UnsatisfiedLinkError: no 
> netty_transport_native_epoll_aarch_64 in java.library.path
> at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
> at java.lang.Runtime.loadLibrary0(Runtime.java:870)
> at java.lang.System.loadLibrary(System.java:1122)
> at 
> io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:38)
> at 
> io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:243)
> at 
> io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:124)
> ... 45 more
> Suppressed: java.lang.UnsatisfiedLinkError: no 
> netty_transport_native_epoll_aarch_64 in java.library.path
> at 
> java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
> at java.lang.Runtime.loadLibrary0(Runtime.java:870)
> at java.lang.System.loadLibrary(System.java:1122)
> at 
> io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:38)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> io.netty.util.internal.NativeLibraryLoader$1.run(NativeLibraryLoader.java:263)
> at java.security.AccessController.doPrivileged(Native 
> Method)
> at 
> io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:255)
> at 
> io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:233)
> ... 46 more
> {code}
>  
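
For background, the missing libnetty_transport_native_epoll_aarch_64.so above comes from Netty's classified native-transport artifacts, which only gained a linux-aarch_64 build in the 4.1.50.Final timeframe. A minimal pom.xml sketch of pulling that classified jar explicitly (an illustrative assumption, not the actual Hadoop fix):

{code:xml}
<!-- Sketch only: depend on the aarch64 epoll native transport directly -->
<dependency>
  <groupId>io.netty</groupId>
  <artifactId>netty-transport-native-epoll</artifactId>
  <version>4.1.50.Final</version>
  <classifier>linux-aarch_64</classifier>
</dependency>
{code}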



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9898) Dependency netty-all-4.1.27.Final doesn't support ARM platform

2020-05-13 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106603#comment-17106603
 ] 

Ayush Saxena commented on YARN-9898:


Committed to trunk, branch-3.3 and 3.3.0

Thanx [~seanlau]  for the contribution!!!

> Dependency netty-all-4.1.27.Final doesn't support ARM platform
> --
>
> Key: YARN-9898
> URL: https://issues.apache.org/jira/browse/YARN-9898
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: liusheng
>Assignee: liusheng
>Priority: Major
> Attachments: YARN-9898.001.patch, YARN-9898.002.patch, 
> YARN-9898.003.patch, YARN-9898.004.patch
>
>
> Hadoop depends on the Netty package, but *netty-all-4.1.27.Final* from the 
> io.netty maven repo does not support the ARM platform. 
> When running the test *TestCsiClient.testIdentityService* on an ARM server, it 
> raises an error like the following:
> {code:java}
> Caused by: java.io.FileNotFoundException: 
> META-INF/native/libnetty_transport_native_epoll_aarch_64.so
> at 
> io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:161)
> ... 45 more
> Suppressed: java.lang.UnsatisfiedLinkError: no 
> netty_transport_native_epoll_aarch_64 in java.library.path
> at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
> at java.lang.Runtime.loadLibrary0(Runtime.java:870)
> at java.lang.System.loadLibrary(System.java:1122)
> at 
> io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:38)
> at 
> io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:243)
> at 
> io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:124)
> ... 45 more
> Suppressed: java.lang.UnsatisfiedLinkError: no 
> netty_transport_native_epoll_aarch_64 in java.library.path
> at 
> java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
> at java.lang.Runtime.loadLibrary0(Runtime.java:870)
> at java.lang.System.loadLibrary(System.java:1122)
> at 
> io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:38)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> io.netty.util.internal.NativeLibraryLoader$1.run(NativeLibraryLoader.java:263)
> at java.security.AccessController.doPrivileged(Native 
> Method)
> at 
> io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:255)
> at 
> io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:233)
> ... 46 more
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10229) [Federation] Client should be able to submit application to RM directly using normal client conf

2020-05-13 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106566#comment-17106566
 ] 

Hadoop QA commented on YARN-10229:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 40s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
21s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
19s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 23s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 3 new + 165 unchanged - 1 fixed = 168 total (was 166) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 20s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m 
59s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 86m  9s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-YARN-Build/26029/artifact/out/Dockerfile
 |
| JIRA Issue | YARN-10229 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13002833/YARN-10229.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient findbugs checkstyle |
| uname | Linux c8795019f11d 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 108ecf992f0 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| checkstyle | 

[jira] [Resolved] (YARN-8552) [DS] Container report fails for distributed containers

2020-05-13 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T resolved YARN-8552.
-
Resolution: Cannot Reproduce

> [DS]  Container report fails for distributed containers
> ---
>
> Key: YARN-8552
> URL: https://issues.apache.org/jira/browse/YARN-8552
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin Chundatt
>Assignee: Bilwa S T
>Priority: Major
>
> 2018-07-19 19:15:02,281 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_1531994217928_0003_01_1099511627753 Container Transitioned from 
> ACQUIRED to RUNNING
> 2018-07-19 19:15:02,384 ERROR 
> org.apache.hadoop.yarn.server.webapp.ContainerBlock: Failed to read the 
> container container_1531994217928_0003_01_1099511627773.
> Container report fails for Distributed Scheduler containers. Currently all 
> the containers are fetched from the central RM, so we need to find an 
> alternative for the same.
> {code}
> at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
> at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.yarn.exceptions.ContainerNotFoundException: 
> Container with id 'container_1531994217928_0003_01_1099511627773' doesn't 
> exist in RM.
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getContainerReport(ClientRMService.java:499)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RMContainerBlock.getContainerReport(RMContainerBlock.java:44)
> at 
> org.apache.hadoop.yarn.server.webapp.ContainerBlock$1.run(ContainerBlock.java:82)
> at 
> org.apache.hadoop.yarn.server.webapp.ContainerBlock$1.run(ContainerBlock.java:79)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
> ... 70 more
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-10128) [FederationSecurity] YARN RMAdmin commands fail when Authorization is enabled on router

2020-05-13 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T resolved YARN-10128.
--
Resolution: Duplicate

> [FederationSecurity] YARN RMAdmin commands fail when Authorization is enabled 
> on router
> ---
>
> Key: YARN-10128
> URL: https://issues.apache.org/jira/browse/YARN-10128
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
>
> Exception thrown is 
> {quote}Protocol interface 
> org.apache.hadoop.yarn.server.api.ResourceManagerAdministrationProtocolPB is 
> not known., while invoking 
> ResourceManagerAdministrationProtocolPBClientImpl.refreshQueues over rm2 
> after 1 failover attempts. Trying to failover after sleeping for 44717ms.
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8942) PriorityBasedRouterPolicy throws exception if all sub-cluster weights have negative value

2020-05-13 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106498#comment-17106498
 ] 

Hudson commented on YARN-8942:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18251 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18251/])
YARN-8942. PriorityBasedRouterPolicy throws exception if all sub-cluster 
(inigoiri: rev 108ecf992f0004dd64a7143d1c400de1361b13f3)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/federation/policies/router/TestPriorityRouterPolicy.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/federation/policies/router/PriorityRouterPolicy.java


> PriorityBasedRouterPolicy throws exception if all sub-cluster weights have 
> negative value
> -
>
> Key: YARN-8942
> URL: https://issues.apache.org/jira/browse/YARN-8942
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Akshay Agarwal
>Assignee: Bilwa S T
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: YARN-8942.001.patch, YARN-8942.002.patch
>
>
> In *PriorityBasedRouterPolicy*, if all sub-cluster weights are *set to 
> negative values*, it throws an exception while running a job.
> Ideally it should handle negative priorities as well, according to the 
> policy's home sub-cluster selection process.
>  *Exception Details:*
> {code:java}
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Unable 
> to insert the ApplicationId application_1540356760422_0015 into the 
> FederationStateStore
> at 
> org.apache.hadoop.yarn.server.router.RouterServerUtil.logAndThrowException(RouterServerUtil.java:56)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:418)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.submitApplication(RouterClientRMService.java:218)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:282)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:579)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
> Caused by: 
> org.apache.hadoop.yarn.server.federation.store.exception.FederationStateStoreInvalidInputException:
>  Missing SubCluster Id information. Please try again by specifying Subcluster 
> Id information.
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationMembershipStateStoreInputValidator.checkSubClusterId(FederationMembershipStateStoreInputValidator.java:247)
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationApplicationHomeSubClusterStoreInputValidator.checkApplicationHomeSubCluster(FederationApplicationHomeSubClusterStoreInputValidator.java:160)
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationApplicationHomeSubClusterStoreInputValidator.validate(FederationApplicationHomeSubClusterStoreInputValidator.java:65)
> at 
> org.apache.hadoop.yarn.server.federation.store.impl.ZookeeperFederationStateStore.addApplicationHomeSubCluster(ZookeeperFederationStateStore.java:159)
> at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> at 
> 

[jira] [Commented] (YARN-6526) Refactoring SQLFederationStateStore by avoiding to recreate a connection at every call

2020-05-13 Thread Jira


[ 
https://issues.apache.org/jira/browse/YARN-6526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106481#comment-17106481
 ] 

Íñigo Goiri commented on YARN-6526:
---

+1 on [^YARN-6526.007.patch].
The findbugs warning seems unrelated.

> Refactoring SQLFederationStateStore by avoiding to recreate a connection at 
> every call
> --
>
> Key: YARN-6526
> URL: https://issues.apache.org/jira/browse/YARN-6526
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: federation
>Reporter: Giovanni Matteo Fumarola
>Assignee: Bilwa S T
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-6526.001.patch, YARN-6526.002.patch, 
> YARN-6526.003.patch, YARN-6526.004.patch, YARN-6526.005.patch, 
> YARN-6526.006.patch, YARN-6526.007.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8942) PriorityBasedRouterPolicy throws exception if all sub-cluster weights have negative value

2020-05-13 Thread Jira


[ 
https://issues.apache.org/jira/browse/YARN-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106480#comment-17106480
 ] 

Íñigo Goiri commented on YARN-8942:
---

Thanks [~BilwaST] for the patch.
Committed to trunk.

> PriorityBasedRouterPolicy throws exception if all sub-cluster weights have 
> negative value
> -
>
> Key: YARN-8942
> URL: https://issues.apache.org/jira/browse/YARN-8942
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Akshay Agarwal
>Assignee: Bilwa S T
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: YARN-8942.001.patch, YARN-8942.002.patch
>
>
> In *PriorityBasedRouterPolicy*, if all sub-cluster weights are *set to 
> negative values*, it throws an exception while running a job.
> Ideally it should handle negative priorities as well, according to the 
> policy's home sub-cluster selection process.
>  *Exception Details:*
> {code:java}
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Unable 
> to insert the ApplicationId application_1540356760422_0015 into the 
> FederationStateStore
> at 
> org.apache.hadoop.yarn.server.router.RouterServerUtil.logAndThrowException(RouterServerUtil.java:56)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:418)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.submitApplication(RouterClientRMService.java:218)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:282)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:579)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
> Caused by: 
> org.apache.hadoop.yarn.server.federation.store.exception.FederationStateStoreInvalidInputException:
>  Missing SubCluster Id information. Please try again by specifying Subcluster 
> Id information.
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationMembershipStateStoreInputValidator.checkSubClusterId(FederationMembershipStateStoreInputValidator.java:247)
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationApplicationHomeSubClusterStoreInputValidator.checkApplicationHomeSubCluster(FederationApplicationHomeSubClusterStoreInputValidator.java:160)
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationApplicationHomeSubClusterStoreInputValidator.validate(FederationApplicationHomeSubClusterStoreInputValidator.java:65)
> at 
> org.apache.hadoop.yarn.server.federation.store.impl.ZookeeperFederationStateStore.addApplicationHomeSubCluster(ZookeeperFederationStateStore.java:159)
> at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> at com.sun.proxy.$Proxy84.addApplicationHomeSubCluster(Unknown Source)
> at 
> org.apache.hadoop.yarn.server.federation.utils.FederationStateStoreFacade.addApplicationHomeSubCluster(FederationStateStoreFacade.java:402)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:413)
> ... 11 more
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (YARN-8942) PriorityBasedRouterPolicy throws exception if all sub-cluster weights have negative value

2020-05-13 Thread Jira


 [ 
https://issues.apache.org/jira/browse/YARN-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated YARN-8942:
--
Affects Version/s: 3.3.0

> PriorityBasedRouterPolicy throws exception if all sub-cluster weights have 
> negative value
> -
>
> Key: YARN-8942
> URL: https://issues.apache.org/jira/browse/YARN-8942
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Akshay Agarwal
>Assignee: Bilwa S T
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: YARN-8942.001.patch, YARN-8942.002.patch
>
>
> In *PriorityBasedRouterPolicy*, if all sub-cluster weights are *set to 
> negative values*, it throws an exception while running a job.
> Ideally it should handle negative priorities as well, according to the 
> policy's home sub-cluster selection process.
>  *Exception Details:*
> {code:java}
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Unable 
> to insert the ApplicationId application_1540356760422_0015 into the 
> FederationStateStore
> at 
> org.apache.hadoop.yarn.server.router.RouterServerUtil.logAndThrowException(RouterServerUtil.java:56)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:418)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.submitApplication(RouterClientRMService.java:218)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:282)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:579)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
> Caused by: 
> org.apache.hadoop.yarn.server.federation.store.exception.FederationStateStoreInvalidInputException:
>  Missing SubCluster Id information. Please try again by specifying Subcluster 
> Id information.
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationMembershipStateStoreInputValidator.checkSubClusterId(FederationMembershipStateStoreInputValidator.java:247)
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationApplicationHomeSubClusterStoreInputValidator.checkApplicationHomeSubCluster(FederationApplicationHomeSubClusterStoreInputValidator.java:160)
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationApplicationHomeSubClusterStoreInputValidator.validate(FederationApplicationHomeSubClusterStoreInputValidator.java:65)
> at 
> org.apache.hadoop.yarn.server.federation.store.impl.ZookeeperFederationStateStore.addApplicationHomeSubCluster(ZookeeperFederationStateStore.java:159)
> at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> at com.sun.proxy.$Proxy84.addApplicationHomeSubCluster(Unknown Source)
> at 
> org.apache.hadoop.yarn.server.federation.utils.FederationStateStoreFacade.addApplicationHomeSubCluster(FederationStateStoreFacade.java:402)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:413)
> ... 11 more
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: 

[jira] [Updated] (YARN-8942) PriorityBasedRouterPolicy throws exception if all sub-cluster weights have negative value

2020-05-13 Thread Jira


 [ 
https://issues.apache.org/jira/browse/YARN-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated YARN-8942:
--
Fix Version/s: (was: 3.4.0)

> PriorityBasedRouterPolicy throws exception if all sub-cluster weights have 
> negative value
> -
>
> Key: YARN-8942
> URL: https://issues.apache.org/jira/browse/YARN-8942
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Akshay Agarwal
>Assignee: Bilwa S T
>Priority: Minor
> Attachments: YARN-8942.001.patch, YARN-8942.002.patch
>
>
> In *PriorityBasedRouterPolicy*, if all sub-cluster weights are *set to 
> negative values*, it throws an exception while running a job.
> Ideally it should handle negative priorities as well, according to the 
> policy's home sub-cluster selection process.
>  *Exception Details:*
> {code:java}
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Unable 
> to insert the ApplicationId application_1540356760422_0015 into the 
> FederationStateStore
> at 
> org.apache.hadoop.yarn.server.router.RouterServerUtil.logAndThrowException(RouterServerUtil.java:56)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:418)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.submitApplication(RouterClientRMService.java:218)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:282)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:579)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
> Caused by: 
> org.apache.hadoop.yarn.server.federation.store.exception.FederationStateStoreInvalidInputException:
>  Missing SubCluster Id information. Please try again by specifying Subcluster 
> Id information.
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationMembershipStateStoreInputValidator.checkSubClusterId(FederationMembershipStateStoreInputValidator.java:247)
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationApplicationHomeSubClusterStoreInputValidator.checkApplicationHomeSubCluster(FederationApplicationHomeSubClusterStoreInputValidator.java:160)
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationApplicationHomeSubClusterStoreInputValidator.validate(FederationApplicationHomeSubClusterStoreInputValidator.java:65)
> at 
> org.apache.hadoop.yarn.server.federation.store.impl.ZookeeperFederationStateStore.addApplicationHomeSubCluster(ZookeeperFederationStateStore.java:159)
> at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> at com.sun.proxy.$Proxy84.addApplicationHomeSubCluster(Unknown Source)
> at 
> org.apache.hadoop.yarn.server.federation.utils.FederationStateStoreFacade.addApplicationHomeSubCluster(FederationStateStoreFacade.java:402)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:413)
> ... 11 more
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: 

[jira] [Commented] (YARN-10229) [Federation] Client should be able to submit application to RM directly using normal client conf

2020-05-13 Thread Jira


[ 
https://issues.apache.org/jira/browse/YARN-10229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106475#comment-17106475
 ] 

Íñigo Goiri commented on YARN-10229:


[~subru], we discussed something like this in the past.
Do you have any feedback? It would be awesome to support this.

> [Federation] Client should be able to submit application to RM directly using 
> normal client conf
> 
>
> Key: YARN-10229
> URL: https://issues.apache.org/jira/browse/YARN-10229
> Project: Hadoop YARN
>  Issue Type: Wish
>  Components: amrmproxy, federation
>Affects Versions: 3.1.1
>Reporter: JohnsonGuo
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10229.001.patch
>
>
> Scenario: When the yarn federation feature is enabled with multiple yarn 
> clusters, one can submit jobs to the yarn-router by *modifying* the client 
> configuration with the yarn router address.
> But if one still wants to submit jobs via the original client (from before 
> federation was enabled) to the RM directly, it will hit the AMRMToken 
> exception. That means once federation is enabled, anyone who wants to submit a 
> job has to modify the client conf.
>  
> One possible solution for this scenario is:
> In the NodeManager, when the client ApplicationMaster request comes in (see 
> the config sketch below):
>  * get the client job.xml  from HDFS "".
>  * parse the "yarn.resourcemanager.scheduler.address" parameter in job.xml
>  * if the value of the parameter is "localhost:8049" (the AMRMProxy address), 
> then do the AMRMToken validation process
>  * if the value of the parameter is "rm:port" (the RM address), then skip the 
> AMRMToken validation process
>  
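
To make the two client configurations above concrete, a minimal sketch of the relevant client-side setting (the 8049 port comes from the description; "rm-host" and the 8030 RM scheduler port are illustrative placeholders):

{code:xml}
<!-- Federated client: ApplicationMaster scheduler traffic goes through the
     local AMRMProxy on the NodeManager -->
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>localhost:8049</value>
</property>

<!-- Original (pre-federation) client: scheduler traffic goes to the RM directly -->
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>rm-host:8030</value>
</property>
{code}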



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10229) [Federation] Client should be able to submit application to RM directly using normal client conf

2020-05-13 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10229:
-
Attachment: YARN-10229.001.patch

> [Federation] Client should be able to submit application to RM directly using 
> normal client conf
> 
>
> Key: YARN-10229
> URL: https://issues.apache.org/jira/browse/YARN-10229
> Project: Hadoop YARN
>  Issue Type: Wish
>  Components: amrmproxy, federation
>Affects Versions: 3.1.1
>Reporter: JohnsonGuo
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10229.001.patch
>
>
> Scenario: When the YARN federation feature is enabled across multiple YARN 
> clusters, one can submit a job to the yarn-router by *modifying* the client 
> configuration to point at the YARN Router address.
> But if one still wants to submit jobs to the RM directly via the original 
> client configuration (from before federation was enabled), the submission 
> fails with an AMRMToken exception. In other words, once federation is 
> enabled, anyone who wants to submit a job has to modify the client conf.
>  
> One possible solution for this scenario is, in the NodeManager, when the 
> client ApplicationMaster request comes in:
>  * get the client's job.xml from HDFS "".
>  * parse the "yarn.resourcemanager.scheduler.address" parameter in job.xml
>  * if the value of the parameter is "localhost:8049" (the AMRMProxy address), 
> then do the AMRMToken validation process
>  * if the value of the parameter is "rm:port" (the RM address), then skip the 
> AMRMToken validation process
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10264) Add container launch related env / classpath debug info to container logs when a container fails

2020-05-13 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10264:
--
Description: 
Sometimes when a container fails to launch, it can be pretty hard to figure out 
why it has failed.

Similar to YARN-4309, we can add a switch to control whether the environment 
variables and the Java classpath should be printed.
As a bonus, 
[jdeps|https://docs.oracle.com/javase/8/docs/technotes/tools/unix/jdeps.html] 
could also be utilized to print some verbose info about the classpath. 

When log aggregation occurs, all this information will automatically get 
collected and make debugging such container launch failures much easier.
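
As a rough illustration of such a switch (not the actual change; the property 
name below is purely hypothetical and is not an existing YARN configuration 
key), the NodeManager side could look roughly like this:

{code:java}
import java.util.Map;

import org.apache.hadoop.conf.Configuration;

public class ContainerLaunchDebugInfo {
  // Hypothetical switch, modeled after the YARN-4309 style debug settings.
  static final String PRINT_LAUNCH_DEBUG_INFO =
      "yarn.nodemanager.log-container-launch-debug-info.enabled";

  /**
   * Appends the container environment and the CLASSPATH to the container log
   * output, but only when the (hypothetical) switch above is enabled.
   */
  static void maybeAppendLaunchDebugInfo(Configuration conf,
      Map<String, String> env, StringBuilder out) {
    if (!conf.getBoolean(PRINT_LAUNCH_DEBUG_INFO, false)) {
      return;
    }
    out.append("Container launch environment:\n");
    for (Map.Entry<String, String> e : env.entrySet()) {
      out.append(e.getKey()).append('=').append(e.getValue()).append('\n');
    }
    out.append("Container CLASSPATH: ")
        .append(env.getOrDefault("CLASSPATH", "<not set>"))
        .append('\n');
  }
}
{code}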

Below is an example output when the user faces a classpath configuration issue 
while launching an application: 

{code:java}
End of LogType:prelaunch.err
**
2020-04-19 05:49:12,145 DEBUG:app_info:Diagnostics of the failed app
2020-04-19 05:49:12,145 DEBUG:app_info:Application 
application_1587300264561_0001 failed 2 times due to AM Container for 
appattempt_1587300264561_0001_02 exited with  exitCode: 1
Failing this attempt.Diagnostics: [2020-04-19 12:45:01.955]Exception from 
container-launch.
Container id: container_e60_1587300264561_0001_02_01
Exit code: 1
Exception message: Launch container failed
Shell output: main : command provided 1
main : run as user is systest
main : requested yarn user is systest
Getting exit code file...
Creating script paths...
Writing pid file...
Writing to tmp file 
/dataroot/ycloud/yarn/nm/nmPrivate/application_1587300264561_0001/container_e60_1587300264561_0001_02_01/container_e60_1587300264561_0001_02_01.pid.tmp
Writing to cgroup task files...
Creating local dirs...
Launching container...
Getting exit code file...
Creating script paths...


[2020-04-19 12:45:01.984]Container exited with a non-zero exit code 1. Error 
file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster

Please check whether your etc/hadoop/mapred-site.xml contains the below 
configuration:
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>

[2020-04-19 12:45:01.985]Container exited with a non-zero exit code 1. Error 
file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster

Please check whether your etc/hadoop/mapred-site.xml contains the below 
configuration:
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>

For more detailed output, check the application tracking page: 
http://quasar-plnefj-2.quasar-plnefj.root.hwx.site:8088/cluster/app/application_1587300264561_0001
 Then click on links to logs of each attempt.
...
2020-04-19 05:49:12,148 INFO:util:* End test_app_API (yarn.suite.YarnAPITests) *
{code}


  was:
Sometimes when a container fails to launch, it can be pretty hard to figure out 
why it failed.

Similar to YARN-4309, we can add a switch to control if the printing of 
environment variables and Java classpath should be done.
As a bonus, 
[jdeps|https://docs.oracle.com/javase/8/docs/technotes/tools/unix/jdeps.html] 
could also be utilized to print some verbose info about the classpath. 

When log aggregation occurs, all this information will automatically get 
collected and make debugging such container launch failures much easier.

Below is an example output when the user faces a classpath configuration issue: 

{code:java}
End of LogType:prelaunch.err
**
2020-04-19 05:49:12,145 DEBUG:app_info:Diagnostics of the failed app
2020-04-19 05:49:12,145 DEBUG:app_info:Application 
application_1587300264561_0001 failed 2 times due to AM Container for 
appattempt_1587300264561_0001_02 exited with  exitCode: 1
Failing this attempt.Diagnostics: [2020-04-19 12:45:01.955]Exception from 
container-launch.
Container id: container_e60_1587300264561_0001_02_01
Exit code: 1
Exception message: Launch container failed
Shell output: main : command provided 1
main : run as user is systest
main : requested yarn user is systest
Getting exit code file...
Creating script paths...
Writing pid file...
Writing to tmp file 

[jira] [Created] (YARN-10264) Add container launch related env / classpath debug info to container logs when a container fails

2020-05-13 Thread Szilard Nemeth (Jira)
Szilard Nemeth created YARN-10264:
-

 Summary: Add container launch related env / classpath debug info 
to container logs when a container fails
 Key: YARN-10264
 URL: https://issues.apache.org/jira/browse/YARN-10264
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Szilard Nemeth
Assignee: Szilard Nemeth


Sometimes when a container fails to launch, it can be pretty hard to figure out 
why it failed.

Similar to YARN-4309, we can add a switch to control whether the environment 
variables and the Java classpath should be printed.
As a bonus, 
[jdeps|https://docs.oracle.com/javase/8/docs/technotes/tools/unix/jdeps.html] 
could also be utilized to print some verbose info about the classpath. 

When log aggregation occurs, all this information will automatically get 
collected and make debugging such container launch failures much easier.

Below is an example output when the user faces a classpath configuration issue: 

{code:java}
End of LogType:prelaunch.err
**
2020-04-19 05:49:12,145 DEBUG:app_info:Diagnostics of the failed app
2020-04-19 05:49:12,145 DEBUG:app_info:Application 
application_1587300264561_0001 failed 2 times due to AM Container for 
appattempt_1587300264561_0001_02 exited with  exitCode: 1
Failing this attempt.Diagnostics: [2020-04-19 12:45:01.955]Exception from 
container-launch.
Container id: container_e60_1587300264561_0001_02_01
Exit code: 1
Exception message: Launch container failed
Shell output: main : command provided 1
main : run as user is systest
main : requested yarn user is systest
Getting exit code file...
Creating script paths...
Writing pid file...
Writing to tmp file 
/dataroot/ycloud/yarn/nm/nmPrivate/application_1587300264561_0001/container_e60_1587300264561_0001_02_01/container_e60_1587300264561_0001_02_01.pid.tmp
Writing to cgroup task files...
Creating local dirs...
Launching container...
Getting exit code file...
Creating script paths...


[2020-04-19 12:45:01.984]Container exited with a non-zero exit code 1. Error 
file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster

Please check whether your etc/hadoop/mapred-site.xml contains the below 
configuration:
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>

[2020-04-19 12:45:01.985]Container exited with a non-zero exit code 1. Error 
file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster

Please check whether your etc/hadoop/mapred-site.xml contains the below 
configuration:
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>

For more detailed output, check the application tracking page: 
http://quasar-plnefj-2.quasar-plnefj.root.hwx.site:8088/cluster/app/application_1587300264561_0001
 Then click on links to logs of each attempt.
...
2020-04-19 05:49:12,148 INFO:util:* End test_app_API (yarn.suite.YarnAPITests) *
{code}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10209) DistributedShell should initialize TimelineClient conditionally

2020-05-13 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106295#comment-17106295
 ] 

Bilwa S T commented on YARN-10209:
--

[~bteke] Okay

> DistributedShell should initialize TimelineClient conditionally
> ---
>
> Key: YARN-10209
> URL: https://issues.apache.org/jira/browse/YARN-10209
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Benjamin Teke
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10209.001.patch, YARN-10209.branch-2.6.0.patch, 
> YARN-10209.branch-2.6.0.v2.patch
>
>
> YarnConfiguration was changed along with the introduction of newer Timeline 
> Service versions to include configuration about the version in use. In Hadoop 
> 2.6.0 the distributed shell instantiates the Timeline Client whether or not it 
> is enabled in the configuration. Running this distributed shell on newer 
> Hadoop versions (where the new Timeline Service is available) causes an 
> exception, because the bundled YarnConfiguration doesn't have the necessary 
> version configuration property. By making the Timeline Client initialization 
> conditional, the distributed shell would at least run with the Timeline 
> Service disabled.
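
A minimal sketch of the conditional initialization described above, assuming 
the standard yarn.timeline-service.enabled switch; this is a hedged 
illustration, not the attached patch:

{code:java}
import org.apache.hadoop.yarn.client.api.TimelineClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class ConditionalTimelineInit {
  /**
   * Creates and starts a TimelineClient only when the timeline service is
   * enabled in the configuration, so the distributed shell can still run
   * against clusters where it is disabled. Returns null otherwise.
   */
  static TimelineClient createIfEnabled(YarnConfiguration conf) {
    if (!conf.getBoolean(YarnConfiguration.TIMELINE_SERVICE_ENABLED,
        YarnConfiguration.DEFAULT_TIMELINE_SERVICE_ENABLED)) {
      return null;
    }
    TimelineClient client = TimelineClient.createTimelineClient();
    client.init(conf);
    client.start();
    return client;
  }
}
{code}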



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10209) DistributedShell should initialize TimelineClient conditionally

2020-05-13 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106291#comment-17106291
 ] 

Bilwa S T commented on YARN-10209:
--

[~bteke] Looks like this branch is not rebased.

> DistributedShell should initialize TimelineClient conditionally
> ---
>
> Key: YARN-10209
> URL: https://issues.apache.org/jira/browse/YARN-10209
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Benjamin Teke
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10209.001.patch, YARN-10209.branch-2.6.0.patch, 
> YARN-10209.branch-2.6.0.v2.patch
>
>
> YarnConfiguration was changed along with the introduction of newer Timeline 
> Service versions to include configuration about the version in use. In Hadoop 
> 2.6.0 the distributed shell instantiates the Timeline Client whether or not it 
> is enabled in the configuration. Running this distributed shell on newer 
> Hadoop versions (where the new Timeline Service is available) causes an 
> exception, because the bundled YarnConfiguration doesn't have the necessary 
> version configuration property. By making the Timeline Client initialization 
> conditional, the distributed shell would at least run with the Timeline 
> Service disabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10209) DistributedShell should initialize TimelineClient conditionally

2020-05-13 Thread Benjamin Teke (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106287#comment-17106287
 ] 

Benjamin Teke commented on YARN-10209:
--

Hi [~BilwaST],

Thanks again for the patch. The build issues are because the Dockerfile indeed 
doesn't exist yet on branch-2.6.0.  Fixing the build would have greater cost 
than the benefits of the fix, so we worked around this problem. Closing this 
issue as won't do.

> DistributedShell should initialize TimelineClient conditionally
> ---
>
> Key: YARN-10209
> URL: https://issues.apache.org/jira/browse/YARN-10209
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Benjamin Teke
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10209.001.patch, YARN-10209.branch-2.6.0.patch, 
> YARN-10209.branch-2.6.0.v2.patch
>
>
> YarnConfiguration was changed along with the introduction of newer Timeline 
> Service versions to include configuration about the version in use. In Hadoop 
> 2.6.0 the distributed shell instantiates the Timeline Client whether or not it 
> is enabled in the configuration. Running this distributed shell on newer 
> Hadoop versions (where the new Timeline Service is available) causes an 
> exception, because the bundled YarnConfiguration doesn't have the necessary 
> version configuration property. By making the Timeline Client initialization 
> conditional, the distributed shell would at least run with the Timeline 
> Service disabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient

2020-05-13 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106280#comment-17106280
 ] 

Bilwa S T commented on YARN-9606:
-

[~prabhujoseph] ok thanks

> Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient 
> --
>
> Key: YARN-9606
> URL: https://issues.apache.org/jira/browse/YARN-9606
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9606-001.patch, YARN-9606-002.patch, 
> YARN-9606.003.patch
>
>
> Yarn logs fails for running containers:
> {quote}
>  Unable to fetch log files list
>  Exception in thread "main" java.io.IOException: 
> com.sun.jersey.api.client.ClientHandlerException: 
> javax.net.ssl.SSLHandshakeException: Error while authenticating with 
> endpoint: 
> [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs]
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399)
>  {quote}
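
As a rough sketch of the idea in the summary (passing an SSLFactory into 
AuthenticatedURL so that HTTPS endpoints can be contacted); the wiring into 
LogsCLI#webServiceClient is omitted and this is not the attached patch:

{code:java}
import java.io.IOException;
import java.security.GeneralSecurityException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.authentication.client.AuthenticatedURL;
import org.apache.hadoop.security.authentication.client.KerberosAuthenticator;
import org.apache.hadoop.security.ssl.SSLFactory;

public class SslAwareAuthenticatedUrl {
  /**
   * Builds an AuthenticatedURL configured with the cluster's client-side SSL
   * settings. SSLFactory implements ConnectionConfigurator, so it can be
   * passed straight into AuthenticatedURL to configure HTTPS connections.
   */
  static AuthenticatedURL create(Configuration conf)
      throws IOException, GeneralSecurityException {
    SSLFactory sslFactory = new SSLFactory(SSLFactory.Mode.CLIENT, conf);
    sslFactory.init();
    return new AuthenticatedURL(new KerberosAuthenticator(), sslFactory);
  }
}
{code}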



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-10254) CapacityScheduler incorrect User Group Mapping after leaf queue change

2020-05-13 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106252#comment-17106252
 ] 

Peter Bacsko edited comment on YARN-10254 at 5/13/20, 12:16 PM:


Thanks for the latest patch, [~shuzirra]. I don't really have complaints, except 
for one thing.

I do believe that we need to extend the current code and this patch with more 
logging. My ideas:

1. {{getContextForGroupParent()}} - log if {{groupQueue}} is not found
2. {{getPlacementContextWithParent()}} - log if {{parent}} is null, this should 
be at least a warning.
3. Under the comment "if the queue doesn't exit we return null" - log if 
{{queue}} is null
4. {{getPlacementContextNoParent()}} - log if {{queue}} is null
5. I can see extra messages in {{getPlacementForUser}} potentially useful. For 
example, before each {{return}} statement, we could log stuff like:
{noformat}
} else if (mapping.getQueue().equals(CURRENT_USER_MAPPING)) {
  LOG.debug("Creating placement context based on current-user mapping");
  return getPlacementContext(mapping, user);
} else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) {
  LOG.debug("Creating placement context based on primary-group mapping");
  return getPlacementContext(mapping, getPrimaryGroup(user));
{noformat}

I think it's OK to have them on DEBUG level, with the exception of #2. But to 
me, even INFO sounds reasonable. This class has been changed substantially in 
the past months (15 commits since 2019 Oct), I'd feel safer with extra 
printouts. 


was (Author: pbacsko):
Thanks for the latest patch [~shuzirra] I don't really complaint other than 
logging.

I do believe that we need to extend the current code and this patch with more 
logging. My ideas:

1. {{getContextForGroupParent()}} - log if {{groupQueue}} is not found
2. {{getPlacementContextWithParent()}} - log if {{parent}} is null, this should 
be at least a warning.
3. Under the comment "if the queue doesn't exit we return null" - log if 
{{queue}} is null
4. {{getPlacementContextNoParent()}} - log if {{queue}} is null
5. I can see extra messages in {{getPlacementForUser}} potentially useful. For 
example, before each {{return}} statement, we could log stuff like:
{noformat}
} else if (mapping.getQueue().equals(CURRENT_USER_MAPPING)) {
   LOG.debug("Creating placement context based on current-user 
mapping");
return getPlacementContext(mapping, user);
  } else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) {
   LOG.debug("Creating placement context based on primary-group 
mapping");
   return getPlacementContext(mapping, getPrimaryGroup(user));
{noformat}

I think it's OK to have them on DEBUG level, with the exception of #2. But to 
me, even INFO sounds reasonable. This class has been changed substantially in 
the past months (15 commits since 2019 Oct), I'd feel safer with extra 
printouts. 

> CapacityScheduler incorrect User Group Mapping after leaf queue change
> --
>
> Key: YARN-10254
> URL: https://issues.apache.org/jira/browse/YARN-10254
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
> Attachments: YARN-10254.001.patch, YARN-10254.002.patch, 
> YARN-10254.003.patch
>
>
> YARN-9879 and YARN-10198 introduced some major changes to user group mapping, 
> and some of them unfortunately had a negative impact on the way mapping works.
> In some cases incorrect PlacementContexts were created, where the full queue 
> path was passed as the leaf queue name. This affects how the yarn cli app 
> list displays the queues.
> The u:%user:%primary_group.%user mapping fails with an incorrect validation 
> error when the %primary_group parent queue is a managed parent.
> Group-based rules in certain cases are mapped to root.[primary_group] rules, 
> losing the ability to create deeper structures.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10254) CapacityScheduler incorrect User Group Mapping after leaf queue change

2020-05-13 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106252#comment-17106252
 ] 

Peter Bacsko commented on YARN-10254:
-

Thanks for the latest patch, [~shuzirra]. I don't really have any complaints 
other than about the logging.

I do believe that we need to extend the current code and this patch with more 
logging. My ideas:

1. {{getContextForGroupParent()}} - log if {{groupQueue}} is not found
2. {{getPlacementContextWithParent()}} - log if {{parent}} is null, this should 
be at least a warning.
3. Under the comment "if the queue doesn't exit we return null" - log if 
{{queue}} is null
4. {{getPlacementContextNoParent()}} - log if {{queue}} is null
5. I can see extra messages in {{getPlacementForUser}} potentially useful. For 
example, before each {{return}} statement, we could log stuff like:
{noformat}
} else if (mapping.getQueue().equals(CURRENT_USER_MAPPING)) {
  LOG.debug("Creating placement context based on current-user mapping");
  return getPlacementContext(mapping, user);
} else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) {
  LOG.debug("Creating placement context based on primary-group mapping");
  return getPlacementContext(mapping, getPrimaryGroup(user));
{noformat}

I think it's OK to have them on DEBUG level, with the exception of #2. But to 
me, even INFO sounds reasonable. This class has been changed substantially in 
the past months (15 commits since 2019 Oct), I'd feel safer with extra 
printouts. 

> CapacityScheduler incorrect User Group Mapping after leaf queue change
> --
>
> Key: YARN-10254
> URL: https://issues.apache.org/jira/browse/YARN-10254
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
> Attachments: YARN-10254.001.patch, YARN-10254.002.patch, 
> YARN-10254.003.patch
>
>
> YARN-9879 and YARN-10198 introduced some major changes to user group mapping, 
> and some of them unfortunately had a negative impact on the way mapping works.
> In some cases incorrect PlacementContexts were created, where the full queue 
> path was passed as the leaf queue name. This affects how the yarn cli app 
> list displays the queues.
> The u:%user:%primary_group.%user mapping fails with an incorrect validation 
> error when the %primary_group parent queue is a managed parent.
> Group-based rules in certain cases are mapped to root.[primary_group] rules, 
> losing the ability to create deeper structures.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10108) FS-CS converter: nestedUserQueue with default rule results in invalid queue mapping

2020-05-13 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106208#comment-17106208
 ] 

Peter Bacsko commented on YARN-10108:
-

Thanks for the patch [~shuzirra].

Just one comment - this part from the new testcase can be eliminated completely:

{noformat}
  // submit an app
  submitApp(mockRM, cs.getQueue(PARENT_QUEUE), USER0, USER0, 1, 1);

  // check preconditions
  List appsInC = cs.getAppsInQueue(PARENT_QUEUE);
  assertEquals(1, appsInC.size());
  assertNotNull(cs.getQueue(USER0));

  AutoCreatedLeafQueue autoCreatedLeafQueue =
  (AutoCreatedLeafQueue) cs.getQueue(USER0);
  ManagedParentQueue parentQueue = (ManagedParentQueue) cs.getQueue(
  PARENT_QUEUE);
  assertEquals(parentQueue, autoCreatedLeafQueue.getParent());

  Map expectedChildQueueAbsCapacity =
  populateExpectedAbsCapacityByLabelForParentQueue(1);
  validateInitialQueueEntitlement(parentQueue, USER0,
  expectedChildQueueAbsCapacity, accessibleNodeLabelsOnC);

  validateUserAndAppLimits(autoCreatedLeafQueue, 1000, 1000);
  validateContainerLimits(autoCreatedLeafQueue);

  assertTrue(autoCreatedLeafQueue
  .getOrderingPolicy() instanceof FairOrderingPolicy);
{noformat}

Other than that LGTM +1 (non-binding).

> FS-CS converter: nestedUserQueue with default rule results in invalid queue 
> mapping
> ---
>
> Key: YARN-10108
> URL: https://issues.apache.org/jira/browse/YARN-10108
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Gergely Pollak
>Priority: Major
>  Labels: fs2cs
> Attachments: YARN-10108.001.patch, YARN-10108.002.patch
>
>
> FS Queue Placement Policy
> {code:java}
> 
> 
> 
> 
> 
>  {code}
> gets mapped to an invalid CS queue mapping "u:%user:root.users.%user"
> RM fails to start with above queue mapping in CS
> {code:java}
> 2020-01-28 00:19:12,889 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
> ResourceManager
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: mapping 
> contains invalid or non-leaf queue [%user] and invalid parent queue 
> [root.users]
>   at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:173)
>   at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:829)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1247)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:324)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1534)
> Caused by: java.io.IOException: mapping contains invalid or non-leaf queue 
> [%user] and invalid parent queue [root.users]
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.placement.QueuePlacementRuleUtils.validateQueueMappingUnderParentQueue(QueuePlacementRuleUtils.java:48)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.validateAndGetAutoCreatedQueueMapping(UserGroupMappingPlacementRule.java:363)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.initialize(UserGroupMappingPlacementRule.java:300)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getUserGroupMappingPlacementRule(CapacityScheduler.java:671)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updatePlacementRules(CapacityScheduler.java:712)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:753)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:361)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:426)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   ... 7 more
> {code}
> QueuePlacementConverter#handleNestedRule has to be fixed.
> {code:java}
> else if (pr instanceof 

[jira] [Commented] (YARN-10209) DistributedShell should initialize TimelineClient conditionally

2020-05-13 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106189#comment-17106189
 ] 

Hadoop QA commented on YARN-10209:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  0m 
10s{color} | {color:red} Dockerfile 
'/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/sourcedir/dev-support/docker/Dockerfile'
 not found. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-10209 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12999255/YARN-10209.branch-2.6.0.v2.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/26028/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |


This message was automatically generated.



> DistributedShell should initialize TimelineClient conditionally
> ---
>
> Key: YARN-10209
> URL: https://issues.apache.org/jira/browse/YARN-10209
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Benjamin Teke
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10209.001.patch, YARN-10209.branch-2.6.0.patch, 
> YARN-10209.branch-2.6.0.v2.patch
>
>
> YarnConfiguration was changed along with the introduction of newer Timeline 
> Service versions to include configuration about the version in use. In Hadoop 
> 2.6.0 the distributed shell instantiates the Timeline Client whether or not it 
> is enabled in the configuration. Running this distributed shell on newer 
> Hadoop versions (where the new Timeline Service is available) causes an 
> exception, because the bundled YarnConfiguration doesn't have the necessary 
> version configuration property. By making the Timeline Client initialization 
> conditional, the distributed shell would at least run with the Timeline 
> Service disabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9930) Support max running app logic for CapacityScheduler

2020-05-13 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106183#comment-17106183
 ] 

Peter Bacsko commented on YARN-9930:


[~cane] any updates here? Do you have a patch?

> Support max running app logic for CapacityScheduler
> ---
>
> Key: YARN-9930
> URL: https://issues.apache.org/jira/browse/YARN-9930
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 3.1.0, 3.1.1
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
>
> In FairScheduler, there is a max-running-apps limitation which lets excess 
> applications stay pending.
> But in CapacityScheduler there is no feature like max running apps; it only 
> has a max-applications limit, and excess jobs are rejected directly on the 
> client side.
> In this jira I want to implement this semantic for CapacityScheduler.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10154) CS Dynamic Queues cannot be configured with absolute resources

2020-05-13 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106159#comment-17106159
 ] 

Hudson commented on YARN-10154:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18249 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18249/])
YARN-10154. Addendum Patch which fixes below bugs (pjoseph: rev 
450e5aa9dd49eae46a0e05151bbddc56083eafd5)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestAbsoluteResourceWithAutoQueue.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ManagedParentQueue.java


> CS Dynamic Queues cannot be configured with absolute resources
> --
>
> Key: YARN-10154
> URL: https://issues.apache.org/jira/browse/YARN-10154
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.3
>Reporter: Sunil G
>Assignee: Manikandan R
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-10154.001.patch, YARN-10154.002.patch, 
> YARN-10154.003.patch, YARN-10154.addendum-001.patch, 
> YARN-10154.addendum-002.patch, YARN-10154.addendum-003.patch, 
> YARN-10154.addendum-004.patch
>
>
> In CS, a ManagedParent Queue and its template cannot take an absolute resource 
> value like 
> [memory=8192,vcores=8]
>  This Jira is to track and improve the configuration reading module of 
> DynamicQueue to support absolute resource values.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10154) CS Dynamic Queues cannot be configured with absolute resources

2020-05-13 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106140#comment-17106140
 ] 

Prabhu Joseph commented on YARN-10154:
--

Thanks [~sunilg] and [~maniraj...@gmail.com] for the review. Have committed  
[^YARN-10154.addendum-004.patch]  to trunk.

> CS Dynamic Queues cannot be configured with absolute resources
> --
>
> Key: YARN-10154
> URL: https://issues.apache.org/jira/browse/YARN-10154
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.3
>Reporter: Sunil G
>Assignee: Manikandan R
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-10154.001.patch, YARN-10154.002.patch, 
> YARN-10154.003.patch, YARN-10154.addendum-001.patch, 
> YARN-10154.addendum-002.patch, YARN-10154.addendum-003.patch, 
> YARN-10154.addendum-004.patch
>
>
> In CS, a ManagedParent Queue and its template cannot take an absolute resource 
> value like 
> [memory=8192,vcores=8]
>  This Jira is to track and improve the configuration reading module of 
> DynamicQueue to support absolute resource values.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10259) Reserved Containers not allocated from available space of other nodes in CandidateNodeSet in MultiNodePlacement

2020-05-13 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106082#comment-17106082
 ] 

Hadoop QA commented on YARN-10259:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 26m 
30s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 48s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
42s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
40s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 27s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}105m 
29s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}196m  1s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-YARN-Build/26027/artifact/out/Dockerfile
 |
| JIRA Issue | YARN-10259 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13002781/YARN-10259-003.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient findbugs checkstyle |
| uname | Linux bd1ffdcfffd1 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / d60496e6c66 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/26027/testReport/ |
| Max. process+thread count | 820 (vs. ulimit of 5500) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 

[jira] [Resolved] (YARN-10158) FS-CS converter: convert property yarn.scheduler.fair.update-interval-ms

2020-05-13 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko resolved YARN-10158.
-
Resolution: Won't Do

> FS-CS converter: convert property yarn.scheduler.fair.update-interval-ms
> 
>
> Key: YARN-10158
> URL: https://issues.apache.org/jira/browse/YARN-10158
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10158) FS-CS converter: convert property yarn.scheduler.fair.update-interval-ms

2020-05-13 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106045#comment-17106045
 ] 

Peter Bacsko commented on YARN-10158:
-

As discussed offline with [~leftnoteasy], it's a very low-level property. 
Preemption at this level works differently in FS and CS, so it's fine to ignore 
the conversion of such settings.

> FS-CS converter: convert property yarn.scheduler.fair.update-interval-ms
> 
>
> Key: YARN-10158
> URL: https://issues.apache.org/jira/browse/YARN-10158
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org