[jira] [Commented] (YARN-10265) Upgrade Netty-all dependency to latest version 4.1.50 to fix ARM support issue

2020-05-13 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106928#comment-17106928
 ] 

Hadoop QA commented on YARN-10265:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
44s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 18m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
59m 28s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
28s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 17m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
5s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 43s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
24s{color} | {color:green} hadoop-project in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m  
9s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}110m 57s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
51s{color} | {color:green} hadoop-yarn-csi in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
53s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}223m 22s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-YARN-Build/26030/artifact/out/Dockerfile
 |
| JIRA Issue | YARN-10265 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13002874/YARN-10265.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient xml |
| uname | Linux 4b3fc8e53de9 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git 

[jira] [Commented] (YARN-9849) Leaf queues not inheriting parent queue status after adding status as “RUNNING” and thereafter, commenting the same in capacity-scheduler.xml

2020-05-13 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106901#comment-17106901
 ] 

Bilwa S T commented on YARN-9849:
-

cc [~inigoiri]

> Leaf queues not inheriting parent queue status after adding status as 
> “RUNNING” and thereafter, commenting the same in capacity-scheduler.xml 
> --
>
> Key: YARN-9849
> URL: https://issues.apache.org/jira/browse/YARN-9849
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Sushanta Sen
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9849.001.patch, YARN-9849.002.patch, 
> YARN-9849.003.patch
>
>
> 【Precondition】:
> 1. Install the cluster
> 2. Configure queues, e.g. 2 parent queues [default, q1] & 4 leaf queues [q2, q3]
> 3. Cluster should be up and running
> 【Test step】:
> 1. By default, leaf queues inherit the parent queue's status
> 2. Change the leaf queues' status to "RUNNING" explicitly
> 3. Run the refresh command; the leaf queues' status is shown as "RUNNING" in CLI/UI
> 4. Thereafter, change the leaf queues' status to "STOPPED"
> 5. Run the refresh command; the leaf queues' status is shown as "STOPPED" in CLI/UI
> 6. Now comment out the leaf queues' status and run refresh queues
> 7. Observe
> 【Expect Output】:
> The leaf queues' status should be displayed as "RUNNING", inherited from the 
> parent queue.
> 【Actual Output】:
> The leaf queues' status is still displayed as "STOPPED" rather than being 
> inherited from the parent queue, which is RUNNING.
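
For reference, a minimal capacity-scheduler.xml sketch of the scenario described above (queue names and values are illustrative assumptions; the relevant key is yarn.scheduler.capacity.<queue-path>.state):

{code:xml}
<!-- Parent queue left at its default RUNNING state -->
<property>
  <name>yarn.scheduler.capacity.root.q1.state</name>
  <value>RUNNING</value>
</property>

<!-- Step 4: explicitly stop a leaf queue, then run "yarn rmadmin -refreshQueues" -->
<property>
  <name>yarn.scheduler.capacity.root.q1.q2.state</name>
  <value>STOPPED</value>
</property>

<!-- Step 6: commenting the leaf-queue property out and refreshing again should
     make q2 fall back to the parent's RUNNING state; this issue reports that
     it stays STOPPED instead -->
{code}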



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10265) Upgrade Netty-all dependency to latest version 4.1.50 to fix ARM support issue

2020-05-13 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106890#comment-17106890
 ] 

Surendra Singh Lilhore commented on YARN-10265:
---

LGTM, will wait for build report...

> Upgrade Netty-all dependency to latest version 4.1.50 to fix ARM support issue
> --
>
> Key: YARN-10265
> URL: https://issues.apache.org/jira/browse/YARN-10265
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: liusheng
>Priority: Major
> Attachments: YARN-10265.001.patch
>
>
> In YARN-9898 (https://issues.apache.org/jira/browse/YARN-9898) we added a 
> workaround using a non-officially released netty-4.1.48 to fix the ARM support 
> issue. A few hours ago, Netty released version 4.1.50, which officially 
> supports the ARM platform; please see: 
> [https://github.com/netty/netty/pull/9804]
>  
> netty-4.1.50.Final release: 
> [https://github.com/netty/netty/releases/tag/netty-4.1.50.Final]
> commits from netty-4.1.48 to netty-4.1.50: 
> [https://github.com/netty/netty/compare/netty-4.1.48.Final...netty-4.1.50.Final]
> So it is better to upgrade Hadoop's Netty dependency to version 4.1.50.
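
For context, a minimal sketch of what such a version bump typically looks like in hadoop-project/pom.xml (the exact property name used by Hadoop's build is an assumption here, not taken from the actual patch):

{code:xml}
<!-- Sketch only: the property name is assumed for illustration -->
<properties>
  <netty.all.version>4.1.50.Final</netty.all.version>
</properties>

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>io.netty</groupId>
      <artifactId>netty-all</artifactId>
      <version>${netty.all.version}</version>
    </dependency>
  </dependencies>
</dependencyManagement>
{code}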



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10248) when config allowed-gpu-devices , excluded GPUs still be visible to containers

2020-05-13 Thread zhao yufei (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106852#comment-17106852
 ] 

zhao yufei commented on YARN-10248:
---

[~tangzhankun] yes. Also, this test class supports FakeGpuDiscoveryBinary, but 
lookUpAutoDiscoveryBinary in class GpuDiscoverer does not support it.


> when config allowed-gpu-devices , excluded GPUs still be visible to containers
> --
>
> Key: YARN-10248
> URL: https://issues.apache.org/jira/browse/YARN-10248
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.1
>Reporter: zhao yufei
>Assignee: zhao yufei
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.2.1
>
> Attachments: YARN-10248-branch-3.2.001.path, 
> YARN-10248-branch-3.2.001.path
>
>
> I have a server with two GPUs, and I want to use only one of them within the 
> yarn cluster.
> According to the hadoop documentation, I set these configs:
> {code:java}
> <property>
>   <name>yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices</name>
>   <value>0:1</value>
> </property>
> <property>
>   <name>yarn.nodemanager.resource-plugins.gpu.path-to-discovery-executables</name>
>   <value>/etc/alternatives/x86_64-linux-gnu_nvidia_smi</value>
> </property>
> {code}
> then I ran the following command to test:
> {code:java}
> yarn jar 
> ./share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.1.jar \
>  -jar 
> ./share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.1.jar  
> -shell_command ' nvidia-smi & sleep 3  ' \
>  -container_resources memory-mb=3072,vcores=1,yarn.io/gpu=1  \
>  -num_containers 1 -queue yufei -node_label_expression slaves
> {code}
> I expected the gpu with minor number 0 would not be visible to the container, 
> but in the launched container, nvidia-smi prints information for both gpus.
> I checked the related source code and found it is a bug.
> The problem is:
> when you specify allowed-gpu-devices, GpuDiscoverer populates the usable gpus 
> from it;
> then, when some of the gpus are assigned to a container, it sets the denied 
> gpus for that container,
> but it never considers the excluded gpus of the host.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-10248) when config allowed-gpu-devices , excluded GPUs still be visible to containers

2020-05-13 Thread zhao yufei (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106852#comment-17106852
 ] 

zhao yufei edited comment on YARN-10248 at 5/14/20, 3:47 AM:
-

[~tangzhankun] no, I found another test issue:
 this test class supports FakeGpuDiscoveryBinary, but lookUpAutoDiscoveryBinary 
in class GpuDiscoverer does not support it.



was (Author: jasstionzyf):
[~tangzhankun] yes. Also, this test class supports FakeGpuDiscoveryBinary, but 
lookUpAutoDiscoveryBinary in class GpuDiscoverer does not support it.


> when config allowed-gpu-devices , excluded GPUs still be visible to containers
> --
>
> Key: YARN-10248
> URL: https://issues.apache.org/jira/browse/YARN-10248
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.1
>Reporter: zhao yufei
>Assignee: zhao yufei
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.2.1
>
> Attachments: YARN-10248-branch-3.2.001.path, 
> YARN-10248-branch-3.2.001.path
>
>
> I have a server with two GPUs, and I want to use only one of them within the 
> yarn cluster.
> According to the hadoop documentation, I set these configs:
> {code:java}
> <property>
>   <name>yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices</name>
>   <value>0:1</value>
> </property>
> <property>
>   <name>yarn.nodemanager.resource-plugins.gpu.path-to-discovery-executables</name>
>   <value>/etc/alternatives/x86_64-linux-gnu_nvidia_smi</value>
> </property>
> {code}
> then I ran the following command to test:
> {code:java}
> yarn jar 
> ./share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.1.jar \
>  -jar 
> ./share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.1.jar  
> -shell_command ' nvidia-smi & sleep 3  ' \
>  -container_resources memory-mb=3072,vcores=1,yarn.io/gpu=1  \
>  -num_containers 1 -queue yufei -node_label_expression slaves
> {code}
> I expected the gpu with minor number 0 would not be visible to the container, 
> but in the launched container, nvidia-smi prints information for both gpus.
> I checked the related source code and found it is a bug.
> The problem is:
> when you specify allowed-gpu-devices, GpuDiscoverer populates the usable gpus 
> from it;
> then, when some of the gpus are assigned to a container, it sets the denied 
> gpus for that container,
> but it never considers the excluded gpus of the host.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10265) Upgrade Netty-all dependency to latest version 4.1.50 to fix ARM support issue

2020-05-13 Thread liusheng (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liusheng updated YARN-10265:

Description: 
In YARN-9898 (https://issues.apache.org/jira/browse/YARN-9898) we added a 
workaround using a non-officially released netty-4.1.48 to fix the ARM support 
issue. A few hours ago, Netty released version 4.1.50, which officially 
supports the ARM platform; please see: 
[https://github.com/netty/netty/pull/9804]

netty-4.1.50.Final release: 
[https://github.com/netty/netty/releases/tag/netty-4.1.50.Final]

commits from netty-4.1.48 to netty-4.1.50: 
[https://github.com/netty/netty/compare/netty-4.1.48.Final...netty-4.1.50.Final]

So it is better to upgrade Hadoop's Netty dependency to version 4.1.50.

  was:
In YARN-9898 (https://issues.apache.org/jira/browse/YARN-9898) we added a 
workaround using a non-officially released netty-4.1.48 to fix the ARM support 
issue. A few hours ago, Netty released version 4.1.50, which officially 
supports the ARM platform; please see: 
[https://github.com/netty/netty/pull/9804]

netty-4.1.50.Final release: 
[https://github.com/netty/netty/releases/tag/netty-4.1.50.Final]

commits from netty-4.1.48 to netty-4.1.50: 
[https://github.com/netty/netty/compare/netty-4.1.48.Final...netty-4.1.50.Final|https://github.com/netty/netty/compare/netty-4.1.49.Final...netty-4.1.50.Final]

So it is better to upgrade Hadoop's Netty dependency to version 4.1.50.


> Upgrade Netty-all dependency to latest version 4.1.50 to fix ARM support issue
> --
>
> Key: YARN-10265
> URL: https://issues.apache.org/jira/browse/YARN-10265
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: liusheng
>Priority: Major
>
> In YARN-9898 (https://issues.apache.org/jira/browse/YARN-9898) we added a 
> workaround using a non-officially released netty-4.1.48 to fix the ARM support 
> issue. A few hours ago, Netty released version 4.1.50, which officially 
> supports the ARM platform; please see: 
> [https://github.com/netty/netty/pull/9804]
>  
> netty-4.1.50.Final release: 
> [https://github.com/netty/netty/releases/tag/netty-4.1.50.Final]
> commits from netty-4.1.48 to netty-4.1.50: 
> [https://github.com/netty/netty/compare/netty-4.1.48.Final...netty-4.1.50.Final]
> So it is better to upgrade Hadoop's Netty dependency to version 4.1.50.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10265) Upgrade Netty-all dependency to latest version 4.1.50 to fix ARM support issue

2020-05-13 Thread liusheng (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liusheng updated YARN-10265:

Description: 
In YARN-9898 (https://issues.apache.org/jira/browse/YARN-9898) we added a 
workaround using a non-officially released netty-4.1.48 to fix the ARM support 
issue. A few hours ago, Netty released version 4.1.50, which officially 
supports the ARM platform; please see: 
[https://github.com/netty/netty/pull/9804]

netty-4.1.50.Final release: 
[https://github.com/netty/netty/releases/tag/netty-4.1.50.Final]

commits from netty-4.1.48 to netty-4.1.50: 
[https://github.com/netty/netty/compare/netty-4.1.48.Final...netty-4.1.50.Final|https://github.com/netty/netty/compare/netty-4.1.49.Final...netty-4.1.50.Final]

So it is better to upgrade Hadoop's Netty dependency to version 4.1.50.

  was:
In YARN-9898 (https://issues.apache.org/jira/browse/YARN-9898) we added a 
workaround using a non-officially released netty-4.1.49 to fix the ARM support 
issue. A few hours ago, Netty released version 4.1.50, which officially 
supports the ARM platform; please see: 
[https://github.com/netty/netty/pull/9804]

netty-4.1.50.Final release: 
[https://github.com/netty/netty/releases/tag/netty-4.1.50.Final]

commits from netty-4.1.49 to netty-4.1.50: 
[https://github.com/netty/netty/compare/netty-4.1.49.Final...netty-4.1.50.Final]

So it is better to upgrade Hadoop's Netty dependency to version 4.1.50.


> Upgrade Netty-all dependency to latest version 4.1.50 to fix ARM support issue
> --
>
> Key: YARN-10265
> URL: https://issues.apache.org/jira/browse/YARN-10265
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: liusheng
>Priority: Major
>
> In YARN-9898 (https://issues.apache.org/jira/browse/YARN-9898) we added a 
> workaround using a non-officially released netty-4.1.48 to fix the ARM support 
> issue. A few hours ago, Netty released version 4.1.50, which officially 
> supports the ARM platform; please see: 
> [https://github.com/netty/netty/pull/9804]
>  
> netty-4.1.50.Final release: 
> [https://github.com/netty/netty/releases/tag/netty-4.1.50.Final]
> commits from netty-4.1.48 to netty-4.1.50: 
> [https://github.com/netty/netty/compare/netty-4.1.48.Final...netty-4.1.50.Final|https://github.com/netty/netty/compare/netty-4.1.49.Final...netty-4.1.50.Final]
> So it is better to upgrade Hadoop's Netty dependency to version 4.1.50.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10265) Upgrade Netty-all dependency to latest version 4.1.50 to fix ARM support issue

2020-05-13 Thread liusheng (Jira)
liusheng created YARN-10265:
---

 Summary: Upgrade Netty-all dependency to latest version 4.1.50 to 
fix ARM support issue
 Key: YARN-10265
 URL: https://issues.apache.org/jira/browse/YARN-10265
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: liusheng


In YARN-9898 (https://issues.apache.org/jira/browse/YARN-9898) we added a 
workaround using a non-officially released netty-4.1.49 to fix the ARM support 
issue. A few hours ago, Netty released version 4.1.50, which officially 
supports the ARM platform; please see: 
[https://github.com/netty/netty/pull/9804]

netty-4.1.50.Final release: 
[https://github.com/netty/netty/releases/tag/netty-4.1.50.Final]

commits from netty-4.1.49 to netty-4.1.50: 
[https://github.com/netty/netty/compare/netty-4.1.49.Final...netty-4.1.50.Final]

So it is better to upgrade Hadoop's Netty dependency to version 4.1.50.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9898) Dependency netty-all-4.1.27.Final doesn't support ARM platform

2020-05-13 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106614#comment-17106614
 ] 

Hudson commented on YARN-9898:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18254 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18254/])
YARN-9898. Dependency netty-all-4.1.27.Final doesn't support ARM (ayushsaxena: 
rev 0918433b4da1affbe380988b8f63fca39bc0850b)
* (edit) hadoop-project/pom.xml
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-csi/pom.xml
* (edit) hadoop-hdfs-project/hadoop-hdfs-client/pom.xml
* (edit) hadoop-hdfs-project/hadoop-hdfs/pom.xml


> Dependency netty-all-4.1.27.Final doesn't support ARM platform
> --
>
> Key: YARN-9898
> URL: https://issues.apache.org/jira/browse/YARN-9898
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: liusheng
>Assignee: liusheng
>Priority: Major
> Fix For: 3.3.0, 3.4.0
>
> Attachments: YARN-9898.001.patch, YARN-9898.002.patch, 
> YARN-9898.003.patch, YARN-9898.004.patch
>
>
> Hadoop depends on the Netty package, but *netty-all-4.1.27.Final* from the 
> io.netty maven repo does not support the ARM platform. 
> When running the test *TestCsiClient.testIdentityService* on an ARM server, it 
> raises an error like the following:
> {code:java}
> Caused by: java.io.FileNotFoundException: 
> META-INF/native/libnetty_transport_native_epoll_aarch_64.so
> at 
> io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:161)
> ... 45 more
> Suppressed: java.lang.UnsatisfiedLinkError: no 
> netty_transport_native_epoll_aarch_64 in java.library.path
> at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
> at java.lang.Runtime.loadLibrary0(Runtime.java:870)
> at java.lang.System.loadLibrary(System.java:1122)
> at 
> io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:38)
> at 
> io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:243)
> at 
> io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:124)
> ... 45 more
> Suppressed: java.lang.UnsatisfiedLinkError: no 
> netty_transport_native_epoll_aarch_64 in java.library.path
> at 
> java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
> at java.lang.Runtime.loadLibrary0(Runtime.java:870)
> at java.lang.System.loadLibrary(System.java:1122)
> at 
> io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:38)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> io.netty.util.internal.NativeLibraryLoader$1.run(NativeLibraryLoader.java:263)
> at java.security.AccessController.doPrivileged(Native 
> Method)
> at 
> io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:255)
> at 
> io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:233)
> ... 46 more
> {code}
>  
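
For background, the missing libnetty_transport_native_epoll_aarch_64.so above comes from Netty's classified native-transport artifacts, which only gained a linux-aarch_64 build in the 4.1.50.Final timeframe. A minimal pom.xml sketch of pulling that classified jar explicitly (an illustrative assumption, not the actual Hadoop fix):

{code:xml}
<!-- Sketch only: depend on the aarch64 epoll native transport directly -->
<dependency>
  <groupId>io.netty</groupId>
  <artifactId>netty-transport-native-epoll</artifactId>
  <version>4.1.50.Final</version>
  <classifier>linux-aarch_64</classifier>
</dependency>
{code}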



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9898) Dependency netty-all-4.1.27.Final doesn't support ARM platform

2020-05-13 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106603#comment-17106603
 ] 

Ayush Saxena commented on YARN-9898:


Committed to trunk, branch-3.3 and 3.3.0

Thanx [~seanlau]  for the contribution!!!

> Dependency netty-all-4.1.27.Final doesn't support ARM platform
> --
>
> Key: YARN-9898
> URL: https://issues.apache.org/jira/browse/YARN-9898
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: liusheng
>Assignee: liusheng
>Priority: Major
> Attachments: YARN-9898.001.patch, YARN-9898.002.patch, 
> YARN-9898.003.patch, YARN-9898.004.patch
>
>
> Hadoop depends on the Netty package, but *netty-all-4.1.27.Final* from the 
> io.netty maven repo does not support the ARM platform. 
> When running the test *TestCsiClient.testIdentityService* on an ARM server, it 
> raises an error like the following:
> {code:java}
> Caused by: java.io.FileNotFoundException: 
> META-INF/native/libnetty_transport_native_epoll_aarch_64.so
> at 
> io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:161)
> ... 45 more
> Suppressed: java.lang.UnsatisfiedLinkError: no 
> netty_transport_native_epoll_aarch_64 in java.library.path
> at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
> at java.lang.Runtime.loadLibrary0(Runtime.java:870)
> at java.lang.System.loadLibrary(System.java:1122)
> at 
> io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:38)
> at 
> io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:243)
> at 
> io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:124)
> ... 45 more
> Suppressed: java.lang.UnsatisfiedLinkError: no 
> netty_transport_native_epoll_aarch_64 in java.library.path
> at 
> java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
> at java.lang.Runtime.loadLibrary0(Runtime.java:870)
> at java.lang.System.loadLibrary(System.java:1122)
> at 
> io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:38)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> io.netty.util.internal.NativeLibraryLoader$1.run(NativeLibraryLoader.java:263)
> at java.security.AccessController.doPrivileged(Native 
> Method)
> at 
> io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:255)
> at 
> io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:233)
> ... 46 more
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10229) [Federation] Client should be able to submit application to RM directly using normal client conf

2020-05-13 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106566#comment-17106566
 ] 

Hadoop QA commented on YARN-10229:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 40s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
21s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
19s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 23s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 3 new + 165 unchanged - 1 fixed = 168 total (was 166) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 20s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m 
59s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 86m  9s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-YARN-Build/26029/artifact/out/Dockerfile
 |
| JIRA Issue | YARN-10229 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13002833/YARN-10229.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient findbugs checkstyle |
| uname | Linux c8795019f11d 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 108ecf992f0 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| checkstyle | 

[jira] [Resolved] (YARN-8552) [DS] Container report fails for distributed containers

2020-05-13 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T resolved YARN-8552.
-
Resolution: Cannot Reproduce

> [DS]  Container report fails for distributed containers
> ---
>
> Key: YARN-8552
> URL: https://issues.apache.org/jira/browse/YARN-8552
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin Chundatt
>Assignee: Bilwa S T
>Priority: Major
>
> 2018-07-19 19:15:02,281 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_1531994217928_0003_01_1099511627753 Container Transitioned from 
> ACQUIRED to RUNNING
> 2018-07-19 19:15:02,384 ERROR 
> org.apache.hadoop.yarn.server.webapp.ContainerBlock: Failed to read the 
> container container_1531994217928_0003_01_1099511627773.
> Container report fails for Distributed Scheduler containers. Currently all 
> the containers are fetched from the central RM, so we need to find an 
> alternative for the same.
> {code}
> at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
> at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.yarn.exceptions.ContainerNotFoundException: 
> Container with id 'container_1531994217928_0003_01_1099511627773' doesn't 
> exist in RM.
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getContainerReport(ClientRMService.java:499)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RMContainerBlock.getContainerReport(RMContainerBlock.java:44)
> at 
> org.apache.hadoop.yarn.server.webapp.ContainerBlock$1.run(ContainerBlock.java:82)
> at 
> org.apache.hadoop.yarn.server.webapp.ContainerBlock$1.run(ContainerBlock.java:79)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
> ... 70 more
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-10128) [FederationSecurity] YARN RMAdmin commands fail when Authorization is enabled on router

2020-05-13 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T resolved YARN-10128.
--
Resolution: Duplicate

> [FederationSecurity] YARN RMAdmin commands fail when Authorization is enabled 
> on router
> ---
>
> Key: YARN-10128
> URL: https://issues.apache.org/jira/browse/YARN-10128
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
>
> Exception thrown is 
> {quote}Protocol interface 
> org.apache.hadoop.yarn.server.api.ResourceManagerAdministrationProtocolPB is 
> not known., while invoking 
> ResourceManagerAdministrationProtocolPBClientImpl.refreshQueues over rm2 
> after 1 failover attempts. Trying to failover after sleeping for 44717ms.
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8942) PriorityBasedRouterPolicy throws exception if all sub-cluster weights have negative value

2020-05-13 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106498#comment-17106498
 ] 

Hudson commented on YARN-8942:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18251 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18251/])
YARN-8942. PriorityBasedRouterPolicy throws exception if all sub-cluster 
(inigoiri: rev 108ecf992f0004dd64a7143d1c400de1361b13f3)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/federation/policies/router/TestPriorityRouterPolicy.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/federation/policies/router/PriorityRouterPolicy.java


> PriorityBasedRouterPolicy throws exception if all sub-cluster weights have 
> negative value
> -
>
> Key: YARN-8942
> URL: https://issues.apache.org/jira/browse/YARN-8942
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Akshay Agarwal
>Assignee: Bilwa S T
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: YARN-8942.001.patch, YARN-8942.002.patch
>
>
> In *PriorityBasedRouterPolicy*, if all sub-cluster weights are *set to 
> negative values*, it throws an exception while running a job.
> Ideally it should handle negative priorities as well, according to the 
> policy's home sub-cluster selection process.
>  *Exception Details:*
> {code:java}
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Unable 
> to insert the ApplicationId application_1540356760422_0015 into the 
> FederationStateStore
> at 
> org.apache.hadoop.yarn.server.router.RouterServerUtil.logAndThrowException(RouterServerUtil.java:56)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:418)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.submitApplication(RouterClientRMService.java:218)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:282)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:579)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
> Caused by: 
> org.apache.hadoop.yarn.server.federation.store.exception.FederationStateStoreInvalidInputException:
>  Missing SubCluster Id information. Please try again by specifying Subcluster 
> Id information.
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationMembershipStateStoreInputValidator.checkSubClusterId(FederationMembershipStateStoreInputValidator.java:247)
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationApplicationHomeSubClusterStoreInputValidator.checkApplicationHomeSubCluster(FederationApplicationHomeSubClusterStoreInputValidator.java:160)
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationApplicationHomeSubClusterStoreInputValidator.validate(FederationApplicationHomeSubClusterStoreInputValidator.java:65)
> at 
> org.apache.hadoop.yarn.server.federation.store.impl.ZookeeperFederationStateStore.addApplicationHomeSubCluster(ZookeeperFederationStateStore.java:159)
> at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> at 
> 

[jira] [Commented] (YARN-6526) Refactoring SQLFederationStateStore by avoiding to recreate a connection at every call

2020-05-13 Thread Jira


[ 
https://issues.apache.org/jira/browse/YARN-6526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106481#comment-17106481
 ] 

Íñigo Goiri commented on YARN-6526:
---

+1 on [^YARN-6526.007.patch].
The findbugs warning seems unrelated.

> Refactoring SQLFederationStateStore by avoiding to recreate a connection at 
> every call
> --
>
> Key: YARN-6526
> URL: https://issues.apache.org/jira/browse/YARN-6526
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: federation
>Reporter: Giovanni Matteo Fumarola
>Assignee: Bilwa S T
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-6526.001.patch, YARN-6526.002.patch, 
> YARN-6526.003.patch, YARN-6526.004.patch, YARN-6526.005.patch, 
> YARN-6526.006.patch, YARN-6526.007.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8942) PriorityBasedRouterPolicy throws exception if all sub-cluster weights have negative value

2020-05-13 Thread Jira


[ 
https://issues.apache.org/jira/browse/YARN-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106480#comment-17106480
 ] 

Íñigo Goiri commented on YARN-8942:
---

Thanks [~BilwaST] for the patch.
Committed to trunk.

> PriorityBasedRouterPolicy throws exception if all sub-cluster weights have 
> negative value
> -
>
> Key: YARN-8942
> URL: https://issues.apache.org/jira/browse/YARN-8942
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Akshay Agarwal
>Assignee: Bilwa S T
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: YARN-8942.001.patch, YARN-8942.002.patch
>
>
> In *PriorityBasedRouterPolicy*, if all sub-cluster weights are *set to 
> negative values*, it throws an exception while running a job.
> Ideally it should handle negative priorities as well, according to the 
> policy's home sub-cluster selection process.
>  *Exception Details:*
> {code:java}
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Unable 
> to insert the ApplicationId application_1540356760422_0015 into the 
> FederationStateStore
> at 
> org.apache.hadoop.yarn.server.router.RouterServerUtil.logAndThrowException(RouterServerUtil.java:56)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:418)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.submitApplication(RouterClientRMService.java:218)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:282)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:579)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
> Caused by: 
> org.apache.hadoop.yarn.server.federation.store.exception.FederationStateStoreInvalidInputException:
>  Missing SubCluster Id information. Please try again by specifying Subcluster 
> Id information.
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationMembershipStateStoreInputValidator.checkSubClusterId(FederationMembershipStateStoreInputValidator.java:247)
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationApplicationHomeSubClusterStoreInputValidator.checkApplicationHomeSubCluster(FederationApplicationHomeSubClusterStoreInputValidator.java:160)
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationApplicationHomeSubClusterStoreInputValidator.validate(FederationApplicationHomeSubClusterStoreInputValidator.java:65)
> at 
> org.apache.hadoop.yarn.server.federation.store.impl.ZookeeperFederationStateStore.addApplicationHomeSubCluster(ZookeeperFederationStateStore.java:159)
> at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> at com.sun.proxy.$Proxy84.addApplicationHomeSubCluster(Unknown Source)
> at 
> org.apache.hadoop.yarn.server.federation.utils.FederationStateStoreFacade.addApplicationHomeSubCluster(FederationStateStoreFacade.java:402)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:413)
> ... 11 more
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (YARN-8942) PriorityBasedRouterPolicy throws exception if all sub-cluster weights have negative value

2020-05-13 Thread Jira


 [ 
https://issues.apache.org/jira/browse/YARN-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated YARN-8942:
--
Affects Version/s: 3.3.0

> PriorityBasedRouterPolicy throws exception if all sub-cluster weights have 
> negative value
> -
>
> Key: YARN-8942
> URL: https://issues.apache.org/jira/browse/YARN-8942
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Akshay Agarwal
>Assignee: Bilwa S T
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: YARN-8942.001.patch, YARN-8942.002.patch
>
>
> In *PriorityBasedRouterPolicy*, if all sub-cluster weights are *set to 
> negative values*, it throws an exception while running a job.
> Ideally it should handle negative priorities as well, according to the 
> policy's home sub-cluster selection process.
>  *Exception Details:*
> {code:java}
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Unable 
> to insert the ApplicationId application_1540356760422_0015 into the 
> FederationStateStore
> at 
> org.apache.hadoop.yarn.server.router.RouterServerUtil.logAndThrowException(RouterServerUtil.java:56)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:418)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.submitApplication(RouterClientRMService.java:218)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:282)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:579)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
> Caused by: 
> org.apache.hadoop.yarn.server.federation.store.exception.FederationStateStoreInvalidInputException:
>  Missing SubCluster Id information. Please try again by specifying Subcluster 
> Id information.
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationMembershipStateStoreInputValidator.checkSubClusterId(FederationMembershipStateStoreInputValidator.java:247)
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationApplicationHomeSubClusterStoreInputValidator.checkApplicationHomeSubCluster(FederationApplicationHomeSubClusterStoreInputValidator.java:160)
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationApplicationHomeSubClusterStoreInputValidator.validate(FederationApplicationHomeSubClusterStoreInputValidator.java:65)
> at 
> org.apache.hadoop.yarn.server.federation.store.impl.ZookeeperFederationStateStore.addApplicationHomeSubCluster(ZookeeperFederationStateStore.java:159)
> at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> at com.sun.proxy.$Proxy84.addApplicationHomeSubCluster(Unknown Source)
> at 
> org.apache.hadoop.yarn.server.federation.utils.FederationStateStoreFacade.addApplicationHomeSubCluster(FederationStateStoreFacade.java:402)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:413)
> ... 11 more
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: 

[jira] [Updated] (YARN-8942) PriorityBasedRouterPolicy throws exception if all sub-cluster weights have negative value

2020-05-13 Thread Jira


 [ 
https://issues.apache.org/jira/browse/YARN-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated YARN-8942:
--
Fix Version/s: (was: 3.4.0)

> PriorityBasedRouterPolicy throws exception if all sub-cluster weights have 
> negative value
> -
>
> Key: YARN-8942
> URL: https://issues.apache.org/jira/browse/YARN-8942
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Akshay Agarwal
>Assignee: Bilwa S T
>Priority: Minor
> Attachments: YARN-8942.001.patch, YARN-8942.002.patch
>
>
> In *PriorityBasedRouterPolicy*, if all sub-cluster weights are *set to 
> negative values*, it throws an exception while running a job.
> Ideally it should handle negative priorities as well, according to the 
> policy's home sub-cluster selection process.
>  *Exception Details:*
> {code:java}
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Unable 
> to insert the ApplicationId application_1540356760422_0015 into the 
> FederationStateStore
> at 
> org.apache.hadoop.yarn.server.router.RouterServerUtil.logAndThrowException(RouterServerUtil.java:56)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:418)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.submitApplication(RouterClientRMService.java:218)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:282)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:579)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
> Caused by: 
> org.apache.hadoop.yarn.server.federation.store.exception.FederationStateStoreInvalidInputException:
>  Missing SubCluster Id information. Please try again by specifying Subcluster 
> Id information.
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationMembershipStateStoreInputValidator.checkSubClusterId(FederationMembershipStateStoreInputValidator.java:247)
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationApplicationHomeSubClusterStoreInputValidator.checkApplicationHomeSubCluster(FederationApplicationHomeSubClusterStoreInputValidator.java:160)
> at 
> org.apache.hadoop.yarn.server.federation.store.utils.FederationApplicationHomeSubClusterStoreInputValidator.validate(FederationApplicationHomeSubClusterStoreInputValidator.java:65)
> at 
> org.apache.hadoop.yarn.server.federation.store.impl.ZookeeperFederationStateStore.addApplicationHomeSubCluster(ZookeeperFederationStateStore.java:159)
> at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> at com.sun.proxy.$Proxy84.addApplicationHomeSubCluster(Unknown Source)
> at 
> org.apache.hadoop.yarn.server.federation.utils.FederationStateStoreFacade.addApplicationHomeSubCluster(FederationStateStoreFacade.java:402)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:413)
> ... 11 more
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: 

[jira] [Commented] (YARN-10229) [Federation] Client should be able to submit application to RM directly using normal client conf

2020-05-13 Thread Jira


[ 
https://issues.apache.org/jira/browse/YARN-10229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106475#comment-17106475
 ] 

Íñigo Goiri commented on YARN-10229:


[~subru], we discussed something like this in the past.
Do you have any feedback? It would be awesome to support this.

> [Federation] Client should be able to submit application to RM directly using 
> normal client conf
> 
>
> Key: YARN-10229
> URL: https://issues.apache.org/jira/browse/YARN-10229
> Project: Hadoop YARN
>  Issue Type: Wish
>  Components: amrmproxy, federation
>Affects Versions: 3.1.1
>Reporter: JohnsonGuo
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10229.001.patch
>
>
> Scenario: When the yarn federation feature is enabled with multiple yarn 
> clusters, one can submit jobs to the yarn-router by *modifying* the client 
> configuration with the yarn router address.
> But if one still wants to submit jobs via the original client (from before 
> federation was enabled) to the RM directly, it will hit the AMRMToken 
> exception. That means once federation is enabled, anyone who wants to submit a 
> job has to modify the client conf.
>  
> One possible solution for this scenario is:
> In the NodeManager, when the client ApplicationMaster request comes in (see 
> the config sketch below):
>  * get the client job.xml  from HDFS "".
>  * parse the "yarn.resourcemanager.scheduler.address" parameter in job.xml
>  * if the value of the parameter is "localhost:8049" (the AMRMProxy address), 
> then do the AMRMToken validation process
>  * if the value of the parameter is "rm:port" (the RM address), then skip the 
> AMRMToken validation process
>  
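
To make the two client configurations above concrete, a minimal sketch of the relevant client-side setting (the 8049 port comes from the description; "rm-host" and the 8030 RM scheduler port are illustrative placeholders):

{code:xml}
<!-- Federated client: ApplicationMaster scheduler traffic goes through the
     local AMRMProxy on the NodeManager -->
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>localhost:8049</value>
</property>

<!-- Original (pre-federation) client: scheduler traffic goes to the RM directly -->
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>rm-host:8030</value>
</property>
{code}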



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10229) [Federation] Client should be able to submit application to RM directly using normal client conf

2020-05-13 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10229:
-
Attachment: YARN-10229.001.patch

> [Federation] Client should be able to submit application to RM directly using 
> normal client conf
> 
>
> Key: YARN-10229
> URL: https://issues.apache.org/jira/browse/YARN-10229
> Project: Hadoop YARN
>  Issue Type: Wish
>  Components: amrmproxy, federation
>Affects Versions: 3.1.1
>Reporter: JohnsonGuo
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10229.001.patch
>
>
> Scenario: When the YARN federation feature is enabled across multiple YARN 
> clusters, one can submit a job to the yarn-router by *modifying* the client 
> configuration to point at the YARN Router address.
> But if one still wants to submit jobs to the RM directly via the original 
> client configuration (from before federation was enabled), the submission 
> fails with an AMRMToken exception. In other words, once federation is 
> enabled, anyone who wants to submit a job has to modify the client conf.
>  
> One possible solution for this scenario is, in the NodeManager, when the 
> client ApplicationMaster request comes in:
>  * get the client's job.xml from HDFS "".
>  * parse the "yarn.resourcemanager.scheduler.address" parameter in job.xml
>  * if the value of the parameter is "localhost:8049" (the AMRMProxy address), 
> then do the AMRMToken validation process
>  * if the value of the parameter is "rm:port" (the RM address), then skip the 
> AMRMToken validation process
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10264) Add container launch related env / classpath debug info to container logs when a container fails

2020-05-13 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10264:
--
Description: 
Sometimes when a container fails to launch, it can be pretty hard to figure out 
why it has failed.

Similar to YARN-4309, we can add a switch to control whether the environment 
variables and the Java classpath should be printed.
As a bonus, 
[jdeps|https://docs.oracle.com/javase/8/docs/technotes/tools/unix/jdeps.html] 
could also be utilized to print some verbose info about the classpath. 

When log aggregation occurs, all this information will automatically get 
collected and make debugging such container launch failures much easier.
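
As a rough illustration of such a switch (not the actual change; the property 
name below is purely hypothetical and is not an existing YARN configuration 
key), the NodeManager side could look roughly like this:

{code:java}
import java.util.Map;

import org.apache.hadoop.conf.Configuration;

public class ContainerLaunchDebugInfo {
  // Hypothetical switch, modeled after the YARN-4309 style debug settings.
  static final String PRINT_LAUNCH_DEBUG_INFO =
      "yarn.nodemanager.log-container-launch-debug-info.enabled";

  /**
   * Appends the container environment and the CLASSPATH to the container log
   * output, but only when the (hypothetical) switch above is enabled.
   */
  static void maybeAppendLaunchDebugInfo(Configuration conf,
      Map<String, String> env, StringBuilder out) {
    if (!conf.getBoolean(PRINT_LAUNCH_DEBUG_INFO, false)) {
      return;
    }
    out.append("Container launch environment:\n");
    for (Map.Entry<String, String> e : env.entrySet()) {
      out.append(e.getKey()).append('=').append(e.getValue()).append('\n');
    }
    out.append("Container CLASSPATH: ")
        .append(env.getOrDefault("CLASSPATH", "<not set>"))
        .append('\n');
  }
}
{code}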

Below is an example output when the user faces a classpath configuration issue 
while launching an application: 

{code:java}
End of LogType:prelaunch.err
**
2020-04-19 05:49:12,145 DEBUG:app_info:Diagnostics of the failed app
2020-04-19 05:49:12,145 DEBUG:app_info:Application 
application_1587300264561_0001 failed 2 times due to AM Container for 
appattempt_1587300264561_0001_02 exited with  exitCode: 1
Failing this attempt.Diagnostics: [2020-04-19 12:45:01.955]Exception from 
container-launch.
Container id: container_e60_1587300264561_0001_02_01
Exit code: 1
Exception message: Launch container failed
Shell output: main : command provided 1
main : run as user is systest
main : requested yarn user is systest
Getting exit code file...
Creating script paths...
Writing pid file...
Writing to tmp file 
/dataroot/ycloud/yarn/nm/nmPrivate/application_1587300264561_0001/container_e60_1587300264561_0001_02_01/container_e60_1587300264561_0001_02_01.pid.tmp
Writing to cgroup task files...
Creating local dirs...
Launching container...
Getting exit code file...
Creating script paths...


[2020-04-19 12:45:01.984]Container exited with a non-zero exit code 1. Error 
file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster

Please check whether your etc/hadoop/mapred-site.xml contains the below 
configuration:
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>

[2020-04-19 12:45:01.985]Container exited with a non-zero exit code 1. Error 
file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster

Please check whether your etc/hadoop/mapred-site.xml contains the below 
configuration:
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>

For more detailed output, check the application tracking page: 
http://quasar-plnefj-2.quasar-plnefj.root.hwx.site:8088/cluster/app/application_1587300264561_0001
 Then click on links to logs of each attempt.
...
2020-04-19 05:49:12,148 INFO:util:* End test_app_API (yarn.suite.YarnAPITests) *
{code}


  was:
Sometimes when a container fails to launch, it can be pretty hard to figure out 
why it failed.

Similar to YARN-4309, we can add a switch to control if the printing of 
environment variables and Java classpath should be done.
As a bonus, 
[jdeps|https://docs.oracle.com/javase/8/docs/technotes/tools/unix/jdeps.html] 
could also be utilized to print some verbose info about the classpath. 

When log aggregation occurs, all this information will automatically get 
collected and make debugging such container launch failures much easier.

Below is an example output when the user faces a classpath configuration issue: 

{code:java}
End of LogType:prelaunch.err
**
2020-04-19 05:49:12,145 DEBUG:app_info:Diagnostics of the failed app
2020-04-19 05:49:12,145 DEBUG:app_info:Application 
application_1587300264561_0001 failed 2 times due to AM Container for 
appattempt_1587300264561_0001_02 exited with  exitCode: 1
Failing this attempt.Diagnostics: [2020-04-19 12:45:01.955]Exception from 
container-launch.
Container id: container_e60_1587300264561_0001_02_01
Exit code: 1
Exception message: Launch container failed
Shell output: main : command provided 1
main : run as user is systest
main : requested yarn user is systest
Getting exit code file...
Creating script paths...
Writing pid file...
Writing to tmp file 

[jira] [Created] (YARN-10264) Add container launch related env / classpath debug info to container logs when a container fails

2020-05-13 Thread Szilard Nemeth (Jira)
Szilard Nemeth created YARN-10264:
-

 Summary: Add container launch related env / classpath debug info 
to container logs when a container fails
 Key: YARN-10264
 URL: https://issues.apache.org/jira/browse/YARN-10264
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Szilard Nemeth
Assignee: Szilard Nemeth


Sometimes when a container fails to launch, it can be pretty hard to figure out 
why it failed.

Similar to YARN-4309, we can add a switch to control whether the environment 
variables and the Java classpath should be printed.
As a bonus, 
[jdeps|https://docs.oracle.com/javase/8/docs/technotes/tools/unix/jdeps.html] 
could also be utilized to print some verbose info about the classpath. 

When log aggregation occurs, all this information will automatically get 
collected and make debugging such container launch failures much easier.

Below is an example output when the user faces a classpath configuration issue: 

{code:java}
End of LogType:prelaunch.err
**
2020-04-19 05:49:12,145 DEBUG:app_info:Diagnostics of the failed app
2020-04-19 05:49:12,145 DEBUG:app_info:Application 
application_1587300264561_0001 failed 2 times due to AM Container for 
appattempt_1587300264561_0001_02 exited with  exitCode: 1
Failing this attempt.Diagnostics: [2020-04-19 12:45:01.955]Exception from 
container-launch.
Container id: container_e60_1587300264561_0001_02_01
Exit code: 1
Exception message: Launch container failed
Shell output: main : command provided 1
main : run as user is systest
main : requested yarn user is systest
Getting exit code file...
Creating script paths...
Writing pid file...
Writing to tmp file 
/dataroot/ycloud/yarn/nm/nmPrivate/application_1587300264561_0001/container_e60_1587300264561_0001_02_01/container_e60_1587300264561_0001_02_01.pid.tmp
Writing to cgroup task files...
Creating local dirs...
Launching container...
Getting exit code file...
Creating script paths...


[2020-04-19 12:45:01.984]Container exited with a non-zero exit code 1. Error 
file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster

Please check whether your etc/hadoop/mapred-site.xml contains the below 
configuration:
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>

[2020-04-19 12:45:01.985]Container exited with a non-zero exit code 1. Error 
file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster

Please check whether your etc/hadoop/mapred-site.xml contains the below 
configuration:
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>

For more detailed output, check the application tracking page: 
http://quasar-plnefj-2.quasar-plnefj.root.hwx.site:8088/cluster/app/application_1587300264561_0001
 Then click on links to logs of each attempt.
...
2020-04-19 05:49:12,148 INFO:util:* End test_app_API (yarn.suite.YarnAPITests) *
{code}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10209) DistributedShell should initialize TimelineClient conditionally

2020-05-13 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106295#comment-17106295
 ] 

Bilwa S T commented on YARN-10209:
--

[~bteke] Okay

> DistributedShell should initialize TimelineClient conditionally
> ---
>
> Key: YARN-10209
> URL: https://issues.apache.org/jira/browse/YARN-10209
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Benjamin Teke
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10209.001.patch, YARN-10209.branch-2.6.0.patch, 
> YARN-10209.branch-2.6.0.v2.patch
>
>
> YarnConfiguration was changed along with the introduction of newer Timeline 
> Service versions to include configuration about the version in use. In Hadoop 
> 2.6.0 the distributed shell instantiates the Timeline Client whether or not it 
> is enabled in the configuration. Running this distributed shell on newer 
> Hadoop versions (where the new Timeline Service is available) causes an 
> exception, because the bundled YarnConfiguration doesn't have the necessary 
> version configuration property. By making the Timeline Client initialization 
> conditional, the distributed shell would at least run with the Timeline 
> Service disabled.
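
A minimal sketch of the conditional initialization described above, assuming 
the standard yarn.timeline-service.enabled switch; this is a hedged 
illustration, not the attached patch:

{code:java}
import org.apache.hadoop.yarn.client.api.TimelineClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class ConditionalTimelineInit {
  /**
   * Creates and starts a TimelineClient only when the timeline service is
   * enabled in the configuration, so the distributed shell can still run
   * against clusters where it is disabled. Returns null otherwise.
   */
  static TimelineClient createIfEnabled(YarnConfiguration conf) {
    if (!conf.getBoolean(YarnConfiguration.TIMELINE_SERVICE_ENABLED,
        YarnConfiguration.DEFAULT_TIMELINE_SERVICE_ENABLED)) {
      return null;
    }
    TimelineClient client = TimelineClient.createTimelineClient();
    client.init(conf);
    client.start();
    return client;
  }
}
{code}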



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10209) DistributedShell should initialize TimelineClient conditionally

2020-05-13 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106291#comment-17106291
 ] 

Bilwa S T commented on YARN-10209:
--

[~bteke] Looks like this branch is not rebased.

> DistributedShell should initialize TimelineClient conditionally
> ---
>
> Key: YARN-10209
> URL: https://issues.apache.org/jira/browse/YARN-10209
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Benjamin Teke
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10209.001.patch, YARN-10209.branch-2.6.0.patch, 
> YARN-10209.branch-2.6.0.v2.patch
>
>
> YarnConfiguration was changed along with the introduction of newer Timeline 
> Service versions to include configuration about the version in use. In Hadoop 
> 2.6.0 the distributed shell instantiates the Timeline Client whether or not it 
> is enabled in the configuration. Running this distributed shell on newer 
> Hadoop versions (where the new Timeline Service is available) causes an 
> exception, because the bundled YarnConfiguration doesn't have the necessary 
> version configuration property. By making the Timeline Client initialization 
> conditional, the distributed shell would at least run with the Timeline 
> Service disabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10209) DistributedShell should initialize TimelineClient conditionally

2020-05-13 Thread Benjamin Teke (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106287#comment-17106287
 ] 

Benjamin Teke commented on YARN-10209:
--

Hi [~BilwaST],

Thanks again for the patch. The build issues are because the Dockerfile indeed 
doesn't exist yet on branch-2.6.0.  Fixing the build would have greater cost 
than the benefits of the fix, so we worked around this problem. Closing this 
issue as won't do.

> DistributedShell should initialize TimelineClient conditionally
> ---
>
> Key: YARN-10209
> URL: https://issues.apache.org/jira/browse/YARN-10209
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Benjamin Teke
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10209.001.patch, YARN-10209.branch-2.6.0.patch, 
> YARN-10209.branch-2.6.0.v2.patch
>
>
> YarnConfiguration was changed along with the introduction of newer Timeline 
> Service versions to include configuration about the version in use. In Hadoop 
> 2.6.0 the distributed shell instantiates the Timeline Client whether or not it 
> is enabled in the configuration. Running this distributed shell on newer 
> Hadoop versions (where the new Timeline Service is available) causes an 
> exception, because the bundled YarnConfiguration doesn't have the necessary 
> version configuration property. By making the Timeline Client initialization 
> conditional, the distributed shell would at least run with the Timeline 
> Service disabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient

2020-05-13 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106280#comment-17106280
 ] 

Bilwa S T commented on YARN-9606:
-

[~prabhujoseph] ok thanks

> Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient 
> --
>
> Key: YARN-9606
> URL: https://issues.apache.org/jira/browse/YARN-9606
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9606-001.patch, YARN-9606-002.patch, 
> YARN-9606.003.patch
>
>
> Yarn logs fails for running containers:
> {quote}
>  Unable to fetch log files list
>  Exception in thread "main" java.io.IOException: 
> com.sun.jersey.api.client.ClientHandlerException: 
> javax.net.ssl.SSLHandshakeException: Error while authenticating with 
> endpoint: 
> [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs]
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399)
>  {quote}
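
As a rough sketch of the idea in the summary (passing an SSLFactory into 
AuthenticatedURL so that HTTPS endpoints can be contacted); the wiring into 
LogsCLI#webServiceClient is omitted and this is not the attached patch:

{code:java}
import java.io.IOException;
import java.security.GeneralSecurityException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.authentication.client.AuthenticatedURL;
import org.apache.hadoop.security.authentication.client.KerberosAuthenticator;
import org.apache.hadoop.security.ssl.SSLFactory;

public class SslAwareAuthenticatedUrl {
  /**
   * Builds an AuthenticatedURL configured with the cluster's client-side SSL
   * settings. SSLFactory implements ConnectionConfigurator, so it can be
   * passed straight into AuthenticatedURL to configure HTTPS connections.
   */
  static AuthenticatedURL create(Configuration conf)
      throws IOException, GeneralSecurityException {
    SSLFactory sslFactory = new SSLFactory(SSLFactory.Mode.CLIENT, conf);
    sslFactory.init();
    return new AuthenticatedURL(new KerberosAuthenticator(), sslFactory);
  }
}
{code}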



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-10254) CapacityScheduler incorrect User Group Mapping after leaf queue change

2020-05-13 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106252#comment-17106252
 ] 

Peter Bacsko edited comment on YARN-10254 at 5/13/20, 12:16 PM:


Thanks for the latest patch, [~shuzirra]. I don't really have complaints, except 
for one thing.

I do believe that we need to extend the current code and this patch with more 
logging. My ideas:

1. {{getContextForGroupParent()}} - log if {{groupQueue}} is not found
2. {{getPlacementContextWithParent()}} - log if {{parent}} is null, this should 
be at least a warning.
3. Under the comment "if the queue doesn't exit we return null" - log if 
{{queue}} is null
4. {{getPlacementContextNoParent()}} - log if {{queue}} is null
5. I can see extra messages in {{getPlacementForUser}} potentially useful. For 
example, before each {{return}} statement, we could log stuff like:
{noformat}
} else if (mapping.getQueue().equals(CURRENT_USER_MAPPING)) {
  LOG.debug("Creating placement context based on current-user mapping");
  return getPlacementContext(mapping, user);
} else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) {
  LOG.debug("Creating placement context based on primary-group mapping");
  return getPlacementContext(mapping, getPrimaryGroup(user));
{noformat}

I think it's OK to have them on DEBUG level, with the exception of #2. But to 
me, even INFO sounds reasonable. This class has been changed substantially in 
the past months (15 commits since 2019 Oct), I'd feel safer with extra 
printouts. 


was (Author: pbacsko):
Thanks for the latest patch [~shuzirra] I don't really complaint other than 
logging.

I do believe that we need to extend the current code and this patch with more 
logging. My ideas:

1. {{getContextForGroupParent()}} - log if {{groupQueue}} is not found
2. {{getPlacementContextWithParent()}} - log if {{parent}} is null, this should 
be at least a warning.
3. Under the comment "if the queue doesn't exit we return null" - log if 
{{queue}} is null
4. {{getPlacementContextNoParent()}} - log if {{queue}} is null
5. I can see extra messages in {{getPlacementForUser}} potentially useful. For 
example, before each {{return}} statement, we could log stuff like:
{noformat}
} else if (mapping.getQueue().equals(CURRENT_USER_MAPPING)) {
   LOG.debug("Creating placement context based on current-user 
mapping");
return getPlacementContext(mapping, user);
  } else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) {
   LOG.debug("Creating placement context based on primary-group 
mapping");
   return getPlacementContext(mapping, getPrimaryGroup(user));
{noformat}

I think it's OK to have them on DEBUG level, with the exception of #2. But to 
me, even INFO sounds reasonable. This class has been changed substantially in 
the past months (15 commits since 2019 Oct), I'd feel safer with extra 
printouts. 

> CapacityScheduler incorrect User Group Mapping after leaf queue change
> --
>
> Key: YARN-10254
> URL: https://issues.apache.org/jira/browse/YARN-10254
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
> Attachments: YARN-10254.001.patch, YARN-10254.002.patch, 
> YARN-10254.003.patch
>
>
> YARN-9879 and YARN-10198 introduced some major changes to user group mapping, 
> and some of them unfortunately had a negative impact on the way mapping works.
> In some cases incorrect PlacementContexts were created, where the full queue 
> path was passed as the leaf queue name. This affects how the yarn cli app 
> list displays the queues.
> The u:%user:%primary_group.%user mapping fails with an incorrect validation 
> error when the %primary_group parent queue is a managed parent.
> Group-based rules in certain cases are mapped to root.[primary_group] rules, 
> losing the ability to create deeper structures.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10254) CapacityScheduler incorrect User Group Mapping after leaf queue change

2020-05-13 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106252#comment-17106252
 ] 

Peter Bacsko commented on YARN-10254:
-

Thanks for the latest patch, [~shuzirra]. I don't really have any complaints 
other than about the logging.

I do believe that we need to extend the current code and this patch with more 
logging. My ideas:

1. {{getContextForGroupParent()}} - log if {{groupQueue}} is not found
2. {{getPlacementContextWithParent()}} - log if {{parent}} is null, this should 
be at least a warning.
3. Under the comment "if the queue doesn't exit we return null" - log if 
{{queue}} is null
4. {{getPlacementContextNoParent()}} - log if {{queue}} is null
5. I can see extra messages in {{getPlacementForUser}} potentially useful. For 
example, before each {{return}} statement, we could log stuff like:
{noformat}
} else if (mapping.getQueue().equals(CURRENT_USER_MAPPING)) {
  LOG.debug("Creating placement context based on current-user mapping");
  return getPlacementContext(mapping, user);
} else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) {
  LOG.debug("Creating placement context based on primary-group mapping");
  return getPlacementContext(mapping, getPrimaryGroup(user));
{noformat}

I think it's OK to have them on DEBUG level, with the exception of #2. But to 
me, even INFO sounds reasonable. This class has been changed substantially in 
the past months (15 commits since 2019 Oct), I'd feel safer with extra 
printouts. 

> CapacityScheduler incorrect User Group Mapping after leaf queue change
> --
>
> Key: YARN-10254
> URL: https://issues.apache.org/jira/browse/YARN-10254
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
> Attachments: YARN-10254.001.patch, YARN-10254.002.patch, 
> YARN-10254.003.patch
>
>
> YARN-9879 and YARN-10198 introduced some major changes to user group mapping, 
> and some of them unfortunately had a negative impact on the way mapping works.
> In some cases incorrect PlacementContexts were created, where the full queue 
> path was passed as the leaf queue name. This affects how the yarn cli app 
> list displays the queues.
> The u:%user:%primary_group.%user mapping fails with an incorrect validation 
> error when the %primary_group parent queue is a managed parent.
> Group-based rules in certain cases are mapped to root.[primary_group] rules, 
> losing the ability to create deeper structures.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10108) FS-CS converter: nestedUserQueue with default rule results in invalid queue mapping

2020-05-13 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106208#comment-17106208
 ] 

Peter Bacsko commented on YARN-10108:
-

Thanks for the patch [~shuzirra].

Just one comment - this part from the new testcase can be eliminated completely:

{noformat}
  // submit an app
  submitApp(mockRM, cs.getQueue(PARENT_QUEUE), USER0, USER0, 1, 1);

  // check preconditions
  List appsInC = cs.getAppsInQueue(PARENT_QUEUE);
  assertEquals(1, appsInC.size());
  assertNotNull(cs.getQueue(USER0));

  AutoCreatedLeafQueue autoCreatedLeafQueue =
  (AutoCreatedLeafQueue) cs.getQueue(USER0);
  ManagedParentQueue parentQueue = (ManagedParentQueue) cs.getQueue(
  PARENT_QUEUE);
  assertEquals(parentQueue, autoCreatedLeafQueue.getParent());

  Map expectedChildQueueAbsCapacity =
  populateExpectedAbsCapacityByLabelForParentQueue(1);
  validateInitialQueueEntitlement(parentQueue, USER0,
  expectedChildQueueAbsCapacity, accessibleNodeLabelsOnC);

  validateUserAndAppLimits(autoCreatedLeafQueue, 1000, 1000);
  validateContainerLimits(autoCreatedLeafQueue);

  assertTrue(autoCreatedLeafQueue
  .getOrderingPolicy() instanceof FairOrderingPolicy);
{noformat}

Other than that LGTM +1 (non-binding).

> FS-CS converter: nestedUserQueue with default rule results in invalid queue 
> mapping
> ---
>
> Key: YARN-10108
> URL: https://issues.apache.org/jira/browse/YARN-10108
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Gergely Pollak
>Priority: Major
>  Labels: fs2cs
> Attachments: YARN-10108.001.patch, YARN-10108.002.patch
>
>
> FS Queue Placement Policy
> {code:java}
> 
> 
> 
> 
> 
>  {code}
> gets mapped to an invalid CS queue mapping "u:%user:root.users.%user"
> RM fails to start with above queue mapping in CS
> {code:java}
> 2020-01-28 00:19:12,889 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
> ResourceManager
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: mapping 
> contains invalid or non-leaf queue [%user] and invalid parent queue 
> [root.users]
>   at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:173)
>   at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:829)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1247)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:324)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1534)
> Caused by: java.io.IOException: mapping contains invalid or non-leaf queue 
> [%user] and invalid parent queue [root.users]
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.placement.QueuePlacementRuleUtils.validateQueueMappingUnderParentQueue(QueuePlacementRuleUtils.java:48)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.validateAndGetAutoCreatedQueueMapping(UserGroupMappingPlacementRule.java:363)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.initialize(UserGroupMappingPlacementRule.java:300)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getUserGroupMappingPlacementRule(CapacityScheduler.java:671)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updatePlacementRules(CapacityScheduler.java:712)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:753)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:361)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:426)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   ... 7 more
> {code}
> QueuePlacementConverter#handleNestedRule has to be fixed.
> {code:java}
> else if (pr instanceof 

[jira] [Commented] (YARN-10209) DistributedShell should initialize TimelineClient conditionally

2020-05-13 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106189#comment-17106189
 ] 

Hadoop QA commented on YARN-10209:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  0m 
10s{color} | {color:red} Dockerfile 
'/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/sourcedir/dev-support/docker/Dockerfile'
 not found. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-10209 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12999255/YARN-10209.branch-2.6.0.v2.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/26028/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |


This message was automatically generated.



> DistributedShell should initialize TimelineClient conditionally
> ---
>
> Key: YARN-10209
> URL: https://issues.apache.org/jira/browse/YARN-10209
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Benjamin Teke
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10209.001.patch, YARN-10209.branch-2.6.0.patch, 
> YARN-10209.branch-2.6.0.v2.patch
>
>
> YarnConfiguration was changed along with the introduction of newer Timeline 
> Service versions to include configuration about the version in use. In Hadoop 
> 2.6.0 the distributed shell instantiates the Timeline Client whether or not it 
> is enabled in the configuration. Running this distributed shell on newer 
> Hadoop versions (where the new Timeline Service is available) causes an 
> exception, because the bundled YarnConfiguration doesn't have the necessary 
> version configuration property. By making the Timeline Client initialization 
> conditional, the distributed shell would at least run with the Timeline 
> Service disabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9930) Support max running app logic for CapacityScheduler

2020-05-13 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106183#comment-17106183
 ] 

Peter Bacsko commented on YARN-9930:


[~cane] any updates here? Do you have a patch?

> Support max running app logic for CapacityScheduler
> ---
>
> Key: YARN-9930
> URL: https://issues.apache.org/jira/browse/YARN-9930
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 3.1.0, 3.1.1
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
>
> In FairScheduler, there is a max-running-apps limitation which lets excess 
> applications stay pending.
> But in CapacityScheduler there is no feature like max running apps; it only 
> has a max-applications limit, and excess jobs are rejected directly on the 
> client side.
> In this jira I want to implement this semantic for CapacityScheduler.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10154) CS Dynamic Queues cannot be configured with absolute resources

2020-05-13 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106159#comment-17106159
 ] 

Hudson commented on YARN-10154:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18249 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18249/])
YARN-10154. Addendum Patch which fixes below bugs (pjoseph: rev 
450e5aa9dd49eae46a0e05151bbddc56083eafd5)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestAbsoluteResourceWithAutoQueue.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ManagedParentQueue.java


> CS Dynamic Queues cannot be configured with absolute resources
> --
>
> Key: YARN-10154
> URL: https://issues.apache.org/jira/browse/YARN-10154
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.3
>Reporter: Sunil G
>Assignee: Manikandan R
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-10154.001.patch, YARN-10154.002.patch, 
> YARN-10154.003.patch, YARN-10154.addendum-001.patch, 
> YARN-10154.addendum-002.patch, YARN-10154.addendum-003.patch, 
> YARN-10154.addendum-004.patch
>
>
> In CS, a ManagedParent Queue and its template cannot take an absolute resource 
> value like 
> [memory=8192,vcores=8]
>  This Jira is to track and improve the configuration reading module of 
> DynamicQueue to support absolute resource values.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10154) CS Dynamic Queues cannot be configured with absolute resources

2020-05-13 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106140#comment-17106140
 ] 

Prabhu Joseph commented on YARN-10154:
--

Thanks [~sunilg] and [~maniraj...@gmail.com] for the review. Have committed  
[^YARN-10154.addendum-004.patch]  to trunk.

> CS Dynamic Queues cannot be configured with absolute resources
> --
>
> Key: YARN-10154
> URL: https://issues.apache.org/jira/browse/YARN-10154
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.3
>Reporter: Sunil G
>Assignee: Manikandan R
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-10154.001.patch, YARN-10154.002.patch, 
> YARN-10154.003.patch, YARN-10154.addendum-001.patch, 
> YARN-10154.addendum-002.patch, YARN-10154.addendum-003.patch, 
> YARN-10154.addendum-004.patch
>
>
> In CS, a ManagedParent Queue and its template cannot take an absolute resource 
> value like 
> [memory=8192,vcores=8]
>  This Jira is to track and improve the configuration reading module of 
> DynamicQueue to support absolute resource values.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10259) Reserved Containers not allocated from available space of other nodes in CandidateNodeSet in MultiNodePlacement

2020-05-13 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106082#comment-17106082
 ] 

Hadoop QA commented on YARN-10259:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 26m 
30s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 48s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
42s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
40s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 27s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}105m 
29s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}196m  1s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-YARN-Build/26027/artifact/out/Dockerfile
 |
| JIRA Issue | YARN-10259 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13002781/YARN-10259-003.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient findbugs checkstyle |
| uname | Linux bd1ffdcfffd1 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / d60496e6c66 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/26027/testReport/ |
| Max. process+thread count | 820 (vs. ulimit of 5500) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 

[jira] [Resolved] (YARN-10158) FS-CS converter: convert property yarn.scheduler.fair.update-interval-ms

2020-05-13 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko resolved YARN-10158.
-
Resolution: Won't Do

> FS-CS converter: convert property yarn.scheduler.fair.update-interval-ms
> 
>
> Key: YARN-10158
> URL: https://issues.apache.org/jira/browse/YARN-10158
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10158) FS-CS converter: convert property yarn.scheduler.fair.update-interval-ms

2020-05-13 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106045#comment-17106045
 ] 

Peter Bacsko commented on YARN-10158:
-

As discussed offline with [~leftnoteasy], it's a very low-level property. 
Preemption at this level works differently in FS and CS, so it's fine to ignore 
the conversion of such settings.

> FS-CS converter: convert property yarn.scheduler.fair.update-interval-ms
> 
>
> Key: YARN-10158
> URL: https://issues.apache.org/jira/browse/YARN-10158
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org