[jira] [Commented] (YARN-10063) Usage output of container-executor binary needs to include --http/--https argument

2020-01-06 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009429#comment-17009429
 ] 

Peter Bacsko commented on YARN-10063:
-

[~sahuja] yeah, this looks acceptable to me. 

> Usage output of container-executor binary needs to include --http/--https 
> argument
> --
>
> Key: YARN-10063
> URL: https://issues.apache.org/jira/browse/YARN-10063
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Siddharth Ahuja
>Assignee: Siddharth Ahuja
>Priority: Minor
> Attachments: YARN-10063.001.patch
>
>
> YARN-8448/YARN-6586 seem to have introduced new options, "\--http" 
> (default) and "\--https", that can be passed to the 
> container-executor binary, see:
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L564
> and 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L521
> however, the usage output seems to have missed this:
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L74
> Raising this jira to improve this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9956) Improve connection error message for YARN ApiServerClient

2020-01-06 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009394#comment-17009394
 ] 

Prabhu Joseph commented on YARN-9956:
-

Thanks [~eyang].

> Improve connection error message for YARN ApiServerClient
> -
>
> Key: YARN-9956
> URL: https://issues.apache.org/jira/browse/YARN-9956
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Yang
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9956-001.patch, YARN-9956-002.patch, 
> YARN-9956-003.patch, YARN-9956-004.patch, YARN-9956-005.patch
>
>
> In an HA environment, the yarn.resourcemanager.webapp.address configuration 
> is optional.  ApiServiceClient may produce a confusing error message like 
> this:
> {code}
> 19/10/30 20:13:42 INFO client.ApiServiceClient: Fail to connect to: 
> host1.example.com:8090
> 19/10/30 20:13:42 INFO client.ApiServiceClient: Fail to connect to: 
> host2.example.com:8090
> 19/10/30 20:13:42 INFO util.log: Logging initialized @2301ms
> 19/10/30 20:13:42 ERROR client.ApiServiceClient: Error: {}
> GSSException: No valid credentials provided (Mechanism level: Server not 
> found in Kerberos database (7) - LOOKING_UP_SERVER)
>   at 
> java.security.jgss/sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:771)
>   at 
> java.security.jgss/sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:266)
>   at 
> java.security.jgss/sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:196)
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient$1.run(ApiServiceClient.java:125)
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient$1.run(ApiServiceClient.java:105)
>   at java.base/java.security.AccessController.doPrivileged(Native Method)
>   at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient.generateToken(ApiServiceClient.java:105)
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient.getApiClient(ApiServiceClient.java:290)
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient.getApiClient(ApiServiceClient.java:271)
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient.actionLaunch(ApiServiceClient.java:416)
>   at 
> org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:589)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>   at 
> org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:125)
> Caused by: KrbException: Server not found in Kerberos database (7) - 
> LOOKING_UP_SERVER
>   at 
> java.security.jgss/sun.security.krb5.KrbTgsRep.(KrbTgsRep.java:73)
>   at 
> java.security.jgss/sun.security.krb5.KrbTgsReq.getReply(KrbTgsReq.java:251)
>   at 
> java.security.jgss/sun.security.krb5.KrbTgsReq.sendAndGetCreds(KrbTgsReq.java:262)
>   at 
> java.security.jgss/sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:308)
>   at 
> java.security.jgss/sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:126)
>   at 
> java.security.jgss/sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:458)
>   at 
> java.security.jgss/sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:695)
>   ... 15 more
> Caused by: KrbException: Identifier doesn't match expected value (906)
>   at 
> java.security.jgss/sun.security.krb5.internal.KDCRep.init(KDCRep.java:140)
>   at 
> java.security.jgss/sun.security.krb5.internal.TGSRep.init(TGSRep.java:65)
>   at 
> java.security.jgss/sun.security.krb5.internal.TGSRep.(TGSRep.java:60)
>   at 
> java.security.jgss/sun.security.krb5.KrbTgsRep.(KrbTgsRep.java:55)
>   ... 21 more
> 19/10/30 20:13:42 ERROR client.ApiServiceClient: Fail to launch application: 
> java.io.IOException: java.lang.reflect.UndeclaredThrowableException
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient.getApiClient(ApiServiceClient.java:293)
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient.getApiClient(ApiServiceClient.java:271)
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient.actionLaunch(ApiServiceClient.java:416)
>   at 
> org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:589)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>   at 
> org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:125)
> {code}
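
For illustration, here is a minimal sketch of the kind of aggregated, 
human-readable connection error this summary asks for: probe every candidate 
RM address and report all failures together instead of a bare GSSException. 
The names (RmEndpointProbe, findReachable, the 5000 ms timeout) are 
hypothetical; this is a sketch of the idea, not the YARN-9956 patch itself:

{code}
// Hypothetical sketch: try each candidate RM address and list every
// failure in one message instead of surfacing a bare GSSException.
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.List;

public class RmEndpointProbe {

  /** Returns the first reachable address, or throws listing every failure. */
  public static InetSocketAddress findReachable(List<InetSocketAddress> candidates)
      throws IOException {
    StringBuilder failures = new StringBuilder();
    for (InetSocketAddress addr : candidates) {
      try (Socket socket = new Socket()) {
        socket.connect(addr, 5000);   // plain TCP reachability check
        return addr;                  // first endpoint that accepts a connection
      } catch (IOException e) {
        failures.append(addr).append(": ").append(e.getMessage()).append('\n');
      }
    }
    throw new IOException(
        "Fail to connect to any ResourceManager web endpoint:\n" + failures);
  }
}
{code}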

[jira] [Commented] (YARN-10069) Showing jstack on UI for containers

2020-01-06 Thread Akhil PB (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009392#comment-17009392
 ] 

Akhil PB commented on YARN-10069:
-

[~cane] is this jira about showing the stack trace for failed containers in UI2?

> Showing jstack on UI for containers
> ---
>
> Key: YARN-10069
> URL: https://issues.apache.org/jira/browse/YARN-10069
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
>
> In this jira, I want to post a patch to support showing jstack output on 
> the container UI.
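
As a rough illustration of what such a patch might do, the sketch below 
shells out to the jstack binary for a container's pid and collects the output 
so a UI could display it. The class and method names (JstackCapture, 
dumpThreads) are hypothetical, not the patch itself:

{code}
// Hypothetical sketch: run `jstack <pid>` for a container's JVM and collect
// the output for display. Not the actual YARN-10069 patch.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class JstackCapture {

  /** Runs `jstack <pid>` and returns its combined output as one string. */
  public static String dumpThreads(long pid)
      throws IOException, InterruptedException {
    Process proc = new ProcessBuilder("jstack", Long.toString(pid))
        .redirectErrorStream(true)   // merge stderr so errors appear in the dump
        .start();
    StringBuilder out = new StringBuilder();
    try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(proc.getInputStream()))) {
      String line;
      while ((line = reader.readLine()) != null) {
        out.append(line).append('\n');
      }
    }
    proc.waitFor();
    return out.toString();
  }
}
{code}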



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7913) Improve error handling when application recovery fails with exception

2020-01-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009382#comment-17009382
 ] 

Hadoop QA commented on YARN-7913:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
35s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 59s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 44s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 86m 
56s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}143m 20s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | YARN-7913 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12990058/YARN-7913.003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 5d59e5f1e264 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 59aac00 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/25338/testReport/ |
| Max. process+thread count | 827 (vs. ulimit of 5500) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/25338/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Improve error handling when application recovery fails with exception

[jira] [Updated] (YARN-7913) Improve error handling when application recovery fails with exception

2020-01-06 Thread Wilfred Spiegelenburg (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated YARN-7913:

Attachment: YARN-7913.003.patch

> Improve error handling when application recovery fails with exception
> -
>
> Key: YARN-7913
> URL: https://issues.apache.org/jira/browse/YARN-7913
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: Gergo Repas
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: YARN-7913.000.poc.patch, YARN-7913.001.patch, 
> YARN-7913.002.patch, YARN-7913.003.patch
>
>
> There are edge cases when the application recovery fails with an exception.
> Example failure scenario:
>  * setup: a queue is a leaf queue in the primary RM's config and the same 
> queue is a parent queue in the secondary RM's config.
>  * When failover happens with this setup, the recovery will fail for 
> applications on this queue, and an APP_REJECTED event will be dispatched to 
> the async dispatcher. On the same thread (that handles the recovery), a 
> NullPointerException is thrown when the applicationAttempt is tried to be 
> recovered 
> (https://github.com/apache/hadoop/blob/55066cc53dc22b68f9ca55a0029741d6c846be0a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java#L494).
>  I don't see a good way to avoid the NPE in this scenario, because when the 
> NPE occurs the APP_REJECTED has not been processed yet, and we don't know 
> that the application recovery failed.
> Currently the first exception will abort the recovery, and if there are X 
> applications, there will be ~X passive -> active RM transition attempts - the 
> passive -> active RM transition will only succeed when the last APP_REJECTED 
> event is processed on the async dispatcher thread.
> _The point of this ticket is to improve the error handling and reduce the 
> number of passive -> active RM transition attempts (solving the above 
> described failure scenario isn't in scope)._
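
As a rough sketch of the error-containment idea described above (catching the 
failure for one application so that recovery of the others can continue, 
instead of aborting the whole passive -> active transition), consider the 
following. The App interface and RecoverySketch class are simplified 
stand-ins, not the actual RM recovery code or the attached patch:

{code}
// Hypothetical sketch: recover each application independently so one bad
// queue mapping does not abort the whole passive -> active transition.
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

public class RecoverySketch {
  private static final Logger LOG =
      Logger.getLogger(RecoverySketch.class.getName());

  interface App { String getId(); void recover() throws Exception; }

  /** Recovers each application independently; returns how many failed. */
  public static int recoverAll(List<App> apps) {
    int failed = 0;
    for (App app : apps) {
      try {
        app.recover();
      } catch (Exception e) {
        failed++;
        LOG.log(Level.WARNING, "Recovery failed for " + app.getId()
            + ", continuing with remaining applications", e);
      }
    }
    return failed;
  }
}
{code}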



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7913) Improve error handling when application recovery fails with exception

2020-01-06 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009329#comment-17009329
 ] 

Wilfred Spiegelenburg commented on YARN-7913:
-

The ACL test case did not fail in the expected way when reversing the fix, as 
ACLs were not correctly set up in the test config.
Updating the patch with the proper config. [^YARN-7913.003.patch] 

> Improve error handling when application recovery fails with exception
> -
>
> Key: YARN-7913
> URL: https://issues.apache.org/jira/browse/YARN-7913
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: Gergo Repas
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: YARN-7913.000.poc.patch, YARN-7913.001.patch, 
> YARN-7913.002.patch, YARN-7913.003.patch
>
>
> There are edge cases when the application recovery fails with an exception.
> Example failure scenario:
>  * setup: a queue is a leaf queue in the primary RM's config and the same 
> queue is a parent queue in the secondary RM's config.
>  * When failover happens with this setup, the recovery will fail for 
> applications on this queue, and an APP_REJECTED event will be dispatched to 
> the async dispatcher. On the same thread (that handles the recovery), a 
> NullPointerException is thrown when the applicationAttempt is tried to be 
> recovered 
> (https://github.com/apache/hadoop/blob/55066cc53dc22b68f9ca55a0029741d6c846be0a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java#L494).
>  I don't see a good way to avoid the NPE in this scenario, because when the 
> NPE occurs the APP_REJECTED has not been processed yet, and we don't know 
> that the application recovery failed.
> Currently the first exception will abort the recovery, and if there are X 
> applications, there will be ~X passive -> active RM transition attempts - the 
> passive -> active RM transition will only succeed when the last APP_REJECTED 
> event is processed on the async dispatcher thread.
> _The point of this ticket is to improve the error handling and reduce the 
> number of passive -> active RM transition attempts (solving the above 
> described failure scenario isn't in scope)._



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7913) Improve error handling when application recovery fails with exception

2020-01-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009304#comment-17009304
 ] 

Hadoop QA commented on YARN-7913:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
42s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 15s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 46s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 85m 
49s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}143m 32s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | YARN-7913 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12990050/YARN-7913.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 7fbf987bdaec 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 819159f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/25337/testReport/ |
| Max. process+thread count | 830 (vs. ulimit of 5500) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/25337/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Improve error handling when application recovery fails with exception

[jira] [Updated] (YARN-10063) Usage output of container-executor binary needs to include --http/--https argument

2020-01-06 Thread Siddharth Ahuja (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Ahuja updated YARN-10063:
---
Description: 
YARN-8448/YARN-6586 seem to have introduced new options, "\--http" (default) 
and "\--https", that can be passed to the container-executor 
binary, see:

https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L564

and 

https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L521

however, the usage output seems to have missed this:

https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L74

Raising this jira to improve this.

  was:
YARN-8448/YARN-6586 seems to have introduced a new option - "--http" (default) 
and "--https" that is possible to be passed in to the container-executor 
binary, see :

https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L564

and 

https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L521

however, the usage output seems to have missed this:

https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L74

Raising this jira to improve this.


> Usage output of container-executor binary needs to include --http/--https 
> argument
> --
>
> Key: YARN-10063
> URL: https://issues.apache.org/jira/browse/YARN-10063
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Siddharth Ahuja
>Assignee: Siddharth Ahuja
>Priority: Minor
> Attachments: YARN-10063.001.patch
>
>
> YARN-8448/YARN-6586 seem to have introduced new options, "\--http" 
> (default) and "\--https", that can be passed to the 
> container-executor binary, see:
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L564
> and 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L521
> however, the usage output seems to have missed this:
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L74
> Raising this jira to improve this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-10063) Usage output of container-executor binary needs to include --http/--https argument

2020-01-06 Thread Siddharth Ahuja (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009286#comment-17009286
 ] 

Siddharth Ahuja edited comment on YARN-10063 at 1/7/20 1:57 AM:


Hi [~pbacsko], thanks for your update. I see your point; however, if we make 
\--https optional, then we still need to specify \--http there (even if no 
additional details need to be supplied). This JIRA was actually created 
because someone noticed "\--http" in the command array, but it wasn't clear 
when/how it got introduced, because the usage output of the 
container-executor binary does not mention it, e.g.:

{code}
Full command array for failed execution:
[nice, -n, 0, /var/lib/yarn-ce/bin/container-executor, usr, usr, 1, 
application_1576461726457_5994, container_e136_1576461726457_5994_01_001801, 
/data07/yarn/nm/usercache/usr_ds_exec_hdp/appcache/application_1576461726457_5994/container_e136_1576461726457_5994_01_001801,
 
/data09/yarn/nm/nmPrivate/application_1576461726457_5994/container_e136_1576461726457_5994_01_001801/launch_container.sh,
 
/data08/yarn/nm/nmPrivate/application_1576461726457_5994/container_e136_1576461726457_5994_01_001801/container_e136_1576461726457_5994_01_001801.tokens,
 --http, 
/data06/yarn/nm/nmPrivate/application_1576461726457_5994/container_e136_1576461726457_5994_01_001801/container_e136_1576461726457_5994_01_001801.pid,
 
/data01/yarn/nm%/data02/yarn/nm%/data03/yarn/nm%/data04/yarn/nm%/data05/yarn/nm%/data06/yarn/nm%/data07/yarn/nm%/data08/yarn/nm%/data09/yarn/nm%/data10/yarn/nm%/data11/yarn/nm%/data12/yarn/nm,
 
/data01/yarn/container-logs%/data02/yarn/container-logs%/data03/yarn/container-logs%/data04/yarn/container-logs%/data05/yarn/container-logs%/data06/yarn/container-logs%/data07/yarn/container-logs%/data08/yarn/container-logs%/data09/yarn/container-logs%/data10/yarn/container-logs%/data11/yarn/container-logs%/data12/yarn/container-logs%/opt/yarn/container-logs,
 
cgroups=/var/lib/yarn-ce/cgroups/cpu/hadoop-yarn/container_e136_1576461726457_5994_01_001801/tasks]
{code}

As per:

https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DefaultLinuxContainerRuntime.java#L127

and 

https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DefaultLinuxContainerRuntime.java#L131

we will have either --http or --https specified, and if the latter is 
specified it will be followed by the keystore and truststore details.

Therefore, we could have something like this:

{code}
launch container:   1 appid containerid workdir container-script tokens 
--http | --https keystorepath truststorepath pidfile nm-local-dirs nm-log-dirs 
resources
launch docker container:   4 appid containerid workdir container-script 
tokens --http | --https keystorepath truststorepath pidfile nm-local-dirs 
nm-log-dirs docker-command-file resources
{code}

Please let me know if you are happy with this and I will provide another patch 
based on the above. Thanks!


was (Author: sahuja):
Hi [~pbacsko], thanks for your update. I see your point, however, if we make 
\--https as optional, then, we still need to specify \--http there (even if 
there is no additional details that need to be supplied). The reason why this 
JIRA was actually created was because someone noticed "\--http" in the command 
array but it wasn't clear when/how this got introduced because the usage output 
from container-executor binary does not have this, e.g:

{code}
Full command array for failed execution:
[nice, -n, 0, /var/lib/yarn-ce/bin/container-executor, usr, usr, 1, 
application_1576461726457_5994, container_e136_1576461726457_5994_01_001801, 
/data07/yarn/nm/usercache/usr_ds_exec_hdp/appcache/application_1576461726457_5994/container_e136_1576461726457_5994_01_001801,
 
/data09/yarn/nm/nmPrivate/application_1576461726457_5994/container_e136_1576461726457_5994_01_001801/launch_container.sh,
 
/data08/yarn/nm/nmPrivate/application_1576461726457_5994/container_e136_1576461726457_5994_01_001801/container_e136_1576461726457_5994_01_001801.tokens,
 --http, 
/data06/yarn/nm/nmPrivate/application_1576461726457_5994/container_e136_1576461726457_5994_01_001801/container_e136_1576461726457_5994_01_001801.pid,
 
/data01/yarn/nm%/data02/yarn/nm%/data03/yarn/nm%/data04/yarn/nm%/data05/yarn/nm%/data06/yarn/nm%/data07/yarn/nm%/data08/yarn/nm%/data09/yarn/nm%/data10/yarn/nm%/data11/yarn/nm%/data12/yarn/nm,
 
/data01/yarn/container-logs%/data02/yarn/container-logs%/data03/yarn/container-logs%/data04/yarn/container-logs%/data05/yarn/container-logs%/data06/yarn/container-logs%/data07/yarn/container-l

[jira] [Comment Edited] (YARN-10063) Usage output of container-executor binary needs to include --http/--https argument

2020-01-06 Thread Siddharth Ahuja (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009286#comment-17009286
 ] 

Siddharth Ahuja edited comment on YARN-10063 at 1/7/20 1:56 AM:


Hi [~pbacsko], thanks for your update. I see your point; however, if we make 
\--https optional, then we still need to specify \--http there (even if no 
additional details need to be supplied). This JIRA was actually created 
because someone noticed "\--http" in the command array, but it wasn't clear 
when/how it got introduced, because the usage output of the 
container-executor binary does not mention it, e.g.:

{code}
Full command array for failed execution:
[nice, -n, 0, /var/lib/yarn-ce/bin/container-executor, usr, usr, 1, 
application_1576461726457_5994, container_e136_1576461726457_5994_01_001801, 
/data07/yarn/nm/usercache/usr_ds_exec_hdp/appcache/application_1576461726457_5994/container_e136_1576461726457_5994_01_001801,
 
/data09/yarn/nm/nmPrivate/application_1576461726457_5994/container_e136_1576461726457_5994_01_001801/launch_container.sh,
 
/data08/yarn/nm/nmPrivate/application_1576461726457_5994/container_e136_1576461726457_5994_01_001801/container_e136_1576461726457_5994_01_001801.tokens,
 --http, 
/data06/yarn/nm/nmPrivate/application_1576461726457_5994/container_e136_1576461726457_5994_01_001801/container_e136_1576461726457_5994_01_001801.pid,
 
/data01/yarn/nm%/data02/yarn/nm%/data03/yarn/nm%/data04/yarn/nm%/data05/yarn/nm%/data06/yarn/nm%/data07/yarn/nm%/data08/yarn/nm%/data09/yarn/nm%/data10/yarn/nm%/data11/yarn/nm%/data12/yarn/nm,
 
/data01/yarn/container-logs%/data02/yarn/container-logs%/data03/yarn/container-logs%/data04/yarn/container-logs%/data05/yarn/container-logs%/data06/yarn/container-logs%/data07/yarn/container-logs%/data08/yarn/container-logs%/data09/yarn/container-logs%/data10/yarn/container-logs%/data11/yarn/container-logs%/data12/yarn/container-logs%/opt/yarn/container-logs,
 
cgroups=/var/lib/yarn-ce/cgroups/cpu/hadoop-yarn/container_e136_1576461726457_5994_01_001801/tasks]
{code}

As per:

https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DefaultLinuxContainerRuntime.java#L127

and 

https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DefaultLinuxContainerRuntime.java#L131

we will have either --http or --https specified, and if the latter is 
specified it will be followed by the keystore and truststore details.

Therefore, we could have something like this:

{code}
launch container:   1 appid containerid workdir container-script tokens 
--http | --https keystorepath truststorepath pidfile nm-local-dirs nm-log-dirs 
resources
launch docker container:   4 appid containerid workdir container-script 
tokens --http | --https keystorepath truststorepath pidfile nm-local-dirs 
nm-log-dirs docker-command-file resources
{code}

Please let me know if you are happy with this and I will provide another patch 
based on the above. Thanks!


was (Author: sahuja):
Hi [~pbacsko], thanks for your update. I see your point, however, if we make 
--https as optional, then, we still need to specify --http there (even if there 
is no additional details that need to be supplied). The reason why this JIRA 
was actually created was because someone noticed "--http" in the command array 
but it wasn't clear when/how this got introduced because the usage output from 
container-executor binary does not have this, e.g:

{code}
Full command array for failed execution:
[nice, -n, 0, /var/lib/yarn-ce/bin/container-executor, usr, usr, 1, 
application_1576461726457_5994, container_e136_1576461726457_5994_01_001801, 
/data07/yarn/nm/usercache/usr_ds_exec_hdp/appcache/application_1576461726457_5994/container_e136_1576461726457_5994_01_001801,
 
/data09/yarn/nm/nmPrivate/application_1576461726457_5994/container_e136_1576461726457_5994_01_001801/launch_container.sh,
 
/data08/yarn/nm/nmPrivate/application_1576461726457_5994/container_e136_1576461726457_5994_01_001801/container_e136_1576461726457_5994_01_001801.tokens,
 --http, 
/data06/yarn/nm/nmPrivate/application_1576461726457_5994/container_e136_1576461726457_5994_01_001801/container_e136_1576461726457_5994_01_001801.pid,
 
/data01/yarn/nm%/data02/yarn/nm%/data03/yarn/nm%/data04/yarn/nm%/data05/yarn/nm%/data06/yarn/nm%/data07/yarn/nm%/data08/yarn/nm%/data09/yarn/nm%/data10/yarn/nm%/data11/yarn/nm%/data12/yarn/nm,
 
/data01/yarn/container-logs%/data02/yarn/container-logs%/data03/yarn/container-logs%/data04/yarn/container-logs%/data05/yarn/container-logs%/data06/yarn/container-logs%/data07/yarn/container-logs%

[jira] [Commented] (YARN-10063) Usage output of container-executor binary needs to include --http/--https argument

2020-01-06 Thread Siddharth Ahuja (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009286#comment-17009286
 ] 

Siddharth Ahuja commented on YARN-10063:


Hi [~pbacsko], thanks for your update. I see your point; however, if we make 
--https optional, then we still need to specify --http there (even if no 
additional details need to be supplied). This JIRA was actually created 
because someone noticed "--http" in the command array, but it wasn't clear 
when/how it got introduced, because the usage output of the 
container-executor binary does not mention it, e.g.:

{code}
Full command array for failed execution:
[nice, -n, 0, /var/lib/yarn-ce/bin/container-executor, usr, usr, 1, 
application_1576461726457_5994, container_e136_1576461726457_5994_01_001801, 
/data07/yarn/nm/usercache/usr_ds_exec_hdp/appcache/application_1576461726457_5994/container_e136_1576461726457_5994_01_001801,
 
/data09/yarn/nm/nmPrivate/application_1576461726457_5994/container_e136_1576461726457_5994_01_001801/launch_container.sh,
 
/data08/yarn/nm/nmPrivate/application_1576461726457_5994/container_e136_1576461726457_5994_01_001801/container_e136_1576461726457_5994_01_001801.tokens,
 --http, 
/data06/yarn/nm/nmPrivate/application_1576461726457_5994/container_e136_1576461726457_5994_01_001801/container_e136_1576461726457_5994_01_001801.pid,
 
/data01/yarn/nm%/data02/yarn/nm%/data03/yarn/nm%/data04/yarn/nm%/data05/yarn/nm%/data06/yarn/nm%/data07/yarn/nm%/data08/yarn/nm%/data09/yarn/nm%/data10/yarn/nm%/data11/yarn/nm%/data12/yarn/nm,
 
/data01/yarn/container-logs%/data02/yarn/container-logs%/data03/yarn/container-logs%/data04/yarn/container-logs%/data05/yarn/container-logs%/data06/yarn/container-logs%/data07/yarn/container-logs%/data08/yarn/container-logs%/data09/yarn/container-logs%/data10/yarn/container-logs%/data11/yarn/container-logs%/data12/yarn/container-logs%/opt/yarn/container-logs,
 
cgroups=/var/lib/yarn-ce/cgroups/cpu/hadoop-yarn/container_e136_1576461726457_5994_01_001801/tasks]
{code}

As per:

https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DefaultLinuxContainerRuntime.java#L127

and 

https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DefaultLinuxContainerRuntime.java#L131

we will have either --http or --https specified, and if the latter is 
specified it will be followed by the keystore and truststore details.

Therefore, we could have something like this:

{code}
launch container:   1 appid containerid workdir container-script tokens 
--http | --https keystorepath truststorepath pidfile nm-local-dirs nm-log-dirs 
resources
launch docker container:   4 appid containerid workdir container-script 
tokens --http | --https keystorepath truststorepath pidfile nm-local-dirs 
nm-log-dirs docker-command-file resources
{code}

Please let me know if you are happy with this and I will provide another patch 
based on the above. Thanks!
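
For reference, the either/or behaviour described above can be summarised in a 
short sketch: the launcher passes --http with no extra arguments, or --https 
followed by the keystore and truststore paths. HttpFlagSketch and 
addSchemeArgs are illustrative names; this condenses the logic at the 
DefaultLinuxContainerRuntime lines linked above rather than reproducing it:

{code}
// Simplified sketch of the flag selection: --http by default, or --https
// followed by the keystore and truststore paths. Not the real YARN code.
import java.util.ArrayList;
import java.util.List;

public class HttpFlagSketch {

  /** Appends the scheme argument(s) to a container-executor command line. */
  public static void addSchemeArgs(List<String> command, boolean https,
      String keystorePath, String truststorePath) {
    if (https) {
      command.add("--https");
      command.add(keystorePath);     // --https always carries both stores
      command.add(truststorePath);
    } else {
      command.add("--http");         // default: no extra arguments
    }
  }

  public static void main(String[] args) {
    List<String> cmd = new ArrayList<>();
    addSchemeArgs(cmd, false, null, null);
    System.out.println(cmd);         // prints [--http]
  }
}
{code}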

> Usage output of container-executor binary needs to include --http/--https 
> argument
> --
>
> Key: YARN-10063
> URL: https://issues.apache.org/jira/browse/YARN-10063
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Siddharth Ahuja
>Assignee: Siddharth Ahuja
>Priority: Minor
> Attachments: YARN-10063.001.patch
>
>
> YARN-8448/YARN-6586 seem to have introduced new options, "--http" 
> (default) and "--https", that can be passed to the 
> container-executor binary, see:
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L564
> and 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L521
> however, the usage output seems to have missed this:
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L74
> Raising this jira to improve this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7387) org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer fails intermittently

2020-01-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009282#comment-17009282
 ] 

Hadoop QA commented on YARN-7387:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 14s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 42s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 86m 
18s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}143m 47s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | YARN-7387 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12990046/YARN-7387.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux c7bddf44f984 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 819159f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/25336/testReport/ |
| Max. process+thread count | 820 (vs. ulimit of 5500) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/25336/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer fails intermittently

[jira] [Updated] (YARN-7913) Improve error handling when application recovery fails with exception

2020-01-06 Thread Wilfred Spiegelenburg (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated YARN-7913:

Attachment: YARN-7913.002.patch

> Improve error handling when application recovery fails with exception
> -
>
> Key: YARN-7913
> URL: https://issues.apache.org/jira/browse/YARN-7913
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: Gergo Repas
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: YARN-7913.000.poc.patch, YARN-7913.001.patch, 
> YARN-7913.002.patch
>
>
> There are edge cases when the application recovery fails with an exception.
> Example failure scenario:
>  * setup: a queue is a leaf queue in the primary RM's config and the same 
> queue is a parent queue in the secondary RM's config.
>  * When failover happens with this setup, the recovery will fail for 
> applications on this queue, and an APP_REJECTED event will be dispatched to 
> the async dispatcher. On the same thread (that handles the recovery), a 
> NullPointerException is thrown when the applicationAttempt is tried to be 
> recovered 
> (https://github.com/apache/hadoop/blob/55066cc53dc22b68f9ca55a0029741d6c846be0a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java#L494).
>  I don't see a good way to avoid the NPE in this scenario, because when the 
> NPE occurs the APP_REJECTED has not been processed yet, and we don't know 
> that the application recovery failed.
> Currently the first exception will abort the recovery, and if there are X 
> applications, there will be ~X passive -> active RM transition attempts - the 
> passive -> active RM transition will only succeed when the last APP_REJECTED 
> event is processed on the async dispatcher thread.
> _The point of this ticket is to improve the error handling and reduce the 
> number of passive -> active RM transition attempts (solving the above 
> described failure scenario isn't in scope)._



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7913) Improve error handling when application recovery fails with exception

2020-01-06 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009240#comment-17009240
 ] 

Wilfred Spiegelenburg commented on YARN-7913:
-

New patch to fix the javac warnings about deprecated method use and the 
checkstyle issues (the IDE was not set up correctly). 

> Improve error handling when application recovery fails with exception
> -
>
> Key: YARN-7913
> URL: https://issues.apache.org/jira/browse/YARN-7913
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: Gergo Repas
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: YARN-7913.000.poc.patch, YARN-7913.001.patch, 
> YARN-7913.002.patch
>
>
> There are edge cases when the application recovery fails with an exception.
> Example failure scenario:
>  * setup: a queue is a leaf queue in the primary RM's config and the same 
> queue is a parent queue in the secondary RM's config.
>  * When failover happens with this setup, the recovery will fail for 
> applications on this queue, and an APP_REJECTED event will be dispatched to 
> the async dispatcher. On the same thread (that handles the recovery), a 
> NullPointerException is thrown when the applicationAttempt is tried to be 
> recovered 
> (https://github.com/apache/hadoop/blob/55066cc53dc22b68f9ca55a0029741d6c846be0a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java#L494).
>  I don't see a good way to avoid the NPE in this scenario, because when the 
> NPE occurs the APP_REJECTED has not been processed yet, and we don't know 
> that the application recovery failed.
> Currently the first exception will abort the recovery, and if there are X 
> applications, there will be ~X passive -> active RM transition attempts - the 
> passive -> active RM transition will only succeed when the last APP_REJECTED 
> event is processed on the async dispatcher thread.
> _The point of this ticket is to improve the error handling and reduce the 
> number of passive -> active RM transition attempts (solving the above 
> described failure scenario isn't in scope)._



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10068) TimelineV2Client may leak file descriptors creating ClientResponse objects.

2020-01-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009149#comment-17009149
 ] 

Hadoop QA commented on YARN-10068:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
42s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 47s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 44s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
40s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 62m  7s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | YARN-10068 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12990031/YARN-10068.003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux b2c37448a73b 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 819159f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/25335/testReport/ |
| Max. process+thread count | 306 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/25335/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



[jira] [Commented] (YARN-10068) TimelineV2Client may leak file descriptors creating ClientResponse objects.

2020-01-06 Thread Anand Srinivasan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009125#comment-17009125
 ] 

Anand Srinivasan commented on YARN-10068:
-

Hi Prabhu Joseph,

Thanks for reviewing the patch.

I fixed the checkstyle issues now.

The reasons we don't need new tests are:

(1) There are no changes to the putObjects() API signature, so the callers and 
their functionality won't be affected by this fix.

(2) The functionality of the API itself is not changed. The exceptions thrown 
from this method are still the same. The fix just closes the input stream of 
the ClientResponse object, which is local to the putObjects() method (see the 
sketch below). There are no visible changes to the callers of this method, 
either in functionality or in the exceptions thrown.

(3) The existing tests already cover both the positive and negative cases for 
this particular functionality.
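
For reference, a minimal sketch of the close-in-finally pattern described in 
(2), in Jersey 1.x style; the class, method and variable names here are 
illustrative, not the actual patch:

{code}
import com.sun.jersey.api.client.Client;
import com.sun.jersey.api.client.ClientResponse;

public final class PutObjectsSketch {
  // Sketch only: always release the ClientResponse, even on a success
  // status, so the underlying response stream/socket is freed.
  static void putObjects(Client client, String uri, Object entity) {
    ClientResponse resp = null;
    try {
      resp = client.resource(uri).put(ClientResponse.class, entity);
      // ... existing success/error status handling stays the same ...
    } finally {
      if (resp != null) {
        resp.close(); // closes the response input stream, avoiding the fd leak
      }
    }
  }
}
{code}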

Thanks and kind regards.

> TimelineV2Client may leak file descriptors creating ClientResponse objects.
> ---
>
> Key: YARN-10068
> URL: https://issues.apache.org/jira/browse/YARN-10068
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2
>Affects Versions: 3.0.0
> Environment: HDP VERSION3.1.4
> AMBARI VERSION2.7.4.0
>Reporter: Anand Srinivasan
>Assignee: Anand Srinivasan
>Priority: Critical
> Attachments: YARN-10068.001.patch, YARN-10068.002.patch, 
> YARN-10068.003.patch, image-2020-01-02-14-58-12-773.png
>
>
> Hi team,
> A code walkthrough between v1 and v2 of the TimelineClient API revealed that 
> the v2 API TimelineV2ClientImpl#putObjects doesn't close ClientResponse 
> objects when a success status is returned from the Timeline Server. 
> ClientResponse is closed only on an erroneous response from the server, via 
> ClientResponse#getEntity.
> We also noticed that TimelineClient (v1) closes the ClientResponse object in 
> TimelineWriter#putEntities by calling ClientResponse#getEntity in both the 
> success and error conditions from the server, thereby avoiding this file 
> descriptor leak.
> The customer's original symptom was that the NodeManager went down with a 
> 'too many open files' condition, where lots of CLOSED_WAIT sockets were 
> observed between the timeline client (on the NM) and the timeline server 
> hosts.
> Could you please help resolve this issue? Thanks.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10068) TimelineV2Client may leak file descriptors creating ClientResponse objects.

2020-01-06 Thread Anand Srinivasan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009119#comment-17009119
 ] 

Anand Srinivasan commented on YARN-10068:
-

Hi Adam Antal,

Thanks for reviewing the patch.

For your comments:

1. Can we make this ERROR level, since it's causing serious issues?

The reason I kept it at WARN level is that the HTTP response itself is 
processed successfully in this case, so TimelineV2ClientImpl#putObjects 
just logs a message when ClientResponse#close fails.

If you think that ERROR level is more appropriate even in that case, I can 
change the level accordingly.

 

2. We will override the msg {{String}} in the finally part.

We won't override msg: it is appended to the end of the message string in the 
finally part, as shown below.

{code}
} finally {
  msg = "Response from the timeline server is not successful"
      + ", HTTP error code: " + resp.getStatus()
      + ", "
      + msg;   // <--- msg is appended here, not overridden
  ...
}
{code}
 

3. I suggest adding a {{Throwable}} case.

Good point. I added Throwable to the list of exceptions.
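
For illustration, assuming the response variable is {{resp}} and the logger is 
{{LOG}} (names assumed, not the actual patch), the guarded close could look 
roughly like this:

{code}
} finally {
  if (resp != null) {
    try {
      resp.close();
    } catch (Throwable t) {
      // per point 3: catch Throwable, not just exceptions, so a failed
      // close is logged at WARN instead of propagating
      LOG.warn("Failed to close ClientResponse", t);
    }
  }
}
{code}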

Thanks and kind regards.

> TimelineV2Client may leak file descriptors creating ClientResponse objects.
> ---
>
> Key: YARN-10068
> URL: https://issues.apache.org/jira/browse/YARN-10068
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2
>Affects Versions: 3.0.0
> Environment: HDP VERSION3.1.4
> AMBARI VERSION2.7.4.0
>Reporter: Anand Srinivasan
>Assignee: Anand Srinivasan
>Priority: Critical
> Attachments: YARN-10068.001.patch, YARN-10068.002.patch, 
> YARN-10068.003.patch, image-2020-01-02-14-58-12-773.png
>
>
> Hi team,
> A code walkthrough between v1 and v2 of the TimelineClient API revealed that 
> the v2 API TimelineV2ClientImpl#putObjects doesn't close ClientResponse 
> objects when a success status is returned from the Timeline Server. 
> ClientResponse is closed only on an erroneous response from the server, via 
> ClientResponse#getEntity.
> We also noticed that TimelineClient (v1) closes the ClientResponse object in 
> TimelineWriter#putEntities by calling ClientResponse#getEntity in both the 
> success and error conditions from the server, thereby avoiding this file 
> descriptor leak.
> The customer's original symptom was that the NodeManager went down with a 
> 'too many open files' condition, where lots of CLOSED_WAIT sockets were 
> observed between the timeline client (on the NM) and the timeline server 
> hosts.
> Could you please help resolve this issue? Thanks.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9956) Improve connection error message for YARN ApiServerClient

2020-01-06 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009089#comment-17009089
 ] 

Hudson commented on YARN-9956:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17817 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17817/])
YARN-9956. Improved connection error message for YARN ApiServerClient.   
(eyang: rev d81d45ff2fc9a1c424222e021f9306bf64c916b2)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api/src/test/java/org/apache/hadoop/yarn/service/client/TestApiServiceClient.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api/src/main/java/org/apache/hadoop/yarn/service/client/ApiServiceClient.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api/src/test/java/org/apache/hadoop/yarn/service/client/TestSecureApiServiceClient.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/util/YarnClientUtils.java


> Improve connection error message for YARN ApiServerClient
> -
>
> Key: YARN-9956
> URL: https://issues.apache.org/jira/browse/YARN-9956
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Yang
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9956-001.patch, YARN-9956-002.patch, 
> YARN-9956-003.patch, YARN-9956-004.patch, YARN-9956-005.patch
>
>
> In HA environment, yarn.resourcemanager.webapp.address configuration is 
> optional.  ApiServiceClient may produce confusing error message like this:
> {code}
> 19/10/30 20:13:42 INFO client.ApiServiceClient: Fail to connect to: 
> host1.example.com:8090
> 19/10/30 20:13:42 INFO client.ApiServiceClient: Fail to connect to: 
> host2.example.com:8090
> 19/10/30 20:13:42 INFO util.log: Logging initialized @2301ms
> 19/10/30 20:13:42 ERROR client.ApiServiceClient: Error: {}
> GSSException: No valid credentials provided (Mechanism level: Server not 
> found in Kerberos database (7) - LOOKING_UP_SERVER)
>   at 
> java.security.jgss/sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:771)
>   at 
> java.security.jgss/sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:266)
>   at 
> java.security.jgss/sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:196)
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient$1.run(ApiServiceClient.java:125)
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient$1.run(ApiServiceClient.java:105)
>   at java.base/java.security.AccessController.doPrivileged(Native Method)
>   at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient.generateToken(ApiServiceClient.java:105)
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient.getApiClient(ApiServiceClient.java:290)
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient.getApiClient(ApiServiceClient.java:271)
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient.actionLaunch(ApiServiceClient.java:416)
>   at 
> org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:589)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>   at 
> org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:125)
> Caused by: KrbException: Server not found in Kerberos database (7) - 
> LOOKING_UP_SERVER
>   at 
> java.security.jgss/sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:73)
>   at 
> java.security.jgss/sun.security.krb5.KrbTgsReq.getReply(KrbTgsReq.java:251)
>   at 
> java.security.jgss/sun.security.krb5.KrbTgsReq.sendAndGetCreds(KrbTgsReq.java:262)
>   at 
> java.security.jgss/sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:308)
>   at 
> java.security.jgss/sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:126)
>   at 
> java.security.jgss/sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:458)
>   at 
> java.security.jgss/sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:695)
>   ... 15 more
> Caused by: KrbException: Identifier doesn't match expected value (906)
>   at 
> java.security.jgss/sun.security.krb5.internal.KDCRep.init(KDCRep.java:140)
>   at 
> java.security.jgss/sun.security.krb5.internal.TGSRep.init(TGSRep.java:65)
>   at 
> java.security

[jira] [Commented] (YARN-9956) Improve connection error message for YARN ApiServerClient

2020-01-06 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009079#comment-17009079
 ] 

Eric Yang commented on YARN-9956:
-

Thank you [~prabhujoseph] for the patch.
+1 for patch 5.  Committing shortly.

> Improve connection error message for YARN ApiServerClient
> -
>
> Key: YARN-9956
> URL: https://issues.apache.org/jira/browse/YARN-9956
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Yang
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9956-001.patch, YARN-9956-002.patch, 
> YARN-9956-003.patch, YARN-9956-004.patch, YARN-9956-005.patch
>
>
> In HA environment, yarn.resourcemanager.webapp.address configuration is 
> optional.  ApiServiceClient may produce confusing error message like this:
> {code}
> 19/10/30 20:13:42 INFO client.ApiServiceClient: Fail to connect to: 
> host1.example.com:8090
> 19/10/30 20:13:42 INFO client.ApiServiceClient: Fail to connect to: 
> host2.example.com:8090
> 19/10/30 20:13:42 INFO util.log: Logging initialized @2301ms
> 19/10/30 20:13:42 ERROR client.ApiServiceClient: Error: {}
> GSSException: No valid credentials provided (Mechanism level: Server not 
> found in Kerberos database (7) - LOOKING_UP_SERVER)
>   at 
> java.security.jgss/sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:771)
>   at 
> java.security.jgss/sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:266)
>   at 
> java.security.jgss/sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:196)
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient$1.run(ApiServiceClient.java:125)
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient$1.run(ApiServiceClient.java:105)
>   at java.base/java.security.AccessController.doPrivileged(Native Method)
>   at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient.generateToken(ApiServiceClient.java:105)
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient.getApiClient(ApiServiceClient.java:290)
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient.getApiClient(ApiServiceClient.java:271)
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient.actionLaunch(ApiServiceClient.java:416)
>   at 
> org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:589)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>   at 
> org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:125)
> Caused by: KrbException: Server not found in Kerberos database (7) - 
> LOOKING_UP_SERVER
>   at 
> java.security.jgss/sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:73)
>   at 
> java.security.jgss/sun.security.krb5.KrbTgsReq.getReply(KrbTgsReq.java:251)
>   at 
> java.security.jgss/sun.security.krb5.KrbTgsReq.sendAndGetCreds(KrbTgsReq.java:262)
>   at 
> java.security.jgss/sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:308)
>   at 
> java.security.jgss/sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:126)
>   at 
> java.security.jgss/sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:458)
>   at 
> java.security.jgss/sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:695)
>   ... 15 more
> Caused by: KrbException: Identifier doesn't match expected value (906)
>   at 
> java.security.jgss/sun.security.krb5.internal.KDCRep.init(KDCRep.java:140)
>   at 
> java.security.jgss/sun.security.krb5.internal.TGSRep.init(TGSRep.java:65)
>   at 
> java.security.jgss/sun.security.krb5.internal.TGSRep.<init>(TGSRep.java:60)
>   at 
> java.security.jgss/sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:55)
>   ... 21 more
> 19/10/30 20:13:42 ERROR client.ApiServiceClient: Fail to launch application: 
> java.io.IOException: java.lang.reflect.UndeclaredThrowableException
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient.getApiClient(ApiServiceClient.java:293)
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient.getApiClient(ApiServiceClient.java:271)
>   at 
> org.apache.hadoop.yarn.service.client.ApiServiceClient.actionLaunch(ApiServiceClient.java:416)
>   at 
> org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:589)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>   at 
> org.apache.hadoop.yarn.client.cli

[jira] [Commented] (YARN-10026) Pull out common code pieces from ATS v1.5 and v2

2020-01-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009070#comment-17009070
 ] 

Hadoop QA commented on YARN-10026:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  9m 
37s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.2 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
31s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
43s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
54s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 50s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
33s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} branch-3.2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
21s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 42s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 2 new + 
77 unchanged - 2 fixed = 79 total (was 79) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 16s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
35s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m  
2s{color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the 
patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 81m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:0f25cbbb251 |
| JIRA Issue | YARN-10026 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12990022/YARN-10026.branch-3.2.001.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 70dd0aaf78fd 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.2 / 250cd9f |
| maven | version: Apache Maven 3.3.9 |
| De

[jira] [Commented] (YARN-10026) Pull out common code pieces from ATS v1.5 and v2

2020-01-06 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008986#comment-17008986
 ] 

Adam Antal commented on YARN-10026:
---

Thanks for the commit [~snemeth]!

Uploaded a patch for branch-3.2. I had some minor conflicts in the code, but I 
resolved them.

> Pull out common code pieces from ATS v1.5 and v2
> 
>
> Key: YARN-10026
> URL: https://issues.apache.org/jira/browse/YARN-10026
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2, yarn
>Affects Versions: 3.2.1
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-10026.001.patch, YARN-10026.002.patch, 
> YARN-10026.003.patch, YARN-10026.branch-3.2.001.patch
>
>
> ATSv1.5 and ATSv2 have lots of common code that can be pulled into an abstract 
> service / package. The logic is the same, and the code is _almost_ the same.
> As far as I see, the only ATS-specific thing is that AppInfo is constructed 
> from an ApplicationReport, whose information is extracted from the 
> TimelineReader client.
> Later the appInfo object's user and appState fields are used, but I see no 
> other dependency on the timeline part.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10026) Pull out common code pieces from ATS v1.5 and v2

2020-01-06 Thread Adam Antal (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Antal updated YARN-10026:
--
Attachment: YARN-10026.branch-3.2.001.patch

> Pull out common code pieces from ATS v1.5 and v2
> 
>
> Key: YARN-10026
> URL: https://issues.apache.org/jira/browse/YARN-10026
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2, yarn
>Affects Versions: 3.2.1
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-10026.001.patch, YARN-10026.002.patch, 
> YARN-10026.003.patch, YARN-10026.branch-3.2.001.patch
>
>
> ATSv1.5 and ATSv2 have lots of common code that can be pulled into an abstract 
> service / package. The logic is the same, and the code is _almost_ the same.
> As far as I see, the only ATS-specific thing is that AppInfo is constructed 
> from an ApplicationReport, whose information is extracted from the 
> TimelineReader client.
> Later the appInfo object's user and appState fields are used, but I see no 
> other dependency on the timeline part.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10026) Pull out common code pieces from ATS v1.5 and v2

2020-01-06 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008977#comment-17008977
 ] 

Hudson commented on YARN-10026:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17816 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17816/])
YARN-10026. Pull out common code pieces from ATS v1.5 and v2. (snemeth: rev 
dd2607e3ec3c349130e4143b0f67b23e11da420a)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/WebServices.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/webapp/TestLogWebService.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebServices.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/LogServlet.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/BasicAppInfo.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/LogWebService.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppInfoProvider.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/package-info.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebServices.java


> Pull out common code pieces from ATS v1.5 and v2
> 
>
> Key: YARN-10026
> URL: https://issues.apache.org/jira/browse/YARN-10026
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2, yarn
>Affects Versions: 3.2.1
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-10026.001.patch, YARN-10026.002.patch, 
> YARN-10026.003.patch
>
>
> ATSv1.5 and ATSv2 have lots of common code that can be pulled into an abstract 
> service / package. The logic is the same, and the code is _almost_ the same.
> As far as I see, the only ATS-specific thing is that AppInfo is constructed 
> from an ApplicationReport, whose information is extracted from the 
> TimelineReader client.
> Later the appInfo object's user and appState fields are used, but I see no 
> other dependency on the timeline part.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10026) Pull out common code pieces from ATS v1.5 and v2

2020-01-06 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10026:
--
Fix Version/s: 3.3.0

> Pull out common code pieces from ATS v1.5 and v2
> 
>
> Key: YARN-10026
> URL: https://issues.apache.org/jira/browse/YARN-10026
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2, yarn
>Affects Versions: 3.2.1
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-10026.001.patch, YARN-10026.002.patch, 
> YARN-10026.003.patch
>
>
> ATSv1.5 and ATSv2 have lots of common code that can be pulled into an abstract 
> service / package. The logic is the same, and the code is _almost_ the same.
> As far as I see, the only ATS-specific thing is that AppInfo is constructed 
> from an ApplicationReport, whose information is extracted from the 
> TimelineReader client.
> Later the appInfo object's user and appState fields are used, but I see no 
> other dependency on the timeline part.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10026) Pull out common code pieces from ATS v1.5 and v2

2020-01-06 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008975#comment-17008975
 ] 

Szilard Nemeth commented on YARN-10026:
---

Hi [~adam.antal]!
Thanks for this effort.
Latest patch LGTM, committed to trunk.
Do you want to backport this to branch-3.2?
I worry that if we miss the backport, it would make future commits on ATS / AHS 
very difficult to backport to 3.2.
Please assess how hard this patch is to backport and we can come to a conclusion.
Thanks!

> Pull out common code pieces from ATS v1.5 and v2
> 
>
> Key: YARN-10026
> URL: https://issues.apache.org/jira/browse/YARN-10026
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2, yarn
>Affects Versions: 3.2.1
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: YARN-10026.001.patch, YARN-10026.002.patch, 
> YARN-10026.003.patch
>
>
> ATSv1.5 and ATSv2 have lots of common code that can be pulled into an abstract 
> service / package. The logic is the same, and the code is _almost_ the same.
> As far as I see, the only ATS-specific thing is that AppInfo is constructed 
> from an ApplicationReport, whose information is extracted from the 
> TimelineReader client.
> Later the appInfo object's user and appState fields are used, but I see no 
> other dependency on the timeline part.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10035) Add ability to filter the Cluster Applications API request by name

2020-01-06 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008953#comment-17008953
 ] 

Hudson commented on YARN-10035:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17815 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17815/])
YARN-10035. Add ability to filter the Cluster Applications API request 
(snemeth: rev 768ee22e9e73543d2fb193d9b6ec34a247cb0411)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWSConsts.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/test/java/org/apache/hadoop/yarn/server/router/webapp/MockDefaultRequestInterceptorREST.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/test/java/org/apache/hadoop/yarn/server/router/webapp/TestFederationInterceptorREST.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebServices.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServiceProtocol.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/test/java/org/apache/hadoop/yarn/server/router/webapp/MockRESTRequestInterceptor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/WebServices.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/webapp/FederationInterceptorREST.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetApplicationsRequestPBImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServices.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/test/java/org/apache/hadoop/yarn/server/router/webapp/BaseRouterWebServicesTest.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/test/java/org/apache/hadoop/yarn/server/router/webapp/PassThroughRESTRequestInterceptor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerRest.md
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetApplicationsRequest.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/ApplicationsRequestBuilder.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/webapp/DefaultRequestInterceptorREST.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/test/java/org/apache/hadoop/yarn/server/router/webapp/TestFederationInterceptorRESTRetry.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/webapp/RouterWebServices.java


> Add ability to filter the Cluster Applications API request by name
> --
>
> Key: YARN-10035
> URL: https://issues.apache.org/jira/browse/YARN-10035
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.3.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-10035.001.patch, YARN-10035.002.patch, 
> YARN-10035.003.patch
>
>
> According to the 
> [documentation|https://hadoop.apache.org/docs/r3.2.0/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html]
>  we don't support filtering by name in the Cluster Applications API request.
> Usually application tags are a perfect way for tracking applications, but for 
> MR applications the older CLIs usua

[jira] [Commented] (YARN-10035) Add ability to filter the Cluster Applications API request by name

2020-01-06 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008944#comment-17008944
 ] 

Szilard Nemeth commented on YARN-10035:
---

Hi [~adam.antal]! 
Latest patc LGTM, committed to trunk.

> Add ability to filter the Cluster Applications API request by name
> --
>
> Key: YARN-10035
> URL: https://issues.apache.org/jira/browse/YARN-10035
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.3.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: YARN-10035.001.patch, YARN-10035.002.patch, 
> YARN-10035.003.patch
>
>
> According to the 
> [documentation|https://hadoop.apache.org/docs/r3.2.0/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html]
>  we don't support filtering by name in the Cluster Applications API request.
> Usually application tags are a perfect way of tracking applications, but for 
> MR applications the older CLIs usually don't support providing app tags, 
> while specifying the name of the job is possible.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-10035) Add ability to filter the Cluster Applications API request by name

2020-01-06 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008944#comment-17008944
 ] 

Szilard Nemeth edited comment on YARN-10035 at 1/6/20 3:27 PM:
---

Hi [~adam.antal]! 
Latest patch LGTM, committed to trunk.


was (Author: snemeth):
Hi [~adam.antal]! 
Latest patc LGTM, committed to trunk.

> Add ability to filter the Cluster Applications API request by name
> --
>
> Key: YARN-10035
> URL: https://issues.apache.org/jira/browse/YARN-10035
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.3.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: YARN-10035.001.patch, YARN-10035.002.patch, 
> YARN-10035.003.patch
>
>
> According to the 
> [documentation|https://hadoop.apache.org/docs/r3.2.0/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html]
>  we don't support filtering by name in the Cluster Applications API request.
> Usually application tags are a perfect way of tracking applications, but for 
> MR applications the older CLIs usually don't support providing app tags, 
> while specifying the name of the job is possible.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10069) Showing jstack on UI for containers

2020-01-06 Thread zhoukang (Jira)
zhoukang created YARN-10069:
---

 Summary: Showing jstack on UI for containers
 Key: YARN-10069
 URL: https://issues.apache.org/jira/browse/YARN-10069
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: zhoukang
Assignee: zhoukang


In this jira, I want to post a patch to support showing jstack output on the 
container UI.
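
A minimal sketch of the idea (illustrative only; the actual patch and its UI 
wiring may differ) for capturing jstack output for a container's JVM pid:

{code}
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class JstackSample {
  // Runs "jstack <pid>" and returns its combined stdout/stderr output,
  // e.g. for rendering on the NodeManager's container page.
  public static String jstack(String pid) throws IOException, InterruptedException {
    Process p = new ProcessBuilder("jstack", pid)
        .redirectErrorStream(true)   // merge stderr into stdout
        .start();
    StringBuilder out = new StringBuilder();
    try (BufferedReader r = new BufferedReader(
        new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
      String line;
      while ((line = r.readLine()) != null) {
        out.append(line).append('\n');
      }
    }
    p.waitFor();
    return out.toString();
  }
}
{code}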



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10067) Add dry-run feature to FS-CS converter tool

2020-01-06 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008914#comment-17008914
 ] 

Peter Bacsko commented on YARN-10067:
-

[~snemeth] could you please check the changes?

> Add dry-run feature to FS-CS converter tool
> ---
>
> Key: YARN-10067
> URL: https://issues.apache.org/jira/browse/YARN-10067
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-10067-001.patch, YARN-10067-002.patch, 
> YARN-10067-003.patch
>
>
> Add a "d" / "-dry-run" switch to the tool. The purpose of this would be to 
> inform the user whether a conversion is possible and if it is, are there any 
> warnings.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10067) Add dry-run feature to FS-CS converter tool

2020-01-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008912#comment-17008912
 ] 

Hadoop QA commented on YARN-10067:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 26s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 27s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 2 new + 3 unchanged - 0 fixed = 5 total (was 3) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 26s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 82m 
19s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}135m 38s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | YARN-10067 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12990002/YARN-10067-003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 2995f87b46ff 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 4a76ab7 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/25333/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/25333/testReport/ |
| Max. process+thread count | 851 (vs. ulimit of 5500) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-re

[jira] [Assigned] (YARN-8286) Inform AM of container relaunch

2020-01-06 Thread Adam Antal (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Antal reassigned YARN-8286:


Assignee: Adam Antal

> Inform AM of container relaunch
> ---
>
> Key: YARN-8286
> URL: https://issues.apache.org/jira/browse/YARN-8286
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Billie Rinaldi
>Assignee: Adam Antal
>Priority: Critical
>
> The AM may need to perform actions when a container has been relaunched. For 
> example, the service AM would want to change the state it has recorded for 
> the container and retrieve new container status for the container, in case 
> the container IP has changed. (The NM would also need to remove the IP it has 
> stored for the container, so container status calls don't return an IP for a 
> container that is not currently running.)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9892) Capacity scheduler: support DRF ordering policy on queue level

2020-01-06 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008878#comment-17008878
 ] 

Peter Bacsko commented on YARN-9892:


Thanks for the patch [~maniraj...@gmail.com].

I just have a few thoughts:
1. Do we need a {{CompoundComparator}} in the constructor? Can't we just pass 
{{DominantResourceFairnessComparator}} instance directly to 
{{ConcurrentSkipListSet}}?2. Quite a few methods are empty, I guess on purpose. 
However, it would be good to indicate that it's intentional. So I'd add a short 
comment like "// nop" to each empty method body.
3. Probably the biggest concern: I think the "dominant resource" is the one 
which has the largest {{resource[N]/clusterResource[N]}} value, where "N" is 
the index of a particular resource in the available resource vector. So I think 
you need the cluster resource there, not the queue effective resource. So you 
should somehow retrieve the cluster resource and pass it to the policy 
implementation.

See Fair Scheduler implementation:
https://github.com/apache/hadoop/blob/0921b706f7f80c40e061d2c0f8c8b2e4910071e5/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/DominantResourceFairnessPolicy.java#L283-L298
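
To make point 3 concrete, here is a minimal sketch (illustrative only, not the 
actual patch) of computing the dominant share against the cluster resource:

{code}
// Sketch: the dominant share of a usage vector is the maximum of
// usage[n] / clusterResource[n] over all resource types n.
public final class DrfSketch {
  static float dominantShare(long[] usage, long[] clusterResource) {
    float share = 0f;
    for (int n = 0; n < usage.length; n++) {
      if (clusterResource[n] > 0) {
        share = Math.max(share, (float) usage[n] / clusterResource[n]);
      }
    }
    return share;
  }
}
{code}

The queue with the larger dominant share is the more heavily allocated one 
under DRF, so the comparator would order it after the less allocated queue.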



> Capacity scheduler: support DRF ordering policy on queue level
> --
>
> Key: YARN-9892
> URL: https://issues.apache.org/jira/browse/YARN-9892
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Peter Bacsko
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-9892.001.patch
>
>
> Capacity scheduler does not support DRF (Dominant Resource Fairness) ordering 
> policy on queue level. Only "fifo" and "fair" are accepted for 
> {{yarn.scheduler.capacity..ordering-policy}}.
> DRF can only be used globally if 
> {{yarn.scheduler.capacity.resource-calculator}} is set to 
> DominantResourceCalculator.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-9892) Capacity scheduler: support DRF ordering policy on queue level

2020-01-06 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008878#comment-17008878
 ] 

Peter Bacsko edited comment on YARN-9892 at 1/6/20 1:59 PM:


Thanks for the patch [~maniraj...@gmail.com].

I just have a few thoughts:
1. Do we need a {{CompoundComparator}} in the constructor? Can't we just pass 
{{DominantResourceFairnessComparator}} instance directly to 
{{ConcurrentSkipListSet}}?
2. Quite a few methods are empty, I guess on purpose. However, it would be good 
to indicate that it's intentional. So I'd add a short comment like "// nop" to 
each empty method body.
3. Probably the biggest concern: I think the "dominant resource" is the one 
which has the largest {{resource[N]/clusterResource[N]}} value, where "N" is 
the index of a particular resource in the available resource vector. So I think 
you need the cluster resource there, not the queue effective resource. So you 
should somehow retrieve the cluster resource and pass it to the policy 
implementation.

See Fair Scheduler implementation:
https://github.com/apache/hadoop/blob/0921b706f7f80c40e061d2c0f8c8b2e4910071e5/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/DominantResourceFairnessPolicy.java#L283-L298




was (Author: pbacsko):
Thanks for the patch [~maniraj...@gmail.com].

I just have a few thoughts:
1. Do we need a {{CompoundComparator}} in the constructor? Can't we just pass 
{{DominantResourceFairnessComparator}} instance directly to 
{{ConcurrentSkipListSet}}?2. Quite a few methods are empty, I guess on purpose. 
However, it would be good to indicate that it's intentional. So I'd add a short 
comment like "// nop" to each empty method body.
3. Probably the biggest concern: I think the "dominant resource" is the one 
which has the largest {{resource[N]/clusterResource[N]}} value, where "N" is 
the index of a particular resource in the available resource vector. So I think 
you need the cluster resource there, not the queue effective resource. So you 
should somehow retrieve the cluster resource and pass it to the policy 
implementation.

See Fair Scheduler implementation:
https://github.com/apache/hadoop/blob/0921b706f7f80c40e061d2c0f8c8b2e4910071e5/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/DominantResourceFairnessPolicy.java#L283-L298



> Capacity scheduler: support DRF ordering policy on queue level
> --
>
> Key: YARN-9892
> URL: https://issues.apache.org/jira/browse/YARN-9892
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Peter Bacsko
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-9892.001.patch
>
>
> Capacity scheduler does not support DRF (Dominant Resource Fairness) ordering 
> policy on queue level. Only "fifo" and "fair" are accepted for 
> {{yarn.scheduler.capacity..ordering-policy}}.
> DRF can only be used globally if 
> {{yarn.scheduler.capacity.resource-calculator}} is set to 
> DominantResourceCalculator.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7913) Improve error handling when application recovery fails with exception

2020-01-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008838#comment-17008838
 ] 

Hadoop QA commented on YARN-7913:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
34s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 29s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 40s{color} 
| {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 1 new + 27 unchanged - 0 fixed = 28 total (was 27) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 31s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 25 new + 197 unchanged - 0 fixed = 222 total (was 197) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 19s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 87m 
33s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}146m 34s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | YARN-7913 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12989995/YARN-7913.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux e82176f64e53 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 4a76ab7 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| javac | 
https://builds.apache.org/job/PreCommit-YARN-Build/25331/artifact/out/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/25331/artifact/out/diff-checkstyle-hadoop-yar

[jira] [Commented] (YARN-9989) Typo in CapacityScheduler documentation: Runtime Configuration

2020-01-06 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008810#comment-17008810
 ] 

Peter Bacsko commented on YARN-9989:


+1 (non-binding)

> Typo in CapacityScheduler documentation:  Runtime Configuration
> ---
>
> Key: YARN-9989
> URL: https://issues.apache.org/jira/browse/YARN-9989
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: kevin su
>Priority: Major
>  Labels: capacity-scheduler, docuentation, newbie
>
> {quote}
> Administrators can add additional queues at runtime, but queues cannot be 
> deleted at runtime unless the queue is STOPPED and *nhas* no pending/running 
> apps.
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9989) Typo in CapacityScheduler documentation: Runtime Configuration

2020-01-06 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-9989:
---
Labels: capacity-scheduler docuentation newbie  (was: capacity-scheduler 
docuentation)

> Typo in CapacityScheduler documentation:  Runtime Configuration
> ---
>
> Key: YARN-9989
> URL: https://issues.apache.org/jira/browse/YARN-9989
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: kevin su
>Priority: Major
>  Labels: capacity-scheduler, docuentation, newbie
>
> {quote}
> Administrators can add additional queues at runtime, but queues cannot be 
> deleted at runtime unless the queue is STOPPED and *nhas* no pending/running 
> apps.
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9989) Typo in CapacityScheduler documentation: Runtime Configuration

2020-01-06 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-9989:
---
Labels: capacity-scheduler docuentation  (was: )

> Typo in CapacityScheduler documentation:  Runtime Configuration
> ---
>
> Key: YARN-9989
> URL: https://issues.apache.org/jira/browse/YARN-9989
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: kevin su
>Priority: Major
>  Labels: capacity-scheduler, docuentation
>
> {quote}
> Administrators can add additional queues at runtime, but queues cannot be 
> deleted at runtime unless the queue is STOPPED and *nhas* no pending/running 
> apps.
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10063) Usage output of container-executor binary needs to include --http/--https argument

2020-01-06 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008809#comment-17008809
 ] 

Peter Bacsko commented on YARN-10063:
-

[~sahuja] wouldn't it be better to print something like {{1 appid containerid workdir 
container-script tokens [--https keystorepath truststorepath] pidfile...}}, 
because you don't really need any details for http? You could also print a note 
explaining that {{[ ]}} means the given arguments are optional.
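
To make the suggestion concrete, the usage line could read roughly like this (a sketch of the proposed wording only, not the actual patch text):

{noformat}
... 1 appid containerid workdir container-script tokens [--https keystorepath truststorepath] pidfile ...
{noformat}

together with a note that arguments inside {{[ ]}} are optional and only needed when {{--https}} is used; plain HTTP needs no extra arguments.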

> Usage output of container-executor binary needs to include --http/--https 
> argument
> --
>
> Key: YARN-10063
> URL: https://issues.apache.org/jira/browse/YARN-10063
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Siddharth Ahuja
>Assignee: Siddharth Ahuja
>Priority: Minor
> Attachments: YARN-10063.001.patch
>
>
> YARN-8448/YARN-6586 seem to have introduced new options, "--http" 
> (default) and "--https", that can be passed to the 
> container-executor binary; see:
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L564
> and 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L521
> however, the usage output seems to have missed this:
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L74
> Raising this jira to improve this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10067) Add dry-run feature to FS-CS converter tool

2020-01-06 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-10067:

Attachment: YARN-10067-003.patch

> Add dry-run feature to FS-CS converter tool
> ---
>
> Key: YARN-10067
> URL: https://issues.apache.org/jira/browse/YARN-10067
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-10067-001.patch, YARN-10067-002.patch, 
> YARN-10067-003.patch
>
>
> Add a "d" / "-dry-run" switch to the tool. The purpose of this would be to 
> inform the user whether a conversion is possible and if it is, are there any 
> warnings.
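
A dry-run switch of this sort is typically a boolean CLI option. A minimal sketch with Apache Commons CLI (the option wiring and messages here are assumptions, not the actual converter code):

{code:java}
import org.apache.commons.cli.CommandLine;
import org.apache.commons.cli.DefaultParser;
import org.apache.commons.cli.Options;

public class DryRunSketch {
  public static void main(String[] args) throws Exception {
    Options opts = new Options();
    // hypothetical wiring of the "-d" / "--dry-run" switch
    opts.addOption("d", "dry-run", false,
        "Validate the conversion and report warnings without writing output.");
    CommandLine cli = new DefaultParser().parse(opts, args);
    if (cli.hasOption("dry-run")) {
      System.out.println("Dry run: conversion is possible; warnings, if any, are listed above.");
    } else {
      System.out.println("Converting FS configuration to CS configuration...");
    }
  }
}
{code}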



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10026) Pull out common code pieces from ATS v1.5 and v2

2020-01-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008772#comment-17008772
 ] 

Hadoop QA commented on YARN-10026:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
40s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
40s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 23s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
42s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  0s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 2 new + 
77 unchanged - 2 fixed = 79 total (was 79) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
1m 45s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
21s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
38s{color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the 
patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 60m 33s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | YARN-10026 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12989997/YARN-10026.003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux acc4c46e0724 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 4a76ab7 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
|

[jira] [Commented] (YARN-9898) Dependency netty-all-4.1.27.Final doesn't support ARM platform

2020-01-06 Thread liusheng (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008754#comment-17008754
 ] 

liusheng commented on YARN-9898:


We have been trying to get ARM platform support into netty upstream:

https://github.com/netty/netty/pull/9804
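
Until a netty release ships the aarch64 native library, a common mitigation is to fall back to the pure-Java NIO transport when the native epoll transport cannot load. A sketch using netty's standard availability check (the fallback wiring is an assumption, not what Hadoop currently does):

{code:java}
import io.netty.channel.EventLoopGroup;
import io.netty.channel.epoll.Epoll;
import io.netty.channel.epoll.EpollEventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;

public final class TransportFallback {
  // Prefer the native epoll transport when its JNI library is present;
  // otherwise use NIO, which works on ARM without native code.
  static EventLoopGroup newEventLoopGroup() {
    return Epoll.isAvailable()
        ? new EpollEventLoopGroup()
        : new NioEventLoopGroup();
  }

  public static void main(String[] unused) {
    System.out.println("epoll available: " + Epoll.isAvailable());
    newEventLoopGroup().shutdownGracefully();
  }
}
{code}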

> Dependency netty-all-4.1.27.Final doesn't support ARM platform
> --
>
> Key: YARN-9898
> URL: https://issues.apache.org/jira/browse/YARN-9898
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: liusheng
>Priority: Major
>
> Hadoop depends on the Netty package, but *netty-all-4.1.27.Final* from the 
> io.netty maven repo does not support the ARM platform. 
> When running the test *TestCsiClient.testIdentityService* on an ARM server, it 
> raises an error like the following:
> {code:java}
> Caused by: java.io.FileNotFoundException: 
> META-INF/native/libnetty_transport_native_epoll_aarch_64.so
> at 
> io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:161)
> ... 45 more
> Suppressed: java.lang.UnsatisfiedLinkError: no 
> netty_transport_native_epoll_aarch_64 in java.library.path
> at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
> at java.lang.Runtime.loadLibrary0(Runtime.java:870)
> at java.lang.System.loadLibrary(System.java:1122)
> at 
> io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:38)
> at 
> io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:243)
> at 
> io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:124)
> ... 45 more
> Suppressed: java.lang.UnsatisfiedLinkError: no 
> netty_transport_native_epoll_aarch_64 in java.library.path
> at 
> java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
> at java.lang.Runtime.loadLibrary0(Runtime.java:870)
> at java.lang.System.loadLibrary(System.java:1122)
> at 
> io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:38)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> io.netty.util.internal.NativeLibraryLoader$1.run(NativeLibraryLoader.java:263)
> at java.security.AccessController.doPrivileged(Native 
> Method)
> at 
> io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:255)
> at 
> io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:233)
> ... 46 more
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-10068) TimelineV2Client may leak file descriptors creating ClientResponse objects.

2020-01-06 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008733#comment-17008733
 ] 

Adam Antal edited comment on YARN-10068 at 1/6/20 11:09 AM:


Hi [~anand.srinivasan],

I have some comments on the patch.
1. {code:java}
LOG.warn("Error closing the HTTP response's inputstream.", che);

{code}
Can we make this ERROR level, since it's causing serious issues?
2. In the else-branch, regardless of whether we succeed in this part:
{code:java}
String stringType = resp.getEntity(String.class);
msg = "Server response:\n" + stringType;
{code}
We will override the msg {{String}} in the finally part. I suggest using
{code:java}
msg +=
{code}
instead of a simple "=", so we will have all the information in the message.
3. Catching only the {{ClientHandlerException}} and 
{{UniformInterfaceException}} types is a bit concerning. If any other 
unchecked exception is thrown, since we throw a YarnException at the end of 
the finally block, its stack trace is not going to be preserved. I suggest 
adding a {{Throwable}} case - something like this:
{code:java}
   ...
} catch (ClientHandlerException | UniformInterfaceException chuie) {
   msg = "Error getting entity from the HTTP response." + 
chuie.getLocalizedMessage();
} catch (Throwable t) {
   msg = "Error happened during getting server response: " + 
t.getLocalizedMessage();
} finally {
   ...
{code}


was (Author: adam.antal):
Hi [~anand.srinivasan],

I have some comments on the patch.
1. {code:java}
LOG.warn("Error closing the HTTP response's inputstream.", che);

{code}
Can we make this ERROR level, since it's causing serious issues?
2. In the else-branch, regardless of whether we succeed in this part:
{code:java}
String stringType = resp.getEntity(String.class);
msg = "Server response:\n" + stringType;
{code}
We will override the msg {{String}} in the finally part. I suggest using
{code:java}
msg *+*=
{code}
instead of a simple "=", so we will have all the information in the message.
3. Catching only the {{ClientHandlerException}} and 
{{UniformInterfaceException}} types is a bit concerning. If any other 
unchecked exception is thrown, since we throw a YarnException at the end of 
the finally block, its stack trace is not going to be preserved. I suggest 
adding a {{Throwable}} case - something like this:
{code:java}
   ...
} catch (ClientHandlerException | UniformInterfaceException chuie) {
   msg = "Error getting entity from the HTTP response." + 
chuie.getLocalizedMessage();
} catch (Throwable t) {
   msg = "Error happened during getting server response: " + 
t.getLocalizedMessage();
} finally {
   ...
{code}

> TimelineV2Client may leak file descriptors creating ClientResponse objects.
> ---
>
> Key: YARN-10068
> URL: https://issues.apache.org/jira/browse/YARN-10068
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2
>Affects Versions: 3.0.0
> Environment: HDP VERSION 3.1.4
> AMBARI VERSION 2.7.4.0
>Reporter: Anand Srinivasan
>Assignee: Anand Srinivasan
>Priority: Critical
> Attachments: YARN-10068.001.patch, YARN-10068.002.patch, 
> image-2020-01-02-14-58-12-773.png
>
>
> Hi team,
> A code walkthrough of the v1 and v2 TimelineClient APIs revealed that the v2 
> API TimelineV2ClientImpl#putObjects doesn't close ClientResponse objects when 
> the Timeline Server returns a success status; the ClientResponse is closed 
> only on an erroneous response from the server, via ClientResponse#getEntity.
> We also noticed that TimelineClient (v1) closes the ClientResponse object in 
> TimelineWriter#putEntities by calling ClientResponse#getEntity in both 
> success and error conditions, thereby avoiding this file descriptor leak.
> The customer's original symptom was that the NodeManager went down with a 
> 'too many open files' condition, with lots of CLOSED_WAIT sockets observed 
> between the timeline client (in the NM) and the timeline server hosts.
> Could you please help resolve this issue? Thanks.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10068) TimelineV2Client may leak file descriptors creating ClientResponse objects.

2020-01-06 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008733#comment-17008733
 ] 

Adam Antal commented on YARN-10068:
---

Hi [~anand.srinivasan],

I have some comments on the patch.
1. {code:java}
LOG.warn("Error closing the HTTP response's inputstream.", che);

{code}
Can we make this ERROR level, since it's causing serious issues?
2. In the else-branch, regardless of whether we succeed in this part:
{code:java}
String stringType = resp.getEntity(String.class);
msg = "Server response:\n" + stringType;
{code}
We will override the msg {{String}} in the finally part. I suggest using
{code:java}
msg *+*=
{code}
instead of a simple "=", so we will have all the information in the message.
3. Catching only the {{ClientHandlerException}} and 
{{UniformInterfaceException}} types is a bit concerning. If any other 
unchecked exception is thrown, since we throw a YarnException at the end of 
the finally block, its stack trace is not going to be preserved. I suggest 
adding a {{Throwable}} case - something like this:
{code:java}
   ...
} catch (ClientHandlerException | UniformInterfaceException chuie) {
   msg = "Error getting entity from the HTTP response." + 
chuie.getLocalizedMessage();
} catch (Throwable t) {
   msg = "Error happened during getting server response: " + 
t.getLocalizedMessage();
} finally {
   ...
{code}
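
Putting the three suggestions together, the error-handling block might look roughly like this (a sketch only; the names, the success check, and the surrounding plumbing are assumptions, not the actual TimelineV2ClientImpl code):

{code:java}
String msg = "";
try {
  if (resp.getStatus() / 100 != 2) {
    // non-2xx: consuming the entity surfaces the server's message
    msg += "Server response:\n" + resp.getEntity(String.class);
  }
} catch (ClientHandlerException | UniformInterfaceException chuie) {
  msg += "Error getting entity from the HTTP response. "
      + chuie.getLocalizedMessage();
} catch (Throwable t) {
  // keep a trace for anything unexpected instead of silently losing it
  msg += "Error happened during getting server response: "
      + t.getLocalizedMessage();
  LOG.error("Unexpected error while reading the server response.", t);
} finally {
  // close on every path, success included - this is the fd-leak fix
  try {
    resp.close();
  } catch (ClientHandlerException che) {
    LOG.error("Error closing the HTTP response's inputstream.", che);
  }
  if (!msg.isEmpty()) {
    throw new YarnException(msg);
  }
}
{code}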

> TimelineV2Client may leak file descriptors creating ClientResponse objects.
> ---
>
> Key: YARN-10068
> URL: https://issues.apache.org/jira/browse/YARN-10068
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2
>Affects Versions: 3.0.0
> Environment: HDP VERSION 3.1.4
> AMBARI VERSION 2.7.4.0
>Reporter: Anand Srinivasan
>Assignee: Anand Srinivasan
>Priority: Critical
> Attachments: YARN-10068.001.patch, YARN-10068.002.patch, 
> image-2020-01-02-14-58-12-773.png
>
>
> Hi team,
> A code walkthrough of the v1 and v2 TimelineClient APIs revealed that the v2 
> API TimelineV2ClientImpl#putObjects doesn't close ClientResponse objects when 
> the Timeline Server returns a success status; the ClientResponse is closed 
> only on an erroneous response from the server, via ClientResponse#getEntity.
> We also noticed that TimelineClient (v1) closes the ClientResponse object in 
> TimelineWriter#putEntities by calling ClientResponse#getEntity in both 
> success and error conditions, thereby avoiding this file descriptor leak.
> The customer's original symptom was that the NodeManager went down with a 
> 'too many open files' condition, with lots of CLOSED_WAIT sockets observed 
> between the timeline client (in the NM) and the timeline server hosts.
> Could you please help resolve this issue? Thanks.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10026) Pull out common code pieces from ATS v1.5 and v2

2020-01-06 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008725#comment-17008725
 ] 

Adam Antal commented on YARN-10026:
---

Fixed one last related checkstyle issue and added/updated the javadoc 
documentation of the classes affected by this patch. Pending jenkins.

> Pull out common code pieces from ATS v1.5 and v2
> 
>
> Key: YARN-10026
> URL: https://issues.apache.org/jira/browse/YARN-10026
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2, yarn
>Affects Versions: 3.2.1
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: YARN-10026.001.patch, YARN-10026.002.patch, 
> YARN-10026.003.patch
>
>
> ATSv1.5 and ATSv2 have lots of common code that can be pulled into an abstract 
> service / package. The logic is the same, and the code is _almost_ the same.
> As far as I can see, the only ATS-specific thing is that AppInfo is constructed 
> from an ApplicationReport, whose information is extracted via the 
> TimelineReader client. 
> Later the appInfo object's user and appState fields are used, but I see no 
> other dependency on the timeline part. 
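
Schematically, the refactoring described here is a template-method split: hoist the shared flow into an abstract base class and let the v1.5 and v2 variants supply only the version-specific lookup. A sketch (class and method names are illustrative assumptions, not the patch):

{code:java}
// Common logic lives in the base; only getApp() differs per ATS version.
abstract class AppInfoProviderBase {
  final String getUser(String appId) throws java.io.IOException {
    return getApp(appId).user;
  }
  final String getAppState(String appId) throws java.io.IOException {
    return getApp(appId).appState;
  }
  // the version-specific piece: how the app report is fetched
  // (e.g. via the TimelineReader client in the ATSv2 case)
  protected abstract BasicAppInfo getApp(String appId) throws java.io.IOException;
}

final class BasicAppInfo {
  final String user;
  final String appState;
  BasicAppInfo(String user, String appState) {
    this.user = user;
    this.appState = appState;
  }
}
{code}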



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10026) Pull out common code pieces from ATS v1.5 and v2

2020-01-06 Thread Adam Antal (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Antal updated YARN-10026:
--
Attachment: YARN-10026.003.patch

> Pull out common code pieces from ATS v1.5 and v2
> 
>
> Key: YARN-10026
> URL: https://issues.apache.org/jira/browse/YARN-10026
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2, yarn
>Affects Versions: 3.2.1
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: YARN-10026.001.patch, YARN-10026.002.patch, 
> YARN-10026.003.patch
>
>
> ATSv1.5 and ATSv2 have lots of common code that can be pulled into an abstract 
> service / package. The logic is the same, and the code is _almost_ the same.
> As far as I can see, the only ATS-specific thing is that AppInfo is constructed 
> from an ApplicationReport, whose information is extracted via the 
> TimelineReader client. 
> Later the appInfo object's user and appState fields are used, but I see no 
> other dependency on the timeline part. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10068) TimelineV2Client may leak file descriptors creating ClientResponse objects.

2020-01-06 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008644#comment-17008644
 ] 

Prabhu Joseph commented on YARN-10068:
--

[~anand.srinivasan] The patch [^YARN-10068.002.patch] looks good. Can you 
include a test case? If not, please justify. Also, please fix the checkstyle issues.

> TimelineV2Client may leak file descriptors creating ClientResponse objects.
> ---
>
> Key: YARN-10068
> URL: https://issues.apache.org/jira/browse/YARN-10068
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2
>Affects Versions: 3.0.0
> Environment: HDP VERSION 3.1.4
> AMBARI VERSION 2.7.4.0
>Reporter: Anand Srinivasan
>Assignee: Anand Srinivasan
>Priority: Critical
> Attachments: YARN-10068.001.patch, YARN-10068.002.patch, 
> image-2020-01-02-14-58-12-773.png
>
>
> Hi team,
> A code walkthrough of the v1 and v2 TimelineClient APIs revealed that the v2 
> API TimelineV2ClientImpl#putObjects doesn't close ClientResponse objects when 
> the Timeline Server returns a success status; the ClientResponse is closed 
> only on an erroneous response from the server, via ClientResponse#getEntity.
> We also noticed that TimelineClient (v1) closes the ClientResponse object in 
> TimelineWriter#putEntities by calling ClientResponse#getEntity in both 
> success and error conditions, thereby avoiding this file descriptor leak.
> The customer's original symptom was that the NodeManager went down with a 
> 'too many open files' condition, with lots of CLOSED_WAIT sockets observed 
> between the timeline client (in the NM) and the timeline server hosts.
> Could you please help resolve this issue? Thanks.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org