[jira] [Commented] (YARN-9418) ATSV2 /apps/appId/entities/YARN_CONTAINER rest api does not show metrics

2019-09-04 Thread Rohith Sharma K S (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16923091#comment-16923091
 ] 

Rohith Sharma K S commented on YARN-9418:
-

[~Prabhu Joseph] why isn't this backported to branch-3.2?

> ATSV2 /apps/appId/entities/YARN_CONTAINER rest api does not show metrics
> 
>
> Key: YARN-9418
> URL: https://issues.apache.org/jira/browse/YARN-9418
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: YARN-9418-001.patch, YARN-9418-002.patch, 
> YARN-9418-003.patch
>
>
> ATSV2 entities rest api does not show the metrics
> {code:java}
> [hbase@yarn-ats-3 centos]$ curl -s "http://yarn-ats-3:8198/ws/v2/timeline/apps/application_1553685341603_0006/entities/YARN_CONTAINER/container_e18_1553685341603_0006_01_01?user.name=hbase&fields=METRICS" | jq .
> {
>   "metrics": [],
>   "events": [],
>   "createdtime": 1553695002014,
>   "idprefix": 0,
>   "type": "YARN_CONTAINER",
>   "id": "container_e18_1553685341603_0006_01_01",
>   "info": {
>     "UID": "ats!application_1553685341603_0006!YARN_CONTAINER!0!container_e18_1553685341603_0006_01_01",
>     "FROM_ID": "ats!hbase!QuasiMonteCarlo!1553695001394!application_1553685341603_0006!YARN_CONTAINER!0!container_e18_1553685341603_0006_01_01"
>   },
>   "configs": {},
>   "isrelatedto": {},
>   "relatesto": {}
> }{code}
> NodeManager publishes YARN_CONTAINER entities with CPU and MEMORY metrics, but 
> these are not shown in the output above. NM container entities are written with 
> entityIdPrefix set to the inverted container start time, whereas RM container 
> entities use the default of 0. TimelineReader fetches only the RM container entries.
> Setting entityIdPrefix to 0 on the NM container entities, the same as RM (for 
> testing purposes only), confirmed this: the metrics are then shown.
> {code:java}
> "metrics": [
>   {
>     "type": "SINGLE_VALUE",
>     "id": "MEMORY",
>     "aggregationOp": "NOP",
>     "values": {
>       "1553774981355": 490430464
>     }
>   },
>   {
>     "type": "SINGLE_VALUE",
>     "id": "CPU",
>     "aggregationOp": "NOP",
>     "values": {
>       "1553774981355": 5
>     }
>   }
> ]{code}
>  
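The prefix mismatch described in the issue can be illustrated with a small, self-contained sketch. This is not the ATSv2 code: the `EntityKey` class, the map-based store, and `invertLong` (assumed here to mean `Long.MAX_VALUE - value`) are hypothetical stand-ins, used only to show why a read with prefix 0 misses an entity written under an inverted start time.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for an entity row key: (idPrefix, entityId).
final class EntityKey {
  final long idPrefix;
  final String id;

  EntityKey(long idPrefix, String id) {
    this.idPrefix = idPrefix;
    this.id = id;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof EntityKey)) {
      return false;
    }
    EntityKey other = (EntityKey) o;
    return idPrefix == other.idPrefix && id.equals(other.id);
  }

  @Override
  public int hashCode() {
    return Objects.hash(idPrefix, id);
  }
}

final class IdPrefixMismatchSketch {
  // Assumption: "inverted" means Long.MAX_VALUE - value.
  static long invertLong(long value) {
    return Long.MAX_VALUE - value;
  }

  // Returns the metrics stored under (idPrefix, id), or "[]" when nothing matches.
  static String read(Map<EntityKey, String> store, long idPrefix, String id) {
    return store.getOrDefault(new EntityKey(idPrefix, id), "[]");
  }

  static Map<EntityKey, String> sampleStore(String id, long startTime) {
    Map<EntityKey, String> store = new HashMap<>();
    // RM-style write: default prefix 0, no metrics attached.
    store.put(new EntityKey(0L, id), "[]");
    // NM-style write: prefix is the inverted start time, metrics attached.
    store.put(new EntityKey(invertLong(startTime), id), "[MEMORY, CPU]");
    return store;
  }

  public static void main(String[] args) {
    String id = "container_e18_1553685341603_0006_01_01";
    long startTime = 1553695002014L;
    Map<EntityKey, String> store = sampleStore(id, startTime);
    System.out.println(read(store, 0L, id));                    // RM entry only
    System.out.println(read(store, invertLong(startTime), id)); // NM entry
  }
}
```

In this toy model, a reader that always looks up prefix 0 only ever sees the RM-style entry, which matches the empty `metrics` array in the reported output.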



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9795) ClusterMetrics to include AM allocation delay

2019-09-04 Thread Fengnan Li (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16923048#comment-16923048
 ] 

Fengnan Li commented on YARN-9795:
--

Thanks very much [~Tao Yang] for the review. Uploaded [^YARN-9795.002.patch]

> ClusterMetrics to include AM allocation delay
> -
>
> Key: YARN-9795
> URL: https://issues.apache.org/jira/browse/YARN-9795
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Minor
> Attachments: YARN-9795.001.patch, YARN-9795.002.patch
>
>
> Add AM container allocation in QueueMetrics to help diagnose performance 
> issues. This follows 
> [YARN-2802|https://jira.apache.org/jira/browse/YARN-2802]
>  






[jira] [Updated] (YARN-9728)  ResourceManager REST API can produce an illegal xml response

2019-09-04 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9728:

Attachment: YARN-9728-005.patch

>  ResourceManager REST API can produce an illegal xml response
> -
>
> Key: YARN-9728
> URL: https://issues.apache.org/jira/browse/YARN-9728
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: api, resourcemanager
>Affects Versions: 2.7.3
>Reporter: Thomas
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: IllegalResponseChrome.png, YARN-9728-001.patch, 
> YARN-9728-002.patch, YARN-9728-003.patch, YARN-9728-004.patch, 
> YARN-9728-005.patch
>
>
> When a Spark job throws an exception whose message contains a character 
> outside the range supported by XML 1.0, the application fails and the stack 
> trace is stored in the {{diagnostics}} field. So far, so good.
> The issue occurs when we try to get the application information through the 
> ResourceManager REST API:
>  the XML response contains the illegal XML 1.0 character and is therefore invalid.
>  *+Examples of illegal characters in XML 1.0:+* 
>  * {{\u0000}}
>  * {{\u0001}}
>  * {{\u0002}}
>  * {{\u0003}}
>  * {{\u0004}}
> _For more information about supported characters:_
>  [https://www.w3.org/TR/xml/#charsets]
> *+Example of an illegal response from the ResourceManager API:+* 
> {code:xml}
> 
> 
>   application_1326821518301_0005
>   user1
>   job
>   a1
>   FINISHED
>   FAILED
>   100.0
>   History
>   
> http://host.domain.com:8088/proxy/application_1326821518301_0005/jobhistory/job/job_1326821518301_5_5
>   Exception in thread "main" java.lang.Exception: \u0001
>   at com..main(JobWithSpecialCharMain.java:6)
>   [...]
> 
> {code}
>  
> *+Example of job to reproduce :+*
> {code:java}
> public class JobWithSpecialCharMain {
>  public static void main(String[] args) throws Exception {
>   throw new Exception("\u0001");
>  }
> }
> {code}
> {code:bash}
> javac -d . JobWithSpecialCharMain.java
> jar cvf repro.jar com/
> spark-submit --class com.JobWithSpecialCharMain --master yarn-cluster 
> repro.jar
> {code}
> !IllegalResponseChrome.png!
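
One possible mitigation, sketched here under assumptions and not taken from the attached patches, is to drop code points that fall outside the XML 1.0 `Char` production before the diagnostics string is serialized. `XmlSanitizer` and `stripIllegalXml10` are hypothetical names chosen for illustration.

```java
final class XmlSanitizer {
  // True if the code point matches the XML 1.0 Char production:
  // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
  static boolean isLegalXml10(int cp) {
    return cp == 0x9 || cp == 0xA || cp == 0xD
        || (cp >= 0x20 && cp <= 0xD7FF)
        || (cp >= 0xE000 && cp <= 0xFFFD)
        || (cp >= 0x10000 && cp <= 0x10FFFF);
  }

  // Returns a copy of the input with every illegal code point removed.
  static String stripIllegalXml10(String input) {
    StringBuilder out = new StringBuilder(input.length());
    int i = 0;
    while (i < input.length()) {
      int cp = input.codePointAt(i);
      if (isLegalXml10(cp)) {
        out.appendCodePoint(cp);
      }
      // Advance by one or two chars depending on whether cp is supplementary.
      i += Character.charCount(cp);
    }
    return out.toString();
  }

  public static void main(String[] args) {
    String diagnostics = "java.lang.Exception: " + (char) 1;
    System.out.println(stripIllegalXml10(diagnostics));
  }
}
```

Removing the character is only one design choice; replacing it with a visible placeholder would preserve the fact that something was there, at the cost of altering the message differently.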






[jira] [Updated] (YARN-9795) ClusterMetrics to include AM allocation delay

2019-09-04 Thread Fengnan Li (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fengnan Li updated YARN-9795:
-
Attachment: YARN-9795.002.patch

> ClusterMetrics to include AM allocation delay
> -
>
> Key: YARN-9795
> URL: https://issues.apache.org/jira/browse/YARN-9795
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Minor
> Attachments: YARN-9795.001.patch, YARN-9795.002.patch
>
>
> Add AM container allocation in QueueMetrics to help diagnose performance 
> issues. This follows 
> [YARN-2802|https://jira.apache.org/jira/browse/YARN-2802]
>  






[jira] [Commented] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.

2019-09-04 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16923046#comment-16923046
 ] 

Hadoop QA commented on YARN-8995:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
32s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
49s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 24s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 17s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
51s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
40s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
41s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 81m 39s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.2 Server=19.03.2 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | YARN-8995 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12979507/YARN-8995.016.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 9c16f2568269 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3db7184 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| 

[jira] [Commented] (YARN-9795) ClusterMetrics to include AM allocation delay

2019-09-04 Thread Tao Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16923024#comment-16923024
 ] 

Tao Yang commented on YARN-9795:


Thanks [~fengnanli] for this improvement.
The patch almost LGTM. IMO there's no need to set -1 as the initial value of 
scheduledTime and add the special annotation; 0 should be the proper initial 
value, like the other times. The new checkstyle warnings should be fixed as well.

> ClusterMetrics to include AM allocation delay
> -
>
> Key: YARN-9795
> URL: https://issues.apache.org/jira/browse/YARN-9795
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Minor
> Attachments: YARN-9795.001.patch
>
>
> Add AM container allocation in QueueMetrics to help diagnose performance 
> issues. This follows 
> [YARN-2802|https://jira.apache.org/jira/browse/YARN-2802]
>  






[jira] [Assigned] (YARN-9812) mvn javadoc:javadoc fails in hadoop-sls

2019-09-04 Thread Abhishek Modi (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Modi reassigned YARN-9812:
---

Assignee: Abhishek Modi

> mvn javadoc:javadoc fails in hadoop-sls
> ---
>
> Key: YARN-9812
> URL: https://issues.apache.org/jira/browse/YARN-9812
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Reporter: Akira Ajisaka
>Assignee: Abhishek Modi
>Priority: Major
>  Labels: newbie
>
> {noformat}
> [ERROR] 
> hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/appmaster/DAGAMSimulator.java:57:
>  error: bad use of '>'
> [ERROR]  * pending -> requests which are NOT yet sent to RM.
> [ERROR] ^
> [ERROR] 
> hadoop-mirror/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/appmaster/DAGAMSimulator.java:58:
>  error: bad use of '>'
> [ERROR]  * scheduled -> requests which are sent to RM but not yet assigned.
> [ERROR]   ^
> [ERROR] 
> hadoop-mirror/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/appmaster/DAGAMSimulator.java:59:
>  error: bad use of '>'
> [ERROR]  * assigned -> requests which are assigned to a container.
> [ERROR]  ^
> [ERROR] 
> hadoop-mirror/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/appmaster/DAGAMSimulator.java:60:
>  error: bad use of '>'
> [ERROR]  * completed -> request corresponding to which container has 
> completed.
> [ERROR]   ^
> {noformat}






[jira] [Commented] (YARN-9804) Update ATSv2 document for latest feature supports

2019-09-04 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16923019#comment-16923019
 ] 

Hudson commented on YARN-9804:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17227 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17227/])
YARN-9804. Update ATSv2 document for latest feature supports. (rohithsharmaks: 
rev 3db71840824c58344c2c59423fd605808785dc2c)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServiceV2.md


> Update ATSv2 document for latest feature supports
> -
>
> Key: YARN-9804
> URL: https://issues.apache.org/jira/browse/YARN-9804
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Blocker
> Attachments: YARN-9804.01.patch, YARN-9804.02.patch
>
>
> Revisit ATSv2 documents and update for GA features. And also for the road map.






[jira] [Commented] (YARN-9804) Update ATSv2 document for latest feature supports

2019-09-04 Thread Rohith Sharma K S (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16923012#comment-16923012
 ] 

Rohith Sharma K S commented on YARN-9804:
-

committing shortly

> Update ATSv2 document for latest feature supports
> -
>
> Key: YARN-9804
> URL: https://issues.apache.org/jira/browse/YARN-9804
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Blocker
> Attachments: YARN-9804.01.patch, YARN-9804.02.patch
>
>
> Revisit ATSv2 documents and update for GA features. And also for the road map.






[jira] [Commented] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.

2019-09-04 Thread Tao Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922996#comment-16922996
 ] 

Tao Yang commented on YARN-8995:


Hi [~zhuqi], I found another place that needs to be improved.  {{ if (qSize % 
detailsInterval == 0) }} should be updated to {{ if (qSize != 0 && qSize % 
detailsInterval == 0 && lastEventDetailsQueueSizeLogged != qSize )}} to avoid 
printing for an empty queue and printing the same details redundantly. 
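
The suggested condition can be sketched in isolation as follows. This is a hypothetical helper, not the AsyncDispatcher code: only `qSize`, `detailsInterval`, and `lastEventDetailsQueueSizeLogged` come from the comment above, while the class name, the method name, and the initial value of -1 are assumptions.

```java
final class QueueDetailsLogGate {
  private final int detailsInterval;
  // Queue size at which details were last printed; -1 means "never" (assumption).
  private int lastEventDetailsQueueSizeLogged = -1;

  QueueDetailsLogGate(int detailsInterval) {
    this.detailsInterval = detailsInterval;
  }

  // Returns true when details should be printed for this queue size,
  // and remembers the size so the same size is not reported twice in a row.
  boolean shouldLogDetails(int qSize) {
    if (qSize != 0 && qSize % detailsInterval == 0
        && lastEventDetailsQueueSizeLogged != qSize) {
      lastEventDetailsQueueSizeLogged = qSize;
      return true;
    }
    return false;
  }

  public static void main(String[] args) {
    QueueDetailsLogGate gate = new QueueDetailsLogGate(1000);
    System.out.println(gate.shouldLogDetails(0));    // empty queue
    System.out.println(gate.shouldLogDetails(1000)); // first time at 1000
    System.out.println(gate.shouldLogDetails(1000)); // same size again
  }
}
```

With an interval of 1000, the three calls in `main` print `false`, `true`, and `false`: the empty queue is skipped, the first hit at 1000 is logged, and the repeated size is suppressed.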

> Log the event type of the too big AsyncDispatcher event queue size, and add 
> the information to the metrics. 
> 
>
> Key: YARN-8995
> URL: https://issues.apache.org/jira/browse/YARN-8995
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: metrics, nodemanager, resourcemanager
>Affects Versions: 3.2.0, 3.3.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: TestStreamPerf.java, YARN-8995.001.patch, 
> YARN-8995.002.patch, YARN-8995.003.patch, YARN-8995.004.patch, 
> YARN-8995.005.patch, YARN-8995.006.patch, YARN-8995.007.patch, 
> YARN-8995.008.patch, YARN-8995.009.patch, YARN-8995.010.patch, 
> YARN-8995.011.patch, YARN-8995.012.patch, YARN-8995.013.patch, 
> YARN-8995.014.patch, image-2019-09-04-15-20-02-914.png
>
>
> In our growing cluster, there are unexpected situations in which some event 
> queues block the performance of the cluster, such as the bug in 
> https://issues.apache.org/jira/browse/YARN-5262 . I think it is necessary to 
> log the event type when the event queue size grows too big and to add that 
> information to the metrics, with the queue size threshold as a parameter that 
> can be changed.






[jira] [Updated] (YARN-9812) mvn javadoc:javadoc fails in hadoop-sls

2019-09-04 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated YARN-9812:

Labels: newbie  (was: )

> mvn javadoc:javadoc fails in hadoop-sls
> ---
>
> Key: YARN-9812
> URL: https://issues.apache.org/jira/browse/YARN-9812
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Reporter: Akira Ajisaka
>Priority: Major
>  Labels: newbie
>
> {noformat}
> [ERROR] 
> hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/appmaster/DAGAMSimulator.java:57:
>  error: bad use of '>'
> [ERROR]  * pending -> requests which are NOT yet sent to RM.
> [ERROR] ^
> [ERROR] 
> hadoop-mirror/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/appmaster/DAGAMSimulator.java:58:
>  error: bad use of '>'
> [ERROR]  * scheduled -> requests which are sent to RM but not yet assigned.
> [ERROR]   ^
> [ERROR] 
> hadoop-mirror/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/appmaster/DAGAMSimulator.java:59:
>  error: bad use of '>'
> [ERROR]  * assigned -> requests which are assigned to a container.
> [ERROR]  ^
> [ERROR] 
> hadoop-mirror/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/appmaster/DAGAMSimulator.java:60:
>  error: bad use of '>'
> [ERROR]  * completed -> request corresponding to which container has 
> completed.
> [ERROR]   ^
> {noformat}






[jira] [Created] (YARN-9812) mvn javadoc:javadoc fails in hadoop-sls

2019-09-04 Thread Akira Ajisaka (Jira)
Akira Ajisaka created YARN-9812:
---

 Summary: mvn javadoc:javadoc fails in hadoop-sls
 Key: YARN-9812
 URL: https://issues.apache.org/jira/browse/YARN-9812
 Project: Hadoop YARN
  Issue Type: Bug
  Components: documentation
Reporter: Akira Ajisaka


{noformat}
[ERROR] 
hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/appmaster/DAGAMSimulator.java:57:
 error: bad use of '>'
[ERROR]  * pending -> requests which are NOT yet sent to RM.
[ERROR] ^
[ERROR] 
hadoop-mirror/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/appmaster/DAGAMSimulator.java:58:
 error: bad use of '>'
[ERROR]  * scheduled -> requests which are sent to RM but not yet assigned.
[ERROR]   ^
[ERROR] 
hadoop-mirror/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/appmaster/DAGAMSimulator.java:59:
 error: bad use of '>'
[ERROR]  * assigned -> requests which are assigned to a container.
[ERROR]  ^
[ERROR] 
hadoop-mirror/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/appmaster/DAGAMSimulator.java:60:
 error: bad use of '>'
[ERROR]  * completed -> request corresponding to which container has completed.
[ERROR]   ^
{noformat}
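
A common way to resolve this class of javadoc error, shown here as a sketch and not as the committed fix, is to wrap the arrow in `{@literal ...}` (or to write `-&gt;`) so that the javadoc tool no longer sees a bare `>`. `RequestStates` is a hypothetical class; only the four comment lines mirror the error output above.

```java
/**
 * Request states, written so that javadoc accepts the arrows.
 * pending {@literal ->} requests which are NOT yet sent to RM.
 * scheduled {@literal ->} requests which are sent to RM but not yet assigned.
 * assigned {@literal ->} requests which are assigned to a container.
 * completed {@literal ->} request corresponding to which container has completed.
 */
final class RequestStates {
  // Hypothetical constant list, present only so the class has a body to document.
  static final String[] NAMES = {"pending", "scheduled", "assigned", "completed"};

  public static void main(String[] args) {
    System.out.println(String.join(",", NAMES));
  }
}
```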






[jira] [Commented] (YARN-9810) Add queue capacity/maxcapacity percentage metrics

2019-09-04 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922920#comment-16922920
 ] 

Hadoop QA commented on YARN-9810:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
 7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  6s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  7s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 77m 35s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:blue}0{color} | {color:blue} asflicense {color} | {color:blue}  0m 
12s{color} | {color:blue} ASF License check generated no output? {color} |
| {color:black}{color} | {color:black} {color} | {color:black}128m 41s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | YARN-9810 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12979463/YARN-9810.01.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 54cc2a75091c 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 
16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 337e9b7 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/24727/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24727/testReport/ |
| Max. process+thread count | 827 (vs. ulimit of 5500) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/24727/con

[jira] [Commented] (YARN-9795) ClusterMetrics to include AM allocation delay

2019-09-04 Thread Wangda Tan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922915#comment-16922915
 ] 

Wangda Tan commented on YARN-9795:
--

[~fengnanli], thanks for working on the Jira. I just added you to the contributor 
list so you can assign YARN JIRAs to yourself in the future. It looks like an 
important improvement.

[~Tao Yang] , [~tangzhankun] can you help to review the patch? Thanks

> ClusterMetrics to include AM allocation delay
> -
>
> Key: YARN-9795
> URL: https://issues.apache.org/jira/browse/YARN-9795
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Minor
> Attachments: YARN-9795.001.patch
>
>
> Add AM container allocation in QueueMetrics to help diagnose performance 
> issues. This follows 
> [YARN-2802|https://jira.apache.org/jira/browse/YARN-2802]
>  






[jira] [Assigned] (YARN-9795) ClusterMetrics to include AM allocation delay

2019-09-04 Thread Wangda Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan reassigned YARN-9795:


Assignee: Fengnan Li

> ClusterMetrics to include AM allocation delay
> -
>
> Key: YARN-9795
> URL: https://issues.apache.org/jira/browse/YARN-9795
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Minor
> Attachments: YARN-9795.001.patch
>
>
> Add AM container allocation in QueueMetrics to help diagnose performance 
> issues. This follows 
> [YARN-2802|https://jira.apache.org/jira/browse/YARN-2802]
>  






[jira] [Commented] (YARN-9561) Add C changes for the new RuncContainerRuntime

2019-09-04 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922888#comment-16922888
 ] 

Eric Yang commented on YARN-9561:
-

[~ebadger] thank you for the debugging session today.  I got mapreduce pi to 
run correctly after adding /etc/krb5.conf to the default mount location.  Some 
suggestions to improve this:

1.  The current output looks like this when the container run fails:

{code}
[2019-09-04 13:26:30.726]Exception from container-launch.
Container id: container_1567624987243_0004_01_06
Exit code: 1
Exception message: Launch container failed

[2019-09-04 13:26:30.731]Container exited with a non-zero exit code 1. Error 
file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :


[2019-09-04 13:26:30.734]Container exited with a non-zero exit code 1. Error 
file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :



2019-09-04 13:26:31,455 INFO mapreduce.Job: Task Id : 
attempt_1567624987243_0004_m_03_0, Status : FAILED
{code}

Print a line of output before calling runc, for example that prelaunch 
initialization is completed, or print the formatted JSON in prelaunch.out.  This 
helps to narrow down whether the root of the problem is the user's job 
configuration or a bug in the container-executor code.  The Docker runtime shows 
the command line used for calling docker, which helps to troubleshoot the actual 
problem sooner.

2.  ENTRY_POINT support.  Instead of calling out to launch_container.sh, it 
would be nice to dup stdout and stderr without the launch_container.sh wrapper.  
This removes the requirement of bind mounting log or workdir directories into 
the container for some use cases.

3.  User defined properties integration:

The YARN Docker integration has a list of [configurable 
properties|https://hadoop.apache.org/docs/r3.2.0/hadoop-yarn/hadoop-yarn-site/DockerContainers.html#Application_Submission].
  These settings do not work with runc containers today.  To avoid hindering 
progress, I suggest opening new issues to improve the integration.

4.  YARN service uses the properties defined in #3 to customize YARN service 
mount points, the network to use, and the privileged container flag.  Similar 
feature sets need new tickets to ensure the new runtime can integrate well with 
YARN service programming interfaces.

> Add C changes for the new RuncContainerRuntime
> --
>
> Key: YARN-9561
> URL: https://issues.apache.org/jira/browse/YARN-9561
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-9561.001.patch, YARN-9561.002.patch, 
> YARN-9561.003.patch, YARN-9561.004.patch
>
>
> This JIRA will be used to add the C changes to the container-executor native 
> binary that are necessary for the new RuncContainerRuntime. There should be 
> no changes to existing code paths. 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9728)  ResourceManager REST API can produce an illegal xml response

2019-09-04 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922827#comment-16922827
 ] 

Hadoop QA commented on YARN-9728:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
47s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 11s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
19s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
18s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
34s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 15s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 1 new + 265 unchanged - 0 fixed = 266 total (was 265) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 43s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
56s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
52s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 80m 
54s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
44s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}167m  6s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | YARN-9728 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12979370/YARN-9728-004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 2fa4628c520e 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019

[jira] [Commented] (YARN-9795) ClusterMetrics to include AM allocation delay

2019-09-04 Thread Fengnan Li (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922759#comment-16922759
 ] 

Fengnan Li commented on YARN-9795:
--

[~leftnoteasy] Can you help here? Thanks!

> ClusterMetrics to include AM allocation delay
> -
>
> Key: YARN-9795
> URL: https://issues.apache.org/jira/browse/YARN-9795
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Fengnan Li
>Priority: Minor
> Attachments: YARN-9795.001.patch
>
>
> Add AM container allocation in QueueMetrics to help diagnose performance 
> issues. This follows 
> [YARN-2802|https://jira.apache.org/jira/browse/YARN-2802]
>  






[jira] [Assigned] (YARN-9764) Print application submission context label in application summary

2019-09-04 Thread Varun Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena reassigned YARN-9764:
--

Assignee: Manoj Kumar

> Print application submission context label in application summary
> -
>
> Key: YARN-9764
> URL: https://issues.apache.org/jira/browse/YARN-9764
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jonathan Hung
>Assignee: Manoj Kumar
>Priority: Major
>  Labels: release-blocker
>







[jira] [Assigned] (YARN-9762) Add submission context label to audit logs

2019-09-04 Thread Varun Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena reassigned YARN-9762:
--

Assignee: Manoj Kumar

> Add submission context label to audit logs
> --
>
> Key: YARN-9762
> URL: https://issues.apache.org/jira/browse/YARN-9762
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jonathan Hung
>Assignee: Manoj Kumar
>Priority: Major
>  Labels: release-blocker
>
> Currently we log NODELABEL in container allocation/release audit logs; we 
> should also log the NODELABEL of the application submission context on app 
> submission.






[jira] [Assigned] (YARN-9763) Print application tags in application summary

2019-09-04 Thread Varun Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena reassigned YARN-9763:
--

Assignee: Manoj Kumar

> Print application tags in application summary
> -
>
> Key: YARN-9763
> URL: https://issues.apache.org/jira/browse/YARN-9763
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jonathan Hung
>Assignee: Manoj Kumar
>Priority: Major
>  Labels: release-blocker
>







[jira] [Assigned] (YARN-9810) Add queue capacity/maxcapacity percentage metrics

2019-09-04 Thread Varun Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena reassigned YARN-9810:
--

Assignee: Shubham Gupta  (was: Jonathan Hung)

> Add queue capacity/maxcapacity percentage metrics
> -
>
> Key: YARN-9810
> URL: https://issues.apache.org/jira/browse/YARN-9810
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jonathan Hung
>Assignee: Shubham Gupta
>Priority: Major
>  Labels: release-blocker
> Attachments: YARN-9810.01.patch
>
>
> Similar to YARN-9085, it'd be good to have queue (absolute) capacity / 
> (absolute) max capacity metrics in CSQueueMetrics.






[jira] [Commented] (YARN-9810) Add queue capacity/maxcapacity percentage metrics

2019-09-04 Thread Jonathan Hung (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922690#comment-16922690
 ] 

Jonathan Hung commented on YARN-9810:
-

Uploading 01 patch on behalf of [~shubham29].

> Add queue capacity/maxcapacity percentage metrics
> -
>
> Key: YARN-9810
> URL: https://issues.apache.org/jira/browse/YARN-9810
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
>  Labels: release-blocker
> Attachments: YARN-9810.01.patch
>
>
> Similar to YARN-9085, it'd be good to have queue (absolute) capacity / 
> (absolute) max capacity metrics in CSQueueMetrics.






[jira] [Updated] (YARN-9810) Add queue capacity/maxcapacity percentage metrics

2019-09-04 Thread Jonathan Hung (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated YARN-9810:

Attachment: YARN-9810.01.patch

> Add queue capacity/maxcapacity percentage metrics
> -
>
> Key: YARN-9810
> URL: https://issues.apache.org/jira/browse/YARN-9810
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
>  Labels: release-blocker
> Attachments: YARN-9810.01.patch
>
>
> Similar to YARN-9085, it'd be good to have queue (absolute) capacity / 
> (absolute) max capacity metrics in CSQueueMetrics.






[jira] [Updated] (YARN-9810) Add queue capacity/maxcapacity percentage metrics

2019-09-04 Thread Jonathan Hung (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated YARN-9810:

Target Version/s: 2.10.0
  Labels: release-blocker  (was: )

> Add queue capacity/maxcapacity percentage metrics
> -
>
> Key: YARN-9810
> URL: https://issues.apache.org/jira/browse/YARN-9810
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
>  Labels: release-blocker
>
> Similar to YARN-9085, it'd be good to have queue (absolute) capacity / 
> (absolute) max capacity metrics in CSQueueMetrics.






[jira] [Assigned] (YARN-9810) Add queue capacity/maxcapacity percentage metrics

2019-09-04 Thread Jonathan Hung (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung reassigned YARN-9810:
---

Assignee: Jonathan Hung

> Add queue capacity/maxcapacity percentage metrics
> -
>
> Key: YARN-9810
> URL: https://issues.apache.org/jira/browse/YARN-9810
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
>
> Similar to YARN-9085, it'd be good to have queue (absolute) capacity / 
> (absolute) max capacity metrics in CSQueueMetrics.






[jira] [Commented] (YARN-9810) Add queue capacity/maxcapacity percentage metrics

2019-09-04 Thread Shubham Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922633#comment-16922633
 ] 

Shubham Gupta commented on YARN-9810:
-

+1

> Add queue capacity/maxcapacity percentage metrics
> -
>
> Key: YARN-9810
> URL: https://issues.apache.org/jira/browse/YARN-9810
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jonathan Hung
>Priority: Major
>
> Similar to YARN-9085, it'd be good to have queue (absolute) capacity / 
> (absolute) max capacity metrics in CSQueueMetrics.






[jira] [Comment Edited] (YARN-9761) Allow overriding application submissions based on server side configs

2019-09-04 Thread pralabhkumar (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922482#comment-16922482
 ] 

pralabhkumar edited comment on YARN-9761 at 9/4/19 3:31 PM:


Addressed [~jhung]'s comment.

testSubmissionContextWithAbsentTAG is in line with 
testAppSubmitWithSubmissionPreProcessor (the method length exceeds 150, 
which is why a separate method was created).


was (Author: pralabhkumar):
Address jonathan comment

> Allow overriding application submissions based on server side configs
> -
>
> Key: YARN-9761
> URL: https://issues.apache.org/jira/browse/YARN-9761
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Jonathan Hung
>Assignee: pralabhkumar
>Priority: Major
>  Labels: release-blocker
> Attachments: YARN-9761.01.patch, YARN-9761.02.patch, 
> YARN-9761.03.patch, YARN-9761.04.patch, YARN-9761.05.patch
>
>
> Create a preprocessor/interceptor which takes each app submitted to RM and 
> overrides the submission context based on server side configs.
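
The preprocessor idea described above can be illustrated with a minimal sketch. Everything here is hypothetical: the class names, the per-user rule table, and the two overridable fields (queue and node label) are assumptions made for illustration and are not taken from the attached patches.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an application submission context; not a YARN class. */
class ToySubmissionContext {
    final String user;
    String queue;
    String nodeLabel;

    ToySubmissionContext(String user, String queue, String nodeLabel) {
        this.user = user;
        this.queue = queue;
        this.nodeLabel = nodeLabel;
    }
}

/** Hypothetical preprocessor: server-side rules keyed by user override client-supplied fields. */
class ToySubmissionPreProcessor {
    // user -> (field name -> value enforced by the server)
    private final Map<String, Map<String, String>> rules = new HashMap<>();

    void addRule(String user, String field, String value) {
        rules.computeIfAbsent(user, k -> new HashMap<>()).put(field, value);
    }

    /** Applies any matching overrides in place; contexts without rules are left untouched. */
    void preProcess(ToySubmissionContext ctx) {
        Map<String, String> overrides = rules.get(ctx.user);
        if (overrides == null) {
            return;
        }
        if (overrides.containsKey("queue")) {
            ctx.queue = overrides.get("queue");
        }
        if (overrides.containsKey("nodeLabel")) {
            ctx.nodeLabel = overrides.get("nodeLabel");
        }
    }
}
```

In this sketch the override happens before the scheduler would see the submission, so whatever the client asked for is replaced only for the fields that have a server-side rule.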






[jira] [Commented] (YARN-9776) yarn logs throws an error "Not a valid BCFile"

2019-09-04 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922561#comment-16922561
 ] 

Prabhu Joseph commented on YARN-9776:
-

[~jordanagoodboy]  The meta file that was removed was written in IFile format, 
whereas the client reads log files using the TFile format. Setting 
yarn.log-aggregation.file-formats to IFile on the client machine would have 
solved the issue. This does not look like a bug. Can you share what we need to 
fix as part of this Jira? Thanks.

 

 

> yarn logs throws an error "Not a valid BCFile"
> --
>
> Key: YARN-9776
> URL: https://issues.apache.org/jira/browse/YARN-9776
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Affects Versions: 3.1.0
> Environment: HDP 3.1.0.78
>  
>Reporter: agoodboy
>Priority: Major
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Env: hdp 3.1.0.0-78
> Command: yarn logs -applicationId xxx throws an error "Not a valid BCFile." 
> and then exits.
> After enabling debug logging with export YARN_ROOT_LOGGER="DEBUG,console" and 
> rerunning the command, it shows that 
> "fileName=/data1/app-logs/hadoop/logs/application_1566555356033_0032/meta" is 
> not a valid BCFile. After I remove the file from HDFS and rerun the command, 
> it succeeds. 
> So, how is this meta file generated? 
> I guess it is because of this in yarn-site.xml:
> <property>
>  <name>yarn.timeline-service.generic-application-history.save-non-am-container-meta-info</name>
>  <value>true</value>
> </property>
> I set this value to true because I want to see all container logs in the 
> timeline web page; otherwise I can only see the AM container log.
> So, it seems that yarn logs can't properly handle this situation.






[jira] [Commented] (YARN-9809) NMs should supply a health status when registering with RM

2019-09-04 Thread Eric Badger (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922542#comment-16922542
 ] 

Eric Badger commented on YARN-9809:
---

bq. Although it is good to have a way to prevent scheduling containers to a 
node manager that is going through registration process to save network round 
trips and compute resources, the existing async design allows the node to show 
up in Resource Manager as quickly as possible to improve system admin user 
experience.

But if that node is bad, then registering to the RM is just adding unnecessary 
work. The NM health check script can check for many things that are known 
without a container being run. For example, docker might not be installed, or 
nscd might not be running (causing a user lookup for every new container). These could 
be reasons for the node to declare itself as unhealthy depending on the 
specific health check script. If we register with the RM and then declare the 
node unhealthy afterwards then we have to kill every container that was 
scheduled in the period between registration and first heartbeat.
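
The cost described above can be made concrete with a small toy model. This is only a sketch under stated assumptions: ToyResourceManager and its methods are hypothetical and greatly simplified, and they are not the RM's real registration or heartbeat API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified model of an RM's node tracking; not YARN code. */
class ToyResourceManager {
    // nodeId -> last known health (true = healthy)
    private final Map<String, Boolean> nodes = new LinkedHashMap<>();
    // nodeId -> containers currently placed on that node
    private final Map<String, List<String>> containers = new LinkedHashMap<>();
    private int killed = 0;

    /** Registration that carries the NM's own health-check result. */
    void register(String nodeId, boolean healthy) {
        nodes.put(nodeId, healthy);
        containers.put(nodeId, new ArrayList<>());
    }

    /** Registration without health info: the RM has to assume the node is healthy. */
    void register(String nodeId) {
        register(nodeId, true);
    }

    /** Places a container on the first node believed healthy; returns the node id, or null. */
    String schedule(String containerId) {
        for (Map.Entry<String, Boolean> e : nodes.entrySet()) {
            if (e.getValue()) {
                containers.get(e.getKey()).add(containerId);
                return e.getKey();
            }
        }
        return null;
    }

    /** Heartbeat: an unhealthy report kills everything already placed on the node. */
    void heartbeat(String nodeId, boolean healthy) {
        nodes.put(nodeId, healthy);
        if (!healthy) {
            List<String> running = containers.get(nodeId);
            killed += running.size();
            running.clear();
        }
    }

    int killedContainers() {
        return killed;
    }
}

class RegistrationHealthDemo {
    /** Registers one unhealthy node, tries to schedule 3 containers, then sends the first heartbeat. */
    static int run(boolean reportHealthAtRegistration) {
        ToyResourceManager rm = new ToyResourceManager();
        if (reportHealthAtRegistration) {
            rm.register("nm-1", false);
        } else {
            rm.register("nm-1");
        }
        for (int i = 0; i < 3; i++) {
            rm.schedule("container-" + i);
        }
        rm.heartbeat("nm-1", false);
        return rm.killedContainers();
    }

    public static void main(String[] args) {
        System.out.println("killed without health at registration: " + run(false));
        System.out.println("killed with health at registration: " + run(true));
    }
}
```

In this model, the node that registers without a health status receives all three containers and loses them at the first heartbeat, while the node that reports itself unhealthy at registration is never scheduled.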

> NMs should supply a health status when registering with RM
> --
>
> Key: YARN-9809
> URL: https://issues.apache.org/jira/browse/YARN-9809
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
>
> Currently if the NM registers with the RM and it is unhealthy, it can be 
> scheduled many containers before the first heartbeat. After the first 
> heartbeat, the RM will mark the NM as unhealthy and kill all of the 
> containers.






[jira] [Updated] (YARN-9761) Allow overriding application submissions based on server side configs

2019-09-04 Thread pralabhkumar (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

pralabhkumar updated YARN-9761:
---
Attachment: YARN-9761.05.patch

> Allow overriding application submissions based on server side configs
> -
>
> Key: YARN-9761
> URL: https://issues.apache.org/jira/browse/YARN-9761
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Jonathan Hung
>Assignee: pralabhkumar
>Priority: Major
>  Labels: release-blocker
> Attachments: YARN-9761.01.patch, YARN-9761.02.patch, 
> YARN-9761.03.patch, YARN-9761.04.patch, YARN-9761.05.patch
>
>
> Create a preprocessor/interceptor which takes each app submitted to RM and 
> overrides the submission context based on server side configs.






[jira] [Commented] (YARN-9776) yarn logs throws an error "Not a valid BCFile"

2019-09-04 Thread agoodboy (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922472#comment-16922472
 ] 

agoodboy commented on YARN-9776:


[~Prabhu Joseph]  I don't think that we should close the issue. 

> yarn logs throws an error "Not a valid BCFile"
> --
>
> Key: YARN-9776
> URL: https://issues.apache.org/jira/browse/YARN-9776
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Affects Versions: 3.1.0
> Environment: HDP 3.1.0.78
>  
>Reporter: agoodboy
>Priority: Major
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Env: hdp 3.1.0.0-78
> Command: yarn logs -applicationId xxx throws an error "Not a valid BCFile." 
> and then exits.
> After enabling debug logging with export YARN_ROOT_LOGGER="DEBUG,console" and 
> rerunning the command, it shows that 
> "fileName=/data1/app-logs/hadoop/logs/application_1566555356033_0032/meta" is 
> not a valid BCFile. After I remove the file from HDFS and rerun the command, 
> it succeeds. 
> So, how is this meta file generated? 
> I guess it is because of this in yarn-site.xml:
> <property>
>  <name>yarn.timeline-service.generic-application-history.save-non-am-container-meta-info</name>
>  <value>true</value>
> </property>
> I set this value to true because I want to see all container logs in the 
> timeline web page; otherwise I can only see the AM container log.
> So, it seems that yarn logs can't properly handle this situation.






[jira] [Updated] (YARN-9030) Log aggregation changes to handle filesystems which do not support setting permissions

2019-09-04 Thread Adam Antal (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Antal updated YARN-9030:
-
Component/s: log-aggregation

> Log aggregation changes to handle filesystems which do not support setting 
> permissions
> --
>
> Key: YARN-9030
> URL: https://issues.apache.org/jira/browse/YARN-9030
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Reporter: Suma Shivaprasad
>Assignee: Suma Shivaprasad
>Priority: Major
> Fix For: 3.3.0, 3.2.1
>
> Attachments: YARN-9030.1.patch, YARN-9030.2.patch
>
>
> Some cloud storages like ADLS do not support permissions in which case they 
> throw an UnsupportedOperationException. Log aggregation code should 
> log/ignore these exceptions and not set permissions henceforth for log 
> aggregation base dir/sub dirs 
> {noformat}
> 2018-11-12 15:37:28,726 WARN  logaggregation.LogAggregationService 
> (LogAggregationService.java:initApp(209)) - Application failed to init 
> aggregation
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to check 
> permissions for dir [abfs://testc...@test.blob.core.windows.net/app-logs]
> at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileController.verifyAndCreateRemoteLogDir(LogAggregationFileController.java:277)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initAppAggregator(LogAggregationService.java:238)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:204)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:347)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:69)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}






[jira] [Created] (YARN-9811) FederationInterceptor fails to recover in Kerberos environment

2019-09-04 Thread Xie YiFan (Jira)
Xie YiFan created YARN-9811:
---

 Summary: FederationInterceptor fails to recover in Kerberos 
environment
 Key: YARN-9811
 URL: https://issues.apache.org/jira/browse/YARN-9811
 Project: Hadoop YARN
  Issue Type: Bug
  Components: amrmproxy
Reporter: Xie YiFan
Assignee: Xie YiFan


*scenario*:
 Start up a cluster in a Kerberos environment with recovery and AMRMProxy 
enabled in the NM. Submit one application to the cluster, and restart the NM 
that hosts the master container. The NM will block in FederationInterceptor 
recovery.

*LOG*
{code:java}
INFO org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor: 
Recovering data for FederationInterceptor
INFO org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor: 
Found 0 existing UAMs for application application_1561534175896_4102 in 
NMStateStore
INFO org.apache.hadoop.yarn.server.utils.AMRMClientUtils: Creating RMProxy to 
RM online-bx for protocol ApplicationClientProtocol for user recommend 
(auth:SIMPLE)
INFO 
org.apache.hadoop.yarn.server.federation.failover.FederationRMFailoverProxyProvider:
 Initialized Federation proxy for user: recommend
INFO 
org.apache.hadoop.yarn.server.federation.failover.FederationRMFailoverProxyProvider:
 Failing over to the ResourceManager for SubClusterId: online-bx
INFO 
org.apache.hadoop.yarn.server.federation.failover.FederationRMFailoverProxyProvider:
 Connecting to /10.88.86.142:8032 subClusterId online-bx with protocol 
ApplicationClientProtocol as user recommend (auth:SIMPLE)
WARN org.apache.hadoop.ipc.Client: Exception encountered while connecting to 
the server : org.apache.hadoop.security.AccessControlException: Client cannot 
authenticate via:[TOKEN, KERBEROS]
INFO 
org.apache.hadoop.yarn.server.federation.failover.FederationRMFailoverProxyProvider:
 Failing over to the ResourceManager for SubClusterId: online-bx
INFO org.apache.hadoop.yarn.server.federation.utils.FederationStateStoreFacade: 
Flushing subClusters from cache and rehydrating from store, most likely on 
account of RM failover.
INFO 
org.apache.hadoop.yarn.server.federation.failover.FederationRMFailoverProxyProvider:
 Connecting to /10.88.86.142:8032 subClusterId online-bx with protocol 
ApplicationClientProtocol as user recommend (auth:SIMPLE)
WARN org.apache.hadoop.ipc.Client: Exception encountered while connecting to 
the server : org.apache.hadoop.security.AccessControlException: Client cannot 
authenticate via:[TOKEN, KERBEROS]
INFO org.apache.hadoop.io.retry.RetryInvocationHandler: java.io.IOException: 
DestHost:destPort hadoop1684.bx.momo.com:8032 , LocalHost:localPort 
hadoop999.bx.momo.com/10.88.64.186:0. Failed on local exception: 
java.io.IOException: org.apache.hadoop.security.AccessControlException: Client 
cannot authenticate via:[TOKEN, KERBEROS], while invoking 
ApplicationClientProtocolPBClientImpl.getContainers over online-bx after 1 
failover attempts. Trying to failover after sleeping for 3244ms.{code}
*Analysis*

rmClient.getContainers is called, but the AuthMethod of appSubmitter is SIMPLE. 
We should use createProxyUser instead of createRemoteUser in a secure 
(Kerberos) environment.
{code:java}
// Current code: createRemoteUser yields a UGI whose auth method is SIMPLE
UserGroupInformation appSubmitter = UserGroupInformation
    .createRemoteUser(getApplicationContext().getUser());
ApplicationClientProtocol rmClient =
    createHomeRMProxy(getApplicationContext(),
        ApplicationClientProtocol.class, appSubmitter);
GetContainersResponse response = rmClient
    .getContainers(GetContainersRequest.newInstance(this.attemptId));
{code}






[jira] [Commented] (YARN-9804) Update ATSv2 document for latest feature supports

2019-09-04 Thread Sunil Govindan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922320#comment-16922320
 ] 

Sunil Govindan commented on YARN-9804:
--

+1. Thanks [~rohithsharma]

> Update ATSv2 document for latest feature supports
> -
>
> Key: YARN-9804
> URL: https://issues.apache.org/jira/browse/YARN-9804
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Blocker
> Attachments: YARN-9804.01.patch, YARN-9804.02.patch
>
>
> Revisit ATSv2 documents and update for GA features. And also for the road map.






[jira] [Commented] (YARN-9698) [Umbrella] Tools to help migration from Fair Scheduler to Capacity Scheduler

2019-09-04 Thread Gergely Pollak (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922315#comment-16922315
 ] 

Gergely Pollak commented on YARN-9698:
--

I'm attaching the documentation we created with [~Prabhu Joseph] [~sunilg] 
[~wangda] [~wilfreds] [~snemeth].

It includes the main features of the schedulers and the configuration mapping. 
Please feel free to comment and share your thoughts.

> [Umbrella] Tools to help migration from Fair Scheduler to Capacity Scheduler
> 
>
> Key: YARN-9698
> URL: https://issues.apache.org/jira/browse/YARN-9698
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Weiwei Yang
>Priority: Major
>  Labels: fs2cs
> Attachments: FS-CS Migration.pdf
>
>
> We see that some users want to migrate from Fair Scheduler to Capacity 
> Scheduler. This Jira is created as an umbrella to track all related efforts 
> for the migration. The scope contains:
>  * Bug fixes
>  * Adding missing features
>  * Migration tools that help to generate CS configs based on FS, validate 
> configs, etc.
>  * Documents
> This is part of the CS component; the purpose is to make the migration 
> process smooth.






[jira] [Commented] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.

2019-09-04 Thread Weiwei Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922309#comment-16922309
 ] 

Weiwei Yang commented on YARN-8995:
---

Also looks good to me, [~Tao Yang], feel free to commit this.

> Log the event type of the too big AsyncDispatcher event queue size, and add 
> the information to the metrics. 
> 
>
> Key: YARN-8995
> URL: https://issues.apache.org/jira/browse/YARN-8995
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: metrics, nodemanager, resourcemanager
>Affects Versions: 3.2.0, 3.3.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: TestStreamPerf.java, YARN-8995.001.patch, 
> YARN-8995.002.patch, YARN-8995.003.patch, YARN-8995.004.patch, 
> YARN-8995.005.patch, YARN-8995.006.patch, YARN-8995.007.patch, 
> YARN-8995.008.patch, YARN-8995.009.patch, YARN-8995.010.patch, 
> YARN-8995.011.patch, YARN-8995.012.patch, YARN-8995.013.patch, 
> YARN-8995.014.patch, image-2019-09-04-15-20-02-914.png
>
>
> In our growing cluster, unexpected situations can cause some event 
> queues to back up and degrade the performance of the cluster, for example the bug in 
> https://issues.apache.org/jira/browse/YARN-5262 . I think it is necessary to 
> log the event type when an event queue grows too large and to add that information 
> to the metrics. The queue-size threshold should be a configurable 
> parameter.
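
For illustration only, the idea in the description can be sketched as follows. This is not the attached patch: the class name, the event types, and the threshold parameter are all hypothetical, and a plain queue stands in for the AsyncDispatcher event queue.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;
import java.util.stream.Collectors;

final class EventQueueInspector {
  /** Hypothetical event types, used only for this sketch. */
  enum EventType { NODE_UPDATE, APP_ADDED, CONTAINER_FINISHED }

  /**
   * Counts the queued events per type when the queue size exceeds the
   * threshold. Returns an empty map when the queue is at or below it.
   */
  static Map<EventType, Long> countIfOverThreshold(Queue<EventType> queue, int threshold) {
    if (queue.size() <= threshold) {
      return Collections.emptyMap();
    }
    return queue.stream()
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
  }

  public static void main(String[] args) {
    Queue<EventType> queue = new LinkedBlockingQueue<>();
    for (int i = 0; i < 5; i++) {
      queue.add(EventType.NODE_UPDATE);
    }
    queue.add(EventType.APP_ADDED);
    // The queue holds 6 events, so a threshold of 3 triggers the per-type count.
    System.out.println("Event types in oversized queue: " + countIfOverThreshold(queue, 3));
  }
}
```

A per-type count like this is what would be written to the log or exported as a metric; how often to compute it is a separate design choice, since walking a very large queue is itself costly.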






[jira] [Updated] (YARN-9698) [Umbrella] Tools to help migration from Fair Scheduler to Capacity Scheduler

2019-09-04 Thread Gergely Pollak (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gergely Pollak updated YARN-9698:
-
Attachment: FS-CS Migration.pdf

> [Umbrella] Tools to help migration from Fair Scheduler to Capacity Scheduler
> 
>
> Key: YARN-9698
> URL: https://issues.apache.org/jira/browse/YARN-9698
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Weiwei Yang
>Priority: Major
>  Labels: fs2cs
> Attachments: FS-CS Migration.pdf
>
>
> Some users want to migrate from Fair Scheduler to Capacity Scheduler. 
> This Jira is an umbrella to track all efforts related to the 
> migration. The scope includes:
>  * Bug fixes
>  * Adding missing features
>  * Migration tools that help generate CS configs based on FS, validate 
> configs, etc.
>  * Documents
> This is part of the CS component; the purpose is to make the migration process 
> smooth.






[jira] [Commented] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.

2019-09-04 Thread Tao Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922279#comment-16922279
 ] 

Tao Yang commented on YARN-8995:


Confirmed that the latest patch should not fail like that. 
The patch now LGTM; waiting for feedback from [~cheersyang], thanks.

> Log the event type of the too big AsyncDispatcher event queue size, and add 
> the information to the metrics. 
> 
>
> Key: YARN-8995
> URL: https://issues.apache.org/jira/browse/YARN-8995
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: metrics, nodemanager, resourcemanager
>Affects Versions: 3.2.0, 3.3.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: TestStreamPerf.java, YARN-8995.001.patch, 
> YARN-8995.002.patch, YARN-8995.003.patch, YARN-8995.004.patch, 
> YARN-8995.005.patch, YARN-8995.006.patch, YARN-8995.007.patch, 
> YARN-8995.008.patch, YARN-8995.009.patch, YARN-8995.010.patch, 
> YARN-8995.011.patch, YARN-8995.012.patch, YARN-8995.013.patch, 
> YARN-8995.014.patch, image-2019-09-04-15-20-02-914.png
>
>
> In our growing cluster, unexpected situations can cause some event 
> queues to back up and degrade the performance of the cluster, for example the bug in 
> https://issues.apache.org/jira/browse/YARN-5262 . I think it is necessary to 
> log the event type when an event queue grows too large and to add that information 
> to the metrics. The queue-size threshold should be a configurable 
> parameter.






[jira] [Commented] (YARN-9728)  ResourceManager REST API can produce an illegal xml response

2019-09-04 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922280#comment-16922280
 ] 

Prabhu Joseph commented on YARN-9728:
-

Thanks [~eyang] and [~tde] for reviewing. I have handled the changes below in 
[^YARN-9728-004.patch].

 1. Added unicode characters in x1-#x10 range.
 2. Used \uFFFd as the substitute.
 3. Fixed the camel case issue.
 4. Fixed the description of {{yarn.webapp.filter-invalid-xml-chars}}.
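
As a rough illustration of the approach (a sketch only, not the code in the patch), the filter below replaces every code point that falls outside the XML 1.0 Char production from the W3C specification with U+FFFD. The class and method names are made up for this example.

```java
final class XmlCharFilter {
  /** Returns true if the code point is allowed by the XML 1.0 Char production. */
  static boolean isValidXml10(int cp) {
    return cp == 0x9 || cp == 0xA || cp == 0xD
        || (cp >= 0x20 && cp <= 0xD7FF)
        || (cp >= 0xE000 && cp <= 0xFFFD)
        || (cp >= 0x10000 && cp <= 0x10FFFF);
  }

  /** Replaces each code point that XML 1.0 does not allow with U+FFFD. */
  static String replaceInvalid(String s) {
    StringBuilder out = new StringBuilder(s.length());
    for (int i = 0; i < s.length(); ) {
      int cp = s.codePointAt(i);
      out.appendCodePoint(isValidXml10(cp) ? cp : 0xFFFD);
      // Advance by one or two chars, depending on whether cp is supplementary.
      i += Character.charCount(cp);
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // Same message as in the reproduction job: the control character is replaced.
    System.out.println(replaceInvalid("java.lang.Exception: \u0001"));
  }
}
```

Iterating by code point rather than by char keeps valid supplementary characters intact and turns unpaired surrogates into the substitute as well.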

>  ResourceManager REST API can produce an illegal xml response
> -
>
> Key: YARN-9728
> URL: https://issues.apache.org/jira/browse/YARN-9728
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: api, resourcemanager
>Affects Versions: 2.7.3
>Reporter: Thomas
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: IllegalResponseChrome.png, YARN-9728-001.patch, 
> YARN-9728-002.patch, YARN-9728-003.patch, YARN-9728-004.patch
>
>
> When a Spark job throws an exception whose message contains a character 
> outside the range supported by XML 1.0, 
>  the application fails and the stack trace is stored in the 
> {{diagnostics}} field. So far, so good.
> The problem occurs when we try to get the application information through the 
> ResourceManager REST API:
>  the XML response contains the illegal XML 1.0 character and is therefore invalid.
>  *+Examples of illegal characters in XML 1.0:+* 
>  * {{\u}}
>  * {{\u0001}}
>  * {{\u0002}}
>  * {{\u0003}}
>  * {{\u0004}}
> _For more information about supported characters:_
>  [https://www.w3.org/TR/xml/#charsets]
> *+Example of an illegal response from the ResourceManager API:+* 
> {code:xml}
> 
> 
>   application_1326821518301_0005
>   user1
>   job
>   a1
>   FINISHED
>   FAILED
>   100.0
>   History
>   
> http://host.domain.com:8088/proxy/application_1326821518301_0005/jobhistory/job/job_1326821518301_5_5
>   Exception in thread "main" java.lang.Exception: \u0001
>   at com..main(JobWithSpecialCharMain.java:6)
>   [...]
> 
> {code}
>  
> *+Example of job to reproduce :+*
> {code:java}
> public class JobWithSpecialCharMain {
>  public static void main(String[] args) throws Exception {
>   throw new Exception("\u0001");
>  }
> }
> {code}
> {code:bash}
> javac -d . JobWithSpecialCharMain.java
> jar cvf repro.jar com/
> spark-submit --class com.JobWithSpecialCharMain --master yarn-cluster 
> repro.jar
> {code}
> !IllegalResponseChrome.png!






[jira] [Updated] (YARN-9728)  ResourceManager REST API can produce an illegal xml response

2019-09-04 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9728:

Attachment: YARN-9728-004.patch

>  ResourceManager REST API can produce an illegal xml response
> -
>
> Key: YARN-9728
> URL: https://issues.apache.org/jira/browse/YARN-9728
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: api, resourcemanager
>Affects Versions: 2.7.3
>Reporter: Thomas
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: IllegalResponseChrome.png, YARN-9728-001.patch, 
> YARN-9728-002.patch, YARN-9728-003.patch, YARN-9728-004.patch
>
>
> When a Spark job throws an exception whose message contains a character 
> outside the range supported by XML 1.0, 
>  the application fails and the stack trace is stored in the 
> {{diagnostics}} field. So far, so good.
> The problem occurs when we try to get the application information through the 
> ResourceManager REST API:
>  the XML response contains the illegal XML 1.0 character and is therefore invalid.
>  *+Examples of illegal characters in XML 1.0:+* 
>  * {{\u}}
>  * {{\u0001}}
>  * {{\u0002}}
>  * {{\u0003}}
>  * {{\u0004}}
> _For more information about supported characters:_
>  [https://www.w3.org/TR/xml/#charsets]
> *+Example of an illegal response from the ResourceManager API:+* 
> {code:xml}
> 
> 
>   application_1326821518301_0005
>   user1
>   job
>   a1
>   FINISHED
>   FAILED
>   100.0
>   History
>   
> http://host.domain.com:8088/proxy/application_1326821518301_0005/jobhistory/job/job_1326821518301_5_5
>   Exception in thread "main" java.lang.Exception: \u0001
>   at com..main(JobWithSpecialCharMain.java:6)
>   [...]
> 
> {code}
>  
> *+Example of job to reproduce :+*
> {code:java}
> public class JobWithSpecialCharMain {
>  public static void main(String[] args) throws Exception {
>   throw new Exception("\u0001");
>  }
> }
> {code}
> {code:bash}
> javac -d . JobWithSpecialCharMain.java
> jar cvf repro.jar com/
> spark-submit --class com.JobWithSpecialCharMain --master yarn-cluster 
> repro.jar
> {code}
> !IllegalResponseChrome.png!






[jira] [Updated] (YARN-9785) Fix DominantResourceCalculator when one resource is zero

2019-09-04 Thread Zhankun Tang (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhankun Tang updated YARN-9785:
---
Fix Version/s: 3.1.3

> Fix DominantResourceCalculator when one resource is zero
> 
>
> Key: YARN-9785
> URL: https://issues.apache.org/jira/browse/YARN-9785
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Blocker
> Fix For: 3.3.0, 3.2.1, 3.1.3
>
> Attachments: YARN-9785-001.patch, YARN-9785-branch-3.1.001.patch, 
> YARN-9785.002.patch, YARN-9785.003.patch, YARN-9785.wip.patch
>
>
> Configure the property below in resource-types.xml:
> {quote}
> <property>
>  <name>yarn.resource-types</name>
>  <value>yarn.io/gpu</value>
> </property>
> {quote}
> Submit applications after the AM limit for a queue has been reached. The applications 
> are still activated even though the limit has been reached.
> !queue.png!






[jira] [Commented] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.

2019-09-04 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1699#comment-1699
 ] 

zhuqi commented on YARN-8995:
-

Hi [~Tao Yang]. 

!image-2019-09-04-15-20-02-914.png!

This is the metric that I have changed. It is no longer in thousands, but I forgot to change it 
in the last two patches. Sorry for my mistake. 

Thanks.

> Log the event type of the too big AsyncDispatcher event queue size, and add 
> the information to the metrics. 
> 
>
> Key: YARN-8995
> URL: https://issues.apache.org/jira/browse/YARN-8995
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: metrics, nodemanager, resourcemanager
>Affects Versions: 3.2.0, 3.3.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: TestStreamPerf.java, YARN-8995.001.patch, 
> YARN-8995.002.patch, YARN-8995.003.patch, YARN-8995.004.patch, 
> YARN-8995.005.patch, YARN-8995.006.patch, YARN-8995.007.patch, 
> YARN-8995.008.patch, YARN-8995.009.patch, YARN-8995.010.patch, 
> YARN-8995.011.patch, YARN-8995.012.patch, YARN-8995.013.patch, 
> YARN-8995.014.patch, image-2019-09-04-15-20-02-914.png
>
>
> In our growing cluster, unexpected situations can cause some event 
> queues to back up and degrade the performance of the cluster, for example the bug in 
> https://issues.apache.org/jira/browse/YARN-5262 . I think it is necessary to 
> log the event type when an event queue grows too large and to add that information 
> to the metrics. The queue-size threshold should be a configurable 
> parameter.






[jira] [Commented] (YARN-6715) Fix documentation about NodeHealthScriptRunner

2019-09-04 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1691#comment-1691
 ] 

Peter Bacsko commented on YARN-6715:


[~szegedim] yes, that's fine.

> Fix documentation about NodeHealthScriptRunner 
> ---
>
> Key: YARN-6715
> URL: https://issues.apache.org/jira/browse/YARN-6715
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation, nodemanager
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-6715-001.patch, YARN-6715-002.patch, 
> YARN-6715-003.patch
>
>
> NodeHealthScriptRunner does *not* report a bad health if the script exits 
> with an exit code other than 0. Look at the {{FAILED_WITH_EXIT_CODE}} case:
> {noformat}
> void reportHealthStatus(HealthCheckerExitStatus status) {
>   long now = System.currentTimeMillis();
>   switch (status) {
>   case SUCCESS:
>     setHealthStatus(true, "", now);
>     break;
>   case TIMED_OUT:
>     setHealthStatus(false, NODE_HEALTH_SCRIPT_TIMED_OUT_MSG);
>     break;
>   case FAILED_WITH_EXCEPTION:
>     setHealthStatus(false, exceptionStackTrace);
>     break;
>   case FAILED_WITH_EXIT_CODE:
>     setHealthStatus(true, "", now);
>     break;
>   case FAILED:
>     setHealthStatus(false, shexec.getOutput());
>     break;
>   }
> }
> {noformat}
> Based on the discussion in YARN-5567, this is intentional, but conflicts with 
> the upstream document, which says: 
> "If the script *exits with a non-zero exit code*, times out or results in an 
> exception being thrown, the node is marked as unhealthy"
> This statement can be extremely misleading and must be corrected. We might 
> also add an extra comment to {{reportHealthStatus()}} which explains that 
> {{FAILED_WITH_EXIT_CODE}} is not buggy.
> This case also lacks unit test coverage.






[jira] [Updated] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.

2019-09-04 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated YARN-8995:

Attachment: image-2019-09-04-15-20-02-914.png

> Log the event type of the too big AsyncDispatcher event queue size, and add 
> the information to the metrics. 
> 
>
> Key: YARN-8995
> URL: https://issues.apache.org/jira/browse/YARN-8995
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: metrics, nodemanager, resourcemanager
>Affects Versions: 3.2.0, 3.3.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: TestStreamPerf.java, YARN-8995.001.patch, 
> YARN-8995.002.patch, YARN-8995.003.patch, YARN-8995.004.patch, 
> YARN-8995.005.patch, YARN-8995.006.patch, YARN-8995.007.patch, 
> YARN-8995.008.patch, YARN-8995.009.patch, YARN-8995.010.patch, 
> YARN-8995.011.patch, YARN-8995.012.patch, YARN-8995.013.patch, 
> YARN-8995.014.patch, image-2019-09-04-15-20-02-914.png
>
>
> In our growing cluster, unexpected situations can cause some event 
> queues to back up and degrade the performance of the cluster, for example the bug in 
> https://issues.apache.org/jira/browse/YARN-5262 . I think it is necessary to 
> log the event type when an event queue grows too large and to add that information 
> to the metrics. The queue-size threshold should be a configurable 
> parameter.






[jira] [Commented] (YARN-9511) [JDK11] TestAuxServices#testRemoteAuxServiceClassPath YarnRuntimeException: The remote jarfile should not be writable by group or others. The current Permission is 436

2019-09-04 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922215#comment-16922215
 ] 

Adam Antal commented on YARN-9511:
--

Hi [~seanlau],

Szilard is on vacation, so there isn't going to be any update on this for the 
next 2-3 weeks. If it's urgent for you, I think [~snemeth] wouldn't mind if you 
took this over. I can also take a look at it next week, since the JDK11 issues are 
in my scope.

> [JDK11] TestAuxServices#testRemoteAuxServiceClassPath YarnRuntimeException: 
> The remote jarfile should not be writable by group or others. The current 
> Permission is 436
> ---
>
> Key: YARN-9511
> URL: https://issues.apache.org/jira/browse/YARN-9511
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Siyao Meng
>Assignee: Szilard Nemeth
>Priority: Major
>
> Found in maven JDK 11 unit test run. Compiled on JDK 8.
> {code}
> [ERROR] 
> testRemoteAuxServiceClassPath(org.apache.hadoop.yarn.server.nodemanager.containermanager.TestAuxServices)
>   Time elapsed: 0.551 s  <<< 
> ERROR!org.apache.hadoop.yarn.exceptions.YarnRuntimeException: The remote 
> jarfile should not be writable by group or others. The current Permission is 
> 436
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:202)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.TestAuxServices.testRemoteAuxServiceClassPath(TestAuxServices.java:268)
> at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}
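
A note on the number in the message, as a sketch under one assumption: that 436 is the decimal rendering of the POSIX permission bits. If so, it corresponds to octal 664 (rw-rw-r--), which has the group-write bit set. The class and method names below are made up for the illustration and are not the AuxServices code.

```java
final class JarPermissionCheck {
  /** Mask covering the group-write (020) and other-write (002) bits. */
  static final int GROUP_OR_OTHER_WRITE = 0022;

  /** Returns true if the mode grants write access to group or others. */
  static boolean writableByGroupOrOthers(int mode) {
    return (mode & GROUP_OR_OTHER_WRITE) != 0;
  }

  public static void main(String[] args) {
    int reported = 436; // decimal value taken from the error message
    System.out.println(Integer.toOctalString(reported));   // prints 664
    System.out.println(writableByGroupOrOthers(reported)); // prints true
    System.out.println(writableByGroupOrOthers(0644));     // prints false
  }
}
```

Under that assumption, a jar file created with mode 644 instead of 664 would pass this kind of check.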






[jira] [Commented] (YARN-9511) [JDK11] TestAuxServices#testRemoteAuxServiceClassPath YarnRuntimeException: The remote jarfile should not be writable by group or others. The current Permission is 436

2019-09-04 Thread liusheng (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922211#comment-16922211
 ] 

liusheng commented on YARN-9511:


Hi,

Is there any update on this issue?

> [JDK11] TestAuxServices#testRemoteAuxServiceClassPath YarnRuntimeException: 
> The remote jarfile should not be writable by group or others. The current 
> Permission is 436
> ---
>
> Key: YARN-9511
> URL: https://issues.apache.org/jira/browse/YARN-9511
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Siyao Meng
>Assignee: Szilard Nemeth
>Priority: Major
>
> Found in maven JDK 11 unit test run. Compiled on JDK 8.
> {code}
> [ERROR] 
> testRemoteAuxServiceClassPath(org.apache.hadoop.yarn.server.nodemanager.containermanager.TestAuxServices)
>   Time elapsed: 0.551 s  <<< 
> ERROR!org.apache.hadoop.yarn.exceptions.YarnRuntimeException: The remote 
> jarfile should not be writable by group or others. The current Permission is 
> 436
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:202)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.TestAuxServices.testRemoteAuxServiceClassPath(TestAuxServices.java:268)
> at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9784) org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestLeafQueue is flaky

2019-09-04 Thread Julia Kinga Marton (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922205#comment-16922205
 ] 

Julia Kinga Marton commented on YARN-9784:
--

Thank you [~adam.antal] and [~sunilg] for the review!

> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestLeafQueue
>  is flaky
> ---
>
> Key: YARN-9784
> URL: https://issues.apache.org/jira/browse/YARN-9784
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.3.0
>Reporter: Julia Kinga Marton
>Assignee: Julia Kinga Marton
>Priority: Major
> Attachments: YARN-9784.001.patch
>
>
> Some test cases in TestLeafQueue fail intermittently.
> Out of 100 runs, 16 failed. 
> Some example failures:
> {code:java}
> 2019-08-26 13:18:13 [ERROR] Errors: 
> 2019-08-26 13:18:13 [ERROR]   TestLeafQueue.setUp:144->setUpInternal:221 
> WrongTypeOfReturnValue 
> 2019-08-26 13:18:13 YarnConfigu...
> 2019-08-26 13:18:13 [ERROR]   TestLeafQueue.setUp:144->setUpInternal:221 
> WrongTypeOfReturnValue 
> 2019-08-26 13:18:13 YarnConfigu...
> 2019-08-26 13:18:13 [INFO] 
> 2019-08-26 13:18:13 [ERROR] Tests run: 36, Failures: 0, Errors: 2, Skipped: 0
> {code}
> {code:java}
> 2019-08-26 13:18:09 [ERROR] Failures: 
> 2019-08-26 13:18:09 [ERROR]   TestLeafQueue.testHeadroomWithMaxCap:1373 
> expected:<2048> but was:<0>
> 2019-08-26 13:18:09 [INFO] 
> 2019-08-26 13:18:09 [ERROR] Tests run: 36, Failures: 1, Errors: 0, Skipped: 0
> {code}
> {code:java}
> 2019-08-26 13:18:18 [ERROR] Errors: 
> 2019-08-26 13:18:18 [ERROR]   TestLeafQueue.setUp:144->setUpInternal:221 
> WrongTypeOfReturnValue 
> 2019-08-26 13:18:18 YarnConfigu...
> 2019-08-26 13:18:18 [ERROR]   TestLeafQueue.testHeadroomWithMaxCap:1307 ? 
> ClassCast org.apache.hadoop.yarn.c...
> 2019-08-26 13:18:18 [INFO] 
> 2019-08-26 13:18:18 [ERROR] Tests run: 36, Failures: 0, Errors: 2, Skipped: 0
> {code}
> {code:java}
> 2019-08-26 13:18:10 [ERROR] Failures: 
> 2019-08-26 13:18:10 [ERROR]   TestLeafQueue.testDRFUserLimits:847 Verify 
> user_0 got resources 
> 2019-08-26 13:18:10 [INFO] 
> 2019-08-26 13:18:10 [ERROR] Tests run: 36, Failures: 1, Errors: 0, Skipped: 0
> {code}


