[jira] [Commented] (YARN-9554) TimelineEntity DAO has java.util.Set interface which JAXB can't handle

2019-05-15 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840986#comment-16840986
 ] 

Rohith Sharma K S commented on YARN-9554:
-

ATSv2 has its own TimelineEntity and TimelineEntities classes under a separate 
package, i.e. org.apache.hadoop.yarn.api.records.timelineservice, and the v1 
and v2 clients are different. So it shouldn't be a problem.

> TimelineEntity DAO has java.util.Set interface which JAXB can't handle
> --
>
> Key: YARN-9554
> URL: https://issues.apache.org/jira/browse/YARN-9554
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineservice
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9554-001.patch
>
>
> TimelineEntity DAO has java.util.Set interface which JAXB can't handle. This 
> breaks the fix of YARN-7266.
> {code}
> Caused by: com.sun.xml.internal.bind.v2.runtime.IllegalAnnotationsException: 
> 1 counts of IllegalAnnotationExceptions
> java.util.Set is an interface, and JAXB can't handle interfaces.
>   this problem is related to the following location:
>   at java.util.Set
>   at public java.util.HashMap<java.lang.String, java.util.Set<java.lang.Object>> org.apache.hadoop.yarn.api.records.timeline.TimelineEntity.getPrimaryFiltersJAXB()
>   at org.apache.hadoop.yarn.api.records.timeline.TimelineEntity
>   at public java.util.List<org.apache.hadoop.yarn.api.records.timeline.TimelineEntity> org.apache.hadoop.yarn.api.records.timeline.TimelineEntities.getEntities()
>   at org.apache.hadoop.yarn.api.records.timeline.TimelineEntities
>   at com.sun.xml.internal.bind.v2.runtime.IllegalAnnotationsException$Builder.check(IllegalAnnotationsException.java:91)
>   at com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.getTypeInfoSet(JAXBContextImpl.java:445)
>   at com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:277)
>   at com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:124)
>   at com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl$JAXBContextBuilder.build(JAXBContextImpl.java:1123)
>   at com.sun.xml.internal.bind.v2.ContextFactory.createContext(ContextFactory.java:147)
> {code}
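For illustration, here is a minimal standalone reproduction of the JAXB 
limitation hit above. The Demo class below is hypothetical (it only mirrors 
the shape of the TimelineEntity DAO, a map whose value type is the 
java.util.Set interface); it is not the Hadoop code:

{code:java}
import java.util.HashMap;
import java.util.Set;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.annotation.XmlRootElement;

public class JaxbSetRepro {

  // Hypothetical DAO mirroring the problematic shape: JAXB must bind the
  // map's value type, and java.util.Set is an interface it cannot bind.
  @XmlRootElement
  public static class Demo {
    public HashMap<String, Set<Object>> primaryFilters = new HashMap<>();
  }

  public static void main(String[] args) throws Exception {
    // Throws IllegalAnnotationsException:
    // "java.util.Set is an interface, and JAXB can't handle interfaces."
    JAXBContext.newInstance(Demo.class);
  }
}
{code}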



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9555) Yarn Docs : single cluster yarn setup - Step 1 configure parameters - multiple roots

2019-05-15 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840981#comment-16840981
 ] 

Prabhu Joseph commented on YARN-9555:
-

Thanks [~ajisakaa].

> Yarn Docs : single cluster yarn setup - Step 1 configure parameters - 
> multiple roots
> 
>
> Key: YARN-9555
> URL: https://issues.apache.org/jira/browse/YARN-9555
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.2
>Reporter: Vishva
>Priority: Minor
>
> Step 1 for 
> [https://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-common/SingleCluster.html#YARN_on_Single_Node]
>  
> Configure parameters as follows:
> {{etc/hadoop/mapred-site.xml}}:
>  
> <configuration>
>     <property>
>         <name>mapreduce.framework.name</name>
>         <value>yarn</value>
>     </property>
> </configuration>
> <configuration>
>     <property>
>         <name>mapreduce.application.classpath</name>
>         <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
>     </property>
> </configuration>
> but setting this will throw an error when running yarn:
> 2019-05-14 16:32:05,815 ERROR org.apache.hadoop.conf.Configuration: error parsing conf mapred-site.xml
> com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple roots (start tag in epilog?).
> This should be modified to:
> {code:java}
> <configuration>
>     <property>
>         <name>mapreduce.framework.name</name>
>         <value>yarn</value>
>     </property>
>     <property>
>         <name>mapreduce.application.classpath</name>
>         <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
>     </property>
> </configuration>
> {code}
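As a quick sanity check of the reported error, any standard XML parser 
rejects a document with two root elements. A small sketch (the file path is 
assumed; the exact exception message differs from the Woodstox one in the 
report):

{code:java}
import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;

public class CheckMapredSite {
  public static void main(String[] args) throws Exception {
    // A Hadoop *-site.xml file must be a single well-formed XML document,
    // i.e. exactly one <configuration> root element. With two roots, this
    // parse call throws a SAXParseException.
    DocumentBuilderFactory.newInstance().newDocumentBuilder()
        .parse(new File("etc/hadoop/mapred-site.xml"));
    System.out.println("mapred-site.xml is well-formed");
  }
}
{code}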



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9555) Yarn Docs : single cluster yarn setup - Step 1 configure parameters - multiple roots

2019-05-15 Thread Akira Ajisaka (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840972#comment-16840972
 ] 

Akira Ajisaka commented on YARN-9555:
-

IMO, it is not possible. We need to release 3.2.1 instead.

BTW, the latest documentation is available: 
https://aajisaka.github.io/hadoop-document/hadoop-project/
This is my daily, unofficial documentation build of Hadoop trunk.

> Yarn Docs : single cluster yarn setup - Step 1 configure parameters - 
> multiple roots
> 
>
> Key: YARN-9555
> URL: https://issues.apache.org/jira/browse/YARN-9555
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.2
>Reporter: Vishva
>Priority: Minor
>
> Step 1 for 
> [https://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-common/SingleCluster.html#YARN_on_Single_Node]
>  
> Configure parameters as follows:
> {{etc/hadoop/mapred-site.xml}}:
>  
> <configuration>
>     <property>
>         <name>mapreduce.framework.name</name>
>         <value>yarn</value>
>     </property>
> </configuration>
> <configuration>
>     <property>
>         <name>mapreduce.application.classpath</name>
>         <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
>     </property>
> </configuration>
> but setting this will throw an error when running yarn:
> 2019-05-14 16:32:05,815 ERROR org.apache.hadoop.conf.Configuration: error parsing conf mapred-site.xml
> com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple roots (start tag in epilog?).
> This should be modified to:
> {code:java}
> <configuration>
>     <property>
>         <name>mapreduce.framework.name</name>
>         <value>yarn</value>
>     </property>
>     <property>
>         <name>mapreduce.application.classpath</name>
>         <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
>     </property>
> </configuration>
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9555) Yarn Docs : single cluster yarn setup - Step 1 configure parameters - multiple roots

2019-05-15 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840968#comment-16840968
 ] 

Prabhu Joseph commented on YARN-9555:
-

Thanks [~ajisakaa] for the clarification. One more doubt: since users are 
referring to the 3.2.0 document, is it not possible to change the released 
3.2.0 document now?

> Yarn Docs : single cluster yarn setup - Step 1 configure parameters - 
> multiple roots
> 
>
> Key: YARN-9555
> URL: https://issues.apache.org/jira/browse/YARN-9555
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.2
>Reporter: Vishva
>Priority: Minor
>
> Step 1 for 
> [https://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-common/SingleCluster.html#YARN_on_Single_Node]
>  
> Configure parameters as follows:
> {{etc/hadoop/mapred-site.xml}}:
>  
> <configuration>
>     <property>
>         <name>mapreduce.framework.name</name>
>         <value>yarn</value>
>     </property>
> </configuration>
> <configuration>
>     <property>
>         <name>mapreduce.application.classpath</name>
>         <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
>     </property>
> </configuration>
> but setting this will throw an error when running yarn:
> 2019-05-14 16:32:05,815 ERROR org.apache.hadoop.conf.Configuration: error parsing conf mapred-site.xml
> com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple roots (start tag in epilog?).
> This should be modified to:
> {code:java}
> <configuration>
>     <property>
>         <name>mapreduce.framework.name</name>
>         <value>yarn</value>
>     </property>
>     <property>
>         <name>mapreduce.application.classpath</name>
>         <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
>     </property>
> </configuration>
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9559) Create AbstractContainersLauncher for pluggable ContainersLauncher logic

2019-05-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840921#comment-16840921
 ] 

Hadoop QA commented on YARN-9559:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  8m 
56s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
38s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 20s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
13s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
50s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 15s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 7 new + 318 unchanged - 2 fixed = 325 total (was 320) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 51s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 48s{color} 
| {color:red} hadoop-yarn-api in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 20m 46s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}104m 14s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | TEST-TestYarnConfigurationFields |
|   | 
hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService
 |
|   | hadoop.yarn.server.nodemanager.webapp.TestNMWebServices |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9559 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12968849/YARN-9559.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 09d72362d82d 4.4.0-139-generic 

[jira] [Resolved] (YARN-9555) Yarn Docs : single cluster yarn setup - Step 1 configure parameters - multiple roots

2019-05-15 Thread Akira Ajisaka (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved YARN-9555.
-
Resolution: Duplicate

Closing this as duplicate.

> Yarn Docs : single cluster yarn setup - Step 1 configure parameters - 
> multiple roots
> 
>
> Key: YARN-9555
> URL: https://issues.apache.org/jira/browse/YARN-9555
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.2
>Reporter: Vishva
>Priority: Minor
>
> Step 1 for 
> [https://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-common/SingleCluster.html#YARN_on_Single_Node]
>  
> Configure parameters as follows:
> {{etc/hadoop/mapred-site.xml}}:
>  
> <configuration>
>     <property>
>         <name>mapreduce.framework.name</name>
>         <value>yarn</value>
>     </property>
> </configuration>
> <configuration>
>     <property>
>         <name>mapreduce.application.classpath</name>
>         <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
>     </property>
> </configuration>
> but setting this will throw an error when running yarn:
> 2019-05-14 16:32:05,815 ERROR org.apache.hadoop.conf.Configuration: error parsing conf mapred-site.xml
> com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple roots (start tag in epilog?).
> This should be modified to:
> {code:java}
> <configuration>
>     <property>
>         <name>mapreduce.framework.name</name>
>         <value>yarn</value>
>     </property>
>     <property>
>         <name>mapreduce.application.classpath</name>
>         <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
>     </property>
> </configuration>
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9555) Yarn Docs : single cluster yarn setup - Step 1 configure parameters - multiple roots

2019-05-15 Thread Akira Ajisaka (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840903#comment-16840903
 ] 

Akira Ajisaka commented on YARN-9555:
-

The fix version of MAPREDUCE-7165 is 3.2.1, therefore the fix is not reflected 
in the 3.2.0 documentation. In 3.1.2, the document is fixed: 
https://hadoop.apache.org/docs/r3.1.2/hadoop-project-dist/hadoop-common/SingleCluster.html#YARN_on_Single_Node

> Yarn Docs : single cluster yarn setup - Step 1 configure parameters - 
> multiple roots
> 
>
> Key: YARN-9555
> URL: https://issues.apache.org/jira/browse/YARN-9555
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.2
>Reporter: Vishva
>Priority: Minor
>
> Step 1 for 
> [https://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-common/SingleCluster.html#YARN_on_Single_Node]
>  
> Configure parameters as follows:
> {{etc/hadoop/mapred-site.xml}}:
>  
> <configuration>
>     <property>
>         <name>mapreduce.framework.name</name>
>         <value>yarn</value>
>     </property>
> </configuration>
> <configuration>
>     <property>
>         <name>mapreduce.application.classpath</name>
>         <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
>     </property>
> </configuration>
> but setting this will throw an error when running yarn:
> 2019-05-14 16:32:05,815 ERROR org.apache.hadoop.conf.Configuration: error parsing conf mapred-site.xml
> com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple roots (start tag in epilog?).
> This should be modified to:
> {code:java}
> <configuration>
>     <property>
>         <name>mapreduce.framework.name</name>
>         <value>yarn</value>
>     </property>
>     <property>
>         <name>mapreduce.application.classpath</name>
>         <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
>     </property>
> </configuration>
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9559) Create AbstractContainersLauncher for pluggable ContainersLauncher logic

2019-05-15 Thread Jonathan Hung (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840877#comment-16840877
 ] 

Jonathan Hung commented on YARN-9559:
-

Attached the 001 patch, which creates the AbstractContainersLauncher class.
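For readers who have not opened the patch, a rough sketch of the idea: an 
abstract base class that concrete launchers extend, so the NodeManager can 
load an implementation via configuration. Everything below beyond the class 
name is an assumption, not the actual patch:

{code:java}
import org.apache.hadoop.service.AbstractService;
import org.apache.hadoop.yarn.event.EventHandler;
import org.apache.hadoop.yarn.server.nodemanager.Context;
import org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncherEvent;

// Sketch: shared plumbing lives in the base class; subclasses decide how
// container launch/cleanup events are actually handled.
public abstract class AbstractContainersLauncher extends AbstractService
    implements EventHandler<ContainersLauncherEvent> {

  protected Context nmContext;

  protected AbstractContainersLauncher(String name) {
    super(name);
  }

  public void setNMContext(Context context) {
    this.nmContext = context;
  }

  @Override
  public abstract void handle(ContainersLauncherEvent event);
}
{code}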

> Create AbstractContainersLauncher for pluggable ContainersLauncher logic
> 
>
> Key: YARN-9559
> URL: https://issues.apache.org/jira/browse/YARN-9559
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9559.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9559) Create AbstractContainersLauncher for pluggable ContainersLauncher logic

2019-05-15 Thread Jonathan Hung (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated YARN-9559:

Attachment: YARN-9559.001.patch

> Create AbstractContainersLauncher for pluggable ContainersLauncher logic
> 
>
> Key: YARN-9559
> URL: https://issues.apache.org/jira/browse/YARN-9559
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9559.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9559) Create AbstractContainersLauncher for pluggable ContainersLauncher logic

2019-05-15 Thread Jonathan Hung (JIRA)
Jonathan Hung created YARN-9559:
---

 Summary: Create AbstractContainersLauncher for pluggable 
ContainersLauncher logic
 Key: YARN-9559
 URL: https://issues.apache.org/jira/browse/YARN-9559
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Jonathan Hung
Assignee: Jonathan Hung






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9552) FairScheduler: NODE_UPDATE can cause NoSuchElementException

2019-05-15 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840679#comment-16840679
 ] 

Hudson commented on YARN-9552:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16554 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16554/])
YARN-9552. FairScheduler: NODE_UPDATE can cause NoSuchElementException. 
(gifuma: rev 55bd35921c2bb013e45120bbd1602b658b8b999b)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFSAppAttempt.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java


> FairScheduler: NODE_UPDATE can cause NoSuchElementException
> ---
>
> Key: YARN-9552
> URL: https://issues.apache.org/jira/browse/YARN-9552
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9552-001.patch, YARN-9552-002.patch, 
> YARN-9552-003.patch, YARN-9552-004.patch
>
>
> We observed a race condition inside YARN with the following stack trace:
> {noformat}
> 18/11/07 06:45:09.559 SchedulerEventDispatcher:Event Processor ERROR 
> EventDispatcher: Error in handling event type NODE_UPDATE to the Event 
> Dispatcher
> java.util.NoSuchElementException
> at 
> java.util.concurrent.ConcurrentSkipListMap.firstKey(ConcurrentSkipListMap.java:2036)
> at 
> java.util.concurrent.ConcurrentSkipListSet.first(ConcurrentSkipListSet.java:396)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.getNextPendingAsk(AppSchedulingInfo.java:373)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.isOverAMShareLimit(FSAppAttempt.java:941)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:1373)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:353)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:204)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1094)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:961)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1183)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:132)
> at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> This is basically the same as the one described in YARN-7382, but the root 
> cause is different.
> When we create an application attempt, we create an {{FSAppAttempt}} object. 
> This contains an {{AppSchedulingInfo}} which contains a set of 
> {{SchedulerRequestKey}}. Initially, this set is empty and only initialized a 
> bit later on a separate thread during a state transition:
> {noformat}
> 2019-05-07 15:58:02,659 INFO  [RM StateStore dispatcher] 
> recovery.RMStateStore (RMStateStore.java:transition(239)) - Storing info for 
> app: application_1557237478804_0001
> 2019-05-07 15:58:02,684 INFO  [RM Event dispatcher] rmapp.RMAppImpl 
> (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change 
> from NEW_SAVING to SUBMITTED on event = APP_NEW_SAVED
> 2019-05-07 15:58:02,690 INFO  [SchedulerEventDispatcher:Event Processor] 
> fair.FairScheduler (FairScheduler.java:addApplication(490)) - Accepted 
> application application_1557237478804_0001 from user: bacskop, in queue: 
> root.bacskop, currently num of applications: 1
> 2019-05-07 15:58:02,698 INFO  [RM Event dispatcher] rmapp.RMAppImpl 
> (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change 
> from SUBMITTED to ACCEPTED on event = APP_ACCEPTED
> 2019-05-07 15:58:02,731 INFO  [RM Event dispatcher] 
> resourcemanager.ApplicationMasterService 
> (ApplicationMasterService.java:registerAppAttempt(434)) - Registering app 
> attempt : appattempt_1557237478804_0001_01
> 2019-05-07 15:58:02,732 INFO  [RM Event dispatcher] attempt.RMAppAttemptImpl 

[jira] [Commented] (YARN-9552) FairScheduler: NODE_UPDATE can cause NoSuchElementException

2019-05-15 Thread Giovanni Matteo Fumarola (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840663#comment-16840663
 ] 

Giovanni Matteo Fumarola commented on YARN-9552:


The patch looks good. Committed [^YARN-9552-004.patch] to trunk.

Thanks [~pbacsko] for working on this and [~snemeth] for the initial review.
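For context, the crash described in the issue boils down to calling first() 
on a still-empty ConcurrentSkipListSet. A minimal sketch of the failure mode 
and a guard, under assumed simplifications (this is not the committed patch):

{code:java}
import java.util.NoSuchElementException;
import java.util.concurrent.ConcurrentSkipListSet;

public class EmptySetDemo {
  public static void main(String[] args) {
    ConcurrentSkipListSet<Integer> schedulerKeys = new ConcurrentSkipListSet<>();

    // What the scheduler effectively did: first() on an empty set throws.
    try {
      schedulerKeys.first();
    } catch (NoSuchElementException e) {
      System.out.println("first() threw: " + e);
    }

    // One guard: check emptiness before reading. Without external
    // synchronization this is still racy, since another thread can empty
    // the set between the two calls.
    Integer head = schedulerKeys.isEmpty() ? null : schedulerKeys.first();
    System.out.println("guarded read returned: " + head);
  }
}
{code}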

> FairScheduler: NODE_UPDATE can cause NoSuchElementException
> ---
>
> Key: YARN-9552
> URL: https://issues.apache.org/jira/browse/YARN-9552
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9552-001.patch, YARN-9552-002.patch, 
> YARN-9552-003.patch, YARN-9552-004.patch
>
>
> We observed a race condition inside YARN with the following stack trace:
> {noformat}
> 18/11/07 06:45:09.559 SchedulerEventDispatcher:Event Processor ERROR 
> EventDispatcher: Error in handling event type NODE_UPDATE to the Event 
> Dispatcher
> java.util.NoSuchElementException
> at 
> java.util.concurrent.ConcurrentSkipListMap.firstKey(ConcurrentSkipListMap.java:2036)
> at 
> java.util.concurrent.ConcurrentSkipListSet.first(ConcurrentSkipListSet.java:396)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.getNextPendingAsk(AppSchedulingInfo.java:373)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.isOverAMShareLimit(FSAppAttempt.java:941)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:1373)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:353)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:204)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1094)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:961)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1183)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:132)
> at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> This is basically the same as the one described in YARN-7382, but the root 
> cause is different.
> When we create an application attempt, we create an {{FSAppAttempt}} object. 
> This contains an {{AppSchedulingInfo}} which contains a set of 
> {{SchedulerRequestKey}}. Initially, this set is empty and only initialized a 
> bit later on a separate thread during a state transition:
> {noformat}
> 2019-05-07 15:58:02,659 INFO  [RM StateStore dispatcher] 
> recovery.RMStateStore (RMStateStore.java:transition(239)) - Storing info for 
> app: application_1557237478804_0001
> 2019-05-07 15:58:02,684 INFO  [RM Event dispatcher] rmapp.RMAppImpl 
> (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change 
> from NEW_SAVING to SUBMITTED on event = APP_NEW_SAVED
> 2019-05-07 15:58:02,690 INFO  [SchedulerEventDispatcher:Event Processor] 
> fair.FairScheduler (FairScheduler.java:addApplication(490)) - Accepted 
> application application_1557237478804_0001 from user: bacskop, in queue: 
> root.bacskop, currently num of applications: 1
> 2019-05-07 15:58:02,698 INFO  [RM Event dispatcher] rmapp.RMAppImpl 
> (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change 
> from SUBMITTED to ACCEPTED on event = APP_ACCEPTED
> 2019-05-07 15:58:02,731 INFO  [RM Event dispatcher] 
> resourcemanager.ApplicationMasterService 
> (ApplicationMasterService.java:registerAppAttempt(434)) - Registering app 
> attempt : appattempt_1557237478804_0001_01
> 2019-05-07 15:58:02,732 INFO  [RM Event dispatcher] attempt.RMAppAttemptImpl 
> (RMAppAttemptImpl.java:handle(920)) - appattempt_1557237478804_0001_01 
> State change from NEW to SUBMITTED on event = START
> 2019-05-07 15:58:02,746 INFO  [SchedulerEventDispatcher:Event Processor] 
> scheduler.SchedulerApplicationAttempt 
> (SchedulerApplicationAttempt.java:<init>(207)) - *** In the constructor of 
> SchedulerApplicationAttempt
> 2019-05-07 15:58:02,747 INFO  [SchedulerEventDispatcher:Event Processor] 
> scheduler.SchedulerApplicationAttempt 
> (SchedulerApplicationAttempt.java:<init>(230)) - *** Contents of 
> appSchedulingInfo: []
> 2019-05-07 15:58:02,752 INFO  [SchedulerEventDispatcher:Event Processor] 
> 

[jira] [Updated] (YARN-9552) FairScheduler: NODE_UPDATE can cause NoSuchElementException

2019-05-15 Thread Giovanni Matteo Fumarola (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola updated YARN-9552:
---
Fix Version/s: 3.3.0

> FairScheduler: NODE_UPDATE can cause NoSuchElementException
> ---
>
> Key: YARN-9552
> URL: https://issues.apache.org/jira/browse/YARN-9552
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9552-001.patch, YARN-9552-002.patch, 
> YARN-9552-003.patch, YARN-9552-004.patch
>
>
> We observed a race condition inside YARN with the following stack trace:
> {noformat}
> 18/11/07 06:45:09.559 SchedulerEventDispatcher:Event Processor ERROR 
> EventDispatcher: Error in handling event type NODE_UPDATE to the Event 
> Dispatcher
> java.util.NoSuchElementException
> at 
> java.util.concurrent.ConcurrentSkipListMap.firstKey(ConcurrentSkipListMap.java:2036)
> at 
> java.util.concurrent.ConcurrentSkipListSet.first(ConcurrentSkipListSet.java:396)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.getNextPendingAsk(AppSchedulingInfo.java:373)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.isOverAMShareLimit(FSAppAttempt.java:941)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:1373)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:353)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:204)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1094)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:961)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1183)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:132)
> at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> This is basically the same as the one described in YARN-7382, but the root 
> cause is different.
> When we create an application attempt, we create an {{FSAppAttempt}} object. 
> This contains an {{AppSchedulingInfo}} which contains a set of 
> {{SchedulerRequestKey}}. Initially, this set is empty and only initialized a 
> bit later on a separate thread during a state transition:
> {noformat}
> 2019-05-07 15:58:02,659 INFO  [RM StateStore dispatcher] 
> recovery.RMStateStore (RMStateStore.java:transition(239)) - Storing info for 
> app: application_1557237478804_0001
> 2019-05-07 15:58:02,684 INFO  [RM Event dispatcher] rmapp.RMAppImpl 
> (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change 
> from NEW_SAVING to SUBMITTED on event = APP_NEW_SAVED
> 2019-05-07 15:58:02,690 INFO  [SchedulerEventDispatcher:Event Processor] 
> fair.FairScheduler (FairScheduler.java:addApplication(490)) - Accepted 
> application application_1557237478804_0001 from user: bacskop, in queue: 
> root.bacskop, currently num of applications: 1
> 2019-05-07 15:58:02,698 INFO  [RM Event dispatcher] rmapp.RMAppImpl 
> (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change 
> from SUBMITTED to ACCEPTED on event = APP_ACCEPTED
> 2019-05-07 15:58:02,731 INFO  [RM Event dispatcher] 
> resourcemanager.ApplicationMasterService 
> (ApplicationMasterService.java:registerAppAttempt(434)) - Registering app 
> attempt : appattempt_1557237478804_0001_01
> 2019-05-07 15:58:02,732 INFO  [RM Event dispatcher] attempt.RMAppAttemptImpl 
> (RMAppAttemptImpl.java:handle(920)) - appattempt_1557237478804_0001_01 
> State change from NEW to SUBMITTED on event = START
> 2019-05-07 15:58:02,746 INFO  [SchedulerEventDispatcher:Event Processor] 
> scheduler.SchedulerApplicationAttempt 
> (SchedulerApplicationAttempt.java:<init>(207)) - *** In the constructor of 
> SchedulerApplicationAttempt
> 2019-05-07 15:58:02,747 INFO  [SchedulerEventDispatcher:Event Processor] 
> scheduler.SchedulerApplicationAttempt 
> (SchedulerApplicationAttempt.java:<init>(230)) - *** Contents of 
> appSchedulingInfo: []
> 2019-05-07 15:58:02,752 INFO  [SchedulerEventDispatcher:Event Processor] 
> fair.FairScheduler (FairScheduler.java:addApplicationAttempt(546)) - Added 
> Application Attempt appattempt_1557237478804_0001_01 to scheduler from 
> user: bacskop
> 

[jira] [Commented] (YARN-9555) Yarn Docs : single cluster yarn setup - Step 1 configure parameters - multiple roots

2019-05-15 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840657#comment-16840657
 ] 

Prabhu Joseph commented on YARN-9555:
-

[~Vishva001] It looks like the document is already fixed by MAPREDUCE-7165. 
Not sure when the published documentation will reflect the new changes. 
[~ajisakaa] Can you check this one? Thanks.

> Yarn Docs : single cluster yarn setup - Step 1 configure parameters - 
> multiple roots
> 
>
> Key: YARN-9555
> URL: https://issues.apache.org/jira/browse/YARN-9555
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.2
>Reporter: Vishva
>Priority: Minor
>
> Step 1 for 
> [https://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-common/SingleCluster.html#YARN_on_Single_Node]
>  
> Configure parameters as follows:
> {{etc/hadoop/mapred-site.xml}}:
>  
> <configuration>
>     <property>
>         <name>mapreduce.framework.name</name>
>         <value>yarn</value>
>     </property>
> </configuration>
> <configuration>
>     <property>
>         <name>mapreduce.application.classpath</name>
>         <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
>     </property>
> </configuration>
> but setting this will throw an error when running yarn:
> 2019-05-14 16:32:05,815 ERROR org.apache.hadoop.conf.Configuration: error parsing conf mapred-site.xml
> com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple roots (start tag in epilog?).
> This should be modified to:
> {code:java}
> <configuration>
>     <property>
>         <name>mapreduce.framework.name</name>
>         <value>yarn</value>
>     </property>
>     <property>
>         <name>mapreduce.application.classpath</name>
>         <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
>     </property>
> </configuration>
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9554) TimelineEntity DAO has java.util.Set interface which JAXB can't handle

2019-05-15 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840600#comment-16840600
 ] 

Eric Yang commented on YARN-9554:
-

[~Prabhu Joseph], the error handling logs the exception to stderr instead of 
the log files. Can

e.printStackTrace();

be changed to a log statement?

I am not familiar with Timeline Server 2 internals. [~rohithsharma], do you 
know if we would run into problems by dropping the TimelineEntity and 
TimelineEntities objects? Will this prevent the timeline server from 
receiving certain events?
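A sketch of the suggested change; the class and message below are 
placeholders, the point is only that the stack trace goes through the logger 
instead of stderr:

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LoggingSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(LoggingSketch.class);

  void marshal(Object entity) {
    try {
      // ... JAXB marshalling that may fail ...
    } catch (Exception e) {
      // Instead of e.printStackTrace(), which writes to stderr and
      // bypasses log levels, appenders, and rotation:
      LOG.error("Failed to marshal timeline entity", e);
    }
  }
}
{code}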

> TimelineEntity DAO has java.util.Set interface which JAXB can't handle
> --
>
> Key: YARN-9554
> URL: https://issues.apache.org/jira/browse/YARN-9554
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineservice
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9554-001.patch
>
>
> TimelineEntity DAO has java.util.Set interface which JAXB can't handle. This 
> breaks the fix of YARN-7266.
> {code}
> Caused by: com.sun.xml.internal.bind.v2.runtime.IllegalAnnotationsException: 
> 1 counts of IllegalAnnotationExceptions
> java.util.Set is an interface, and JAXB can't handle interfaces.
>   this problem is related to the following location:
>   at java.util.Set
>   at public java.util.HashMap<java.lang.String, java.util.Set<java.lang.Object>> org.apache.hadoop.yarn.api.records.timeline.TimelineEntity.getPrimaryFiltersJAXB()
>   at org.apache.hadoop.yarn.api.records.timeline.TimelineEntity
>   at public java.util.List<org.apache.hadoop.yarn.api.records.timeline.TimelineEntity> org.apache.hadoop.yarn.api.records.timeline.TimelineEntities.getEntities()
>   at org.apache.hadoop.yarn.api.records.timeline.TimelineEntities
>   at com.sun.xml.internal.bind.v2.runtime.IllegalAnnotationsException$Builder.check(IllegalAnnotationsException.java:91)
>   at com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.getTypeInfoSet(JAXBContextImpl.java:445)
>   at com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:277)
>   at com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:124)
>   at com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl$JAXBContextBuilder.build(JAXBContextImpl.java:1123)
>   at com.sun.xml.internal.bind.v2.ContextFactory.createContext(ContextFactory.java:147)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9558) Log Aggregation testcases failing

2019-05-15 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9558:

Description: 
Test cases related to Log Aggregation in the below classes are failing:

hadoop.yarn.server.nodemanager.webapp.TestNMWebServices 
hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService
 
hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices 
hadoop.yarn.client.cli.TestLogsCLI 

  was:
TestAHSWebServices testcases failing. 

{code:java}
[ERROR]   TestAHSWebServices.testContainerLogsForFinishedApps:570
[ERROR]   TestAHSWebServices.testContainerLogsForFinishedApps:570
[ERROR]   TestAHSWebServices.testContainerLogsForRunningApps:777
[ERROR]   TestAHSWebServices.testContainerLogsForRunningApps:777
[ERROR] Errors: 
[ERROR]   TestAHSWebServices.testContainerLogsMetaForFinishedApps:942 » 
WebApplication j...
[ERROR]   TestAHSWebServices.testContainerLogsMetaForFinishedApps:942 » 
WebApplication j...
[ERROR]   TestAHSWebServices.testContainerLogsMetaForRunningApps:875 » 
WebApplication ja...
[ERROR]   TestAHSWebServices.testContainerLogsMetaForRunningApps:875 » 
WebApplication ja...
 {code}


> Log Aggregation testcases failing
> -
>
> Key: YARN-9558
> URL: https://issues.apache.org/jira/browse/YARN-9558
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation, test
>Affects Versions: 3.3.0, 3.2.1, 3.1.3
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>
> Test cases related to Log Aggregation in the below classes are failing:
> hadoop.yarn.server.nodemanager.webapp.TestNMWebServices 
> hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService
>  
> hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices 
> hadoop.yarn.client.cli.TestLogsCLI 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9558) Log Aggregation testcases failing

2019-05-15 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9558:

Component/s: (was: timelineservice)
 log-aggregation

> Log Aggregation testcases failing
> -
>
> Key: YARN-9558
> URL: https://issues.apache.org/jira/browse/YARN-9558
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation, test
>Affects Versions: 3.3.0, 3.2.1, 3.1.3
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>
> TestAHSWebServices testcases failing. 
> {code:java}
> [ERROR]   TestAHSWebServices.testContainerLogsForFinishedApps:570
> [ERROR]   TestAHSWebServices.testContainerLogsForFinishedApps:570
> [ERROR]   TestAHSWebServices.testContainerLogsForRunningApps:777
> [ERROR]   TestAHSWebServices.testContainerLogsForRunningApps:777
> [ERROR] Errors: 
> [ERROR]   TestAHSWebServices.testContainerLogsMetaForFinishedApps:942 » 
> WebApplication j...
> [ERROR]   TestAHSWebServices.testContainerLogsMetaForFinishedApps:942 » 
> WebApplication j...
> [ERROR]   TestAHSWebServices.testContainerLogsMetaForRunningApps:875 » 
> WebApplication ja...
> [ERROR]   TestAHSWebServices.testContainerLogsMetaForRunningApps:875 » 
> WebApplication ja...
>  {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9558) Log Aggregation testcases failing

2019-05-15 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9558:

Summary: Log Aggregation testcases failing  (was: TestAHSWebServices 
testcases failing)

> Log Aggregation testcases failing
> -
>
> Key: YARN-9558
> URL: https://issues.apache.org/jira/browse/YARN-9558
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test, timelineservice
>Affects Versions: 3.3.0, 3.2.1, 3.1.3
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>
> TestAHSWebServices testcases failing. 
> {code:java}
> [ERROR]   TestAHSWebServices.testContainerLogsForFinishedApps:570
> [ERROR]   TestAHSWebServices.testContainerLogsForFinishedApps:570
> [ERROR]   TestAHSWebServices.testContainerLogsForRunningApps:777
> [ERROR]   TestAHSWebServices.testContainerLogsForRunningApps:777
> [ERROR] Errors: 
> [ERROR]   TestAHSWebServices.testContainerLogsMetaForFinishedApps:942 » 
> WebApplication j...
> [ERROR]   TestAHSWebServices.testContainerLogsMetaForFinishedApps:942 » 
> WebApplication j...
> [ERROR]   TestAHSWebServices.testContainerLogsMetaForRunningApps:875 » 
> WebApplication ja...
> [ERROR]   TestAHSWebServices.testContainerLogsMetaForRunningApps:875 » 
> WebApplication ja...
>  {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9360) Do not expose innards of QueueMetrics object into FSLeafQueue#computeMaxAMResource

2019-05-15 Thread Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840504#comment-16840504
 ] 

Peter Bacsko commented on YARN-9360:


[~snemeth] please check whether the test failures are related. Also fix the new 
checkstyle issue.

 

> Do not expose innards of QueueMetrics object into 
> FSLeafQueue#computeMaxAMResource
> --
>
> Key: YARN-9360
> URL: https://issues.apache.org/jira/browse/YARN-9360
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: YARN-9360.001.patch
>
>
> This is a follow-up for YARN-9323, covering required changes as discussed 
> with [~templedf] earlier.
> After YARN-9323, 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue#computeMaxAMResource
>  gets the QueueMetricsForCustomResources object from 
> scheduler.getRootQueueMetrics().
> Instead, we should use a "fill-in" method in QueueMetrics that receives a 
> Resource and fills in custom resource values if they are non-zero.
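A hedged sketch of the proposed "fill-in" shape. Method and type names are 
guesses, and simplified stand-ins are used instead of the real Hadoop 
Resource and QueueMetrics classes:

{code:java}
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for org.apache.hadoop.yarn.api.records.Resource.
class Resource {
  final Map<String, Long> values = new HashMap<>();
  void set(String name, long v) { values.put(name, v); }
}

class QueueMetrics {
  private final Map<String, Long> customResourceValues = new HashMap<>();

  // The "fill-in" idea: the caller hands in a Resource, and QueueMetrics
  // copies its non-zero custom resource values into it, so FSLeafQueue
  // never reaches into QueueMetricsForCustomResources directly.
  public void fillInCustomResources(Resource target) {
    customResourceValues.forEach((name, value) -> {
      if (value != 0) {
        target.set(name, value);
      }
    });
  }
}
{code}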



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9552) FairScheduler: NODE_UPDATE can cause NoSuchElementException

2019-05-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840475#comment-16840475
 ] 

Hadoop QA commented on YARN-9552:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 58s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 55s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 80m 
34s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}127m 40s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9552 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12968787/YARN-9552-004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 693c2ec01b77 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 570fa2d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24094/testReport/ |
| Max. process+thread count | 862 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/24094/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> FairScheduler: NODE_UPDATE can cause 

[jira] [Commented] (YARN-9554) TimelineEntity DAO has java.util.Set interface which JAXB can't handle

2019-05-15 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840360#comment-16840360
 ] 

Prabhu Joseph commented on YARN-9554:
-

Failed test cases from TestAHSWebServices are not related and will be fixed by 
YARN-9558.

[~eyang] Can you review this Jira when you get time? This is a follow-up fix 
for YARN-7266. Thanks.

> TimelineEntity DAO has java.util.Set interface which JAXB can't handle
> --
>
> Key: YARN-9554
> URL: https://issues.apache.org/jira/browse/YARN-9554
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineservice
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9554-001.patch
>
>
> TimelineEntity DAO has java.util.Set interface which JAXB can't handle. This 
> breaks the fix of YARN-7266.
> {code}
> Caused by: com.sun.xml.internal.bind.v2.runtime.IllegalAnnotationsException: 
> 1 counts of IllegalAnnotationExceptions
> java.util.Set is an interface, and JAXB can't handle interfaces.
>   this problem is related to the following location:
>   at java.util.Set
>   at public java.util.HashMap<java.lang.String, java.util.Set<java.lang.Object>> org.apache.hadoop.yarn.api.records.timeline.TimelineEntity.getPrimaryFiltersJAXB()
>   at org.apache.hadoop.yarn.api.records.timeline.TimelineEntity
>   at public java.util.List<org.apache.hadoop.yarn.api.records.timeline.TimelineEntity> org.apache.hadoop.yarn.api.records.timeline.TimelineEntities.getEntities()
>   at org.apache.hadoop.yarn.api.records.timeline.TimelineEntities
>   at com.sun.xml.internal.bind.v2.runtime.IllegalAnnotationsException$Builder.check(IllegalAnnotationsException.java:91)
>   at com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.getTypeInfoSet(JAXBContextImpl.java:445)
>   at com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:277)
>   at com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:124)
>   at com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl$JAXBContextBuilder.build(JAXBContextImpl.java:1123)
>   at com.sun.xml.internal.bind.v2.ContextFactory.createContext(ContextFactory.java:147)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9554) TimelineEntity DAO has java.util.Set interface which JAXB can't handle

2019-05-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840355#comment-16840355
 ] 

Hadoop QA commented on YARN-9554:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 25s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 40s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  3m 23s{color} 
| {color:red} hadoop-yarn-server-applicationhistoryservice in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 50m 30s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9554 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12968784/YARN-9554-001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 191235cae53a 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 570fa2d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/24093/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-applicationhistoryservice.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24093/testReport/ |
| Max. process+thread count | 445 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 U: 

[jira] [Commented] (YARN-9552) FairScheduler: NODE_UPDATE can cause NoSuchElementException

2019-05-15 Thread Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840350#comment-16840350
 ] 

Peter Bacsko commented on YARN-9552:


Thanks Szilard. Ok, I uploaded patch v4 just to make checkstyle happy :)

> FairScheduler: NODE_UPDATE can cause NoSuchElementException
> ---
>
> Key: YARN-9552
> URL: https://issues.apache.org/jira/browse/YARN-9552
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9552-001.patch, YARN-9552-002.patch, 
> YARN-9552-003.patch, YARN-9552-004.patch
>
>
> We observed a race condition inside YARN with the following stack trace:
> {noformat}
> 18/11/07 06:45:09.559 SchedulerEventDispatcher:Event Processor ERROR 
> EventDispatcher: Error in handling event type NODE_UPDATE to the Event 
> Dispatcher
> java.util.NoSuchElementException
> at 
> java.util.concurrent.ConcurrentSkipListMap.firstKey(ConcurrentSkipListMap.java:2036)
> at 
> java.util.concurrent.ConcurrentSkipListSet.first(ConcurrentSkipListSet.java:396)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.getNextPendingAsk(AppSchedulingInfo.java:373)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.isOverAMShareLimit(FSAppAttempt.java:941)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:1373)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:353)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:204)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1094)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:961)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1183)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:132)
> at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> This is basically the same as the one described in YARN-7382, but the root 
> cause is different.
> When we create an application attempt, we create an {{FSAppAttempt}} object. 
> This contains an {{AppSchedulingInfo}} which contains a set of 
> {{SchedulerRequestKey}}. Initially, this set is empty and only initialized a 
> bit later on a separate thread during a state transition:
> {noformat}
> 2019-05-07 15:58:02,659 INFO  [RM StateStore dispatcher] 
> recovery.RMStateStore (RMStateStore.java:transition(239)) - Storing info for 
> app: application_1557237478804_0001
> 2019-05-07 15:58:02,684 INFO  [RM Event dispatcher] rmapp.RMAppImpl 
> (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change 
> from NEW_SAVING to SUBMITTED on event = APP_NEW_SAVED
> 2019-05-07 15:58:02,690 INFO  [SchedulerEventDispatcher:Event Processor] 
> fair.FairScheduler (FairScheduler.java:addApplication(490)) - Accepted 
> application application_1557237478804_0001 from user: bacskop, in queue: 
> root.bacskop, currently num of applications: 1
> 2019-05-07 15:58:02,698 INFO  [RM Event dispatcher] rmapp.RMAppImpl 
> (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change 
> from SUBMITTED to ACCEPTED on event = APP_ACCEPTED
> 2019-05-07 15:58:02,731 INFO  [RM Event dispatcher] 
> resourcemanager.ApplicationMasterService 
> (ApplicationMasterService.java:registerAppAttempt(434)) - Registering app 
> attempt : appattempt_1557237478804_0001_01
> 2019-05-07 15:58:02,732 INFO  [RM Event dispatcher] attempt.RMAppAttemptImpl 
> (RMAppAttemptImpl.java:handle(920)) - appattempt_1557237478804_0001_01 
> State change from NEW to SUBMITTED on event = START
> 2019-05-07 15:58:02,746 INFO  [SchedulerEventDispatcher:Event Processor] 
> scheduler.SchedulerApplicationAttempt 
> (SchedulerApplicationAttempt.java:<init>(207)) - *** In the constructor of 
> SchedulerApplicationAttempt
> 2019-05-07 15:58:02,747 INFO  [SchedulerEventDispatcher:Event Processor] 
> scheduler.SchedulerApplicationAttempt 
> (SchedulerApplicationAttempt.java:<init>(230)) - *** Contents of 
> appSchedulingInfo: []
> 2019-05-07 15:58:02,752 INFO  [SchedulerEventDispatcher:Event Processor] 
> fair.FairScheduler (FairScheduler.java:addApplicationAttempt(546)) - Added 
> Application Attempt appattempt_1557237478804_0001_01 to 

[jira] [Updated] (YARN-9552) FairScheduler: NODE_UPDATE can cause NoSuchElementException

2019-05-15 Thread Peter Bacsko (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-9552:
---
Attachment: YARN-9552-004.patch

> FairScheduler: NODE_UPDATE can cause NoSuchElementException
> ---
>
> Key: YARN-9552
> URL: https://issues.apache.org/jira/browse/YARN-9552
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9552-001.patch, YARN-9552-002.patch, 
> YARN-9552-003.patch, YARN-9552-004.patch
>
>
> We observed a race condition inside YARN with the following stack trace:
> {noformat}
> 18/11/07 06:45:09.559 SchedulerEventDispatcher:Event Processor ERROR 
> EventDispatcher: Error in handling event type NODE_UPDATE to the Event 
> Dispatcher
> java.util.NoSuchElementException
> at 
> java.util.concurrent.ConcurrentSkipListMap.firstKey(ConcurrentSkipListMap.java:2036)
> at 
> java.util.concurrent.ConcurrentSkipListSet.first(ConcurrentSkipListSet.java:396)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.getNextPendingAsk(AppSchedulingInfo.java:373)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.isOverAMShareLimit(FSAppAttempt.java:941)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:1373)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:353)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:204)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1094)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:961)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1183)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:132)
> at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> This is basically the same as the one described in YARN-7382, but the root 
> cause is different.
> When we create an application attempt, we create an {{FSAppAttempt}} object. 
> This contains an {{AppSchedulingInfo}} which contains a set of 
> {{SchedulerRequestKey}}. Initially, this set is empty and only initialized a 
> bit later on a separate thread during a state transition:
> {noformat}
> 2019-05-07 15:58:02,659 INFO  [RM StateStore dispatcher] 
> recovery.RMStateStore (RMStateStore.java:transition(239)) - Storing info for 
> app: application_1557237478804_0001
> 2019-05-07 15:58:02,684 INFO  [RM Event dispatcher] rmapp.RMAppImpl 
> (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change 
> from NEW_SAVING to SUBMITTED on event = APP_NEW_SAVED
> 2019-05-07 15:58:02,690 INFO  [SchedulerEventDispatcher:Event Processor] 
> fair.FairScheduler (FairScheduler.java:addApplication(490)) - Accepted 
> application application_1557237478804_0001 from user: bacskop, in queue: 
> root.bacskop, currently num of applications: 1
> 2019-05-07 15:58:02,698 INFO  [RM Event dispatcher] rmapp.RMAppImpl 
> (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change 
> from SUBMITTED to ACCEPTED on event = APP_ACCEPTED
> 2019-05-07 15:58:02,731 INFO  [RM Event dispatcher] 
> resourcemanager.ApplicationMasterService 
> (ApplicationMasterService.java:registerAppAttempt(434)) - Registering app 
> attempt : appattempt_1557237478804_0001_01
> 2019-05-07 15:58:02,732 INFO  [RM Event dispatcher] attempt.RMAppAttemptImpl 
> (RMAppAttemptImpl.java:handle(920)) - appattempt_1557237478804_0001_01 
> State change from NEW to SUBMITTED on event = START
> 2019-05-07 15:58:02,746 INFO  [SchedulerEventDispatcher:Event Processor] 
> scheduler.SchedulerApplicationAttempt 
> (SchedulerApplicationAttempt.java:<init>(207)) - *** In the constructor of 
> SchedulerApplicationAttempt
> 2019-05-07 15:58:02,747 INFO  [SchedulerEventDispatcher:Event Processor] 
> scheduler.SchedulerApplicationAttempt 
> (SchedulerApplicationAttempt.java:<init>(230)) - *** Contents of 
> appSchedulingInfo: []
> 2019-05-07 15:58:02,752 INFO  [SchedulerEventDispatcher:Event Processor] 
> fair.FairScheduler (FairScheduler.java:addApplicationAttempt(546)) - Added 
> Application Attempt appattempt_1557237478804_0001_01 to scheduler from 
> user: bacskop
> 2019-05-07 15:58:02,756 INFO  [RM Event 

[jira] [Commented] (YARN-9552) FairScheduler: NODE_UPDATE can cause NoSuchElementException

2019-05-15 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840349#comment-16840349
 ] 

Szilard Nemeth commented on YARN-9552:
--

Hi [~pbacsko]!
+1 (non-binding) for the latest patch!

> FairScheduler: NODE_UPDATE can cause NoSuchElementException
> ---
>
> Key: YARN-9552
> URL: https://issues.apache.org/jira/browse/YARN-9552
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9552-001.patch, YARN-9552-002.patch, 
> YARN-9552-003.patch
>
>
> We observed a race condition inside YARN with the following stack trace:
> {noformat}
> 18/11/07 06:45:09.559 SchedulerEventDispatcher:Event Processor ERROR 
> EventDispatcher: Error in handling event type NODE_UPDATE to the Event 
> Dispatcher
> java.util.NoSuchElementException
> at 
> java.util.concurrent.ConcurrentSkipListMap.firstKey(ConcurrentSkipListMap.java:2036)
> at 
> java.util.concurrent.ConcurrentSkipListSet.first(ConcurrentSkipListSet.java:396)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.getNextPendingAsk(AppSchedulingInfo.java:373)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.isOverAMShareLimit(FSAppAttempt.java:941)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:1373)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:353)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:204)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1094)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:961)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1183)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:132)
> at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> This is basically the same as the one described in YARN-7382, but the root 
> cause is different.
> When we create an application attempt, we create an {{FSAppAttempt}} object. 
> This contains an {{AppSchedulingInfo}} which contains a set of 
> {{SchedulerRequestKey}}. Initially, this set is empty and only initialized a 
> bit later on a separate thread during a state transition:
> {noformat}
> 2019-05-07 15:58:02,659 INFO  [RM StateStore dispatcher] 
> recovery.RMStateStore (RMStateStore.java:transition(239)) - Storing info for 
> app: application_1557237478804_0001
> 2019-05-07 15:58:02,684 INFO  [RM Event dispatcher] rmapp.RMAppImpl 
> (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change 
> from NEW_SAVING to SUBMITTED on event = APP_NEW_SAVED
> 2019-05-07 15:58:02,690 INFO  [SchedulerEventDispatcher:Event Processor] 
> fair.FairScheduler (FairScheduler.java:addApplication(490)) - Accepted 
> application application_1557237478804_0001 from user: bacskop, in queue: 
> root.bacskop, currently num of applications: 1
> 2019-05-07 15:58:02,698 INFO  [RM Event dispatcher] rmapp.RMAppImpl 
> (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change 
> from SUBMITTED to ACCEPTED on event = APP_ACCEPTED
> 2019-05-07 15:58:02,731 INFO  [RM Event dispatcher] 
> resourcemanager.ApplicationMasterService 
> (ApplicationMasterService.java:registerAppAttempt(434)) - Registering app 
> attempt : appattempt_1557237478804_0001_01
> 2019-05-07 15:58:02,732 INFO  [RM Event dispatcher] attempt.RMAppAttemptImpl 
> (RMAppAttemptImpl.java:handle(920)) - appattempt_1557237478804_0001_01 
> State change from NEW to SUBMITTED on event = START
> 2019-05-07 15:58:02,746 INFO  [SchedulerEventDispatcher:Event Processor] 
> scheduler.SchedulerApplicationAttempt 
> (SchedulerApplicationAttempt.java:<init>(207)) - *** In the constructor of 
> SchedulerApplicationAttempt
> 2019-05-07 15:58:02,747 INFO  [SchedulerEventDispatcher:Event Processor] 
> scheduler.SchedulerApplicationAttempt 
> (SchedulerApplicationAttempt.java:<init>(230)) - *** Contents of 
> appSchedulingInfo: []
> 2019-05-07 15:58:02,752 INFO  [SchedulerEventDispatcher:Event Processor] 
> fair.FairScheduler (FairScheduler.java:addApplicationAttempt(546)) - Added 
> Application Attempt appattempt_1557237478804_0001_01 to scheduler from 
> user: bacskop
> 

[jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.

2019-05-15 Thread Lars Francke (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840348#comment-16840348
 ] 

Lars Francke commented on YARN-6875:


All subtasks have been resolved here but the issue is still OPEN. Is this 
feature complete and implemented? Are we using this new format already? Can we 
close the issue?

> New aggregated log file format for YARN log aggregation.
> 
>
> Key: YARN-6875
> URL: https://issues.apache.org/jira/browse/YARN-6875
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>Priority: Major
> Attachments: YARN-6875-NewLogAggregationFormat-design-doc.pdf
>
>
> T-file is the underlying log format for the aggregated logs in YARN. We have 
> seen several performance issues, especially for very large log files.
> We will introduce a new log format which has better performance for large 
> log files.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9552) FairScheduler: NODE_UPDATE can cause NoSuchElementException

2019-05-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840346#comment-16840346
 ] 

Hadoop QA commented on YARN-9552:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
 2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 11s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 30s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 20 unchanged - 0 fixed = 21 total (was 20) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 50s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 85m  
3s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}139m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9552 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12968773/YARN-9552-003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 3f35447184cd 4.4.0-144-generic #170~14.04.1-Ubuntu SMP Mon Mar 
18 15:02:05 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 570fa2d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/24092/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24092/testReport/ |
| Max. process+thread count | 915 (vs. ulimit of 1) |
| modules | C: 

[jira] [Comment Edited] (YARN-9554) TimelineEntity DAO has java.util.Set interface which JAXB can't handle

2019-05-15 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840317#comment-16840317
 ] 

Prabhu Joseph edited comment on YARN-9554 at 5/15/19 11:57 AM:
---

The {{TimelineEntity}} DAO class has a field with the {{Set}} interface, which 
JAXB can't handle, so {{ContextFactory}} throws the {{JAXBException}} shown in 
the description while creating {{JAXBContextImpl}}. This is ignored by Jersey.
{code:java}
INFO: Couldn't find grammar element for class 
org.apache.hadoop.yarn.api.records.timeline.TimelineEntity
{code}
Every timeline put-entities request invokes createContext with TimelineEntity, 
which throws JAXBException, so jaxbContext stays null. This again causes 
slowness due to synchronization when calling createContext, which is what 
YARN-7266 tried to fix.
{code:java}
Fix of YARN-7266

synchronized (ContextFactory.class) {
  if (jaxbContext == null) {
    jaxbContext = (JAXBContext) m.invoke((Object) null, classes,
        properties);
  }
}
return jaxbContext;
{code}

*The patch includes the fixes below* (sketched after this list):

1. If {{createContext}} is called for {{TimelineEntity}} or 
{{TimelineEntities}}, throw a {{JAXBException}} (with a suppressed stacktrace) 
immediately.

2. Reuse a single {{JAXBContextImpl}} for the other DAO classes from 
{{AHSWebServices}} and {{TimelineWebServices}}.

3. If {{createContext}} is called for any other class, like 
{{com.sun.research.ws.wadl.Application}}, create a new context, since the 
shared context does not know about that class.
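
For clarity, a minimal sketch of the patched {{createContext}} flow; the class 
shape, field names, and the equality check are assumptions for illustration, 
not the contents of YARN-9554-001.patch:
{code:java}
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntities;
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;

// Illustrative sketch only; names and structure are assumptions.
public class ContextFactorySketch {

  // Fix 2: one shared context for the web-service DAO classes.
  private static JAXBContext daoContext;
  private static Set<Class<?>> daoClasses;

  public static JAXBContext createContext(Class<?>... classes)
      throws JAXBException {
    for (Class<?> clazz : classes) {
      // Fix 1: fail fast for the DAO classes JAXB can never handle, so a
      // put-entities request no longer pays for a doomed context build.
      if (clazz == TimelineEntity.class || clazz == TimelineEntities.class) {
        throw new JAXBException("JAXB can't handle " + clazz.getName());
      }
    }
    Set<Class<?>> requested = new HashSet<>(Arrays.asList(classes));
    synchronized (ContextFactorySketch.class) {
      if (daoContext == null) {
        // First web-service caller builds the shared context once.
        daoContext = JAXBContext.newInstance(classes);
        daoClasses = requested;
      }
      if (requested.equals(daoClasses)) {
        return daoContext; // Fix 2: reuse for the known DAO classes.
      }
    }
    // Fix 3: any other class (e.g. com.sun.research.ws.wadl.Application)
    // gets its own freshly created context.
    return JAXBContext.newInstance(classes);
  }
}
{code}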

*Testing Covered:*

1. JUnit test classes from hadoop-yarn-server-applicationhistoryservice run 
fine
 2. Functional Testing
{code:java}
1. AHSWebServices and TimelineWebServices REST API both from browser and curl 
command - XML and JSON format.

http://:8188/ws/v1/applicationhistory/about

http://:8188/ws/v1/applicationhistory/apps/application_1557825335381_0001/appattempts/appattempt_1557825335381_0001_01/containers/container_1557825335381_0001_01_01

http://:8188/ws/v1/applicationhistory/apps/application_1557825335381_0001/appattempts/appattempt_1557825335381_0001_01/containers/

http://:8188/ws/v1/applicationhistory/apps/application_1557825335381_0001/appattempts/appattempt_1557825335381_0001_01/

http://:8188/ws/v1/timeline/about/ws/v1/applicationhistory/apps/application_1557825335381_0001/appattempts/

http://:8188/ws/v1/applicationhistory/apps/application_1557825335381_0001/

http://:8188/ws/v1/applicationhistory/apps/

http://:8188/ws/v1/timeline/about:8188/ws/v1/applicationhistory

http://:8188/ws/v1/timeline

http://:8188/ws/v1/timeline/about

http://:8188/ws/v1/timeline/YARN_APPLICATION

http://:8188/ws/v1/timeline/YARN_APPLICATION/application_1557825335381_0001

http://:8188/ws/v1/timeline/YARN_APPLICATION/events

http://:8188/ws/v1/timeline/HIVE_QUERY_ID

http://:8188/ws/v1/timeline/TEZ_DAG_ID

Insert Domain using PUT:

curl -H "Accept: application/json" -H "Content-Type: application/json" -X PUT 
http://:8188/ws/v1/timeline/domain -d 
'{"id":"abd","description":"test1","owner":"ambari-qa","readers":"ambari-qa","writers":"ambari-qa","createdtime":"123456","modifiedtime":"123456"}'
{"errors":[]}

Get Domain:

http://:8188/ws/v1/timeline/domain

http://:8188/ws/v1/timeline/domain/abc

{"domains":[{"id":"abc","description":"test","owner":"dr.who","readers":"ambari-qa","writers":"ambari-qa","createdtime":1557835184393,"modifiedtime":1557835209581}]}

Wrong URL:

http://:8188/ws/v1/applicationhistory/apps/application_1557825335381_0001/appattempts/containers/

Wrong Accept Type:

curl -H "Accept: application/xml" 
http://:8188/ws/v1/timeline/YARN_APPLICATION

2. MapReduce Service Check

3. Tez Service Check

4. Hive Queries

5. Tez View

6. ApplicationHistory Web App

http://:8188/applicationhistory/

7. PUT Entities:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.client.api.TimelineClient;
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;
import org.apache.hadoop.yarn.api.records.timeline.TimelinePutResponse;

public class Putter {

  public static void main(String[] arg) {

    // Create and start a v1 TimelineClient with default configuration.
    TimelineClient client = TimelineClient.createTimelineClient();
    client.init(new Configuration());
    client.start();

    // Build a minimal entity of type "dummy" with the id from the CLI.
    TimelineEntity entity = new TimelineEntity();
    entity.setEntityId(arg[0]);
    entity.setEntityType("dummy");
    entity.setStartTime(System.currentTimeMillis());

    try {
      TimelinePutResponse response = client.putEntities(entity);
      System.out.println("RESPONSE=" + response.toString());
    } catch (Exception e) {
      e.printStackTrace();
    }
    client.stop();
  }
}

8. GET Entities:

http://:8188/ws/v1/timeline/dummy
{code}


was (Author: prabhu joseph):
{{TimelineEntity}} DAO class has field with {{Set}} interface which JAXB can't 
handle and so {{ContextFactory}} throws {{JAXBException}} shown 

[jira] [Commented] (YARN-9554) TimelineEntity DAO has java.util.Set interface which JAXB can't handle

2019-05-15 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840317#comment-16840317
 ] 

Prabhu Joseph commented on YARN-9554:
-

The {{TimelineEntity}} DAO class has a field with the {{Set}} interface, which 
JAXB can't handle, so {{ContextFactory}} throws the {{JAXBException}} shown in 
the description while creating {{JAXBContextImpl}}. This is ignored by Jersey.
{code:java}
INFO: Couldn't find grammar element for class 
org.apache.hadoop.yarn.api.records.timeline.TimelineEntity
{code}
Every timeline put-entities request invokes createContext with TimelineEntity, 
which throws JAXBException, so jaxbContext stays null. This again causes 
slowness due to synchronization when calling createContext, which is what 
YARN-7266 tried to fix.
{code:java}
Fix of YARN-7266

synchronized (ContextFactory.class) {
  if (jaxbContext == null) {
    jaxbContext = (JAXBContext) m.invoke((Object) null, classes,
        properties);
  }
}
return jaxbContext;
{code}
The patch includes the fixes below:

1. If {{createContext}} is called for {{TimelineEntity}} or 
{{TimelineEntities}}, throw a {{JAXBException}} (with a suppressed stacktrace) 
immediately.

2. Reuse a single {{JAXBContextImpl}} for the other DAO classes from 
{{AHSWebServices}} and {{TimelineWebServices}}.

3. If {{createContext}} is called for any other class, like 
{{com.sun.research.ws.wadl.Application}}, create a new context, since the 
shared context does not know about that class.

Testing done:

1. JUnit test classes from hadoop-yarn-server-applicationhistoryservice run 
fine
 2. Functional Testing
{code:java}
1. AHSWebServices and TimelineWebServices REST API both from browser and curl 
command - XML and JSON format.

http://:8188/ws/v1/applicationhistory/about

http://:8188/ws/v1/applicationhistory/apps/application_1557825335381_0001/appattempts/appattempt_1557825335381_0001_01/containers/container_1557825335381_0001_01_01

http://:8188/ws/v1/applicationhistory/apps/application_1557825335381_0001/appattempts/appattempt_1557825335381_0001_01/containers/

http://:8188/ws/v1/applicationhistory/apps/application_1557825335381_0001/appattempts/appattempt_1557825335381_0001_01/

http://:8188/ws/v1/timeline/about/ws/v1/applicationhistory/apps/application_1557825335381_0001/appattempts/

http://:8188/ws/v1/applicationhistory/apps/application_1557825335381_0001/

http://:8188/ws/v1/applicationhistory/apps/

http://:8188/ws/v1/timeline/about:8188/ws/v1/applicationhistory

http://:8188/ws/v1/timeline

http://:8188/ws/v1/timeline/about

http://:8188/ws/v1/timeline/YARN_APPLICATION

http://:8188/ws/v1/timeline/YARN_APPLICATION/application_1557825335381_0001

http://:8188/ws/v1/timeline/YARN_APPLICATION/events

http://:8188/ws/v1/timeline/HIVE_QUERY_ID

http://:8188/ws/v1/timeline/TEZ_DAG_ID

Insert Domain using PUT:

curl -H "Accept: application/json" -H "Content-Type: application/json" -X PUT 
http://:8188/ws/v1/timeline/domain -d 
'{"id":"abd","description":"test1","owner":"ambari-qa","readers":"ambari-qa","writers":"ambari-qa","createdtime":"123456","modifiedtime":"123456"}'
{"errors":[]}

Get Domain:

http://:8188/ws/v1/timeline/domain

http://:8188/ws/v1/timeline/domain/abc

{"domains":[{"id":"abc","description":"test","owner":"dr.who","readers":"ambari-qa","writers":"ambari-qa","createdtime":1557835184393,"modifiedtime":1557835209581}]}

Wrong URL:

http://:8188/ws/v1/applicationhistory/apps/application_1557825335381_0001/appattempts/containers/

Wrong Accept Type:

curl -H "Accept: application/xml" 
http://:8188/ws/v1/timeline/YARN_APPLICATION

2. MapReduce Service Check

3. Tez Service Check

4. Hive Queries

5. Tez View

6. ApplicationHistory Web App

http://:8188/applicationhistory/

7. PUT Entities:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.client.api.TimelineClient;
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;
import org.apache.hadoop.yarn.api.records.timeline.TimelinePutResponse;

public class Putter {

  public static void main(String[] arg) {

    // Create and start a v1 TimelineClient with default configuration.
    TimelineClient client = TimelineClient.createTimelineClient();
    client.init(new Configuration());
    client.start();

    // Build a minimal entity of type "dummy" with the id from the CLI.
    TimelineEntity entity = new TimelineEntity();
    entity.setEntityId(arg[0]);
    entity.setEntityType("dummy");
    entity.setStartTime(System.currentTimeMillis());

    try {
      TimelinePutResponse response = client.putEntities(entity);
      System.out.println("RESPONSE=" + response.toString());
    } catch (Exception e) {
      e.printStackTrace();
    }
    client.stop();
  }
}

8. GET Entities:

http://:8188/ws/v1/timeline/dummy
{code}

> TimelineEntity DAO has java.util.Set interface which JAXB can't handle
> --
>
> Key: YARN-9554
> URL: 

[jira] [Updated] (YARN-9554) TimelineEntity DAO has java.util.Set interface which JAXB can't handle

2019-05-15 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9554:

Attachment: YARN-9554-001.patch

> TimelineEntity DAO has java.util.Set interface which JAXB can't handle
> --
>
> Key: YARN-9554
> URL: https://issues.apache.org/jira/browse/YARN-9554
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineservice
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9554-001.patch
>
>
> TimelineEntity DAO has java.util.Set interface which JAXB can't handle. This 
> breaks the fix of YARN-7266.
> {code}
> Caused by: com.sun.xml.internal.bind.v2.runtime.IllegalAnnotationsException: 
> 1 counts of IllegalAnnotationExceptions
> java.util.Set is an interface, and JAXB can't handle interfaces.
>   this problem is related to the following location:
>   at java.util.Set
>   at public java.util.HashMap 
> org.apache.hadoop.yarn.api.records.timeline.TimelineEntity.getPrimaryFiltersJAXB()
>   at org.apache.hadoop.yarn.api.records.timeline.TimelineEntity
>   at public java.util.List 
> org.apache.hadoop.yarn.api.records.timeline.TimelineEntities.getEntities()
>   at org.apache.hadoop.yarn.api.records.timeline.TimelineEntities
>   at 
> com.sun.xml.internal.bind.v2.runtime.IllegalAnnotationsException$Builder.check(IllegalAnnotationsException.java:91)
>   at 
> com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.getTypeInfoSet(JAXBContextImpl.java:445)
>   at 
> com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:277)
>   at 
> com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:124)
>   at 
> com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl$JAXBContextBuilder.build(JAXBContextImpl.java:1123)
>   at 
> com.sun.xml.internal.bind.v2.ContextFactory.createContext(ContextFactory.java:147)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9552) FairScheduler: NODE_UPDATE can cause NoSuchElementException

2019-05-15 Thread Peter Bacsko (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-9552:
---
Attachment: YARN-9552-003.patch

> FairScheduler: NODE_UPDATE can cause NoSuchElementException
> ---
>
> Key: YARN-9552
> URL: https://issues.apache.org/jira/browse/YARN-9552
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9552-001.patch, YARN-9552-002.patch, 
> YARN-9552-003.patch
>
>
> We observed a race condition inside YARN with the following stack trace:
> {noformat}
> 18/11/07 06:45:09.559 SchedulerEventDispatcher:Event Processor ERROR 
> EventDispatcher: Error in handling event type NODE_UPDATE to the Event 
> Dispatcher
> java.util.NoSuchElementException
> at 
> java.util.concurrent.ConcurrentSkipListMap.firstKey(ConcurrentSkipListMap.java:2036)
> at 
> java.util.concurrent.ConcurrentSkipListSet.first(ConcurrentSkipListSet.java:396)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.getNextPendingAsk(AppSchedulingInfo.java:373)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.isOverAMShareLimit(FSAppAttempt.java:941)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:1373)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:353)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:204)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1094)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:961)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1183)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:132)
> at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> This is basically the same as the one described in YARN-7382, but the root 
> cause is different.
> When we create an application attempt, we create an {{FSAppAttempt}} object. 
> This contains an {{AppSchedulingInfo}} which contains a set of 
> {{SchedulerRequestKey}}. Initially, this set is empty and only initialized a 
> bit later on a separate thread during a state transition:
> {noformat}
> 2019-05-07 15:58:02,659 INFO  [RM StateStore dispatcher] 
> recovery.RMStateStore (RMStateStore.java:transition(239)) - Storing info for 
> app: application_1557237478804_0001
> 2019-05-07 15:58:02,684 INFO  [RM Event dispatcher] rmapp.RMAppImpl 
> (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change 
> from NEW_SAVING to SUBMITTED on event = APP_NEW_SAVED
> 2019-05-07 15:58:02,690 INFO  [SchedulerEventDispatcher:Event Processor] 
> fair.FairScheduler (FairScheduler.java:addApplication(490)) - Accepted 
> application application_1557237478804_0001 from user: bacskop, in queue: 
> root.bacskop, currently num of applications: 1
> 2019-05-07 15:58:02,698 INFO  [RM Event dispatcher] rmapp.RMAppImpl 
> (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change 
> from SUBMITTED to ACCEPTED on event = APP_ACCEPTED
> 2019-05-07 15:58:02,731 INFO  [RM Event dispatcher] 
> resourcemanager.ApplicationMasterService 
> (ApplicationMasterService.java:registerAppAttempt(434)) - Registering app 
> attempt : appattempt_1557237478804_0001_01
> 2019-05-07 15:58:02,732 INFO  [RM Event dispatcher] attempt.RMAppAttemptImpl 
> (RMAppAttemptImpl.java:handle(920)) - appattempt_1557237478804_0001_01 
> State change from NEW to SUBMITTED on event = START
> 2019-05-07 15:58:02,746 INFO  [SchedulerEventDispatcher:Event Processor] 
> scheduler.SchedulerApplicationAttempt 
> (SchedulerApplicationAttempt.java:<init>(207)) - *** In the constructor of 
> SchedulerApplicationAttempt
> 2019-05-07 15:58:02,747 INFO  [SchedulerEventDispatcher:Event Processor] 
> scheduler.SchedulerApplicationAttempt 
> (SchedulerApplicationAttempt.java:<init>(230)) - *** Contents of 
> appSchedulingInfo: []
> 2019-05-07 15:58:02,752 INFO  [SchedulerEventDispatcher:Event Processor] 
> fair.FairScheduler (FairScheduler.java:addApplicationAttempt(546)) - Added 
> Application Attempt appattempt_1557237478804_0001_01 to scheduler from 
> user: bacskop
> 2019-05-07 15:58:02,756 INFO  [RM Event dispatcher] 
> 

[jira] [Commented] (YARN-9552) FairScheduler: NODE_UPDATE can cause NoSuchElementException

2019-05-15 Thread Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840256#comment-16840256
 ] 

Peter Bacsko commented on YARN-9552:


[~snemeth] I added a short comment to the testcase.
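
For readers of the quoted description below: the failure itself is simply 
{{ConcurrentSkipListSet.first()}} throwing on a set that another thread has 
not populated yet. A standalone JDK-level sketch of that behaviour 
(illustrative only, not the patch):
{code:java}
import java.util.NoSuchElementException;
import java.util.concurrent.ConcurrentSkipListSet;

// Standalone illustration of the empty-set window described below; this
// is not the patch, just the JDK behaviour that triggers the stack trace.
public class EmptySetRace {
  public static void main(String[] args) {
    ConcurrentSkipListSet<Integer> schedulerKeys = new ConcurrentSkipListSet<>();
    try {
      // NODE_UPDATE arrives before the other thread adds any keys:
      schedulerKeys.first();
    } catch (NoSuchElementException e) {
      System.out.println("first() threw, as in the stack trace");
    }
    // A tolerant read of the same data (a real fix still has to handle
    // the race between isEmpty() and first()):
    Integer head = schedulerKeys.isEmpty() ? null : schedulerKeys.first();
    System.out.println("guarded read -> " + head);
  }
}
{code}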

> FairScheduler: NODE_UPDATE can cause NoSuchElementException
> ---
>
> Key: YARN-9552
> URL: https://issues.apache.org/jira/browse/YARN-9552
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9552-001.patch, YARN-9552-002.patch, 
> YARN-9552-003.patch
>
>
> We observed a race condition inside YARN with the following stack trace:
> {noformat}
> 18/11/07 06:45:09.559 SchedulerEventDispatcher:Event Processor ERROR 
> EventDispatcher: Error in handling event type NODE_UPDATE to the Event 
> Dispatcher
> java.util.NoSuchElementException
> at 
> java.util.concurrent.ConcurrentSkipListMap.firstKey(ConcurrentSkipListMap.java:2036)
> at 
> java.util.concurrent.ConcurrentSkipListSet.first(ConcurrentSkipListSet.java:396)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.getNextPendingAsk(AppSchedulingInfo.java:373)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.isOverAMShareLimit(FSAppAttempt.java:941)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:1373)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:353)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:204)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1094)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:961)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1183)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:132)
> at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> This is basically the same as the one described in YARN-7382, but the root 
> cause is different.
> When we create an application attempt, we create an {{FSAppAttempt}} object. 
> This contains an {{AppSchedulingInfo}} which contains a set of 
> {{SchedulerRequestKey}}. Initially, this set is empty and only initialized a 
> bit later on a separate thread during a state transition:
> {noformat}
> 2019-05-07 15:58:02,659 INFO  [RM StateStore dispatcher] 
> recovery.RMStateStore (RMStateStore.java:transition(239)) - Storing info for 
> app: application_1557237478804_0001
> 2019-05-07 15:58:02,684 INFO  [RM Event dispatcher] rmapp.RMAppImpl 
> (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change 
> from NEW_SAVING to SUBMITTED on event = APP_NEW_SAVED
> 2019-05-07 15:58:02,690 INFO  [SchedulerEventDispatcher:Event Processor] 
> fair.FairScheduler (FairScheduler.java:addApplication(490)) - Accepted 
> application application_1557237478804_0001 from user: bacskop, in queue: 
> root.bacskop, currently num of applications: 1
> 2019-05-07 15:58:02,698 INFO  [RM Event dispatcher] rmapp.RMAppImpl 
> (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change 
> from SUBMITTED to ACCEPTED on event = APP_ACCEPTED
> 2019-05-07 15:58:02,731 INFO  [RM Event dispatcher] 
> resourcemanager.ApplicationMasterService 
> (ApplicationMasterService.java:registerAppAttempt(434)) - Registering app 
> attempt : appattempt_1557237478804_0001_01
> 2019-05-07 15:58:02,732 INFO  [RM Event dispatcher] attempt.RMAppAttemptImpl 
> (RMAppAttemptImpl.java:handle(920)) - appattempt_1557237478804_0001_01 
> State change from NEW to SUBMITTED on event = START
> 2019-05-07 15:58:02,746 INFO  [SchedulerEventDispatcher:Event Processor] 
> scheduler.SchedulerApplicationAttempt 
> (SchedulerApplicationAttempt.java:<init>(207)) - *** In the constructor of 
> SchedulerApplicationAttempt
> 2019-05-07 15:58:02,747 INFO  [SchedulerEventDispatcher:Event Processor] 
> scheduler.SchedulerApplicationAttempt 
> (SchedulerApplicationAttempt.java:<init>(230)) - *** Contents of 
> appSchedulingInfo: []
> 2019-05-07 15:58:02,752 INFO  [SchedulerEventDispatcher:Event Processor] 
> fair.FairScheduler (FairScheduler.java:addApplicationAttempt(546)) - Added 
> Application Attempt appattempt_1557237478804_0001_01 to scheduler from 
> user: bacskop
> 2019-05-07 

[jira] [Commented] (YARN-9482) DistributedShell job with localization fails in unsecure cluster

2019-05-15 Thread Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840231#comment-16840231
 ] 

Peter Bacsko commented on YARN-9482:


If that's the case, then I give +1 (non-binding).

> DistributedShell job with localization fails in unsecure cluster
> 
>
> Key: YARN-9482
> URL: https://issues.apache.org/jira/browse/YARN-9482
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9482-001.patch, YARN-9482-002.patch, 
> YARN-9482-003.patch
>
>
> DistributedShell job with localization fails in an unsecure cluster. The 
> client localizes the input files to the home directory of the job user, 
> whereas the AM runs as the yarn user and reads from its own home directory.
> *Command:*
> {code}
> yarn jar 
> /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar
>  -shell_command ls  -shell_args / -jar  
> /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar
>  -localize_files /tmp/prabhu
> {code}
> {code}
> Exception in thread "Thread-4" java.io.UncheckedIOException: Error during 
> localization setup
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.lambda$run$0(ApplicationMaster.java:1495)
>   at 
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
>   at 
> java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.run(ApplicationMaster.java:1481)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: File does not exist: 
> hdfs://yarn-ats-1:8020/user/yarn/DistributedShell/application_1554817981283_0003/prabhu
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1586)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1579)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1594)
>   at 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.lambda$run$0(ApplicationMaster.java:1487)
> {code}
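
For illustration, the paths diverge because the relative staging suffix is 
resolved against whoever calls {{getHomeDirectory()}}; a hypothetical 
standalone snippet (not the patch):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical illustration: the same relative suffix resolves to
// /user/<submitter>/... on the client but /user/yarn/... inside the AM.
public class HomeDirMismatch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path staging = new Path(fs.getHomeDirectory(),
        "DistributedShell/application_1554817981283_0003/prabhu");
    // The client (submitter) and the AM (yarn user) print different
    // absolute paths here, which is the mismatch in the description.
    System.out.println("Resolved staging dir: " + staging);
  }
}
{code}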



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9508) YarnConfiguration areNodeLabel enabled is costly in allocation flow

2019-05-15 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840153#comment-16840153
 ] 

Hudson commented on YARN-9508:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16552 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16552/])
YARN-9508. YarnConfiguration areNodeLabel enabled is costly in (bibinchundatt: 
rev 570fa2da20706490dc7823efd0ce0cef3ddc81f9)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/DefaultAMSProcessor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestSchedulerUtils.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java


> YarnConfiguration areNodeLabel enabled is costly in allocation flow
> ---
>
> Key: YARN-9508
> URL: https://issues.apache.org/jira/browse/YARN-9508
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bilwa S T
>Priority: Critical
> Fix For: 3.3.0, 3.2.1
>
> Attachments: YARN-9508-001.patch, YARN-9508-002.patch, 
> YARN-9508-003.patch
>
>
> For every allocate request, locking can be avoided, improving performance.
> {noformat}
> "pool-6-thread-300" #624 prio=5 os_prio=0 tid=0x7f2f91152800 nid=0x8ec5 
> waiting for monitor entry [0x7f1ec6a8d000]
>  java.lang.Thread.State: BLOCKED (on object monitor)
>  at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2841)
>  - waiting to lock <0x7f1f8107c748> (a 
> org.apache.hadoop.yarn.conf.YarnConfiguration)
>  at org.apache.hadoop.conf.Configuration.get(Configuration.java:1214)
>  at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1268)
>  at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1674)
>  at 
> org.apache.hadoop.yarn.conf.YarnConfiguration.areNodeLabelsEnabled(YarnConfiguration.java:3646)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:234)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndvalidateRequest(SchedulerUtils.java:274)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.normalizeAndValidateRequests(RMServerUtils.java:261)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.allocate(DefaultAMSProcessor.java:242)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.processor.DisabledPlacementProcessor.allocate(DisabledPlacementProcessor.java:75)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AMSProcessingChain.allocate(AMSProcessingChain.java:92)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:427)
>  - locked <0x7f24dd3f9e40> (a 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService$AllocateResponseLock)
>  at 
> org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator$1.run(MRAMSimulator.java:352)
>  at 
> org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator$1.run(MRAMSimulator.java:349)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>  at 
> org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator.sendContainerRequest(MRAMSimulator.java:348)
>  at 
> org.apache.hadoop.yarn.sls.appmaster.AMSimulator.middleStep(AMSimulator.java:212)
>  at 
> org.apache.hadoop.yarn.sls.scheduler.TaskRunner$Task.run(TaskRunner.java:94)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> {noformat}
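
The blocking above comes from every allocate call re-reading the synchronized 
{{Configuration}} properties. A minimal sketch of the usual remedy, reading 
the flag once at init time and serving it from a plain field (the class and 
field names are illustrative assumptions; the committed change touches the 
files listed in the Hudson comment above):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Illustrative holder; the name and placement are assumptions, not the
// committed patch.
public class NodeLabelsFlag {
  private final boolean nodeLabelsEnabled;

  public NodeLabelsFlag(Configuration conf) {
    // Read the configuration once, outside the allocate hot path.
    this.nodeLabelsEnabled = YarnConfiguration.areNodeLabelsEnabled(conf);
  }

  public boolean areNodeLabelsEnabled() {
    // No Configuration lock is taken per allocate call.
    return nodeLabelsEnabled;
  }
}
{code}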



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: 

[jira] [Commented] (YARN-9547) ContainerStatusPBImpl default execution type is not returned

2019-05-15 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840143#comment-16840143
 ] 

Hudson commented on YARN-9547:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16551 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16551/])
YARN-9547. ContainerStatusPBImpl default execution type is not returned. 
(bibinchundatt: rev 2de1e30658439945edf598b47257142f4730a37d)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/api/protocolrecords/TestProtocolRecords.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ContainerStatusPBImpl.java


> ContainerStatusPBImpl default execution type is not returned
> 
>
> Key: YARN-9547
> URL: https://issues.apache.org/jira/browse/YARN-9547
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9547-001.patch
>
>
> {code}
>   @Override
>   public synchronized ExecutionType getExecutionType() {
> ContainerStatusProtoOrBuilder p = viaProto ? proto : builder;
> if (!p.hasExecutionType()) {
>   return null;
> }
> return convertFromProtoFormat(p.getExecutionType());
>   }
> {code}
> ContainerStatusPBImpl executionType should return ExecutionType.GUARANTEED 
> as the default.
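
A sketch of the shape the fix takes per the sentence above, returning the 
default instead of null (illustrative, mirroring the getter quoted in the 
description rather than the committed diff):
{code:java}
// Sketch mirroring the getter quoted in the description; the committed
// change in ContainerStatusPBImpl may differ in detail.
@Override
public synchronized ExecutionType getExecutionType() {
  ContainerStatusProtoOrBuilder p = viaProto ? proto : builder;
  if (!p.hasExecutionType()) {
    // Default to GUARANTEED instead of returning null.
    return ExecutionType.GUARANTEED;
  }
  return convertFromProtoFormat(p.getExecutionType());
}
{code}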



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9547) ContainerStatusPBImpl default execution type is not returned

2019-05-15 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840137#comment-16840137
 ] 

Bibin A Chundatt commented on YARN-9547:


Committed to trunk.

[~BilwaST] Could you add a patch for the 3.1 and 3.2 branches too?

> ContainerStatusPBImpl default execution type is not returned
> 
>
> Key: YARN-9547
> URL: https://issues.apache.org/jira/browse/YARN-9547
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9547-001.patch
>
>
> {code}
>   @Override
>   public synchronized ExecutionType getExecutionType() {
> ContainerStatusProtoOrBuilder p = viaProto ? proto : builder;
> if (!p.hasExecutionType()) {
>   return null;
> }
> return convertFromProtoFormat(p.getExecutionType());
>   }
> {code}
> ContainerStatusPBImpl executionType should return ExecutionType.GUARANTEED 
> as the default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9558) TestAHSWebServices testcases failing

2019-05-15 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9558:

Affects Version/s: 3.1.3
   3.2.1

> TestAHSWebServices testcases failing
> 
>
> Key: YARN-9558
> URL: https://issues.apache.org/jira/browse/YARN-9558
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test, timelineservice
>Affects Versions: 3.3.0, 3.2.1, 3.1.3
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>
> TestAHSWebServices testcases failing. 
> {code:java}
> [ERROR]   TestAHSWebServices.testContainerLogsForFinishedApps:570
> [ERROR]   TestAHSWebServices.testContainerLogsForFinishedApps:570
> [ERROR]   TestAHSWebServices.testContainerLogsForRunningApps:777
> [ERROR]   TestAHSWebServices.testContainerLogsForRunningApps:777
> [ERROR] Errors: 
> [ERROR]   TestAHSWebServices.testContainerLogsMetaForFinishedApps:942 » 
> WebApplication j...
> [ERROR]   TestAHSWebServices.testContainerLogsMetaForFinishedApps:942 » 
> WebApplication j...
> [ERROR]   TestAHSWebServices.testContainerLogsMetaForRunningApps:875 » 
> WebApplication ja...
> [ERROR]   TestAHSWebServices.testContainerLogsMetaForRunningApps:875 » 
> WebApplication ja...
>  {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9558) TestAHSWebServices testcases failing

2019-05-15 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created YARN-9558:
---

 Summary: TestAHSWebServices testcases failing
 Key: YARN-9558
 URL: https://issues.apache.org/jira/browse/YARN-9558
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test, timelineservice
Affects Versions: 3.3.0
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph


TestAHSWebServices testcases failing. 

{code:java}
[ERROR]   TestAHSWebServices.testContainerLogsForFinishedApps:570
[ERROR]   TestAHSWebServices.testContainerLogsForFinishedApps:570
[ERROR]   TestAHSWebServices.testContainerLogsForRunningApps:777
[ERROR]   TestAHSWebServices.testContainerLogsForRunningApps:777
[ERROR] Errors: 
[ERROR]   TestAHSWebServices.testContainerLogsMetaForFinishedApps:942 » 
WebApplication j...
[ERROR]   TestAHSWebServices.testContainerLogsMetaForFinishedApps:942 » 
WebApplication j...
[ERROR]   TestAHSWebServices.testContainerLogsMetaForRunningApps:875 » 
WebApplication ja...
[ERROR]   TestAHSWebServices.testContainerLogsMetaForRunningApps:875 » 
WebApplication ja...
 {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9557) Application fails in diskchecker when ReadWriteDiskValidator is configured.

2019-05-15 Thread Bibin A Chundatt (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-9557:
---
Description: 
Application fails to execute successfully when ReadWriteDiskValidator is 
configured.

{code}

yarn.nodemanager.disk-validator
read-write

{code}
{noformat}
Exception thrown while starting Container:

java.io.IOException: org.apache.hadoop.util.DiskChecker$DiskErrorException: 
Disk Check failed!
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:200)
 at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:180)
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1233)
 Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check 
failed!
 at 
org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:82)
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:255)
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:312)
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:198)
 ... 2 more
 Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: 
/opt/HA/AN0805/nmlocal/usercache/dsperf/appcache/application_1557736108162_0009/filecache/11
 is not a directory!
 at 
org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:50)
{noformat}
 

  was:
Application fails to execute successfully when ReadWriteDiskValidator is 
configured.

 
{noformat}
Exception thrown while starting Container:

java.io.IOException: org.apache.hadoop.util.DiskChecker$DiskErrorException: 
Disk Check failed!
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:200)
 at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:180)
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1233)
 Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check 
failed!
 at 
org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:82)
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:255)
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:312)
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:198)
 ... 2 more
 Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: 
/opt/HA/AN0805/nmlocal/usercache/dsperf/appcache/application_1557736108162_0009/filecache/11
 is not a directory!
 at 
org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:50)
{noformat}
 


> Application fails in diskchecker when ReadWriteDiskValidator is configured.
> ---
>
> Key: YARN-9557
> URL: https://issues.apache.org/jira/browse/YARN-9557
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.1.1
> Environment: Configure:
> <property>
>   <name>yarn.nodemanager.disk-validator</name>
>   <value>read-write</value>
> </property>
>Reporter: Anuruddh Nayak
>Priority: Critical
>
> Application fails to execute successfully when ReadWriteDiskValidator is 
> configured.
> {code}
> <property>
>   <name>yarn.nodemanager.disk-validator</name>
>   <value>read-write</value>
> </property>
> {code}
> {noformat}
> Exception thrown while starting Container:
> java.io.IOException: org.apache.hadoop.util.DiskChecker$DiskErrorException: 
> Disk Check failed!
>  at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:200)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:180)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1233)
>  Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check 
> failed!
>  at 
> org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:82)
>  at 
> 

[jira] [Updated] (YARN-9557) Application fails in diskchecker when ReadWriteDiskValidator is configured.

2019-05-15 Thread Bibin A Chundatt (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-9557:
---
Description: 
Application fails to execute successfully when ReadWriteDiskValidator is 
configured.

 
{noformat}
Exception thrown while starting Container:

java.io.IOException: org.apache.hadoop.util.DiskChecker$DiskErrorException: 
Disk Check failed!
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:200)
 at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:180)
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1233)
 Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check 
failed!
 at 
org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:82)
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:255)
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:312)
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:198)
 ... 2 more
 Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: 
/opt/HA/AN0805/nmlocal/usercache/dsperf/appcache/application_1557736108162_0009/filecache/11
 is not a directory!
 at 
org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:50)
{noformat}
 

  was:
Application fails to execute successfully when ReadWriteDiskValidator is 
configured.

 

Exception thrown while starting Container:

java.io.IOException: org.apache.hadoop.util.DiskChecker$DiskErrorException: 
Disk Check failed!
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:200)
 at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:180)
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1233)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check 
failed!
 at 
org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:82)
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:255)
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:312)
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:198)
 ... 2 more
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: 
/opt/HA/AN0805/nmlocal/usercache/dsperf/appcache/application_1557736108162_0009/filecache/11
 is not a directory!
 at 
org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:50)

 


> Application fails in diskchecker when ReadWriteDiskValidator is configured.
> ---
>
> Key: YARN-9557
> URL: https://issues.apache.org/jira/browse/YARN-9557
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.1.1
> Environment: Configure:
> <property>
>   <name>yarn.nodemanager.disk-validator</name>
>   <value>read-write</value>
> </property>
>Reporter: Anuruddh Nayak
>Priority: Critical
>
> Application fails to execute successfully when ReadWriteDiskValidator is 
> configured.
>  
> {noformat}
> Exception thrown while starting Container:
> java.io.IOException: org.apache.hadoop.util.DiskChecker$DiskErrorException: 
> Disk Check failed!
>  at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:200)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:180)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1233)
>  Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check 
> failed!
>  at 
> org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:82)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:255)
>  at 
> 

[jira] [Updated] (YARN-9557) Application fails in diskchecker when ReadWriteDiskValidator is configured.

2019-05-15 Thread Bibin A Chundatt (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-9557:
---
Priority: Critical  (was: Major)

> Application fails in diskchecker when ReadWriteDiskValidator is configured.
> ---
>
> Key: YARN-9557
> URL: https://issues.apache.org/jira/browse/YARN-9557
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.1.1
> Environment: Configure:
> <property>
>   <name>yarn.nodemanager.disk-validator</name>
>   <value>read-write</value>
> </property>
>Reporter: Anuruddh Nayak
>Priority: Critical
>
> Application fails to execute successfully when ReadWriteDiskValidator is 
> configured.
>  
> Exception thrown while starting Container:
> java.io.IOException: org.apache.hadoop.util.DiskChecker$DiskErrorException: 
> Disk Check failed!
>  at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:200)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:180)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1233)
> Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check 
> failed!
>  at 
> org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:82)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:255)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:312)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:198)
>  ... 2 more
> Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: 
> /opt/HA/AN0805/nmlocal/usercache/dsperf/appcache/application_1557736108162_0009/filecache/11
>  is not a directory!
>  at 
> org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:50)
>  
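
A minimal, hedged repro sketch of the mismatch described above (the paths and 
file names are illustrative, not taken from the cluster):

{code:java}
import java.io.File;
import org.apache.hadoop.util.DiskValidator;
import org.apache.hadoop.util.DiskValidatorFactory;

public class ReadWriteValidatorRepro {
  public static void main(String[] args) throws Exception {
    // Same validator the NM selects for yarn.nodemanager.disk-validator=read-write.
    DiskValidator validator = DiskValidatorFactory.getInstance("read-write");

    File localDir = new File("/tmp/nmlocal");     // stand-in for the NM local dir
    localDir.mkdirs();
    validator.checkStatus(localDir);              // passes: a writable directory

    File localizedFile = new File(localDir, "filecache-11");
    localizedFile.createNewFile();
    validator.checkStatus(localizedFile);         // throws DiskErrorException:
                                                  // "... is not a directory!"
  }
}
{code}

This matches the quoted trace: the validator is handed the localized *file* 
path from ContainerLocalizer.download, and the read-write check rejects it at 
ReadWriteDiskValidator.java:50.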



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9557) Application fails in diskchecker when ReadWriteDiskValidator is configured.

2019-05-15 Thread Anuruddh Nayak (JIRA)
Anuruddh Nayak created YARN-9557:


 Summary: Application fails in diskchecker when 
ReadWriteDiskValidator is configured.
 Key: YARN-9557
 URL: https://issues.apache.org/jira/browse/YARN-9557
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.1.1
 Environment: Configure:

<property>
  <name>yarn.nodemanager.disk-validator</name>
  <value>read-write</value>
</property>
Reporter: Anuruddh Nayak


Application fails to execute successfully when ReadWriteDiskValidator is 
configured.

 

Exception thrown while starting Container:

java.io.IOException: org.apache.hadoop.util.DiskChecker$DiskErrorException: 
Disk Check failed!
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:200)
 at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:180)
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1233)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check 
failed!
 at 
org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:82)
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:255)
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:312)
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:198)
 ... 2 more
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: 
/opt/HA/AN0805/nmlocal/usercache/dsperf/appcache/application_1557736108162_0009/filecache/11
 is not a directory!
 at 
org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:50)

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9521) RM failed to start due to system services

2019-05-15 Thread kyungwan nam (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840096#comment-16840096
 ] 

kyungwan nam commented on YARN-9521:


I think the cause of this problem is as follows.

1. _fs_ is set by calling FileSystem.get() in 
SystemServiceManagerImpl.serviceInit.

2. RMAppImpl.appAdminClientCleanUp will be called in RMAppImpl.FinalTransition 
if an APP_COMPLETED event occurs during RMStateStore recovery.

{code}
  static void appAdminClientCleanUp(RMAppImpl app) {
try {
  AppAdminClient client = AppAdminClient.createAppAdminClient(app
  .applicationType, app.conf);
  int result = client.actionCleanUp(app.name, app.user);
{code}

ApiServiceClient.actionCleanUp
{code}
  @Override
  public int actionCleanUp(String appName, String userName) throws
  IOException, YarnException {
ServiceClient sc = new ServiceClient();
sc.init(getConfig());
sc.start();
int result = sc.actionCleanUp(appName, userName);
sc.close();
return result;
  }
{code}

The ServiceClient instance obtains a FileSystem by calling FileSystem.get() at 
initialization time, but that may return a cached instance shared across the 
JVM. The cached FileSystem will then be closed by _sc.close()_.

3. scanForUserServices is called in SystemServiceManagerImpl.serviceStart, but 
by then _fs_ has already been closed.
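
A hedged sketch of the cache hazard (assuming default FileSystem cache 
settings; the /services path is illustrative):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FsCacheHazard {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();

    FileSystem fs1 = FileSystem.get(conf);  // cached per (scheme, authority, ugi)
    FileSystem fs2 = FileSystem.get(conf);  // same object as fs1

    fs2.close();                            // closes the shared cached instance

    // fs1 is now unusable, mirroring SystemServiceManagerImpl's _fs_ after
    // ApiServiceClient.actionCleanUp closes its ServiceClient.
    fs1.exists(new Path("/services"));      // on HDFS: IOException "Filesystem closed"

    // One possible mitigation: hold a private, uncached instance instead.
    FileSystem privateFs = FileSystem.newInstance(conf);
    try {
      privateFs.exists(new Path("/services"));
    } finally {
      privateFs.close();                    // affects only this instance
    }
  }
}
{code}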



RM log

{code}

// 1. SystemServiceManagerImpl.serviceInit
//
2019-05-15 10:27:59,445 DEBUG service.AbstractService 
(AbstractService.java:enterState(443)) - Service: 
org.apache.hadoop.yarn.service.client.SystemServiceManagerImpl entered state 
INITED
2019-05-15 10:27:59,446 INFO  client.SystemServiceManagerImpl 
(SystemServiceManagerImpl.java:serviceInit(114)) - System Service Directory is 
configured to /services
2019-05-15 10:27:59,472 DEBUG fs.FileSystem 
(FileSystem.java:loadFileSystems(3209)) - Loading filesystems
2019-05-15 10:27:59,483 DEBUG fs.FileSystem 
(FileSystem.java:loadFileSystems(3221)) - file:// = class 
org.apache.hadoop.fs.LocalFileSystem from 
/usr/hdp/3.1.0.0-78/hadoop/hadoop-common-3.1.1.3.1.2.3.1.0.0-78.jar
2019-05-15 10:27:59,488 DEBUG fs.FileSystem 
(FileSystem.java:loadFileSystems(3221)) - viewfs:// = class 
org.apache.hadoop.fs.viewfs.ViewFileSystem from 
/usr/hdp/3.1.0.0-78/hadoop/hadoop-common-3.1.1.3.1.2.3.1.0.0-78.jar
2019-05-15 10:27:59,491 DEBUG fs.FileSystem 
(FileSystem.java:loadFileSystems(3221)) - har:// = class 
org.apache.hadoop.fs.HarFileSystem from 
/usr/hdp/3.1.0.0-78/hadoop/hadoop-common-3.1.1.3.1.2.3.1.0.0-78.jar
2019-05-15 10:27:59,492 DEBUG fs.FileSystem 
(FileSystem.java:loadFileSystems(3221)) - http:// = class 
org.apache.hadoop.fs.http.HttpFileSystem from 
/usr/hdp/3.1.0.0-78/hadoop/hadoop-common-3.1.1.3.1.2.3.1.0.0-78.jar
2019-05-15 10:27:59,493 DEBUG fs.FileSystem 
(FileSystem.java:loadFileSystems(3221)) - https:// = class 
org.apache.hadoop.fs.http.HttpsFileSystem from 
/usr/hdp/3.1.0.0-78/hadoop/hadoop-common-3.1.1.3.1.2.3.1.0.0-78.jar
2019-05-15 10:27:59,503 DEBUG fs.FileSystem 
(FileSystem.java:loadFileSystems(3221)) - hdfs:// = class 
org.apache.hadoop.hdfs.DistributedFileSystem from 
/usr/hdp/3.1.0.0-78/hadoop-hdfs/hadoop-hdfs-client-3.1.1.3.1.2.3.1.0.0-78.jar
2019-05-15 10:27:59,511 DEBUG fs.FileSystem 
(FileSystem.java:loadFileSystems(3221)) - webhdfs:// = class 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem from 
/usr/hdp/3.1.0.0-78/hadoop-hdfs/hadoop-hdfs-client-3.1.1.3.1.2.3.1.0.0-78.jar
2019-05-15 10:27:59,512 DEBUG fs.FileSystem 
(FileSystem.java:loadFileSystems(3221)) - swebhdfs:// = class 
org.apache.hadoop.hdfs.web.SWebHdfsFileSystem from 
/usr/hdp/3.1.0.0-78/hadoop-hdfs/hadoop-hdfs-client-3.1.1.3.1.2.3.1.0.0-78.jar
2019-05-15 10:27:59,514 DEBUG fs.FileSystem 
(FileSystem.java:loadFileSystems(3221)) - s3n:// = class 
org.apache.hadoop.fs.s3native.NativeS3FileSystem from 
/usr/hdp/3.1.0.0-78/hadoop-mapreduce/hadoop-aws-3.1.1.3.1.2.3.1.0.0-78.jar
2019-05-15 10:27:59,514 DEBUG fs.FileSystem 
(FileSystem.java:getFileSystemClass(3264)) - Looking for FS supporting hdfs
2019-05-15 10:27:59,514 DEBUG fs.FileSystem 
(FileSystem.java:getFileSystemClass(3268)) - looking for configuration option 
fs.hdfs.impl
2019-05-15 10:27:59,528 DEBUG fs.FileSystem 
(FileSystem.java:getFileSystemClass(3275)) - Looking in service filesystems for 
implementation class
2019-05-15 10:27:59,528 DEBUG fs.FileSystem 
(FileSystem.java:getFileSystemClass(3284)) - FS for hdfs is class 
org.apache.hadoop.hdfs.DistributedFileSystem

// 2. APP_COMPLETED event occurs
//
2019-05-15 10:28:02,931 DEBUG rmapp.RMAppImpl (RMAppImpl.java:handle(895)) - 
Processing event for application_1556612756829_0001 of type RECOVER
2019-05-15 10:28:02,931 DEBUG rmapp.RMAppImpl (RMAppImpl.java:recover(933)) - 
Recovering app: application_1556612756829_0001 with 2 attempts and final state 
= FAILED
2019-05-15 10:28:02,931 DEBUG attempt.RMAppAttemptImpl 
(RMAppAttemptImpl.java:(544)) - yarn.app.attempt.diagnostics.limit.kc : 64

[jira] [Commented] (YARN-9519) TFile log aggregation file format is not working for yarn.log-aggregation.TFile.remote-app-log-dir config

2019-05-15 Thread Adam Antal (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840094#comment-16840094
 ] 

Adam Antal commented on YARN-9519:
--

Thanks for the reviews and the commit.

> TFile log aggregation file format is not working for 
> yarn.log-aggregation.TFile.remote-app-log-dir config
> -
>
> Key: YARN-9519
> URL: https://issues.apache.org/jira/browse/YARN-9519
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Fix For: 3.3.0, 3.2.1, 3.1.3
>
> Attachments: YARN-9519.001.patch, YARN-9519.002.patch, 
> YARN-9519.003.patch, YARN-9519.004.patch, YARN-9519.005.patch
>
>
> The TFile log aggregation file format is not sensitive to the 
> yarn.log-aggregation.TFile.remote-app-log-dir config.
> In {{LogAggregationTFileController$initInternal}}:
> {code:java}
> this.remoteRootLogDir = new Path(
> conf.get(YarnConfiguration.NM_REMOTE_APP_LOG_DIR,
> YarnConfiguration.DEFAULT_NM_REMOTE_APP_LOG_DIR));
> {code}
> So the remoteRootLogDir is only aware of the 
> yarn.nodemanager.remote-app-log-dir config, while other file formats, like 
> IFile, consult the file-format-specific config first, giving it higher 
> priority.
> From {{LogAggregationIndexedFileController$initInternal}}:
> {code:java}
> String remoteDirStr = String.format(
> YarnConfiguration.LOG_AGGREGATION_REMOTE_APP_LOG_DIR_FMT,
> this.fileControllerName);
> String remoteDir = conf.get(remoteDirStr);
> if (remoteDir == null || remoteDir.isEmpty()) {
>   remoteDir = conf.get(YarnConfiguration.NM_REMOTE_APP_LOG_DIR,
>   YarnConfiguration.DEFAULT_NM_REMOTE_APP_LOG_DIR);
> }
> {code}
> (Where these configs are: )
> {code:java}
> public static final String LOG_AGGREGATION_REMOTE_APP_LOG_DIR_FMT
>   = YARN_PREFIX + "log-aggregation.%s.remote-app-log-dir";
> public static final String NM_REMOTE_APP_LOG_DIR = 
> NM_PREFIX + "remote-app-log-dir";
> {code}
> I suggest TFile should try to obtain the remote dir config from 
> yarn.log-aggregation.TFile.remote-app-log-dir first, and only if that is not 
> specified fall back to the yarn.nodemanager.remote-app-log-dir config; a 
> sketch of that lookup order follows.
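
A hedged sketch of that lookup order for 
LogAggregationTFileController$initInternal, mirroring the IFile code quoted 
above (an assumed shape, not necessarily the committed patch):

{code:java}
// Prefer yarn.log-aggregation.TFile.remote-app-log-dir, then fall back to
// yarn.nodemanager.remote-app-log-dir, as the IFile controller already does.
String remoteDirKey = String.format(
    YarnConfiguration.LOG_AGGREGATION_REMOTE_APP_LOG_DIR_FMT,
    this.fileControllerName);               // "TFile" for this controller
String remoteDir = conf.get(remoteDirKey);
if (remoteDir == null || remoteDir.isEmpty()) {
  remoteDir = conf.get(YarnConfiguration.NM_REMOTE_APP_LOG_DIR,
      YarnConfiguration.DEFAULT_NM_REMOTE_APP_LOG_DIR);
}
this.remoteRootLogDir = new Path(remoteDir);
{code}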



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org