[jira] [Commented] (YARN-9554) TimelineEntity DAO has java.util.Set interface which JAXB can't handle
[ https://issues.apache.org/jira/browse/YARN-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840986#comment-16840986 ]

Rohith Sharma K S commented on YARN-9554:
-----------------------------------------

ATSv2 has separate TimelineEntity and TimelineEntities classes under a separate package, i.e. org.apache.hadoop.yarn.api.records.timelineservice, and the v1 and v2 clients are different. So it shouldn't be a problem.

> TimelineEntity DAO has java.util.Set interface which JAXB can't handle
> ----------------------------------------------------------------------
>
>                 Key: YARN-9554
>                 URL: https://issues.apache.org/jira/browse/YARN-9554
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: timelineservice
>    Affects Versions: 3.3.0
>            Reporter: Prabhu Joseph
>            Assignee: Prabhu Joseph
>            Priority: Major
>         Attachments: YARN-9554-001.patch
>
> TimelineEntity DAO has a java.util.Set interface which JAXB can't handle. This
> breaks the fix of YARN-7266.
> {code}
> Caused by: com.sun.xml.internal.bind.v2.runtime.IllegalAnnotationsException:
> 1 counts of IllegalAnnotationExceptions
> java.util.Set is an interface, and JAXB can't handle interfaces.
> this problem is related to the following location:
>     at java.util.Set
>     at public java.util.HashMap org.apache.hadoop.yarn.api.records.timeline.TimelineEntity.getPrimaryFiltersJAXB()
>     at org.apache.hadoop.yarn.api.records.timeline.TimelineEntity
>     at public java.util.List org.apache.hadoop.yarn.api.records.timeline.TimelineEntities.getEntities()
>     at org.apache.hadoop.yarn.api.records.timeline.TimelineEntities
>     at com.sun.xml.internal.bind.v2.runtime.IllegalAnnotationsException$Builder.check(IllegalAnnotationsException.java:91)
>     at com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.getTypeInfoSet(JAXBContextImpl.java:445)
>     at com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:277)
>     at com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:124)
>     at com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl$JAXBContextBuilder.build(JAXBContextImpl.java:1123)
>     at com.sun.xml.internal.bind.v2.ContextFactory.createContext(ContextFactory.java:147)
> {code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
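The exception above is the stock JAXB complaint: the binder refuses property types that are interfaces, such as java.util.Set, because it cannot decide which concrete class to instantiate while unmarshalling. A minimal sketch of the usual workaround (expose only concrete collection types on the JAXB-facing getter); the class and method names below are illustrative, not the actual YARN-9554 patch:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Sketch of the concrete-collection workaround for "java.util.Set is an
// interface, and JAXB can't handle interfaces". Names are illustrative;
// the real change is whatever YARN-9554-001.patch does to TimelineEntity.
public class JaxbSetWorkaround {
    // Internal model keeps the convenient interface types.
    private final Map<String, Set<String>> primaryFilters = new HashMap<>();

    public void addPrimaryFilter(String key, String value) {
        primaryFilters.computeIfAbsent(key, k -> new TreeSet<>()).add(value);
    }

    // A JAXB-facing getter returns only concrete classes (HashMap, ArrayList)
    // that the binder can instantiate, instead of Map/Set interfaces.
    public HashMap<String, ArrayList<String>> getPrimaryFiltersJAXB() {
        HashMap<String, ArrayList<String>> out = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : primaryFilters.entrySet()) {
            out.put(e.getKey(), new ArrayList<>(e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        JaxbSetWorkaround dao = new JaxbSetWorkaround();
        dao.addPrimaryFilter("user", "alice");
        System.out.println(dao.getPrimaryFiltersJAXB()); // {user=[alice]}
    }
}
```

The internal representation stays a Set (so duplicates are still collapsed); only the marshalling boundary is converted.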
[jira] [Commented] (YARN-9555) Yarn Docs : single cluster yarn setup - Step 1 configure parameters - multiple roots
[ https://issues.apache.org/jira/browse/YARN-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840981#comment-16840981 ]

Prabhu Joseph commented on YARN-9555:
-------------------------------------

Thanks [~ajisakaa].

> Yarn Docs : single cluster yarn setup - Step 1 configure parameters - multiple roots
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-9555
>                 URL: https://issues.apache.org/jira/browse/YARN-9555
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 3.0.2
>            Reporter: Vishva
>            Priority: Minor
>
> Step 1 of
> [https://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-common/SingleCluster.html#YARN_on_Single_Node]
> says to configure parameters as follows in {{etc/hadoop/mapred-site.xml}},
> with two separate {{<configuration>}} roots:
> {code:xml}
> <configuration>
>     <property>
>         <name>mapreduce.framework.name</name>
>         <value>yarn</value>
>     </property>
> </configuration>
> <configuration>
>     <property>
>         <name>mapreduce.application.classpath</name>
>         <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
>     </property>
> </configuration>
> {code}
> but setting this will throw an error when running YARN:
> {noformat}
> 2019-05-14 16:32:05,815 ERROR org.apache.hadoop.conf.Configuration: error parsing conf mapred-site.xml
> com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple roots (start tag in epilog?).
> {noformat}
> This should be modified to a single root:
> {code:xml}
> <configuration>
>     <property>
>         <name>mapreduce.framework.name</name>
>         <value>yarn</value>
>     </property>
>     <property>
>         <name>mapreduce.application.classpath</name>
>         <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
>     </property>
> </configuration>
> {code}
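The "multiple roots" failure above is plain XML well-formedness, not anything Hadoop-specific: a document may have exactly one root element, so two sibling {{<configuration>}} blocks can never parse. A small standalone check with the JDK's built-in parser (class name is illustrative):

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

// Demonstrates why two sibling <configuration> blocks break Hadoop's
// config loader: XML allows exactly one root element per document.
public class MultipleRootsDemo {
    static boolean parses(String xml) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (Exception e) {
            return false; // e.g. SAXParseException: content is not allowed after the root element
        }
    }

    public static void main(String[] args) {
        String twoRoots = "<configuration/><configuration/>";
        String oneRoot  = "<configuration><property><name>mapreduce.framework.name</name>"
                        + "<value>yarn</value></property></configuration>";
        System.out.println(parses(twoRoots)); // false
        System.out.println(parses(oneRoot));  // true
    }
}
```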
[jira] [Commented] (YARN-9555) Yarn Docs : single cluster yarn setup - Step 1 configure parameters - multiple roots
[ https://issues.apache.org/jira/browse/YARN-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840972#comment-16840972 ]

Akira Ajisaka commented on YARN-9555:
-------------------------------------

IMO, it is not possible. We need to release 3.2.1 instead.

BTW, the latest documentation is available at https://aajisaka.github.io/hadoop-document/hadoop-project/ (my daily, unofficial documentation build of Hadoop trunk).
[jira] [Commented] (YARN-9555) Yarn Docs : single cluster yarn setup - Step 1 configure parameters - multiple roots
[ https://issues.apache.org/jira/browse/YARN-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840968#comment-16840968 ]

Prabhu Joseph commented on YARN-9555:
-------------------------------------

Thanks [~ajisakaa] for the clarification. One more doubt: since users are referring to the 3.2.0 document, is it not possible to change the released 3.2.0 document now?
[jira] [Commented] (YARN-9559) Create AbstractContainersLauncher for pluggable ContainersLauncher logic
[ https://issues.apache.org/jira/browse/YARN-9559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840921#comment-16840921 ]

Hadoop QA commented on YARN-9559:
---------------------------------

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 8m 56s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 38s | Maven dependency ordering for branch |
| +1 | mvninstall | 16m 51s | trunk passed |
| +1 | compile | 8m 48s | trunk passed |
| +1 | checkstyle | 1m 6s | trunk passed |
| +1 | mvnsite | 1m 24s | trunk passed |
| +1 | shadedclient | 13m 20s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 40s | trunk passed |
| +1 | javadoc | 1m 13s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 17s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 8s | the patch passed |
| +1 | compile | 7m 50s | the patch passed |
| +1 | javac | 7m 50s | the patch passed |
| -0 | checkstyle | 1m 15s | hadoop-yarn-project/hadoop-yarn: The patch generated 7 new + 318 unchanged - 2 fixed = 325 total (was 320) |
| +1 | mvnsite | 1m 35s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 51s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 35s | the patch passed |
| +1 | javadoc | 0m 57s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 0m 48s | hadoop-yarn-api in the patch failed. |
| -1 | unit | 20m 46s | hadoop-yarn-server-nodemanager in the patch failed. |
| +1 | asflicense | 0m 36s | The patch does not generate ASF License warnings. |
| | | 104m 14s | |

|| Reason || Tests ||
| Failed junit tests | TEST-TestYarnConfigurationFields |
| | hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService |
| | hadoop.yarn.server.nodemanager.webapp.TestNMWebServices |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9559 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12968849/YARN-9559.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 09d72362d82d 4.4.0-139-generic
[jira] [Resolved] (YARN-9555) Yarn Docs : single cluster yarn setup - Step 1 configure parameters - multiple roots
[ https://issues.apache.org/jira/browse/YARN-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira Ajisaka resolved YARN-9555.
---------------------------------
    Resolution: Duplicate

Closing this as a duplicate.
[jira] [Commented] (YARN-9555) Yarn Docs : single cluster yarn setup - Step 1 configure parameters - multiple roots
[ https://issues.apache.org/jira/browse/YARN-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840903#comment-16840903 ]

Akira Ajisaka commented on YARN-9555:
-------------------------------------

The fix version of MAPREDUCE-7165 is 3.2.1, therefore the fix is not reflected in the 3.2.0 document. In 3.1.2, the document is fixed:
https://hadoop.apache.org/docs/r3.1.2/hadoop-project-dist/hadoop-common/SingleCluster.html#YARN_on_Single_Node
[jira] [Commented] (YARN-9559) Create AbstractContainersLauncher for pluggable ContainersLauncher logic
[ https://issues.apache.org/jira/browse/YARN-9559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840877#comment-16840877 ]

Jonathan Hung commented on YARN-9559:
-------------------------------------

Attached the 001 patch, which creates the AbstractContainersLauncher class.

> Create AbstractContainersLauncher for pluggable ContainersLauncher logic
> -------------------------------------------------------------------------
>
>                 Key: YARN-9559
>                 URL: https://issues.apache.org/jira/browse/YARN-9559
>             Project: Hadoop YARN
>          Issue Type: Task
>            Reporter: Jonathan Hung
>            Assignee: Jonathan Hung
>            Priority: Major
>         Attachments: YARN-9559.001.patch
[jira] [Updated] (YARN-9559) Create AbstractContainersLauncher for pluggable ContainersLauncher logic
[ https://issues.apache.org/jira/browse/YARN-9559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hung updated YARN-9559:
--------------------------------
    Attachment: YARN-9559.001.patch
[jira] [Created] (YARN-9559) Create AbstractContainersLauncher for pluggable ContainersLauncher logic
Jonathan Hung created YARN-9559:
-----------------------------------

             Summary: Create AbstractContainersLauncher for pluggable ContainersLauncher logic
                 Key: YARN-9559
                 URL: https://issues.apache.org/jira/browse/YARN-9559
             Project: Hadoop YARN
          Issue Type: Task
            Reporter: Jonathan Hung
            Assignee: Jonathan Hung
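A rough sketch of what "pluggable ContainersLauncher logic" typically means: an abstract base class plus a factory that instantiates whichever subclass a configuration names. All names below are hypothetical; the actual API is defined by YARN-9559.001.patch, not here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shape of a pluggable launcher, loosely modeled on the JIRA
// summary. These class names are illustrative, not the YARN-9559 API.
abstract class AbstractLauncherSketch {
    final List<String> launched = new ArrayList<>();

    // Subclasses decide how a container launch request is handled.
    abstract void handle(String containerId);
}

class DefaultLauncherSketch extends AbstractLauncherSketch {
    @Override void handle(String containerId) {
        launched.add(containerId); // default behavior: record and "launch"
    }
}

public class LauncherFactory {
    // Reflectively create whichever launcher the configuration names,
    // mirroring how pluggable components are commonly wired in Hadoop.
    static AbstractLauncherSketch create(String className) {
        try {
            return (AbstractLauncherSketch)
                Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate launcher " + className, e);
        }
    }

    public static void main(String[] args) {
        AbstractLauncherSketch launcher = create("DefaultLauncherSketch");
        launcher.handle("container_1557237478804_0001_01_000001");
        System.out.println(launcher.launched.size()); // 1
    }
}
```

The abstract-class-plus-factory split is what makes the launcher "pluggable": callers depend only on the base type, and the concrete class is chosen at runtime from configuration.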
[jira] [Commented] (YARN-9552) FairScheduler: NODE_UPDATE can cause NoSuchElementException
[ https://issues.apache.org/jira/browse/YARN-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840679#comment-16840679 ]

Hudson commented on YARN-9552:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16554 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/16554/])
YARN-9552. FairScheduler: NODE_UPDATE can cause NoSuchElementException. (gifuma: rev 55bd35921c2bb013e45120bbd1602b658b8b999b)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFSAppAttempt.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java

> FairScheduler: NODE_UPDATE can cause NoSuchElementException
> -----------------------------------------------------------
>
>                 Key: YARN-9552
>                 URL: https://issues.apache.org/jira/browse/YARN-9552
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Major
>             Fix For: 3.3.0
>
>         Attachments: YARN-9552-001.patch, YARN-9552-002.patch, YARN-9552-003.patch, YARN-9552-004.patch
>
> We observed a race condition inside YARN with the following stack trace:
> {noformat}
> 18/11/07 06:45:09.559 SchedulerEventDispatcher:Event Processor ERROR EventDispatcher: Error in handling event type NODE_UPDATE to the Event Dispatcher
> java.util.NoSuchElementException
>     at java.util.concurrent.ConcurrentSkipListMap.firstKey(ConcurrentSkipListMap.java:2036)
>     at java.util.concurrent.ConcurrentSkipListSet.first(ConcurrentSkipListSet.java:396)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.getNextPendingAsk(AppSchedulingInfo.java:373)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.isOverAMShareLimit(FSAppAttempt.java:941)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:1373)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:353)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:204)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1094)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:961)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1183)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:132)
>     at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
>     at java.lang.Thread.run(Thread.java:748)
> {noformat}
> This is basically the same as the one described in YARN-7382, but the root cause is different.
> When we create an application attempt, we create an {{FSAppAttempt}} object. This contains an {{AppSchedulingInfo}}, which contains a set of {{SchedulerRequestKey}}. Initially, this set is empty and only initialized a bit later on a separate thread during a state transition:
> {noformat}
> 2019-05-07 15:58:02,659 INFO  [RM StateStore dispatcher] recovery.RMStateStore (RMStateStore.java:transition(239)) - Storing info for app: application_1557237478804_0001
> 2019-05-07 15:58:02,684 INFO  [RM Event dispatcher] rmapp.RMAppImpl (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change from NEW_SAVING to SUBMITTED on event = APP_NEW_SAVED
> 2019-05-07 15:58:02,690 INFO  [SchedulerEventDispatcher:Event Processor] fair.FairScheduler (FairScheduler.java:addApplication(490)) - Accepted application application_1557237478804_0001 from user: bacskop, in queue: root.bacskop, currently num of applications: 1
> 2019-05-07 15:58:02,698 INFO  [RM Event dispatcher] rmapp.RMAppImpl (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change from SUBMITTED to ACCEPTED on event = APP_ACCEPTED
> 2019-05-07 15:58:02,731 INFO  [RM Event dispatcher] resourcemanager.ApplicationMasterService (ApplicationMasterService.java:registerAppAttempt(434)) - Registering app attempt : appattempt_1557237478804_0001_01
> 2019-05-07 15:58:02,732 INFO  [RM Event dispatcher] attempt.RMAppAttemptImpl
[jira] [Commented] (YARN-9552) FairScheduler: NODE_UPDATE can cause NoSuchElementException
[ https://issues.apache.org/jira/browse/YARN-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840663#comment-16840663 ]

Giovanni Matteo Fumarola commented on YARN-9552:
------------------------------------------------

Committed [^YARN-9552-004.patch] to trunk. The patch looks good. Thanks [~pbacsko] for working on this and [~snemeth] for the initial review.
[jira] [Updated] (YARN-9552) FairScheduler: NODE_UPDATE can cause NoSuchElementException
[ https://issues.apache.org/jira/browse/YARN-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-9552: --- Fix Version/s: 3.3.0 > FairScheduler: NODE_UPDATE can cause NoSuchElementException > --- > > Key: YARN-9552 > URL: https://issues.apache.org/jira/browse/YARN-9552 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9552-001.patch, YARN-9552-002.patch, > YARN-9552-003.patch, YARN-9552-004.patch > > > We observed a race condition inside YARN with the following stack trace: > {noformat} > 18/11/07 06:45:09.559 SchedulerEventDispatcher:Event Processor ERROR > EventDispatcher: Error in handling event type NODE_UPDATE to the Event > Dispatcher > java.util.NoSuchElementException > at > java.util.concurrent.ConcurrentSkipListMap.firstKey(ConcurrentSkipListMap.java:2036) > at > java.util.concurrent.ConcurrentSkipListSet.first(ConcurrentSkipListSet.java:396) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.getNextPendingAsk(AppSchedulingInfo.java:373) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.isOverAMShareLimit(FSAppAttempt.java:941) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:1373) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:353) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:204) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1094) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:961) > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1183) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:132) > at > org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66) > at java.lang.Thread.run(Thread.java:748) > {noformat} > This is basically the same as the one described in YARN-7382, but the root > cause is different. > When we create an application attempt, we create an {{FSAppAttempt}} object. > This contains an {{AppSchedulingInfo}} which contains a set of > {{SchedulerRequestKey}}. Initially, this set is empty and only initialized a > bit later on a separate thread during a state transition: > {noformat} > 2019-05-07 15:58:02,659 INFO [RM StateStore dispatcher] > recovery.RMStateStore (RMStateStore.java:transition(239)) - Storing info for > app: application_1557237478804_0001 > 2019-05-07 15:58:02,684 INFO [RM Event dispatcher] rmapp.RMAppImpl > (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change > from NEW_SAVING to SUBMITTED on event = APP_NEW_SAVED > 2019-05-07 15:58:02,690 INFO [SchedulerEventDispatcher:Event Processor] > fair.FairScheduler (FairScheduler.java:addApplication(490)) - Accepted > application application_1557237478804_0001 from user: bacskop, in queue: > root.bacskop, currently num of applications: 1 > 2019-05-07 15:58:02,698 INFO [RM Event dispatcher] rmapp.RMAppImpl > (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change > from SUBMITTED to ACCEPTED on event = APP_ACCEPTED > 2019-05-07 15:58:02,731 INFO [RM Event dispatcher] > resourcemanager.ApplicationMasterService > (ApplicationMasterService.java:registerAppAttempt(434)) - Registering app > attempt : appattempt_1557237478804_0001_01 > 2019-05-07 15:58:02,732 INFO [RM Event dispatcher] attempt.RMAppAttemptImpl > (RMAppAttemptImpl.java:handle(920)) - appattempt_1557237478804_0001_01 > State change 
from NEW to SUBMITTED on event = START > 2019-05-07 15:58:02,746 INFO [SchedulerEventDispatcher:Event Processor] > scheduler.SchedulerApplicationAttempt > (SchedulerApplicationAttempt.java:(207)) - *** In the constructor of > SchedulerApplicationAttempt > 2019-05-07 15:58:02,747 INFO [SchedulerEventDispatcher:Event Processor] > scheduler.SchedulerApplicationAttempt > (SchedulerApplicationAttempt.java:(230)) - *** Contents of > appSchedulingInfo: [] > 2019-05-07 15:58:02,752 INFO [SchedulerEventDispatcher:Event Processor] > fair.FairScheduler (FairScheduler.java:addApplicationAttempt(546)) - Added > Application Attempt appattempt_1557237478804_0001_01 to scheduler from > user: bacskop >
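The NoSuchElementException in the stack trace above comes from calling ConcurrentSkipListSet.first() on a set that another thread has not yet populated. A minimal sketch of that failure mode and an empty-check guard follows; the class and method names are illustrative only, not taken from the YARN-9552 patch:

```java
import java.util.concurrent.ConcurrentSkipListSet;

// Illustrative sketch (not the YARN-9552 patch): ConcurrentSkipListSet.first()
// throws NoSuchElementException on an empty set, which is what the trace
// above shows when NODE_UPDATE races with app attempt initialization.
public class SafeFirst {

    // Guard before calling first(). Note that isEmpty() + first() is itself
    // racy without external synchronization; the point here is only to
    // demonstrate the behavior of first() on an empty set.
    static Integer firstOrNull(ConcurrentSkipListSet<Integer> keys) {
        return keys.isEmpty() ? null : keys.first();
    }

    public static void main(String[] args) {
        ConcurrentSkipListSet<Integer> schedulerKeys = new ConcurrentSkipListSet<>();
        System.out.println(firstOrNull(schedulerKeys)); // empty set: no exception thrown
        schedulerKeys.add(42);
        System.out.println(firstOrNull(schedulerKeys));
    }
}
```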
[jira] [Commented] (YARN-9555) Yarn Docs : single cluster yarn setup - Step 1 configure parameters - multiple roots
[ https://issues.apache.org/jira/browse/YARN-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840657#comment-16840657 ] Prabhu Joseph commented on YARN-9555: - [~Vishva001] Looks like the document has already been fixed by MAPREDUCE-7165. Not sure when the published documentation will reflect the new changes. [~ajisakaa] Can you check this one? Thanks. > Yarn Docs : single cluster yarn setup - Step 1 configure parameters - > multiple roots > > > Key: YARN-9555 > URL: https://issues.apache.org/jira/browse/YARN-9555 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.0.2 >Reporter: Vishva >Priority: Minor > > Step 1 for > [https://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-common/SingleCluster.html#YARN_on_Single_Node] > > Configure parameters as follows: > {{etc/hadoop/mapred-site.xml}}:
> {code:xml}
> <configuration>
>     <property>
>         <name>mapreduce.framework.name</name>
>         <value>yarn</value>
>     </property>
> </configuration>
> <configuration>
>     <property>
>         <name>mapreduce.application.classpath</name>
>         <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
>     </property>
> </configuration>
> {code}
> but setting this will throw an error when running yarn:
> {noformat}
> 2019-05-14 16:32:05,815 ERROR org.apache.hadoop.conf.Configuration: error parsing conf mapred-site.xml
> com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple roots (start tag in epilog?).
> {noformat}
> This should be modified to
> {code:xml}
> <configuration>
>     <property>
>         <name>mapreduce.framework.name</name>
>         <value>yarn</value>
>     </property>
>     <property>
>         <name>mapreduce.application.classpath</name>
>         <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
>     </property>
> </configuration>
> {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9554) TimelineEntity DAO has java.util.Set interface which JAXB can't handle
[ https://issues.apache.org/jira/browse/YARN-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840600#comment-16840600 ] Eric Yang commented on YARN-9554: - [~Prabhu Joseph], the error handling logs the exception to stderr instead of the log files. Can e.printStackTrace(); be changed to a log statement? I am not familiar with Timeline Server 2 internals. [~rohithsharma], do you know if we would run into problems by dropping the TimelineEntity and TimelineEntities objects? Would this prevent the timeline server from receiving certain events? > TimelineEntity DAO has java.util.Set interface which JAXB can't handle > -- > > Key: YARN-9554 > URL: https://issues.apache.org/jira/browse/YARN-9554 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineservice >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9554-001.patch > > > TimelineEntity DAO has java.util.Set interface which JAXB can't handle. This > breaks the fix of YARN-7266. > {code} > Caused by: com.sun.xml.internal.bind.v2.runtime.IllegalAnnotationsException: > 1 counts of IllegalAnnotationExceptions > java.util.Set is an interface, and JAXB can't handle interfaces.
> this problem is related to the following location: > at java.util.Set > at public java.util.HashMap > org.apache.hadoop.yarn.api.records.timeline.TimelineEntity.getPrimaryFiltersJAXB() > at org.apache.hadoop.yarn.api.records.timeline.TimelineEntity > at public java.util.List > org.apache.hadoop.yarn.api.records.timeline.TimelineEntities.getEntities() > at org.apache.hadoop.yarn.api.records.timeline.TimelineEntities > at > com.sun.xml.internal.bind.v2.runtime.IllegalAnnotationsException$Builder.check(IllegalAnnotationsException.java:91) > at > com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.getTypeInfoSet(JAXBContextImpl.java:445) > at > com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:277) > at > com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:124) > at > com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl$JAXBContextBuilder.build(JAXBContextImpl.java:1123) > at > com.sun.xml.internal.bind.v2.ContextFactory.createContext(ContextFactory.java:147) > {code}
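The IllegalAnnotationsException above means JAXB refuses to bind a property whose declared type is the java.util.Set interface. A common workaround, sketched below with made-up names (this is a pattern illustration, not necessarily what YARN-9554-001.patch does), is to return a concrete collection type such as HashSet from the JAXB-visible getter:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the workaround pattern: keep the field typed as the Set
// interface internally, but have the JAXB-visible getter return a concrete
// HashSet, which JAXB can instantiate and bind.
public class EntityDemo {
    private Set<String> primaryFilters = new HashSet<>();

    public void addFilter(String filter) {
        primaryFilters.add(filter);
    }

    // In a real DAO this getter would also carry JAXB annotations such as
    // @XmlElement; the essential part is the concrete return type.
    public HashSet<String> getPrimaryFiltersJAXB() {
        return new HashSet<>(primaryFilters);
    }

    public static void main(String[] args) {
        EntityDemo demo = new EntityDemo();
        demo.addFilter("user");
        System.out.println(demo.getPrimaryFiltersJAXB());
    }
}
```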
[jira] [Updated] (YARN-9558) Log Aggregation testcases failing
[ https://issues.apache.org/jira/browse/YARN-9558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9558: Description: Test cases related to Log Aggregation from below classes are failing hadoop.yarn.server.nodemanager.webapp.TestNMWebServices hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices hadoop.yarn.client.cli.TestLogsCLI was: TestAHSWebServices testcases failing. {code:java} [ERROR] TestAHSWebServices.testContainerLogsForFinishedApps:570 [ERROR] TestAHSWebServices.testContainerLogsForFinishedApps:570 [ERROR] TestAHSWebServices.testContainerLogsForRunningApps:777 [ERROR] TestAHSWebServices.testContainerLogsForRunningApps:777 [ERROR] Errors: [ERROR] TestAHSWebServices.testContainerLogsMetaForFinishedApps:942 » WebApplication j... [ERROR] TestAHSWebServices.testContainerLogsMetaForFinishedApps:942 » WebApplication j... [ERROR] TestAHSWebServices.testContainerLogsMetaForRunningApps:875 » WebApplication ja... [ERROR] TestAHSWebServices.testContainerLogsMetaForRunningApps:875 » WebApplication ja... {code} > Log Aggregation testcases failing > - > > Key: YARN-9558 > URL: https://issues.apache.org/jira/browse/YARN-9558 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation, test >Affects Versions: 3.3.0, 3.2.1, 3.1.3 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > > Test cases related to Log Aggregation from below classes are failing > hadoop.yarn.server.nodemanager.webapp.TestNMWebServices > hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService > > hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices > hadoop.yarn.client.cli.TestLogsCLI
[jira] [Updated] (YARN-9558) Log Aggregation testcases failing
[ https://issues.apache.org/jira/browse/YARN-9558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9558: Component/s: (was: timelineservice) log-aggregation > Log Aggregation testcases failing > - > > Key: YARN-9558 > URL: https://issues.apache.org/jira/browse/YARN-9558 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation, test >Affects Versions: 3.3.0, 3.2.1, 3.1.3 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > > TestAHSWebServices testcases failing. > {code:java} > [ERROR] TestAHSWebServices.testContainerLogsForFinishedApps:570 > [ERROR] TestAHSWebServices.testContainerLogsForFinishedApps:570 > [ERROR] TestAHSWebServices.testContainerLogsForRunningApps:777 > [ERROR] TestAHSWebServices.testContainerLogsForRunningApps:777 > [ERROR] Errors: > [ERROR] TestAHSWebServices.testContainerLogsMetaForFinishedApps:942 » > WebApplication j... > [ERROR] TestAHSWebServices.testContainerLogsMetaForFinishedApps:942 » > WebApplication j... > [ERROR] TestAHSWebServices.testContainerLogsMetaForRunningApps:875 » > WebApplication ja... > [ERROR] TestAHSWebServices.testContainerLogsMetaForRunningApps:875 » > WebApplication ja... > {code}
[jira] [Updated] (YARN-9558) Log Aggregation testcases failing
[ https://issues.apache.org/jira/browse/YARN-9558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9558: Summary: Log Aggregation testcases failing (was: TestAHSWebServices testcases failing) > Log Aggregation testcases failing > - > > Key: YARN-9558 > URL: https://issues.apache.org/jira/browse/YARN-9558 > Project: Hadoop YARN > Issue Type: Bug > Components: test, timelineservice >Affects Versions: 3.3.0, 3.2.1, 3.1.3 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > > TestAHSWebServices testcases failing. > {code:java} > [ERROR] TestAHSWebServices.testContainerLogsForFinishedApps:570 > [ERROR] TestAHSWebServices.testContainerLogsForFinishedApps:570 > [ERROR] TestAHSWebServices.testContainerLogsForRunningApps:777 > [ERROR] TestAHSWebServices.testContainerLogsForRunningApps:777 > [ERROR] Errors: > [ERROR] TestAHSWebServices.testContainerLogsMetaForFinishedApps:942 » > WebApplication j... > [ERROR] TestAHSWebServices.testContainerLogsMetaForFinishedApps:942 » > WebApplication j... > [ERROR] TestAHSWebServices.testContainerLogsMetaForRunningApps:875 » > WebApplication ja... > [ERROR] TestAHSWebServices.testContainerLogsMetaForRunningApps:875 » > WebApplication ja... > {code}
[jira] [Commented] (YARN-9360) Do not expose innards of QueueMetrics object into FSLeafQueue#computeMaxAMResource
[ https://issues.apache.org/jira/browse/YARN-9360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840504#comment-16840504 ] Peter Bacsko commented on YARN-9360: [~snemeth] please check whether the test failures are related. Also fix the new checkstyle issue. > Do not expose innards of QueueMetrics object into > FSLeafQueue#computeMaxAMResource > -- > > Key: YARN-9360 > URL: https://issues.apache.org/jira/browse/YARN-9360 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-9360.001.patch > > > This is a follow-up for YARN-9323, covering required changes as discussed > with [~templedf] earlier. > After YARN-9323, > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue#computeMaxAMResource > gets the QueueMetricsForCustomResources object from > scheduler.getRootQueueMetrics(). > Instead, we should use a "fill-in" method in QueueMetrics that receives a > Resource and fills in custom resource values if they are non-zero.
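The "fill-in" method described in the YARN-9360 issue above can be sketched as follows; all names here are illustrative and are not taken from the actual patch:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a "fill-in" method: rather than exposing its
// internal custom-resource metrics object, the metrics class copies any
// non-zero custom resource values into a structure supplied by the caller.
public class QueueMetricsSketch {
    private final Map<String, Long> customResourceValues = new HashMap<>();

    public void setCustomResource(String name, long value) {
        customResourceValues.put(name, value);
    }

    // Fills in only non-zero values, so callers never see the innards.
    public void fillInCustomResources(Map<String, Long> target) {
        for (Map.Entry<String, Long> entry : customResourceValues.entrySet()) {
            if (entry.getValue() != 0L) {
                target.put(entry.getKey(), entry.getValue());
            }
        }
    }

    public static void main(String[] args) {
        QueueMetricsSketch metrics = new QueueMetricsSketch();
        metrics.setCustomResource("yarn.io/gpu", 2L);
        metrics.setCustomResource("yarn.io/fpga", 0L);
        Map<String, Long> view = new HashMap<>();
        metrics.fillInCustomResources(view);
        System.out.println(view); // only the non-zero resource appears
    }
}
```

The caller owns the target map, so the internal metrics representation never leaks out of the class.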
[jira] [Commented] (YARN-9552) FairScheduler: NODE_UPDATE can cause NoSuchElementException
[ https://issues.apache.org/jira/browse/YARN-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840475#comment-16840475 ] Hadoop QA commented on YARN-9552: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 58s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 55s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 80m 34s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}127m 40s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e | | JIRA Issue | YARN-9552 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12968787/YARN-9552-004.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 693c2ec01b77 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 570fa2d | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24094/testReport/ | | Max. process+thread count | 862 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24094/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > FairScheduler: NODE_UPDATE can cause
[jira] [Commented] (YARN-9554) TimelineEntity DAO has java.util.Set interface which JAXB can't handle
[ https://issues.apache.org/jira/browse/YARN-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840360#comment-16840360 ] Prabhu Joseph commented on YARN-9554: - The failed test cases from TestAHSWebServices are not related and will be fixed by YARN-9558. [~eyang] Can you review this Jira when you get time? This is a follow-up fix for YARN-7266. Thanks. > TimelineEntity DAO has java.util.Set interface which JAXB can't handle > -- > > Key: YARN-9554 > URL: https://issues.apache.org/jira/browse/YARN-9554 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineservice >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9554-001.patch > > > TimelineEntity DAO has java.util.Set interface which JAXB can't handle. This > breaks the fix of YARN-7266. > {code} > Caused by: com.sun.xml.internal.bind.v2.runtime.IllegalAnnotationsException: > 1 counts of IllegalAnnotationExceptions > java.util.Set is an interface, and JAXB can't handle interfaces.
> this problem is related to the following location: > at java.util.Set > at public java.util.HashMap > org.apache.hadoop.yarn.api.records.timeline.TimelineEntity.getPrimaryFiltersJAXB() > at org.apache.hadoop.yarn.api.records.timeline.TimelineEntity > at public java.util.List > org.apache.hadoop.yarn.api.records.timeline.TimelineEntities.getEntities() > at org.apache.hadoop.yarn.api.records.timeline.TimelineEntities > at > com.sun.xml.internal.bind.v2.runtime.IllegalAnnotationsException$Builder.check(IllegalAnnotationsException.java:91) > at > com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.getTypeInfoSet(JAXBContextImpl.java:445) > at > com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:277) > at > com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:124) > at > com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl$JAXBContextBuilder.build(JAXBContextImpl.java:1123) > at > com.sun.xml.internal.bind.v2.ContextFactory.createContext(ContextFactory.java:147) > {code}
[jira] [Commented] (YARN-9554) TimelineEntity DAO has java.util.Set interface which JAXB can't handle
[ https://issues.apache.org/jira/browse/YARN-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840355#comment-16840355 ] Hadoop QA commented on YARN-9554: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 25s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 40s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 3m 23s{color} | {color:red} hadoop-yarn-server-applicationhistoryservice in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 50m 30s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e | | JIRA Issue | YARN-9554 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12968784/YARN-9554-001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 191235cae53a 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 570fa2d | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/24093/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-applicationhistoryservice.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24093/testReport/ | | Max. process+thread count | 445 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice U:
[jira] [Commented] (YARN-9552) FairScheduler: NODE_UPDATE can cause NoSuchElementException
[ https://issues.apache.org/jira/browse/YARN-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840350#comment-16840350 ] Peter Bacsko commented on YARN-9552: Thanks Szilard. Ok, I uploaded patch v4 just to make checkstyle happy :) > FairScheduler: NODE_UPDATE can cause NoSuchElementException > --- > > Key: YARN-9552 > URL: https://issues.apache.org/jira/browse/YARN-9552 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9552-001.patch, YARN-9552-002.patch, > YARN-9552-003.patch, YARN-9552-004.patch > > > We observed a race condition inside YARN with the following stack trace: > {noformat} > 18/11/07 06:45:09.559 SchedulerEventDispatcher:Event Processor ERROR > EventDispatcher: Error in handling event type NODE_UPDATE to the Event > Dispatcher > java.util.NoSuchElementException > at > java.util.concurrent.ConcurrentSkipListMap.firstKey(ConcurrentSkipListMap.java:2036) > at > java.util.concurrent.ConcurrentSkipListSet.first(ConcurrentSkipListSet.java:396) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.getNextPendingAsk(AppSchedulingInfo.java:373) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.isOverAMShareLimit(FSAppAttempt.java:941) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:1373) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:353) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:204) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1094) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:961) > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1183) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:132) > at > org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66) > at java.lang.Thread.run(Thread.java:748) > {noformat} > This is basically the same as the one described in YARN-7382, but the root > cause is different. > When we create an application attempt, we create an {{FSAppAttempt}} object. > This contains an {{AppSchedulingInfo}} which contains a set of > {{SchedulerRequestKey}}. Initially, this set is empty and only initialized a > bit later on a separate thread during a state transition: > {noformat} > 2019-05-07 15:58:02,659 INFO [RM StateStore dispatcher] > recovery.RMStateStore (RMStateStore.java:transition(239)) - Storing info for > app: application_1557237478804_0001 > 2019-05-07 15:58:02,684 INFO [RM Event dispatcher] rmapp.RMAppImpl > (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change > from NEW_SAVING to SUBMITTED on event = APP_NEW_SAVED > 2019-05-07 15:58:02,690 INFO [SchedulerEventDispatcher:Event Processor] > fair.FairScheduler (FairScheduler.java:addApplication(490)) - Accepted > application application_1557237478804_0001 from user: bacskop, in queue: > root.bacskop, currently num of applications: 1 > 2019-05-07 15:58:02,698 INFO [RM Event dispatcher] rmapp.RMAppImpl > (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change > from SUBMITTED to ACCEPTED on event = APP_ACCEPTED > 2019-05-07 15:58:02,731 INFO [RM Event dispatcher] > resourcemanager.ApplicationMasterService > (ApplicationMasterService.java:registerAppAttempt(434)) - Registering app > attempt : appattempt_1557237478804_0001_01 > 2019-05-07 15:58:02,732 INFO [RM Event dispatcher] attempt.RMAppAttemptImpl > (RMAppAttemptImpl.java:handle(920)) - appattempt_1557237478804_0001_01 > State change 
from NEW to SUBMITTED on event = START > 2019-05-07 15:58:02,746 INFO [SchedulerEventDispatcher:Event Processor] > scheduler.SchedulerApplicationAttempt > (SchedulerApplicationAttempt.java:(207)) - *** In the constructor of > SchedulerApplicationAttempt > 2019-05-07 15:58:02,747 INFO [SchedulerEventDispatcher:Event Processor] > scheduler.SchedulerApplicationAttempt > (SchedulerApplicationAttempt.java:(230)) - *** Contents of > appSchedulingInfo: [] > 2019-05-07 15:58:02,752 INFO [SchedulerEventDispatcher:Event Processor] > fair.FairScheduler (FairScheduler.java:addApplicationAttempt(546)) - Added > Application Attempt appattempt_1557237478804_0001_01 to
[jira] [Updated] (YARN-9552) FairScheduler: NODE_UPDATE can cause NoSuchElementException
[ https://issues.apache.org/jira/browse/YARN-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-9552: --- Attachment: YARN-9552-004.patch > FairScheduler: NODE_UPDATE can cause NoSuchElementException > --- > > Key: YARN-9552 > URL: https://issues.apache.org/jira/browse/YARN-9552 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9552-001.patch, YARN-9552-002.patch, > YARN-9552-003.patch, YARN-9552-004.patch > > > We observed a race condition inside YARN with the following stack trace: > {noformat} > 18/11/07 06:45:09.559 SchedulerEventDispatcher:Event Processor ERROR > EventDispatcher: Error in handling event type NODE_UPDATE to the Event > Dispatcher > java.util.NoSuchElementException > at > java.util.concurrent.ConcurrentSkipListMap.firstKey(ConcurrentSkipListMap.java:2036) > at > java.util.concurrent.ConcurrentSkipListSet.first(ConcurrentSkipListSet.java:396) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.getNextPendingAsk(AppSchedulingInfo.java:373) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.isOverAMShareLimit(FSAppAttempt.java:941) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:1373) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:353) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:204) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1094) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:961) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1183) > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:132) > at > org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66) > at java.lang.Thread.run(Thread.java:748) > {noformat} > This is basically the same as the one described in YARN-7382, but the root > cause is different. > When we create an application attempt, we create an {{FSAppAttempt}} object. > This contains an {{AppSchedulingInfo}} which contains a set of > {{SchedulerRequestKey}}. Initially, this set is empty and only initialized a > bit later on a separate thread during a state transition: > {noformat} > 2019-05-07 15:58:02,659 INFO [RM StateStore dispatcher] > recovery.RMStateStore (RMStateStore.java:transition(239)) - Storing info for > app: application_1557237478804_0001 > 2019-05-07 15:58:02,684 INFO [RM Event dispatcher] rmapp.RMAppImpl > (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change > from NEW_SAVING to SUBMITTED on event = APP_NEW_SAVED > 2019-05-07 15:58:02,690 INFO [SchedulerEventDispatcher:Event Processor] > fair.FairScheduler (FairScheduler.java:addApplication(490)) - Accepted > application application_1557237478804_0001 from user: bacskop, in queue: > root.bacskop, currently num of applications: 1 > 2019-05-07 15:58:02,698 INFO [RM Event dispatcher] rmapp.RMAppImpl > (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change > from SUBMITTED to ACCEPTED on event = APP_ACCEPTED > 2019-05-07 15:58:02,731 INFO [RM Event dispatcher] > resourcemanager.ApplicationMasterService > (ApplicationMasterService.java:registerAppAttempt(434)) - Registering app > attempt : appattempt_1557237478804_0001_01 > 2019-05-07 15:58:02,732 INFO [RM Event dispatcher] attempt.RMAppAttemptImpl > (RMAppAttemptImpl.java:handle(920)) - appattempt_1557237478804_0001_01 > State change from NEW to SUBMITTED on event = START > 2019-05-07 15:58:02,746 INFO [SchedulerEventDispatcher:Event Processor] > 
scheduler.SchedulerApplicationAttempt > (SchedulerApplicationAttempt.java:<init>(207)) - *** In the constructor of > SchedulerApplicationAttempt > 2019-05-07 15:58:02,747 INFO [SchedulerEventDispatcher:Event Processor] > scheduler.SchedulerApplicationAttempt > (SchedulerApplicationAttempt.java:<init>(230)) - *** Contents of > appSchedulingInfo: [] > 2019-05-07 15:58:02,752 INFO [SchedulerEventDispatcher:Event Processor] > fair.FairScheduler (FairScheduler.java:addApplicationAttempt(546)) - Added > Application Attempt appattempt_1557237478804_0001_01 to scheduler from > user: bacskop > 2019-05-07 15:58:02,756 INFO [RM Event
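The race above ends in {{ConcurrentSkipListSet.first()}} being called on a set that another thread has not yet populated, which throws {{NoSuchElementException}}. The failure mode and a defensive accessor can be shown in isolation; this is a plain-Java sketch, not the actual {{AppSchedulingInfo}} code, and {{firstOrNull}} is a hypothetical helper:

```java
import java.util.NavigableSet;
import java.util.NoSuchElementException;
import java.util.concurrent.ConcurrentSkipListSet;

public class PendingAskSketch {
    // ConcurrentSkipListSet.first() throws NoSuchElementException on an
    // empty set - exactly what NODE_UPDATE hits when it races the attempt's
    // initialization. Return null instead of propagating the exception.
    static <T> T firstOrNull(NavigableSet<T> keys) {
        // isEmpty() + first() would not be atomic under concurrent removal,
        // so catch the exception rather than check-then-act.
        try {
            return keys.first();
        } catch (NoSuchElementException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        NavigableSet<Integer> schedulerKeys = new ConcurrentSkipListSet<>();
        System.out.println(firstOrNull(schedulerKeys)); // null: no pending ask yet
        schedulerKeys.add(1);
        System.out.println(firstOrNull(schedulerKeys)); // 1
    }
}
```

The caller then treats a null result as "no pending ask", instead of crashing the event dispatcher thread.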
[jira] [Commented] (YARN-9552) FairScheduler: NODE_UPDATE can cause NoSuchElementException
[ https://issues.apache.org/jira/browse/YARN-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840349#comment-16840349 ] Szilard Nemeth commented on YARN-9552: -- Hi [~pbacsko]! +1 (non-binding) for the latest patch! > FairScheduler: NODE_UPDATE can cause NoSuchElementException > --- > > Key: YARN-9552 > URL: https://issues.apache.org/jira/browse/YARN-9552 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9552-001.patch, YARN-9552-002.patch, > YARN-9552-003.patch > > > We observed a race condition inside YARN with the following stack trace: > {noformat} > 18/11/07 06:45:09.559 SchedulerEventDispatcher:Event Processor ERROR > EventDispatcher: Error in handling event type NODE_UPDATE to the Event > Dispatcher > java.util.NoSuchElementException > at > java.util.concurrent.ConcurrentSkipListMap.firstKey(ConcurrentSkipListMap.java:2036) > at > java.util.concurrent.ConcurrentSkipListSet.first(ConcurrentSkipListSet.java:396) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.getNextPendingAsk(AppSchedulingInfo.java:373) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.isOverAMShareLimit(FSAppAttempt.java:941) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:1373) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:353) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:204) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1094) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:961) > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1183) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:132) > at > org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66) > at java.lang.Thread.run(Thread.java:748) > {noformat} > This is basically the same as the one described in YARN-7382, but the root > cause is different. > When we create an application attempt, we create an {{FSAppAttempt}} object. > This contains an {{AppSchedulingInfo}} which contains a set of > {{SchedulerRequestKey}}. Initially, this set is empty and only initialized a > bit later on a separate thread during a state transition: > {noformat} > 2019-05-07 15:58:02,659 INFO [RM StateStore dispatcher] > recovery.RMStateStore (RMStateStore.java:transition(239)) - Storing info for > app: application_1557237478804_0001 > 2019-05-07 15:58:02,684 INFO [RM Event dispatcher] rmapp.RMAppImpl > (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change > from NEW_SAVING to SUBMITTED on event = APP_NEW_SAVED > 2019-05-07 15:58:02,690 INFO [SchedulerEventDispatcher:Event Processor] > fair.FairScheduler (FairScheduler.java:addApplication(490)) - Accepted > application application_1557237478804_0001 from user: bacskop, in queue: > root.bacskop, currently num of applications: 1 > 2019-05-07 15:58:02,698 INFO [RM Event dispatcher] rmapp.RMAppImpl > (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change > from SUBMITTED to ACCEPTED on event = APP_ACCEPTED > 2019-05-07 15:58:02,731 INFO [RM Event dispatcher] > resourcemanager.ApplicationMasterService > (ApplicationMasterService.java:registerAppAttempt(434)) - Registering app > attempt : appattempt_1557237478804_0001_01 > 2019-05-07 15:58:02,732 INFO [RM Event dispatcher] attempt.RMAppAttemptImpl > (RMAppAttemptImpl.java:handle(920)) - appattempt_1557237478804_0001_01 > State change 
from NEW to SUBMITTED on event = START > 2019-05-07 15:58:02,746 INFO [SchedulerEventDispatcher:Event Processor] > scheduler.SchedulerApplicationAttempt > (SchedulerApplicationAttempt.java:<init>(207)) - *** In the constructor of > SchedulerApplicationAttempt > 2019-05-07 15:58:02,747 INFO [SchedulerEventDispatcher:Event Processor] > scheduler.SchedulerApplicationAttempt > (SchedulerApplicationAttempt.java:<init>(230)) - *** Contents of > appSchedulingInfo: [] > 2019-05-07 15:58:02,752 INFO [SchedulerEventDispatcher:Event Processor] > fair.FairScheduler (FairScheduler.java:addApplicationAttempt(546)) - Added > Application Attempt appattempt_1557237478804_0001_01 to scheduler from > user: bacskop >
[jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.
[ https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840348#comment-16840348 ] Lars Francke commented on YARN-6875: All subtasks have been resolved here, but the issue is still OPEN. Is this feature complete and implemented? Are we using this new format already? Can we close the issue? > New aggregated log file format for YARN log aggregation. > > > Key: YARN-6875 > URL: https://issues.apache.org/jira/browse/YARN-6875 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Xuan Gong >Assignee: Xuan Gong >Priority: Major > Attachments: YARN-6875-NewLogAggregationFormat-design-doc.pdf > > > T-file is the underlying log format for the aggregated logs in YARN. We have > seen several performance issues, especially for very large log files. > We will introduce a new log format which has better performance for large > log files. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9552) FairScheduler: NODE_UPDATE can cause NoSuchElementException
[ https://issues.apache.org/jira/browse/YARN-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840346#comment-16840346 ] Hadoop QA commented on YARN-9552: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 11s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 30s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 1 new + 20 unchanged - 0 fixed = 21 total (was 20) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 50s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 85m 3s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 29s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}139m 11s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e | | JIRA Issue | YARN-9552 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12968773/YARN-9552-003.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 3f35447184cd 4.4.0-144-generic #170~14.04.1-Ubuntu SMP Mon Mar 18 15:02:05 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 570fa2d | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_212 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/24092/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24092/testReport/ | | Max. process+thread count | 915 (vs. ulimit of 1) | | modules | C:
[jira] [Comment Edited] (YARN-9554) TimelineEntity DAO has java.util.Set interface which JAXB can't handle
[ https://issues.apache.org/jira/browse/YARN-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840317#comment-16840317 ] Prabhu Joseph edited comment on YARN-9554 at 5/15/19 11:57 AM: --- The {{TimelineEntity}} DAO class has a field of the {{Set}} interface type, which JAXB can't handle, so {{ContextFactory}} throws the {{JAXBException}} shown in the description while creating {{JAXBContextImpl}}. Jersey ignores this and only logs:
{code:java}
INFO: Couldn't find grammar element for class org.apache.hadoop.yarn.api.records.timeline.TimelineEntity
{code}
Every timeline put-entities request invokes createContext with TimelineEntity, which throws JAXBException, so jaxbContext stays null. This reintroduces the slowness caused by synchronization around createContext that YARN-7266 tried to fix:
{code:java}
// Fix of YARN-7266
synchronized (ContextFactory.class) {
  if (jaxbContext == null) {
    jaxbContext = (JAXBContext) m.invoke((Object) null, classes, properties);
  }
}
return jaxbContext;
{code}
*The patch includes the fixes below:*
1. If {{createContext}} is called for {{TimelineEntity}} or {{TimelineEntities}}, throw {{JAXBException}} (with suppressed stack trace) immediately.
2. Reuse a single {{JAXBContextImpl}} for the other DAO classes from {{AHSWebServices}} and {{TimelineWebServices}}.
3. If {{createContext}} is called for any other class, such as {{com.sun.research.ws.wadl.Application}}, create a new context, since the shared context above does not know about that class.
*Testing Covered:*
1. JUnit test classes from hadoop-yarn-server-applicationhistoryservice run fine.
2. Functional testing:
{code:java}
1. AHSWebServices and TimelineWebServices REST API, both from browser and curl command - XML and JSON format.
http://:8188/ws/v1/applicationhistory/about
http://:8188/ws/v1/applicationhistory/apps/application_1557825335381_0001/appattempts/appattempt_1557825335381_0001_01/containers/container_1557825335381_0001_01_01
http://:8188/ws/v1/applicationhistory/apps/application_1557825335381_0001/appattempts/appattempt_1557825335381_0001_01/containers/
http://:8188/ws/v1/applicationhistory/apps/application_1557825335381_0001/appattempts/appattempt_1557825335381_0001_01/
http://:8188/ws/v1/timeline/about/ws/v1/applicationhistory/apps/application_1557825335381_0001/appattempts/
http://:8188/ws/v1/applicationhistory/apps/application_1557825335381_0001/
http://:8188/ws/v1/applicationhistory/apps/
http://:8188/ws/v1/timeline/about:8188/ws/v1/applicationhistory
http://:8188/ws/v1/timeline
http://:8188/ws/v1/timeline/about
http://:8188/ws/v1/timeline/YARN_APPLICATION
http://:8188/ws/v1/timeline/YARN_APPLICATION/application_1557825335381_0001
http://:8188/ws/v1/timeline/YARN_APPLICATION/events
http://:8188/ws/v1/timeline/HIVE_QUERY_ID
http://:8188/ws/v1/timeline/TEZ_DAG_ID

Insert Domain using PUT:
curl -H "Accept: application/json" -H "Content-Type: application/json" -X PUT http://:8188/ws/v1/timeline/domain -d '{"id":"abd","description":"test1","owner":"ambari-qa","readers":"ambari-qa","writers":"ambari-qa","createdtime":"123456","modifiedtime":"123456"}'
{"errors":[]}

Get Domain:
http://:8188/ws/v1/timeline/domain
http://:8188/ws/v1/timeline/domain/abc
{"domains":[{"id":"abc","description":"test","owner":"dr.who","readers":"ambari-qa","writers":"ambari-qa","createdtime":1557835184393,"modifiedtime":1557835209581}]}

Wrong URL:
http://:8188/ws/v1/applicationhistory/apps/application_1557825335381_0001/appattempts/containers/

Wrong Accept Type:
curl -H "Accept: application/xml" http://:8188/ws/v1/timeline/YARN_APPLICATION

2. MapReduce Service Check
3. Tez Service Check
4. Hive Queries
5. Tez View
6. ApplicationHistory Web App http://:8188/applicationhistory/
7.
PUT Entities:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.client.api.TimelineClient;
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;
import org.apache.hadoop.yarn.api.records.timeline.TimelinePutResponse;

public class Putter {
  public static void main(String[] arg) {
    TimelineClient client = TimelineClient.createTimelineClient();
    client.init(new Configuration());
    client.start();
    TimelineEntity entity = new TimelineEntity();
    entity.setEntityId(arg[0]);
    entity.setEntityType("dummy");
    entity.setStartTime(System.currentTimeMillis());
    try {
      TimelinePutResponse response = client.putEntities(entity);
      System.out.println("RESPONSE=" + response.toString());
    } catch (Exception e) {
      e.printStackTrace();
    }
    client.stop();
  }
}

8. GET Entities: http://:8188/ws/v1/timeline/dummy
{code}
was (Author: prabhu joseph): {{TimelineEntity}} DAO class has field with {{Set}} interface which JAXB can't handle and so {{ContextFactory}} throws {{JAXBException}} shown
[jira] [Commented] (YARN-9554) TimelineEntity DAO has java.util.Set interface which JAXB can't handle
[ https://issues.apache.org/jira/browse/YARN-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840317#comment-16840317 ] Prabhu Joseph commented on YARN-9554: - The {{TimelineEntity}} DAO class has a field of the {{Set}} interface type, which JAXB can't handle, so {{ContextFactory}} throws the {{JAXBException}} shown in the description while creating {{JAXBContextImpl}}. Jersey ignores this and only logs:
{code:java}
INFO: Couldn't find grammar element for class org.apache.hadoop.yarn.api.records.timeline.TimelineEntity
{code}
Every timeline put-entities request invokes createContext with TimelineEntity, which throws JAXBException, so jaxbContext stays null. This reintroduces the slowness caused by synchronization around createContext that YARN-7266 tried to fix:
{code:java}
// Fix of YARN-7266
synchronized (ContextFactory.class) {
  if (jaxbContext == null) {
    jaxbContext = (JAXBContext) m.invoke((Object) null, classes, properties);
  }
}
return jaxbContext;
{code}
The patch includes the fixes below:
1. If {{createContext}} is called for {{TimelineEntity}} or {{TimelineEntities}}, throw {{JAXBException}} (with suppressed stack trace) immediately.
2. Reuse a single {{JAXBContextImpl}} for the other DAO classes from {{AHSWebServices}} and {{TimelineWebServices}}.
3. If {{createContext}} is called for any other class, such as {{com.sun.research.ws.wadl.Application}}, create a new context, since the shared context above does not know about that class.
Testing done:
1. JUnit test classes from hadoop-yarn-server-applicationhistoryservice run fine.
2. Functional testing:
{code:java}
1. AHSWebServices and TimelineWebServices REST API, both from browser and curl command - XML and JSON format.
http://:8188/ws/v1/applicationhistory/about
http://:8188/ws/v1/applicationhistory/apps/application_1557825335381_0001/appattempts/appattempt_1557825335381_0001_01/containers/container_1557825335381_0001_01_01
http://:8188/ws/v1/applicationhistory/apps/application_1557825335381_0001/appattempts/appattempt_1557825335381_0001_01/containers/
http://:8188/ws/v1/applicationhistory/apps/application_1557825335381_0001/appattempts/appattempt_1557825335381_0001_01/
http://:8188/ws/v1/timeline/about/ws/v1/applicationhistory/apps/application_1557825335381_0001/appattempts/
http://:8188/ws/v1/applicationhistory/apps/application_1557825335381_0001/
http://:8188/ws/v1/applicationhistory/apps/
http://:8188/ws/v1/timeline/about:8188/ws/v1/applicationhistory
http://:8188/ws/v1/timeline
http://:8188/ws/v1/timeline/about
http://:8188/ws/v1/timeline/YARN_APPLICATION
http://:8188/ws/v1/timeline/YARN_APPLICATION/application_1557825335381_0001
http://:8188/ws/v1/timeline/YARN_APPLICATION/events
http://:8188/ws/v1/timeline/HIVE_QUERY_ID
http://:8188/ws/v1/timeline/TEZ_DAG_ID

Insert Domain using PUT:
curl -H "Accept: application/json" -H "Content-Type: application/json" -X PUT http://:8188/ws/v1/timeline/domain -d '{"id":"abd","description":"test1","owner":"ambari-qa","readers":"ambari-qa","writers":"ambari-qa","createdtime":"123456","modifiedtime":"123456"}'
{"errors":[]}

Get Domain:
http://:8188/ws/v1/timeline/domain
http://:8188/ws/v1/timeline/domain/abc
{"domains":[{"id":"abc","description":"test","owner":"dr.who","readers":"ambari-qa","writers":"ambari-qa","createdtime":1557835184393,"modifiedtime":1557835209581}]}

Wrong URL:
http://:8188/ws/v1/applicationhistory/apps/application_1557825335381_0001/appattempts/containers/

Wrong Accept Type:
curl -H "Accept: application/xml" http://:8188/ws/v1/timeline/YARN_APPLICATION

2. MapReduce Service Check
3. Tez Service Check
4. Hive Queries
5. Tez View
6. ApplicationHistory Web App http://:8188/applicationhistory/
7.
PUT Entities:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.client.api.TimelineClient;
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;
import org.apache.hadoop.yarn.api.records.timeline.TimelinePutResponse;

public class Putter {
  public static void main(String[] arg) {
    TimelineClient client = TimelineClient.createTimelineClient();
    client.init(new Configuration());
    client.start();
    TimelineEntity entity = new TimelineEntity();
    entity.setEntityId(arg[0]);
    entity.setEntityType("dummy");
    entity.setStartTime(System.currentTimeMillis());
    try {
      TimelinePutResponse response = client.putEntities(entity);
      System.out.println("RESPONSE=" + response.toString());
    } catch (Exception e) {
      e.printStackTrace();
    }
    client.stop();
  }
}

8. GET Entities: http://:8188/ws/v1/timeline/dummy
{code}
> TimelineEntity DAO has java.util.Set interface which JAXB can't handle > -- > > Key: YARN-9554 > URL:
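The three fixes enumerated in the comment amount to: fail fast for the classes JAXB cannot marshal, and memoize the expensive context construction for everything else, so the global lock from the YARN-7266 fix is no longer hit on every request. A self-contained sketch of that caching shape (plain Java stand-in, not the actual Hadoop ContextFactory; class and field names here are illustrative):

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ContextCacheSketch {
    // Stand-ins for TimelineEntity/TimelineEntities: refused up front so a
    // doomed createContext call never re-enters the expensive factory path.
    private static final Set<String> UNSUPPORTED =
            Set.of("TimelineEntity", "TimelineEntities");
    // One cached "context" per DAO class, built at most once.
    private static final Map<String, Object> CACHE = new ConcurrentHashMap<>();
    static int factoryCalls = 0;

    static Object createContext(String daoClass) {
        if (UNSUPPORTED.contains(daoClass)) {
            // Fail fast instead of retrying (and re-failing) under a lock.
            throw new IllegalArgumentException(daoClass + " cannot be handled");
        }
        return CACHE.computeIfAbsent(daoClass, c -> {
            factoryCalls++; // expensive construction happens only here
            return new Object();
        });
    }

    public static void main(String[] args) {
        System.out.println(createContext("AppInfo") == createContext("AppInfo")); // true
    }
}
```

In the real patch the cached object is a {{JAXBContextImpl}} shared by the AHSWebServices and TimelineWebServices DAO classes; the point of the sketch is only the decision order: deny-list first, then a once-only cache.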
[jira] [Updated] (YARN-9554) TimelineEntity DAO has java.util.Set interface which JAXB can't handle
[ https://issues.apache.org/jira/browse/YARN-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9554: Attachment: YARN-9554-001.patch > TimelineEntity DAO has java.util.Set interface which JAXB can't handle > -- > > Key: YARN-9554 > URL: https://issues.apache.org/jira/browse/YARN-9554 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineservice >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9554-001.patch > > > TimelineEntity DAO has java.util.Set interface which JAXB can't handle. This > breaks the fix of YARN-7266. > {code} > Caused by: com.sun.xml.internal.bind.v2.runtime.IllegalAnnotationsException: > 1 counts of IllegalAnnotationExceptions > java.util.Set is an interface, and JAXB can't handle interfaces. > this problem is related to the following location: > at java.util.Set > at public java.util.HashMap > org.apache.hadoop.yarn.api.records.timeline.TimelineEntity.getPrimaryFiltersJAXB() > at org.apache.hadoop.yarn.api.records.timeline.TimelineEntity > at public java.util.List > org.apache.hadoop.yarn.api.records.timeline.TimelineEntities.getEntities() > at org.apache.hadoop.yarn.api.records.timeline.TimelineEntities > at > com.sun.xml.internal.bind.v2.runtime.IllegalAnnotationsException$Builder.check(IllegalAnnotationsException.java:91) > at > com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.getTypeInfoSet(JAXBContextImpl.java:445) > at > com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:277) > at > com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:124) > at > com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl$JAXBContextBuilder.build(JAXBContextImpl.java:1123) > at > com.sun.xml.internal.bind.v2.ContextFactory.createContext(ContextFactory.java:147) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9552) FairScheduler: NODE_UPDATE can cause NoSuchElementException
[ https://issues.apache.org/jira/browse/YARN-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-9552: --- Attachment: YARN-9552-003.patch > FairScheduler: NODE_UPDATE can cause NoSuchElementException > --- > > Key: YARN-9552 > URL: https://issues.apache.org/jira/browse/YARN-9552 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9552-001.patch, YARN-9552-002.patch, > YARN-9552-003.patch > > > We observed a race condition inside YARN with the following stack trace: > {noformat} > 18/11/07 06:45:09.559 SchedulerEventDispatcher:Event Processor ERROR > EventDispatcher: Error in handling event type NODE_UPDATE to the Event > Dispatcher > java.util.NoSuchElementException > at > java.util.concurrent.ConcurrentSkipListMap.firstKey(ConcurrentSkipListMap.java:2036) > at > java.util.concurrent.ConcurrentSkipListSet.first(ConcurrentSkipListSet.java:396) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.getNextPendingAsk(AppSchedulingInfo.java:373) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.isOverAMShareLimit(FSAppAttempt.java:941) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:1373) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:353) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:204) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1094) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:961) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1183) > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:132) > at > org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66) > at java.lang.Thread.run(Thread.java:748) > {noformat} > This is basically the same as the one described in YARN-7382, but the root > cause is different. > When we create an application attempt, we create an {{FSAppAttempt}} object. > This contains an {{AppSchedulingInfo}} which contains a set of > {{SchedulerRequestKey}}. Initially, this set is empty and only initialized a > bit later on a separate thread during a state transition: > {noformat} > 2019-05-07 15:58:02,659 INFO [RM StateStore dispatcher] > recovery.RMStateStore (RMStateStore.java:transition(239)) - Storing info for > app: application_1557237478804_0001 > 2019-05-07 15:58:02,684 INFO [RM Event dispatcher] rmapp.RMAppImpl > (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change > from NEW_SAVING to SUBMITTED on event = APP_NEW_SAVED > 2019-05-07 15:58:02,690 INFO [SchedulerEventDispatcher:Event Processor] > fair.FairScheduler (FairScheduler.java:addApplication(490)) - Accepted > application application_1557237478804_0001 from user: bacskop, in queue: > root.bacskop, currently num of applications: 1 > 2019-05-07 15:58:02,698 INFO [RM Event dispatcher] rmapp.RMAppImpl > (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change > from SUBMITTED to ACCEPTED on event = APP_ACCEPTED > 2019-05-07 15:58:02,731 INFO [RM Event dispatcher] > resourcemanager.ApplicationMasterService > (ApplicationMasterService.java:registerAppAttempt(434)) - Registering app > attempt : appattempt_1557237478804_0001_01 > 2019-05-07 15:58:02,732 INFO [RM Event dispatcher] attempt.RMAppAttemptImpl > (RMAppAttemptImpl.java:handle(920)) - appattempt_1557237478804_0001_01 > State change from NEW to SUBMITTED on event = START > 2019-05-07 15:58:02,746 INFO [SchedulerEventDispatcher:Event Processor] > 
scheduler.SchedulerApplicationAttempt > (SchedulerApplicationAttempt.java:<init>(207)) - *** In the constructor of > SchedulerApplicationAttempt > 2019-05-07 15:58:02,747 INFO [SchedulerEventDispatcher:Event Processor] > scheduler.SchedulerApplicationAttempt > (SchedulerApplicationAttempt.java:<init>(230)) - *** Contents of > appSchedulingInfo: [] > 2019-05-07 15:58:02,752 INFO [SchedulerEventDispatcher:Event Processor] > fair.FairScheduler (FairScheduler.java:addApplicationAttempt(546)) - Added > Application Attempt appattempt_1557237478804_0001_01 to scheduler from > user: bacskop > 2019-05-07 15:58:02,756 INFO [RM Event dispatcher] >
[jira] [Commented] (YARN-9552) FairScheduler: NODE_UPDATE can cause NoSuchElementException
[ https://issues.apache.org/jira/browse/YARN-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840256#comment-16840256 ] Peter Bacsko commented on YARN-9552: [~snemeth] I added a short comment to the testcase. > FairScheduler: NODE_UPDATE can cause NoSuchElementException > --- > > Key: YARN-9552 > URL: https://issues.apache.org/jira/browse/YARN-9552 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9552-001.patch, YARN-9552-002.patch, > YARN-9552-003.patch > > > We observed a race condition inside YARN with the following stack trace: > {noformat} > 18/11/07 06:45:09.559 SchedulerEventDispatcher:Event Processor ERROR > EventDispatcher: Error in handling event type NODE_UPDATE to the Event > Dispatcher > java.util.NoSuchElementException > at > java.util.concurrent.ConcurrentSkipListMap.firstKey(ConcurrentSkipListMap.java:2036) > at > java.util.concurrent.ConcurrentSkipListSet.first(ConcurrentSkipListSet.java:396) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.getNextPendingAsk(AppSchedulingInfo.java:373) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.isOverAMShareLimit(FSAppAttempt.java:941) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:1373) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:353) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:204) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1094) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:961) > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1183) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:132) > at > org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66) > at java.lang.Thread.run(Thread.java:748) > {noformat} > This is basically the same as the one described in YARN-7382, but the root > cause is different. > When we create an application attempt, we create an {{FSAppAttempt}} object. > This contains an {{AppSchedulingInfo}} which contains a set of > {{SchedulerRequestKey}}. Initially, this set is empty and only initialized a > bit later on a separate thread during a state transition: > {noformat} > 2019-05-07 15:58:02,659 INFO [RM StateStore dispatcher] > recovery.RMStateStore (RMStateStore.java:transition(239)) - Storing info for > app: application_1557237478804_0001 > 2019-05-07 15:58:02,684 INFO [RM Event dispatcher] rmapp.RMAppImpl > (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change > from NEW_SAVING to SUBMITTED on event = APP_NEW_SAVED > 2019-05-07 15:58:02,690 INFO [SchedulerEventDispatcher:Event Processor] > fair.FairScheduler (FairScheduler.java:addApplication(490)) - Accepted > application application_1557237478804_0001 from user: bacskop, in queue: > root.bacskop, currently num of applications: 1 > 2019-05-07 15:58:02,698 INFO [RM Event dispatcher] rmapp.RMAppImpl > (RMAppImpl.java:handle(903)) - application_1557237478804_0001 State change > from SUBMITTED to ACCEPTED on event = APP_ACCEPTED > 2019-05-07 15:58:02,731 INFO [RM Event dispatcher] > resourcemanager.ApplicationMasterService > (ApplicationMasterService.java:registerAppAttempt(434)) - Registering app > attempt : appattempt_1557237478804_0001_01 > 2019-05-07 15:58:02,732 INFO [RM Event dispatcher] attempt.RMAppAttemptImpl > (RMAppAttemptImpl.java:handle(920)) - appattempt_1557237478804_0001_01 > State change 
from NEW to SUBMITTED on event = START > 2019-05-07 15:58:02,746 INFO [SchedulerEventDispatcher:Event Processor] > scheduler.SchedulerApplicationAttempt > (SchedulerApplicationAttempt.java:<init>(207)) - *** In the constructor of > SchedulerApplicationAttempt > 2019-05-07 15:58:02,747 INFO [SchedulerEventDispatcher:Event Processor] > scheduler.SchedulerApplicationAttempt > (SchedulerApplicationAttempt.java:<init>(230)) - *** Contents of > appSchedulingInfo: [] > 2019-05-07 15:58:02,752 INFO [SchedulerEventDispatcher:Event Processor] > fair.FairScheduler (FairScheduler.java:addApplicationAttempt(546)) - Added > Application Attempt appattempt_1557237478804_0001_01 to scheduler from > user: bacskop > 2019-05-07
[jira] [Commented] (YARN-9482) DistributedShell job with localization fails in unsecure cluster
[ https://issues.apache.org/jira/browse/YARN-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840231#comment-16840231 ] Peter Bacsko commented on YARN-9482: If that's the case, then I give +1 (non-binding). > DistributedShell job with localization fails in unsecure cluster > > > Key: YARN-9482 > URL: https://issues.apache.org/jira/browse/YARN-9482 > Project: Hadoop YARN > Issue Type: Bug > Components: distributed-shell >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9482-001.patch, YARN-9482-002.patch, > YARN-9482-003.patch > > > DistributedShell job with localization fails in unsecure cluster. The client > localizes the input files to home directory (job user) whereas the AM runs as > yarn user reads from it's home directory. > *Command:* > {code} > yarn jar > /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar > -shell_command ls -shell_args / -jar > /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar > -localize_files /tmp/prabhu > {code} > {code} > Exception in thread "Thread-4" java.io.UncheckedIOException: Error during > localization setup > at > org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.lambda$run$0(ApplicationMaster.java:1495) > at > java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382) > at > java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580) > at > org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.run(ApplicationMaster.java:1481) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.io.FileNotFoundException: File does not exist: > hdfs://yarn-ats-1:8020/user/yarn/DistributedShell/application_1554817981283_0003/prabhu > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1586) > at > 
org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1579) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1594) > at > org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.lambda$run$0(ApplicationMaster.java:1487) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
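The mismatch described in YARN-9482 can be made concrete with a small sketch: the client uploads the localized file under the submitting user's home directory, while in an unsecure cluster the AM runs as the `yarn` user and resolves the same relative suffix against its own home. The helper names below are illustrative, not the actual DistributedShell code.

```java
// Illustration of the YARN-9482 path mismatch: client-side upload path vs.
// the path the AM (running as "yarn") tries to read. Helper methods are
// stand-ins for illustration only.
public class LocalizePathMismatch {
    static String homeDir(String user) { return "/user/" + user; }

    static String suffix(String appId, String fileName) {
        return "DistributedShell/" + appId + "/" + fileName;
    }

    public static void main(String[] args) {
        String appId = "application_1554817981283_0003";
        // Client localizes relative to the job user's home directory...
        String uploaded = homeDir("prabhu") + "/" + suffix(appId, "prabhu");
        // ...but the AM, running as "yarn", resolves against its own home.
        String lookedUp = homeDir("yarn") + "/" + suffix(appId, "prabhu");
        System.out.println(uploaded.equals(lookedUp)); // false: FileNotFoundException
        System.out.println(lookedUp);
    }
}
```

The second printed path matches the one in the FileNotFoundException above, which is why the AM cannot find the file the client uploaded.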
[jira] [Commented] (YARN-9508) YarnConfiguration areNodeLabel enabled is costly in allocation flow
[ https://issues.apache.org/jira/browse/YARN-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840153#comment-16840153 ] Hudson commented on YARN-9508: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16552 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/16552/]) YARN-9508. YarnConfiguration areNodeLabel enabled is costly in (bibinchundatt: rev 570fa2da20706490dc7823efd0ce0cef3ddc81f9) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/DefaultAMSProcessor.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestSchedulerUtils.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java > YarnConfiguration areNodeLabel enabled is costly in allocation flow > --- > > Key: YARN-9508 > URL: https://issues.apache.org/jira/browse/YARN-9508 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bilwa S T >Priority: Critical > Fix For: 3.3.0, 3.2.1 > > Attachments: YARN-9508-001.patch, YARN-9508-002.patch, > YARN-9508-003.patch > > > For every allocate request locking can be avoided. 
Improving performance > {noformat} > "pool-6-thread-300" #624 prio=5 os_prio=0 tid=0x7f2f91152800 nid=0x8ec5 > waiting for monitor entry [0x7f1ec6a8d000] > java.lang.Thread.State: BLOCKED (on object monitor) > at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2841) > - waiting to lock <0x7f1f8107c748> (a > org.apache.hadoop.yarn.conf.YarnConfiguration) > at org.apache.hadoop.conf.Configuration.get(Configuration.java:1214) > at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1268) > at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1674) > at > org.apache.hadoop.yarn.conf.YarnConfiguration.areNodeLabelsEnabled(YarnConfiguration.java:3646) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:234) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndvalidateRequest(SchedulerUtils.java:274) > at > org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.normalizeAndValidateRequests(RMServerUtils.java:261) > at > org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.allocate(DefaultAMSProcessor.java:242) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.processor.DisabledPlacementProcessor.allocate(DisabledPlacementProcessor.java:75) > at > org.apache.hadoop.yarn.server.resourcemanager.AMSProcessingChain.allocate(AMSProcessingChain.java:92) > at > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:427) > - locked <0x7f24dd3f9e40> (a > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService$AllocateResponseLock) > at > org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator$1.run(MRAMSimulator.java:352) > at > org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator$1.run(MRAMSimulator.java:349) > at java.security.AccessController.doPrivileged(Native Method) > at 
javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) > at > org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator.sendContainerRequest(MRAMSimulator.java:348) > at > org.apache.hadoop.yarn.sls.appmaster.AMSimulator.middleStep(AMSimulator.java:212) > at > org.apache.hadoop.yarn.sls.scheduler.TaskRunner$Task.run(TaskRunner.java:94) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail:
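The thread dump above shows every allocate call blocking on the `Configuration` monitor just to re-read a boolean. The general fix pattern is to read the rarely-changing flag once and cache it, so the hot path never touches the synchronized property map. This is a minimal sketch of that pattern with stand-in classes, not the actual YARN-9508 patch:

```java
// Sketch of the caching pattern: read a rarely-changing boolean once at
// init time instead of hitting a synchronized Configuration.get() on every
// allocate call. Conf is a stand-in for org.apache.hadoop.conf.Configuration.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CachedFlag {
    static class Conf {
        private final Map<String, String> props = new ConcurrentHashMap<>();
        // Mimics Configuration: property reads go through a monitor.
        synchronized String get(String key) { return props.get(key); }
        void set(String key, String value) { props.put(key, value); }
    }

    private final boolean nodeLabelsEnabled; // cached once, lock-free afterwards

    CachedFlag(Conf conf) {
        this.nodeLabelsEnabled =
            Boolean.parseBoolean(conf.get("yarn.node-labels.enabled"));
    }

    boolean areNodeLabelsEnabled() {
        return nodeLabelsEnabled; // no monitor acquisition in the hot path
    }

    public static void main(String[] args) {
        Conf conf = new Conf();
        conf.set("yarn.node-labels.enabled", "true");
        CachedFlag flag = new CachedFlag(conf);
        System.out.println(flag.areNodeLabelsEnabled());
    }
}
```

Caching is safe here because node-label enablement does not change while the RM is running, so the per-request lookup buys nothing but lock contention.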
[jira] [Commented] (YARN-9547) ContainerStatusPBImpl default execution type is not returned
[ https://issues.apache.org/jira/browse/YARN-9547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840143#comment-16840143 ] Hudson commented on YARN-9547: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16551 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/16551/]) YARN-9547. ContainerStatusPBImpl default execution type is not returned. (bibinchundatt: rev 2de1e30658439945edf598b47257142f4730a37d) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/api/protocolrecords/TestProtocolRecords.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ContainerStatusPBImpl.java > ContainerStatusPBImpl default execution type is not returned > > > Key: YARN-9547 > URL: https://issues.apache.org/jira/browse/YARN-9547 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-9547-001.patch > > > {code} > @Override > public synchronized ExecutionType getExecutionType() { > ContainerStatusProtoOrBuilder p = viaProto ? proto : builder; > if (!p.hasExecutionType()) { > return null; > } > return convertFromProtoFormat(p.getExecutionType()); > } > {code} > ContainerStatusPBImpl executionType should return default as > ExecutionType.GUARANTEED. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
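The getter quoted in the issue returns null when the proto field is absent; the stated fix is to fall back to `ExecutionType.GUARANTEED`. A simplified sketch of that defaulting pattern, using stand-in classes rather than the real `ContainerStatusPBImpl` and generated proto types:

```java
// Minimal sketch of returning a default enum value when an optional
// protobuf-style field is unset, instead of null. Simplified stand-ins,
// not the real ContainerStatusPBImpl / proto classes.
public class ExecutionTypeDefault {
    enum ExecutionType { GUARANTEED, OPPORTUNISTIC }

    static class ContainerStatusProto {
        private final ExecutionType executionType; // null means "field not set"
        ContainerStatusProto(ExecutionType t) { this.executionType = t; }
        boolean hasExecutionType() { return executionType != null; }
        ExecutionType getExecutionType() { return executionType; }
    }

    // After the fix: an unset field yields the documented default rather
    // than null, so callers need no null checks.
    static ExecutionType getExecutionType(ContainerStatusProto p) {
        if (!p.hasExecutionType()) {
            return ExecutionType.GUARANTEED;
        }
        return p.getExecutionType();
    }

    public static void main(String[] args) {
        System.out.println(getExecutionType(new ContainerStatusProto(null)));
        System.out.println(getExecutionType(
            new ContainerStatusProto(ExecutionType.OPPORTUNISTIC)));
    }
}
```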
[jira] [Commented] (YARN-9547) ContainerStatusPBImpl default execution type is not returned
[ https://issues.apache.org/jira/browse/YARN-9547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840137#comment-16840137 ] Bibin A Chundatt commented on YARN-9547: Committed to trunk [~BilwaST] Could you add patch for 3.1 and 3.2 branch too. > ContainerStatusPBImpl default execution type is not returned > > > Key: YARN-9547 > URL: https://issues.apache.org/jira/browse/YARN-9547 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-9547-001.patch > > > {code} > @Override > public synchronized ExecutionType getExecutionType() { > ContainerStatusProtoOrBuilder p = viaProto ? proto : builder; > if (!p.hasExecutionType()) { > return null; > } > return convertFromProtoFormat(p.getExecutionType()); > } > {code} > ContainerStatusPBImpl executionType should return default as > ExecutionType.GUARANTEED. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9558) TestAHSWebServices testcases failing
[ https://issues.apache.org/jira/browse/YARN-9558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9558: Affects Version/s: 3.1.3 3.2.1 > TestAHSWebServices testcases failing > > > Key: YARN-9558 > URL: https://issues.apache.org/jira/browse/YARN-9558 > Project: Hadoop YARN > Issue Type: Bug > Components: test, timelineservice >Affects Versions: 3.3.0, 3.2.1, 3.1.3 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > > TestAHSWebServices testcases failing. > {code:java} > [ERROR] TestAHSWebServices.testContainerLogsForFinishedApps:570 > [ERROR] TestAHSWebServices.testContainerLogsForFinishedApps:570 > [ERROR] TestAHSWebServices.testContainerLogsForRunningApps:777 > [ERROR] TestAHSWebServices.testContainerLogsForRunningApps:777 > [ERROR] Errors: > [ERROR] TestAHSWebServices.testContainerLogsMetaForFinishedApps:942 » > WebApplication j... > [ERROR] TestAHSWebServices.testContainerLogsMetaForFinishedApps:942 » > WebApplication j... > [ERROR] TestAHSWebServices.testContainerLogsMetaForRunningApps:875 » > WebApplication ja... > [ERROR] TestAHSWebServices.testContainerLogsMetaForRunningApps:875 » > WebApplication ja... > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-9558) TestAHSWebServices testcases failing
Prabhu Joseph created YARN-9558: --- Summary: TestAHSWebServices testcases failing Key: YARN-9558 URL: https://issues.apache.org/jira/browse/YARN-9558 Project: Hadoop YARN Issue Type: Bug Components: test, timelineservice Affects Versions: 3.3.0 Reporter: Prabhu Joseph Assignee: Prabhu Joseph TestAHSWebServices testcases failing. {code:java} [ERROR] TestAHSWebServices.testContainerLogsForFinishedApps:570 [ERROR] TestAHSWebServices.testContainerLogsForFinishedApps:570 [ERROR] TestAHSWebServices.testContainerLogsForRunningApps:777 [ERROR] TestAHSWebServices.testContainerLogsForRunningApps:777 [ERROR] Errors: [ERROR] TestAHSWebServices.testContainerLogsMetaForFinishedApps:942 » WebApplication j... [ERROR] TestAHSWebServices.testContainerLogsMetaForFinishedApps:942 » WebApplication j... [ERROR] TestAHSWebServices.testContainerLogsMetaForRunningApps:875 » WebApplication ja... [ERROR] TestAHSWebServices.testContainerLogsMetaForRunningApps:875 » WebApplication ja... {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9557) Application fails in diskchecker when ReadWriteDiskValidator is configured.
[ https://issues.apache.org/jira/browse/YARN-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-9557: --- Description: Application fails to execute successfully when ReadWriteDiskValidator is configured. {code} yarn.nodemanager.disk-validator read-write {code} {noformat} Exception thrown while starting Container: java.io.IOException: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check failed! at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:200) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:180) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1233) Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check failed! at org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:82) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:255) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:312) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:198) ... 2 more Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: /opt/HA/AN0805/nmlocal/usercache/dsperf/appcache/application_1557736108162_0009/filecache/11 is not a directory! at org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:50) {noformat} was: Application fails to execute successfully when ReadWriteDiskValidator is configured. {noformat} Exception thrown while starting Container: java.io.IOException: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check failed! 
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:200) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:180) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1233) Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check failed! at org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:82) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:255) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:312) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:198) ... 2 more Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: /opt/HA/AN0805/nmlocal/usercache/dsperf/appcache/application_1557736108162_0009/filecache/11 is not a directory! at org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:50) {noformat} > Application fails in diskchecker when ReadWriteDiskValidator is configured. > --- > > Key: YARN-9557 > URL: https://issues.apache.org/jira/browse/YARN-9557 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.1.1 > Environment: Configure: > > yarn.nodemanager.disk-validator > read-write > >Reporter: Anuruddh Nayak >Priority: Critical > > Application fails to execute successfully when ReadWriteDiskValidator is > configured. > {code} > > yarn.nodemanager.disk-validator > read-write > > {code} > {noformat} > Exception thrown while starting Container: > java.io.IOException: org.apache.hadoop.util.DiskChecker$DiskErrorException: > Disk Check failed! 
> at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:200) > at > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:180) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1233) > Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check > failed! > at > org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:82) > at >
[jira] [Updated] (YARN-9557) Application fails in diskchecker when ReadWriteDiskValidator is configured.
[ https://issues.apache.org/jira/browse/YARN-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-9557: --- Description: Application fails to execute successfully when ReadWriteDiskValidator is configured. {noformat} Exception thrown while starting Container: java.io.IOException: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check failed! at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:200) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:180) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1233) Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check failed! at org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:82) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:255) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:312) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:198) ... 2 more Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: /opt/HA/AN0805/nmlocal/usercache/dsperf/appcache/application_1557736108162_0009/filecache/11 is not a directory! at org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:50) {noformat} was: Application fails to execute successfully when ReadWriteDiskValidator is configured. Exception thrown while starting Container: java.io.IOException: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check failed! 
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:200) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:180) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1233) Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check failed! at org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:82) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:255) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:312) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:198) ... 2 more Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: /opt/HA/AN0805/nmlocal/usercache/dsperf/appcache/application_1557736108162_0009/filecache/11 is not a directory! at org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:50) > Application fails in diskchecker when ReadWriteDiskValidator is configured. > --- > > Key: YARN-9557 > URL: https://issues.apache.org/jira/browse/YARN-9557 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.1.1 > Environment: Configure: > > yarn.nodemanager.disk-validator > read-write > >Reporter: Anuruddh Nayak >Priority: Critical > > Application fails to execute successfully when ReadWriteDiskValidator is > configured. > > {noformat} > Exception thrown while starting Container: > java.io.IOException: org.apache.hadoop.util.DiskChecker$DiskErrorException: > Disk Check failed! 
> at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:200) > at > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:180) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1233) > Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check > failed! > at > org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:82) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:255) > at >
[jira] [Updated] (YARN-9557) Application fails in diskchecker when ReadWriteDiskValidator is configured.
[ https://issues.apache.org/jira/browse/YARN-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-9557: --- Priority: Critical (was: Major) > Application fails in diskchecker when ReadWriteDiskValidator is configured. > --- > > Key: YARN-9557 > URL: https://issues.apache.org/jira/browse/YARN-9557 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.1.1 > Environment: Configure: > > yarn.nodemanager.disk-validator > read-write > >Reporter: Anuruddh Nayak >Priority: Critical > > Application fails to execute successfully when ReadWriteDiskValidator is > configured. > > Exception thrown while starting Container: > java.io.IOException: org.apache.hadoop.util.DiskChecker$DiskErrorException: > Disk Check failed! > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:200) > at > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:180) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1233) > Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check > failed! > at > org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:82) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:255) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:312) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:198) > ... 
2 more > Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: > /opt/HA/AN0805/nmlocal/usercache/dsperf/appcache/application_1557736108162_0009/filecache/11 > is not a directory! > at > org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:50) > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-9557) Application fails in diskchecker when ReadWriteDiskValidator is configured.
Anuruddh Nayak created YARN-9557: Summary: Application fails in diskchecker when ReadWriteDiskValidator is configured. Key: YARN-9557 URL: https://issues.apache.org/jira/browse/YARN-9557 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.1.1 Environment: Configure: yarn.nodemanager.disk-validator read-write Reporter: Anuruddh Nayak Application fails to execute successfully when ReadWriteDiskValidator is configured. Exception thrown while starting Container: java.io.IOException: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check failed! at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:200) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:180) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1233) Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Disk Check failed! at org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:82) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:255) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:312) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:198) ... 2 more Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: /opt/HA/AN0805/nmlocal/usercache/dsperf/appcache/application_1557736108162_0009/filecache/11 is not a directory! 
at org.apache.hadoop.util.ReadWriteDiskValidator.checkStatus(ReadWriteDiskValidator.java:50) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
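The failure mode in YARN-9557 is that `ReadWriteDiskValidator.checkStatus` insists the given path is a directory, while localization hands it a just-downloaded file, so validation always throws. One hedged way to illustrate a fix is to validate the parent directory when the path is a regular file; this is illustrative code, not the actual validator or any committed patch:

```java
// Sketch of the YARN-9557 failure mode and one possible accommodation:
// checkStatus() rejects anything that is not a directory, but localization
// passes a file path. checkStatusTolerant() validates the parent directory
// instead when given a regular file. Illustrative, not the real patch.
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DiskCheckSketch {
    static void checkStatus(File dir) throws IOException {
        if (!dir.isDirectory()) {
            throw new IOException(dir + " is not a directory!");
        }
        // The real validator also writes and reads back a probe file here.
    }

    static void checkStatusTolerant(File path) throws IOException {
        File dir = path.isFile() ? path.getParentFile() : path;
        checkStatus(dir);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("diskcheck");
        Path localized = Files.createFile(tmp.resolve("filecache-11"));
        try {
            checkStatus(localized.toFile()); // reproduces the reported failure
        } catch (IOException e) {
            System.out.println("strict: "
                + e.getMessage().endsWith("is not a directory!"));
        }
        checkStatusTolerant(localized.toFile()); // validates the parent dir
        System.out.println("tolerant: ok");
    }
}
```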
[jira] [Commented] (YARN-9521) RM failed to start due to system services
[ https://issues.apache.org/jira/browse/YARN-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840096#comment-16840096 ] kyungwan nam commented on YARN-9521: I think the cause of this problem is as follows. 1. _fs_ is set by calling FileSystem.get() on SystemServiceManagerImpl.serviceInit 2. RMAppImpl.appAdminClientCleanUp will be called on RMAppImpl.FinalTransition, if APP_COMPLETED event occurs during RMStateStore recovery {code} static void appAdminClientCleanUp(RMAppImpl app) { try { AppAdminClient client = AppAdminClient.createAppAdminClient(app .applicationType, app.conf); int result = client.actionCleanUp(app.name, app.user); {code} ApiServiceClient.actionCleanUp {code} @Override public int actionCleanUp(String appName, String userName) throws IOException, YarnException { ServiceClient sc = new ServiceClient(); sc.init(getConfig()); sc.start(); int result = sc.actionCleanUp(appName, userName); sc.close(); return result; } {code} ServiceClient instance has a FileSystem by calling FileSystem.get() at initialization time. but, it might be a cached one. the FileSystem cached will be closed by _sc.close()_ 3. scanForUserServices is called on SystemServiceManagerImpl.serviceStart. but, _fs_ has been closed already. RM log {code} // 1. 
SystemServiceManagerImpl.serviceInit // 2019-05-15 10:27:59,445 DEBUG service.AbstractService (AbstractService.java:enterState(443)) - Service: org.apache.hadoop.yarn.service.client.SystemServiceManagerImpl entered state INITED 2019-05-15 10:27:59,446 INFO client.SystemServiceManagerImpl (SystemServiceManagerImpl.java:serviceInit(114)) - System Service Directory is configured to /services 2019-05-15 10:27:59,472 DEBUG fs.FileSystem (FileSystem.java:loadFileSystems(3209)) - Loading filesystems 2019-05-15 10:27:59,483 DEBUG fs.FileSystem (FileSystem.java:loadFileSystems(3221)) - file:// = class org.apache.hadoop.fs.LocalFileSystem from /usr/hdp/3.1.0.0-78/hadoop/hadoop-common-3.1.1.3.1.2.3.1.0.0-78.jar 2019-05-15 10:27:59,488 DEBUG fs.FileSystem (FileSystem.java:loadFileSystems(3221)) - viewfs:// = class org.apache.hadoop.fs.viewfs.ViewFileSystem from /usr/hdp/3.1.0.0-78/hadoop/hadoop-common-3.1.1.3.1.2.3.1.0.0-78.jar 2019-05-15 10:27:59,491 DEBUG fs.FileSystem (FileSystem.java:loadFileSystems(3221)) - har:// = class org.apache.hadoop.fs.HarFileSystem from /usr/hdp/3.1.0.0-78/hadoop/hadoop-common-3.1.1.3.1.2.3.1.0.0-78.jar 2019-05-15 10:27:59,492 DEBUG fs.FileSystem (FileSystem.java:loadFileSystems(3221)) - http:// = class org.apache.hadoop.fs.http.HttpFileSystem from /usr/hdp/3.1.0.0-78/hadoop/hadoop-common-3.1.1.3.1.2.3.1.0.0-78.jar 2019-05-15 10:27:59,493 DEBUG fs.FileSystem (FileSystem.java:loadFileSystems(3221)) - https:// = class org.apache.hadoop.fs.http.HttpsFileSystem from /usr/hdp/3.1.0.0-78/hadoop/hadoop-common-3.1.1.3.1.2.3.1.0.0-78.jar 2019-05-15 10:27:59,503 DEBUG fs.FileSystem (FileSystem.java:loadFileSystems(3221)) - hdfs:// = class org.apache.hadoop.hdfs.DistributedFileSystem from /usr/hdp/3.1.0.0-78/hadoop-hdfs/hadoop-hdfs-client-3.1.1.3.1.2.3.1.0.0-78.jar 2019-05-15 10:27:59,511 DEBUG fs.FileSystem (FileSystem.java:loadFileSystems(3221)) - webhdfs:// = class org.apache.hadoop.hdfs.web.WebHdfsFileSystem from 
/usr/hdp/3.1.0.0-78/hadoop-hdfs/hadoop-hdfs-client-3.1.1.3.1.2.3.1.0.0-78.jar 2019-05-15 10:27:59,512 DEBUG fs.FileSystem (FileSystem.java:loadFileSystems(3221)) - swebhdfs:// = class org.apache.hadoop.hdfs.web.SWebHdfsFileSystem from /usr/hdp/3.1.0.0-78/hadoop-hdfs/hadoop-hdfs-client-3.1.1.3.1.2.3.1.0.0-78.jar 2019-05-15 10:27:59,514 DEBUG fs.FileSystem (FileSystem.java:loadFileSystems(3221)) - s3n:// = class org.apache.hadoop.fs.s3native.NativeS3FileSystem from /usr/hdp/3.1.0.0-78/hadoop-mapreduce/hadoop-aws-3.1.1.3.1.2.3.1.0.0-78.jar 2019-05-15 10:27:59,514 DEBUG fs.FileSystem (FileSystem.java:getFileSystemClass(3264)) - Looking for FS supporting hdfs 2019-05-15 10:27:59,514 DEBUG fs.FileSystem (FileSystem.java:getFileSystemClass(3268)) - looking for configuration option fs.hdfs.impl 2019-05-15 10:27:59,528 DEBUG fs.FileSystem (FileSystem.java:getFileSystemClass(3275)) - Looking in service filesystems for implementation class 2019-05-15 10:27:59,528 DEBUG fs.FileSystem (FileSystem.java:getFileSystemClass(3284)) - FS for hdfs is class org.apache.hadoop.hdfs.DistributedFileSystem // 2. APP_COMPLETED event occurs // 2019-05-15 10:28:02,931 DEBUG rmapp.RMAppImpl (RMAppImpl.java:handle(895)) - Processing event for application_1556612756829_0001 of type RECOVER 2019-05-15 10:28:02,931 DEBUG rmapp.RMAppImpl (RMAppImpl.java:recover(933)) - Recovering app: application_1556612756829_0001 with 2 attempts and final state = FAILED 2019-05-15 10:28:02,931 DEBUG attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:(544)) - yarn.app.attempt.diagnostics.limit.kc : 64
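The hazard described in the comment above comes from `FileSystem.get(conf)` returning a process-wide cached instance: closing it in one place (here, `sc.close()` in `actionCleanUp`) closes it for every other holder, such as `SystemServiceManagerImpl`. A toy cache reproduces that failure mode; it is an illustration of the shared-handle problem, not Hadoop's actual `FileSystem` cache code:

```java
// Toy reproduction of the shared-cache hazard: get() hands out one cached
// handle per URI (like FileSystem.get), so close() on one client breaks
// every other holder; newInstance() returns a private handle instead.
import java.util.HashMap;
import java.util.Map;

public class SharedHandleHazard {
    static class Handle {
        private boolean closed;
        void close() { closed = true; }
        boolean isClosed() { return closed; }
    }

    private static final Map<String, Handle> CACHE = new HashMap<>();

    // Analogous to FileSystem.get(): hands out the shared cached handle.
    static Handle get(String uri) {
        return CACHE.computeIfAbsent(uri, u -> new Handle());
    }

    // Analogous to FileSystem.newInstance(): a private, uncached handle.
    static Handle newInstance(String uri) {
        return new Handle();
    }

    public static void main(String[] args) {
        Handle sharedA = get("hdfs://nn");           // e.g. SystemServiceManagerImpl's fs
        Handle sharedB = get("hdfs://nn");           // ServiceClient gets the same object
        Handle privateFs = newInstance("hdfs://nn"); // a private handle
        sharedB.close();                             // sc.close() in actionCleanUp
        System.out.println(sharedA == sharedB);      // one shared object
        System.out.println(sharedA.isClosed());      // the other holder is now broken
        System.out.println(privateFs.isClosed());    // unaffected
    }
}
```

In real Hadoop code the same effect can be avoided by having the short-lived client use `FileSystem.newInstance(conf)` (or by disabling the cache for that use), so its `close()` cannot tear down a handle that long-lived services still rely on.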
[jira] [Commented] (YARN-9519) TFile log aggregation file format is not working for yarn.log-aggregation.TFile.remote-app-log-dir config
[ https://issues.apache.org/jira/browse/YARN-9519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840094#comment-16840094 ] Adam Antal commented on YARN-9519: -- Thanks for the reviews and the commit. > TFile log aggregation file format is not working for > yarn.log-aggregation.TFile.remote-app-log-dir config > - > > Key: YARN-9519 > URL: https://issues.apache.org/jira/browse/YARN-9519 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation >Affects Versions: 3.2.0 >Reporter: Adam Antal >Assignee: Adam Antal >Priority: Major > Fix For: 3.3.0, 3.2.1, 3.1.3 > > Attachments: YARN-9519.001.patch, YARN-9519.002.patch, > YARN-9519.003.patch, YARN-9519.004.patch, YARN-9519.005.patch > > > The TFile log aggregation file format is not sensitive to the > yarn.log-aggregation.TFile.remote-app-log-dir config. > In {{LogAggregationTFileController$initInternal}}: > {code:java} > this.remoteRootLogDir = new Path( > conf.get(YarnConfiguration.NM_REMOTE_APP_LOG_DIR, > YarnConfiguration.DEFAULT_NM_REMOTE_APP_LOG_DIR)); > {code} > So the remoteRootLogDir is only aware of the > yarn.nodemanager.remote-app-log-dir config, while other file format, like > IFile defaults to the file format config, so its priority is higher. 
> From {{LogAggregationIndexedFileController$initInternal}}: > {code:java} > String remoteDirStr = String.format( > YarnConfiguration.LOG_AGGREGATION_REMOTE_APP_LOG_DIR_FMT, > this.fileControllerName); > String remoteDir = conf.get(remoteDirStr); > if (remoteDir == null || remoteDir.isEmpty()) { > remoteDir = conf.get(YarnConfiguration.NM_REMOTE_APP_LOG_DIR, > YarnConfiguration.DEFAULT_NM_REMOTE_APP_LOG_DIR); > } > {code} > (Where these configs are: ) > {code:java} > public static final String LOG_AGGREGATION_REMOTE_APP_LOG_DIR_FMT > = YARN_PREFIX + "log-aggregation.%s.remote-app-log-dir"; > public static final String NM_REMOTE_APP_LOG_DIR = > NM_PREFIX + "remote-app-log-dir"; > {code} > I suggest TFile should try to obtain the remote dir config from > yarn.log-aggregation.TFile.remote-app-log-dir first, and only if that is not > specified falls back to the yarn.nodemanager.remote-app-log-dir config. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
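The fallback the issue proposes for TFile mirrors the IFile code quoted above: try the per-format key first, then fall back to the generic NodeManager key, then the hard-coded default. The config keys below are the real YARN names from the issue; the `Map`-based conf is a stand-in for `Configuration`, and the default value is assumed to be `/tmp/logs`:

```java
// Sketch of the per-format -> generic -> default config fallback that
// YARN-9519 wants TFile to follow, mirroring the quoted IFile logic.
// A plain Map stands in for org.apache.hadoop.conf.Configuration.
import java.util.HashMap;
import java.util.Map;

public class RemoteLogDirLookup {
    static String remoteAppLogDir(Map<String, String> conf, String controllerName) {
        // Per-format key, e.g. yarn.log-aggregation.TFile.remote-app-log-dir
        String perFormatKey = String.format(
            "yarn.log-aggregation.%s.remote-app-log-dir", controllerName);
        String dir = conf.get(perFormatKey);
        if (dir == null || dir.isEmpty()) {
            // Fall back to the generic NodeManager key, then the default.
            dir = conf.getOrDefault("yarn.nodemanager.remote-app-log-dir",
                "/tmp/logs"); // assumed DEFAULT_NM_REMOTE_APP_LOG_DIR
        }
        return dir;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("yarn.log-aggregation.TFile.remote-app-log-dir", "/app-logs/tfile");
        System.out.println(remoteAppLogDir(conf, "TFile")); // per-format key wins
        System.out.println(remoteAppLogDir(conf, "IFile")); // falls back to default
    }
}
```

With this ordering, setting only `yarn.log-aggregation.TFile.remote-app-log-dir` takes effect for TFile, while formats without a per-format override keep the existing behavior.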