[jira] [Commented] (YARN-1618) Applications transition from NEW to FINAL_SAVING, and try to update non-existing entries in the state-store
[ https://issues.apache.org/jira/browse/YARN-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885133#comment-13885133 ] Karthik Kambatla commented on YARN-1618: Thanks Bikas. Yes, I also verified the latest patch on a secure cluster and ran Oozie workflows against it. The RM no longer crashes when the workflow is pointed at the standby RM. Applications transition from NEW to FINAL_SAVING, and try to update non-existing entries in the state-store --- Key: YARN-1618 URL: https://issues.apache.org/jira/browse/YARN-1618 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Attachments: yarn-1618-1.patch, yarn-1618-2.patch, yarn-1618-3.patch YARN-891 augments the RMStateStore to store information on completed applications. In the process, it adds transitions from NEW to FINAL_SAVING. This leads to the RM trying to update entries in the state-store that do not exist. On ZKRMStateStore, this leads to the RM crashing. Previous description: ZKRMStateStore fails to handle updates to znodes that don't exist. For instance, this can happen when an app transitions from NEW to FINAL_SAVING. In these cases, the store should create the missing znode and handle the update. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
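To illustrate the store-side handling the previous description asks for, here is a minimal sketch against the plain ZooKeeper API, assuming direct access to a ZooKeeper handle; note that the fix ultimately committed (see the retitled updates below) removes the invalid transition instead of patching the store.
{code}
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public final class ZKUpdateOrCreate {
  // Sketch only, not the committed patch: fall back to creating the znode
  // when an update targets a path that was never stored (e.g., an app that
  // jumped from NEW to FINAL_SAVING without ever being saved).
  static void updateOrCreate(ZooKeeper zk, String path, byte[] data)
      throws KeeperException, InterruptedException {
    if (zk.exists(path, false) == null) {
      zk.create(path, data, ZooDefs.Ids.OPEN_ACL_UNSAFE,
          CreateMode.PERSISTENT);
    } else {
      zk.setData(path, data, -1); // -1 matches any znode version
    }
  }
}
{code}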
[jira] [Updated] (YARN-1578) Fix how to handle ApplicationHistory about the container
[ https://issues.apache.org/jira/browse/YARN-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichi Yamashita updated YARN-1578: - Attachment: application_1390978867235_0001 resoucemanager.log Thank you for your comment. I confirmed that this problem occurs in trunk built today. I attached the ResourceManager log (resourcemanager.log). The finish data of container_1390978867235_0001_01_28 does not seem to be recorded in the ResourceManager log, and the finish information of this container is not output to the history file (attached as application_1390978867235_0001). In the current implementation, FileSystemApplicationHistoryStore generates only startData at the point you commented on, and the following code throws a NullPointerException because finishData is null.
{code}
private static void mergeContainerHistoryData(
    ContainerHistoryData historyData, ContainerFinishData finishData) {
  historyData.setFinishTime(finishData.getFinishTime());
  historyData.setDiagnosticsInfo(finishData.getDiagnosticsInfo());
  historyData.setLogURL(finishData.getLogURL());
  historyData.setContainerExitStatus(finishData.getContainerExitStatus());
  historyData.setContainerState(finishData.getContainerState());
}
{code}
Fix how to handle ApplicationHistory about the container Key: YARN-1578 URL: https://issues.apache.org/jira/browse/YARN-1578 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: YARN-321 Reporter: Shinichi Yamashita Assignee: Shinichi Yamashita Attachments: YARN-1578.patch, application_1390978867235_0001, resoucemanager.log, screenshot.png I ran a PiEstimator job on a Hadoop cluster with YARN-321 applied. After the job ended, accessing the HistoryServer web UI returned a 500, and the HistoryServer daemon log contained the following.
{code}
2014-01-09 13:31:12,227 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /applicationhistory/appattempt/appattempt_1389146249925_0008_01
java.lang.reflect.InvocationTargetException
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:153)
  at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
(snip...)
Caused by: java.lang.NullPointerException
  at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.mergeContainerHistoryData(FileSystemApplicationHistoryStore.java:696)
  at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.getContainers(FileSystemApplicationHistoryStore.java:429)
  at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getContainers(ApplicationHistoryManagerImpl.java:201)
  at org.apache.hadoop.yarn.server.webapp.AppAttemptBlock.render(AppAttemptBlock.java:110)
(snip...)
{code}
I confirmed from the ApplicationHistory file that there was a container that never finished. In the ResourceManager daemon log, the ResourceManager reserved this container but did not allocate it. Therefore, we need to change how ApplicationHistory handles containers that were never allocated. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
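A defensive guard consistent with the NPE described above might look like the following; this is a sketch using the YARN-321 branch types, not necessarily the approach YARN-1578.patch takes.
{code}
// Sketch: skip the merge when no ContainerFinishData was ever recorded, so a
// never-allocated container renders with its start data only instead of
// causing a 500 on the web UI.
private static void mergeContainerHistoryData(
    ContainerHistoryData historyData, ContainerFinishData finishData) {
  if (finishData == null) {
    return; // container was reserved but never allocated/finished
  }
  historyData.setFinishTime(finishData.getFinishTime());
  historyData.setDiagnosticsInfo(finishData.getDiagnosticsInfo());
  historyData.setLogURL(finishData.getLogURL());
  historyData.setContainerExitStatus(finishData.getContainerExitStatus());
  historyData.setContainerState(finishData.getContainerState());
}
{code}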
[jira] [Commented] (YARN-925) Augment HistoryStorage Reader Interface to Support Filters When Getting Applications
[ https://issues.apache.org/jira/browse/YARN-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885152#comment-13885152 ] Shinichi Yamashita commented on YARN-925: - Thank you for checking the patch. As you say, the current patch is not good for a history of huge applications. I thought about a plan to add filter information to the history file name, and a plan to add a find command to HDFS. However, I think your idea is simpler and better than mine. I will think about a better method. Augment HistoryStorage Reader Interface to Support Filters When Getting Applications Key: YARN-925 URL: https://issues.apache.org/jira/browse/YARN-925 Project: Hadoop YARN Issue Type: Sub-task Reporter: Mayank Bansal Assignee: Shinichi Yamashita Fix For: YARN-321 Attachments: YARN-925-1.patch, YARN-925-2.patch, YARN-925-3.patch, YARN-925-4.patch, YARN-925-5.patch, YARN-925-6.patch, YARN-925-7.patch, YARN-925-8.patch We need to allow filter parameters for getApplications, pushing filtering to the implementations of the interface. The implementations know best how to optimize filtering. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
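For illustration, a filtered reader call could be shaped as below; the interface and parameter names are hypothetical, not the committed YARN-925 API, and ApplicationHistoryData is the YARN-321 branch record type.
{code}
import java.io.IOException;
import java.util.Map;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.YarnApplicationState;
import org.apache.hadoop.yarn.server.applicationhistoryservice.records.ApplicationHistoryData;

// Hypothetical shape: filters travel with the request so each store
// implementation can prune its own scan instead of post-filtering.
public interface FilteredHistoryReader {
  Map<ApplicationId, ApplicationHistoryData> getApplications(
      long windowStart,            // only apps started at/after this time
      long windowEnd,              // only apps started at/before this time
      String user,                 // null means any user
      YarnApplicationState state)  // null means any state
      throws IOException;
}
{code}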
[jira] [Updated] (YARN-1631) Container allocation issue in Leafqueue assignContainers()
[ https://issues.apache.org/jira/browse/YARN-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-1631: -- Attachment: Yarn-1631.2.patch Updated with a test case to reproduce this scenario. Please review. Container allocation issue in Leafqueue assignContainers() -- Key: YARN-1631 URL: https://issues.apache.org/jira/browse/YARN-1631 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.2.0 Environment: SuSe 11 Linux Reporter: Sunil G Attachments: Yarn-1631.1.patch, Yarn-1631.2.patch Application1 has a demand of 8GB [Map Task Size as 8GB], which is more than Node_1 can handle. Node_1 has a size of 8GB, and 2GB is used by Application1's AM. Hence Application1 reserved the remaining 6GB on Node_1. A new job is submitted with a 2GB AM size and a 2GB task size, with only 2 maps to run. Node_2 also has 8GB capability, but Application2's AM cannot be launched on Node_2, and Application2 waits longer because only 2 nodes are available in the cluster. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1630) Introduce timeout for async polling operations in YarnClientImpl
[ https://issues.apache.org/jira/browse/YARN-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885238#comment-13885238 ] Hudson commented on YARN-1630: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #465 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/465/]) YARN-1630. Introduce timeout for async polling operations in YarnClientImpl (Aditya Acharya via Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1562289) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/YarnClientImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java Introduce timeout for async polling operations in YarnClientImpl Key: YARN-1630 URL: https://issues.apache.org/jira/browse/YARN-1630 Project: Hadoop YARN Issue Type: Bug Components: client Affects Versions: 2.2.0 Reporter: Aditya Acharya Assignee: Aditya Acharya Fix For: 2.3.0 Attachments: diff-1.txt, diff.txt I ran an MR2 application that would have been long running, and killed it programmatically using a YarnClient. The app was killed, but the client hung forever. The message that I saw, which spammed the logs, was "Watiting for application application_1389036507624_0018 to be killed." The RM log indicated that the app had indeed transitioned from RUNNING to KILLED, but for some reason future responses to the RPC to kill the application did not indicate that the app had been terminated. I tracked this down to YarnClientImpl.java, and though I was unable to reproduce the bug, I wrote a patch to introduce a bound on the number of times that YarnClientImpl retries the RPC before giving up. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
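The bounded poll the patch introduces can be pictured as below; this is an illustrative standalone helper, while the real change wires the bound through YarnConfiguration inside YarnClientImpl#killApplication.
{code}
import java.io.IOException;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.YarnApplicationState;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.exceptions.YarnException;

public final class KillWaiter {
  // Sketch: poll for the KILLED state, but give up after timeoutMs instead
  // of spinning forever when the RM keeps reporting a stale state.
  static void waitForKilled(YarnClient client, ApplicationId appId,
      long timeoutMs, long pollIntervalMs)
      throws IOException, YarnException, InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (client.getApplicationReport(appId).getYarnApplicationState()
        != YarnApplicationState.KILLED) {
      if (System.currentTimeMillis() > deadline) {
        throw new YarnException("Timed out waiting for " + appId
            + " to be killed");
      }
      Thread.sleep(pollIntervalMs);
    }
  }
}
{code}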
[jira] [Commented] (YARN-1618) Applications transition from NEW to FINAL_SAVING, and try to update non-existing entries in the state-store
[ https://issues.apache.org/jira/browse/YARN-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885252#comment-13885252 ] Jian He commented on YARN-1618: --- patch looks good to me, +1 Applications transition from NEW to FINAL_SAVING, and try to update non-existing entries in the state-store --- Key: YARN-1618 URL: https://issues.apache.org/jira/browse/YARN-1618 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Attachments: yarn-1618-1.patch, yarn-1618-2.patch, yarn-1618-3.patch YARN-891 augments the RMStateStore to store information on completed applications. In the process, it adds transitions from NEW to FINAL_SAVING. This leads to the RM trying to update entries in the state-store that do not exist. On ZKRMStateStore, this leads to the RM crashing. Previous description: ZKRMStateStore fails to handle updates to znodes that don't exist. For instance, this can happen when an app transitions from NEW to FINAL_SAVING. In these cases, the store should create the missing znode and handle the update. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1618) Applications transition from NEW to FINAL_SAVING, and try to update non-existing entries in the state-store
[ https://issues.apache.org/jira/browse/YARN-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885258#comment-13885258 ] Jian He commented on YARN-1618: --- I found that the NEW state can transition to FINAL_SAVING on a RECOVER event. This should not happen; that FINAL_SAVING transition should be removed. We may just fix it here? Applications transition from NEW to FINAL_SAVING, and try to update non-existing entries in the state-store --- Key: YARN-1618 URL: https://issues.apache.org/jira/browse/YARN-1618 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Attachments: yarn-1618-1.patch, yarn-1618-2.patch, yarn-1618-3.patch YARN-891 augments the RMStateStore to store information on completed applications. In the process, it adds transitions from NEW to FINAL_SAVING. This leads to the RM trying to update entries in the state-store that do not exist. On ZKRMStateStore, this leads to the RM crashing. Previous description: ZKRMStateStore fails to handle updates to znodes that don't exist. For instance, this can happen when an app transitions from NEW to FINAL_SAVING. In these cases, the store should create the missing znode and handle the update. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1631) Container allocation issue in Leafqueue assignContainers()
[ https://issues.apache.org/jira/browse/YARN-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885267#comment-13885267 ] Hadoop QA commented on YARN-1631: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12625843/Yarn-1631.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2957//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2957//console This message is automatically generated. Container allocation issue in Leafqueue assignContainers() -- Key: YARN-1631 URL: https://issues.apache.org/jira/browse/YARN-1631 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.2.0 Environment: SuSe 11 Linux Reporter: Sunil G Attachments: Yarn-1631.1.patch, Yarn-1631.2.patch Application1 has a demand of 8GB [Map Task Size as 8GB], which is more than Node_1 can handle. Node_1 has a size of 8GB, and 2GB is used by Application1's AM. Hence Application1 reserved the remaining 6GB on Node_1. A new job is submitted with a 2GB AM size and a 2GB task size, with only 2 maps to run. Node_2 also has 8GB capability, but Application2's AM cannot be launched on Node_2, and Application2 waits longer because only 2 nodes are available in the cluster. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1630) Introduce timeout for async polling operations in YarnClientImpl
[ https://issues.apache.org/jira/browse/YARN-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885329#comment-13885329 ] Hudson commented on YARN-1630: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1682 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1682/]) YARN-1630. Introduce timeout for async polling operations in YarnClientImpl (Aditya Acharya via Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1562289) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/YarnClientImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java Introduce timeout for async polling operations in YarnClientImpl Key: YARN-1630 URL: https://issues.apache.org/jira/browse/YARN-1630 Project: Hadoop YARN Issue Type: Bug Components: client Affects Versions: 2.2.0 Reporter: Aditya Acharya Assignee: Aditya Acharya Fix For: 2.3.0 Attachments: diff-1.txt, diff.txt I ran an MR2 application that would have been long running, and killed it programmatically using a YarnClient. The app was killed, but the client hung forever. The message that I saw, which spammed the logs, was "Watiting for application application_1389036507624_0018 to be killed." The RM log indicated that the app had indeed transitioned from RUNNING to KILLED, but for some reason future responses to the RPC to kill the application did not indicate that the app had been terminated. I tracked this down to YarnClientImpl.java, and though I was unable to reproduce the bug, I wrote a patch to introduce a bound on the number of times that YarnClientImpl retries the RPC before giving up. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1630) Introduce timeout for async polling operations in YarnClientImpl
[ https://issues.apache.org/jira/browse/YARN-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885334#comment-13885334 ] Hudson commented on YARN-1630: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1657 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1657/]) YARN-1630. Introduce timeout for async polling operations in YarnClientImpl (Aditya Acharya via Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1562289) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/YarnClientImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java Introduce timeout for async polling operations in YarnClientImpl Key: YARN-1630 URL: https://issues.apache.org/jira/browse/YARN-1630 Project: Hadoop YARN Issue Type: Bug Components: client Affects Versions: 2.2.0 Reporter: Aditya Acharya Assignee: Aditya Acharya Fix For: 2.3.0 Attachments: diff-1.txt, diff.txt I ran an MR2 application that would have been long running, and killed it programmatically using a YarnClient. The app was killed, but the client hung forever. The message that I saw, which spammed the logs, was "Watiting for application application_1389036507624_0018 to be killed." The RM log indicated that the app had indeed transitioned from RUNNING to KILLED, but for some reason future responses to the RPC to kill the application did not indicate that the app had been terminated. I tracked this down to YarnClientImpl.java, and though I was unable to reproduce the bug, I wrote a patch to introduce a bound on the number of times that YarnClientImpl retries the RPC before giving up. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1600) RM does not startup when security is enabled without spnego configured
[ https://issues.apache.org/jira/browse/YARN-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885429#comment-13885429 ] Hudson commented on YARN-1600: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5058 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5058/]) YARN-1600. RM does not startup when security is enabled without spnego configured. Contributed by Haohui Mai (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1562482) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/WebApps.java RM does not startup when security is enabled without spnego configured -- Key: YARN-1600 URL: https://issues.apache.org/jira/browse/YARN-1600 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Haohui Mai Priority: Blocker Attachments: YARN-1600.000.patch We have a custom auth filter in front of our various UI pages that handles user authentication. However, the RM currently assumes that if security is enabled, the user must have configured spnego for the RM web pages as well, which is not true in our case. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1632) TestApplicationMasterServices should be under org.apache.hadoop.yarn.server.resourcemanager package
[ https://issues.apache.org/jira/browse/YARN-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885506#comment-13885506 ] Jonathan Eagles commented on YARN-1632: --- +1. Simple fix. Thanks, Chen. TestApplicationMasterServices should be under org.apache.hadoop.yarn.server.resourcemanager package --- Key: YARN-1632 URL: https://issues.apache.org/jira/browse/YARN-1632 Project: Hadoop YARN Issue Type: Bug Affects Versions: 0.23.9, 2.2.0 Reporter: Chen He Assignee: Chen He Priority: Minor Attachments: yarn-1632v2.patch ApplicationMasterService is under org.apache.hadoop.yarn.server.resourcemanager package. However, its unit test file TestApplicationMasterService is placed under org.apache.hadoop.yarn.server.resourcemanager.applicationmasterservice package which only contains one file (TestApplicationMasterService). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (YARN-1670) aggregated log writer can write more log data than it says is the log length
Thomas Graves created YARN-1670: --- Summary: aggregated log writer can write more log data than it says is the log length Key: YARN-1670 URL: https://issues.apache.org/jira/browse/YARN-1670 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.2.0, 0.23.10 Reporter: Thomas Graves We have seen exceptions when using 'yarn logs' to read log files:
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:441)
at java.lang.Long.parseLong(Long.java:483)
at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readAContainerLogsForALogType(AggregatedLogFormat.java:518)
at org.apache.hadoop.yarn.logaggregation.LogDumper.dumpAContainerLogs(LogDumper.java:178)
at org.apache.hadoop.yarn.logaggregation.LogDumper.run(LogDumper.java:130)
at org.apache.hadoop.yarn.logaggregation.LogDumper.main(LogDumper.java:246)
We traced it down to the reader trying to read the file type of the next file while its position is still inside the log data of the previous file. What happened was that the Log Length was written as a certain size, but the log data was actually longer than that. Inside the write() routine in LogValue, it first writes what the logfile length is, but when it then writes the log itself it just copies to the end of the file. There is a race condition here: if someone is still writing to the file when it is aggregated, the length written could be too small. We should have the write() routine stop when it has written whatever it said was the length. It would be nice if we could somehow tell the user the log might be truncated, but I'm not sure of a good way to do this. We also noticed a bug in readAContainerLogsForALogType, where it uses an int for curRead whereas it should use a long:
while (len != -1 && curRead < fileLength) {
This isn't actually a problem right now, as it looks like the underlying decoder is doing the right thing and the len condition exits the loop. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
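The proposed fix of stopping the write at the declared length amounts to a bounded copy; here is a minimal sketch (the helper and its names are illustrative, while the real code lives in AggregatedLogFormat.LogValue#write()).
{code}
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public final class BoundedLogCopy {
  // Sketch: copy exactly declaredLength bytes so the data can never run past
  // the "Log Length" header, even if the file grows during aggregation.
  static void copyDeclaredLength(InputStream in, OutputStream out,
      long declaredLength) throws IOException {
    byte[] buf = new byte[65536];
    long remaining = declaredLength; // long, per the curRead bug noted above
    while (remaining > 0) {
      int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
      if (n == -1) {
        break; // file shrank; stop early rather than block
      }
      out.write(buf, 0, n);
      remaining -= n;
    }
  }
}
{code}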
[jira] [Commented] (YARN-1618) Applications transition from NEW to FINAL_SAVING, and try to update non-existing entries in the state-store
[ https://issues.apache.org/jira/browse/YARN-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885560#comment-13885560 ] Karthik Kambatla commented on YARN-1618: Thanks for the review, Jian. bq. I found that the NEW state can transition to FINAL_SAVING on a RECOVER event. This should not happen; that FINAL_SAVING transition should be removed. I suppose you are referring to the following transition. An RMAppEvent of type RECOVER is created only when recovering applications, which means the application is already in the store. For these applications, I am not sure if we should save the state of this second attempt or not. I don't think either approach would lead to the store issues reported here.
{code}
.addTransition(RMAppState.NEW,
    EnumSet.of(RMAppState.SUBMITTED, RMAppState.ACCEPTED,
        RMAppState.FINISHED, RMAppState.FAILED, RMAppState.KILLED,
        RMAppState.FINAL_SAVING),
    RMAppEventType.RECOVER, new RMAppRecoveredTransition())
{code}
Applications transition from NEW to FINAL_SAVING, and try to update non-existing entries in the state-store --- Key: YARN-1618 URL: https://issues.apache.org/jira/browse/YARN-1618 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Attachments: yarn-1618-1.patch, yarn-1618-2.patch, yarn-1618-3.patch YARN-891 augments the RMStateStore to store information on completed applications. In the process, it adds transitions from NEW to FINAL_SAVING. This leads to the RM trying to update entries in the state-store that do not exist. On ZKRMStateStore, this leads to the RM crashing. Previous description: ZKRMStateStore fails to handle updates to znodes that don't exist. For instance, this can happen when an app transitions from NEW to FINAL_SAVING. In these cases, the store should create the missing znode and handle the update. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1618) Applications transition from NEW to FINAL_SAVING, and try to update non-existing entries in the state-store
[ https://issues.apache.org/jira/browse/YARN-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885574#comment-13885574 ] Karthik Kambatla commented on YARN-1618: [~bikassaha], [~jianhe] - if we need to spend more time on addressing Jian's comment on recovered applications, are you okay with addressing it in a follow-up JIRA? Applications transition from NEW to FINAL_SAVING, and try to update non-existing entries in the state-store --- Key: YARN-1618 URL: https://issues.apache.org/jira/browse/YARN-1618 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Attachments: yarn-1618-1.patch, yarn-1618-2.patch, yarn-1618-3.patch YARN-891 augments the RMStateStore to store information on completed applications. In the process, it adds transitions from NEW to FINAL_SAVING. This leads to the RM trying to update entries in the state-store that do not exist. On ZKRMStateStore, this leads to the RM crashing. Previous description: ZKRMStateStore fails to handle updates to znodes that don't exist. For instance, this can happen when an app transitions from NEW to FINAL_SAVING. In these cases, the store should create the missing znode and handle the update. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1618) Applications transition from NEW to FINAL_SAVING, and try to update non-existing entries in the state-store
[ https://issues.apache.org/jira/browse/YARN-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885576#comment-13885576 ] Bikas Saha commented on YARN-1618: -- Yeah, let's do it in a separate JIRA for clarity. Applications transition from NEW to FINAL_SAVING, and try to update non-existing entries in the state-store --- Key: YARN-1618 URL: https://issues.apache.org/jira/browse/YARN-1618 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Attachments: yarn-1618-1.patch, yarn-1618-2.patch, yarn-1618-3.patch YARN-891 augments the RMStateStore to store information on completed applications. In the process, it adds transitions from NEW to FINAL_SAVING. This leads to the RM trying to update entries in the state-store that do not exist. On ZKRMStateStore, this leads to the RM crashing. Previous description: ZKRMStateStore fails to handle updates to znodes that don't exist. For instance, this can happen when an app transitions from NEW to FINAL_SAVING. In these cases, the store should create the missing znode and handle the update. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1659) Define ApplicationTimelineStore interface and store-facing entity, entity-info and event objects
[ https://issues.apache.org/jira/browse/YARN-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-1659: -- Attachment: YARN-1659.2.patch Uploaded a new patch with a slight adjustment to the class names. Define ApplicationTimelineStore interface and store-facing entity, entity-info and event objects Key: YARN-1659 URL: https://issues.apache.org/jira/browse/YARN-1659 Project: Hadoop YARN Issue Type: Sub-task Reporter: Billie Rinaldi Assignee: Billie Rinaldi Attachments: YARN-1659-1.patch, YARN-1659.2.patch These will be used by the ApplicationTimelineStore interface. The web services will convert the store-facing objects to the user-facing objects. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1618) Applications transition from NEW to FINAL_SAVING, and try to update non-existing entries in the state-store
[ https://issues.apache.org/jira/browse/YARN-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885578#comment-13885578 ] Karthik Kambatla commented on YARN-1618: Thanks Bikas. I'll create a separate JIRA for that, and go ahead and commit this then. Applications transition from NEW to FINAL_SAVING, and try to update non-existing entries in the state-store --- Key: YARN-1618 URL: https://issues.apache.org/jira/browse/YARN-1618 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Attachments: yarn-1618-1.patch, yarn-1618-2.patch, yarn-1618-3.patch YARN-891 augments the RMStateStore to store information on completed applications. In the process, it adds transitions from NEW to FINAL_SAVING. This leads to the RM trying to update entries in the state-store that do not exist. On ZKRMStateStore, this leads to the RM crashing. Previous description: ZKRMStateStore fails to handle updates to znodes that don't exist. For instance, this can happen when an app transitions from NEW to FINAL_SAVING. In these cases, the store should create the missing znode and handle the update. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1618) Applications transition from NEW to FINAL_SAVING, and try to update non-existing entries in the state-store
[ https://issues.apache.org/jira/browse/YARN-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885582#comment-13885582 ] Karthik Kambatla commented on YARN-1618: Filed YARN-1671. Applications transition from NEW to FINAL_SAVING, and try to update non-existing entries in the state-store --- Key: YARN-1618 URL: https://issues.apache.org/jira/browse/YARN-1618 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Attachments: yarn-1618-1.patch, yarn-1618-2.patch, yarn-1618-3.patch YARN-891 augments the RMStateStore to store information on completed applications. In the process, it adds transitions from NEW to FINAL_SAVING. This leads to the RM trying to update entries in the state-store that do not exist. On ZKRMStateStore, this leads to the RM crashing. Previous description: ZKRMStateStore fails to handle updates to znodes that don't exist. For instance, this can happen when an app transitions from NEW to FINAL_SAVING. In these cases, the store should create the missing znode and handle the update. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1671) Revisit RMApp transitions from NEW on RECOVER
[ https://issues.apache.org/jira/browse/YARN-1671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1671: --- Issue Type: Sub-task (was: Bug) Parent: YARN-128 Revisit RMApp transitions from NEW on RECOVER - Key: YARN-1671 URL: https://issues.apache.org/jira/browse/YARN-1671 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.3.0 Reporter: Karthik Kambatla As discussed on YARN-1618, while recovering applications on restart, the NEW -> FINAL_SAVING transition is possible. Revisit this to make sure we want this transition. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (YARN-1671) Revisit RMApp transitions from NEW on RECOVER
Karthik Kambatla created YARN-1671: -- Summary: Revisit RMApp transitions from NEW on RECOVER Key: YARN-1671 URL: https://issues.apache.org/jira/browse/YARN-1671 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.3.0 Reporter: Karthik Kambatla As discussed on YARN-1618, while recovering applications on restart, the NEW -> FINAL_SAVING transition is possible. Revisit this to make sure we want this transition. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1618) Fix invalid transition from NEW to FINAL_SAVING
[ https://issues.apache.org/jira/browse/YARN-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1618: --- Summary: Fix invalid transition from NEW to FINAL_SAVING (was: Applications transition from NEW to FINAL_SAVING, and try to update non-existing entries in the state-store) Fix invalid transition from NEW to FINAL_SAVING --- Key: YARN-1618 URL: https://issues.apache.org/jira/browse/YARN-1618 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Attachments: yarn-1618-1.patch, yarn-1618-2.patch, yarn-1618-3.patch YARN-891 augments the RMStateStore to store information on completed applications. In the process, it adds transitions from NEW to FINAL_SAVING. This leads to the RM trying to update entries in the state-store that do not exist. On ZKRMStateStore, this leads to the RM crashing. Previous description: ZKRMStateStore fails to handle updates to znodes that don't exist. For instance, this can happen when an app transitions from NEW to FINAL_SAVING. In these cases, the store should create the missing znode and handle the update. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1618) Fix invalid RMApp transition from NEW to FINAL_SAVING
[ https://issues.apache.org/jira/browse/YARN-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1618: --- Summary: Fix invalid RMApp transition from NEW to FINAL_SAVING (was: Fix invalid transition from NEW to FINAL_SAVING) Fix invalid RMApp transition from NEW to FINAL_SAVING - Key: YARN-1618 URL: https://issues.apache.org/jira/browse/YARN-1618 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Attachments: yarn-1618-1.patch, yarn-1618-2.patch, yarn-1618-3.patch YARN-891 augments the RMStateStore to store information on completed applications. In the process, it adds transitions from NEW to FINAL_SAVING. This leads to the RM trying to update entries in the state-store that do not exist. On ZKRMStateStore, this leads to the RM crashing. Previous description: ZKRMStateStore fails to handle updates to znodes that don't exist. For instance, this can happen when an app transitions from NEW to FINAL_SAVING. In these cases, the store should create the missing znode and handle the update. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1618) Fix invalid RMApp transition from NEW to FINAL_SAVING
[ https://issues.apache.org/jira/browse/YARN-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1618: --- Attachment: yarn-1618-branch-2.3.patch Patch for branch-2.3. The patch is functionally the same; the trivial conflicts are due to YARN-321 merge changes to TestRMAppTransitions. Fix invalid RMApp transition from NEW to FINAL_SAVING - Key: YARN-1618 URL: https://issues.apache.org/jira/browse/YARN-1618 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Attachments: yarn-1618-1.patch, yarn-1618-2.patch, yarn-1618-3.patch, yarn-1618-branch-2.3.patch YARN-891 augments the RMStateStore to store information on completed applications. In the process, it adds transitions from NEW to FINAL_SAVING. This leads to the RM trying to update entries in the state-store that do not exist. On ZKRMStateStore, this leads to the RM crashing. Previous description: ZKRMStateStore fails to handle updates to znodes that don't exist. For instance, this can happen when an app transitions from NEW to FINAL_SAVING. In these cases, the store should create the missing znode and handle the update. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1618) Fix invalid RMApp transition from NEW to FINAL_SAVING
[ https://issues.apache.org/jira/browse/YARN-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885612#comment-13885612 ] Hudson commented on YARN-1618: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5059 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5059/]) YARN-1618. Fix invalid RMApp transition from NEW to FINAL_SAVING (kasha) (kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1562529) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppEventType.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java Fix invalid RMApp transition from NEW to FINAL_SAVING - Key: YARN-1618 URL: https://issues.apache.org/jira/browse/YARN-1618 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Attachments: yarn-1618-1.patch, yarn-1618-2.patch, yarn-1618-3.patch, yarn-1618-branch-2.3.patch YARN-891 augments the RMStateStore to store information on completed applications. In the process, it adds transitions from NEW to FINAL_SAVING. This leads to the RM trying to update entries in the state-store that do not exist. On ZKRMStateStore, this leads to the RM crashing. Previous description: ZKRMStateStore fails to handle updates to znodes that don't exist. For instance, this can happen when an app transitions from NEW to FINAL_SAVING. In these cases, the store should create the missing znode and handle the update. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1618) Fix invalid RMApp transition from NEW to FINAL_SAVING
[ https://issues.apache.org/jira/browse/YARN-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885617#comment-13885617 ] Hadoop QA commented on YARN-1618: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12625911/yarn-1618-branch-2.3.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2958//console This message is automatically generated. Fix invalid RMApp transition from NEW to FINAL_SAVING - Key: YARN-1618 URL: https://issues.apache.org/jira/browse/YARN-1618 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Attachments: yarn-1618-1.patch, yarn-1618-2.patch, yarn-1618-3.patch, yarn-1618-branch-2.3.patch YARN-891 augments the RMStateStore to store information on completed applications. In the process, it adds transitions from NEW to FINAL_SAVING. This leads to the RM trying to update entries in the state-store that do not exist. On ZKRMStateStore, this leads to the RM crashing. Previous description: ZKRMStateStore fails to handle updates to znodes that don't exist. For instance, this can happen when an app transitions from NEW to FINAL_SAVING. In these cases, the store should create the missing znode and handle the update. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (YARN-1672) YarnConfiguration is missing a default for yarn.nodemanager.log.retain-seconds
Karthik Kambatla created YARN-1672: -- Summary: YarnConfiguration is missing a default for yarn.nodemanager.log.retain-seconds Key: YARN-1672 URL: https://issues.apache.org/jira/browse/YARN-1672 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Trivial YarnConfiguration is missing a default for yarn.nodemanager.log.retain-seconds -- This message was sent by Atlassian JIRA (v6.1.5#6160)
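For reference, the fix is likely a one-liner next to the existing key in YarnConfiguration; the constant name and value below are an assumption (10800 seconds, i.e. 3 hours, is the yarn-default.xml value).
{code}
// The key already exists in YarnConfiguration:
public static final String NM_LOG_RETAIN_SECONDS =
    NM_PREFIX + "log.retain-seconds";
// Sketch of the missing default (name and value assumed, not confirmed):
public static final long DEFAULT_NM_LOG_RETAIN_SECONDS = 3 * 60 * 60;
{code}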
[jira] [Updated] (YARN-1636) Implement timeline related web-services inside AHS for storing and retrieving entities+events
[ https://issues.apache.org/jira/browse/YARN-1636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-1636: -- Attachment: YARN-1636.1.patch Uploaded a patch which contains the REST API service part. Its test cases are held off until the in-memory implementation of ApplicationTimelineStore is ready. Implement timeline related web-services inside AHS for storing and retrieving entities+events --- Key: YARN-1636 URL: https://issues.apache.org/jira/browse/YARN-1636 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Zhijie Shen Attachments: YARN-1636.1.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1636) Implement timeline related web-services inside AHS for storing and retrieving entities+events
[ https://issues.apache.org/jira/browse/YARN-1636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885666#comment-13885666 ] Hadoop QA commented on YARN-1636: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12625924/YARN-1636.1.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2959//console This message is automatically generated. Implement timeline related web-services inside AHS for storing and retrieving entities+events --- Key: YARN-1636 URL: https://issues.apache.org/jira/browse/YARN-1636 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Zhijie Shen Attachments: YARN-1636.1.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1628) TestContainerManagerSecurity fails on trunk
[ https://issues.apache.org/jira/browse/YARN-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885677#comment-13885677 ] Daryn Sharp commented on YARN-1628: --- +1. Will check in later today. Thanks! TestContainerManagerSecurity fails on trunk --- Key: YARN-1628 URL: https://issues.apache.org/jira/browse/YARN-1628 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0, 2.2.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: YARN-1628.patch The test fails with the following error:
{noformat}
java.lang.IllegalArgumentException: java.net.UnknownHostException: InvalidHost
  at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377)
  at org.apache.hadoop.yarn.server.security.BaseNMTokenSecretManager.newInstance(BaseNMTokenSecretManager.java:145)
  at org.apache.hadoop.yarn.server.security.BaseNMTokenSecretManager.createNMToken(BaseNMTokenSecretManager.java:136)
  at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:253)
  at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:144)
{noformat}
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1639) YARM RM HA requires different configs on different RM hosts
[ https://issues.apache.org/jira/browse/YARN-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1639: Attachment: YARN-1639.4.patch Added a test case for when the RM_HA_ID cannot be found. YARM RM HA requires different configs on different RM hosts --- Key: YARN-1639 URL: https://issues.apache.org/jira/browse/YARN-1639 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Arpit Gupta Assignee: Xuan Gong Attachments: YARN-1639.1.patch, YARN-1639.2.patch, YARN-1639.3.patch, YARN-1639.4.patch We need to set yarn.resourcemanager.ha.id to rm1 or rm2 depending on whether the RM is the first or the second one. This means we have different configs on different RM nodes. This is unlike HDFS HA, where the same configs are pushed to both NNs; it would be better to have the same setup for the RM, as this would make installation and management easier. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
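One way to avoid per-host configs, sketched below, is to infer the local rm-id by checking which rm-id's configured address resolves to a local address; this is an illustration of the idea only, not the committed YARN-1639 change.
{code}
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.net.NetUtils;

public final class RmIdFinder {
  // Sketch: walk yarn.resourcemanager.ha.rm-ids and return the id whose RM
  // address belongs to this host, so that yarn.resourcemanager.ha.id need
  // not be set differently on each RM.
  static String findLocalRmId(Configuration conf) {
    for (String rmId :
        conf.getStringCollection("yarn.resourcemanager.ha.rm-ids")) {
      String addr = conf.get("yarn.resourcemanager.address." + rmId);
      if (addr == null) {
        continue;
      }
      InetSocketAddress sock = NetUtils.createSocketAddr(addr);
      if (!sock.isUnresolved()
          && NetUtils.isLocalAddress(sock.getAddress())) {
        return rmId;
      }
    }
    throw new IllegalStateException("No rm-id matches a local address");
  }
}
{code}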
[jira] [Commented] (YARN-1578) Fix how to handle ApplicationHistory about the container
[ https://issues.apache.org/jira/browse/YARN-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885709#comment-13885709 ] Zhijie Shen commented on YARN-1578: --- [~sinchii], thanks for your investigation. In my previous comment, I meant it should be fine if the finish data of a container is not written by the RM. In that case, the finish data should not exist in the persisted history file. Therefore, in the following code,
{code}
if (entry.key.id.equals(containerId.toString())) {
  if (entry.key.suffix.equals(START_DATA_SUFFIX)) {
    ContainerStartData startData = parseContainerStartData(entry.value);
    mergeContainerHistoryData(historyData, startData);
    readStartData = true;
  } else if (entry.key.suffix.equals(FINISH_DATA_SUFFIX)) {
    ContainerFinishData finishData = parseContainerFinishData(entry.value);
    mergeContainerHistoryData(historyData, finishData);
    readFinishData = true;
  }
}
{code}
the second inner condition is supposed to fail. However, it seems the second inner condition passed even though the entry was not actually a byte[] from which a finish-data instance could be constructed. Fix how to handle ApplicationHistory about the container Key: YARN-1578 URL: https://issues.apache.org/jira/browse/YARN-1578 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: YARN-321 Reporter: Shinichi Yamashita Assignee: Shinichi Yamashita Attachments: YARN-1578.patch, application_1390978867235_0001, resoucemanager.log, screenshot.png I ran a PiEstimator job on a Hadoop cluster with YARN-321 applied. After the job ended, accessing the HistoryServer web UI returned a 500, and the HistoryServer daemon log contained the following.
{code}
2014-01-09 13:31:12,227 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /applicationhistory/appattempt/appattempt_1389146249925_0008_01
java.lang.reflect.InvocationTargetException
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:153)
  at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
(snip...)
Caused by: java.lang.NullPointerException
  at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.mergeContainerHistoryData(FileSystemApplicationHistoryStore.java:696)
  at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.getContainers(FileSystemApplicationHistoryStore.java:429)
  at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getContainers(ApplicationHistoryManagerImpl.java:201)
  at org.apache.hadoop.yarn.server.webapp.AppAttemptBlock.render(AppAttemptBlock.java:110)
(snip...)
{code}
I confirmed from the ApplicationHistory file that there was a container that never finished. In the ResourceManager daemon log, the ResourceManager reserved this container but did not allocate it. Therefore, we need to change how ApplicationHistory handles containers that were never allocated. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (YARN-1673) Valid yarn kill application prints out help message.
Tassapol Athiapinya created YARN-1673: - Summary: Valid yarn kill application prints out help message. Key: YARN-1673 URL: https://issues.apache.org/jira/browse/YARN-1673 Project: Hadoop YARN Issue Type: Bug Components: client Affects Versions: 2.4.0 Reporter: Tassapol Athiapinya Priority: Critical Fix For: 2.4.0 yarn application -kill <application ID> used to work previously. In 2.4.0 it prints out the help message and does not kill the application. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1577) Unmanaged AM is broken because of YARN-1493
[ https://issues.apache.org/jira/browse/YARN-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1577: - Target Version/s: 2.3.0 (was: ) Unmanaged AM is broken because of YARN-1493 --- Key: YARN-1577 URL: https://issues.apache.org/jira/browse/YARN-1577 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.3.0 Reporter: Jian He Assignee: Jian He Priority: Blocker Today unmanaged AM client is waiting for app state to be Accepted to launch the AM. This is broken since we changed in YARN-1493 to start the attempt after the application is Accepted. We may need to introduce an attempt state report that client can rely on to query the attempt state and choose to launch the unmanaged AM. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.
[ https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1506: - Target Version/s: 2.3.0 (was: ) Setting target version to 2.3.0 since it was originally targeted 2.4.0, and 2.3 is the new 2.4. If this isn't really a blocker for the 2.3.0 release, please either target it to a later version or downgrade the priority. Replace set resource change on RMNode/SchedulerNode directly with event notification. - Key: YARN-1506 URL: https://issues.apache.org/jira/browse/YARN-1506 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, scheduler Reporter: Junping Du Assignee: Junping Du Priority: Blocker Attachments: YARN-1506-v1.patch, YARN-1506-v2.patch, YARN-1506-v3.patch, YARN-1506-v4.patch, YARN-1506-v5.patch, YARN-1506-v6.patch According to Vinod's comments on YARN-312 (https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846087), we should replace RMNode.setResourceOption() with some resource change event. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1444) RM crashes when node resource request sent without corresponding rack request
[ https://issues.apache.org/jira/browse/YARN-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1444: - Target Version/s: 2.3.0 (was: ) RM crashes when node resource request sent without corresponding rack request - Key: YARN-1444 URL: https://issues.apache.org/jira/browse/YARN-1444 Project: Hadoop YARN Issue Type: Bug Components: client, resourcemanager Reporter: Robert Grandl Assignee: Wangda Tan Priority: Blocker Attachments: yarn-1444.ver1.patch I have tried to force reducers to execute on certain nodes. What I did is change, for reduce tasks, RMContainerRequestor#addResourceRequest(req.priority, ResourceRequest.ANY, req.capability) to RMContainerRequestor#addResourceRequest(req.priority, HOST_NAME, req.capability). However, this change led to RM crashes when reducers need to be assigned, with the following exception:
FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler
java.lang.NullPointerException
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:841)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:640)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:554)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:695)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:739)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:86)
  at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:549)
  at java.lang.Thread.run(Thread.java:722)
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
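The crash above stems from a node-level request arriving without its matching rack-level and off-switch requests; the supported way to target a host is to let the client library derive those, as in this illustrative AMRMClient fragment ("host1" and "/default-rack" are placeholder values).
{code}
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;

// Sketch: AMRMClient fills in the rack- and ANY-level ResourceRequests that
// the CapacityScheduler's assignContainers() assumes are present.
AMRMClient<AMRMClient.ContainerRequest> amrmClient =
    AMRMClient.createAMRMClient();
Resource capability = Resource.newInstance(8192, 1);
amrmClient.addContainerRequest(new AMRMClient.ContainerRequest(
    capability,
    new String[] { "host1" },         // node-level ask
    new String[] { "/default-rack" }, // matching rack-level ask
    Priority.newInstance(10)));
{code}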
[jira] [Updated] (YARN-1602) All failed RMStateStore operations should not be RMFatalEvents
[ https://issues.apache.org/jira/browse/YARN-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1602: - Target Version/s: 2.3.0 (was: ) All failed RMStateStore operations should not be RMFatalEvents -- Key: YARN-1602 URL: https://issues.apache.org/jira/browse/YARN-1602 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.3.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Critical Currently, if a state store operation fails, depending on the exception, either an RMFatalEvent.STATE_STORE_FENCED or an RMFatalEvent.STATE_STORE_OP_FAILED event is created. The latter results in the RM failing. Instead, we should probably kill the application corresponding to the store operation. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1639) YARM RM HA requires different configs on different RM hosts
[ https://issues.apache.org/jira/browse/YARN-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885752#comment-13885752 ] Hadoop QA commented on YARN-1639: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12625936/YARN-1639.4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2960//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/2960//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-api.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2960//console This message is automatically generated. YARM RM HA requires different configs on different RM hosts --- Key: YARN-1639 URL: https://issues.apache.org/jira/browse/YARN-1639 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Arpit Gupta Assignee: Xuan Gong Attachments: YARN-1639.1.patch, YARN-1639.2.patch, YARN-1639.3.patch, YARN-1639.4.patch We need to set yarn.resourcemanager.ha.id to rm1 or rm2 depending on whether the RM is the first or the second one. This means we have different configs on different RM nodes. This is unlike HDFS HA, where the same configs are pushed to both NNs; it would be better to have the same setup for the RM, as this would make installation and management easier. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1206) Container logs link is broken on RM web UI after application finished
[ https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1206: - Target Version/s: 2.3.0 (was: ) Container logs link is broken on RM web UI after application finished - Key: YARN-1206 URL: https://issues.apache.org/jira/browse/YARN-1206 Project: Hadoop YARN Issue Type: Bug Reporter: Jian He Priority: Blocker With log aggregation disabled, when a container is running, its logs link works properly, but after the application is finished, the link shows 'Container does not exist.' -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1611) Make admin refresh of configuration work across RM failover
[ https://issues.apache.org/jira/browse/YARN-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1611: Attachment: YARN-1611.4.patch Make admin refresh of configuration work across RM failover --- Key: YARN-1611 URL: https://issues.apache.org/jira/browse/YARN-1611 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-1611.1.patch, YARN-1611.2.patch, YARN-1611.2.patch, YARN-1611.3.patch, YARN-1611.3.patch, YARN-1611.4.patch Currently, If we do refresh* for a standby RM, it will failover to the current active RM, and do the refresh* based on the local configuration file of the active RM. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1611) Make admin refresh of configuration work across RM failover
[ https://issues.apache.org/jira/browse/YARN-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13885756#comment-13885756 ] Xuan Gong commented on YARN-1611: - Created a patch that only contains the RemoteConfiguration functionality and applies it to the scheduler configuration. Make admin refresh of configuration work across RM failover --- Key: YARN-1611 URL: https://issues.apache.org/jira/browse/YARN-1611 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-1611.1.patch, YARN-1611.2.patch, YARN-1611.2.patch, YARN-1611.3.patch, YARN-1611.3.patch, YARN-1611.4.patch Currently, If we do refresh* for a standby RM, it will failover to the current active RM, and do the refresh* based on the local configuration file of the active RM. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-867) Isolation of failures in aux services
[ https://issues.apache.org/jira/browse/YARN-867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-867: Target Version/s: 2.3.0 (was: ) Isolation of failures in aux services -- Key: YARN-867 URL: https://issues.apache.org/jira/browse/YARN-867 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Hitesh Shah Assignee: Xuan Gong Priority: Critical Attachments: YARN-867.1.sampleCode.patch, YARN-867.3.patch, YARN-867.4.patch, YARN-867.5.patch, YARN-867.6.patch, YARN-867.sampleCode.2.patch Today, a malicious application can bring down the NM by sending bad data to a service. For example, sending data to the ShuffleService such that it results in any non-IOException will cause the NM's async dispatcher to exit, as the service's INIT APP event is not handled properly. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-867) Isolation of failures in aux services
[ https://issues.apache.org/jira/browse/YARN-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13885767#comment-13885767 ] Hadoop QA commented on YARN-867: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12606599/YARN-867.6.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2961//console This message is automatically generated. Isolation of failures in aux services -- Key: YARN-867 URL: https://issues.apache.org/jira/browse/YARN-867 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Hitesh Shah Assignee: Xuan Gong Priority: Critical Attachments: YARN-867.1.sampleCode.patch, YARN-867.3.patch, YARN-867.4.patch, YARN-867.5.patch, YARN-867.6.patch, YARN-867.sampleCode.2.patch Today, a malicious application can bring down the NM by sending bad data to a service. For example, sending data to the ShuffleService such that it results in any non-IOException will cause the NM's async dispatcher to exit, as the service's INIT APP event is not handled properly. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
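The isolation being asked for amounts to catching Throwable around each auxiliary-service callback, so that one misbehaving service cannot take down the NM's event loop. A self-contained sketch of the pattern (the interface and class names are stand-ins, not the NodeManager's real types):
{code}
import java.util.ArrayList;
import java.util.List;

interface AuxService {
  String name();
  void initApp(String applicationId, byte[] serviceData);
}

class IsolatingAuxDispatcher {
  private final List<AuxService> services = new ArrayList<>();

  void register(AuxService s) { services.add(s); }

  void onInitApp(String applicationId, byte[] serviceData) {
    for (AuxService s : services) {
      try {
        s.initApp(applicationId, serviceData);
      } catch (Throwable t) {
        // Contain and log: bad input to one service (e.g. a malformed
        // shuffle payload) must not kill the whole NM dispatcher.
        System.err.println("Aux service " + s.name() + " failed INIT_APP: " + t);
      }
    }
  }
}
{code}
A complete fix would presumably also report the failure back to the offending application rather than only logging it, but the containment idea is the same.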
[jira] [Updated] (YARN-1639) YARN RM HA requires different configs on different RM hosts
[ https://issues.apache.org/jira/browse/YARN-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1639: Attachment: YARN-1639.5.patch YARN RM HA requires different configs on different RM hosts --- Key: YARN-1639 URL: https://issues.apache.org/jira/browse/YARN-1639 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Arpit Gupta Assignee: Xuan Gong Attachments: YARN-1639.1.patch, YARN-1639.2.patch, YARN-1639.3.patch, YARN-1639.4.patch, YARN-1639.5.patch We need to set yarn.resourcemanager.ha.id to rm1 or rm2 based on which RM you want to be first or second. This means we have different configs on different RM nodes. This is unlike HDFS HA, where the same configs are pushed to both NNs; it would be better to have the same setup for the RM, as this would make installation and management easier. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1611) Make admin refresh of configuration work across RM failover
[ https://issues.apache.org/jira/browse/YARN-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13885772#comment-13885772 ] Hadoop QA commented on YARN-1611: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12625953/YARN-1611.4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2962//console This message is automatically generated. Make admin refresh of configuration work across RM failover --- Key: YARN-1611 URL: https://issues.apache.org/jira/browse/YARN-1611 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-1611.1.patch, YARN-1611.2.patch, YARN-1611.2.patch, YARN-1611.3.patch, YARN-1611.3.patch, YARN-1611.4.patch Currently, If we do refresh* for a standby RM, it will failover to the current active RM, and do the refresh* based on the local configuration file of the active RM. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1611) Make admin refresh of configuration work across RM failover
[ https://issues.apache.org/jira/browse/YARN-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13885809#comment-13885809 ] Sandy Ryza commented on YARN-1611: -- Just took a quick look at the patch. Right now we have a nice one way relationship where configs affect the services, but the services do not affect configs. The patch appears to have the RM deleting and uploading files to the remote config directory, which makes me nervous. Would it make sense for the admin to be responsible for placing configs in the remote dir, and the RMs just be responsible for pulling them down? Also, a couple other questions: * Will the existing way of doing things (writing files to disk for RMs and calling refresh on both) still be supported? * Will the remote configuration be supported for the non-HA case? We should make it possible to configure things the same way for HA and non-HA. Make admin refresh of configuration work across RM failover --- Key: YARN-1611 URL: https://issues.apache.org/jira/browse/YARN-1611 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-1611.1.patch, YARN-1611.2.patch, YARN-1611.2.patch, YARN-1611.3.patch, YARN-1611.3.patch, YARN-1611.4.patch Currently, If we do refresh* for a standby RM, it will failover to the current active RM, and do the refresh* based on the local configuration file of the active RM. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
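Sandy's pull-only model is easy to picture: the admin writes files into a shared directory, and each RM re-reads them on refresh. A minimal sketch of the pulling side (the directory layout and the method name here are hypothetical):
{code}
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RemoteConfPuller {
  // Loads <name> from a shared directory (e.g. on HDFS) so that every
  // RM refreshes from the same source of truth.
  public static Configuration pull(URI remoteDir, String name)
      throws IOException {
    Configuration conf = new Configuration(false); // no local defaults
    FileSystem fs = FileSystem.get(remoteDir, new Configuration());
    Path file = new Path(new Path(remoteDir), name);
    conf.addResource(fs.open(file));
    return conf;
  }
}
{code}
A refreshQueues call could then load, say, capacity-scheduler.xml via pull(URI.create("hdfs://nn:8020/rm-conf"), "capacity-scheduler.xml") instead of the local file, so both RMs see the same bits.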
[jira] [Assigned] (YARN-1673) Valid yarn kill application prints out help message.
[ https://issues.apache.org/jira/browse/YARN-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli reassigned YARN-1673: - Assignee: Vinod Kumar Vavilapalli Looking at this quickly. Valid yarn kill application prints out help message. Key: YARN-1673 URL: https://issues.apache.org/jira/browse/YARN-1673 Project: Hadoop YARN Issue Type: Bug Components: client Affects Versions: 2.4.0 Reporter: Tassapol Athiapinya Assignee: Vinod Kumar Vavilapalli Priority: Critical Fix For: 2.4.0 yarn application -kill application ID used to work previously. In 2.4.0 it prints out help message and does not kill the application. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
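For anyone reproducing this: while the CLI argument parsing is broken, the RM-side kill path can still be exercised through the client API. A minimal sketch using standard YarnClient calls (the wrapper class is illustrative):
{code}
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.util.ConverterUtils;

public class KillApp {
  public static void main(String[] args) throws Exception {
    // Programmatic equivalent of "yarn application -kill <Application ID>".
    YarnClient client = YarnClient.createYarnClient();
    client.init(new YarnConfiguration());
    client.start();
    try {
      client.killApplication(ConverterUtils.toApplicationId(args[0]));
    } finally {
      client.stop();
    }
  }
}
{code}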
[jira] [Commented] (YARN-1639) YARN RM HA requires different configs on different RM hosts
[ https://issues.apache.org/jira/browse/YARN-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13885832#comment-13885832 ] Hadoop QA commented on YARN-1639: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12625959/YARN-1639.5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2963//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2963//console This message is automatically generated. YARN RM HA requires different configs on different RM hosts --- Key: YARN-1639 URL: https://issues.apache.org/jira/browse/YARN-1639 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Arpit Gupta Assignee: Xuan Gong Attachments: YARN-1639.1.patch, YARN-1639.2.patch, YARN-1639.3.patch, YARN-1639.4.patch, YARN-1639.5.patch We need to set yarn.resourcemanager.ha.id to rm1 or rm2 based on which RM you want to be first or second. This means we have different configs on different RM nodes. This is unlike HDFS HA, where the same configs are pushed to both NNs; it would be better to have the same setup for the RM, as this would make installation and management easier. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1673) Valid yarn kill application prints out help message.
[ https://issues.apache.org/jira/browse/YARN-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1673: -- Priority: Blocker (was: Critical) I narrowed it down to YARN-967. Fixing this. Valid yarn kill application prints out help message. Key: YARN-1673 URL: https://issues.apache.org/jira/browse/YARN-1673 Project: Hadoop YARN Issue Type: Bug Components: client Affects Versions: 2.4.0 Reporter: Tassapol Athiapinya Assignee: Vinod Kumar Vavilapalli Priority: Blocker Fix For: 2.4.0 yarn application -kill application ID used to work previously. In 2.4.0 it prints out help message and does not kill the application. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1673) Valid yarn kill application prints out help message.
[ https://issues.apache.org/jira/browse/YARN-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1673: -- Attachment: YARN-1673.txt This works on my setup. [~mayank_bansal], can you verify why this set was put in the first place? Tx. Valid yarn kill application prints out help message. Key: YARN-1673 URL: https://issues.apache.org/jira/browse/YARN-1673 Project: Hadoop YARN Issue Type: Bug Components: client Affects Versions: 2.4.0 Reporter: Tassapol Athiapinya Assignee: Vinod Kumar Vavilapalli Priority: Blocker Fix For: 2.4.0 Attachments: YARN-1673.txt yarn application -kill application ID used to work previously. In 2.4.0 it prints out help message and does not kill the application. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1673) Valid yarn kill application prints out help message.
[ https://issues.apache.org/jira/browse/YARN-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13885874#comment-13885874 ] Mayank Bansal commented on YARN-1673: - This will break the CLI for the history server; let me take a look at this. Thanks, Mayank Valid yarn kill application prints out help message. Key: YARN-1673 URL: https://issues.apache.org/jira/browse/YARN-1673 Project: Hadoop YARN Issue Type: Bug Components: client Affects Versions: 2.4.0 Reporter: Tassapol Athiapinya Assignee: Vinod Kumar Vavilapalli Priority: Blocker Fix For: 2.4.0 Attachments: YARN-1673.txt yarn application -kill application ID used to work previously. In 2.4.0 it prints out help message and does not kill the application. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1611) Make admin refresh of configuration work across RM failover
[ https://issues.apache.org/jira/browse/YARN-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13885871#comment-13885871 ] Xuan Gong commented on YARN-1611: - Thanks for the comments, [~sandyr]. bq. Just took a quick look at the patch. Right now we have a nice one way relationship where configs affect the services, but the services do not affect configs. The patch appears to have the RM deleting and uploading files to the remote config directory, which makes me nervous. Would it make sense for the admin to be responsible for placing configs in the remote dir, and the RMs just be responsible for pulling them down? Yes, you are right. I will remove the upload and file-deletion operations from the RemoteConfiguration functionality. bq. Will the existing way of doing things (writing files to disk for RMs and calling refresh on both) still be supported? Yes, it is still supported. In the HA case, if there is no remote configuration, it will give a warning message. bq. Will the remote configuration be supported for the non-HA case? Currently, no. This is for the HA case. Make admin refresh of configuration work across RM failover --- Key: YARN-1611 URL: https://issues.apache.org/jira/browse/YARN-1611 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-1611.1.patch, YARN-1611.2.patch, YARN-1611.2.patch, YARN-1611.3.patch, YARN-1611.3.patch, YARN-1611.4.patch Currently, If we do refresh* for a standby RM, it will failover to the current active RM, and do the refresh* based on the local configuration file of the active RM. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Assigned] (YARN-1673) Valid yarn kill application prints out help message.
[ https://issues.apache.org/jira/browse/YARN-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal reassigned YARN-1673: --- Assignee: Mayank Bansal (was: Vinod Kumar Vavilapalli) Valid yarn kill application prints out help message. Key: YARN-1673 URL: https://issues.apache.org/jira/browse/YARN-1673 Project: Hadoop YARN Issue Type: Bug Components: client Affects Versions: 2.4.0 Reporter: Tassapol Athiapinya Assignee: Mayank Bansal Priority: Blocker Attachments: YARN-1673.txt yarn application -kill application ID used to work previously. In 2.4.0 it prints out help message and does not kill the application. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1673) Valid yarn kill application prints out help message.
[ https://issues.apache.org/jira/browse/YARN-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1673: -- Fix Version/s: (was: 2.4.0) Valid yarn kill application prints out help message. Key: YARN-1673 URL: https://issues.apache.org/jira/browse/YARN-1673 Project: Hadoop YARN Issue Type: Bug Components: client Affects Versions: 2.4.0 Reporter: Tassapol Athiapinya Assignee: Mayank Bansal Priority: Blocker Attachments: YARN-1673.txt yarn application -kill application ID used to work previously. In 2.4.0 it prints out help message and does not kill the application. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1611) Make admin refresh of configuration work across RM failover
[ https://issues.apache.org/jira/browse/YARN-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1611: Attachment: YARN-1611.5.patch Make admin refresh of configuration work across RM failover --- Key: YARN-1611 URL: https://issues.apache.org/jira/browse/YARN-1611 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-1611.1.patch, YARN-1611.2.patch, YARN-1611.2.patch, YARN-1611.3.patch, YARN-1611.3.patch, YARN-1611.4.patch, YARN-1611.5.patch Currently, If we do refresh* for a standby RM, it will failover to the current active RM, and do the refresh* based on the local configuration file of the active RM. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (YARN-1674) Application launch gets stuck in ACCEPTED state
Trupti Dhavle created YARN-1674: --- Summary: Application launch gets stuck in ACCEPTED state Key: YARN-1674 URL: https://issues.apache.org/jira/browse/YARN-1674 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Trupti Dhavle During a test run, it was seen that one of the applications never started running. It was stuck in the ACCEPTED state although the RM UI showed that the cluster had enough resources to run the application. Even the subsequent apps got stuck. From the logs: {noformat} 2014-01-29 11:53:36,030 INFO capacity.ParentQueue (ParentQueue.java:assignContainers(583)) - assignedContainer queue=root usedCapacity=0.5 absoluteUsedCapacity=0.5 used=memory:4096, vCores:2 cluster=memory:8192, vCores:8 2014-01-29 11:53:36,031 ERROR resourcemanager.ResourceManager (ResourceManager.java:handle(716)) - Error in handling event type CONTAINER_ALLOCATED for applicationAttempt application_1390987787623_0264 java.lang.IndexOutOfBoundsException: Index: 0, Size: 0 at java.util.ArrayList.RangeCheck(ArrayList.java:547) at java.util.ArrayList.get(ArrayList.java:322) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:819) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:804) at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:643) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:102) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:714) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:695) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) at java.lang.Thread.run(Thread.java:662) 2014-01-29 11:53:37,876 INFO delegation.AbstractDelegationTokenSecretManager (AbstractDelegationTokenSecretManager.java:createPassword(285)) - Creating password for identifier: owner=hrt_qa, renewer=oozi {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
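The stack trace points at a bare get(0) on the list of newly allocated containers inside AMContainerAllocatedTransition; an allocate response can legitimately be empty, so the access needs a guard. A generic illustration of the pattern (not the actual RM fix):
{code}
import java.util.Collections;
import java.util.List;

class FirstAllocated {
  // Guarded access: allocated.get(0) on an empty list is exactly the
  // IndexOutOfBoundsException ("Index: 0, Size: 0") seen in the RM log.
  static <T> T firstOrNull(List<T> allocated) {
    return allocated.isEmpty() ? null : allocated.get(0);
  }

  public static void main(String[] args) {
    System.out.println(firstOrNull(Collections.<String>emptyList())); // null
  }
}
{code}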
[jira] [Created] (YARN-1675) Application does not change to RUNNING after being scheduled
Trupti Dhavle created YARN-1675: --- Summary: Application does not change to RUNNING after being scheduled Key: YARN-1675 URL: https://issues.apache.org/jira/browse/YARN-1675 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Trupti Dhavle I dont see any stacktraces in logs. But the debug logs show negative vcores- 2014-01-29 18:42:26,357 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(808)) - assignContainers: node=hor11n39.gq1.ygridcore.net #applications=5 2014-01-29 18:42:26,357 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(827)) - pre-assignContainers for application application_1390986573180_0269 2014-01-29 18:42:26,358 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(326)) - showRequests: application=application_1390986573180_0269 headRoom=memory:22528, vCores:0 currentConsumption=2048 2014-01-29 18:42:26,358 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(330)) - showRequests: application=application_1390986573180_0269 request={Priority: 0, Capability: memory:2048, vCores:1, # Containers: 0, Location: *, Relax Locality: true} 2014-01-29 18:42:26,358 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(911)) - post-assignContainers for application application_1390986573180_0269 2014-01-29 18:42:26,358 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(326)) - showRequests: application=application_1390986573180_0269 headRoom=memory:22528, vCores:0 currentConsumption=2048 2014-01-29 18:42:26,358 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(330)) - showRequests: application=application_1390986573180_0269 request={Priority: 0, Capability: memory:2048, vCores:1, # Containers: 0, Location: *, Relax Locality: true} 2014-01-29 18:42:26,358 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(827)) - pre-assignContainers for application application_1390986573180_0272 2014-01-29 18:42:26,358 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(326)) - showRequests: application=application_1390986573180_0272 headRoom=memory:18432, vCores:-2 currentConsumption=2048 2014-01-29 18:42:26,359 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(330)) - showRequests: application=application_1390986573180_0272 request={Priority: 0, Capability: memory:2048, vCores:1, # Containers: 0, Location: *, Relax Locality: true} 2014-01-29 18:42:26,359 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(911)) - post-assignContainers for application application_1390986573180_0272 2014-01-29 18:42:26,359 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(326)) - showRequests: application=application_1390986573180_0272 headRoom=memory:18432, vCores:-2 currentConsumption=2048 2014-01-29 18:42:26,359 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(330)) - showRequests: application=application_1390986573180_0272 request={Priority: 0, Capability: memory:2048, vCores:1, # Containers: 0, Location: *, Relax Locality: true} 2014-01-29 18:42:26,359 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(827)) - pre-assignContainers for application application_1390986573180_0273 2014-01-29 18:42:26,359 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(326)) - showRequests: 
application=application_1390986573180_0273 headRoom=memory:18432, vCores:-2 currentConsumption=2048 2014-01-29 18:42:26,359 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(330)) - showRequests: application=application_1390986573180_0273 request={Priority: 0, Capability: memory:2048, vCores:1, # Containers: 0, Location: *, Relax Locality: true} 2014-01-29 18:42:26,360 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(911)) - post-assignContainers for application application_1390986573180_0273 2014-01-29 18:42:26,360 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(326)) - showRequests: application=application_1390986573180_0273 headRoom=memory:18432, vCores:-2 currentConsumption=2048 2014-01-29 18:42:26,360 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(330)) - showRequests: application=application_1390986573180_0273 request={Priority: 0, Capability: memory:2048, vCores:1, # Containers: 0, Location: *, Relax Locality: true} 2014-01-29 18:42:26,360 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(827)) - pre-assignContainers for application application_1390986573180_0274 2014-01-29 18:42:26,360 DEBUG scheduler.SchedulerApplicationAttempt
[jira] [Updated] (YARN-1675) Application does not change to RUNNING after being scheduled
[ https://issues.apache.org/jira/browse/YARN-1675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Trupti Dhavle updated YARN-1675: Component/s: resourcemanager Application does not change to RUNNING after being scheduled Key: YARN-1675 URL: https://issues.apache.org/jira/browse/YARN-1675 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.4.0 Reporter: Trupti Dhavle I dont see any stacktraces in logs. But the debug logs show negative vcores- {noformat} 2014-01-29 18:42:26,357 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(808)) - assignContainers: node=hor11n39.gq1.ygridcore.net #applications=5 2014-01-29 18:42:26,357 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(827)) - pre-assignContainers for application application_1390986573180_0269 2014-01-29 18:42:26,358 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(326)) - showRequests: application=application_1390986573180_0269 headRoom=memory:22528, vCores:0 currentConsumption=2048 2014-01-29 18:42:26,358 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(330)) - showRequests: application=application_1390986573180_0269 request={Priority: 0, Capability: memory:2048, vCores:1, # Containers: 0, Location: *, Relax Locality: true} 2014-01-29 18:42:26,358 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(911)) - post-assignContainers for application application_1390986573180_0269 2014-01-29 18:42:26,358 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(326)) - showRequests: application=application_1390986573180_0269 headRoom=memory:22528, vCores:0 currentConsumption=2048 2014-01-29 18:42:26,358 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(330)) - showRequests: application=application_1390986573180_0269 request={Priority: 0, Capability: memory:2048, vCores:1, # Containers: 0, Location: *, Relax Locality: true} 2014-01-29 18:42:26,358 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(827)) - pre-assignContainers for application application_1390986573180_0272 2014-01-29 18:42:26,358 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(326)) - showRequests: application=application_1390986573180_0272 headRoom=memory:18432, vCores:-2 currentConsumption=2048 2014-01-29 18:42:26,359 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(330)) - showRequests: application=application_1390986573180_0272 request={Priority: 0, Capability: memory:2048, vCores:1, # Containers: 0, Location: *, Relax Locality: true} 2014-01-29 18:42:26,359 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(911)) - post-assignContainers for application application_1390986573180_0272 2014-01-29 18:42:26,359 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(326)) - showRequests: application=application_1390986573180_0272 headRoom=memory:18432, vCores:-2 currentConsumption=2048 2014-01-29 18:42:26,359 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(330)) - showRequests: application=application_1390986573180_0272 request={Priority: 0, Capability: memory:2048, vCores:1, # Containers: 0, Location: *, Relax Locality: true} 2014-01-29 18:42:26,359 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(827)) - pre-assignContainers for application 
application_1390986573180_0273 2014-01-29 18:42:26,359 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(326)) - showRequests: application=application_1390986573180_0273 headRoom=memory:18432, vCores:-2 currentConsumption=2048 2014-01-29 18:42:26,359 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(330)) - showRequests: application=application_1390986573180_0273 request={Priority: 0, Capability: memory:2048, vCores:1, # Containers: 0, Location: *, Relax Locality: true} 2014-01-29 18:42:26,360 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(911)) - post-assignContainers for application application_1390986573180_0273 2014-01-29 18:42:26,360 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(326)) - showRequests: application=application_1390986573180_0273 headRoom=memory:18432, vCores:-2 currentConsumption=2048 2014-01-29 18:42:26,360 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(330)) - showRequests: application=application_1390986573180_0273 request={Priority: 0,
[jira] [Updated] (YARN-1675) Application does not change to RUNNING after being scheduled
[ https://issues.apache.org/jira/browse/YARN-1675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Trupti Dhavle updated YARN-1675: Description: I dont see any stacktraces in logs. But the debug logs show negative vcores- {noformat} 2014-01-29 18:42:26,357 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(808)) - assignContainers: node=hor11n39.gq1.ygridcore.net #applications=5 2014-01-29 18:42:26,357 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(827)) - pre-assignContainers for application application_1390986573180_0269 2014-01-29 18:42:26,358 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(326)) - showRequests: application=application_1390986573180_0269 headRoom=memory:22528, vCores:0 currentConsumption=2048 2014-01-29 18:42:26,358 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(330)) - showRequests: application=application_1390986573180_0269 request={Priority: 0, Capability: memory:2048, vCores:1, # Containers: 0, Location: *, Relax Locality: true} 2014-01-29 18:42:26,358 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(911)) - post-assignContainers for application application_1390986573180_0269 2014-01-29 18:42:26,358 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(326)) - showRequests: application=application_1390986573180_0269 headRoom=memory:22528, vCores:0 currentConsumption=2048 2014-01-29 18:42:26,358 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(330)) - showRequests: application=application_1390986573180_0269 request={Priority: 0, Capability: memory:2048, vCores:1, # Containers: 0, Location: *, Relax Locality: true} 2014-01-29 18:42:26,358 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(827)) - pre-assignContainers for application application_1390986573180_0272 2014-01-29 18:42:26,358 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(326)) - showRequests: application=application_1390986573180_0272 headRoom=memory:18432, vCores:-2 currentConsumption=2048 2014-01-29 18:42:26,359 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(330)) - showRequests: application=application_1390986573180_0272 request={Priority: 0, Capability: memory:2048, vCores:1, # Containers: 0, Location: *, Relax Locality: true} 2014-01-29 18:42:26,359 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(911)) - post-assignContainers for application application_1390986573180_0272 2014-01-29 18:42:26,359 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(326)) - showRequests: application=application_1390986573180_0272 headRoom=memory:18432, vCores:-2 currentConsumption=2048 2014-01-29 18:42:26,359 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(330)) - showRequests: application=application_1390986573180_0272 request={Priority: 0, Capability: memory:2048, vCores:1, # Containers: 0, Location: *, Relax Locality: true} 2014-01-29 18:42:26,359 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(827)) - pre-assignContainers for application application_1390986573180_0273 2014-01-29 18:42:26,359 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(326)) - showRequests: application=application_1390986573180_0273 headRoom=memory:18432, vCores:-2 currentConsumption=2048 2014-01-29 
18:42:26,359 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(330)) - showRequests: application=application_1390986573180_0273 request={Priority: 0, Capability: memory:2048, vCores:1, # Containers: 0, Location: *, Relax Locality: true} 2014-01-29 18:42:26,360 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(911)) - post-assignContainers for application application_1390986573180_0273 2014-01-29 18:42:26,360 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(326)) - showRequests: application=application_1390986573180_0273 headRoom=memory:18432, vCores:-2 currentConsumption=2048 2014-01-29 18:42:26,360 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(330)) - showRequests: application=application_1390986573180_0273 request={Priority: 0, Capability: memory:2048, vCores:1, # Containers: 0, Location: *, Relax Locality: true} 2014-01-29 18:42:26,360 DEBUG capacity.LeafQueue (LeafQueue.java:assignContainers(827)) - pre-assignContainers for application application_1390986573180_0274 2014-01-29 18:42:26,360 DEBUG scheduler.SchedulerApplicationAttempt (SchedulerApplicationAttempt.java:showRequests(326)) - showRequests: application=application_1390986573180_0274 headRoom=memory:16384, vCores:-3
[jira] [Updated] (YARN-1668) Make admin refreshAdminAcls work across RM failover
[ https://issues.apache.org/jira/browse/YARN-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1668: Attachment: YARN-1668.1.patch Created a patch for the admin refreshAdminAcls changes. This patch is based on YARN-1611. Make admin refreshAdminAcls work across RM failover --- Key: YARN-1668 URL: https://issues.apache.org/jira/browse/YARN-1668 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-1668.1.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1504) RM changes for moving apps between queues
[ https://issues.apache.org/jira/browse/YARN-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13885942#comment-13885942 ] Karthik Kambatla commented on YARN-1504: Comments: # AbstractYarnScheduler: The exception message should specify the current scheduler being used, not FairScheduler. {code}
@Override
public String moveApplication(ApplicationId appId, String newQueue)
    throws YarnException {
  throw new YarnException(
      "Fair Scheduler does not support moving apps between queues");
}
{code} # TestClientRMService: can we add more tests to cover all the error cases being checked for in ClientRMService#move*()? # Shouldn't need a new RMAppAttemptEventType for MOVE? # RMAppMoveTransition: Using futures, it is easy to regress and not set the Exception or value of the future. Can we add comments (maybe javadoc style) describing the contract honored by RMAppMoveTransition? Also, we should add unit tests for RMAppMoveTransition to avoid regressions in the future - verify that either the value or the exception is set. # Nit: Not a fan of the field name - RMAppEvent#resultFuture. Any better names? How about just result? # Nit: Would drop the RMAppState changes (comments) - they don't seem to add much. RM changes for moving apps between queues - Key: YARN-1504 URL: https://issues.apache.org/jira/browse/YARN-1504 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.2.0 Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1504.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
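On the first review comment, one way to make the message track the actual scheduler is to derive it from the concrete class. A sketch of a drop-in variant of the quoted method (illustrative only, not the change that was committed):
{code}
@Override
public String moveApplication(ApplicationId appId, String newQueue)
    throws YarnException {
  // Names the actual scheduler subclass instead of hard-coding one.
  throw new YarnException(getClass().getSimpleName()
      + " does not support moving apps between queues");
}
{code}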
[jira] [Updated] (YARN-1667) Make admin refreshSuperUserGroupsConfiguration work across RM failover
[ https://issues.apache.org/jira/browse/YARN-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1667: Attachment: YARN-1667.1.patch Created the patch, based on YARN-1611, for the refreshSuperUserGroupsConfiguration changes. Make admin refreshSuperUserGroupsConfiguration work across RM failover -- Key: YARN-1667 URL: https://issues.apache.org/jira/browse/YARN-1667 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-1667.1.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (YARN-1676) Make admin refreshUserToGroupsMappings of configuration work across RM failover
Xuan Gong created YARN-1676: --- Summary: Make admin refreshUserToGroupsMappings of configuration work across RM failover Key: YARN-1676 URL: https://issues.apache.org/jira/browse/YARN-1676 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1498) Common scheduler changes for moving apps between queues
[ https://issues.apache.org/jira/browse/YARN-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13885973#comment-13885973 ] Karthik Kambatla commented on YARN-1498: Comments: # AppSchedulingInfo: Not sure I understand the relevance of the following change to this JIRA. Am I missing something or is it just cleanup? {code}
-    metrics.incrPendingResources(user, request.getNumContainers()
-        - lastRequestContainers, Resources.subtractFrom( // save a clone
-        Resources.multiply(request.getCapability(), request
-            .getNumContainers()), Resources.multiply(lastRequestCapability,
-            lastRequestContainers)));
+    metrics.incrPendingResources(user, request.getNumContainers(),
+        request.getCapability());
+    metrics.decrPendingResources(user, lastRequestContainers,
+        lastRequestCapability);
{code} # Can we throw an exception instead of returning null? {code}
@Override
public ActiveUsersManager getActiveUsersManager() {
  // Should never be called since all applications are submitted to LeafQueues
  return null;
}
{code} Otherwise, looks good to me. Common scheduler changes for moving apps between queues --- Key: YARN-1498 URL: https://issues.apache.org/jira/browse/YARN-1498 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.2.0 Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1498-1.patch, YARN-1498.patch, YARN-1498.patch This JIRA is to track changes that aren't in particular schedulers but that help them support moving apps between queues. In particular, it makes sure that QueueMetrics are properly updated when an app changes queue. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
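On the "throw instead of returning null" comment, a fail-fast sketch of the quoted method (again illustrative, not the committed change):
{code}
@Override
public ActiveUsersManager getActiveUsersManager() {
  // Fail fast instead of returning null: this should never be reached,
  // since applications are only submitted to LeafQueues.
  throw new UnsupportedOperationException(
      "getActiveUsersManager is not supported on this queue type");
}
{code}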
[jira] [Updated] (YARN-1676) Make admin refreshUserToGroupsMappings of configuration work across RM failover
[ https://issues.apache.org/jira/browse/YARN-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1676: Attachment: YARN-1676.1.patch Created the patch, based on YARN-1611, for the refreshUserToGroupsMappings changes. Make admin refreshUserToGroupsMappings of configuration work across RM failover --- Key: YARN-1676 URL: https://issues.apache.org/jira/browse/YARN-1676 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-1676.1.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1611) Make admin refresh of configuration work across RM failover
[ https://issues.apache.org/jira/browse/YARN-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13886015#comment-13886015 ] Hadoop QA commented on YARN-1611: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12625987/YARN-1611.5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2964//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2964//console This message is automatically generated. Make admin refresh of configuration work across RM failover --- Key: YARN-1611 URL: https://issues.apache.org/jira/browse/YARN-1611 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-1611.1.patch, YARN-1611.2.patch, YARN-1611.2.patch, YARN-1611.3.patch, YARN-1611.3.patch, YARN-1611.4.patch, YARN-1611.5.patch Currently, If we do refresh* for a standby RM, it will failover to the current active RM, and do the refresh* based on the local configuration file of the active RM. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1661) AppMaster logs says failing even if an application does succeed.
[ https://issues.apache.org/jira/browse/YARN-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1661: -- Fix Version/s: (was: 2.3.0) Assignee: Vinod Kumar Vavilapalli AppMaster logs says failing even if an application does succeed. Key: YARN-1661 URL: https://issues.apache.org/jira/browse/YARN-1661 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Affects Versions: 2.3.0 Reporter: Tassapol Athiapinya Assignee: Vinod Kumar Vavilapalli Run: /usr/bin/yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar distributed shell jar -shell_command ls Open AM logs. Last line would indicate AM failure even though container logs print good ls result. {code} 2014-01-24 21:45:29,592 INFO [main] distributedshell.ApplicationMaster (ApplicationMaster.java:finish(599)) - Application completed. Signalling finish to RM 2014-01-24 21:45:29,612 INFO [main] impl.AMRMClientImpl (AMRMClientImpl.java:unregisterApplicationMaster(315)) - Waiting for application to be successfully unregistered. 2014-01-24 21:45:29,816 INFO [main] distributedshell.ApplicationMaster (ApplicationMaster.java:main(267)) - Application Master failed. exiting {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1661) AppMaster logs says failing even if an application does succeed.
[ https://issues.apache.org/jira/browse/YARN-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13886292#comment-13886292 ] Vinod Kumar Vavilapalli commented on YARN-1661: --- This was broken by YARN-1566. AppMaster logs says failing even if an application does succeed. Key: YARN-1661 URL: https://issues.apache.org/jira/browse/YARN-1661 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Affects Versions: 2.3.0 Reporter: Tassapol Athiapinya Assignee: Vinod Kumar Vavilapalli Run: /usr/bin/yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar distributed shell jar -shell_command ls Open AM logs. Last line would indicate AM failure even though container logs print good ls result. {code} 2014-01-24 21:45:29,592 INFO [main] distributedshell.ApplicationMaster (ApplicationMaster.java:finish(599)) - Application completed. Signalling finish to RM 2014-01-24 21:45:29,612 INFO [main] impl.AMRMClientImpl (AMRMClientImpl.java:unregisterApplicationMaster(315)) - Waiting for application to be successfully unregistered. 2014-01-24 21:45:29,816 INFO [main] distributedshell.ApplicationMaster (ApplicationMaster.java:main(267)) - Application Master failed. exiting {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1661) AppMaster logs says failing even if an application does succeed.
[ https://issues.apache.org/jira/browse/YARN-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1661: -- Attachment: YARN-1661.txt The issue is that {{run()}} returns the value of _success_ before the correct value has been set; that happens right after, in the {{finish()}} method of ApplicationMaster. Attaching a patch that fixes this. Tested that it works on a single-node cluster. AppMaster logs says failing even if an application does succeed. Key: YARN-1661 URL: https://issues.apache.org/jira/browse/YARN-1661 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Affects Versions: 2.3.0 Reporter: Tassapol Athiapinya Assignee: Vinod Kumar Vavilapalli Attachments: YARN-1661.txt Run: /usr/bin/yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar distributed shell jar -shell_command ls Open AM logs. Last line would indicate AM failure even though container logs print good ls result. {code} 2014-01-24 21:45:29,592 INFO [main] distributedshell.ApplicationMaster (ApplicationMaster.java:finish(599)) - Application completed. Signalling finish to RM 2014-01-24 21:45:29,612 INFO [main] impl.AMRMClientImpl (AMRMClientImpl.java:unregisterApplicationMaster(315)) - Waiting for application to be successfully unregistered. 2014-01-24 21:45:29,816 INFO [main] distributedshell.ApplicationMaster (ApplicationMaster.java:main(267)) - Application Master failed. exiting {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
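The bug is an ordering problem and reduces to a small skeleton: main() reports a flag captured by run() before finish() has computed the real verdict. A self-contained illustration (all names are stand-ins for the distributed-shell AM, not its actual code):
{code}
class AmSkeleton {
  private boolean success;

  boolean run() {
    // ... launch containers, wait for completion ...
    return success;            // BUG: still the default 'false' here
  }

  boolean finish() {
    success = true;            // verdict only computed at unregister time
    return success;
  }

  public static void main(String[] args) {
    AmSkeleton am = new AmSkeleton();
    boolean wrong = am.run();  // reproduces "Application Master failed"
    boolean right = am.finish();
    System.out.println("run()=" + wrong + " finish()=" + right);
  }
}
{code}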
[jira] [Updated] (YARN-1673) Valid yarn kill application prints out help message.
[ https://issues.apache.org/jira/browse/YARN-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1673: -- Target Version/s: 2.4.0 Valid yarn kill application prints out help message. Key: YARN-1673 URL: https://issues.apache.org/jira/browse/YARN-1673 Project: Hadoop YARN Issue Type: Bug Components: client Affects Versions: 2.4.0 Reporter: Tassapol Athiapinya Assignee: Mayank Bansal Priority: Blocker Attachments: YARN-1673.txt yarn application -kill application ID used to work previously. In 2.4.0 it prints out help message and does not kill the application. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-978) [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation
[ https://issues.apache.org/jira/browse/YARN-978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-978: - Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation -- Key: YARN-978 URL: https://issues.apache.org/jira/browse/YARN-978 Project: Hadoop YARN Issue Type: Sub-task Reporter: Mayank Bansal Assignee: Mayank Bansal Fix For: 2.4.0 Attachments: YARN-978-1.patch, YARN-978.10.patch, YARN-978.2.patch, YARN-978.3.patch, YARN-978.4.patch, YARN-978.5.patch, YARN-978.6.patch, YARN-978.7.patch, YARN-978.8.patch, YARN-978.9.patch We don't have an ApplicationAttemptReport and its Protobuf implementation. Adding that. Thanks, Mayank -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-947) Defining the history data classes for the implementation of the reading/writing interface
[ https://issues.apache.org/jira/browse/YARN-947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-947: - Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. Defining the history data classes for the implementation of the reading/writing interface - Key: YARN-947 URL: https://issues.apache.org/jira/browse/YARN-947 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Zhijie Shen Fix For: 2.4.0 Attachments: YARN-947.1.patch, YARN-947.2.patch, YARN-947.3.patch, YARN-947.4.patch, YARN-947.5.patch, YARN-947.6.patch, YARN-947.8.patch, YARN-947.9.patch We need to define the history data classes to have the exact fields to be stored. Then none of the implementations need duplicate logic to extract the required information from RMApp, RMAppAttempt and RMContainer. We use protobuf to define these classes, such that they can be serialized to and deserialized from bytes, which is easier for persistence. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1123) [YARN-321] Adding ContainerReport and Protobuf implementation
[ https://issues.apache.org/jira/browse/YARN-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1123: -- Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. [YARN-321] Adding ContainerReport and Protobuf implementation - Key: YARN-1123 URL: https://issues.apache.org/jira/browse/YARN-1123 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Mayank Bansal Fix For: 2.4.0 Attachments: YARN-1123-1.patch, YARN-1123-2.patch, YARN-1123-3.patch, YARN-1123-4.patch, YARN-1123-5.patch, YARN-1123-6.patch Like YARN-978, we need some client-oriented class to expose the container history info. Neither Container nor RMContainer is the right one. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-987) Adding ApplicationHistoryManager responsible for exposing reports to all clients
[ https://issues.apache.org/jira/browse/YARN-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-987: - Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. Adding ApplicationHistoryManager responsible for exposing reports to all clients Key: YARN-987 URL: https://issues.apache.org/jira/browse/YARN-987 Project: Hadoop YARN Issue Type: Sub-task Reporter: Mayank Bansal Assignee: Mayank Bansal Fix For: 2.4.0 Attachments: YARN-987-1.patch, YARN-987-2.patch, YARN-987-3.patch, YARN-987-4.patch, YARN-987-5.patch, YARN-987-6.patch, YARN-987-7.patch, YARN-987-8.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-955) [YARN-321] Implementation of ApplicationHistoryProtocol
[ https://issues.apache.org/jira/browse/YARN-955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-955: - Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. [YARN-321] Implementation of ApplicationHistoryProtocol --- Key: YARN-955 URL: https://issues.apache.org/jira/browse/YARN-955 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Mayank Bansal Fix For: 2.4.0 Attachments: YARN-955-1.patch, YARN-955-2.patch, YARN-955-3.patch, YARN-955-4.patch, YARN-955-5.patch, YARN-955-6.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1242) Script changes to start AHS as an individual process
[ https://issues.apache.org/jira/browse/YARN-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1242: -- Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. Script changes to start AHS as an individual process Key: YARN-1242 URL: https://issues.apache.org/jira/browse/YARN-1242 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Mayank Bansal Fix For: 2.4.0 Attachments: YARN-1242-1.patch, YARN-1242-2.patch, YARN-1242-3.patch Add the command in yarn and yarn.cmd to start and stop AHS -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-930) Bootstrap ApplicationHistoryService module
[ https://issues.apache.org/jira/browse/YARN-930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-930: - Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. Bootstrap ApplicationHistoryService module -- Key: YARN-930 URL: https://issues.apache.org/jira/browse/YARN-930 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: YARN-321 Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Fix For: 2.4.0 Attachments: YARN-930-20130716.1.txt, YARN-930-20130716.2.txt, YARN-930-20130716.txt -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1534) TestAHSWebApp failed in YARN-321 branch
[ https://issues.apache.org/jira/browse/YARN-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1534: -- Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. TestAHSWebApp failed in YARN-321 branch --- Key: YARN-1534 URL: https://issues.apache.org/jira/browse/YARN-1534 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: YARN-321 Environment: CentOS 6.3, JDK 1.6.0_31 Reporter: Shinichi Yamashita Assignee: Shinichi Yamashita Fix For: 2.4.0 Attachments: YARN-1534.patch I ran the following commands. And I confirmed failure of TestAHSWebApp. {code} [sinchii@hdX YARN-321-test]$ mvn clean test -Dtest=org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.* {code} {code} Running org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.492 sec - in org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices Running org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebApp Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.193 sec FAILURE! - in org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebApp initializationError(org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebApp) Time elapsed: 0.016 sec ERROR! java.lang.Exception: Test class should have exactly one public zero-argument constructor at org.junit.runners.BlockJUnit4ClassRunner.validateZeroArgConstructor(BlockJUnit4ClassRunner.java:144) at org.junit.runners.BlockJUnit4ClassRunner.validateConstructor(BlockJUnit4ClassRunner.java:121) at org.junit.runners.BlockJUnit4ClassRunner.collectInitializationErrors(BlockJUnit4ClassRunner.java:101) at org.junit.runners.ParentRunner.validate(ParentRunner.java:344) (*snip*) {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1191) [YARN-321] Update artifact versions for application history service
[ https://issues.apache.org/jira/browse/YARN-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1191: -- Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. [YARN-321] Update artifact versions for application history service --- Key: YARN-1191 URL: https://issues.apache.org/jira/browse/YARN-1191 Project: Hadoop YARN Issue Type: Sub-task Reporter: Mayank Bansal Assignee: Mayank Bansal Fix For: 2.4.0 Attachments: YARN-1191-1.patch Compilation is failing for YARN-321 branch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1605) Fix formatting issues with new module in YARN-321 branch
[ https://issues.apache.org/jira/browse/YARN-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1605: -- Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. Fix formatting issues with new module in YARN-321 branch Key: YARN-1605 URL: https://issues.apache.org/jira/browse/YARN-1605 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Fix For: 2.4.0 Attachments: YARN-1605-20140116.txt There are a bunch of formatting issues. I'm restricting myself to a sweep of all the files in the new module. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-967) [YARN-321] Command Line Interface(CLI) for Reading Application History Storage Data
[ https://issues.apache.org/jira/browse/YARN-967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-967: - Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. [YARN-321] Command Line Interface(CLI) for Reading Application History Storage Data --- Key: YARN-967 URL: https://issues.apache.org/jira/browse/YARN-967 Project: Hadoop YARN Issue Type: Sub-task Reporter: Devaraj K Assignee: Mayank Bansal Fix For: 2.4.0 Attachments: YARN-967-1.patch, YARN-967-10.patch, YARN-967-11.patch, YARN-967-12.patch, YARN-967-13.patch, YARN-967-14.patch, YARN-967-2.patch, YARN-967-3.patch, YARN-967-4.patch, YARN-967-5.patch, YARN-967-6.patch, YARN-967-7.patch, YARN-967-8.patch, YARN-967-9.patch, YARN-967.15.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-954) [YARN-321] History Service should create the webUI and wire it to HistoryStorage
[ https://issues.apache.org/jira/browse/YARN-954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-954: - Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. [YARN-321] History Service should create the webUI and wire it to HistoryStorage Key: YARN-954 URL: https://issues.apache.org/jira/browse/YARN-954 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Zhijie Shen Fix For: 2.4.0 Attachments: YARN-954-3.patch, YARN-954-v0.patch, YARN-954-v1.patch, YARN-954-v2.patch, YARN-954.4.patch, YARN-954.5.patch, YARN-954.6.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-975) Add a file-system implementation for history-storage
[ https://issues.apache.org/jira/browse/YARN-975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-975: - Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. Add a file-system implementation for history-storage Key: YARN-975 URL: https://issues.apache.org/jira/browse/YARN-975 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Zhijie Shen Fix For: 2.4.0 Attachments: YARN-975.1.patch, YARN-975.10.patch, YARN-975.11.patch, YARN-975.2.patch, YARN-975.3.patch, YARN-975.4.patch, YARN-975.5.patch, YARN-975.6.patch, YARN-975.7.patch, YARN-975.8.patch, YARN-975.9.patch HDFS implementation should be a standard persistence strategy of history storage -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1555) [YARN-321] Failing tests in org.apache.hadoop.yarn.server.applicationhistoryservice.*
[ https://issues.apache.org/jira/browse/YARN-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1555: -- Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. [YARN-321] Failing tests in org.apache.hadoop.yarn.server.applicationhistoryservice.* - Key: YARN-1555 URL: https://issues.apache.org/jira/browse/YARN-1555 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Fix For: 2.4.0 Attachments: YARN-1555-20140102.txt Several tests are failing on the latest YARN-321 branch. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-935) YARN-321 branch is broken due to applicationhistoryserver module's pom.xml
[ https://issues.apache.org/jira/browse/YARN-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-935: - Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. YARN-321 branch is broken due to applicationhistoryserver module's pom.xml -- Key: YARN-935 URL: https://issues.apache.org/jira/browse/YARN-935 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Zhijie Shen Fix For: 2.4.0 Attachments: YARN-935.1.patch, YARN-935.2.patch The branch was created from branch-2, so hadoop-yarn-server-applicationhistoryserver/pom.xml should use 2.2.0-SNAPSHOT, not 3.0.0-SNAPSHOT. Otherwise, the sub-project cannot be built correctly because of a wrong dependency. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-974) RMContainer should collect more useful information to be recorded in Application-History
[ https://issues.apache.org/jira/browse/YARN-974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-974: - Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. RMContainer should collect more useful information to be recorded in Application-History Key: YARN-974 URL: https://issues.apache.org/jira/browse/YARN-974 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Zhijie Shen Fix For: 2.4.0 Attachments: YARN-974.1.patch, YARN-974.2.patch, YARN-974.3.patch, YARN-974.4.patch, YARN-974.5.patch To record the history of a container, users may also be interested in the following information:
1. Start Time
2. Stop Time
3. Diagnostic Information
4. URL to the Log File
5. Actually Allocated Resource
6. Actually Assigned Node
These should be remembered during the RMContainer's life cycle. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
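As a rough illustration of the six items listed in YARN-974 above, a plain holder could look like the following; the class and field names are hypothetical stand-ins, not the actual ContainerHistoryData record from the YARN-321 branch:
{code}
// Hypothetical sketch only: groups the six pieces of information that
// YARN-974 says RMContainer should retain across its life cycle.
public class ContainerHistoryInfoSketch {
  private long startTime;            // 1. set when the container is launched
  private long finishTime;           // 2. set when the container completes
  private String diagnosticsInfo;    // 3. diagnostics from the final event
  private String logUrl;             // 4. URL to the container's log file
  private String allocatedResource;  // 5. e.g. "memory:1024, vCores:1"
  private String assignedNodeId;     // 6. e.g. "host1:8041"

  // Getters/setters omitted; the real records on the branch appear to be
  // protobuf-backed (see the YARN-984 entry below).
}
{code}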
[jira] [Updated] (YARN-962) Update application_history_service.proto
[ https://issues.apache.org/jira/browse/YARN-962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-962: - Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. Update application_history_service.proto Key: YARN-962 URL: https://issues.apache.org/jira/browse/YARN-962 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Zhijie Shen Fix For: 2.4.0 Attachments: YARN-962.1.patch
1. Change its name to application_history_client.proto
2. Fix the incorrect proto reference.
3. Correct the dir in pom.xml
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1595) Test failures on YARN-321 branch
[ https://issues.apache.org/jira/browse/YARN-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1595: -- Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. Test failures on YARN-321 branch Key: YARN-1595 URL: https://issues.apache.org/jira/browse/YARN-1595 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Fix For: 2.4.0 Attachments: YARN-1595-20140115.1.txt, YARN-1595-20140115.txt, YARN-1595-20140116.1.txt, YARN-1595-20140116.txt mvn test doesn't pass on YARN-321 branch anymore. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1266) Implement PB service and client wrappers for ApplicationHistoryProtocol
[ https://issues.apache.org/jira/browse/YARN-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1266: -- Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. Implement PB service and client wrappers for ApplicationHistoryProtocol --- Key: YARN-1266 URL: https://issues.apache.org/jira/browse/YARN-1266 Project: Hadoop YARN Issue Type: Sub-task Reporter: Mayank Bansal Assignee: Mayank Bansal Fix For: 2.4.0 Attachments: YARN-1266-1.patch, YARN-1266-2.patch, YARN-1266-3.patch, YARN-1266-4.patch, YARN-1266-5.patch, YARN-1266-6.patch Adding ApplicationHistoryProtocolPBService to make the web apps work, and changing the yarn script to run the AHS as a separate process. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-979) [YARN-321] Add more APIs related to ApplicationAttempt and Container in ApplicationHistoryProtocol
[ https://issues.apache.org/jira/browse/YARN-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-979: - Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. [YARN-321] Add more APIs related to ApplicationAttempt and Container in ApplicationHistoryProtocol -- Key: YARN-979 URL: https://issues.apache.org/jira/browse/YARN-979 Project: Hadoop YARN Issue Type: Sub-task Reporter: Mayank Bansal Assignee: Mayank Bansal Fix For: 2.4.0 Attachments: YARN-979-1.patch, YARN-979-3.patch, YARN-979-4.patch, YARN-979-5.patch, YARN-979-6.patch, YARN-979.2.patch, YARN-979.7.patch ApplicationHistoryProtocol should have the following APIs as well: * getApplicationAttemptReport * getApplicationAttempts * getContainerReport * getContainers The corresponding request and response classes need to be added as well. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
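Sketched in reader terms, the four APIs proposed in YARN-979 above could look like the following. This simplified shape takes bare ids and returns report records directly, whereas the committed protocol wraps them in the corresponding request and response classes mentioned in the issue; the interface name is illustrative:
{code}
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.yarn.api.records.ApplicationAttemptId;
import org.apache.hadoop.yarn.api.records.ApplicationAttemptReport;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.ContainerReport;

// Illustrative reader-side view of the four additions; not the actual
// ApplicationHistoryProtocol signatures.
public interface HistoryReaderSketch {
  // Report for one application attempt.
  ApplicationAttemptReport getApplicationAttemptReport(
      ApplicationAttemptId appAttemptId) throws IOException;

  // All attempts of an application.
  List<ApplicationAttemptReport> getApplicationAttempts(ApplicationId appId)
      throws IOException;

  // Report for one container.
  ContainerReport getContainerReport(ContainerId containerId)
      throws IOException;

  // All containers of an application attempt.
  List<ContainerReport> getContainers(ApplicationAttemptId appAttemptId)
      throws IOException;
}
{code}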
[jira] [Updated] (YARN-1023) [YARN-321] Webservices REST API's support for Application History
[ https://issues.apache.org/jira/browse/YARN-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1023: -- Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. [YARN-321] Webservices REST API's support for Application History - Key: YARN-1023 URL: https://issues.apache.org/jira/browse/YARN-1023 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: YARN-321 Reporter: Devaraj K Assignee: Zhijie Shen Fix For: 2.4.0 Attachments: YARN-1023-v0.patch, YARN-1023-v1.patch, YARN-1023.2.patch, YARN-1023.3.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-934) HistoryStorage writer interface for Application History Server
[ https://issues.apache.org/jira/browse/YARN-934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-934: - Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. HistoryStorage writer interface for Application History Server -- Key: YARN-934 URL: https://issues.apache.org/jira/browse/YARN-934 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Zhijie Shen Fix For: 2.4.0 Attachments: YARN-934.1.patch, YARN-934.2.patch, YARN-934.3.patch, YARN-934.4.patch, YARN-934.5.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-984) [YARN-321] Move classes from applicationhistoryservice.records.pb.impl package to applicationhistoryservice.records.impl.pb
[ https://issues.apache.org/jira/browse/YARN-984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-984: - Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. [YARN-321] Move classes from applicationhistoryservice.records.pb.impl package to applicationhistoryservice.records.impl.pb --- Key: YARN-984 URL: https://issues.apache.org/jira/browse/YARN-984 Project: Hadoop YARN Issue Type: Sub-task Reporter: Devaraj K Assignee: Devaraj K Fix For: 2.4.0 Attachments: YARN-984-1.patch, YARN-984.patch While creating instances of the applicationhistoryservice.records.* PB records, a ClassNotFoundException is thrown.
{code}
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.server.applicationhistoryservice.records.impl.pb.ApplicationHistoryDataPBImpl not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1619)
at org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java:56)
... 49 more
{code}
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
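The package move in YARN-984 above matters because Hadoop's record factory locates the protobuf implementation reflectively, by appending an impl.pb sub-package and a PBImpl class suffix to the record interface's package and name. A simplified sketch of that convention (from memory; the authoritative logic lives in RecordFactoryPBImpl):
{code}
// Simplified illustration of why classes must live under records.impl.pb:
// the factory derives the implementation class name by convention, so a
// class placed under records.pb.impl can never be found.
public final class PbImplNamingSketch {

  static String pbImplClassName(Class<?> recordInterface) {
    return recordInterface.getPackage().getName()
        + ".impl.pb."                      // sub-package, by convention
        + recordInterface.getSimpleName()
        + "PBImpl";                        // class-name suffix, by convention
  }

  public static void main(String[] args) {
    // Prints "java.util.impl.pb.ListPBImpl": the point is only the shape of
    // the derived name, which Class.forName() must then be able to resolve.
    System.out.println(pbImplClassName(java.util.List.class));
  }
}
{code}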
[jira] [Updated] (YARN-1578) Fix how to handle ApplicationHistory about the container
[ https://issues.apache.org/jira/browse/YARN-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichi Yamashita updated YARN-1578: - Attachment: screenshot2.pdf I see now that the code you showed is the FileSystemApplicationHistoryStore.getContainer(ContainerId) method. That code is fine, and we can see the ApplicationMaster's information in the web UI. However, when I access the list of containers from an AppAttempt link, the web UI returns a 500 (see the attached screenshot2.pdf). I apologize that my earlier explanation was unclear. On this access, the AHS calls FileSystemApplicationHistoryStore.getContainers(ApplicationAttemptId), and ContainerFinishData is never set by the following code.
{code}
HistoryFileReader hfReader =
    getHistoryFileReader(appAttemptId.getApplicationId());
try {
  while (hfReader.hasNext()) {
    HistoryFileReader.Entry entry = hfReader.next();
    if (entry.key.id.startsWith(ConverterUtils.CONTAINER_PREFIX)) {
      if (entry.key.suffix.equals(START_DATA_SUFFIX)) {
        retrieveStartFinishData(appAttemptId, entry, startFinshDataMap, true);
      } else if (entry.key.suffix.equals(FINISH_DATA_SUFFIX)) {
        retrieveStartFinishData(appAttemptId, entry, startFinshDataMap, false);
      }
    }
  }
  LOG.info("Completed reading history information of all containers"
      + " of application attempt " + appAttemptId);
} catch (IOException e) {
  LOG.info("Error when reading history information of some containers"
      + " of application attempt " + appAttemptId);
} finally {
  hfReader.close();
}
{code}
Considering the possibility that the finish data is simply not included in the history file, I think we should fix how FileSystemApplicationHistoryStore reads the history file. Fix how to handle ApplicationHistory about the container Key: YARN-1578 URL: https://issues.apache.org/jira/browse/YARN-1578 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: YARN-321 Reporter: Shinichi Yamashita Assignee: Shinichi Yamashita Attachments: YARN-1578.patch, application_1390978867235_0001, resoucemanager.log, screenshot.png, screenshot2.pdf I ran a PiEstimator job on a Hadoop cluster with YARN-321 applied. After the job ended, accessing the HistoryServer web UI displayed a 500, and the HistoryServer daemon log output the following.
{code}
2014-01-09 13:31:12,227 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /applicationhistory/appattempt/appattempt_1389146249925_0008_01
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:153)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
(snip...)
Caused by: java.lang.NullPointerException
at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.mergeContainerHistoryData(FileSystemApplicationHistoryStore.java:696)
at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.getContainers(FileSystemApplicationHistoryStore.java:429)
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getContainers(ApplicationHistoryManagerImpl.java:201)
at org.apache.hadoop.yarn.server.webapp.AppAttemptBlock.render(AppAttemptBlock.java:110)
(snip...)
{code} I confirmed from the ApplicationHistory file that there was a container which never finished. According to the ResourceManager daemon log, the ResourceManager reserved this container but did not allocate it. Therefore, we need to change how ApplicationHistory handles a container that is never allocated. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
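One way to realize the fix proposed in the YARN-1578 comment above, sketched with hypothetical stand-in types rather than the actual patch: treat the finish record as optional when pairing up the entries read from the history file, and skip the merge when it is absent.
{code}
import java.util.HashMap;
import java.util.Map;

// Hypothetical, self-contained sketch of the defensive read proposed above.
// StartData, FinishData, and StartFinish stand in for the real history records.
public class NullSafeHistoryReadSketch {

  static class StartData  { String containerId; }
  static class FinishData { String containerId; }

  static class StartFinish {
    StartData start;
    FinishData finish; // stays null for a reserved-but-never-allocated container
  }

  public static void main(String[] args) {
    Map<String, StartFinish> startFinishMap = new HashMap<String, StartFinish>();

    StartFinish sf = new StartFinish();
    sf.start = new StartData();
    sf.start.containerId = "container_example_0001_01_000028";
    startFinishMap.put(sf.start.containerId, sf); // no finish entry was read

    for (StartFinish entry : startFinishMap.values()) {
      if (entry.finish == null) {
        // The YARN-1578 crash path: the current code merges unconditionally
        // and dereferences null here.
        System.out.println("Skipping merge: no finish data for "
            + entry.start.containerId);
      } else {
        System.out.println("Merging finish data for " + entry.start.containerId);
      }
    }
  }
}
{code}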
[jira] [Updated] (YARN-1596) Javadoc failures on YARN-321 branch
[ https://issues.apache.org/jira/browse/YARN-1596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1596: -- Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. Javadoc failures on YARN-321 branch --- Key: YARN-1596 URL: https://issues.apache.org/jira/browse/YARN-1596 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Fix For: 2.4.0 Attachments: YARN-1596.txt There are some javadoc issues on YARN-321 branch. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1379) [YARN-321] AHS protocols need to be in yarn proto package name after YARN-1170
[ https://issues.apache.org/jira/browse/YARN-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1379: -- Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. [YARN-321] AHS protocols need to be in yarn proto package name after YARN-1170 -- Key: YARN-1379 URL: https://issues.apache.org/jira/browse/YARN-1379 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Fix For: 2.4.0 Attachments: YARN-1379.txt Found this while merging YARN-321 to the latest branch-2. Without this, compilation fails. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-956) [YARN-321] Add a testable in-memory HistoryStorage
[ https://issues.apache.org/jira/browse/YARN-956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-956: - Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. [YARN-321] Add a testable in-memory HistoryStorage --- Key: YARN-956 URL: https://issues.apache.org/jira/browse/YARN-956 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Zhijie Shen Fix For: 2.4.0 Attachments: YARN-956-1.patch, YARN-956-2.patch, YARN-956-3.patch, YARN-956.4.patch, YARN-956.5.patch, YARN-956.6.patch, YARN-956.7.patch, YARN-956.8.patch, YARN-956.9.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1594) YARN-321 branch needs to be updated after YARN-888 pom changes
[ https://issues.apache.org/jira/browse/YARN-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1594: -- Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. YARN-321 branch needs to be updated after YARN-888 pom changes -- Key: YARN-1594 URL: https://issues.apache.org/jira/browse/YARN-1594 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Fix For: 2.4.0 Attachments: YARN-1594-20140113.txt, YARN-1594.txt YARN-888 changed the pom structure, so the latest merge of trunk breaks the YARN-321 branch. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1597) FindBugs warnings on YARN-321 branch
[ https://issues.apache.org/jira/browse/YARN-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1597: -- Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. FindBugs warnings on YARN-321 branch Key: YARN-1597 URL: https://issues.apache.org/jira/browse/YARN-1597 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Fix For: 2.4.0 Attachments: YARN-1597.txt There are a bunch of FindBugs warnings on the YARN-321 branch. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1007) [YARN-321] Enhance History Reader interface for Containers
[ https://issues.apache.org/jira/browse/YARN-1007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1007: -- Fix Version/s: (was: YARN-321) 2.4.0 YARN-321 branch is merged into trunk and branch-2. Setting fix-version of all committed patches under YARN-321 to be 2.4.0. [YARN-321] Enhance History Reader interface for Containers -- Key: YARN-1007 URL: https://issues.apache.org/jira/browse/YARN-1007 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: YARN-321 Reporter: Devaraj K Assignee: Mayank Bansal Fix For: 2.4.0 Attachments: YARN-1007-1.patch, YARN-1007-2.patch If we want to show the containers used by an application or an application attempt, we need two more APIs that return a collection of ContainerHistoryData for an application id and an application attempt id, something like the following.
{code}
Collection<ContainerHistoryData> getContainers(ApplicationAttemptId appAttemptId);

Collection<ContainerHistoryData> getContainers(ApplicationId appId);
{code}
{code}
/**
 * This method returns {@link Container} for specified {@link ContainerId}.
 *
 * @param {@link ContainerId}
 * @return {@link Container} for ContainerId
 */
ContainerHistoryData getAMContainer(ContainerId containerId);
{code}
In the above API, we need to change the argument to an application attempt id, or we can remove this API altogether: every attempt's history data has a master container id field, so the AM container's history data can be obtained through the API below, provided it takes a container id as its argument.
{code}
/**
 * This method returns {@link ContainerHistoryData} for specified
 * {@link ApplicationAttemptId}.
 *
 * @param {@link ApplicationAttemptId}
 * @return {@link ContainerHistoryData} for ApplicationAttemptId
 */
ContainerHistoryData getContainer(ApplicationAttemptId appAttemptId);
{code}
An application attempt can use any number of containers, so with this signature we cannot choose which container's history data to return. This API's argument also needs to be changed to take a container id instead of an app attempt id. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
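Pulling together the YARN-1007 proposals above, the revised reader surface might look as follows. This is a hedged consolidation of the signatures argued for in the issue, not the committed interface; ContainerHistoryData is stubbed here only for self-containment.
{code}
import java.util.Collection;

import org.apache.hadoop.yarn.api.records.ApplicationAttemptId;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ContainerId;

// Hedged consolidation of the reader signatures discussed above.
interface EnhancedContainerReaderSketch {

  // Stub standing in for the real ContainerHistoryData record.
  class ContainerHistoryData { }

  // New bulk accessors for an attempt's and an application's containers.
  Collection<ContainerHistoryData> getContainers(ApplicationAttemptId appAttemptId);
  Collection<ContainerHistoryData> getContainers(ApplicationId appId);

  // Re-keyed single-container accessors, per the argument above: look up a
  // specific container by its id, and the AM container via its attempt id.
  ContainerHistoryData getContainer(ContainerId containerId);
  ContainerHistoryData getAMContainer(ApplicationAttemptId appAttemptId);
}
{code}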