[jira] [Updated] (YARN-1852) Application recovery throws InvalidStateTransitonException for FAILED and KILLED jobs
[ https://issues.apache.org/jira/browse/YARN-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohith updated YARN-1852:
    Attachment: YARN-1852.3patch

bq. We may check against RMApp.recoveredFinalState state instead?

Done. A test is written that checks FINISHED/KILLED/FAILED applications. I verified the fix on a single-node cluster. Attached an updated patch as per the comment. Please review.

Application recovery throws InvalidStateTransitonException for FAILED and KILLED jobs
    Key: YARN-1852
    URL: https://issues.apache.org/jira/browse/YARN-1852
    Project: Hadoop YARN
    Issue Type: Bug
    Components: resourcemanager
    Affects Versions: 2.3.0, 2.4.0
    Reporter: Rohith
    Assignee: Rohith
    Attachments: YARN-1852.2.patch, YARN-1852.3patch, YARN-1852.patch

Recovering a failed or killed application throws InvalidStateTransitonException. The exceptions are logged during application recovery.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1852) Application recovery throws InvalidStateTransitonException for FAILED and KILLED jobs
[ https://issues.apache.org/jira/browse/YARN-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohith updated YARN-1852:
    Attachment: (was: YARN-1852.3patch)
[jira] [Updated] (YARN-1852) Application recovery throws InvalidStateTransitonException for FAILED and KILLED jobs
[ https://issues.apache.org/jira/browse/YARN-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohith updated YARN-1852:
    Attachment: YARN-1852.3.patch
[jira] [Commented] (YARN-1852) Application recovery throws InvalidStateTransitonException for FAILED and KILLED jobs
[ https://issues.apache.org/jira/browse/YARN-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944881#comment-13944881 ]

Hadoop QA commented on YARN-1852:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12636314/YARN-1852.3.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3441//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3441//console

This message is automatically generated.
[jira] [Updated] (YARN-1865) ShellScriptBuilder does not check for some error conditions
[ https://issues.apache.org/jira/browse/YARN-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Remus Rusanu updated YARN-1865:
    Priority: Minor (was: Major)

ShellScriptBuilder does not check for some error conditions
    Key: YARN-1865
    URL: https://issues.apache.org/jira/browse/YARN-1865
    Project: Hadoop YARN
    Issue Type: Bug
    Components: nodemanager
    Affects Versions: 3.0.0, 2.2.0, 2.3.0
    Reporter: Remus Rusanu
    Priority: Minor

The WindowsShellScriptBuilder does not check for commands exceeding the Windows maximum shell command line length (8191 chars).
Neither the Unix nor the Windows script builder checks for an error condition after mkdir or link.
The WindowsShellScriptBuilder mkdir is not safe with regard to paths containing spaces.
[jira] [Created] (YARN-1865) ShellScriptBuilder does not check for some error conditions
Remus Rusanu created YARN-1865:
    Summary: ShellScriptBuilder does not check for some error conditions
    Key: YARN-1865
    URL: https://issues.apache.org/jira/browse/YARN-1865
    Project: Hadoop YARN
    Issue Type: Bug
    Components: nodemanager
    Affects Versions: 2.3.0, 2.2.0, 3.0.0
    Reporter: Remus Rusanu

The WindowsShellScriptBuilder does not check for commands exceeding the Windows maximum shell command line length (8191 chars).
Neither the Unix nor the Windows script builder checks for an error condition after mkdir or link.
The WindowsShellScriptBuilder mkdir is not safe with regard to paths containing spaces.
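The checks requested in this report could look roughly like the following. This is only a sketch under stated assumptions, not the actual ShellScriptBuilder code or the attached patch: the class `ScriptLineChecks` and its methods are hypothetical names, and only the 8191-character limit comes from the report itself.

```java
/** Hypothetical helper for illustration; not part of Hadoop. */
final class ScriptLineChecks {
    /** Maximum Windows shell command line length cited in the report. */
    static final int WINDOWS_MAX_SHELL_LENGTH = 8191;

    private ScriptLineChecks() {}

    /** Returns the command unchanged, or rejects it if it exceeds the limit. */
    static String checkLength(String command) {
        if (command.length() > WINDOWS_MAX_SHELL_LENGTH) {
            throw new IllegalArgumentException("Command line length "
                + command.length() + " exceeds the Windows maximum of "
                + WINDOWS_MAX_SHELL_LENGTH);
        }
        return command;
    }

    /** Builds a mkdir command plus a follow-up line that checks its result. */
    static String mkdirLines(String path) {
        // Quoting keeps a path that contains spaces as a single argument.
        String mkdir = checkLength(
            "@if not exist \"" + path + "\" mkdir \"" + path + "\"");
        // The second line stops the script if mkdir reported an error.
        return mkdir + "\r\n@if %errorlevel% neq 0 exit /b %errorlevel%";
    }
}
```

Checking the length while the script is being built turns an obscure launch-time failure into an explicit error at the point where the command is generated.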
[jira] [Updated] (YARN-1865) ShellScriptBuilder does not check for some error conditions
[ https://issues.apache.org/jira/browse/YARN-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Remus Rusanu updated YARN-1865:
    Attachment: YARN-1865.1.patch
[jira] [Commented] (YARN-1670) aggregated log writer can write more log data then it says is the log length
[ https://issues.apache.org/jira/browse/YARN-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945188#comment-13945188 ]

Mit Desai commented on YARN-1670:

I realize that I created the patch based on trunk before the earlier patch was committed, so it fails. I will upload a new one.

[~jeagles]
# Nice logic. This is much easier to understand. I will incorporate your suggestion in the new change.
# For the buffer size, you are correct. I already did some analysis on that. I read some discussions/articles online which say that a 64K buffer size performs efficiently.

aggregated log writer can write more log data then it says is the log length
    Key: YARN-1670
    URL: https://issues.apache.org/jira/browse/YARN-1670
    Project: Hadoop YARN
    Issue Type: Bug
    Affects Versions: 3.0.0, 0.23.10, 2.2.0
    Reporter: Thomas Graves
    Assignee: Mit Desai
    Priority: Critical
    Fix For: 2.4.0
    Attachments: YARN-1670-b23.patch, YARN-1670-v2-b23.patch, YARN-1670-v2.patch, YARN-1670-v3-b23.patch, YARN-1670-v3.patch, YARN-1670-v4-b23.patch, YARN-1670-v4.patch, YARN-1670.patch, YARN-1670.patch

We have seen exceptions when using 'yarn logs' to read log files.

    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Long.parseLong(Long.java:441)
    at java.lang.Long.parseLong(Long.java:483)
    at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readAContainerLogsForALogType(AggregatedLogFormat.java:518)
    at org.apache.hadoop.yarn.logaggregation.LogDumper.dumpAContainerLogs(LogDumper.java:178)
    at org.apache.hadoop.yarn.logaggregation.LogDumper.run(LogDumper.java:130)
    at org.apache.hadoop.yarn.logaggregation.LogDumper.main(LogDumper.java:246)

We traced it down to the reader trying to read the file type of the next file while the position it reads from still holds log data from the previous file. What happened was that the Log Length was written as a certain size but the log data was actually longer than that.

Inside the write() routine in LogValue, it first writes the logfile length, but when it goes on to write the log itself it just reads to the end of the file. There is a race condition here: if someone is still writing to the file when it is aggregated, the length written could be too small.

We should have the write() routine stop once it has written whatever it said was the length. It would be nice if we could somehow tell the user it might be truncated, but I'm not sure of a good way to do this.

We also noticed a bug in readAContainerLogsForALogType, where it is using an int for curRead whereas it should be using a long.

    while (len != -1 && curRead < fileLength) {

This isn't actually a problem right now, as it looks like the underlying decoder is doing the right thing and the len condition exits.
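The fix proposed in the description, stopping the copy once the declared length has been written, can be sketched as follows. This is a minimal illustration and not the committed patch: the class `BoundedCopy` and the method `copyAtMost` are hypothetical names, and the 64K buffer simply follows the size mentioned in the comment above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper for illustration; not the code in LogValue. */
final class BoundedCopy {
    private BoundedCopy() {}

    /**
     * Copies at most declaredLength bytes from in to out and returns the
     * number of bytes actually copied.
     */
    static long copyAtMost(InputStream in, OutputStream out, long declaredLength)
            throws IOException {
        byte[] buf = new byte[64 * 1024];
        long copied = 0; // a long, so lengths beyond Integer.MAX_VALUE do not overflow
        while (copied < declaredLength) {
            // Never request more than what is left of the declared length.
            int toRead = (int) Math.min(buf.length, declaredLength - copied);
            int len = in.read(buf, 0, toRead);
            if (len == -1) {
                break; // the source ended before the declared length
            }
            out.write(buf, 0, len);
            copied += len;
        }
        return copied;
    }
}
```

Bytes appended to the source after the length was recorded are left unread, so a reader that trusts the length header finds the next file's metadata where it expects it. The return value lets a caller notice when fewer bytes than declared were available.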
[jira] [Updated] (YARN-1670) aggregated log writer can write more log data then it says is the log length
[ https://issues.apache.org/jira/browse/YARN-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mit Desai updated YARN-1670:
    Attachment: YARN-1670-v4-b23.patch
[jira] [Updated] (YARN-1670) aggregated log writer can write more log data then it says is the log length
[ https://issues.apache.org/jira/browse/YARN-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mit Desai updated YARN-1670:
    Attachment: YARN-1670-v4.patch

Attaching the patch with the updated changes.
[jira] [Commented] (YARN-1670) aggregated log writer can write more log data then it says is the log length
[ https://issues.apache.org/jira/browse/YARN-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945296#comment-13945296 ]

Hadoop QA commented on YARN-1670:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12636350/YARN-1670-v4.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3442//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3442//console

This message is automatically generated.
[jira] [Updated] (YARN-1452) Document AHS Feature
[ https://issues.apache.org/jira/browse/YARN-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhijie Shen updated YARN-1452:
    Issue Type: Bug (was: Sub-task)
    Parent: (was: YARN-321)

Document AHS Feature
    Key: YARN-1452
    URL: https://issues.apache.org/jira/browse/YARN-1452
    Project: Hadoop YARN
    Issue Type: Bug
    Reporter: Zhijie Shen
    Assignee: Zhijie Shen
    Labels: documentation

We need to write documents that guide users through topics such as command line tools, configurations, and REST APIs.
[jira] [Updated] (YARN-1452) Document AHS Feature
[ https://issues.apache.org/jira/browse/YARN-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhijie Shen updated YARN-1452:
    Issue Type: Task (was: Bug)
[jira] [Updated] (YARN-1452) Document the usage of the generic application history and the timeline data service
[ https://issues.apache.org/jira/browse/YARN-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhijie Shen updated YARN-1452:
    Summary: Document the usage of the generic application history and the timeline data service (was: Document AHS Feature)
[jira] [Resolved] (YARN-1827) yarn client fails when RM is killed within 5s of job submission
[ https://issues.apache.org/jira/browse/YARN-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli resolved YARN-1827.
    Resolution: Duplicate

This is getting fixed as part of YARN-1521. Closing as dup.

yarn client fails when RM is killed within 5s of job submission
    Key: YARN-1827
    URL: https://issues.apache.org/jira/browse/YARN-1827
    Project: Hadoop YARN
    Issue Type: Bug
    Affects Versions: 2.4.0
    Reporter: Arpit Gupta
[jira] [Commented] (YARN-556) RM Restart phase 2 - Work preserving restart
[ https://issues.apache.org/jira/browse/YARN-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945437#comment-13945437 ]

Bikas Saha commented on YARN-556:

Please align with the design doc while prototyping. If the design needs changes then please update the document. The sub-tasks need to follow the design doc so that other folks can follow even if they are not writing the code. Some pieces of this are already underway in trunk (e.g. RM not killing the containers on app attempt exit). The scheduler changes are the most complex piece, but they can come at the end.

Working on trunk allows breaks/bugs to be caught quicker and forces us to be more methodical in our approach. A branch is useful when it's not clear what approach to take or when we know the code is going to be broken across commits. So I would prefer we do this on trunk.

RM Restart phase 2 - Work preserving restart
    Key: YARN-556
    URL: https://issues.apache.org/jira/browse/YARN-556
    Project: Hadoop YARN
    Issue Type: New Feature
    Components: resourcemanager
    Reporter: Bikas Saha
    Assignee: Bikas Saha
    Attachments: Work Preserving RM Restart.pdf

YARN-128 covered storing the state needed for the RM to recover critical information. This umbrella jira will track changes needed to recover the running state of the cluster so that work can be preserved across RM restarts.
[jira] [Commented] (YARN-1670) aggregated log writer can write more log data then it says is the log length
[ https://issues.apache.org/jira/browse/YARN-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945436#comment-13945436 ]

Jonathan Eagles commented on YARN-1670:

+1 on this change. Committing to trunk, branch-2.4, branch-2, branch-0.23.
[jira] [Commented] (YARN-1838) Timeline service getEntities API should provide ability to get entities from given id
[ https://issues.apache.org/jira/browse/YARN-1838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945459#comment-13945459 ]

Zhijie Shen commented on YARN-1838:

Committed to trunk, branch-2 and branch-2.4. Thanks, Billie!

Timeline service getEntities API should provide ability to get entities from given id
    Key: YARN-1838
    URL: https://issues.apache.org/jira/browse/YARN-1838
    Project: Hadoop YARN
    Issue Type: Sub-task
    Reporter: Srimanth Gunturi
    Assignee: Billie Rinaldi
    Fix For: 2.4.0
    Attachments: YARN-1838.1.patch, YARN-1838.2.patch, YARN-1838.3.patch, YARN-1838.4.patch, YARN-1838.5.patch

To support pagination, we need the ability to get entities from a certain ID by providing a new param called {{fromid}}.

For example, on a page of 10 jobs, our first call will be like
[http://server:8188/ws/v1/timeline/HIVE_QUERY_ID?fields=events,primaryfilters,otherinfo&limit=11]

When the user hits next, we would like to call
[http://server:8188/ws/v1/timeline/HIVE_QUERY_ID?fields=events,primaryfilters,otherinfo&fromid=JID11&limit=11]
and continue on for further _Next_ clicks.

On hitting back, we will make similar calls for previous items
[http://server:8188/ws/v1/timeline/HIVE_QUERY_ID?fields=events,primaryfilters,otherinfo&fromid=JID1&limit=11]

{{fromid}} should be inclusive of the id given.
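The pagination scheme described above can be sketched from the client side as follows. The class `TimelinePaging` and both methods are hypothetical and exist only for illustration. The parameter name `fromid` is taken from the description, and the name in the committed API may differ. The sketch assumes the client asks for one entity more than the page size and, because `fromid` is inclusive, uses that extra entity's id as the start of the next page.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.List;

/** Hypothetical client-side helper for illustration; not part of YARN. */
final class TimelinePaging {
    private TimelinePaging() {}

    /** Builds the request URL for one page; a null fromId means the first page. */
    static String pageUrl(String base, String entityType, String fromId, int pageSize) {
        StringBuilder url = new StringBuilder(base)
            .append("/ws/v1/timeline/").append(entityType)
            .append("?fields=events,primaryfilters,otherinfo");
        if (fromId != null) {
            try {
                url.append("&fromid=").append(URLEncoder.encode(fromId, "UTF-8"));
            } catch (UnsupportedEncodingException e) {
                throw new IllegalStateException("UTF-8 is always supported", e);
            }
        }
        // Ask for one extra entity; its id becomes the next page's fromid.
        url.append("&limit=").append(pageSize + 1);
        return url.toString();
    }

    /** Returns the id that starts the next page, or null if this is the last page. */
    static String nextFromId(List<String> returnedIds, int pageSize) {
        return returnedIds.size() > pageSize ? returnedIds.get(pageSize) : null;
    }
}
```

With a page size of 10, `pageUrl("http://server:8188", "HIVE_QUERY_ID", null, 10)` produces the first URL in the description, and passing `"JID11"` as `fromId` produces the second.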
[jira] [Commented] (YARN-1852) Application recovery throws InvalidStateTransitonException for FAILED and KILLED jobs
[ https://issues.apache.org/jira/browse/YARN-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945464#comment-13945464 ]

Jian He commented on YARN-1852:

LGTM, +1
[jira] [Commented] (YARN-556) RM Restart phase 2 - Work preserving restart
[ https://issues.apache.org/jira/browse/YARN-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945480#comment-13945480 ]

Karthik Kambatla commented on YARN-556:

bq. Please align with the design doc while prototyping. If the design needs changes then please update the document. The sub-tasks need to follow the design doc so that other folks can follow even if they are not writing the code.

Yes, that is the idea. The prototype should be mostly ready by the end of the week. We will update the document with any minor changes we find are required, along with the prototype.

bq. The scheduler changes are the most complex piece. But they can come in the end.

Without the scheduler changes, I am concerned the remaining patches would only break things. The alternative is to have a config to enable work-preserving restart and guard all changes by that config. I am not yet fully convinced of this approach: would we want to keep this config even after the feature is complete?
[jira] [Commented] (YARN-1838) Timeline service getEntities API should provide ability to get entities from given id
[ https://issues.apache.org/jira/browse/YARN-1838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945478#comment-13945478 ]
Hudson commented on YARN-1838:
--
SUCCESS: Integrated in Hadoop-trunk-Commit #5387 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5387/])
YARN-1838. Enhanced timeline service getEntities API to get entities from a given entity ID or insertion timestamp. Contributed by Billie Rinaldi. (zjshen: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1580960)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/timeline/GenericObjectMapper.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/timeline/LeveldbTimelineStore.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/timeline/MemoryTimelineStore.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/timeline/TimelineReader.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TimelineWebServices.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/timeline/TestLeveldbTimelineStore.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/timeline/TestMemoryTimelineStore.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/timeline/TimelineStoreTestUtils.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestTimelineWebServices.java
Timeline service getEntities API should provide ability to get entities from given id
-
Key: YARN-1838 URL: https://issues.apache.org/jira/browse/YARN-1838 Project: Hadoop YARN Issue Type: Sub-task Reporter: Srimanth Gunturi Assignee: Billie Rinaldi Fix For: 2.4.0 Attachments: YARN-1838.1.patch, YARN-1838.2.patch, YARN-1838.3.patch, YARN-1838.4.patch, YARN-1838.5.patch
To support pagination, we need ability to get entities from a certain ID by providing a new param called {{fromid}}.
For example on a page of 10 jobs, our first call will be like [http://server:8188/ws/v1/timeline/HIVE_QUERY_ID?fields=events,primaryfilters,otherinfo&limit=11]
When user hits next, we would like to call [http://server:8188/ws/v1/timeline/HIVE_QUERY_ID?fields=events,primaryfilters,otherinfo&fromid=JID11&limit=11] and continue on for further _Next_ clicks
On hitting back, we will make similar calls for previous items [http://server:8188/ws/v1/timeline/HIVE_QUERY_ID?fields=events,primaryfilters,otherinfo&fromid=JID1&limit=11]
{{fromid}} should be inclusive of the id given.
-- This message was sent by Atlassian JIRA (v6.2#6252)
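Editor's illustration for YARN-1838: the paging scheme in the description (request one more entity than the page displays, then pass the extra entity's ID as the next {{fromid}}, which is inclusive) might be sketched as below. This is only a sketch under assumptions: the class and method names are hypothetical, the base URL and field list are copied from the example calls in the description, and no real timeline client API is used.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper illustrating fromid-based paging; not part of YARN. */
final class TimelinePagingSketch {
    // Base URL taken from the example calls in the issue description.
    static final String BASE =
        "http://server:8188/ws/v1/timeline/HIVE_QUERY_ID?fields=events,primaryfilters,otherinfo";

    /** Builds the request URL; a null fromId means "first page". */
    static String pageUrl(String fromId, int pageSize) {
        StringBuilder url = new StringBuilder(BASE);
        if (fromId != null) {
            url.append("&fromid=").append(fromId);
        }
        // Ask for one extra entity so the caller learns where the next page starts.
        url.append("&limit=").append(pageSize + 1);
        return url.toString();
    }

    /** ID to pass as fromid for the next page, or null if no further page exists. */
    static String nextFromId(List<String> returnedIds, int pageSize) {
        return returnedIds.size() > pageSize ? returnedIds.get(pageSize) : null;
    }

    /** The entities to display: at most pageSize of those returned. */
    static List<String> visible(List<String> returnedIds, int pageSize) {
        return new ArrayList<>(returnedIds.subList(0, Math.min(pageSize, returnedIds.size())));
    }
}
```

Because {{fromid}} is inclusive, the eleventh entity of one response becomes the first entity shown on the following page, which matches the fromid=JID11 call in the description.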
[jira] [Commented] (YARN-1670) aggregated log writer can write more log data then it says is the log length
[ https://issues.apache.org/jira/browse/YARN-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945479#comment-13945479 ] Hudson commented on YARN-1670: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5387 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5387/]) YARN-1670. aggregated log writer can write more log data then it says is the log length (Mit Desai via jeagles) (jeagles: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1580957) * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogFormat.java aggregated log writer can write more log data then it says is the log length Key: YARN-1670 URL: https://issues.apache.org/jira/browse/YARN-1670 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0, 0.23.10, 2.2.0 Reporter: Thomas Graves Assignee: Mit Desai Priority: Critical Fix For: 3.0.0, 0.23.11, 2.4.0, 2.5.0 Attachments: YARN-1670-b23.patch, YARN-1670-v2-b23.patch, YARN-1670-v2.patch, YARN-1670-v3-b23.patch, YARN-1670-v3.patch, YARN-1670-v4-b23.patch, YARN-1670-v4-b23.patch, YARN-1670-v4.patch, YARN-1670-v4.patch, YARN-1670.patch, YARN-1670.patch We have seen exceptions when using 'yarn logs' to read log files. at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Long.parseLong(Long.java:441) at java.lang.Long.parseLong(Long.java:483) at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readAContainerLogsForALogType(AggregatedLogFormat.java:518) at org.apache.hadoop.yarn.logaggregation.LogDumper.dumpAContainerLogs(LogDumper.java:178) at org.apache.hadoop.yarn.logaggregation.LogDumper.run(LogDumper.java:130) at org.apache.hadoop.yarn.logaggregation.LogDumper.main(LogDumper.java:246) We traced it down to the reader trying to read the file type of the next file but where it reads is still log data from the previous file. 
What happened was that the Log Length was written as a certain size but the log data was actually longer than that. Inside the write() routine in LogValue, it first writes what the logfile length is, but then when it goes to write the log itself it just goes to the end of the file. There is a race condition here: if someone is still writing to the file when it goes to be aggregated, the length written could be too small. We should have the write() routine stop once it has written whatever it said was the length. It would be nice if we could somehow tell the user it might be truncated, but I'm not sure of a good way to do this. We also noticed a bug in readAContainerLogsForALogType, where it is using an int for curRead whereas it should be using a long. while (len != -1 && curRead < fileLength) { This isn't actually a problem right now as it looks like the underlying decoder is doing the right thing and the len condition exits. -- This message was sent by Atlassian JIRA (v6.2#6252)
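Editor's illustration for YARN-1670: the fix proposed in the description (stop writing once the declared length has been written, and count bytes with a long) might look roughly like the sketch below. It is not the committed patch; the class and method names are hypothetical and plain java.io streams stand in for the aggregated log writer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical sketch of a length-bounded copy; not the actual AggregatedLogFormat code. */
final class BoundedLogCopySketch {
    /**
     * Copies at most declaredLength bytes from in to out and returns the number
     * of bytes copied. Bytes appended to the source after the length was
     * recorded are deliberately left unread.
     */
    static long copyBounded(InputStream in, OutputStream out, long declaredLength)
            throws IOException {
        byte[] buf = new byte[4096];
        long copied = 0; // a long, so very large files do not overflow the counter
        while (copied < declaredLength) {
            int toRead = (int) Math.min(buf.length, declaredLength - copied);
            int len = in.read(buf, 0, toRead);
            if (len == -1) {
                break; // the source ended before the declared length was reached
            }
            out.write(buf, 0, len);
            copied += len;
        }
        return copied;
    }
}
```

With this shape the reader never encounters log data where it expects the next file's header, at the cost of silently dropping whatever was appended during aggregation, which is the truncation concern raised in the description.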
[jira] [Commented] (YARN-1852) Application recovery throws InvalidStateTransitonException for FAILED and KILLED jobs
[ https://issues.apache.org/jira/browse/YARN-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945557#comment-13945557 ]
Hudson commented on YARN-1852:
--
SUCCESS: Integrated in Hadoop-trunk-Commit #5390 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5390/])
YARN-1852. Fixed RMAppAttempt to not resend AttemptFailed/AttemptKilled events to already recovered Failed/Killed RMApps. Contributed by Rohith Sharmaks (jianhe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1580997)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java
Application recovery throws InvalidStateTransitonException for FAILED and KILLED jobs
-
Key: YARN-1852 URL: https://issues.apache.org/jira/browse/YARN-1852 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.3.0, 2.4.0 Reporter: Rohith Assignee: Rohith Fix For: 2.4.0 Attachments: YARN-1852.2.patch, YARN-1852.3.patch, YARN-1852.patch
Recovering a failed/killed application throws InvalidStateTransitonException. These exceptions are logged during recovery of applications.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-556) RM Restart phase 2 - Work preserving restart
[ https://issues.apache.org/jira/browse/YARN-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945587#comment-13945587 ] Vinod Kumar Vavilapalli commented on YARN-556: -- I don't see the value of a prototype given we have a mostly concrete design. It's fine to do it, but let's make sure we are not taking shortcuts in the interest of getting a quick and dirty version up. RM Restart phase 2 - Work preserving restart Key: YARN-556 URL: https://issues.apache.org/jira/browse/YARN-556 Project: Hadoop YARN Issue Type: New Feature Components: resourcemanager Reporter: Bikas Saha Assignee: Bikas Saha Attachments: Work Preserving RM Restart.pdf YARN-128 covered storing the state needed for the RM to recover critical information. This umbrella jira will track changes needed to recover the running state of the cluster so that work can be preserved across RM restarts. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-596) In fair scheduler, intra-application container priorities affect inter-application preemption decisions
[ https://issues.apache.org/jira/browse/YARN-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated YARN-596: - Attachment: YARN-596.patch [~sandyr], [~kkambatl], could you take a look at this patch? The preemption rules right now are as follows. For FSQueue: select the child candidate queue/app in the reverse order of its scheduling policy (fair, drf, or fifo). For AppSchedulable: select the candidate container in decreasing order of priority. I also moved all preemption-related code from FairScheduler.java to a separate file, FSPreemption.java. In fair scheduler, intra-application container priorities affect inter-application preemption decisions --- Key: YARN-596 URL: https://issues.apache.org/jira/browse/YARN-596 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.3-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-596.patch In the fair scheduler, containers are chosen for preemption in the following way: All containers for all apps that are in queues that are over their fair share are put in a list. The list is sorted in order of the priority that the container was requested in. This means that an application can shield itself from preemption by requesting its containers at higher priorities, which doesn't really make sense. Also, an application that is not over its fair share, but that is in a queue that is over its fair share, is just as likely to have containers preempted as an application that is over its fair share. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-596) In fair scheduler, intra-application container priorities affect inter-application preemption decisions
[ https://issues.apache.org/jira/browse/YARN-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945609#comment-13945609 ] Hadoop QA commented on YARN-596: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12636408/YARN-596.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3443//console This message is automatically generated. In fair scheduler, intra-application container priorities affect inter-application preemption decisions --- Key: YARN-596 URL: https://issues.apache.org/jira/browse/YARN-596 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.3-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-596.patch In the fair scheduler, containers are chosen for preemption in the following way: All containers for all apps that are in queues that are over their fair share are put in a list. The list is sorted in order of the priority that the container was requested in. This means that an application can shield itself from preemption by requesting its containers at higher priorities, which doesn't really make sense. Also, an application that is not over its fair share, but that is in a queue that is over its fair share, is just as likely to have containers preempted as an application that is over its fair share. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-1866) YARN RM fails to load state store with delegation token parsing error
Arpit Gupta created YARN-1866: - Summary: YARN RM fails to load state store with delegation token parsing error Key: YARN-1866 URL: https://issues.apache.org/jira/browse/YARN-1866 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Arpit Gupta Priority: Critical In our secure nightlies we saw exceptions in the RM log where it failed to parse the delegation token. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1866) YARN RM fails to load state store with delegation token parsing error
[ https://issues.apache.org/jira/browse/YARN-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945613#comment-13945613 ] Arpit Gupta commented on YARN-1866: --- {code} interface org.apache.hadoop.ha.HAServiceProtocol 2014-03-23 17:14:49,300 INFO client.ConfiguredRMFailoverProxyProvider (ConfiguredRMFailoverProxyProvider.java:performFailover(100)) - Failing over to rm1 2014-03-23 17:14:49,303 WARN retry.RetryInvocationHandler (RetryInvocationHandler.java:invoke(119)) - Exception while invoking class org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.renewDelegationToken over rm1. Not retrying because failovers (30) exceeded maximum allowed (30) java.net.ConnectException: Call From host/ip to host:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at sun.reflect.GeneratedConstructorAccessor19.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783) at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730) at org.apache.hadoop.ipc.Client.call(Client.java:1414) at org.apache.hadoop.ipc.Client.call(Client.java:1363) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) at $Proxy11.renewDelegationToken(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.renewDelegationToken(ApplicationClientProtocolPBClientImpl.java:297) at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103) at $Proxy12.renewDelegationToken(Unknown Source) at org.apache.hadoop.yarn.security.client.RMDelegationTokenIdentifier$Renewer.renew(RMDelegationTokenIdentifier.java:107) at org.apache.hadoop.security.token.Token.renew(Token.java:377) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:466) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:463) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.renewToken(DelegationTokenRenewer.java:462) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:384) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.addApplicationSync(DelegationTokenRenewer.java:346) at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:326) at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:425) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1018) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:481) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:830) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:870) at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:867) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:867) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:265) at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:116) at
[jira] [Updated] (YARN-596) In fair scheduler, intra-application container priorities affect inter-application preemption decisions
[ https://issues.apache.org/jira/browse/YARN-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated YARN-596: - Attachment: YARN-596.patch In fair scheduler, intra-application container priorities affect inter-application preemption decisions --- Key: YARN-596 URL: https://issues.apache.org/jira/browse/YARN-596 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.3-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-596.patch, YARN-596.patch In the fair scheduler, containers are chosen for preemption in the following way: All containers for all apps that are in queues that are over their fair share are put in a list. The list is sorted in order of the priority that the container was requested in. This means that an application can shield itself from preemption by requesting its containers at higher priorities, which doesn't really make sense. Also, an application that is not over its fair share, but that is in a queue that is over its fair share, is just as likely to have containers preempted as an application that is over its fair share. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1865) ShellScriptBuilder does not check for some error conditions
[ https://issues.apache.org/jira/browse/YARN-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated YARN-1865: Assignee: Remus Rusanu ShellScriptBuilder does not check for some error conditions --- Key: YARN-1865 URL: https://issues.apache.org/jira/browse/YARN-1865 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.0.0, 2.2.0, 2.3.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Attachments: YARN-1865.1.patch The WindowsShellScriptBuilder does not check for commands exceeding the Windows maximum shell command line length (8191 chars). Neither the Unix nor the Windows script builder checks for error conditions after mkdir or link. The WindowsShellScriptBuilder mkdir is not safe with regard to paths containing spaces. -- This message was sent by Atlassian JIRA (v6.2#6252)
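Editor's illustration for YARN-1865: the three points in the description could be addressed along the lines of the sketch below. It assumes nothing about the attached patch. The class, constant, and method names are hypothetical; only the 8191-character limit comes from the issue text, and the errorlevel line is one common batch idiom rather than the one the script builder necessarily emits.

```java
import java.io.IOException;

/** Hypothetical sketch of the checks described in the issue; not the attached patch. */
final class WindowsCommandCheckSketch {
    /** Limit quoted in the issue description. */
    static final int WINDOWS_MAX_SHELL_LENGTH = 8191;

    /** Rejects a command line longer than the Windows shell limit. */
    static void checkLength(String commandLine) throws IOException {
        if (commandLine.length() > WINDOWS_MAX_SHELL_LENGTH) {
            throw new IOException(String.format(
                "Command line is %d characters long, exceeding the limit of %d",
                commandLine.length(), WINDOWS_MAX_SHELL_LENGTH));
        }
    }

    /** Quotes the path so that embedded spaces stay inside a single argument. */
    static String mkdirLine(String path) {
        return "@if not exist \"" + path + "\" mkdir \"" + path + "\"";
    }

    /** A line that aborts the script if the previous command failed. */
    static String errorCheckLine() {
        return "@if %errorlevel% neq 0 exit /b %errorlevel%";
    }
}
```

A script builder following this idea would run checkLength on every generated line and emit errorCheckLine() after each mkdir or link command.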
[jira] [Commented] (YARN-556) RM Restart phase 2 - Work preserving restart
[ https://issues.apache.org/jira/browse/YARN-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945650#comment-13945650 ] Karthik Kambatla commented on YARN-556: --- We think the prototype would be a validation of the design. Individual sub-tasks will go through the same rigor of unit tests and code review. It would help to add further details to the design or evaluate any minor changes required before committing the sub-tasks. RM Restart phase 2 - Work preserving restart Key: YARN-556 URL: https://issues.apache.org/jira/browse/YARN-556 Project: Hadoop YARN Issue Type: New Feature Components: resourcemanager Reporter: Bikas Saha Assignee: Bikas Saha Attachments: Work Preserving RM Restart.pdf YARN-128 covered storing the state needed for the RM to recover critical information. This umbrella jira will track changes needed to recover the running state of the cluster so that work can be preserved across RM restarts. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1867) NPE while fetching apps via the REST API
[ https://issues.apache.org/jira/browse/YARN-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945662#comment-13945662 ]
Karthik Kambatla commented on YARN-1867:
It appears the RM doesn't have the ACLs corresponding to an application from {{RMContext#getRMApps()}}. We couldn't reproduce this. The cluster had RM failover with app recovery. We couldn't identify the source of this through visual inspection. This can happen when the RM goes down (or is restarted, or fails over) between adding the app and adding the ACLs for the app. If that is the case, I see the following solutions:
# If ACLs are not available, use the default value.
# Reverse the order of adding an app to ACLsManager and RMContext.
NPE while fetching apps via the REST API
Key: YARN-1867 URL: https://issues.apache.org/jira/browse/YARN-1867 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.4.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Labels: rest_api
We ran into the following NPE when fetching applications using the REST API:
{noformat}
INTERNAL_SERVER_ERROR java.lang.NullPointerException
at org.apache.hadoop.yarn.server.security.ApplicationACLsManager.checkAccess(ApplicationACLsManager.java:104)
at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices.hasAccess(RMWebServices.java:123)
at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices.getApps(RMWebServices.java:418)
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
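Editor's illustration for YARN-1867: the first solution proposed in the comment (fall back to a default when no ACLs are registered for an application) might be sketched as below. This is a deliberately simplified model, not ApplicationACLsManager: the class name is hypothetical, application IDs and users are plain strings, and an ACL is reduced to a set of users allowed to view the app.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical, simplified model of a null-safe ACL lookup; not the real ACLs manager. */
final class AclLookupSketch {
    private final Map<String, Set<String>> viewersByApp = new ConcurrentHashMap<>();
    private final Set<String> defaultViewers;

    AclLookupSketch(Set<String> defaultViewers) {
        this.defaultViewers = defaultViewers;
    }

    void addApplication(String appId, Set<String> viewers) {
        viewersByApp.put(appId, viewers);
    }

    /** Returns whether user may view appId; never throws for an unregistered app. */
    boolean checkAccess(String user, String appId, String owner) {
        if (user.equals(owner)) {
            return true; // in this model the owner can always see their own app
        }
        // Fall back to the default instead of dereferencing a missing entry.
        Set<String> viewers = viewersByApp.getOrDefault(appId, defaultViewers);
        return viewers.contains(user);
    }
}
```

Whether the default should be permissive or restrictive is a policy decision the comment leaves open; the sketch only shows that the lookup no longer fails when the entry is missing.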
[jira] [Updated] (YARN-1854) TestRMHA#testStartAndTransitions Fails
[ https://issues.apache.org/jira/browse/YARN-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mit Desai updated YARN-1854:
Attachment: Log.rtf
[~rohithsharma], [~vinodkv], I have added the logs for the failure that I mentioned before. I found this failure in our nightly builds.
TestRMHA#testStartAndTransitions Fails
--
Key: YARN-1854 URL: https://issues.apache.org/jira/browse/YARN-1854 Project: Hadoop YARN Issue Type: Test Affects Versions: 2.4.0 Reporter: Mit Desai Assignee: Rohith Priority: Blocker Fix For: 2.4.0 Attachments: Log.rtf, YARN-1854.1.patch, YARN-1854.patch
{noformat}
testStartAndTransitions(org.apache.hadoop.yarn.server.resourcemanager.TestRMHA) Time elapsed: 5.883 sec <<< FAILURE!
java.lang.AssertionError: Incorrect value for metric availableMB expected:<2048> but was:<4096>
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at org.apache.hadoop.yarn.server.resourcemanager.TestRMHA.assertMetric(TestRMHA.java:396)
at org.apache.hadoop.yarn.server.resourcemanager.TestRMHA.verifyClusterMetrics(TestRMHA.java:387)
at org.apache.hadoop.yarn.server.resourcemanager.TestRMHA.testStartAndTransitions(TestRMHA.java:160)
Results :
Failed tests:
TestRMHA.testStartAndTransitions:160->verifyClusterMetrics:387->assertMetric:396 Incorrect value for metric availableMB expected:<2048> but was:<4096>
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-596) In fair scheduler, intra-application container priorities affect inter-application preemption decisions
[ https://issues.apache.org/jira/browse/YARN-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945695#comment-13945695 ] Hadoop QA commented on YARN-596: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12636418/YARN-596.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3444//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/3444//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3444//console This message is automatically generated. 
In fair scheduler, intra-application container priorities affect inter-application preemption decisions --- Key: YARN-596 URL: https://issues.apache.org/jira/browse/YARN-596 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.3-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-596.patch, YARN-596.patch In the fair scheduler, containers are chosen for preemption in the following way: All containers for all apps that are in queues that are over their fair share are put in a list. The list is sorted in order of the priority that the container was requested in. This means that an application can shield itself from preemption by requesting its containers at higher priorities, which doesn't really make sense. Also, an application that is not over its fair share, but that is in a queue that is over its fair share, is just as likely to have containers preempted as an application that is over its fair share. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1868) YARN status web ui does not show correctly in IE 11
[ https://issues.apache.org/jira/browse/YARN-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chuan Liu updated YARN-1868: Attachment: YARN_status.png Attached a screenshot. YARN status web ui does not show correctly in IE 11 --- Key: YARN-1868 URL: https://issues.apache.org/jira/browse/YARN-1868 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0 Reporter: Chuan Liu Attachments: YARN_status.png The YARN status web UI does not display correctly in IE 11. The drop-down menu for app entries is not shown. Also, the navigation menu displays incorrectly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-1868) YARN status web ui does not show correctly in IE 11
Chuan Liu created YARN-1868: --- Summary: YARN status web ui does not show correctly in IE 11 Key: YARN-1868 URL: https://issues.apache.org/jira/browse/YARN-1868 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0 Reporter: Chuan Liu The YARN status web UI does not display correctly in IE 11. The drop-down menu for app entries is not shown. Also, the navigation menu displays incorrectly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1865) ShellScriptBuilder does not check for some error conditions
[ https://issues.apache.org/jira/browse/YARN-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945707#comment-13945707 ] Hadoop QA commented on YARN-1865: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12636335/YARN-1865.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: org.apache.hadoop.yarn.server.nodemanager.TestContainerExecutor {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3445//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3445//console This message is automatically generated. 
ShellScriptBuilder does not check for some error conditions --- Key: YARN-1865 URL: https://issues.apache.org/jira/browse/YARN-1865 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.0.0, 2.2.0, 2.3.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Attachments: YARN-1865.1.patch The WindowsShellScriptBuilder does not check for commands exceeding the Windows maximum shell command line length (8191 chars). Neither the Unix nor the Windows script builder checks for error conditions after mkdir or link. The WindowsShellScriptBuilder mkdir is not safe with regard to paths containing spaces. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-596) In fair scheduler, intra-application container priorities affect inter-application preemption decisions
[ https://issues.apache.org/jira/browse/YARN-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated YARN-596: - Attachment: YARN-596.patch In fair scheduler, intra-application container priorities affect inter-application preemption decisions --- Key: YARN-596 URL: https://issues.apache.org/jira/browse/YARN-596 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.3-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-596.patch, YARN-596.patch, YARN-596.patch In the fair scheduler, containers are chosen for preemption in the following way: All containers for all apps that are in queues that are over their fair share are put in a list. The list is sorted in order of the priority that the container was requested in. This means that an application can shield itself from preemption by requesting its containers at higher priorities, which doesn't really make sense. Also, an application that is not over its fair share, but that is in a queue that is over its fair share, is just as likely to have containers preempted as an application that is over its fair share. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1837) TestMoveApplication.testMoveRejectedByScheduler randomly fails
[ https://issues.apache.org/jira/browse/YARN-1837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945730#comment-13945730 ] Mit Desai commented on YARN-1837: - This test is also failing for us. TestMoveApplication.testMoveRejectedByScheduler randomly fails -- Key: YARN-1837 URL: https://issues.apache.org/jira/browse/YARN-1837 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.3.0 Reporter: Tsuyoshi OZAWA TestMoveApplication#testMoveRejectedByScheduler fails because of a NullPointerException. It appears to be caused by an unhandled exception on the server side. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1865) ShellScriptBuilder does not check for some error conditions
[ https://issues.apache.org/jira/browse/YARN-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated YARN-1865: --- Attachment: YARN-1865.2.patch Removed the bogus change in TestContainerExecutor and reverted the whitespace-only change in Shell. ShellScriptBuilder does not check for some error conditions --- Key: YARN-1865 URL: https://issues.apache.org/jira/browse/YARN-1865 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.0.0, 2.2.0, 2.3.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Attachments: YARN-1865.1.patch, YARN-1865.2.patch The WindowsShellScriptBuilder does not check for commands exceeding the Windows maximum shell command line length (8191 chars). Neither the Unix nor the Windows script builder checks for error conditions after mkdir or link. The WindowsShellScriptBuilder mkdir is not safe with regard to paths containing spaces. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1521) Mark appropriate protocol methods with the idempotent annotation or AtMostOnce annotation
[ https://issues.apache.org/jira/browse/YARN-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945736#comment-13945736 ] Jian He commented on YARN-1521: --- Post initial review: - This doesn’t seem correct: a user may obtain multiple delegation tokens, but with this change a user can get only one token. {code} // check whether the token exists or not. // If this token existed, recover the token with // DelegationTokenInformation {code} - Explicitly assert that the app does not exist in RMStateStore. {code} // After submission, the applicationState will // not be saved in RMStateStore {code} - Explicitly assert that the app exists in RMContext after the 2nd RM becomes Active. {code} // Submit Application // After submission, the applicationState will be saved in RMStateStore. {code} - The bulk of the test-specific hack in MiniYarnCluster can be moved to TestHAProtocol, as MiniYarnCluster is commonly used by others. Mark appropriate protocol methods with the idempotent annotation or AtMostOnce annotation - Key: YARN-1521 URL: https://issues.apache.org/jira/browse/YARN-1521 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-1521.0.patch After YARN-1028, automatic failover was added to RMProxy. This JIRA is to identify whether we need to add the idempotent annotation and which methods can be marked as idempotent. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1867) NPE while fetching apps via the REST API
[ https://issues.apache.org/jira/browse/YARN-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945741#comment-13945741 ] Vinod Kumar Vavilapalli commented on YARN-1867: --- That doesn't sound right. When RM recovers, it just simply puts the app back into *both* the context and the acls manager. Can you debug more? NPE while fetching apps via the REST API Key: YARN-1867 URL: https://issues.apache.org/jira/browse/YARN-1867 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.4.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Labels: rest_api We ran into the following NPE when fetching applications using the REST API: {noformat} INTERNAL_SERVER_ERROR java.lang.NullPointerException at org.apache.hadoop.yarn.server.security.ApplicationACLsManager.checkAccess(ApplicationACLsManager.java:104) at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices.hasAccess(RMWebServices.java:123) at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices.getApps(RMWebServices.java:418) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1866) YARN RM fails to load state store with delegation token parsing error
[ https://issues.apache.org/jira/browse/YARN-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1866: -- Priority: Blocker (was: Critical) This sounds like a blocker. Likely broken by the patch for YARN-1812. YARN RM fails to load state store with delegation token parsing error - Key: YARN-1866 URL: https://issues.apache.org/jira/browse/YARN-1866 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Arpit Gupta Priority: Blocker In our secure Nightlies we saw exceptions in the RM log where it failed to parse the delegation token. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1810) YARN RM Webapp Application page Issue
[ https://issues.apache.org/jira/browse/YARN-1810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1810: -- Component/s: webapp YARN RM Webapp Application page Issue - Key: YARN-1810 URL: https://issues.apache.org/jira/browse/YARN-1810 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, webapp Affects Versions: 2.3.0 Reporter: Ethan Setnik Attachments: Screen Shot 2014-03-10 at 3.59.54 PM.png, Screen Shot 2014-03-11 at 1.40.12 PM.png When browsing the ResourceManager's web interface I am presented with the attached screenshot. I can't understand why it does not show the applications, even though there is no search text. The application counts show the correct values relative to the submissions, successes, and failures. Also see the text in the screenshot: Showing 0 to 0 of 0 entries (filtered from 19 total entries) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1868) YARN status web ui does not show correctly in IE 11
[ https://issues.apache.org/jira/browse/YARN-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1868: -- Component/s: webapp YARN status web ui does not show correctly in IE 11 --- Key: YARN-1868 URL: https://issues.apache.org/jira/browse/YARN-1868 Project: Hadoop YARN Issue Type: Bug Components: webapp Affects Versions: 3.0.0 Reporter: Chuan Liu Attachments: YARN_status.png The YARN status web ui does not show correctly in IE 11. The drop-down menu for app entries is not shown. Also, the navigation menu displays incorrectly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1868) YARN status web ui does not show correctly in IE 11
[ https://issues.apache.org/jira/browse/YARN-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945772#comment-13945772 ] Vinod Kumar Vavilapalli commented on YARN-1868: --- Sounds related to YARN-1810. YARN status web ui does not show correctly in IE 11 --- Key: YARN-1868 URL: https://issues.apache.org/jira/browse/YARN-1868 Project: Hadoop YARN Issue Type: Bug Components: webapp Affects Versions: 3.0.0 Reporter: Chuan Liu Attachments: YARN_status.png The YARN status web ui does not show correctly in IE 11. The drop-down menu for app entries is not shown. Also, the navigation menu displays incorrectly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1865) ShellScriptBuilder does not check for some error conditions
[ https://issues.apache.org/jira/browse/YARN-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945787#comment-13945787 ] Hadoop QA commented on YARN-1865: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12636441/YARN-1865.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3447//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3447//console This message is automatically generated. 
ShellScriptBuilder does not check for some error conditions --- Key: YARN-1865 URL: https://issues.apache.org/jira/browse/YARN-1865 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.0.0, 2.2.0, 2.3.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Attachments: YARN-1865.1.patch, YARN-1865.2.patch The WindowsShellScriptBuilder does not check for commands exceeding the Windows maximum shell command line length (8191 chars). Neither the Unix nor the Windows script builder checks for error conditions after mkdir or link. WindowsShellScriptBuilder mkdir is not safe with regard to paths containing spaces. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-596) In fair scheduler, intra-application container priorities affect inter-application preemption decisions
[ https://issues.apache.org/jira/browse/YARN-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945792#comment-13945792 ] Hadoop QA commented on YARN-596: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12636436/YARN-596.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3446//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3446//console This message is automatically generated. 
In fair scheduler, intra-application container priorities affect inter-application preemption decisions --- Key: YARN-596 URL: https://issues.apache.org/jira/browse/YARN-596 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.3-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-596.patch, YARN-596.patch, YARN-596.patch In the fair scheduler, containers are chosen for preemption in the following way: All containers for all apps that are in queues that are over their fair share are put in a list. The list is sorted in order of the priority that the container was requested in. This means that an application can shield itself from preemption by requesting its containers at higher priorities, which doesn't really make sense. Also, an application that is not over its fair share, but that is in a queue that is over its fair share, is just as likely to have containers preempted as an application that is over its fair share. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (YARN-1866) YARN RM fails to load state store with delegation token parsing error
[ https://issues.apache.org/jira/browse/YARN-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He reassigned YARN-1866: - Assignee: Jian He YARN RM fails to load state store with delegation token parsing error - Key: YARN-1866 URL: https://issues.apache.org/jira/browse/YARN-1866 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Jian He Priority: Blocker In our secure Nightlies we saw exceptions in the RM log where it failed to parse the delegation token. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-1869) Access to zkAcl should be synchronized in ZKRMStateStore#addStoreOrUpdateOps()
Ted Yu created YARN-1869: Summary: Access to zkAcl should be synchronized in ZKRMStateStore#addStoreOrUpdateOps() Key: YARN-1869 URL: https://issues.apache.org/jira/browse/YARN-1869 Project: Hadoop YARN Issue Type: Bug Reporter: Ted Yu Priority: Minor Here is related code: {code} } else { opList.add(Op.create(nodeCreatePath, tokenOs.toByteArray(), zkAcl, CreateMode.PERSISTENT)); } {code} The other methods accessing zkAcl are synchronized. -- This message was sent by Atlassian JIRA (v6.2#6252)
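The concern in YARN-1869 is that one code path reads a shared field without the lock that the other accessors hold. A minimal Java sketch of the general remedy follows. It does not use the ZooKeeper API: `AclGuardSketch`, `setAcl`, and `buildCreateOp` are hypothetical names, and a list of strings stands in for the real `zkAcl` value.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a store whose ACL field is shared between threads.
final class AclGuardSketch {

  // Stand-in for zkAcl; every access goes through a synchronized method.
  private List<String> acl = Collections.emptyList();

  // Writer: replaces the ACL while holding the object's monitor.
  synchronized void setAcl(List<String> newAcl) {
    acl = Collections.unmodifiableList(new ArrayList<>(newAcl));
  }

  // Reader: builds the operation under the same monitor as the writer,
  // so it cannot observe a stale or partially published ACL.
  synchronized String buildCreateOp(String path) {
    return "create " + path + " acl=" + acl;
  }
}
```

Synchronizing the reader on the same monitor as the writers is the smallest change that matches the report's observation that the other methods are already synchronized. Whether the real fix takes this form is not stated in the message.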
[jira] [Created] (YARN-1870) FileInputStream is not closed in ProcfsBasedProcessTree#constructProcessSMAPInfo()
Ted Yu created YARN-1870: Summary: FileInputStream is not closed in ProcfsBasedProcessTree#constructProcessSMAPInfo() Key: YARN-1870 URL: https://issues.apache.org/jira/browse/YARN-1870 Project: Hadoop YARN Issue Type: Bug Reporter: Ted Yu Priority: Minor {code} List<String> lines = IOUtils.readLines(new FileInputStream(file)); {code} FileInputStream is not closed. -- This message was sent by Atlassian JIRA (v6.2#6252)
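The leak reported in YARN-1870 comes from passing a freshly opened stream to a reader method and never closing it. One common way to avoid that is sketched below using only the JDK. This is an assumption about the shape of a fix, not the attached patch: `SmapsLineReader` and `readAllLines` are hypothetical names, the UTF-8 charset is an assumption, and try-with-resources requires Java 7 or later.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper showing a leak-free way to read all lines of a file.
final class SmapsLineReader {

  // Reads every line of the file; the stream is closed even if reading fails.
  static List<String> readAllLines(File file) throws IOException {
    List<String> lines = new ArrayList<>();
    // try-with-resources closes the reader, which closes the wrapped stream.
    try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
      String line;
      while ((line = reader.readLine()) != null) {
        lines.add(line);
      }
    }
    return lines;
  }
}
```

On a runtime without try-with-resources, the same effect can be had by keeping a reference to the stream and closing it in a `finally` block.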
[jira] [Commented] (YARN-1866) YARN RM fails to load state store with delegation token parsing error
[ https://issues.apache.org/jira/browse/YARN-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945895#comment-13945895 ] Jian He commented on YARN-1866: --- This happens because RMDelegationTokenIdentifier#localServiceAddress is not yet set when the application is renewing the local tokens on recovery. YARN-1107 had a similar issue, where the test case didn't catch this because RMDelegationTokenIdentifier#localServiceAddress is a static field and once it's set in the first RM, the second RM can also pick up the same value. Uploaded a patch that also sets the localSecretManager and service address in DelegationTokenRenewer#serviceInit. YARN RM fails to load state store with delegation token parsing error - Key: YARN-1866 URL: https://issues.apache.org/jira/browse/YARN-1866 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Jian He Priority: Blocker In our secure Nightlies we saw exceptions in the RM log where it failed to parse the delegation token. -- This message was sent by Atlassian JIRA (v6.2#6252)
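The comment on YARN-1866 explains why a test with two ResourceManagers in one JVM can miss this kind of bug: a static field set by the first instance is still visible to the second. The Java sketch below illustrates only that masking effect. All names (`StaticAddressSketch`, `TokenIdentifier`, `Rm`) and the address string are hypothetical and do not reflect the real classes.

```java
// Hypothetical illustration of a static field hiding a missing initialization.
final class StaticAddressSketch {

  static final class TokenIdentifier {
    // Shared by every instance in the JVM, like the static field in the comment.
    static String localServiceAddress;
  }

  static final class Rm {
    // An RM that only sometimes initializes the shared address.
    Rm(boolean setsAddress, String address) {
      if (setsAddress) {
        TokenIdentifier.localServiceAddress = address;
      }
    }

    // Renewal of local tokens needs the address to be present.
    boolean canRenewLocalTokens() {
      return TokenIdentifier.localServiceAddress != null;
    }
  }
}
```

In a fresh JVM an `Rm` that skips initialization cannot renew tokens, which corresponds to the failure described for a real restart. In a test where an earlier `Rm` already set the field, the same uninitialized `Rm` appears to work.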
[jira] [Updated] (YARN-1521) Mark appropriate protocol methods with the idempotent annotation or AtMostOnce annotation
[ https://issues.apache.org/jira/browse/YARN-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1521: Attachment: YARN-1521.1.patch Mark appropriate protocol methods with the idempotent annotation or AtMostOnce annotation - Key: YARN-1521 URL: https://issues.apache.org/jira/browse/YARN-1521 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-1521.0.patch, YARN-1521.1.patch After YARN-1028, automatic failover was added to RMProxy. This JIRA is to identify whether we need to add the idempotent annotation and which methods can be marked as idempotent. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1866) YARN RM fails to load state store with delegation token parsing error
[ https://issues.apache.org/jira/browse/YARN-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-1866: -- Attachment: YARN-1866.1.patch The patch added a few missing test timeouts in TestRMRestart. YARN RM fails to load state store with delegation token parsing error - Key: YARN-1866 URL: https://issues.apache.org/jira/browse/YARN-1866 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Jian He Priority: Blocker Attachments: YARN-1866.1.patch In our secure Nightlies we saw exceptions in the RM log where it failed to parse the delegation token. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1521) Mark appropriate protocol methods with the idempotent annotation or AtMostOnce annotation
[ https://issues.apache.org/jira/browse/YARN-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945905#comment-13945905 ] Xuan Gong commented on YARN-1521: - Thanks for the review. bq. The bulk of test-specific hack in MiniYarnCluster can be moved to TestHAProtocol, as MiniYarnCluster is commonly used by others. Yes, you are right. Moved those changes from MiniYarnCluster to TestHAProtocol. bq. Explicitly assert the app exists in RMContext after the 2nd RM comes Active./ Explicitly assert the App does not exist in RMStateStore or not. DONE bq. this doesn’t seem correct, user is possible to get multiple delegation tokens. Given this change, user can only get one token Removed all changes from ClientRMService#getDelegationToken(). The method is simply re-entered if failover happens. [~jianhe] Please correct me if I am wrong. getDelegationToken() is used to get a new token, which is not yet saved in ZooKeeper. So, even if failover happens during the getDelegationToken call, we cannot get the previously generated token back; we just let the method generate a new token, and it should be fine. Mark appropriate protocol methods with the idempotent annotation or AtMostOnce annotation - Key: YARN-1521 URL: https://issues.apache.org/jira/browse/YARN-1521 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-1521.0.patch, YARN-1521.1.patch After YARN-1028, automatic failover was added to RMProxy. This JIRA is to identify whether we need to add the idempotent annotation and which methods can be marked as idempotent. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1521) Mark appropriate protocol methods with the idempotent annotation or AtMostOnce annotation
[ https://issues.apache.org/jira/browse/YARN-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945964#comment-13945964 ] Hadoop QA commented on YARN-1521: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12636467/YARN-1521.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3448//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3448//console This message is automatically generated. 
Mark appropriate protocol methods with the idempotent annotation or AtMostOnce annotation - Key: YARN-1521 URL: https://issues.apache.org/jira/browse/YARN-1521 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-1521.0.patch, YARN-1521.1.patch After YARN-1028, automatic failover was added to RMProxy. This JIRA is to identify whether we need to add the idempotent annotation and which methods can be marked as idempotent. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1870) FileInputStream is not closed in ProcfsBasedProcessTree#constructProcessSMAPInfo()
[ https://issues.apache.org/jira/browse/YARN-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945968#comment-13945968 ] Fengdong Yu commented on YARN-1870: --- Good catch [~yuzhih...@gmail.com], I've uploaded a simple patch to cover it. FileInputStream is not closed in ProcfsBasedProcessTree#constructProcessSMAPInfo() -- Key: YARN-1870 URL: https://issues.apache.org/jira/browse/YARN-1870 Project: Hadoop YARN Issue Type: Bug Reporter: Ted Yu Priority: Minor {code} List<String> lines = IOUtils.readLines(new FileInputStream(file)); {code} FileInputStream is not closed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1870) FileInputStream is not closed in ProcfsBasedProcessTree#constructProcessSMAPInfo()
[ https://issues.apache.org/jira/browse/YARN-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fengdong Yu updated YARN-1870: -- Attachment: YARN-1870.patch FileInputStream is not closed in ProcfsBasedProcessTree#constructProcessSMAPInfo() -- Key: YARN-1870 URL: https://issues.apache.org/jira/browse/YARN-1870 Project: Hadoop YARN Issue Type: Bug Reporter: Ted Yu Priority: Minor Attachments: YARN-1870.patch {code} List<String> lines = IOUtils.readLines(new FileInputStream(file)); {code} FileInputStream is not closed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1870) FileInputStream is not closed in ProcfsBasedProcessTree#constructProcessSMAPInfo()
[ https://issues.apache.org/jira/browse/YARN-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945991#comment-13945991 ] Ted Yu commented on YARN-1870: -- lgtm FileInputStream is not closed in ProcfsBasedProcessTree#constructProcessSMAPInfo() -- Key: YARN-1870 URL: https://issues.apache.org/jira/browse/YARN-1870 Project: Hadoop YARN Issue Type: Bug Reporter: Ted Yu Priority: Minor Attachments: YARN-1870.patch {code} List<String> lines = IOUtils.readLines(new FileInputStream(file)); {code} FileInputStream is not closed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1870) FileInputStream is not closed in ProcfsBasedProcessTree#constructProcessSMAPInfo()
[ https://issues.apache.org/jira/browse/YARN-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946012#comment-13946012 ] Fengdong Yu commented on YARN-1870: --- [~yuzhih...@gmail.com], can you add me as a YARN contributor? FileInputStream is not closed in ProcfsBasedProcessTree#constructProcessSMAPInfo() -- Key: YARN-1870 URL: https://issues.apache.org/jira/browse/YARN-1870 Project: Hadoop YARN Issue Type: Bug Reporter: Ted Yu Priority: Minor Attachments: YARN-1870.patch {code} List<String> lines = IOUtils.readLines(new FileInputStream(file)); {code} FileInputStream is not closed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1866) YARN RM fails to load state store with delegation token parsing error
[ https://issues.apache.org/jira/browse/YARN-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946028#comment-13946028 ] Vinod Kumar Vavilapalli commented on YARN-1866: --- The changes look good to me. +1. This part of the code keeps breaking despite all the tests. Here's hoping this is the last of the issues. Will check this in if Jenkins says okay. YARN RM fails to load state store with delegation token parsing error - Key: YARN-1866 URL: https://issues.apache.org/jira/browse/YARN-1866 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Jian He Priority: Blocker Attachments: YARN-1866.1.patch In our secure Nightlies we saw exceptions in the RM log where it failed to parse the delegation token. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1866) YARN RM fails to load state store with delegation token parsing error
[ https://issues.apache.org/jira/browse/YARN-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13946046#comment-13946046 ] Hadoop QA commented on YARN-1866: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12636470/YARN-1866.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer org.apache.hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewerLifecycle {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3449//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3449//console This message is automatically generated. 
YARN RM fails to load state store with delegation token parsing error - Key: YARN-1866 URL: https://issues.apache.org/jira/browse/YARN-1866 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Jian He Priority: Blocker Attachments: YARN-1866.1.patch In our secure Nightlies we saw exceptions in the RM log where it failed to parse the delegation token. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1868) YARN status web ui does not show correctly in IE 11
[ https://issues.apache.org/jira/browse/YARN-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946049#comment-13946049 ] Chuan Liu commented on YARN-1868: - From the screenshots, the IE 11 rendering seems much worse than YARN-1810. The page did show correctly in IE 9 and in Chrome on Windows. YARN status web ui does not show correctly in IE 11 --- Key: YARN-1868 URL: https://issues.apache.org/jira/browse/YARN-1868 Project: Hadoop YARN Issue Type: Bug Components: webapp Affects Versions: 3.0.0 Reporter: Chuan Liu Attachments: YARN_status.png The YARN status web ui does not show correctly in IE 11. The drop-down menu for app entries is not shown. Also, the navigation menu displays incorrectly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1866) YARN RM fails to load state store with delegation token parsing error
[ https://issues.apache.org/jira/browse/YARN-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-1866: -- Attachment: YARN-1866.2.patch YARN RM fails to load state store with delegation token parsing error - Key: YARN-1866 URL: https://issues.apache.org/jira/browse/YARN-1866 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Jian He Priority: Blocker Attachments: YARN-1866.1.patch, YARN-1866.2.patch In our secure Nightlies we saw exceptions in the RM log where it failed to parse the delegation token. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1866) YARN RM fails to load state store with delegation token parsing error
[ https://issues.apache.org/jira/browse/YARN-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946092#comment-13946092 ] Jian He commented on YARN-1866: --- Fixed the test failures. The tests failed because rmContext is a mock object in the tests, so rmContext#getClientRMService()#getBindAddress() throws an NPE. YARN RM fails to load state store with delegation token parsing error - Key: YARN-1866 URL: https://issues.apache.org/jira/browse/YARN-1866 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Jian He Priority: Blocker Attachments: YARN-1866.1.patch, YARN-1866.2.patch In our secure Nightlies we saw exceptions in the RM log where it failed to parse the delegation token. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1867) NPE while fetching apps via the REST API
[ https://issues.apache.org/jira/browse/YARN-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946096#comment-13946096 ] Vinod Kumar Vavilapalli commented on YARN-1867: --- Okay, I think I know the issue. Working on a patch. NPE while fetching apps via the REST API Key: YARN-1867 URL: https://issues.apache.org/jira/browse/YARN-1867 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.4.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Labels: rest_api We ran into the following NPE when fetching applications using the REST API: {noformat} INTERNAL_SERVER_ERROR java.lang.NullPointerException at org.apache.hadoop.yarn.server.security.ApplicationACLsManager.checkAccess(ApplicationACLsManager.java:104) at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices.hasAccess(RMWebServices.java:123) at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices.getApps(RMWebServices.java:418) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1850) Make enabling timeline service configurable
[ https://issues.apache.org/jira/browse/YARN-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946110#comment-13946110 ] Hudson commented on YARN-1850: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5396 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5396/]) YARN-1850. Introduced the ability to optionally disable sending out timeline-events in the TimelineClient. Contributed by Zhijie Shen. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1581189) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/TimelineClientImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestTimelineClient.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml Make enabling timeline service configurable Key: YARN-1850 URL: https://issues.apache.org/jira/browse/YARN-1850 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Zhijie Shen Fix For: 2.4.0 Attachments: YARN-1850.1.patch Like the generic history service, enabling the timeline service should be configurable, in case the timeline server is not up. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1866) YARN RM fails to load state store with delegation token parsing error
[ https://issues.apache.org/jira/browse/YARN-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946141#comment-13946141 ] Hadoop QA commented on YARN-1866: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12636510/YARN-1866.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3451//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3451//console This message is automatically generated. YARN RM fails to load state store with delegation token parsing error - Key: YARN-1866 URL: https://issues.apache.org/jira/browse/YARN-1866 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Jian He Priority: Blocker Attachments: YARN-1866.1.patch, YARN-1866.2.patch In our secure nightlies we saw exceptions in the RM log where it failed to parse the delegation token.
[jira] [Updated] (YARN-1452) Document the usage of the generic application history and the timeline data service
[ https://issues.apache.org/jira/browse/YARN-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-1452: -- Attachment: YARN-1452.1.patch Created a patch that adds a one-page introduction to the timeline and generic history services, covering what the service is, its current status, how to configure and start the timeline server, how to use the CLI to access the generic data, and how to publish per-framework data via TimelineClient. Document the usage of the generic application history and the timeline data service --- Key: YARN-1452 URL: https://issues.apache.org/jira/browse/YARN-1452 Project: Hadoop YARN Issue Type: Task Reporter: Zhijie Shen Assignee: Zhijie Shen Labels: documentation Attachments: YARN-1452.1.patch We need to write a set of documents to guide users, covering topics such as command line tools, configurations and REST APIs.
[jira] [Updated] (YARN-1452) Document the usage of the generic application history and the timeline data service
[ https://issues.apache.org/jira/browse/YARN-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-1452: -- Attachment: TimelineServer.html Uploaded the generated web page for demonstration. Document the usage of the generic application history and the timeline data service --- Key: YARN-1452 URL: https://issues.apache.org/jira/browse/YARN-1452 Project: Hadoop YARN Issue Type: Task Reporter: Zhijie Shen Assignee: Zhijie Shen Labels: documentation Attachments: TimelineServer.html, YARN-1452.1.patch We need to write a set of documents to guide users, covering topics such as command line tools, configurations and REST APIs.