[jira] [Updated] (YARN-1852) Application recovery throws InvalidStateTransitonException for FAILED and KILLED jobs

2014-03-24 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-1852:
-

Attachment: YARN-1852.3patch

bq. We may check against RMApp.recoveredFinalState state instead?
Done

A test is included that checks FINISHED/KILLED/FAILED applications. I verified 
the fix on a single-node cluster.

Attached an updated patch as per the comment. Please review.

 Application recovery throws InvalidStateTransitonException for FAILED and 
 KILLED jobs
 -

 Key: YARN-1852
 URL: https://issues.apache.org/jira/browse/YARN-1852
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.3.0, 2.4.0
Reporter: Rohith
Assignee: Rohith
 Attachments: YARN-1852.2.patch, YARN-1852.3patch, YARN-1852.patch


 Recovering a failed/killed application throws InvalidStateTransitonException.
 The exceptions are logged during recovery of applications.
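
 The patch itself is not reproduced in this thread. Purely as an illustration 
 of the failure mode and of the review suggestion to check the recovered final 
 state, here is a minimal sketch of a table-driven state machine. All names 
 (RecoverySketch, State, recover) are hypothetical, IllegalStateException 
 stands in for the YARN exception, and none of this is the RMAppImpl code.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

/** Hypothetical sketch; not the ResourceManager's actual state machine. */
final class RecoverySketch {
    enum State { NEW, RUNNING, FINISHED, FAILED, KILLED }

    /** Registered transitions; anything not listed here is rejected. */
    private static final Map<State, EnumSet<State>> ALLOWED = new EnumMap<>(State.class);
    static {
        ALLOWED.put(State.NEW,
            EnumSet.of(State.RUNNING, State.FINISHED, State.FAILED, State.KILLED));
        ALLOWED.put(State.RUNNING,
            EnumSet.of(State.FINISHED, State.FAILED, State.KILLED));
    }

    /** Performs one transition, or throws if it was never registered. */
    static State transition(State from, State to) {
        EnumSet<State> targets = ALLOWED.get(from);
        if (targets == null || !targets.contains(to)) {
            throw new IllegalStateException("Invalid transition " + from + " -> " + to);
        }
        return to;
    }

    /**
     * On recovery, move straight to the stored final state when there is one,
     * so no further events are replayed against an already finished app.
     * With no stored final state, the app resumes as RUNNING.
     */
    static State recover(State recoveredFinalState) {
        State target = (recoveredFinalState != null) ? recoveredFinalState : State.RUNNING;
        return transition(State.NEW, target);
    }
}
```

 In this sketch, sending any event to an application that is already in 
 FAILED or KILLED is rejected, which is the kind of error the issue reports.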



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1852) Application recovery throws InvalidStateTransitonException for FAILED and KILLED jobs

2014-03-24 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-1852:
-

Attachment: (was: YARN-1852.3patch)

 Application recovery throws InvalidStateTransitonException for FAILED and 
 KILLED jobs
 -

 Key: YARN-1852
 URL: https://issues.apache.org/jira/browse/YARN-1852
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.3.0, 2.4.0
Reporter: Rohith
Assignee: Rohith
 Attachments: YARN-1852.2.patch, YARN-1852.3.patch, YARN-1852.patch




[jira] [Updated] (YARN-1852) Application recovery throws InvalidStateTransitonException for FAILED and KILLED jobs

2014-03-24 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-1852:
-

Attachment: YARN-1852.3.patch

 Application recovery throws InvalidStateTransitonException for FAILED and 
 KILLED jobs
 -

 Key: YARN-1852
 URL: https://issues.apache.org/jira/browse/YARN-1852
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.3.0, 2.4.0
Reporter: Rohith
Assignee: Rohith
 Attachments: YARN-1852.2.patch, YARN-1852.3.patch, YARN-1852.patch




[jira] [Commented] (YARN-1852) Application recovery throws InvalidStateTransitonException for FAILED and KILLED jobs

2014-03-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944881#comment-13944881
 ] 

Hadoop QA commented on YARN-1852:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12636314/YARN-1852.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3441//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3441//console

This message is automatically generated.

 Application recovery throws InvalidStateTransitonException for FAILED and 
 KILLED jobs
 -

 Key: YARN-1852
 URL: https://issues.apache.org/jira/browse/YARN-1852
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.3.0, 2.4.0
Reporter: Rohith
Assignee: Rohith
 Attachments: YARN-1852.2.patch, YARN-1852.3.patch, YARN-1852.patch




[jira] [Updated] (YARN-1865) ShellScriptBuilder does not check for some error conditions

2014-03-24 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated YARN-1865:
---

Priority: Minor  (was: Major)

 ShellScriptBuilder does not check for some error conditions
 ---

 Key: YARN-1865
 URL: https://issues.apache.org/jira/browse/YARN-1865
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0, 2.2.0, 2.3.0
Reporter: Remus Rusanu
Priority: Minor

 The WindowsShellScriptBuilder does not check for commands exceeding the Windows 
 maximum shell command line length (8191 chars).
 Neither the Unix nor the Windows script builder checks for an error condition 
 after mkdir or link.
 WindowsShellScriptBuilder mkdir is not safe with regard to paths containing 
 spaces.





[jira] [Created] (YARN-1865) ShellScriptBuilder does not check for some error conditions

2014-03-24 Thread Remus Rusanu (JIRA)
Remus Rusanu created YARN-1865:
--

 Summary: ShellScriptBuilder does not check for some error 
conditions
 Key: YARN-1865
 URL: https://issues.apache.org/jira/browse/YARN-1865
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.3.0, 2.2.0, 3.0.0
Reporter: Remus Rusanu


The WindowsShellScriptBuilder does not check for commands exceeding the Windows 
maximum shell command line length (8191 chars).
Neither the Unix nor the Windows script builder checks for an error condition 
after mkdir or link.
WindowsShellScriptBuilder mkdir is not safe with regard to paths containing 
spaces.
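
As a rough sketch of the three checks listed above (not the attached patch, and 
not the NodeManager's actual builder), the helper below rejects over-long 
command lines, quotes the mkdir path, and emits an errorlevel check after 
mkdir. The class and method names are hypothetical; only the 8191 limit comes 
from the description.

```java
/** Hypothetical helper illustrating the checks; not the real ShellScriptBuilder. */
final class WindowsCommandChecks {
    /** Maximum command line length quoted in the issue description. */
    static final int MAX_COMMAND_LENGTH = 8191;

    /** Returns the command unchanged, or throws if it exceeds the limit. */
    static String checkLength(String command) {
        if (command.length() > MAX_COMMAND_LENGTH) {
            throw new IllegalArgumentException(
                "Command line is " + command.length() + " chars, limit is "
                    + MAX_COMMAND_LENGTH);
        }
        return command;
    }

    /**
     * Builds a mkdir line whose path is double-quoted, so spaces are safe,
     * followed by a line that aborts the script if mkdir failed.
     */
    static String mkdirLine(String path) {
        String quoted = "\"" + path + "\"";
        String mkdir = checkLength("@if not exist " + quoted + " mkdir " + quoted);
        return mkdir + "\r\n" + "@if %errorlevel% neq 0 exit /b %errorlevel%";
    }
}
```

A Unix counterpart could do the same with a quoted path and an exit-code test 
after each mkdir or ln line.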





[jira] [Updated] (YARN-1865) ShellScriptBuilder does not check for some error conditions

2014-03-24 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated YARN-1865:
---

Attachment: YARN-1865.1.patch

 ShellScriptBuilder does not check for some error conditions
 ---

 Key: YARN-1865
 URL: https://issues.apache.org/jira/browse/YARN-1865
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0, 2.2.0, 2.3.0
Reporter: Remus Rusanu
Priority: Minor
 Attachments: YARN-1865.1.patch




[jira] [Commented] (YARN-1670) aggregated log writer can write more log data then it says is the log length

2014-03-24 Thread Mit Desai (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945188#comment-13945188
 ] 

Mit Desai commented on YARN-1670:
-

I realize that I created the patch based on the trunk before the commit of the 
earlier patch so it fails. I will upload a new one.
[~jeagles]
# Nice logic. This is much easier to understand. I will incorporate your 
suggestion in the new change.
# For the buffer size, you are correct. I already did some analysis on that. The 
discussions/articles I read online say that a 64K buffer size performs 
efficiently.

 aggregated log writer can write more log data then it says is the log length
 

 Key: YARN-1670
 URL: https://issues.apache.org/jira/browse/YARN-1670
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.10, 2.2.0
Reporter: Thomas Graves
Assignee: Mit Desai
Priority: Critical
 Fix For: 2.4.0

 Attachments: YARN-1670-b23.patch, YARN-1670-v2-b23.patch, 
 YARN-1670-v2.patch, YARN-1670-v3-b23.patch, YARN-1670-v3.patch, 
 YARN-1670-v4-b23.patch, YARN-1670-v4.patch, YARN-1670.patch, YARN-1670.patch


 We have seen exceptions when using 'yarn logs' to read log files. 
 at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
 at java.lang.Long.parseLong(Long.java:441)
 at java.lang.Long.parseLong(Long.java:483)
 at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readAContainerLogsForALogType(AggregatedLogFormat.java:518)
 at org.apache.hadoop.yarn.logaggregation.LogDumper.dumpAContainerLogs(LogDumper.java:178)
 at org.apache.hadoop.yarn.logaggregation.LogDumper.run(LogDumper.java:130)
 at org.apache.hadoop.yarn.logaggregation.LogDumper.main(LogDumper.java:246)
 We traced it down to the reader trying to read the file type of the next file 
 while its read position is still inside log data from the previous file. The 
 log length was written as a certain size, but the log data was actually 
 longer than that.
 Inside the write() routine in LogValue, it first writes the log file length, 
 but then when it writes the log itself it just reads to the end of the 
 file. There is a race condition here: if someone is still writing to the 
 file when it is aggregated, the length written could be too small.
 We should have the write() routine stop once it has written the number of 
 bytes it declared as the length. It would be nice if we could somehow tell 
 the user the log might be truncated, but I'm not sure of a good way to do this.
 We also noticed a bug in readAContainerLogsForALogType, where it uses an int 
 for curRead whereas it should use a long.
   while (len != -1 && curRead < fileLength) {
 This isn't actually a problem right now, as it looks like the underlying 
 decoder is doing the right thing and the len condition exits.
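
 The proposal above (stop writing once the declared length has been written, 
 and track progress in a long) can be sketched as follows. This is a minimal 
 illustration under assumptions, not the committed patch: BoundedLogCopy and 
 copyAtMost are hypothetical names, and the 64K buffer is only the size 
 mentioned in the review comments.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper; not the actual LogValue.write() implementation. */
final class BoundedLogCopy {
    private static final int BUFFER_SIZE = 64 * 1024;

    /**
     * Copies at most declaredLength bytes from in to out, even if the source
     * has grown since the length was recorded. Returns the bytes copied.
     */
    static long copyAtMost(InputStream in, OutputStream out, long declaredLength)
            throws IOException {
        byte[] buf = new byte[BUFFER_SIZE];
        long copied = 0; // long, so counting does not overflow past 2 GiB
        while (copied < declaredLength) {
            // Never request more than the bytes still owed to the declared length.
            int toRead = (int) Math.min(buf.length, declaredLength - copied);
            int len = in.read(buf, 0, toRead);
            if (len == -1) {
                break; // source ended before the declared length was reached
            }
            out.write(buf, 0, len);
            copied += len;
        }
        return copied;
    }
}
```

 A caller can compare the return value with the declared length to detect a 
 source that ended early. Detecting that the file grew past the declared 
 length would need a separate size check.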





[jira] [Updated] (YARN-1670) aggregated log writer can write more log data then it says is the log length

2014-03-24 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai updated YARN-1670:


Attachment: YARN-1670-v4-b23.patch

 aggregated log writer can write more log data then it says is the log length
 

 Key: YARN-1670
 URL: https://issues.apache.org/jira/browse/YARN-1670
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.10, 2.2.0
Reporter: Thomas Graves
Assignee: Mit Desai
Priority: Critical
 Fix For: 2.4.0

 Attachments: YARN-1670-b23.patch, YARN-1670-v2-b23.patch, 
 YARN-1670-v2.patch, YARN-1670-v3-b23.patch, YARN-1670-v3.patch, 
 YARN-1670-v4-b23.patch, YARN-1670-v4-b23.patch, YARN-1670-v4.patch, 
 YARN-1670.patch, YARN-1670.patch




[jira] [Updated] (YARN-1670) aggregated log writer can write more log data then it says is the log length

2014-03-24 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai updated YARN-1670:


Attachment: YARN-1670-v4.patch

Attaching the patch with the updated changes.

 aggregated log writer can write more log data then it says is the log length
 

 Key: YARN-1670
 URL: https://issues.apache.org/jira/browse/YARN-1670
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.10, 2.2.0
Reporter: Thomas Graves
Assignee: Mit Desai
Priority: Critical
 Fix For: 2.4.0

 Attachments: YARN-1670-b23.patch, YARN-1670-v2-b23.patch, 
 YARN-1670-v2.patch, YARN-1670-v3-b23.patch, YARN-1670-v3.patch, 
 YARN-1670-v4-b23.patch, YARN-1670-v4-b23.patch, YARN-1670-v4.patch, 
 YARN-1670-v4.patch, YARN-1670.patch, YARN-1670.patch




[jira] [Commented] (YARN-1670) aggregated log writer can write more log data then it says is the log length

2014-03-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945296#comment-13945296
 ] 

Hadoop QA commented on YARN-1670:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12636350/YARN-1670-v4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3442//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3442//console

This message is automatically generated.

 aggregated log writer can write more log data then it says is the log length
 

 Key: YARN-1670
 URL: https://issues.apache.org/jira/browse/YARN-1670
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.10, 2.2.0
Reporter: Thomas Graves
Assignee: Mit Desai
Priority: Critical
 Fix For: 2.4.0

 Attachments: YARN-1670-b23.patch, YARN-1670-v2-b23.patch, 
 YARN-1670-v2.patch, YARN-1670-v3-b23.patch, YARN-1670-v3.patch, 
 YARN-1670-v4-b23.patch, YARN-1670-v4-b23.patch, YARN-1670-v4.patch, 
 YARN-1670-v4.patch, YARN-1670.patch, YARN-1670.patch




[jira] [Updated] (YARN-1452) Document AHS Feature

2014-03-24 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-1452:
--

Issue Type: Bug  (was: Sub-task)
Parent: (was: YARN-321)

 Document AHS Feature
 

 Key: YARN-1452
 URL: https://issues.apache.org/jira/browse/YARN-1452
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Zhijie Shen
Assignee: Zhijie Shen
  Labels: documentation

 We need to write a bunch of documents to guide users, covering command line 
 tools, configurations and REST APIs.





[jira] [Updated] (YARN-1452) Document AHS Feature

2014-03-24 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-1452:
--

Issue Type: Task  (was: Bug)

 Document AHS Feature
 

 Key: YARN-1452
 URL: https://issues.apache.org/jira/browse/YARN-1452
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
  Labels: documentation



[jira] [Updated] (YARN-1452) Document the usage of the generic application history and the timeline data service

2014-03-24 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-1452:
--

Summary: Document the usage of the generic application history and the 
timeline data service  (was: Document AHS Feature)

 Document the usage of the generic application history and the timeline data 
 service
 ---

 Key: YARN-1452
 URL: https://issues.apache.org/jira/browse/YARN-1452
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
  Labels: documentation



[jira] [Resolved] (YARN-1827) yarn client fails when RM is killed within 5s of job submission

2014-03-24 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved YARN-1827.
---

Resolution: Duplicate

This is getting fixed as part of YARN-1521. Closing as dup.

 yarn client fails when RM is killed within 5s of job submission
 ---

 Key: YARN-1827
 URL: https://issues.apache.org/jira/browse/YARN-1827
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Arpit Gupta







[jira] [Commented] (YARN-556) RM Restart phase 2 - Work preserving restart

2014-03-24 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945437#comment-13945437
 ] 

Bikas Saha commented on YARN-556:
-

Please align with the design doc while prototyping. If the design needs changes 
then please update the document. The sub-tasks need to follow the design doc so 
that other folks can follow even if they are not writing the code.

Some pieces of this are already underway in trunk (e.g. RM not killing the 
containers on app attempt exit). The scheduler changes are the most complex 
piece. But they can come in the end. Working on trunk allows breaks/bugs to be 
caught quicker and forces us to be more methodical in our approach. A branch is 
useful when it's not clear what approach to take or when we know the code is 
going to be broken across commits. So I would prefer we do this on trunk.

 RM Restart phase 2 - Work preserving restart
 

 Key: YARN-556
 URL: https://issues.apache.org/jira/browse/YARN-556
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: Work Preserving RM Restart.pdf


 YARN-128 covered storing the state needed for the RM to recover critical 
 information. This umbrella jira will track changes needed to recover the 
 running state of the cluster so that work can be preserved across RM restarts.





[jira] [Commented] (YARN-1670) aggregated log writer can write more log data then it says is the log length

2014-03-24 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945436#comment-13945436
 ] 

Jonathan Eagles commented on YARN-1670:
---

+1 on this change. committing to trunk, branch-2.4, branch-2, branch-0.23

 aggregated log writer can write more log data then it says is the log length
 

 Key: YARN-1670
 URL: https://issues.apache.org/jira/browse/YARN-1670
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.10, 2.2.0
Reporter: Thomas Graves
Assignee: Mit Desai
Priority: Critical
 Fix For: 2.4.0

 Attachments: YARN-1670-b23.patch, YARN-1670-v2-b23.patch, 
 YARN-1670-v2.patch, YARN-1670-v3-b23.patch, YARN-1670-v3.patch, 
 YARN-1670-v4-b23.patch, YARN-1670-v4-b23.patch, YARN-1670-v4.patch, 
 YARN-1670-v4.patch, YARN-1670.patch, YARN-1670.patch




[jira] [Commented] (YARN-1838) Timeline service getEntities API should provide ability to get entities from given id

2014-03-24 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945459#comment-13945459
 ] 

Zhijie Shen commented on YARN-1838:
---

Committed to trunk, branch-2 and branch-2.4. Thanks, Billie!

 Timeline service getEntities API should provide ability to get entities from 
 given id
 -

 Key: YARN-1838
 URL: https://issues.apache.org/jira/browse/YARN-1838
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Srimanth Gunturi
Assignee: Billie Rinaldi
 Fix For: 2.4.0

 Attachments: YARN-1838.1.patch, YARN-1838.2.patch, YARN-1838.3.patch, 
 YARN-1838.4.patch, YARN-1838.5.patch


 To support pagination, we need the ability to get entities from a certain ID by 
 providing a new param called {{fromid}}.
 For example, on a page of 10 jobs, our first call will be like
 [http://server:8188/ws/v1/timeline/HIVE_QUERY_ID?fields=events,primaryfilters,otherinfo&limit=11]
 When the user hits next, we would like to call
 [http://server:8188/ws/v1/timeline/HIVE_QUERY_ID?fields=events,primaryfilters,otherinfo&fromid=JID11&limit=11]
 and continue on for further _Next_ clicks.
 On hitting back, we will make similar calls for previous items
 [http://server:8188/ws/v1/timeline/HIVE_QUERY_ID?fields=events,primaryfilters,otherinfo&fromid=JID1&limit=11]
 {{fromid}} should be inclusive of the id given.
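
 The paging scheme described above, where each request asks for one entity 
 more than the page shows and the extra entity's ID becomes the inclusive 
 {{fromid}} of the next page, can be sketched on the client side like this. 
 TimelinePaging and its methods are hypothetical names, and the sketch assumes 
 entity IDs are URL-safe so it does no encoding.

```java
import java.util.List;

/** Hypothetical client-side helper for fromid-based paging. */
final class TimelinePaging {
    /**
     * Builds the request URL for one page. The limit is pageSize + 1 so the
     * response also carries the first entity of the following page.
     */
    static String pageUrl(String base, String fromId, int pageSize) {
        StringBuilder sb = new StringBuilder(base)
            .append("?fields=events,primaryfilters,otherinfo");
        if (fromId != null) {
            sb.append("&fromid=").append(fromId); // inclusive start of the page
        }
        return sb.append("&limit=").append(pageSize + 1).toString();
    }

    /**
     * Returns the fromid for the next page, or null when the response held
     * no entity beyond the current page.
     */
    static String nextFromId(List<String> returnedIds, int pageSize) {
        return returnedIds.size() > pageSize ? returnedIds.get(pageSize) : null;
    }
}
```

 With a page size of 10, the first URL carries limit=11, and a response of 
 JID1 to JID11 yields JID11 as the next {{fromid}}, matching the example calls.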





[jira] [Commented] (YARN-1852) Application recovery throws InvalidStateTransitonException for FAILED and KILLED jobs

2014-03-24 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945464#comment-13945464
 ] 

Jian He commented on YARN-1852:
---

LGTM, +1

 Application recovery throws InvalidStateTransitonException for FAILED and 
 KILLED jobs
 -

 Key: YARN-1852
 URL: https://issues.apache.org/jira/browse/YARN-1852
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.3.0, 2.4.0
Reporter: Rohith
Assignee: Rohith
 Attachments: YARN-1852.2.patch, YARN-1852.3.patch, YARN-1852.patch




[jira] [Commented] (YARN-556) RM Restart phase 2 - Work preserving restart

2014-03-24 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945480#comment-13945480
 ] 

Karthik Kambatla commented on YARN-556:
---

bq. Please align with the design doc while prototyping. If the design needs 
changes then please update the document. The sub-tasks need to follow the 
design doc so that other folks can follow even if they are not writing the code.
Yes, that is the idea. The prototype should be mostly ready by the end of the 
week. I will update the document with any minor changes we find are required, 
along with the prototype. 

bq. The scheduler changes are the most complex piece. But they can come in the 
end.
Without the scheduler changes, I am concerned the remaining patches would only 
break things. The alternative is to have a config to enable work-preserving 
restart and guard all changes by that config. I am not yet fully convinced of 
this approach: would we want to leave this config in place even after the 
feature is complete? 

 RM Restart phase 2 - Work preserving restart
 

 Key: YARN-556
 URL: https://issues.apache.org/jira/browse/YARN-556
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: Work Preserving RM Restart.pdf




[jira] [Commented] (YARN-1838) Timeline service getEntities API should provide ability to get entities from given id

2014-03-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945478#comment-13945478
 ] 

Hudson commented on YARN-1838:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5387 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5387/])
YARN-1838. Enhanced timeline service getEntities API to get entities from a 
given entity ID or insertion timestamp. Contributed by Billie Rinaldi. (zjshen: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1580960)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/timeline/GenericObjectMapper.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/timeline/LeveldbTimelineStore.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/timeline/MemoryTimelineStore.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/timeline/TimelineReader.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TimelineWebServices.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/timeline/TestLeveldbTimelineStore.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/timeline/TestMemoryTimelineStore.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/timeline/TimelineStoreTestUtils.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestTimelineWebServices.java


 Timeline service getEntities API should provide ability to get entities from 
 given id
 -

 Key: YARN-1838
 URL: https://issues.apache.org/jira/browse/YARN-1838
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Srimanth Gunturi
Assignee: Billie Rinaldi
 Fix For: 2.4.0

 Attachments: YARN-1838.1.patch, YARN-1838.2.patch, YARN-1838.3.patch, 
 YARN-1838.4.patch, YARN-1838.5.patch


 To support pagination, we need the ability to get entities from a certain ID by 
 providing a new param called {{fromid}}.
 For example, on a page of 10 jobs, our first call will be like
 [http://server:8188/ws/v1/timeline/HIVE_QUERY_ID?fields=events,primaryfilters,otherinfo&limit=11]
 When the user hits next, we would like to call
 [http://server:8188/ws/v1/timeline/HIVE_QUERY_ID?fields=events,primaryfilters,otherinfo&fromid=JID11&limit=11]
 and continue on for further _Next_ clicks.
 On hitting back, we will make similar calls for previous items
 [http://server:8188/ws/v1/timeline/HIVE_QUERY_ID?fields=events,primaryfilters,otherinfo&fromid=JID1&limit=11]
 {{fromid}} should be inclusive of the id given.
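
A minimal client-side sketch of the paging scheme described above, in Python. It assumes (this is an inference from the example URLs, not something the issue states) that the client asks for one entity more than the page size and uses that extra entity's ID as the inclusive {{fromid}} of the next page. The helper names page_url and split_page are hypothetical.

```python
from urllib.parse import urlencode

# Host, port and entity type are taken from the example URLs in the issue.
BASE = "http://server:8188/ws/v1/timeline/HIVE_QUERY_ID"
FIELDS = "events,primaryfilters,otherinfo"


def page_url(page_size, from_id=None):
    """Build a getEntities URL that requests page_size + 1 entities."""
    params = [("fields", FIELDS)]
    if from_id is not None:
        params.append(("fromid", from_id))  # inclusive start of the page
    params.append(("limit", page_size + 1))
    # safe="," keeps the comma-separated field list readable in the URL
    return BASE + "?" + urlencode(params, safe=",")


def split_page(entity_ids, page_size):
    """Split a response into (ids to display, fromid for the next page)."""
    shown = entity_ids[:page_size]
    # The extra entity, if present, is the first one of the next page.
    next_from = entity_ids[page_size] if len(entity_ids) > page_size else None
    return shown, next_from
```

With a page size of 10, page_url(10) yields the first URL of the description and page_url(10, "JID11") yields the second one.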





[jira] [Commented] (YARN-1670) aggregated log writer can write more log data then it says is the log length

2014-03-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945479#comment-13945479
 ] 

Hudson commented on YARN-1670:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5387 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5387/])
YARN-1670. aggregated log writer can write more log data then it says is the 
log length (Mit Desai via jeagles) (jeagles: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1580957)
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogFormat.java


 aggregated log writer can write more log data then it says is the log length
 

 Key: YARN-1670
 URL: https://issues.apache.org/jira/browse/YARN-1670
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.10, 2.2.0
Reporter: Thomas Graves
Assignee: Mit Desai
Priority: Critical
 Fix For: 3.0.0, 0.23.11, 2.4.0, 2.5.0

 Attachments: YARN-1670-b23.patch, YARN-1670-v2-b23.patch, 
 YARN-1670-v2.patch, YARN-1670-v3-b23.patch, YARN-1670-v3.patch, 
 YARN-1670-v4-b23.patch, YARN-1670-v4-b23.patch, YARN-1670-v4.patch, 
 YARN-1670-v4.patch, YARN-1670.patch, YARN-1670.patch


 We have seen exceptions when using 'yarn logs' to read log files. 
 at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
 at java.lang.Long.parseLong(Long.java:441)
 at java.lang.Long.parseLong(Long.java:483)
 at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readAContainerLogsForALogType(AggregatedLogFormat.java:518)
 at org.apache.hadoop.yarn.logaggregation.LogDumper.dumpAContainerLogs(LogDumper.java:178)
 at org.apache.hadoop.yarn.logaggregation.LogDumper.run(LogDumper.java:130)
 at org.apache.hadoop.yarn.logaggregation.LogDumper.main(LogDumper.java:246)
 We traced it down to the reader trying to read the file type of the next file 
 but where it reads is still log data from the previous file.  What happened 
 was the Log Length was written as a certain size but the log data was 
 actually longer than that.  
 Inside of the write() routine in LogValue it first writes what the logfile 
 length is, but then when it goes to write the log itself it just goes to the 
 end of the file.  There is a race condition here where if someone is still 
 writing to the file when it goes to be aggregated the length written could be 
 too small.
 We should have the write() routine stop when it writes whatever it said was 
 the length.  It would be nice if we could somehow tell the user it might be 
 truncated but I'm not sure of a good way to do this.
 We also noticed a bug in readAContainerLogsForALogType where it is using 
 an int for curRead whereas it should be using a long. 
   while (len != -1 && curRead < fileLength) {
 This isn't actually a problem right now as it looks like the underlying 
 decoder is doing the right thing and the len condition exits.
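
The proposed fix (stop writing once the declared length has been written) can be sketched as follows. This is an illustration in Python, not the AggregatedLogFormat code: the record layout (an ASCII length line followed by the data) and the name write_log_entry are assumptions made for the example.

```python
import io


def write_log_entry(out, log_file, declared_length, chunk_size=65536):
    """Write a length header, then copy at most declared_length bytes.

    Returns the number of data bytes actually copied.
    """
    out.write(str(declared_length).encode("ascii") + b"\n")
    # Python ints do not overflow; a Java version would need a long here.
    remaining = declared_length
    while remaining > 0:
        chunk = log_file.read(min(chunk_size, remaining))
        if not chunk:
            break  # the file turned out shorter than the declared length
        out.write(chunk)
        remaining -= len(chunk)
    return declared_length - remaining


# A file that grew after its length (10 bytes) was recorded.
source = io.BytesIO(b"0123456789" + b"appended later")
sink = io.BytesIO()
copied = write_log_entry(sink, source, declared_length=10)
```

Bytes appended after the length was recorded are left out, so a reader that trusts the header lands on the next record. The sketch does not address the opposite case (a file shorter than declared) or how to tell the user about truncation, which the description leaves open.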





[jira] [Commented] (YARN-1852) Application recovery throws InvalidStateTransitonException for FAILED and KILLED jobs

2014-03-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945557#comment-13945557
 ] 

Hudson commented on YARN-1852:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5390 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5390/])
YARN-1852. Fixed RMAppAttempt to not resend AttemptFailed/AttemptKilled events 
to already recovered Failed/Killed RMApps. Contributed by Rohith Sharmaks 
(jianhe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1580997)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java


 Application recovery throws InvalidStateTransitonException for FAILED and 
 KILLED jobs
 -

 Key: YARN-1852
 URL: https://issues.apache.org/jira/browse/YARN-1852
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.3.0, 2.4.0
Reporter: Rohith
Assignee: Rohith
 Fix For: 2.4.0

 Attachments: YARN-1852.2.patch, YARN-1852.3.patch, YARN-1852.patch


 Recovering a failed/killed application throws InvalidStateTransitonException.
 The exceptions are logged during application recovery.





[jira] [Commented] (YARN-556) RM Restart phase 2 - Work preserving restart

2014-03-24 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945587#comment-13945587
 ] 

Vinod Kumar Vavilapalli commented on YARN-556:
--

I don't see the value of a prototype given we have a mostly concrete design. 
It's fine to do it, but let's make sure we are not taking shortcuts in the 
interest of getting a quick & dirty version up.

 RM Restart phase 2 - Work preserving restart
 

 Key: YARN-556
 URL: https://issues.apache.org/jira/browse/YARN-556
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: Work Preserving RM Restart.pdf


 YARN-128 covered storing the state needed for the RM to recover critical 
 information. This umbrella jira will track changes needed to recover the 
 running state of the cluster so that work can be preserved across RM restarts.





[jira] [Updated] (YARN-596) In fair scheduler, intra-application container priorities affect inter-application preemption decisions

2014-03-24 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-596:
-

Attachment: YARN-596.patch

[~sandyr], [~kkambatl], could you take a look at this patch?

The preemption rules right now:
For FSQueue: select the child candidate queue/app in reverse of its scheduling 
policy (fair, drf, or fifo).
For AppSchedulable: select the candidate container in decreasing order of 
priority.

And I moved all preemption-related code from FairScheduler.java to a separate 
file FSPreemption.java.
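
The two rules above can be sketched as a recursive selection. This is a Python illustration under stated assumptions, not the code in the patch: the classes are hypothetical, only a "fair"-like policy is modelled (children ordered by usage relative to fair share, so its reverse visits the most over-share child first), fair shares are assumed positive, and "decreasing order of priority" is read as descending numeric priority value (the thread does not say whether a larger number means more or less important).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Container:
    name: str
    priority: int


@dataclass
class App:
    name: str
    usage: float
    fair_share: float
    containers: List[Container] = field(default_factory=list)


@dataclass
class Queue:
    name: str
    usage: float
    fair_share: float
    children: list = field(default_factory=list)  # child Queues or Apps


def over_share_ratio(node):
    """How far a queue or app is above its fair share (assumed > 0)."""
    return node.usage / node.fair_share


def pick_container_to_preempt(node) -> Optional[Container]:
    if isinstance(node, App):
        # Leaf: take the container with the largest priority value.
        return max(node.containers, key=lambda c: c.priority, default=None)
    # Inner queue: walk the children in reverse of the scheduling order,
    # i.e. the child that is most over its fair share comes first.
    for child in sorted(node.children, key=over_share_ratio, reverse=True):
        victim = pick_container_to_preempt(child)
        if victim is not None:
            return victim
    return None
```

Because the choice is made per level, a container's priority only competes with other containers of the same application, which is the behaviour the issue title asks for.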

 In fair scheduler, intra-application container priorities affect 
 inter-application preemption decisions
 ---

 Key: YARN-596
 URL: https://issues.apache.org/jira/browse/YARN-596
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-596.patch


 In the fair scheduler, containers are chosen for preemption in the following 
 way:
 All containers for all apps that are in queues that are over their fair share 
 are put in a list.
 The list is sorted in order of the priority that the container was requested 
 in.
 This means that an application can shield itself from preemption by 
 requesting its containers at higher priorities, which doesn't really make 
 sense.
 Also, an application that is not over its fair share, but that is in a queue 
 that is over its fair share is just as likely to have containers preempted 
 as an application that is over its fair share.





[jira] [Commented] (YARN-596) In fair scheduler, intra-application container priorities affect inter-application preemption decisions

2014-03-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945609#comment-13945609
 ] 

Hadoop QA commented on YARN-596:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12636408/YARN-596.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3443//console

This message is automatically generated.

 In fair scheduler, intra-application container priorities affect 
 inter-application preemption decisions
 ---

 Key: YARN-596
 URL: https://issues.apache.org/jira/browse/YARN-596
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-596.patch


 In the fair scheduler, containers are chosen for preemption in the following 
 way:
 All containers for all apps that are in queues that are over their fair share 
 are put in a list.
 The list is sorted in order of the priority that the container was requested 
 in.
 This means that an application can shield itself from preemption by 
 requesting its containers at higher priorities, which doesn't really make 
 sense.
 Also, an application that is not over its fair share, but that is in a queue 
 that is over its fair share is just as likely to have containers preempted 
 as an application that is over its fair share.





[jira] [Created] (YARN-1866) YARN RM fails to load state store with delegation token parsing error

2014-03-24 Thread Arpit Gupta (JIRA)
Arpit Gupta created YARN-1866:
-

 Summary: YARN RM fails to load state store with delegation token 
parsing error
 Key: YARN-1866
 URL: https://issues.apache.org/jira/browse/YARN-1866
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Priority: Critical


In our secure Nightlies we saw exceptions in the RM log where it failed to 
parse the delegation token.





[jira] [Commented] (YARN-1866) YARN RM fails to load state store with delegation token parsing error

2014-03-24 Thread Arpit Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945613#comment-13945613
 ] 

Arpit Gupta commented on YARN-1866:
---

{code}
interface org.apache.hadoop.ha.HAServiceProtocol
2014-03-23 17:14:49,300 INFO  client.ConfiguredRMFailoverProxyProvider (ConfiguredRMFailoverProxyProvider.java:performFailover(100)) - Failing over to rm1
2014-03-23 17:14:49,303 WARN  retry.RetryInvocationHandler (RetryInvocationHandler.java:invoke(119)) - Exception while invoking class org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.renewDelegationToken over rm1. Not retrying because failovers (30) exceeded maximum allowed (30)
java.net.ConnectException: Call From host/ip to host:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.GeneratedConstructorAccessor19.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
    at org.apache.hadoop.ipc.Client.call(Client.java:1414)
    at org.apache.hadoop.ipc.Client.call(Client.java:1363)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at $Proxy11.renewDelegationToken(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.renewDelegationToken(ApplicationClientProtocolPBClientImpl.java:297)
    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
    at $Proxy12.renewDelegationToken(Unknown Source)
    at org.apache.hadoop.yarn.security.client.RMDelegationTokenIdentifier$Renewer.renew(RMDelegationTokenIdentifier.java:107)
    at org.apache.hadoop.security.token.Token.renew(Token.java:377)
    at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:466)
    at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:463)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.renewToken(DelegationTokenRenewer.java:462)
    at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:384)
    at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.addApplicationSync(DelegationTokenRenewer.java:346)
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:326)
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:425)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1018)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:481)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:830)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:870)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:867)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:867)
    at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:265)
    at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:116)
    at 

[jira] [Updated] (YARN-596) In fair scheduler, intra-application container priorities affect inter-application preemption decisions

2014-03-24 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-596:
-

Attachment: YARN-596.patch

 In fair scheduler, intra-application container priorities affect 
 inter-application preemption decisions
 ---

 Key: YARN-596
 URL: https://issues.apache.org/jira/browse/YARN-596
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-596.patch, YARN-596.patch


 In the fair scheduler, containers are chosen for preemption in the following 
 way:
 All containers for all apps that are in queues that are over their fair share 
 are put in a list.
 The list is sorted in order of the priority that the container was requested 
 in.
 This means that an application can shield itself from preemption by 
 requesting its containers at higher priorities, which doesn't really make 
 sense.
 Also, an application that is not over its fair share, but that is in a queue 
 that is over its fair share is just as likely to have containers preempted 
 as an application that is over its fair share.





[jira] [Updated] (YARN-1865) ShellScriptBuilder does not check for some error conditions

2014-03-24 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated YARN-1865:


Assignee: Remus Rusanu

 ShellScriptBuilder does not check for some error conditions
 ---

 Key: YARN-1865
 URL: https://issues.apache.org/jira/browse/YARN-1865
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0, 2.2.0, 2.3.0
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor
 Attachments: YARN-1865.1.patch


 The WindowsShellScriptBuilder does not check for commands exceeding the Windows 
 maximum shell command line length (8191 chars).
 Neither the Unix nor the Windows script builder checks for error conditions 
 after mkdir or link.
 WindowsShellScriptBuilder mkdir is not safe with regard to paths containing 
 spaces.
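
The three checks listed above can be illustrated with a small script-line generator. This is a Python sketch, not the ShellScriptBuilder code or the attached patch: the function names are hypothetical, and the batch idioms used (quoting the path, testing %errorlevel% after the command) are one plausible way to address the points, assumed for the example.

```python
# Limit quoted in the issue description.
WINDOWS_MAX_COMMAND_LENGTH = 8191


def checked_line(command):
    """Reject a script line longer than the Windows command line limit."""
    if len(command) > WINDOWS_MAX_COMMAND_LENGTH:
        raise ValueError(
            "command is %d chars, limit is %d"
            % (len(command), WINDOWS_MAX_COMMAND_LENGTH)
        )
    return command


def mkdir_lines(path):
    """Emit a space-safe mkdir followed by an error check."""
    quoted = '"' + path + '"'  # quoting keeps paths with spaces in one argument
    return [
        checked_line(f"@if not exist {quoted} mkdir {quoted}"),
        # Abort the script with the failing exit code if mkdir did not succeed.
        checked_line("@if %errorlevel% neq 0 exit /b %errorlevel%"),
    ]
```

For example, mkdir_lines(r"C:\tmp\my dir") produces a quoted mkdir line followed by the errorlevel check, while a line of 8192 characters raises ValueError at build time instead of failing later when the script runs.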





[jira] [Commented] (YARN-556) RM Restart phase 2 - Work preserving restart

2014-03-24 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945650#comment-13945650
 ] 

Karthik Kambatla commented on YARN-556:
---

We think the prototype would be a validation of the design. Individual 
sub-tasks will go through the same rigor of unit tests and code review. It 
would help to add further details to the design or evaluate any minor changes 
required before committing the sub-tasks. 

 RM Restart phase 2 - Work preserving restart
 

 Key: YARN-556
 URL: https://issues.apache.org/jira/browse/YARN-556
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: Work Preserving RM Restart.pdf


 YARN-128 covered storing the state needed for the RM to recover critical 
 information. This umbrella jira will track changes needed to recover the 
 running state of the cluster so that work can be preserved across RM restarts.





[jira] [Commented] (YARN-1867) NPE while fetching apps via the REST API

2014-03-24 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945662#comment-13945662
 ] 

Karthik Kambatla commented on YARN-1867:


It appears the RM doesn't have the ACLs corresponding to an application from 
{{RMContext#getRMApps()}}. We couldn't reproduce this. The cluster had RM 
failover with app recovery. 

Couldn't identify the source of this through visual inspection. This can happen 
when the RM goes down (or is restarted, or fails over) between adding the app and 
adding the ACLs for the app. If that were the case, I see the following 
solutions:
# If ACLs are not available, use the default value. 
# Reverse the order of adding an app to ACLsManager and RMContext. 
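
The first option (fall back to a default when an application's ACLs are missing) might look like the sketch below. It is written in Python for brevity and is not the ApplicationACLsManager code: the class, its method names and the choice of an empty default ACL (owner and admins only) are assumptions, and which default is appropriate is a policy question the comment leaves open.

```python
# Hypothetical default: no extra users beyond the owner and the admins.
DEFAULT_ACL = frozenset()


class AclsManager:
    def __init__(self, admins=()):
        self._acls = {}  # app id -> {access type -> set of users}
        self._admins = set(admins)

    def add_application(self, app_id, acls):
        self._acls[app_id] = acls

    def check_access(self, user, access_type, owner, app_id):
        if user == owner or user in self._admins:
            return True
        # The app may be known to the RM before its ACLs were registered;
        # treat the missing entry as the default instead of dereferencing it.
        app_acls = self._acls.get(app_id)
        if app_acls is None:
            allowed = DEFAULT_ACL
        else:
            allowed = app_acls.get(access_type, DEFAULT_ACL)
        return user in allowed
```

The second option needs no lookup change: registering the ACLs before the app becomes visible through {{RMContext#getRMApps()}} would close the window in which the entry can be missing.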

 NPE while fetching apps via the REST API
 

 Key: YARN-1867
 URL: https://issues.apache.org/jira/browse/YARN-1867
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Blocker
  Labels: rest_api

 We ran into the following NPE when fetching applications using the REST API:
 {noformat}
 INTERNAL_SERVER_ERROR
 java.lang.NullPointerException
 at org.apache.hadoop.yarn.server.security.ApplicationACLsManager.checkAccess(ApplicationACLsManager.java:104)
 at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices.hasAccess(RMWebServices.java:123)
 at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices.getApps(RMWebServices.java:418)
 {noformat}





[jira] [Updated] (YARN-1854) TestRMHA#testStartAndTransitions Fails

2014-03-24 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai updated YARN-1854:


Attachment: Log.rtf

[~rohithsharma], [~vinodkv], I have added the logs for the failure that I 
mentioned before. I found this failure in our nightly builds

 TestRMHA#testStartAndTransitions Fails
 --

 Key: YARN-1854
 URL: https://issues.apache.org/jira/browse/YARN-1854
 Project: Hadoop YARN
  Issue Type: Test
Affects Versions: 2.4.0
Reporter: Mit Desai
Assignee: Rohith
Priority: Blocker
 Fix For: 2.4.0

 Attachments: Log.rtf, YARN-1854.1.patch, YARN-1854.patch


 {noformat}
 testStartAndTransitions(org.apache.hadoop.yarn.server.resourcemanager.TestRMHA)  Time elapsed: 5.883 sec  <<< FAILURE!
 java.lang.AssertionError: Incorrect value for metric availableMB expected:<2048> but was:<4096>
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.failNotEquals(Assert.java:647)
   at org.junit.Assert.assertEquals(Assert.java:128)
   at org.junit.Assert.assertEquals(Assert.java:472)
   at org.apache.hadoop.yarn.server.resourcemanager.TestRMHA.assertMetric(TestRMHA.java:396)
   at org.apache.hadoop.yarn.server.resourcemanager.TestRMHA.verifyClusterMetrics(TestRMHA.java:387)
   at org.apache.hadoop.yarn.server.resourcemanager.TestRMHA.testStartAndTransitions(TestRMHA.java:160)
 Results :
 Failed tests: 
   TestRMHA.testStartAndTransitions:160->verifyClusterMetrics:387->assertMetric:396 Incorrect value for metric availableMB expected:<2048> but was:<4096>
 {noformat}





[jira] [Commented] (YARN-596) In fair scheduler, intra-application container priorities affect inter-application preemption decisions

2014-03-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945695#comment-13945695
 ] 

Hadoop QA commented on YARN-596:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12636418/YARN-596.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3444//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/3444//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3444//console

This message is automatically generated.

 In fair scheduler, intra-application container priorities affect 
 inter-application preemption decisions
 ---

 Key: YARN-596
 URL: https://issues.apache.org/jira/browse/YARN-596
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-596.patch, YARN-596.patch


 In the fair scheduler, containers are chosen for preemption in the following 
 way:
 All containers for all apps that are in queues that are over their fair share 
 are put in a list.
 The list is sorted in order of the priority that the container was requested 
 in.
 This means that an application can shield itself from preemption by 
 requesting its containers at higher priorities, which doesn't really make 
 sense.
 Also, an application that is not over its fair share, but that is in a queue 
 that is over its fair share is just as likely to have containers preempted 
 as an application that is over its fair share.





[jira] [Updated] (YARN-1868) YARN status web ui does not show correctly in IE 11

2014-03-24 Thread Chuan Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chuan Liu updated YARN-1868:


Attachment: YARN_status.png

Attach a screenshot.

 YARN status web ui does not show correctly in IE 11
 ---

 Key: YARN-1868
 URL: https://issues.apache.org/jira/browse/YARN-1868
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Chuan Liu
 Attachments: YARN_status.png


 The YARN status web ui does not show correctly in IE 11. The drop down menu 
 for app entries is not shown. Also the navigation menu displays incorrectly.





[jira] [Created] (YARN-1868) YARN status web ui does not show correctly in IE 11

2014-03-24 Thread Chuan Liu (JIRA)
Chuan Liu created YARN-1868:
---

 Summary: YARN status web ui does not show correctly in IE 11
 Key: YARN-1868
 URL: https://issues.apache.org/jira/browse/YARN-1868
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Chuan Liu


The YARN status web ui does not show correctly in IE 11. The drop down menu for 
app entries is not shown. Also the navigation menu displays incorrectly.





[jira] [Commented] (YARN-1865) ShellScriptBuilder does not check for some error conditions

2014-03-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945707#comment-13945707
 ] 

Hadoop QA commented on YARN-1865:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12636335/YARN-1865.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

  
org.apache.hadoop.yarn.server.nodemanager.TestContainerExecutor

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3445//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3445//console

This message is automatically generated.

 ShellScriptBuilder does not check for some error conditions
 ---

 Key: YARN-1865
 URL: https://issues.apache.org/jira/browse/YARN-1865
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0, 2.2.0, 2.3.0
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor
 Attachments: YARN-1865.1.patch


 The WindowsShellScriptBuilder does not check for commands exceeding the Windows 
 maximum shell command line length (8191 chars).
 Neither the Unix nor the Windows script builder checks for an error condition 
 after mkdir or link.
 WindowsShellScriptBuilder mkdir is not safe with regard to paths containing 
 spaces.
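
The three gaps listed above could be guarded along the following lines. This is only a minimal sketch, not the attached patch: the class WindowsScriptChecks and its methods are hypothetical names, only the 8191-character limit comes from the description, and the generated batch lines assume that cmd.exe sets a non-zero errorlevel when mkdir fails.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of Hadoop's ShellScriptBuilder. */
final class WindowsScriptChecks {
    /** Maximum Windows shell command line length, as cited in the issue. */
    static final int WINDOWS_MAX_SHELL_LENGTH = 8191;

    /** Rejects a generated script line that exceeds the Windows limit. */
    static void checkLineLength(String line) throws IOException {
        if (line.length() > WINDOWS_MAX_SHELL_LENGTH) {
            throw new IOException("Command line length " + line.length()
                + " exceeds the Windows maximum of " + WINDOWS_MAX_SHELL_LENGTH);
        }
    }

    /** Emits a mkdir line with the path quoted, followed by an errorlevel check. */
    static List<String> mkdirLines(String path) throws IOException {
        String mkdir = "@mkdir \"" + path + "\"";
        String check = "@if %errorlevel% neq 0 exit /b %errorlevel%";
        checkLineLength(mkdir);
        return Arrays.asList(mkdir, check);
    }

    public static void main(String[] args) throws IOException {
        // The path contains a space on purpose; the quotes keep it one argument.
        for (String line : mkdirLines("C:\\tmp\\my dir")) {
            System.out.println(line);
        }
    }
}
```

Failing at script-generation time keeps an over-long command from reaching the shell at all, and the explicit errorlevel line stops the script at the first failed mkdir.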





[jira] [Updated] (YARN-596) In fair scheduler, intra-application container priorities affect inter-application preemption decisions

2014-03-24 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-596:
-

Attachment: YARN-596.patch

 In fair scheduler, intra-application container priorities affect 
 inter-application preemption decisions
 ---

 Key: YARN-596
 URL: https://issues.apache.org/jira/browse/YARN-596
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-596.patch, YARN-596.patch, YARN-596.patch


 In the fair scheduler, containers are chosen for preemption in the following 
 way:
 All containers for all apps that are in queues that are over their fair share 
 are put in a list.
 The list is sorted in order of the priority that the container was requested 
 in.
 This means that an application can shield itself from preemption by 
 requesting its containers at higher priorities, which doesn't really make 
 sense.
 Also, an application that is not over its fair share, but that is in a queue 
 that is over its fair share is just as likely to have containers preempted 
 as an application that is over its fair share.
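
To make the shielding effect concrete, here is a small sketch of the ordering described above. It is not the FairScheduler code: Candidate, priorityRank and preemptionOrder are hypothetical, and the sketch assumes that a larger rank means the container was requested at a higher priority and that low-rank containers are preempted first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustration only; names are hypothetical and do not mirror FairScheduler. */
final class PreemptionOrderSketch {
    /** Stand-in for a running container that is a preemption candidate. */
    static final class Candidate {
        final String app;
        final int priorityRank; // assumption: larger rank = requested at higher priority

        Candidate(String app, int priorityRank) {
            this.app = app;
            this.priorityRank = priorityRank;
        }
    }

    /** Sorts candidates so that the lowest-priority requests come first. */
    static List<Candidate> preemptionOrder(List<Candidate> candidates) {
        List<Candidate> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingInt((Candidate c) -> c.priorityRank));
        return sorted;
    }

    public static void main(String[] args) {
        List<Candidate> all = new ArrayList<>();
        all.add(new Candidate("app-B", 9)); // app-B requests everything at high priority
        all.add(new Candidate("app-A", 1));
        all.add(new Candidate("app-B", 9));
        for (Candidate c : preemptionOrder(all)) {
            System.out.println(c.app + " rank=" + c.priorityRank);
        }
    }
}
```

With this ordering, app-B's containers sort behind app-A's simply because app-B asked for higher priorities, regardless of either application's fair share.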





[jira] [Commented] (YARN-1837) TestMoveApplication.testMoveRejectedByScheduler randomly fails

2014-03-24 Thread Mit Desai (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945730#comment-13945730
 ] 

Mit Desai commented on YARN-1837:
-

This test is also failing for us.

 TestMoveApplication.testMoveRejectedByScheduler randomly fails
 --

 Key: YARN-1837
 URL: https://issues.apache.org/jira/browse/YARN-1837
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Tsuyoshi OZAWA

 TestMoveApplication#testMoveRejectedByScheduler fails with a 
 NullPointerException. It appears to be caused by an unhandled exception on the 
 server side.





[jira] [Updated] (YARN-1865) ShellScriptBuilder does not check for some error conditions

2014-03-24 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated YARN-1865:
---

Attachment: YARN-1865.2.patch

Removed the bogus change in TestContainerExecutor and reverted the 
whitespace-only change in Shell.

 ShellScriptBuilder does not check for some error conditions
 ---

 Key: YARN-1865
 URL: https://issues.apache.org/jira/browse/YARN-1865
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0, 2.2.0, 2.3.0
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor
 Attachments: YARN-1865.1.patch, YARN-1865.2.patch


 The WindowsShellScriptBuilder does not check for commands exceeding the Windows 
 maximum shell command line length (8191 chars).
 Neither the Unix nor the Windows script builder checks for an error condition 
 after mkdir or link.
 WindowsShellScriptBuilder mkdir is not safe with regard to paths containing 
 spaces.





[jira] [Commented] (YARN-1521) Mark appropriate protocol methods with the idempotent annotation or AtMostOnce annotation

2014-03-24 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945736#comment-13945736
 ] 

Jian He commented on YARN-1521:
---

Posting an initial review:
- This doesn’t seem correct: a user may obtain multiple delegation tokens, but 
with this change a user can only get one token.
{code}
  // check whether the token exists or not.
  // If this token existed, recover the token with
  // DelegationTokenInformation
{code}
- Explicitly assert that the app does not exist in RMStateStore.
{code}
// After submission, the applicationState will
// not be saved in RMStateStore
{code}
- Explicitly assert that the app exists in RMContext after the 2nd RM becomes 
active.
{code}
// Submit Application
// After submission, the applicationState will be saved in RMStateStore.
{code}
- The bulk of the test-specific hacks in MiniYarnCluster can be moved to 
TestHAProtocol, as MiniYarnCluster is commonly used by others.

 Mark appropriate protocol methods with the idempotent annotation or 
 AtMostOnce annotation
 -

 Key: YARN-1521
 URL: https://issues.apache.org/jira/browse/YARN-1521
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-1521.0.patch


 After YARN-1028, automatic failover was added to RMProxy. This JIRA is to 
 identify whether we need to add the idempotent annotation and which methods 
 can be marked as idempotent.





[jira] [Commented] (YARN-1867) NPE while fetching apps via the REST API

2014-03-24 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945741#comment-13945741
 ] 

Vinod Kumar Vavilapalli commented on YARN-1867:
---

That doesn't sound right. When RM recovers, it just simply puts the app back 
into *both* the context and the acls manager. Can you debug more?

 NPE while fetching apps via the REST API
 

 Key: YARN-1867
 URL: https://issues.apache.org/jira/browse/YARN-1867
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Blocker
  Labels: rest_api

 We ran into the following NPE when fetching applications using the REST API:
 {noformat}
 INTERNAL_SERVER_ERROR
 java.lang.NullPointerException
 at org.apache.hadoop.yarn.server.security.ApplicationACLsManager.checkAccess(ApplicationACLsManager.java:104)
 at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices.hasAccess(RMWebServices.java:123)
 at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices.getApps(RMWebServices.java:418)
 {noformat}





[jira] [Updated] (YARN-1866) YARN RM fails to load state store with delegation token parsing error

2014-03-24 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-1866:
--

Priority: Blocker  (was: Critical)

This sounds like a blocker. Likely broken by the patch for YARN-1812.

 YARN RM fails to load state store with delegation token parsing error
 -

 Key: YARN-1866
 URL: https://issues.apache.org/jira/browse/YARN-1866
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Priority: Blocker

 In our secure nightlies we saw exceptions in the RM log where it failed to 
 parse the delegation token.





[jira] [Updated] (YARN-1810) YARN RM Webapp Application page Issue

2014-03-24 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-1810:
--

Component/s: webapp

 YARN RM Webapp Application page Issue
 -

 Key: YARN-1810
 URL: https://issues.apache.org/jira/browse/YARN-1810
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager, webapp
Affects Versions: 2.3.0
Reporter: Ethan Setnik
 Attachments: Screen Shot 2014-03-10 at 3.59.54 PM.png, Screen Shot 
 2014-03-11 at 1.40.12 PM.png


 When browsing the ResourceManager's web interface I am presented with the 
 attached screenshot.
 I can't understand why it does not show the applications, even though there 
 is no search text.  The application counts show the correct values relative 
 to the submissions, successes, and failures.
 Also see the text in the screenshot:
 Showing 0 to 0 of 0 entries (filtered from 19 total entries)





[jira] [Updated] (YARN-1868) YARN status web ui does not show correctly in IE 11

2014-03-24 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-1868:
--

Component/s: webapp

 YARN status web ui does not show correctly in IE 11
 ---

 Key: YARN-1868
 URL: https://issues.apache.org/jira/browse/YARN-1868
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Affects Versions: 3.0.0
Reporter: Chuan Liu
 Attachments: YARN_status.png


 The YARN status web UI does not display correctly in IE 11. The drop-down menu 
 for app entries is not shown. Also, the navigation menu displays incorrectly.





[jira] [Commented] (YARN-1868) YARN status web ui does not show correctly in IE 11

2014-03-24 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945772#comment-13945772
 ] 

Vinod Kumar Vavilapalli commented on YARN-1868:
---

Sounds related to YARN-1810.

 YARN status web ui does not show correctly in IE 11
 ---

 Key: YARN-1868
 URL: https://issues.apache.org/jira/browse/YARN-1868
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Affects Versions: 3.0.0
Reporter: Chuan Liu
 Attachments: YARN_status.png


 The YARN status web UI does not display correctly in IE 11. The drop-down menu 
 for app entries is not shown. Also, the navigation menu displays incorrectly.





[jira] [Commented] (YARN-1865) ShellScriptBuilder does not check for some error conditions

2014-03-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945787#comment-13945787
 ] 

Hadoop QA commented on YARN-1865:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12636441/YARN-1865.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3447//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3447//console

This message is automatically generated.

 ShellScriptBuilder does not check for some error conditions
 ---

 Key: YARN-1865
 URL: https://issues.apache.org/jira/browse/YARN-1865
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0, 2.2.0, 2.3.0
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor
 Attachments: YARN-1865.1.patch, YARN-1865.2.patch


 The WindowsShellScriptBuilder does not check for commands exceeding the Windows 
 maximum shell command line length (8191 chars).
 Neither the Unix nor the Windows script builder checks for an error condition 
 after mkdir or link.
 WindowsShellScriptBuilder mkdir is not safe with regard to paths containing 
 spaces.





[jira] [Commented] (YARN-596) In fair scheduler, intra-application container priorities affect inter-application preemption decisions

2014-03-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945792#comment-13945792
 ] 

Hadoop QA commented on YARN-596:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12636436/YARN-596.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3446//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3446//console

This message is automatically generated.

 In fair scheduler, intra-application container priorities affect 
 inter-application preemption decisions
 ---

 Key: YARN-596
 URL: https://issues.apache.org/jira/browse/YARN-596
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-596.patch, YARN-596.patch, YARN-596.patch


 In the fair scheduler, containers are chosen for preemption in the following 
 way:
 All containers for all apps that are in queues that are over their fair share 
 are put in a list.
 The list is sorted in order of the priority that the container was requested 
 in.
 This means that an application can shield itself from preemption by 
 requesting its containers at higher priorities, which doesn't really make 
 sense.
 Also, an application that is not over its fair share, but that is in a queue 
 that is over its fair share is just as likely to have containers preempted 
 as an application that is over its fair share.





[jira] [Assigned] (YARN-1866) YARN RM fails to load state store with delegation token parsing error

2014-03-24 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He reassigned YARN-1866:
-

Assignee: Jian He

 YARN RM fails to load state store with delegation token parsing error
 -

 Key: YARN-1866
 URL: https://issues.apache.org/jira/browse/YARN-1866
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker

 In our secure nightlies we saw exceptions in the RM log where it failed to 
 parse the delegation token.





[jira] [Created] (YARN-1869) Access to zkAcl should be synchronized in ZKRMStateStore#addStoreOrUpdateOps()

2014-03-24 Thread Ted Yu (JIRA)
Ted Yu created YARN-1869:


 Summary: Access to zkAcl should be synchronized in 
ZKRMStateStore#addStoreOrUpdateOps()
 Key: YARN-1869
 URL: https://issues.apache.org/jira/browse/YARN-1869
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Ted Yu
Priority: Minor


Here is the related code:
{code}
  } else {
    opList.add(Op.create(nodeCreatePath, tokenOs.toByteArray(), zkAcl,
        CreateMode.PERSISTENT));
  }
{code}
The other methods accessing zkAcl are synchronized.
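
For illustration, the locking discipline being asked for looks roughly like the sketch below. It assumes nothing about the real ZKRMStateStore beyond what is stated above: AclGuardSketch, its methods, and the use of plain strings in place of ZooKeeper ACL objects are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustration only; a stand-in for a store whose ACL field is guarded by "this". */
final class AclGuardSketch {
    // Guarded by "this": every read and write goes through a synchronized method.
    private List<String> zkAcl = Collections.emptyList();

    /** Writer: replaces the ACL list under the lock. */
    synchronized void setAcl(List<String> acl) {
        zkAcl = new ArrayList<>(acl);
    }

    /** Reader: builds an operation description while holding the same lock. */
    synchronized String createOp(String path) {
        return "create " + path + " acl=" + zkAcl;
    }

    public static void main(String[] args) {
        AclGuardSketch store = new AclGuardSketch();
        store.setAcl(Collections.singletonList("world:anyone"));
        System.out.println(store.createOp("/rmstore/token"));
    }
}
```

Because the reader and the writer synchronize on the same monitor, a reader can never observe the field while another thread is in the middle of updating it.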





[jira] [Created] (YARN-1870) FileInputStream is not closed in ProcfsBasedProcessTree#constructProcessSMAPInfo()

2014-03-24 Thread Ted Yu (JIRA)
Ted Yu created YARN-1870:


 Summary: FileInputStream is not closed in 
ProcfsBasedProcessTree#constructProcessSMAPInfo()
 Key: YARN-1870
 URL: https://issues.apache.org/jira/browse/YARN-1870
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Ted Yu
Priority: Minor


{code}
  List<String> lines = IOUtils.readLines(new FileInputStream(file));
{code}
FileInputStream is not closed.
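
One way to guarantee the stream is closed is a try-with-resources block. The sketch below is not the eventual patch: it assumes Java 7 or later, uses only the JDK instead of IOUtils, and the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Illustration only; not the code in ProcfsBasedProcessTree. */
final class ClosingReadLinesSketch {
    /** Reads all lines and closes the stream even if reading throws. */
    static List<String> readLines(String file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (InputStream in = new FileInputStream(file);
             BufferedReader reader = new BufferedReader(
                 new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: ClosingReadLinesSketch <file>");
            return;
        }
        for (String line : readLines(args[0])) {
            System.out.println(line);
        }
    }
}
```

On older Java versions the same effect needs an explicit finally block that closes the stream.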





[jira] [Commented] (YARN-1866) YARN RM fails to load state store with delegation token parsing error

2014-03-24 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945895#comment-13945895
 ] 

Jian He commented on YARN-1866:
---

This happens because RMDelegationTokenIdentifier#localServiceAddress is not yet 
set when the application renews its local tokens on recovery.
YARN-1107 had a similar issue. The test case didn't catch it because 
RMDelegationTokenIdentifier#localServiceAddress is a static field: once it is 
set in the first RM, the second RM also picks up the same value.

Uploaded a patch that also sets the localSecretManager and service address in 
DelegationTokenRenewer#serviceInit.
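
The masking effect of a static field in a single-JVM test can be reproduced with a few lines. This is a toy sketch under stated assumptions, not the Hadoop classes: Renewer, init, canRenewLocalTokens and the address "rm1:8032" are made up for illustration.

```java
/** Illustration only; class and method names are hypothetical, not Hadoop's. */
final class StaticFieldMaskingSketch {
    /** Stand-in for a component whose service address lives in a static field. */
    static final class Renewer {
        // Static, so the value is shared by every instance in the same JVM.
        static String localServiceAddress;

        /** Mirrors the initialization step that only the first RM performs. */
        void init(String address) {
            localServiceAddress = address;
        }

        /** Renewing local tokens requires the address to be set. */
        boolean canRenewLocalTokens() {
            return localServiceAddress != null;
        }
    }

    public static void main(String[] args) {
        Renewer firstRm = new Renewer();
        firstRm.init("rm1:8032");
        Renewer secondRm = new Renewer(); // never initialized
        // Prints true inside one JVM, which is why a single-JVM test passes.
        System.out.println(secondRm.canRenewLocalTokens());
    }
}
```

In a real restart the second RM runs in a fresh JVM, where the static field starts out null, so the missing initialization only shows up outside the test.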

 YARN RM fails to load state store with delegation token parsing error
 -

 Key: YARN-1866
 URL: https://issues.apache.org/jira/browse/YARN-1866
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker

 In our secure nightlies we saw exceptions in the RM log where it failed to 
 parse the delegation token.





[jira] [Updated] (YARN-1521) Mark appropriate protocol methods with the idempotent annotation or AtMostOnce annotation

2014-03-24 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1521:


Attachment: YARN-1521.1.patch

 Mark appropriate protocol methods with the idempotent annotation or 
 AtMostOnce annotation
 -

 Key: YARN-1521
 URL: https://issues.apache.org/jira/browse/YARN-1521
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-1521.0.patch, YARN-1521.1.patch


 After YARN-1028, automatic failover was added to RMProxy. This JIRA is to 
 identify whether we need to add the idempotent annotation and which methods 
 can be marked as idempotent.





[jira] [Updated] (YARN-1866) YARN RM fails to load state store with delegation token parsing error

2014-03-24 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-1866:
--

Attachment: YARN-1866.1.patch

The patch added a few missing test timeouts in TestRMRestart.

 YARN RM fails to load state store with delegation token parsing error
 -

 Key: YARN-1866
 URL: https://issues.apache.org/jira/browse/YARN-1866
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker
 Attachments: YARN-1866.1.patch


 In our secure nightlies we saw exceptions in the RM log where it failed to 
 parse the delegation token.





[jira] [Commented] (YARN-1521) Mark appropriate protocol methods with the idempotent annotation or AtMostOnce annotation

2014-03-24 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945905#comment-13945905
 ] 

Xuan Gong commented on YARN-1521:
-

Thanks for the review.

bq. The bulk of test-specific hack in MiniYarnCluster can be moved to 
TestHAProtocol, as MiniYarnCluster is commonly used by others.

Yes, you are right. Moved those changes from MiniYarnCluster to TestHAProtocol.

bq. Explicitly assert the app exists in RMContext after the 2nd RM comes 
Active./ Explicitly assert the App does not exist in RMStateStore or not.

DONE

bq. this doesn’t seem correct, user is possible to get multiple delegation 
tokens. Given this change, user can only get one token

Removed all changes from ClientRMService#getDelegationToken(). The method is 
simply re-entered if failover happens. 
[~jianhe] Please correct me if I am wrong.
getDelegationToken() is used to get a new token, which is not yet saved in 
ZooKeeper at that point. So if failover happens during the getDelegationToken() 
call, we cannot get the previously generated token back. Letting the method 
generate a new token should be fine.

 Mark appropriate protocol methods with the idempotent annotation or 
 AtMostOnce annotation
 -

 Key: YARN-1521
 URL: https://issues.apache.org/jira/browse/YARN-1521
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-1521.0.patch, YARN-1521.1.patch


 After YARN-1028, automatic failover was added to RMProxy. This JIRA is to 
 identify whether we need to add the idempotent annotation and which methods 
 can be marked as idempotent.





[jira] [Commented] (YARN-1521) Mark appropriate protocol methods with the idempotent annotation or AtMostOnce annotation

2014-03-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945964#comment-13945964
 ] 

Hadoop QA commented on YARN-1521:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12636467/YARN-1521.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3448//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3448//console

This message is automatically generated.

 Mark appropriate protocol methods with the idempotent annotation or 
 AtMostOnce annotation
 -

 Key: YARN-1521
 URL: https://issues.apache.org/jira/browse/YARN-1521
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-1521.0.patch, YARN-1521.1.patch


 After YARN-1028, automatic failover was added to RMProxy. This JIRA is to 
 identify whether we need to add the idempotent annotation and which methods 
 can be marked as idempotent.





[jira] [Commented] (YARN-1870) FileInputStream is not closed in ProcfsBasedProcessTree#constructProcessSMAPInfo()

2014-03-24 Thread Fengdong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945968#comment-13945968
 ] 

Fengdong Yu commented on YARN-1870:
---

Good catch [~yuzhih...@gmail.com], I've uploaded a simple patch to cover it.

 FileInputStream is not closed in 
 ProcfsBasedProcessTree#constructProcessSMAPInfo()
 --

 Key: YARN-1870
 URL: https://issues.apache.org/jira/browse/YARN-1870
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Ted Yu
Priority: Minor

 {code}
   List<String> lines = IOUtils.readLines(new FileInputStream(file));
 {code}
 FileInputStream is not closed.





[jira] [Updated] (YARN-1870) FileInputStream is not closed in ProcfsBasedProcessTree#constructProcessSMAPInfo()

2014-03-24 Thread Fengdong Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fengdong Yu updated YARN-1870:
--

Attachment: YARN-1870.patch

 FileInputStream is not closed in 
 ProcfsBasedProcessTree#constructProcessSMAPInfo()
 --

 Key: YARN-1870
 URL: https://issues.apache.org/jira/browse/YARN-1870
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Ted Yu
Priority: Minor
 Attachments: YARN-1870.patch


 {code}
   List<String> lines = IOUtils.readLines(new FileInputStream(file));
 {code}
 FileInputStream is not closed.





[jira] [Commented] (YARN-1870) FileInputStream is not closed in ProcfsBasedProcessTree#constructProcessSMAPInfo()

2014-03-24 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945991#comment-13945991
 ] 

Ted Yu commented on YARN-1870:
--

lgtm

 FileInputStream is not closed in 
 ProcfsBasedProcessTree#constructProcessSMAPInfo()
 --

 Key: YARN-1870
 URL: https://issues.apache.org/jira/browse/YARN-1870
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Ted Yu
Priority: Minor
 Attachments: YARN-1870.patch


 {code}
   List<String> lines = IOUtils.readLines(new FileInputStream(file));
 {code}
 FileInputStream is not closed.





[jira] [Commented] (YARN-1870) FileInputStream is not closed in ProcfsBasedProcessTree#constructProcessSMAPInfo()

2014-03-24 Thread Fengdong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946012#comment-13946012
 ] 

Fengdong Yu commented on YARN-1870:
---

[~yuzhih...@gmail.com], can you add me as YARN contributor?

 FileInputStream is not closed in 
 ProcfsBasedProcessTree#constructProcessSMAPInfo()
 --

 Key: YARN-1870
 URL: https://issues.apache.org/jira/browse/YARN-1870
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Ted Yu
Priority: Minor
 Attachments: YARN-1870.patch


 {code}
   List<String> lines = IOUtils.readLines(new FileInputStream(file));
 {code}
 FileInputStream is not closed.





[jira] [Commented] (YARN-1866) YARN RM fails to load state store with delegation token parsing error

2014-03-24 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946028#comment-13946028
 ] 

Vinod Kumar Vavilapalli commented on YARN-1866:
---

The changes look good to me. +1.

This part of the code keeps breaking despite all the tests. Here's hoping this 
is the last of the issues.

Will check this in if Jenkins says okay.

 YARN RM fails to load state store with delegation token parsing error
 -

 Key: YARN-1866
 URL: https://issues.apache.org/jira/browse/YARN-1866
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker
 Attachments: YARN-1866.1.patch


 In our secure nightlies we saw exceptions in the RM log where it failed to 
 parse the delegation token.





[jira] [Commented] (YARN-1866) YARN RM fails to load state store with delegation token parsing error

2014-03-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946046#comment-13946046
 ] 

Hadoop QA commented on YARN-1866:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12636470/YARN-1866.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer
  
org.apache.hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewerLifecycle

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3449//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3449//console

This message is automatically generated.

 YARN RM fails to load state store with delegation token parsing error
 -

 Key: YARN-1866
 URL: https://issues.apache.org/jira/browse/YARN-1866
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker
 Attachments: YARN-1866.1.patch


 In our secure Nightlies we saw exceptions in the RM log where it failed to 
 parse the delegation token.





[jira] [Commented] (YARN-1868) YARN status web ui does not show correctly in IE 11

2014-03-24 Thread Chuan Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946049#comment-13946049
 ] 

Chuan Liu commented on YARN-1868:
-

From the screenshots, the rendering in IE 11 seems much worse than what 
YARN-1810 describes. The page did show correctly on IE 9 and on Chrome on Windows.

 YARN status web ui does not show correctly in IE 11
 ---

 Key: YARN-1868
 URL: https://issues.apache.org/jira/browse/YARN-1868
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Affects Versions: 3.0.0
Reporter: Chuan Liu
 Attachments: YARN_status.png


 The YARN status web UI does not display correctly in IE 11. The drop-down menu 
 for app entries is not shown. Also, the navigation menu displays incorrectly.





[jira] [Updated] (YARN-1866) YARN RM fails to load state store with delegation token parsing error

2014-03-24 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-1866:
--

Attachment: YARN-1866.2.patch

 YARN RM fails to load state store with delegation token parsing error
 -

 Key: YARN-1866
 URL: https://issues.apache.org/jira/browse/YARN-1866
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker
 Attachments: YARN-1866.1.patch, YARN-1866.2.patch


 In our secure Nightlies we saw exceptions in the RM log where it failed to 
 parse the delegation token.





[jira] [Commented] (YARN-1866) YARN RM fails to load state store with delegation token parsing error

2014-03-24 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946092#comment-13946092
 ] 

Jian He commented on YARN-1866:
---

Fixed the test failures.
The failures occurred because rmContext is a mock object in the tests, so 
rmContext#getClientRMService()#getBindAddress() throws an NPE.
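
To illustrate the failure mode described above, here is a minimal sketch in plain Java. The Context and ClientService interfaces are hypothetical stand-ins for the RM types named in the comment, not the real Hadoop classes, and the hand-rolled "mock" only imitates the usual behaviour of a mock whose getters return null until they are stubbed. It is not taken from the patch.

```java
import java.net.InetSocketAddress;

// Hypothetical stand-ins for the RM types named in the comment (not Hadoop classes).
interface ClientService {
    InetSocketAddress getBindAddress();
}

interface Context {
    ClientService getClientRMService();
}

final class MockContextSketch {
    // Imitates an unstubbed mock: the getter returns null.
    static Context unstubbedContext() {
        return () -> null;
    }

    // Imitates a mock whose client service has been stubbed with a fixed address.
    static Context stubbedContext(InetSocketAddress address) {
        ClientService service = () -> address;
        return () -> service;
    }

    // Same chained call as in the comment; throws NPE on an unstubbed context.
    static InetSocketAddress bindAddress(Context context) {
        return context.getClientRMService().getBindAddress();
    }

    public static void main(String[] args) {
        try {
            bindAddress(unstubbedContext());
        } catch (NullPointerException e) {
            System.out.println("unstubbed context: NullPointerException");
        }
        // Host name and port are made up for the example.
        InetSocketAddress address =
            InetSocketAddress.createUnresolved("rm.example.com", 8032);
        System.out.println("stubbed context: " + bindAddress(stubbedContext(address)));
    }
}
```

Either stubbing the intermediate getter in the test or guarding the chained call in production code avoids the exception; which of the two the patch uses is not stated here.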

 YARN RM fails to load state store with delegation token parsing error
 -

 Key: YARN-1866
 URL: https://issues.apache.org/jira/browse/YARN-1866
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker
 Attachments: YARN-1866.1.patch, YARN-1866.2.patch


 In our secure Nightlies we saw exceptions in the RM log where it failed to 
 parse the delegation token.





[jira] [Commented] (YARN-1867) NPE while fetching apps via the REST API

2014-03-24 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946096#comment-13946096
 ] 

Vinod Kumar Vavilapalli commented on YARN-1867:
---

Okay, I think I know the issue. Working on a patch.

 NPE while fetching apps via the REST API
 

 Key: YARN-1867
 URL: https://issues.apache.org/jira/browse/YARN-1867
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Blocker
  Labels: rest_api

 We ran into the following NPE when fetching applications using the REST API:
 {noformat}
 INTERNAL_SERVER_ERROR
 java.lang.NullPointerException
 at org.apache.hadoop.yarn.server.security.ApplicationACLsManager.checkAccess(ApplicationACLsManager.java:104)
 at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices.hasAccess(RMWebServices.java:123)
 at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices.getApps(RMWebServices.java:418)
 {noformat}
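
The trace does not say which reference was null. One common shape for this kind of failure, shown here purely as an assumption, is a nested-map lookup for an application whose ACLs were never registered. The class below is a hypothetical sketch and not the ApplicationACLsManager code; all names in it are made up for illustration.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

final class AclLookupSketch {
    enum AccessType { VIEW_APP, MODIFY_APP }

    // Hypothetical layout: application id -> (access type -> permitted user).
    private final Map<String, Map<AccessType, String>> aclsByApp = new HashMap<>();

    void register(String appId, String viewer) {
        Map<AccessType, String> acls = new EnumMap<>(AccessType.class);
        acls.put(AccessType.VIEW_APP, viewer);
        aclsByApp.put(appId, acls);
    }

    // Unguarded lookup: throws NPE when appId was never registered.
    boolean checkAccessUnguarded(String user, AccessType type, String appId) {
        return user.equals(aclsByApp.get(appId).get(type));
    }

    // Guarded lookup: an unknown application is treated as "access denied".
    boolean checkAccessGuarded(String user, AccessType type, String appId) {
        Map<AccessType, String> acls = aclsByApp.get(appId);
        return acls != null && user.equals(acls.get(type));
    }

    public static void main(String[] args) {
        AclLookupSketch manager = new AclLookupSketch();
        manager.register("app_1", "alice");
        System.out.println(manager.checkAccessGuarded("alice", AccessType.VIEW_APP, "app_1"));
        System.out.println(manager.checkAccessGuarded("alice", AccessType.VIEW_APP, "app_2"));
        try {
            manager.checkAccessUnguarded("alice", AccessType.VIEW_APP, "app_2");
        } catch (NullPointerException e) {
            System.out.println("unguarded lookup: NullPointerException");
        }
    }
}
```

Whether an unknown application should be denied or allowed is a policy decision; the sketch denies only to keep the example short.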





[jira] [Commented] (YARN-1850) Make enabling timeline service configurable

2014-03-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946110#comment-13946110
 ] 

Hudson commented on YARN-1850:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5396 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5396/])
YARN-1850. Introduced the ability to optionally disable sending out 
timeline-events in the TimelineClient. Contributed by Zhijie Shen. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1581189)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/TimelineClientImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestTimelineClient.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
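
The idea of a client that skips sending events when the service is switched off can be sketched as follows. This is a stand-alone illustration and not the TimelineClientImpl code: the class and method names are hypothetical, java.util.Properties stands in for the Hadoop configuration, and the property name yarn.timeline-service.enabled and its "false" default are assumptions that should be checked against yarn-default.xml.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

final class TimelineToggleSketch {
    // Assumed property name; verify against yarn-default.xml.
    static final String ENABLED_KEY = "yarn.timeline-service.enabled";

    private final boolean enabled;
    // Stands in for the remote timeline server in this sketch.
    private final List<String> sent = new ArrayList<>();

    TimelineToggleSketch(Properties conf) {
        // Assumption: the service counts as disabled unless explicitly enabled.
        this.enabled = Boolean.parseBoolean(conf.getProperty(ENABLED_KEY, "false"));
    }

    // Returns true if the event was sent, false if it was skipped.
    boolean putEvent(String event) {
        if (!enabled) {
            return false; // no-op: do not try to reach a server that may be down
        }
        sent.add(event);
        return true;
    }

    List<String> sentEvents() {
        return sent;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        TimelineToggleSketch disabled = new TimelineToggleSketch(conf);
        System.out.println("disabled client sent event: " + disabled.putEvent("APP_STARTED"));

        conf.setProperty(ENABLED_KEY, "true");
        TimelineToggleSketch enabled = new TimelineToggleSketch(conf);
        System.out.println("enabled client sent event: " + enabled.putEvent("APP_STARTED"));
    }
}
```

Reading the flag once in the constructor keeps each call cheap, at the cost of requiring a new client when the setting changes.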


 Make enabling timeline service configurable 
 

 Key: YARN-1850
 URL: https://issues.apache.org/jira/browse/YARN-1850
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Fix For: 2.4.0

 Attachments: YARN-1850.1.patch


 As with the generic history service, we should make enabling the timeline 
 service configurable, in case the timeline server is not up.





[jira] [Commented] (YARN-1866) YARN RM fails to load state store with delegation token parsing error

2014-03-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946141#comment-13946141
 ] 

Hadoop QA commented on YARN-1866:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12636510/YARN-1866.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3451//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3451//console

This message is automatically generated.

 YARN RM fails to load state store with delegation token parsing error
 -

 Key: YARN-1866
 URL: https://issues.apache.org/jira/browse/YARN-1866
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker
 Attachments: YARN-1866.1.patch, YARN-1866.2.patch


 In our secure Nightlies we saw exceptions in the RM log where it failed to 
 parse the delegation token.





[jira] [Updated] (YARN-1452) Document the usage of the generic application history and the timeline data service

2014-03-24 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-1452:
--

Attachment: YARN-1452.1.patch

Created a patch that adds a one-page introduction to the timeline and generic 
history services, covering what the service is, its current status, how to 
configure and start the timeline server, how to use the CLI to access the 
generic data, and how to publish per-framework data via TimelineClient.

 Document the usage of the generic application history and the timeline data 
 service
 ---

 Key: YARN-1452
 URL: https://issues.apache.org/jira/browse/YARN-1452
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
  Labels: documentation
 Attachments: YARN-1452.1.patch


 We need to write a set of documents to guide users, covering topics such as 
 command line tools, configurations, and REST APIs.





[jira] [Updated] (YARN-1452) Document the usage of the generic application history and the timeline data service

2014-03-24 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-1452:
--

Attachment: TimelineServer.html

Uploaded the generated web page for demonstration.

 Document the usage of the generic application history and the timeline data 
 service
 ---

 Key: YARN-1452
 URL: https://issues.apache.org/jira/browse/YARN-1452
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
  Labels: documentation
 Attachments: TimelineServer.html, YARN-1452.1.patch


 We need to write a set of documents to guide users, covering topics such as 
 command line tools, configurations, and REST APIs.


