[jira] [Created] (MAPREDUCE-4149) Rumen fails to parse certain counter strings
Rumen fails to parse certain counter strings Key: MAPREDUCE-4149 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4149 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Reporter: Ravi Gummadi Assignee: Ravi Gummadi If a counter name contains { or }, Rumen is not able to parse it and throws ParseException. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
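The failure described above can be illustrated with a small sketch. It is not the attached patch, and it does not use Rumen or Hadoop classes. It assumes only that the counter string format reserves characters such as { and } as delimiters, so a counter name containing them has to be escaped when written and unescaped when parsed. The class name CounterNameEscaper, its methods, and the choice of backslash as the escape character are all hypothetical.

```java
// Hypothetical helper, not part of Rumen or Hadoop.
// Assumption: '{', '}', '(', ')', '[', ']' act as delimiters in the
// serialized counter string, and '\' is used as the escape character.
final class CounterNameEscaper {
    private static final char ESCAPE = '\\';
    private static final String SPECIAL = "{}()[]\\";

    // Prefix every delimiter (and the escape character itself) with '\'.
    static String escape(String name) {
        StringBuilder out = new StringBuilder(name.length() + 4);
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                out.append(ESCAPE);
            }
            out.append(c);
        }
        return out.toString();
    }

    // Drop each escape character and keep the character that follows it.
    static String unescape(String escaped) {
        StringBuilder out = new StringBuilder(escaped.length());
        for (int i = 0; i < escaped.length(); i++) {
            char c = escaped.charAt(i);
            if (c == ESCAPE && i + 1 < escaped.length()) {
                c = escaped.charAt(++i);
            }
            out.append(c);
        }
        return out.toString();
    }
}
```

A parser that unescapes names this way can round-trip a counter called bytes{read} instead of treating the braces as group boundaries.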
[jira] [Updated] (MAPREDUCE-4149) Rumen fails to parse certain counter strings
[ https://issues.apache.org/jira/browse/MAPREDUCE-4149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Gummadi updated MAPREDUCE-4149: Attachment: 4149.patch Attaching patch for trunk with the fix. Rumen fails to parse certain counter strings Key: MAPREDUCE-4149 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4149 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Reporter: Ravi Gummadi Assignee: Ravi Gummadi Attachments: 4149.patch If a counter name contains { or }, Rumen is not able to parse it and throws ParseException.
[jira] [Updated] (MAPREDUCE-4102) job counters not available in Jobhistory webui for killed jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhallamudi Venkata Siva Kamesh updated MAPREDUCE-4102: -- Attachment: MAPREDUCE-4102.patch For this, I *think* we should display the counters of all successfully completed map/reduce tasks as the overall job counters for killed/failed jobs. Also, when a job has no successfully completed map/reduce tasks, it displays {noformat}Sorry it looks like job_clusterid_jobNum has no counters.{noformat} Attaching the patch for the same. Please review. job counters not available in Jobhistory webui for killed jobs -- Key: MAPREDUCE-4102 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4102 Project: Hadoop Map/Reduce Issue Type: Bug Components: webapps Affects Versions: 0.23.2 Reporter: Thomas Graves Attachments: MAPREDUCE-4102.patch Run a simple wordcount or sleep, and kill the job before it finishes. Go to the job history web ui and click the Counters link for that job. It displays 500 error. The job history log has: Caused by: com.google.inject.ProvisionException: Guice provision errors: 2012-04-03 19:42:53,148 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /jobhistory/jobcounters/job_1333482028750_0001 java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) ... .. ... 1) Error injecting constructor, java.lang.NullPointerException at org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:56) while locating org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock ... .. ... 
Caused by: java.lang.NullPointerException at org.apache.hadoop.mapreduce.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:328) at org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.getCounters(CountersBlock.java:188) at org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:57) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) There are task counters available if you drill down into successful tasks though.
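The stack trace above points at counter aggregation hitting a null. As a rough illustration only, and not the contents of MAPREDUCE-4102.patch, the sketch below sums per-task counters while skipping tasks that have no counters at all. Plain java.util maps stand in for Hadoop's counter classes; the class name CounterTotals and the method sum are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for job-level counter aggregation.
// Assumption: a killed job may have tasks whose counters are null,
// and those tasks should simply contribute nothing to the total.
final class CounterTotals {
    static Map<String, Long> sum(List<Map<String, Long>> perTaskCounters) {
        Map<String, Long> total = new TreeMap<>();
        for (Map<String, Long> task : perTaskCounters) {
            if (task == null) {
                continue; // task never reported counters; skip instead of failing
            }
            for (Map.Entry<String, Long> e : task.entrySet()) {
                total.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return total;
    }
}
```

With this shape, a job whose tasks all lack counters yields an empty map, which a UI could report as "no counters" rather than failing with a 500 error.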
[jira] [Updated] (MAPREDUCE-4102) job counters not available in Jobhistory webui for killed jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhallamudi Venkata Siva Kamesh updated MAPREDUCE-4102: -- Affects Version/s: 2.0.0 Status: Patch Available (was: Open) job counters not available in Jobhistory webui for killed jobs -- Key: MAPREDUCE-4102 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4102 Project: Hadoop Map/Reduce Issue Type: Bug Components: webapps Affects Versions: 0.23.2, 2.0.0 Reporter: Thomas Graves Attachments: MAPREDUCE-4102.patch Run a simple wordcount or sleep, and kill the job before it finishes. Go to the job history web ui and click the Counters link for that job. It displays 500 error. The job history log has: Caused by: com.google.inject.ProvisionException: Guice provision errors: 2012-04-03 19:42:53,148 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /jobhistory/jobcounters/job_1333482028750_0001 java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) ... .. ... 1) Error injecting constructor, java.lang.NullPointerException at org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:56) while locating org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock ... .. ... Caused by: java.lang.NullPointerException at org.apache.hadoop.mapreduce.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:328) at org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.getCounters(CountersBlock.java:188) at org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:57) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) There are task counters available if you drill down into successful tasks though. 
[jira] [Created] (MAPREDUCE-4150) Versioning and rolling upgrades for Yarn/MR2
Versioning and rolling upgrades for Yarn/MR2 Key: MAPREDUCE-4150 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4150 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.1 Reporter: Ahmed Radwan It doesn't seem that Yarn components, for example the ResourceManager or NodeManager, do build/package version checking before trying to communicate with each other. The objective of this ticket is to support the following requirements / use cases: - New versions can be marked incompatible with old versions, and services should be prevented from communicating with each other in such case. This will avoid non-deterministic behavior/problems resulting from incompatible components trying to communicate with each other. - Permitting a policy for running different - but compatible - versions on the same cluster (for example, in a rolling upgrade scenario). See HDFS-2983 for the corresponding HDFS implementation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
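To make the two requirements above concrete, here is a minimal sketch of a version gate that a service could apply before talking to a peer. It is an assumption-laden illustration, not the YARN or HDFS-2983 implementation: it assumes dotted numeric version strings such as 0.23.1 and a single configured "minimum compatible version". The names VersionCheck, compare and mayConnect are hypothetical.

```java
// Hypothetical version gate, not taken from YARN or HDFS.
// Assumption: versions are dotted numeric strings ("0.23.1") and a peer is
// acceptable when its version is at least the configured minimum.
final class VersionCheck {
    // Component-wise numeric comparison; missing components count as 0,
    // so "2.0" and "2.0.0" compare equal.
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // Reject peers older than the minimum; allow different but compatible
    // versions, which is what a rolling upgrade needs.
    static boolean mayConnect(String peerVersion, String minCompatibleVersion) {
        return compare(peerVersion, minCompatibleVersion) >= 0;
    }
}
```

Raising the minimum compatible version in a new release is one way to mark older versions incompatible, while leaving it unchanged permits mixed versions during a rolling upgrade.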
[jira] [Commented] (MAPREDUCE-3927) Shuffle hang when set map.failures.percent
[ https://issues.apache.org/jira/browse/MAPREDUCE-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253225#comment-13253225 ] Bhallamudi Venkata Siva Kamesh commented on MAPREDUCE-3927: --- Hi MengWang, Any update on the patch? Shuffle hang when set map.failures.percent -- Key: MAPREDUCE-3927 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3927 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.21.0, 0.23.0 Reporter: MengWang Labels: patch Fix For: 0.24.0 Attachments: MAPREDUCE-3927.patch, MAPREDUCE-3927.patch When mapred.max.map.failures.percent is set and some maps have failed, the shuffle hangs.
[jira] [Commented] (MAPREDUCE-4150) Versioning and rolling upgrades for Yarn/MR2
[ https://issues.apache.org/jira/browse/MAPREDUCE-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253226#comment-13253226 ] Ahmed Radwan commented on MAPREDUCE-4150: - I see the code has a YarnVersionInfo class, but as far as I can see, it is only used in displaying version info in the web UI. Versioning and rolling upgrades for Yarn/MR2 Key: MAPREDUCE-4150 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4150 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.1 Reporter: Ahmed Radwan It doesn't seem that Yarn components, for example the ResourceManager or NodeManager, do build/package version checking before trying to communicate with each other. The objective of this ticket is to support the following requirements / use cases: - New versions can be marked incompatible with old versions, and services should be prevented from communicating with each other in such case. This will avoid non-deterministic behavior/problems resulting from incompatible components trying to communicate with each other. - Permitting a policy for running different - but compatible - versions on the same cluster (for example, in a rolling upgrade scenario). See HDFS-2983 for the corresponding HDFS implementation.
[jira] [Updated] (MAPREDUCE-4149) Rumen fails to parse certain counter strings
[ https://issues.apache.org/jira/browse/MAPREDUCE-4149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Gummadi updated MAPREDUCE-4149: Attachment: 4149.branch-1.v1.patch Attaching patch for branch-1 with the fix. Also added testcase that fails without this fix and passes with the fix. Rumen fails to parse certain counter strings Key: MAPREDUCE-4149 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4149 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Reporter: Ravi Gummadi Assignee: Ravi Gummadi Attachments: 4149.branch-1.v1.patch, 4149.patch If a counter name contains { or }, Rumen is not able to parse it and throws ParseException. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4102) job counters not available in Jobhistory webui for killed jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253252#comment-13253252 ] Hadoop QA commented on MAPREDUCE-4102: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12522540/MAPREDUCE-4102.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified test files. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell org.apache.hadoop.yarn.server.TestDiskFailures org.apache.hadoop.yarn.server.TestContainerManagerSecurity org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService org.apache.hadoop.yarn.server.resourcemanager.resourcetracker.TestNMExpiry org.apache.hadoop.yarn.server.resourcemanager.TestAMAuthorization org.apache.hadoop.yarn.server.resourcemanager.TestApplicationACLs org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobs org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesTasks org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesTasks org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebApp org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesJobs org.apache.hadoop.mapred.TestMiniMRClasspath org.apache.hadoop.mapreduce.v2.TestMRJobs org.apache.hadoop.mapred.TestMiniMRWithDFSWithDistinctUsers org.apache.hadoop.mapred.TestMiniMRBringup org.apache.hadoop.mapred.TestMiniMRChildTask 
org.apache.hadoop.mapred.TestReduceFetch org.apache.hadoop.mapred.TestClusterMRNotification org.apache.hadoop.mapred.TestReduceFetchFromPartialMem org.apache.hadoop.mapred.TestJobCounters org.apache.hadoop.mapreduce.TestChild org.apache.hadoop.mapred.TestMiniMRClientCluster org.apache.hadoop.ipc.TestSocketFactory org.apache.hadoop.mapreduce.v2.TestMRJobsWithHistoryService org.apache.hadoop.mapreduce.v2.TestMROldApiJobs org.apache.hadoop.mapreduce.v2.TestSpeculativeExecution org.apache.hadoop.mapreduce.lib.output.TestJobOutputCommitter org.apache.hadoop.mapred.TestClientRedirect org.apache.hadoop.mapred.TestLazyOutput org.apache.hadoop.mapred.TestJobCleanup org.apache.hadoop.mapreduce.TestMapReduceLazyOutput org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath org.apache.hadoop.mapreduce.v2.TestMRAppWithCombiner org.apache.hadoop.conf.TestNoDefaultsJobConf org.apache.hadoop.mapreduce.v2.TestRMNMInfo org.apache.hadoop.mapred.TestClusterMapReduceTestCase org.apache.hadoop.mapreduce.v2.TestNonExistentJob org.apache.hadoop.mapred.TestJobSysDirWithDFS org.apache.hadoop.mapreduce.v2.TestUberAM org.apache.hadoop.mapreduce.v2.TestMiniMRProxyUser org.apache.hadoop.mapred.TestJobName org.apache.hadoop.mapreduce.security.TestJHSSecurity +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2218//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2218//console This message is automatically generated. job counters not available in Jobhistory webui for killed jobs -- Key: MAPREDUCE-4102 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4102 Project: Hadoop Map/Reduce Issue Type: Bug Components: webapps Affects Versions: 0.23.2, 2.0.0 Reporter: Thomas Graves Attachments: MAPREDUCE-4102.patch Run a simple wordcount or sleep, and kill the job before it finishes. Go to the job history web ui and click the Counters link for that job.
[jira] [Updated] (MAPREDUCE-4102) job counters not available in Jobhistory webui for killed jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhallamudi Venkata Siva Kamesh updated MAPREDUCE-4102: -- Status: Open (was: Patch Available) Cancelling the patch to address the test failures job counters not available in Jobhistory webui for killed jobs -- Key: MAPREDUCE-4102 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4102 Project: Hadoop Map/Reduce Issue Type: Bug Components: webapps Affects Versions: 0.23.2, 2.0.0 Reporter: Thomas Graves Attachments: MAPREDUCE-4102.patch Run a simple wordcount or sleep, and kill the job before it finishes. Go to the job history web ui and click the Counters link for that job. It displays 500 error. The job history log has: Caused by: com.google.inject.ProvisionException: Guice provision errors: 2012-04-03 19:42:53,148 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /jobhistory/jobcounters/job_1333482028750_0001 java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) ... .. ... 1) Error injecting constructor, java.lang.NullPointerException at org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:56) while locating org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock ... .. ... 
Caused by: java.lang.NullPointerException at org.apache.hadoop.mapreduce.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:328) at org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.getCounters(CountersBlock.java:188) at org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:57) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) There are task counters available if you drill down into successful tasks though.
[jira] [Commented] (MAPREDUCE-4149) Rumen fails to parse certain counter strings
[ https://issues.apache.org/jira/browse/MAPREDUCE-4149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253264#comment-13253264 ] Amar Kamat commented on MAPREDUCE-4149: --- Ravi, let 'UserCounterMapper' extend 'IdentityMapper'. This way, you can set our special counters in UserCounterMapper's map api and then invoke IdentityMapper's map() api. Thoughts? Rumen fails to parse certain counter strings Key: MAPREDUCE-4149 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4149 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Reporter: Ravi Gummadi Assignee: Ravi Gummadi Attachments: 4149.branch-1.v1.patch, 4149.patch If a counter name contains { or }, Rumen is not able to parse it and throws ParseException. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4050) Invalid node link
[ https://issues.apache.org/jira/browse/MAPREDUCE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253332#comment-13253332 ] Hudson commented on MAPREDUCE-4050: --- Integrated in Hadoop-Hdfs-0.23-Build #226 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/226/]) Merge MAPREDUCE-4050 from trunk. For tasks without assigned containers, changes the node text on the UI to N/A instead of a link to null. (Contributed by Bhallamudi Venkata Siva Kamesh) (Revision 1325437) Result = SUCCESS sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1325437 Files : * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/TaskPage.java Invalid node link - Key: MAPREDUCE-4050 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4050 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: Bhallamudi Venkata Siva Kamesh Assignee: Bhallamudi Venkata Siva Kamesh Fix For: 0.23.3 Attachments: MAPREDUCE-4050.patch, MAPREDUCE-4050.png When a task is in *UNASSIGNED* state, node link is displayed as +null+. But I think it is better to display the link as *N/A* rather than +null+. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4140) mapreduce classes incorrectly importing clover.org.apache.* classes
[ https://issues.apache.org/jira/browse/MAPREDUCE-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253331#comment-13253331 ] Hudson commented on MAPREDUCE-4140: --- Integrated in Hadoop-Hdfs-0.23-Build #226 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/226/]) Merge -r 1325351:1325352 from trunk to branch-0.23. Fixes: MAPREDUCE-4140 (Revision 1325362) Result = SUCCESS tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1325362 Files : * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/PartialJob.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetDelegationTokenRequest.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java mapreduce classes incorrectly importing clover.org.apache.* classes - Key: MAPREDUCE-4140 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4140 Project: Hadoop Map/Reduce Issue Type: Bug Components: client, mrv2 Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 0.23.3 Attachments: MAPREDUCE-4140.patch, MAPREDUCE-4140.patch, MAPREDUCE-4140.patch A number of classes in mapreduce are importing clover.org.apache.* classes e.g. hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/PartialJob.java -- This message is automatically generated by JIRA. 
[jira] [Commented] (MAPREDUCE-4050) Invalid node link
[ https://issues.apache.org/jira/browse/MAPREDUCE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253349#comment-13253349 ] Hudson commented on MAPREDUCE-4050: --- Integrated in Hadoop-Hdfs-trunk #1013 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1013/]) MAPREDUCE-4050. For tasks without assigned containers, changes the node text on the UI to N/A instead of a link to null. (Contributed by Bhallamudi Venkata Siva Kamesh) (Revision 1325435) Result = FAILURE sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1325435 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/TaskPage.java Invalid node link - Key: MAPREDUCE-4050 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4050 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: Bhallamudi Venkata Siva Kamesh Assignee: Bhallamudi Venkata Siva Kamesh Fix For: 0.23.3 Attachments: MAPREDUCE-4050.patch, MAPREDUCE-4050.png When a task is in *UNASSIGNED* state, node link is displayed as +null+. But I think it is better to display the link as *N/A* rather than +null+. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4147) YARN should not have a compile-time dependency on HDFS
[ https://issues.apache.org/jira/browse/MAPREDUCE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253351#comment-13253351 ] Hudson commented on MAPREDUCE-4147: --- Integrated in Hadoop-Hdfs-trunk #1013 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1013/]) MAPREDUCE-4147. YARN should not have a compile-time dependency on HDFS. (Revision 1325573) Result = FAILURE tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1325573 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/pom.xml YARN should not have a compile-time dependency on HDFS -- Key: MAPREDUCE-4147 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4147 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.1 Reporter: Tom White Assignee: Tom White Fix For: 2.0.0 Attachments: MAPREDUCE-4147.patch YARN doesn't (and shouldn't) use any HDFS-specific APIs, so it should not declare HDFS as a compile-time dependency. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4139) Potential ResourceManager deadlock when SchedulerEventDispatcher is stopped
[ https://issues.apache.org/jira/browse/MAPREDUCE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-4139: -- Status: Open (was: Patch Available) Potential ResourceManager deadlock when SchedulerEventDispatcher is stopped --- Key: MAPREDUCE-4139 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4139 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.3 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: MAPREDUCE-4139.patch When the main thread calls ResourceManager$SchedulerEventDispatcher.stop() it grabs a lock on the object, kicks the event processor thread, and then waits for the thread to exit. However the interrupted event processor thread can end up trying to call the synchronized getConfig() method which results in deadlock. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
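The deadlock described above follows a common pattern: stop() holds the object's monitor while joining a thread that itself needs the same monitor. The sketch below shows one way to avoid that pattern. It is not MAPREDUCE-4139.patch and does not use ResourceManager code; the class EventDispatcher and its fields are hypothetical, and the only assumption carried over from the report is that the worker thread calls a synchronized getConfig() on its way out.

```java
// Hypothetical dispatcher, not the ResourceManager's SchedulerEventDispatcher.
// If stop() were declared synchronized and joined the worker while holding
// the monitor, the worker would block forever in getConfig(). Here the
// monitor is released before join(), so the worker can finish.
final class EventDispatcher {
    private volatile boolean stopped;
    private final String config = "default";
    private final Thread worker = new Thread(this::run, "event-processor");

    synchronized String getConfig() {
        return config;
    }

    void start() {
        worker.start();
    }

    private void run() {
        while (!stopped) {
            try {
                Thread.sleep(50); // stands in for waiting on an event queue
            } catch (InterruptedException e) {
                // interrupted by stop(); fall through and re-check the flag
            }
            getConfig(); // synchronized call made on the shutdown path
        }
    }

    void stop() {
        synchronized (this) {
            stopped = true; // short critical section only
        }
        worker.interrupt();
        try {
            worker.join(); // monitor is NOT held here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    boolean isWorkerAlive() {
        return worker.isAlive();
    }
}
```

The design choice is simply to keep the lock scope as small as possible and never wait on another thread while holding a lock that thread may need.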
[jira] [Resolved] (MAPREDUCE-4128) AM Recovery expects all attempts of a completed task to also be completed.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans resolved MAPREDUCE-4128. Resolution: Fixed Fix Version/s: (was: 3.0.0) 2.0.0 0.23.3 Thanks for your work on this Bikas. I have put this into trunk, branch-2 and branch-0.23 AM Recovery expects all attempts of a completed task to also be completed. -- Key: MAPREDUCE-4128 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4128 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 3.0.0 Reporter: Bikas Saha Assignee: Bikas Saha Fix For: 0.23.3, 2.0.0 Attachments: MAPREDUCE-4128-1.patch, MAPREDUCE-4128-2.patch, MAPREDUCE-4128.patch The AM seems to assume that all attempts of a completed task (from a previous AM incarnation) would also be completed. There is at least one case in which this does not hold. Case being cancellation of a completed task resulting in a new running attempt. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4139) Potential ResourceManager deadlock when SchedulerEventDispatcher is stopped
[ https://issues.apache.org/jira/browse/MAPREDUCE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-4139: -- Attachment: MAPREDUCE-4139.patch Updated patch to fix for findbug warning. Test failures are unrelated to this patch. They are the same failures that were recently reported for MAPREDUCE-4144 and others. Potential ResourceManager deadlock when SchedulerEventDispatcher is stopped --- Key: MAPREDUCE-4139 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4139 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.3 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: MAPREDUCE-4139.patch, MAPREDUCE-4139.patch When the main thread calls ResourceManager$SchedulerEventDispatcher.stop() it grabs a lock on the object, kicks the event processor thread, and then waits for the thread to exit. However the interrupted event processor thread can end up trying to call the synchronized getConfig() method which results in deadlock. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4139) Potential ResourceManager deadlock when SchedulerEventDispatcher is stopped
[ https://issues.apache.org/jira/browse/MAPREDUCE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-4139: -- Status: Patch Available (was: Open) Potential ResourceManager deadlock when SchedulerEventDispatcher is stopped --- Key: MAPREDUCE-4139 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4139 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.3 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: MAPREDUCE-4139.patch, MAPREDUCE-4139.patch When the main thread calls ResourceManager$SchedulerEventDispatcher.stop() it grabs a lock on the object, kicks the event processor thread, and then waits for the thread to exit. However the interrupted event processor thread can end up trying to call the synchronized getConfig() method which results in deadlock. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4128) AM Recovery expects all attempts of a completed task to also be completed.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253379#comment-13253379 ] Hudson commented on MAPREDUCE-4128: --- Integrated in Hadoop-Hdfs-trunk-Commit #2143 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2143/]) MAPREDUCE-4128. AM Recovery expects all attempts of a completed task to also be completed. (Bikas Saha via bobby) (Revision 1325765) Result = SUCCESS bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1325765 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/jobhistory/TestJobHistoryEventHandler.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestFetchFailure.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/avro/Events.avpr * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryParser.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/TaskFinishedEvent.java * /hadoop/common/trunk/hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/Task20LineHistoryEventEmitter.java AM Recovery expects all attempts of a completed task to also be completed. 
Key: MAPREDUCE-4128 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4128 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 3.0.0 Reporter: Bikas Saha Assignee: Bikas Saha Fix For: 0.23.3, 2.0.0 Attachments: MAPREDUCE-4128-1.patch, MAPREDUCE-4128-2.patch, MAPREDUCE-4128.patch The AM seems to assume that all attempts of a completed task (from a previous AM incarnation) would also be completed. There is at least one case in which this does not hold: cancellation of a completed task, resulting in a new running attempt.
[jira] [Commented] (MAPREDUCE-4128) AM Recovery expects all attempts of a completed task to also be completed.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253380#comment-13253380 ] Hudson commented on MAPREDUCE-4128: --- Integrated in Hadoop-Common-trunk-Commit #2070 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2070/]) MAPREDUCE-4128. AM Recovery expects all attempts of a completed task to also be completed. (Bikas Saha via bobby) (Revision 1325765) Result = SUCCESS bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325765 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/jobhistory/TestJobHistoryEventHandler.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestFetchFailure.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/avro/Events.avpr * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryParser.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/TaskFinishedEvent.java * /hadoop/common/trunk/hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/Task20LineHistoryEventEmitter.java AM Recovery expects all attempts of a completed task to also be completed. 
Key: MAPREDUCE-4128 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4128 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 3.0.0 Reporter: Bikas Saha Assignee: Bikas Saha Fix For: 0.23.3, 2.0.0 Attachments: MAPREDUCE-4128-1.patch, MAPREDUCE-4128-2.patch, MAPREDUCE-4128.patch The AM seems to assume that all attempts of a completed task (from a previous AM incarnation) would also be completed. There is at least one case in which this does not hold: cancellation of a completed task, resulting in a new running attempt.
[jira] [Commented] (MAPREDUCE-4140) mapreduce classes incorrectly importing clover.org.apache.* classes
[ https://issues.apache.org/jira/browse/MAPREDUCE-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253385#comment-13253385 ] Hudson commented on MAPREDUCE-4140: --- Integrated in Hadoop-Mapreduce-trunk #1048 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1048/]) MAPREDUCE-4140. mapreduce classes incorrectly importing clover.org.apache.* classes. Contributed by Patrick Hunt (Revision 1325352) Result = SUCCESS tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325352 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/PartialJob.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetDelegationTokenRequest.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java mapreduce classes incorrectly importing clover.org.apache.* classes - Key: MAPREDUCE-4140 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4140 Project: Hadoop Map/Reduce Issue Type: Bug Components: client, mrv2 Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 0.23.3 Attachments: MAPREDUCE-4140.patch, MAPREDUCE-4140.patch, MAPREDUCE-4140.patch A number of classes in mapreduce are importing clover.org.apache.* classes e.g. hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/PartialJob.java
[jira] [Commented] (MAPREDUCE-4128) AM Recovery expects all attempts of a completed task to also be completed.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253394#comment-13253394 ] Hudson commented on MAPREDUCE-4128: --- Integrated in Hadoop-Mapreduce-trunk-Commit #2084 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2084/]) MAPREDUCE-4128. AM Recovery expects all attempts of a completed task to also be completed. (Bikas Saha via bobby) (Revision 1325765) Result = FAILURE bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325765 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/jobhistory/TestJobHistoryEventHandler.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestFetchFailure.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/avro/Events.avpr * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryParser.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/TaskFinishedEvent.java * /hadoop/common/trunk/hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/Task20LineHistoryEventEmitter.java AM Recovery expects all attempts of a completed task to also be completed. 
Key: MAPREDUCE-4128 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4128 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 3.0.0 Reporter: Bikas Saha Assignee: Bikas Saha Fix For: 0.23.3, 2.0.0 Attachments: MAPREDUCE-4128-1.patch, MAPREDUCE-4128-2.patch, MAPREDUCE-4128.patch The AM seems to assume that all attempts of a completed task (from a previous AM incarnation) would also be completed. There is at least one case in which this does not hold: cancellation of a completed task, resulting in a new running attempt.
[jira] [Commented] (MAPREDUCE-4050) Invalid node link
[ https://issues.apache.org/jira/browse/MAPREDUCE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253389#comment-13253389 ] Hudson commented on MAPREDUCE-4050: --- Integrated in Hadoop-Mapreduce-trunk #1048 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1048/]) MAPREDUCE-4050. For tasks without assigned containers, changes the node text on the UI to N/A instead of a link to null. (Contributed by Bhallamudi Venkata Siva Kamesh) (Revision 1325435) Result = SUCCESS sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325435 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/TaskPage.java Invalid node link - Key: MAPREDUCE-4050 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4050 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: Bhallamudi Venkata Siva Kamesh Assignee: Bhallamudi Venkata Siva Kamesh Fix For: 0.23.3 Attachments: MAPREDUCE-4050.patch, MAPREDUCE-4050.png When a task is in *UNASSIGNED* state, node link is displayed as +null+. But I think it is better to display the link as *N/A* rather than +null+.
[jira] [Commented] (MAPREDUCE-4147) YARN should not have a compile-time dependency on HDFS
[ https://issues.apache.org/jira/browse/MAPREDUCE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253391#comment-13253391 ] Hudson commented on MAPREDUCE-4147: --- Integrated in Hadoop-Mapreduce-trunk #1048 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1048/]) MAPREDUCE-4147. YARN should not have a compile-time dependency on HDFS. (Revision 1325573) Result = SUCCESS tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325573 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/pom.xml YARN should not have a compile-time dependency on HDFS -- Key: MAPREDUCE-4147 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4147 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.1 Reporter: Tom White Assignee: Tom White Fix For: 2.0.0 Attachments: MAPREDUCE-4147.patch YARN doesn't (and shouldn't) use any HDFS-specific APIs, so it should not declare HDFS as a compile-time dependency.
[jira] [Updated] (MAPREDUCE-4071) NPE while executing MRAppMaster shutdown hook
[ https://issues.apache.org/jira/browse/MAPREDUCE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhallamudi Venkata Siva Kamesh updated MAPREDUCE-4071: -- Status: Patch Available (was: Open) NPE while executing MRAppMaster shutdown hook - Key: MAPREDUCE-4071 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4071 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am, mrv2 Affects Versions: 0.23.3, 2.0.0 Reporter: Bhallamudi Venkata Siva Kamesh Attachments: MAPREDUCE-4071-1.patch, MAPREDUCE-4071-2.patch, MAPREDUCE-4071.patch While running the shutdown hook of MRAppMaster, hit NPE {noformat} Exception in thread Thread-1 java.lang.NullPointerException at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:668) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1004) {noformat}
[jira] [Updated] (MAPREDUCE-4071) NPE while executing MRAppMaster shutdown hook
[ https://issues.apache.org/jira/browse/MAPREDUCE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhallamudi Venkata Siva Kamesh updated MAPREDUCE-4071: -- Attachment: MAPREDUCE-4071-2.patch To fix this issue, I just refactored the existing code. After refactoring, I successfully ran the *wordcount* and *terasort* examples. Please review the patch and provide your comments. NPE while executing MRAppMaster shutdown hook - Key: MAPREDUCE-4071 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4071 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am, mrv2 Affects Versions: 0.23.3, 2.0.0 Reporter: Bhallamudi Venkata Siva Kamesh Attachments: MAPREDUCE-4071-1.patch, MAPREDUCE-4071-2.patch, MAPREDUCE-4071.patch While running the shutdown hook of MRAppMaster, hit NPE {noformat} Exception in thread Thread-1 java.lang.NullPointerException at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:668) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1004) {noformat}
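Note on MAPREDUCE-4071: the stack trace shows the shutdown hook calling setSignalled on something that was still null. One common way to guard against that is sketched below. This is only an illustration under stated assumptions, not the MRAppMaster code and not a description of the attached patches (the comment above says only that existing code was refactored). ShutdownHookSketch, Allocator, init() and signalShutdown() are hypothetical names.

```java
// Hypothetical names throughout; this is not the MRAppMaster code or the attached patch.
class ShutdownHookSketch {
    // Stand-in for the component the hook needs to signal.
    static class Allocator {
        private volatile boolean signalled;

        void setSignalled(boolean value) {
            signalled = value;
        }

        boolean isSignalled() {
            return signalled;
        }
    }

    // Stays null until init() runs; a JVM shutdown can arrive before that.
    private volatile Allocator allocator;

    void init() {
        allocator = new Allocator();
    }

    Allocator getAllocator() {
        return allocator;
    }

    // Called from the shutdown hook. Reads the field once and skips the call when
    // nothing has been created yet. Returns true if the signal was delivered.
    boolean signalShutdown() {
        Allocator current = allocator;
        if (current == null) {
            return false;
        }
        current.setSignalled(true);
        return true;
    }

    public static void main(String[] args) {
        ShutdownHookSketch master = new ShutdownHookSketch();
        // The hook may fire at any point, including before init() has run.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> master.signalShutdown()));
        System.out.println("before init: " + master.signalShutdown());
        master.init();
        System.out.println("after init: " + master.signalShutdown());
    }
}
```

Reading the field into a local before the null check keeps the check and the call on the same reference even if another thread reassigns the field in between.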
[jira] [Created] (MAPREDUCE-4151) RM scheduler web page should filter apps to those that are relevant to scheduling
RM scheduler web page should filter apps to those that are relevant to scheduling - Key: MAPREDUCE-4151 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4151 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2, webapps Reporter: Jason Lowe On the ResourceManager's scheduler web page, the bottom of the page shows the apps block. When the cluster has run a lot of applications (e.g. 10,000+), loading the apps table can take a long time, and that delays the plotting of the queue status, which is the most interesting portion of the page. If the user is bothering to go to the scheduler page, they're probably not interested in apps that are not affecting what the scheduler is doing (e.g. FINISHED, FAILED, KILLED, etc.). Having the RM filter the apps for this page should significantly reduce the time it takes to load this page on the client, and it also reduces the number of apps the user has to sift through when looking for the apps that are affecting what the scheduler is doing.
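Note on MAPREDUCE-4151: the filtering proposed above could look roughly like the sketch below. It is an illustration only. The enum, the App class and the method name are hypothetical and are not the ResourceManager web app API; the only detail taken from the issue is that FINISHED, FAILED and KILLED apps are not interesting on the scheduler page. Which of the remaining states to show is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the apps table; not the ResourceManager classes.
class SchedulerPageFilterSketch {
    enum AppState { NEW, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

    static final class App {
        final String id;
        final AppState state;

        App(String id, AppState state) {
            this.id = id;
            this.state = state;
        }
    }

    // States the issue names as not affecting what the scheduler is doing.
    static final Set<AppState> DONE =
            EnumSet.of(AppState.FINISHED, AppState.FAILED, AppState.KILLED);

    // Server-side filter: only apps that can still affect scheduling are returned,
    // so the page never has to load or render the completed ones.
    static List<App> relevantToScheduler(List<App> apps) {
        List<App> kept = new ArrayList<>();
        for (App app : apps) {
            if (!DONE.contains(app.state)) {
                kept.add(app);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<App> all = Arrays.asList(
                new App("app_1", AppState.RUNNING),
                new App("app_2", AppState.FINISHED),
                new App("app_3", AppState.ACCEPTED),
                new App("app_4", AppState.KILLED));
        for (App app : relevantToScheduler(all)) {
            System.out.println(app.id + " " + app.state);
        }
    }
}
```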
[jira] [Updated] (MAPREDUCE-4102) job counters not available in Jobhistory webui for killed jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhallamudi Venkata Siva Kamesh updated MAPREDUCE-4102: -- Attachment: MAPREDUCE-4102-1.patch Certainly, not all of the test failures are induced by this patch. Fixed all the relevant test failures and ran all the test cases shown above. All the tests passed in my environment. I did not modify any *src* code, only the *test* code. The patch contains a test case that fails without the src changes and passes with them. Please review. job counters not available in Jobhistory webui for killed jobs -- Key: MAPREDUCE-4102 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4102 Project: Hadoop Map/Reduce Issue Type: Bug Components: webapps Affects Versions: 0.23.2, 2.0.0 Reporter: Thomas Graves Attachments: MAPREDUCE-4102-1.patch, MAPREDUCE-4102.patch Run a simple wordcount or sleep, and kill the job before it finishes. Go to the job history web ui and click the Counters link for that job. It displays 500 error. The job history log has: Caused by: com.google.inject.ProvisionException: Guice provision errors: 2012-04-03 19:42:53,148 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /jobhistory/jobcounters/job_1333482028750_0001 java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) ... .. ... 1) Error injecting constructor, java.lang.NullPointerException at org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:56) while locating org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock ... .. ... 
Caused by: java.lang.NullPointerException at org.apache.hadoop.mapreduce.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:328) at org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.getCounters(CountersBlock.java:188) at org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:57) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) There are task counters available if you drill down into successful tasks though.
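Note on MAPREDUCE-4102: the trace above ends in an NPE while counters are being summed, and the report notes that per-task counters do exist for successful tasks. A null-tolerant aggregation of per-task counters might look like the sketch below. It is a simplified illustration under stated assumptions: counters are modelled as plain maps rather than the Hadoop Counters classes, the counter names are made up for the demo, and this is not taken from the attached patches.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified model: each task's counters are a name -> value map, or null when the
// task never reported any (for example because the job was killed first).
class CounterAggregationSketch {
    // Sums the per-task maps into one job-level map, skipping tasks without counters
    // instead of dereferencing null.
    static Map<String, Long> aggregate(List<Map<String, Long>> perTaskCounters) {
        Map<String, Long> total = new TreeMap<>();
        for (Map<String, Long> counters : perTaskCounters) {
            if (counters == null) {
                continue;
            }
            for (Map.Entry<String, Long> entry : counters.entrySet()) {
                total.merge(entry.getKey(), entry.getValue(), Long::sum);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Long> finishedMap = new HashMap<>();
        finishedMap.put("RECORDS_READ", 100L); // illustrative counter name
        Map<String, Long> otherMap = new HashMap<>();
        otherMap.put("RECORDS_READ", 50L);
        // The null entry stands for a task that was killed before reporting counters.
        System.out.println(aggregate(Arrays.asList(finishedMap, null, otherMap)));
    }
}
```

An empty result then corresponds to the "has no counters" case, which a page can report as a message rather than a 500 error.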
[jira] [Updated] (MAPREDUCE-4102) job counters not available in Jobhistory webui for killed jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhallamudi Venkata Siva Kamesh updated MAPREDUCE-4102: -- Status: Patch Available (was: Open) job counters not available in Jobhistory webui for killed jobs -- Key: MAPREDUCE-4102 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4102 Project: Hadoop Map/Reduce Issue Type: Bug Components: webapps Affects Versions: 0.23.2, 2.0.0 Reporter: Thomas Graves Attachments: MAPREDUCE-4102-1.patch, MAPREDUCE-4102.patch Run a simple wordcount or sleep, and kill the job before it finishes. Go to the job history web ui and click the Counters link for that job. It displays 500 error. The job history log has: Caused by: com.google.inject.ProvisionException: Guice provision errors: 2012-04-03 19:42:53,148 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /jobhistory/jobcounters/job_1333482028750_0001 java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) ... .. ... 1) Error injecting constructor, java.lang.NullPointerException at org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:56) while locating org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock ... .. ... Caused by: java.lang.NullPointerException at org.apache.hadoop.mapreduce.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:328) at org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.getCounters(CountersBlock.java:188) at org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:57) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) There are task counters available if you drill down into successful tasks though. 
[jira] [Commented] (MAPREDUCE-4139) Potential ResourceManager deadlock when SchedulerEventDispatcher is stopped
[ https://issues.apache.org/jira/browse/MAPREDUCE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253439#comment-13253439 ] Hadoop QA commented on MAPREDUCE-4139: -- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12522569/MAPREDUCE-4139.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified test files. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2219//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2219//console This message is automatically generated. Potential ResourceManager deadlock when SchedulerEventDispatcher is stopped --- Key: MAPREDUCE-4139 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4139 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.3 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: MAPREDUCE-4139.patch, MAPREDUCE-4139.patch When the main thread calls ResourceManager$SchedulerEventDispatcher.stop() it grabs a lock on the object, kicks the event processor thread, and then waits for the thread to exit. However the interrupted event processor thread can end up trying to call the synchronized getConfig() method which results in deadlock.
[jira] [Commented] (MAPREDUCE-4150) Versioning and rolling upgrades for Yarn/MR2
[ https://issues.apache.org/jira/browse/MAPREDUCE-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253450#comment-13253450 ] Robert Joseph Evans commented on MAPREDUCE-4150: As part of this I would like to see something like HDFS-3245 so that if we do support multiple versions/rolling upgrades we can display a breakdown of what versions the cluster is currently running with. Versioning and rolling upgrades for Yarn/MR2 Key: MAPREDUCE-4150 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4150 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.1 Reporter: Ahmed Radwan It doesn't seem that Yarn components, for example the ResourceManager or NodeManager, do build/package version checking before trying to communicate with each other. The objective of this ticket is to support the following requirements / use cases: - New versions can be marked incompatible with old versions, and services should be prevented from communicating with each other in such case. This will avoid non-deterministic behavior/problems resulting from incompatible components trying to communicate with each other. - Permitting a policy for running different - but compatible - versions on the same cluster (for example, in a rolling upgrade scenario). See HDFS-2983 for the corresponding HDFS implementation.
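Note on MAPREDUCE-4150: a version gate of the kind described above could be as small as the sketch below. The policy shown (a peer may connect when its version is at least a configured minimum) is an assumption chosen for illustration. It is not what the ticket specifies and not the HDFS-2983 implementation. VersionCheckSketch and its method names are hypothetical, and versions are assumed to be purely numeric dotted strings such as 0.23.1.

```java
// Hypothetical helper; the policy here is an assumption, not the YARN or HDFS behaviour.
class VersionCheckSketch {
    // Compares dotted numeric versions such as "0.23.1"; missing parts count as 0.
    // Returns a negative, zero, or positive value like Comparator.compare.
    static int compare(String left, String right) {
        String[] l = left.split("\\.");
        String[] r = right.split("\\.");
        int parts = Math.max(l.length, r.length);
        for (int i = 0; i < parts; i++) {
            int a = i < l.length ? Integer.parseInt(l[i]) : 0;
            int b = i < r.length ? Integer.parseInt(r[i]) : 0;
            if (a != b) {
                return Integer.compare(a, b);
            }
        }
        return 0;
    }

    // Assumed policy: accept a peer whose version is at least the minimum that this
    // service declares compatible; reject anything older before any RPC happens.
    static boolean mayCommunicate(String peerVersion, String minimumCompatible) {
        return compare(peerVersion, minimumCompatible) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(mayCommunicate("0.23.3", "0.23.1"));
        System.out.println(mayCommunicate("0.22.0", "0.23.1"));
    }
}
```

Different but compatible versions, as in a rolling upgrade, pass the check as long as the oldest running version stays at or above the configured minimum.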
[jira] [Created] (MAPREDUCE-4152) map task left hanging after AM dies trying to connect to RM
map task left hanging after AM dies trying to connect to RM --- Key: MAPREDUCE-4152 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4152 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.2 Reporter: Thomas Graves Assignee: Thomas Graves We had an instance where the RM went down for more than an hour. The application master exited with Could not contact RM after 36 milliseconds 2012-04-11 10:43:36,040 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1333003059741_15999Job Transitioned from RUNNING to ERROR
[jira] [Commented] (MAPREDUCE-4152) map task left hanging after AM dies trying to connect to RM
[ https://issues.apache.org/jira/browse/MAPREDUCE-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253460#comment-13253460 ] Thomas Graves commented on MAPREDUCE-4152: -- The Job did not kill off the map task that it had running before exiting. In JobImpl, when it moves from RUNNING to ERROR, all it does is send the JobUnsuccessfulCompletion event. I would think it would at least try to kill any tasks it has. Now there might also be another issue with the NM as to why it didn't kill it. The NM was also not able to connect to the RM and I saw one of the threads restart. I'm guessing that when it restarted it lost that container, but I need to investigate that further. map task left hanging after AM dies trying to connect to RM --- Key: MAPREDUCE-4152 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4152 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.2 Reporter: Thomas Graves Assignee: Thomas Graves We had an instance where the RM went down for more than an hour. The application master exited with Could not contact RM after 36 milliseconds 2012-04-11 10:43:36,040 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1333003059741_15999Job Transitioned from RUNNING to ERROR
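Note on MAPREDUCE-4152: the behaviour suggested in the comment above (try to kill running tasks when the job moves from RUNNING to ERROR, and still send the completion event) can be sketched as below. This is a toy state model for illustration only, not JobImpl. JobErrorTransitionSketch, its enums and methods are hypothetical; the event is represented by a plain string named after the JobUnsuccessfulCompletion event mentioned in the comment.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a job and its tasks; not the MapReduce JobImpl state machine.
class JobErrorTransitionSketch {
    enum TaskState { RUNNING, SUCCEEDED, KILLED }
    enum JobState { RUNNING, ERROR }

    static class Task {
        TaskState state;

        Task(TaskState state) {
            this.state = state;
        }

        // Only a running task can be killed; finished tasks keep their state.
        void kill() {
            if (state == TaskState.RUNNING) {
                state = TaskState.KILLED;
            }
        }
    }

    final List<Task> tasks = new ArrayList<>();
    final List<String> emittedEvents = new ArrayList<>();
    JobState state = JobState.RUNNING;

    // Transition RUNNING -> ERROR: first ask every task to stop, then record the
    // completion event, so no task is left running after the job gives up.
    void onInternalError() {
        for (Task task : tasks) {
            task.kill();
        }
        emittedEvents.add("JobUnsuccessfulCompletion");
        state = JobState.ERROR;
    }

    public static void main(String[] args) {
        JobErrorTransitionSketch job = new JobErrorTransitionSketch();
        job.tasks.add(new Task(TaskState.RUNNING));
        job.tasks.add(new Task(TaskState.SUCCEEDED));
        job.onInternalError();
        System.out.println(job.state + " " + job.tasks.get(0).state + " " + job.tasks.get(1).state);
    }
}
```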
[jira] [Commented] (MAPREDUCE-4102) job counters not available in Jobhistory webui for killed jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253481#comment-13253481 ] Hadoop QA commented on MAPREDUCE-4102: -- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12522573/MAPREDUCE-4102-1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified test files. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2221//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2221//console This message is automatically generated. job counters not available in Jobhistory webui for killed jobs -- Key: MAPREDUCE-4102 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4102 Project: Hadoop Map/Reduce Issue Type: Bug Components: webapps Affects Versions: 0.23.2, 2.0.0 Reporter: Thomas Graves Attachments: MAPREDUCE-4102-1.patch, MAPREDUCE-4102.patch Run a simple wordcount or sleep, and kill the job before it finishes. Go to the job history web ui and click the Counters link for that job. It displays 500 error. 
The job history log has: Caused by: com.google.inject.ProvisionException: Guice provision errors: 2012-04-03 19:42:53,148 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /jobhistory/jobcounters/job_1333482028750_0001 java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) ... .. ... 1) Error injecting constructor, java.lang.NullPointerException at org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:56) while locating org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock ... .. ... Caused by: java.lang.NullPointerException at org.apache.hadoop.mapreduce.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:328) at org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.getCounters(CountersBlock.java:188) at org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:57) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) There are task counters available if you drill down into successful tasks though.
[jira] [Updated] (MAPREDUCE-4008) ResourceManager throws MetricsException on start up saying QueueMetrics MBean already exists
[ https://issues.apache.org/jira/browse/MAPREDUCE-4008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated MAPREDUCE-4008: - Component/s: (was: resourcemanager) ResourceManager throws MetricsException on start up saying QueueMetrics MBean already exists Key: MAPREDUCE-4008 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4008 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, scheduler Affects Versions: 2.0.0, 3.0.0 Reporter: Devaraj K Assignee: Devaraj K Attachments: MAPREDUCE-4008.patch {code:xml} 2012-03-14 15:22:23,089 WARN org.apache.hadoop.metrics2.util.MBeans: Error creating MBean object name: Hadoop:service=ResourceManager,name=QueueMetrics,q0=default org.apache.hadoop.metrics2.MetricsException: org.apache.hadoop.metrics2.MetricsException: Hadoop:service=ResourceManager,name=QueueMetrics,q0=default already exists! at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newObjectName(DefaultMetricsSystem.java:117) at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newMBeanName(DefaultMetricsSystem.java:102) at org.apache.hadoop.metrics2.util.MBeans.getMBeanName(MBeans.java:91) at org.apache.hadoop.metrics2.util.MBeans.register(MBeans.java:55) at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.startMBeans(MetricsSourceAdapter.java:218) at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.start(MetricsSourceAdapter.java:93) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.registerSource(MetricsSystemImpl.java:243) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$1.postStart(MetricsSystemImpl.java:227) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$3.invoke(MetricsSystemImpl.java:288) at 
$Proxy6.postStart(Unknown Source) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.start(MetricsSystemImpl.java:183) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.init(MetricsSystemImpl.java:155) at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.init(DefaultMetricsSystem.java:54) at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.initialize(DefaultMetricsSystem.java:50) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.start(ResourceManager.java:454) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:588) Caused by: org.apache.hadoop.metrics2.MetricsException: Hadoop:service=ResourceManager,name=QueueMetrics,q0=default already exists! at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newObjectName(DefaultMetricsSystem.java:113) ... 19 more 2012-03-14 15:22:23,090 WARN org.apache.hadoop.metrics2.util.MBeans: Failed to register MBean null javax.management.RuntimeOperationsException: Exception occurred trying to register the MBean at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:969) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:917) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:312) at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:482) at org.apache.hadoop.metrics2.util.MBeans.register(MBeans.java:57) at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.startMBeans(MetricsSourceAdapter.java:218) at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.start(MetricsSourceAdapter.java:93) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.registerSource(MetricsSystemImpl.java:243) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$1.postStart(MetricsSystemImpl.java:227) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$3.invoke(MetricsSystemImpl.java:288) at $Proxy6.postStart(Unknown Source) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.start(MetricsSystemImpl.java:183) at
[jira] [Updated] (MAPREDUCE-4048) NullPointerException exception while accessing the Application Master UI
[ https://issues.apache.org/jira/browse/MAPREDUCE-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated MAPREDUCE-4048: - Component/s: mrv2 NullPointerException exception while accessing the Application Master UI Key: MAPREDUCE-4048 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4048 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.0, 3.0.0 Reporter: Devaraj K Assignee: Devaraj K Attachments: MAPREDUCE-4048.patch {code:xml} 2012-03-21 10:21:31,838 ERROR [2145015588@qtp-957250718-801] org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /mapreduce/attempts/job_1332261815858_2_8/m/KILLED java.lang.reflect.InvocationTargetException at sun.reflect.GeneratedMethodAccessor50.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:150) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263) at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178) at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834) ... 
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) Caused by: java.lang.NullPointerException at com.google.common.base.Joiner.toString(Joiner.java:317) at com.google.common.base.Joiner.appendTo(Joiner.java:97) at com.google.common.base.Joiner.appendTo(Joiner.java:127) at com.google.common.base.Joiner.join(Joiner.java:158) at com.google.common.base.Joiner.join(Joiner.java:166) at org.apache.hadoop.yarn.util.StringHelper.join(StringHelper.java:102) at org.apache.hadoop.mapreduce.v2.app.webapp.AppController.badRequest(AppController.java:319) at org.apache.hadoop.mapreduce.v2.app.webapp.AppController.attempts(AppController.java:286) ... 36 more {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
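The trace above ends in Guava's Joiner, which throws on a null element unless it has been told how to render one (Guava offers Joiner.useForNull for that). This suggests a null argument reached StringHelper.join from AppController.badRequest. A minimal JDK-only sketch of a null-tolerant join follows; the class SafeJoin, its join method and the placeholder text are hypothetical names for illustration and are not taken from the attached patch.

```java
import java.util.StringJoiner;

// Hypothetical helper, not part of Hadoop or Guava.
final class SafeJoin {
    private SafeJoin() {}

    // Concatenates the parts, substituting nullText for any null element
    // instead of throwing NullPointerException.
    static String join(String nullText, Object... parts) {
        StringJoiner joined = new StringJoiner("");
        for (Object part : parts) {
            joined.add(part == null ? nullText : part.toString());
        }
        return joined.toString();
    }
}
```

With such a helper, a call like SafeJoin.join("<missing>", "Bad request: ", message) would still render an error page when message is null, rather than failing inside the error handler itself.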
[jira] [Updated] (MAPREDUCE-3494) The RM should handle the graceful shutdown of the NM.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated MAPREDUCE-3494: - Component/s: (was: resourcemanager) The RM should handle the graceful shutdown of the NM. - Key: MAPREDUCE-3494 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3494 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, nodemanager Affects Versions: 2.0.0, 3.0.0 Reporter: Ravi Teja Ch N V Assignee: Devaraj K Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, MAPREDUCE-3494.patch Instead of waiting for the NM expiry, RM should remove and handle the NM, which is shutdown gracefully.
[jira] [Commented] (MAPREDUCE-4071) NPE while executing MRAppMaster shutdown hook
[ https://issues.apache.org/jira/browse/MAPREDUCE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253503#comment-13253503 ] Hadoop QA commented on MAPREDUCE-4071: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12522570/MAPREDUCE-4071-2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified test files. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell org.apache.hadoop.yarn.server.TestDiskFailures org.apache.hadoop.yarn.server.TestContainerManagerSecurity org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService org.apache.hadoop.yarn.server.resourcemanager.resourcetracker.TestNMExpiry org.apache.hadoop.yarn.server.resourcemanager.TestAMAuthorization org.apache.hadoop.yarn.server.resourcemanager.TestApplicationACLs org.apache.hadoop.mapreduce.v2.app.TestRecovery org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator org.apache.hadoop.mapred.TestMiniMRClasspath org.apache.hadoop.mapreduce.v2.TestMRJobs org.apache.hadoop.mapred.TestMiniMRWithDFSWithDistinctUsers org.apache.hadoop.mapred.TestMiniMRBringup org.apache.hadoop.mapred.TestMiniMRChildTask org.apache.hadoop.mapred.TestReduceFetch org.apache.hadoop.mapred.TestClusterMRNotification org.apache.hadoop.mapred.TestReduceFetchFromPartialMem org.apache.hadoop.mapred.TestJobCounters org.apache.hadoop.mapreduce.TestChild 
org.apache.hadoop.mapred.TestMiniMRClientCluster org.apache.hadoop.ipc.TestSocketFactory org.apache.hadoop.mapreduce.v2.TestMRJobsWithHistoryService org.apache.hadoop.mapreduce.v2.TestMROldApiJobs org.apache.hadoop.mapreduce.v2.TestSpeculativeExecution org.apache.hadoop.mapreduce.lib.output.TestJobOutputCommitter org.apache.hadoop.mapred.TestClientRedirect org.apache.hadoop.mapred.TestLazyOutput org.apache.hadoop.mapred.TestJobCleanup org.apache.hadoop.mapreduce.TestMapReduceLazyOutput org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath org.apache.hadoop.mapreduce.v2.TestMRAppWithCombiner org.apache.hadoop.conf.TestNoDefaultsJobConf org.apache.hadoop.mapreduce.v2.TestRMNMInfo org.apache.hadoop.mapred.TestClusterMapReduceTestCase org.apache.hadoop.mapreduce.v2.TestNonExistentJob org.apache.hadoop.mapred.TestJobSysDirWithDFS org.apache.hadoop.mapreduce.v2.TestUberAM org.apache.hadoop.mapreduce.v2.TestMiniMRProxyUser org.apache.hadoop.mapred.TestJobName org.apache.hadoop.mapreduce.security.TestJHSSecurity +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2220//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2220//console This message is automatically generated. NPE while executing MRAppMaster shutdown hook - Key: MAPREDUCE-4071 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4071 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am, mrv2 Affects Versions: 0.23.3, 2.0.0 Reporter: Bhallamudi Venkata Siva Kamesh Attachments: MAPREDUCE-4071-1.patch, MAPREDUCE-4071-2.patch, MAPREDUCE-4071.patch While running the shutdown hook of MRAppMaster, hit NPE {noformat} Exception in thread Thread-1 java.lang.NullPointerException at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:668) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1004)
[jira] [Updated] (MAPREDUCE-3921) MR AM should act on the nodes liveliness information when nodes go up/down/unhealthy
[ https://issues.apache.org/jira/browse/MAPREDUCE-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated MAPREDUCE-3921: -- Status: Open (was: Patch Available) MR AM should act on the nodes liveliness information when nodes go up/down/unhealthy Key: MAPREDUCE-3921 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3921 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am, mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Bikas Saha Fix For: 0.23.2 Attachments: MAPREDUCE-3921-1.patch, MAPREDUCE-3921-3.patch, MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921.patch
[jira] [Updated] (MAPREDUCE-3921) MR AM should act on the nodes liveliness information when nodes go up/down/unhealthy
[ https://issues.apache.org/jira/browse/MAPREDUCE-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated MAPREDUCE-3921: -- Attachment: MAPREDUCE-3921-3.patch Attaching patch after pulling latest changes and improved test for AM recovery. MR AM should act on the nodes liveliness information when nodes go up/down/unhealthy Key: MAPREDUCE-3921 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3921 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am, mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Bikas Saha Fix For: 0.23.2 Attachments: MAPREDUCE-3921-1.patch, MAPREDUCE-3921-3.patch, MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921.patch
[jira] [Updated] (MAPREDUCE-3921) MR AM should act on the nodes liveliness information when nodes go up/down/unhealthy
[ https://issues.apache.org/jira/browse/MAPREDUCE-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated MAPREDUCE-3921: -- Status: Patch Available (was: Open) MR AM should act on the nodes liveliness information when nodes go up/down/unhealthy Key: MAPREDUCE-3921 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3921 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am, mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Bikas Saha Fix For: 0.23.2 Attachments: MAPREDUCE-3921-1.patch, MAPREDUCE-3921-3.patch, MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921.patch
[jira] [Created] (MAPREDUCE-4153) NM Application invalid state transition on reboot command from RM
NM Application invalid state transition on reboot command from RM - Key: MAPREDUCE-4153 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4153 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, nodemanager Affects Versions: 0.23.2 Reporter: Thomas Graves If the RM goes down and comes back up, it tells the NM to reboot. When the NM reboots, if it has any applications it aggregates the logs for those applications, then it transitions the app to APPLICATION_LOG_HANDLING_FINISHED. I saw a case where an app that was in the RUNNING state tried to transition to APPLICATION_LOG_HANDLING_FINISHED and got the invalid transition. [DeletionService #1]2012-04-11 15:12:40,476 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Can't handle this event at current state [AsyncDispatcher event handler]org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at RUNNING at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443) at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:382) at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:58) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:517) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:509) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:125) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:74) at 
java.lang.Thread.run(Thread.java:619) 2012-04-11 15:12:40,476 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1333003059741_15999 transitioned from RUNNING to null
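The failure reported above is an event arriving in a state that has no transition registered for it. A minimal sketch of a table-driven state machine that rejects such events is shown here; it is not YARN's StateMachineFactory, and the class TinyStateMachine and its state and event names are hypothetical, chosen only to mirror the shape of the report.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, simplified state machine; not the YARN implementation.
final class TinyStateMachine {
    enum State { RUNNING, FINISHING, FINISHED }
    enum Event { FINISH_APPLICATION, LOG_HANDLING_FINISHED }

    // Transition table: current state -> (event -> next state).
    private final Map<State, Map<Event, State>> table = new EnumMap<>(State.class);
    private State current = State.RUNNING;

    TinyStateMachine() {
        addTransition(State.RUNNING, Event.FINISH_APPLICATION, State.FINISHING);
        addTransition(State.FINISHING, Event.LOG_HANDLING_FINISHED, State.FINISHED);
    }

    private void addTransition(State from, Event on, State to) {
        table.computeIfAbsent(from, s -> new EnumMap<Event, State>(Event.class)).put(on, to);
    }

    // Applies the event, or throws if no transition is registered for it
    // in the current state. The current state is left unchanged on failure.
    State handle(Event event) {
        Map<Event, State> arcs = table.get(current);
        State next = (arcs == null) ? null : arcs.get(event);
        if (next == null) {
            throw new IllegalStateException("Invalid event: " + event + " at " + current);
        }
        current = next;
        return current;
    }

    State current() {
        return current;
    }
}
```

In this sketch, sending LOG_HANDLING_FINISHED while the machine is still RUNNING is rejected and the state stays RUNNING. Whether the real fix should add a transition for that pair or avoid sending the event is a design decision for the issue itself.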
[jira] [Commented] (MAPREDUCE-3921) MR AM should act on the nodes liveliness information when nodes go up/down/unhealthy
[ https://issues.apache.org/jira/browse/MAPREDUCE-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253678#comment-13253678 ] Hadoop QA commented on MAPREDUCE-3921: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12522596/MAPREDUCE-3921-3.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified test files. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 508 javac compiler warnings (more than the trunk's current 506 warnings). +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell org.apache.hadoop.yarn.server.TestDiskFailures org.apache.hadoop.yarn.server.TestContainerManagerSecurity org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService org.apache.hadoop.yarn.server.resourcemanager.resourcetracker.TestNMExpiry org.apache.hadoop.yarn.server.resourcemanager.TestAMAuthorization org.apache.hadoop.yarn.server.resourcemanager.TestApplicationACLs org.apache.hadoop.mapred.TestMiniMRClasspath org.apache.hadoop.mapreduce.v2.TestMRJobs org.apache.hadoop.mapred.TestMiniMRWithDFSWithDistinctUsers org.apache.hadoop.mapred.TestMiniMRBringup org.apache.hadoop.mapred.TestMiniMRChildTask org.apache.hadoop.mapred.TestReduceFetch org.apache.hadoop.mapred.TestClusterMRNotification org.apache.hadoop.mapred.TestReduceFetchFromPartialMem org.apache.hadoop.mapred.TestJobCounters org.apache.hadoop.mapreduce.TestChild org.apache.hadoop.mapred.TestMiniMRClientCluster org.apache.hadoop.ipc.TestSocketFactory 
org.apache.hadoop.mapreduce.v2.TestMRJobsWithHistoryService org.apache.hadoop.mapreduce.v2.TestMROldApiJobs org.apache.hadoop.mapreduce.v2.TestSpeculativeExecution org.apache.hadoop.mapreduce.lib.output.TestJobOutputCommitter org.apache.hadoop.mapred.TestClientRedirect org.apache.hadoop.mapred.TestLazyOutput org.apache.hadoop.mapred.TestJobCleanup org.apache.hadoop.mapreduce.TestMapReduceLazyOutput org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath org.apache.hadoop.mapreduce.v2.TestMRAppWithCombiner org.apache.hadoop.conf.TestNoDefaultsJobConf org.apache.hadoop.mapreduce.v2.TestRMNMInfo org.apache.hadoop.mapred.TestClusterMapReduceTestCase org.apache.hadoop.mapreduce.v2.TestNonExistentJob org.apache.hadoop.mapred.TestJobSysDirWithDFS org.apache.hadoop.mapreduce.v2.TestUberAM org.apache.hadoop.mapreduce.v2.TestMiniMRProxyUser org.apache.hadoop.mapred.TestJobName org.apache.hadoop.mapreduce.security.TestJHSSecurity +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build///testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build///console This message is automatically generated. MR AM should act on the nodes liveliness information when nodes go up/down/unhealthy Key: MAPREDUCE-3921 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3921 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am, mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Bikas Saha Fix For: 0.23.2 Attachments: MAPREDUCE-3921-1.patch, MAPREDUCE-3921-3.patch, MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see:
[jira] [Updated] (MAPREDUCE-3972) Locking and exception issues in JobHistory Server.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3972: --- Attachment: MR-3972.txt This patch should address all of the review comments so far. The JobListCache is now a ConcurrentSkipListMap with no locking around it. To do this I declared it safe that we may delete a few more jobs from the cache than expected. The HistoryStorage class is no longer informed about items being removed from HDFS, and the CachedHistoryStorage tries to preemptively know that they were removed, but it is not that critical if it does not happen. Please take a look, as now that there is no locking around the JobListCache there needed to be some extra checks to avoid NPEs. Locking and exception issues in JobHistory Server. -- Key: MAPREDUCE-3972 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3972 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2 Affects Versions: 0.23.2 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: MR-3972.txt, MR-3972.txt, MR-3972.txt, MR-3972.txt, MR-3972.txt The JobHistory server's locking is inconsistent and wrong in some cases. This is not super critical because the issues would only show up if a job is being cleaned up or moved from intermediate done to done, at the same time it is being parsed into a CompletedJob. However the locking is slowing down the server in some cases, and is a ticking time bomb that needs to be addressed. As part of this too we need to be sure that the Cleaner and Intermediate to Done migration threads handle exceptions properly. Now it appears that the exception is logged, and the thread just shuts down. This means that the history server could still be up and running for weeks and never remove old jobs.
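The trade-off described in the comment, dropping the lock and accepting that the cache may occasionally evict a few more entries than strictly necessary, can be sketched as follows. This is an illustration only: the class JobListCacheSketch, its Long keys and String values, and the eviction rule (oldest key first) are assumptions, not the contents of MR-3972.txt.

```java
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical bounded cache with no external locking.
final class JobListCacheSketch {
    private final ConcurrentSkipListMap<Long, String> cache = new ConcurrentSkipListMap<>();
    private final int maxSize;

    JobListCacheSketch(int maxSize) {
        this.maxSize = maxSize;
    }

    void add(long jobId, String summary) {
        cache.putIfAbsent(jobId, summary);
        // size() is only a snapshot (and not constant time on this map).
        // Two threads can both observe size > maxSize and both evict, so the
        // cache may briefly shrink below maxSize. That is the accepted cost
        // of not locking.
        while (cache.size() > maxSize) {
            cache.pollFirstEntry(); // atomically removes the smallest key
        }
    }

    // Callers must handle null: an entry may have been evicted concurrently.
    String get(long jobId) {
        return cache.get(jobId);
    }

    int size() {
        return cache.size();
    }
}
```

The null return from get is where the "extra checks to avoid NPEs" mentioned in the comment would come in: without a lock, a lookup can race with an eviction, so every caller has to treat a missing entry as a normal outcome.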
[jira] [Updated] (MAPREDUCE-3972) Locking and exception issues in JobHistory Server.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3972: --- Status: Patch Available (was: Open) Locking and exception issues in JobHistory Server. -- Key: MAPREDUCE-3972 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3972 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2 Affects Versions: 0.23.2 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: MR-3972.txt, MR-3972.txt, MR-3972.txt, MR-3972.txt, MR-3972.txt The JobHistory server's locking is inconsistent and wrong in some cases. This is not super critical because the issues would only show up if a job is being cleaned up or moved from intermediate done to done, at the same time it is being parsed into a CompletedJob. However the locking is slowing down the server in some cases, and is a ticking time bomb that needs to be addressed. As part of this too we need to be sure that the Cleaner and Intermediate to Done migration threads handle exceptions properly. Now it appears that the exception is logged, and the thread just shuts down. This means that the history server could still be up and running for weeks and never remove old jobs.
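The second half of the description, background threads that stop for good after one exception, is a general hazard with periodic tasks. In the JDK, for example, ScheduledExecutorService.scheduleAtFixedRate suppresses all later executions once a run throws. A minimal sketch of a wrapper that records the failure and lets the next run proceed is given below; the class ResilientTask is a hypothetical name, and how the history server actually schedules its Cleaner is not shown in this message.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper: keeps a periodic task alive across failures.
final class ResilientTask implements Runnable {
    private final Runnable delegate;
    private final AtomicInteger failures = new AtomicInteger();

    ResilientTask(Runnable delegate) {
        this.delegate = delegate;
    }

    @Override
    public void run() {
        try {
            delegate.run();
        } catch (RuntimeException e) {
            // A real service would log here. The point is that the exception
            // does not escape, so the scheduler keeps invoking the task.
            failures.incrementAndGet();
        }
    }

    int failureCount() {
        return failures.get();
    }
}
```

A cleaner wrapped this way and passed to a scheduler would skip a failed pass and try again on the next period instead of silently never running again.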
[jira] [Commented] (MAPREDUCE-4150) Versioning and rolling upgrades for Yarn/MR2
[ https://issues.apache.org/jira/browse/MAPREDUCE-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253738#comment-13253738 ] Aaron T. Myers commented on MAPREDUCE-4150: --- One of the first steps in implementing this should probably be to move VersionUtil into Common from HDFS. Versioning and rolling upgrades for Yarn/MR2 Key: MAPREDUCE-4150 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4150 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.1 Reporter: Ahmed Radwan It doesn't seem that Yarn components, for example the ResourceManager or NodeManager, do build/package version checking before trying to communicate with each other. The objective of this ticket is to support the following requirements / use cases: - New versions can be marked incompatible with old versions, and services should be prevented from communicating with each other in such case. This will avoid non-deterministic behavior/problems resulting from incompatible components trying to communicate with each other. - Permitting a policy for running different - but compatible - versions on the same cluster (for example, in a rolling upgrade scenario). See HDFS-2983 for the corresponding HDFS implementation.
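The two requirements above amount to comparing a peer's version against a configured minimum before accepting it. A minimal sketch of such a check follows. It is not Hadoop's VersionUtil: the class VersionCheck and its methods are hypothetical, and it assumes versions are purely numeric dotted strings such as 0.23.1 (suffixes like -SNAPSHOT are not handled).

```java
// Hypothetical version gate; assumes numeric dotted versions only.
final class VersionCheck {
    private VersionCheck() {}

    // Returns a negative, zero, or positive value as a is older than,
    // equal to, or newer than b. Missing components count as 0,
    // so "2.0" equals "2.0.0".
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // A peer may connect only if it is at least the minimum supported version.
    static boolean isCompatible(String peerVersion, String minimumSupported) {
        return compare(peerVersion, minimumSupported) >= 0;
    }
}
```

Under this scheme, marking a release incompatible with older ones means raising the configured minimum, while a rolling upgrade keeps the minimum at the oldest version still running in the cluster.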
[jira] [Assigned] (MAPREDUCE-4150) Versioning and rolling upgrades for Yarn/MR2
[ https://issues.apache.org/jira/browse/MAPREDUCE-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmed Radwan reassigned MAPREDUCE-4150: --- Assignee: Ahmed Radwan Versioning and rolling upgrades for Yarn/MR2 Key: MAPREDUCE-4150 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4150 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.1 Reporter: Ahmed Radwan Assignee: Ahmed Radwan It doesn't seem that Yarn components, for example the ResourceManager or NodeManager, do build/package version checking before trying to communicate with each other. The objective of this ticket is to support the following requirements / use cases: - New versions can be marked incompatible with old versions, and services should be prevented from communicating with each other in such case. This will avoid non-deterministic behavior/problems resulting from incompatible components trying to communicate with each other. - Permitting a policy for running different - but compatible - versions on the same cluster (for example, in a rolling upgrade scenario). See HDFS-2983 for the corresponding HDFS implementation.
[jira] [Commented] (MAPREDUCE-3972) Locking and exception issues in JobHistory Server.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253792#comment-13253792 ] Hadoop QA commented on MAPREDUCE-3972: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12522611/MR-3972.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified test files. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell org.apache.hadoop.yarn.server.TestDiskFailures org.apache.hadoop.yarn.server.TestContainerManagerSecurity org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService org.apache.hadoop.yarn.server.resourcemanager.resourcetracker.TestNMExpiry org.apache.hadoop.yarn.server.resourcemanager.TestAMAuthorization org.apache.hadoop.yarn.server.resourcemanager.TestApplicationACLs org.apache.hadoop.mapred.TestMiniMRClasspath org.apache.hadoop.mapreduce.v2.TestMRJobs org.apache.hadoop.mapred.TestMiniMRWithDFSWithDistinctUsers org.apache.hadoop.mapred.TestMiniMRBringup org.apache.hadoop.mapred.TestMiniMRChildTask org.apache.hadoop.mapred.TestReduceFetch org.apache.hadoop.mapred.TestClusterMRNotification org.apache.hadoop.mapred.TestReduceFetchFromPartialMem org.apache.hadoop.mapred.TestJobCounters org.apache.hadoop.mapreduce.TestChild org.apache.hadoop.mapred.TestMiniMRClientCluster org.apache.hadoop.ipc.TestSocketFactory 
org.apache.hadoop.mapreduce.v2.TestMRJobsWithHistoryService org.apache.hadoop.mapreduce.v2.TestMROldApiJobs org.apache.hadoop.mapreduce.v2.TestSpeculativeExecution org.apache.hadoop.mapreduce.lib.output.TestJobOutputCommitter org.apache.hadoop.mapred.TestClientRedirect org.apache.hadoop.mapred.TestLazyOutput org.apache.hadoop.mapred.TestJobCleanup org.apache.hadoop.mapreduce.TestMapReduceLazyOutput org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath org.apache.hadoop.mapreduce.v2.TestMRAppWithCombiner org.apache.hadoop.conf.TestNoDefaultsJobConf org.apache.hadoop.mapreduce.v2.TestRMNMInfo org.apache.hadoop.mapred.TestClusterMapReduceTestCase org.apache.hadoop.mapreduce.v2.TestNonExistentJob org.apache.hadoop.mapred.TestJobSysDirWithDFS org.apache.hadoop.mapreduce.v2.TestUberAM org.apache.hadoop.mapreduce.v2.TestMiniMRProxyUser org.apache.hadoop.mapred.TestJobName org.apache.hadoop.mapreduce.security.TestJHSSecurity +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2223//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2223//console This message is automatically generated. Locking and exception issues in JobHistory Server. -- Key: MAPREDUCE-3972 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3972 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2 Affects Versions: 0.23.2 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: MR-3972.txt, MR-3972.txt, MR-3972.txt, MR-3972.txt, MR-3972.txt The JobHistory server's locking is inconsistent and wrong in some cases. This is not super critical because the issues would only show up if a job is being cleaned up or moved from intermediate done to done, at the same time it is being parsed into a CompletedJob. However the locking is slowing down the server in some cases, and is a ticking time bomb that needs to be addressed. 
As part of this too we need to be sure that the Cleaner and Intermediate to Done migration threads
[jira] [Commented] (MAPREDUCE-4144) ResourceManager NPE while handling NODE_UPDATE
[ https://issues.apache.org/jira/browse/MAPREDUCE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253805#comment-13253805 ]

Siddharth Seth commented on MAPREDUCE-4144:

+1. lgtm. Thanks Jason. The same won't occur with off-switch requests since * seems to be a mandatory request - and is never removed from the request table.

ResourceManager NPE while handling NODE_UPDATE

Key: MAPREDUCE-4144
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4144
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
Attachments: MAPREDUCE-4144-testcase.patch, MAPREDUCE-4144.patch

The RM on one of our clusters has exited twice in the past few days because of an NPE while trying to handle a NODE_UPDATE:

{noformat}
2012-04-12 02:09:01,672 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler [ResourceManager Event Processor]java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateNodeLocal(AppSchedulingInfo.java:261)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:223)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApp.allocate(SchedulerApp.java:246)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1229)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignNodeLocalContainers(LeafQueue.java:1078)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1048)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignReservedContainer(LeafQueue.java:859)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:756)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:573)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:622)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:78)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:302)
    at java.lang.Thread.run(Thread.java:619)
{noformat}

This is very similar to the failure reported in MAPREDUCE-3005.
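The actual patch is not quoted in this thread, so the following is only a rough Python sketch of the class of bug the comment describes: a per-host entry that is removed from a request table once it is used up, followed by a later lookup that assumes the entry still exists. `RequestTable`, `try_assign_node_local` and the `ANY` constant are hypothetical names for illustration, not Hadoop APIs. The one detail taken from the comment is that the `*` (off-switch) entry is never removed.

```python
from typing import Dict, Optional

ANY = "*"  # off-switch key; per the comment above it is never removed


class RequestTable:
    """Hypothetical stand-in for a per-application request table."""

    def __init__(self) -> None:
        self._outstanding: Dict[str, int] = {}

    def add(self, resource_name: str, containers: int) -> None:
        current = self._outstanding.get(resource_name, 0)
        self._outstanding[resource_name] = current + containers

    def get(self, resource_name: str) -> Optional[int]:
        # Returns None when the entry has been removed (or never existed).
        return self._outstanding.get(resource_name)

    def allocate(self, resource_name: str) -> None:
        remaining = self._outstanding[resource_name] - 1
        if remaining == 0 and resource_name != ANY:
            # Host entries disappear once satisfied; this is what makes
            # an unguarded later lookup dangerous.
            del self._outstanding[resource_name]
        else:
            self._outstanding[resource_name] = remaining


def try_assign_node_local(table: RequestTable, host: str) -> bool:
    """Assign a node-local container only if a request for host still exists."""
    if table.get(host) is None:
        # Guard: without this check the allocate() call below would fail,
        # which is the Python analogue of the NPE in the trace.
        return False
    table.allocate(host)
    table.allocate(ANY)
    return True
```

Calling `try_assign_node_local` twice for a host with a single outstanding request succeeds the first time and returns `False` the second time instead of raising.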
[jira] [Commented] (MAPREDUCE-3972) Locking and exception issues in JobHistory Server.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253810#comment-13253810 ]

Robert Joseph Evans commented on MAPREDUCE-3972:

These tests are not related to the patch.

Locking and exception issues in JobHistory Server.

Key: MAPREDUCE-3972
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3972
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: mrv2
Affects Versions: 0.23.2
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Attachments: MR-3972.txt, MR-3972.txt, MR-3972.txt, MR-3972.txt, MR-3972.txt

The JobHistory server's locking is inconsistent and wrong in some cases. This is not super critical because the issues would only show up if a job is being cleaned up or moved from intermediate done to done at the same time it is being parsed into a CompletedJob. However, the locking is slowing down the server in some cases, and is a ticking time bomb that needs to be addressed.

As part of this, we also need to be sure that the Cleaner and Intermediate to Done migration threads handle exceptions properly. Currently it appears that the exception is logged and the thread just shuts down. This means that the history server could still be up and running for weeks and never remove old jobs.
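The exception-handling concern in the description can be illustrated with a small Python sketch. It is not the JobHistory server's code; `run_periodically` and the logger name are made up for the example. The point is only that the loop catches a failure per cycle, so one bad run is logged and the worker keeps going instead of shutting down silently.

```python
import logging
import threading
from typing import Callable

log = logging.getLogger("history-cleaner")  # hypothetical logger name


def run_periodically(task: Callable[[], None],
                     interval_s: float,
                     stop: threading.Event) -> None:
    """Run task repeatedly until stop is set.

    An exception from one run is logged and the loop continues, so a
    single failure does not end the worker for the life of the server.
    """
    while not stop.is_set():
        try:
            task()
        except Exception:
            log.exception("periodic task failed; retrying on the next cycle")
        # Sleep for the interval, waking early if a stop is requested.
        stop.wait(interval_s)
```

In a server this function would typically be the target of a daemon thread, for example `threading.Thread(target=run_periodically, args=(clean_old_jobs, 3600.0, stop), daemon=True)`, where `clean_old_jobs` is whatever cleanup callable the application provides.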
[jira] [Updated] (MAPREDUCE-4144) ResourceManager NPE while handling NODE_UPDATE
[ https://issues.apache.org/jira/browse/MAPREDUCE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated MAPREDUCE-4144:

Resolution: Fixed
Fix Version/s: 0.23.3
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)

Committed to trunk, branch-2 and branch-0.23

ResourceManager NPE while handling NODE_UPDATE

Key: MAPREDUCE-4144
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4144
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
Fix For: 0.23.3
Attachments: MAPREDUCE-4144-testcase.patch, MAPREDUCE-4144.patch
[jira] [Commented] (MAPREDUCE-4144) ResourceManager NPE while handling NODE_UPDATE
[ https://issues.apache.org/jira/browse/MAPREDUCE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253820#comment-13253820 ]

Hudson commented on MAPREDUCE-4144:

Integrated in Hadoop-Hdfs-trunk-Commit #2145 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2145/])
MAPREDUCE-4144. Fix a NPE in the ResourceManager when handling node updates. (Contributed by Jason Lowe) (Revision 1325991)

Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325991
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java

ResourceManager NPE while handling NODE_UPDATE

Key: MAPREDUCE-4144
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4144
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
Fix For: 0.23.3
Attachments: MAPREDUCE-4144-testcase.patch, MAPREDUCE-4144.patch
[jira] [Commented] (MAPREDUCE-4144) ResourceManager NPE while handling NODE_UPDATE
[ https://issues.apache.org/jira/browse/MAPREDUCE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253823#comment-13253823 ]

Hudson commented on MAPREDUCE-4144:

Integrated in Hadoop-Common-trunk-Commit #2072 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2072/])
MAPREDUCE-4144. Fix a NPE in the ResourceManager when handling node updates. (Contributed by Jason Lowe) (Revision 1325991)

Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325991
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java

ResourceManager NPE while handling NODE_UPDATE

Key: MAPREDUCE-4144
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4144
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
Fix For: 0.23.3
Attachments: MAPREDUCE-4144-testcase.patch, MAPREDUCE-4144.patch
[jira] [Commented] (MAPREDUCE-4144) ResourceManager NPE while handling NODE_UPDATE
[ https://issues.apache.org/jira/browse/MAPREDUCE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253835#comment-13253835 ]

Hudson commented on MAPREDUCE-4144:

Integrated in Hadoop-Mapreduce-trunk-Commit #2086 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2086/])
MAPREDUCE-4144. Fix a NPE in the ResourceManager when handling node updates. (Contributed by Jason Lowe) (Revision 1325991)

Result = FAILURE
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1325991
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java

ResourceManager NPE while handling NODE_UPDATE

Key: MAPREDUCE-4144
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4144
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
Fix For: 0.23.3
Attachments: MAPREDUCE-4144-testcase.patch, MAPREDUCE-4144.patch
[jira] [Created] (MAPREDUCE-4154) streaming MR job succeeds even if the streaming command fails
streaming MR job succeeds even if the streaming command fails

Key: MAPREDUCE-4154
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4154
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 1.0.2
Reporter: Thejas M Nair

Hadoop 1.0.1 behaves as expected: the task fails for a streaming MR job if the streaming command fails. But the task succeeds in Hadoop 1.0.2.
[jira] [Commented] (MAPREDUCE-4154) streaming MR job succeeds even if the streaming command fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-4154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253853#comment-13253853 ]

Thejas M Nair commented on MAPREDUCE-4154:

Example of a streaming job that succeeds with 1.0.2:

sudo -u templeton hadoop fs -rmr /tmp/t.out; sudo -u templeton /usr/bin/hadoop jar /usr/lib/hadoop/contrib/streaming/hadoop-streaming-1.0.2.jar -input /tmp/nums.txt -output /tmp/t.out -mapper *'/bin/ls no_such-file-12e3'* -reducer /usr/bin/wc

streaming MR job succeeds even if the streaming command fails

Key: MAPREDUCE-4154
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4154
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 1.0.2
Reporter: Thejas M Nair

Hadoop 1.0.1 behaves as expected: the task fails for a streaming MR job if the streaming command fails. But the task succeeds in Hadoop 1.0.2.
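The expected behaviour (a failing streaming command should fail the task) can be sketched outside Hadoop in a few lines of Python. This is an illustration of the exit-status check only, not Hadoop Streaming's implementation; `run_streaming_command` is a hypothetical helper.

```python
import subprocess
import sys
from typing import List


def run_streaming_command(argv: List[str], records: bytes) -> bytes:
    """Feed records to the command on stdin and return its stdout.

    A non-zero exit status is treated as a task failure, which is the
    behaviour the report says Hadoop 1.0.1 had and 1.0.2 lost.
    """
    proc = subprocess.run(argv, input=records,
                          stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if proc.returncode != 0:
        raise RuntimeError(
            f"command {argv!r} exited with status {proc.returncode}")
    return proc.stdout


if __name__ == "__main__":
    # Mirrors the mapper from the report: listing a missing file fails,
    # so the wrapper raises instead of reporting success.
    try:
        run_streaming_command(["/bin/ls", "no_such-file-12e3"], b"1\n2\n")
    except RuntimeError as err:
        print("task failed:", err, file=sys.stderr)
```

The design choice is to surface the child's exit status as an exception, so the caller cannot mistake an empty output for a successful run.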
[jira] [Resolved] (MAPREDUCE-3378) Create a single 'hadoop-mapreduce' Maven artifact
[ https://issues.apache.org/jira/browse/MAPREDUCE-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White resolved MAPREDUCE-3378.

Resolution: Won't Fix

I've opened HADOOP-8278 to track 1. HADOOP-8009 addressed 2. So I'm closing this JIRA now.

Create a single 'hadoop-mapreduce' Maven artifact

Key: MAPREDUCE-3378
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3378
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: build
Affects Versions: 0.23.0
Reporter: Tom White
Attachments: MAPREDUCE-3378.patch

In 0.23.0 there are multiple artifacts (hadoop-mapreduce-client-app, hadoop-mapreduce-client-common, hadoop-mapreduce-client-core, etc). It would be simpler for users to declare a dependency on hadoop-mapreduce (much like there's hadoop-common and hadoop-hdfs). (This would also be a step towards MAPREDUCE-2600.)
[jira] [Commented] (MAPREDUCE-4150) Versioning and rolling upgrades for Yarn/MR2
[ https://issues.apache.org/jira/browse/MAPREDUCE-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253963#comment-13253963 ]

Ahmed Radwan commented on MAPREDUCE-4150:

@Robert: Thanks for the suggestion! I agree that displaying such information on the web UI will be a very helpful next step. Let's plan on this after we are done with this ticket, so we know exactly what to display.

@ATM: Good point, I filed HADOOP-8280 to take care of that. Thanks!

Versioning and rolling upgrades for Yarn/MR2

Key: MAPREDUCE-4150
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4150
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv2
Affects Versions: 0.23.1
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan

It doesn't seem that Yarn components, for example the ResourceManager or NodeManager, do build/package version checking before trying to communicate with each other. The objective of this ticket is to support the following requirements / use cases:

- New versions can be marked incompatible with old versions, and services should be prevented from communicating with each other in such a case. This will avoid non-deterministic behavior/problems resulting from incompatible components trying to communicate with each other.
- Permitting a policy for running different - but compatible - versions on the same cluster (for example, in a rolling upgrade scenario).

See HDFS-2983 for the corresponding HDFS implementation.
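The two requirements above could be met with a check along the following lines. The ticket does not specify a design, so this Python sketch assumes one possible policy: peers must match exactly unless a minimum supported version is configured, in which case any peer at or above that minimum is accepted. `check_peer`, `min_supported` and `IncompatibleVersionError` are hypothetical names, not Yarn or HDFS APIs.

```python
from typing import Optional, Tuple


class IncompatibleVersionError(Exception):
    """Raised when a peer's version is rejected by the policy."""


def parse_version(text: str) -> Tuple[int, ...]:
    # Assumes purely numeric dotted versions such as "0.23.1"; suffixes
    # like "-SNAPSHOT" are out of scope for this sketch.
    return tuple(int(part) for part in text.split("."))


def check_peer(local_version: str,
               peer_version: str,
               min_supported: Optional[str] = None) -> None:
    """Reject a peer whose version the (assumed) policy marks incompatible.

    Without min_supported the check is strict equality. With it, mixed
    versions are allowed as long as the peer is not older than the
    minimum, which is what a rolling upgrade needs.
    """
    if peer_version == local_version:
        return
    if (min_supported is not None
            and parse_version(peer_version) >= parse_version(min_supported)):
        return
    raise IncompatibleVersionError(
        f"peer version {peer_version} is not compatible with {local_version}")
```

For example, `check_peer("0.23.3", "0.23.1", min_supported="0.23.0")` returns normally, while the same call without `min_supported` raises `IncompatibleVersionError`.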