[jira] [Commented] (MAPREDUCE-3736) Variable substitution depth too large for fs.default.name causes jobs to fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208296#comment-13208296 ] Hudson commented on MAPREDUCE-3736: --- Integrated in Hadoop-Common-trunk-Commit #1728 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1728/]) MAPREDUCE-3736. Variable substitution depth too large for fs.default.name causes jobs to fail (ahmed via tucu) (Revision 1244264) Result = SUCCESS tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1244264 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/resources/core-default.xml * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestMRWithDistributedCache.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/conf/TestNoDefaultsJobConf.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/JHLogAnalyzer.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/io/FileBench.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestCombineFileInputFormat.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestConcatenatedCompressedInput.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestTextInputFormat.java * 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/TestMapCollection.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestFileInputFormat.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestMRKeyValueTextInputFormat.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml Variable substitution depth too large for fs.default.name causes jobs to fail - Key: MAPREDUCE-3736 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3736 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: Eli Collins Assignee: Ahmed Radwan Priority: Blocker Attachments: MAPREDUCE-3736.patch, MAPREDUCE-3736_rev2.patch, MAPREDUCE-3736_rev3.patch I'm seeing the same failure as MAPREDUCE-3462 in downstream projects running against a recent build of branch-23. MR-3462 modified the tests rather than fixing the framework. In that jira Ravi mentioned I'm still ignorant of the change which made the tests start to fail. I should probably understand better the reasons for that change before proposing a more generalized fix. Let's figure out the general fix (rather than require all projects to set mapreduce.job.hdfs-servers in their conf we should fix this in the framework). Perhaps we should not default this config to $fs.default.name? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
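For readers unfamiliar with the error in the issue title, here is a minimal sketch of depth-limited variable substitution. It is not the Hadoop implementation. The function name substitute, the limit of 20, the host hdfs://nn:8020 and the cyclic properties a and b are all assumptions made for illustration. Only the property names mapreduce.job.hdfs-servers and fs.default.name come from the report.

```python
import re

MAX_SUBST = 20  # assumed depth limit, chosen for illustration only
_VAR = re.compile(r"\$\{([^}$\s]+)\}")


def substitute(props, expr, max_depth=MAX_SUBST):
    """Expand ${name} references in expr from props, one reference per pass."""
    for _ in range(max_depth):
        match = _VAR.search(expr)
        if match is None:
            return expr  # nothing left to expand
        value = props.get(match.group(1))
        if value is None:
            return expr  # unknown variable: leave the reference in place
        expr = expr[:match.start()] + value + expr[match.end():]
    # Every pass found something to expand, so the chain is too deep or cyclic.
    raise ValueError(
        "Variable substitution depth too large: %d %s" % (max_depth, expr))


# A default that points at another property resolves in two passes.
resolvable = {
    "fs.default.name": "hdfs://nn:8020",  # hypothetical value
    "mapreduce.job.hdfs-servers": "${fs.default.name}",
}

# Hypothetical cycle: expansion never finishes, so the depth limit trips.
cyclic = {"a": "${b}", "b": "${a}"}
```

Calling substitute(resolvable, "${mapreduce.job.hdfs-servers}") returns the file system URI, while substitute(cyclic, "${a}") raises once the pass budget is used up. Whether the failure in this issue came from a cycle or from some other unresolvable chain is not stated in the report.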
[jira] [Commented] (MAPREDUCE-3858) Task attempt failure during commit results in task never completing
[ https://issues.apache.org/jira/browse/MAPREDUCE-3858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208303#comment-13208303 ] Hudson commented on MAPREDUCE-3858: --- Integrated in Hadoop-Common-0.23-Commit #552 (See [https://builds.apache.org/job/Hadoop-Common-0.23-Commit/552/]) MAPREDUCE-3858. Task attempt failure during commit results in task never completing. (Tom White via mahadev) - Merging r1244254 from trunk. (Revision 1244255) Result = SUCCESS mahadev : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1244255 Files : * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java Task attempt failure during commit results in task never completing --- Key: MAPREDUCE-3858 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3858 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Reporter: Tom White Assignee: Tom White Priority: Critical Fix For: 0.23.1 Attachments: MAPREDUCE-3858.patch On a terasort job a task attempt failed during the commit phase. Another attempt was rescheduled, but when it tried to commit it failed. {noformat} attempt_1329019187148_0083_r_000586_0 already given a go for committing the task output, so killing attempt_1329019187148_0083_r_000586_1 {noformat} The job hung as new attempts kept getting scheduled only to fail during commit. -- This message is automatically generated by JIRA. 
[jira] [Commented] (MAPREDUCE-3736) Variable substitution depth too large for fs.default.name causes jobs to fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208301#comment-13208301 ] Hudson commented on MAPREDUCE-3736: --- Integrated in Hadoop-Common-0.23-Commit #552 (See [https://builds.apache.org/job/Hadoop-Common-0.23-Commit/552/]) Merge -r 1244263:1244264 from trunk to branch. FIXES: MAPREDUCE-3736 (Revision 1244265) Result = SUCCESS tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1244265 Files : * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/resources/core-default.xml * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestMRWithDistributedCache.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/conf/TestNoDefaultsJobConf.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/JHLogAnalyzer.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/io/FileBench.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestCombineFileInputFormat.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestConcatenatedCompressedInput.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestTextInputFormat.java * 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/TestMapCollection.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestFileInputFormat.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestMRKeyValueTextInputFormat.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml Variable substitution depth too large for fs.default.name causes jobs to fail - Key: MAPREDUCE-3736 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3736 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: Eli Collins Assignee: Ahmed Radwan Priority: Blocker Attachments: MAPREDUCE-3736.patch, MAPREDUCE-3736_rev2.patch, MAPREDUCE-3736_rev3.patch I'm seeing the same failure as MAPREDUCE-3462 in downstream projects running against a recent build of branch-23. MR-3462 modified the tests rather than fixing the framework. In that jira Ravi mentioned I'm still ignorant of the change which made the tests start to fail. I should probably understand better the reasons for that change before proposing a more generalized fix. Let's figure out the general fix (rather than require all projects to set mapreduce.job.hdfs-servers in their conf we should fix this in the framework). Perhaps we should not default this config to $fs.default.name? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3854) Reinstate environment variable tests in TestMiniMRChildTask
[ https://issues.apache.org/jira/browse/MAPREDUCE-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208302#comment-13208302 ] Hudson commented on MAPREDUCE-3854: --- Integrated in Hadoop-Common-0.23-Commit #552 (See [https://builds.apache.org/job/Hadoop-Common-0.23-Commit/552/]) MAPREDUCE-3854. Fixed and reenabled tests related to MR child JVM's environmental variables in TestMiniMRChildTask. (Tom White via vinodkv) svn merge --ignore-ancestry -c 1244223 ../../trunk/ (Revision 1244224) Result = SUCCESS vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1244224 Files : * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMiniMRChildTask.java Reinstate environment variable tests in TestMiniMRChildTask --- Key: MAPREDUCE-3854 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3854 Project: Hadoop Map/Reduce Issue Type: Test Components: mrv2 Reporter: Tom White Assignee: Tom White Fix For: 0.23.1 Attachments: MAPREDUCE-3854.patch, MAPREDUCE-3854.patch MAPREDUCE-3716 reinstated one of the tests in TestMiniMRChildTask, but there are two more which should be run. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3634) All daemons should crash instead of hanging around when their EventHandlers get exceptions
[ https://issues.apache.org/jira/browse/MAPREDUCE-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208319#comment-13208319 ] Hadoop QA commented on MAPREDUCE-3634: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514598/MAPREDUCE-3634-20120214.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 30 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.TestContainersMonitor +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1857//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1857//console This message is automatically generated. All daemons should crash instead of hanging around when their EventHandlers get exceptions -- Key: MAPREDUCE-3634 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3634 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Fix For: 0.23.2 Attachments: MAPREDUCE-3634-20120118.1.txt, MAPREDUCE-3634-20120119.txt, MAPREDUCE-3634-20120214.txt We should make sure that the daemons crash in case the dispatchers get exceptions and stop processing. That way we will be debugging RM/NM/AM crashes instead of hard-to-track hanging jobs. 
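The behaviour requested in the description can be sketched as a dispatcher that stops and reports as soon as a handler throws, instead of carrying on silently. This is a hedged illustration, not the YARN dispatcher: the class Dispatcher, the on_fatal callback and the event values are hypothetical names. In a real daemon the callback would terminate the process.

```python
import queue


class Dispatcher:
    """Hypothetical event loop that treats any handler exception as fatal."""

    def __init__(self, handler, on_fatal):
        self.handler = handler
        self.on_fatal = on_fatal  # a real daemon would exit the process here
        self.events = queue.Queue()

    def run(self):
        """Process queued events until the queue is empty or a handler fails."""
        while not self.events.empty():
            event = self.events.get()
            try:
                self.handler(event)
            except Exception as exc:
                # Fail fast: report the failure and stop dispatching, so the
                # problem surfaces as a crash rather than as a hung job.
                self.on_fatal(event, exc)
                return False
        return True
```

The point of the design is that a crash leaves an exception and a dead daemon to look at, whereas a dispatcher that stops processing quietly leaves only a job that never finishes.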
[jira] [Commented] (MAPREDUCE-3858) Task attempt failure during commit results in task never completing
[ https://issues.apache.org/jira/browse/MAPREDUCE-3858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208326#comment-13208326 ] Hudson commented on MAPREDUCE-3858: --- Integrated in Hadoop-Mapreduce-0.23-Commit #556 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/556/]) MAPREDUCE-3858. Task attempt failure during commit results in task never completing. (Tom White via mahadev) - Merging r1244254 from trunk. (Revision 1244255) Result = ABORTED mahadev : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1244255 Files : * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java Task attempt failure during commit results in task never completing --- Key: MAPREDUCE-3858 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3858 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Reporter: Tom White Assignee: Tom White Priority: Critical Fix For: 0.23.1 Attachments: MAPREDUCE-3858.patch On a terasort job a task attempt failed during the commit phase. Another attempt was rescheduled, but when it tried to commit it failed. {noformat} attempt_1329019187148_0083_r_000586_0 already given a go for committing the task output, so killing attempt_1329019187148_0083_r_000586_1 {noformat} The job hung as new attempts kept getting scheduled only to fail during commit. -- This message is automatically generated by JIRA. 
[jira] [Commented] (MAPREDUCE-3736) Variable substitution depth too large for fs.default.name causes jobs to fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208327#comment-13208327 ] Hudson commented on MAPREDUCE-3736: --- Integrated in Hadoop-Mapreduce-trunk-Commit #1740 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1740/]) MAPREDUCE-3736. Variable substitution depth too large for fs.default.name causes jobs to fail (ahmed via tucu) (Revision 1244264) Result = ABORTED tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1244264 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/resources/core-default.xml * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestMRWithDistributedCache.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/conf/TestNoDefaultsJobConf.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/JHLogAnalyzer.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/io/FileBench.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestCombineFileInputFormat.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestConcatenatedCompressedInput.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestTextInputFormat.java * 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/TestMapCollection.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestFileInputFormat.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestMRKeyValueTextInputFormat.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml Variable substitution depth too large for fs.default.name causes jobs to fail - Key: MAPREDUCE-3736 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3736 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: Eli Collins Assignee: Ahmed Radwan Priority: Blocker Attachments: MAPREDUCE-3736.patch, MAPREDUCE-3736_rev2.patch, MAPREDUCE-3736_rev3.patch I'm seeing the same failure as MAPREDUCE-3462 in downstream projects running against a recent build of branch-23. MR-3462 modified the tests rather than fixing the framework. In that jira Ravi mentioned I'm still ignorant of the change which made the tests start to fail. I should probably understand better the reasons for that change before proposing a more generalized fix. Let's figure out the general fix (rather than require all projects to set mapreduce.job.hdfs-servers in their conf we should fix this in the framework). Perhaps we should not default this config to $fs.default.name? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3858) Task attempt failure during commit results in task never completing
[ https://issues.apache.org/jira/browse/MAPREDUCE-3858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208328#comment-13208328 ] Hudson commented on MAPREDUCE-3858: --- Integrated in Hadoop-Mapreduce-trunk-Commit #1740 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1740/]) MAPREDUCE-3858. Task attempt failure during commit results in task never completing. (Tom White via mahadev) (Revision 1244254) Result = ABORTED mahadev : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1244254 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java Task attempt failure during commit results in task never completing --- Key: MAPREDUCE-3858 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3858 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Reporter: Tom White Assignee: Tom White Priority: Critical Fix For: 0.23.1 Attachments: MAPREDUCE-3858.patch On a terasort job a task attempt failed during the commit phase. Another attempt was rescheduled, but when it tried to commit it failed. {noformat} attempt_1329019187148_0083_r_000586_0 already given a go for committing the task output, so killing attempt_1329019187148_0083_r_000586_1 {noformat} The job hung as new attempts kept getting scheduled only to fail during commit. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
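One way the hang described above could arise is sketched below. This is an assumption about the mechanism, not the TaskImpl code touched by the patch. The class Task, its methods and the release_on_failure flag are hypothetical. The idea is that the task remembers which attempt was given the go to commit, and if that grant is never released when the attempt fails, every later attempt is refused.

```python
class Task:
    """Tracks which attempt, if any, holds permission to commit task output."""

    def __init__(self, release_on_failure):
        self.release_on_failure = release_on_failure
        self.commit_attempt = None  # attempt id currently allowed to commit

    def can_commit(self, attempt_id):
        """Grant the commit to the first attempt that asks; refuse others."""
        if self.commit_attempt is None:
            self.commit_attempt = attempt_id
        return self.commit_attempt == attempt_id

    def attempt_failed(self, attempt_id):
        """Release the commit grant if the failed attempt was holding it."""
        if self.release_on_failure and self.commit_attempt == attempt_id:
            self.commit_attempt = None


# Without the release, attempt _1 is refused after attempt _0 fails.
stuck = Task(release_on_failure=False)
stuck.can_commit("r_000586_0")
stuck.attempt_failed("r_000586_0")

# With the release, attempt _1 can take over the commit.
fixed = Task(release_on_failure=True)
fixed.can_commit("r_000586_0")
fixed.attempt_failed("r_000586_0")
```

With stuck, can_commit("r_000586_1") returns False for every new attempt, which matches the "already given a go for committing" message in the log. With fixed it returns True.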
[jira] [Commented] (MAPREDUCE-3736) Variable substitution depth too large for fs.default.name causes jobs to fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208325#comment-13208325 ] Hudson commented on MAPREDUCE-3736: --- Integrated in Hadoop-Mapreduce-0.23-Commit #556 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/556/]) Merge -r 1244263:1244264 from trunk to branch. FIXES: MAPREDUCE-3736 (Revision 1244265) Result = ABORTED tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1244265 Files : * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/resources/core-default.xml * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestMRWithDistributedCache.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/conf/TestNoDefaultsJobConf.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/JHLogAnalyzer.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/io/FileBench.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestCombineFileInputFormat.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestConcatenatedCompressedInput.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestTextInputFormat.java * 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/TestMapCollection.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestFileInputFormat.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestMRKeyValueTextInputFormat.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml Variable substitution depth too large for fs.default.name causes jobs to fail - Key: MAPREDUCE-3736 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3736 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: Eli Collins Assignee: Ahmed Radwan Priority: Blocker Attachments: MAPREDUCE-3736.patch, MAPREDUCE-3736_rev2.patch, MAPREDUCE-3736_rev3.patch I'm seeing the same failure as MAPREDUCE-3462 in downstream projects running against a recent build of branch-23. MR-3462 modified the tests rather than fixing the framework. In that jira Ravi mentioned I'm still ignorant of the change which made the tests start to fail. I should probably understand better the reasons for that change before proposing a more generalized fix. Let's figure out the general fix (rather than require all projects to set mapreduce.job.hdfs-servers in their conf we should fix this in the framework). Perhaps we should not default this config to $fs.default.name? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3761) AM info in job -list does not reflect the actual AM hostname
[ https://issues.apache.org/jira/browse/MAPREDUCE-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208330#comment-13208330 ] Hadoop QA commented on MAPREDUCE-3761: -- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514547/MAPREDUCE-3761-20120214.1.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1856//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1856//console This message is automatically generated. AM info in job -list does not reflect the actual AM hostname Key: MAPREDUCE-3761 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3761 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.1 Reporter: Ramya Sunil Assignee: Vinod Kumar Vavilapalli Fix For: 0.23.1 Attachments: MAPREDUCE-3761-20120202.txt, MAPREDUCE-3761-20120214.1.txt The AM info field on bin/mapred job -list currently has a value resourcemanager hostname:8088/proxy/appID. This info is irrelevant unless it shows the real information of where the AM was launched. This needs to be fixed to show the AM host details. -- This message is automatically generated by JIRA. 
[jira] [Commented] (MAPREDUCE-3849) Change TokenCache's reading of the binary token file
[ https://issues.apache.org/jira/browse/MAPREDUCE-3849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208331#comment-13208331 ] Hadoop QA commented on MAPREDUCE-3849: -- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514555/MAPREDUCE-3849-2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1855//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1855//console This message is automatically generated. Change TokenCache's reading of the binary token file Key: MAPREDUCE-3849 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3849 Project: Hadoop Map/Reduce Issue Type: Improvement Components: security Affects Versions: 0.23.1, 0.24.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: MAPREDUCE-3849-2.patch, MAPREDUCE-3849.patch When obtaining the tokens for a {{FileSystem}}, the {{TokenCache}} will read the binary token file if a token is not already in the {{Credentials}}. However, it will overwrite any existing tokens in the {{Credentials}} with the contents of the binary token file if a single token is missing. This may cause new tokens to be replaced with invalid/cancelled tokens from the binary file. 
The new tokens will not be canceled, and thus leak in the namenode until they expire. The binary tokens should be merged with, but not replace, existing tokens in the {{Credentials}}. The code that reads the binary token file is prefaced with: {code} //TODO: Need to come up with a better place to put //this block of code to do with reading the file {code} Also, the loading of the binary token file is the only reason that the {{TokenCache}} has to use {{getCanonicalService}}. If this linkage can be broken, then the 1-to-1 filesystem to token service coupling may be removed. And use of {{getCanonicalService}} can be removed in a subsequent jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
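The difference between replacing and merging can be sketched with plain dictionaries. This is an illustration only: it does not use the Hadoop Credentials or TokenCache classes, and the function names, service keys and token strings are hypothetical.

```python
def overwrite_tokens(existing, from_binary_file):
    """Behaviour described as the bug: file tokens replace existing ones."""
    merged = dict(existing)
    merged.update(from_binary_file)  # stale file tokens win
    return merged


def merge_tokens(existing, from_binary_file):
    """Proposed behaviour: file tokens only fill in what is missing."""
    merged = dict(from_binary_file)
    merged.update(existing)  # tokens already held win
    return merged


existing = {"hdfs://nn1": "new-token"}
from_binary_file = {"hdfs://nn1": "stale-token", "hdfs://nn2": "file-token"}
```

Here merge_tokens keeps "new-token" for hdfs://nn1 and adds the missing hdfs://nn2 entry, while overwrite_tokens swaps the new token for the stale one, leaving the new token uncancelled as described above.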
[jira] [Updated] (MAPREDUCE-3757) Rumen Folder is not adjusting the shuffleFinished and sortFinished times of reduce task attempts
[ https://issues.apache.org/jira/browse/MAPREDUCE-3757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated MAPREDUCE-3757:

    Attachment: 3757.v1.patch

Attaching a new patch in which the gold trace expected by a Folder test case is modified to reflect this bug fix.

Rumen Folder is not adjusting the shuffleFinished and sortFinished times of reduce task attempts

Key: MAPREDUCE-3757
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3757
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
Attachments: 3757.v0.patch, 3757.v1.patch, 3757.v1.patch

Rumen Folder does not adjust the shuffleFinished and sortFinished times of reduce task attempts when it adjusts the attempt-start-time and attempt-finish-time. This leads to wrong values, greater than the attempt-finish-time, in the trace file.
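The invariant behind this report is that a reduce attempt's four timestamps must move together. The sketch below assumes the adjustment is a simple shift by a fixed offset, which may not be how Rumen's Folder actually rescales times. The ReduceAttempt type, its field names and the numbers are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ReduceAttempt:
    start_time: int
    shuffle_finished: int
    sort_finished: int
    finish_time: int


def shift_attempt(attempt, offset):
    """Shift all four timestamps by the same offset so their order is kept."""
    return replace(
        attempt,
        start_time=attempt.start_time + offset,
        shuffle_finished=attempt.shuffle_finished + offset,
        sort_finished=attempt.sort_finished + offset,
        finish_time=attempt.finish_time + offset,
    )


original = ReduceAttempt(start_time=1000, shuffle_finished=1400,
                         sort_finished=1500, finish_time=2000)
shifted = shift_attempt(original, -900)
```

If only start_time and finish_time were shifted, shuffle_finished would stay at 1400 while finish_time dropped to 1100, which is the kind of inconsistent trace the report describes.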
[jira] [Commented] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208337#comment-13208337 ]
Sergey Tryuber commented on MAPREDUCE-3859:
Arun, no, this is not a 'user-limit-factor' issue. That option works perfectly when a job consumes only 1 slot per task, and we have been using it for a while. This bug only affects jobs that consume more than one slot per task.
Harsh, which version should I try to patch? Is it branch-1.0? Or trunk too?
CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
Key: MAPREDUCE-3859
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/capacity-sched
Affects Versions: 1.0.0
Environment: CDH3u1
Reporter: Sergey Tryuber
Imagine we have a queue A with a capacity of 10 slots and 20 as extra-capacity. Jobs which use 3 map slots will never consume more than 9 slots, regardless of how many free slots there are on the cluster.
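The figure of 9 out of 10 slots is what you get if only whole 3-slot tasks are admitted under the queue's base capacity. The sketch below is just arithmetic on the numbers from the report, not the CapacityScheduler's code; the method name and the limit of 20 in the second call are assumptions chosen for illustration.

```java
class SlotCapSketch {
    /** Slots used when only whole tasks of slotsPerTask fit under the given limit. */
    static int usableSlots(int limit, int slotsPerTask) {
        return (limit / slotsPerTask) * slotsPerTask;
    }

    public static void main(String[] args) {
        int slotsPerTask = 3;

        // Capped at the base capacity of 10: three tasks fit, a fourth does not.
        System.out.println(usableSlots(10, slotsPerTask)); // prints 9

        // If a larger limit (here 20, an assumed value) were honoured,
        // more tasks would fit.
        System.out.println(usableSlots(20, slotsPerTask)); // prints 18
    }
}
```

With 1-slot tasks the same function returns the full limit, which is consistent with the comment that the problem only shows up when a task needs more than one slot.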
[jira] [Created] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests too
[Rumen] Bring back the removed Rumen unit tests too
Key: MAPREDUCE-3860
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: tools/rumen
Reporter: Ravi Gummadi
MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need to be brought back:
TestZombieJob.java
TestRumenJobTraces.java
TestRumenFolder.java
TestRumenAnonymization.java
TestParsedLine.java
TestConcurrentRead.java
[jira] [Commented] (MAPREDUCE-3583) ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException
[ https://issues.apache.org/jira/browse/MAPREDUCE-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208356#comment-13208356 ] Hadoop QA commented on MAPREDUCE-3583: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514572/mapreduce-3583-trunk.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. -1 findbugs. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1858//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1858//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1858//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1858//console This message is automatically generated. 
ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException - Key: MAPREDUCE-3583 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3583 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.205.0 Environment: 64-bit Linux: asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux Reporter: Zhihong Yu Assignee: ramkrishna.s.vasudevan Priority: Critical Attachments: mapreduce-3583-trunk.txt, mapreduce-3583-v2.txt, mapreduce-3583-v3.txt, mapreduce-3583-v4.txt, mapreduce-3583-v5.txt, mapreduce-3583.txt HBase PreCommit builds frequently gave us NumberFormatException. From https://builds.apache.org/job/PreCommit-HBASE-Build/553//testReport/org.apache.hadoop.hbase.mapreduce/TestHFileOutputFormat/testMRIncrementalLoad/: {code} 2011-12-20 01:44:01,180 WARN [main] mapred.JobClient(784): No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String). java.lang.NumberFormatException: For input string: 18446743988060683582 at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) at java.lang.Long.parseLong(Long.java:422) at java.lang.Long.parseLong(Long.java:468) at org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413) at org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148) at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401) at org.apache.hadoop.mapred.Task.initialize(Task.java:536) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083) at org.apache.hadoop.mapred.Child.main(Child.java:249) {code} From 
hadoop 0.20.205 source code, looks like ppid was 18446743988060683582, causing NFE: {code} // Set (name) (ppid) (pgrpId) (session) (utime) (stime) (vsize) (rss) pinfo.updateProcessInfo(m.group(2), Integer.parseInt(m.group(3)), {code} You can find information on the OS at the beginning of https://builds.apache.org/job/PreCommit-HBASE-Build/553/console: {code} asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 20 file size (blocks, -f) unlimited pending signals (-i) 16382 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files
[jira] [Updated] (MAPREDUCE-3583) ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException
[ https://issues.apache.org/jira/browse/MAPREDUCE-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated MAPREDUCE-3583: -- Attachment: mapreduce-3583-trunk-v2.txt Patch v2 for TRUNK. Patch v1 missed PROCESSTREE_DUMP_FORMAT in hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/util/ProcfsBasedProcessTree.java ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException - Key: MAPREDUCE-3583 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3583 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.205.0 Environment: 64-bit Linux: asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux Reporter: Zhihong Yu Assignee: ramkrishna.s.vasudevan Priority: Critical Attachments: mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk.txt, mapreduce-3583-v2.txt, mapreduce-3583-v3.txt, mapreduce-3583-v4.txt, mapreduce-3583-v5.txt, mapreduce-3583.txt HBase PreCommit builds frequently gave us NumberFormatException. From https://builds.apache.org/job/PreCommit-HBASE-Build/553//testReport/org.apache.hadoop.hbase.mapreduce/TestHFileOutputFormat/testMRIncrementalLoad/: {code} 2011-12-20 01:44:01,180 WARN [main] mapred.JobClient(784): No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String). 
java.lang.NumberFormatException: For input string: 18446743988060683582 at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) at java.lang.Long.parseLong(Long.java:422) at java.lang.Long.parseLong(Long.java:468) at org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413) at org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148) at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401) at org.apache.hadoop.mapred.Task.initialize(Task.java:536) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083) at org.apache.hadoop.mapred.Child.main(Child.java:249) {code} From hadoop 0.20.205 source code, looks like ppid was 18446743988060683582, causing NFE: {code} // Set (name) (ppid) (pgrpId) (session) (utime) (stime) (vsize) (rss) pinfo.updateProcessInfo(m.group(2), Integer.parseInt(m.group(3)), {code} You can find information on the OS at the beginning of https://builds.apache.org/job/PreCommit-HBASE-Build/553/console: {code} asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 20 file size (blocks, -f) unlimited pending signals (-i) 16382 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 6 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 2048 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited 6 Running 
in Jenkins mode {code} From Nicolas Sze: {noformat} It looks like that the ppid is a 64-bit positive integer but Java long is signed and so only works with 63-bit positive integers. In your case, 2^64 > 18446743988060683582 > 2^63. Therefore, there is a NFE. {noformat} I propose changing allProcessInfo to Map<String, ProcessInfo> so that we don't encounter this problem by avoiding parsing large integer.
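The range argument above can be checked with a short, self-contained sketch. It does not use any Hadoop classes; the class name, the map, and its value type are hypothetical. It shows that the reported ppid string lies between 2^63 and 2^64, so Long.parseLong rejects it, while a map keyed by the raw string never needs to parse it.

```java
import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;

class UnsignedPpidSketch {
    /** Returns true if the decimal string parses as a signed 64-bit long. */
    static boolean fitsInSignedLong(String digits) {
        try {
            Long.parseLong(digits);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String ppid = "18446743988060683582"; // value from the stack trace

        // Too large for a signed long, so parsing throws NumberFormatException.
        System.out.println(fitsInSignedLong(ppid)); // prints false

        // The value sits strictly between 2^63 and 2^64.
        BigInteger value = new BigInteger(ppid);
        BigInteger twoTo63 = BigInteger.ONE.shiftLeft(63);
        BigInteger twoTo64 = BigInteger.ONE.shiftLeft(64);
        System.out.println(value.compareTo(twoTo63) > 0
            && value.compareTo(twoTo64) < 0); // prints true

        // Keying by the raw string avoids parsing altogether
        // (the String value is a stand-in for a real process-info object).
        Map<String, String> processInfoById = new HashMap<>();
        processInfoById.put(ppid, "hypothetical process info");
        System.out.println(processInfoById.containsKey(ppid)); // prints true
    }
}
```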
[jira] [Updated] (MAPREDUCE-3583) ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException
[ https://issues.apache.org/jira/browse/MAPREDUCE-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated MAPREDUCE-3583: -- Attachment: mapreduce-3583-trunk-v2.txt Reattaching patch v2 for TRUNK with --no-prefix ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException - Key: MAPREDUCE-3583 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3583 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.205.0 Environment: 64-bit Linux: asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux Reporter: Zhihong Yu Assignee: ramkrishna.s.vasudevan Priority: Critical Attachments: mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk.txt, mapreduce-3583-v2.txt, mapreduce-3583-v3.txt, mapreduce-3583-v4.txt, mapreduce-3583-v5.txt, mapreduce-3583.txt HBase PreCommit builds frequently gave us NumberFormatException. From https://builds.apache.org/job/PreCommit-HBASE-Build/553//testReport/org.apache.hadoop.hbase.mapreduce/TestHFileOutputFormat/testMRIncrementalLoad/: {code} 2011-12-20 01:44:01,180 WARN [main] mapred.JobClient(784): No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String). 
java.lang.NumberFormatException: For input string: 18446743988060683582 at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) at java.lang.Long.parseLong(Long.java:422) at java.lang.Long.parseLong(Long.java:468) at org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413) at org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148) at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401) at org.apache.hadoop.mapred.Task.initialize(Task.java:536) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083) at org.apache.hadoop.mapred.Child.main(Child.java:249) {code} From hadoop 0.20.205 source code, looks like ppid was 18446743988060683582, causing NFE: {code} // Set (name) (ppid) (pgrpId) (session) (utime) (stime) (vsize) (rss) pinfo.updateProcessInfo(m.group(2), Integer.parseInt(m.group(3)), {code} You can find information on the OS at the beginning of https://builds.apache.org/job/PreCommit-HBASE-Build/553/console: {code} asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 20 file size (blocks, -f) unlimited pending signals (-i) 16382 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 6 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 2048 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited 6 Running 
in Jenkins mode {code} From Nicolas Sze: {noformat} It looks like that the ppid is a 64-bit positive integer but Java long is signed and so only works with 63-bit positive integers. In your case, 2^64 > 18446743988060683582 > 2^63. Therefore, there is a NFE. {noformat} I propose changing allProcessInfo to Map<String, ProcessInfo> so that we don't encounter this problem by avoiding parsing large integer.
[jira] [Commented] (MAPREDUCE-3583) ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException
[ https://issues.apache.org/jira/browse/MAPREDUCE-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208403#comment-13208403 ] Hadoop QA commented on MAPREDUCE-3583: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514624/mapreduce-3583-trunk-v2.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1859//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1859//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1859//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1859//console This message is automatically generated. 
ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException - Key: MAPREDUCE-3583 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3583 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.205.0 Environment: 64-bit Linux: asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux Reporter: Zhihong Yu Assignee: ramkrishna.s.vasudevan Priority: Critical Attachments: mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk.txt, mapreduce-3583-v2.txt, mapreduce-3583-v3.txt, mapreduce-3583-v4.txt, mapreduce-3583-v5.txt, mapreduce-3583.txt HBase PreCommit builds frequently gave us NumberFormatException. From https://builds.apache.org/job/PreCommit-HBASE-Build/553//testReport/org.apache.hadoop.hbase.mapreduce/TestHFileOutputFormat/testMRIncrementalLoad/: {code} 2011-12-20 01:44:01,180 WARN [main] mapred.JobClient(784): No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String). 
java.lang.NumberFormatException: For input string: 18446743988060683582 at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) at java.lang.Long.parseLong(Long.java:422) at java.lang.Long.parseLong(Long.java:468) at org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413) at org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148) at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401) at org.apache.hadoop.mapred.Task.initialize(Task.java:536) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083) at org.apache.hadoop.mapred.Child.main(Child.java:249) {code} From hadoop 0.20.205 source code, looks like ppid was 18446743988060683582, causing NFE: {code} // Set (name) (ppid) (pgrpId) (session) (utime) (stime) (vsize) (rss) pinfo.updateProcessInfo(m.group(2), Integer.parseInt(m.group(3)), {code} You can find information on the OS at the beginning of https://builds.apache.org/job/PreCommit-HBASE-Build/553/console: {code} asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 20 file size (blocks, -f) unlimited pending signals (-i) 16382 max locked memory (kbytes, -l) 64 max memory
[jira] [Commented] (MAPREDUCE-3583) ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException
[ https://issues.apache.org/jira/browse/MAPREDUCE-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208409#comment-13208409 ] Hadoop QA commented on MAPREDUCE-3583: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514626/mapreduce-3583-trunk-v2.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1860//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1860//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1860//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1860//console This message is automatically generated. 
ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException - Key: MAPREDUCE-3583 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3583 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.205.0 Environment: 64-bit Linux: asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux Reporter: Zhihong Yu Assignee: ramkrishna.s.vasudevan Priority: Critical Attachments: mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk.txt, mapreduce-3583-v2.txt, mapreduce-3583-v3.txt, mapreduce-3583-v4.txt, mapreduce-3583-v5.txt, mapreduce-3583.txt HBase PreCommit builds frequently gave us NumberFormatException. From https://builds.apache.org/job/PreCommit-HBASE-Build/553//testReport/org.apache.hadoop.hbase.mapreduce/TestHFileOutputFormat/testMRIncrementalLoad/: {code} 2011-12-20 01:44:01,180 WARN [main] mapred.JobClient(784): No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String). 
java.lang.NumberFormatException: For input string: 18446743988060683582 at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) at java.lang.Long.parseLong(Long.java:422) at java.lang.Long.parseLong(Long.java:468) at org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413) at org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148) at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401) at org.apache.hadoop.mapred.Task.initialize(Task.java:536) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083) at org.apache.hadoop.mapred.Child.main(Child.java:249) {code} From hadoop 0.20.205 source code, looks like ppid was 18446743988060683582, causing NFE: {code} // Set (name) (ppid) (pgrpId) (session) (utime) (stime) (vsize) (rss) pinfo.updateProcessInfo(m.group(2), Integer.parseInt(m.group(3)), {code} You can find information on the OS at the beginning of https://builds.apache.org/job/PreCommit-HBASE-Build/553/console: {code} asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 20 file size (blocks, -f) unlimited pending signals (-i) 16382 max locked memory (kbytes, -l) 64 max memory
[jira] [Commented] (MAPREDUCE-3736) Variable substitution depth too large for fs.default.name causes jobs to fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208421#comment-13208421 ] Hudson commented on MAPREDUCE-3736: --- Integrated in Hadoop-Hdfs-trunk #956 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/956/]) MAPREDUCE-3736. Variable substitution depth too large for fs.default.name causes jobs to fail (ahmed via tucu) (Revision 1244264) Result = FAILURE tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1244264 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/resources/core-default.xml * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestMRWithDistributedCache.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/conf/TestNoDefaultsJobConf.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/JHLogAnalyzer.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/io/FileBench.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestCombineFileInputFormat.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestConcatenatedCompressedInput.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestTextInputFormat.java * 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/TestMapCollection.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestFileInputFormat.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestMRKeyValueTextInputFormat.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
Variable substitution depth too large for fs.default.name causes jobs to fail
Key: MAPREDUCE-3736
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3736
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.1
Reporter: Eli Collins
Assignee: Ahmed Radwan
Priority: Blocker
Attachments: MAPREDUCE-3736.patch, MAPREDUCE-3736_rev2.patch, MAPREDUCE-3736_rev3.patch
I'm seeing the same failure as MAPREDUCE-3462 in downstream projects running against a recent build of branch-23. MR-3462 modified the tests rather than fixing the framework. In that jira Ravi mentioned "I'm still ignorant of the change which made the tests start to fail. I should probably understand better the reasons for that change before proposing a more generalized fix." Let's figure out the general fix (rather than require all projects to set mapreduce.job.hdfs-servers in their conf we should fix this in the framework). Perhaps we should not default this config to $fs.default.name?
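For readers unfamiliar with the error in the issue title, here is a minimal sketch of how a bounded variable-expansion loop can fail with a "substitution depth too large" error. It is not Hadoop's Configuration code: the class name, the regular expression, the limit of 20, and the property values are all assumptions made for illustration, and it does not claim to reproduce the root cause of MAPREDUCE-3736.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SubstitutionDepthSketch {
    // Assumed depth limit for this sketch.
    static final int MAX_SUBST = 20;
    // Matches references of the form ${name}.
    static final Pattern VAR = Pattern.compile("\\$\\{([^}$ ]+)\\}");

    /** Expands ${name} references, giving up after MAX_SUBST rounds. */
    static String substitute(Map<String, String> props, String expr) {
        String eval = expr;
        for (int depth = 0; depth < MAX_SUBST; depth++) {
            Matcher m = VAR.matcher(eval);
            if (!m.find()) {
                return eval; // nothing left to expand
            }
            String value = props.get(m.group(1));
            if (value == null) {
                return eval; // unknown variable: leave the reference in place
            }
            eval = eval.substring(0, m.start()) + value + eval.substring(m.end());
        }
        throw new IllegalStateException(
            "Variable substitution depth too large: " + MAX_SUBST + " " + expr);
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("fs.default.name", "hdfs://namenode:8020");
        props.put("mapreduce.job.hdfs-servers", "${fs.default.name}");
        // A chain of finite length resolves normally.
        System.out.println(substitute(props, "${mapreduce.job.hdfs-servers}"));

        // A value that refers back to itself never resolves and hits the limit.
        props.put("fs.default.name", "${fs.default.name}");
        try {
            substitute(props, "${mapreduce.job.hdfs-servers}");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under these assumptions, any default that ends up referring to itself, directly or through another key, will exhaust the limit no matter how large it is, which is why the discussion centres on what the default should be rather than on the depth.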
[jira] [Commented] (MAPREDUCE-3854) Reinstate environment variable tests in TestMiniMRChildTask
[ https://issues.apache.org/jira/browse/MAPREDUCE-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208422#comment-13208422 ]
Hudson commented on MAPREDUCE-3854:
Integrated in Hadoop-Hdfs-trunk #956 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/956/])
MAPREDUCE-3854. Fixed and reenabled tests related to MR child JVM's environmental variables in TestMiniMRChildTask. (Tom White via vinodkv) (Revision 1244223)
Result = FAILURE
vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1244223
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMiniMRChildTask.java
Reinstate environment variable tests in TestMiniMRChildTask
Key: MAPREDUCE-3854
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3854
Project: Hadoop Map/Reduce
Issue Type: Test
Components: mrv2
Reporter: Tom White
Assignee: Tom White
Fix For: 0.23.1
Attachments: MAPREDUCE-3854.patch, MAPREDUCE-3854.patch
MAPREDUCE-3716 reinstated one of the tests in TestMiniMRChildTask, but there are two more which should be run.
[jira] [Commented] (MAPREDUCE-3846) Restarted+Recovered AM hangs in some corner cases
[ https://issues.apache.org/jira/browse/MAPREDUCE-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208423#comment-13208423 ]
Hudson commented on MAPREDUCE-3846:
Integrated in Hadoop-Hdfs-trunk #956 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/956/])
MAPREDUCE-3802. Added test to validate that AM can crash multiple times and still can recover successfully after MAPREDUCE-3846. (vinodkv) (Revision 1244178)
Result = FAILURE
vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1244178
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRecovery.java
Restarted+Recovered AM hangs in some corner cases
Key: MAPREDUCE-3846
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3846
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Critical
Fix For: 0.23.1
Attachments: MAPREDUCE-3846-20120210.txt, MAPREDUCE-3846-20120210.txt, MAPREDUCE-3846-20120213.txt
[~karams] found this while testing AM restart/recovery feature. After the first generation AM crashes (manually killed by kill -9), the second generation AM starts, but hangs after a while.
[jira] [Commented] (MAPREDUCE-3858) Task attempt failure during commit results in task never completing
[ https://issues.apache.org/jira/browse/MAPREDUCE-3858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208426#comment-13208426 ]
Hudson commented on MAPREDUCE-3858:
Integrated in Hadoop-Hdfs-trunk #956 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/956/])
MAPREDUCE-3858. Task attempt failure during commit results in task never completing. (Tom White via mahadev) (Revision 1244254)
Result = FAILURE
mahadev : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1244254
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java
Task attempt failure during commit results in task never completing
Key: MAPREDUCE-3858
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3858
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Reporter: Tom White
Assignee: Tom White
Priority: Critical
Fix For: 0.23.1
Attachments: MAPREDUCE-3858.patch
On a terasort job a task attempt failed during the commit phase. Another attempt was rescheduled, but when it tried to commit it failed.
{noformat}
attempt_1329019187148_0083_r_000586_0 already given a go for committing the task output, so killing attempt_1329019187148_0083_r_000586_1
{noformat}
The job hung as new attempts kept getting scheduled only to fail during commit.
[jira] [Commented] (MAPREDUCE-3802) If an MR AM dies twice it looks like the process freezes
[ https://issues.apache.org/jira/browse/MAPREDUCE-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208427#comment-13208427 ]

Hudson commented on MAPREDUCE-3802:
---
Integrated in Hadoop-Hdfs-trunk #956 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/956/])
MAPREDUCE-3802. Added test to validate that AM can crash multiple times and still can recover successfully after MAPREDUCE-3846. (vinodkv) (Revision 1244178)

Result = FAILURE
vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1244178
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRecovery.java

If an MR AM dies twice it looks like the process freezes
---
Key: MAPREDUCE-3802
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3802
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: applicationmaster, mrv2
Affects Versions: 0.23.1, 0.24.0
Reporter: Robert Joseph Evans
Assignee: Vinod Kumar Vavilapalli
Priority: Critical
Fix For: 0.23.1
Attachments: MAPREDUCE-3802-20120213.txt, MAPREDUCE-3802-20120213.txt, syslog

It looks like recovering from an MR AM dying works very well on a single failure. But if it fails multiple times we appear to get into a livelock situation.
{noformat}
yarn jar hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*-SNAPSHOT.jar wordcount -Dyarn.app.mapreduce.am.log.level=DEBUG -Dmapreduce.job.reduces=30 input output
12/02/03 21:06:57 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
12/02/03 21:06:57 WARN conf.Configuration: mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
12/02/03 21:06:57 INFO input.FileInputFormat: Total input paths to process : 17
12/02/03 21:06:57 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/02/03 21:06:57 WARN snappy.LoadSnappy: Snappy native library not loaded
12/02/03 21:06:57 INFO mapreduce.JobSubmitter: number of splits:17
12/02/03 21:06:57 INFO mapred.ResourceMgrDelegate: Submitted application application_1328302034486_0003 to ResourceManager at HOST/IP:8040
12/02/03 21:06:57 INFO mapreduce.Job: The url to track the job: http://HOST:8088/proxy/application_1328302034486_0003/
12/02/03 21:06:57 INFO mapreduce.Job: Running job: job_1328302034486_0003
12/02/03 21:07:03 INFO mapreduce.Job: Job job_1328302034486_0003 running in uber mode : false
12/02/03 21:07:03 INFO mapreduce.Job: map 0% reduce 0%
12/02/03 21:07:09 INFO mapreduce.Job: map 5% reduce 0%
12/02/03 21:07:10 INFO mapreduce.Job: map 17% reduce 0%
#KILLED AM with kill -9 here
12/02/03 21:07:16 INFO mapreduce.Job: map 29% reduce 0%
12/02/03 21:07:17 INFO mapreduce.Job: map 35% reduce 0%
12/02/03 21:07:30 INFO mapreduce.Job: map 52% reduce 0%
12/02/03 21:07:35 INFO mapreduce.Job: map 58% reduce 0%
12/02/03 21:07:37 INFO mapreduce.Job: map 70% reduce 0%
12/02/03 21:07:41 INFO mapreduce.Job: map 76% reduce 0%
12/02/03 21:07:43 INFO mapreduce.Job: map 82% reduce 0%
12/02/03 21:07:44 INFO mapreduce.Job: map 88% reduce 0%
12/02/03 21:07:47 INFO mapreduce.Job: map 94% reduce 0%
12/02/03 21:07:49 INFO mapreduce.Job: map 100% reduce 0%
12/02/03 21:07:53 INFO mapreduce.Job: map 100% reduce 3%
12/02/03 21:08:00 INFO mapreduce.Job: map 100% reduce 6%
12/02/03 21:08:06 INFO mapreduce.Job: map 100% reduce 10%
12/02/03 21:08:12 INFO mapreduce.Job: map 100% reduce 13%
12/02/03 21:08:18 INFO mapreduce.Job: map 100% reduce 16%
#killed AM with kill -9 here
12/02/03 21:08:20 INFO ipc.Client: Retrying connect to server: HOST/IP:44223. Already tried 0 time(s).
12/02/03 21:08:21 INFO ipc.Client: Retrying connect to server: HOST/IP:44223. Already tried 1 time(s).
12/02/03 21:08:22 INFO ipc.Client: Retrying connect to server: HOST/IP:44223. Already tried 2 time(s).
12/02/03 21:08:26 INFO mapreduce.Job: map 64% reduce 16%
#It never makes any more progress...
{noformat}
[jira] [Commented] (MAPREDUCE-3854) Reinstate environment variable tests in TestMiniMRChildTask
[ https://issues.apache.org/jira/browse/MAPREDUCE-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208429#comment-13208429 ]

Hudson commented on MAPREDUCE-3854:
---
Integrated in Hadoop-Hdfs-0.23-Build #169 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/169/])
MAPREDUCE-3854. Fixed and reenabled tests related to MR child JVM's environmental variables in TestMiniMRChildTask. (Tom White via vinodkv)
svn merge --ignore-ancestry -c 1244223 ../../trunk/ (Revision 1244224)

Result = FAILURE
vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1244224
Files :
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMiniMRChildTask.java

Reinstate environment variable tests in TestMiniMRChildTask
---
Key: MAPREDUCE-3854
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3854
Project: Hadoop Map/Reduce
Issue Type: Test
Components: mrv2
Reporter: Tom White
Assignee: Tom White
Fix For: 0.23.1
Attachments: MAPREDUCE-3854.patch, MAPREDUCE-3854.patch

MAPREDUCE-3716 reinstated one of the tests in TestMiniMRChildTask, but there are two more which should be run.
[jira] [Commented] (MAPREDUCE-3736) Variable substitution depth too large for fs.default.name causes jobs to fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208428#comment-13208428 ]

Hudson commented on MAPREDUCE-3736:
---
Integrated in Hadoop-Hdfs-0.23-Build #169 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/169/])
Merge -r 1244263:1244264 from trunk to branch. FIXES: MAPREDUCE-3736 (Revision 1244265)

Result = FAILURE
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1244265
Files :
* /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestMRWithDistributedCache.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/conf/TestNoDefaultsJobConf.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/JHLogAnalyzer.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/io/FileBench.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestCombineFileInputFormat.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestConcatenatedCompressedInput.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestTextInputFormat.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/TestMapCollection.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestFileInputFormat.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestMRKeyValueTextInputFormat.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml

Variable substitution depth too large for fs.default.name causes jobs to fail
---
Key: MAPREDUCE-3736
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3736
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.1
Reporter: Eli Collins
Assignee: Ahmed Radwan
Priority: Blocker
Attachments: MAPREDUCE-3736.patch, MAPREDUCE-3736_rev2.patch, MAPREDUCE-3736_rev3.patch

I'm seeing the same failure as MAPREDUCE-3462 in downstream projects running against a recent build of branch-23. MR-3462 modified the tests rather than fixing the framework. In that jira Ravi mentioned: "I'm still ignorant of the change which made the tests start to fail. I should probably understand better the reasons for that change before proposing a more generalized fix."

Let's figure out the general fix: rather than require all projects to set mapreduce.job.hdfs-servers in their conf, we should fix this in the framework. Perhaps we should not default this config to $fs.default.name?
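To make the "substitution depth too large" failure in MAPREDUCE-3736 concrete, here is a minimal sketch of depth-limited variable expansion. It is not Hadoop's Configuration code: the function `expand`, the constant `MAX_SUBST` with its value of 20, the `${name}` reference syntax, and the host `nn:8020` are assumptions made for illustration. It shows that a property defaulting to another property resolves normally, while a chain that keeps referring back to itself can only end by hitting the depth limit.

```python
import re

MAX_SUBST = 20                          # assumed depth limit, for illustration only
VAR = re.compile(r"\$\{([^}$\s]+)\}")   # matches a ${name} reference

def expand(props, value):
    """Expand ${name} references in value, giving up after MAX_SUBST rounds."""
    for _ in range(MAX_SUBST):
        match = VAR.search(value)
        if match is None:
            return value                # nothing left to substitute
        name = match.group(1)
        if name not in props:
            return value                # unknown variable: leave it as written
        value = value[:match.start()] + props[name] + value[match.end():]
    raise ValueError(f"Variable substitution depth too large: {MAX_SUBST} {value}")

# A property that defaults to another property resolves in one round.
ok = {"fs.default.name": "hdfs://nn:8020",          # hypothetical host
      "mapreduce.job.hdfs-servers": "${fs.default.name}"}
print(expand(ok, ok["mapreduce.job.hdfs-servers"]))  # hdfs://nn:8020

# A chain that never bottoms out exhausts the limit instead.
looping = {"a": "${b}", "b": "${a}"}
try:
    expand(looping, looping["a"])
except ValueError as err:
    print(err)
```

The second case is why a default that points at another variable is fragile: if that variable ever stops resolving to a literal, the failure surfaces at job submission rather than where the config was written.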
[jira] [Commented] (MAPREDUCE-3858) Task attempt failure during commit results in task never completing
[ https://issues.apache.org/jira/browse/MAPREDUCE-3858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208432#comment-13208432 ]

Hudson commented on MAPREDUCE-3858:
---
Integrated in Hadoop-Hdfs-0.23-Build #169 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/169/])
MAPREDUCE-3858. Task attempt failure during commit results in task never completing. (Tom White via mahadev) - Merging r1244254 from trunk. (Revision 1244255)

Result = FAILURE
mahadev : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1244255
Files :
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java
[jira] [Commented] (MAPREDUCE-3846) Restarted+Recovered AM hangs in some corner cases
[ https://issues.apache.org/jira/browse/MAPREDUCE-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208430#comment-13208430 ]

Hudson commented on MAPREDUCE-3846:
---
Integrated in Hadoop-Hdfs-0.23-Build #169 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/169/])
MAPREDUCE-3802. Added test to validate that AM can crash multiple times and still can recover successfully after MAPREDUCE-3846. (vinodkv)
svn merge --ignore-ancestry -c 1244178 ../../trunk/ (Revision 1244180)

Result = FAILURE
vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1244180
Files :
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRecovery.java

Restarted+Recovered AM hangs in some corner cases
---
Key: MAPREDUCE-3846
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3846
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Critical
Fix For: 0.23.1
Attachments: MAPREDUCE-3846-20120210.txt, MAPREDUCE-3846-20120210.txt, MAPREDUCE-3846-20120213.txt

[~karams] found this while testing AM restart/recovery feature. After the first generation AM crashes (manually killed by kill -9), the second generation AM starts, but hangs after a while.
[jira] [Commented] (MAPREDUCE-3802) If an MR AM dies twice it looks like the process freezes
[ https://issues.apache.org/jira/browse/MAPREDUCE-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208433#comment-13208433 ]

Hudson commented on MAPREDUCE-3802:
---
Integrated in Hadoop-Hdfs-0.23-Build #169 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/169/])
MAPREDUCE-3802. Added test to validate that AM can crash multiple times and still can recover successfully after MAPREDUCE-3846. (vinodkv)
svn merge --ignore-ancestry -c 1244178 ../../trunk/ (Revision 1244180)

Result = FAILURE
vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1244180
Files :
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRecovery.java
[jira] [Commented] (MAPREDUCE-3846) Restarted+Recovered AM hangs in some corner cases
[ https://issues.apache.org/jira/browse/MAPREDUCE-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208454#comment-13208454 ]

Hudson commented on MAPREDUCE-3846:
---
Integrated in Hadoop-Mapreduce-0.23-Build #197 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/197/])
MAPREDUCE-3802. Added test to validate that AM can crash multiple times and still can recover successfully after MAPREDUCE-3846. (vinodkv)
svn merge --ignore-ancestry -c 1244178 ../../trunk/ (Revision 1244180)

Result = FAILURE
vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1244180
Files :
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRecovery.java
[jira] [Commented] (MAPREDUCE-3854) Reinstate environment variable tests in TestMiniMRChildTask
[ https://issues.apache.org/jira/browse/MAPREDUCE-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208453#comment-13208453 ]

Hudson commented on MAPREDUCE-3854:
---
Integrated in Hadoop-Mapreduce-0.23-Build #197 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/197/])
MAPREDUCE-3854. Fixed and reenabled tests related to MR child JVM's environmental variables in TestMiniMRChildTask. (Tom White via vinodkv)
svn merge --ignore-ancestry -c 1244223 ../../trunk/ (Revision 1244224)

Result = FAILURE
vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1244224
Files :
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMiniMRChildTask.java
[jira] [Commented] (MAPREDUCE-3736) Variable substitution depth too large for fs.default.name causes jobs to fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208452#comment-13208452 ]

Hudson commented on MAPREDUCE-3736:
---
Integrated in Hadoop-Mapreduce-0.23-Build #197 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/197/])
Merge -r 1244263:1244264 from trunk to branch. FIXES: MAPREDUCE-3736 (Revision 1244265)

Result = FAILURE
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1244265
Files :
* /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestMRWithDistributedCache.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/conf/TestNoDefaultsJobConf.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/JHLogAnalyzer.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/io/FileBench.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestCombineFileInputFormat.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestConcatenatedCompressedInput.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestTextInputFormat.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/TestMapCollection.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestFileInputFormat.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestMRKeyValueTextInputFormat.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
[jira] [Commented] (MAPREDUCE-3858) Task attempt failure during commit results in task never completing
[ https://issues.apache.org/jira/browse/MAPREDUCE-3858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208456#comment-13208456 ]

Hudson commented on MAPREDUCE-3858:
---
Integrated in Hadoop-Mapreduce-0.23-Build #197 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/197/])
MAPREDUCE-3858. Task attempt failure during commit results in task never completing. (Tom White via mahadev) - Merging r1244254 from trunk. (Revision 1244255)

Result = FAILURE
mahadev : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1244255
Files :
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java
[jira] [Commented] (MAPREDUCE-3802) If an MR AM dies twice it looks like the process freezes
[ https://issues.apache.org/jira/browse/MAPREDUCE-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208457#comment-13208457 ]

Hudson commented on MAPREDUCE-3802:
---
Integrated in Hadoop-Mapreduce-0.23-Build #197 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/197/])
MAPREDUCE-3802. Added test to validate that AM can crash multiple times and still can recover successfully after MAPREDUCE-3846. (vinodkv)
svn merge --ignore-ancestry -c 1244178 ../../trunk/ (Revision 1244180)

Result = FAILURE
vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1244180
Files :
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRecovery.java
[jira] [Commented] (MAPREDUCE-3854) Reinstate environment variable tests in TestMiniMRChildTask
[ https://issues.apache.org/jira/browse/MAPREDUCE-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208472#comment-13208472 ] Hudson commented on MAPREDUCE-3854: --- Integrated in Hadoop-Mapreduce-trunk #991 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/991/]) MAPREDUCE-3854. Fixed and reenabled tests related to MR child JVM's environmental variables in TestMiniMRChildTask. (Tom White via vinodkv) (Revision 1244223) Result = SUCCESS vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1244223 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMiniMRChildTask.java Reinstate environment variable tests in TestMiniMRChildTask --- Key: MAPREDUCE-3854 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3854 Project: Hadoop Map/Reduce Issue Type: Test Components: mrv2 Reporter: Tom White Assignee: Tom White Fix For: 0.23.1 Attachments: MAPREDUCE-3854.patch, MAPREDUCE-3854.patch MAPREDUCE-3716 reinstated one of the tests in TestMiniMRChildTask, but there are two more which should be run. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3736) Variable substitution depth too large for fs.default.name causes jobs to fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208471#comment-13208471 ] Hudson commented on MAPREDUCE-3736: --- Integrated in Hadoop-Mapreduce-trunk #991 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/991/]) MAPREDUCE-3736. Variable substitution depth too large for fs.default.name causes jobs to fail (ahmed via tucu) (Revision 1244264) Result = SUCCESS tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1244264 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/resources/core-default.xml * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestMRWithDistributedCache.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/conf/TestNoDefaultsJobConf.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/JHLogAnalyzer.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/io/FileBench.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestCombineFileInputFormat.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestConcatenatedCompressedInput.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestTextInputFormat.java * 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/TestMapCollection.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestFileInputFormat.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestMRKeyValueTextInputFormat.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml Variable substitution depth too large for fs.default.name causes jobs to fail - Key: MAPREDUCE-3736 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3736 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: Eli Collins Assignee: Ahmed Radwan Priority: Blocker Attachments: MAPREDUCE-3736.patch, MAPREDUCE-3736_rev2.patch, MAPREDUCE-3736_rev3.patch I'm seeing the same failure as MAPREDUCE-3462 in downstream projects running against a recent build of branch-23. MR-3462 modified the tests rather than fixing the framework. In that jira Ravi mentioned I'm still ignorant of the change which made the tests start to fail. I should probably understand better the reasons for that change before proposing a more generalized fix. Let's figure out the general fix (rather than require all projects to set mapreduce.job.hdfs-servers in their conf we should fix this in the framework). Perhaps we should not default this config to $fs.default.name? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3858) Task attempt failure during commit results in task never completing
[ https://issues.apache.org/jira/browse/MAPREDUCE-3858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208476#comment-13208476 ] Hudson commented on MAPREDUCE-3858: --- Integrated in Hadoop-Mapreduce-trunk #991 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/991/]) MAPREDUCE-3858. Task attempt failure during commit results in task never completing. (Tom White via mahadev) (Revision 1244254) Result = SUCCESS mahadev : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1244254 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java Task attempt failure during commit results in task never completing --- Key: MAPREDUCE-3858 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3858 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Reporter: Tom White Assignee: Tom White Priority: Critical Fix For: 0.23.1 Attachments: MAPREDUCE-3858.patch On a terasort job a task attempt failed during the commit phase. Another attempt was rescheduled, but when it tried to commit it failed. {noformat} attempt_1329019187148_0083_r_000586_0 already given a go for committing the task output, so killing attempt_1329019187148_0083_r_000586_1 {noformat} The job hung as new attempts kept getting scheduled only to fail during commit. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3685) There are some bugs in implementation of MergeMnager :
[ https://issues.apache.org/jira/browse/MAPREDUCE-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208485#comment-13208485 ] Ravi Prakash commented on MAPREDUCE-3685: - anty, can you please edit the attributes of this JIRA, e.g. Affects Version and Target Version? IMHO the priority of this JIRA should be at least Major. There are some bugs in implementation of MergeMnager : -- Key: MAPREDUCE-3685 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3685 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Reporter: anty.rao Priority: Minor Attachments: MAPREDUCE-3685.patch
[jira] [Commented] (MAPREDUCE-3685) There are some bugs in implementation of MergeMnager :
[ https://issues.apache.org/jira/browse/MAPREDUCE-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208495#comment-13208495 ] Hadoop QA commented on MAPREDUCE-3685: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514083/MAPREDUCE-3685.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The patch appears to cause tar ant target to fail. +1 eclipse:eclipse. The patch built with eclipse:eclipse. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed the unit tests build +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1861//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1861//console This message is automatically generated. There are some bugs in implementation of MergeMnager : -- Key: MAPREDUCE-3685 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3685 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Reporter: anty.rao Priority: Minor Attachments: MAPREDUCE-3685.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3746) Nodemanagers are not automatically shut down after decommissioning
[ https://issues.apache.org/jira/browse/MAPREDUCE-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208502#comment-13208502 ] Jason Lowe commented on MAPREDUCE-3746: --- Ramya, could you reproduce the issue and grab the jstack output on one of the hung NMs? I'm guessing one or more threads are not shutting down properly and preventing the process from exiting. The jstack output could help verify this and show which subsystem owns the threads. Nodemanagers are not automatically shut down after decommissioning -- Key: MAPREDUCE-3746 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3746 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: Ramya Sunil Assignee: Devaraj K Priority: Critical Fix For: 0.23.1 Nodemanagers are not automatically shutdown after decommissioning. MAPREDUCE-2775 does not seem to fix the issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-3861) Oozie job status couldn't be updated correctly after Pig job SUCCEEDED from hadoop.
Oozie job status couldn't be updated correctly after Pig job SUCCEEDED from hadoop. --- Key: MAPREDUCE-3861 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3861 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.1, 0.23.2 Reporter: John George Priority: Blocker Submit an Oozie job having Pig action, the job is SUCCEEDED from hadoop and the output file is generated on HDFS, but oozie status is KILLED. Marcy Chen reported this issue while testing a pig job through oozie. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3861) Oozie job status couldn't be updated correctly after Pig job SUCCEEDED from hadoop.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208505#comment-13208505 ] John George commented on MAPREDUCE-3861: From Hadoop action log,
2012-02-14 15:24:59,909 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.mapreduce.v2.app.MRAppMaster is stopped.
2012-02-14 15:24:59,910 INFO [TaskHeartbeatHandler PingChecker] org.apache.hadoop.mapreduce.v2.app.TaskHeartbeatHandler: TaskHeartbeatHandler thread interrupted
2012-02-14 15:24:59,911 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread. Exiting..
java.lang.IllegalStateException: Shutdown in progress
  at java.lang.ApplicationShutdownHooks.add(ApplicationShutdownHooks.java:39)
  at java.lang.Runtime.addShutdownHook(Runtime.java:192)
  at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2076)
  at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2048)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:284)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:151)
  at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.getFileSystem(MRAppMaster.java:340)
  at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.cleanupStagingDir(MRAppMaster.java:350)
  at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler.handle(MRAppMaster.java:415)
  at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler.handle(MRAppMaster.java:375)
  at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:125)
  at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:82)
  at java.lang.Thread.run(Thread.java:619)
Again - thanks to Marcy Chen.
Oozie job status couldn't be updated correctly after Pig job SUCCEEDED from hadoop. --- Key: MAPREDUCE-3861 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3861 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.1, 0.23.2 Reporter: John George Priority: Blocker Submit an Oozie job having Pig action, the job is SUCCEEDED from hadoop and the output file is generated on HDFS, but oozie status is KILLED. Marcy Chen reported this issue while testing a pig job through oozie.
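The stack trace above ends in Runtime.addShutdownHook() rejecting a new hook because the JVM had already begun shutting down. The following is a minimal sketch of that JDK behaviour in isolation, not the MRAppMaster or FileSystem code and not a proposed fix for this issue; the class name ShutdownHookGuard and the method tryAddShutdownHook are hypothetical names chosen for illustration.

```java
// Hypothetical demo class; not part of Hadoop.
final class ShutdownHookGuard {

    // Tries to register a shutdown hook. Returns false instead of throwing
    // when the JVM is already shutting down, which is the condition that
    // produces "IllegalStateException: Shutdown in progress".
    static boolean tryAddShutdownHook(Thread hook) {
        try {
            Runtime.getRuntime().addShutdownHook(hook);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Thread lateHook = new Thread(() -> { });
        // This hook runs while shutdown is in progress and then attempts a
        // second registration, mimicking code that lazily registers a hook
        // from a cleanup path.
        Runtime.getRuntime().addShutdownHook(new Thread(() ->
            System.out.println("registered during shutdown: "
                + tryAddShutdownHook(lateHook))));
    }
}
```

Whether swallowing the exception is appropriate depends on what the hook was meant to protect; the sketch only shows where the exception comes from.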
[jira] [Commented] (MAPREDUCE-3583) ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException
[ https://issues.apache.org/jira/browse/MAPREDUCE-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208533#comment-13208533 ] Zhihong Yu commented on MAPREDUCE-3583: --- I couldn't reproduce the remaining test failure:
{code}
cd hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
mvn test -Dtest=TestProcfsBasedProcessTree
{code}
ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException - Key: MAPREDUCE-3583 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3583 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.205.0 Environment: 64-bit Linux: asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux Reporter: Zhihong Yu Assignee: ramkrishna.s.vasudevan Priority: Critical Attachments: mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk.txt, mapreduce-3583-v2.txt, mapreduce-3583-v3.txt, mapreduce-3583-v4.txt, mapreduce-3583-v5.txt, mapreduce-3583.txt
HBase PreCommit builds frequently gave us NumberFormatException. From https://builds.apache.org/job/PreCommit-HBASE-Build/553//testReport/org.apache.hadoop.hbase.mapreduce/TestHFileOutputFormat/testMRIncrementalLoad/:
{code}
2011-12-20 01:44:01,180 WARN [main] mapred.JobClient(784): No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
java.lang.NumberFormatException: For input string: 18446743988060683582 at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) at java.lang.Long.parseLong(Long.java:422) at java.lang.Long.parseLong(Long.java:468) at org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413) at org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148) at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401) at org.apache.hadoop.mapred.Task.initialize(Task.java:536) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083) at org.apache.hadoop.mapred.Child.main(Child.java:249) {code} From hadoop 0.20.205 source code, looks like ppid was 18446743988060683582, causing NFE: {code} // Set (name) (ppid) (pgrpId) (session) (utime) (stime) (vsize) (rss) pinfo.updateProcessInfo(m.group(2), Integer.parseInt(m.group(3)), {code} You can find information on the OS at the beginning of https://builds.apache.org/job/PreCommit-HBASE-Build/553/console: {code} asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 20 file size (blocks, -f) unlimited pending signals (-i) 16382 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 6 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 2048 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited 6 Running 
in Jenkins mode {code}
From Nicolas Sze: {noformat} It looks like that the ppid is a 64-bit positive integer but Java long is signed and so only works with 63-bit positive integers. In your case, 2^64 > 18446743988060683582 > 2^63. Therefore, there is a NFE. {noformat}
I propose changing allProcessInfo to Map<String, ProcessInfo> so that we don't encounter this problem by avoiding parsing large integer.
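The range argument quoted above can be checked in isolation. The following is a minimal sketch, not the ProcfsBasedProcessTree code; the class UnsignedProcValueDemo and its methods are hypothetical names. It shows that Long.parseLong rejects the value from the stack trace, while BigInteger accepts it, and that on Java 8 or later Long.parseUnsignedLong can hold the same bits in a long.

```java
import java.math.BigInteger;

// Hypothetical demo class; not part of Hadoop.
final class UnsignedProcValueDemo {

    // The value reported in the stack trace above.
    static final String VALUE = "18446743988060683582";

    // Returns true if the decimal string fits in a signed 64-bit long.
    static boolean fitsInSignedLong(String s) {
        try {
            Long.parseLong(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Parses a decimal string of arbitrary magnitude.
    static BigInteger parseUnsigned(String s) {
        return new BigInteger(s);
    }

    public static void main(String[] args) {
        BigInteger v = parseUnsigned(VALUE);
        BigInteger two63 = BigInteger.ONE.shiftLeft(63);
        BigInteger two64 = BigInteger.ONE.shiftLeft(64);

        System.out.println("fits in signed long: " + fitsInSignedLong(VALUE));
        System.out.println("2^63 < value: " + (two63.compareTo(v) < 0));
        System.out.println("value < 2^64: " + (v.compareTo(two64) < 0));

        // Java 8+: keep the 64 bits in a long and format them as unsigned.
        long bits = Long.parseUnsignedLong(VALUE);
        System.out.println("round trip: " + Long.toUnsignedString(bits));
    }
}
```

Keeping the field as a string, as the proposal above suggests, avoids the parse entirely; the numeric alternatives here matter only if the value has to be compared or summed.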
[jira] [Updated] (MAPREDUCE-3798) TestJobCleanup testCustomCleanup is failing
[ https://issues.apache.org/jira/browse/MAPREDUCE-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated MAPREDUCE-3798: Attachment: MAPREDUCE-3798.patch
I've ported the test to run from mvn builds. The problem was essentially two confs that now need to be set in the tests:
{noformat}
conf.set(JHAdminConfig.MR_HISTORY_INTERMEDIATE_DONE_DIR, TEST_ROOT_DIR + "/intermediate");
conf.set(org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
    .SUCCESSFUL_JOB_OUTPUT_DIR_MARKER, "true");
{noformat}
And also that FileOutputCommitter had been modified in the JIRAs Bobby mentioned.
TestJobCleanup testCustomCleanup is failing --- Key: MAPREDUCE-3798 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3798 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 0.23.0 Reporter: Ravi Prakash Assignee: Ravi Prakash Labels: test Attachments: MAPREDUCE-3798.patch
File somepath/hadoop-mapreduce-project/build/test/data/test-job-cleanup/output-8/_custom_cleanup missing for job job_20120203035807432_0009
junit.framework.AssertionFailedError: File somepath/hadoop-mapreduce-project/build/test/data/test-job-cleanup/output-8/_custom_cleanup missing for job job_20120203035807432_0009
  at org.apache.hadoop.mapred.TestJobCleanup.testKilledJob(TestJobCleanup.java:228)
  at org.apache.hadoop.mapred.TestJobCleanup.testCustomCleanup(TestJobCleanup.java:302)
  at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
  at junit.extensions.TestSetup$1.protect(TestSetup.java:23)
  at junit.extensions.TestSetup.run(TestSetup.java:27)
[jira] [Updated] (MAPREDUCE-3798) TestJobCleanup testCustomCleanup is failing
[ https://issues.apache.org/jira/browse/MAPREDUCE-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated MAPREDUCE-3798: Status: Patch Available (was: Open) Sneaking in a change in documentation that I missed earlier =D. TestJobCleanup testCustomCleanup is failing --- Key: MAPREDUCE-3798 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3798 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 0.23.0 Reporter: Ravi Prakash Assignee: Ravi Prakash Labels: test Attachments: MAPREDUCE-3798.patch File somepath/hadoop-mapreduce-project/build/test/data/test-job-cleanup/output-8/_custom_cleanup missing for job job_20120203035807432_0009 junit.framework.AssertionFailedError: File somepath/hadoop-mapreduce-project/build/test/data/test-job-cleanup/output-8/_custom_cleanup missing for job job_20120203035807432_0009 at org.apache.hadoop.mapred.TestJobCleanup.testKilledJob(TestJobCleanup.java:228) at org.apache.hadoop.mapred.TestJobCleanup.testCustomCleanup(TestJobCleanup.java:302) at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24) at junit.extensions.TestSetup$1.protect(TestSetup.java:23) at junit.extensions.TestSetup.run(TestSetup.java:27) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3798) TestJobCleanup testCustomCleanup is failing
[ https://issues.apache.org/jira/browse/MAPREDUCE-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208554#comment-13208554 ] Ravi Prakash commented on MAPREDUCE-3798: - The same patch also applies to trunk. TestJobCleanup testCustomCleanup is failing --- Key: MAPREDUCE-3798 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3798 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 0.23.0 Reporter: Ravi Prakash Assignee: Ravi Prakash Labels: test Attachments: MAPREDUCE-3798.patch File somepath/hadoop-mapreduce-project/build/test/data/test-job-cleanup/output-8/_custom_cleanup missing for job job_20120203035807432_0009 junit.framework.AssertionFailedError: File somepath/hadoop-mapreduce-project/build/test/data/test-job-cleanup/output-8/_custom_cleanup missing for job job_20120203035807432_0009 at org.apache.hadoop.mapred.TestJobCleanup.testKilledJob(TestJobCleanup.java:228) at org.apache.hadoop.mapred.TestJobCleanup.testCustomCleanup(TestJobCleanup.java:302) at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24) at junit.extensions.TestSetup$1.protect(TestSetup.java:23) at junit.extensions.TestSetup.run(TestSetup.java:27) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3861) Oozie job status couldn't be updated correctly after Pig job SUCCEEDED from hadoop.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John George updated MAPREDUCE-3861: --- Component/s: mrv2 Oozie job status couldn't be updated correctly after Pig job SUCCEEDED from hadoop. --- Key: MAPREDUCE-3861 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3861 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1, 0.23.2 Reporter: John George Assignee: John George Priority: Blocker Submit an Oozie job having Pig action, the job is SUCCEEDED from hadoop and the output file is generated on HDFS, but oozie status is KILLED. Marcy Chen reported this issue while testing a pig job through oozie. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (MAPREDUCE-3861) Oozie job status couldn't be updated correctly after Pig job SUCCEEDED from hadoop.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John George reassigned MAPREDUCE-3861: -- Assignee: John George Oozie job status couldn't be updated correctly after Pig job SUCCEEDED from hadoop. --- Key: MAPREDUCE-3861 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3861 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1, 0.23.2 Reporter: John George Assignee: John George Priority: Blocker Submit an Oozie job having Pig action, the job is SUCCEEDED from hadoop and the output file is generated on HDFS, but oozie status is KILLED. Marcy Chen reported this issue while testing a pig job through oozie. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3798) TestJobCleanup testCustomCleanup is failing
[ https://issues.apache.org/jira/browse/MAPREDUCE-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208563#comment-13208563 ] Hadoop QA commented on MAPREDUCE-3798: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514651/MAPREDUCE-3798.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 510 javac compiler warnings (more than the trunk's current 507 warnings). +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1862//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1862//console This message is automatically generated. 
TestJobCleanup testCustomCleanup is failing --- Key: MAPREDUCE-3798 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3798 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 0.23.0 Reporter: Ravi Prakash Assignee: Ravi Prakash Labels: test Attachments: MAPREDUCE-3798.patch File somepath/hadoop-mapreduce-project/build/test/data/test-job-cleanup/output-8/_custom_cleanup missing for job job_20120203035807432_0009 junit.framework.AssertionFailedError: File somepath/hadoop-mapreduce-project/build/test/data/test-job-cleanup/output-8/_custom_cleanup missing for job job_20120203035807432_0009 at org.apache.hadoop.mapred.TestJobCleanup.testKilledJob(TestJobCleanup.java:228) at org.apache.hadoop.mapred.TestJobCleanup.testCustomCleanup(TestJobCleanup.java:302) at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24) at junit.extensions.TestSetup$1.protect(TestSetup.java:23) at junit.extensions.TestSetup.run(TestSetup.java:27) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3798) TestJobCleanup testCustomCleanup is failing
[ https://issues.apache.org/jira/browse/MAPREDUCE-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated MAPREDUCE-3798: Attachment: MAPREDUCE-3798.patch Suppressing the javac warnings due to the usage of MiniMRCluster TestJobCleanup testCustomCleanup is failing --- Key: MAPREDUCE-3798 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3798 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 0.23.0 Reporter: Ravi Prakash Assignee: Ravi Prakash Labels: test Attachments: MAPREDUCE-3798.patch, MAPREDUCE-3798.patch File somepath/hadoop-mapreduce-project/build/test/data/test-job-cleanup/output-8/_custom_cleanup missing for job job_20120203035807432_0009 junit.framework.AssertionFailedError: File somepath/hadoop-mapreduce-project/build/test/data/test-job-cleanup/output-8/_custom_cleanup missing for job job_20120203035807432_0009 at org.apache.hadoop.mapred.TestJobCleanup.testKilledJob(TestJobCleanup.java:228) at org.apache.hadoop.mapred.TestJobCleanup.testCustomCleanup(TestJobCleanup.java:302) at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24) at junit.extensions.TestSetup$1.protect(TestSetup.java:23) at junit.extensions.TestSetup.run(TestSetup.java:27) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads
Nodemanager can appear to hang on shutdown due to lingering DeletionService threads --- Key: MAPREDUCE-3862 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3862 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, nodemanager Affects Versions: 0.23.1 Reporter: Jason Lowe When a nodemanager attempts to shutdown cleanly, it's possible for it to appear to hang due to lingering DeletionService threads. This can occur when yarn.nodemanager.delete.debug-delay-sec is set to a relatively large value and one or more containers executes on the node shortly before the shutdown. The DeletionService is never calling {{setExecuteExistingDelayedTasksAfterShutdownPolicy()}} on the ScheduledThreadPoolExecutor, and it defaults to waiting for all scheduled tasks to complete before exiting. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
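The executor behaviour described above can be observed without any Hadoop code. The following is a minimal sketch using only java.util.concurrent; the class DelayedTaskShutdownDemo, the one-hour delay, and the 500 ms wait are assumptions made for illustration and do not come from DeletionService.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of Hadoop.
final class DelayedTaskShutdownDemo {

    // Schedules one task far in the future, calls shutdown(), and reports
    // whether the pool terminated within a short wait.
    static boolean terminatesPromptly(boolean runDelayedTasksAfterShutdown) {
        ScheduledThreadPoolExecutor sched = new ScheduledThreadPoolExecutor(1);
        // The JDK default for this policy is true: delayed tasks that are
        // already queued still run after shutdown().
        sched.setExecuteExistingDelayedTasksAfterShutdownPolicy(
            runDelayedTasksAfterShutdown);
        sched.schedule(() -> { }, 1, TimeUnit.HOURS);
        sched.shutdown();

        boolean terminated = false;
        try {
            terminated = sched.awaitTermination(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        sched.shutdownNow(); // always clean up so the demo JVM can exit
        return terminated;
    }

    public static void main(String[] args) {
        System.out.println("default policy (true) terminates promptly: "
            + terminatesPromptly(true));
        System.out.println("policy set to false terminates promptly: "
            + terminatesPromptly(false));
    }
}
```

With the policy left at its default, the queued task keeps the pool alive after shutdown(); with it set to false, shutdown() cancels the queued delayed task and the pool can terminate.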
[jira] [Commented] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads
[ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208601#comment-13208601 ]

Jason Lowe commented on MAPREDUCE-3862:
---
DeletionService has the following code, which implies we don't want to wait too long for the shutdown to complete:
{code}
public void stop() {
  sched.shutdown();
  try {
    sched.awaitTermination(10, SECONDS);
  } catch (InterruptedException e) {
    sched.shutdownNow();
  }
  super.stop();
}
{code}
However, the code never checks the result from {{awaitTermination()}}, and we can end up trying to continue the shutdown process with the thread pool still active.
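One way to act on the return value of awaitTermination() is sketched below. This is not the committed fix for this issue; StopSketch, stop and demo are hypothetical names, and the 200 ms timeout is chosen only to keep the example fast.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class StopSketch {
    /** Returns true only if the pool has fully terminated when this method returns. */
    static boolean stop(ScheduledThreadPoolExecutor sched, long timeout, TimeUnit unit) {
        sched.shutdown();
        try {
            if (!sched.awaitTermination(timeout, unit)) {
                // Timed out: drop queued tasks and interrupt running ones.
                sched.shutdownNow();
                return sched.awaitTermination(timeout, unit);
            }
            return true;
        } catch (InterruptedException e) {
            sched.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return false;
        }
    }

    /** Shuts down a pool that still holds a task scheduled one hour ahead. */
    static boolean demo() {
        ScheduledThreadPoolExecutor sched = new ScheduledThreadPoolExecutor(1);
        sched.schedule(() -> { }, 1, TimeUnit.HOURS);
        return stop(sched, 200, TimeUnit.MILLISECONDS) && sched.isTerminated();
    }

    public static void main(String[] args) {
        System.out.println("terminated: " + demo());
    }
}
```

The design choice here is to fall back to shutdownNow() on timeout as well as on interruption, so the caller never proceeds with an active pool without knowing it.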
[jira] [Commented] (MAPREDUCE-3798) TestJobCleanup testCustomCleanup is failing
[ https://issues.apache.org/jira/browse/MAPREDUCE-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208605#comment-13208605 ]

Hadoop QA commented on MAPREDUCE-3798:
--
-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514658/MAPREDUCE-3798.patch against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 4 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
-1 javac. The applied patch generated 508 javac compiler warnings (more than the trunk's current 507 warnings).
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests: org.apache.hadoop.mapred.TestIndexCache
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1863//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1863//console

This message is automatically generated.
[jira] [Commented] (MAPREDUCE-3849) Change TokenCache's reading of the binary token file
[ https://issues.apache.org/jira/browse/MAPREDUCE-3849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208608#comment-13208608 ]

Daryn Sharp commented on MAPREDUCE-3849:
---
This was also successfully tested on a secure cluster.

Change TokenCache's reading of the binary token file
---
Key: MAPREDUCE-3849
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3849
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: security
Affects Versions: 0.23.1, 0.24.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Attachments: MAPREDUCE-3849-2.patch, MAPREDUCE-3849.patch

When obtaining the tokens for a {{FileSystem}}, the {{TokenCache}} will read the binary token file if a token is not already in the {{Credentials}}. However, if a single token is missing, it will overwrite any existing tokens in the {{Credentials}} with the contents of the binary token file. This may cause new tokens to be replaced with invalid/cancelled tokens from the binary file. The new tokens will not be cancelled, and thus leak in the namenode until they expire. The binary tokens should be merged with, but not replace, existing tokens in the {{Credentials}}.

The code that reads the binary token file is prefaced with:
{code}
//TODO: Need to come up with a better place to put
//this block of code to do with reading the file
{code}
Also, the loading of the binary token file is the only reason that the {{TokenCache}} has to use {{getCanonicalService}}. If this linkage can be broken, then the 1-to-1 filesystem to token service coupling may be removed, and use of {{getCanonicalService}} can be removed in a subsequent jira.
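The "merge, but do not replace" behaviour requested above can be illustrated with plain maps. This sketch deliberately does not use the Hadoop {{Credentials}} or {{TokenCache}} APIs; the map keys stand in for token service names, the values for tokens, and every name in it (TokenMergeSketch, mergeKeepingExisting, the sample services) is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class TokenMergeSketch {
    /**
     * Returns a copy of the existing credentials with entries from the binary
     * file added only for services that have no token yet.
     */
    static Map<String, String> mergeKeepingExisting(Map<String, String> credentials,
                                                    Map<String, String> fromBinaryFile) {
        Map<String, String> merged = new HashMap<>(credentials);
        for (Map.Entry<String, String> e : fromBinaryFile.entrySet()) {
            // putIfAbsent never overwrites a token that is already present.
            merged.putIfAbsent(e.getKey(), e.getValue());
        }
        return merged;
    }

    /** Builds a small example: one fresh token in memory, two tokens in the file. */
    static Map<String, String> demo() {
        Map<String, String> credentials = new HashMap<>();
        credentials.put("nn1:8020", "fresh-token");
        Map<String, String> fromFile = new HashMap<>();
        fromFile.put("nn1:8020", "stale-token");
        fromFile.put("nn2:8020", "file-token");
        return mergeKeepingExisting(credentials, fromFile);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the example, the fresh token for the first service survives and only the missing token for the second service is taken from the file.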
[jira] [Commented] (MAPREDUCE-2835) Make per-job counter limits configurable
[ https://issues.apache.org/jira/browse/MAPREDUCE-2835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208624#comment-13208624 ] Harsh J commented on MAPREDUCE-2835: I agree with Brian's reasoning. I understand that if you place a configuration and a newbie user/ops encounters it, they'll just raise it without second consideration. We need to address such best practices with docs/warn logs (logs can warn with hard numbers), rather than magic numbers that would not fit every one/every case. Make per-job counter limits configurable Key: MAPREDUCE-2835 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2835 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.20.204.0 Reporter: Tom White Assignee: Tom White Attachments: MAPREDUCE-2835.patch, MAPREDUCE-2835.patch The per-job counter limits introduced in MAPREDUCE-1943 are fixed, except for the total number allowed per job (mapreduce.job.counters.limit). It would be useful to make them all configurable. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2835) Make per-job counter limits configurable
[ https://issues.apache.org/jira/browse/MAPREDUCE-2835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208627#comment-13208627 ] Harsh J commented on MAPREDUCE-2835: Btw, we've already made it configurable in 0.23 via MAPREDUCE-901.
[jira] [Created] (MAPREDUCE-3863) 0.22 branch mvn deploy is not publishing hadoop-streaming JAR
0.22 branch mvn deploy is not publishing hadoop-streaming JAR - Key: MAPREDUCE-3863 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3863 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Affects Versions: 0.22.0, 0.22.1 Reporter: Alejandro Abdelnur Priority: Critical Without this JAR Oozie cannot be built/tested against 0.22 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-3864) Fix cluster setup docs for correct SNN HTTPS parameters
Fix cluster setup docs for correct SNN HTTPS parameters --- Key: MAPREDUCE-3864 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3864 Project: Hadoop Map/Reduce Issue Type: Improvement Components: documentation, security Affects Versions: 0.23.1 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Currently the docs reference dfs.namenode.secondary.https-address, which does not exist. Instead it should reference dfs.namenode.secondary.https-port (new name of dfs.secondary.https.port as of HDFS-2950) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3864) Fix cluster setup docs for correct SNN HTTPS parameters
[ https://issues.apache.org/jira/browse/MAPREDUCE-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated MAPREDUCE-3864: --- Target Version/s: 0.24.0, 0.23.2 (was: 0.23.2, 0.24.0) Status: Patch Available (was: Open) Fix cluster setup docs for correct SNN HTTPS parameters --- Key: MAPREDUCE-3864 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3864 Project: Hadoop Map/Reduce Issue Type: Improvement Components: documentation, security Affects Versions: 0.23.1 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Attachments: mr-3864.txt Currently the docs reference dfs.namenode.secondary.https-address, which does not exist. Instead it should reference dfs.namenode.secondary.https-port (new name of dfs.secondary.https.port as of HDFS-2950) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3864) Fix cluster setup docs for correct SNN HTTPS parameters
[ https://issues.apache.org/jira/browse/MAPREDUCE-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated MAPREDUCE-3864:
---
Attachment: mr-3864.txt

Three fixes:
- dfs.namenode.secondary.https-address is an invalid parameter - it's never referred to by the code. Rather, even with SSL, we use the hostname from dfs.namenode.secondary.http-address.
- dfs.secondary.https.port renamed to dfs.namenode.secondary.https-port per HDFS-2950
- changed the suggestion from 50090 (which overlaps with the http port and thus probably would not work) to 50490 (the default)
[jira] [Updated] (MAPREDUCE-3856) Instances of RunningJob class givs incorrect job tracking urls when mutiple jobs are submitted from same client jvm.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Payne updated MAPREDUCE-3856: -- Status: Patch Available (was: Open) Instances of RunningJob class givs incorrect job tracking urls when mutiple jobs are submitted from same client jvm. Key: MAPREDUCE-3856 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3856 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1, 0.23.2 Reporter: Eric Payne Assignee: Eric Payne Priority: Critical Attachments: MAPREDUCE-3856-1.txt When multiple jobs are submitted from the same client JVM, each call to RunningJob.getTrackingURL() always returns the tracking URL from the first job. This happens even if the jobs are submitted and the client waits for the job to complete before submitting the subsequent job. Each job runs fine and is definitely a new, unique job, but the call to getTrackingURL() still returns the URL for the first job. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3856) Instances of RunningJob class givs incorrect job tracking urls when mutiple jobs are submitted from same client jvm.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Payne updated MAPREDUCE-3856: -- Attachment: MAPREDUCE-3856-1.txt ClientServiceDelegate was not re-initializing notRunningJobs for every new JobID, even though the constructor for ClientServiceDelegate was being called for each new JobID.
[jira] [Commented] (MAPREDUCE-3856) Instances of RunningJob class givs incorrect job tracking urls when mutiple jobs are submitted from same client jvm.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208694#comment-13208694 ] Eric Payne commented on MAPREDUCE-3856: --- Patch applies the same to both branch-0.23 and trunk.
[jira] [Resolved] (MAPREDUCE-3861) Oozie job status couldn't be updated correctly after Pig job SUCCEEDED from hadoop.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John George resolved MAPREDUCE-3861. Resolution: Invalid This was because Pig needed to be recompiled after the counters compatibility change went in. Oozie job status couldn't be updated correctly after Pig job SUCCEEDED from hadoop. --- Key: MAPREDUCE-3861 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3861 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1, 0.23.2 Reporter: John George Assignee: John George Priority: Blocker When an Oozie job with a Pig action is submitted, Hadoop reports the job as SUCCEEDED and the output file is generated on HDFS, but the Oozie status is KILLED. Marcy Chen reported this issue while testing a Pig job through Oozie.
[jira] [Updated] (MAPREDUCE-3583) ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException
[ https://issues.apache.org/jira/browse/MAPREDUCE-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated MAPREDUCE-3583:
--
Attachment: mapreduce-3583-trunk-v3.txt

Noticed the exception thrown from checkPidPgrpidForMatch(285). Patch v3 adds pgrpId to the exception message.

ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException
-
Key: MAPREDUCE-3583
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3583
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.20.205.0
Environment: 64-bit Linux: asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux
Reporter: Zhihong Yu
Assignee: ramkrishna.s.vasudevan
Priority: Critical
Attachments: mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v3.txt, mapreduce-3583-trunk.txt, mapreduce-3583-v2.txt, mapreduce-3583-v3.txt, mapreduce-3583-v4.txt, mapreduce-3583-v5.txt, mapreduce-3583.txt

HBase PreCommit builds frequently gave us NumberFormatException. From https://builds.apache.org/job/PreCommit-HBASE-Build/553//testReport/org.apache.hadoop.hbase.mapreduce/TestHFileOutputFormat/testMRIncrementalLoad/:
{code}
2011-12-20 01:44:01,180 WARN [main] mapred.JobClient(784): No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
java.lang.NumberFormatException: For input string: 18446743988060683582
  at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
  at java.lang.Long.parseLong(Long.java:422)
  at java.lang.Long.parseLong(Long.java:468)
  at org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413)
  at org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148)
  at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401)
  at org.apache.hadoop.mapred.Task.initialize(Task.java:536)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353)
  at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:396)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
  at org.apache.hadoop.mapred.Child.main(Child.java:249)
{code}
From hadoop 0.20.205 source code, looks like ppid was 18446743988060683582, causing NFE:
{code}
// Set (name) (ppid) (pgrpId) (session) (utime) (stime) (vsize) (rss)
pinfo.updateProcessInfo(m.group(2), Integer.parseInt(m.group(3)),
{code}
You can find information on the OS at the beginning of https://builds.apache.org/job/PreCommit-HBASE-Build/553/console:
{code}
asf011.sp2.ygridcore.net
Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 20
file size (blocks, -f) unlimited
pending signals (-i) 16382
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 6
pipe size(512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 2048
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
6
Running in Jenkins mode
{code}
From Nicolas Sze:
{noformat}
It looks like the ppid is a 64-bit positive integer, but Java long is signed and so only works with 63-bit positive integers. In your case, 2^64 > 18446743988060683582 > 2^63. Therefore, there is an NFE.
{noformat}
I propose changing allProcessInfo to Map<String, ProcessInfo> so that we avoid this problem by not parsing large integers.
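The overflow is easy to confirm in isolation. The sketch below is only an illustration of why the value from the stack trace cannot be parsed as a signed long and what alternatives exist in the JDK (BigInteger, or the unsigned parsing helpers added in Java 8); it is not the patch attached to this issue, and PpidParseSketch and its methods are hypothetical names.

```java
import java.math.BigInteger;

public class PpidParseSketch {
    /** Returns true if the field parses as a signed 64-bit long. */
    static boolean fitsInLong(String field) {
        try {
            Long.parseLong(field);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** Parses a decimal field of any size without overflow. */
    static BigInteger parseAnySize(String field) {
        return new BigInteger(field);
    }

    /** Parses as unsigned 64-bit and renders it back to decimal (Java 8+). */
    static String roundTripUnsigned(String field) {
        return Long.toUnsignedString(Long.parseUnsignedLong(field));
    }

    public static void main(String[] args) {
        String field = "18446743988060683582"; // value from the stack trace above
        System.out.println("fits in long: " + fitsInLong(field));
        System.out.println("exceeds Long.MAX_VALUE: "
            + (parseAnySize(field).compareTo(BigInteger.valueOf(Long.MAX_VALUE)) > 0));
        System.out.println("unsigned round trip: " + roundTripUnsigned(field));
    }
}
```

Keeping the field as a String, as proposed in the issue, avoids the question of width entirely when the value is only used as a key.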
[jira] [Commented] (MAPREDUCE-3864) Fix cluster setup docs for correct SNN HTTPS parameters
[ https://issues.apache.org/jira/browse/MAPREDUCE-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208725#comment-13208725 ]

Hadoop QA commented on MAPREDUCE-3864:
--
+1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514673/mr-3864.txt against trunk revision .

+1 @author. The patch does not contain any @author tags.
+0 tests included. The patch appears to be a documentation patch that doesn't require tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in .
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1864//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1864//console

This message is automatically generated.
[jira] [Updated] (MAPREDUCE-2793) [MR-279] Maintain consistency in naming appIDs, jobIDs and attemptIDs
[ https://issues.apache.org/jira/browse/MAPREDUCE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bikas Saha updated MAPREDUCE-2793:
--
Status: Patch Available (was: Open)

[MR-279] Maintain consistency in naming appIDs, jobIDs and attemptIDs
--
Key: MAPREDUCE-2793
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2793
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.0
Reporter: Ramya Sunil
Assignee: Bikas Saha
Priority: Critical
Fix For: 0.23.2
Attachments: MAPREDUCE-2793-branch-0.23.patch, MAPREDUCE-2793.patch

appIDs, jobIDs and attempt/container ids are not consistently named in the logs, console and UI. For consistency purpose, they all have to follow a common naming convention. Currently:

For appID:
- On the RM UI: app_1308259676864_5
- On the JHS UI: No appID
- Console/logs: No appID
- mapred-local dirs are named as: application_1308259676864_0005

For jobID:
- On the RM UI: job_1308259676864_5_5
- JHS UI: job_1308259676864_5_5
- Console/logs: job_1308259676864_0005
- mapred-local dirs are named as: No jobID

For attemptID:
- On the RM UI: attempt_1308259676864_5_5_m_24_0
- JHS: attempt_1308259676864_5_5_m_24_0
- Console/logs: attempt_1308259676864_0005_m_24_0
- mapred-local dirs are named as: container_1308259676864_0005_24
[jira] [Commented] (MAPREDUCE-3856) Instances of RunningJob class givs incorrect job tracking urls when mutiple jobs are submitted from same client jvm.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208756#comment-13208756 ]

Hadoop QA commented on MAPREDUCE-3856:
--
+1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514685/MAPREDUCE-3856-1.txt against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in .
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1865//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1865//console

This message is automatically generated.
[jira] [Commented] (MAPREDUCE-2793) [MR-279] Maintain consistency in naming appIDs, jobIDs and attemptIDs
[ https://issues.apache.org/jira/browse/MAPREDUCE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208760#comment-13208760 ]

Hadoop QA commented on MAPREDUCE-2793:
--
-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514593/MAPREDUCE-2793-branch-0.23.patch against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
-1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
  org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesAttempts
  org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesJobs
  org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesTasks
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1867//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1867//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1867//console

This message is automatically generated.
[jira] [Commented] (MAPREDUCE-3583) ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException
[ https://issues.apache.org/jira/browse/MAPREDUCE-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208775#comment-13208775 ] Hadoop QA commented on MAPREDUCE-3583: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514689/mapreduce-3583-trunk-v3.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1866//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1866//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1866//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1866//console This message is automatically generated. 
ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException - Key: MAPREDUCE-3583 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3583 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.205.0 Environment: 64-bit Linux: asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux Reporter: Zhihong Yu Assignee: ramkrishna.s.vasudevan Priority: Critical Attachments: mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v3.txt, mapreduce-3583-trunk.txt, mapreduce-3583-v2.txt, mapreduce-3583-v3.txt, mapreduce-3583-v4.txt, mapreduce-3583-v5.txt, mapreduce-3583.txt HBase PreCommit builds frequently gave us NumberFormatException. From https://builds.apache.org/job/PreCommit-HBASE-Build/553//testReport/org.apache.hadoop.hbase.mapreduce/TestHFileOutputFormat/testMRIncrementalLoad/: {code} 2011-12-20 01:44:01,180 WARN [main] mapred.JobClient(784): No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String). 
java.lang.NumberFormatException: For input string: 18446743988060683582 at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) at java.lang.Long.parseLong(Long.java:422) at java.lang.Long.parseLong(Long.java:468) at org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413) at org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148) at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401) at org.apache.hadoop.mapred.Task.initialize(Task.java:536) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083) at org.apache.hadoop.mapred.Child.main(Child.java:249) {code} From hadoop 0.20.205 source code, looks like ppid was 18446743988060683582, causing NFE: {code} // Set (name) (ppid) (pgrpId) (session) (utime) (stime) (vsize) (rss) pinfo.updateProcessInfo(m.group(2), Integer.parseInt(m.group(3)), {code} You can find information on the OS at the beginning of https://builds.apache.org/job/PreCommit-HBASE-Build/553/console: {code} asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 20 file size (blocks, -f) unlimited pending signals (-i) 16382 max locked memory
[jira] [Updated] (MAPREDUCE-3583) ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException
[ https://issues.apache.org/jira/browse/MAPREDUCE-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated MAPREDUCE-3583: -- Attachment: mapreduce-3583-trunk-v4.txt Patch v4 for TRUNK should pass. ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException - Key: MAPREDUCE-3583 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3583 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.205.0 Environment: 64-bit Linux: asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux Reporter: Zhihong Yu Assignee: ramkrishna.s.vasudevan Priority: Critical Attachments: mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v3.txt, mapreduce-3583-trunk-v4.txt, mapreduce-3583-trunk.txt, mapreduce-3583-v2.txt, mapreduce-3583-v3.txt, mapreduce-3583-v4.txt, mapreduce-3583-v5.txt, mapreduce-3583.txt HBase PreCommit builds frequently gave us NumberFormatException. From https://builds.apache.org/job/PreCommit-HBASE-Build/553//testReport/org.apache.hadoop.hbase.mapreduce/TestHFileOutputFormat/testMRIncrementalLoad/: {code} 2011-12-20 01:44:01,180 WARN [main] mapred.JobClient(784): No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String). 
java.lang.NumberFormatException: For input string: 18446743988060683582 at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) at java.lang.Long.parseLong(Long.java:422) at java.lang.Long.parseLong(Long.java:468) at org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413) at org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148) at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401) at org.apache.hadoop.mapred.Task.initialize(Task.java:536) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083) at org.apache.hadoop.mapred.Child.main(Child.java:249) {code} From hadoop 0.20.205 source code, looks like ppid was 18446743988060683582, causing NFE: {code} // Set (name) (ppid) (pgrpId) (session) (utime) (stime) (vsize) (rss) pinfo.updateProcessInfo(m.group(2), Integer.parseInt(m.group(3)), {code} You can find information on the OS at the beginning of https://builds.apache.org/job/PreCommit-HBASE-Build/553/console: {code} asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 20 file size (blocks, -f) unlimited pending signals (-i) 16382 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 6 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 2048 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited 6 Running 
in Jenkins mode {code} From Nicolas Sze: {noformat} It looks like that the ppid is a 64-bit positive integer but Java long is signed and so only works with 63-bit positive integers. In your case, 2^64 > 18446743988060683582 > 2^63. Therefore, there is a NFE. {noformat} I propose changing allProcessInfo to Map<String, ProcessInfo> so that we don't encounter this problem by avoiding parsing large integer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
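The range argument in the quoted explanation can be checked directly. The sketch below is an illustration only, not code from any of the attached patches. It shows that Long.parseLong rejects the reported value, that the value lies strictly between 2^63 and 2^64, and that a String-keyed map (the direction proposed above) never needs to parse it. The class name, method names and sample pid are hypothetical.

```java
import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical demo class; not part of Hadoop. */
final class LargePidDemo {
    // The value reported in the stack trace.
    static final String RAW = "18446743988060683582";

    /** True if the decimal string parses as a signed 64-bit long. */
    static boolean fitsInSignedLong(String decimal) {
        try {
            Long.parseLong(decimal);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** True if 2^63 < value < 2^64, i.e. it needs the 64th bit. */
    static boolean needsUnsigned64(String decimal) {
        BigInteger v = new BigInteger(decimal);
        BigInteger twoPow63 = BigInteger.ONE.shiftLeft(63);
        BigInteger twoPow64 = BigInteger.ONE.shiftLeft(64);
        return v.compareTo(twoPow63) > 0 && v.compareTo(twoPow64) < 0;
    }

    public static void main(String[] args) {
        System.out.println("fits in long: " + fitsInSignedLong(RAW));
        System.out.println("between 2^63 and 2^64: " + needsUnsigned64(RAW));

        // Keeping ids as strings sidesteps numeric parsing altogether.
        Map<String, String> parentByPid = new HashMap<>();
        parentByPid.put("1234", RAW); // "1234" is a made-up pid
        System.out.println("parent of 1234: " + parentByPid.get("1234"));
    }
}
```

Treating the ids as opaque strings trades numeric comparison for robustness, which seems acceptable when the ids are only used as lookup keys.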
[jira] [Updated] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads
[ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-3862: -- Attachment: MAPREDUCE-3862.patch Patch to call setExecuteExistingDelayedTasksAfterShutdownPolicy() on init and fallback to shutdownNow() if ScheduledThreadPoolExecutor.awaitTermination() fails. Nodemanager can appear to hang on shutdown due to lingering DeletionService threads --- Key: MAPREDUCE-3862 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3862 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, nodemanager Affects Versions: 0.23.1 Reporter: Jason Lowe Attachments: MAPREDUCE-3862.patch When a nodemanager attempts to shutdown cleanly, it's possible for it to appear to hang due to lingering DeletionService threads. This can occur when yarn.nodemanager.delete.debug-delay-sec is set to a relatively large value and one or more containers executes on the node shortly before the shutdown. The DeletionService is never calling {{setExecuteExistingDelayedTasksAfterShutdownPolicy()}} on the ScheduledThreadPoolExecutor, and it defaults to waiting for all scheduled tasks to complete before exiting. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
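The behaviour described above can be reproduced outside Hadoop with a bare ScheduledThreadPoolExecutor. The following is a minimal sketch, not the attached patch. The class name, the 60-second delay and the 1-second wait are assumptions chosen for illustration, and passing false to the policy setter is assumed to be what the patch intends.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo class; not the Hadoop DeletionService. */
final class DelayedTaskShutdownDemo {
    /**
     * Schedules one long-delayed task, shuts the pool down, and reports
     * whether it terminated within one second.
     */
    static boolean shutdownQuickly(boolean runDelayedTasksAfterShutdown) {
        ScheduledThreadPoolExecutor sched = new ScheduledThreadPoolExecutor(1);
        // The JDK default is true: delayed tasks still run after shutdown().
        sched.setExecuteExistingDelayedTasksAfterShutdownPolicy(runDelayedTasksAfterShutdown);
        sched.schedule(() -> { }, 60, TimeUnit.SECONDS); // stands in for a delayed deletion

        sched.shutdown();
        boolean terminated;
        try {
            terminated = sched.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            terminated = false;
        }
        if (!terminated) {
            sched.shutdownNow(); // fallback: drop queued tasks so the worker thread can exit
        }
        return terminated;
    }

    public static void main(String[] args) {
        System.out.println("default policy terminated in time: " + shutdownQuickly(true));
        System.out.println("policy=false terminated in time: " + shutdownQuickly(false));
    }
}
```

With the default policy the pool keeps its worker thread alive until the delayed task fires, which matches the lingering threads described in the report. With the policy set to false, shutdown() cancels the pending task and the pool can terminate promptly.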
[jira] [Updated] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads
[ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-3862: -- Assignee: Jason Lowe Target Version/s: 0.24.0, 0.23.2 Status: Patch Available (was: Open)
[jira] [Commented] (MAPREDUCE-3746) Nodemanagers are not automatically shut down after decommissioning
[ https://issues.apache.org/jira/browse/MAPREDUCE-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208801#comment-13208801 ] Jason Lowe commented on MAPREDUCE-3746: --- If the jstack output shows DeletionService threads hanging around and {{yarn.nodemanager.delete.debug-delay-sec}} has been set to a relatively large value then this is a dup of MAPREDUCE-3862. Nodemanagers are not automatically shut down after decommissioning -- Key: MAPREDUCE-3746 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3746 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: Ramya Sunil Assignee: Devaraj K Priority: Critical Fix For: 0.23.1 Nodemanagers are not automatically shutdown after decommissioning. MAPREDUCE-2775 does not seem to fix the issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads
[ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208806#comment-13208806 ] Hadoop QA commented on MAPREDUCE-3862: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514708/MAPREDUCE-3862.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.yarn.server.nodemanager.TestDeletionService +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1869//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1869//console This message is automatically generated.
[jira] [Updated] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads
[ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-3862: -- Target Version/s: 0.24.0, 0.23.2 (was: 0.23.2, 0.24.0) Status: Open (was: Patch Available) Canceling patch to investigate test failures.
[jira] [Commented] (MAPREDUCE-3583) ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException
[ https://issues.apache.org/jira/browse/MAPREDUCE-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208815#comment-13208815 ] Hadoop QA commented on MAPREDUCE-3583: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514704/mapreduce-3583-trunk-v4.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1868//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1868//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1868//console This message is automatically generated. 
[jira] [Commented] (MAPREDUCE-3583) ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException
[ https://issues.apache.org/jira/browse/MAPREDUCE-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208817#comment-13208817 ] Zhihong Yu commented on MAPREDUCE-3583: --- @Matt, @Mahadev, @Nicolas: Can you take another look ? Patches for hadoop 1.0 and TRUNK are good to go.
[jira] [Commented] (MAPREDUCE-3798) TestJobCleanup testCustomCleanup is failing
[ https://issues.apache.org/jira/browse/MAPREDUCE-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208824#comment-13208824 ] Ravi Prakash commented on MAPREDUCE-3798: - The single javac warning is because I am using MiniMRCluster which is now deprecated in MRv2. The test file I migrated had been using MRv1 in which MiniMRCluster had not been deprecated (hence the increment in warnings I guess). I tried different things with @SuppressWarnings annotation but wasn't able to get rid of the one warning. The core test failure has nothing to do with this patch (I am not changing any src/main code) Can someone please commit this in 0.23.1, 0.23 and trunk? TestJobCleanup testCustomCleanup is failing --- Key: MAPREDUCE-3798 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3798 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 0.23.0, 0.23.1 Reporter: Ravi Prakash Assignee: Ravi Prakash Labels: test Attachments: MAPREDUCE-3798.patch, MAPREDUCE-3798.patch File somepath/hadoop-mapreduce-project/build/test/data/test-job-cleanup/output-8/_custom_cleanup missing for job job_20120203035807432_0009 junit.framework.AssertionFailedError: File somepath/hadoop-mapreduce-project/build/test/data/test-job-cleanup/output-8/_custom_cleanup missing for job job_20120203035807432_0009 at org.apache.hadoop.mapred.TestJobCleanup.testKilledJob(TestJobCleanup.java:228) at org.apache.hadoop.mapred.TestJobCleanup.testCustomCleanup(TestJobCleanup.java:302) at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24) at junit.extensions.TestSetup$1.protect(TestSetup.java:23) at junit.extensions.TestSetup.run(TestSetup.java:27) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3798) TestJobCleanup testCustomCleanup is failing
[ https://issues.apache.org/jira/browse/MAPREDUCE-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated MAPREDUCE-3798: Target Version/s: 0.23.1, 0.24.0, 0.23.2 (was: 0.23.1) Affects Version/s: 0.23.1
[jira] [Updated] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads
[ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-3862: -- Target Version/s: 0.24.0, 0.23.2 (was: 0.23.2, 0.24.0) Status: Patch Available (was: Open)
[jira] [Updated] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads
[ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-3862: -- Attachment: MAPREDUCE-3862.patch Patch updated for unit test failures.
[jira] [Commented] (MAPREDUCE-3583) ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException
[ https://issues.apache.org/jira/browse/MAPREDUCE-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208890#comment-13208890 ] Tsz Wo (Nicholas), SZE commented on MAPREDUCE-3583: --- Hi Zhihong, thanks for all the hard works! There is a [findbugs warning|https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1868/artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html] in the last build. Could you take a look? ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException - Key: MAPREDUCE-3583 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3583 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.205.0 Environment: 64-bit Linux: asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux Reporter: Zhihong Yu Assignee: ramkrishna.s.vasudevan Priority: Critical Attachments: mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v3.txt, mapreduce-3583-trunk-v4.txt, mapreduce-3583-trunk.txt, mapreduce-3583-v2.txt, mapreduce-3583-v3.txt, mapreduce-3583-v4.txt, mapreduce-3583-v5.txt, mapreduce-3583.txt HBase PreCommit builds frequently gave us NumberFormatException. From https://builds.apache.org/job/PreCommit-HBASE-Build/553//testReport/org.apache.hadoop.hbase.mapreduce/TestHFileOutputFormat/testMRIncrementalLoad/: {code} 2011-12-20 01:44:01,180 WARN [main] mapred.JobClient(784): No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String). 
java.lang.NumberFormatException: For input string: 18446743988060683582 at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) at java.lang.Long.parseLong(Long.java:422) at java.lang.Long.parseLong(Long.java:468) at org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413) at org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148) at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401) at org.apache.hadoop.mapred.Task.initialize(Task.java:536) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083) at org.apache.hadoop.mapred.Child.main(Child.java:249) {code} From hadoop 0.20.205 source code, looks like ppid was 18446743988060683582, causing NFE: {code} // Set (name) (ppid) (pgrpId) (session) (utime) (stime) (vsize) (rss) pinfo.updateProcessInfo(m.group(2), Integer.parseInt(m.group(3)), {code} You can find information on the OS at the beginning of https://builds.apache.org/job/PreCommit-HBASE-Build/553/console: {code} asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 20 file size (blocks, -f) unlimited pending signals (-i) 16382 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 6 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 2048 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited 6 Running 
in Jenkins mode {code} From Nicolas Sze: {noformat} It looks like the ppid is a 64-bit positive integer, but a Java long is signed and so only works with 63-bit positive integers. In your case, 2^64 > 18446743988060683582 > 2^63. Therefore, there is an NFE. {noformat} I propose changing allProcessInfo to Map<String, ProcessInfo> so that we avoid parsing the large integer and don't encounter this problem.
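The range argument quoted above can be checked directly. This is a small standalone sketch: the class and method names are hypothetical, and Long.parseUnsignedLong requires Java 8 or later, so it was not an option for the Hadoop code discussed here.

```java
import java.math.BigInteger;

class PpidRangeSketch {

    // The value from the stack trace in this issue.
    static final String REPORTED_VALUE = "18446743988060683582";

    /** True if Long.parseLong accepts the decimal string. */
    static boolean parsesAsSignedLong(String decimal) {
        try {
            Long.parseLong(decimal);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** True if 2^63 < value < 2^64, i.e. it needs the 64th bit. */
    static boolean isBetweenTwo63AndTwo64(String decimal) {
        BigInteger value = new BigInteger(decimal);
        BigInteger two63 = BigInteger.ONE.shiftLeft(63);
        BigInteger two64 = BigInteger.ONE.shiftLeft(64);
        return value.compareTo(two63) > 0 && value.compareTo(two64) < 0;
    }

    /** Reinterprets the unsigned decimal as the same 64 bits in a signed long. */
    static long asSignedBits(String decimal) {
        return Long.parseUnsignedLong(decimal);
    }

    public static void main(String[] args) {
        System.out.println("parses as signed long: " + parsesAsSignedLong(REPORTED_VALUE));
        System.out.println("between 2^63 and 2^64: " + isBetweenTwo63AndTwo64(REPORTED_VALUE));
        System.out.println("same bits as signed:   " + asSignedBits(REPORTED_VALUE));
    }
}
```

Keeping the field as a String, as proposed in the issue, sidesteps the question of which numeric type is wide enough.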
[jira] [Updated] (MAPREDUCE-3824) Distributed caches are not removed properly
[ https://issues.apache.org/jira/browse/MAPREDUCE-3824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-3824: - Attachment: MAPREDUCE-3824-branch-1.0.patch Patch changes: * refcount is now an AtomicInteger * private caches properly update sizes * added a test for sizes Distributed caches are not removed properly --- Key: MAPREDUCE-3824 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3824 Project: Hadoop Map/Reduce Issue Type: Bug Components: distributed-cache Affects Versions: 1.0.0 Reporter: Allen Wittenauer Assignee: Thomas Graves Priority: Critical Attachments: MAPREDUCE-3824-branch-1.0.patch, MAPREDUCE-3824-branch-1.0.txt Distributed caches are not being properly removed by the TaskTracker when they are expected to be expired.
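To illustrate why an atomic reference count matters for cache cleanup, here is a hedged sketch of a reference-counted cache entry. The class CacheEntrySketch and its methods are hypothetical and are not taken from the patch; the sketch only assumes that an entry may be deleted once no task holds a reference to it.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CacheEntrySketch {

    // Number of tasks currently using this localized cache entry.
    private final AtomicInteger refcount = new AtomicInteger();
    private final long sizeBytes;

    CacheEntrySketch(long sizeBytes) {
        this.sizeBytes = sizeBytes;
    }

    /** Registers one more user and returns the new count. */
    int retain() {
        return refcount.incrementAndGet();
    }

    /** Drops one user; never lets the count go below zero. */
    int release() {
        while (true) {
            int current = refcount.get();
            if (current == 0) {
                throw new IllegalStateException("release without matching retain");
            }
            if (refcount.compareAndSet(current, current - 1)) {
                return current - 1;
            }
        }
    }

    /** An entry is a deletion candidate only when nobody references it. */
    boolean isEvictable() {
        return refcount.get() == 0;
    }

    long sizeBytes() {
        return sizeBytes;
    }
}
```

The compare-and-set loop keeps concurrent retain and release calls from losing updates, which a plain int field incremented without synchronization would not guarantee.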
[jira] [Updated] (MAPREDUCE-3824) Distributed caches are not removed properly
[ https://issues.apache.org/jira/browse/MAPREDUCE-3824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-3824: - Target Version/s: 1.0.1 Status: Patch Available (was: Open) Distributed caches are not removed properly --- Key: MAPREDUCE-3824 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3824 Project: Hadoop Map/Reduce Issue Type: Bug Components: distributed-cache Affects Versions: 1.0.0 Reporter: Allen Wittenauer Assignee: Thomas Graves Priority: Critical Attachments: MAPREDUCE-3824-branch-1.0.patch, MAPREDUCE-3824-branch-1.0.txt Distributed caches are not being properly removed by the TaskTracker when they are expected to be expired. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3348) mapred job -status fails to give info even if the job is present in History
[ https://issues.apache.org/jira/browse/MAPREDUCE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated MAPREDUCE-3348: --- Fix Version/s: 0.23.2 Affects Version/s: (was: 0.24.0) 0.23.0 Status: Open (was: Patch Available) I don't like the fact that we have to rely on ApplicationReport being null, but can live with it for now. We should add different exceptions like NotFoundException, AccessControlException which we can then use for app, queue, node etc. Looks good otherwise. A couple of minor nits: - Please add a comment to ClientRMService.getApplicationReport() where we return a null report. Also, clarify the same thing by adding a javadoc for this method. - Can you also change the two log statements "Could not get Job info from RM for job ..." in ClientServiceDelegate to INFO? mapred job -status fails to give info even if the job is present in History --- Key: MAPREDUCE-3348 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3348 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Devaraj K Assignee: Devaraj K Fix For: 0.23.2 Attachments: MAPREDUCE-3348.patch The client tries to get the app report for the job from the RM. The RM throws an exception when it doesn't find the application, and the client then reports that same exception without trying the History Server. {code}
11/11/03 08:47:27 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.mapreduce.v2.api.MRClientProtocol
11/11/03 08:47:28 WARN mapred.ClientServiceDelegate: Exception thrown by remote end.
RemoteTrace:
 at LocalTrace:
    org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: Trying to get information for an absent application application_1320278804241_0002
    at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:142)
    at $Proxy6.getApplicationReport(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ClientRMProtocolPBClientImpl.getApplicationReport(ClientRMProtocolPBClientImpl.java:111)
    at org.apache.hadoop.mapred.ResourceMgrDelegate.getApplicationReport(ResourceMgrDelegate.java:321)
    at org.apache.hadoop.mapred.ClientServiceDelegate.getProxy(ClientServiceDelegate.java:137)
    at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:273)
    at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:353)
    at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:429)
    at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:186)
    at org.apache.hadoop.mapreduce.tools.CLI.run(CLI.java:240)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:83)
    at org.apache.hadoop.mapred.JobClient.main(JobClient.java:1106)
Exception in thread "main" RemoteTrace:
 at Local Trace:
    org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: Trying to get information for an absent application application_1320278804241_0002
    at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:142)
    at $Proxy6.getApplicationReport(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ClientRMProtocolPBClientImpl.getApplicationReport(ClientRMProtocolPBClientImpl.java:111)
    at org.apache.hadoop.mapred.ResourceMgrDelegate.getApplicationReport(ResourceMgrDelegate.java:321)
    at org.apache.hadoop.mapred.ClientServiceDelegate.getProxy(ClientServiceDelegate.java:137)
    at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:273)
    at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:353)
    at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:429)
    at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:186)
    at org.apache.hadoop.mapreduce.tools.CLI.run(CLI.java:240)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
    at
[jira] [Created] (MAPREDUCE-3865) RM should throw different exceptions for while querying app/node/queue
RM should throw different exceptions for while querying app/node/queue -- Key: MAPREDUCE-3865 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3865 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, resourcemanager Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli We should distinguish the exceptions for absent app/node/queue, illegally accessed app/node/queue etc. Today everything is a {{YarnRemoteException}}. We should extend {{YarnRemoteException}} to add {{NotFoundException}}, {{AccessControlException}} etc. Today, {{AccessControlException}} exists but not as part of the protocol descriptions (i.e. only available to Java). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
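As a rough illustration of what distinct exception types would let a caller do, consider the sketch below. All types in it are hypothetical stand-ins defined locally: QueryException plays the role of the common base (the issue names YarnRemoteException for that), and the nested NotFoundException and AccessControlException are not the real Hadoop or JDK classes.

```java
class QueryExceptionSketch {

    /** Hypothetical common base, standing in for the remote exception type. */
    static class QueryException extends Exception {
        QueryException(String message) {
            super(message);
        }
    }

    /** Hypothetical: the queried app, node, or queue does not exist. */
    static class NotFoundException extends QueryException {
        NotFoundException(String message) {
            super(message);
        }
    }

    /** Hypothetical: the caller may not see the queried entity. */
    static class AccessControlException extends QueryException {
        AccessControlException(String message) {
            super(message);
        }
    }

    /** Toy lookup that signals each failure with its own type. */
    static String getReport(String appId, boolean known, boolean permitted)
            throws QueryException {
        if (!known) {
            throw new NotFoundException("absent application " + appId);
        }
        if (!permitted) {
            throw new AccessControlException("access denied for " + appId);
        }
        return "report for " + appId;
    }

    /** The caller can branch on the failure kind instead of parsing messages. */
    static String describe(String appId, boolean known, boolean permitted) {
        try {
            return getReport(appId, known, permitted);
        } catch (NotFoundException e) {
            return "NOT_FOUND";
        } catch (AccessControlException e) {
            return "ACCESS_DENIED";
        } catch (QueryException e) {
            return "ERROR";
        }
    }
}
```

The issue also asks for these types to be part of the protocol descriptions so that non-Java clients can see them; a plain Java hierarchy like this one does not address that part.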
[jira] [Commented] (MAPREDUCE-3348) mapred job -status fails to give info even if the job is present in History
[ https://issues.apache.org/jira/browse/MAPREDUCE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208905#comment-13208905 ] Vinod Kumar Vavilapalli commented on MAPREDUCE-3348: Created MAPREDUCE-3865 for throwing different exceptions. mapred job -status fails to give info even if the job is present in History --- Key: MAPREDUCE-3348 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3348 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Devaraj K Assignee: Devaraj K Fix For: 0.23.2 Attachments: MAPREDUCE-3348.patch The client tries to get the app report for the job from the RM. The RM throws an exception when it doesn't find the application, and the client then reports that same exception without trying the History Server. {code}
11/11/03 08:47:27 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.mapreduce.v2.api.MRClientProtocol
11/11/03 08:47:28 WARN mapred.ClientServiceDelegate: Exception thrown by remote end.
RemoteTrace:
 at LocalTrace:
    org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: Trying to get information for an absent application application_1320278804241_0002
    at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:142)
    at $Proxy6.getApplicationReport(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ClientRMProtocolPBClientImpl.getApplicationReport(ClientRMProtocolPBClientImpl.java:111)
    at org.apache.hadoop.mapred.ResourceMgrDelegate.getApplicationReport(ResourceMgrDelegate.java:321)
    at org.apache.hadoop.mapred.ClientServiceDelegate.getProxy(ClientServiceDelegate.java:137)
    at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:273)
    at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:353)
    at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:429)
    at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:186)
    at org.apache.hadoop.mapreduce.tools.CLI.run(CLI.java:240)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:83)
    at org.apache.hadoop.mapred.JobClient.main(JobClient.java:1106)
Exception in thread "main" RemoteTrace:
 at Local Trace:
    org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: Trying to get information for an absent application application_1320278804241_0002
    at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:142)
    at $Proxy6.getApplicationReport(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ClientRMProtocolPBClientImpl.getApplicationReport(ClientRMProtocolPBClientImpl.java:111)
    at org.apache.hadoop.mapred.ResourceMgrDelegate.getApplicationReport(ResourceMgrDelegate.java:321)
    at org.apache.hadoop.mapred.ClientServiceDelegate.getProxy(ClientServiceDelegate.java:137)
    at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:273)
    at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:353)
    at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:429)
    at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:186)
    at org.apache.hadoop.mapreduce.tools.CLI.run(CLI.java:240)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:83)
    at org.apache.hadoop.mapred.JobClient.main(JobClient.java:1106)
{code}
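The behaviour this issue asks for (consult the History Server when the RM no longer knows the application) might be sketched as follows. This is an assumption-laden illustration, not the attached patch: the two lookups are modelled as plain functions from a job ID to a report string, and a null result stands for "not found", mirroring the null ApplicationReport convention mentioned in the review.

```java
import java.util.function.Function;

class JobStatusFallbackSketch {

    /**
     * Asks the resource manager first; if it has no report (null),
     * falls back to the history server. May still return null if
     * neither source knows the job.
     */
    static String findReport(String jobId,
                             Function<String, String> resourceManager,
                             Function<String, String> historyServer) {
        String report = resourceManager.apply(jobId);
        if (report != null) {
            return report;
        }
        return historyServer.apply(jobId);
    }

    public static void main(String[] args) {
        // Hypothetical lookups: the RM has forgotten the job, history has it.
        Function<String, String> rm = id -> null;
        Function<String, String> history = id -> "history:" + id;
        System.out.println(findReport("job_1320278804241_0002", rm, history));
    }
}
```

Using null as the "absent" signal is the part the reviewer was uneasy about; a dedicated exception type, as proposed in MAPREDUCE-3865, would make the fallback condition explicit.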
[jira] [Updated] (MAPREDUCE-3583) ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException
[ https://issues.apache.org/jira/browse/MAPREDUCE-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated MAPREDUCE-3583: -- Attachment: mapreduce-3583-trunk-v5.txt Patch v5 for TRUNK fixes the warning. ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException - Key: MAPREDUCE-3583 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3583 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.205.0 Environment: 64-bit Linux: asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux Reporter: Zhihong Yu Assignee: ramkrishna.s.vasudevan Priority: Critical Attachments: mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v3.txt, mapreduce-3583-trunk-v4.txt, mapreduce-3583-trunk-v5.txt, mapreduce-3583-trunk.txt, mapreduce-3583-v2.txt, mapreduce-3583-v3.txt, mapreduce-3583-v4.txt, mapreduce-3583-v5.txt, mapreduce-3583.txt HBase PreCommit builds frequently gave us NumberFormatException. From https://builds.apache.org/job/PreCommit-HBASE-Build/553//testReport/org.apache.hadoop.hbase.mapreduce/TestHFileOutputFormat/testMRIncrementalLoad/: {code} 2011-12-20 01:44:01,180 WARN [main] mapred.JobClient(784): No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String). 
java.lang.NumberFormatException: For input string: 18446743988060683582 at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) at java.lang.Long.parseLong(Long.java:422) at java.lang.Long.parseLong(Long.java:468) at org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413) at org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148) at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401) at org.apache.hadoop.mapred.Task.initialize(Task.java:536) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083) at org.apache.hadoop.mapred.Child.main(Child.java:249) {code} From hadoop 0.20.205 source code, looks like ppid was 18446743988060683582, causing NFE: {code} // Set (name) (ppid) (pgrpId) (session) (utime) (stime) (vsize) (rss) pinfo.updateProcessInfo(m.group(2), Integer.parseInt(m.group(3)), {code} You can find information on the OS at the beginning of https://builds.apache.org/job/PreCommit-HBASE-Build/553/console: {code} asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 20 file size (blocks, -f) unlimited pending signals (-i) 16382 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 6 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 2048 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited 6 Running 
in Jenkins mode {code} From Nicolas Sze: {noformat} It looks like the ppid is a 64-bit positive integer, but a Java long is signed and so only works with 63-bit positive integers. In your case, 2^64 > 18446743988060683582 > 2^63. Therefore, there is an NFE. {noformat} I propose changing allProcessInfo to Map<String, ProcessInfo> so that we avoid parsing the large integer and don't encounter this problem.
[jira] [Commented] (MAPREDUCE-3824) Distributed caches are not removed properly
[ https://issues.apache.org/jira/browse/MAPREDUCE-3824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208908#comment-13208908 ] Hadoop QA commented on MAPREDUCE-3824: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514727/MAPREDUCE-3824-branch-1.0.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1871//console This message is automatically generated. Distributed caches are not removed properly --- Key: MAPREDUCE-3824 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3824 Project: Hadoop Map/Reduce Issue Type: Bug Components: distributed-cache Affects Versions: 1.0.0 Reporter: Allen Wittenauer Assignee: Thomas Graves Priority: Critical Attachments: MAPREDUCE-3824-branch-1.0.patch, MAPREDUCE-3824-branch-1.0.txt Distributed caches are not being properly removed by the TaskTracker when they are expected to be expired. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3824) Distributed caches are not removed properly
[ https://issues.apache.org/jira/browse/MAPREDUCE-3824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208923#comment-13208923 ] Thomas Graves commented on MAPREDUCE-3824: -- I did a bunch of manual testing. I tested private and public version of files and directories and also private and public archives going into the distributed cache and verified that the sizes were set properly. If there are any other cases I missed please let me know. Distributed caches are not removed properly --- Key: MAPREDUCE-3824 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3824 Project: Hadoop Map/Reduce Issue Type: Bug Components: distributed-cache Affects Versions: 1.0.0 Reporter: Allen Wittenauer Assignee: Thomas Graves Priority: Critical Attachments: MAPREDUCE-3824-branch-1.0.patch, MAPREDUCE-3824-branch-1.0.txt Distributed caches are not being properly removed by the TaskTracker when they are expected to be expired. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3824) Distributed caches are not removed properly
[ https://issues.apache.org/jira/browse/MAPREDUCE-3824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208927#comment-13208927 ] Allen Wittenauer commented on MAPREDUCE-3824: - It would be good to add some debug statements. When stuff does break (related or not), ops teams end up being completely blind. (This is a problem with many many many other parts of Hadoop!) Distributed caches are not removed properly --- Key: MAPREDUCE-3824 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3824 Project: Hadoop Map/Reduce Issue Type: Bug Components: distributed-cache Affects Versions: 1.0.0 Reporter: Allen Wittenauer Assignee: Thomas Graves Priority: Critical Attachments: MAPREDUCE-3824-branch-1.0.patch, MAPREDUCE-3824-branch-1.0.txt Distributed caches are not being properly removed by the TaskTracker when they are expected to be expired. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-3866) bin/yarn prints the command line unnecessarily
bin/yarn prints the command line unnecessarily -- Key: MAPREDUCE-3866 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3866 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Priority: Minor Fix For: 0.23.2 For commands like rmadmin, version etc, it also prints the whole command line unnecessarily. This was /me from long time ago, pre alpha :) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3866) bin/yarn prints the command line unnecessarily
[ https://issues.apache.org/jira/browse/MAPREDUCE-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated MAPREDUCE-3866: --- Status: Patch Available (was: Open) bin/yarn prints the command line unnecessarily -- Key: MAPREDUCE-3866 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3866 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Priority: Minor Fix For: 0.23.2 Attachments: MAPREDUCE-3866.txt For commands like rmadmin, version etc, it also prints the whole command line unnecessarily. This was /me from long time ago, pre alpha :) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3866) bin/yarn prints the command line unnecessarily
[ https://issues.apache.org/jira/browse/MAPREDUCE-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated MAPREDUCE-3866: --- Attachment: MAPREDUCE-3866.txt Trivial patch. bin/yarn prints the command line unnecessarily -- Key: MAPREDUCE-3866 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3866 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Priority: Minor Fix For: 0.23.2 Attachments: MAPREDUCE-3866.txt For commands like rmadmin, version etc, it also prints the whole command line unnecessarily. This was /me from long time ago, pre alpha :) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3583) ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException
[ https://issues.apache.org/jira/browse/MAPREDUCE-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208949#comment-13208949 ] Hadoop QA commented on MAPREDUCE-3583: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514728/mapreduce-3583-trunk-v5.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.TestContainersMonitor +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1872//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1872//console This message is automatically generated. 
ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException - Key: MAPREDUCE-3583 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3583 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.205.0 Environment: 64-bit Linux: asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux Reporter: Zhihong Yu Assignee: ramkrishna.s.vasudevan Priority: Critical Attachments: mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v3.txt, mapreduce-3583-trunk-v4.txt, mapreduce-3583-trunk-v5.txt, mapreduce-3583-trunk.txt, mapreduce-3583-v2.txt, mapreduce-3583-v3.txt, mapreduce-3583-v4.txt, mapreduce-3583-v5.txt, mapreduce-3583.txt HBase PreCommit builds frequently gave us NumberFormatException. From https://builds.apache.org/job/PreCommit-HBASE-Build/553//testReport/org.apache.hadoop.hbase.mapreduce/TestHFileOutputFormat/testMRIncrementalLoad/: {code} 2011-12-20 01:44:01,180 WARN [main] mapred.JobClient(784): No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String). 
java.lang.NumberFormatException: For input string: 18446743988060683582 at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) at java.lang.Long.parseLong(Long.java:422) at java.lang.Long.parseLong(Long.java:468) at org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413) at org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148) at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401) at org.apache.hadoop.mapred.Task.initialize(Task.java:536) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083) at org.apache.hadoop.mapred.Child.main(Child.java:249) {code} From hadoop 0.20.205 source code, looks like ppid was 18446743988060683582, causing NFE: {code} // Set (name) (ppid) (pgrpId) (session) (utime) (stime) (vsize) (rss) pinfo.updateProcessInfo(m.group(2), Integer.parseInt(m.group(3)), {code} You can find information on the OS at the beginning of https://builds.apache.org/job/PreCommit-HBASE-Build/553/console: {code} asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 20 file size (blocks, -f) unlimited pending signals (-i) 16382 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 6 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu
[jira] [Commented] (MAPREDUCE-3862) Nodemanager can appear to hang on shutdown due to lingering DeletionService threads
[ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208955#comment-13208955 ] Hadoop QA commented on MAPREDUCE-3862: -- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514725/MAPREDUCE-3862.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1870//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1870//console This message is automatically generated. Nodemanager can appear to hang on shutdown due to lingering DeletionService threads --- Key: MAPREDUCE-3862 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3862 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, nodemanager Affects Versions: 0.23.1 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: MAPREDUCE-3862.patch, MAPREDUCE-3862.patch When a nodemanager attempts to shutdown cleanly, it's possible for it to appear to hang due to lingering DeletionService threads. This can occur when yarn.nodemanager.delete.debug-delay-sec is set to a relatively large value and one or more containers executes on the node shortly before the shutdown. 
The DeletionService is never calling {{setExecuteExistingDelayedTasksAfterShutdownPolicy()}} on the ScheduledThreadPoolExecutor, and it defaults to waiting for all scheduled tasks to complete before exiting. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
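The shutdown behaviour described for MAPREDUCE-3862 can be reproduced with a bare ScheduledThreadPoolExecutor. What follows is a minimal sketch, not the DeletionService code or the attached patch: the class name DelayedTaskShutdownSketch, the helper terminatesPromptly, the one-hour delay and the one-second wait are all assumptions made for illustration. Only the executor methods themselves come from java.util.concurrent.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; it does not exist in Hadoop.
final class DelayedTaskShutdownSketch {

  /**
   * Schedules one far-future task, calls shutdown(), and reports whether the
   * pool reached the terminated state within a short wait.
   */
  static boolean terminatesPromptly(boolean runDelayedTasksAfterShutdown) {
    ScheduledThreadPoolExecutor sched = new ScheduledThreadPoolExecutor(1);
    // The JDK default for this policy is true: delayed tasks that are already
    // queued still run after shutdown(), so the worker thread stays alive.
    sched.setExecuteExistingDelayedTasksAfterShutdownPolicy(runDelayedTasksAfterShutdown);

    // Stand-in for a deletion delayed by a large debug-delay setting.
    sched.schedule(() -> { }, 1, TimeUnit.HOURS);
    sched.shutdown();

    boolean terminated;
    try {
      terminated = sched.awaitTermination(1, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      terminated = false;
    }
    // Drop anything still queued so this demo never keeps the JVM alive.
    sched.shutdownNow();
    return terminated;
  }

  public static void main(String[] args) {
    System.out.println("policy=true  terminates promptly: " + terminatesPromptly(true));
    System.out.println("policy=false terminates promptly: " + terminatesPromptly(false));
  }
}
```

With the policy left at its default the pool waits for the queued task, which matches the apparent hang reported above. Setting it to false cancels pending delayed tasks at shutdown(), at the cost of skipping those deletions.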
[jira] [Commented] (MAPREDUCE-3583) ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException
[ https://issues.apache.org/jira/browse/MAPREDUCE-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208965#comment-13208965 ] Mahadev konar commented on MAPREDUCE-3583: -- Ted, Looks like one of test cases (TestContainerMonitor) failed with: {code} 2012-02-15 23:20:54,516 WARN [Container Monitor] monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(456)) - Uncaught exception in ContainerMemoryManager while managing memory of container_0__01_00 java.lang.NumberFormatException: For input string: 18446743988089421650 at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) at java.lang.Long.parseLong(Long.java:422) at java.lang.Long.parseLong(Long.java:468) at org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:424) at org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:170) {code} This should have been fixed with the patch right? ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException - Key: MAPREDUCE-3583 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3583 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.205.0 Environment: 64-bit Linux: asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux Reporter: Zhihong Yu Assignee: ramkrishna.s.vasudevan Priority: Critical Attachments: mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v3.txt, mapreduce-3583-trunk-v4.txt, mapreduce-3583-trunk-v5.txt, mapreduce-3583-trunk.txt, mapreduce-3583-v2.txt, mapreduce-3583-v3.txt, mapreduce-3583-v4.txt, mapreduce-3583-v5.txt, mapreduce-3583.txt HBase PreCommit builds frequently gave us NumberFormatException. 
From https://builds.apache.org/job/PreCommit-HBASE-Build/553//testReport/org.apache.hadoop.hbase.mapreduce/TestHFileOutputFormat/testMRIncrementalLoad/: {code} 2011-12-20 01:44:01,180 WARN [main] mapred.JobClient(784): No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String). java.lang.NumberFormatException: For input string: 18446743988060683582 at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) at java.lang.Long.parseLong(Long.java:422) at java.lang.Long.parseLong(Long.java:468) at org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413) at org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148) at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401) at org.apache.hadoop.mapred.Task.initialize(Task.java:536) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083) at org.apache.hadoop.mapred.Child.main(Child.java:249) {code} From hadoop 0.20.205 source code, looks like ppid was 18446743988060683582, causing NFE: {code} // Set (name) (ppid) (pgrpId) (session) (utime) (stime) (vsize) (rss) pinfo.updateProcessInfo(m.group(2), Integer.parseInt(m.group(3)), {code} You can find information on the OS at the beginning of https://builds.apache.org/job/PreCommit-HBASE-Build/553/console: {code} asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 20 file size (blocks, -f) unlimited pending signals (-i) 16382 max locked memory (kbytes, -l) 
64 max memory size (kbytes, -m) unlimited open files (-n) 6 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 2048 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited 6 Running in Jenkins mode {code} From Nicolas Sze: {noformat} It looks like that the ppid is a 64-bit positive integer but Java long is signed and so only
[jira] [Updated] (MAPREDUCE-3746) Nodemanagers are not automatically shut down after decommissioning
[ https://issues.apache.org/jira/browse/MAPREDUCE-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated MAPREDUCE-3746: --- Fix Version/s: (was: 0.23.1) 0.23.2 I also cannot reproduce this - decommissioning normally and during app execution both shut down the NM cleanly for me. Ramya please take a thread dump of the faulty NM when you run into this again. Thanks! Moving this to 0.23.2. Nodemanagers are not automatically shut down after decommissioning -- Key: MAPREDUCE-3746 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3746 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: Ramya Sunil Assignee: Devaraj K Priority: Critical Fix For: 0.23.2 Nodemanagers are not automatically shutdown after decommissioning. MAPREDUCE-2775 does not seem to fix the issue.
[jira] [Commented] (MAPREDUCE-3583) ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException
[ https://issues.apache.org/jira/browse/MAPREDUCE-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208966#comment-13208966 ] Tom White commented on MAPREDUCE-3583: -- Can you add a test for the overflow case too please. How do you want to handle the JVM reuse bug I mentioned above in https://issues.apache.org/jira/browse/MAPREDUCE-3583?focusedCommentId=1324page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-1324 ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException - Key: MAPREDUCE-3583 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3583 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.205.0 Environment: 64-bit Linux: asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux Reporter: Zhihong Yu Assignee: ramkrishna.s.vasudevan Priority: Critical Attachments: mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v3.txt, mapreduce-3583-trunk-v4.txt, mapreduce-3583-trunk-v5.txt, mapreduce-3583-trunk.txt, mapreduce-3583-v2.txt, mapreduce-3583-v3.txt, mapreduce-3583-v4.txt, mapreduce-3583-v5.txt, mapreduce-3583.txt HBase PreCommit builds frequently gave us NumberFormatException. From https://builds.apache.org/job/PreCommit-HBASE-Build/553//testReport/org.apache.hadoop.hbase.mapreduce/TestHFileOutputFormat/testMRIncrementalLoad/: {code} 2011-12-20 01:44:01,180 WARN [main] mapred.JobClient(784): No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String). 
java.lang.NumberFormatException: For input string: 18446743988060683582 at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) at java.lang.Long.parseLong(Long.java:422) at java.lang.Long.parseLong(Long.java:468) at org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413) at org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148) at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401) at org.apache.hadoop.mapred.Task.initialize(Task.java:536) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083) at org.apache.hadoop.mapred.Child.main(Child.java:249) {code} From hadoop 0.20.205 source code, looks like ppid was 18446743988060683582, causing NFE: {code} // Set (name) (ppid) (pgrpId) (session) (utime) (stime) (vsize) (rss) pinfo.updateProcessInfo(m.group(2), Integer.parseInt(m.group(3)), {code} You can find information on the OS at the beginning of https://builds.apache.org/job/PreCommit-HBASE-Build/553/console: {code} asf011.sp2.ygridcore.net Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 20 file size (blocks, -f) unlimited pending signals (-i) 16382 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 6 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 2048 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited 6 Running 
in Jenkins mode {code} From Nicolas Sze: {noformat} It looks like that the ppid is a 64-bit positive integer but Java long is signed and so only works with 63-bit positive integers. In your case, 2^64 > 18446743988060683582 > 2^63. Therefore, there is a NFE. {noformat} I propose changing allProcessInfo to Map<String, ProcessInfo> so that we don't encounter this problem by avoiding parsing large integer.
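The arithmetic in the quoted comment can be checked directly. This is a small sketch under stated assumptions: the class UnsignedProcFieldSketch and its helpers are hypothetical names, and using BigInteger is just one way to inspect the value. The proposal above takes a different route and keeps the field as a String key so that it is never parsed.

```java
import java.math.BigInteger;

// Hypothetical demo class; it is not part of ProcfsBasedProcessTree.
final class UnsignedProcFieldSketch {

  private static final BigInteger TWO_POW_63 = BigInteger.ONE.shiftLeft(63);
  private static final BigInteger TWO_POW_64 = BigInteger.ONE.shiftLeft(64);

  /** True if Long.parseLong accepts the field, false if it throws NumberFormatException. */
  static boolean parsesAsSignedLong(String field) {
    try {
      Long.parseLong(field);
      return true;
    } catch (NumberFormatException e) {
      return false;
    }
  }

  /** True if the value fits in an unsigned 64-bit integer but not in a signed long. */
  static boolean onlyFitsUnsigned64(String field) {
    BigInteger value = new BigInteger(field);
    return value.compareTo(TWO_POW_63) >= 0 && value.compareTo(TWO_POW_64) < 0;
  }

  public static void main(String[] args) {
    String field = "18446743988060683582"; // value from the stack trace above
    System.out.println("Long.parseLong accepts it: " + parsesAsSignedLong(field));
    System.out.println("2^63 <= value < 2^64:      " + onlyFitsUnsigned64(field));
  }
}
```

Either keeping the raw String or parsing into a wider type avoids the exception. Which one suits ProcfsBasedProcessTree depends on whether the value is ever used numerically, which this sketch does not address.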