[jira] [Commented] (MAPREDUCE-4096) tests seem to be randomly failing
[ https://issues.apache.org/jira/browse/MAPREDUCE-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245069#comment-13245069 ]
Devaraj K commented on MAPREDUCE-4096:
Dup of MAPREDUCE-4094.
tests seem to be randomly failing
Key: MAPREDUCE-4096
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4096
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: test
Affects Versions: 2.0.0, trunk
Reporter: Thomas Graves
Looking at the recent test-patch output from Jenkins, it seems that tests are failing randomly. MAPREDUCE-4089 is an example: 22 tests failed, and the next time the patch was put up 4 failed. In both cases the patch had nothing to do with those tests. I also manually ran mvn test in the mapreduce directory and had 20 failures, and saw a couple of processes still lying around. One was using port 10020, which other tests were trying to use, so a bind address error came out of those tests.
--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (MAPREDUCE-4096) tests seem to be randomly failing
[ https://issues.apache.org/jira/browse/MAPREDUCE-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K resolved MAPREDUCE-4096. Resolution: Duplicate
[jira] [Commented] (MAPREDUCE-4024) RM webservices can't query on finalStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245090#comment-13245090 ]
Bhallamudi Venkata Siva Kamesh commented on MAPREDUCE-4024:
I don't know whether I can comment here (the jira has already been closed), but here is my observation from the patch:
{code:title=RMWebServices.java|borderStyle=solid}
if (finalStatusQuery != null && !finalStatusQuery.isEmpty()) {
  FinalApplicationStatus.valueOf(finalStatusQuery);
  if (!rmapp.getFinalApplicationStatus().toString()
      .equalsIgnoreCase(finalStatusQuery)) {
    continue;
  }
}
{code}
From the above code, I think the following statement
{noformat}FinalApplicationStatus.valueOf(finalStatusQuery);{noformat}
validates whether the given string is one of the FinalApplicationStatus enum constants. If the enum constant is in uppercase and the user gives the same name in lowercase, this statement throws *IllegalArgumentException*. However, the next statement compares the strings using *equalsIgnoreCase*, so I *assume* that finalStatusQuery is also meant to be accepted in lowercase. But it won't work. Please ignore this comment if my observation is wrong.
RM webservices can't query on finalStatus
Key: MAPREDUCE-4024
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4024
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.2
Reporter: Thomas Graves
Assignee: Thomas Graves
Fix For: 0.23.3
Attachments: MAPREDUCE-4024.patch
The resource manager web service API to get the list of apps doesn't have a query parameter for finalStatus. It has one for the state, but that isn't what is reported by the app master, so we really need to be able to query on both state and finalStatus.
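The case-sensitivity point in the comment above can be reproduced outside Hadoop. The following is only a sketch: the nested enum is a local stand-in for FinalApplicationStatus so the file compiles on its own, and the parse helper (normalizing the query with toUpperCase before calling valueOf) is a hypothetical remedy, not something taken from the patch.

```java
import java.util.Locale;

final class FinalStatusQueryDemo {

    // Local stand-in for YARN's FinalApplicationStatus (assumption: declared
    // here only so the sketch is self-contained).
    enum FinalApplicationStatus { UNDEFINED, SUCCEEDED, FAILED, KILLED }

    // True when Enum.valueOf accepts the string exactly as given.
    static boolean acceptedAsIs(String query) {
        try {
            FinalApplicationStatus.valueOf(query);
            return true;
        } catch (IllegalArgumentException e) {
            // valueOf matches constant names exactly, including case.
            return false;
        }
    }

    // Hypothetical helper: normalize the query before validating it, so that
    // lowercase input is accepted like the later equalsIgnoreCase comparison.
    static FinalApplicationStatus parse(String query) {
        return FinalApplicationStatus.valueOf(query.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println("SUCCEEDED accepted as-is: " + acceptedAsIs("SUCCEEDED"));
        System.out.println("succeeded accepted as-is: " + acceptedAsIs("succeeded"));
        System.out.println("succeeded after normalizing: " + parse("succeeded"));
    }
}
```

Whether lowercase queries should be accepted at all is a design decision for the web service; the sketch only shows that valueOf and equalsIgnoreCase disagree about it.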
[jira] [Commented] (MAPREDUCE-3315) Master-Worker Application on YARN
[ https://issues.apache.org/jira/browse/MAPREDUCE-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245108#comment-13245108 ]
Sharad Agarwal commented on MAPREDUCE-3315:
We can have it in the hadoop-yarn-applications module. The example app could be a sub-module of the master-worker app.
Master-Worker Application on YARN
Key: MAPREDUCE-3315
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3315
Project: Hadoop Map/Reduce
Issue Type: New Feature
Reporter: Sharad Agarwal
Assignee: Sharad Agarwal
Fix For: 0.24.0
Currently master-worker scenarios are force-fit into Map-Reduce. Now with YARN, these can be first class, would benefit real/near-realtime workloads, and would use the cluster resources more effectively.
[jira] [Created] (MAPREDUCE-4098) TestMRApps testSetClasspath fails
TestMRApps testSetClasspath fails
Key: MAPREDUCE-4098
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4098
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: test
Affects Versions: 2.0.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Fix For: 2.0.0
The assertion in this test checks for equality; because the generated classpath file is itself in the classpath, the test fails. Instead, the test should check that the expected path elements are in the classpath.
[jira] [Updated] (MAPREDUCE-4098) TestMRApps testSetClasspath fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated MAPREDUCE-4098: Attachment: MAPREDUCE-4098.patch testing for PWD to be at the beginning of the classpath and the YARN_APPLICATION_CLASSPATH to be in the classpath.
[jira] [Updated] (MAPREDUCE-4098) TestMRApps testSetClasspath fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated MAPREDUCE-4098: Status: Patch Available (was: Open)
[jira] [Commented] (MAPREDUCE-4098) TestMRApps testSetClasspath fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245129#comment-13245129 ]
Hadoop QA commented on MAPREDUCE-4098:
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12521129/MAPREDUCE-4098.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService
org.apache.hadoop.yarn.server.resourcemanager.resourcetracker.TestNMExpiry
org.apache.hadoop.yarn.server.resourcemanager.TestAMAuthorization
org.apache.hadoop.yarn.server.resourcemanager.TestApplicationACLs
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2132//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2132//console
This message is automatically generated.
[jira] [Commented] (MAPREDUCE-4098) TestMRApps testSetClasspath fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245144#comment-13245144 ]
Ahmed Radwan commented on MAPREDUCE-4098:
Thanks Tucu! The environment CLASSPATH is populated using the value of YarnConfiguration.YARN_APPLICATION_CLASSPATH from yarn-default.xml, and the test then checks whether it contains the classpath from that same YarnConfiguration.YARN_APPLICATION_CLASSPATH value. So the test will always succeed regardless of the value set for YarnConfiguration.YARN_APPLICATION_CLASSPATH. Do we need to keep the check against the hardcoded classpath values, so that the test fails if they are changed in yarn-default.xml?
[jira] [Commented] (MAPREDUCE-4098) TestMRApps testSetClasspath fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245146#comment-13245146 ]
Ahmed Radwan commented on MAPREDUCE-4098:
So you still check for contains instead of equality (per your changes), but the check will remain against hardcoded classpath values?
[jira] [Commented] (MAPREDUCE-4098) TestMRApps testSetClasspath fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245260#comment-13245260 ]
Alejandro Abdelnur commented on MAPREDUCE-4098:
@Ahmed, as you indicated, the new assertion verifies that the YARN_APPLICATION_CLASSPATH value is in the resulting classpath. If we change the default value, the test will still pass. This seems more correct than checking against a hardcoded value. I don't think we need to check for hardcoded values, at least not in this test. If you see value in that, we should have an additional test checking the hardcoded value; personally I don't think such a test is needed.
Regarding your second comment, the check for contains is because the classpath is massaged to prefix '$PWD' and to postfix 'job.jar' and '$PWD/*'.
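To illustrate the contains-versus-equality point discussed in this thread, here is a minimal sketch. It does not use MRApps or YarnConfiguration: buildClasspath is a hypothetical stand-in that only mimics the massaging described above ('$PWD' first, 'job.jar' and '$PWD/*' last), the ':' separator is an assumption, and the configured entries are made-up values.

```java
import java.util.Arrays;
import java.util.List;

final class ClasspathAssertionSketch {

    // Hypothetical stand-in for the code under test: prefix $PWD, then the
    // configured entries, then job.jar and $PWD/*.
    static String buildClasspath(List<String> configured) {
        StringBuilder sb = new StringBuilder("$PWD");
        for (String entry : configured) {
            sb.append(':').append(entry);
        }
        sb.append(":job.jar").append(":$PWD/*");
        return sb.toString();
    }

    // Containment check: the classpath must start with $PWD and hold every
    // configured entry; extra elements are tolerated.
    static boolean hasExpectedElements(String classpath, List<String> configured) {
        List<String> parts = Arrays.asList(classpath.split(":"));
        return parts.get(0).equals("$PWD") && parts.containsAll(configured);
    }

    public static void main(String[] args) {
        // Made-up entries standing in for the YARN_APPLICATION_CLASSPATH value.
        List<String> configured = Arrays.asList("conf", "lib/*");
        String classpath = buildClasspath(configured);

        // An equality assertion against the configured value alone would fail,
        // because of the added prefix and suffix.
        System.out.println("equal to configured value: "
                + classpath.equals(String.join(":", configured)));
        System.out.println("contains expected elements: "
                + hasExpectedElements(classpath, configured));
    }
}
```

As Ahmed notes, a check like this derives the expected entries from the same configuration the code reads, so it cannot detect a changed default; that would need a separate test against fixed values.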
[jira] [Commented] (MAPREDUCE-3540) saveVersion.sh script fails in windows/cygwin (hadoop-yarn-common)
[ https://issues.apache.org/jira/browse/MAPREDUCE-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245264#comment-13245264 ]
Wu Wei commented on MAPREDUCE-3540:
+1 for this patch. It works on my windows environment (using cygwin).
saveVersion.sh script fails in windows/cygwin (hadoop-yarn-common)
Key: MAPREDUCE-3540
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3540
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: build
Affects Versions: 0.24.0
Reporter: Alejandro Abdelnur
Fix For: 0.24.0
Attachments: MAPREDUCE-3540.patch
{code}
[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2:exec (generate-version) on project hadoop-yarn-common: Command execution failed. Cannot run program scripts\saveVersion.sh (in directory C:\cygwin\home\tucu\src\hadoop\hadoop-mapreduce-project\hadoop-yarn\hadoop-yarn-common): CreateProcess error=2, The system cannot find the file specified -> [Help 1]
[ERROR]
{code}
[jira] [Commented] (MAPREDUCE-4024) RM webservices can't query on finalStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245268#comment-13245268 ]
Hudson commented on MAPREDUCE-4024:
Integrated in Hadoop-Hdfs-0.23-Build #217 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/217/])
svn merge -c 1308566 from trunk. FIXES MAPREDUCE-4024. RM webservices can't query on finalStatus (Tom Graves via bobby) (Revision 1308569)
Result = UNSTABLE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308569
Files :
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/webapp/HsWebServices.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/webapp/TestHsWebServicesJobsQuery.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesApps.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/HistoryServerRest.apt.vm
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerRest.apt.vm
[jira] [Commented] (MAPREDUCE-4092) commitJob Exception does not fail job (regression in 0.23 vs 0.20)
[ https://issues.apache.org/jira/browse/MAPREDUCE-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245269#comment-13245269 ]
Hudson commented on MAPREDUCE-4092:
Integrated in Hadoop-Hdfs-0.23-Build #217 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/217/])
svn merge -c 1308507 from trunk FIXES: MAPREDUCE-4092. commitJob Exception does not fail job (Jon Eagles via bobby) (Revision 1308510)
Result = UNSTABLE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308510
Files :
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestJobImpl.java
commitJob Exception does not fail job (regression in 0.23 vs 0.20)
Key: MAPREDUCE-4092
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4092
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.2
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles
Priority: Blocker
Fix For: 0.23.3, 2.0.0
Attachments: MAPREDUCE-4092.patch, MAPREDUCE-4092.patch
If commitJob throws an exception, JobImpl will swallow the exception with a warning and succeed the job. This is a break from 0.20 and 1.0, where a commitJob exception fails the job.
The exception is logged in the AM as:
WARN org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Could not do commit for Job
The job still finishes as succeeded.
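The difference between swallowing and propagating a commit failure can be sketched as follows. None of this is JobImpl's code: Committer, JobState and both finish methods are hypothetical names, chosen only to contrast the behaviour reported for 0.23 with the behaviour described for 0.20 and 1.0.

```java
import java.io.IOException;

final class CommitFailureSketch {

    enum JobState { SUCCEEDED, FAILED }

    // Hypothetical stand-in for an output committer.
    interface Committer {
        void commitJob() throws IOException;
    }

    // Behaviour described in the report: the exception is logged as a
    // warning and the job is still marked as succeeded.
    static JobState finishSwallowing(Committer committer) {
        try {
            committer.commitJob();
        } catch (IOException e) {
            System.err.println("WARN Could not do commit for Job: " + e.getMessage());
        }
        return JobState.SUCCEEDED;
    }

    // Behaviour described for 0.20 and 1.0: a commit failure fails the job.
    static JobState finishFailing(Committer committer) {
        try {
            committer.commitJob();
            return JobState.SUCCEEDED;
        } catch (IOException e) {
            System.err.println("ERROR Could not do commit for Job: " + e.getMessage());
            return JobState.FAILED;
        }
    }

    public static void main(String[] args) {
        Committer broken = () -> { throw new IOException("simulated commit failure"); };
        System.out.println("swallowing: " + finishSwallowing(broken));
        System.out.println("failing:    " + finishFailing(broken));
    }
}
```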
[jira] [Commented] (MAPREDUCE-4089) Hung Tasks never time out.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245267#comment-13245267 ]
Hudson commented on MAPREDUCE-4089:
Integrated in Hadoop-Hdfs-0.23-Build #217 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/217/])
merge -r 1308532:1308533 from branch-2. FIXES: MAPREDUCE-4089 (Revision 1308537)
Result = UNSTABLE
tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308537
Files :
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/TaskAttemptListenerImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/TaskHeartbeatHandler.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestTaskHeartbeatHandler.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
Hung Tasks never time out.
Key: MAPREDUCE-4089
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4089
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.2, 2.0.0, trunk
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Priority: Blocker
Fix For: 0.23.3
Attachments: MR-4089.txt, MR-4089.txt, MR-4089.txt, MR-4089.txt, MR-4089.txt
The AM will time out a task through mapreduce.task.timeout only when it does not hear from the task within the given timeframe. On 1.0 a task must be making progress, either by reading input from HDFS, writing output to HDFS, writing to a log, or calling a special method to inform it that it is still making progress. This is because on 0.23 a status update, which happens every 3 seconds, is counted as progress.
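The distinction between "heard from the task" and "the task made progress" can be sketched with two timestamps. This is not TaskHeartbeatHandler's code: the class and its methods are hypothetical, and the 600-second timeout used in main is only an example value for a setting like mapreduce.task.timeout.

```java
final class ProgressTimeoutSketch {

    private final long timeoutMs;
    private long lastContactMs;   // any message from the task
    private long lastProgressMs;  // only messages that report real progress

    ProgressTimeoutSketch(long timeoutMs, long nowMs) {
        this.timeoutMs = timeoutMs;
        this.lastContactMs = nowMs;
        this.lastProgressMs = nowMs;
    }

    // Periodic status update: shows the task is alive, not that it advanced.
    void statusUpdate(long nowMs) {
        lastContactMs = nowMs;
    }

    // Real progress, e.g. input read or output written.
    void progress(long nowMs) {
        lastContactMs = nowMs;
        lastProgressMs = nowMs;
    }

    // Behaviour as described for 0.23: any contact resets the clock.
    boolean timedOutByContact(long nowMs) {
        return nowMs - lastContactMs > timeoutMs;
    }

    // Behaviour as described for 1.0: only progress resets the clock.
    boolean timedOutByProgress(long nowMs) {
        return nowMs - lastProgressMs > timeoutMs;
    }

    public static void main(String[] args) {
        ProgressTimeoutSketch task = new ProgressTimeoutSketch(600_000L, 0L);
        // A hung task that still sends a status update every 3 seconds.
        for (long now = 3_000L; now <= 700_000L; now += 3_000L) {
            task.statusUpdate(now);
        }
        System.out.println("timed out by contact:  " + task.timedOutByContact(700_000L));
        System.out.println("timed out by progress: " + task.timedOutByProgress(700_000L));
    }
}
```

With contact-based tracking the hung task is never killed, because the status updates keep arriving well inside the timeout window.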
[jira] [Updated] (MAPREDUCE-3315) Master-Worker Application on YARN
[ https://issues.apache.org/jira/browse/MAPREDUCE-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nikhil S. Ketkar updated MAPREDUCE-3315: Attachment: MAPREDUCE-3315.patch This is a preliminary patch, only for feedback. The directory hadoop-yarn-applications-masterworker-core contains the framework. A simple example using the framework is placed in hadoop-yarn-applications-masterworker-example.
[jira] [Commented] (MAPREDUCE-4024) RM webservices can't query on finalStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245288#comment-13245288 ]
Hudson commented on MAPREDUCE-4024:
Integrated in Hadoop-Hdfs-trunk #1004 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1004/])
MAPREDUCE-4024. RM webservices can't query on finalStatus (Tom Graves via bobby) (Revision 1308566)
Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308566
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/webapp/HsWebServices.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/webapp/TestHsWebServicesJobsQuery.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesApps.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/HistoryServerRest.apt.vm
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerRest.apt.vm
[jira] [Commented] (MAPREDUCE-4089) Hung Tasks never time out.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245286#comment-13245286 ]
Hudson commented on MAPREDUCE-4089:
Integrated in Hadoop-Hdfs-trunk #1004 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1004/])
MAPREDUCE-4089. Hung Tasks never time out. (Robert Evans via tgraves) (Revision 1308531)
Result = FAILURE
tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308531
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/TaskAttemptListenerImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/TaskHeartbeatHandler.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestTaskHeartbeatHandler.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
[jira] [Commented] (MAPREDUCE-4092) commitJob Exception does not fail job (regression in 0.23 vs 0.20)
[ https://issues.apache.org/jira/browse/MAPREDUCE-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245290#comment-13245290 ]
Hudson commented on MAPREDUCE-4092:
Integrated in Hadoop-Hdfs-trunk #1004 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1004/])
MAPREDUCE-4092. commitJob Exception does not fail job (Jon Eagles via bobby) (Revision 1308507)
Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308507
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestJobImpl.java
[jira] [Commented] (MAPREDUCE-4095) TestJobInProgress#testLocality uses a bogus topology
[ https://issues.apache.org/jira/browse/MAPREDUCE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245292#comment-13245292 ] Hudson commented on MAPREDUCE-4095: --- Integrated in Hadoop-Hdfs-trunk #1004 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1004/]) MAPREDUCE-4095. TestJobInProgress#testLocality uses a bogus topology. Contributed by Colin Patrick McCabe (Revision 1308519) Result = FAILURE eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308519 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapred/TestJobInProgress.java TestJobInProgress#testLocality uses a bogus topology Key: MAPREDUCE-4095 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4095 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.1.0, 2.0.0 Reporter: Eli Collins Assignee: Colin Patrick McCabe Fix For: 0.24.0, 1.1.0 Attachments: MAPREDUCE-4095-b1.001.patch, MAPREDUCE-4095.001.patch The following in TestJobInProgress#testLocality:
{code}
Node r2n4 = new NodeBase("/default/rack2/s1/node4");
nt.add(r2n4);
{code}
violates the check introduced by HADOOP-8159:
{noformat}
Testcase: testLocality took 0.005 sec
Caused an ERROR
Invalid network topology. You cannot have a rack and a non-rack node at the same level of the network topology.
org.apache.hadoop.net.NetworkTopology$InvalidTopologyException: Invalid network topology. You cannot have a rack and a non-rack node at the same level of the network topology.
    at org.apache.hadoop.net.NetworkTopology.add(NetworkTopology.java:349)
    at org.apache.hadoop.mapred.TestJobInProgress.testLocality(TestJobInProgress.java:232)
{noformat}
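The failure above comes down to one node sitting a level deeper than the others. A rough sketch of that kind of check follows, assuming a much simpler rule than the real one (every leaf must have the same number of path components). TopologySketch and the rack1/rack2 node paths other than /default/rack2/s1/node4 are made up for illustration; this is not the NetworkTopology implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for the topology check; not the real NetworkTopology code. */
final class TopologySketch {
    private final List<String> leaves = new ArrayList<>();
    private int leafDepth = -1; // depth of the first leaf added, -1 while empty

    /** Number of path components, e.g. "/default/rack1/node1" has depth 3. */
    static int depth(String path) {
        int count = 0;
        for (String part : path.split("/")) {
            if (!part.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    /** Adds a leaf node, rejecting it if it sits at a different depth than earlier leaves. */
    void add(String path) {
        int d = depth(path);
        if (leafDepth == -1) {
            leafDepth = d;
        } else if (d != leafDepth) {
            throw new IllegalArgumentException("Invalid network topology: " + path
                + " has depth " + d + ", expected " + leafDepth);
        }
        leaves.add(path);
    }

    int size() {
        return leaves.size();
    }

    public static void main(String[] args) {
        TopologySketch nt = new TopologySketch();
        nt.add("/default/rack1/node1");
        nt.add("/default/rack2/node3");
        try {
            nt.add("/default/rack2/s1/node4"); // one level deeper than the others
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this simplified rule the extra "s1" component is what makes the node invalid, since it would put a leaf at the level where the other entries have racks.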
[jira] [Commented] (MAPREDUCE-4072) User set java.library.path seems to overwrite default creating problems native lib loading
[ https://issues.apache.org/jira/browse/MAPREDUCE-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245317#comment-13245317 ] Robert Joseph Evans commented on MAPREDUCE-4072: I agree that a documentation-only change is the preferable solution, and I like the documentation. I think it would be good if the mapred.child.java.opts entry in mapred-default.xml, where it talks about LD_LIBRARY_PATH, pointed to mapred.child.env. User set java.library.path seems to overwrite default creating problems native lib loading -- Key: MAPREDUCE-4072 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4072 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Reporter: Anupam Seth Assignee: Anupam Seth Attachments: MAPREDUCE-4072-branch-23.patch, MAPREDUCE-4072-branch-23_documentation.patch This was found by Peeyush Bishnoi. While running a distributed cache example with Hadoop-0.23, tasks are failing as follows:
Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:261)
    at org.apache.hadoop.util.Shell.run(Shell.java:188)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:381)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:207)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:241)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:68)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
main : command provided 1
main : user is user
The same Pig script and command work successfully on 0.20. The stderr shows:
Exception in thread "main" java.lang.ExceptionInInitializerError
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1179)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1149)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1238)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1264)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:54)
    at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:178)
    at org.apache.hadoop.security.UserGroupInformation.initUGI(UserGroupInformation.java:252)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:223)
    at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:265)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:75)
Caused by: java.lang.RuntimeException: Bailing out since native library couldn't be loaded
    at org.apache.hadoop.security.JniBasedUnixGroupsMapping.<clinit>(JniBasedUnixGroupsMapping.java:48)
    ... 12 more
Pig command:
$ pig -Dmapred.job.queue.name=queue -Dmapred.cache.archives=archives -Dmapred.child.java.opts=-Djava.library.path=./ygeo/lib -Dip2geo.preLoadLibraries=some other libs -Djava.io.tmpdir=/grid/0/tmp -Dmapred.create.symlink=yes -Dmapred.job.map.memory.mb=3072 piggeoscript.pig
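As a rough illustration of the suggestion in the comment (use mapred.child.env for LD_LIBRARY_PATH rather than setting java.library.path in mapred.child.java.opts), the sketch below rewrites a set of child JVM options accordingly. ChildEnvSketch and moveLibraryPathToEnv are hypothetical helpers, and the "LD_LIBRARY_PATH=$LD_LIBRARY_PATH:..." value format is an assumption that should be checked against the mapred-default.xml of the version in use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; it only manipulates strings and does not touch Hadoop itself. */
final class ChildEnvSketch {
    private static final String LIB_PATH_OPT = "-Djava.library.path=";

    /**
     * Removes -Djava.library.path=... from the child JVM options and expresses the
     * same directories through mapred.child.env instead, so the default
     * java.library.path is no longer overwritten.
     */
    static Map<String, String> moveLibraryPathToEnv(String javaOpts) {
        StringBuilder keptOpts = new StringBuilder();
        StringBuilder libDirs = new StringBuilder();
        for (String opt : javaOpts.trim().split("\\s+")) {
            if (opt.startsWith(LIB_PATH_OPT)) {
                if (libDirs.length() > 0) {
                    libDirs.append(':');
                }
                libDirs.append(opt.substring(LIB_PATH_OPT.length()));
            } else if (!opt.isEmpty()) {
                if (keptOpts.length() > 0) {
                    keptOpts.append(' ');
                }
                keptOpts.append(opt);
            }
        }
        Map<String, String> props = new LinkedHashMap<>();
        props.put("mapred.child.java.opts", keptOpts.toString());
        if (libDirs.length() > 0) {
            // Assumed format: append to the inherited value rather than replace it.
            props.put("mapred.child.env", "LD_LIBRARY_PATH=$LD_LIBRARY_PATH:" + libDirs);
        }
        return props;
    }

    public static void main(String[] args) {
        String opts = "-Djava.library.path=./ygeo/lib -Djava.io.tmpdir=/grid/0/tmp";
        moveLibraryPathToEnv(opts).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

The point of the split is that the environment variable extends the search path, while a user-supplied java.library.path replaces the default and hides the native Hadoop libraries.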
[jira] [Created] (MAPREDUCE-4099) ApplicationMaster may fail to remove staging directory
ApplicationMaster may fail to remove staging directory -- Key: MAPREDUCE-4099 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4099 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.2 Reporter: Jason Lowe Priority: Critical When the ApplicationMaster shuts down it's supposed to remove the staging directory, assuming properties weren't set to override this behavior. During shutdown the AM tells the ResourceManager that it has finished before it cleans up the staging directory. However upon hearing the AM has finished, the RM turns right around and kills the AM container. If the AM is too slow, the AM will be killed before the staging directory is removed. We're seeing the AM lose this race fairly consistently on our clusters, and the lack of staging directory cleanup quickly leads to filesystem quota issues for some users.
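The race can be modelled in a few lines. The sketch below is a toy model only: ShutdownOrderSketch and its parameters are hypothetical, and reordering the two steps is shown as one possible way to avoid the race, not as the fix that was or will be adopted.

```java
/** Toy model of the shutdown race; no real filesystem or RPC calls are made. */
final class ShutdownOrderSketch {
    /**
     * Returns whether the staging directory ends up removed.
     *
     * @param cleanupFirst          true if the AM deletes the staging directory before
     *                              telling the RM it has finished
     * @param killedAfterUnregister true if the RM kills the container immediately
     *                              after hearing that the AM has finished
     */
    static boolean stagingRemoved(boolean cleanupFirst, boolean killedAfterUnregister) {
        boolean removed = false;
        if (cleanupFirst) {
            removed = true; // cleanup runs while the container is certainly alive
        }
        // The AM tells the RM it has finished here; the RM may now kill the container.
        if (killedAfterUnregister) {
            return removed; // nothing after this point gets a chance to run
        }
        if (!cleanupFirst) {
            removed = true; // cleanup only happens if the AM won the race
        }
        return removed;
    }

    public static void main(String[] args) {
        System.out.println("unregister first, killed quickly: " + stagingRemoved(false, true));
        System.out.println("cleanup first, killed quickly: " + stagingRemoved(true, true));
    }
}
```

In this model the order described in the report leaves the directory behind whenever the kill arrives promptly, which matches the quota problems mentioned above.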
[jira] [Commented] (MAPREDUCE-4089) Hung Tasks never time out.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245334#comment-13245334 ] Hudson commented on MAPREDUCE-4089: --- Integrated in Hadoop-Mapreduce-trunk #1039 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1039/]) MAPREDUCE-4089. Hung Tasks never time out. (Robert Evans via tgraves) (Revision 1308531) Result = FAILURE tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308531 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/TaskAttemptListenerImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/TaskHeartbeatHandler.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestTaskHeartbeatHandler.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml Hung Tasks never time out. --- Key: MAPREDUCE-4089 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4089 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.2, 2.0.0, trunk Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Priority: Blocker Fix For: 0.23.3 Attachments: MR-4089.txt, MR-4089.txt, MR-4089.txt, MR-4089.txt, MR-4089.txt The AM will time out a task through mapreduce.task.timeout only when it does not hear from the task within the given timeframe. On 1.0 a task must be making progress, either by reading input from HDFS, writing output to HDFS, writing to a log, or calling a special method to report that it is still making progress. Hung tasks never time out on 0.23 because a status update, which happens every 3 seconds, is counted as progress.
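A small sketch may help show why counting status updates as progress defeats the timeout. TaskTimeoutSketch is a hypothetical tracker, not the actual TaskHeartbeatHandler; the 10 second timeout is an arbitrary value for the example, and the 3 second interval is the one mentioned in the description.

```java
/** Hypothetical timeout tracker; not the actual TaskHeartbeatHandler. */
final class TaskTimeoutSketch {
    private final long timeoutMs;
    private final boolean pingsCountAsProgress;
    private long lastProgressMs;

    TaskTimeoutSketch(long timeoutMs, boolean pingsCountAsProgress, long startMs) {
        this.timeoutMs = timeoutMs;
        this.pingsCountAsProgress = pingsCountAsProgress;
        this.lastProgressMs = startMs;
    }

    /** Periodic status update: proves the task process is alive, nothing more. */
    void statusUpdate(long nowMs) {
        if (pingsCountAsProgress) {
            lastProgressMs = nowMs; // the behaviour the report describes for 0.23
        }
    }

    /** Real progress: input read, output written, or an explicit progress call. */
    void progress(long nowMs) {
        lastProgressMs = nowMs;
    }

    boolean timedOut(long nowMs) {
        return nowMs - lastProgressMs > timeoutMs;
    }

    public static void main(String[] args) {
        TaskTimeoutSketch lenient = new TaskTimeoutSketch(10_000, true, 0);
        TaskTimeoutSketch strict = new TaskTimeoutSketch(10_000, false, 0);
        // A hung task keeps sending status updates every 3 seconds but does no work.
        for (long t = 3_000; t <= 30_000; t += 3_000) {
            lenient.statusUpdate(t);
            strict.statusUpdate(t);
        }
        System.out.println("pings count as progress, timed out: " + lenient.timedOut(30_000));
        System.out.println("pings ignored, timed out: " + strict.timedOut(30_000));
    }
}
```

As long as the update interval is shorter than the timeout, the lenient tracker can never fire for a task that is alive but stuck.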
[jira] [Commented] (MAPREDUCE-4024) RM webservices can't query on finalStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245336#comment-13245336 ] Hudson commented on MAPREDUCE-4024: --- Integrated in Hadoop-Mapreduce-trunk #1039 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1039/]) MAPREDUCE-4024. RM webservices can't query on finalStatus (Tom Graves via bobby) (Revision 1308566) Result = FAILURE bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308566 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/webapp/HsWebServices.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/webapp/TestHsWebServicesJobsQuery.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesApps.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/HistoryServerRest.apt.vm * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerRest.apt.vm RM webservices can't query on finalStatus - Key: MAPREDUCE-4024 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4024 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.2 Reporter: Thomas Graves Assignee: Thomas Graves Fix For: 0.23.3 Attachments: MAPREDUCE-4024.patch The resource manager web service API for listing apps doesn't have a query parameter for finalStatus. It has one for state, but since state isn't what the app master reports, we really need to be able to query on both state and finalStatus.
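The need for both filters is easiest to see with an application whose state and finalStatus disagree. The following sketch assumes a simple in-memory list; AppQuerySketch, App and query are hypothetical names, the sample values are illustrative, and none of this is the RMWebServices code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical in-memory filter; not the RMWebServices implementation. */
final class AppQuerySketch {
    static final class App {
        final String id;
        final String state;       // the resource manager's view of the app
        final String finalStatus; // what the app master reported on exit

        App(String id, String state, String finalStatus) {
            this.id = id;
            this.state = state;
            this.finalStatus = finalStatus;
        }
    }

    /** Returns apps matching both filters; a null filter matches everything. */
    static List<App> query(List<App> apps, String state, String finalStatus) {
        List<App> matches = new ArrayList<>();
        for (App app : apps) {
            boolean stateOk = state == null || state.equalsIgnoreCase(app.state);
            boolean finalOk = finalStatus == null || finalStatus.equalsIgnoreCase(app.finalStatus);
            if (stateOk && finalOk) {
                matches.add(app);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<App> apps = Arrays.asList(
            new App("app_1", "FINISHED", "SUCCEEDED"),
            new App("app_2", "FINISHED", "FAILED"),
            new App("app_3", "RUNNING", "UNDEFINED"));
        // Filtering on state alone cannot separate app_1 from app_2.
        System.out.println("state=FINISHED: " + query(apps, "FINISHED", null).size());
        System.out.println("state=FINISHED, finalStatus=FAILED: "
            + query(apps, "FINISHED", "FAILED").size());
    }
}
```

Here two applications share the same state, and only the finalStatus filter can tell the failed one from the successful one.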
[jira] [Commented] (MAPREDUCE-4095) TestJobInProgress#testLocality uses a bogus topology
[ https://issues.apache.org/jira/browse/MAPREDUCE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245340#comment-13245340 ] Hudson commented on MAPREDUCE-4095: --- Integrated in Hadoop-Mapreduce-trunk #1039 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1039/]) MAPREDUCE-4095. TestJobInProgress#testLocality uses a bogus topology. Contributed by Colin Patrick McCabe (Revision 1308519) Result = FAILURE eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308519 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapred/TestJobInProgress.java TestJobInProgress#testLocality uses a bogus topology Key: MAPREDUCE-4095 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4095 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.1.0, 2.0.0 Reporter: Eli Collins Assignee: Colin Patrick McCabe Fix For: 0.24.0, 1.1.0 Attachments: MAPREDUCE-4095-b1.001.patch, MAPREDUCE-4095.001.patch The following in TestJobInProgress#testLocality:
{code}
Node r2n4 = new NodeBase("/default/rack2/s1/node4");
nt.add(r2n4);
{code}
violates the check introduced by HADOOP-8159:
{noformat}
Testcase: testLocality took 0.005 sec
Caused an ERROR
Invalid network topology. You cannot have a rack and a non-rack node at the same level of the network topology.
org.apache.hadoop.net.NetworkTopology$InvalidTopologyException: Invalid network topology. You cannot have a rack and a non-rack node at the same level of the network topology.
    at org.apache.hadoop.net.NetworkTopology.add(NetworkTopology.java:349)
    at org.apache.hadoop.mapred.TestJobInProgress.testLocality(TestJobInProgress.java:232)
{noformat}
[jira] [Commented] (MAPREDUCE-4092) commitJob Exception does not fail job (regression in 0.23 vs 0.20)
[ https://issues.apache.org/jira/browse/MAPREDUCE-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245338#comment-13245338 ] Hudson commented on MAPREDUCE-4092: --- Integrated in Hadoop-Mapreduce-trunk #1039 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1039/]) MAPREDUCE-4092. commitJob Exception does not fail job (Jon Eagles via bobby) (Revision 1308507) Result = FAILURE bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308507 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestJobImpl.java commitJob Exception does not fail job (regression in 0.23 vs 0.20) -- Key: MAPREDUCE-4092 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4092 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.2 Reporter: Jonathan Eagles Assignee: Jonathan Eagles Priority: Blocker Fix For: 0.23.3, 2.0.0 Attachments: MAPREDUCE-4092.patch, MAPREDUCE-4092.patch If commitJob throws an exception, JobImpl swallows the exception with a warning and marks the job as succeeded. This is a break from 0.20 and 1.0, where a commitJob exception fails the job. The exception is logged in the AM as "WARN org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Could not do commit for Job", yet the job still finishes as succeeded.
[jira] [Updated] (MAPREDUCE-4062) AM Launcher thread can hang forever
[ https://issues.apache.org/jira/browse/MAPREDUCE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-4062: - Status: Open (was: Patch Available) Making changes for trunk and adding a test for containerLauncher. AM Launcher thread can hang forever --- Key: MAPREDUCE-4062 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4062 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.2 Reporter: Thomas Graves Assignee: Thomas Graves Attachments: MAPREDUCE-4062.patch We saw an instance where the RM stopped launching application masters. We found that the launcher thread was hung because something weird/bad happened to the NM node. Currently there is only 1 launcher thread (jira 4061 to fix that). We need this to not happen. Even once we increase the number of threads to more than 1, if that many nodes go bad the RM would be stuck. Note that this was stuck like this for approximately 9 hours. Stack trace on hung AM launcher:
"pool-1-thread-1" prio=10 tid=0x4343e800 nid=0x3a4c in Object.wait() [0x4fad2000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:485)
    at org.apache.hadoop.ipc.Client.call(Client.java:1076)
    - locked 0x2aab05a4f3f0 (a org.apache.hadoop.ipc.Client$Call)
    at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:135)
    at $Proxy76.startContainer(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:87)
    at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:118)
    at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:265)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
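The stack trace shows the launcher waiting inside a remote call with no upper bound on the wait. One generic way to bound such a wait is sketched below using only java.util.concurrent. LaunchTimeoutSketch and launchWithTimeout are hypothetical names, and this is not necessarily how the attached patch addresses the hang; a call that ignores interrupts would still occupy its worker thread.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Generic bounded-wait wrapper; not the AMLauncher code or its actual fix. */
final class LaunchTimeoutSketch {
    /**
     * Runs the launch call on a worker thread and gives up after timeoutMs.
     * Returns true only if the call completed normally within the timeout.
     */
    static boolean launchWithTimeout(Runnable launchCall, long timeoutMs) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Future<?> result = worker.submit(launchCall);
            try {
                result.get(timeoutMs, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                result.cancel(true); // interrupt the stuck call
                return false;
            } catch (ExecutionException e) {
                return false; // the call itself failed
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the caller's interrupt
                return false;
            }
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Simulates a node that never answers the start-container request.
        Runnable hung = () -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        System.out.println("hung launch completed: " + launchWithTimeout(hung, 200));
        System.out.println("quick launch completed: " + launchWithTimeout(() -> { }, 200));
    }
}
```

With a bound like this the launcher can move on to the next application after a bad node, instead of staying stuck for hours as described above.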
[jira] [Updated] (MAPREDUCE-4060) Multiple SLF4J binding warning
[ https://issues.apache.org/jira/browse/MAPREDUCE-4060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4060: --- Resolution: Fixed Fix Version/s: 2.0.0, 0.23.3 Target Version/s: 0.24.0, 0.23.2, 0.23.3 (was: 0.23.3, 0.23.2, 0.24.0) Status: Resolved (was: Patch Available) Thanks Jason. I just put this into trunk, branch-2, and branch-0.23. Multiple SLF4J binding warning -- Key: MAPREDUCE-4060 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4060 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Affects Versions: 0.23.0 Reporter: Jason Lowe Assignee: Jason Lowe Fix For: 0.23.3, 2.0.0 Attachments: MAPREDUCE-4060.patch This is the MAPREDUCE portion of HADOOP-8005. We should remove slf4j from the assembly and use the one provided by hadoop-common so we don't end up with multiple binding warnings for SLF4J.
[jira] [Commented] (MAPREDUCE-4060) Multiple SLF4J binding warning
[ https://issues.apache.org/jira/browse/MAPREDUCE-4060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245355#comment-13245355 ] Hudson commented on MAPREDUCE-4060: --- Integrated in Hadoop-Common-trunk-Commit #1977 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1977/]) MAPREDUCE-4060. Multiple SLF4J binding warning (Jason Lowe via bobby) (Revision 1308925) Result = SUCCESS bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308925 Files : * /hadoop/common/trunk/hadoop-assemblies/src/main/resources/assemblies/hadoop-mapreduce-dist.xml * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt Multiple SLF4J binding warning -- Key: MAPREDUCE-4060 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4060 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Affects Versions: 0.23.0 Reporter: Jason Lowe Assignee: Jason Lowe Fix For: 0.23.3, 2.0.0 Attachments: MAPREDUCE-4060.patch This is the MAPREDUCE portion of HADOOP-8005. We should remove slf4j from the assembly and use the one provided by hadoop-common so we don't end up with multiple binding warnings for SLF4J.
[jira] [Commented] (MAPREDUCE-4060) Multiple SLF4J binding warning
[ https://issues.apache.org/jira/browse/MAPREDUCE-4060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245353#comment-13245353 ] Hudson commented on MAPREDUCE-4060: --- Integrated in Hadoop-Hdfs-trunk-Commit #2051 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2051/]) MAPREDUCE-4060. Multiple SLF4J binding warning (Jason Lowe via bobby) (Revision 1308925) Result = SUCCESS bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308925 Files : * /hadoop/common/trunk/hadoop-assemblies/src/main/resources/assemblies/hadoop-mapreduce-dist.xml * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt Multiple SLF4J binding warning -- Key: MAPREDUCE-4060 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4060 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Affects Versions: 0.23.0 Reporter: Jason Lowe Assignee: Jason Lowe Fix For: 0.23.3, 2.0.0 Attachments: MAPREDUCE-4060.patch This is the MAPREDUCE portion of HADOOP-8005. We should remove slf4j from the assembly and use the one provided by hadoop-common so we don't end up with multiple binding warnings for SLF4J.
[jira] [Commented] (MAPREDUCE-4060) Multiple SLF4J binding warning
[ https://issues.apache.org/jira/browse/MAPREDUCE-4060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245368#comment-13245368 ] Hudson commented on MAPREDUCE-4060: --- Integrated in Hadoop-Mapreduce-trunk-Commit #1989 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1989/]) MAPREDUCE-4060. Multiple SLF4J binding warning (Jason Lowe via bobby) (Revision 1308925) Result = SUCCESS bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308925 Files : * /hadoop/common/trunk/hadoop-assemblies/src/main/resources/assemblies/hadoop-mapreduce-dist.xml * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt Multiple SLF4J binding warning -- Key: MAPREDUCE-4060 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4060 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Affects Versions: 0.23.0 Reporter: Jason Lowe Assignee: Jason Lowe Fix For: 0.23.3, 2.0.0 Attachments: MAPREDUCE-4060.patch This is the MAPREDUCE portion of HADOOP-8005. We should remove slf4j from the assembly and use the one provided by hadoop-common so we don't end up with multiple binding warnings for SLF4J.
[jira] [Updated] (MAPREDUCE-3983) TestTTResourceReporting can fail, and should just be deleted
[ https://issues.apache.org/jira/browse/MAPREDUCE-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3983: --- Resolution: Fixed Fix Version/s: 2.0.0, 0.23.3 Status: Resolved (was: Patch Available) +1, thanks Ravi. I just put this into trunk, branch-2, and branch-0.23. TestTTResourceReporting can fail, and should just be deleted Key: MAPREDUCE-3983 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3983 Project: Hadoop Map/Reduce Issue Type: Test Components: mrv1 Affects Versions: 0.23.2 Reporter: Robert Joseph Evans Assignee: Ravi Prakash Fix For: 0.23.3, 2.0.0 Attachments: MAPREDUCE-3983.patch TestTTResourceReporting can fail. It is an ant test for task trackers which should just be removed because task trackers are no longer supported outside of the ant tests.
[jira] [Commented] (MAPREDUCE-3983) TestTTResourceReporting can fail, and should just be deleted
[ https://issues.apache.org/jira/browse/MAPREDUCE-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245375#comment-13245375 ] Hudson commented on MAPREDUCE-3983: --- Integrated in Hadoop-Hdfs-trunk-Commit #2052 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2052/]) MAPREDUCE-3983. TestTTResourceReporting can fail, and should just be deleted (Ravi Prakash via bobby) (Revision 1308957) Result = SUCCESS bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308957 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapred/TestTTResourceReporting.java TestTTResourceReporting can fail, and should just be deleted Key: MAPREDUCE-3983 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3983 Project: Hadoop Map/Reduce Issue Type: Test Components: mrv1 Affects Versions: 0.23.2 Reporter: Robert Joseph Evans Assignee: Ravi Prakash Fix For: 0.23.3, 2.0.0 Attachments: MAPREDUCE-3983.patch TestTTResourceReporting can fail. It is an ant test for task trackers which should just be removed because task trackers are no longer supported outside of the ant tests.
[jira] [Commented] (MAPREDUCE-3983) TestTTResourceReporting can fail, and should just be deleted
[ https://issues.apache.org/jira/browse/MAPREDUCE-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245378#comment-13245378 ] Hudson commented on MAPREDUCE-3983: --- Integrated in Hadoop-Common-trunk-Commit #1978 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1978/]) MAPREDUCE-3983. TestTTResourceReporting can fail, and should just be deleted (Ravi Prakash via bobby) (Revision 1308957) Result = SUCCESS bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308957 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapred/TestTTResourceReporting.java TestTTResourceReporting can fail, and should just be deleted Key: MAPREDUCE-3983 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3983 Project: Hadoop Map/Reduce Issue Type: Test Components: mrv1 Affects Versions: 0.23.2 Reporter: Robert Joseph Evans Assignee: Ravi Prakash Fix For: 0.23.3, 2.0.0 Attachments: MAPREDUCE-3983.patch TestTTResourceReporting can fail. It is an ant test for task trackers which should just be removed because task trackers are no longer supported outside of the ant tests.
[jira] [Commented] (MAPREDUCE-4041) TestMapredGroupMappingServiceRefresh unit test failures
[ https://issues.apache.org/jira/browse/MAPREDUCE-4041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245382#comment-13245382 ] Kihwal Lee commented on MAPREDUCE-4041: --- The following is the exception that is occurring. It happens whenever the test tries to run MRAdmin to refresh. {noformat} 2012-04-03 04:29:59,604 INFO [IPC Server handler 3 on 43005] ipc.Server (Server.java:run(1680)) - IPC Server handler 3 on 43005, call refreshUserToGroupsMappings(), rpc version=2, client version=1, methodsFingerPrint=-876529506 from 127.0.0.1:50564: error: java.io.IOException: java.io.IOException: Unknown protocol: org.apache.hadoop.security.RefreshUserMappingsProtocol java.io.IOException: java.io.IOException: Unknown protocol: org.apache.hadoop.security.RefreshUserMappingsProtocol at org.apache.hadoop.ipc.WritableRpcEngine$Server$WritableRpcInvoker.call(WritableRpcEngine.java:477) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:891) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1661) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1657) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1205) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1655) {noformat} And the the JT's RPC server sorts things out on start-up. 
{noformat} 2012-04-03 04:29:58,581 INFO [Thread-56] mapred.JobTracker (JobTracker.java:init(1481)) - Starting jobtracker with owner as xx 2012-04-03 04:29:58,588 INFO [Socket Reader #1 for port 43005] ipc.Server (Server.java:run(456)) - Starting Socket Reader #1 for port 43005 2012-04-03 04:29:58,589 WARN [Thread-56] ipc.RPC (RPC.java:getSuperInterfaces(109)) - Interface interface org.apache.hadoop.mapred.MRConstants ignored because it does not extend VersionedProtocol 2012-04-03 04:29:58,589 WARN [Thread-56] ipc.RPC (RPC.java:getSuperInterfaces(109)) - Interface interface org.apache.hadoop.mapred.TaskTrackerManager ignored because it does not extend VersionedProtocol 2012-04-03 04:29:58,589 WARN [Thread-56] ipc.RPC (RPC.java:getSuperInterfaces(109)) - Interface interface org.apache.hadoop.security.RefreshUserMappingsProtocol ignored because it does not extend VersionedProtocol 2012-04-03 04:29:58,589 WARN [Thread-56] ipc.RPC (RPC.java:getSuperInterfaces(109)) - Interface interface org.apache.hadoop.security.authorize.RefreshAuthorizationPolicyProtocol ignored because it does not extend VersionedProtocol 2012-04-03 04:29:58,589 WARN [Thread-56] ipc.RPC (RPC.java:getSuperInterfaces(109)) - Interface interface org.apache.hadoop.tools.GetUserMappingsProtocol ignored because it does not extend VersionedProtocol 2012-04-03 04:29:58,589 WARN [Thread-56] ipc.RPC (RPC.java:getSuperInterfaces(109)) - Interface interface org.apache.hadoop.mapreduce.server.jobtracker.JTConfig ignored because it does not extend VersionedProtocol {noformat} TestMapredGroupMappingServiceRefresh unit test failures --- Key: MAPREDUCE-4041 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4041 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.3 Reporter: Thomas Graves On branch-0.23 the following unit tests fail: org.apache.hadoop.security.TestMapredGroupMappingServiceRefresh.testGroupMappingRefresh 
org.apache.hadoop.security.TestMapredGroupMappingServiceRefresh.testRefreshSuperUserGroupsConfiguration -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
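The two log excerpts above suggest a chain of cause and effect: interfaces that do not extend VersionedProtocol are skipped at registration, so a later call naming one of them is rejected as an unknown protocol. A minimal sketch of that behaviour follows. All names in it (Versioned, JobSubmission, RefreshMappings, Tracker, ProtocolRegistrySketch) are hypothetical stand-ins, not Hadoop classes, and the real RPC server is more involved.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are NOT the Hadoop types.
interface Versioned {}
interface JobSubmission extends Versioned {}
interface RefreshMappings {}              // note: does not extend Versioned
class Tracker implements JobSubmission, RefreshMappings {}

class ProtocolRegistrySketch {
    private final Map<String, Object> handlers = new HashMap<>();

    /** Registers only the interfaces of impl that extend Versioned; returns the names it skipped. */
    List<String> register(Object impl) {
        List<String> ignored = new ArrayList<>();
        for (Class<?> iface : impl.getClass().getInterfaces()) {
            if (Versioned.class.isAssignableFrom(iface)) {
                handlers.put(iface.getSimpleName(), impl);
            } else {
                ignored.add(iface.getSimpleName());   // analogous to the WARN lines in the log
            }
        }
        return ignored;
    }

    /** Looks up the handler for a protocol name, failing the way the server log does. */
    Object lookup(String protocolName) throws IOException {
        Object handler = handlers.get(protocolName);
        if (handler == null) {
            throw new IOException("Unknown protocol: " + protocolName);
        }
        return handler;
    }
}
```

In this sketch, registering a Tracker skips RefreshMappings, and a later lookup of that name throws, which mirrors the sequence seen in the logs.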
[jira] [Created] (MAPREDUCE-4100) Sometimes gridmix emulates data larger much larger then acutal counter for map only jobs
Sometimes gridmix emulates data larger much larger then acutal counter for map only jobs Key: MAPREDUCE-4100 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4100 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/gridmix Affects Versions: 1.1.0 Reporter: Karam Singh Priority: Minor While running a trace of 1400+ jobs, I encountered this issue. For map-only jobs, some maps generated around 9 GB of data (per HDFS_BYTES_WRITTEN), whereas the actual value in the trace is around 5 GB. This can also cause jobs to fail intermittently. Other GridMix versions, in Hadoop-1.1.X and above, might also be affected.
[jira] [Commented] (MAPREDUCE-3983) TestTTResourceReporting can fail, and should just be deleted
[ https://issues.apache.org/jira/browse/MAPREDUCE-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245386#comment-13245386 ] Hudson commented on MAPREDUCE-3983: --- Integrated in Hadoop-Mapreduce-trunk-Commit #1990 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1990/]) MAPREDUCE-3983. TestTTResourceReporting can fail, and should just be deleted (Ravi Prakash via bobby) (Revision 1308957) Result = SUCCESS bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308957 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapred/TestTTResourceReporting.java TestTTResourceReporting can fail, and should just be deleted Key: MAPREDUCE-3983 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3983 Project: Hadoop Map/Reduce Issue Type: Test Components: mrv1 Affects Versions: 0.23.2 Reporter: Robert Joseph Evans Assignee: Ravi Prakash Fix For: 0.23.3, 2.0.0 Attachments: MAPREDUCE-3983.patch TestTTResourceReporting can fail. It is an ant test for task trackers, and it should just be removed because task trackers are no longer supported outside of the ant tests.
[jira] [Updated] (MAPREDUCE-4059) The history server should have a separate pluggable storage/query interface
[ https://issues.apache.org/jira/browse/MAPREDUCE-4059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4059: --- Attachment: MR-4059.txt Updating the patch. Upmerged to trunk and addressed some of the comments so far. I have not touched the internal data structures just yet. I figure we can look at those as part of MAPREDUCE-3973 and MAPREDUCE-3971. The history server should have a separate pluggable storage/query interface --- Key: MAPREDUCE-4059 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4059 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.24.0, 0.23.3 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: MR-4059.txt, MR-4059.txt, MR-4059.txt The history server currently caches all parsed jobs in RAM. These jobs can be very large because of counters. It would be nice to have a pluggable interface for the caching and querying of the cached data so that we can play around with different implementations. Also, just for cleanness of the code, it would be nice to split the very large JobHistoryServer.java into a few smaller files that are more understandable and readable.
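One possible shape for the pluggable storage interface described above, as a rough sketch only. HistoryStoreSketch and LruHistoryStore are hypothetical names that do not come from the attached patch, and the String payload merely stands in for a parsed job. The bounded in-memory implementation is an assumption chosen to show how an alternative to "cache everything in RAM" could be swapped in behind the interface.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical interface: the history server would depend on this, not on a concrete cache.
interface HistoryStoreSketch {
    void put(String jobId, String parsedJob);
    String get(String jobId);   // returns null when the job is not cached
}

// One interchangeable implementation: keeps at most maxJobs entries, evicting the
// least recently used one.
class LruHistoryStore implements HistoryStoreSketch {
    private final Map<String, String> cache;

    LruHistoryStore(final int maxJobs) {
        // accessOrder=true makes iteration order follow recency of access.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxJobs;
            }
        };
    }

    @Override
    public synchronized void put(String jobId, String parsedJob) {
        cache.put(jobId, parsedJob);
    }

    @Override
    public synchronized String get(String jobId) {
        // get() reorders entries in an access-ordered map, hence the synchronization.
        return cache.get(jobId);
    }
}
```

Callers that hold only a HistoryStoreSketch reference would not need to change if a different implementation, for example one backed by disk, were substituted.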
[jira] [Commented] (MAPREDUCE-4041) TestMapredGroupMappingServiceRefresh unit test failures
[ https://issues.apache.org/jira/browse/MAPREDUCE-4041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245390#comment-13245390 ] Kihwal Lee commented on MAPREDUCE-4041: --- This has been broken since HADOOP-7994. I don't think we want to update the RPC in mrv1 just to make the test pass. The tests need to be ported. TestMapredGroupMappingServiceRefresh unit test failures --- Key: MAPREDUCE-4041 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4041 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.3 Reporter: Thomas Graves On branch-0.23 the following unit tests fail: org.apache.hadoop.security.TestMapredGroupMappingServiceRefresh.testGroupMappingRefresh org.apache.hadoop.security.TestMapredGroupMappingServiceRefresh.testRefreshSuperUserGroupsConfiguration
[jira] [Updated] (MAPREDUCE-4062) AM Launcher thread can hang forever
[ https://issues.apache.org/jira/browse/MAPREDUCE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-4062: - Attachment: MAPREDUCE-4062-branch-0.23.patch Patch for branch-0.23, since it now differs from trunk/branch-2. AM Launcher thread can hang forever --- Key: MAPREDUCE-4062 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4062 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.2 Reporter: Thomas Graves Assignee: Thomas Graves Attachments: MAPREDUCE-4062-branch-0.23.patch, MAPREDUCE-4062.patch We saw an instance where the RM stopped launching application masters. We found that the launcher thread was hung because something weird/bad happened to the NM node. Currently there is only 1 launcher thread (jira 4061 to fix that). We need this to not happen. Even once we increase the number of threads to more than 1, if that many nodes go bad the RM would be stuck. Note that this was stuck like this for approximately 9 hours.
Stack trace on hung AM launcher: pool-1-thread-1 prio=10 tid=0x4343e800 nid=0x3a4c in Object.wait() [0x4fad2000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.Object.wait(Object.java:485) at org.apache.hadoop.ipc.Client.call(Client.java:1076) - locked 0x2aab05a4f3f0 (a org.apache.hadoop.ipc.Client$Call) at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:135) at $Proxy76.startContainer(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:87) at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:118) at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:265) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619)
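The stack trace above shows the launcher thread waiting indefinitely inside a blocking RPC call. One general way to keep a single bad node from stalling a launcher is to bound the wait. The following is a minimal sketch of that idea using only java.util.concurrent; BoundedLaunchSketch and launchWithTimeout are hypothetical names, and this is not necessarily what the attached patches do.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BoundedLaunchSketch {
    /**
     * Runs the given call on a helper thread and waits at most timeoutMs for it.
     * Returns true if the call completed normally, false if it timed out or failed.
     */
    static boolean launchWithTimeout(Callable<Void> call, long timeoutMs) {
        ExecutorService helper = Executors.newSingleThreadExecutor();
        Future<Void> result = helper.submit(call);
        try {
            result.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            result.cancel(true);                  // interrupt the stuck call
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // preserve the caller's interrupt status
            result.cancel(true);
            return false;
        } catch (ExecutionException e) {
            return false;                         // the call itself threw
        } finally {
            helper.shutdownNow();
        }
    }
}
```

With a bound like this, a caller can report the launch as failed and move on to the next application instead of waiting for hours. Cancellation only helps if the blocked call responds to interruption, which is an assumption of this sketch.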
[jira] [Updated] (MAPREDUCE-4062) AM Launcher thread can hang forever
[ https://issues.apache.org/jira/browse/MAPREDUCE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-4062: - Attachment: MAPREDUCE-4062.patch Patch for trunk/branch-2. Contains updates to TestContainerLauncher. AM Launcher thread can hang forever --- Key: MAPREDUCE-4062 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4062 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.2 Reporter: Thomas Graves Assignee: Thomas Graves Attachments: MAPREDUCE-4062-branch-0.23.patch, MAPREDUCE-4062.patch, MAPREDUCE-4062.patch We saw an instance where the RM stopped launching application masters. We found that the launcher thread was hung because something weird/bad happened to the NM node. Currently there is only 1 launcher thread (jira 4061 to fix that). We need this to not happen. Even once we increase the number of threads to more than 1, if that many nodes go bad the RM would be stuck. Note that this was stuck like this for approximately 9 hours.
Stack trace on hung AM launcher: pool-1-thread-1 prio=10 tid=0x4343e800 nid=0x3a4c in Object.wait() [0x4fad2000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.Object.wait(Object.java:485) at org.apache.hadoop.ipc.Client.call(Client.java:1076) - locked 0x2aab05a4f3f0 (a org.apache.hadoop.ipc.Client$Call) at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:135) at $Proxy76.startContainer(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:87) at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:118) at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:265) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619)
[jira] [Commented] (MAPREDUCE-4012) Hadoop Job setup error leaves no useful info to users (when LinuxTaskController is used)
[ https://issues.apache.org/jira/browse/MAPREDUCE-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245419#comment-13245419 ] Hudson commented on MAPREDUCE-4012: --- Integrated in Hadoop-Hdfs-trunk-Commit #2053 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2053/]) MAPREDUCE-4012 Hadoop Job setup error leaves no useful info to users. (tgraves) (Revision 1308976) Result = SUCCESS tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308976 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java Hadoop Job setup error leaves no useful info to users (when LinuxTaskController is used) Key: MAPREDUCE-4012 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4012 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.205.0, 1.0.0, 2.0.0 Reporter: Koji Noguchi Assignee: Thomas Graves Priority: Minor Attachments: MAPREDUCE-4012-branch-1.patch, MAPREDUCE-4012.patch, mapreduce-4012-1.patch When the distributed cache pull fails on the TaskTracker, the job web UI only shows {noformat} Job initialization failed (255) {noformat} leaving users confused. The TaskTracker log, however, has an entry with useful info: {noformat} 2012-03-14 21:44:17,083 INFO org.apache.hadoop.mapred.TaskController: org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=user1, access=READ, inode=testfile:user3:users:rw--- ... 2012-03-14 21:44:17,083 INFO org.apache.hadoop.mapred.TaskController: at org.apache.hadoop.filecache.TrackerDistributedCacheManager.downloadCacheObject(TrackerDistributedCacheManager.java:415) ...
2012-03-14 21:44:17,083 INFO org.apache.hadoop.mapred.TaskController: at org.apache.hadoop.mapred.JobLocalizer.main(JobLocalizer.java:530) {noformat}
[jira] [Updated] (MAPREDUCE-4012) Hadoop Job setup error leaves no useful info to users (when LinuxTaskController is used)
[ https://issues.apache.org/jira/browse/MAPREDUCE-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-4012: - Resolution: Fixed Fix Version/s: 0.23.3 1.1.0 Target Version/s: 1.1.0, 0.23.3 (was: 2.0.0, 1.0.3) Status: Resolved (was: Patch Available) Thanks for the review, Bobby. I've committed this to branch-1, branch-0.23, branch-2, and trunk. Hadoop Job setup error leaves no useful info to users (when LinuxTaskController is used) Key: MAPREDUCE-4012 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4012 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.205.0, 1.0.0, 2.0.0 Reporter: Koji Noguchi Assignee: Thomas Graves Priority: Minor Fix For: 1.1.0, 0.23.3 Attachments: MAPREDUCE-4012-branch-1.patch, MAPREDUCE-4012.patch, mapreduce-4012-1.patch When the distributed cache pull fails on the TaskTracker, the job web UI only shows {noformat} Job initialization failed (255) {noformat} leaving users confused. The TaskTracker log, however, has an entry with useful info: {noformat} 2012-03-14 21:44:17,083 INFO org.apache.hadoop.mapred.TaskController: org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=user1, access=READ, inode=testfile:user3:users:rw--- ... 2012-03-14 21:44:17,083 INFO org.apache.hadoop.mapred.TaskController: at org.apache.hadoop.filecache.TrackerDistributedCacheManager.downloadCacheObject(TrackerDistributedCacheManager.java:415) ... 2012-03-14 21:44:17,083 INFO org.apache.hadoop.mapred.TaskController: at org.apache.hadoop.mapred.JobLocalizer.main(JobLocalizer.java:530) {noformat}
[jira] [Commented] (MAPREDUCE-4012) Hadoop Job setup error leaves no useful info to users (when LinuxTaskController is used)
[ https://issues.apache.org/jira/browse/MAPREDUCE-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245421#comment-13245421 ] Hudson commented on MAPREDUCE-4012: --- Integrated in Hadoop-Common-trunk-Commit #1979 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1979/]) MAPREDUCE-4012 Hadoop Job setup error leaves no useful info to users. (tgraves) (Revision 1308976) Result = SUCCESS tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308976 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java Hadoop Job setup error leaves no useful info to users (when LinuxTaskController is used) Key: MAPREDUCE-4012 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4012 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.205.0, 1.0.0, 2.0.0 Reporter: Koji Noguchi Assignee: Thomas Graves Priority: Minor Fix For: 1.1.0, 0.23.3 Attachments: MAPREDUCE-4012-branch-1.patch, MAPREDUCE-4012.patch, mapreduce-4012-1.patch When the distributed cache pull fails on the TaskTracker, the job web UI only shows {noformat} Job initialization failed (255) {noformat} leaving users confused. The TaskTracker log, however, has an entry with useful info: {noformat} 2012-03-14 21:44:17,083 INFO org.apache.hadoop.mapred.TaskController: org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=user1, access=READ, inode=testfile:user3:users:rw--- ... 2012-03-14 21:44:17,083 INFO org.apache.hadoop.mapred.TaskController: at org.apache.hadoop.filecache.TrackerDistributedCacheManager.downloadCacheObject(TrackerDistributedCacheManager.java:415) ...
2012-03-14 21:44:17,083 INFO org.apache.hadoop.mapred.TaskController: at org.apache.hadoop.mapred.JobLocalizer.main(JobLocalizer.java:530) {noformat}
[jira] [Updated] (MAPREDUCE-4062) AM Launcher thread can hang forever
[ https://issues.apache.org/jira/browse/MAPREDUCE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-4062: - Status: Patch Available (was: Open) AM Launcher thread can hang forever --- Key: MAPREDUCE-4062 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4062 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.2 Reporter: Thomas Graves Assignee: Thomas Graves Attachments: MAPREDUCE-4062-branch-0.23.patch, MAPREDUCE-4062.patch, MAPREDUCE-4062.patch We saw an instance where the RM stopped launching application masters. We found that the launcher thread was hung because something weird/bad happened to the NM node. Currently there is only 1 launcher thread (jira 4061 to fix that). We need this to not happen. Even once we increase the number of threads to more than 1, if that many nodes go bad the RM would be stuck. Note that this was stuck like this for approximately 9 hours. Stack trace on hung AM launcher: pool-1-thread-1 prio=10 tid=0x4343e800 nid=0x3a4c in Object.wait() [0x4fad2000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.Object.wait(Object.java:485) at org.apache.hadoop.ipc.Client.call(Client.java:1076) - locked 0x2aab05a4f3f0 (a org.apache.hadoop.ipc.Client$Call) at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:135) at $Proxy76.startContainer(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:87) at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:118) at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:265) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619)
[jira] [Commented] (MAPREDUCE-4024) RM webservices can't query on finalStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245431#comment-13245431 ] Thomas Graves commented on MAPREDUCE-4024: -- You are correct in that it will not work for lower case values. equalsIgnoreCase is not needed; it could have been just equals. It does work as intended/documented though. We could change it to accept lower case if that is preferable. The documentation states: the job state - valid values are: NEW, INITED, RUNNING, SUCCEEDED, FAILED, KILL_WAIT, KILLED, ERROR The final status of the application if finished - reported by the application itself - valid values are: UNDEFINED, SUCCEEDED, FAILED, KILLED RM webservices can't query on finalStatus - Key: MAPREDUCE-4024 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4024 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.2 Reporter: Thomas Graves Assignee: Thomas Graves Fix For: 0.23.3 Attachments: MAPREDUCE-4024.patch The resource manager web service API to get the list of apps doesn't have a query parameter for finalStatus. It has one for the state, but since that isn't what is reported by the app master, we really need to be able to query on both state and finalStatus.
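If lower-case values were to be accepted, as floated in the comment above, one common approach is to normalize the query value before matching it against the documented set. A minimal sketch, assuming a hypothetical helper (FinalStatusQuerySketch and parse are made-up names, not the RM web service code); the enum constants are the final-status values quoted from the documentation.

```java
import java.util.Locale;

class FinalStatusQuerySketch {
    // Valid final-status values as quoted in the comment above.
    enum FinalStatus { UNDEFINED, SUCCEEDED, FAILED, KILLED }

    /**
     * Parses a query-parameter value without regard to case.
     * Throws IllegalArgumentException for null or unrecognized input.
     */
    static FinalStatus parse(String raw) {
        if (raw == null) {
            throw new IllegalArgumentException("finalStatus must not be null");
        }
        // Locale.ROOT avoids locale-specific case mappings (for example the Turkish dotless i).
        return FinalStatus.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }
}
```

Under this sketch, "succeeded" and "SUCCEEDED" both map to FinalStatus.SUCCEEDED, while an unknown value is rejected rather than silently matching nothing.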
[jira] [Commented] (MAPREDUCE-4012) Hadoop Job setup error leaves no useful info to users (when LinuxTaskController is used)
[ https://issues.apache.org/jira/browse/MAPREDUCE-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245429#comment-13245429 ] Hudson commented on MAPREDUCE-4012: --- Integrated in Hadoop-Mapreduce-trunk-Commit #1991 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1991/]) MAPREDUCE-4012 Hadoop Job setup error leaves no useful info to users. (tgraves) (Revision 1308976) Result = SUCCESS tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308976 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java Hadoop Job setup error leaves no useful info to users (when LinuxTaskController is used) Key: MAPREDUCE-4012 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4012 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.205.0, 1.0.0, 2.0.0 Reporter: Koji Noguchi Assignee: Thomas Graves Priority: Minor Fix For: 1.1.0, 0.23.3 Attachments: MAPREDUCE-4012-branch-1.patch, MAPREDUCE-4012.patch, mapreduce-4012-1.patch When the distributed cache pull fails on the TaskTracker, the job web UI only shows {noformat} Job initialization failed (255) {noformat} leaving users confused. The TaskTracker log, however, has an entry with useful info: {noformat} 2012-03-14 21:44:17,083 INFO org.apache.hadoop.mapred.TaskController: org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=user1, access=READ, inode=testfile:user3:users:rw--- ... 2012-03-14 21:44:17,083 INFO org.apache.hadoop.mapred.TaskController: at org.apache.hadoop.filecache.TrackerDistributedCacheManager.downloadCacheObject(TrackerDistributedCacheManager.java:415) ...
2012-03-14 21:44:17,083 INFO org.apache.hadoop.mapred.TaskController: at org.apache.hadoop.mapred.JobLocalizer.main(JobLocalizer.java:530) {noformat}
[jira] [Commented] (MAPREDUCE-4097) tools testcases fail because missing mrapp-generated-classpath file in classpath
[ https://issues.apache.org/jira/browse/MAPREDUCE-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245442#comment-13245442 ] Harsh J commented on MAPREDUCE-4097: +1, looks good to me. I applied the patch and ran 'mvn test' at the hadoop-tools level. tools testcases fail because missing mrapp-generated-classpath file in classpath Key: MAPREDUCE-4097 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4097 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Affects Versions: 2.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 2.0.0 Attachments: MAPREDUCE-4097.patch The mrapp-generated-classpath file is created in the hadoop-mapreduce-client-app/target/classes/ dir but it is excluded from the JAR. When running tools testcases from the root level, mvn uses the hadoop-mapreduce-client-app/target/classes/ dir to create the classpath. When running tools testcases from the tools level, mvn uses the hadoop-mapreduce-client-app JAR from the M2 cache to create the classpath. In the latter case the mrapp-generated-classpath file is not present.
[jira] [Updated] (MAPREDUCE-4072) User set java.library.path seems to overwrite default creating problems native lib loading
[ https://issues.apache.org/jira/browse/MAPREDUCE-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anupam Seth updated MAPREDUCE-4072: --- Attachment: MAPREDUCE-4072-branch-23_documentation.patch Thanks bobby! I have updated the patch to include a pointer to config settings for the child JVM. User set java.library.path seems to overwrite default creating problems native lib loading -- Key: MAPREDUCE-4072 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4072 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Reporter: Anupam Seth Assignee: Anupam Seth Attachments: MAPREDUCE-4072-branch-23.patch, MAPREDUCE-4072-branch-23_documentation.patch, MAPREDUCE-4072-branch-23_documentation.patch This was found by Peeyush Bishnoi. While running a distributed cache example with Hadoop-0.23, tasks are failing as follows: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException: at org.apache.hadoop.util.Shell.runCommand(Shell.java:261) at org.apache.hadoop.util.Shell.run(Shell.java:188) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:381) at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:207) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:241) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:68) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) main : command provided 1 main : user is user The same Pig script and command work successfully on 0.20. See this in the stderr: Exception in thread main
java.lang.ExceptionInInitializerError at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1179) at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1149) at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1238) at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1264) at org.apache.hadoop.security.Groups.(Groups.java:54) at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:178) at org.apache.hadoop.security.UserGroupInformation.initUGI(UserGroupInformation.java:252) at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:223) at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:265) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:75) Caused by: java.lang.RuntimeException: Bailing out since native library couldn't be loaded at org.apache.hadoop.security.JniBasedUnixGroupsMapping.(JniBasedUnixGroupsMapping.java:48) ... 12 more Pig command: $ pig -Dmapred.job.queue.name=queue -Dmapred.cache.archives=archives -Dmapred.child.java.opts=-Djava.library.path=./ygeo/lib -Dip2geo.preLoadLibraries=some other libs -Djava.io.tmpdir=/grid/0/tmp -Dmapred.create.symlink=yes -Dmapred.job.map.memory.mb=3072 piggeoscript.pig -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
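The Pig command above passes -Djava.library.path through mapred.child.java.opts, and the issue title suggests that this user value replaces the default native library path. One way to picture a non-destructive alternative is to append the default path to whatever the user supplied. The sketch below illustrates that idea only: LibraryPathSketch and mergeLibraryPath are hypothetical names, the sample paths in the usage note are made up, and this is not a description of how Hadoop handles the option or of what the attached patches change.

```java
import java.io.File;

class LibraryPathSketch {
    private static final String FLAG = "-Djava.library.path=";

    /**
     * Returns userOpts with defaultPath appended to any user-supplied
     * java.library.path, or with the flag added if the user did not set it.
     * Assumes options are separated by whitespace and contain no quoted spaces.
     */
    static String mergeLibraryPath(String userOpts, String defaultPath) {
        StringBuilder merged = new StringBuilder();
        boolean userSetPath = false;
        for (String token : userOpts.trim().split("\\s+")) {
            if (token.startsWith(FLAG)) {
                // Keep the user's entries first, then fall back to the default location.
                token = token + File.pathSeparator + defaultPath;
                userSetPath = true;
            }
            if (merged.length() > 0 && !token.isEmpty()) {
                merged.append(' ');
            }
            merged.append(token);
        }
        if (!userSetPath) {
            if (merged.length() > 0) {
                merged.append(' ');
            }
            merged.append(FLAG).append(defaultPath);
        }
        return merged.toString();
    }
}
```

For example, merging "-Djava.library.path=./ygeo/lib" with a made-up default of "/opt/native" yields a single flag whose value lists both directories, so the JVM would search the user's directory first and the default one second.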
[jira] [Commented] (MAPREDUCE-4098) TestMRApps testSetClasspath fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245449#comment-13245449 ] Harsh J commented on MAPREDUCE-4098: +1, this looks good to me. Applied, and the test passes. There are reported test failures. I went over them, but they appear unrelated to this change. If you reach the same conclusion, please state so and then resolve. TestMRApps testSetClasspath fails - Key: MAPREDUCE-4098 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4098 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 2.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 2.0.0 Attachments: MAPREDUCE-4098.patch The assertion in this test checks for equality; because the generated classpath file is in the classpath, the test fails. Instead, the test should check that the expected path elements are in the classpath.
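The change of assertion described above, from exact equality to containment, can be sketched as follows. ClasspathAssertSketch and containsAll are hypothetical names and the classpath entries used with them are placeholders; this is not the code in TestMRApps or in the attached patch.

```java
import java.io.File;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ClasspathAssertSketch {
    /**
     * Returns true if every expected element appears as an entry of the
     * classpath string, regardless of order or of extra entries.
     */
    static boolean containsAll(String classpath, List<String> expected) {
        // File.pathSeparator is ":" on Unix-like systems and ";" on Windows;
        // neither needs escaping as a regular expression.
        Set<String> entries = new HashSet<>(Arrays.asList(classpath.split(File.pathSeparator)));
        return entries.containsAll(expected);
    }
}
```

A test written this way keeps passing when additional entries, such as those from a generated classpath file, show up in the environment, while still failing if an expected element is missing.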
[jira] [Commented] (MAPREDUCE-4062) AM Launcher thread can hang forever
[ https://issues.apache.org/jira/browse/MAPREDUCE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245459#comment-13245459 ]

Hadoop QA commented on MAPREDUCE-4062:
--------------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12521159/MAPREDUCE-4062.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 6 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 eclipse:eclipse. The patch built with eclipse:eclipse.
    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService
        org.apache.hadoop.yarn.server.resourcemanager.resourcetracker.TestNMExpiry
        org.apache.hadoop.yarn.server.resourcemanager.TestAMAuthorization
        org.apache.hadoop.yarn.server.resourcemanager.TestApplicationACLs
    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2133//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2133//console

This message is automatically generated.

AM Launcher thread can hang forever
-----------------------------------

Key: MAPREDUCE-4062
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4062
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.2
Reporter: Thomas Graves
Assignee: Thomas Graves
Attachments: MAPREDUCE-4062-branch-0.23.patch, MAPREDUCE-4062.patch, MAPREDUCE-4062.patch

We saw an instance where the RM stopped launching Application Masters. We found that the launcher thread was hung because something weird/bad happened to the NM node. Currently there is only 1 launcher thread (jira 4061 to fix that). We need this to not happen. Even once we increase the number of threads to more than 1, if that many nodes go bad the RM would be stuck. Note that this was stuck like this for approximately 9 hours.

Stack trace on hung AM launcher:

pool-1-thread-1 prio=10 tid=0x4343e800 nid=0x3a4c in Object.wait() [0x4fad2000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:485)
    at org.apache.hadoop.ipc.Client.call(Client.java:1076)
    - locked 0x2aab05a4f3f0 (a org.apache.hadoop.ipc.Client$Call)
    at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:135)
    at $Proxy76.startContainer(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:87)
    at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:118)
    at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:265)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
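Editorial sketch: the stack trace above shows a launcher thread waiting on a remote call with no upper bound. One generic way to keep a worker from blocking forever is to run the call on a helper thread and wait for it with a timeout. This is an illustration of that general technique only; the thread does not show what the attached patches do. The class BoundedCall and the method callWithTimeout are hypothetical names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedCall {

    // Runs the task on a helper thread and waits at most timeoutMillis for it.
    // If the task has not finished in time, it is cancelled (interrupted) and
    // the fallback value is returned instead of blocking the caller.
    public static <T> T callWithTimeout(Callable<T> task, long timeoutMillis, T fallback)
            throws InterruptedException, ExecutionException {
        ExecutorService helper = Executors.newSingleThreadExecutor();
        Future<T> future = helper.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the stuck call
            return fallback;
        } finally {
            helper.shutdownNow(); // do not leave the helper thread behind
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulated remote call that would otherwise block for a minute.
        String result = callWithTimeout(() -> {
            Thread.sleep(60_000);
            return "started";
        }, 200, "timed out");
        System.out.println(result);
    }
}
```

Cancellation relies on the blocked call responding to interruption; a call that ignores interrupts would need a timeout inside the call itself.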
[jira] [Updated] (MAPREDUCE-4059) The history server should have a separate pluggable storage/query interface
[ https://issues.apache.org/jira/browse/MAPREDUCE-4059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-4059:
-------------------------------------------

Attachment: MR-4059.txt

One more update. There was an issue where the job state was not being set in the job report for a partial job. As a result, a query that used only partial jobs failed.

The history server should have a separate pluggable storage/query interface
---------------------------------------------------------------------------

Key: MAPREDUCE-4059
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4059
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv2
Affects Versions: 0.24.0, 0.23.3
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Attachments: MR-4059.txt, MR-4059.txt, MR-4059.txt, MR-4059.txt

The history server currently caches all parsed jobs in RAM. These jobs can be very large because of counters. It would be nice to have a pluggable interface for the caching and querying of the cached data so that we can play around with different implementations. Also, just for cleanness of the code, it would be nice to split the very large JobHistoryServer.java into a few smaller files that are more understandable and readable.
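Editorial sketch of what a pluggable storage interface with a swappable in-memory implementation could look like. Nothing here is taken from the attached MR-4059.txt patches: the interface JobStorage, the class LruJobStorage, and the use of plain strings in place of parsed job objects are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HistoryStorageSketch {

    // Hypothetical storage contract; callers depend only on this interface,
    // so implementations can be swapped without touching the query code.
    public interface JobStorage {
        void put(String jobId, String parsedJob);
        String get(String jobId); // returns null when the job is not cached
        int size();
    }

    // One possible implementation: a bounded in-memory cache that evicts the
    // least recently used job once maxEntries is exceeded. Not thread-safe.
    public static class LruJobStorage implements JobStorage {
        private final Map<String, String> cache;

        public LruJobStorage(final int maxEntries) {
            // accessOrder = true makes iteration order follow recency of access.
            this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                    return size() > maxEntries;
                }
            };
        }

        @Override
        public void put(String jobId, String parsedJob) {
            cache.put(jobId, parsedJob);
        }

        @Override
        public String get(String jobId) {
            return cache.get(jobId);
        }

        @Override
        public int size() {
            return cache.size();
        }
    }

    public static void main(String[] args) {
        JobStorage storage = new LruJobStorage(2);
        storage.put("job_1", "first");
        storage.put("job_2", "second");
        storage.get("job_1");           // touch job_1 so job_2 becomes the eldest
        storage.put("job_3", "third");  // exceeds the bound and evicts job_2
        System.out.println(storage.get("job_2") == null ? "job_2 evicted" : "job_2 kept");
    }
}
```

A bounded cache is only one design choice; the same interface could sit in front of an on-disk or remote store.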
[jira] [Updated] (MAPREDUCE-4059) The history server should have a separate pluggable storage/query interface
[ https://issues.apache.org/jira/browse/MAPREDUCE-4059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-4059:
-------------------------------------------

Status: Patch Available (was: Open)
[jira] [Commented] (MAPREDUCE-4062) AM Launcher thread can hang forever
[ https://issues.apache.org/jira/browse/MAPREDUCE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245479#comment-13245479 ]

Thomas Graves commented on MAPREDUCE-4062:
------------------------------------------

The tests all pass when I manually run them on both branch-0.23 and trunk. I assume they are related to the random test failures we've been seeing.
[jira] [Commented] (MAPREDUCE-4062) AM Launcher thread can hang forever
[ https://issues.apache.org/jira/browse/MAPREDUCE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245482#comment-13245482 ]

Robert Joseph Evans commented on MAPREDUCE-4062:
------------------------------------------------

I reviewed the code for trunk/branch-2 and it looks good to me. I like how there is lots of code being deleted and almost only tests being added. I am going to look at the branch-0.23 patch now.
[jira] [Commented] (MAPREDUCE-4020) Web services returns incorrect JSON for deep queue tree
[ https://issues.apache.org/jira/browse/MAPREDUCE-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245487#comment-13245487 ]

Thomas Graves commented on MAPREDUCE-4020:
------------------------------------------

Mostly looks good. Can you add the file license header to hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerQueueInfoList.java?

Web services returns incorrect JSON for deep queue tree
-------------------------------------------------------

Key: MAPREDUCE-4020
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4020
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2, webapps
Affects Versions: 0.23.1
Reporter: Jason Lowe
Assignee: Anupam Seth
Attachments: MAPREDUCE-4020-branch-23.patch, MAPREDUCE-4020-branch-23.patch, MAPREDUCE-4020-branch-23.patch, MAPREDUCE-4020-branch-23.patch, MAPREDUCE-4020-branch-23.patch, testcase.patch

When the capacity scheduler is configured for more than two levels of queues, the web services API returns incorrect JSON for the subQueues field of some parent queues. The subQueues field for parent queues should always be an array, but sometimes the field appears multiple times for a queue, and as what looks like a CapacityQueueInfo object instead of an array. Besides the sometimes-an-array-sometimes-not problem, parsing the result into a JSON object causes all but the last subQueues field to be discarded, since they are overwritten by subsequent fields with the same name.
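Editorial sketch of the overwrite effect the MAPREDUCE-4020 description mentions: when a field name repeats within one object and the parser keeps one value per name, only the last value survives, whereas a single array-valued field keeps them all. No JSON library is used; the field list stands in for the fields of one parsed object, and the queue names a1 and a2 are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DuplicateKeySketch {

    // Mimics a parser that stores one value per field name: a repeated name
    // silently replaces the earlier value.
    public static Map<String, String> lastWins(List<String[]> fields) {
        Map<String, String> object = new LinkedHashMap<>();
        for (String[] field : fields) {
            object.put(field[0], field[1]);
        }
        return object;
    }

    // What an array-valued field preserves: every value under the same name.
    public static Map<String, List<String>> grouped(List<String[]> fields) {
        Map<String, List<String>> object = new LinkedHashMap<>();
        for (String[] field : fields) {
            List<String> values = object.get(field[0]);
            if (values == null) {
                values = new ArrayList<>();
                object.put(field[0], values);
            }
            values.add(field[1]);
        }
        return object;
    }

    public static void main(String[] args) {
        // Two subQueues fields emitted for the same parent queue (hypothetical names).
        List<String[]> fields = Arrays.asList(
                new String[] {"subQueues", "a1"},
                new String[] {"subQueues", "a2"});

        System.out.println(lastWins(fields)); // only a2 remains
        System.out.println(grouped(fields));  // both a1 and a2 remain
    }
}
```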
[jira] [Updated] (MAPREDUCE-4020) Web services returns incorrect JSON for deep queue tree
[ https://issues.apache.org/jira/browse/MAPREDUCE-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Graves updated MAPREDUCE-4020:
------------------------------------

Target Version/s: 0.23.3
Status: Open (was: Patch Available)
[jira] [Commented] (MAPREDUCE-4062) AM Launcher thread can hang forever
[ https://issues.apache.org/jira/browse/MAPREDUCE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245490#comment-13245490 ]

Robert Joseph Evans commented on MAPREDUCE-4062:
------------------------------------------------

The patch for 0.23 looks good too. +1.
[jira] [Commented] (MAPREDUCE-4098) TestMRApps testSetClasspath fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245493#comment-13245493 ]

Ahmed Radwan commented on MAPREDUCE-4098:
-----------------------------------------

+1 thanks tucu.
[jira] [Commented] (MAPREDUCE-4072) User set java.library.path seems to overwrite default creating problems native lib loading
[ https://issues.apache.org/jira/browse/MAPREDUCE-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245494#comment-13245494 ]

Hadoop QA commented on MAPREDUCE-4072:
--------------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12521161/MAPREDUCE-4072-branch-23_documentation.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 eclipse:eclipse. The patch built with eclipse:eclipse.
    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService
        org.apache.hadoop.yarn.server.resourcemanager.resourcetracker.TestNMExpiry
        org.apache.hadoop.yarn.server.resourcemanager.TestAMAuthorization
        org.apache.hadoop.yarn.server.resourcemanager.TestApplicationACLs
    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2134//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2134//console

This message is automatically generated.

User set java.library.path seems to overwrite default creating problems native lib loading
------------------------------------------------------------------------------------------

Key: MAPREDUCE-4072
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4072
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Reporter: Anupam Seth
Assignee: Anupam Seth
Attachments: MAPREDUCE-4072-branch-23.patch, MAPREDUCE-4072-branch-23_documentation.patch, MAPREDUCE-4072-branch-23_documentation.patch

This was found by Peeyush Bishnoi. While running a distributed cache example with Hadoop-0.23, tasks are failing as follows:

Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:261)
    at org.apache.hadoop.util.Shell.run(Shell.java:188)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:381)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:207)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:241)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:68)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
main : command provided 1
main : user is user

Same Pig script and command work successfully on 0.20

See this in the stderr:

Exception in thread main java.lang.ExceptionInInitializerError
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1179)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1149)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1238)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1264)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:54)
    at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:178)
    at org.apache.hadoop.security.UserGroupInformation.initUGI(UserGroupInformation.java:252)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:223)
    at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:265)
    at
[jira] [Commented] (MAPREDUCE-4062) AM Launcher thread can hang forever
[ https://issues.apache.org/jira/browse/MAPREDUCE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245500#comment-13245500 ]

Hudson commented on MAPREDUCE-4062:
-----------------------------------

Integrated in Hadoop-Hdfs-trunk-Commit #2054 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2054/])
    MAPREDUCE-4062. AM Launcher thread can hang forever (tgraves via bobby) (Revision 1309037)

Result = SUCCESS

bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1309037
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncher.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/launcher/TestContainerLauncher.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/client/ContainerManagerPBClientImpl.java
[jira] [Updated] (MAPREDUCE-4062) AM Launcher thread can hang forever
[ https://issues.apache.org/jira/browse/MAPREDUCE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-4062:
-------------------------------------------

Resolution: Fixed
Fix Version/s: 2.0.0
               0.23.3
Status: Resolved (was: Patch Available)

Thanks Tom, I just put this into trunk, branch-2, branch-0.23
[jira] [Commented] (MAPREDUCE-4062) AM Launcher thread can hang forever
[ https://issues.apache.org/jira/browse/MAPREDUCE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245503#comment-13245503 ]

Hudson commented on MAPREDUCE-4062:
-----------------------------------

Integrated in Hadoop-Common-trunk-Commit #1980 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1980/])
    Missed a test file as part of MAPREDUCE-4062 (Revision 1309043)
    MAPREDUCE-4062. AM Launcher thread can hang forever (tgraves via bobby) (Revision 1309037)

Result = SUCCESS

bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1309043
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/TestContainerLaunchRPC.java

bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1309037
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncher.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/launcher/TestContainerLauncher.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/client/ContainerManagerPBClientImpl.java
[jira] [Commented] (MAPREDUCE-4098) TestMRApps testSetClasspath fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245509#comment-13245509 ]

Alejandro Abdelnur commented on MAPREDUCE-4098:

Thanks Harsh. Yes, the test failures seem unrelated. It looks like a zombie JVM with a minicluster is still around (the YARN minicluster uses fixed ports).

TestMRApps testSetClasspath fails

Key: MAPREDUCE-4098
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4098
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: test
Affects Versions: 2.0.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Fix For: 2.0.0
Attachments: MAPREDUCE-4098.patch

The test asserts equality on the classpath. Because the generated classpath file is itself on the classpath, the test fails. Instead, the test should check that the expected path elements are in the classpath.
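The difference between the two kinds of assertion can be sketched as follows. This is a hedged illustration, not the attached MAPREDUCE-4098.patch. ClasspathCheck and containsAll are hypothetical names, and the classpath is assumed to be a single string joined with the platform path separator.

```java
import java.io.File;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not taken from TestMRApps.
final class ClasspathCheck {
    /** Returns true if every expected element appears in the classpath string. */
    static boolean containsAll(String classpath, List<String> expected) {
        Set<String> actual =
            new HashSet<>(Arrays.asList(classpath.split(File.pathSeparator)));
        return actual.containsAll(expected);
    }
}
```

An equality check breaks as soon as the environment adds an extra entry, such as the generated classpath file. A containment check only fails when an expected entry is actually missing.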
[jira] [Commented] (MAPREDUCE-4062) AM Launcher thread can hang forever
[ https://issues.apache.org/jira/browse/MAPREDUCE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245512#comment-13245512 ]

Hudson commented on MAPREDUCE-4062:

Integrated in Hadoop-Mapreduce-trunk-Commit #1992 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1992/])
MAPREDUCE-4062. AM Launcher thread can hang forever (tgraves via bobby) (Revision 1309037)

Result = SUCCESS

bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1309037
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncher.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/launcher/TestContainerLauncher.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/client/ContainerManagerPBClientImpl.java
[jira] [Commented] (MAPREDUCE-4062) AM Launcher thread can hang forever
[ https://issues.apache.org/jira/browse/MAPREDUCE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245533#comment-13245533 ]

Hudson commented on MAPREDUCE-4062:

Integrated in Hadoop-Hdfs-trunk-Commit #2055 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2055/])
Missed a test file as part of MAPREDUCE-4062 (Revision 1309043)

Result = SUCCESS

bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1309043
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/TestContainerLaunchRPC.java
[jira] [Commented] (MAPREDUCE-4059) The history server should have a separate pluggable storage/query interface
[ https://issues.apache.org/jira/browse/MAPREDUCE-4059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245541#comment-13245541 ]

Bhallamudi Venkata Siva Kamesh commented on MAPREDUCE-4059:

Hi Robert,

If count is null, I think assigning jobs.size() to count is a good option. If so, I think we won't have any overflows and can eliminate the following piece of code.

{code:xml}
if(end < 0) { //due to overflow
  end = Long.MAX_VALUE;
}
{code}

Any comments?

The history server should have a separate pluggable storage/query interface

Key: MAPREDUCE-4059
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4059
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv2
Affects Versions: 0.24.0, 0.23.3
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Attachments: MR-4059.txt, MR-4059.txt, MR-4059.txt, MR-4059.txt

The history server currently caches all parsed jobs in RAM. These jobs can be very large because of counters. It would be nice to have a pluggable interface for the caching and querying of the cached data so that we can play around with different implementations. Also, just for cleanliness of the code, it would be nice to split the very large JobHistoryServer.java into a few smaller files that are more understandable and readable.
[jira] [Commented] (MAPREDUCE-4062) AM Launcher thread can hang forever
[ https://issues.apache.org/jira/browse/MAPREDUCE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245544#comment-13245544 ]

Hudson commented on MAPREDUCE-4062:

Integrated in Hadoop-Mapreduce-trunk-Commit #1993 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1993/])
Missed a test file as part of MAPREDUCE-4062 (Revision 1309043)

Result = SUCCESS

bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1309043
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/TestContainerLaunchRPC.java
[jira] [Created] (MAPREDUCE-4101) nodemanager depends on /bin/bash
nodemanager depends on /bin/bash

Key: MAPREDUCE-4101
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4101
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: nodemanager
Affects Versions: 0.23.1
Environment: FreeBSD 8.2 / 64 bit
Reporter: Radim Kolar

Currently the nodemanager depends on the bash shell. This should be well documented for systems that do not have bash installed by default, such as FreeBSD. Because only basic bash functionality is used, changing bash to /bin/sh would probably work well enough.

I found 2 cases:

1. DefaultContainerExecutor.java creates a file with /bin/bash hardcoded in writeLocalWrapperScript (this needs bash in /bin).

2. yarn-hduser-nodemanager-ponto.amerinoc.com.log:2012-04-03 19:50:10,798 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: launchContainer: [bash, -c, /tmp/nm-local-dir/usercache/hduser/appcache/application_1333474251533_0002/container_1333474251533_0002_01_12/default_container_executor.sh]
   The created script is also launched by bash. In this case bash anywhere in the path works; on FreeBSD it is /usr/local/bin/bash.
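One way to avoid a hardcoded interpreter path is to look the shell up in a list of directories and fall back to /bin/sh. The sketch below is only an illustration of that idea under stated assumptions. ShellLocator and find are hypothetical names, and nothing here reflects how DefaultContainerExecutor is or will be implemented.

```java
import java.io.File;
import java.util.List;

// Hypothetical helper for illustration; not part of the nodemanager.
final class ShellLocator {
    /**
     * Returns the path of the first executable file called name found in dirs,
     * or fallback if none of the directories contains one.
     */
    static String find(String name, List<String> dirs, String fallback) {
        for (String dir : dirs) {
            File candidate = new File(dir, name);
            if (candidate.isFile() && candidate.canExecute()) {
                return candidate.getPath();
            }
        }
        return fallback;
    }
}
```

A caller could pass the entries of the PATH environment variable, split on File.pathSeparator, as dirs. That would pick up /usr/local/bin/bash on a system laid out like the FreeBSD host in this report and use /bin/sh where bash is absent.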
[jira] [Commented] (MAPREDUCE-4059) The history server should have a separate pluggable storage/query interface
[ https://issues.apache.org/jira/browse/MAPREDUCE-4059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245547#comment-13245547 ]

Hadoop QA commented on MAPREDUCE-4059:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12521166/MR-4059.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 21 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
    org.apache.hadoop.mapreduce.v2.TestMiniMRProxyUser
    org.apache.hadoop.mapreduce.v2.TestSpeculativeExecution
    org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath
    org.apache.hadoop.mapreduce.v2.TestRMNMInfo
    org.apache.hadoop.mapred.TestReduceFetchFromPartialMem
    org.apache.hadoop.mapred.TestLazyOutput
    org.apache.hadoop.mapreduce.TestChild
    org.apache.hadoop.mapred.TestReduceFetch
    org.apache.hadoop.mapred.TestMiniMRBringup
    org.apache.hadoop.mapred.TestMiniMRClientCluster
    org.apache.hadoop.mapreduce.lib.output.TestJobOutputCommitter
    org.apache.hadoop.mapreduce.v2.TestMROldApiJobs
    org.apache.hadoop.mapreduce.v2.TestNonExistentJob
    org.apache.hadoop.mapred.TestMiniMRChildTask
    org.apache.hadoop.mapred.TestJobCounters
    org.apache.hadoop.mapreduce.v2.TestUberAM
    org.apache.hadoop.conf.TestNoDefaultsJobConf
    org.apache.hadoop.mapred.TestJobSysDirWithDFS
    org.apache.hadoop.mapreduce.v2.TestMRJobs
    org.apache.hadoop.mapred.TestClientRedirect
    org.apache.hadoop.mapred.TestJobCleanup
    org.apache.hadoop.mapreduce.TestMapReduceLazyOutput
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2135//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2135//console

This message is automatically generated.
[jira] [Updated] (MAPREDUCE-4072) User set java.library.path seems to overwrite default creating problems native lib loading
[ https://issues.apache.org/jira/browse/MAPREDUCE-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-4072:

Resolution: Fixed
Fix Version/s: 2.0.0, 0.23.3
Release Note: -Djava.library.path in mapred.child.java.opts can cause issues with native libraries. LD_LIBRARY_PATH through mapred.child.env should be used instead.
Hadoop Flags: Incompatible change,Reviewed
Status: Resolved (was: Patch Available)

Thanks Anupam. +1 on the documentation. I have checked this into trunk, branch-2, and branch-0.23.

User set java.library.path seems to overwrite default creating problems native lib loading

Key: MAPREDUCE-4072
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4072
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Reporter: Anupam Seth
Assignee: Anupam Seth
Fix For: 0.23.3, 2.0.0
Attachments: MAPREDUCE-4072-branch-23.patch, MAPREDUCE-4072-branch-23_documentation.patch, MAPREDUCE-4072-branch-23_documentation.patch

This was found by Peeyush Bishnoi.

While running a distributed cache example with Hadoop-0.23, tasks are failing as follows:

Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:261)
    at org.apache.hadoop.util.Shell.run(Shell.java:188)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:381)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:207)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:241)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:68)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
main : command provided 1
main : user is user

The same Pig script and command work successfully on 0.20.

See this in the stderr:

Exception in thread "main" java.lang.ExceptionInInitializerError
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1179)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1149)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1238)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1264)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:54)
    at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:178)
    at org.apache.hadoop.security.UserGroupInformation.initUGI(UserGroupInformation.java:252)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:223)
    at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:265)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:75)
Caused by: java.lang.RuntimeException: Bailing out since native library couldn't be loaded
    at org.apache.hadoop.security.JniBasedUnixGroupsMapping.<clinit>(JniBasedUnixGroupsMapping.java:48)
    ... 12 more

Pig command:

$ pig -Dmapred.job.queue.name=queue -Dmapred.cache.archives=archives -Dmapred.child.java.opts=-Djava.library.path=./ygeo/lib -Dip2geo.preLoadLibraries=some other libs -Djava.io.tmpdir=/grid/0/tmp -Dmapred.create.symlink=yes -Dmapred.job.map.memory.mb=3072 piggeoscript.pig
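The release note amounts to moving the library path out of the child JVM options and into the child environment. The sketch below illustrates that rewrite on plain strings. It is a hypothetical helper (ChildOptsMigration and migrate are made-up names) that does not touch any Hadoop API. The only things taken from the issue are the -Djava.library.path flag, the LD_LIBRARY_PATH variable, and the two property names mentioned in the comments.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical migration helper for illustration; not part of Hadoop.
final class ChildOptsMigration {
    private static final String FLAG = "-Djava.library.path=";

    /**
     * Splits child JVM options on whitespace and returns two strings:
     * [0] the options with any -Djava.library.path flag removed
     *     (a candidate value for mapred.child.java.opts),
     * [1] an LD_LIBRARY_PATH=... entry (a candidate value for
     *     mapred.child.env), or "" if no library path was set.
     */
    static String[] migrate(String childJavaOpts) {
        List<String> kept = new ArrayList<>();
        String env = "";
        for (String opt : childJavaOpts.trim().split("\\s+")) {
            if (opt.startsWith(FLAG)) {
                env = "LD_LIBRARY_PATH=" + opt.substring(FLAG.length());
            } else if (!opt.isEmpty()) {
                kept.add(opt);
            }
        }
        return new String[] { String.join(" ", kept), env };
    }
}
```

Applied to the options in the Pig command above, this would yield empty child JVM options and LD_LIBRARY_PATH=./ygeo/lib for the environment. Whether a relative path resolves as intended inside the container is outside what this sketch checks.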
[jira] [Commented] (MAPREDUCE-4059) The history server should have a separate pluggable storage/query interface
[ https://issues.apache.org/jira/browse/MAPREDUCE-4059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245553#comment-13245553 ]

Robert Joseph Evans commented on MAPREDUCE-4059:

Yes that would probably work too, but if someone does happen to put in a count that is close to LONG_MAX we can still run into this situation, so I would prefer to just leave the code as is.
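The case raised in this reply can be made concrete with a small sketch. It assumes the end of the requested range is computed as offset + count with both values non-negative. RangeEnd, offset and count are illustrative names, not the actual history server code.

```java
// Hypothetical sketch of the overflow guard under discussion.
final class RangeEnd {
    /** Returns offset + count, clamped to Long.MAX_VALUE if the sum overflows. */
    static long end(long offset, long count) {
        long end = offset + count;
        if (end < 0) { // two non-negative longs wrapped past Long.MAX_VALUE
            end = Long.MAX_VALUE;
        }
        return end;
    }
}
```

Defaulting a missing count to jobs.size() keeps the sum small in that one case, but a caller-supplied count near Long.MAX_VALUE still wraps to a negative number once a positive offset is added, which is why the guard remains useful.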
[jira] [Commented] (MAPREDUCE-4072) User set java.library.path seems to overwrite default creating problems native lib loading
[ https://issues.apache.org/jira/browse/MAPREDUCE-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245560#comment-13245560 ]

Hudson commented on MAPREDUCE-4072:

Integrated in Hadoop-Common-trunk-Commit #1981 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1981/])
MAPREDUCE-4072. User set java.library.path seems to overwrite default creating problems native lib loading (Anupam Seth via bobby) (Revision 1309077)

Result = SUCCESS

bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1309077
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/WritingYarnApplications.apt.vm
[jira] [Commented] (MAPREDUCE-4072) User set java.library.path seems to overwrite default creating problems native lib loading
[ https://issues.apache.org/jira/browse/MAPREDUCE-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245559#comment-13245559 ]

Hudson commented on MAPREDUCE-4072:

Integrated in Hadoop-Hdfs-trunk-Commit #2056 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2056/])
MAPREDUCE-4072. User set java.library.path seems to overwrite default creating problems native lib loading (Anupam Seth via bobby) (Revision 1309077)

Result = SUCCESS

bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1309077
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/WritingYarnApplications.apt.vm
[jira] [Commented] (MAPREDUCE-4072) User set java.library.path seems to overwrite default creating problems native lib loading
[ https://issues.apache.org/jira/browse/MAPREDUCE-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245575#comment-13245575 ] Hudson commented on MAPREDUCE-4072: --- Integrated in Hadoop-Mapreduce-trunk-Commit #1994 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1994/]) MAPREDUCE-4072. User set java.library.path seems to overwrite default creating problems native lib loading (Anupam Seth via bobby) (Revision 1309077) Result = SUCCESS bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1309077 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/WritingYarnApplications.apt.vm User set java.library.path seems to overwrite default creating problems native lib loading -- Key: MAPREDUCE-4072 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4072 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Reporter: Anupam Seth Assignee: Anupam Seth Fix For: 0.23.3, 2.0.0 Attachments: MAPREDUCE-4072-branch-23.patch, MAPREDUCE-4072-branch-23_documentation.patch, MAPREDUCE-4072-branch-23_documentation.patch This was found by Peeyush Bishnoi. 
While running a distributed cache example with Hadoop-0.23, tasks are failing as follows: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException: at org.apache.hadoop.util.Shell.runCommand(Shell.java:261) at org.apache.hadoop.util.Shell.run(Shell.java:188) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:381) at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:207) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:241) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:68) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) main : command provided 1 main : user is user Same Pig script and command work successfully on 0.20 See this in the stderr: Exception in thread main java.lang.ExceptionInInitializerError at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1179) at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1149) at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1238) at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1264) at org.apache.hadoop.security.Groups.(Groups.java:54) at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:178) at org.apache.hadoop.security.UserGroupInformation.initUGI(UserGroupInformation.java:252) at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:223) at 
org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:265) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:75) Caused by: java.lang.RuntimeException: Bailing out since native library couldn't be loaded at org.apache.hadoop.security.JniBasedUnixGroupsMapping.(JniBasedUnixGroupsMapping.java:48) ... 12 more Pig command: $ pig -Dmapred.job.queue.name=queue -Dmapred.cache.archives=archives -Dmapred.child.java.opts=-Djava.library.path=./ygeo/lib -Dip2geo.preLoadLibraries=some other libs -Djava.io.tmpdir=/grid/0/tmp -Dmapred.create.symlink=yes -Dmapred.job.map.memory.mb=3072 piggeoscript.pig
[jira] [Updated] (MAPREDUCE-3988) mapreduce.job.local.dir doesn't point to a single directory on a node.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3988: --- Resolution: Fixed Fix Version/s: 2.0.0 0.23.3 Status: Resolved (was: Patch Available) Thanks Eric. I checked this into trunk, branch-2, and branch-0.23 mapreduce.job.local.dir doesn't point to a single directory on a node. -- Key: MAPREDUCE-3988 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3988 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Eric Payne Fix For: 0.23.3, 2.0.0 Attachments: MAPREDUCE-3988-1.txt, MAPREDUCE-3988-2.txt, MAPREDUCE-3988-2.txt After MAPREDUCE-3975, mapreduce.job.local.dir is set correctly for the tasks but it doesn't point to the same directory for all tasks running on the node. It is a public API. Either we should point to a single directory or point it to all directories and change the documentation to say that it points to all dirs.
[jira] [Commented] (MAPREDUCE-3988) mapreduce.job.local.dir doesn't point to a single directory on a node.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245599#comment-13245599 ] Hudson commented on MAPREDUCE-3988: --- Integrated in Hadoop-Hdfs-trunk-Commit #2057 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2057/]) MAPREDUCE-3988. mapreduce.job.local.dir doesn't point to a single directory on a node. (Eric Payne via bobby) (Revision 1309086) Result = SUCCESS bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1309086 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/YarnChild.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMiniMRChildTask.java mapreduce.job.local.dir doesn't point to a single directory on a node. -- Key: MAPREDUCE-3988 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3988 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Eric Payne Fix For: 0.23.3, 2.0.0 Attachments: MAPREDUCE-3988-1.txt, MAPREDUCE-3988-2.txt, MAPREDUCE-3988-2.txt After MAPREDUCE-3975, mapreduce.job.local.dir is set correctly for the tasks but it doesn't point to the same directory for all tasks running on the node. It is a public API. Either we should point to a single directory or point it to all directories and change the documentation to say that it points to all dirs.
[jira] [Commented] (MAPREDUCE-3315) Master-Worker Application on YARN
[ https://issues.apache.org/jira/browse/MAPREDUCE-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245600#comment-13245600 ] Vinod Kumar Vavilapalli commented on MAPREDUCE-3315: We have a hadoop-yarn/hadoop-yarn-applications/ module under which you can put your framework (as a sibling to hadoop-yarn-applications-distributedshell/). The example could go as a sub-module under yours. Taking a step back, I think you should start with writing down the APIs: (1) The client API for submission and getting the app-status. (2) The task API: which the users can override and provide their own impl. Advanced: (3) The master API: as to how to order the tasks (a queue?) etc. Master-Worker Application on YARN - Key: MAPREDUCE-3315 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3315 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Sharad Agarwal Assignee: Sharad Agarwal Fix For: 0.24.0 Attachments: MAPREDUCE-3315.patch Currently master worker scenarios are forced fit into Map-Reduce. Now with YARN, these can be first class and would benefit real/near realtime workloads and be more effective in using the cluster resources.
[jira] [Commented] (MAPREDUCE-3988) mapreduce.job.local.dir doesn't point to a single directory on a node.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245605#comment-13245605 ] Hudson commented on MAPREDUCE-3988: --- Integrated in Hadoop-Common-trunk-Commit #1982 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1982/]) MAPREDUCE-3988. mapreduce.job.local.dir doesn't point to a single directory on a node. (Eric Payne via bobby) (Revision 1309086) Result = SUCCESS bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1309086 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/YarnChild.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMiniMRChildTask.java mapreduce.job.local.dir doesn't point to a single directory on a node. -- Key: MAPREDUCE-3988 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3988 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Eric Payne Fix For: 0.23.3, 2.0.0 Attachments: MAPREDUCE-3988-1.txt, MAPREDUCE-3988-2.txt, MAPREDUCE-3988-2.txt After MAPREDUCE-3975, mapreduce.job.local.dir is set correctly for the tasks but it doesn't point to the same directory for all tasks running on the node. It is a public API. Either we should point to a single directory or point it to all directories and change the documentation to say that it points to all dirs.
[jira] [Commented] (MAPREDUCE-3682) Tracker URL says AM tasks run on localhost
[ https://issues.apache.org/jira/browse/MAPREDUCE-3682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245607#comment-13245607 ] Thomas Graves commented on MAPREDUCE-3682: -- Ravi, can we use app.getJob().isUber() in the HsTaskPage rather than the negative port? Also, testing out the patch, it looks like the node column on the history server task attempts page now shows both IP and hostname, for example: 192.0.0.1/hostname.domain:port logs. The logs link gives an error about the user: Cannot get container logs without an app owner. This is using the aggregated logs. This patch has been up here for a while, so other changes may have gone in that affect this now. Can you please update and re-test? Tracker URL says AM tasks run on localhost -- Key: MAPREDUCE-3682 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3682 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: David Capwell Assignee: Ravi Prakash Attachments: MAPREDUCE-3682.patch, MAPREDUCE-3682.patch If you look at the task page, it will show you the node the task ran on. For jobs that run in UberAM they point to http://localhost: and logs points to http://localhost:/node/containerlogs/$container_id/ This was run on a multi node cluster.
[jira] [Updated] (MAPREDUCE-4020) Web services returns incorrect JSON for deep queue tree
[ https://issues.apache.org/jira/browse/MAPREDUCE-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anupam Seth updated MAPREDUCE-4020: --- Status: Patch Available (was: Open) Web services returns incorrect JSON for deep queue tree --- Key: MAPREDUCE-4020 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4020 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, webapps Affects Versions: 0.23.1 Reporter: Jason Lowe Assignee: Anupam Seth Attachments: MAPREDUCE-4020-branch-23.patch, MAPREDUCE-4020-branch-23.patch, MAPREDUCE-4020-branch-23.patch, MAPREDUCE-4020-branch-23.patch, MAPREDUCE-4020-branch-23.patch, MAPREDUCE-4020-branch-23.patch, testcase.patch When the capacity scheduler is configured for more than two levels of queues, the web services API returns incorrect JSON for the subQueues field of some parent queues. The subQueues field for parent queues should always be an array, but sometimes the field appears multiple times for a queue and as what looks like a CapacityQueueInfo object instead of an array. Besides the sometimes-an-array-sometimes-not problem, parsing the result into a JSON object causes all but the last subQueues field to be discarded since they are overwritten by subsequent fields with the same name.
[jira] [Updated] (MAPREDUCE-4020) Web services returns incorrect JSON for deep queue tree
[ https://issues.apache.org/jira/browse/MAPREDUCE-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anupam Seth updated MAPREDUCE-4020: --- Attachment: MAPREDUCE-4020-branch-23.patch Thanks for the review Tom! Re-attaching patch with the missing license header. Web services returns incorrect JSON for deep queue tree --- Key: MAPREDUCE-4020 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4020 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, webapps Affects Versions: 0.23.1 Reporter: Jason Lowe Assignee: Anupam Seth Attachments: MAPREDUCE-4020-branch-23.patch, MAPREDUCE-4020-branch-23.patch, MAPREDUCE-4020-branch-23.patch, MAPREDUCE-4020-branch-23.patch, MAPREDUCE-4020-branch-23.patch, MAPREDUCE-4020-branch-23.patch, testcase.patch When the capacity scheduler is configured for more than two levels of queues, the web services API returns incorrect JSON for the subQueues field of some parent queues. The subQueues field for parent queues should always be an array, but sometimes the field appears multiple times for a queue and as what looks like a CapacityQueueInfo object instead of an array. Besides the sometimes-an-array-sometimes-not problem, parsing the result into a JSON object causes all but the last subQueues field to be discarded since they are overwritten by subsequent fields with the same name.
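The data loss described in MAPREDUCE-4020 can be illustrated with a small sketch. It does not use the YARN web services or any JSON library; it only assumes that a typical parser stores object members in a map keyed by field name, so a repeated name replaces the earlier value. The class DuplicateFieldDemo and the method parseMembers are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DuplicateFieldDemo {
    /**
     * Mimics a JSON object parser that keeps one value per field name.
     * Each member is a {name, value} pair in document order (hypothetical model).
     */
    public static Map<String, String> parseMembers(List<String[]> members) {
        Map<String, String> object = new LinkedHashMap<>();
        for (String[] member : members) {
            // A repeated name overwrites the value stored for the earlier member.
            object.put(member[0], member[1]);
        }
        return object;
    }

    public static void main(String[] args) {
        List<String[]> members = new ArrayList<>();
        members.add(new String[] {"queueName", "root"});
        members.add(new String[] {"subQueues", "a"});
        members.add(new String[] {"subQueues", "b"});

        Map<String, String> parsed = parseMembers(members);
        // Only the value of the last repeated "subQueues" member is still present.
        System.out.println(parsed.get("subQueues"));
    }
}
```

Emitting subQueues once, as a single array holding every child queue, avoids the problem because no field name is repeated.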
[jira] [Commented] (MAPREDUCE-3082) archive command take wrong path for input file with current directory
[ https://issues.apache.org/jira/browse/MAPREDUCE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245610#comment-13245610 ] Robert Joseph Evans commented on MAPREDUCE-3082: Is there a reason that you are doing
{code}
if (!parentPath.isAbsolute()) {
  parentPath = new Path(parentPath.getFileSystem(getConf()).getWorkingDirectory(), args[i+1]);
}
{code}
instead of
{code}
if (!parentPath.isAbsolute()) {
  parentPath = parentPath.getFileSystem(getConf()).makeQualified(parentPath);
}
{code}
archive command take wrong path for input file with current directory - Key: MAPREDUCE-3082 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3082 Project: Hadoop Map/Reduce Issue Type: Bug Components: harchive Affects Versions: 0.20.204.1, 0.23.2, 0.23.3 Reporter: Rajit Saha Assignee: John George Attachments: MR-3082.patch $hadoop dfs -copyFromLocal /etc/passwd . $hadoop dfs -lsr . -rw--- 3 hadoopqa hdfs 6883 2011-09-23 22:37 /user/hadoopqa/passwd $hadoop archive -archiveName test1.har -p . passwd . 
11/09/23 22:39:22 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 4 for hadoopqa 11/09/23 22:39:22 INFO security.TokenCache: Got dt for hdfs://NN host/user/hadoopqa/.staging/job_201109232234_0004;uri=NN IP:8020;t.service=NN IP:8020 11/09/23 22:39:22 INFO mapred.JobClient: Running job: job_201109232234_0004 11/09/23 22:39:23 INFO mapred.JobClient: map 0% reduce 0% 11/09/23 22:39:34 INFO mapred.JobClient: Task Id : attempt_201109232234_0004_m_00_0, Status : FAILED java.io.FileNotFoundException: File does not exist: hdfs://NN host/user/hadoopqa/hadoopqa/passwd at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:525) at org.apache.hadoop.tools.HadoopArchives$HArchivesMapper.map(HadoopArchives.java:697) at org.apache.hadoop.tools.HadoopArchives$HArchivesMapper.map(HadoopArchives.java:587) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372) at org.apache.hadoop.mapred.Child$4.run(Child.java:261) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059) at org.apache.hadoop.mapred.Child.main(Child.java:255) So Archiving is failing as it was finding input file at /user/hadoopqa/hadoopqa/passwd , whereas it should look for /user/hadoopqa/passwd
[jira] [Commented] (MAPREDUCE-3988) mapreduce.job.local.dir doesn't point to a single directory on a node.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245618#comment-13245618 ] Hudson commented on MAPREDUCE-3988: --- Integrated in Hadoop-Mapreduce-trunk-Commit #1995 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1995/]) MAPREDUCE-3988. mapreduce.job.local.dir doesn't point to a single directory on a node. (Eric Payne via bobby) (Revision 1309086) Result = SUCCESS bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1309086 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/YarnChild.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMiniMRChildTask.java mapreduce.job.local.dir doesn't point to a single directory on a node. -- Key: MAPREDUCE-3988 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3988 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Eric Payne Fix For: 0.23.3, 2.0.0 Attachments: MAPREDUCE-3988-1.txt, MAPREDUCE-3988-2.txt, MAPREDUCE-3988-2.txt After MAPREDUCE-3975, mapreduce.job.local.dir is set correctly for the tasks but it doesn't point to the same directory for all tasks running on the node. It is a public API. Either we should point to a single directory or point it to all directories and change the documentation to say that it points to all dirs.
[jira] [Commented] (MAPREDUCE-4101) nodemanager depends on /bin/bash
[ https://issues.apache.org/jira/browse/MAPREDUCE-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245623#comment-13245623 ] Bikas Saha commented on MAPREDUCE-4101: --- I agree on using sh. It would be great if you could post a patch to fix it. Thanks! nodemanager depends on /bin/bash Key: MAPREDUCE-4101 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4101 Project: Hadoop Map/Reduce Issue Type: Bug Components: nodemanager Affects Versions: 0.23.1 Environment: FreeBSD 8.2 / 64 bit Reporter: Radim Kolar Currently the nodemanager depends on the bash shell. This should be well documented for systems that do not have bash installed by default, such as FreeBSD. Because only basic bash functionality is used, changing bash to /bin/sh would probably be enough. I found 2 cases: 1. DefaultContainerExecutor.java creates a file with /bin/bash hardcoded in writeLocalWrapperScript (this needs bash in /bin). 2. yarn-hduser-nodemanager-ponto.amerinoc.com.log:2012-04-03 19:50:10,798 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: launchContainer: [bash, -c, /tmp/nm-local-dir/usercache/hduser/appcache/application_1333474251533_0002/container_1333474251533_0002_01_12/default_container_executor.sh] The created script is also launched by bash; here bash anywhere in the path works (on FreeBSD it is /usr/local/bin/bash).
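As a rough illustration of the suggestion in MAPREDUCE-4101, the sketch below builds a launch command of the same shape as the logged one, [bash, -c, script], but with /bin/sh as the interpreter. It is not the DefaultContainerExecutor code. It assumes a POSIX system with a shell at /bin/sh, and the class ShLaunchSketch and its methods are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class ShLaunchSketch {
    /** Builds the launch command with /bin/sh instead of bash (assumption: /bin/sh exists). */
    public static List<String> launchCommand(String script) {
        return Arrays.asList("/bin/sh", "-c", script);
    }

    /** Runs the command and returns its combined stdout and stderr. */
    public static String runAndCapture(List<String> command) {
        try {
            Process process = new ProcessBuilder(command).redirectErrorStream(true).start();
            StringBuilder output = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    output.append(line).append('\n');
                }
            }
            process.waitFor();
            return output.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the shell", e);
        }
    }

    public static void main(String[] args) {
        // A real launcher would pass the path of the generated wrapper script here.
        System.out.print(runAndCapture(launchCommand("echo launched-with-sh")));
    }
}
```

This only works if the generated wrapper script itself sticks to POSIX sh syntax, which is the premise of the report ("only basic functionality of bash is used") and would need to be checked against the actual script.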
[jira] [Commented] (MAPREDUCE-3672) Killed maps shouldn't be counted towards JobCounter.NUM_FAILED_MAPS
[ https://issues.apache.org/jira/browse/MAPREDUCE-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245628#comment-13245628 ] Robert Joseph Evans commented on MAPREDUCE-3672: Have you tested this on a cluster? Killed maps shouldn't be counted towards JobCounter.NUM_FAILED_MAPS --- Key: MAPREDUCE-3672 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3672 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am, mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Anupam Seth Fix For: 0.24.0 Attachments: MAPREDUCE-3672-branch-23.patch, MAPREDUCE-3672-branch-23.patch We count maps that are killed, say by speculator, towards JobCounter.NUM_FAILED_MAPS. We should instead have a separate JobCounter for killed maps. Same with reduces too.
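The separation requested in MAPREDUCE-3672 can be sketched in plain Java. This is not the MR application master code and it does not use Hadoop's JobCounter enum. The counter name NUM_KILLED_MAPS, the Outcome enum, and the class AttemptCounterSketch are hypothetical and exist only for this illustration.

```java
import java.util.EnumMap;
import java.util.Map;

public class AttemptCounterSketch {
    /** Final state of a map attempt (hypothetical model). */
    public enum Outcome { SUCCEEDED, FAILED, KILLED }

    // NUM_KILLED_MAPS is a hypothetical name for the separate counter the issue asks for.
    public enum MapCounter { NUM_FAILED_MAPS, NUM_KILLED_MAPS }

    private final Map<MapCounter, Long> counts = new EnumMap<>(MapCounter.class);

    /** Records one finished attempt, keeping kills apart from genuine failures. */
    public void record(Outcome outcome) {
        if (outcome == Outcome.FAILED) {
            counts.merge(MapCounter.NUM_FAILED_MAPS, 1L, Long::sum);
        } else if (outcome == Outcome.KILLED) {
            counts.merge(MapCounter.NUM_KILLED_MAPS, 1L, Long::sum);
        }
    }

    public long get(MapCounter counter) {
        return counts.getOrDefault(counter, 0L);
    }

    public static void main(String[] args) {
        AttemptCounterSketch job = new AttemptCounterSketch();
        job.record(Outcome.KILLED); // for example, an attempt killed by the speculator
        job.record(Outcome.FAILED);
        System.out.println("failed=" + job.get(MapCounter.NUM_FAILED_MAPS)
                + " killed=" + job.get(MapCounter.NUM_KILLED_MAPS));
    }
}
```

The same split would apply to reduce attempts, as the issue notes.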
[jira] [Updated] (MAPREDUCE-3999) Tracking link gives an error if the AppMaster hasn't started yet
[ https://issues.apache.org/jira/browse/MAPREDUCE-3999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3999: --- Resolution: Fixed Fix Version/s: 2.0.0 0.23.3 Status: Resolved (was: Patch Available) Thanks Ravi, I just put this into trunk, branch-2, and branch-0.23 Tracking link gives an error if the AppMaster hasn't started yet Key: MAPREDUCE-3999 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3999 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, webapps Affects Versions: 0.23.1 Reporter: Ravi Prakash Assignee: Ravi Prakash Fix For: 0.23.3, 2.0.0 Attachments: MAPREDUCE-3999.patch, MAPREDUCE-3999.patch, MAPREDUCE-3999.patch Courtesy [~sseth] {quote} The MRAppMaster died before writing anything. Steps to generate the error: 1. Setup a queue with 1 max active application per user 2. Submit a long running job to this queue. 3. Submit another job to the queue as the same user. Access the tracking URL for job 2 directly or via Oozie (not via the RM link - which is rewritten once the app starts). This would exist in situations where the queue doesn't have enough capacity - or for the small period of time between app submission and AM start. {quote}
[jira] [Created] (MAPREDUCE-4102) job counters not available in Jobhistory webui for killed jobs
job counters not available in Jobhistory webui for killed jobs -- Key: MAPREDUCE-4102 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4102 Project: Hadoop Map/Reduce Issue Type: Bug Components: webapps Affects Versions: 0.23.2 Reporter: Thomas Graves Run a simple wordcount or sleep, and kill the job before it finishes. Go to the job history web ui and click the Counters link for that job. It displays a 500 error. The job history log has: Caused by: com.google.inject.ProvisionException: Guice provision errors: 2012-04-03 19:42:53,148 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /jobhistory/jobcounters/job_1333482028750_0001 java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) ... .. ... 1) Error injecting constructor, java.lang.NullPointerException at org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:56) while locating org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock ... .. ... Caused by: java.lang.NullPointerException at org.apache.hadoop.mapreduce.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:328) at org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.getCounters(CountersBlock.java:188) at org.apache.hadoop.mapreduce.v2.app.webapp.CountersBlock.init(CountersBlock.java:57) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) There are task counters available if you drill down into successful tasks though.
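The trace in MAPREDUCE-4102 ends in a NullPointerException while counters are being added together. The sketch below shows one defensive pattern for that kind of aggregation. It is not the CountersBlock or AbstractCounters code and not necessarily how the issue was fixed. It assumes, without confirmation from the report, that some tasks of a killed job have no counters object at all; NullSafeCounterSum and sumCounters are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NullSafeCounterSum {
    /** Adds every per-task counter map into one total, skipping tasks that have none. */
    public static Map<String, Long> sumCounters(List<Map<String, Long>> perTask) {
        Map<String, Long> total = new HashMap<>();
        for (Map<String, Long> taskCounters : perTask) {
            if (taskCounters == null) {
                continue; // assumption: a killed task may never have reported counters
            }
            for (Map.Entry<String, Long> entry : taskCounters.entrySet()) {
                total.merge(entry.getKey(), entry.getValue(), Long::sum);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Long> finished = new HashMap<>();
        finished.put("MAP_INPUT_RECORDS", 10L);
        // The null entry stands in for a task that was killed before reporting.
        List<Map<String, Long>> perTask = Arrays.asList(finished, null);
        System.out.println(sumCounters(perTask));
    }
}
```

With a guard like this the page could still show the counters of the tasks that did finish, which matches the observation that task counters are available for successful tasks.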
[jira] [Updated] (MAPREDUCE-3082) archive command take wrong path for input file with current directory
[ https://issues.apache.org/jira/browse/MAPREDUCE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John George updated MAPREDUCE-3082: --- Status: Open (was: Patch Available) archive command take wrong path for input file with current directory - Key: MAPREDUCE-3082 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3082 Project: Hadoop Map/Reduce Issue Type: Bug Components: harchive Affects Versions: 0.20.204.1, 0.23.2, 0.23.3 Reporter: Rajit Saha Assignee: John George Attachments: MR-3082.patch, MR-3082.patch $hadoop dfs -copyFromLocal /etc/passwd . $hadoop dfs -lsr . -rw--- 3 hadoopqa hdfs 6883 2011-09-23 22:37 /user/hadoopqa/passwd $hadoop archive -archiveName test1.har -p . passwd . 11/09/23 22:39:22 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 4 for hadoopqa 11/09/23 22:39:22 INFO security.TokenCache: Got dt for hdfs://NN host/user/hadoopqa/.staging/job_201109232234_0004;uri=NN IP:8020;t.service=NN IP:8020 11/09/23 22:39:22 INFO mapred.JobClient: Running job: job_201109232234_0004 11/09/23 22:39:23 INFO mapred.JobClient: map 0% reduce 0% 11/09/23 22:39:34 INFO mapred.JobClient: Task Id : attempt_201109232234_0004_m_00_0, Status : FAILED java.io.FileNotFoundException: File does not exist: hdfs://NN host/user/hadoopqa/hadoopqa/passwd at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:525) at org.apache.hadoop.tools.HadoopArchives$HArchivesMapper.map(HadoopArchives.java:697) at org.apache.hadoop.tools.HadoopArchives$HArchivesMapper.map(HadoopArchives.java:587) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372) at org.apache.hadoop.mapred.Child$4.run(Child.java:261) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059) at org.apache.hadoop.mapred.Child.main(Child.java:255) So Archiving is failing as it was finding input file at /user/hadoopqa/hadoopqa/passwd , whereas it should look for /user/hadoopqa/passwd
[jira] [Updated] (MAPREDUCE-3082) archive command take wrong path for input file with current directory
[ https://issues.apache.org/jira/browse/MAPREDUCE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John George updated MAPREDUCE-3082: --- Attachment: MR-3082.patch Bobby, You are right - I should have done it that way. A new patch addressing your comment. archive command take wrong path for input file with current directory - Key: MAPREDUCE-3082 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3082 Project: Hadoop Map/Reduce Issue Type: Bug Components: harchive Affects Versions: 0.20.204.1, 0.23.2, 0.23.3 Reporter: Rajit Saha Assignee: John George Attachments: MR-3082.patch, MR-3082.patch $hadoop dfs -copyFromLocal /etc/passwd . $hadoop dfs -lsr . -rw--- 3 hadoopqa hdfs 6883 2011-09-23 22:37 /user/hadoopqa/passwd $hadoop archive -archiveName test1.har -p . passwd . 11/09/23 22:39:22 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 4 for hadoopqa 11/09/23 22:39:22 INFO security.TokenCache: Got dt for hdfs://NN host/user/hadoopqa/.staging/job_201109232234_0004;uri=NN IP:8020;t.service=NN IP:8020 11/09/23 22:39:22 INFO mapred.JobClient: Running job: job_201109232234_0004 11/09/23 22:39:23 INFO mapred.JobClient: map 0% reduce 0% 11/09/23 22:39:34 INFO mapred.JobClient: Task Id : attempt_201109232234_0004_m_00_0, Status : FAILED java.io.FileNotFoundException: File does not exist: hdfs://NN host/user/hadoopqa/hadoopqa/passwd at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:525) at org.apache.hadoop.tools.HadoopArchives$HArchivesMapper.map(HadoopArchives.java:697) at org.apache.hadoop.tools.HadoopArchives$HArchivesMapper.map(HadoopArchives.java:587) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372) at org.apache.hadoop.mapred.Child$4.run(Child.java:261) at java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059) at org.apache.hadoop.mapred.Child.main(Child.java:255) So Archiving is failing as it was finding input file at /user/hadoopqa/hadoopqa/passwd , whereas it should look for /user/hadoopqa/passwd
[jira] [Updated] (MAPREDUCE-3082) archive command take wrong path for input file with current directory
[ https://issues.apache.org/jira/browse/MAPREDUCE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John George updated MAPREDUCE-3082:

    Status: Patch Available  (was: Open)

archive command take wrong path for input file with current directory

    Key: MAPREDUCE-3082
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-3082
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: harchive
    Affects Versions: 0.20.204.1, 0.23.2, 0.23.3
    Reporter: Rajit Saha
    Assignee: John George
    Attachments: MR-3082.patch, MR-3082.patch

$hadoop dfs -copyFromLocal /etc/passwd .
$hadoop dfs -lsr .
-rw--- 3 hadoopqa hdfs 6883 2011-09-23 22:37 /user/hadoopqa/passwd
$hadoop archive -archiveName test1.har -p . passwd .
11/09/23 22:39:22 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 4 for hadoopqa
11/09/23 22:39:22 INFO security.TokenCache: Got dt for hdfs://NN host/user/hadoopqa/.staging/job_201109232234_0004;uri=NN IP:8020;t.service=NN IP:8020
11/09/23 22:39:22 INFO mapred.JobClient: Running job: job_201109232234_0004
11/09/23 22:39:23 INFO mapred.JobClient: map 0% reduce 0%
11/09/23 22:39:34 INFO mapred.JobClient: Task Id : attempt_201109232234_0004_m_00_0, Status : FAILED
java.io.FileNotFoundException: File does not exist: hdfs://NN host/user/hadoopqa/hadoopqa/passwd
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:525)
    at org.apache.hadoop.tools.HadoopArchives$HArchivesMapper.map(HadoopArchives.java:697)
    at org.apache.hadoop.tools.HadoopArchives$HArchivesMapper.map(HadoopArchives.java:587)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:261)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.mapred.Child.main(Child.java:255)

So archiving fails because it looks for the input file at /user/hadoopqa/hadoopqa/passwd, whereas it should look for /user/hadoopqa/passwd.
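For illustration, the resolution the reporter expects can be sketched in a few lines of Python. This is only a model of the expected result, not the HadoopArchives code: the function name `resolve_source` is made up, and the working directory /user/hadoopqa is taken from the `-lsr` output in the report.

```python
import posixpath


def resolve_source(working_dir, parent, source):
    """Resolve `source` relative to `parent`, with `parent` itself
    resolved against the user's working directory (hypothetical helper)."""
    # "." as the parent should collapse to the working directory itself.
    parent_abs = posixpath.normpath(posixpath.join(working_dir, parent))
    # The source is then looked up directly under the resolved parent.
    return posixpath.normpath(posixpath.join(parent_abs, source))


# Mirrors `-p . passwd` from the report, run as user hadoopqa.
print(resolve_source("/user/hadoopqa", ".", "passwd"))
```

Under these assumptions the lookup lands on /user/hadoopqa/passwd, with no second hadoopqa component in the path.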
[jira] [Commented] (MAPREDUCE-3999) Tracking link gives an error if the AppMaster hasn't started yet
[ https://issues.apache.org/jira/browse/MAPREDUCE-3999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245665#comment-13245665 ]

Hudson commented on MAPREDUCE-3999:

Integrated in Hadoop-Hdfs-trunk-Commit #2058 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2058/])
    MAPREDUCE-3999. Tracking link gives an error if the AppMaster hasn't started yet (Ravi Prakash via bobby) (Revision 1309108)

    Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1309108
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java

Tracking link gives an error if the AppMaster hasn't started yet

    Key: MAPREDUCE-3999
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-3999
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: mrv2, webapps
    Affects Versions: 0.23.1
    Reporter: Ravi Prakash
    Assignee: Ravi Prakash
    Fix For: 0.23.3, 2.0.0
    Attachments: MAPREDUCE-3999.patch, MAPREDUCE-3999.patch, MAPREDUCE-3999.patch

Courtesy [~sseth]
{quote}
The MRAppMaster died before writing anything.
Steps to generate the error:
1. Setup a queue with 1 max active application per user
2. Submit a long running job to this queue.
3. Submit another job to the queue as the same user. Access the tracking URL for job 2 directly or via Oozie (not via the RM link - which is rewritten once the app starts).
This would exist in situations where the queue doesn't have enough capacity - or for the small period of time between app submission and AM start.
{quote}
[jira] [Commented] (MAPREDUCE-3999) Tracking link gives an error if the AppMaster hasn't started yet
[ https://issues.apache.org/jira/browse/MAPREDUCE-3999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245669#comment-13245669 ]

Hudson commented on MAPREDUCE-3999:

Integrated in Hadoop-Common-trunk-Commit #1983 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1983/])
    MAPREDUCE-3999. Tracking link gives an error if the AppMaster hasn't started yet (Ravi Prakash via bobby) (Revision 1309108)

    Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1309108
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java

Tracking link gives an error if the AppMaster hasn't started yet

    Key: MAPREDUCE-3999
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-3999
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: mrv2, webapps
    Affects Versions: 0.23.1
    Reporter: Ravi Prakash
    Assignee: Ravi Prakash
    Fix For: 0.23.3, 2.0.0
    Attachments: MAPREDUCE-3999.patch, MAPREDUCE-3999.patch, MAPREDUCE-3999.patch

Courtesy [~sseth]
{quote}
The MRAppMaster died before writing anything.
Steps to generate the error:
1. Setup a queue with 1 max active application per user
2. Submit a long running job to this queue.
3. Submit another job to the queue as the same user. Access the tracking URL for job 2 directly or via Oozie (not via the RM link - which is rewritten once the app starts).
This would exist in situations where the queue doesn't have enough capacity - or for the small period of time between app submission and AM start.
{quote}
[jira] [Commented] (MAPREDUCE-3672) Killed maps shouldn't be counted towards JobCounter.NUM_FAILED_MAPS
[ https://issues.apache.org/jira/browse/MAPREDUCE-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245670#comment-13245670 ]

Thomas Graves commented on MAPREDUCE-3672:

Hey Anupam,

Can you update it so we get the pretty-formatted version from the CLI for the new NUM_KILLED fields? It is in:
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/org/apache/hadoop/mapreduce/JobCounter.properties

The CLI output right now has:
Job Counters
    NUM_KILLED_MAPS=1
    NUM_KILLED_REDUCES=1
    Launched map tasks=1
    Launched reduce tasks=1

It would be nice to display Killed map tasks=X, Killed reduce tasks=Y.

Killed maps shouldn't be counted towards JobCounter.NUM_FAILED_MAPS

    Key: MAPREDUCE-3672
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-3672
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: mr-am, mrv2
    Affects Versions: 0.23.0
    Reporter: Vinod Kumar Vavilapalli
    Assignee: Anupam Seth
    Fix For: 0.24.0
    Attachments: MAPREDUCE-3672-branch-23.patch, MAPREDUCE-3672-branch-23.patch

We count maps that are killed, say by speculator, towards JobCounter.NUM_FAILED_MAPS. We should instead have a separate JobCounter for killed maps. Same with reduces too.
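The mixed output quoted in the comment (raw names for the new counters, friendly names for the existing ones) is what a lookup with a fallback would produce. A minimal Python sketch of that idea follows; it assumes the CLI falls back to the raw counter name when no display name is defined, and the table contents and the `display_name` helper are hypothetical, not copied from JobCounter.properties.

```python
def display_name(counter, names):
    """Return the friendly label for a counter, or the raw name if none is defined."""
    return names.get(counter, counter)


# Hypothetical display-name table before the requested change.
names = {"LAUNCHED_MAPS": "Launched map tasks"}
print(display_name("NUM_KILLED_MAPS", names))  # no entry yet, so the raw name is shown

# Adding the labels proposed in the comment gives the friendly form.
names["NUM_KILLED_MAPS"] = "Killed map tasks"
names["NUM_KILLED_REDUCES"] = "Killed reduce tasks"
print(display_name("NUM_KILLED_MAPS", names))
```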
[jira] [Commented] (MAPREDUCE-3082) archive command take wrong path for input file with current directory
[ https://issues.apache.org/jira/browse/MAPREDUCE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245674#comment-13245674 ]

Hadoop QA commented on MAPREDUCE-3082:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12521201/MR-3082.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 3 new or modified tests.

    -1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2137//console

This message is automatically generated.

archive command take wrong path for input file with current directory

    Key: MAPREDUCE-3082
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-3082
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: harchive
    Affects Versions: 0.20.204.1, 0.23.2, 0.23.3
    Reporter: Rajit Saha
    Assignee: John George
    Attachments: MR-3082.patch, MR-3082.patch

$hadoop dfs -copyFromLocal /etc/passwd .
$hadoop dfs -lsr .
-rw--- 3 hadoopqa hdfs 6883 2011-09-23 22:37 /user/hadoopqa/passwd
$hadoop archive -archiveName test1.har -p . passwd .
11/09/23 22:39:22 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 4 for hadoopqa
11/09/23 22:39:22 INFO security.TokenCache: Got dt for hdfs://NN host/user/hadoopqa/.staging/job_201109232234_0004;uri=NN IP:8020;t.service=NN IP:8020
11/09/23 22:39:22 INFO mapred.JobClient: Running job: job_201109232234_0004
11/09/23 22:39:23 INFO mapred.JobClient: map 0% reduce 0%
11/09/23 22:39:34 INFO mapred.JobClient: Task Id : attempt_201109232234_0004_m_00_0, Status : FAILED
java.io.FileNotFoundException: File does not exist: hdfs://NN host/user/hadoopqa/hadoopqa/passwd
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:525)
    at org.apache.hadoop.tools.HadoopArchives$HArchivesMapper.map(HadoopArchives.java:697)
    at org.apache.hadoop.tools.HadoopArchives$HArchivesMapper.map(HadoopArchives.java:587)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:261)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.mapred.Child.main(Child.java:255)

So archiving fails because it looks for the input file at /user/hadoopqa/hadoopqa/passwd, whereas it should look for /user/hadoopqa/passwd.
[jira] [Commented] (MAPREDUCE-3999) Tracking link gives an error if the AppMaster hasn't started yet
[ https://issues.apache.org/jira/browse/MAPREDUCE-3999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245683#comment-13245683 ]

Hudson commented on MAPREDUCE-3999:

Integrated in Hadoop-Mapreduce-trunk-Commit #1996 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1996/])
    MAPREDUCE-3999. Tracking link gives an error if the AppMaster hasn't started yet (Ravi Prakash via bobby) (Revision 1309108)

    Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1309108
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java

Tracking link gives an error if the AppMaster hasn't started yet

    Key: MAPREDUCE-3999
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-3999
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: mrv2, webapps
    Affects Versions: 0.23.1
    Reporter: Ravi Prakash
    Assignee: Ravi Prakash
    Fix For: 0.23.3, 2.0.0
    Attachments: MAPREDUCE-3999.patch, MAPREDUCE-3999.patch, MAPREDUCE-3999.patch

Courtesy [~sseth]
{quote}
The MRAppMaster died before writing anything.
Steps to generate the error:
1. Setup a queue with 1 max active application per user
2. Submit a long running job to this queue.
3. Submit another job to the queue as the same user. Access the tracking URL for job 2 directly or via Oozie (not via the RM link - which is rewritten once the app starts).
This would exist in situations where the queue doesn't have enough capacity - or for the small period of time between app submission and AM start.
{quote}
[jira] [Commented] (MAPREDUCE-4020) Web services returns incorrect JSON for deep queue tree
[ https://issues.apache.org/jira/browse/MAPREDUCE-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245694#comment-13245694 ]

Hadoop QA commented on MAPREDUCE-4020:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12521189/MAPREDUCE-4020-branch-23.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 3 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 eclipse:eclipse. The patch built with eclipse:eclipse.

    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.mapreduce.v2.TestMiniMRProxyUser
        org.apache.hadoop.mapreduce.v2.TestSpeculativeExecution
        org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath
        org.apache.hadoop.mapreduce.v2.TestRMNMInfo
        org.apache.hadoop.mapred.TestReduceFetchFromPartialMem
        org.apache.hadoop.mapred.TestLazyOutput
        org.apache.hadoop.mapreduce.TestChild
        org.apache.hadoop.mapred.TestReduceFetch
        org.apache.hadoop.mapred.TestMiniMRBringup
        org.apache.hadoop.mapred.TestMiniMRClientCluster
        org.apache.hadoop.mapreduce.lib.output.TestJobOutputCommitter
        org.apache.hadoop.mapreduce.v2.TestMROldApiJobs
        org.apache.hadoop.mapreduce.v2.TestNonExistentJob
        org.apache.hadoop.mapred.TestMiniMRChildTask
        org.apache.hadoop.mapred.TestJobCounters
        org.apache.hadoop.mapreduce.v2.TestUberAM
        org.apache.hadoop.conf.TestNoDefaultsJobConf
        org.apache.hadoop.mapred.TestJobSysDirWithDFS
        org.apache.hadoop.mapreduce.v2.TestMRJobs
        org.apache.hadoop.mapred.TestClientRedirect
        org.apache.hadoop.mapred.TestJobCleanup
        org.apache.hadoop.mapreduce.TestMapReduceLazyOutput

    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2136//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2136//console

This message is automatically generated.

Web services returns incorrect JSON for deep queue tree

    Key: MAPREDUCE-4020
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-4020
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: mrv2, webapps
    Affects Versions: 0.23.1
    Reporter: Jason Lowe
    Assignee: Anupam Seth
    Attachments: MAPREDUCE-4020-branch-23.patch, MAPREDUCE-4020-branch-23.patch, MAPREDUCE-4020-branch-23.patch, MAPREDUCE-4020-branch-23.patch, MAPREDUCE-4020-branch-23.patch, MAPREDUCE-4020-branch-23.patch, testcase.patch

When the capacity scheduler is configured for more than two levels of queues, the web services API returns incorrect JSON for the subQueues field of some parent queues. The subQueues field for parent queues should always be an array, but sometimes the field appears multiple times for a queue and as what looks like a CapacityQueueInfo object instead of an array. Besides the sometimes-an-array-sometimes-not problem, parsing the result into a JSON object causes all but the last subQueues field to be discarded since they are overwritten by subsequent fields with the same name.
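The overwrite effect described in the issue is easy to reproduce with a parser that keeps only the last of several same-named fields, as Python's standard `json` module does. The sketch below is illustrative only: the payload is made up (queue names "a" and "b" and the overall shape are assumptions, not actual web services output), and `find_duplicate_keys` is a hypothetical helper.

```python
import json

# Hypothetical payload: "subQueues" appears twice as an object, not once as an array.
raw = (
    '{"queueName": "root",'
    ' "subQueues": {"queueName": "a"},'
    ' "subQueues": {"queueName": "b"}}'
)

# Default parsing silently keeps only the last "subQueues" field.
parsed = json.loads(raw)
print(parsed["subQueues"])


def find_duplicate_keys(text):
    """Return the keys that occur more than once within a single JSON object."""
    duplicates = []

    def hook(pairs):
        seen = set()
        for key, _ in pairs:
            if key in seen:
                duplicates.append(key)
            seen.add(key)
        return dict(pairs)

    # object_pairs_hook receives every object's key/value pairs before they are merged.
    json.loads(text, object_pairs_hook=hook)
    return duplicates


print(find_duplicate_keys(raw))
```

A check like `find_duplicate_keys` could serve as a quick client-side guard while the server still emits the repeated field; a correct response would carry a single subQueues array instead.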
[jira] [Commented] (MAPREDUCE-3650) testGetTokensForHftpFS() fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245703#comment-13245703 ]

Robert Joseph Evans commented on MAPREDUCE-3650:

Ravi,

The patch looks OK for the most part. I am just a bit curious: why did you decide to use 192.168.1.1 instead of something like 127.0.0.1?

testGetTokensForHftpFS() fails

    Key: MAPREDUCE-3650
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-3650
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: mrv2
    Reporter: Thomas Graves
    Assignee: Ravi Prakash
    Priority: Blocker
    Fix For: 0.23.0
    Attachments: MAPREDUCE-3650.patch

org.apache.hadoop.mapreduce.security.TestTokenCache.testGetTokensForHftpFS fails. Looks like it may have been introduced with HADOOP-7808.
[jira] [Updated] (MAPREDUCE-3672) Killed maps shouldn't be counted towards JobCounter.NUM_FAILED_MAPS
[ https://issues.apache.org/jira/browse/MAPREDUCE-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anupam Seth updated MAPREDUCE-3672:

    Attachment: MAPREDUCE-3672-branch-23.patch

Thanks! Updated the patch to prettify the output. Tested locally on a single-node cluster.

Killed maps shouldn't be counted towards JobCounter.NUM_FAILED_MAPS

    Key: MAPREDUCE-3672
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-3672
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: mr-am, mrv2
    Affects Versions: 0.23.0
    Reporter: Vinod Kumar Vavilapalli
    Assignee: Anupam Seth
    Fix For: 0.24.0
    Attachments: MAPREDUCE-3672-branch-23.patch, MAPREDUCE-3672-branch-23.patch, MAPREDUCE-3672-branch-23.patch

We count maps that are killed, say by speculator, towards JobCounter.NUM_FAILED_MAPS. We should instead have a separate JobCounter for killed maps. Same with reduces too.